For example imagine a system with 5 points split into groups of at least 2. There would be one call to findgroups on all 5 points, then calls to findgroups on a 3-2 split and on a 2-3 split. Those two calls could run in parallel, as long as the result from the call on all 5 points waits to receive the results from both children threads before deciding which is the optimal solution. You would have to be careful to keep threads from writing to the same memory (specifically for the removals in pointset), but I think (depending on how many cores our system operates on) this could incur impressive speed up.
Update: Did not incur any speed up by implementing changes the the calculation for center or distance. Reading about python threads https://pymotw.com/2/threading/
==Runtime==