jmlarson - 9 months ago 38

Python Question

I have many points

`X`

`f`

`numpy`

`X`

`f`

`r`

`X`

`sp.spatial.distance.pdist(X)`

`def cluster(X,f,r):`

pts,n = np.shape(X)

centers = []

for i in range(0,pts):

pdist = sp.spatial.distance.cdist(X,[X[i]])

if not np.any(np.logical_and(pdist <= r, f < f[i])):

centers.append(i)

return centers

This takes minutes. Is there a way to quickly cluster based on proximity and another metric?

Answer

You can significantly reduce the number of distance computation by keeping a record. For instance, if j is a neighbor of a center i and it has a larger f value, then j can never be a center since one of its neighbors is i which has a smaller f value. Please check the following and let me know if you need clarification.

```
def cluster4(X,f,r):
pts,n = np.shape(X)
centers = np.ones((pts,1),dtype=int)
for i in range(pts):
if not centers[i]:
continue
pdist = sp.spatial.distance.cdist(X,[X[i]])
inrange = (pdist <= r)
inrange[i] = False
lesser = (f < f[i])
if np.any(inrange & lesser):
centers[i] = 0
centers[inrange & np.invert(lesser)] = 0
return np.where(centers == 1)[0]
```

Source (Stackoverflow)