I'd like to cluster points given to a custom distance and strangely, it seems that neither scipy nor sklearn clustering methods allow the specification of a distance function.
For instance, in
All of the scipy hierarchical clustering routines will accept a custom distance function that accepts two 1D vectors specifying a pair of points and returns a scalar. For example, using
import numpy as np from scipy.cluster.hierarchy import fclusterdata # a custom function that just computes Euclidean distance def mydist(p1, p2): diff = p1 - p2 return np.vdot(diff, diff) ** 0.5 X = np.random.randn(100, 2) fclust1 = fclusterdata(X, 1.0, metric=mydist) fclust2 = fclusterdata(X, 1.0, metric='euclidean') print(np.allclose(fclust1, fclust2)) # True
Valid inputs for the
metric= kwarg are the same as for