Phil Phil - 1 year ago 132
Python Question

Get nearest point to centroid, scikit-learn?

I am using K-means for a clustering problem. I am trying to find the data point that is closest to each centroid, which I believe is called the medoid.

Is there a way to do this in scikit-learn?

Answer Source

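This is not the medoid (the medoid is the cluster member with the smallest total distance to the other members of its cluster, and it need not be the point nearest the centroid), but here's something you can try: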

>>> import numpy as np
>>> from sklearn.cluster import KMeans
>>> from sklearn.metrics import pairwise_distances_argmin_min
>>> X = np.random.randn(10, 4)
>>> km = KMeans(n_clusters=2).fit(X)
>>> closest, _ = pairwise_distances_argmin_min(km.cluster_centers_, X)
>>> closest
array([0, 8])

The array closest contains the index of the point in X that is closest to each centroid. So X[0] is the closest point in X to centroid 0, and X[8] is the closest to centroid 1.
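
If you do want the actual medoid of each cluster, one possible approach is to compute the pairwise distances within each cluster and pick the member whose distances to the others sum to the smallest value. The following is only a sketch under some assumptions: Euclidean distance, a fixed random seed so the run is repeatable, and the variable names members, D and medoids, which are made up for illustration and are not part of scikit-learn.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import pairwise_distances

# Fixed seed (an assumption for repeatability, not required by the method)
rng = np.random.RandomState(0)
X = rng.randn(10, 4)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

medoids = []
for k in range(km.n_clusters):
    # Indices into X of the points assigned to cluster k
    members = np.flatnonzero(km.labels_ == k)
    # Euclidean distance matrix restricted to this cluster's members
    D = pairwise_distances(X[members])
    # The medoid is the member with the smallest summed distance to the rest
    medoids.append(members[D.sum(axis=1).argmin()])
medoids = np.array(medoids)
```

Here medoids[k] is an index into X, so X[medoids[k]] is the medoid of cluster k. Unlike the nearest-to-centroid lookup above, this only considers points that were assigned to cluster k, and it costs a full distance matrix per cluster, so it may be slow for very large clusters.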