Harsh Wardhan - 5 months ago
Python Question

How to assign sample_weights in sklearn.cluster DBSCAN?

I'm using DBSCAN to find clusters of pixel values of an RGB image.

db = DBSCAN(eps=0.3, min_samples=10).fit(X)


where X is an N x 3 matrix. Each row of X contains an RGB triplet.

Now, I want to assign weights to pixel values as a function of distance from the center of the image.
And this is the function I'm using:

score = 1 / (1 + math.exp(-a * distance)) # a = 0.001


I compute weight_matrix, filled with score values as above.
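For reference, a minimal sketch of how such a weight array could be built with NumPy. The helper name center_distance_weights, the use of Euclidean pixel distance, and the row-major flattening (matching X = image.reshape(-1, 3)) are assumptions for illustration, not details given above.

```python
import numpy as np

def center_distance_weights(height, width, a=0.001):
    """Return one logistic weight per pixel, flattened in row-major order.

    Hypothetical helper: assumes Euclidean distance (in pixels) from the
    geometric center of the image.
    """
    rows, cols = np.indices((height, width))
    center_row, center_col = (height - 1) / 2.0, (width - 1) / 2.0
    distance = np.hypot(rows - center_row, cols - center_col)
    # Same formula as above: score = 1 / (1 + exp(-a * distance))
    score = 1.0 / (1.0 + np.exp(-a * distance))
    return score.ravel()

# Tiny 4 x 6 "image" just to show the shape of the result
weight_matrix = center_distance_weights(4, 6)
```

Because distance is never negative, this formula yields weights in the range [0.5, 1), growing with distance from the center.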

Next I did this:

db = DBSCAN(eps=0.3, min_samples=10).fit(X,y=None, sample_weight=weight_matrix)


where the length of the weight_matrix array is equal to the number of rows in X.

But I get the following error:

TypeError: fit() got an unexpected keyword argument 'y'


So I tried doing it like this:

db = DBSCAN(eps=0.3, min_samples=10).fit(X, sample_weight=weight_matrix)


Now I get this error:

TypeError: fit() got an unexpected keyword argument 'sample_weight'


I think I'm passing the arguments incorrectly, but couldn't be sure. My scikit-learn version is 0.14.0.

Answer

You are not passing the arguments incorrectly; your scikit-learn version (0.14.0) is too old. v0.15 is the last version where DBSCAN's fit had the form

fit(X)

Since 0.16 it is

fit(X, y=None, sample_weight=None)

Simply update scikit-learn to 0.16 or 0.17.x.
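After upgrading, the call from the question should work as written. Below is a small sketch on synthetic data; the random X and weight_matrix are stand-ins (assumptions) for the real N x 3 pixel matrix and the distance-based weights.

```python
import numpy as np
import sklearn
from sklearn.cluster import DBSCAN

# sample_weight requires scikit-learn 0.16 or newer
print(sklearn.__version__)

rng = np.random.RandomState(0)
X = rng.rand(200, 3)  # stand-in for N x 3 RGB rows scaled to [0, 1]
weight_matrix = rng.uniform(0.5, 1.0, size=X.shape[0])  # stand-in weights

# y can simply be omitted; sample_weight is passed by keyword
db = DBSCAN(eps=0.3, min_samples=10).fit(X, sample_weight=weight_matrix)
labels = db.labels_  # cluster index per row, -1 marks noise
```

Note that the weights count toward min_samples: a point is a core sample when the summed weight of its eps-neighborhood reaches min_samples. With weights below 1, a neighborhood therefore needs more than 10 points to qualify, so min_samples may need adjusting.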
