saud saud - 1 year ago
Python Question

Varying n_neighbors in scikit-learn KNN regression

I am using scikit-learn's KNN regressor to fit a model to a large dataset with `n_neighbors = 100-500`. Given the nature of the data, some parts (think: sharp delta-function-like peaks) are better fit with fewer neighbors (`n_neighbors ~ 20-50`) so that the peaks are not smoothed out. The locations of these peaks are known (or can be measured).

Is there a way to vary `n_neighbors` across the dataset?

I could fit two models and stitch them together, but that would be inefficient. It would be preferable to either prescribe 2-3 values for `n_neighbors` or, worse, send in a list of `n_neighbors` values, one per point.
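A minimal sketch of the problem described above, using synthetic data (not the asker's real dataset): a smooth signal with one narrow peak, fit with a single global `n_neighbors`. The peak location, width, and sample counts here are illustrative assumptions.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in for the data: smooth background plus a sharp peak at x = 5.
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 10, 2000)).reshape(-1, 1)
y = np.sin(X.ravel()) + 5.0 * np.exp(-((X.ravel() - 5.0) ** 2) / 0.01)

# A single global n_neighbors applies everywhere; with 200 neighbors the
# averaging window spans roughly a full unit, so the narrow peak is smoothed out.
model = KNeighborsRegressor(n_neighbors=200).fit(X, y)
peak_pred = model.predict([[5.0]])[0]  # far below the true peak height of ~5
```

This is the behavior the question is asking to avoid near the known peaks.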

Answer Source

I'm afraid not. In part, this is due to some algebraic assumptions that the relationship is symmetric: A is a neighbour to B iff B is a neighbour to A. If you give different k values, you're guaranteed to break that symmetry.

I think the major reason is simply that the algorithm is simpler with a fixed number of neighbors, and that generally yields good results. You have a specific case that KNN doesn't fit so well.

I suggest that you stitch together your two models, switching between them based on the estimated second derivative.
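A sketch of the stitching idea on synthetic data, switching on the known peak locations the asker mentioned rather than an estimated second derivative (the peak position, window width, and `n_neighbors` values here are illustrative assumptions):

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in for the data: smooth background plus a sharp peak at x = 5.
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 10, 2000)).reshape(-1, 1)
y = np.sin(X.ravel()) + 5.0 * np.exp(-((X.ravel() - 5.0) ** 2) / 0.01)

# Two models: a heavily smoothing one for the bulk, a local one for the peaks.
smooth = KNeighborsRegressor(n_neighbors=200).fit(X, y)
sharp = KNeighborsRegressor(n_neighbors=20).fit(X, y)

# Known peak locations (one here) and a switching window around them.
peaks = np.array([5.0])
window = 0.5

def predict_stitched(X_new):
    """Use the sharp model within `window` of any known peak, else the smooth one."""
    X_new = np.asarray(X_new)
    near_peak = np.min(np.abs(X_new - peaks), axis=1) < window
    y_pred = smooth.predict(X_new)
    if near_peak.any():
        y_pred[near_peak] = sharp.predict(X_new[near_peak])
    return y_pred
```

Both models are fit on the full dataset, so the cost of the second fit is only one extra neighbor index; the per-point switch at prediction time is cheap.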
