saud saud - 23 days ago
Python Question

Varying n_neighbors in scikit-learn KNN regression

I am using scikit-learn's KNN regressor (`KNeighborsRegressor`) to fit a model to a large dataset with `n_neighbors = 100-500`. Given the nature of the data, some parts (think: sharp, delta-function-like peaks) are better fit with fewer neighbors (`n_neighbors ~ 20-50`) so that the peaks are not smoothed out. The locations of these peaks are known (or can be measured).

Is there a way to vary the `n_neighbors` parameter?

I could fit two models and stitch them together, but that would be inefficient. It would be preferable to prescribe 2-3 values for `n_neighbors` or, failing that, to pass in a list of `n_neighbors` values.

Answer

I'm afraid not. In part, this is due to an assumption baked into the algebra that the relationship is symmetric: A is a neighbour of B iff B is a neighbour of A. If you use different k values in different regions, you are guaranteed to break that symmetry.

I think the major reason, though, is simply that the algorithm is simpler with a fixed number of neighbors, which yields better results in general. You have a specific case that KNN doesn't fit well.

I suggest that you stitch together your two models, switching between them based on the estimated second derivative, or simply on proximity to the known peak locations.
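Since the peak locations are known, the stitching can be sketched roughly as follows. This is one hedged approach, not a scikit-learn feature: the `predict_stitched` helper, the peak list, and the window half-width are all illustrative assumptions.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)

# Synthetic data: smooth background plus one sharp peak at x = 0.5
# (all data here is illustrative, not from the question).
X = rng.uniform(0, 1, size=(5000, 1))
y = np.sin(2 * np.pi * X[:, 0]) + 5 * np.exp(-((X[:, 0] - 0.5) ** 2) / 1e-4)
y += rng.normal(scale=0.05, size=y.shape)

# Coarse model for the bulk of the data, fine model near the known peaks.
coarse = KNeighborsRegressor(n_neighbors=200).fit(X, y)
fine = KNeighborsRegressor(n_neighbors=25).fit(X, y)

# Known peak locations and an assumed window half-width around each.
peaks = np.array([0.5])
half_width = 0.02

def predict_stitched(X_new):
    """Use the fine model inside the peak windows, the coarse model elsewhere."""
    X_new = np.asarray(X_new)
    near_peak = np.any(np.abs(X_new[:, :1] - peaks[None, :]) < half_width, axis=1)
    y_pred = coarse.predict(X_new)
    if near_peak.any():
        y_pred[near_peak] = fine.predict(X_new[near_peak])
    return y_pred

x_grid = np.linspace(0, 1, 1001)[:, None]
y_grid = predict_stitched(x_grid)
```

Both models are fit once on the full data, so the extra cost over a single model is one more index build plus the fine-model queries inside the (small) peak windows.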
