KingDan - 2 months ago
R Question

K-Nearest-Neighbour

I'm trying to use the knn function that comes in the class library of R. It's giving me an error that "train" is not the same length as "class".

Upon printing the lengths of train and class respectively, I found that train has a length of 100 (as needed) and class has a length of 2 (as expected). If I understand correctly, cl, or class, is meant to be a factor vector of labels. My labels are just "orange" and "blue". I followed the example in the documentation, yet the error persists. Is there something glaringly wrong with my code? Any help is appreciated.

library(class)

x <- runif(100, 1, 100)
y <- runif(100, 1, 100)
train.df <- data.frame(x, y)

x.test <- runif(100, 1, 100)
y.test <- runif(100, 1, 100)
test.df <- data.frame(x.test, y.test)

cl <- factor(c(rep("orange", 100), rep("blue", 100)))

knn(train.df, test.df, cl, k=3, prob=TRUE)

Answer

cl is 200 elements long, but train.df has only 100 rows, and knn needs exactly one label per training row. Repeat each label 50 times instead of 100:

library(class)

x <- runif(100, 1, 100)
y <- runif(100, 1, 100)
train.df <- data.frame(x, y)

x.test <- runif(100, 1, 100)
y.test <- runif(100, 1, 100)
test.df <- data.frame(x.test, y.test)

cl <- factor(c(rep("orange", 50), rep("blue", 50)))

knn(train.df, test.df, cl, k=3, prob=TRUE)
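
If you want to guard against this kind of mismatch, one option is to derive the label counts from the training data and check the lengths before calling knn. The following is only a sketch under a few assumptions that are not in the original post: the seed, the shared column names x and y, and the variable names n and pred are all added for illustration. Note that length() on a data frame returns the number of columns, so nrow() is the value to compare against length(cl).

```r
library(class)

set.seed(1)  # assumption: fixed seed, added only so the run is reproducible

# Same setup as above: 100 random training points and 100 random test points
train.df <- data.frame(x = runif(100, 1, 100), y = runif(100, 1, 100))
test.df  <- data.frame(x = runif(100, 1, 100), y = runif(100, 1, 100))

# Build the labels from the number of training rows instead of hard-coding it
n  <- nrow(train.df)                                   # 100 rows
cl <- factor(rep(c("orange", "blue"), each = n / 2))   # 50 of each label

# length(train.df) would be 2 (the column count), so compare against nrow()
stopifnot(length(cl) == nrow(train.df))

# One predicted label per test row; vote proportions are in attr(pred, "prob")
pred <- knn(train.df, test.df, cl, k = 3, prob = TRUE)
```

With the check in place, a wrong-sized cl fails at the stopifnot line, before knn is ever called.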