misctp asdas - 13 days ago
R Question

'sum' not meaningful for factors while using diag(prop.table()) functionality

I'm trying to find the mean misclassification count after running the KNN algorithm, which generated a confusion matrix. Below are my setup and the result when I execute "prop.table(t,1)":

kdd_train <- dataset_normalized[1:140000,]
kdd_test <- dataset_normalized[140001:145586,]

kdd_train_target <- dataset_extracted[1:140000,12]
kdd_test_target <- dataset_extracted[140001:145586,12]
> prop.table(t,1)
               m1
kdd_test_target       FALSE        TRUE
          FALSE 0.997044917 0.002955083
          TRUE  0.048592189 0.951407811


However, when I execute the command "error_per_class = diag(prop.table(m1))", it returns an error:

> error_per_class = diag(prop.table(m1))
Error in Summary.factor(c(1L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, :
‘sum’ not meaningful for factors


Is there any way to fix it? I'd appreciate any help, thanks!

Answer

The error message states the reason: m1 is a factor. With no margin given, prop.table() divides its argument by sum(x), and sum() is not defined for factors, so prop.table() cannot be applied to a factor directly.
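To see where the message comes from, here is a minimal sketch that triggers the same failure on a small factor. The vector `pred` is a hypothetical stand-in for the predictions in m1, not the asker's data, and `tryCatch()` is used only so the script keeps running after the error.

```r
# Hypothetical toy factor standing in for the predictions stored in m1
pred <- factor(c(TRUE, FALSE, TRUE, TRUE))

# sum() belongs to the Summary group generic, which rejects factors
msg_sum <- tryCatch(sum(pred), error = function(e) conditionMessage(e))
print(msg_sum)

# prop.table(x) without a margin computes x / sum(x), so it fails the same way
msg_prop <- tryCatch(prop.table(pred), error = function(e) conditionMessage(e))
print(msg_prop)
```

Both calls should report that 'sum' is not meaningful for factors; the exact quote characters depend on the locale.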

prop.table(m1)

Error in Summary.factor(c(2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, : ‘sum’ not meaningful for factors

Based on the values shown, m1 should be a logical vector, so convert it to logical and prop.table() will work:

as.logical(m1)
prop.table(as.logical(m1))
#[1] 0.09090909 0.09090909 0.00000000 0.00000000 0.00000000 0.00000000 0.09090909 0.00000000 0.00000000 0.09090909 0.00000000 0.09090909 0.00000000 0.00000000
#[15] 0.09090909 0.00000000 0.09090909 0.09090909 0.00000000 0.09090909 0.09090909 0.00000000 0.00000000 0.09090909
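As a small illustration of what that conversion does, the sketch below uses a hypothetical five-element factor `pred` (not the data from this post). as.logical() on a factor goes through the level labels, whereas as.integer() returns the internal codes; prop.table() on the resulting logical vector then gives each element's share of the total number of TRUE values.

```r
# Hypothetical toy factor with levels "FALSE" and "TRUE"
pred <- factor(c(TRUE, FALSE, TRUE, FALSE, FALSE))

# as.logical() converts via the level labels
lgl <- as.logical(pred)

# as.integer() returns the internal level codes instead
codes <- as.integer(pred)

# Each element divided by the number of TRUE values
p <- prop.table(lgl)
print(p)
```

Note that this yields per-element proportions of a single vector, which is why every TRUE in the output above maps to the same value; it is not a per-class summary of a confusion matrix.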

data

set.seed(24)
m1 <- factor(sample(c(TRUE, FALSE), 24, replace=TRUE))
kdd_test_target  <- factor(sample(c(TRUE, FALSE), 24, replace=TRUE))
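
If the goal is the per-class error from the confusion matrix, as the question suggests, an alternative is to pass the table of counts (not the factor) to prop.table(). The following is a sketch under that assumption; `actual`, `predicted`, and `cm` are hypothetical names with made-up values, standing in for kdd_test_target, m1, and the table t from the question.

```r
# Hypothetical labels and predictions, both stored as factors
actual    <- factor(c(FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE))
predicted <- factor(c(FALSE, FALSE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE))

# Cross-tabulate first: table() returns integer counts, which sum() accepts
cm <- table(actual, predicted)

# Row-wise proportions; the diagonal is the share of each actual class
# that was predicted correctly
correct_per_class <- diag(prop.table(cm, 1))

# The per-class error is the complement of the diagonal
error_per_class <- 1 - correct_per_class

# Overall misclassification rate across all test cases
overall_error <- 1 - sum(diag(cm)) / sum(cm)

print(error_per_class)
print(overall_error)
```

Here the diagonal of the row-normalised table holds the correctly classified share, so the error is taken as one minus that value rather than the diagonal itself.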