misctp asdas - 1 year ago 242
R Question

# 'sum' not meaningful for factors while using diag(prop.table()) functionality

I'm trying to find the mean misclassification rate after running a KNN algorithm, which produced a confusion matrix. Below is the result when I execute `prop.table(t,1)`:

``````
kdd_train <- dataset_normalized[1:140000,]
kdd_test <- dataset_normalized[140001:145586,]

kdd_train_target <- dataset_extracted[1:140000,12]
kdd_test_target <- dataset_extracted[140001:145586,12]
prop.table(t,1)
               m1
kdd_test_target       FALSE        TRUE
          FALSE 0.997044917 0.002955083
          TRUE  0.048592189 0.951407811
``````

However, when I execute `error_per_class = diag(prop.table(m1))`, it returns an error:

``````
> error_per_class = diag(prop.table(m1))
Error in Summary.factor(c(1L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 1L,  :
‘sum’ not meaningful for factors
``````

Is there any way to fix it? I'd appreciate any help, thanks!

The error message states the reason: `m1` is a `factor`. `prop.table` divides its argument by the `sum` of that argument, and `sum` is not defined for the `factor` class, so `prop.table` cannot be applied to a `factor` directly.

``````
prop.table(m1)
``````

Error in Summary.factor(c(2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, : ‘sum’ not meaningful for factors
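A quick way to confirm this diagnosis is to inspect the object and call `sum` on it directly. The following is a minimal sketch, assuming `m1` is a factor with `TRUE`/`FALSE` labels; the data here is a randomly generated stand-in rather than the asker's real predictions, and `sum_result` is an illustrative name.

```r
# Hypothetical stand-in for the asker's `m1` (not the real KNN output)
set.seed(24)
m1 <- factor(sample(c(TRUE, FALSE), 24, replace = TRUE))

class(m1)    # "factor"

# With no margin argument, prop.table(x) computes x / sum(x),
# so it is the call to sum() that rejects the factor.
sum_result <- tryCatch(
  sum(m1),
  error = function(e) conditionMessage(e)
)
sum_result   # the error message, captured as a character string
```

Capturing the message with `tryCatch` keeps the script running, which is convenient when checking several objects in a row.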

Based on the values shown, `m1` should really be a logical vector, so convert it with `as.logical` (which reads the factor's level labels, not its integer codes) and `prop.table` will work:

``````
as.logical(m1)
prop.table(as.logical(m1))
#[1] 0.09090909 0.09090909 0.00000000 0.00000000 0.00000000 0.00000000 0.09090909 0.00000000 0.00000000 0.09090909 0.00000000 0.09090909 0.00000000 0.00000000
#[15] 0.09090909 0.00000000 0.09090909 0.09090909 0.00000000 0.09090909 0.09090909 0.00000000 0.00000000 0.09090909
``````
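Note that on a plain logical vector `prop.table` divides every element by the total number of `TRUE` values, so the result is not a per-class rate. If the goal is the per-class error from the question, one possible route is to pass the confusion table itself to `prop.table`, not the factor of predictions. The sketch below assumes `m1` holds the predicted labels and `kdd_test_target` the true labels, and uses randomly generated stand-ins for both; `conf_mat`, `row_props`, `accuracy_per_class`, and `error_per_class` are illustrative names.

```r
# Hypothetical stand-ins for the predictions (m1) and true labels (kdd_test_target)
set.seed(24)
m1 <- factor(sample(c(TRUE, FALSE), 24, replace = TRUE),
             levels = c(FALSE, TRUE))
kdd_test_target <- factor(sample(c(TRUE, FALSE), 24, replace = TRUE),
                          levels = c(FALSE, TRUE))

# Confusion matrix: rows are the true classes, columns the predicted classes
conf_mat <- table(kdd_test_target, m1)

# Row-wise proportions: each row sums to 1
row_props <- prop.table(conf_mat, 1)

# The diagonal holds the share of each true class that was predicted correctly
accuracy_per_class <- diag(row_props)
error_per_class <- 1 - accuracy_per_class

# Mean misclassification rate across the classes
mean(error_per_class)
```

Fixing the `levels` explicitly keeps the table square even if one class happens to be absent from the predictions.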

### Data

``````
set.seed(24)
m1 <- factor(sample(c(TRUE, FALSE), 24, replace=TRUE))
kdd_test_target  <- factor(sample(c(TRUE, FALSE), 24, replace=TRUE))
``````