Barker - 14 hours ago
R Question

R package caret confusionMatrix with missing categories

I am using the function confusionMatrix in the R package caret to calculate some statistics for my data. I have been passing my predictions and my actual values to the table function to build the table used by confusionMatrix, like so:

table(predicted,actual)


However, there are multiple possible outcomes (e.g. A, B, C, D), and my predictions do not always cover all of them (e.g. only A, B, D). The output of the table function then omits the missing outcome and looks like this:

    A   B   C   D
A   n1  n2  n3  n4
B   n5  n6  n7  n8
D   n9  n10 n11 n12
# Note how there is no corresponding row for `C`.
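
For illustration, a minimal sketch with made-up vectors (not the real data) reproduces the non-square table. The values in predicted and actual are assumptions chosen only so that C is never predicted:

```r
# Made-up example data: "C" occurs in the actual values but is never predicted.
predicted <- c("A", "B", "D", "A", "B", "D")
actual    <- c("A", "B", "C", "D", "C", "D")

# table() only creates rows/columns for values it actually sees,
# so the result has 3 rows (A, B, D) but 4 columns (A, B, C, D).
tab <- table(predicted, actual)
print(tab)
print(dim(tab))
```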


The confusionMatrix function can't handle the missing outcome and gives the error:

Error in !all.equal(nrow(data), ncol(data)) : invalid argument type


Is there a way I can use the table function differently to get the missing rows filled with zeros, or use the confusionMatrix function differently so that it treats missing outcomes as zero?

As a note: Since I am randomly selecting my data to test with, there are times that a category is also not represented in the actual result as opposed to just the predicted. I don't believe this will change the solution.

Answer

You can use union to give both factors the same set of levels, so that the resulting table is square:

library(caret)

# Sample data
predicted <- c(1,2,1,2,1,2,1,2,3,4,3,4,6,5) # Values 1,2,3,4,5,6
reference <- c(1,2,1,2,1,2,1,2,1,2,1,3,3,4) # Values 1,2,3,4

# Use the same sorted set of levels for both factors so the table is square
u <- sort(union(predicted, reference))
tab <- table(factor(predicted, u), factor(reference, u))
confusionMatrix(tab)
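
If the full set of possible outcomes is known in advance, the levels can also be declared explicitly rather than derived with union, which keeps the table the same size across random samples. The following is only a sketch under that assumption: all_levels is a hypothetical vector and the data are made up.

```r
# Hypothetical full set of outcomes; replace with the categories in your data.
all_levels <- c("A", "B", "C", "D")

# Made-up sample in which "C" is never predicted.
predicted <- c("A", "B", "D", "A", "B", "D")
actual    <- c("A", "B", "C", "D", "C", "D")

# Give both factors the same explicit levels.
pred_f <- factor(predicted, levels = all_levels)
act_f  <- factor(actual,    levels = all_levels)

# Unused levels are kept by table(), so "C" gets an all-zero row.
tab <- table(Prediction = pred_f, Reference = act_f)
print(tab)

# The table is now square.
stopifnot(nrow(tab) == ncol(tab))

# If caret is installed, the square table can be passed on as before.
if (requireNamespace("caret", quietly = TRUE)) {
  print(caret::confusionMatrix(tab))
}
```

A category that is absent from both vectors still appears as a zero row and a zero column, so the layout of the table does not depend on which sample was drawn.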