Oriol Prat - 3 months ago
R Question

missing factor level in R

I have a factor column, and I get a missing level. Why does R create this missing level?

d0s$y
[1] E E E E E E G G G G G G G P P P P P P P
Levels: E G P

levels(d0s$y)
[1] "" "E" "G" "P"

Answer

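It could be that there were blanks "" in the dataset before subsetting. Subsetting a factor keeps all of its original levels, even those that no longer occur in the data. One way to fix this is to call droplevels to remove the unused levels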

d0s$y <- droplevels(d0s$y)

Or call factor again, which rebuilds the levels from the values that are still present

d0s$y <- factor(d0s$y)
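
Both fixes can be checked on a small stand-in for d0s. The data frame below is made up for illustration; only the column name y comes from the question. As a side note, droplevels also accepts a whole data frame, which cleans every factor column at once.

```r
# Hypothetical stand-in for the asker's data, with blank entries in y
d0 <- data.frame(y = factor(c("E", "G", "P", "", "")), x = 1:5)

# Subsetting removes the blank rows but keeps the "" level
d0s <- d0[d0$y != "", ]
levels(d0s$y)
#[1] ""  "E" "G" "P"

# Either call rebuilds the levels from the values still present
by_drop   <- droplevels(d0s$y)
by_factor <- factor(d0s$y)
identical(levels(by_drop), levels(by_factor))
#[1] TRUE

# Data frame method: drops unused levels in every factor column
d0s <- droplevels(d0s)
levels(d0s$y)
#[1] "E" "G" "P"
```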

However, it can also be that the "" elements are still in the data. Because a factor prints its values without quotes, the blanks are invisible in the output

y1 <- factor(rep(c("E", "G", "P", ""), each = 3))
y1
#[1] E E E G G G P P P      
#Levels:  E G P
levels(y1)
#[1] ""  "E" "G" "P"
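
Since blank values cannot be seen in the printed factor, counting them is a more reliable check than reading the output. A minimal sketch, rebuilding the same y1 so the snippet runs on its own:

```r
# Same example factor as above: three blanks among the values
y1 <- factor(rep(c("E", "G", "P", ""), each = 3))

# Number of elements that are actually the empty string
n_blank <- sum(y1 == "")
n_blank
#[1] 3

# table() counts every level; the first column header is the empty level
table(y1)

# The "" level is in use here, so droplevels() does not remove it
kept <- levels(droplevels(y1))
kept
#[1] ""  "E" "G" "P"
```

If the count is zero, the level is merely unused and droplevels will remove it. If it is positive, the blank rows have to be removed or recoded first.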

Suppose we subset 'y1' to keep only the non-blank values

 y2 <- y1[y1 %in% c("E", "G", "P")]
 levels(y2) #the unused levels are still there
 #[1] ""  "E" "G" "P"

unless we drop those levels

 levels(droplevels(y2))
 #[1] "E" "G" "P"
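
If the blanks really stand for missing data, another option (not part of the answer above, so treat it as a suggestion) is to recode them as NA before building the factor, so that "" never becomes a level. A sketch with a hypothetical character vector raw:

```r
# Hypothetical raw input containing one blank entry
raw <- c("E", "G", "", "P", "E")

# Recode blanks as missing values
raw[raw == ""] <- NA

# factor() excludes NA from the levels by default
y <- factor(raw)
levels(y)
#[1] "E" "G" "P"

# The blank entry is still counted, now as a missing value
sum(is.na(y))
#[1] 1
```

When the data comes from a file, passing na.strings = "" to read.csv should have a similar effect at import time, assuming the blank fields are meant to be missing.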