Mokimos - 1 year ago
R Question

Why do data frame columns have a different class when subsetted versus when passed to apply?

I am trying to add a summary row to a data frame detailing the levels of each column. I ran into a problem applying the levels function across the frame. I think the reason is that columns treated individually are treated as factor vectors, but when the apply function is used they are treated as characters:

a = c("a","b","c")
b = c("d","e","f")
m = cbind(a,b)
df =
[1] "factor"
apply(df, MARGIN=2, class)
a b
"character" "character"

Which I think is the cause of the problem:

[1] "a" "b" "c"
apply(df, MARGIN=2, levels)

I had a look at the help documentation on apply, data frames, and around the web. Can someone explain why this is?

Answer Source

You can use lapply or sapply to check the class of each column. apply first coerces the data frame to a matrix (via as.matrix), which turns factor columns into character strings, so every column is reported as character and levels returns NULL. lapply and sapply instead work on the data frame's columns as they are stored, so they report each column's actual class, whether that is character or factor.

[1] "factor"

[1] "factor"

       a        b 
"factor" "factor"
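
To make the difference concrete, here is a minimal sketch. It rebuilds a data frame like the one in the question, with stringsAsFactors = TRUE set explicitly as an assumption so the columns are factors (this was the default before R 4.0.0). The variable names (matrix_type, apply_classes, col_classes, col_levels, level_summary) are only illustrative, and the final line is just one possible way to build the per-column summary the question asks about.

```r
# Assumption: stringsAsFactors = TRUE reproduces the factor columns
# from the question (the default before R 4.0.0).
df <- data.frame(a = c("a", "b", "c"),
                 b = c("d", "e", "f"),
                 stringsAsFactors = TRUE)

# apply() converts the data frame with as.matrix() before iterating,
# and a matrix built from factor columns is a character matrix.
matrix_type   <- typeof(as.matrix(df))
apply_classes <- apply(df, MARGIN = 2, class)

# sapply()/lapply() iterate over the columns as stored in the data frame,
# so the factor class and its levels are preserved.
col_classes <- sapply(df, class)
col_levels  <- lapply(df, levels)

# One possible summary: each column's levels collapsed into a single string.
level_summary <- sapply(df, function(col) paste(levels(col), collapse = ", "))

print(apply_classes)
print(col_classes)
print(level_summary)
```

If a summary row is the goal, level_summary is a named character vector with one entry per column, which could be bound to a character version of the data frame; how to attach it depends on the column types you want to keep.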