Ricardo Saporta - 1 month ago
R Question

How does one change the levels of a factor column in a data.table?

What is the correct way to change the levels of a factor column in a data.table (note: not data frame)?

library(data.table)
mydt <- data.table(id=1:6, value=as.factor(c("A", "A", "B", "B", "B", "C")), key="id")

mydt[, levels(value)]
[1] "A" "B" "C"


I am looking for something like:

mydt[, levels(value) <- c("X", "Y", "Z")]


But of course, the above line does not work.

# Actual            # Expected result
> mydt              > mydt
   id value            id value
1:  1     A         1:  1     X
2:  2     A         2:  2     X
3:  3     B         3:  3     Y
4:  4     B         4:  4     Y
5:  5     B         5:  5     Y
6:  6     C         6:  6     Z

Answer

You can still set them the traditional way:

levels(mydt$value) <- c(...)

This should be plenty fast unless mydt is very large, since the traditional syntax copies the entire object. You could also play the un-factoring and refactoring game... but no one likes that game anyway.
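As a rough sketch of the two copying routes just mentioned, using the question's own table: the first call is the traditional levels<- assignment with the question's desired labels filled in, and the second is only one possible reading of the "un-factoring and refactoring" route (the second table, mydt2, and the factor(..., labels = ) call are illustrative assumptions, not something the answer spells out).

```r
library(data.table)

mydt <- data.table(id = 1:6,
                   value = as.factor(c("A", "A", "B", "B", "B", "C")),
                   key = "id")

# Traditional replacement: new levels map onto the old ones by position
# (A -> X, B -> Y, C -> Z). This goes through `$<-`, so it may copy.
levels(mydt$value) <- c("X", "Y", "Z")
mydt[, levels(value)]

# Assumed reading of "un-factor and refactor", on a fresh table (mydt2 is
# a hypothetical name): rebuild the factor from its character values.
mydt2 <- data.table(id = 1:6,
                    value = as.factor(c("A", "A", "B", "B", "B", "C")),
                    key = "id")
mydt2[, value := factor(as.character(value),
                        levels = c("A", "B", "C"),
                        labels = c("X", "Y", "Z"))]
mydt2[, levels(value)]
```

Both tables should end up with levels X, Y, Z and the rows shown in the expected result above.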

To change the levels by reference with no copy of mydt:

setattr(mydt$value,"levels",c(...))

But be sure to assign a valid levels vector (type character, of sufficient length); otherwise you'll end up with an invalid factor, since levels<- does some checking as well as copying and this route skips it.
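A minimal sketch of the by-reference route on the question's table, with the desired labels filled in. The new_levels name and the stopifnot() guard are illustrative additions that stand in for the checking that levels<- would otherwise do; they are not part of data.table itself.

```r
library(data.table)

mydt <- data.table(id = 1:6,
                   value = as.factor(c("A", "A", "B", "B", "B", "C")),
                   key = "id")

new_levels <- c("X", "Y", "Z")  # hypothetical replacement labels

# Manual guard (an assumption, not built into setattr): the vector must be
# character and supply one label per existing level.
stopifnot(is.character(new_levels),
          length(new_levels) == nlevels(mydt$value))

# Overwrite the "levels" attribute in place, without copying mydt.
setattr(mydt$value, "levels", new_levels)

mydt[, levels(value)]
# [1] "X" "Y" "Z"
```

The guard is cheap, so it seems reasonable to keep it next to any setattr() call that touches factor levels.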