j_5chneider - 3 months ago 8
R Question

# R - Obtain the connection between the numeric values and the level labels in a factor

I'm struggling to find the connection between the numeric (integer) values stored in an R factor object and its level labels. I know how to define the levels and the labels. But let's assume I get an unfamiliar data set that contains several factors (here: sex & color):

```
test <- data.frame(
  factor(c(1,2,1,1,2,2,1),
         levels = c(1,2),
         labels = c("female", "male")
  ),
  factor(c(3,2,2,1,4,4,5),
         levels = c(1,2,3,4,5),
         labels = c("red", "green", "blue", "yellow", "brown")
  )
)

names(test) <- c("sex", "color")
test

#>      sex  color
#> 1 female   blue
#> 2   male  green
#> 3 female  green
#> 4 female    red
#> 5   male yellow
#> 6   male yellow
#> 7 female  brown
```

I can obtain the level labels with `attributes()`, and I can obtain the numeric values with, e.g., `test$sex <- as.numeric(test$sex)`.

But how do I know that 1 equals female and 2 equals male? The same problem (only worse) applies to the colors. How do I establish the connection?

Thanks

As others have said, the integer value of each entry is simply its position in `levels()`: the first level is stored as 1, the second as 2, and so on. Personally, I find this easiest to visualize in a reference table.

```
test <- data.frame(
  sex = factor(c(1,2,1,1,2,2,1),
               levels = c(1,2),
               labels = c("female", "male")
  ),
  color = factor(c(3,2,2,1,4,4,5),
                 levels = c(1,2,3,4,5),
                 labels = c("red", "green", "blue", "yellow", "brown")
  )
)

# Make a reference table: one row per level, numbered by position
data.frame(level = seq_along(levels(test$color)),
           label = levels(test$color))

#>   level  label
#> 1     1    red
#> 2     2  green
#> 3     3   blue
#> 4     4 yellow
#> 5     5  brown
```
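To check that the integer codes really are positions in `levels()`, you can index the levels by the codes and compare the result with the labels. This is only a small sketch that rebuilds the `color` factor from the question as a standalone vector; the variable names `codes` and `labels_from_codes` are made up for illustration.

```r
# Rebuild the color factor from the question as a standalone vector
color <- factor(c(3, 2, 2, 1, 4, 4, 5),
                levels = 1:5,
                labels = c("red", "green", "blue", "yellow", "brown"))

# The integer codes stored inside the factor
codes <- as.integer(color)

# Indexing the levels by those codes gives back the labels
labels_from_codes <- levels(color)[codes]

# Should be TRUE if the codes are positions in levels()
identical(labels_from_codes, as.character(color))

# Code/label pairs that actually occur in the data
unique(data.frame(code = codes, label = as.character(color)))
```

Note that the last table lists only the levels that appear in the data, whereas a table built from `levels()` also includes unused levels.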

If you want to get the references for all of the factors in a data frame, you can wrap the code in a function that applies it to every column and keeps only the factors:

```
factor_reference <- function(data)
{
  # Build a reference table for each factor column; NULL for other columns
  Ref <-
    lapply(data,
           function(x)
           {
             if (is.factor(x)) data.frame(level = seq_along(levels(x)),
                                          label = levels(x))
             else NULL
           }
    )

  # Drop the NULL entries left by the non-factor columns
  Ref[!vapply(Ref, is.null, logical(1))]
}

factor_reference(test)
#> $sex
#>   level  label
#> 1     1 female
#> 2     2   male
#>
#> $color
#>   level  label
#> 1     1    red
#> 2     2  green
#> 3     3   blue
#> 4     4 yellow
#> 5     5  brown
```
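If a single table is more convenient than a list of tables, the same idea can be stacked into one "codebook" data frame. This is a rough base R sketch, not part of the original answer; the function name `factor_codebook` and the column name `variable` are hypothetical.

```r
# Example data from the question
test <- data.frame(
  sex = factor(c(1, 2, 1, 1, 2, 2, 1),
               levels = c(1, 2),
               labels = c("female", "male")),
  color = factor(c(3, 2, 2, 1, 4, 4, 5),
                 levels = 1:5,
                 labels = c("red", "green", "blue", "yellow", "brown"))
)

# Hypothetical helper: one row per (variable, level) pair
factor_codebook <- function(data)
{
  # Keep only the factor columns
  factors <- Filter(is.factor, data)

  # One small table per factor, tagged with the column name
  tables <- lapply(names(factors), function(nm) {
    lv <- levels(factors[[nm]])
    data.frame(variable = rep(nm, length(lv)),
               level    = seq_along(lv),
               label    = lv,
               stringsAsFactors = FALSE)
  })

  # Stack the tables; this is NULL if the data frame has no factors
  do.call(rbind, tables)
}

codebook <- factor_codebook(test)
codebook

# Look up the label behind a given code
codebook$label[codebook$variable == "color" & codebook$level == 4]
```

Setting `stringsAsFactors = FALSE` keeps the labels as plain character strings, so the lookup returns text rather than yet another factor.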