pluke pluke - 6 months ago 25
R Question

dplyr mutate on dataframe .Label value, not reference

I have the following dataframe:

temp <- structure(list(ID = c("1234", "1223", "5555", "2344", "4567", "6543"),
Eat = structure(c(6L,1L, 5L, 2L, 3L, 4L),
.Label = c("", "Cabbage", "Carrot", "Lettuce", "Potato","Asparagus", "Mushroom", "Apple"), class = "factor")),
row.names = c(NA, 6L), class = "data.frame", .Names = c("ID", "Eat"))

I want to note each time there is nothing to Eat:

temp %>% mutate(Eat = ifelse(Eat != "", Eat, "Nothing!"))

However, the mutate is applied to the factor's underlying integer codes rather than to its labels:

ID Eat
1 1234 6
2 1223 Nothing!
3 5555 5
4 2344 2
5 4567 3
6 6543 4

How can I get the .Labels carried across to make:

ID Eat
1 1234 Asparagus
2 1223 Nothing!
3 5555 Potato
4 2344 Cabbage
5 4567 Carrot
6 6543 Lettuce


If it's not a requirement in your project, try to avoid factor. character vectors are much easier to handle and are stored as memory-efficiently as factors. I only use factor for plotting or when a specific sort order other than alphabetical is needed.
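As a minimal sketch of that advice applied to the question: ifelse() drops attributes, so a bare factor comes back as its integer codes, while converting to character first keeps the labels. The data frame below is rebuilt from the question's data, and base R is used so the snippet runs without extra packages.

```r
# Data from the question; Eat is a factor and row 2 holds the empty level "".
temp <- data.frame(
  ID = c("1234", "1223", "5555", "2344", "4567", "6543"),
  Eat = factor(
    c("Asparagus", "", "Potato", "Cabbage", "Carrot", "Lettuce"),
    levels = c("", "Cabbage", "Carrot", "Lettuce", "Potato",
               "Asparagus", "Mushroom", "Apple")
  ),
  stringsAsFactors = FALSE
)

# Convert to character before ifelse() so the labels, not the codes, survive.
temp$Eat <- ifelse(temp$Eat != "", as.character(temp$Eat), "Nothing!")

print(temp)
```

The same expression should work inside the original pipeline: temp %>% mutate(Eat = ifelse(Eat != "", as.character(Eat), "Nothing!")).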

"... R has a global string pool. This means that each unique string is only stored in one place, and therefore character vectors take up less memory than you might expect" (Hadley Wickham, Advanced R)

This was different in the past, which explains why coercion of strings to factor was, and in many functions still is, the default. In R versions before 4.0.0 you have to call read.csv or data.frame with the explicit parameter stringsAsFactors = FALSE to avoid this; since R 4.0.0 that is the default.
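A short sketch of the explicit parameter, using a two-row subset of the question's data as illustrative input. Passing stringsAsFactors = FALSE keeps the column as character whatever the default of the installed R version is.

```r
# Build the data frame with strings kept as character vectors.
df <- data.frame(
  ID = c("1234", "1223"),
  Eat = c("Asparagus", ""),
  stringsAsFactors = FALSE
)
str(df)

# With a character column, the ifelse() from the question needs no conversion.
df$Eat <- ifelse(df$Eat != "", df$Eat, "Nothing!")
print(df)
```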

Recent R packages like data.table or those from Hadley's tidyverse (tibble) never coerce inputs.

But if you need factor, you may follow @Alistaire's advice and use Hadley's forcats package.
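If the column has to stay a factor, one possible approach is to rename the empty level instead of going through ifelse(). The sketch below does this in base R, then repeats it with forcats::fct_recode() only if forcats happens to be installed. The variable names eat and eat2 are illustrative, and this is not necessarily what @Alistaire proposed.

```r
eat <- factor(
  c("Asparagus", "", "Potato"),
  levels = c("", "Cabbage", "Carrot", "Lettuce", "Potato", "Asparagus")
)

# Base R: rename the empty level in place; the vector stays a factor.
levels(eat)[levels(eat) == ""] <- "Nothing!"
print(eat)

# forcats version, guarded so the snippet also runs without the package.
if (requireNamespace("forcats", quietly = TRUE)) {
  eat2 <- factor(c("Asparagus", "", "Potato"))
  print(forcats::fct_recode(eat2, "Nothing!" = ""))
}
```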