ipuzek ipuzek - 1 year ago 53
R Question

Collapse variables into one variable with values preserving the variable names

After merging two data frames, I get this:

d.f <- data.frame(sex = c("M", "F", NA, NA),
age = c(NA, NA, "old", "young"),
n = c(2, 1, 3, 4))

...but would like to combine the first two variables into one, and end up with this:

sex.age n
sex.M 2
sex.F 1
age.old 3
age.young 4

It seems like a simple (and probably common) problem, but I am really struggling to find the answer. I thought there is something ready made along the lines of tidyr and reshape, but I am not finding it - probably because the resulting structure is very untidy.

The data structure will stay the same, meaning that NA's are always in different rows. Solutions like



apply(d.f[, 1:2], 1, function(x) x[!is.na(x)])

do work, but there is no easy way to get the variable names prefixes.

Answer Source

Here is an option using dplyr/tidyr. We covert the 'wide' to 'long' format with gather, unite the columns 'Var' and 'Val' to create 'sex.age' and rearrange the column order.

gather(d.f, Var, Val, -n, na.rm = TRUE) %>% 
                 unite(sex.age, Var, Val, sep=".") %>%
#    sex.age n
#1     sex.M 2
#2     sex.F 1
#7   age.old 3
#8 age.young 4