Kenji Kenji - 2 days ago 5
R Question

How to melt with data.table into person years getting year suffix as variable column?

I have a data.table that looks like:

PID vara90 vara91 vara92 vara93 varb90 varb91 varb92 varb93
1: 1 2 1 45 77 "Str1" "Str3" "Str3" "Str1"
2: 2 6 1 5 3 "Str2" "Str5" "Str4" "Str1"
3: 3 1 1 56 65 "Str3" "Str2" "Str6" "Str1"
4: 4 8 1 45 9 "Str1" "Str8" "Str3" "Str2"
5: 5 12 3 45 44 "Str4" "Str1" "Str8" "Str3"
6: 6 2 6 78 1 "Str5" "Str3" "Str1" "Str4"
7: 7 2 3 45 23 "Str2" "Str1" "Str3" "Str1"
8: 8 2 1 44 87 "Str1" "Str3" "Str1" "Str4"


And want to get it into long format with the following shape:

PID year vara varb
1: 1 90 2 "Str1"
2: 1 91 1 "Str3"
3: 1 92 45 "Str3"
.
.
.


I am using:

melt(df, id.vars("PID", measure=patterns("^vara","^varb"), value.name=c("vara","varb"),variable.name="year)


This gets me what I want, except that the
year
column is populated by numbers from 1 to 3. Is there a way to get the year column populated by the year suffixes from the variable names? I can do that with base
reshape
and with
tidyr
, but I have a really large dataset that asks for the memory efficiency of data.table.

Answer

Until the FR linked by Frank is implemented, we'll have to set the levels manually (variable col is factor):

ans = melt(dt, measure = patterns("^vara", "^varb"),
              value.name = c("vara", "varb"), variable.name = "year")
setattr(ans$year, 'levels', gsub("vara", "", names(dt)[2:5]))
Comments