PavoDive PavoDive - 3 months ago 20
R Question

Melt data.table according to nested list

I had a data.table like this:

library(data.table)
dt <- data.table(a = c(rep("A", 3), rep("B", 3)), b = c(1, 3, 5, 2, 4, 6))


I needed to perform an operation (forecast) on the values for each
a
, so I decided to put them in a list, like this:

dt <- dt[, x := .(list(b)), by = a][, .SD[1,], by = a, .SDcols = "x"]


Now I wanted to "melt" (that's the thing that comes to mind)
dt
back into its original form.

I could do it for very few levels of
a
like this:

dt2 <- rbind(expand.grid(dt[1, a], dt[1, x[[1]]]), expand.grid(dt[2, a], dt[2, x[[1]]]))


but of course, the solution is impractical for more levels of
a
.

I've tried

dt2 <- dt[, expand.grid(a, x[[1]]), by = a]


which results in

dt2
## a Var1 Var2
## 1: A A 1
## 2: A A 3
## 3: A A 5
## 4: B A 2
## 5: B A 4
## 6: B A 6


it's interesting to notice that
Var1
doesn't actually follow the "A - B" pattern expected (but at least
a
remains).

Is there a better approach to achieve this?

EDITS



Expected output will be the result of

dt2[, .(a, Var2)]



  • Corrected "melt" for "dcast".


Answer

You are looking for a method to nest(convert a column from a atomic vector type to list type) and unnest(the opposite direction) in a data.table way. This is different from reshaping data which either spread a column values to row header(dcast) or gather the row headers to a column values(melt):

In data.table syntax, you can use list and unlist on the target column to summarize or broadcast it along with group variables:

Say if we are starting from:

dt
#    a b
# 1: A 1
# 2: A 3
# 3: A 5
# 4: B 2
# 5: B 4
# 6: B 6

To repeat what you have achieved in your first step, i.e. nest column b, you can do:

dt_nest <- dt[, .(b = list(b)), a]
dt_nest
#    a     b
# 1: A 1,3,5
# 2: B 2,4,6

To go the opposite direction, use unlist with the group variable:

dt_nest[, .(b = unlist(b)), a]
#    a b
# 1: A 1
# 2: A 3
# 3: A 5
# 4: B 2
# 5: B 4
# 6: B 6