MichaelChirico MichaelChirico - 1 year ago 50
R Question

`dcast` with empty RHS

Is there any way to flip wide my data without first specifying a variable against which to be flipped? The logical default seems to me to be the in-group index.

For example,

DT <- data.table(id = rep(6:10, each = 3), var = rnorm(15))
DT
# id var
# 1: 6 1.58293930
# 2: 6 0.44234019
# 3: 6 -0.06576521
# 4: 7 -0.65124980
# 5: 7 0.88371933
# 6: 7 -1.94998135
# 7: 8 -1.95746466
# 8: 8 -0.50978195
# 9: 8 -0.40450447
# 10: 9 -0.61097399
# 11: 9 -0.92335213
# 12: 9 -0.19881983
# 13: 10 0.13022635
# 14: 10 -0.30141200
# 15: 10 0.78355188


What I want is basically, for each
id
, each value of
var
in a different column (and
NA
s if there's any
id
with fewer
var
values associated), which can be done like so:

DT[ , I := 1:.N, by = id] #side note, why doesn't I:=.I work??
dcast(DT, id ~ I, value.var = "var")
# id 1 2 3
# 1: 6 1.5829393 0.4423402 -0.06576521
# 2: 7 -0.6512498 0.8837193 -1.94998135
# 3: 8 -1.9574647 -0.5097820 -0.40450447
# 4: 9 -0.6109740 -0.9233521 -0.19881983
# 5: 10 0.1302263 -0.3014120 0.78355188


However, it would be more convenient if I didn't have to define
I
first, like so:

dcast(DT, id~ ., value.var = "var")


But this doesn't work:


Aggregate function missing, defaulting to 'length'


# id .
# 1: 6 3
# 2: 7 3
# 3: 8 3
# 4: 9 3
# 5: 10 3


Is there perhaps an aggregating function that I could pass to get the desired effect?

Answer Source

This is now possible using the rowid function hot off the press in the development version of data.table (1.9.7, installation instructions here):

> dcast(dt, id ~ rowid(id), value.var = "var")
   id          1          2          3
1:  6  1.1050942  0.1271620  1.3051373
2:  7 -0.5441056 -0.6866828 -0.8083762
3:  8 -0.6812820 -1.1934716 -1.3913903
4:  9 -0.3462497 -0.8229276 -1.0884394
5: 10 -0.4600681  0.6173795 -1.0125658

See ?rowid for more options, examples, and explanation.