MichaelChirico MichaelChirico - 3 months ago 11
R Question

`dcast` with empty RHS

Is there any way to flip wide my data without first specifying a variable against which to be flipped? The logical default seems to me to be the in-group index.

For example,

DT <- data.table(id = rep(6:10, each = 3), var = rnorm(15))
DT
# id var
# 1: 6 1.58293930
# 2: 6 0.44234019
# 3: 6 -0.06576521
# 4: 7 -0.65124980
# 5: 7 0.88371933
# 6: 7 -1.94998135
# 7: 8 -1.95746466
# 8: 8 -0.50978195
# 9: 8 -0.40450447
# 10: 9 -0.61097399
# 11: 9 -0.92335213
# 12: 9 -0.19881983
# 13: 10 0.13022635
# 14: 10 -0.30141200
# 15: 10 0.78355188


What I want is basically, for each
id
, each value of
var
in a different column (and
NA
s if there's any
id
with fewer
var
values associated), which can be done like so:

DT[ , I := 1:.N, by = id] #side note, why doesn't I:=.I work??
dcast(DT, id ~ I, value.var = "var")
# id 1 2 3
# 1: 6 1.5829393 0.4423402 -0.06576521
# 2: 7 -0.6512498 0.8837193 -1.94998135
# 3: 8 -1.9574647 -0.5097820 -0.40450447
# 4: 9 -0.6109740 -0.9233521 -0.19881983
# 5: 10 0.1302263 -0.3014120 0.78355188


However, it would be more convenient if I didn't have to define
I
first, like so:

dcast(DT, id~ ., value.var = "var")


But this doesn't work:


Aggregate function missing, defaulting to 'length'


# id .
# 1: 6 3
# 2: 7 3
# 3: 8 3
# 4: 9 3
# 5: 10 3


Is there perhaps an aggregating function that I could pass to get the desired effect?

Answer

This is now possible using the rowid function hot off the press in the development version of data.table (1.9.7, installation instructions here):

> dcast(dt, id ~ rowid(id), value.var = "var")
   id          1          2          3
1:  6  1.1050942  0.1271620  1.3051373
2:  7 -0.5441056 -0.6866828 -0.8083762
3:  8 -0.6812820 -1.1934716 -1.3913903
4:  9 -0.3462497 -0.8229276 -1.0884394
5: 10 -0.4600681  0.6173795 -1.0125658

See ?rowid for more options, examples, and explanation.