Connor J - 1 year ago 71
R Question

# How to create multiple new columns with data.table's j over different subsets

I want to create multiple variables that are aggregating various subsets of a dataset. For an illustrating example, say you have the following data:

``````DT = data.table(Group1 = c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4),
Group2 = c(1,1,1,2,2,1,1,2,2,2,1,1,1,1,2,1,1,2,2,2),
Var1 = c(1,1,0,0,0,1,1,0,1,0,1,0,0,0,0,0,0,0,0,0))
``````

I want to find several averages of variable
`Var1`
. I want to know:

• `mean(Var1)`
grouped by
`Group1`

• `mean(Var1)`
for only those with
`Group2 == 1`
, grouped by
`Group1`

• `mean(Var1)`
for only those with
`Group2 == 2`
, grouped by
`Group1`

Or, in data.table parlance,

``````DT[, mean(Var1), by=Group1]
DT[Group2==1, mean(Var1), by=Group1]
DT[Group2==2, mean(Var1), by=Group1]
``````

Obviously, calculating any one of these is very straightforward. But I can't find a good way to calculate all three of them, since they use different subsets in
`i`
. The solution I've been using so far is generating them individually, then merging them into a unified table.

``````DT_all <- DT[, .(avgVar1_all = mean(Var1)), by = Group1]
DT_1 <- DT[Group2 == 1, .(avgVar1_1 = mean(Var1)), by = Group1]
DT_2 <- DT[Group2 == 2, .(avgVar1_2 = mean(Var1)), by = Group1]
group_info <- merge(DT_all, DT_1, by = "Group1")
group_info <- merge(group_info, DT_2, by = "Group1")

group_info
#    Group1 avgVar1_all avgVar1_1 avgVar1_2
# 1:      1         0.4 0.6666667 0.0000000
# 2:      2         0.6 1.0000000 0.3333333
# 3:      3         0.2 0.2500000 0.0000000
# 4:      4         0.0 0.0000000 0.0000000
``````

Is there a more elegant method I could be using?

Just do it all in one grouping operation using `.SD`:

``````DT[, .(
all  = mean(Var1),
grp1 = .SD[Group2==1, mean(Var1)],
grp2 = .SD[Group2==2, mean(Var1)]
),
by = Group1]

#   Group1 all      grp1      grp2
#1:      1 0.4 0.6666667 0.0000000
#2:      2 0.6 1.0000000 0.3333333
#3:      3 0.2 0.2500000 0.0000000
#4:      4 0.0 0.0000000 0.0000000
``````
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download