Zach - 1 year ago 100
R Question

Aggregating sub totals and grand totals with data.table

I've got a

`data.table`
in R:

``````library(data.table)
set.seed(1)
DT = data.table(
group=sample(letters[1:2],100,replace=TRUE),
year=sample(2010:2012,100,replace=TRUE),
v=runif(100))
``````

Aggregating this data into a summary table by group and year is simple and elegant:

``````table <- DT[,mean(v),by='group, year']
``````

However, aggregating this data into a summary table, including subtotals and grand totals, is a little more difficult, and a lot less elegant:

``````library(plyr)
yearTot <- DT[,list(mean(v),year='Total'),by='group']
groupTot <- DT[,list(mean(v),group='Total'),by='year']
Tot <- DT[,list(mean(v), year='Total', group='Total')]
table <- rbind.fill(table,yearTot,groupTot,Tot)
table\$group[table\$group==1] <- 'Total'
table\$year[table\$year==1] <- 'Total'
``````

This yields:

``````table[order(table\$group, table\$year), ]
``````

Is there a simple way to specify subtotals and grand totals with data.table, such as the
`margins=TRUE`
command for plyr? I would prefer to use data.table over plyr on my dataset, as it is a very large dataset that I already have in the data.table format.

I'm not aware of a simple way. Here's a first stab at an implementation. I don't know `margins=TRUE` in plyr, is this what that does?

``````crossby = function(DT, j, by) {
j = substitute(j)
ans = rbind(
DT[,eval(j),by],
DT[,list("Total",eval(j)),by=by[1]],
cbind("Total",DT[,eval(j),by=by[2]]),
list("Total","Total",DT[,eval(j)]),
use.names=FALSE
# 'use.names' argument added in data.table v1.8.0
)
setkeyv(ans,by)
ans
}

crossby(DT, mean(v), c("group","year"))

group  year        V1
[1,]     a  2010 0.2926945
[2,]     a  2011 0.4176346
[3,]     a  2012 0.4227796
[4,]     a Total 0.3901875
[5,]     b  2010 0.5231845
[6,]     b  2011 0.4997119
[7,]     b  2012 0.4306871
[8,]     b Total 0.4835788
[9,] Total  2010 0.4278093
[10,] Total  2011 0.4463616
[11,] Total  2012 0.4271160
[12,] Total Total 0.4350153
``````
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download