koteletje koteletje - 2 months ago 8
R Question

setDT - group by over all columns in a vector

Imagine one has a table X, with 10 columns Col1 - Col5. Col5 is column with amounts. With setDT I can compute the sum of Col5 over the unique Col1-Col4 combinations:

by = .(Col1,Col2,Col3,Col4)]

Is there a way to prevent having to list every Col1-4 in the by statement and instead for instance use a vector that identifies all the group by columns?

This for instance doesn't work for obvious reasons:

by = .(colnames(X[,-c(5)]))]

I don't want to use the aggregate function because it turns everything in factors.

vkp vkp

You can use paste0.

setDT(X)[,list(sum(Col5)), by = .(c(paste0("Col",1:4)))]