Alex Alex - 6 days ago 5
R Question

dynamic column names in data.table, R

I am trying to add columns to my data.table, where the names are dynamic. In addition, I need to use the by argument when adding these columns. For example:

test_dtb <- data.table(a=sample(1:100, 100), b=sample(1:100, 100), id=rep(1:10,10))
cn <- parse(text="blah")
test_dtb[,eval(cn):=mean(a), by=id]

Error in `[.data.table`(test_dtb, , `:=`(eval(cn), mean(a)), by = id) :
LHS of := must be a single column name when with=TRUE. When with=FALSE the LHS may be a vector of column names or positions.


Another attempt:

cn <- "blah"
test_dtb[,cn:=mean(a), by=id, with=FALSE]
Error in `[.data.table`(test_dtb, , `:=`(cn, mean(a)), by = id, with = FALSE) : 'with' must be TRUE when 'by' or 'keyby' is provided





Update from Matthew:

This now works in v1.8.3 on R-Forge. Thanks for highlighting!

See this similar question for new examples:

Assign multiple columns using data.table, by group

Answer

Update of 2016-11-29

Nowadays, you can just do this:

## `(cn)` (or `eval(cn)`) gets evaluated to "blah" before `:=` is carried out
test_dtb[, (cn):=mean(a), by=id]
head(test_dtb, 4)
#     a  b id blah
# 1: 41 19  1 54.2
# 2:  4 99  2 50.0
# 3: 49 85  3 46.7
# 4: 61  4  4 57.1
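
The same parenthesised form should also accept a character vector of several names, which may help when more than one dynamic column is needed per group. The following is a minimal sketch, not part of the original answer: the names src_cols and new_cols and the "mean_" prefix are made up for illustration, and set.seed() is there only to make the run reproducible.

```r
library(data.table)

set.seed(42)  # only so the sketch is reproducible
test_dtb <- data.table(a  = sample(1:100, 100),
                       b  = sample(1:100, 100),
                       id = rep(1:10, 10))

# Hypothetical column names, built at run time
src_cols <- c("a", "b")
new_cols <- paste0("mean_", src_cols)

# Wrapping the LHS in parentheses makes := read it as a character
# vector of column names instead of a literal column called `new_cols`.
# .SDcols restricts .SD to the source columns, so lapply() returns one
# group mean per new column.
test_dtb[, (new_cols) := lapply(.SD, mean), by = id, .SDcols = src_cols]

head(test_dtb, 4)
```

Each row then carries the mean of a and of b for its id group in mean_a and mean_b.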

Original answer:

You were on exactly the right track: constructing an expression to be evaluated within the call to [.data.table is the data.table way to do this sort of thing. Going just a bit further, why not construct an expression that evaluates to the entire j argument (rather than just its left hand side)?

Something like this should do the trick:

## Your code so far
library(data.table)
test_dtb <- data.table(a=sample(1:100, 100),b=sample(1:100, 100),id=rep(1:10,10))
cn <- "blah"

## One solution
expr <- parse(text = paste0(cn, ":=mean(a)"))
test_dtb[,eval(expr), by=id]

## Checking the result
head(test_dtb, 4)
#     a  b id blah
# 1: 30 26  1 38.4
# 2: 83 82  2 47.4
# 3: 47 66  3 39.5
# 4: 87 23  4 65.2
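
A variation on the same idea, sketched here as an assumption rather than as part of the original answer, is to build the call with substitute() and as.name() instead of pasting and parsing text. This avoids string parsing, so it should also cope with column names that are not syntactically valid R names (for example, names containing spaces). The placeholder symbol lhs is arbitrary.

```r
library(data.table)

set.seed(42)  # only so the sketch is reproducible
test_dtb <- data.table(a  = sample(1:100, 100),
                       b  = sample(1:100, 100),
                       id = rep(1:10, 10))
cn <- "blah"

# Build the unevaluated call `blah := mean(a)` by swapping the
# placeholder symbol `lhs` for the symbol named by `cn`.
expr <- substitute(lhs := mean(a), list(lhs = as.name(cn)))

# As in the parse() version, the whole j argument is supplied via eval()
test_dtb[, eval(expr), by = id]

head(test_dtb, 4)
```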