CronAcronis - 3 months ago
R Question

data.table: fast access to a column by expression

I am trying to get a column from a data.table by a given expression. I receive CaseID as an expression:

expr_caseid <- expression(CaseID)

How do I get the column by expression in the fastest possible way?

library(data.table)
dt_fcst <- data.table(CaseID = as.integer(runif(1e8)*100))

expr_caseid <- expression(CaseID)

testExpr = function(DT_, expr_){
    DT_[[deparse(substitute(expr_))]]
}

testGetElement = function(DT_, expr_){
    getElement(DT_, deparse(substitute(expr_)))
}

library(microbenchmark)
microbenchmark(
##  by_char = dt_fcst[['CaseID']],
    by_deparse = testExpr(dt_fcst, CaseID),
##  by_expr = dt_fcst[, list(CaseID)],
##  by_dollar = dt_fcst$CaseID,
    by_eval = eval(
        expr_caseid,
        envir = as.environment(dt_fcst)
    ),
    by_getElement = testGetElement(dt_fcst, CaseID)
#   ,by_index = dt_fcst@.Data[[1]]
, times = 1000L)
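
One thing worth noting about the two helpers above: deparse(substitute(expr_)) captures whatever was typed at the call site, so they are benchmarked with the bare symbol CaseID rather than with the expr_caseid object. If only the expression object is available, the column name can be recovered from its first element. The following is a minimal sketch under that assumption; dt_small and col_name are hypothetical names used only for illustration.

```r
library(data.table)

# Hypothetical small table standing in for dt_fcst
dt_small <- data.table(CaseID = c(3L, 1L, 2L))
expr_caseid <- expression(CaseID)

# The first element of the expression vector is the symbol CaseID;
# as.character() turns that symbol into the column name
col_name <- as.character(expr_caseid[[1]])

# Plain lookup by name, same as dt_small[['CaseID']]
by_name <- dt_small[[col_name]]
```

This keeps the lookup a plain [[ call by name, which is the by_char variant commented out in the benchmark.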


Results of performance measurements:

Unit: microseconds
          expr  min    lq    mean median   uq   max neval cld
    by_deparse 37.2 41.35 55.0700  46.15 60.6 357.8  1000   b
       by_eval 15.9 22.30 29.2194  24.80 34.3 289.8  1000  a
 by_getElement 38.3 42.20 55.9087  47.30 63.2 283.3  1000   b

Answer

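Following a comment by Frank: pass the data.table to eval() directly instead of coercing it with as.environment() first. In the benchmark this variant (by_evalNoCoerce) had a median of 5.30 microseconds, against 24.80 for by_eval: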

Unit: microseconds
            expr  min    lq     mean median    uq     max neval cld
 by_evalNoCoerce  1.8  4.00   8.7652   5.30  7.60   479.8  1000  a 


microbenchmark(
##  by_char = dt_fcst[['CaseID']],
    by_deparse = testExpr(dt_fcst, CaseID),
##  by_expr = dt_fcst[, list(CaseID)],
##  by_dollar = dt_fcst$CaseID,
    by_eval = eval(
        expr_caseid, 
        envir = as.environment(dt_fcst)
    ),
    by_getElement = testGetElement(dt_fcst, CaseID),
    by_evalNoCoerce = eval(expr_caseid, dt_fcst)
#   ,by_index = dt_fcst@.Data[[1]]
, times = 1000L)
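
As a quick sanity check that the variants return the same column, here is a minimal sketch on a small table. It assumes only that eval() accepts a list or data frame as its second argument, which is the behaviour by_evalNoCoerce relies on; dt_small is a hypothetical stand-in for dt_fcst.

```r
library(data.table)

# Hypothetical small table standing in for dt_fcst
dt_small <- data.table(CaseID = c(3L, 1L, 2L))
expr_caseid <- expression(CaseID)

# Variant from the question: coerce the table to an environment first
by_eval <- eval(expr_caseid, envir = as.environment(dt_small))

# Variant from the answer: pass the data.table itself as the evaluation frame
by_evalNoCoerce <- eval(expr_caseid, dt_small)

# Both variants should yield the same integer vector as a lookup by name
stopifnot(identical(by_eval, by_evalNoCoerce))
stopifnot(identical(by_evalNoCoerce, dt_small[["CaseID"]]))
```

Timings on a three-row table say nothing about the 1e8-row case, so this only checks that the results agree, not that they are equally fast.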