sds sds - 1 year ago 100
R Question

Filter data.table by multiple columns, dynamically

Suppose I have a data.table with a few columns:

library(data.table)
a <- data.table(id=1:1000, x=runif(1000), y=runif(1000), z=runif(1000))

I want to drop the rows where any of x, y, or z is below its median:

a <- a[ x > median(x) & y > median(y) & z > median(z) ]

(aside: does the above call median 3 times or 3000 times?)
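One rough way to check the aside is to wrap median in a counting function and run the same filter. This is only a sketch: counted.median and n.calls are hypothetical names made up for the illustration, not part of data.table, and the table is rebuilt here so the snippet runs on its own.

```r
library(data.table)

set.seed(1)
a <- data.table(id = 1:1000, x = runif(1000), y = runif(1000), z = runif(1000))

# Hypothetical counter, for illustration only
n.calls <- 0
counted.median <- function(v) {
  n.calls <<- n.calls + 1   # record every call
  median(v)
}

# Same filter as above, with median swapped for the counting wrapper
kept <- a[ x > counted.median(x) & y > counted.median(y) & z > counted.median(z) ]

n.calls   # number of times the wrapper was called
```

Because the filter expression is evaluated once on whole columns, the counter should end up at one call per column rather than one per row.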

What I currently do is:

my.cols <- c("x","y","z")
my.meds <- sapply(my.cols, function(n) median(a[[n]]))
a <- a[ Reduce(`&`,Map(function(i) a[[my.cols[i]]] > my.meds[i], 1:length(my.cols))) ]

Is this the best I could do?

Answer

One option is to construct the string you want and eval/parse it:

EVAL = function(...) eval.parent(parse(text=paste0(...)))   # helper: evaluate the pasted string in the caller's frame, where the columns are visible

a[ EVAL(my.cols, ">median(", my.cols, ")", collapse=" & ") ]
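If you would rather not build and parse a string, another possible route is to compute one logical vector per column with .SD and .SDcols, then combine them with Reduce. This is a minimal sketch under the question's setup; the names above and res are made up for the illustration, and the table is rebuilt so the snippet runs on its own.

```r
library(data.table)

set.seed(1)
a <- data.table(id = 1:1000, x = runif(1000), y = runif(1000), z = runif(1000))
my.cols <- c("x", "y", "z")

# One logical column per entry of my.cols: TRUE where the value exceeds that column's median
above <- a[, lapply(.SD, function(v) v > median(v)), .SDcols = my.cols]

# AND the per-column tests together and use the result as the row filter
res <- a[ Reduce(`&`, above) ]
```

This keeps the column names as plain data in my.cols, so adding or removing a column only means changing that vector.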