user1839897 - 1 month ago
R Question

dplyr string as column reference

Is there any way to pass a string as a column reference to a dplyr function?

Here is an example with a grouped dataset and a simple function, to which I try to pass a string as a reference to a column. Thanks!

machines <- data.frame(Date=c("1/31/2014", "1/31/2014", "2/28/2014", "2/28/2014", "3/31/2014", "3/31/2014"),
                       Model.Num=c("123", "456", "123", "456", "123", "456"),
                       Cost=c(200, 300, 250, 350, 300, 400))

my.fun <- function(data, colname){
    mutate(data, position=cumsum(as.name(colname)))
}

machines <- machines %>% group_by(Date, Model.Num)
machines <- my.fun(machines, "Cost")

Answer

Here's an option that uses interp() from the lazyeval package, which came with your dplyr install. Inside your function, you'll need to use the standard-evaluation version of the dplyr verb, which in this case is mutate_(). interp() substitutes the column name, converted to a symbol with as.name(), for the placeholder x in the formula ~ cumsum(x).

Note that the new column position will be identical to the Cost column here, because machines is grouped by both Date and Model.Num and each of those groups contains a single row. The second call to my_fun() shows it working on a different grouping variable, where the cumulative sum runs within each Model.Num.

library(dplyr)
library(lazyeval)

my_fun <- function(data, col) {
    mutate_(data, position = interp(~ cumsum(x), x = as.name(col)))
}

my_fun(machines, "Cost")
#        Date Model.Num Cost position
# 1 1/31/2014       123  200      200
# 2 1/31/2014       456  300      300
# 3 2/28/2014       123  250      250
# 4 2/28/2014       456  350      350
# 5 3/31/2014       123  300      300
# 6 3/31/2014       456  400      400

## second example - different grouping
my_fun(group_by(machines, Model.Num), "Cost")
#        Date Model.Num Cost position
# 1 1/31/2014       123  200      200
# 2 1/31/2014       456  300      300
# 3 2/28/2014       123  250      450
# 4 2/28/2014       456  350      650
# 5 3/31/2014       123  300      750
# 6 3/31/2014       456  400     1050
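
For readers on dplyr 0.7.0 or later, where mutate_() and the other underscore verbs are deprecated, the same idea can be sketched with the .data pronoun instead of lazyeval. This is only a minimal sketch under that version assumption, not part of the original answer; the function name my_fun2 and the variable names by_both and by_model are hypothetical and introduced here for illustration.

```r
library(dplyr)

machines <- data.frame(
    Date = c("1/31/2014", "1/31/2014", "2/28/2014", "2/28/2014", "3/31/2014", "3/31/2014"),
    Model.Num = c("123", "456", "123", "456", "123", "456"),
    Cost = c(200, 300, 250, 350, 300, 400)
)

# Hypothetical variant of my_fun(); assumes dplyr >= 0.7.0.
my_fun2 <- function(data, col) {
    # .data[[col]] looks up the column whose name is held in the string `col`
    mutate(data, position = cumsum(.data[[col]]))
}

# Grouped by Date and Model.Num: one row per group, so position equals Cost
by_both <- my_fun2(group_by(machines, Date, Model.Num), "Cost")

# Grouped by Model.Num only: cumulative sum within each model
by_model <- my_fun2(group_by(machines, Model.Num), "Cost")
by_model$position
# [1]  200  300  450  650  750 1050
```

Because the column name stays an ordinary string, nothing needs to be converted with as.name() or interpolated into a formula.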