neversaint - 1 year ago
R Question

How to take vector content as backquote variable in tidyr/dplyr

I have the following data frame, and it works as I want
with this code:

df <- structure(list(celltype = structure(c(1L, 1L, 2L, 2L, 3L, 3L,
4L, 4L, 5L, 5L, 6L, 6L, 7L, 7L, 8L, 8L, 9L, 9L, 10L, 10L), .Label = c("Bcells",
"DendriticCells", "Macrophages", "Monocytes", "NKCells", "Neutrophils",
"StemCells", "StromalCells", "abTcells", "gdTCells"), class = "factor"),
sample = c("SP ID control", "SP ID treated", "SP ID control",
"SP ID treated", "SP ID control", "SP ID treated", "SP ID control",
"SP ID treated", "SP ID control", "SP ID treated", "SP ID control",
"SP ID treated", "SP ID control", "SP ID treated", "SP ID control",
"SP ID treated", "SP ID control", "SP ID treated", "SP ID control",
"SP ID treated"), `mean(score)` = c(0.160953535029424, 0.155743474395545,
0.104788051104575, 0.125247035158472, -0.159665650045289,
-0.134662049979712, 0.196249441751866, 0.212256889027029,
0.0532668251890109, 0.0738264693971133, 0.151828478029596,
0.159941552142933, -0.14128323638966, -0.120556640790534,
0.196518649474078, 0.185264282171863, 0.0654641151966543,
0.0837989059507186, 0.145111577618456, 0.145448549866796)), .Names = c("celltype",
"sample", "mean(score)"), row.names = c(7L, 8L, 17L, 18L, 27L,
28L, 37L, 38L, 47L, 48L, 57L, 58L, 67L, 68L, 77L, 78L, 87L, 88L,
97L, 98L), class = "data.frame")

library(tidyr)
library(dplyr)

df %>% spread(sample, `mean(score)`) %>%
    mutate(pairwise_division = `SP ID treated` / `SP ID control`)

## celltype SP ID control SP ID treated pairwise_division
## 1 Bcells 0.16095354 0.15574347 0.9676300
## 2 DendriticCells 0.10478805 0.12524704 1.1952416
## 3 Macrophages -0.15966565 -0.13466205 0.8434003
## 4 Monocytes 0.19624944 0.21225689 1.0815668
## 5 NKCells 0.05326683 0.07382647 1.3859746
## 6 Neutrophils 0.15182848 0.15994155 1.0534358
## 7 StemCells -0.14128324 -0.12055664 0.8532976
## 8 StromalCells 0.19651865 0.18526428 0.9427313
## 9 abTcells 0.06546412 0.08379891 1.2800739
## 10 gdTCells 0.14511158 0.14544855 1.0023222


Note that the line

mutate(pairwise_division = `SP ID treated` / `SP ID control`)


refers to the columns by their backquoted names.

What I want to do next is take those column names from a
character vector instead of typing them. I tried this:

content <- c("SP ID treated" , "SP ID control")

df %>% spread(sample, `mean(score)`) %>%
    mutate(pairwise_division = content[1] / content[2])


But it gave me this error:

Error: non-numeric argument to binary operator


What's the right way to do it?

Answer

If you want to pass column names as strings, you are going to have to use mutate_() rather than mutate(). For example:

df %>% spread(sample, `mean(score)`) %>% 
    mutate_(.dots=list(pairwise_division = 
        substitute(a/b, list(
        a=as.name(content[1]), 
        b=as.name(content[2]))
    )))

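The as.name() call turns each string into a symbol, so a name containing spaces is still treated as a single variable. The original attempt failed because content[1] / content[2] tries to divide one character string by another.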
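
To make the mechanics concrete, here is a minimal sketch of what the substitute() call builds, followed by an alternative that avoids mutate_(). As far as I know, the underscore verbs are deprecated in dplyr 0.7 and later, where indexing the .data pronoun with a string is the suggested replacement. The small data frame named wide is an assumption: it stands in for the result of spread() and holds only two rows copied from the printed output. The names expr, ratio_base and result are illustrative as well.

```r
library(dplyr)

# Stand-in for the spread() result in the question (two rows only;
# values copied from the printed output above).
wide <- data.frame(
  celltype = c("Bcells", "DendriticCells"),
  `SP ID control` = c(0.16095354, 0.10478805),
  `SP ID treated` = c(0.15574347, 0.12524704),
  check.names = FALSE  # keep the spaces in the column names
)

content <- c("SP ID treated", "SP ID control")

# What substitute() + as.name() builds: an unevaluated call whose
# operands are symbols rather than character strings.
expr <- substitute(a / b, list(a = as.name(content[1]),
                               b = as.name(content[2])))

# Evaluating the call with the data frame's columns in scope
# performs the column-wise division.
ratio_base <- eval(expr, wide)

# Alternative without mutate_(): look the columns up by string
# through the .data pronoun inside a regular mutate().
result <- wide %>%
  mutate(pairwise_division = .data[[content[1]]] / .data[[content[2]]])
```

Both routes should give the same ratios. The .data form keeps the strings as plain strings, which may be easier to read when the column names come from user input or a configuration vector.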