neversaint - 3 months ago
R Question

How to take vector content as backquote variable in tidyr/dplyr

I have the following data frame, and it works as I want
with this code:

df <- structure(list(celltype = structure(c(1L, 1L, 2L, 2L, 3L, 3L,
4L, 4L, 5L, 5L, 6L, 6L, 7L, 7L, 8L, 8L, 9L, 9L, 10L, 10L), .Label = c("Bcells",
"DendriticCells", "Macrophages", "Monocytes", "NKCells", "Neutrophils",
"StemCells", "StromalCells", "abTcells", "gdTCells"), class = "factor"),
sample = c("SP ID control", "SP ID treated", "SP ID control",
"SP ID treated", "SP ID control", "SP ID treated", "SP ID control",
"SP ID treated", "SP ID control", "SP ID treated", "SP ID control",
"SP ID treated", "SP ID control", "SP ID treated", "SP ID control",
"SP ID treated", "SP ID control", "SP ID treated", "SP ID control",
"SP ID treated"), `mean(score)` = c(0.160953535029424, 0.155743474395545,
0.104788051104575, 0.125247035158472, -0.159665650045289,
-0.134662049979712, 0.196249441751866, 0.212256889027029,
0.0532668251890109, 0.0738264693971133, 0.151828478029596,
0.159941552142933, -0.14128323638966, -0.120556640790534,
0.196518649474078, 0.185264282171863, 0.0654641151966543,
0.0837989059507186, 0.145111577618456, 0.145448549866796)), .Names = c("celltype",
"sample", "mean(score)"), row.names = c(7L, 8L, 17L, 18L, 27L,
28L, 37L, 38L, 47L, 48L, 57L, 58L, 67L, 68L, 77L, 78L, 87L, 88L,
97L, 98L), class = "data.frame")

library(tidyr)
library(dplyr)

df %>% spread(sample, `mean(score)`) %>%
    mutate(pairwise_division = `SP ID treated` / `SP ID control`)

##          celltype SP ID control SP ID treated pairwise_division
## 1          Bcells    0.16095354    0.15574347         0.9676300
## 2  DendriticCells    0.10478805    0.12524704         1.1952416
## 3     Macrophages   -0.15966565   -0.13466205         0.8434003
## 4       Monocytes    0.19624944    0.21225689         1.0815668
## 5         NKCells    0.05326683    0.07382647         1.3859746
## 6     Neutrophils    0.15182848    0.15994155         1.0534358
## 7       StemCells   -0.14128324   -0.12055664         0.8532976
## 8    StromalCells    0.19651865    0.18526428         0.9427313
## 9        abTcells    0.06546412    0.08379891         1.2800739
## 10       gdTCells    0.14511158    0.14544855         1.0023222


Note that the line

mutate(pairwise_division = `SP ID treated` / `SP ID control`)


refers to the two columns by their backquoted names.

What I want to do next is take those column names from a
character vector instead of typing them. I tried this:

content <- c("SP ID treated" , "SP ID control")

df %>% spread(sample, `mean(score)`) %>%
    mutate(pairwise_division = content[1] / content[2])


But it gave me this error:

Error: non-numeric argument to binary operator


What's the right way to do it?

Answer
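
The error arises because content[1] and content[2] are ordinary character strings. mutate() evaluates them as values, not as column names, so R ends up dividing one string by another. A minimal base R sketch of that failure, with no dplyr involved (the names is_string and msg are only illustrative):

```r
content <- c("SP ID treated", "SP ID control")

# content[1] is a plain character string, not a reference to a column.
is_string <- is.character(content[1])

# Capture the error message instead of stopping the script.
msg <- tryCatch(
  content[1] / content[2],
  error = function(e) conditionMessage(e)
)

print(is_string)
print(msg)
```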

If you want to pass column names as strings, you are going to have to use mutate_() rather than mutate(). For example:

df %>% spread(sample, `mean(score)`) %>% 
    mutate_(.dots=list(pairwise_division = 
        substitute(a/b, list(
        a=as.name(content[1]), 
        b=as.name(content[2]))
    )))

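as.name() converts each string into a symbol, so a name containing spaces, such as "SP ID treated", is looked up as a single column instead of being treated as text.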
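
To see what substitute() and as.name() build, here is a rough base R sketch that needs neither tidyr nor dplyr. The data frame wide is a hypothetical two-row stand-in for the spread table in the question (values rounded), and eval() stands in for what mutate_() would do with the expression:

```r
content <- c("SP ID treated", "SP ID control")

# Hypothetical stand-in for the spread (wide) table; two rows only.
# check.names = FALSE keeps the spaces in the column names.
wide <- data.frame(
  celltype = c("Bcells", "DendriticCells"),
  "SP ID control" = c(0.16095354, 0.10478805),
  "SP ID treated" = c(0.15574347, 0.12524704),
  check.names = FALSE
)

# Build the unevaluated call: `SP ID treated` / `SP ID control`.
expr <- substitute(a / b, list(
  a = as.name(content[1]),
  b = as.name(content[2])
))
print(expr)

# Evaluate the call with the data frame's columns in scope.
wide$pairwise_division <- eval(expr, wide)
print(wide)
```

In dplyr 0.7.0 and later, mutate_() is deprecated. There the same lookup is usually written with the .data pronoun, as in mutate(pairwise_division = .data[[content[1]]] / .data[[content[2]]]), although that is not the form used in this answer.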