pluke pluke - 1 month ago 19
R Question

R add a new column to dataframe using mutate_ where column name is specified by a variable

I have a data frame that I want to add a column to, where the name of the new column is stored in a variable:

library(dplyr)
library(ggplot2)  # provides the diamonds data set
df <- diamonds
NewName <- "SomeName"
df <- df %>% mutate_(paste0(NewName," = \"\""))


This gives me the following error:

Error: attempt to use zero-length variable name


I've seen plenty of examples of mutate_ being used to change column names, but not to dynamically create columns. Any help?

Answer

The issue is when the statement gets evaluated. As I understand it, mutate_ is not meant to rebuild the syntax of mutate from a string, for example by using paste to produce mutate(SomeName = ""). Instead, it lets you generate the expressions to evaluate and pass them in. Your approach fails (I believe) because it ends up looking for a function named "".

Instead, you need to pass in an expression that can be evaluated (here I am using paste as a placeholder) and set the name of the column using your variable. This should work:

df <- diamonds
NewName <- "SomeName"
df <- df %>% mutate_(.dots = setNames("paste('')",NewName))
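
To see what is actually handed to .dots, it can help to look at the result of setNames on its own. This is a minimal base R sketch (no dplyr needed); the variable name dots is only for illustration:

```r
NewName <- "SomeName"

# setNames() returns a named character vector:
# the name is the new column name, the value is the expression as text
dots <- setNames("paste('')", NewName)

print(names(dots))   # the column name taken from the variable
print(unname(dots))  # the expression text that will be evaluated

# Evaluating the expression text by hand yields the empty string
# that fills the new column
value <- eval(parse(text = dots[[NewName]]))
print(value)
```

So the column name travels as the name of the vector element, and only the right-hand side is passed as an expression.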

This also allows more control. For example, you could paste cut and color together:

df <- df %>% mutate_(.dots = setNames("paste(cut, color)",NewName))

gives:

   carat       cut color clarity depth table price     x     y     z    SomeName
   <dbl>     <ord> <ord>   <ord> <dbl> <dbl> <int> <dbl> <dbl> <dbl>       <chr>
1   0.23     Ideal     E     SI2  61.5    55   326  3.95  3.98  2.43     Ideal E
2   0.21   Premium     E     SI1  59.8    61   326  3.89  3.84  2.31   Premium E
3   0.23      Good     E     VS1  56.9    65   327  4.05  4.07  2.31      Good E
4   0.29   Premium     I     VS2  62.4    58   334  4.20  4.23  2.63   Premium I
5   0.31      Good     J     SI2  63.3    58   335  4.34  4.35  2.75      Good J
6   0.24 Very Good     J    VVS2  62.8    57   336  3.94  3.96  2.48 Very Good J
7   0.24 Very Good     I    VVS1  62.3    57   336  3.95  3.98  2.47 Very Good I
8   0.26 Very Good     H     SI1  61.9    55   337  4.07  4.11  2.53 Very Good H
9   0.22      Fair     E     VS2  65.1    61   337  3.87  3.78  2.49      Fair E
10  0.23 Very Good     H     VS1  59.4    61   338  4.00  4.05  2.39 Very Good H

(Of note, I also got the initial syntax to work the first time, followed by subsequent failures. Worth digging into.)
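
As a side note, mutate_ and the other underscore verbs were deprecated starting with dplyr 0.7.0 in favour of tidy evaluation. Assuming a dplyr version of 0.7.0 or later, a rough equivalent of the above could look like the sketch below. The small data frame is a hypothetical stand-in for diamonds so that the example does not depend on ggplot2:

```r
library(dplyr)

# Hypothetical stand-in for diamonds, with only the two columns used here
df <- data.frame(
  cut   = c("Ideal", "Premium", "Good"),
  color = c("E", "E", "J"),
  stringsAsFactors = FALSE
)
NewName <- "SomeName"

# !! unquotes the string in NewName so it is used as the column name;
# := is needed because the left-hand side is not a literal name
df <- df %>% mutate(!!NewName := paste(cut, color))

print(df)
```

Here the right-hand side is written as ordinary code rather than as a string, and only the column name comes from the variable.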
