Mark BornSuccessful Harris - 2 months ago
R Question

Data Management in R

So I have this code where I am trying to unite separate columns called grade prek-12 into one column called Grade. I have employed the tidyr package and used this line of code to perform said task:

unite(dta, "Grade",
dta$Gradek, dta$Grade1, dta$Grade2,
dta$Grade3, dta$Grade4, dta$Grade5,
dta$Grade6, dta$Grade7, dta$Grade8,
dta$Grade9, dta$Grade10, dta$Grade11,

However, I have been getting an error saying this:

error: All select() inputs must resolve to integer column positions.
The following do not: * c(Gradeprek, dta$Gradek, dta$Grade1, dta$Grade2, dta$Grade3, dta$Grade4, dta$Grade5, dta$Grade6, ...

Penny for your thoughts on how I can resolve the situation.


You are mixing and matching the two syntax options for unite and unite_ - you need to pick one and stick with it. In both cases, do not use data$column - they take a data argument so you don't need to re-specify which data frame your columns come from.

Option 1: NSE

The default non-standard evaluation means bare column names - no quotes! And no c().

unite(dta, Grade, Gradeprek, Gradek, Grade1, Grade2, Grade3, ..., 
    Grade12, sep = "")

There are tricks you can do with this. For example, if all your Grade columns are in this order next to each other in your data frame, you could do

unite(dta, Grade, Gradeprek:Grade12, sep = "")

You could also use starts_with("Grade") to get all columns that begin with that string. See ?unite and its link to ?select for more details.
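To make the range and starts_with() selections concrete, here is a minimal sketch on a made-up data frame. The column contents (a label in the matching grade column, an empty string elsewhere) and the id column are assumptions for illustration only, since the question does not show the actual data.

```r
library(tidyr)

# Hypothetical toy data: the real contents of dta are not shown in the
# question, so this layout is an assumption for illustration only.
dta <- data.frame(
  id        = 1:3,
  Gradeprek = c("PK", "", ""),
  Gradek    = c("", "K", ""),
  Grade1    = c("", "", "1"),
  stringsAsFactors = FALSE
)

# Range selection: relies on the Grade* columns being adjacent and in order.
by_range <- unite(dta, Grade, Gradeprek:Grade1, sep = "")

# Helper selection: picks every column whose name starts with "Grade".
by_prefix <- unite(dta, Grade, starts_with("Grade"), sep = "")

print(by_prefix)
```

On this toy data both calls give the same result. The range form depends on column order, while starts_with() depends only on the names, so it would also pick up any unrelated column whose name happens to begin with "Grade".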

Option 2: Standard Evaluation

You can use unite_() for a standard-evaluating alternative which will expect column names in a character vector. This has the advantage in this case of letting you use paste() to build column names in the order you want:

unite_(dta, col = "Grade", c("Gradeprek", "Gradek", paste0("Grade", 1:12)), sep = "")
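One caveat: in recent tidyr releases the underscore verbs such as unite_() are deprecated. A rough equivalent, assuming a tidyr version that re-exports all_of() from tidyselect, is to pass the character vector through all_of() in a regular unite() call. The toy data frame below is hypothetical and only has grades up to 2 to keep the example short.

```r
library(tidyr)

# Hypothetical toy data (assumed layout, not the asker's real data).
dta <- data.frame(
  id        = 1:2,
  Gradeprek = c("PK", ""),
  Gradek    = c("", ""),
  Grade1    = c("", "1"),
  Grade2    = c("", ""),
  stringsAsFactors = FALSE
)

# Build the column names as strings, in the order they should be pasted.
grade_cols <- c("Gradeprek", "Gradek", paste0("Grade", 1:2))

# all_of() tells unite() to treat the character vector as column names.
out <- unite(dta, "Grade", all_of(grade_cols), sep = "")

print(out)
```

This keeps the benefit of building the names with paste0() while staying on the non-deprecated interface. all_of() raises an error if any name in the vector is missing from the data frame, which catches typos early.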