andrewH - 2 months ago
R Question

R: Using dplyr to remove a column whose name is stored in a string variable

I am creating a bunch of files. I have a list of two-letter names. Each file includes a column with the same name as the file. I do a series of things to them, identifying both the file and the column by the name contained in a variable, sT. For example, I have a file called OH, and sT contains "OH".

The very last thing I want to do to the file is remove the eponymous column and return a file with the same name. I am trying to become fluent in tidy, the language of the tidyverse, so I am trying to do this with select.

OH <- data.frame(X=1:2, OH=3:4)

I think this should work under nonstandard evaluation:

assign(sT, select(get(eval(sT)), -as.symbol(sT)))

where sT is "OH" and get(eval(sT)) is the file OH. And I think one of these should work, under standard evaluation:

assign(sT, select(get(eval(sT)), - sT))


assign(sT, select_(get(eval(sT)), paste0("-", sT)))

depending on whether select_ will accept the minus sign inside of the string. But none of them works. They return, respectively:

Error in -as.symbol(sT) : invalid argument to unary operator

Error in eval(expr, envir, enclos) : object 'OH' not found

Error in eval(expr, envir, enclos) : object 'OH' not found


You need to use matches

assign(sT, select(get(eval(sT)), -matches(sT)))

Edit: as alistaire points out, the pattern should be anchored, as below, in case there are other columns whose names contain OH:

assign(sT,select(get(eval(sT)), -matches(paste0('^', sT, '$'))))
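To see why the anchors matter, here is a minimal sketch using a made-up data frame that also has an OHIO column. The data frame df and the names loose, anchored, and exact are illustrative, not from the question. The last line assumes dplyr 1.0.0 or later, where all_of() selects by exact name and avoids the regular expression altogether.

```r
library(dplyr)

sT <- "OH"
# Illustrative data: OHIO contains "OH" but is a different column
df <- data.frame(X = 1:2, OH = 3:4, OHIO = 5:6)

# Unanchored pattern: drops every column whose name contains "OH"
loose <- select(df, -matches(sT))

# Anchored pattern: drops only the column named exactly "OH"
anchored <- select(df, -matches(paste0("^", sT, "$")))

# Exact-name selection (assumes dplyr >= 1.0.0)
exact <- select(df, -all_of(sT))

names(loose)
names(anchored)
names(exact)
```

With this data, the unanchored call should leave only X, while the other two keep both X and OHIO.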

Doing it as below is probably more readable. It's also faster.

assign(sT, OH[which(names(OH) != sT)])

If you want it as a function to use with lapply, here's one:

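removecol <- function(string, data = NULL){
    # If no data frame is supplied, look it up by name
    if(is.null(data)) data <- get(string)
    # Drop the column named after the data frame and write the result back
    assign(string, data[which(names(data) != string)], envir = .GlobalEnv)
}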

#  X OH
#1 1  3
#2 2  4
#  X
#1 1
#2 2
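As a usage sketch for the lapply case, the following applies the same idea to several data frames at once. The helper name drop_named_col, its envir argument, and the NY data frame are hypothetical and not part of the answer above. A separate environment is used here only so the example does not touch the global workspace. Leaving envir at its default mirrors the .GlobalEnv behaviour.

```r
# Hypothetical helper: drop the column whose name equals the data frame's name
drop_named_col <- function(name, envir = globalenv()) {
    data <- get(name, envir = envir)
    # Keep every column except the one named `name`, then write the result back
    assign(name, data[names(data) != name], envir = envir)
    invisible(name)
}

# Illustrative data: each data frame has a column named after itself
env <- new.env()
env$OH <- data.frame(X = 1:2, OH = 3:4)
env$NY <- data.frame(X = 5:6, NY = 7:8)

# Apply the helper to every name in the list
invisible(lapply(c("OH", "NY"), drop_named_col, envir = env))

names(env$OH)
names(env$NY)
```

After the lapply call, both data frames should contain only the X column.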