Marco - 1 month ago
R Question

Filter dataframe using global variable with the same name as column name

library(dplyr)


Toy dataset:

df <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
df
x y
1 1 4
2 2 5
3 3 6


This works fine:

df %>% filter(y == 5)
x y
1 2 5


This also works fine:

z <- 5
df %>% filter(y == z)
x y
1 2 5


But this fails, returning every row instead of only the row where y is 5:

y <- 5
df %>% filter(y == y)
x y
1 1 4
2 2 5
3 3 6


Apparently, dplyr cannot make the distinction between its column y and the global variable y.
Is there a way to tell dplyr that the second y is the global variable?

Answer

You can do:

df %>% filter(y == .GlobalEnv$y)

or:

df %>% filter(y == .GlobalEnv[["y"]])
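
To see why the explicit .GlobalEnv lookup helps, here is a rough sketch using base R's with() as a stand-in for how filter() resolves names. This is only an analogy (dplyr's internals differ), but the lookup order is the same idea: names are searched in the data frame first, so a bare y is always the column.

```r
df <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
y <- 5

# Inside the data frame, `y` resolves to the column,
# so this compares the column with itself.
masked <- with(df, y == y)
masked    # TRUE TRUE TRUE

# Reaching into the global environment explicitly yields the scalar 5.
explicit <- with(df, y == .GlobalEnv$y)
explicit  # FALSE TRUE FALSE
```

Since every element of the first comparison is TRUE, filtering on it keeps all rows, which matches the output in the question.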

Both .GlobalEnv forms work in this context, but won't if all this is going on inside a function, because there y is a function argument rather than a global variable. But get will:

df %>% filter(y == get("y"))
f <- function(df, y) { df %>% filter(y == get("y")) }

So use get.
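
As a possible alternative, newer dplyr releases (1.0.0 and later, as far as I know) expose the .data and .env pronouns, and unquoting with !! has been available since the tidy evaluation rewrite. This may be worth knowing because how get("y") resolves inside filter() can depend on the dplyr version: where the expression is evaluated inside a data mask, get may find the column rather than the outer variable. The sketch below assumes a recent dplyr; filter_by_y is a hypothetical helper name, not something from the question.

```r
library(dplyr)

df <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))

# Hypothetical helper: `y` is a function argument here, not a global.
# .data$y is the column; .env$y is the variable from the calling environment.
filter_by_y <- function(df, y) {
  df %>% filter(.data$y == .env$y)
}

res_env <- filter_by_y(df, 5)
res_env

# `!!y` inlines the value of the outer `y` before the column lookup happens.
y <- 5
res_bang <- df %>% filter(y == !!y)
res_bang
```

Both calls should return the single row with x = 2 and y = 5, whether y lives in the global environment or in a function.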

Or just use base R subsetting, df[df$y == y, ], instead of dplyr.
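
For completeness, a minimal sketch of the base R route. There is no ambiguity here because df$y always means the column and a bare y always means the variable in scope, so the same expression also works inside a function (subset_by_y is a hypothetical name used only for illustration).

```r
df <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
y <- 5

# df$y is the column, bare y is the variable: keeps only the matching row.
res <- df[df$y == y, ]
res

# The same subsetting inside a function, where y is an argument.
subset_by_y <- function(df, y) {
  df[df$y == y, ]
}
res_fn <- subset_by_y(df, 6)
res_fn
```

One caveat: if the column contained NA values, df$y == y would be NA for those rows and they would come back as all-NA rows, whereas filter() drops them.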