giordano giordano - 2 months ago 12
R Question

R Does something like EXISTS in SQL exists in R?

EXISTS/NOT EXISTS are features in SQL which are incredible powerful.
Is there a way how I can use R commands or functions to perform what I have with EXISTS in SQL? Example:

# Two tables
x0 <- data.frame(a=rep(1:4,2))
x1 <- data.frame(a=c(2,2,3,5))


I want a new indicator b which gives me 1 if the value in x0 EXISTS in table x1 otherwise 0.

# Initialize b
x0$b <- 0

# Update b
sqldf(c("UPDATE x0
SET b = 1
WHERE EXISTS (SELECT 1
FROM x1
WHERE x0.a = x1.a
)"
, "SELECT * FROM main.x0"
)
)


Results:

a b
1 1 0
2 2 1
3 3 1
4 4 0
5 1 0
6 2 1
7 3 1
8 4 0

Answer

Try this:

x0$b <- (x0$a %in% x1$a) + 0L

Using 0L rather than 0 is mostly me just being a bit pedantic; it ensures that everything remains as integers rather than numeric (doubles).

The reason !(x0$a %in% x1$a) + 0L returns the booleans, not the integers is due to operator precedence. The negation is applied last. Try moving it just one spot to the right as the first thing inside the parens: (!x0$a %in% x1$a) + 0L

As @Roland points out, if you find the whole + 0L bit too clever to be clear, you can always just do as.integer(x0$a %in% x1$a).