giordano - 6 months ago 40

R Question

EXISTS/NOT EXISTS are features in SQL which are incredible powerful.

Is there a way how I can use R commands or functions to perform what I have with EXISTS in SQL? Example:

`# Two tables`

x0 <- data.frame(a=rep(1:4,2))

x1 <- data.frame(a=c(2,2,3,5))

I want a new indicator b which gives me 1 if the value in

`# Initialize b`

x0$b <- 0

# Update b

sqldf(c("UPDATE x0

SET b = 1

WHERE EXISTS (SELECT 1

FROM x1

WHERE x0.a = x1.a

)"

, "SELECT * FROM main.x0"

)

)

Results:

`a b`

1 1 0

2 2 1

3 3 1

4 4 0

5 1 0

6 2 1

7 3 1

8 4 0

Answer

Try this:

```
x0$b <- (x0$a %in% x1$a) + 0L
```

Using `0L`

rather than `0`

is mostly me just being a bit pedantic; it ensures that everything remains as integers rather than numeric (doubles).

The reason `!(x0$a %in% x1$a) + 0L`

returns the booleans, not the integers is due to operator precedence. The negation is applied last. Try moving it just one spot to the right as the first thing inside the parens: `(!x0$a %in% x1$a) + 0L`

As @Roland points out, if you find the whole `+ 0L`

bit too clever to be clear, you can always just do `as.integer(x0$a %in% x1$a)`

.