Ignacio - 1 year ago
R Question

Subset a data frame based on another

I have two data frames, x and y.

x <- data.frame(id = c(1, 2, 3, 4, 5), g = c(21, 52, 43, 94, 35))
y <- data.frame(id = c(3, 4, 7), u = c(55, 77, 99))


I want to subset x to include only the rows whose id values also appear in y.

What is the best way of doing this?

Thanks!

Answer

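Use %in% with negation to exclude the rows of x whose id appears in y. (Indexing with setdiff(x$id, y$id) treats the id values as row numbers, which only works here because each id happens to equal its row position.)

> x[!x$id %in% y$id, ]
  id  g
1  1 21
2  2 52
5  5 35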

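Use merge to keep the observations present in both data frames. It joins on the shared id column, so the result also carries the u column from y:

> merge(x, y, by = "id")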
  id  g  u
1  3 43 55
2  4 94 77
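
If the join is the preferred route but the extra column is unwanted, one option is to merge against only the id column of y. This is a sketch rather than the only way to do it; the name only_x_cols is made up for illustration, and unique() is there on the assumption that y might contain repeated ids, which would otherwise duplicate rows of x.

```r
x <- data.frame(id = c(1, 2, 3, 4, 5), g = c(21, 52, 43, 94, 35))
y <- data.frame(id = c(3, 4, 7), u = c(55, 77, 99))

# y["id"] keeps y as a one-column data frame, so merge() has nothing
# but the key to bring over; unique() guards against repeated ids in y
only_x_cols <- merge(x, unique(y["id"]), by = "id")

print(only_x_cols)
```

Note that merge sorts the result by the key and renumbers the rows, so the original row names of x are not preserved.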

Or, to keep only the rows of x whose id also appears in y, without adding any columns from y (this is what the question asks for):

> x[x$id %in% y$id, ]
  id  g
3  3 43
4  4 94
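
Subsetting by id is safest with a logical mask, because the values returned by intersect or setdiff are ids, not row positions. The sketch below uses hypothetical data frames x2 and y2 (not from the question) whose ids do not line up with row numbers, to show the difference.

```r
# Hypothetical data: ids are not 1..n, so an id is not a row position
x2 <- data.frame(id = c(10, 20, 30, 40, 50), g = c(21, 52, 43, 94, 35))
y2 <- data.frame(id = c(30, 40, 70), u = c(55, 77, 99))

# Logical mask: TRUE for each row of x2 whose id occurs in y2$id
keep <- x2$id %in% y2$id

in_y     <- x2[keep, ]   # rows with id 30 and 40
not_in_y <- x2[!keep, ]  # rows with id 10, 20 and 50

# Indexing by the id values themselves asks for rows 30 and 40,
# which do not exist in a 5-row data frame, so every cell comes back NA
wrong <- x2[intersect(x2$id, y2$id), ]

print(in_y)
print(not_in_y)
print(wrong)
```

The mask approach also preserves the original row order and row names of x2, which merge does not.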