Kim Phan - 2 months ago
R Question

Subset of dataframe for which 2 variables match another dataframe in R

I'm looking to obtain a subset of my first, larger, dataframe 'df1' by selecting rows which contain particular combinations in the first two variables, as specified in a smaller 'df2'. For example:

df1 <- data.frame(ID = c("A", "A", "A", "B", "B", "B"),
                  day = c(1, 2, 2, 1, 2, 3), value = seq(4,9))

df1 # my actual df has 20 variables
ID day value
A 1 4
A 2 5
A 2 6
B 1 7
B 2 8
B 3 9

df2 <- data.frame(ID = c("A", "B"), day = c(2, 1))

df2 # this df remains at 2 variables
ID day
A 2
B 1


The desired output would be:

ID day value
A 2 5
A 2 6
B 1 7


Any help would be much appreciated, thanks!

Answer

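This is a good use of the merge function. By default, merge performs an inner join, so joining on both ID and day keeps only the rows of df1 whose combination of the two also appears in df2.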

df1 <- data.frame(ID = c("A", "A", "A", "B", "B", "B"),
                  day = c(1, 2, 2, 1, 2, 3), value = seq(4,9))

df2 <- data.frame(ID = c("A", "B"), day = c(2, 1))

merge(df1,
      df2,
      by = c("ID", "day"))

Which gives output:

  ID day value
1  A   2     5
2  A   2     6
3  B   1     7
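
One thing to keep in mind: merge sorts the result by the key columns by default and would repeat rows of df1 if df2 ever contained the same ID/day pair more than once. If that matters, a plain logical filter is a possible alternative. The following is only a sketch in base R; the names key1, key2 and result, and the "\r" separator, are my own choices for illustration and not part of the original question.

```r
df1 <- data.frame(ID = c("A", "A", "A", "B", "B", "B"),
                  day = c(1, 2, 2, 1, 2, 3), value = seq(4, 9))

df2 <- data.frame(ID = c("A", "B"), day = c(2, 1))

# Build one composite key per row from the two matching columns.
# "\r" is an arbitrary separator, assumed not to occur in the data.
key1 <- paste(df1$ID, df1$day, sep = "\r")
key2 <- paste(df2$ID, df2$day, sep = "\r")

# Keep the rows of df1 whose (ID, day) pair appears anywhere in df2.
# All columns of df1 are kept, and its row order is left unchanged.
result <- df1[key1 %in% key2, ]
print(result)
```

This returns the same three rows as the merge call, but they keep their original row names from df1 (2, 3 and 4) rather than being renumbered.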