Joe King Joe King - 1 year ago 91
R Question

Select rows from a data frame based on values in a vector

I have data similar to this:

dt <- structure(list(fct = structure(c(1L, 2L, 3L, 4L, 3L, 4L, 1L, 2L, 3L, 1L, 2L, 3L, 2L, 3L, 4L), .Label = c("a", "b", "c", "d"), class = "factor"), X = c(2L, 4L, 3L, 2L, 5L, 4L, 7L, 2L, 9L, 1L, 4L, 2L, 5L, 4L, 2L)), .Names = c("fct", "X"), class = "data.frame", row.names = c(NA, -15L))

I want to select rows from this data frame based on the values in the
variable. For example, if I wish to select rows containing either "a" or "c" I can do this:

dt[dt$fct == 'a' | dt$fct == 'c', ]

which yields

1 a 2
3 c 3
5 c 5
7 a 7
9 c 9
10 a 1
12 c 2
14 c 4

as expected. But my actual data is more complex and I actually want to select rows based on the values in a vector such as

vc <- c('a', 'c')

So I tried

dt[dt$fct == vc, ]

but of course that doesn't work. I know I could code something to loop through the vector and pull out the rows needed and append them to a new dataframe, but I was hoping there was a more elegant way.

So how can I filter/subset my data based on the contents of the vector

Answer Source

Have a look at ?"%in%".

dt[dt$fct %in% vc,]
   fct X
1    a 2
3    c 3
5    c 5
7    a 7
9    c 9
10   a 1
12   c 2
14   c 4

You could also use ?is.element:

dt[is.element(dt$fct, vc),]