Sebastian Zeki - 3 months ago
R Question

How to subset by group with dplyr

I have a dataset as follows:

col1 col2
1 26
1 43
1 34
2 54
2 27
3 15
4 1
4 4


I would like to keep only those groups in which every value of col2 is more than 25 (that is, the group's minimum is above 25), so the resulting dataset should be

col1 col2
1 26
1 43
1 34
2 54
2 27


This is an example dataset rather than the real thing, so instead of a simple subset answer I am really looking for a dplyr answer along the lines of:

Nr<-Mrd %>%
group_by(col1) %>%
slice(which.min(col2>25))


However, this attempt returns a single row from each group (which.min gives one index per group) rather than keeping the whole groups that have >25 as their minimum.

Answer

Following your train of thought: you don't need which.min. Use min inside filter instead of slice, so that each group is kept or dropped as a whole depending on its minimum.

library(dplyr)

# Keep every row of a group when the group's smallest col2 is above 25
df %>%
   group_by(col1) %>%
   filter(min(col2) > 25)

#Source: local data frame [5 x 2]
#Groups: col1 [2]

#   col1  col2
#  <int> <int>
#1     1    26
#2     1    43
#3     1    34
#4     2    54
#5     2    27
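
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the example data from the question is rebuilt by hand as a data frame called df (the name used in the answer; the question calls it Mrd), and the result names kept and kept_all are just illustrative. It also shows an alternative condition with all(), which should select the same groups as long as col2 has no missing values.

```r
library(dplyr)

# Example data from the question, rebuilt by hand (assumed integer columns)
df <- data.frame(
  col1 = c(1L, 1L, 1L, 2L, 2L, 3L, 4L, 4L),
  col2 = c(26L, 43L, 34L, 54L, 27L, 15L, 1L, 4L)
)

# Keep whole groups whose smallest col2 value is above 25
kept <- df %>%
  group_by(col1) %>%
  filter(min(col2) > 25) %>%
  ungroup()

# Alternative condition: every col2 value in the group is above 25
kept_all <- df %>%
  group_by(col1) %>%
  filter(all(col2 > 25)) %>%
  ungroup()

# Why the attempt in the question falls short: which.min() on a logical
# vector returns a single index, so slice() keeps one row per group
which.min(c(26, 43, 34) > 25)  # all TRUE, so the index is 1
```

Inside a grouped filter(), a condition that evaluates to a single TRUE or FALSE per group is recycled to every row of that group, which is why the group is kept or dropped as a whole.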