just_rookie - 3 months ago
R Question

R dplyr: how to remove smaller groups?

I would like to remove smaller groups using dplyr. For example, given the data frame:

ID group value
1 1 6
2 1 2
3 2 0
4 2 5
5 2 3
6 3 7
7 3 1
8 4 3
9 4 7
10 4 5


The sizes of groups 1, 2, 3, and 4 are 2, 3, 2, and 3, respectively. I want to remove groups 1 and 3 because their sizes are less than 3. Thank you in advance!

Answer

You can use n() to get the number of rows in each group and filter on it. Take a look at ?n: the last example there filters based on group size. Inside filter() on a grouped data frame, n() is evaluated once per group, so the condition keeps or drops whole groups:

library(dplyr)

# Keep only the groups that have at least 3 rows
df %>% group_by(group) %>% filter(n() >= 3)

# Source: local data frame [6 x 3]
# Groups: group [2]

#      ID group value
#   <int> <int> <int>
# 1     3     2     0
# 2     4     2     5
# 3     5     2     3
# 4     8     4     3
# 5     9     4     7
# 6    10     4     5
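
For anyone who wants to reproduce this end to end, here is a minimal sketch that rebuilds the data frame from the question and applies the same filter. The names min_size and result are hypothetical, chosen only for illustration, and the trailing ungroup() is an optional addition that is not part of the answer above; it just removes the grouping so later steps operate on the whole table.

```r
library(dplyr)

# Rebuild the example data frame from the question
df <- data.frame(
  ID    = 1:10,
  group = c(1L, 1L, 2L, 2L, 2L, 3L, 3L, 4L, 4L, 4L),
  value = c(6L, 2L, 0L, 5L, 3L, 7L, 1L, 3L, 7L, 5L)
)

min_size <- 3  # hypothetical name for the size threshold

# n() is the row count of the current group, so this keeps only
# the rows that belong to groups with at least min_size rows
result <- df %>%
  group_by(group) %>%
  filter(n() >= min_size) %>%
  ungroup()  # optional: drop the grouping after filtering

result$ID  # IDs 3, 4, 5, 8, 9, 10 remain (groups 2 and 4)
```

Changing min_size changes which groups survive; with the data above, any value from 1 to 2 keeps all four groups, and 3 keeps only groups 2 and 4.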