Bogs Bogs - 4 months ago 8
R Question

Keep only two rows corresponding to an ID in a data frame

I have the following data (this is a mock version) and I am using R.

ID m
1 m1
1 m2
1 m3
2 m1
2 m2
3 m1
3 m2
3 m3
3 m4
4 m1
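
For a reproducible version, the mock data above can be built as a data frame. This is a minimal sketch; the name `df` is an assumption chosen to match the answer below, and the column types (numeric ID, character m) are guesses about the real data.

```r
# Mock data from the table above; `df` and the column types are assumptions.
df <- data.frame(
  ID = c(1, 1, 1, 2, 2, 3, 3, 3, 3, 4),
  m  = c("m1", "m2", "m3", "m1", "m2", "m1", "m2", "m3", "m4", "m1"),
  stringsAsFactors = FALSE
)
```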


Each ID has an m1 row, and the number of remaining m rows varies between IDs. I would like to keep the m1 row and the last row for each ID. The ideal output would look like this:

ID m
1 m1
1 m3
2 m1
2 m2
3 m1
3 m4
4 m1


Thank you very much in advance.

Answer

With dplyr, group by ID and keep a row if it is the last row in its group or if its m value is 'm1':

library(dplyr)

df %>%
   group_by(ID) %>%
   # keep the last row of each ID, plus any row where m is 'm1'
   filter(row_number() == n() | m == 'm1')


Source: local data frame [7 x 2]
Groups: ID

  ID  m
1  1 m1
2  1 m3
3  2 m1
4  2 m2
5  3 m1
6  3 m4
7  4 m1
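
If dplyr is not available, the same filter can be sketched in base R. This is only an illustration under assumptions: the data frame is rebuilt here under the assumed name `df`, the name `result` is hypothetical, and the rows are assumed to already be in the desired order within each ID, so that "last" means the last row as it appears in the data.

```r
# Mock data from the question; `df` is an assumed name.
df <- data.frame(
  ID = c(1, 1, 1, 2, 2, 3, 3, 3, 3, 4),
  m  = c("m1", "m2", "m3", "m1", "m2", "m1", "m2", "m3", "m4", "m1"),
  stringsAsFactors = FALSE
)

# duplicated(..., fromLast = TRUE) flags every occurrence of an ID except
# its last one, so negating it marks the last row of each ID.
is_last <- !duplicated(df$ID, fromLast = TRUE)

# Keep a row if it is the last one for its ID or if its m value is "m1".
result <- df[is_last | df$m == "m1", ]
rownames(result) <- NULL  # renumber rows after subsetting
```

An ID with a single row, such as ID 4, satisfies both conditions at once and is still kept only once, because the two conditions are combined into one logical vector before subsetting.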