Samantha - 1 month ago
R Question

Delete the last entry of groups in a data frame

I want to clean my data by deleting the last entry of every group of rows that share the same type.

My data looks somewhat like this:

type 2 3
1 A 2.3 4
2 A 3.4 5
3 B 5.5 6
4 B 6 7
5 B 3 7
6 C 5 6
....


That is, I am trying to get rid of the last entry of every group with the same type, so that the result looks like this:

type 2 3
1 A 2.3 4
2 B 5.5 6
3 B 6 7
4 C 5 6


In my actual data each type has a different number of rows, usually a few hundred or more. I thought of using group_by and then last(), but last() seems to work only with summarize. Any ideas?

Answer

Here is another option with dplyr. After grouping by 'type', we keep a row if its position within the group (row_number()) is not equal to the number of rows in the group (n(), which is also the row number of the last row), or (|) if the group has only one row (n() == 1). So the logical index row_number() != n() removes the last row of each group, and n() == 1 is the exception that keeps groups with a single row.

library(dplyr)
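df1 %>%
    group_by(type) %>%
    # keep every row except the last one in each group,
    # but keep the row when the group has only a single row
    filter(row_number() != n() | n() == 1)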
#  type   `2`   `3`
#  <chr> <dbl> <int>
#1     A   2.3     4
#2     B   5.5     6
#3     B   6.0     7
#4     C   5.0     6
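
For anyone who wants to run the answer end to end, here is a minimal sketch. The answer does not show how df1 was built, so the data frame below is an assumption: it is reconstructed from the six sample rows in the question, with the column names `2` and `3` taken from the printed header. The trailing ungroup() and the name res are also additions for convenience, not part of the original answer.

```r
library(dplyr)

# Assumed reconstruction of the question's sample data (not from the answer).
# check.names = FALSE keeps the non-syntactic column names `2` and `3`.
df1 <- data.frame(
  type = c("A", "A", "B", "B", "B", "C"),
  `2`  = c(2.3, 3.4, 5.5, 6, 3, 5),
  `3`  = c(4L, 5L, 6L, 7L, 7L, 6L),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

res <- df1 %>%
  group_by(type) %>%
  # drop the last row of each group, except in single-row groups
  filter(row_number() != n() | n() == 1) %>%
  ungroup()

print(res)
```

With this sample, the second A row and the third B row are dropped, while the single C row is kept because of the n() == 1 exception. If single-row groups should be removed as well, the condition can presumably be reduced to row_number() != n().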