Samantha - 8 months ago 32

R Question

I was hoping to clean my data by deleting the last entry of every group of rows that share the same type.

My data looks somewhat like this:

      type   2 3
    1    A 2.3 4
    2    A 3.4 5
    3    B 5.5 6
    4    B 6   7
    5    B 3   7
    6    C 5   6
    ....

That is, I am trying to get rid of the last entry of every group with the same type, so that it looks like this:

      type   2 3
    1    A 2.3 4
    2    B 5.5 6
    3    B 6   7
    4    C 5   6

My actual data have a different length for each type, usually more than a few hundred rows. I thought of group_by and then `last()` or `summarize`.

Answer

Here is another option with `dplyr`. After grouping by 'type', we check whether the row number (`row_number()`) is not equal to the number of rows (`n()`, which is also the last row number in the group), or (`|`) whether the number of rows is equal to 1 (`n()==1`). So, basically, we remove the last row of each group with the logical index `row_number() != n()`, plus an exception that keeps groups with only a single row (`n()==1`).

```
library(dplyr)

df1 %>%
  group_by(type) %>%
  # keep every row except the last one of its group; single-row groups are kept
  filter(row_number() != n() | n() == 1)
#   type   `2`   `3`
#  <chr> <dbl> <int>
#1     A   2.3     4
#2     B   5.5     6
#3     B   6.0     7
#4     C   5.0     6
```
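To try this out without the original data, here is a minimal self-contained sketch. The `df1` below is an assumed reconstruction of the six sample rows from the question, and `flagged`, `keep` and `result` are illustrative names that do not appear in the original answer. The extra `mutate()` step only makes the logical index visible before it is used for filtering.

```r
library(dplyr)

# Assumed reconstruction of the question's sample rows; the column names
# `2` and `3` follow the printed output of the answer.
df1 <- data.frame(
  type = c("A", "A", "B", "B", "B", "C"),
  `2`  = c(2.3, 3.4, 5.5, 6, 3, 5),
  `3`  = c(4L, 5L, 6L, 7L, 7L, 6L),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Attach the logical index to every row so it can be inspected.
# `keep` is a hypothetical helper column, not part of the original answer.
flagged <- df1 %>%
  group_by(type) %>%
  mutate(keep = row_number() != n() | n() == 1) %>%
  ungroup()

# Keep only the flagged rows, then drop the helper column.
result <- flagged %>%
  filter(keep) %>%
  select(-keep)

print(flagged)
print(result)
```

Printing `flagged` shows which rows the condition marks for removal: the last row of 'A' and of 'B', while the single 'C' row is kept because of the `n() == 1` exception.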

Source (Stackoverflow)