user1737564 - 1 year ago 50

R Question

I have a data frame like the following:

    X1 X2
    a  1
    a  2
    a  3
    b  4
    b  5
    b  1
    c  4
    c  4
    c  6
    d  1
    d  0
    e  6
    e  8
    e  9

The desired output is a data frame with each unique value of the first column and the corresponding maximum value of the second column, like the following:

    X1 X2
    a  3
    b  5
    c  6
    d  1
    e  9

Thanks!

Answer

This can be done with any of the group-by operations. In base R, `aggregate` does this:

```
aggregate(X2~X1, df1, max)
# X1 X2
#1 a 3
#2 b 5
#3 c 6
#4 d 1
#5 e 9
```
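The snippets in this answer assume a data frame named `df1` that is not defined anywhere in the post. A minimal sketch of how it could be rebuilt from the question's data, so the `aggregate` call can be run end to end (the column types, character and numeric, are an assumption):

```r
# Rebuild the example data from the question.
# df1 is the name the answer uses; the column types are assumed.
df1 <- data.frame(
  X1 = c("a", "a", "a", "b", "b", "b", "c", "c", "c", "d", "d", "e", "e", "e"),
  X2 = c(1, 2, 3, 4, 5, 1, 4, 4, 6, 1, 0, 6, 8, 9),
  stringsAsFactors = FALSE
)

# Maximum of X2 within each value of X1
res <- aggregate(X2 ~ X1, data = df1, FUN = max)
print(res)
```

The formula `X2 ~ X1` reads as "X2 grouped by X1"; `FUN` can be swapped for `min`, `mean`, and so on.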

Or with `dplyr`:

```
library(dplyr)
df1 %>%
  group_by(X1) %>%
  summarise(X2 = max(X2))
```

Or with `data.table`:

```
library(data.table)
setDT(df1)[, .(X2= max(X2)), by = X1]
```

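A possibly faster option is to `order` by 'X2' in descending order and select the first observation for each 'X1'. Note that the groups then come out in the order they first appear after sorting (largest 'X2' first) rather than sorted by 'X1', and that `setDT` converts `df1` to a data.table by reference.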

```
setDT(df1)[order(-X2, X1), head(.SD, 1), by = X1]
```