user5243421 - 6 months ago 41

R Question

Let's say I have two columns of data. The first contains categories such as "First", "Second", "Third", etc. The second contains numbers giving how many times I saw the corresponding category.

For example:

| Category | Frequency |
|----------|-----------|
| First    | 10        |
| First    | 15        |
| First    | 5         |
| Second   | 2         |
| Third    | 14        |
| Third    | 20        |
| Second   | 3         |

I want to group the data by Category and sum the Frequencies within each group:

| Category | Frequency |
|----------|-----------|
| First    | 30        |
| Second   | 5         |
| Third    | 34        |

How would I do this in R?

Answer

Using `aggregate`:

```
x <- data.frame(Category = factor(c("First", "First", "First", "Second",
                                    "Third", "Third", "Second")),
                Frequency = c(10, 15, 5, 2, 14, 20, 3))

# Sum Frequency within each Category; the summed column is named "x" by default
aggregate(x$Frequency, by = list(Category = x$Category), FUN = sum)
#   Category  x
# 1    First 30
# 2   Second  5
# 3    Third 34
```
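Note that the summed column in the output above is called `x`, not `Frequency`, because a bare vector was passed in. A minimal sketch of one way to keep the original column names, assuming the same example data: pass one-column data frames (single-bracket indexing) instead of vectors.

```r
# Same example data as in the answer
x <- data.frame(Category = c("First", "First", "First", "Second",
                             "Third", "Third", "Second"),
                Frequency = c(10, 15, 5, 2, 14, 20, 3))

# x["Frequency"] and x["Category"] are one-column data frames, so their
# column names are carried through to the result
named <- aggregate(x["Frequency"], by = x["Category"], FUN = sum)
print(named)
```

This works because a data frame is itself a list, which is what the `by` argument expects.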

As @thelatemail pointed out in a comment, `aggregate` has a formula interface too:

```
aggregate(Frequency ~ Category, x, sum)
```

Or, if you want to aggregate multiple columns, you could use the `.` notation (it works for one column too):

```
aggregate(. ~ Category, x, sum)
```
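To see the `.` notation actually aggregate more than one column, here is a small sketch. The `Weight` column and its values are made up for illustration and are not part of the original question.

```r
# Hypothetical data: the question's columns plus an assumed second
# numeric column, "Weight"
y <- data.frame(
  Category  = c("First", "First", "First", "Second", "Third", "Third", "Second"),
  Frequency = c(10, 15, 5, 2, 14, 20, 3),
  Weight    = c(1, 2, 3, 4, 5, 6, 7)
)

# "." stands for every column not named on the right-hand side,
# so both Frequency and Weight are summed within each Category
totals <- aggregate(. ~ Category, data = y, FUN = sum)
print(totals)
```

Every column matched by `.` must be something `sum` can handle, so this form is best suited to data frames whose remaining columns are all numeric.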

Or use `tapply`:

```
tapply(x$Frequency, x$Category, FUN = sum)
#  First Second  Third
#     30      5     34
```
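Unlike `aggregate`, `tapply` returns a named array rather than a data frame. If the two-column layout from the question is needed, one possible conversion is sketched below, assuming the same example data; `sums` and `result` are just illustrative names.

```r
# Same example data as in the answer
x <- data.frame(Category = c("First", "First", "First", "Second",
                             "Third", "Third", "Second"),
                Frequency = c(10, 15, 5, 2, 14, 20, 3))

# tapply gives one sum per category, labelled by category name
sums <- tapply(x$Frequency, x$Category, FUN = sum)

# Rebuild a two-column data frame from the names and the values
result <- data.frame(Category = names(sums), Frequency = as.vector(sums))
print(result)
```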