user5243421 - 1 month ago
R Question

How to sum a variable by group?

Let's say I have two columns of data. The first contains categories such as "First", "Second", "Third", etc. The second has numbers which represent the number of times I saw the corresponding category.

For example:

Category Frequency
First 10
First 15
First 5
Second 2
Third 14
Third 20
Second 3

I want to group the data by Category and sum the Frequencies within each group:

Category Frequency
First 30
Second 5
Third 34

How would I do this in R?

Using aggregate:

x <- data.frame(Category=factor(c("First", "First", "First", "Second",
                                  "Third", "Third", "Second")),
                Frequency=c(10,15,5,2,14,20,3))
aggregate(x$Frequency, by=list(Category=x$Category), FUN=sum)
#>   Category  x
#> 1    First 30
#> 2   Second  5
#> 3    Third 34

As @thelatemail points out in a comment, aggregate has a formula interface too, which also keeps the original column name in the result:

aggregate(Frequency ~ Category, x, sum)

Or, if you want to aggregate multiple columns, you could use the . notation, which stands for every column not otherwise named in the formula (it works for one column too):

aggregate(. ~ Category, x, sum)
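To make the multi-column case concrete, here is a minimal sketch. The Weight column and the names y and totals are made up for illustration and are not part of the original question.

```r
# Hypothetical data: Weight is an invented second numeric column.
y <- data.frame(
  Category  = c("First", "First", "First", "Second", "Third", "Third", "Second"),
  Frequency = c(10, 15, 5, 2, 14, 20, 3),
  Weight    = c(1, 2, 3, 4, 5, 6, 7)
)

# "." expands to every column other than Category, so both
# Frequency and Weight are summed within each group.
totals <- aggregate(. ~ Category, data = y, FUN = sum)
print(totals)
```

If only some of the columns should be summed, naming them explicitly with cbind(Frequency, Weight) ~ Category should work as well.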

Or tapply, which returns a named vector rather than a data frame:

tapply(x$Frequency, x$Category, FUN=sum)
#>  First Second  Third
#>     30      5     34
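If a data frame is needed downstream, the tapply result can be converted back. This is a minimal sketch using the same example data; the names sums and result are illustrative only.

```r
x <- data.frame(
  Category  = c("First", "First", "First", "Second", "Third", "Third", "Second"),
  Frequency = c(10, 15, 5, 2, 14, 20, 3)
)

# tapply returns a one-dimensional array whose names are the group labels.
sums <- tapply(x$Frequency, x$Category, FUN = sum)

# Rebuild a two-column data frame from the names and the values.
result <- data.frame(Category = names(sums), Frequency = as.vector(sums))
print(result)
```

The aggregate approaches above return a data frame directly, so this conversion is only worth doing when tapply is already being used for other reasons.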
Source (Stackoverflow)