n49o7 n49o7 - 10 days ago 5
R Question

Sort a data frame by group sum first, then by frequency

I have the following data in a data.frame:

reg id n
Echo E1 7
Charlie C1 3
Alpha A1 5
Echo E2 1
Alpha A2 4
Charlie C2 9


I am trying to sort it by reg sum first, then by n (both decreasing).
The result would be:

reg id n
Charlie C2 9
Charlie C1 3
Alpha A1 5
Alpha A2 4
Echo E1 7
Echo E2 1


Because Charlie has 12, Alpha has 9 and Echo has 8.

So far, I've only been able to group and do the second part of the sorting (through poor use of dplyr). The reg column still gets sorted alphabetically.

Is there an elegant solution to this?

Answer

Using dplyr

# data
df1 <- read.table(text = "reg        id    n
Echo       E1    7
Charlie    C1    3
Alpha      A1    5
Echo       E2    1
Alpha      A2    4
Charlie    C2    9", header = TRUE)

library(dplyr)
df1 %>% 
  group_by(reg) %>% 
  mutate(nSum = sum(n)) %>% 
  arrange(-nSum, -n) %>% 
  select(-nSum)

# Source: local data frame [6 x 3]
# Groups: reg [3]
# 
#       reg     id     n
#    <fctr> <fctr> <int>
# 1 Charlie     C2     9
# 2 Charlie     C1     3
# 3   Alpha     A1     5
# 4   Alpha     A2     4
# 5    Echo     E1     7
# 6    Echo     E2     1
Comments