Johnny Chiu - 1 month ago
R Question

How to count and calculate percentages for two columns in an R data.frame?

In R, I have a data.frame like this:

df1 <- data.frame(
  grade = rep(LETTERS[1:5], 4),
  sex   = c(rep("male", 5), rep("female", 5), rep("male", 4), rep("female", 6)),
  class = c(rep(1, 10), rep(2, 10))
)

df1

   grade    sex class
1      A   male     1
2      B   male     1
3      C   male     1
4      D   male     1
5      E   male     1
6      A female     1
7      B female     1
8      C female     1
9      D female     1
10     E female     1
11     A   male     2
12     B   male     2
13     C   male     2
14     D   male     2
15     E female     2
16     A female     2
17     B female     2
18     C female     2
19     D female     2
20     E female     2


I want to calculate the percentage of each sex within each class and put the result in another data.frame, like this:

Class  Male_percent  Female_percent
1      50%           50%
2      40%           60%


Can someone show me how to do this?
This question may have been asked before, but I don't know the right keywords to search for. Apologies if it is a duplicate.

Answer

Using the data.table package, you can do the following:

library(data.table)

# setDT() converts df1 to a data.table by reference; .N is the row count in each class
setDT(df1)[ , .(
                Male_Percent   = paste0(sum(sex == "male") / .N * 100, "%"),
                Female_Percent = paste0(sum(sex == "female") / .N * 100, "%")
              ),
           by = class
         ]

Result

#     class      Male_Percent  Female_Percent
# 1:     1          50%            50%
# 2:     2          40%            60%

Alternatively, a dplyr solution:

library(dplyr)

df1 %>%
  count(class, sex) %>%     # one row per class/sex pair, with the count in column n
  group_by(class) %>%
  summarise(
    Male_Percent   = paste0(n[sex == "male"] / sum(n) * 100, "%"),
    Female_Percent = paste0(n[sex == "female"] / sum(n) * 100, "%")
  )

#  class   Male_Percent     Female_Percent
#   1          50%            50%
#   2          40%            60%
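
For readers who would rather not load a package, the same table can probably be reached in base R with table() and prop.table(). This is only a sketch and not part of the original answer: it assumes the question's data frame is named df1, the names counts, props, and result are illustrative, and round() is added here just to keep the labels tidy.

```r
# Rebuild the example data from the question.
df1 <- data.frame(
  grade = rep(LETTERS[1:5], 4),
  sex   = c(rep("male", 5), rep("female", 5), rep("male", 4), rep("female", 6)),
  class = c(rep(1, 10), rep(2, 10))
)

# Cross-tabulate class (rows) by sex (columns), then turn the counts
# into row-wise proportions so each class sums to 1.
counts <- table(df1$class, df1$sex)
props  <- prop.table(counts, margin = 1)

# Assemble the requested layout; column names follow the answers above.
result <- data.frame(
  class          = as.numeric(rownames(props)),
  Male_Percent   = paste0(round(100 * props[, "male"]), "%"),
  Female_Percent = paste0(round(100 * props[, "female"]), "%")
)
print(result)
```

If numeric percentages are more useful than formatted strings (for plotting or further arithmetic), dropping the paste0() calls and keeping 100 * props[, "male"] would be the natural variation.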