Neil - 1 year ago 104
R Question

# group by in dplyr and calculating percentages

I have following dataframe in r

``````   Service      Container_Pick_Day
ABC              0
ABC              1
ABC              1
ABC              2
ABC              NA
ABC              0
ABC              1
DEF              NA
DEF              0
DEF              1
DEF              1
DEF              1
DEF              2
DEF              1
``````

Column
`Container_Pick_Day`
is numeric and consist of
`NA`
values.
What I want to do is calculate
`Service`
wise percentage of containers picked up on
`0th day,after 1 day,2 day and so on`
ignoring
`NA`
values

Desired dataframe would be

`````` Service      Container_Pick_Day      Percentage
ABC              0                (2/6)*100 = 33.33
ABC              1                (3/6)*100 = 50
ABC              2                (1/6)*100 = 16.67
DEF              0                (1/6)*100 = 16.67
DEF              1                (3/6)*100 = 50
DEF              2                (1/6)*100 = 16.67
``````

I did following in R,but its generating NA values in output

``````  df%>%
group_by(Service) %>%
summarise(pick_day_perc = n()/sum(Container_Pick_Day),na.rm=T) %>%
as.data.frame()
``````

Do I have to group by
`Service and Container_Pick_Day`
both ?

Adding an answer based on all comments above provided by @nicola, @akrun and myself,

``````library(dplyr)

#nicola
df %>%
filter(!is.na(Container_Pick_Day)) %>%
group_by(Service,Container_Pick_Day) %>%
summarise(Percentage=n()) %>%
group_by(Service) %>%
mutate(Percentage=Percentage/sum(Percentage)*100)

#akrun
df %>%
filter(complete.cases(Container_Pick_Day)) %>%
count(Service, Container_Pick_Day) %>%
group_by(Service) %>%
transmute(Container_Pick_Day, Percentage=n/sum(n)*100)

#Sotos
df %>%
na.omit() %>%
group_by_all() %>%
summarise(ptg = n()) %>%
group_by(Service) %>%
mutate(ptg = prop.table(ptg)*100)
``````

All resulting to,

``````Service Container_Pick_Day Percentage
<fctr>              <int>      <dbl>
1     ABC                  0   33.33333
2     ABC                  1   50.00000
3     ABC                  2   16.66667
4     DEF                  0   16.66667
5     DEF                  1   66.66667
6     DEF                  2   16.66667
``````
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download