FiofanS - 1 year ago 61
R Question

# How to calculate average time for aggregated data per different groups?

I have the following data frame (this question relates to [this thread]):

``````
df = data.frame(c("2012","2012","2012","2013"),
                c("AAA","BBB","AAA","AAA"),
                c("X","Not-serviced","X","Y"),
                c(2, 10, 3, 2.5))  # numeric, so that mean() works

colnames(df) = c("year","type","service_type","waiting_time")
``````

I want to get average waiting times for serviced and nonserviced groups. This is how the data is grouped:

``````
library(data.table)
setDT(df)[, .(num_serviced = sum(service_type != "Not-serviced"),
              num_notserviced = sum(service_type == "Not-serviced"),
              avg_wt = mean(waiting_time)), ## THE PROBLEM HERE!!!
          .(year, type)][, Total := num_serviced + num_notserviced][]
``````

However, `avg_wt = mean(waiting_time)` estimates the average waiting time over the whole group (Total). I need separate `avg_wt_serviced` and `avg_wt_notserviced` columns instead.

The result must be:

``````
year  type num_serviced num_notserviced num_total avg_wt_serviced  avg_wt_notserviced
2012  AAA  2            0               2         2.5              0
``````

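With `dplyr`, we can apply `mean` to the subset of `waiting_time` that belongs to each service group (this assumes `waiting_time` is numeric). Groups with no matching rows give `NaN`, because the mean of an empty vector is `NaN`: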

``````
library(dplyr)
df %>%
  group_by(year, type) %>%
  summarise(num_serviced = sum(service_type != "Not-serviced"),
            num_notserviced = sum(service_type == "Not-serviced"),
            num_total = num_serviced + num_notserviced,
            avg_wt_serv = mean(waiting_time[service_type != "Not-serviced"]),
            avg_wt_notser = mean(waiting_time[service_type == "Not-serviced"]))

#   year  type num_serviced num_notserviced num_total avg_wt_serv  avg_wt_notser
#   <fctr> <fctr>   <int>           <int>     <int>      <dbl>         <dbl>
#1   2012    AAA       2               0         2        2.5            NaN
#2   2012    BBB       0               1         1        NaN            10
#3   2013    AAA       1               0         1        2.5            NaN
``````
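Since the question started from `data.table`, the same subsetting idea can be sketched there as well. This is only an illustration under a few assumptions: the sample data is rebuilt with a numeric `waiting_time`, `num_total` is taken from `.N` (the group size), and `mean_or_zero` is a hypothetical helper, not part of any package, that returns 0 for empty groups to match the 0 shown in the desired output instead of `NaN`.

```r
library(data.table)

# Sample data from the question; waiting_time is numeric here (assumption),
# because mean() does not work on character or factor columns.
dt <- data.table(
  year         = c("2012", "2012", "2012", "2013"),
  type         = c("AAA", "BBB", "AAA", "AAA"),
  service_type = c("X", "Not-serviced", "X", "Y"),
  waiting_time = c(2, 10, 3, 2.5)
)

# Hypothetical helper: mean of x, or 0 when x is empty (mean() alone gives NaN).
mean_or_zero <- function(x) if (length(x) == 0L) 0 else mean(x)

res <- dt[, .(
  num_serviced       = sum(service_type != "Not-serviced"),
  num_notserviced    = sum(service_type == "Not-serviced"),
  num_total          = .N,  # rows in the group = serviced + not serviced
  avg_wt_serviced    = mean_or_zero(waiting_time[service_type != "Not-serviced"]),
  avg_wt_notserviced = mean_or_zero(waiting_time[service_type == "Not-serviced"])
), by = .(year, type)]

print(res)
```

If `NaN` is preferred for empty groups, as in the `dplyr` output, replacing `mean_or_zero` with plain `mean` should give that behaviour.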