Thirst for Knowledge - 1 year ago 123
R Question

# Calculate cumulative percentage for each unit over a time series

I have the following data:

``````
ID <- c(1, 2, 1, 2, 1, 2)
year  <- c(1, 1, 2, 2, 3, 3)
population.served  <- c(100, 200, 300, 400, 400, 500)
population  <- c(1000, 1200, 1000, 1200, 1000, 1200)
all <- data.frame(ID, year, population.served, population)
``````

I want to calculate the cumulative percentage of the population served for each ID by year. My attempts so far only give the percentage served within each individual year. I need some way of iterating through each ID and year so that the cumulative sum of `population.served` becomes the numerator.

I want the data to look like this:

``````
ID <- c(1, 2, 1, 2, 1, 2)
year  <- c(1, 1, 2, 2, 3, 3)
population.served  <- c(100, 200, 300, 400, 400, 500)
population  <- c(1000, 1200, 1000, 1200, 1000, 1200)
cumulative.served <- c(10, 16.7, 40, 50, 80, 91.7)
all <- data.frame(ID, year, population.served, population, cumulative.served)
``````
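
To make the target column concrete, here is a small sketch of the arithmetic for ID 1 only, using the numbers from the data above. The variable names (`served_id1`, `population_id1`, `running_total`, `pct_id1`) are made up for this illustration and are not part of the original data.

```r
# Served counts for ID 1 in years 1 to 3, copied from the question's data
served_id1 <- c(100, 300, 400)
population_id1 <- 1000

# Running total of people served across the years
running_total <- cumsum(served_id1)

# Express the running total as a percentage of the (constant) population
pct_id1 <- round(running_total / population_id1 * 100, 1)
print(pct_id1)
```

The same calculation has to be repeated separately for every ID, which is why a grouped cumulative sum is needed.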

This can easily be done with the `dplyr` package:

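``````
library(dplyr)

all %>%
  arrange(year) %>%    # order by year so cumsum() runs forward through time
  group_by(ID) %>%     # restart the running total for each ID
  mutate(cumulative.served = round(cumsum(population.served) / population * 100, 1))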
``````

The output is then:

``````
     ID  year population.served population cumulative.served
  <dbl> <dbl>             <dbl>      <dbl>             <dbl>
1     1     1               100       1000              10.0
2     2     1               200       1200              16.7
3     1     2               300       1000              40.0
4     2     2               400       1200              50.0
5     1     3               400       1000              80.0
6     2     3               500       1200              91.7
``````

Or in a similar way with the fast `data.table` package:

``````
library(data.table)
setDT(all)[order(year), cumulative.served := round(cumsum(population.served)/population*100,1), by = ID]
``````

After some trial and error, I also figured out a base R approach:

``````
all <- all[order(all$ID, all$year),]
all$cumulative.served <- round(100*with(all, ave(population.served, ID, FUN = cumsum))/all$population, 1)
``````
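
The base R version above reorders the rows by ID. If the original row order should be kept, as in the desired output in the question, one possible variation is to compute the running total on a sorted index and write it back into place. This is only a sketch: the data frame name `served` and the helper variables `ord` and `running_total` are my own choices, with `served` used instead of `all` to avoid confusion with base R's `all()` function.

```r
# Rebuild the question's data; `served` is a hypothetical name for this sketch
served <- data.frame(
  ID = c(1, 2, 1, 2, 1, 2),
  year = c(1, 1, 2, 2, 3, 3),
  population.served = c(100, 200, 300, 400, 400, 500),
  population = c(1000, 1200, 1000, 1200, 1000, 1200)
)

# Row positions sorted by ID, then year; the data frame itself is not reordered
ord <- order(served$ID, served$year)

# Cumulative sum within each ID along the sorted positions,
# written back to the rows the values came from
running_total <- numeric(nrow(served))
running_total[ord] <- ave(served$population.served[ord], served$ID[ord], FUN = cumsum)

# Percentage of the population served so far, rounded to one decimal
served$cumulative.served <- round(100 * running_total / served$population, 1)
print(served)
```

Because `ave()` returns its result in the order of its input, indexing both the input and the assignment target by `ord` keeps each running total attached to its original row.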