aaronmams - 1 year ago 185
R Question

# calculate rolling sum based on row index in R

I am trying to calculate a grouped rolling sum based on a window size k but, in the event that the within group row index (n) is less than k, I want to calculate the rolling sum using the condition k=min(n,k).

My issue is similar to this question R dplyr rolling sum but I am looking for a solution that provides a non-NA value for each row.

I can get part of the way there using dplyr and rollsum:

``````library(zoo)
library(dplyr)
df <- data.frame(Date=rep(seq(as.Date("2000-01-01"),
as.Date("2000-12-01"),by="month"),2),
ID=c(rep(1,12),rep(2,12)),value=1)
df <- tbl_df(df)
df <- df %>%
group_by(ID) %>%
mutate(total3mo=rollsum(x=value,k=3,align="right",fill="NA"))

df
Source: local data frame [24 x 4]
Groups: ID [2]

Date    ID value tota3mo
(date) (dbl) (dbl)   (dbl)
1  2000-01-01     1     1      NA
2  2000-02-01     1     1      NA
3  2000-03-01     1     1       3
4  2000-04-01     1     1       3
5  2000-05-01     1     1       3
6  2000-06-01     1     1       3
7  2000-07-01     1     1       3
8  2000-08-01     1     1       3
9  2000-09-01     1     1       3
10 2000-10-01     1     1       3
..        ...   ...   ...     ...
``````

In this case, what I would like is to return the value 1 for observations on 2000-01-01 and the value 2 for observations on 2000-02-01. More generally, I would like the rolling sum to be calculated over the largest window possible but no larger than k.

In this particular case it's not too difficult to change some NA values by hand. However, ultimately I would like to add several more columns to my data frame that will be rolling sums calculated over various windows. In this more general case it will get quite tedious to go back change many NA values by hand.

Using the `partial=TRUE` argument of `rollapplyr` :

``````df %>%
group_by(ID) %>%
mutate(roll = rollapplyr(value, 3, sum, partial = TRUE)) %>%
ungroup()
``````

or without dplyr (still need zoo):

``````roll <- function(x) rollapplyr(x, 3, sum, partial = TRUE)
transform(df, roll = ave(value, ID, FUN = roll))
``````
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download