Tyu1990 Tyu1990 - 1 month ago 14
R Question

Conditional rolling sum

This is the data frame

Date ID cost Value
15/12/2016 1 yes 200
15/10/2016 1 yes 100
15/9/2016 1 yes 55
15/04/2016 1 yes 1000
15/12/2016 2 yes 300
15/10/2016 2 yes 200
15/9/2016 2 yes 100
15/04/2016 2 yes 1000
15/12/2016 3 no 300
15/10/2016 3 no 200
15/9/2016 3 no 100
15/04/2016 3 no 1000


I'd like to repeat the rolling sum on 3 month for each ID that have a cost = "yes". Note that in the example the IDs are just 3 but in my DB are n.

The output should be

Date ID Value Rolling_Sum
15/12/2016 1 200 355
15/10/2016 1 100 155
15/9/2016 1 55 55
15/04/2016 1 1000 1000
15/12/2016 2 300 600
15/10/2016 2 200 300
15/9/2016 2 100 100
15/04/2016 2 1000 1000


I've seen a lot of example in other questions. one of my biggest problem is that the date is not continue.. so i can have different lag between different data.

Thanks

Answer

You could use the foverlaps function from the data.table-package for this:

library(data.table)
library(lubridate)

# convert the data to a 'data.table'
setDT(dt)
# convert the Date column to date-class
dt[, Date := as.Date(Date, '%d/%m/%Y')]
# create an exact same column to be used by the 'foverlaps' function
dt[, bdate := Date]
# create a reference 'data.table' with the 3 month intervals
dtc <- copy(dt)[, bdate := Date %m-% months(3)]
# set the keys for the reference data.table (needed for the 'foverlaps' function) 
setkey(dtc, ID, bdate, Date)
# create the overlaps and summarise
foverlaps(dt[cost=='yes'], dtc, type = 'within')[, .(val = sum(i.Value)), by = .(ID, Date)]

which gives:

   ID       Date  val
1:  1 2016-12-15  355
2:  1 2016-10-15  155
3:  1 2016-09-15   55
4:  1 2016-04-15 1000
5:  2 2016-12-15  600
6:  2 2016-10-15  300
7:  2 2016-09-15  100
8:  2 2016-04-15 1000