hannes101 hannes101 - 1 month ago 7
R Question

Transform Year-to-date to Quarterly data with data.table

Quarterly data from a data provider has the issue that for some variables the quarterly data values are actually Year-to-date figures. That means the values are the sum of all previous quarters (Q2 = Q1 + Q2 , Q3 = Q1 + Q2 + Q3, ...).
The structure of the original data looks the following:

library(data.table)
library(plyr)
dt.quarter.test <- structure(list(Year = c(2000L, 2000L, 2000L, 2000L, 2001L, 2001L, 2001L, 2001L)
, Quarter = c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L)
, Data.Year.to.Date = c(162, 405, 610, 938, 331, 1467, 1981, 2501))
, .Names = c("Year", "Quarter", "Data.Year.to.Date"), class = ("data.table", "data.frame"), row.names = c(NA, -8L))


In order to calculate the quarterly values I therefore need to subtract the previous Quarter from Q2, Q3 and Q4.
I've managed to get the desired results by using the
ddply
function from the
plyr
package.

dt.quarter.result <- ddply(dt.quarter.test, "Year"
, transform
, Data.Quarterly = Data.Year.to.Date - shift(Data.Year.to.Date, n = 1L, type = "lag", fill = 0))

dt.quarter.result
Year Quarter Data.Year.to.Date Data.Quarterly
1 2000 1 162 162
2 2000 2 405 243
3 2000 3 610 205
4 2000 4 938 328
5 2001 1 331 331
6 2001 2 1467 1136
7 2001 3 1981 514
8 2001 4 2501 520


But I am not really happy with the command, since it seems quite clumsy and I would like to get some input on how to improve it and especially do it directly within the data.table.

Answer

Here is the data.table syntax, and you might find data.table cheat sheet helpful:

library(data.table)
dt.quarter.test[, Data.Quarterly := Data.Year.to.Date - shift(Data.Year.to.Date, fill = 0), Year][]    

#    Year Quarter Data.Year.to.Date Data.Quarterly
# 1: 2000       1               162            162
# 2: 2000       2               405            243
# 3: 2000       3               610            205
# 4: 2000       4               938            328
# 5: 2001       1               331            331
# 6: 2001       2              1467           1136
# 7: 2001       3              1981            514
# 8: 2001       4              2501            520