Mace Mace - 3 months ago 14
R Question

Producing a rolling average of ALL the previous observations per ID an unbalanced panel data set

I am trying to compute rolling means of an unbalanced data set. To illustrate my point I have produced this toy example of my data:

ID year Var RollingAvg(Var)
1 2000 2 NA
1 2001 3 2
1 2002 4 2.5
1 2003 2 3
2 2001 2 NA
2 2002 5 2
2 2003 4 3.5


The column
RollingAvg(Var)
is what I want, but can't get. In words, I am looking for the rolling average of ALL the previous observations of
Var
for each
ID
. I have tried using
rollapply
and
ddply
in the
zoo
and the
plyr
package, but I can't see how to set the rolling window length to use ALL the previous observations for each ID. Maybe I should use the plm package instead? Any help is appreciated.

I have seen other posts on rolling means on BALANCED panel data set, but I can't seem to extrapolate their answers to unbalanced data.

Thanks,

M

Answer

Using data.table:

library(data.table)
d = data.table(your_df)

d[, RollingAvg := {avg = cumsum(Var)/seq_len(.N);
                   c(NA, avg[-length(avg)])},
    by = ID]

(or even simplified)

d[, RollingAvg := c(NA, head(cumsum(Var)/(seq_len(.N)), -1)), by = ID]