Cofeinnie Bonda Cofeinnie Bonda - 3 months ago 97
Python Question

Rolling Regression Estimation in Python dataframe

I have a dataframe like this:

Date Y X1 X2 X3
22 2004-05-12 9.348158e-09 0.000081 0.000028 0.000036
23 2004-05-13 9.285989e-09 0.000073 0.000081 0.000097
24 2004-05-14 9.732308e-09 0.000085 0.000073 0.000096
25 2004-05-17 2.235977e-08 0.000089 0.000085 0.000099
26 2004-05-18 2.792661e-09 0.000034 0.000089 0.000150
27 2004-05-19 9.745323e-09 0.000048 0.000034 0.000053

......

1000 2004-05-20 1.835462e-09 0.000034 0.000048 0.000099
1001 2004-05-21 3.529089e-09 0.000037 0.000034 0.000043
1002 2004-05-24 3.453047e-09 0.000043 0.000037 0.000059
1003 2004-05-25 2.963131e-09 0.000038 0.000043 0.000059
1004 2004-05-26 1.390032e-09 0.000029 0.000038 0.000054


I want to run a rolling 100-day window OLS regression estimation, which is:

First for the 101st row, I run a regression of Y-X1,X2,X3 using the 1st to 100th rows, and estimate Y for the 101st row;

Then for the 102nd row, I run a regression of Y-X1,X2,X3 using the 2nd to 101st rows, and estimate Y for the 102nd row;

Then for the 103rd row, I run a regression of Y-X1,X2,X3 using the 2nd to 101st rows, and estimate Y for the 103rd row;

......

Until the last row.

How to do this?

Answer
model = pd.stats.ols.MovingOLS(y=df.Y, x=df[['X1', 'X2', 'X3']], 
                               window_type='rolling', window=100, intercept=True)
df['Y_hat'] = model.y_predict
Comments