jf328 jf328 - 3 months ago 37
Python Question

pandas dataframe, add one column as moving average of another column for each group

I have a dataframe

df
like below.

dates = pd.date_range('2000-01-01', '2001-01-01')
df1 = pd.DataFrame({'date':dates, 'value':np.random.normal(size = len(dates)), 'market':'GOLD'})
df2 = pd.DataFrame({'date':dates, 'value':np.random.normal(size = len(dates)), 'market':'SILVER'})
df = pd.concat([df1, df2])
df = df.sort('date')

date market value
0 2000-01-01 GOLD -1.361360
0 2000-01-01 SILVER 0.255830
1 2000-01-02 SILVER 0.196953
1 2000-01-02 GOLD 1.422454
2 2000-01-03 GOLD -0.827672
...


I want to add another column as the 10d moving average of value, for each market.

Is there a simple
df.groupby('market').???
that can achieve this? Or do I have to pivot the table to wide form, smooth each column, then melt back?

Answer

You could use groupby/rolling/mean:

result = (df.set_index('date')
            .groupby('market')['value']
            .rolling(10).mean()
            .unstack('market'))

yields

market          GOLD    SILVER
date                          
2000-01-01       NaN       NaN
2000-01-02       NaN       NaN
2000-01-03       NaN       NaN
2000-01-04       NaN       NaN
2000-01-05       NaN       NaN
2000-01-06       NaN       NaN
2000-01-07       NaN       NaN
2000-01-08       NaN       NaN
2000-01-09       NaN       NaN
2000-01-10  0.310077  0.582063
2000-01-11  0.312008  0.752218
2000-01-12  0.151159  0.877230
2000-01-13  0.213611  0.742156
2000-01-14  0.440113  0.614720
2000-01-15  0.551360  0.649967
...
Comments