Aris Aris - 1 month ago 6
Python Question

Rolling Mean in Pandas

I have this initial DataFrame in Pandas

A B C D E
0 23 2015 1 14937 16.25
1 23 2015 1 19054 7.50
2 23 2015 2 14937 16.75
3 23 2015 2 19054 17.25
4 23 2015 3 14937 71.75
5 23 2015 3 19054 15.00
6 23 2015 4 14937 13.00
7 23 2015 4 19054 37.75
8 23 2015 5 14937 4.25
9 23 2015 5 19054 18.25
10 23 2015 6 14937 16.50
11 23 2015 6 19054 1.00


If I want to obtain this result, how could I do it?

A B C D E
0 23 2015 1 14937 NaN
1 23 2015 2 14937 NaN
2 23 2015 2 14937 16.6
3 23 2015 1 14937 35.1
4 23 2015 2 14937 33.8
5 23 2015 3 14937 29.7
6 23 2015 4 14937 11.3
7 23 2015 4 19054 NaN
8 23 2015 5 19054 NaN
9 23 2015 5 19054 13.3
10 23 2015 6 19054 23.3
11 23 2015 6 19054 23.7
12 23 2015 6 19054 19.0


I tried a GroupBy but I dind't get it

DfMean = pd.DataFrame(DfGby.rolling(center=False,window=3)['E'].mean())

Answer

I think you can use groupby with rolling (need at least pandas 0.18.1):

s = df.groupby('D').rolling(3)['E'].mean()
print (s)
D        
14937  0           NaN
       2           NaN
       4     34.916667
       6     33.833333
       8     29.666667
       10    11.250000
19054  1           NaN
       3           NaN
       5     13.250000
       7     23.333333
       9     23.666667
       11    19.000000
Name: E, dtype: float64

Then set_index by D with swaplevel for same order for matching output:

df = df.set_index('D', append=True).swaplevel(0,1)

df['E'] = s

Last reset_index and reorder columns:

df = df.reset_index(level=0).sort_values(['D','C'])
df = df[['A','B','C','D','E']]

print (df)
     A     B  C      D          E
0   23  2015  1  14937        NaN
2   23  2015  2  14937        NaN
4   23  2015  3  14937  34.916667
6   23  2015  4  14937  33.833333
8   23  2015  5  14937  29.666667
10  23  2015  6  14937  11.250000
1   23  2015  1  19054        NaN
3   23  2015  2  19054        NaN
5   23  2015  3  19054  13.250000
7   23  2015  4  19054  23.333333
9   23  2015  5  19054  23.666667
11  23  2015  6  19054  19.000000