Aris - 1 year ago 89
Python Question

# Rolling Mean in Pandas

I have this initial DataFrame in Pandas

``````                  A             B             C       D          E
0               23       2015             1      14937     16.25
1               23       2015             1      19054      7.50
2               23       2015             2      14937     16.75
3               23       2015             2      19054     17.25
4               23       2015             3      14937     71.75
5               23       2015             3      19054     15.00
6               23       2015             4      14937     13.00
7               23       2015             4      19054     37.75
8               23       2015             5      14937      4.25
9               23       2015             5      19054     18.25
10              23       2015             6      14937     16.50
11              23       2015             6      19054      1.00
``````

If I want to obtain this result, how could I do it?

``````           A             B             C       D          E
0               23       2015             1      14937     NaN
1               23       2015             2      14937     NaN
2               23       2015             2      14937     16.6
3               23       2015             1      14937     35.1
4               23       2015             2      14937     33.8
5               23       2015             3      14937     29.7
6               23       2015             4      14937     11.3
7               23       2015             4      19054     NaN
8               23       2015             5      19054     NaN
9               23       2015             5      19054     13.3
10              23       2015             6      19054     23.3
11              23       2015             6      19054     23.7
12              23       2015             6      19054     19.0
``````

I tried a GroupBy but I dind't get it

``````DfMean = pd.DataFrame(DfGby.rolling(center=False,window=3)['E'].mean())
``````

I think you can use `groupby` with `rolling` (need at least pandas 0.18.1):

``````s = df.groupby('D').rolling(3)['E'].mean()
print (s)
D
14937  0           NaN
2           NaN
4     34.916667
6     33.833333
8     29.666667
10    11.250000
19054  1           NaN
3           NaN
5     13.250000
7     23.333333
9     23.666667
11    19.000000
Name: E, dtype: float64
``````

Then `set_index` by `D` with `swaplevel` for same order for matching output:

``````df = df.set_index('D', append=True).swaplevel(0,1)

df['E'] = s
``````

Last `reset_index` and reorder columns:

``````df = df.reset_index(level=0).sort_values(['D','C'])
df = df[['A','B','C','D','E']]

print (df)
A     B  C      D          E
0   23  2015  1  14937        NaN
2   23  2015  2  14937        NaN
4   23  2015  3  14937  34.916667
6   23  2015  4  14937  33.833333
8   23  2015  5  14937  29.666667
10  23  2015  6  14937  11.250000
1   23  2015  1  19054        NaN
3   23  2015  2  19054        NaN
5   23  2015  3  19054  13.250000
7   23  2015  4  19054  23.333333
9   23  2015  5  19054  23.666667
11  23  2015  6  19054  19.000000
``````
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download