Wookeun Lee Wookeun Lee - 12 days ago 4
Python Question

variable shift in padas

Here is a Dataframe

import pandas as pd

df = pd.DataFrame({'A':[3,5,3,4,2,3,2,3,4,3,2,2,2,3],
'B':[10,20,30,40,20,30,40,10,20,30,15,60,20,15]})

A B
0 3 10
1 5 20
2 3 30
3 4 40
4 2 20
5 3 30
6 2 40
7 3 10
8 4 20
9 3 30
10 2 15
11 2 60
12 2 20
13 3 15


I'd like to append C columns, containing rolling average of B (rolling period =A)

For example, C value at row index(2) should be df.B.rolling(3).mean() = mean(10,20,30)

C value at row index(4) should be df.B.rolling(2).mean() = mean(40,20)

Anybody can help me?

Answer

probably stupid slow... but this get's it done

def crazy_apply(row):
    p = df.index.get_loc(row.name)
    a = row.A
    return df.B.iloc[p-a+1:p+1].mean()

df.apply(crazy_apply, 1)

0           NaN
1           NaN
2     20.000000
3     25.000000
4     30.000000
5     30.000000
6     35.000000
7     26.666667
8     25.000000
9     20.000000
10    22.500000
11    37.500000
12    40.000000
13    31.666667
dtype: float64

explanation
apply iterates through each column or each row. We iterate through each row because we used the parameter axis=1 (see 1 as the second argument in the call to apply). So every iteration of apply passes the a pandas series object that represents the current row. the current index value is in the name attribute of the row. The index of the row object is the same as the columns of df.

So, df.index.get_loc(row.name) finds the ordinal position of the current index value held in row.name. row.A is the column A for that row.

Comments