skyork - 6 months ago
Python Question

Reference values in the previous row with map or apply

Given a dataframe, I would like to generate a new variable/column for each row based on the values in the previous row. The dataframe is sorted so that the order of the rows is meaningful.

Normally, we can use either map or apply, but it seems that neither of them allows access to values in the previous row.

For example, given existing rows with columns a, b, c, I want to generate a new column whose value is based on some calculation using a value from the previous row.

How should I do it in pandas?


If you just want to do a calculation based on the previous row, you can calculate and then shift:

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({'a':[0,1,2], 'b':[0,10,20]})

In [3]: df
   a   b
0  0   0
1  1  10
2  2  20

# a calculation based on other column
In [4]: df['c'] = df['b'] + 1

# shift the column
In [5]: df['c'] = df['c'].shift()

In [6]: df
   a   b   c
0  0   0 NaN
1  1  10   1
2  2  20  11
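The same idea also works the other way round: shift first, so the previous row's value sits next to the current row, and then calculate. The sketch below assumes a hypothetical calculation (current a plus previous b) and hypothetical column names d and d_filled; neither comes from the question.

```python
import pandas as pd

df = pd.DataFrame({'a': [0, 1, 2], 'b': [0, 10, 20]})

# Bring the previous row's 'b' alongside the current row.
# The first row has no predecessor, so it gets NaN.
prev_b = df['b'].shift(1)

# Hypothetical calculation: current 'a' plus previous 'b'.
df['d'] = df['a'] + prev_b

# If a default makes sense for the first row, fill_value avoids the NaN
# (and the resulting cast to float).
df['d_filled'] = df['a'] + df['b'].shift(1, fill_value=0)

print(df)
```

Because shift works on whole columns, this stays vectorized and avoids iterating over rows with map or apply.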

If you want to do a calculation based on multiple rows, you could look at the rolling_apply function (replaced in later pandas versions by the rolling method, e.g. df['b'].rolling(2).apply(func)).
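For a calculation that spans several rows, a rolling window could look roughly like the sketch below. It uses the rolling(...).apply(...) form found in current pandas rather than the older top-level function. The window size of 2 and the difference calculation are assumptions chosen purely for illustration, as is the column name diff.

```python
import pandas as pd

df = pd.DataFrame({'a': [0, 1, 2], 'b': [0, 10, 20]})

# A window of 2 rows covers the current row and the one before it.
# raw=True passes each window to the function as a NumPy array.
# Hypothetical calculation: last value in the window minus the first.
df['diff'] = df['b'].rolling(window=2).apply(lambda w: w[-1] - w[0], raw=True)

print(df)
```

The first row has no complete window, so its result is NaN.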