skyork - 11 months ago 182

Python Question

Given a dataframe

`df`

`df`

Normally, we can use either

`map`

`apply`

For example, given existing rows

`a b c`

`d`

`c`

How should I do it in pandas?

Answer

If you just want to do a calculation based on the previous row, you can calculate and then shift:

```
In [1]: import pandas as pd

In [2]: df = pd.DataFrame({'a': [0, 1, 2], 'b': [0, 10, 20]})

In [3]: df
Out[3]:
   a   b
0  0   0
1  1  10
2  2  20

# a calculation based on another column
In [4]: df['c'] = df['b'] + 1

# shift the column down one row, so each row gets the previous row's value
In [5]: df['c'] = df['c'].shift()

In [6]: df
Out[6]:
   a   b     c
0  0   0   NaN
1  1  10   1.0
2  2  20  11.0
```
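The order can also be reversed: shift first, then calculate. Here is a minimal sketch of that, reusing the same `df`. The `delta` column is a hypothetical name added only for illustration; it shows how the shifted column can be combined with the current row.

```python
import pandas as pd

df = pd.DataFrame({'a': [0, 1, 2], 'b': [0, 10, 20]})

# Shift first, then calculate: same result as calculating and then shifting.
# The first row has no previous row, so it becomes NaN.
df['c'] = df['b'].shift() + 1

# Combine the current row with the previous row.
# 'delta' is a hypothetical column name used only for this illustration.
df['delta'] = df['b'] - df['b'].shift()

print(df)
```

For a plain difference like `delta`, `df['b'].diff()` should give the same values. The explicit `shift()` form is shown because it extends to other calculations.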

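If you want to do a calculation based on multiple rows, you could look at the `rolling_apply` function (http://pandas.pydata.org/pandas-docs/stable/computation.html#moving-rolling-statistics-moments and http://pandas.pydata.org/pandas-docs/stable/generated/pandas.rolling_apply.html#pandas.rolling_apply). Note that the top-level `pd.rolling_apply` was deprecated in pandas 0.18 and later removed; in newer versions the equivalent is the `.rolling(...).apply(...)` method on a Series or DataFrame.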
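As a rough sketch of a multi-row calculation, assuming a pandas version (0.23 or later) where the rolling API is `Series.rolling(...).apply(...)` with a `raw` argument: the helper `window_range` and the extra fourth row are made up for illustration and are not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({'a': [0, 1, 2, 3], 'b': [0, 10, 20, 30]})

def window_range(values):
    # With raw=True, 'values' is a NumPy array holding one window of rows
    # (the current row plus the two rows before it).
    return values.max() - values.min()

# window=3: each result uses three consecutive rows of 'b'.
# The first two rows do not have a full window, so they are NaN.
df['c'] = df['b'].rolling(window=3).apply(window_range, raw=True)

print(df)
```

If the current row should not be part of the window, the result can be combined with `.shift()` as in the first example.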

Source (Stack Overflow)