AmanArora AmanArora - 3 months ago 67
Python Question

Logarithmic returns in pandas dataframe

Python pandas has a pct_change function which I use to calculate the returns for stock prices in a dataframe:

ndf['Return']= ndf['TypicalPrice'].pct_change()


I am using the following code to get logarithmic returns, but it gives the exact same values as the pct.change() function:

ndf['retlog']=np.log(ndf['TypicalPrice'].astype('float64')/ndf['TypicalPrice'].astype('float64').shift(1))
#np is for numpy

Answer

Here is one way to calculate log return using .shift(). And the result is similar to but not the same as the gross return calculated by pct_change(). Can you upload a copy of your sample data (dropbox share link) to reproduce the inconsistency you saw?

import pandas as pd
import numpy as np

np.random.seed(0)
df = pd.DataFrame(100 + np.random.randn(100).cumsum(), columns=['price'])
df['pct_change'] = df.price.pct_change()
df['log_ret'] = np.log(df.price) - np.log(df.price.shift(1))

Out[56]: 
       price  pct_change  log_ret
0   101.7641         NaN      NaN
1   102.1642      0.0039   0.0039
2   103.1429      0.0096   0.0095
3   105.3838      0.0217   0.0215
4   107.2514      0.0177   0.0176
5   106.2741     -0.0091  -0.0092
6   107.2242      0.0089   0.0089
7   107.0729     -0.0014  -0.0014
..       ...         ...      ...
92  101.6160      0.0021   0.0021
93  102.5926      0.0096   0.0096
94  102.9490      0.0035   0.0035
95  103.6555      0.0069   0.0068
96  103.6660      0.0001   0.0001
97  105.4519      0.0172   0.0171
98  105.5788      0.0012   0.0012
99  105.9808      0.0038   0.0038

[100 rows x 3 columns]