WhitneyChia - 17 days ago
Python Question

pandas create column referencing itself

I want to create a new column in pandas whose values are calculated from the value in the cell above it. I have a column called return, and each value in the new column should be return * value from the previous row.

Conceptually I think it should be something like this, but it doesn't work and I'm not sure how to fix it.

df2['value'] = [100 if x == 0 else x * y for x, y in zip(df2['return'], df2['value'].shift(1))]


So, data looks like this:

return
0
0.99756466142691
0.99846199238689
1.004349336899
1.0018775199783


I want this:

return value
0.0000000000 100.0000000000
0.9975646614 99.7564661427
0.9984619924 99.6030399383
1.0043493369 100.0362471152
1.0018775200 100.2240671677


Thanks!

Answer

One solution is a loop, because each row needs the value already computed for the previous row:

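# Assumes the default integer index (0, 1, 2, ...) and that the first row has return == 0.
for i, row in df2.iterrows():
    if row['return'] == 0:
        # A return of 0 marks the starting row: seed the series at 100.
        df2.loc[i, 'value'] = 100
    else:
        # Otherwise multiply this row's return by the previous row's value.
        df2.loc[i, 'value'] = row['return'] * df2.loc[i - 1, 'value']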

print(df2)
     return       value
0  0.000000  100.000000
1  0.997565   99.756466
2  0.998462   99.603040
3  1.004349  100.036247
4  1.001878  100.224067
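
Since each value is just the previous value times the current return, the column is a running product, so a loop-free alternative is possible. This is a different technique from the loop above, not a rewrite of it: a minimal sketch that assumes a return of 0 appears only in the first row (as in the sample data) and that the starting value is 100. The name factors is introduced here for illustration only.

```python
import pandas as pd

# Sample data copied from the question.
df2 = pd.DataFrame({"return": [0, 0.99756466142691, 0.99846199238689,
                               1.004349336899, 1.0018775199783]})

# Assumption: a return of 0 only marks the starting row, so it is
# replaced with the neutral factor 1 before multiplying.
factors = df2["return"].where(df2["return"] != 0, 1.0)

# Running product of the factors, scaled by the starting value of 100.
df2["value"] = 100 * factors.cumprod()

print(df2)
```

If a 0 can show up in later rows and should reset the value to 100 each time, this sketch does not cover that case and the loop is the safer choice.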