Uday Shankar - 4 years ago
Python Question

Using .loc to update a pandas dataframe in Python

I have a pandas dataframe (df) with the following column structure:

month A B C D


The dataframe has data for, say, Jan, Feb, Mar and Apr. A, B, C and D are numeric columns. For the month of Feb, I want to recalculate column A and update it in the dataframe, i.e. for month = Feb, A = B + C + D.

The code I used:

df[df['month']=='Feb']['A']=df[df['month']=='Feb']['B'] + df[df['month']=='Feb']['C'] + df[df['month']=='Feb']['D']


This ran without errors but did not change the values in column A for the month of Feb. In the console, it printed this warning:


A value is trying to be set on a copy of a slice from a DataFrame.

Try using .loc[row_indexer,col_indexer] = value instead


I tried to use .loc, but I had already called .reset_index() on the dataframe I am working with, and I am not sure how to set an index and use .loc. I followed the documentation, but it is not clear to me. Could you please help me out here?
This is an example dataframe:

import pandas as pd
import numpy as np
dates = pd.date_range('1/1/2000', periods=8)
df = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])


I want to update, say, one date: 2000-01-03. I am unable to share a snippet of my data because it is real-time data.

Answer Source

As the warning says, you should use loc[row_indexer, col_indexer]. The comparison df['month'] == 'Feb' gives a boolean mask that selects the matching rows. Pass that mask as the row indexer and then, after a comma, the column name:

df.loc[df['month'] == 'Feb', 'A'] = df.loc[df['month'] == 'Feb', 'B'] + df.loc[df['month'] == 'Feb', 'C'] + df.loc[df['month'] == 'Feb', 'D'] 
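
The same idea can be sketched a little more compactly by building the mask once and reusing it. The sample values below are made up for illustration, since the real data was not shared, and the names feb, df2 and day are hypothetical. A frame that has been through .reset_index() has a plain integer index, and a boolean mask works with .loc on it without setting any index. The second half covers the date-indexed example from the question, assuming the date is one of the index labels.

```python
import numpy as np
import pandas as pd

# Made-up sample data standing in for the real frame (an assumption).
df = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar", "Apr"],
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [10.0, 20.0, 30.0, 40.0],
    "C": [100.0, 200.0, 300.0, 400.0],
    "D": [1000.0, 2000.0, 3000.0, 4000.0],
})

# Build the boolean mask once and reuse it on both sides of the assignment.
feb = df["month"] == "Feb"
df.loc[feb, "A"] = df.loc[feb, ["B", "C", "D"]].sum(axis=1)

# Date-indexed frame shaped like the one in the question, but with
# deterministic values instead of random ones so the result is easy to check.
dates = pd.date_range("2000-01-01", periods=8)
df2 = pd.DataFrame(
    np.arange(32, dtype=float).reshape(8, 4),
    index=dates,
    columns=["A", "B", "C", "D"],
)

# Select the single row by its index label, then the column by name.
day = pd.Timestamp("2000-01-03")
df2.loc[day, "A"] = df2.loc[day, ["B", "C", "D"]].sum()
```

Only the Feb row of df and the 2000-01-03 row of df2 are changed; every other value in column A is left as it was.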