user308827 - 1 month ago
Python Question

Getting SettingWithCopyWarning warning even after using .loc in pandas

df_masked.loc[:, col] = df_masked.groupby([df_masked.index.month, df_masked.index.day])[col].\
transform(lambda y: y.fillna(y.median()))


Even after using .loc, I get the following warning. How do I fix it?

Anaconda\lib\site-packages\pandas\core\indexing.py:476: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
self.obj[item] = s

Answer

You could get this SettingWithCopyWarning if df_masked is a sub-DataFrame of some other DataFrame. In particular, if data was copied from the original DataFrame to df_masked, pandas emits the warning to alert you that modifying df_masked will not affect the original DataFrame.

If you do not intend to modify the original DataFrame, then you are free to ignore the warning.
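If the original DataFrame is not meant to change, one way to make that explicit is to take a copy when df_masked is created. The following is only a sketch applied to the statement from the question: the data, the column name 'value' and the year-based mask are made up for illustration, since the question does not show how df_masked was built.

```python
import numpy as np
import pandas as pd

# Hypothetical data: three years of daily values, each value equal to its year
idx = pd.date_range('2014-01-01', '2016-12-31', freq='D')
df = pd.DataFrame({'value': np.asarray(idx.year, dtype=float)}, index=idx)

col = 'value'
gap = pd.Timestamp('2015-01-01')
df.loc[gap, col] = np.nan  # introduce one missing value

# Assumed mask: keep 2015 onwards. The explicit .copy() makes df_masked
# an independent DataFrame rather than a slice of df.
df_masked = df[df.index.year >= 2015].copy()

# The statement from the question: fill gaps with the median of the
# values that fall on the same month and day in other years
df_masked.loc[:, col] = df_masked.groupby(
    [df_masked.index.month, df_masked.index.day]
)[col].transform(lambda y: y.fillna(y.median()))

print(df_masked.loc[gap, col])  # filled in the copy
print(df.loc[gap, col])         # still missing in the original
```

Because df_masked is a separate object here, the gap is filled only in df_masked and df keeps its missing value.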

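There are ways to shut off the warning on a per-statement basis. In particular, you could use df_masked.is_copy = False. (The public is_copy attribute was later deprecated and removed from pandas, so this applies to older versions only.)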

If you run into this warning a lot, then rather than silencing each occurrence one by one, I think it is better to leave the warnings on while you are developing your code. Be aware of what the warning means, and if the modifying-the-child-does-not-affect-the-parent issue does not affect you, ignore it. When your code is ready for production, or if you are experienced enough not to need the warnings, shut them off entirely with

pd.options.mode.chained_assignment = None

near the top of your code.


Here is a simple example which demonstrates the problem and (a) solution:

import pandas as pd

df = pd.DataFrame({'swallow':['African','European'], 'cheese':['gouda', 'cheddar']})
df_masked = df.iloc[1:]
df_masked.is_copy = False   # comment out this line to see the SettingWithCopyWarning
df_masked.loc[:, 'swallow'] = 'forest'
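As an alternative to setting is_copy, which may not be available in newer pandas versions, the same example can take an explicit copy. This is a minimal sketch of that variant, not the only way to do it:

```python
import pandas as pd

df = pd.DataFrame({'swallow': ['African', 'European'],
                   'cheese': ['gouda', 'cheddar']})

# .copy() gives df_masked its own data, so pandas has no reason to warn
df_masked = df.iloc[1:].copy()
df_masked.loc[:, 'swallow'] = 'forest'

print(df_masked)  # the child has the new value
print(df)         # the parent is unchanged
```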

The warning exists to alert new users to the fact that chained indexing such as

df.iloc[1:].loc[:, 'swallow'] = 'forest'

will not affect df when the first indexer (e.g. df.iloc[1:]) returns a copy.
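To see the difference, here is a small sketch that contrasts chained indexing with a single .loc call. It swaps the first indexer for a boolean mask, because boolean indexing always returns a copy, whereas a slice such as df.iloc[1:] may return a view or a copy depending on the pandas version. The warnings filter is only there to keep the demo quiet.

```python
import warnings
import pandas as pd

df = pd.DataFrame({'swallow': ['African', 'European'],
                   'cheese': ['gouda', 'cheddar']})

with warnings.catch_warnings():
    warnings.simplefilter('ignore')  # silence any chained-assignment warning
    # Chained: the mask produces a temporary copy, and the assignment
    # lands on that temporary object instead of df
    df[df['cheese'] == 'cheddar'].loc[:, 'swallow'] = 'forest'

after_chained = df['swallow'].tolist()

# Single step: row and column selection in one .loc call operates on df itself
df.loc[df['cheese'] == 'cheddar', 'swallow'] = 'forest'
after_single = df['swallow'].tolist()

print(after_chained)
print(after_single)
```

Only the single-step .loc assignment changes df. The chained version modifies a temporary object that is thrown away.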