Gavin Gavin - 1 year ago 190
Python Question

Comparing a string column raises ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

I am trying to use an if condition to update some values in a column using the following code:

if df['COLOR_DESC'] == 'DARK BLUE':
    df['NEW_COLOR_DESC'] = 'BLUE'


But I got the following error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().


So what is wrong with this piece of code?

Answer Source

To answer your immediate question: the expression df['COLOR_DESC'] == 'DARK BLUE' evaluates to a Series of booleans, one per row. The error message is telling you that there is no single unambiguous way to convert that Series to the one boolean value that if demands.
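A minimal sketch of that ambiguity, using made-up sample data (only the column name comes from the question). It shows that the comparison is element-wise, that putting the result in an if raises the error, and that any() and all() are two different ways of collapsing the Series to one boolean:

```python
import pandas as pd

# Hypothetical sample data; only the column name is taken from the question.
df = pd.DataFrame({'COLOR_DESC': ['LIGHT RED', 'DARK BLUE', 'DARK BLUE']})

# Element-wise comparison: one boolean per row, not a single True/False.
mask = df['COLOR_DESC'] == 'DARK BLUE'

# Using the whole Series as an if condition raises the ValueError.
error_text = None
try:
    if mask:
        pass
except ValueError as exc:
    error_text = str(exc)

# Explicit reductions give a single boolean, but they answer different questions.
any_dark_blue = bool(mask.any())  # is at least one row 'DARK BLUE'?
all_dark_blue = bool(mask.all())  # is every row 'DARK BLUE'?
```

Neither reduction updates individual rows, which is why they are probably not what the question needs.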

The solution is not to use if at all, since if is not applied separately to each element that equals 'DARK BLUE'. Use the boolean values directly as a mask instead:

rows = (df['COLOR_DESC'] == 'DARK BLUE')
df.loc[rows, 'COLOR_DESC'] = 'BLUE'

You have to use loc to update the original df, because if you index it as df[rows]['COLOR_DESC'], you get a copy of the selected subset. Setting values in that copy does not propagate back to the original, and pandas will usually warn you about it.
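The difference can be checked with a small sketch. The sample data is made up, and the warning filter is there only to keep the demo quiet, since the exact warning pandas emits depends on the version:

```python
import warnings

import pandas as pd

# Hypothetical sample data; only the column name is taken from the question.
df = pd.DataFrame({'COLOR_DESC': ['LIGHT RED', 'DARK BLUE', 'DARK BLUE']})
rows = df['COLOR_DESC'] == 'DARK BLUE'

# Chained indexing: df[rows] builds a new DataFrame, so the assignment
# lands on a temporary object and df itself is left untouched.
with warnings.catch_warnings():
    warnings.simplefilter('ignore')  # silence the chained-assignment warning
    df[rows]['COLOR_DESC'] = 'BLUE'
after_chained = df['COLOR_DESC'].tolist()

# A single .loc call selects rows and column in one step and writes to df.
df.loc[rows, 'COLOR_DESC'] = 'BLUE'
after_loc = df['COLOR_DESC'].tolist()
```

After the chained attempt the column still holds 'DARK BLUE'; only the loc assignment changes it.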

For example:

>>> import pandas as pd
>>> df = pd.DataFrame(data={'COLOR_DESC': ['LIGHT_RED', 'DARK_BLUE', 'MEDIUM_GREEN', 'DARK_BLUE']})
>>> df
     COLOR_DESC
0     LIGHT_RED
1     DARK_BLUE
2  MEDIUM_GREEN
3     DARK_BLUE

>>> rows = (df['COLOR_DESC'] == 'DARK_BLUE')
>>> rows
0    False
1     True
2    False
3     True
Name: COLOR_DESC, dtype: bool

>>> df.loc[rows, 'COLOR_DESC'] = 'BLUE'
>>> df
     COLOR_DESC
0     LIGHT_RED
1          BLUE
2  MEDIUM_GREEN
3          BLUE
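The question writes to a separate NEW_COLOR_DESC column rather than overwriting COLOR_DESC. The same mask works for that too. This is only a sketch with made-up data, and it assumes that rows which do not match should keep their original description, which the question does not actually say:

```python
import pandas as pd

# Hypothetical sample data; the column names follow the question.
df = pd.DataFrame({'COLOR_DESC': ['LIGHT RED', 'DARK BLUE', 'MEDIUM GREEN', 'DARK BLUE']})
rows = df['COLOR_DESC'] == 'DARK BLUE'

# Assumption: non-matching rows keep their original description,
# so start the new column as a copy of the old one.
df['NEW_COLOR_DESC'] = df['COLOR_DESC']

# Overwrite only the masked rows of the new column.
df.loc[rows, 'NEW_COLOR_DESC'] = 'BLUE'
```

If the new column is not initialised first, df.loc[rows, 'NEW_COLOR_DESC'] = 'BLUE' still creates it, but the non-matching rows are then left as missing values.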