sudonym sudonym - 21 days ago 5
Python Question

How to change specific cell values in a pandas dataframe column series based on multiple conditions?

I am trying to replace all values in a pandas DataFrame column, df.column_A, if they fall within the range of 1 to 10.

However, when I do:

df.loc[(1 < df.column_A < 10), "Column_A"] = 1

I get:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Alternatively, when I do:

df.loc[(df.column_A < 10) & (df.column_A > 1), "df.column_A"] = 1

I get no error at all, but the values don't get replaced.

Strangely, when I do:

df.loc[(df.column_A < 10) | (df.column_A > 1), "df.column_A"] = 1

all values in df.column_A get replaced with 1, as I would expect.

This means that the syntax of the line is correct, so the mistake must be due to some factors I don't understand.

What am I doing wrong?

Answer

It is a simple problem. .loc takes a row selector (index labels or a boolean list/Series) followed by a column label. So this will work:

df.loc[(df.column_A < 10) & (df.column_A > 1), "column_A"] = 1

Note that I removed df. from the column label: the second argument must be the column name itself, "column_A", not the string "df.column_A".
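Here is a minimal sketch of the fix on a made-up frame (the sample values are hypothetical; only the column name comes from the question). It also shows Series.between with inclusive="neither" as an equivalent way to write the same strict range test, assuming pandas 1.3 or newer:

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame({"column_A": [0, 1, 5, 9, 10, 42]})

# Combine the two element-wise comparisons with &, each in its own parentheses.
mask = (df.column_A > 1) & (df.column_A < 10)

# Equivalent strict range test (the string form of inclusive needs pandas >= 1.3).
same_mask = df.column_A.between(1, 10, inclusive="neither")

# Row selector first, then the plain column label.
df.loc[mask, "column_A"] = 1
result = df["column_A"].tolist()
```

Only the rows strictly between 1 and 10 (here 5 and 9) are overwritten; 1 and 10 themselves are left alone because both comparisons are strict.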


df.loc[(1 < df.column_A < 10), "Column_A"] = 1

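Will not work because (1 < df.column_A < 10) looks logical, but Python evaluates a chained comparison as (1 < df.column_A) and (df.column_A < 10), and the and has to collapse a whole Series into a single True or False. Since pandas does not know whether you mean any, all, or some other combination, it raises that error. Also note that "Column_A" with a capital C is a different label from "column_A".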
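To see the chained comparison fail in isolation, here is a small sketch on a hypothetical Series (the values are made up for illustration):

```python
import pandas as pd

s = pd.Series([0, 5, 20])  # hypothetical values

try:
    # Python treats this as (1 < s) and (s < 10); the "and" needs one bool.
    1 < s < 10
    raised = False
    message = ""
except ValueError as exc:
    raised = True
    message = str(exc)
```

The error comes from the implicit and, not from .loc, so it appears even without a DataFrame involved.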

df.loc[(df.column_A < 10) | (df.column_A > 1), "df.column_A"] = 1

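Does not do what you want either, because you are not referencing the column correctly. You get no error because assigning through .loc with a label that does not exist creates a new column with that name, here literally "df.column_A". The condition with | is true for every row with a non-missing value, so that new column is filled with 1, which is probably what you saw; column_A itself stays unchanged.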
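A likely reason for the missing error, sketched on hypothetical data: assigning through .loc with a column label that does not exist adds a new column under that name instead of raising.

```python
import pandas as pd

df = pd.DataFrame({"column_A": [0, 5, 20]})  # hypothetical values

# "df.column_A" is not an existing label, so .loc adds a column with that name.
df.loc[(df.column_A < 10) & (df.column_A > 1), "df.column_A"] = 1

columns = df.columns.tolist()
original = df["column_A"].tolist()        # untouched by the assignment
new_values = df["df.column_A"].tolist()   # 1 where the mask holds, NaN elsewhere
```

Checking df.columns after an assignment like this is a quick way to spot a stray column created by a mistyped label.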