fmalaussena - 14 days ago
Python Question

Flag on condition on other columns

I'd like to do the following (in pseudocode):

for each row of my dataframe:
    if the value of the cell "date" is between the values of the cells "begin" and "end", then write 1 in the cell "flag", otherwise 0


I tried the following:

df['flag'] = 1
df['flag'] = df['flag'].apply(lambda x:x if (df['begin'] < df['date'] and df['date'] < df['end']) else 0)
# (I'm coming from R...)


And I get:

The truth value of a Series is ambiguous


I understand what Python is telling me: in the condition, it isn't comparing the contents of the cells in each row, but the whole columns.

How can I get what I want? (The solution doesn't have to follow the same approach; I'm new to Python and here to learn.)

Thanks.

Answer

You want

df['flag'] = ((df['date'] > df['begin']) & (df['date'] < df['end'])).astype(int)

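Assuming that date is a datetime column and your begin and end are date strings, this should work. If the comparison fails on your data, convert all three columns with pd.to_datetime first.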
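Here is a minimal sketch of the one-liner above on a small frame. The sample values are invented for illustration, and all three columns are converted with pd.to_datetime up front (an assumption, so the result does not depend on how strings are compared). Note that both bounds are strict, so a date equal to begin or end gets 0.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration
df = pd.DataFrame({
    "date":  ["2017-01-15", "2017-03-01", "2017-02-10"],
    "begin": ["2017-01-01", "2017-01-01", "2017-02-10"],
    "end":   ["2017-01-31", "2017-02-01", "2017-02-20"],
})

# Convert every column to datetime64 so the comparisons are made on dates
for col in ["date", "begin", "end"]:
    df[col] = pd.to_datetime(df[col])

# Each comparison works on whole columns and yields a boolean Series;
# & combines the two Series element by element
df["flag"] = ((df["date"] > df["begin"]) & (df["date"] < df["end"])).astype(int)

print(df["flag"].tolist())  # [1, 0, 0]
```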

The problem with this:

df['flag'] = df['flag'].apply(lambda x:x if (df['begin'] < df['date'] and df['date'] < df['end']) else 0)

First, the if expression cannot evaluate a boolean array (a whole Series) as a single True or False, hence the error. Second, to combine conditions element-wise you should use the bitwise operators &, | and ~ in place of and, or and not respectively. Finally, because these operators bind more tightly than comparisons, each condition must be enclosed in parentheses ().
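The points above can be reproduced on a toy example. This is only a sketch: the Series names a, lo and hi are made up, and integers are used instead of dates to keep it short.

```python
import pandas as pd

# Hypothetical toy Series, invented for illustration
a = pd.Series([1, 5, 3])
lo = pd.Series([0, 6, 3])
hi = pd.Series([2, 9, 4])

# `and` needs a single truth value for a whole Series, which pandas
# refuses to guess: this raises ValueError ("truth value ... is ambiguous")
try:
    (lo < a) and (a < hi)
    raised = False
except ValueError:
    raised = True

# & works element by element. The parentheses matter because & binds
# more tightly than <, so `lo < a & a < hi` would not mean the same thing.
mask = (lo < a) & (a < hi)

# ~ is element-wise "not", | is element-wise "or"
outside = ~mask
either = (a < lo) | (a > hi)

print(raised, mask.tolist())
```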

So ((df['date'] > df['begin']) & (df['date'] < df['end'])) returns a boolean Series, which you can then cast with astype(int) to convert True to 1 and False to 0.
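For comparison, here is a sketch of two other ways to build the same 0/1 column; neither comes from the answer itself. numpy.where picks a value per element from the boolean mask, and a row-wise apply with axis=1 is closest in spirit to the original attempt, though it is typically slower on large frames. The data is made up, with integers standing in for dates.

```python
import numpy as np
import pandas as pd

# Hypothetical data; integers stand in for dates to keep the sketch short
df = pd.DataFrame({"date": [5, 1, 7], "begin": [1, 1, 8], "end": [9, 4, 9]})

mask = (df["date"] > df["begin"]) & (df["date"] < df["end"])

# Cast the boolean mask: True -> 1, False -> 0
df["flag"] = mask.astype(int)

# Same result with numpy.where(condition, value_if_true, value_if_false)
df["flag_where"] = np.where(mask, 1, 0)

# Row-wise version: with axis=1 the lambda receives one row at a time,
# so ordinary (even chained) Python comparisons are unambiguous here
df["flag_apply"] = df.apply(
    lambda row: 1 if row["begin"] < row["date"] < row["end"] else 0, axis=1
)

print(df)
```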