Tomas Tomas - 4 months ago 12
Python Question

Python - opposite of __contains__

I was wondering whether Python has an opposite of __contains__ (i.e., something like __notcontains__). I need it for the following piece of code:

df_1 = df[(df.id1 != id1_array) | (df.id2.apply(id2_array.__contains__))]
df_2 = df[(df.id1 == id1_array) & (df.id2.apply(id2_array.__notcontains__))]


In other words, in df_1 I want only the observations for which id1 is not in id1_array or id2 is in id2_array, while in df_2 I want only the observations for which id1 is in id1_array and id2 is not in id2_array.

Who can help me out here? Thanks in advance!

Answer

To do this in pure pandas, use isin and invert the resulting boolean Series with the negation operator ~:

df_1 = df[(df.id1 != id1_array) | (df.id2.isin(id2_array))]
df_2 = df[(df.id1 == id1_array) & (~df.id2.isin(id2_array))]

This will be faster than using apply on a larger dataset, since isin is vectorised.
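
For illustration, here is a minimal runnable sketch of the same idea on made-up data. The DataFrame contents and the values in id1_array and id2_array are assumptions chosen only for the example. It also assumes that id1 should be tested for membership (as the question's description says), so it uses isin for id1 as well rather than != and ==. The last line shows a plain-Python "not contains" with apply, which should give the same mask as ~isin but without vectorisation.

```python
import pandas as pd

# Hypothetical sample data; the real df and arrays come from the question.
df = pd.DataFrame({
    "id1": [1, 2, 3, 4],
    "id2": ["a", "b", "c", "d"],
})
id1_array = [1, 2]
id2_array = ["a", "c"]

# Vectorised membership masks (boolean Series aligned with df).
in_id1 = df.id1.isin(id1_array)
in_id2 = df.id2.isin(id2_array)

# id1 not in id1_array OR id2 in id2_array
df_1 = df[~in_id1 | in_id2]

# id1 in id1_array AND id2 not in id2_array
df_2 = df[in_id1 & ~in_id2]

# Non-vectorised equivalent of a "__notcontains__" callable, for comparison.
not_in_id2 = df.id2.apply(lambda x: x not in id2_array)
```

There is no __notcontains__ method because "x not in y" is simply evaluated as the negation of "x in y", so a small lambda (or ~ on a boolean Series) covers that case.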