ldevyataykina - 3 months ago
Python Question

Pandas: condition to string

I have a dataframe and I want to delete rows from it if the url column contains avito and doesn't contain telefony.
I can write the first condition:

df1 = df[~df.url.str.contains(r"avito")]


but I don't know how to add the condition for telefony.

data:

url
avito.ru/mytischi/telefony/sim_karty_s_nulevym_balansom_bonus_689217820
avito.ru/moskva/blackberry_z10_714072090
avito.ru/moskva/telefony/blackberry_bold_new_rost-test_original_697592392
avito.ru/moskva/telefony/blackberry_bold_blask_new_e._a._c._rost-test_696289049
avito.ru/moskva/blackberry_z30_lte_4g_714023258
vk.com


Desired output:

url
avito.ru/mytischi/telefony/sim_karty_s_nulevym_balansom_bonus_689217820
avito.ru/moskva/telefony/blackberry_bold_new_rost-test_original_697592392
avito.ru/moskva/telefony/blackberry_bold_blask_new_e._a._c._rost-test_696289049
vk.com

Answer

You want to combine your boolean conditions and negate the result:

In [18]:
df[~(df['url'].str.contains('avito') & ~df['url'].str.contains('telefony'))]

Out[18]:
                                                 url
0  avito.ru/mytischi/telefony/sim_karty_s_nulevym...
2  avito.ru/moskva/telefony/blackberry_bold_new_r...
3  avito.ru/moskva/telefony/blackberry_bold_blask...
5                                             vk.com

Consider the inner condition:

df['url'].str.contains('avito') & ~df['url'].str.contains('telefony')

Here we are looking for URLs that contain 'avito' and don't contain 'telefony':

In [19]:
df['url'].str.contains('avito') & ~df['url'].str.contains('telefony')

Out[19]:
0    False
1     True
2    False
3    False
4     True
5    False
Name: url, dtype: bool

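We then invert the above by enclosing it in parentheses and applying ~, as in the first code snippet. The parentheses are needed because ~ binds more tightly than &, so without them only the first condition would be negated.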
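
For reference, here is a self-contained sketch of the same filter. It rebuilds the sample data from the question, and the variable names (has_avito, has_telefony, negated, keep) are just illustrative. It also shows an equivalent form obtained with De Morgan's law, which some may find easier to read: instead of negating "avito and not telefony", keep rows that are "not avito or telefony". Passing regex=False is an optional choice made here because both patterns are plain substrings.

```python
import pandas as pd

# The sample URLs from the question, rebuilt so the snippet runs on its own.
df = pd.DataFrame({"url": [
    "avito.ru/mytischi/telefony/sim_karty_s_nulevym_balansom_bonus_689217820",
    "avito.ru/moskva/blackberry_z10_714072090",
    "avito.ru/moskva/telefony/blackberry_bold_new_rost-test_original_697592392",
    "avito.ru/moskva/telefony/blackberry_bold_blask_new_e._a._c._rost-test_696289049",
    "avito.ru/moskva/blackberry_z30_lte_4g_714023258",
    "vk.com",
]})

# Plain substring checks; regex=False skips regex interpretation.
has_avito = df["url"].str.contains("avito", regex=False)
has_telefony = df["url"].str.contains("telefony", regex=False)

# Form used in the answer: drop rows that have 'avito' but not 'telefony'.
negated = df[~(has_avito & ~has_telefony)]

# De Morgan equivalent: keep rows without 'avito' or with 'telefony'.
keep = df[~has_avito | has_telefony]

print(keep.index.tolist())  # [0, 2, 3, 5]
```

Both forms select the same rows. If the column could contain missing values, str.contains also accepts na=False so that those rows are treated as non-matches.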