jovicbg jovicbg - 2 years ago 80
Python Question

Check whether two columns in the same row both contain a string (Python pandas)

I have a dataframe looking like this:

id k1 k2 same
1 re_setup oo_setup true
2 oo_setup oo_setup true
3 alerting bounce false
4 bounce re_oversetup false
5 re_oversetup alerting false
6 alerting_s re_setup false
7 re_oversetup oo_setup true
8 alerting bounce false


So I need to classify each row by whether the string 'setup' is contained in both k1 and k2.

A simple output would be:
id k1 k2 same
1 re_setup oo_setup true
2 oo_setup oo_setup true
3 alerting bounce false
4 bounce re_setup false
5 re_setup alerting false
6 alerting_s re_setup false
7 re_setup oo_setup true
8 alerting bounce false


I've tried something like this, but as I expected, I get an error when selecting multiple columns.

data['same'] = data[data['k1', 'k2'].str.contains('setup')==True]

Answer Source

I think you need apply with str.contains, because str.contains works only on a Series (one column):

print (data[['k1', 'k2']].apply(lambda x: x.str.contains('setup')))
      k1     k2
0   True   True
1   True   True
2  False  False
3  False   True
4   True  False
5  False   True
6   True   True
7  False  False
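
To make the snippets in this answer reproducible, here is a minimal sketch that rebuilds the question's input as a DataFrame and computes the per-column mask shown above. The constructor call and the `mask` name are my own additions, and the pre-filled `same` column from the question is left out on purpose, since it is what the answer computes.

```python
import pandas as pd

# Sample data copied from the question's first table (without the 'same' column)
data = pd.DataFrame({
    'id': [1, 2, 3, 4, 5, 6, 7, 8],
    'k1': ['re_setup', 'oo_setup', 'alerting', 'bounce',
           're_oversetup', 'alerting_s', 're_oversetup', 'alerting'],
    'k2': ['oo_setup', 'oo_setup', 'bounce', 're_oversetup',
           'alerting', 're_setup', 'oo_setup', 'bounce'],
})

# apply hands each column to the lambda as a Series, so .str is available
mask = data[['k1', 'k2']].apply(lambda col: col.str.contains('setup'))
print(mask)
```

Calling `data[['k1', 'k2']].str` directly fails because the `.str` accessor is defined on a Series, not on a DataFrame, which is why the column-wise `apply` is needed.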

Then add DataFrame.all to check whether all values are True per row:

data['same'] = data[['k1', 'k2']].apply(lambda x: x.str.contains('setup')).all(axis=1)
print (data)
   id          k1        k2   same
0   1    re_setup  oo_setup   True
1   2    oo_setup  oo_setup   True
2   3    alerting    bounce  False
3   4      bounce  re_setup  False
4   5    re_setup  alerting  False
5   6  alerting_s  re_setup  False
6   7    re_setup  oo_setup   True
7   8    alerting    bounce  False

Or use DataFrame.any to check for at least one True per row:

data['same'] = data[['k1', 'k2']].apply(lambda x: x.str.contains('setup')).any(axis=1)
print (data)
   id          k1        k2   same
0   1    re_setup  oo_setup   True
1   2    oo_setup  oo_setup   True
2   3    alerting    bounce  False
3   4      bounce  re_setup   True
4   5    re_setup  alerting   True
5   6  alerting_s  re_setup   True
6   7    re_setup  oo_setup   True
7   8    alerting    bounce  False

Another solution uses applymap for an element-wise check:

data['same'] = data[['k1', 'k2']].applymap(lambda x: 'setup' in x).all(axis=1)
print (data)
   id          k1        k2   same
0   1    re_setup  oo_setup   True
1   2    oo_setup  oo_setup   True
2   3    alerting    bounce  False
3   4      bounce  re_setup  False
4   5    re_setup  alerting  False
5   6  alerting_s  re_setup  False
6   7    re_setup  oo_setup   True
7   8    alerting    bounce  False
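
One caveat on the element-wise approach: since pandas 2.1, `applymap` is deprecated in favor of `DataFrame.map`. The following is a sketch of one way to write the same check so it runs on both older and newer versions; the `subset` and `elementwise` names and the three-row frame are only for illustration.

```python
import pandas as pd

# Small illustrative frame using values from the question
data = pd.DataFrame({
    'k1': ['re_setup', 'alerting', 'bounce'],
    'k2': ['oo_setup', 'bounce', 're_setup'],
})

subset = data[['k1', 'k2']]
# DataFrame.map exists from pandas 2.1 on; fall back to applymap otherwise
elementwise = subset.map if hasattr(subset, 'map') else subset.applymap

# 'setup' in x is a plain Python substring test on each cell
data['same'] = elementwise(lambda x: 'setup' in x).all(axis=1)
print(data)
```

Note that `'setup' in x` raises a TypeError if a cell is not a string (for example NaN), so this variant assumes both columns contain only strings.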

If there are only 2 columns, simply chain the conditions with & (which behaves like all) or | (which behaves like any):

data['same'] = data['k1'].str.contains('setup') & data['k2'].str.contains('setup')
print (data)
   id          k1        k2   same
0   1    re_setup  oo_setup   True
1   2    oo_setup  oo_setup   True
2   3    alerting    bounce  False
3   4      bounce  re_setup  False
4   5    re_setup  alerting  False
5   6  alerting_s  re_setup  False
6   7    re_setup  oo_setup   True
7   8    alerting    bounce  False
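
For completeness, here is a sketch of both chained variants side by side. The `both` and `either` column names and the row with a missing value are hypothetical and not part of the question; they are there to show that `na=False` makes missing values count as "no match", and `regex=False` requests a plain substring search rather than a regular expression.

```python
import pandas as pd

# Illustrative frame; the last row with None is a made-up edge case
data = pd.DataFrame({
    'k1': ['re_setup', 'bounce', 'alerting', None],
    'k2': ['oo_setup', 're_setup', 'bounce', 'oo_setup'],
})

# na=False turns missing values into False instead of propagating NaN
in_k1 = data['k1'].str.contains('setup', na=False, regex=False)
in_k2 = data['k2'].str.contains('setup', na=False, regex=False)

data['both'] = in_k1 & in_k2     # same result as .all(axis=1)
data['either'] = in_k1 | in_k2   # same result as .any(axis=1)
print(data)
```

Without `na=False`, a missing value would yield NaN in the mask, which may not be what you want when combining masks with & or |.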