Brian Postow Brian Postow - 6 months ago 15
Python Question

How do I select from a column that might be boolean or the string '1'?

I've got a pandas dataframe that I grab from a database, so I don't have much control over the format. For reasons that defy explanation, in some cases, the query returns True/False, in others, it returns 1/NULL. This goes into a csv, which is then later read into the dataframe.

If the column is True/False, they're read correctly, and I can say:

DF = DF[DF['column']]


to only get the rows where the column has True. It's great.

However, when the data is 1/Null, the 1 is a string, so everything gets messed up.

I'd LIKE to do:

DF = DF[ DF['column'] or DF['column'] == '1']


However, that doesn't work.

ALternatively, if there was a way to easily check if I have '1'/'' and if so, convert the column into True/False, that would also work.

Answer

It is not easy, because number 1 is equal True. So I first convert values to string and then check string 1 by str.contains:

DF = pd.DataFrame({'column':[True, True, 1,  1, 1]})
print DF

  column
0   True
1   True
2      1
3      1
4      1

print DF[DF['column'].astype(str).str.contains('1')]

  column
2      1
3      1
4      1

Or use old Ayhan solution:

print  DF[DF['column'].astype(str) == '1' ]

  column
2      1
3      1
4      1

EDIT:

If you need check True or number 1:

DF = pd.DataFrame({'column':[True, True, 1,  1, 1, False, 2]})
print DF
  column
0   True
1   True
2      1
3      1
4      1
5  False
6      2

print DF[DF['column'] == 1]
  column
0   True
1   True
2      1
3      1
4      1

print DF[DF['column'] == True]
  column
0   True
1   True
2      1
3      1
4      1

It works, because True is equal 1 in comparing.

Comments