ktorquem - 4 months ago
Python Question

Iterating over columns and reassigning values - Pandas/Python

I'm trying to use a for loop to iterate over the columns of a DataFrame and change each 'Yes' and 'No' to 1 and 0.

The pandas DataFrame has multiple columns, one of them being "Combined". For some reason, I get an invalid type comparison error when attempting this:

for col,row in d.iteritems():
    d.loc[d[col] == 'No', col] = 0
    d.loc[d[col] == 'Yes', col] = 1


TypeError: invalid type comparison

For comparison, I can successfully perform this on a single column without issues:

d.loc[d['Combined'] == 'No', 'Combined'] = 0
d.loc[d['Combined'] == 'Yes', 'Combined'] = 1


Is there any reason why plugging the value of col into loc in place of the actual column name throws an error? Does it need to be converted to a string first?

Answer

Some columns must hold integer values, and for those columns the comparison against a string is what raises the "invalid type comparison" error. Check whether the column's values are instances of str before comparing, and you are good to go.

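for col, values in d.items():  # iteritems() was removed in pandas 2.0
    # Only compare against 'No'/'Yes' when the column holds strings
    # (checked on the first value only)
    if isinstance(values.iloc[0], str):
        d.loc[d[col] == 'No', col] = 0
        d.loc[d[col] == 'Yes', col] = 1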

For the same reason, the single-column version

d.loc[d['Combined'] == 'No', 'Combined'] = 0

works without error, because "Combined" is already a column of string values.
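
As a rough sketch of the same idea, the type check can be made on the column's dtype rather than on its first value. Everything here except the "Combined" column name is an assumption made for illustration: the sample data, the other column names, and the `mapping` dict are hypothetical, and the original DataFrame may look quite different.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical sample data; only the "Combined" column name comes from the question.
d = pd.DataFrame({
    "Combined": ["Yes", "No", "Yes"],
    "Other": ["No", "No", "Yes"],
    "Count": [3, 0, 7],
})

# Assumed mapping from the string labels to integers.
mapping = {"Yes": 1, "No": 0}

for col in d.columns:
    if is_numeric_dtype(d[col]):
        continue  # numeric columns are never compared against strings
    # Replace 'Yes'/'No' and leave any other value as it is.
    d[col] = d[col].map(lambda v: mapping.get(v, v))

print(d)
```

Skipping numeric columns up front avoids the string comparison entirely, and `mapping.get(v, v)` leaves any value other than 'Yes' or 'No' unchanged.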
