mati mati - 6 months ago 37
Python Question

pandas element-wise comparison and create selection

In a DataFrame, I would like to compare the elements of a column with a value and put the elements that pass the comparison into a new column.

import pandas
df = pandas.DataFrame([{'A':3,'B':10},{'A':2, 'B':30},{'A':1,'B':20},{'A':2,'B':15},{'A':2,'B':100}])

df['C'] = [x for x in df['B'] if x > 18]


I can't figure out what's wrong or why I get 'ValueError: Length of values does not match length of index'.

Answer

All columns in a DataFrame have to be the same length. Because you are filtering away some values, you are trying to insert fewer values into column C than are in columns A and B.
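
To see the mismatch concretely, here is a small sketch using the data from the question. The names `passing`, `n_rows` and `n_passing` are only for illustration:

```python
import pandas

df = pandas.DataFrame([{'A': 3, 'B': 10}, {'A': 2, 'B': 30}, {'A': 1, 'B': 20},
                       {'A': 2, 'B': 15}, {'A': 2, 'B': 100}])

# The list comprehension drops every value that fails the test,
# so the result is shorter than the DataFrame's index.
passing = [x for x in df['B'] if x > 18]

n_rows = len(df)          # number of rows the new column would need
n_passing = len(passing)  # number of values actually supplied

print(n_rows, n_passing)
```

Assigning `passing` to `df['C']` fails because the two lengths differ.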

So, your two options are to start a new DataFrame for C:

dfC = pandas.DataFrame({'C': [x for x in df['B'] if x > 18]})
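
If you would rather keep the passing rows together with their other columns, boolean indexing is one way to do it. This is a minimal sketch on the question's data, and the name `selected` is just illustrative:

```python
import pandas

df = pandas.DataFrame([{'A': 3, 'B': 10}, {'A': 2, 'B': 30}, {'A': 1, 'B': 20},
                       {'A': 2, 'B': 15}, {'A': 2, 'B': 100}])

# df['B'] > 18 is a boolean Series with one entry per row;
# indexing with it keeps only the rows where it is True.
selected = df[df['B'] > 18]

print(selected)
```

The result keeps the original row labels, so you can still tell which rows of `df` the values came from.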

or put some dummy value in the column for the rows where x is not greater than 18. E.g.:

import numpy as np
df['C'] = np.where(df['B'] > 18, True, False)

Or even:

df['C'] = np.where(df['B'] > 18, 'Yay', 'Nay')
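
If what you want in C is the passing values themselves rather than a flag, one more option (my addition, not something the question asked for explicitly) is `Series.where`, which keeps the values where the condition holds and fills the rest with NaN. A minimal sketch on the question's data:

```python
import pandas

df = pandas.DataFrame([{'A': 3, 'B': 10}, {'A': 2, 'B': 30}, {'A': 1, 'B': 20},
                       {'A': 2, 'B': 15}, {'A': 2, 'B': 100}])

# Keep B where the condition is True; rows that fail get NaN,
# so C has the same length as the other columns.
df['C'] = df['B'].where(df['B'] > 18)

print(df)
```

Note that C ends up as a float column, because NaN cannot be stored in a plain integer column.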

P.S. Also take a look at: Pandas conditional creation of a series/dataframe column for other ways to do this.
