mati - 1 year ago 82

Python Question

In a dataframe I would like to compare the elements of a column with a value and sort the elements which pass the comparison into a new column.

`df = pandas.DataFrame([{'A':3,'B':10},{'A':2, 'B':30},{'A':1,'B':20},{'A':2,'B':15},{'A':2,'B':100}])`

`df['C'] = [x for x in df['B'] if x > 18]`

I can't figure out what's wrong and why I get

`'ValueError: Length of values does not match length of index'`

Answer

All columns in a `DataFrame` have to be the same length. Because you are filtering away some values, you are trying to insert fewer values into column C than are in columns A and B.
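To make the mismatch concrete, here is a small sketch using the data from the question. The names `filtered` and `error_message` are only for illustration, and the exact wording of the error text may differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame([{'A': 3, 'B': 10}, {'A': 2, 'B': 30}, {'A': 1, 'B': 20},
                   {'A': 2, 'B': 15}, {'A': 2, 'B': 100}])

# The list comprehension keeps only the values that pass the test,
# so it ends up shorter than the frame it is assigned to.
filtered = [x for x in df['B'] if x > 18]
print(len(df), len(filtered))  # number of rows vs. number of surviving values

error_message = None
try:
    df['C'] = filtered  # lengths differ, so pandas refuses the assignment
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```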

So, your two options are to keep the filtered values in a separate object for `C` (a plain list here, rather than a column of `df`):

```
dfC = [x for x in df['B'] if x > 18]
```
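If you want that separate object to be an actual DataFrame rather than a list, one possible approach is boolean indexing. This is only a sketch: the `mask` variable and the choice to rename the column to `C` are my assumptions, not something the question asked for.

```python
import pandas as pd

df = pd.DataFrame([{'A': 3, 'B': 10}, {'A': 2, 'B': 30}, {'A': 1, 'B': 20},
                   {'A': 2, 'B': 15}, {'A': 2, 'B': 100}])

# Boolean Series: True for rows where B is greater than 18
mask = df['B'] > 18

# Select the matching rows of column B as a one-column DataFrame
# and rename that column to C. The original row labels are kept.
dfC = df.loc[mask, ['B']].rename(columns={'B': 'C'})
print(dfC)
```

Because `dfC` keeps the original row labels, you can still tell which rows of `df` each value came from.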

Or put some dummy value in the column for the rows where `B` is not greater than 18. E.g.:

```
import numpy as np
df['C'] = np.where(df['B'] > 18, True, False)
```

Or even:

```
df['C'] = np.where(df['B'] > 18, 'Yay', 'Nay')
```
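If you would rather keep the passing values themselves and leave the other rows empty, `Series.where` is one way to do it. A minimal sketch, assuming `NaN` is an acceptable dummy value for your use case:

```python
import pandas as pd

df = pd.DataFrame([{'A': 3, 'B': 10}, {'A': 2, 'B': 30}, {'A': 1, 'B': 20},
                   {'A': 2, 'B': 15}, {'A': 2, 'B': 100}])

# Keep B where the condition holds; rows that fail become NaN.
# The result has the same length as df, so the assignment works.
df['C'] = df['B'].where(df['B'] > 18)
print(df)
```

Note that the column becomes floating point, because `NaN` cannot be stored in a plain integer column.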

P.S. Also take a look at "Pandas conditional creation of a series/dataframe column" for other ways to do this.