user2242044 - 8 days ago
Python Question

Add a static list to a new Pandas Dataframe column via apply

I have a complex function that generates a list for each row in a Pandas dataframe. I'd like to make that list the value for each row in a new column called mylist.

The ability of Pandas to do this seems dependent on the number of columns in the starting dataframe.

import pandas as pd

df = pd.DataFrame(data=[['A', 'D'],
                        ['B', 'E'],
                        ['C', 'F']],
                  columns=['col1', 'col2'])

df1 = pd.DataFrame(data=[['A', 'D', 'G'],
                         ['B', 'E', 'H'],
                         ['C', 'F', 'I']],
                   columns=['col1', 'col2', 'col3'])

def add_list(row):
    return [1, 3, 3]

df['mylist'] = df.apply(add_list, axis=1)
print(df)


yields:

  col1 col2     mylist
0    A    D  [1, 3, 3]
1    B    E  [1, 3, 3]
2    C    F  [1, 3, 3]


The additional code below raises ValueError: Wrong number of items passed 3, placement implies 1. Why should the number of columns in the starting dataframe have an impact?

df1['mylist'] = df1.apply(add_list, axis=1)
print(df1)


If I change the function to the below (adding one element), then there is no error:

def add_list(row):
    return [1, 3, 3, 4]


expected output:

  col1 col2 col3     mylist
0    A    D    G  [1, 3, 3]
1    B    E    H  [1, 3, 3]
2    C    F    I  [1, 3, 3]

Answer

This is bizarre behaviour. A solution seems to be to return a tuple instead of a list.

def add_list(row):
    return (1, 3, 3)

df1['mylist'] = df1.apply(add_list, axis=1).apply(list)

In the last line you'll notice the tuples are converted to lists once they are in the dataframe.
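
A likely explanation, going by the pandas 0.23 release notes: older versions of apply with axis=1 tried to infer the result shape. When each returned list happened to have as many items as the frame had columns, the result came back as a DataFrame rather than a Series of lists, and a three-column DataFrame cannot be assigned to a single column. The sketch below is not from the original answer. It assumes pandas 0.23 or later, where the result_type argument exists, and uses result_type='expand' only to imitate that expanded shape. It then shows one alternative that avoids apply's shape inference entirely by building the column with a list comprehension.

```python
import pandas as pd

df1 = pd.DataFrame(data=[['A', 'D', 'G'],
                         ['B', 'E', 'H'],
                         ['C', 'F', 'I']],
                   columns=['col1', 'col2', 'col3'])

def add_list(row):
    # Stand-in for the complex per-row function from the question.
    return [1, 3, 3]

# Assumption: pandas >= 0.23, where result_type is available.
# 'expand' spreads each returned list across columns, so the result
# is a DataFrame with one column per list item, not a Series of lists.
expanded = df1.apply(add_list, axis=1, result_type='expand')
print(expanded.shape)

# Alternative: call the function row by row and wrap the results in a
# Series that shares df1's index, so each cell holds one whole list.
lists = [add_list(row) for _, row in df1.iterrows()]
df1['mylist'] = pd.Series(lists, index=df1.index)
print(df1)
```

With the list comprehension there is no need to return a tuple and convert it back, at the cost of iterating over the rows in Python, which apply with axis=1 does as well.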