hvedrung - 2 months ago
Python Question

Pack DataFrame columns into a list column in pandas

I need to pack several pandas DataFrame columns into one column containing lists. Example:

For

>>> df
    a   b   c
0  81  88   1
1  42   7  23
2   8  37  63
3  18  22  20


I want to make a list column:

     list_col
0   [81,88,1]
1   [42,7,23]
2   [8,37,63]
3  [18,22,20]


If I try

df.apply(list,axis=1)

Python returns the same DataFrame.

If I try

>>> df.apply(lambda r:{'list_col':list(r)},axis=1)
    a   b   c
0 NaN NaN NaN
1 NaN NaN NaN
2 NaN NaN NaN
3 NaN NaN NaN

it does not work either.

Even the brute-force method

>>> df['list_col'] = ''
>>> for i in df.index:
...     df.ix[i,'list_col'] = list(df.ix[i,df.columns[:-1]])

raises an error:

Traceback (most recent call last):
  File "<pyshell#45>", line 2, in <module>
    df.ix[i,'list_col'] = list(df.ix[i,df.columns[:-1]])
  File "C:\Anaconda\lib\site-packages\pandas\core\indexing.py", line 88, in __setitem__
    self._setitem_with_indexer(indexer, value)
  File "C:\Anaconda\lib\site-packages\pandas\core\indexing.py", line 158, in _setitem_with_indexer
    len(self.obj[labels[0]]) == len(value) or len(plane_indexer[0]) == len(value)):
TypeError: object of type 'int' has no len()


The only working method I found is:

df['list_col'] = df.apply(lambda r:{df.columns[0]:list(r)}, axis=1)[df.columns[0]]


This gives me what I want, but is there a more straightforward way?

Answer

Assigning list(df.values) to the new column will do:

df['list_col'] = list(df.values)

df
    a   b   c      list_col
0  81  88   1   [81, 88, 1]
1  42   7  23   [42, 7, 23]
2   8  37  63   [8, 37, 63]
3  18  22  20  [18, 22, 20]
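
One caveat about the assignment above: list(df.values) yields one NumPy array per row, so each cell holds an ndarray rather than a plain Python list, even though it prints the same way. If real lists are needed, a minimal sketch along these lines may help. It assumes the three columns a, b and c from the question; the names source_cols, rows_as_arrays and rows_as_lists are made up for illustration. Selecting the source columns explicitly also keeps list_col itself out of the result if the line is run a second time.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({"a": [81, 42, 8, 18],
                   "b": [88, 7, 37, 22],
                   "c": [1, 23, 63, 20]})

# Hypothetical helper name: the columns to pack into the list column.
source_cols = ["a", "b", "c"]

# list(...) over the 2-D array gives one ndarray per row.
rows_as_arrays = list(df[source_cols].values)

# .tolist() converts every row to a plain Python list of Python ints.
rows_as_lists = df[source_cols].values.tolist()

df["list_col"] = rows_as_lists
print(df)
```

Note that .values on columns of mixed dtypes upcasts them to a common type first, so this sketch is best suited to columns that share a dtype, as in the example.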