clg4 clg4 - 4 months ago 66
Python Question

Pandas - Unpack column of lists of varying lengths of tuples

I would like to take a Pandas Dataframe named

df
which has an ID column and a lists column of lists that have variable number of tuples, all the tuples have the same length. Looks like this:

ID list
1 [(0,1,2,3),(1,2,3,4),(2,3,4,NaN)]
2 [(Nan,1,2,3),(9,2,3,4)]
3 [(Nan,1,2,3),(9,2,3,4),(A,b,9,c),($,*,k,0)]


And I would like to unpack each list into columns 'A','B','C','D' representing the fixed positions in each tuple.

The result should look like:

ID A B C D
1 0 1 2 3
1 1 2 3 4
1 2 3 4 NaN
2 NaN 1 2 3
2 9 2 3 4
3 NaN 1 2 3
3 9 2 3 4
3 A b 9 c
3 $ * k 0


I have tried
df.apply(pd.Series(list)
but fails as the
len
of the list elements is different on different rows. Somehow need to unpack to columns and transpose by ID?

Answer
In [38]: (df.groupby('ID')['list']
            .apply(lambda x: pd.DataFrame(x.iloc[0], columns=['A', 'B', 'C', 'D']))
            .reset_index())
Out[38]: 
   ID  level_1    A  B  C    D
0   1        0    0  1  2    3
1   1        1    1  2  3    4
2   1        2    2  3  4  NaN
3   2        0  NaN  1  2    3
4   2        1    9  2  3    4
5   3        0  NaN  1  2    3
6   3        1    9  2  3    4
7   3        2    A  b  9    c
8   3        3    $  *  k    0