user3191569 - 3 months ago
Python Question

split pandas column with tuple

I have a dictionary of the form:

data = {'A':[(1,2),(3,4),(5,6),(7,8),(8,9)],
        'B':[(3,4),(4,5),(5,6),(6,7)],
        'C':[(10,11),(12,13)]}


I create a dataFrame by:

df = pd.DataFrame({k: pd.Series(v) for k, v in data.items()})


which in turn becomes:

A      B      C
(1,2)  (3,4)  (10,11)
(3,4)  (4,5)  (12,13)
(5,6)  (5,6)  NaN
(7,8)  (6,7)  NaN
(8,9)  NaN    NaN


Is there a way to go from the dataframe above to the one below:

A          B          C
one  two   one  two   one  two
1    2     3    4     10   11
3    4     4    5     12   13
5    6     5    6     NaN  NaN
7    8     6    7     NaN  NaN
8    9     NaN  NaN   NaN  NaN

Answer

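You can use a list comprehension with the DataFrame constructor, converting each column to a list of tuples with values + tolist, and then join the pieces with concat. This works when every cell holds a tuple (no missing values); the output shown is from a small sample frame of that kind: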

cols = ['A','B','C']
L = [pd.DataFrame(df[x].values.tolist(), columns=['one','two']) for x in cols]
df = pd.concat(L, axis=1, keys=cols)
print(df)

   A       B       C    
  one two one two one two
0   1   2   3   4   5   6
1   7   8   9  10  11  12
2  13  14  15  16  17  18
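
If the columns contain NaN, as in the question, one way to keep the list-comprehension approach is to drop the missing rows per column and let concat realign them on the index. This is only a sketch under that assumption; the names `parts` and `result` are made up for the example.

```python
import pandas as pd

data = {'A': [(1, 2), (3, 4), (5, 6), (7, 8), (8, 9)],
        'B': [(3, 4), (4, 5), (5, 6), (6, 7)],
        'C': [(10, 11), (12, 13)]}

# Build the frame of tuples; shorter columns are padded with NaN.
df = pd.DataFrame({k: pd.Series(v) for k, v in data.items()})

cols = ['A', 'B', 'C']
parts = []  # hypothetical name: one two-column frame per original column
for x in cols:
    s = df[x].dropna()  # keep only the rows that actually hold a tuple
    # index=s.index keeps the original row labels so concat can realign.
    parts.append(pd.DataFrame(s.tolist(), index=s.index, columns=['one', 'two']))

# concat aligns on the row index, so the dropped rows come back as NaN.
result = pd.concat(parts, axis=1, keys=cols)
print(result)
```

The rows that were dropped for B and C reappear as NaN after the concat, so those columns end up as floats while A stays integer.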

EDIT:

A similar solution uses a dict comprehension. The integer values in columns B and C are converted to floats, because those columns contain NaN and NaN is itself a float.

data = {'A':[(1,2),(3,4),(5,6),(7,8),(8,9)],
        'B':[(3,4),(4,5),(5,6),(6,7)],
        'C':[(10,11),(12,13)]}

cols = ['A','B','C']
d = {k: pd.DataFrame(v, columns=['one','two']) for k,v in data.items()}
df = pd.concat(d, axis=1)
print(df)

    A        B          C      
  one two  one  two   one   two
0   1   2  3.0  4.0  10.0  11.0
1   3   4  4.0  5.0  12.0  13.0
2   5   6  5.0  6.0   NaN   NaN
3   7   8  6.0  7.0   NaN   NaN
4   8   9  NaN  NaN   NaN   NaN
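
If the float conversion is unwanted, a possible follow-up is to cast to the nullable integer dtype and then pick sub-columns from the MultiIndex. This is a sketch assuming pandas 1.0 or later; `df_int`, `b_one` and `ones` are names made up for the example.

```python
import pandas as pd

data = {'A': [(1, 2), (3, 4), (5, 6), (7, 8), (8, 9)],
        'B': [(3, 4), (4, 5), (5, 6), (6, 7)],
        'C': [(10, 11), (12, 13)]}

d = {k: pd.DataFrame(v, columns=['one', 'two']) for k, v in data.items()}
df = pd.concat(d, axis=1)

# 'Int64' (capital I) is the nullable integer dtype: missing values are
# stored as <NA>, so the whole numbers no longer need to become floats.
df_int = df.astype('Int64')

# A single sub-column is selected with a (top level, sub level) tuple.
b_one = df_int[('B', 'one')]

# xs pulls the same sub-column out of every top-level group.
ones = df_int.xs('one', axis=1, level=1)

print(df_int)
print(ones)
```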