Mike Mike - 2 months ago 13
Python Question

Splitting stacked dataframe in pandas

I have a dataframe like

age sex values
time
2015 10 F 589628.0
2015 10 M 458390.0
2015 11 F 108018.0
2015 11 M 764350.0
....
2000 60 M 34676.0
2000 60 F 45488.0


I would like to create data frame like

age F M
time
2015 10 589628.0 458390.0
2015 11
....
2000 60 45488.0 34676.0


reducing the rows by half and adding a column. I have tried to do this with pivot, but no avail.

df.pivot(columns='sex', values='values')


but this returns

Index contains duplicate entries, cannot reshape


Any ideas how can I cleanly split the dataframe without writing a tedious function to do it?

Cheers, Mike

Answer

Using pivot_table and reset_index:

In [17]: df
Out[17]: 
      age sex    values
time                   
2015   10   F  589628.0
2015   10   M  458390.0
2015   11   F  108018.0
2015   11   M  764350.0

In [18]: newdf = df.reset_index().pivot_table('values', ['time', 'age'], 'sex').reset_index()

In [19]: newdf.columns.name = None

In [20]: newdf = newdf.set_index(['time'])

In [21]: newdf
Out[21]: 
      age         F         M
time                         
2015   10  589628.0  458390.0
2015   11  108018.0  764350.0
Comments