fhgd fhgd - 9 days ago 6
Python Question

Flatten a one-to-one mapping in a multiindex pandas dataframe

I have the following data structure:

from collections import OrderedDict
import pandas as pd

d = OrderedDict([
((5, 3, 1), {'y1': 1}),
((5, 3, 2), {'y2': 2}),
((5, 4, 1), {'y1': 10}),
((5, 4, 2), {'y2': 20}),

((6, 3, 1), {'y1': 100}),
((6, 3, 2), {'y2': 200}),
((6, 4, 1), {'y1': 1000}),
((6, 4, 2), {'y2': 2000}),
])

df = pd.DataFrame(
d.values(),
index=pd.MultiIndex.from_tuples(d.keys(), names=['x3', 'x2', 'x1']),
)


The table looks like

y1 y2
x3 x2 x1
5 3 1 1 NaN
2 NaN 2
4 1 10 NaN
2 NaN 20
6 3 1 100 NaN
2 NaN 200
4 1 1000 NaN
2 NaN 2000


As you can see there is a one-to-one mapping between x1 and the columns (x1=1: y1, x1=2: y2) which I want to flatten into

y1 y2
x3 x2
5 3 1 2
4 10 20
6 3 100 200
4 1000 2000


How can I do it?

Answer

You can use stack for remove NaN, because create Series, remove third level by reset_index and last reshape by unstack:

print (df.stack().reset_index(level=2,drop=True).unstack(2))
           y1      y2
x3 x2                
5  3      1.0     2.0
   4     10.0    20.0
6  3    100.0   200.0
   4   1000.0  2000.0
Comments