kjo - 1 year ago 40

# How to add rows for all missing values of one multi-index's level?

Suppose that I have the following dataframe `df`, indexed by a 3-level multi-index:

``````
In [52]: df
Out[52]:
          C
L0 L1 L2
0  w  P   1
   y  P   2
      R   3
1  x  Q   4
      R   5
   z  S   6
``````

Code to create the DataFrame:

``````
import pandas as pd

# Note: the `codes` argument was named `labels` before pandas 0.24.
idx = pd.MultiIndex(levels=[[0, 1], ['w', 'x', 'y', 'z'], ['P', 'Q', 'R', 'S']],
                    codes=[[0, 0, 0, 1, 1, 1], [0, 2, 2, 1, 1, 3], [0, 0, 2, 1, 2, 3]],
                    names=['L0', 'L1', 'L2'])

df = pd.DataFrame({'C': [1, 2, 3, 4, 5, 6]}, index=idx)
``````

The possible values for the `L2` level are `'P'`, `'Q'`, `'R'`, and `'S'`, but some of these values are missing for particular combinations of values for the remaining levels. For example, the combination `(L0=0, L1='w', L2='Q')` is not present in `df`.
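To make the gap concrete, here is a small sketch that rebuilds the same frame from explicit tuples and checks which `L2` values exist for each `(L0, L1)` pair. The names `has_w_q`, `has_w_p`, and `present` are only illustrative and are not part of the original code.

```python
import pandas as pd

# Same data as df above, built from explicit tuples for readability.
idx = pd.MultiIndex.from_tuples(
    [(0, 'w', 'P'), (0, 'y', 'P'), (0, 'y', 'R'),
     (1, 'x', 'Q'), (1, 'x', 'R'), (1, 'z', 'S')],
    names=['L0', 'L1', 'L2'])
df = pd.DataFrame({'C': [1, 2, 3, 4, 5, 6]}, index=idx)

# Membership test against the MultiIndex.
has_w_q = (0, 'w', 'Q') in df.index   # False: this combination has no row
has_w_p = (0, 'w', 'P') in df.index   # True: this one does

# Collect the L2 values that are present for each (L0, L1) pair.
present = {}
for l0, l1, l2 in df.index:
    present.setdefault((l0, l1), []).append(l2)

print(has_w_q, has_w_p)
print(present)
```

Every pair is missing at least two of the four `L2` values, which is why the expanded frame grows from 6 rows to 16.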

I would like to add enough rows to `df` so that, for each combination of values for the levels other than `L2`, there is exactly one row for each of the `L2` level's possible values. For the added rows, the value of the `C` column should be 0.

IOW, I want to expand `df` so that it looks like this:

``````
          C
L0 L1 L2
0  w  P   1
      Q   0
      R   0
      S   0
   y  P   2
      Q   0
      R   3
      S   0
1  x  P   0
      Q   4
      R   5
      S   0
   z  P   0
      Q   0
      R   0
      S   6
``````

REQUIREMENTS:

• the operation should leave the types of the columns unchanged;

• the operation should add the smallest number of rows needed to complete only the specified level (`L2`).

Is there a simple way to perform this expansion?
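For reference, both requirements can also be met with an explicit `reindex`. This is a minimal sketch, not necessarily the simplest route. It assumes the full set of `L2` values is known up front, and the names `l2_values`, `pairs`, `full_index`, and `expanded` are illustrative only.

```python
import pandas as pd

idx = pd.MultiIndex.from_tuples(
    [(0, 'w', 'P'), (0, 'y', 'P'), (0, 'y', 'R'),
     (1, 'x', 'Q'), (1, 'x', 'R'), (1, 'z', 'S')],
    names=['L0', 'L1', 'L2'])
df = pd.DataFrame({'C': [1, 2, 3, 4, 5, 6]}, index=idx)

# Assumption: the complete set of L2 values is known in advance.
l2_values = ['P', 'Q', 'R', 'S']

# Only the (L0, L1) pairs that actually occur, in order of first appearance.
# Pairs such as (0, 'x') are never created, so no extra rows are added.
pairs = df.index.droplevel('L2').unique()

# Cross each existing pair with every L2 value.
full_index = pd.MultiIndex.from_tuples(
    [(l0, l1, l2) for (l0, l1) in pairs for l2 in l2_values],
    names=df.index.names)

# An integer fill_value avoids NaN, so the integer dtype of C is preserved.
expanded = df.reindex(full_index, fill_value=0)
print(expanded)
```

Because the new index is built only from pairs already in `df`, the result has 4 pairs × 4 values = 16 rows, and `C` stays an integer column.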

Provided that `L2` already contains every possible value you need somewhere in `df`, you can use the `unstack`/`stack` trick:
``````df.unstack('L2', fill_value=0).stack(level=1)