kjo kjo - 3 months ago 14
Python Question

How to add rows for all missing values of one multi-index's level?

Suppose that I have the following dataframe df, indexed by a 3-level multi-index:

In [52]: df
Out[52]:
          C
L0 L1 L2
0  w  P   1
   y  P   2
      R   3
1  x  Q   4
      R   5
   z  S   6


Code to create the DataFrame:

import pandas as pd

# `codes` was called `labels` before pandas 0.24
idx = pd.MultiIndex(levels=[[0, 1], ['w', 'x', 'y', 'z'], ['P', 'Q', 'R', 'S']],
                    codes=[[0, 0, 0, 1, 1, 1], [0, 2, 2, 1, 1, 3], [0, 0, 2, 1, 2, 3]],
                    names=['L0', 'L1', 'L2'])

df = pd.DataFrame({'C': [1, 2, 3, 4, 5, 6]}, index=idx)


The possible values for the L2 level are 'P', 'Q', 'R', and 'S', but some of these values are missing for particular combinations of values for the remaining levels. For example, the combination (L0=0, L1='w', L2='Q') is not present in df.

I would like to add enough rows to df so that, for each combination of values for the levels other than L2, there is exactly one row for each of the L2 level's possible values. For the added rows, the value of the C column should be 0.

In other words, I want to expand df so that it looks like this:

          C
L0 L1 L2
0  w  P   1
      Q   0
      R   0
      S   0
   y  P   2
      Q   0
      R   3
      S   0
1  x  P   0
      Q   4
      R   5
      S   0
   z  P   0
      Q   0
      R   0
      S   6


REQUIREMENTS:


  • the operation should leave the types of the columns unchanged;
  • the operation should add the smallest number of rows needed to complete only the specified level (L2).

Is there a simple way to perform this expansion?

Answer Source

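Assuming L2 already contains every value you need somewhere in df, you can use the unstack/stack trick: unstack moves L2 into the columns and fills the missing cells with 0, and stack moves it back into the index: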

df.unstack('L2', fill_value=0).stack(level=1)
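
To check this against the requirements, here is a small self-contained sketch. It rebuilds the question's frame with MultiIndex.from_tuples (my choice, only to sidestep the labels/codes keyword), applies the one-liner, and also shows a reindex-based variant for the case where some L2 value never appears in df at all. The reindex variant is not part of the original answer; the names l2_values, pairs, full_index and expanded_reindex are made up for illustration.

```python
import pandas as pd

# Rebuild the DataFrame from the question.
idx = pd.MultiIndex.from_tuples(
    [(0, 'w', 'P'), (0, 'y', 'P'), (0, 'y', 'R'),
     (1, 'x', 'Q'), (1, 'x', 'R'), (1, 'z', 'S')],
    names=['L0', 'L1', 'L2'])
df = pd.DataFrame({'C': [1, 2, 3, 4, 5, 6]}, index=idx)

# 1) The unstack/stack trick: pivot L2 into the columns, filling absent
#    cells with 0, then pivot it back into the index.
expanded = df.unstack('L2', fill_value=0).stack('L2')

# 2) Alternative sketch (an assumption, not from the answer): list the
#    wanted L2 values explicitly and reindex. This also works when one of
#    the values never occurs in df.
l2_values = ['P', 'Q', 'R', 'S']
pairs = df.index.droplevel('L2').unique()  # (L0, L1) pairs that occur in df
full_index = pd.MultiIndex.from_tuples(
    [(l0, l1, l2) for (l0, l1) in pairs for l2 in l2_values],
    names=df.index.names)
expanded_reindex = df.reindex(full_index, fill_value=0)

print(expanded)
print(expanded['C'].dtype)  # compare with df['C'].dtype
```

Both versions only add rows for (L0, L1) pairs that already occur in df, and passing fill_value=0 (rather than filling NaN afterwards) is what keeps C as an integer column.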
