Charles Charles - 1 year ago 179
Python Question

Multi-Indexed fillna in Pandas

I have a multi-indexed dataframe and I'm looking to backfill missing values within a group. The dataframe I have currently looks like this:

import numpy as np
import pandas as pd

df = pd.DataFrame({
'group': ['group_a'] * 7 + ['group_b'] * 3 + ['group_c'] * 2,
'Date': ["2013-06-11",
"2013-07-02",
"2013-07-09",
"2013-07-30",
"2013-08-06",
"2013-09-03",
"2013-10-01",
"2013-07-09",
"2013-08-06",
"2013-09-03",
"2013-07-09",
"2013-09-03"],
'Value': [np.nan, np.nan, np.nan, 9, 4, 40, 18, np.nan, np.nan, 5, np.nan, 2]})

df.Date = df['Date'].apply(lambda x: pd.to_datetime(x).date())
df = df.set_index(['group', 'Date'])


I'm trying to get a dataframe in which the missing values are backfilled within each group, like this:

Group    Date        Value
group_a  2013-06-11      9
         2013-07-02      9
         2013-07-09      9
         2013-07-30      9
         2013-08-06      4
         2013-09-03     40
         2013-10-01     18
group_b  2013-07-09      5
         2013-08-06      5
         2013-09-03      5
group_c  2013-07-09      2
         2013-09-03      2


I tried using pd.fillna('Value', inplace=True), but I get a warning about setting a value on a copy, which I've since figured out is related to the presence of the multi-index. Is there a way to make fillna work for multi-indexed rows? Also, ideally I'd be able to apply the fillna to only one column and not the entire dataframe.

Any insight on this would be great.

Answer Source

Group by the first index level with groupby(level=0), backfill within each group with bfill, then write the result back with update:

df.update(df.groupby(level=0).bfill())
df

Note: update modifies df in place and returns None, so do not assign its result.
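
To see why the groupby step matters, here is a minimal sketch on a hypothetical toy frame (the names toy, leaky and filled are made up for illustration, not part of the question). It compares a plain bfill, which lets values cross from one group into the previous one, with the group-wise version:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: group "a" ends with a gap, group "b" starts with one.
toy = pd.DataFrame(
    {
        "group": ["a", "a", "a", "b", "b"],
        "Date": pd.to_datetime(
            ["2013-07-09", "2013-07-30", "2013-08-06", "2013-07-09", "2013-09-03"]
        ),
        "Value": [np.nan, 9.0, np.nan, np.nan, 5.0],
    }
).set_index(["group", "Date"])

# Plain bfill ignores the group boundary: the trailing gap in "a"
# is filled with 5.0, a value that belongs to group "b".
leaky = toy.bfill()

# Group-wise bfill stops at the boundary, so that trailing gap stays NaN.
filled = toy.groupby(level="group").bfill()

print(leaky["Value"].tolist())
print(filled["Value"].tolist())
```

In the question's data every group ends with a non-missing value, so both variants would happen to give the same result there; the group-wise form is the one that stays correct when a group ends with a gap.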


Other alternatives

df = df.groupby(level='group').bfill()

df = df.unstack(0).bfill().stack().swaplevel(0, 1).reindex_like(df)

Column-specific

df.Value = df.groupby(level=0).Value.bfill()
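
As a rough illustration of the column-specific form, the sketch below uses a hypothetical frame with an extra column, Other, that is not in the question. It assumes bracket indexing (toy["Value"]) instead of attribute access, which is a style choice and behaves the same here:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: "Other" is an invented second column that should keep its gaps.
toy = pd.DataFrame(
    {
        "group": ["a", "a", "b", "b"],
        "Date": pd.to_datetime(
            ["2013-07-09", "2013-07-30", "2013-07-09", "2013-09-03"]
        ),
        "Value": [np.nan, 9.0, np.nan, 5.0],
        "Other": [np.nan, 1.0, np.nan, 2.0],
    }
).set_index(["group", "Date"])

# Backfill only "Value", group by group; "Other" is not touched.
toy["Value"] = toy.groupby(level="group")["Value"].bfill()

print(toy)
```

After this runs, Value has no gaps within either group, while Other still has its original missing entries.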