Pedro Braz - 5 months ago
Python Question

Pandas MultiIndex groupby retaining index levels

After research I have found no similar questions on this or any other forum.

I'm grouping a MultiIndex DataFrame by its outer level ('Name'). After grouping, I still want to know which values of the inner level ('Date') were selected.

I have something like this:

import pandas as pd

df = pd.DataFrame([['A', 1, 3],
                   ['A', 2, 4],
                   ['A', 3, 6],
                   ['B', 1, 9],
                   ['B', 2, 10],
                   ['B', 4, 6]],
                  columns=pd.Index(['Name', 'Date', 'Value'], name='ColumnName')
                  ).set_index(['Name', 'Date'])

ColumnName  Value
Name Date
A    1          3
     2          4
     3          6
B    1          9
     2         10
     4          6


What I want is:

ColumnName  Value
Name Date
A    3          6
B    4          6


What I managed so far is this command:

df.groupby(level=('Name')).last()


which returns this, without the 'Date' level:

ColumnName  Value
Name
A               6
B               6


Alternatively, this command:

df.groupby(level=('Name','Date')).last()


raises an error.

Keep in mind that this is a performance-sensitive application.

Thoughts?

EDIT: Meanwhile, I submitted a feature request on GitHub.

Answer

This will get it done:

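def get_slice(df):
    # Level values and integer codes of the two-level MultiIndex
    l0, l1 = df.index.levels
    b0, b1 = df.index.codes  # called `labels` before pandas 0.24

    # For each outer-level value, take the inner-level value of its last row
    myslice = [(l0[i], l1[b1[b0 == i][-1]]) for i in range(len(l0))]

    return df.loc[myslice]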

Timed

%%timeit
get_slice(df)

1000 loops, best of 3: 458 µs per loop
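
For comparison, here is a minimal sketch of two other ways to keep the last row per 'Name' together with its 'Date'. These are not the approach from the answer above, and they were not timed here. Both assume that "last" means the last row in the frame's existing order, as `.last()` does. The variable names `last_by_tail` and `last_by_mask` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [['A', 1, 3], ['A', 2, 4], ['A', 3, 6],
     ['B', 1, 9], ['B', 2, 10], ['B', 4, 6]],
    columns=pd.Index(['Name', 'Date', 'Value'], name='ColumnName'),
).set_index(['Name', 'Date'])

# Option 1: tail(1) selects the last row of each group and, unlike an
# aggregation such as last(), keeps the original (Name, Date) index.
last_by_tail = df.groupby(level='Name').tail(1)

# Option 2: a boolean mask that is True only on the last occurrence
# of each Name in the outer index level; no groupby involved.
names = df.index.get_level_values('Name')
last_by_mask = df[~names.duplicated(keep='last')]

print(last_by_tail)
print(last_by_mask)
```

Both variants return rows of the original frame, so the full MultiIndex survives. Which one is faster on real data would need to be measured with `%%timeit` on that data.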