coderunner007 - 21 days ago
Python Question

multiple row selection in multi indexed dataframe

Suppose I write this code in pandas to create a dataframe:

import random
import pandas as pd

df = pd.DataFrame({'x': random.sample(range(1, 100), 4),
                   'y': random.sample(range(1, 100), 4),
                   'z': random.sample(range(1, 100), 4)},
                  index=[['a1', 'b1', 'c1', 'd1'], ['a2', 'b2', 'c2', 'd2']])


This results in the following dataframe:

        x   y   z
a1 a2   8   2  85
b1 b2  43  93  58
c1 c2   1  46  24
d1 d2  60  37  62


I want to select the multi indexed rows by passing a list like:

[['a1', 'a2'], ['b1', 'b2'], ['c1', 'c2']]


to return:

        x   y   z
a1 a2   8   2  85
b1 b2  43  93  58
c1 c2   1  46  24


Is there a function in pandas that does it?

Answer

Yes. df.loc does this, but each row label must be given as a tuple rather than a list:

target_index = [('a1', 'a2'), ('b1', 'b2'), ('c1', 'c2')]

Then

df.loc[target_index]

gives you the desired rows (the values below come from a different sample dataframe than the one in the question):

       x  y  z
a1 a2  0  2  3
b1 b2  1  3  4
c1 c2  2  4  5
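
If the selection already exists as a list of lists, as in the question, it can be converted to tuples before being passed to df.loc. The following is a minimal sketch, not a definitive recipe: the names wanted, selected and masked are made up for illustration, and the values are the ones printed in the question, hard-coded instead of drawn with random.sample so the result is reproducible. The boolean mask at the end is an alternative technique, not part of the answer above.

```python
import pandas as pd

# Fixed values (taken from the question's printed output) so the run is reproducible.
df = pd.DataFrame(
    {'x': [8, 43, 1, 60], 'y': [2, 93, 46, 37], 'z': [85, 58, 24, 62]},
    index=[['a1', 'b1', 'c1', 'd1'], ['a2', 'b2', 'c2', 'd2']],
)

# The selection in the shape the question asks for: a list of lists.
wanted = [['a1', 'a2'], ['b1', 'b2'], ['c1', 'c2']]

# Convert each inner list to a tuple so .loc reads it as one full MultiIndex key.
target_index = [tuple(pair) for pair in wanted]

selected = df.loc[target_index]
print(selected)

# Alternative: a boolean mask built with MultiIndex.isin.
# It keeps the dataframe's own row order and does not raise for keys that are absent.
masked = df[df.index.isin(target_index)]
```

Here selected and masked contain the same three rows, because the requested keys are listed in the same order as they appear in the dataframe.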