RMichalowski, 4 years ago
Python Question

Pandas groupby on multiple values

I start with a table sorted by columns A and B:

Index | A | B | C |
0 | A1| 0 | Group 1 |
1 | A1| 0 | Group 1 |
2 | A1| 1 | Group 2 |
3 | A1| 1 | Group 2 |
4 | A1| 2 | Group 3 |
5 | A1| 2 | Group 3 |
6 | A2| 7 | Group 4 |
7 | A2| 7 | Group 4 |


The desired result is records 0, 1, 2, 3, 6 and 7.

First, I want to group the rows by columns A and B.
Then, for each value of column A, I want to keep only its first two subgroups (distinct B values).
All records belonging to those subgroups should be returned.

Thank you so much.

Answer

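Use factorize within a groupby on A and keep the rows whose code is less than 2. Within each A group, factorize numbers the distinct B values 0, 1, 2, ... in order of first appearance, so codes 0 and 1 mark the first two subgroups.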

df[df.groupby('A').B.transform(lambda x: x.factorize()[0]).lt(2)]
# same as
# df[df.groupby('A').B.transform(lambda x: x.factorize()[0]) < 2]

    A  B        C
0  A1  0  Group 1
1  A1  0  Group 1
2  A1  1  Group 2
3  A1  1  Group 2
6  A2  7  Group 4
7  A2  7  Group 4
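
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the table from the question and shows the intermediate factorize codes. The variable names `codes` and `result` are only for illustration and are not part of the original answer.

```python
import pandas as pd

# Rebuild the sample table from the question.
df = pd.DataFrame({
    "A": ["A1"] * 6 + ["A2"] * 2,
    "B": [0, 0, 1, 1, 2, 2, 7, 7],
    "C": ["Group 1"] * 2 + ["Group 2"] * 2 + ["Group 3"] * 2 + ["Group 4"] * 2,
})

# Within each A group, number the distinct B values in order of first appearance.
codes = df.groupby("A")["B"].transform(lambda x: x.factorize()[0])

# Keep rows whose subgroup code is 0 or 1, i.e. the first two subgroups per A.
result = df[codes.lt(2)]
print(result)
```

Because factorize numbers values by first appearance rather than by size, this relies on the table already being sorted, as stated in the question.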