swepab - 1 year ago 43

Python Question

I have a pandas dataframe on the following form:

id grp values1 values2

0 1 a_1 2 4

1 1 a_2 3 6

2 1 b_1 4 8

3 2 b_2 5 10

4 2 c_1 6 12

5 3 z_1 7 14

6 4 y_1 8 16

7 5 a_1 9 18

8 5 a_2 10 20

9 5 a_1 11 22

10 5 a_2 12 24

11 7 z_1 13 26

12 8 q_1 14 28

13 9 a_2 15 30

14 9 a_1 16 32

What I wand is a reduced data frame with the rows and ids containing "a_1" and "a_2" only for the ID-group like below.

id grp values1 values2

7 5 a_1 9 18

8 5 a_2 10 20

9 5 a_1 11 22

10 5 a_2 12 24

13 9 a_2 15 30

14 9 a_1 16 32

The end data frame should have even numbers when doing a count on the id variable since the group values goes in "pairs", i.e. "a_1" and "a_2" together. Also, this should work for an arbitrary number of "group by columns" like the id column is.

The code for the dataframe is posted below:

`df = pd.DataFrame({'id' : [1,1,1,2,2,3,4,5,5,5,5,7,8,9,9]`

,'grp' : ['a_1', 'a_2', 'b_1', 'b_2', 'c_1', 'z_1', 'y_1', 'a_1', 'a_2', 'a_1', 'a_2', 'z_1', 'q_1', 'a_2','a_1']

,'values1' : [i for i in range(2,17,1)]

,'values2' : [i for i in range(4,33,2)]

})

Thanks for considerations, time and input!

Answer Source

Or you could take the easier way and:

```
filtered_df = df.ix[(df['grp'] == 'a_1') | (df['grp'] == 'a_2')]
```