ramesh ramesh - 5 months ago 62
Python Question

pandas subset and drop rows based on column value

My DataFrame:

import pandas as pd
dframe = pd.DataFrame({"A": list("aaaabbbbccc"), "C": range(1, 12)}, index=range(1, 12))

Out[9]:
    A   C
1   a   1
2   a   2
3   a   3
4   a   4
5   b   5
6   b   6
7   b   7
8   b   8
9   c   9
10  c  10
11  c  11


To subset based on a column value:

In[11]: first = dframe.loc[dframe["A"] == 'a']
In[12]: first
Out[12]:
   A  C
1  a  1
2  a  2
3  a  3
4  a  4


To drop rows based on a column value:

In[16]: dframe = dframe[dframe["A"] != 'a']
In[17]: dframe
Out[17]:
    A   C
5   b   5
6   b   6
7   b   7
8   b   8
9   c   9
10  c  10
11  c  11


Is there a way to do both in one step, that is, extract the rows matching a column value and remove those same rows from the original df?

Answer

There is no single call that does both, but the usual approach is to build a boolean mask once and reuse it, selecting with the mask and with its negation (~mask):

In [28]: mask = dframe['A'] == 'a'

In [29]: first, dframe = dframe[mask], dframe[~mask]

In [30]: first
Out[30]:
   A  C
1  a  1
2  a  2
3  a  3
4  a  4

In [31]: dframe
Out[31]:
    A   C
5   b   5
6   b   6
7   b   7
8   b   8
9   c   9
10  c  10
11  c  11
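
As a self-contained sketch of the same mask technique, the two selections can be wrapped in a small helper. The name split_rows is hypothetical (it is not a pandas function); it only packages the df[mask] / df[~mask] pair shown above and assumes the mask is a boolean Series aligned with the frame's index.

```python
import pandas as pd


def split_rows(df, mask):
    """Hypothetical helper: return (matching, remaining) rows of df.

    mask is assumed to be a boolean Series aligned with df's index.
    """
    # df[mask] keeps rows where mask is True; ~mask inverts the mask
    return df[mask], df[~mask]


dframe = pd.DataFrame(
    {"A": list("aaaabbbbccc"), "C": range(1, 12)},
    index=range(1, 12),
)

# Rebind dframe to the remainder, as in the answer above
first, dframe = split_rows(dframe, dframe["A"] == "a")

print(first)
print(dframe)
```

Note that this rebinds the name dframe to a new object rather than modifying the original frame in place; any other variable that still refers to the original frame continues to see all eleven rows.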