ccsv ccsv - 2 months ago 7
Python Question

Pandas selecting duplicates that occur n times

I have a DataFrame. How would I select only the rows whose Name value occurs exactly two times?

import pandas as pd

df=pd.DataFrame({'Name':['Two','Twice','Twice','three','three','three','one', 'Two'],
'key':[2,2,2,1,1,3,1,1,],
'Last':['Foo','Macy','Gayson','Simpson','Diablo','Niggah','Simpson', 'Mortimer']
})


r = df[df.duplicated(subset=['Name'], keep=False)]


print(r)


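The code above returns every row whose Name is duplicated, including the three 'three' rows. What I want to get is: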

       Last   Name  key
0       Foo    Two    2
1      Macy  Twice    2
2    Gayson  Twice    2
7  Mortimer    Two    1

Answer

Try this, which groups the rows by Name and keeps only the groups that contain exactly two rows:

In [80]: df.groupby('Name').filter(lambda x: len(x) == 2)
Out[80]:
       Last   Name  key
0       Foo    Two    2
1      Macy  Twice    2
2    Gayson  Twice    2
7  Mortimer    Two    1
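
The same selection can also be written with a boolean mask, which generalizes to any n and avoids calling a Python lambda once per group (that can get slow when there are many distinct names). This is only a sketch: the helper name rows_occurring_n_times is made up for illustration, and the Last column is left out for brevity.

```python
import pandas as pd


def rows_occurring_n_times(df, column, n):
    """Return the rows whose value in `column` appears exactly `n` times.

    Hypothetical helper, not part of pandas.
    """
    # transform('size') broadcasts each group's row count back onto its rows,
    # so `counts` is aligned with the original index of `df`.
    counts = df.groupby(column)[column].transform('size')
    return df[counts == n]


df = pd.DataFrame({
    'Name': ['Two', 'Twice', 'Twice', 'three', 'three', 'three', 'one', 'Two'],
    'key': [2, 2, 2, 1, 1, 3, 1, 1],
})

result = rows_occurring_n_times(df, 'Name', 2)
print(result)
```

For this data the mask keeps rows 0, 1, 2 and 7, the same rows the filter call returns. Passing n=3 would select the 'three' rows instead.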