JeffTheKiller - 1 month ago
Python Question

Python: How to delete all rows for IDs that have only one distinct value?

I have a dataframe in pandas, looking like this:

ID event
1 2
1 3
2 2
2 2
3 2
3 1
3 5
3 2


I would like to delete all rows belonging to any ID whose 'event' column contains only a single, repeated value. So my output should be:

ID event
1 2
1 3
3 2
3 1
3 5
3 2


because only ID = 2 has the same value in every row of the event column.
I tried iterating over the IDs, but it didn't give me appropriate results. I know the solution should be simple, but I just can't come up with an idea.

Answer

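A df.groupby on 'ID' followed by transform on the 'event' column should do it. The lambda returns one boolean per group (whether that ID has more than one distinct event), and transform broadcasts it to every row of the group, giving a boolean mask for indexing df: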

In [1471]: df[df.groupby('ID')['event'].transform(lambda x: x.nunique() > 1)]
Out[1471]: 
   ID  event
0   1      2
1   1      3
4   3      2
5   3      1
6   3      5
7   3      2
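
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample data in the question; the variable names (distinct_per_row, result, result_filter) are only illustrative. It uses transform("nunique") with an explicit comparison, which should be equivalent to the lambda version above, and also shows groupby(...).filter as an alternative that drops whole groups directly.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID":    [1, 1, 2, 2, 3, 3, 3, 3],
    "event": [2, 3, 2, 2, 2, 1, 5, 2],
})

# Number of distinct 'event' values for each ID, repeated on every row of that ID.
distinct_per_row = df.groupby("ID")["event"].transform("nunique")

# Keep only rows whose ID has more than one distinct event.
result = df[distinct_per_row > 1]

# Alternative: filter keeps or drops each group as a whole.
result_filter = df.groupby("ID").filter(lambda g: g["event"].nunique() > 1)

print(result)
```

Both variants keep the original row index, so the rows for ID 2 simply disappear from the result. On large frames the transform("nunique") form is likely faster than filter with a lambda, since it avoids calling a Python function once per group.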