Yariv Yariv - 1 year ago 116
Python Question

How to find duplicate names using pandas?

I have a pandas DataFrame with a column called name containing strings.
I would like to get a list of the names which occur more than once in the column. How do I do that?

I tried:

funcs_groups = funcs.groupby(funcs.name)

But it doesn't filter out the singleton names.

Answer

If you want to find the rows with a duplicated name (excluding the first occurrence of each name), you can try this:

In [16]: import pandas as pd
In [17]: p1 = {'name': 'willy', 'age': 10}
In [18]: p2 = {'name': 'willy', 'age': 11}
In [19]: p3 = {'name': 'zoe', 'age': 10}
In [20]: df = pd.DataFrame([p1, p2, p3])

In [21]: df
   age   name
0   10  willy
1   11  willy
2   10    zoe

In [22]: df.duplicated('name')
0    False
1     True
2    False
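
The question asks for a list of the names that occur more than once, so here is a minimal sketch of one way to get from the boolean mask to that list. It reuses the sample data from the session above; the variable names (mask, dup_names, counts, dup_names_via_counts) are illustrative and not from the original answer, and keep=False is assumed to be available in your pandas version.

```python
import pandas as pd

# Same sample data as in the session above.
df = pd.DataFrame([
    {'name': 'willy', 'age': 10},
    {'name': 'willy', 'age': 11},
    {'name': 'zoe', 'age': 10},
])

# keep=False marks every occurrence of a repeated name, including the first.
mask = df.duplicated('name', keep=False)

# Unique names that occur more than once, as a plain Python list.
dup_names = df.loc[mask, 'name'].unique().tolist()

# Alternative: count occurrences and keep the names seen more than once.
counts = df['name'].value_counts()
dup_names_via_counts = counts[counts > 1].index.tolist()

print(dup_names)  # ['willy']
```

The mask approach keeps the names in order of first appearance, while value_counts orders them by frequency, so the two lists can differ in order when several names are duplicated.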