Tanmaya Meher - 1 year ago 62
Python Question

# Finding non-unique elements in a list is not working

I want to keep only the non-unique elements of a list, but I can't figure out why the code below doesn't do that.

```
>>> d = [1, 2, 1, 2, 4, 4, 5, 'a', 'b', 'a', 'b', 'c', 6, 'f', 3]
>>> for i in d:
...     if d.count(i) == 1:
...         d.remove(i)
...
>>> d
[1, 2, 1, 2, 4, 4, 'a', 'b', 'a', 'b', 6, 3]
```

`6` and `3` should have been removed. However, if I use

```
d = [1, 2, 1, 2, 4, 4, 5, 'a', 'b', 'a', 'b', 'c']
```

I get the correct answer. Please explain what is happening; I am confused!

I am using Python 2.7.5.

Removing elements from a list while iterating over it is never a good idea. The appropriate way to do this would be to use a `collections.Counter` with a list comprehension (on Python 3, use `.items()` in place of `.iteritems()`):

```
>>> from collections import Counter
>>> d = [1, 2, 1, 2, 4, 4, 5, 'a', 'b', 'a', 'b', 'c', 6, 'f', 3]
>>> [k for (k, v) in Counter(d).iteritems() if v > 1]
['a', 1, 2, 'b', 4]
```

If you want to keep the duplicate elements in the order in which they appear in your list:

```
>>> keep = {k for (k, v) in Counter(d).iteritems() if v > 1}
>>> [x for x in d if x in keep]
[1, 2, 1, 2, 4, 4, 'a', 'b', 'a', 'b']
```
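For reuse, the same idea can be wrapped in a small function. This is only a sketch: the name `non_unique` is made up for illustration, and it looks counts up directly in the `Counter` instead of calling `.iteritems()`, so it should run unchanged on both Python 2.7 and Python 3.

```python
from collections import Counter

def non_unique(items):
    """Return the elements of `items` that occur more than once, in original order."""
    counts = Counter(items)  # one pass to count every element
    # Build a new list instead of mutating `items` while looping over it.
    return [x for x in items if counts[x] > 1]

d = [1, 2, 1, 2, 4, 4, 5, 'a', 'b', 'a', 'b', 'c', 6, 'f', 3]
result = non_unique(d)
print(result)  # [1, 2, 1, 2, 4, 4, 'a', 'b', 'a', 'b']
```

Because a new list is built, `d` itself is left untouched, and the whole thing takes two linear passes rather than calling `d.count()` once per element.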

I'll try to explain why your approach doesn't work. To understand why some elements aren't removed as they should be, imagine that we want to remove all `b`s from the list `[a, b, b, c]` while looping over it. It'll look something like this:

```
+-----------------------+
|  a  |  b  |  b  |  c  |
+-----------------------+
   ^ (first iteration)

+-----------------------+
|  a  |  b  |  b  |  c  |
+-----------------------+
         ^ (next iteration: we found a 'b' -- remove it)

+-----------------------+
|  a  |     |  b  |  c  |
+-----------------------+
         ^ (removed b)

+-----------------+
|  a  |  b  |  c  |
+-----------------+
         ^ (shift subsequent elements down to fill vacancy)

+-----------------+
|  a  |  b  |  c  |
+-----------------+
               ^ (next iteration)
```

Notice that we skipped the second `b`! Once we removed the first `b`, elements were shifted down and our `for`-loop consequently failed to touch every element of the list. The same thing happens in your code.
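The skipping can be reproduced with an explicit index, which is roughly how a `for`-loop walks a list. The sketch below is illustrative only (the helper name `remove_during_iteration` is made up); it records which elements the loop actually visits, then shows one common workaround for the original code: iterating over a shallow copy (`d[:]`) while removing from the original.

```python
def remove_during_iteration(items, target):
    """Mimic `for x in items: if x == target: items.remove(x)` with an explicit index."""
    items = list(items)      # work on a copy so the caller's list is untouched
    visited = []
    i = 0
    while i < len(items):    # the length is re-checked on every step
        x = items[i]
        visited.append(x)
        if x == target:
            items.remove(x)  # everything after the removed slot shifts left by one
        i += 1               # the index advances regardless; after a removal this
                             # skips the element that slid into slot i
    return items, visited

remaining, visited = remove_during_iteration(['a', 'b', 'b', 'c'], 'b')
print(remaining)  # ['a', 'b', 'c']  (the second 'b' survived)
print(visited)    # ['a', 'b', 'c']  (the second 'b' was never visited)

# Workaround: loop over a shallow copy, remove from the original.
d = [1, 2, 1, 2, 4, 4, 5, 'a', 'b', 'a', 'b', 'c', 6, 'f', 3]
for x in d[:]:
    if d.count(x) == 1:
        d.remove(x)
print(d)  # [1, 2, 1, 2, 4, 4, 'a', 'b', 'a', 'b']
```

The copy-based loop gives the expected result here, but it still calls `d.count()` for every element, so a `Counter`-based approach is likely the better fit for long lists.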