Phil - 1 year ago 274
Scala Question

# How to find duplicates in a list?

I have a list of unsorted integers and I want to find those elements which have duplicates.

``````
val dup = List(1,1,1,2,3,4,5,5,6,100,101,101,102)
``````

I can find the distinct elements of the list with `dup.distinct`, so I wrote my answer as follows.

``````
val dup = List(1,1,1,2,3,4,5,5,6,100,101,101,102)
val distinct = dup.distinct
// Pair each distinct element with the number of times it occurs in dup.
val elementsWithCounts = distinct.map( (a: Int) => (a, dup.count( (b: Int) => a == b )) )
// Pair[Int,Int] is deprecated in Scala 2.11 and gone in 2.13; use the tuple type (Int, Int).
val duplicatesRemoved = elementsWithCounts.filter( (pair: (Int, Int)) => { pair._2 <= 1 } )
val withDuplicates = elementsWithCounts.filter( (pair: (Int, Int)) => { pair._2 > 1 } )
``````

Is there an easier way to solve this?

Try this:

``````
val dup = List(1,1,1,2,3,4,5,5,6,100,101,101,102)
dup.groupBy(identity).collect { case (x, List(_,_,_*)) => x }
``````

The `groupBy(identity)` call associates each distinct integer with a list of its occurrences. The `collect` is basically `map` where non-matching elements are dropped. The pattern following `case` matches a pair whose key is an integer `x` and whose value fits `List(_,_,_*)`, that is, a list with at least two elements. Each underscore stands for an element whose value we don't need to bind, and `_*` allows those two elements to be followed by zero or more others.
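To make the two steps concrete, here is a small sketch that names the intermediate result. The object name `GroupByWalkthrough` and the final `.toList.sorted` are my own additions for illustration; the sort is only there because a `Map` gives no ordering guarantee.

```scala
object GroupByWalkthrough {
  val dup = List(1,1,1,2,3,4,5,5,6,100,101,101,102)

  // Step 1: group equal values together; the keys are the distinct integers.
  // For example, grouped(1) == List(1, 1, 1) and grouped(2) == List(2).
  val grouped: Map[Int, List[Int]] = dup.groupBy(identity)

  // Step 2: keep only the keys whose group has two or more elements.
  // Sorting is an illustrative extra so the result has a stable order.
  val duplicates: List[Int] =
    grouped.collect { case (x, List(_, _, _*)) => x }.toList.sorted
  // duplicates == List(1, 5, 101)
}
```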

You could also do:

``````
dup.groupBy(identity).collect { case (x,ys) if ys.lengthCompare(1) > 0 => x }
``````

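Either way, this is much faster than the approach you provided on large lists: your version calls `dup.count` once per distinct element, scanning the whole list each time, whereas `groupBy` builds all the groups in a single pass. `lengthCompare(1)` also stops as soon as it has seen two elements instead of computing each group's full length.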
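If you need this in more than one place, it could be wrapped up roughly like this. This is only a sketch: `DuplicatesSketch`, `duplicatesOf` and `duplicatesByCounting` are hypothetical names of mine, and I return a `Set` on the assumption that the order of the duplicates doesn't matter. The second method is the count-based approach from the question, condensed, so the two can be compared on the same input.

```scala
object DuplicatesSketch {
  // One groupBy pass, then a bounded length check per group.
  def duplicatesOf[A](xs: Seq[A]): Set[A] =
    xs.groupBy(identity).collect { case (x, ys) if ys.lengthCompare(1) > 0 => x }.toSet

  // The question's idea for comparison: one full `count` scan per distinct element.
  def duplicatesByCounting[A](xs: Seq[A]): Set[A] =
    xs.distinct.filter(a => xs.count(_ == a) > 1).toSet
}
```

Both should return the same set for any input, e.g. `DuplicatesSketch.duplicatesOf(dup)` gives `Set(1, 5, 101)` for the list in the question; the difference is only in how many times the data is traversed.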
