Ruby Ruby - 3 years ago 176
Scala Question

How do I efficiently count distinct fields in a collection?

I am currently doing this:

val count = sightings.map(_.shape).distinct.length


However, map creates an intermediate collection, which in my case is a Vector thousands of times larger than what distinct produces.

How do I bypass this intermediate step and get the set of distinct shapes or, even better, just the count of distinct shapes?

Answer

You can map over an iterator, which is lazy and so avoids building the intermediate collection, and then collect the shapes into a Set to keep only the distinct ones:

val count = sightings.iterator.map(_.shape).toSet.size
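For a version that can be run on its own, here is a minimal sketch of the iterator approach. The question does not show its types, so Sighting (with a String shape), DistinctShapeCount and countDistinctShapes are hypothetical stand-ins made up for illustration:

```scala
// Hypothetical stand-in for the question's type; the real Sighting is not shown.
final case class Sighting(id: Int, shape: String)

object DistinctShapeCount {
  // Map lazily over an iterator and collect straight into a Set,
  // so no intermediate Vector holding every shape is built.
  def countDistinctShapes(sightings: Iterable[Sighting]): Int =
    sightings.iterator.map(_.shape).toSet.size

  def main(args: Array[String]): Unit = {
    val sightings = Vector(
      Sighting(1, "disk"),
      Sighting(2, "triangle"),
      Sighting(3, "disk"),
      Sighting(4, "light")
    )
    println(countDistinctShapes(sightings)) // prints 3
  }
}
```

The only extra memory used is the Set itself, which grows with the number of distinct shapes rather than with the number of sightings.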

Alternatively, you can use collection.breakOut (available in Scala 2.12 and earlier) to have map build a Set directly, without creating the intermediate collection. Another answer also suggested breakOut, but used it in a different way:

val distinctShapes: Set[Shape] = sightings.map(_.shape)(collection.breakOut)
val count = distinctShapes.size
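
On Scala 2.13 and later, where collection.breakOut is no longer available, a view is one way to get a similar effect. This is a different technique from breakOut, not a drop-in replacement: the view makes map lazy, and to(Set) then builds the Set in a single pass. A minimal sketch, again assuming a hypothetical Sighting type with a String shape (DistinctShapesView and distinctShapes are also illustrative names):

```scala
// Requires Scala 2.13 or later.
// Hypothetical stand-in for the question's type; the real Sighting is not shown.
final case class Sighting(id: Int, shape: String)

object DistinctShapesView {
  // view makes map lazy; to(Set) consumes the view once and builds the Set directly.
  def distinctShapes(sightings: Vector[Sighting]): Set[String] =
    sightings.view.map(_.shape).to(Set)

  def main(args: Array[String]): Unit = {
    val sightings = Vector(
      Sighting(1, "disk"),
      Sighting(2, "triangle"),
      Sighting(3, "disk")
    )
    val shapes = distinctShapes(sightings)
    println(shapes.size) // prints 2
  }
}
```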