Andrew Mairose - 2 months ago
Java Question

Get object with max frequency from Java 8 stream

I have an object with city and zip fields, let's call it Record.

public class Record {
    private String zip;
    private String city;

    // getters and setters
}


Now, I have a collection of these objects, and I group them by zip using the following code:

final Collection<Record> records; // populated collection of records
final Map<String, List<Record>> recordsByZip = records.stream()
    .collect(Collectors.groupingBy(Record::getZip));


So, now I have a map where the key is the zip and the value is a list of Record objects with that zip.

What I want to get now is the most common city for each zip.

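// The lambda parameter is named entries because a lambda parameter
// cannot reuse the name of the enclosing local variable records.
recordsByZip.forEach((zip, entries) -> {
    final String mostCommonCity = // get most common city for these entries
});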


I would like to do this using only stream operations. For example, I am able to get a map of the frequency of each city by doing this:

recordsByZip.forEach((zip, entries) -> {
    final Map<String, Long> frequencyMap = entries.stream()
        .map(Record::getCity)
        // StringUtils is presumably the Apache Commons Lang class
        .filter(StringUtils::isNotBlank)
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
});


But I would like to be able to do a single-line stream operation that will just return the most frequent city.

Are there any Java 8 stream gurus out there that can work some magic on this?

Here is an ideone sandbox if you'd like to play around with it.

Answer

You could have the following:

final Map<String, String> mostFrequentCities =
  records.stream()
         .collect(Collectors.groupingBy(
            Record::getZip,
            Collectors.collectingAndThen(
              Collectors.groupingBy(Record::getCity, Collectors.counting()),
              map -> map.entrySet().stream().max(Map.Entry.comparingByValue()).get().getKey()
            )
         ));

This groups the records by zip and, within each zip, by city, counting how many records there are for each city. The resulting map of city counts for each zip is then post-processed to keep only the city with the maximum count.
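
To try this end to end, here is a minimal, self-contained sketch built around the collector above. The class name MostFrequentCity, the constructor on Record, the isNotBlank helper (a plain-Java stand-in for the StringUtils::isNotBlank filter used in the question) and the sample data are all assumptions made for illustration; they are not part of the original question or answer. The sketch also assumes that every record has a non-null zip, since groupingBy rejects null keys.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class MostFrequentCity {

    // Minimal stand-in for the Record class from the question (assumed shape).
    static class Record {
        private final String zip;
        private final String city;

        public Record(String zip, String city) {
            this.zip = zip;
            this.city = city;
        }

        public String getZip() { return zip; }
        public String getCity() { return city; }
    }

    // Hypothetical helper: plain-Java replacement for StringUtils.isNotBlank.
    static boolean isNotBlank(String s) {
        return s != null && !s.trim().isEmpty();
    }

    // Maps each zip to the city that occurs most often among its records.
    static Map<String, String> mostFrequentCityByZip(Collection<Record> records) {
        return records.stream()
                // Drop records without a usable city before grouping.
                .filter(r -> isNotBlank(r.getCity()))
                .collect(Collectors.groupingBy(
                        Record::getZip,
                        Collectors.collectingAndThen(
                                // Count the records per city within one zip.
                                Collectors.groupingBy(Record::getCity, Collectors.counting()),
                                // A group exists only if it holds at least one record,
                                // so the Optional returned by max is never empty here.
                                counts -> counts.entrySet().stream()
                                        .max(Map.Entry.comparingByValue())
                                        .get()
                                        .getKey())));
    }

    public static void main(String[] args) {
        List<Record> records = Arrays.asList(
                new Record("10001", "New York"),
                new Record("10001", "New York"),
                new Record("10001", "NYC"),
                new Record("94105", "San Francisco"),
                new Record("94105", ""),
                new Record("94105", "SF"),
                new Record("94105", "San Francisco"));

        mostFrequentCityByZip(records)
                .forEach((zip, city) -> System.out.println(zip + " -> " + city));
    }
}
```

Two things to keep in mind with this approach: a zip whose records all have blank cities does not appear in the result at all, and when two cities are tied for the highest count within a zip, which one is returned depends on the iteration order of the intermediate map and should not be relied upon.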