golfradio - 5 months ago
Java Question

Java 8 streams groupby and count multiple properties

I have an object Process that has a date and a boolean error indicator. I want to get a count of total processes and a count of processes with errors for each date. So for example Jun 01 will have counts 2, 1; Jun 02 will have 1, 0 and Jun 03 1, 1. The only way I have been able to do this is streaming twice to get the counts. I have tried implementing a custom collector but haven't been successful. Is there an elegant solution instead of my kludgy method?

final SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd");
final List<Process> processes = new ArrayList<>();
processes.add(new Process(sdf.parse("2016-06-01"), false));
processes.add(new Process(sdf.parse("2016-06-01"), true));
processes.add(new Process(sdf.parse("2016-06-02"), false));
processes.add(new Process(sdf.parse("2016-06-03"), true));

System.out.println(processes.stream()
.collect(
Collectors.groupingBy(Process::getDate, Collectors.counting()) ));

System.out.println(processes.stream().filter(order -> order.isHasError())
.collect(
Collectors.groupingBy(Process::getDate, Collectors.counting()) ));

private class Process {
private Date date;
private boolean hasError;

public Process(Date date, boolean hasError) {
this.date = date;
this.hasError = hasError;
}

public Date getDate() {
return date;
}

public boolean isHasError() {
return hasError;
}
}


Code after @glee8e's solution and @Holger's tips

// Counts all processes at index 0 and processes with errors at index 1.
Collector<Process, Result, Result> processCollector = Collector.of(
    Result::new,
    (r, p) -> {
        r.increment(0);
        if (p.isHasError()) {
            r.increment(1);
        }
    }, (r1, r2) -> {
        // Merge partial results (used for parallel streams).
        r1.add(0, r2.get(0));
        r1.add(1, r2.get(1));
        return r1;
    });

Map<Date, Result> results = processes.stream().collect(groupingBy(Process::getDate, processCollector));
results.entrySet().stream().sorted(Comparator.comparing(Entry::getKey)).forEach(entry -> System.out
.println(String.format("date = %s, %s", sdf.format(entry.getKey()), entry.getValue())));
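
The snippet above relies on a Result class that is not shown. A minimal sketch of what it might look like, assuming index 0 holds the total count and index 1 the error count (the class body is a guess that only matches the increment, add and get calls used above):

```java
// Hypothetical Result holder; only the method names come from the snippet above.
public class Result {
    // counts[0] = all processes, counts[1] = processes with errors (assumed layout)
    private final int[] counts = new int[2];

    // Adds one to the counter at the given index.
    public void increment(int index) {
        counts[index]++;
    }

    // Adds an arbitrary amount to the counter at the given index (used by the combiner).
    public void add(int index, int amount) {
        counts[index] += amount;
    }

    public int get(int index) {
        return counts[index];
    }

    @Override
    public String toString() {
        return String.format("total = %d, errors = %d", counts[0], counts[1]);
    }
}
```

With a toString like this, the printing loop above would emit one line per date containing both counts.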

Answer

It is better to add a POJO to hold the result; otherwise the combiner function can look a bit obscure. I declared the POJO as public, but you can reduce its visibility if you would rather hide it.

public class Result {
     // total = all processes for a date, error = those with hasError set
     public int total, error;
}

Main code:

// Add it somewhere in this file.
private static final Set<Collector.Characteristics> CH_ID = Collections.unmodifiableSet(EnumSet.of(Collector.Characteristics.IDENTITY_FINISH));

//...
// This is main processing code
Map<Date, Result> results = processes.stream().collect(groupingBy(Process::getDate, new Collector<Process, Result, Result>() {
            @Override
            public Supplier<Result> supplier() {
                return Result::new;
            }

            // The accumulator takes the container first, then the stream element.
            @Override
            public BiConsumer<Result, Process> accumulator() {
                return (r, p) -> {
                    r.total++;
                    if (p.isHasError())
                        r.error++;
                };
            }

            // Merges two partial results (only needed for parallel streams).
            @Override
            public BinaryOperator<Result> combiner() {
                return (r1, r2) -> {
                    r1.total += r2.total;
                    r1.error += r2.error;
                    return r1;
                };
            }

            @Override
            public Function<Result, Result> finisher() {
                return Function.identity();
            }

            @Override
            public Set<Characteristics> characteristics() {
                return CH_ID;
            }
}));

PS: This assumes you have import static java.util.stream.Collectors.*;
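
For reference, here is a compact, self-contained sketch of the same idea using Collector.of, which gives the resulting collector the IDENTITY_FINISH characteristic when no finisher is supplied. It assumes java.time.LocalDate in place of java.util.Date to keep the printed keys readable, and the names ProcessCounts and countByDate are made up for illustration:

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class ProcessCounts {

    // Simplified stand-in for the question's Process class (LocalDate is an assumption).
    static final class Process {
        private final LocalDate date;
        private final boolean hasError;

        Process(LocalDate date, boolean hasError) {
            this.date = date;
            this.hasError = hasError;
        }

        LocalDate getDate() { return date; }
        boolean isHasError() { return hasError; }
    }

    // Mutable result container: one instance per date.
    static final class Result {
        int total, error;

        @Override
        public String toString() { return total + ", " + error; }
    }

    // Groups by date in a single pass, counting all processes and those with errors.
    static Map<LocalDate, Result> countByDate(List<Process> processes) {
        Collector<Process, Result, Result> counter = Collector.of(
            Result::new,
            (r, p) -> { r.total++; if (p.isHasError()) r.error++; },
            (r1, r2) -> { r1.total += r2.total; r1.error += r2.error; return r1; });
        // TreeMap keeps the dates sorted for printing.
        return processes.stream()
            .collect(Collectors.groupingBy(Process::getDate, TreeMap::new, counter));
    }

    public static void main(String[] args) {
        List<Process> processes = Arrays.asList(
            new Process(LocalDate.of(2016, 6, 1), false),
            new Process(LocalDate.of(2016, 6, 1), true),
            new Process(LocalDate.of(2016, 6, 2), false),
            new Process(LocalDate.of(2016, 6, 3), true));
        // Prints:
        // 2016-06-01 -> 2, 1
        // 2016-06-02 -> 1, 0
        // 2016-06-03 -> 1, 1
        countByDate(processes).forEach((d, r) -> System.out.println(d + " -> " + r));
    }
}
```

The anonymous-class version and this one should behave the same; Collector.of simply saves writing out the finisher and characteristics by hand.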