pivovarit - 1 month ago
Java Question

How to preserve state when performing operations on Java 8 Stream?

I need to parse a string of integers, each giving the length of a period during which a certain user was or wasn't watching TV.

I start by splitting the string and collecting the pieces into a List:

final List<String> separated = stream(split(s, "A"))
    .map(str -> "A" + str)
    .flatMap(str -> stream(split(str, "B")).map(s2 -> s2.startsWith("A") ? s2 : "B" + s2))
    .collect(Collectors.toList());


Now comes the tricky part. I need to transform those strings into domain objects with from/to fields, so my mapping function needs to know about the preceding element. I did the following:

LocalDateTime temp = initial;

final ArrayList<WatchingPeriod> result = new ArrayList<>();

for (final String s1 : separated) {
    final WatchingPeriod period = new WatchingPeriod(temp, temp.plusMinutes(parseLong(substring(s1, 1))),
            s1.startsWith("A"));

    result.add(period);
    temp = period.getTo();
}
return result;


This feels like a huge step back, since I am breaking the whole stream pipeline just to fall back on an old-school for-each loop. Is there any way to do the whole processing in one stream pipeline? I am thinking about creating a custom Collector that would look at the last element in the collection and calculate the correct LocalDateTime objects based on it.

Examples:

input string: "A60B80A60", which means that someone was watching something for 60 minutes, then stopped for 80 and then watched again for 60 minutes

and as a result I'd like to get a List with objects:

1) from: 0:00, to: 1:00, watched: true

2) from: 1:00, to: 2:20, watched: false

3) from: 2:20, to: 3:20, watched: true

Calculating each object requires knowledge of the previous one.

Answer

This is not about successive pairs but about computing a cumulative prefix. In functional programming this operation is usually called scanLeft, and it is available in many functional languages such as Scala. Unfortunately, the Java 8 Stream API does not provide it, so we can only emulate it with forEachOrdered. Let's create a model object:

static class WatchPeriod {
    static final DateTimeFormatter FORMATTER = DateTimeFormatter.ofPattern("HH:mm");
    final LocalTime start;
    final Duration duration;
    final boolean watched;

    WatchPeriod(LocalTime start, Duration duration, boolean watched) {
        this.start = start;
        this.duration = duration;
        this.watched = watched;
    }

    // Takes a string like "A60" and creates a WatchPeriod starting at 00:00
    static WatchPeriod forString(String watchPeriod) {
        return new WatchPeriod(LocalTime.of(0, 0),
                   Duration.ofMinutes(Integer.parseInt(watchPeriod.substring(1))),
                   watchPeriod.startsWith("A"));
    }

    // Returns a new WatchPeriod whose start time is adjusted to begin
    // right after the supplied previous period
    WatchPeriod after(WatchPeriod previous) {
        return new WatchPeriod(previous.start.plus(previous.duration), duration, watched);
    }

    @Override
    public String toString() {
        return "from: "+start.format(FORMATTER)+", to: "+
            start.plus(duration).format(FORMATTER)+", watched: "+watched;
    }
}

Now we can split an input string such as "A60B80A60" into the tokens "A60", "B80", "A60", map these tokens to WatchPeriod objects, and store them in the resulting list, shifting each one so that it starts right after the previous one:

String input = "A60B80A60";
List<WatchPeriod> result = new ArrayList<>();
Pattern.compile("(?=[AB])").splitAsStream(input)
    .map(WatchPeriod::forString)
    .forEachOrdered(wp -> result.add(result.isEmpty() ? wp : 
                          wp.after(result.get(result.size()-1))));
result.forEach(System.out::println);

The output is:

from: 00:00, to: 01:00, watched: true
from: 01:00, to: 02:20, watched: false
from: 02:20, to: 03:20, watched: true
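
The forEachOrdered trick above can be pulled out into a reusable helper. The following is only a minimal sketch of that idea: the class name ScanLeftSketch and the methods scanLeft and endMinutes are hypothetical names invented for illustration and are not part of the JDK. It applies the helper to plain minute counts, so the running totals correspond to the end of each period.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.regex.Pattern;
import java.util.stream.Stream;

class ScanLeftSketch {
    // Hypothetical helper (not a JDK method): element i of the result is the
    // fold of stream elements 0..i under the given operator.
    static <T> List<T> scanLeft(Stream<T> stream, BinaryOperator<T> op) {
        List<T> result = new ArrayList<>();
        // forEachOrdered processes elements one at a time in encounter order,
        // so looking at the last stored element is safe here.
        stream.forEachOrdered(t -> result.add(
                result.isEmpty() ? t : op.apply(result.get(result.size() - 1), t)));
        return result;
    }

    // Splits input such as "A60B80A60" before each 'A' or 'B' and returns the
    // minute (counted from the start) at which each period ends.
    static List<Integer> endMinutes(String input) {
        Stream<Integer> minutes = Pattern.compile("(?=[AB])").splitAsStream(input)
                .map(token -> Integer.parseInt(token.substring(1)));
        return scanLeft(minutes, Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(endMinutes("A60B80A60")); // [60, 140, 200]
    }
}
```

The totals 60, 140 and 200 minutes match the end times 01:00, 02:20 and 03:20 in the output above.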

If you don't mind using a third-party library, my free StreamEx library enhances the Stream API, adding the missing scanLeft operation among other features:

String input = "A60B80A60";
List<WatchPeriod> result = StreamEx.split(input, "(?=[AB])")
        .map(WatchPeriod::forString).scanLeft((prev, next) -> next.after(prev));
result.forEach(System.out::println);

The result is the same.
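
The custom Collector mentioned in the question can also be made to work with the standard API alone. What follows is a rough sketch under stated assumptions, not a definitive implementation: ChainingCollectorSketch, Period, append, chained and parse are hypothetical names, and Period is a trimmed-down stand-in for the WatchPeriod class above. The accumulator looks at the last collected element, and the combiner re-chains the right-hand partial result onto the left-hand one, which should keep the result correct even if the stream is processed in parallel.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collector;

class ChainingCollectorSketch {
    // Trimmed-down stand-in for WatchPeriod (hypothetical, for illustration).
    static final class Period {
        final LocalTime start;
        final Duration duration;
        final boolean watched;

        Period(LocalTime start, Duration duration, boolean watched) {
            this.start = start;
            this.duration = duration;
            this.watched = watched;
        }

        // Takes a token like "A60" and creates a Period starting at 00:00.
        static Period forString(String token) {
            return new Period(LocalTime.MIDNIGHT,
                    Duration.ofMinutes(Long.parseLong(token.substring(1))),
                    token.startsWith("A"));
        }

        LocalTime end() {
            return start.plus(duration);
        }

        // Same period, shifted to start where the previous one ends.
        Period after(Period previous) {
            return new Period(previous.end(), duration, watched);
        }
    }

    // Appends p to the list, shifted to follow the last stored period.
    static void append(List<Period> list, Period p) {
        list.add(list.isEmpty() ? p : p.after(list.get(list.size() - 1)));
    }

    // Collector whose accumulator depends on the previously collected element.
    static Collector<Period, List<Period>, List<Period>> chained() {
        return Collector.of(
                ArrayList::new,
                ChainingCollectorSketch::append,
                // Re-chain every element of the right part after the left part.
                (left, right) -> { right.forEach(p -> append(left, p)); return left; });
    }

    static List<Period> parse(String input) {
        return Pattern.compile("(?=[AB])").splitAsStream(input)
                .map(Period::forString)
                .collect(chained());
    }

    public static void main(String[] args) {
        for (Period p : parse("A60B80A60")) {
            System.out.println(p.start + " -> " + p.end() + ", watched: " + p.watched);
        }
    }
}
```

Note that, like the WatchPeriod model, this sketch uses LocalTime, which wraps around at midnight, so inputs covering more than 24 hours would need LocalDateTime instead.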