shmosel shmosel - 4 months ago 26
Java Question

Can sequential stream operations have side-effects?

I'm trying to stream a limited number of values into a set, but I need to verify that they're new elements before applying the limit. For example:

Set<Integer> destination = ...
Set<Integer> source = ...
source.stream()
.filter(i -> !destination.contains(i))
.limit(10)
.forEach(destination::add);


But the redundant contains() check is bothering me, since add() can both add the element and report whether it's new to the collection. So I was thinking of doing this:

source.stream()
.filter(destination::add)
.limit(10)
.forEach(i -> {}); // no-op terminal operation to force evaluation


Ignoring the hacky terminal operation, there's the problem of using a filter operation with a side-effect, which is generally discouraged. I understand why it would be unsafe to use map() and filter() with side-effects on parallel streams. My question is, would it be acceptable on a sequential stream, as in this case? If not, why not?

Answer

Side effects in a sequential stream aren't a fundamental problem, but the second implementation above is invalid, because the Stream API doesn't guarantee that each element will pass through every stage before the next element is processed.

In the second implementation, more than 10 elements may be added to destination before the limit is applied. Your no-op forEach will only see 10, but you may end up with more in the set.
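To make the ordering concern concrete, here is a small sketch of a pipeline where the filter provably runs on every element before the limit sees any of them. It inserts a sorted() stage, which is not in the question's pipeline, so it does not show that the question's exact code misbehaves on any particular JDK. It only illustrates that the order of stage execution is an implementation matter. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

class BarrierDemo {
    // Hypothetical helper: returns how many elements ended up in destination.
    static int addThroughSortedPipeline(Set<Integer> source, Set<Integer> destination, int limit) {
        int before = destination.size();
        source.stream()
              .filter(destination::add) // side effect: adds while filtering
              .sorted()                 // stateful stage: buffers all upstream elements first
              .limit(limit)             // only applied after the filter has already run
              .forEach(i -> {});        // no-op terminal operation
        return destination.size() - before;
    }

    public static void main(String[] args) {
        Set<Integer> source = new LinkedHashSet<>(Arrays.asList(5, 4, 3, 2, 1));
        Set<Integer> destination = new LinkedHashSet<>();
        // limit is 2, but all 5 elements are added
        System.out.println(addThroughSortedPipeline(source, destination, 2));
    }
}
```

The no-op forEach receives only two elements here, yet the set grows by five.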

In addition to streams, Java has looping constructs like for and while that make it easy to express things like this.
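For instance, a plain loop might look like the sketch below. The class name AddUpToLimit and the helper addNew are made up for illustration, and the limit is passed as a parameter rather than hard-coded to 10.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

class AddUpToLimit {
    // Hypothetical helper: adds at most `limit` new elements from source
    // to destination and returns how many were actually added.
    static <T> int addNew(Iterable<? extends T> source, Set<T> destination, int limit) {
        int added = 0;
        for (T item : source) {
            if (added >= limit) {
                break; // stop as soon as the limit is reached
            }
            if (destination.add(item)) { // add() reports whether the element was new
                added++;
            }
        }
        return added;
    }

    public static void main(String[] args) {
        Set<Integer> destination = new LinkedHashSet<>(Arrays.asList(1, 2, 3));
        Set<Integer> source = new LinkedHashSet<>(Arrays.asList(2, 3, 4, 5, 6));
        int added = addNew(source, destination, 2);
        // prints: 2 added, destination = [1, 2, 3, 4, 5]
        System.out.println(added + " added, destination = " + destination);
    }
}
```

This uses the return value of add() directly, so there is no separate contains() check, and the limit counts only elements that were actually new.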

If you must use streams, you can do it like this:

int maxSize = destination.size() + 10;
source.stream().allMatch(x -> destination.size() < maxSize && (destination.add(x) || true));

The allMatch will stop the iteration as soon as the predicate returns false.
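A self-contained version of this approach could look like the following sketch. The class name AllMatchLimit, the helper addNew and the limit parameter are hypothetical. It assumes a sequential stream, and note that the predicate is stateful, which the Stream documentation generally discourages.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

class AllMatchLimit {
    // Hypothetical helper: grows destination by at most `limit` elements.
    static <T> void addNew(Set<T> source, Set<T> destination, int limit) {
        int maxSize = destination.size() + limit;
        // The predicate returns true while there is still room (adding x as a
        // side effect) and false once destination has reached maxSize, which
        // makes allMatch short-circuit. The boolean result is ignored.
        source.stream().allMatch(x -> destination.size() < maxSize && (destination.add(x) || true));
    }

    public static void main(String[] args) {
        Set<Integer> destination = new LinkedHashSet<>(Arrays.asList(1, 2, 3));
        Set<Integer> source = new LinkedHashSet<>(Arrays.asList(2, 3, 4, 5, 6));
        addNew(source, destination, 2);
        // prints: [1, 2, 3, 4, 5]
        System.out.println(destination);
    }
}
```

The size check runs before add(), so once the set is full the next element is inspected but never added.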