membersound - 9 months ago
Java Question

How to ensure order of processing in java8 streams?

I want to process lists inside an

Java object. I have to ensure that all elements are processed in the order in which I received them.

Should I therefore call sequential() on each stream I use?

Or is it sufficient to just use the stream as long as I don't use parallelism?

Answer

You are asking the wrong question. You are asking about sequential vs. parallel whereas you want to process items in order, so you have to ask about ordering. If you have an ordered stream and perform operations which guarantee to maintain the order, it doesn’t matter whether the stream is processed in parallel or sequentially; the implementation will maintain the order.

The ordered property is distinct from parallel vs. sequential. E.g. if you call stream() on a HashSet the stream will be unordered, while calling stream() on a List returns an ordered stream. Note that you can call unordered() to release the ordering contract and potentially increase performance. Once the stream has no ordering, there is no way to reestablish the original ordering. (The only way to turn an unordered stream into an ordered one is to call sorted(); however, the resulting order is not necessarily the original order.)
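To make the distinction concrete, here is a minimal sketch that inspects the ORDERED characteristic of a few pipelines. The class name OrderedFlagDemo and the helper isOrdered are made up for illustration; the check relies on Stream.spliterator() and Spliterator.hasCharacteristics(), and since spliterator() is a terminal operation each stream can only be inspected once.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.Spliterator;
import java.util.stream.Stream;

class OrderedFlagDemo {
    // Hypothetical helper: reports whether the pipeline still carries the
    // ORDERED characteristic. spliterator() is terminal, so the stream
    // passed in is consumed by this call.
    public static boolean isOrdered(Stream<?> stream) {
        return stream.spliterator().hasCharacteristics(Spliterator.ORDERED);
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("c", "a", "b");
        Set<String> set = new HashSet<>(list);

        System.out.println("List source:       " + isOrdered(list.stream()));
        System.out.println("HashSet source:    " + isOrdered(set.stream()));
        // unordered() releases the ordering contract of an ordered stream
        System.out.println("List, unordered(): " + isOrdered(list.stream().unordered()));
        // sorted() makes the stream ordered again, but by sort order
        System.out.println("HashSet, sorted(): " + isOrdered(set.stream().sorted()));
    }
}
```

With the standard collections, the List and the sorted pipelines should report an ordering while the HashSet and the unordered() pipelines should not.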

See also the “Ordering” section of the package documentation.

In order to ensure maintenance of ordering throughout an entire stream operation, you have to study the documentation of the stream’s source, all intermediate operations and the terminal operation for whether they maintain the order or not (or whether the source has an ordering in the first place).

This can be very subtle, e.g. Stream.iterate(T,UnaryOperator) creates an ordered stream while Stream.generate(Supplier) creates an unordered stream. Note that you also made a common mistake in your question as forEach does not maintain the ordering. You have to use forEachOrdered if you want to process the stream’s elements in a guaranteed order.
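The difference between the two factory methods can be sketched as follows. The class and method names are hypothetical and the lambdas are placeholders; the point is only that the ordered iterate source keeps its encounter order even when the pipeline runs in parallel, while the generate source carries no ORDERED characteristic to begin with.

```java
import java.util.List;
import java.util.Spliterator;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class IterateVsGenerateDemo {
    // Stream.iterate is an ordered source, so limit() and collect() must
    // respect the encounter order 1, 2, 3, ... even in parallel mode.
    public static List<Integer> firstFiveInParallel() {
        return Stream.iterate(1, i -> i + 1)
                     .parallel()
                     .limit(5)
                     .collect(Collectors.toList());
    }

    // Stream.generate is an unordered source; its spliterator does not
    // report the ORDERED characteristic.
    public static boolean generateIsOrdered() {
        return Stream.generate(() -> 0)
                     .spliterator()
                     .hasCharacteristics(Spliterator.ORDERED);
    }

    public static void main(String[] args) {
        System.out.println(firstFiveInParallel());
        System.out.println(generateIsOrdered());
    }
}
```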

So if your list in your question is indeed a java.util.List, its stream() method will return an ordered stream and filter will not change the ordering. So if you call list.stream().filter().forEachOrdered(), all elements will be processed sequentially in order, whereas for list.parallelStream().filter().forEachOrdered() the elements might be processed in parallel (e.g. by the filter) but the terminal action will still be called in order (which obviously will reduce the benefit of parallel execution).
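A small sketch of the parallel variant, assuming a list of integers and an arbitrary even-number filter (both are placeholders, as is the class name). The filter may run on several threads, but forEachOrdered invokes the action one element at a time in encounter order, so a plain ArrayList is enough to record what was seen.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ForEachOrderedDemo {
    public static List<Integer> evensInEncounterOrder(List<Integer> input) {
        List<Integer> seen = new ArrayList<>();
        input.parallelStream()
             .filter(n -> n % 2 == 0)      // may be evaluated on several threads
             .forEachOrdered(seen::add);   // called one at a time, in encounter order
        return seen;
    }

    public static void main(String[] args) {
        List<Integer> input = IntStream.rangeClosed(1, 20)
                                       .boxed()
                                       .collect(Collectors.toList());
        System.out.println(evensInEncounterOrder(input));
    }
}
```

Replacing forEachOrdered with forEach here would remove the ordering guarantee and would also make the unsynchronized ArrayList unsafe, since forEach may call the action concurrently on a parallel stream.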

If you, for example, use an operation like

List<…> result=inputList.parallelStream().map(…).filter(…).collect(Collectors.toList());

the entire operation might benefit from parallel execution but the resulting list will always be in the right order, regardless of whether you use a parallel or sequential stream.
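A runnable sketch of such a pipeline, with a placeholder map and filter (squaring and keeping odd values) that are not taken from the question, as is the class name:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class ParallelCollectDemo {
    // Runs the same pipeline either sequentially or in parallel.
    // The map and filter steps are arbitrary placeholders.
    public static List<Integer> oddSquares(List<Integer> input, boolean parallel) {
        Stream<Integer> stream = parallel ? input.parallelStream() : input.stream();
        return stream.map(n -> n * n)
                     .filter(n -> n % 2 != 0)
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> input = IntStream.rangeClosed(1, 10)
                                       .boxed()
                                       .collect(Collectors.toList());
        // Both calls produce the same list, in the order of the input.
        System.out.println(oddSquares(input, false));
        System.out.println(oddSquares(input, true));
    }
}
```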