The collect operation would produce unordered output if the Collector you passed it had different characteristics, that is, if the CONCURRENT and UNORDERED flags were set (see Collector.characteristics()).
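To check which flags a given collector reports, you can inspect the set returned by characteristics(). The following is a minimal sketch; the class name CharacteristicsSketch and the helper isConcurrentAndUnordered are made up for illustration, and the flags printed for the built-in collectors may vary between JDK versions.

```java
import java.util.Set;
import java.util.stream.Collector;
import java.util.stream.Collector.Characteristics;
import java.util.stream.Collectors;

public class CharacteristicsSketch {
    // Hypothetical helper: true when the collector advertises both flags.
    static boolean isConcurrentAndUnordered(Collector<?, ?, ?> collector) {
        Set<Characteristics> flags = collector.characteristics();
        return flags.contains(Characteristics.CONCURRENT)
            && flags.contains(Characteristics.UNORDERED);
    }

    public static void main(String[] args) {
        // Print the flags reported by two built-in collectors.
        System.out.println(Collectors.toList().characteristics());
        System.out.println(
            Collectors.<String, Integer>groupingByConcurrent(String::length).characteristics());
        // Ask the helper about the collector discussed in this answer.
        System.out.println(isConcurrentAndUnordered(Collectors.toList()));
    }
}
```

groupingByConcurrent is documented as a concurrent and unordered collector, so it serves as a contrast to toList() here.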
Under the hood, Collectors.toList() constructs a Collector roughly equivalent to this:
    Collector.of(
        // Supplier of accumulators
        ArrayList::new,
        // Accumulation operation
        List::add,
        // Combine accumulators
        (left, right) -> {
            left.addAll(right);
            return left;
        }
    )
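For anyone who wants to run the collector above, here is a self-contained sketch that wraps it in a method and applies it to a parallel stream. The class and method names (ToListSketch, toListSketch, collectInParallel) and the 0 to 9 input range are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.IntStream;

public class ToListSketch {
    // Hand-rolled collector, roughly what Collectors.toList() builds.
    static <T> Collector<T, List<T>, List<T>> toListSketch() {
        return Collector.of(
            ArrayList::new,            // supplier of accumulators
            List::add,                 // accumulation operation
            (left, right) -> {         // combine accumulators
                left.addAll(right);
                return left;
            });
    }

    // Collects 0..9 from a parallel, ordered stream.
    static List<Integer> collectInParallel() {
        return IntStream.range(0, 10)
            .parallel()
            .boxed()
            .collect(toListSketch());
    }

    public static void main(String[] args) {
        // The source is ordered and the collector is not UNORDERED,
        // so the result keeps encounter order.
        System.out.println(collectInParallel());
    }
}
```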
A bit of logging reveals the lengths to which the collect operation goes to maintain thread safety and stream order:
    Collector.of(
        () -> {
            System.out.printf("%s supplying\n", Thread.currentThread().getName());
            return new ArrayList<>();
        },
        (l, o) -> {
            System.out.printf("%s accumulating %s to %s\n", Thread.currentThread().getName(), o, l);
            l.add(o);
        },
        (l1, l2) -> {
            System.out.printf("%s combining %s & %s\n", Thread.currentThread().getName(), l1, l2);
            l1.addAll(l2);
            return l1;
        }
    )
Logs:
ForkJoinPool-1-worker-1 supplying
ForkJoinPool-1-worker-0 supplying
ForkJoinPool-1-worker-0 accumulating 2 to []
ForkJoinPool-1-worker-1 accumulating 6 to []
ForkJoinPool-1-worker-0 supplying
ForkJoinPool-1-worker-0 accumulating 4 to []
ForkJoinPool-1-worker-1 supplying
ForkJoinPool-1-worker-1 accumulating 5 to []
ForkJoinPool-1-worker-0 supplying
ForkJoinPool-1-worker-0 accumulating 3 to []
ForkJoinPool-1-worker-0 combining [3] & [4]
ForkJoinPool-1-worker-0 combining [2] & [3, 4]
ForkJoinPool-1-worker-1 combining [5] & [6]
ForkJoinPool-1-worker-0 supplying
ForkJoinPool-1-worker-1 supplying
ForkJoinPool-1-worker-0 accumulating 1 to []
ForkJoinPool-1-worker-1 accumulating 8 to []
ForkJoinPool-1-worker-0 supplying
ForkJoinPool-1-worker-1 supplying
ForkJoinPool-1-worker-1 accumulating 9 to []
ForkJoinPool-1-worker-1 combining [8] & [9]
ForkJoinPool-1-worker-1 supplying
ForkJoinPool-1-worker-1 accumulating 7 to []
ForkJoinPool-1-worker-1 combining [7] & [8, 9]
ForkJoinPool-1-worker-1 combining [5, 6] & [7, 8, 9]
ForkJoinPool-1-worker-0 accumulating 0 to []
ForkJoinPool-1-worker-0 combining [0] & [1]
ForkJoinPool-1-worker-0 combining [0, 1] & [2, 3, 4]
ForkJoinPool-1-worker-0 combining [0, 1, 2, 3, 4] & [5, 6, 7, 8, 9]
You can see that each read from the stream is written to a new accumulator, and that they are carefully combined to maintain order.
If we set the CONCURRENT and UNORDERED characteristic flags, the collect method is free to take shortcuts: only one accumulator is allocated, and ordered combination is unnecessary.
Using:
    Collector.of(
        () -> {
            System.out.printf("%s supplying\n", Thread.currentThread().getName());
            return Collections.synchronizedList(new ArrayList<>());
        },
        (l, o) -> {
            System.out.printf("%s accumulating %s to %s\n", Thread.currentThread().getName(), o, l);
            l.add(o);
        },
        (l1, l2) -> {
            System.out.printf("%s combining %s & %s\n", Thread.currentThread().getName(), l1, l2);
            l1.addAll(l2);
            return l1;
        },
        Characteristics.CONCURRENT,
        Characteristics.UNORDERED
    )
Logs:
ForkJoinPool-1-worker-1 supplying
ForkJoinPool-1-worker-1 accumulating 6 to []
ForkJoinPool-1-worker-0 accumulating 2 to [6]
ForkJoinPool-1-worker-1 accumulating 5 to [6, 2]
ForkJoinPool-1-worker-0 accumulating 4 to [6, 2, 5]
ForkJoinPool-1-worker-0 accumulating 3 to [6, 2, 5, 4]
ForkJoinPool-1-worker-0 accumulating 1 to [6, 2, 5, 4, 3]
ForkJoinPool-1-worker-0 accumulating 0 to [6, 2, 5, 4, 3, 1]
ForkJoinPool-1-worker-1 accumulating 8 to [6, 2, 5, 4, 3, 1, 0]
ForkJoinPool-1-worker-0 accumulating 7 to [6, 2, 5, 4, 3, 1, 0, 8]
ForkJoinPool-1-worker-1 accumulating 9 to [6, 2, 5, 4, 3, 1, 0, 8, 7]
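A runnable version of the concurrent, unordered collector (without the logging) might look like the sketch below. The class name ConcurrentCollectSketch, the method collectConcurrently, and the 0 to 9 input are assumptions for illustration. The order of the collected list can differ from run to run, so the sketch only relies on the contents being complete.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collector.Characteristics;
import java.util.stream.IntStream;

public class ConcurrentCollectSketch {
    static List<Integer> collectConcurrently() {
        Collector<Integer, List<Integer>, List<Integer>> collector = Collector.of(
            // One shared accumulator, so it must be thread-safe.
            () -> Collections.synchronizedList(new ArrayList<>()),
            List::add,
            // A combiner is still required by Collector.of.
            (l1, l2) -> {
                l1.addAll(l2);
                return l1;
            },
            Characteristics.CONCURRENT,
            Characteristics.UNORDERED);
        return IntStream.range(0, 10).parallel().boxed().collect(collector);
    }

    public static void main(String[] args) {
        List<Integer> result = collectConcurrently();
        // Order as collected: unspecified, may change between runs.
        System.out.println("as collected: " + result);
        // Sorting a copy shows that no element was lost.
        List<Integer> sorted = new ArrayList<>(result);
        Collections.sort(sorted);
        System.out.println("sorted copy:  " + sorted);
    }
}
```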
Best Answer
There are two different kinds of "ordering" going on here, which makes the discussion confusing.
One kind is encounter order, which is defined in the streams documentation. A good way to think about this is the spatial or left-to-right order of elements in the source collection. If the source is a List, think of the earlier elements as being to the left of later elements.
There is also processing or temporal order, which isn't defined in the documentation, but which is the time order in which elements are processed by different threads. If the elements of a list are being processed in parallel by different threads, a thread might process the rightmost element in the list before the leftmost element. But the next time it might not.
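The two kinds of order can be observed side by side. The sketch below is only an illustration: the class name EncounterVsProcessingOrder and the method collectRecording are hypothetical, and peek is used purely to record the temporal order in which elements pass through the pipeline.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.stream.Collectors;

public class EncounterVsProcessingOrder {
    // Collects the source in parallel and records, in the given queue,
    // the temporal order in which elements were actually processed.
    static List<Integer> collectRecording(List<Integer> source, Queue<Integer> processed) {
        return source.parallelStream()
            .peek(processed::add)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> source = Arrays.asList(0, 1, 2, 3, 4, 5, 6, 7, 8, 9);
        Queue<Integer> processed = new ConcurrentLinkedQueue<>();
        List<Integer> collected = collectRecording(source, processed);
        // Temporal order: depends on thread scheduling, may change between runs.
        System.out.println("processing order: " + processed);
        // Encounter order: same left-to-right order as the source list.
        System.out.println("encounter order:  " + collected);
    }
}
```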
Even when computations are done in parallel, most Collectors and some terminal operations are carefully arranged so that they preserve encounter order from the source through to the destination, independently of the temporal order in which different threads might process each element.
Note that the forEach terminal operation does not preserve encounter order. Instead, it's run by whatever thread happens to produce the next result. If you want something like forEach that preserves encounter order, use forEachOrdered instead.
See also the Lambda FAQ for further discussion about ordering issues.
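As a rough illustration of the difference between forEach and forEachOrdered on a parallel stream, consider the sketch below. The class and method names are made up for this example, and a synchronized list is used as the sink in both cases so that concurrent writes from forEach are safe.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ForEachOrderedSketch {
    // forEach: elements arrive in whatever order the threads produce them.
    static List<Integer> viaForEach(List<Integer> source) {
        List<Integer> out = Collections.synchronizedList(new ArrayList<>());
        source.parallelStream().forEach(out::add);
        return out;
    }

    // forEachOrdered: elements arrive in the encounter order of the source.
    static List<Integer> viaForEachOrdered(List<Integer> source) {
        List<Integer> out = Collections.synchronizedList(new ArrayList<>());
        source.parallelStream().forEachOrdered(out::add);
        return out;
    }

    public static void main(String[] args) {
        List<Integer> source = Arrays.asList(0, 1, 2, 3, 4, 5, 6, 7, 8, 9);
        // Order may vary from run to run.
        System.out.println("forEach:        " + viaForEach(source));
        // Matches the order of the source list.
        System.out.println("forEachOrdered: " + viaForEachOrdered(source));
    }
}
```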