Java 8 Collector Guide
Collector Overview:
Collector represents a special mutable reduction operation. Elements are incorporated by updating the state of a mutable container rather than by replacing the intermediate result. This is desirable behavior when we want to reduce a Stream into some sort of Collection. It would be very inefficient to create a new Collection object during every step of the reduction (as is typical in reduction operations), so we can use a Collector to avoid that.
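To make the cost difference concrete, here is a minimal sketch that builds the same list twice: once with reduce, copying the container at every step, and once with a Collector. The class and method names (ReduceVersusCollect, withReduce, withCollect) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ReduceVersusCollect {

    // Reduction without mutation: every step allocates a fresh list.
    static List<String> withReduce(List<String> words) {
        List<String> empty = new ArrayList<>(); // identity value, never mutated
        return words.stream().reduce(
                empty,
                (partial, word) -> {
                    List<String> copy = new ArrayList<>(partial); // new container per element
                    copy.add(word);
                    return copy;
                },
                (left, right) -> {
                    List<String> merged = new ArrayList<>(left); // another copy when merging
                    merged.addAll(right);
                    return merged;
                });
    }

    // Mutable reduction: one container per stream section, updated in place.
    static List<String> withCollect(List<String> words) {
        return words.stream().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "c");
        System.out.println(withReduce(words));  // [a, b, c]
        System.out.println(withCollect(words)); // [a, b, c]
    }
}
```

Both calls produce the same list, but the reduce version allocates a new ArrayList for every element it sees.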
We’ll dive deeper into the finer points of collection vs reduction in a separate tutorial. For now, let’s take a look at the pieces that make up a Collector.
Container Supplier
The container supplier is responsible for creating a new mutable container for the result. It has the following abstract method signature:
Supplier<A> supplier();
Accumulator
The accumulator incorporates data elements into the result container. It has the following abstract method signature:
BiConsumer<A, T> accumulator();
Combiner
The combiner is used (in the Stream framework) during parallel execution. Separate threads process separate sections of the Stream, each accumulating its partial result into its own mutable container. Those containers eventually need to be combined into one single result, hence the combiner. It has the following abstract method signature:
BinaryOperator<A> combiner();
Finisher
The finisher performs an optional final transformation. Collectors may set (and the majority do) the IDENTITY_FINISH characteristic, in which case the finishing transformation is an identity function with an unchecked cast from A to R. It has the following abstract method signature:
Function<A, R> finisher();
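The four pieces can be seen side by side in a small sketch that assembles a Collector with the Collector.of factory. The class FourPieces and the methods commaJoining and join are hypothetical names chosen for this example; StringJoiner serves as the mutable container.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;

public class FourPieces {

    // T = String (element), A = StringJoiner (container), R = String (result).
    static Collector<String, StringJoiner, String> commaJoining() {
        Supplier<StringJoiner> supplier = () -> new StringJoiner(", "); // new empty container
        BiConsumer<StringJoiner, String> accumulator = StringJoiner::add; // fold one element in
        BinaryOperator<StringJoiner> combiner = StringJoiner::merge;      // merge two containers
        Function<StringJoiner, String> finisher = StringJoiner::toString; // container -> result
        return Collector.of(supplier, accumulator, combiner, finisher);
    }

    static String join(List<String> words) {
        return words.stream().collect(commaJoining());
    }

    public static void main(String[] args) {
        System.out.println(join(Arrays.asList("x", "y", "z"))); // x, y, z
    }
}
```

Because the finisher here does real work (StringJoiner to String), this collector does not declare IDENTITY_FINISH.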
An Example Visualized
Let’s try to visualize an example Stream collection process to help understand the different components. Also, be sure to note the differences between serial and parallel execution.
Serial Collection:
Parallel Collection:
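The two modes can also be traced by hand. The sketch below drives a supplier, accumulator, and combiner directly: the serial path uses a single container, while the split path fills two containers and merges them. It runs on one thread and only imitates how a parallel stream divides the work; the names SerialVersusParallel, serial, and splitInTwo are invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Supplier;

public class SerialVersusParallel {

    static final Supplier<List<Integer>> SUPPLIER = ArrayList::new;
    static final BiConsumer<List<Integer>, Integer> ACCUMULATOR = List::add;
    static final BinaryOperator<List<Integer>> COMBINER = (left, right) -> {
        left.addAll(right);
        return left;
    };

    // Serial collection: one container, every element accumulated in order.
    static List<Integer> serial(List<Integer> input) {
        List<Integer> container = SUPPLIER.get();
        for (Integer element : input) {
            ACCUMULATOR.accept(container, element);
        }
        return container;
    }

    // Imitated parallel collection: each half gets its own container,
    // and the combiner merges the two partial results.
    static List<Integer> splitInTwo(List<Integer> input) {
        int middle = input.size() / 2;
        List<Integer> left = serial(input.subList(0, middle));
        List<Integer> right = serial(input.subList(middle, input.size()));
        return COMBINER.apply(left, right);
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(serial(input));     // [1, 2, 3, 4, 5]
        System.out.println(splitInTwo(input)); // [1, 2, 3, 4, 5]
    }
}
```

Notice that the combiner is never called on the serial path; it only matters once the input has been split.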
What Are The Rules?
To ensure that sequential and parallel executions produce equivalent results, the collector functions must satisfy an identity and an associativity constraints. – JavaDoc
Essentially, there are two things that must hold true in order for a Collector to perform equivalently during parallel and sequential execution.
The Identity Constraint
The identity constraint says that for any partially accumulated result, combining it with an empty result container must produce an equivalent result. That is, for a partially accumulated result a that is the result of any series of accumulator and combiner invocations, a must be equivalent to
combiner.apply(a, supplier.get())
– JavaDoc
Essentially, combining a partial result with an empty result should be the same as passing the partial result through the identity function. When combining two result containers, only the specific contents of the containers should affect the result.
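As a rough check of the identity constraint, the sketch below accumulates the same elements twice, combines one of the containers with a fresh empty one, and compares the finished results with equals. The helper identityHolds and the deliberately broken preloaded collector are hypothetical and exist only to illustrate the rule.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class IdentityCheck {

    // Accumulates all elements into a single fresh container.
    static <T, A> A accumulate(Collector<T, A, ?> collector, List<T> elements) {
        A container = collector.supplier().get();
        for (T element : elements) {
            collector.accumulator().accept(container, element);
        }
        return container;
    }

    // True when combining a partial result with an empty container changes nothing.
    static <T, A, R> boolean identityHolds(Collector<T, A, R> collector, List<T> elements) {
        A plain = accumulate(collector, elements);
        A combined = collector.combiner().apply(
                accumulate(collector, elements), collector.supplier().get());
        return collector.finisher().apply(plain).equals(collector.finisher().apply(combined));
    }

    // Broken on purpose: the "empty" container already holds an element.
    static Collector<Integer, List<Integer>, List<Integer>> preloaded() {
        return Collector.of(
                () -> new ArrayList<>(Arrays.asList(0)),
                List::add,
                (left, right) -> { left.addAll(right); return left; });
    }

    public static void main(String[] args) {
        List<Integer> elements = Arrays.asList(1, 2, 3);
        System.out.println(identityHolds(Collectors.toList(), elements)); // true
        System.out.println(identityHolds(preloaded(), elements));         // false
    }
}
```

The preloaded collector fails because its supplier does not return a truly empty container, so every combine step smuggles in an extra 0.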
The Associativity Constraint
The associativity constraint says that splitting the computation must produce an equivalent result. That is, for any input elements t1 and t2, the results r1 and r2 in the computation below must be equivalent:
A a1 = supplier.get();
accumulator.accept(a1, t1);
accumulator.accept(a1, t2);
R r1 = finisher.apply(a1); // result without splitting

A a2 = supplier.get();
accumulator.accept(a2, t1);
A a3 = supplier.get();
accumulator.accept(a3, t2);
R r2 = finisher.apply(combiner.apply(a2, a3)); // result with splitting
– JavaDoc
The associativity constraint is a little more straightforward. A Collector is considered associative if splitting the input (and thus grouping the elements into separate containers before combining them) produces the same result as processing it in one piece.
I think most are familiar with the concept of associativity in algebra: (a + b) + c = a + (b + c).
If you stretch your mind a little bit you can look at associativity in collection the same way.
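The JavaDoc computation can be turned into a small runnable check. This is a sketch only: associativityHolds compares r1 and r2 with equals, which is an assumption about what "equivalent" means for the result type, and droppingRightHalf is a made-up collector that breaks the rule on purpose.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class AssociativityCheck {

    static <T, A, R> boolean associativityHolds(Collector<T, A, R> collector, T t1, T t2) {
        // Result without splitting: both elements go into one container.
        A a1 = collector.supplier().get();
        collector.accumulator().accept(a1, t1);
        collector.accumulator().accept(a1, t2);
        R r1 = collector.finisher().apply(a1);

        // Result with splitting: one container per element, then combine.
        A a2 = collector.supplier().get();
        collector.accumulator().accept(a2, t1);
        A a3 = collector.supplier().get();
        collector.accumulator().accept(a3, t2);
        R r2 = collector.finisher().apply(collector.combiner().apply(a2, a3));

        return r1.equals(r2);
    }

    // Broken on purpose: the combiner silently discards the right-hand container.
    static Collector<String, List<String>, List<String>> droppingRightHalf() {
        return Collector.of(ArrayList::new, List::add, (left, right) -> left);
    }

    public static void main(String[] args) {
        System.out.println(associativityHolds(Collectors.toList(), "a", "b")); // true
        System.out.println(associativityHolds(droppingRightHalf(), "a", "b")); // false
    }
}
```

A collector like droppingRightHalf would appear to work on a sequential stream and lose elements on a parallel one, which is exactly the kind of bug these constraints are meant to rule out.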
A Collection of Collectors
Now that we have a feel for the different components of a Collector, let’s take a look at just a few of the JDK supplied Collectors.
JDK Convenience Collectors
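Here is a short sampler of a few factory methods from java.util.stream.Collectors. It is not an exhaustive list, and the class name, word list, and helper method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class ConvenienceCollectors {

    static final List<String> WORDS =
            Arrays.asList("apple", "avocado", "banana", "blueberry", "cherry");

    // toList: gather every element into a List.
    static List<String> asList() {
        return WORDS.stream().collect(Collectors.toList());
    }

    // toSet: gather distinct values into a Set.
    static Set<Integer> distinctLengths() {
        return WORDS.stream().map(String::length).collect(Collectors.toSet());
    }

    // joining: concatenate with a delimiter, prefix, and suffix.
    static String joined() {
        return WORDS.stream().collect(Collectors.joining(", ", "[", "]"));
    }

    // groupingBy with a downstream collector: count the words per first letter.
    static Map<Character, Long> countByInitial() {
        return WORDS.stream()
                .collect(Collectors.groupingBy(word -> word.charAt(0), Collectors.counting()));
    }

    // partitioningBy: split into two groups by a predicate.
    static Map<Boolean, List<String>> longerThanSix() {
        return WORDS.stream().collect(Collectors.partitioningBy(word -> word.length() > 6));
    }

    public static void main(String[] args) {
        System.out.println(asList());
        System.out.println(distinctLengths());
        System.out.println(joined()); // [apple, avocado, banana, blueberry, cherry]
        System.out.println(countByInitial());
        System.out.println(longerThanSix());
    }
}
```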
A Special Case: Collecting into Maps.
In order to collect into a Map, we need to upgrade our accumulator into a higher-order function composed of three other functions. These “sub” functions are:
Key Mapper
The key mapper transforms each element from the stream into a key for the Map being collected into.
Value Mapper
The value mapper transforms each element from the stream into a value for the Map being collected into.
Merger
Any collisions (when two elements produce the same key) are handled by the merger. The predefined Map collectors that take no merge function, such as the two-argument Collectors.toMap, throw an IllegalStateException on any duplicate key, but you can easily supply your own merge function if the desired behavior is more complex.
All Composed Together
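A minimal sketch of the three functions working together through the three-argument Collectors.toMap. The class MapCollecting, its methods, and the "|" separator are arbitrary choices for this example; the second method shows the two-argument variant failing on a duplicate key.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

public class MapCollecting {

    static Map<Character, String> byInitial(List<String> words) {
        Function<String, Character> keyMapper = word -> word.charAt(0); // element -> key
        Function<String, String> valueMapper = String::toUpperCase;     // element -> value
        BinaryOperator<String> merger =
                (first, second) -> first + "|" + second;                // resolves key collisions
        return words.stream().collect(Collectors.toMap(keyMapper, valueMapper, merger));
    }

    // Without a merger, a duplicate key makes toMap throw IllegalStateException.
    static boolean failsWithoutMerger(List<String> words) {
        try {
            words.stream().collect(Collectors.toMap(word -> word.charAt(0), word -> word));
            return false;
        } catch (IllegalStateException duplicateKey) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "avocado", "banana");
        System.out.println(byInitial(words).get('a'));  // APPLE|AVOCADO
        System.out.println(byInitial(words).get('b'));  // BANANA
        System.out.println(failsWithoutMerger(words));  // true
    }
}
```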
Build Your Own
What if none of the supplied Collectors meet our needs? In that case, implementing our own should be no problem! Let’s create a Collector similar to Collectors.toList, but that applies a finishing step of copying the mutable result container into an ImmutableList.
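// ImmutableList is assumed to be Guava's; Guava must be on the classpath.
import com.google.common.collect.ImmutableList;

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;

public class ImmutableListCollector<T> implements Collector<T, List<T>, ImmutableList<T>> {
    // Creates the mutable container that elements are accumulated into.
    @Override
    public Supplier<List<T>> supplier() {
        return ArrayList::new;
    }
    // Adds a single element to the container.
    @Override
    public BiConsumer<List<T>, T> accumulator() {
        return List::add;
    }
    // Merges two partial containers (used during parallel execution).
    @Override
    public BinaryOperator<List<T>> combiner() {
        return (l1, l2) -> {
            l1.addAll(l2);
            return l1;
        };
    }
    // Copies the mutable container into the immutable result.
    @Override
    public Function<List<T>, ImmutableList<T>> finisher() {
        return ImmutableList::copyOf;
    }
    // No characteristics: the finisher is not an identity function.
    @Override
    public Set<Characteristics> characteristics() {
        return Collections.emptySet();
    }
    // Static factory, mirroring the style of Collectors.toList().
    public static <T> ImmutableListCollector<T> toImmutableList() {
        return new ImmutableListCollector<>();
    }
}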
Try it Yourself!
Here is a sample Stream collection using a simple Collector implementation. Play around with both of them, and run it to see the results!
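If you want something to paste straight into an editor, here is one possible starting point. It is a JDK-only sketch, so it swaps Guava's ImmutableList for Collections.unmodifiableList as the finisher; the class TryItYourself and the methods toUnmodifiableList and squares are names invented for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Stream;

public class TryItYourself {

    // A simple Collector: accumulate into an ArrayList, then wrap it read-only.
    static <T> Collector<T, List<T>, List<T>> toUnmodifiableList() {
        return Collector.of(
                ArrayList::new,                                         // supplier
                List::add,                                              // accumulator
                (left, right) -> { left.addAll(right); return left; },  // combiner
                Collections::unmodifiableList);                         // finisher
    }

    // A sample Stream collection, run either sequentially or in parallel.
    static List<Integer> squares(boolean parallel) {
        Stream<Integer> numbers = Stream.of(1, 2, 3, 4, 5);
        if (parallel) {
            numbers = numbers.parallel();
        }
        return numbers.map(n -> n * n).collect(toUnmodifiableList());
    }

    public static void main(String[] args) {
        System.out.println(squares(false)); // [1, 4, 9, 16, 25]
        System.out.println(squares(true));  // [1, 4, 9, 16, 25]
    }
}
```

Try changing the combiner or the supplier and see which of the two runs starts to misbehave.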
Resources / Further Reading
- https://docs.oracle.com/javase/8/docs/api/java/util/stream/Collector.html
- https://docs.oracle.com/javase/8/docs/api/java/util/stream/Collector.Characteristics.html
- https://apprize.best/javascript/lambda/6.html
- https://www.baeldung.com/java-8-collectors
- https://dkeenan.com/Lambda/
- https://www.amazon.com/Mock-Mocking-Bird-Including-Combinatory-ebook/dp/B00A1P096Y/ref=tmm_kin_swatch_0?_encoding=UTF8&qid=&sr=
Hit Me Up
- Twitter -> https://twitter.com/TheJavaImposter
- Linkedin -> https://www.linkedin.com/in/christopherjcooke/
- Medium -> https://medium.com/@christopher.jamescooke