Introduction

There are several ways to reduce a Stream, as a sequence of input elements, into a single summary result. One of them is to use an implementation of the Collector interface with the Stream.collect(collector) method. It’s possible to implement this interface explicitly, but it’s better to start by studying its predefined implementations from the Collectors class.

Classification of predefined collectors

There are 44 public static factory methods in the Collectors class (as of Java 12) that return predefined implementations of the Collector interface. To understand them better, it’s rational to divide them into categories, for example:

  • collectors to collections
  • downstream-designed collectors
  • collectors to maps
  • other collectors

It’s reasonable to use static imports from the Collectors class to make the source code more readable.
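
For example, the code snippets in this article assume static imports like the following ones (shown here just as an illustration):

import static java.util.stream.Collectors.toList;
import static java.util.stream.Collectors.toSet;
// ...or all factory methods at once:
// import static java.util.stream.Collectors.*;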

Collectors to collections

Collectors that reduce input elements into collections are the simplest. They allow collecting streams into a List, a Set, or a specific Collection.

Regular collectors to collections

To collect a Stream to a List it’s possible to use a collector from the toList method. There are no guarantees about the type, mutability, serializability, or thread-safety of the returned List.

List<Integer> list = Stream.of(1, 2, 3)
       .collect(toList());

assertThat(list)
       .hasSize(3)
       .containsOnly(1, 2, 3);

To collect a Stream to a Set it’s possible to use a collector from the toSet method. There are no guarantees about the type, mutability, serializability, or thread-safety of the returned Set.

Set<Integer> set = Stream.of(1, 1, 2, 2, 3, 3)
       .collect(toSet());

assertThat(set)
       .hasSize(3)
       .containsOnly(1, 2, 3);

To collect a Stream into a specific Collection it’s possible to use a collector from the toCollection(collectionFactory) method. Here a constructor reference to ArrayList is used as the factory of a specific Collection.

List<Integer> list = Stream.of(1, 2, 3)
       .collect(toCollection(ArrayList::new));

assertThat(list)
       .hasSize(3)
       .containsOnly(1, 2, 3)
       .isExactlyInstanceOf(ArrayList.class);

Collectors to unmodifiable collections

Collections that do not support modification operations are referred to as unmodifiable. Such collections cannot be modified by calling any mutator methods: they are guaranteed to throw an UnsupportedOperationException. But a collection can be considered immutable itself only if its elements are immutable as well.

To collect a stream to an unmodifiable List it’s possible to use a collector from the toUnmodifiableList method (since Java 10). Elements of such lists cannot be added, removed, or replaced.

List<Integer> unmodifiableList = Stream.of(1, 2, 3)
       .collect(toUnmodifiableList());

assertThat(unmodifiableList)
       .hasSize(3)
       .containsOnly(1, 2, 3);

assertThatThrownBy(unmodifiableList::clear)
       .isInstanceOf(UnsupportedOperationException.class);
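
For example (an illustrative sketch), a list collected by toUnmodifiableList can still contain mutable elements such as StringBuilder: the list itself rejects mutator methods, but an element can still be changed in place.

List<StringBuilder> builders = Stream.of(new StringBuilder("a"), new StringBuilder("b"))
       .collect(toUnmodifiableList());

// the list itself rejects mutator methods...
assertThatThrownBy(() -> builders.add(new StringBuilder("c")))
       .isInstanceOf(UnsupportedOperationException.class);

// ...but a mutable element can still be changed
builders.get(0).append("!");
assertThat(builders.get(0).toString()).isEqualTo("a!");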

To collect a Stream to an unmodifiable Set it’s possible to use a collector from the toUnmodifiableSet method (since Java 10). Elements of such sets cannot be added or removed.

Set<Integer> unmodifiableSet = Stream.of(1, 1, 2, 2, 3, 3)
       .collect(toUnmodifiableSet());

assertThat(unmodifiableSet)
       .hasSize(3)
       .containsOnly(1, 2, 3);

assertThatThrownBy(unmodifiableSet::clear)
       .isInstanceOf(UnsupportedOperationException.class);

To collect streams to unmodifiable collections before Java 10, it’s possible to use a collector from the collectingAndThen method described below.

Downstream-designed collectors

There are collectors whose functionality is similar to some Stream operations. These collectors were designed not to duplicate Stream functionality, but to be passed as arguments (downstream collectors) to other collectors to perform multilevel reduction.

The mentioned collectors and Stream operations are often similar but not equivalent. They can have different parameter types, return values, and semantics.
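
For example (a small illustrative sketch using groupingBy, which is described in detail below), the collector from the filtering method behaves differently from the filter operation when grouping: a key whose group becomes empty is still present in the resulting map when filtering is used as a downstream collector, but disappears when the stream is filtered first.

// filter first: the key 1 disappears, because no element greater than 3 is odd
Map<Integer, List<Integer>> filterThenGroup = Stream.of(1, 2, 3, 4)
       .filter(i -> i > 3)
       .collect(groupingBy(i -> i % 2));

assertThat(filterThenGroup)
       .hasSize(1)
       .containsEntry(0, List.of(4));

// filtering as a downstream collector: the key 1 is kept with an empty group
Map<Integer, List<Integer>> groupThenFilter = Stream.of(1, 2, 3, 4)
       .collect(groupingBy(i -> i % 2, filtering(i -> i > 3, toList())));

assertThat(groupThenFilter)
       .hasSize(2)
       .containsEntry(0, List.of(4))
       .containsEntry(1, List.of());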

In this section downstream collectors are used in a primitive way, just to show their functionality. The proper use of downstream collectors is described below in the section “Grouping-by” collectors to maps.

Analogs of stream intermediate operations

There are collectors from the filtering, mapping, and flatMapping methods whose functionality is similar to the filter, map, and flatMap Stream intermediate operations. They are all designed to perform the filter-map steps of the filter-map-reduce functional pipeline.

To collect input elements that satisfy a condition it’s possible to use a collector from the filtering(predicate, downstream) method (since Java 9). Here the Predicate i -> i % 2 != 0, which selects odd numbers, is used with toList as the downstream collector.

   List<Integer> listOfOddNumbers = Stream.of(1, 2, 3)
           .collect(filtering(i -> i % 2 != 0, toList()));

   assertThat(listOfOddNumbers)
           .hasSize(2)
           .containsOnly(1, 3);

To collect input elements that are subjected to a one-to-one transformation it’s possible to use a collector from the mapping(mapper, downstream) method. Here the Function i -> i * i, which transforms numbers into their squares, is used with toList as the downstream collector.

   List<Integer> listOfSquares = Stream.of(1, 2, 3)
           .collect(mapping(i -> i * i, toList()));

   assertThat(listOfSquares)
           .hasSize(3)
           .containsOnly(1, 4, 9);

To collect input elements that are subjected to a one-to-many transformation it’s possible to use a collector from the flatMapping(mapper, downstream) method (since Java 9). Here the Function List::stream, which converts a stream of lists of elements into a stream of elements, is used with toList as the downstream collector.

   List<Integer> list = Stream.of(
           List.of(1, 2),
           List.of(3, 4))
           .collect(flatMapping(List::stream, toList()));

   assertThat(list)
           .hasSize(4)
           .containsOnly(1, 2, 3, 4);

The simpler collector from the mapping method should be used when each input element is converted into exactly one element. The more advanced collector from the flatMapping method should be used when each input element can be converted into a Stream of zero, one, or many elements.

Analogs of stream terminal operations

There are collectors from the averagingInt, averagingLong, averagingDouble, counting, maxBy, minBy, summingInt, summingLong, summingDouble, summarizingInt, summarizingLong, summarizingDouble methods whose functionality is similar to the average, count, max, min, sum, summaryStatistics Stream terminal operations. They are all designed to perform specialized reduce steps of the filter-map-reduce functional pipeline.

To find the average (arithmetic mean) of int, long, or double input elements it’s possible to use collectors from the averagingInt, averagingLong, and averagingDouble methods. It’s necessary to pass a mapper function that converts object input elements to primitive ones as the argument to the method (here a ToIntFunction is used).

double average = Stream.of(1, 2, 3)
       .collect(averagingInt(i -> i));

assertThat(average).isEqualTo(2);

To count the number of input elements it’s possible to use a collector from the counting method.

long count = Stream.of(1, 2, 3)
       .collect(counting());

assertEquals(3L, count);

To find the maximal input element it’s possible to use a collector from the maxBy(comparator) method. It’s necessary to pass a Comparator as the argument to the method (here Comparator.naturalOrder is used).

Optional<Integer> max = Stream.of(1, 2, 3)
       .collect(maxBy(Comparator.naturalOrder()));

assertThat(max)
       .isNotEmpty()
       .hasValue(3);

To find the minimal input element it’s possible to use a collector from the minBy(comparator) method. It’s necessary to pass a Comparator as the argument to the method (here Comparator.naturalOrder is used).

Optional<Integer> min = Stream.of(1, 2, 3)
       .collect(minBy(Comparator.naturalOrder()));

assertThat(min)
       .isNotEmpty()
       .hasValue(1);

To find the sum of int, long, or double input elements it’s possible to use collectors from the summingInt, summingLong, and summingDouble methods. It’s necessary to pass a mapper function that converts object input elements to primitive ones as the argument to the method (here a ToIntFunction is used).

int sum = Stream.of(1, 2, 3)
       .collect(summingInt(i -> i));

assertThat(sum).isEqualTo(6);

To find all the numerical statistics described above (average, count, max, min, sum) of int, long, or double input elements it’s possible to use collectors from the summarizingInt, summarizingLong, and summarizingDouble methods. It’s necessary to pass a mapper function that converts object input elements to primitive ones as the argument to the method (here a ToIntFunction is used).

IntSummaryStatistics iss = Stream.of(1, 2, 3)
       .collect(summarizingInt(i -> i));

assertThat(iss.getAverage()).isEqualTo(2);
assertThat(iss.getCount()).isEqualTo(3);
assertThat(iss.getMax()).isEqualTo(3);
assertThat(iss.getMin()).isEqualTo(1);
assertThat(iss.getSum()).isEqualTo(6);

Analogs of stream reduce operations

There are collectors from the overloaded reducing methods whose functionality is similar to the reduce Stream operations. They are all designed to perform general reduce steps of the filter-map-reduce functional pipeline.

There are 3 overloaded reducing methods that can have the following parameters:

  • operator - a BinaryOperator to reduce input elements
  • identity - the initial value for the reduction; it’s returned as result value when there are no input elements
  • mapper - a Function to apply to each input element

An example of a collector from the reducing(operator) method to calculate the sum of input elements. Because there is no identity parameter, the result type is Optional (https://docs.oracle.com/en/java/javase/12/docs/api/java.base/java/util/Optional.html) to handle the case when there are no input elements.

Optional<Integer> sumOptional = Stream.of(1, 2, 3)
       .collect(reducing(Integer::sum));

assertTrue(sumOptional.isPresent());
assertThat(sumOptional.get()).isEqualTo(6);

An example of a collector from the reducing(identity, operator) method to calculate the sum of input elements. Because there is an identity parameter, the result type is Integer.

Integer sum = Stream.of(1, 2, 3)
       .collect(reducing(0, Integer::sum));

assertThat(sum).isEqualTo(6);

An example of a collector from the reducing(identity, mapper, operator) method to calculate the sum of squares of input elements.

Integer sumOfSquares = Stream.of(1, 2, 3)
       .collect(reducing(0, element -> element * element, Integer::sum));

assertThat(sumOfSquares).isEqualTo(14);

Collectors to maps

Collectors that reduce input elements to maps are much more complicated than collectors to collections. There are two big categories of such collectors:

  • collectors from “to-map” methods (toMap, toUnmodifiableMap, toConcurrentMap)
  • collectors from “grouping-by” methods (groupingBy, partitioningBy, groupingByConcurrent)

Each input element is converted into a key and a value, and multiple input elements can be associated with the same key. The difference between the two categories of collectors is in how they handle key collisions.

There are methods to create collectors to maps that can have the following parameter:

  • mapFactory - a Supplier for new empty Map to collect results

If a collector is created from a method without the mapFactory parameter, then there are no guarantees on the type, mutability, serializability, or thread-safety of the returned Map.

“To-map” collectors to maps

Collectors from “to-map” methods reduce input elements to maps whose keys and values are the results of applying key-mapping and value-mapping functions. If multiple input elements are associated with the same key, it’s possible to use a merge function that reduces them to a single value by binary reduction.

There are overloaded toMap, toUnmodifiableMap, toConcurrentMap methods that can have the following parameters:

  • keyMapper - a Function to convert input elements into map keys
  • valueMapper - a Function to convert input elements into map values
  • mergeFunction - a BinaryOperator to resolve collisions between values when many input elements are associated with the same key
  • mapFactory - a Supplier for new empty Map to collect results

Examples in this section show reducing streams of words from phonetic alphabets into maps whose keys are the first letters of the words (here s -> s.charAt(0) is used) and whose values are the words themselves (here Function.identity() is used).

Regular collectors to maps

An example of a collector from the toMap(keyMapper, valueMapper) method, where it’s guaranteed that there are no key collisions.

   Map<Character, String> map = Stream.of("Alpha", "Bravo", "Charlie")
           .collect(toMap(s -> s.charAt(0), Function.identity()));

   assertThat(map)
           .hasSize(3)
           .containsEntry('A', "Alpha")
           .containsEntry('B', "Bravo")
           .containsEntry('C', "Charlie");

An example of a collector from the toMap(keyMapper, valueMapper) method. If two input elements are associated with the same key, an IllegalStateException is thrown.

   assertThrows(IllegalStateException.class, () -> {
       Stream.of(
               "Amsterdam", "Baltimore", "Casablanca",
               "Alpha", "Bravo", "Charlie")
               .collect(toMap(s -> s.charAt(0), Function.identity()));
   });

An example of a collector from the toMap(keyMapper, valueMapper, mergeFunction) method. If two input elements are associated with the same key, the merge function (v1, v2) -> v2 selects the new value.

   Map<Character, String> map = Stream.of(
           "Amsterdam", "Baltimore", "Casablanca",
           "Alpha", "Bravo", "Charlie")
           .collect(toMap(s -> s.charAt(0), Function.identity(), (v1, v2) -> v2));

   assertThat(map)
           .hasSize(3)
           .containsEntry('A', "Alpha")
           .containsEntry('B', "Bravo")
           .containsEntry('C', "Charlie");

An example of a collector from the toMap(keyMapper, valueMapper, mergeFunction, mapFactory) method. Here a constructor reference to TreeMap is used as the factory of a specific Map.

   SortedMap<Character, String> map = Stream.of(
           "Amsterdam", "Baltimore", "Casablanca",
           "Alpha", "Bravo", "Charlie")
           .collect(toMap(s -> s.charAt(0), Function.identity(), (v1, v2) -> v2, TreeMap::new));

   assertThat(map)
           .hasSize(3)
           .containsEntry('A', "Alpha")
           .containsEntry('B', "Bravo")
           .containsEntry('C', "Charlie")
           .isExactlyInstanceOf(TreeMap.class);

Collectors to unmodifiable maps

Maps that do not support modification operations are referred to as unmodifiable. Such maps cannot be modified by calling any mutator methods: they are guaranteed to throw an UnsupportedOperationException. But a map can be considered immutable itself only if its keys and values are immutable as well.

To reduce a stream to an unmodifiable Map it’s possible to use collectors from the toUnmodifiableMap methods (since Java 10). Keys and values of such maps cannot be added, removed, or updated.

An example of a collector from the toUnmodifiableMap(keyMapper, valueMapper) method, where it’s guaranteed that there are no key collisions.

   Map<Character, String> unmodifiableMap = Stream.of("Alpha", "Bravo", "Charlie")
           .collect(toUnmodifiableMap(s -> s.charAt(0), Function.identity()));

   assertThat(unmodifiableMap)
           .hasSize(3)
           .containsEntry('A', "Alpha")
           .containsEntry('B', "Bravo")
           .containsEntry('C', "Charlie");

   assertThatThrownBy(unmodifiableMap::clear)
           .isInstanceOf(UnsupportedOperationException.class);

An example of a collector from the toUnmodifiableMap(keyMapper, valueMapper) method. If two input elements are associated with the same key, an IllegalStateException is thrown.

   assertThrows(IllegalStateException.class, () -> {
       Stream.of(
               "Amsterdam", "Baltimore", "Casablanca",
               "Alpha", "Bravo", "Charlie")
               .collect(toUnmodifiableMap(s -> s.charAt(0), Function.identity()));
   });

An example of a collector from the toUnmodifiableMap(keyMapper, valueMapper, mergeFunction) method. If two input elements are associated with the same key, the merge function (v1, v2) -> v2 selects the new value.

   Map<Character, String> unmodifiableMap = Stream.of(
           "Amsterdam", "Baltimore", "Casablanca",
           "Alpha", "Bravo", "Charlie")
           .collect(toUnmodifiableMap(s -> s.charAt(0), Function.identity(), (v1, v2) -> v2));

   assertThat(unmodifiableMap)
           .hasSize(3)
           .containsEntry('A', "Alpha")
           .containsEntry('B', "Bravo")
           .containsEntry('C', "Charlie");

   assertThatThrownBy(unmodifiableMap::clear)
           .isInstanceOf(UnsupportedOperationException.class);

Concurrent collectors to maps

The difference between collectors from the toMap and toConcurrentMap methods is in their behavior during parallel reduction.

Collectors from the toMap methods create a separate result container (e.g. a HashMap) for each partition and then merge them. Merging key-value entries from one Map into another can be an expensive operation.

Function calls inside non-concurrent collectors (e.g. from the toMap methods) during parallel reduction:

  • supplier - a function to create new result container (is called multiple times)
  • accumulator - a function to add a new element into a result container (is called multiple times)
  • combiner - a function to combine two result containers into one (is called multiple times)

Collectors from the toConcurrentMap methods create a single result container (e.g. a ConcurrentMap) shared by all partitions, so there is no merging of key-value entries from one Map into another.

Function calls inside concurrent collectors (e.g. from the toConcurrentMap methods) during parallel reduction:

  • supplier - a function to create new result container (is called only once)
  • accumulator - a function to add a new element into a result container (is called multiple times)
  • combiner - a function to combine two result containers into one (is never called)

Parallel reduction is performed if all of the following are true:

  • the Stream is parallel
  • the Collector has the characteristic CONCURRENT
  • either the Stream is unordered or the Collector has the characteristic UNORDERED

Parallel reduction may perform better than sequential reduction. However, the order in which key-value entries are inserted into maps during parallel reduction is not guaranteed.
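
As a rough illustration (a hypothetical experiment, not one of the examples above), a custom collector built with Collector.of can count how many times its supplier and combiner are invoked. For a non-concurrent collector on a parallel stream both counters are usually greater than one; the exact numbers depend on how the stream is split.

// uses java.util.concurrent.atomic.AtomicInteger and java.util.stream.Collector
AtomicInteger supplierCalls = new AtomicInteger();
AtomicInteger combinerCalls = new AtomicInteger();

// a non-concurrent collector to a List that counts its supplier and combiner calls
Collector<Integer, List<Integer>, List<Integer>> countingCollector = Collector.of(
       () -> { supplierCalls.incrementAndGet(); return new ArrayList<>(); },
       List::add,
       (left, right) -> { combinerCalls.incrementAndGet(); left.addAll(right); return left; });

List<Integer> result = IntStream.rangeClosed(1, 1_000).boxed()
       .parallel()
       .collect(countingCollector);

assertThat(result).hasSize(1_000);
// on a parallel stream both counters are typically greater than 1
System.out.println("supplier calls: " + supplierCalls.get());
System.out.println("combiner calls: " + combinerCalls.get());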

Examples in this section use parallel streams created with the BaseStream.parallel() method.

An example of a collector from the toConcurrentMap(keyMapper, valueMapper) method, where it’s guaranteed that there are no key collisions.

   ConcurrentMap<Character, String> map = Stream.of("Alpha", "Bravo", "Charlie")
           .parallel()
           .collect(toConcurrentMap(s -> s.charAt(0), Function.identity()));

   assertThat(map)
           .hasSize(3)
           .containsEntry('A', "Alpha")
           .containsEntry('B', "Bravo")
           .containsEntry('C', "Charlie");

An example of a collector from the toConcurrentMap(keyMapper, valueMapper) method. If two input elements are converted into the same key, an IllegalStateException is thrown.

   assertThrows(IllegalStateException.class, () -> {
       Stream.of(
               "Amsterdam", "Baltimore", "Casablanca",
               "Alpha", "Bravo", "Charlie")
               .parallel()
               .collect(toConcurrentMap(s -> s.charAt(0), Function.identity()));
   });

An example of a collector from the toConcurrentMap(keyMapper, valueMapper, mergeFunction) method. If two input elements are converted into the same key, the merge function (v1, v2) -> v2 selects the new value.

Because the order of inserting key-value entries into maps during parallel reduction is not guaranteed, only the keys, not the values, are verified here.

   ConcurrentMap<Character, String> map = Stream.of(
           "Amsterdam", "Baltimore", "Casablanca",
           "Alpha", "Bravo", "Charlie")
           .parallel()
           .collect(toConcurrentMap(s -> s.charAt(0), Function.identity(), (v1, v2) -> v2));

   assertThat(map)
           .hasSize(3)
           .containsKey('A')
           .containsKey('B')
           .containsKey('C');

An example of a collector from the toConcurrentMap(keyMapper, valueMapper, mergeFunction, mapFactory) method. Here a constructor reference to ConcurrentHashMap is used as the factory of a specific ConcurrentMap.

   ConcurrentMap<Character, String> map = Stream.of(
           "Amsterdam", "Baltimore", "Casablanca",
           "Alpha", "Bravo", "Charlie")
           .parallel()
           .collect(toConcurrentMap(s -> s.charAt(0), Function.identity(), (v1, v2) -> v2, ConcurrentHashMap::new));

   assertThat(map)
           .hasSize(3)
           .containsKey('A')
           .containsKey('B')
           .containsKey('C')
           .isExactlyInstanceOf(ConcurrentHashMap.class);

“Grouping-by” collectors to maps

Collectors from “grouping-by” methods reduce input elements to maps whose keys are the groups produced by applying a classification function. All values associated with the same key group are reduced by a downstream collector into a single value.

There are overloaded groupingBy, partitioningBy, groupingByConcurrent methods that can have the following parameters:

  • classifier - a Function to classify input elements into key groups
  • mapFactory - a Supplier for new empty Map to collect results
  • downstream - a Collector to reduce values, associated with the same key group

Collectors from “grouping-by” methods are where downstream collectors are designed to be used. Not only predefined collectors from the Collectors class but also composite collectors combined from other collectors can be used as downstream collectors (a short sketch with such downstream collectors follows the grouping examples below).

Examples in this section show reducing streams of the top 100 US cities by population. These objects have 3 levels of grouping: cities, areas (50 states and 1 federal district) and regions.

Grouping collectors to maps

An example of a collector from the groupingBy(classifier) method. Here a downstream collector to a List is used implicitly.

   Map<Area, List<City>> citiesPerArea = USA.CITIES.stream()
           .collect(groupingBy(City::getArea));

An example of a collector from the groupingBy(classifier, downstream) method. Here a downstream collector from the toSet method is used explicitly.

   Map<Area, Set<City>> citiesPerArea = USA.CITIES.stream()
           .collect(groupingBy(City::getArea, toSet()));

An example of a collector from the groupingBy(classifier, mapFactory, downstream) method. Here a supplier function that calls the EnumMap constructor is used as the factory of a specific Map.

   EnumMap<Area, List<City>> citiesPerArea = USA.CITIES.stream()
           .collect(groupingBy(City::getArea, () -> new EnumMap<>(Area.class), toList()));
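
As mentioned above, downstream collectors passed to groupingBy can themselves be composed from other collectors. A small sketch (the City::getName accessor is an assumption, it is not shown in the original examples):

   // a downstream collector from counting: the number of cities per area
   Map<Area, Long> numberOfCitiesPerArea = USA.CITIES.stream()
           .collect(groupingBy(City::getArea, counting()));

   // a composite downstream collector combined from mapping and toSet
   // (City::getName is assumed to exist on the example City class)
   Map<Area, Set<String>> cityNamesPerArea = USA.CITIES.stream()
           .collect(groupingBy(City::getArea, mapping(City::getName, toSet())));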

Partitioning collectors to maps

Collectors from the partitioningBy methods are a special case of collectors from the groupingBy methods. The former use a more specific Predicate as the classifier function, while the latter use a more general Function.

There are overloaded partitioningBy methods that can have the following parameters:

  • predicate - a Predicate to classify input elements into two key groups
  • downstream - a Collector to reduce values, associated with the same key group

Collectors from partitioningBy methods always produce values for both true and false keys, even if a value group is empty.

Examples in this section show reducing streams into two complementary collections based on whether the remainder of dividing an input element by a certain number equals zero.

An example of a collector from the partitioningBy(predicate) method. Here a downstream collector to a List is used implicitly.

Map<Boolean, List<Integer>> remainderFromDivisionBy2IsZero = Stream.of(1, 2, 3)
       .collect(partitioningBy(i -> i % 2 == 0));

assertThat(remainderFromDivisionBy2IsZero)
       .hasSize(2)
       .containsEntry(false, List.of(1, 3))
       .containsEntry(true, List.of(2));

An example of a collector from the partitioningBy(predicate, downstream) method. Here a downstream collector from the toSet method is used explicitly.

Map<Boolean, Set<Integer>> remainderFromDivisionBy4IsZero = Stream.of(1, 2, 3)
       .collect(partitioningBy(i -> i % 4 == 0, toSet()));

assertThat(remainderFromDivisionBy4IsZero)
       .hasSize(2)
       .containsEntry(false, Set.of(1, 2, 3))
       .containsEntry(true, Set.of());

Concurrent grouping collectors to maps

There are collectors from the groupingByConcurrent methods that are designed for parallel reduction, similarly to collectors from the toConcurrentMap methods.

Examples in this section use parallel streams created with the Collection.parallelStream() method.

An example of a collector from the groupingByConcurrent(classifier) method. Here a downstream collector to a List is used implicitly.

   ConcurrentMap<Area, List<City>> citiesPerArea = USA.CITIES
           .parallelStream()
           .collect(groupingByConcurrent(City::getArea));

An example of a collector from the groupingByConcurrent(classifier, downstream) method. Here a downstream collector from the toSet method is used explicitly.

   ConcurrentMap<Area, Set<City>> citiesPerArea = USA.CITIES
           .parallelStream()
           .collect(groupingByConcurrent(City::getArea, toSet()));

An example of a collector from the groupingByConcurrent(classifier, mapFactory, downstream) method. Here a constructor reference to ConcurrentHashMap is used as the factory of a specific ConcurrentMap.

   ConcurrentMap<Area, List<City>> citiesPerArea = USA.CITIES
           .parallelStream()
           .collect(groupingByConcurrent(City::getArea, ConcurrentHashMap::new, toList()));

Other collectors

Some collectors can’t be assigned to any of the categories described above.

The collector from the collectingAndThen(downstream, finisher) method was designed to perform additional finishing processing of the summary result. It was often used to produce unmodifiable collections before the collectors from the toUnmodifiableList and toUnmodifiableSet methods described above were introduced in Java 10.

List<Integer> unmodifiableList = Stream.of(1, 2, 3)
       .collect(collectingAndThen(toList(), Collections::unmodifiableList));

assertThat(unmodifiableList)
       .hasSize(3)
       .containsOnly(1, 2, 3);

assertThatThrownBy(unmodifiableList::clear)
       .isInstanceOf(UnsupportedOperationException.class);

Collectors from the overloaded joining methods were designed to join input elements that are CharSequence implementations (String, StringBuffer, StringBuilder, etc.) into a String summary result.

An example of a collector from the joining() method, which joins strings without a delimiter between them.

   String result = Stream.of(1, 2, 3)
           .map(String::valueOf)
           .collect(joining());

   assertThat(result).isEqualTo("123");

An example of a collector from the joining(delimiter) method, which joins strings with a delimiter between them.

   String result = Stream.of(1, 2, 3)
           .map(String::valueOf)
           .collect(joining(","));

   assertThat(result).isEqualTo("1,2,3");

An example of a collector from the joining(delimiter, prefix, suffix) method, which joins strings with a delimiter between them and adds a prefix and a suffix around the summary result.

   String result = Stream.of(1, 2, 3)
           .map(String::valueOf)
           .collect(joining(",", "[", "]"));

   assertThat(result).isEqualTo("[1,2,3]");

The teeing collector (since Java 12) was designed to compose two downstream collectors at once. An example of a collector from the teeing method to find the minimal and maximal values of the input elements.

Map.Entry<Optional<Integer>, Optional<Integer>> limits = Stream.of(1, 2, 3)
       .collect(
               teeing(
                       minBy(Integer::compareTo),
                       maxBy(Integer::compareTo),
                       AbstractMap.SimpleImmutableEntry::new
               )
       );

assertNotNull(limits);

Optional<Integer> minOptional = limits.getKey();
assertThat(minOptional)
       .isNotEmpty()
       .hasValue(1);

Optional<Integer> maxOptional = limits.getValue();
assertThat(maxOptional)
       .isNotEmpty()
       .hasValue(3);

Conclusion

Extended code examples are available in the GitHub repository.