
by Sven Ruppert

Java 8 Streams – How to Kick Ass with Lambdas

ISBN: 978-1-909264-17-5 © Developer.Press An Imprint created by Software & Support Media Ltd.


1 Java 8 – Streams

With the release of Java 8, the Streams-API arrived on the scene. But what is the advantage for the developer? And how is it used? Here begins our step-by-step approach to this fundamental API. You can find all sources for this book at:

https://bitbucket.org/rapidpm/entwicklerpress-shortcut-jdk8-streams

1.1 Data in – Data out

What were those stream thingies again?

At some point in working with Java, every developer has been confronted with some kind of stream. But what exactly makes up a stream in JDK 8?

• Streams are not data structures. That basically means they do not constitute storage for data. Instead, they can be seen as pipelines for data streams. Here, different transformations are applied to the data. In this special case, the transformations are not performed on the data of the source structure itself. As a result, the underlying data structures, such as arrays or lists, are not changed. A stream thus wraps the data structure, withdraws source data from it and works on copies.

• Streams have been conceptualized for the usage of lambdas. That means there are no streams without lambdas, which does not pose a problem since streams and lambdas are both contained in JDK 8.

• Streams do not offer random access to the source data via an index or the like. Access to the first element is possible (findFirst()), but there is no index-based access to arbitrary elements.

• Streams provide excellent support for delivering results as an array or list, for example.

• Streams are organized in a lazy way. This means that the elements are not fetched until an operation is actually applied to them. Assuming that the data source consists of 1,000 elements, the first access does not take 1,000 time units but one time unit (provided that accessing a single element takes a constant amount of time).

• Streams are parallel if requested. Streams can basically be divided into two main groups: the serial and the parallel implementations. When the operations are stateless and free of side effects, we therefore don't need the typical hand-written multithreaded code to use the cores sleeping in the system.

• Streams are unbounded: unlike collections, they are not filled up front, so a stream can even be infinite. One can define generator functions that keep delivering source data; this data is generated only while a client is consuming elements of the stream. A short sketch of this behaviour follows this list.
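A minimal sketch of the last two points – laziness and unbounded generation. The counter (an AtomicInteger from java.util.concurrent.atomic) is only there to make visible how many elements are actually produced; it is not part of the book's sources:

final AtomicInteger produced = new AtomicInteger();
final String result = Stream.generate(() -> "e" + produced.incrementAndGet())
    .limit(5)                            // without the limit the stream would be infinite
    .collect(Collectors.joining(","));
System.out.println(result);              // e1,e2,e3,e4,e5
System.out.println(produced.get());      // 5 – nothing beyond the limit was generated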

Where does all the source data come from?

When you consider that streams do not keep their own data the way collections do, the question arises as to where the data comes from. The most common way to create streams is to use one of the following methods, by which a stream is created from a fixed number of elements: Stream.of(val1, val2, val3 …), Stream.of(array) and list.stream(). These methods that generate from a fixed domain also include creating a stream from a string – a string is nothing other than a finite sequence of chars.

final Stream<String> splitOf = Stream.of("A,B,C".split(","));

Equally streams can be generated from streams. We’ll take a close look at this in the next section.

Now two possibilities to generate streams are still missing. The first one is to programmatically create a stream with a builder.


final Stream<Pair> stream = Stream.<Pair>builder().add(new Pair()).build();

The other and final possibility is to use a generator. This is done with the method Stream.generate(..) (Listing 1), whose argument is an instance of the class Supplier<T>.

Stream.generate(() -> {
    final Pair p = new Pair();
    p.id = random.nextInt(100);
    p.value = "Value + " + p.id;
    return p;
})

Listing 1: generate-method

Where is all the data going?

Since we now know where the data is coming from, the question arises how we can retrieve the data out of the stream; after all, the idea is usually to continue working with it. The easiest way is to generate an array with the method stream.toArray() or a list by use of stream.collect(Collectors.toList()).

With this, we have covered nearly 90 percent of the use cases. Nevertheless, we can also generate sets and maps: sets with the method stream.collect(Collectors.toSet()), and maps by using stream.collect(Collectors.groupingBy(..)). The argument of groupingBy() is at least one function with which an aggregation can be carried out; the aggregation result is the key of our map, and the value is a list of the stream's element type. One possibility that might seem a little unusual for some developers is to output the stream as a string. To achieve this, we pass Collectors.joining(delimiter) to the collect method; the result is the toString() representation of every element, concatenated with this delimiter.
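Before the complete overview in Listing 2, here is a small sketch of that last variant; it assumes plain strings as elements and uses Collectors.joining:

//Stream to String: all toString() representations, concatenated with a delimiter
final String joined = Stream.of("A", "B", "C")
    .collect(Collectors.joining(", "));
System.out.println(joined); // A, B, C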

public static void main(String[] args) {
    final List<Pair> generateDemoValues = generateDemoValues();
    //Stream from Values
    final Stream<Pair> fromValues = Stream.of(new Pair(), new Pair());
    //Stream from Array
    final Pair[] pairs = {new Pair(), new Pair()};
    final Stream<Pair> fromArray = Stream.of(pairs);
    //Stream from List
    final Stream<Pair> fromList = generateDemoValues.stream();
    //Stream from String
    final Stream<String> abc = Stream.of("ABC");
    final Stream<IntStream> of = Stream.of("ABC".chars());
    final Stream<String> splitOf = Stream.of("A,B,C".split(","));
    //Stream from builder
    final Stream<Pair> builderPairStream = Stream.<Pair>builder().add(new Pair()).build();
    //Stream to Array
    final Pair[] toArray = generateDemoValues.stream().toArray(Pair[]::new);
    //Stream to List
    final List<Pair> toList = generateDemoValues.stream()
        .collect(Collectors.toList());
    //Stream to Set
    final Set<Pair> toSet = generateDemoValues.stream()
        .collect(Collectors.toSet());
    //Stream to Map
    final Map<Integer, List<Pair>> collectedToMap = generateDemoValues.stream()
        .collect(Collectors.groupingBy(Pair::getId));
    System.out.println("collectedToMap.size() = " + collectedToMap.size());
    for (final Map.Entry<Integer, List<Pair>> entry : collectedToMap.entrySet()) {
        System.out.println("entry = " + entry);
    }
}

Listing 2: examples of stream out methods

Summary

You'll be happy to see that streams can be easily integrated into existing Java code; no unnecessary wrappers have to be written. This integration can therefore be used in old as well as in new projects. Once you're accustomed to the API, you'll find many places in which a substantial code reduction can be achieved by using streams.

Core-Methods

Now that we have discussed how the data gets into the streams and how it is retrieved, we will deal with the data transformation. Among others, the following three basic method groups are available to us – forEach, match and find – with which one can quickly and easily undertake the first experiments.

ForEach – a lambda for each element

The forEach(<lambda>) method does exactly what one suspects: it applies the lambda that has been passed as an argument to every single element of the stream. This method can also be found on Iterable, List, Map and some other classes/interfaces – a fact that fortunately leads to shorter code constructs. Listing 3 shows the iteration in pre-JDK 8 notation, Listing 4 the same by use of forEach(<lambda>). In Listing 4 there are one long and two short versions: the long version uses the complete notation including the curly brackets, the two short versions use method references.

final List<Pair> generateDemoValues = new PairListGenerator(){}.generateDemoValues();
//pre JDK 8
for (final Pair generateDemoValue : generateDemoValues) {
    System.out.println(generateDemoValue);
}

Listing 3: pre JDK 8 Notation

//long version
generateDemoValues.stream()
    .forEach(v -> { System.out.println(v); });
//short version - serial
generateDemoValues.stream()
    .forEach(System.out::println);
//short version - parallel
generateDemoValues.parallelStream()
    .forEach(System.out::println);

Listing 4: JDK 8 Notation

In Listing 4 there is a small but subtle difference between the two short versions. The first definition goes back to a serial stream, the second to a parallel stream. In both it is ensured that the method reference is applied exactly once to each element. However, the order in which the individual elements are processed is not guaranteed for parallel streams. If the order is important, we need to use the forEachOrdered(<lambda>) method; here the order that exists in the data source is maintained during processing. This should only be used when absolutely necessary. It is not obvious at first sight that a null check is performed on the passed lambda and that a NullPointerException is thrown in case of null. Every element is consumed by the consumer's accept method. (Listing 5)

//class - ReferencePipeline
@Override
public void forEach(Consumer<? super P_OUT> action) {
    evaluate(ForEachOps.makeRef(action, false));
}

//class - AbstractPipeline
final <R> R evaluate(TerminalOp<E_OUT, R> terminalOp) {
    assert getOutputShape() == terminalOp.inputShape();
    if (linkedOrConsumed)
        throw new IllegalStateException(MSG_STREAM_LINKED);
    linkedOrConsumed = true;
    return isParallel()
        ? terminalOp.evaluateParallel(this, sourceSpliterator(terminalOp.getOpFlags()))
        : terminalOp.evaluateSequential(this, sourceSpliterator(terminalOp.getOpFlags()));
}

//class - ForEachOps
public static <T> TerminalOp<T, Void> makeRef(Consumer<? super T> action, boolean ordered) {
    Objects.requireNonNull(action); //throws NPE
    return new ForEachOp.OfRef<>(action, ordered);
}

//class - ForEachOps.OfRef
static final class OfRef<T> extends ForEachOp<T> {
    final Consumer<? super T> consumer;
    //…
    @Override
    public void accept(T t) {
        consumer.accept(t);
    }
}

Listing 5: forEach

When using forEach(<lambda>) we should consider the following: through the method accept in the consumer, the element is consumed. This means that forEach(<lambda>) can only be applied to a stream once; in this context we also speak of a terminal operation. If you need to apply more than one operation to an element, this can happen within the passed lambda.

However, the argument of the forEach(<lambda>) method can be reused by holding an instance and then applying it to several streams. (Listing 6)

Likewise, the manipulation of surrounding variables is not allowed; we will see how to deal with this in the context of the methods map and reduce. Probably the greatest difference to a for loop, however, is that a forEach cannot be interrupted ahead of time – neither with break nor with return.

final Consumer<? super Pair> consumer = System.out::println;
generateDemoValues.stream().forEachOrdered(consumer);
generateDemoValues.parallelStream().forEachOrdered(consumer);

Listing 6: forEachOrdered
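To make the "only once" rule tangible, here is a minimal sketch (not from the book's sources) that reuses a stream after a terminal operation – the second call fails:

final Stream<String> stream = Stream.of("A", "B", "C");
stream.forEach(System.out::println);        // terminal operation – the stream is consumed
try {
    stream.forEach(System.out::println);    // second use of the same stream instance
} catch (IllegalStateException e) {
    System.out.println("stream has already been operated upon or closed");
}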


Map – How about transformations?

The method map(<lambda>) generates a new stream consisting of the results of applying the transformation to every element of the source stream. Again, the argument is a lambda. This means that, apart from the functional coupling, the target stream does not have to have anything in common with the source stream. (Listing 7) In the following example a Stream<Pair> turns into a Stream<DemoElement> in order to subsequently be mapped to a Stream<String>. The method can be applied as many times as required, since every call produces a new stream.

//map from Pair to DemoElements
final Stream<DemoElement> demoElementStream = generateDemoValues.stream().map(v -> {
    final String value = v.getValue();
    final DemoElement d = new DemoElement();
    d.setDatum(new Date());
    d.setValue(Base64.getEncoder().encodeToString(value.getBytes()));
    return d;
});
//note: only one of the following two calls may be executed on the same stream instance –
//attaching a second operation to demoElementStream would throw an IllegalStateException
final Stream<String> stringStream = demoElementStream.map(v -> v.getValue());
final Stream<String> stringStreamShort = demoElementStream.map(DemoElement::getValue);
//map from Pair to DemoElements to Strings
final List<String> stringList = generateDemoValues.stream()
    .map(v -> {
        final String value = v.getValue();
        final DemoElement d = new DemoElement();
        d.setDatum(new Date());
        d.setValue(Base64.getEncoder().encodeToString(value.getBytes()));
        return d;
    })
    .map(DemoElement::getValue)
    .collect(Collectors.toList());

Listing 7: map

1.1.1 Filter – Who should it be?

Just like the map(<lambda>) method, the filter(<lambda>) method also generates a new stream. The elements for the next steps are filtered out of the set of source elements (Listing 8). The filter(<lambda>) method can be applied several times in sequence, and with every call the set is filtered further, so each step is an additional reduction. filter(<lambda>) can be combined with the other methods in any order, e.g. map → filter → map → filter → filter.

final Stream<Pair> filteredPairStream =
    generateDemoValues.stream()
        .filter(v -> v.getId() % 2 == 0);

Listing 8: filter
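A small sketch of such a chained combination; it uses plain integers instead of the Pair class, purely for illustration:

//map → filter → map → filter: every step returns a new stream
final List<String> chained = Stream.of(1, 2, 3, 4, 5, 6)
    .map(v -> v * 10)               // 10, 20, 30, 40, 50, 60
    .filter(v -> v > 20)            // 30, 40, 50, 60
    .map(String::valueOf)           // "30", "40", "50", "60"
    .filter(s -> s.endsWith("0"))   // all remain
    .collect(Collectors.toList());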

1.1.2 FindFirst – Give me an optional

Sometimes there is a set of elements with an undefined order and indefinite quantity, from which we want exactly one element with certain characteristics. Queries that pose no problem for a database (thanks to SQL) can considerably inflate the source code on the imperative side.

The findFirst() method provides the first element from the stream. A trivial method at first sight, a surprisingly useful one at a second glance. The return value is an optional (Listing 9): if there’s an empty stream we have an empty optional.


final List<String> demoValues = Arrays.asList("AB", "AAB", "AAAB", "AAAAB", "AAAAAB");
final Optional<String> first = demoValues.stream()
    .findFirst();

Listing 9: FindFirst

Optionals? What were they again? Since JDK 8, Optional is part of the JDK. The idea behind an Optional is nothing other than the Null-Object pattern in conjunction with service methods. Since this is a new class, let's take a brief look at it.

Optional – yes / no / maybe?

The original Null-Object pattern can be explained as follows: the return value of a method is of type List<X>. If the result is an empty set, then the method provides an empty list. The calling code can either query this status with isEmpty(), or the processing loop simply produces an empty result. So far everything is fine. However, if the return value is one single instance of type X, then unfortunately many methods return null. The consequence is that code parts must repeatedly be wrapped in if-else constructs like the following. (Listing 10)

final Integer x = methodX();
if (x == null) {
    //do something
} else {
    //do something else
}

Listing 10: if-else construct

To remedy this, we can add a NULL element to the class. As of JDK 8 this is no longer necessary; the JDK now offers a solution. You can picture the Optional as a holder that can be asked whether the contained value element exists. Since it is now part of the JDK, it has already been put to use in some classes – for example in the streams. There are two methods to create instances of the class Optional yourself. The first is of(<T>) and always expects a non-null reference. The second method is ofNullable(<T>), to which null may also be passed. Incidentally, all empty Optionals are reduced to one common reference. In the implementation of Optional this is defined by:

private static final Optional<?> EMPTY = new Optional<>();

//how to create an Optional

final Optional<String> optionalA = Optional.of("A");
final Optional<String> optionalB1 = Optional.ofNullable("B");
final Optional<String> optionalB2 = Optional.ofNullable(null);

Listing 11: How to create an Optional

The class Optional additionally provides the following methods (Listing 12):

System.out.println("optionalB2.isPresent() = " + optionalB2.isPresent()); optionalA.ifPresent(System.out::println); // result = 'A' optionalB2.ifPresent(System.out::println);// no output demoValues.stream() .forEach(v -> { Optional.ofNullable(v) .filter(o -> o.contains("AAA")) .ifPresent(System.out::println); });

9

Java 8 – Streams

demoValues.stream() .forEach(v -> { Optional.ofNullable(v) .map(o->o.concat("_X")) .filter(f->f.contains("AAA")) .ifPresent(System.out::println); }); //method addX returned Optional<String> final Optional<Optional<String>> map = optionalA.map(Part04::addX); final Optional<String> flatMap = optionalA.flatMap(Part04::addX); demoValues.stream() .forEach(v -> { Optional.ofNullable(v) .flatMap(Part04::addX) .filter(f -> f.contains("AAA")) .ifPresent(System.out::println); }); System.out.println(optionalB2.orElse("noop")); try { optionalB2.orElseThrow(NullPointerException::new)); } catch (NullPointerException e) { e.printStackTrace(); }

Listing 12: Optional

The method isPresent() returns true if a non-null value is contained. The method ifPresent(<lambda>) executes the passed lambda if a non-null value exists. With filter(<Predicate>), only the Optionals that match the defined filter criterion are passed on – all others become empty. The method map corresponds to the map method of the streams, but the result is returned wrapped in an Optional; for mapping functions that themselves return an Optional there is flatMap(), which does not wrap the result in an additional Optional. With orElse, an alternative value can be returned if the value is null. With the method orElseThrow, an exception is thrown if the value is null.

findFirst – What is the first element?

Using the method findFirst(), the first hit from the stream's value range is returned as an Optional. Listing 13 returns the first element of the stream ("AB", "AAB", "AAAB", "AAAAB", "AAAAAB") which contains the string "AAA"; as a result, our example always yields "AAAB". But what happens if we define it as a parallel stream? (Listing 14) Because the underlying list has an encounter order, findFirst() keeps returning "AAAB" even in the parallel case; if any matching element would do, findAny() is the method that may return a different value from the value list on each run, because the stream is processed in parallel.

final String value = demoValues
    .stream()
    .filter(o -> o.contains("AAA"))
    .findFirst().orElse("noop ");
System.out.println("value = " + value);

Listing 13: findFirst – serial

for (int i = 0; i < 10; i++) {
    final String valueParallel = demoValues
        .parallelStream()
        .filter(o -> o.contains("AAA"))
        .findFirst().orElse("noop ");
    System.out.println("value (" + i + ") = " + valueParallel);
}

Listing 14: findFirst – parallel
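Since findFirst() honours the encounter order even on a parallel stream, the non-deterministic counterpart is findAny(). A minimal sketch (my own illustration, not from the book's sources):

for (int i = 0; i < 10; i++) {
    final String any = demoValues
        .parallelStream()
        .filter(o -> o.contains("AAA"))
        .findAny().orElse("noop");
    System.out.println("any (" + i + ") = " + any); // may differ from run to run
}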

The method findFirst() belongs to the terminal methods. This means that after the invocation of findFirst() no further stream operations can be performed; the stream is terminated. By using findFirst() we can map complex patterns to obtain specific objects from the stream. Since streams are basically pipelines, only as many elements are produced as are necessary for finding this one element. In contrast to the conventional notation, the expressions via streams are usually much more compact. Using findFirst() is appropriate when the single entity we are after cannot be described declaratively as a set operation, so an imperative search would otherwise be necessary.

1.1.3 Reduce – Bring it down to a common denominator

The previous considerations have exclusively looked at transformations that map n elements onto m elements. The method reduce((v1,v2) -> …), however, enables the mapping of n elements onto one final element. Here the particular focus is on the different behaviour of serial and parallel streams.

reduce – merge and reduce

All methods that we have looked at so far were not able to include, for example, the element at position n-1 when processing element n. So how can we generate values that build on each other? As an example, let us say that the value n-1 always has to be attached to the value n. The input values are the elements of the sequence "A", "B", "C", "D", "E", and these elements are to be merged. Listing 15 shows the first version, once with a serial and once with a parallel stream. The method reduce((v1,v2) -> …) receives a lambda with two parameters, v1 and v2, whose contents are the intermediate result and the next element from the stream. In both versions the result is the same string "ABCDE", packed in an Optional – perhaps surprising at first, because the second version is a parallel stream.

final List<String> demoValues = Arrays.asList("A", "B", "C", "D", "E");
System.out.println(demoValues.stream()
    .reduce(String::concat));         //Optional[ABCDE]
System.out.println(demoValues.parallelStream()
    .reduce(String::concat));         //Optional[ABCDE]

Listing 15: merge and reduce

So let us change the implementation of Listing 15 to the one in Listing 16 to see a little more. Now the result for the serial version is X_ABCDE and for the parallel version X_AX_BX_CX_DX_E. This raises the question in which sub-step which partial result is produced (Listing 17). For this purpose we extend the output a little.

final List<String> demoValues = Arrays.asList("A", "B", "C", "D", "E");
//result is X_ABCDE
System.out.println(demoValues.stream()
    .reduce("X_", String::concat));
//result is X_AX_BX_CX_DX_E
System.out.println(demoValues.parallelStream()
    .reduce("X_", String::concat));

Listing 16: merge and reduce

The output is now extended with a postfix. Now we can easily recognize the individual steps that take place in the parallel stream. The serial version results in the string X_A_B_C_D_E_: the prefix comes first and then every element is concatenated, each followed by "_", everything in the same order in which it is available in the source stream. The result looks completely different in the parallel version: X_A_X_B__X_C_X_D_X_E____ (at the end there are four underscores, behind the B there are two). Here it is worth looking at the individual steps more precisely. (Listing 18) Note that the raw output in Listing 18 can look different on every run, because the print statements of the parallel threads interleave differently; the steps themselves and the final result, however, stay the same.

final List<String> demoValues = Arrays.asList("A", "B", "C", "D", "E");
System.out.println(demoValues.stream()
    .reduce("X_", (v1, v2) -> {
        System.out.println("v1 -> " + v1);
        System.out.println("v2 -> " + v2);
        return v1.concat(v2) + "_";
    }));
System.out.println(demoValues.parallelStream()
    .reduce("X_", (v1, v2) -> {
        System.out.println("v1 -> " + v1);
        System.out.println("v2 -> " + v2);
        return v1.concat(v2) + "_";
    }));

Listing 17: reduce

v1 -> X_
v1 -> X_
v2 -> D
v1 -> X_
v2 -> E
v1 -> X_D_
v2 -> X_E_
v1 -> X_
v2 -> B
v1 -> X_
v2 -> A
v2 -> C
v1 -> X_A_
v2 -> X_B_
v1 -> X_C_
v2 -> X_D_X_E__
v1 -> X_A_X_B__
v2 -> X_C_X_D_X_E___
X_A_X_B__X_C_X_D_X_E____

Listing 18: reduce output

If we track the individual steps, we recognize the partitioning that results from the processing of the parallel stream. Different partial results are concatenated; the overall content, however, stays constant, because the domain stays the same, as does the order in which the data enters the stream. A and B form one pair that is evaluated, D and E form another, C is then merged with the result of D and E, and finally the partial results are merged. For this reason there are two underscores after B and four at the very end after E. When we sort the sub-steps into a legible sequence and always print v1, v2 and the partial result, we obtain Listing 19.

v1 X_ plus v2_ B_ => X_B_
v1 X_ plus v2_ E_ => X_E_
v1 X_ plus v2_ D_ => X_D_
v1 X_ plus v2_ C_ => X_C_
v1 X_ plus v2_ A_ => X_A_
v1 X_D_ plus v2_ X_E__ => X_D_X_E__
v1 X_A_ plus v2_ X_B__ => X_A_X_B__
v1 X_C_ plus v2_ X_D_X_E___ => X_C_X_D_X_E___
v1 X_A_X_B__ plus v2_ X_C_X_D_X_E____ => X_A_X_B__X_C_X_D_X_E____

Listing 19: reduce output
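The diverging parallel result stems from the fact that "X_" is not a true identity element for the concatenation. As a small sketch of my own (not from the book's sources): with a genuine identity, serial and parallel reduction agree, and the prefix can simply be added afterwards.

//"" is the identity of String::concat, so serial and parallel agree
final String serial = demoValues.stream().reduce("", String::concat);           // ABCDE
final String parallel = demoValues.parallelStream().reduce("", String::concat); // ABCDE
System.out.println("X_" + serial);
System.out.println("X_" + parallel);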


The method reduce enables us to merge the values of the source stream into a single result. Here it is important to consider the distinction between serial and parallel processing: depending on the particular reduction function, the results can differ. With trivial things, such as finding a maximum value, such side issues do not occur; even with these trivial reductions, however, you should test whether the result is still equivalent to the desired outcome. The Streams API already includes many basic functions that spare you the development of such utilities. We will now take a look at these and show how they are used with some short examples.
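As a tiny illustration of the trivial reduction mentioned above – finding a maximum – assuming a small list of integers:

//reduce n integers to a single maximum; the order of evaluation does not matter here
final Optional<Integer> max = Arrays.asList(3, 9, 2, 7).stream()
    .reduce(Integer::max);
System.out.println(max.orElse(Integer.MIN_VALUE)); // 9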

1.1.4 Limit / Skip – Don't overdo it

Streams can be indefinitely long; in the extreme case they have no end at all. That is why it can sometimes be useful to process a stream only up to a certain length or to collect only a certain number of results, since the rest is not needed by the following logic. The method limit(count) is designed exactly for this case. The following example shows on the one hand how the initial set can be reduced and on the other hand how the set of results can be limited. (Listing 20) The remaining steps are thus always limited at the point where limit(count) is called.

final List<Integer> demoValues = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
//limit the input -> [1, 2, 3, 4]
System.out.println(demoValues
    .stream().limit(4)
    .collect(Collectors.toList()));
//limit the result -> [5, 6, 7, 8]
System.out.println(demoValues
    .stream().filter((v) -> v > 4)
    .limit(4)
    .collect(Collectors.toList()));

Listing 20: limit

The method skip(count) works a little differently. It also limits the stream, but from the other end: the counter indicates how many elements are skipped, and the end remains open. The limitation therefore takes place at the beginning by skipping n elements without processing them. (Listing 21) The method skip(count) can also appear several times and in several places of the entire construct.

//jumping over the first 4 elements -> [5, 6, 7, 8, 9, 10]
System.out.println(demoValues
    .stream().skip(4)
    .collect(Collectors.toList()));

Listing 21: skip
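skip and limit combine naturally into a simple paging idiom. A short sketch; pageIndex and pageSize are hypothetical names chosen purely for illustration:

final int pageSize = 3;
final int pageIndex = 1; // the second page
//page 1 of the demo values -> [4, 5, 6]
System.out.println(demoValues
    .stream()
    .skip((long) pageIndex * pageSize)
    .limit(pageSize)
    .collect(Collectors.toList()));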

1.1.5 Distinct – Just once please

We know the keyword DISTINCT from SQL, which reduces duplicated values to a single occurrence and thereby produces a unique set. The method distinct() does exactly the same thing. (Listing 22) The implementation in the class DistinctOps works on a ConcurrentHashMap, because this operation has also been developed for parallel streams; the distinct set is then the key set of that map. The determining factor is the hashCode() and equals() implementation of the elements that are to be transferred into the unique set. At this point you can influence the behaviour and the performance of the distinct operation.

// [77, 79, 81, 95, 43, 10, 53, 48,
//  74, 68, 60, 86, 83, 24, 57, 28, 8,
//  85, 70, 66, 20, 14, 97, 73, 22,
//  36, 40, 39, 32, 19, 41, 67, 25, 88]
final Random random = new Random();
System.out.println(
    Stream.generate(() -> random.nextInt(100))
        .limit(40)
        .distinct()
        .collect(Collectors.toList())
);

Listing 22: distinct
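Because distinct() is driven by equals() and hashCode(), a custom element type is only deduplicated as expected when both are implemented. A minimal sketch with a hypothetical nested helper class Point (not part of the book's sources):

//hypothetical value class – without equals()/hashCode() both points would survive distinct()
static final class Point {
    final int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }
    @Override public boolean equals(Object o) {
        return o instanceof Point && ((Point) o).x == x && ((Point) o).y == y;
    }
    @Override public int hashCode() { return 31 * x + y; }
}

final long count = Stream.of(new Point(1, 1), new Point(1, 1))
    .distinct()
    .count();
System.out.println(count); // 1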

1.1.6 Min / Max – Very small, very large

The methods min(<Comparator>) and max(<Comparator>) return the minimum or maximum of the values in the stream. (Listing 23) The value is determined by means of the Comparator, which means that all elements have to be iterated; it therefore cannot be performed on infinite streams. The definition of the Comparator determines what counts as a minimum and what as a maximum, and its implementation is one of the defining factors for performance, because it is applied to all elements. In any case, this is faster than sorting the elements and taking the first one with findFirst(), because the complexity of min/max is O(n) while the complexity of sorting is O(n log n).

//find the maximum
System.out.println(demoValues
    .stream().max(Integer::compareTo));
//find the BUG ;-)  (hint: the comparator is reversed, so min() actually yields the maximum)
System.out.println(demoValues
    .stream().min((v1, v2) -> Integer.compare(v2, v1)));

Listing 23: min / max

1.1.7 allMatch, anyMatch, noneMatch, count

The methods allMatch(<Predicate>), anyMatch(<Predicate>), noneMatch(<Predicate>) return a boolean.

• allMatch: true if the defined condition holds for all elements

• anyMatch: true if at least one element fulfils the condition

• noneMatch: true if no element fulfils the condition

Looking at the runtime of the individual methods, you can observe that noneMatch(<Predicate>) only has to examine the entire value supply when no element matches; like the other two, it cancels as soon as the result is determined. anyMatch(<Predicate>) and allMatch(<Predicate>) likewise cancel as soon as the result is derivable. In our case allMatch(<Predicate>) cancels after exactly one comparison, because the first element (an odd number) already does not match, and anyMatch(<Predicate>) cancels after two examined elements, because the second element is the first one that fulfils the condition. (Listing 24)

// true, some are matching
System.out.println("anyMatch " + demoValues.stream()
    .map((e) -> {
        System.out.println("e = " + e);
        return e;
    })
    .anyMatch((v) -> v % 2 == 0));
//false, not all are matching
System.out.println("allMatch " + demoValues.stream()
    .map((e) -> {
        System.out.println("e = " + e);
        return e;
    })
    .allMatch((v) -> v % 2 == 0));
//false, not all are NOT matching
System.out.println("noneMatch " + demoValues.stream()
    .map((e) -> {
        System.out.println("e = " + e);
        return e;
    })
    .noneMatch((v) -> v % 2 == 0));

Listing 24: match

e = 1
e = 2
anyMatch true
e = 1
allMatch false
e = 1
e = 2
noneMatch false
e = 1
e = 2
e = 3
e = 4
e = 5
e = 6
e = 7
e = 8
e = 9
e = 10

Listing 25: match output

Now only the method count() is missing. It can be explained quite simply: it returns the number of elements that remain in the stream at that point, i.e. after all preceding operations such as filter have been applied. (Listing 26)

//5 matching the filter: 2, 4, 6, 8, 10
System.out.println("count " + demoValues.stream()
    .map((e) -> {
        System.out.println("e = " + e);
        return e;
    })
    .filter((v) -> v % 2 == 0)
    .count());

Listing 26: count

1.1.8 Parallel / Sequential – Switch if necessary

The last two methods that we are going to look at here are parallel() and sequential(). With them, the stream-returning methods can be operated explicitly in a serial or a parallel mode: if a following operation cannot be performed in parallel, it can be switched back with the method call sequential(). You can decide for every individual stream section whether it should work in parallel or serially.

System.out.println(demoValues.stream() //serial
    .map((m1) -> m1)
    .parallel()
    .map((m2) -> m2)
    .sequential() //serial again
    .collect(Collectors.toList()));

Listing 27: parallel / sequential


1.1.9 Summary

These few basic methods already enable us to use streams efficiently and effectively. For practice, I recommend refactoring existing source code into stream constructs. You will see that this transformation involves a massive code reduction. In some places, thanks to streams, you can also parallelize subtasks, which leads to a higher utilization of modern CPU architectures. The rebuild will be worth it!

1.2 Streams vs. threads vs. serial

We will now look at an example in which we complete a task first in the classical serial approach, then by using threads, and finally with streams. How big are the differences in code complexity, and what are the differences in performance?

1.2.1 The task

Let us begin with a simple interface: the Worker (Listing 28), in which two methods used for the generation of synthetic load are defined. The goal is to build a matrix of markers (generateDemoValueMatrix) that is afterwards interpolated by means of splines (generateInterpolatedValues).

public interface Worker {
    public static final int ANZAHL_KURVEN = 200;
    public static final int ANZAHL_MESSWERTE = 10;
    public static final int MAX_GENERATED_INT = 100;
    public abstract List<List<Integer>> generateDemoValueMatrix();
    public abstract List<List<Double>> generateInterpolatedValues(List<List<Integer>> baseValues);
}

Listing 28: Interface Worker

For the examples, two parts have been factored out. The first part is the generation of a value sequence (DemoValueGenerator) for the markers. (Listing 29) The implementation was chosen as an interface with a default method, which corresponds to JDK 8 notation. At this point the implementation itself was still written without streams.

public interface DemoValueGenerator {
    public default List<Integer> generateDemoValuesForY() {
        final Random random = new Random();
        final List<Integer> result = new ArrayList<>();
        for (int i = 0; i < Worker.ANZAHL_MESSWERTE; i++) {
            final int nextInt = random.nextInt(Worker.MAX_GENERATED_INT);
            result.add(nextInt);
        }
        return result;
    }
}

Listing 29: DemoValueGenerator

The second part is the calculation of the interpolated values (WorkLoadGenerator), which on the one hand serves to generate load and on the other hand is used as an example for the integration of third-party code. (Listing 30)


public class WorkLoadGenerator {
    public static final int STEP_SIZE = 100;

    private UnivariateFunction createInterpolateFunction(final List<Integer> values) {
        final double[] valueArrayX = new double[values.size()];
        for (int i = 0; i < valueArrayX.length; i++) {
            valueArrayX[i] = (double) i * STEP_SIZE;
        }
        final double[] valueArrayY = new double[values.size()];
        int i = 0;
        for (final Integer value : values) {
            valueArrayY[i] = (double) value.intValue();
            i = i + 1;
        }
        final UnivariateInterpolator interpolator = new SplineInterpolator();
        final UnivariateFunction function = interpolator.interpolate(valueArrayX, valueArrayY);
        return function;
    }

    public List<Double> generate(final List<Integer> v) {
        final UnivariateFunction interpolateFunction = createInterpolateFunction(v);
        //build up the curve (baue Kurve auf)
        final int anzahlValuesInterpolated = (v.size() - 1) * STEP_SIZE;
        final List<Double> result = new ArrayList<>();
        for (int i = 0; i < anzahlValuesInterpolated - 1; i++) {
            final double valueForY = interpolateFunction.value(i);
            result.add(valueForY);
        }
        return result;
    }
}

Listing 30: WorkLoadGenerator

1.2.2 Sequential version

The sequential version (Listing 31) is kept rather simple and should be quite similar to a first implementation.

public class WorkerSerial implements Worker {

    @Override
    public List<List<Double>> generateInterpolatedValues(List<List<Integer>> baseValues) {
        final WorkLoadGenerator generator = new WorkLoadGenerator();
        final List<List<Double>> result = new ArrayList<>();
        for (final List<Integer> valueList : baseValues) {
            final List<Double> doubleList = generator.generate(valueList);
            result.add(doubleList);
        }
        return result;
    }

    private DemoValueGenerator valueGenerator = new DemoValueGenerator() {};

    public List<List<Integer>> generateDemoValueMatrix() {
        final List<List<Integer>> result = new ArrayList<>();
        for (int i = 0; i < ANZAHL_KURVEN; i++) {
            final List<Integer> demoValuesForY = valueGenerator.generateDemoValuesForY();
            result.add(demoValuesForY);
        }
        return result;
    }
}

Listing 31: WorkerSerial

The process is quite simple, since one curve is generated per measurement sequence, and this happens consecutively. It is obvious that this could be handled in parallel for each curve.

1.2.3 Parallel Versions with Threads

The first parallel version (Listing 32) is realized with some good old threads. One thread is responsible for each curve, and at the end the results are collected. It is immediately obvious that this implementation is much more complicated.

public class WorkerParallelThreads implements Worker {

    @Override
    public List<List<Integer>> generateDemoValueMatrix() {
        final List<List<Integer>> result = new ArrayList<>();
        final List<Task> taskList = new ArrayList<>();
        for (int i = 0; i < ANZAHL_KURVEN; i++) {
            taskList.add(new Task());
        }
        for (final Task task : taskList) {
            task.start(); // start() runs the task in its own thread; run() would execute it in the current thread
        }
        for (final Task task : taskList) {
            try {
                task.join();
                result.add(task.result);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
        return result;
    }

    @Override
    public List<List<Double>> generateInterpolatedValues(List<List<Integer>> baseValues) {
        final List<List<Double>> result = new ArrayList<>();
        final List<TaskInterpolate> taskList = new ArrayList<>();
        for (final List<Integer> baseValue : baseValues) {
            final TaskInterpolate taskInterpolate = new TaskInterpolate();
            taskInterpolate.values.addAll(baseValue);
            taskList.add(taskInterpolate);
        }
        for (final TaskInterpolate task : taskList) {
            task.start();
        }
        for (final TaskInterpolate task : taskList) {
            try {
                task.join();
                result.add(task.result);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
        return result;
    }

    public static class Task extends Thread {
        public List<Integer> result = new ArrayList<>();
        private DemoValueGenerator valueGenerator = new DemoValueGenerator() {};
        @Override
        public void run() {
            result.addAll(valueGenerator.generateDemoValuesForY());
        }
    }

    public static class TaskInterpolate extends Thread {
        public final List<Integer> values = new ArrayList<>();
        public final List<Double> result = new ArrayList<>();
        private final WorkLoadGenerator generator = new WorkLoadGenerator();
        @Override
        public void run() {
            result.addAll(generator.generate(values));
        }
    }
}

Listing 32: WorkerParallelThreads

1.2.4 Parallel versions with ExecutorService

Since JDK 5 the ExecutorService has enabled us to reuse threads that have already been instantiated. Here Callables are used, which correspond to the task definitions of the threads in the previous example. As a result, the effort for the developer is almost identical; the usage of the ExecutorService is a little easier than handling the threads themselves. (Listing 33)

public class WorkerParallelExecutorService implements Worker {

    private final ExecutorService executorService;

    public WorkerParallelExecutorService(ExecutorService executorService) {
        this.executorService = executorService;
    }

    @Override
    public List<List<Integer>> generateDemoValueMatrix() {
        final List<Task> taskList = new ArrayList<>();
        for (int i = 0; i < ANZAHL_KURVEN; i++) {
            taskList.add(new Task());
        }
        final List<List<Integer>> result = new ArrayList<>();
        try {
            final List<Future<List<Integer>>> futureList = executorService.invokeAll(taskList);
            for (final Future<List<Integer>> future : futureList) {
                final List<Integer> valueList = future.get();
                result.add(valueList);
            }
        } catch (InterruptedException | ExecutionException e) {
            e.printStackTrace();
        }
        return result;
    }

    @Override
    public List<List<Double>> generateInterpolatedValues(List<List<Integer>> baseValues) {
        final List<TaskInterpolate> taskList = new ArrayList<>();
        for (final List<Integer> baseValue : baseValues) {
            final TaskInterpolate taskInterpolate = new TaskInterpolate();
            taskInterpolate.values.addAll(baseValue);
            taskList.add(taskInterpolate);
        }
        final List<List<Double>> result = new ArrayList<>();
        try {
            final List<Future<List<Double>>> futureList = executorService.invokeAll(taskList);
            for (final Future<List<Double>> future : futureList) {
                final List<Double> valueList = future.get();
                result.add(valueList);
            }
        } catch (InterruptedException | ExecutionException e) {
            e.printStackTrace();
        }
        return result;
    }

    public static class Task implements Callable<List<Integer>> {
        private DemoValueGenerator valueGenerator = new DemoValueGenerator() {};
        @Override
        public List<Integer> call() {
            final List<Integer> result = new ArrayList<>();
            result.addAll(valueGenerator.generateDemoValuesForY());
            return result;
        }
    }

    public static class TaskInterpolate implements Callable<List<Double>> {
        public final List<Integer> values = new ArrayList<>();
        public final List<Double> result = new ArrayList<>();
        private final WorkLoadGenerator generator = new WorkLoadGenerator();
        @Override
        public List<Double> call() {
            result.addAll(generator.generate(values));
            return result;
        }
    }
}

Listing 33: WorkerParallelExecutorService

1.2.5 Parallel Version with Streams

Now to the implementation via streams. (Listing 34) The immense reduction of code is immediately evident. The entire part that is usually necessary for the parallelism is here reduced to the usage of the method parallelStream().

public class WorkerParallelStreams implements Worker {

    @Override
    public List<List<Integer>> generateDemoValueMatrix() {
        return Stream
            .generate(this::generateDemoValuesForY)
            .limit(ANZAHL_KURVEN)
            .collect(Collectors.toList());
    }

    @Override
    public List<List<Double>> generateInterpolatedValues(List<List<Integer>> baseValues) {
        final List<List<Double>> baseValueMatrix = generateDemoValueMatrix()
            .parallelStream()
            .map(v -> {
                final WorkLoadGenerator generator = new WorkLoadGenerator();
                return generator.generate(v);
            })
            .collect(Collectors.toList());
        return baseValueMatrix;
    }

    public List<Integer> generateDemoValuesForY() {
        final Random random = new Random();
        return Stream
            .generate(() -> random.nextInt(MAX_GENERATED_INT))
            .limit(ANZAHL_MESSWERTE)
            .collect(Collectors.toList());
    }
}

Listing 34: WorkerParallelStreams

Last but not least, everything can be defined in the interface itself. For this you just have to declare the methods of the class WorkerParallelStreams as default methods in the interface.

1.2.6 Parallel version with streams, default–methods

Since JDK 8, default methods can be declared in an interface. Using them, you can place the implementation directly in the interface if you want to spare one level in the inheritance hierarchy and if you assume that there will be further implementations. In our example this looks as follows (Listing 35).


//the constants ANZAHL_KURVEN, ANZAHL_MESSWERTE and MAX_GENERATED_INT are assumed to come from the Worker interface
public interface WorkerJDK8 {

    public default List<List<Integer>> generateDemoValueMatrix() {
        return Stream
            .generate(this::generateDemoValuesForY)
            .limit(ANZAHL_KURVEN)
            .collect(Collectors.toList());
    }

    public default List<List<Double>> generateInterpolatedValues(List<List<Integer>> baseValues) {
        final List<List<Double>> baseValueMatrix = generateDemoValueMatrix()
            .parallelStream()
            .map(v -> {
                final WorkLoadGenerator generator = new WorkLoadGenerator();
                return generator.generate(v);
            })
            .collect(Collectors.toList());
        return baseValueMatrix;
    }

    public default List<Integer> generateDemoValuesForY() {
        final Random random = new Random();
        return Stream
            .generate(() -> random.nextInt(MAX_GENERATED_INT))
            .limit(ANZAHL_MESSWERTE)
            .collect(Collectors.toList());
    }
}

Listing 35: WorkerJDK8

1.2.7 Summary

In summary, it can be said that the combination of streams and the additional new language elements of JDK 8 can lead to a clear code reduction. Even simple components can be parallelized without the developer having to deal with the otherwise necessary constructs such as threads. However, we must remember that all this is not possible without a good understanding of concurrency; also, a parallel implementation is not necessarily the most efficient one – let alone semantically the same.

1.2.8 JavaFX Example

Now that we have taken a look at streams by themselves, here is an example of their use in a JavaFX GUI. As a starting point, I used one of Oracle's official JavaFX examples. [2] I decided to use the line chart. Let us assume in the following that we must display n series of measurements. These measurements have been recorded every 100 time units (in our case generated at random) and are to be represented by curves; the measurements are interpolated via splines. For this I use the Apache commons-math3 library. [3] The series of measurements are to be displayed at the same time in the line chart. The following steps are necessary per curve:

• Get (generate) the series of measurements

• Calculate the interpolated values

• Generate the graphic elements

• Fill the line-chart


1.2.9 Get (Generate) the Series of Measurements

In this example, we generate the measurements ourselves. The data structure shall consist of a list which contains a list of integer values for each series of measurements: List<List<Integer>>. Traditionally, one proceeds as follows. (Listing 36)

public List<List<Integer>> generateDemoValueMatrix() {
    final List<List<Integer>> result = new ArrayList<>();
    for (int i = 0; i < 200; i++) {
        final List<Integer> generatedDemoValuesForY = generateDemoValuesForY();
        result.add(generatedDemoValuesForY);
    }
    return result;
}

public List<Integer> generateDemoValuesForY() {
    final Random random = new Random();
    final List<Integer> result = new ArrayList<>();
    for (int i = 0; i < 10; i++) {
        result.add(random.nextInt(100));
    }
    return result;
}

Listing 36: DemoValues preStreams

Thanks to the new Stream-API, we can formulate these constructs more efficiently. (Listing 37) There is no concurrency involved yet, but you can see the usage of the new Streams-API. Here, we notice the usage of the builder-pattern which enables a sequential description, similar to the usage of operators.

public List<List<Integer>> generateDemoValueMatrix() {
    return Stream
        .generate(this::generateDemoValuesForY)
        .limit(200) //number of curves (Anzahl Kurven)
        .collect(Collectors.toList());
}

public List<Integer> generateDemoValuesForY() {
    final Random random = new Random();
    return Stream
        .generate(() -> random.nextInt(100))
        .limit(10)
        .collect(Collectors.toList());
}

Listing 37: DemoValues Streams

1.2.10 Calculate the Interpolated Values

Here the values are calculated by using the Apache commons-math3 library [3]. The approach is very straightforward: at the beginning the interpolation algorithm is chosen, and it is then initialized with the existing markers. A separate instance of the UnivariateFunction is generated for every series of measurements so that no unwanted side effects occur during the later parallel execution. (Listing 38)

private UnivariateFunction createInterpolateFunction(final List<Integer> values) {
    final double[] valueArrayX = new double[values.size()];
    for (int i = 0; i < valueArrayX.length; i++) {
        valueArrayX[i] = (double) i * STEP_SIZE;
    }
    final double[] valueArrayY = new double[values.size()];
    int i = 0;
    for (final Integer value : values) {
        valueArrayY[i] = (double) value.intValue();
        i = i + 1;
    }
    final UnivariateInterpolator interpolator = new SplineInterpolator();
    final UnivariateFunction function = interpolator.interpolate(valueArrayX, valueArrayY);
    return function;
}

Listing 38: InterpolateFunction

The calculation of the missing measurement values themselves (Listing 39) is now executed (interpolateFunction.value(i)) in parallel (.parallelStream()) for every series of measurements (List<Double>). Here you immediately see that the syntactic effort for achieving parallelism approaches zero; the work is performed in the common fork/join pool with one single call on the stream. The sequential version, which iterates over the list of measurement series and calculates the values of one series after another, therefore needs more time.

private List<List<Double>> getValuesForSeries() {
    final List<List<Integer>> demoValueMatrix = generateDemoValueMatrix();
    final List<List<Double>> collect = demoValueMatrix
        .parallelStream()
        .map(v -> {
            final UnivariateFunction interpolateFunction = createInterpolateFunction(v);
            //build up the curve (baue Kurve auf)
            final int anzahlValuesInterpolated = (v.size() - 1) * STEP_SIZE;
            final List<Double> result = new ArrayList<>();
            for (int i = 0; i < anzahlValuesInterpolated - 1; i++) {
                final double valueForY = interpolateFunction.value(i);
                result.add(valueForY);
            }
            return result;
        })
        .collect(Collectors.toList());
    return collect;
}

Listing 39: calculate Values

1.2.11 Generate the graphic elements

For the measurements that have been calculated, the graphic elements can now be generated and then passed to the instance of the line chart. Here, too, the series of measurements are processed in parallel. (Listing 40)

private List<XYChart.Series> generateNextSeries() {
    final List<XYChart.Series> chartSeries = getValuesForSeries()
        .parallelStream()
        .map(v -> {
            final XYChart.Series nextSeries = new XYChart.Series();
            int i = 0;
            for (final Double valueForY : v) {
                final XYChart.Data data = new XYChart.Data(i, valueForY);
                nextSeries.getData().add(data);
                i = i + 1;
            }
            return nextSeries;
        }).collect(Collectors.toList());
    return chartSeries;
}

Listing 40: create GUI Elements

1.2.12 Fill the line-chart

Now the only thing left to do is to pass the graphic elements on. This is a sequential process and consists only of iterating over the list of the individual measurement series (List<XYChart.Series>). (Listing 41)

final List<XYChart.Series> serieses = generateNextSeries();
final ObservableList<XYChart.Series> data = lineChart.getData();
data.addAll(serieses);

Listing 41: Fill the line-chart

1.2.13 Summary

By using the Streams-API you can reach a higher degree of concurrency – even with simple tasks. The great thing about this is how simple it is to use: the syntax clearly indicates which parts are parallelized and which are not, and the developer does not have to deal with threads, as was the case in previous Java releases.

In this simple example, on the 11th iteration the serial version (LineChartSerialDemo) took 2,799,209,417 ns, the parallel version (LineChartDemo) 261,545,220 ns. After all, this is a speed-up of roughly a factor of 10.

1.3 Streams – Pattern examples

1.3.1 Is it a prime?

Let's start with a small example: the implementation (Listing 42) that determines whether a number is a prime. The method isPrime2 is realized without streams; isPrime1 is the equivalent realized by using streams. In this simple implementation the difference is still quite small, but it already shows the different readability and, in my opinion, the advantages of the stream version.

public static void main(String[] args) {
    for (int i = 0; i < 1_000_000; i++) {
        final boolean b = isPrime1(i) != isPrime2(i);
        if (b) System.out.println("ungleiches Ergebnis = " + i);
    }
}

public static boolean isPrime1(int n) {
    if (n <= 1) return false;
    if (n == 2) return true;
    return n >= 2 && IntStream
        .rangeClosed(2, (int) (Math.sqrt(n)))
        .allMatch((d) -> n % d != 0);
}

public static boolean isPrime2(int n) {
    if (n <= 1) return false;
    if (n == 2) return true;
    if (n % 2 == 0) return false;
    for (int i = 3; i <= Math.sqrt(n) + 1; i = i + 2) {
        if (n % i == 0) return false;
    }
    return true;
}

Listing 42: isPrime

1.3.2 Fibonacci as Stream

You can pass a Supplier instance to a stream. This also means that, as long as the stream is used, the same instance keeps being used. This can be exploited, for example, to generate number sequences whose values build on each other. In this example we show a trivial implementation that generates Fibonacci numbers.

public static void main(String[] args) {
    final Stream<Long> fibStream = makeFibStream(10);
    fibStream.forEachOrdered(System.out::println);
}

public static Stream<Long> makeFibStream() {
    return (Stream.generate(new FibonacciSupplier()));
}

public static Stream<Long> makeFibStream(int numFibs) {
    return (makeFibStream().limit(numFibs));
}

public static List<Long> makeFibList(int numFibs) {
    return (makeFibStream(numFibs)
        .collect(Collectors.toList()));
}

public static class FibonacciSupplier implements Supplier<Long> {
    private long previous = 0;
    private long current = 1;

    @Override
    public Long get() {
        long next = current + previous;
        previous = current;
        current = next;
        return (previous);
    }
}

Listing 43: Fibonacci

1.3.3 Matrix as stream

Streams also allow us to work elegantly on an n-dimensional matrix. In the following example (Listing 44), we search for the number 66 in a two-dimensional matrix. To keep things simple, it is assumed that the number will be found only once. The pre-streams solution is based on nested for loops with a label on the outermost loop. In general, you can deduce the following transformation rules:

• Common For-Loops can be mapped onto forEach if no cancellation during the loop iteration is required.


• If a condition is to be checked via if, then there are two alternatives.

• If without else: then this can be mapped onto the method filter

• If with else: this is mapped with the map method, inside which the case differentiation is performed (a small sketch follows Listing 44). Whether a transformation to streams is going to be profitable depends strongly on the control flow.

public static void main(String[] args) {
    final List<List<Integer>> matrix = new ArrayList<>();
    matrix.add(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9));
    matrix.add(Arrays.asList(1, 2, 3, 4, 5, 66, 7, 8, 9));
    matrix.add(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9));
    matrix.forEach(System.out::println);

    final Integer s = matrix.stream()
        .map(l -> l.stream()
            .filter(v -> v.equals(66))
            .findFirst().orElse(null))
        .filter(f -> f != null)
        .findFirst()
        .orElse(null);
    System.out.println("s = " + s);

    Integer result = null;
    endPos:
    for (final List<Integer> integers : matrix) {
        for (final Integer integer : integers) {
            if (integer.equals(66)) {
                result = integer;
                break endPos;
            }
        }
    }
    System.out.println("result " + result);
}

Listing 44: Matrix
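As a tiny sketch of the "if with else" rule mentioned above (my own illustration, not from the book's sources): the case differentiation moves into the lambda that is passed to map.

//the former if/else body now lives inside the map lambda
final List<String> labels = Arrays.asList(1, 2, 3, 4)
    .stream()
    .map(v -> {
        if (v % 2 == 0) {
            return v + " even";
        } else {
            return v + " odd";
        }
    })
    .collect(Collectors.toList()); // [1 odd, 2 even, 3 odd, 4 even]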

1.3.4 Summary

When using streams, the following questions (among others) come up:

• Is concurrency required or not? If yes, then in many cases streams in combination with parallelStream() are a simple and quick approach.

• Should the nesting of the control flow be reduced? Here it depends on the constructs within the case differentiation itself. Quite often, with slight alterations, you can build convincing constructs via streams that in the long run lead to better maintainability. Whether this is profitable for old projects must be decided case by case.

• Do you need to map mathematical functions? In many cases you can accelerate your success by using streams without having to integrate Scala or other functional languages into the project.

All in all, streams are a very effective support in the daily work with Java.

You will realize that the generic approach turns out to be quite a relief when working with typical business applications. The adjustment to streams should usually lead to noticeable results within two to three days. Try it!


2 Example Case and Prospects

The project

In a certain project, the objective was to thoroughly refactor a product that had been developed in Swing over a period of more than ten years. This was to happen in parallel with the planned development. To keep the adjustment effort for the developers low, the team was to refrain from using DSLs. A central goal was to massively reduce the current amount of code while preserving the existing functionality.

The implementation

The implementation of the project took place in the following steps.

In the beginning, automatic code transformations were performed within the static semantics in order to bring the code to the most modern language level possible – in other words, to transform as many old-language constructs as possible into new ones. For example, picture the transformation of all for loops to the forEach version. Subsequently, code duplicates were searched for in the homogenized code and removed. Then the manual restructuring began: by analysing the existing code, basic design patterns could be identified that enable structural searching and replacing. These few steps alone helped save a considerable amount of code.

How much do streams and lambdas kick ass?

The usage of streams and lambdas had a substantial impact on the reduction of code – a fact that can motivate you to switch to JDK 8 early on in a project. In source code that had been migrated to JDK 8, I could save up to 30% of the source text. The most important work here was the identification of existing patterns and their transformation to stream equivalents. A very pleasant side effect in many places is the additional parallelism and therefore the higher responsiveness.