Parallel streams in Java 8

ParallelStreams: Concurrent data processing in Java 8
David Gómez G. (@dgomezg)

Upload: david-gomez-garcia

Post on 17-Jul-2015







Page 2: Parallel streams in java 8

Do you remember?

use parallelStream()

for (int i = 0; i < 100; i++) {
    long start = System.currentTimeMillis();
    List<Integer> even = numbers.parallelStream()
        .filter(n -> n % 2 == 0)
        .sorted()
        .collect(toList());
    System.out.printf(
        "%d elements computed in %5d msecs with %d threads\n",
        even.size(), System.currentTimeMillis() - start, Thread.activeCount());
}

4999299 elements computed in 225 msecs with 9 threads
4999299 elements computed in 230 msecs with 9 threads
4999299 elements computed in 250 msecs with 9 threads


Page 3: Parallel streams in java 8

Previously on…

Page 4: Parallel streams in java 8

Streams? What’s that?

Page 5: Parallel streams in java 8

A Stream is…

A convenient way to iterate over collections declaratively

List<Integer> numbers = new ArrayList<Integer>();
for (int i = 0; i < 100; i++) {
    numbers.add(i);
}

List<Integer> evenNumbers = numbers.stream()
    .filter(n -> n % 2 == 0)
    .collect(toList());


Page 6: Parallel streams in java 8

Anatomy of a Stream


Intermediate Operations





Terminal (final) operation




Page 7: Parallel streams in java 8

Iterating a Stream

List<Integer> evenNumbers = numbers.stream()
    .filter(n -> n % 2 == 0)
    .collect(toList());

Internal iteration
- No manual Iterator handling
- Concise
- Fluent API: chain sequence processing

Elements computed only when needed
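The point about elements being computed only when needed can be checked with a small sketch. The class and method names below (LazyStreamDemo, inspectedUntilFirstEven) are hypothetical and not part of the original slides; the stream is kept sequential on purpose so that the order of inspection is predictable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LazyStreamDemo {

    // Returns the elements that were actually pulled through the pipeline
    // before findFirst() found its match.
    static List<Integer> inspectedUntilFirstEven(List<Integer> numbers) {
        List<Integer> inspected = new ArrayList<>();
        numbers.stream()                  // sequential on purpose
               .peek(inspected::add)      // record every element that is pulled
               .filter(n -> n % 2 == 0)
               .findFirst();              // short-circuits at the first even number
        return inspected;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 3, 4, 5, 6, 7);
        // Elements after the first match are never touched.
        System.out.println(inspectedUntilFirstEven(numbers)); // prints [1, 3, 4]
    }
}
```

Nothing runs until the terminal operation is invoked, and the pipeline stops pulling elements as soon as the result is known.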


Page 8: Parallel streams in java 8

Iterating a Stream

List<Integer> evenNumbers = numbers.parallelStream()
    .filter(n -> n % 2 == 0)
    .collect(toList());

Easy parallelism
- Concurrency is hard to get right!
- Uses ForkJoin
- Processing steps should be
  - stateless
  - independent
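What "stateless" and "independent" mean in practice can be sketched as follows. StatelessStepsDemo and evens are hypothetical names used only for illustration; the unsafe variant is described in a comment rather than executed, because its failures are timing-dependent.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class StatelessStepsDemo {

    // Safe: every step looks only at its own element, and collect() merges
    // per-thread partial results without any shared mutable state.
    static List<Integer> evens(int upperBound) {
        return IntStream.range(0, upperBound)
                .parallel()
                .filter(n -> n % 2 == 0)
                .boxed()
                .collect(Collectors.toList());
    }

    // Unsafe (do not do this): calling add() on one shared ArrayList from a
    // parallel forEach() makes the steps depend on shared state. ArrayList is
    // not thread-safe, so elements may be lost or an
    // ArrayIndexOutOfBoundsException may be thrown.

    public static void main(String[] args) {
        System.out.println(evens(10)); // prints [0, 2, 4, 6, 8]
    }
}
```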


Page 9: Parallel streams in java 8

Parallel Streams

use stream()

List<Integer> numbers = new ArrayList<>();
for (int i = 0; i < 10_000_000; i++) {
    numbers.add((int) Math.round(Math.random() * 100));
}

//This will use just a single thread
Stream<Integer> evenNumbers = numbers.stream();

or parallelStream()

//Automatically selects the number of threads (by default, based on the available processors)
Stream<Integer> evenNumbers = numbers.parallelStream();


Page 10: Parallel streams in java 8

Let’s test it

use stream()

for (int i = 0; i < 100; i++) {
    long start = System.currentTimeMillis();
    List<Integer> even = numbers.stream()
        .filter(n -> n % 2 == 0)
        .sorted()
        .collect(toList());
    System.out.printf(
        "%d elements computed in %5d msecs with %d threads\n",
        even.size(), System.currentTimeMillis() - start, Thread.activeCount());
}

5001983 elements computed in 828 msecs with 2 threads
5001983 elements computed in 843 msecs with 2 threads
5001983 elements computed in 675 msecs with 2 threads
5001983 elements computed in 795 msecs with 2 threads


Page 11: Parallel streams in java 8

Going parallel

use parallelStream()

for (int i = 0; i < 100; i++) {
    long start = System.currentTimeMillis();
    List<Integer> even = numbers.parallelStream()
        .filter(n -> n % 2 == 0)
        .sorted()
        .collect(toList());
    System.out.printf(
        "%d elements computed in %5d msecs with %d threads\n",
        even.size(), System.currentTimeMillis() - start, Thread.activeCount());
}

4999299 elements computed in 225 msecs with 9 threads
4999299 elements computed in 230 msecs with 9 threads
4999299 elements computed in 250 msecs with 9 threads


Page 12: Parallel streams in java 8

Previously on…

Page 13: Parallel streams in java 8

Parallelism Under the hood

Page 14: Parallel streams in java 8

Fork/Join Framework

Proposed by Doug Lea

"a style of parallel programming in which problems are solved by (recursively) splitting them into subtasks that are solved in parallel."

Available in Java 7

Used by ParallelStreams

Page 15: Parallel streams in java 8

The F/J algorithm

Result solve(Problem problem) {
    if (problem is small)
        directly solve problem
    else {
        split problem into independent parts
        fork new subtasks to solve each part
        join all subtasks
        compose result from subresults
    }
}

as proposed by Doug Lea
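The pseudocode above maps fairly directly onto the Java 7 API. The following is a minimal sketch, not taken from the slides: SumTask, its THRESHOLD value and the sum helper are hypothetical, and the threshold is an arbitrary cut-off that would need tuning for a real workload.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.stream.IntStream;

class SumTask extends RecursiveTask<Long> {
    // Assumed cut-off below which the problem counts as "small".
    private static final int THRESHOLD = 10_000;

    private final int[] data;
    private final int from;
    private final int to;

    SumTask(int[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            // Problem is small: solve it directly.
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        // Split into two independent parts.
        int mid = (from + to) >>> 1;
        SumTask left = new SumTask(data, from, mid);
        SumTask right = new SumTask(data, mid, to);
        left.fork();                       // fork one subtask
        long rightSum = right.compute();   // solve the other in this thread
        long leftSum = left.join();        // join the forked subtask
        return leftSum + rightSum;         // compose the result
    }

    static long sum(int[] data) {
        return ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));
    }

    public static void main(String[] args) {
        int[] data = IntStream.rangeClosed(1, 100_000).toArray();
        System.out.println(sum(data)); // prints 5000050000
    }
}
```

Computing one half in the current thread instead of forking both halves is a common idiom that saves one task hand-off per split.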

Page 16: Parallel streams in java 8


ForkJoinPool

ExecutorService implementation that
• has a defined number of Workers (threads)
• executes ForkJoinTasks
  • submitted by execute(ForkJoinTask task)
  • or by invoke(ForkJoinTask task)

Page 17: Parallel streams in java 8


ForkJoinTask

Abstract class that represents a task to be run concurrently

Every ForkJoinTask can be split (if it is not small enough) and solved recursively

Two concrete implementations
• RecursiveAction if not returning a value
• RecursiveTask if returning a value
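As an illustration of the variant that returns no value, here is a minimal RecursiveAction sketch that doubles every element of an array in place. DoubleInPlace, its THRESHOLD and the apply helper are hypothetical names chosen for this example.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

class DoubleInPlace extends RecursiveAction {
    // Assumed cut-off below which a range is processed directly.
    private static final int THRESHOLD = 5_000;

    private final int[] data;
    private final int from;
    private final int to;

    DoubleInPlace(int[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected void compute() {
        if (to - from <= THRESHOLD) {
            for (int i = from; i < to; i++) {
                data[i] *= 2;
            }
            return;
        }
        int mid = (from + to) >>> 1;
        // The two halves touch disjoint index ranges, so they are independent.
        invokeAll(new DoubleInPlace(data, from, mid),
                  new DoubleInPlace(data, mid, to));
    }

    static void apply(int[] data) {
        ForkJoinPool.commonPool().invoke(new DoubleInPlace(data, 0, data.length));
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4};
        apply(data);
        System.out.println(Arrays.toString(data)); // prints [2, 4, 6, 8]
    }
}
```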

Page 18: Parallel streams in java 8


ForkJoinWorkerThread

Any of the threads created by the ForkJoinPool

Executes ForkJoinTasks

Each worker has its own deque (double-ended queue) of tasks, which allows task stealing

Page 19: Parallel streams in java 8


Result solve(Problem problem) {
    if (problem is small)
        directly solve problem
    else {
        split problem into independent parts
        fork new subtasks to solve each part
        join all subtasks
        compose result from subresults
    }
}

the F/J algorithm

plus Task Stealing.

Page 20: Parallel streams in java 8

Fork/Join. When to use?

For computations that can be split into smaller, independent tasks (aka ‘divide and conquer’ algorithms)

Reduction with no contention.
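To make "reduction with no contention" concrete, the sketch below contrasts a built-in reduction with one that funnels every element through a single shared counter. ContentionFreeSum and both method names are hypothetical. Both variants return the same value; the difference lies in how much the worker threads compete for the same memory location.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.LongStream;

class ContentionFreeSum {

    // No contention: each worker sums its own chunk and the partial sums
    // are combined at the end.
    static long sumOfSquares(int n) {
        return LongStream.rangeClosed(1, n)
                .parallel()
                .map(x -> x * x)
                .sum();
    }

    // Contended: correct, but every element updates the same AtomicLong,
    // so the threads keep competing for it.
    static long sumOfSquaresContended(int n) {
        AtomicLong total = new AtomicLong();
        LongStream.rangeClosed(1, n)
                .parallel()
                .forEach(x -> total.addAndGet(x * x));
        return total.get();
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1000));          // prints 333833500
        System.out.println(sumOfSquaresContended(1000)); // prints 333833500
    }
}
```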

Page 21: Parallel streams in java 8

ParallelStreams in action!

Page 22: Parallel streams in java 8


for (int i = 0; i < 100; i++) {
    long start = System.currentTimeMillis();
    List<Integer> even = numbers.parallelStream()
        .filter(n -> n % 2 == 0)
        .sorted()
        .collect(toList());
    System.out.printf(
        "%d elements computed in %5d msecs with %d threads\n",
        even.size(), System.currentTimeMillis() - start, Thread.activeCount());
}

4999299 elements computed in 225 msecs with 9 threads
4999299 elements computed in 230 msecs with 9 threads
4999299 elements computed in 250 msecs with 9 threads

Page 23: Parallel streams in java 8

Thread.activeCount not accurate

for (int i = 0; i < 100; i++) {
    long start = System.currentTimeMillis();
    List<Integer> even = numbers.parallelStream()
        .filter(n -> n % 2 == 0)
        .sorted()
        .collect(toList());
    System.out.printf(
        "%d elements computed in %5d msecs with %d threads\n",
        even.size(), System.currentTimeMillis() - start, Thread.activeCount());
}

Thread.activeCount() does not show the effective number of threads processing the stream

Page 24: Parallel streams in java 8

Better count threads involved

//ConcurrentHashMap.newKeySet() gives a thread-safe Set in plain Java 8
Set<String> workerThreadNames = ConcurrentHashMap.newKeySet();

for (int i = 0; i < 100; i++) {
    long start = System.currentTimeMillis();
    List<Integer> even = numbers.parallelStream()
        .filter(n -> n % 2 == 0)
        .peek(n -> workerThreadNames.add(
            Thread.currentThread().getName()))
        .sorted()
        .collect(toList());
    System.out.printf(
        "%d elements computed in %5d msecs with %d threads\n",
        even.size(), System.currentTimeMillis() - start, workerThreadNames.size());
}

Page 25: Parallel streams in java 8

Threads usage

ParallelStreams use the common ForkJoinPool

Number of worker threads configured with -Djava.util.concurrent.ForkJoinPool.common.parallelism=n

Useful to keep CPU parallelism under control…

…but …
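A quick way to see which parallelism the common pool ended up with is to ask it directly. This is a small sketch; CommonPoolInfo and commonPoolParallelism are hypothetical names, while ForkJoinPool.getCommonPoolParallelism() is the standard Java 8 method.

```java
import java.util.concurrent.ForkJoinPool;

class CommonPoolInfo {

    // Target parallelism of the common pool, as configured by default
    // or through the system property.
    static int commonPoolParallelism() {
        return ForkJoinPool.getCommonPoolParallelism();
    }

    public static void main(String[] args) {
        System.out.println("available processors:    "
                + Runtime.getRuntime().availableProcessors());
        System.out.println("common pool parallelism: "
                + commonPoolParallelism());
    }
}
```

Running it as `java -Djava.util.concurrent.ForkJoinPool.common.parallelism=4 CommonPoolInfo` should report a parallelism of 4. The property is read when the common pool is initialized, so it has to be set at JVM start-up (or before the pool is first used).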

Page 26: Parallel streams in java 8

Limiting parallelism

for (int i = 0; i < 100; i++) {
    long start = System.currentTimeMillis();
    List<Integer> even = numbers.parallelStream()
        .filter(n -> n % 2 == 0)
        .peek(n -> workerThreadNames.add(
            Thread.currentThread().getName()))
        .sorted()
        .collect(toList());
    System.out.printf(
        "%d elements computed in %5d msecs with %d threads\n",
        even.size(), System.currentTimeMillis() - start, workerThreadNames.size());
}


5001069 elements computed in 269 msecs with 5 threads


Page 27: Parallel streams in java 8

Limiting parallelism

for (int i = 0; i < 100; i++) {
    long start = System.currentTimeMillis();
    List<Integer> even = numbers.parallelStream()
        .filter(n -> n % 2 == 0)
        .peek(n -> workerThreadNames.add(
            Thread.currentThread().getName()))
        .sorted()
        .collect(toList());
    System.out.printf(
        "%d elements computed in %5d msecs with %d threads\n",
        even.size(), System.currentTimeMillis() - start, workerThreadNames.size());
}
System.out.println("credits to threads: " + workerThreadNames);

5001069 elements computed in 269 msecs with 5 threads
credits to threads: ForkJoinPool.commonPool-worker-0, ForkJoinPool.commonPool-worker-1, ForkJoinPool.commonPool-worker-2, ForkJoinPool.commonPool-worker-3, main


Page 28: Parallel streams in java 8

Threads Involved in ParallelStream

ParallelStreams use the common ForkJoinPool

Thread invoking ParallelStream also used as Worker

Caveats:
• ParallelStream processing is synchronous for the invoking thread
• Other threads using the common ForkJoinPool could be affected

Page 29: Parallel streams in java 8

ParallelStream Hack

ParallelStream can be forced to use a custom ForkJoinPool

ForkJoinPool forkJoinPool = new ForkJoinPool(4);
long start = System.currentTimeMillis();

numbers.parallelStream()
    .filter(n -> n % 2 == 0)
    .sorted()
    .collect(toList());

Page 30: Parallel streams in java 8

ParallelStream Hack

ParallelStream can be forced to use a custom ForkJoinPool

ForkJoinPool forkJoinPool = new ForkJoinPool(4);
long start = System.currentTimeMillis();
ForkJoinTask<List<Integer>> task =
    forkJoinPool.submit(() -> {
        return numbers.parallelStream()
            .filter(n -> n % 2 == 0)
            .sorted()
            .collect(toList());
    });
List<Integer> even = task.get();

Page 31: Parallel streams in java 8

ParallelStream Hack

ParallelStream can be forced to use a custom ForkJoinPool

ForkJoinPool forkJoinPool = new ForkJoinPool(4);
ForkJoinTask<List<Integer>> task =
    forkJoinPool.submit(() -> {
        return numbers.parallelStream()
            .filter(n -> n % 2 == 0)
            .sorted()
            .collect(toList());
    });
List<Integer> even = task.get();

Task submitted in 1 msecs
5000805 elements computed in 328 msecs with 4 threads

Page 32: Parallel streams in java 8

ParallelStream Hack benefits

A custom ExecutorService
• Does not affect other ParallelStreams
• Does not affect common ForkJoinPool users
• Reduces unpredictable latency due to other common ForkJoinPool load
• Invoking thread not used as worker (async parallel process)
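A self-contained version of the hack, with the thread-name bookkeeping from the earlier slides, might look like the sketch below. CustomPoolDemo and threadsUsed are hypothetical names. Note that the technique relies on the fact that a parallel stream started from inside a ForkJoinPool worker forks its subtasks into that same pool; this is observed behavior of current JDK implementations rather than something the Stream API documentation guarantees.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class CustomPoolDemo {

    // Runs a parallel stream inside a dedicated pool and returns the names
    // of the threads that processed at least one element.
    static Set<String> threadsUsed(int parallelism, int elements) {
        Set<String> names = ConcurrentHashMap.newKeySet();
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            pool.submit(() -> {
                return IntStream.range(0, elements)
                        .parallel()
                        .peek(n -> names.add(Thread.currentThread().getName()))
                        .filter(n -> n % 2 == 0)
                        .boxed()
                        .collect(Collectors.toList());
            }).get(); // the caller only waits here; it does no stream work itself
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown(); // custom pools are not cleaned up automatically
        }
        return names;
    }

    public static void main(String[] args) {
        // Expect names of the custom pool's workers,
        // not commonPool workers and not "main".
        System.out.println(threadsUsed(4, 1_000_000));
    }
}
```

Shutting the pool down in a finally block matters: unlike the common pool, a custom ForkJoinPool keeps its threads until it is shut down or they time out.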

Page 33: Parallel streams in java 8

Problems derived from Common ForkJoinPool

Page 34: Parallel streams in java 8

Blocking for IO

If the first URLs get stuck on a connection timeout, overall performance could be affected

Stream<String> urls = Files.lines(Paths.get("urlsToCheck.txt"));
List<String> errors = urls.parallel().filter(url -> {
    //Connect to URL and wait for 200 response or timeout
    return true;
}).collect(toList());

Page 35: Parallel streams in java 8

Nested parallelStreams

Outer parallelStream could exhaust ForkJoin Workers:

long start = System.currentTimeMillis();
IntStream.range(0, 10_000).parallel()
    .forEach(i -> {
        results[i][0] = (int) Math.round(Math.random() * 100);
        IntStream.range(1, 9_999)
            .parallel().forEach((int j) ->
                results[i][j] = (int) Math.round(Math.random() * 1000));
    });

Process finalized in 22974 msecs
Process finalized in 22575 msecs
Process finalized in 22606 msecs

Page 36: Parallel streams in java 8

Nested parallelStreams

Outer parallelStream could exhaust ForkJoin Workers:

long start = System.currentTimeMillis();
IntStream.range(0, 10_000).parallel()
    .forEach(i -> {
        results[i][0] = (int) Math.round(Math.random() * 100);
        IntStream.range(1, 9_999)
            .sequential().forEach((int j) ->
                results[i][j] = (int) Math.round(Math.random() * 1000));
    });

Process finalized in 12491 msecs
Process finalized in 12589 msecs
Process finalized in 12798 msecs

Page 37: Parallel streams in java 8

Other performance problems

Page 38: Parallel streams in java 8

Too much Auto(un)boxing

Unboxing and boxing of Integers in every filter call

List<Integer> even = numbers.parallelStream()
    .filter(n -> n % 2 == 0)
    .sorted()
    .collect(toList());

4999464 elements computed in 290 msecs with 8 threads
4999464 elements computed in 276 msecs with 8 threads
4999464 elements computed in 257 msecs with 8 threads
4999464 elements computed in 265 msecs with 8 threads

Page 39: Parallel streams in java 8

Less Auto(un)boxing

Unbox once with mapToInt, work on primitive ints, and box once at the end

List<Integer> even = numbers.parallelStream()
    .mapToInt(n -> n)
    .filter(n -> n % 2 == 0)
    .sorted()
    .boxed()
    .collect(toList());

4999460 elements computed in 160 msecs with 8 threads
4999460 elements computed in 243 msecs with 8 threads
4999460 elements computed in 144 msecs with 8 threads
4999460 elements computed in 140 msecs with 8 threads
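If the data can be kept in a primitive array from the start, boxing can be avoided altogether. The following sketch is an assumption about how that could look, not something shown in the slides; PrimitiveEvens and sortedEvens are hypothetical names, and no timing is claimed for it.

```java
import java.util.Arrays;

class PrimitiveEvens {

    // Works on int values end to end: no Integer objects are created.
    static int[] sortedEvens(int[] numbers) {
        return Arrays.stream(numbers)   // IntStream over the array
                .parallel()
                .filter(n -> n % 2 == 0)
                .sorted()
                .toArray();
    }

    public static void main(String[] args) {
        int[] numbers = {5, 4, 3, 2, 1, 0, 8};
        System.out.println(Arrays.toString(sortedEvens(numbers))); // prints [0, 2, 4, 8]
    }
}
```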

Page 40: Parallel streams in java 8


Page 41: Parallel streams in java 8


ParallelStreams ease concurrent processing, but:
• Understand how they work
• Don’t abuse the default common ForkJoinPool
• Don’t use them when blocking on IO
  • Or use a custom ForkJoinPool
• Avoid unnecessary autoboxing
• Don’t add contention or synchronisation
• Be careful with nested parallel streams
• Use method references when sorting