Futures, Scheduling, and Work Distribution
Companion slides for The Art of Multiprocessor Programming
by Maurice Herlihy & Nir Shavit
(Some images in this lecture courtesy of Charles Leiserson)
How to write Parallel Apps?
• Split a program into parallel parts
• In an effective way
• Thread management
Matrix Multiplication
$$C = A \cdot B$$
Matrix Multiplication
$$c_{ij} = \sum_{k=0}^{N-1} a_{ik} \cdot b_{kj}$$
Matrix Multiplication

    class Worker extends Thread {            // a thread
      int row, col;                          // which matrix entry to compute
      Worker(int row, int col) {
        this.row = row; this.col = col;
      }
      public void run() {                    // actual computation
        double dotProduct = 0.0;
        for (int i = 0; i < n; i++)
          dotProduct += a[row][i] * b[i][col];
        c[row][col] = dotProduct;
      }
    }
Matrix Multiplication

    void multiply() {
      Worker[][] worker = new Worker[n][n];
      // Create n x n threads
      for (int row …)
        for (int col …)
          worker[row][col] = new Worker(row,col);
      // Start them
      for (int row …)
        for (int col …)
          worker[row][col].start();
      // Wait for them to finish
      for (int row …)
        for (int col …)
          worker[row][col].join();
    }

What’s wrong with this picture?
Thread Overhead
• Threads require resources
  – Memory for stacks
  – Setup, teardown
  – Scheduler overhead
• Short-lived threads
  – Ratio of work to overhead is bad
Thread Pools
• More sensible to keep a pool of
  – long-lived threads
• Threads assigned short-lived tasks
  – Run the task
  – Rejoin pool
  – Wait for next assignment
Thread Pool = Abstraction
• Insulate programmer from platform
  – Big machine, big pool
  – Small machine, small pool
• Portable code
  – Works across platforms
  – Worry about algorithm, not platform
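A minimal sketch of the pool idea, assuming `Executors.newFixedThreadPool` from `java.util.concurrent` and a pool sized from `Runtime.getRuntime().availableProcessors()`. The class `PoolDemo` and the helper `distinctThreads` are hypothetical names used only for illustration. It runs many short tasks and counts how many distinct long-lived threads actually executed them.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class PoolDemo {
  // Runs `tasks` short tasks on a pool of `poolSize` threads and returns
  // how many distinct threads executed them.
  static int distinctThreads(int poolSize, int tasks) throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(poolSize);
    Set<String> names = ConcurrentHashMap.newKeySet();
    for (int i = 0; i < tasks; i++) {
      // Each short-lived task only records which pool thread ran it.
      pool.execute(() -> names.add(Thread.currentThread().getName()));
    }
    pool.shutdown();                              // accept no new tasks
    pool.awaitTermination(10, TimeUnit.SECONDS);  // wait for queued tasks
    return names.size();
  }

  public static void main(String[] args) throws InterruptedException {
    // Size the pool to the platform: big machine, big pool.
    int poolSize = Runtime.getRuntime().availableProcessors();
    int used = distinctThreads(poolSize, 1000);
    System.out.println(used + " threads ran 1000 tasks (pool size " + poolSize + ")");
  }
}
```

The task code never mentions the machine size; only the pool construction does.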
ExecutorService Interface
• In java.util.concurrent
  – Task = Runnable object
    • If no result value expected
    • Calls run() method.
  – Task = Callable<T> object
    • If result value of type T expected
    • Calls T call() method.
Future<T>

    Callable<T> task = …;
    …
    Future<T> future = executor.submit(task);
    …
    T value = future.get();

• Submitting a Callable<T> task returns a Future<T> object
• The Future’s get() method blocks until the value is available
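As an illustration of the submit/get pattern, here is a sketch that revisits matrix multiplication with one `Callable<Double>` per entry instead of one thread per entry. The class name `PooledMultiply`, the fixed-size pool, and the restriction to square matrices are assumptions of this sketch, not part of the slides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PooledMultiply {
  // Computes c = a * b for square matrices, one task per entry of c.
  static double[][] multiply(double[][] a, double[][] b) throws Exception {
    int n = a.length;
    ExecutorService executor = Executors.newFixedThreadPool(
        Runtime.getRuntime().availableProcessors());
    try {
      List<List<Future<Double>>> futures = new ArrayList<>();
      for (int row = 0; row < n; row++) {
        List<Future<Double>> rowFutures = new ArrayList<>();
        for (int col = 0; col < n; col++) {
          final int r = row, k = col;
          Callable<Double> task = () -> {
            double dotProduct = 0.0;
            for (int i = 0; i < n; i++)
              dotProduct += a[r][i] * b[i][k];
            return dotProduct;
          };
          rowFutures.add(executor.submit(task));  // returns a Future<Double>
        }
        futures.add(rowFutures);
      }
      double[][] c = new double[n][n];
      for (int row = 0; row < n; row++)
        for (int col = 0; col < n; col++)
          c[row][col] = futures.get(row).get(col).get();  // blocks until ready
      return c;
    } finally {
      executor.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    double[][] a = {{1, 2}, {3, 4}};
    double[][] b = {{5, 6}, {7, 8}};
    System.out.println(Arrays.deepToString(multiply(a, b)));
  }
}
```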
Future<?>

    Runnable task = …;
    …
    Future<?> future = executor.submit(task);
    …
    future.get();

• Submitting a Runnable task returns a Future<?> object
• The Future’s get() method blocks until the computation is complete
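A small sketch of the Runnable case, assuming a single-thread executor. `RunnableDemo`, `box`, and `submitAndWait` are hypothetical names. The task produces no value, so `get()` only signals completion and yields `null`; the task's side effect is visible once `get()` returns.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class RunnableDemo {
  static int[] box = new int[1];  // shared cell the task writes into

  // Submits a Runnable, waits for it, and returns whatever get() returned.
  static Object submitAndWait() throws Exception {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    try {
      Runnable task = () -> box[0] = 42;         // side effect only, no result
      Future<?> future = executor.submit(task);
      return future.get();                       // blocks until run() completes
    } finally {
      executor.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    Object result = submitAndWait();
    System.out.println(result + " " + box[0]);   // prints "null 42"
  }
}
```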
Note
• Executor Service submissions
  – Like New England traffic signs
  – Are purely advisory in nature
• The executor
  – Like the Boston driver
  – Is free to ignore any such advice
  – And could execute tasks sequentially …
Matrix Addition
$$\begin{pmatrix} C_{00} & C_{01} \\ C_{10} & C_{11} \end{pmatrix} = \begin{pmatrix} A_{00} + B_{00} & A_{01} + B_{01} \\ A_{10} + B_{10} & A_{11} + B_{11} \end{pmatrix}$$
4 parallel additions
Matrix Addition Task

    class AddTask implements Runnable {
      Matrix a, b; // add this!
      public void run() {
        if (a.dim == 1) {
          c[0][0] = a[0][0] + b[0][0]; // base case: add directly
        } else {
          // constant-time operation
          (partition a, b into half-size matrices aij and bij)
          // submit 4 tasks
          Future<?> f00 = exec.submit(new AddTask(a00,b00));
          …
          Future<?> f11 = exec.submit(new AddTask(a11,b11));
          // let them finish
          f00.get(); …; f11.get();
          …
        }
      }
    }
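The slide code leaves the partitioning step abstract. A runnable sketch under stated assumptions: plain `double[][]` arrays stand in for the `Matrix` type, a quadrant is described by its top-left corner and dimension (so partitioning takes constant time), and the dimension is a power of two. `ParallelAdd` and its field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelAdd {
  // Cached pool: a parent blocked in get() never starves its children of threads.
  static final ExecutorService exec = Executors.newCachedThreadPool();

  // Adds the dim x dim blocks of a and b whose top-left corner is (row, col).
  static class AddTask implements Runnable {
    final double[][] a, b, c;
    final int row, col, dim;
    AddTask(double[][] a, double[][] b, double[][] c, int row, int col, int dim) {
      this.a = a; this.b = b; this.c = c;
      this.row = row; this.col = col; this.dim = dim;
    }
    public void run() {
      if (dim == 1) {
        c[row][col] = a[row][col] + b[row][col];  // base case: add directly
        return;
      }
      int half = dim / 2;                         // constant-time partition
      List<Future<?>> futures = new ArrayList<>();
      for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++)               // submit 4 quadrant tasks
          futures.add(exec.submit(
              new AddTask(a, b, c, row + i * half, col + j * half, half)));
      try {
        for (Future<?> f : futures) f.get();      // let them finish
      } catch (Exception e) {
        throw new RuntimeException(e);
      }
    }
  }

  // Assumes square matrices whose dimension is a power of two.
  static double[][] add(double[][] a, double[][] b) throws Exception {
    int n = a.length;
    double[][] c = new double[n][n];
    exec.submit(new AddTask(a, b, c, 0, 0, n)).get();
    return c;
  }

  public static void main(String[] args) throws Exception {
    double[][] a = {{1, 2}, {3, 4}};
    double[][] b = {{10, 20}, {30, 40}};
    System.out.println(Arrays.deepToString(add(a, b)));
    exec.shutdown();
  }
}
```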
Dependencies
• Matrix example is not typical
• Tasks are independent
  – Don’t need results of one task …
  – To complete another
• Often tasks are not independent
Fibonacci
$$F(n) = \begin{cases} 1 & \text{if } n = 0 \text{ or } 1 \\ F(n-1) + F(n-2) & \text{otherwise} \end{cases}$$
• Note
  – Potential parallelism
  – Dependencies
Disclaimer
• This Fibonacci implementation is
  – Egregiously inefficient
    • So don’t try this at home or job!
  – But illustrates our point
    • How to deal with dependencies
• Exercise:
  – Make this implementation efficient!
Multithreaded Fibonacci

    class FibTask implements Callable<Integer> {
      static ExecutorService exec = Executors.newCachedThreadPool();
      int arg;
      public FibTask(int n) {
        arg = n;
      }
      public Integer call() {
        if (arg > 2) {
          // parallel calls
          Future<Integer> left = exec.submit(new FibTask(arg-1));
          Future<Integer> right = exec.submit(new FibTask(arg-2));
          // pick up & combine results
          return left.get() + right.get();
        } else {
          return 1;
        }
      }
    }
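A compilable version of the same task, offered as a sketch. The wrapper class `FibDemo` and the helper `fib` are hypothetical additions, and `call()` is declared `throws Exception` because `Future.get()` throws checked exceptions.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FibDemo {
  static final ExecutorService exec = Executors.newCachedThreadPool();

  static class FibTask implements Callable<Integer> {
    final int arg;
    FibTask(int n) { arg = n; }
    public Integer call() throws Exception {
      if (arg > 2) {
        // Parallel calls: each subproblem becomes its own task.
        Future<Integer> left = exec.submit(new FibTask(arg - 1));
        Future<Integer> right = exec.submit(new FibTask(arg - 2));
        // Pick up and combine results; each get() blocks until its child is done.
        return left.get() + right.get();
      } else {
        return 1;
      }
    }
  }

  // Hypothetical convenience wrapper: submit the root task and wait for it.
  static int fib(int n) throws Exception {
    return exec.submit(new FibTask(n)).get();
  }

  public static void main(String[] args) throws Exception {
    System.out.println(fib(10));  // prints 55
    exec.shutdown();
  }
}
```

The cached pool matters here: every parent blocks in `get()` while its children run, so a fixed-size pool could end up with all of its threads blocked while the children they wait for sit in the queue.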
The Blumofe-Leiserson DAG Model
• Multithreaded program is
  – A directed acyclic graph (DAG)
  – That unfolds dynamically
• Each node is
  – A single unit of work
Fibonacci DAG
[Figure: the DAG for fib(4), unfolding level by level: fib(4); then fib(3) and fib(2); then fib(2) and three fib(1) nodes; then two more fib(1) nodes. Edges are labeled call and get.]
How Parallel is That?
• Define work:
  – Total time on one processor
• Define critical-path length:
  – Longest dependency path
  – Can’t beat that!
Unfolded DAG
Parallelism?
Serial fraction = 3/18 = 1/6 …
Amdahl’s Law says speedup cannot exceed 6.
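The bound of 6 follows from the usual statement of Amdahl's Law, assuming a serial fraction $s = 1/6$ and $P$ processors for the remainder (the symbols $S(P)$ and $s$ are introduced here for illustration):

$$S(P) = \frac{1}{s + \frac{1 - s}{P}} \le \frac{1}{s} = \frac{1}{1/6} = 6$$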
Work?
[Figure: the unfolded DAG with its nodes numbered 1 to 18]
$T_1$: time needed on one processor
Just count the nodes ….
$T_1 = 18$
Critical Path?
[Figure: the unfolded DAG with the nodes on its longest path numbered 1 to 9]
$T_\infty$: time needed on as many processors as you like
Longest path ….
$T_\infty = 9$
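Both measures can be computed mechanically from a DAG. The sketch below assumes unit-cost nodes and an adjacency-list representation; `DagMeasures` and its small 6-node example graph are hypothetical, not the 18-node DAG from the slides.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DagMeasures {
  // Work: every node is one unit of work, so just count the nodes.
  static int work(Map<Integer, List<Integer>> dag) {
    return dag.size();
  }

  // Critical path: number of nodes on the longest dependency path.
  static int criticalPath(Map<Integer, List<Integer>> dag) {
    Map<Integer, Integer> memo = new HashMap<>();
    int longest = 0;
    for (int node : dag.keySet())
      longest = Math.max(longest, longestFrom(node, dag, memo));
    return longest;
  }

  // Memoized depth-first search: longest path starting at `node`.
  private static int longestFrom(int node, Map<Integer, List<Integer>> dag,
                                 Map<Integer, Integer> memo) {
    Integer cached = memo.get(node);
    if (cached != null) return cached;
    int best = 0;
    for (int next : dag.get(node))
      best = Math.max(best, longestFrom(next, dag, memo));
    memo.put(node, 1 + best);
    return 1 + best;
  }

  // Hypothetical DAG: 1 -> 2, 1 -> 3, 2 -> 4, 3 -> 4, 3 -> 6, 4 -> 5.
  static Map<Integer, List<Integer>> example() {
    Map<Integer, List<Integer>> dag = new HashMap<>();
    dag.put(1, List.of(2, 3));
    dag.put(2, List.of(4));
    dag.put(3, List.of(4, 6));
    dag.put(4, List.of(5));
    dag.put(5, List.of());
    dag.put(6, List.of());
    return dag;
  }

  public static void main(String[] args) {
    Map<Integer, List<Integer>> dag = example();
    System.out.println("work = " + work(dag));                   // work = 6
    System.out.println("critical path = " + criticalPath(dag));  // critical path = 4
  }
}
```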
Notation Watch
• $T_P$ = time on $P$ processors
• $T_1$ = work (time on 1 processor)
• $T_\infty$ = critical path length (time on $\infty$ processors)
Simple Laws
• Work Law: $T_P \ge T_1/P$
  – In one step, can’t do more than $P$ work
• Critical Path Law: $T_P \ge T_\infty$
  – Can’t beat infinite resources
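Taken together, the two laws give a single lower bound. As a worked example with the values counted earlier for the unfolded DAG (18 nodes of work, critical path of 9):

$$T_P \ge \max\left(\frac{T_1}{P},\; T_\infty\right), \qquad T_2 \ge \max(9, 9) = 9, \qquad T_4 \ge \max(4.5, 9) = 9$$

So for that DAG, going beyond two processors cannot lower the bound any further.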
Performance Measures
• Speedup on $P$ processors
  – Ratio $T_1/T_P$
  – How much faster with $P$ processors
• Linear speedup
  – $T_1/T_P = \Theta(P)$
• Max speedup (average parallelism)
  – $T_1/T_\infty$
Sequential Composition
[Figure: A followed by B]
Work: $T_1(A) + T_1(B)$
Critical Path: $T_\infty(A) + T_\infty(B)$
Parallel Composition
[Figure: A and B side by side]
Work: $T_1(A) + T_1(B)$
Critical Path: $\max\{T_\infty(A), T_\infty(B)\}$
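The two composition rules can be written as a pair of combinators. This is a sketch only: the class `Cost`, its fields, and the sample numbers are hypothetical.

```java
class Cost {
  final long work;  // time on one processor
  final long span;  // critical path length

  Cost(long work, long span) { this.work = work; this.span = span; }

  // Sequential composition: A, then B. Work and critical path both add.
  static Cost seq(Cost a, Cost b) {
    return new Cost(a.work + b.work, a.span + b.span);
  }

  // Parallel composition: A alongside B. Work adds, critical path is the max.
  static Cost par(Cost a, Cost b) {
    return new Cost(a.work + b.work, Math.max(a.span, b.span));
  }

  public static void main(String[] args) {
    Cost a = new Cost(10, 4);  // hypothetical costs
    Cost b = new Cost(6, 5);
    Cost s = seq(a, b);
    Cost p = par(a, b);
    System.out.println("seq: work " + s.work + ", span " + s.span);  // seq: work 16, span 9
    System.out.println("par: work " + p.work + ", span " + p.span);  // par: work 16, span 5
  }
}
```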
Matrix Addition
$$\begin{pmatrix} C_{00} & C_{01} \\ C_{10} & C_{11} \end{pmatrix} = \begin{pmatrix} A_{00} + B_{00} & A_{01} + B_{01} \\ A_{10} + B_{10} & A_{11} + B_{11} \end{pmatrix}$$
4 parallel additions
Addition
• Let $A_P(n)$ be running time
  – For $n \times n$ matrix
  – on $P$ processors
• For example
  – $A_1(n)$ is work
  – $A_\infty(n)$ is critical path length
Addition
• Work is
$$A_1(n) = 4 A_1(n/2) + \Theta(1) = \Theta(n^2)$$
  – $4 A_1(n/2)$: 4 spawned additions
  – $\Theta(1)$: partition, synch, etc.
• Same as double-loop summation
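One way to see the $\Theta(n^2)$ bound is to unroll the recurrence, assuming $n$ is a power of two and $A_1(1) = \Theta(1)$:

$$\begin{aligned} A_1(n) &= 4 A_1(n/2) + \Theta(1) \\ &= 4^{\log_2 n} A_1(1) + \sum_{i=0}^{\log_2 n - 1} 4^i \, \Theta(1) \\ &= \Theta(n^2) + \Theta\!\left(\frac{4^{\log_2 n} - 1}{3}\right) = \Theta(n^2) \end{aligned}$$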
Addition
• Critical Path length is
$$A_\infty(n) = A_\infty(n/2) + \Theta(1) = \Theta(\log n)$$
  – $A_\infty(n/2)$: spawned additions in parallel
  – $\Theta(1)$: partition, synch, etc.
Matrix Multiplication Redux
$$C = A \cdot B$$
Matrix Multiplication Redux
$$\begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix} = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \cdot \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}$$
First Phase …
$$\begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix} = \begin{pmatrix} A_{11} B_{11} + A_{12} B_{21} & A_{11} B_{12} + A_{12} B_{22} \\ A_{21} B_{11} + A_{22} B_{21} & A_{21} B_{12} + A_{22} B_{22} \end{pmatrix}$$
8 multiplications
Second Phase …
$$\begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix} = \begin{pmatrix} A_{11} B_{11} + A_{12} B_{21} & A_{11} B_{12} + A_{12} B_{22} \\ A_{21} B_{11} + A_{22} B_{21} & A_{21} B_{12} + A_{22} B_{22} \end{pmatrix}$$
4 additions
![Page 67: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/67.jpg)
Art of Multiprocessor Programming 6767
Multiplication
• Work is
M1(n) = 8 M1(n/2) + A1(n)
8 parallel multiplications
Final addition
![Page 68: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/68.jpg)
Multiplication
• Work is
$M_1(n) = 8\,M_1(n/2) + \Theta(n^2) = \Theta(n^3)$
Same as serial triple-nested loop
![Page 69: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/69.jpg)
Multiplication
• Critical path length is
$M_\infty(n) = M_\infty(n/2) + A_\infty(n)$
– $M_\infty(n/2)$: half-size parallel multiplications
– $A_\infty(n)$: final addition
![Page 70: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/70.jpg)
Multiplication
• Critical path length is
$M_\infty(n) = M_\infty(n/2) + A_\infty(n) = M_\infty(n/2) + \Theta(\log n) = \Theta(\log^2 n)$
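One way to see the last step is to unroll the recurrence. Assuming $n$ is a power of two and $M_\infty(1) = \Theta(1)$, each of the $\log_2 n$ levels contributes the depth of one addition:

$$M_\infty(n) = \sum_{i=0}^{\log_2 n - 1} \Theta\!\left(\log \frac{n}{2^i}\right) + \Theta(1) = \Theta\!\left(\sum_{i=0}^{\log_2 n - 1} (\log_2 n - i)\right) = \Theta\!\left(\frac{\log_2 n\,(\log_2 n + 1)}{2}\right) = \Theta(\log^2 n)$$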
![Page 71: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/71.jpg)
Parallelism
• $M_1(n) / M_\infty(n) = \Theta(n^3 / \log^2 n)$
• To multiply two 1000 × 1000 matrices
– $1000^3 / 10^2 = 10^7$ (using $\log_2 1000 \approx 10$)
• Much more than number of processors on any real machine
![Page 72: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/72.jpg)
Shared-Memory Multiprocessors
• Parallel applications
– Do not have direct access to HW processors
• Mix of other jobs
– All run together
– Come & go dynamically
![Page 73: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/73.jpg)
Ideal Scheduling Hierarchy
Tasks
Processors
User-level scheduler
![Page 74: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/74.jpg)
Realistic Scheduling Hierarchy
Tasks
Threads
Processors
User-level scheduler
Kernel-level scheduler
![Page 75: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/75.jpg)
For Example
• Initially,
– All P processors available for application
• Serial computation
– Takes over one processor
– Leaving P-1 for us
– Waits for I/O
– We get that processor back ….
![Page 76: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/76.jpg)
Speedup
• Map threads onto P processors
• Cannot get P-fold speedup
– What if the kernel doesn’t cooperate?
• Can try for speedup proportional to P
![Page 77: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/77.jpg)
Scheduling Hierarchy
• User-level scheduler
– Tells kernel which threads are ready
• Kernel-level scheduler
– Synchronous (for analysis, not correctness!)
– Picks $p_i$ threads to schedule at step $i$
![Page 78: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/78.jpg)
Greedy Scheduling
• A node is ready if predecessors are done
• Greedy: schedule as many ready nodes as possible
• Optimal scheduler is greedy (why?)
• But not every greedy scheduler is optimal
![Page 79: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/79.jpg)
Greedy Scheduling
There are P processors
Complete Step:
• ≥ P nodes ready
• run any P
Incomplete Step:
• < P nodes ready
• run them all
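The complete/incomplete step rule can be tried out on a small DAG. The following is an illustrative simulation only, assuming unit-time nodes and an acyclic graph given as successor lists; the names `GreedySchedule`, `steps`, and `criticalPath` are invented for this sketch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class GreedySchedule {
    /**
     * Runs a greedy schedule of a DAG of unit-time nodes on p processors.
     * succ.get(u) lists the successors of node u. Returns the number of steps.
     */
    static int steps(List<List<Integer>> succ, int p) {
        int n = succ.size();
        int[] pending = new int[n];              // unfinished predecessors per node
        for (List<Integer> out : succ)
            for (int v : out) pending[v]++;
        Deque<Integer> ready = new ArrayDeque<>();
        for (int u = 0; u < n; u++)
            if (pending[u] == 0) ready.add(u);
        int steps = 0, done = 0;
        while (done < n) {
            // Complete step if at least p nodes are ready, incomplete otherwise.
            int k = Math.min(p, ready.size());
            List<Integer> running = new ArrayList<>();
            for (int i = 0; i < k; i++) running.add(ready.poll());
            // Nodes that become ready now can only run in a later step.
            for (int u : running)
                for (int v : succ.get(u))
                    if (--pending[v] == 0) ready.add(v);
            done += k;
            steps++;
        }
        return steps;
    }

    /** Critical path length: with n processors every step runs all ready nodes. */
    static int criticalPath(List<List<Integer>> succ) {
        return steps(succ, succ.size());
    }
}
```

For a DAG with one source feeding four independent nodes that all feed one sink, the work is 6 and the critical path is 3; on 2 processors the simulation takes 4 steps, within the bound 6/2 + 3.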
![Page 80: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/80.jpg)
Theorem
For any greedy scheduler,
$T_P \le T_1/P + T_\infty$
![Page 81: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/81.jpg)
Theorem
For any greedy scheduler,
$T_P \le T_1/P + T_\infty$
$T_P$: actual time
![Page 82: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/82.jpg)
Theorem
For any greedy scheduler,
$T_P \le T_1/P + T_\infty$
$T_1/P$: no better than work divided among processors
![Page 83: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/83.jpg)
Theorem
For any greedy scheduler,
$T_P \le T_1/P + T_\infty$
$T_\infty$: no better than critical path length
![Page 84: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/84.jpg)
$T_P \le T_1/P + T_\infty$
Proof:
Number of complete steps ≤ $T_1/P$ …
… because each performs P work.
Number of incomplete steps ≤ $T_\infty$ …
… because each shortens the unexecuted critical path by 1
QED
![Page 85: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/85.jpg)
Near-Optimality
Theorem: any greedy scheduler is within a factor of 2 of optimal.
Remark: Finding an optimal schedule is NP-hard!
![Page 86: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/86.jpg)
Proof of Near-Optimality
Let $T_P^*$ be the optimal time.
$T_P^* \ge \max\{T_1/P,\ T_\infty\}$ (from work and critical path laws)
$T_P \le T_1/P + T_\infty$ (Theorem)
$T_P \le 2 \max\{T_1/P,\ T_\infty\}$
$T_P \le 2\,T_P^*$
QED
![Page 87: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/87.jpg)
Work Distribution
zzz…
![Page 88: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/88.jpg)
Work Dealing
Yes!
![Page 89: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/89.jpg)
The Problem with Work Dealing
D’oh!
D’oh!
D’oh!
![Page 90: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/90.jpg)
Work Stealing
No work…
Yes!
![Page 91: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/91.jpg)
Lock-Free Work Stealing
• Each thread has a pool of ready work
• Remove work without synchronizing
• If you run out of work, steal someone else’s
• Choose victim at random
![Page 92: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/92.jpg)
Local Work Pools
Each work pool is a Double-Ended Queue
![Page 93: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/93.jpg)
Work DEQueue (Double-Ended Queue)
pushBottom
popBottom
work
![Page 94: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/94.jpg)
Obtain Work
• Obtain work
• Run task until it blocks or terminates
popBottom
![Page 95: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/95.jpg)
New Work
• Unblock node
• Spawn node
pushBottom
![Page 96: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/96.jpg)
Whatcha Gonna do When the Well Runs Dry?
@&%$!!
empty
![Page 97: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/97.jpg)
Steal Work from Others
Pick random thread’s DEQueue
![Page 98: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/98.jpg)
Steal this Task!
popTop
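The obtain-work and steal-work steps can be sketched as a single lookup. This is an illustration only: it uses the JDK's `ConcurrentLinkedDeque` as a stand-in for the deck's DEQueue (`pollLast` playing the role of popBottom, `pollFirst` of popTop), and the class and method names `StealingWorker` and `nextTask` are made up for the example.

```java
import java.util.List;
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.concurrent.ThreadLocalRandom;

public class StealingWorker {
    /**
     * Finds the next task for worker `me`: its own bottom first, otherwise one
     * steal attempt from the top of a randomly chosen deque. Returns null if
     * neither yields a task, in which case the caller would simply retry.
     */
    static Runnable nextTask(int me, List<ConcurrentLinkedDeque<Runnable>> deques) {
        Runnable task = deques.get(me).pollLast();          // popBottom on own deque
        if (task != null) return task;
        int victim = ThreadLocalRandom.current().nextInt(deques.size());
        if (victim == me) return null;                      // nothing to steal from ourselves
        return deques.get(victim).pollFirst();              // popTop on the victim's deque
    }
}
```

The owner works at the bottom and thieves take from the top, so the two ends only meet when a deque is nearly empty.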
![Page 99: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/99.jpg)
Task DEQueue
• Methods
– pushBottom
– popBottom
– popTop
pushBottom and popBottom never happen concurrently
![Page 100: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/100.jpg)
Task DEQueue
• Methods
– pushBottom
– popBottom
– popTop
pushBottom and popBottom are the most common: make them fast (minimize use of CAS)
![Page 101: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/101.jpg)
Ideal
• Wait-Free
• Linearizable
• Constant time
Fortune Cookie: “It is better to be young, rich and beautiful, than old, poor, and ugly”
![Page 102: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/102.jpg)
Compromise
• Method popTop may fail if
– Concurrent popTop succeeds, or a
– Concurrent popBottom takes last task
Blame the victim!
![Page 103: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/103.jpg)
Dreaded ABA Problem
top
CAS
![Page 104: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/104.jpg)
![Page 105: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/105.jpg)
![Page 106: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/106.jpg)
![Page 107: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/107.jpg)
![Page 108: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/108.jpg)
![Page 109: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/109.jpg)
![Page 110: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/110.jpg)
Dreaded ABA Problem
top
CAS
Uh-Oh …
Yes!
![Page 111: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/111.jpg)
Fix for Dreaded ABA
stamp
top
bottom
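The effect of the stamp can be shown in a few lines. This is a simplified, single-threaded illustration: the interleaving is written out by hand, the index values are arbitrary, and the class name `AbaDemo` is made up for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicStampedReference;

public class AbaDemo {
    /** Plain CAS on the index: succeeds even though top changed and changed back. */
    static boolean plainCasAfterAba() {
        AtomicInteger top = new AtomicInteger(3);
        int seen = top.get();                        // thief reads top = 3, then is delayed
        top.set(0);                                  // meanwhile the deque empties and top is reset
        top.set(3);                                  // later activity brings top back to 3
        return top.compareAndSet(seen, seen + 1);    // the stale CAS still succeeds
    }

    /** Stamped CAS: the stamp has advanced, so the stale CAS fails. */
    static boolean stampedCasAfterAba() {
        Integer three = 3, zero = 0;
        AtomicStampedReference<Integer> top = new AtomicStampedReference<>(three, 0);
        int[] stamp = new int[1];
        Integer seen = top.get(stamp);               // thief reads (3, stamp 0)
        top.set(zero, 1);                            // every change bumps the stamp
        top.set(three, 2);                           // same index, different stamp
        return top.compareAndSet(seen, seen + 1, stamp[0], stamp[0] + 1);
    }
}
```

Here `plainCasAfterAba()` returns true and `stampedCasAfterAba()` returns false: the index alone cannot tell the two states apart, the index plus stamp can.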
![Page 112: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/112.jpg)
Bounded DEQueue
public class BDEQueue {
  AtomicStampedReference<Integer> top;
  volatile int bottom;
  Runnable[] tasks;
  …
}
![Page 113: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/113.jpg)
Bounded DEQueue
  AtomicStampedReference<Integer> top;
Index & Stamp (synchronized)
![Page 114: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/114.jpg)
Bounded DEQueue
  volatile int bottom;
Index of bottom task: no need to synchronize, but a memory barrier is needed
![Page 115: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/115.jpg)
Bounded DEQueue
  Runnable[] tasks;
Array holding tasks
![Page 116: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/116.jpg)
pushBottom()
public class BDEQueue {
  …
  void pushBottom(Runnable r) {
    tasks[bottom] = r;
    bottom++;
  }
  …
}
![Page 117: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/117.jpg)
pushBottom()
    tasks[bottom] = r;
Bottom is the index to store the new task in the array
![Page 118: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/118.jpg)
pushBottom()
    bottom++;
Adjust the bottom index
![Page 119: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/119.jpg)
Steal Work
public Runnable popTop() {
  int[] stamp = new int[1];
  int oldTop = top.get(stamp), newTop = oldTop + 1;
  int oldStamp = stamp[0], newStamp = oldStamp + 1;
  if (bottom <= oldTop)
    return null;
  Runnable r = tasks[oldTop];
  // CAS on AtomicStampedReference is spelled compareAndSet in Java
  if (top.compareAndSet(oldTop, newTop, oldStamp, newStamp))
    return r;
  return null;
}
![Page 120: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/120.jpg)
Steal Work
Read top (value & stamp)
  int oldTop = top.get(stamp);   // returns the index, stores the stamp in stamp[0]
  int oldStamp = stamp[0];
![Page 121: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/121.jpg)
Steal Work
Compute new value & stamp
  newTop = oldTop + 1
  newStamp = oldStamp + 1
![Page 122: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/122.jpg)
Steal Work
Quit if queue is empty
  if (bottom <= oldTop) return null;
![Page 123: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/123.jpg)
Steal Work
Try to steal the task
  Runnable r = tasks[oldTop];
  if (top.compareAndSet(oldTop, newTop, oldStamp, newStamp)) return r;
![Page 124: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/124.jpg)
Steal Work
Give up if conflict occurs
  return null;   // reached when the CAS on top fails
![Page 125: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/125.jpg)
Take Work
Runnable popBottom() {
  if (bottom == 0)
    return null;
  bottom--;
  Runnable r = tasks[bottom];
  int[] stamp = new int[1];
  int oldTop = top.get(stamp), newTop = 0;
  int oldStamp = stamp[0], newStamp = oldStamp + 1;
  if (bottom > oldTop)
    return r;
  if (bottom == oldTop) {
    bottom = 0;
    if (top.compareAndSet(oldTop, newTop, oldStamp, newStamp))
      return r;
  }
  // lost the last task to a thief: reset top and bottom before giving up
  top.set(newTop, newStamp);
  bottom = 0;
  return null;
}
![Page 126: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/126.jpg)
Take Work
Make sure queue is non-empty
  if (bottom == 0) return null;
![Page 127: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/127.jpg)
Take Work
Prepare to grab bottom task
  bottom--;
  Runnable r = tasks[bottom];
![Page 128: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/128.jpg)
Take Work
Read top, & prepare new values
  int oldTop = top.get(stamp), newTop = 0;
  int oldStamp = stamp[0], newStamp = oldStamp + 1;
![Page 129: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/129.jpg)
Take Work
If top & bottom one or more apart, no conflict
  if (bottom > oldTop) return r;
![Page 130: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/130.jpg)
Take Work
At most one item left
  if (bottom == oldTop) { … }
![Page 131: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/131.jpg)
Take Work
Try to steal last task. Reset bottom because the DEQueue will be empty even if unsuccessful (why?)
  bottom = 0;
  if (top.compareAndSet(oldTop, newTop, oldStamp, newStamp)) return r;
![Page 132: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/132.jpg)
Take Work
I win CAS
  return r;
![Page 133: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/133.jpg)
Take Work
If I lose CAS, thief must have won…
![Page 134: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/134.jpg)
Art of Multiprocessor Programming 134134
Runnable popBottom() { if (bottom == 0) return null; bottom--; Runnable r = tasks[bottom]; int[] stamp = new int[1]; int oldTop = top.get(stamp), newTop = 0; int oldStamp = stamp[0], newStamp = oldStamp + 1; if (bottom > oldTop) return r; if (bottom == oldTop){ bottom = 0; if (top.CAS(oldTop, newTop, oldStamp, newStamp)) return r; } top.set(newTop,newStamp); return null; bottom = 0;}
Take Work
Failed to get last task (bottom could be less than top)
Must still reset top and bottom since deque is empty
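To see how popBottom() interacts with a thief, here is a minimal sketch of a bounded work-stealing deque. It assumes top is a java.util.concurrent.atomic.AtomicStampedReference<Integer>, so the slides' CAS is written as compareAndSet. The class name BoundedDequeSketch, the fixed capacity, and the pushBottom() and popTop() bodies are illustrative assumptions, not necessarily the slides' exact code. The sketch keeps the Integer object returned by top.get(), because compareAndSet compares references rather than values.

```java
import java.util.concurrent.atomic.AtomicStampedReference;

// Illustrative sketch only; names and capacity handling are assumptions.
class BoundedDequeSketch {
    private final Runnable[] tasks;
    private volatile int bottom = 0;                 // written only by the owner
    private final AtomicStampedReference<Integer> top =
        new AtomicStampedReference<>(0, 0);          // index plus stamp

    BoundedDequeSketch(int capacity) {
        tasks = new Runnable[capacity];
    }

    // Called only by the owner thread.
    void pushBottom(Runnable r) {
        tasks[bottom] = r;
        bottom++;
    }

    // Called by thieves: try to take the oldest task.
    Runnable popTop() {
        int[] stamp = new int[1];
        Integer oldRef = top.get(stamp);             // keep the reference for the CAS
        int oldTop = oldRef;
        int oldStamp = stamp[0];
        if (bottom <= oldTop) return null;           // nothing to steal
        Runnable r = tasks[oldTop];
        if (top.compareAndSet(oldRef, oldTop + 1, oldStamp, oldStamp + 1)) return r;
        return null;                                 // lost a race, give up
    }

    // Called only by the owner thread.
    Runnable popBottom() {
        if (bottom == 0) return null;
        bottom--;
        Runnable r = tasks[bottom];
        int[] stamp = new int[1];
        Integer oldRef = top.get(stamp);
        int oldTop = oldRef, newTop = 0;
        int oldStamp = stamp[0], newStamp = oldStamp + 1;
        if (bottom > oldTop) return r;               // no conflict possible
        if (bottom == oldTop) {                      // last task: race with thieves
            bottom = 0;
            if (top.compareAndSet(oldRef, newTop, oldStamp, newStamp)) return r;
        }
        top.set(newTop, newStamp);                   // a thief won: reset top ...
        bottom = 0;                                  // ... and bottom
        return null;
    }
}
```

In a single-threaded run, pushing tasks a, b, c and then calling popTop(), popBottom(), popBottom() yields a, c, b; the last call takes the bottom == oldTop branch and wins the CAS because no thief is competing.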
![Page 135: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/135.jpg)
Old English Proverb
• “May as well be hanged for stealing a sheep as a goat”
• From which we conclude:
  – Stealing was punished severely
  – Sheep were worth more than goats
![Page 136: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/136.jpg)
Variations
• Stealing is expensive
  – Pay CAS
  – Only one task taken
• What if
  – Move more than one task
  – Randomly balance loads?
![Page 137: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/137.jpg)
Work Balancing
[Figure: two queues holding 5 and 2 tasks are balanced]
$\lfloor (2+5)/2 \rfloor = 3$
$\lceil (2+5)/2 \rceil = 4$
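The 5-and-2 example can be sketched as a small balance() helper. This is one possible version under stated assumptions: the class name BalanceSketch and the use of java.util.Deque are illustrative, and the loop simply moves tasks from the larger queue to the smaller one until their sizes differ by at most one.

```java
import java.util.Deque;

// Illustrative sketch only; the slides do not show the body of balance().
class BalanceSketch {
    // Move tasks from the larger queue to the smaller one
    // until the two sizes differ by at most one.
    static <T> void balance(Deque<T> q0, Deque<T> q1) {
        Deque<T> small = (q0.size() <= q1.size()) ? q0 : q1;
        Deque<T> large = (small == q0) ? q1 : q0;
        while (large.size() - small.size() > 1) {
            small.addLast(large.pollFirst());
        }
    }
}
```

With queues of 5 and 2 tasks, one task moves and the sizes become 4 and 3. In this sketch the originally larger queue keeps the extra task; which queue ends up with the ceiling is a design choice.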
![Page 138: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/138.jpg)
Work-Balancing Thread

    public void run() {
      int me = ThreadID.get();
      while (true) {
        Runnable task = queue[me].deq();
        if (task != null) task.run();
        int size = queue[me].size();
        if (random.nextInt(size+1) == size) {
          int victim = random.nextInt(queue.length);
          int min = …, max = …;
          synchronized (queue[min]) {
            synchronized (queue[max]) {
              balance(queue[min], queue[max]);
            }
          }
        }
      }
    }
![Page 139: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/139.jpg)
Work-Balancing Thread
Keep running tasks
![Page 140: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/140.jpg)
Work-Balancing Thread
With probability 1/(|queue| + 1)
![Page 141: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/141.jpg)
Work-Balancing Thread
Choose random victim
![Page 142: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/142.jpg)
Work-Balancing Thread
Lock queues in canonical order
![Page 143: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/143.jpg)
Work-Balancing Thread
Rebalance queues
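The steps walked through above (run a task, flip a biased coin, pick a victim, lock in canonical order, rebalance) can be put together in a runnable sketch. Several parts are assumptions made for illustration: the class name WorkBalancerSketch, the use of ArrayDeque guarded by synchronized blocks, the choice of Math.min and Math.max on queue indices as the canonical lock order, and the body of balance(). The loop body is exposed as step(me) so it can be driven from a test instead of an endless while (true).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Random;

// Illustrative sketch only; not the slides' exact code.
class WorkBalancerSketch {
    private final List<Deque<Runnable>> queue = new ArrayList<>();
    private final Random random = new Random();

    WorkBalancerSketch(int threads) {
        for (int i = 0; i < threads; i++) queue.add(new ArrayDeque<>());
    }

    Deque<Runnable> queueOf(int i) {
        return queue.get(i);
    }

    // One iteration of the work-balancing loop for thread index `me`.
    void step(int me) {
        Deque<Runnable> mine = queue.get(me);
        Runnable task;
        synchronized (mine) { task = mine.pollFirst(); }
        if (task != null) task.run();
        int size;
        synchronized (mine) { size = mine.size(); }
        // Fires with probability 1/(size+1): short queues rebalance more often.
        if (random.nextInt(size + 1) == size) {
            rebalance(me, random.nextInt(queue.size()));
        }
    }

    // Assumed canonical order: always lock the lower index first,
    // so two threads can never hold the same pair of locks in opposite order.
    void rebalance(int me, int victim) {
        int min = Math.min(me, victim), max = Math.max(me, victim);
        synchronized (queue.get(min)) {
            synchronized (queue.get(max)) {
                balance(queue.get(min), queue.get(max));
            }
        }
    }

    // Move tasks from the larger queue until sizes differ by at most one.
    private static void balance(Deque<Runnable> q0, Deque<Runnable> q1) {
        Deque<Runnable> small = (q0.size() <= q1.size()) ? q0 : q1;
        Deque<Runnable> large = (small == q0) ? q1 : q0;
        while (large.size() - small.size() > 1) {
            small.addLast(large.pollFirst());
        }
    }
}
```

Locking by index means that thread 0 balancing with victim 1 and thread 1 balancing with victim 0 both acquire queue 0 first, which rules out the circular wait that nested locks could otherwise create.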
![Page 144: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/144.jpg)
Work Stealing & Balancing
• Clean separation between app & scheduling layer
• Works well when number of processors fluctuates.
• Works on “black-box” operating systems
![Page 145: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/145.jpg)
This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.
• You are free:
  – to Share: to copy, distribute and transmit the work
  – to Remix: to adapt the work
• Under the following conditions:
  – Attribution. You must attribute the work to “The Art of Multiprocessor Programming” (but not in any way that suggests that the authors endorse you or your use of the work).
  – Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same, similar or a compatible license.
• For any reuse or distribution, you must make clear to others the license terms of this work. The best way to do this is with a link to
  – http://creativecommons.org/licenses/by-sa/3.0/.
• Any of the above conditions can be waived if you get permission from the copyright holder.
• Nothing in this license impairs or restricts the author's moral rights.
![Page 146: Futures, Scheduling, and Work Distribution Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit (Some images in this](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649d345503460f94a0b867/html5/thumbnails/146.jpg)
TOM MARVOLO RIDDLE