TRANSCRIPT
Motivation Application Model Deployment using SMT Symmetry elimination Distributed memory scheduling SMT Solving Conclusions
Mapping and Scheduling Streaming Applications using SMT Solvers
Pranav Tendulkar
Supervisors: Dr. Oded Maler, Dr. Peter Poplavko
Verimag, FRANCE
13 October 2014
Tendulkar Mapping/scheduling for many-core 1 / 52
Multi-core Processors Everywhere
Cars
Phones
Space-shuttle
Tablets Laptops
Cameras
Smart-TV
Multi-core Processors Everywhere
source : http://www.csl.cornell.edu/courses/ece5745/handouts.html
Multi-core systems
How To:
Deploy the application to the platform
Decide the number of processors to use
Allocate tasks to processors and schedule them
Our Deployment Framework
[Figure: framework diagram. Labels: Constraints; Performance, Memory, #Processors; Platform Model; Application Model; Optimization Techniques; Code; Solution]
Application Model
Task Graph
[Figure: task graph with nodes A, B, C, D, E, F, G, H, I, J]
Tasks : Software procedure
Edges : Precedence relations
annotated with execution time
Deployment Problem
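Task Graph
[Figure: task graph with nodes A, B, C, D, E, F, G, H, I, J]
Tasks : Software procedure
Edges : Precedence relations
Deployment Solution
[Figure: Gantt chart over time for two processors, P1 and P2; one row runs C F E D H J, the other runs A B G I]
Mapping : Task ⇒ Processor
Scheduling : Task ⇒ Time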
Deployment Problem
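Solution 1:
[Figure: Gantt chart over time for two processors, P1 and P2; one row runs C F E D H J, the other runs A B G I]
Solution 2:
[Figure: Gantt chart over time for one processor, P1, running A B C D E F G H I J in sequence]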
Solution space is large
Deployment problem
How to:
find optimal solutions in an exponential design space
model complex hardware which has processors, network, DMA
evaluate multiple criteria: latency, memory used, processors used, ...
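With several criteria there is usually no single best deployment, only a set of trade-offs. One common way to compare candidates is Pareto dominance; the sketch below assumes two criteria to be minimized, (latency, processors used), and the candidate values are made up for illustration rather than taken from the talk.

```python
from typing import List, Tuple

Point = Tuple[int, int]  # (latency, processors used); both are minimized


def dominates(p: Point, q: Point) -> bool:
    """p dominates q if p is no worse in every criterion and differs from q."""
    return all(a <= b for a, b in zip(p, q)) and p != q


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    unique = sorted(set(points))
    return [p for p in unique if not any(dominates(q, p) for q in unique)]


# Hypothetical candidate deployments.
candidates = [(30, 1), (18, 2), (20, 2), (14, 3), (14, 4), (25, 1)]
front = pareto_front(candidates)
print(front)
```

Here (14, 4) drops out because (14, 3) reaches the same latency with fewer processors; the remaining points are incomparable and the choice between them is left to the designer.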
Outline
1 Motivation
2 Application Model
3 Deployment using SMT
4 Symmetry elimination
5 Distributed memory scheduling
6 SMT Solving
7 Conclusions
Model of Computation
Synchronous Dataflow graphs (SDF), by Edward Lee and David Messerschmitt in 1987
represent streaming applications
[Figure labels: Input, Computation, Output]
Synchronous DataFlow
[Figure: task graph with nodes Pre, Blur0, Blur1, Blur2, Blur3 and Post; labelled "images" on the input side and "images" on the output side]
Synchronous DataFlow
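SDF Graph: Pre → Blur → Post, with rates 4 and 1 on the edge Pre → Blur and rates 1 and 4 on the edge Blur → Post
Task Graph: Pre → Blur0, Blur1, Blur2, Blur3 → Post
Legend: Actors, Edges, Rates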
Actor Blur is a compact representation of data-parallel tasks.
All Blur tasks have the same properties, such as execution time.
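The step from the SDF graph to the task graph can be sketched by solving the SDF balance equations: for every edge, (firings of producer) × (production rate) = (firings of consumer) × (consumption rate). The code below is a minimal illustration, not the tool used in the talk; the function names and the indexed task names (Pre0, Blur0, ...) are assumptions.

```python
from fractions import Fraction
from functools import reduce
from math import gcd
from typing import Dict, List, Tuple

Edge = Tuple[str, str, int, int]  # (producer, consumer, production rate, consumption rate)


def repetition_vector(actors: List[str], edges: List[Edge]) -> Dict[str, int]:
    """Smallest positive integers q with q[src] * prod == q[dst] * cons on every edge."""
    ratio = {actors[0]: Fraction(1)}
    changed = True
    while changed:
        changed = False
        for src, dst, prod, cons in edges:
            if src in ratio and dst not in ratio:
                ratio[dst] = ratio[src] * prod / cons
                changed = True
            elif dst in ratio and src not in ratio:
                ratio[src] = ratio[dst] * cons / prod
                changed = True
            elif src in ratio and dst in ratio:
                if ratio[src] * prod != ratio[dst] * cons:
                    raise ValueError("inconsistent rates")
    if len(ratio) != len(actors):
        raise ValueError("graph is not connected")
    # Scale the rational solution to the smallest integer one.
    scale = reduce(lambda a, b: a * b // gcd(a, b),
                   (r.denominator for r in ratio.values()), 1)
    counts = {a: int(r * scale) for a, r in ratio.items()}
    common = reduce(gcd, counts.values())
    return {a: counts[a] // common for a in actors}


def expand_tasks(actors: List[str], q: Dict[str, int]) -> List[str]:
    """One task per firing of each actor, named <actor><index>."""
    return [f"{a}{i}" for a in actors for i in range(q[a])]


actors = ["Pre", "Blur", "Post"]
edges = [("Pre", "Blur", 4, 1), ("Blur", "Post", 1, 4)]
q = repetition_vector(actors, edges)
print(q)
print(expand_tasks(actors, q))
```

For the rates on the slide this gives one firing of Pre, four of Blur and one of Post, matching the six nodes of the task graph.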
Split-Join Graphs
we use split-join graphs: a restriction of SDF
still covering perhaps 90% of use cases in the literature
a simple example:
A → B → C, with α on the edge A → B and 1/α on the edge B → C
α : spawn and split
1/α: wait and join
Task graph: A0 → B0, B1, . . . , Bα−1 → C0
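The A, B, C example can be expanded mechanically. The sketch below handles a simple chain of actors whose edges are marked either as a split by α or a join by α; the representation of the chain, the function name and the way instances are grouped (contiguous blocks of α) are assumptions made for illustration.

```python
from typing import List, Tuple


def expand_chain(actors: List[str], factors: List[Tuple[str, int]]):
    """Expand a split-join chain into tasks and precedence edges.

    factors[k] describes the edge actors[k] -> actors[k + 1]:
    ("split", a) spawns a instances per producer instance,
    ("join", a) waits for a producer instances per consumer instance.
    """
    count = 1                      # number of instances of the current actor
    tasks = [f"{actors[0]}0"]
    edges = []
    for k, (kind, alpha) in enumerate(factors):
        src, dst = actors[k], actors[k + 1]
        if kind == "split":
            new_count = count * alpha
            pairs = [(i, i * alpha + j) for i in range(count) for j in range(alpha)]
        elif kind == "join":
            if count % alpha != 0:
                raise ValueError("join factor does not divide the instance count")
            new_count = count // alpha
            pairs = [(i, i // alpha) for i in range(count)]
        else:
            raise ValueError(f"unknown edge kind: {kind}")
        tasks += [f"{dst}{i}" for i in range(new_count)]
        edges += [(f"{src}{i}", f"{dst}{j}") for i, j in pairs]
        count = new_count
    return tasks, edges


# The slide's example with alpha = 3.
tasks, edges = expand_chain(["A", "B", "C"], [("split", 3), ("join", 3)])
print(tasks)
print(edges)
```

With α = 3 the result is A0, three B instances and C0, with A0 preceding every B instance and every B instance preceding C0.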
Restrictions compared to general SDF
Split-join does not support:
Stateful actors
Non-proportional rates
Initial tokens and cyclic paths
[Figure: SDF example with actors A and B; annotations as extracted: 2 3, 32, 4]
SATisfiability solver (SAT / SMT)
Boolean variables: in0, in1, in2 ..., out0, out1, out2 ...
Constraints: out0 = in0 ∨ in1 ⊕ in2 ...
[Figure: the SAT solver takes variables and constraints as input]
Query: out0 = true & out1 = true → UNSAT
Query: out0 = false & out1 = true → SAT, with in0 = true, in1 = false, ...
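The two queries can be reproduced on a toy instance by plain enumeration. The sketch below makes two assumptions that are not on the slide: the constraint is read as out0 = (in0 ∨ in1) ⊕ in2, and out1 = in0 ∧ in2 is a made-up definition chosen so that the answers match the slide. A real SAT solver searches far more cleverly than this exhaustive loop.

```python
from itertools import product
from typing import Dict, Optional

INPUTS = ("in0", "in1", "in2")


def outputs(v: Dict[str, bool]) -> Dict[str, bool]:
    return {
        # Assumed reading of the slide's constraint: (in0 OR in1) XOR in2.
        "out0": (v["in0"] or v["in1"]) != v["in2"],
        # Hypothetical second constraint, not taken from the slide.
        "out1": v["in0"] and v["in2"],
    }


def solve(required: Dict[str, bool]) -> Optional[Dict[str, bool]]:
    """Return an input assignment meeting the required outputs, or None (UNSAT)."""
    for values in product([False, True], repeat=len(INPUTS)):
        v = dict(zip(INPUTS, values))
        out = outputs(v)
        if all(out[name] == want for name, want in required.items()):
            return v
    return None


print(solve({"out0": True, "out1": True}))    # no assignment exists: UNSAT
print(solve({"out0": False, "out1": True}))   # a satisfying assignment: SAT
```

The first query is unsatisfiable because out1 forces in0 and in2 to be true, which makes out0 false; the second returns an assignment with in0 = true and in1 = false.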
Motivation Application Model Deployment using SMT Symmetry elimination Distributed memory scheduling SMT Solving Conclusions
SATisfiability solver (SAT / SMT)
Boolean variablesin0, in1, in2 ...out0, out1, out2 ...
Constraintsout0 = in0 ∨ in1 ⊕ in2 ...
SAT solver
variables
constraints
out0 = true&
out1 = trueUNSAT
out0 = false&
out1 = trueSAT
in0 = true,in1 = false, ...
Tendulkar Mapping/scheduling for many-core 19 / 52
![Page 55: imag€¦ · MotivationApplication ModelDeployment using SMTSymmetry eliminationDistributed memory schedulingSMT SolvingConclusions Pareto Exploration 0 5 10 15 20 25 30 Latency 0](https://reader036.vdocuments.mx/reader036/viewer/2022071102/5fdb78da71f431114f25df34/html5/thumbnails/55.jpg)
Motivation Application Model Deployment using SMT Symmetry elimination Distributed memory scheduling SMT Solving Conclusions
SATisfiability solver (SAT / SMT)
Boolean variables: in0, in1, in2, ... out0, out1, out2, ...
Constraints: out0 = in0 ∨ in1 ⊕ in2 ...
The SAT solver takes the variables and constraints as input and answers queries:
Query: out0 = true & out1 = true. Answer: UNSAT
Query: out0 = false & out1 = true. Answer: SAT, with in0 = true, in1 = false, ...
SMT extends SAT with numeric variables and constants
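The SAT / UNSAT answers can be illustrated with a minimal brute-force sketch. It is not how a real SAT solver works internally, and the circuit is a hypothetical one (the constraints on the slide are elided, so they are not reproduced here); the names `solve` and `circuit` are assumptions made for this example.

```python
from itertools import product

def solve(variables, constraint):
    """Return a satisfying assignment (dict) or None when the query is UNSAT."""
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if constraint(assignment):
            return assignment
    return None

def circuit(a):
    # Hypothetical circuit, chosen only for illustration:
    #   out0 = in0 AND in1,  out1 = NOT in0
    return {"out0": a["in0"] and a["in1"], "out1": not a["in0"]}

inputs = ["in0", "in1"]

# Query: out0 = true & out1 = true (needs in0 both true and false).
unsat_query = solve(inputs, lambda a: circuit(a) == {"out0": True, "out1": True})

# Query: out0 = false & out1 = true (any assignment with in0 = false works).
sat_query = solve(inputs, lambda a: circuit(a) == {"out0": False, "out1": True})
```

A SAT answer comes with a witness (an assignment of the inputs), while UNSAT means that no assignment exists at all.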
Encoding deployment with constraints
[Figure: task graph with nodes A0, B0, B1, B2, B3, C0]
Actors: A, B, C
Tasks: A0, B0, B1, B2, B3, C0
Description        Variables
Start time         xA0 xB0 xB1 xB2 xB3 xC0
Allocated proc.    pA0 pB0 pB1 pB2 pB3 pC0
Duration           dA dB dC
Precedence constraints: xB0 ≥ (xA0 + dA)
Mutual exclusion constraints: if (pB1 = pB2) then xB1 ≥ (xB2 + dB) ∨ xB2 ≥ (xB1 + dB)
Latency cost: Latency = (xC0 + dC)
[Figure: schedule, Time vs. Processors, with rows A0 B0 B2 and B1 B3 C0 and the latency marked]
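The three constraint families can be checked on a toy instance with a short sketch. It replaces the SMT solver by exhaustive enumeration, and it assumes a split-join shape (A0 feeds every B task, every B task feeds C0) and unit durations; neither assumption is stated on the slide, and `find_schedule` is a hypothetical helper name.

```python
from itertools import product

TASKS = ["A0", "B0", "B1", "B2", "B3", "C0"]
B_TASKS = ["B0", "B1", "B2", "B3"]
# Assumed edges: A0 -> each B task -> C0.
EDGES = [("A0", b) for b in B_TASKS] + [(b, "C0") for b in B_TASKS]
DURATION = {"A": 1, "B": 1, "C": 1}  # assumed unit durations dA, dB, dC

def dur(task):
    return DURATION[task[0]]  # duration depends on the actor only

def precedence_ok(x):
    # x[v] >= x[u] + d(u) for every edge (u, v)
    return all(x[v] >= x[u] + dur(u) for u, v in EDGES)

def mutex_ok(x, p):
    # Two tasks on the same processor must not overlap in time.
    for i, u in enumerate(TASKS):
        for v in TASKS[i + 1:]:
            same_proc = p[u] == p[v]
            disjoint = x[u] >= x[v] + dur(v) or x[v] >= x[u] + dur(u)
            if same_proc and not disjoint:
                return False
    return True

def find_schedule(max_latency, num_procs):
    """Return (start times, processors) with latency <= max_latency, or None."""
    for starts in product(range(max_latency), repeat=len(TASKS)):
        x = dict(zip(TASKS, starts))
        if not precedence_ok(x) or x["C0"] + dur("C0") > max_latency:
            continue
        for procs in product(range(num_procs), repeat=len(TASKS)):
            p = dict(zip(TASKS, procs))
            if mutex_ok(x, p):
                return x, p
    return None
```

Under these assumptions, `find_schedule(4, 2)` finds a schedule, while `find_schedule(3, 2)` returns `None` because four unit-time B tasks cannot share two processors in a single time slot.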
Multi-criteria Problem
Schedule on 2 processors (P1, P2), rows A0 B0 B2 and B1 B3 C0: Latency = 4, #Proc = 2
Schedule on 4 processors (P1 to P4), tasks A0, B0, B1, B2, B3, C0: Latency = 3, #Proc = 4
[Figure: cost space, Latency vs. Processors, with the Pareto set {(4,2), (3,4)}]
Conflicting criteria
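A Pareto set keeps only the cost points that no other point beats in every criterion. The sketch below filters a list of (latency, processors) pairs; the points (4, 2) and (3, 4) come from the slide, while the other two are hypothetical additions for illustration.

```python
def dominates(a, b):
    """a dominates b if a is no worse in every cost and differs from b."""
    return all(x <= y for x, y in zip(a, b)) and a != b

def pareto_set(points):
    """Return the non-dominated points, sorted."""
    unique = set(points)
    return sorted(p for p in unique if not any(dominates(q, p) for q in unique))

# (latency, #processors); the last two points are made up for the example.
points = [(4, 2), (3, 4), (5, 3), (4, 4)]
front = pareto_set(points)
```

Neither (4, 2) nor (3, 4) dominates the other: lowering latency costs processors and vice versa, which is why the criteria conflict.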
Problem Monotonicity
Schedule on P1, P2 (rows A0 B0 B2 and B1 B3 C0): Latency = 4, #Proc = 2
Schedule on P1, P2, P3 (rows A0, B0 B2, and B1 B3 C0): Latency = 5, #Proc = 3
[Figure: cost space, Latency vs. Processors, with an upper bound marked on each axis; region labeled SAT]
Problem Monotonicity
Latency = 4, #Proc = 2: not possible
Latency = 2, #Proc = 1: also not possible
[Figure: cost space, Latency vs. Processors, with an upper bound marked on each axis; region labeled UNSAT]
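Monotonicity lets some queries be answered without calling the solver: a bound that is looser than a known SAT bound is SAT, and a bound that is tighter than a known UNSAT bound is UNSAT. A minimal sketch, where `infer` is a hypothetical helper and each point is an upper bound (latency, processors):

```python
def infer(point, known_sat, known_unsat):
    """Classify a cost bound using monotonicity alone."""
    lat, procs = point
    # Looser than (or equal to) a feasible bound: still feasible.
    if any(l <= lat and p <= procs for l, p in known_sat):
        return "SAT"
    # Tighter than (or equal to) an infeasible bound: still infeasible.
    if any(l >= lat and p >= procs for l, p in known_unsat):
        return "UNSAT"
    return "UNKNOWN"

# The two cases from the slides, plus one that monotonicity cannot decide.
case_sat = infer((5, 3), known_sat=[(4, 2)], known_unsat=[])
case_unsat = infer((2, 1), known_sat=[], known_unsat=[(4, 2)])
case_open = infer((3, 4), known_sat=[(4, 2)], known_unsat=[])
```

Only the third case would need an actual solver query.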
Design Space Exploration
[Figure: Split-join Graph, SMT Constraints, SMT Solver; the Design Space Exploration Algorithm exchanges cost constraints and solutions with the solver]
Example queries:
(x1, y1): SAT
(x2, y2): UNSAT
(x3, y3): ? (TIMEOUT)
Timeout: the solver cannot decide SAT / UNSAT within a given time budget.
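The three possible answers can be modeled with a small sketch. It is an assumption-laden stand-in: the solver is replaced by a linear scan over candidates, and the time budget by a step budget so that the behavior is deterministic. `Answer` and `budgeted_query` are hypothetical names.

```python
from enum import Enum

class Answer(Enum):
    SAT = "sat"
    UNSAT = "unsat"
    TIMEOUT = "timeout"

def budgeted_query(candidates, is_solution, max_steps):
    """Scan candidates, giving up once max_steps checks have been spent."""
    steps = 0
    for candidate in candidates:
        if steps >= max_steps:
            return Answer.TIMEOUT, None   # budget exhausted, nothing proven
        steps += 1
        if is_solution(candidate):
            return Answer.SAT, candidate  # witness found
    return Answer.UNSAT, None             # whole space checked, no witness

found = budgeted_query(range(10), lambda c: c == 7, max_steps=100)
gave_up = budgeted_query(range(10), lambda c: c == 7, max_steps=3)
none_exists = budgeted_query(range(10), lambda c: c == 42, max_steps=100)
```

A TIMEOUT answer carries no information about feasibility, so an exploration algorithm has to treat that point differently from both SAT and UNSAT.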
Exploration Algorithm
Divide cost space using grids
One SMT query per point on the grid
Finer grid after every iteration
Don’t query in known area
[Figure legend: sat points, unsat points, not-yet-explored points]
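The four steps can be sketched as a grid refinement loop. This is a simplified illustration rather than the exact algorithm of the talk: it ignores timeouts, halves the grid step on each axis per iteration, and uses a hypothetical `toy_oracle` in place of an SMT query (it assumes A0, four unit-time B tasks, and C0).

```python
def is_known(point, sat, unsat):
    """True if monotonicity already decides this point."""
    lat, procs = point
    implied_sat = any(sl <= lat and sp <= procs for sl, sp in sat)
    implied_unsat = any(ul >= lat and up >= procs for ul, up in unsat)
    return implied_sat or implied_unsat

def explore(oracle, max_latency, max_procs):
    sat, unsat, queries = [], [], 0
    step_l, step_p = max_latency, max_procs  # coarsest grid: a single point
    while True:
        for lat in range(step_l, max_latency + 1, step_l):
            for procs in range(step_p, max_procs + 1, step_p):
                if is_known((lat, procs), sat, unsat):
                    continue  # don't query in known area
                queries += 1  # one query per remaining grid point
                (sat if oracle(lat, procs) else unsat).append((lat, procs))
        if step_l == 1 and step_p == 1:
            return sat, unsat, queries
        # finer grid for the next iteration
        step_l, step_p = max(1, step_l // 2), max(1, step_p // 2)

def toy_oracle(lat, procs):
    # Hypothetical feasibility rule: 1 (A0) + ceil(4 / procs) (B tasks) + 1 (C0).
    return lat >= 2 + -(-4 // procs)

sat, unsat, queries = explore(toy_oracle, max_latency=8, max_procs=4)
pareto = sorted(p for p in sat
                if not any(q != p and q[0] <= p[0] and q[1] <= p[1] for q in sat))
```

On this toy cost space of 32 points, the pruning by known areas means that only a fraction of the points is actually queried.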
Overview
1 Motivation
2 Application Model
3 Deployment using SMT
4 Symmetry elimination
5 Distributed memory scheduling
6 SMT Solving
7 Conclusions
Task Symmetry
[Figure: task graph with nodes A0, B0, B1, C00, C01, C10, C11, D0, D1, E0; C00 and C01 highlighted]
A schedule (time vs. processors P1, P2), rows: B1 C10 C01 C00 D0 E0 and A0 B0 C11 D1
A permuted schedule (C00 and C01 swapped), rows: B1 C10 C00 C01 D0 E0 and A0 B0 C11 D1
All instances of actor C are similar (symmetric)
No change in latency!
Huge number of such symmetric solutions
Add constraints to eliminate all but one
![Page 96: imag€¦ · MotivationApplication ModelDeployment using SMTSymmetry eliminationDistributed memory schedulingSMT SolvingConclusions Pareto Exploration 0 5 10 15 20 25 30 Latency 0](https://reader036.vdocuments.mx/reader036/viewer/2022071102/5fdb78da71f431114f25df34/html5/thumbnails/96.jpg)
Task Symmetry
[Figure: task graph with tasks A0, B0, B1, C00, C01, C10, C11, D0, D1, E0, with the instances of C highlighted; a schedule on processors P1 and P2 with rows "B1 C10 C01 C00 D0 E0" and "A0 B0 C11 D1"; and a lexicographic schedule with rows "B1 C01 C10 C11 D1 E0" and "A0 B0 C00 D0"]
Lexicographic order: C00 ≺ C01 ≺ C10 ≺ C11
Enforce lexicographic order in the schedule: s(u) ≤ s(u′) for u ≺ u′
s(C00) ≤ s(C01) ≤ s(C10) ≤ s(C11)
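The ordering constraints on this slide can be generated mechanically. The sketch below is only an illustration, not the encoding used in the talk: `lex_constraints` and `satisfies` are hypothetical helper names, the start times are made-up values, and plain Python checks stand in for SMT assertions. It shows that of the 24 ways to hand four distinct start times to the instances of C, only one satisfies the chain s(C00) ≤ s(C01) ≤ s(C10) ≤ s(C11).

```python
from itertools import permutations


def lex_constraints(instances):
    """Return pairs (u, v), each meaning s(u) <= s(v), for neighbours in name order."""
    ordered = sorted(instances)
    return list(zip(ordered, ordered[1:]))


def satisfies(start, constraints):
    """Check every ordering constraint against a dict task -> start time."""
    return all(start[u] <= start[v] for u, v in constraints)


instances = ["C00", "C01", "C10", "C11"]
constraints = lex_constraints(instances)

# Hypothetical distinct start times; every permutation is a symmetric variant.
slots = [3, 4, 5, 6]
assignments = [dict(zip(instances, p)) for p in permutations(slots)]
kept = [a for a in assignments if satisfies(a, constraints)]

print(len(assignments), "symmetric variants,", len(kept), "kept")
```

In an SMT encoding the same pairs would become inequalities over the start-time variables, so the solver never has to visit the other permutations.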
![Page 101: imag€¦ · MotivationApplication ModelDeployment using SMTSymmetry eliminationDistributed memory schedulingSMT SolvingConclusions Pareto Exploration 0 5 10 15 20 25 30 Latency 0](https://reader036.vdocuments.mx/reader036/viewer/2022071102/5fdb78da71f431114f25df34/html5/thumbnails/101.jpg)
Task Symmetry: Theorem
Lexicographic Schedule
[Figure: a schedule on processors P1 and P2 with rows "B1 C10 C01 C00 D0 E0" and "A0 B0 C11 D1", and a permuted schedule in which C00 and C01 are swapped, with rows "B1 C10 C00 C01 D0 E0" and "A0 B0 C11 D1"]
Theorem: Every group has a lexicographic schedule
Corollary: No feasible cost is lost
![Page 106: imag€¦ · MotivationApplication ModelDeployment using SMTSymmetry eliminationDistributed memory schedulingSMT SolvingConclusions Pareto Exploration 0 5 10 15 20 25 30 Latency 0](https://reader036.vdocuments.mx/reader036/viewer/2022071102/5fdb78da71f431114f25df34/html5/thumbnails/106.jpg)
Processor Symmetry
[Figure: task graph with tasks A0, B0, B1, C00, C01, C10, C11, D0, D1, E0; a schedule of these tasks over time on processors P1 and P2; and the same schedule with P1 and P2 swapped]
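The slide does not state which constraint removes this symmetry. One common approach, shown here purely as an assumption, is to number processors in the order in which they are first used, so that schedules differing only by a renaming of processors collapse to the same mapping. The function name `canonical_mapping`, the task order and the example mapping are all hypothetical.

```python
def canonical_mapping(mapping, task_order):
    """Relabel processors by first use along task_order (P1, P2, ...)."""
    relabel = {}
    for task in task_order:
        proc = mapping[task]
        if proc not in relabel:
            relabel[proc] = f"P{len(relabel) + 1}"
    return {task: relabel[mapping[task]] for task in task_order}


task_order = ["A0", "B0", "B1", "C00", "C01", "C10", "C11", "D0", "D1", "E0"]

# Hypothetical task-to-processor mapping, not read off the slide.
schedule = {
    "A0": "P1", "B0": "P1", "B1": "P2", "C00": "P1", "C01": "P2",
    "C10": "P2", "C11": "P1", "D0": "P1", "D1": "P2", "E0": "P1",
}
# The same schedule with P1 and P2 swapped.
swapped = {t: ("P2" if p == "P1" else "P1") for t, p in schedule.items()}

same = canonical_mapping(schedule, task_order) == canonical_mapping(swapped, task_order)
print("both variants share one canonical form:", same)
```

With M identical processors there are up to M! such renamings of one schedule, so fixing a canonical labelling removes a factorial factor from the search space.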
![Page 109: imag€¦ · MotivationApplication ModelDeployment using SMTSymmetry eliminationDistributed memory schedulingSMT SolvingConclusions Pareto Exploration 0 5 10 15 20 25 30 Latency 0](https://reader036.vdocuments.mx/reader036/viewer/2022071102/5fdb78da71f431114f25df34/html5/thumbnails/109.jpg)
Pareto Exploration
[Figure: two plots of Processors (0 to 30) against Latency (0 to 30), one without symmetry breaking and one with symmetry breaking, each showing Sat points, Unsat points and the Pareto curve. A scale from 0 to 56 runs from "Takes no time" to "Times out".]
Exploration: Processors vs Latency, α = 30
Solver performance: timeouts reduce! The gap between SAT and UNSAT points is smaller.
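The gap between SAT and UNSAT points can be made concrete. The sketch below assumes each query asks whether a schedule exists with costs no larger than a bound (latency, processors); under that assumption a SAT answer also settles every looser bound and an UNSAT answer every tighter one, and what remains is the undecided gap. The function `classify` and all points are hypothetical, not data from the plots.

```python
def classify(query, sat_points, unsat_points):
    """Decide a cost-bound query from earlier answers, if monotonicity allows."""
    # Some known feasible point fits within the query bounds.
    if any(all(p_i <= q_i for p_i, q_i in zip(p, query)) for p in sat_points):
        return "sat"
    # The query is at least as tight as a bound already proven infeasible.
    if any(all(q_i <= p_i for p_i, q_i in zip(p, query)) for p in unsat_points):
        return "unsat"
    return "unknown"


# Hypothetical (latency, processors) answers from earlier solver calls.
sat_points = [(20, 2), (12, 4)]
unsat_points = [(10, 3), (15, 1)]

for query in [(25, 3), (9, 2), (14, 3)]:
    print(query, classify(query, sat_points, unsat_points))
```

Queries that come back "unknown" are the ones that still need a solver call, and those are the calls that may time out.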
![Page 113: imag€¦ · MotivationApplication ModelDeployment using SMTSymmetry eliminationDistributed memory schedulingSMT SolvingConclusions Pareto Exploration 0 5 10 15 20 25 30 Latency 0](https://reader036.vdocuments.mx/reader036/viewer/2022071102/5fdb78da71f431114f25df34/html5/thumbnails/113.jpg)
Video Decoder
3D cost space (C_L, C_P, C_B) exploration; C_B is the total buffer size
MPEG video decoder: 122 tasks
[Figure: 3D plot with axes Latency (×10^3, 8 to 24), Buffer Size (150 to 400) and Processor (0 to 140); legend: with symmetry, without symmetry; labelled points: [5,367,91], [24,276,1], [14,276,122], [14,333,62], [10,323,122], [17,182,122], [7,205,122], [24,182,1], [19,182,31], [10,229,31]]
Better Pareto points in the same time budget!
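To see what "better Pareto points" means for the labelled points, the sketch below filters them by dominance. It assumes all three coordinates are costs to be minimized; it does not know which points come from the run with symmetry breaking, so it only reports which points are dominated by another point on the plot. The helper names `dominates` and `pareto_front` are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every cost and differs from b."""
    return a != b and all(x <= y for x, y in zip(a, b))


def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# The ten points labelled on the slide.
points = [
    (5, 367, 91), (24, 276, 1), (14, 276, 122), (14, 333, 62), (10, 323, 122),
    (17, 182, 122), (7, 205, 122), (24, 182, 1), (19, 182, 31), (10, 229, 31),
]

front = pareto_front(points)
dominated = [p for p in points if p not in front]
print("non-dominated:", front)
print("dominated:", dominated)
```

Four of the ten points are dominated, for example (24, 276, 1) by (24, 182, 1), which uses the same latency and processor count with a smaller buffer.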
![Page 118: imag€¦ · MotivationApplication ModelDeployment using SMTSymmetry eliminationDistributed memory schedulingSMT SolvingConclusions Pareto Exploration 0 5 10 15 20 25 30 Latency 0](https://reader036.vdocuments.mx/reader036/viewer/2022071102/5fdb78da71f431114f25df34/html5/thumbnails/118.jpg)
Distributed memory scheduling
So far we have ignored communication costs
For distributed memory, communication needs to be modeled
![Page 120: imag€¦ · MotivationApplication ModelDeployment using SMTSymmetry eliminationDistributed memory schedulingSMT SolvingConclusions Pareto Exploration 0 5 10 15 20 25 30 Latency 0](https://reader036.vdocuments.mx/reader036/viewer/2022071102/5fdb78da71f431114f25df34/html5/thumbnails/120.jpg)
Overview
1 Motivation
2 Application Model
3 Deployment using SMT
4 Symmetry elimination
5 Distributed memory scheduling
6 SMT Solving
7 Conclusions
![Page 121: imag€¦ · MotivationApplication ModelDeployment using SMTSymmetry eliminationDistributed memory schedulingSMT SolvingConclusions Pareto Exploration 0 5 10 15 20 25 30 Latency 0](https://reader036.vdocuments.mx/reader036/viewer/2022071102/5fdb78da71f431114f25df34/html5/thumbnails/121.jpg)
Kalray MPPA-256
[Figure: block diagram of the MPPA-256 chip. The periphery holds quad-core I/O subsystems, each with 512 KB of memory, and interfaces for PCIe, Interlaken, Ethernet, DDR and GPIOs. A compute cluster contains processors P0 to P15, shared memory, a system core, a DSU, DMA, and D-NoC and C-NoC routers.]
16 compute clusters, each with:
16 processors
2 MB Shared Memory
DMA
Toroidal 2D network
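A toroidal 2D network wraps around at its edges, which shortens the distance between clusters on opposite borders. The sketch below assumes the 16 clusters form a 4 × 4 grid numbered row by row and that distance is counted in hops; both are simplifying assumptions for illustration, and the real NoC routing may differ. The name `torus_distance` is hypothetical.

```python
def torus_distance(a, b, width=4, height=4):
    """Hop count between clusters a and b on a width x height torus."""
    ax, ay = a % width, a // width
    bx, by = b % width, b // width
    dx = abs(ax - bx)
    dy = abs(ay - by)
    # On a torus each axis can be traversed in either direction.
    return min(dx, width - dx) + min(dy, height - dy)


clusters = 16
processors_per_cluster = 16
print("compute processors:", clusters * processors_per_cluster)
print("cluster 0 to 3:", torus_distance(0, 3))
print("cluster 0 to 10:", torus_distance(0, 10))
```

Clusters 0 and 3 sit at opposite ends of a row yet are one hop apart because of the wrap-around link, while 0 and 10 are four hops apart, the largest distance on this grid.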
![Page 129: imag€¦ · MotivationApplication ModelDeployment using SMTSymmetry eliminationDistributed memory schedulingSMT SolvingConclusions Pareto Exploration 0 5 10 15 20 25 30 Latency 0](https://reader036.vdocuments.mx/reader036/viewer/2022071102/5fdb78da71f431114f25df34/html5/thumbnails/129.jpg)
The problem?
Which cluster to allocate?
Which processor to allocate?
Connected tasks in the same or in different clusters?
If communication tasks are to be added, which DMA?
And the constraints:
Precedence
Mutual Exclusion
Costs
For 10 tasks and 256 processors, 1.20892582 × 10^24 potential solutions!
Split the problem into sub-problems.
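The figure on this slide matches the number of ways to assign each of 10 tasks to one of 256 processors, 256^10 = 2^80. The short check below assumes that this is how the count was obtained; it covers the processor assignment only and ignores start times.

```python
tasks = 10
processors = 256

# Each task independently picks one of the processors.
assignments = processors ** tasks

print(assignments)
print(f"{assignments:.8e}")
```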
![Page 132: imag€¦ · MotivationApplication ModelDeployment using SMTSymmetry eliminationDistributed memory schedulingSMT SolvingConclusions Pareto Exploration 0 5 10 15 20 25 30 Latency 0](https://reader036.vdocuments.mx/reader036/viewer/2022071102/5fdb78da71f431114f25df34/html5/thumbnails/132.jpg)
Design Flow
Application Graph → Partitioning → Placement → Multi-cluster Scheduling
[Figure: application graph with actors A, B, C, D, E, F]
Group the Actors
Goals:
Load balance the groups
Minimize data exchange
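The two partitioning goals pull against each other, which a small cost function makes visible. The sketch below is an illustration under assumed data: the actor loads, the edge traffic and the two candidate groupings are invented, and `partition_cost` is a hypothetical helper, not the partitioning method of the talk.

```python
def partition_cost(load, traffic, group_of):
    """Return (largest group load, traffic crossing group boundaries)."""
    group_load = {}
    for actor, weight in load.items():
        group = group_of[actor]
        group_load[group] = group_load.get(group, 0) + weight
    cut = sum(w for (u, v), w in traffic.items() if group_of[u] != group_of[v])
    return max(group_load.values()), cut


# Hypothetical workload per actor and data volume per edge.
load = {"A": 2, "B": 3, "C": 4, "D": 4, "E": 3, "F": 2}
traffic = {("A", "B"): 1, ("B", "C"): 4, ("B", "D"): 4,
           ("C", "E"): 2, ("D", "E"): 2, ("E", "F"): 1}

balanced = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 1}
low_cut = {"A": 0, "B": 0, "C": 0, "D": 0, "E": 1, "F": 1}

print("balanced:", partition_cost(load, traffic, balanced))
print("low cut: ", partition_cost(load, traffic, low_cut))
```

The first grouping balances the load (9 and 9) but exchanges 6 units of data; the second exchanges only 4 but loads one group with 13, so the two goals form a multi-criteria trade-off.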
![Page 134: imag€¦ · MotivationApplication ModelDeployment using SMTSymmetry eliminationDistributed memory schedulingSMT SolvingConclusions Pareto Exploration 0 5 10 15 20 25 30 Latency 0](https://reader036.vdocuments.mx/reader036/viewer/2022071102/5fdb78da71f431114f25df34/html5/thumbnails/134.jpg)
Design Flow
Application Graph → Partitioning → Placement → Multi-cluster Scheduling
[Figure: application graph with actors A, B, C, D, E, F]
Place the Groups
Goals:
Minimize distance between communicating groups
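For a handful of groups the placement goal can be explored by brute force. The sketch below assumes a tiny 2 × 2 mesh of clusters with Manhattan hop distance and invented traffic volumes; `hops` and `best_placement` are hypothetical names, and exhaustive enumeration is used here only for illustration, not as the method of the talk.

```python
from itertools import permutations


def hops(a, b, width=2):
    """Manhattan distance between clusters a and b on a mesh of given width."""
    ax, ay = a % width, a // width
    bx, by = b % width, b // width
    return abs(ax - bx) + abs(ay - by)


def best_placement(groups, clusters, traffic):
    """Try every injective placement and keep the cheapest one."""
    best = None
    for perm in permutations(clusters, len(groups)):
        place = dict(zip(groups, perm))
        cost = sum(w * hops(place[g], place[h]) for (g, h), w in traffic.items())
        if best is None or cost < best[0]:
            best = (cost, place)
    return best


groups = ["G0", "G1", "G2"]
clusters = [0, 1, 2, 3]
# Hypothetical data volume exchanged between groups.
traffic = {("G0", "G1"): 6, ("G1", "G2"): 4}

cost, place = best_placement(groups, clusters, traffic)
print("cost:", cost, "placement:", place)
```

The cheapest placement puts G1 next to both of its neighbours, so every transfer travels a single hop and the weighted distance is 6 + 4 = 10.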
![Page 136: imag€¦ · MotivationApplication ModelDeployment using SMTSymmetry eliminationDistributed memory schedulingSMT SolvingConclusions Pareto Exploration 0 5 10 15 20 25 30 Latency 0](https://reader036.vdocuments.mx/reader036/viewer/2022071102/5fdb78da71f431114f25df34/html5/thumbnails/136.jpg)
Design Flow
Application Graph → Partitioning → Placement → Multi-cluster Scheduling
[Figure: task graph with tasks A0, B0, B1, C0, C1, D0, D1, E0, E1, F0, and a schedule of these tasks over time with rows for P1, P2 and DMA0 in Cluster 0 and Cluster 1, including a transfer]
Schedule:
Tasks
Transfer
Goals:
Minimize Latency
Minimize Buffer size
Output of Design Flow
Tasks and transfers: cluster mapping; processor and DMA mapping; start time
Edges: communication buffer size
Application: latency
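One way to picture this output is as a small record type. The sketch below is an illustration only: the class and field names are hypothetical, the `duration` field is input data carried along to compute end times, and latency is assumed to be the completion time of the last activity.

```python
from dataclasses import dataclass

@dataclass
class Activity:
    """A scheduled task instance or DMA transfer (field names are illustrative)."""
    name: str
    cluster: int
    resource: str   # processor or DMA channel within the cluster
    start: int
    duration: int   # input data, kept here only to compute end times

    @property
    def end(self):
        return self.start + self.duration

@dataclass
class Deployment:
    activities: list
    buffer_size: dict   # edge name -> communication buffer size in bytes

    def latency(self):
        # Assumption: latency is the completion time of the last activity.
        return max(a.end for a in self.activities)

    def conflicts(self):
        """Pairs of activities that overlap in time on the same resource."""
        found = []
        for i, a in enumerate(self.activities):
            for b in self.activities[i + 1:]:
                same = (a.cluster, a.resource) == (b.cluster, b.resource)
                if same and a.start < b.end and b.start < a.end:
                    found.append((a.name, b.name))
        return found

# Hypothetical deployment: one task per cluster and one transfer in between.
demo = Deployment(
    activities=[
        Activity("A0", 0, "P1", 0, 4),
        Activity("transfer", 0, "DMA0", 4, 3),
        Activity("B0", 1, "P1", 7, 5),
    ],
    buffer_size={"A->B": 1024},
)
print(demo.latency(), demo.conflicts())
```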
DMA Model
Tasks communicating via DMA: A → I → G → B, where I and G form the DMA transfer
[Figure: Gantt chart over time with rows P1 (Cluster 0), DMA0 and P1 (Cluster 1), showing A, I, G and B; I appears on both a processor row and the DMA0 row]

Task  Description       Resources used     Task duration
I     Initialization    Processor and DMA  Constant
G     Network transfer  DMA only           Depends on transfer size
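The two DMA tasks in the table can be sketched as follows. The constants and the affine duration model for G (a fixed setup cost plus a per-byte cost) are assumptions made for illustration; the slide only says that I has a constant duration and that the duration of G depends on the transfer size.

```python
INIT_CYCLES = 200      # hypothetical constant duration of I
SETUP_CYCLES = 50      # hypothetical fixed part of G
CYCLES_PER_BYTE = 2    # hypothetical size-dependent part of G

def dma_transfer(ready_time, size_bytes, processor="P1", dma="DMA0"):
    """Return the I and G activities for one transfer as
    (name, resources, start, end) tuples."""
    i_start = ready_time
    i_end = i_start + INIT_CYCLES
    # G starts once I has finished and occupies only the DMA channel.
    g_end = i_end + SETUP_CYCLES + CYCLES_PER_BYTE * size_bytes
    return [
        ("I", {processor, dma}, i_start, i_end),
        ("G", {dma}, i_end, g_end),
    ]

small = dma_transfer(ready_time=1000, size_bytes=64)
large = dma_transfer(ready_time=1000, size_bytes=4096)
print(small)
print(large)
```

Because I holds both the processor and the DMA channel, a scheduler has to reserve both resources for its duration, while G leaves the processor free for other tasks.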
Model Transformation
An example application graph: A → B, edge [α, ω]
Partition-Aware graph: A → Iwr → Gwr → B
  e_wt : [1, w↑]   e_wn : [1]   e_rt : [α, ω]
Buffer-Aware graph: actors A, Iwr, Gwr, Fst, B, Ird, Grd
  e_wt : [1, w↑]   e_wn : [1]   e_rt : [α, ω]
  e_ws : [1]   e_wb : [1, 0, b(e_wt)]
  e_rs : [1]   e_rn : [1]   e_rb : [α^(-1), 0, b(e_rt)]
Edge legend: DMA data; DMA flow-control; DMA completion
JPEG Decoder Example
VLD → IQ/IDCT → COLOR
[Figure: edge annotations as extracted: 12, 1, 12]
VLD: Variable Length Decoder
IQ/IDCT: Inverse Quantization / Inverse Discrete Cosine Transform
COLOR: Color Conversion
JPEG Decoder Example
Design flow for the JPEG decoder: Partitioning → Placement → Multi-cluster Scheduling

Partitioning Solutions:
C_z : number of groups
C_η : total communication cost
C_τ : maximum workload per group

          Allocated group    Exploration cost
Solution  vld  iq  color     C_z  C_η    C_τ
Ps0       0    1   2         3    12384  424012
Ps1       0    0   1         2    2736   758116
Ps2       0    0   0         1    0      934288
Ps3       0    1   1         2    9648   510276

Scheduling Solutions:
[Figure: buffer size (bytes, ×10^4, axis from 1 to 1.2) against latency (cycles, ×10^6, axis from 0.4 to 1) for Ps0, Ps1, Ps2 and Ps3]
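The three partitioning costs can be recomputed from the table with a short sketch. It rests on two assumptions that the slide does not spell out: C_η is the sum of the costs of edges whose endpoints land in different groups, and C_τ is the largest summed workload of any group. The per-actor workloads and per-edge costs below are not given in the talk; they are back-solved from the four table rows under those assumptions.

```python
# Back-solved from the table (assumed values, not stated in the talk):
# Ps0 puts every actor alone, Ps1 cuts only iq -> color, Ps3 cuts only vld -> iq.
WORKLOAD = {"vld": 424012, "iq": 334104, "color": 176172}
EDGE_COST = {("vld", "iq"): 9648, ("iq", "color"): 2736}

def partition_costs(assignment):
    """Return (C_z, C_eta, C_tau) for a mapping actor -> group index."""
    c_z = len(set(assignment.values()))
    c_eta = sum(cost for (src, dst), cost in EDGE_COST.items()
                if assignment[src] != assignment[dst])
    load = {}
    for actor, group in assignment.items():
        load[group] = load.get(group, 0) + WORKLOAD[actor]
    c_tau = max(load.values())
    return c_z, c_eta, c_tau

SOLUTIONS = {
    "Ps0": {"vld": 0, "iq": 1, "color": 2},
    "Ps1": {"vld": 0, "iq": 0, "color": 1},
    "Ps2": {"vld": 0, "iq": 0, "color": 0},
    "Ps3": {"vld": 0, "iq": 1, "color": 1},
}

for name, assignment in SOLUTIONS.items():
    print(name, partition_costs(assignment))
```

The table shows the trade-off directly: merging actors lowers C_z and C_η but raises C_τ, which is why several partitions are kept as candidates for the later stages.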
JPEG Decoder Example
JPEG decoder latency measured on Kalray platform
[Figure: four panels (Ps0, Ps1, Ps2, Ps3), each plotting buffer size (bytes, ×10^4) against latency (cycles, ×10^6); series: model, measured-min., measured-max.]
Maximum prediction error of 9%
StreamIt Benchmarks
[Figure: bar chart of #Solutions and %error per benchmark, vertical axis from 0 to 100. Benchmarks: JPEG Dec., Beam Former, Insertion Sort, Merge Sort, Radix Sort, Dct1 to Dct8, DctCoarse, DctFine, Comp. count, Matrix Mult., Fft. Bar labels in extraction order: 25, 155, 7, 37, 6, 4, 8, 4, 8, 8, 24, 7, 10, 3, 6, 4, 8, 8]
Overview
1 Motivation
2 Application Model
3 Deployment using SMT
4 Symmetry elimination
5 Distributed memory scheduling
6 SMT Solving
7 Conclusions
Lessons learnt from SMT solver
[Figure: cost axis with the lower bound, the optimal value and the upper bound marked; query regions are labelled SAT and UNSAT]
Lessons learnt from SMT solver
[Figure: cost axis with the lower bound, the optimal value and the upper bound marked; a region is labelled TIMEOUT, and the value found is marked separately from the optimal one]
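The SAT, UNSAT and TIMEOUT labels suggest a search over cost bounds. A minimal sketch of such a search is shown below, assuming a binary search in which each query asks whether a solution with cost at most `c` exists. The `query` callback stands in for a real solver call, and `make_oracle` is a toy replacement that times out near the optimum; both are hypothetical and not taken from the talk.

```python
SAT, UNSAT, TIMEOUT = "SAT", "UNSAT", "TIMEOUT"

def search_optimum(query, lower, upper):
    """Binary search on the cost bound.

    query(c) answers whether a solution with cost <= c exists.
    Precondition: `upper` is feasible and no solution costs less than `lower`.
    Returns (lo, hi): the optimum lies in [lo, hi] and hi is feasible.
    """
    lo, hi = lower, upper
    while lo < hi:
        mid = (lo + hi) // 2
        answer = query(mid)
        if answer == SAT:
            hi = mid          # a solution with cost <= mid exists
        elif answer == UNSAT:
            lo = mid + 1      # the optimum is strictly above mid
        else:
            break             # TIMEOUT: give up and report the interval
    return lo, hi

def make_oracle(optimum, hard_zone=0):
    """Toy stand-in for a solver: times out within `hard_zone` of the optimum."""
    def query(cost):
        if abs(cost - optimum) < hard_zone:
            return TIMEOUT
        return SAT if cost >= optimum else UNSAT
    return query

exact = search_optimum(make_oracle(17), lower=0, upper=30)
bounded = search_optimum(make_oracle(17, hard_zone=2), lower=0, upper=30)
print(exact, bounded)
```

When a query times out, the search still returns a feasible value together with a proven lower bound, so the gap between the value found and the optimum is bounded by the width of the returned interval.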
Lessons learnt from SMT solver
[Figure: two Gantt charts over time for tasks A0, B0, B1, B2, B3, C0 on processors P1, P2 and P3; the second chart is labelled "Optimized Schedule"]
Such constraints make the problem harder for the SMT solver
![Page 186: imag€¦ · MotivationApplication ModelDeployment using SMTSymmetry eliminationDistributed memory schedulingSMT SolvingConclusions Pareto Exploration 0 5 10 15 20 25 30 Latency 0](https://reader036.vdocuments.mx/reader036/viewer/2022071102/5fdb78da71f431114f25df34/html5/thumbnails/186.jpg)
Two-step optimization
Get a loose schedule from the solver
Optimize it for: latency, processors used
[Figure: plot with axes Latency and Processors, an upper bound marked on each axis, and labels "Loose Solution", "optimized" and "SAT"]
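The slide does not say how the second step is carried out. As one possible illustration, the sketch below post-processes a loose schedule with a greedy left-shift: every task keeps its processor and its relative order, but starts as early as its predecessors and its processor allow. The task names follow the A0, B0-B3, C0 example from the earlier slide; the durations, the loose start times and the helper names (`left_shift`, `latency`, `processors_used`) are hypothetical and are not taken from the presentation. The sketch only tightens latency and reports the processor count; it does not try to reduce the number of processors.

```python
from collections import defaultdict

def left_shift(durations, deps, loose):
    """Compact a feasible schedule, keeping the mapping and per-processor order.

    durations: task -> execution time (assumed positive)
    deps:      task -> list of predecessor tasks
    loose:     task -> (processor, start), assumed feasible (e.g. from a solver)
    """
    # Sorting by loose start time respects precedence and the order of
    # tasks on each processor, because the loose schedule is feasible.
    order = sorted(loose, key=lambda t: loose[t][1])
    proc_free = defaultdict(int)  # processor -> time at which it becomes idle
    finish = {}                   # task -> finish time in the compacted schedule
    tight = {}
    for task in order:
        proc, _ = loose[task]
        ready = max((finish[p] for p in deps.get(task, [])), default=0)
        start = max(ready, proc_free[proc])
        tight[task] = (proc, start)
        finish[task] = start + durations[task]
        proc_free[proc] = finish[task]
    return tight

def latency(durations, schedule):
    """Finish time of the last task."""
    return max(start + durations[t] for t, (_, start) in schedule.items())

def processors_used(schedule):
    """Number of distinct processors that run at least one task."""
    return len({proc for proc, _ in schedule.values()})

# Hypothetical split-join instance: A0 feeds B0..B3, which all feed C0.
durations = {"A0": 1, "B0": 2, "B1": 2, "B2": 2, "B3": 2, "C0": 1}
deps = {"B0": ["A0"], "B1": ["A0"], "B2": ["A0"], "B3": ["A0"],
        "C0": ["B0", "B1", "B2", "B3"]}

# Hypothetical loose schedule: feasible, but with idle gaps.
loose = {"A0": ("P1", 0), "B0": ("P1", 3), "B1": ("P2", 2),
         "B2": ("P3", 4), "B3": ("P2", 6), "C0": ("P1", 10)}

tight = left_shift(durations, deps, loose)
print(latency(durations, loose), latency(durations, tight))
print(processors_used(tight))
```

On this instance the loose schedule has latency 11 and the compacted one has latency 6, with the same three processors in use.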
![Page 187: imag€¦ · MotivationApplication ModelDeployment using SMTSymmetry eliminationDistributed memory schedulingSMT SolvingConclusions Pareto Exploration 0 5 10 15 20 25 30 Latency 0](https://reader036.vdocuments.mx/reader036/viewer/2022071102/5fdb78da71f431114f25df34/html5/thumbnails/187.jpg)
Overview
1 Motivation
2 Application Model
3 Deployment using SMT
4 Symmetry elimination
5 Distributed memory scheduling
6 SMT Solving
7 Conclusions
![Page 188: imag€¦ · MotivationApplication ModelDeployment using SMTSymmetry eliminationDistributed memory schedulingSMT SolvingConclusions Pareto Exploration 0 5 10 15 20 25 30 Latency 0](https://reader036.vdocuments.mx/reader036/viewer/2022071102/5fdb78da71f431114f25df34/html5/thumbnails/188.jpg)
Conclusions and Future Work
Conclusions:
Symmetry elimination finds better solutions
Combined Optimization with Communication modeling
Automated design flow for distributed memory
![Page 191: imag€¦ · MotivationApplication ModelDeployment using SMTSymmetry eliminationDistributed memory schedulingSMT SolvingConclusions Pareto Exploration 0 5 10 15 20 25 30 Latency 0](https://reader036.vdocuments.mx/reader036/viewer/2022071102/5fdb78da71f431114f25df34/html5/thumbnails/191.jpg)
References
P. Tendulkar, P. Poplavko, and O. Maler. “Symmetry Breaking for Multi-criteria Mapping and Scheduling on Multicores”. In: FORMATS. 2013.
P. Tendulkar, P. Poplavko, I. Galanommatis, and O. Maler. “Many-Core Scheduling of Data Parallel Applications using SMT Solvers”. In: DSD. 2014.
P. Tendulkar, P. Poplavko, and O. Maler. Strictly Periodic Scheduling of Acyclic Synchronous Dataflow Graphs using SMT Solvers. Tech. rep. Verimag Research Report, 2014.
![Page 192: imag€¦ · MotivationApplication ModelDeployment using SMTSymmetry eliminationDistributed memory schedulingSMT SolvingConclusions Pareto Exploration 0 5 10 15 20 25 30 Latency 0](https://reader036.vdocuments.mx/reader036/viewer/2022071102/5fdb78da71f431114f25df34/html5/thumbnails/192.jpg)
Thank You
Questions?