Milano – Sept. 11th 2014
INSTITUT D’ÉLECTRONIQUE ET DE TÉLÉCOMMUNICATIONS DE RENNES
PREESM: A Dataflow-Based Rapid Prototyping Framework for Simplifying Multicore DSP Programming
Maxime Pelcat, Karol Desnos, Julien Heulot
Clément Guy, Jean-François Nezan, Slaheddine Aridhi
EDERC 2014 Conference, Milan, September 11th
Software Productivity Gap (chart, 1990 to 2015)
• Transistors/chip: x2 every 18 months
• Lines of code/chip: x3.5 every 18 months
• Lines of code/day: +25% every 18 months
Source: “Hardware-dependent Software”, Ecker et al.
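The three growth rates on the chart compound very differently. As a rough worked example, the sketch below compounds them over a chosen time span. Reading the gap as the ratio between the code demanded per chip and the code written per day is an assumption made here, and the function names are illustrative.

```python
def growth_factor(rate_per_period, months, period_months=18):
    """Compound a per-period growth rate over a number of months."""
    return rate_per_period ** (months / period_months)


def productivity_gap(months):
    """Ratio of software demand growth to developer output growth.

    Assumption: the gap is read as (lines of code/chip) / (lines of code/day).
    """
    demand = growth_factor(3.5, months)   # lines of code/chip: x3.5 every 18 months
    output = growth_factor(1.25, months)  # lines of code/day: +25% every 18 months
    return demand / output


if __name__ == "__main__":
    for years in (3, 6, 12):
        months = 12 * years
        print(years, "years:",
              "transistors x%.0f," % growth_factor(2.0, months),
              "gap x%.1f" % productivity_gap(months))
```

Under this reading, the gap widens by a factor of 3.5 / 1.25 = 2.8 every 18 months.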
Typical Single DSP Environment
Figure: the algorithm code (C/C++) and the command line options feed a compiler, which produces a program. The program runs on the core(s) under an OS, or in a simulator + debugger + profiler.
Multicore DSP Rapid Prototyping
Figure: an architecture model, a functional algorithm model + code, and constraints + options feed the rapid prototyping step, which produces several programs. The programs are deployed on the cores (Core 1, Core 2, …) under an OS, or run in a simulator + debugger + profiler.
Reduce Software Productivity Gap
• In early design phases: Metrics
• Design parallel algorithms
– Automatic mapping and scheduling
• Predictable time and memory
– Choose the right algorithm and hardware
Reduce Software Productivity Gap
• In late design phases: Rapid Prototyping
• Automatic multi-core speedup
• Inter-core communication
• Guaranteed Deadlock-freeness
Reduce Software Productivity Gap
• For migration to new hardware
• Seamless porting to a new architecture
• Legacy code reusability
• Portable performance
Dataflow modelling can help
PREESM for C6678
Figure: an architecture model, an algorithm dataflow graph + C code, and a scenario feed PREESM, which generates multiple C programs. The programs run on the C66 cores of the C6678 under SYS/BIOS, or are evaluated with the PREESM simulator and the CCS debugger and profiler.
Algo dataflow: PiSDF
Figure (built up over four slides): a PiSDF graph with actors Read, Filter and Display, whose port rates are expressed with the parameter Size. Filter is refined hierarchically: its interfaces in, out and feedback surround a Kernel actor whose port rates are Size and Size/N, where N is a second parameter. C code is attached to the actors.
K. Desnos, M. Pelcat, J.-F. Nezan, S. S. Bhattacharyya, S. Aridhi, “PiMM: Parameterized and Interfaced Dataflow Meta-Model for MPSoCs Runtime Reconfiguration”, SAMOS XIII
Algo dataflow: PiSDF
PiSDF MoC is:
- Hierarchical & Compositional
- Statically parameterizable
- Dynamically reconfigurable
PiSDF fosters:
- Predictability
- Parallelism
- Lightweight runtime overhead
- Developer-friendliness
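Part of the predictability comes from the fact that, once the parameters are fixed, the rates of a PiSDF graph are plain integers and the number of firings of each actor follows from the SDF balance equations. The sketch below solves those equations for a hypothetical flattened Read, Kernel, Display chain using the Size and Size/N rates. The function name, the edge list and the parameter values are assumptions made for illustration, not PREESM code.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def repetition_vector(actors, edges):
    """Solve the balance equations q[src] * prod == q[dst] * cons.

    edges is a list of (src, prod_rate, dst, cons_rate) tuples.
    Returns the smallest positive integer firing counts per actor.
    """
    rate = {actors[0]: Fraction(1)}
    pending = list(edges)
    while pending:
        progressed = False
        for edge in list(pending):
            src, prod, dst, cons = edge
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * prod / cons
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * cons / prod
            elif src in rate and dst in rate:
                if rate[src] * prod != rate[dst] * cons:
                    raise ValueError("inconsistent rates on %s -> %s" % (src, dst))
            else:
                continue  # neither end is known yet, retry later
            pending.remove(edge)
            progressed = True
        if not progressed:
            raise ValueError("graph is not connected")
    # Scale the rational solution to the smallest integer solution.
    scale = reduce(lambda a, b: a * b // gcd(a, b),
                   (f.denominator for f in rate.values()), 1)
    counts = {a: int(f * scale) for a, f in rate.items()}
    common = reduce(gcd, counts.values())
    return {a: c // common for a, c in counts.items()}


if __name__ == "__main__":
    size, n = 1024, 4  # hypothetical parameter values
    edges = [("Read", size, "Kernel", size // n),
             ("Kernel", size // n, "Display", size)]
    print(repetition_vector(["Read", "Kernel", "Display"], edges))
```

With these values, Kernel fires N times for each firing of Read and Display, which is the data parallelism a scheduler can then spread over several cores.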
Archi: System-Level Archi. Model
• Representing contentions as TDMA
Figure: the TMS320C6678, with eight cores (core1 to core8) connected to the MSMC and DDR3 memories. The links are annotated 5.3 GB/s and 16 GB/s.
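To give a feel for what such a model provides, the sketch below estimates transfer times from a link bandwidth. It assumes that TDMA contention can be approximated by an equal split of the bandwidth between the cores competing for the link, and that 1 GB/s means 10^9 bytes per second. The function name and the buffer size are hypothetical, and the transcript does not say which bandwidth figure belongs to which memory, so both are simply tried.

```python
def transfer_time_us(n_bytes, bandwidth_gb_s, contenders=1):
    """Estimate a transfer time in microseconds.

    Assumption: TDMA contention is modeled as an equal split of the link
    bandwidth between all contenders.
    """
    if contenders < 1:
        raise ValueError("at least one contender is required")
    share_bytes_per_s = bandwidth_gb_s * 1e9 / contenders
    return n_bytes / share_bytes_per_s * 1e6


if __name__ == "__main__":
    buffer_bytes = 1_000_000  # hypothetical 1 MB buffer
    for bandwidth in (5.3, 16.0):  # figures quoted on the slide
        for contenders in (1, 4, 8):
            print("%.1f GB/s, %d contender(s): %.1f us"
                  % (bandwidth, contenders,
                     transfer_time_us(buffer_bytes, bandwidth, contenders)))
```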
PREESM: Multicore Scheduling
• Scheduling based on latency and load balancing
Figure: the resulting schedule on core1 to core4.
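A minimal sketch of the idea, assuming a simple greedy list scheduler rather than PREESM's actual algorithm: each ready actor is placed on the core where it can finish earliest, which shortens latency and tends to balance the load. Communication costs are ignored, and the actor names, durations and dependencies are made up.

```python
def list_schedule(durations, deps, n_cores):
    """Greedy list scheduling of a dependency graph onto identical cores.

    durations maps actor -> execution time, deps maps actor -> predecessors.
    Returns ({actor: (core, start_time)}, makespan).
    """
    core_free = [0] * n_cores  # time at which each core becomes idle
    finish = {}
    placement = {}
    remaining = dict(durations)
    while remaining:
        ready = sorted(a for a in remaining
                       if all(p in finish for p in deps.get(a, ())))
        if not ready:
            raise ValueError("cyclic dependencies")
        for actor in ready:
            # An actor cannot start before all its predecessors have finished.
            earliest = max((finish[p] for p in deps.get(actor, ())), default=0)
            core = min(range(n_cores),
                       key=lambda c: max(core_free[c], earliest))
            start = max(core_free[core], earliest)
            finish[actor] = start + remaining.pop(actor)
            core_free[core] = finish[actor]
            placement[actor] = (core, start)
    return placement, max(finish.values())


if __name__ == "__main__":
    durations = {"src": 2, "left": 3, "right": 3, "merge": 2, "sink": 1}
    deps = {"left": ["src"], "right": ["src"],
            "merge": ["left"], "sink": ["right", "merge"]}
    for cores in (1, 2):
        placement, makespan = list_schedule(durations, deps, cores)
        print(cores, "core(s): makespan", makespan, placement)
```

In this toy graph only "left" and "right" can overlap, so a second core shortens the schedule but cannot halve it.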
PREESM: Memory Bounds
• Bounding the memory needs of an application graph to:
- Evaluate the memory requirements
- Adjust the size of architecture memory
- Assess the optimality of a memory allocation
Figure: an axis of available memory starting at 0. Below the lower bound, memory is insufficient; between the lower bound and the upper bound lies the possible allocated memory; above the upper bound, memory is wasted.
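One way to obtain such bounds, sketched here under assumptions the slide does not spell out: describe which buffers may never share memory as exclusion pairs, take the sum of all buffer sizes as the upper bound, and take the heaviest set of mutually exclusive buffers as the lower bound. The brute-force search only suits small examples, and the buffer names and sizes are hypothetical.

```python
from itertools import combinations


def memory_bounds(sizes, exclusions):
    """Return (lower, upper) memory bounds for a set of buffers.

    sizes maps buffer name -> size in bytes.
    exclusions lists pairs of buffers that must not overlap in memory.
    """
    # Upper bound: every buffer gets its own address range.
    upper = sum(sizes.values())
    excl = {frozenset(pair) for pair in exclusions}
    # Lower bound: heaviest group in which every pair is mutually exclusive.
    lower = 0
    names = list(sizes)
    for k in range(1, len(names) + 1):
        for group in combinations(names, k):
            if all(frozenset(p) in excl for p in combinations(group, 2)):
                lower = max(lower, sum(sizes[b] for b in group))
    return lower, upper


if __name__ == "__main__":
    sizes = {"ab": 100, "bc": 200, "cd": 150, "de": 50}
    exclusions = [("ab", "bc"), ("bc", "cd"), ("cd", "de")]
    print(memory_bounds(sizes, exclusions))
```

An allocation whose footprint sits close to the lower bound leaves little room for improvement, while one at the upper bound reuses no memory at all.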
PREESM: Prototype Code Generation
Figure: a dataflow graph with actors A, B, C, D and E is mapped onto two operators, o1 and o2, and shown as a timeline of actor executions on each operator.
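The shape of such generated code can be pictured as one sequential loop per operator, with blocking communication wherever data crosses from one operator to the other. The sketch below mimics this with two Python threads and a one-slot queue. The mapping (actor A on o1, actor B on o2), the actor bodies and the end-of-stream marker are all hypothetical; PREESM itself emits C code.

```python
import queue
import threading


def run_prototype(samples):
    """Run a hypothetical two-operator schedule: o1 fires A, o2 fires B."""
    a_to_b = queue.Queue(maxsize=1)  # single-slot buffer between operators
    outputs = []

    def operator_o1():
        for sample in samples:
            a_to_b.put(sample * 2)  # actor A, then blocking send
        a_to_b.put(None)            # end-of-stream marker (sketch only)

    def operator_o2():
        while True:
            token = a_to_b.get()    # blocking receive
            if token is None:
                break
            outputs.append(token + 1)  # actor B

    threads = [threading.Thread(target=operator_o1),
               threading.Thread(target=operator_o2)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return outputs


if __name__ == "__main__":
    print(run_prototype([1, 2, 3]))
```

Because each operator follows a fixed firing order and only blocks on its FIFO, the result does not depend on how the two threads interleave.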
PREESM Features
• Open Source Tool
– Available on GitHub
• Research-Oriented Tool
– New models, optimizations, scheduling
• Eclipse-based Integrated Tool
– Several plug-ins, metamodels
• Extended Web Tutorials
– http://preesm.sourceforge.net/website
Other Tools
• OpenMP, OpenEM
– Adding Rapid Prototyping
• MAPS Compiler, Polycore Polymapper, SynDEx
– Open-source code
Some Results on Stereo Matching
Figure, first panel: theoretical speedup and measured performance versus number of cores (1 to 8), vertical axis from 0 to 6.
Figure, second panel: allocated memory and lower memory bound versus number of cores (1 to 8), vertical axis from 0 to 90.
Conclusion
• Reduce Software Productivity Gap
– Design space exploration
– Rapid Prototyping
– Extract coarse-grain parallelism
– Portable performance
• PREESM dataflow modelling can help!
– Good decisions require extensive information on both computation and data flow
Thanks!
M. Pelcat, K. Desnos, J. Heulot, C. Guy, J.-F. Nezan, S. Aridhi, "PREESM: A Dataflow-Based Rapid Prototyping Framework for Simplifying Multicore DSP Programming", EDERC, 2014.
PREESM Tutorial: 16:00 to 17:00, Room: Oro Plenaria
M. Pelcat, S. Aridhi, J. Piat, J.-F. Nezan, "Physical Layer Multicore Prototyping: A Dataflow-Based Approach for LTE eNodeB", Springer, 2012.