This project and the research leading to these results has received funding from the European Community's
Seventh Framework Programme [FP7/2007-2013] under grant agreement n° 318693
Santhosh Kumar Rethinagiri, Oscar Palomar, Osman Unsal, Adrian Cristal
Barcelona Supercomputing Center
Energy-Aware Computing Workshop 2014
Power and Energy Modelling of Multi-core Processors for
System-Level Design Space Exploration
EACO Workshop, Sept. 10th 2014, Bristol
ParaDIME Consortium
TU Dresden (Germany)
IMEC, Leuven (Belgium)
BSC, Barcelona (Spain)
University of Neuchatel (Switzerland)
Cloud & Heat Technologies, Dresden (Germany)
Why ParaDIME?
Parallel Distributed Infrastructure for Minimization of Energy
Rising cost
Hardware cost
Programming efficiency
Runtime optimization
Energy aware data center computing
The ParaDIME Stack
[Figure: the ParaDIME stack. Labels recovered from the diagram: ParaDIME infrastructure; multi data center scheduler; data center; intra data center scheduler; computing node/stack; Application/BM; API; Scala; AKKA; actor scheduler; JVM; OS; VM; hypervisor; real HW; simulated HW (cores, accelerators, interconnect, future devices).]
[Partner labels: BSC, IMEC, UNINE, TuD, Cloud & Heat Technologies]
Challenges of modelling power of heterogeneous systems
Estimating power and energy consumption is critical when designing electronic devices.
Designers must estimate power as early as possible in the design flow.
Design changes are easiest early in the design phase, and system-level
decisions have the greatest impact on application power.
This calls for a platform that supports different processors and components.
Functional-level analysis is accurate but coarse-grained, and it is restricted
by the need to measure power on a real board.
Fine-grained estimates are possible with gate-level simulation, but that
requires tools and RTL sources we do not have, and simulation is very slow.
Another challenge: a power law holds for a simple processor, but whether it
holds for a complex processor system remains debatable.
Power estimation methodology and tools
McPAT
Hybrid design space exploration methodology
First step: FLPA (Functional Level Power Analysis)
Functional block (ARM Cortex-A9)
Generic Power Model Parameters
The parameters that influence power in a system.
Power measurement environment
Courtesy: Open-People project
Variation of power with instructions per cycle (IPC) for ARM Cortex-A8
[Plot: power (mW, axis 0 to 50) against instructions per cycle (IPC, axis 0 to 2)]
Power consumption models generated with FLPA
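The slide does not reproduce the generated models, so the following is only a minimal sketch of how a functional-level model could be fitted from board measurements. It assumes, purely for illustration, that power depends linearly on IPC; the object name `FlpaFit`, the `LinearModel` type and the sample values are hypothetical and not taken from the talk.

```scala
object FlpaFit {
  /** Coefficients of the assumed linear model: power = intercept + slope * ipc. */
  final case class LinearModel(intercept: Double, slope: Double) {
    def predict(ipc: Double): Double = intercept + slope * ipc
  }

  /** Ordinary least-squares fit over (ipc, powerMilliwatts) samples. */
  def fit(samples: Seq[(Double, Double)]): LinearModel = {
    require(samples.size >= 2, "need at least two samples")
    val n = samples.size.toDouble
    val meanX = samples.map(_._1).sum / n
    val meanY = samples.map(_._2).sum / n
    val sxx = samples.map { case (x, _) => (x - meanX) * (x - meanX) }.sum
    val sxy = samples.map { case (x, y) => (x - meanX) * (y - meanY) }.sum
    require(sxx > 0.0, "IPC values must not all be equal")
    val slope = sxy / sxx
    LinearModel(meanY - slope * meanX, slope)
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical samples, not measurements from the talk.
    val samples = Seq((0.5, 20.0), (1.0, 30.0), (1.5, 40.0))
    val model = fit(samples)
    println(f"P(ipc) = ${model.intercept}%.2f + ${model.slope}%.2f * ipc (mW)")
  }
}
```

A real FLPA model would normally depend on several parameters (for example frequency or cache miss rate), in which case the same idea extends to a multi-variable regression.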
Second Step: System Level Power Analysis
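As a rough illustration of this step, the sketch below combines a power model with activity observed during system-level simulation to obtain an energy estimate. The `Activity` fields and the `PowerModel` function type are assumptions made for the example; the actual counters exposed by the simulator are not listed on the slide.

```scala
object SystemLevelEnergy {
  /** Activity observed for one component over a simulated interval (hypothetical fields). */
  final case class Activity(cycles: Long, instructions: Long, frequencyHz: Double) {
    def ipc: Double = instructions.toDouble / cycles
    def seconds: Double = cycles / frequencyHz
  }

  /** A power model maps observed activity to average power in watts. */
  type PowerModel = Activity => Double

  /** Energy in joules: average power times elapsed time, summed over intervals. */
  def energyJoules(model: PowerModel, intervals: Seq[Activity]): Double =
    intervals.map(a => model(a) * a.seconds).sum
}
```

With a model such as `a => 0.010 + 0.020 * a.ipc` (made-up coefficients), `energyJoules` returns the total over all simulated intervals; dividing by the total time would give the average power.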
Result Interface
Results and Comparison (Power estimation)
Results and comparison (Energy)
Third step: automatic optimization
DVFS (dynamic voltage and frequency scaling)
  Runtime: inter-task DVFS
  Programmer-annotation-based DVFS
Workload balancing based on tasks
  Runtime
  Programmer-based request
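One way to picture the inter-task DVFS option listed above is the sketch below: between tasks, pick the slowest operating point that still meets the next task's deadline. It relies on the standard approximation that dynamic power scales as C * V^2 * f and assumes the cycle count does not change with frequency. The names and operating points are hypothetical; the slide does not describe the actual policy.

```scala
object InterTaskDvfs {
  /** One voltage/frequency operating point. */
  final case class OperatingPoint(frequencyMHz: Int, voltage: Double)

  /** Dynamic power in watts under the C * V^2 * f approximation. */
  def dynamicPowerWatts(capacitanceFarads: Double, p: OperatingPoint): Double =
    capacitanceFarads * p.voltage * p.voltage * p.frequencyMHz * 1e6

  /** Time to execute `cycles` at the given point, assuming a fixed cycle count. */
  def runtimeSeconds(cycles: Long, p: OperatingPoint): Double =
    cycles / (p.frequencyMHz * 1e6)

  /** Slowest point that still finishes within the deadline, if any. */
  def choose(points: Seq[OperatingPoint], cycles: Long, deadlineSeconds: Double): Option[OperatingPoint] =
    points
      .filter(p => runtimeSeconds(cycles, p) <= deadlineSeconds)
      .sortBy(_.frequencyMHz)
      .headOption
}
```

Choosing the slowest feasible point is a simple heuristic: under the approximation above, energy per task is proportional to V^2 times the cycle count, so a lower voltage saves energy as long as the deadline is met.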
Task scheduling
Optimization based on workload balancing
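The balancing algorithm itself is not given on the slide. As a stand-in, the sketch below uses a common greedy heuristic (largest task first, always onto the least-loaded core) to show what task-based workload balancing can look like; task costs are hypothetical abstract units such as estimated cycles.

```scala
object LoadBalance {
  /** Assigns each task cost to the currently least-loaded core, largest tasks first. */
  def assign(taskCosts: Seq[Long], cores: Int): Vector[Vector[Long]] = {
    require(cores > 0, "need at least one core")
    val empty = Vector.fill(cores)(Vector.empty[Long])
    taskCosts.sorted(Ordering[Long].reverse).foldLeft(empty) { (bins, cost) =>
      // Index of the core with the smallest accumulated cost so far.
      val target = bins.indices.minBy(i => bins(i).sum)
      bins.updated(target, bins(target) :+ cost)
    }
  }
}
```

For example, `LoadBalance.assign(Seq(7L, 5L, 4L, 3L, 1L), 2)` splits the five tasks so that both cores carry a total cost of 10.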
Optimization (Inter task DVFS)
Conclusion
The results obtained with our tool show that its power and energy estimates are accurate.
The approach is adaptable to any kind of complex processor system.
As an added advantage, it eases rapid prototyping of components and porting
of applications.
It makes power estimation and application design easier and more
time-efficient.
The ParaDIME Stack
[Figure: the ParaDIME stack diagram, repeated from the earlier slide of the same title]
Hardware Architecture
Energy-Efficient Message Passing
  Message passing microarchitecture
  Message passing accelerator
  Task passing
Operation Below Safe Vdd
  Automatic HW lowering of Vdd
  SW-guided (low-power annotation)
  Errors?
Heterogeneous Computing
  Architectural level
  Device level
Heterogeneous system-level environment
[Figure: virtual platform. Labels recovered from the diagram: Task 1, Task 2, Task 3; data; ARM Cortex-A9 quad-core ISS; ARM Cortex-A8 quad-core ISS; DSP C64x ISS; FPGA hardware accelerator; GPU accelerator; bus; memory; PETS tool activity counter interface.]
Heterogeneous computing results and comparison
[Bar chart: energy (J, axis 0 to 1600) for K-means on ARM Cortex-A9 (quad core), ARM Cortex-A8 (quad core), FPGA (ZynQ), DSP C64x and GPU Tegra3]
Programming model
Message passing programming model
Actor model (Akka+Scala)
Annotations to provide information to the hardware
Operation below Safe Vdd
Approximate Computing
@Storage(Array("precise=false", "VF_relax=true"))
var x = 5
@Calculation(Array("VF_relax=true", "VF_det=DMR", "VF_corr=TM"))
def calc(first: Array[Double])
Rewrite/Expand annotated code with Scala Macro Annotations
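The slide shows only the use sites of the annotations. To make the snippet compile, they would have to be declared somewhere; the sketch below is one possible way, using plain `StaticAnnotation` classes. The declarations, the `parseOptions` helper and the body of `calc` are assumptions for illustration. The sketch does not implement the macro-annotation rewriting mentioned on the slide.

```scala
import scala.annotation.StaticAnnotation

object ParadimeAnnotations {
  // Hypothetical declarations: the slide does not show how Storage and
  // Calculation are defined.
  class Storage(options: Array[String]) extends StaticAnnotation
  class Calculation(options: Array[String]) extends StaticAnnotation

  /** Splits "key=value" option strings into a map; malformed entries are dropped. */
  def parseOptions(options: Array[String]): Map[String, String] =
    options.toSeq
      .map(_.split("=", 2))
      .collect { case Array(key, value) => key.trim -> value.trim }
      .toMap

  object Example {
    @Storage(Array("precise=false", "VF_relax=true"))
    var x = 5

    // The method body is a placeholder; the slide gives only the signature.
    @Calculation(Array("VF_relax=true", "VF_det=DMR", "VF_corr=TM"))
    def calc(first: Array[Double]): Double = first.sum
  }
}
```

A helper like `parseOptions` is the kind of step a macro or compiler plugin could use to read the hints (for example `VF_det=DMR`) before rewriting the annotated code.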
Oscar Palomar (BSC)
Computation of power and measurement of voltage for OMAP
Power measurement environment