exploring power and throughput for dataflow applications ... · exploring power and throughput for...

31
Exploring Power and Throughput for Dataflow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*, George Ungureanu, Johnny ¨ Oberg, Ingo Sander Department of Electronics School of Electrical Engineering and Computer Science KTH Royal Institute of Technology Stockholm, Sweden *[email protected] Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 1 / 26

Upload: others

Post on 20-Jul-2020

10 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Exploring Power and Throughput for DataflowApplications on Predictable NoC Multiprocessors

Kathrin Rosvall, Tage Mohammadat*, George Ungureanu, Johnny Oberg, Ingo Sander

Department of ElectronicsSchool of Electrical Engineering and Computer Science

KTH Royal Institute of TechnologyStockholm, Sweden*[email protected]

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 1 / 26

Page 2: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Introduction

Outline

1 Introduction

2 Design Space Exploration: Paper Excerpt

3 Closing remarks

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 2 / 26

Page 3: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Introduction

Design ChallengeTiming Guarantees in the Many-Core Era

Technology advances lead to

increasingly parallel,powerful and complexarchitecturesincreasingly advanced anddemanding applications

Difficult to verify timingrequirements

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

Network-on-Chip (NoC)

We have already problems with complexity today! How do we designtomorrow’s ”sea-of-cores” systems with timing guarantees?

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 3 / 26

Page 4: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Introduction

Design ChallengeTiming Guarantees in the Many-Core Era

Technology advances lead to

increasingly parallel,powerful and complexarchitecturesincreasingly advanced anddemanding applications

Difficult to verify timingrequirements

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

Network-on-Chip (NoC)

We have already problems with complexity today! How do we designtomorrow’s ”sea-of-cores” systems with timing guarantees?

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 3 / 26

Page 5: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Introduction

Design ChallengeMixed-criticality challenge for Transport Industry

Hard requirements on timing.Typical use-cases require low power consumption and mixed-critical.

Simple use-cases have hundreds to millions points in design space.

Dependencies between software and hardware exacerbate complexity.

Consequence

Architectural designs are experience-based and not methodological.

Large safety-margins at the cost of efficiency.

Engineering costs, design and verification time, are exploding.

Vision

Automated design methods that are correct-by-construction!

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 4 / 26

Page 6: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Introduction

Design ChallengeMixed-criticality challenge for Transport Industry

Hard requirements on timing.Typical use-cases require low power consumption and mixed-critical.

Simple use-cases have hundreds to millions points in design space.

Dependencies between software and hardware exacerbate complexity.

Consequence

Architectural designs are experience-based and not methodological.

Large safety-margins at the cost of efficiency.

Engineering costs, design and verification time, are exploding.

Vision

Automated design methods that are correct-by-construction!

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 4 / 26

Page 7: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Introduction

Design ChallengeMixed-criticality challenge for Transport Industry

Hard requirements on timing.Typical use-cases require low power consumption and mixed-critical.

Simple use-cases have hundreds to millions points in design space.

Dependencies between software and hardware exacerbate complexity.

Consequence

Architectural designs are experience-based and not methodological.

Large safety-margins at the cost of efficiency.

Engineering costs, design and verification time, are exploding.

Vision

Automated design methods that are correct-by-construction!

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 4 / 26

Page 8: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Introduction

ForSyDe (Formal System Design) VisionThe approach

ForSyDe envisions a correct-by-construction design flow by combining

System modelling based on models of computation

Computing platforms providing tight computational metrics, e.g.:

computation and communication timecommunication time. . .

Leveraging

Modelling librariesSimulation methodsTransformation and refinement rulesDesign Space ExplorationDeterministic computing architecturesSound compilation and synthesis methods. . .

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 5 / 26

Page 9: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Introduction

Envisioned Design FlowThe big picture

A1 A2 A3

B1 B2

B3

Analyzable Application Models

• formal base (MoCs)

• executable

Appl.A

GlobalConstr.

Appl.B

Design Constraints

• real-time

• power and energy

• . . .S

R

S

R

S

R

S

R

Platform Architecture

• multiprocessor

• predictable performance

Design Space Exploration(based on formal model and

predictable architecture)

S

R

S

R

S

R

S

R

B1 B2

B3 A1 A2A3

Mapping

Synthesisand

Compilation

Implemen-tation

Implementation

• Customized hardware

• Efficient software

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 6 / 26

Page 10: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Introduction

Envisioned Design FlowScope and Context

Devices

Logic Gates

IPs

System-on-Chip*Components: Processors-

Memories- I/Os

System of Systems**Hierarchical SoCs

Satisfy criticality and throughputMinimize: Power and Cost

Behavioural Domain Structural Domain

Physical Domain

Systems

Algorithms

Register transfers

Logic

Transfer functions

System-of-Systems

Processor system

ALUs, RAM, etc.

Gates, flip-flops, etc.

Transistors

Physical partitions

Floorplans

Module layout

Cell layout

Transistor layout

ComputingSystemDesign

Plat-forms

ASIC

FPGA

APU

DSP

GPU

Design

Evalua-tion

Off-line

On-line

Explora-tion

Exhau-stive

Incom-plete

SpaceIS

Archit-ecture

Alloc-ation

Mapp-ing

Rou-ting

Sched-uling

Config-uration

Objec-tive

Optim-isation

Satis-faction

Decisions

Compile-time

Run-time

Applica-tion

Dataflow

Sta-ticDyna-

mic

Tasks

Peri-odic

Aper-iodic

Specific-ations

funct-ional

Extrafunct-ional

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 7 / 26

Page 11: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Design Space Exploration: Paper Excerpt

Outline

1 Introduction

2 Design Space Exploration: Paper Excerpt

3 Closing remarks

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 8 / 26

Page 12: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Design Space Exploration: Paper Excerpt

Contributions

Support for low-power design of mixed-critical applications onNoC-based MPSoC by:

spatio-temporal partitioning of computing and networking resources,whilesatisfying real-time constraints (throughput), andminimising system’s power consumption.

Exploiting:

applications’ formal modelsplatform’s heterogeneity, modes and options.temporally-disjoint networks property.

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 9 / 26

Page 13: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Design Space Exploration: Paper Excerpt

The Framework

Application Graph G1Application Graph Gz

a1,1

a1,2

a1,3p1,1 q1,1

p1,2

q1,2p1,3

q1,3 az,1

az,2

az,4

az,3pn,1

qn,1 pn,2

qn,2

pn,3

qn,3 pn,4qn,4

· · ·

Design Space Exploration

+AppConstraints D1

+AppConstraints Dz

· · ·

+System

Constraints

Ds

Tar

get-

spec

ific

Pro

per

ties

(WC

ET

,W

CC

T,

mem

ory

foot

prin

t,p

ower

con

sum

pti

on,

run

-tim

ese

rvic

es,

mo

de

chan

gepr

ofilin

g)

Implem

entation

Library

ConfigurationMappingSchedulesPerformance Data O

utpu

ts

S0,0

R0,0

S1,0

R1,0

Sr,0

Rr,0

S0,1

R0,1

S1,1

R1,1

Sr,1

Rr,1

S0,2

R0,2

S1,2

R1,2

Sr,2

Rr,2

S0,c

R0,c

S1,c

R1,c

Sr,c

Rr,c

Switching Stages

Stage 0

Stage 1

Stage 2

Stage k

...

Mo

du

lok

time

axis

Flittk

Flittk−1

Flittk−2

Flitt0

...

Network InterfaceInjection Table

Cycle 0

Cycle 1

Cycle 2

Cycle n

...

TD

Maxis

X

X...

µPkind,modeMemory µPkind,modeMemory. . .

TDM· · · 1 · · · r 1 · · · r 1 · · ·Time Division Multiple-Access Bus

Core 1 Core m

Memory

Peripherals

Injection Table

Cycle 0

Cycle 1

Cycle 2

Cycle n

...

TD

Maxis

X

X...

...

...

......

...

. . . . . . . . .

. . .

. . .

ArchitectureModel

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 10 / 26

Page 14: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Design Space Exploration: Paper Excerpt

The FrameworkApplication Model

synchronous dataflow graphs

actor characterisation library for eachprocessing element type and mode

WCETMemory footprint

A B2 3

A0

A1

A2

B0

B1

22

11

11

22

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 11 / 26

Page 15: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Design Space Exploration: Paper Excerpt

The FrameworkArchitectural Model

Mesh topology, withtemporarily-disjoint networks

bufferless switchingfixed routing (X-,Y-,Z-)guaranteed services viatime slot assignment

Low power and smallfootprint

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

S

R

Network Interface

Injection Table

cycle 0

cycle 1

cycle 2

cycle n

...

X

X...

CPUmode/HW Acc

MemoryInjection Table

cycle 0

cycle 1

cycle 2

cycle n

...

X

X...

...

...

......

...

. . . . . . . . .

. . .

. . .

Nostrum Network-on-Chip

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 12 / 26

Page 16: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Design Space Exploration: Paper Excerpt

The FrameworkPerformance Models

temporal analysis for applications based on MoC theories.linear power models:

static: as a function of platform components’ kind and mode.dynamic: as a function of resource utilisation/traffic.

system power = (∑p∈P

dynPowerp+statPowerp)

+dynPowerNoC+statPowerNoC

dynPowerp = (∑

a∈A|proca=p

wceta(p, proc modep))

· dynPow(proc modep) ÷ perioda

dynPowerNoC = dynPowerNIs+dynPowerswitches

+dynPowerlinks

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 13 / 26

Page 17: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Design Space Exploration: Paper Excerpt

The FrameworkDesign Constraints

Support for mixed-critical applications through:

spatio-temporal partitioning of computing and networking resources.time constraints (throughput): satisfylow power execution: minimize

Currently supported performance metrics:

throughput / iteration periodenergy consumptionothers: area and monetary cost, memory consumption

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 14 / 26

Page 18: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Design Space Exploration: Paper Excerpt

The FrameworkDesign Constraints

Support for mixed-critical applications through:

spatio-temporal partitioning of computing and networking resources.time constraints (throughput): satisfylow power execution: minimize

Currently supported performance metrics:

throughput / iteration periodenergy consumptionothers: area and monetary cost, memory consumption

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 14 / 26

Page 19: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Design Space Exploration: Paper Excerpt

The MethodConstraint Programming (CP) Programme

Captures the problem as a whole

no decomposition into sub-problems:(potentially) optimal, complete and trade-off-awareSimultaneous mapping, scheduling, configuration and performanceanalysis

Declarative nature: flexible, modular and extendable modelSupports different design goals with the same CP model

Separation of concerns: modelling vs solving

Complete or heuristic search using the same CP model

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 15 / 26

Page 20: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Design Space Exploration: Paper Excerpt

The MethodCP model + Dataflow Analysis

A CP model reflects a mapping- and scheduling-aware graph(MSAG):

captures applications and platform.

analyses implications of mapping and scheduling (e.g. power).

Px

Py srca

blocka

senda

reca dsta

srcb

blockb

sendb

recb dstb

next

next

sendNext

sendNextInterconnect

NI

buffer ssrca ,dsta

buffer rsrca ,dsta

buffer ssrcb ,dstb

buffer rsrcb ,dstb

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 16 / 26

Page 21: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Design Space Exploration: Paper Excerpt

The MethodCP model + Dataflow Analysis

A CP model reflects a mapping- and scheduling-aware graph(MSAG):

captures applications and platform.analyses implications of mapping and scheduling (e.g. power).

Px

Py srca

blocka

senda

reca dsta

srcb

blockb

sendb

recb dstb

next

next

sendNext

sendNextInterconnect

NI

buffer ssrca ,dsta

buffer rsrca ,dsta

buffer ssrcb ,dstb

buffer rsrcb ,dstb

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 16 / 26

Page 22: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Design Space Exploration: Paper Excerpt

ExperimentsExperiments set

# Description

1 Physical experiment: TDN-NoC FPGA platform2-5 DSE for larger problem with the fixed mapping with different

optimization goals:2 power consumption, one TDN slot/processor3 power consumption, multiple TDN slots/processor4 throughput for cyclic graph, one TDN slot/processor5 throughput for cyclic graph, multiple TDN slot/processor

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 17 / 26

Page 23: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Design Space Exploration: Paper Excerpt

ExperimentsApplications set

getPx

gx gy

abs

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

getIm

usan

dir

thin

putIm

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

frontE

rasta

pows

auds

comp

filter backE

1

1

1

11

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

11

1

1

1

1

1

1

1

1

1

1

1

read

CC

DCT1

Huff1

1

1

1

1

DCT2

Huff2

1

1

1

1

DCT3

Huff3

1

1

1

1

DCT4

Huff4

1

1

1

1

DCT5

Huff5

1

1

1

1

DCT6

Huff6

1

1

1

1

CS

write

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

11111

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

11111

1

1

1

1

1

1

1

1

1

1

1

1

a0

a1

a2

a9

a8

a4

a3

a5

a6

a7

1

1

1

1 1

1

1

11

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

11

11

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

JPEG encoderRASTA-PLPSUSANSobel Synthetic Cyclic Graph

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 18 / 26

Page 24: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Design Space Exploration: Paper Excerpt

Experiment 1: Physical experimentValidation on a Xilinx Zynq Quadprocessor: Description

1 Goals: Minimize power consumption + satisfy throughput.2 App/platform: Sobel & Susan on Nostrum NoC.

⇐= Network-on-Chip

⇐= Microblaze tile

⇐= ARM tile

Tile 1: MicroblazeSystem

Tile 2: MicroblazeSystem

Tile 3: MicroblazeSystem

NI NI NI

Switch S10 Switch S11

Switch S01 Switch S00

NI

Tile 0: Dual Cor-tex A9

Dynamically Reconfigurable Clock Generator

clk0

clk1 clk2 clk3

DFS

PS-PL BarrierD

FS

DF

S

DF

S

DF

S

Off-chip DVFS

Processing System (PS)

Programmable Logic (PL)

Network Interfaces (NI)

Links

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 19 / 26

Page 25: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Design Space Exploration: Paper Excerpt

Experiment 1: Physical experimentValidation on a Xilinx Zynq Quadprocessor: Solution

Allocation

Configuration

Mapping

Scheduling

Exact area cost

Caches disabled

10% time margin

Low-Power

⇐= Network-on-Chip

(Disabled Tile) ⇐= Microblaze tiles

⇐= A9 MPCore tile

F=50MHz F=50MHz F=0MHz

NI {X,−,−,−} NI {−,−,−,−} NI {−,−,−,−}

Switch S2 Switch S3

Switch S1 Switch S0

NI {−,−,−,−}

Dynamically Reconfigurable Clock Generator

clk0,F=50MHz

clk1 clk2 clk3

DynamicFrequency

Scaling

(DFS)PS-PL Barrier

DF

S

DF

S

DF

S

Off-chip Dynamic Fre-quency Scaling (DVFS)

Processing System (PS), Mode=High Performance#partitions=2

Programmable Logic (PL)

32-bit Links

getIm usan

dirthin

putIm

1

1

1

1

1111

1

1

1

1

1111

getPx

gx gy

abs

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 20 / 26

Page 26: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Design Space Exploration: Paper Excerpt

Experiment 3: Larger designPower Minimisation and Slot Assignment for Fixed Mapping

S0,0

S0,1

S0,2

S0,3

S0,4

S1,0

S1,1

S1,2

S1,3

S1,4

S2,0

S2,1

S2,2

S2,3

S2,4

S3,0

S3,1

S3,2

S3,3

S3,4

S4,0

S4,1

S4,2

S4,3

S4,4

S5,0

S5,1

S5,2

S5,3

S5,4

getPx

gx

gy

abs

getIm

usan

dir

thin

putIm

frontE

rasta pows auds comp

filter

backE

read

CC

DCT1

Huff1

DCT2

Huff2

DCT3

Huff3

DCT4

Huff4

DCT5

Huff5

DCT6

Huff6

CS

write

a0

a1

a2

a9

a8

a4

a3

a5

a6

a7

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 21 / 26

Page 27: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Design Space Exploration: Paper Excerpt

Experiment 3: Larger designPower Minimisation and Slot Assignment for Fixed Mapping: Solution

0 0.5 1 1.5 2 2.5 3 3.5 4

·105

0.9

1

1.1

1.2

1.3

1.4

·104

Exploration time (ms)

Pow

er(m

W)

Best FoundExp. 3, Run 1Exp. 3, Run 2Exp. 3, Run 3Exp. 3, Run 4Exp. 3, Run 5

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 22 / 26

Page 28: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Closing remarks

Outline

1 Introduction

2 Design Space Exploration: Paper Excerpt

3 Closing remarks

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 23 / 26

Page 29: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Closing remarks

Summary

Low-power design spaceexploration of

multiple dataflow applicationswithstatic mixed-criticality supporton a shared MPSoC platform

support for predictablenetwork-on-chip

Application Graph G1Application Graph Gz

a1,1

a1,2

a1,3p1,1 q1,1

p1,2

q1,2p1,3

q1,3 az,1

az,2

az,4

az,3pn,1

qn,1 pn,2

qn,2

pn,3

qn,3 pn,4qn,4

· · ·

Design Space Exploration

+AppConstraints D1

+AppConstraints Dz

· · ·

+System

Constraints

Ds

Tar

get-

spec

ific

Pro

per

ties

(WC

ET

,W

CC

T,

mem

ory

foot

prin

t,p

ower

con

sum

pti

on,

run

-tim

ese

rvic

es,

mo

de

chan

gepr

ofilin

g)

Implem

entation

Library

ConfigurationMappingSchedulesPerformance Data O

utpu

ts

S0,0

R0,0

S1,0

R1,0

Sr,0

Rr,0

S0,1

R0,1

S1,1

R1,1

Sr,1

Rr,1

S0,2

R0,2

S1,2

R1,2

Sr,2

Rr,2

S0,c

R0,c

S1,c

R1,c

Sr,c

Rr,c

Switching Stages

Stage 0

Stage 1

Stage 2

Stage k

...

Mo

du

lok

time

axis

Flittk

Flittk−1

Flittk−2

Flitt0

...

Network InterfaceInjection Table

Cycle 0

Cycle 1

Cycle 2

Cycle n

...

TD

Maxis

X

X...

µPkind,modeMemory µPkind,modeMemory. . .

TDM· · · 1 · · · r 1 · · · r 1 · · ·Time Division Multiple-Access Bus

Core 1 Core m

Memory

Peripherals

Injection Table

Cycle 0

Cycle 1

Cycle 2

Cycle n

...

TD

Maxis

X

X...

...

...

......

...

. . . . . . . . .

. . .

. . .

ArchitectureModel

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 24 / 26

Page 30: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Closing remarks

Open Problems

Exploring models-of-computation that change in time.

Scenario-aware data-flow graphs.Reconfigurable computing.

Exploring computer architectures

parallelisms within computing elementsnetwork topologiesmemory hierarchy

Targetting resiliency

Graceful degradationSecurity trade-offs

Designing for adaptive and distributed computing.

Distributed exploration: limits 10x10 (Vivado), 6x8 (GeCode).

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 25 / 26

Page 31: Exploring Power and Throughput for Dataflow Applications ... · Exploring Power and Throughput for Data ow Applications on Predictable NoC Multiprocessors Kathrin Rosvall, Tage Mohammadat*,

Closing remarks

This work is part of ForSyDe/DeSyDe

Kathrin Rosvall, Tage Mohammadat, George Ungureanu, Johnny Oberg, and Ingo Sander.Exploring power and throughput for dataflow applications on predictable nocmultiprocessors.In Euromicro Conference on Digital System Design (DSD 2018), Prague, Czech Republic,August 2018.

Kathrin Rosvall and Ingo Sander.Flexible and trade-off-aware constraint-based design space exploration for streamingapplications on heterogeneous platforms.ACM Transactions on Design Automation of Electronic Systems (TODAES), 23(2), 2017.

Ingo Sander, Axel Jantsch, and Seyed-Hosein Attarzadeh-Niaki.ForSyDe: System design using a functional language and models of computation.In Handbook of Hardware/Software Codesign. Springer, Dordrecht, 2017.

George Ungureanu and Ingo Sander.A layered formal framework for modeling of cyber-physical systems.In 2017 Design, Automation & Test in Europe Conference & Exhibition (DATE), pages1715–1720. IEEE, 2017.

More resources: https://github.com/forsyde/desyde

Tage Mohammadat et al. Architectural Design of Efficient MPSoCs 26 / 26