Chapter 7 Function-Architecture Codesign Paradigm

Post on 04-Feb-2016


1

Chapter 7 Function-Architecture Codesign Paradigm

2

Function Architecture Co-design Methodology

System Level design methodology

Top-down (synthesis)

Bottom-up (constraint-driven)

3

Co-design Process Methodology

[Figure labels: Function, Architecture, Mapping, HW, SW, Trade-off, Synthesis, Verification, Refinement, Abstraction]

4

System Level Design Vision

Function casts a shadow

Abstraction

Refinement

Architecture sheds light

Constrained Optimization and Co-design

5

Main Concepts

Decomposition

Abstraction and successive refinement

Target architectural exploration and estimation

6

Decomposition

Top-down flow

Find an optimal match between the application function and the architectural application constraints (size, power, performance).

Use a separation-of-concerns approach to decompose a function into architectural units.

7

Abstraction & Successive Refinement

Function/Architecture formal trade-off is applied for mapping function onto architecture

Co-design and trade-off evaluation from the highest level down to the lower levels

Successive refinement to add details to the earlier abstraction level

8

Target Architectural Exploration and Estimation

Synthesized target architecture is analyzed and estimated

Architecture constraints are derived

An adequate model of target architecture is built

9

Architectural Exploration in POLIS

10

Main Steps in Co-design and Synthesis

Function architecture co-design and trade-off

– Fully synthesize the architecture?

– Co-simulation in trade-off evaluation

• Functional debugging

• Constraint satisfaction and missed deadlines

• Processor utilization and task scheduling charts

• Cost of implementation

Mapping function on the architecture

– Architecture organization can be a pre-designed collection of components with various degrees of flexibility

– Matching the optimal function to the best architecture

11

Function/ Architecture Co-design vs. HW/SW Co-design

Design problem over-simplified

Must use Fun./Arch. Optimization & Co-design to match the optimal Function to the best Architecture

1. Fun./Arch. Co-design and Trade-off

2. Mapping Function Onto Architecture

12

Reactive System Co-synthesis(1)

Control-dominated Design

[Figure: Design → Decompose → EFSM Representation → Map → CDFG Representation → Map → HW/SW]

EFSM: Extended Finite State Machines

CDFG: Control Data Flow directed acyclic Graph

13

Reactive System Co-synthesis(2)

CDFG is suitable for describing EFSM reactive behavior, but

– Some of the control flow is hidden

– Data cannot be propagated

[Figure: mapping an EFSM onto a CDFG]

EFSM: S0: a := 5; S1: a := a + 1; S2: emit a

CDFG: BEGIN; Case (state): S0: a := 5; state := S1. S1: a := a + 1; state := S2. S2: emit(a); END

14

Data Flow Optimization

[Figure: EFSM representation (S0: a := 5; S1: a := a + 1; S2: emit a) next to the optimized EFSM representation (a := 6; emit a)]
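The step from a := 5 followed by a := a + 1 to a := 6 can be illustrated with a small constant propagation and folding pass. This is only a sketch under stated assumptions: the tuple encoding of assignments, the function name `propagate_constants`, and the straight-line state order S0 → S1 → S2 are hypothetical and are not taken from the slides or from POLIS.

```python
# Hypothetical encoding (not from the slides): each state holds a list of
# (dest, op, left, right) assignments, and the states run in the
# straight-line order S0 -> S1 -> S2.
def propagate_constants(states):
    """Propagate known constants forward and fold constant additions."""
    known = {}       # variable name -> constant value known at this point
    optimized = {}
    for name, ops in states:
        new_ops = []
        for dest, op, left, right in ops:
            # Replace operands whose constant value is already known.
            left = known.get(left, left)
            right = known.get(right, right)
            if op == "const":
                known[dest] = left
            elif op == "add" and isinstance(left, int) and isinstance(right, int):
                op, left, right = "const", left + right, None
                known[dest] = left
            else:
                known.pop(dest, None)  # dest no longer holds a known constant
            new_ops.append((dest, op, left, right))
        optimized[name] = new_ops
    return optimized


efsm = [
    ("S0", [("a", "const", 5, None)]),
    ("S1", [("a", "add", "a", 1)]),
    ("S2", []),  # emits a
]
print(propagate_constants(efsm)["S1"])  # [('a', 'const', 6, None)]
```

After this pass the assignment a := 5 in S0 is no longer read by anything, so a separate dead operation elimination step could remove it.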

15

Optimization and Co-design Approach

Architecture-independent phase

– Task function is considered solely and control data flow analysis is performed

– Removing redundant information and computations

Architecture-dependent phase

– Rely on architectural information to perform additional guided optimizations tuned to the target platform

16

Concrete Co-design Flow

[Figure labels: Graphical EFSM, Esterel, Reactive VHDL, Specification, Decomposition, Constraints, Function, Architecture, FFG, AFFG, AUX Modeling, Functional Optimization, Behavioral Optimization, Macro-level Optimization, Micro-level Optimization, Cost-guided Optimization, Resource Pool, Estimation and Validation, SHIFT (CFSM Network), HW/SW, RTOS/Interface Co-synthesis, SW Partition, HW Partition, Processor, RTOS, BUS, Interface, HW1, HW2, HW3, HW4, HW5]

17

Design Representation

Function/Architecture Co-Design

18

Abstract Co-design Flow

[Figure labels: Design, Application, Decomposition, Function/Architecture Optimization and Co-design, control, data, i/o, fsm, data f., I, O, ASICs, processors, IDR, Mapping, Hardware/Software Co-synthesis, SW Partition, HW Partition, Processor, RTOS, BUS, Interface, HW1, HW2, HW3, HW4, HW5]

19

Unifying Intermediate Design Representation for Co-design

[Figure labels: Design, Functional Decomposition, Intermediate Design Representation (IDR), Architecture Independent, Architecture Dependent, Constraints, SW, HW]

20

Platform-Based Design

[Figure labels: Application Space, Application Instances, Platform Specification, System Platform, Platform Design Space Exploration, Platform Instance, Architectural Space]

Source: ASV

21

Models and System

Models of computation

– Petri-net model (graphical language for system design)

– FSM (Finite-State Machine) models

– Hierarchical Concurrent FSM models

POLIS system

– CFSM (Co-design FSM)

– EFSM (Extended FSM): support for data handling and asynchronous communication

22

CFSM

Includes

– Finite state machine

– Data computation

– Locally synchronous behavior

– Globally asynchronous behavior

Semantics: GALS (Globally Asynchronous and Locally Synchronous communication model)

23

CFSM Network MOC

[Figure: network of CFSM1, CFSM2, and CFSM3 exchanging the events A, B, C, F, and G; transition labels shown include C=>G, C=>F, B=>C, F^(G==1), (A==0)=>B, C=>A, and C=>B]

MOC: Model of Computation

Communication between CFSMs by means of events

24

System Specification Language

“Esterel”

– as “front-end” for functional specification

– Synchronous programming language for specifying reactive real-time systems

Reactive VHDL

Graphical EFSM

25

Intermediate Design Representation (IDR)

Most current optimization and synthesis are performed at the low abstraction level of a DAG (Directed Acyclic Graph).

Function Flow Graph (FFG) is an IDR having the notion of I/O semantics.

Textual interchange format of FFG is called C-Like Intermediate Format (CLIF).

FFG is generated from an EFSM description and can be in a Tree Form or a DAG Form.

26

(Architecture) Function Flow Graph

[Figure labels: Design, Functional Decomposition, FFG (I/O Semantics), Refinement, Restriction, AFFG (EFSM Semantics), Architecture Independent, Architecture Dependent, Constraints, SW, HW]

27

FFG/CLIF

Develop Function Flow Graph (FFG) / C-Like Intermediate Format (CLIF)

• Able to capture EFSM

• Suitable for control and data flow analysis

[Figure: EFSM → FFG → Optimized FFG → CDFG, with Data Flow/Control Optimizations applied to the FFG]

28

Function Flow Graph (FFG)

– FFG is a triple $G = (V, E, N_0)$ where

• $V$ is a finite set of nodes

• $E = \{(x, y)\}$, a subset of $V \times V$; $(x, y)$ is an edge from $x$ to $y$ where $x \in \mathrm{Pred}(y)$, the set of predecessor nodes of $y$.

• $N_0 \in V$ is the start node corresponding to the EFSM initial state.

• An unordered set of operations is associated with each node $N$.

• Operations consist of TESTs performed on the EFSM inputs and internal variables, and ASSIGNs of computations on the input alphabet (inputs/internal variables) to the EFSM output alphabet (outputs and internal (state) variables)
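The triple definition above maps directly onto a small data structure. The following is a minimal sketch, not the POLIS implementation: the class names `FFG` and `Operation`, the field names, and the use of strings for operation text are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Operation:
    kind: str   # "TEST" or "ASSIGN"
    text: str   # e.g. "cond2 == 1" or "a = b + c"


@dataclass
class FFG:
    nodes: set = field(default_factory=set)   # V
    edges: set = field(default_factory=set)   # E, a subset of V x V
    start: str = ""                           # N0, the EFSM initial state
    ops: dict = field(default_factory=dict)   # node -> unordered set of Operation

    def add_edge(self, x, y):
        """Add the edge (x, y), registering both endpoints as nodes."""
        self.nodes.update((x, y))
        self.edges.add((x, y))

    def pred(self, y):
        """Pred(y): every x such that (x, y) is an edge."""
        return {x for (x, n) in self.edges if n == y}


# Control skeleton with three states: S1 -> S2 -> S3, and S3 loops to itself.
g = FFG(start="S1")
g.add_edge("S1", "S2")
g.add_edge("S2", "S3")
g.add_edge("S3", "S3")
g.ops["S3"] = {Operation("ASSIGN", "outp = T13")}
print(sorted(g.pred("S3")))  # ['S2', 'S3']
```

Storing the operations of a node in a set rather than a list reflects the "unordered" wording of the definition.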

29

C-Like Intermediate Format (CLIF)

Import/Export Function Flow Graph (FFG)

“Un-ordered” list of TEST and ASSIGN operations

– [if (condition)] goto label

– dest = op(src)

• op = {not, minus, …}

– dest = src1 op src2

• op = {+, *, /, ||, &&, |, &, …}

– dest = func(arg1, arg2, …)

30

Preserving I/O Semantics

input inp;

output outp;

int a = 0;

int CONST_0 = 0;

int T11 = 0;

int T13 = 0;

S1: goto S2;

S2: a = inp; T13 = a + 1 CONST_0; T11 = a + a; outp = T11; goto S3;

S3: outp = T13; goto S3;

31

FFG / CLIF Example

Function Flow Graph

S1: x = x + y; x = x + y; a = b + c; a = x; cond1 = (y == cst1); cond2 = !cond1;

(cond2 == 0) / output(a)

(cond2 == 1) / output(b)

y = 1

Legend: constant, output flow, dead operation; S# = State, S#L# = Label in State S#

CLIF Textual Representation

S1: x = x + y;

x = x + y;

a = b + c;

a = x;

cond1 = (y == cst1);

cond2 = !cond1;

if (cond2) goto S1L0;

output = a;

goto S1; /* Loop */

S1L0:

output = b;

goto S1;
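The legend mentions dead operations. In the S1 example, a = b + c is overwritten by a = x before its value is ever read. A rough sketch of how such an operation could be detected follows, assuming the operations execute in the textual order shown; the (dest, operands) tuple encoding and the name `dead_assignments` are hypothetical and not part of CLIF.

```python
def dead_assignments(block):
    """Return indices of assignments overwritten before their value is read.

    block is a list of (dest, operands) pairs in execution order.
    """
    dead = []
    for i, (dest, _) in enumerate(block):
        for later_dest, later_operands in block[i + 1:]:
            if dest in later_operands:
                break              # the value is read, so it is not dead
            if later_dest == dest:
                dead.append(i)     # overwritten before any read: dead
                break
    return dead


s1 = [
    ("x", ("x", "y")),         # x = x + y
    ("x", ("x", "y")),         # x = x + y
    ("a", ("b", "c")),         # a = b + c
    ("a", ("x",)),             # a = x
    ("cond1", ("y", "cst1")),  # cond1 = (y == cst1)
    ("cond2", ("cond1",)),     # cond2 = !cond1
]
print(dead_assignments(s1))  # [2]
```

The check is local to one block. Values that are still defined at the end of the block are conservatively treated as live.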

32

Tree-Form FFG

33

Function/Architecture Optimizations

Function/Architecture Co-Design

34

Function Optimization

Architecture-Independent optimization objective:

– Eliminate redundant information in the FFG.

– Represent the information in an optimized FFG that has a minimal number of nodes and associated operations.

35

FFG Optimization Algorithm

FFG Optimization algorithm(G)

begin

  while changes to FFG do

    Variable Definition and Uses

    FFG Build

    Reachability Analysis

    Normalization

    Available Elimination

    False Branch Pruning

    Copy Propagation

    Dead Operation Elimination

  end while

end
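The "while changes to FFG do" loop is a fixpoint iteration: the passes are re-run until a full round leaves the graph unchanged. A minimal sketch of such a driver is shown below. The pass interface (a function returning True when it modified the design) and the toy design encoding are assumptions for illustration. Only a simplified dead operation elimination stands in for the real passes.

```python
def optimize(design, passes):
    """Apply every pass repeatedly until a full round changes nothing."""
    rounds = 0
    changed = True
    while changed:
        changed = False
        for run_pass in passes:
            # Each pass reports whether it modified the design.
            if run_pass(design):
                changed = True
        rounds += 1
    return rounds


def dead_operation_elimination(design):
    """Toy pass: drop assignments whose result is neither used nor an output."""
    used = set(design["outputs"])
    for _, operands in design["ops"]:
        used.update(operands)
    kept = [(dest, operands) for dest, operands in design["ops"] if dest in used]
    changed = len(kept) != len(design["ops"])
    design["ops"] = kept
    return changed


design = {
    "outputs": {"out"},
    "ops": [("t1", ("b",)), ("t2", ("t1",)), ("out", ("a",))],
}
rounds = optimize(design, [dead_operation_elimination])
print(design["ops"], rounds)  # [('out', ('a',))] 3
```

Removing t2 in the first round is what makes t1 dead in the second, which is why a single sweep is not enough and the loop runs until nothing changes.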

36

Optimization Approach

Develop optimizer for FFG (CLIF) intermediate design representation

Goal: Optimize for speed, and size by reducing

– ASSIGN operations

– TEST operations

– variables

Reach goal by solving a sequence of data flow problems for analysis and information gathering using an underlying Data Flow Analysis (DFA) framework

Optimize by information redundancy elimination

37

Sample DFA Problem: Available Expressions Example

Goal is to eliminate re-computations

– Formulate Available Expressions Problem

– Forward Flow (meet) Problem

AE = Available Expression

[Figure: flow graph with S1: t := a + 1; S2: t1 := a + 1, t2 := b + 2; S3: a := a * 5, t3 = a + 2. AE sets annotated on the graph: ∅, {a+1}, {a+1, b+2}, {a+2}]
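The available-expressions problem for this example can be solved with a short iterative analysis. The sketch below uses the usual GEN/KILL formulation. The graph topology (S1 → S2 and S1 → S3) is an assumption, since the edges did not survive in the transcript, and the function and parameter names are hypothetical.

```python
def available_expressions(nodes, preds, gen, kill, universe):
    """Iterate OUT[n] = GEN[n] | (IN[n] - KILL[n]) until nothing changes.

    IN[n] is the intersection of OUT over the predecessors of n,
    and the empty set for a node without predecessors.
    """
    out = {n: set(universe) for n in nodes}  # optimistic initial value
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if preds[n]:
                in_n = set.intersection(*(out[p] for p in preds[n]))
            else:
                in_n = set()
            new_out = gen[n] | (in_n - kill[n])
            if new_out != out[n]:
                out[n] = new_out
                changed = True
    return out


universe = {"a+1", "b+2", "a+2"}
nodes = ["S1", "S2", "S3"]
preds = {"S1": [], "S2": ["S1"], "S3": ["S1"]}  # assumed topology
gen = {"S1": {"a+1"}, "S2": {"a+1", "b+2"}, "S3": {"a+2"}}
# S3 assigns a, which kills every expression that reads a.
kill = {"S1": set(), "S2": set(), "S3": {"a+1", "a+2"}}

out = available_expressions(nodes, preds, gen, kill, universe)
print(sorted(out["S2"]))  # ['a+1', 'b+2']
print(out["S3"])          # {'a+2'}
```

Because a+1 is available on entry to S2, the recomputation t1 := a + 1 there is redundant and t can be reused instead.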

38

Data Flow Problem Instance

A particular (problem) instance of a monotone data flow analysis framework is a pair $I = (G, M)$ where $M: N \rightarrow F$ is a function that maps each node $N$ in $V$ of FFG $G$ to a function in $F$ on the node label semilattice $L$ of the framework $D$.

39

Data Flow Analysis Framework

A monotone data flow analysis framework $D = (L, \wedge, F)$ is used to manipulate the data flow information by interpreting the node labels on $N$ in $V$ of the FFG $G$ as elements of an algebraic structure where

– $L$ is a bounded semilattice with meet $\wedge$, and

– $F$ is a monotone function space associated with $L$.

40

Data Flow Equations

Solving Data Flow Problems
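The equations on this slide did not survive extraction. For orientation only, the standard textbook form of forward data flow equations is given below, specialized to available expressions. This is an assumption about what the slide showed, not a reconstruction of it. $\mathrm{GEN}$ and $\mathrm{KILL}$ are the usual per-node sets, and $f_n$ is the transfer function assigned to node $n$.

```latex
\[
\mathrm{IN}(n) = \bigwedge_{p \,\in\, \mathrm{Pred}(n)} \mathrm{OUT}(p),
\qquad
\mathrm{OUT}(n) = f_n\bigl(\mathrm{IN}(n)\bigr)
\]
% Available expressions: the meet is set intersection and
\[
f_n(X) = \mathrm{GEN}(n) \,\cup\, \bigl(X \setminus \mathrm{KILL}(n)\bigr)
\]
```

With intersection as the meet, an expression is available at a node only if it is available along every incoming path.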


41

Solving Data Flow Problems

Solve data flow problems using the iterative method

– General: does not depend on the flow graph

– Optimal for a class of data flow problems

– Reaches fixpoint in polynomial time ($O(n^2)$)

42

FFG Optimization Algorithm

Solve following problems in order to improve design:

– Reaching Definitions and Uses

– Normalization

– Available Expression Computation

– Copy Propagation, and Constant Folding

– Reachability Analysis

– False Branch Pruning

Code Improvement techniques

– Dead Operation Elimination

– Computation sharing through normalization


43

Function/Architecture Co-design

44

Function Architecture Optimizations

Fun./Arch. Representation:

– Attributed Function Flow Graph (AFFG) is used to represent architectural constraints impressed upon the functional behavior of an EFSM task.

45

Architecture Dependent Optimizations

[Figure labels: EFSM, FFG, OFFG, AFFG, CDFG, Architecture Independent, Architectural Information, lib, Sum]

46

EFSM in AFFG (State Tree) Form

[Figure: state tree with states S0, S1, S2 and nodes F0 to F8]

47

Architecture Dependent Optimization Objective

Optimize the AFFG task representation for speed of execution and size, given a set of architectural constraints

Size: area of hardware, code size of software

48

Motivating Example

[Figure: flow graph with nodes numbered 1 to 10 inside a reactivity loop; the operations shown on its nodes are y = a + b, a = c, x = a + b, and z = a + b, several of them repeated]

Eliminate the redundant, needless run-time re-evaluation of the a + b operation

49

Cost-guided Relaxed Operation Motion (ROM)

For performing safe operation motion from heavily executed portions of a design task to less visited segments

Relaxed-Operation-Motion (ROM):

begin

  Data Flow and Control Optimization

  Reverse Sweep (dead operation addition, normalization and available operation elimination, dead operation elimination)

  Forward Sweep (optional, minimize the lifetime)

  Final Optimization Pass

end

50

Cost-Guided Operation Motion

[Figure labels: Design, Cost Estimation, Optimization, User Input, Profiling, Inference Engine, Attributed FFG, Relaxed Operation Motion, FFG (back-end)]

51

Function Architecture Co-design in the Micro-Architecture

[Figure labels: System Specs, System Constraints, Decomposition, control, data, i/o, fsm, data f., I, O, ASICs, processors, FFG, AFFG, Operator Strength Reduction, Instruction Selection]

Example operations shown: t1 = 3*b; t2 = t1 + a; emit x(t2)

52

Operator Strength Reduction

Before:

t1 = 3 * b;

t2 = t1 + a;

x = t2;

After:

expr1 = b + b;

t1 = expr1 + b;

t2 = t1 + a;

x = t2;

Reducing the multiplication operator
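The rewrite of t1 = 3*b into two additions can be produced mechanically. The following is a sketch under assumptions: the function name, the `max_adds` threshold, and the text-based output are hypothetical. A real cost-guided optimizer would decide from target-specific cost estimates whether additions are actually cheaper than a multiply.

```python
def reduce_multiplication(dest, const, var, max_adds=3):
    """Rewrite dest = const * var as a chain of additions when const is small.

    Returns the list of CLIF-like statements that replace the multiplication.
    """
    if not (2 <= const <= max_adds + 1):
        return [f"{dest} = {const} * {var};"]   # not worth reducing: keep as is
    ops = []
    acc = var
    for i in range(1, const):
        # The last addition writes the original destination.
        target = dest if i == const - 1 else f"expr{i}"
        ops.append(f"{target} = {acc} + {var};")
        acc = target
    return ops


for line in reduce_multiplication("t1", 3, "b"):
    print(line)
# expr1 = b + b;
# t1 = expr1 + b;
```

A multiplication by a constant k costs k - 1 additions in this scheme, so the threshold keeps larger constants as multiplications.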

53

Architectural Optimization

Abstract Target Platform

– Macro-architectures of the HW or SW system design tasks

CFSM (Co-design FSM): FSM with reactive behavior

– A reactive block

– A set of combinational data-flow functions

Software Hardware Intermediate Format (SHIFT)

– SHIFT = CFSMs + Functions

54

Macro-Architectural Organization

[Figure labels: SW Partition, Processor, RTOS, BUS, Interface, HW Partition, HW1, HW2, HW3, HW4, HW5]

55

Architectural Organization of a Single CFSM Task

CFSM

56

Task Level Control and Data Flow Organization

[Figure labels: Reactive Controller, EQ, INC, RESET, MUX; signals a, b, c, y, s, a_EQ_b, INC_a, RESET_a; constants 0 and 1]

57

CFSM Network Architecture

Software Hardware Intermediate FormaT (SHIFT) for describing a network of CFSMs

It is a hierarchical netlist of

– Co-design finite state machine

– Functions: state-less arithmetic, Boolean, or user-defined operations

58

SHIFT: CFSMs + Functions

59

Architectural Modeling

Using an AUXiliary specification (AUX)

AUX can describe the following information

– Signal and variable type-related information

– Definition of the value of constants

– Creation of hierarchical netlist, instantiating and interconnecting the CFSMs described in SHIFT

60

Mapping AFFG onto SHIFT

Synthesis through mapping AFFG onto SHIFT and AUX (Auxiliary Specification)

Decompose each AFFG task behavior into a single reactive control part and a set of data-path functions.

Mapping AFFG onto SHIFT Algorithm (G, AUX)

begin

  foreach state s belonging to G do

    build_trel (s.trel, s, s.start_node, G, AUX);

  end foreach

end

61

Architecture Dependent Optimizations

Additional architecture Information leads to an increased level of macro- (or micro-) architectural optimization

Examples of macro-arch. Optimization

– Multiplexing computation Inputs

– Function sharing

Example of micro-arch. Optimization

– Data Type Optimization

62

Distributing the Reactive Controller

[Figure labels: Reactive Controller, MUX, ITE; signals a, b, c, d, e, s, out, Tout]

Move some of the control into the data path as an ITE assign expression

ITE: if-then-else
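The idea of moving control into the data path can be shown by contrasting a branch with an ITE expression. This is an illustrative sketch only: the function names are hypothetical, and treating s as the selector with d and e as the two data values is an assumption based on the figure labels.

```python
def controller_selected(s, d, e):
    """Control kept in the reactive controller: a TEST picks one of two ASSIGNs."""
    if s:
        out = d
    else:
        out = e
    return out


def ite(cond, then_value, else_value):
    """Data-path ITE operator, behaving like a 2-to-1 multiplexer."""
    return then_value if cond else else_value


def datapath_selected(s, d, e):
    """Control moved into the data path: one ASSIGN of an ITE expression."""
    out = ite(s, d, e)
    return out


# Both forms compute the same value for every selector setting.
for s in (False, True):
    print(controller_selected(s, 10, 20) == datapath_selected(s, 10, 20))  # True
```

The second form leaves the controller with one transition instead of two, at the cost of a multiplexer in the data path.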

63

Multiplexing Inputs

[Figure labels: adder (+), Control {1, 2}, multiplexer inputs 1 and 2, operands a, b, c, results T(b+a), T(b+c), and T(b+-c-); operations shown: c = a and T = b + c]

64

Micro-Architectural Optimization

Available Expressions cannot eliminate T2

But if variables are registered (additional architectural information) we can share T1 and T2

S1: T1 = a + b; x = T1; a = c;

S2: T2 = a + b; Out = T(a+b); emit(Out)

[Figure labels: adders (+), signals a, b, x, Out, shared result T(a+b)]

65

Hardware/Software Co-Synthesis and Estimation

Function/Architecture Co-Design

66

Co-Synthesis Flow

[Figure labels: EFSM, FFG, AFFG, CDFG, SHIFT, FFG Interpreter (Simulation), Software Compilation, Object Code (.o), Hardware Synthesis, Netlist; the two output branches are joined by "Or"]

67

POLIS Co-design Environment

[Figure labels: Graphical EFSM, ESTEREL, ..., Compilers, CFSMs, Partitioning, SW Synthesis, SW Estimation, SW Code + RTOS, HW Synthesis, HW Estimation, Logic Netlist, Performance/trade-off Evaluation, Physical Prototyping, Programmable Board, µP of choice, FPGAs, FPICs]

68

POLIS Co-design Environment

Specification: FSM-based languages (Esterel, ...)

Internal representation: CFSM network

Validation:

– High-level co-simulation

– FSM-based formal verification

– Rapid prototyping

Partitioning: based on co-simulation estimates

Scheduling

Synthesis:

– S-graph (based on a CDFG) based code synthesis for software

– Logic synthesis for hardware

Main emphasis on unbiased verifiable specification

69

Hardware/Software Co-Synthesis

Functional GALS CFSM model for hardware and software

initially unbounded delays refined after architecture mapping

Automatic synthesis of:

• Hardware

• Software

• Interfaces

• RTOS

70

RTOS Synthesis and Evaluation in Polis

[Figure labels: Resource Pool, CFSM Network, RTOS Synthesis, HW/SW Synthesis, Physical Prototyping]

1. Provide communication mechanisms among CFSMs implemented in SW, and between the processor the OS is running on and the HW partitions.

2. Schedule the execution of the SW tasks.

71

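Estimation on the Synthesis CDFG

[Figure: CDFG from BEGIN to END with nodes detect(c), a < b, a := a + 1, a := 0, and emit(y), and T/F branch labels; numeric cost annotations as extracted (possibly fused): 40, 266341, 14, 189]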

72

Architecture Evaluation Problem

[Figure labels: Behavior, Architecture, HDL, System Behavior, System Architecture, Refine, High Cost, Out of Spec, Time and Money]

73

Proper Architectural Evaluation

[Figure labels: Behavior, Architecture, Implementation, System Behavior, System Architecture, Refine, In Spec, Low Cost, Time and Money]

74

Estimation-Based Co-simulation

[Figure labels: Network of EFSMs, HW/SW Partitioning, SW Estimation, HW Estimation, HW/SW Co-Simulation, Performance/trade-off Evaluation]

75

Co-simulation Approach (1)

Fills the “validation gap” between fast and slow models

– Performs performance simulation based on software and hardware timing estimates

Outputs behavioral VHDL code

– Generated from CDFG describing EFSM reactive function

– Annotated with clock cycles required on target processors

Can incorporate VHDL models of pre-existing components

76

Co-simulation Approach (2)

Models of mixed hardware, software, RTOS and interfaces

Mimics the RTOS I/O monitoring and scheduling

– Hardware CFSMs are concurrent

– Only one software CFSM can be active at a time

Future Work

– Architectural view instead of component view

77

Research Directions in F-A Codesign

Functional decomposition, cross- “block” optimization ~ hardware/software partitioning techniques

Task and system level algorithm manipulations ~ performing user-guided algorithmic manipulations
