Post on 20-Jan-2016

Synchron 2008, 2 December 2008

Jean-Vivien MILLO, EPI AOSTE

Periodic scheduling in process network

Application to latency insensitive design

SoC design

• Designers make IPs.

• Builders assemble IPs into a SoC.

• Issues:
  – IPs come from different builders, yet they have to interact.
  – Latencies occur in long interconnection wires.

[Figure: IPs A and B connected under the synchronous hypothesis, then the same connection with a latency on the wire]

Goal

• Find a method to formally design the interconnection of a SoC:
  – including latencies
  – keeping a maximal rate
  – minimizing the size of interconnection resources

• Due to embedded constraints

The method

• Synchronous model

• Asynchronous model
  – Marked/Event Graph

• Resynchronized model

[Figure: a SoC containing IP1, IP2 and IP3, drawn once for each of the three models]

Plan

• Process network: marked graph

• Dynamic scheduling: Latency Insensitive Design

• Static scheduling: equalization

• Balanced static scheduling: balanced binary word

Process network

Marked graph

[Figure: marked graph connecting IP1, IP2 and IP3, annotated with a computation latency of 3, a communication latency of 3, and a precondition]

Basic notions and results

• The number of tokens per cycle is invariant.

• Rate of a cycle: $\text{rate} = \frac{\#\,\text{tokens}}{\#\,\text{latencies}}$

• Graph rate = slowest cycle rate
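The two rate definitions above can be checked with a few lines of Python. This is only a sketch: the function names are hypothetical, and the cycle figures (5 tokens over 8 latencies, 3 tokens over 3 latencies) are borrowed from the equalization example of these slides.

```python
from fractions import Fraction


def cycle_rate(tokens, latencies):
    """Rate of one cycle: number of tokens divided by number of latencies."""
    return Fraction(tokens, latencies)


def graph_rate(cycles):
    """Rate of the graph: the rate of its slowest cycle."""
    return min(cycle_rate(tokens, latencies) for tokens, latencies in cycles)


# Hypothetical description of a graph by its cycles: (tokens, latencies).
cycles = {"C1": (5, 8), "C2": (3, 3)}
rate = graph_rate(cycles.values())
critical = [name for name, c in cycles.items() if cycle_rate(*c) == rate]
```

With these numbers the graph rate is 5/8 and C1 is the critical cycle.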

Scheduling

• Firable node: all its input channels contain a token.

• An execution step: parallel execution of a subset of the firable nodes.

• ASAP execution: {firing nodes} = {firable nodes}

• State of the system = place marking.
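The firing rule and the ASAP execution can be sketched as a small simulator. This is a minimal illustration, assuming that every place has latency 1 and unbounded capacity; `asap_step`, `asap_schedules` and the three-node ring are hypothetical names and data, not taken from the slides.

```python
def asap_step(edges, marking):
    """Fire every firable node at once and return (fired nodes, new marking).

    edges   -- list of (source, target) pairs, one per place
    marking -- number of tokens in each place, in the same order as edges
    """
    nodes = {n for edge in edges for n in edge}
    # A node is firable when each of its input places holds a token.
    firable = {
        n for n in nodes
        if all(marking[i] > 0 for i, (_, dst) in enumerate(edges) if dst == n)
    }
    new_marking = list(marking)
    for i, (src, dst) in enumerate(edges):
        if dst in firable:
            new_marking[i] -= 1  # the consumer removes one token
        if src in firable:
            new_marking[i] += 1  # the producer adds one token
    return firable, new_marking


def asap_schedules(edges, marking, steps):
    """Run `steps` ASAP steps and record a binary word per node."""
    nodes = sorted({n for edge in edges for n in edge})
    words = {n: "" for n in nodes}
    for _ in range(steps):
        firable, marking = asap_step(edges, marking)
        for n in nodes:
            words[n] += "1" if n in firable else "0"
    return words


# Hypothetical closed system: a ring of three IPs holding one token.
ring = [("IP1", "IP2"), ("IP2", "IP3"), ("IP3", "IP1")]
schedules = asap_schedules(ring, [0, 0, 1], 6)
```

In this ring each node fires once every three steps, which matches a cycle rate of 1/3 (one token, three places).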

Scheduling

• The ASAP execution of a closed system is periodic.

• During an execution, the schedule of a node can be represented as a binary word.

• 1 means activity, 0 means inactivity.
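Periodic schedules are written in these slides as an initial part followed by a repeated part, for example 11(00111011)*. The sketch below parses that notation, unrolls it, and computes the rate of the periodic part. The helper names are hypothetical.

```python
import re
from fractions import Fraction


def parse_schedule(text):
    """Split 'prefix(period)*' into its initial part and its periodic part."""
    match = re.fullmatch(r"([01]*)\(([01]+)\)\*?", text)
    if match is None:
        raise ValueError(f"not a periodic schedule: {text!r}")
    return match.group(1), match.group(2)


def unroll(text, length):
    """First `length` letters of the infinite word described by `text`."""
    prefix, period = parse_schedule(text)
    word = prefix
    while len(word) < length:
        word += period
    return word[:length]


def schedule_rate(text):
    """Fraction of active instants (letter 1) in the periodic part."""
    _, period = parse_schedule(text)
    return Fraction(period.count("1"), len(period))


first_steps = unroll("11(00111011)*", 12)
rate = schedule_rate("11(00111011)*")
```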

Schedules of nodes

[Figure: animation of the marked graph with the binary schedule of each node built up step by step]

Latency Insensitive Design

Latency insensitive design

• Solves the problems of heterogeneous IP interconnection and latencies, but:
  – the rate of the graph is not necessarily maximal
  – communication resources are oversized

• It is a marked graph with:
  – places of capacity 2 (C=2)

This implements a back-pressure protocol.
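One common way to model a bounded place in a marked graph is to pair each place with a reverse place that holds the free slots, so that a producer can fire only when room is left. The sketch below applies that modelling with capacity 2. It is an illustration of back pressure under this assumption, not necessarily the encoding used by the author, and `add_back_pressure` is a hypothetical name.

```python
def add_back_pressure(edges, marking, capacity=2):
    """Return an unbounded marked graph equivalent to bounded places.

    For every place src -> dst holding m tokens, a reverse place
    dst -> src holding (capacity - m) tokens is added. The reverse
    tokens count the free slots of the forward place.
    """
    if any(m < 0 or m > capacity for m in marking):
        raise ValueError("marking exceeds the place capacity")
    back_edges = [(dst, src) for src, dst in edges]
    back_marking = [capacity - m for m in marking]
    return list(edges) + back_edges, list(marking) + back_marking


# Hypothetical ring of three IPs with one token in the last place.
ring = [("IP1", "IP2"), ("IP2", "IP3"), ("IP3", "IP1")]
edges, marking = add_back_pressure(ring, [0, 0, 1], capacity=2)
```

Each forward place and its reverse place form a small cycle holding exactly `capacity` tokens, which is why the reverse places act as the additional control paths of the protocol.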

Outcome of LID

• The back-pressure protocol needs additional control paths.

• Places of capacity 2 allow a correct behavior but are oversized:
  – places of capacity 1 or 2 at "some places" should be enough.
  – What does "some places" mean?

Equalization process

Equalization

Equalization

• Result of the PhD thesis of Julien Boucaron

• Reduces the size of communication buffers and simplifies the back-pressure protocol:
  – statically slow down the fast cycles so that their rate is as close as possible to the rate of the critical cycle
  – statically schedule the cycles (possible because the behavior is periodic)

Equalization Process: add latencies

• C1: 5/8 and C2: 3/3, so C1 is critical.

• If we add an integer latency on C2: 3/4.

• If we add another latency, C2: 3/5 < 5/8: the rate of the graph is reduced.

We need fractional latency.
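The arithmetic of this example can be reproduced as follows. The sketch computes how many integer latencies a fast cycle can absorb without becoming slower than the critical cycle; the function name is hypothetical.

```python
from fractions import Fraction
from math import floor


def integer_latencies_to_add(tokens, latencies, critical_rate):
    """Largest n such that tokens / (latencies + n) >= critical_rate."""
    max_latencies = floor(Fraction(tokens) / critical_rate)
    return max(0, max_latencies - latencies)


critical_rate = Fraction(5, 8)  # rate of C1
n = integer_latencies_to_add(3, 3, critical_rate)  # C2 has rate 3/3
rate_after = Fraction(3, 3 + n)
rate_one_more = Fraction(3, 3 + n + 1)
```

For C2 this gives one added latency (rate 3/4); a second one would give 3/5, below 5/8. The rate 5/8 would require 3 / (5/8) = 4.8 latencies on C2, which is not an integer.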

Fractional Register

• It is a wire when hold is not set.

• It is a register when hold is set.

• A simple register followed by a FR has the same capacity as a relay station, but we can add it only where we need it.

[Figure: fractional register (FR) with inputs Val_in and hold, an internal register Reg, and output Val_out]

Equalization Process: scheduling

Symbolic simulation to find:

– FR placement (and its rate)

– Static schedule of nodes

– Static schedule of FR

[Figure: graph annotated with the schedules 11(00111011)*, 01(10011101)*, 00(11001110)*, 00(01100111)*, 00(01110110)*, 10(01110110)*, 11(00001111)* and a label 4]

Outcome of equalization

• The addition of virtual latency is used to reduce data accumulation.

• Symbolic simulation definitively fixes the static schedules of the graph. Sometimes another schedule would greatly reduce the size of the interconnection buffers.

• The idea is good, but it has to be combined with another idea to find an optimal solution: select the right schedule, a balanced schedule.

Balanced schedule: example

• A graph with 2 cycles with rates 5/7 and 2/2.

[Figure: the graph shown with two sets of labels. Labels, in order of appearance: (1111001)*, 4, (1111100)*, (0111110)*, (0011111)*, (1100111)*, (1110011)*, (0121000)*, (0010000)*, 1, 3, (0111000)*, (0111011)*, (1011101)*, (1101110)*, (0110111)*, (1101101)*, (1110110)*, (0110110)*]

These schedules are binary words said to be "balanced".

Balanced binary word

Balanced word

• For any two sub-words v and w of u with the same length, the numbers of occurrences of the letter 1 differ by at most 1:

$\big|\,|v|_1 - |w|_1\,\big| \le 1$

• Ex: u = 00100101001001
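The balance condition can be tested by brute force: for every length, count the 1s in each sub-word of that length and compare the extremes. The sketch below does this; the `circular` option, which also considers sub-words that wrap around the end of the word, is an assumption added here because schedules are repeated periodically.

```python
def is_balanced(word, circular=False):
    """True if any two equal-length sub-words differ by at most one 1."""
    bits = [int(c) for c in word]
    n = len(bits)
    # For the circular case, sub-words may wrap around the end of the word.
    extended = bits + bits if circular else bits
    for length in range(1, n + 1):
        starts = n if circular else n - length + 1
        counts = [sum(extended[i:i + length]) for i in range(starts)]
        if max(counts) - min(counts) > 1:
            return False
    return True


example = is_balanced("00100101001001")          # word from the slide
unbalanced = is_balanced("1111001")              # contains both 11 and 00
periodic = is_balanced("11011010", circular=True)
```

The word 1111001 fails because its sub-words 11 and 00 differ by two occurrences of 1.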

Rotation

• Unitary rotation $\rho$:

• For every finite binary word $u = u_1 u_2 \dots u_n$ ($|u| = n$): $\rho(u) = u_n u_1 u_2 u_3 \dots u_{n-1}$

• Ex: $\rho(1001001) = 1100100$.

• So $\rho^i = \rho \circ \rho^{i-1}$. $\rho^{-i}$: the reverse rotation.
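The unitary rotation moves the last letter of the word to the front; iterating it, or running it backwards, is a cyclic shift. A minimal sketch on Python strings, with a hypothetical function name:

```python
def rotate(word, i=1):
    """Apply the unitary rotation i times (negative i: reverse rotation)."""
    n = len(word)
    if n == 0:
        return word
    i %= n  # a negative i becomes the equivalent positive shift
    return word[-i:] + word[:-i] if i else word


once = rotate("1001001")
back = rotate(once, -1)
```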

Transposition

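• Unitary transposition relation:

• v is the transpose of u iff there exist words u1 and u2 such that u = u1 10 u2 and v = u1 01 u2. We write $v = \tau(u)$.

• $\forall u \in S_k^p,\ \exists v \in S_k^p$ such that $v = \tau(u)$.

• For all (k,p), there exists an integer α such that $\forall u \in S_k^p,\ \rho^{-\alpha}(u) = \tau(u)$, where $\rho^{-\alpha}$ is the reverse rotation applied α times.

α is the inverse of −k modulo p: $-k \cdot \alpha \equiv 1 \pmod{p}$

u = 11011010

$\tau(u)$ = 11010110

Balanced word

• Ex: (k,p) = (5,8), α = 3

u = 11011010

$\rho^{-\alpha}(u)$ = 11010110 = $\tau(u)$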
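The example with (k,p) = (5,8) can be replayed in code: compute α as the inverse of −k modulo p, apply the reverse rotation α times, and check that the result is one of the words obtained from u by swapping a single factor 10 into 01. The helper names are hypothetical, and the modular inverse via `pow` assumes Python 3.8 or later.

```python
def rotate(word, i=1):
    """Unitary rotation applied i times (negative i: reverse rotation)."""
    i %= len(word)
    return word[-i:] + word[:-i] if i else word


def transposes(word):
    """All words obtained by replacing one factor '10' of word by '01'."""
    return {
        word[:j] + "01" + word[j + 2:]
        for j in range(len(word) - 1)
        if word[j:j + 2] == "10"
    }


def alpha(k, p):
    """Inverse of -k modulo p (requires gcd(k, p) == 1, Python >= 3.8)."""
    return pow((-k) % p, -1, p)


u = "11011010"
a = alpha(5, 8)
rotated = rotate(u, -a)
```

Here α is 3, since −5 · 3 = −15 ≡ 1 (mod 8), and the reverse rotation by 3 gives 11010110, which is a transpose of u.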

Construction of a balanced static schedule of a graph

A new approach

• We will analytically compute the best schedule, analytically build the solution, and force the system to reach this behavior.

• The best schedule is a balanced schedule.

Integer latencies insertion

• C1: 5/8 and C2: 3/3, so C1 is critical.

• We add an integer latency on C2: 3/4.

And we add…

FR insertion

• C1: 5/8 and C2: 3/4

We must add a FR on a channel of C2 that is independent of C1.

Compute schedules of nodes

[Figure: graph annotated with the node schedules 01101101, 10110110, 10110110, 01011011, 10101101, 11010110, 01101011, 10110101, 11011010]

Rate of the graph = 5/8, so (k,p) = (5,8): $01101101 \in S_k^p$
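A balanced word with k ones and length p can be produced by a standard construction on integer parts, and the other candidates are its rotations. The sketch below uses that textbook construction; it is not necessarily the procedure used by the author, and the function names are hypothetical.

```python
def balanced_word(k, p):
    """Letter i is 1 when the integer part of i*k/p increases at step i."""
    return "".join(str((i + 1) * k // p - i * k // p) for i in range(p))


def rotations(word):
    """All cyclic shifts of word."""
    return {word[-i:] + word[:-i] if i else word for i in range(len(word))}


w = balanced_word(5, 8)
candidates = rotations(w)
# Node schedules listed on the slides for the 5/8 example.
slide_schedules = {
    "01101101", "10110110", "01011011", "10101101",
    "11010110", "01101011", "10110101", "11011010",
}
```

For (5,8) the construction gives 01011011, and every node schedule of the example is one of its eight rotations.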

Compute schedules of FRs

[Figure: graph with the FR schedule 01001011 and a label 4]

For a FR: when the graph is equalized and the schedules of the nodes are balanced, a single FR per channel is enough.

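Compute state of the steady phase

[Figure: graph annotated with the steady-phase schedules 01101101, 10110110, 10110110, 01011011, 10101101, 11010110, 01101011, 10110101, 11011010 and 01001011; the place markings are still unknown and shown as "?"]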

Find initialization

[Figure: graph with its markings during the initial phase and the steady phase, annotated with the complete schedules 10(01101101), 10(10110110), 10(10110110), 11(01011011), 01(10101101), 00(11010110), 00(01101011), 00(10110101), 00(11011010), 01(01001011)]

Conclusions

Constructing a balanced schedule, in addition to inserting integer latencies, results in a graph with maximal rate and minimal communication resource size.

[Figure: IP1, IP2 and IP3, each wrapped with its static schedules: 10(10110110), 00(01101011), 00(10110101), 01(01001011)]

Thanks.
