Minimum-Buffered Routing of Non-Critical Nets for Slew Rate and Reliability Control

Supported by Cadence Design Systems, Inc. and the MARCO Gigascale Silicon Research Center

C. Alpert (IBM)

A. B. Kahng, B. Liu, I. Măndoiu (UCSD)

A. Zelikovsky (GSU)

http://vlsicad.ucsd.edu

Post on 19-Dec-2015



Outline

Motivation

Previous Work

Formulation

Our contributions

Buffering a given tree

Simultaneous tree construction and buffering

Experimental results

Summary and research directions

Motivation

Timing analysis requires electrical correctness

Load caps and slew times must be within range of lookup tables, else timing analysis can’t be trusted!

Electrical correctness is guaranteed by bounding load capacitance

Bounded load capacitance is achieved by buffer insertion

[Figure: lookup tables for gate delay and output slew time, each indexed by input slew time and load cap]

Slew time control needed for all nets, including nets with tens of thousands of sinks (SE, Reset, ...)

Previous Work on Buffer Insertion

Fanout optimization during synthesis

Berman et al. 89, ...

Delay optimization

van Ginneken 90: dynamic programming

Lillis et al. 96: simultaneous buffering and wiresizing

Alpert et al. 98: simultaneous noise and delay optimization

Slew time and skew control

Tellez-Sarrafzadeh 97

Our differences

Post-layout stage, not synthesis stage

Buffering for electrical correctness, not for delay optimization (in fact, before delay optimization)

Simultaneous tree construction and buffering

Polarity consideration

Formulation - Non-Inverting Case

Given: net N with

Source r

Sinks S, each with input capacitance Cs

Per-unit length wire capacitance Cw

A single buffer type, non-inverting, with input capacitance Cb and load cap upper-bound CU

Find: buffered routing tree for N with min number of buffers while satisfying

Load cap constraint: the source and each buffer drive at most CU cap

[Figure: source r and sinks with input capacitances Cs1 and Cs2; the source and each buffer drive a stage of at most CU]

Formulation – Inverting Case

Given: net N with

Source r

Sinks S, each with input capacitance Cs

Unit length wire capacitance Cw

A single buffer type, inverting, with input capacitance Cb and load cap upper-bound CU

Find: buffered routing tree for N with min number of buffers while satisfying

Load cap constraint: the source and each buffer drive at most CU cap

Sink polarity constraints

[Figure: buffered tree with inverting buffers, each stage of at most CU; signal polarity alternates at each buffer and must match the polarity (+ or -) required at each sink]
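The two formulations can be made concrete with a small feasibility checker. This is a minimal sketch, not the authors' code: the `Node` class, its field names, the `is_feasible` function and the encoding of polarity as 0 for "+" and 1 for "-" are all assumptions introduced for illustration. It assumes sinks are leaves and that a buffer presents only its input capacitance Cb to the driver above it.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                  # "source", "buffer", "sink" or "steiner" (hypothetical labels)
    edge_len: float = 0.0      # wire length of the edge to the parent
    cap: float = 0.0           # Cs for a sink, ignored for other kinds
    polarity: int = 0          # polarity required at a sink: 0 = "+", 1 = "-"
    children: list = field(default_factory=list)


def is_feasible(root, cw, cb, cu, inverting=False):
    """True iff the source and every buffer drive at most cu and, for an
    inverting buffer type, every sink receives its required polarity."""
    ok = True

    def visit(node, parity):
        # Returns the capacitance this subtree presents to the driver above it.
        nonlocal ok
        if node.kind == "sink":
            if inverting and parity != node.polarity:
                ok = False
            return node.cap
        flips = inverting and node.kind == "buffer"
        below = parity ^ 1 if flips else parity
        load = sum(visit(ch, below) + cw * ch.edge_len for ch in node.children)
        if node.kind in ("source", "buffer"):
            if load > cu:          # load cap constraint violated
                ok = False
            return cb if node.kind == "buffer" else load
        return load                # Steiner point: pass the load upward

    visit(root, 0)
    return ok
```

With `inverting=False` only the load cap constraint is checked, which corresponds to the non-inverting case.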

Our Contributions

New problem formulations and contexts

Buffering for electrical correctness, pre-timing analysis

Simultaneous tree construction and buffering

Polarity constraints

Hardness results

Buffering RSMT is not always optimum

Optimum interconnect not always on Hanan grid

NP-hard to approximate within a ratio of 2 − ε

Six algorithms

Problem                                                        Non-inverting              Inverting
Buffering of a given tree (optimal)                            Greedy                     Dynamic programming
Simultaneous tree construction and buffering (heuristic)       Cut&Connect, Clustering    OPEN
Simultaneous tree construction and buffering (approximation)   2+ε                        4+ε

Outline

Motivation

Previous Work

Formulation

Our contributions

Buffering a given tree

Simultaneous tree construction and buffering

Experimental results

Summary and research directions

[Figure: vertex p with sinks s1 and s2 below it, where L(p,s1) Cw + Cs1 + L(p,s2) Cw + Cs2 > CU]

Non-Inverting Buffering of a Given Tree

Linear time greedy algorithm

Extension of a node-weighted tree partition algorithm by Kundu-Misra79

A different algorithm was given by Tellez-Sarrafzadeh97

Insert buffers bottom-up such that each buffer drives the largest possible load cap (at most CU)

[Figure: buffer b1 placed so that L(b1,s) Cw + Cs = CU, then buffer b2 placed so that L(b2,b1) Cw + Cb1 = CU]

A vertex p is critical if c(Tp) > CU and c(Tu) < CU for every child u of p

A child u is heaviest if c(Tu) + c(u,p) > c(Tv) + c(v,p) for every other child v of p

Find a critical vertex p by a post-order traversal of T

Find a heaviest child u of p

Insert a buffer b on edge (u,p) such that c(u,b) = min{CU-c(Tu), c(u,p)}

Recursively find an optimum buffering B’ of T\Tb

[Figure: worked example at a critical vertex p with sinks s1 and s2. Buffer b1 is inserted above s1 with L(s1,b1) Cw + Cs1 = CU; buffer b2 is inserted above s2 with L(s2,b2) Cw + Cs2 = CU; the branches then present L(b1,p) Cw + Cb and L(b2,p) Cw + Cb at p, and since (L(b1,p) + L(b2,p)) Cw + 2Cb > CU a third buffer b3 is inserted]

Can be implemented to run in linear time
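The greedy steps on the slides can be sketched as a single post-order pass. This is an illustrative sketch, not the reference implementation: the `Node` class and `greedy_buffering` name are assumptions, children are treated as safe when c(Tu) is at most CU, ties between heaviest children go to the first one, and the recursion and the `max` scan are kept simple rather than tuned for linear time on very large nets.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    sink_cap: float = 0.0      # Cs if the node is a sink, else 0
    edge_len: float = 0.0      # wire length of the edge to the parent
    children: list = field(default_factory=list)


def greedy_buffering(root, cw, cb, cu):
    """Return one (branch name, wire cap below the buffer) pair per inserted buffer."""
    if cb >= cu:
        raise ValueError("need Cb < CU")
    buffers = []

    def load(node):
        # Post-order visit; returns c(T_node) after buffering, always <= cu.
        if node.sink_cap > cu:
            raise ValueError("sink cap exceeds CU")
        # One mutable record per child branch: [c(T_u), c(u, p), label].
        branches = [[load(ch), ch.edge_len * cw, ch.name] for ch in node.children]
        total = node.sink_cap + sum(sub + edge for sub, edge, _ in branches)
        while total > cu:                          # node is critical
            heaviest = max(branches, key=lambda b: b[0] + b[1])
            sub, edge, label = heaviest
            below = min(cu - sub, edge)            # c(u, b) = min{CU - c(Tu), c(u, p)}
            if sub + below <= cb:
                raise ValueError("buffering the heaviest branch does not reduce the load")
            buffers.append((label, below))
            total -= sub + below - cb              # branch now starts at a buffer input
            heaviest[0], heaviest[1] = cb, edge - below
        return total

    load(root)
    return buffers
```

For example, a single sink with Cs = 1 at the end of a wire of length 20, with Cw = 1, Cb = 1 and CU = 8, receives two buffers, each driving exactly CU.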

Inverting Buffering of a Given Tree

Greedy buffering is not optimal

[Figure: counterexample with Cw = Cb = 0, sinks of polarity + and -, and loads near 0.5CU]

Dynamic programming algorithm

For each node u of T in post-order traversal

Insert buffers in Tp driving load cap = CU if possible

Try all the possibilities and insert 0, 1 or 2 buffers at head of each branch (9 cases for binary tree)

For each polarity, find the feasible buffering of Tu with minimum number of buffers, breaking ties by minimum residual capacitance

At root, choose solution with min number of buffers between the two possible polarities

Insert buffers in top-down order

Linear runtime for bounded-degree trees

Outline

Motivation

Previous Work

Formulation

Our contributions

Buffering a given tree

Simultaneous tree construction and buffering

Experimental results

Summary and research directions

Hardness Results

Optimum buffering of optimum Steiner tree is not always optimum

[Figure: counterexample with CU = 14, Cs = Cb = 0]

Optimum buffered tree may not be on the Hanan grid

[Figure: counterexample with CU = 8, Cs = Cb = 1]

NP-hard to approximate within a ratio of 2 − ε

Proof by reduction from RSMT problem

RSMT: Does there exist a Steiner min tree over terminals S of length k?

Given Cb = 0, Cw = 1, CU = k, does there exist a buffered routing tree over terminals S with 1 buffer?

Any (2 − ε)-approximation algorithm will solve RSMT problem in polynomial time

Impossible (unless P=NP)!

Approximation – Non-inverting case

Theorem: The problem can be approximated within a ratio of 2(1 + ε) for any ε > 1 / (CU / Cb - 2) for non-inverting buffer type

Construct an α-approximate Steiner tree T

α arbitrarily close to 1: PTAS by S. Arora, J. ACM ‘98

Transform T into a binary tree

Apply the greedy algorithm to T

Every buffer inserted by the algorithm drives a load of at least CU/2

[Equation: bound on OPT, the optimum number of buffers, in terms of c(T), CU and Cb]

Theoretically best-possible result: NP-hard to approximate within a ratio of 2 − ε

Approximation – Inverting case

Theorem: The problem can be approximated within a ratio of 4(1 + ε) for inverting buffer type

Construct an α-approximate Steiner tree T

Transform T into a binary tree

Apply the greedy algorithm on T

Replace each buffer b by 2 inverting buffers, each driving a copy of Tb (one copy for “+” sinks, one copy for “-” sinks)

Approximation ratio is not known to be tight
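The step "Transform T into a binary tree" could be carried out, for instance, by splitting high-degree nodes with zero-length edges, which leaves the total wire length unchanged. The slides do not say how the authors do it, so the following is only one plausible sketch; the `binarize` function, the dictionary layout and the dummy node naming are assumptions.

```python
def binarize(children):
    """children maps a node to a list of (child, edge_length) pairs.
    Returns an equivalent mapping in which every node has at most two
    children, using zero-length edges to hypothetical dummy nodes."""
    out = {}
    counter = 0
    for node, kids in children.items():
        kids = list(kids)
        cur = node
        while len(kids) > 2:
            dummy = ("dummy", node, counter)   # fresh node at the same location
            counter += 1
            out[cur] = [kids[0], (dummy, 0.0)]
            kids = kids[1:]
            cur = dummy
        out[cur] = kids
    return out
```

A node with four children becomes a chain of three nodes, each with two children, and the sum of edge lengths is the same before and after.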

+ - + -

Heuristic: Cut and Connect

Construct a Steiner minimum tree T

Apply the greedy algorithm to T

For each buffer b driving < CU cap

Cut a neighboring subtree and reconnect it under b if possible

Relocate b downstream if necessary

[Figure: worked example with CU = 8, Cb = Cs = 0]

Heuristic: Clustering

Construct a Steiner min tree T

While c(T) > CU

Insert a buffer b above critical node v with max c(Tv) < CU and c(Tparent(v)) > CU

Connect closest neighboring sink under b, if possible

Replace Tb by b as a sink

Re-construct Steiner min tree T

[Figure: worked example with CU = 10, Cb = Cs = 0]

Differences with Cut&Connect

Re-construct Steiner tree after each buffer insertion

Cut a sink instead of a subtree
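The selection step inside the Clustering loop, choosing the node v with the largest c(Tv) < CU whose parent subtree exceeds CU, can be sketched as follows. The function names and the dictionary-based tree layout are assumptions, and c(Tv) is taken to exclude the edge from v to its parent, as in the greedy slides; Steiner tree reconstruction and sink reconnection are not shown.

```python
def subtree_caps(children, sink_cap, cw, root):
    """children: node -> list of (child, edge_length); sink_cap: node -> Cs.
    Returns node -> c(T_node), the wire plus sink capacitance below the node."""
    caps = {}

    def visit(v):
        caps[v] = sink_cap.get(v, 0.0) + sum(
            visit(u) + cw * length for u, length in children.get(v, []))
        return caps[v]

    visit(root)
    return caps


def pick_critical(children, sink_cap, cw, cu, root):
    """Return the node v maximizing c(T_v) subject to c(T_v) < cu and
    c(T_parent(v)) > cu, or None if no such node exists."""
    caps = subtree_caps(children, sink_cap, cw, root)
    best = None
    for parent, kids in children.items():
        if caps.get(parent, 0.0) > cu:
            for child, _ in kids:
                if caps[child] < cu and (best is None or caps[child] > caps[best]):
                    best = child
    return best
```

A buffer would then be inserted above the returned node, after which the loop repeats on the rebuilt tree.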

Outline

Motivation

Previous Work

Formulation

Our contributions

Buffering a given tree

Simultaneous tree construction and buffering

Experimental results

Summary and research directions

Experimental Results

Industry design with 34K terminals, Cw = 0.177fF/um, Cb = 37.5fF

        Greedy            Cut&Connect       Clustering        Lower Bound
CU      #buf   Runtime    #buf   Runtime    #buf   Runtime    #buf
500     806    6.59       778    39.1       729    890.01     571
1000    388    6.58       374    58.6       350    424.8      283
2000    191    6.58       153    89.0       171    208.8      138
4000    95     6.57       92     147.6      84     103.6      68
8000    45     6.57       44     113.8      42     49.3       23

Runtimes in seconds on an Ultra-60

Lower Bound: (c(T) – CU) / (CU – Cb)
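One way to see where a bound of this form comes from is a simple counting argument. The derivation below is a sketch under stated assumptions: c(T) is read as the total wire plus sink capacitance of the tree, CU > Cb, and k (a symbol not used on the slides) denotes the number of buffers.

```latex
% k buffers plus the source give k + 1 drivers, each driving at most C_U.
% Together they drive all wire and sink capacitance c(T) plus k buffer inputs.
\begin{align*}
c(T) + k\,C_b &\le (k+1)\,C_U \\
k\,(C_U - C_b) &\ge c(T) - C_U \\
k &\ge \frac{c(T) - C_U}{C_U - C_b}
\end{align*}
```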

Summary and Research Directions

New formulation and context for minimum buffering

Methods apply to nets with up to tens of thousands of sinks

Savings of up to 12% in the number of inserted buffers

Reference implementations in MARCO GSRC Bookshelf: http://vlsicad.ucsd.edu/GSRC/bookshelf/Slots/Buffer

Ongoing research

Buffering with slew and buffer skew constraints (SASIMI’01)

Improved heuristics for simultaneous tree construction and buffering with inverting buffer type

Buffer libraries (not just single buffer type)

Multi-constraints, e.g., load cap and fanout upper bounds

Thank you!

Solution Quality

[Figure: number of buffers normalized by lower bound (y-axis, 0.95 to 1.2) versus CU (x-axis, 500 to 8000) for greedy, Cut&Connect and Clustering; an industry design with 22000 terminals]

Efficiency

[Figure: runtime in seconds (y-axis, 0 to 1200) versus CU (x-axis, 500 to 8000) for greedy, Cut&Connect and Clustering; an industry design with 22000 terminals]