Post on 19-Dec-2015
Minimum-Buffered Routing of Non-Critical Nets for Slew Rate and
Reliability Control
Supported by Cadence Design Systems, Inc. and the MARCO Gigascale Silicon Research Center
C. Alpert (IBM)
A. B. Kahng, B. Liu, I. Măndoiu (UCSD)
A. Zelikovsky (GSU)
http://vlsicad.ucsd.edu
Outline
Motivation
Previous Work
Formulation
Our contributions
  Buffering a given tree
  Simultaneous tree construction and buffering
Experimental results
Summary and research directions
Motivation
Timing analysis requires electrical correctness
  Load caps and slew times must be within the range of the lookup tables, else timing analysis can't be trusted!
Electrical correctness is guaranteed by bounding load capacitance
Bounded load capacitance is achieved by buffer insertion
[Figure: lookup tables for gate delay and output slew time, each indexed by input slew time and load cap]
Slew time control is needed for all nets, including nets with tens of thousands of sinks (SE, Reset, ...)
Previous Work on Buffer Insertion
Fanout optimization during synthesis
  Berman et al. 89, ...
Delay optimization
  van Ginneken 90: dynamic programming
  Lillis et al. 96: simultaneous buffering and wire sizing
  Alpert et al. 98: simultaneous noise and delay optimization
Slew time and skew control
  Tellez-Sarrafzadeh 97
Our differences
  Post-layout stage, not synthesis stage
  Buffering for electrical correctness, not for delay optimization (in fact, before delay optimization)
  Simultaneous tree construction and buffering
  Polarity consideration
Formulation - Non-Inverting Case
Given: net N with
  Source r
  Sinks S, each with input capacitance Cs
  Per-unit length wire capacitance Cw
  A single buffer type, with
    Non-inverting type
    Input capacitance Cb
    Load cap upper-bound CU
Find: buffered routing tree for N with min number of buffers while satisfying
  Load cap constraint: the source and each buffer drive at most CU cap
[Figure: example net with source r and sinks with input capacitances Cs1 and Cs2; each buffered stage is labeled CU]
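The load cap constraint can be checked mechanically once a buffered tree is given. The sketch below is only an illustration: the dictionary-based tree encoding and the names `driver_loads`, `tree`, `kind` and `sink_cap` are assumptions made here, not part of the slides. It sums wire and sink capacitance below each driver and stops at the next buffer, which contributes only its input capacitance Cb.

```python
def driver_loads(tree, kind, sink_cap, root, Cw, Cb):
    """Return {driver: load cap it drives} for the source and every buffer.

    tree:     {node: [(child, wire_length), ...]}   (hypothetical encoding)
    kind:     {node: "source" | "buffer" | "sink" | "steiner"}
    sink_cap: {sink: input capacitance Cs of that sink}
    """
    loads = {}

    def seen_from_above(node):
        # Capacitance below this node: wires plus whatever each child presents.
        below = sum(length * Cw + seen_from_above(child)
                    for child, length in tree.get(node, []))
        if kind[node] == "buffer":
            loads[node] = below      # the buffer drives everything below it
            return Cb                # but shows only its input cap upstream
        if kind[node] == "sink":
            return sink_cap[node] + below
        return below

    loads[root] = seen_from_above(root)
    return loads


# Hypothetical example: r -> b -> {s1, s2}
tree = {"r": [("b", 2)], "b": [("s1", 3), ("s2", 1)]}
kind = {"r": "source", "b": "buffer", "s1": "sink", "s2": "sink"}
loads = driver_loads(tree, kind, {"s1": 1, "s2": 2}, "r", Cw=1, Cb=1)
feasible = all(load <= 8 for load in loads.values())   # CU = 8
```

A solution is feasible for a given CU exactly when every value in `loads` is at most CU.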
Formulation – Inverting Case
Given: net N with
  Source r
  Sinks S, each with
    Input capacitance Cs
    Signal polarity
  Unit length wire capacitance Cw
  A single buffer type, with
    Inverting type
    Input capacitance Cb
    Load cap upper-bound CU
Find: buffered routing tree for N with min number of buffers while satisfying
  Load cap constraint: the source and each buffer drive at most CU cap
  Sink polarity constraints
[Figure: example buffered tree with sinks marked "+"; each stage is labeled CU]
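The sink polarity constraint amounts to a parity condition on the number of inverting buffers between the source and each sink. The sketch below assumes that "+" means the sink wants the same polarity as the source and "-" the opposite; that convention, and the names `polarity_violations`, `tree`, `kind` and `want`, are illustrative assumptions rather than definitions from the slides.

```python
def polarity_violations(tree, kind, want, root):
    """Return the sorted list of sinks whose required polarity is not met.

    tree: {node: [child, ...]}            (hypothetical encoding)
    kind: {node: "inverter"} for inverting buffers; other nodes may be omitted
    want: {sink: "+" or "-"}              required polarity at each sink
    """
    bad = []
    stack = [(root, 0)]                   # (node, inverters passed so far)
    while stack:
        node, inversions = stack.pop()
        if kind.get(node) == "inverter":
            inversions += 1
        if node in want:
            got = "+" if inversions % 2 == 0 else "-"
            if got != want[node]:
                bad.append(node)
        stack.extend((child, inversions) for child in tree.get(node, []))
    return sorted(bad)


# Hypothetical example: r -> i1 -> {s1, i2 -> s2}
tree = {"r": ["i1"], "i1": ["s1", "i2"], "i2": ["s2"]}
kind = {"i1": "inverter", "i2": "inverter"}
ok = polarity_violations(tree, kind, {"s1": "-", "s2": "+"}, "r")
broken = polarity_violations(tree, kind, {"s1": "+", "s2": "+"}, "r")
```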
Our Contributions
New problem formulations and contexts
  Buffering for electrical correctness, pre-timing analysis
  Simultaneous tree construction and buffering
  Polarity constraints
Six algorithms
  Buffering of a given tree (optimal)
    Non-inverting: Greedy
    Inverting: Dynamic programming
  Simultaneous tree construction and buffering
    Non-inverting: (2+ε)-approximation; heuristics: Cut&Connect, Clustering
    Inverting: (4+ε)-approximation; heuristics: OPEN
Hardness results
  Buffering RSMT is not always optimum
  Optimum interconnect not always on Hanan grid
  NP-hard to approximate within a ratio of 2−ε
Non-Inverting Buffering of a Given Tree
Linear time greedy algorithm
  Extension of a node-weighted tree partition algorithm by Kundu-Misra 79
  A different algorithm was given by Tellez-Sarrafzadeh 97
Insert buffers bottom-up such that each buffer drives the largest possible load cap, up to CU
  On a path above sink s: b1 is placed where L(b1,s) Cw + Cs = CU, then b2 where L(b2,b1) Cw + Cb1 = CU
Definitions
  A vertex p is critical if c(Tp) > CU and c(Tu) < CU for every child u of p
  A child u of p is heaviest if c(Tu) + c(u,p) > c(Tv) + c(v,p) for every other child v of p
Algorithm
  Find a critical vertex p by a post-order traversal of T
  Find a heaviest child u of p
  Insert a buffer b on edge (u,p) such that c(u,b) = min{CU - c(Tu), c(u,p)}
  Recursively find an optimum buffering B' of T\Tb
Example (sinks s1 and s2 with common parent p)
  Initially L(p,s1) Cw + Cs1 + L(p,s2) Cw + Cs2 > CU, so p is critical
  b1 is inserted on the edge above s1 where L(s1,b1) Cw + Cs1 = CU
  b2 is inserted on the edge above s2 where L(s2,b2) Cw + Cs2 = CU
  Now (L(b1,p) + L(b2,p)) Cw + 2Cb > CU, so p is still critical; L(b1,p) Cw + Cb is compared with L(b2,p) Cw + Cb and b3 is inserted on the heavier branch
Can be implemented to run in linear time
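The greedy rule above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the `Node` class, the `greedy_buffer` name and the tuple format of the result are invented here, edge weights are given directly as wire capacitance c(u,p), and the heaviest branch is found by a plain `max`, so this version is simpler than, but not as fast as, the linear-time implementation the slide mentions.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    cap: float = 0                                # sink input cap Cs; 0 at branch points
    children: list = field(default_factory=list)  # (child Node, wire cap c(child, self))


def greedy_buffer(root, CU, Cb):
    """Return [(lower_end, upper_end, wire_cap_below_buffer), ...], one per buffer."""
    if not 0 <= Cb < CU:
        raise ValueError("need 0 <= Cb < CU")
    buffers = []

    def residual(p):
        # Post-order: every child subtree is already feasible when p is handled.
        # A branch is [load at its lower end, wire cap up to p, lower-end label].
        branches = [[residual(u), wire, u.name] for u, wire in p.children]
        total = p.cap + sum(b[0] + b[1] for b in branches)
        while total > CU:                         # p is critical
            heavy = max(branches, key=lambda b: b[0] + b[1], default=None)
            if heavy is None or heavy[0] + heavy[1] <= Cb:
                raise ValueError(f"cannot satisfy the load bound at {p.name}")
            d = min(CU - heavy[0], heavy[1])      # c(u,b) = min{CU - c(Tu), c(u,p)}
            buffers.append((heavy[2], p.name, d))
            # The new buffer hides its subtree: only Cb and the leftover wire remain.
            heavy[:] = [Cb, heavy[1] - d, f"b{len(buffers)}"]
            total = p.cap + sum(b[0] + b[1] for b in branches)
        return total

    residual(root)
    return buffers


# Hypothetical example: r -(2)- p, p -(5)- s1, p -(6)- s2, with Cs = 1
s1, s2 = Node("s1", cap=1), Node("s2", cap=1)
p = Node("p", children=[(s1, 5), (s2, 6)])
r = Node("r", children=[(p, 2)])
result = greedy_buffer(r, CU=8, Cb=1)
```

In this small example the sketch places one buffer at the top of the edge above s2 and a second one on the edge between p and r. The error branch covers nodes where even buffers placed directly at p cannot bring the load under CU.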
Inverting Buffering of a Given Tree
Greedy buffering is not optimal
  [Figure: example with Cw = Cb = 0, sinks of polarity "+" and "-", and loads labeled 0.5CU- and 0.5CU+]
Dynamic programming algorithm
  For each node u of T in post-order traversal
    Insert buffers in Tu driving load cap = CU if possible
    Try all the possibilities and insert 0, 1 or 2 buffers at the head of each branch (9 cases for a binary tree)
    For each polarity, find the feasible buffering of Tu with the minimum number of buffers, breaking ties by minimum residual capacitance
  At the root, choose the solution with the min number of buffers between the two possible polarities
  Insert buffers in top-down order
Linear runtime for bounded-degree trees
Hardness Results
Optimum buffering of optimum Steiner tree is not always optimum
  CU = 14, Cs = Cb = 0
  [Figure: two example trees rooted at the source; labels 6, 4, 5, 12 and 6, 5, 3, 14]
Optimum buffered tree may not be on the Hanan grid
  CU = 8, Cs = Cb = 1
  [Figure: two example trees rooted at the source; labels 3, 6, 2, 1, 7 and 3, 7, 1, 1, 7]
NP-hard to approximate within a ratio of 2 − ε
  Proof by reduction from the RSMT problem
  RSMT: Does there exist a Steiner min tree over terminals S of length k?
  Given Cb = 0, Cw = 1, CU = k, does there exist a buffered routing tree over terminals S with 1 buffer?
  Any (2 − ε)-approximation algorithm would solve the RSMT problem in polynomial time
  Impossible (unless P = NP)!
Approximation – Non-inverting case
Theorem: The problem can be approximated within a ratio of 2(1 + ε) for any ε > 1 / (CU / Cb - 2) for non-inverting buffer type
Algorithm
  Construct an α-approximate Steiner tree T (α → 1: PTAS by S. Arora, J. ACM '98)
  Transform T into a binary tree
  Apply the greedy algorithm to T
Analysis
  Every buffer inserted by the algorithm drives a load of at least CU/2
  Optimum number of buffers: [Equation: bound on OPT in terms of c(T), CU and Cb]
Theoretically best-possible result
  NP-hard to approximate within a ratio of 2 − ε
Approximation – Inverting case
Theorem: The problem can be approximated within a ratio of 4(1 + ε) for inverting buffer type
Algorithm
  Construct an α-approximate Steiner tree T
  Transform T into a binary tree
  Apply the greedy algorithm on T
  Replace each buffer b by 2 inverting buffers, each driving a copy of Tb (one copy for "+" sinks, one copy for "-" sinks)
Approximation ratio is not known to be tight
[Figure: sinks labeled +, -, +, -]
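One way to see where the threshold 1 / (CU / Cb - 2) in the non-inverting theorem comes from is the back-of-the-envelope calculation below. It is a sketch written for this transcript, not the proof from the talk. It assumes CU > 2Cb, writes k for the number of buffers the greedy step inserts and T* for the tree of an optimum solution, and ignores both the additive CU term and the Steiner approximation factor.

```latex
% Each greedy buffer drives at least C_U/2, and the driven loads are disjoint
% pieces of the total capacitance c(T) + k C_b:
\[
  k \cdot \frac{C_U}{2} \;\le\; c(T) + k\,C_b
  \quad\Longrightarrow\quad
  k \;\le\; \frac{2\,c(T)}{C_U - 2C_b}.
\]
% An optimum solution has OPT + 1 drivers, each driving at most C_U:
\[
  c(T^{*}) + \mathrm{OPT}\cdot C_b \;\le\; (\mathrm{OPT}+1)\,C_U
  \quad\Longrightarrow\quad
  \mathrm{OPT} \;\ge\; \frac{c(T^{*}) - C_U}{C_U - C_b}.
\]
% Dropping the additive C_U term and taking c(T) \approx c(T^{*}):
\[
  \frac{k}{\mathrm{OPT}} \;\lesssim\; 2\cdot\frac{C_U - C_b}{C_U - 2C_b}
  \;=\; 2\left(1 + \frac{1}{C_U/C_b - 2}\right).
\]
```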
Heuristic: Cut and Connect
Construct a Steiner minimum tree T
Apply the greedy algorithm to T
For each buffer b driving < CU cap
  Cut a neighboring subtree and reconnect it under b if possible
  Relocate b downstream if necessary
[Figure: step-by-step example with CU = 8, Cb = Cs = 0]
Heuristic: Clustering
Construct a Steiner min tree T
While c(T) > CU
  Insert a buffer b above critical node v with max c(Tv) < CU and c(Tparent(v)) > CU
  Connect closest neighboring sink under b, if possible
  Replace Tb by b as a sink
  Re-construct Steiner min tree T
[Figure: step-by-step example with CU = 10, Cb = Cs = 0]
Differences with Cut&Connect
  Re-construct Steiner tree after each buffer insertion
  Cut a sink instead of a subtree
Experimental Results
        Greedy           Cut&Connect      Clustering       Lower Bound
CU      #buf  Runtime    #buf  Runtime    #buf  Runtime    #buf
500     806   6.59       778   39.1       729   890.01     571
1000    388   6.58       374   58.6       350   424.8      283
2000    191   6.58       153   89.0       171   208.8      138
4000    95    6.57       92    147.6      84    103.6      68
8000    45    6.57       44    113.8      42    49.3       23
Industry design with 34K terminals, Cw = 0.177 fF/µm, Cb = 37.5 fF
Runtimes in seconds on an Ultra-60
Lower Bound: (c(T) – CU) / (CU – Cb)
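The lower bound follows from a counting argument: a solution with k buffers has k + 1 drivers, each driving at most CU, while the total capacitance to be driven is c(T) + k·Cb, so c(T) + k·Cb ≤ (k + 1)·CU. The sketch below evaluates that expression and normalizes the buffer counts in the table by the reported lower bound. The function and variable names are assumptions made for illustration; the numbers are copied from the table.

```python
def buffer_lower_bound(c_T, CU, Cb):
    """Lower bound (c(T) - CU) / (CU - Cb) on the number of buffers, floored at 0."""
    if CU <= Cb:
        raise ValueError("need CU > Cb")
    return max(0.0, (c_T - CU) / (CU - Cb))


# CU -> (greedy, Cut&Connect, clustering, lower bound), copied from the table
rows = {
    500: (806, 778, 729, 571),
    1000: (388, 374, 350, 283),
    2000: (191, 153, 171, 138),
    4000: (95, 92, 84, 68),
    8000: (45, 44, 42, 23),
}

# Buffer counts normalized by the lower bound for each method
normalized = {
    CU: tuple(round(n / lb, 3) for n in (greedy, cc, cl))
    for CU, (greedy, cc, cl, lb) in rows.items()
}
```

For instance, `normalized[500]` gives the greedy, Cut&Connect and Clustering counts at CU = 500 as multiples of the lower bound.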
Summary and Research Directions
New formulation and context for minimum buffering
Methods apply to nets with up to tens of thousands of sinks
  Savings of up to 12% in the number of inserted buffers
Reference implementations in MARCO GSRC Bookshelf: http://vlsicad.ucsd.edu/GSRC/bookshelf/Slots/Buffer
Ongoing research
  Buffering with slew and buffer skew constraints (SASIMI'01)
  Improved heuristics for simultaneous tree construction and buffering with inverting buffer type
  Buffer libraries (not just single buffer type)
  Multi-constraints, e.g., load cap and fanout upper bounds
Solution Quality
[Figure: number of buffers normalized by lower bound (y-axis, 0.95 to 1.2) versus CU (x-axis, 500 to 8000) for greedy, Cut&Connect and Clustering; an industry design with 22000 terminals]