ee 201a (starting 2005, called ee 201b) modeling and optimization for vlsi layout instructor: lei he...
Post on 21-Dec-2015
222 views
TRANSCRIPT
EE 201A(Starting 2005, called EE 201B)
Modeling and Optimization for VLSI Layout
Instructor: Lei He
Email: [email protected]
Chapter 7 Interconnect Delay
7.1 Elmore Delay
7.2 High-order model and moment matching
7.3 Stage delay calculation
Basic Circuit Analysis Techniques
• Output response
• Basic waveforms– Step input
– Pulse input
– Impulse Input
• Use simple input waveforms to understand the impact of network design
Network structures & state
Input waveform & zero-states
Natural response vN(t)(zero-input response)
Forced response vF(t)(zero-state response)
For linear circuits: )()()( tvtvtv FN
unit step function
u(t)=0
1
0t
0t
1
pulse function of width T
0
1/T
-T/2 T/2
)
2()
2(
1)(
Ttu
Ttu
TtPT
unit impulse function
1)(
0any for s.t.
0for singular
0for 0)(
0when )( : )(
dtt
t
tt
TtPt T
dt
tduδ(t)
dxxtut
)(or
)()(
definitionBy
Basic Input Waveforms
• Definitions:– (unit) step input u(t) (unit) step response g(t)– (unit) impulse input (t) (unit) impulse response h(t)
• Relationship
• Elmore delay
ttdxxhtgdxxtu
dt
tdgth
dt
tdut
0)()()()(
)()(
)()(
Step Response vs. Impulse Response
(Input Waveform) (Output Waveform)
0 0
)()(' dttthdtttgTD
Analysis of Simple RC Circuit
)()()(
)())(()(
)()()(
tvtvdt
tdvRC
dt
tdvC
dt
tCvdti
tvtvtiR
T
T
first-order linear differentialequation with
constant coefficients
state variable
Inputwaveform
±v(t)C
R
vT(t)
i(t)
Analysis of Simple RC Circuit
0)()(
tvdt
tdvRCzero-input response:
(natural response)
step-input response:
match initial state:
output responsefor step-input:
v0
v0u(t)
v0(1-eRC/T)u(t)
RCt
N Ke(t)vRCdt
dv(t)
v(t)
11
)()()(
0 tuvtvdt
tdvRC
)()()()( 00 tuvKetvtuvtv RCt
F
)()1()( 0 tuevtv RCt
0)( 0)0( 0 tuvKv
Delays of Simple RC Circuit
• v(t) = v0(1 - e-t/RC) under step input v0u(t)
• v(t)=0.9v0 t = 2.3RCv(t)=0.5v0 t = 0.7RC
• Commonly used metricTD = RC (Elmore delay to be defined
later)
Lumped Capacitance Delay Model
• R = driver resistance• C = total interconnect capacitance + loading capacitance
• Sink Delay: td = R·C
• 50% delay under step input = 0.7RC• Valid when driver resistance >> interconnect resistance• All sinks have equal delay
driver
3N
2N
N
Rd
Cload
g0d
gintdloaddD
CLCR
CCRCRt
driver
3N
2N
N
Lumped RC Delay Model
• Minimize delay minimize wire length
Rd
Cload
Delay of Distributed RC Lines
......!4!2
1
2)cosh(
cosh
1)(
42
xx
eex
sRCssV
xx
out
Vout(t) Vout(s)Laplace
Transform
RVIN VOUT
C
VOUTVIN
R
C
0.5
1.0
VOUT
DISTRIBUTED
LUMPED
1.0RC 2.0RC time
Step response of distributed and lumped RC networks.A potential step is applied at VIN, and the resulting VOUT
is plotted. The time delays between commonly usedreference points in the output potential is also tabulated.
Delay of Distributed RC Lines (cont’d)
Output potential range Time elapsed(Distributed RC
Network)
Time elapsed(Lumped RC
Network)
0 to 90% 1.0 RC 2.3 RC
10% to 90% (rise time) 0.9 RC 2.2 RC
0 to 63% 0.5 RC 1.0 RC
0 to 50% (delay) 0.4 RC 0.7 RC
0 to 10% 0.1 RC 0.1 RC
Distributed Interconnect Models
• Distributed RC circuit model– L,T or circuits
• Distributed RCL circuit model
• Tree of transmission lines
Distributed RC Circuit Models
0Z
0Z
Distributed RLC Circuit Model (without mutual inductance)
Delays of Complex Circuits under Unit Step Input
• Circuits with monotonic response
• Easy to define delay & rise/fall time• Commonly used definitions
– Delay T50% = time to reach half-value, v(T50%) = 0.5Vdd
– Rise/fall time TR = 1/v’(T50%) where v’(t): rate of change of v(t) w.r.t. t
– Or rise time = time from 10% to 90% of final value
• Problem: lack of general analytical formula for T50% & TR!
t
1
0.5v(t)T50%
TR
t
Delays of Complex Circuits under Unit Step Input (cont’d)
• Circuits with non-monotonic response
• Much more difficult to define delay & rise/fall time
)( of change of rate : )('
(monotone) responseoutput : )(
tvtv
tv
0.5
1
T50%
v(t)
t
t
v’(t)
medianof v’(t)(T50%) v'(t)
dtttvTD
ofmean
)('0
Elmore Delay for Monotonic Responses
• Assumptions:– Unit step input – Monotone output response
• Basic idea: use of mean of v’(t) to approximate median of v’(t)
• T50%: median of v’(t), since
• Elmore delay TD = mean of v’(t)
def.)(by )( of valuefinal of half
)(')('%50
%500
tv
dttvdttvT
T
0
)(' tdttvTD
Elmore Delay for Monotonic Responses
Why Elmore Delay?
• Elmore delay is easier to compute analytically in most cases– Elmore’s insight [Elmore, J. App. Phy 1948]– Verified later on by many other researchers, e.g.
• Elmore delay for RC trees [Penfield-Rubinstein, DAC’81]• Elmore delay for RC networks with ramp input [Chan, T-
CAS’86]• .....
• For RC trees: [Krauter-Tatuianu-Willis-Pileggi, DAC’95]T50% TD
• Note: Elmore delay is not 50% value delay in general!
Elmore Delay for RC Trees
• Definition– h(t) = impulse response
– TD = mean of h(t)
=
• Interpretation– H(t) = output response (step process)– h(t) = rate of change of H(t)
– T50%= median of h(t)
– Elmore delay approximates the median of h(t) by the mean of h(t)
0
dtt h(t)
medianof v’(t)(T50%) v'(t)
dtttvTD
ofmean
)('0
h(t) = impulse response
H(t) = step response
Elmore Delay of a RC Tree[Rubinstein-Penfield-Horowitz, T-CAD’83]
ion!contradict 0)('
at capacitor thecharge currents all Since
resistors via connects & )()( Since
at 0 is to nodeany fromcurrent theThen,
at )( achieve nodeLet
0)(' s.t. Then,
0)( that Assume
0)0(
at nodeany of voltagemin. thebe )(Let
nodeevery at 0)( response impulse
))()('( nodeevery at 0)('
in tree nodeevery for in monotonic is )(
treeRC a toapplied isinput step awhen
1min
minmin
min1min1
1min
11minmin
1min01
0min
min
min
th
iii
iithth
tii
tthi
thtt
th
h
tth
ith
thtivitv
ittv
i
i
ii
i
Lemma:
Proof:
Apply impulse func. at t=0:
imin
i
current iimin
Elmore Delay in a RC Tree (cont’d)
00
000
i
))(1()1)((lim])()([lim
)(|)()('
)()(
) and between res.path (common ) cap o(current t
)in scap' all current to(on drop voltageThe)(1
)( node of cap. current to The
node delay to Elmore
& input to from
pathcommon of resistance:
at rooted subtree: ; node input to frompath :
dttvTTvdttvTTv
dttvttvdtttvT
dt
tdvCRR
dt
tdvC
PPk
SRPtvdt
tdvCi
CRT
i
kjPP
R
isiP
iiT
T
iiT
iiiD
k
kkkiki
k
kk
kki
Pkikii
i
kkkiD
kj
jk
ii
i
i
i
input
i
k
jSi
path resistance Rii
Rjk
Theorem :
Proof :
Elmore Delay in a RC Tree (cont’d)• We shall show later on that
i.e. 1-vi(T) goes to 0 at a much faster rate than 1/T when T
• Let
0))(1(lim
TTviT
)(
)](1[))(1(lim
)(
)](1[
)(
)()(
)](1[)(
0
0
0
kkkii
iiT
D
kkkii
k kkkkikki
kkkki
t
k
kkkii
t
ii
CRf
dttvTTvT
CRf
tvCRCR
tvCR
dxdx
xdvCRtf
dxxvtf
i
(1)
vi(t)
1
0 t
area t
ii dxxvtf0
)](1[)(
) & (since Then,
)(
Recall Let
2
kiiikikkpDR
iik
kkiR
kkkiD
kkkkp
RRRRTTT
RCRT
CRTCRT
ii
i
i
Some Definitions For Signal Bound Computation
Signal Bounds in RC Trees
• Theorem
ii
iRiR
iRiD
i
i
i
pp
Rip
i
ii
i
RD
Tt
TTT
p
R
i
p
D
RpT
tT
TT
p
D
RDRi
Di
TTteeT
T
tv
tT
tT
TTteeT
T
TTttTt
Ttv
t
)(
)(
1
)(
0 1
bounds Upper
1
) when trivial-(non 0 1 )(
0 0
bounds Lower
Delay Bounds in RC Trees
p
Di
ip
DipRp
Ri
D
p
Ri
ip
RRRD
ip
T
Ttv
tvT
TTTTt
Ttv
Tt
T
Ttv
tvT
TTTTt
tvTTt
i
i
i
i
ii
iii
1)( when )(1
ln
)(1
:boundsUpper
1)(hen w)(1
ln
)(1
:boundsLower
iD
Computation of Elmore Delay & Delay Bounds in RC Trees
• Let C(Tk) be total capacitance of subtree rooted at k
• Elmore delay
fashion
up-bottom ain y recursivel elinear timin computed becan formula threeall *
:boundlower
)(
:boundupper
)(
2
k ii
kkiR
kkkp
pkkkD
R
CRT
TCRT
TCRT
i
i
i
Comments on Elmore Delay Model
• Advantages– Simple closed-form expression
• Useful for interconnect optimization
– Upper bound of 50% delay [Gupta et al., DAC’95, TCAD’97]• Actual delay asymptotically approaches Elmore delay as input
signal rise time increases
– High fidelity [Boese et al., ICCD’93],[Cong-He, TODAES’96]• Good solutions under Elmore delay are good solutions under
actual (SPICE) delay
Comments on Elmore Delay Model
• Disadvantages– Low accuracy, especially poor for slope computation– Inherently cannot handle inductance effect
• Elmore delay is first moment of impulse response
• Need higher order moments
Chapter 7.2 Higher-order Delay Model
Time Moments of Impulse Response h(t)
• Definition of moments
)()( sHth L
dttthsi
dtsti
thdtethsH
i
i
i
i
ist
00
00
0
)()(!
1
)(!
1)()()(
i-th moment dttthi
m iii
0)()1(
!
1
• Note that m1 = Elmore delay when h(t) is monotone voltage response of impulse input
Pade Approximation
• H(s) can be modeled by Pade approximation of type (p/q):
– where q < p << N
1)(ˆ
1
01,
sasa
bsbsbsH
pp
qp
• Or modeled by q-th Pade approximation (q << N):
q
j j
jqqq ps
ksHsH
1,1 )(ˆ)(ˆ
• Formulate 2q constraints by matching 2q moments to compute ki’s & pi’s
General Moment Matching Technique
• Basic idea: match the moments m-(2q-r), …, m-1, m0, m1, …, mr-1
(i) initial condition matches, i.e.
)ˆ(or
)(lim)(ˆlimor ),0()0(ˆ
11
s
mm
ssHsHshhs
(ii) 22,,1,0for ˆ qkmm kk
)(ˆ of moments sH
q
q
ps
k
ps
k
ps
ksH
2
2
1
1)(ˆ
)(ˆˆˆ 1110
rrr ssmsmm
• When r = 2q-1:
Compute Residues & Poles
j
j ii
i
i
ii
i
i
p
s
p
k
ps
pk
ps
k)(
1 0
2212122
212
1
1
1222
221
1
02
2
1
1
121
)(
)(
)(
)0(
qqq
q
q
q
q
q
q
mp
k
p
k
p
k
mp
k
p
k
p
k
mp
k
p
k
p
k
mhkkk
condition initial
))(ˆlim( sHss
match first 2q-1 momentsEQ1
Basic Steps for Moment Matching
Step 1: Compute 2q moments m-1, m0, m1, …, m(2q-2) of H(s)
Step 2: Solve 2q non-linear equations of EQ1 to get
residues :,,,
poles :,,,
21
21
q
q
kkk
ppp
Step 3: Get approximate waveform
Step 4: Increase q and repeat 1-4, if necessary, for better accuracy
tpq
tptp qekekekth 2121)(ˆ
Components of Moment Matching Model
• Moment computation– Iterative DC analysis on transformed equivalent DC circuit
– Recursive computation based on tree traversal
– Incremental moment computation
• Moment matching methods– Asymptotic Waveform Evaluation (AWE) [Pillage-Rohrer, TCAD’90]
– 2-pole method [Horowitz, 1984] [Gao-Zhou, ISCAS’93]...
• Moment calculation will be provided as an OPTIONAL reading
Chapter 7 Interconnect Delay
7.1 Elmore Delay
7.2 High-order model and moment matching
7.3 Stage delay calculation
Stage Delay
A B C
Source Interconnect Load
)(),( LTLITT BCABAC
delay Transitiondelayn Propagatio:)(
ctinterconne delay with Gate:),(
ctinterconneout delay with Gate:),0(
LT
LIT
LT
BC
AB
AB
)(),0(),( LTLTLITT BCABABctInterconne
Modeling of Capacitive Load
• First-order approximation: the driver sees the total capacitance of wires and sinks
• Problem: Ignore shielding effect of resistance pessimistic approximation as driving point admittance
• Transform interconnect circuit into a -model [O’Brian-Savarino, ICCAD’89]– Problem: cannot be easily used with most device
models
• Compute effective capacitance from -model [Qian-Pullela-Pileggi, TCAD’94]
-Model[O’Brian-Savarino, ICCAD’89]
• Moment matching again!• Consider the first three moments of driving point
admittance (moments of response current caused by an applied unit impulse)
• Current in the downstream of node k
1
1)(2)2()1( )1(
)()(
)(/)()(
)()(
i Tj
iijjjj
Tjj
jTj
jk
inkk
jTj
jk
kk
k
k
smCsmsmsC
ssHCsY
sVsIsY
ssVCsI
Driving-Point Admittance Approximations
• Driving-point admittance = Sum of voltage moment-weighted subtree capacitance
• Approximation of the driving point admittance at the driver
i
i
ikk
Tj
ijj
ik
sysY
mCyk
1
)(
)1()(
)(
)(sY
General RC Tree:lumped RC elements,distributed RC lines
Driving-Point Admittance Approximations
• First order approximation: y(1) = sum of subtree capacitance
• Second order approximation: yk(2) = sum of subtree
capacitance weighted by Elmore delay
)1(yC
)1(yC
2)1()2( )/(yyR
)()1( sY
)()2( sY
Third Order Approximation: Model
21)1( CCy
R
1C2C
)()3( sY
21
)2( RCy
31
2)3( CRy
)3(
2)2(
1
)(
y
yC
1)1(
2 CyC
21
)2(
C
yR
Current Moment Computation
• Similar to the voltage moment computation• Iterative tree traversal:
– O(n) run-time, O(n) storage
• Bottom-up tree traversal:– O(n) run-time– Can achieve O(k) storage if we impose order of traversal,
k = max degree of a node– O’Brian and Savarino used bottom-up tree traversal
pwm
1pTv
C
pvm
Bottom-Up Moment Computation
• Maintain transfer function Hv~w(s) for sink w in subtree Tv, and moment-weighted capacitance of subtree:
• As we merge subtrees, compute new transfer function Hu~v(s) and weighted capacitance recursively:
u
vvv CLR ,,
v
w
up
um 1pTu
C
• New transfer function for node w
10for ,0for pjCpjm jT
jw v
pjCLCRm
CmCmC
jTv
jTv
jv
qT
j
q
qjvv
jv
jT
vv
vv
1for )(
,
21
1
0
111
pjmmmj
q
qw
qjv
jw 0for
0
pwm
pvm
1pTv
C
• New moment-weighted capacitance of Tu:
10for )(
pjCCuchildv
jT
jT vu
)()()( ~~~ sHsHsH wvvuwu
Current Moment Computation Rule #1
Cyy DU )1()1()(sYU
)(sYD )2()2(DU yy
)3()3(DU yy
C
Current Moment Computation Rule #2
)1()1(DU yy
)(sYU
)(sYD
2)1()2()2( )( DDU yRyy 3)1(2)2()1()3()3( )())((2 DDDDU yRyyRyy
R
Current Moment Computation Rule #3
Cyy DU )1()1(
)(sYU
)(sYD
2)1(2)1()2()2(
3
1)()( CyCyRyy DDDU
3)1(22)1(3)1(2
)2()2()1()3()3(
15
2)(
3
2)(
3
4)(
)())((2
CyCyCyR
yCyyRyy
DDD
DDDDU
R
C
Current Moment Computation Rule #4(Merging of Sub-trees)
B
iDiU
B
iDiU
B
iDiU
yy
yy
yy
1
)3()3(
1
)2()2(
1
)1()1(
)(sYU
)(1 sYD
)(sYDB
B Branches
Example: Uniform Distributed RC Segment
Purely capacitive
Wide metal (distributive)
Narrow metal (distributive)
Narrow metal (lumped RC)
Wide metal (lumped RC)
Wide metal ()
Narrow metal ()
Cload/Cmax
TAB/TAB(0)
Why Effective Capacitance Model?
• The -model is incompatible with existing empirical device models– Mapping of 4D empirical data is not practical from a storage
or run-time point of view
• Convert from a -model to an effective capacitance model for compatibility
• Equate the average current in the -load and the Ceff load
R
1C2C effCin in
Equating Average Currents
R
1C2C effCin in
I CI
)()(
)()()(
)(1
)(1
00
sVsCsI
sVsYsI
dttIt
dttIt
outeffC
out
t
CD
t
D
DD
tD = time taken to reach 50% point, not 50% point of input to 50% point of output
Waveform Approximation for Vout(t)
Dxx
xiout
tttttba
ttctVtV
)(
0)(
2
2xi ctVa
xctb 2
• Quadratic from initial voltage (Vi = VDD for falling waveform) to 20% point, linear to the 50% point
• Voltage waveform and first derivative are continuous at tx
Average Currents in Capacitors
]2
[2
)2()2(1
)(0
xD
D
xeff
t
t xeff
t
effD
C
tt
t
tcC
dtctCdtctCt
tID
x
x
]2
[2
)( 22
xD
D
xC
tt
t
tcCtI
• Average current of C1 is not quite as simple:
– Current due to quadratic current in C2
– Current due to linear current in C2
• Average current for (0,tx) in C2
• Average current for (tx,tD) in C2
• Average current for (0,tD) in C2
Average Currents in C1
1
11)(
2
2)( 2
11
21 RC
t
xx
xC
x
eRCtRCt
t
cCtI
1
11
1
)(
11
)(
211
1
112
11)(22)(
RC
tt
xDx
RC
tt
RC
t
xxD
C
xD
xDx
ett
RCCct
eeRCtRCtt
cCtI
11
1
)(
211
21 )(
2
2)( RC
t
RC
tt
xDxx
DC
DxD
eeRCRCtttt
t
cCtI
Computation of Effective Capacitance
• Equating average currents
• Problem: tD and tx are not known a priori
• Solution: iterative computation– Set the load capacitance equal to total capacitance
– Use table-lookup or K-factor equations to obtain tD and tx
– Equate average currents and calculate effective capacitance– Set load capacitance equal to effective capacitance and iterate
11
)(
2
21
2
112 )(
)(1 RC
t
RC
tt
tDx
tD
eff
DxD
xxee
tt
RC
t
RCCCC
2/
2/
Dx
indD
tt
tt
Stage Delay Computation
• Calculate output waveform at gate– Using Ceff model to model interconnect
• Use the output waveform at gate as the input waveform for interconnect tree load
• Apply interconnect reduced-order modeling technique to obtain output waveform at receiver pins