ece 558/658 : lecture 20 interconnect design (chapter 9) clock distribution (chapter 10.3.3)
DESCRIPTION
ECE 558/658 : Lecture 20 Interconnect Design (Chapter 9) Clock distribution (Chapter 10.3.3). Atul Maheshwari. The Wire – Chapter 4 – Lecture 15. schematics. physical. Capacitance: The Parallel Plate Model. Fringing Capacitance. Complete Capacitance Picture. Wire Resistance . - PowerPoint PPT PresentationTRANSCRIPT
© Digital Integrated Circuits2nd Interconnect
ECE 558/658 : ECE 558/658 : Lecture 20Lecture 20
Interconnect DesignInterconnect Design(Chapter 9)(Chapter 9)
Clock distributionClock distribution(Chapter 10.3.3)(Chapter 10.3.3)
Atul Maheshwari
© Digital Integrated Circuits2nd Interconnect
The Wire – Chapter 4 – Lecture 15The Wire – Chapter 4 – Lecture 15
transmitters receivers
schematics physical
© Digital Integrated Circuits2nd Interconnect
Capacitance: The Parallel Plate ModelCapacitance: The Parallel Plate Model
Dielectric
Substrate
L
W
H
tdi
Electrical-field lines
Current flow
WLt
cdi
diint
© Digital Integrated Circuits2nd Interconnect
Fringing CapacitanceFringing Capacitance
W - H/2H
+
(a)
(b)
© Digital Integrated Circuits2nd Interconnect
Complete Capacitance PictureComplete Capacitance Picture
fringing parallel
© Digital Integrated Circuits2nd Interconnect
Wire Resistance Wire Resistance
W
LH
R = H W
L
Sheet ResistanceRo
R1 R2
© Digital Integrated Circuits2nd Interconnect
Impact of Interconnect ParasiticsImpact of Interconnect Parasitics
• Reduce Robustness
• Affect Performance• Increase delay• Increase power dissipation
Classes of Parasitics
• Capacitive
• Resistive
• InductiveImpact of Parasitics
© Digital Integrated Circuits2nd Interconnect
INTERCONNECTINTERCONNECT
© Digital Integrated Circuits2nd Interconnect
Capacitive Cross TalkCapacitive Cross Talk
X
YVX
CXY
CY
© Digital Integrated Circuits2nd Interconnect
Capacitive Cross TalkCapacitive Cross TalkDynamic NodeDynamic Node
3 x 1 m overlap: 0.19 V disturbance
CY
CXY
VDD
PDN
CLK
CLK
In1In2In3
Y
X
2.5 V
0 V
© Digital Integrated Circuits2nd Interconnect
Capacitive Cross TalkCapacitive Cross TalkDriven NodeDriven Node
XY = RY(CXY+CY)
Keep time-constant smaller than rise time
0
0.5
0.450.4
0.35
0.3
0.25
0.2
0.150.1
0.05
010.80.6
t (nsec)0.40.2
X
YVX
RYCXY
CY
tr↑
© Digital Integrated Circuits2nd Interconnect
Dealing with Capacitive Cross Dealing with Capacitive Cross TalkTalk
Avoid floating nodes Protect sensitive nodes Make rise and fall times as large as possible Differential signaling Do not run wires together for a long distance Use shielding wires Use shielding layers
© Digital Integrated Circuits2nd Interconnect
ShieldingShielding
GND
GND
Shieldingwire
Substrate (GND )
Shieldinglayer
VDD
© Digital Integrated Circuits2nd Interconnect
Cross Talk and PerformanceCross Talk and Performance
Cc
- When neighboring lines switch in opposite direction of victim line, delay increases
DELAY DEPENDENT UPON ACTIVITY IN NEIGHBORING WIRES
Miller EffectMiller Effect
- Both terminals of capacitor are switched in opposite directions (0 Vdd, Vdd 0)
- Effective voltage is doubled and additional charge is needed (from Q=CV)
© Digital Integrated Circuits2nd Interconnect
Impact of Cross Talk on Delay Impact of Cross Talk on Delay
r is ratio between capacitance to GND and to neighbor
© Digital Integrated Circuits2nd Interconnect
Structured Predictable InterconnectStructured Predictable Interconnect
S
S SV V S
G
SSV
G
VS
S SV V S
G
SSV
G
VExample: Dense Wire Fabric ([Sunil Kathri])Trade-off:• Cross-coupling capacitance 40x lower, 2% delay variation• Increase in area and overall capacitance
© Digital Integrated Circuits2nd Interconnect
Encoding Data Avoids Worst-CaseEncoding Data Avoids Worst-CaseConditionsConditions
Encoder
Decoder
Bus
In
Out
© Digital Integrated Circuits2nd Interconnect
Driving Large CapacitancesDriving Large Capacitances
V in Vout
CL
VDD
• Transistor Sizing• Cascaded Buffers
© Digital Integrated Circuits2nd Interconnect
Using Cascaded BuffersUsing Cascaded Buffers
CL = 20 pF
In Out
1 2 N
0.25 m processCin = 2.5 fFtp0 = 30 ps
F = CL/Cin = 8000fopt = 3.6 N = 7tp = 0.76 ns
(See Chapter 5)
© Digital Integrated Circuits2nd Interconnect
How to Design Large TransistorsHow to Design Large Transistors
G(ate)
S(ource)
D(rain)
MultipleContacts
small transistors in parallel
Reduces diffusion capacitanceReduces gate resistance
© Digital Integrated Circuits2nd Interconnect
Bonding Pad DesignBonding Pad DesignBonding Pad
Out
InVDD GND
100 m
GND
Out
© Digital Integrated Circuits2nd Interconnect
Tristate BuffersTristate Buffers
InEn
En
VDD
Out
Out = In.En + Z.En
VDD
In
En
EnOut
Increased output drive
© Digital Integrated Circuits2nd Interconnect
INTERCONNECTINTERCONNECT
© Digital Integrated Circuits2nd Interconnect
Impact of ResistanceImpact of Resistance Impact on performance - We have already
learned how to drive RC interconnect Impact of resistance is commonly seen in
power supply distribution: IR drop Voltage variations
Power supply is distributed to minimize the IR drop and the change in current due to switching of gates
© Digital Integrated Circuits2nd Interconnect
Using BypassesUsing BypassesDriver
Polysilicon word line
Polysilicon word line
Metal word line
Metal bypass
Driving a word line from both sides
Using a metal bypass
WL
WL K cells
© Digital Integrated Circuits2nd Interconnect
Resistivity and PerformanceResistivity and Performance
0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 50
0.5
1
1.5
2
2.5
time (nsec)
volta
ge (V
)
x= L/10
x = L/4
x = L/2
x= L
Diffused signal Diffused signal propagationpropagation
Delay ~ RCDelay ~ RCDelay ~ LDelay ~ L22
CN-1 CNC2
R1 R2
C1
Tr
Vin
RN-1 RN
The distributed rc-lineThe distributed rc-line
© Digital Integrated Circuits2nd Interconnect
RepeatersRepeaters
Repeaters are buffers / inverters inserted at regular intervals.
Makes Delay linearly proportional to the wire length.
Questions to be answered – Where and how big the repeaters should be ?
© Digital Integrated Circuits2nd Interconnect
Reducing RC-delay – Repeater insertionReducing RC-delay – Repeater insertion
Repeater
(chapter 5)
© Digital Integrated Circuits2nd Interconnect
Repeater InsertionRepeater Insertion
Taking the repeater loading into account
For a given technology and a given interconnect layer, there exists For a given technology and a given interconnect layer, there exists an optimal length of the wire segments between repeaters. The an optimal length of the wire segments between repeaters. The delay of these wire segments is delay of these wire segments is independent of the routing layer!independent of the routing layer!
© Digital Integrated Circuits2nd Interconnect
Repeater Design LimitationsRepeater Design Limitations Delay-optimal repeaters are area and power
hungry – use of sub-optimal insertion Optimal placement requires accurate
modeling of interconnect. Optimal placement not always possible. Performance limited due to significant
interconnect resistance. Source of noise – Supply and Substrate
© Digital Integrated Circuits2nd Interconnect
Advanced techniques - Reducing the Advanced techniques - Reducing the swingswing
tpHL = CL Vswing/2
Iav
Reducing the swing potentially yields linear reduction in delay Also results in reduction in power dissipation Delay penalty is paid by the receiver Requires use of “sense amplifier” to restore signal level Frequently designed differentially (e.g. LVDS)
© Digital Integrated Circuits2nd Interconnect
Single-Ended Static Driver and Single-Ended Static Driver and ReceiverReceiver
CL
VDD
VDD VDD
driver receiver
VDDL
VDDLIn
OutOut
© Digital Integrated Circuits2nd Interconnect
Dynamic Reduced Swing NetworkDynamic Reduced Swing Network
fIn2.fIn1.
f M2
M1 M3
M4
Cbus Cout
Bus Out
VDD VDD
f
VbusVasym
Vsym
2 4 6time (ns)
8 10 120
0.5
1
1.5
2
2.5
0
© Digital Integrated Circuits2nd Interconnect
RI Introduced NoiseRI Introduced Noise
M1
X
I
R̀
RV
f pre
V
VDD
VDD - V`
I
© Digital Integrated Circuits2nd Interconnect
Power DistributionPower Distribution Low-level distribution is in Metal 1 Power has to be ‘strapped’ in higher layers of
metal. The spacing is set by IR drop,
electromigration, inductive effects Always use multiple contacts on straps
© Digital Integrated Circuits2nd Interconnect
Power and Ground DistributionPower and Ground Distribution
GND
VDD
Logic
GND
VDD
Logic
GND
VDD
(a) Finger-shaped network (b) Network with multiple supply pins
© Digital Integrated Circuits2nd Interconnect
3 Metal Layer Approach (EV4)3 Metal Layer Approach (EV4)3rd “coarse and thick” metal layer added to the
technology for EV4 designPower supplied from two sides of the die via 3rd metal layer
2nd metal layer used to form power grid90% of 3rd metal layer used for power/clock routing
Metal 3
Metal 2
Metal 1
Courtesy Compaq
© Digital Integrated Circuits2nd Interconnect
4 Metal Layers Approach (EV5)4 Metal Layers Approach (EV5)4th “coarse and thick” metal layer added to the
technology for EV5 designPower supplied from four sides of the dieGrid strapping done all in coarse metal
90% of 3rd and 4th metals used for power/clock routing
Metal 3
Metal 2
Metal 1
Metal 4
Courtesy Compaq
© Digital Integrated Circuits2nd Interconnect
2 reference plane metal layers added to thetechnology for EV6 designSolid planes dedicated to Vdd/Vss
Significantly lowers resistance of gridLowers on-chip inductance
6 Metal Layer Approach – EV66 Metal Layer Approach – EV6
Metal 4
Metal 2Metal 1
RP2/Vdd
RP1/Vss
Metal 3
Courtesy Compaq
© Digital Integrated Circuits2nd Interconnect
Electromigration (1)Electromigration (1)
Limits dc-current to 1 mA/m
© Digital Integrated Circuits2nd Interconnect
Electromigration (2)Electromigration (2)
© Digital Integrated Circuits2nd Interconnect
Interconnect Projections:Interconnect Projections:GeometryGeometry
# of metal layers is steadily increasing due to:• Increasing die size and device count: we need more wires and longer wires to connect everything
• Rising need for a hierarchical wiring network; local wires with high density and global wires with low RC
substrate poly
M1
M2
M3
M4
M5
M6
Tins
H
W S
= 2.2 -cm
0.25 m wiring stack
Minimum Widths (Relative)
0.0
0.5
1.0
1.5
2.0
2.5
3.0
3.5
M5M4M3M2M1Poly
Minimum Spacing (Relative)
0.0
0.5
1.0
1.5
2.0
2.5
3.0
3.5
4.0
M5M4M3M2M1Poly
© Digital Integrated Circuits2nd Interconnect
Interconnect Projections :Interconnect Projections :Low-k dielectricsLow-k dielectrics
Both delay and power are reduced by dropping interconnect capacitance
Types of low-k materials include: inorganic (SiO2), organic (Polyimides) and aerogels (ultra low-k)
The numbers below are on the conservative side of the NRTS roadmap
© Digital Integrated Circuits2nd Interconnect
Interconnect Projections: CopperInterconnect Projections: Copper Copper is planned in full sub-0.25
m process flows and large-scale designs (IBM, Motorola, IEDM97)
With cladding and other effects, Cu ~ 2.2 -cm vs. 3.5 for Al(Cu) 40% reduction in resistance
Electromigration improvement; 100X longer lifetime (IBM, IEDM97) Electromigration is a limiting factor
beyond 0.18 m if Al is used (HP, IEDM95)
Vias
© Digital Integrated Circuits2nd Interconnect
Diagonal WiringDiagonal Wiring
y
x
destination
Manhattan
source
diagonal
• 20+% Interconnect length reduction• Clock speed Signal integrity Power integrity • 15+% Smaller chips plus 30+% via reduction
Courtesy Cadence X-initiative
© Digital Integrated Circuits2nd Interconnect
Clock distributionClock distribution
© Digital Integrated Circuits2nd Interconnect
Clock DistributionClock Distribution
CLK
Clock is distributed in a tree-like fashion
H-tree
© Digital Integrated Circuits2nd Interconnect
More realistic H-treeMore realistic H-tree
[Restle98]
© Digital Integrated Circuits2nd Interconnect
The Grid SystemThe Grid SystemD r iv e r
D r iv e r
Dri
ver
Driv
er
G C LK G C LK
G CL K
G CL K
•No rc-matching•Large power
© Digital Integrated Circuits2nd Interconnect
Example: DEC Alpha 21164Example: DEC Alpha 21164Clock Frequency: 300 MHz - 9.3 Million Transistors
Total Clock Load: 3.75 nF
Power in Clock Distribution network : 20 W (out of 50)
Uses Two Level Clock Distribution:
• Single 6-stage driver at center of chip• Secondary buffers drive left and right side
clock grid in Metal3 and Metal4Total driver size: 58 cm!
© Digital Integrated Circuits2nd Interconnect
21164 Clocking21164 Clocking 2 phase single wire clock,
distributed globally 2 distributed driver channels
Reduced RC delay/skew Improved thermal distribution 3.75nF clock load 58 cm final driver width
Local inverters for latching Conditional clocks in caches to
reduce power More complex race checking Device variation
trise = 0.35ns tskew = 150ps
tcycle= 3.3ns
Clock waveform
Location of clockdriver on die
pre-driver
final drivers
© Digital Integrated Circuits2nd Interconnect
Clock Drivers
© Digital Integrated Circuits2nd Interconnect
Clock Skew in Alpha ProcessorClock Skew in Alpha Processor
© Digital Integrated Circuits2nd Interconnect
2 Phase, with multiple conditional buffered clocks 2.8 nF clock load 40 cm final driver width
Local clocks can be gated “off” to save power
Reduced load/skew Reduced thermal issues Multiple clocks complicate race
checking
trise = 0.35ns tskew = 50ps
tcycle= 1.67ns
EV6 (Alpha 21264) ClockingEV6 (Alpha 21264) Clocking600 MHz – 0.35 micron CMOS600 MHz – 0.35 micron CMOS
Global clock waveform
PLL
© Digital Integrated Circuits2nd Interconnect
21264 Clocking21264 Clocking
© Digital Integrated Circuits2nd Interconnect
EV6 Clock ResultsEV6 Clock Results
GCLK Skew(at Vdd/2 Crossings)
ps5
101520253035404550
ps300305310315320325330335340345
GCLK Rise Times(20% to 80% Extrapolated to 0% to 100%)
© Digital Integrated Circuits2nd Interconnect
EV7 Clock HierarchyEV7 Clock Hierarchy
GCLK(CPU Core)L2
L_CL
K(L
2 Ca
che)
L2R_
CLK
(L2
Cach
e)
NCLK(Mem Ctrl)
DLL
PLL
SYSCLK
DLL
DLL
+ widely dispersed drivers
+ DLLs compensate static and low-frequency variation
+ divides design and verification effort
- DLL design and verification is added work
+ tailored clocks
Active Skew Management and Multiple Clock Domains