timing issues & clock distributioncourse.ece.cmu.edu/~ece322/lectures/lecture10/lecture10... ·...
![Page 1: Timing issues & clock distributioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture10/Lecture10... · 2003-09-25 · Power in Clock = 20W (out of 50W) Two Level Clock Distribution: oSingle](https://reader034.vdocuments.mx/reader034/viewer/2022042111/5e8ca8a0a798a36f90159e2c/html5/thumbnails/1.jpg)
Timing Issues and Clock Distribution
Lecture 10
18-322 Fall 2003
Textbook: [Sections 7.5, 10.1, 10.3]
Overview
Timing issues & clock distribution
System Performance Determination
Pipelining
Clock skew. Register timing
Counter clock skew
Review: Register Timing
[Figure: register timing diagram with clk and Q waveforms, marking setup time, hold time, clk-to-Q (propagation) delay (tpFF), cycle time, and the window of unstable data. © Prentice Hall 1995]
Sequential Systems: The Big Picture
[Figure: block diagram. Primary Inputs and Current State feed the Combinational Logic, which produces Primary Outputs and Next State; Memory Elements (Registers), driven by Clock, hold the state.]
Maximum Clock Frequency
[Figure: FF’s clocked by φ around a LOGIC block with delay tp,comb]
“Speed” of the sequential machine (how fast can this machine be clocked)
f = 1/Tφ (clock frequency)
Example: tp ~ 100ns => 10MHz (limit on performance)
tp,FF + tp,comb + tsetup < Tφ
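A minimal sketch of the constraint above, assuming all delays are given in nanoseconds. The function names and the 5/90/5 ns split of the roughly 100 ns example are hypothetical, chosen only so the total matches the slide.

```python
def min_clock_period_ns(t_p_ff, t_p_comb, t_setup):
    """Smallest period T that still covers tp,FF + tp,comb + tsetup (in ns)."""
    return t_p_ff + t_p_comb + t_setup


def max_clock_frequency_mhz(t_p_ff, t_p_comb, t_setup):
    """f = 1/T, converted from a period in ns to a frequency in MHz."""
    return 1000.0 / min_clock_period_ns(t_p_ff, t_p_comb, t_setup)


# Hypothetical breakdown of a ~100 ns register-to-register delay.
period_ns = min_clock_period_ns(t_p_ff=5, t_p_comb=90, t_setup=5)
f_mhz = max_clock_frequency_mhz(t_p_ff=5, t_p_comb=90, t_setup=5)
print(f"T >= {period_ns} ns, f <= {f_mhz:.1f} MHz")
```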
Setup Time
Required time for input to be stable BEFORE CLOCK EDGE
[Figure: Comb. Logic driving a register; data must be stable at the register input before the clock edge arrives at the clock input]
Setup Time Fix
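[Figure: Φ and Data waveforms with a setup time violation]
This violation can be fixed by stretching the clock cycle
[Figure: Φ and Data waveforms with the stretched clock cycle, marked OK]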
Setup Time Fix 2
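[Figure: Φ and Data waveforms with a setup time violation]
OR… by accelerating the combinational logic
[Figure: Φ and Data waveforms with faster combinational logic, marked OK]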
Hold Time
Required time for input to be stable AFTER CLOCK EDGE
[Figure: Comb. Logic driving a register; data must remain stable at the register input after the clock edge arrives at the clock input]
Hold Time Violations
Prop Delay: 1 ns Hold Time: 2 ns
Hold time violations are caused by “short paths”
Cannot be fixed by slowing down the clock!!!
Fixed by slowing down fast paths
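The slide's numbers (propagation delay 1 ns, hold time 2 ns) can be checked with a small sketch. It assumes zero clock skew and treats the propagation delay as the total shortest delay from the clock edge to the next register input; the function names are hypothetical. Note that the clock period does not appear anywhere in the check, which is why a slower clock cannot help.

```python
def hold_slack_ns(t_short_path, t_hold):
    """Positive or zero slack means the hold requirement is met."""
    return t_short_path - t_hold


def padding_needed_ns(t_short_path, t_hold):
    """Extra delay to insert on the fast path so that slack reaches zero."""
    return max(0.0, t_hold - t_short_path)


slack = hold_slack_ns(t_short_path=1.0, t_hold=2.0)
padding = padding_needed_ns(t_short_path=1.0, t_hold=2.0)
fixed_slack = hold_slack_ns(t_short_path=1.0 + padding, t_hold=2.0)

print(f"slack before fix: {slack} ns")   # negative: hold violation
print(f"delay to add:     {padding} ns")
print(f"slack after fix:  {fixed_slack} ns")
```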
Timing Analysis
Look for longest path: clock speed
Look for shortest paths: check hold time
Static Timing Analysis:
Attempt to determine longest/shortest path from schematic
Difficult problem
Know the delay of logic elements, but cannot easily reason about the entire design
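As an illustration of the longest/shortest path search, here is a sketch over a small, made-up netlist. The graph, node names, and delays are all hypothetical; the code assumes the graph is acyclic and that every node can reach the destination register. It is purely topological, so it would also report paths that can never be sensitized.

```python
# Hypothetical netlist: EDGES[node] = list of (successor, delay_ns).
EDGES = {
    "reg_a": [("g1", 1.0), ("g2", 2.0)],
    "g1": [("g3", 3.0)],
    "g2": [("g3", 1.0), ("reg_b", 1.0)],
    "g3": [("reg_b", 2.0)],
    "reg_b": [],
}


def extreme_delay(edges, src, dst, pick):
    """Return the max (pick=max) or min (pick=min) path delay from src to dst."""
    memo = {}

    def visit(node):
        if node == dst:
            return 0.0
        if node not in memo:
            # Best delay from this node = best over its outgoing edges.
            memo[node] = pick(delay + visit(nxt) for nxt, delay in edges[node])
        return memo[node]

    return visit(src)


longest = extreme_delay(EDGES, "reg_a", "reg_b", max)   # bounds the clock period
shortest = extreme_delay(EDGES, "reg_a", "reg_b", min)  # checked against hold time
print(f"longest path: {longest} ns, shortest path: {shortest} ns")
```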
False Paths
Example:
[Figure: example circuit with elements labeled #4, #3, #2, #3]
Solutions:
Simulation
False Path Analysis
Speeding up System Performance: Pipelining
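[Figure: Non-pipelined version and Pipelined version of a datapath with inputs a and b, registers (REG) clocked by φ, and a log block producing Out. In the non-pipelined version the whole combinational delay tp,comb lies between two registers; the pipelined version inserts registers between the blocks.]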
How Good Is This?
Tmin,pipe = tp,reg + max(tp,ADD, tp,abs, tp,log) + tsetup,reg
Pipelining is used to implement high-performance data-paths
Adding extra pipeline stages only makes sense up to a certain point
[Figure: Pipelined version of the datapath, with inputs a and b, registers (REG) clocked by φ between the blocks, and a log block producing Out]
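To put numbers on the formula for Tmin,pipe, the sketch below compares it with a non-pipelined period. Two things here are assumptions rather than slide content: the non-pipelined period is taken as the register delay plus the sum of all block delays plus setup, and the block delays themselves are invented for illustration.

```python
def t_min_nonpipelined(t_reg, block_delays, t_setup):
    """Assumed model: all blocks sit between one pair of registers."""
    return t_reg + sum(block_delays) + t_setup


def t_min_pipelined(t_reg, block_delays, t_setup):
    """Tmin,pipe = tp,reg + max(block delays) + tsetup,reg."""
    return t_reg + max(block_delays) + t_setup


# Hypothetical delays in ns for the ADD, abs and log blocks.
blocks = {"ADD": 4.0, "abs": 2.0, "log": 6.0}
t_reg, t_setup = 0.5, 0.5

t_flat = t_min_nonpipelined(t_reg, blocks.values(), t_setup)
t_pipe = t_min_pipelined(t_reg, blocks.values(), t_setup)
print(f"non-pipelined: {t_flat} ns, pipelined: {t_pipe} ns")
print(f"speed-up: {t_flat / t_pipe:.2f}x with 3 stages")
```

With these made-up values the speed-up stays below 3x because the stages are unbalanced and each stage still pays the register delay and setup time, which is one way to see why extra stages stop paying off.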
Overview
Timing issues & clock distribution
System Performance Determination
Pipelining
Clock skew. Register timing
Counter clock skew
Synchronous Pipelined Data-Path: Clock Skew
Clock Rates as High as 1 GHz in CMOS!
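[Figure: pipelined datapath In -> CL1 -> R1 -> CL2 -> R2 -> CL3 -> R3 -> Out, clocked by φ. The local clock arrival times at the registers are tφ’, tφ’’, tφ’’’. Logic delay ranges from tl,min to tl,max, register delay from tr,min to tr,max, and ti is the interconnect delay.]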
Clock Edge Timing Depends upon Position
A clock line behaves as a distributed RC line
Each register sees a local clock time that depends on its distance from the clock source -> clock skew
δ = tφ” – tφ’ (> 0 or <0)
Clock skew can severely affect the performance
Note: we assumed here tsetup = 0
Constraints on Skew
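(a) Race between clock and data.
[Figure: R1 is clocked by φ’ at tφ’ and R2 by φ’’ at tφ’’ = tφ’ + δ. The earliest time the data at the input of R2 can change is tr,min + tl,min + ti after the clock edge at R1.]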
If the local clock of R2 is delayed w.r.t. R1, it might happen that the inputs of R2 change before the previous data is latched -> race
δ ≤ tr,min + ti + tl,min
(b) Data should be stable before clock pulse is applied.
[Figure: R1 is clocked by φ’ at tφ’; the next clock edge reaches R2 at tφ’’ + T = tφ’ + T + δ. The worst-case data delay from R1 to R2 is tr,max + tl,max + ti.]
The correct input data is stable at R2 after the worst-case propagation delay. The clock period must be large enough for the computations to settle.
T ≥ tr,max + ti + tl,max - δ
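The two inequalities on this slide can be turned into a quick check. This is a sketch with hypothetical delay values in picoseconds, and like the slide it assumes tsetup = 0; the function names are not from the lecture.

```python
def max_allowed_skew(t_r_min, t_i, t_l_min):
    """Race condition bound: delta <= tr,min + ti + tl,min."""
    return t_r_min + t_i + t_l_min


def min_period_with_skew(t_r_max, t_i, t_l_max, delta):
    """Period bound: T >= tr,max + ti + tl,max - delta."""
    return t_r_max + t_i + t_l_max - delta


# Hypothetical values, all in ps.
t_r_min, t_r_max = 40, 60
t_l_min, t_l_max = 100, 900
t_i = 10
delta = 120

skew_limit = max_allowed_skew(t_r_min, t_i, t_l_min)
race_free = delta <= skew_limit
t_min = min_period_with_skew(t_r_max, t_i, t_l_max, delta)
print(f"skew limit: {skew_limit} ps, race free: {race_free}")
print(f"minimum clock period: {t_min} ps")
```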
Clock Constraints in Edge-Triggered Logic
δ ≤ tr,min + ti + tl,min   (1)
T ≥ tr,max + ti + tl,max – δ   (2)
Maximum Clock Skew Determined by Minimum Delay between Latches (condition 1)
Minimum Clock Period Determined by Maximum Delay between Latches (condition 2)
Positive and Negative Skew
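(a) Positive skew
[Figure: Data passes through a chain of registers (R) and combinational logic blocks (CL); the clock φ is routed along the chain in the same direction as the data]
The clock is routed in the same direction as data
The skew has to satisfy (1)
If it violates (1), then the circuit malfunctions independently of the clock period
Clock period decreases!!!
(b) Negative skew
[Figure: the same chain of registers (R) and combinational logic blocks (CL); the clock φ is routed against the direction of the data]
The clock is routed in the opposite direction of data
(1) is satisfied implicitly. The circuit operates correctly independently of the skew
Clock period increases by |δ|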
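The trade-off between the two routing directions can be seen by sweeping the sign of δ through conditions (1) and (2). The delay values below are hypothetical (in ps) and the helper name is an assumption for this sketch.

```python
def evaluate_skew(delta, t_r_min=40, t_r_max=60, t_i=10, t_l_min=100, t_l_max=900):
    """Return (condition (1) holds, minimum period from condition (2))."""
    race_free = delta <= t_r_min + t_i + t_l_min
    t_min = t_r_max + t_i + t_l_max - delta
    return race_free, t_min


# Negative values: clock routed against the data. Positive: with the data.
results = {delta: evaluate_skew(delta) for delta in (-100, 0, 100, 200)}
for delta, (race_free, t_min) in results.items():
    note = f"T >= {t_min} ps" if race_free else "race: fails for any T"
    print(f"delta = {delta:5d} ps -> {note}")
```

In this sketch a negative δ always passes condition (1) but lengthens the period by |δ|, a moderate positive δ shortens the period, and a positive δ beyond the bound in (1) breaks the circuit regardless of T.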
Overview
Timing issues & clock distribution
Pipelining
Clock skew. Register timing
Counter clock skew
Countering Clock Skew
[Figure: Data and Clock Routing. Data flows from In through registers (REG) and logic blocks to Out; the Clock Distribution reaches the registers so that some stages see Positive Skew and others Negative Skew.]
Goal: clock skew between registers is bounded!
(What matters is the relative skew between communicating registers.)
Clock Distribution: H-Trees
[Figure: H-tree clock network driven from clk]
• Every branch sees the same wire length and capacitance
• The clock skew is theoretically zero
• The sub-blocks should be small enough s.t. the skew within the block is tolerable
• It is essential to consider clock distribution early in the design process
Clock distribution is a major design problem!
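The equal-wire-length property of an H-tree can be illustrated with a small geometric sketch. The recursion below is a simplified model, not a layout tool: branches alternate between horizontal and vertical, and the branch length is halved after each vertical step. The starting branch length and the number of levels are hypothetical.

```python
def h_tree_leaves(x, y, branch, levels, horizontal=True, wire=0.0):
    """Return (x, y, wire_length_from_root) for every leaf of the tree."""
    if levels == 0:
        return [(x, y, wire)]
    dx, dy = (branch, 0.0) if horizontal else (0.0, branch)
    # Halve the branch length once both directions have been used.
    next_branch = branch if horizontal else branch / 2
    leaves = []
    for sign in (-1.0, 1.0):
        leaves += h_tree_leaves(
            x + sign * dx, y + sign * dy,
            next_branch, levels - 1, not horizontal, wire + branch,
        )
    return leaves


leaves = h_tree_leaves(0.0, 0.0, branch=4.0, levels=4)
wire_lengths = {wire for _, _, wire in leaves}
print(f"{len(leaves)} leaves, distinct wire lengths from clk: {wire_lengths}")
```

Every leaf ends up at the same distance from the root, which is the geometric reason the skew is zero in theory; process variation and unequal loads are what reintroduce skew in practice.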
Clock Network with Distributed Buffering
[Figure: CLOCK feeds the main clock driver, which feeds secondary clock drivers; each secondary driver serves the Modules in its Local Area]
Reduces absolute delay, and makes Power-Down easier
Sensitive to variations in Buffer Delay
DEC Alpha 21164
Clock Drivers
9.3 M Transistors, 4 metal layers, 0.55µm
Clock Freq: 300 MHz
Clock Load: 3.75 nF
Power in Clock = 20W (out of 50W)
Two Level Clock Distribution:
o Single 6-stage driver at center
o Secondary buffers drive left and right side
o Max clock skew less than 100psec
o Routing the clock in the opposite direction
o Proper timing
Clock Skew in Alpha
[Figure: clock skew across the Alpha chip; label: Clock driver]
Timing & Race Conditions: Example
[Figure: a 32-bit adder built from full-adder cells (inputs A, B, Cin; outputs Sum, Cout) between 32-bit Source registers R1, R2 and 32-bit Destination registers R3, R4, R5. Labels: clk driver 150Ω, 300fF; ~1mm wire 200Ω, 100fF between Source and Destination.]
Example (cont’d)
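[Figure: π model of the clock network. The 150Ω driver feeds node φ’, loaded by 600fF + 50fF; the 200Ω wire connects φ’ to node φ”, loaded by 50fF + 900fF.]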
Find the skew between the source register clock (φ’) and the destination (φ”)
tφ’ = 0.69 (150) (650) = 67ps
tφ” = 0.69 [(150) (650) + (150 + 200)(950)] = 297ps
δ = tφ” – tφ’ = 230ps
Check race condition
δ ≤ tr,min + ti + tl,min condition (1)
thold + δ ≤ tclk-Q + tsum
100 + 230 ≤ 50 + 300 TRUE => No race problem
Find minimum clock period
T ≥ tr,max + ti + tl,max - δ condition (2)
T ≥ tclk-Q + 31 tcarry + tsum - δ + tsetup
T ≥ 50 + 31(250) + 300 – 230 + 150 => T ≥ 8.02 ns
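The arithmetic of this example can be replayed in a few lines. The sketch takes the component values from the π model and the timing numbers from the slide (thold = 100, tclk-Q = 50, tsum = 300, tcarry = 250, tsetup = 150, all in ps), and uses δ = tφ” – tφ’ as defined earlier in the lecture. Variable names are hypothetical, and the unit handling (ohms times femtofarads gives 1e-3 ps) is spelled out in the comments.

```python
# Pi-model values from the slide: resistances in ohms, capacitances in fF.
R_DRIVER, R_WIRE = 150.0, 200.0
C_SRC = 600.0 + 50.0   # load at node phi'
C_DST = 50.0 + 900.0   # load at node phi''

# ohm * fF = 1e-15 s = 1e-3 ps, so divide by 1000 to get picoseconds.
t_src = 0.69 * R_DRIVER * C_SRC / 1000.0
t_dst = 0.69 * (R_DRIVER * C_SRC + (R_DRIVER + R_WIRE) * C_DST) / 1000.0
skew = t_dst - t_src   # delta = t_phi'' - t_phi'

# Register and adder timing from the slide, in ps.
T_HOLD, T_CLK_Q, T_SUM, T_CARRY, T_SETUP = 100.0, 50.0, 300.0, 250.0, 150.0

race_ok = T_HOLD + skew <= T_CLK_Q + T_SUM                     # condition (1)
t_min = T_CLK_Q + 31 * T_CARRY + T_SUM - skew + T_SETUP        # condition (2)

print(f"t_phi' = {t_src:.0f} ps, t_phi'' = {t_dst:.0f} ps, skew = {skew:.0f} ps")
print(f"race free: {race_ok}")
print(f"minimum period: {t_min:.0f} ps")
```

Without rounding the skew to 230 ps, the unrounded values are about 229.4 ps for the skew and about 8020.6 ps (roughly 8.02 ns) for the minimum period.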