Digital Clock and Data Recovery for High-Speed Serial Data Links
Ken Evans [email protected]
Typical 6 Gbps SATA / SAS PMA Layer
• Adaptive RX equalization at 6.0 Gbps.
• Digital clock and SS-modulated data recovery.
• 1:10 ser/des to accommodate 8b/10b coding in PCS layer.
• SSCG MPLL serving shared RX/TX frequency synthesizer.
• 1b transmitter de-emphasis, programmable to -6 dB or -3.5 dB.
• Link scaling for legacy modes (1.5, 3.0 Gbps).
Typical 5 Gbps PCI-E or USB PMA Layer
• Adaptive RX equalization for 5.0 Gbps USB, none for PCI-E.
• Digital clock and SS-modulated data recovery.
• 1:10 ser/des to accommodate 8b/10b coding in PCS layer.
• SSCG MPLL serving shared RX/TX frequency synthesizer.
• 1b transmitter de-emphasis, programmable to -6 dB or -3.5 dB.
• Link scaling for legacy PCI-E (2.5 Gbps).
Combining Macros into Multilane SERDES ASIC
Notes on SERDES IP Macros
– Multiple applications / standards can be satisfied with universal macros – PMAs are almost identical.
– Signaling rates can be dialed in by scalable link design.
– PCS layer functions tend to be universal, e.g. scrambling, 8b/10b coding, BIST, loopback modes.
– RX / TX equalization optional depending on channel medium.
• SATA 3.0, USB 3.0 require RX EQ, TX de-emphasis.
• PCI-E 2.0 does not require RX EQ up to 34” FR4.
– Mostly digital implementations mandatory at this point.
• Multi-lane solutions demand minimal thermal / area footprint.
• Quick process migrations require maximum portability.
• Macros must be digitally programmable to accommodate specific application specs and usage customizability.
Relevant CDR Blocks
• Digital PLL (DPLL) recovers digital phase error to drive loop into lock.
• Varying degrees of data deserialization embedded within CDR.
• Digital-to-phase conversion bridges digital and analog domains.
• Frequency synthesizer required to produce clock which is ideally matched in frequency with RX data.
• Loop dynamics are governed by both linear and non-linear behavior.
Basic DPLL Structure
[Figure panels: Block-level Arrangement; System Representation]
• DPLL phase adjustment accomplished by rotating the phase of clock produced by VCO.
• DPLL frequency adjustment accomplished by adjusting rate of phase rotation.
• DPC is the DPLL’s “VCO”, but does not integrate.
• Simplistically, phase error sampler is viewed as a slicer. All remaining elements are linear.
• Non-linear loop dynamics are difficult to analyze in closed form.
Basic !!PD Mechanics
• !!PD 2x oversamples data with 0° (I) and 90° (Q) clock phases.
• In locked state, I clock samples data at the eye center and Q clock samples zero-crossings.
• Early and late phase errors produced by XORing sliced data and edge samples.
• No error produced when no transition present – important feature!
• Early, late, and no-transition states are all mutually exclusive.
• !!PD never produces stable output – causes “dither” jitter at lock point.
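The XOR mechanics above can be sketched for a single unit interval as follows. This is a minimal illustration, not the slides' circuit: the function name and the sign convention (+1 for early, -1 for late) are assumptions chosen for the example.

```python
def bang_bang_pd(d_prev: int, edge: int, d_next: int) -> int:
    """Unit bang-bang phase detector for one UI.

    d_prev, d_next: consecutive sliced data bits (I-clock samples).
    edge:           zero-crossing sample taken between them (Q clock).
    Returns +1 (clock early), -1 (clock late), 0 (no transition).
    The sign convention is an assumption made for this sketch.
    """
    late = d_prev ^ edge    # edge sample already shows the new bit
    early = edge ^ d_next   # edge sample still shows the old bit
    # With no data transition both XORs agree, so the result is 0.
    return early - late


if __name__ == "__main__":
    for bits in [(0, 0, 1), (0, 1, 1), (1, 1, 1)]:
        print(bits, bang_bang_pd(*bits))
```

Because the edge sample must equal one of the two data bits when a transition occurs, early and late can never both be asserted for a valid transition, which is the mutual exclusivity noted above.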
!!PD Behavior in Presence of Jitter
• Clock and data jitter causes !!PD to have a “stochastically” linear range.
• At any given phase error, there are odds “P” that the edge clock will sample D_k, and “1-P” that it will sample D_(k-1).
• “Linear” range of !!PD is the probability distribution of edge samples over phase error.
• K!!pd is the slope of the distribution at zero phase error.
• Jitter is a combination of random (rj) and deterministic (dj) components.
!!PD Behavior in Presence of Jitter (cont’d)
• Random jitter (rj) used for simulation – Gaussian probability distribution.
• At the extremes of phase error, the average !!PD output saturates at +/- DT, the RX data transition density, because !!PD produces no error when no transition is present.
• !!PD gain at lock, K!!pd, is given as 1 / (σj √(2π)).
• “Linear” range is contained within +/- 2σj.
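The stochastic transfer curve can be sketched numerically under a few assumptions that are not stated on the slides: the unit !!PD output is +/-1, the jitter is purely Gaussian, and the transition density DT is 0.5 (random data). Under those assumptions the mean output is DT x erf(ϕ / (σj √2)), whose slope at zero reduces to the 1 / (σj √(2π)) expression above. Function names are illustrative.

```python
import math


def pd_mean_output(phase_err_ui, sigma_ui, transition_density=0.5):
    """Average bang-bang PD output versus phase error (both in UI).

    Assumes +/-1 unit outputs and Gaussian jitter of std sigma_ui.
    """
    return transition_density * math.erf(phase_err_ui / (sigma_ui * math.sqrt(2.0)))


def pd_gain_at_lock(sigma_ui, transition_density=0.5):
    """Slope of the mean output at zero phase error, per UI."""
    return 2.0 * transition_density / (sigma_ui * math.sqrt(2.0 * math.pi))


if __name__ == "__main__":
    for sigma in (0.01, 0.03, 0.05):
        print(f"sigma_j = {sigma:.2f} UI -> K_pd = {pd_gain_at_lock(sigma):.3f} per UI")
```

Far from lock the mean output flattens at +/- DT, and the gain falls as jitter rises, which is the trend the simulated transfer curves illustrate.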
Three PRBS-31 RX Test Subjects
• PRBS-31 patterns generated with 9b oversampling (512 samples per UI).
• Sequence processed by raised cosine filter to mimic dispersive channel loss.
• White Gaussian random jitter applied to waveforms: σrj = 1%, 3%, 5% UI.
• Primary sequence length for simulation is 2^16 symbols.
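A test pattern of this kind can be produced with a short LFSR sketch. It assumes the standard PRBS-31 polynomial x^31 + x^28 + 1 and an all-ones seed; the oversampling, raised cosine filtering and jitter injection described above are not modelled here, and the function names are illustrative.

```python
def prbs31(n_bits, seed=0x7FFFFFFF):
    """Generate n_bits of PRBS-31 (x^31 + x^28 + 1) from a nonzero 31-bit seed."""
    state = seed & 0x7FFFFFFF
    if state == 0:
        raise ValueError("seed must be nonzero")
    bits = []
    for _ in range(n_bits):
        # Feedback taps: the bits generated 31 and 28 steps ago.
        new_bit = ((state >> 30) ^ (state >> 27)) & 1
        state = ((state << 1) | new_bit) & 0x7FFFFFFF
        bits.append(new_bit)
    return bits


def transition_density(bits):
    """Fraction of adjacent symbol pairs that differ."""
    flips = sum(a != b for a, b in zip(bits, bits[1:]))
    return flips / (len(bits) - 1)


symbols = prbs31(2 ** 16)   # sequence length used for the simulations above
dt = transition_density(symbols)

if __name__ == "__main__":
    print(f"{len(symbols)} symbols, transition density = {dt:.3f}")
```

The measured transition density is the DT value at which the !!PD transfer curve saturates.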
Simulated !!PD Transfer Curves
• Higher jitter lowers CDR loop gain – has consequences for sinusoidal jitter tolerance (JT) compliance.
• Higher jitter broadens the range over the data eye that the CDR will behave as a linear feedback loop.
• Lower jitter causes CDR to behave as a non-linear slewing feedback loop.
Effect of Sampler Offsets
• Small random and systematic sampler offsets have a manageable effect on PD behavior (left plot).
• Larger offsets can skew the lock point as well as make the CDR’s loop gain erratic and unpredictable (right plot).
• Use circuit design techniques to minimize random offsets, and robust layout techniques to minimize systematic offsets. Digital offset calibration is also recommended.
Typical Error Generator Implementation
• CDR loop cannot process error symbols at baud rate – digital filter cannot move that fast.
• Half-rate I/Q clocks from DPCs perform 1:2 data and edge deserialization.
• Symbols are aligned and deserialized 2:10, and half-rate clock is divided by 5.
• Array of unit !!PDs is used to generate 10b error sequence, which is eventually decimated into a signed, 2b error signal moving at 1/10th the baud rate.
Sampling and Deserialization
2T Sampling and Alignment Timing
Deserialization Timing (Data)
Deserialization Timing (Edge)
10b !!PD Implementation
• Unit !!PD cells generate 10 sets of 3b one-hot-coded errors (early, late, no transition).
• The 10b !!PD needs to “look ahead” into the next data frame to compute the 10th 3b error code. A redundant “wrap” bit, shared between two adjacent data frames, is made available for this purpose.
• 10 x 3b error signal is retimed prior to decimation.
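The role of the wrap bit can be sketched as follows. This is a behavioral illustration only: the indexing (edge sample i sits between data bits i and i+1) and the one-hot order (early, late, no transition) are assumptions, and retiming is not modelled.

```python
def unit_pd_onehot(d_prev, edge, d_next):
    """One unit PD cell: returns a one-hot (early, late, no_transition) tuple."""
    late = d_prev ^ edge
    early = edge ^ d_next
    if early and not late:
        return (1, 0, 0)
    if late and not early:
        return (0, 1, 0)
    return (0, 0, 1)


def pd_frame(data, edges, wrap_bit):
    """10b PD: ten one-hot error codes for one deserialized frame.

    data:     10 data bits of the current frame.
    edges:    10 edge samples; edges[i] lies between data[i] and data[i+1].
    wrap_bit: first data bit of the next frame, needed for the 10th code.
    """
    if len(data) != 10 or len(edges) != 10:
        raise ValueError("expected 10 data bits and 10 edge samples")
    extended = list(data) + [wrap_bit]
    return [unit_pd_onehot(extended[i], edges[i], extended[i + 1]) for i in range(10)]


if __name__ == "__main__":
    frame = [0, 1] * 5
    print(pd_frame(frame, frame, wrap_bit=0))
```

Without the wrap bit the 10th cell would have no "next" data bit to compare against, so the last transition in every frame would be lost.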
10b !!PD Timing
Error Decimation by Majority Voting
• Majority voting used to summarize 5b error signal.
• Democratic process is inherently non-linear, must be simulated to determine effect on DPLL loop gain.
• MUX matrix maintains a “tally” of early, late, and no-transition bits within the 5b error word.
• Might require inter-stage retiming depending on MUX implementation.
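A simple voter can be sketched as a tally comparison. The slides do not spell out the exact voting rule, so this sketch assumes that early and late counts are compared directly, that no-transition bits abstain, and that ties produce zero; the 2b two's complement encoding of the result is likewise an assumption.

```python
def majority_vote(errors):
    """Summarize a word of unit errors (+1 early, -1 late, 0 none) into -1, 0 or +1."""
    early = sum(1 for e in errors if e > 0)
    late = sum(1 for e in errors if e < 0)
    if early > late:
        return 1
    if late > early:
        return -1
    return 0


def encode_2b(vote):
    """Encode a vote in {-1, 0, +1} as a 2-bit two's complement string."""
    return format(vote & 0b11, "02b")


if __name__ == "__main__":
    word = [1, 1, -1, 0, 0]
    print(word, "->", majority_vote(word), encode_2b(majority_vote(word)))
```

The output magnitude never exceeds 1 regardless of how lopsided the tally is, which is why the voter is non-linear and its effective gain has to be found by simulation.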
Majority Voting Simulation
• Transfer curve simulation of !!PD in combination with MV yields KMV = 3.0.
• Good numerical agreement over three jitter values, although higher jitter simulations produce a noisier curve.
DPLL Loop Filter Design
DPLL Loop Filter Design (cont’d 1)
• 2nd order filter provides both phase and frequency integration.
• Phase integration path accumulates phase differences between sampling clocks and the RX data stream.
• Frequency integration path accumulates frequency differences between analog VCO and RX data stream. Allows DPLL to track static errors due to host and receiver crystal frequency mismatches (+/-300 ppm), and time-varying errors due to spread-spectrum clock modulation (0 to -5000 ppm).
• Gain implemented by bit shifting, e.g.
• (+1) x 2^2 is '01' → '0100' (+4)
• (-1) x 2^2 is '11' → '1100' (-4)
• Attenuation implemented by truncating dither bits, e.g.
• (+4) x 2^-2 is '0100' → '01' (+1)
• (-4) x 2^-2 is '1100' → '11' (-1)
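The shift examples above can be checked with a few lines of Python, whose integers behave like arbitrarily wide two's complement values. The helper names are illustrative.

```python
def to_bits(value, width):
    """Two's complement bit string of `value` in `width` bits."""
    return format(value & ((1 << width) - 1), f"0{width}b")


def gain_pow2(value, shift):
    """Multiply by 2**shift by shifting left (appends zero LSBs)."""
    return value << shift


def atten_pow2(value, shift):
    """Divide by 2**shift by dropping LSBs (arithmetic shift right)."""
    return value >> shift


if __name__ == "__main__":
    for v in (+1, -1):
        print(to_bits(v, 2), "->", to_bits(gain_pow2(v, 2), 4))
    for v in (+4, -4):
        print(to_bits(v, 4), "->", to_bits(atten_pow2(v, 2), 2))
```

Note that truncation rounds toward negative infinity: a value of -1 attenuated by 2^-2 stays at -1 rather than going to 0, so truncated paths carry a small negative bias.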
DPLL Loop Filter Design (cont’d 2)
• Phase path has gain phug = 2^{3,4,5,6} x 2^-6, i.e. 1/8 to 1.
• Frequency path has gain frug = 2^{0,1,2,3} x 2^-12, i.e. 1/4096 to 1/512.
• Frequency integrator requirements:
• Must accumulate frequency error until integrator saturates: adder must never overflow or underflow.
• Frequency error can be either positive or negative polarity: adder must perform signed (2’s complement) arithmetic.
• Phase integrator requirements:
• DPC has full-scale rotation range of 360°: adder must overflow and/or underflow to rotate DPC beyond this range.
• DPC maps unsigned binary code to phase value, so adder must be unsigned as well.
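The two adder behaviors can be contrasted in a short sketch. The 9b phase width follows the DPC resolution quoted in the slides; the 12b frequency integrator width is a hypothetical value chosen only for the example.

```python
def saturating_add(acc, delta, bits):
    """Signed accumulate that clamps at the two's complement limits (frequency path)."""
    hi = (1 << (bits - 1)) - 1
    lo = -(1 << (bits - 1))
    return max(lo, min(hi, acc + delta))


def wrapping_add(acc, delta, bits):
    """Unsigned accumulate that wraps modulo 2**bits (phase path)."""
    return (acc + delta) % (1 << bits)


def dpc_code_to_degrees(code, bits=9):
    """Map an unsigned DPC code to a phase in degrees over a 360 degree range."""
    return code * 360.0 / (1 << bits)


if __name__ == "__main__":
    print(saturating_add(2047, 1, 12))   # stays at the positive limit
    print(wrapping_add(511, 1, 9))       # rolls over to code 0
    print(dpc_code_to_degrees(256))      # half of full scale
```

Wrapping in the phase path is what lets the DPC rotate continuously through 360° to follow a frequency offset, while clamping in the frequency path prevents a sign flip that would throw the loop out of lock.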
DPLL Frequency Tracking
• RX data stream is generally SS-modulated by host-side TX.
• RX clock is asynchronously SS-modulated by MPLL unit.
• Worst-case frequency difference occurs when modulations are 180° apart – 5000 ppm.
• Including static frequency differences, DPLL must elastically track from +300 ppm to -5300 ppm over time.
• Frequency integrator must have sufficient bits to accommodate this range, accounting for integrator loss (2^-6), decimation factor (10), and DPC resolution (2^-9).
• Frequency resolution is the smallest change, in ppm, that the DPLL can make to the recovered clock frequency per update time.
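One possible reading of this sizing exercise is sketched below. It assumes that rotating the DPC by one LSB (2^-9 UI) per update of 10 UI corresponds to a frequency offset of 10^6 / (512 x 10) ppm, and that the 2^-6 integrator loss divides the weight of one frequency integrator LSB by a further 64. The slides do not give this derivation explicitly, so both the model and the resulting bit count should be treated as illustrative.

```python
import math


def frequency_resolution_ppm(dpc_bits=9, decimation=10, loss_bits=6):
    """ppm represented by one frequency-integrator LSB (assumed model)."""
    ppm_per_dpc_lsb = 1e6 / ((1 << dpc_bits) * decimation)
    return ppm_per_dpc_lsb / (1 << loss_bits)


def signed_bits_needed(max_abs_ppm, resolution_ppm):
    """Two's complement width able to hold +/- max_abs_ppm at this resolution."""
    codes = math.ceil(max_abs_ppm / resolution_ppm)
    return codes.bit_length() + 1   # magnitude bits plus sign bit


if __name__ == "__main__":
    res = frequency_resolution_ppm()
    print(f"resolution = {res:.3f} ppm per LSB")
    print(f"bits for 5300 ppm = {signed_bits_needed(5300, res)}")
```

Under these assumptions the resolution is about 3 ppm per LSB, and covering the 5300 ppm worst case takes a 12-bit signed integrator.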
DPLL Frequency Tracking (cont’d)
• DPLL step response simulations performed with and without static frequency offset.
• 9b DPC resolution yields 512 samples per UI.
• 514 samples are allocated between each retiming interval.
• Toffset = 10^6 x (514/512 – 1) ≈ 3906 ppm (roughly 3900 ppm).
• Slope of accumulation proportional to amount of offset tracked by DPLL.
• Slope of accumulation will time-vary according to RX data spread-spectrum modulation.
CDR Small-Signal Model (z^-1)
• Model is valid when DPLL is in lock:
• Variations in ϕdata are contained within [ -2σj, +2σj ].
• Variations in ϕdata, fdata are moving at a rate within the CDR’s tracking bandwidth.
• !!PD noise is always present, and can be input referred with value σ!!pd / K!!pd.
CDR Small-Signal Model (s)
• For operating frequencies much lower than the update rate of the DPLL, Euler difference equation can be used to transform z-domain to s-domain.
• Loop gain L(s) has the same form as that of a typical charge-pump APLL, but with an added exponential term that models the digital latency of the DPLL.
• In the simplistic case where latency is omitted, the DPLL tracking response has the same form as that of its APLL counterpart.
• Loop bandwidth and transfer peaking can both be designed through this transformation.
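A rough numerical version of this design flow is sketched below. The loop gain model is an assumption pieced together from the quantities named in these slides, not the author's exact expression: L(s) = K x (phug / (s Tupd) + frug / (s Tupd)^2) x exp(-s Td), with K = K!!pd x KMV x 2^-9, K!!pd = 1 / (σj √(2π)), KMV = 3.0 and a 9b DPC. It is not expected to reproduce the author's tabulated bandwidth and peaking values exactly.

```python
import cmath
import math


def loop_gain(f_hz, sigma_ui, phug, frug, t_upd=2e-9, k_mv=3.0,
              dpc_bits=9, latency_updates=0.0):
    """Assumed s-domain loop gain of the DPLL at frequency f_hz."""
    k_pd = 1.0 / (sigma_ui * math.sqrt(2.0 * math.pi))   # PD gain, per UI
    k = k_pd * k_mv / (1 << dpc_bits)                     # gain per update
    st = 2j * math.pi * f_hz * t_upd                      # s * Tupd
    delay = cmath.exp(-st * latency_updates)              # digital latency
    return k * (phug / st + frug / st ** 2) * delay


def bandwidth_and_peaking(**loop):
    """Sweep 1 kHz to 100 MHz; return (-3 dB bandwidth in Hz, peaking in dB)."""
    freqs = [1e3 * 10 ** (5 * i / 5000) for i in range(5001)]
    mags = []
    for f in freqs:
        l = loop_gain(f, **loop)
        mags.append(abs(l / (1 + l)))                     # closed-loop |H|
    peaking_db = 20 * math.log10(max(mags))
    bw = next((f for f, m in zip(freqs, mags) if m < 1 / math.sqrt(2)), None)
    return bw, peaking_db


if __name__ == "__main__":
    for phug in (1 / 8, 1):
        bw, pk = bandwidth_and_peaking(sigma_ui=0.03, phug=phug, frug=1 / 4096)
        print(f"phug = {phug}: BW = {bw / 1e6:.2f} MHz, peaking = {pk:.2f} dB")
```

Raising `latency_updates` above zero shows how the added exponential term eats phase margin, which mostly affects the high-bandwidth settings.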
CDR Frequency Response
• DPLL bandwidth and peaking are sensitive to CDR update rate and jitter.
• For fixed σj, Tupd, bandwidth can be extended and peaking suppressed by increasing phase update gain (phug).
• Serial data link jitter should be empirically observed, and phug and frug values should then be chosen to yield desired bandwidth and peaking.
CDR Frequency Response (cont’d)
σj = 0.01 UI, Tupd = 2 ns
phug   frug     BW (MHz)   peaking (dB)
1/8    1/4096    3.012     0.4267
1/8    1/2048    3.195     0.7674
1/8    1/1024    3.373     1.0672
1/8    1/512     3.545     1.3393
1/4    1/4096    5.832     0.1234
1/4    1/2048    5.925     0.2325
1/4    1/1024    6.018     0.3339
1/4    1/512     6.111     0.4298
1/2    1/4096   11.903     0.0334
1/2    1/2048   11.95      0.0648
1/2    1/1024   11.996     0.095
1/2    1/512    12.043     0.1242
1      1/4096   25.539     0.0087
1      1/2048   25.562     0.0172
1      1/1024   25.585     0.0255
1      1/512    25.608     0.0337
σj = 0.03 UI, Tupd = 2 ns
phug   frug     BW (MHz)   peaking (dB)
1/8    1/4096    1.115     1.0568
1/8    1/2048    1.281     1.7968
1/8    1/1024    1.432     2.4037
1/8    1/512     1.569     2.9265
1/4    1/4096    1.968     0.3312
1/4    1/2048    2.06      0.6016
1/4    1/1024    2.151     0.8421
1/4    1/512     2.239     1.062
1/2    1/4096    3.833     0.0943
1/2    1/2048    3.88      0.1787
1/2    1/1024    3.927     0.2576
1/2    1/512     3.973     0.3326
1      1/4096    7.756     0.0253
1      1/2048    7.78      0.0492
1      1/1024    7.803     0.0722
1      1/512     7.826     0.0946
σj = 0.05 UI, Tupd = 2 ns
phug   frug     BW (MHz)   peaking (dB)
1/8    1/4096    0.736     1.5644
1/8    1/2048    0.886     2.5744
1/8    1/1024    1.017     3.3691
1/8    1/512     1.133     4.0333
1/4    1/4096    1.213     0.5144
1/4    1/2048    1.304     0.9142
1/4    1/1024    1.391     0.4029
1/4    1/512     1.474     0.5161
1/2    1/4096    2.301     0.151
1/2    1/2048    2.347     0.2822
1/2    1/1024    2.394     0.4029
1/2    1/512     2.44      0.5161
1      1/4096    4.586     0.0413
1      1/2048    4.609     0.0795
1      1/1024    4.633     0.1161
1      1/512     4.656     0.1514
Effect of Loop Latency
[Plot panels: σrj = 0.03 UI, phug = 2^-3, frug = 2^-12; σrj = 0.03 UI, phug = 1, frug = 2^-12]
• Low loop bandwidth settings are relatively insensitive to increases in digital latency.
• High loop bandwidth settings see a severe degradation in phase margin.
• Bandwidth and peaking values shift – be careful!
• Lower phase margin causes the loop to have higher quality factor – bad for jitter transfer.
• Limit cycle jitter worsens due to lengthened phase correction time.
Jitterless CDR Behavior
• A jitterless CDR does not obey linear loop dynamics at all.
• A 5b step in ϕdata causes the DPLL to slew towards the lock point.
• The lock point is characterized by a 1 lsb prbs dithering sequence, which is known as “limit cycle” behavior.
• Dithering is caused by the full-scale meta-stable output of the !!PD, and characterized by σ!!pd.
• Dither magnitude will grow beyond 1 lsb if loop latency increases – another reason to retime sparingly.
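The slewing and limit-cycle behavior can be reproduced with a deliberately simplified model: a first-order loop (phase path only, unity gain), a jitter-free input with a transition in every update, and a pure pipeline delay for the loop latency. These simplifications, the 32.5 lsb step size and the function names are assumptions made for the sketch.

```python
from collections import deque


def simulate_limit_cycle(step_lsb, latency, n_updates=400):
    """Recovered clock phase (in DPC lsb) after a phase step at the input."""
    phi_clk = 0
    in_flight = deque([0] * latency)   # decisions still in the pipeline
    trace = []
    for _ in range(n_updates):
        # Jitterless bang-bang decision: full-scale +1 or -1, never 0.
        decision = 1 if step_lsb - phi_clk > 0 else -1
        in_flight.append(decision)
        phi_clk += in_flight.popleft()  # apply the oldest pending decision
        trace.append(phi_clk)
    return trace


def dither_peak_to_peak(trace, tail=64):
    """Peak-to-peak dither over the last `tail` updates, after lock."""
    settled = trace[-tail:]
    return max(settled) - min(settled)


if __name__ == "__main__":
    for latency in (0, 1, 2):
        trace = simulate_limit_cycle(32.5, latency)
        print(f"latency = {latency}: dither = {dither_peak_to_peak(trace)} lsb p-p")
```

In this simplified model the loop slews one lsb per update until it crosses the lock point, then dithers by 1 lsb peak-to-peak with no latency and by progressively more as pipeline stages are added.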
Jittered CDR Behavior
• 4b step applied to ϕdata at CDR input.
• Low jitter condition shows response is dominated by non-linear slewing. Very close to the lock point, the phase error falls within the linear range of the !!PD and the response tracks the s-domain model.
• High jitter condition shows response closely tracks the s-domain model because majority of input step is contained within the linear range of the !!PD.
• Important observation: phase / frequency tracking capability is band-limited.
CDR Jitter Tolerance
• Jitter Tolerance (JT) testing is a frequently-used SERDES compliance test.
• Measure of sinusoidal jitter tracking capability.
• Compliant CDRs do not penetrate the mask definition.
• Mask typically has a corner frequency of rb / 1667.
• Low-jitter environments can meet the mask specification over a variety of loop settings, as shown.
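For orientation, the rb / 1667 corner can be evaluated for the line rates discussed in these slides. Only the corner frequency is computed; the mask amplitudes are standard-specific and are not reproduced here. The function name is illustrative.

```python
def jt_corner_hz(baud_rate_hz):
    """Jitter tolerance mask corner frequency, rb / 1667."""
    return baud_rate_hz / 1667.0


if __name__ == "__main__":
    for rate in (5.0e9, 6.0e9):
        print(f"{rate / 1e9:.1f} Gbps -> corner = {jt_corner_hz(rate) / 1e6:.2f} MHz")
```

The corner lands at a few MHz for 5 to 6 Gbps links, which is the same range as the loop bandwidths under discussion, so the choice of phug directly affects mask margin.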
CDR Jitter Tolerance (cont’d)
• As jitter increases, the lower bandwidth settings start approaching the mask boundary. For high jitter conditions, some settings cause mask violations.
• For a given level of RX jitter, the DPLL requires relatively low bandwidth settings to maintain good phase margin and jitter transfer performance, but jitter tolerance requires higher bandwidth settings.
• Conclusion: jitter transfer and jitter tolerance are conflicting design goals and must be balanced.
References
1. J. Lee et al., “Analysis and Modeling of Bang-Bang Clock and Data Recovery Circuits”, IEEE Journal of Solid-State Circuits, Vol. 39, No. 9, September 2004, pp. 1571-1580.
2. J. Sonntag et al., “A Digital Clock and Data Recovery Architecture for Multi-Gigabit/s Binary Links”, IEEE Journal of Solid-State Circuits, Vol. 41, No. 8, August 2006, pp. 1867-1875.
3. P. K. Hanumolu et al., “A Wide-Tracking Range Clock and Data Recovery Circuit”, IEEE Journal of Solid-State Circuits, Vol. 43, No. 2, February 2008, pp. 425-439.