Link Technology and its Application
Deog-kyoon Jeong
Seoul National University
Microprocessor Architecture Lab.
2
Outline
• Introduction
• Problems and Circuit Techniques
• Oversampling vs. Tracking receiver
• Gigabit Ethernet PHY
• Chip-to-chip/Bus Interface
• Conclusion
3
Introduction
• Increasing demand for high speed interconnect
  – Large data size in multimedia environment
  – Resource sharing through computer network
  – Data communication between high speed ICs
• Applications
  – Digital video interface for monitors
    • PanelLink
  – High bandwidth memory interface
    • Rambus
  – I/O sub-systems for computers
    • Network: Gigabit Ethernet, Fibre Channel
    • P1394
4
Why is Chip-to-Chip Communication so Important?
❑ Rent's Rule for Microprocessors: N_p = K_p N_g^{\beta}
    N_p = Number of Signal Pins
    N_g = Number of Gates
    K_p = 0.82, \beta = 0.45 for Microprocessors
❑ For Pentium Pro, 387 pins with 5.5M Transistors (474 pins according to Rent's Rule)
❑ In year 2000, 850 pins with 20M Transistors
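The pin counts quoted for Rent's Rule can be checked with a short calculation. This is only a sketch: the slide gives transistor counts, not gate counts, so the conversion factor of 4 transistors per gate below is an assumption chosen because it lands close to the quoted figures.

```python
def rent_pins(gates, kp=0.82, beta=0.45):
    """Signal-pin estimate from Rent's Rule: Np = Kp * Ng**beta."""
    return kp * gates ** beta

# Assumption (not stated on the slide): roughly 4 transistors per gate.
TRANSISTORS_PER_GATE = 4

pentium_pro_pins = rent_pins(5.5e6 / TRANSISTORS_PER_GATE)
year_2000_pins = rent_pins(20e6 / TRANSISTORS_PER_GATE)

print(round(pentium_pro_pins), round(year_2000_pins))
```

Under that assumption the two estimates come out near the 474 and 850 pins given on the slide.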
5
Basic Electronics
• Transmitter
  – convert bits to an analog voltage
• Channel (Wire)
  – propagate the voltage
• Receiver
  – convert the analog voltage to bits
6
Problems and Circuit Techniques
• Circuit speed
• Interconnect Integrity
  – Pad capacitance
  – Bonding wire
  – Package trace
  – PCB trace
• PLL & DLL jitter in the receiver
• Wire limits
7
Circuit Speed and Technology Scaling
• Inverter delay scales roughly with channel length L
  – Delay of most digital gates tracks that of an inverter
• Technology Development
  – 0.8um CMOS with 5V : 500Mbps ?
  – 0.5um CMOS with 3.3V : 800Mbps ?
  – 0.35um CMOS with 3.3V : 1.5Gbps ?
  – 0.25um CMOS with 2.5V : 2.5Gbps ?
8
Multiplexing Transmitter
• Typical Transmitter
• Multiplexing Transmitter
Can operate at lower speed
9
Multiplexing Transmitter
• Maximize the available bandwidth by multiplexing at the output
• The bandwidth limit is set at the point where multiplexing takes place
• All circuits other than the output stage escape the circuit speed limitation
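The idea of multiplexing at the output can be sketched behaviorally: each data bit is driven onto the line during its own clock phase, so only the final stage sees the full bit rate. The 8 phases match the clk0 to clk7 labels in this deck; the 1 Gb/s bit rate and the function names are illustrative assumptions, not values from the slides.

```python
def serialize(words, n_phases):
    """Interleave parallel words: bit k of each word goes out on clock phase k."""
    stream = []
    for word in words:
        assert len(word) == n_phases
        for phase in range(n_phases):      # one output bit per clock phase
            stream.append(word[phase])
    return stream

def deserialize(stream, n_phases):
    """Demultiplex: sampler k takes every n_phases-th bit, starting at bit k."""
    lanes = [stream[phase::n_phases] for phase in range(n_phases)]
    return [list(bits) for bits in zip(*lanes)]

N_PHASES = 8
BIT_RATE = 1.0e9                           # illustrative line rate in bit/s
words = [[1, 0, 1, 1, 0, 0, 1, 0],
         [0, 1, 1, 0, 1, 0, 0, 1]]

stream = serialize(words, N_PHASES)
recovered = deserialize(stream, N_PHASES)
internal_clock_hz = BIT_RATE / N_PHASES    # rate seen by all non-output circuits

print(stream)
print(internal_clock_hz)
```

With 8 phases, everything except the output multiplexer runs at one eighth of the line rate.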
10
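Output Multiplexing
[Figure: timing diagram of clock phases clk0 to clk3 against data bits D0 to D8; output multiplexer in which each data input D0 ... D7 is gated by a pair of adjacent clock phases (clk0/clk1 for D0, clk1/clk2 for D1, ..., clk7/clk0 for D7) and drives the differential outputs out and out_b]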
11
Typical Receiver
• Amplifier restores small swing input
  – Amplifier must limit on each bit
  – If it does not limit, next bit will depend on previous bit → ISI
  – Single amplifier will limit performance
12
Demultiplexing Receiver
• Sampling and Amplifying
• A regenerative amplifier has the highest gain-bandwidth product of any amplifier
• Demultiplexing at sampling switches
13
Demultiplexing Receiver
• Each receiver
  – Sample, amplify, reset
  – If reset is complete, no ISI possible
[Block diagram: input Din feeds parallel Receiver blocks clocked by clk0, clk1, ..., clk7; a Data Alignment block outputs D[0:7]]
14
Input Receiver
[Circuit diagram: clocked input receiver; labels: Din, clk, s, sb, ssb, out, out_b]
15
Signal Reflection & Wave Diagram
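[Figure: source VS with series resistance RS driving a line of impedance Z0 terminated by RL; wave diagram showing the successive waves V1+, V1-, V2+, V2-]
V_1^+ = V_S \frac{Z_0}{R_S + Z_0}
V_1^- = V_1^+ \frac{R_L - Z_0}{R_L + Z_0}
V_2^+ = V_1^- \frac{R_S - Z_0}{R_S + Z_0}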
16
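Unterminated Line
[Figure: source VS driving an unterminated line of impedance ZO and delay td; waveforms at probe point P with time marks 0, td/2, td, 3td/2, 2td, 5td/2 and level VS]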
17
Parallel Termination
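RL = ZO
[Figure: source VS driving a line of impedance ZO and delay td, terminated by RL = ZO; waveforms at probe point P with time marks 0, td/2, td, 3td/2, 2td and level VS]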
18
Series Termination
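RS = ZO, far end open
[Figure: source VS with series resistance RS = ZO driving a line of impedance ZO and delay td; waveforms at probe point P with time marks 0, td/2, td, 3td/2, 2td and levels VS/2 and VS]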
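The three termination cases can be compared with a small bounce-diagram calculation at the far end of the line. This is a sketch for an ideal lossless line using the standard reflection coefficients (R - Z0)/(R + Z0) at each end; the 50 ohm impedance and the 10 ohm source resistance of the unterminated case are illustrative assumptions.

```python
import math

def far_end_voltages(vs, rs, rl, z0, n_arrivals=4):
    """Far-end voltage after each wave arrival (times td, 3td, 5td, ...)."""
    gamma_l = 1.0 if math.isinf(rl) else (rl - z0) / (rl + z0)
    gamma_s = (rs - z0) / (rs + z0)
    wave = vs * z0 / (rs + z0)            # first forward wave launched into the line
    v_load = 0.0
    history = []
    for _ in range(n_arrivals):
        v_load += wave * (1.0 + gamma_l)  # incident plus reflected wave at the load
        history.append(v_load)
        wave *= gamma_l * gamma_s         # forward wave after one more round trip
    return history

Z0 = 50.0  # illustrative line impedance in ohms
unterminated = far_end_voltages(1.0, 10.0, math.inf, Z0)  # low-impedance driver, open end
parallel = far_end_voltages(1.0, 0.0, Z0, Z0)             # RL = Z0
series = far_end_voltages(1.0, Z0, math.inf, Z0)          # RS = Z0, open end

print(unterminated)
print(parallel)
print(series)
```

In this model the unterminated line overshoots and rings, while both terminated cases settle at VS on the first arrival.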
19
Signal Transmission Methods
• Parallel Termination with Voltage Drive (RTERM = Zo)
• Series Termination with Voltage Drive (RTERM = Zo)
[Figure labels for each circuit: V, Zo, RTERM = Zo]
20
Signal Transmission Methods
• Parallel Termination with Current Drive (RTERM = Zo)
• Source Termination with Voltage Drive (RTERM = Zo)
[Figure labels for each circuit: I, Zo, RTERM = Zo]
21
Power Issues
• Signal Power in the Transmission Line
  – Current Drive with Parallel Termination: V_{dd} \cdot I_{SIGNAL} \cdot Pr(1)
  – Voltage Drive with Series Termination: C \cdot V_{dd} \cdot V_{SWING} \cdot Pr(transition)
  – Voltage Drive with Parallel Termination: V_{dd}^2 / Z_o
[Figure labels: Zo, 2 Zo, Vdd]
22
Interconnection Topology
• Point-to-Point vs Bus Structure
  – Stub introduces additional impedance discontinuity
  – Even a 5-cm stub causes excess distortion in a 500Mbps signal
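A rough calculation suggests why a stub of only 5 cm matters at 500 Mbps: the reflection from the open end of the stub returns to the bus after a delay that is a sizeable fraction of one bit time. The propagation velocity below (about 15 cm/ns, a common rule of thumb for FR4 traces) is an assumption, not a figure from the slides.

```python
BIT_RATE = 500e6           # bit/s, from the slide
STUB_LENGTH_M = 0.05       # 5 cm stub, from the slide
VELOCITY_M_PER_S = 1.5e8   # assumed signal velocity on an FR4 trace

bit_time = 1.0 / BIT_RATE                          # 2 ns
round_trip = 2 * STUB_LENGTH_M / VELOCITY_M_PER_S  # down the stub and back
fraction_of_bit = round_trip / bit_time

print(f"round trip {round_trip * 1e9:.2f} ns, {fraction_of_bit:.0%} of a bit time")
```

Under this assumption the stub reflection arrives about a third of a bit time late, which is well inside the data eye.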
23
Point-to-Point and Bus Structure
• A stub is a serious problem in a bus structure
24
Signal Driver Circuits
• Unidirectional vs Bidirectional
  – 1x BW with Simple Circuit vs 2x BW and Circuit Complexity
• Differential vs Single-Ended
  – High SNR with Robustness vs Low Pin Count with Accurate Reference
• Voltage Driver vs Current Driver
  – ECL-like vs Open Drain
25
Clock Receiving Schemes and Circuits
• PLL
  – High Order → Stability problem
  – Frequency synthesis easy
• DLL
  – Unconditionally stable
  – Frequency synthesis problematic
  – Narrow operating frequency range
26
Block Diagram & Basic Concept of PLL
[Block diagram: Phase Frequency Detector (inputs REF_CLK at fREF and VCO_CLK at fVCO; outputs UP and DOWN) → Charge Pump → Loop Filter → Frequency Control → Voltage Controlled Oscillator → VCO_CLK]
Basic Concept: A clock generator whose frequency and phase are exactly synchronized to the reference clock
27
VCO
[Block diagram: Phase Frequency Detector (fREF and fVCO inputs; outputs UP and DOWN) → Charge Pump → Loop Filter → Vctrl → Delay Cells of the VCO → VCO_CLK]
fVCO = fREF
28
VCO Characteristics
[Plot: fVCO versus Vctrl; annotation: Linearity]
29
Signal Waveforms of PLL
[Timing diagram: REF_CK, VCO_CK, SETB, UP, DOWN, with the phase difference marked]
When VCO_CLK is slower than REF_CLK, the duration of UP is larger than that of DOWN
30
Signal Waveforms of PLL
[Timing diagram: REF_CK, VCO_CK, SETB, UP, DOWN, with the phase difference marked]
When VCO_CLK is faster than REF_CLK, the duration of DOWN is larger than that of UP
31
Zero Delay Buffer
[Figure: two arrangements in which an oscillator supplies REF_CLK to chip0 and chip1; legend: PLL/DLL]
32
Pre-Scaler
[Block diagram: Phase Frequency Detector (inputs REF_CLK at fREF and the VCO clock divided by N; outputs UP and DOWN) → Charge Pump → Loop Filter → Frequency Control → Voltage Controlled Oscillator → VCO_CLK at fVCO]
fVCO / N = fREF, so fVCO = N · fREF
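The lock condition fVCO = N · fREF can be illustrated with a phase-domain behavioral model that is updated once per reference cycle. This is a sketch, not the circuit in these slides: the proportional and integral gains a and b, the free-running frequency, and the function name are all illustrative assumptions.

```python
def simulate_pll(f_ref, n_div, f_free, a=0.5, b=0.05, cycles=300):
    """Behavioral PLL with a divide-by-N feedback path.

    a, b: normalized proportional and integral loop gains (illustrative values).
    Returns the VCO frequency after the given number of reference cycles.
    """
    kp = a * n_div * f_ref
    ki = b * n_div * f_ref
    err = 0.0      # accumulated phase error in reference cycles (positive: VCO is late)
    integ = 0.0    # loop-filter integrator state
    f_vco = f_free
    for _ in range(cycles):
        # A positive error plays the role of a longer UP pulse and speeds the VCO up.
        f_vco = f_free + kp * err + ki * integ
        integ += err
        # The reference advances one cycle; the divided VCO advances f_vco / (N * f_ref).
        err += 1.0 - f_vco / (n_div * f_ref)
    return f_vco

F_REF = 125e6
N_DIV = 10
f_locked = simulate_pll(F_REF, N_DIV, f_free=1.0e9)
print(f_locked / 1e9, "GHz")
```

With these gains the loop is stable and the VCO settles at N times the reference, here 1.25 GHz from a 125 MHz reference.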
33
Multi-Phase Clocking
[Block diagram: PLL locked to REF_CLK; its VCO provides the clock phases VCO_CLK0, VCO_CLK1, VCO_CLK2, VCO_CLK3]
34
Block Diagram of DLL
[Block diagram: REF_CLK passes through a Voltage Controlled Delay Chain (delaycell0, delaycell1, ..., delaycellN-1, each with currents Idummy and Ivctrl) to produce DLL_CLK; Phase Detector (outputs UP and DOWN) → Charge Pump → Loop Filter → Control Voltage (Vctrl)]
35
[Timing diagrams (a) and (b): REF_CLK, DLL_CLK, UP, DOWN]
36
Wire Frequency Limits
• Not perfect conductor
  – Series resistance
    • Skin Effect
  – Dielectric loss
• Loss depends on:
  – Frequency
    • Effective resistance larger at high frequencies
  – Aspect ratio
    • Length/width
  – Materials
    • Teflon better than FR4
    • Conductor material
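The frequency dependence of the series resistance can be made concrete with the standard skin-depth formula, δ = sqrt(ρ / (π f μ)). The sketch below assumes a copper conductor (resistivity about 1.68e-8 ohm·m); the slides do not name the conductor material.

```python
import math

def skin_depth(freq_hz, resistivity=1.68e-8, mu_r=1.0):
    """Skin depth in meters; the default resistivity is an assumed value for copper."""
    mu0 = 4e-7 * math.pi
    return math.sqrt(resistivity / (math.pi * freq_hz * mu0 * mu_r))

depth_100mhz = skin_depth(100e6)
depth_1ghz = skin_depth(1e9)

print(f"{depth_100mhz * 1e6:.2f} um at 100 MHz, {depth_1ghz * 1e6:.2f} um at 1 GHz")
```

Because current is confined to a layer about one skin depth thick, the effective resistance grows roughly with the square root of frequency.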
37
Cable Properties
Transmission through 6 or 12 meters of RG55U cable
38
Effect of Loss
Low pass filtering effect
Filtering effect leads to intersymbol interference (ISI)
• Long sequence of ‘1’ will get to full rail
39
Effect on Data Eyes
Bandwidth of channel = 80 ps = 2 GHz
[Plot: data eyes; x-axis: bit time (ps)]
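The low-pass effect on the data eye can be sketched with a single-pole channel model. The sketch assumes that the 80 ps figure is an RC time constant (1/(2π · 80 ps) is about 2 GHz) and uses an illustrative bit time of 100 ps; neither assumption is confirmed by the slides.

```python
import math

def lowpass_response(bits, bit_time, tau, samples_per_bit=20):
    """Pass an ideal NRZ waveform through a single-pole low-pass filter.

    Returns the filter output at the end of each bit period.
    """
    alpha = 1.0 - math.exp(-bit_time / (samples_per_bit * tau))
    v = 0.0
    out = []
    for b in bits:
        for _ in range(samples_per_bit):
            v += alpha * (b - v)     # exact step response for a piecewise-constant input
        out.append(v)
    return out

TAU = 80e-12         # assumed channel time constant
BIT_TIME = 100e-12   # illustrative bit time
bits = [0, 0, 1, 0, 0, 1, 1, 1, 1, 1]
levels = lowpass_response(bits, BIT_TIME, TAU)

isolated_one = levels[2]   # a single '1' surrounded by '0's
long_run = levels[9]       # the last of five consecutive '1's
print(round(isolated_one, 3), round(long_run, 3))
```

The isolated '1' stops well short of the rail while the long run reaches it, and that history dependence is what closes the eye.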
40
Overcoming Wire Limitation
• Equalization
  – Output data is run through an FIR filter, which is the inverse filter of the cable
  – Use binary weighted current source transistors
• Multibit symbols
  – Transmit more than 1 bit per symbol
  – Encode bits in amplitude level
• Channel coding
  – Transition maximization, DC balance
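Transmitter equalization can be illustrated with a toy channel whose inverse is exactly a two-tap FIR filter. The channel model y[n] = (1 - a)·x[n] + a·y[n-1] and the value a = 0.4 are assumptions made for this sketch; a real cable needs a measured response and usually more taps.

```python
def channel(x, a):
    """Toy low-pass channel: y[n] = (1 - a) * x[n] + a * y[n-1]."""
    y, prev = [], 0.0
    for s in x:
        prev = (1.0 - a) * s + a * prev
        y.append(prev)
    return y

def pre_emphasis(d, a):
    """Two-tap FIR inverse of the toy channel: x[n] = (d[n] - a * d[n-1]) / (1 - a)."""
    out, prev = [], 0.0
    for s in d:
        out.append((s - a * prev) / (1.0 - a))
        prev = s
    return out

A = 0.4
data = [0, 1, 0, 0, 1, 1, 1, 0, 1, 0]

unequalized = channel(data, A)
equalized = channel(pre_emphasis(data, A), A)
worst_error = max(abs(r - d) for r, d in zip(equalized, data))

print([round(v, 3) for v in unequalized])
print([round(v, 3) for v in equalized])
```

The pre-emphasized waveform swings beyond the nominal 0 to 1 range, so a real driver with limited swing would scale it down and trade amplitude for reduced ISI.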
41
Equalization
Example of transmitter equalization
Relatively less ISI
Relatively larger ISI
42
Multiple Bits per Symbol
4 PAM
[Figure: multiple-bit eye]
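Encoding two bits per symbol in four amplitude levels can be sketched as a simple mapping. The Gray-coded assignment below is an assumption for illustration; the slide does not say which bit-to-level mapping is used.

```python
# Assumed Gray mapping: adjacent levels differ in exactly one bit.
GRAY_LEVELS = {(0, 0): 0, (0, 1): 1, (1, 1): 2, (1, 0): 3}

def pam4_encode(bits):
    """Map pairs of bits to one of four amplitude levels (0 to 3)."""
    assert len(bits) % 2 == 0
    return [GRAY_LEVELS[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]

def pam4_decode(symbols):
    """Inverse mapping from amplitude levels back to bits."""
    inverse = {level: pair for pair, level in GRAY_LEVELS.items()}
    bits = []
    for s in symbols:
        bits.extend(inverse[s])
    return bits

payload = [0, 0, 0, 1, 1, 1, 1, 0]
symbols = pam4_encode(payload)
print(symbols)
```

The symbol rate is half the bit rate, at the cost of three stacked eyes that each have a third of the full swing.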
43
Oversampling vs. Tracking Receiver
• Tracking system
  – Architecture
  – Problems
• Oversampling system
  – Architecture
  – Merits & limitation
• Performance comparisons
44
Data Receiving Schemes and Circuits
• 1X sampling
  – Most systems
  – RAMBUS
• 2X over-sampling
  – GigaBlaze(TM) SerialLink: LSI Logic
• 3X over-sampling
  – PanelLink: Silicon Image
45
3X over-sampling Architecture
• Digital Phase Tracking
  – No Storage Capacitance => Fast Acquisition
• Overcoming circuit skew and channel skew
• Power Consumption
[Figure: (a) data sampled as 0 0 0 1 1 1 0 0 0; (b) data sampled as ? 0 ? ? 1 ? ? 0 ?]
46
Tracking System
[Block diagram: Transmitter (TX data, TX clock) → Transmission media → Receiver with Clock recovery (PLL), Data sampler, and Serial to parallel converter → RX data; Recovered TX clock, ÷N, System clock]
Clock recovery circuit: recovers the clock from the high-speed signal received through the transmission media
Clock recovery and data sampler: operate at the data transfer rate
47
Data/Clock Recovery in Tracking Receiver
UP/DOWN signals are used for frequency control in PLL
[Circuit and timing diagram: two D flip-flops (DFF) driven by the serial data and the recovered clock; signals UP, DOWN, and recovered data; example serial data 0 1 1 1 0 1, with the uncertain region of data sampling marked]
Tracking System�Û� ÷ÛÏz 3¬ Û�
– Serial data� ��¿ �S |o(¯LÛ 0 ÏS 1)� ´� 3¬� 3¬»Ë× G�Ô
– Serial data� `� Î Intersymbol Interference� 7� 3¬� jitter Û�
z /3ï ´�– ´�Û 3¬» £Û� /3ï¿ �Ë�×S �Ï3 £C:
• 3¬� ���Ã(rise/fall time)3 /3ï ¬� ��« / |o� Û��S /3ï �Ë�� �Ï ��
• /3ï �Ë«� logic threshold �� (ç�, Ã�, ´Ô °)
• K�; �WS �¯ �Ã
• ïß� �WS �¯ �Ã
49
Oversampling System
3¬ Û�K�S×�§� �c� 3¬��ïTX clock
TransmitterTXdata
Transmission media
Multi-phasePLL
Over-sampler
RXdata
RX clock
Receiver
Parallelconverter
Digitalclock/data
recovery Systemclock
M 1M
The TX clock and RX clock may differ in frequency by about 100 ppm: plesiochronous system
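The consequence of a 100 ppm frequency offset can be put in numbers: the receiver's sampling phase drifts by one full bit every 1/offset bits, so a digital phase tracker has to shift its chosen sample correspondingly often. The 3x oversampling ratio and the 1.25 Gbaud rate below are taken from other slides in this deck and are assumed to apply here.

```python
OFFSET = 100e-6           # 100 ppm TX/RX frequency difference
OVERSAMPLING_RATIO = 3    # assumed: samples per bit
BAUD = 1.25e9             # assumed: symbol rate in baud

bits_per_slip = 1.0 / OFFSET                              # bits until one full bit of drift
bits_per_phase_step = bits_per_slip / OVERSAMPLING_RATIO  # bits between one-sample shifts
time_per_slip = bits_per_slip / BAUD                      # seconds until one bit of drift

print(bits_per_slip, bits_per_phase_step, time_per_slip)
```

At these values the sampling phase must move by one sample position roughly every 3,300 bits, which is slow compared with the bit rate.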
50
Data/Clock Recovery in Oversampling Receiver
Serial data:               0 1 1 1 0 1
Multi-phase oversampling
Oversampled data:          1 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 1 1
Data transition positions
Recovered data:            0 1 1 1 0 1
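One way to go from oversampled data to recovered data is to vote on where transitions fall within each group of samples and then keep the sample farthest from that boundary. The sketch below applies this to the oversampled sequence shown on this slide, assuming 3x oversampling; the voting scheme is an illustration, not necessarily the algorithm used in the actual chip.

```python
def recover_bits(samples, ratio=3):
    """Pick one sample per bit, centered between the most common transition phases."""
    votes = [0] * ratio
    for i in range(1, len(samples)):
        if samples[i] != samples[i - 1]:
            votes[i % ratio] += 1          # phase of the first sample after a transition
    boundary = votes.index(max(votes))     # most likely bit-boundary phase
    center = (boundary + ratio // 2) % ratio
    # Sketch only: a leading partial bit is not treated specially.
    return [samples[i] for i in range(center, len(samples), ratio)]

oversampled = [1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1]
recovered_bits = recover_bits(oversampled)
print(recovered_bits)
```

Three of the four transitions vote for the same boundary phase, so the one jittered edge near the end does not change the selected samples.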
� /3ï [3 �Ï� k3�Û /3ï; ß�' /3ï ß�Ã�¿ ã7|û�� Ã��; 3�ç
51
»�·� /3ï/3¬ ´� ïß: d§Ï
z dÏ
– Serial data� ó¿�K ´�Û 3¬ÿ ,� ;Ô
– �Ë� 3¬» £Û� »�·�×S �Ï� ç3� ó¿: »�·� �Ë�Ãh3 ÐS
z §Ï: »�·�� ï7 /3ï ´� �Ï ãç(δ)
– S: »�·� �Ë� Ãh
z 3o�
– »�·�ç� QC: ��§s� ´`� \¿
0 ≤ δ ≤ S/2
52
g� £�� t� ðd ��
Parameter                 Default value
Data rate                 1 Gbaud
Data stream               Pseudo random number sequence (no coding)
Pulse transition time     200 ps
SNR at receiver side      12 dB
Transmitter jitter        70 ps
Receiver jitter           70 ps
Channel bandwidth         667 MHz
¿Ô:- �Û� �c�ÿ �;� 3¬ jitter � P» »Ë× ¿¬¬� ¿�- �Ï� ¿¬¬ÿ 3�À7 ¿¿¬ t» �ï xð� ¿�- g� jitterÿ noiseS Gaussian random number� g7�- TX 3¬» RX 3¬� »Ë× ç3S 100ppm- ðdÿ BER(Bit error rate)� àÔ
53
ðd ��: »�·�ç
[Plot: BER (1.0E-06 to 1.0E-02, log scale) versus oversampling ratio (3 to 10); curves BS_i, LJ_i, HS_i, BS_f, LJ_f, HS_f]
Prefix: �P /3- BS: 12dB-70ps- LJ: 12dB-30ps- HS: 18dB-70ps
Suffix: ×�ï Ô�- i: 3�À7 /3ï ´�- f: £ÛÀ7 /3ï ´�
* o�- ÿ×ç� »�·�¿�×ç³£ ßk- `�3 �ïÀ7 |o�S 5ï 3��»�·�S ó�'
54
ðd ��: »Ë× ç3
[Plot: BER (1.0E-04 to 1.0E+00, log scale) versus frequency difference [ppm] (-1600 to 1600); curves x3_i, x4_i, x5_i, x3_f, x4_f, x5_f]
Prefix: »�·�ç- x3: 3ï »�·�- x4: 4ï »�·�- x5: 5ï »�·�
Suffix: ×�ï Ô�- i: 3�À7 /3ï ´�- f: £ÛÀ7 /3ï ´�
* o�- ÿ×ç »�·�ç; �,£ÛÀ7 /3ï ´�ÿ»Ë× ç3� �ÿ�«ðd ¿� (èdðÿ ��)
55
ðd ��: `�
[Plot: BER (1.0E-06 to 1.0E+00, log scale) versus SNR [dB] (0 to 20); curves x3_i, x5_i, x3_f, x5_f, T_0%, T_10%, T_20%]
×�ï Ô�- T_0%: �Ë� ãç¿0%7 Tracking �c�- T_10%: �Ë� ãç¿10%7 Tracking �c�
* �Ë� ãç:- 0%: ¿d BER3 mÿÏ�Û �Ë�- 50%: /3ï� [3�Ï�Û �Ë�
* o�3ïÿ 5ï� »�·��c�ÿ 12%ÿ 7%��Ë� ãç; ÕStracking �c�� 3¸
56
Gigabit Ethernet PHY
• Major circuits
  – Multiphase PLL (phase-locked loop)
  – Serializer
  – Line driver
  – On-chip terminator
• Block diagram
• Chip
57
Multiphase PLL
[Block diagram: System clock → Phase-frequency detector (outputs Up and Down) → Charge pump → Loop filter]
Multiphase clocks: 0 15 1 16 13 28 14 29
�¯ �à Ûs �÷
Differentialdelay cell
58
Serializer
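[Circuit diagram: differential serializer with outputs Dout+ and Dout- and transistors M1, M2, M3; each data bit is gated by a pair of clock phases (d0 with ck4 and ck0, d1 with ck5 and ck1, ..., d9 with ck3 and ck9)]
- d0 ~ d9: 10-b parallel data
- ck0 ~ ck9: 10-phase clocks from multiphase PLL (10 of the 30 multiphase clock phases are used)
- Level shifter (logical high level)
- P-MOS resistive load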
59
Line Driver
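[Circuit diagrams: voltage mode and current mode line drivers with differential data inputs and outputs TX+ and TX-; the current mode driver includes a Current control block]
Voltage mode driver:
- Active pull-up/pull-down
- Driving current is not constant
Current mode driver:
- Constant current pull-down
- Resistive pull-up by load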
60
On-Chip Terminator
[Circuits: on-chip terminator on RX+ and RX- with N-MOS Control, P-MOS Control, and Resistance control; Center-tap termination and Vdd termination]
[Plot: I-V characteristics; current (-Io, 0, Io) versus voltage (Vl, Vm, Vh), showing current in N-MOS, current in P-MOS, and net current; linear region in P-MOS marked]
61
Gigabit Ethernet SERDES Chip Block
[Block diagram: blocks: 30-phase PLL, Serializer, On-chip Terminator, 30-phase oversampler, Phase Tracker, Frame aligner, Clock Selector; signals: 10b TX data, 125MHz TX clock, 1.25Gbaud TX signal, 1.25Gbaud RX signal, 10b RX data, RX clock]
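The numbers in this block diagram fit together arithmetically: a 125 MHz clock with 10-bit words gives 1.25 Gbaud, and 30 clock phases per 125 MHz period give three samples per bit. The sketch below only restates that arithmetic; the variable names are illustrative.

```python
TX_CLOCK_HZ = 125e6    # from the block diagram
WORD_BITS = 10         # 10b TX and RX data
PLL_PHASES = 30        # 30-phase PLL and oversampler

baud = TX_CLOCK_HZ * WORD_BITS                       # serial symbol rate
bit_time = 1.0 / baud                                # 800 ps
phase_spacing = 1.0 / (TX_CLOCK_HZ * PLL_PHASES)     # spacing between adjacent phases
samples_per_bit = PLL_PHASES // WORD_BITS            # oversampling ratio

print(baud, bit_time, phase_spacing, samples_per_bit)
```

The 30 phases are spaced about 267 ps apart, which corresponds to 3x oversampling of the 800 ps bit.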
62
Gigabit Ethernet SERDES Chip Photo
- Process: 3-metal 0.35µm CMOS
- Die size: 2.49 x 2.65 mm^2
- Package: 64-pin PQFP
- Total power consumption: 500mW
63
State-of-the-Art Pin Interfaces
Institution      Bandwidth    Signaling       Technology     Reference
Rambus           660Mbps      Open Drain      0.3um CMOS     ISSCC-96
IBM              1062Mbps     ECL             0.5um CMOS     ISSCC-95
Intel            2x450Mbps    Bidirectional   0.6um CMOS     ISSCC-95
Hitachi          2x300Mbps    Bidirectional   0.5um CMOS     ISSCC-95
Rambus           500Mbps      Open Drain      0.6um CMOS     ISSCC-94
Stanford         500Mbps      1V-CMOS         0.6um CMOS     Symp-94
Seoul NU         770Mbps      Open Drain      0.9um CMOS     Symp-94
NEC              125Mbps      GTL/LV-TTL      0.5um CMOS     Symp-93
Rambus, Intel    800Mbps      Open Drain      0.35um CMOS    ISSCC-98
Symbios          1.25Gbps     Open Drain?     0.5um CMOS?    ISSCC-97
Silicon Image    3x1.5Gbps    Open Drain      0.35um CMOS    ISSCC-98
64
RAMBUS System
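[System diagram: Clock Generator and CPU with DRAM0, DRAM1, ..., DRAM31 on a shared bus; SOut/Sin chain between devices; bus lines of impedance Z = Z0 terminated with R = Z0 to Vterm = 2.5; signals BusData[8:0] (9 lines), BusCtrl, BusEnable, ClockFromMaster, ClockToMaster; Vref = 2.2V; Gnd (8), Vdd (5)]
Data Bandwidth: 500Mb/s
Swing Voltage: 600mV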
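The voltage figures on this slide are mutually consistent if the line rests at Vterm when no driver is sinking current and drops by the full swing when one is, which puts Vref at the midpoint. That reading of the open-drain levels is an assumption, as is multiplying the per-line bandwidth by the nine BusData lines.

```python
V_TERM = 2.5          # termination voltage from the slide
V_SWING = 0.6         # 600 mV swing from the slide
V_REF = 2.2           # receiver reference from the slide
DATA_LINES = 9        # BusData[8:0]
RATE_PER_LINE = 500e6 # 500 Mb/s per line

v_high = V_TERM                 # assumed: driver off, line pulled to Vterm
v_low = V_TERM - V_SWING        # assumed: driver sinking current
midpoint = (v_high + v_low) / 2.0
aggregate_bits_per_s = DATA_LINES * RATE_PER_LINE

print(v_low, midpoint, aggregate_bits_per_s)
```

Under these assumptions the reference sits 300 mV from either level, and the nine lines together carry 4.5 Gb/s.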
65
RAMBUS signaling
[Block diagram: blocks: TX_DLL (Phase Detector, XOR), Output Driver, RX_DLL (Phase Detector), Input Receivers; signals: Bus Clock, Data Channel, Tclk, Rclk]
[Timing diagram: Bus Clock, Tclk, Rclk, and DATA DN, DN+1, DN+2, DN+3]
66
RAMBUS Output Driver
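[Circuit diagram: output driver clocked by Tclk and Tclk_b with 8 data inputs ordered 0 2 4 6 and 1 3 5 7; a 6-bit Current Control Register and Current Control Circuit select binary weighted transistors W, 2W, 4W, 8W, 16W, 32W driving the Pad]
[Signal levels: Vterm, Vref, Von; logic values “0” and “1”]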
67
RAMBUS Input Receiver
[Circuit diagram: clocked input receiver; labels: Data, Vref, Clock, Clock_b, Vbias, Out, Out_b]
68
Conclusion
z o�
– CMOS high speed interconnect� Oversampling ïT3h� dÏ
• £7 ´Ôû�� �7 3�ð
• Ó�÷ è�� �� �7 3¬ ´� dä
• �Ï �d� �_�· K�� ´ß� 7� �Ïà Ãì �K�
– ãw�+, �Óã Ã`, �L �gk 7ïW3c� 3h
z d� Ã�– ³§ïÿ »� ïï Ã� ïß §Û�: Pèd �L �Ï +ð» data
multiflexing ïT� kh�« »�ïï ïß� �W� t(
– ³§ï�c��Û s? Cc �ô +/ §Û�: Ã�÷Û Cc, �gk Cc, �c� Cc °� �L� §; Cc� ��