a 2.1 pj/bit, 8 gb/s ultra-low power in- package serial...
TRANSCRIPT
1
Po-Wei Chiu, Muqing Liu, Qianying Tang and Chris H. Kim
Department of Electrical and Computer Engineering
University of Minnesota, Minneapolis, MN, USA
A 2.1 pJ/bit, 8 Gb/s Ultra-Low Power In-Package Serial Link Featuring a Time-based
Front-end and a Digital Equalizer
2
Outline
• Motivation
• Proposed Time-Based Receiver
• Proposed Delay Line Based Time Amplifier
• 65 nm Test Chip Measurement results
• Conclusion
3
Motivation
• System-in-Package− Multi-chip integrated in a single package
− Enables small form factor
I/O
Package (SiP, 2.5D/3D, interposer)
Chip1 Chip2
Logic
In-packagedata, clock
I/O I/OLogicI/O
Off-packagedata, clock
Off-packagedata, clock
4
RX Equalization Technique
• Continuous Time Linear Equalizer (CTLE)
• Decision Feedback Equalizer (DFE)
From Channel
Freq.G
ain
Time
Vo
lt.
CTLE DFE DSP
Analog Digital
5
RX Equalization Technique
• ADC-based receiver enables digital equalization
• Take advantage of CMOS scaling
From Channel
Freq.G
ain
CTLE ADC DSP
Analog Digital
6
RX Equalization Technique
From Channel VTC TA DSP
Analog Digital
TDC
• Voltage-to-Time Converter (VTC)
• Time Amplifier (TA)
• Time-to-Digital Converter (TDC)
• Digital-intensive time-based front-end
• Signal amplifier is performed in the time domain while equalization is performed by DSP circuits
7
Time-Based Receiver
CTLE VGA S/H ADC
Analog DFE
Analog Frontend
CTLE VGAFully-
Analog
Analog FE, Digital
Equalizer
Voltagebased
Voltage based
FeaturesReceiver Type
Digital FE, Digital
Equalizer
Time based
Proposed Time-based Frontend
VTC TA TDC DSP
AnalogDigital
DSP
Analog Frontend
8
Voltage-to-Time Converter
CLKi
CLKRX
VTC
CLKREF
VRX
EN
In Out
VRX
VREF
[1] P. Chiu, et al., JSSC, 2018.
VRX
CLKi
CLKRX
CLKREF
Δt
1 0 0 1
1 0 0 1
Δt
• Time delay is determined by input signal
• Voltage signal is converted to time delay
9
Related Time Amplifier
• NAND gate based Ring-oscillator design
• Speed is limited by oscillator frequency
[2] B. Kim, et al., CICC, 2015.
10
Related Time Amplifier
• NAND gate based Ring-oscillator design
• Speed is limited by oscillator frequency
[2] B. Kim, et al., CICC, 2015.
11
Proposed Time Amplifier
STARTo
STOPo
STARTi
STOPi
EN
1X 1X 1X 1X
1X 1X 1X 1X
NX NX NX NX
NX NX NX NX
IN
EN'
EN
OUT
• Open loop delay line for short range application
• Inverter based design
12
Proposed Time Amplifier
STARTo
STOPo
STARTi
STOPi
EN
1X 1X 1X 1X
1X 1X 1X 1X
NX NX NX NX
NX NX NX NX
STARTo
STOPo
EN
STARTi
STOPi
13
Proposed Time Amplifier
STARTo
STOPo
STARTi
STOPi
EN
1X 1X 1X 1X
1X 1X 1X 1X
NX NX NX NX
NX NX NX NX
STARTo
STOPo
EN
STARTi
STOPi
Δt
AND gate delay>Δt • STARTi driving by N+1 inverter Δt longer than STOPi
14
Proposed Time Amplifier
STARTo
STOPo
STARTi
STOPi
EN
1X 1X 1X 1X
1X 1X 1X 1X
NX NX NX NX
NX NX NX NX
(N+1)Δt
STARTo
STOPo
EN
STARTi
STOPi
Δt
15
Time Amplifier Simulation Results
• High linearity between input and output delay @ 2GHz operation
Ou
tpu
t Δ
t (p
s)
Input Δt (ps)
65nm GP, 1V, 25ºC, post-layout
10 15 20 25 300
20
40
60
80
100
120
140
160
N=1
N=4
16
4-bit TDC
Delay Unit
Delay Unit
Delay Unit
Delay Unit
Arb Arb Arb Arb
td td td td
td td td td
4 4 4 4
Th
erm
om
ete
r-to
-Bin
ary
DOUT[3:0]
STARTo
STOPo
Arbiter
• Conventional 4-bit Vernier line TDC
• Total TDC delay should smaller than 1 UI
17
Digital DFE
16:1
4w0000
w0001
w0010
w0011
w1111
Ch1
D1D2D3D4
D1D2D3D4
Ch4
~
• 4-bit digital comparator
• TDC output compared with predetermined weights
18
Proposed Transceiver for SiP
4 Phase CLK Gen.
DSP (Digital
Eq.)
BER Mon.
VREF
Time Amplifier
4bit-TDCVTC
4
4 GHzClock
3-T
ap
1
/2 r
ate
F
FE
21
5-1
PR
BS
TX Chip RX Chip
S-D
S-DDCC
DIV/2
DCC
DCC
*Prog.Delay
0˚,90˚,180˚,270˚
*Prog.Delay
*For in-situ BER testing
7mm in-package link (wire bond)
21
5-1
PR
BS
• Transmitter: PRBS, FFE and clock
• Receiver: 4-lane time-based receiver, digital equalization and BER monitor
19
65nm Test-Chip and Photo
• TX and RX chip are integrated in same package
20
In-situ BER Eye-Diagram Monitor
CLK TAPhase Delay
ΔTVTC
VTC
Time offset
DS
P
11
b
Co
un
ter
Pa
rall
el
to s
eri
al
BER Monitor
Error Count
Err
PRBS
Tim
e o
ffse
t
Phase delay
[1] P. Chiu, et al., JSSC, 2018.
• X-axis: Phase delay in red box
• Y-axis: Time offset in blue box
21
Measured BER Bathtub
• Eye width = 0.12UI @BER < 10-12
w/o DFE
65nm GP, 1V, 25ºC, D=7mm
1E-12
Bit
Err
or
Rate
Phase (UI)
-0.4 0-0.6 -0.2 0.4 0.60.2
w/ DFE
1E-9
1E-6
1E-3
22
Measured BER Eye Diagram
• Y-axis: time offset code corresponds to voltage offset
23
Performance Summary
Application
Technology
RX Area (w/o DSP)
Voltage
Data Rate
RX Architecture
Power Efficiency
(pJ/b)
JSSC’12 [3]
Front-end Type
Resolution
BER
JSSC’13 [4] JSSC’15 [5] This work
SiP
65nm
1V
8 Gb/s
4x TDC
2.1(TX+RX, includes
DSP power)
Time-Based(VTC+TA)
4 bit
<1E-12
0.0192 mm2
JSSC’16 [6]
Off Chip
40nm
0.9V
10.3125 Gb/s
4x Flash ADC
15.1(RX only)
Voltage-Based(VGA)
6 bit
<1E-12
0.27 mm2
Off Chip
65nm
1.1V
10 Gb/s
4x Flash ADC
8.1(RX only)
Voltage-Based(CTLE +VGA)
4 bit
<1E-9
0.288 mm2
Off Chip
40nm
1V
8.5-11.5 Gb/s
4x Flash ADC
18.9(RX, includes
Clock)
Voltage-Based(VGA)
6 bit
<1E-12
0.82 mm2
Off Chip
65nm
1V
10 Gb/s
32x SAR ADC
7.9(RX only)
Voltage-Based(Analog FFE)
6 bit
<1E-10
0.38 mm2
24
Conclusion
• Digital-intensive time-based front-end receiver is proposed for SiP application
• The proposed time-based receiver achieves an energy-efficiency of 2.1 pJ/b while the area is 0.0192mm2
• The proposed VTC and TA based implementation significantly reduces circuit complexity and has favorable scaling properties