analysis and avoidance of cross-talk in on-chip buses chunjie duan ericsson wireless communications...
Post on 22-Dec-2015
Analysis and Avoidance of Cross-talk in On-chip Buses
Chunjie Duan, Ericsson Wireless Communications
Anup Tirumala, Jasmine Networks
Sunil P. Khatri, University of Colorado, Boulder
Outline
- Introduction
- Classification of cross-talk types
- Eliminating 3C and 4C sequences
- Eliminating 4C sequences
- Experimental results
- Conclusions
Introduction
- Verified cross-talk trends
- Accurate 3-D capacitance extraction
- Delay variation of 2.47:1 (200 μm wires, 10X drivers, 0.1 μm technology)
[Figure: deep sub-micron process. Wire cross-sections with dimensions s, t and w; wires labeled a and v, coupled to each other through CI and each loaded by CL.]
Cross-talk vs Bus Data Pattern
- When λ ~ 0.1 μm, r = CI/CL > 10 (metal 4)
- Effective total capacitance depends on the bus data sequence:
  - Best case: 0 x CI x L
  - Worst case: 4 x CI x L
[Figure: example bus transitions annotated with Ctotal = 0·CI, 2·CI and 4·CI.]
Classification of Cross-talk
- 4·C sequence
- 3·C sequence
- 2·C sequence
- 1·C sequence
- 0·C sequence
- Forbidden patterns ("010" and "101")
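The classes can be made concrete with a small counting model. The sketch below assumes the usual convention that a switching wire sees one unit of CI for each quiet neighbor and two units for each neighbor switching the opposite way. Bus states are written as bit strings, and the function names `wire_class` and `bus_class` are illustrative, not taken from the slides.

```python
def wire_class(prev, curr, j):
    """Cross-talk class (0..4) seen by wire j for the transition prev -> curr.

    Assumed model: each neighbor adds |delta_j - delta_neighbor| units of CI,
    where delta is +1 (rising), -1 (falling) or 0 (quiet). A quiet wire is
    reported as class 0 because it has no transition to delay.
    """
    delta = [int(c) - int(p) for p, c in zip(prev, curr)]
    if delta[j] == 0:
        return 0
    neighbors = [i for i in (j - 1, j + 1) if 0 <= i < len(delta)]
    return sum(abs(delta[j] - delta[i]) for i in neighbors)


def bus_class(prev, curr):
    """Worst cross-talk class over all wires of the bus."""
    return max(wire_class(prev, curr, j) for j in range(len(prev)))


print(bus_class("000", "111"))  # 0: all wires rise together
print(bus_class("010", "101"))  # 4: both neighbors switch against the middle wire
```

Under this model the two forbidden patterns are exactly the states between which a 4C transition can occur on the middle wire.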
Eliminating 3C & 4C Sequences
Motivation
- Maximum bus data rate depends on the total capacitance seen by any bit
- Removing 3C and 4C sequences will increase the maximum data rate
Simple approach: shielding
- g s g s g s g ... (a ground line between signals)
- No 3C or 4C sequences are possible
- However, the bus width is doubled
- Coding gain: $G = \dfrac{(\text{throughput}/\text{area})_{\text{with coding}}}{(\text{throughput}/\text{area})_{\text{without coding}}} - 1$
- Coding gain = 0 for this approach
Eliminating 3C & 4C Sequences
- Theorem: if no forbidden patterns are allowed on the bus, then $C_{total}^{max} \le 2 \cdot C_I$
  - Proof: see paper
- Our approach: encode the data on the bus to get rid of the forbidden patterns
- Questions to be answered:
  - What is the number of redundancy bits (and the coding gain)?
  - How can such a CODEC be implemented in practice?
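The theorem can be checked exhaustively for small bus widths. This is a sanity check rather than a proof, and it assumes the same counting model as the classification (one unit of CI per quiet neighbor, two per opposing neighbor). All names are illustrative.

```python
from itertools import product

FORBIDDEN = ("010", "101")


def is_clean(bits):
    """True if the bit string contains no forbidden pattern."""
    return not any(p in bits for p in FORBIDDEN)


def worst_class(prev, curr):
    """Worst per-wire cross-talk class for prev -> curr (assumed counting model)."""
    d = [int(c) - int(p) for p, c in zip(prev, curr)]
    worst = 0
    for j, dj in enumerate(d):
        if dj == 0:
            continue  # a quiet wire has no transition to delay
        near = [d[i] for i in (j - 1, j + 1) if 0 <= i < len(d)]
        worst = max(worst, sum(abs(dj - dn) for dn in near))
    return worst


def max_class_on_clean_bus(width):
    """Largest class over all transitions between forbidden-pattern-free vectors."""
    vectors = ["".join(b) for b in product("01", repeat=width)]
    clean = [v for v in vectors if is_clean(v)]
    return max(worst_class(p, c) for p in clean for c in clean)


print(max_class_on_clean_bus(6))  # 2
```

The check requires both the old and the new bus vector to be free of forbidden patterns, which is what an encoder that only ever emits such vectors guarantees.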
Number of Redundancy Bits
- Map the n-bit bus to a k = n + r bit bus so that the k-bit data bus has no forbidden patterns
- Definitions:
  - T(n): number of distinct n-bit vectors, $T(n) = 2^n$
  - TB(n): number of n-bit vectors which contain a forbidden pattern
  - TG(n): number of n-bit vectors which do not contain forbidden patterns
  - Let the sets of vectors be V(n), VB(n) and VG(n) respectively
  - Let v(n), vB(n) and vG(n) respectively represent an element of these sets
  - TGG(n): number of n-bit vectors in VG(n) with last two bits '00' or '11'
  - TGB(n): number of n-bit vectors in VG(n) with last two bits '01' or '10'
- Goal: find the smallest k such that $T_G(k) \ge T(n) = 2^n$
Counting Forbidden Vectors
- v(n) can be constructed by appending {0,1} to any v(n-1)
  - Two v(n) are constructed from any v(n-1)
  - Two vB(n) are constructed from any vB(n-1)
    xxx010xx -> xxx010xx0, xxx010xx1
  - One vGG(n) and one vGB(n) are constructed from any vGG(n-1)
    xxxxxx00 -> xxxxxx000, xxxxxx001
  - One vGG(n) and one vB(n) are constructed from any vGB(n-1)
    xxxxxx01 -> xxxxxx010, xxxxxx011
Counting Forbidden Vectors
- Algorithm
  - Initial conditions (n = 3):
    T(3) = 8, TG(3) = 6, TB(3) = 2, TGG(3) = 4, TGB(3) = 2
  - Inductive step:
    T(n) = 2 x T(n-1)
    TG(n) = 2 x TGG(n-1) + TGB(n-1)
    TGG(n) = TGG(n-1) + TGB(n-1)
    TGB(n) = TGG(n-1)
    TB(n) = 2 x TB(n-1) + TGB(n-1)
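The counting algorithm is short enough to run directly. The sketch below follows the construction rules for vGG and vGB vectors (each vGG(n-1) yields one vGG(n) and one vGB(n); each vGB(n-1) yields one vGG(n)), cross-checks the result against brute-force enumeration, and searches for the smallest coded width k. The function names are illustrative.

```python
from itertools import product

FORBIDDEN = ("010", "101")


def count_good(n):
    """Return (TG, TGG, TGB) for n >= 3 from the inductive step."""
    if n < 3:
        raise ValueError("the initial conditions are given for n = 3")
    tgg, tgb = 4, 2  # TGG(3), TGB(3)
    for _ in range(n - 3):
        # vGG(n) comes from every vGG(n-1) and vGB(n-1); vGB(n) only from vGG(n-1)
        tgg, tgb = tgg + tgb, tgg
    return tgg + tgb, tgg, tgb


def count_good_brute(n):
    """Count n-bit vectors without forbidden patterns by enumeration."""
    words = ("".join(b) for b in product("01", repeat=n))
    return sum(1 for w in words if not any(p in w for p in FORBIDDEN))


def min_coded_width(n):
    """Smallest k with TG(k) >= 2**n, for n >= 3."""
    if n < 3:
        raise ValueError("n must be at least 3")
    k = n
    while count_good(k)[0] < 2 ** n:
        k += 1
    return k


for n in (4, 8, 16, 32):
    k = min_coded_width(n)
    print(n, k, (k - n) / n)
```

For n = 4 the search returns k = 5, and for n = 32 it returns k = 46, an overhead of 43.75%.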
Eliminating 3C & 4C Sequences
- 44% overhead when n > 30 bits
- Coding gain: $G = \dfrac{2}{1 + 0.44} - 1 \approx 39\%$
[Figure: overhead percentage (vertical axis, 0 to 0.5) plotted over a horizontal axis running from 0 to 100.]
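The coding-gain arithmetic can be reproduced in a few lines. The sketch assumes that removing 3C and 4C sequences doubles the achievable throughput (worst case drops from 4·CI to 2·CI) and that area scales with the number of wires; `coding_gain` is an illustrative name.

```python
def coding_gain(speedup, overhead):
    """Coding gain = (throughput/area with coding) / (throughput/area without) - 1.

    speedup:  throughput ratio with coding vs without (assumed 2 here)
    overhead: extra wires as a fraction of the original bus width
    """
    return speedup / (1.0 + overhead) - 1.0


print(coding_gain(2.0, 1.00))  # shielding: bus width doubled, gain is 0
print(coding_gain(2.0, 0.44))  # forbidden-pattern-free code: about 0.39
```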
3C & 4C CODEC Implementation
- Implements a one-to-one map from V(n) to VG(k)
- Look-up table: straightforward and can achieve the minimum overhead (44%), but not practical
- Our implementation
  - 62.5% overhead (higher than the minimum)
  - Modular and straightforward
    - Break the bus into 4-bit groups
    - Encode each group independently (4 bits -> 5 bits)
    - Additional logic to handle forbidden patterns across group boundaries
  - Ripple effect (eliminated by pipelining)
3C & 4C CODEC Implementation
CODEC block diagram

Input   Output
0000    00000
0001    00001
0010    00110
0011    00011
0100    01100
0101    00111
0110    01110
0111    01111
1000    11111
1001    11110
1010    11001
1011    11100
1100    10011
1101    11000
1110    10001
1111    10000
[Figure: block diagram with bus lines labeled b0 to b15.]
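The 4-bit to 5-bit mapping can be checked mechanically. The sketch below transcribes the table, assuming the sixteen inputs run from 0000 to 1111 in order, and confirms that every codeword is free of forbidden patterns and that the map is invertible. It does not model the additional boundary logic; `encode_groups` and `decode_groups` are illustrative helpers.

```python
ENCODE = {
    "0000": "00000", "0001": "00001", "0010": "00110", "0011": "00011",
    "0100": "01100", "0101": "00111", "0110": "01110", "0111": "01111",
    "1000": "11111", "1001": "11110", "1010": "11001", "1011": "11100",
    "1100": "10011", "1101": "11000", "1110": "10001", "1111": "10000",
}
DECODE = {code: data for data, code in ENCODE.items()}


def has_forbidden(bits):
    """True if the bit string contains 010 or 101."""
    return "010" in bits or "101" in bits


def encode_groups(data):
    """Encode a bit string (length a multiple of 4) one 4-bit group at a time."""
    return "".join(ENCODE[data[i:i + 4]] for i in range(0, len(data), 4))


def decode_groups(code):
    """Invert encode_groups on a bit string whose length is a multiple of 5."""
    return "".join(DECODE[code[i:i + 5]] for i in range(0, len(code), 5))


print(any(has_forbidden(c) for c in ENCODE.values()))  # False
print(has_forbidden(encode_groups("00010000")))        # True: 00001|00000
```

The second check shows why group-wise encoding alone is not enough: two clean codewords placed side by side can form a forbidden pattern across the group boundary.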
Eliminating 4C Sequences
- Less aggressive: eliminate 4C sequences only
  - Less overhead (33%)
  - Simpler implementation
- Simpler algorithm
  - Divide the bus into 3-bit groups
  - When a 4C sequence occurs, complement the group data
  - Insert a group complement indicator
  - Special handling for across-the-boundary forbidden sequences (see paper for details)
- Example: the uncoded transition 101 001 -> 010 010 becomes the coded transition 1010 0010 -> 1011 0100 (the first group is complemented and its indicator bit is set)
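A minimal sketch of the group-complement idea follows. It reads the example as 3 data bits followed by 1 indicator bit per group, and it assumes a 4C sequence inside a group means the three data wires flipping between 101 and 010. Boundary handling between groups and around the indicator wire is not modeled, and the names are illustrative rather than the authors' implementation.

```python
def invert(bits):
    """Bitwise complement of a bit string."""
    return "".join("1" if b == "0" else "0" for b in bits)


def is_4c(prev_data, new_data):
    """True if the 3 data wires would flip between 101 and 010."""
    return {prev_data, new_data} == {"101", "010"}


def encode_4c(prev_wires, data):
    """Encode 3-bit groups given the 4-bit groups currently on the bus.

    prev_wires: list of 4-char strings (3 data wires + indicator wire)
    data:       list of 3-char strings to send next
    """
    out = []
    for prev, new in zip(prev_wires, data):
        if is_4c(prev[:3], new):
            out.append(invert(new) + "1")  # complement and flag the group
        else:
            out.append(new + "0")
    return out


def decode_4c(wires):
    """Recover the 3-bit groups from the 4-bit coded groups."""
    return [invert(w[:3]) if w[3] == "1" else w[:3] for w in wires]


coded = encode_4c(["1010", "0010"], ["010", "010"])
print(coded)             # ['1011', '0100']
print(decode_4c(coded))  # ['010', '010']
```

Complementing a group that would have gone from 101 to 010 leaves its data wires unchanged, so only the indicator wire switches.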
Experimental Results
- Bus simulations
  - CODEC was not modeled
  - Spice3, 0.1 μm model
  - Transmission line with inter-wire coupling
  - Quantify delay dependency on bus vector sequences
- CODEC implementation
  - Currently implemented: 3C & 4C CODEC
    - Matching delay on CODEC outputs
  - 4C CODEC implementation planned for the future
Bus Simulation Results
- Bus length: 5 mm, 10 mm or 20 mm
- Driver strength: 30X, 60X and 120X of minimum
[Figure: two plots, "DELAY comparison (1mm trace)" and "DELAY comparison (2mm trace)", each with one curve per cross-talk class (0C, 1C, 2C, 3C, 4C).]
Trace length  Buffer size  0C     1C    2C    3C     4C
10mm          30x          <100   200   350   550    750
10mm          60x          <100   100   250   400    500
10mm          120x         <100   120   170   300    350
20mm          30x          100    300   600   1000   1600
20mm          60x          100    250   400   600    900
20mm          120x         <100   150   300   550    750
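The table can be used to estimate the headroom gained by removing 3C and 4C sequences. The sketch assumes the achievable data rate scales inversely with the worst-case delay, so that the speedup is the 4C delay divided by the 2C delay. The numbers are copied from the table in whatever unit it uses, and `DELAYS` and `speedup` are illustrative names.

```python
# (trace length, buffer size) -> delay for the 2C and 4C classes, from the table
DELAYS = {
    ("10mm", "30x"): {"2C": 350, "4C": 750},
    ("10mm", "60x"): {"2C": 250, "4C": 500},
    ("10mm", "120x"): {"2C": 170, "4C": 350},
    ("20mm", "30x"): {"2C": 600, "4C": 1600},
    ("20mm", "60x"): {"2C": 400, "4C": 900},
    ("20mm", "120x"): {"2C": 300, "4C": 750},
}


def speedup(row):
    """Worst-case delay ratio when the worst class drops from 4C to 2C."""
    return row["4C"] / row["2C"]


for (length, buf), row in DELAYS.items():
    print(length, buf, round(speedup(row), 2))
```

Every configuration gives a ratio of at least 2 under this assumption.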
CODEC Results
- Compare waveforms with coding and without coding
- Random input sequence
[Figure: test setup. Random sequence -> encoder -> driver -> receiver -> decoder -> recovered sequence.]
- Encoder/decoder delay ~250 ps
- Max data rate more than 2X compared to the scheme with no encoding
CODEC Results
- Random sequence directly into bus buffer
  - 20 mm trace, 45x buffer: > 1 ns delay variation
- Random sequence into 3C & 4C encoder
  - 20 mm trace, 45x buffer: < 500 ps delay variation
[Figure: "received data pattern w coding", traces Vin1 and Vseg1 to Vseg5.]
[Figure: "waveform w/o encoder", traces Vtx1 to Vtx5.]
Experimental Results
- Reshaped data after receivers
  - Without coding, edge jitter ~ 1000 ps
  - With coding, edge jitter < 500 ps
[Figure: "delay variation w/o coding", traces Voo1 to Voo5.]
[Figure: "received data w coding", traces rcv1 to rcv5.]
Conclusions
- Inter-wire capacitance is increasingly significant in DSM VLSI interconnect
- Total capacitance is heavily dependent on the bus data sequence
- With 44% overhead, we can eliminate 3C & 4C cross-talk
  - Compared to shielding, which has 100% overhead
- Implemented a CODEC to eliminate 3C and 4C cross-talk sequences
- Proposed a CODEC to eliminate 4C cross-talk sequences with 33% overhead
- Simulation results match our analysis