Analysis and Avoidance of Cross-talk in On-chip Buses

Chunjie Duan, Ericsson Wireless Communications
Anup Tirumala, Jasmine Networks
Sunil P Khatri, University of Colorado, Boulder

Post on 22-Dec-2015



Outline

Introduction
Classification of cross-talk types
Eliminating 3C and 4C sequences
Eliminating 4C sequences
Experimental results
Conclusions

Introduction

Verified cross-talk trends
  Accurate 3-D capacitance extraction
  Delay variation 2.47:1 (200 μm wires, 10X drivers, 0.1 μm technology)

[Figure: deep sub-micron process; wire arrangements with wires labelled a and v, capacitances labelled CI and CL, and dimension labels s, t, w]

Cross-talk vs Bus Data Pattern

When λ ~ 0.1μm, r = CI/CL > 10 (metal 4)

Effective total capacitance depends on the bus data sequence:
  Best case: 0 x CI x L
  Worst case: 4 x CI x L

[Figure: example transitions labelled Ctotal = 0·CI and Ctotal = 4·CI, with per-neighbor contributions of 0·CI and 2·CI]

Classification of Cross-talk

4·C sequence
3·C sequence
2·C sequence
1·C sequence
0·C sequence

Forbidden patterns (“010” and “101”)
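
The class labels above can be made concrete with a small sketch. It assumes the commonly used coupling model, which the surviving slide text does not spell out: a switching wire sees 0·CI from a neighbor switching in the same direction, 1·CI from a quiet neighbor, and 2·CI from a neighbor switching in the opposite direction. The function names, and the choice that a missing neighbor at the bus edge contributes nothing, are illustrative assumptions.

```python
def wire_class(prev: str, curr: str, j: int) -> int:
    """Coupling seen by wire j, in units of CI, for the transition prev -> curr.

    Assumed model (not stated on the slides): each neighbor contributes
    |delta_j - delta_k| * CI, where delta is +1 (rise), -1 (fall) or 0.
    """
    dj = int(curr[j]) - int(prev[j])
    if dj == 0:
        return 0  # a wire that does not switch has no delay to classify
    total = 0
    for k in (j - 1, j + 1):
        if 0 <= k < len(prev):  # edge wires have only one neighbor
            dk = int(curr[k]) - int(prev[k])
            total += abs(dj - dk)
    return total


def bus_class(prev: str, curr: str) -> int:
    """Worst class over all wires: 0 for a 0C transition, up to 4 for 4C."""
    return max(wire_class(prev, curr, j) for j in range(len(prev)))


def has_forbidden(bits: str) -> bool:
    """True if the vector contains a forbidden pattern, 010 or 101."""
    return "010" in bits or "101" in bits
```

Under this model, `bus_class("010", "101")` evaluates to 4 and `bus_class("001", "010")` to 3; in both cases at least one of the two vectors contains a forbidden pattern.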

Eliminating 3C & 4C Sequences

Motivation
  Maximum bus data rate depends on the total capacitance seen by any bit
  Removing 3C and 4C sequences will increase the maximum data rate

Simple approach: shielding
  g s g s g s g... (ground line between signals)
  No 3C or 4C sequences possible
  However, bus width is doubled
  Coding gain = (throughput/area) with coding / (throughput/area) without coding - 1
  Coding gain = 0 for this approach
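
The coding-gain figure can be checked with a few lines of arithmetic. This is a minimal sketch under two assumptions that are not stated explicitly on this slide: area scales with the number of wires (1 + overhead), and removing 3C and 4C sequences doubles throughput because the worst-case capacitance drops from 4C to 2C. The function name `coding_gain` is illustrative.

```python
def coding_gain(throughput_ratio: float, overhead: float) -> float:
    """Coding gain = (throughput/area with coding) / (throughput/area without) - 1.

    throughput_ratio: coded throughput divided by uncoded throughput.
    overhead: extra wires as a fraction of the original bus width
              (area is assumed to scale as 1 + overhead).
    """
    return throughput_ratio / (1.0 + overhead) - 1.0


# Shielding: assumed 2x throughput, 100% overhead because the bus width doubles.
print(coding_gain(2.0, 1.0))
# Encoding with the 44% overhead quoted later in the deck.
print(round(coding_gain(2.0, 0.44), 2))
```

With these inputs the shielding case gives a gain of 0, and the 44% overhead case gives roughly 0.39, in line with the 39% quoted later in the deck.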

Eliminating 3C & 4C Sequences

Theorem: If no forbidden patterns are allowed on the bus, max(Ctotal) ≤ 2·C
  Proof: see paper

Our approach: encode the data on the bus to get rid of the forbidden patterns

Questions to be answered:
  What is the number of redundancy bits (and the coding gain)?
  How to practically implement such a CODEC?

Number of Redundancy Bits

Map the n-bit bus to a k = n + r bit bus so that the k-bit data bus has no forbidden patterns

Definitions:
  T(n): number of distinct n-bit vectors, T(n) = 2^n
  TB(n): number of n-bit vectors which contain a forbidden pattern
  TG(n): number of n-bit vectors which do not contain forbidden patterns
  Let the sets of vectors be V(n), VB(n), and VG(n) respectively
  Let v(n), vB(n) and vG(n) respectively represent an element of these sets
  TGG(n): number of n-bit vectors in VG(n) with last two bits ‘00’ or ‘11’
  TGB(n): number of n-bit vectors in VG(n) with last two bits ‘01’ or ‘10’

Goal: find the smallest k such that TG(k) ≥ T(n) = 2^n

Counting Forbidden Vectors

v(n) can be constructed by appending {0,1} to any v(n-1)
  Two v(n) are constructed from any v(n-1)
  Two vB(n) are constructed from any vB(n-1)
    xxx010xx -> xxx010xx0, xxx010xx1
  One vGG(n) and one vGB(n) are constructed from any vGG(n-1)
    xxxxxx00 -> xxxxxx000, xxxxxx001
  One vGG(n) and one vB(n) are constructed from any vGB(n-1)
    xxxxxx01 -> xxxxxx010, xxxxxx011

Counting Forbidden Vectors

Algorithm

Initial conditions (n=3)
  T(3) = 8, TG(3) = 6, TB(3) = 2, TGG(3) = 4, TGB(3) = 2

Inductive step
  T(n) = 2 x T(n-1)
  TG(n) = 2 x TGG(n-1) + TGB(n-1)
  TGG(n) = TGG(n-1) + TGB(n-1)
  TGB(n) = TGG(n-1)
  TB(n) = 2 x TB(n-1) + TGB(n-1)
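
The counting algorithm can be sketched as a short loop. This is an illustrative version, not the authors' code: the function names are hypothetical, the recurrence for TGB(n) is taken from the construction rules (each vGG(n-1) yields exactly one vGB(n), and a vGB(n-1) yields none), and TG(n) is computed as TGG(n) + TGB(n).

```python
def count_vectors(k: int):
    """Return (T, TG, TB, TGG, TGB) for k-bit vectors, k >= 3."""
    if k < 3:
        raise ValueError("the recurrence starts at n = 3")
    t, tgg, tgb, tb = 8, 4, 2, 2  # initial conditions for n = 3
    for _ in range(3, k):
        # All right-hand sides use the values for n-1 (tuple assignment).
        t, tgg, tgb, tb = 2 * t, tgg + tgb, tgg, 2 * tb + tgb
    return t, tgg + tgb, tb, tgg, tgb


def min_bus_width(n: int) -> int:
    """Smallest k with TG(k) >= 2**n, assuming n >= 3."""
    k = n
    while count_vectors(k)[1] < 2 ** n:
        k += 1
    return k


for n in (4, 16, 32, 64):
    k = min_bus_width(n)
    print(n, k, f"overhead {(k - n) / n:.1%}")
```

For n = 4 the loop returns k = 5, which matches the 4-bit to 5-bit group encoding used in the implementation described later.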

Eliminating 3C & 4C Sequences

44% overhead when n > 30 bits
Coding gain G = 2 / (1 + 0.44) - 1 ≈ 39%

[Figure: overhead percentage (vertical axis, 0 to 0.5) plotted over a horizontal range of 0 to 100]

3C & 4C CODEC Implementation

Implements a one-to-one map from V(n) to VG(k)

Look-up table
  Straightforward, can achieve minimum overhead (44%), but not practical

Our implementation
  62.5% overhead (higher than minimum)
  Modular and straightforward
  Break bus into 4-bit groups
  Encode each group independently (4 bit -> 5 bit)
  Additional logic to handle across-the-boundary forbidden patterns
  Ripple effect (eliminated by pipelining)

3C & 4C CODEC Implementation

Input   Output
0000    00000
0001    00001
0010    00110
0011    00011
0100    01100
0101    00111
0110    01110
0111    01111
1000    11111
1001    11110
1010    11001
1011    11100
1100    10011
1101    11000
1110    10001
1111    10000

[Figure: CODEC block diagram with signals b0 through b15]
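
The 4-bit to 5-bit mapping can be exercised as a plain look-up table. The sketch below copies the table from the slide, with one assumption: the slide lists input 1010 twice, and the second occurrence is taken to be 1011. It encodes each group independently and deliberately omits the across-the-boundary logic, which the slides do not describe; function names are illustrative.

```python
# 4-bit -> 5-bit table from the slide (second "1010" assumed to be "1011").
ENCODE = {
    "0000": "00000", "0001": "00001", "0010": "00110", "0011": "00011",
    "0100": "01100", "0101": "00111", "0110": "01110", "0111": "01111",
    "1000": "11111", "1001": "11110", "1010": "11001", "1011": "11100",
    "1100": "10011", "1101": "11000", "1110": "10001", "1111": "10000",
}
DECODE = {code: data for data, code in ENCODE.items()}


def has_forbidden(bits: str) -> bool:
    """True if the vector contains a forbidden pattern, 010 or 101."""
    return "010" in bits or "101" in bits


def encode_groups(data: str) -> str:
    """Encode each 4-bit group independently (no boundary handling)."""
    if len(data) % 4:
        raise ValueError("bus width must be a multiple of 4")
    return "".join(ENCODE[data[i:i + 4]] for i in range(0, len(data), 4))


def decode_groups(code: str) -> str:
    """Invert encode_groups, 5 bits at a time."""
    if len(code) % 5:
        raise ValueError("coded width must be a multiple of 5")
    return "".join(DECODE[code[i:i + 5]] for i in range(0, len(code), 5))


# Each code word is clean on its own, but concatenation is not always clean:
print(has_forbidden(encode_groups("0001")))      # single group
print(has_forbidden(encode_groups("00010000")))  # two adjacent groups
```

The second call illustrates why the extra boundary logic is needed: 00001 followed by 00000 places a 010 across the group boundary even though neither code word contains a forbidden pattern.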

Eliminating 4C Sequences

Less aggressive: eliminating 4C sequences only
  Less overhead (33%), simpler implementation

Simpler algorithm
  Divide the bus into 3-bit groups
  When a 4C sequence occurs, complement the group data
  Insert group complement indicator
  Special handling for across-the-boundary forbidden sequences (see paper for details)

Examples (unencoded transition, then encoded transition):
  101 -> 010    1010 -> 1011
  001 -> 010    0010 -> 0100
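
A single-group version of this scheme can be sketched as follows. It is a reading of the slide's examples, not the algorithm from the paper: it assumes the complement indicator is the fourth bit of each group, that a 4C sequence inside a group means a 101 <-> 010 transition on the three data wires, and that the cross-boundary handling is ignored. All names are hypothetical.

```python
def complement(bits: str) -> str:
    """Bitwise complement of a bit string."""
    return "".join("1" if b == "0" else "0" for b in bits)


def encode_group(prev_wires: str, data: str) -> str:
    """Encode one 3-bit group into 4 bits (3 data wires + indicator).

    prev_wires: the 3 data bits currently driven on the wires.
    data:       the new 3-bit value to send.
    """
    is_4c = {prev_wires, data} == {"010", "101"}
    if is_4c:
        # Send the complement instead, so the data wires do not switch.
        return complement(data) + "1"
    return data + "0"


def decode_group(word: str) -> str:
    """Recover the 3-bit value from a 4-bit encoded group."""
    data, indicator = word[:3], word[3]
    return complement(data) if indicator == "1" else data


print(encode_group("101", "010"))  # 4C transition: complemented
print(encode_group("001", "010"))  # not 4C: sent unchanged
```

Under these assumptions the two calls reproduce the encoded words in the examples, 1011 and 0100.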

Experimental Results

Bus simulations
  CODEC was not modeled
  Spice3, 0.1μm model
  Transmission line with inter-wire coupling
  Quantify delay dependency on bus vector sequences

CODEC implementation
  Currently implemented 3C & 4C CODEC
  Matching delay on CODEC outputs
  4C CODEC implementation planned in future

Bus Simulation Results

Bus length 5mm, 10mm or 20mm
Driver strength 30X, 60X and 120X of minimum

[Figure: DELAY comparison (1mm trace); curves for 0C, 1C, 2C, 3C and 4C sequences]

[Figure: DELAY comparison (2mm trace); curves for 0C, 1C, 2C, 3C and 4C sequences]

Trc_len  Buf_size  0C     1C   2C   3C    4C
10mm     30x       <100   200  350  550   750
10mm     60x       <100   100  250  400   500
10mm     120x      <100   120  170  300   350
20mm     30x       100    300  600  1000  1600
20mm     60x       100    250  400  600   900
20mm     120x      <100   150  300  550   750
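
One way to read the table is to compare the 4C column with the 2C column, since 2C is the worst case that remains once 3C and 4C sequences are removed. The sketch below copies those two columns from the table; the units are not stated in the source, and the ratio is only a rough indication because it ignores the CODEC's own delay. The names `DELAYS` and `ratio_4c_to_2c` are illustrative.

```python
# (trace length, buffer size) -> (2C value, 4C value), copied from the table.
DELAYS = {
    ("10mm", "30x"): (350, 750),
    ("10mm", "60x"): (250, 500),
    ("10mm", "120x"): (170, 350),
    ("20mm", "30x"): (600, 1600),
    ("20mm", "60x"): (400, 900),
    ("20mm", "120x"): (300, 750),
}


def ratio_4c_to_2c(delays):
    """Worst-case value without coding divided by worst case with 3C/4C removed."""
    return {cfg: d4 / d2 for cfg, (d2, d4) in delays.items()}


RATIOS = ratio_4c_to_2c(DELAYS)
for cfg, ratio in RATIOS.items():
    print(cfg, round(ratio, 2))
```

Every configuration in the table gives a ratio of at least 2.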

CODEC Results

Compare waveform with coding and w/o coding
Random input sequence

[Figure: test setup; random sequence, encoder, driver, receiver, decoder, recovered sequence]

Encoder/decoder delay ~250ps
Max data rate more than 2X compared to scheme with no encoding

CODEC Results

Random sequence directly into bus buffer
  20mm trace, 45x buffer
  > 1ns delay variation

Random sequence into 3C & 4C encoder
  20mm trace, 45x buffer
  < 500ps delay variation

[Figure: received data pattern w coding; traces Vin1, Vseg1 to Vseg5]

[Figure: waveform w/o encoder; traces Vtx1 to Vtx5]

Experimental Results

Reshaped data after receivers
  Without coding, edge jitter ~ 1000ps
  With coding, edge jitter < 500ps

[Figure: delay variation w/o coding; traces Voo1 to Voo5]

[Figure: received data w coding; traces rcv1 to rcv5]

Conclusions

Inter-wire capacitance is increasingly significant in DSM VLSI interconnect
Total capacitance is heavily dependent on the bus data sequence
With 44% overhead, we can eliminate 3C & 4C cross-talk
  Compared to shielding, which has 100% overhead
Implemented CODEC to eliminate 3C and 4C cross-talk sequences
Proposed CODEC to eliminate 4C cross-talk sequences with 33% overhead
Simulation results match our analysis

Thank You!