EE384Y: Packet Switch Architectures Part II Load-balanced Switches


Page 1: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

High Performance Switching and Routing
Telecom Center Workshop: Sept 4, 1997.

EE384Y: Packet Switch Architectures, Part II

Load-balanced Switches

Nick McKeown
Professor of Electrical Engineering and Computer Science, Stanford University

[email protected]
http://www.stanford.edu/~nickm

Page 2: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

The Arbitration Problem

A packet switch fabric is reconfigured for every packet transfer.

For example, at 160Gb/s, a new IP packet can arrive every 2ns.

The configuration is picked to maximize throughput and not waste capacity.

Known algorithms are probably too slow.
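
The 2ns figure follows from simple arithmetic. A minimal sketch, assuming a minimum-size 40-byte IP packet (the packet size is an assumption, it is not stated on the slide):

```python
def packet_time_ns(packet_bytes: int, line_rate_gbps: float) -> float:
    """Time to receive one packet, in nanoseconds.

    Dividing bits by Gb/s gives nanoseconds directly, since 1 Gb/s = 1 bit/ns.
    """
    return packet_bytes * 8 / line_rate_gbps


# Assumption: a 40-byte packet (a minimum-size TCP/IP packet).
arbitration_budget_ns = packet_time_ns(40, 160.0)
print(arbitration_budget_ns)  # 2.0
```

Under this assumption, the fabric has to be reconfigured roughly every 2ns, which is the time budget any arbitration algorithm would have to meet.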

Page 3: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

Approach

We know that a crossbar with VOQs and uniform Bernoulli i.i.d. arrivals gives 100% throughput for the following scheduling algorithms:

- Pick a permutation uniformly at random (uar) from all permutations.
- Pick a permutation uar from a set of size N in which each input-output pair (i,j) is connected exactly once.
- From the same set as above, repeatedly cycle through a fixed sequence of N different permutations.
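
One set of N permutations with the "connected exactly once" property is the set of cyclic shifts. The sketch below builds that set and checks the property; the choice of cyclic shifts and the function names are illustrative assumptions, not something the slide prescribes.

```python
def cyclic_shift_permutations(n: int) -> list[list[int]]:
    """Return n permutations; permutation k connects input i to output (i + k) % n."""
    return [[(i + k) % n for i in range(n)] for k in range(n)]


def connects_each_pair_once(perms: list[list[int]]) -> bool:
    """True if every input-output pair (i, j) appears in exactly one permutation."""
    n = len(perms)
    seen = set()
    for perm in perms:
        if sorted(perm) != list(range(n)):  # each entry must be a valid permutation
            return False
        for i, j in enumerate(perm):
            if (i, j) in seen:
                return False
            seen.add((i, j))
    return len(seen) == n * n


perms = cyclic_shift_permutations(4)
print(perms[1])                        # [1, 2, 3, 0]
print(connects_each_pair_once(perms))  # True
```

Cycling through `perms` in a fixed order corresponds to the third algorithm in the list; drawing one of them at random each slot corresponds to the second.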

Can we make non-uniform, bursty traffic uniform “enough” for the above to hold?

Page 4: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

Design Example

Goals

- Scale to High Linecard Speeds (160Gb/s)
  - No Centralized Scheduler
  - Optical Switch Fabric
  - Low Packet-Processing Complexity
- Scale to High Number of Linecards (640)
- Provide Performance Guarantees
  - 100% Throughput Guarantee
  - No Packet Reordering

Stanford "Optics in Routers" project: http://yuba.stanford.edu/or/

Some challenging numbers:

- 100Tb/s
- 160Gb/s linecards
- 640 linecards

Page 5: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

Outline

- Basic idea of load-balancing
- Packet mis-sequencing
- An optical switch fabric
- Scaling number of linecards

Page 6: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

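100% Throughput in a Mesh Fabric

[Figure: N linecards, each with an input (In) and an output (Out) at rate R, joined by a full mesh of N^2 links. The rate needed on each mesh link is marked "?". With every link at rate R: router capacity = NR, switch capacity = N^2 R.]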

Page 7: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

If Traffic Is Uniform

[Figure: the same mesh with inputs and outputs at rate R. Each of the N^2 mesh links runs at rate R/N.]

Page 8: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

Real Traffic is Not Uniform

[Figure: the same mesh with R/N links. Inputs may send at up to rate R toward a single output, so the R/N links are marked "?".]
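
A small worked comparison shows why R/N links are enough for uniform traffic but not for real traffic. This is a sketch with hypothetical values (N = 4, rates normalized so that R = 1); in a single mesh, link (i, j) has to carry exactly the traffic from input i to output j.

```python
def max_link_load(traffic: list[list[float]]) -> float:
    """In a single full mesh, link (i, j) carries exactly traffic[i][j]."""
    return max(max(row) for row in traffic)


N, R = 4, 1.0  # hypothetical: 4 linecards, rates normalized so R = 1

uniform = [[R / N] * N for _ in range(N)]
# Admissible but non-uniform: input i sends all of its traffic to output i.
hotspot = [[R if i == j else 0.0 for j in range(N)] for i in range(N)]

print(max_link_load(uniform))  # 0.25, fits on an R/N link
print(max_link_load(hotspot))  # 1.0, would need a link of rate R
```

Both matrices are admissible (no input or output exceeds R), yet the second one needs a factor of N more capacity on some links.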

Page 9: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

Load-Balanced Switch

[Figure: two meshes in series, a load-balancing stage followed by a forwarding stage. External inputs (In) and outputs (Out) run at rate R; every mesh link runs at rate R/N.]

100% throughput for weakly mixing traffic (Valiant, C.-S. Chang)

Page 10: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

Load-Balanced Switch

[Figure: the two-stage switch (inputs and outputs at rate R, mesh links at rate R/N) with cells labeled 1, 2, 3 entering the load-balancing stage.]

Page 11: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

Load-Balanced Switch

[Figure: the two-stage switch (inputs and outputs at rate R, mesh links at rate R/N) with the cells labeled 1, 2, 3 spread across the middle ports.]

Page 12: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

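Intuition: 100% Throughput

[Figure: the two-stage switch with inputs and outputs at rate R and all mesh links at rate R/N.]

Arrivals to second mesh:

$$b = \frac{1}{N} U a, \quad \text{where } U = \begin{pmatrix} 1 & \cdots & 1 \\ \vdots & \ddots & \vdots \\ 1 & \cdots & 1 \end{pmatrix}$$

Capacity of second mesh:

$$C = \frac{R}{N} U$$

Second mesh: arrival rate < service rate

$$C - b = \frac{R}{N} U - \frac{1}{N} U a > 0$$

The inequality holds entrywise whenever no output is oversubscribed, that is, whenever each column of the arrival-rate matrix $a$ sums to less than $R$.

[C.-S. Chang]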
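
The arrival-rate argument can be checked numerically. The sketch below computes the rate that each second-mesh link would see if the first mesh spreads every input's traffic evenly over the N middle ports; the traffic matrix and the sizes (N = 4, R = 1) are hypothetical values chosen for illustration.

```python
def second_mesh_rates(a: list[list[float]]) -> list[list[float]]:
    """Compute b = (1/N) * U * a, where U is the N x N all-ones matrix.

    Entry b[k][j] is the rate from middle port k to output j: every middle
    port receives 1/N of the traffic destined to output j.
    """
    n = len(a)
    column_sums = [sum(a[i][j] for i in range(n)) for j in range(n)]
    return [[column_sums[j] / n for j in range(n)] for _ in range(n)]


N, R = 4, 1.0  # hypothetical sizes; rates normalized so R = 1
# Non-uniform traffic: input i sends 0.9 R to output i and nothing else.
a = [[0.9 * R if i == j else 0.0 for j in range(N)] for i in range(N)]

b = second_mesh_rates(a)
link_capacity = R / N  # every entry of C = (R/N) U
print(all(rate < link_capacity for row in b for rate in row))  # True
```

Every link of the second mesh sees 0.9 R/N, below its capacity R/N, even though the offered traffic is as non-uniform as it can be.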

Page 13: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

Another way of thinking about it

[Figure: External Inputs 1..N, then a load-balancing cyclic shift, then Internal Inputs 1..N, then a switching cyclic shift, then External Outputs 1..N.]

Load Balancing

- First stage load-balances incoming packets
- Second stage is a cyclic shift

Page 14: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

Load-Balanced Switch

[Figure: the same diagram (External Inputs 1..N, load-balancing cyclic shift, Internal Inputs 1..N, switching cyclic shift, External Outputs 1..N) with two cells, labeled 1 and 2, in flight.]

Page 15: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

Main Result [Chang et al.]:

1. Consider a periodic sequence of permutation matrices: $P(t) = \hat{P}^{\hat{t}}$, where $\hat{P}$ is a one-cycle permutation matrix (for example, a TDM sequence), and $\hat{t} = t \bmod N$.
2. If the 1st stage is scheduled by a sequence of permutation matrices: $P_1(t) = P(t - \phi_1)$, where $\phi_1$ is a random starting phase, and
3. The 2nd stage is scheduled by a sequence of permutation matrices: $P_2(t) = P(t - \phi_2)$,
4. Then the switch gives 100% throughput for a very broad range of traffic types.

Observation: The 1st stage makes non-uniform traffic uniform, and breaks up burstiness.
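
The result can be illustrated with a small deterministic simulation. This is a sketch under stated assumptions, not Chang's construction verbatim: both stages use cyclic shifts, the starting phase is a fixed parameter rather than a random one, arrivals in a slot are processed before departures, and the traffic is a hypothetical hotspot pattern in which input i always sends to output i.

```python
from collections import deque


def shift(t: int, n: int) -> list[int]:
    """P(t): port i is connected to port (i + t mod n) mod n, a cyclic shift."""
    return [(i + t % n) % n for i in range(n)]


def simulate(n: int, slots: int, phase: int = 0) -> tuple[int, int]:
    """Run the two-stage switch under a hotspot pattern.

    Every slot, input i receives one cell for output i (admissible, but as
    non-uniform as possible). Returns (cells delivered, longest VOQ seen).
    """
    # voq[m][j]: cells waiting at middle port m for output j.
    voq = [[deque() for _ in range(n)] for _ in range(n)]
    delivered, longest = 0, 0
    for t in range(slots):
        stage1 = shift(t - phase, n)   # load-balancing stage
        stage2 = shift(t, n)           # forwarding stage
        for i in range(n):             # cell (input i -> output i) arrives
            voq[stage1[i]][i].append(t)
        longest = max(longest, max(len(q) for row in voq for q in row))
        for m in range(n):             # each middle port serves one output
            queue = voq[m][stage2[m]]
            if queue:
                queue.popleft()
                delivered += 1
    return delivered, longest


delivered, longest = simulate(n=4, slots=1000)
print(delivered, longest)
```

In this setting each VOQ receives one cell every N slots and is served once every N slots, so the queues stay bounded and almost all offered cells are delivered, even though a single R/N mesh could not carry this pattern.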

Page 16: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

Outline of Chang's Proof

1. Let $a(t)$ be the matrix of arrivals at time $t$, where $a_{ij}(t) = 1$ indicates an arrival at $i$ for $j$.
2. Let $b(t) = P_1(t)\,a(t)$ be the input traffic to the second stage.
3. Let $q(t)$ be the queue length matrix:

$$q(t+1) = \max\bigl(q(t) + b(t+1) - P_2(t+1),\ 0\bigr),$$

which expands to

$$q(t) = \max_{0 \le s \le t} \sum_{\tau = s+1}^{t} \bigl(b(\tau) - P_2(\tau)\bigr).$$

Theorem: If no output is oversubscribed, $q(t)$ converges to steady state: $q(t) \to q$.

Proof:

$$E[b(t)] = E[P_1(t)\,a(t)] = E[P_1(t)]\,E[a(t)] = \frac{1}{N}\,e\,e^{T} E[a(t)],$$

$$\lim_{t \to \infty} \frac{1}{t} \sum_{s=1}^{t} \bigl(b(s) - P_2(s)\bigr) = \frac{1}{N}\,e\,e^{T} E[a(t)] - \frac{1}{N}\,e\,e^{T} < 0,$$

where $e$ is the all-ones column vector. Holds under some mild conditions on $a(t)$ (weakly mixing arrival processes).
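
The key step in the proof is that a periodic sequence of shift permutations averages out to the uniform matrix with every entry equal to 1/N. A minimal check, assuming the permutation sequence is the set of N cyclic shifts (the helper names are illustrative):

```python
from fractions import Fraction


def shift_matrix(k: int, n: int) -> list[list[int]]:
    """Permutation matrix of a cyclic shift by k: row i has its 1 in column (i + k) % n."""
    return [[1 if j == (i + k) % n else 0 for j in range(n)] for i in range(n)]


def period_average(n: int) -> list[list[Fraction]]:
    """Average of the n shift matrices P(0), ..., P(n - 1)."""
    total = [[0] * n for _ in range(n)]
    for k in range(n):
        p = shift_matrix(k, n)
        for i in range(n):
            for j in range(n):
                total[i][j] += p[i][j]
    return [[Fraction(total[i][j], n) for j in range(n)] for i in range(n)]


avg = period_average(4)
print(all(entry == Fraction(1, 4) for row in avg for entry in row))  # True
```

Exact fractions are used so that the comparison with 1/N does not depend on floating-point rounding.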

Page 17: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

Outline

- Basic idea of load-balancing
- Packet mis-sequencing
- An optical switch fabric
- Scaling number of linecards

Page 18: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

Packet Reordering

[Figure: the two-stage switch (inputs and outputs at rate R, mesh links at rate R/N) with two cells, 1 and 2, passing through different middle ports.]

Page 19: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

Bounding Delay Difference Between Middle Ports

[Figure: the two-stage switch with cells 1 and 2 at different middle ports; a label "cells" marks the queue-length difference between them.]

Page 20: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

UFS (Uniform Frame Spreading)

[Figure: the two-stage switch with cells numbered 1, 2, 3 at an input being spread across the middle ports.]

Page 21: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

FOFF (Full Ordered Frames First)

[Figure: the two-stage switch (inputs and outputs at rate R, mesh links at rate R/N) with cells 1 and 2 at an input.]

Page 22: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

FOFF (Full Ordered Frames First)

Input Algorithm

- N FIFO queues corresponding to the N output flows
- Spread each flow uniformly: if last packet was sent to middle port k, send next to k+1.
- Every N time-slots, pick a flow:
  - If full frame exists, pick it and spread like UFS
  - Else if all frames are partial, pick one in round-robin order and send it

[Figure: an input linecard with N FIFO queues, numbered 1, 2, 3, 4, ..., N, holding cells.]
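
The input algorithm can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the class and method names are hypothetical, full frames are picked in round-robin order (the slide does not say how to choose among several full frames), and a partial frame is sent in its entirety.

```python
from collections import deque
from typing import Optional


class FoffInput:
    """Sketch of one FOFF input linecard with n FIFO queues, one per output."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.queues = [deque() for _ in range(n)]
        self.next_port = [0] * n   # per flow: middle port for its next cell
        self.rr_full = 0           # round-robin pointer over full frames (assumed)
        self.rr_partial = 0        # round-robin pointer over partial frames

    def enqueue(self, output: int, cell: object) -> None:
        self.queues[output].append(cell)

    def _scan(self, start: int, want_full: bool) -> Optional[int]:
        """First flow at or after `start` (cyclically) with a full / non-empty queue."""
        for step in range(self.n):
            j = (start + step) % self.n
            size = len(self.queues[j])
            if (size >= self.n) if want_full else (size > 0):
                return j
        return None

    def pick_frame(self) -> list[tuple[int, int, object]]:
        """Called every n time-slots; returns (middle_port, output, cell) triples."""
        j = self._scan(self.rr_full, want_full=True)
        if j is not None:
            self.rr_full = (j + 1) % self.n
        else:
            j = self._scan(self.rr_partial, want_full=False)
            if j is None:
                return []          # nothing to send
            self.rr_partial = (j + 1) % self.n
        sent = []
        for _ in range(min(self.n, len(self.queues[j]))):
            port = self.next_port[j]
            sent.append((port, j, self.queues[j].popleft()))
            self.next_port[j] = (port + 1) % self.n  # continue where the flow left off
        return sent


inp = FoffInput(3)
for cell in "abcd":
    inp.enqueue(0, cell)
inp.enqueue(1, "x")
print(inp.pick_frame())  # full frame for output 0, one cell per middle port
print(inp.pick_frame())  # no full frame left: a partial frame is sent
```

Keeping a per-flow `next_port` pointer is what spreads each flow uniformly: a flow resumes at middle port k+1 after its last cell went to port k, whether that cell was part of a full or a partial frame.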

Page 23: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

Bounding Reordering

[Figure: the two-stage switch (inputs and outputs at rate R, mesh links at rate R/N) with cells 1, 2, 3, ..., N spread across the middle ports.]

Page 24: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

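FOFF

Output properties

- N FIFO queues corresponding to the N middle ports
- Buffer size less than N^2 packets
- If there are N^2 packets, one of the head-of-line packets is in order

[Figure: an output linecard with N FIFO queues, numbered 1, 2, 3, 4, ..., N, one per middle port, feeding the output.]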

Page 25: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

FOFF Properties

Property 1: FOFF maintains packet order.

Property 2: FOFF has O(1) complexity.

Property 3: Congestion buffers operate independently.

Property 4: FOFF keeps the average packet delay within a constant of that of an ideal output-queued router.

Corollary: FOFF has 100% throughput for any adversarial traffic.

Page 26: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

Output-Queued Router?

[Figure: the single mesh again: inputs (In) and outputs (Out) at rate R, with the mesh links marked "?" and R.]

Page 27: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

Outline

- Basic idea of load-balancing
- Packet mis-sequencing
- An optical switch fabric
- Scaling number of linecards

Page 28: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

From Two Meshes to One Mesh

[Figure: the two-stage switch (inputs and outputs at rate R, mesh links at rate R/N) with one input (In) and one output (Out) grouped together as "One linecard".]

Page 29: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

From Two Meshes to One Mesh

[Figure: four linecards (In, Out), each at rate R, attached to both the first mesh and the second mesh; one linecard is marked.]

Page 30: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

From Two Meshes to One Mesh

[Figure: the two meshes merged into one combined mesh; each of the four linecards (In, Out) at rate R connects to it at rate 2R.]

Page 31: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

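Many Fabric Options

Options

- Space: Full uniform mesh
- Time: Round-robin crossbar
- Wavelength: Static WDM

[Figure: four linecards (In, Out) attached to "any spreading device". One linecard sends channels C1, C2, ..., CN, which the device spreads so that C1, C2, C3, ..., CN each reach a different linecard. N channels each at rate 2R/N.]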
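
The per-channel rate is a one-line calculation. The sketch below plugs in the design-example numbers from earlier in the deck (160Gb/s linecards, 640 linecards); the function name is illustrative.

```python
def channel_rate_gbps(line_rate_gbps: float, num_linecards: int) -> float:
    """Rate of each of the N channels leaving a linecard in the combined mesh.

    A linecard sources R of load-balancing traffic plus R of forwarding
    traffic, 2R in total, spread evenly over N channels.
    """
    return 2 * line_rate_gbps / num_linecards


print(channel_rate_gbps(160.0, 640))  # 0.5
```

With these numbers each channel runs at only 0.5Gb/s, which is why the fabric can be any static spreading device rather than a fast reconfigurable switch.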

Page 32: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

AWGR (Arrayed Waveguide Grating Router): A Passive Optical Component

- Wavelength i on input port j goes to output port (i+j-1) mod N
- Can shuffle information from different inputs

[Figure: an NxN AWGR with Linecard 1, Linecard 2, ..., Linecard N on the input side, each sending wavelengths 1, 2, ..., N, and Linecard 1, Linecard 2, ..., Linecard N on output ports 1, 2, ..., N.]
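
The routing rule can be checked directly. A minimal sketch, assuming ports and wavelengths are numbered 1..N and that a result of 0 from the mod operation denotes port N (the slide does not spell out this convention):

```python
def awgr_output(wavelength: int, input_port: int, n: int) -> int:
    """Output port reached by wavelength i on input port j: (i + j - 1) mod n.

    Ports and wavelengths are numbered 1..n here; a result of 0 is read as
    port n (an assumption about the slide's numbering convention).
    """
    port = (wavelength + input_port - 1) % n
    return port if port != 0 else n


def is_full_shuffle(n: int) -> bool:
    """True if each input reaches every output exactly once across the n
    wavelengths, and each wavelength maps the inputs onto distinct outputs."""
    ports = set(range(1, n + 1))
    per_input = all({awgr_output(i, j, n) for i in ports} == ports for j in ports)
    per_wavelength = all({awgr_output(i, j, n) for j in ports} == ports for i in ports)
    return per_input and per_wavelength


print(awgr_output(1, 1, 4))  # 1
print(is_full_shuffle(4))    # True
```

The second check is what makes the device usable as a static mesh: a linecard that transmits on all N wavelengths reaches all N linecards, and no two inputs collide on the same output and wavelength.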

Page 33: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

Static WDM Switching: Packaging

[Figure: four linecards (In, Out) connected through an AWGR, which is passive and draws almost zero power. Each linecard sends on wavelengths A, B, C, D; on the receive side the four fibers carry A, A, A, A; B, B, B, B; C, C, C, C; and D, D, D, D respectively. N WDM channels, each at rate 2R/N.]

Page 34: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

Outline

- Basic idea of load-balancing
- Packet mis-sequencing
- An optical switch fabric
- Scaling number of linecards

Page 35: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

Scaling Problem

- For N < 64, an AWGR is a good solution.
- We want N = 640.
- Need to decompose.

Page 36: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

A Different Representation of the Mesh

[Figure: four linecards (In, Out) at rate R connected at rate 2R to a block labeled "Mesh", redrawn with the linecards shown on both sides of the mesh.]

Page 37: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

A Different Representation of the Mesh

[Figure: linecards (In, Out) at rate R on the left, fully connected to the same linecards on the right; each mesh link runs at rate 2R/N.]

Page 38: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

Example: N=8

[Figure: linecards 1 to 8 on the left fully connected to linecards 1 to 8 on the right; each link runs at rate 2R/8.]

Page 39: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

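When N is Too Large

Decompose into groups (or racks)

[Figure: linecards 1 to 8, each connected at rate 2R, arranged in groups on both sides; each group carries 4R in aggregate and the links between groups run at rate 4R/4.]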

Page 40: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

When N is Too Large

Decompose into groups (or racks)

[Figure: G groups of L linecards each (Group/Rack 1 to Group/Rack G) on both sides of the fabric. Each linecard connects at rate 2R, each group carries 2RL in aggregate, and each group-to-group link runs at rate 2RL/G.]
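
The group rates can be worked out for the N = 8 example. The sketch assumes G = 4 groups of L = 2 linecards, which is inferred from the "4R" and "4R/4" labels in the N = 8 figure rather than stated outright; rates are expressed in units of R.

```python
def group_rates(r: float, linecards_per_group: int, num_groups: int) -> tuple[float, float]:
    """Return (aggregate rate per group, rate between each pair of groups).

    Each linecard exchanges 2R with the fabric, so a group of L linecards
    carries 2RL, spread uniformly over the G groups: 2RL/G per group pair.
    """
    aggregate = 2 * r * linecards_per_group
    return aggregate, aggregate / num_groups


# Assumed reading of the N = 8 example: G = 4 groups of L = 2 linecards.
print(group_rates(1.0, 2, 4))  # (4.0, 1.0)
```

The number of inter-group links grows as G^2 rather than N^2, which is the point of the decomposition when N is too large for a single device.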

Page 41: EE384Y: Packet Switch Architectures Part II Load-balanced Switches

Outline

- Basic idea of load-balancing
- Packet mis-sequencing
- An optical switch fabric
- Scaling number of linecards