achieving 100% throughput where we are in the course…


Upload: ian-reeves

Post on 01-Jan-2016


Page 1: Achieving 100% throughput Where we are in the course…

Achieving 100% throughput
Where we are in the course…

1. Switch model
2. Uniform traffic
   Technique: Uniform schedule (easy)
3. Non-uniform traffic, but known traffic matrix
   Technique: Non-uniform schedule (Birkhoff-von Neumann)
4. Unknown traffic matrix
   Technique: Lyapunov functions (MWM)
5. Faster scheduling algorithms
   Technique: Speedup (maximal matchings)
   Technique: Memory and randomization (Tassiulas)
   Technique: Twist architecture (buffered crossbar)
6. Accelerate scheduling algorithm
   Technique: Pipelining
   Technique: Envelopes
   Technique: Slicing
7. No scheduling algorithm
   Technique: Load-balanced router

Page 2: Achieving 100% throughput Where we are in the course…

Buffered Crossbars With Performance Guarantees

Taken from the 2004 Ph.D. defense of:

Shang-Tse (Da) Chuang
Department of Electrical Engineering, Stanford University
http://yuba.stanford.edu/~stchuang

Page 3: Achieving 100% throughput Where we are in the course…

Motivation

Network operators want performance guarantees
  Throughput guarantee
  Delay guarantee

High-performance routers use crossbars

Hard to build crossbar-based routers with guarantees

My talk: How a crossbar with a small amount of internal buffering can give guarantees

Page 4: Achieving 100% throughput Where we are in the course…

Contents

Throughput Guarantees
  Buffered Crossbar - 100% Throughput
  Buffered Crossbar - Work Conservation

Page 5: Achieving 100% throughput Where we are in the course…

Generic Crossbar-Based Architecture

[Figure: crossbar-based switch with virtual output queues (VOQs), a scheduler, and a speedup of S]

Page 6: Achieving 100% throughput Where we are in the course…

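Admissible Traffic

Traffic Matrix: $\Lambda = [\lambda_{ij}]$, where $\lambda_{ij}$ is the arrival rate at input $i$ of cells destined to output $j$

Traffic is admissible if $\sum_i \lambda_{ij} < 1$ for all $j$ and $\sum_j \lambda_{ij} < 1$ for all $i$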
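As a small illustration of the admissibility condition, the sketch below assumes the usual reading: every row sum and every column sum of the traffic matrix stays strictly below 1. The function name `is_admissible` and the list-of-lists representation are illustrative choices, not taken from the slides.

```python
def is_admissible(rates):
    """Return True if no input and no output is loaded to its line rate or beyond.

    rates[i][j] is the assumed arrival rate (cells per time slot) at input i
    for output j; the matrix is assumed to be square.
    """
    n = len(rates)
    # Each input's total arrival rate must stay strictly below 1.
    rows_ok = all(sum(row) < 1 for row in rates)
    # Each output's total arrival rate must stay strictly below 1.
    cols_ok = all(sum(rates[i][j] for i in range(n)) < 1 for j in range(n))
    return rows_ok and cols_ok


if __name__ == "__main__":
    print(is_admissible([[0.5, 0.4], [0.4, 0.5]]))  # every row and column sums to 0.9
    print(is_admissible([[0.5, 0.5], [0.2, 0.3]]))  # input 0 is loaded to exactly 1.0
```

Note that a matrix can be non-uniform and still admissible; only the per-input and per-output totals matter.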

Page 7: Achieving 100% throughput Where we are in the course…

Throughput Guarantee

100% Throughput
  An algorithm delivers 100% throughput if, for any admissible traffic, the average backlog is finite

[Figure: crossbar switch with a scheduler and a speedup of S]

Page 8: Achieving 100% throughput Where we are in the course…

Previous Work

[Figure: timeline from 1985 to 2005 of crossbar scheduling algorithms, grouped into "Heuristics" and "Theoretically Proven"]

Wave Front Arbiter [Tamir]
Parallel Iterative Matching [Anderson et al.]
iSLIP [McKeown]
Longest Port First [Mekkittikul et al.]
Maximum Weight Matching [McKeown et al.]
Maximal Matching, S=2 [Dai, Prabhakar]

Page 9: Achieving 100% throughput Where we are in the course…

Maximal Matching Has Become Hard

TTX Switch Fabric
  Uses maximal matching
  Speedup less than 2
  Consumes up to 8 kW
  Limited to ~2.5 Tb/s
  No 100% throughput guarantee

Page 10: Achieving 100% throughput Where we are in the course…

Traditional Crossbar

Crossbar Requirements
  An input can send at most one cell
  An output can receive at most one cell

Scheduling Problem
  Must overcome two constraints simultaneously

New Crossbar
  Relieve contention
  Remove dependency between inputs and outputs

Page 11: Achieving 100% throughput Where we are in the course…

Contents

Throughput Guarantees
  Buffered Crossbar - 100% Throughput
  Buffered Crossbar - Work Conservation

Delay Guarantees
  Traditional Crossbar - Emulating an OQ Switch
  Buffered Crossbar - Emulating an OQ Switch

Page 12: Achieving 100% throughput Where we are in the course…

Buffered Crossbar

Arrival Phase
Scheduling Phases - Speedup of 2
Departure Phase

Page 13: Achieving 100% throughput Where we are in the course…

Scheduling Phase

Input Schedule
  Each input selects, in parallel, a cell for an empty crosspoint

Output Schedule
  Each output selects, in parallel, a cell from a full crosspoint

Page 14: Achieving 100% throughput Where we are in the course…

Example of Input/Output Scheduling

Round-robin Policy
  Each input schedules in a round-robin order
  Each output schedules in a round-robin order
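The round-robin input and output schedules can be sketched as a toy simulation. Everything below is an assumption made for illustration: one-cell crosspoint buffers, Bernoulli arrivals with uniformly chosen destinations, an input schedule followed by an output schedule in each scheduling phase, and the names `simulate`, `voq`, `xpt`, and `outq`. It is not the simulator behind any result in these slides.

```python
import random


def simulate(n=4, load=0.8, slots=2000, speedup=2, seed=1):
    """Toy buffered crossbar; returns (arrived, departed, backlog) cell counts."""
    rng = random.Random(seed)
    voq = [[0] * n for _ in range(n)]      # voq[i][j]: cells at input i for output j
    xpt = [[False] * n for _ in range(n)]  # xpt[i][j]: one-cell crosspoint buffer
    outq = [0] * n                         # cells waiting at each output
    in_ptr = [0] * n                       # round-robin pointer per input
    out_ptr = [0] * n                      # round-robin pointer per output
    arrived = departed = 0
    for _ in range(slots):
        # Arrival phase: at most one cell per input per time slot.
        for i in range(n):
            if rng.random() < load:
                voq[i][rng.randrange(n)] += 1
                arrived += 1
        # Scheduling phases: `speedup` of them per time slot.
        for _ in range(speedup):
            # Input schedule: each input picks a cell whose crosspoint is empty.
            for i in range(n):
                for k in range(n):
                    j = (in_ptr[i] + k) % n
                    if voq[i][j] > 0 and not xpt[i][j]:
                        voq[i][j] -= 1
                        xpt[i][j] = True
                        in_ptr[i] = (j + 1) % n
                        break
            # Output schedule: each output picks a cell from a full crosspoint.
            for j in range(n):
                for k in range(n):
                    i = (out_ptr[j] + k) % n
                    if xpt[i][j]:
                        xpt[i][j] = False
                        outq[j] += 1
                        out_ptr[j] = (i + 1) % n
                        break
        # Departure phase: each output sends at most one cell.
        for j in range(n):
            if outq[j] > 0:
                outq[j] -= 1
                departed += 1
    backlog = sum(map(sum, voq)) + sum(map(sum, xpt)) + sum(outq)
    return arrived, departed, backlog


if __name__ == "__main__":
    print(simulate())
```

Each input and each output decides using only its own queues and its own column or row of crosspoints, which is why the two schedules can run independently and in parallel.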

Page 15: Achieving 100% throughput Where we are in the course…

Previous Work

Buffered Crossbar Simulations [Rojas-Cessa et al. 2001]
  32x32 switch, Uniform Bernoulli Traffic, Round-Robin, S=1

[Figure: Average Delay (Cell Time), log scale from 0.01 to 1000, versus Offered Load p from 0.025 to 0.925; curves for 1-SLIP, 4-SLIP, Buffered Crossbar, and Ideal Router]

Page 16: Achieving 100% throughput Where we are in the course…

100% Throughput

Theorem 1
  A buffered crossbar with speedup of 2 delivers 100% throughput for any admissible Bernoulli iid traffic using any work-conserving input/output schedules.

Page 17: Achieving 100% throughput Where we are in the course…

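Intuition of Proof

[Figure: a flow with arrival rate ε; the other traffic at its input arrives at rate < 1-ε, and the other traffic for its output arrives at rate < 1-ε]

(1-ε) + (1-ε) + ε = 2-ε

When a flow is backed up, the service for this backlog exceeds the arrivals: the arrival rate is below 2-ε, while a speedup of 2 provides two scheduling phases per time slot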

Page 18: Achieving 100% throughput Where we are in the course…

Contents

Throughput Guarantees
  Buffered Crossbar - 100% Throughput
  Buffered Crossbar - Work Conservation

Delay Guarantees
  Traditional Crossbar - Emulating an OQ Switch
  Buffered Crossbar - Emulating an OQ Switch

Page 19: Achieving 100% throughput Where we are in the course…

Work Conservation

Work-conserving Property
  If there is a cell for a given output in the system, that output is busy.

[Figure: Output Queued (OQ) Switch]

Page 20: Achieving 100% throughput Where we are in the course…

Emulating an OQ switch

Under identical inputs, the departure time of every cell from both switches is identical

Page 21: Achieving 100% throughput Where we are in the course…

Input Priority List

[Figure: example switch with each cell labeled by its departure time]

Label each cell with its corresponding departure time
Arrange input cells into an input priority list
Output selects the crosspoint with the earliest departure time

Page 22: Achieving 100% throughput Where we are in the course…

Input Priority List

[Figure: example switch with each cell labeled by its departure time; cells are marked "Good guy", "Bad guy", and "Bad guys"]

Label each cell with its corresponding departure time
Arrange input cells into an input priority list
Output selects the crosspoint with the earliest departure time

Page 23: Achieving 100% throughput Where we are in the course…

Definitions

Output Margin - cells at its output with earlier departure time
Input Margin - cells ahead in input priority list destined to different outputs
Total Margin - Output Margin minus Input Margin

[Figure: example cell with 2 good guys and 2 bad guys]
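The three margin definitions can be written out directly. This is a minimal sketch under assumed representations: an input priority list is a Python list of `(output, departure_time)` pairs with the head at index 0, and the cells already waiting at the output are given as a list of departure times. The function name `margins` is hypothetical.

```python
def margins(priority_list, index, output_queue):
    """Return (output_margin, input_margin, total_margin) for one cell.

    priority_list: (output, departure_time) pairs for one input, head first.
    index: position of the cell of interest in priority_list.
    output_queue: departure times of cells already at that cell's output.
    """
    out, dep = priority_list[index]
    # Output margin: cells at its output with an earlier departure time.
    output_margin = sum(1 for d in output_queue if d < dep)
    # Input margin: cells ahead in the list destined to different outputs.
    input_margin = sum(1 for o, _ in priority_list[:index] if o != out)
    return output_margin, input_margin, output_margin - input_margin


if __name__ == "__main__":
    plist = [(1, 5), (2, 7), (0, 6), (1, 9)]
    # Cell (1, 9): two cells wait at output 1 with earlier departure times,
    # and two cells ahead of it are destined to other outputs.
    print(margins(plist, 3, [3, 4]))  # (2, 2, 0)
```

In this example the cell has two earlier departures ahead of it at the output and two cells for other outputs ahead of it at the input, so its total margin is zero.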

Page 24: Achieving 100% throughput Where we are in the course…

Emulation of FIFO OQ Switch

Scheduling Phase
  Crosspoint is full - Output Margin will increase by one
  Crosspoint is empty - Input Margin will decrease by one

Total Margin increases by two

Page 25: Achieving 100% throughput Where we are in the course…

Emulation of FIFO OQ Switch

Arrival Phase
  Input Margin might increase by one

Departure Phase
  Output Margin will decrease by one

Total Margin decreases by at most two

Page 26: Achieving 100% throughput Where we are in the course…

Emulation of FIFO OQ Switch

Lemma 1
  For every time slot, total margin does not decrease
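The per-phase accounting behind Lemma 1 can be collected in one place. The sketch below only restates the slide claims for a single time slot; the symbol $\Delta$ for the change in total margin during a phase is introduced here for illustration and assumes a speedup of 2, so that there are two scheduling phases per slot.

```latex
\begin{align*}
\Delta_{\text{scheduling}} &= +2
  && \text{two scheduling phases, each raising total margin by one} \\
\Delta_{\text{arrival}} &\ge -1
  && \text{input margin might increase by one} \\
\Delta_{\text{departure}} &= -1
  && \text{output margin decreases by one} \\
\Delta_{\text{slot}} &= \Delta_{\text{scheduling}} + \Delta_{\text{arrival}} + \Delta_{\text{departure}}
  \ge 2 - 1 - 1 = 0
\end{align*}
```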

Page 27: Achieving 100% throughput Where we are in the course…

FIFO Insertion Policy

Arrival Phase
  Cell for non-empty VOQ: insert behind cells for same output
  Cell for empty VOQ: insert at head of input priority list
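The insertion rule can be sketched as follows. The representation is assumed for illustration: the input priority list is a Python list of `(output, departure_time)` pairs with the head at index 0, "behind cells for same output" is read as "immediately after the last cell for that output", and the name `insert_cell` is hypothetical.

```python
def insert_cell(priority_list, cell):
    """Insert an arriving cell into one input's priority list (head at index 0).

    cell is an (output, departure_time) pair. If the list already holds cells
    for the same output (non-empty VOQ), the new cell goes directly behind the
    last of them; otherwise (empty VOQ) it goes to the head of the list.
    """
    output = cell[0]
    last_same = -1  # stays -1 when no cell for this output is queued
    for pos, (queued_output, _) in enumerate(priority_list):
        if queued_output == output:
            last_same = pos
    priority_list.insert(last_same + 1, cell)
    return priority_list


if __name__ == "__main__":
    plist = [(1, 5), (2, 7), (1, 9), (0, 6)]
    insert_cell(plist, (1, 12))  # VOQ for output 1 is non-empty
    insert_cell(plist, (3, 4))   # VOQ for output 3 is empty
    print(plist)
```

With this reading, a cell for an empty VOQ starts with no cells ahead of it in the list, so its input margin is zero on arrival.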

Page 28: Achieving 100% throughput Where we are in the course…

FIFO Insertion Policy

Lemma 2
  An arriving cell will have a non-negative total margin

Page 29: Achieving 100% throughput Where we are in the course…

Emulation of FIFO OQ Switch

Theorem 2
  A buffered crossbar with speedup of 2 can exactly emulate a FIFO OQ switch.

Result was shown independently:
  B. Magill, C. Rohrs, R. Stevenson, “Output-Queued Switch Emulation by Fabrics With Limited Memory,” IEEE Journal on Selected Areas in Communications, pp. 606-615, May 2003.

Theorem 3
  A buffered crossbar with speedup of 2 can be work-conserving with a distributed algorithm.

Page 30: Achieving 100% throughput Where we are in the course…

Summary

Buffered crossbars
  Use crosspoints to relieve contention
  Inputs and outputs schedule independently and in parallel

Performance guarantees
  Throughput - any work-conserving input/output schedule
  Work Conservation - simple insertion policy

Page 31: Achieving 100% throughput Where we are in the course…

Relevant Papers

Crossbars
  Shang-Tse Chuang, Ashish Goel, Nick McKeown, Balaji Prabhakar, “Matching Output Queuing with a Combined Input Output Queued Switch,” IEEE Journal on Selected Areas in Communications, vol. 17, no. 6, pp. 1030-1039, Dec. 1999.

Buffered Crossbars
  Shang-Tse Chuang, Sundar Iyer, Nick McKeown, “Practical Algorithms for Performance Guarantees in Buffered Crossbars,” in preparation for IEEE/ACM Transactions on Networking.

Page 32: Achieving 100% throughput Where we are in the course…


Thank you!