The Parallel Packet Switch


Page 1: The Parallel Packet Switch

Stanford University © 1999

The Parallel Packet Switch

Web Site: http://klamath.stanford.edu/fjr

Sundar Iyer, Amr Awadallah, & Nick McKeown

High Performance Networking Group, Stanford University

Page 2: The Parallel Packet Switch

Contents

– Motivation
– Introduction
– Key Ideas: Speedup, Concentration, Constraints
– Centralized Algorithm: Theorems, Results & Summary
– Motivation for a Distributed Algorithm
– Concepts: Independence, Trade-Off, Request Duplication
– Performance of DPA
– Conclusions & Future Work

Page 3: The Parallel Packet Switch

Motivation

"I want an ideal switch"

To build:
– an extremely high-speed packet switch with extremely high line rates
– a switch with memories running slower than the line rate
– a switch with a highly scalable architecture
– a switch with Quality of Service support and redundancy

Page 4: The Parallel Packet Switch

Architecture Alternatives

An Ideal Switch:
• The memory runs at lower than line rate speeds
• Supports QoS
• Is easy to implement

[Figure: the design space spanned by three axes: QoS Support, Memory Speeds, and Ease of Implementation. An input-queued switch needs memory at 1x the line rate, a CIOQ switch at 2x, and an output-queued switch at Nx; the PPS is marked with a "?" near the "Ideal!" point.]

Page 5: The Parallel Packet Switch

What is a Parallel Packet Switch?

A parallel packet switch (PPS) comprises multiple identical lower-speed packet switches operating independently and in parallel. An incoming stream of packets is spread, packet by packet, by a demultiplexer across the slower packet switches, then recombined by a multiplexer at the output.

[Figure: a PPS with N = 4 external ports at line rate R and k = 3 internal output-queued switches. Each demultiplexer spreads arriving packets across the three layers over internal links of rate R/k; a multiplexer at each output recombines the streams at rate R.]
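For concreteness, here is a toy Python model of this structure. Everything in it (the names, the round-robin spreading rule, the greedy multiplexer) is my own sketch, not the talk's algorithm; in particular, blind round-robin spreading is exactly the kind of policy the later "concentration" slides warn about.

```python
from collections import deque

class NaivePPS:
    """Toy structural model of a PPS: k slow output-queued (OQ) layers,
    a round-robin demultiplexer per input, and per-output multiplexers.
    Illustrative only -- a real PPS demultiplexer must obey the link
    constraints introduced later, not blind round-robin."""

    def __init__(self, n_ports: int, k_layers: int):
        self.n, self.k = n_ports, k_layers
        # Each layer is an OQ switch: one FIFO queue per external output.
        self.layers = [[deque() for _ in range(n_ports)]
                       for _ in range(k_layers)]
        self.next_layer = [0] * n_ports     # round-robin pointer per input

    def arrive(self, input_port: int, output_port: int, cell) -> None:
        """Demultiplexer: spread cells from this input across the layers."""
        layer = self.next_layer[input_port]
        self.layers[layer][output_port].append(cell)
        self.next_layer[input_port] = (layer + 1) % self.k

    def depart(self, output_port: int):
        """Multiplexer: pull one cell destined to this output, if any layer
        holds one (no attempt here to preserve FCFS order across layers)."""
        for layer in self.layers:
            if layer[output_port]:
                return layer[output_port].popleft()
        return None

pps = NaivePPS(n_ports=4, k_layers=3)
pps.arrive(0, 2, "cell-A")
pps.arrive(0, 2, "cell-B")
print(pps.depart(2), pps.depart(2))   # cell-A cell-B
```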

Page 6: The Parallel Packet Switch

Key Ideas in a Parallel Packet Switch

• Key concept: "inverse multiplexing"
• Buffering occurs only in the internal switches!
• By choosing a large value of k, we would like to arbitrarily reduce the memory speeds within the switch

Can such a switch work "ideally"? Can it give the advantages of an output-queued switch?

What should the multiplexer and demultiplexer do? Doesn't the switch trivially behave well?

Page 7: The Parallel Packet Switch

Definitions

Output-Queued (OQ) Switch
– A switch in which arriving packets are placed immediately in queues at the output, where they contend with packets destined to the same output, waiting their turn to depart.
– "We would like to perform as well as an output-queued switch."

Mimic (Black-Box Model)
– Two different switches are said to mimic each other if, under identical inputs, identical packets depart from each switch at the same time.

Work Conserving
– A system is said to be work-conserving if its outputs never idle unnecessarily.
– "If you've got something to do, do it now!"

Page 8: The Parallel Packet Switch

Ideal Scenario

[Figure: the N = 4, k = 3 PPS again. Three cells destined to output port two arrive at rate R and are spread one per layer, so each internal link toward output two carries only R/3, and the multiplexer reads them out at rate R.]

Page 9: The Parallel Packet Switch

Potential Pitfalls - Concentration

[Figure: the same PPS, but two of the cells destined to output port two land on the same layer; the internal link from that layer toward output two must now carry 2R/3 rather than R/3.]

"Concentration is when a large number of cells destined to the same output are concentrated on a small fraction of internal layers."

Page 10: The Parallel Packet Switch

Can concentration always be avoided?

[Figure, panels (a)-(d): a four-panel counterexample on a k = 3 PPS with ports A, B, C. Cells C1: (A, 1), C2: (A, 2), C3: (A, 1) arrive at t = 0 and t = 1 and depart as shown in (c); cells C4: (B, 2) and C5: (B, 2) then arrive, departing at t = 0' and t = 1' in (d). The arrival pattern leaves no layer assignment that keeps cells for the same output apart, suggesting that concentration cannot always be avoided.]

Page 11: The Parallel Packet Switch

Link Constraints

Input Link Constraint (ILC)
– An external input port is constrained to send a cell to a specific layer at most once every ceil(k/S) time slots.
– This constraint is due to the switch architecture; each arriving cell must adhere to it.

Output Link Constraint (OLC)
– A similar constraint exists for each output port.

[Figure: a demultiplexer shown after t = 4 and after t = 5, with a speedup of 2 and k = 10 links.]
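To make the constraint concrete (my arithmetic for the figure's parameters): with speedup S each internal link runs at S·R/k, so a cell occupying one slot at the external rate R occupies ceil(k/S) slots on an internal link, and a new cell can start on a given layer at most once every

$$\left\lceil \frac{k}{S} \right\rceil \text{ slots}, \qquad \text{e.g. } k = 10,\; S = 2:\ \left\lceil \frac{10}{2} \right\rceil = 5 .$$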

Page 12: The Parallel Packet Switch

AIL and AOL Sets

Available Input Link Set: AIL(i,n) is the set of layers to which external input port i can start sending a cell in time slot n.
– This is the set of layers to which external input i has not started sending any cells within the last ceil(k/S) time slots.
– AIL(i,n) evolves over time.
– AIL(i,n) is full when no cells have arrived at input i for ceil(k/S) time slots.

Available Output Link Set: AOL(j,n') is the set of layers that can send a cell to external output j at time slot n' in the future.
– This is the set of layers that have not started to send a new cell to external output j in the last ceil(k/S) time slots before time slot n'.
– AOL(j,n') evolves over time as cells are sent to output j.
– AOL(j,n') is never full as long as there are cells in the system destined to output j.
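A minimal sketch of how a demultiplexer might track its AIL set with a sliding window. The class name and structure are mine, not from the talk; the same logic applies symmetrically to AOL(j,n') at an output.

```python
import math

class AvailableLinkSet:
    """Tracks AIL(i, n): the layers input i may start a cell on at slot n.
    A layer is unavailable for ceil(k/S) slots after a cell starts on it."""

    def __init__(self, k: int, speedup: float):
        self.k = k
        self.window = math.ceil(k / speedup)  # ceil(k/S) slots of busy time
        self.last_start = [None] * k          # when layer l last began a cell

    def ail(self, n: int) -> set:
        """Layers whose last cell started at least ceil(k/S) slots ago."""
        return {l for l, t in enumerate(self.last_start)
                if t is None or n - t >= self.window}

    def send(self, layer: int, n: int) -> None:
        assert layer in self.ail(n), "violates the input link constraint"
        self.last_start[layer] = n

links = AvailableLinkSet(k=10, speedup=2)   # window = ceil(10/2) = 5 slots
links.send(layer=3, n=0)
print(3 in links.ail(4))                  # False: only 4 slots have elapsed
print(3 in links.ail(5))                  # True: the 5-slot window has passed
print(len(links.ail(1)) >= 10 - 5 + 1)    # Lemma 1's bound holds: True
```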

Page 13: The Parallel Packet Switch

Bounding AIL and AOL

Lemma 1: |AIL(i,n)| >= k - ceil(k/S) + 1

Lemma 2: |AOL(j,n')| >= k - ceil(k/S) + 1

[Figure: the demultiplexer at input i at t = n, showing the k layers split into at most ceil(k/S) - 1 unavailable ones, with the remaining k - ceil(k/S) + 1 in AIL(i,n).]
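The counting behind Lemma 1, reconstructed from the figure: input i starts at most one cell per time slot, so in the previous ceil(k/S) - 1 slots it can have made at most ceil(k/S) - 1 layers unavailable, leaving

$$|AIL(i,n)| \;\ge\; k - \left(\left\lceil \frac{k}{S} \right\rceil - 1\right) \;=\; k - \left\lceil \frac{k}{S} \right\rceil + 1 .$$

The same argument at the multiplexer for output j gives Lemma 2.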

Page 14: The Parallel Packet Switch

Theorems

Theorem 1 (Sufficiency): If a PPS guarantees that each arriving cell is allocated to a layer l such that l ∈ AIL(i,n) and l ∈ AOL(j,n') (i.e., if it meets both the ILC and the OLC), then the switch is work-conserving.

[Figure: Venn diagram of AIL(i,n) and AOL(j,n'); the cell must go to a layer in the intersection set.]

Theorem 2 (Sufficiency): A speedup of 2k/(k+2) is sufficient for a PPS to meet both the input and output link constraints for every cell.
– Corollary: A PPS is work-conserving if S >= 2k/(k+2).
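As a sanity check that Theorem 1's hypothesis can always be met (my arithmetic, shown for S = 2, the speedup used in the CPA summary), the two lemmas bound the intersection:

$$|AIL(i,n) \cap AOL(j,n')| \;\ge\; |AIL(i,n)| + |AOL(j,n')| - k \;\ge\; 2\left(k - \left\lceil \frac{k}{S} \right\rceil + 1\right) - k .$$

With S = 2 this is k - 2*ceil(k/2) + 2, which equals 2 for even k and 1 for odd k; the intersection is never empty, so a layer satisfying both constraints always exists.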

Page 15: The Parallel Packet Switch

Theorems (contd.)

Theorem 3 (Sufficiency): A PPS can exactly mimic an FCFS-OQ switch with a speedup of 2k/(k+2).

Analogy to a Clos network?

Page 16: The Parallel Packet Switch

Summary of Results

CPA - Centralized PPS Algorithm:
– Each input maintains its AIL set.
– The AIL sets are broadcast to a central scheduler.
– CPA calculates the intersection between AIL and AOL.
– CPA timestamps the cells.
– The cells are output in the order of the global timestamp.

If the speedup S >= 2, then:
– CPA is work conserving
– CPA is perfectly load balancing
– CPA can perfectly mimic an FCFS OQ switch
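A compact sketch of one CPA layer-selection step, under my own choice of data structures (the talk does not give these; the per-cell departure-time computation is elided):

```python
import math

def cpa_pick_layer(k, speedup, ail_last, aol_last, i, j, n, n_depart):
    """One centralized CPA decision: choose a layer for the cell now at
    input i, destined to output j, that arrived at slot n and is due to
    depart at slot n_depart. ail_last[i][l] / aol_last[j][l] record when
    input i / output j last started a cell on layer l (None = never).
    A sketch of the slide's 'intersection of AIL and AOL' step."""
    window = math.ceil(k / speedup)
    avail = lambda last, t: {l for l in range(k)
                             if last[l] is None or t - last[l] >= window}
    candidates = avail(ail_last[i], n) & avail(aol_last[j], n_depart)
    assert candidates, "nonempty whenever S >= 2 (Lemmas 1-2, Theorem 1)"
    layer = min(candidates)          # any member of the intersection works
    ail_last[i][layer] = n           # layer now busy for `window` slots ...
    aol_last[j][layer] = n_depart    # ... on both the input and output side
    return layer

k, S, N = 3, 2, 4
ail = [[None] * k for _ in range(N)]
aol = [[None] * k for _ in range(N)]
print(cpa_pick_layer(k, S, ail, aol, i=0, j=2, n=0, n_depart=0))  # 0
print(cpa_pick_layer(k, S, ail, aol, i=1, j=2, n=0, n_depart=1))  # 1
```

CPA would then stamp each cell with the global timestamp, and the multiplexers replay cells in timestamp order, as the last two bullets above describe.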

Page 17: The Parallel Packet Switch

Motivation for a Distributed Solution

The centralized algorithm is not practical:
– N sequential decisions must be made
– Each decision is a set intersection
– It does not scale with N, the number of input ports

Ideally, we would like a distributed algorithm where each input makes its decision independently.

Caveats:
– A totally distributed solution leads to concentration
– A speedup of k might be required

Page 18: The Parallel Packet Switch

Potential Pitfall

[Figure: the N = 4, k = 3 PPS architecture again.]

"If inputs act independently, the PPS can immediately become non-work-conserving."

• Decrease the number of inputs which request simultaneously
• Give the scheduler choice
• Increase the speedup appropriately

Page 19: The Parallel Packet Switch

DPA - Distributed PPS Algorithm

– Inputs are partitioned into k groups of size floor(N/k).
– There are N schedulers, one for each output; each maintains AOL(j,n').
– There are ceil(N/k) scheduling stages, each with three phases (a sketch of one stage follows below):
  – Broadcast phase
  – Request phase: each input requests a layer which satisfies the ILC & OLC (the primary request); each input also requests a duplicate layer, given by the duplication function (the duplicate request).
  – Grant phase: the scheduler grants each input one request amongst the two.
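A toy sketch of the grant phase at one output's scheduler. This is my reconstruction: the grant rule used here (send the cell to the less-loaded of its two requested layers) is an illustrative stand-in, since the slides do not spell the rule out.

```python
def dpa_stage(requests, k):
    """One DPA grant phase at a single output's scheduler.

    requests: list of (input_id, primary_layer, duplicate_layer) tuples;
    both layers are assumed already ILC/OLC-feasible for that input, and
    duplicate == primary models a group-k input that sends no duplicate.
    Returns {input_id: granted_layer}.
    """
    load = [0] * k                      # cells granted per layer this stage
    grants = {}
    for inp, primary, duplicate in requests:
        layer = primary if load[primary] <= load[duplicate] else duplicate
        grants[inp] = layer
        load[layer] += 1
    return grants

# Inputs 1, 3, 4 from the Page-21 example (0-indexed layers, k = 3):
# all primaries are layer 0; duplicates come from the duplication function.
print(dpa_stage([(1, 0, 1), (3, 0, 2), (4, 0, 0)], k=3))
# {1: 0, 3: 2, 4: 0} -- the duplicate requests let the scheduler spread load
```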

Page 20: The Parallel Packet Switch

The Duplicate Request Function

– Input i ∈ group g
– The primary request is to layer l; l' is the duplicate request layer; k is the number of layers
– l' = (l + g) mod k

Duplicate layer l' by group and primary layer (k = 3):

              Layer 1   Layer 2   Layer 3
    Group 1      2         3         1
    Group 2      3         1         2
    Group 3      1         2         3

"Inputs belonging to group k do not send duplicate requests" (their duplicate equals their primary).
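A few lines verifying the function against the table (0-indexed layers internally, printed 1-indexed to match the slide; the function name is mine):

```python
def duplicate_request(l: int, g: int, k: int) -> int:
    """l' = (l + g) mod k, with layers 0-indexed and g the 1-indexed group
    number. For g = k the result is l itself: group k sends no duplicate."""
    return (l + g) % k

k = 3
for g in range(1, k + 1):                 # groups 1..k, as on the slide
    row = [duplicate_request(l, g, k) + 1 for l in range(k)]
    print(f"Group {g}: {row}")
# Group 1: [2, 3, 1]
# Group 2: [3, 1, 2]
# Group 3: [1, 2, 3]   <- identity: no effective duplicate request
```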

Page 21: The Parallel Packet Switch

Key Idea - Duplicate Requests

[Figure: the N = 4, k = 3 PPS, here with ports labeled A-D; cells C1-C4, one per input, are all destined to output B.]

Group 1 = {1, 2}; Group 2 = {3}; Group 3 = {4}.
Inputs 1, 3, and 4 participate in the first scheduling stage.
Input 4 belongs to group 3 (= group k) and does not duplicate.

Page 22: The Parallel Packet Switch

Understanding the Scheduling Stage in DPA

– A set of x nodes can pack at most x(x-1) + 1 request tuples.
– A set of x request tuples spans at least ceil(sqrt(x)) layers.
– Therefore the maximum number of requests which need to be granted to a single layer in a given scheduling stage is bounded by ceil(sqrt(k)).

[Figure: a five-node graph illustrating request tuples packed among layers.]

So a speedup of around sqrt(k) suffices?
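Instantiating the two counting facts with my own numbers, for k = 16 layers:

$$\text{max tuples packed by } k \text{ nodes} = k(k-1) + 1 = 241, \qquad \text{max grants to one layer} \;\le\; \left\lceil \sqrt{16} \right\rceil = 4 ,$$

which is why the sufficient speedup on the next slide is ceil(sqrt(k)) plus a small additive constant.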

Page 23: The Parallel Packet Switch

DPA Results

Fact 1 (Work Conservation - necessary condition for a PPS):
– For the PPS to be work-conserving, we require that no more than S cells be scheduled to depart from the same layer in a given window of k time slots.

Fact 2 (Work Conservation - sufficiency for DPA):
– If in any scheduling stage we present only layers which have fewer than S - ceil(sqrt(k)) cells belonging to the present window of k slots in the AOL, then DPA always remains work-conserving.

Fact 3: We must ensure that there always exist two layers l and l' such that:
– l ∈ AIL & AOL
– l' is the duplicate of l
– l' is also ∈ AIL & AOL

A speedup of S suffices, where:
– S > ceil(sqrt(k)) + 3, for k > 16
– S > ceil(sqrt(k)) + 4, for k > 2
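Illustrative arithmetic (mine, not the talk's) showing the distributed bound still meets the original goal of slow memories: take k = 25, so S > ceil(sqrt(25)) + 3 = 8 suffices; with S = 9 each internal link runs at

$$S \cdot \frac{R}{k} \;=\; 9 \cdot \frac{R}{25} \;=\; 0.36\,R \;<\; R ,$$

well below the external line rate.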

Page 24: The Parallel Packet Switch

Conclusions & Future Work

– CPA is not practical.
– DPA has to be made simpler.
– Extend the results to take care of non-FIFO QoS policies in a PPS.
– Study multicasting in a PPS.

Page 25: The Parallel Packet Switch

Questions, please!