lect4_switchcore

8/8/2019 lect4_switchcore

1/39

Switch core architecture

Switching fabric

Queuing variations

Switch scheduling algorithms


2/39

Switch core in a router

Arbiter

Optional input queue

Optional output queue

Switch fabric that provides parallel data paths


3/39

Switch fabric topologies

Crossbar

Simple space division switch

Each crosspoint can be turned on or off

[Figure: crossbar with Data In, Data Out, and configuration inputs]


4/39

Crossbar

Allows any permutation communication to be non-blocking

Permutation communication: each input port connects to a distinct output port.

Each node can send/receive at most once

Example:


5/39

Crossbar

Advantages:

Simple to implement

Simple control

Flexible

Drawback:

Number of crosspoints grows as N^2; not scalable

Good for small N
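A minimal sketch of the crossbar described above, assuming a software model in which each crosspoint is a boolean. The class name `Crossbar` and its methods are hypothetical and chosen only for illustration.

```python
class Crossbar:
    """Illustrative N x N crossbar: one on/off crosspoint per (input, output)."""

    def __init__(self, n):
        self.n = n
        self.crosspoint = [[False] * n for _ in range(n)]

    def num_crosspoints(self):
        # The scalability drawback: N inputs times N outputs.
        return self.n * self.n

    def configure(self, out_of):
        """out_of[i] is the output for input i, or None if input i is idle."""
        used = [o for o in out_of if o is not None]
        if len(used) != len(set(used)):
            raise ValueError("two inputs contend for the same output")
        self.crosspoint = [[out_of[i] == j for j in range(self.n)]
                           for i in range(self.n)]

    def switch(self, data_in):
        """Move one cell per input through the active crosspoints."""
        data_out = [None] * self.n
        for i in range(self.n):
            for j in range(self.n):
                if self.crosspoint[i][j]:
                    data_out[j] = data_in[i]
        return data_out


xb = Crossbar(3)
xb.configure([2, 0, 1])              # a permutation: 0->2, 1->0, 2->1
print(xb.switch(["a", "b", "c"]))    # ['b', 'c', 'a']
print(xb.num_crosspoints())          # 9
```

Any permutation passes `configure`, which is the non-blocking property; only a many-to-one request is rejected.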


6/39

Multi-stage switch

In a crossbar, in each switching phase, only one crosspoint in each row or column is active.

One connection goes through one crosspoint

The objective is to achieve nonblocking communication for permutations.

Multi-stage network basic idea: compress the crosspoints

Each connection goes through multiple crosspoints

Reduce the number of crosspoints while still having nonblocking communication


7/39

Clos networks

The Clos network (Clos 1953) is the father of all multi-stage networks

Basic form: 3-stage Clos networks

Three stages: input, middle, and output

k input switches, m middle switches, k output switches

Input stage: n x m switches

Middle stage: k x k switches

Output stage: m x n switches

Each input switch connects to each of the middle switches; each middle switch connects to each of the output switches


8/39

Clos networks

Total number of switches?

Total number of input/output ports?

[Figure: 3-stage Clos network, with k-1 labels on the input and output stages]

Input stage, n x m switches

Middle stage, k x k switches

Output stage, m x n switches
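The two questions above can be worked out directly from the stage sizes. The sketch below is one way to tabulate them; the function name `clos_summary` is hypothetical, and the crosspoint count assumes every stage switch is itself built as a crossbar.

```python
def clos_summary(n, m, k):
    """Sizes of a 3-stage Clos network with k input switches (n x m),
    m middle switches (k x k), and k output switches (m x n)."""
    return {
        "switches": 2 * k + m,                     # k input + m middle + k output
        "ports": n * k,                            # inputs (same number of outputs)
        "crosspoints": 2 * k * n * m + m * k * k,  # assumes crossbar stage switches
        "crossbar_crosspoints": (n * k) ** 2,      # single-crossbar equivalent
    }


print(clos_summary(2, 2, 3))
print(clos_summary(8, 8, 8))
```

For very small networks the Clos form can need more crosspoints than a single crossbar; the saving appears as the port count grows.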


9/39

An example Clos network

Building a 6x6 nonblocking switch from six 2x2 switches and two 3x3 switches


10/39

Another example Clos network

Is this network nonblocking? Why?

The total width of the middle stage is 6, while the total input/output width is 9.

When is a Clos network nonblocking, then?


11/39

About non-blocking networks

Strict-sense nonblocking: a route from a free input to a free output can always be found without changing existing routes.

Wide-sense nonblocking: a route from a free input to a free output can always be found without changing existing routes, provided the routes for new connections are chosen suitably.

Rearrangeably nonblocking: any permutation can be routed without contention (may require old connections to be rerouted).


12/39

Nonblocking conditions for Clos networks

Clos Theorem (1953): A Clos network is strict-sense nonblocking if and only if m >= 2n-1.

Proof?


13/39

Nonblocking conditions for Clos networks

Benes Theorem (1962): A Clos network is rearrangeably nonblocking iff m >= n.

Necessary condition is straightforward

Sufficient condition:

Build a bipartite graph

Nodes: input and output switches

Edges: connections (from input to output switches)

Maximum degree


14/39

Strict-sense nonblocking and rearrangeably nonblocking example

Clos(2, 2, 3) is rearrangeably nonblocking but not strict-sense nonblocking
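The two theorems reduce to a comparison between m and n. A small sketch, with the hypothetical helper name `clos_blocking_class`, that applies both conditions:

```python
def clos_blocking_class(n, m):
    """Classify a 3-stage Clos network by its input-switch width n
    and its number of middle switches m."""
    if m >= 2 * n - 1:
        return "strict-sense nonblocking"    # Clos theorem
    if m >= n:
        return "rearrangeably nonblocking"   # Benes theorem
    return "blocking"


print(clos_blocking_class(2, 2))  # rearrangeably nonblocking
print(clos_blocking_class(2, 3))  # strict-sense nonblocking
print(clos_blocking_class(3, 2))  # blocking
```

The last call corresponds to a network whose middle stage is narrower than its input stage (for example middle width 6 against input width 9 when k = 3).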


15/39

Recursive construction of large switches

Each switch in a 3-stage Clos network can be recursively constructed from smaller switches.

Construct an N x N nonblocking switch with 2x2 switches (Benes networks):

Input/output stage: N/2 2x2 switches

Middle stage: 2 N/2 x N/2 switches (recursively built)

How many 2x2 switches are needed? How many crosspoints are needed?

T(N) = ?
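One way to answer the counting question is to follow the recursion literally: the input and output stages contribute N/2 switches each, and the middle stage contributes two networks of size N/2, so T(N) = N + 2 T(N/2) with T(2) = 1. The sketch below assumes N is a power of two and that each 2x2 switch has 4 crosspoints; the function names are hypothetical.

```python
def benes_2x2_switches(n_ports):
    """Number of 2x2 switches in an N x N Benes network (N a power of two)."""
    if n_ports < 2 or n_ports & (n_ports - 1):
        raise ValueError("N must be a power of two, at least 2")
    if n_ports == 2:
        return 1
    half = n_ports // 2
    # half input switches + half output switches + two recursive middle networks
    return 2 * half + 2 * benes_2x2_switches(half)


def benes_crosspoints(n_ports):
    return 4 * benes_2x2_switches(n_ports)   # 4 crosspoints per 2x2 switch


print(benes_2x2_switches(16))   # 56
print(benes_crosspoints(16))    # 224, versus 256 for a 16 x 16 crossbar
```

The recursion solves to T(N) = N log2(N) - N/2, so the crosspoint count grows as N log N rather than N^2.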


16/39

16 x 16 Benes network


17/39

Multistage networks

Many variations of multi-stage networks: Clos, Benes, Banyan, Cantor, etc.

With different objectives

The idea is somewhat similar.

All of them try to provide crossbar functionality: all try to achieve nonblocking communication for permutations.

Both crossbar and multistage networks are used as switching fabrics.


18/39

What is there besides the switching fabric?

What problem does the switching fabric solve?

Any permutation can be done in one switching cycle.

1-to-1 demand

The traffic in a router is more complex than a permutation.

Many-to-1 demand?

Not all packets can get through the fabric right away.

Still need buffering and scheduling!!


19/39

Switch Model

N x N switch

Fixed-size packets (cells) are much easier for a switch to manage. Most practical switches use fixed-size cells.

All line rates are the same: line cards aggregate lines with different rates.

Switching cycle: time between cell arrivals (determined by the line rate)


20/39

Input/output queuing variations

Output queuing: buffering at the switch output.

Maximum throughput

Memory must be N times faster than line speed.

Memory speed is already a bottleneck!!

Not a choice for top-of-the-line routers


21/39

Input/output queueing variations

Input queueing: buffering at the switch input.


22/39


Head-of-line blocking with input queuing

Throughput can be significantly affected.


23/39


Impact of head-of-line blocking: maximum throughput = 2 - sqrt(2) = 58.6%

[Figure: delay versus load, with the load axis marked at 58.6% and 100%]
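The 58.6% figure is the large-N limit for saturated FIFO input queues under uniform random traffic. A rough way to see it is to simulate only the head-of-line cells. The sketch below makes those assumptions explicit: every input always has a cell waiting, destinations are uniform, and each contended output picks one contender at random. The function name `hol_saturation_throughput` is hypothetical.

```python
import random


def hol_saturation_throughput(n_ports, n_slots, seed=0):
    """Fraction of output slots used when every FIFO input is backlogged."""
    rng = random.Random(seed)
    # Destination of the head-of-line cell at each input.
    hol = [rng.randrange(n_ports) for _ in range(n_ports)]
    delivered = 0
    for _ in range(n_slots):
        contenders = {}
        for inp, out in enumerate(hol):
            contenders.setdefault(out, []).append(inp)
        for out, inputs in contenders.items():
            winner = rng.choice(inputs)           # one cell per output per slot
            delivered += 1
            hol[winner] = rng.randrange(n_ports)  # next cell moves to the head
            # Losers keep their head-of-line cell and block the cells behind it.
    return delivered / (n_ports * n_slots)


for n in (2, 8, 32):
    print(n, round(hol_saturation_throughput(n, 5000), 3))
```

Small switches do better than the limit (a 2x2 switch saturates at 75%), and the value falls toward 2 - sqrt(2) as N grows.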


24/39


Virtual output queueing (input queue): use a separate queue for each output port in each line card.

Removes head-of-line blocking

Arbiter becomes more complex
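A minimal sketch of the per-output queues on one line card, assuming cells are opaque objects and outputs are numbered from 0. The class name `VOQLineCard` and its methods are hypothetical.

```python
from collections import deque


class VOQLineCard:
    """One input line card holding a separate FIFO per output port."""

    def __init__(self, n_outputs):
        self.queues = [deque() for _ in range(n_outputs)]

    def enqueue(self, cell, output):
        self.queues[output].append(cell)

    def requests(self):
        """Outputs this input would request from the arbiter."""
        return {out for out, q in enumerate(self.queues) if q}

    def dequeue(self, output):
        return self.queues[output].popleft()


card = VOQLineCard(4)
card.enqueue("cell-A", 0)   # arrives first, destined for output 0
card.enqueue("cell-B", 1)   # arrives second, destined for output 1
# Even if output 0 is busy this cycle, the cell for output 1 can still go.
print(card.requests())      # {0, 1}
print(card.dequeue(1))      # cell-B
```

With a single FIFO, `cell-B` would have waited behind `cell-A`; the price is that the arbiter now sees up to N requests per input instead of one.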


25/39


Combined Input Output Queueing (CIOQ)

Queues in both input and output

Memory speedup 1


26/39

Input queue scheduling, the bipartite matching problem

The scheduling algorithm should try to maximize the number of connections in order to maximize the throughput


27/39

Maximum and maximal matching

Maximum matching: find the largest number of connections

How to do it?

O(N^3) complexity, starvation

Maximal matching: no connection can be added to the matching without conflicting with an existing one

More practical

[Figure: the problem, a maximum matching, and a maximal matching]
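The difference between the two notions shows up on very small inputs. The sketch below contrasts a greedy maximal matching with a maximum matching found by augmenting paths; both functions are illustrative helpers, not the scheduler of any particular switch, and `requests[i]` is assumed to be the set of outputs for which input i has queued cells.

```python
def greedy_maximal_matching(requests):
    """Scan inputs in order; give each the first free output it requests."""
    match, taken = {}, set()
    for inp in sorted(requests):
        for out in sorted(requests[inp]):
            if out not in taken:
                match[inp] = out
                taken.add(out)
                break
    return match


def maximum_matching(requests):
    """Augmenting-path bipartite matching (largest possible matching)."""
    owner = {}  # output -> input currently holding it

    def try_assign(inp, seen):
        for out in sorted(requests[inp]):
            if out in seen:
                continue
            seen.add(out)
            # Take a free output, or evict its owner if the owner can move.
            if out not in owner or try_assign(owner[out], seen):
                owner[out] = inp
                return True
        return False

    for inp in sorted(requests):
        try_assign(inp, set())
    return {inp: out for out, inp in owner.items()}


requests = {0: {0, 1}, 1: {0}}
print(greedy_maximal_matching(requests))  # {0: 0}
print(maximum_matching(requests))         # {1: 0, 0: 1}
```

The greedy result is maximal (nothing can be added) yet smaller than the maximum, which requires moving input 0 to output 1.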


28/39

Practical matching algorithms

PIM: parallel iterative matching

RRM: round-robin matching

iSLIP: iterative serial-line IP


29/39

PIM

Repeat until no new matching is found

1. Request: each unmatched input sends a request to every output for which it has a queued cell.

2. Grant: if an unmatched output receives any requests, it randomly grants one.

3. Accept: if an input receives grants, it randomly accepts one.
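The three steps above can be sketched as a loop over request, grant, and accept phases. This is an illustrative software model, not a hardware design: `requests[i]` is assumed to be the set of outputs for which input i has queued cells, and the function name `pim` is hypothetical.

```python
import random


def pim(requests, rng):
    """Parallel iterative matching; returns (match, iterations used)."""
    match = {}            # input -> output
    matched_out = set()
    iterations = 0
    while True:
        # 1. Request: unmatched inputs ask every unmatched output they need.
        reqs = {}
        for inp in sorted(requests):
            if inp in match:
                continue
            for out in sorted(requests[inp]):
                if out not in matched_out:
                    reqs.setdefault(out, []).append(inp)
        if not reqs:
            break         # nothing left to add: the matching is maximal
        # 2. Grant: each requested output picks one requester at random.
        grants = {}
        for out, inputs in reqs.items():
            grants.setdefault(rng.choice(inputs), []).append(out)
        # 3. Accept: each granted input picks one grant at random.
        for inp, outs in grants.items():
            out = rng.choice(outs)
            match[inp] = out
            matched_out.add(out)
        iterations += 1
    return match, iterations


rng = random.Random(0)
match, iterations = pim({0: {0, 1}, 1: {0}, 2: {2, 3}, 3: {2}}, rng)
print(match, iterations)
```

Each iteration with outstanding requests adds at least one connection, so the loop terminates with a maximal matching.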


30/39

PIM example

Request, Grant, Accept


31/39

PIM example

The next iteration


32/39

PIM property

Converges in O(log N) iterations on average (what is the worst case?)

Does not perform well with a single iteration: about 63% (1 - 1/e) throughput

Computed from the probability that an input remains ungranted.

Hardware random number generator?

We would like an algorithm that performs well in one iteration! This function is in the critical data path.
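The 63% figure can be reproduced with a short calculation, assuming the worst case for a single iteration: every one of the N inputs requests every output, and each output grants one input uniformly at random.

```latex
\begin{align*}
\Pr[\text{a given output does not grant input } i] &= 1 - \frac{1}{N} \\
\Pr[\text{input } i \text{ receives no grant}] &= \left(1 - \frac{1}{N}\right)^{N}
  \;\xrightarrow{\,N \to \infty\,}\; \frac{1}{e} \\
\text{fraction of inputs matched after one iteration} &= 1 - \left(1 - \frac{1}{N}\right)^{N}
  \;\longrightarrow\; 1 - \frac{1}{e} \approx 0.632
\end{align*}
```

Every input that receives at least one grant accepts exactly one, so the fraction of granted inputs equals the single-iteration throughput under this assumption.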


33/39

RRM Round robin matching

Request: the same as in PIM

Grant: if an output receives requests, it chooses the one that appears next in a fixed round-robin schedule, starting from the highest-priority element. Increment the round-robin pointer.

Accept: if an input receives grants, it accepts the one that appears next in a fixed round-robin schedule, starting from the highest-priority element. Increment the round-robin pointer.
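A single-iteration sketch of the grant and accept arbiters described above. It assumes each pointer is advanced to one position beyond the element just chosen, and that `requests[i]` is the set of outputs input i wants; the names `rr_pick` and `rrm_step` are hypothetical.

```python
def rr_pick(candidates, pointer, n):
    """First candidate at or after `pointer` in round-robin order."""
    for step in range(n):
        idx = (pointer + step) % n
        if idx in candidates:
            return idx
    raise ValueError("no candidates")


def rrm_step(requests, grant_ptr, accept_ptr):
    """One RRM scheduling cycle; updates both pointer lists in place."""
    n = len(grant_ptr)
    grants = {}
    for out in range(n):
        requesters = {inp for inp in range(n) if out in requests[inp]}
        if requesters:
            inp = rr_pick(requesters, grant_ptr[out], n)
            grants.setdefault(inp, set()).add(out)
            grant_ptr[out] = (inp + 1) % n     # moves even if not accepted
    match = {}
    for inp, outs in grants.items():
        out = rr_pick(outs, accept_ptr[inp], n)
        match[inp] = out
        accept_ptr[inp] = (out + 1) % n
    return match


# 2x2 switch, every input has cells for every output.
requests = [{0, 1}, {0, 1}]
grant_ptr, accept_ptr = [0, 0], [0, 0]
sizes = [len(rrm_step(requests, grant_ptr, accept_ptr)) for _ in range(6)]
print(sizes)   # [1, 1, 1, 1, 1, 1]
```

In this fully loaded example both output arbiters keep pointing at the same input, so only one connection is made per cycle.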


34/39

RRM example


35/39

RRM

RRM has lower complexity than PIM

RR arbiters are simpler than random arbiters

Deterministic method

Can perform poorly for certain traffic patterns

The output arbiters are somewhat synchronized

Can have starvation


36/39

iSLIP

A variation of RRM: the grant arbiters do not move unless the grant is accepted.

The algorithm is the same as RRM except that, in the grant step, the RR pointer is incremented to one location beyond the granted input if and only if the grant is accepted in step 3.
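A single-iteration sketch of this rule, assuming the same model as for RRM: `requests[i]` is the set of outputs input i wants, and pointers are lists indexed by port. The only difference from RRM is where the grant pointer is updated. The names `rr_pick` and `islip_step` are hypothetical.

```python
def rr_pick(candidates, pointer, n):
    """First candidate at or after `pointer` in round-robin order."""
    for step in range(n):
        idx = (pointer + step) % n
        if idx in candidates:
            return idx
    raise ValueError("no candidates")


def islip_step(requests, grant_ptr, accept_ptr):
    """One single-iteration iSLIP cycle; updates pointers in place."""
    n = len(grant_ptr)
    grants = {}
    for out in range(n):
        requesters = {inp for inp in range(n) if out in requests[inp]}
        if requesters:
            inp = rr_pick(requesters, grant_ptr[out], n)
            grants.setdefault(inp, set()).add(out)
            # Grant pointer is NOT moved here.
    match = {}
    for inp, outs in grants.items():
        out = rr_pick(outs, accept_ptr[inp], n)
        match[inp] = out
        accept_ptr[inp] = (out + 1) % n
        grant_ptr[out] = (inp + 1) % n     # moved only for the accepted grant
    return match


# 2x2 switch, every input has cells for every output.
requests = [{0, 1}, {0, 1}]
grant_ptr, accept_ptr = [0, 0], [0, 0]
sizes = [len(islip_step(requests, grant_ptr, accept_ptr)) for _ in range(6)]
print(sizes)   # [1, 2, 2, 2, 2, 2]
```

After the first cycle the two grant pointers differ, so the outputs grant different inputs and both connections are made in every later cycle.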


37/39

iSLIP example


38/39

iSLIP properties

Property 1: Lowest priority is given to the most recently made connection.

Property 2: No starvation; a request is served within at most N^2 scheduling cycles.

Property 3: Under heavy load, all queues with a common output have the same throughput.


39/39

iSLIP properties

Simple to implement

Starvation free

Throughput is about 100%

Fair

As load increases, the matches get larger
