A Scalable Switch for Service Guarantees
Bill Lin (University of California, San Diego)
Isaac Keslassy (Technion, Israel)
IEEE Hot Interconnects XIII, August 17-19, 2005
Motivation
Scalability: Traffic demands are growing, driven in part by increasing broadband adoption
• 10x increase in broadband subscriptions in just the last 3 years, already over 100 million subscribers
• 1.25-2.4 Gbps fiber to homes emerging (GPON, GEPON, EPON, BPON …)
Service Guarantees: Operators need bandwidth partitioning capabilities
• Provide guaranteed rates in service-level agreements
• Enable logical partitioning of converged networks
• Traffic engineering in general
Router Wish List
Scalable in line rates and number of linecards
• e.g. R = 160 Gbps (new packet every 2 ns), thousands of linecards, petabit capacity
• No centralized scheduler
• No per-packet dynamic switch reconfigurations
• Low complexity linecards
Provide performance guarantees
• 100% throughput guarantee
• Service guarantees
• No packet reordering
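The 2 ns figure above follows from simple arithmetic if one assumes minimum-size 40-byte packets; the packet size is an assumption, since the slide does not state it. A minimal Python sketch, with hypothetical helper names and a hypothetical linecard count of 6250 chosen only to reach one petabit:

```python
def packet_time_ns(line_rate_gbps: float, packet_bytes: int = 40) -> float:
    """Time to receive one packet at the given line rate, in nanoseconds.

    packet_bytes=40 is an assumed minimum packet size, not a value from the slides.
    """
    bits = packet_bytes * 8
    # 1 Gbps is exactly 1 bit per nanosecond, so bits / Gbps gives nanoseconds.
    return bits / line_rate_gbps


def aggregate_capacity_tbps(line_rate_gbps: float, num_linecards: int) -> float:
    """Total switch capacity N * R, in Tb/s."""
    return line_rate_gbps * num_linecards / 1000.0


print(packet_time_ns(160))                  # 2.0
print(aggregate_capacity_tbps(160, 6250))   # 1000.0 Tb/s, i.e. 1 Pb/s
```

At this rate any per-packet decision has to complete within about 2 ns, which is why the wish list rules out a centralized scheduler.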
Existing Architectures
Output-Queueing (OQ) Switch
• Well-known rate guarantees possible with Weighted Fair Queueing or Deficit Round-Robin scheduling
• But OQ switches require speedup of N
Crossbar Switches, using Input-Queueing (IQ) or Combined Input-Output Queueing (CIOQ)
• OQ emulation possible
• But expensive centralized scheduling and per-packet dynamic switch reconfigurations
Birkhoff-von Neumann decomposition
• If traffic matrix known, can provide rate guarantees with distributed scheduling, but still requires per-packet dynamic switch reconfigurations
Existing Architectures (cont’d)
Load-Balanced Switches
• Chang et al., “Load balanced Birkhoff-von Neumann switches, Part I: one-stage buffering”, Computer Communications, 2002
• Keslassy et al., “Scaling Internet routers using optics”, ACM SIGCOMM 2003
• A key idea: fixed configuration uniform meshes in optics, no dynamic switch reconfigurations
• Showed 100 Tb/s load-balanced router with R = 160 Gbps and N = 640 linecards
• Showed 100% throughput for “best effort” traffic, but no service guarantees
This Talk
Presents the Interleaved Matching Switch (IMS)
• Like a load-balanced switch, use fixed configuration uniform meshes, implemented with an optical fabric
• No arbitrary per-packet switch reconfiguration
Can emulate any IQ or CIOQ switch
Can emulate a Birkhoff-von Neumann switch
• If traffic matrix known, can ensure 100% throughput, service guarantees, and packet ordering
• Show we can use O(1) distributed online scheduling
Generic Load-Balanced Switch Using Fixed Configuration Uniform Meshes
[Figure: three stages of linecards, with inputs (In) and outputs (Out) at rate R and two meshes of fixed channels at rate R/N between the stages; successive frames show packets labeled 1, 2, 3 passing through the switch.]
Many Fabric Options (any spreading device)
• Space: Full uniform mesh
• Wavelength: Static WDM
• Time: Round-robin switches
Just need fixed uniform rate channels at R/N
No dynamic switch reconfigurations
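For the time-based option, one way to picture a fixed uniform mesh is a round-robin switch that cycles through N preset configurations. The sketch below assumes the connection pattern "input i is connected to linecard (i + t) mod N during slot t"; that pattern and the function names are illustrative assumptions, not taken from the slides.

```python
def round_robin_connection(i: int, t: int, n: int) -> int:
    """Linecard that input i is connected to during time slot t (assumed pattern)."""
    return (i + t) % n


def slots_per_pair(n: int, num_slots: int) -> list[list[int]]:
    """Count how many slots each (input, linecard) pair is connected over num_slots."""
    counts = [[0] * n for _ in range(n)]
    for t in range(num_slots):
        for i in range(n):
            counts[i][round_robin_connection(i, t, n)] += 1
    return counts


N = 4
counts = slots_per_pair(N, num_slots=N * 10)
# Every pair is connected in exactly 1 of every N slots, i.e. at rate R/N.
print(all(c == 10 for row in counts for c in row))  # True
```

The sequence of configurations is fixed in advance and does not depend on the traffic, so nothing is reconfigured per packet.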
From Load-Balanced Switch
[Figure: the load-balanced switch, three stages of linecards with links at rate R and mesh channels at rate R/N.]
To Interleaved Matching Switch
[Figure: the same three-stage topology, with the changes listed below.]
• Move main packet buffers to INPUT
• Add coordination slots in MIDDLE
• Retain Fixed Configuration Meshes
How It Works
IMS works by emulating an IQ or CIOQ crossbar switch, but without per-packet dynamic switch reconfigurations (will show how centralized scheduling can be avoided later)
[Figure, animated over several slides: an Interleaved Matching Switch (three stages of linecards, links at rate R, mesh channels at rate R/N) side by side with a Crossbar Switch (XBAR) holding the same queued packets, labeled A1, A2, B1, B2, C1, C2. The frames show packets A1, B1, C1 being transferred in both switches.]
Differences with crossbar switch
No dynamic switch reconfigurations
Departure times delayed by 2N time slots, N time slots per mesh, otherwise same sequence
Packet transfers initiated at each time slot to next MIDDLE linecard in round-robin order
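The timing relationship above can be sketched as a small model. It assumes that the crossbar matching chosen at slot t is handed to MIDDLE linecard t mod N, and that each transfer occupies one R/N channel for N slots on each mesh; the function names and the sample matchings are hypothetical.

```python
from typing import Dict, List, Tuple

Matching = Dict[int, int]  # input port -> output port, at most one packet each


def ims_schedule(matchings: List[Matching], n: int) -> List[Tuple[int, int, int, int, int]]:
    """Map crossbar matchings onto an IMS timeline.

    Returns tuples (crossbar_slot, input, middle, output, ims_departure_slot).
    """
    events = []
    for t, matching in enumerate(matchings):
        middle = t % n  # matchings are interleaved across MIDDLE linecards
        for inp, out in matching.items():
            # N slots on the first mesh plus N slots on the second mesh.
            events.append((t, inp, middle, out, t + 2 * n))
    return events


def channels_never_overlap(events, n: int) -> bool:
    """Check that each R/N channel starts at most one N-slot transfer per N slots."""
    first_mesh, second_mesh = {}, {}
    for t, inp, middle, out, _ in events:
        for table, key, start in ((first_mesh, (inp, middle), t),
                                  (second_mesh, (middle, out), t + n)):
            last = table.get(key)
            if last is not None and start - last < n:
                return False
            table[key] = start
    return True


N = 3
# Hypothetical crossbar matchings for six consecutive slots (ports 0..2).
matchings = [{0: 0, 1: 1, 2: 2}, {0: 1, 1: 2, 2: 0}, {0: 2, 1: 0, 2: 1},
             {0: 0, 1: 2, 2: 1}, {2: 0}, {0: 1, 1: 0}]
events = ims_schedule(matchings, N)
print(channels_never_overlap(events, N))                # True
print(all(dep - t == 2 * N for t, *_, dep in events))   # True
```

Because a given MIDDLE linecard is revisited only every N slots, a channel at rate R/N has exactly enough time to finish one transfer before the next begins.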
[Figure, continued: later frames of the same side-by-side animation, in which further packets (labeled C1, C2, then B1, B2) are transferred in both switches.]
Crossbar MATCHINGS are INTERLEAVED across MIDDLE linecards (analogous to memory interleaving)
IQ and CIOQ Switch Emulation
An IMS can emulate any IQ or CIOQ switch.
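When Traffic Matrix is Known
When traffic matrix is known, can perform Birkhoff-von Neumann decomposition offline
Given any admissible traffic matrix $\Lambda = [\lambda_{ij}]$, with $\sum_i \lambda_{ij} \le 1$ and $\sum_j \lambda_{ij} \le 1$
Can decompose into a series of permutation matrices $(P_1, \ldots, P_K)$ such that $\Lambda \le \sum_{k=1}^{K} \phi_k P_k$
where $\phi_k \ge 0$ and $\sum_{k=1}^{K} \phi_k = 1$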
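As an illustration of the offline step, the sketch below decomposes a matrix into weighted permutations by repeatedly finding a perfect matching on its positive entries and subtracting the smallest matched entry. It assumes the input is already doubly stochastic (an admissible matrix with slack would first need to be padded up to one), uses exact fractions to avoid rounding, and runs on a hypothetical 3x3 matrix; it is a textbook-style procedure, not necessarily the authors' implementation.

```python
from fractions import Fraction
from typing import List, Optional, Tuple


def perfect_matching(m: List[List[Fraction]]) -> Optional[List[int]]:
    """Find a perfect matching on the positive entries of m (augmenting paths).

    Returns perm with perm[i] = column matched to row i, or None if none exists.
    """
    n = len(m)
    col_owner = [-1] * n  # col_owner[j] = row currently matched to column j

    def try_row(i: int, seen: List[bool]) -> bool:
        for j in range(n):
            if m[i][j] > 0 and not seen[j]:
                seen[j] = True
                if col_owner[j] == -1 or try_row(col_owner[j], seen):
                    col_owner[j] = i
                    return True
        return False

    for i in range(n):
        if not try_row(i, [False] * n):
            return None
    perm = [0] * n
    for j, i in enumerate(col_owner):
        perm[i] = j
    return perm


def bvn_decompose(matrix: List[List[Fraction]]) -> List[Tuple[Fraction, List[int]]]:
    """Decompose a doubly stochastic matrix into (weight, permutation) terms."""
    m = [row[:] for row in matrix]
    terms = []
    while any(x > 0 for row in m for x in row):
        perm = perfect_matching(m)
        if perm is None:
            raise ValueError("matrix is not doubly stochastic")
        weight = min(m[i][perm[i]] for i in range(len(m)))
        for i in range(len(m)):
            m[i][perm[i]] -= weight  # zeroes at least one entry per iteration
        terms.append((weight, perm))
    return terms


F = Fraction
# Hypothetical doubly stochastic traffic matrix (every row and column sums to 1).
lam = [[F(1, 2), F(1, 2), F(0)],
       [F(1, 2), F(0), F(1, 2)],
       [F(0), F(1, 2), F(1, 2)]]
terms = bvn_decompose(lam)
print(sum(w for w, _ in terms))  # the weights sum to 1
```

Each term is a weight together with a permutation, so the weights can be used directly as the shares that a fair scheduler assigns to each permutation.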
Example
Consider the following example:
[Figure: example traffic matrix and its decomposition, not recoverable from the slide text.]
Use weighted fair queueing to schedule each permutation matrix proportionally to its corresponding weight
Distributed Storage and Scheduling
Distributed storage: each input linecard only stores its corresponding “rows”
Distributed scheduling: each input linecard only responsible for scheduling its own VOQs
O(1) time/hardware complexity: use deficit round-robin scheduling (many efficient variants)
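A minimal sketch of deficit round-robin as one input linecard might apply it to its own VOQs is shown below. The queue names, quanta, and packet sizes are hypothetical, and for clarity the loop scans every queue each round; the O(1) variants mentioned above instead keep a list of backlogged queues.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


class DeficitRoundRobin:
    """Deficit round-robin over a linecard's own VOQs (illustrative sketch)."""

    def __init__(self, quanta: Dict[str, int]) -> None:
        self.quanta = quanta                      # per-queue quantum, in bytes
        self.deficit = {q: 0 for q in quanta}     # unused credit per queue
        self.queues: Dict[str, Deque[int]] = {q: deque() for q in quanta}

    def enqueue(self, queue: str, size: int) -> None:
        """Append a packet of the given size (bytes) to a queue."""
        self.queues[queue].append(size)

    def serve_round(self) -> List[Tuple[str, int]]:
        """Visit every backlogged queue once; return the (queue, size) packets sent."""
        sent = []
        for q, packets in self.queues.items():
            if not packets:
                continue
            self.deficit[q] += self.quanta[q]
            while packets and packets[0] <= self.deficit[q]:
                size = packets.popleft()
                self.deficit[q] -= size
                sent.append((q, size))
            if not packets:
                self.deficit[q] = 0  # an emptied queue does not keep credit
        return sent


# Hypothetical VOQs: voq0 is given twice the quantum (share) of voq1.
drr = DeficitRoundRobin({"voq0": 200, "voq1": 100})
for _ in range(3):
    drr.enqueue("voq0", 100)
    drr.enqueue("voq1", 100)
print(drr.serve_round())  # [('voq0', 100), ('voq0', 100), ('voq1', 100)]
```

Setting the quanta in proportion to the guaranteed rates gives each queue a matching share of the service without any coordination with other linecards.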
Birkhoff-von Neumann Emulation
If traffic matrix known, an IMS can guarantee 100% throughput and flow rates when combined with Birkhoff-von Neumann decomposition and online fair scheduling
Frame-Based Decomposition
If traffic matrix can be converted to an integer matrix by multiplying by an integer F, then it can be decomposed into F permutations
Known decomposition algorithms (if F is an integer multiple of N)
• Birkhoff-von Neumann: $O(N^{3.5})$
• Slepian-Duguid: $O(N^3)$
New efficient formulation using edge-coloring: $O(N^2 \log N)$
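The conversion to an integer matrix can be illustrated with a short sketch that picks the smallest frame size F as the least common multiple of the rate denominators. The function name and the sample rates are hypothetical, and the sketch covers only the scaling step, not the edge-coloring decomposition itself.

```python
from fractions import Fraction
from math import lcm
from typing import List, Tuple


def to_frame(matrix: List[List[Fraction]]) -> Tuple[int, List[List[int]]]:
    """Return the smallest integer F and the integer matrix F * matrix."""
    f = lcm(*(x.denominator for row in matrix for x in row))
    scaled = [[int(x * f) for x in row] for row in matrix]
    return f, scaled


Q = Fraction(1, 4)
# Hypothetical rates expressed in quarters of the line rate.
rates = [[2 * Q, Q, Q],
         [Q, 2 * Q, Q],
         [Q, Q, 2 * Q]]
f, slots = to_frame(rates)
print(f)      # 4
print(slots)  # [[2, 1, 1], [1, 2, 1], [1, 1, 2]]
```

Here every row and column of the integer matrix sums to F = 4, so a frame of 4 permutations can carry it, with entry (i, j) giving the number of slots per frame reserved from input i to output j.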
Conclusions
Scalability
• IMS leverages scalability of fixed optical meshes
• If traffic matrix known, distributed online scheduling can achieve O(1) time and hardware complexity
Emulation
• IMS can emulate any IQ or CIOQ switch under same speedup and matching
Guarantees
• If traffic matrix known, can ensure 100% throughput, service guarantees, and packet ordering via Birkhoff-von Neumann switch emulation
• For integer matrices, new edge coloring formulation
Thank You