Processing packets in packet switches, CS343, May 7th 2003
1
High Performance Switching and Routing
Telecom Center Workshop: Sept 4, 1997.
Processing packets in packet switches
CS343, May 7th 2003
Nick McKeown, Professor of Electrical Engineering and Computer Science, Stanford University
[email protected] www.stanford.edu/~nickm
2
Contents
1. What processing is done where?
2. What does a packet switch look like?
– Examples of packet switches
– What does a packet switch do?
– Typical packet switch architecture
– Evolution of high performance packet switch architecture
3. Trends and consequences
4. Technology options for processing packets
– General purpose CPU
– Network processors
– FPGA
– ASIC
5. My 2c
3
The Network Layer View of the Internet
Routers
End hosts
4
Hierarchical arrangement: A crude approximation
Core Routers
End hosts
Edge Routers
Core routers: Maximum capacity, minimum function. Typically: 16 ports of 10Gb/s. Capacity 160Gb/s, 200Mpps. Price $1M.
Edge routers: Medium capacity, maximum flexibility and function. Typically: 16 ports of 2.5Gb/s. Capacity 20-30 Gb/s, 10-20Mpps. Price $200k.
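The "typical" figures above can be sanity-checked with simple arithmetic: aggregate capacity is ports times line rate, and the packet rate then depends on the packet size you assume. The sketch below is illustrative only; the function names and the 100-byte and 40-byte packet sizes are assumptions, not figures from the slides.

```python
def aggregate_capacity_bps(ports: int, line_rate_bps: float) -> float:
    """Aggregate capacity = number of ports x line rate per port."""
    return ports * line_rate_bps


def packets_per_second(capacity_bps: float, packet_bytes: int) -> float:
    """Packet rate for back-to-back packets of the given size (assumed size)."""
    return capacity_bps / (packet_bytes * 8)


# Core router from the slide: 16 ports of 10Gb/s.
core_bps = aggregate_capacity_bps(16, 10e9)
print(core_bps / 1e9, "Gb/s")

# Assumed packet sizes: 100 bytes (average-like) and 40 bytes (minimum-like).
print(packets_per_second(core_bps, 100) / 1e6, "Mpps at 100-byte packets")
print(packets_per_second(core_bps, 40) / 1e6, "Mpps at 40-byte packets")
```

Under the assumed 100-byte packet size, 160Gb/s corresponds to the 200Mpps quoted for core routers; smaller packets raise the packet rate for the same capacity.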
5
Hierarchical arrangement
[Figure labels: End hosts (1000s per mux); Access multiplexer; Edge Routers; Core Routers; Point of Presence (POP); links: 10Gb/s “OC192”]
POP: Point of Presence. Richly interconnected by mesh of long-haul links. Typically: 40 POPs per national network operator; 10-40 core routers per POP.
6
Autonomous Systems
[Figure: groups of POPs belonging to Sprint, Worldcom, AT&T and Global Crossing, with “peering points”]
7
How we connect: Corporate/campus environment
Ethernet switch. Typically: 100 ports of 100Mb/s Ethernet
Building-wide router e.g. gates-rtr.stanford.edu. Typically: 16 ports of 1Gb/s Ethernet
Campus or company-wide router e.g. border-rtr.stanford.edu. Typically: mixture of 2.5Gb/s “OC48” and Gb/s Ethernet
[Figure labels: i/f; POPs; links: 10Gb/s “OC192”]
8
How we connect: Home modem/DSL environment
i/f, DSL Router/NAT. Typically: 10/100Mb/s
Telephone switch with DSL line interface at your local Central Office
Point of Presence (POP)
[Figure labels: POPs; links: 10Gb/s “OC192”]
10
What a High Performance Router Looks Like
Cisco GSR 12416: 6ft x 19” x 2ft. Capacity: 160Gb/s. Power: 4.2kW
Juniper M160: 3ft x 19” x 2.5ft. Capacity: 80Gb/s. Power: 2.6kW
11
Other packet switches
Cisco 7500 “edge” routers
Lucent GX550 Core ATM switch
D-Link DSL router
Wiring closet in Packard building
13
The IP Datagram
Header fields, in order:
vers | HLen | TOS | Total Length (<=64 KBytes)
ID | Flags | FRAG Offset (offset within original packet)
TTL (hop count) | Protocol | checksum
SRC IP Address
DST IP Address
(OPTIONS) (PAD)
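As a hedged illustration of the header layout above, the following sketch unpacks the fixed 20-byte part of an IPv4 header with Python's `struct` module. The function name `parse_ipv4_header`, the dictionary keys and the sample bytes are illustrative choices, not part of the lecture; options, if present, are ignored.

```python
import ipaddress
import struct


def parse_ipv4_header(data: bytes) -> dict:
    """Unpack the fixed 20-byte IPv4 header (options are not parsed)."""
    if len(data) < 20:
        raise ValueError("need at least 20 bytes")
    (ver_hlen, tos, total_length, ident, flags_frag,
     ttl, protocol, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", data[:20])
    return {
        "vers": ver_hlen >> 4,
        "hlen_bytes": (ver_hlen & 0x0F) * 4,             # HLen counts 32-bit words
        "tos": tos,
        "total_length": total_length,                     # 16 bits, so <= 64 KBytes
        "id": ident,
        "flags": flags_frag >> 13,                        # top 3 bits
        "frag_offset_bytes": (flags_frag & 0x1FFF) * 8,   # offset in 8-byte units
        "ttl": ttl,                                       # hop count
        "protocol": protocol,
        "checksum": checksum,
        "src": str(ipaddress.IPv4Address(src)),
        "dst": str(ipaddress.IPv4Address(dst)),
    }


# Sample header bytes chosen for illustration (a UDP datagram header).
SAMPLE = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
fields = parse_ipv4_header(SAMPLE)
print(fields)
```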
14
Forwarding in an IP Router
1. Lookup packet DA in forwarding table.
– If known, forward to correct port.
– If unknown, drop packet.
2. Decrement TTL, update header checksum.
3. Forward packet to outgoing interface.
4. Transmit packet onto link.
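The forwarding steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the lecture's implementation: the lookup is a linear-scan longest-prefix match (the slide only says "lookup"), the header is assumed to have no options, and dropping a packet whose TTL would reach zero is an added assumption. The names `lookup`, `forward` and `ipv4_checksum` are hypothetical.

```python
import ipaddress
import struct
from typing import Dict, Optional, Tuple


def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum of the 16-bit header words, then complemented."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def lookup(table: Dict[ipaddress.IPv4Network, int],
           dst: ipaddress.IPv4Address) -> Optional[int]:
    """Step 1: longest-prefix match by linear scan (assumed lookup policy)."""
    best_len, best_port = -1, None
    for prefix, port in table.items():
        if dst in prefix and prefix.prefixlen > best_len:
            best_len, best_port = prefix.prefixlen, port
    return best_port


def forward(packet: bytes,
            table: Dict[ipaddress.IPv4Network, int]) -> Optional[Tuple[int, bytes]]:
    """Return (output port, rewritten packet), or None if the packet is dropped."""
    header = bytearray(packet[:20])            # assumes a 20-byte header, no options
    dst = ipaddress.IPv4Address(bytes(header[16:20]))
    port = lookup(table, dst)
    if port is None:
        return None                            # unknown destination: drop
    if header[8] <= 1:
        return None                            # assumption: drop when TTL expires
    header[8] -= 1                             # step 2: decrement TTL ...
    header[10:12] = b"\x00\x00"
    struct.pack_into("!H", header, 10, ipv4_checksum(bytes(header)))  # ... update checksum
    return port, bytes(header) + packet[20:]   # steps 3-4 happen at the chosen port


# Illustrative table and packet (header only; values are made up for the example).
TABLE = {
    ipaddress.ip_network("192.168.0.0/16"): 2,
    ipaddress.ip_network("192.168.0.128/25"): 3,
}
PACKET = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
print(forward(PACKET, TABLE))
```

A real router would replace the linear scan with a trie or TCAM lookup; the scan is used here only to keep the sketch short.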
15
Ethernet Frame Format
Field: Preamble | SFD | DA | SA | Type | Data | Pad | CRC
Bytes: 7 | 1 | 6 | 6 | 2 | 0-1500 | 0-46 | 4
1. Preamble: trains clock-recovery circuits
2. Start of Frame Delimiter: indicates start of frame
3. Destination Address: 48-bit globally unique address assigned by manufacturer.
– 1b: unicast/multicast
– 1b: local/global address
4. Type: Indicates protocol of encapsulated data (e.g. IP = 0x0800)
5. Pad: Zeroes used to ensure minimum frame length
6. Cyclic Redundancy Check: check sequence to detect bit errors.
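To make the field sizes above concrete, here is a small sketch that assembles a frame (without preamble and SFD) around a payload such as an IP packet, padding to the 46-byte minimum and appending a CRC. It assumes `zlib.crc32` as the CRC-32 computation and a little-endian byte order for the appended check sequence; the helper names and example addresses are hypothetical.

```python
import struct
import zlib

ETHERTYPE_IP = 0x0800
MIN_PAYLOAD = 46      # Data + Pad must total at least 46 bytes
MAX_PAYLOAD = 1500


def build_frame(dst: bytes, src: bytes, ethertype: int, payload: bytes) -> bytes:
    """Return DA | SA | Type | Data | Pad | CRC (preamble and SFD omitted)."""
    if len(dst) != 6 or len(src) != 6:
        raise ValueError("addresses must be 6 bytes")
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload too long")
    pad = b"\x00" * max(0, MIN_PAYLOAD - len(payload))   # zero padding
    body = dst + src + struct.pack("!H", ethertype) + payload + pad
    crc = struct.pack("<I", zlib.crc32(body))            # assumed byte order
    return body + crc


def is_multicast(addr: bytes) -> bool:
    """Unicast/multicast bit: least significant bit of the first address byte."""
    return bool(addr[0] & 0x01)


def is_locally_administered(addr: bytes) -> bool:
    """Local/global bit: second least significant bit of the first address byte."""
    return bool(addr[0] & 0x02)


# Made-up addresses and a 20-byte stand-in for an IP packet.
DA = bytes.fromhex("00005e005301")
SA = bytes.fromhex("00005e005302")
frame = build_frame(DA, SA, ETHERTYPE_IP, b"\x45" + b"\x00" * 19)
print(len(frame), "bytes")
```

With a 20-byte payload the pad is 26 bytes, giving the 64-byte minimum frame (6 + 6 + 2 + 46 + 4).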
16
Encapsulation
Ethernet frame: Preamble | SFD | DA | SA | Type = IP | Data | Pad | CRC
Data field carries: IP Header | IP Data
18
Generic Router Architecture
[Figure: a packet (Data, Hdr) passes through Header Processing (Lookup IP Address, Update Header) and then Queue Packet]
Address Table (IP Address, Next Hop): ~1M prefixes, off-chip DRAM
Buffer Memory: ~1M packets, off-chip DRAM
19
Generic Router Architecture
[Figure: three copies of the same path, each with Header Processing (Lookup IP Address, Update Header), an Address Table, a Buffer Manager and Buffer Memory]
21
First Generation Routers
[Figure: CPU with Route Table and Buffer Memory; three Line Interfaces, each with a MAC. Photo labels: Shared Backplane, Line Interface, CPU, Memory]
Typically <0.5Gb/s aggregate capacity
22
Second Generation Routers
[Figure: CPU with Route Table and Buffer Memory; Line Cards, each with MAC, Buffer Memory and Fwding Cache]
Typically <5Gb/s aggregate capacity
23
Third Generation Routers
[Figure: Line Cards, each with MAC, Local Buffer Memory and Fwding Table; CPU Card with CPU, Memory and Routing Table. Photo labels: Switched Backplane, Line Interface]
Typically <50Gb/s aggregate capacity
24
Fourth Generation Routers
[Figure: Switch Core and Linecards, joined by optical links, 100s of metres]
160Gb/s - 20Tb/s routers in development
26
Trends in Technology, Routers & Traffic
[Figure: Normalized growth since 1980 (log scale, 1 to 1,000,000), 1980 to 2001]
DRAM Random Access Time: 1.1x / 18 months
Moore’s Law: 2x / 18 months
Router Capacity: 2.2x / 18 months
Line Capacity: 2x / 7 months
User Traffic: 2x / 12 months
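The trend rates above use different periods (7, 12 and 18 months), which makes them hard to compare at a glance. A minimal sketch that converts each to an equivalent per-year factor, assuming smooth exponential growth, is shown below; the dictionary and function names are illustrative.

```python
# (growth factor, period in months) as quoted on the slide.
TRENDS = {
    "DRAM random access time": (1.1, 18),
    "Moore's Law": (2.0, 18),
    "Router capacity": (2.2, 18),
    "Line capacity": (2.0, 7),
    "User traffic": (2.0, 12),
}


def annual_factor(factor: float, period_months: float) -> float:
    """Equivalent growth factor per 12 months, assuming exponential growth."""
    return factor ** (12.0 / period_months)


for name, (factor, period) in TRENDS.items():
    print(f"{name}: {annual_factor(factor, period):.2f}x per year")
```

On a per-year basis the ordering is line capacity, then user traffic, then router capacity, then Moore's Law, with DRAM random access time improving slowest.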
27
Trends and Consequences
1. [Figure: CPU instructions per minimum length packet, 1996 to 2001 (log scale, 1 to 1000)]
2. [Figure: Disparity between traffic and router growth. Normalized growth of traffic and router capacity, 2003 to 2012 (scale 0 to 600), showing a 5-fold disparity]
Consequences:
1. Packet processing is getting harder, and eventually network processors will be used less for high performance routers.
2. (Much) bigger routers will be developed.
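Both charts on this slide follow from simple arithmetic. The sketch below assumes the growth rates quoted on the earlier trends slide (traffic 2x / 12 months, router capacity 2.2x / 18 months) and a 40-byte minimum packet; the CPU speed in the example is an arbitrary illustrative value, and all function names are hypothetical.

```python
def growth(factor: float, period_months: float, years: float) -> float:
    """Normalized growth after `years`, assuming exponential growth."""
    return factor ** (years * 12.0 / period_months)


def traffic_to_router_disparity(years: float) -> float:
    """Ratio of traffic growth to router-capacity growth over `years`."""
    traffic = growth(2.0, 12, years)     # 2x / 12 months
    router = growth(2.2, 18, years)      # 2.2x / 18 months
    return traffic / router


def instructions_per_packet(cpu_instr_per_s: float, line_rate_bps: float,
                            packet_bytes: int = 40) -> float:
    """Instructions a CPU can execute in one minimum-length packet time."""
    packet_time_s = packet_bytes * 8 / line_rate_bps
    return cpu_instr_per_s * packet_time_s


# 2003 to 2012 is 9 years.
print(f"disparity after 9 years: {traffic_to_router_disparity(9):.1f}x")

# Illustrative only: a 1e9 instructions/s CPU facing a 10Gb/s line.
print(f"instructions per packet: {instructions_per_packet(1e9, 10e9):.0f}")
```

Over nine years the ratio comes out at roughly 4.5, which is consistent with the "5-fold disparity" annotation on the chart.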
28
Trends and Consequences (2)
3. [Figure: Power consumption will exceed POP limits. Power (kW, scale 0 to 6), 1990 to 2002 (approx.)]
4. [Figure: Disparity between line-rate and memory access time. Normalized growth rate (log scale, 1 to 1,000,000)]
Consequences:
3. Multi-rack routers will spread power over multiple racks.
4. It will get harder to build packet buffers for linecards.
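Consequence 4 can be made concrete with a back-of-the-envelope calculation: a packet buffer must complete one write and one read per packet time, while the time per packet shrinks with line rate. The sketch below assumes 40-byte minimum packets and a 50ns DRAM random access time; both numbers and the function names are assumptions for illustration.

```python
import math


def packet_time_ns(line_rate_bps: float, packet_bytes: int = 40) -> float:
    """Time on the wire for one packet of the given (assumed) size."""
    return packet_bytes * 8 / line_rate_bps * 1e9


def parallel_accesses_needed(line_rate_bps: float, access_ns: float,
                             packet_bytes: int = 40) -> int:
    """Memory accesses that must overlap to keep up with the line.

    Each packet is written into the buffer once and read out once,
    so two accesses are needed per packet time.
    """
    return math.ceil(2 * access_ns / packet_time_ns(line_rate_bps, packet_bytes))


ASSUMED_DRAM_ACCESS_NS = 50.0   # illustrative value, not from the slides

for rate in (2.5e9, 10e9, 40e9):
    print(f"{rate / 1e9:g}Gb/s: packet time {packet_time_ns(rate):.0f}ns, "
          f"overlapping accesses {parallel_accesses_needed(rate, ASSUMED_DRAM_ACCESS_NS)}")
```

Because line rates grow much faster than memory access times improve, the number of accesses that must proceed in parallel keeps rising, which is why linecard buffers get harder to build.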
30
Technology Options
General purpose processor
– MIPS
– PowerPC
– Intel
Network processor
– Intel IXA and IXP processors
– IBM Rainier
– Control plane processors: SiByte (Broadcom), QED (PMC-Sierra).
FPGA
ASIC
31
Network Processors: Load-balancing
[Figure labels: Dispatch; multiple CPUs, each with a cache; Off chip Memory; Dedicated HW support, e.g. lookups]
Incoming packets dispatched to:
1. Idle processor, or
2. Processor dedicated to packets in this flow (to prevent mis-sequencing).
3. Processor for processing needed by packet, e.g. security, transcoding, application-level processing.
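Option 2 above (pinning a flow to one processor to prevent mis-sequencing) is commonly done by hashing the flow identifier. The following is a minimal sketch of that idea, assuming a 5-tuple flow key and a CRC-32 hash; the `Flow` tuple and function names are hypothetical and not taken from any particular network processor.

```python
import zlib
from collections import namedtuple

# Assumed flow key: the usual 5-tuple.
Flow = namedtuple("Flow", "src dst proto sport dport")


def processor_for(flow: Flow, n_processors: int) -> int:
    """Map a flow to a processor; the same flow always maps to the same one."""
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % n_processors


def dispatch(packets, n_processors: int):
    """Place (flow, sequence number) pairs on per-processor queues."""
    queues = [[] for _ in range(n_processors)]
    for flow, seq in packets:
        queues[processor_for(flow, n_processors)].append((flow, seq))
    return queues


# Two interleaved flows with made-up addresses.
a = Flow("10.0.0.1", "10.0.0.2", 6, 1234, 80)
b = Flow("10.0.0.3", "10.0.0.4", 17, 5000, 53)
arrivals = [(a, 0), (b, 0), (a, 1), (b, 1), (a, 2)]
queues = dispatch(arrivals, n_processors=4)
print([len(q) for q in queues])
```

The trade-off is that a pure hash ignores which processors are idle, so one heavy flow can overload a single processor while others sit unused.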
32
Network Processors: Pipelining
[Figure labels: a chain of CPUs, each with a cache; Off chip Memory; Dedicated HW support, e.g. lookups]
Processing broken down into (hopefully balanced) steps. Each processor performs one step of processing.
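Why the steps should be "hopefully balanced" can be seen from a toy throughput model: a pipeline runs at the speed of its slowest stage, whereas load-balanced processors that each run all the steps are limited by the total work per packet. The sketch below uses made-up per-stage times and ignores dispatch and memory contention; the function names are illustrative.

```python
def pipeline_rate_pps(stage_ns):
    """Pipeline throughput: one packet leaves per slowest-stage time."""
    return 1e9 / max(stage_ns)


def load_balanced_rate_pps(stage_ns, n_processors):
    """Each processor performs every step; processors work on different packets."""
    return n_processors * 1e9 / sum(stage_ns)


balanced = [25, 25, 25, 25]      # ns per stage, illustrative
unbalanced = [10, 20, 30, 40]    # same total work, uneven split

for stages in (balanced, unbalanced):
    print(stages,
          f"pipeline {pipeline_rate_pps(stages) / 1e6:.0f} Mpps,",
          f"load-balanced {load_balanced_rate_pps(stages, len(stages)) / 1e6:.0f} Mpps")
```

With perfectly balanced stages the two arrangements match in this model; any imbalance costs the pipeline throughput because the slowest stage sets the pace.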
33
Network Processors
Pros
– Flexibility: Protocols change, features are added.
– Reduced development time: In principle, should be quicker to develop software than design a custom chip. Reduces time-to-market, development costs, …
Cons
– Less efficient: slower than custom chip, more power.
– Usually designed using standard processor cores, not optimized for stream processing.
– Generally about 10x slower than general purpose CPU.
– Unusual development environments; hard to program.
– Often hard to partition functions over processors.
34
General Observations
Up until about 1998,
– Low-end packet switches used general purpose processors,
– Mid-range packet switches used FPGAs for datapath, general purpose processors for control plane.
– High-end packet switches used ASICs for datapath, general purpose processors for control plane.
More recently,
– 3rd party network processors now used in many low- and mid-range datapaths.
– Home-grown network processors used in mid- and high-end.
36
My 2c on network processors
Is it clear that multiple small parallel processors are needed?
When are 10 processors at speed 1 better than 1 processor at speed 10?
Network processors make sense if:
– Application is parallelizable into multiple threads/contexts.
– Uniprocessor performance is limited by load-latency.
If general purpose processors evolve anyway to:
– Contain multiple processors per chip,
– Support hardware multi-threading,
…then perhaps they are better suited because:
– Greater development effort means faster general purpose processors,
– Existing well-known development environments.
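One way to explore the "10 processors at speed 1 versus 1 processor at speed 10" question is a toy model in which each packet costs some compute work, which scales with processor speed, plus a fixed load latency, which does not. This model, its parameters and the function name are assumptions made for illustration; it ignores caches, multi-threading and contention.

```python
def throughput(n_processors: int, speed: float,
               compute_work: float, load_latency: float) -> float:
    """Packets per unit time for n independent processors.

    compute_work / speed shrinks on a faster processor;
    load_latency (time stalled on memory) is assumed not to.
    """
    per_packet_time = compute_work / speed + load_latency
    return n_processors / per_packet_time


WORK = 100.0   # arbitrary units of compute per packet

for latency in (0.0, 50.0):
    many_slow = throughput(10, 1.0, WORK, latency)
    one_fast = throughput(1, 10.0, WORK, latency)
    print(f"load latency {latency:g}: 10 x speed 1 = {many_slow:.4f}, "
          f"1 x speed 10 = {one_fast:.4f}")
```

In this model the two configurations tie when there is no load latency, and the ten slow processors pull ahead as soon as load latency is non-zero, which matches the condition that uniprocessor performance is limited by load-latency. Hardware multi-threading on a single fast processor attacks the same stall time, which is the alternative the slide raises.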
37
My 2c on network processors
The nail: [Figure labels: Data, Hdr; Context]
Characteristics:
1. Stream processing.
2. Multiple flows.
3. Most processing on header, not data.
4. Two sets of data: packets, context.
5. Packets have no temporal locality, and special spatial locality.
6. Context has temporal and spatial locality.
The hammer: [Figure labels: Data cache(s)]
Characteristics:
1. Shared in/out bus.
2. Optimized for data with spatial and temporal locality.
3. Especially optimized for register accesses.
38
A network uniprocessor
[Figure labels: Data cache(s); Off chip Memory; Context memory hierarchy; On-chip FIFO (two); Off-chip FIFOs (two); Head/tail; Mailbox registers]
Add hardware support for multiple threads/contexts.