Processing packets in packet switches (CS343, May 7th 2003)

1

High Performance Switching and Routing, Telecom Center Workshop: Sept 4, 1997.

Processing packets in packet switches

CS343, May 7th 2003

Nick McKeown, Professor of Electrical Engineering and Computer Science, Stanford University

nickm@stanford.edu
www.stanford.edu/~nickm

2

Contents

1. What processing is done where?
2. What does a packet switch look like?
  - Examples of packet switches
  - What does a packet switch do?
  - Typical packet switch architecture
  - Evolution of high performance packet switch architecture
3. Trends and consequences
4. Technology options for processing packets
  - General purpose CPU
  - Network processors
  - FPGA
  - ASIC
5. My 2c

3

The Network Layer View of the Internet

Routers

End hosts

4

Hierarchical arrangement: A crude approximation

Core Routers

End hosts

Edge Routers

Core routers: Maximum capacity, minimum function. Typically: 16 ports of 10Gb/s. Capacity 160Gb/s, 200Mpps. Price $1M.

Edge routers: Medium capacity, maximum flexibility and function. Typically: 16 ports of 2.5Gb/s. Capacity 20-30 Gb/s, 10-20Mpps. Price $200k.
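
As a rough check on the core router figures above, aggregate capacity is the port count times the line rate, and the packet rate then follows from an average packet size. The sketch below is only illustrative: the 100-byte average packet size is an assumption (the slide does not state one), and the function names are hypothetical.

```python
def aggregate_capacity_bps(ports: int, line_rate_bps: float) -> float:
    """Aggregate capacity: number of ports times the per-port line rate."""
    return ports * line_rate_bps


def packets_per_second(capacity_bps: float, avg_packet_bytes: int) -> float:
    """Packet rate at full load for an assumed average packet size."""
    return capacity_bps / (avg_packet_bytes * 8)


# Core router from the slide: 16 ports of 10Gb/s.
core_bps = aggregate_capacity_bps(16, 10e9)

# Assumption: 100-byte average packets (not stated on the slide).
core_pps = packets_per_second(core_bps, 100)

print(f"{core_bps / 1e9:.0f} Gb/s, {core_pps / 1e6:.0f} Mpps")
```

Under that assumption the result matches the 160Gb/s and 200Mpps quoted for core routers. With 40-byte packets the same capacity would correspond to 500Mpps.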

5

Hierarchical arrangement

[Figure labels: End hosts (1000s per mux); Access multiplexer; Edge Routers; Core Routers; Point of Presence (POP); 10Gb/s “OC192” links between POPs]

POP: Point of Presence. Richly interconnected by mesh of long-haul links. Typically: 40 POPs per national network operator; 10-40 core routers per POP.

6

Autonomous Systems

[Figure: four autonomous systems (Sprint, Worldcom, AT&T, Global Crossing), each containing several POPs, connected at “peering points”.]

7

How we connect: Corporate/campus environment

Ethernet switch. Typically: 100 ports of 100Mb/s Ethernet

Building-wide router, e.g. gates-rtr.stanford.edu. Typically: 16 ports of 1Gb/s Ethernet

Campus or company-wide router, e.g. border-rtr.stanford.edu. Typically: mixture of 2.5Gb/s “OC48” and Gb/s Ethernet

[Figure labels: i/f; POPs linked by 10Gb/s “OC192”]

8

How we connect: Home modem/DSL environment

DSL Router/NAT. Typically: 10/100Mb/s

Telephone switch with DSL line interface at your local Central Office

[Figure labels: i/f; Point of Presence (POP); POPs linked by 10Gb/s “OC192”]

10

What a High Performance Router Looks Like

Cisco GSR 12416. Dimensions: 6ft, 19”, 2ft. Capacity: 160Gb/s. Power: 4.2kW.

Juniper M160. Dimensions: 3ft, 2.5ft, 19”. Capacity: 80Gb/s. Power: 2.6kW.

11

Other packet switches

Cisco 7500 “edge” routers

Lucent GX550 Core ATM switch

D-Link DSL router

Wiring closet in Packard building

13

The IP Datagram

Header fields, in order:

vers | HLen | TOS | Total Length (<=64 KBytes)
ID | Flags | FRAG Offset (offset within original packet)
TTL (hop count) | Protocol | checksum
SRC IP Address
DST IP Address
(OPTIONS) (PAD)
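
The fixed 20-byte part of the header can be unpacked field by field. The following is a minimal sketch using Python's standard `struct` module; the function name, the dictionary keys and the sample header bytes are illustrative choices, not part of the slides, and options are ignored.

```python
import socket
import struct


def parse_ipv4_header(raw: bytes) -> dict:
    """Unpack the fixed 20-byte IPv4 header (options, if any, are ignored)."""
    if len(raw) < 20:
        raise ValueError("need at least 20 bytes for an IPv4 header")
    (ver_hlen, tos, total_length, ident, flags_frag,
     ttl, protocol, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "vers": ver_hlen >> 4,                  # top 4 bits
        "hlen_bytes": (ver_hlen & 0x0F) * 4,    # HLen counts 32-bit words
        "tos": tos,
        "total_length": total_length,           # 16 bits, hence <= 64 KBytes
        "id": ident,
        "flags": flags_frag >> 13,              # top 3 bits
        "frag_offset": flags_frag & 0x1FFF,     # offset within original packet
        "ttl": ttl,                             # hop count
        "protocol": protocol,
        "checksum": checksum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }


# Illustrative header: a UDP datagram from 192.168.0.1 to 192.168.0.199.
sample = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
fields = parse_ipv4_header(sample)
print(fields["vers"], fields["ttl"], fields["src"], fields["dst"])
```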

14

Forwarding in an IP Router

1. Lookup packet DA in forwarding table.
  - If known, forward to correct port.
  - If unknown, drop packet.
2. Decrement TTL, update header checksum.
3. Forward packet to outgoing interface.
4. Transmit packet onto link.
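
The lookup and header-update steps above can be sketched in a few lines. This is an illustrative model, not how a line card implements it: the forwarding table is a plain dictionary scanned linearly for the longest matching prefix (real routers use specialised lookup structures), the TTL-expiry drop is an added assumption, and all names are hypothetical.

```python
import ipaddress
import struct


def ipv4_checksum(header: bytes) -> int:
    """One's-complement of the one's-complement sum of the 16-bit header words."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:                      # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def lookup(table: dict, dst) -> object:
    """Return the port of the longest matching prefix, or None if unknown."""
    best = None
    for prefix, port in table.items():
        net = ipaddress.ip_network(prefix)
        if dst in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, port)
    return None if best is None else best[1]


def forward(header: bytearray, table: dict):
    """Steps 1 and 2: look up DA, decrement TTL, update checksum. Returns port or None."""
    dst = ipaddress.ip_address(bytes(header[16:20]))
    port = lookup(table, dst)
    if port is None:
        return None                         # unknown destination: drop
    if header[8] <= 1:
        return None                         # assumption: drop when TTL expires
    header[8] -= 1                          # decrement TTL
    header[10:12] = b"\x00\x00"             # zero the checksum field, then recompute
    struct.pack_into("!H", header, 10, ipv4_checksum(header))
    return port                             # steps 3 and 4 happen on this port


table = {"192.168.0.0/16": 1, "192.168.0.128/25": 2, "10.0.0.0/8": 3}
hdr = bytearray.fromhex("45000073000040004011b861c0a80001c0a800c7")
out_port = forward(hdr, table)
print(out_port, hdr[8])
```

Recomputing the whole checksum keeps the sketch short. Because only the TTL changes, an incremental update of the checksum is also possible.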

15

Ethernet Frame Format

Field: Preamble | SFD | DA | SA | Type | Data | Pad | CRC
Bytes: 7 | 1 | 6 | 6 | 2 | 0-1500 | 0-46 | 4

1. Preamble: trains clock-recovery circuits
2. Start of Frame Delimiter: indicates start of frame
3. Destination Address: 48-bit globally unique address assigned by manufacturer.
  - 1b: unicast/multicast
  - 1b: local/global address
4. Type: Indicates protocol of encapsulated data (e.g. IP = 0x0800)
5. Pad: Zeroes used to ensure minimum frame length
6. Cyclic Redundancy Check: check sequence to detect bit errors.
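
A minimal sketch of reading these fields in software follows. It assumes the frame handed to software starts at the destination address (preamble and SFD already stripped) and omits CRC checking; the function names and the sample frame are illustrative.

```python
import struct

ETHERTYPE_IP = 0x0800
MIN_DATA_BYTES = 46      # data + pad must reach 46 bytes


def mac_to_str(mac: bytes) -> str:
    return ":".join(f"{b:02x}" for b in mac)


def parse_ethernet_header(frame: bytes) -> dict:
    """Parse DA, SA and Type; assumes preamble/SFD are already removed."""
    if len(frame) < 14:
        raise ValueError("need at least 14 bytes for DA, SA and Type")
    da, sa, eth_type = struct.unpack("!6s6sH", frame[:14])
    return {
        "da": mac_to_str(da),
        "sa": mac_to_str(sa),
        "type": eth_type,
        # The two low-order bits of the first address byte are the flag bits.
        "multicast": bool(da[0] & 0x01),              # unicast/multicast bit
        "locally_administered": bool(da[0] & 0x02),   # local/global bit
        "payload": frame[14:],
    }


def pad_length(data_len: int) -> int:
    """Number of zero bytes needed to reach the minimum data length."""
    return max(0, MIN_DATA_BYTES - data_len)


frame = bytes.fromhex("ffffffffffff" "001122334455" "0800") + b"x" * 10
hdr = parse_ethernet_header(frame)
print(hdr["da"], hex(hdr["type"]), hdr["multicast"], pad_length(10))
```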

16

Encapsulation

Preamble | SFD | DA | SA | Type = IP | Data | Pad | CRC

The Data field carries: IP Header | IP Data

18

Generic Router Architecture

[Figure: a packet (Data, Hdr) passes through Header Processing (Lookup IP Address, Update Header) and is then queued (Queue Packet).]

Address Table: maps IP Address to Next Hop; ~1M prefixes, off-chip DRAM.

Buffer Memory: ~1M packets, off-chip DRAM.

19

Generic Router Architecture

[Figure: three Header Processing blocks (Lookup IP Address, Update Header), each with an Address Table, and three Buffer Managers, each with a Buffer Memory.]

21

First Generation Routers

[Figure labels: CPU with Route Table and Buffer Memory; Line Interfaces, each with a MAC; Shared Backplane]

Typically <0.5Gb/s aggregate capacity

22

Second Generation Routers

[Figure labels: CPU with Route Table and Buffer Memory; Line Cards, each with a MAC, Buffer Memory and Fwding Cache]

Typically <5Gb/s aggregate capacity

23

Third Generation Routers

[Figure labels: Line Cards, each with a MAC, Local Buffer Memory and Fwding Table; CPU Card with CPU, Memory and Routing Table; Line Interface; Switched Backplane]

Typically <50Gb/s aggregate capacity

24

Fourth Generation Routers

[Figure labels: Switch Core; Linecards; Optical links, 100s of metres]

160Gb/s - 20Tb/s routers in development

26

Trends in Technology, Routers & Traffic

[Figure: normalized growth since 1980, log scale from 1 to 1,000,000, plotted for 1980 to 2001.]

DRAM Random Access Time: 1.1x / 18 months
Moore’s Law: 2x / 18 months
Router Capacity: 2.2x / 18 months
Line Capacity: 2x / 7 months
User Traffic: 2x / 12 months
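
Because the rates above are quoted over different periods, it can help to put them on a common footing. The sketch below converts each "factor per period" into a doubling time, assuming steady exponential growth; the function and dictionary names are illustrative.

```python
import math


def doubling_time_months(factor: float, period_months: float) -> float:
    """Months needed to double, given growth of `factor` every `period_months`."""
    return period_months * math.log(2) / math.log(factor)


# Growth rates as quoted on the slide: (factor, period in months).
TRENDS = {
    "DRAM random access time": (1.1, 18),
    "Moore's Law": (2.0, 18),
    "Router capacity": (2.2, 18),
    "Line capacity": (2.0, 7),
    "User traffic": (2.0, 12),
}

for name, (factor, months) in TRENDS.items():
    print(f"{name}: doubles every {doubling_time_months(factor, months):.1f} months")
```

On these figures DRAM random access time takes roughly 130 months to improve by a factor of two, against 7 months for line capacity.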

27

Trends and Consequences

1. CPU instructions per minimum length packet

[Figure: log scale from 1 to 1000, plotted for 1996 to 2001.]

2. Disparity between traffic and router growth

[Figure: normalized growth from 0 to 600, plotted for 2003 to 2012; traffic vs. router capacity, 5-fold disparity.]

Consequences:
1. Packet processing is getting harder, and eventually network processors will be used less for high performance routers.
2. (Much) bigger routers will be developed.
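
The roughly 5-fold disparity can be checked against the growth rates quoted earlier in the talk: user traffic 2x / 12 months and router capacity 2.2x / 18 months. The sketch below extrapolates both over the nine years from 2003 to 2012, assuming the rates stay constant; the names are illustrative.

```python
def growth(factor: float, period_months: float, years: float) -> float:
    """Total growth after `years`, given `factor` every `period_months`."""
    return factor ** (years * 12 / period_months)


YEARS = 9  # 2003 to 2012, the span of the chart

traffic = growth(2.0, 12, YEARS)   # user traffic: 2x / 12 months
router = growth(2.2, 18, YEARS)    # router capacity: 2.2x / 18 months

print(f"traffic x{traffic:.0f}, router capacity x{router:.0f}, "
      f"disparity x{traffic / router:.1f}")
```

This gives about 512x for traffic against about 113x for router capacity, a ratio of about 4.5, consistent with the "5-fold disparity" label.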

28

Trends and Consequences (2)

3. Power consumption will exceed POP limits

[Figure: power (kW), approx., from 0 to 6, plotted for 1990 to 2002.]

4. Disparity between line-rate and memory access time

[Figure: normalized growth rate, log scale from 1 to 1,000,000.]

Consequences:
3. Multi-rack routers will spread power over multiple racks.
4. It will get harder to build packet buffers for linecards.
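
To see why packet buffers get harder, compare the time a minimum-length packet occupies the line with the time a memory access takes. The sketch below is a back-of-the-envelope model: the 40-byte minimum packet and the 50ns DRAM random access time are assumptions for illustration, not figures from the slides, and a single memory is assumed to serve one write and one read per packet.

```python
MIN_PACKET_BYTES = 40     # assumption: minimum-length packet
DRAM_ACCESS_NS = 50.0     # assumption: illustrative DRAM random access time


def packet_time_ns(packet_bytes: int, line_rate_bps: float) -> float:
    """Time one packet occupies the line, in nanoseconds."""
    return packet_bytes * 8 / line_rate_bps * 1e9


def buffer_keeps_up(line_rate_bps: float, access_ns: float = DRAM_ACCESS_NS) -> bool:
    """A packet buffer must write and later read every packet: two accesses."""
    return 2 * access_ns <= packet_time_ns(MIN_PACKET_BYTES, line_rate_bps)


for name, rate in (("OC48 (2.5Gb/s)", 2.5e9), ("OC192 (10Gb/s)", 10e9)):
    t = packet_time_ns(MIN_PACKET_BYTES, rate)
    print(f"{name}: {t:.0f} ns per packet, keeps up: {buffer_keeps_up(rate)}")
```

Under these assumptions a single memory keeps up at 2.5Gb/s (about 128ns per packet) but not at 10Gb/s (about 32ns per packet).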

30

Technology Options

General purpose processor
  - MIPS
  - PowerPC
  - Intel

Network processor
  - Intel IXA and IXP processors
  - IBM Rainier
  - Control plane processors: SiByte (Broadcom), QED (PMC-Sierra).

FPGA

ASIC

31

Network Processors: Load-balancing

[Figure labels: Dispatch; CPUs, each with a cache; Off chip Memory; Dedicated HW support, e.g. lookups]

Incoming packets dispatched to:
1. Idle processor, or
2. Processor dedicated to packets in this flow (to prevent mis-sequencing).
3. Processor for processing needed by packet, e.g. security, transcoding, application-level processing.
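
One common way to realise option 2 is to hash the fields that identify a flow and use the hash to pick a processor, so that all packets of a flow are handled in order by the same CPU. The sketch below assumes a flow is identified by the usual 5-tuple and uses CRC32 as the hash; both choices, and the function names, are illustrative rather than taken from the slides.

```python
import zlib


def flow_key(src_ip: str, dst_ip: str, protocol: int,
             src_port: int, dst_port: int) -> bytes:
    """Serialise the 5-tuple that (by assumption) identifies a flow."""
    return f"{src_ip}|{dst_ip}|{protocol}|{src_port}|{dst_port}".encode()


def dispatch(key: bytes, n_cpus: int) -> int:
    """Pick a CPU index from the flow key; same flow always maps to same CPU."""
    return zlib.crc32(key) % n_cpus


N_CPUS = 5
flow_a = flow_key("10.0.0.1", "10.0.0.2", 6, 1234, 80)
flow_b = flow_key("10.0.0.3", "10.0.0.2", 6, 4321, 80)

# Every packet of flow_a goes to the same CPU, so it cannot be mis-sequenced.
print(dispatch(flow_a, N_CPUS), dispatch(flow_a, N_CPUS), dispatch(flow_b, N_CPUS))
```

The trade-off is load balance: a few heavy flows can overload one processor while others sit idle, which is what option 1 avoids.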

32

Network Processors: Pipelining

[Figure labels: CPUs, each with a cache; Off chip Memory; Dedicated HW support, e.g. lookups]

Processing broken down into (hopefully balanced) steps. Each processor performs one step of processing.
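
Why balance matters: in a pipeline a new packet can enter only as fast as the slowest stage finishes, so throughput is set by the longest step while latency is the sum of all steps. The sketch below illustrates this with made-up stage times; the function names and numbers are hypothetical.

```python
def pipeline_rate_pps(stage_times_ns: list) -> float:
    """Packets per second: limited by the slowest stage."""
    return 1e9 / max(stage_times_ns)


def pipeline_latency_ns(stage_times_ns: list) -> float:
    """Time for one packet to pass through every stage."""
    return sum(stage_times_ns)


balanced = [25, 25, 25, 25]      # illustrative stage times in ns
unbalanced = [10, 10, 10, 70]    # same total work, badly partitioned

for name, stages in (("balanced", balanced), ("unbalanced", unbalanced)):
    print(f"{name}: {pipeline_rate_pps(stages) / 1e6:.1f} Mpps, "
          f"{pipeline_latency_ns(stages)} ns latency")
```

Both pipelines do 100ns of work per packet, but the unbalanced one sustains only about 14.3Mpps against 40Mpps for the balanced one.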

33

Network Processors

Pros
  - Flexibility: Protocols change, features are added.
  - Reduced development time: In principle, should be quicker to develop software than design a custom chip. Reduces time-to-market, development costs, …

Cons
  - Less efficient: slower than custom chip, more power.
  - Usually designed using standard processor cores, not optimized for stream processing.
  - Generally about 10x slower than general purpose CPU.
  - Unusual development environments; hard to program.
  - Often hard to partition functions over processors.

34

General Observations

Up until about 1998,
  - Low-end packet switches used general purpose processors,
  - Mid-range packet switches used FPGAs for datapath, general purpose processors for control plane.
  - High-end packet switches used ASICs for datapath, general purpose processors for control plane.

More recently,
  - 3rd party network processors now used in many low- and mid-range datapaths.
  - Home-grown network processors used in mid- and high-end.

36

My 2c on network processors

Is it clear that multiple small parallel processors are needed?

When are 10 processors at speed 1 better than 1 processor at speed 10?

Network processors make sense if:
  - Application is parallelizable into multiple threads/contexts.
  - Uniprocessor performance is limited by load-latency.

If general purpose processors evolve anyway to:
  - Contain multiple processors per chip,
  - Support hardware multi-threading,

…then perhaps they are better suited because:
  - Greater development effort means faster general purpose processors,
  - Existing well-known development environments.
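
The "10 processors at speed 1 vs. 1 processor at speed 10" question can be explored with a deliberately simple model. It assumes each packet needs a fixed amount of compute work plus a fixed memory stall that does not shrink when the processor gets faster, that packets are fully parallelizable, and that a processor with several hardware threads can overlap the stalls of different packets. The model, its parameters and the numbers are illustrative assumptions, not results from the talk.

```python
def throughput(n_cpus: int, speed: float, work: float,
               stall_ns: float, threads: int = 1) -> float:
    """Packets per ns for n_cpus processors.

    speed    : instructions per ns per processor
    work     : instructions per packet
    stall_ns : memory (load) latency per packet, independent of speed
    threads  : hardware contexts per processor that can overlap stalls
    """
    compute_bound = speed / work                          # never stalling
    latency_bound = threads / (work / speed + stall_ns)   # each thread blocks
    return n_cpus * min(compute_bound, latency_bound)


WORK, STALL = 100, 100.0   # illustrative: 100 instructions, 100 ns of stalls

ten_slow = throughput(10, 1.0, WORK, STALL)
one_fast = throughput(1, 10.0, WORK, STALL)
one_fast_mt = throughput(1, 10.0, WORK, STALL, threads=11)

print(f"10 x speed 1: {ten_slow:.4f} packets/ns")
print(f"1 x speed 10: {one_fast:.4f} packets/ns")
print(f"1 x speed 10, 11 threads: {one_fast_mt:.4f} packets/ns")
```

In this model the ten slow processors win whenever there is any load latency, and the two designs tie when the stall is zero. Adding enough hardware threads lets the single fast processor hide the latency and reach its compute bound.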

37

My 2c on network processors

The nail:

Characteristics:
1. Stream processing.
2. Multiple flows.
3. Most processing on header, not data.
4. Two sets of data: packets, context.
5. Packets have no temporal locality, and special spatial locality.
6. Context has temporal and spatial locality.

The hammer:

Characteristics:
1. Shared in/out bus.
2. Optimized for data with spatial and temporal locality.
3. Especially optimized for register accesses.

[Figure labels: Context; Data cache(s); Data, Hdr]

38

A network uniprocessor

Add hardware support for multiple threads/contexts.

[Figure labels: Data cache(s); Off chip Memory; Context memory hierarchy; On-chip FIFOs; Off-chip FIFOs; Head/tail; Mailbox registers]
