TRANSCRIPT
CSC457 Seminar
YongKang Zhu
December 6th, 2001
About Network Processor
Outline
1. What an NP is, why we need it, and its features
2. Benchmarks for NP evaluation
3. Several issues on NP design
(a). Processing unit architecture
(b). Handling I/O events
(c). Memory (buffer) organization and management
What is a network processor?
A network processor is a highly programmable
processor suited to performing intelligent,
flexible packet processing and traffic
management functions at line speed in
networking devices such as routers and switches.
A typical router architecture
Why NPs, and what are their features?
Fast growth in transmission technology
Advanced packet processing functions
Traditional methods: using ASICs or off-the-shelf CPUs
Performance
Programmability, flexibility
Design and implementation complexity
Value proposition
Benchmarks for NP evaluation
Major metrics include:
Throughput: bps, pps, connections per second, transactions per second
Latency: time for a packet to pass through the NP
Jitter: variation in latency
Loss rate: ratio of lost packets
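The four metrics above can be computed directly from a packet trace. A minimal sketch follows; the trace record format, the numbers, and the choice of max-minus-min as the jitter definition are illustrative assumptions, not from the talk.

```python
# Hypothetical sketch: computing the four NP benchmark metrics from a
# packet trace. A record is (send_time, recv_time_or_None, size_bytes);
# recv_time of None marks a lost packet. All values here are made up.

def evaluate(trace, duration):
    delivered = [(s, r, n) for (s, r, n) in trace if r is not None]
    latencies = [r - s for (s, r, _) in delivered]
    return {
        "throughput_pps": len(delivered) / duration,
        "throughput_bps": 8 * sum(n for (_, _, n) in delivered) / duration,
        "latency_avg": sum(latencies) / len(latencies),
        "jitter": max(latencies) - min(latencies),  # one simple definition
        "loss_rate": 1 - len(delivered) / len(trace),
    }

trace = [(0.0, 0.002, 1500), (0.001, 0.004, 1500),
         (0.002, None, 64), (0.003, 0.005, 64)]
m = evaluate(trace, duration=1.0)   # 3 of 4 packets delivered
```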
Commbench - by Mark Franklin
1. Two categories of typical applications:
Header-processing applications: RTR, FRAG, DRR, TCP
Payload-processing applications: CAST, REED, ZIP, JPEG
2. Selecting appropriate input mix to represent different workload and traffic pattern
3. Design implications (computational complexity)
Importance of selecting input mix
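Why the mix matters can be sketched in a few lines: the average per-packet cost of a workload is the mix-weighted sum of each application's cost, so a payload-heavy mix looks radically different from a header-heavy one. The per-packet instruction counts below are made-up placeholders, not CommBench measurements.

```python
# Hedged sketch: the same benchmark suite yields very different aggregate
# workloads depending on the input mix. Instruction counts are hypothetical.

instr_per_packet = {"RTR": 200, "FRAG": 150, "DRR": 180, "ZIP": 9000}

def workload_cost(mix):
    """mix: {app: fraction of packets}; returns mean instructions/packet."""
    assert abs(sum(mix.values()) - 1.0) < 1e-9
    return sum(frac * instr_per_packet[app] for app, frac in mix.items())

header_heavy  = workload_cost({"RTR": 0.6, "FRAG": 0.2, "DRR": 0.2})
payload_heavy = workload_cost({"RTR": 0.2, "ZIP": 0.8})
# payload processing dominates by more than an order of magnitude
```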
Some Issues on NP design
Processing unit architecture
Fast handling of I/O events
Memory organization and management
Processing unit architecture
Four architectures reviewed:
1. a superscalar microprocessor (SS)
2. a fine-grained multithreading microprocessor (FGMT)
3. a chip multiprocessor (CMP)
4. a simultaneous multithreading (SMT) processor
Comparison among four architectures
1. CMP and SMT can exploit more instruction-level parallelism and packet-level parallelism
2. However, other problems are introduced, such as how to efficiently handle cache coherence and memory consistency
Handling I/O
Split packets into equal-sized internal flits
Higher level pipeline for packet processing
Using coprocessor
Higher (task) level pipeline
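The task-level pipeline idea above can be sketched as a chain of per-packet stages; on real hardware each stage would run on its own engine so the stages overlap in time. Stage names and packet fields here are illustrative, not from any particular NP.

```python
# Minimal sketch of a higher (task) level pipeline: each stage does one
# processing task and hands the packet to the next. All names hypothetical.

def parse(pkt):     pkt["parsed"] = True;        return pkt
def classify(pkt):  pkt["queue"] = pkt["dst"] % 4; return pkt
def modify(pkt):    pkt["ttl"] -= 1;             return pkt
def enqueue(pkt):   return pkt                   # hand off to output buffer

PIPELINE = [parse, classify, modify, enqueue]

def process(pkt):
    for stage in PIPELINE:   # sequential here; concurrent stages in hardware
        pkt = stage(pkt)
    return pkt

out = process({"dst": 10, "ttl": 64})
```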
Memory organization & management
1. Using novel DRAM architectures: page mode DRAM
Synchronous DRAM
Direct Rambus DRAM
2. Using slow DRAMs in parallel:
Ping-pong buffering
ECQF-MMA (earliest critical queue first)
Ping-pong buffering
Buffer Organization
Buffer Usage
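The ping-pong idea is to alternate two slow DRAM banks so that while one bank absorbs writes the other serves reads, roughly doubling usable bandwidth. A toy sketch under simplified assumptions (banks as plain lists, explicit swap):

```python
# Illustrative sketch of ping-pong buffering with two alternating banks.
# Real banks have fixed capacity and access timing; this model ignores both.

class PingPongBuffer:
    def __init__(self):
        self.banks = [[], []]
        self.write_bank = 0          # the other bank is the read bank

    def write(self, cell):
        self.banks[self.write_bank].append(cell)

    def swap(self):                  # exchange roles of the two banks
        self.write_bank ^= 1

    def read_all(self):
        bank = self.banks[self.write_bank ^ 1]
        cells, bank[:] = list(bank), []
        return cells

buf = PingPongBuffer()
buf.write("A"); buf.write("B")
buf.swap()                           # bank 0 now readable, bank 1 writable
buf.write("C")
drained = buf.read_all()             # cells written before the swap
```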
ECQF-MMA (earliest critical queue first)
Uses slow DRAM and fast SRAM to organize the buffer structure:
Q FIFO queues in total
memory bus width is b cells
memory random access time is 2T
the size of each SRAM is bounded by Q * (b - 1) cells
The arbiter selects which cells from which FIFO queue will depart in the future
Requests to the DRAM to replenish the SRAM FIFOs are sent only after they accumulate to a certain amount
This guarantees a bounded maximum latency for each cell
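The core of the idea can be sketched as follows: given the arbiter's upcoming departure schedule, find the SRAM FIFO that would run dry earliest and replenish that one first from DRAM. This is only a sketch of the "earliest critical queue" selection step, with made-up occupancies and schedule; it omits the replenishment block size b and the timing analysis.

```python
# Hedged sketch of the ECQF selection rule: simulate the known departure
# schedule against current SRAM occupancies and return the first queue
# that would underflow. All numbers below are illustrative.

def earliest_critical_queue(occupancy, schedule):
    """occupancy: {queue_id: cells in its SRAM FIFO}.
    schedule: upcoming queue IDs in departure order.
    Returns the queue that underflows first, or None if none does."""
    remaining = dict(occupancy)
    for q in schedule:
        remaining[q] -= 1
        if remaining[q] < 0:         # this departure finds the FIFO empty
            return q                 # -> replenish this queue next
    return None

occ = {0: 2, 1: 1, 2: 3}
sched = [0, 1, 1, 0, 2]              # queue 1 is asked for 2 cells, holds 1
critical = earliest_critical_queue(occ, sched)
```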
Intel's IXP1200
1 StrongARM core and 6 RISC microengines
can manage up to 24 independent threads
two interfaces: IX bus and PCI
IX bus for connecting MAC ports
PCI bus for connecting master processor
register files replicated in each microengine
on-chip scratch SRAM and I/O buffers
two sets of register files in each microengine
128 GPRs and 128 transfer registers
instruction set architecture
specified field for context switch
specified instruction for reading on-chip scratch SRAM
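The point of hardware context-switch support is that a microengine thread can swap out whenever it issues a slow memory reference, letting another ready thread hide the latency. A toy model using generators as hardware threads; the thread bodies are hypothetical, not IXP1200 microcode.

```python
# Illustrative model of zero-overhead context switching on a microengine:
# a thread yields at every memory reference and the engine immediately
# runs the next ready thread. Thread contents are made up.

from collections import deque

def thread(tid, n_refs, log):
    for i in range(n_refs):
        log.append((tid, i))         # do some work
        yield "mem_ref"              # swap out while the reference completes

def run(n_threads):
    log = []
    ready = deque(thread(t, 2, log) for t in range(n_threads))
    while ready:
        th = ready.popleft()
        try:
            next(th)                 # run until the next memory reference
            ready.append(th)         # re-queue once the reference returns
        except StopIteration:
            pass                     # thread finished its program
    return log

trace = run(3)   # threads interleave round-robin at each memory reference
```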
One application of Intel's IXP1200
Conclusions
1. what an NP is, why we need it, and its features
2. benchmarks
3. processing unit architectures: CMP or SMT
4. fast I/O handling: task pipeline, coprocessor
5. memory architectures
-- only a small part of a huge design space