TRANSCRIPT
CSC457 Seminar
YongKang Zhu
December 6th, 2001
About Network Processor
Outline
1. What an NP is, why we need it, and its features
2. Benchmarks for NP evaluation
3. Several issues on NP design
(a). Processing unit architecture
(b). Handling I/O events
(c). Memory (buffer) organization and management
What is a network processor?
A network processor is a highly programmable
processor suited to performing intelligent,
flexible packet processing and traffic
management functions at line speed in
networking devices such as routers and switches.
A typical router architecture
Why NPs, and what are their features?
Fast growth in transmission technology
Advanced packet processing functions
Traditional methods: using ASICs or off-the-shelf CPUs
Performance
Programmability, flexibility
Design and implementation complexity
Value proposition
Benchmarks for NP evaluation
Major metrics include:
Throughput: bps, pps, connections per second, transactions per second
Latency: time for a packet to pass through the NP
Jitter: variation in latency
Loss rate: ratio of lost packets
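The four metrics above can be computed directly from a packet trace. A minimal sketch follows; the trace record format, the numbers, and the choice of max-minus-min as the jitter definition are illustrative assumptions, not from the talk.

```python
# Hypothetical sketch: computing the four NP benchmark metrics from a
# packet trace. A record is (send_time, recv_time_or_None, size_bytes);
# recv_time of None marks a lost packet. All values here are made up.

def evaluate(trace, duration):
    delivered = [(s, r, n) for (s, r, n) in trace if r is not None]
    latencies = [r - s for (s, r, _) in delivered]
    return {
        "throughput_pps": len(delivered) / duration,
        "throughput_bps": 8 * sum(n for (_, _, n) in delivered) / duration,
        "latency_avg": sum(latencies) / len(latencies),
        "jitter": max(latencies) - min(latencies),  # one simple definition
        "loss_rate": 1 - len(delivered) / len(trace),
    }

trace = [(0.0, 0.002, 1500), (0.001, 0.004, 1500),
         (0.002, None, 64), (0.003, 0.005, 64)]
m = evaluate(trace, duration=1.0)   # 3 of 4 packets delivered
```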
Commbench - by Mark Franklin
1. Two categories of typical applications:
Header-processing applications: RTR, FRAG, DRR, TCP
Payload-processing applications: CAST, REED, ZIP, JPEG
2. Selecting appropriate input mix to represent different workload and traffic pattern
3. Design implications (computational complexity)
Importance of selecting input mix
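Why the mix matters can be sketched in a few lines: the average per-packet cost of a workload is the mix-weighted sum of each application's cost, so a payload-heavy mix looks radically different from a header-heavy one. The per-packet instruction counts below are made-up placeholders, not CommBench measurements.

```python
# Hedged sketch: the same benchmark suite yields very different aggregate
# workloads depending on the input mix. Instruction counts are hypothetical.

instr_per_packet = {"RTR": 200, "FRAG": 150, "DRR": 180, "ZIP": 9000}

def workload_cost(mix):
    """mix: {app: fraction of packets}; returns mean instructions/packet."""
    assert abs(sum(mix.values()) - 1.0) < 1e-9
    return sum(frac * instr_per_packet[app] for app, frac in mix.items())

header_heavy  = workload_cost({"RTR": 0.6, "FRAG": 0.2, "DRR": 0.2})
payload_heavy = workload_cost({"RTR": 0.2, "ZIP": 0.8})
# payload processing dominates by more than an order of magnitude
```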
Some Issues on NP design
Processing unit architecture
Fast handling of I/O events
Memory organization and management
Processing unit architecture
Four architectures reviewed:
1. a superscalar microprocessor (SS)
2. a fine-grained multithreading microprocessor (FGMT)
3. a chip multiprocessor (CMP)
4. a simultaneous multithreading (SMT) processor
Comparison among four architectures
1. CMP and SMT can exploit more instruction-level parallelism and packet-level parallelism
2. However, other problems are introduced, such as how to efficiently handle cache coherence and memory consistency
Handling I/O
Split packets into equal-sized internal flits
Higher level pipeline for packet processing
Using coprocessor
Higher (task) level pipeline
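The task-level pipeline idea above can be sketched as a chain of per-packet stages; on real hardware each stage would run on its own engine so the stages overlap in time. Stage names and packet fields here are illustrative, not from any particular NP.

```python
# Minimal sketch of a higher (task) level pipeline: each stage does one
# processing task and hands the packet to the next. All names hypothetical.

def parse(pkt):     pkt["parsed"] = True;        return pkt
def classify(pkt):  pkt["queue"] = pkt["dst"] % 4; return pkt
def modify(pkt):    pkt["ttl"] -= 1;             return pkt
def enqueue(pkt):   return pkt                   # hand off to output buffer

PIPELINE = [parse, classify, modify, enqueue]

def process(pkt):
    for stage in PIPELINE:   # sequential here; concurrent stages in hardware
        pkt = stage(pkt)
    return pkt

out = process({"dst": 10, "ttl": 64})
```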
Memory organization & management
1. Using novel DRAM architectures: page mode DRAM
Synchronous DRAM
Direct Rambus DRAM
2. Using slow DRAMs in parallel:
Ping-pong buffering
ECQF-MMA (earliest critical queue first)
Ping-pong buffering
Buffer Organization
Buffer Usage
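The ping-pong idea is to alternate two slow DRAM banks so that while one bank absorbs writes the other serves reads, roughly doubling usable bandwidth. A toy sketch under simplified assumptions (banks as plain lists, explicit swap):

```python
# Illustrative sketch of ping-pong buffering with two alternating banks.
# Real banks have fixed capacity and access timing; this model ignores both.

class PingPongBuffer:
    def __init__(self):
        self.banks = [[], []]
        self.write_bank = 0          # the other bank is the read bank

    def write(self, cell):
        self.banks[self.write_bank].append(cell)

    def swap(self):                  # exchange roles of the two banks
        self.write_bank ^= 1

    def read_all(self):
        bank = self.banks[self.write_bank ^ 1]
        cells, bank[:] = list(bank), []
        return cells

buf = PingPongBuffer()
buf.write("A"); buf.write("B")
buf.swap()                           # bank 0 now readable, bank 1 writable
buf.write("C")
drained = buf.read_all()             # cells written before the swap
```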
ECQF-MMA (earliest critical queue first)
Uses slow DRAM and fast SRAM to organize the buffer structure:
Q FIFO queues in total
memory bus width is b cells
memory random access time is 2T
the size of each SRAM is bounded by Q * (b - 1) cells
The arbiter selects which cells from which FIFO queue will depart in the future
Requests to the DRAM to replenish the SRAM FIFOs are sent only after they accumulate to a certain amount
This guarantees a bounded maximum latency for each cell
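The core of the idea can be sketched as follows: given the arbiter's upcoming departure schedule, find the SRAM FIFO that would run dry earliest and replenish that one first from DRAM. This is only a sketch of the "earliest critical queue" selection step, with made-up occupancies and schedule; it omits the replenishment block size b and the timing analysis.

```python
# Hedged sketch of the ECQF selection rule: simulate the known departure
# schedule against current SRAM occupancies and return the first queue
# that would underflow. All numbers below are illustrative.

def earliest_critical_queue(occupancy, schedule):
    """occupancy: {queue_id: cells in its SRAM FIFO}.
    schedule: upcoming queue IDs in departure order.
    Returns the queue that underflows first, or None if none does."""
    remaining = dict(occupancy)
    for q in schedule:
        remaining[q] -= 1
        if remaining[q] < 0:         # this departure finds the FIFO empty
            return q                 # -> replenish this queue next
    return None

occ = {0: 2, 1: 1, 2: 3}
sched = [0, 1, 1, 0, 2]              # queue 1 is asked for 2 cells, holds 1
critical = earliest_critical_queue(occ, sched)
```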
Intel's IXP1200
1 StrongARM core and 6 RISC microengines
can manage up to 24 independent threads
two interfaces: IX bus and PCI
IX bus for connecting MAC ports
PCI bus for connecting master processor
register files replicated in each microengine
on-chip scratch SRAM and I/O buffers
two sets of register files in each microengine
128 GPRs and 128 transfer registers
instruction set architecture
specified field for context switch
specified instruction for reading on-chip scratch SRAM
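The point of hardware context-switch support is that a microengine thread can swap out whenever it issues a slow memory reference, letting another ready thread hide the latency. A toy model using generators as hardware threads; the thread bodies are hypothetical, not IXP1200 microcode.

```python
# Illustrative model of zero-overhead context switching on a microengine:
# a thread yields at every memory reference and the engine immediately
# runs the next ready thread. Thread contents are made up.

from collections import deque

def thread(tid, n_refs, log):
    for i in range(n_refs):
        log.append((tid, i))         # do some work
        yield "mem_ref"              # swap out while the reference completes

def run(n_threads):
    log = []
    ready = deque(thread(t, 2, log) for t in range(n_threads))
    while ready:
        th = ready.popleft()
        try:
            next(th)                 # run until the next memory reference
            ready.append(th)         # re-queue once the reference returns
        except StopIteration:
            pass                     # thread finished its program
    return log

trace = run(3)   # threads interleave round-robin at each memory reference
```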
One application of Intel's IXP1200
Conclusions
1. what an NP is, why we need it, and its features
2. benchmarks
3. processing unit architectures: CMP or SMT
4. fast I/O handling: task pipeline, coprocessor
5. memory architectures
-- only a small part of a huge design space