Block-Based Packet Buffer with Deterministic Packet Departures Hao Wang and Bill Lin University of California, San Diego HPSR 2010, Dallas


Block-Based Packet Buffer with Deterministic Packet Departures

Hao Wang and Bill Lin

University of California, San Diego

HPSR 2010, Dallas


HPSR, June 13-16, 2010

Packet Buffer in Routers

[Figure: a router with inputs (in) and outputs (out) around a block labeled "Scheduler and Packet Buffers"]

• At 40 Gb/s, a 40-byte packet occupies the line for 8 ns, so an input linecard has 8 ns to both write and read a packet.

• Routers need to store packets to deal with congestion.
– Bandwidth × RTT = 40 Gb/s × 250 ms = 10 Gb of buffering.
– That is too big to store in SRAM, so DRAM is needed.

• Problem: DRAM access time is ~40 ns, roughly a 10x speed difference.
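The arithmetic behind these figures can be checked with a short sketch. The function names are hypothetical; the inputs are the 40-byte packet, 40 Gb/s line rate, and 250 ms round-trip time quoted on this slide.

```python
def slot_time_ns(packet_bytes: int, line_rate_bps: float) -> float:
    """Time one packet occupies the line at the given rate, in nanoseconds."""
    return packet_bytes * 8 / line_rate_bps * 1e9


def buffer_size_bits(line_rate_bps: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product: bits that arrive during one round-trip time."""
    return line_rate_bps * rtt_seconds


if __name__ == "__main__":
    # 40-byte packet at 40 Gb/s: about 8 ns per time slot.
    print(f"slot time: {slot_time_ns(40, 40e9):.1f} ns")
    # 40 Gb/s x 250 ms: 10 Gb of buffering.
    print(f"buffer size: {buffer_size_bits(40e9, 0.25) / 1e9:.0f} Gb")
```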


Parallel and Interleaved DRAM

• Assume the DRAM-to-SRAM access latency ratio is 3.

[Figure: packets (P) spread across parallel, interleaved DRAM banks]


Problems with Parallelism

• Access patterns may create problems.

• To access packets 3, 6, 9 and 11 one after another, interleaved read requests can be issued so that those packets are read out at line rate.

[Figure: packets 1 to 14 placed across the DRAM banks]


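• But accessing packets 2 and 3, or 10 and 11, in succession is problematic, since the second request of each pair targets a bank that is still busy serving the first.

• This is an example of a packet access conflict.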
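A minimal sketch of how such a conflict can be detected, assuming one read request is issued per time slot and a bank stays busy for b = 3 slots. The packet-to-bank placement below is hypothetical, chosen only to agree with the examples in the text; the helper name `first_conflict` is also an assumption.

```python
def first_conflict(read_order, bank_of, b):
    """Return the first packet whose bank is still busy, or None if none is.

    One read is issued per time slot, and a bank stays busy for b slots
    after a read is issued to it.
    """
    busy_until = {}  # bank -> first slot at which the bank is free again
    for slot, packet in enumerate(read_order):
        bank = bank_of[packet]
        if busy_until.get(bank, 0) > slot:
            return packet
        busy_until[bank] = slot + b
    return None


# Hypothetical placement of packets 1..14 (packet -> bank index).
BANK_OF = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1, 7: 2, 8: 2, 9: 2,
           10: 3, 11: 3, 12: 4, 13: 4, 14: 4}

if __name__ == "__main__":
    print(first_conflict([3, 6, 9, 11], BANK_OF, b=3))  # None: four different banks
    print(first_conflict([2, 3], BANK_OF, b=3))         # 3: same bank as packet 2
    print(first_conflict([10, 11], BANK_OF, b=3))       # 11: same bank as packet 10
```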


Use Packet Departure Time

• In a wide class of routers (crossbar routers), packet departures are determined by the scheduler on the fly.
– Packet buffers that cater to these routers exist but are complex.

• In other high-performance routers, such as Switch-Memory-Switch and Load-Balanced Routers, the packet departure time can be calculated when the packet is inserted into the buffer.


Solution

• We will use the known departure times of the packets to schedule them to different DRAM banks such that there won’t be any conflicts at arrival or departure.


Packet Buffer Abstraction

• Fixed-size packets; time is slotted (example: 40 Gb/s with 40-byte packets => 8 ns slots).

• The buffer may contain an arbitrarily large number of logical queues, but with deterministic access.

• Single-write, single-read, time-deterministic packet buffer model.


Packet Buffer Architecture

• Interleaved memory architecture with multiple slower DRAM banks.
– K slower DRAM banks.
– A single memory read or write operation takes b time slots to complete.
– A frame is b consecutive time slots.

• Each bank is segmented into several sections.

• A memory block is a collection of sections.


Proposed Architecture

[Figure: proposed architecture, showing arriving packets, the reservation table (M rows by K columns), DRAM banks D1, D2, …, DK with a memory block spanning the banks, the bypass buffer, the departure reorder buffers, and departing packets]


Reservation Table

[Figure: reservation table with one row per memory block (1, …, i, …) and one column per DRAM bank (1, 2, …, K); each entry is a counter]

• Use a counter of $\log_2 N$ bits to keep track of the actual number of packets in $N$ packet locations.

• This reduces the size of the reservation table by a factor of $N / \log_2 N$.
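The size reduction can be illustrated with a small sketch. It assumes the uncompressed table spends one bit per packet location (N bits per entry), which the slide does not state explicitly, and it follows the slide in using log2 N bits per counter; the function names and the one-bit minimum for N = 1 are assumptions.

```python
import math


def counter_bits(n: int) -> int:
    """Bits per reservation-table entry when one counter covers n locations.

    Follows the slide's log2(n) figure; the minimum of one bit for n = 1
    is an assumption made so the function is defined there.
    """
    return max(1, math.ceil(math.log2(n)))


def reduction_factor(n: int) -> float:
    """Ratio of n bits (one per location, assumed) to the counter width."""
    return n / counter_bits(n)


if __name__ == "__main__":
    for n in (32, 64, 128, 256, 512, 1024):
        print(n, counter_bits(n), reduction_factor(n))
```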


Packet Access Conflicts

• Arrival conflicts
– An arriving packet keeps a bank busy for b cycles.
– Need b-1 additional banks.

• Departure conflicts
– It takes b cycles to read a packet to the output.
– Need b additional banks.

• Overflow conflicts
– Incoming packets with departure times within the same N frames are stored in the same memory block.
– That allows up to N×b arrivals per block, but each memory section stores at most N packets.
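The overflow case can be made concrete with a sketch of the departure-time-to-block mapping. It assumes that a block covers N consecutive frames of b slots each and that the M blocks are reused in circular order; the modulo wrap and the function names are assumptions, not taken from the slides.

```python
def frame_of(slot: int, b: int) -> int:
    """Frame index of a time slot; a frame is b consecutive slots."""
    return slot // b


def block_of(departure_slot: int, n: int, b: int, m: int) -> int:
    """Memory block for a departure slot: n consecutive frames share a block.

    Wrapping modulo m (the number of blocks) is an assumption.
    """
    return (frame_of(departure_slot, b) // n) % m


if __name__ == "__main__":
    n, b, m = 4, 3, 8
    # All n*b = 12 departure slots of the first n frames map to one block,
    # while a single memory section of that block holds only n = 4 packets.
    print({block_of(t, n, b, m) for t in range(n * b)})
    print(block_of(n * b, n, b, m))
```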


Water-filling Algorithm

[Figure: a memory block drawn as a row of memory sections, with labels for busy banks, occupied locations, and the most empty available bank]

• A memory block is managed by a row of the reservation table.


Packet Access Conflicts

• Water-filling algorithm
– Pick the most empty bank to store the arriving packet.
– Solves overflow conflicts.

• Theorem: With at least 3b-1 DRAM banks, it is always possible to admit all arriving packets and write them into memory blocks based on their departure times.
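The bound 3b-1 is consistent with a simple counting argument built from the three conflict types. The following is a sketch under the assumptions that at most b-1 banks are busy with earlier writes, at most b banks are needed for reads, and a block receives at most N×b packets in total; it is not necessarily the authors' proof.

```latex
\begin{align*}
\text{banks unavailable to an arriving packet} &\le (b-1) + b = 2b-1,\\
\text{banks still available} &\ge K - (2b-1),\\
\text{packets already stored in the target block} &\le Nb - 1,\\
K = 3b-1 \;\Rightarrow\; N\bigl(K-(2b-1)\bigr) &= Nb \;>\; Nb-1 .
\end{align*}
```

If every available section of the block were full, the block would already hold at least Nb packets, which contradicts the third line, so at least one available bank has room for the arrival.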


DRAM Selection Logic

[Figure: reservation table R with M rows (s, s+1, …, s+u shown) and K columns (banks 1, 2, …, K); the write candidate vector W is combined with row s+u to give a vector X in which non-candidate banks are set to ∞; the figure is annotated m = 3]


Packet Arrival


• Use the write candidate vector W to check for arrival conflicts and departure conflicts.


• Pick the most empty bank to store the incoming packet
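The selection step can be sketched as a masked minimum over one row of the reservation table: banks ruled out by W are treated as infinitely full, and the bank with the smallest remaining count wins. The function name, the list-based representation, and the example counts are assumptions for illustration only.

```python
import math


def select_bank(row, candidates):
    """Return the index of the emptiest candidate bank, or None if there is none.

    row[k] is the packet count of bank k's section in the target block;
    candidates[k] is 1 if bank k may be written now and 0 otherwise
    (the role played by the write candidate vector W).
    """
    # Non-candidate banks are masked with infinity, as in the vector X.
    masked = [count if ok else math.inf for count, ok in zip(row, candidates)]
    if not masked:
        return None
    best = min(range(len(masked)), key=masked.__getitem__)
    return None if math.isinf(masked[best]) else best


if __name__ == "__main__":
    # Illustrative counts: banks 0 and 2 are candidates, bank 0 is emptier.
    print(select_bank([15, 22, 19, 21], [1, 0, 1, 0]))  # 0
```

Indices here are zero-based, whereas the slides number banks from 1.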


Packet Departure


• Packets in a memory block are moved to one of the departure reorder buffers before their departure times.

• Pick the fullest memory section first upon departure

• It is always possible to read all the packets from a memory section even if the section is full

• All packets are guaranteed to depart on time.


SRAM Bypass Buffer

• The worst case of the minimum round-trip latency for storing a packet to and retrieving it from one of the DRAM banks is (2N+1)×b time slots.

• A bypass buffer stores packets whose departure times are less than (2N+1)×b time slots away.

[Figure: bypass buffer between arriving and departing packets, with a packet locator and a head pointer]
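The routing decision implied by this threshold can be sketched as follows. The function names are hypothetical, and measuring the distance as departure slot minus arrival slot is an assumption about how "time slots away" is counted.

```python
def bypass_threshold(n: int, b: int) -> int:
    """Worst-case DRAM round-trip latency in time slots: (2N+1) x b."""
    return (2 * n + 1) * b


def use_bypass(arrival_slot: int, departure_slot: int, n: int, b: int) -> bool:
    """True if the packet departs too soon to make the DRAM round trip."""
    return departure_slot - arrival_slot < bypass_threshold(n, b)


if __name__ == "__main__":
    n, b = 256, 16
    print(bypass_threshold(n, b))        # (2*256 + 1) * 16 = 8208 slots
    print(use_bypass(0, 8207, n, b))     # True: goes to the SRAM bypass buffer
    print(use_bypass(0, 8208, n, b))     # False: can be stored in DRAM
```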


SRAM Requirement (in MB)

• N is the number of packets represented by one entry in the reservation table. Line rate is 100 Gb/s. All sizes are in MB.

N      reservation table   departure buffers   bypass buffer   TOTAL
1      30                  0.01                0.01            30.01
32     4.69                0.04                0.04            4.77
64     2.82                0.08                0.08            2.97
128    1.65                0.16                0.16            1.96
256    0.94                0.32                0.32            1.57
512    0.53                0.63                0.63            1.78
1024   0.30                1.25                1.26            2.80
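The table shows a trade-off: the reservation table shrinks as N grows, while the departure and bypass buffers grow. A short sketch that picks the N with the smallest total, using only the figures listed on this slide (the variable and function names are hypothetical):

```python
# (N, reservation table, departure buffers, bypass buffer, total), all in MB,
# copied from the figures on this slide.
SRAM_MB = [
    (1, 30.00, 0.01, 0.01, 30.01),
    (32, 4.69, 0.04, 0.04, 4.77),
    (64, 2.82, 0.08, 0.08, 2.97),
    (128, 1.65, 0.16, 0.16, 1.96),
    (256, 0.94, 0.32, 0.32, 1.57),
    (512, 0.53, 0.63, 0.63, 1.78),
    (1024, 0.30, 1.25, 1.26, 2.80),
]


def smallest_total(rows):
    """Return the N whose listed total SRAM requirement is the smallest."""
    return min(rows, key=lambda row: row[4])[0]


if __name__ == "__main__":
    print(smallest_total(SRAM_MB))  # 256, with a listed total of 1.57 MB
```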


SRAM Requirement Comparison

• Line rate is 40 Gb/s. RTT is 250 ms. b = 16. K = 3b-1.

• Average packet size is 40 bytes.

• The total SRAM size in our proposed block-based packet buffer is only 8.3% of the previous frame-based scheme and 1.6% of the state-of-the-art SRAM/DRAM prefetching buffer scheme.

Scheme              Total SRAM
prefetching-based   64 MB
frame-based         12 MB
This paper          1 MB


Conclusion

• Packet buffer architecture for routers with deterministic packet departures, e.g., Switch-Memory-Switch and Load-Balanced Routers.

• SRAM requirement grows logarithmically with the line rate.

• Required number of DRAM banks is a small constant independent of the arrival traffic patterns, the number of flows and the number of priority classes.

• Scalable to growing packet storage requirements in future routers while matching increasing line rates


Thank You for Your Kind Attention