The CMS Event Builder Demonstrator based on Myrinet

The CMS Event Builder Demonstrator based on Myrinet. Frans Meijers, CERN/EP, on behalf of the CMS DAQ group. CHEP 2000, Padova, Italy, Feb 2000. Outline: Introduction; Myrinet Overview; Tests of the Switching Fabric; Event Building Studies; Future Work and Conclusions.

Page 1: The CMS Event Builder Demonstrator  based on Myrinet

1 The CMS Event Builder Demonstrator based on Myrinet Frans Meijers. CHEP 2000, Padova Italy, Feb 2000

The CMS Event Builder Demonstrator based on Myrinet

• Introduction
• Myrinet Overview
• Tests of the Switching Fabric
• Event Building Studies
• Future Work and Conclusions

Frans Meijers, CERN/EP, on behalf of the CMS DAQ group

CHEP2000, Padova Italy, Feb 2000

Page 2: The CMS Event Builder Demonstrator  based on Myrinet

Introduction

• DAQ architecture and EVB parameters
• Event building by switches: crossbar
• EVB traffic shaping: barrel shifter
• Banyan network
• A multistage 1024 port switch
• The CMS DAQ system

Page 3: The CMS Event Builder Demonstrator  based on Myrinet

DAQ architecture and EVB parameters

[Figure: DAQ architecture block diagram. Blocks: Level 1 Trigger, Detector Front-end, Readout Systems, Event Manager, Builder Networks, Builder and Filter Systems, Run Control, Computing Services]

• Level-1 maximum trigger rate: 100 kHz
• Average event size: 1 Mbyte
• Builder network (512x512 port) aggregate throughput: 1 Tbps
• Number of Readout Units: 512
• Average event fragment size: 2 kbyte
• High Level Trigger acceptance: 1 - 10 %
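As a quick consistency check on the parameters of this slide, the sketch below recomputes the aggregate throughput and the average fragment size from the trigger rate, the event size and the number of Readout Units. It assumes decimal units (1 Mbyte = 10^6 bytes), which the slide does not state, and the function names are illustrative only.

```python
# Consistency check of the EVB parameters quoted on this slide.
# Assumption: decimal units (1 Mbyte = 1e6 bytes); the slide does not say.

LEVEL1_RATE_HZ = 100e3    # Level-1 maximum trigger rate
EVENT_SIZE_BYTES = 1e6    # average event size
N_READOUT_UNITS = 512     # number of Readout Units


def aggregate_throughput_tbps(rate_hz, event_size_bytes):
    """Aggregate builder-network throughput in Tbit/s."""
    return rate_hz * event_size_bytes * 8 / 1e12


def fragment_size_bytes(event_size_bytes, n_units):
    """Average fragment size if an event is spread evenly over the RUs."""
    return event_size_bytes / n_units


if __name__ == "__main__":
    print(aggregate_throughput_tbps(LEVEL1_RATE_HZ, EVENT_SIZE_BYTES), "Tbps")
    print(fragment_size_bytes(EVENT_SIZE_BYTES, N_READOUT_UNITS), "bytes")
```

Under these assumptions the product is 0.8 Tbps, of the same order as the quoted 1 Tbps, and the fragment size is just under 2 kbyte.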

Page 4: The CMS Event Builder Demonstrator  based on Myrinet

Event building by switches. Crossbar

The maximum switch load for random traffic is about 63% (large N limit) due to head-of-line blocking

Higher efficiency:
• queues at input and/or output ports
• traffic shaping (example: barrel shifter, 100%)

NxN matrix: N^2 crosspoints
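One simple model that reproduces a limit of about 63% is a slotted crossbar in which every input offers one packet per slot to a uniformly random output and each output accepts one packet. The fraction of busy outputs is then 1 - (1 - 1/N)^N, which tends to 1 - 1/e for large N. Whether this is the model behind the slide is an assumption; the sketch compares the closed form with a small Monte Carlo estimate.

```python
import random


def crossbar_load_closed_form(n):
    """Probability that a given output is addressed by at least one input."""
    return 1.0 - (1.0 - 1.0 / n) ** n


def crossbar_load_monte_carlo(n, trials=20000, seed=1):
    """Fraction of outputs that receive a packet, averaged over many slots."""
    rng = random.Random(seed)
    busy = 0
    for _ in range(trials):
        # Every input picks one destination uniformly at random;
        # the set keeps one entry per addressed output.
        busy += len({rng.randrange(n) for _ in range(n)})
    return busy / (n * trials)


if __name__ == "__main__":
    for n in (4, 16, 1024):
        print(n, round(crossbar_load_closed_form(n), 3))
```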

Page 5: The CMS Event Builder Demonstrator  based on Myrinet

EVB traffic shaping: barrel shifter

Sources emit to mutually exclusive destinations in a cycle:
• works only for fixed size chunks
• needs synchronisation

[Figure: barrel shifter, event fragments rotating through the destinations over Step 1 to Step 4]
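The barrel shifter idea can be sketched in a few lines: at step t, source s sends to destination (s + t) mod N, so every step is a permutation and no two sources collide. The indexing convention below is an illustrative assumption, not taken from the slide.

```python
def barrel_shifter_schedule(n_nodes, n_steps):
    """Return, for each step, the destination addressed by every source."""
    return [[(src + step) % n_nodes for src in range(n_nodes)]
            for step in range(n_steps)]


def is_conflict_free(destinations):
    """True when no two sources address the same destination."""
    return len(set(destinations)) == len(destinations)


if __name__ == "__main__":
    for step, dests in enumerate(barrel_shifter_schedule(4, 4), start=1):
        print("Step", step, dests)
```

After N steps each source has visited every destination exactly once, which is why the scheme needs fixed size chunks and a common notion of the current step.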

Page 6: The CMS Event Builder Demonstrator  based on Myrinet

Banyan network

Example: 8x8 made of 3 stages of 2x2 elements (8 = 2^3)

• single path per connection
• suffers from internal blocking
• number of cross points: N log2 N

• For random traffic (no intermediate IQ and no OQ): efficiency drops with s and N; for "infinite" N, efficiency 20%
• There exists a non-blocking barrel-shifting pattern

[Figure: 8x8 Banyan network connecting sources s0 to s7 with destinations d0 to d7]
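The claim that a barrel-shifting pattern passes a Banyan network without internal blocking can be checked on a small model. The sketch below assumes an Omega-type wiring (perfect shuffle before each stage of 2x2 elements, destination-tag routing), which is one member of the Banyan family; the wiring of the network in the slide's figure is not recoverable from the transcript, so this is an illustration only.

```python
def link_after_stage(src, dst, stage, n_bits):
    """Link occupied after `stage` (1-based) under destination-tag routing."""
    mask = (1 << n_bits) - 1
    # Each stage shifts the source address left and appends one
    # destination bit, most significant bit first.
    return ((src << stage) | (dst >> (n_bits - stage))) & mask


def passes_without_blocking(permutation):
    """True if no two connections share a link at any stage."""
    n = len(permutation)
    n_bits = n.bit_length() - 1  # n must be a power of two
    for stage in range(1, n_bits + 1):
        links = {link_after_stage(s, d, stage, n_bits)
                 for s, d in enumerate(permutation)}
        if len(links) != n:
            return False
    return True


def barrel_shift(n, shift):
    """Permutation in which source s sends to (s + shift) mod n."""
    return [(s + shift) % n for s in range(n)]


if __name__ == "__main__":
    print(all(passes_without_blocking(barrel_shift(8, c)) for c in range(8)))
    # Bit reversal is a permutation that does block in this model.
    print(passes_without_blocking([0, 4, 2, 6, 1, 5, 3, 7]))
```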

Page 7: The CMS Event Builder Demonstrator  based on Myrinet

A multistage 1024 port switch

Banyan topology: NxN out of nxn crossbars, N = n^s

• basic unit: 8x8 crossbars
• 3 stages: 512x512
• need 192 crossbars in total

Important to study multistage switches
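The crossbar count follows from the topology: an NxN Banyan built from nxn crossbars with N = n^s has s stages of N/n crossbars each. A minimal sketch (the function name is illustrative):

```python
def crossbars_needed(n_ports, radix):
    """Return (crossbars, stages) for an N x N Banyan with N = radix**stages."""
    stages = 0
    size = 1
    while size < n_ports:
        size *= radix
        stages += 1
    if size != n_ports:
        raise ValueError("n_ports must be a power of radix")
    return stages * (n_ports // radix), stages


if __name__ == "__main__":
    print(crossbars_needed(512, 8))  # the 512x512 case from the slide
    print(crossbars_needed(16, 4))   # a 16x16 network out of 4x4 crossbars
```

For 512x512 out of 8x8 this gives 3 stages and 192 crossbars, matching the slide; for 16x16 out of 4x4 it gives 2 stages and 8 crossbars.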

Page 8: The CMS Event Builder Demonstrator  based on Myrinet

The CMS DAQ system

[Figure: the CMS DAQ system. Labels: Detector front-end readout, LV1, RU, EVM, Ctrl, FU, Computing and Communication Services]

Page 9: The CMS Event Builder Demonstrator  based on Myrinet

Myrinet overview

• Myrinet features
• Myrinet switches
• Network Interface Card

Page 10: The CMS Event Builder Demonstrator  based on Myrinet

Myrinet features

Myrinet is a System Area Network (SAN):
• point-to-point links, byte wide, full duplex, 1.3 Gbps per direction, very low error rate
• packet structure: routing header, payload and tail; each crossbar switch strips the leading byte from the routing header
• wormhole routing (versus store-and-forward): no buffering, low latency, arbitrary length packets
• byte-based flow control (STOP/GO): no packet loss inside the switching fabric
• 3Q 2000: link speed from 1.3 Gbps to 2.6 Gbps

[Figure: packet layout with ROUTING HEADER, PAYLOAD and CRC; STOP/GO flow control signals]
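The header stripping described on this slide is a form of source routing: the sender prepends one routing byte per switch on the path, and each switch consumes the leading byte to select its output port. The toy model below treats each routing byte as an absolute output port number, which is a simplifying assumption; the actual Myrinet byte encoding and the trailing CRC are not modelled.

```python
def build_packet(output_ports, payload):
    """Prepend one routing byte per switch hop (hypothetical encoding)."""
    return bytes(output_ports) + payload


def route(packet, n_switches):
    """Follow a packet through n_switches; each strips the leading byte."""
    hops = []
    remaining = packet
    for _ in range(n_switches):
        out_port, remaining = remaining[0], remaining[1:]
        hops.append(out_port)
    return hops, remaining


if __name__ == "__main__":
    pkt = build_packet([2, 3], b"event fragment")
    print(route(pkt, 2))
```

What arrives at the destination is the payload alone, since every routing byte has been consumed on the way.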

Page 11: The CMS Event Builder Demonstrator  based on Myrinet

Myrinet switches

M2M-OCT-SW8:
• 32 ports
• 8 times 4x4 crossbars

[Figure: M2M-OCT-SW8 internal layout, crossbars numbered 0 to 7]

• Large switch fabric built out of 4x4 crossbar elements
• now 8x8 crossbar available as basic element

Page 12: The CMS Event Builder Demonstrator  based on Myrinet

Network interface card

[Figure: block diagram of the M2M-PCI64 NIC. Labels: LANai7 (RISC, host DMA, Send DMA, Recv DMA, Pkt Interface, 66 MHz); Memory, 2 MByte, Address and Data, 64 (66 MHz); PCI Bridge, 32 or 64 (33 or 66 MHz); Myrinet SAN link, 8 (80 MHz, NRZ)]

Developed a custom Myrinet Control Program:
• controls DMA engines
• implements low-level communication protocol

Page 13: The CMS Event Builder Demonstrator  based on Myrinet

Switch tests

• Set-up for switch test
• Traffic conditions tested
• Point-to-point 1x1
• Parameters point-to-point 1x1
• Point-to-Point NxN - Mutually exclusive paths
• Block on output port
• Block on internal switch
• Random Traffic

Page 14: The CMS Event Builder Demonstrator  based on Myrinet

Demonstrator set-up for switch tests

• 32 nodes, Linux PCs
• PC: 450 MHz PII BX, PCI 33 MHz/32bit
• Myrinet switch: M2M-OCT-SW8, NIC: M2M-PCI64[A]
• two-stage Banyan network out of 4x4 crossbars

[Figure: two-stage Banyan network between sources and destinations]

Page 15: The CMS Event Builder Demonstrator  based on Myrinet

Traffic conditions tested

[Figure: traffic patterns between nodes numbered 1 to 32. Panels: Point-to-point traffic (fixed destinations); Random traffic]

Page 16: The CMS Event Builder Demonstrator  based on Myrinet

Point-to-point 1x1

Full host - NIC DMA:
• limited by PCI (33 MHz/32bit)

Partial host - NIC DMA:
• NIC memory - link: full packet
• host - NIC: only headers
• limited by SAN link
• allows the switch to be loaded to its maximum

[Figure: plot with curves labelled PCI and link]

Page 17: The CMS Event Builder Demonstrator  based on Myrinet

Parameters point-to-point 1x1

time per packet = overhead + size / speed

Partial host - NIC DMA:
• speed: 141 Mbyte/s -> 92% link efficiency

Full host - NIC DMA:
• speed: 128 Mbyte/s -> PCI speed

• above 1 kbyte: linear behaviour
• below 1 kbyte: plateau at 5 µs (NIC-host communication)
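The linear model quoted on this slide, time per packet = overhead + size / speed, can be explored numerically. The sketch assumes decimal units (1 Mbyte/s = 1 byte/µs) and, for the example call, uses the 5 µs plateau as the overhead term; both are assumptions rather than fit results from the slide.

```python
def packet_time_us(size_bytes, overhead_us, speed_mbyte_s):
    """Time per packet in microseconds: overhead + size / speed."""
    # With decimal units, 1 Mbyte/s equals 1 byte per microsecond.
    return overhead_us + size_bytes / speed_mbyte_s


def throughput_mbyte_s(size_bytes, overhead_us, speed_mbyte_s):
    """Effective throughput in Mbyte/s for back-to-back packets."""
    return size_bytes / packet_time_us(size_bytes, overhead_us, speed_mbyte_s)


if __name__ == "__main__":
    for size in (256, 1024, 4096, 16384):
        print(size, round(throughput_mbyte_s(size, 5.0, 141.0), 1))
```

The effective throughput approaches the asymptotic speed only for packets large enough that the fixed overhead becomes negligible.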

Page 18: The CMS Event Builder Demonstrator  based on Myrinet

Point-to-point NxN - Mutually exclusive paths

Destination of each source: d = 4*(s%4) + s/4, s = 0-15 (integer division)

As expected: aggregate throughput through the switch is linear in N

[Figure: throughput curves for 1x1, 4x4, 8x8 and 16x16]
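The source-to-destination rule d = 4*(s%4) + s/4 can be checked to be a permutation of the 16 ports, so no two sources share a destination. Reading s/4 as integer division is an assumption, though the only reading that yields integer port numbers.

```python
def destination(s):
    """Destination port for source s, with s/4 read as integer division."""
    return 4 * (s % 4) + s // 4


if __name__ == "__main__":
    mapping = [destination(s) for s in range(16)]
    print(mapping)
    print(sorted(mapping) == list(range(16)))
```

Writing s = 4a + b, the rule sends it to 4b + a, i.e. it transposes a 4x4 grid of port numbers, and applying it twice returns the original source index.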

Page 19: The CMS Event Builder Demonstrator  based on Myrinet

Block on output port

Force m (= 1, 2, 3, 4) sources on the same destination: each source gets 1/m of Vmax

[Figure: throughput measured at source #0, curves labelled 1, 2, 3 and 4]

Page 20: The CMS Event Builder Demonstrator  based on Myrinet

Block on internal switch

Force 2 sources on different destinations, but through the same intermediate path.

As expected: plateau at Vmax/2

[Figure: throughput measured at source #0]

Page 21: The CMS Event Builder Demonstrator  based on Myrinet

Random traffic

Sources send, independently, to a random destination according to a uniform distribution.

Efficiency (measured at destinations):
• 4x4: 69% (expect 68%)
• 16x16: 51%
• limited by head-of-line blocking

[Figure: throughput curves for 1x1, 4x4 and 16x16]
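The 68% expectation for a single 4x4 crossbar agrees with the unbuffered estimate 1 - (1 - 1/n)^n. Chaining that estimate stage by stage, with the output load of one stage taken as the input load of the next, is a textbook approximation for unbuffered multistage networks. It is not stated on the slide and is offered here only as a rough cross-check of the 16x16 measurement.

```python
def stage_output_load(p_in, n):
    """Output load of an n x n crossbar whose inputs carry load p_in."""
    return 1.0 - (1.0 - p_in / n) ** n


def network_efficiency(n, stages, p_in=1.0):
    """Load surviving `stages` successive stages of n x n crossbars."""
    p = p_in
    for _ in range(stages):
        p = stage_output_load(p, n)
    return p


if __name__ == "__main__":
    print(round(network_efficiency(4, 1), 3))  # single 4x4 crossbar
    print(round(network_efficiency(4, 2), 3))  # two stages, 16x16
```

For two stages of 4x4 crossbars the estimate is roughly 53%, in the neighbourhood of the measured 51%.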

Page 22: The CMS Event Builder Demonstrator  based on Myrinet

Event building studies

• EVB demonstrator set-up
• Event building protocol
• Variable size event fragments
• Event building performance
• Event building: scaling behaviour
• Traffic shaping
• EVB performance with traffic shaping
• Performance for variable size event fragments
• EVB with traffic shaping: scaling behaviour
• Traffic shaping: time evolution

Page 23: The CMS Event Builder Demonstrator  based on Myrinet

EVB demonstrator set-up

• 32+1 Linux PCs [450 MHz PII BX, PCI 33 MHz/32bit]
• Myrinet switch: M2M-OCT-SW8, NIC: M2M-PCI64[A]
• 16x16 two-stage Banyan network out of 4x4 crossbars
• Myrinet between RUs and BUs (full duplex), N-to-N traffic
• Fast Ethernet between BUs and EVM, N-to-1 traffic
• No emulation of Level-1 trigger

[Figure: demonstrator layout with the EVM, PCs emulating RUs and PCs emulating BUs]

Page 24: The CMS Event Builder Demonstrator  based on Myrinet

Event building protocol

Message sequence between BU, EVM and RU:
• Request EvtId
• EvtId
• Request Data
• Send Data
• Clear EvtId

Several EvtId messages are grouped in a single Ethernet packet.

[Figure: level 1, EVM, RU and BU connected through the Builder Network (Myrinet)]
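The message names on this slide suggest a pull protocol driven by the Builder Unit. The sketch below plays one event through that sequence in a single process. The direction of each message (BU asks the EVM for an event identifier, pulls the fragments from every RU, then tells the EVM to clear the identifier) is an assumption inferred from the message names, and all class and method names are hypothetical.

```python
from collections import deque


class EventManager:
    """Hands out event identifiers and tracks which ones are in use."""

    def __init__(self, event_ids):
        self.free = deque(event_ids)
        self.open = set()

    def request_evt_id(self):       # "Request EvtId" answered by "EvtId"
        evt = self.free.popleft()
        self.open.add(evt)
        return evt

    def clear_evt_id(self, evt):    # "Clear EvtId"
        self.open.remove(evt)


class ReadoutUnit:
    def __init__(self, ru_id):
        self.ru_id = ru_id

    def request_data(self, evt):    # "Request Data" answered by "Send Data"
        return (self.ru_id, evt, f"fragment-{self.ru_id}-{evt}")


class BuilderUnit:
    def __init__(self, evm, rus):
        self.evm = evm
        self.rus = rus

    def build_one(self):
        """Assemble one full event from the fragments of all RUs."""
        evt = self.evm.request_evt_id()
        fragments = [ru.request_data(evt) for ru in self.rus]
        self.evm.clear_evt_id(evt)
        return evt, fragments


if __name__ == "__main__":
    evm = EventManager(range(3))
    bu = BuilderUnit(evm, [ReadoutUnit(i) for i in range(4)])
    print(bu.build_one())
```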

Page 25: The CMS Event Builder Demonstrator  based on Myrinet

Variable size event fragments

Log-normal distribution:
• example: Average = 2 kbyte, RMS = 2 kbyte
• mimics CMS data readout

[Figure: Readout Units feeding Builder Units through the EVB]
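Fragment sizes with a given average and RMS can be drawn from a log-normal distribution by converting the two moments into the parameters of the underlying normal. The sketch assumes that "RMS" on the slide means the standard deviation and that 2 kbyte is 2048 bytes; neither is stated in the source.

```python
import math
import random


def lognormal_params(mean, rms):
    """Return (mu, sigma) of the underlying normal for a given mean and RMS."""
    sigma2 = math.log(1.0 + (rms / mean) ** 2)
    mu = math.log(mean) - sigma2 / 2.0
    return mu, math.sqrt(sigma2)


def sample_fragment_sizes(n, mean=2048.0, rms=2048.0, seed=0):
    """Draw n fragment sizes in bytes from the log-normal distribution."""
    rng = random.Random(seed)
    mu, sigma = lognormal_params(mean, rms)
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]


if __name__ == "__main__":
    sizes = sample_fragment_sizes(100000)
    print(sum(sizes) / len(sizes))
```

With equal average and RMS the variance of the underlying normal is ln 2, so the distribution has a long tail of large fragments.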

Page 26: The CMS Event Builder Demonstrator  based on Myrinet

Event building performance

Conditions:
• No traffic shaping
• Fixed size event fragments

Fragment rate per node†, 16x16: for 2 kbyte fragments, 30 kHz

Results:
• 1x1 is close to point-to-point
• Performance decreases from 4x4 to 8x8 to 16x16, as expected
• From small sizes: overhead 7 µs

† Fragment rate per node = level-1 rate

[Figure: curves for 1x1, 4x4, 8x8 and 16x16, with a marker at 2k and a region labelled "unstable"]

Page 27: The CMS Event Builder Demonstrator  based on Myrinet

Event building - scaling behaviour

• take average fragment size of 2 kbyte
• also variable size fragments

Results:
• For variable size fragments: reduced performance, as expected
• No scaling in N

Need simulation for large N

[Figure: scaling plot, with a "?" marking the large N region]

Page 28: The CMS Event Builder Demonstrator  based on Myrinet

Traffic shaping

• Sources divide fragments into fixed size packets (blocks) and cycle through all destinations
• Inspired by ATM rate division (block size is 53 bytes)
• Should work for large N multistage switch as well

Implementation:
• Performed by NIC control program
• Block size set to 4 kbyte (30 µs cycle)
• Barrel shifter without external synchronisation (Myrinet back pressure by HW flow control)
• Packets can be (partially) empty

[Figure: barrel shifter between RU0 to RU3 and BU0 to BU3]
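The packetisation described on this slide can be sketched for a single source: at every step it sends exactly one fixed size block to the next destination in the cycle and fills it with whatever data is pending for that destination, leaving the block partially or completely empty otherwise. The data layout (a byte count per destination) and the function name are illustrative assumptions; the real logic lives in the NIC control program and is not shown in the slides.

```python
BLOCK_SIZE = 4096  # block size quoted on the slide, taken here as bytes


def shaped_blocks(ru_index, pending_bytes, n_steps, block_size=BLOCK_SIZE):
    """Return (step, destination, payload_bytes) for one source.

    pending_bytes[d] is the amount of data waiting for destination d.
    """
    n = len(pending_bytes)
    pending = list(pending_bytes)
    blocks = []
    for step in range(n_steps):
        dest = (ru_index + step) % n          # barrel shifter rotation
        used = min(block_size, pending[dest]) # block may be partly empty
        pending[dest] -= used
        blocks.append((step, dest, used))
    return blocks


if __name__ == "__main__":
    for block in shaped_blocks(1, [0, 6000, 0, 100], 8):
        print(block)
```

A 6000 byte fragment for one destination therefore leaves in two blocks that are one full cycle apart, which is the price paid for keeping every source on mutually exclusive destinations.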

Page 29: The CMS Event Builder Demonstrator  based on Myrinet

EVB performance with traffic shaping

• fixed size event fragments

Results:
• close to point-to-point

Fragment rate per node, 16x16: for 2 kbyte fragments, 65 kHz

[Figure: performance plot with markers at 2k and 4k]

Page 30: The CMS Event Builder Demonstrator  based on Myrinet

Performance for variable size event fragments

Decrease of efficiency with larger RMS of fragment size distribution (in agreement with Monte Carlo)

Fragment rate per node for nominal average of 2k and RMS 2k†: 60 kHz

† with full host-NIC DMA about 80 Mbyte/s or 40 kHz

[Figure: performance plot with a marker at 2k]

Page 31: The CMS Event Builder Demonstrator  based on Myrinet

EVB traffic shaping - scaling behaviour

With traffic shaping: approximate scaling

[Figure: EVB scaling plot]

Page 32: The CMS Event Builder Demonstrator  based on Myrinet

Traffic shaping - time evolution (I)

[Figure: BS cycling rate * block size versus time, over 2 hours (= 2 x 10^8 cycles, 10 Tbyte moved)]

At 23:00 (marked "?" on the plot):
• throughput dropped
• traffic shaping barrel shifter stayed in sync

Page 33: The CMS Event Builder Demonstrator  based on Myrinet

Traffic shaping - time evolution (II)

[Figure: BS cycling rate * block size versus time, over 1 hour (= 10^8 cycles), with perturbations 1 and 2 marked; inset showing EVM, RU and BU]

Perturb system:
1. slow down RU1: all BUs run at reduced rate
2. slow down BU1: only BU1 runs at reduced rate

Traffic shaping barrel shifter stays in sync

Page 34: The CMS Event Builder Demonstrator  based on Myrinet

Future work and conclusions

• Future work
• Conclusions

Page 35: The CMS Event Builder Demonstrator  based on Myrinet

Future work

Evaluate Myrinet 2000:
• available 3Q 2000
• link speed from 1.3 Gbps to 2.6 Gbps
• switches based on 8x8 crossbars as elementary units

• Further study of traffic shaping
• Simulation
• Extrapolate to large systems

Page 36: The CMS Event Builder Demonstrator  based on Myrinet

Conclusions

• Event builder demonstrator 16x16 based on Myrinet multistage switch and Linux PCs established.
• Performed systematic switch studies; results as expected.
• Measured event building performance:
  • without traffic shaping: no scaling, as expected
  • with traffic shaping: approximate scaling
• For nominal event fragment sizes with average and RMS of 2 kbyte, achieved about 60 kHz trigger rate or 120 Mbyte/s per node (almost 2 Gbyte/s aggregate).
• That is, today, a factor two off from CMS needs, assuming scaling.
• Measurements provide parameters for simulation of large scale (512x512) systems.
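The headline numbers in these conclusions follow from simple arithmetic, sketched below. The 100 kHz target is the Level-1 maximum trigger rate from the introduction; treating kbyte and Mbyte as decimal multiples is an assumption.

```python
N_NODES = 16             # size of the demonstrator (16x16)
FRAGMENT_KBYTE = 2       # nominal average fragment size
ACHIEVED_RATE_KHZ = 60   # measured with traffic shaping
TARGET_RATE_KHZ = 100    # Level-1 maximum trigger rate


def throughput_mbyte_s(rate_khz, fragment_kbyte):
    """Per-node throughput: kHz times kbyte gives Mbyte/s."""
    return rate_khz * fragment_kbyte


if __name__ == "__main__":
    per_node = throughput_mbyte_s(ACHIEVED_RATE_KHZ, FRAGMENT_KBYTE)
    needed = throughput_mbyte_s(TARGET_RATE_KHZ, FRAGMENT_KBYTE)
    print(per_node, "Mbyte/s per node")
    print(per_node * N_NODES, "Mbyte/s aggregate")
    print(needed / per_node, "times more needed per node")
```

This gives 120 Mbyte/s per node and 1920 Mbyte/s aggregate; the remaining gap to 200 Mbyte/s per node is a factor of about 1.7, which the slide rounds to two.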

Page 37: The CMS Event Builder Demonstrator  based on Myrinet

Extra Material

Page 38: The CMS Event Builder Demonstrator  based on Myrinet

Multi-step Event Building

Step 1: at 100 kHz
• rejection factor 10 with 0.25 of the data, from the High Level Trigger

Step 2: at 10 kHz
• remaining 0.75 of the data

Throughput reduced to 0.25 + 0.1 x 0.75 = 0.33 of the single-step value, i.e. a factor 3, at the cost of control complexity and increased latency.

• With a link speed of 1 Gbps, need a factor 2 from multi-step event building for 100 kHz level-1 rate (assuming a 100% efficient switch)
• If higher speed links become available in 2003-2004, then single-step event builder

[Figure: two-step event building, first step at 100 kHz and second step at 10 kHz]
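The throughput reduction from two-step building can be written as a small formula: the first step moves a share f of every event, and the second step moves the remaining 1 - f only for the events that survive the rejection. The link load estimate uses the 2 kbyte average fragment size from the introduction and decimal units, both of which are assumptions about how the "factor 2" on the slide was obtained.

```python
def multi_step_fraction(first_step_share, rejection_factor):
    """Data volume moved, relative to single-step event building."""
    return first_step_share + (1.0 - first_step_share) / rejection_factor


def link_load_gbps(rate_khz, fragment_kbyte):
    """Per-link load in Gbit/s: kHz * kbyte = Mbyte/s, then * 8 / 1000."""
    return rate_khz * fragment_kbyte * 8 / 1000.0


if __name__ == "__main__":
    fraction = multi_step_fraction(0.25, 10)
    print(fraction, "of the single-step volume")
    print(link_load_gbps(100, 2), "Gbps per link for single-step building")
```

The fraction evaluates to 0.325, which the slide rounds to 0.33, and the single-step link load of 1.6 Gbps exceeds a 1 Gbps link by somewhat less than a factor two.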