Implementation of the STAR Data Acquisition System using a Myrinet Network J.M. Landgraf, M.J. LeVine, A. Ljubicic, Jr., M.W. Schulz (Brookhaven National Laboratory) J.M. Nelson (University of Birmingham) C. Adler, J.S. Lange (University of Frankfurt)

Post on 04-Feb-2016


Page 1: Implementation of the STAR Data Acquisition System using a Myrinet Network

Implementation of the STAR Data Acquisition System using a Myrinet Network

J.M. Landgraf, M.J. LeVine, A. Ljubicic, Jr., M.W. Schulz (Brookhaven National Laboratory)

J.M. Nelson (University of Birmingham)

C. Adler, J.S. Lange (University of Frankfurt)

Page 2: Implementation of the STAR Data Acquisition System using a Myrinet Network

First Collisions at RHIC!

STAR Control Room, June 12th, 2000, 9:00 pm

Page 3: Implementation of the STAR Data Acquisition System using a Myrinet Network

Outline

• The STAR DAQ System
  – Components
  – Event Building Network
• Introduction to Myrinet
• Myrinet Implementation
  – Myrinet Software (GM)
  – STAR DAQ Software
  – myriLib
• Year 2 Event Builder
• Performance & Reliability

Page 4: Implementation of the STAR Data Acquisition System using a Myrinet Network

STAR DAQ

DAQ Readout Units
– VME Crate-Based
– Custom RBs with ASICs & i960 CPUs
– Motorola MVME Detector Broker

L3
– Linux Farm (Compaq Alpha workstations)
– Physics-based build decision

Event Building Network
– Token Management
– Event Building
– Event Storage and Buffering

Page 5: Implementation of the STAR Data Acquisition System using a Myrinet Network

DAQ / L3 Event Building Network

Squares: MVME / VxWorks
Circles: Alpha Workstations / Linux
Diamonds: Ultrasparc Workstations / Solaris

Page 6: Implementation of the STAR Data Acquisition System using a Myrinet Network

What is Myrinet?

• Commercial network from Myricom (www.myri.com)
• Low cost (~$1K / Card, $4-6K / Switch)
• PCI / PMC Network Interface Cards
• High bandwidth (1.28 + 1.28 Gb/sec)
• Low latency (13 usec)
• Scalable switched topology
• Network control performed in software
• Open-source MCP / Driver software
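The bandwidth and latency figures above can be combined into a rough first-order cost estimate. The sketch below assumes a simple linear model (fixed latency plus size divided by the one-direction link rate) and ignores host bus and software overhead; the function names are hypothetical and are not part of any Myrinet software.

```c
#include <assert.h>

/* Figures quoted on the slide; the linear model itself is an assumption. */
#define LINK_BITS_PER_SEC 1.28e9   /* one direction of the full-duplex link */
#define LATENCY_SEC       13e-6    /* quoted latency */

/* Estimated one-way transfer time for a message of `bytes` bytes. */
double est_transfer_sec(double bytes)
{
    assert(bytes >= 0.0);
    return LATENCY_SEC + (bytes * 8.0) / LINK_BITS_PER_SEC;
}

/* Effective throughput in MB/s (1 MB = 1e6 bytes) under the same model. */
double est_throughput_mb_per_sec(double bytes)
{
    assert(bytes > 0.0);
    return bytes / est_transfer_sec(bytes) / 1e6;
}
```

Under this model a 120-byte message is dominated by latency (under 9 MB/s effective), while a 1 MB transfer approaches the 160 MB/s link limit. Real throughput also depends on the host and interface card.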

Page 7: Implementation of the STAR Data Acquisition System using a Myrinet Network

Myrinet Architecture

• Network Card Interface (PCI64B)
  – Lanai processor controls network
  – Local memory buffer
  – Both network & PCI DMA engines

• Switches
  – Cut-through wormhole routing
  – CRC is recalculated at each stage, including header
  – Stop/Go flow control mediated with small slack buffer

Page 8: Implementation of the STAR Data Acquisition System using a Myrinet Network

Myrinet Throughput

We Tested:
– 32 / 64 bit Myrinet cards
– VxWorks: MVME 2604, MVME 2306
– Linux: Compaq Alpha
– Linux: Intel
– Solaris: Ultrasparc

Page 9: Implementation of the STAR Data Acquisition System using a Myrinet Network

Myrinet Software

Network mapping
• Each Myrinet node maintains a list of port offsets to each other node
• Dynamic and static mapping supported
• Alternate routes can be forced by user
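A list of port offsets per destination amounts to a source route: the sender attaches the offsets and each switch on the path uses one of them. The sketch below assumes each switch applies the leading offset relative to the arrival port and then strips it. The `route_t` type and `switch_forward()` are hypothetical illustrations, not the data structures of the Myrinet mapper or GM.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HOPS 8

/* Hypothetical per-destination route: one signed port offset per switch hop. */
typedef struct {
    signed char offset[MAX_HOPS];
    size_t      hops;
} route_t;

/* A switch applies the leading offset relative to the port the packet
 * arrived on, strips that entry, and returns the output port. */
int switch_forward(route_t *r, int in_port, int nports)
{
    assert(r != NULL && r->hops > 0);
    int out_port = in_port + r->offset[0];
    assert(out_port >= 0 && out_port < nports);
    for (size_t i = 1; i < r->hops; i++)
        r->offset[i - 1] = r->offset[i];   /* shift remaining hops forward */
    r->hops--;
    return out_port;
}
```

Forcing an alternate route, as the slide mentions, would in this picture just mean storing a different offset list for the same destination.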

Myrinet driver (GM)
• Variable length messages
  – Sender / Receiver provide buffers in advance for each size
  – Sender / Receiver notified and must return buffer to gm
• Directed Sends
  – DMA directly to host memory
  – Receiver not notified
• GM imposes structure on user program
  – Poll / Block on gm_receive()
  – GM is not thread-safe
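The "buffers in advance for each size" rule can be pictured as a pool of posted buffers per size class. The sketch below is a self-contained simulation that does not call GM. The power-of-two size classes and the names `buf_pool_t`, `size_class_for()`, `pool_provide()` and `pool_receive()` are assumptions made for illustration, and GM's actual mapping of message lengths to sizes may differ.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SIZE_CLASS 20

/* Number of buffers currently posted per size class (hypothetical bookkeeping). */
typedef struct {
    int posted[MAX_SIZE_CLASS + 1];
} buf_pool_t;

/* Smallest class s with 2^s >= len. Assumed mapping, for illustration only. */
int size_class_for(size_t len)
{
    int s = 0;
    while (s <= MAX_SIZE_CLASS && ((size_t)1 << s) < len)
        s++;
    assert(s <= MAX_SIZE_CLASS);
    return s;
}

/* Post (or return) one buffer of class s to the pool. */
void pool_provide(buf_pool_t *p, int s)
{
    assert(p != NULL && s >= 0 && s <= MAX_SIZE_CLASS);
    p->posted[s]++;
}

/* A message of len bytes arrives: take a posted buffer of its class.
 * Returns the class used, or -1 if none was provided in advance. */
int pool_receive(buf_pool_t *p, size_t len)
{
    assert(p != NULL);
    int s = size_class_for(len);
    if (p->posted[s] == 0)
        return -1;
    p->posted[s]--;
    return s;
}
```

In this picture, a receiver that forgets to return a buffer after handling a message eventually has none posted for that size, which is why the slide stresses that buffers must be handed back.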

Page 10: Implementation of the STAR Data Acquisition System using a Myrinet Network

DAQ Software

Software is Message Based

for (;;) {
    msgQReceive(&msg);      /* wait for the next message */
    switch (msg.cmd) {      /* handle each command */
    }
}

Sending is routed to the proper network

Each network has an associated daemon

daqMsgSend(node, &msg)

Routing by node/task/domain:
– Local Queue: que[task]
– Myrinet: myriLib
– Ethernet: ethComLib
– VME: vmComLib

ICCP Message Protocol
• 120 byte messages
• Standard header
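The routing step can be sketched as a lookup from destination node to transport. Everything specific in the sketch below is an assumption: the slide gives only the 120-byte message size and the existence of a standard header, so the header fields, the node-number ranges and the function names are invented for illustration and are not the STAR DAQ definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MSG_BYTES 120

/* Hypothetical header layout. */
typedef struct {
    uint16_t dst_node;
    uint8_t  dst_task;
    uint8_t  cmd;
} msg_hdr_t;

/* Fixed-size message: header plus payload fills MSG_BYTES. */
typedef struct {
    msg_hdr_t hdr;
    uint8_t   payload[MSG_BYTES - sizeof(msg_hdr_t)];
} daq_msg_t;

typedef enum { ROUTE_LOCAL, ROUTE_MYRINET, ROUTE_ETHERNET, ROUTE_VME } route_kind_t;

/* Hypothetical rule: the node-number ranges are invented for this sketch. */
route_kind_t route_for(uint16_t my_node, uint16_t dst_node)
{
    if (dst_node == my_node) return ROUTE_LOCAL;
    if (dst_node < 100)      return ROUTE_MYRINET;
    if (dst_node < 200)      return ROUTE_ETHERNET;
    return ROUTE_VME;
}

/* Counts messages handed to each transport; stands in for the real send paths. */
void daq_msg_send_sketch(uint16_t my_node, const daq_msg_t *m, int sent[4])
{
    assert(m != NULL && sent != NULL);
    sent[route_for(my_node, m->hdr.dst_node)]++;
}
```

A fixed message size keeps buffer management simple on every transport, since each daemon can preallocate identical slots.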

Page 11: Implementation of the STAR Data Acquisition System using a Myrinet Network

myriLib

DAQ library which wraps gm
– myriMsgSend()
– myriMemCpy()

What does it do?
• Manages the DMA message buffers
• Handles callback functions
• Thread synchronization
• Misc… (Byte order, 32 vs. 64 bit etc.)
• Bypasses DMA limitations on Solaris

Several Flavors
• Threaded vs Process
• Buffered vs Unbuffered DMA copies

Page 12: Implementation of the STAR Data Acquisition System using a Myrinet Network

myriLib Operations

Threaded (VxWorks tasks) myriLib

These lead to extra latency/reduced throughput for directed sends

Process myriLib with Buffering

Page 13: Implementation of the STAR Data Acquisition System using a Myrinet Network

myriMemCopy() Throughput

Page 14: Implementation of the STAR Data Acquisition System using a Myrinet Network

Multi-Sender myriLib Throughput

32-bit card MVME 2306 senders
64-bit Ultrasparc receiver

Page 15: Implementation of the STAR Data Acquisition System using a Myrinet Network

Year 2 Event Building

Solaris Myrinet Cards allow us to implement the EVB on the BB Node

– Removes a node from the network
  • Simplifies Software
  • Replaces point-to-point transfer with many-to-point transfer
– More Memory (1.5 GB vs. 256 MB)
  • Simplifies Software
  • Throughput increase via multiple pftp streams (30-35 MB/Sec vs. 25 MB/Sec)
– Multi-CPU Ultrasparc Machine
  • Compression on Built Events?

Preliminary Results Show
– Improved Small Event Performance (25 evts/Sec → 140 evts/Sec)
– Improved Throughput to BB (28 MB/Sec → 100 MB/Sec)

Page 16: Implementation of the STAR Data Acquisition System using a Myrinet Network

Year 1 Performance & Reliability

RHIC Data Run
– 3 Months Data Taking
– ~15 Days Integrated Stable Beam
– Little down time due to DAQ

STAR Performance
– ~10 TB data
– ~2.03 Million Events

Myrinet Performance
4 known message failures (out of >10^8 messages)
– Cause not known
– Reported by software
– Resulted in aborted run

No known data corruption
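Two derived numbers help put these totals in perspective: the average event size and an upper bound on the message failure rate. The sketch below assumes 1 TB = 1e12 bytes and 1 MB = 1e6 bytes, and reads the failure count as 4 out of more than 10^8 messages. The helper names are hypothetical.

```c
#include <assert.h>

/* Average event size in MB, assuming 1 TB = 1e12 bytes and 1 MB = 1e6 bytes. */
double avg_event_mb(double total_tb, double n_events)
{
    assert(n_events > 0.0);
    return total_tb * 1e12 / n_events / 1e6;
}

/* Upper bound on the per-message failure rate, given a lower bound
 * on the number of messages sent. */
double max_failure_rate(double failures, double min_messages)
{
    assert(min_messages > 0.0);
    return failures / min_messages;
}
```

With the quoted totals this gives roughly 4.9 MB per event on average and a failure rate below about 4 in 10^8 messages. Both are only as precise as the rounded figures on the slide.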

Page 17: Implementation of the STAR Data Acquisition System using a Myrinet Network

Au-Au Central Collision

130 GeV Au-Au Collision viewed through the L3 Event Display
Several thousand tracks
Tracking in real time (~100 msec)