

[IEEE 2012 IEEE-NPSS Real Time Conference (RT 2012) - Berkeley, CA, USA (2012.06.9-2012.06.15)] 2012 18th IEEE-NPSS Real Time Conference - Design and implementation of DAQ readout

978-1-4673-1084-0/12/$31.00 ©2012 IEEE

Design and Implementation of DAQ Readout System for the Daya Bay Reactor Neutrino Experiment

Xiaolu Ji, Fei Li, Kejun Zhu

Abstract–The Daya Bay Reactor Neutrino Experiment will consist of seventeen separate detector subsystems distributed in three underground experimental halls. There will be eight PMT-based anti-neutrino detectors (ADs), six water-Cherenkov detectors, and three RPC detector subsystems. Each detector will be read out using an independent VME crate. The data acquisition (DAQ) readout system (ROS) reads data fragments from the electronics and trigger modules, then concatenates them into an event in each crate. A detailed design and implementation of the DAQ readout system for the Daya Bay Reactor Neutrino Experiment will be presented.

I. INTRODUCTION

The data acquisition (DAQ) system for the Daya Bay reactor neutrino experiment is designed as a multi-level system

using embedded Linux, advanced commercial computers and distributed network technology, and is modeled after the BESIII and ATLAS DAQ systems. The hardware architecture is shown in Fig. 1.

Fig. 1. Hardware architecture of Daya Bay DAQ

The system has been designed entirely around a gigabit Ethernet network. Due to the distances involved, the three experimental halls connect to the surface through single-mode fibers. Dual point-to-point single-mode fiber cables connect the front-end and back-end networks.

All computers except for a local control PC are placed in the surface computer room for more convenient management and maintenance. The Daya Bay experiment also adopted a blade-server-based computing farm to construct the back-end system. The DAQ has two x3650 servers acting as file servers for the computing farm and data storage. Nine blade servers serve as computing nodes for data gathering and data-quality monitoring [1].

Manuscript received June 8, 2012. This work was supported by the Ministry of Science and Technology of China (No. 2006CB808103).

The authors are with the Institute of High Energy Physics, Chinese Academy of Sciences, 100049 Beijing, China (e-mail: [email protected]).

II. ROS ARCHITECTURE DESIGN

A. ROS Introduction

The DAQ readout system (ROS) is the front-end DAQ system, which interfaces with the electronics system and the back-end DAQ system. ROS reads data fragments from the electronics modules and concatenates them into events, then sends the events to the DAQ back-end software for monitoring and storage.

Fig. 2. ROS's position in the data readout chain (detector → analog signal over cable → electronics modules → data fragments over VME → ROS → events over the network → EFD)

B. System Requirements

1) Throughput Requirements

The expected maximum physics data throughput rate is less than 0.5 MB/s for one crate and less than 1.5 MB/s for one hall, assuming the baseline trigger rate. The total normal physics data throughput rate for all three halls is expected to be about 3 MB/s [2]. These estimates would increase with the implementation of full waveform digitization, noisy PMTs, or additional triggers (e.g., LED calibration triggers). The system design should have sufficient flexibility to allow for background studies. Consequently, the DAQ is designed for an event rate of 1 kHz with a 2-kilobyte event size, resulting in a throughput of just under 2 MB/s per crate.
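The per-crate headroom figure follows from simple arithmetic: throughput is the event rate multiplied by the event size. A quick sanity-check sketch (assuming 1 MB = 1024 KB, which the paper does not state explicitly):

```python
def throughput_mb_per_s(event_rate_hz: float, event_size_kb: float) -> float:
    """Sustained per-crate throughput: event rate times event size."""
    return event_rate_hz * event_size_kb / 1024.0  # KB/s -> MB/s

# Design point from the text: 1 kHz event rate, 2-kilobyte events.
design = throughput_mb_per_s(1000.0, 2.0)
assert design < 2.0  # just under 2 MB/s per crate, as stated
```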

The DAQ system is required to have negligible readout dead time. This requires fast online memory buffers that can hold multiple detector readout snapshots while the highest-level DAQ CPUs perform online processing and transfer to permanent storage.

2) Functionality Requirements

The main tasks of ROS include:
i. Configure electronics modules
ii. Read data fragments via VME bus
iii. Check data
iv. Build events
v. Send events to DAQ back-end software via Ethernet
vi. Monitor data-flow status



C. Hardware Design

The electronics system of the Daya Bay experiment is designed as a VME-based system. The electronics modules are 9U VME standard modules installed in 9U VME crates, which constitute the hardware platform of ROS. ROS is a real-time embedded system based on the VME bus. Each VME crate holds a VME system controller, an MVME5500, an embedded single-board computer manufactured by Motorola. It is based on a PowerPC CPU and uses the Universe II chip for the VME bus interface [3].

Fig. 3. VME-based readout system (a VME crate holding the MVME5500 controller and the electronics modules on the VME bus)

There are seventeen separate detector subsystems in the three underground experimental halls: eight anti-neutrino detectors (ADs), six water-Cherenkov detectors, and three RPC detectors. As shown in Fig. 4, ROS runs independently in each VME crate and reads out the corresponding detector's data. All crates connect to gigabit Ethernet switches through the MVME5500, through which ROS communicates with the DAQ readout control computers. All data streams are sent to the DAQ back-end software separately.

D. Software Design

TimeSys LinuxLink [4], a commercial embedded real-time Linux with kernel version 2.6.9, is employed as the operating system of ROS.

ROS is a real-time system that contains four main parallel tasks, as shown in Fig. 5. A ring buffer is used for data transfer between the tasks, while message queues and semaphores are used for inter-task communication.

A state chart is introduced to describe the system actions. There are some restrictions on the DAQ software, especially for the synchronization of the readout crates during ROS actions.

ROS fetches data through the VME bus. The Tundra Universe II [5] VMEbus-PCI bridge provides the CPU's interface for accessing the VME bus. The BSP (board support package) and the vme_universe driver required for system development are also provided by LinuxLink.

The VME bus controller reads data from the electronics boards using chained block transfer (CBLT) mode and then sends the event data out via gigabit Ethernet.

Fig. 4. ROS in the three experimental halls (the Daya Bay site, the Ling Ao site, and the far site each host AD, water-Cherenkov, and RPC VME crates)

Fig. 5. Parallel tasks in ROS (readout thread → electronics module data → datapack thread → packed event data → output thread → event data, with a monitor thread collecting useful information)

III. RESEARCH ON LINUX PLATFORM

Tests have been done to validate the feasibility of using an embedded Linux OS (operating system) in ROS. The Linux-based performance was tested both for single items and in integration. The single-item tests concerned the capabilities related to the readout system. As shown in Table I, the VME read and write times, the maximum data transfer speed, the CPU overhead of interrupts and data transfer, the network speed, the CPU occupancy, and the context-switch overhead were tested, using an MVME5500 as the master and an MVME2431 as the slave. These tests all showed good performance [6].

To verify the integration performance with the electronics system and the back-end DAQ, an integrated test was performed. During development, some modifications were made to the Linux kernel and the vme_universe driver to fit the requirements of the DAQ system, and some optimization methods were adopted to improve the system capability. After these optimizations, the integrated test system worked well, and the performance is sufficient to satisfy the needs of ROS. It can therefore be concluded that the embedded Linux OS is feasible and reliable for the Daya Bay experiment.


TABLE I. SINGLE ITEM PERFORMANCE ON LINUX
(TimeSys Linux, MVME5500 & MVME2431)

VME Read                           1638 ns
VME Write                           405 ns
DMA (4096 bytes)                  18.42 Mbytes/s
DMA CPU Overhead                     15 us
Interrupt CPU Overhead               16 us
100M Network Speed (1024 bytes)      92 Mbps
Network CPU Usage                    14%
Context Switch Overhead            5.78 us

IV. IMPLEMENTATION

A. Electronics Configuration

ROS is the only sub-system in the DAQ that has an interface with the electronics system, so the online configuration of the electronics modules needs to be executed in ROS. An electronics configuration database (Electronics ConfDB) is used to store the important information and configuration parameters of the electronics. The Electronics ConfDB is described as XML (Extensible Markup Language) files, which are convenient for modifying and storing the parameters.

Each time before a new run starts, the user should set all the parameters in the XML files to the correct values, and ROS will read these parameters to initialize the electronics and set the corresponding working status of the modules. All the parameters in the Electronics ConfDB will be stored in the online databases after the run starts.
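Reading parameters from such an XML ConfDB can be sketched as below. The element names, slot attribute, and register values here are purely illustrative — the paper does not specify the actual Daya Bay parameter names or file layout:

```python
import xml.etree.ElementTree as ET

# Hypothetical ConfDB fragment; the real schema is not given in the paper.
XML = """
<crate id="EH1-AD1">
  <module slot="5" type="FEE">
    <param name="threshold" value="0x30"/>
    <param name="window" value="0x12"/>
  </module>
</crate>
"""

def load_module_params(xml_text):
    """Collect {slot: {param_name: int_value}} for every module in a crate."""
    crate = ET.fromstring(xml_text)
    params = {}
    for mod in crate.findall("module"):
        slot = int(mod.get("slot"))
        params[slot] = {p.get("name"): int(p.get("value"), 16)
                        for p in mod.findall("param")}
    return params

# ROS would walk this dict and write each value to the module's VME registers.
print(load_module_params(XML))  # {5: {'threshold': 48, 'window': 18}}
```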

Fig. 6. The flow of electronics configuration (the user sets the configuration parameters in the XML files; ROS gets the parameters and configures the electronics modules in the VME crate)

B. Run Mode

The DAQ should be capable of taking different data types from detectors running together in different run modes. Table II summarizes the run-mode requirements.

ROS should be able to handle the different run modes in all aspects:
i. ROS should get the electronics parameters for the corresponding run mode from the Electronics ConfDB.
ii. ROS needs to obtain the corresponding information from the IS (Information Service) component of the DAQ.
iii. The configuration process executed for the electronics in ROS differs between run modes.
iv. The ROS data-taking process differs between run modes.
v. Minor differences in filling the event format should be handled in ROS.
vi. ROS should consider each detector's behavior in the corresponding run mode.

TABLE II. SUMMARY OF RUN MODE REQUIREMENTS

Run Mode                  AD       Water-Cherenkov  RPC
Physics                   Y        Y                Y
Electronics Diagnosis     Y        Y                N/A
Pedestal                  Y        Y                N/A
AD Calibration            Y        Physics         Physics
Water Shield Calibration  Physics  Y                Physics
Mineral Oil Monitoring    Y        Physics         Physics

Y means this detector will work according to the corresponding run mode.

C. Dataflow Monitoring

ROS is the front-end DAQ system and is the earliest component that can directly access the raw electronics data. Real-time monitoring of the dataflow quality is essential for ensuring the efficiency of the experiment.

Fig. 7. Realization of the dataflow-monitoring parameters in ROS (start: reset the monitor parameters; loop: wait for the sampling time, get the parameter values, calculate them, and update them; on receiving a stop command, the task quits)

The dataflow monitoring in ROS provides:
i. the number of data packages received by ROS;
ii. the number of events assembled in ROS;
iii. the number of events sent out by ROS;
iv. the average received data size, trigger rate, and bandwidth in a specified time interval;
v. the average sent-out event size, event rate, and bandwidth in a specified time interval;
vi. the trigger rate of each trigger type [7].
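Items iv and v reduce to simple averages over one sampling interval. A minimal sketch of that calculation, with illustrative field names (the real ROS parameter definitions live in the XML files mentioned below):

```python
def interval_stats(event_sizes, interval_s):
    """Average event size, event rate, and bandwidth over one sampling
    interval; `event_sizes` lists the sizes (bytes) seen in the interval."""
    n = len(event_sizes)
    total = sum(event_sizes)
    return {
        "avg_size_bytes": total / n if n else 0.0,
        "event_rate_hz": n / interval_s,
        "bandwidth_bytes_per_s": total / interval_s,
    }

# 500 two-kilobyte events in a one-second sampling interval.
stats = interval_stats([2048] * 500, interval_s=1.0)
```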

The definitions of all these monitor parameters are described in the relevant XML files, and their values are calculated in ROS (Fig. 7). During a run, ROS updates these values in the specified time interval and then submits them to the DAQ monitor system for display. Monitoring the dataflow status can help developers debug the system during commissioning and can assist shifters in understanding the status of data-taking, as shown in Fig. 8.

Fig. 8. Dataflow monitor in ROS

D. Data Check

Strict data-quality checking is implemented in the DQM (Data Quality Monitoring) system, which is based on the offline system and is not a real-time system. To catch abnormal cases in the electronics data as early as possible, a prompt raw-data check is implemented in ROS. ROS does not perform very detailed data analysis because of limited CPU resources. The primary check items in ROS cover the electronics data format and some important readout information in the data packages, such as the data flag, trigger number, time stamp, check information, and module position.

If errors are observed during the data check, ROS notifies the DAQ MRS (Message Reporting System) and then executes the abnormal-data handling process, which saves the original data package tagged with an error flag. Shifters are informed about the errors through the message panel of the DAQ IGUI (Integrated Graphical User Interface).
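A prompt check of this kind can be sketched as a list of cheap per-package tests. The field names, the 0xAA55 flag value, and the 21-slot range below are illustrative stand-ins, not the real Daya Bay data format:

```python
def check_package(pkg, expected_trigger):
    """Return a list of error strings for one data package (empty = clean).
    Field names and constants are illustrative, not the real format."""
    errors = []
    if pkg.get("data_flag") != 0xAA55:          # hypothetical format flag
        errors.append("bad data flag")
    if pkg.get("trigger_number") != expected_trigger:
        errors.append("trigger number mismatch")
    if pkg.get("module_position") not in range(1, 22):  # 21-slot 9U crate
        errors.append("bad module position")
    return errors

good = {"data_flag": 0xAA55, "trigger_number": 7, "module_position": 5}
bad = {"data_flag": 0x0000, "trigger_number": 8, "module_position": 5}
assert check_package(good, expected_trigger=7) == []
assert check_package(bad, expected_trigger=7) == ["bad data flag",
                                                  "trigger number mismatch"]
```

On an error, the real ROS would hand the tagged package to the abnormal-data process and report to the MRS rather than raise.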

E. Event Building

Event building is executed within each VME crate by ROS. There are two types of VME detector readout electronics, one for the PMTs (ADs and water-shield detectors) and another for the RPC detectors. As shown in Fig. 9, data from the PMT systems are assembled by the trigger ID tagged in the data fragments of the electronics modules. Data from the RPC system are ordered neither by trigger ID nor by time stamp [8], so ROS needs to sort all the RPC electronics data by the time stamp tagged in the data fragments and then assemble the time-ordered electronics data using an appropriate time window.

Fig. 9. Event build in ROS
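The two assembly strategies can be sketched as follows; the fragment fields (`trigger_id`, `ts`) and the window value are illustrative, and time units are left abstract:

```python
from collections import defaultdict

def build_pmt_events(fragments):
    """PMT crates: group fragments by the trigger ID tagged in each one."""
    events = defaultdict(list)
    for frag in fragments:
        events[frag["trigger_id"]].append(frag)
    return dict(events)

def build_rpc_events(fragments, window):
    """RPC crate: sort by time stamp, then group fragments whose stamps
    fall within `window` of the first fragment of the current event."""
    events, current = [], []
    for frag in sorted(fragments, key=lambda f: f["ts"]):
        if current and frag["ts"] - current[0]["ts"] > window:
            events.append(current)
            current = []
        current.append(frag)
    if current:
        events.append(current)
    return events

# Two close-in-time RPC fragments form one event; the third starts another.
rpc = [{"ts": 100}, {"ts": 102}, {"ts": 250}]
assert [len(e) for e in build_rpc_events(rpc, window=10)] == [2, 1]
```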

All data created within an experimental hall can be merged by the DAQ back-end software SFO (sub-farm output). Merging is configurable for any combination of detectors and different run modes. Furthermore, the events are sorted in the merged stream by time stamp. Finally, the event data must be recorded to a disk array by the SFO.
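Because each crate's stream is already time-ordered, the SFO's merge step can be modeled as a k-way merge on the time stamp; the event structure below is an assumed stand-in:

```python
import heapq

def merge_streams(*streams):
    """Merge already time-ordered per-crate event streams into one
    time-ordered hall stream, as the SFO does before recording."""
    return list(heapq.merge(*streams, key=lambda ev: ev["ts"]))

ad = [{"ts": 1, "src": "AD"}, {"ts": 5, "src": "AD"}]
ws = [{"ts": 2, "src": "WS"}, {"ts": 4, "src": "WS"}]
merged = merge_streams(ad, ws)
assert [ev["ts"] for ev in merged] == [1, 2, 4, 5]
```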

F. Data Communications in ROS

Fig. 10. Flowchart of data transfer between ROS tasks

In ROS, readout, data packing, and output are the three parallel tasks involved in the data transfer. The producer-consumer model is used for data transfer between tasks in ROS. A ring buffer is adopted as the data buffer, and a message queue is used for information exchange.

For example, the readout and data-packing tasks form an instance of the producer-consumer model. As shown in Fig. 10, the readout task gets a data package from the VME bus, stores the data in the ring buffer, and then puts a message in the message queue to notify the data-packing task. Once the data-packing task receives the message, it reads the data out of the ring buffer. The data transmission between the data-packing and output tasks obeys the same producer-consumer model.

G. Data Throughput Capability


Fig. 11. Data readout capability in ROS (event rate and readout bandwidth vs. CBLT readout data size)

Fig. 12. Data output capability in ROS (event rate and bandwidth vs. event size)

As introduced above, there are three types of detectors in the Daya Bay experiment: ADs, water-Cherenkov detectors, and RPC detectors. The AD has the highest requirement for data throughput performance, i.e., a high event rate and a high data bandwidth, so this section focuses on the DAQ performance of the AD electronics crate.

Fig. 11 and Fig. 12 show the data throughput capability in ROS for AD crate readout. As shown in these two figures, when the readout data size is less than about 5 kilobytes, the data readout capability is better than the output capability in ROS; otherwise the output capability is better. The average data size of the AD crate is less than 2 kilobytes, so the data throughput capability is limited by the output part of ROS, which is caused by the communication mechanism between ROS and the back-end software.

Table III shows the DAQ system throughput performance for one AD readout crate. The event rate and bandwidth are the maximum DAQ capability under the current system settings. If all 192 PMTs in one detector fire at the same time with only one hit in each electronics channel, the event size is about 1.7 kilobytes; in this case the maximum event rate is 1.7 kHz and the data bandwidth is about 2.9 megabytes per second. This performance is sufficient to meet the requirements of the Daya Bay experiment.

Furthermore, if needed, adopting a multi-trigger CBLT readout mode in the ROS readout task, or a multi-event transfer mode in the ROS output task, can improve the data throughput capability significantly.

TABLE III. PMT DAQ THROUGHPUT PERFORMANCE

Fired Channels (full crate: 16 chn * 12 FEE)   48    96   192   192   192
Hits per Channel                                1     1     1     2     5
Event Size (KB)                              0.57  0.94   1.7   3.0   8.1
Event Rate (kHz)                              2.5   2.0   1.7   1.6   1.1
Readout Bandwidth (Mbytes/s)                  1.4   1.9   2.9   4.8   9.2
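The bandwidth row is, to within rounding, the product of event size and event rate (KB times kHz gives MB/s). A quick consistency check of the table's figures:

```python
rows = [  # (event size KB, event rate kHz, quoted bandwidth MB/s), Table III
    (0.57, 2.5, 1.4), (0.94, 2.0, 1.9), (1.7, 1.7, 2.9),
    (3.0, 1.6, 4.8), (8.1, 1.1, 9.2),
]
for size_kb, rate_khz, quoted in rows:
    computed = size_kb * rate_khz  # KB * kHz = MB/s
    # agreement to within ~0.4 MB/s; the quoted figures are rounded
    assert abs(computed - quoted) < 0.4
```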

V. CONCLUSION

The architecture design of the DAQ readout system for Daya Bay has been completed. Integrated system tests have been performed both in the laboratory and onsite, and the results show that the DAQ ROS performs well and that the architecture design meets the current experimental requirements.

Fifteen of the seventeen detectors have been installed in the three experimental halls and have been operating with steady data-taking since December 24, 2011. Taking the Daya Bay Experimental Hall (EH1) as an example, during the experimental livetime from December 24, 2011 to May 11, 2012, the fraction of the total DAQ time (3195.4 hours) in the total calendar time (3336 hours) was 95.8%. The DAQ dead time due to full data buffers was determined with dedicated scalers to be less than 0.0025% [9], showing that the DAQ system is capable and reliable for the Daya Bay experiment.

REFERENCES

[1] F. Li, X.L. Ji, X.N. Li, and K.J. Zhu, IEEE Trans. Nucl. Sci. 58 (4) (2011) 1723.
[2] X.H. Guo et al. (Daya Bay Collaboration), "Proposal of the Daya Bay experiment," arXiv:hep-ex/0701029 (2007).
[3] Motorola, "MVME5500 Series VME Single-Board Computer data sheet," 2003.
[4] https://linuxlink.timesys.com/3/Products.
[5] Tundra Semiconductor Corporation, "Universe II VME-to-PCI Bus Bridge User Manual," 2002.
[6] X.L. Ji et al., "Research and design of DAQ system for Daya Bay reactor neutrino experiment," in Proc. IEEE Nuclear Science Symp. Conf. Record, Dresden, Germany, 2008, pp. 2119-2121, N30-76.
[7] H. Gong et al., Nucl. Instr. and Meth. A 637 (2011) 138.
[8] H.F. Hao et al., "Development of VME system in RPC electronics for Daya Bay Reactor Neutrino Experiment," Nuclear Science and Techniques (to be published).
[9] F.P. An et al. (Daya Bay Collaboration), arXiv:1202.6181.
