The Run Control and Monitoring System of the CMS Experiment
Presented by Andrea Petrucci
INFN, Laboratori Nazionali di Legnaro, Italy
On behalf of the DAQ Group of the CMS collaboration
ACAT 2007, 23-27 April 2007, Amsterdam, Netherlands
Outline
ACAT 2007, 23-27 April 2007, Amsterdam. Andrea Petrucci - LNL-INFN
Run Control and Monitor System:
• Architecture
• Logical Layer
• Services
• Components
• Technologies
At the Magnet Test and Cosmic Challenge (MTCC):
• Control structure
• Operation
• Components
• Results
GRIDCC Project
What is CMS?
The Compact Muon Solenoid (CMS) experiment is one of two large general-purpose particle physics detectors being built on the proton-proton Large Hadron Collider (LHC) at CERN in Switzerland.
The main goals of the experiment are:
• to explore physics at the TeV scale
• to discover the Higgs boson
• to look for evidence of physics beyond the standard model
• to be able to study aspects of heavy ion collisions
Run Control and Monitor System
• The Run Control and Monitor System (RCMS) is responsible for controlling and monitoring the CMS experiment during data taking.
• RCMS views the experiment as a set of partitions, where a partition is a grouping of entities that can be operated independently.
• Main operations are configuration, monitoring, error handling, logging and synchronization with other subsystems.
CMS Data Acquisition
Baseline DAQ Configuration
• 512 inputs
• 2024 outputs
Control and Monitor requirements
• O(10^4) distributed objects to
  – control
  – configure
  – monitor
• On-line diagnostics
• Interactive system
Run Control and Monitor System
RCMS is integrated in the CMS On-line system:
• It controls the "DAQ component"
  – Data transport
  – Event processing
• It monitors the "Detector Control System" (DCS), which manages the slow controls of the whole experiment.
The SOAP protocol and Web Services have been adopted as the main means of communication.
The online process environment is XDAQ, a C++ framework for distributed data acquisition systems.
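As a rough illustration of the SOAP-based communication mentioned above, the sketch below builds and parses a SOAP 1.1 envelope using only the Python standard library. The command name `Configure`, the `urn:example:rcms` namespace and the `sessionId` parameter are assumptions made for this example; they are not taken from RCMS or XDAQ.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
CMD_NS = "urn:example:rcms"  # hypothetical namespace, not an RCMS/XDAQ one


def local_name(tag):
    """Drop the "{namespace}" prefix that ElementTree puts in front of tags."""
    return tag.split("}", 1)[1]


def build_command(name, params):
    """Return a SOAP 1.1 envelope (bytes) whose body holds one command element."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    command = ET.SubElement(body, f"{{{CMD_NS}}}{name}")
    for key, value in params.items():
        ET.SubElement(command, f"{{{CMD_NS}}}{key}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def parse_command(message):
    """Extract (command name, parameters) from an envelope built above."""
    envelope = ET.fromstring(message)
    body = envelope.find(f"{{{SOAP_ENV}}}Body")
    command = list(body)[0]
    return local_name(command.tag), {local_name(p.tag): p.text for p in command}


message = build_command("Configure", {"sessionId": 42})
name, params = parse_command(message)
```

Parameter values come back as strings, because XML element text carries no type information; a real service would declare the types in its WSDL.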
RCMS Logical Structure
• A Session is the allocation of the hardware and software of a CMS partition needed to perform data-taking.
• Multiple Sessions may coexist and operate concurrently.
• Each Session is associated with a Top Function Manager that coordinates all the actions.
[Figure: session structure. Labels: Top, Sub-Detector, Sub-System (DAQ), Services, Resources.]
RCMS Services
– SECURITY SERVICE
  • Login and user account management
– RESOURCE SERVICE (RS)
  • Information about DAQ resources and partitions
– INFORMATION AND MONITOR SERVICE (IMS)
  • Collects messages and monitor data; distributes them to the subscribers
– JOB CONTROL
  • Starts, monitors and stops the software elements of RCMS, including the DAQ components
Function Manager
The purpose of a Function Manager (FM) is to control a set of resources.
Input Handler: It handles all the input events of the FM (from GUIs or other FMs; errors, states, logs and monitor messages).
Event Processor: It handles all the incoming messages and decides where to send them. It has processing capability.
Finite State Machine (FSM): The behavior of the FM is driven by an FSM.
Resource Proxy: It handles all the outgoing connections with the resources.
[Figure: Function Manager block diagram. Blocks: Input Handler, Event Processor, FSM Engine, Resource Proxy, Resources. Flows: State Flow, Error Flow, Monitor Flow, Control Flow. Legend: Customizable.]
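The four Function Manager parts described above can be sketched as one small class. This is a minimal illustration only: the state names, the transition table and the `FakeResource` stand-in are assumptions, and the real RCMS framework is not written in Python.

```python
class FunctionManager:
    """Toy Function Manager: input handler -> event processor -> FSM -> resource proxy."""

    # Hypothetical transition table: (current state, input) -> next state.
    TRANSITIONS = {
        ("Initial", "configure"): "Configured",
        ("Configured", "start"): "Running",
        ("Running", "stop"): "Configured",
    }

    def __init__(self, resources):
        self.state = "Initial"
        self.resources = resources   # objects exposing a send(command) method
        self.log = []                # monitor/log messages that were received

    def handle_input(self, event):
        """Input handler: single entry point for commands, errors and messages."""
        return self.process(event)

    def process(self, event):
        """Event processor: decide where an incoming event goes."""
        kind, payload = event
        if kind == "command":
            return self.fire(payload)
        if kind == "error":
            self.state = "Error"
            return self.state
        self.log.append(payload)     # anything else is kept as a monitor/log message
        return self.state

    def fire(self, command):
        """FSM engine: check the transition, forward the command, change state."""
        key = (self.state, command)
        if key not in self.TRANSITIONS:
            raise ValueError(f"input {command!r} not allowed in state {self.state!r}")
        for resource in self.resources:   # resource proxy: outgoing connections
            resource.send(command)
        self.state = self.TRANSITIONS[key]
        return self.state


class FakeResource:
    """Stand-in for an on-line resource; records the commands it receives."""

    def __init__(self):
        self.received = []

    def send(self, command):
        self.received.append(command)


resource = FakeResource()
fm = FunctionManager([resource])
fm.handle_input(("command", "configure"))
fm.handle_input(("command", "start"))
fm.handle_input(("monitor", "rate ok"))
```

Keeping the transition table as data means a sub-system could swap in its own states and inputs without touching the dispatch code, which is one plausible reading of the "Customizable" legend in the figure.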
Resource Service
The Resource Service (RS) stores the process configuration of the On-line System.
Features:
• Flexible data store
• Java API
• Configuration documents can be built on the fly from the relational schema
• Versioning system
• Oracle and MySQL implementations
[Figure: Resource Service components. Labels: RS Manager tool, DAQ Configurator, RS API, RS DB.]
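To make "configuration documents built on the fly from a relational schema" concrete, here is a minimal sketch that reads versioned rows from a database and emits an XML document. It uses an in-memory SQLite database instead of Oracle or MySQL, and the `process` table, its columns and the application names are hypothetical; the slides do not describe the actual RS schema or document format.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical schema and rows, for illustration only.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE process (host TEXT, port INTEGER, app TEXT, version INTEGER)")
db.executemany(
    "INSERT INTO process VALUES (?, ?, ?, ?)",
    [
        ("node01", 40000, "ReadoutUnit", 1),
        ("node02", 40000, "BuilderUnit", 1),
        ("node01", 40000, "ReadoutUnit", 2),  # version 2 contains a single process
    ],
)


def build_config(connection, version):
    """Build an XML configuration document from the rows of one version."""
    root = ET.Element("Configuration", version=str(version))
    rows = connection.execute(
        "SELECT host, port, app FROM process WHERE version = ? ORDER BY host, app",
        (version,),
    )
    for host, port, app in rows:
        ET.SubElement(root, "Process", host=host, port=str(port), app=app)
    return ET.tostring(root, encoding="unicode")


document = build_config(db, 1)
```

Because the document is generated at request time, older versions stay reproducible as long as their rows are kept.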
Log Collector
• Collects log information from log4j-compliant applications (i.e. on-line processes).
• Sends log information directly to a Display System (Chainsaw).
• Stores log information in a database and visualizes it (LogDBViewer).
• Distributes/publishes log information through a message system (Java Message Service).
[Figure: Log Collector data flow. Labels: RCMS applications and XDAQ applications; Log Collector; Display System; Storage System; Publish/Subscribe System; Message System; Relational DB (Oracle, MySQL); Access via JDBC; Access via TCP.]
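The collect-then-fan-out pattern in the bullets above can be sketched with Python's standard `logging` module. This is an analogy rather than the actual Log Collector, which works with log4j; the three lists stand in for a display system, a database and a message system, and the rule of publishing only warnings and above is an assumption.

```python
import logging


class LogCollector(logging.Handler):
    """Receives log records and fans them out to every registered sink."""

    def __init__(self):
        super().__init__()
        self.sinks = []

    def add_sink(self, sink):
        self.sinks.append(sink)

    def emit(self, record):
        entry = {
            "logger": record.name,
            "level": record.levelname,
            "levelno": record.levelno,
            "message": record.getMessage(),
        }
        for sink in self.sinks:
            sink(entry)


displayed = []   # stands in for a display system
stored = []      # stands in for rows written to a database
published = []   # stands in for messages sent to subscribers


def publish_warnings(entry):
    """Forward only WARNING and above to subscribers (assumed filtering rule)."""
    if entry["levelno"] >= logging.WARNING:
        published.append(entry)


collector = LogCollector()
collector.add_sink(displayed.append)
collector.add_sink(stored.append)
collector.add_sink(publish_warnings)

logger = logging.getLogger("example.fm")   # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False                   # keep the example's output self-contained
logger.addHandler(collector)

logger.info("configured")
logger.error("resource %s not responding", "node01")
```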
RCMS main components
[Figure: RCMS main components. Components: Function Manager Framework; Log Collector (WebApp); Chainsaw (Log Viewer); JobControl (Process Control); Firefox (JSP/Ajax GUI, User Interface); RS Manager (Manager/Editor); DAQ Configurator (Configuration). APIs: RS API, RunInfo API, Hwcfg API. Databases: RS DB, GCK DB, RunInfo DB, Log DB, Hwcfg DB. Flows: Config data, Conditions data, Process Config, Log Messages, Commands, Notifications.]
RCMS Technologies
Technologies and tools:
• Web applications, Java Servlets (Apache Tomcat)
• Web Services (Axis, WSDL, SOAP)
• Web technologies (Ajax, JSP)
• Databases
  – Oracle
  – MySQL

Architecture                             Implementation
Resource Service (RS)                    Resource Service
Information and Monitor Service (IMS)    LogCollector
SubSystem Controllers (FMs)              RCMS Framework
Top Function Manager                     RCMS Framework
GUIs                                     Default JSP GUI - RCMS Framework
JobControl                               XDAQ Framework
Magnet Test and Cosmic Challenge
The Magnet Test and Cosmic Challenge (MTCC) is a milestone of the CMS experiment: it completes the commissioning of the magnet system (coil & yoke) before its lowering into the cavern.
The main goals of the Cosmic Challenge were:
• Test the muon alignment systems.
• Commission several sub-detectors (Drift Tubes - DT, Hadron Calorimeter - HCAL, Tracker, etc.) and the Cosmic Trigger.
• Demonstrate cosmic ray reconstruction with multiple sub-detectors.
Scale MTCC versus CMS
                MTCC      CMS        Fraction
Data Sources    20        600        3%
Filter Nodes    14        2000       0.3%
Trigger rate    100 Hz    100 kHz    0.1%
Event size      200 kB    1 MB       20%
FMs Control Structure at MTCC I & II
The user interacts through a Web Browser (GUI) connected to the Level 0 FM.
The Level 0 FM is the entry point to the Run Control System.
Level 1 FMs interface to the Level 0 FM and have to implement a standard set of inputs and states.
Level 2 FMs are sub-system specific custom implementations.
Resources are on-line system components.
[Figure: control tree. Labels: Web Browser (GUI); Level 0 FM; Level 1 FM; Level 2 FM; TOP; LTC; CSC; DAQ; RPC; DT; TRK; ECAL; HCAL; FB; RB; FF; FEC; FED; Resources.]
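The layered control structure described above can be sketched as a tree in which every node accepts the same standard inputs and reports a state to its parent. The input and state names, the rule that a parent succeeds only if all children succeed, and the grouping of FB, RB and FF under DAQ are assumptions made for illustration; the slides do not spell them out.

```python
class Node:
    """A Function Manager in a control tree; leaves stand for sub-system FMs."""

    # Standard inputs accepted by every node in this sketch and the state each leads to.
    STANDARD = {"configure": "Configured", "start": "Running", "stop": "Configured"}

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.state = "Initial"

    def command(self, action):
        """Forward the command to all children, then derive this node's state."""
        target = self.STANDARD[action]
        child_states = [child.command(action) for child in self.children]
        # A parent reaches the target state only if every child reached it.
        ok = all(state == target for state in child_states)
        self.state = target if ok else "Error"
        return self.state


class FailingNode(Node):
    """Leaf that always reports an error, to show how failures propagate upward."""

    def command(self, action):
        self.state = "Error"
        return self.state


daq = Node("DAQ", [Node("FB"), Node("RB"), Node("FF")])
top = Node("TOP", [daq, Node("DT"), Node("HCAL")])
top.command("configure")
final_state = top.command("start")

broken = Node("TOP", [Node("DAQ"), FailingNode("ECAL")])
broken_state = broken.command("configure")
```

Because every level speaks the same small vocabulary, the top node never needs to know how a sub-system implements a transition internally.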
RCMS at MTCC I & II
[Figure: Function Managers at MTCC. Labels: Top, DAQ, LTC, HCAL, ECAL, DT, CSC, RPC, TRK.]
RCMS Operation Scenario
– Sub-system function managers were written using the RCMS software.
– The run configuration was communicated via a global configuration key.
– The Run Info DB was used to store end-of-run summary information and status information about the run. It also contained the schema to generate Run Numbers and Run Sequence Numbers.

                               N
Sub-detectors controlled       8
Function Managers used         14
Online resources controlled    ~100
RCMS Components at MTCC I & II
[Figure: RCMS components at MTCC. Legend: global key, local key, configuration, log message. Function Managers: Top, DAQ, LTC-Trg, HCAL, ECAL, DT, CSC, RPC, TRK. Resource Services: TopRS, DAQRS, LTC-TrgRS, HCALRS, ECALRS, DTRS, EMURS, RPCRS, TRKRS. Also shown: Global Configuration Keys, Collectors, LOG DB.]
Configuration
A global configuration key identified a sub-system configuration.
The configurations local to the sub-systems were decoupled from each other and from the top configuration.
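One way to picture the decoupling described above is a two-level lookup: a global key selects one local key per sub-system, and each sub-system resolves its local key in its own store. The key names, the store layout and the parameter names below are all hypothetical; the slides do not give the real key format.

```python
# Hypothetical key tables, for illustration only.
GLOBAL_KEYS = {
    "cosmics_v1": {"DAQ": "daq_12", "DT": "dt_3", "HCAL": "hcal_7"},
    "cosmics_v2": {"DAQ": "daq_12", "DT": "dt_4", "HCAL": "hcal_7"},
}

# Each sub-system keeps its own store, so a local change never touches the others.
LOCAL_STORES = {
    "DAQ": {"daq_12": {"event_builders": 4}},
    "DT": {"dt_3": {"threshold": 20}, "dt_4": {"threshold": 25}},
    "HCAL": {"hcal_7": {"gain": 1}},
}


def resolve(global_key):
    """Return {sub-system: configuration} selected by one global key."""
    local_keys = GLOBAL_KEYS[global_key]
    return {
        subsystem: LOCAL_STORES[subsystem][local_key]
        for subsystem, local_key in local_keys.items()
    }


run_v1 = resolve("cosmics_v1")
run_v2 = resolve("cosmics_v2")
```

In this sketch, moving DT from `dt_3` to `dt_4` only requires a new global key entry; the DAQ and HCAL configurations are reused unchanged.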
MTCC Data taking
[Figure: "subsystem events". Horizontal axis: date, 10/3/06 to 11/7/06. Left axis: events [#], 0 to 180000000. Right axis: B-field [T], 0 to 4.5. Series: HCAL, CSC, RPC, DT, Magnet Field.]
MTCC result
• RCMS software was stable.
• Separation of Subsystem installations worked well.
• Recorded ~160 M events over a period of one month.
RCMS and GRIDCC
What is GRIDCC?
The Grid enabled Remote Instrumentation with Distributed Control and Computation (GRIDCC) is a project funded by the European Community that aims to provide access to and control of distributed complex instrumentation.
• It is a 3-year project that started in September 2004.
• Web site: www.gridcc.org
The CMS RCMS is one of the main applications for the GRIDCC project.
The RCMS software is the core of the Instrument Element of GRIDCC.
Thank you for your attention.
• Any Questions?