Complexity Management Solutions for High Energy Physics Control Systems: The CMS experiment
Ildefons Magrans, CMS Trigger Software Technical Coordinator

Slide 1

Ildefons Magrans, CMS Trigger Software Technical Coordinator
Complexity Management Solutions for High Energy Physics Control Systems: The CMS experiment
Zurich (IBM Research Laboratory), 23rd January 2008
Ildefons Magrans de Abril, CMS Trigger Software Technical Coordinator, CERN, Geneva

Slide 2
Outline
  1 CERN and the LHC
  2 The CMS experiment
  3 Enhancing complexity management with web services
    3.1 Software environment model: XSEQ
    3.2 Concrete architecture: The CMS Trigger Supervisor

Slide 3
Outline (same as Slide 2)

Slide 4
CERN, European Organization for Nuclear Research
  Large Hadron Collider
  CERN provides research facilities to particle physicists worldwide

Slide 5
Large Hadron Collider (LHC)
  Largest superconducting installation: 27 km ring
  3 billion euros
  CMS and ATLAS detect collision information (events): 40 million events/second

Slide 6
Outline (same as Slide 2)

Slide 7
Compact Muon Solenoid (CMS)
  Human complexity:
    39 countries
    182 institutes (CERN is 1)
    3330 people
    ~800 students
  Numeric complexity:
    21.6 m long
    15 m diameter
    12,500 tonnes
    4 Tesla solenoid (100,000 times the Earth's magnetic field)
    1200 m³/hour of water for cooling (cf. the Geneva Jet d'Eau: ~1800)
    10 MW required for operation (~10,000 houses)
  Time complexity:
    Design started 20 years ago
    7-8 years for construction
    15 years of expected operational lifetime
    Already developing upgrades

Slide 8
The CMS sensor
  Silicon Tracker: finds charged particle tracks and momentum
  Electromagnetic Calorimeter: measures the energy of particles interacting electromagnetically
  Hadronic Calorimeter: measures the energy of particles interacting via the strong nuclear force (heavy neutral particles)
  Muon detector: finds muon tracks
  (figure label: Particle physicist)

Slide 9
The CMS Trigger and Data Acquisition System
  We can store only 100 events per second
  Solution based on two filter levels:
    Level-1 Trigger (HW)
    High Level Trigger (SW)
  Control system coordinates experiment operation
  40 million events/second
  ~55 million channels
  ~1 Mbyte per event

Slide 10
About this talk
  CMS Control System: SOFTWARE
  L1 Decision Loop and detector front-ends: HARDWARE

Slide 11
Outline (same as Slide 2)

Slide 12
Context complexity
  Numeric dimension:
    Thousands of hardware modules and the same order of electronic links
  Time dimension:
    L1 Trigger development started in the year 2000
    L1 Trigger design for the SLHC has already started
    CERN Linux platform upgrades every 2 years
    Periodic software and hardware upgrades
  Human and political dimension:
    Large number of independent research institutes with similar requirements using different technologies (e.g. FPGA vs ASIC, VME vs PCI vs tiny )
    Most people are particle physicists with a few % of their time dedicated to SW development.
    ~20% students

Slide 13
Outline (same as Slide 2)

Slide 14
XSEQ: A software environment model
  Diagram labels: XML Control Sequence (XSEQ); Device Description; Device Data; Devices; Interpreter: processes platform-independent control sequences
  1. XML as uniform representation format for both data and code
    Long-term technological investment (XML is here to stay)
    Maximizes usage of standard tools
    Simplifies software configuration management
  2. Interpreted approach for the code
    Executes code independently of the platform

Slide 15
XSEQ language
  XSEQ language (XML-based sequencer):
    Syntax specified in xsd documents
    + Extensions: file system, SOAP, DOM, HW access (PCI & VME)
    Exception handling with error recovery mechanism
    Other: object-oriented and design-by-contract extensions
  XSEQ example 1: Hello world
    Callouts: XSEQ syntax core definition; Exception handling; Every tag is a function

Slide 16
XSEQ example 2: hardware access
  [XSEQ listing; XML markup lost in extraction. Surviving values: http://xseq.cern.ch/register_table.xml, PCIi386BusAdapter, ecd6, fd05, 0, my_device, CTRL, my_data]
  Callouts:
    Device specifications
    Common tools for processing code and data.
    Simplifies core development
    Decoupling syntax and semantics enhances code sharing between sub-systems with similar requirements
    Extends the interpreter in order to execute a new syntactic extension
    Scoped variable: not accessible in upper hierarchical levels

Slide 17
Online software integration
  Diagram labels: XSEQ program (URL); Peer transport (SOAP); XDAQ executive (one per host computer); Interpreter plug-in; XDAQ application; Return message generated by the XSEQ program
  XDAQ framework: C++ middleware developed in-house by CMS
  The running XSEQ program can access the original SOAP message and retrieve parameters
  The SOAP message specifies the URL of the XSEQ program or embeds the program itself

Slide 18
XSEQ example 3: distributed system
  Diagram labels: SOAP; Remote server: Standalone Interpreter; Client: HEPHY, Vienna; Global Trigger server, CERN, Geneva; SOAP pt (peer transport); Interpreter plug-in; XDAQ executive; Bus protocol; SOAP extension
  SOAP message sent by the client [markup lost in extraction; surviving values: msg, board, fname, chip, my_msg]
  SOAP message returned to the client

Slide 19
XSEQ conclusions
  Good:
    Suitable technological investment (XML is here to stay)
    Reduces in-house development (large set of standard tools)
    Enhances code sharing among sub-systems (extension mechanism)
    Enhances platform evolution (interpreted approach)
    Simplifies software configuration management (uniform usage of XML for code and data)
  Bad:
    XML is verbose (programming with XSEQ is not fun), but:
      An XML editor could help
      XSEQ could serve as the underlying syntax to store virtual instrumentation developed with graphical tools like LabVIEW
    Just a prototype.
    It is not being used for production

Slide 20
Outline (same as Slide 2)

Slide 21
CMS Trigger Supervisor context
  ~55 million channels, ~1 Mbyte per event
  CMS Control System: SOFTWARE
  L1 Decision Loop: HARDWARE

Slide 22
HW context: L1-Trigger Decision Loop
  Configuration:
    64 crates
    O(10^3) boards
    Firmware ~15 MB/board
    O(10^2) registers/board
  Testing:
    O(10^3) links
  Integration coordination:
    27 research institutes
  Time:
    Research: 1992-2000
    Development: 2000-present
    Fully replaced by 2010
  L1 decision loop operation ~ business

Slide 23
SW context: Experiment control system
  Run Control and Monitoring System (RCMS):
    Overall experiment control and monitoring
    RCMS framework implemented in Java
  Detector Control System (DCS):
    Detector safety, gas and fluid control, cooling system, rack and crate control, high and low voltage control, and detector calibration
    DCS is implemented with PVSSII
  Cross-platform Data AcQuisition middleware (XDAQ):
    C++ component-based distributed programming framework
    Used to implement the distributed event builder
  L1-Trigger Control and Hardware Monitoring System:
    Provides machine and human interfaces to operate, test and monitor the Level-1 decision loop hardware components.
  Experiment control system ~ business IT infrastructure

Slide 24
Project phases and terminology
  Diagram labels: HW context; Business: to filter the best events; Concept; Business needs; Project team; Prototype; Proof of concept; SW context; Business software infrastructure; Framework; Services and core developments; System; Architecture; Services; New business capabilities: e.g. configuration

Slide 25
Business needs and project team
  [Diagram: Trigger Supervisor GUI (0..n) and Trigger Supervisor (1) linked to the Experiment control system and to the subsystems: ECAL energy, HCAL energy, HF energy, R. Cal. Trigger, G. Cal. Trigger, G. Muon Trigger, GT/TCS, pattern comp., DT TF, CSC TF, RPC hits, DT hits, CSC hits]
  Business need: coordinate operation of CMS subsystems (e.g. configuration and test)
  TS team (2 + 1 or 2 students):
    Services + core developments
    Architecture
    Business capabilities
    Coordination and support of sub-system developers
    Communication
  1 developer per subsystem:
    Uses services to develop the subsystem architecture
    Customizes subsystem architecture as required by TS team

Slide 26
Baseline service infrastructure
  Table columns: Subsystem OSWI integration effort (C++, Linux) / Supervisory and Control Infrastructure development effort
    DCS (PVSSII, Windows): ++ / Ok
    RCMS (Java): ++
    XDAQ (C++, Linux): Ok / +
  CMS official software frameworks to develop distributed systems: DCS, RCMS, XDAQ
    Subsystems' Online SoftWare Infrastructure (OSWI) needs to be integrated
    Infrastructure should be oriented to develop SCADA systems
  XDAQ-based baseline solution + additional development to reach a SCADA framework

Slide 27
Core development: The Cell
  Synchronous and asynchronous SOAP API
  Other plug-ins:
    Command: RPC method.
    SOAP API extensions
    Monitoring items
  FSM plug-ins
  Xhannel infrastructure: designed to simplify access to web services (SOAP and HTTP/CGI) from operation transition methods
    Diagram labels: Tstore (DB); Monitor collector; Cells
  Control panel plug-ins, e.g. GT panel, DTTF panel
  HTTP/CGI: automatically generated, e.g. Cell FSM operation
  Cell plug-ins (FSM, commands, control panels) hide HW and SW platform evolution

Slide 28
Service providers: building blocks
  (diagram label: RCMS components)
  Tstore: DB interface. Exposes SOAP. 1 per system.
  Mon. Collector: polls all cell sensors. 1 per system.
  Mstore: interfaces the Monitor Collector with Tstore. 1 per system.
  Job control: remote startup of XDAQ applications. 1 per host.
  XS: reads the logging database. 1 per cell.
  Monitor sensor: cell interface to poll monitoring information. 1 per cell.
  Cell: facilitates subsystem integration and operation (additional development, next slide). 1 per crate.
  Log Collector: collects log statements from cells and forwards them to consumers. 1 per system.
  Architecture based uniquely on these components

Slide 29
Architecture
  Diagram labels: Building blocks + Subsystem usage model proposal (Users guide, Workshops, Support) = Control system, Monitoring system, Logging system, Start-up system

Slide 30
Control architecture
  1 crate ~ 1 cell
  Multicrate subsystems ~ 2 levels of subsystem cells (1 subsystem central cell)
  Centralized access to DBs
  Hierarchical control system
  Stable infrastructure on top of which new business capabilities can be defined

Slide 31
Monitoring architecture
  1 cell ~ 1 sensor
  Centralized system: 1 Collector, 1 Mstore
  Centralized access to DBs
  Infrastructure that facilitates hardware monitoring

Slide 32
Logging and start-up architecture
  1 cell ~ 1 XS
  Centralized system: 1 Collector
  1 host ~ 1 JC
  Auxiliary infrastructure

Slide 33
New business capabilities: How to
  Diagram labels: Entry cell; Operation states; Operation transitions; Service test; Operation transition methods; Particle physicist manager; Subsystem SW developer
  [State diagrams: one operation with states S1, S2, S3, S4 and transitions e12(), e23(), e34(), e43(); two operations with states S1, S2, S3 and transitions e12(), e23()]
  New business capabilities can be coordinated by particle physicist managers without SW expertise

Slide 34
Trigger Supervisor conclusions
  Design: services, architecture and business capabilities
  1 Services:
    Reduced number of building blocks, already developed in-house (except the Cell)
    Main building block: Cell
      Isolates hardware/software evolution from architecture implementation
      Adapts sub-system integration tasks to the academic background of the people involved (non-SW experts)
  2 Architecture:
    Uniquely based on 7 building blocks
    Simplifies sub-system integration coordination
    Stable infrastructure
    Isolates services evolution from the implementation of business capabilities
  3 New business capabilities:
    Coordination methodology associated with the architecture
    Facilitates the implementation of new business capabilities, taking into account the academic background of managers (non-SW experts)

Slide 35
Summary
  Enhancing control systems design and development with web-services technologies:
  1. XML-based programming language:
    Maximizes usage of existing XML standards and tools, good technological investment, maximum code sharing and platform evolution
  2. Control system design example:
    Services: hide HW/SW evolution
    Architecture: hides services evolution, stable infrastructure
    Business capabilities: developed on top of the architecture

Slide 36
Thank you very much!
For more information: [email protected] http://triggersupervisor.cern.ch
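
Illustrative note on Slide 33: the operation model shown there (states S1 to S4 linked by the transitions e12(), e23(), e34() and e43()) can be pictured as a small finite state machine. The following is only a sketch under stated assumptions. The class name Operation, the methods add_transition and fire, and the print calls standing in for operation transition methods are hypothetical and are not taken from the Trigger Supervisor code, which the slides describe as C++ on XDAQ.

```python
"""Sketch of a cell operation as a finite state machine (hypothetical names)."""
from typing import Callable, Dict, List, Tuple


class Operation:
    """State and transition names follow Slide 33; everything else is assumed."""

    def __init__(self, initial: str) -> None:
        self.state = initial
        self.history: List[str] = [initial]
        # (source state, transition name) -> (target state, transition method)
        self._table: Dict[Tuple[str, str], Tuple[str, Callable[[], None]]] = {}

    def add_transition(self, name: str, source: str, target: str,
                       method: Callable[[], None]) -> None:
        """Register one allowed transition and the method that implements it."""
        self._table[(source, name)] = (target, method)

    def fire(self, name: str) -> str:
        """Run a transition method and move to the target state."""
        key = (self.state, name)
        if key not in self._table:
            raise ValueError(f"transition {name} not allowed from {self.state}")
        target, method = self._table[key]
        method()  # placeholder for subsystem-specific work
        self.state = target
        self.history.append(target)
        return self.state


def build_operation() -> Operation:
    """Build the four-state operation drawn on Slide 33."""
    op = Operation("S1")
    transitions = [("e12", "S1", "S2"), ("e23", "S2", "S3"),
                   ("e34", "S3", "S4"), ("e43", "S4", "S3")]
    for name, source, target in transitions:
        # The lambda default binds the current name; print is a stand-in only.
        op.add_transition(name, source, target,
                          lambda n=name: print(f"running {n}()"))
    return op


if __name__ == "__main__":
    operation = build_operation()
    for step in ["e12", "e23", "e34", "e43"]:
        operation.fire(step)
    print(operation.history)
```

In this sketch the transition table plays the role of what a manager would define (states and allowed transitions), while the callables stand in for the operation transition methods written by the subsystem SW developer.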