Online monitoring and filtering, Graham, July 2009
Post on 14-Dec-2015
- Slide 1
Online monitoring and filtering
Graham, July 2009

- Slide 2
Monitoring and filtering in CODA v2
Up to 32 ROCs.
A single event builder (EB).
EB output is a stream of single events.
EB is connected to the Event Transport (ET) system.
ET has one or more online analysis, filter and monitor programs attached.
The event recorder attaches to ET and takes all events that survive filtering.

- Slide 3
CODA v2 system

- Slide 4
Simplified ET

- Slide 5
ET has the following features:
There can be more than one data producer per ET.
Each station can have a user-provided filter algorithm that looks at the data tags.
There can be more than one data consumer per station, but the algorithm is shared.
The system has fair-play algorithms (round robin vs. first free, etc.).
Stations can be configured to accept all events, to accept a sample of events, or to be skipped when their FIFO is full.

- Slide 6
Since data moves on a track, programs attached to stations after the producers but before the data recorder can modify or filter data.
Similarly, programs attached to stations after the data recorder can monitor the data and, if configured to skip events when their input is full, do not introduce dead time.

- Slide 7
Diagram labels: Hall B, ET1, ET2, ET3, EB, ER, ECAL, TOF, CerDTagger, DC, LA-CAL

- Slide 8
Online farm
Distributed
Needs processing cycles
Needs high bandwidth
Must survive node problems
Two modes: filter, monitor

- Slide 9
Reminder of EB architecture

- Slide 10
Online farm proposal

- Slide 11
Proposal
Each EMU in the final stage of the EB writes to an ET.
This ET provides one station per farm node.
It is configured to load balance between nodes.
The EMU has one or more backup ETs if the preferred one is full.
Each node has a local ET and several jobs.
The local ET gets data from the remote ET.
Each job gets data from and puts data to the local ET.
After filter/monitor, the local ET puts data to a remote ET.
One or more event recorders pull data from this ET.

- Slide 12
How it works
First ET is a source of data for one or more nodes.
Load balance and fault tolerance between nodes.
Second ET, local to the node, is the source for several jobs.
Load balance and fault tolerance between jobs.
Last ET has data sources from one or more nodes.
Control nodes and jobs using AFECS.
Why it works
Distributed and parallel.
Only requires configuration of ET systems; parameters can be tuned to alter behavior.

- Slide 13
Issues
What does the data look like at this stage? Events? Blocks of events? Does it matter?
What do we do with non-physics events?
Does it matter if event N appears before or after event N+1?
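
Illustrative sketch (not part of the slides)
The station options on slides 5 and 6 and the balancing policies on slides 5 and 12 can be modelled in a few lines of Python. This is a toy model under stated assumptions, not the ET API: the names Event, Station, offer and distribute are hypothetical, the "tag" is a single integer, and "sample of events" is assumed to mean taking every Nth selected event. It only shows why a skip-when-full station adds no dead time while a blocking station can stall the producer, and how round robin differs from first free.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, List


@dataclass
class Event:
    number: int
    tag: int  # stand-in for the data tags a station filter looks at


@dataclass
class Station:
    """Toy model of one station (hypothetical, not the real ET classes)."""
    name: str
    capacity: int                      # depth of the input FIFO
    blocking: bool = True              # False: skip events when the FIFO is full
    prescale: int = 1                  # assumed sampling rule: take every Nth
    select: Callable[[Event], bool] = lambda ev: True  # user filter on tags
    fifo: Deque[Event] = field(default_factory=deque)
    seen: int = 0                      # selected events counted so far
    skipped: int = 0                   # events dropped because the FIFO was full

    def offer(self, ev: Event) -> bool:
        """Return False only when a blocking station is full (producer stalls)."""
        if not self.select(ev):
            return True                # not wanted here, event moves on
        wanted = self.seen % self.prescale == 0
        if wanted and len(self.fifo) >= self.capacity:
            if self.blocking:
                return False           # back-pressure: this is the dead time
            self.skipped += 1          # monitor mode: drop and carry on
            self.seen += 1
            return True
        self.seen += 1
        if wanted:
            self.fifo.append(ev)
        return True


def distribute(events: List[Event], stations: List[Station],
               policy: str = "round_robin") -> List[Event]:
    """Give each event to exactly one of several parallel stations.

    Full stations are passed over, which is a crude form of fault tolerance.
    Returns the events that could not be placed anywhere.
    """
    start = 0
    unplaced = []
    n = len(stations)
    for ev in events:
        if policy == "round_robin":
            order = [(start + i) % n for i in range(n)]
        else:                          # "first_free"
            order = list(range(n))
        for idx in order:
            st = stations[idx]
            if len(st.fifo) < st.capacity:
                st.fifo.append(ev)
                start = (idx + 1) % n  # next event starts after this station
                break
        else:
            unplaced.append(ev)
    return unplaced
```

In this model a monitor placed after the recorder would be created with blocking=False, so offer never returns False and the upstream chain is never held up. A filter placed before the recorder would use blocking=True so that no event bypasses it. The distribute helper ignores select and prescale to keep the comparison of the two policies short.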