Experiment Support
CERN IT Department, CH-1211 Geneva 23, Switzerland
www.cern.ch/it
WLCG Transfers Dashboard
WLCG Workshop in conjunction with CHEP 2012,
20.05.2012, New York
Julia Andreeva, David Tuckett, Daniel Dieguez, Danila Oleynik, Artem Petrosyan, Gunnar Roe,
Michail Salichos, Alexandr Uzhinskiy
Contents
• Motivation
• Overview of the key concepts of the WLCG transfer monitoring system
• Current status and issues
• Dashboard UI
• Integration of XRootD monitoring
• Summary
Julia Andreeva, WLCG Workshop 2
Motivation
• Currently there is no tool which can provide an overall view of data transfers at WLCG scope (across the LHC experiments, across the various technologies used, for example FTS and XRootD, across multiple local FTS instances, etc.)
• Every LHC experiment follows its own data transfers through a VO-specific monitoring system.
• There is a clear similarity between the tasks performed by all VO-specific transfer monitoring systems. Operations such as aggregation of the FTS transfer statistics are done by every VO separately, though they could be done once, centrally, and then served to all experiments via a well-defined set of APIs.
• In order to organize data transfers in the most efficient way, the experiments need more information than is currently available: for example, correlations of data transfers between experiments, or latencies related to SRM operations during data transfers.
Concept (1)

• WLCG transfer monitoring is a common solution which provides a cross-VO, cross-technology view, not coupled to any VO-specific data management system
• VO transfer monitoring integration
  – Transfer events via MSG broker: avoids polling and screen-scraping local FTS instances
  – Transfer statistics via Dashboard API: avoids redundant event storage and statistics generation
  – Transfer plots via Dashboard UI: avoids redundant development of common plots
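The first integration point above can be sketched as a consumer that turns broker transfer-status messages into per-link statistics. This is an illustrative sketch only: the field names (`vo`, `src_site`, `dst_site`, `state`, `bytes`) are assumptions, not the actual FTS message schema, and a real collector would subscribe to the MSG broker rather than read a local string.

```python
import json
from collections import defaultdict

# Hypothetical shape of one FTS transfer-status message (field names
# are illustrative assumptions, not the real FTS 2.2.8 schema).
SAMPLE_MSG = json.dumps({
    "vo": "atlas",
    "src_site": "CERN-PROD",
    "dst_site": "TRIUMF-LCG2",
    "state": "DONE",            # DONE or FAILED
    "bytes": 3_500_000_000,
})

# Per-link counters keyed by (vo, source site, destination site).
stats = defaultdict(lambda: {"ok": 0, "failed": 0, "bytes": 0})

def consume(raw_msg):
    """Update per-link transfer statistics from one broker message."""
    msg = json.loads(raw_msg)
    link = (msg["vo"], msg["src_site"], msg["dst_site"])
    if msg["state"] == "DONE":
        stats[link]["ok"] += 1
        stats[link]["bytes"] += msg["bytes"]
    else:
        stats[link]["failed"] += 1

consume(SAMPLE_MSG)
```

Aggregating centrally like this, once per message, is what lets the same statistics be served to all experiments through the API.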
[Architecture diagram: FTS instances (currently the main technology for CMS, ATLAS and LHCb) and XRootD (currently the main technology for ALICE) publish to the MSG broker; the Dashboard serves the API and UI used by VO monitoring.]
Concept (2)
• WLCG transfer monitoring is a common solution which provides a cross-VO, cross-technology view, not coupled to any VO-specific data management system
[Architecture diagram as before, with the FTS path highlighted.]

Implementation started with FTS monitoring.
Current status (1)
• Required the deployment of FTS 2.2.8, which was enabled for transfer status reporting via MSG (GT group)
• The prototype has been up and running for more than half a year
• An example of excellent collaboration between several groups in CERN IT (ES, GT, PES, DB), between IT and PH (active participation of the CMS and ATLAS computing teams), and between CERN and JINR (Dubna)
Current status (2)
• The WLCG Transfer Dashboard was developed using a schema and UI similar to those of the ATLAS DDM Dashboard. This allowed a prototype to be put in place in a short time (~2 months).
• A full production setup is in place:
  – The schema was validated by Oracle experts from CERN IT-DB and was deployed in production
  – Production collectors and UIs run in redundant mode (2 hosts)
  – 2 production message brokers are set up (many thanks to Lionel Cons (CERN IT-GT) and the CERN IT-PES group)
• A testing and integration environment was created (integration DB, test message broker, VMs for collectors and UIs)
• Alarms are raised if any of the production FTS instances does not report for longer than 2 hours
• A first cycle of validation was performed by our CMS colleagues (special thanks to Josep Flix) and all reported bugs were fixed
• No problems with UIs or collectors were detected over the last months
• Announcing the system as production has been delayed due to the results of the consistency checks
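The 2-hour no-report alarm mentioned above amounts to a staleness check over the last message seen from each FTS instance. A minimal sketch, assuming we keep a timestamp per instance (the hostnames and data are illustrative, not the production alarm code):

```python
STALE_AFTER = 2 * 3600  # alarm threshold: no report for longer than 2 hours

def stale_instances(last_report, now):
    """Return the FTS instances whose last report is older than the threshold.

    last_report maps instance hostname -> unix timestamp of its last message.
    """
    return sorted(host for host, t in last_report.items()
                  if now - t > STALE_AFTER)

# Illustrative data: one instance reported 10 minutes ago, one 3 hours ago.
now = 1_000_000.0
last_report = {"fts.cern.ch": now - 600, "fts.triumf.ca": now - 3 * 3600}
stale = stale_instances(last_report, now)  # ["fts.triumf.ca"]
```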
Current status (3)
• The most important validation step is the consistency checks performed to establish data trustworthiness. Data are compared between the WLCG Transfer Dashboard, PhEDEx and the ATLAS DDM Dashboard.
• The first results of consistency checks with the pilot FTS server were very promising. However, after the deployment of FTS 2.2.8 to all T1s, the checks showed a large discrepancy, in particular for ATLAS (up to 50%).
• The problem was understood, thanks to Michail Salichos (CERN IT-GT). It is caused by a bug in the activemq-cpp client used by the FTS publisher.
• A workaround was found (Michail Salichos). A fixed version of the FTS publisher was deployed to the TRIUMF and ASGC FTS instances 3 weeks ago. Permanent consistency checks show perfect agreement.
• Tentative schedule: the service should be in production 2-3 weeks from now, depending on the patching of all FTS services for the activemq-cpp client bug.
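A consistency check of this kind reduces to comparing two throughput figures and reporting the absolute and relative difference. A minimal sketch (the numbers are illustrative, chosen to mirror the 50% ATLAS deficit seen before the fix):

```python
def discrepancy(dashboard_val, reference_val):
    """Absolute and relative difference between two throughput figures,
    e.g. WLCG Transfer Dashboard vs. PhEDEx / ATLAS DDM Dashboard."""
    abs_diff = dashboard_val - reference_val
    rel_diff = abs_diff / reference_val if reference_val else float("inf")
    return abs_diff, rel_diff

# Illustrative numbers: a 50% deficit like the one observed for ATLAS.
abs_d, rel_d = discrepancy(5.0, 10.0)  # abs_d = -5.0, rel_d = -0.5
```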
Consistency checks
[Side-by-side plots: ATLAS DDM plot for TRIUMF vs. WLCG Transfers Dashboard plot for TRIUMF]
Dashboard UI: overview
Key features
• Flexible filtering and grouping
• Statistics matrix & error samples
• Customizable plots
• Web API: JSON, XML

Implementation
• Uses the common xbrowse UI framework originally developed for the ATLAS DDM Dashboard 2.0
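Since the UI sits on a JSON/XML Web API, a client query is just a parameterised URL. A hedged sketch of building such a query: the base URL is the prototype UI from the Links slide, but the endpoint name and parameter names below are illustrative assumptions, not the documented API.

```python
from urllib.parse import urlencode

BASE = "http://dashb-wlcg-transfers.cern.ch/ui"  # prototype UI (Links slide)

def matrix_url(vo, hours, grouping="site"):
    """Build a JSON query URL for the transfer matrix.

    The endpoint name ("transfer-matrix.json") and the parameter names
    ("grouping", "interval", "vo") are hypothetical placeholders.
    """
    params = [("grouping", grouping), ("interval", hours), ("vo", vo)]
    return f"{BASE}/transfer-matrix.json?{urlencode(params)}"

url = matrix_url("cms", 12)
```

A real client would then fetch `url` and parse the JSON response; since the UI is decoupled from the data source, any consumer of the API can reuse the same queries.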
Dashboard UI: filtering & grouping

• Filtering by sliding/fixed interval
• Filtering by VO
• Filtering by FTS server
• Filtering and grouping of sources/destinations by country, site, host, token
• GOCDB naming for the cross-VO view
• VO-specific naming for the single-VO view
Dashboard UI: matrix & error samples

• Matrix: Source × Destination, showing Efficiency, Throughput, Successes and Failures
• Error samples
Dashboard UI: plots

• Plots of Efficiency, Throughput, Successes and Failures, broken down by Source, Destination or VO
• Different kinds of plots are available. Plots can be customized (time bins, number of shown items, etc.); see the backup slides
Dashboard UI: consistency

• Throughput side-by-side: PhEDEx vs. Dashboard
• Throughput difference: relative & absolute
• In development: automated cross-checking with alarms

[Example plots: CERN-KIT over 12 hours; CERN-RAL over 24 hours]
Next steps (FTS monitoring)
• The system developers work in close contact with the VOs. Many thanks to CMS and ATLAS for their active participation. We received many feature requests, which will be addressed by future development:
  – Filter by FTS channel
  – FTS channel status: current and evolution
  – Status of the FTS queues; correlations between transfer performance metrics and queue status
  – Statistics per transfer part: SRM overhead, GridFTP
  – Ranking plots and quality-map plots
Integration of XRootD transfers
[Architecture diagram as before, with the XRootD path highlighted.]

The XRootD federation monitoring part is under development, carried out mainly by JINR (Dubna).
XRootD monitoring
• Is being implemented with three levels of hierarchy:
  – local site
  – federation
  – global
XRootD monitoring architecture
• Local site level. Users: site administrators and VO support teams at the site
• Federation level. Users: VO computing teams and federation support teams
• Global level. Users: VO computing teams, site administrators, VO management and WLCG management
XRootD monitoring architecture

[Architecture diagram]
XRootD monitoring (local site)

• There are two implementations:
  – based on MonALISA (used by ALICE and, with some extensions, by CMS)
  – developed in the framework of the Tier-3 monitoring project for ATLAS (Ganglia)
• Both approaches use the XRootD monitoring data (summary and detailed flows) reported by XRootD redirectors over UDP; the content is not event-like
• CMS and ATLAS developed readers that reformat these flows into event-like data containing: event time, source and destination domains, path and file name, user name, file size, and number of bytes read/written
• There is no knowledge of the federation topology at the site level
• The event-like data, complemented with the name of the site hosting the publisher, is published to MSG
• Visualisation via the MonALISA or Ganglia UI
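The event-like record described above can be sketched as a small builder that assembles the listed fields and serialises them for MSG. The JSON key names below are illustrative assumptions; only the set of fields comes from the slide.

```python
import json

def make_event(time, src_domain, dst_domain, path, user,
               file_size, read_bytes, write_bytes, site):
    """Build one event-like record from an XRootD monitoring flow.

    Field set is taken from the slide; the key names are assumptions.
    """
    return {
        "time": time,                 # event time (unix timestamp)
        "src_domain": src_domain,
        "dst_domain": dst_domain,
        "file": path,                 # path and file name
        "user": user,
        "file_size": file_size,
        "read_bytes": read_bytes,
        "write_bytes": write_bytes,
        "site": site,                 # site hosting the publisher
    }

# Illustrative values only; serialised form is what would go to MSG.
payload = json.dumps(make_event(
    1337500000, "cern.ch", "triumf.ca", "/atlas/data/file.root",
    "someuser", 2_000_000_000, 512_000_000, 0, "CERN-PROD"))
```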
XRootD monitoring architecture

[Architecture diagram]
XRootD monitoring (federation)

• At the federation level, data published by the sites will be consumed from MSG
• Events coming from different sites will be aggregated and complemented with topology information. Data processing at the federation level is currently planned to be implemented with MapReduce (under development)
• Transfers handled by the federation will be exposed through the federation UI
• The implementation of the federation UI is similar to the UI of the global Transfer Dashboard. Adapting the global WLCG Transfer UI is straightforward, since it is a JavaScript client application that expects data in JSON format and is fully decoupled from the data source
• A first prototype should be ready by the end of June
• Federation data, in a format similar to the FTS transfer status messages, will be published to MSG for the global monitoring system
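The planned MapReduce aggregation can be illustrated in miniature: map each site-level event to a keyed pair, shuffle by key, then reduce each group. This is a toy in-process sketch of the pattern, not the planned implementation; the event fields are the assumed ones from the site publishers.

```python
from itertools import groupby

# Toy site-level events (field names assumed, as in the site publishers).
events = [
    {"site": "siteA", "dst_domain": "cern.ch", "read_bytes": 100},
    {"site": "siteA", "dst_domain": "cern.ch", "read_bytes": 50},
    {"site": "siteB", "dst_domain": "cern.ch", "read_bytes": 10},
]

def map_phase(event):
    """Map: emit ((site, destination domain), bytes read) per event."""
    yield (event["site"], event["dst_domain"]), event["read_bytes"]

def reduce_phase(key, values):
    """Reduce: sum the bytes read for one (site, destination) pair."""
    return key, sum(values)

# Shuffle: sort intermediate pairs so equal keys are adjacent, then reduce.
intermediate = sorted(kv for e in events for kv in map_phase(e))
aggregated = dict(
    reduce_phase(key, [v for _, v in group])
    for key, group in groupby(intermediate, key=lambda kv: kv[0])
)
```

In a real MapReduce deployment the shuffle and reduce run distributed across workers, but the per-key aggregation logic is the same.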
XRootD monitoring (global)
• At the global level, the implementation done for FTS should to a large extent be re-used for XRootD (collectors and UI)
• The plan is to have the full chain enabled by the end of the year
Summary
• The FTS monitoring part is ready and will be announced as production as soon as all FTS instances are patched for the activemq-cpp client bug. Further development follows the requirements of the experiments
• The XRootD monitoring part is in active development and progressing well. The first prototype should be ready by the end of June; full functionality should be enabled by the end of the year
• Having FTS and XRootD monitoring covered by a global monitoring system would provide a fairly complete picture of WLCG transfers
• An example of excellent collaboration between several groups in CERN IT, between IT and PH, and between CERN and JINR
Links
• Dashboard UI (prototype): http://dashb-wlcg-transfers.cern.ch/ui/
• Twiki:
  https://twiki.cern.ch/twiki/bin/view/LCG/WLCGTransferMonitoring
  https://twiki.cern.ch/twiki/bin/view/LCG/WLCGTransfersDashboard
• Please see the poster during the CHEP poster session
Backup slides

• Dashboard UI: plot types [screenshots]
• Dashboard UI: plot customisation [screenshots]