WLCG Transfers Dashboard

Experiment Support, CERN IT Department, CH-1211 Geneva 23, Switzerland, www.cern.ch/it

WLCG Workshop in conjunction with CHEP 2012, 20.05.2012, New York

Julia Andreeva, David Tuckett, Daniel Dieguez, Danila Oleynik, Artem Petrosyan, Gunnar Roe, Michail Salichos, Alexandr Uzhinskiy



TRANSCRIPT

Page 1: WLCG Transfers Dashboard

WLCG Workshop in conjunction with CHEP 2012, 20.05.2012, New York

Julia Andreeva, David Tuckett, Daniel Dieguez, Danila Oleynik, Artem Petrosyan, Gunnar Roe, Michail Salichos, Alexandr Uzhinskiy

Page 2: WLCG Transfers Dashboard

Contents

• Motivation
• Overview of the key concepts of the WLCG transfer monitoring system
• Current status and issues
• Dashboard UI
• Integration of XRootD monitoring
• Summary

Page 3: WLCG Transfers Dashboard

Motivation

• Currently there is no tool that provides an overall view of data transfers at the WLCG scope (across LHC experiments, across the various technologies used, for example FTS and XRootD, across multiple local FTS instances, etc.).

• Every LHC experiment follows its own data transfers through a VO-specific monitoring system.

• There is a clear similarity between the tasks performed by all VO-specific transfer monitoring systems. Operations such as aggregation of FTS transfer statistics are done by every VO separately, though they could be done once, centrally, and then served to all experiments via a well-defined set of APIs.

• To organize data transfers in the most efficient way, experiments need more information than is currently available, for example correlations of data transfers between experiments, latencies related to SRM operations during data transfers, etc.

Page 4: WLCG Transfers Dashboard

Concept (1)

• WLCG transfer monitoring is a common solution which provides a cross-VO, cross-technology view not coupled with any VO-specific data management system

• VO transfer monitoring integration
  – Transfer events via MSG broker
    • Avoids polling and screen-scraping local FTS instances
  – Transfer statistics via Dashboard API
    • Avoids redundant event storage and statistics generation
  – Transfer plots via Dashboard UI
    • Avoids redundant development of common plots

[Diagram components: FTS instance, XRootD etc., MSG Broker, Dashboard (API, UI), VO Monitoring. Annotations: FTS is currently the main technology for CMS, ATLAS and LHCb; XRootD is currently the main technology for ALICE.]

Page 5: WLCG Transfers Dashboard

Concept (2)

• WLCG transfer monitoring is a common solution which provides a cross-VO, cross-technology view not coupled with any VO-specific data management system

[Diagram components: FTS instance, XRootD etc., MSG Broker, Dashboard (API, UI), VO Monitoring]

• Implementation started with FTS monitoring

Page 6: WLCG Transfers Dashboard

Current status (1)

• Required deployment of FTS 2.2.8, which was enabled for transfer status reporting via MSG (GT group)

• The prototype has been up and running for more than half a year

• Example of excellent collaboration between several groups in CERN IT (ES, GT, PES, DB), between IT and PH (active participation of the CMS and ATLAS computing teams), and between CERN and JINR (Dubna)

Page 7: WLCG Transfers Dashboard

Current status (2)

• The WLCG Transfers Dashboard was developed using a schema and UI similar to those of the ATLAS DDM Dashboard. This allowed a prototype to be put in place in a short time (~2 months).

• Full production setup is in place:
  - The schema was validated by Oracle experts from CERN IT-DB and deployed in production
  - Production collectors and UIs are running in redundant mode (2 hosts)
  - 2 production message brokers are set up (many thanks to Lionel Cons (CERN IT-GT) and the CERN IT-PES group)

• Testing and integration environment created (integration DB, test message broker, VMs for collectors and UIs)

• Alarms are enabled in case any of the production FTS instances does not report for longer than 2 hours

• First cycle of validation was performed by the CMS colleagues (special thanks to Josep Flix) and all reported bugs were fixed

• No problems with UIs or collectors were detected over the last months

• Announcing the system as being in production was delayed due to the results of the consistency checks
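The alarm rule on this slide (an FTS instance that has not reported for longer than 2 hours) can be sketched as a simple staleness check. This is a minimal illustration, not the Dashboard's actual collector code: the function name `silent_instances`, the host names, and the dictionary of last-report times are all assumptions made for the example.

```python
from datetime import datetime, timedelta

# Threshold taken from the slide: alarm after more than 2 hours of silence.
ALARM_THRESHOLD = timedelta(hours=2)


def silent_instances(last_report, now, threshold=ALARM_THRESHOLD):
    """Return the FTS instances whose last message is older than `threshold`.

    `last_report` maps a (hypothetical) instance name to the time of the
    most recent transfer message received from that instance.
    """
    return sorted(
        name for name, seen in last_report.items() if now - seen > threshold
    )


now = datetime(2012, 5, 20, 12, 0)
last_report = {
    "fts-a.example.org": datetime(2012, 5, 20, 11, 30),  # 30 minutes ago
    "fts-b.example.org": datetime(2012, 5, 20, 9, 45),   # 2 h 15 min ago
}
print(silent_instances(last_report, now))  # ['fts-b.example.org']
```

The comparison is strict, so an instance that reported exactly 2 hours ago does not yet raise an alarm, matching the "longer than 2 hours" wording.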

Page 8: WLCG Transfers Dashboard

Current status (3)

• The most important validation step is the consistency checks, performed to understand how trustworthy the data is. Data from the WLCG Transfers Dashboard is compared with PhEDEx and the ATLAS DDM Dashboard.

• First results of consistency checks with the pilot FTS server were very promising. However, after deployment of FTS 2.2.8 to all T1s, consistency checks showed a big discrepancy, up to 50% for ATLAS in particular.

• The problem was understood thanks to Michail Salichos (CERN IT-GT). It is caused by a bug in the activeMQ-cpp client used by the FTS publisher.

• A workaround was found (Michail Salichos). A fixed version of the FTS publisher was deployed to the TRIUMF and ASGC FTSs 3 weeks ago. Ongoing consistency checks show perfect agreement.

• Tentative schedule for the service to be in production: 2-3 weeks from now. This depends on patching all FTS services for the activeMQ-cpp client bug.

Page 9: WLCG Transfers Dashboard

Consistency checks

[Plots: ATLAS DDM plot for TRIUMF; WLCG Transfers Dashboard plot for TRIUMF]

Page 10: WLCG Transfers Dashboard

Dashboard UI: overview

Key Features
• Flexible filtering and grouping
• Statistics matrix & error samples
• Customizable plots
• Web API: JSON, XML

Implementation
• Uses common xbrowse UI framework originally developed for ATLAS DDM Dashboard 2.0

Page 11: WLCG Transfers Dashboard

Dashboard UI: filtering & grouping

• Filtering by sliding/fixed interval

• Filtering by VO

• Filtering by FTS server

• Filtering and grouping of sources / destinations by country, site, host, token

• GOCDB naming for cross-VO view

• VO-specific naming for single-VO view
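The difference between a sliding and a fixed interval can be illustrated with a short sketch. The semantics assumed here are not stated on the slide: a sliding interval is taken to mean "the last N hours relative to the time of the query", and a fixed interval is one with explicit start and end times. The function names and the event dictionaries are hypothetical.

```python
from datetime import datetime, timedelta


def sliding_window(now, hours):
    """Interval covering the last `hours` hours; it moves with `now`."""
    return now - timedelta(hours=hours), now


def fixed_window(start, end):
    """Interval with explicit bounds that do not move with the clock."""
    if end <= start:
        raise ValueError("end must be after start")
    return start, end


def in_window(events, window):
    """Keep events whose timestamp falls in the half-open range [start, end)."""
    start, end = window
    return [e for e in events if start <= e["time"] < end]


events = [
    {"time": datetime(2012, 5, 20, 8, 0), "vo": "atlas"},
    {"time": datetime(2012, 5, 20, 11, 0), "vo": "cms"},
]
now = datetime(2012, 5, 20, 12, 0)

recent = in_window(events, sliding_window(now, hours=2))
morning = in_window(
    events,
    fixed_window(datetime(2012, 5, 20, 0, 0), datetime(2012, 5, 20, 9, 0)),
)
print([e["vo"] for e in recent], [e["vo"] for e in morning])
```

Re-running the sliding query later shifts both bounds, while the fixed query always returns the same events; filtering by VO or FTS server would be an additional predicate on the same list.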

Page 12: WLCG Transfers Dashboard

Dashboard UI: matrix & error samples

• Matrix
  – Source
  – Destination
  ×
  – Efficiency
  – Throughput
  – Successes
  – Failures

• Error samples
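A statistics matrix of this kind can be sketched as an aggregation of transfer events into one cell per (source, destination) pair. The definitions used below are assumptions, since the slides do not spell them out: efficiency is taken as successes divided by all attempts, and throughput as successfully transferred bytes divided by the length of the reporting interval. The event field names (`src`, `dst`, `ok`, `bytes`) are hypothetical.

```python
from collections import defaultdict


def transfer_matrix(events, interval_seconds):
    """Aggregate transfer events into per-(source, destination) cells."""
    cells = defaultdict(lambda: {"successes": 0, "failures": 0, "bytes": 0})
    for ev in events:
        cell = cells[(ev["src"], ev["dst"])]
        if ev["ok"]:
            cell["successes"] += 1
            cell["bytes"] += ev["bytes"]
        else:
            cell["failures"] += 1

    matrix = {}
    for key, cell in cells.items():
        # A cell only exists if at least one event created it, so total > 0.
        total = cell["successes"] + cell["failures"]
        matrix[key] = {
            "successes": cell["successes"],
            "failures": cell["failures"],
            "efficiency": cell["successes"] / total,
            "throughput": cell["bytes"] / interval_seconds,  # bytes per second
        }
    return matrix


events = [
    {"src": "CERN", "dst": "KIT", "ok": True, "bytes": 1000},
    {"src": "CERN", "dst": "KIT", "ok": False, "bytes": 0},
    {"src": "CERN", "dst": "RAL", "ok": True, "bytes": 500},
]
matrix = transfer_matrix(events, interval_seconds=10)
print(matrix[("CERN", "KIT")])
```

Grouping by country, site, host or token would only change how the `(src, dst)` key is derived from each event.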

Page 13: WLCG Transfers Dashboard

Dashboard UI: plots

• Plots
  – Source
  – Destination
  – VO
  ×
  – Efficiency
  – Throughput
  – Successes
  – Failures

• Different kinds of plots are available
• Possibility to customize plots (time bins, number of shown items, etc.)
• See backup slides

Page 14: WLCG Transfers Dashboard

Dashboard UI: consistency

• Throughput side-by-side: PhEDEx vs. Dashboard

• Throughput difference: relative & absolute

• In development
  – Automated cross-checking with alarms

[Plots: 12 hours, CERN, KIT; 24 hours, CERN, RAL]
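The absolute and relative throughput differences, and the kind of automated cross-check described as being in development, can be sketched as follows. This is an illustration under stated assumptions, not the Dashboard's code: the VO-specific system (for example PhEDEx) is treated as the reference, the relative difference is taken with respect to that reference, and the 5% tolerance is an arbitrary value chosen for the example.

```python
def throughput_difference(reference, dashboard):
    """Return (absolute, relative) difference of dashboard vs. reference.

    The relative difference is None when the reference is zero, because
    the ratio is undefined in that case.
    """
    absolute = dashboard - reference
    relative = absolute / reference if reference else None
    return absolute, relative


def inconsistent_bins(reference_series, dashboard_series, tolerance=0.05):
    """Indices of time bins where the two systems disagree beyond `tolerance`."""
    flagged = []
    for i, (ref, dash) in enumerate(zip(reference_series, dashboard_series)):
        _, rel = throughput_difference(ref, dash)
        if rel is None:
            # Reference saw nothing; any dashboard traffic is a mismatch.
            if dash != 0:
                flagged.append(i)
        elif abs(rel) > tolerance:
            flagged.append(i)
    return flagged


phedex = [100.0, 200.0, 0.0]     # hypothetical throughput per time bin
dashboard = [100.0, 100.0, 0.0]  # second bin misses half of the traffic
print(throughput_difference(200.0, 100.0))   # (-100.0, -0.5)
print(inconsistent_bins(phedex, dashboard))  # [1]
```

A flagged bin such as the second one here corresponds to a 50% discrepancy, and the list of flagged bins is what an alarm would be raised on.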

Page 15: WLCG Transfers Dashboard

Next steps (FTS monitoring)

• System developers work in close contact with the VOs. Many thanks to CMS and ATLAS for their active participation. We received many feature requests, which will be addressed in future development:
  – Filter by FTS channel
  – FTS channel status: current and evolution
  – Status of the FTS queues. Correlations between transfer performance metrics and status of the queue
  – Transfer part statistics: SRM overhead, GRIDFTP
  – Ranking plots and quality map plots

Page 16: WLCG Transfers Dashboard

Integration of XRootD transfers

[Diagram components: Dashboard, MSG Broker, API, UI, FTS instance, XRootD, VO Monitoring]

• The XRootD federation monitoring part is under development
• It is being developed mainly by JINR (Dubna)

Page 17: WLCG Transfers Dashboard

XRootD monitoring

• Is being implemented with 3 levels of hierarchy:
  - local site
  - federation
  - global

Page 18: WLCG Transfers Dashboard

XRootD monitoring architecture

[Architecture diagram, annotated with the following user groups:]

• Users: site administrators and VO support teams at the site

• Users: VO computing teams, federation support teams

• Users: VO computing teams, site administrators, VO management, WLCG management

Page 19: WLCG Transfers Dashboard

XRootD monitoring architecture

Page 20: WLCG Transfers Dashboard

XRootD monitoring (local site)

• There are two implementations:
  - based on MonALISA (used by ALICE and, with some extensions, by CMS)
  - developed in the framework of the Tier3 monitoring project for ATLAS (Ganglia)

• Both approaches use XRootD monitoring data (summary and detailed flow) reported by XRootD redirectors over UDP. The content is not event-like.

• CMS and ATLAS developed readers that reformat these flows into event-like data containing: event time, source and destination domains, path and filename, username, file size, number of bytes read/written

• There is no knowledge about federation topology at the site level

• Event-like data, complemented with the name of the site which hosts the publisher, is published to MSG

• MonALISA or Ganglia UI
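The event-like record listed on this slide can be sketched as a small data structure that is complemented with the site name and serialized before publishing. The field names, the class name `XrootdTransferEvent`, the JSON encoding, and the example values are assumptions for illustration; the slides give only the list of fields, not the actual message schema, and the call that sends the message to the MSG broker is deliberately left out.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class XrootdTransferEvent:
    """One event-like record rebuilt from the XRootD monitoring flows."""
    event_time: int     # Unix timestamp of the event
    src_domain: str
    dst_domain: str
    path: str           # path and filename
    username: str
    file_size: int      # bytes
    bytes_read: int
    bytes_written: int


def to_message(event, site_name):
    """Add the name of the site hosting the publisher and encode as JSON."""
    payload = asdict(event)
    payload["site_name"] = site_name
    return json.dumps(payload, sort_keys=True)


event = XrootdTransferEvent(
    event_time=1337500800,
    src_domain="site-a.example.org",
    dst_domain="site-b.example.org",
    path="/store/data/file.root",
    username="someuser",
    file_size=2048,
    bytes_read=1024,
    bytes_written=0,
)
message = to_message(event, site_name="EXAMPLE-SITE")
print(message)
```

Because the site has no knowledge of the federation topology, the record carries only domains and the publishing site's name; mapping them onto the federation is left to the next level.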

Page 21: WLCG Transfers Dashboard

XRootD monitoring architecture

Page 22: WLCG Transfers Dashboard

XRootD monitoring (federation)

• At the federation level, data published by the sites will be consumed from MSG.

• Events coming from different sites will be aggregated and complemented with topology information. Data processing at the federation level is currently planned to be implemented with MapReduce (under development).

• Transfers handled by the federation will be exposed through the federation UI.

• The implementation of the federation UI is similar to the UI of the global Transfers Dashboard. Adapting the global WLCG Transfers UI is straightforward, since it is a JavaScript client application which expects data in JSON format and is fully decoupled from the data source.

• First prototype should be ready by the end of June.

• Federation data, in a format similar to FTS transfer status messages, will be published to MSG for the global monitoring system.
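The aggregation step described on this slide (events from several sites, complemented with topology information and processed in a map-reduce style) can be sketched in plain Python. This is only an in-memory illustration of the pattern, not the planned implementation, which the slides describe as still under development. The topology table, the site names, and the event fields are hypothetical.

```python
from collections import defaultdict

# Hypothetical topology: which federation site each domain belongs to.
TOPOLOGY = {
    "site-a.example.org": "SITE_A",
    "site-b.example.org": "SITE_B",
}


def map_event(event, topology):
    """Map step: emit ((source site, destination site), bytes read)."""
    src = topology.get(event["src_domain"], "UNKNOWN")
    dst = topology.get(event["dst_domain"], "UNKNOWN")
    return (src, dst), event["bytes_read"]


def reduce_pairs(pairs):
    """Reduce step: count the events and sum the bytes for each key."""
    totals = defaultdict(lambda: {"transfers": 0, "bytes": 0})
    for key, nbytes in pairs:
        totals[key]["transfers"] += 1
        totals[key]["bytes"] += nbytes
    return dict(totals)


events = [
    {"src_domain": "site-a.example.org", "dst_domain": "site-b.example.org", "bytes_read": 100},
    {"src_domain": "site-a.example.org", "dst_domain": "site-b.example.org", "bytes_read": 50},
    {"src_domain": "site-b.example.org", "dst_domain": "other.example.net", "bytes_read": 10},
]
summary = reduce_pairs(map_event(e, TOPOLOGY) for e in events)
print(summary)
```

Domains missing from the topology table are grouped under `UNKNOWN` rather than dropped, so that unmapped traffic stays visible in the aggregated result.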

Page 23: WLCG Transfers Dashboard

XRootD monitoring

• At the global level, the implementation done for FTS should to a large extent be re-used for XRootD (collectors and UI)

• The plan is to have the full chain enabled by the end of the year

Page 24: WLCG Transfers Dashboard

Summary

• The FTS monitoring part is ready and will be announced as being in production as soon as all FTS instances are patched for the activeMQ-cpp client bug. Further development follows the requirements of the experiments.

• The XRootD monitoring part is in the active development phase and progressing well. Hopefully the first prototype will be ready by the end of June. Full functionality should be enabled by the end of the year.

• Having FTS and XRootD monitoring covered by a global monitoring system would provide a fairly complete picture of WLCG transfers.

• Example of excellent collaboration between several groups in CERN IT, between IT and PH, and between CERN and JINR

Page 25: WLCG Transfers Dashboard

Links

• Dashboard UI (prototype): http://dashb-wlcg-transfers.cern.ch/ui/

• Twiki:
  https://twiki.cern.ch/twiki/bin/view/LCG/WLCGTransferMonitoring
  https://twiki.cern.ch/twiki/bin/view/LCG/WLCGTransfersDashboard

[email protected]

• Please see the poster during the CHEP poster session

Page 26: WLCG Transfers Dashboard

Backup slides. Dashboard UI: plot types

Page 27: WLCG Transfers Dashboard

Dashboard UI: plot types

Page 28: WLCG Transfers Dashboard

Dashboard UI: plot types

Page 29: WLCG Transfers Dashboard

Dashboard UI: plot customisation

Page 30: WLCG Transfers Dashboard

Dashboard UI: plot customisation

Page 31: WLCG Transfers Dashboard

Dashboard UI: plot customisation