

21 March 2000 System Managers Meeting Slide 1

The Particle Physics Computational Grid

Paul Jeffreys/CCLRC


21 March 2000 System Managers Meeting Slide 2

Financial Times, 7 March 2000


21 March 2000 System Managers Meeting Slide 3

Front Page FT, 7 March 2000


21 March 2000 System Managers Meeting Slide 4

LHC Computing: Different from Previous Experiment Generations

– Geographical dispersion: of people and resources
– Complexity: the detector and the LHC environment
– Scale: Petabytes per year of data
– (NB – for purposes of this talk – mostly LHC specific)

~5000 Physicists, 250 Institutes, ~50 Countries

Major challenges associated with:
– Coordinated Use of Distributed Computing Resources
– Remote software development and physics analysis
– Communication and collaboration at a distance

R&D: A New Form of Distributed System: Data-Grid


21 March 2000 System Managers Meeting Slide 5

The LHC Computing Challenge – by example

• Consider UK group searching for Higgs particle in LHC experiment
  – Data flowing off detectors at 40 TB/sec (30 million floppies/sec)! (checked in the sketch below)
• Factor of c. 5 × 10^5 rejection made online before writing to media
  – But have to be sure not throwing away the physics with the background
  – Need to simulate samples to exercise rejection algorithms
• Simulation samples will be created around the world
• Common access required
  – After 1 year, 1 PB sample of experimental events stored on media
• Initial analysed sample will be at CERN, in due course elsewhere
  – UK has particular detector expertise (CMS: e-, e+, γ)
  – Apply our expertise to: access 1 PB exptal. data (located?), re-analyse e.m. signatures (where?) to select c. 1 in 10^4 Higgs candidates, but S/N will be c. 1 to 20 (continuum background), and store results (where?)
• Also .. access some simulated samples (located?), generate (where?) additional samples, store (where?) -- PHYSICS (where?)
• In addition .. strong competition
• Desire to implement infrastructure in generic way
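To make the orders of magnitude concrete, here is a rough back-of-envelope check of these figures; the 1.44 MB floppy size and the assumed ~10^7 seconds of effective data-taking per year are my own inputs, not stated on the slide.

```python
# Back-of-envelope check of the slide's rates and volumes.
# Assumptions (mine, not from the slide): 1.44 MB floppies, ~1e7 s of
# effective data-taking per year.
detector_rate = 40e12            # bytes/s off the detectors (40 TB/s)
floppy = 1.44e6                  # bytes per floppy disk
print(detector_rate / floppy)    # ~2.8e7, i.e. "30 million floppies/sec"

rejection = 5e5                  # online rejection factor before writing to media
to_media = detector_rate / rejection
beam_seconds = 1e7               # assumed effective seconds of running per year
print(to_media)                  # ~8e7 bytes/s, i.e. ~80 MB/s to media
print(to_media * beam_seconds)   # ~8e14 bytes, i.e. ~1 PB/year as quoted
```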


21 March 2000 System Managers Meeting Slide 6

Proposed Solution to LHC Computing Challenge (?)

• A data analysis ‘Grid’ for High Energy Physics

[Diagram: hierarchical Grid topology – CERN at the top, Tier 1 regional centres, multiple Tier 2 centres, and lower-tier (3, 4) sites beneath them.]


21 March 2000 System Managers Meeting Slide 7

Access Patterns

Data volumes (one year of a typical experiment):
– Raw Data: ~1000 TB
– Reconstruction passes Reco-V1, Reco-V2: ~1000 TB each
– Event summary data ESD-V1.1, ESD-V1.2, ESD-V2.1, ESD-V2.2: ~100 TB each
– Analysis object data (AOD): multiple sets of ~10 TB each

Access Rates (aggregate, average):
– 100 MB/s (2-5 physicists)
– 500 MB/s (5-10 physicists)
– 1000 MB/s (~50 physicists)
– 2000 MB/s (~150 physicists)

Typical particle physics experiment in 2000-2005: one year of acquisition and analysis of data (see the short sketch below).
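The quoted volumes and rates can be tabulated in a short sketch; the per-physicist averages simply divide each aggregate rate by a representative user count (my own midpoint of the quoted range).

```python
# Data tiers and aggregate access rates as quoted on the slide; the user
# counts are representative midpoints chosen for illustration.
tiers_tb = {
    "Raw": 1000,
    "Reco (per version)": 1000,
    "ESD (per version)": 100,
    "AOD (per set)": 10,
}

access = [   # (aggregate MB/s, representative number of physicists)
    (100, 3), (500, 7), (1000, 50), (2000, 150),
]

for name, size in tiers_tb.items():
    print(f"{name:20s} ~{size} TB")
for rate, users in access:
    print(f"{rate:5d} MB/s aggregate -> ~{rate / users:.0f} MB/s per physicist")
```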


21 March 2000 System Managers Meeting Slide 8

Hierarchical Data Grid

• Physical
  – Efficient network/resource use: local > regional > national > oceanic (a toy illustration of this preference follows this list)
• Human
  – University/regional computing complements national labs, which in turn complement the accelerator site
  – Easier to leverage resources, maintain control, assert priorities at regional/local level
  – Effective involvement of scientists and students independently of location
• The ‘challenge for UK particle physics’ … How do we:
  – Go from the 200 PC99 farm maximum of today to a 10000 PC99 centre?
  – Connect/participate in European and World-wide PP grid?
  – Write the applications needed to operate within this hierarchical grid?
  AND
  – Ensure other disciplines are able to work with us, our developments & applications are made available to others, exchange expertise, and enjoy fruitful collaboration with Computer Scientists and Industry
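A toy illustration of the local > regional > national > oceanic preference: given several replicas of a dataset, pick the one that is cheapest to reach. The site names, placements and cost values are invented for the example.

```python
# Toy replica selection under the hierarchical preference
# local > regional > national > oceanic.  Sites and placements are invented.
COST = {"local": 0, "regional": 1, "national": 2, "oceanic": 3}

replicas = {   # dataset -> {site: locality of that site relative to the user}
    "higgs-aod-v1": {"my-institute": "local", "RAL": "national", "CERN": "oceanic"},
    "sim-sample-7": {"RAL": "national", "FNAL": "oceanic"},
}

def best_site(dataset):
    """Return the replica site with the lowest locality cost."""
    sites = replicas[dataset]
    return min(sites, key=lambda s: COST[sites[s]])

print(best_site("higgs-aod-v1"))   # my-institute
print(best_site("sim-sample-7"))   # RAL
```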


21 March 2000 System Managers Meeting Slide 9

Quantitative Requirements

• Start with typical experiment’s Computing Model
• UK Tier-1 Regional Centre specification
• Then consider implications for UK Particle Physics Computational Grid
  – Over years 2000, 2001, 2002, 2003
  – Joint Infrastructure Bid made for resources to cover this
  – Estimates of costs
• Look further ahead

[Slides 10-17: no text transcribed – content not recoverable from this transcript.]

21 March 2000 System Managers Meeting Slide 18

Steering Committee

‘Help establish the Particle Physics Grid activities in the UK'
a. An interim committee be put in place.
b. The immediate objectives would be to prepare for the presentation to John Taylor on 27 March 2000, and to co-ordinate the EU 'Work Package' activities for April 14.
c. After discharging these objectives, membership would be re-considered.
d. The next action of the committee would be to refine the Terms of Reference (presented to the meeting on 15 March).
e. After that the Steering Committee will be charged with commissioning a Project Team to co-ordinate the Grid technical work in the UK.
f. The interim membership is:
   • Chairman: Andy Halley
   • Secretary: Paul Jeffreys
   • Tier 2 reps: Themis Bowcock, Steve Playfer
   • CDF: Todd Hoffmann
   • D0: Ian Bertram
   • CMS: David Britton
   • BaBar: Alessandra Forti
   • CNAP: Steve Lloyd

– The 'labels' against the members are not official in any sense at this stage, but the members are intended to cover these areas approximately!


21 March 2000 System Managers Meeting Slide 19

UK Project Team

• Need to really get underway!
• System Managers crucial!
• PPARC needs to see genuine plans and genuine activities…
• Must coordinate our activities
• And
  – Fit in with CERN activities
  – Meet needs of experiments (BaBar, CDF, D0, …)
• So … go through range of options and then discuss…


21 March 2000 System Managers Meeting Slide 20

EU Bid (1)

• Bid will be made to EU to link national grids
  – “Process” has become more than ‘just a bid’

• Almost reached the point where we have to be an active participant in the EU bid, and associated activities, in order to access data from CERN in the future

• Decisions need to be taken today…

• Timescale:
  – March 7 Workshop at CERN to prepare programme of work (RPM)

– March 17 Editorial meeting to look for industrial partners

– March 30 Outline of paper used to obtain pre-commitment of partners

– April 17 Finalise ‘Work Packages’ – see next slides

– April 25 Final draft of proposal

– May 1 Final version of proposal for signature

– May 7 Submit


21 March 2000 System Managers Meeting Slide 21

EU Bid (2)

• The bid was originally for 30 MECU, with matching contribution from national funding organisations
  – Now scaled down, possibly to 10 MECU

– Possibly as ‘taster’ before follow-up bid?

– EU funds for Grid activities in Framework VI likely to be larger

• Work Packages have been defined
  – Objective is that countries (through named individuals) take responsibility to split up the work and define deliverables within each, to generate draft content for the EU bid

– BUT

• Without doubt the same people will be well positioned to lead the work in due course

• .. And funds split accordingly??

• Considerable manoeuvring!

– UK – need to establish priorities, decide where to contribute…


21 March 2000 System Managers Meeting Slide 22

Work Packages

Contact points by work package:

Middleware
1 Grid Work Scheduling – Cristina Vistoli (INFN)
2 Grid Data Management – Ben Segal (CERN)
3 Grid Application Monitoring – Robin Middleton (UK)
4 Fabric Management – Tim Smith (CERN)
5 Mass Storage Management – Olof Barring (CERN)
Infrastructure
6 Testbed and Demonstrators – François Etienne (IN2P3)
7 Network Services – Christian Michau (CNRS)
Applications
8 HEP Applications – Hans Hoffmann (4 expts)
9 Earth Observation Applications – Luigi Fusco
10 Biology Applications – Christian Michau
Management
11 Project Management – Fabrizio Gagliardi (CERN)

Robin is ‘place-holder’ – holding UK’s interest (explanation in Open Session)


21 March 2000 System Managers Meeting Slide 23

UK Participation in Work Packages

MIDDLEWARE
1. Grid Work Scheduling
2. Grid Data Management – TONY DOYLE, Iain Bertram?
3. Grid Application Monitoring – ROBIN MIDDLETON, Chris Brew
4. Fabric Management
5. Mass Storage Management – JOHN GORDON

INFRASTRUCTURE
6. Testbed and demonstrators
7. Network Services – PETER CLARKE, Richard Hughes-Jones

APPLICATIONS
8. HEP Applications


21 March 2000 System Managers Meeting Slide 24

PPDG

DoE NGI Program PI Meeting, October 1999 – Particle Physics Data Grid – Richard P. Mount, SLAC

PPDG as an NGI Problem
PPDG Goals

The ability to query and partially retrieve hundreds of terabytes across Wide Area Networks within seconds,

making effective data analysis from ten to one hundred US universities possible.

PPDG is taking advantage of NGI services in three areas (a toy sketch of the caching idea follows below):
– Differentiated Services: to allow particle-physics bulk data transport to coexist with interactive and real-time remote collaboration sessions, and other network traffic.
– Distributed caching: to allow for rapid data delivery in response to multiple “interleaved” requests.
– “Robustness”: Matchmaking and Request/Resource co-scheduling: to manage workflow and use computing and net resources efficiently; to achieve high throughput.
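The distributed-caching point can be illustrated with a minimal single-node sketch: interleaved requests for the same file are served from a local LRU cache rather than re-fetched over the WAN. This is purely illustrative, not PPDG code; fetch_remote below is a stand-in for a real wide-area transfer.

```python
# Minimal illustration of caching for "interleaved" requests: repeated
# requests for a file hit a local LRU cache instead of the WAN.
from collections import OrderedDict

class FileCache:
    def __init__(self, capacity_files=100):
        self.capacity = capacity_files
        self.store = OrderedDict()            # filename -> contents

    def get(self, name, fetch_remote):
        if name in self.store:                # cache hit: no WAN traffic
            self.store.move_to_end(name)
            return self.store[name]
        data = fetch_remote(name)             # cache miss: fetch from remote site
        self.store[name] = data
        if len(self.store) > self.capacity:   # evict the least-recently-used file
            self.store.popitem(last=False)
        return data

fetches = []
fake_remote = lambda n: fetches.append(n) or f"<contents of {n}>"
cache = FileCache(capacity_files=2)
for f in ["evt001", "evt002", "evt001", "evt003", "evt001"]:
    cache.get(f, fake_remote)
print(fetches)   # ['evt001', 'evt002', 'evt003'] -- each fetched over the "WAN" once
```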


21 March 2000 System Managers Meeting Slide 25

PPDG

Richard P. Mount, CHEP 2000 – Data Analysis for SLAC Physics

PPDG Resources
• Network Testbeds:
  – ESNET links at up to 622 Mbits/s (e.g. LBNL-ANL)
  – Other testbed links at up to 2.5 Gbits/s (e.g. Caltech-SLAC via NTON)
• Data and Hardware:
  – Tens of terabytes of disk-resident particle physics data (plus hundreds of terabytes of tape-resident data) at accelerator labs;
  – Dedicated terabyte university disk cache;
  – Gigabit LANs at most sites.
• Middleware Developed by Collaborators:
  – Many components needed to meet short-term targets (e.g. Globus, SRB, MCAT, Condor, OOFS, Netlogger, STACS, Mass Storage Management) already developed by collaborators.
• Existing Achievements of Collaborators:
  – WAN transfer at 57 Mbytes/s;
  – Single site database access at 175 Mbytes/s.


21 March 2000 System Managers Meeting Slide 26

PPDG

DoE NGI Program PI Meeting, October 1999 – Particle Physics Data Grid – Richard P. Mount, SLAC

PPDG First Year Milestones
• Project start: August 1999
• Decision on existing middleware to be integrated into the first-year Data Grid: October 1999
• First demonstration of high-speed site-to-site data replication: January 2000
• First demonstration of multi-site cached file access (3 sites): February 2000
• Deployment of high-speed site-to-site data replication in support of two particle-physics experiments: July 2000
• Deployment of multi-site cached file access in partial support of at least two particle-physics experiments: August 2000


21 March 2000 System Managers Meeting Slide 27

PPDG

DoE NGI Program PI Meeting, October 1999 – Particle Physics Data Grid – Richard P. Mount, SLAC

First Year PPDG “System” Components

Middleware Components (Initial Choice) – see PPDG Proposal page 15:
– Object and File-Based Application Services: Objectivity/DB (SLAC enhanced), GC Query Object, Event Iterator, Query Monitor, FNAL SAM system
– Resource Management: start with human intervention (but begin to deploy resource discovery & mgmnt tools)
– File Access Service: components of OOFS (SLAC)
– Cache Manager: GC Cache Manager (LBNL)
– Mass Storage Manager: HPSS, Enstore, OSM (site-dependent)
– Matchmaking Service: Condor (U. Wisconsin) (see the toy matchmaking sketch below)
– File Replication Index: MCAT (SDSC)
– Transfer Cost Estimation Service: Globus (ANL)
– File Fetching Service: components of OOFS
– File Mover(s): SRB (SDSC); site specific
– End-to-end Network Services: Globus tools for QoS reservation
– Security and authentication: Globus (ANL)
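Since the table names Condor as the matchmaking service, a toy sketch of what matchmaking means in practice may help: pairing a request's requirements with advertised resources. This is an illustration only, not the ClassAd language or any real Condor interface, and the resource figures are invented.

```python
# Toy matchmaking: select resources whose advertised capacity satisfies a
# request's requirements.  Names and numbers are invented for illustration.
resources = [
    {"name": "ral-farm",  "free_cpus": 40,  "free_disk_gb": 500},
    {"name": "lvpl-map",  "free_cpus": 300, "free_disk_gb": 50},
]

request = {"cpus": 100, "disk_gb": 20}

def matches(res, req):
    return (res["free_cpus"] >= req["cpus"]
            and res["free_disk_gb"] >= req["disk_gb"])

print([r["name"] for r in resources if matches(r, request)])   # ['lvpl-map']
```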


21 March 2000 System Managers Meeting Slide 28

LHCb contribution to EU proposal HEP Applications Work Package

• Grid testbed in 2001, 2002
• Production of 10^6 simulated b->D*pi
  – Create 10^8 events at Liverpool MAP in 4 months
  – Transfer 0.62 TB to RAL
  – RAL dispatch AOD and TAG datasets to other sites
    • 0.02 TB to Lyon and CERN
• Then permit a study of all the various options for performing a distributed analysis in a Grid environment (rough transfer-time arithmetic is sketched below)
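For a rough feel of what the 0.62 TB Liverpool-to-RAL copy implies, the sketch below computes transfer times at the PVC rates proposed in the networking work packages later in this talk; the assumption of a fully utilised, dedicated link is mine.

```python
# Transfer-time arithmetic for the 0.62 TB MAP -> RAL copy, assuming a fully
# utilised link at the PVC rates proposed in the networking work packages.
volume_bytes = 0.62e12
for mbit in (10, 50, 100):
    seconds = volume_bytes * 8 / (mbit * 1e6)
    print(f"{mbit:4d} Mbit/s -> {seconds / 86400:.1f} days")
# 10 Mbit/s -> ~5.7 days; 50 Mbit/s -> ~1.1 days; 100 Mbit/s -> ~0.6 days
```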


21 March 2000 System Managers Meeting Slide 29

American Activities

• Collaboration with Ian Foster
  – Transatlantic collaboration using GLOBUS
• Networking
  – QoS tests with SLAC
  – Also link in with GLOBUS?
• CDF and D0
  – Real challenge to ‘export data’
  – Have to implement 4 Mbps connection
  – Have to set up mini Grid
• BaBar
  – Distributed LINUX farms etc in JIF bid


21 March 2000 System Managers Meeting Slide 30

Networking Proposal - 1

DETAILS
Demonstration of high rate site-to-site file replication. Single site-to-site tests at low rates to set up technologies and gain experience. These should include and benefit the experiments which will be taking data.
– TIER-0 -> TIER-1 (CERN-RAL)
– TIER-1 -> TIER-1 (FNAL-RAL)
– TIER-1 -> TIER-2 (RAL-LVPL, GLA/ED)
Use existing monitoring tools; adapt to function as resource predictors also. Multi-site file replication, cascaded file replication at modest rates. Transfers at Neo-GRID rates.

DEPENDENCIES/RISKS
Availability of temporary PVCs on inter- and intra-national WANs – or from collaborating industries. Needs negotiation now. Monitoring expertise/tools already available: PPNCG (UK), ICFA (worldwide).

RESOURCES REQUIRED
1.5 SY. 10 Mbit/s PVCs between sites in 00-01; 50 Mbit/s PVCs between sites in 01-02; >100 Mbit/s PVCs in 02-03.

MILESTONES
Jan-01: Demonstration of low rate transfers between all sites.
Jan-02: Demonstration of cascaded file transfer; demonstration of sustained modest rate transfers.
03: Implementation of sustained transfers of real data at rates approaching 1000 Mbit/s.


21 March 2000 System Managers Meeting Slide 31

Networking - 2

Differentiated Services
Deploy some form of DiffServ on dedicated PVCs. Measure high and low priority latency and rates as a function of strategy and load. [Depends upon QoS developments.] Attempt to deploy end-to-end QoS across several interconnected networks. (A minimal measurement-harness sketch follows below.)

DEPENDENCIES/RISKS
PVCs must be QoS capable. May rely upon proprietary or technology dependent factors in short term. Monitoring tools WAN end-to-end depend upon expected developments by network suppliers.

RESOURCES REQUIRED
1.5 SY. Same PVCs as in NET-1. Production deployment of QoS on WAN.

MILESTONES
Apr-01: Successful deployment and measurement of pilot QoS on PVCs under project control.
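A minimal sketch of the kind of measurement harness these QoS tests imply: repeatedly time TCP connection set-up to a peer and summarise the latency. The peer address is a placeholder, and a real DiffServ test would additionally set the ToS/DSCP bits on the socket and drive controlled background load, both omitted here.

```python
# Minimal latency-measurement harness: time TCP connection set-up to a peer
# and report the median.  Peer address is a placeholder; DSCP marking and
# background-load generation for real DiffServ tests are omitted.
import socket
import statistics
import time

def rtt_samples(host, port, n=10, timeout=2.0):
    samples = []
    for _ in range(n):
        t0 = time.perf_counter()
        try:
            with socket.create_connection((host, port), timeout=timeout):
                samples.append((time.perf_counter() - t0) * 1000.0)   # ms
        except OSError:
            pass                  # drop failed probes from the statistics
        time.sleep(0.1)
    return samples

if __name__ == "__main__":
    s = rtt_samples("example.org", 80)      # placeholder peer
    if s:
        print(f"median {statistics.median(s):.1f} ms over {len(s)} probes")
```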


21 March 2000 System Managers Meeting Slide 32

Networking - 3
Monitoring and Metrics for resource prediction
Survey and define monitoring requirements of GRID. Adapt existing monitoring tools for the measurement and monitoring needs of the network work packages (all NET-xx) as described here. In particular develop protocol-sensitive monitoring as will be needed for QoS. Develop and test prediction metrics (an illustrative prediction metric is sketched below).

DEPENDENCIES/RISKS
PPNCG monitoring; ICFA monitoring.

RESOURCES REQUIRED
0.5 SY.

MILESTONES
Dec-00: Interim report on GRID monitoring requirements.
Jul-01:
Dec-01: Finish of adaptation of existing tools for monitoring.
Jul-02: First prototype predictive tools deployed.
Dec-02: Report on tests of predictive tools.
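One simple candidate for the prediction metrics mentioned above is an exponentially weighted moving average over past throughput measurements; the smoothing factor and sample values below are invented for illustration, and real predictive tools in the PPNCG/ICFA context would be more elaborate.

```python
# Illustrative prediction metric: exponentially weighted moving average
# (EWMA) over past measured transfer rates, used as the forecast for the
# next transfer.  Smoothing factor and history are invented.
def ewma_forecast(samples_mbit, alpha=0.3):
    forecast = samples_mbit[0]
    for x in samples_mbit[1:]:
        forecast = alpha * x + (1 - alpha) * forecast
    return forecast

history = [8.2, 7.9, 6.5, 9.1, 8.8, 4.0, 8.5]   # Mbit/s seen on past transfers
print(f"predicted next-transfer rate: {ewma_forecast(history):.1f} Mbit/s")
```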


21 March 2000 System Managers Meeting Slide 33

Networking - 4
Data Flow modelling
Assimilate Monarc modelling tool set. Determine requirements of model of UK GRID – and to what extent this factorises or not from international GRID. Appraise work needed to adapt/write necessary components. Configure and run models in parallel with transfer tests NET-1 and QoS tests NET-2 for calibration purposes. Apply models to determination of GRID topology and resource location. (A toy data-flow model is sketched below.)

DEPENDENCIES/RISKS
Applicability of existing tools unknown before appraisal.

RESOURCES REQUIRED
3 SY.

MILESTONES
Oct-00: Assimilate Monarc.
Dec-00: Determine requirements of GRID model; determine scope of work needed to adapt/provide components.
??: Configure initial model.
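A toy version of the kind of data-flow model meant here (not MONARC itself): push a dataset through a chain of bandwidth-limited links and report when each hop of the cascade completes. The topology and link rates are invented.

```python
# Toy cascaded data-flow model: a dataset moves CERN -> RAL -> Tier 2 over
# bandwidth-limited links, each hop starting when the previous one finishes.
# Topology and rates are invented for illustration.
links = [                 # (from, to, link rate in Mbit/s)
    ("CERN", "RAL", 50),
    ("RAL",  "Liverpool", 10),
]

def cascade(volume_tb, links):
    elapsed = 0.0
    for src, dst, mbit in links:
        hop_seconds = volume_tb * 8e6 / mbit    # TB -> Mbit, divided by Mbit/s
        elapsed += hop_seconds
        print(f"{src:>9} -> {dst:<9} complete after {elapsed / 86400:.2f} days")

cascade(0.62, links)
```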


21 March 2000 System Managers Meeting Slide 34

Pulling it together…

• Networking:
  – EU work package
  – Existing tests
  – Integration of ICFA studies to Grid
• Will networking lead the non-experiment activities??
• Data Storage
  – EU work package
• Grid Application Monitoring
  – EU work package
• CDF, D0 and BaBar
  – Need to integrate these into Grid activities
  – Best approach is to centre on experiments


21 March 2000 System Managers Meeting Slide 35

…Pulling it all together

• Experiment-driven
  – Like LHCb, meet specific objectives
• Middleware preparation
  – Set up GLOBUS?
    • QMW, RAL, DL ..?
    – Authenticate
    – Familiar
    – Try moving data between sites
  • Resource Specification
  • Collect dynamic information (a small illustrative sketch follows at the end)
  • Try with international collaborators
  – Learn about alternatives to GLOBUS
  – Understand what is missing
  – Exercise and measure performance of distributed caching
• What do you think?
• Anyone like to work with Ian Foster for 3 months?!
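For "Resource Specification" and "Collect dynamic information", here is a small illustrative sketch that gathers live host information into a record an information service could publish; the field names are my own choice, not a Globus/MDS schema.

```python
# Collect a few dynamic facts about this host into a resource record of the
# sort an information service could publish.  Field names are illustrative,
# not any Globus/MDS schema.
import os
import platform
import shutil
import time

def resource_record(scratch_path="/tmp"):
    total, used, free = shutil.disk_usage(scratch_path)
    return {
        "hostname": platform.node(),
        "os": platform.system(),
        "cpus": os.cpu_count(),
        "load_1min": os.getloadavg()[0] if hasattr(os, "getloadavg") else None,
        "scratch_free_gb": round(free / 1e9, 1),
        "timestamp": time.time(),
    }

print(resource_record())
```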