
Page 1

The DataGrid WorkPackage 8

F. Carminati, FOCUS, 28 June 2001

Page 2

Distributed computing: basic principle

Physicists must in principle have equal access to data and resources.

The system will be extremely complex:
  Number of sites and components in each site
  Different tasks performed in parallel: simulation, reconstruction, scheduled and unscheduled analysis

Bad news: the basic tools are missing
  Distributed authentication & resource management
  Distributed namespace for files and objects
  Local resource management of large clusters
  Data replication and caching
  WAN/LAN monitoring and logging

Good news: we are not alone
  These issues are central to developments in the US and in Europe under the collective name of GRID

Page 3

The GRID: “dependable, consistent, pervasive access to [high-end] resources”

Dependable: can provide performance and functionality guarantees
Consistent: uniform interfaces to a wide variety of resources
Pervasive: ability to “plug in” from anywhere

Page 4

GLOBUS hourglass

Focus on architecture issues:
  Low participation cost, local control and support for adaptation
  Used to construct high-level, domain-specific solutions

A set of toolkit services:
  Security (GSI)
  Resource management (GRAM)
  Information services (MDS)
  Remote file management (GASS)
  Communication (I/O, Nexus)
  Process monitoring (HBM)

[Hourglass diagram: applications at the top, then diverse global services, core Globus services, and the local OS at the base.]

Page 5

DataGRID project

Huge project: 21 partners, 10 MEuro, 3 years, 12 workpackages

Global design still evolving: an ATF (Architecture Task Force) has been mandated to design the architecture of the system, to be delivered at PM6 (= now!)

Continuous feedback from users is needed

Users are in three workpackages:
  HEP in WP8
  Earth Observation in WP9
  Biology in WP10

Page 6

DataGRID Project

WP1   Grid Workload Management                 11/18.6
WP2   Grid Data Management                      6/17.6
WP3   Grid Monitoring Services                  7/10
WP4   Fabric Management                         5.7/15.6
WP5   Mass Storage Management                   1/5.75
WP6   Integration Testbed                       6/27
WP7   Network Services                          0.5/9.4
WP8   High Energy Physics Applications          1.7/23.2
WP9   Earth Observation Science Applications    3/9.4
WP10  Biology Science Applications              1.5/6
WP11  Dissemination and Exploitation            1.2/1.7
WP12  Project Management                        2.5/2.5

Page 7

WP8 mandate

Partners: CNRS, INFN, NIKHEF, PPARC, CERN

Coordinate exploitation of the testbed by the HEP experiments
  Installation kits have been developed by all experiments

Coordinate interaction of the experiments with the other WPs and the ATF
  Monthly meetings of the WP8 TWG
  Common feedback to the WPs and to the ATF

Identify common components that could be integrated in an HEP “upper middleware” layer

Large unfunded effort (776 m/m ~ 22 people); very small funded effort (60 m/m ~ 1.5 people)

Page 8

The multi-layer hourglass

[Layer diagram: OS & network services at the base; a bag of services (GLOBUS); DataGRID middleware, alongside PPDG, GriPhyN and EuroGRID; an HEP VO common application layer, next to Earth Observation and Biology; and a specific application layer for ALICE, ATLAS, CMS and LHCb, and for WP9 and WP10. Responsible bodies: the GLOBUS team, the DataGRID ATF, and the WP8-9-10 TWG.]

Page 9

ALICE Site List & Resources

Site         CPU (SI95)  Disk (GB)  Tape (TB)    Network (Mb/s)  People
Bari         150         100        -            12              2
Birming.     800         1000       (RAL?)       ?               1
Bologna      140         100        -            34              2
Cagliari     200         400        -            4               2
Catania      1100        600        1.2          4               6
CERN         350         400        5 (CASTOR)   ?               2
Dubna        800         1500       -            2               3
GSI (D)      1150        1000       ? (robot)    34              2
IRB (Kr)     400         2000       -            2               5
Lyon         38000       1000       30 (HPSS)    155             2
Merida (MX)  35          40         -            0.002           1
NIKHEF (NL)  850         1000       ? (robot)    10              1+
OSU (US)     3400        100        130 (HPSS)   622             2
Padova       210         200        -            16              1
Saclay       ?           ?          -            ?               1
Torino       850         1200       -            12              6
UNAM (MX)    35          40         -            2               2

Page 10

ALICE – Testbed Preparation

ALICE installation kit (M. Sitta)
List of ALICE users (& certificates)
Distributed production test (R. Barbera)
HPSS/CASTOR storage (Y. Schutz)
Logging/bookkeeping (Y. Schutz)
  Input from stdout & stderr
  Stored in MySQL/ORACLE (see the sketch after the milestone list below)

Integration in DataGRID WP6
  6/2001: List of ALICE users (& certificates) to WP6
  7/2001: Distributed test with bookkeeping system
  8/2001: Distributed test with PROOF (reconstruction)
  9/2001: Test with DataGRID WP6 resources
  12/2001: Test of the M9 GRID services release (preliminary)
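The bookkeeping step above can be illustrated with a small sketch: it scans a job's stdout/stderr log for a few fields and prints the SQL INSERT that a MySQL/ORACLE client would then execute. The log format, table name and column names are hypothetical, not the actual ALICE bookkeeping schema.

```cpp
// Minimal sketch: harvest bookkeeping data from a job's stdout/stderr log
// and emit an SQL INSERT for a (hypothetical) MySQL/ORACLE bookkeeping table.
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

int main(int argc, char **argv) {
   if (argc < 2) { std::cerr << "usage: bookkeep <job.log>\n"; return 1; }
   std::ifstream log(argv[1]);
   std::string line, run = "?", events = "?", status = "UNKNOWN";
   while (std::getline(log, line)) {
      std::istringstream is(line);
      std::string key;
      is >> key;                       // first token of each line
      if      (key == "Run:")    is >> run;
      else if (key == "Events:") is >> events;
      else if (key == "Status:") is >> status;
   }
   // The statement would be passed to a database client; here we just print it.
   std::cout << "INSERT INTO alice_jobs (run, nevents, status) VALUES ("
             << run << ", " << events << ", '" << status << "');\n";
   return 0;
}
```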

Page 11

ALICE distributed analysis model

[Diagram: on the local side a selection (parameters plus a procedure, a Proc.C macro) is defined against the TagDB/RDB; PROOF ships copies of the procedure to remote CPUs, which run it in parallel next to the distributed databases (DB1–DB6) and return the results.]

Bring the KB to the PB and not the PB to the KB
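The model can be illustrated with a minimal ROOT/PROOF macro. This sketch uses the present-day ROOT interface (TProof::Open, TChain::SetProof), which post-dates the 2001 prototype; the PROOF master, tree name and file URL are placeholders.

```cpp
// runProc.C - sketch of shipping a selection procedure (Proc.C) to a PROOF
// cluster so that it runs next to the data instead of pulling the data locally.
#include "TChain.h"
#include "TProof.h"

void runProc()
{
   TProof::Open("proof://master.example.org");              // placeholder PROOF master
   TChain chain("esdTree");                                  // placeholder tree name
   chain.Add("root://server.example.org//data/run001.root"); // placeholder data file
   chain.SetProof();                                         // route Process() through PROOF
   chain.Process("Proc.C");                                  // the procedure runs on the remote CPUs
}
```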

Page 12

AliRoot - DataGRID Services

Reconstruction & analysis for the ALICE Physics Performance Report will be the short-term use case.

ALICE Data Catalog being developed
  Data selection (by files, to become objects soon)

TGlobus class almost completed
  Interface to access GLOBUS security from ROOT

TLDAP class being developed
  Interface to access IMS from ROOT

Parallel socket classes (TPSocket, TPServerSocket); TFTP class; ROOT daemon (rootd) modified
  Parallel data transfer
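As a rough illustration of the parallel data transfer, the sketch below uses ROOT's TFTP class against a rootd server; the host, directory, file name and the choice of four parallel sockets are assumptions, not taken from the slide.

```cpp
// parallelPut.C - sketch of a parallel file transfer with ROOT's TFTP class,
// which talks to a rootd daemon running on the remote machine.
#include "TFTP.h"

void parallelPut()
{
   TFTP ftp("root://server.example.org", 4);   // placeholder host, 4 parallel sockets (assumption)
   ftp.cd("/data/alice");                      // placeholder remote directory
   ftp.put("galice.root");                     // upload the local file over parallel streams
}
```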

Page 13

ATLAS Grid Activities - 1

ATLAS physicists are involved in, and play relevant roles in, the EU and US grid projects started in the last year: EU DataGrid, US GriPhyN and PPDG; Japan is also active.

The Computing Model is based on Tier-1, Tier-2 and local Tier-3 centres connected using the grid.

Data Challenges are planned for testing and tuning of the grid and of the Model.

Work done so far:
  Requirements and use-case collection
  First tests of the available grid services
  Interaction with the different projects for cooperation and at least interoperability

Page 14

ATLAS Grid Activities - 2

The ATLAS Grid Working Group (EU, US, Japan, etc.) meets regularly during the s/w weeks, and two ad hoc workshops have been held.

Jobs have been run between different ATLAS sites using Globus plus additions.

A s/w kit for ATLAS Geant3 simulation + reconstruction on any Linux 6.2 or 7.1 machine is available and installed at INFN sites, CERN, Glasgow and Grenoble. Inclusion of Athena-Atlfast is being studied.

Production-style jobs have been run between Milan, CERN and Rome. Runs also including Glasgow, Grenoble and possibly Lund, using the kit, are planned for July.

Page 15

ATLAS Grid Activities - 3

Grid tests also using Objectivity federations are planned for summer-autumn.

ATLAS is involved in the proposed projects EU DataTAG and US iVDGL
  A transatlantic testbed is among the deliverables
  Very important for testing EU-US interoperability and for closer cooperation between the grid projects
  Links are needed at the level of the projects: at experiment level the links in ATLAS are excellent, but this is not enough

Tests are already going on between the US and the EU, but a lot of details (authorization, support) remain to be sorted out. More CERN involvement is requested.

Page 16

CMS Issues

CMS is a “single” collaboration: coordination is required among DataGrid, GriPhyN, PPDG and also INFN-Grid.

“Production” (Data Challenges) and the CPT projects are the driving efforts
  Computing & Core Software (CCS) + Physics Reconstruction & Selection (PRS) + Trigger and Data Acquisition Systems (TriDAS)

Since Grid is only one of the CMS computing activities, not all CMS sites and people are committed to Grid; therefore (only) some of the CMS sites will be involved in the first testbed activities.

The “grid” of sites naturally includes EU and US sites (plus other countries/continents).

Page 17

Current CMS Grid Activities

Definition of the CMS applications' requirements on the Grid

Planning and implementation of the first tests of “proto-Grid” tools
  Use of GDMP in a “real production” distributed environment [done]
  Use of CA authorization in a distributed-sites scenario [tested]
  Use of some Condor distributed (remote) submission schemes [done]
  Evaluation of the PM9 deliverables of WP1 and WP2 (Grid Scheduler and enhanced Replica Management)

Test dedicated resources to be provided by the participating sites
  To provide a reasonably complex and powerful trial
  To allow for “quasi-real” environment testing of the applications

CERN has a central role in these tests, and adequate resource support has to be provided in addition to the support needed for the experiments' Data Challenges.

Page 18

Current LHCb DataGrid activities

Distributed MC production at CERN, RAL, Liverpool and Lyon
  Extending to NIKHEF, Bologna, Edinburgh/Glasgow by end 2001

Current Testbed-0 tests use CERN and RAL (Globus problems encountered)
  Will extend to other sites in the next few months

Parts of the MC production system are used for the ‘short term use case’.

Active participation in WP8 from the UK, France, Italy, NIKHEF and CERN (15 people part-time); more dedicated effort is needed.

Page 19

LHCb short term use case for GRID

1. Production is started by filling out a Web form:
   Version of software
   Acceptance cuts
   Database
   Channel
   Number of events to generate
   Number of jobs to run
   Centre where the jobs should run

2. The Web form calls a Java servlet that:
   Creates a job script (one per job)
   Creates a cards file (one to three per job) with random-number seeds and job options (a sketch of this step follows after the list)
   (The cards files need to be accessible by the running job)
   Issues a job-submit command to run the script in batch -> WP1

3. The script does the following:
   Copies the executable, detector database and cards files
   Executes the executable
   The executable creates the output dataset
   Output copied to the local mass store -> WP5
   Log file copied to a web-browsable area
   Script calls a Java program (see 4)

4. The Java program calls a servlet at CERN to:
   Transfer the data back to CERN -> WP2
   Update the meta-database at CERN -> WP2
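To make step 2 more concrete, here is a small standalone sketch that writes one cards file with random-number seeds and a couple of job options. In the real system this is done by the Java servlet; the card keywords and values below are placeholders, not the actual LHCb card format.

```cpp
// Sketch of step 2: write a cards file with random-number seeds and job options
// for one MC production job. Card keywords and values are placeholders only.
#include <fstream>
#include <iostream>
#include <random>

int main()
{
   std::mt19937 rng(std::random_device{}());
   std::uniform_int_distribution<long> seed(1, 900000000);

   std::ofstream cards("job001.cards");
   cards << "RNDM " << seed(rng) << " " << seed(rng) << "\n"   // random-number seeds
         << "TRIG 500\n"                                        // number of events to generate
         << "CHANNEL Bd2JpsiKs\n";                              // hypothetical decay channel
   std::cout << "wrote job001.cards\n";
   return 0;
}
```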

Page 20

LHCb – Medium to long term planning

Would like to work with the middleware WPs from now on, helping to test early prototype software with the use case (parts of the MC production system).

More generally, would like to be involved in the development of the architecture for the long term.

Writing a planning document based on ‘use scenarios’; will develop a library of use scenarios for testing DataGrid software.

Starting a major project interfacing the LHCb OO software framework GAUDI to GRID services
  Awaiting recruitment of ‘new blood’

Page 21

GAUDI and external services (some GRID based)

[Architecture diagram of the GAUDI framework: the Application Manager steers the Algorithms, which work on the transient Event, Detector and Histogram stores through the Event Data, Detector Data and Histogram services; Persistency services and Converters, together with the Event Selector, connect the transient stores to external components such as the Event Database, the DataSet DB, the PDG Database and OS mass storage; further services include the Message, JobOptions, Particle Properties, Monitoring, Job and Config services, a Histogram Presenter, the user's Analysis Program, and others.]
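For reference, the skeleton below shows the shape of a GAUDI algorithm as seen by the Application Manager, based on the GaudiKernel Algorithm interface in its later public form (details have evolved since 2001); the class name and log message are placeholders.

```cpp
// Skeleton of a GAUDI Algorithm: the Application Manager calls initialize(),
// then execute() once per event, then finalize(); framework services such as
// the message service are obtained from the base class.
#include "GaudiKernel/Algorithm.h"
#include "GaudiKernel/MsgStream.h"

class GridDemoAlg : public Algorithm {
public:
   GridDemoAlg(const std::string& name, ISvcLocator* svcLoc)
      : Algorithm(name, svcLoc) {}

   StatusCode initialize() override { return StatusCode::SUCCESS; }

   StatusCode execute() override {
      MsgStream log(msgSvc(), name());
      log << MSG::INFO << "processing one event" << endmsg;
      // Event data would be read here through the Event Data Service.
      return StatusCode::SUCCESS;
   }

   StatusCode finalize() override { return StatusCode::SUCCESS; }
};
```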

Page 22

Issues and concerns

After a very difficult start, the project now seems to be picking up speed
  Interaction with the WPs and the ATF was difficult to establish

The experiments are very active; a large unfunded-effort activity is ongoing
  Approximately 5-6 people per experiment
  Funded effort still very slow in coming (2 out of 5!)
  CERN testbed still insufficiently staffed & equipped

DataGrid should have a central role in the LHC computing project – this is not the case now.

The GRID testbed and the LHC testbed should merge soon.

Interoperability with the US Grids is very important
  We have to keep talking with them and open the testbed to US colleagues