The European DataGrid Project: Technical Status


Page 1: The European DataGrid Project Technical status

CERN

The European DataGrid Project

Technical status

www.eu-datagrid.org

Bob Jones (CERN)

Deputy Project Leader

Page 2: The European DataGrid Project Technical status


DataGrid scientific applications

Developing grid middleware to enable large-scale usage by scientific applications

Earth Observation
• about 100 Gbytes of data per day (ERS 1/2)
• 500 Gbytes for the ENVISAT mission

Bio-informatics
• data mining on genomic databases (exponential growth)
• indexing of medical databases (Tb/hospital/year)

Particle Physics
• simulate and reconstruct complex physics phenomena millions of times
• LHC experiments will generate 6-8 PetaBytes/year

Page 3: The European DataGrid Project Technical status


The Project

• 9.8 M Euros EU funding over 3 years
• 90% for middleware and applications (HEP, Earth Obs. and Bio Med.)
• Three-year phased developments & demos (2001-2003)
• Total of 21 partners: Research and Academic institutes as well as industrial companies
• Extensions (time and funds) on the basis of first successful results:
  DataTAG (2002-2003) www.datatag.org
  CrossGrid (2002-2004) www.crossgrid.org
  GridStart (2002-2004) www.gridstart.org

Page 4: The European DataGrid Project Technical status


Research and Academic Institutes
• CESNET (Czech Republic)
• Commissariat à l'énergie atomique (CEA) – France
• Computer and Automation Research Institute, Hungarian Academy of Sciences (MTA SZTAKI)
• Consiglio Nazionale delle Ricerche (Italy)
• Helsinki Institute of Physics – Finland
• Institut de Fisica d'Altes Energies (IFAE) – Spain
• Istituto Trentino di Cultura (IRST) – Italy
• Konrad-Zuse-Zentrum für Informationstechnik Berlin – Germany
• Royal Netherlands Meteorological Institute (KNMI)
• Ruprecht-Karls-Universität Heidelberg – Germany
• Stichting Academisch Rekencentrum Amsterdam (SARA) – Netherlands
• Swedish Research Council – Sweden

Assistant Partners

Industrial Partners
• Datamat (Italy)
• IBM-UK (UK)
• CS-SI (France)

Page 5: The European DataGrid Project Technical status


EDG structure: work packages

The EDG collaboration is structured in 12 Work Packages:

WP1: Work Load Management System

WP2: Data Management

WP3: Grid Monitoring / Grid Information Systems

WP4: Fabric Management

WP5: Storage Element

WP6: Testbed and demonstrators

WP7: Network Monitoring

WP8: High Energy Physics Applications

WP9: Earth Observation

WP10: Biology

WP11: Dissemination

WP12: Management

(WP8–WP10 are the application work packages)

Page 6: The European DataGrid Project Technical status


Project Schedule

Project started on 1/1/2001

TestBed 0 (early 2001): international testbed 0 infrastructure deployed
• Globus 1 only - no EDG middleware

TestBed 1 (now): first release of EU DataGrid software to defined users within the project:
• HEP experiments, Earth Observation, Biomedical applications

Project successfully reviewed by EU on March 1st 2002

TestBed 2 (end 2002): builds on TestBed 1 to extend facilities of DataGrid

TestBed 3 (2nd half 2003)

Project completion expected by end 2003

Page 7: The European DataGrid Project Technical status


EDG middleware GRID architecture

[Layered architecture diagram: applications sit on top of the EDG middleware, which in turn builds on Globus. The layers shown are:]

• Local Computing: Local Application, Local Database
• Grid Application Layer: Data Management, Job Management, Metadata Management, Service Index
• Collective Services: Information & Monitoring, Replica Manager, Grid Scheduler
• Underlying Grid Services: Computing Element Services, Storage Element Services, Replica Catalog, SQL Database Services, Authorization, Authentication and Accounting
• Fabric Services: Configuration Management, Node Installation & Management, Monitoring and Fault Tolerance, Resource Management, Fabric Storage Management
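To make the layering concrete, the sketch below (Python, with class and method names invented for this example; they are not EDG code or APIs) traces a job from a local application down the stack: the Grid Scheduler in the collective layer consults Information & Monitoring and the Replica Catalog, then hands the job to a Computing Element, which relies on the fabric services of its site.

# Illustrative sketch only: the names mirror the boxes in the diagram above,
# not real EDG classes.

class InformationAndMonitoring:
    def list_computing_elements(self):
        # In EDG this would come from the grid information system (MDS / R-GMA).
        return ["ce.cern.ch", "ce.nikhef.nl"]

class ReplicaCatalog:
    def replicas(self, logical_file):
        # Maps a logical file name to the Storage Elements holding copies.
        return {"lfn:input.dat": ["se.cern.ch"]}.get(logical_file, [])

class ComputingElement:
    def __init__(self, name):
        self.name = name

    def run(self, executable):
        # Fabric services (batch system, local storage) do the real work here.
        print(f"{self.name}: running {executable}")

class GridScheduler:
    """Collective-layer broker: matches a job to a Computing Element."""

    def __init__(self, info, catalog):
        self.info, self.catalog = info, catalog

    def submit(self, executable, input_file):
        ces = self.info.list_computing_elements()
        data_sites = {se.split(".", 1)[1] for se in self.catalog.replicas(input_file)}
        # Prefer a Computing Element at a site that already holds the input data.
        chosen = next((ce for ce in ces if ce.split(".", 1)[1] in data_sites), ces[0])
        ComputingElement(chosen).run(executable)

# A local application at the top of the stack submits through the layers:
GridScheduler(InformationAndMonitoring(), ReplicaCatalog()).submit(
    "/bin/analysis", "lfn:input.dat")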

Page 8: The European DataGrid Project Technical status


EDG interfaces

[Diagram: the same layered stack as on the previous slide, with an Object to File Mapping service added to the Grid Application Layer, annotated with the external systems and actors the middleware interfaces with: Scientists, Application Developers, System Managers, User Accounts, Certificate Authorities, Computing Elements, Batch Systems (PBS, LSF, etc.), Operating Systems, Storage Elements, File Systems, Mass Storage Systems (HPSS, Castor).]
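In day-to-day use, the User Accounts and Certificate Authorities in this picture mean that a user holds an X.509 certificate issued by a recognised CA and creates a short-lived proxy credential before working on the testbed. A minimal sketch of that step, assuming the standard Globus Toolkit command-line tools (grid-proxy-init, grid-proxy-info) are installed and the certificate sits in its default location:

import subprocess

def create_proxy():
    # Creates a short-lived proxy from the user's certificate and private key;
    # grid-proxy-init prompts for the key pass phrase on the terminal.
    subprocess.run(["grid-proxy-init"], check=True)

def proxy_info():
    # grid-proxy-info reports the proxy subject and its remaining lifetime.
    result = subprocess.run(["grid-proxy-info"], check=True,
                            capture_output=True, text=True)
    return result.stdout

if __name__ == "__main__":
    create_proxy()
    print(proxy_info())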

Page 9: The European DataGrid Project Technical status


EDG overview: current project status

EDG currently provides a set of middleware services:
• Job & Data Management (a job-submission sketch follows at the end of this slide)
• GRID & Network monitoring
• Security, Authentication & Authorization tools
• Fabric Management

Runs on the Linux Red Hat 6.1 platform; site install & config tools and a set of common services are available

5 core EDG 1.2 sites currently belong to the EDG-Testbed: CERN (CH), RAL (UK), NIKHEF (NL), CNAF (I), CC-Lyon (F)
• also deployed on other testbed sites (~15) via CrossGrid, DataTAG and national grid projects
• actively used by application groups

Intense middleware development is continuously going on, concerning:
• new features for job partitioning and check-pointing, billing and accounting
• new tools for Data Management and Information Systems
• integration of network monitoring information inside the brokering policies
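To give a feel for the Job Management service from the user's side, here is a minimal sketch of submitting a trivial job through the workload management system. The JDL attributes (Executable, StdOutput, OutputSandbox, ...) are the standard ones; the submit command name, the executable and the file names are illustrative assumptions, since the exact user-interface commands vary between releases.

import subprocess
import tempfile

# Sketch of a trivial job description in JDL, handed to the Resource Broker,
# which matches it to a Computing Element on the testbed.
JDL = """
Executable    = "/bin/hostname";
StdOutput     = "hostname.out";
StdError      = "hostname.err";
OutputSandbox = {"hostname.out", "hostname.err"};
"""

def submit(jdl_text):
    # Write the JDL to a temporary file and pass it to the submit command
    # (command name assumed here; it differs between EDG releases).
    with tempfile.NamedTemporaryFile("w", suffix=".jdl", delete=False) as f:
        f.write(jdl_text)
        path = f.name
    subprocess.run(["edg-job-submit", path], check=True)

if __name__ == "__main__":
    submit(JDL)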

Page 10: The European DataGrid Project Technical status


Testbed Sites

[Map of testbed sites: Dubna, Moscow, RAL, Lund, Lisboa, Santander, Madrid, Valencia, Barcelona, Paris, Berlin, Lyon, Grenoble, Marseille, Brno, Prague, Torino, Milano, BO-CNAF, PD-LNL, Pisa, Roma, Catania, ESRIN, CERN, IPSL, ESTEC, KNMI]

Page 11: The European DataGrid Project Technical status


Tutorials

DAY 1
• Introduction to Grid computing and overview of the DataGrid project
• Security
• Testbed overview
• Job Submission
• lunch
• hands-on exercises: job submission

DAY 2
• Data Management
• LCFG, fabric mgmt & sw distribution & installation
• Applications and Use cases
• Future Directions
• lunch
• hands-on exercises: data mgmt

The tutorials are aimed at users wishing to "gridify" their applications using EDG software and are organized over 2 full consecutive days.

December dates: 2 & 3 – Edinburgh; 5 & 6 – Turin; 9 & 10 – NIKHEF

To date about 120 people trained

http://hep-proj-grid-tutorials.web.cern.ch/hep-proj-grid-tutorials/

Page 12: The European DataGrid Project Technical status


Through links with sister projects, there is the potential for a truly global scientific applications grid

Demonstrated at IST2002 and SC2002 in November

Related Grid projects

Page 13: The European DataGrid Project Technical status


Details of release 1.3

Based on Globus 2.0beta, but with binary modifications taken from Globus 2.2 for:
• large file transfers (GridFTP)
• "lost" jobs (GASS cache)
• unstable information system (MDS 2.2)
• new Replica Catalog schema

More reliable job submission
• Resource Broker returns errors if overloaded
• stability tests successfully passed
• minor extensions to JDL

Improved data management tools
• GDMP v3.2
  • automatic & explicit triggering of staging to MSS
  • support for parallel streams (configurable)
• edg-replica-manager v2.0
  • uses GDMP for MSS staging
  • shorter aliases for commands (e.g. edg-rm-l for edg-replica-manager-listReplicas)
  • new file mgmt commands: getBestFile, cd, ls, cat, etc. (a sketch of the replica-selection idea follows this slide)
  • support for parallel streams (configurable)

Better fabric mgmt
• bad RPMs no longer block installation

Available on Linux RH 6; not backward compatible with EDG 1.2.

Addresses bugs found by applications in EDG 1.2; being deployed in November.
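Among the new file management commands, getBestFile is meant to fetch, from the replicas registered for a logical file, the copy that is cheapest to reach; the incremental steps for testbed 2 (next slide) add a network cost function for exactly this kind of decision. A toy sketch of the selection logic, with an invented catalogue and invented cost numbers standing in for real monitoring data:

# Toy illustration of replica selection in the spirit of getBestFile: given the
# replicas registered in the Replica Catalog for one logical file, pick the one
# with the lowest estimated transfer cost. The catalogue contents and the cost
# numbers are invented; in EDG the costs would come from network monitoring.
replica_catalog = {
    "lfn:higgs-candidates.root": [
        "gsiftp://se.cern.ch/data/higgs-candidates.root",
        "gsiftp://se.nikhef.nl/data/higgs-candidates.root",
        "gsiftp://se.cnaf.infn.it/data/higgs-candidates.root",
    ]
}

# Estimated cost of fetching from each Storage Element (e.g. seconds to transfer
# to the local Computing Element).
network_cost = {
    "se.cern.ch": 4.0,
    "se.nikhef.nl": 11.5,
    "se.cnaf.infn.it": 7.2,
}

def get_best_file(logical_name):
    """Return the physical replica with the lowest estimated network cost."""
    replicas = replica_catalog[logical_name]
    return min(replicas, key=lambda url: network_cost[url.split("/")[2]])

print(get_best_file("lfn:higgs-candidates.root"))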

Page 14: The European DataGrid Project Technical status


Incremental Steps for testbed 2

1. Fix "show-stoppers" for application groups – middleware WPs (continuous)

2. Build EDG1.2.x with autobuild tools

3. Improved (automatic) release testing

4. Automatic installation & configuration procedure for pre-defined site

5. Start autobuild server for RH 7.2 and attempt build of release 1.2.x

6. Updated fabric mgmt tools

7. Introduce prototypes in parallel to existing modules: RLS, R-GMA

8. GLUE-modified info providers/consumers

9. Storage Element v1.0

10. Introduce Reptor

11. Add NetworkCost Function

12. GridFTP server access to CASTOR

13. Introduce VOMS

14. Improved Res. Broker

15. LCFGng for RH 7.2

16. Storage Element v2.0

17. Integrate mapcentre and R-GMA

18. Storage Element V3.0

Expect this list to be updated regularly

Prioritized list of improvements to be made to the current release, as established with users from September through to the end of 2002

Page 15: The European DataGrid Project Technical status


Plans for the future

Further developments in 2003
• further iterative improvements to middleware driven by users' needs
• more extensive testbeds providing more computing resources
• prepare EDG software for future migration to the Open Grid Services Architecture

Interaction with the LHC Computing Grid (LCG)
• LCG intends to make use of the DataGRID middleware
• LCG is contributing to DataGRID:
  • testbed support and infrastructure: access to more computing resources in HEP computing centres
  • testing and verification: reinforce the testing group and maintain a certification testbed
  • fabric management and middleware development

New EU project
• make plans to preserve the current major asset of the project: probably the largest Grid development team in the world
• EoI for FP6 (www.cern.ch/egee-ei), possible extension of the project, etc.