LCG LHC Computing Grid Project: From the Web to the Grid
23 September 2003
Jamie Shiers, Database Group, IT Division, CERN, Geneva, Switzerland

Overview
- Very brief overview of CERN
- Use of Oracle at CERN: a partnership lasting two decades
- From the Large Electron Positron collider (LEP) to the Large Hadron Collider (LHC)
- The LHC Computing Grid (LCG) and Oracle's role

CERN: The European Organisation for Nuclear Research, the European Laboratory for Particle Physics
- Fundamental research in particle physics
- Designs, builds & operates large accelerators
- Financed by 20 European countries (member states) + others (US, Canada, Russia, India, …)
- 1 MSF budget: operation + new accelerators
- 2000 staff, plus users (researchers) from all over the world
- LHC (starts ~2007): an experiment has 2000 physicists from 150 universities, apparatus costing ~€300M, computing ~€250M to set up and ~€60M/year to run, and a 10+ year lifetime

[Aerial photo: the Geneva region, showing the airport, the CERN Computer Centre and the 27 km ring]

LEP: 1989-2000 (RIP)
- 27 km ring with counter-circulating electrons and positrons
- Oracle Database selected to help with LEP construction; originally ran on PDP-11, later VAX, IBM, Sun, now Linux
- Oracle now used during the LEP dismantling phase: data on LEP components must be kept forever
- Oracle is now used across the entire spectrum of the lab's activities: several Sun-based clusters (8i OPS, 9i RAC), many stand-alone Linux-based systems, both database and increasingly Application Server

Highlights of the LEP Era
- LEP computing started with the mainframe: initially IBM running VM/CMS, a large VAXcluster, also a Cray
- In 1989, the first proposal of what led to the Web was made; somewhat heretical at the time, being strongly based on e.g. Internet protocols when the official line was OSI
- The goal was to simplify the task of sharing information amongst physicists, who are by definition distributed across the world
- Technology convergence: explosion of the Internet, explosion of the Web
- In the early 1990s, first steps towards fully distributed computing with farms of RISC processors running Unix: the SHIFT project, winner of a ComputerWorld Honors Award

The Large Hadron Collider (LHC)
- A new world-class machine in the LEP tunnel (first proposed in 1979!)

The LHC machine
- Two counter-circulating proton beams; collision energy of 14 TeV
- 27 km of magnets with a field of 8.4 Tesla
- Superfluid helium cooled to 1.9 K: the world's largest superconducting structure

The ATLAS detector: the size of a 6-floor building!

The ATLAS cavern, January 2003

Data Acquisition
- Multi-level trigger: filters out background and reduces the data volume
- Level 1: special hardware, fed by the ATLAS detector at a 40 MHz interaction rate, equivalent to 2 PetaBytes/sec
- Level 2: embedded processors
- Level 3: giant PC cluster, delivering 160 Hz (320 MB/sec) to data recording & offline analysis
- Record data 24 hours a day, 7 days a week: equivalent to writing a CD every 2 seconds
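The quoted rates can be cross-checked with a little arithmetic. The sketch below (Python, not part of the original talk) derives the per-event data sizes implied by the figures on the slide and reproduces the "CD every 2 seconds" comparison; the 650 MB CD capacity is an assumed value, and the derived sizes are back-of-envelope results, not official ATLAS numbers.

```python
# Back-of-envelope check of the data-acquisition figures quoted above.
# Input rates are taken from the slide; per-event sizes are derived from them.

CD_CAPACITY_MB = 650           # assumed CD capacity for the "CD every 2 seconds" comparison

interaction_rate_hz = 40e6     # Level-1 input: 40 MHz interaction rate
raw_rate_mb_s = 2e9            # "equivalent to 2 PetaBytes/sec" expressed in MB/s

recorded_rate_hz = 160         # Level-3 output: 160 Hz
recorded_rate_mb_s = 320       # 320 MB/sec to data recording & offline analysis

raw_event_mb = raw_rate_mb_s / interaction_rate_hz          # ~50 MB of raw data per crossing
recorded_event_mb = recorded_rate_mb_s / recorded_rate_hz   # ~2 MB per recorded event
seconds_per_cd = CD_CAPACITY_MB / recorded_rate_mb_s        # ~2 s, i.e. "a CD every 2 seconds"
tb_per_day = recorded_rate_mb_s * 86400 / 1e6               # ~28 TB/day if run continuously

print(f"raw data per crossing  ~{raw_event_mb:.0f} MB")
print(f"recorded event size    ~{recorded_event_mb:.0f} MB")
print(f"one CD filled every    ~{seconds_per_cd:.1f} s")
print(f"recorded volume        ~{tb_per_day:.0f} TB/day")
```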
Oracle for Physics Data
- Work on LHC computing started ~1992 (some would say earlier)
- Numerous projects kicked off in 1994/95 to look at handling multi-PB of data, moving from Fortran to OO (C++), etc.; these led to production solutions from ~1997
- It was always said that a disruptive technology, like the Web, would have to be taken into account
- In 2002, a major project started to move 350 TB of data out of the ODBMS solution, sustaining >100 MB/s for 24-hour periods
- Now ~2 TB of physics data is stored in Oracle on Linux servers: a few % of the total data volume, expected to double in 2004

Linux for Physics Computing
- First steps with Linux started ~1993: a port of the physics application software to Linux on PCs
- 1996: proposal to set up Windows-based batch farms for physics data processing; overtaken by developments in Linux, as Windows was a poor match for the batch environment and Linux was essentially trivial to port to from Solaris, HP/UX, etc.
- Convergence of technologies: PC hardware offers unbeatable price-performance; Linux becomes robust
- ~All physics computing at CERN is now based on Linux / Intel, the strategic platform for the LHC

The Grid: the solution to LHC computing?
- LHC Computing Project = LHC Computing Grid (LCG)

LHC Computing Grid (LCG)
- Global requirements: handle the processing and data-handling needs of the 4 main LHC collaborations
- Petabytes of data per year (>20 million CDs); lifetime 10+ years
- Analysis will require the equivalent of 70,000 of today's fastest PCs
- The LCG project was established to meet these unprecedented requirements; it builds on the work of the European DataGrid (EDG) and the Virtual Data Toolkit (US)
- Physicists access world-wide distributed data & resources as if local; the system determines where a job runs, based on the resources required and available
- Initial partners include sites in CH, F, D, I, UK, US, Japan, Taiwan & Russia

Centres taking part in the initial LCG service: around the world, around the clock

LCG and Oracle
- Current thinking is that bulk data will be streamed to files; an RDBMS backend is also being studied for analysis data
- The file catalog (10^9 files) and file-level metadata will be stored in Oracle in a Grid-aware catalog
- In the longer term, event-level metadata may also be stored in the database, leading to much larger data volumes: a few PB, out of a total data volume of many PB
- The current storage management system at CERN, CASTOR, also uses a database to manage the naming / location of files; bulk data is stored in tape silos and faulted in to huge disk caches

Replica management: Storage Elements, Replica Manager, Replica Location Service (Local Replica Catalogs), Replica Metadata Catalog
- Files have replicas stored at many Grid sites on Storage Elements
- Each file has a unique GUID; the locations corresponding to the GUID are kept in the Replica Location Service
- Users may assign aliases to the GUIDs; these are kept in the Replica Metadata Catalog
- The Replica Manager provides atomicity for file operations, assuring consistency of SE and catalog contents
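To make the catalogue structure above concrete, here is a minimal toy sketch of a GUID-based replica catalogue, with SQLite standing in for the Oracle-backed Grid catalogues. The table names, functions and the lfn:/srm: strings are illustrative assumptions, not the actual LCG/EDG schema or API; the single transaction in register_file is only meant to illustrate the kind of atomic update the Replica Manager provides.

```python
# Toy model of the replica catalogues: GUID -> replica locations (Replica
# Location Service) and alias -> GUID (Replica Metadata Catalog).
import sqlite3
import uuid

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Replica Location Service: GUID -> physical replicas on Storage Elements
    CREATE TABLE replica_location (
        guid         TEXT NOT NULL,
        storage_elem TEXT NOT NULL,
        physical_url TEXT NOT NULL,
        PRIMARY KEY (guid, physical_url)
    );
    -- Replica Metadata Catalog: user-visible aliases -> GUID
    CREATE TABLE replica_alias (
        alias TEXT PRIMARY KEY,
        guid  TEXT NOT NULL
    );
""")

def register_file(alias: str, storage_elem: str, physical_url: str) -> str:
    """Register a new file: a fresh GUID, its first replica and an alias.

    The single transaction mimics the Replica Manager's role of keeping the
    catalogue contents consistent: either both entries appear, or neither does.
    """
    guid = str(uuid.uuid4())
    with db:  # one atomic transaction
        db.execute("INSERT INTO replica_location VALUES (?, ?, ?)",
                   (guid, storage_elem, physical_url))
        db.execute("INSERT INTO replica_alias VALUES (?, ?)", (alias, guid))
    return guid

def add_replica(guid: str, storage_elem: str, physical_url: str) -> None:
    """Record an additional replica of an existing file at another site."""
    with db:
        db.execute("INSERT INTO replica_location VALUES (?, ?, ?)",
                   (guid, storage_elem, physical_url))

def locate(alias: str):
    """Resolve a user alias to all known replica locations."""
    return db.execute("""
        SELECT r.storage_elem, r.physical_url
        FROM replica_alias a JOIN replica_location r ON r.guid = a.guid
        WHERE a.alias = ?""", (alias,)).fetchall()

# Example: one logical file, replicated at two hypothetical sites
guid = register_file("lfn:/atlas/run1234/event-summary.root",
                     "se.cern.ch", "srm://se.cern.ch/data/abc123")
add_replica(guid, "se.ral.ac.uk", "srm://se.ral.ac.uk/data/abc123")
print(locate("lfn:/atlas/run1234/event-summary.root"))
```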
Today's Deployment at CERN
[Deployment diagram: per-VO RLS services (rlsatlas, rlsalice, rlscms, rlslhcb, rlsdteam, rlscert01, rlscert02, rlstest) on hosts lxshare071d, lxshare069d, lxshare169d, lxshare183d]
- Oracle Application Server hosting Grid middleware, one per VO
- Shared Oracle database for the LHC experiments
- Based on standard parts out of CERN stores: disk server (1 TB mirrored disk), farm node (dual processor)

Future Deployment
- Currently studying 9i RAC on supported hardware configurations
- Expect the Grid infrastructure to move to an Application Server cluster + RAC in Q1 2004
- Expect the CASTOR databases to move to RAC, also in 2004
- May also move a few TB of event-level metadata (COMPASS) to a single RAC
- All based on Linux / Intel

Summary
- During the past decade we have moved from the era of the Web to that of the Grid
- The rise of Internet computing: a move from mainframes, to RISC, to farms of dual-processor Intel boxes running Linux
- The use of Oracle has expanded from a small, dedicated service for LEP construction to all areas of the lab's work, including handling physics data: both Oracle DB and AS, including for the Grid infrastructure
- The Grid is viewed as a disruptive technology in that it will change the way we think about computing, much as the Web did