Computing for Hall D
Ian Bird
Hall D Collaboration Meeting, March 22, 2002
[Chart: Data volume per experiment per year (raw data, in units of 10⁹ bytes), log scale from 100 to 1,000,000 over 1980–2010. Experiments shown include E691, E665, E769, E791, E831, ALEPH, CDF/D0, KTeV, E871, NA48, ZEUS, BaBar, STAR/PHENIX, JLab, and CMS/ATLAS.]

But: collaboration sizes!
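For scale, a raw-data volume like those in the chart is just event rate × event size × live time. A minimal sketch, with placeholder inputs that are assumptions for illustration, not Hall D design values:

```python
# Illustrative raw-data volume estimate; all three inputs are
# placeholder assumptions, not Hall D design numbers.
event_rate_hz = 10_000      # events written per second (assumed)
event_size_bytes = 5_000    # bytes per event (assumed)
live_seconds = 1.0e7        # live seconds per year (assumed)

bytes_per_year = event_rate_hz * event_size_bytes * live_seconds
print(bytes_per_year / 1e9, "GB/year")   # 500,000 GB/year with these inputs
```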
Technologies
• Technologies are advancing rapidly:
  – Compute power
  – Storage – tape and disk
  – Networking
• What will be available 5 years from now?
  – Difficult to predict, but it will not be a problem to provide any of the resources that Hall D will need
  – E.g. computing (see the farm history and the extrapolation sketch below):
Intel Linux Farm
• First purchases: 9 duals per 24" rack
• FY00: 16 duals (2U) + 500 GB cache (8U) per 19" rack
• FY01: 4 CPUs per 1U
• Recently: 5 TB IDE cache disk (5 × 8U) per 19" rack
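The "difficult to predict" caveat can be quantified with a simple doubling-time extrapolation; the 18-month doubling period below is an assumption (roughly Moore's-law pace), not a vendor commitment:

```python
# Rough compute-power extrapolation assuming a doubling every
# ~18 months (an assumption, roughly Moore's-law pace).
DOUBLING_MONTHS = 18

def growth_factor(years: float) -> float:
    """Factor by which compute power grows over `years` years."""
    return 2 ** (years * 12 / DOUBLING_MONTHS)

print(f"{growth_factor(5):.0f}x over 5 years")    # ~10x
print(f"{growth_factor(10):.0f}x over 10 years")  # ~102x
```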
Compute power
• Blades
  – Low-power chips (Transmeta, Intel)
  – Hundreds in a single rack
• “An RLX System 300ex chassis holds twenty-four ServerBlade 800i units in a single 3U chassis. This density achievement packs 336 independent servers into a single 42U rack, delivering 268,800 MHz, over 27 terabytes of disk storage, and a whopping 366 gigabytes of DDR memory. “
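The quoted density figures check out arithmetically; a quick verification (the 800 MHz per blade follows from the quoted totals):

```python
# Sanity-check the RLX rack-density numbers quoted above.
rack_u, chassis_u, blades_per_chassis = 42, 3, 24

chassis_per_rack = rack_u // chassis_u          # 14 chassis per 42U rack
blades = chassis_per_rack * blades_per_chassis  # 14 * 24 = 336 servers
total_mhz = blades * 800                        # 800 MHz per blade

print(blades, total_mhz)   # 336 268800, matching the quote
```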
Technologies
• As well as computing, storage and networking will also advance rapidly
• Grid computing techniques will bring these technologies together
• Facilities: a new Computer Center is planned
• The issues will not be technology, but:
  – How to use it intelligently
  – The Hall D computing model
  – People
  – Treating computing seriously enough to assign sufficient resources
(Data-) Grid Computing
Particle Physics Data Grid Collaboratory Pilot

Who we are: four leading Grid computer science projects and six international high energy and nuclear physics collaborations.

What we do: develop and deploy Grid services for our experiment collaborators, and promote and provide common Grid software and standards.

The problem at hand today: petabytes of storage, teraops/s of computing, thousands of users, hundreds of institutions, and 10+ years of analysis ahead.
PPDG Experiments
ATLAS – A Toroidal LHC ApparatuS, at CERN. Runs 2006 on.
Goals: TeV physics – the Higgs and the origin of mass …
http://atlasinfo.cern.ch/Atlas/Welcome.html

BaBar – at the Stanford Linear Accelerator Center. Running now.
Goals: study CP violation and more.
http://www.slac.stanford.edu/BFROOT/

CMS – the Compact Muon Solenoid detector, at CERN. Runs 2006 on.
Goals: TeV physics – the Higgs and the origin of mass …
http://cmsinfo.cern.ch/Welcome.html/

D0 – at the D0 colliding-beam interaction region at Fermilab. Runs soon.
Goals: learn more about the top quark, supersymmetry, and the Higgs.
http://www-d0.fnal.gov/

STAR – Solenoidal Tracker At RHIC, at BNL. Running now.
Goals: quark-gluon plasma …
http://www.star.bnl.gov/

Thomas Jefferson National Accelerator Facility (Jefferson Lab). Running now.
Goals: understanding the nucleus using electron beams …
http://www.jlab.org/
PPDG Computer Science Groups
Condor – develop, implement, deploy, and evaluate mechanisms and policies that support High Throughput Computing on large collections of computing resources with distributed ownership.
http://www.cs.wisc.edu/condor/
Globus – developing the fundamental technologies needed to build persistent environments that enable software applications to integrate instruments, displays, and computational and information resources managed by diverse organizations in widespread locations.
http://www.globus.org/
SDM - Scientific Data Management Research Group – optimized and standardized access to storage systems
http://gizmo.lbl.gov/DM.html
Storage Resource Broker - client-server middleware that provides a uniform interface for connecting to heterogeneous data resources over a network and cataloging/accessing replicated data sets.
http://www.npaci.edu/DICE/SRB/index.html
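The SRB idea of a uniform interface over heterogeneous storage can be sketched in a few lines; the class and method names below are illustrative, not the actual SRB API:

```python
# Minimal sketch of a "uniform interface to heterogeneous storage";
# names are illustrative, not the actual SRB API.
from abc import ABC, abstractmethod

class StorageResource(ABC):
    @abstractmethod
    def get(self, logical_name: str) -> bytes: ...
    @abstractmethod
    def put(self, logical_name: str, data: bytes) -> None: ...

class LocalDisk(StorageResource):
    def __init__(self, root: str):
        self.root = root
    def get(self, logical_name: str) -> bytes:
        with open(f"{self.root}/{logical_name}", "rb") as f:
            return f.read()
    def put(self, logical_name: str, data: bytes) -> None:
        with open(f"{self.root}/{logical_name}", "wb") as f:
            f.write(data)

# A tape or mass-storage backend would implement the same two calls,
# so client code never needs to know where a replica actually lives.
```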
Delivery of End-to-End Applications & Integrated Production Systems

to allow thousands of physicists to share data & computing resources for scientific processing and analyses
PPDG Focus:
– Robust Data Replication (see the sketch below)
– Intelligent Job Placement and Scheduling
– Management of Storage Resources
– Monitoring and Information of Global Services

Relies on Grid infrastructure:
– Security & Policy
– High-Speed Data Transfer
– Network Management

Resources: computers, storage, networks
Operators & users
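What "robust data replication" means in practice, sketched minimally: copy, verify a checksum, retry with backoff. This illustrates the concept only, not the actual GDMP/GridFTP implementation:

```python
# Conceptual sketch of robust replication: copy with checksum
# verification and bounded, backed-off retries.
import hashlib, shutil, time

def md5(path: str) -> str:
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def replicate(src: str, dst: str, retries: int = 3) -> None:
    """Copy src to dst, verify by checksum, retry on failure."""
    for attempt in range(1, retries + 1):
        try:
            shutil.copyfile(src, dst)
            if md5(src) == md5(dst):
                return                  # replica verified
        except OSError:
            pass                        # transient failure: fall through to retry
        time.sleep(2 ** attempt)        # back off before the next attempt
    raise IOError(f"replication of {src} failed after {retries} attempts")
```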
Project Activities, End-to-End Applications, and Cross-Cut Pilots

Project Activities are focused Experiment – Computer Science collaborative developments:
– Replicated data sets for science analysis – BaBar, CMS, STAR
– Distributed Monte Carlo production services – ATLAS, D0, CMS
– Common storage management and interfaces – STAR, JLAB

End-to-End Applications are used in experiment data-handling systems to give real-world requirements, testing, and feedback:
– Error reporting and response
– Fault-tolerant integration of complex components

Cross-Cut Pilots cover common services and policies:
– Certificate Authority policy and authentication
– File transfer standards and protocols
– Resource monitoring – networks, computers, storage
Year 0.5-1 Milestones (1)
Align milestones to Experiment data challenges:
– ATLAS – production distributed data service – 6/1/02
– BaBar – analysis across partitioned dataset storage – 5/1/02
– CMS – Distributed simulation production – 1/1/02
– D0 – distributed analyses across multiple workgroup clusters – 4/1/02
– STAR – automated dataset replication – 12/1/01
– JLAB – policy driven file migration – 2/1/02
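The JLAB milestone, "policy driven file migration," can be illustrated with a minimal sketch; the 7-day policy and the migrate() stub are hypothetical, not JLab's actual system:

```python
# Illustrative policy-driven migration: move cached files to mass
# storage once they have not been accessed for a threshold period.
import os, time

MAX_AGE_SECONDS = 7 * 24 * 3600   # assumed policy: migrate after 7 idle days

def migrate(path: str) -> None:
    """Placeholder for the actual copy to the tape silo."""
    print(f"would migrate {path} to mass storage")

def apply_policy(cache_dir: str) -> None:
    now = time.time()
    for name in os.listdir(cache_dir):
        path = os.path.join(cache_dir, name)
        # getatime: last access time of the cached file
        if os.path.isfile(path) and now - os.path.getatime(path) > MAX_AGE_SECONDS:
            migrate(path)
```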
Year 0.5-1 Milestones (2)
Common milestones with EDG:
GDMP – robust file replication layer – Joint Project with EDG Work Package (WP) 2 (Data Access)
Support of Project Month (PM) 9 WP6 TestBed Milestone. Will participate in integration fest at CERN - 10/1/01
Collaborate on PM21 design for WP2 - 1/1/02
Proposed WP8 Application tests using PM9 testbed – 3/1/02
Collaboration with GriPhyN:
SC2001 demos will use common resources, infrastructure and presentations – 11/16/01
Common, GriPhyN-led grid architecture
Joint work on monitoring proposed
Year ~0.5-1 “Cross-cuts”
• Grid file replication services used by >2 experiments:
  – GridFTP – production releases
    • Integrated with D0-SAM, STAR replication
    • Interfaced through SRB for BaBar, JLAB
    • Layered use by GDMP for CMS, ATLAS
  – SRB and Globus replication services
    • Include robustness features
    • Common catalog features and API
  – GDMP/Data Access layer continues to be shared between EDG and PPDG
• Distributed job scheduling and management used by >1 experiment (see the DAG sketch after this list):
  – Condor-G, DAGMan, Grid-Scheduler for D0-SAM, CMS
  – Job specification language interfaces to distributed schedulers – D0-SAM, CMS, JLAB
• Storage resource interface and management:
  – Consensus on API between EDG, SRM, and PPDG
  – Disk cache management integrated with data replication services
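To make the Condor-G/DAGMan item concrete: DAGMan takes a plain-text description of jobs and their dependencies. A minimal sketch, where the job names and .sub submit files are hypothetical:

```python
# Generate a minimal Condor DAGMan input file describing a two-stage
# workflow: simulation followed by reconstruction. Job names and the
# .sub submit files are hypothetical.
dag_description = """\
JOB simulate simulate.sub
JOB reconstruct reconstruct.sub
PARENT simulate CHILD reconstruct
"""

with open("halld.dag", "w") as f:
    f.write(dag_description)

# DAGMan would then run the workflow via: condor_submit_dag halld.dag
```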
Year ~1 other goals:
• Transatlantic application demonstrators:
  – BaBar data replication between SLAC and IN2P3
  – D0 Monte Carlo job execution between Fermilab and NIKHEF
  – CMS & ATLAS simulation production between Europe and the US
• Certificate exchange and authorization
  – DOE Science Grid as CA?
• Robust data replication
  – Fault tolerant
  – Between heterogeneous storage resources
• Monitoring services
  – MDS2 (Metacomputing Directory Service)?
  – Common framework
  – Network, compute, and storage information made available to scheduling and resource management (an illustrative record is sketched below)
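One way to picture the monitoring goal: each site publishes a small record of its current state for schedulers to consume. All field names and values below are made up for illustration:

```python
# Hypothetical resource-information record a monitoring framework
# might publish for schedulers; every field here is illustrative.
resource_info = {
    "site": "jlab.org",
    "idle_cpus": 120,         # compute capacity currently free
    "cache_free_gb": 450.0,   # free disk cache
    "wan_mbps": 622,          # available network bandwidth (assumed OC-12)
    "timestamp": 1016755200,  # when this record was published (epoch seconds)
}
```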
PPDG activities as part of the Global Grid Community
Coordination with other Grid projects in our field:
– GriPhyN – Grid Physics Network
– European DataGrid
– Storage Resource Management collaboratory
– HENP Data Grid Coordination Committee

Participation in experiment and Grid deployments in our field:
– ATLAS, BaBar, CMS, D0, STAR, JLAB experiment data-handling systems
– iVDGL/DataTAG – International Virtual Data Grid Laboratory
– Use DTF computational facilities?

Active in standards committees:
– Internet2 HENP Working Group
– Global Grid Forum
What should happen now?
• The collaboration needs to define its computing model
  – It really will be distributed and grid-based
  – Although the compute resources can be provided, it is not obvious that the vast quantities of data can really be analyzed efficiently by a small group
• Do not underestimate the task
  – The computing model will define the requirements for computing, some of which may need significant lead time
• Ensure software and computing is managed as a project equivalent in scope to the entire detector
  – It has to last at least as long; it runs 24×365
  – The complete software system is more complex than the detector, even for Hall D where the reconstruction is relatively straightforward
  – It will be used by everyone
• Find and empower a computing project manager now