TRANSCRIPT
1 September 1998 Harvey Newman, Caltech
The LHC Computing Challenge
Harvey B. Newman
California Institute of Technology
CHEP 98, 1 September 1998
The LHC Software Challenge
Harvey B. Newman
California Institute of Technology
CHEP 98, 1 September 1998
Executive Summary

Challenges facing HEP computing:
Complexity: the detector, the data, and the LHC
Scale: data storage and access; users and developers
Worldwide dispersion of people and resources

The leading example is the LHC:
Hardware: exponential price/performance evolution
Networks: an important issue (especially for the non-member states, NMS)
Data: storage and access solutions not yet proven
Software: immediate needs for software (not just in 2005)

New technology generation(s):
Complex problems require professional help
An immediate need for Software Engineering and Software Engineers
Common problems deserve common solutions
Challenges: Complexity

Events: the bunch crossing time of 25 ns is so short that (parts of) events from different crossings overlap.
The signal event is obscured by ~20 overlapping, uninteresting collisions in the same crossing (hundreds of extra particles).
Challenges: Complexity

Detector:
~2 orders of magnitude more channels than today
Triggers must correctly choose only 1 event in every 400,000
Level 2 and 3 triggers are software-based (and must be of the highest quality)
ALICE

Heavy-ion experiment at the LHC, studying ultra-relativistic nuclear collisions
Extremely high data rates: 1.5 GB/s
Relatively short running period: 1 month of running = 1 Petabyte/year
Special trigger problems: quark-gluon plasma signals; extremely complex events
Online data treatments considered:
Online processing
Lossless compression
Lossy compression (later?)
ATLAS

General-purpose LHC experiment
High data rate: 100 MB/second
High data volume: 1 PB/year
Test beam projects using Objectivity/DB in preparation:
Calibration database
Expect 600 GB of raw and analysis data
CMS

General-purpose LHC experiment
Data rate: 100 MB/second
Data volume: 1 PB/year
Two test beam projects based on Objectivity successfully completed
Database used in the complete chain:
Test beam DAQ
Reconstruction
Analysis
(Java3D) Event viewing
LHCb

Dedicated experiment looking for CP violation in the B-meson system
Lower data rates than the other LHC experiments
Total data volume around 400 TB/year
Challenges: Scale

For ATLAS or CMS:
Event output rate:      100 events/sec (10^9 events/year)
Data written to tape:   100 MBytes/sec (1 Petabyte/yr = 10^9 MBytes)
Processing capacity:    > 10 TIPS (= 10^7 MIPS)
Typical networks:       hundreds of Mbits/second
Lifetime of experiment: 2-3 decades
Users:                  ~1700 physicists
Software developers:    ~100

Plus ~1.5 GByte/sec for ALICE; ~100 Petabytes in total for the LHC.
Challenges: Geographical Spread

CMS: 1700+ physicists, 150+ institutes, 30+ countries
CERN member states: 55%
Non-member states (NMS): 45%
ATLAS is comparable in size.

Major challenges are associated with:
Communication and collaboration at a distance
Distributed and heterogeneous computing resources
Remote software development and physics analysis
LHC Computing Models

Plan for the hardware, network and software systems to support timely and competitive analysis by a worldwide collaboration

Architecture: a hierarchical, networked ensemble of heterogeneous, data-serving and processing computing systems

Key technologies:
Object-oriented software model
Object Database Management Systems (ODBMS)
Hierarchical Storage Management systems
Networked collaborative environments
Possible use of an agent-driven O/S
LHC Computing Models: The Leading Edge and the Mainstream

The LHC data handling problem has no analog today (i.e. Petabyte scale, with resources distributed worldwide)
Similar needs will be increasingly common by the time of LHC startup
Solutions found by HEP now could also be applicable to academic research and industry in the not-too-far future
Finding solutions is mission-critical for ALICE, ATLAS, CMS and LHCb
HEP may be the first to face many of the key problems
Computing Model: CMS Scheme
Computing Model: Hardware Costs and Milestones

Exponential price/performance evolution (?)
With a "just-in-time" purchasing policy, processing power and (disk) storage capacity may not be the major challenges.

HARDWARE MILESTONES (1997-2005):
Regional centres: identify initial candidates -> turn on functional centres -> fully operational centres
Central systems: functional prototype -> turn on initial systems -> fully operational system
Computing Model: Archival Storage

Exponential price/performance evolution? Large-scale data archives are a niche market.
Continued reliance on tapes is foreseen (our projections in the late 1980's for 2005 were different!)
Slow evolution of costs and technology over time; NA48 experience (1998): still 200 kCHF for 100 TByte
Reliability not good enough to avoid "backup copies"
Outlook for reading back a Petabyte: may be expensive

(Tape) archival storage software:
Only HPSS appears to have the scalability needed for the Petabyte range
HPSS has a ways to go before being a robust commercial product, in production on multiple platforms
A heavy investment of CERN/Caltech/SLAC... effort to make HPSS evolve in directions suited to HENP
Investigation of homegrown alternatives at FNAL and DESY
Computing Model: Systems Management

Management issues:
Four experiments at once, with diverse needs
"Commodity" hardware (CPU/disk farms) to optimize costs:
More than the minimum number of pieces
Perhaps less than the maximum system reliability
Industry solutions are unlikely to be available:
A different level of integration, and of reliability, from most (?)
Need generally resilient, auto-configuring, self-healing systems
Applications should reconfigure themselves and the hardware, and go on
High standards of robustness imposed on the ODBMS + HPSS
How much can be done in a production environment?
It is imperative to understand how data analysis might be done in such a distributed environment (viz. MONARC)
Computing Model: Networks

Wide-area networks are crucial, due to the worldwide distribution of:
People and institutes
Computing resources

A rapid increase in network functionality and bandwidth is essential

Price/performance evolution is still relatively slow (especially transoceanic), and future costs are uncertain
HEP Bandwidth Needs Evolution

HEP GROWTH, 1987-1997:
A factor of one to several hundred on principal transoceanic links
A factor of up to 1000 in domestic academic and research nets

HEP NEEDS, 1998-2005: continued study; first results (ICFA-NTF) show a factor of one to several hundred

COSTS (to vendors): optical fibers and WDM: a factor of two reduction, or much more, per year?

PRICE?
"Affordable, once prices are linked to the vendors' costs." But when will prices be linked to costs?
HEP Network Applications

Interactive sessions: traditional e-mail, file transfer, editors, X11
Web access and Web-based sessions
Packet videoconferencing, with shared documents and applications
Distributed and remote processing and data analysis (AFS, DFS)
Remote control room
Distributed (object) database management systems
Advanced applications for remote collaboration:
Collaboratories (DoE 2000)
Environments with multiple real-time shared applications (Habanero, Tango)
Immersive virtual environments (ImmersaDesk)
ICFA-NTF Network Requirements Study

The network needs of the ICFA community were studied on the basis of:
Responses by major collaborations to a questionnaire on present and future network usage
Computing Technical Proposals, reports and presentations by the collaborations on network requirements (ATLAS, CMS, RHIC, ...)
Scaling according to the evolution of computing technology: local area network speeds, data rates to storage, stored data volumes
Constraints (lower limits) imposed by the network bandwidth and computing requirements available from homes
The bandwidth required to complete particular data analysis tasks
Present and near-future available bandwidth on major national and international network links, and the possible range of their costs

ICFA-NTF Requirements WG Report: http://l3www.cern.ch/~newman/icfareq98.html
HEP Bandwidth Growth Rates

Growth rates per five years*:
  Data rate to storage         3-10
  Data volume stored annually  2-10
  LAN bandwidth                10-30
  CPU power                    10-30

Bandwidth requirements for the next generation of experiments: 10-30 times greater than in 1998.
A factor of 100-1000 increase is required during the next decade.

* The largest experiments tend to be at the upper end of this range.
ICFA Network Task Force Bandwidth Requirements Estimate (Mbps)

  Year                                   1998         2000        2005
  BW utilized per physicist              0.05 - 0.25  0.2 - 2     0.8 - 10
    (and peak BW used)                   (0.5 - 2)    (2 - 10)    (10 - 100)
  BW utilized by a university group      0.25 - 10    1.5 - 45    34 - 622
  BW to a home laboratory
    or regional centre                   1.5 - 45     34 - 155    622 - 5000
  BW to a central laboratory housing
    one or more major experiments        34 - 155     155 - 622   2500 - 10000
  BW on a transoceanic link              1.5 - 20     34 - 155    622 - 5000
Computing Model: Software (I)

SOFTWARE: the key challenge, and the solution to complexity.
A modern, engineered software framework:
Object-oriented design
Modern languages (C++, Java, ...) and tools (ODBMS, HPSS, ...)
Use of mainstream commercial products wherever possible

CMS example
Computing Model: Software (II)

An engineered software framework is REQUIRED:
To handle the complexity of the detector and the data
For the reliability and maintainability of the software over a 20-year project life-cycle
To serve data efficiently to a worldwide-distributed collaboration
For an efficient and cost-effective data analysis

R&D on the model, framework, products and tools is needed to provide the functionality required by HEP; development cannot be delayed:
Global reconstruction is required for studies of CMS physics performance, and hence detector tuning (already now!)
Monitoring and calibration during construction (ongoing)
Steady build-up to production software systems at turn-on
Computing Model: Software (III)

Milestone timeline, 1997-2005 (general milestones: 1 = proof of concept, 2 = functional prototype, 3 = fully functional, 4 = production system):

CORE SOFTWARE
  End of Fortran development
  GEANT4 simulation of CMS (milestones 1-4)
  Reconstruction/analysis framework (milestones 1-4)
  Detector reconstruction (milestones 1-4)
  Physics object reconstruction (milestones 1-4)
  User analysis environment (milestones 1-4)

DATABASE
  Use of ODBMS for test-beam
  Event storage/retrieval from ODBMS (milestones 1-4)
  Data organisation/access strategy
  Filling the ODBMS at 100 MB/s
  Simulation of data access patterns
  Integration of ODBMS and MSS
  Choice of vendor for ODBMS
  Installation of ODBMS and MSS

Important milestones are not in the distant future!
Software Development Tactics (1998 onwards)

Practical approach:
Short development cycles: milestones every ~3 months (less if possible)
Experience to complement formal training
Help by (a critical mass of) experts
Teach good practice (architecture, design, coding) rather than the "theory" of OO design
Use formality; initially don't insist too much
Moderate use of OO "wrappers" to meet the experiment's near-term deadlines
Frequent discussions, workshops, etc.

Note: strong need for worldwide cooperative software development, and improved remote collaborative tools
Software Development Strategy

Keep the longer-term goals in sight:
Encourage, then require, clean architectural design
Design the framework for component reuse
Encourage, then slowly enforce, the use of:
Formal design methods
Procedures to manage the development process
Code conventions and checking
Configuration management
Documentation templates
Work from the framework "inwards"
(Still: short development cycles; deliver products)

The final software must be of high quality.
Software Development: Short-Term Goals (CMS)

CMS Analysis and Reconstruction Framework (CARF):
Sept 1998: software workshop on domain breakdown and commonality
Dec 1998: prototype subdetector (TDR-quality) OO reconstruction
Early 1999: non-OO experts can use the simulation and reconstruction for realistic physics studies

Short-term goals (1998-2001):
Reconstruction of simulated CMS data and test-beam data
Global reconstruction and detector appraisal/tuning
Higher-level trigger design
Generic reconstruction and visualization classes
Software Engineering

Key software engineers are needed by 1999, at CERN and especially at/from remote institutes, for:
Software support for physicists, especially outside CERN
Software framework development and implementation
Computing Model development
Generic visualization

They are the analog of the mechanical and electrical engineers in the (similarly-sized) sub-detector hardware projects.
Software Engineers

Crucial for the success of a distributed software effort:
Provide the foundation for university analysis of LHC data
Establish a university- and remote laboratory-based role in the software and ongoing R&D
Leverage remote computing facilities (INFN, CCIN2P3, FNAL, Caltech, LBNL, ...)
Leverage the availability of state-of-the-art regional, continental (ESNET, Abilene, QUANTUM), and intercontinental networks
Coordinate with the global OO physics reconstruction and other CERN-based efforts
Software Engineering Tasks in 1999

SOFTWARE SUPPORT FOR PHYSICISTS:
Develop and maintain the framework, together with the core software team at CERN
Install, test, deploy and maintain the software repository on multiple platforms (Unix flavors + NT) for LHC physicists:
  Framework and global reconstruction code
  Users' environment, tools and libraries: CLHEP, LHC++, Objectivity
  Developers' environment: cvs, SoftRelTools, UPS/UPD
  GEANT4 OO simulation system (from mid-1999)
Develop and monitor standards:
  Version management and code distribution
  Training, both in person and over networks
  Coordination of code walkthroughs and periodic reviews
  License management and distribution
Software and Computing Engineering Tasks in 1999

COMPUTING MODEL design and development:
Interface with the MONARC project on data management and computing using distributed architectures
Support for model-simulation software and a networked test-bed
Coordinate between groups at CERN, DE, FI, FR, IT, JP, UK, US, etc.

NETWORK-DISTRIBUTED OBJECT DATABASES:
Interface with the RD45 and GIOD projects; support and tests for 1-10 Terabyte-scale object databases
Development and support of distributed HPSS systems, inter-working with the Objectivity ODBMS
Distribution and management of large samples of fully simulated and reconstructed events in an Objectivity/DB federation
CMS Software Organization and Structure

Problem decomposition using OO analysis and design is the basis for software organization and management.
This results in categories (domains) with a clean hierarchical dependency structure, avoiding circular dependencies between categories.
A two-level system of domains and packages is being implemented.
Exporting users' interface files through the package file structure results in a clean separation of users' interfaces from all package-internal files.
CMS Sample Repository Structure

Repository Librarian
  Domain A (Domain Coordinator)
    Package 1 (Package Coordinator)
    Package 2, etc. (Package Coordinator)
  Domain B (Domain Coordinator)
    Package n+1, etc.
  Domain C, etc. (Domain Coordinator)
Sample Domains and Package Structure

Package N directory layout:
  interface/
  src/
  doc/
  test/
  html/ (overview.html, index.html)

Domain examples:
Inner Tracker, Muon Tracker, Ecal, Hcal, Calorimetry, Detector, Combined Reconstruction, Visualization, User Interface, Framework, Toolkit, Event Classification, Test Beam
OO Interfaces in GEANT4

The use of software external to GEANT4 is managed via object-oriented technology (abstract interfaces).
Drivers for multiple graphics systems and user interfaces (batch scripts, command line, GUIs) are supported without introducing dependencies.
The GEANT4 persistency manager ensures independence from I/O implementations (whereas in GEANT3, Zebra I/O and memory management were hard-wired into the system).
Modularity: GEANT4 Examples

The modular structure of the GEANT4 components is replicated within each component, to manage multiple implementations and options.

Example 1: physics processes or particle families can be selected and loaded with much higher granularity than in GEANT3.

Example 2: the representation of volumes as three-dimensional solids is not hard-wired as in GEANT3. One may choose, and selectively (and transparently) load, different solid-modelling options.
LHC Data ModelsLHC Data Models
HEP data models are complex!HEP data models are complex! Typically hundreds of structure Typically hundreds of structure
types (classes)types (classes) Many relations between themMany relations between them Different access patternsDifferent access patterns
LHC experiments rely on LHC experiments rely on OO technologyOO technology
OO applications deal with networks OO applications deal with networks of objects of objects
Pointers (or references) are Pointers (or references) are used to describe relations used to describe relations
Existing solutions do not scaleExisting solutions do not scale Solution suggested by RD45: Solution suggested by RD45:
ODBMS coupled to a Mass ODBMS coupled to a Mass Storage System Storage System
EventEvent
TrackListTrackList
TrackerTracker CalorimeterCalorimeter
TrackTrackTrackTrackTrackTrack
TrackTrackTrackTrack
HitListHitList
HitHitHitHitHitHitHitHitHitHit
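The object network sketched in the slide's diagram can be written out as a small illustrative class model (hypothetical names and fields, Python used purely for illustration): relations between structure types are expressed as object references, which is what the slide means by "networks of objects".

```python
class Hit:
    def __init__(self, position):
        self.position = position

class Track:
    def __init__(self, hits):
        self.hits = hits          # relation held as references to Hit objects

class Event:
    def __init__(self, tracks):
        self.tracks = tracks      # Event -> TrackList -> Track -> Hit

def all_hits(event):
    # Navigation follows references through the object network,
    # rather than indexing into flat records.
    return [h for t in event.tracks for h in t.hits]
```

With hundreds of such classes and many cross-references, the flat-file solutions of the GEANT3 era do not scale, which motivates the ODBMS approach above.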
LHC Data Model (Physicists' View)
Data organized in an object "hierarchy": Raw, Reconstructed (ESD), Analysis Objects (AOD), Tags.
Data distribution:
- All raw, reconstructed and master parameter DBs at CERN
- All event tag data at all centres
- Selected data sets at each regional centre (CMS)
- HOT data automatically moved to centres
Processing flexibility: continuous retrieval/recalculation/storage decisions; trade off data storage, CPU and network capabilities to optimize costs.
Object Database Management System (ODBMS)
An ODBMS by 2005 (Physicists' View)
Multi-Petabyte networked database federations, backed by a networked set of archival stores.
High availability and immunity from corruption: lock server(s) and resynchronization mechanisms.
"Seamless" response to database queries.
Managed intra-site and inter-site migration, both automatic (tunable) and manual.
Clustering and reclustering of objects: physically cluster according to access patterns; recluster as needed to optimise efficiency. Goal: transfer only "useful" data, from disk server to client and from tape to disk.
Persistent Objects in an ODBMS
Persistency: objects retain their state between two program contexts. The storage entity is a complete object, i.e. the state of all data members.
OO language support: abstraction, inheritance, polymorphism, templates.
Tight language binding: an ODBMS allows use of persistent objects directly as variables of the OO language (C++, Java and Smalltalk; heterogeneity).
I/O on demand: no explicit store & retrieve calls.
Location transparency (using "smart pointers"): the database automatically locates and reads objects when accessed, allowing decoupling of the logical and physical data models.
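A minimal sketch of the "smart pointer" behaviour described above, with hypothetical names (Objectivity's actual smart references are C++ templates): dereferencing the handle triggers the physical read transparently, so application code performs no explicit retrieve calls.

```python
class FakeDatabase:
    """Stand-in for the ODBMS; counts physical reads for the demo."""
    def __init__(self, objects):
        self._objects = objects
        self.reads = 0
    def read(self, oid):
        self.reads += 1
        return self._objects[oid]

class SmartRef:
    """Handle that locates and loads its target only when accessed."""
    def __init__(self, oid, database):
        self._oid = oid
        self._db = database
        self._cached = None
    def get(self):
        if self._cached is None:      # first access: I/O on demand
            self._cached = self._db.read(self._oid)
        return self._cached           # later accesses hit the cache
```

Because the application only sees `SmartRef`, the physical location of the object (client cache, disk server, tape) can change without touching the logical model.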
Physical Model and Logical Model
The physical model may be changed to optimise performance; existing applications continue to work.
A Distributed Federation
[Diagram: an application host and a combined application & disk server, each running an Objy client; Objy servers on the disk server and on a data server connected to HPSS (via an HPSS client and HPSS server); an Objy lock server coordinating the federation]
Common Project: MONARC (I)
Models Of Networked Analysis At Regional Centers (Caltech, CERN, FNAL, Heidelberg, INFN, KEK, Marseilles, Oxford, Tufts, …)
MONARC goals include:
- Specification of the main parameters characterizing the Models and their "performance" (throughputs, latencies) for data analysis
- Determination of which classes of Computing Models are feasible for LHC (matched to network capacity and data handling resources)
- Production of "Baseline Models" that fall into the "feasible" category
- Verification of baselines for resource requirements (computing, data handling, and networks)
COROLLARIES:
- Help define the Analysis Process for LHC experiments
- Help define Regional Center architecture and functionality
- Provide guidelines to keep the final Computing Models in the feasible range
Common Project: MONARC (II)
Understand the ensemble of centers as a "distributed data analysis system".
Design site architectures:
- Hardware, software and management services provided
- Differences in site configurations: fundamentally necessary, or resource-driven
- Modes of center operation
- Organization and individual usage patterns
Identify "sound" classes of Models, technically and financially feasible; aim at a Conceptual Design by 2000 or 2001.
Identify and develop candidate sites (France, Italy, UK, USA, …, Pakistan, etc.); identify their functions and relative roles; (later) help develop prototype centers.
MONARC: New Distributed System Features
Four-to-five tiered client-server system: a heterogeneous Central/Regional/Institute/Workgroup server hierarchy, plus users' desktop clients.
Location transparency, with scope beyond today's Intranet/Extranet applications.
LAN / national WAN / international WAN mix.
Complex components: distributed computing tools, data access tools, data analysis tools.
Multiple data management systems: Objectivity/DB and HPSS (or another TMS).
Realtime middleware: data recompute/transport decisions, data location broker(s), network performance tracking in real time.
Discrete Event Simulation
State-based paradigm; the model has a discrete state domain.
State transitions are triggered through time-ordered and conditional events (that occur instantaneously).
Events are managed in queues.
Time-advancing mechanisms:
- Unit-time approach: advance time in sufficiently small but equal steps (e.g. ModNet)
- Event-driven approach: advance time according to the next event (e.g. SoDA, ModSim, etc.)
State transitions can cause the creation of further future events.
Structuring mechanisms:
- Group state information into Entities (in SoDA called 'Components')
- Sequence of related events (a concept in SoDA only: 'Processes')
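The event-driven mechanism above can be captured in a few lines. This is a generic sketch (not SoDA code; all names are invented): events sit in a time-ordered queue, processing an event advances the clock to its timestamp, and an event's action may schedule further future events.

```python
import heapq

class Simulator:
    """Minimal event-driven simulator with a time-ordered event queue."""
    def __init__(self):
        self.now = 0.0
        self._queue = []
        self._seq = 0                 # tie-breaker for simultaneous events
    def schedule(self, delay, action):
        heapq.heappush(self._queue, (self.now + delay, self._seq, action))
        self._seq += 1
    def run(self):
        while self._queue:
            time, _, action = heapq.heappop(self._queue)
            self.now = time           # event-driven time advance
            action(self)              # may schedule further future events

log = []
sim = Simulator()

def arrival(s):
    log.append(("arrival", s.now))
    s.schedule(2.0, departure)        # state transition creates a future event

def departure(s):
    log.append(("departure", s.now))

sim.schedule(1.0, arrival)
sim.run()
```

Time jumps directly from 1.0 to 3.0, in contrast to the unit-time approach, which would step through every intermediate instant.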
SoDA System: Current Status
Simulation of Distributed Architectures: Christoph von Praun (CERN/IT)
Development of the SoDA simulation theory and tool is done.
Detailed models for systems with well-defined system boundaries and workload patterns:
- NA48 Central Data Recording (Meiko CS-2)
- NA45 Reconstruction on PCSF
- ATLAS Event Filter prototype
- …
Prototype models for systems with open boundaries and/or fuzzy workload:
- Average Physicist 2005
- Multiple communicating workgroups in different time-zones on a routed network
SoDA Study: Average Physicist in 2005 (1/8)
Goal: simulation of the network bandwidth consumption caused by the work of multiple physicists in a non-trivial network. The physicists work in different time zones; the network utilization profile is aligned to a working day in their time-zone.
Daily WAN task mix of one physicist:

                          Coffee  Conferencing  Room Seminar    Email  …
  duration                   2.0           0.5           0.4      2.0
  max. bandwidth send       52.0        1000.0         200.0        -
  max. bandwidth receive   460.0        1000.0         800.0        -
  volume send                  -             -             -     36.0
  volume receive               -             -             -    144.0
  priority send              1.0           1.0           1.0      1.0
  priority receive           1.0           1.0           1.0      1.0
  lots per day                 2             1             1       60
  start daytime              8.0          10.0           9.0      8.0
  stop daytime              18.5          11.0          17.0     18.0
  timezone                    -8            -8            -8       -8
  source                 Caltech       Caltech       Caltech  Caltech
Average Physicist in 2005 (3/8)
Physical / logical structure of the wide area network:

                       Washington-CERN  CERN-Washington  …
  nominal bandwidth               2.0              2.0      [Mbit/s]
  update interval               192.0            192.0      [s]
  granularity                       4                4      [#]
  max. amount per lot          2000.0           2000.0      [kbit]
Average Physicist in 2005 (8/8)
Interaction of physicist work groups in different time zones (CERN = GMT+1, Caltech = GMT-8).
Common Project: GIOD
Globally Interconnected Object Database (Caltech, CERN, Hewlett-Packard)
GIOD Goals and Technology
Globally Interconnected Object Database Project (Caltech, CERN, Hewlett-Packard)
GIOD goals include:
- Investigate the scalability of commercial ODBMSs
- Find models of organizing the data worldwide to optimize access and analysis for the physicist
- Test/develop strategies for coherent caching across the LAN and WAN
- Devise a network-distributed system architecture that has sufficient flexibility while maintaining database integrity and efficiency
Uses existing leading-edge hardware and software systems: Caltech HP Exemplar, HPSS, Objectivity/DB (& Versant), C++, Java, ATM LAN and high-speed WAN, distributed task modeling.
GIOD: Scalability of a HEP Computing Workload on the Exemplar
Track reconstruction: CPU-intensive with modest I/O.
Event-level (coarse-grained) parallelism: N = 15-210 reconstruction processes evenly distributed in the system.
Data in an Objectivity/DB database federation, hosted on the Exemplar.
Objects read with a simple read-ahead optimisation layer (performance gain of x2).
Conclusion: the Exemplar is very well suited for this workload. With two (of four) node filesystems it was possible to utilise 150 processors in parallel with very high efficiency.
Outlook: expect to utilise all processors with near 100% efficiency when all four filesystems are engaged.
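Event-level parallelism works because events are mutually independent, so N workers can each reconstruct a share of them with no synchronization between events. A toy sketch of the pattern (hypothetical names; the GIOD runs used separate C++ processes rather than threads):

```python
from concurrent.futures import ThreadPoolExecutor

def reconstruct(event):
    # Stand-in for CPU-intensive track reconstruction of a single event.
    return sum(event["hits"])

def reconstruct_all(events, workers=4):
    # Coarse-grained parallelism: one independent task per event,
    # distributed over a pool of workers; results keep event order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(reconstruct, events))
```

Because the only shared resource is the input data store, scaling is limited by I/O (here, the number of node filesystems), not by inter-event communication.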
GIOD: Simulations of Higgs and QCD Backgrounds
- Run Monte Carlo simulation to generate tracker & calorimeter data for Higgs signal and ~1 million multi-jet background events (using the Caltech Exemplar)
- Fill the Objectivity federated database with "persistent" hits, tracks & energies
- Run reconstruction algorithms for tracking and energy clustering
- Write results into the database (reconstructed tracks & cluster objects)
- Open the database, extract an event, and display it using a Java applet to show raw hits, energy deposits, reconstructed tracks, energy map and clusters
GIOD: CMSOO Java3D Event Viewer
[Screenshot labels: tracker geometry; an ECAL cluster; an individual ECAL crystal with energy; a reconstructed track]
GIOD: CMSOO - Database Population Logistics
[Diagram: ~50 GByte in RAID5 and a 30 GByte store, linked via Ethernet, 10 Mbyte/sec SCSI, HiPPI, and WAN or 3590 tape air-freight]
GIOD CMSOO: Future Directions
Transition to a complete prototype global reconstruction: Tracker, ECAL, HCAL and Muon subdetectors, with the CMS Core Software and Physics Reconstruction Teams.
Development of one or a few prototype physics analyses.
All using persistent objects in a network-federated ODBMS coupled with HPSS.
Tests with "core" LHC computing tasks: reconstruction, physics analysis, scanning, simulations; using the system as a multi-user test bed over LAN, regional WAN and international WAN.
Internet2 demo: end of September.
Integration into a shared collaborative environment: Habanero (NCSA) and/or Tango (Syracuse), Java based.
CMSOO - A Planned Hardware Configuration
[Diagram: ~500 GByte in RAID5; 155-622 Mbit/sec ATM LAN; HiPPI; >40 Mbyte/sec SCSI; CalREN II 155-622 Mbit/sec WAN]
- Event reconstruction using C++ on the C200 and Exemplar
- Event viewing using a Java applet on the C200
- Database served on the Exemplar and SDSC peer systems
- Containers in HPSS served via HiPPI by an RS6000
- Measure network loads and reconstruction/event viewing performance; use the data as input to the modelling tool
RD45/GIOD DRO WAN area tests
[Diagram: three AMS data servers, each with its own lock server, serving CALTECH, CERN SP and HP RD45 application hosts; databases DB1 and DB2 with remote images (DB1 IMAGE, DB2 IMAGE)]
Data servers: a Pentium Pro 200 MHz (Windows NT 4.0), an RS/6000 POWER2 (AIX 4.1), and an HP 712/60 (HP/UX 10.20).
RD45/GIOD DRO WAN area tests
[Chart: create and commit times (milliseconds, 0-20000) versus number of updates (0-250), for LAN and WAN, over the generation of updates during one day. Non-saturated hours: ~1 Mbit/sec; saturated hours: ~10 Kbit/sec.]
Common Project: HEPVIS (I)
High Energy Physics Visualization (CDF, CMS, D0, E835, FNAL/CD, GEANT4, L3, SND, …)
HEPVIS goals include:
- Identification of common elements of interactive detector and event visualization and analysis systems (avoid duplication of effort)
- Provision of an OO class library: generic graphics classes for detector and event objects; components for the construction of portable graphical user interfaces
- Consistency with HEP software strategies (LHC, Run II, GEANT4, etc.)
- Based on mainstream technologies (OpenGL, OpenInventor, …)
- Provision of a forum (every 18 months) for achieving the above (HEPVIS workshops at FNAL/1995, CERN/1996, SLAC/1998)
Common Project: HEPVIS (II)
HEPVIS as used for CDF Run II studies.
Common Project: HEPVIS (III)
CMS work (within CARF) is consistent with HEPVIS and GEANT; strong coupling of work (Northeastern) on CMS, D0, and L3.
Platform Evolution
Past: mainframes, vector processors, SIMD MPPs.
Present: distributed-memory SMPs, NOWs, shared-memory MPs, NUMA MPs.
Future: distributed computers, heterogeneous platforms (a la DARPA); heterogeneity of architecture, O/S and node CPU; variable latencies (internode, intranode); bandwidths differing for different links and with traffic.
[Figure labels: JPL Cray YMP; Exemplar hypernodes]
Distributed Systems: A Provider's View
Distributed computing has brought vastly increased computing capacity at lower cost.
Are we happy? Not completely:
- High complexity: multiple processor architectures, and multiple operating systems
- Many possible points of failure
- Complex interactions
- This can affect reliability as seen by the user
- Very hard to maintain the reliability standards set by the mainframes
- Services must be ported to (and maintained on) multiple environments
- A major part of the effort to develop and run individual services goes into their interaction with other services
Distributed System Architecture
Beyond Traditional Architectures: Agent-Driven Operating Systems
"Agents are objects with rules and legs" -- D. Taylor
A large ensemble of mobile autonomous agents running over a network of loosely coupled computers.
Each agent is given a small subtask, and independently searches for resources, competing for network bandwidth and compute resources (HEP: and access to data).
Example: TRW work on "OO Adaptive Parallelism", using IBM Java Aglets for signal processing [Dominic et al.]
Architecture: an agent "society", with clustering:
- Stationary Boss and mobile Master agents (one per SMP)
- Worker (mobile slave), Load-balancer, and (multicast) Timer agents
- Merge agents progressively collect results
- Designed to be adaptive and fault tolerant; still vulnerable to major network outages
- Extendable to multi-site and multi-user applications (but more complex)
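The boss/worker/merge division of labour above can be sketched in a few lines. This is a toy sequential illustration with invented names, not the TRW Aglets system: a boss partitions the job into small subtasks, worker agents process them independently, and the partial results are merged.

```python
class WorkerAgent:
    """One agent, given a small subtask it can complete on its own."""
    def __init__(self, worker_id):
        self.worker_id = worker_id
    def run(self, subtask):
        return sum(subtask)           # the agent's small piece of work

def boss(data, n_workers=3):
    # Partition the work and dispatch one subtask per worker agent;
    # in the real system the agents would migrate and run concurrently.
    chunks = [data[i::n_workers] for i in range(n_workers)]
    partials = [WorkerAgent(i).run(c) for i, c in enumerate(chunks)]
    return sum(partials)              # merge step collects partial results
```

Fault tolerance in the real architecture comes from the fact that a lost subtask is small and can simply be re-dispatched to another agent.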
Beyond Traditional Architectures: Agent Software Framework Example
OO design with inverted dependencies. Agents give parallelism without parallel algorithms, allowing complete development and test of the algorithms without the framework.
[Diagram: agent package dependencies across Levels 0-3 (AgentFrames, MasterSlave Agents, Clustering Agents, Clustering Algorithms, Agent Utilities)]
LHC Desktop Data Analysis Sub-Model
Physics objects: ~10 Bytes/event. The data analysis task is CPU intensive.
Analysis request broker (an agent hierarchy):
- Manages agent requests from desktops on the LAN/WAN
- Pre-stages and filters the required data (re-clusters for individuals or workgroups)
- Designed to optimize multi-level cache usage
- Data movement to desktops: on demand and as a last resort
[Diagram: ~2000 MIPS desktops (~500 MB RAM) connected at 1-10 MB/sec to a server holding an event collection store of ~0.03-10 GBytes (~30-1000 K events), fed from offline storage via the Analysis Request Broker]
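A toy sketch of the broker's caching behaviour (all names hypothetical): event collections are fetched from offline storage only on a cache miss, matching the "on demand and last resort" policy for moving data toward the desktops.

```python
class AnalysisRequestBroker:
    """Caches pre-staged event collections; touches offline storage
    only when a requested collection is not already cached."""
    def __init__(self, offline_store):
        self._store = offline_store
        self._cache = {}
        self.misses = 0               # counts physical fetches, for the demo
    def request(self, collection):
        if collection not in self._cache:
            self.misses += 1          # last resort: go to offline storage
            self._cache[collection] = self._store[collection]
        return self._cache[collection]
```

In the full model this logic repeats at each level of the hierarchy (tape to disk, disk server to workgroup, workgroup to desktop), which is the multi-level cache the slide refers to.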
Summary: The LHC Computing and Software Challenges
The computing technology (r)evolution: we assume it will continue.
Networks on every distance scale.
Software: modern languages, methods and tools, the key to managing complexity:
- A PRACTICAL APPROACH to FORMAL ENGINEERING
- FORTRAN: the end of an era; the TRANSITION: a coming of age
A new generation of distributed systems:
- Object database federations
- An ensemble of tape and disk mass stores
- A deep heterogeneous client/server hierarchy, of up to 5 levels
- The emergence of new classes of operating systems
The LHC Computing Challenges: Approaches to Solutions
Track computing and software technology; proactively project the future.
R&D on networks, databases and distributed systems.
Understand regional centres and the LHC analysis process (a la MONARC).
Make the transition. Build the OO software. Use it to meet the experiments' deadlines.
Generate a vision of LHC computing; follow its directions.
Learn by doing.
START NOW
Acknowledgements
David Williams, Lucas Taylor, Julian Bunn, Les Robertson, Juergen May, David Jacobs, Philippe Galvez, John Harvey, Fabrizio Gagliardi, Paul Messina, David Stickland, Vincenzo Innocente, Hans-Peter Wellisch, Rene Brun, Homer Neal, Stu Loken, Olivier Martin, Juergen Knobloch, Peter Van Der Vyre, Jamie Shiers, Martti Pimia, Werner Jank, Dirk Duellman, Eva Arderiu-Ribera, Richard Mount, Federico Carminati, Les Cottrell, Krzysztof Sliwa, Paolo Capiluppi, Laura Perini, Irwin Gaines, Joel Butler, John Womersley, Tom Nash, Ian Willers, Jacques Altaber, Hossny El-Sherief