![Page 1: Interactive Data Analysis with PROOF Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers CERN](https://reader033.vdocuments.mx/reader033/viewer/2022051622/56649ea95503460f94bad0ad/html5/thumbnails/1.jpg)
Interactive Data Analysis with PROOF
Bleeding Edge Physics with Bleeding Edge Computing
Fons Rademakers
CERN
![Page 2: Interactive Data Analysis with PROOF Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers CERN](https://reader033.vdocuments.mx/reader033/viewer/2022051622/56649ea95503460f94bad0ad/html5/thumbnails/2.jpg)
LHC Data Challenge
•The LHC generates:
■ 40 million collisions per second
•Combined, the 4 experiments record:
■ After filtering, 100 interesting collisions per second
■ From 1 to 12 MB per collision ⇒ from 0.1 to 1.2 GB/s
■ 10¹⁰ collisions registered every year
■ ~ 10 petabytes (10¹⁵ B) per year
■ LHC data correspond to 20 million DVDs per year!
■ Computing power equivalent to 100,000 of today’s PCs
■ Space equivalent to 400,000 large PC disks
Figure labels (height comparison): DVD stack with one year of LHC data (~ 20 km), balloon (30 km), airplane (10 km), Mt. Blanc (4.8 km)
![Page 3: Interactive Data Analysis with PROOF Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers CERN](https://reader033.vdocuments.mx/reader033/viewer/2022051622/56649ea95503460f94bad0ad/html5/thumbnails/3.jpg)
HEP Data Analysis
•Typical HEP analysis needs a continuous algorithm refinement cycle
Cycle diagram: Implement algorithm → Run over data set → Make improvements → back to Implement algorithm
![Page 4: Interactive Data Analysis with PROOF Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers CERN](https://reader033.vdocuments.mx/reader033/viewer/2022051622/56649ea95503460f94bad0ad/html5/thumbnails/4.jpg)
HEP Data Analysis
•Ranging from I/O bound to CPU bound
•Need many disks to get the needed I/O rate
•Need many CPUs for processing
•Need a lot of memory to cache as much as possible
![Page 5: Interactive Data Analysis with PROOF Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers CERN](https://reader033.vdocuments.mx/reader033/viewer/2022051622/56649ea95503460f94bad0ad/html5/thumbnails/5.jpg)
Some ALICE Numbers
•1.5 PB of raw data per year
•360 TB of ESD+AOD per year (20% of raw)
•One pass using 400 disks at 15 MB/s will take 16 hours
Using parallelism is the only way to analyze this amount of data in a reasonable amount of time
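The 16-hour figure follows from dividing the ESD+AOD volume by the aggregate disk bandwidth. A minimal sketch of that arithmetic, assuming decimal units (1 TB = 10⁶ MB) and perfectly parallel reads with no overhead; the function name `passTimeHours` is illustrative and not part of ROOT or PROOF:

```cpp
#include <cassert>

// Hours needed for one full pass over a data set when nDisks disks
// are read in parallel at mbPerSecPerDisk each (idealized: no overhead).
double passTimeHours(double dataTB, int nDisks, double mbPerSecPerDisk) {
    assert(nDisks > 0 && mbPerSecPerDisk > 0.0);
    const double totalMB = dataTB * 1.0e6;                  // 1 TB = 10^6 MB (decimal units)
    const double aggregateMBps = nDisks * mbPerSecPerDisk;  // combined read rate of all disks
    return totalMB / aggregateMBps / 3600.0;                // seconds -> hours
}
```

With the numbers above, `passTimeHours(360, 400, 15)` gives a little under 17 hours; a single disk at the same rate would need 400 times as long.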
![Page 6: Interactive Data Analysis with PROOF Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers CERN](https://reader033.vdocuments.mx/reader033/viewer/2022051622/56649ea95503460f94bad0ad/html5/thumbnails/6.jpg)
PROOF Design Goals
•System for running ROOT queries in parallel on a large number of distributed computers or multi-core machines
•PROOF is designed to be a transparent, scalable and adaptable extension of the local interactive ROOT analysis session
•Extends the interactive model to long running “interactive batch” queries
![Page 7: Interactive Data Analysis with PROOF Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers CERN](https://reader033.vdocuments.mx/reader033/viewer/2022051622/56649ea95503460f94bad0ad/html5/thumbnails/7.jpg)
Where to Use PROOF
•CERN Analysis Facility (CAF)
•Departmental workgroups (Tier-2s)
•Multi-core, multi-disk desktops (Tier-3/4s)
![Page 8: Interactive Data Analysis with PROOF Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers CERN](https://reader033.vdocuments.mx/reader033/viewer/2022051622/56649ea95503460f94bad0ad/html5/thumbnails/8.jpg)
The Traditional Batch Approach
Diagram: a query enters the job splitter, which splits the analysis job into N stand-alone sub-jobs. The sub-jobs wait in the queue of the batch scheduler and run on the batch cluster (CPUs, storage, file catalog). A job merger collects the sub-jobs and merges them into a single output.
• Split analysis task in N batch jobs
• Job submission sequential
• Very hard to get feedback during processing
• Analysis finished when last sub-job finished
![Page 9: Interactive Data Analysis with PROOF Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers CERN](https://reader033.vdocuments.mx/reader033/viewer/2022051622/56649ea95503460f94bad0ad/html5/thumbnails/9.jpg)
The PROOF Approach
Diagram: a PROOF query (data file list, mySelector.C) is sent to the master, whose scheduler distributes the work over the PROOF cluster (CPUs, storage, file catalog). Feedback and the merged final output are returned to the client.
• Cluster perceived as extension of local PC
• Same macro and syntax as in local session
• More dynamic use of resources
• Real-time feedback
• Automatic splitting and merging
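Automatic merging means that each worker fills its own copy of the output objects and the results are combined into one final output. A minimal sketch of the idea for a histogram, using a plain vector of bin counts instead of ROOT classes; `Histogram` and `mergeOutputs` are hypothetical names used for illustration only:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative stand-in for a histogram: one count per bin.
using Histogram = std::vector<long>;

// Combine the partial histograms filled by the workers into a single
// result by adding the contents bin by bin (all must share the binning).
Histogram mergeOutputs(const std::vector<Histogram>& partials) {
    Histogram merged;
    for (const Histogram& h : partials) {
        if (merged.empty()) merged.assign(h.size(), 0);  // adopt binning of first input
        assert(h.size() == merged.size());               // same binning required
        for (std::size_t bin = 0; bin < h.size(); ++bin)
            merged[bin] += h[bin];
    }
    return merged;
}
```

Because addition is associative, the partial results can be merged in any order, so the user sees one output regardless of how the work was split.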
![Page 10: Interactive Data Analysis with PROOF Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers CERN](https://reader033.vdocuments.mx/reader033/viewer/2022051622/56649ea95503460f94bad0ad/html5/thumbnails/10.jpg)
Multi-Tier Architecture
Adapts to wide area virtual clusters: geographically separated domains, heterogeneous machines
Network performance (diagram labels): Less important / VERY important
Optimize for data locality or high bandwidth data server access
![Page 11: Interactive Data Analysis with PROOF Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers CERN](https://reader033.vdocuments.mx/reader033/viewer/2022051622/56649ea95503460f94bad0ad/html5/thumbnails/11.jpg)
The ROOT Data Model: Trees & Selectors
Diagram: a chain is looped over event by event (1, 2, ..., n, ..., last). Each event n is organized in branches and leaves, and only the needed parts are read. The selector handles the chain in three steps:
Begin(): create histograms, define output list
Process(): preselection; if OK, analysis, which fills the output list
Terminate(): finalize analysis (fitting, ...)
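The Begin/Process/Terminate cycle can be mimicked without ROOT. The sketch below is a simplified stand-in and not the real TSelector interface: `MiniSelector`, `CountingSelector`, and `runLocal` are hypothetical names, and an event is reduced to a single number.

```cpp
#include <cassert>
#include <vector>

// Simplified stand-in for a selector (not the ROOT TSelector API).
struct MiniSelector {
    virtual ~MiniSelector() = default;
    virtual void Begin() = 0;                // create outputs
    virtual bool Process(double event) = 0;  // returns false if the event is rejected
    virtual void Terminate() = 0;            // finalize the analysis
};

// Example selector: keeps events above a threshold and averages them.
struct CountingSelector : MiniSelector {
    double threshold = 0.1;
    long accepted = 0;
    double sum = 0.0;
    double mean = 0.0;

    void Begin() override { accepted = 0; sum = 0.0; mean = 0.0; }
    bool Process(double event) override {
        if (event <= threshold) return false;  // preselection failed, skip event
        ++accepted;                            // analysis: update the outputs
        sum += event;
        return true;
    }
    void Terminate() override {
        if (accepted > 0) mean = sum / accepted;
    }
};

// Driver: the loop over events that the framework would normally run.
// An empty chain is treated as a usage error in this sketch.
void runLocal(MiniSelector& selector, const std::vector<double>& chain) {
    assert(!chain.empty());
    selector.Begin();
    for (double event : chain) selector.Process(event);
    selector.Terminate();
}
```

The user code lives entirely in the three methods; the loop itself belongs to the driver, which is what allows the same selector to be run locally or handed to a parallel system.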
![Page 12: Interactive Data Analysis with PROOF Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers CERN](https://reader033.vdocuments.mx/reader033/viewer/2022051622/56649ea95503460f94bad0ad/html5/thumbnails/12.jpg)
TSelector - User Code
    // Abbreviated version
    class TSelector : public TObject {
    protected:
       TList *fInput;
       TList *fOutput;
    public:
       void Notify(TTree*);
       void Begin(TTree*);
       void SlaveBegin(TTree*);
       Bool_t Process(int entry);
       void SlaveTerminate();
       void Terminate();
    };
![Page 13: Interactive Data Analysis with PROOF Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers CERN](https://reader033.vdocuments.mx/reader033/viewer/2022051622/56649ea95503460f94bad0ad/html5/thumbnails/13.jpg)
TSelector::Process()
    ... ...
    // select event
    b_nlhk->GetEntry(entry);  if (nlhk[ik] <= 0.1) return kFALSE;
    b_nlhpi->GetEntry(entry); if (nlhpi[ipi] <= 0.1) return kFALSE;
    b_ipis->GetEntry(entry);  ipis--; if (nlhpi[ipis] <= 0.1) return kFALSE;
    b_njets->GetEntry(entry); if (njets < 1) return kFALSE;

    // selection made, now analyze event
    b_dm_d->GetEntry(entry);   //read branch holding dm_d
    b_rpd0_t->GetEntry(entry); //read branch holding rpd0_t
    b_ptd0_d->GetEntry(entry); //read branch holding ptd0_d

    //fill some histograms
    hdmd->Fill(dm_d);
    h2->Fill(dm_d,rpd0_t/0.029979*1.8646/ptd0_d);
    ... ...
![Page 14: Interactive Data Analysis with PROOF Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers CERN](https://reader033.vdocuments.mx/reader033/viewer/2022051622/56649ea95503460f94bad0ad/html5/thumbnails/14.jpg)
The Packetizer
•The packetizer is the heart of the system
•It runs on the master and hands out work to the workers
•Different packetizers allow for different data access policies
■All data on disk, allow network access
■All data on disk, no network access
■Data on mass storage, go file-by-file
■Data on Grid, distribute per Storage Element
•Makes sure all workers end at the same time
Pull architecture: workers ask for work, no complex worker state in the master
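The pull model can be sketched in a few lines: the master only tracks the next unassigned entry, and a worker asks for a new packet whenever it is idle. This is an illustration under simplifying assumptions (fixed packet size, a single data set, no data locality, single-threaded), not the actual PROOF packetizer; `Packet` and `SimplePacketizer` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>

// A contiguous range of entries [first, first + count) to be processed.
struct Packet {
    long first;
    long count;
};

// Hands out packets on request; keeps no per-worker state.
class SimplePacketizer {
public:
    SimplePacketizer(long totalEntries, long packetSize)
        : fTotal(totalEntries), fPacketSize(packetSize), fNext(0) {
        assert(totalEntries >= 0 && packetSize > 0);
    }

    // Called by an idle worker. Fills 'packet' and returns true, or
    // returns false once all entries have been assigned.
    bool GetNextPacket(Packet& packet) {
        if (fNext >= fTotal) return false;
        packet.first = fNext;
        packet.count = std::min(fPacketSize, fTotal - fNext);  // last packet may be shorter
        fNext += packet.count;
        return true;
    }

private:
    long fTotal;       // number of entries in the data set
    long fPacketSize;  // maximum entries per packet
    long fNext;        // first entry not yet handed out
};
```

A faster worker comes back sooner and therefore receives more packets, which is how a pull scheme balances the load without the master tracking the state of each worker.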
![Page 15: Interactive Data Analysis with PROOF Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers CERN](https://reader033.vdocuments.mx/reader033/viewer/2022051622/56649ea95503460f94bad0ad/html5/thumbnails/15.jpg)
PROOF Scalability on Multi-Core Machines
The current version of Mac OS X is fully 8-core capable. I have been running my Mac Pro with dual quad-core CPUs for 4 months.
![Page 16: Interactive Data Analysis with PROOF Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers CERN](https://reader033.vdocuments.mx/reader033/viewer/2022051622/56649ea95503460f94bad0ad/html5/thumbnails/16.jpg)
Production Usage in Phobos
![Page 17: Interactive Data Analysis with PROOF Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers CERN](https://reader033.vdocuments.mx/reader033/viewer/2022051622/56649ea95503460f94bad0ad/html5/thumbnails/17.jpg)
Interactive Batch
•Allow submission of long running queries
•Allow client/master disconnect, reconnect
•Allow interaction and feedback at any time during the processing
![Page 18: Interactive Data Analysis with PROOF Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers CERN](https://reader033.vdocuments.mx/reader033/viewer/2022051622/56649ea95503460f94bad0ad/html5/thumbnails/18.jpg)
Analysis Scenario
Monday at 10:15, ROOT session on my laptop
AQ1: 1s query produces a local histogram
AQ2: a 10m query submitted to PROOF1
AQ3 - AQ7: short queries
AQ8: a 10h query submitted to PROOF2
Monday at 16:25, ROOT session on my desktop
BQ1: browse results of AQ2
BQ2: browse intermediate results of AQ8
BQ3 - BQ6: submit four 10m queries to PROOF1
Wednesday at 8:40, browse from any web browser
CQ1: browse results of AQ8, BQ3 - BQ6
![Page 19: Interactive Data Analysis with PROOF Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers CERN](https://reader033.vdocuments.mx/reader033/viewer/2022051622/56649ea95503460f94bad0ad/html5/thumbnails/19.jpg)
Session Viewer GUI
•Open/close sessions
•Define a chain
•Submit a query, execute a command
•Query editor
•Online monitoring of feedback histograms
•Browse folders with query results
•Retrieve, archive and delete query results
![Page 20: Interactive Data Analysis with PROOF Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers CERN](https://reader033.vdocuments.mx/reader033/viewer/2022051622/56649ea95503460f94bad0ad/html5/thumbnails/20.jpg)
Monitoring
•MonALISA-based monitoring
■Each host reports to MonALISA
■Each worker reports to MonALISA
•Internal monitoring
■File access rate, packet generation time and latency, processing time, etc.
■Produces a tree for further analysis
![Page 21: Interactive Data Analysis with PROOF Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers CERN](https://reader033.vdocuments.mx/reader033/viewer/2022051622/56649ea95503460f94bad0ad/html5/thumbnails/21.jpg)
Query Monitoring
The same for: CPU usage, cluster usage, memory, event rate, local/remote MB/s and files/s
![Page 22: Interactive Data Analysis with PROOF Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers CERN](https://reader033.vdocuments.mx/reader033/viewer/2022051622/56649ea95503460f94bad0ad/html5/thumbnails/22.jpg)
ALICE CAF Test Setup
•CAF test setup under evaluation since May
■33 machines, 2 CPUs each, 200 GB disk
•Tests performed
■Usability tests
■Simple speedup plot
■Evaluation of different query types
■Evaluation of the system when running a combination of query types
![Page 23: Interactive Data Analysis with PROOF Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers CERN](https://reader033.vdocuments.mx/reader033/viewer/2022051622/56649ea95503460f94bad0ad/html5/thumbnails/23.jpg)
Query Type Cocktail
•4 different query types
■20% very short queries
■40% short queries
■20% medium queries
■20% long queries
•User mix
■33 nodes
■10 users, 10 or 30 workers/user, max. average speedup = 6.6
■5 users, 20 workers/user
■15 users, 7 workers/user
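The quoted maximum average speedup of 6.6 is consistent with sharing all CPUs of the test setup (33 machines with 2 CPUs each, so 66 CPUs) evenly among 10 concurrent users. A sketch of that estimate, assuming this is indeed how the number was obtained; `maxAverageSpeedup` is an illustrative helper, not a PROOF function:

```cpp
#include <cassert>

// Upper bound on the average per-user speedup when nUsers run
// concurrently and share all CPUs of the cluster evenly.
double maxAverageSpeedup(int nNodes, int cpusPerNode, int nUsers) {
    assert(nNodes > 0 && cpusPerNode > 0 && nUsers > 0);
    return static_cast<double>(nNodes * cpusPerNode) / nUsers;
}
```

Under the same assumption, the estimate gives 13.2 for the 5-user mix and 4.4 for the 15-user mix.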
![Page 24: Interactive Data Analysis with PROOF Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers CERN](https://reader033.vdocuments.mx/reader033/viewer/2022051622/56649ea95503460f94bad0ad/html5/thumbnails/24.jpg)
Relative Speedup
Plot annotation: average expected speedup
![Page 25: Interactive Data Analysis with PROOF Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers CERN](https://reader033.vdocuments.mx/reader033/viewer/2022051622/56649ea95503460f94bad0ad/html5/thumbnails/25.jpg)
Relative Speedup
Plot annotation: average expected speedup
![Page 26: Interactive Data Analysis with PROOF Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers CERN](https://reader033.vdocuments.mx/reader033/viewer/2022051622/56649ea95503460f94bad0ad/html5/thumbnails/26.jpg)
Cluster Efficiency
![Page 27: Interactive Data Analysis with PROOF Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers CERN](https://reader033.vdocuments.mx/reader033/viewer/2022051622/56649ea95503460f94bad0ad/html5/thumbnails/27.jpg)
Multi-User Scheduling
•Scheduler is needed to control the use of available resources in multi-user environments
•Decisions taken per query based on the following metrics:
■Overall cluster load
■Resources needed by the query
■User quotas and priorities
•Requires dynamic cluster reconfiguration
•Generic interface to external schedulers planned (Condor, LSF, ...)
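How these inputs are combined into a per-query decision is not spelled out here. The following is a purely hypothetical policy for illustration, not PROOF's scheduler: a query receives the number of workers it asks for, capped by the user's quota and by the workers that are currently free. Priorities are ignored, and `QueryRequest` and `assignWorkers` are made-up names.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical description of what a query asks for.
struct QueryRequest {
    int workersWanted;  // resources needed by the query
    int userQuota;      // maximum number of workers this user may hold
};

// Number of workers granted to the query under the toy policy:
// the smallest of request, quota, and currently free workers.
int assignWorkers(const QueryRequest& query, int totalWorkers, int busyWorkers) {
    assert(totalWorkers >= 0 && busyWorkers >= 0 && busyWorkers <= totalWorkers);
    const int freeWorkers = totalWorkers - busyWorkers;  // reflects overall cluster load
    const int granted = std::min({query.workersWanted, query.userQuota, freeWorkers});
    return std::max(0, granted);  // never return a negative number of workers
}
```

Since the number of free workers changes from query to query, even this toy policy needs the cluster to be reconfigured dynamically.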
![Page 28: Interactive Data Analysis with PROOF Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers CERN](https://reader033.vdocuments.mx/reader033/viewer/2022051622/56649ea95503460f94bad0ad/html5/thumbnails/28.jpg)
Conclusions
•The LHC will generate data on a scale not seen anywhere before
•LHC experiments will critically depend on parallel solutions to analyze their enormous amounts of data
•Grids will very likely not provide the stability and reliability we need for repeatable high-statistics analysis
•Wish us good luck!