high energy physics & computing grids techfair univ. of texas @ arlington november 10, 2004
High Energy Physics & Computing Grids
TechFair
Univ. of Texas @ Arlington
November 10, 2004
What’s the Point?
High Energy Particle Physics is a study of the smallest pieces of matter.
It investigates (among other things) the nature of the universe immediately after the Big Bang.
It also explores physics at temperatures not common for the past 15 billion years (or so).
It’s a lot of fun.
Fermilab Tevatron
• World’s highest-energy proton-antiproton collider
– Ecm = 1.96 TeV (= 6.3x10^-7 J/p; 13 MJ on 10^-4 m^2)
– Equivalent to the kinetic energy of a 20 t truck at a speed of 80 mi/hr
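The truck comparison can be checked with a few lines of arithmetic. This is a minimal sketch, assuming that "20t" means 20 metric tonnes and that the classical formula KE = 1/2 m v^2 applies; the function name is illustrative and not from the slides.

```python
# Kinetic energy of a truck, for comparison with the ~13 MJ quoted on the slide.
# Assumptions (not stated on the slide): "20t" is 20 metric tonnes (20,000 kg)
# and the motion is non-relativistic, so KE = 1/2 * m * v^2.

MILE_IN_METERS = 1609.344     # international mile
SECONDS_PER_HOUR = 3600.0


def kinetic_energy_joules(mass_kg: float, speed_mph: float) -> float:
    """Return the classical kinetic energy of mass_kg moving at speed_mph."""
    speed_m_per_s = speed_mph * MILE_IN_METERS / SECONDS_PER_HOUR
    return 0.5 * mass_kg * speed_m_per_s ** 2


truck_energy = kinetic_energy_joules(mass_kg=20_000.0, speed_mph=80.0)
print(f"Truck kinetic energy: {truck_energy / 1e6:.1f} MJ")
```

Under these assumptions the result is a little under 13 MJ, in line with the figure on the slide.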
[Figure: the Tevatron ring near Chicago, with the p and anti-p beams and the CDF and DØ detectors labeled]
DØ Detector
• Weighs 5000 tons
• Can inspect 3,000,000 collisions/second
• Will record 50 collisions/second
• Records approximately 10,000,000 bytes/second
• Will record 4x10^15 (4,000,000,000,000,000) bytes in the current run (4 PetaBytes)
[Figure: detector drawing with dimensions labeled 30’, 30’ and 50’]
ATLAS Detector
• Weighs 10,000 tons
• Can inspect 1,000,000,000 collisions/second
• Will record 100 collisions/second
• Records approximately 300,000,000 bytes/second
• Will record 1.5x10^15 (1,500,000,000,000,000) bytes each year (1.5 PetaBytes)
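Two derived quantities follow directly from the detector figures above: the average size of a recorded collision, and how many collisions are inspected for each one recorded. The sketch below only divides the numbers quoted on the slides; the dictionary layout and function names are illustrative assumptions.

```python
# Figures copied from the DØ and ATLAS detector slides.
detectors = {
    "D0": {"inspected_per_s": 3_000_000, "recorded_per_s": 50,
           "bytes_per_s": 10_000_000},
    "ATLAS": {"inspected_per_s": 1_000_000_000, "recorded_per_s": 100,
              "bytes_per_s": 300_000_000},
}


def bytes_per_recorded_event(d: dict) -> float:
    """Average size of one recorded collision, in bytes."""
    return d["bytes_per_s"] / d["recorded_per_s"]


def inspected_per_recorded(d: dict) -> float:
    """How many collisions are inspected for each one that is recorded."""
    return d["inspected_per_s"] / d["recorded_per_s"]


for name, d in detectors.items():
    print(f"{name}: {bytes_per_recorded_event(d):,.0f} bytes/event, "
          f"1 recorded per {inspected_per_recorded(d):,.0f} inspected")
```

These are averages implied by the quoted rates, not measured event sizes.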
DØ Detector (cross sectional view)
How does an Event Look in the DØ Detector?
[Figure: time evolution of a p anti-p collision, with labels q, g, K, “parton jet”, “particle jet” (hadrons) and “calorimeter jet”, and the calorimeter layers EM, FH and CH]
[Figure: highest ET dijet event at DØ; jet labels show ET values of 475 GeV and 472 GeV and a value of 0.69]
How are computers used in HEP?
[Figure labels: p anti-p collision, digital data, data reconstruction]
DØ Collaboration
650 Collaborators
78 Institutions
18 Countries
Data Challenges in HEP
• Enormous amounts of data need to be analyzed
– 1.5 to 2.0 PB / year for ATLAS (PB = 10^15 bytes)
– Equivalent to 2,857,142 CD-ROMs
• Data is shared among a world-wide collaboration
• Processing requirements exist for:
– Reconstruction of captured data
– Cataloging captured data and reconstructions
– Sharing cataloged information
– Analyzing data
– Simulating events
• The Solution?
– Grid Computing
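The CD-ROM equivalence can be reproduced with integer arithmetic. This sketch assumes a capacity of 700 MB (700 x 10^6 bytes) per disc, which the slide does not state but which matches the quoted count for the 2.0 PB upper end of the range; the function name is illustrative.

```python
PETABYTE = 10**15              # bytes, as defined on the slide
CD_ROM_BYTES = 700 * 10**6     # assumption: 700 MB per disc


def full_cd_roms(petabytes: float) -> int:
    """Number of completely filled discs for a data volume in petabytes."""
    return int(petabytes * PETABYTE) // CD_ROM_BYTES


for volume_pb in (1.5, 2.0):
    print(f"{volume_pb} PB/year fills {full_cd_roms(volume_pb):,} CD-ROMs")
```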
What is a Computing Grid?
• Grid: Geographically distributed computing resources configured for coordinated use
• Physical resources & networks provide raw capability
• “Middleware” software ties it together
UTA-HEP Grid Collaborations
• Grid3 – US collaboration led by the International Virtual Data Grid Laboratory (http://www.ivdgl.org)
– Supports ATLAS, CMS, LIGO, SDSS and others
– Testbed for large scale collaborations
• DØSAR – DØ Southern Analysis Region (http://www-hep.uta.edu/d0-sar/d0-sar.html)
– Supports analysis and simulation needs in DØ for an international group of collaborators
– UTA acts as Regional Analysis Center
• THEGrid – Texas High Energy Grid (http://www-hep.uta.edu/d0-sar/d0-sar.html)
– Sharing computing resources of several Texas universities
UTA has the first and the only US RAC
UTA is the only US DØ RAC
DØSAR formed around UTA
[Map: DØSAR sites: Mexico/Brazil, OU/LU, UAZ, Rice, LTU, UTA, KU, KSU, Ole Miss]
UTA-DPCC
• 100 Pentium 4 Xeon 2.6 GHz CPUs
• 64 TB of disk space
• 84 Pentium 4 Xeon 2.4 GHz CPUs
• 7.5 TB of disk space
• Total CPUs: 193
• Total disk: 73 TB
• Total memory: 189 GB
UTA Monitoring Applications
Developed, implemented and improved by UTA Students
[Figure: monitoring plots titled “Anticipated CPU Occupation” and “Jobs in Distribute Queue”; axis labels: Number of Jobs, % of Total Available CPUs, Time from Present (hours)]
Commissioned and being deployed
Supervisor – Executors (ATLAS DC2 Production System Powered by UTA Developers)
[Diagram components: supervisors (Windmill); Jabber communication pathway; executors (1. lexor, 2. dulcinea, 3. capone, 4. legacy); Don Quijote (file catalog); Prod DB (jobs database); execution sites (grid)]
Messages: numJobsWanted, executeJobs, getExecutorData, getStatus, fixJob, killJob
Designed at UTA
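The supervisor and executor roles can be illustrated with a toy model. Only the six message names come from the slide; the `ToyExecutor` class, the `supervise_once` helper, the job status strings and the meaning given to each message are hypothetical, and a direct method call stands in for the Jabber pathway. It is a sketch of the pattern, not Windmill's actual interface.

```python
from dataclasses import dataclass, field

# Message names as listed on the slide; all behavior below is assumed.
MESSAGES = ("numJobsWanted", "executeJobs", "getExecutorData",
            "getStatus", "fixJob", "killJob")


@dataclass
class ToyExecutor:
    """Hypothetical stand-in for an executor such as lexor or capone."""
    name: str
    free_slots: int
    jobs: dict = field(default_factory=dict)   # job id -> status string

    def handle(self, message, payload=None):
        """Dispatch one supervisor message to the method of the same name."""
        if message not in MESSAGES:
            raise ValueError(f"unknown message: {message}")
        return getattr(self, message)(payload)

    def numJobsWanted(self, _payload=None):
        return self.free_slots

    def executeJobs(self, job_ids):
        accepted = list(job_ids)[: self.free_slots]
        for job_id in accepted:
            self.jobs[job_id] = "running"
        self.free_slots -= len(accepted)
        return accepted

    def getExecutorData(self, _payload=None):
        return {"name": self.name, "free_slots": self.free_slots}

    def getStatus(self, job_ids):
        return {job_id: self.jobs.get(job_id, "unknown") for job_id in job_ids}

    def fixJob(self, job_id):
        # Assumed semantics: restart a killed job if a slot is free.
        if self.jobs.get(job_id) == "killed" and self.free_slots > 0:
            self.jobs[job_id] = "running"
            self.free_slots -= 1
        return self.jobs.get(job_id, "unknown")

    def killJob(self, job_id):
        if self.jobs.get(job_id) == "running":
            self.jobs[job_id] = "killed"
            self.free_slots += 1
        return self.jobs.get(job_id, "unknown")


def supervise_once(executor, pending):
    """One supervisor step: ask how many jobs are wanted, then send that many."""
    wanted = executor.handle("numJobsWanted")
    batch, remaining = pending[:wanted], pending[wanted:]
    started = executor.handle("executeJobs", batch)
    return started, remaining


executor = ToyExecutor(name="lexor", free_slots=2)
started, remaining = supervise_once(executor, ["job-1", "job-2", "job-3"])
print("started:", started, "still pending:", remaining)
print(executor.handle("getStatus", ["job-1", "job-3"]))
```

In this sketch the supervisor holds the list of pending jobs and the executor decides how many it can take, so the supervisor logic does not need to know which grid sits behind a given executor.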
Grid Production Statistics
UTA 33%, OU 20%, LBL 47%
Figure: Pie chart showing the sites where DC1 single particle simulation jobs were processed. Only three grid testbed sites were used for this production in August 2002.
Figure: Pie chart showing the number of pile-up jobs successfully completed at various U.S. grid sites for dataset 2001 (25 GeV dijets). A total of 6000 partitions were generated.
These are examples of some datasets produced on the Grid. Many other large samples were produced, especially at BNL using batch.
Figure: Cumulative number of Monte Carlo events produced since August 2003 for the DØ collaboration, by remote site.
UTA 17%, BNL 17%, UC 14%, BU 13%, IU 10%, UCSD 5%, UM 4%, UB 4%, PDSF 4%, FNAL 4%, CalTech 4%, Others 4%
Figure: Percentage contribution toward US-ATLAS DC2 production by computing site.
Figure: Integrated CPU-days consumed by US-ATLAS computing sites, from 5/28/04 to 9/10/2004.
UTA’s contribution: ~10 CPU years
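To put the CPU-year figure in context, the sketch below converts it into an average number of continuously busy CPUs. It assumes the ~10 CPU years were accumulated over the 5/28/04 to 9/10/2004 window quoted in the figure caption and uses 365.25 days per year; both are assumptions, and the variable names are illustrative.

```python
from datetime import date

# Reporting window from the figure caption.
period_start = date(2004, 5, 28)
period_end = date(2004, 9, 10)
period_days = (period_end - period_start).days

CPU_YEARS = 10.0          # "~10 CPU years" from the slide (approximate)
DAYS_PER_YEAR = 365.25    # assumption

cpu_days = CPU_YEARS * DAYS_PER_YEAR
average_busy_cpus = cpu_days / period_days

print(f"Window length: {period_days} days")
print(f"Average CPUs kept busy: {average_busy_cpus:.0f}")
```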
UTA is a Major Player in HEP Grids
• Organizing regional and international grids
– DØSAR
– THEGrid
– Grid3
• Developing production software and monitoring applications
– Windmill (ATLAS), McFarm (DØ)
– McPerm, McQue, Pippy, GridView
• Providing substantial computing resources
– UTA is the largest producer of Monte Carlo simulations for the DØ collaboration
– UTA is the largest producer of data for the US-ATLAS collaboration
• Proposing the Southwest Tier2 Center for ATLAS computing with OU, LU, UNM (4x our current size)