US ATLAS Tier 1 Facility
Rich Baker
Brookhaven National Laboratory
DOE/NSF Review of U.S. ATLAS and CMS Computing Projects
Brookhaven National Laboratory, November 14-17, 2000
November 14-17, 2000    Rich Baker    US ATLAS Tier 1 Facility
Tier 1 Facility Requirements (1)
Based on NCB Review Numbers
  Focus on Analysis (200k of 209k SI95)
  Probably Insufficient for Simulation
CPU: 209,000 SpecInt95
  Commodity Pentium/Linux
  Estimated 640 Dual Processor Nodes
Online Storage: 365 TB Disk
  High Performance Storage Area Network
  Baseline: Fibre Channel RAID Array
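The CPU requirement and node count quoted above imply a per-node performance figure. A minimal sketch of that arithmetic follows; the comparison uses the 62 nodes and 3,200 SPECint95 quoted for the August 2000 configuration elsewhere in this talk, and the variable names are illustrative only.

```python
# Figures quoted on the slides.
TOTAL_SI95 = 209_000       # required CPU capacity (SpecInt95)
DUAL_NODES = 640           # estimated dual-processor nodes

si95_per_node = TOTAL_SI95 / DUAL_NODES
si95_per_cpu = si95_per_node / 2          # two processors per node

# August 2000 configuration: 62 dual nodes rated at 3,200 SPECint95 in total.
existing_per_node = 3_200 / 62
improvement = si95_per_node / existing_per_node

print(f"{si95_per_node:.0f} SI95 per node, {si95_per_cpu:.0f} SI95 per processor")
print(f"about {improvement:.1f}x the per-node rating of the existing nodes")
```

The estimate therefore appears to rely on per-node performance growing roughly sixfold before the bulk purchase.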
Tier 1 Facility Requirements (2)
Tertiary Storage: 2 PB Tape Library
  Baseline: HPSS, STK Media & Tape Drives
  75% Event Summary Data
  25% Simulation, Analysis Objects, Local Data
  “Raw” I/O Rate: 400 MB/second, 12.5 PB/year
  Exploit Use Patterns to Maximize Efficiency
    Random Access to AOD - Always on Disk
    Managed Access to ESD - Grid SW? Custom SW?
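The annual volume quoted for the "raw" I/O rate can be checked with a short calculation. This sketch assumes continuous operation over a 365-day year and decimal units (1 PB = 10^9 MB); neither assumption is stated on the slide.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600   # assumes the rate is sustained all year
RATE_MB_PER_S = 400                  # "raw" I/O rate from the slide
MB_PER_PB = 1e9                      # decimal units (assumption)

pb_per_year = RATE_MB_PER_S * SECONDS_PER_YEAR / MB_PER_PB
print(f"{pb_per_year:.1f} PB/year")  # about 12.6, close to the quoted 12.5
```

Under these assumptions the result agrees with the slide's 12.5 PB/year to within about 1%.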
Estimation Methods - Hardware (1)
Use Recent RCF Purchases as Cost Baseline
Moore’s Law Scaling for Commodity Components (CPU, Disk, Tape)
STK Tape Drives: Constant Cost per Drive, Double I/O Capacity Every 2 Years
Similar Constant Cost Projections for High Performance Data Mover Nodes
  $40K per HPSS Mover Node
  $30K per SAN Control Node
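The tape drive projection (constant cost per drive, I/O capacity doubling every 2 years) means the cost per unit of I/O capacity halves every 2 years. A minimal sketch of that scaling follows. The function name and the year-2000 baseline are assumptions for illustration; the slide gives no doubling period for the other commodity components, so that value is left as a parameter.

```python
def cost_per_unit_factor(years_elapsed: float, doubling_years: float = 2.0) -> float:
    """Relative cost per unit of capacity when capacity per dollar
    doubles every `doubling_years` (1.0 means baseline cost)."""
    return 0.5 ** (years_elapsed / doubling_years)

BASELINE_YEAR = 2000  # assumed baseline, not stated on the slide

for year in range(2000, 2007, 2):
    factor = cost_per_unit_factor(year - BASELINE_YEAR)
    print(year, f"{factor:.3f}")
```

With a 2-year doubling period, a unit of tape I/O capacity bought in 2006 would cost one eighth of its baseline price, which is the motivation for deferring bulk purchases.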
Estimation Methods - Hardware (2)
Local Area Network: 8% of Disk+CPU Cost plus $20K per HPSS Mover
Firewall/WAN Hardware: 25% of LAN Cost
Interactive Nodes
  2 Linux Nodes Purchased per Year
  Maintain One Sun/Solaris Node
“General Purpose” Nodes
  21 Currently for RCF - Estimate 25 for ATLAS
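The two network rules above can be written as simple formulas. This is a sketch only: the function names are hypothetical, and the input values in the usage lines are made up for illustration rather than taken from the budget.

```python
def lan_cost_k(disk_k: float, cpu_k: float, hpss_movers: int) -> float:
    """LAN cost in k$: 8% of disk + CPU cost, plus $20K per HPSS mover."""
    return 0.08 * (disk_k + cpu_k) + 20.0 * hpss_movers

def firewall_wan_cost_k(lan_k: float) -> float:
    """Firewall/WAN hardware cost in k$: 25% of the LAN cost."""
    return 0.25 * lan_k

# Hypothetical inputs: $300K of disk, $200K of CPU, 2 HPSS movers.
lan = lan_cost_k(disk_k=300, cpu_k=200, hpss_movers=2)
wan = firewall_wan_cost_k(lan)
print(f"LAN: {lan:.0f} k$, firewall/WAN: {wan:.0f} k$")
```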
Estimation Methods - Software
Share Site License Costs with RCF
  HPSS: 50% of $200K by 2005
  LSF: 50% of $65K Starting 2002
Veritas: $5K per SAN Control Node
  Or Other SW Choice
Most Other SW License Costs Can Only be Estimated - Total $97K in 2005
  Good Estimate Based on Actual RCF Costs to Support Operational Facility & Development
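The license rules above reduce to a few lines of arithmetic. In this sketch the function names are hypothetical, and the SAN control node count in the usage line is an illustrative value, since the slide does not give one.

```python
def shared_license_k(site_license_k: float, share: float = 0.5) -> float:
    """US ATLAS portion of a site license shared with RCF, in k$."""
    return site_license_k * share

def veritas_license_k(san_control_nodes: int) -> float:
    """Volume manager licensing at $5K per SAN control node, in k$."""
    return 5.0 * san_control_nodes

hpss_share = shared_license_k(200)   # 50% of $200K (by 2005)
lsf_share = shared_license_k(65)     # 50% of $65K (starting 2002)
veritas = veritas_license_k(4)       # 4 nodes is a hypothetical count

print(hpss_share, lsf_share, veritas)
```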
Existing Facility
[Diagram: US ATLAS Tier 1 Facility, August 2000 Configuration. The legend distinguishes US ATLAS Equipment from RCF Infrastructure. Labels recovered from the diagram:]
  E450 (NFS Server)
  Dual Intel (two boxes)
  US ATLAS Switch
  SAN Hub
  Backup Server
  HPSS Archive Server
  XXX.USATLAS.BNL.GOV; E450 front line with SSH; Objectivity Lock Server
  200 GBytes RAID Disk
  62 Intel/Linux: Dual 700/450 MHz, 256/512 MBytes, 9/18 GBytes, 100 Mbit Ethernet (3,200 SPECint95)
  9840 Tapes
  AFS Servers
  ~10 GBytes RAID Disk, US ATLAS AFS
  LSF; AFS; Objectivity; Gnu etc.
  ~50 GBytes JBOD Disk
  Intel/Linux Web Server
  128 MBytes, 18 GBytes
  RCF LAN
Timeline Overview
Prototype – FY ’01 & FY ’02
  Initial Development & Test, 1% to 2% scale
  Establish Facility Independent from RCF
  Lessons Learned from RCF Experience
System Tests – FY ’03 & FY ’04
  Large Scale System Tests, 5% to 10% scale
  Support Growing Tier 2 Network
Operation – FY ’05, FY ’06 & beyond
  Full Scale System Operation, 20% (’05) to 100% (’06)
Tier 1 Facility Capacity
[Chart: Percentage Complete (0% to 100%) by Year, 2000 to 2007]
Tier 1 Facility Staffing
[Chart: staffing level (axis 0 to 30) by year, 2001 to 2006, broken down by Operations, Performance Monitoring, Grid & WAN, HSM, System Administration, Planning & Management]
Tier 1 Budget By Year (FY’01 k$)
Year    Personnel   Material     Total
2001         $771       $639    $1,411
2002       $1,044       $565    $1,609
2003       $1,428       $970    $2,398
2004       $2,080     $1,190    $3,270
2005       $3,378     $1,697    $5,074
2006       $3,378     $4,968    $8,346
Total     $12,078    $10,029   $22,107

• Includes Tier 1 Staff Under WBS 2.3.2
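As a consistency check, the budget table can be re-added in a few lines. The sketch below copies the figures from the table and measures how far the rows and the stated totals are from exact sums; the 1 k$ gaps it finds are presumably rounding in the underlying spreadsheet, which is an inference rather than something the slide states.

```python
# (personnel, material, total) in FY'01 k$, copied from the table.
budget = {
    2001: (771, 639, 1411),
    2002: (1044, 565, 1609),
    2003: (1428, 970, 2398),
    2004: (2080, 1190, 3270),
    2005: (3378, 1697, 5074),
    2006: (3378, 4968, 8346),
}
stated_totals = (12078, 10029, 22107)

# Largest row discrepancy: |personnel + material - total|.
max_row_gap = max(abs(p + m - t) for p, m, t in budget.values())

# Column sums compared with the stated "Total" row.
column_sums = tuple(sum(row[i] for row in budget.values()) for i in range(3))
max_col_gap = max(abs(s - t) for s, t in zip(column_sums, stated_totals))

print(column_sums, max_row_gap, max_col_gap)
```

Every row and column agrees with the stated figures to within 1 k$.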
Tier 1 Budget (FY’01 k$)
[Chart: Material and Personnel budget by year, 2001 to 2006; vertical axis $0 to $9,000]
Tier 1 Material Costs (FY’01 k$)
Year      CPU    Disk    HPSS     SW    LAN  Travel  Staff  Other
2001       $0    $277    $129    $94    $46     $43    $18    $32
2002      $51    $145     $92   $135    $20     $74    $24    $24
2003     $171    $304    $115   $135    $44    $104    $34    $62
2004     $231    $309    $214   $159    $50    $144    $51    $32
2005     $239    $368    $525   $159   $100    $184    $83    $38
2006   $1,184  $1,973    $966   $159   $281    $184    $83   $138
Total  $1,877  $3,375  $2,042   $842   $540    $733   $294   $326

HPSS License Included in HPSS
Volume Manager Software Included in Disk
“Other” Includes Power, Special Purpose Nodes, Video Conferencing, Supplies
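The material cost breakdown can be cross-checked against the Material column of the budget-by-year slide. The sketch below copies both sets of figures from the slides and reports the differences; the variable names are illustrative.

```python
CATEGORIES = ("CPU", "Disk", "HPSS", "SW", "LAN", "Travel", "Staff", "Other")

# Material costs by year in FY'01 k$, in the order given by CATEGORIES.
material = {
    2001: (0, 277, 129, 94, 46, 43, 18, 32),
    2002: (51, 145, 92, 135, 20, 74, 24, 24),
    2003: (171, 304, 115, 135, 44, 104, 34, 62),
    2004: (231, 309, 214, 159, 50, 144, 51, 32),
    2005: (239, 368, 525, 159, 100, 184, 83, 38),
    2006: (1184, 1973, 966, 159, 281, 184, 83, 138),
}
# Material column from the budget-by-year slide.
material_by_year = {2001: 639, 2002: 565, 2003: 970,
                    2004: 1190, 2005: 1697, 2006: 4968}
stated_category_totals = (1877, 3375, 2042, 842, 540, 733, 294, 326)

# Difference between each year's category sum and the stated material budget.
row_gaps = {year: sum(costs) - material_by_year[year]
            for year, costs in material.items()}

# Difference between each category's six-year sum and its stated total.
category_sums = tuple(sum(costs[i] for costs in material.values())
                      for i in range(len(CATEGORIES)))
category_gaps = dict(zip(CATEGORIES,
                         (s - t for s, t in zip(category_sums, stated_category_totals))))

print(row_gaps)
print(category_gaps)
```

All differences are at most 1 k$ in magnitude, so the two slides are consistent up to what looks like rounding.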
Tier 1 Materials Summary (FY’01 k$)
[Chart: material costs by year, 2001 to 2006, by category (CPU, Disk, HPSS, SW, LAN, Travel, Staff, Other); vertical axis $0 to $5,000]
Tier 1 Facility Beyond 2006 (FY’01 k$)
Staffing: Constant at 25.5 FTE
Major HW Components
  Constant $ at 33% of 2006 Full Facility Cost
  Allows for Continual Upgrade
All Other Costs Level at 2005/2006 Levels

  CPU   Disk   HPSS     SW    LAN  Travel  Staff  Other
 $489   $814   $398   $159   $116    $184    $83    $88

Material  Personnel    Total
  $2,331     $3,378   $5,709
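The steady-state figures can be checked in two ways. The first check, that the categories sum to the stated material and overall totals, uses only numbers on this slide. The second is one possible reading of "33% of 2006 Full Facility Cost" and is an assumption, not something the slides state: if the FY ’06 purchases cover the step from 20% to 100% of capacity, then the full facility at 2006 prices costs the 2006 purchase divided by 0.8. Treating LAN as a major hardware component is likewise an assumption.

```python
# Annual costs beyond 2006 in FY'01 k$, from the slide.
steady_state = {"CPU": 489, "Disk": 814, "HPSS": 398, "SW": 159,
                "LAN": 116, "Travel": 184, "Staff": 83, "Other": 88}
PERSONNEL = 3378

material_total = sum(steady_state.values())
grand_total = material_total + PERSONNEL
print(material_total, grand_total)

# Assumed reading: 2006 purchases buy the last 80% of the facility.
purchases_2006 = {"CPU": 1184, "Disk": 1973, "HPSS": 966, "LAN": 281}
FRACTION_BOUGHT_IN_2006 = 0.80   # assumption: 20% ('05) to 100% ('06)
UPGRADE_RATE = 0.33              # "33% of 2006 full facility cost"

estimate = {name: UPGRADE_RATE * cost / FRACTION_BOUGHT_IN_2006
            for name, cost in purchases_2006.items()}
for name, value in estimate.items():
    print(f"{name}: estimated {value:.1f}, slide {steady_state[name]}")
```

Under this reading the estimates land within about 1 k$ of the slide values for all four components.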
Comments
Scaleable Design - Recent 2.5X Expansion
Very Late Bulk Procurement
  Working System Earlier - Minimize Design Risk
  Maximize Moore’s Law Advantage
  Retain Flexibility as Long as Possible
Leverage RCF Knowledge
  Lessons Learned
  Improved Estimation
Summary
Facility Already Running
Near Term Prototype Planning in Progress
Budget Exceeds Agency Guideline by 33%
  Despite Recent 2.5X Scale Expansion!
Estimates Are Realistic
  Detailed Cost Basis From Recent Purchases
  Moore’s Law Uncertainty
Build to Cost Contingency Feasible
  As Long as Tier 2 Facilities Are Funded