
US ATLAS Tier 1 Facility

Rich Baker

Brookhaven National Laboratory

DOE/NSF Review of U.S. ATLAS and CMS Computing Projects

Brookhaven National Laboratory
November 14-17, 2000


Tier 1 Facility Requirements (1)

Based on NCB Review Numbers
  Focus on Analysis (200k of 209k SI95)
  Probably Insufficient for Simulation
CPU: 209,000 SpecInt95
  Commodity Pentium/Linux
  Estimated 640 Dual Processor Nodes
Online Storage: 365 TB Disk
  High Performance Storage Area Network
  Baseline: Fibre Channel RAID Array
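The CPU figures above imply a per-node performance target. The following is a minimal sketch of that arithmetic using only the numbers quoted on the slide; the variable names are illustrative.

```python
# Figures quoted on the slide.
TOTAL_SI95 = 209_000     # required CPU capacity, SpecInt95
DUAL_NODES = 640         # estimated dual-processor nodes
ANALYSIS_SI95 = 200_000  # portion focused on analysis

# Implied performance per node and per processor.
si95_per_node = TOTAL_SI95 / DUAL_NODES
si95_per_cpu = si95_per_node / 2

# Fraction of the CPU capacity dedicated to analysis.
analysis_share = ANALYSIS_SI95 / TOTAL_SI95

print(f"{si95_per_node:.0f} SI95 per node, {si95_per_cpu:.0f} SI95 per CPU")
print(f"analysis share: {analysis_share:.1%}")
```

This works out to roughly 327 SI95 per node, or about 163 SI95 per processor, with about 96% of the capacity assigned to analysis.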


Tier 1 Facility Requirements (2)

Tertiary Storage: 2 PB Tape Library
  Baseline: HPSS, STK Media & Tape Drives
  75% Event Summary Data
  25% Simulation, Analysis Objects, Local Data
  “Raw” I/O Rate: 400 MB/second, 12.5 PB/year
  Exploit Use Patterns to Maximize Efficiency
    Random Access to AOD - Always on Disk
    Managed Access to ESD - Grid SW? Custom SW?
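The relation between the quoted I/O rate and the yearly volume, and the split of the tape library, can be sketched as follows. This assumes decimal units (1 MB = 10^6 bytes, 1 PB = 10^15 bytes) and continuous operation over a 365-day year; neither assumption is stated on the slide.

```python
# ASSUMPTION: continuous operation for a 365-day year, decimal units.
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 s
RATE_MB_PER_S = 400                 # "raw" I/O rate from the slide
TAPE_LIBRARY_PB = 2.0               # tertiary storage from the slide

# Sustained rate converted to petabytes per year.
pb_per_year = RATE_MB_PER_S * 1e6 * SECONDS_PER_YEAR / 1e15

# Tape library split quoted on the slide.
esd_pb = 0.75 * TAPE_LIBRARY_PB    # Event Summary Data
other_pb = 0.25 * TAPE_LIBRARY_PB  # simulation, analysis objects, local data

print(f"{pb_per_year:.1f} PB/year at {RATE_MB_PER_S} MB/s")
print(f"ESD: {esd_pb} PB, other: {other_pb} PB")
```

Under these assumptions the rate gives about 12.6 PB/year, close to the 12.5 PB/year on the slide; the small difference is presumably rounding or a slightly different duty cycle.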


Estimation Methods - Hardware (1)

Use Recent RCF Purchases as Cost Baseline
Moore’s Law Scaling for Commodity Components (CPU, Disk, Tape)
STK Tape Drives: Constant Cost per Drive, Double I/O Capacity Every 2 Years
Similar Constant Cost Projections for High Performance Data Mover Nodes
  $40K per HPSS Mover Node
  $30K per SAN Control Node
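The scaling rules above can be sketched as a simple exponential model. The 2-year doubling period for tape drive I/O comes from the slide; the 1.5-year doubling period for commodity components is an assumption made here for illustration only, since the slides do not quote one. Function and constant names are hypothetical.

```python
def scaled_capacity(base: float, years: float, doubling_years: float) -> float:
    """Capacity bought for a constant price after `years` of doubling."""
    return base * 2 ** (years / doubling_years)


def unit_cost(base_cost: float, years: float, doubling_years: float) -> float:
    """Cost of a fixed amount of capacity after `years` of doubling."""
    return base_cost / 2 ** (years / doubling_years)


TAPE_DOUBLING_YEARS = 2.0       # from the slide: I/O doubles every 2 years
COMMODITY_DOUBLING_YEARS = 1.5  # ASSUMPTION: not given in the slides

# Relative tape drive I/O per drive (constant cost) in 2006 versus 2000.
tape_io_2006 = scaled_capacity(1.0, 2006 - 2000, TAPE_DOUBLING_YEARS)

# Relative cost of one unit of commodity capacity in 2006 versus 2000.
commodity_cost_2006 = unit_cost(1.0, 2006 - 2000, COMMODITY_DOUBLING_YEARS)

print(tape_io_2006, commodity_cost_2006)
```

A model of this shape is what makes late procurement attractive: the same dollar buys more capacity the later it is spent.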


Estimation Methods - Hardware (2)

Local Area Network: 8% of Disk+CPU Cost plus $20K per HPSS Mover
Firewall/WAN Hardware: 25% of LAN Cost
Interactive Nodes
  2 Linux Nodes Purchased per Year
  Maintain One Sun/Solaris Node
“General Purpose” Nodes
  21 Currently for RCF - Estimate 25 for ATLAS
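The network estimation rules above reduce to two formulas. A minimal sketch follows; the percentages and the per-mover cost are from the slide, while the function names and the input values in the example are hypothetical.

```python
LAN_FRACTION = 0.08       # LAN is 8% of disk + CPU cost (from the slide)
LAN_PER_MOVER_K = 20.0    # plus $20K per HPSS mover (from the slide)
FIREWALL_FRACTION = 0.25  # firewall/WAN hardware is 25% of LAN cost


def lan_cost_k(disk_k: float, cpu_k: float, hpss_movers: int) -> float:
    """LAN cost in k$ for a given disk and CPU spend and mover count."""
    return LAN_FRACTION * (disk_k + cpu_k) + LAN_PER_MOVER_K * hpss_movers


def firewall_wan_cost_k(lan_k: float) -> float:
    """Firewall/WAN hardware cost in k$ derived from the LAN cost."""
    return FIREWALL_FRACTION * lan_k


# Hypothetical inputs, chosen only to illustrate the formulas.
lan = lan_cost_k(disk_k=600.0, cpu_k=400.0, hpss_movers=2)
firewall = firewall_wan_cost_k(lan)
print(f"LAN: {lan:.0f} k$, firewall/WAN: {firewall:.0f} k$")
```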


Estimation Methods - Software

Share Site License Costs with RCF
  HPSS: 50% of $200K by 2005
  LSF: 50% of $65K Starting 2002
Veritas: $5K per SAN Control Node
  Or Other SW Choice
Most Other SW License Costs Can Only be Estimated - Total $97K in 2005
  Good Estimate Based on Actual RCF Costs to Support Operational Facility & Development
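The license sharing rules above can be written out as a short calculation. This is a sketch using the quoted figures; the number of SAN control nodes is not given on the slide, so the value used in the example is hypothetical.

```python
RCF_SHARE = 0.5  # site licenses are split evenly with RCF (from the slide)

hpss_share_k = RCF_SHARE * 200.0  # HPSS license, 2005 level
lsf_share_k = RCF_SHARE * 65.0    # LSF license, from 2002
other_sw_k = 97.0                 # all other software licenses, 2005 estimate


def veritas_cost_k(san_control_nodes: int) -> float:
    """Volume manager licenses at $5K per SAN control node."""
    return 5.0 * san_control_nodes


# ASSUMPTION: 4 SAN control nodes, for illustration only.
print(hpss_share_k, lsf_share_k, veritas_cost_k(4), other_sw_k)
```

These items are deliberately not summed into one software total here: the material cost tables book the HPSS license under HPSS and the volume manager under Disk.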


Existing Facility


[Diagram: US ATLAS Tier 1 Facility, August 2000 Configuration. Legend: US ATLAS Equipment, RCF Infrastructure]

Components labeled in the diagram:
  E450 (NFS Server): XXX.USATLAS.BNL.GOV, E450 front line with SSH, Objectivity Lock Server
  200 GBytes RAID Disk
  62 Intel/Linux: Dual 700/450 MHz, 256/512 MBytes, 9/18 GBytes, 100 Mbit Ethernet (3,200 SPECint95)
  Dual Intel nodes (series)
  Software: LSF, AFS, Objectivity, Gnu etc.
  USATLAS Switch
  SAN Hub
  Intel/Linux Web Server
  128 MBytes, 18 GBytes
  ~10 GBytes RAID Disk (US ATLAS AFS)
  ~50 GBytes JBOD Disk
  Backup Server
  HPSS Archive Server
  9840 Tapes
  AFS Servers
  RCF LAN


Timeline Overview

Prototype – FY ’01 & FY ’02
  Initial Development & Test, 1% to 2% scale
  Establish Facility Independent from RCF
  Lessons Learned from RCF Experience
System Tests – FY ’03 & FY ’04
  Large Scale System Tests, 5% to 10% scale
  Support Growing Tier 2 Network
Operation – FY ’05, FY ’06 & beyond
  Full Scale System Operation, 20% (’05) to 100% (’06)
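To put the scale percentages in concrete terms, the sketch below converts each phase into a CPU capacity range. It assumes the percentages apply linearly to the 209,000 SpecInt95 requirement, which the slides do not state explicitly; the 3,200 SPECint95 figure is the August 2000 configuration.

```python
FULL_SI95 = 209_000    # full-scale CPU requirement
EXISTING_SI95 = 3_200  # August 2000 configuration

# Scale ranges per phase, as fractions of the full facility (from the slide).
phases = {
    "prototype": (0.01, 0.02),     # FY '01 and '02
    "system_tests": (0.05, 0.10),  # FY '03 and '04
    "operation": (0.20, 1.00),     # FY '05 to FY '06
}

# ASSUMPTION: scale is measured in CPU capacity and applied linearly.
phase_si95 = {
    name: (low * FULL_SI95, high * FULL_SI95)
    for name, (low, high) in phases.items()
}

existing_fraction = EXISTING_SI95 / FULL_SI95

for name, (low, high) in phase_si95.items():
    print(f"{name}: {low:,.0f} to {high:,.0f} SI95")
print(f"existing facility: {existing_fraction:.1%} of full scale")
```

On this reading the existing facility, at about 1.5% of full scale, already sits inside the 1% to 2% prototype range.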


Tier 1 Facility Capacity

[Chart: Percentage Complete (0% to 100%) by Year (2000 to 2007)]


Tier 1 Facility Staffing

[Chart: staffing level (axis 0.0 to 30.0) by year (2001 to 2006). Categories: Operations, Performance Monitoring, Grid & WAN, HSM, System Administration, Planning & Management]


Tier 1 Budget By Year (FY’01 k$)

Year    Personnel   Material    Total
2001          771        639    1,411
2002        1,044        565    1,609
2003        1,428        970    2,398
2004        2,080      1,190    3,270
2005        3,378      1,697    5,074
2006        3,378      4,968    8,346
Total      12,078     10,029   22,107

• Includes Tier 1 Staff Under WBS 2.3.2
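The budget table can be checked for internal consistency. The sketch below recomputes the row and column totals from the quoted figures and compares them with the quoted totals; the 1 k$ tolerance is an assumption that allows for the table having been rounded from unrounded values.

```python
# year: (personnel, material, total), all in FY'01 k$, as quoted.
budget = {
    2001: (771, 639, 1411),
    2002: (1044, 565, 1609),
    2003: (1428, 970, 2398),
    2004: (2080, 1190, 3270),
    2005: (3378, 1697, 5074),
    2006: (3378, 4968, 8346),
}
quoted_totals = (12078, 10029, 22107)
TOLERANCE_K = 1  # ASSUMPTION: allow 1 k$ of rounding per figure

# Difference between personnel + material and the quoted yearly total.
row_diffs = {year: p + m - t for year, (p, m, t) in budget.items()}

# Difference between summed columns and the quoted "Total" row.
column_sums = tuple(sum(row[i] for row in budget.values()) for i in range(3))
column_diffs = tuple(s - q for s, q in zip(column_sums, quoted_totals))

print("row differences:", row_diffs)
print("column differences:", column_diffs)
```

Every difference stays within 1 k$, so the table is self-consistent up to rounding.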


Tier 1 Budget (FY’01 k$)

[Chart: budget in k$ (axis $0 to $9,000) by year (2001 to 2006). Series: Material, Personnel]


Tier 1 Material Costs (FY’01 k$)

Year      CPU    Disk    HPSS     SW    LAN  Travel  Staff  Other
2001        -     277     129     94     46      43     18     32
2002       51     145      92    135     20      74     24     24
2003      171     304     115    135     44     104     34     62
2004      231     309     214    159     50     144     51     32
2005      239     368     525    159    100     184     83     38
2006    1,184   1,973     966    159    281     184     83    138
Total   1,877   3,375   2,042    842    540     733    294    326

HPSS License Included in HPSS
Volume Manager Software Included in Disk
“Other” Includes Power, Special Purpose Nodes, Video Conferencing, Supplies
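The material cost breakdown can be cross-checked against the yearly material totals from the budget-by-year slide (639, 565, 970, 1,190, 1,697 and 4,968 k$). A minimal sketch, again assuming a 1 k$ rounding tolerance per figure:

```python
# Columns: CPU, Disk, HPSS, SW, LAN, Travel, Staff, Other (FY'01 k$).
material = {
    2001: (0, 277, 129, 94, 46, 43, 18, 32),
    2002: (51, 145, 92, 135, 20, 74, 24, 24),
    2003: (171, 304, 115, 135, 44, 104, 34, 62),
    2004: (231, 309, 214, 159, 50, 144, 51, 32),
    2005: (239, 368, 525, 159, 100, 184, 83, 38),
    2006: (1184, 1973, 966, 159, 281, 184, 83, 138),
}
# Yearly material totals quoted on the budget-by-year slide.
quoted_material = {2001: 639, 2002: 565, 2003: 970,
                   2004: 1190, 2005: 1697, 2006: 4968}
# "Total" row quoted on this slide.
quoted_column_totals = (1877, 3375, 2042, 842, 540, 733, 294, 326)
TOLERANCE_K = 1  # ASSUMPTION: allow 1 k$ of rounding per figure

row_sums = {year: sum(row) for year, row in material.items()}
column_sums = tuple(sum(row[i] for row in material.values()) for i in range(8))

rows_ok = all(abs(row_sums[y] - quoted_material[y]) <= TOLERANCE_K
              for y in material)
columns_ok = all(abs(s - q) <= TOLERANCE_K
                 for s, q in zip(column_sums, quoted_column_totals))
print(rows_ok, columns_ok, sum(quoted_column_totals))
```

The quoted column totals add up to 10,029 k$, matching the six-year material total.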


Tier 1 Materials Summary (FY’01 k$)

[Chart: material costs in k$ (axis $0 to $5,000) by year (2001 to 2006). Series: Other, Staff, Travel, LAN, SW, HPSS, Disk, CPU]


Tier 1 Facility Beyond 2006 (FY’01 k$)

Staffing: Constant at 25.5 FTE
Major HW Components
  Constant $ at 33% of 2006 Full Facility Cost
  Allows for Continual Upgrade
All Other Costs Level at 2005/2006 Levels

CPU   Disk   HPSS    SW   LAN  Travel  Staff  Other
489    814    398   159   116     184     83     88

Material  Personnel   Total
   2,331      3,378   5,709
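The steady-state figures can be checked in the same way. The sketch below verifies the sums and derives the implied cost per FTE. The last part tests one possible reading of "33% of 2006 Full Facility Cost": that the 2006 purchases cover the final 80% of the facility (the 20% to 100% step in the timeline), so the full facility cost is the 2006 purchase divided by 0.8. That reading is an assumption made here, not something the slides state.

```python
# Annual costs beyond 2006, FY'01 k$, as quoted.
steady = {"CPU": 489, "Disk": 814, "HPSS": 398, "SW": 159,
          "LAN": 116, "Travel": 184, "Staff": 83, "Other": 88}
PERSONNEL_K = 3378
FTE = 25.5

material_k = sum(steady.values())
total_k = material_k + PERSONNEL_K
cost_per_fte_k = PERSONNEL_K / FTE

# ASSUMPTION: 2006 purchases represent 80% of the full facility, and the
# steady-state hardware budget is 33% of the resulting full facility cost.
purchases_2006 = {"CPU": 1184, "Disk": 1973, "HPSS": 966}
FRACTION_BOUGHT_IN_2006 = 0.80
implied = {name: 0.33 * cost / FRACTION_BOUGHT_IN_2006
           for name, cost in purchases_2006.items()}

print(material_k, total_k, round(cost_per_fte_k, 1))
for name, value in implied.items():
    print(f"{name}: implied {value:.1f} k$, quoted {steady[name]} k$")
```

The sums reproduce the quoted 2,331 and 5,709 k$, and the assumed reading lands within 1 k$ of the quoted CPU, Disk and HPSS figures, although other interpretations may fit equally well.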


Comments

Scaleable Design - Recent 2.5X Expansion
Very Late Bulk Procurement
  Working System Earlier - Minimize Design Risk
  Maximize Moore’s Law Advantage
  Retain Flexibility as Long as Possible
Leverage RCF Knowledge
  Lessons Learned
  Improved Estimation


Summary

Facility Already Running
Near Term Prototype Planning in Progress
Budget Exceeds Agency Guideline by 33%
  Despite Recent 2.5X Scale Expansion!
Estimates Are Realistic
  Detailed Cost Basis From Recent Purchases
  Moore’s Law Uncertainty
Build to Cost Contingency Feasible
  As Long as Tier 2 Facilities Are Funded