US CMS Software and Computing Project
US CMS Collaboration Meeting at FSU, May 2002
Lothar A T Bauerdick / Fermilab, Project Manager
May 11, 2002


Page 1: US CMS Software and Computing Project, US CMS Collaboration Meeting at FSU, May 2002

May 11, 2002, U.S. CMS Collaboration Meeting. Lothar A T Bauerdick, Fermilab.

US CMS Software and Computing Project
US CMS Collaboration Meeting at FSU, May 2002
Lothar A T Bauerdick / Fermilab
Project Manager

Page 2: Scope and Deliverables

Provide Computing Infrastructure in the U.S. — that needs R&D. Provide software engineering support for CMS.

Mission is to develop and build “User Facilities” for CMS physics in the U.S.:
- To provide the enabling IT infrastructure that allows U.S. physicists to fully participate in the physics program of CMS
- To provide the U.S. share of the framework and infrastructure software

Tier-1 center at Fermilab provides computing resources and support:
- User support for the “CMS physics community”, e.g. software distribution, help desk
- Support for Tier-2 centers, and for the physics analysis center at Fermilab

Five Tier-2 centers in the U.S.:
- Together they will provide the same CPU/disk resources as the Tier-1
- Facilitate “involvement of the collaboration” in S&C development

Prototyping and test-bed effort has been very successful. Universities will “bid” to host a Tier-2 center, taking advantage of existing resources and expertise. Tier-2 centers are to be funded through an NSF program for “empowering universities”; a proposal to the NSF was submitted in Nov 2001.

Page 3: Project Milestones and Schedules

Prototyping, test-beds, and R&D started in 2000: “Developing the LHC Computing Grid” in the U.S.

R&D systems, funded in FY2002 and FY2003:
- Used for the “5% data challenge” (end 2003)
- Release of the Software and Computing TDR (technical design report)

Prototype T1/T2 systems, funded in FY2004:
- For the “20% data challenge” (end 2004)
- End of “Phase 1”, Regional Center TDR, start of deployment

Deployment 2005-2007, with 30%, 30%, 40% of costs:
- Fully functional Tier-1/2, funded in FY2005 through FY2007
- Ready for the LHC physics run and the start of the physics program

S&C Maintenance and Operations: 2007 on.

Page 4: US CMS S&C Since UCR

Consolidation of the project, shaping up the R&D program.

Project baselined in Nov 2001: workplan for CAS, UF, and Grids endorsed.

CMS has become the “lead experiment” for Grid work (Koen, Greg, Rick):
- US Grid projects PPDG, GriPhyN and iVDGL
- EU Grid projects DataGrid and DataTAG
- LHC Computing Grid Project

Fermilab UF team, Tier-2 prototypes, US CMS testbed: major production efforts, PRS support.

Objectivity goes, LCG comes:
- We do have a working software and computing system! Physics analysis
- CCS will drive much of the common LCG Application area

Major challenges to manage and execute the project:
- Since fall 2001 we have known the LHC start would be delayed; the new date is April 2007
- Proposal to NSF in Oct 2001; things are probably moving now
- New DOE funding guidance (and lack thereof from NSF) is starving us in 2002-2004

Very strong support for the Project from individuals in CMS, Fermilab, Grids, FA.

Page 5: Other New Developments

NSF proposal guidance AND DOE guidance are combined (S&C + M&O):
- That prompted a change in US CMS line management
- The Program Manager will oversee both the Construction Project and the S&C Project

New DOE guidance for S&C + M&O is much below the S&C baseline plus the M&O request.

The Europeans have achieved major UF funding, significantly larger relative to the U.S. The LCG has started and expects the U.S. to partner with European projects.

The LCG Application Area possibly imposes issues on the CAS structure.

Many developments and changes that invalidate or challenge much of what the PM tried to achieve.

Opportunity to take stock of where we stand in US CMS S&C before we try to understand where we need to go.

Page 6: Vivian Has Left S&C

Thanks and appreciation for Vivian’s work of bringing the UF project to its successful baseline.

A new scientist position has opened at Fermilab for the UF L2 manager, and physics!

Other assignments:
- Hans Wenzel: Tier-1 Manager
- Jorge Rodriguez: U. Florida pT2 L3 manager
- Greg Graham: CMS GIT Production Task Lead
- Rick Cavanaugh: US CMS Testbed Coordinator

Page 7: Project Status

User Facilities status and successes:
- US CMS prototype systems: Tier-1, Tier-2, testbed
- Intense collaboration with the US Grid projects; Grid-enabled MC production system
- User support: facilities, software, and operations for PRS studies

Core Application Software status and successes: see Ian’s talk.

Project Office started:
- Project Engineer hired, to work on WBS, schedule, budget, reporting, documenting
- SOWs in place with the CAS universities; MOUs, subcontracts, and invoicing are coming
- In the process of signing the MOUs
- Have a draft MOU with iVDGL on prototype Tier-2 funding

Page 8: Successful Base-lining Review

“The Committee endorses the proposed project scope, schedule, budgets and management plan.”

Endorsement for the “scrubbed” project plan following the DOE/NSF guidance: $3.5M (DOE) + $2M (NSF) in FY2003 and $5.5M (DOE) + $3M (NSF) in FY2004!

Findings & Recommendations of the Project Management Subcommittee:
- US CMS Project Management is in place and working well
- Project appears well defined; scope can be achieved with proposed resources
- Budget matches agency guidance profile
- Schedule takes advantage of the unofficial LHC slip and appears achievable (but with little or no margin)

Findings & Recommendations of the Project Management Subcommittee (2):
- US CMS has taken an excellent first step in defining what services they require from grid-developed SW packages
- US CMS needs to implement the tracking procedure to assess grid-project progress and its impact on the US CMS schedule
- The Committee is concerned about the potential impact on US CMS of future design/specification decisions made by CERN, especially, e.g., in the area of data persistence models and Grid technology

Page 9: CMS Produced Data in 2001

Simulated events, TOTAL = 8.4 M (by site):

  Caltech      2.50 M
  FNAL         1.65 M
  Bristol/RAL  1.27 M
  CERN         1.10 M
  INFN         0.76 M
  Moscow       0.43 M
  IN2P3        0.31 M
  Helsinki     0.13 M
  Wisconsin    0.07 M
  UCSD         0.06 M
  UFL          0.05 M

Typical event sizes:
- Simulated: 1 CMSIM event = 1 OOHit event = 1.4 MB
- Reconstructed: 1 “10^33” event = 1.2 MB; 1 “2x10^33” event = 1.6 MB; 1 “10^34” event = 5.6 MB

Reconstructed with pile-up, TOTAL = 29 TB (by site):

  CERN         14 TB
  FNAL         12 TB
  Caltech      0.60 TB
  Moscow       0.45 TB
  INFN         0.40 TB
  Bristol/RAL  0.22 TB
  UCSD         0.20 TB
  IN2P3        0.10 TB
  UFL          0.08 TB
  Wisconsin    0.05 TB
  Helsinki     --

These fully simulated data samples are essential for physics and trigger studies, and for the Technical Design Report for DAQ and Higher Level Triggers.
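The event sizes quoted above make the sample volumes easy to sanity-check; a minimal sketch in Python (the helper name and the decimal-unit convention are mine, not from the slide):

```python
# Sanity-check the 2001 production volumes from the event sizes on this slide.
SIM_MB = 1.4  # one fully simulated (CMSIM/OOHit) event, as quoted

def sample_size_tb(n_events, mb_per_event):
    """Size of a sample in TB, using decimal units (1 TB = 1e6 MB)."""
    return n_events * mb_per_event / 1e6

# 8.4M fully simulated events at 1.4 MB each is roughly 12 TB,
# comparable to the largest per-site volumes listed (FNAL, CERN):
print(f"{sample_size_tb(8.4e6, SIM_MB):.1f} TB")  # prints "11.8 TB"
```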

Page 10: Production Operations

Production efforts are manpower intensive!

Fermilab Tier-1 production operations: a sustained effort of ∑ = 1.7 FTE to fill those 8 roles, plus the system support people who need to help if something goes wrong!

At Fermilab (US CMS, PPDG): Greg Graham, Shafqat Aziz, Yujun Wu, Moacyr Souza, Hans Wenzel, Michael Ernst, Shahzad Muzaffar + staff

At U. Florida (GriPhyN, iVDGL): Dimitri Bourilkov, Jorge Rodriguez, Rick Cavanaugh + staff

At Caltech (GriPhyN, PPDG, iVDGL, USCMS): Vladimir Litvin, Suresh Singh et al.

At UCSD (PPDG, iVDGL): Ian Fisk, James Letts + staff

At Wisconsin: Pam Chumney, R. Gowrishankara, David Mulvihill + Peter Couvares, Alain Roy et al.

At CERN (USCMS): Tony Wildish + many

Page 11: US CMS Prototypes and Test-beds

Tier-1 and Tier-2 prototypes and test-beds are operational:
- Facilities for event simulation, including reconstruction
- Sophisticated processing for pile-up simulation
- User cluster and hosting of data samples for physics studies
- Facilities and Grid R&D

Page 12: Tier-1 Equipment

[Photo of Tier-1 equipment: IBM servers, CMSUN1 and Dell servers, the “Chocolat”, “Snickers”, “Chimichanga”, and “Chalupa” machines, and the Winchester RAID.]

Page 13: Tier-1 Equipment (continued)

[Photo of Tier-1 equipment: “popcorn” nodes (MC production), “fry” nodes (user cluster), “gyoza” nodes (test).]

Page 14: Using the Tier-1 System: User System

Until the Grid becomes reality (maybe soon!), people who want to use computing facilities at Fermilab need to obtain an account. That requires registration as a Fermilab user (a DOE requirement). We will make sure that turn-around times are reasonably short; we have not heard complaints yet.

Go to http://computing.fnal.gov/cms/ and click on the "CMS Account" button, which will guide you through the process:
- Step 1: Get a valid Fermilab ID
- Step 2: Get an fnalu account and a CMS account
- Step 3: Get a Kerberos principal and a krypto card
- Step 4: Information for first-time CMS account users

See also http://consult.cern.ch/writeup/form01/

We have > 100 users, currently growing by about 1 new user per week.

Page 15: US CMS User Cluster

[Diagram: user nodes FRY1-FRY8 on a 100 Mbps network, connected through the BIGMAC GigaBit switch to a 250 GB RAID over SCSI-160.]

R&D on a “reliable i/a service”: OS: Mosix? Batch system: FBSNG? Storage: disk farm?

To be released June 2002! For nTuple and Objectivity analysis, etc.

Page 16: User Access to Tier-1 Data

Hosting of Jets/MET data; muons will be coming soon.

[Diagram: users on the network access > 10 TB of objects through an AMD server (the AMD/Enstore interface) with a 1 TB IDE RAID (“Snickers”), backed by the Enstore STKEN silo.]

Working on providing a powerful disk cache. A host redirection protocol allows adding more servers, giving scaling plus load balancing.

Page 17: US CMS T2 Prototypes and Test-beds

Tier-1 and Tier-2 prototypes and test-beds are operational.

Page 18: California Prototype Tier-2 Setup

[Photos of the prototype Tier-2 installations at UCSD and Caltech.]

Page 19: Benefits of US Tier-2 Centers

Bring computing resources close to user communities:
- Provide dedicated resources to regions (of interest and geographical)
- More control over localized resources, more opportunities to pursue physics goals

Leverage additional resources that exist at the universities and labs:
- Reduce the computing requirements on CERN (supposed to account for 1/3 of total LHC facilities!)
- Help meet the LHC Computing Challenge

Provide a diverse collection of sites, equipment, and expertise for development and testing. Provide much-needed computing resources.

US CMS plans for about 2 FTE at each Tier-2 site plus equipment funding, supplemented with Grid, university, and lab funds (BTW: no I/S costs in the US CMS plan).

Problem: how do you run a center with only two people that will have much greater processing power than CERN has currently? This drives facilities and operations R&D to reduce the operations personnel required to run the center, e.g. investigating cluster management software.

Page 20: U.S. Tier-1/2 System Operational

CMS Grid integration and deployment on the U.S. CMS test bed. Data challenges and production runs on Tier-1/2 prototype systems.

“Spring Production 2002” is finishing:
- Physics, trigger, and detector studies
- Produce 10M events and 15 TB of data
- Also 10M minimum-bias events, fully simulated including pile-up, fully reconstructed
- Large assignment to U.S. CMS

Successful production in 2001:
- 8.4M events fully simulated, including pile-up; 50% in the U.S.
- 29 TB of data processed; 13 TB in the U.S.

Page 21: US CMS Prototypes and Test-beds: Grid Projects

All U.S. CMS S&C institutions are involved in DOE and NSF Grid projects:
- Integrating Grid software into CMS systems
- Bringing CMS production onto the Grid
- Understanding the operational issues

CMS directly profits from Grid funding, and the deliverables of the Grid projects become useful for the LHC in the “real world”. Major successes: MOP, GDMP.

Page 22: Grid-enabled CMS Production

Successful collaboration with the Grid projects!
- MOP (Fermilab, U. Wisconsin/Condor): remote job execution with Condor-G and DAGMan
- GDMP (Fermilab, European DataGrid WP2): file replication and replica catalog (Globus)

Successfully used on the CMS testbed. The first real CMS production use is finishing now!
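MOP’s key ingredient is DAGMan-style ordering: a child job runs only after all of its parents finish. A toy sketch of that idea in Python (the job names and the use of the standard-library `graphlib` are illustrative; MOP itself drives Condor-G, not this code):

```python
# Toy model of DAGMan-style parent/child ordering: each job becomes
# runnable only after all of its parents have completed. Job names here
# are illustrative, not MOP's actual job structure.
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# "simulate" must finish before "replicate" (the GDMP-style transfer),
# which must finish before "publish".
dag = {"replicate": {"simulate"}, "publish": {"replicate"}}

order = list(TopologicalSorter(dag).static_order())
print(order)  # ['simulate', 'replicate', 'publish']
```

In the real system the same dependency declaration lives in a DAGMan input file, and each node is a Condor-G job running at a remote Grid site.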

Page 23: Recent Successes with the Grid

Grid-enabled CMS production environment. NB: MOP = “Grid-ified” IMPALA, a vertically integrated CMS application.

Brings together US CMS with all three US Grid projects:
- PPDG: Grid developers (Condor, DAGMan), GDMP (with WP2)
- GriPhyN: VDT, and in the future also the virtual data catalog
- iVDGL: pT2 sites and the US CMS testbed

CMS Spring 2002 production assignment of 200k events to MOP; half-way through, with transfer back to CERN next week.

This is being considered a major success, for US CMS and the Grids! Many bugs in Condor and Globus were found and fixed. Many operational issues needed, and still need, to be sorted out. MOP will be moved into the production Tier-1/Tier-2 environment.

Page 24: Successes: Grid-enabled Production

A major milestone for US CMS and PPDG. From the PPDG internal review of MOP:

“From the Grid perspective, MOP has been outstanding. It has both legitimized the idea of using Grid tools such as DAGMAN, Condor-G, GDMP, and Globus in a real production environment outside of prototypes and trade show demonstrations. Furthermore, it has motivated the use of Grid tools such as DAGMAN, Condor-G, GDMP, and Globus in novel environments leading to the discovery of many bugs which would otherwise have prevented these tools from being taken seriously in a real production environment.

From the CMS perspective, MOP won early respect for taking on real production problems, and is soon ready to deliver real events. In fact, today or early next week we will update the RefDB at CERN which tracks production at various regional centers. This has been delayed because of the numerous bugs that, while being tracked down, involved several cycles of development and redeployment. The end of the current CMS production cycle is in three weeks, and MOP will be able to demonstrate some grid enabled production capability by then. We are confident that this will happen. It is not necessary at this stage to have a perfect MOP system for CMS Production; IMPALA also has some failover capability and we will use that where possible. However, it has been a very useful exercise and we believe that we are among the first team to tackle Globus and Condor-G in such a stringent and HEP specific environment.”

Page 25: Successes: File Transfers

In 2001 we were observing typical rates for large data transfers of, e.g., CERN to FNAL at 4.7 GB/hour. After network tuning, and using Grid tools (Globus url-copy), we gained a factor of 10!

Today we are transferring 1.5 TB of simulated data from UCSD to FNAL at rates of 10 MByte/second! That almost saturates the network interfaces out of Fermilab (155 Mbps) and at UCSD (Fast Ethernet). The ability to transfer a terabyte in a day is crucial for the Tier-1/Tier-2 system.

Many operational issues remain to be solved:
- GDMP is a Grid tool for file replication, developed jointly between the US and EU; it is the “show case” application for EU DataGrid WP2 (data replication)
- It needs more work and strong support from the VDT team (PPDG, GriPhyN, iVDGL), e.g. the CMS “GDMP heartbeat” for debugging new installations and monitoring old ones
- Installation and configuration issues, e.g. releases of underlying software like Globus
- Issues with site security, e.g. firewalls
- GDMP uses the Globus Security Infrastructure, which demands a “VO” Certification Authority infrastructure for CMS
- Etc. pp.

All this needs to be developed, tested, and deployed, and shows that the US CMS testbed is invaluable!
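The quoted rates are easy to cross-check with back-of-envelope arithmetic; a minimal sketch (decimal units and helper names are my own):

```python
# Back-of-envelope check of the transfer rates quoted on this slide.
def gb_per_hour_to_mb_s(gb_per_hour):
    """Convert GB/hour to MB/s, decimal units (1 GB = 1000 MB)."""
    return gb_per_hour * 1000.0 / 3600.0

baseline = gb_per_hour_to_mb_s(4.7)      # the 2001 CERN-FNAL rate
print(f"baseline: {baseline:.1f} MB/s")  # prints "baseline: 1.3 MB/s"

# At the tuned 10 MB/s, one day of continuous transfer moves:
tb_per_day = 10.0 * 86400 / 1e6          # 86400 s/day, 1e6 MB/TB
print(f"{tb_per_day:.2f} TB/day")        # prints "0.86 TB/day": a terabyte/day

# The 155 Mbps link out of Fermilab tops out near 155/8 ~ 19 MB/s,
# so a sustained 10 MB/s indeed uses a large fraction of it.
```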

Page 26: DOE/NSF Grid R&D Funding for CMS

DOE/NSF Grid R&D funding for CMS, by fiscal year (figures as presented on the slide):

  GriPhyN  Total, incl. CS and all experiments:  2001: 2543  2002: 2543  2003: 2543  2004: 2241
           CMS staff:                            2001: 582   2002: 582   2003: 582   2004: 582

  iVDGL    Total, incl. CS and all experiments:  2002: 2650  2003: 2750  2004: 2750  2005: 2750  2006: 2750
           CMS equipment:                        2002: 232   2003: 192   2004: 187   2005: 57    2006: 65
           CMS staff:                            2002: 234   2003: 336   2004: 358   2005: 390   2006: 390

  PPDG     Total, incl. CS and all experiments:  2001: 3180  2002: 3180  2003: 3180
           Caltech 187, FNAL 132, UCSD 80, total CMS 399 (each year, 2001-2003)

Page 27: Farm Setup

Almost any computer can run the CMKIN and CMSIM steps, using the CMS binary distribution system (US CMS DAR). As long as ample storage is available, the problem scales well.

This step is “almost trivially” put on the Grid — almost…

Page 28

[Diagram: TeraGrid/DTF (NCSA, SDSC, Caltech, Argonne; www.teragrid.org): site resources NCSA/PACI 8 TF, 240 TB and SDSC 4.1 TF, 225 TB, with HPSS/UniTree mass storage and external networks]

e.g. on the 13.6 TF, $53M TeraGrid?

Page 29

Farm Setup for Reconstruction

• The first step of the reconstruction is Hit Formatting, where simulated data is taken from the Fortran files, formatted, and entered into the Objectivity database.

• The process is sufficiently fast and involves enough data that more than 10-20 jobs will bog down the database server.
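The 10-20 job ceiling can be enforced with a simple semaphore around job launches, so the farm never opens more database connections than the server tolerates. A minimal sketch, not the actual production scripts; the cap value and function names are illustrative:

```python
import threading

# Cap concurrent hit-formatting jobs so the database server is not
# overwhelmed (the 10-20 job ceiling from operations experience).
MAX_DB_JOBS = 15
_db_slots = threading.Semaphore(MAX_DB_JOBS)

def run_hit_formatting(job, execute):
    """Run one job, but only while a database slot is free."""
    with _db_slots:
        return execute(job)

def format_all(jobs, execute):
    """Launch all jobs on threads; at most MAX_DB_JOBS touch the DB at once."""
    results = [None] * len(jobs)

    def worker(i, job):
        results[i] = run_hit_formatting(job, execute)

    threads = [threading.Thread(target=worker, args=(i, j))
               for i, j in enumerate(jobs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

The farm can then queue an arbitrary number of jobs while the semaphore keeps the database load bounded.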

Page 30

[Event display: all charged tracks with pT > 2 GeV; reconstructed tracks with pT > 25 GeV (+30 minimum bias events)]

This makes a CPU-limited task (event simulation) VERY I/O intensive!

Pile-up simulation!

Unique at the LHC due to high luminosity and short bunch-crossing time

Up to 200 “minimum bias” events overlayed on interesting triggers

Leads to “pile-up” in the detectors — needs to be simulated!
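The overlay step above is conceptually simple, which is also why it is so I/O heavy: every signal event pulls in on the order of 200 pre-simulated minimum-bias events. A minimal sketch of the idea, with events modelled as plain lists of hits (all names here are illustrative, not CMS code):

```python
import random

def overlay_pileup(signal_event, minbias_pool, n_pileup, rng=None):
    """Combine one signal event with n_pileup minimum-bias events drawn
    (with replacement) from a pre-simulated pool. Events are modelled
    as plain lists of hits for illustration."""
    rng = rng or random.Random()
    merged = list(signal_event)
    for _ in range(n_pileup):
        # Each draw corresponds to reading one stored minimum-bias
        # event, which is what makes the real task I/O intensive.
        merged.extend(rng.choice(minbias_pool))
    return merged
```

At full luminosity n_pileup is of order 200, so the bytes read per signal event dwarf the signal event itself.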

Page 31

Farm Setup for Pile-up Digitization
The most advanced production step is digitization with pile-up

The response of the detector is digitized, the physics objects are reconstructed and stored persistently, and at full luminosity 200 minimum-bias events are combined with the signal events

Due to the large number of minimum-bias events, multiple Objectivity AMS data servers are needed. Several configurations have been tried.
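One simple way to spread the minimum-bias load over several AMS hosts is a round-robin assignment of jobs to servers. This is a hypothetical sketch of that idea, not the configuration actually used in production:

```python
import itertools

def assign_pileup_servers(jobs, servers):
    """Spread digitization jobs over the available pile-up data servers
    round-robin, so no single AMS host serves every job."""
    cycle = itertools.cycle(servers)
    return {job: next(cycle) for job in jobs}
```

With N servers, each host then serves roughly 1/N of the jobs, which is the point of deploying multiple AMS servers in the first place.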

Page 32

Objy Server Deployment: Complex

4 production federations at FNAL (uses catalog only to locate database files)

3 FNAL servers plus several worker nodes used in this configuration

3 federation hosts with attached RAID partitions, 2 lock servers, 4 journal servers, 9 pileup servers

Page 33

Example of CMS Physics Studies
Resolution studies for jet reconstruction

Full detector simulation essential to understand jet resolutions

Indispensable to design realistic triggers and understand rates at high lumi

QCD 2-jet events with FSR

Full simulation w/ tracks, HCAL noise

QCD 2-jet events, no FSR

No pile-up, no track reconstruction, no HCAL noise

Page 34

Pile-up & Jet Energy Resolution
Jet energy resolution

Pile-up contributions to jets are large and have large variations
Can be estimated event-by-event from the total energy in the event
Large improvement if the pile-up correction is applied (red curve), e.g. 50% → 35% at ET = 40 GeV

Physics studies depend on full detailed detector simulation — realistic pile-up processing is essential!
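The event-by-event correction described above amounts to estimating the pile-up energy density from the total (mostly pile-up) energy in the event, scaling it to the jet cone, and subtracting. A hedged sketch of that arithmetic; the area numbers are illustrative placeholders, not CMS calibration constants:

```python
def corrected_jet_et(raw_jet_et, total_event_et, jet_area=0.28, total_area=28.0):
    """Subtract the estimated pile-up contribution inside the jet cone.

    The pile-up ET density is taken from the total event ET divided by
    the instrumented area, then scaled by the jet cone area.
    jet_area and total_area are illustrative placeholder values."""
    pileup_density = total_event_et / total_area  # ET per unit area
    return raw_jet_et - pileup_density * jet_area
```

With placeholder areas in the ratio 1:100, a 40 GeV jet in an event with 100 GeV of total pile-up ET would be corrected down by 1 GeV; the real gain quoted on the slide is a resolution improvement from 50% to 35% at ET = 40 GeV.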


Page 35

Tutorial at UCSD

Very successful 4-day tutorial with ~40 people attending
Covering use of CMS software, including CMKIN/CMSIM, ORCA, OSCAR, IGUANA
Covering physics code examples from all PRS groups
Covering production tools and environment and Grid tools

Opportunity to get people together
UF and CAS engineers with PRS physicists
Grid developers and CMS users

The tutorials have been very well thought through — very useful for self-study, so they will be maintained

It is amazing what we already can do with CMS software
E.g. impressive to see the IGUANA visualization environment, including “home made” visualizations

However, our system is (too? still too?) complex

We maybe need more people taking a day off and going through the self-guided tutorials

Page 36

Allocation for FY2002 (sites: Caltech, UCSD, FNAL, NEU, Princeton, UC Davis, U Florida):

Core Applications Software (CAS) FTE: 2.0, 2.0, 3.0, 1.0, 1.0 (total 9.0)
User Facilities (UF) FTE: 1.0, 1.0, 5.5, 1.5 (total 9.0)
Total FTE: 3.0, 1.0, 7.5, 4.0, 1.0, 1.0, 1.5 (total 18.0)

Costs [AY$ x 1000]:
CAS personnel (salary, PC, travel, ...): 310.0, 270.0, 620.0, 150.0, 185.0 (total 1,535.0)
UF personnel (salary, PC, travel, ...): 155.0, 155.0, 795.0, 232.0 (total 1,337.0)
UF Tier-1 equipment: 140.0
UF Tier-2 equipment: 120.0, 112.0 (total 232.0)
Project office, management reserve: 390.0, 274.0 (total 664.0)

Total cost: 585.0, 155.0, 1,595.0, 894.0, 150.0, 185.0, 344.0 (total 3,908.0)

FY2002 UF Funding
Excellent initial effort and DOE support for User Facilities

Fermilab established as Tier-1 prototype and major Grid node for LHC computing
Tier-2 sites and testbeds are operational and are contributing to production and R&D
Headstart for U.S. efforts has pushed CERN commitment to support remote sites

The FY2002 funding has given major headaches to PM
DOE funding of $2.24M was insufficient to ramp the Tier-1 to baseline size
The NSF contribution is unknown as of today

According to plan we should have more people and equipment at the Fermilab T1
Need some 7 additional FTEs and more equipment funding
This has been strongly endorsed by the baseline reviews
All European RCs (DE, FR, IT, UK, even RU!) have support at this level of effort

Page 37

Plans For 2002-2003

Finish Spring Production challenge until June
User cluster, user federations
Upgrade of facilities ($300k)

Develop CMS Grid environment toward LCG Production Grid
Move CMS Grid environment from testbed to facilities
Prepare for first LCG-USUF milestone, November?
Tier-2, iVDGL milestones w/ ATLAS, SC2002
LCG-USUF Production Grid milestone in May 2003

Bring Tier-1/Tier-2 prototypes up to scale
Serving user community: user cluster, federations, Grid-enabled user environment
UF studies with persistency framework
Start of physics DCs and computing DCs

CAS: LCG “everything is on the table, but the table is not empty”
Persistency framework: prototype in September 2002, release in July 2003
DDD and OSCAR/Geant4 releases
New strategy for visualization / IGUANA
Develop distributed analysis environment w/ Caltech et al.

Page 38

Funding for UF R&D Phase
There is lack of funding and lack of guidance for 2003-2005

NSF proposal guidance AND DOE guidance are (S&C + M&O)
New DOE guidance for S&C + M&O is much below the S&C baseline + M&O request

Fermilab USCMS projects oversight has proposed minimal M&O for 2003-2004 and large cuts for S&C given the new DOE guidance

The NSF has “ventilated the idea” of applying a rule of 81/250 × DOE funding
This would lead to very serious problems in every year of the project
We would lack 1/3 of the requested funding ($14.0M/$21.2M)
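The 81/250 rule ties the NSF contribution to 32.4% of whatever DOE provides. Applied to the DOE-FNAL funding profile shown on the next slide, it reproduces the low NSF guidance numbers there; a small arithmetic check (profile values taken from the slide, in $M):

```python
# DOE-FNAL funding profile, FY2002-2007, in $M (from the shortfall slide).
DOE_PROFILE = [2.2, 2.9, 3.5, 9.0, 12.8, 12.8]

def nsf_guidance(doe_funding):
    """NSF share under the proposed rule: 81/250 of the DOE funding."""
    return [round(81 / 250 * x, 2) for x in doe_funding]
```

The first three years come out near 0.71, 0.94 and 1.13 $M, far below the roughly $1.45M/year the FY2003 allocation plan assigns to the NSF.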

Page 39

DOE/NSF funding shortfall

[Charts: project costs (11/2001) vs. funding guidance, FY2002-2007, in million AY$.
NSF side: guidance using the 81/250 rule gives 0.73, 0.94, 1.13 against project costs of 2.92, 4.13, 4.13 in the corresponding years; cost categories are CAS NSF, UF Tier-2 labor, UF Tier-2 h/w, Project Office NSF, Mgmt Reserve NSF, NSF (assumed).
DOE side: the DOE-FNAL funding profile (4/2002) gives 2.2, 2.9, 3.5 against project costs (11/2001) of 9.0, 12.8, 12.8 in the peak years; cost categories are CAS DOE, UF Tier-1 labor, UF Tier-1 h/w, Project Office DOE, Mgmt Reserve DOE.]

Page 40

FY2003 Allocation à la Nov 2001

[Chart: FY2003 budget allocation (11/2001) across software labor, facilities labor, facilities equipment, project office and management reserve, split among DOE, NSF and iVDGL; scale $0-3.0M]

Total costs $5.98M: DOE $4.53M, NSF $1.45M

Page 41

Europeans Achieved Major UF Funding

Funding for European User Facilities in their countries now looks significantly larger than UF funding in the U.S.

This statement is true relative to the size of their respective communities

It is in some cases even true in absolute terms!!

Given our funding situation: are we going to be a partner for those efforts?
BTW: US ATLAS proposes major cuts in the UF/Tier-1 “pilot flame” at BNL

Page 42

Forschungszentrum Karlsruhe, Technik und Umwelt

Regional Data and Computing Centre Germany (RDCCG)

RDCCG evolution (available capacity); 30% rolling upgrade each year after 2007

[Table: CPU (kSI95), disk (TByte) and tape (TByte) capacity for LHC and non-LHC experiments at 11/2001, 4/2002, 4/2003, 4/2004, 2005, 2007 and 2007+]

FTE evolution 2002-2005: support 5 → 30; development 8 → 10; new office building to accommodate 130 FTE in 2005

Networking evolution 2002-2005:
1) RDCCG to CERN/Fermilab/SLAC (permanent point-to-point): 1 Gbit/s → 10 Gbit/s; 2 Gbit/s could be arranged on a very short timescale
2) RDCCG to general Internet: 34 Mbit/s → 100 Mbit/s; current situation, generally less affordable than 1)

How About The Others: DE

Page 43

How About The Others: IT

TIER1 Resources — HARDWARE

Year    Farm (SI2000)    Disk (TB)    Tape (TB)
2001    60,000           10           10
2002    200,000          80           50
2003    900,000          120          300
2004    1,550,000        192          600
2005    3,100,000        380          2,000
2006    4,000,000        480          3,000

Tier-2 will have almost the same amount of CPU & disks

TIER1 Resources — PERSONNEL

Type                                  N.   New  Outsource
Manager                               1
Deputy                                1
LHC experiments software              2
Programs, tools, procedures           2    2
FARM management & planning            2    2
ODB & data management                 2    1
Network (LAN+WAN)                     2    2
Other services (Web, security, etc.)  2    1
Administration                        2    1
System managers & operators           6         6
Total                                 22   9    6

Tier-2 personnel is of the same order of magnitude

Page 44

How About The Others: RU

Russian Tier2-Cluster

Cluster of institutional computing centers with Tier-2 functionality and summary resources at the 50-70% level of the canonical Tier-1 center for each experiment (ALICE, ATLAS, CMS, LHCb): analysis; simulations; users data support.

Participating institutes:
Moscow: SINP MSU, ITEP, KI, …
Moscow region: JINR, IHEP, INR RAS
St. Petersburg: PNPI RAS, …
Novosibirsk: BINP SB RAS

Coherent use of distributed resources by means of DataGrid technologies.

Active participation in the LCG Phase 1 prototyping and data challenges (at the 5-10% level).

                               2002 Q1   2002 Q4       2004 Q4   2007
CPU (kSI95)                    5         10            25-35     410
Disk (TB)                      6         12            50-70     850
Tape (TB)                      10        20            50-100    1250
Network, Mbps (LCG/commodity)  2-3/10    20-30/15-30   155/…     Gbps/…

FTE: 10-12 (2002 Q1), 12-15 (2002 Q2), 25-30 (2004 Q4)

Page 45

How About The Others: UK

John Gordon - LCG 13th March 2002 - n° 5

UK Tier1/A Status

Hardware purchase for delivery today:
156 dual 1.4 GHz, 1 GB RAM, 30 GB disks (312 CPUs)
26 disk servers (dual 1.266 GHz), 1.9 TB disk each
Expand the capacity of the tape robot by 35 TB

Current EDG TB setup:
14 dual 1 GHz PIII, 500 MB RAM, 40 GB disks
Compute Element (CE), Storage Element (SE), User Interfaces (UI), Information Node (IN) + Worker Nodes (WN)

Plus central facilities (non-Grid): 250 CPUs, 10 TB disk, 35 TB tape (capacity 330 TB)

John Gordon - LCG 13th March 2002 - n° 8

Projected Staff Effort [SY] (Area / GridPP / @CERN / CS):

WP1 Workload Management: 0.5 [IC], 2.0 [IC]
WP2 Data Management: 1.5++ [Ggo], 1.0 [Oxf]
WP3 Monitoring Services: 5.0++ [RAL, QMW], 1.0 [HW]
Security: ++ [RAL], 1.0 [Oxf]
WP4 Fabric Management: 1.5 [Edin., L'pool]
WP5 Mass Storage: 3.5++ [RAL, L'pool]
WP6 Integration Testbed: 5.0++ [RAL/M'cr/IC/Bristol]
WP7 Network Services: 2.0 [UCL/M'cr], 1.0 [UCL]
WP8 Applications: 17.0
  ATLAS/LHCb (Gaudi/Athena): 6.5 [Oxf, Cam, RHUL, B'ham, RAL]
  CMS: 3.0 [IC, Bristol, Brunel]
  CDF/D0 (SAM): 4.0 [IC, Ggo, Oxf, Lanc]
  BaBar: 2.5 [IC, M'cr, Bristol]
  UKQCD: 1.0 [Edin.]
Tier1/A: 13.0 [RAL]
Total: 49.0++ plus 10.0 → 25.0 plus 6.0 = 80++

Page 46

How About The Others: FR

LCG workshop - 13 March 2002 - Denis LINGLIN

CC-IN2P3: ONE computing centre for IN2P3-CNRS & DSM-CEA (HEP, astroparticle, NP, ...)

National: 18 laboratories, 40 experiments, 2500 people/users
International: Tier-1 / Tier-A status for several US, CERN and astroparticle experiments

~700 CPUs, 20 kSI-95; 40 TB disk; 0.5 PBytes databases, hierarchical storage
Network & QoS; custom services "à la carte"
45 people

Budget: ~6-7 M€/year, plus ~2 M€ for personnel

Page 47

Compared to European efforts the US CMS UF efforts are very small
In FY2002 the US CMS Tier-1 is sized at 4 kSI CPU and 5 TB storage
The Tier-1 effort is 5.5 FTE; in addition there are 2 FTE CAS and 1 FTE Grid

S&C baseline 2003/2004: the Tier-1 effort needs to be at least $1M/year above FY2002 to sustain the UF R&D and become a full part of the LHC Physics Research Grid
Need some 7 additional FTEs and more equipment funding at the Tier-1
Part of this effort would go directly into user support
Essential areas are insufficiently covered now and need to be addressed in 2003 at the latest:
Fabric management • storage resource mgmt • networking • system configuration management • collaborative tools • interfacing to Grid i/s • system management & operations support

This has been strongly endorsed by the S&C baseline review Nov 2001
All European RCs (DE, FR, IT, UK, even RU!) have support at this level of effort

FY2002 - FY2004 Are Critical in the US

Page 48

The U.S. User Facilities Will Seriously Fall Back
Behind European Tier-1 Efforts
Given The Funding Situation!

To Keep US Leadership and
Not Put US-based Science at a Disadvantage,

Additional Funding Is Required:

at least $1M/year at Tier-1 Sites

Page 49

LHC Computing Grid Project
$36M project 2002-2004, half equipment, half personnel:

“Successful” RRB
Expect to ramp to >30 FTE in 2002, and ~60 FTE in 2004
About $2M/year equipment
e.g. the UK delivers 26.5% of LCG funding AT CERN ($9.6M)
US CMS has requested $11.7M IN THE US, plus CAS $5.89M
Current allocation (assuming CAS, iVDGL) would be $7.1M IN THE US

Largest personnel fraction in LCG Applications Area
“All” personnel to be at CERN

“People staying at CERN for less than 6 months are counted at a 50% level, regardless of their experience.”

CCS will work on LCG AA projects; US CMS will contribute to LCG

This brings up several issues that US CMS S&C should deal with
Europeans have decided to strongly support the LCG Application Area
But at the same time we do not see more support for the CCS efforts

CMS and US CMS will have to do, at some level, a rough accounting of LCG AA vs CAS and LCG facilities vs US UF

Page 50

Impact of LHC Delay
Funding shortages in FY2001 and FY2002 have already led to significant delays

Others have done more — we are seriously understaffed and do not do enough now
We lack 7 FTEs already this year, and will be able to start hiring only in FY2003
This has led to delays and will further delay our efforts

Long-term:
We do not know; predictions of equipment costs are too uncertain to evaluate possible cost savings due to delays by roughly a year
However, schedules become more realistic

Medium term:
Major facilities (LCG) milestones shift by about 6 months
1st LCG prototype grid moved to end of 2002 --> more realistic now
End of R&D moves from end 2004 to mid 2005
Detailed schedule and work plan expected from LCG project and CMS CCS (June)

No significant overall cost savings for the R&D phase
We are already significantly delayed, and not even at half the effort of what other countries are doing (UK, IT, DE, RU!!)
Catching up on our delayed schedule is feasible if we can manage to hire 7 people in FY2003 and to support this level of effort in FY2004
Major issue with lack of equipment funding

Re-evaluation of equipment deployment will be done during 2002 (PASTA)

Page 51

US S&C Minimal Requirements

The DOE funding guidance for the preparation of the US LHC research program approaches adequate funding levels around when the LHC starts in 2007, but is heavily back-loaded and does not accommodate the base-lined software and computing project and the needs for pre-operations of the detector in 2002-2005.

We take up the charge to better understand the minimum requirements, and to consider non-standard scenarios for reducing some of the funding short falls, but ask the funding agencies to explore all available avenues to raise the funding level.

The LHC computing model of a worldwide distributed system is new and needs significant R&D. The experiments are approaching this with a series of "data challenges" that will test the developing systems and will eventually yield a system that works.

US CMS S&C has to be part of the data challenges (DC) and to provide support for trigger and detector studies (UF subproject) and to deliver engineering support for CMS core software (CAS subproject).

Page 52

UF Needs

The UF subproject is centered on a Tier-1 facility at Fermilab, which will be driving the US CMS participation in these data challenges.

The prototype Tier-2 centers will become integrated parts of the US CMS Tier-1/Tier-2 facilities.

Fermilab will be a physics analysis center for CMS. LHC physics with CMS will be an important component of Fermilab's research program. Therefore Fermilab needs to play a strong role as a Tier-1 center in the upcoming CMS and LHC data challenges.

The minimal Tier-1 effort would require at least doubling the current Tier-1 FTEs at Fermilab, and granting at least $300k of yearly funding for equipment. This level represents the critical threshold.

The yearly costs for this minimally sized Tier-1 center at Fermilab would approach $2M after an initial $1.6M in FY03 (hiring delays). The minimal Tier-2 prototypes would need $400k support for operations, the rest would come out of iVDGL funds.

Page 53

CAS Needs

Ramping down the CAS effort is not an option, as we would face very adverse effects on CMS. CCS manpower is now even more needed to be able to drive and profit from the new LCG project - there is no reason to believe that the LCG will provide a “CMS-ready” solution without CCS being heavily involved in the process. We can even less allow for slips or delays.

Possible savings with the new close collaboration between CMS and ATLAS through the LCG project will potentially give some contingency to the engineering effort that is to date missing in the project plan. That contingency (which would first have to be earned) could not be released before end of 2005.

The yearly costs of keeping the current level for CAS are about $1.7M per year (DOE $1000k, NSF $700k), including escalation and reserve.

Page 54

Minimal US CMS S&C until 2005

Definition of minimal: if we can't afford even this, the US will not participate in the CMS Data Challenges and LCG Milestones in 2002-2004

For US CMS S&C the minimal funding for the R&D phase (until 2005) would include (PRELIMINARY):
Tier-1: $1600k in FY03 and $2000k in the following years
Tier-2: $400k per year from the NSF to sustain the support for Tier-2 manpower
CAS: $1M from DOE and $700k from the NSF
Project Office: $300k (includes reserve)

A failure to provide this level of funding would lead to severe delays and inefficiencies in the US LHC physics program. Considering the large investments in the detectors and the large yearly costs of the research program, such an approach would be neither cost-efficient nor productive.

The ramp-up of the UF to the final system, beyond 2005, will need to be aligned with the plans of CERN and other regional centers. After 2005 the funding profile seems to approach the demand.

Page 55

Where do we stand?Where do we stand?Setup of an efficient and competent s/w engineering support for CMSSetup of an efficient and competent s/w engineering support for CMS

David is happy and CCS is doing wellDavid is happy and CCS is doing well ““proposal-driven” support for det/PRS engineering supportproposal-driven” support for det/PRS engineering support

Setup of an User Support organization out of UF (and CAS) staffSetup of an User Support organization out of UF (and CAS) staff PRS is happy (but needs more)PRS is happy (but needs more) ““proposal driven” provision of resources: data servers, user clusterproposal driven” provision of resources: data servers, user cluster Staff to provide data sets and nTuples for PRS, small specialized productionStaff to provide data sets and nTuples for PRS, small specialized production Accounts, software releases, distribution, help desk etc pp Accounts, software releases, distribution, help desk etc pp Tutorials done at Tier-1 and Tier-2 sitesTutorials done at Tier-1 and Tier-2 sites

Implemented & commissioned a first Tier-1/Tier-2 system of RCs:
- UCSD, Caltech, U.Florida, U.Wisconsin, Fermilab
- shown that Grid tools can be used in "production"

and greatly contribute to the success of Grid projects and middleware. Validated use of the network between Tier-1 and Tier-2: 1 TB/day!

Developing a production-quality Grid-enabled User Facility: an "impressive organization" for running production in the US

Team at Fermilab and individual efforts at Tier-2 centers; Grid technology helps to reduce the effort

Close collaboration with the Grid projects infuses additional effort into US CMS:
- collaboration between sites (including ATLAS, e.g. BNL) on facility issues
- starting to address many of the "real" issues of Grids for physics

- code/binary distribution and configuration, remote job execution, data replication
- authentication, authorization, accounting and VO services
- remote database access for analysis

Page 56

What have we achieved?

We are participating in and driving a world-wide CMS production DC

We are driving a large part of the US Grid integration and deployment work, which goes beyond the LHC and even HEP

We have shown that the Tier-1/Tier-2 User Facility system in the US can work! We definitely are on the map for LHC computing and the LCG

We are also threatened with being starved over the next years: the funding agencies have failed to recognize the opportunity for continued US leadership in this field

— as others, like the UK, are realizing and supporting! We are thrown back to a minimal funding level, and even that has been challenged

But this is the time when our partners at CERN will expect to see us deliver and work with the LCG

Page 57

Conclusions

The US CMS S&C Project looks technically pretty sound. Our customers (CCS and US CMS users) appear to be happy, but want more. We also need more R&D to build the system, and we need to do more to measure up to our partners.

We started in 1998 with some supplemental funds; we are a DOE line item now

We have received less than requested for a couple of years now, but in FY2002 the project has become bitterly under-funded — cf. the reviewed and endorsed baseline

The funding agencies have defaulted on providing funding for US S&C and on providing FA guidance for US User Facilities

The ball is in our (US CMS) court now:
- it is not an option to do "just a little bit" of S&C
- the S&C R&D is a project: baseline plans, funding profiles, change control

It is up to US CMS to decide. I ask you to support my request to build up the User Facilities in the US.

Page 58

THE END

Page 59

UF Equipment Costs

Detailed Information on Tier-1 Facility Costing

See Document in Your Handouts!

All numbers in FY2002 $k

Fiscal Year              2002  2003   2004   2005   2006   2007   Total  2008(Ops)
1.1 T1 Regional Center      0     0      0  2,866  2,984  2,938   8,788      2,647
1.2 System Support         29    23      0     35      0      0      87         15
1.3 O&M                     0     0      0      0      0      0       0          0
1.4 T2 Regional Centers   232   240    870  1,870  1,500  1,750   6,462      1,250
1.5 T1 Networking          61    54     42    512    462    528   1,658        485
1.6 Computing R&D         511   472    492      0      0      0   1,476          0
1.7 Det. Con. Support      84    53     52      0      0      0     189          0
1.8 Local Comp. Supp.      12    95    128     23     52     23     333         48
Total                     929   938  1,584  5,306  4,998  5,239  18,992      4,446
Total T1 only             697   698    714  3,436  3,498  3,489  12,530      3,196
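As a quick sanity check on the table above, a short script (illustrative, not part of the original handout) can re-add the line items and compare against the stated totals. The figures are rounded to the nearest $k, so column sums may differ from the printed totals by a unit or two:

```python
# Re-add the Tier-1 facility cost line items (FY2002 $k) and compare
# against the stated column totals; allow a small rounding tolerance.
items = {
    "1.1 T1 Regional Center":  [0, 0, 0, 2866, 2984, 2938],
    "1.2 System Support":      [29, 23, 0, 35, 0, 0],
    "1.3 O&M":                 [0, 0, 0, 0, 0, 0],
    "1.4 T2 Regional Centers": [232, 240, 870, 1870, 1500, 1750],
    "1.5 T1 Networking":       [61, 54, 42, 512, 462, 528],
    "1.6 Computing R&D":       [511, 472, 492, 0, 0, 0],
    "1.7 Det. Con. Support":   [84, 53, 52, 0, 0, 0],
    "1.8 Local Comp. Supp.":   [12, 95, 128, 23, 52, 23],
}
stated_totals = [929, 938, 1584, 5306, 4998, 5239]

computed = [sum(row[y] for row in items.values()) for y in range(6)]
for fy, got, want in zip(range(2002, 2008), computed, stated_totals):
    assert abs(got - want) <= 2, (fy, got, want)

# "Total T1 only" is the total minus the Tier-2 line (1.4).
t2 = items["1.4 T2 Regional Centers"]
t1_only = [t - x for t, x in zip(stated_totals, t2)]
print(t1_only)  # matches the stated 697, 698, 714, 3436, 3498, 3489
```

The check confirms the table is internally consistent to within rounding.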

Page 60

Total Project Costs

In AY$M

Fiscal Year           2002  2003  2004   2005   2006   2007
Project Office        0.32  0.48  0.49   0.51   0.52   0.54
  DOE                 0.15  0.31  0.32   0.33   0.34   0.35
  NSF                 0.17  0.17  0.18   0.18   0.19   0.19
Software Personnel    1.49  1.72  2.14   2.25   2.36   2.48
  DOE                 0.87  0.91  0.96   1.01   1.06   1.11
  NSF                 0.62  0.81  1.18   1.24   1.30   1.37
UF Personnel          1.14  2.26  3.00   5.26   6.99   7.89
  for Tier-1 (DOE)    0.83  1.97  2.48   4.33   5.42   6.28
  for Tier-2 (NSF)    0.31  0.29  0.52   0.93   1.57   1.62
UF Equipment          0.45  0.75  1.51   5.35   5.19   5.52
  for Tier-1 (DOE)    0.45  0.70  0.71   3.44   3.50   3.49
  for Tier-2 (NSF)    0.00  0.05  0.80   1.91   1.69   2.03
Management Reserve    0.34  0.77  0.71   1.34   1.51   1.96
  DOE                 0.23  0.64  0.45   0.91   1.03   1.44
  NSF                 0.11  0.13  0.27   0.43   0.47   0.52
Total Costs           3.73  5.98  7.86  14.71  16.57  18.39
  Total DOE           2.53  4.53  4.91  10.02  11.35  12.67
  Total NSF           1.20  1.45  2.94   4.69   5.22   5.73
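The DOE and NSF splits should re-add to the stated yearly totals. A small illustrative check (entries are rounded to $0.01M, so a one-cent slack per column is allowed):

```python
# Verify that the DOE + NSF split reproduces the stated totals (AY$M),
# within one-cent rounding per column.
doe   = [2.53, 4.53, 4.91, 10.02, 11.35, 12.67]
nsf   = [1.20, 1.45, 2.94,  4.69,  5.22,  5.73]
total = [3.73, 5.98, 7.86, 14.71, 16.57, 18.39]

for fy, d, n, t in zip(range(2002, 2008), doe, nsf, total):
    assert abs((d + n) - t) <= 0.015, (fy, d + n, t)
```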

Page 61

U.S. CMS Tier-1 RC Installed Capacity

Fiscal Year             2002   2003   2004   2005    2006    2007
Simulation CPU (SI95)  2,000  3,000  4,000  7,200  28,800  72,000
Analysis CPU (SI95)      750  2,100  4,000  8,000  32,000  80,000
Server CPU (SI95)         50    140    270  1,500   6,000  15,000
Disk (TB)                 16     31     46     65     260     650

Total resources U.S. CMS (Tier-1 and all Tier-2): 310,000 SI95 CPU, 1,400 TB disk

[Timeline from the slide: R&D Systems -> 5% Data Challenge -> Prototype Systems -> 20% Data Challenge -> Fully Functional Facilities]

310 kSI95 today is ~10,000 PCs
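The slide's rule of thumb (310 kSI95 corresponds to roughly 10,000 PCs) implies about 31 SI95 per 2002-era PC. As an illustration, the same conversion applied to the FY2007 Tier-1 CPU rows gives a feel for the installed system size:

```python
# Rule of thumb from the slide: 310 kSI95 ~ 10,000 PCs,
# i.e. about 31 SI95 per (2002-era) PC.
SI95_PER_PC = 310_000 / 10_000  # 31.0

# FY2007 Tier-1 CPU: simulation + analysis + servers, in SI95.
fy2007_t1_si95 = 72_000 + 80_000 + 15_000
print(round(fy2007_t1_si95 / SI95_PER_PC))  # ~5,387 PC-equivalents
```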

Page 62

Alternative Scenarios

Q: revise the plans so as not to have CMS and ATLAS with identical scope?
- never been tried in HEP: always competitive experiments
- the UF model is NOT to run a computer center, but to have an experiment-driven effort to get the physics environment in place
- S&C is engineering support for the physics project; outsourcing of engineering to a non-experiment-driven (common) project would mean a complete revision of the physics activities. This would require fundamental changes to experiment management and structure that are not in the purview of the US part of the collaboration
- specifically, the data challenges are not only and primarily done for the S&C project, but are going to be conducted as a coherent effort of the physics, detector AND S&C groups, with the goal to advance the physics, detector AND S&C efforts
- the DCs are why we are here. If we cannot participate, there would be no point in going for an experiment-driven UF

Page 63

Alternative Scenarios

Q: are Tier-2 resources spread too thin?
- the Tier-2 efforts should be as broad as we can afford; we are including university (non-funded) groups, like Princeton
- if the role of the Tier-2 centers were just to provide computing resources, we would not distribute them but concentrate on the Tier-1 center. Instead, the model is to put some resources at the prototype T2 centers, which allows us to pull in additional resources at these sites. This model seems to be rather successful.
- iVDGL funds are being used for much of the effort at the prototype T2 centers. Hardware investments at the Tier-2 sites up to now have been small. The project planned to fund 1.5 FTE at each site (this funding is not yet there). In CMS we see several additional FTE at those sites, coming out of the base program and being attracted from the CS et al. communities through the involvement in Grid projects

Page 64

Alternative ScenariosAlternative Scenarios

Q: additional software development activities to be combined?Q: additional software development activities to be combined?

- this will certainly happen. Concretely we already started to plan the this will certainly happen. Concretely we already started to plan the first large-scale ATLAS-CMS common software project, the new first large-scale ATLAS-CMS common software project, the new persistency framework. Do we expect significant savings in the persistency framework. Do we expect significant savings in the manpower efforts? These could be in the order of some 20-30%, if manpower efforts? These could be in the order of some 20-30%, if these efforts could be closely managed. However, the these efforts could be closely managed. However, the management is not in US hands, but in the purview of the LCG management is not in US hands, but in the purview of the LCG project. Also, the very project is ADDITIONAL effort that was not project. Also, the very project is ADDITIONAL effort that was not necessary when Objectivity was meant to provide the persistency necessary when Objectivity was meant to provide the persistency solution. solution.

- generally we do not expect very significant changes in the generally we do not expect very significant changes in the estimates for the total engineering manpower required to complete estimates for the total engineering manpower required to complete the core software efforts, the possible savings would give a the core software efforts, the possible savings would give a minimal contingency to the engineering effort that is to date minimal contingency to the engineering effort that is to date missing in the project plan. -> to be earned first, then released in missing in the project plan. -> to be earned first, then released in 20052005

Page 65

Alternative ScenariosAlternative Scenarios

Q: are we loosing, are there real cost benefits?Q: are we loosing, are there real cost benefits?- any experiment that does not have a kernel of people to run the data any experiment that does not have a kernel of people to run the data

challenges will significantly loosechallenges will significantly loose- The commodity is people, not equipmentThe commodity is people, not equipment- Sharing of resources is possible (and will happen), but we need to keep Sharing of resources is possible (and will happen), but we need to keep

minimal R&D equipment. $300k/year for each T1 is very little funding for minimal R&D equipment. $300k/year for each T1 is very little funding for doing that. Below that we should just go home…doing that. Below that we should just go home…

- Tier2: the mission of the Tier2 centers is to enable universities to be part of Tier2: the mission of the Tier2 centers is to enable universities to be part of the LHC research program. That function will be cut in as much as the the LHC research program. That function will be cut in as much as the funding for it will be cut.funding for it will be cut.

- To separate the running of the facilities form the experiment’s effort: This is a To separate the running of the facilities form the experiment’s effort: This is a model that we are developing for our interactions with Fermilab CD -- this is model that we are developing for our interactions with Fermilab CD -- this is the ramping to “35 FTE” in 2007, not the “13 FTE” now; the ramping to “35 FTE” in 2007, not the “13 FTE” now; some services already now are being “effort-reported” to CD-CMS. We have some services already now are being “effort-reported” to CD-CMS. We have to get the structures in place to get this right - there will be overheads to get the structures in place to get this right - there will be overheads involvedinvolved

- I do not see real cost benefits in any of these for the R&D phase. I prefer not I do not see real cost benefits in any of these for the R&D phase. I prefer not to discuss the model for 2007 now, but we should stay open minded. to discuss the model for 2007 now, but we should stay open minded. However, if we want to approach unconventional scenarios we need to However, if we want to approach unconventional scenarios we need to carefully prepare for them. That may start in 2003-2004?carefully prepare for them. That may start in 2003-2004?

Page 66

UF + PM

Control room logbook

Code dist, dar, role for grid

T2 work

CCS schedule?

More comments on the job

Nucleation point vs T1 user community

new hires

Tony's assignment, prod running

Disk tests, benchmarking, common work w/ BNL and iVDGL facility grp

Monitoring, ngop, ganglia, iosif's stuff

Mention challenges to test bed/MOP: config, certificates, installations, and help we get from grid projects: VO, ESNET CA, VDT

UF workplan