Matchmaking in glideinWMS in CMS

glideinWMS for users: Matchmaking in glideinWMS in CMS, by Igor Sfiligoi (UCSD). CERN, Dec 2012.

DESCRIPTION

This document provides a high-level overview of how glideinWMS-based instances do matchmaking in CMS (a High Energy Physics experiment). The information is accurate as of early Dec 2012.

TRANSCRIPT

Page 1: Matchmaking in glideinWMS in CMS

CERN, Dec 2012 glideinWMS matchmaking 1

glideinWMS for users

Matchmaking in glideinWMS in CMS

by Igor Sfiligoi (UCSD)

Page 2: Matchmaking in glideinWMS in CMS

Scope of this talk

This talk provides a high-level description of how glideinWMS matchmaking works in CMS.

The reader is expected to be familiar with the CMS experiment environment: http://cms.web.cern.ch/

Page 3: Matchmaking in glideinWMS in CMS

glideinWMS architecture

● A reminder

[Architecture diagram: a Central manager (Negotiator), Submit nodes (Schedd) and Execute nodes (Condor) running on the Grid, together with the G.F. and the VO FE]

Page 4: Matchmaking in glideinWMS in CMS

Two levels of matchmaking

● First in the VO Frontend
  ● To decide where to provision resources
    – i.e. where to send glideins
● Then in the HTCondor Negotiator
  ● To decide which Job gets the glidein Slot

The two must have compatible policies

Page 5: Matchmaking in glideinWMS in CMS

Defining the policy

● The VO FE configures the glideins
  ● So it can define the Slot Requirements
● The preferred strategy is to leave all policy decisions in the VO FE's hands, i.e. both
  ● VO FE matchmaking policy
  ● HTCondor matchmaking policy
  ● Easier to keep them in sync this way
● This implies
  ● Users should not define Job Requirements
  ● Instead, publish attributes describing requirements

http://www.slideshare.net/igor_sfiligoi/condor-week-12-attribute-matchmaking-move-req-out-of-user-hands

Page 6: Matchmaking in glideinWMS in CMS

CMS Production @ CERN Policies

Page 7: Matchmaking in glideinWMS in CMS

Description

● The VO FE @ CERN serves the production needs
  ● i.e. Reconstruction and MC production
● Job submission is regulated by a service managed by a dedicated team, so jobs are
  ● Targeted
  ● Well behaved (at least by and large)

Page 8: Matchmaking in glideinWMS in CMS

Matchmaking policy

● Two dimensions
  ● Grid Site
  ● Single CPU vs HTPC
● The actual policy is the AND of both
  ● Both VO FE policy and HTCondor policy defined in the VO FE instance configuration

Page 9: Matchmaking in glideinWMS in CMS

Matching on Grid site name

● User Jobs are expected to publish the attribute DESIRED_Sites (a string list)
  ● e.g. +DESIRED_Sites = "T2_DE_DESY,T2_US_UCSD"
● The G.F. and the glideins advertise GLIDEIN_CMSSite
● The matchmaking policy is GLIDEIN_CMSSite ∈ DESIRED_Sites
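
The membership test on this slide can be sketched in Python. This is only an illustration of the logic, not the actual VO FE configuration: the helper names `split_list` and `site_matches` are hypothetical, the site name T2_XX_Example is made up, and DESIRED_Sites is assumed to be a comma-separated string list as in the example above.

```python
def split_list(value):
    """Split a comma-separated string list into a set of trimmed names."""
    return {item.strip() for item in value.split(",") if item.strip()}


def site_matches(glidein_cmssite, desired_sites):
    """Return True if GLIDEIN_CMSSite is a member of DESIRED_Sites."""
    return glidein_cmssite in split_list(desired_sites)


desired = "T2_DE_DESY,T2_US_UCSD"
print(site_matches("T2_US_UCSD", desired))     # True: the site is in the list
print(site_matches("T2_XX_Example", desired))  # False: made-up site, not listed
```

HTCondor's ClassAd language provides the stringListMember function for this kind of test; whether the CMS configuration uses it is not stated in the slides.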

Page 10: Matchmaking in glideinWMS in CMS

Matching on Job Type

● User Jobs can publish the attribute DESIRES_HTPC (integer representation of a Boolean value)
  ● e.g. +DESIRES_HTPC = 1
  ● If not defined, defaults to 0
● The G.F. and the glideins may advertise GLIDEIN_Is_HTPC (Boolean value)
  ● If not defined, defaults to False
● The matchmaking policy is (GLIDEIN_Is_HTPC==True)==(DESIRES_HTPC==1)
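
A minimal Python sketch of the job type policy, including the two defaults, may help to see which combinations match. Modelling the glidein and job ads as plain dictionaries and the function name `job_type_matches` are assumptions made for illustration only.

```python
def job_type_matches(glidein_ad, job_ad):
    """Evaluate (GLIDEIN_Is_HTPC == True) == (DESIRES_HTPC == 1)."""
    # GLIDEIN_Is_HTPC is a Boolean; it defaults to False when not advertised.
    glidein_is_htpc = bool(glidein_ad.get("GLIDEIN_Is_HTPC", False))
    # DESIRES_HTPC is an integer (0 or 1); it defaults to 0 when not published.
    desires_htpc = job_ad.get("DESIRES_HTPC", 0)
    return glidein_is_htpc == (desires_htpc == 1)


# Single CPU job on a regular glidein: both sides use their defaults.
print(job_type_matches({}, {}))                         # True
# Single CPU job on an HTPC glidein: the two sides disagree.
print(job_type_matches({"GLIDEIN_Is_HTPC": True}, {}))  # False
```

Because the two sides are compared for equality, HTPC jobs only land on HTPC glideins and single CPU jobs only on regular ones.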

Page 11: Matchmaking in glideinWMS in CMS

Example submit file

Universe = vanilla
Executable = mcgen
Arguments = -k 1543.3
Output = mcgen.out
Error = mcgen.err
Log = mcgen.log
+DESIRED_Sites = "T2_DE_DESY,T2_US_UCSD"
+DESIRES_HTPC = 0
Requirements = True
Queue 1

Page 12: Matchmaking in glideinWMS in CMS

CMS AnaOps @ UCSD Policies

Page 13: Matchmaking in glideinWMS in CMS

Description

● VO FE @ UCSD serves CMS analysis users
  ● User Jobs are much more chaotic
● Most users don't really understand their needs
  ● Must protect from accidental errors
  ● Yet keep the system flexible
● Net result
  ● More complex policy

Page 14: Matchmaking in glideinWMS in CMS

Two different policies

● The AnaOps FE actually has two policies
  ● The Regular policy
  ● The Overflow policy
● The Regular policy tries to match resources
  ● Based on User desires
● The Overflow policy “outsmarts” the Users
  ● Will violate User desires without breaking the Jobs
  ● The aim is to finish user jobs sooner
  ● Users can opt out if they wish

Page 15: Matchmaking in glideinWMS in CMS

The Regular M.M. policy

● Four+one dimensions
  ● Grid Site
  ● Single CPU vs HTPC
  ● Memory usage
  ● Job duration
  ● Number of Job Starts (due to preemption)
● The actual policy is the AND of all of them
  ● Both VO FE policy and HTCondor policy defined in the VO FE instance configuration

Page 16: Matchmaking in glideinWMS in CMS

Grid site selection

● This is both similar to and different from the Production FE @ CERN
  ● Serves the same purpose, but supports three different ways to select a site
    – Due to historical evolution
● The three options are
  ● GLIDEIN_CMSSite ∈ DESIRED_Sites
  ● GLIDEIN_SEs ∈ DESIRED_SEs
  ● GLIDEIN_Gatekeeper ∈ DESIRED_Gatekeepers
● The actual policy is the OR of the three

Planning to extend the SE option to (GLIDEIN_SEs ∩ DESIRED_SEs) ≠ ∅
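
The OR of the three site selection options can be sketched as follows. This is a hedged illustration, not the AnaOps FE configuration: ads are modelled as dictionaries, the names `SITE_OPTIONS` and `anaops_site_matches` are hypothetical, and an option is assumed not to match when either attribute is undefined.

```python
def split_list(value):
    """Split a comma-separated string list into a set of trimmed names."""
    return {item.strip() for item in value.split(",") if item.strip()}


# (glidein attribute, job attribute) pairs for the three options on the slide.
SITE_OPTIONS = [
    ("GLIDEIN_CMSSite", "DESIRED_Sites"),
    ("GLIDEIN_SEs", "DESIRED_SEs"),
    ("GLIDEIN_Gatekeeper", "DESIRED_Gatekeepers"),
]


def anaops_site_matches(glidein_ad, job_ad):
    """Return True if at least one of the three membership tests succeeds."""
    for glidein_attr, job_attr in SITE_OPTIONS:
        glidein_value = glidein_ad.get(glidein_attr)
        desired = job_ad.get(job_attr)
        if glidein_value is None or desired is None:
            continue  # assumption: an undefined attribute cannot satisfy the option
        if glidein_value in split_list(desired):
            return True
    return False


job = {"DESIRED_SEs": "dc2-grid-64.brunel.ac.uk,stormfe1.pi.infn.it"}
print(anaops_site_matches({"GLIDEIN_SEs": "stormfe1.pi.infn.it"}, job))  # True
print(anaops_site_matches({"GLIDEIN_CMSSite": "T2_US_UCSD"}, job))       # False
```

A job therefore only needs to publish one of the three DESIRED_ attributes to be matched.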

Page 17: Matchmaking in glideinWMS in CMS

Job type selection

● Just like @ CERN

Page 18: Matchmaking in glideinWMS in CMS

Memory Usage

● Most Grid sites put strict limits on the amount of memory that can be used
  ● Will kill glideins if they exceed the limit
● G.F. and glideins advertise the Entry-specific limit GLIDEIN_MaxMemMBs
● Jobs can explicitly declare the needed memory with request_memory (a native Condor attribute, no + needed)
  ● Condor will also measure it at run time
    – ImageSize – Virtual memory used
    – ResidentSetSize – True memory usage
  ● A combination of these is used to calculate the actual JobMemory
● Policy: JobMemory <= GLIDEIN_MaxMemMBs
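
The slide does not say how the declared and measured values are combined into JobMemory, so the sketch below picks one plausible rule purely for illustration: take the larger of request_memory and the measured ResidentSetSize. That rule, the `default_mb` fallback and the function names are all assumptions. ResidentSetSize is treated as KiB, which is how HTCondor reports it, and request_memory as MB.

```python
def job_memory_mb(job_ad, default_mb=1024):
    """Estimate JobMemory in MB (hypothetical rule: max of declared and measured)."""
    candidates = []
    if "request_memory" in job_ad:
        candidates.append(job_ad["request_memory"])          # declared, in MB
    if "ResidentSetSize" in job_ad:
        candidates.append(job_ad["ResidentSetSize"] / 1024)  # measured, KiB to MB
    # default_mb is a made-up fallback for jobs with no information at all.
    return max(candidates) if candidates else default_mb


def memory_fits(glidein_ad, job_ad):
    """Policy from the slide: JobMemory <= GLIDEIN_MaxMemMBs."""
    return job_memory_mb(job_ad) <= glidein_ad["GLIDEIN_MaxMemMBs"]


glidein = {"GLIDEIN_MaxMemMBs": 2000}
print(memory_fits(glidein, {"request_memory": 1500}))  # True
# A job that was measured using 2.5 GB no longer fits, whatever it declared.
print(memory_fits(glidein, {"request_memory": 1500, "ResidentSetSize": 2560 * 1024}))  # False
```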

Page 19: Matchmaking in glideinWMS in CMS

Job Duration 1/2

● Glideins have a limited lifetime
  ● Must fit within the limits of the Grid site's queue
  ● Glideins publish the deadline GLIDEIN_ToDie
    – Jobs must finish before reaching the deadline
● Final user job lifetime is unpredictable
  ● Depends on the type of computing done
  ● User should indicate the expected job lifetime
    – Else we have to assume reasonable defaults
    – Not many users set this value(s) right now

Page 20: Matchmaking in glideinWMS in CMS

Job Duration 2/2

● The same type of computation may take different amounts of time
  ● e.g. based on the type of input
● Jobs can declare two attributes
  ● NormMaxWallTimeMins – Expected limit
  ● MaxWallTimeMins – Absolute max limit
● The matchmaking logic is
  ● Use NormMaxWallTimeMins for the first job startup
  ● Use MaxWallTimeMins for all others
    – Based on the simple assumption that the job was killed for hitting the deadline
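
The two-step duration logic can be sketched like this. The sketch assumes that GLIDEIN_ToDie is a Unix timestamp in seconds and that the number of previous starts is read from the job's NumJobStarts attribute; the function names and the dictionary representation of the ads are hypothetical.

```python
def wall_time_limit_mins(job_ad):
    """Use the expected limit on the first start, the absolute limit afterwards."""
    if job_ad.get("NumJobStarts", 0) == 0:
        return job_ad["NormMaxWallTimeMins"]
    # The job has run before: assume it was killed for hitting the deadline.
    return job_ad["MaxWallTimeMins"]


def fits_before_deadline(glidein_ad, job_ad, now):
    """Return True if the job is expected to finish before GLIDEIN_ToDie."""
    remaining_secs = glidein_ad["GLIDEIN_ToDie"] - now
    return wall_time_limit_mins(job_ad) * 60 <= remaining_secs


now = 1_000_000                                  # arbitrary current time, in seconds
glidein = {"GLIDEIN_ToDie": now + 8000 * 60}     # the glidein has 8000 minutes left
job = {"NormMaxWallTimeMins": 7200, "MaxWallTimeMins": 14400}

print(fits_before_deadline(glidein, job, now))                         # True
print(fits_before_deadline(glidein, {**job, "NumJobStarts": 1}, now))  # False
```

After a first failed attempt the job is only matched to glideins with enough lifetime left for the absolute limit.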

Page 21: Matchmaking in glideinWMS in CMS

Cut on number of re-starts

● Not really a user-configurable property
  ● More of an emergency brake
● In a properly configured system, it should never be triggered
  ● But unexpected problems happen
  ● So better to limit the damage

Page 22: Matchmaking in glideinWMS in CMS

The Overflow Use case

● User Jobs specify a list of sites, because the data they need is there
● With recent versions of CMSSW, jobs can access the data remotely
  ● With a small performance penalty
● We can thus schedule jobs “anywhere”
  ● As long as the needed data is at a Site that has joined the xrootd federation
  ● But only if no CPU is available “close to the data”
    – And not too far, either

http://indico.cern.ch/contributionDisplay.py?contribId=381&sessionId=5&confId=149557
http://indico.cern.ch/contributionDisplay.py?contribId=232&sessionId=8&confId=149557

Page 23: Matchmaking in glideinWMS in CMS

The Overflow M.M. policy

● Violate only the “Site selection” rule
  ● Keep all the others
● Plus, add one+one more:
  ● An opt-out mechanism
  ● Delayed matching

Page 24: Matchmaking in glideinWMS in CMS

New Site M.M. policy

● The user-specified attribute is used to flag the job as “Overflowable”
  ● i.e. the job will match if and only if (DESIRED_<site>s ∩ SUPPORTED_<site>s) ≠ ∅
  ● All 3 types of site identification are still supported
● Matching jobs can then run on any glidein
  ● Additional limits can be put in place by the FE, but mostly invisible to the user
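
The intersection test can be sketched as below. The slide only gives the template DESIRED_<site>s ∩ SUPPORTED_<site>s, so expanding it to SUPPORTED_Sites, SUPPORTED_SEs and SUPPORTED_Gatekeepers, holding those lists in a `frontend_cfg` dictionary, and the function name `is_overflowable` are all assumptions. The host names under example.org are made up.

```python
def split_list(value):
    """Split a comma-separated string list into a set of trimmed names."""
    return {item.strip() for item in value.split(",") if item.strip()}


# The three types of site identification used by the AnaOps FE.
SITE_KINDS = ("Sites", "SEs", "Gatekeepers")


def is_overflowable(job_ad, frontend_cfg):
    """Return True if any DESIRED_* list overlaps the matching SUPPORTED_* list."""
    for kind in SITE_KINDS:
        desired = split_list(job_ad.get("DESIRED_" + kind, ""))
        supported = split_list(frontend_cfg.get("SUPPORTED_" + kind, ""))
        if desired & supported:  # non-empty set intersection
            return True
    return False


cfg = {"SUPPORTED_SEs": "se1.example.org,se2.example.org"}
print(is_overflowable({"DESIRED_SEs": "se2.example.org,se9.example.org"}, cfg))  # True
print(is_overflowable({"DESIRED_SEs": "se9.example.org"}, cfg))                  # False
```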

Page 25: Matchmaking in glideinWMS in CMS

The opt-out mechanism

● The Overflow policy considers all jobs by default
  ● But Users may want to opt out some of the Jobs
    – Sometimes it is just a need (to get deterministic results, e.g. for testing a site)
● To opt out, the user defines +CMS_ALLOW_OVERFLOW = False
● The FE will not consider such jobs for Overflowing

Page 26: Matchmaking in glideinWMS in CMS

Delayed matching

● As said initially, Jobs should preferentially run close to the data
  ● Overflow should only consider jobs “that cannot find resources close to the data”
● We implemented it based on time
  ● Jobs are matched only if waiting in the queue for more than 6 hours
  ● Users cannot influence it
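
The opt-out flag and the 6 hour delay can be combined in a short sketch. It assumes that the time spent in the queue is measured from the job's QDate attribute (the submission time as a Unix timestamp) and that CMS_ALLOW_OVERFLOW defaults to True when not defined; the slides do not say how the FE actually computes the waiting time, and the function name is hypothetical.

```python
OVERFLOW_DELAY_SECS = 6 * 3600  # the 6 hour threshold from the slide


def eligible_for_overflow(job_ad, now):
    """Return True if the job may be considered by the Overflow policy."""
    # Jobs are considered by default; users opt out with CMS_ALLOW_OVERFLOW = False.
    if not job_ad.get("CMS_ALLOW_OVERFLOW", True):
        return False
    # Assumption: waiting time is the time elapsed since submission (QDate).
    return now - job_ad["QDate"] > OVERFLOW_DELAY_SECS


now = 1_000_000
print(eligible_for_overflow({"QDate": now - 7 * 3600}, now))  # True: waited 7 hours
print(eligible_for_overflow({"QDate": now - 1 * 3600}, now))  # False: waited 1 hour
print(eligible_for_overflow({"QDate": now - 7 * 3600, "CMS_ALLOW_OVERFLOW": False}, now))  # False
```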

Page 27: Matchmaking in glideinWMS in CMS

Example submit file

Universe = vanilla
Executable = myana
Arguments = -k 1543.3
Output = myana.out
Error = myana.err
Log = myana.log
request_memory = 1500
+DESIRED_SEs = "dc2-grid-64.brunel.ac.uk,stormfe1.pi.infn.it"
+NormMaxWallTimeMins = 7200
+MaxWallTimeMins = 14400
+DESIRES_HTPC = 0
+CMS_ALLOW_OVERFLOW = True
Requirements = True
Queue 1

Page 28: Matchmaking in glideinWMS in CMS

The End

Page 29: Matchmaking in glideinWMS in CMS

Pointers

● glideinWMS Home Page: http://tinyurl.com/glideinWMS

● HTCondor Home Page: http://research.cs.wisc.edu/htcondor/

● HTCondor [email protected]@cs.wisc.edu

● glideinWMS [email protected]

Page 30: Matchmaking in glideinWMS in CMS

Acknowledgments

● The creation of this document was sponsored by grants from the US NSF and US DOE, and by the University of California system