Bookkeeping Tutorial

Upload: wilbur

Post on 25-Feb-2016



TRANSCRIPT

Page 1: Bookkeeping Tutorial

Bookkeeping Tutorial

Page 2: Bookkeeping content

- Contains records of all “jobs” and all “files” that are produced by production jobs
- Job:
  - In fact, technically a “step” in a workflow
    - E.g. “Gauss step”, “Brunel step”…
  - For real RAW data: the “job” is in fact a DAQ run
  - Has input files (except runs and Gauss)
  - Has output files
    - Note that files may not be kept (i.e. may have no replica)
    - All files are registered in order to keep the full history
  - Has metadata
    - Location, production number, application, CPUTime, etc.
- Files:
  - Always the output of a “job”
  - Files are defined by an LFN (Logical File Name)
  - Contain metadata
    - Number of events, size, event type, etc.
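The record structure above can be sketched as a small data model. This is only an illustration of the job/file relationship: the class names, field names and LFNs below are hypothetical and do not reflect the actual Bookkeeping schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)  # records are compared by identity, not by content
class FileRecord:
    """A registered file, identified by its LFN (hypothetical model)."""
    lfn: str
    produced_by: Optional["JobRecord"] = None   # a file is always the output of a "job"
    has_replica: bool = True                     # False when the file was not kept
    metadata: Dict[str, object] = field(default_factory=dict)


@dataclass(eq=False)
class JobRecord:
    """A "job": one workflow step, or a DAQ run for real RAW data."""
    application: str
    inputs: List[FileRecord] = field(default_factory=list)   # empty for runs and Gauss
    outputs: List[FileRecord] = field(default_factory=list)
    metadata: Dict[str, object] = field(default_factory=dict)

    def add_output(self, lfn: str, has_replica: bool = True, **metadata: object) -> FileRecord:
        """Register an output file, even if no replica of it is kept."""
        record = FileRecord(lfn, produced_by=self, has_replica=has_replica,
                            metadata=dict(metadata))
        self.outputs.append(record)
        return record


# Made-up example: a Gauss step with no input, followed by a Boole step.
gauss = JobRecord("Gauss", metadata={"production": 1234})
sim = gauss.add_output("/lhcb/example/00001234_1.sim", has_replica=False, events=500)
boole = JobRecord("Boole", inputs=[sim])
digi = boole.add_output("/lhcb/example/00001234_1.digi", events=500)
```

The intermediate file is registered with `has_replica=False`, so its place in the history is preserved even though the file itself is gone.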

Page 3: Bookkeeping purpose

- Provenance database
  - Contains the full history of productions
    - Traceability of datasets
- User dataset search
  - Select a list of files from selection criteria
    - Only files with a replica!
    - Generate a Gaudi configuration file
  - Also gives access to the job/file tree
    - E.g. to investigate the history of a file
- Production dataset search
  - Selects the dataset to be processed by production jobs
    - Ensures consistency of input files for a production
  - Uses the BK API directly to get the list of files
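Investigating the history of a file amounts to walking the job/file tree back to its origin, while a user dataset search only returns files that still have a replica. The sketch below shows both ideas with plain dictionaries. It does not use the BK API, and the tables, LFNs and function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical provenance table: output LFN -> (application, input LFNs).
PROVENANCE: Dict[str, Tuple[str, List[str]]] = {
    "/lhcb/example/evt.sim": ("Gauss", []),
    "/lhcb/example/evt.digi": ("Boole", ["/lhcb/example/evt.sim"]),
    "/lhcb/example/evt.dst": ("Brunel", ["/lhcb/example/evt.digi"]),
}

# Hypothetical replica flags: intermediate files are registered but were not kept.
HAS_REPLICA: Dict[str, bool] = {
    "/lhcb/example/evt.sim": False,
    "/lhcb/example/evt.digi": False,
    "/lhcb/example/evt.dst": True,
}


def history(lfn: str) -> List[str]:
    """Return the 'application -> LFN' steps that led to lfn, oldest first."""
    application, inputs = PROVENANCE[lfn]
    steps: List[str] = []
    for parent in inputs:
        steps.extend(history(parent))
    steps.append(f"{application} -> {lfn}")
    return steps


def selectable(lfns: List[str]) -> List[str]:
    """Keep only the files a user search would return: those with a replica."""
    return [lfn for lfn in lfns if HAS_REPLICA.get(lfn, False)]


dst_history = history("/lhcb/example/evt.dst")
user_dataset = selectable(list(PROVENANCE))
```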

Page 4: Bookkeeping partitioning

- Configuration name / version
  - Real data
    - <DAQ partition> / <activity>
  - Simulated data
    - “MC” / <activity>
      - <activity>: “2008” / “DC06” / …
- Conditions
  - Parameters of the initial data
    - All subsequently processed data inherit the “conditions”
  - Real data
    - DAQ conditions
      - Beam conditions, energy, magnetic field, detector conditions…
  - Simulated data
    - Simulation conditions
      - Beam energy, magnetic field, luminosity, generator settings…
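The inheritance rule above (conditions are attached to the initial data and inherited by everything processed from it) can be illustrated as follows. This is a minimal sketch: the classes, the conditions strings and the "LHCb" partition name are assumptions made for the example, not Bookkeeping identifiers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Partition:
    config_name: str      # <DAQ partition> for real data, "MC" for simulated data
    config_version: str   # <activity>, e.g. "2008" or "DC06"
    conditions: str       # DAQ or simulation conditions of the initial data

    @property
    def is_simulation(self) -> bool:
        return self.config_name == "MC"


@dataclass(frozen=True)
class Dataset:
    name: str
    partition: Optional[Partition] = None   # set only on the initial data
    parent: Optional["Dataset"] = None      # the dataset this one was processed from

    def effective_partition(self) -> Partition:
        """Walk up to the initial data and return the partition it defines."""
        node = self
        while node.partition is None:
            if node.parent is None:
                raise ValueError(f"{self.name}: no initial data with a partition")
            node = node.parent
        return node.partition


mc = Partition("MC", "DC06", "hypothetical-simulation-conditions")
sim = Dataset("SIM", partition=mc)
digi = Dataset("DIGI", parent=sim)
dst = Dataset("DST", parent=digi)

inherited = dst.effective_partition()
```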

Page 5: Processing pass

- Associated with a level of processing
  - Within a given partition (config name / version + conditions)
  - Corresponds to the whole processing workflow
    - Single workflow for a given processing pass
    - Compatible versions of applications
  - Specifies the processing pass of the input data when applicable
    - Sequence of processing
  - Re-processing creates branches

[Figure: processing-pass diagram. Labels, in order: Gauss, SIM, Boole, DIGI, Brunel, DST, DaVinci, ETC, Brunel, DST. Processing passes shown: SimReco, Stripping.]
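Because each processing pass names the pass of its input data, the passes form a tree, and re-processing the same input adds a sibling branch. The sketch below models this with a hypothetical `PassTree` class. The pass names "SimReco" and "Stripping" are taken from the slide, while "Stripping-v2" and the parent/child relationships are assumptions for illustration.

```python
from typing import Dict, List, Optional


class PassTree:
    """Hypothetical tree of processing passes within one partition."""

    def __init__(self) -> None:
        self._parent: Dict[str, Optional[str]] = {}
        self._children: Dict[Optional[str], List[str]] = {}

    def add(self, name: str, input_pass: Optional[str] = None) -> None:
        """Register a pass; input_pass is the pass of its input data, if any."""
        if name in self._parent:
            raise ValueError(f"processing pass {name!r} already exists")
        if input_pass is not None and input_pass not in self._parent:
            raise KeyError(f"unknown input pass {input_pass!r}")
        self._parent[name] = input_pass
        self._children.setdefault(input_pass, []).append(name)

    def sequence(self, name: str) -> List[str]:
        """Return the sequence of passes leading to name, first pass first."""
        chain: List[str] = []
        current: Optional[str] = name
        while current is not None:
            chain.append(current)
            current = self._parent[current]
        return chain[::-1]

    def branches(self, input_pass: str) -> List[str]:
        """Return all passes that were run on the output of input_pass."""
        return list(self._children.get(input_pass, []))


tree = PassTree()
tree.add("SimReco")
tree.add("Stripping", input_pass="SimReco")
tree.add("Stripping-v2", input_pass="SimReco")   # a re-processing: new branch
```

Here `tree.branches("SimReco")` lists both stripping passes, and `tree.sequence("Stripping-v2")` gives the chain that produced the re-processed data.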

Page 6: Other query parameters

- Event type
  - File property
  - Real data
    - 90000000: real data full stream
    - 90000001: real data express stream
    - Types to be defined for stripping streams
  - Simulated data
    - LHCb convention for decay tree
- File type
  - Data content / format
    - Format not yet used
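As an illustration of how event type and file type narrow a query, the sketch below filters a list of file records on both. Only the two real-data event types come from the slide. The file records, LFNs, the simulated event type value and the function names are made up.

```python
from typing import Dict, List

# The two real-data event types quoted above.
REAL_DATA_EVENT_TYPES: Dict[int, str] = {
    90000000: "real data full stream",
    90000001: "real data express stream",
}


def describe_event_type(event_type: int) -> str:
    """Name a real-data stream; other codes are simulated data or not yet defined."""
    return REAL_DATA_EVENT_TYPES.get(event_type, "not a known real-data stream")


def select(files: List[dict], event_type: int, file_type: str) -> List[str]:
    """Return the LFNs of the files matching both query parameters."""
    return [f["lfn"] for f in files
            if f["event_type"] == event_type and f["file_type"] == file_type]


# Hypothetical file records.
FILES = [
    {"lfn": "/lhcb/example/run1.raw", "event_type": 90000000, "file_type": "RAW"},
    {"lfn": "/lhcb/example/run1_express.raw", "event_type": 90000001, "file_type": "RAW"},
    {"lfn": "/lhcb/example/run1.dst", "event_type": 90000000, "file_type": "DST"},
    {"lfn": "/lhcb/example/mc.dst", "event_type": 12345678, "file_type": "DST"},
]

full_stream_dsts = select(FILES, 90000000, "DST")
```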

Page 7: Running the bookkeeping GUI

- Needs a valid Grid certificate
- Needs an X server
- lhcb-bkk
  - SetupProject Dirac
    - Sets up the environment
  - If needed: lhcb-proxy-init
    - Creates a proxy
  - dirac-bookkeeping-gui
- Individual commands can be issued from the prompt!

Page 8: The query tree

Page 9: More info

- Right click on
  - Conditions
  - Processing pass

Page 10: Event type and file type

Page 11: Dataset selection

Logical File Name

Page 12: Saving configuration (a.k.a. options) file

- Python configuration (default)
  - Still possible to create .opts (discouraged!)
  - .txt file for just a list of LFNs
- All files or selected files (if any)
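The simplest of these outputs, a .txt file holding just a list of LFNs, can be mimicked as below. This is a sketch of the "all files or selected files" behaviour, not the GUI's own code: the one-LFN-per-line layout, the function names and the LFNs are assumptions.

```python
import tempfile
from pathlib import Path
from typing import List, Optional, Set


def save_lfn_list(path: Path, lfns: List[str], selected: Optional[Set[str]] = None) -> List[str]:
    """Write one LFN per line: the selected files if any, otherwise all files."""
    chosen = [lfn for lfn in lfns if not selected or lfn in selected]
    path.write_text("".join(lfn + "\n" for lfn in chosen))
    return chosen


def load_lfn_list(path: Path) -> List[str]:
    """Read the list back, ignoring blank lines."""
    return [line.strip() for line in path.read_text().splitlines() if line.strip()]


# Hypothetical dataset.
ALL_LFNS = [
    "/lhcb/example/00001234_1.dst",
    "/lhcb/example/00001234_2.dst",
    "/lhcb/example/00001234_3.dst",
]

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "dataset.txt"
    written_selection = save_lfn_list(target, ALL_LFNS, selected={ALL_LFNS[1]})
    reloaded_selection = load_lfn_list(target)
    written_all = save_lfn_list(target, ALL_LFNS)
    reloaded_all = load_lfn_list(target)
```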

Page 13: Dealing with PFNs or XML catalogs

- Using ganga + DIRAC
  - Bookkeeping integrated in ganga:
    - dataset = browseBK()
  - LFN handling is then automatic…
- If you really need an XML catalog or PFNs, use genXMLCatalog
  - Ensures files are available on the specified site
  - Gets the PFN from the Storage Element
    - Not constructed “by hand”
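The two guarantees listed above (every file is available at the requested site, and each PFN is looked up rather than built by string manipulation) can be illustrated with a toy replica table. This is not how genXMLCatalog is implemented: the table, site names, PFNs and the `pfns_at_site` helper are all hypothetical.

```python
from typing import Dict, Iterable

# Hypothetical replica table: LFN -> {site: PFN as reported by the Storage Element}.
REPLICAS: Dict[str, Dict[str, str]] = {
    "/lhcb/example/a.dst": {
        "SITE-A": "srm://se-a.example.org/lhcb/example/a.dst",
        "SITE-B": "srm://se-b.example.org/lhcb/example/a.dst",
    },
    "/lhcb/example/b.dst": {
        "SITE-A": "srm://se-a.example.org/lhcb/example/b.dst",
    },
}


def pfns_at_site(lfns: Iterable[str], site: str) -> Dict[str, str]:
    """Return the PFN of every LFN at site, or fail if any file is not there."""
    wanted = list(lfns)
    missing = [lfn for lfn in wanted if site not in REPLICAS.get(lfn, {})]
    if missing:
        raise LookupError(f"not available at {site}: {', '.join(missing)}")
    # PFNs are looked up, never constructed "by hand" from the LFN.
    return {lfn: REPLICAS[lfn][site] for lfn in wanted}


dataset = ["/lhcb/example/a.dst", "/lhcb/example/b.dst"]
pfns = pfns_at_site(dataset, "SITE-A")
```

Asking for the same dataset at "SITE-B" raises `LookupError`, because one of the files has no replica there.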

Page 14: Dealing with XML catalog and PFNs

Page 15: DIRAC Monitoring web portal

Page 16: General information

- Entry point to the DIRAC web portal
  - http://dirac.cern.ch
- Web implementation of (almost) a full desktop application
  - Monitoring of productions / jobs
  - Accounting (jobs, data management)
  - Allows actions to be taken on jobs
- Authentication / authorisation is mandatory
  - Anonymous access gives minimal access
  - Get a certificate and load it in your browser
    - https://twiki.cern.ch/twiki/bin/view/LHCb/FAQ/Certificate
  - DIRAC authorisation through “DIRAC groups”
    - Default: lhcb_user
    - Other groups: lhcb_prod, dirac_admin…
    - Future: specific groups per physics group, PPG (for production authorisation)…
    - Capabilities depend on the group

Page 17: The DIRAC portal home page

[Screenshot callouts: Identity, DIRAC group, DIRAC instance, Menus]

Page 18: Job Monitoring

[Screenshot callouts: Selection, Monitoring info, Actions]

Page 19: Job Monitoring (cont’d)

- Selection
  - For group lhcb_user, you only see your own jobs
  - Can select by
    - Status
    - Site
    - Date
    - …
- Columns
  - Can tailor the columns to be displayed
  - Clicking toggles the sort order of the column
- Rows
  - Jobs are displayed in pages (default 25 rows; don’t exceed 100)
  - Can scroll through pages
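The behaviour described above (visibility by group, selection by status or site, sorting on a column, paging) can be sketched in a few functions. This illustrates the logic only, not the portal's code: the `Job` fields, the site name, the status values and the rule that other groups see all jobs are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional

DEFAULT_PAGE_SIZE = 25   # default number of rows per page
MAX_PAGE_SIZE = 100      # the slide advises not to exceed 100 rows


@dataclass(frozen=True)
class Job:
    job_id: int
    owner: str
    status: str
    site: str


def visible_jobs(jobs: List[Job], user: str, group: str) -> List[Job]:
    """lhcb_user members see only their own jobs (other groups: assumed to see all)."""
    if group == "lhcb_user":
        return [job for job in jobs if job.owner == user]
    return list(jobs)


def select(jobs: List[Job], status: Optional[str] = None, site: Optional[str] = None) -> List[Job]:
    """Apply the optional status and site selections."""
    return [job for job in jobs
            if (status is None or job.status == status)
            and (site is None or job.site == site)]


def page(jobs: List[Job], number: int, size: int = DEFAULT_PAGE_SIZE,
         sort_by: str = "job_id", descending: bool = False) -> List[Job]:
    """Sort on one column, then return the requested page (numbered from 1)."""
    if not 1 <= size <= MAX_PAGE_SIZE:
        raise ValueError(f"page size must be between 1 and {MAX_PAGE_SIZE}")
    if number < 1:
        raise ValueError("page numbers start at 1")
    ordered = sorted(jobs, key=lambda job: getattr(job, sort_by), reverse=descending)
    start = (number - 1) * size
    return ordered[start:start + size]


# Made-up job list: 60 jobs alternating between two owners.
JOBS = [Job(i, "alice" if i % 2 else "bob", "Done" if i % 3 else "Failed", "LCG.CERN.ch")
        for i in range(1, 61)]

mine = visible_jobs(JOBS, user="alice", group="lhcb_user")
first_page = page(mine, 1)
second_page = page(mine, 2)
my_failed = select(mine, status="Failed")
```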

Page 20: Logging info

Page 21: Output peeking

Page 22: Attributes

Page 23: Parameters