ATLAS Metadata Interface: Campaign Definition in AMI. S. Albrand, 23/02/2016.
ATLAS Metadata Interface
Campaign Definition in AMI
S.Albrand
05/05/23 ATLAS Metadata Interface 1
Story so far
• Requested last SW week.
• Details of proposed implementation circulated in January.
– Some examples received, but not all my questions answered.
• Implementation started less than 2 weeks ago.
"Campaigns are defined by :"
• "A short name (30 characters) : unique in the database
• A dataset project (or set of projects)
• A map which associates each member of a set of pairs of productionSteps and dataTypes to a set of AMI configuration tags
• A description (1000 chars)
• The dataset projects are either dataNN_* or mcNN_*. Thus datasets which do not belong to these groups cannot be part of production campaign (such as valid_*, user*, group*)"
• N.B. Nothing was said about streams
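The definition quoted above can be sketched as a small data structure. This is only an illustration: the class `Campaign`, its field names, and the regular expression are assumptions made here, not part of AMI, and "NN" is read as two digits. Uniqueness of the name in the database is not checked.

```python
import re
from dataclasses import dataclass, field

# Assumption: "dataNN_*" / "mcNN_*" means "data" or "mc", two digits, "_", anything.
PROJECT_RE = re.compile(r"^(data|mc)\d{2}_\w+$")
MAX_NAME = 30
MAX_DESCRIPTION = 1000


@dataclass
class Campaign:
    name: str
    projects: list
    # Maps (productionStep, dataType) -> set of AMI configuration tags.
    tags: dict = field(default_factory=dict)
    description: str = ""

    def validate(self):
        """Return a list of problems; an empty list means the definition is acceptable."""
        problems = []
        if not self.name or len(self.name) > MAX_NAME:
            problems.append("name must be 1-30 characters")
        if len(self.description) > MAX_DESCRIPTION:
            problems.append("description exceeds 1000 characters")
        for project in self.projects:
            if not PROJECT_RE.match(project):
                problems.append(f"project {project!r} is not dataNN_* or mcNN_*")
        return problems


ok = Campaign("mc11a", ["mc11_7TeV"], {("recon", "AOD"): {"r3043"}}, "This is an example")
bad = Campaign("x" * 31, ["valid1_7TeV"])
```

Here `ok.validate()` reports no problems, while `bad.validate()` reports both the over-long name and the project outside the dataNN_* / mcNN_* groups.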
Defines a data campaign. Requires two arguments:
• campaignName - a short name (30 alphanumeric characters, no spaces)
• projectName - a datasetProject name (a mistake on my part: needs a separate step)
If no other argument is given, a new empty campaign is created.
Optional arguments (for an MC campaign):
• pyDict - a Python dictionary in text format.
• (pyAMI only) campaignDictFile=filename (a file containing the dictionary as described above)
• StreamName (equivalent to physicsShort, usually omitted). I added this because at least one of the examples I was given had a stream wildcard.
• description - a long (1000 chars) description of the campaign.
AddCampaign
Not yet implemented
Examples:
AddCampaign campaignName=mc11a projectName=mc11_7TeV pyDict="{'MC11c': {'mc11_7TeV': {'*': {'recon': {'AOD': ['r3043', 'r3060', 'r3108', 'r3072', 'r3073', 'r3074', 'r3075', 'r3076', 'r3077', 'r3078', 'r3079', 'r3080', 'r3081', 'r3082', 'r3083', 'r3084', 'r3085', 'r3086', 'r3044', 'r3110', 'r3097', 'r3071', 'r3068', 'r3070', 'r3069', 'a145', 'a146'], 'ESD': ['r3043', 'r3060', 'r3108', 'r3072', 'r3073', 'r3074', 'r3075', 'r3076', 'r3077', 'r3078', 'r3079', 'r3080', 'r3081', 'r3082', 'r3083', 'r3084', 'r3085', 'r3086', 'r3044', 'r3110', 'r3097', 'r3071', 'r3068', 'r3070', 'r3069', 'a145', 'a146']}, 'merge': {'AOD': ['r2993', 'r3109', 'r3063']}, 'digit': {'RDO': ['d621', 'd622', 'd623', 'd619']}}}}}" description='This is an example'
Questions: Once the existing campaigns have been entered in AMI, will anyone need this?
If yes, do you need an "overwrite" function?
AddCampaign campaignName=mc11a_empty projectName=mc11_7TeV [description="a description"]
/* creates (reserves) an empty campaign */
Problems
• I received several examples of the pyDict format from different people, and the formats were all a bit different. I chose Borut's format as it looked "real".
• My error: I have made a simplification by mistake. At the moment one campaignName is associated with exactly one project and stream.
– Not too difficult to correct transparently, but I decided to ignore it for the moment so that I could have something to show today.
ListCampaign
ListCampaign -pyDict=true campaignName=solveig_test2
{'solveig_test2': {'mc11_7TeV': {'*': {'recon':{'AOD': ['a146', 'r3000', 'r1235', 'r2346'], 'ESD': ['r1234', 'r2345']}}}}}
• Or get it in standard AMI format.
Questions: Does the order of the tags matter? Who reads the dict format? Can I have a copy of the reading code?
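One possible way for a client to read the dict format is sketched below. It assumes the nesting seen in the examples is campaign, project, stream, prodStep, dataType, then a list of tags; that reading is inferred from the examples, and the function name `flatten_pydict` is made up here. Rows are collected in a set and sorted, so the order of tags in the text makes no difference to the result.

```python
import ast


def flatten_pydict(text):
    """Parse pyDict text into sorted (campaign, project, stream, prodStep, dataType, tag) rows."""
    nested = ast.literal_eval(text)  # evaluates Python literals only, no arbitrary code
    rows = set()
    for campaign, projects in nested.items():
        for project, streams in projects.items():
            for stream, steps in streams.items():
                for step, types in steps.items():
                    for data_type, tags in types.items():
                        for tag in tags:
                            rows.add((campaign, project, stream, step, data_type, tag))
    return sorted(rows)


# The ListCampaign output shown above, copied as text.
text = ("{'solveig_test2': {'mc11_7TeV': {'*': {'recon':{'AOD': "
        "['a146', 'r3000', 'r1235', 'r2346'], 'ESD': ['r1234', 'r2345']}}}}}")
rows = flatten_pydict(text)
for row in rows:
    print(row)
```

If the order of the tags does turn out to matter, the set would have to be replaced by a list that keeps the original order.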
Other functions for filling a campaign
• Already available:
– AddProdStepGroup: adds a (prodStep, dataType) couple, and optionally a tagSet, to a campaign. Rejects illegal values.
– AddTagSet: adds a tagSet to a (prodStep, dataType) couple of a campaign. Rejects undeclared tags.
The other ones described in the specification will follow. They are "Updates" and "Removes". I will of course correct the treatment of projectTags.
Are you sure you really want streams? (Data Prep uses the same super tag for all streams.)
A few remarks & questions
• I suppose that there will be a phase of building up a definition with fairly frequent updates?
– Borut said "No notion of "closed" campaigns"
• How do clients want to be informed of changes in a campaign definition? What do they do with the information?
– Presume that if DDM is using regex to identify datasets as part of a campaign, then they can generate them themselves from a pyDict?
• It doesn't seem very scalable to me to mark in AMI which datasets are members of a campaign that may change at any moment (or is it always additive?)
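To illustrate the idea of a client generating its own regular expressions from a pyDict, here is a rough sketch. It rests on assumptions not stated in these slides: that dataset names have the dotted layout project.number.physicsShort.prodStep.dataType.tags, that the last field is a chain of tags joined by "_", and that a dataset belongs to the campaign if any tag in the chain is listed. The function name and the dataset name used below are invented for the example.

```python
import re


def campaign_patterns(nested):
    """Yield (campaign, compiled regex), one per (project, stream, prodStep, dataType)."""
    for campaign, projects in nested.items():
        for project, streams in projects.items():
            for stream, steps in streams.items():
                # '*' is treated as a wildcard for the stream (physicsShort) field.
                stream_re = r"[^.]+" if stream == "*" else re.escape(stream)
                for step, types in steps.items():
                    for data_type, tags in types.items():
                        if not tags:
                            continue  # nothing to match for an empty tag set
                        tag_re = "|".join(re.escape(t) for t in sorted(tags))
                        pattern = (
                            rf"^{re.escape(project)}\.\d+\.{stream_re}"
                            rf"\.{re.escape(step)}\.{re.escape(data_type)}"
                            rf"\.(?:\w+_)?(?:{tag_re})(?:_\w+)?$"
                        )
                        yield campaign, re.compile(pattern)


nested = {"solveig_test2": {"mc11_7TeV": {"*": {"recon": {"AOD": ["a146", "r3000"]}}}}}
patterns = list(campaign_patterns(nested))
name = "mc11_7TeV.105200.T1_McAtNlo_Jimmy.recon.AOD.e835_s1272_s1274_r3000"  # invented name
matches = [campaign for campaign, rx in patterns if rx.match(name)]
print(matches)
```

With this approach nothing needs to be marked per dataset in AMI; a client regenerates the patterns whenever the campaign definition changes.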
Messy Tag prodstep coupling
• I would have liked to be able to say to a client "This tag type does not go with the dataType/prodStep you provided".
• But the use of tags, and even prodSteps, is too messy. (Double use of s tags and r tags in particular.)
• So at the moment I am only checking that the prodstep is declared and that the tag exists.
• I will add a warning "This tag is already in another campaign".
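The checks described above amount to something like the following sketch. It is not the AMI code: the function `check_tag` and its arguments (sets of declared prodSteps and known tags, and a map from tag to the campaigns already using it) are hypothetical stand-ins for what the database holds.

```python
def check_tag(campaign, prod_step, tag, declared_steps, known_tags, tag_owners):
    """Return (accepted, messages): errors if rejected, warnings if accepted."""
    if prod_step not in declared_steps:
        return False, [f"prodStep {prod_step!r} is not declared"]
    if tag not in known_tags:
        return False, [f"tag {tag!r} does not exist"]
    # No check that the tag type fits the prodStep/dataType: usage is too messy for that.
    others = sorted(tag_owners.get(tag, set()) - {campaign})
    warnings = [f"This tag is already in another campaign: {name}" for name in others]
    return True, warnings


steps = {"recon", "merge", "digit"}
tags = {"r3043", "r2993", "d621"}
owners = {"r3043": {"mc11a"}}
print(check_tag("mc11c", "recon", "r3043", steps, tags, owners))
```

The tag is accepted in the call above, with a warning that it is already used by mc11a.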
Next steps
• Make web interface (cf. Period definition interface)
• Test it? Who?
• Document.
• Release…