users’ tutorial. Graeme Watt (IPPP Durham). “(Re)interpreting the results of new physics searches at the LHC”, CERN, 15th June 2016. http://hepdata.cedar.ac.uk, https://hepdata.net


  • users’ tutorial

    Graeme Watt (IPPP Durham)

    “(Re)interpreting the results of new physics searches at the LHC”, CERN, 15th June 2016

    http://hepdata.cedar.ac.uk 🔜 https://hepdata.net

    https://indico.cern.ch/event/525142/
    http://www.dur.ac.uk
    http://www.stfc.ac.uk
    http://www.ippp.dur.ac.uk
    http://twitter.com/HepData

  • G. Watt

    What is HepData?

    • Unique open-access repository for scattering data from more than 8000 experimental HEP (“hep-ex”) papers.
    • Traditional focus on production cross sections etc.
    • In recent years also include data from LHC searches (event counts, signal/background, efficiencies/acceptances, limits, exclusion contours, cut-flow tables, SLHA files, etc.).
    • Options to make plots and export to various formats.
    • Based in Institute for Particle Physics Phenomenology (IPPP) at Durham University (UK), going back to 1970s.
    • Funded by UK Science & Technology Facilities Council (STFC). Grant to support two staff extended to 2019.

  • HepData staff in Durham

    • Graeme Watt: Database Manager (2013-)
    • Mike Whalley: Previous Database Manager
    • Joanne Bentham: Administrative Assistant
    • Frank Krauss: Principal Investigator (2014-)

    Close collaboration with Inspire group at CERN (2014-): http://inspirehep.net

  • HepData 🔜 HEPData

    hepdata.cedar.ac.uk

    • Originally developed by Andy Buckley and Mike Whalley: “HepData reloaded: reinventing the HEP data archive”, PoS ACAT2010 (2010) 067 [arXiv:1006.0517], http://arxiv.org/abs/1006.0517.
    • Still production site, hosted in Durham.

    hepdata.net

    • Lead developer: Eamonn Maguire (CERN).
    • Overlay on Invenio 3 digital library software (http://invenio-software.org).
    • Development site (soon production), hosted at CERN.

    Code: https://github.com/HEPData

  • Current data “input” format

    http://hepdata.cedar.ac.uk/view/ins1203852/d2/input

        *dataset:
        *location: Page 20 of preprint
        *dscomment: The measured total cross sections. The first systematic uncertainty is the combined systematic uncertainty excluding luminosity, the second is the luminosity
        *reackey: P P --> Z0 Z0 X
        *obskey: SIG
        *qual: RE : P P --> Z0 Z0 X
        *yheader: SIG(total) IN FB
        *xheader: SQRT(S) IN GEV
        *data: x : y
         7000; 6.7 +- 0.7 (DSYS=+0.4,-0.3,DSYS=0.3:lumi);
        *dataend:

    • All data needs to be transformed into a standard “input” text format.
    • Metadata for each table followed by data points in a structured format.
    • *reackey and *obskey keywords can be used for searching database.
    • Flexible table format: any number of rows and ‘x’ or ‘y’ columns possible.
    • Limited support to attach and link auxiliary files stored in a web directory.
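As a rough illustration of the "metadata followed by data points" layout described on this slide, the sketch below splits the example table into keyword metadata and data rows. It is not the HepData parser: the function name parse_input_table is hypothetical, the *dscomment line is left out for brevity, and only the simple one-x, one-y layout of this example is handled.

```python
import re

# The example table from the slide (without the *dscomment line).
EXAMPLE = """\
*dataset:
*location: Page 20 of preprint
*reackey: P P --> Z0 Z0 X
*obskey: SIG
*qual: RE : P P --> Z0 Z0 X
*yheader: SIG(total) IN FB
*xheader: SQRT(S) IN GEV
*data: x : y
 7000; 6.7 +- 0.7 (DSYS=+0.4,-0.3,DSYS=0.3:lumi);
*dataend:
"""


def parse_input_table(text):
    """Split one old-style table into (metadata, points). Hypothetical helper."""
    metadata, points, in_data = {}, [], False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("*dataend"):
            in_data = False
        elif line.startswith("*data:"):
            in_data = True
        elif in_data:
            # A data row looks like "x; y +- stat (systematics);"
            x_str, y_str = [p.strip() for p in line.rstrip(";").split(";", 1)]
            match = re.match(r"(\S+)\s*\+-\s*(\S+)\s*(?:\((.*)\))?", y_str)
            value, stat, systematics = match.groups()
            points.append({
                "x": float(x_str),
                "y": float(value),
                "stat": float(stat),
                "systematics": systematics,  # kept as raw text in this sketch
            })
        elif line.startswith("*"):
            # Metadata keyword: "*key: value" (split on the first colon only).
            key, _, val = line[1:].partition(":")
            metadata[key.strip()] = val.strip()
    return metadata, points


metadata, points = parse_input_table(EXAMPLE)
print(metadata["reackey"], points[0])
```

The systematic uncertainties are deliberately left as an unparsed string here, since the slide shows only one example of the DSYS syntax.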

  • (Current) experiment self-submission

    (Procedure agreed in HepData meeting at CERN on 26th June 2014)

    • Administrator (e.g. convener of physics group), http://hepdata.cedar.ac.uk/manage: Enter Inspire ID to allocate paper. Password sent by email.
    • Encoder (e.g. primary author of physics analysis), http://hepdata.cedar.ac.uk/input: Upload text file in “input” format (+ optional supplementary files). Record is added to test DB. Flag as ‘Ready’ once satisfied.
    • Administrator: Receive email notification that record is ‘Ready’ on test DB. OK? Yes: Add record to public DB. No: back to the Encoder.

    • First paper added by ATLAS SUSY group on 28th October 2014: http://hepdata.cedar.ac.uk/view/ins1304458

  • (Current) submission system usage

    • Total of 253 papers: ATLAS (94), CMS (74), ALICE (55), LHCb (15).

  • Submission system usage by ATLAS

    • Total of 94 ATLAS papers: SUSY (20), EXOT (22), HION (14), STDM (28), HIGG (3), TOPQ (6), BPHY (1).

  • Submission system usage by CMS

    • Total of 74 CMS papers: B2G (6), SMP (24), FSQ (6), HIN (12), TOP (16), BPH (4), HIG (6), SUS (0), EXO (0).

  • New YAML text format

    http://hepdata.cedar.ac.uk/view/ins1203852/d2/yaml

        ---
        name: 'Table 2'
        label: 'Data from Page 20 of preprint'
        description: |
          The measured total cross sections. The first systematic uncertainty is the combined systematic uncertainty excluding luminosity, the second is the luminosity.
        keywords:
          - {name: reactions, values: ['P P --> Z0 Z0 X']}
          - {name: observables, values: ['SIG']}
          - {name: phrases, values: ['Inclusive', 'Integrated Cross Section', 'Cross Section', 'Proton-Proton Scattering', 'Z Production', 'Z pair Production']}
          - {name: cmenergies, values: [7000.0]}
        additional_resources:
        independent_variables:
          - header: {name: 'SQRT(S)', units: 'GEV'}
            values:
              - {value: 7000}
        dependent_variables:
          - header: {name: 'SIG(total)', units: 'FB'}
            qualifiers:
              - {name: 'RE', value: 'P P --> Z0 Z0 X'}
            values:
              - value: 6.7
                errors:
                  - {symerror: 0.7, label: 'stat'}
                  - {asymerror: {plus: 0.4, minus: -0.3}, label: 'sys'}
                  - {symerror: 0.3, label: 'sys,lumi'}

    Annotations on the example: “New keywords”, “Auxiliary files”.

    • Data from existing HepData database was exported to the new YAML format for migration to the new system.

    https://hepdata.net/record/ins1203852
    http://github.com/HEPData/hepdata-submission
    http://yaml.org
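To show how the error breakdown in the YAML table might be consumed, the sketch below mirrors the single data point from the example as a plain Python dictionary (so no YAML library is needed) and combines the uncertainties. Adding the sources in quadrature, i.e. treating them as uncorrelated, is an assumption made here for illustration and is not stated on the slide; total_uncertainty is a hypothetical helper, not part of any HEPData package.

```python
import math

# The single data point from the example table, written as a Python dict
# with the same keys as the YAML ("value", "errors", "symerror", "asymerror").
point = {
    "value": 6.7,
    "errors": [
        {"symerror": 0.7, "label": "stat"},
        {"asymerror": {"plus": 0.4, "minus": -0.3}, "label": "sys"},
        {"symerror": 0.3, "label": "sys,lumi"},
    ],
}


def total_uncertainty(errors):
    """Return (up, down) magnitudes, combining all sources in quadrature.

    Assumption for this sketch: the sources are uncorrelated.
    """
    up_sq = down_sq = 0.0
    for err in errors:
        if "symerror" in err:
            up = down = abs(err["symerror"])
        else:
            up = abs(err["asymerror"]["plus"])
            down = abs(err["asymerror"]["minus"])
        up_sq += up ** 2
        down_sq += down ** 2
    return math.sqrt(up_sq), math.sqrt(down_sq)


up, down = total_uncertainty(point["errors"])
print(f"{point['value']} +{up:.2f} -{down:.2f}")
```

Keeping each source as a separate labelled entry, as the YAML format does, leaves the choice of how to combine them to the user of the data.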

  • New submission system

    https://hepdata.net/submission

    • Coordinator (e.g. convener of physics group): Initiate new paper with Inspire ID or title. Invitation sent by email.
    • Uploader (e.g. primary author of physics analysis): Upload text files in YAML format (+ optional supplementary files). Alternatively, .oldhepdata file.
    • Reviewer (e.g. supervisor of primary author): Receives email. Checks each table. OK? Yes: Mark tables as “Passed” review. No: Mark as “Attention Required”.
    • Coordinator: Enter Inspire ID if not done already. Add record to public DB.

    • Also a Sandbox for any user to test uploads without special privileges: https://hepdata.net/record/sandbox
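The review loop on this slide can be pictured as a small state machine. The following is only a sketch of that logic, not HEPData code: the Submission class, its method names and the initial "To Do" state are hypothetical, and requiring every table to be "Passed" before the record goes public is an assumption read off the "OK?" decision in the flowchart. Only the "Passed" and "Attention Required" labels are quoted from the slide.

```python
PASSED = "Passed"
ATTENTION = "Attention Required"
TODO = "To Do"  # hypothetical initial state, not named on the slide


class Submission:
    """Toy model of the Coordinator / Uploader / Reviewer roles."""

    def __init__(self, table_names):
        # Every uploaded table starts out unreviewed.
        self.status = {name: TODO for name in table_names}
        self.public = False

    def review(self, table, ok):
        """Reviewer: mark one table after checking it."""
        self.status[table] = PASSED if ok else ATTENTION

    def upload_revision(self, table):
        """Uploader: a revised table needs to be reviewed again."""
        self.status[table] = TODO

    def finalise(self):
        """Coordinator: add the record to the public DB."""
        if not all(state == PASSED for state in self.status.values()):
            raise RuntimeError("all tables must pass review first")
        self.public = True
```

In this model a table flagged "Attention Required" goes back to the Uploader, and the Coordinator can only finalise once the Reviewer has passed every table.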

  • DOIs and versioning

    • DOIs minted for whole record and each table.

    https://www.doi.org
    https://www.hepdata.net/record/22518?version=2

  • Conversion tools

    • Work by Michal Szostak (CERN summer student): http://cds.cern.ch/record/2055193, http://github.com/HEPData/hepdata-converter

    ROOT (in future)

  • Data output formats

    • YAML: native HEPData format.
    • CSV: comma-separated values.
    • YODA: for inclusion in a Rivet analysis.
    • ROOT: binary .root file, not CINT script.

    https://hepdata.net/record/ins1422615

    • Each table in a directory.
    • TGraphAsymmErrors for each dependent variable.
    • If finite bin width, also separate TH1F for value and each error.
    • Multidimensional (TH2F, TH3F).

    http://github.com/HEPData/hepdata-submission
    https://en.wikipedia.org/wiki/Comma-separated_values
    https://yoda.hepforge.org
    https://rivet.hepforge.org
    https://root.cern.ch
    https://root.cern.ch/cint

  • JSON endpoints

    • All HTML web pages have a JSON equivalent.
    • Search for data from all CMS publications: https://hepdata.net/search/?collaboration=CMS&format=json
    • Get all information on an ATLAS SUSY publication: https://hepdata.net/record/ins1422615?format=json
    • Import a data table into Mathematica (json, csv, yoda):

        Import["https://www.hepdata.net/record/data/71624/63779/1", "JSON"]
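The URL patterns on this slide can also be built programmatically. The sketch below uses only the Python standard library; the helper names search_url, record_url and fetch_json are hypothetical, the patterns are copied from the slide and may have changed since 2016, and nothing is assumed about the layout of the JSON that comes back.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "https://hepdata.net"


def search_url(**params):
    """Build a search URL and request the JSON equivalent of the page."""
    params["format"] = "json"
    return f"{BASE}/search/?{urlencode(params)}"


def record_url(inspire_id):
    """Build the JSON URL for a record identified by its Inspire ID."""
    return f"{BASE}/record/ins{inspire_id}?format=json"


def fetch_json(url, timeout=30):
    """Download a URL and decode the body as JSON (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Building the URLs needs no network access.
print(search_url(collaboration="CMS"))
print(search_url(q='"dark matter"', collaboration="ATLAS"))
print(record_url(1422615))
```

Calling fetch_json on any of these URLs would then return the decoded response as ordinary Python dictionaries and lists.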

  • Publication-driven search

    • Powerful search and faceting provided by Elasticsearch (http://elastic.co).

    https://hepdata.net/search/?q="dark+matter"&collaboration=ATLAS

  • Data-driven search

    • Prototype at http://hepdata.rufian.eu.

    Master’s project by Juan Luis Boya García (University of Salamanca)

    http://hepdata.rufian.eu/#zCvuS

  • HEPData beyond data

    • HEPData could easily be used to store tables of numbers and auxiliary files associated with theoretical papers (that have an Inspire entry), e.g. ATLAS/CMS DM Forum, LHC Higgs/SUSY XS WGs.
    • Various frameworks provide analyses that are each closely tied to a particular experimental paper.
    • Examples: Rivet, MadAnalysis5, fastNLO, APPLgrid.
    • Store analysis in HEPData linked to data record?
    • Inherit benefits such as versioning and DOI minting.

    http://inspirehep.net
    https://arxiv.org/abs/1507.00966
    https://twiki.cern.ch/twiki/bin/view/LHCPhysics/LHCHXSWG
    https://twiki.cern.ch/twiki/bin/view/LHCPhysics/SUSYCrossSections
    https://rivet.hepforge.org
    http://madanalysis.irmp.ucl.ac.be/wiki/PublicAnalysisDatabase
    http://fastnlo.hepforge.org
    https://applgrid.hepforge.org

  • Summary

    • HepData used by ATLAS SUSY and EXOT (less by HIGG). Also by CMS B2G and HIG (but not by SUS or EXO?).
    • Transition to rewritten HEPData site (hepdata.net) hosted at CERN. Many improvements and new features.
    • HEPData should be the reference site for hosting LHC data.
    • Information/format can vary greatly between data records: recasters are welcome to help to standardise formats.
    • Keywords should be expanded beyond historical terms to better accommodate new types of data from LHC searches.
    • New HEPData has better support for hosting auxiliary files: Rivet/MadAnalysis5 analysis code could be attached to data.

    http://hepdata.net
    https://rivet.hepforge.org
    http://madanalysis.irmp.ucl.ac.be/wiki/PublicAnalysisDatabase