-
users’ tutorial
Graeme Watt (IPPP Durham)
“(Re)interpreting the results of new physics searches at the LHC”
CERN, 15th June 2016
http://hepdata.cedar.ac.uk 🔜 https://hepdata.net
https://indico.cern.ch/event/525142/
http://www.dur.ac.uk
http://www.stfc.ac.uk
http://www.ippp.dur.ac.uk
http://twitter.com/HepData
-
G. Watt
What is HepData?
• Unique open-access repository for scattering data from more than 8000 experimental HEP (“hep-ex”) papers.
• Traditional focus on production cross sections etc.
• In recent years also include data from LHC searches (event counts, signal/background, efficiencies/acceptances, limits, exclusion contours, cut-flow tables, SLHA files, etc.).
• Options to make plots and export to various formats.
• Based in Institute for Particle Physics Phenomenology (IPPP) at Durham University (UK), going back to 1970s.
• Funded by UK Science & Technology Facilities Council (STFC). Grant to support two staff extended to 2019.
-
HepData staff in Durham
• Graeme Watt: Database Manager (2013-)
• Mike Whalley: Previous Database Manager
• Joanne Bentham: Administrative Assistant
• Frank Krauss: Principal Investigator (2014-)
Close collaboration with Inspire group at CERN (2014-): http://inspirehep.net
-
HepData 🔜 HEPData
hepdata.cedar.ac.uk (http://hepdata.cedar.ac.uk)
• Originally developed by Andy Buckley and Mike Whalley: “HepData reloaded: reinventing the HEP data archive”, PoS ACAT2010 (2010) 067 [arXiv:1006.0517], http://arxiv.org/abs/1006.0517.
• Still production site, hosted in Durham.
hepdata.net (http://hepdata.net)
• Lead developer: Eamonn Maguire (CERN).
• Overlay on Invenio 3 digital library software (http://invenio-software.org).
• Development site (soon production), hosted at CERN.
Code: https://github.com/HEPData
-
Current data “input” format
http://hepdata.cedar.ac.uk/view/ins1203852/d2/input
*dataset:
*location: Page 20 of preprint
*dscomment: The measured total cross sections. The first systematic uncertainty is the combined systematic uncertainty excluding luminosity, the second is the luminosity
*reackey: P P --> Z0 Z0 X
*obskey: SIG
*qual: RE : P P --> Z0 Z0 X
*yheader: SIG(total) IN FB
*xheader: SQRT(S) IN GEV
*data: x : y
 7000; 6.7 +- 0.7 (DSYS=+0.4,-0.3,DSYS=0.3:lumi);
*dataend:
• All data needs to be transformed into a standard “input” text format.
• Metadata for each table followed by data points in a structured format.
• *reackey and *obskey keywords can be used for searching database.
• Flexible table format: any number of rows and ‘x’ or ‘y’ columns possible.
• Limited support to attach and link auxiliary files stored in a web directory.
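A minimal sketch of how the “input” layout could be read: metadata keywords start with `*`, and data rows sit between `*data:` and `*dataend:`. It assumes only the features visible in the single-table example (one value per keyword line, `;`-separated cells); the function name `parse_input_table` is hypothetical and this is not the parser used by HepData itself.

```python
SAMPLE = """\
*dataset:
*location: Page 20 of preprint
*reackey: P P --> Z0 Z0 X
*obskey: SIG
*qual: RE : P P --> Z0 Z0 X
*yheader: SIG(total) IN FB
*xheader: SQRT(S) IN GEV
*data: x : y
 7000; 6.7 +- 0.7 (DSYS=+0.4,-0.3,DSYS=0.3:lumi);
*dataend:
"""


def parse_input_table(text):
    """Split one '*dataset:' block into a metadata dict and raw data rows."""
    metadata, rows, in_data = {}, [], False
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("*"):
            # '*key: value' -> split on the first colon only
            key, _, value = line[1:].partition(":")
            key, value = key.strip(), value.strip()
            if key == "dataend":
                in_data = False
            elif key == "data":
                in_data = True
                # 'x : y' declares the column layout of the rows that follow
                metadata["columns"] = [c.strip() for c in value.split(":")]
            elif value:
                # a keyword may repeat, so collect values in a list
                metadata.setdefault(key, []).append(value)
        elif in_data:
            # each cell is terminated by ';' (assumption from the example row)
            rows.append([c.strip() for c in line.split(";") if c.strip()])
    return metadata, rows


metadata, rows = parse_input_table(SAMPLE)
```

The searchable keywords then appear as `metadata["reackey"]` and `metadata["obskey"]`, while `rows` keeps each data cell as unparsed text.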
-
(Current) experiment self-submission
(Procedure agreed in HepData meeting at CERN on 26th June 2014)
Administrator (e.g. convener of physics group), http://hepdata.cedar.ac.uk/manage:
• Enter Inspire ID to allocate paper. Password sent by email.
Encoder (e.g. primary author of physics analysis), http://hepdata.cedar.ac.uk/input:
• Upload text file in “input” format (+ optional supplementary files).
• Record is added to test DB.
• Flag as ‘Ready’ once satisfied.
Administrator:
• Receive email notification that record is ‘Ready’ on test DB.
• OK? (Yes/No). If Yes: Add record to public DB.
• First paper added by ATLAS SUSY group on 28th October 2014: http://hepdata.cedar.ac.uk/view/ins1304458
-
(Current) submission system usage
• Total of 253 papers: ATLAS (94), CMS (74), ALICE (55), LHCb (15).
-
Submission system usage by ATLAS
• Total of 94 ATLAS papers: SUSY (20), EXOT (22), HION (14), STDM (28), HIGG (3), TOPQ (6), BPHY (1).
-
Submission system usage by CMS
• Total of 74 CMS papers: B2G (6), SMP (24), FSQ (6), HIN (12), TOP (16), BPH (4), HIG (6), SUS (0), EXO (0).
-
New YAML text format
---
name: 'Table 2'
label: 'Data from Page 20 of preprint'
description: |
  The measured total cross sections. The first systematic uncertainty is the combined systematic uncertainty excluding luminosity, the second is the luminosity.
keywords:
  - {name: reactions, values: ['P P --> Z0 Z0 X']}
  - {name: observables, values: ['SIG']}
  - {name: phrases, values: ['Inclusive', 'Integrated Cross Section', 'Cross Section', 'Proton-Proton Scattering', 'Z Production', 'Z pair Production']}
  - {name: cmenergies, values: [7000.0]}
additional_resources:
independent_variables:
  - header: {name: 'SQRT(S)', units: 'GEV'}
    values:
      - {value: 7000}
dependent_variables:
  - header: {name: 'SIG(total)', units: 'FB'}
    qualifiers:
      - {name: 'RE', value: 'P P --> Z0 Z0 X'}
    values:
      - value: 6.7
        errors:
          - {symerror: 0.7, label: 'stat'}
          - {asymerror: {plus: 0.4, minus: -0.3}, label: 'sys'}
          - {symerror: 0.3, label: 'sys,lumi'}
• Data from existing HepData database was exported to the new YAML format for migration to the new system.
http://hepdata.cedar.ac.uk/view/ins1203852/d2/yaml
https://hepdata.net/record/ins1203852
Callouts on the example: “New keywords” (phrases, cmenergies) and “Auxiliary files” (additional_resources).
http://github.com/HEPData/hepdata-submission
http://yaml.org
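To illustrate how an old-style data point maps onto the new `value`/`errors` structure, here is a small sketch. It is not the actual export code: the function name `convert_point` is hypothetical, and the labelling rule (`stat` for the `+-` term, `sys` for an unlabelled `DSYS`, `sys,<tag>` for a labelled one) is inferred only from the single example in this talk.

```python
import re

# A plain decimal number with optional sign and exponent.
NUM = r"[-+]?\d+(?:\.\d*)?(?:[eE][-+]?\d+)?"
POINT = re.compile(rf"\s*({NUM})\s*\+-\s*({NUM})\s*(?:\((.*)\))?\s*")


def convert_point(y_text):
    """Turn '6.7 +- 0.7 (DSYS=...)' into a dict shaped like the YAML entry."""
    match = POINT.fullmatch(y_text)
    if match is None:
        raise ValueError(f"unrecognised data point: {y_text!r}")
    value, stat, systematics = match.groups()
    errors = [{"symerror": float(stat), "label": "stat"}]
    # Every systematic term begins with 'DSYS='; split on that marker.
    for part in (systematics or "").split("DSYS="):
        part = part.strip().rstrip(",")
        if not part:
            continue
        numbers, _, tag = part.partition(":")
        label = f"sys,{tag}" if tag else "sys"
        if "," in numbers:
            # '+0.4,-0.3' -> asymmetric error, signs kept as written
            plus, minus = (float(n) for n in numbers.split(","))
            errors.append({"asymerror": {"plus": plus, "minus": minus}, "label": label})
        else:
            errors.append({"symerror": float(numbers), "label": label})
    return {"value": float(value), "errors": errors}


point = convert_point("6.7 +- 0.7 (DSYS=+0.4,-0.3,DSYS=0.3:lumi)")
```

For the example point this yields the same three error entries as the YAML table: a symmetric `stat` error, an asymmetric `sys` error and a symmetric `sys,lumi` error.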
-
New submission system
https://hepdata.net/submission
Coordinator (e.g. convener of physics group):
• Initiate new paper with Inspire ID or title.
• Invitation sent by email.
Uploader (e.g. primary author of physics analysis):
• Upload text files in YAML format (+ optional supplementary files).
• Alternatively, .oldhepdata file.
Reviewer (e.g. supervisor of primary author):
• Receives email.
• Checks each table.
• OK? Yes: Mark tables as “Passed” review. No: Mark as “Attention Required”.
Coordinator:
• Enter Inspire ID if not done already.
• Add record to public DB.
• Also a Sandbox for any user to test uploads without special privileges: https://hepdata.net/record/sandbox
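The per-table review step can be pictured as a small status check. This is only a sketch: the status strings “Passed” and “Attention Required” come from the slide, while the rule that every table must be passed before the record goes public, and the function names, are assumptions made for illustration.

```python
PASSED = "Passed"
ATTENTION = "Attention Required"


def can_publish(table_status):
    """True only if there is at least one table and all are marked Passed.

    Assumption: the Coordinator adds the record to the public DB only
    once no table is left in any other state.
    """
    return bool(table_status) and all(s == PASSED for s in table_status.values())


def tables_needing_work(table_status):
    """Names of tables the Uploader still has to revisit, in sorted order."""
    return sorted(name for name, s in table_status.items() if s != PASSED)


review = {"Table 1": PASSED, "Table 2": ATTENTION}
blocked = tables_needing_work(review)   # tables still flagged by the Reviewer
review["Table 2"] = PASSED              # Reviewer accepts the corrected table
ready = can_publish(review)
```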
-
DOIs and versioning
• DOIs minted for whole record and each table.
https://www.doi.org
https://www.hepdata.net/record/22518?version=2
-
Conversion tools
• Work by Michal Szostak (CERN summer student): http://cds.cern.ch/record/2055193, http://github.com/HEPData/hepdata-converter
ROOT (in future)
-
Data output formats
• YAML: native HEPData format.
• CSV: comma-separated values.
• YODA: for inclusion in a Rivet analysis.
• ROOT: binary .root file, not CINT script.
ROOT output (example record: https://hepdata.net/record/ins1422615):
• Each table in a directory.
• TGraphAsymmErrors for each dependent variable.
• If finite bin width, also separate TH1F for value and each error.
• Multidimensional (TH2F, TH3F).
http://github.com/HEPData/hepdata-submission
https://en.wikipedia.org/wiki/Comma-separated_values
https://yoda.hepforge.org
https://rivet.hepforge.org
https://root.cern.ch
https://root.cern.ch/cint
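A TGraphAsymmErrors holds one low and one high error per point, whereas a HEPData value can carry several labelled errors. The sketch below shows one way to reduce them to a single pair, by adding the upward and downward parts in quadrature. The quadrature rule and the function name `total_asym_error` are assumptions for illustration; the slide does not say how the ROOT export combines errors.

```python
import math


def total_asym_error(errors):
    """Combine symmetric and asymmetric errors in quadrature.

    Returns (low, high), the order in which ROOT's TGraphAsymmErrors
    takes its y errors (eyl, eyh).
    """
    low_sq = high_sq = 0.0
    for err in errors:
        if "symerror" in err:
            low = high = abs(err["symerror"])
        else:
            plus = err["asymerror"]["plus"]
            minus = err["asymerror"]["minus"]
            # the largest positive shift feeds 'high', the largest negative 'low'
            high = max(plus, minus, 0.0)
            low = abs(min(plus, minus, 0.0))
        low_sq += low ** 2
        high_sq += high ** 2
    return math.sqrt(low_sq), math.sqrt(high_sq)


errors = [
    {"symerror": 0.7, "label": "stat"},
    {"asymerror": {"plus": 0.4, "minus": -0.3}, "label": "sys"},
    {"symerror": 0.3, "label": "sys,lumi"},
]
eyl, eyh = total_asym_error(errors)
```

Keeping a separate histogram for each error, as in the TH1F output described on the slide, avoids having to pick a combination rule at all.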
-
JSON endpoints
• All HTML web pages have a JSON equivalent.
• Search for data from all CMS publications: https://hepdata.net/search/?collaboration=CMS&format=json
• Get all information on an ATLAS SUSY publication: https://hepdata.net/record/ins1422615?format=json
• Import a data table into Mathematica (json, csv, yoda):
Import["https://www.hepdata.net/record/data/71624/63779/1", "JSON"]
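The same endpoints can be reached from Python with the standard library alone. In this sketch, `json_equivalent` turns any page URL into its JSON counterpart by setting `format=json`, and `fetch_json` downloads and decodes it. Both function names are hypothetical, and the layout of the returned JSON is not assumed here.

```python
import json
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
from urllib.request import urlopen


def json_equivalent(page_url):
    """Return the URL of the JSON version of an HTML page URL."""
    parts = urlsplit(page_url)
    # drop any existing 'format' parameter, then request JSON
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "format"]
    query.append(("format", "json"))
    return urlunsplit(parts._replace(query=urlencode(query)))


def fetch_json(page_url, timeout=30):
    """Download and decode the JSON equivalent of a page (needs network access)."""
    with urlopen(json_equivalent(page_url), timeout=timeout) as response:
        return json.load(response)


record_url = json_equivalent("https://hepdata.net/record/ins1422615")
search_url = json_equivalent("https://hepdata.net/search/?collaboration=CMS")
```

Calling `fetch_json("https://hepdata.net/record/ins1422615")` would then return the decoded record as ordinary Python dictionaries and lists.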
-
Publication-driven search
• Powerful search and faceting provided by Elasticsearch (http://elastic.co).
https://hepdata.net/search/?q=%22dark+matter%22&collaboration=ATLAS
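A short sketch of how such a search URL can be assembled, with the phrase wrapped in double quotes and then percent-encoded. Only the `q`, `collaboration` and `format` parameters seen in this talk are used; the helper name `build_search_url` is hypothetical.

```python
from urllib.parse import urlencode

SEARCH_PAGE = "https://hepdata.net/search/"


def build_search_url(phrase=None, **facets):
    """Build a search URL for an exact phrase plus optional facet filters."""
    params = {}
    if phrase:
        # double quotes request the exact phrase; urlencode escapes them as %22
        params["q"] = f'"{phrase}"'
    params.update(facets)
    return SEARCH_PAGE + "?" + urlencode(params)


dark_matter_url = build_search_url("dark matter", collaboration="ATLAS")
```

Passing `format="json"` as an additional facet requests the JSON version of the same result page.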
-
Data-driven search
• Prototype at http://hepdata.rufian.eu.
• Master’s project by Juan Luis Boya García (University of Salamanca).
http://hepdata.rufian.eu/#zCvuS
-
HEPData beyond data
• HEPData could easily be used to store tables of numbers and auxiliary files associated with theoretical papers (that have an Inspire entry), e.g. ATLAS/CMS DM Forum, LHC Higgs/SUSY XS WGs.
• Various frameworks provide analyses that are each closely tied to a particular experimental paper.
• Examples: Rivet, MadAnalysis5, fastNLO, APPLgrid.
• Store analysis in HEPData linked to data record?
• Inherit benefits such as versioning and DOI minting.
http://inspirehep.net
https://arxiv.org/abs/1507.00966
https://twiki.cern.ch/twiki/bin/view/LHCPhysics/LHCHXSWG
https://twiki.cern.ch/twiki/bin/view/LHCPhysics/SUSYCrossSections
https://rivet.hepforge.org
http://madanalysis.irmp.ucl.ac.be/wiki/PublicAnalysisDatabase
http://fastnlo.hepforge.org
https://applgrid.hepforge.org
-
Summary
• HepData used by ATLAS SUSY and EXOT (less by HIGG). Also by CMS B2G and HIG (but not by SUS or EXO?).
• Transition to rewritten HEPData site (hepdata.net) hosted at CERN. Many improvements and new features.
• HEPData should be the reference site for hosting LHC data.
• Information/format can vary greatly between data records: recasters are welcome to help to standardise formats.
• Keywords should be expanded beyond historical terms to better accommodate new types of data from LHC searches.
• New HEPData has better support for hosting auxiliary files: Rivet/MadAnalysis5 analysis code could be attached to data.