air • planet • people Integrated Information Services (IIS) UCAR NCAR UCP
Digital Resource Attribution and Citation through Unique Identifiers
Matt Mayernik Project Scientist
Research Data Services Specialist NCAR Library/Integrated Information Services (IIS)
DCERC Kickoff Workshop June, 2014
NCAR/UCAR Community Resources
Issues
• Discoverability - How to get UCAR resources in the hands of researchers and students?
  – Finding data on the internet can be very difficult
  – Most data are discovered through the literature
• Traceability - How to connect scholarship with underlying resources?
  – Reproducibility and transparency of science
  – Digital resources are largely linkable online
• Attribution - How to reward production of digital resources, e.g. data and software?
  – Data/software production, management, and support require significant expertise and effort
  – Professional credit structures rarely account for this work
5th Coupled Model Intercomparison Project (CMIP5)
“Previous archives have not always been careful about version control ..., and it has not always been clear which data has eventually been used. In the case of CMIP3 and AR4 an additional problem has been that it was not clear which data was in the CMIP3 archive at any time, and so in some cases where users have used ‘all the CMIP3 models’ or ‘all the AR4 models’ it is non-trivial to go back and be sure what data has been used.”
– Williams, D. N., Lawrence, B. N., Lautenschlager, M., Middleton, D., & Balaji, V. (2011). The Earth System Grid Federation: Delivering globally accessible petascale data for CMIP5. In Proceedings of the 32nd Asia-Pacific Advanced Network Meeting, 121-130. Quote from pg. 125. http://home.badc.rl.ac.uk/lawrence/static/2012/02/15/WilEA11.pdf
What is a Data Citation?
Example from: Lindsay, R., et al. 2012. Seasonal forecasts of Arctic sea ice initialized with observations of ice thickness. Geophysical Research Letters, 39(21). DOI: 10.1029/2012GL053576
Citation to journal article
Citation to data set
Data Citation Recommendations
• Earth Science Information Partners (US): http://bit.ly/data_citation
• Digital Curation Centre (UK): http://www.dcc.ac.uk/resources/how-guides/cite-datasets
• Joint Declaration of Data Citation Principles (Feb 2014): http://www.force11.org/datacitation
  – 60 organizational endorsements to date
Challenge – Overcoming Inertia
• Researchers may not…
  – know that they are asked to cite data
  – know how to cite data
  – know that they have data to cite
  – be willing to spend time creating data citations
• Data users typically acknowledge their data sets and software using some combination of:
  – In-text descriptions of data sets and/or data providers
  – Mentions in the acknowledgements sections of a paper
  – Citations to papers about a data set
  – Including data set collectors/creators as coauthors on the paper
Challenge – Data ≠ Article
• Digital resources are highly heterogeneous and dynamic
  – At what granularity should digital identifiers be assigned?
  – How should identifiers be assigned to resources that change over time?
• Who “owns” digital identifiers in multi-organizational projects?
• Data “peer review” is not yet a well specified process
UCAR Data Citation Initiatives
• Develop coherent approaches across the organization for:
  – Technical tools and methods
  – Policy/procedural protocols and standards
  – User and community engagement
• Broad participation from UCAR/NCAR groups
  – CISL/DSS, CISL/VETS, EOL, IIS, NESL/CGD, Unidata
• Engagement with other groups
  – CISL/IMAGe, CISL/USS, HAO, JOSS, RAL, NESL/ACD
Actionable Identifiers
• Assigning actionable identifiers to digital resources
  – Digital Object Identifiers (DOIs)
    • Provide a persistent locator for internet-based resources
    • Widely used for scholarly papers
    • Growing use for other kinds of resources
  – Archival Resource Keys (ARKs)
    • Less implication of persistence
    • Additional features to address granularity
Data
http://dx.doi.org/10.5065/D6RN35ST resolves to http://www.earthsystemgrid.org/project/NARCCAP.html
Software
http://dx.doi.org/10.5065/D6WD3XH5 resolves to http://www.ncl.ucar.edu/
Facilities / Services
http://n2t.net/ark:/85065/d7wd3xhc resolves to http://www2.cisl.ucar.edu/resources/yellowstone
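The three examples above share one pattern: a bare identifier is appended to a resolver base URL, and the resolver redirects to the current landing page. A minimal Python sketch of that pattern follows. The resolver bases are the ones shown on the slides; the function names `resolver_url` and `landing_page` are hypothetical, and the prefix checks are a simplification rather than a full validation of DOI or ARK syntax.

```python
from urllib.request import urlopen

# Resolver base URLs as they appear on the slides.
DOI_RESOLVER = "http://dx.doi.org/"
ARK_RESOLVER = "http://n2t.net/"


def resolver_url(identifier: str) -> str:
    """Build the actionable URL for a bare DOI or ARK string."""
    ident = identifier.strip()
    if ident.lower().startswith("doi:"):
        ident = ident[4:]  # drop the optional "doi:" label
    if ident.startswith("ark:/"):
        return ARK_RESOLVER + ident
    if ident.startswith("10."):
        # DOI names begin with the directory indicator "10."
        return DOI_RESOLVER + ident
    raise ValueError(f"not a recognized DOI or ARK: {identifier!r}")


def landing_page(identifier: str, timeout: float = 10.0) -> str:
    """Follow the resolver's redirects and return the final URL.

    Requires network access; the result depends on where the identifier
    currently points, so it may differ from the targets on the slides.
    """
    with urlopen(resolver_url(identifier), timeout=timeout) as response:
        return response.geturl()


print(resolver_url("doi:10.5065/D6RN35ST"))
print(resolver_url("ark:/85065/d7wd3xhc"))
```

Only `resolver_url` runs offline. `landing_page` is included to show where the "resolves to" step happens, and because the identifier is the stable part, citations should carry the identifier rather than the landing-page URL.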
Citations
• NARCCAP Data:
  – Mearns, L.O., et al., 2007, updated 2012. The North American Regional Climate Change Assessment Program dataset, National Center for Atmospheric Research Earth System Grid data portal, Boulder, CO. Data downloaded 2014-05-22. [doi:10.5065/D6RN35ST]
• NCL Software:
  – The NCAR Command Language (Version 6.1.2) [Software]. (2013). Boulder, Colorado: UCAR/NCAR/CISL/VETS. http://dx.doi.org/10.5065/D6WD3XH5
• Yellowstone Supercomputer:
  – Computational and Information Systems Laboratory. 2012. Yellowstone: IBM iDataPlex System (Climate Simulation Laboratory). Boulder, CO: National Center for Atmospheric Research. http://n2t.net/ark:/85065/d7wd3xhc
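One way to make citations like these easy to create is to generate them from structured metadata. The sketch below rebuilds the NARCCAP data citation from its parts. The `DataCitation` class, its field names, and the `render` method are hypothetical, and the layout simply mirrors the NARCCAP example on this slide rather than any formal citation standard.

```python
from dataclasses import dataclass


@dataclass
class DataCitation:
    """Metadata fields needed for the NARCCAP-style citation on the slide."""
    authors: str
    year: str
    title: str
    publisher: str
    location: str
    accessed: str  # date the data were downloaded, as YYYY-MM-DD
    doi: str

    def render(self) -> str:
        # Layout follows the slide's NARCCAP example, not a formal standard.
        return (
            f"{self.authors}, {self.year}. {self.title}, {self.publisher}, "
            f"{self.location}. Data downloaded {self.accessed}. [doi:{self.doi}]"
        )


narccap = DataCitation(
    authors="Mearns, L.O., et al.",
    year="2007, updated 2012",
    title="The North American Regional Climate Change Assessment Program dataset",
    publisher="National Center for Atmospheric Research Earth System Grid data portal",
    location="Boulder, CO",
    accessed="2014-05-22",
    doi="10.5065/D6RN35ST",
)
print(narccap.render())
```

Keeping the download date as its own field matters for data that change over time: two users of the same DOI may have retrieved different versions.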
UCAR/NCAR Progress
• Community and consensus building
• Producing technical reports
  – White paper on internal identifier and citation recommendations
  – Report from UCAR data citation workshop held in 2012
• Organizing citation implementations within individual groups
  – UCAR groups have assigned citable identifiers to 38 data sets, 2 software packages, 1 model, and 1 facility
  – NCAR Library has assigned citable identifiers to 550+ NCAR Technical Notes and UCAR Manuscripts
Early Metrics
Citations and acknowledgements to UCAR resources using unique digital identifiers

ID | Resource | UCAR Group | Date of ID Assignment | Citations / Acknow. / Both (nonblank cells, in column order) | TOTAL
10.5065/D6WD3XH5 | NCL Software | CISL VETS | April 2012 | 29, 25 | 54
10.5065/D6RN35ST | NARCCAP Data | CISL | May 2012 | 16 | 16
85065/d7wd3xhc (ARK) | Yellowstone supercomputer | CISL | May 2012 | 3, 18, 3 | 24
10.5065/D6CC0XMC | Barrow Airborne Sea Ice Data | ACADIS | Sept. 2012 | 1 | 1
10.5065/D6FF3QBR | NCAR Tech Note - APE Atlas | NCAR Library | Feb. 2013 | 7 | 7
10.5065/D6NZ85N3 | Juneau Icefield Glacier Mass Balance | ACADIS | April 2013 | 1 | 1
10.5065/D6DF6P6C | Marine Regions Boundary Data for the Bering Sea Shelf and Slope | EOL | May 2013 | 3 | 3
10.5065/D6N014HK | USGS Permafrost data | ACADIS | Oct. 2013 | 1 | 1
Table compiled by querying IDs in Google Scholar, Apr. 10, 2014
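As a small worked example of using these metrics, the sketch below tallies the TOTAL column and checks the one row where all three count columns are filled (Yellowstone). The values are copied from the table; the variable names are hypothetical.

```python
# TOTAL column of the table, keyed by identifier.
totals = {
    "10.5065/D6WD3XH5": 54,          # NCL Software
    "10.5065/D6RN35ST": 16,          # NARCCAP Data
    "ark:/85065/d7wd3xhc": 24,       # Yellowstone supercomputer
    "10.5065/D6CC0XMC": 1,           # Barrow Airborne Sea Ice Data
    "10.5065/D6FF3QBR": 7,           # NCAR Tech Note - APE Atlas
    "10.5065/D6NZ85N3": 1,           # Juneau Icefield Glacier Mass Balance
    "10.5065/D6DF6P6C": 3,           # Marine Regions Boundary Data
    "10.5065/D6N014HK": 1,           # USGS Permafrost data
}

# Yellowstone is the only row with citations, acknowledgements, and both.
yellowstone_counts = {"citations": 3, "acknowledgements": 18, "both": 3}
row_is_consistent = sum(yellowstone_counts.values()) == totals["ark:/85065/d7wd3xhc"]

grand_total = sum(totals.values())
most_mentioned = max(totals, key=totals.get)

print(f"row consistent: {row_is_consistent}")
print(f"mentions across all identifiers: {grand_total}")
print(f"most mentioned identifier: {most_mentioned}")
```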
Professional Societies
• American Meteorological Society
  – Policy Statement on “Full and Open Access to Data” officially adopted by AMS in Dec. 2013. http://www.ametsoc.org/policy/2013fullopenaccessdata_amsstatement.html
  – Data citation recommendations put forth to AMS Publications Commission in April 2014. Currently under consideration.
    1. Add data citation to the author guidelines
    2. Add data citation to the reviewer guidelines
    3. Encourage authors to consider data requirements early
• American Geophysical Union
  – New Publications Data policy as of Dec. 2013
  – “…all data necessary to understand, evaluate, replicate, and build upon the reported research must be made available and accessible whenever possible.” http://publications.agu.org/author-resource-center/publication-policies/data-policy/
Integrating Citation Tools and Repositories
• New tools for software archiving and citation - DOIs for GitHub software repositories
  – Integrating with the Zenodo data repository
  – Zenodo makes an archive copy for every code “release”
https://guides.github.com/activities/citable-code/
Going Forward
• Need to embed digital resource citation within research institutions
  – Norms of practice and symbols of success and importance
  – Intermediaries who provide assistance and expertise
  – Routines for making citations very easy to find, create, and use
  – Standards should be endorsed and used
  – Embedding within stable long-term archives and repositories is essential
Thank You
NCAR Technical Notes:
  – “Data Citations in NCAR” - http://dx.doi.org/10.5065/D6ZC80VN
  – “Bridging Data Lifecycles: Tracking Data Use via Data Citations Workshop Report” - http://dx.doi.org/10.5065/D6PZ56TX
Forthcoming article:
  – Mayernik, M.S., Callaghan, S., Leigh, R., Tedds, J., & Worley, S. (in press). Peer review of data sets: When, why, and how. Bulletin of the American Meteorological Society. http://dx.doi.org/10.1175/BAMS-D-13-00083.1