ORNL DAAC: Metrics of Data Use and Citation
Robert Cook, DAAC Scientist, Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, TN
Data Curation Education in Research Centers Workshop, Boulder, CO, June 5-7, 2012
ORNL DAAC
• Archive data products produced by projects within NASA’s Terrestrial Ecology Program
• Mission:
  – assemble, distribute, and provide data services for a comprehensive archive of terrestrial biogeochemistry and ecological dynamics observations and models to facilitate research, education, and decision-making in support of NASA’s Earth science.
ORNL DAAC: daac.ornl.gov
ORNL DAAC: Data Publications
Total Data Sets = 969
1. Field Campaigns (738)
• FIFE • OTTER • SNF • BOREAS • LBA
2. Validation of Land Products (23)
• Land Validation • MODIS Subsets • FLUXNET • NPP • BigFoot
3. Regional and Global Studies (198)
• Climate • Soils • Vegetation • Hydroclimatology • Daymet
4. Model Products (10)
• Benchmark Models: IBIS, BIOME-BGC, LSM
• Manuscript Models: PnET, Century, Biome-BGC
[Figure labels: BOREAS, LBA, S2K; In-situ Observations, Remote Sensing; LAI/fPAR, NPP]
Data Center Activities
Acquiring Data
• Training in Data Management
• Enhanced metadata entry tools
Discover, access, extract, and analyze
• Data Discovery and Access Tools
• Data Exploration Tools
  – Visualize
  – Subset
  – Analyze
  – Integrate
• Spatial data analysis
Enable collectors and users to spend more time analyzing the data and less time doing data management
Motivation for Data Citations
• Determine impact of the ORNL DAAC on science
• Metric: Use citation metrics to indicate how many DAAC data sets have been used in peer-reviewed papers, dissertations, or policy reports
  – Useful to NASA and the ORNL DAAC
Benefits of Citations
• Provide basis for increased incentives, recognition, and rewards for data activities
• Online digital data enables testing of published hypotheses
  – Peer-examination and review of conclusions or analysis based on experimental or observational data
• Online digital data allows users to make new uses of the data, either in isolation or in combination with other data sets.
Scientific Impact of a Data Center
Citations to data sets indicate their reuse
Strack, J.E., G.E. Liston, and R.A. Pielke. 2004. Modeling snow depth for improved simulation of snow-vegetation-atmosphere interactions. Journal of Hydrometeorology 5:723-734.
Elements of a data product citation
• Authors
• Year of publication
• Data product title
• Data center and URL
• Digital Object Identifiers (after 2008)
• Date accessed / version number
Examples:
Turner, D.P., W.D. Ritts, and M. Gregory. 2006. BigFoot NPP Surfaces for North and South American Sites, 2002-2004. Data set. Available on-line [http://daac.ornl.gov] from Oak Ridge National Laboratory Distributed Active Archive Center, Oak Ridge, Tennessee, U.S.A. doi:10.3334/ORNLDAAC/750.
Strahler, A.H., C.L.B. Schaaf, and E. Tsvetsinskaya. 2009. ISLSCP II AVHRR Albedo and BRDF, 1995. In Hall, F.G., G. Collatz, B. Meeson, S. Los, E. Brown de Colstoun, and D. Landis (eds.). ISLSCP Initiative II Collection. Data set. Available on-line [http://daac.ornl.gov/] from Oak Ridge National Laboratory Distributed Active Archive Center, Oak Ridge, Tennessee, U.S.A. doi:10.3334/ORNLDAAC/928.
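The citation elements listed on this slide can be assembled mechanically. The following is a minimal sketch, not the ORNL DAAC's own tooling: the `DataCitation` class and its field names are hypothetical, and the output template simply mirrors the wording of the BigFoot example citation.

```python
from dataclasses import dataclass


@dataclass
class DataCitation:
    """Hypothetical container for the citation elements named on the slide."""
    authors: str
    year: int
    title: str
    doi: str
    url: str = "http://daac.ornl.gov"  # data center URL, as in the slide example

    def format(self) -> str:
        # Template copied from the wording of the example citations above.
        return (
            f"{self.authors}. {self.year}. {self.title}. Data set. "
            f"Available on-line [{self.url}] from Oak Ridge National Laboratory "
            "Distributed Active Archive Center, Oak Ridge, Tennessee, U.S.A. "
            f"doi:{self.doi}."
        )


bigfoot = DataCitation(
    authors="Turner, D.P., W.D. Ritts, and M. Gregory",
    year=2006,
    title="BigFoot NPP Surfaces for North and South American Sites, 2002-2004",
    doi="10.3334/ORNLDAAC/750",
)
print(bigfoot.format())
```

The sketch omits the access date and version number; those could be added as optional fields if a journal's style requires them.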
Characteristics of an identifier (DOI)
• Persistent
  – Registered with The DOI System http://dx.doi.org/ through DataCite
• Actionable
  – http://dx.doi.org/10.3334/ORNLDAAC/1086
• Specific
  – Links to the data set
• Complete
  – Links to data and the information needed to understand and use the data
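What "actionable" means can be shown in a few lines: a DOI becomes a resolvable link by prepending the resolver address given on the slide. This is a sketch under stated assumptions; the helper names `split_doi` and `doi_to_url` are hypothetical, and the regular expression is a loose check rather than a full DOI syntax validator.

```python
import re

# Loose shape check: "10.<registrant digits>/<suffix>". An assumption for this
# sketch, not the official DOI syntax definition.
DOI_PATTERN = re.compile(r"^(10\.\d{4,9})/(\S+)$")
RESOLVER = "http://dx.doi.org/"  # resolver address shown on the slide


def split_doi(doi: str) -> tuple[str, str]:
    """Return (prefix, suffix), accepting an optional leading 'doi:'."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:]
    match = DOI_PATTERN.match(doi)
    if match is None:
        raise ValueError(f"not a recognizable DOI: {doi!r}")
    return match.group(1), match.group(2)


def doi_to_url(doi: str) -> str:
    """Build the actionable link for a DOI."""
    prefix, suffix = split_doi(doi)
    return f"{RESOLVER}{prefix}/{suffix}"


print(doi_to_url("doi:10.3334/ORNLDAAC/1086"))
```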
Digital Object Identifier for Data Sets
• DOI would allow proper citation of data sets
  – Legitimizes on-line data products for researchers and journal editors
• DOI as an invariant reference to a data set
  – Facilitate finding data for readers (Scholarship)
  – Easier to find journal articles that cite data
When releasing a data set, the ORNL DAAC:
• Announces the data publication to the community
• Sends letter to authors congratulating them on the data publication
• Asks authors to add data publication to their CV
  – Shows productivity (promotion, tenure, annual evaluations)
  – Demonstrates compliance with Agency policies for data management
Estimating Scientific Impact: Steps Taken
• Ask users for reprints of publications based on data sets obtained from the ORNL DAAC
– When they order the data and one year after they ordered data
• Review on-line services (e.g., Web of Science, Elsevier’s ScienceDirect) to see if the ORNL DAAC or DAAC projects are mentioned or if data have been cited in papers
– full text search, citation indices
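A full-text review of this kind could be partly automated. The sketch below is an illustration only, not the procedure the ORNL DAAC used. It assumes that "cited" means the text contains an ORNL DAAC data set DOI and "referred" means the text only mentions the DAAC or one of its projects; the slides do not define these terms. The `classify` function and the short `MENTIONS` list are hypothetical.

```python
import re

# ORNL DAAC data set DOIs follow the form shown in the example citations.
DATA_DOI = re.compile(r"10\.3334/ORNLDAAC/\d+", re.IGNORECASE)

# Illustrative subset of names to look for; a real list would be longer.
MENTIONS = ("ORNL DAAC", "BOREAS", "FLUXNET", "FIFE")


def classify(text: str) -> tuple[str, list[str]]:
    """Label a paper's full text as 'cited', 'referred', or 'none'.

    Assumed definitions (not from the slides): 'cited' = a data set DOI
    appears; 'referred' = only the DAAC or a project name appears.
    """
    dois = sorted({m.upper() for m in DATA_DOI.findall(text)})
    if dois:
        return "cited", dois
    lowered = text.lower()
    if any(name.lower() in lowered for name in MENTIONS):
        return "referred", []
    return "none", []


print(classify("NPP surfaces were obtained from doi:10.3334/ORNLDAAC/750."))
print(classify("We used BOREAS tower flux data."))
```

Plain substring matching will produce false positives for short project names, so results from a scan like this would still need manual review.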
Data Used in the Peer-reviewed Literature: By Project (1994 – 2012)
Project                            Cited   Referred   Total
BOREAS                                91         80     171
FLUXNET                               31        121     152
SOIL COLLECTIONS                     112         16     128
NET PRIMARY PRODUCTIVITY (NPP)        78         27     105
MODIS LAND PRODUCTS SUBSETS           40         65     105
VEGETATION COLLECTIONS                66         18      84
SAFARI 2000                           38         25      63
FIFE                                   5         53      58
CLIMATE COLLECTIONS                   45         10      55
RIVER DISCHARGE (RIVDIS)              22          9      31
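The per-project counts above can be summarized with a short script. This sketch covers only the ten projects listed on the slide, so the sums are not totals for the whole archive. The "cited share" figure is an illustrative derived metric, not one reported in the presentation.

```python
# (cited, referred) counts per project, copied from the table above.
usage = {
    "BOREAS": (91, 80),
    "FLUXNET": (31, 121),
    "SOIL COLLECTIONS": (112, 16),
    "NET PRIMARY PRODUCTIVITY (NPP)": (78, 27),
    "MODIS LAND PRODUCTS SUBSETS": (40, 65),
    "VEGETATION COLLECTIONS": (66, 18),
    "SAFARI 2000": (38, 25),
    "FIFE": (5, 53),
    "CLIMATE COLLECTIONS": (45, 10),
    "RIVER DISCHARGE (RIVDIS)": (22, 9),
}


def cited_share(cited: int, referred: int) -> float:
    """Fraction of a project's papers that formally cite the data."""
    return cited / (cited + referred)


total_cited = sum(cited for cited, _ in usage.values())
total_referred = sum(referred for _, referred in usage.values())

# List projects from highest to lowest share of formal citations.
for project, (cited, referred) in sorted(
    usage.items(), key=lambda item: cited_share(*item[1]), reverse=True
):
    print(f"{project:32s} {cited + referred:4d} papers, "
          f"{cited_share(cited, referred):.0%} cited")

print(f"Listed projects: {total_cited} cited, {total_referred} referred")
```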
ORNL DAAC Data Used in the Peer-reviewed Literature: By Journal (1994 – 2012)
Top 10 Journals
Impact Factor   Journal                                        Cited   Referred   Total
4.6             Remote Sensing of Environment                     43         63     106
4.0             Agricultural and Forest Meteorology               26         28      54
6.3             Global Change Biology                             19         33      52
3.3             Journal of Geophysical Research                   14         35      49
3.3             Journal of Geophysical Research-Atmospheres       13         26      39
5.3             Global Biogeochemical Cycles                      17          6      23
1.2             International Journal of Remote Sensing            9         12      21
4.1             Biogeosciences                                    15          5      20
2.5             Forest Ecology and Management                     16          2      18
2.4             Ecological Modelling                               8          8      16
Estimating Scientific Impact: Steps Taken
• Need a system to track the usage and citation of data sets using automated systems similar to those used for traditional publications
• Beginning to work with publishing groups
  – Thomson Reuters: Web of Science and Web of Knowledge
  – Elsevier’s ScienceDirect
  – John Wiley
Challenges
• Identifiers for subsets of data products
  – we lack the necessary constructs and conventions for referring to portions of a data product
• Tracking data usage, citations, and citation index
Future readers of on-line journal articles
• View data, using the data citation
• View how data were analyzed
  – Scripts or scientific workflow
• View steps used to prepare data for figures
• Use own process to sort, combine, analyze data, and replot
• Confirm the results of the paper
• Test new hypotheses
Web Resources
• ORNL DAAC http://daac.ornl.gov
• Search Tool / Metadata Clearinghouse http://mercury.ornl.gov/ornldaac/index.jsp