IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles
TRANSCRIPT
![Page 1: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/1.jpg)
Tracking Data Reuse Motivations, Methods, and Obstacles
Heather Piwowar
DataONE postdoc with NESCent and Dryad
@researchremix
IASSIST2011 #iassist
![Page 2: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/2.jpg)
http://www.metmuseum.org/toah/ho/09/euwf/ho_24.45.1.htm
![Page 3: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/3.jpg)
http://www.flickr.com/photos/jsmjr/62443357/
![Page 4: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/4.jpg)
http://www.flickr.com/photos/camilleharrington/3587294608/
![Page 5: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/5.jpg)
http://www.flickr.com/photos/rkuhnau/3318245976/
![Page 6: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/6.jpg)
http://www.flickr.com/photos/conformpdx/1796399674/
![Page 7: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/7.jpg)
http://www.flickr.com/photos/rkuhnau/3317418699/
![Page 8: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/8.jpg)
http://www.flickr.com/photos/zemlinki/261617721/
![Page 9: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/9.jpg)
http://www.flickr.com/photos/tracenmatt/3020786491/
![Page 10: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/10.jpg)
http://www.flickr.com/photos/the-o/2078239333/
![Page 11: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/11.jpg)
http://www.flickr.com/photos/ryanr/142455033/
?
![Page 12: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/12.jpg)
http://upload.wikimedia.org/wikipedia/commons/thumb/e/e6/Gamma_distribution_pdf.svg/500px-Gamma_distribution_pdf.svg.png
![Page 13: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/13.jpg)
http://www.flickr.com/photos/archeon/2941655917/
![Page 14: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/14.jpg)
![Page 15: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/15.jpg)
![Page 16: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/16.jpg)
In 2009, 116 articles cited ORNL DAAC data.
Finding these articles took 70-80 hours
across at least 12 resources
all chosen from a deep understanding of this specific research domain
then the full text of all the hits was manually reviewed
Valerie Enriquez interview with James Kidder
http://openwetware.org/wiki/DataONE:Notebook/Reuse_of_repository_data
![Page 17: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/17.jpg)
publicly archived dataset
dataset has an identifier? (DOI, URL, accession #)
IDs are difficult to unambiguously identify in full text unless they have a unique pattern (DOI) or unusual prefix or suffix.
search in full text of all papers
search in reference sections of all papers
sort hits to disambiguate reuse from submission
dataset submission record mentions data collection article publication?
gather papers that cite the data collection paper
sort hits to disambiguate reuse from other citation contexts
dataset submission record has submitter name or dataset title?
with dataset unique ID
with (submitter surname AND repository name), and also (dataset title AND repository name)
with (first author surname AND repository name)
with dataset unique ID
DOI/ID search not supported by ISI Web of Science or Scopus
DOI/ID search works in Google Scholar, but scope is poorly defined, results are messy.
This citation pattern (dataset DOI/ID in references section) is used almost exclusively for dataset reuse. Manual disambiguation not required: can be automated pending API support.
Disambiguation is time consuming: most citations are not in the context of reuse
Requires access to full text of search hits for sorting
This flow still misses attributions embedded in supplementary information, reuses attributed through a query description, etc.
Disambiguation is time consuming
Requires access to full text of search hits for sorting
Only finds citations indexed by citation databases
DOI/ID reference search possible in full-text portals like PubMed Central and HighWire Press, however portal coverage is limited and search is not restricted to references section.
Citation history export is time consuming: automation not supported.
This citation pattern (citation to data creation paper) is very common in some subdisciplines, so probably finds most reuses.
This citation pattern (accession numbers in full text) is very common in some subdisciplines, so probably finds most reuses.
Requires ability to query full text across all literature that may contain reuse
Link to data collection paper often missing from dataset submission record, especially when dataset submission predates article publication.
Does not require access to full text
How to identify Dataset Reuse in the published literature
Names and titles are messy identifiers
Heather Piwowar, v1.0, CC-BY
This citation pattern is currently rare
This citation pattern is difficult to track with existing tool limitations
with data collection article’s journal, volume, page, etc.
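The query branches on this slide can be sketched in Python. This is only an illustration of how the three full-text query forms named above (unique ID, submitter surname AND repository name, dataset title AND repository name) might be assembled; the `DatasetRecord` class, its field names, the `build_queries` helper, and the example values are all hypothetical and do not come from the talk.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetRecord:
    """Hypothetical submission record; the field names are assumptions."""
    repository: str
    unique_id: Optional[str] = None          # DOI, URL, or accession number
    submitter_surname: Optional[str] = None
    title: Optional[str] = None


def build_queries(record: DatasetRecord) -> List[str]:
    """Return one full-text query string per branch that the record supports."""
    queries = []
    if record.unique_id:
        # Branch: search with the dataset unique ID.
        queries.append(f'"{record.unique_id}"')
    if record.submitter_surname:
        # Branch: submitter surname AND repository name.
        queries.append(f'"{record.submitter_surname}" AND "{record.repository}"')
    if record.title:
        # Branch: dataset title AND repository name.
        queries.append(f'"{record.title}" AND "{record.repository}"')
    return queries


# Made-up record, for illustration only.
example = DatasetRecord(
    repository="Dryad",
    unique_id="10.5061/dryad.example",
    submitter_surname="Smith",
    title="Example dataset title",
)
for query in build_queries(example):
    print(query)
```

The hits returned by any of these queries would still need the manual sorting step described on the slide to separate reuse from submission and other citation contexts.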
![Page 18: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/18.jpg)
![Page 19: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/19.jpg)
10 * 100 = 1000
![Page 20: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/20.jpg)
publication-based datasets
![Page 21: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/21.jpg)
deposited in 2005
![Page 22: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/22.jpg)
![Page 23: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/23.jpg)
![Page 24: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/24.jpg)
1. following citations to the paper that describes the data collection, then filtering.
![Page 25: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/25.jpg)
![Page 26: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/26.jpg)
2. searching for accession numbers, urls, and DOIs in full text
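A minimal sketch of what this kind of full-text search could look like once article text is in hand. The DOI pattern is a commonly used approximation rather than a complete grammar, and the `GSE` accession prefix is just one illustrative example of an identifier with an unusual prefix; the function name and the sample sentence are made up for this example.

```python
import re

# Approximate DOI pattern: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[-._;()/:A-Za-z0-9]+')

# Illustrative accession pattern; other repositories would need their own prefix.
ACCESSION_PATTERN = re.compile(r'\bGSE\d+\b')


def find_identifiers(text):
    """Return candidate DOIs and accession numbers found in a block of text."""
    # Strip trailing sentence punctuation that the DOI pattern tends to swallow.
    # This can also trim a closing parenthesis that genuinely belongs to a DOI.
    dois = [match.rstrip('.,;)') for match in DOI_PATTERN.findall(text)]
    accessions = ACCESSION_PATTERN.findall(text)
    return {"dois": dois, "accessions": accessions}


# Made-up sentence, for illustration only.
sample = "Data were obtained from GEO (GSE12345) and from doi:10.5061/dryad.1234."
print(find_identifiers(sample))
```

A match only shows that the identifier appears in the text; it does not by itself distinguish reuse from submission.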
![Page 27: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/27.jpg)
![Page 28: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/28.jpg)
http://api.plos.org/2011/05/31/announcing_the_plos_search_api/
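As a rough sketch of how a search API could automate the identifier lookup, the snippet below only builds a query URL and sends nothing over the network. The endpoint `http://api.plos.org/search`, the Solr-style parameters (`q`, `fl`, `wt`, `rows`), the `everything` field, and the optional `api_key` parameter are assumptions about the PLoS Search API and should be checked against its documentation; the helper name and the accession number are made up.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Assumed endpoint; verify against the PLoS Search API documentation.
PLOS_SEARCH_ENDPOINT = "http://api.plos.org/search"


def build_plos_query_url(identifier, rows=10, api_key=None):
    """Build a search URL that looks for an identifier in article full text."""
    params = {
        "q": f'everything:"{identifier}"',  # assumed full-text field name
        "fl": "id,title",                   # fields to return
        "wt": "json",                       # response format
        "rows": rows,                       # maximum number of hits
    }
    if api_key:
        params["api_key"] = api_key         # assumed parameter name
    return f"{PLOS_SEARCH_ENDPOINT}?{urlencode(params)}"


# Made-up accession number, for illustration only.
url = build_plos_query_url("GSE12345")
print(url)
```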
![Page 29: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/29.jpg)
2005 long time ago
biomedicine familiar, also very dominant
search interfaces not well designed for this task
helpdesks are very helpful
![Page 30: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/30.jpg)
stay tuned for results
poster at ASIS&T, SIGUSE
![Page 31: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/31.jpg)
I post my data, code, and statistical scripts: http://researchremix.org
Share yours too!
-> Open Notebook Science
http://www.flickr.com/photos/myklroventine/892446624/
![Page 32: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/32.jpg)
https://notebooks.dataone.org/tracking1000datasets/
![Page 33: IASSIST 2011 presentation: Tracking Data Reuse Motivations, Methods, and Obstacles](https://reader033.vdocuments.mx/reader033/viewer/2022051816/5455b9a6af79592b448b4a9c/html5/thumbnails/33.jpg)
thank you
Todd Vision,
Estephanie Sta Maria
Jonathan Carlson
Dryad and DataONE teams
The open science online community and those who release their articles, datasets and photos openly