PROMOTING SCHOLARSHIP THROUGH OPEN DATA
What does it take to add data to my repository?
Ryan Scherle
Open Repositories 2015
The rise of data at OR
[Chart: values by year, 2006 to 2015, on a vertical axis running from 0 to 40.]
What’s all the fuss about?
• DSpace used to be a place to put articles, but now many institutions are adding data.
• Researchers are getting pressure to archive it.
• Are the IRs ready?
Can’t I just put it in my IR?
Maybe. What was your IR designed to hold?
Data archiving landscape
[Figure: repositories plotted on two axes, datatype focus and community focus, each running from general to focused. Repositories shown: Figshare, Institutional Repository, Supplemental Materials, Genbank, Pangaea, Zenodo, Lab Database, Dryad.]
Policy issues for data
• Determining what to accept
• Determining the correct granularity of data to archive
• Licensing issues
• Software associated with data
• Storing directly vs. referencing external repositories
Determining what to accept
Determining the granularity of data to archive
Licensing issues
Software associated with data
Storing directly vs. external references
Ensuring data is ready for ingest
• Educating users
• Viruses
• Human subjects and other sensitive data
• Copyright notices
• Recommended file formats
• File names
• Is metadata adequate to find it? Even outside the IR?
• Is metadata adequate to understand it?
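Two of the ingest checks listed above, recommended file formats and file names, lend themselves to automation. The following is a minimal sketch, assuming a repository-defined list of recommended extensions and a conservative file-name pattern. `RECOMMENDED_EXTENSIONS`, `SAFE_NAME`, and `check_file` are hypothetical names chosen for illustration; they are not taken from the talk or from any repository platform.

```python
import re
from pathlib import PurePath
from typing import List

# Hypothetical policy values, chosen only for illustration.
RECOMMENDED_EXTENSIONS = {".csv", ".txt", ".json", ".xml", ".pdf"}
# Letters, digits, dot, underscore, and hyphen only: no spaces or punctuation.
SAFE_NAME = re.compile(r"^[A-Za-z0-9._-]+$")


def check_file(name: str) -> List[str]:
    """Return human-readable warnings for one submitted file name."""
    warnings = []
    path = PurePath(name)
    if not SAFE_NAME.match(path.name):
        warnings.append("file name contains spaces or special characters")
    if path.suffix.lower() not in RECOMMENDED_EXTENSIONS:
        shown = path.suffix or "(none)"
        warnings.append("format " + shown + " is not on the recommended list")
    return warnings


if __name__ == "__main__":
    for candidate in ["field_sites_2014.csv", "final data (2).xlsx"]:
        print(candidate, check_file(candidate))
```

A check like this can only flag likely problems for a curator or depositor to review; it says nothing about whether the metadata is adequate or whether the content is sensitive.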
Educating users
Viruses
Human subjects and other sensitive data
Copyright notices
File formats
File names
Ryan Scherle 2012 NESCent Winter Wallop
Is metadata adequate to find it? Outside the IR?
Is metadata adequate to understand it?
Technical capabilities of the repository
• Version control and maintaining citability
• Transferring large files
• Maintaining storage for large content
• APIs for data access by external tools
• Relationship management for data, software, and documentation
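For the large-file items in the list above, one common safeguard is to compare a checksum computed by the depositor with one computed after upload. The sketch below assumes SHA-256 and reads the file in chunks so that it never has to fit in memory. The function names `sha256_of_file` and `transfer_is_intact` are hypothetical and are not drawn from the talk.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_is_intact(path, expected_hex):
    """Return True if the file at path matches the depositor's checksum."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

The same digest could also be stored with the item so that later fixity checks can detect silent corruption in long-term storage.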
Version control and maintaining citability
Transferring and managing large files
A little example
APIs for data access by external tools
searchForStuff() giveMeAccess()
getTheThing() showMeTheMetadata()
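The four placeholder calls above suggest the operations an external tool needs: search, access checks, retrieval, and metadata lookup. As a rough illustration only, here is a toy in-memory sketch of such an interface. The class `ToyRepository`, its methods, and the `embargoed` flag are all hypothetical; this is not the API of Dryad, DSpace, or any other real repository.

```python
from dataclasses import dataclass


@dataclass
class Item:
    identifier: str
    metadata: dict
    content: bytes
    embargoed: bool = False  # hypothetical access flag


class ToyRepository:
    """In-memory stand-in for a repository's data access API."""

    def __init__(self, items):
        self._items = {item.identifier: item for item in items}

    def search(self, term):
        """Return identifiers whose metadata values contain the term."""
        term = term.lower()
        return sorted(
            item.identifier
            for item in self._items.values()
            if any(term in str(value).lower() for value in item.metadata.values())
        )

    def show_metadata(self, identifier):
        """Return a copy of the item's metadata."""
        return dict(self._items[identifier].metadata)

    def can_access(self, identifier):
        """Report whether the content may be downloaded."""
        return not self._items[identifier].embargoed

    def get(self, identifier):
        """Return the content, or raise if the item is embargoed."""
        if not self.can_access(identifier):
            raise PermissionError(identifier + " is embargoed")
        return self._items[identifier].content
```

In this sketch, metadata stays readable even when the content is embargoed, which is one possible design choice rather than a requirement.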
Relationships for data, software, documentation
Thanks!
Ryan Scherle
[email protected]
http://datadryad.org
http://datadryad.org/pages/faq
http://wiki.datadryad.org
Photo credits
• Cry baby, TheGiantVermin, https://flic.kr/p/u7t5Q
• square-peg-round-hole-21, Yoel Ben-Avraham, https://flic.kr/p/6pmtQL
• Easy!, Robert Hruzek, https://flic.kr/p/bF5tB6
• Lex Macho Inc., Dan DeChiaro, https://flic.kr/p/7tRkDE
• Current county courthouse, Snohomish County, https://flic.kr/p/i5GdBg
• Linking Open Data cloud diagram 2014, by Max Schmachtenberg, Christian Bizer, Anja Jentzsch and Richard Cyganiak. http://lod-cloud.net
• The Web is Agreement, Paul Downey, https://flic.kr/p/3KyJpy
• confused Bush, a-birdie, https://flic.kr/p/2BK1nH
• 40+296 Hello?, bark, https://flic.kr/p/8uh4rn
• Keep Searching, Margot Trudell, https://flic.kr/p/aSUSGM
• I SHOOT RAW!!, Jared Polin, https://flic.kr/p/8iQNV9
• Copyright reasons, gaelx, https://flic.kr/p/bx59Gn
• Diagnosing the dummy, Ano Lobb, https://flic.kr/p/83Rkit
• Bio_Hazard_Virus_Matrix_by_Robbert_van_der_Steeg, Robbert van der Steeg, https://flic.kr/p/dBKoa7
• Trashing old software, jm3 on Flickr, https://flic.kr/p/vb8QT
• Mount Airy Arrow, Zyada Follow, https://flic.kr/p/3enSSY
• Old License plate Map, Josh Kellogg, https://flic.kr/p/CYHv9
• rocks, erika dot net, https://flic.kr/p/3npF8k
• Reject or Accept?, Simon Lieschke, https://flic.kr/p/pipBB