Metadata for Research Data: How to Understand Your Data and Find it Later
Ayla Stein & William Pooler
Session Overview
• Introductions
• Learning Objectives
• Metadata and its Uses
• The Importance of Metadata
• Metadata Standards and Best Practices
Learning Objectives
1. Define ‘metadata’ and identify examples
2. Express the importance of metadata
3. Identify metadata schemas/standards, and explain reasons to use them
4. Identify data content standards and explain their importance
5. Outline an approach to creating metadata for a project
Module 3: Metadata
What is Metadata?
“Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource. Metadata is often called data about data or information about information” (NISO, Understanding Metadata, 2004, p. 1).
Metadata Helps You:
• find data from other researchers to support your research;
• use the data that you do find;
• help other professionals to find and use data from your research; and
• use your own data in the future when you may have forgotten details of the research.
Metadata benefits
Beginning of Project
Streamlines Access
Living Document
Policy Documentation
Storage and Backup Plan
Sharing & Use of Raw Data
Data Format Types
Metadata benefits cont’d
During Project
Streamlines Access
Communication Cost
Updating Documentation
Data Versioning
Policy Review
Training
Metadata benefits cont’d
Exit Strategy
Streamlines Access
Discoverability
Proper Citation
Archiving
Sharing & Reuse for Publication
Published Data Formats
Data Documentation Levels
• Preliminary Background Information
• Data Collection
• Publication and Sharing
• Preservation and Archiving
Basic Categories of Metadata
• Descriptive
• Structural
• Administrative
  – Technical
  – Preservation
Metadata Schemas
“…sets of metadata elements designed for a specific purpose, such as describing a particular type of information resource” (NISO, Understanding Metadata, 2004, p. 2).
Note on Standards
Some Sample Metadata Standards
• Ecological Metadata Language (EML)
• Data Documentation Initiative (DDI)
• Dublin Core (DC)
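To give a feel for what a schema looks like in practice, here is a minimal sketch of a Dublin Core record built with Python's standard library. The dataset values and the wrapping `metadata` element are placeholders chosen for illustration, not a required layout; only the `dc` element names and namespace come from the Dublin Core element set.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Return a <metadata> element with one dc:* child per (name, value) pair."""
    root = ET.Element("metadata")  # wrapper element name is an assumption
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


# Hypothetical dataset description; every value here is a placeholder.
record = build_dc_record([
    ("title", "Example survey dataset"),
    ("creator", "Doe, Jane"),
    ("date", "2014-11-13"),
    ("format", "text/plain"),
    ("subject", "example"),
])

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

A disciplinary standard such as EML or DDI would use a richer, more specific set of elements, but the idea of named fields with agreed meanings is the same.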
What does metadata look like?
Readme Files
.zip file of .txt file used as reference transcriptome in: Labib Rouhana, Ana P. Vieira, Rachel H. Roberts-Galbraith, and Phillip A. Newmark. (2012). PRMT5 and the role of symmetric dimethylarginine in chromatoid bodies of planarian stem cells. Development.
LEGEND: Supplementary File 1. Reference pooled transcriptome for S. mediterranea. Reference transcriptome generated using default de novo assembly parameters in CLC Genomics Workbench (CLC Bio, Aarhus, Denmark) from the following databases: a S. mediterranea transcript discovery sequencing project (Blythe, et al 2010, PMID: 21179477), 454 sequencing reads from sexual S. mediterranea generated by our laboratory, maker and de_novo gene predictions from the S. mediterranea genome (Robb, et al 2008, PMID: 17881371; Cantarel, et al 2008, PMID: 18025269), and neuropeptide sequences (Collins, et al 2010, PMID: 20967238). 156,959 reads were assembled into 22,120 contigs, with 33,829 sequences not matching. The contigs and nonmatching sequences were joined into one library and renamed numerically in Galaxy (Giardine, et al 2005, PMID: 16169926).
Collecting and Sharing Metadata
• Controlled vocabularies
• Technical standards
Controlled Vocabularies
• Help take the guesswork out of:
  – choosing a preferred spelling;
  – choosing between a scientific and a popular term;
  – determining which synonym to use.
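One way to apply a controlled vocabulary during data entry is to map every accepted variant onto a single preferred term. The sketch below assumes a tiny, made-up vocabulary; the terms, the `VOCABULARY` table, and the `normalize` helper are illustrative only, and a real project would normally take its terms from an established disciplinary vocabulary.

```python
# Hypothetical controlled vocabulary: preferred term -> accepted variants.
VOCABULARY = {
    "Homo sapiens": {"human", "humans", "h. sapiens"},
    "Mus musculus": {"mouse", "house mouse", "m. musculus"},
}


def build_lookup(vocabulary):
    """Map each lower-cased variant (and the preferred term itself) to the preferred term."""
    lookup = {}
    for preferred, variants in vocabulary.items():
        lookup[preferred.lower()] = preferred
        for variant in variants:
            lookup[variant.lower()] = preferred
    return lookup


LOOKUP = build_lookup(VOCABULARY)


def normalize(term):
    """Return the preferred term, or raise ValueError for unknown terms."""
    key = term.strip().lower()
    if key not in LOOKUP:
        raise ValueError(f"{term!r} is not in the controlled vocabulary")
    return LOOKUP[key]


print(normalize(" Human "))
print(normalize("house mouse"))
```

Rejecting unknown terms, rather than storing them silently, is a design choice: it surfaces spelling variants at entry time instead of during later searching.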
Technical Standards
ISO 8601 technical standard:
• Year: YYYY (e.g. 1997)
• Year and month: YYYY-MM (e.g. 1997-07)
• Complete date: YYYY-MM-DD (e.g. 1997-07-16)
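The three date forms above can be produced with Python's standard `datetime` module. This is a small sketch, not a full ISO 8601 implementation; the `to_iso_date` helper and the US-style input string are assumptions made for the example.

```python
from datetime import date, datetime

d = date(1997, 7, 16)

year = d.strftime("%Y")           # year only
year_month = d.strftime("%Y-%m")  # year and month
complete = d.isoformat()          # complete date, YYYY-MM-DD


def to_iso_date(text, fmt):
    """Convert a date written in a known local format to YYYY-MM-DD."""
    return datetime.strptime(text, fmt).date().isoformat()


# Hypothetical input: a US-style date as it might appear in a spreadsheet.
converted = to_iso_date("07/16/1997", "%m/%d/%Y")

print(year, year_month, complete, converted)
```

Dates written this way also sort correctly as plain text, which is part of why the format works well in file names and spreadsheets.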
Content Types
MIME Internet Media types:
• Application
• Audio
• Image
• Model
• Multipart
• Message
• Text
• Video
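A media type is written as a top-level type from the list above plus a subtype, for example `text/plain`. As a rough illustration, Python's standard `mimetypes` module can guess the media type from a file name; the `describe` helper and the file names are hypothetical, and the guess relies on the extension only, not on the file's contents.

```python
import mimetypes


def describe(filename):
    """Guess the media type from the file extension; return None if unknown."""
    media_type, _encoding = mimetypes.guess_type(filename)
    if media_type is None:
        return None
    top_level, subtype = media_type.split("/", 1)
    return {"media_type": media_type, "top_level": top_level, "subtype": subtype}


# Hypothetical file names from a project folder.
for name in ["notes.txt", "figure.png", "mystery.unknownext"]:
    print(name, describe(name))
```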
Data Dictionary
“Documentation of the names of entities used in a software application or database, including in each entry its definition (size and type), where and how it is used, and its relationship to other data” (SAA Glossary).
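A data dictionary can be as simple as a table of field names with their types, sizes, and descriptions. The sketch below keeps one as a Python dictionary and checks a record against it; the field names, limits, and the `check_record` helper are all invented for illustration.

```python
# Hypothetical data dictionary for a small table of field samples.
DATA_DICTIONARY = {
    "sample_id": {
        "type": str,
        "max_length": 8,
        "description": "Unique identifier assigned when the sample is logged",
    },
    "collected_on": {
        "type": str,
        "max_length": 10,
        "description": "Collection date in ISO 8601 form (YYYY-MM-DD)",
    },
    "temperature_c": {
        "type": float,
        "description": "Water temperature in degrees Celsius",
    },
}


def check_record(record, dictionary):
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for name, spec in dictionary.items():
        if name not in record:
            problems.append(f"missing field: {name}")
            continue
        value = record[name]
        if not isinstance(value, spec["type"]):
            problems.append(f"{name}: expected {spec['type'].__name__}")
        elif "max_length" in spec and len(value) > spec["max_length"]:
            problems.append(f"{name}: longer than {spec['max_length']} characters")
    for name in record:
        if name not in dictionary:
            problems.append(f"undocumented field: {name}")
    return problems


good = {"sample_id": "S-0001", "collected_on": "2014-11-13", "temperature_c": 18.5}
bad = {"sample_id": "S-0002", "temperature_c": "warm", "notes": "check later"}

print(check_record(good, DATA_DICTIONARY))
print(check_record(bad, DATA_DICTIONARY))
```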
Approaches to Creating Metadata
• Is there a disciplinary schema I should use?
• Could I use Dublin Core?
• Should I create my own schema?
Tool Examples
• Colectica
• Excel
• RightField
Best Practices
• Consult the Research Data Service!
• Consistent data entry is important – review your work and keep a data dictionary!
• Avoid extraneous punctuation
• Avoid most abbreviations
• Use templates and macros when possible
• Extract pre-existing metadata
• Use an established metadata standard whenever possible!
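As one example of extracting pre-existing metadata, the file system already records a file's name, size, and last-modified time. The sketch below reads those values with Python's standard library and writes the timestamp in ISO 8601 form; the `file_metadata` helper and the temporary `readme.txt` file are assumptions made so the example can run on its own.

```python
import os
import tempfile
from datetime import datetime, timezone


def file_metadata(path):
    """Collect metadata the file system already stores for a file."""
    info = os.stat(path)
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "filename": os.path.basename(path),
        "size_bytes": info.st_size,
        "modified": modified.isoformat(timespec="seconds"),
    }


# Create a throwaway file so the example is self-contained.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "readme.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("example")
    demo = file_metadata(path)

print(demo)
```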
Activity!
• What metadata schema will I use?
  – Disciplinary
  – Dublin Core
  – DIY
Campus Resources
Library Resources:
Research Data Service: http://researchdataservice.illinois.edu/
Scholarly Commons: http://www.library.illinois.edu/sc/index.html
Upcoming Workshops:
Basics of Research Data Publication and Sharing (2014-11-13T13:00/13:50-06:00)
November 13, 2014, 1:00pm – 1:50pm
Main Library, Room 314
All Data Management Workshops: http://researchdataservice.illinois.edu/workshops/
All Savvy Researcher Workshops: http://illinois.edu/calendar/list/4068
Thank You!
• Please fill out the one-minute feedback paper!
Sources
Silver, C. (2014, October 4). “Combining and Converting Qualitative and Quantitative Data in CAQDAS Packages: an Aid to Mixed Method Research.” Retrieved from http://www.surrey.ac.uk/sociology/research/researchcentres/caqdas/support/analytictasks/combining_and_converting_qualitative_and_quantitative_data_in_caqdas_packages_an_aid_to_mixed_method_research.htm
Digital Curation Centre’s Disciplinary Metadata resource. http://www.dcc.ac.uk/resources/metadata-standards .
Hogrefe, K., Stocks, K. 2011. "The Importance of Metadata Standards." In The MMI Guides: Navigating the World of Marine Metadata. http://marinemetadata.org/guides/mdatastandards/stdimportance. Accessed March 22, 2013.
Internet media type. (2014, September 6). In Wikipedia, the free encyclopedia. Retrieved from http://en.wikipedia.org/w/index.php?title=Internet_media_type&oldid=624395435
Lamar Soutter Library University of Massachusetts Medical School. “New England Collaborative Data Management Curriculum.” http://library.umassmed.edu/necdmc.
Media Types. (2014, September 8). Internet Assigned Numbers Authority. Retrieved September 8, 2014, from http://www.iana.org/assignments/media-types/media-types.xhtml
Metadata. (2014, August 29). In Wikipedia, the free encyclopedia. Retrieved from http://en.wikipedia.org/w/index.php?title=Metadata&oldid=622031523
Sources, Continued
Metadata, structural. Federal Agencies Digitization Guidelines Initiative Glossary. Retrieved from http://www.digitizationguidelines.gov/term.php?term=metadatastructural
Munroe, R. xkcd: ISO 8601. Retrieved from https://xkcd.com/1179/
Munroe, R. xkcd: Standards. Retrieved from http://xkcd.com/927/
National Information Standards Organization (NISO). 2004. Understanding Metadata. http://www.niso.org/publications/press/UnderstandingMetadata.pdf
Neiswender, C. 2010. "Introduction to Metadata." In The MMI Guides: Navigating the World of Marine Metadata. http://marinemetadata.org/guides/mdataintro. Accessed April 1, 2013.
Miller, Steven J. 2011. Metadata Resources: Selected Reference Documents, Web Sites, and Readings: https://pantherfile.uwm.edu/mll/www/resource.html
Pearce-Moses, R. (2005). data dictionary. Glossary of Archival and Records Terminology. Society of American Archivists. Retrieved from http://www2.archivists.org/glossary/terms/d/data-dictionary