Metadata for Research Data: How to Understand Your Data and Find it Later Ayla Stein & William Pooler




Session Overview

• Introductions
• Learning Objectives
• Metadata and its Uses
• The Importance of Metadata
• Metadata Standards and Best Practices


Learning Objectives

1. Define ‘metadata’ and identify examples
2. Express the importance of metadata
3. Identify metadata schemas/standards, and explain reasons to use them
4. Identify data content standards and explain their importance
5. Outline an approach to creating metadata for a project

Module 3: Metadata


What is Metadata?

“Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource. Metadata is often called data about data or information about information” (NISO, Understanding Metadata 2004;1).


Metadata Helps You:

• find data from other researchers to support your research;
• use the data that you do find;
• help other professionals to find and use data from your research; and
• use your own data in the future when you may have forgotten details of the research.


Metadata benefits

Beginning of Project

• Streamlines Access
• Living Document
• Policy Documentation
• Storage and Backup Plan
• Sharing & Use of Raw Data
• Data Format Types


Metadata benefits cont’d

During Project

• Streamlines Access
• Communication Cost
• Updating Documentation
• Data Versioning
• Policy Review
• Training


Metadata benefits cont’d

Exit Strategy

• Streamlines Access
• Discoverability
• Proper Citation
• Archiving
• Sharing & Reuse for Publication
• Published Data Formats


Data Documentation Levels

• Preliminary Background Information
• Data Collection
• Publication and Sharing
• Preservation and Archiving


Basic Categories of Metadata

• Descriptive
• Structural
• Administrative
  – Technical
  – Preservation


Metadata Schemas

“…sets of metadata elements designed for a specific purpose, such as describing a particular type of information resource” (NISO, Understanding Metadata 2004; pg. 2).


Note on Standards


Some Sample Metadata Standards

• Ecological Metadata Language (EML)

• Data Documentation Initiative (DDI)

• Dublin Core (DC)


What does metadata look like?
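
One possible answer, as a minimal sketch rather than a prescribed format: a simple Dublin Core record serialized as XML. The `dc` namespace URI is the standard Dublin Core elements namespace; the `metadata` wrapper element, the helper name `dublin_core_record`, and the choice of fields are assumptions made for illustration, using this presentation's own title and authors as the described resource.

```python
import xml.etree.ElementTree as ET

# Namespace of the 15-element Dublin Core Metadata Element Set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build an XML element holding one dc:* child per (element, value) pair."""
    root = ET.Element("metadata")  # wrapper name is an assumption, not part of Dublin Core
    for element, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        child.text = value
    return root


# Illustrative values only; repeated elements (two creators) are allowed.
record = dublin_core_record([
    ("title", "Metadata for Research Data: How to Understand Your Data and Find it Later"),
    ("creator", "Stein, Ayla"),
    ("creator", "Pooler, William"),
    ("subject", "metadata"),
])

if hasattr(ET, "indent"):  # pretty-printing is available in Python 3.9+
    ET.indent(record)
print(ET.tostring(record, encoding="unicode"))
```

Disciplinary standards such as EML or DDI define many more elements than this; the sketch only shows the general shape of a structured record.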


Readme Files

.zip file of .txt file used as reference transcriptome in: Labib Rouhana, Ana P. Vieira, Rachel H. Roberts-Galbraith, and Phillip A. Newmark. (2012). PRMT5 and the role of symmetric dimethylarginine in chromatoid bodies of planarian stem cells. Development.

LEGEND: Supplementary File 1. Reference pooled transcriptome for S. mediterranea. Reference transcriptome generated using default de novo assembly parameters in CLC Genomics Workbench (CLC Bio, Aarhus, Denmark) from the following databases: a S. mediterranea transcript discovery sequencing project (Blythe, et al 2010, PMID: 21179477), 454 sequencing reads from sexual S. mediterranea generated by our laboratory, maker and de_novo gene predictions from the S. mediterranea genome (Robb, et al 2008, PMID: 17881371; Cantarel, et al 2008, PMID: 18025269), and neuropeptide sequences (Collins, et al 2010, PMID: 20967238). 156,959 reads were assembled into 22,120 contigs, with 33,829 sequences not matching. The contigs and nonmatching sequences were joined into one library and renamed numerically in Galaxy (Giardine, et al 2005, PMID: 16169926).


Collecting and Sharing Metadata

• Controlled vocabularies

• Technical standards


Controlled Vocabularies

Help take the guesswork out of:

• choosing a preferred spelling;
• choosing between a scientific or popular term; and
• determining which synonym to use.
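
As a rough sketch of how a controlled vocabulary removes that guesswork during data entry, the snippet below maps spelling variants and synonyms to a single preferred term. The vocabulary contents and the names `VOCABULARY`, `LOOKUP`, and `normalize` are hypothetical; a real project would normally draw on a published vocabulary for its discipline.

```python
# Hypothetical vocabulary: preferred term -> accepted variants and synonyms.
VOCABULARY = {
    "Schmidtea mediterranea": ["S. mediterranea"],
    "color": ["colour"],
}

# Case-insensitive lookup from every variant (and the preferred term itself)
# to the preferred term.
LOOKUP = {}
for preferred, variants in VOCABULARY.items():
    LOOKUP[preferred.lower()] = preferred
    for variant in variants:
        LOOKUP[variant.lower()] = preferred


def normalize(term):
    """Return the preferred form of a term, or raise KeyError if it is not in the vocabulary."""
    key = " ".join(term.split()).lower()  # collapse whitespace, ignore case
    if key not in LOOKUP:
        raise KeyError(f"term not in controlled vocabulary: {term!r}")
    return LOOKUP[key]


print(normalize("s. mediterranea"))
print(normalize("Colour"))
```

Rejecting unknown terms, instead of passing them through, is a design choice here: it surfaces inconsistent entries at the moment they are typed.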


Technical Standards

ISO 8601 technical standard:

• Year: YYYY (e.g. 1997)
• Year and month: YYYY-MM (e.g. 1997-07)
• Complete date: YYYY-MM-DD (e.g. 1997-07-16)
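
The three levels of precision above can be produced with Python's standard library. This is a minimal sketch; the helper name `iso_date` is an assumption introduced for illustration.

```python
from datetime import date


def iso_date(year, month=None, day=None):
    """Format a date as YYYY, YYYY-MM, or YYYY-MM-DD depending on what is known."""
    if month is None:
        return f"{year:04d}"
    if day is None:
        date(year, month, 1)  # raises ValueError if the month is invalid
        return f"{year:04d}-{month:02d}"
    return date(year, month, day).isoformat()


print(iso_date(1997))         # 1997
print(iso_date(1997, 7))      # 1997-07
print(iso_date(1997, 7, 16))  # 1997-07-16

# A complete date can be parsed back into a date object (Python 3.7+).
parsed = date.fromisoformat("1997-07-16")
print(parsed.year, parsed.month, parsed.day)
```

Because the most significant unit comes first, dates written this way sort chronologically when sorted as plain text.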


Content Types

MIME Internet Media types:

• Application
• Audio
• Image
• Model
• Multipart
• Message
• Text
• Video
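
A full media type pairs one of these top-level types with a subtype, written `type/subtype`. As a small sketch, Python's standard `mimetypes` module can guess the media type from a file extension; the helper name `top_level_type` and the file names are hypothetical, and the guess depends only on the extension, not on the file's contents.

```python
import mimetypes


def top_level_type(filename):
    """Return the top-level media type guessed from a filename, or None if unknown."""
    media_type, _encoding = mimetypes.guess_type(filename)
    if media_type is None:
        return None
    return media_type.split("/", 1)[0]


# Hypothetical file names from a research project folder.
for name in ["notes.txt", "figure.png", "protocol.pdf"]:
    print(name, "->", mimetypes.guess_type(name)[0], "->", top_level_type(name))
```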


Data Dictionary

“Documentation of the names of entities used in a software application or database, including in each entry its definition (size and type), where and how it is used, and its relationship to other data” (SAA Glossary).
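
To make the definition concrete, here is a minimal sketch of a data dictionary kept as a Python structure and used to check records during data entry. The field names, definitions, and the `check_record` helper are all hypothetical; many projects keep the same information in a spreadsheet or text file instead.

```python
# Hypothetical data dictionary: one entry per field, recording its type,
# its definition, and where it is used.
DATA_DICTIONARY = {
    "sample_id": {
        "type": str,
        "definition": "Unique identifier assigned to each sample",
        "used_in": "all tables",
    },
    "collected_on": {
        "type": str,
        "definition": "Collection date in ISO 8601 form (YYYY-MM-DD)",
        "used_in": "samples table",
    },
    "length_mm": {
        "type": float,
        "definition": "Body length in millimetres",
        "used_in": "measurements table",
    },
}


def check_record(record, dictionary=DATA_DICTIONARY):
    """Return a list of problems found when comparing a record to the dictionary."""
    problems = []
    for field, entry in dictionary.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], entry["type"]):
            problems.append(f"wrong type for {field}: expected {entry['type'].__name__}")
    for field in record:
        if field not in dictionary:
            problems.append(f"undocumented field: {field}")
    return problems


print(check_record({"sample_id": "S1", "collected_on": "1997-07-16", "length_mm": 4.2}))
print(check_record({"sample_id": "S1", "length_mm": "4.2", "notes": "blurry photo"}))
```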


Approaches to Creating Metadata

• Is there a disciplinary schema I should use?
• Could I use Dublin Core?
• Should I create my own schema?


Tool Examples

• Colectica
• Excel
• RightField


Best Practices

• Consult the Research Data Service!
• Consistent data entry is important – review your work and keep a data dictionary!
• Avoid extraneous punctuation
• Avoid most abbreviations
• Use templates and macros when possible
• Extract pre-existing metadata
• Use an established metadata standard whenever possible!
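
As one illustration of extracting pre-existing metadata, the sketch below reads details the file system already records (name, size, modification time) and guesses the media type from the extension. The helper name `file_metadata` and the output field names are assumptions, and the demo file is a throwaway created only for this example.

```python
import mimetypes
import os
import tempfile
from datetime import datetime, timezone


def file_metadata(path):
    """Collect metadata the file system already holds about a file."""
    info = os.stat(path)
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "filename": os.path.basename(path),
        "size_bytes": info.st_size,
        # ISO 8601 timestamp in UTC, to the nearest second
        "modified": modified.isoformat(timespec="seconds"),
        "media_type": mimetypes.guess_type(os.path.basename(path))[0],
    }


# Demo on a temporary file so the example does not depend on existing data.
with tempfile.TemporaryDirectory() as folder:
    demo_path = os.path.join(folder, "readme.txt")
    with open(demo_path, "w", encoding="utf-8") as handle:
        handle.write("LEGEND: Supplementary File 1.")
    print(file_metadata(demo_path))
```

Values gathered this way still need human-supplied context, such as who collected the data and why, which the file system cannot provide.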


Activity!

• What metadata schema will I use?
  – Disciplinary
  – Dublin Core
  – DIY


Campus Resources

Library Resources:
Research Data Service: http://researchdataservice.illinois.edu/
Scholarly Commons: http://www.library.illinois.edu/sc/index.html

Upcoming Workshops:
Basics of Research Data Publication and Sharing
November 13, 2014, 1:00pm – 1:50pm (2014-11-13T13:00/13:50-06:00)
Main Library, Room 314

All Data Management Workshops: http://researchdataservice.illinois.edu/workshops/
All Savvy Researcher Workshops: http://illinois.edu/calendar/list/4068


Thank You!

• Please fill out the one-minute feedback paper!


Sources

Silver, C. (2014, October 4). “Combining and Converting Qualitative and Quantitative Data in CAQDAS Packages: an Aid to Mixed Method Research.” Retrieved from http://www.surrey.ac.uk/sociology/research/researchcentres/caqdas/support/analytictasks/combining_and_converting_qualitative_and_quantitative_data_in_caqdas_packages_an_aid_to_mixed_method_research.htm

Digital Curation Centre’s Disciplinary Metadata resource. http://www.dcc.ac.uk/resources/metadata-standards

Hogrefe, K., Stocks, K. 2011. "The Importance of Metadata Standards." In The MMI Guides: Navigating the World of Marine Metadata. http://marinemetadata.org/guides/mdatastandards/stdimportance. Accessed March 22, 2013.

Internet media type. (2014, September 6). In Wikipedia, the free encyclopedia. Retrieved from http://en.wikipedia.org/w/index.php?title=Internet_media_type&oldid=624395435

Lamar Soutter Library University of Massachusetts Medical School. “New England Collaborative Data Management Curriculum.” http://library.umassmed.edu/necdmc.

Media Types. (2014, September 8). Internet Assigned Numbers Authority. Retrieved September 8, 2014, from http://www.iana.org/assignments/media-types/media-types.xhtml

Metadata. (2014, August 29). In Wikipedia, the free encyclopedia. Retrieved from http://en.wikipedia.org/w/index.php?title=Metadata&oldid=622031523


Sources, Continued

Metadata, structural. Federal Agencies Digitization Guidelines Initiative Glossary. Retrieved from http://www.digitizationguidelines.gov/term.php?term=metadatastructural

Munroe, R. xkcd: ISO 8601. Retrieved from https://xkcd.com/1179/

Munroe, R. xkcd: Standards. Retrieved from http://xkcd.com/927/

National Information Standards Organization (NISO). 2004. Understanding Metadata. http://www.niso.org/publications/press/UnderstandingMetadata.pdf

Neiswender, C. 2010. "Introduction to Metadata." In The MMI Guides: Navigating the World of Marine Metadata. http://marinemetadata.org/guides/mdataintro. Accessed April 1, 2013.

Miller, Steven J. 2011. Metadata Resources: Selected Reference Documents, Web Sites, and Readings: https://pantherfile.uwm.edu/mll/www/resource.html

Pearce-Moses, R. (2005). data dictionary. Glossary of Archival and Records Terminology. Society of American Archivists. Retrieved from http://www2.archivists.org/glossary/terms/d/data-dictionary