METS: A Status Report
Jerome McDonough, New York University, [email protected]
METS: What is it?
An XML document format for encoding digital library objects; a METS document can serve as a SIP, AIP or DIP within the OAIS reference model
Initial scope limited to objects comprised of text, image, audio & video files
Promote interoperability of descriptive, administrative and technical metadata while supporting flexibility in local practice
METS: Why?
“If tools are to be developed that work with digitized archival objects across distributed repositories, these objects will require some form of standardization.”
– The Making of America II Testbed Project: A Digital Library Service Model, Bernard J. Hurley, John Price-Wilkin, Merrilee Proffitt, Howard Besser
METS: Who’s to blame?
Jerome McDonough (Editorial Board Chair), New York University
Rick Beaubien, University of California
Morgan Cundiff, Library of Congress
Susan Dahl, University of Alberta
Richard Gartner, Bodleian Library at Oxford
Nancy Hoebelheinrich, Stanford University
Mark Kornbluh, Michigan State University
Cecilia Preston, Preston & Lynch
Merrilee Proffitt, Research Libraries Group
Richard Rinehart, BAM/PFA
Mackenzie Smith, Massachusetts Institute of Technology
Taylor Surface, OCLC
Brian Tingle, California Digital Library
Robin Wendler, Harvard University
METS: Who’s using it?
CDL – Content Mgmt./Digital Object Repository
Cornell/UVA – Fedora/Tibetan & Himalayan Dig. Library
Florida Center for Library Automation – Union Catalog of Digital Materials, Digital Archive
Göttingen Digitalisierungs-Zentrum – Retrospective Digitization
Harvard University Library – biomedical image stacks, preservation audio, page-turned objects
Library of Congress – Audio-Visual Prototyping Project
MIT – DSPACE
NYU Libraries – Digital Repository, CRL Web Archiving
METS: Who’s using it?
OCLC – Digital Archive Implementation
Oxford University – Oxford Digital Library
RLG – Cultural Materials Service
Stanford University Library/AIS – Stanford Digital Repository
University of Alberta – Peel’s Prairie Provinces Project
UC Berkeley Library – Archival Collections, TOC/Indexes for off-site material, CS Tech Report (w/OAI Interface)
Univ. of Chicago Library – Digital Collections
University of Graz, Austria – Austrian Literature Online
METS XML Schema
METS Document
Header
Descript. MD
Admin. MD
File List
Link Struct.
Struct. Map
Behaviors
Structural Map
Object modeled as tree structure (e.g., book with chapters with subchapters….)
Every node in tree can be associated with descriptive/administrative metadata and…
Individual/multiple files (or portions thereof) or
Other METS documents
Structural Map
<div TYPE="book" LABEL="Hunting of the Snark">
  <div TYPE="chapter" LABEL="Fit the First">
    <fptr>…</fptr>
  </div>
  <div TYPE="chapter" LABEL="Fit the Second">
    <fptr>…</fptr>
  </div>
  …
</div>
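A minimal sketch of walking such a structural map as a tree, using Python's standard `xml.etree.ElementTree`. The sample document, the `FILEID` values and the `outline` helper are hypothetical; attribute names are written in upper case (`TYPE`, `LABEL`) as in the METS schema, and the METS namespace is left out to keep the example short.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; a real METS document would also carry the METS namespace.
STRUCT_MAP = """
<structMap>
  <div TYPE="book" LABEL="Hunting of the Snark">
    <div TYPE="chapter" LABEL="Fit the First"><fptr FILEID="F1"/></div>
    <div TYPE="chapter" LABEL="Fit the Second"><fptr FILEID="F2"/></div>
  </div>
</structMap>
"""

def outline(div, depth=0):
    """Return (depth, TYPE, LABEL, file ids) for a div and all of its descendants."""
    files = [fptr.get("FILEID") for fptr in div.findall("fptr")]
    rows = [(depth, div.get("TYPE"), div.get("LABEL"), files)]
    for child in div.findall("div"):
        rows.extend(outline(child, depth + 1))
    return rows

root = ET.fromstring(STRUCT_MAP)
rows = outline(root.find("div"))
for depth, div_type, label, files in rows:
    # Indent each node by its depth in the tree.
    print("  " * depth + f"{div_type}: {label} {files}")
```

The recursion mirrors the nesting of the div elements, so deeper hierarchies (subchapters, pages) need no extra code.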
Link Structure
Records all links between nodes in structural map
Uses XLink/XPointer syntax
Caveat Encoder: make sure your structural map supports your link structure
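One way to act on that caveat is to check that every link endpoint names a node that actually exists in the structural map. The sketch below assumes links are `smLink` elements carrying `xlink:from` and `xlink:to` attributes that hold div IDs; the sample document and the `dangling_links` helper are hypothetical, and the METS namespace is omitted for brevity.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"

# Hypothetical sample: the second link points at a div ID that does not exist.
SAMPLE = f"""
<mets xmlns:xlink="{XLINK}">
  <structMap>
    <div ID="d1" TYPE="page">
      <div ID="d2" TYPE="image"/>
      <div ID="d3" TYPE="caption"/>
    </div>
  </structMap>
  <structLink>
    <smLink xlink:from="d2" xlink:to="d3"/>
    <smLink xlink:from="d2" xlink:to="d9"/>
  </structLink>
</mets>
"""

def dangling_links(root):
    """Return (from, to) pairs where either end is not a div ID in the document."""
    div_ids = {d.get("ID") for d in root.iter("div")}
    bad = []
    for link in root.iter("smLink"):
        src = link.get(f"{{{XLINK}}}from")
        dst = link.get(f"{{{XLINK}}}to")
        if src not in div_ids or dst not in div_ids:
            bad.append((src, dst))
    return bad

bad_links = dangling_links(ET.fromstring(SAMPLE))
print(bad_links)
```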
Content Files Listing
Records file specific technical metadata (checksum, file size, creation date/time) as well as providing access to file content
Files are arranged into groups, which can be arranged hierarchically
Files may be referenced (using XLink) or contained within the METS document (in XML or as Base64 binary)
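A minimal sketch of a file entry that carries its content inside the document as Base64, together with size and checksum. It assumes the METS attribute names `SIZE`, `CHECKSUM` and `CHECKSUMTYPE` and the `FContent`/`binData` wrapper elements; the `embedded_file` helper, the choice of SHA-256 and the sample bytes are illustrative only.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET

def embedded_file(file_id, mimetype, data):
    """Build a <file> element whose content is embedded as Base64 text."""
    file_el = ET.Element("file", {
        "ID": file_id,
        "MIMETYPE": mimetype,
        "SIZE": str(len(data)),                       # size in bytes
        "CHECKSUM": hashlib.sha256(data).hexdigest(),
        "CHECKSUMTYPE": "SHA-256",
    })
    content = ET.SubElement(file_el, "FContent")
    bin_data = ET.SubElement(content, "binData")
    bin_data.text = base64.b64encode(data).decode("ascii")
    return file_el

sample = embedded_file("F01", "text/plain", b"hello")
print(ET.tostring(sample, encoding="unicode"))
```

A referenced file would instead hold a locator element with an XLink `href`, leaving the bytes outside the METS document.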
Descriptive Metadata
Non-prescriptive/Multiple instances
Desc. metadata associated with entirety of METS object or subcomponents
Desc. metadata may be internal (XML or binary) or external (referenced by XLink) to METS document
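The internal/external distinction can be sketched as follows, assuming internal metadata sits in an `mdWrap` element (holding `xmlData` or `binData`), external metadata is pointed to by an `mdRef` element with an XLink `href`, and divs refer to metadata sections through a space-separated `DMDID` attribute. The sample records, the example.org URL and both helper functions are hypothetical, and the METS namespace is omitted.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"

# Hypothetical sample: one internal Dublin Core record, one external MARC record.
SAMPLE = f"""
<mets xmlns:xlink="{XLINK}">
  <dmdSec ID="DMD1">
    <mdWrap MDTYPE="DC"><xmlData><title>Hunting of the Snark</title></xmlData></mdWrap>
  </dmdSec>
  <dmdSec ID="DMD2">
    <mdRef MDTYPE="MARC" LOCTYPE="URL" xlink:href="http://example.org/record.mrc"/>
  </dmdSec>
  <structMap>
    <div TYPE="book" DMDID="DMD1 DMD2"/>
  </structMap>
</mets>
"""

def describe(section):
    """Classify one metadata section; assumes it holds either mdWrap or mdRef."""
    wrap = section.find("mdWrap")
    if wrap is not None:
        kind = "binary" if wrap.find("binData") is not None else "xml"
        return ("internal", wrap.get("MDTYPE"), kind)
    ref = section.find("mdRef")
    return ("external", ref.get("MDTYPE"), ref.get(f"{{{XLINK}}}href"))

def metadata_for(root, div):
    """Resolve the div's DMDID references to descriptions of their sections."""
    sections = {s.get("ID"): s for s in root.iter("dmdSec")}
    return [describe(sections[i]) for i in div.get("DMDID", "").split()]

root = ET.fromstring(SAMPLE)
book_metadata = metadata_for(root, root.find("structMap/div"))
print(book_metadata)
```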
Administrative Metadata
4 Types: Technical, Rights, Source Document, Digital Provenance
Non-prescriptive/Multiple instances
Associated with entirety of METS object or subcomponents
May be internal (XML/binary) or external (XLink) to METS document
METS Header
Metadata regarding METS document
Creation/Last Modification Date/Record Status
Document Agents (Creator, Editor, Archivist, Preservation, Disseminator, Rights Owner, Custodian, etc.)
Alternative Record ID values
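The header fields listed above could be assembled roughly as below. This sketch assumes the METS names `metsHdr`, `CREATEDATE`, `LASTMODDATE`, `RECORDSTATUS`, `agent` with a `ROLE` attribute, `name` and `altRecordID`; the `build_header` helper and all sample values (dates, agent names, record ID) are invented for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

def build_header(created, modified, status, agents, alt_ids=()):
    """Build a metsHdr element from dates, a status, (role, name) pairs and alternative IDs."""
    hdr = ET.Element("metsHdr", {
        "CREATEDATE": created.isoformat(),
        "LASTMODDATE": modified.isoformat(),
        "RECORDSTATUS": status,
    })
    for role, name in agents:
        agent = ET.SubElement(hdr, "agent", {"ROLE": role})
        ET.SubElement(agent, "name").text = name
    for alt in alt_ids:
        ET.SubElement(hdr, "altRecordID").text = alt
    return hdr

# Hypothetical values throughout.
header = build_header(
    created=datetime(2003, 1, 15, 9, 0, 0),
    modified=datetime(2003, 2, 1, 14, 30, 0),
    status="draft",
    agents=[("CREATOR", "Example Library"), ("CUSTODIAN", "Example Archive")],
    alt_ids=["local-0001"],
)
print(ET.tostring(header, encoding="unicode"))
```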
Behaviors Section
Multiple Behaviors allowed for any METS document
Behaviors may operate on any part of METS document
May provide information on API, service location, etc.
METS Structure
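[Diagram: an oral history as a METS object.
Structural map: Oral History, divided into Introduction, Q1 & Answer, Q2 & Answer.
Files: AIFF Master, TEI Transcription.
Metadata: AES/EBU Tech. Metadata, Text Tech. Metadata, MARC21 Record.
Links: Time Code Link, IDREF Link.]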
METS Extension Schema
Descriptive Metadata (DC, MARCXML, MODS)
Administrative Metadata
– Technical
  image: NISO Still Image (MIX)
  text: NYU & LOC A/V Prototyping
  audio: AES/EBU (Real Soon Now) & LOC A/V Prototyping
  video: SMPTE (Not Real Soon) & LOC A/V Prototyping
– IP Rights (XrML, ODRL, MPEG 21, Stanford)
– Digital Provenance (capture/migration): LOC A/V Prototyping & OCLC/RLG Working Group (Sooner than I’d like)
METS Examples
Afghanistan Digital Library
Library of Congress Viewer
NYU Multimedia Viewer
METS + Zooming Spaces
METS Example: Time-Based Media
<m:file ID="F01" MIMETYPE="image/gif">…</m:file>
<m:file ID="F02" MIMETYPE="audio/wav">…</m:file>
<m:file ID="F03" MIMETYPE="text/plain">…</m:file>

<m:div LABEL="slide 1">
  <m:fptr>
    <m:par>
      <m:area FILEID="F01"/>
      <m:area FILEID="F02" BEGIN="00:00:00.100" END="00:00:03.500" BETYPE="SMIL" EXTENT="2.5s" EXTTYPE="SMIL"/>
      <m:area FILEID="F03" BEGIN="p01" END="p02" BETYPE="IDREF"/>
    </m:par>
  </m:fptr>
</m:div>
This, plus…
METS Example: Time-Based Media
<body>
  <p id="p01">Recovery from drug or alcohol abuse can be a long lonely road</p>
  <p id="p02">Help someone you love</p>
  <p id="p03">Call 1-800-444-6472</p>
  <p id="p04">Help Close the Health Gap</p>
  <p id="p05"/>
</body>
…this, along with an audio file and some XSLT, gives you…
METS Example: Time-Based Media
<smil>
  <head>
    <layout>
      <root-layout id="right" width="320" height="404" background-color="green"/>
      <region id="visualarea" left="0" top="0" width="100%" height="240"/>
      <region id="textarea" left="0" top="242" width="100%" height="160"/>
    </layout>
  </head>
  <body>
    <par>
      <img src="../image/gap01.gif" region="visualarea" dur="00:00:31.000"/>
      <audio src="../audio/track01.wav"/>
      <text src="track01.txt" region="textarea" dur="00:00:31.000"/>
    </par>
  </body>
</smil>
…this, and…
METS Example: Time-Based Media
{QTtext}{font:Geneva}{plain}{size:12}{textColor: 65535, 65535, 65535}{backColor: 0, 0, 0}{justify:center}{timeScale:1000}{width:320}{height:160}{timeStamps:absolute}{language:0}{textEncoding:0}
[00:00:00.000]
Loading...
[00:00:00.100]
Recovery from drug or alcohol abuse can be a long lonely road
[00:00:04.500]
Help someone you love
[00:00:06.000]
Call 1-800-444-6472
[00:00:08.000]
Help Close the Health Gap
[00:00:12.000]
Closing...
[00:00:12.000]
…this.
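The slides derive the caption track with XSLT. As a rough illustration of the same idea in Python instead, the sketch below turns a list of (start time in milliseconds, caption text) cues into a QuickTime text track with absolute timestamps. The `timestamp` and `to_qttext` helpers are hypothetical, the header is a shortened version of the one shown above, and the cue times are copied from that sample output.

```python
def timestamp(ms):
    """Format milliseconds as HH:MM:SS.mmm."""
    seconds, millis = divmod(ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"

def to_qttext(cues, width=320, height=160):
    """Render (start_ms, text) cues as a QTtext document, one timestamp line per cue."""
    header = ("{QTtext}{font:Geneva}{plain}{size:12}"
              "{justify:center}{timeScale:1000}"
              f"{{width:{width}}}{{height:{height}}}{{timeStamps:absolute}}")
    lines = [header]
    for start_ms, text in cues:
        lines.append(f"[{timestamp(start_ms)}]")
        lines.append(text)
    return "\n".join(lines)

cues = [
    (100, "Recovery from drug or alcohol abuse can be a long lonely road"),
    (4500, "Help someone you love"),
    (6000, "Call 1-800-444-6472"),
    (8000, "Help Close the Health Gap"),
]
track = to_qttext(cues)
print(track)
```

In a fuller pipeline the cue texts would be pulled from the paragraphs that the IDREF areas point to, rather than typed in by hand.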
METS Development Tools
Harvard Java Toolkit http://hul.harvard.edu/mets/
NYU XSLT for METS http://dlib.nyu.edu/metstools/
More coming soon… http://www.loc.gov/standards/mets/
METS Profiles
“Learning Zen is a phenomenon of gold and dung. Before you understand it, it's like gold; after you understand it, it's like dung.”
METS Profiles
METS was designed to be flexible, so it could adapt to your local practices, but that means:
different institutions can, and will, differ in how they define structural, administrative and descriptive metadata, even for the same work;
different institutions can, and will, differ in their use of content file formats;
different institutions can, and will, differ in their use of rules of description, controlled vocabularies, etc., etc., etc.
So much for interoperability.
METS Profiles
METS profiles allow digital libraries to specify constraints that they place on METS for ingest, storage/processing or dissemination, including:
dictating use of particular extension schema, rules of description, and controlled vocabularies
specifying arrangement and use of METS elements and attributes for particular classes of documents
specifying the technical characteristics of data files within a METS object
identifying tools for creating/processing METS documents compliant with a particular profile
METS Profiles
An XML schema for METS profiles has been developed and distributed to the METS community for review.
A registration process has been developed by the METS editorial board in cooperation with the Lib. of Congress Network Dev. & MARC Stds Office.
Registration is optional; profiles are useful even without registration for defining local practice.
METS: Next Steps
Better documentation
Training sessions (all over the place)
Tool development (particularly open source)
Help spark extension schema development (video tech. metadata, IP rights, digital provenance)
Work on controlled vocabularies for use in METS
Establish registry of METS repositories
METS: Further Info
METS Web Site: http://www.loc.gov/standards/mets
METS Mailing List: [email protected]
…or contact me at