METS: A Status Report Jerome McDonough New York University [email protected]

METS: What is it?

An XML document format for encoding digital library objects, which can fulfill the roles of SIP, AIP and DIP within the OAIS reference model

Initial scope limited to objects comprised of text, image, audio & video files

Promote interoperability of descriptive, administrative and technical metadata while supporting flexibility in local practice

METS: Why?

“If tools are to be developed that work with digitized archival objects across distributed repositories, these objects will require some form of standardization.”

– The Making of America II Testbed Project: A Digital Library Service Model. Bernard J. Hurley, John Price-Wilkin, Merrilee Proffitt, Howard Besser

METS: Who’s to blame?

Jerome McDonough (Editorial Board Chair), New York University
Rick Beaubien, University of California
Morgan Cundiff, Library of Congress
Susan Dahl, University of Alberta
Richard Gartner, Bodleian Library at Oxford
Nancy Hoebelheinrich, Stanford University
Mark Kornbluh, Michigan State University
Cecilia Preston, Preston & Lynch
Merrilee Proffitt, Research Libraries Group
Richard Rinehart, BAM/PFA
Mackenzie Smith, Massachusetts Institute of Technology
Taylor Surface, OCLC
Brian Tingle, California Digital Library
Robin Wendler, Harvard University

METS: Who’s using it?

CDL – Content Mgmt./Digital Object Repository
Cornell/UVA – Fedora/Tibetan & Himalayan Dig. Library
Florida Center for Library Automation – Union Catalog of Digital Materials, Digital Archive
Göttingen Digitalisierungs-Zentrum – Retrospective Digitization
Harvard University Library – biomedical image stacks, preservation audio, page-turned objects
Library of Congress – Audio-Visual Prototyping Project
MIT – DSPACE
NYU Libraries – Digital Repository, CRL Web Archiving

METS: Who’s using it?

OCLC – Digital Archive Implementation
Oxford University – Oxford Digital Library
RLG – Cultural Materials Service
Stanford University Library/AIS – Stanford Digital Repository
University of Alberta – Peel’s Prairie Provinces Project
UC Berkeley Library – Archival Collections, TOC/Indexes for off-site material, CS Tech Report (w/OAI Interface)
Univ. of Chicago Library – Digital Collections
University of Graz, Austria – Austrian Literature Online

METS: Technical Components

Primary XML Schema
Extension Schema
Controlled Vocabularies

METS XML Schema

[Diagram: a METS Document and its sections: Header, Descript. MD, Admin. MD, File List, Link Struct., Struct. Map, Behaviors]

Structural Map

Object modeled as tree structure (e.g., book with chapters with subchapters….)

Every node in tree can be associated with descriptive/administrative metadata and…

Individual/multiple files (or portions thereof) or

Other METS documents

Structural Map

<div TYPE="book" LABEL="Hunting of the Snark">
  <div TYPE="chapter" LABEL="Fit the First">
    <fptr>…</fptr>
  </div>
  <div TYPE="chapter" LABEL="Fit the Second">
    <fptr>…</fptr>
  </div>
  …
</div>
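
As a rough illustration of the tree model, here is a minimal Python sketch that walks a structural map and lists each node with the files it points to. It assumes the uppercase TYPE and LABEL attribute names used by the METS schema; the wrapping element, the FILEID values F1 and F2, and the function name outline are made up for the example, and the METS namespace is left out for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment modeled on the slide; a real METS document also
# declares the METS namespace, which is omitted here for brevity.
STRUCT_MAP = """
<structMap>
  <div TYPE="book" LABEL="Hunting of the Snark">
    <div TYPE="chapter" LABEL="Fit the First"><fptr FILEID="F1"/></div>
    <div TYPE="chapter" LABEL="Fit the Second"><fptr FILEID="F2"/></div>
  </div>
</structMap>
"""

def outline(div, depth=0):
    """Return (depth, type, label, file ids) for a div and all its descendants."""
    # Only file pointers attached directly to this node are collected.
    file_ids = [f.get("FILEID") for f in div.findall("fptr")]
    rows = [(depth, div.get("TYPE"), div.get("LABEL"), file_ids)]
    for child in div.findall("div"):
        rows.extend(outline(child, depth + 1))
    return rows

root = ET.fromstring(STRUCT_MAP)
rows = outline(root.find("div"))
for depth, kind, label, file_ids in rows:
    print("  " * depth + f"{kind}: {label} {file_ids}")
```

The recursion mirrors the nesting of the divs, so a viewer or table-of-contents builder could be driven from the same traversal.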

Link Structure

Records all links between nodes in structural map

Uses XLink/XPointer syntax

Caveat Encoder: make sure your structural map supports your link structure

Content Files Listing

Records file-specific technical metadata (checksum, file size, creation date/time) as well as providing access to file content

Files are arranged into groups, which can be arranged hierarchically

Files may be referenced (using XLink) or contained within the METS document (in XML or as Base64 binary)
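
To make the referenced-versus-contained distinction concrete, here is a small Python sketch that builds one file entry of each kind. The element and attribute names (fileGrp, file, FLocat, FContent, binData, SIZE, CHECKSUM, CHECKSUMTYPE) follow the METS schema as I understand it and may differ between schema versions; the helper file_entry, the example.org URL and the sample bytes are purely illustrative, and the METS namespace is omitted.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("xlink", XLINK)

def file_entry(file_id, mimetype, data, href=None):
    """Build a <file> element: reference the content if href is given, else embed it."""
    el = ET.Element("file", {
        "ID": file_id,
        "MIMETYPE": mimetype,
        "SIZE": str(len(data)),                      # file size in bytes
        "CHECKSUM": hashlib.sha1(data).hexdigest(),  # fixity information
        "CHECKSUMTYPE": "SHA-1",
    })
    if href is not None:
        # Referenced content: an XLink pointing at the file's location.
        ET.SubElement(el, "FLocat", {"LOCTYPE": "URL", f"{{{XLINK}}}href": href})
    else:
        # Contained content: the bytes travel inside the document as Base64.
        content = ET.SubElement(el, "FContent")
        ET.SubElement(content, "binData").text = base64.b64encode(data).decode("ascii")
    return el

grp = ET.Element("fileGrp", {"USE": "master"})
grp.append(file_entry("F01", "text/plain", b"Fit the First",
                      href="http://example.org/fit1.txt"))  # hypothetical URL
grp.append(file_entry("F02", "text/plain", b"Fit the Second"))
xml_text = ET.tostring(grp, encoding="unicode")
print(xml_text)
```

Embedding keeps the object self-contained at the cost of document size, while referencing keeps the METS document small but depends on the link staying valid.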

Descriptive Metadata

Non-prescriptive/Multiple instances

Desc. metadata associated with entirety of METS object or subcomponents

Desc. metadata may be internal (XML or binary) or external (referenced by XLink) to METS document

Administrative Metadata

4 Types: Technical, Rights, Source Document, Digital Provenance

Non-prescriptive/Multiple instances

associated with entirety of METS object or subcomponents

may be internal (XML/binary) or external (XLink) to METS document

METS Header

Metadata regarding METS document

Creation/Last Modification Date/Record Status

Document Agents (Creator, Editor, Archivist, Preservation, Disseminator, Rights Owner, Custodian, etc.)

Alternative Record ID values
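
A header carrying these items might be assembled as in the sketch below. The names metsHdr, agent, name, altRecordID, CREATEDATE, RECORDSTATUS and ROLE are taken from the METS schema as I recall it; the function mets_header and all of the values passed to it are hypothetical placeholders, and the namespace is omitted.

```python
import xml.etree.ElementTree as ET

def mets_header(created, status, agents, alt_ids=()):
    """Build a <metsHdr> with a creation date, record status, agents and alternate IDs."""
    hdr = ET.Element("metsHdr", {"CREATEDATE": created, "RECORDSTATUS": status})
    for role, name in agents:
        # One <agent> per party, with its role recorded as an attribute.
        agent = ET.SubElement(hdr, "agent", {"ROLE": role})
        ET.SubElement(agent, "name").text = name
    for alt in alt_ids:
        ET.SubElement(hdr, "altRecordID").text = alt
    return hdr

# All values below are made-up placeholders.
hdr = mets_header(
    "2003-01-01T00:00:00",
    "draft",
    [("CREATOR", "Example Library"), ("CUSTODIAN", "Example Archive")],
    alt_ids=["local-0001"],
)
header_xml = ET.tostring(hdr, encoding="unicode")
print(header_xml)
```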

Behaviors Section

Multiple Behaviors allowed for any METS document

Behaviors may operate on any part of METS document

May provide information on API, service location, etc.

METS Structure

[Diagram: an Oral History object with divisions Introduction, Q1 & Answer, Q2 & Answer; content files AIFF Master and TEI Transcription; metadata AES/EBU Tech. Metadata, Text Tech. Metadata and MARC21 Record; connections labeled Time Code Link and IDREF Link]

METS Extension Schema

Descriptive Metadata (DC, MARCXML, MODS)

Administrative Metadata

– Technical
    image: NISO Still Image (MIX)
    text: NYU & LOC A/V Prototyping
    audio: AES/EBU (Real Soon Now) & LOC A/V Prototyping
    video: SMPTE (Not Real Soon) & LOC A/V Prototyping

– IP Rights (XrML, ODRL, MPEG 21, Stanford)

– Digital Provenance (capture/migration): LOC A/V Prototyping & OCLC/RLG Working Group (Sooner than I’d like)

METS Examples

Afghanistan Digital Library
Library of Congress Viewer
NYU Multimedia Viewer
METS + Zooming Spaces

METS Example: Time-Based Media

<m:file ID="F01" MIMETYPE="image/gif"/>
<m:file ID="F02" MIMETYPE="audio/wav"/>
<m:file ID="F03" MIMETYPE="text/plain"/>

<m:div LABEL="slide 1">
  <m:fptr>
    <m:par>
      <m:area FILEID="F01"/>
      <m:area FILEID="F02" BEGIN="00:00:00.100" END="00:00:03.500" BETYPE="SMIL" EXTENT="2.5s" EXTTYPE="SMIL"/>
      <m:area FILEID="F03" BEGIN="p01" END="p02" BETYPE="IDREF"/>
    </m:par>
  </m:fptr>
</m:div>

This, plus…

METS Example: Time-Based Media

<body>
  <p id="p01">Recovery from drug or alcohol abuse can be a long lonely road</p>
  <p id="p02">Help someone you love</p>
  <p id="p03">Call 1-800-444-6472</p>
  <p id="p04">Help Close the Health Gap</p>
  <p id="p05"/>
</body>

…this, along with an audio file and some XSLT, gives you…

METS Example: Time-Based Media

<smil>
  <head>
    <layout>
      <root-layout id="right" width="320" height="404" background-color="green"/>
      <region id="visualarea" left="0" top="0" width="100%" height="240"/>
      <region id="textarea" left="0" top="242" width="100%" height="160"/>
    </layout>
  </head>
  <body>
    <par>
      <img src="../image/gap01.gif" region="visualarea" dur="00:00:31.000"/>
      <audio src="../audio/track01.wav"/>
      <text src="track01.txt" region="textarea" dur="00:00:31.000"/>
    </par>
  </body>
</smil>

…this, and…

METS Example: Time-Based Media

{QTtext}{font:Geneva}{plain}{size:12}{textColor: 65535, 65535, 65535}{backColor: 0, 0, 0}{justify:center}{timeScale:1000}{width:320}{height:160}{timeStamps:absolute}{language:0}{textEncoding:0}
[00:00:00.000] Loading...
[00:00:00.100] Recovery from drug or alcohol abuse can be a long lonely road
[00:00:04.500] Help someone you love
[00:00:06.000] Call 1-800-444-6472
[00:00:08.000] Help Close the Health Gap
[00:00:12.000] Closing...
[00:00:12.000]

…this.
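
The talk does this transformation with XSLT. As a loose stand-in, the Python sketch below shows the core idea only: resolve each IDREF range against the transcript paragraphs and emit one timed text line per cue. The cue list is a hypothetical simplification of the area elements (start time plus first and last paragraph id), the function names are invented, and the output keeps only a few of the QTtext descriptors shown above, in the same line layout as this transcript.

```python
import xml.etree.ElementTree as ET

TRANSCRIPT = """<body>
<p id="p01">Recovery from drug or alcohol abuse can be a long lonely road</p>
<p id="p02">Help someone you love</p>
<p id="p03">Call 1-800-444-6472</p>
</body>"""

# Hypothetical cues: (audio start time, first paragraph id, last paragraph id),
# standing in for the paired audio and text <m:area> elements of each division.
CUES = [
    ("00:00:00.100", "p01", "p01"),
    ("00:00:04.500", "p02", "p02"),
    ("00:00:06.000", "p03", "p03"),
]

def resolve_idref_range(body, begin, end):
    """Join the text of the sibling <p> elements from id=begin through id=end."""
    paragraphs = list(body)
    ids = [p.get("id") for p in paragraphs]
    first, last = ids.index(begin), ids.index(end)
    return " ".join((p.text or "").strip() for p in paragraphs[first:last + 1])

def to_qttext(body, cues):
    """Emit a minimal QTtext-style track: a descriptor line, then one line per cue."""
    lines = ["{QTtext}{timeScale:1000}{timeStamps:absolute}"]
    for start, begin, end in cues:
        lines.append(f"[{start}] {resolve_idref_range(body, begin, end)}")
    return "\n".join(lines)

body = ET.fromstring(TRANSCRIPT)
track = to_qttext(body, CUES)
print(track)
```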

METS Development Tools

Harvard Java Toolkit http://hul.harvard.edu/mets/

NYU XSLT for METS http://dlib.nyu.edu/metstools/

More coming soon… http://www.loc.gov/standards/mets/

METS Profiles

“Learning Zen is a phenomenon of gold and dung. Before you understand it, it's like gold; after you understand it, it's like dung.”

METS Profiles

METS was designed to be flexible, so it could adapt to your local practices, but that means:

different institutions can, and will, differ in how they define structural, administrative and descriptive metadata, even for the same work;

different institutions can, and will, differ in their use of content file formats;

different institutions can, and will, differ in their use of rules of description, controlled vocabularies, etc., etc., etc.…

So much for interoperability.

METS Profiles

METS profiles allow digital libraries to specify constraints that they place on METS for ingest, storage/processing or dissemination, including:

dictating use of particular extension schema, rules of description, and controlled vocabularies

specifying arrangement and use of METS elements and attributes for particular classes of documents

specifying the technical characteristics of data files within a METS object

identifying tools for creating/processing METS documents compliant with a particular profile
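
To give a feel for what checking such constraints could look like at ingest, here is a toy Python sketch. It is not the METS profile schema: the PROFILE dictionary, its keys, the sample document and the check function are all invented for illustration, and only two kinds of constraint (permitted metadata types and permitted file formats) are covered.

```python
import xml.etree.ElementTree as ET

# Hypothetical profile: these keys and values are illustrative only and are
# not drawn from the METS profile schema.
PROFILE = {
    "allowed_mdtypes": {"MODS", "DC"},
    "allowed_mimetypes": {"image/tiff", "audio/x-aiff"},
}

# Made-up METS-like document (namespace omitted for brevity).
DOC = """
<mets>
  <dmdSec ID="D1"><mdWrap MDTYPE="MODS"/></dmdSec>
  <dmdSec ID="D2"><mdWrap MDTYPE="MARC"/></dmdSec>
  <fileSec><fileGrp>
    <file ID="F1" MIMETYPE="image/tiff"/>
    <file ID="F2" MIMETYPE="image/gif"/>
  </fileGrp></fileSec>
</mets>
"""

def check(root, profile):
    """Return a list of human-readable violations of the profile's constraints."""
    problems = []
    for wrap in root.iter("mdWrap"):
        if wrap.get("MDTYPE") not in profile["allowed_mdtypes"]:
            problems.append(f"metadata type {wrap.get('MDTYPE')} not allowed")
    for f in root.iter("file"):
        if f.get("MIMETYPE") not in profile["allowed_mimetypes"]:
            problems.append(f"file {f.get('ID')}: format {f.get('MIMETYPE')} not allowed")
    return problems

problems = check(ET.fromstring(DOC), PROFILE)
for problem in problems:
    print(problem)
```

A repository could run a check of this kind before accepting a submission, rejecting or flagging objects that fall outside the profile it has published.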

METS Profiles

An XML schema for METS profiles has been developed and distributed to the METS community for review.

A registration process has been developed by the METS editorial board in cooperation with the Lib. of Congress Network Dev. & MARC Stds Office.

Registration is optional; profiles are useful even without registration for defining local practice.

METS: Next Steps

Better documentation

Training sessions (all over the place)

Tool development (particularly open source)

Help spark extension schema development (video tech. metadata, IP rights, digital provenance)

Work on controlled vocabularies for use in METS

Establish registry of METS repositories

METS: Further Info

METS Web Site: http://www.loc.gov/standards/mets

METS Mailing List: [email protected]

…or contact me at [email protected]