Preservation Metadata: between theory and practice
Preservation Metadata Workshop (2), The Hague, the Netherlands, 19 June 2014
Titia van der Werf
Adapted from: Rebecca Guenther, “Metadata for preservation of digital objects: background, functions, and standards”, Preservation Metadata Workshop (1), Hilversum, The Netherlands, 4 March 2014
OUTLINE
1. General introduction to preservation metadata
2. The PREMIS Data Dictionary
3. A use case: the Preservation Health Check
Introduction to preservation metadata
metadata
Function
• Discovery
• Access
• Management
• Control intellectual property rights
• Identification
• Certify authenticity
• Mark content structure
• Indicate status
• Describe processes
• Etc.
Type
• Descriptive
• Administrative
• Technical
• Rights/Access
• Structural
• Meta-metadata
• Etc.
digital preservation
Digital preservation is part and parcel of the “management and preservation” tasks and responsibilities of a heritage institution. Digital information poses its own set of challenges to preservation:
• The overwhelming volume of digital information created daily and the uncontrolled duplication of information;
• The complexity of digital information (content, structure, context, presentation, behaviour) and the evolving boundaries of the scholarly record and the cultural record;
• The dependency on software/hardware (incl. incompatible, obscure or proprietary systems);
• The rapid technological change and the danger of obsolescence;
• The ease of (accidental or malicious) content alteration;
• Doubts about the reliability and integrity of electronic records and the need to vouch for their authenticity.
preservation metadata in 2000
“We can then say that the main problem metadata for long term preservation will help to solve is the problem of technological obsolescence.” (p.4)
http://www.kb.nl/sites/default/files/docs/NEDLIBmetadata.pdf
preservation metadata in 2002
“Preservation metadata (…) is the information necessary to maintain the viability, renderability, and understandability of digital resources over the long-term.” (p.1)
http://www.oclc.org/content/dam/research/activities/pmwg/pm_framework.pdf?urlm=161391
preservation metadata in 2005
“Preservation metadata (…) metadata supporting the functions of maintaining viability, renderability, understandability, authenticity, and identity in a preservation context.” (p. ix)
http://www.loc.gov/standards/premis/
The SPOT Model for risk assessment
Six essential properties of successful digital preservation (the model identifies threats to each):
• Availability
• Identity
• Persistence
• Renderability
• Understandability
• Authenticity
http://www.dlib.org/dlib/september12/vermaaten/09vermaaten.html
metadata and preservation metadata
METADATA: “Structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource”
PRESERVATION METADATA: “Metadata that supports and documents the digital preservation process”
supporting and documenting the digital preservation process
• Provenance:
  – The chain of custody/ownership of the digital object; info about the depositor; etc.
• Authenticity:
  – The documentation of changes affecting the authenticity of the digital object during the preservation process
• Preservation Activity:
  – The documentation of actions taken to preserve the digital object
• Technical Environment:
  – The documentation of the dependencies on and changes in the technical environment needed to render and use the digital object
• Rights:
  – The documentation of the rights and permissions for carrying out preservation activities on the digital object (duplication, migration, transformations)
OAIS Information Model
Information Package Concepts and Relationships (Figure 2-3)
Preservation Description Information
Preservation Description Information (Figure 4-16), June 2012 version
• Reference Information: identifiers of the Content
• Provenance Information: history of the custody
• Context Information: relation of the Content to other objects
• Fixity Information: a data integrity checksum of the Content
• Access Rights Information: permissions for preservation operations
How to record and manage change
OAIS rule: if the PDI changes, the AIP version changes.
Implementation choices: e.g. store the fixity information in the source AIP, and keep the log of data integrity checks and their outcomes separate from the AIP.
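One way to read the implementation choice above: the checksum recorded at ingest stays inside the AIP, while each later integrity check is appended to a log kept outside it, so routine checks do not force a new AIP version. The sketch below is only an illustration; the function names, the log layout and the choice of SHA-256 are assumptions, not part of OAIS or of the workshop material.

```python
import hashlib
from datetime import datetime, timezone


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 hex digest of the content bytes."""
    return hashlib.sha256(data).hexdigest()


def check_fixity(aip_id: str, content: bytes, stored_digest: str, log: list) -> bool:
    """Compare the content against the digest stored in the AIP.

    The outcome is appended to `log`, which lives outside the AIP,
    so the AIP (and its PDI) is left unchanged by the check.
    """
    ok = sha256_of(content) == stored_digest
    log.append({
        "aip": aip_id,
        "checked_at": datetime.now(timezone.utc).isoformat(),
        "outcome": "pass" if ok else "fail",
    })
    return ok


# Hypothetical usage: the digest is recorded once, at ingest.
content = b"example content"
stored = sha256_of(content)
integrity_log = []
check_fixity("aip-001", content, stored, integrity_log)          # unchanged content
check_fixity("aip-001", content + b"!", stored, integrity_log)   # altered content
```

After the two calls the log holds one passing and one failing entry, while the stored digest itself was never touched.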
OAIS compliance relevant to preservation metadata
OAIS Mandatory Responsibilities:
1. Negotiating and accepting information
2. Obtaining sufficient control of the information to ensure long-term preservation
3. Determining the "designated community"
4. Ensuring that information is independently understandable
5. Following documented policies and procedures
6. Making the preserved information available
Digital repository certification
– RLG-NARA Task Force on Digital Repository Certification
– Various other certification initiatives (CRL, DCC, nestor, DRAMBORA)
– Trusted Repositories Audit & Certification (TRAC): Criteria and Checklist (March 2007)
• Organisational infrastructure
  – e.g., governance, organisational structures, mandates, policy frameworks, funding systems, contracts and licenses
• Digital Object Management (OAIS functions)
  – e.g., ingest, metadata, preservation strategies
• Technologies, Technical Infrastructure, & Security
Functions of a trusted digital repository relevant to preservation metadata
• Maintains persistent, unique identifiers for all archived objects
• Identifies properties it will preserve
• Verifies each submitted object during ingest
• Creates archival package from submission package to include technical and rights metadata
• Has mechanisms to authenticate content and its source
• Ensures that content information isn’t corrupted and maintains integrity by using fixity information
• Manages number and location of copies of all digital objects
• Employs documented preservation strategies
Functions of a trusted digital repository relevant to preservation metadata
• Maintains precise descriptions of actions necessary to ensure that objects are preserved
• Has mechanisms for monitoring and notification when formats are becoming obsolete
• Uses tools and resources such as format registries to establish semantic and technical context
• Has processes for storage media and/or hardware changes
• Tracks and manages intellectual property rights and restrictions
• Ensures that agreements applicable to access conditions are adhered to
• Maintains descriptive metadata for access and retrieval and associates it with object
PREMIS
Standards that address preservation metadata: technical
• PREMIS
• Images
  – NISO Z39.87 and MIX
  – Adobe and XMP (Extensible Metadata Platform)
  – Exif (Exchangeable Image File Format)
  – IPTC (International Press Telecommunications Council)/XMP
• Text: textMD
• Sound
  – AES57-2011: Audio Object XML Schema
  – AES60-2011: Core Audio Metadata
  – AudioMD (Library of Congress)
Standards that address preservation metadata: technical
• Video
  – VideoMD
  – SMPTE RP210
  – Technical metadata in EBUCore, PBCore
  – U.S. Federal Agencies Digitization Guidelines
  – MPEG-7 and MPEG-21 for video
Standards that address preservation metadata: Structural
• METS
• PREMIS
• MPEG 21 Digital Item Declaration
• OAI/ORE
• Specific format types
  – MXF
  – AVI
Standards that address preservation metadata: Rights
• PREMIS
• METS Rights
• CDL Copyright schema
• Creative Commons
• PLUS for images
• MPEG-21 REL for moving images
• ONIX for licensing terms
• Full rights expression languages
  – XrML/MPEG-21
  – ODRL
PREMIS Data Dictionary
• May 2005: Data Dictionary for Preservation Metadata: Final Report of the PREMIS Working Group
• March 2008: PREMIS Data Dictionary for Preservation Metadata, version 2.0
• Jan. 2011: version 2.1
• April 2012: version 2.2
• Announced in September 2013: version 3.0
• Data Dictionary:
  – Comprehensive view of information needed to support digital preservation
• Guidelines/recommendations to support creation, use, management
  – Based on deep pool of institutional experiences in setting up and managing operational capacity for digital preservation
Guiding principles: “implementable, core preservation metadata”
• Preservation metadata: maintain viability, renderability, understandability, authenticity, identity in a preservation context
• Core: What most preservation repositories need to know to preserve digital materials over the long-term
• Implementable: rigorously defined; supported by usage guidelines/recommendations; emphasis on automated workflows and metadata generation
• Technical neutrality: no assumptions about technologies, systems and architectures, where metadata is stored
Scope
• What PREMIS DD is:
  – Common data model for organizing/thinking about preservation metadata
  – Guidance for local implementations
  – Standard for exchanging information packages between repositories
  – Compatible with the OAIS reference and information model
• What PREMIS DD is not:
  – Out-of-the-box solution: need to instantiate as metadata elements in repository system
  – All needed metadata: excludes business rules, format-specific technical metadata, descriptive metadata for access, non-core preservation metadata
  – Lifecycle management of objects outside repository
  – Rights management: limited to permissions regarding actions taken within repository
PREMIS Data Model
Intellectual Entities
Objects
Rights Statements
Agents
Events
Intellectual Entities
• Set of content that is considered a single intellectual unit for purposes of management and description (e.g., a book, a photograph, a map, a database)
• Has one or more digital representations
• May include other Intellectual Entities (e.g. a website that includes a web page)
• Not fully described in PREMIS DD, but can be linked to in metadata describing digital representation (this will change in 3.0)
Examples:
• The Chamber by John Grisham (an ebook)
• “Maggie at the beach” (a photograph)
• The Metropolitan New York Library Council Website (a website)
Objects
Objects are what the repository actually preserves:
• FILE: named and ordered sequence of bytes that is known by an operating system
• REPRESENTATION: set of files, including structural metadata, that, taken together, constitute a complete rendering of an Intellectual Entity
• BITSTREAM: data within a file with properties relevant for preservation purposes (but needs additional structure or reformatting to be a stand-alone file)
• FILESTREAMS (files within files) are considered files since they can be rendered alone
Examples:
• A PDF file
• A book composed of several XML files and many images
• TIFF file containing a header and 2 images
Object Example: book in two versions
Intellectual Entity: Da Vinci Code by Dan Brown
• Representation 1: page image version
  – File 1: page1.tiff
  – File 2: page2.tiff
  – File N: pageN.tiff
  – File N+1: METS.xml
• Representation 2: ebook version
  – File 1: book.lit
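The hierarchy in this example (one Intellectual Entity, two Representations, each made of Files) can be sketched as nested records. This is a minimal illustration only: the class and field names are invented for the sketch and are not PREMIS semantic units, three page images stand in for the N pages, and placing METS.xml with the page image version is an assumption based on the file numbering.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class File:
    name: str


@dataclass
class Representation:
    label: str
    files: List[File] = field(default_factory=list)


@dataclass
class IntellectualEntity:
    title: str
    representations: List[Representation] = field(default_factory=list)


# Three page images stand in for the N pages of the example.
page_files = [File(f"page{i}.tiff") for i in range(1, 4)]

book = IntellectualEntity(
    title="Da Vinci Code by Dan Brown",
    representations=[
        # Page images plus the structural metadata file that ties them together.
        Representation("Page image version", page_files + [File("METS.xml")]),
        Representation("ebook version", [File("book.lit")]),
    ],
)

for rep in book.representations:
    print(rep.label, [f.name for f in rep.files])
```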
Semantic units pertaining to Objects
• Object identifier
• Preservation level
• Significant characteristics
• Object characteristics
  – fixity
  – format
  – size
  – creating application
  – inhibitors
  – object characteristics extension
• Original name
• Storage
• Environment (will change in 3.0)
  – software
  – hardware
• Digital signatures
• Relationships
• Linking event identifier
• Linking rights statement identifier
Events
• An action that involves or impacts at least one Object or Agent associated with or known by the preservation repository
• Helps document digital provenance. Can track history of Object through the chain of Events that occur during the Object's lifecycle
• Determining which Events are in scope is up to the repository (e.g., Events which occur before ingest, or after de-accession)
• Determining which Events should be recorded, and at what level of granularity, is up to the repository
Examples:
• Validation Event: use JHOVE tool to verify that chapter1.pdf is a valid PDF file
• Ingest Event: transform an OAIS SIP into an AIP (one Event or multiple Events?)
Semantic units pertaining to Events: provenance and preservation activity
• Event identifier
• Event type (e.g. capture, creation, validation, migration, fixity check, ingestion)
• Event dateTime
• Event detail
• Event outcome
• Event outcome detail
• Linking agent identifier
• Linking object identifier
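As an illustration of how these semantic units might be serialized, the sketch below builds one validation Event as XML with Python's standard library. It is a sketch under assumptions: the namespace URI, the `eventOutcomeInformation` wrapper and the `...IdentifierType`/`...IdentifierValue` sub-elements reflect the PREMIS 2.x XML schema as understood here and should be checked against the schema in use; all identifier values and the date are made up, and the output is not validated.

```python
import xml.etree.ElementTree as ET

# Namespace of the PREMIS 2.x XML schema (assumption: verify against your schema).
PREMIS_NS = "info:lc/xmlschema/premis-v2"
ET.register_namespace("premis", PREMIS_NS)


def q(tag: str) -> str:
    """Qualify a tag name with the PREMIS namespace."""
    return f"{{{PREMIS_NS}}}{tag}"


def add_identifier(parent, prefix, id_type, id_value):
    """Add a <prefix>Identifier element with its Type and Value children."""
    ident = ET.SubElement(parent, q(f"{prefix}Identifier"))
    ET.SubElement(ident, q(f"{prefix}IdentifierType")).text = id_type
    ET.SubElement(ident, q(f"{prefix}IdentifierValue")).text = id_value
    return ident


# Hypothetical values throughout; only the unit names come from the list above.
event = ET.Element(q("event"))
add_identifier(event, "event", "local", "event-0001")
ET.SubElement(event, q("eventType")).text = "validation"
ET.SubElement(event, q("eventDateTime")).text = "2014-06-19T10:00:00"
ET.SubElement(event, q("eventDetail")).text = "Format validation of chapter1.pdf"
outcome_info = ET.SubElement(event, q("eventOutcomeInformation"))
ET.SubElement(outcome_info, q("eventOutcome")).text = "valid"
add_identifier(event, "linkingAgent", "local", "JHOVE version 1.0")
add_identifier(event, "linkingObject", "local", "chapter1.pdf")

xml_text = ET.tostring(event, encoding="unicode")
print(xml_text)
```

The Event points to its Agent and Object only through identifiers, which is how the data model keeps the entities separate.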
Agents
• Person, organization, or software program/system associated with an Event or a Right (permission statement)
• Agents are associated only indirectly to Objects through Events or Rights
• Not defined in detail in PREMIS DD; not considered core preservation metadata beyond identification
Examples:
• Rebecca Guenther (a person)
• New York Public Library (an organization)
• JHOVE version 1.0 (a software program)
Semantic units pertaining to Agents
• Agent Identifier
• Agent Name
• Agent Type
• Agent Note
• Agent Extension
• Linking Event Identifier
• Linking Rights Identifier
Rights Statements
• An agreement with a rights holder that grants permission for the repository to undertake an action(s) associated with an Object(s) in the repository.
• Not a full rights expression language; focuses exclusively on permissions that take the form:
  – Agent X grants Permission Y to the repository in regard to Object Z.
Example:
• Priscilla Caplan grants FCLA digital repository permission to make three copies of metadata_fundamentals.pdf for preservation purposes.
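The "Agent X grants Permission Y in regard to Object Z" pattern can be sketched as a small record plus a lookup. This is an illustrative model only: the class, its field names, the `is_permitted` helper and the act labels "replicate" and "migrate" are assumptions made for the sketch, not definitions taken from the Data Dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RightsStatement:
    """Agent X grants Permission Y to the repository in regard to Object Z."""
    agent: str             # X: the rights holder granting the permission
    act: str               # Y: the permitted action
    obj: str               # Z: the Object the permission applies to
    restriction: str = ""  # optional limit on the act (free text here)


def is_permitted(statements, act: str, obj: str) -> bool:
    """Return True if any statement grants `act` on `obj`."""
    return any(s.act == act and s.obj == obj for s in statements)


statements = [
    RightsStatement(
        agent="Priscilla Caplan",
        act="replicate",
        obj="metadata_fundamentals.pdf",
        restriction="no more than three copies, for preservation purposes",
    )
]

print(is_permitted(statements, "replicate", "metadata_fundamentals.pdf"))
print(is_permitted(statements, "migrate", "metadata_fundamentals.pdf"))
```

A repository could run such a check before a preservation action; anything not explicitly granted is treated as not permitted in this sketch.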
Semantic units pertaining to Rights
• Rights Statement
  – Rights Statement Identifier
  – Rights Basis
  – Copyright Information
  – License Information
  – Statute Information
  – Other Rights Information
  – Rights Granted
    • act
    • restriction
    • termOfGrant
    • rightsGranted
  – Linking Object Identifier
  – Linking Agent Identifier
• rightsExtension
Relationships
• PREMIS Data Dictionary supports expression of relationships between:
  – Different Objects
    • Structural: relationships between parts of a whole
    • Derivation: relationships resulting from replication or transformation of an Object
    • New relationships in 3.0: replacement, dependency, generalization, reference
  – Different Entities
• Relationships are established through reference to Identifiers of other Objects or Entities
PREMIS Maintenance Activity
• Web site:
  – Permanent Web presence, hosted by Library of Congress
  – Central destination for PREMIS-related info, announcements, resources
  – Home of the PREMIS Implementers’ Group (PIG) discussion list
• PREMIS Editorial Committee:
  – Set directions/priorities for PREMIS development
  – Coordinate future revisions of Data Dictionary and XML schema
  – Promote implementation
  – International in scope, cross domain
http://www.loc.gov/standards/premis/
Implementation resources
• Tools:
  – XML schema
  – PREMIS-in-METS toolbox <http://pim.fcla.edu>
  – Controlled vocabularies at http://id.loc.gov
  – RDF/OWL ontology for use as Linked Data
• Guidelines:
  – PREMIS conformance statement
  – PREMIS & METS guidelines
• Community Working groups on special topics
• Implementation Fairs
• Others:
  – Understanding PREMIS (available in multiple languages)
  – PIG Forum
  – Implementation Registry
  – Tools Registry
Some implementers …
• DAITSS (Florida)
• Ex Libris Rosetta
• OCLC’s Digital Archive™
• Archivematica
• HathiTrust
• TIPR (Towards Interoperable Preservation Repositories)
  – FCLA, NYU and Cornell
• Digital libraries in Spain
  – Mandated for use in cultural heritage preservation repositories
See: http://www.loc.gov/premis/premis-registry.html
PREMIS Conformance
• Conformance statement issued in 2010
• PREMIS Conformance Working Group active now
• Levels of conformance:
– Level 1 A repository uses an internal metadata schema whose elements can be mapped to PREMIS. The mapped metadata can satisfy the principles of use at both the semantic unit and Data Dictionary levels. The repository is able to produce documentation demonstrating such mapping for representative samples of its holdings.
– Level 2 A repository implements the PREMIS Data Dictionary as its internal metadata schema in a way that satisfies the principles of use at both the semantic unit and Data Dictionary levels and in a form that does not require further mapping or conversion.
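Level 1 conformance rests on being able to show a mapping from an internal schema to PREMIS. A minimal sketch of such a mapping follows; the internal field names on the left are hypothetical, and pairing them with these particular semantic units is an assumption for illustration, not an official crosswalk.

```python
# Hypothetical internal field names mapped to PREMIS semantic unit names.
INTERNAL_TO_PREMIS = {
    "file_id": "objectIdentifierValue",
    "checksum": "messageDigest",
    "checksum_type": "messageDigestAlgorithm",
    "byte_size": "size",
    "mime_format": "formatName",
}


def to_premis(record: dict) -> dict:
    """Translate an internal record into PREMIS-named fields.

    Fields without a mapping are returned separately so the gap is visible.
    """
    mapped, unmapped = {}, {}
    for key, value in record.items():
        if key in INTERNAL_TO_PREMIS:
            mapped[INTERNAL_TO_PREMIS[key]] = value
        else:
            unmapped[key] = value
    return {"premis": mapped, "unmapped": unmapped}


sample = {"file_id": "obj-42", "checksum": "abc123", "byte_size": 2048, "shelf": "B7"}
result = to_premis(sample)
print(result)
```

Running the mapping over representative samples of the holdings, and keeping the list of unmapped fields, is one way to produce the documentation that Level 1 asks for.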
URLs, etc.
• PREMIS Maintenance Activity: http://www.loc.gov/standards/premis/
• PREMIS Data Dictionary for Preservation Metadata: http://www.loc.gov/standards/premis/v2/premis-2-1.pdf
• PREMIS Implementation Registry: http://www.loc.gov/standards/premis/registry
• PREMIS Implementers Group list: http://listserv.loc.gov/listarch/pig.html
A use case: the preservation health check
What is the Preservation Health Check Pilot?
- Open Planets Foundation (OPF): A community hub for digital preservation whose main goal is to jointly manage and improve tools and research outcomes for practical use.
- OCLC Research: A community resource for shared R&D that addresses challenges facing libraries and archives in a rapidly changing information technology environment.
- Bibliothèque nationale de France: The BnF runs a fully operational trusted digital repository (SPAR). They volunteered to become a PHC-pilot site.
![Page 48: Preservation Metadata: between theory and practicePreservation Metadata Workshop (2) The Hague, the Netherlands 19 June 2014 Titia van der Werf adapted from: Rebecca Guenther, “Metadata](https://reader036.vdocuments.mx/reader036/viewer/2022063011/5fc66069183561681a38d82e/html5/thumbnails/48.jpg)
The Preservation Health Check proposition
As part of their preservation management task, repository managers need to be able to monitor the preservation status of the content of their repository.
We are looking at regular “routine check-ups” that can support this monitoring task.
– Monitoring should be made easy (automatically generated reports or dashboard)
– Monitoring should be based on objective data, generated by the repository (e.g. preservation metadata)
![Page 49: Preservation Metadata: between theory and practicePreservation Metadata Workshop (2) The Hague, the Netherlands 19 June 2014 Titia van der Werf adapted from: Rebecca Guenther, “Metadata](https://reader036.vdocuments.mx/reader036/viewer/2022063011/5fc66069183561681a38d82e/html5/thumbnails/49.jpg)
The analogy
![Page 50: Preservation Metadata: between theory and practicePreservation Metadata Workshop (2) The Hague, the Netherlands 19 June 2014 Titia van der Werf adapted from: Rebecca Guenther, “Metadata](https://reader036.vdocuments.mx/reader036/viewer/2022063011/5fc66069183561681a38d82e/html5/thumbnails/50.jpg)
The research question
If a Preservation Health Check is a monitoring activity to be performed on a repository with digital content:
1. What are empirical indicators (i.e. measures) for PHCs?
2. Are preservation metadata recorded by repositories useful as health indicators for PHCs?
Monitoring is about tracking change: intentional and unintentional change.
![Page 51: Preservation Metadata: between theory and practicePreservation Metadata Workshop (2) The Hague, the Netherlands 19 June 2014 Titia van der Werf adapted from: Rebecca Guenther, “Metadata](https://reader036.vdocuments.mx/reader036/viewer/2022063011/5fc66069183561681a38d82e/html5/thumbnails/51.jpg)
Goal: To develop an implementable logic (or protocol) to support PHCs, and to test this logic against the store of preservation metadata maintained by an operational preservation repository.
![Page 52: Preservation Metadata: between theory and practicePreservation Metadata Workshop (2) The Hague, the Netherlands 19 June 2014 Titia van der Werf adapted from: Rebecca Guenther, “Metadata](https://reader036.vdocuments.mx/reader036/viewer/2022063011/5fc66069183561681a38d82e/html5/thumbnails/52.jpg)
The pilot site
The BnF runs a fully operational trusted digital repository (SPAR). They volunteered to become a PHC-pilot site.
The empirical data consists of:
1. A sample (200 GB) of the PREMIS data (AIP-METS files), covering the following collections:
– Gallica = digitised periodicals, monographs, still images and manuscripts (TIFF + OCR files)
– Legal deposit Web harvests (WARC files)
– 3rd party collection (Centre Pompidou)
![Page 53: Preservation Metadata: between theory and practicePreservation Metadata Workshop (2) The Hague, the Netherlands 19 June 2014 Titia van der Werf adapted from: Rebecca Guenther, “Metadata](https://reader036.vdocuments.mx/reader036/viewer/2022063011/5fc66069183561681a38d82e/html5/thumbnails/53.jpg)
The pilot site
The empirical data consists of (continued):
2. All the Reference Information packages in SPAR that contain reference information, code and specifications of (external) tools used during INGEST (e.g. JHOVE) and of formats ingested;
3. Per collection: SLAs defining policy agreements with SIP suppliers concerning the preservation regime to be applied at the INGEST and ARCHIVAL STORAGE stages.
![Page 54: Preservation Metadata: between theory and practicePreservation Metadata Workshop (2) The Hague, the Netherlands 19 June 2014 Titia van der Werf adapted from: Rebecca Guenther, “Metadata](https://reader036.vdocuments.mx/reader036/viewer/2022063011/5fc66069183561681a38d82e/html5/thumbnails/54.jpg)
Mapping PREMIS on to SPOT
[Diagram relating the PREMIS Data Model (Intellectual Entities, Objects, Agents, Rights, Events and their semantic units) to the SPOT Model (Availability, Identity, Persistence, Renderability, Understandability, Authenticity and their threats)]
![Page 55: Preservation Metadata: between theory and practicePreservation Metadata Workshop (2) The Hague, the Netherlands 19 June 2014 Titia van der Werf adapted from: Rebecca Guenther, “Metadata](https://reader036.vdocuments.mx/reader036/viewer/2022063011/5fc66069183561681a38d82e/html5/thumbnails/55.jpg)
Preservation metadata in 2005
“Preservation metadata (…) metadata supporting the functions of maintaining viability, renderability, understandability, authenticity, and identity in a preservation context.” (p. ix)
http://www.loc.gov/standards/premis/
![Page 56: Preservation Metadata: between theory and practicePreservation Metadata Workshop (2) The Hague, the Netherlands 19 June 2014 Titia van der Werf adapted from: Rebecca Guenther, “Metadata](https://reader036.vdocuments.mx/reader036/viewer/2022063011/5fc66069183561681a38d82e/html5/thumbnails/56.jpg)
![Page 57: Preservation Metadata: between theory and practicePreservation Metadata Workshop (2) The Hague, the Netherlands 19 June 2014 Titia van der Werf adapted from: Rebecca Guenther, “Metadata](https://reader036.vdocuments.mx/reader036/viewer/2022063011/5fc66069183561681a38d82e/html5/thumbnails/57.jpg)
Findings: coverage
SPOT property: # of PREMIS semantic units*
• Availability: 16
• Identity: 19
• Persistence: 10
• Renderability: 15
• Understandability: 14
• Authenticity: 16
* Container level only; Agents, Events, Rights considered one semantic unit
![Page 58: Preservation Metadata: between theory and practicePreservation Metadata Workshop (2) The Hague, the Netherlands 19 June 2014 Titia van der Werf adapted from: Rebecca Guenther, “Metadata](https://reader036.vdocuments.mx/reader036/viewer/2022063011/5fc66069183561681a38d82e/html5/thumbnails/58.jpg)
Findings: coverage
• What does coverage in terms of “number of PREMIS semantic units” mean?
• More meaningful: Do the PREMIS semantic units address the threats associated with a SPOT property?
Example of a gap between SPOT and PREMIS:
SPOT property: Understandability
We found no PREMIS semantic units that provide information that aids in the understanding or interpretation of the content of the archived digital object.
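The difference between counting semantic units and checking whether threats are addressed can be sketched in a few lines of Python. This is only an illustration: the unit-to-threat assignments and the threat labels below are hypothetical placeholders, not the mapping produced in the pilot and not the threat list of the SPOT Model.

```python
from collections import defaultdict

# HYPOTHETICAL assignments: (PREMIS semantic unit, SPOT property, threat it helps detect).
MAPPING = [
    ("fixity", "Persistence", "data corruption"),
    ("storageMedium", "Persistence", "medium failure"),
    ("formatName", "Renderability", "format obsolescence"),
    ("formatRegistry", "Renderability", "format obsolescence"),
]

# HYPOTHETICAL threat labels per SPOT property, for illustration only.
THREATS = {
    "Persistence": ["data corruption", "medium failure"],
    "Renderability": ["format obsolescence", "missing rendering software"],
    "Understandability": ["content cannot be interpreted"],
}


def coverage(mapping, threats):
    """Per property: how many units map to it, and which threats no unit addresses."""
    units = defaultdict(set)
    addressed = defaultdict(set)
    for unit, prop, threat in mapping:
        units[prop].add(unit)
        addressed[prop].add(threat)
    return {
        prop: {
            "unit_count": len(units[prop]),
            "unaddressed": [t for t in prop_threats if t not in addressed[prop]],
        }
        for prop, prop_threats in threats.items()
    }


result = coverage(MAPPING, THREATS)
for prop, info in result.items():
    print(prop, info)
```

In this toy data, Persistence and Renderability both have two units, yet only Persistence has every threat addressed. A raw unit count would not show that difference.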
![Page 59: Preservation Metadata: between theory and practicePreservation Metadata Workshop (2) The Hague, the Netherlands 19 June 2014 Titia van der Werf adapted from: Rebecca Guenther, “Metadata](https://reader036.vdocuments.mx/reader036/viewer/2022063011/5fc66069183561681a38d82e/html5/thumbnails/59.jpg)
Findings: preservation policies
A repository usually implements a large number of explicit and implicit policy decisions; however, PREMIS currently makes few provisions for recording these in preservation metadata (the semantic unit preservationLevel being a notable exception).
![Page 60: Preservation Metadata: between theory and practicePreservation Metadata Workshop (2) The Hague, the Netherlands 19 June 2014 Titia van der Werf adapted from: Rebecca Guenther, “Metadata](https://reader036.vdocuments.mx/reader036/viewer/2022063011/5fc66069183561681a38d82e/html5/thumbnails/60.jpg)
Findings: explicit encoding
PREMIS conformance does not require explicit encoding of metadata if the information applies to all objects in the repository.
This impedes the provision of automated PHC services (by a third-party provider) because efficient provision of this service would likely require the information in semantic units to be explicitly recorded, and implemented in a standard way.
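As a rough sketch of what an automated check for explicit encoding might look like, the Python below scans one PREMIS 2.x object record and lists the semantic units that are not explicitly present. The set of units treated as required is an assumption made for this example, and the sample record is a simplified fragment that is not schema-valid.

```python
import xml.etree.ElementTree as ET

# Namespace of the PREMIS 2.x XML schema.
PREMIS_NS = {"premis": "info:lc/xmlns/premis-v2"}

# ASSUMPTION: units a hypothetical PHC service wants to see explicitly recorded.
REQUIRED_UNITS = {
    "preservationLevel": "premis:preservationLevel/premis:preservationLevelValue",
    "fixity": "premis:objectCharacteristics/premis:fixity/premis:messageDigest",
    "formatName": ("premis:objectCharacteristics/premis:format/"
                   "premis:formatDesignation/premis:formatName"),
    "storageMedium": "premis:storage/premis:storageMedium",
}


def missing_units(object_xml):
    """Return the names of required units that are absent or empty in the record."""
    root = ET.fromstring(object_xml.strip())
    missing = []
    for name, path in REQUIRED_UNITS.items():
        node = root.find(path, PREMIS_NS)
        if node is None or not (node.text or "").strip():
            missing.append(name)
    return missing


# Simplified sample record (not schema-valid), invented for this example.
SAMPLE = """
<premis:object xmlns:premis="info:lc/xmlns/premis-v2">
  <premis:objectCharacteristics>
    <premis:fixity>
      <premis:messageDigestAlgorithm>MD5</premis:messageDigestAlgorithm>
      <premis:messageDigest>d41d8cd98f00b204e9800998ecf8427e</premis:messageDigest>
    </premis:fixity>
    <premis:format>
      <premis:formatDesignation>
        <premis:formatName>image/tiff</premis:formatName>
      </premis:formatDesignation>
    </premis:format>
  </premis:objectCharacteristics>
</premis:object>
"""

print(missing_units(SAMPLE))
```

A unit reported as missing is not necessarily unknown to the repository. It may be information that applies to all objects and is held implicitly, which is the situation this finding describes.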
![Page 61: Preservation Metadata: between theory and practicePreservation Metadata Workshop (2) The Hague, the Netherlands 19 June 2014 Titia van der Werf adapted from: Rebecca Guenther, “Metadata](https://reader036.vdocuments.mx/reader036/viewer/2022063011/5fc66069183561681a38d82e/html5/thumbnails/61.jpg)
Logic for assessing Persistence
[Diagram: SPOT Model, the six essential properties of successful digital preservation (Availability, Persistence, Identity, Renderability, Understandability, Authenticity) and their threats]
![Page 62: Preservation Metadata: between theory and practicePreservation Metadata Workshop (2) The Hague, the Netherlands 19 June 2014 Titia van der Werf adapted from: Rebecca Guenther, “Metadata](https://reader036.vdocuments.mx/reader036/viewer/2022063011/5fc66069183561681a38d82e/html5/thumbnails/62.jpg)
![Page 63: Preservation Metadata: between theory and practicePreservation Metadata Workshop (2) The Hague, the Netherlands 19 June 2014 Titia van der Werf adapted from: Rebecca Guenther, “Metadata](https://reader036.vdocuments.mx/reader036/viewer/2022063011/5fc66069183561681a38d82e/html5/thumbnails/63.jpg)
Logic for assessing Persistence
• If storage medium information is not available in PREMIS metadata, the PHC will need to take other information sources into account, such as audit reports generated by storage management systems.
• We note that there are no pre-defined events for Corruption and Readability in PREMIS, which means that repositories need to define their own events. PREMIS does provide a list of recommended event labels for the semantic unit eventType, but it is just a “suggested starter list”.
• The repository should have policies in place that prescribe the frequency of fixity checks and of medium refreshment, the backup policy, etc. The PREMIS semantic unit preservationLevel does not address such policies. The PHC flow thus needs to get the policy information from other sources.
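One part of such a logic, comparing recorded fixity-check events with a policy-defined frequency, might be sketched as follows. This is not the logic developed in the pilot. The `Event` class is a simplified stand-in for a PREMIS Event, the outcome values `pass` and `fail` are assumptions, and the maximum interval is passed in as a parameter because, as noted above, the policy has to come from a source outside the PREMIS metadata.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Event:
    """Simplified stand-in for a PREMIS Event (eventType, eventDateTime, eventOutcome)."""
    event_type: str
    event_date: date
    outcome: str  # ASSUMPTION: the repository records "pass" or "fail"


def assess_persistence(events, max_interval_days, today, fixity_label="fixity check"):
    """Return findings for one object; an empty list means nothing was flagged."""
    checks = sorted(
        (e for e in events if e.event_type == fixity_label),
        key=lambda e: e.event_date,
    )
    if not checks:
        return ["no fixity check event recorded"]
    findings = []
    # A failed check is a possible sign of corruption.
    for e in checks:
        if e.outcome != "pass":
            findings.append(f"fixity check failed on {e.event_date.isoformat()}")
    # Intervals between consecutive checks, and from the last check to today,
    # are compared with the interval prescribed by the (external) policy.
    dates = [e.event_date for e in checks] + [today]
    limit = timedelta(days=max_interval_days)
    for earlier, later in zip(dates, dates[1:]):
        if later - earlier > limit:
            findings.append(
                f"gap of {(later - earlier).days} days after {earlier.isoformat()}"
            )
    return findings


# Invented events for one object.
events = [
    Event("fixity check", date(2014, 1, 1), "pass"),
    Event("fixity check", date(2014, 2, 1), "pass"),
    Event("fixity check", date(2014, 5, 1), "fail"),
]
report = assess_persistence(events, max_interval_days=60, today=date(2014, 6, 1))
print(report)
```

The event label is a parameter because eventType values are repository-defined. A PHC service would need to know which label each repository uses for fixity checks, and for any corruption or readability events it has defined.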
![Page 64: Preservation Metadata: between theory and practicePreservation Metadata Workshop (2) The Hague, the Netherlands 19 June 2014 Titia van der Werf adapted from: Rebecca Guenther, “Metadata](https://reader036.vdocuments.mx/reader036/viewer/2022063011/5fc66069183561681a38d82e/html5/thumbnails/64.jpg)
A use case: the preservation health check (to be continued)
![Page 65: Preservation Metadata: between theory and practicePreservation Metadata Workshop (2) The Hague, the Netherlands 19 June 2014 Titia van der Werf adapted from: Rebecca Guenther, “Metadata](https://reader036.vdocuments.mx/reader036/viewer/2022063011/5fc66069183561681a38d82e/html5/thumbnails/65.jpg)
Thank You!
©2014 OCLC. This work is licensed under a Creative Commons Attribution 3.0 Unported License. Suggested attribution: “This work uses content from [presentation title] © OCLC, used under a Creative Commons Attribution license: http://creativecommons.org/licenses/by/3.0/”