Post on 28-Mar-2015

Planning for Digital Preservation

Planning for Preservation

Digital preservation issues come up much faster than traditional preservation issues
Digital resources need on-going attention
Build a preservation strategy into your project from the start
Keep dealing with the short-term issues and you won’t ever need to face the long-term problem

Issues

The content of digital resources is only accessible with the aid of intermediary technologies
Digital resources are complex
They rely on a specific combination of formats, software and hardware to operate correctly
I.T. develops rapidly, and resources can become obsolete very quickly

Three Key Areas

Content – the bits and bytes

Technologies – software systems; hardware; websites, access and delivery systems

Organisational

Planning for the Future

Short-term: Initial technology still current and actively supported - 0-5 years
Medium-term: Initial technology still in use and supported, but no longer used for new work - 5-10 years
Long-term: Initial technology no longer used or supported - 10+ years

… In the Short Term

Making digital assets available

Website administration
Website updates
Software and operating system patches
Periodic backups
Periodic checks on master copies

… In the Medium Term

Keeping your existing digital outputs ‘up and running’

Upgrading operating systems and software
Upgrading hardware
Replacing hardware components
Refreshing master copies
Periodic backups
Periodic checks on master copies

… In the Longer Term

Overcoming technological obsolescence to preserve a usable digital resource

Introducing completely new software
Replacing entire hardware systems
Enhancing functionality
Periodic backups
Periodic checks on master copies

During the Data Creation Phase

Importance of backups
Preferably more than one copy, on and off site
Appropriate frequency
More than one file format
Check your backups
But backup is not preservation!
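One way to "check your backups" is to compare checksums of the master copies against the backup copy. The following is a minimal sketch, not a prescribed tool: the function names and the choice of SHA-256 are assumptions made for illustration, and it only reports files that are missing from, or differ in, the backup.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_backup(master: Path, backup: Path) -> dict[str, list[str]]:
    """Report master files that are missing from, or differ in, the backup."""
    master_manifest = build_manifest(master)
    backup_manifest = build_manifest(backup)
    missing = [name for name in master_manifest if name not in backup_manifest]
    changed = [
        name
        for name, checksum in master_manifest.items()
        if name in backup_manifest and backup_manifest[name] != checksum
    ]
    return {"missing": missing, "changed": changed}
```

Saving the output of `build_manifest` alongside the master copies would also allow the same comparison to be repeated later as a periodic check on the masters themselves.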

What to Preserve?

Significant Characteristics

Very difficult to preserve everything (data, functionality and interaction) about a digital resource
Documented or commonly understood significant characteristics help simplify preservation action

Analogue…

Book - Significant: Words, paragraphs, chapters, author, publication date, …

Not Significant: Binding, print run, font, colour of paper, …

Newspapers - Significant: Words, paragraphs, headlines, size of type, date, page number of article, …

Not Significant: Size of page, spacing, text justification, colour of paper, …

Digital…

There is a shared understanding of what is important in a paper-based resource
Less agreement about what is important in a digital resource
It is complicated to decide, as software and formats support many options that are not knowingly used but have default settings

Questions to ask…

What are the significant characteristics of your digital outputs?

What are the digital objects that make up your resource?
What is the purpose of your digital resource?

Think about the problem in terms of content and purpose
Very difficult (if not impossible) to ensure your resource stays exactly the same in the future
What can change without adverse effects?
What changes must be limited, and by how much?
How can you check changes are acceptable?

Assessing the scale of the Preservation Task

Estimating volume and type:
Textual Documents
Still Images
Moving Images
Audio files
Numeric dataset
Database
Markup Documents (XML etc.)
CAD
GIS
Virtual reality
Website
Software executable
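A rough first estimate of volume and type can be produced by walking the project folder and grouping files by extension. The sketch below is illustrative only: the extension-to-type mapping is a hypothetical starting point (it is not part of the original slides) and would need extending for any real collection.

```python
from collections import defaultdict
from pathlib import Path

# Hypothetical mapping from file extension to the resource types listed above.
TYPE_BY_EXTENSION = {
    ".txt": "Textual document",
    ".pdf": "Textual document",
    ".tif": "Still image",
    ".jpg": "Still image",
    ".mpg": "Moving image",
    ".wav": "Audio file",
    ".csv": "Numeric dataset",
    ".xml": "Markup document",
    ".html": "Website",
}


def inventory(root: Path) -> dict[str, dict[str, int]]:
    """Count files and total bytes per resource type under root."""
    totals = defaultdict(lambda: {"files": 0, "bytes": 0})
    for path in root.rglob("*"):
        if path.is_file():
            # Extensions not in the mapping are reported separately for review.
            kind = TYPE_BY_EXTENSION.get(path.suffix.lower(), "Unclassified")
            totals[kind]["files"] += 1
            totals[kind]["bytes"] += path.stat().st_size
    return dict(totals)
```

Extensions are only a hint at the real format, so anything that lands in "Unclassified", or that matters to the project, is worth checking by hand.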

Risk Assessment for file formats used

Review data types and file formats

Assess the risks associated with those file formats

Establish policy for dealing with them

Preservation Metadata

Metadata needed to manage preservation of digital collections: technical; administrative
Not necessarily a “complete set” of preservation metadata elements
Possible sources:

OCLC/RLG Working Group; the Consultative Committee for Space Data Systems; CEDARS project; The UK National Archives (formerly the Public Record Office); Arts and Humanities Data Service; NEDLIB project; California Digital Library; Harvard University Library

File Structure

Create an overview of the file structure
Create a list of all files
Create a logical file strategy from the outset

Choose consistent filenames
Avoid re-using the same filename, even in separate folders.
Store files in a logical order, with system files and content files kept apart.
A summary of contents may be included with each file.
Keep a record of encryption keys – important for preservation.
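The "list of all files" and the check for re-used filenames can both be generated automatically. This is a minimal sketch under stated assumptions: the function names are invented for illustration, and filenames are compared case-insensitively on the assumption that the resource may one day be copied to a case-insensitive file system.

```python
from collections import defaultdict
from pathlib import Path


def list_files(root: Path) -> list[str]:
    """Return every file under root as a sorted list of relative paths."""
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


def duplicate_filenames(root: Path) -> dict[str, list[str]]:
    """Find filenames that are used more than once, even in separate folders."""
    by_name = defaultdict(list)
    for relative_path in list_files(root):
        # Assumption: compare names case-insensitively (Report.txt == report.txt).
        by_name[Path(relative_path).name.lower()].append(relative_path)
    return {name: paths for name, paths in by_name.items() if len(paths) > 1}
```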

Preservation Strategies: Content

Migration: convert the data to work with new applications
Emulation: convert the data, application (and operating system) to work on new hardware
Technology preservation: Keep everything running
Virtual computing: create a standard ‘virtual’ runtime environment
Migration on demand: convert original format directly into up-to-date format

Theory ----- Practice

In practice, migration is the simplest and most common approach
Limitations of migration are:

Can be difficult to ensure accurate migration
Does not capture functionality, only (possibly partial) data
May need to be repeated frequently
Might lead to ‘mutation’ over time

Migrating to new standards – but which one?

“The good thing about standards is that there are so many to choose from” (A. Tanenbaum)

QuickTime 1.0 (1992)
MPEG-1 (1992)
Real Media (1995)
MPEG-2 (1996)
RealVideo (1997)
MPEG-4 (1999)
QuickTime 5.0 (1999)
Active Streaming Format (1999)
DIVX 5.0 (2002)

The number of A/V “de-facto” standard formats has exploded in the past five years, and this does not cover the dozens of audio and video codec combinations!

Measuring Longevity of Standard

Who developed it? Microsoft, Moving Picture Experts Group, etc.

Has it received mainstream support? Can your hardware save data in that format?

What organisations are using it? Is it used in industry?

Is it widely accepted by the professional and amateur community?

Technology watch – check web sites, developer forums and newsgroups.

Has it been submitted as an ISO standard?

Measuring Longevity of Standard

Are there any legal actions to change the standard?

Is there a licensing fee?

What tools are available to create and manipulate the format?

Open source vs. proprietary
PRONOM – National Archives database of 250 software products, 550 file formats and 100 manufacturers

Can I execute these tools on my computer? Java, Windows-only, Mac-only
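The questions on these two slides can be recorded per format so that the answers are documented and comparable. The sketch below simply lists the concerns raised by a set of yes/no answers. The class and field names are hypothetical, no weighting or scoring scheme is implied by the original slides, and the example format is invented.

```python
from dataclasses import dataclass


@dataclass
class FormatAssessment:
    """Yes/no answers to the longevity questions for one file format."""

    name: str
    widely_adopted: bool
    submitted_to_iso: bool
    pending_legal_action: bool
    licensing_fee: bool
    open_source_tools: bool
    tools_run_on_our_platform: bool

    def concerns(self) -> list[str]:
        """Return a plain-language list of the answers that suggest risk."""
        checks = [
            (not self.widely_adopted, "not widely adopted"),
            (not self.submitted_to_iso, "not submitted as an ISO standard"),
            (self.pending_legal_action, "legal action may change the standard"),
            (self.licensing_fee, "licensing fee applies"),
            (not self.open_source_tools, "no open source tools"),
            (not self.tools_run_on_our_platform, "tools do not run on our platform"),
        ]
        return [label for flagged, label in checks if flagged]


# Invented example: a proprietary format with a fee and no ISO submission.
example = FormatAssessment(
    name="example-format",
    widely_adopted=True,
    submitted_to_iso=False,
    pending_legal_action=False,
    licensing_fee=True,
    open_source_tools=False,
    tools_run_on_our_platform=True,
)
print(example.name, example.concerns())
```

Keeping these records with the project documentation also provides the evidence behind any later decision to migrate.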

Choosing a Suitable Migration Path

What are the main features? Small file size, streaming support

Will it support your specialist needs? Subtitles, DRM, Internet delivery, etc.

Does it provide sufficient quality? Lossless vs. lossy compression.

Will it impose any restrictions on use? Can it actually be played by your target audience?

Is the standard stable or does it change frequently?

How will this affect your desire to use the format?

Migration problems

Have you encountered any problems when accessing these files in other applications?

Quirks (text not displaying, desynchronised audio/video, upside-down video playback).
Version incompatibilities

Migrating to other formats
Are there any other problems when exporting to other formats? E.g. lossless-to-lossless conversion, non-editable output
Document quirks and incompatibilities for later.

Updating Hardware

Hardware has changed dramatically in the last 3 years

Memory – DDR vs. SDRAM
CPU – pin compatibility
Graphics cards – AGP 2x, 4x, 8x
Operating system – will Windows NT4/98 run on newer hardware?

Do you upgrade existing hardware or replace it with new equipment?

Updating Software

Software changes on a frequent basis
Four service packs available for Windows 2000.
Microsoft issues 3 patches per week on average.
Legal action forces changes to plugin handling.
In addition, there are an estimated 20 un-patched vulnerabilities in Internet Explorer alone (PivX Solutions).

Do you upgrade to a later operating system or continue to use an operating system & software with known security flaws?

Preserving Your Website: technical issues

Standards and Formats
Has the Web site been designed using open standards, which should help future-proofing?
Have proprietary formats been used (for which backwards compatibility may not be considered)?

Architecture & Implementation
Has the technical architecture of the Web site been documented?
Can you continue to use technical systems after funding has finished?

Preserving Your Website: content issues

Accuracy:
Is the content of the Web site accurate today?
Who will make changes, and how?
Could the content of the Web site be misleading in the future?

Usability:
Maintaining links – short, medium and long term

Legal:
Is the Web site legal (accessibility; copyright; defamation; IPR; …)?
Will the Web site be legal tomorrow, if new legislation is enacted?
How will you know – who will make necessary changes?

Maintaining a Website

Run a link check across the Web site. Fix broken internal links and as many external links as is reasonable. Document the link report.

Run HTML (and CSS) validation checks across the Web site. Fix as many invalid pages as is reasonable. Document the findings.

Run an accessibility check across the Web site. Fix as many inaccessible pages as is reasonable. Document the findings.
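The internal part of the link check described above can be sketched with the Python standard library alone. This is an illustration under stated assumptions rather than a replacement for a dedicated link checker: it assumes the site is a folder of static `.html` files, it only tests whether the target of each relative `href` or `src` exists on disk, and it skips external links entirely. The names `LinkCollector` and `broken_internal_links` are invented for this example.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urldefrag, urlparse


class LinkCollector(HTMLParser):
    """Collect the values of href and src attributes from one HTML page."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def broken_internal_links(site_root: Path) -> dict[str, list[str]]:
    """Map each HTML page to the internal links whose target file is missing."""
    report = {}
    for page in sorted(site_root.rglob("*.html")):
        collector = LinkCollector()
        collector.feed(page.read_text(encoding="utf-8", errors="replace"))
        broken = []
        for link in collector.links:
            target, _fragment = urldefrag(link)
            parsed = urlparse(target)
            path = unquote(parsed.path)
            if parsed.scheme or parsed.netloc or not path:
                continue  # external, mailto: or fragment-only link: out of scope here
            if path.startswith("/"):
                candidate = site_root / path.lstrip("/")
            else:
                candidate = page.parent / path
            if not candidate.exists():
                broken.append(link)
        if broken:
            report[page.relative_to(site_root).as_posix()] = broken
    return report
```

Writing the returned report to a dated file is one simple way to "document the link report" from one check to the next.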

Maintaining a Website

Address technical areas:

Remove any backend scripts which are no longer needed

Remember that scripts, etc. are liable to go wrong.

Ensure that applications are configured to break gracefully and provide meaningful errors – tell users who to contact if they find an error
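As one illustration of "breaking gracefully", a Python web application served through WSGI could be wrapped so that an unexpected error is logged and the user sees a readable message with a contact address. This is a sketch only: the slides do not name any particular platform, the contact address is a placeholder, and the wrapper only catches errors raised while the application is called, not errors raised later while the response body is being iterated.

```python
import logging
import sys

CONTACT = "webmaster@example.org"  # placeholder contact address (assumption)


def graceful(app, contact=CONTACT):
    """Wrap a WSGI application so unexpected errors give a meaningful page."""

    def wrapped(environ, start_response):
        try:
            return app(environ, start_response)
        except Exception:
            # Keep the technical detail in the server log, not on the page.
            logging.exception("Unhandled error for %s", environ.get("PATH_INFO"))
            body = (
                "Sorry, something went wrong on this page. "
                f"Please report the problem to {contact}."
            ).encode("utf-8")
            headers = [
                ("Content-Type", "text/plain; charset=utf-8"),
                ("Content-Length", str(len(body))),
            ]
            # Passing exc_info follows the WSGI convention for error responses.
            start_response("500 Internal Server Error", headers, sys.exc_info())
            return [body]

    return wrapped
```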

Procedures framework

From start to finish:
Creation and Management Manuals within Procedures Framework

Key File Format Conversion Guides

Digital Object Preservation Handbook: a ‘how-to’ guide

Options for Ensuring Preservation

Once a project is completed…

Live (supported) system
Archived
Organisational Repository
‘Shelved’
Abandoned

Not Recommended…

Abandoned
May be appropriate, but probably isn’t; think about archiving the resource instead

‘Shelved’
Don’t - shelving a digital resource without active, on-going attention is highly likely to result in its loss
Media degradation
Software and hardware obsolescence
Loss of knowledge about the resource

Recommended… But Think About

Live System
Importance of functionality/interface
Organisational buy-in: who is running the system, and what is their commitment to it?
What will happen if the system is shut down?
Is the digital resource completed or on-going?
Who pays?

Recommended… But Think About

Deposit in an Archive
Is the digital resource going to a trusted archive?
Are only some aspects of the resource being archived?
Will it be available for others to use?
Will the resource be updated in the future?
Costs?

Recommended… But Think About

Establish a Repository
Business model and financial plan

Management and administrative processes
Policies and procedures
Systems and tools
Software and hardware
Resource curation
Metadata and documentation
Preservation management

Establishing Requirements

A pragmatic approach – workable and achievable
Preservation requirements
Establish common practices, procedures and use of standards
Investigate and establish hardware, systems, and tools requirements
Investigate and evaluate products
Business planning and costings

Developing the Architecture

The architecture must support:
The entire activity cycle including ingest, data management, storage, long term preservation, discovery, access and delivery
All necessary security aspects
Complex resources
Discovery and delivery options

Summary

Build in preservation right from the start
Document decisions/policies/procedures
Balance longevity with innovation
Be ruthless about what you must keep and what can be discarded
Think content and functionality
Planning
It’s a continuous process – not a one-off
