Digital Preservation - The Saga Continues - SCAPE Training Event, Guimarães 2012
DESCRIPTION
This presentation is an introduction to Digital Preservation given by David Tarrant, Open Planets Foundation, at the first SCAPE Training event, ‘Keeping Control: Scalable Preservation Environments for Identification and Characterisation’, in Guimarães, Portugal on 6-7 December 2012.
TRANSCRIPT
![Page 1: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/1.jpg)
Digital Preservation
The Saga Continues
![Page 2: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/2.jpg)
SCAPE
http://www.dpconline.org/
![Page 3: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/3.jpg)
• It won’t do itself
• It won’t go away
• Don’t wait for perfection
‘Digital Preservation: what I
wish I knew before I started’
![Page 4: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/4.jpg)
Digital preservation makes bleak reading …
![Page 5: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/5.jpg)
Let’s restate the problem …
• Digital stuff has value. It is an asset.
• It has potential and creates new opportunities.
• Use gives rise to direct and indirect outcomes.
...but...
• Deployment depends on software, hardware and people.
• Software, hardware and people change.
...therefore...
• Access is not guaranteed without (some) action
• Value, opportunity, impact not guaranteed
![Page 6: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/6.jpg)
Key responses
1. Migration: Changing the format of a file to ensure the information content can be read
2. Emulation: Intervening in the operating system to ensure that old software can function and information content can be read
3. Hardware preservation: Maintaining access to data and processes by maintaining the physical computing environment including hardware and peripherals.
4. etc: Research and development field, new solutions and new approaches continue to emerge, e.g. virtualisation for preservation
![Page 7: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/7.jpg)
Access and long-term use
depend on the
configuration of hardware
and software and the
capacity of the operator.
Change is not a bug.
![Page 8: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/8.jpg)
Technology continues to
change, creating the
conditions for obsolescence.
Need to become a learning
institution
![Page 9: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/9.jpg)
Storage media have a short life
and storage devices are subject
to obsolescence.
Be mobile and format neutral
![Page 10: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/10.jpg)
Digital preservation systems
are subject to the same
obsolescence as the objects
they safeguard.
Standards and modularity
![Page 11: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/11.jpg)
Digital resources are intolerant
of gaps in preservation.
Ongoing process
![Page 12: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/12.jpg)
The problems are more
subtle than we realised a
decade ago…
e.g. file format
obsolescence
Changing file formats?
Conformant containers?
Units of information?
![Page 13: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/13.jpg)
How to pick a winner ...
Todd, M 2009 ‘File formats for preservation’, DPC Technology Watch Report
02/09, online at http://www.dpconline.org/advice/technology-watch-reports.html
beyond and potentially over-writing the criteria ...
repository managers should align the recognition and
weighting of criteria with a clear preservation strategy
that articulates the purpose of the repository and the
needs of its designated community;
![Page 14: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/14.jpg)
How to pick a winner
...
... it’s not going to be about obsolescence so
much as workflow and capacity
You ain’t seen nothing yet
Data growth on 3 axes
•volume
•complexity
•expectation
![Page 15: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/15.jpg)
Digital Preservation as a ‘discipline’
Daunting challenge
Decade of research and development
Replete with jargon and acronyms
Turf war between professions?
A whole new barrier
The last decade has shown definitively that using
fancy words is not the same as solving problems
Courtesy NASA/JPL-Caltech
![Page 16: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/16.jpg)
How much does it cost?
Lifecycle costs of digital objects
vs
Lifecycle costs of books
vs
Lifecycle costs of museum objects
vs
Lifecycle costs of archives
vs
Lifecycle costs of the historic environment
![Page 17: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/17.jpg)
www.dpconline.org
Getting started…
![Page 18: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/18.jpg)
The reality?
You don’t need to understand
or do all of this.
... and it doesn’t all have to exist at the same time
![Page 19: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/19.jpg)
The reality?
Get started now
not later
![Page 20: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/20.jpg)
Preservation Lifecycle
Identification: DROID, FIDO, FILE, FITS, TIKA…
Characterisation: JHOVE, JPYLYZER, exiftool, FITS…
Risk Assessment: Knowledge + Policy + Risk = Continue
Planning: Plato
Action: Migration, Emulation
![Page 21: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/21.jpg)
This Training
Identification: DROID, FIDO, FILE, FITS, TIKA
Characterisation: JHOVE, JPYLYZER, Exiftool, FITS
![Page 22: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/22.jpg)
So you have dug a hole?
![Page 23: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/23.jpg)
Stage 2
• What did you find?
• Is it worth preserving?
• What are the problems?
![Page 24: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/24.jpg)
Aim of Training
![Page 25: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/25.jpg)
Time to get married
Column 1
1. Luis Bravo
2. Jose Casanova
3. Vitor Fernandes
4. Sebastien Leroux
5. Joao Pereira
6. Rui Rodrigues
7. Carlos Velentim
8. Jose Carvalho (Papiro)
9. Omar Coelho
Column 2
1. Jose Carvalho (SDUM)
2. Carlos Duarte
3. Luis Ferreira
4. Cristiana Freitas
5. Claire Johnson
6. Anthony Laerdahl
7. Helena Medeiros
8. Antonio Rodrigues
9. Cidalia Ferreira
![Page 26: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/26.jpg)
Getting Started (1)
• Wifi Network = SMS, password = Sarmento1881127
• Download VirtualBox (if you don’t have it)
• Start VirtualBox
• Plug in the USB memory key
• Open the memory key folder and double-click the extension pack file to install it (follow the on-screen instructions at this point)
• Return to VirtualBox:
• From the main menu (File), select “Import Appliance”
• Browse to the memory key and select the only file
• Wait for the import to finish
• Once done, you can safely remove the key.
![Page 27: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/27.jpg)
Getting Started (2)
• Once the import is done, select the machine and press the Settings button (it may be under the right-click menu)
• Click shared folders
• Click add
• Add a shared folder (e.g. your desktop or downloads folder)
• Tick auto-mount!
• Click OK to return to the main screen
• Start the machine
• Wait..
![Page 28: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/28.jpg)
Getting Started (3)
• Password is training.
• Ignore update manager if it appears
• Press the top-left Ubuntu home button and
type terminal (select and run the app)
• Type: cd /media/sf_Desktop (where Desktop is
the folder you shared previously) and press
enter
• Type: fido *
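The `fido *` step above asks the tool to identify every file in the shared folder. As a rough illustration of the idea behind signature-based identification, here is a minimal Python sketch. It is not how FIDO itself is implemented (FIDO matches against the much larger PRONOM signature set); the small `SIGNATURES` table and the function names `identify_bytes` and `identify_folder` are hypothetical and exist only for this example.

```python
from pathlib import Path

# Hypothetical, deliberately tiny signature table: leading "magic" bytes -> label.
# Real identification tools rely on a far larger registry of signatures.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"PK\x03\x04", "ZIP container"),
]


def identify_bytes(header: bytes) -> str:
    """Return the label of the first signature that matches, else 'unknown'."""
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    return "unknown"


def identify_folder(folder: str) -> dict:
    """Identify each regular file directly inside `folder` from its first bytes."""
    results = {}
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():
            with path.open("rb") as handle:
                results[path.name] = identify_bytes(handle.read(16))
    return results
```

Calling `identify_folder("/media/sf_Desktop")` inside the training machine would return a dictionary of file names and labels. Note that the file extension is never consulted, which is the point of identification by content.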
![Page 29: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/29.jpg)
Bundle or Not?
• Pros
– Single Input/Output
– Consistent
– Easy
• Cons
– Out of date
– Doesn’t Scale
![Page 30: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/30.jpg)
Questions (1)
• What tool would you use?
![Page 31: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/31.jpg)
Training
Keeping Control - Scalable
Environments for Identification
and Characterisation
![Page 32: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/32.jpg)
Aims
This training course will cover elements dealing
with scalable identification, characterisation
and validation of large collections of varying file
types. Users will be introduced to a number of
tools designed for each of these purposes and
involved in problem solving scenarios. Further,
users will be required to evaluate the use of
scalable and cloud based technologies in
developing solutions for given scenarios.
![Page 33: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/33.jpg)
Learning Outcomes (1)
• Distinguish between different file types and
identify the requirements for characterising
each.
• Carry out a number of identification,
characterisation, and duplication detection
experiments on example files.
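One simple form of the duplication detection mentioned above is to group files by a checksum of their content. The following is a minimal sketch under that assumption; it only finds byte-for-byte identical files, not near-duplicates, and the function names `sha256_of` and `find_duplicates` are hypothetical rather than part of any tool used in the training.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(folder: str) -> list:
    """Return groups of relative file paths whose contents are byte-identical."""
    by_digest = defaultdict(list)
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            by_digest[sha256_of(path)].append(str(path.relative_to(folder)))
    # Keep only digests shared by more than one file.
    return [names for names in by_digest.values() if len(names) > 1]
```

Grouping by digest needs a single pass over the collection, which matters once the number of files grows.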
![Page 34: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/34.jpg)
Learning Outcomes (2)
• Critically evaluate characterisation and
identification tools and assess their
advantages and disadvantages when used in
different scenarios.
![Page 35: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/35.jpg)
Learning Outcomes (3)
• Conduct an in-depth analysis of large volumes
of identification and characterisation data and
find representative sample records suitable for
preservation planning experiments.
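As a hedged illustration of what analysing identification data and picking sample records could look like, the sketch below builds a format profile and picks one sample per format. The record layout (`name`, `format`, `size`) and the "closest to median size" rule are assumptions made for this example; they are not prescribed by the course or by any particular tool.

```python
from collections import Counter, defaultdict


def format_profile(records: list) -> Counter:
    """Count how many records were identified as each format."""
    return Counter(record["format"] for record in records)


def representative_samples(records: list) -> dict:
    """Pick, for each format, the record whose size is the (upper) median."""
    groups = defaultdict(list)
    for record in records:
        groups[record["format"]].append(record)
    samples = {}
    for fmt, group in groups.items():
        ordered = sorted(group, key=lambda record: record["size"])
        samples[fmt] = ordered[len(ordered) // 2]["name"]
    return samples


# Hypothetical identification output for four files.
records = [
    {"name": "a.pdf", "format": "PDF", "size": 100},
    {"name": "b.pdf", "format": "PDF", "size": 300},
    {"name": "c.pdf", "format": "PDF", "size": 200},
    {"name": "d.png", "format": "PNG", "size": 50},
]
profile = format_profile(records)
samples = representative_samples(records)
```

Other sampling rules (by creation date, by validation outcome, by producing software) may suit a given preservation plan better; median size is only one easy starting point.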
![Page 36: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/36.jpg)
Learning Outcomes (4)
• Compare and contrast the differences in
running characterisation and identification
tools both stand-alone and within workflows.
• Envisage a system that combines workflows
with identification, characterisation and
validation tools to suit a variety of scenarios.
![Page 37: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/37.jpg)
Our Last Commitment
Slides will be available Monday!
![Page 38: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/38.jpg)
Thank You
Franz San Galli
![Page 39: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/39.jpg)
Thank You
![Page 40: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/40.jpg)
Thank You
![Page 41: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/41.jpg)
Next Time…
Building Applications Infrastructures
for Action Services
London, September 2013
(wet)
![Page 42: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/42.jpg)
Then….
Critical Path: Effective Evidence Based
Preservation Planning
Denmark, November 2013
(cold)
![Page 43: Digital Preservation - The Saga Continues - SCAPE Training event, Guimarães 2012](https://reader034.vdocuments.mx/reader034/viewer/2022051817/54833d2ab4af9faf0d8b499a/html5/thumbnails/43.jpg)
Tonight
www.goo.gl/q6wKB
Our Table
7:15pm Eleven Bar
@ Hotel Fundador
Free Beer*
* 1 Free Beer subject to completion of online survey!