MODELLING THE DIGITAL PRESERVATION COSTS
Paul Wheatley
Digital Preservation Manager
British Library
Summary
Overview of the model: aims, development process, model
Results
Evaluation
Conclusions
Scope
Acquisition
Ingest
Metadata
Storage
Access
Preservation
Background and aims
Previous work (see Final Report):
• National Archief, Digital Bewaring – full costing/audit approach
• Oltmans, Kol – lifecycle and strategies
Key aims:
• Make the first major step in defining and estimating the lifecycle cost of digital preservation activities.
• Propose a model for comment by the wider preservation community.
• Enable the LIFE Case Studies to be compared and contrasted by providing some cost estimates for “P” in the Lifecycle Model.
• Attempt to identify the scale of preservation costs. Are they dramatically high, as suggested previously by many in the preservation community, or are they more achievable, as suggested recently (see Rusbridge, C, “Excuse Me... Some Digital Preservation Fallacies?”)?
Development process
Key cost factors, experimentation, iterative development and refinement
Based on evidence or indications of trends where possible
Editable inputs where key estimates or assumptions were made
Cost component review
Application of draft model, refinement of inputs
Team review, refinement of model weaknesses
The Generic LIFE Preservation Model
Preservation = t * TEW + (t / ULE + PON) * (CRS + UME + PPA + QAA)
Expansion of calculated components:
• ULE – Unaided Life Expectancy of a Format = BLE + 0.1 * t
• CRS – Cost of new rendering solution = (1 - PTA) * TDC * FCX + PTA * COA
• PPA – Performing preservation action = PON * (SCM + n * HVM)
• QAA – Quality Assurance = n * BCT * FCX
• PTA – Proportion of Tool Availability = STA * (1 - t/20) + ETA * (t/20)
Expansion of scaling components:
• PON – Proportion of normalisation = 0.4
• FCX – Format complexity (e.g. JPEG = 0.2, WMF = 0.4, PDF = 0.6, Word = 0.8)
Expansion of cost component inputs:
• HVM – High volume migration cost per object = £0.05
• BCT – Base cost of testing a preservation action per object = £0.17
• UME – Update Metadata = 2 metadata officer weeks @ £30k annual salary = £1250
• TDC – Tool development cost = 24 programmer months @ £30k annual salary = £60000
• COA – Cost of available tool = £1500
• TEW – Technology Watch = 1 metadata officer week @ £30k annual salary = £625
• BLE – Base life expectancy = 8 (years)
• STA – Starting tool availability = 0.5
• ETA – Ending tool availability = 0.9
• SCM – Setup cost of migration = £340
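To make the formula easier to experiment with, the following is a minimal Python sketch of the model as written on this slide. The names `ModelInputs` and `preservation_cost` are illustrative and not part of the LIFE project; the default values are the cost inputs listed above, and PTA is assumed to be evaluated at the end of the period t. The usage lines take the example of 20000 GIF objects (complexity 0.2) over 10 years.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInputs:
    """Editable inputs, defaulting to the values given on the slide (GBP, years)."""
    hvm: float = 0.05      # High volume migration cost per object
    bct: float = 0.17      # Base cost of testing a preservation action per object
    ume: float = 1250.0    # Update metadata
    tdc: float = 60000.0   # Tool development cost
    coa: float = 1500.0    # Cost of available tool
    tew: float = 625.0     # Technology watch, per year
    ble: float = 8.0       # Base life expectancy of a format
    sta: float = 0.5       # Starting tool availability
    eta: float = 0.9       # Ending tool availability
    scm: float = 340.0     # Setup cost of migration
    pon: float = 0.4       # Proportion of normalisation


def preservation_cost(n, fcx, t, inputs=ModelInputs()):
    """Cost of preserving n objects of format complexity fcx for the period 0..t.

    Returns the five cost components and their total.
    """
    ule = inputs.ble + 0.1 * t                                # ULE
    pta = inputs.sta * (1 - t / 20) + inputs.eta * (t / 20)   # PTA (assumed at time t)
    crs = (1 - pta) * inputs.tdc * fcx + pta * inputs.coa     # CRS
    ppa = inputs.pon * (inputs.scm + n * inputs.hvm)          # PPA
    qaa = n * inputs.bct * fcx                                # QAA
    frequency = t / ule + inputs.pon                          # frequency of action

    breakdown = {
        "technology_watch": t * inputs.tew,
        "tool_cost": frequency * crs,
        "metadata": frequency * inputs.ume,
        "preservation_action": frequency * ppa,
        "quality_assurance": frequency * qaa,
    }
    breakdown["total"] = sum(breakdown.values())
    return breakdown


example = preservation_cost(n=20000, fcx=0.2, t=10)
for name, value in example.items():
    print(f"{name}: GBP {value:,.2f}")
```

Because every input lives in one dataclass, an assumption can be varied in isolation, for example `preservation_cost(20000, 0.2, 10, ModelInputs(pon=0.2))`.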
The Generic LIFE Preservation Model: key elements explained
Preservation = t * TEW + (t / ULE + PON) * (CRS + UME + PPA + QAA)
Preservation = Tech Watch + Frequency of action * Preservation action
• Preservation: preservation cost of n objects of a particular format for the period 0 to t, e.g. 20000 objects of the GIF format for a period of 10 years.
• Tech Watch: monitoring formats and software for obsolescence.
• Frequency of action: the number of preservation actions within the time period calculated.
• Preservation action = Cost of preservation tool + Update metadata + Perform preservation action + Q/A
• Update metadata: updating and managing metadata (Representation Information).
The occurrence of costs (1st detailed sample of the model)
Preservation = Tech Watch + Frequency of action * Preservation action
Series of small technology watch events and spikes of preservation activity at increasing intervals
[Figure: preservation activity against time (years), from 0 to t, showing technology watch events and preservation actions]
Base life expectancy = 8 years; increases by a year every decade
Example: FCLA Action Plans http://www.fcla.edu/digitalArchive/
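As a rough illustration of the frequency term, the sketch below evaluates ULE = BLE + 0.1 * t and the frequency of action t / ULE + PON for a few time periods. The function names are illustrative; BLE = 8 and PON = 0.4 are the model's stated inputs.

```python
def unaided_life_expectancy(t, ble=8.0):
    """ULE in years: the base life expectancy grows by one year per decade."""
    return ble + 0.1 * t


def frequency_of_action(t, ble=8.0, pon=0.4):
    """Number of preservation actions expected in the period 0..t."""
    return t / unaided_life_expectancy(t, ble) + pon


for years in (1, 5, 10, 20):
    ule = unaided_life_expectancy(years)
    freq = frequency_of_action(years)
    print(f"t={years:>2} years: ULE={ule:.1f} years, frequency of action={freq:.2f}")
```

Since ULE grows with t, the number of actions rises slightly more slowly than the period itself, which matches the "increasing intervals" described on the slide.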
Complexity of file formats (2nd detailed sample of the model)
Preservation = Tech Watch + Frequency of action * Preservation action
Preservation action = Cost of preservation tool + Update metadata + Perform preservation action + Q/A
Factors in format complexity:
• Size
• Complexity
• Proprietary
• Open
• Standardised
Format Complexity:
Category     Complexity   Examples
Simple       0.1          ASCII, Unicode
Bitmap       0.2          JPEG, GIF
Mark-up      0.3          XML, HTML
Vector       0.4          EMF, Draw
Multimedia   0.6          MPEG3, WAV
Document     0.8          Word, PDF
Complex      1            Oracle database dump
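The complexity categories can be treated as a simple lookup that scales the quality assurance term QAA = n * BCT * FCX. The sketch below is illustrative only: the dictionary and function names are assumptions, while the complexity values and BCT = £0.17 come from the slides.

```python
# Format complexity (FCX) by category, as tabulated on the slide.
FORMAT_COMPLEXITY = {
    "Simple": 0.1,       # ASCII, Unicode
    "Bitmap": 0.2,       # JPEG, GIF
    "Mark-up": 0.3,      # XML, HTML
    "Vector": 0.4,       # EMF, Draw
    "Multimedia": 0.6,   # MPEG3, WAV
    "Document": 0.8,     # Word, PDF
    "Complex": 1.0,      # Oracle database dump
}

BCT = 0.17  # Base cost of testing a preservation action per object (GBP)


def quality_assurance_cost(n, category):
    """QAA for one preservation action on n objects of the given category."""
    return n * BCT * FORMAT_COMPLEXITY[category]


for category in FORMAT_COMPLEXITY:
    cost = quality_assurance_cost(1000, category)
    print(f"{category:<10} QAA per action for 1000 objects: GBP {cost:.2f}")
```

The same factor also scales the tool development term in CRS, so a more complex format raises both the per-object testing cost and the cost of building a new tool.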
Preservation tool cost (3rd detailed sample of the model)
Preservation = t * TEW + (t / ULE + PON) * (CRS + UME + PPA + QAA)
Preservation = Tech Watch + Frequency of action * Preservation action
Preservation action = Cost of preservation tool + Update metadata + Perform preservation action + Q/A
Cost of Preservation Tool (CRS) = (1 - PTA) * Cost of developing a new tool + PTA * Cost of acquiring an existing tool
• Cost of developing a new tool = Tool Development Cost (TDC) * Format Complexity
• Tool Development Cost (TDC) = estimated as 24 programmer months @ £30k annual salary (£60000)
• Cost of acquiring an existing tool = Cost of Available tool = estimated as £1500
• Proportion of Tool Availability (PTA) = STA * (1 - t/20) + ETA * (t/20), with STA = 0.5 and ETA = 0.9
[Figure: tool availability, proportion of tool availability (PTA) from 0% to 100% against time (years) from 0 to 20, rising from STA to ETA; annotated "Average proportion across the time period"]
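A short sketch of the tool cost term may help show how the two branches are blended. The function names and the `horizon` parameter (the 20 years in the PTA formula) are illustrative; TDC, COA, STA and ETA use the slide's values.

```python
def proportion_tool_availability(t, sta=0.5, eta=0.9, horizon=20.0):
    """PTA: linear interpolation from STA at year 0 to ETA at year `horizon`."""
    return sta * (1 - t / horizon) + eta * (t / horizon)


def cost_of_rendering_solution(t, fcx, tdc=60000.0, coa=1500.0):
    """CRS: expected tool cost per preservation action.

    With probability (1 - PTA) a tool must be developed (TDC scaled by format
    complexity); with probability PTA an existing tool is acquired (COA).
    """
    pta = proportion_tool_availability(t)
    return (1 - pta) * tdc * fcx + pta * coa


for fcx in (0.2, 0.8):
    crs = cost_of_rendering_solution(t=10, fcx=fcx)
    print(f"FCX={fcx}: CRS at t=10 is GBP {crs:,.2f}")
```

At the slide's default inputs, development cost dominates: even with 70% tool availability at t = 10, most of CRS comes from the (1 - PTA) * TDC * FCX branch.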
Estimated costs using the model
Estimated preservation costs for GIF files in the Web Archiving Case Study:
File format   Format complexity   Number of objects   Frequency of pres action
GIF           0.2                 225079              1.51
File format   Technology watch   Preservation tool cost   Metadata   Preservation action   Quality assurance   Total cost (over 10 years)
GIF           £6,250             £7,027                   £1,889     £7,008                £11,564             £33,738
Comparison of average object preservation costs across the Case Studies:
Case study name   Sub category   Year 1   Year 10   Percentage of total lifecycle cost
VDEP              e-monographs   £0.89    £1.45     4%
VDEP              e-serials      £10      £27       2%
Web archiving                    £425     £8509     62%
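The GIF figures can be traced back to the model by substituting n = 225079, FCX = 0.2 and t = 10 into the formula. The working below is a reconstruction rather than the author's own calculation; it assumes PTA is evaluated at t = 10, which reproduces the reported values to the nearest pound.

```latex
\begin{align*}
\mathrm{ULE} &= \mathrm{BLE} + 0.1\,t = 8 + 1 = 9 \\
f &= t/\mathrm{ULE} + \mathrm{PON} = 10/9 + 0.4 \approx 1.51 \\
\mathrm{PTA} &= 0.5\,(1 - 10/20) + 0.9\,(10/20) = 0.7 \\[4pt]
\text{Technology watch} &= t \cdot \mathrm{TEW} = 10 \times 625 = 6250 \\
\text{Tool cost} &= f \cdot \mathrm{CRS} = f \cdot (0.3 \times 60000 \times 0.2 + 0.7 \times 1500) = f \cdot 4650 \approx 7027 \\
\text{Metadata} &= f \cdot \mathrm{UME} = f \cdot 1250 \approx 1889 \\
\text{Preservation action} &= f \cdot \mathrm{PPA} = f \cdot 0.4\,(340 + 225079 \times 0.05) = f \cdot 4637.58 \approx 7008 \\
\text{Quality assurance} &= f \cdot \mathrm{QAA} = f \cdot (225079 \times 0.17 \times 0.2) \approx f \cdot 7652.69 \approx 11564 \\[4pt]
\text{Total} &\approx 6250 + 7027 + 1889 + 7008 + 11564 = 33738
\end{align*}
```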
Model outputs: WA Case Study, percentage breakdown
[Figure: breakdown of complete preservation costs over time in the WA Case Study, for time periods of 1, 5, 10 and 20 years, split into technology watch, tool cost, metadata, preservation action and quality assurance]
Self evaluation of the model
Evaluation against key aims:
• Make the first major step in defining and estimating the lifecycle cost of digital preservation activities.
• Propose a model for comment by the wider preservation community.
• Enable the LIFE Case Studies to be compared and contrasted by providing some cost estimates for “P” in the Lifecycle Model.
• Attempt to identify the scale of preservation costs. Are they dramatically high, as suggested previously by many in the preservation community, or are they more achievable, as suggested recently (see Rusbridge, C, “Excuse Me... Some Digital Preservation Fallacies?”)?
Further work and refinement
Refinement based on real cost data, removal of assumptions
Level of detail
Format complexity
Re-ingest
More detailed discussion in the Final Report…
Summary and conclusions
Estimating the cost is not easy but appears to be possible!
Provides a useful perspective on performing preservation
Focuses on achieving cost effective preservation
Finally…
Two appeals to the audience:
Please cost, record and publish your preservation work
Provide comment on the preservation model:
Questions, comments, evaluation:[email protected]