Total Cost of Preservation: Cost Modeling for Sustainable Services
Stephen Abrams, Patricia Cruse, John Kunze
University of California Curation Center, California Digital Library
Screening the Future 2012: Pause, Play, and Press Forward
Los Angeles, May 21-23, 2012
Outline
Goals
Prior work
Modeling preservation activity
Total cost of preservation
► Pay-as-you-go price model
► Paid-up price model
Conclusions
Questions and discussion
http://wiki.ucop.edu/display/Curation/Cost+Modeling
Source: Getty Images
Goals
Understand costs in order to plan for and implement sustainable preservation services
Investigate the possibility of paid-up pricing in order to address
► Boom-or-bust budget cycles
► Fixed-term, grant-funded projects
[Image with callout “End date!” Source: www.sharedidiz.com/]
Prior work
Nationaal Archief (2005) http://www.nationaalarchief.nl/sites/default/files/docs/kennisbank/codpv1.pdf
LIFE (2008) http://www.life.ac.uk/
KRDS (2010) http://www.beagrie.com/krds.php
DataSpace (2010) http://arks.princeton.edu/ark:/88435/dsp01w6634361k
Jean-Daniel Zeller (2010), “Cost of digital archiving: Is there a universal model?”, 8th European Conference on Digital Archiving, Geneva, April 28-30, 2010 http://regarddejanus.files.wordpress.com/2010/05/costsdigitalarchiving-_jdz_eca2010.pdf
Rosenthal (2011) http://blog.dshr.org/2011/09/modeling-economics-of-long-term-storage.html
Themes in the prior work:
► Identification of granular cost components
► Assumption of annual decrease in aggregate cost, i.e., discounted cash flow (DCF)
► Critique of DCF approach
Key assumptions
Consider only the costs incurred by the preservation service provider
► Costs of content creation by collection managers are out of scope
Costs can be categorized unambiguously as fixed or marginal, and one-time or recurring
► One-time costs can be annualized over the effective lifespan of the activity or system component
Cost model components
System, composed of various
Services for necessary/desirable functions, running on
Servers, deployed by
Staff, in support of content
Producers, who use
Workflows to submit instances of
Content Types, which occupy
Storage, and are subject to ongoing
Monitoring and periodic
Interventions; all subject to managerial
Oversight
Total cost of preservation
\[ TCP = A + n \cdot P + m \cdot W + \ell \cdot C + k \cdot S + j \cdot M + i \cdot V + O \]
where
TCP: total cost to service provider
A: fixed cost of System (the System component subsumes Services and Servers)
n, P: number and unit cost of Producers
m, W: number and unit cost of Workflows
ℓ, C: number and unit cost of Content Types
k, S: number and unit cost of Storage
j, M: number and unit cost of Monitoring
i, V: number and unit cost of Interventions
O: fixed cost of Oversight
Staff costs are subsumed by other components
Model is rich enough to represent the full economic cost of preservation
Implemented by a spreadsheet that captures all subsidiary costs
But service providers can customize the model to exclude components whose costs are not recoverable or are subsidized as a matter of local policy
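The additive structure of the model lends itself to a small calculation. The sketch below is a minimal illustration, not the authors' spreadsheet: the `Component` class, the `total_cost` function, and every dollar figure are hypothetical. It adds the fixed System and Oversight costs to number × unit cost for each remaining component, and lets a provider exclude components whose costs it does not recover.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One cost component: how many units, and the annualized cost of each."""
    count: float
    unit_cost: float

    @property
    def cost(self) -> float:
        return self.count * self.unit_cost


def total_cost(system, oversight, components, exclude=()):
    """Fixed System and Oversight costs plus count * unit cost of the rest.

    Names listed in `exclude` are left out, mirroring a local policy of not
    recovering (or of subsidizing) those costs.
    """
    fixed = {"System": system, "Oversight": oversight}
    total = sum(v for name, v in fixed.items() if name not in exclude)
    total += sum(c.cost for name, c in components.items() if name not in exclude)
    return total


# Hypothetical annual figures, for illustration only.
components = {
    "Producers": Component(count=20, unit_cost=500.0),
    "Workflows": Component(count=5, unit_cost=2_000.0),
    "Content Types": Component(count=8, unit_cost=1_000.0),
    "Storage": Component(count=100, unit_cost=300.0),  # e.g. per TB
    "Monitoring": Component(count=4, unit_cost=1_500.0),
    "Interventions": Component(count=2, unit_cost=5_000.0),
}

full = total_cost(50_000.0, 10_000.0, components)
subsidized = total_cost(50_000.0, 10_000.0, components, exclude=("Oversight",))
print(f"full cost: {full:,.2f}  with Oversight subsidized: {subsidized:,.2f}")
```

Keeping each component as a (count, unit cost) pair mirrors the model's own vocabulary and makes it easy to swap in locally measured values.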
Assumption: Cost allocation
Costs of the Archive, Workflows, Content Types, Monitoring, and Interventions are “common goods”
► Equally beneficial to all Producers
► Properly apportioned across all Producers
Cost of a single Producer
\[ G = \frac{A + m \cdot W + \ell \cdot C + j \cdot M + i \cdot V + O}{n} + P + k \cdot S \]
where
G: total cost attributable to a given Producer
n: number of Producers
P: unit cost of a Producer
k: number of Storage units attributable to the Producer
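The allocation rule can be sketched in a few lines: common-good costs are shared equally among all Producers, and each Producer additionally carries its own unit cost and its own storage. The function name `cost_per_producer` and all figures below are hypothetical; this is an illustration of the rule, not the authors' implementation.

```python
def cost_per_producer(common_costs, n_producers, producer_unit_cost,
                      storage_units, storage_unit_cost):
    """Equal share of the common-good costs, plus the Producer's own costs.

    common_costs: mapping of shared component name -> total annual cost.
    """
    if n_producers <= 0:
        raise ValueError("n_producers must be positive")
    shared = sum(common_costs.values()) / n_producers
    return shared + producer_unit_cost + storage_units * storage_unit_cost


# Hypothetical annual totals for the shared components.
common = {
    "Archive": 50_000.0,
    "Workflows": 10_000.0,
    "Content Types": 8_000.0,
    "Monitoring": 6_000.0,
    "Interventions": 10_000.0,
    "Oversight": 10_000.0,
}

# A hypothetical Producer holding 5 storage units at 300 per unit.
g = cost_per_producer(common, n_producers=20, producer_unit_cost=500.0,
                      storage_units=5, storage_unit_cost=300.0)
print(f"annual cost for this Producer: {g:,.2f}")
```

Only storage scales with what an individual Producer deposits, so two Producers in the same archive differ in price only through their storage footprint.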
Assumptions: Billing
Costs are billed at the end of the period of service
The cost model should be revenue neutral
Pay-as-you-go cash flow
[Cash flow diagram: in each period t = 0, 1, 2, 3, …, an expense of G is matched by income of G.]
Cumulative pay-as-you-go price over time period T
\[ G(T) = \sum_{t=0}^{T-1} G = T \cdot G \]
where G is the pay-as-you-go price for a single Producer, i.e., the cost of a single Producer
Cumulative pay-as-you-go price as a function of time T
[Chart: cumulative pay-as-you-go price G(T). Horizontal axis: Year (T), 0 to 30. Vertical axis: Cost ($), $0 to $16,000.]
Cumulative pay-as-you-go price over time period T
\[ G(T) = \sum_{t=0}^{T-1} G = T \cdot G \]
… for “forever”
\[ G(\infty) = \infty \]
Assumptions: Costs over time
The aggregate cost of providing preservation service decreases over time, and that decrease is uniform
► Moore’s and Kryder’s laws
► State-of-the-art tools and understanding
► Productivity increases
[Charts: Moore’s law, 1971 – 2011; Kryder’s law, 1980 – 2012. Source: Wikipedia]
Discounted pay-as-you-go cash flow
[Discounted cash flow (DCF) diagram: in periods t = 0, 1, 2, …, expenses of G, (1–d)·G, (1–d)²·G, … are matched by equal income.]
Discounted pay-as-you-go price over time period T
\[ G(T, d) = \sum_{t=0}^{T-1} (1-d)^t \cdot G \]
where
G: pay-as-you-go price for a single Producer (cost of a single Producer)
(1–d)^t: discounting factor, compounding over time
Discounted pay-as-you-go price as a function of time T
[Chart: cumulative pay-as-you-go G(T), discounted pay-as-you-go G(T, d), and its limit G(∞, d). Horizontal axis: Year (T), 0 to 30. Vertical axis: Cost ($), $0 to $16,000.]
(1–d)^t discount factor
Discounted pay-as-you-go price over time period T
\[ G(T, d) = G \cdot \frac{1 - (1-d)^T}{d} \]
… for “forever”
\[ G(\infty, d) = \frac{G}{d} \]
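The discounted pay-as-you-go price is a geometric series, so the year-by-year sum and the closed form should agree. The sketch below checks this numerically; the function names are hypothetical, and G = $650 with d = 5% are borrowed from the deck's paid-up example purely as convenient inputs.

```python
def discounted_sum(G, d, T):
    """Year-by-year total: G + (1-d)*G + ... + (1-d)**(T-1) * G."""
    return sum(G * (1 - d) ** t for t in range(T))


def discounted_closed_form(G, d, T):
    """Closed form of the same geometric series."""
    if d == 0:
        return G * T  # no discounting: plain cumulative pay-as-you-go
    return G * (1 - (1 - d) ** T) / d


def discounted_forever(G, d):
    """Limit as T grows without bound; finite only when d > 0."""
    if d <= 0:
        raise ValueError("the series converges only for d > 0")
    return G / d


G, d = 650.0, 0.05
for T in (10, 20, 30):
    print(T, round(discounted_sum(G, d, T), 2),
          round(discounted_closed_form(G, d, T), 2))
print("forever:", round(discounted_forever(G, d), 2))
```

With d = 0 the closed form would divide by zero, which is why that case falls back to the undiscounted total T·G.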
Discount factor
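d is the weighted sum of the expected changes in number and unit cost of individual components
\[ d = \omega_A d_A + \omega_W (d_m + d_W) + \omega_C (d_\ell + d_C) + \omega_M (d_j + d_M) + \omega_V (d_i + d_V) + \omega_P d_P + \omega_S (d_k + d_S) \]
Weighting factors ω are the proportion that a particular component contributes to the aggregate cost G, e.g.
\[ \omega_A = \frac{A}{n \cdot G}, \qquad \omega_W = \frac{m \cdot W}{n \cdot G} \]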
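A minimal sketch of the weighted sum follows, assuming each component is described by its contribution to G plus two expected annual rates of decrease (one for its number, one for its unit cost). The function name, the component figures, and the rates are all hypothetical.

```python
def weighted_discount(contributions):
    """Aggregate discount factor d as a contribution-weighted sum.

    contributions: name -> (contribution to G, expected annual decrease in
    number, expected annual decrease in unit cost). The weight of each
    component is its share of the aggregate cost G.
    """
    G = sum(cost for cost, _, _ in contributions.values())
    if G <= 0:
        raise ValueError("aggregate cost must be positive")
    return sum((cost / G) * (d_number + d_unit)
               for cost, d_number, d_unit in contributions.values())


# Hypothetical per-Producer contributions (summing to G = 650) and rates.
contributions = {
    "Archive": (250.0, 0.00, 0.04),
    "Workflow": (100.0, 0.00, 0.02),
    "Storage": (300.0, 0.00, 0.10),
}

d = weighted_discount(contributions)
print(f"aggregate discount factor d = {d:.4f}")
```

A component whose number is expected to grow can be given a negative rate for its number, which pulls the aggregate d down.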
Drawbacks to pay-as-you-go pricing
Only viable for Producers with reliable annual funding sources
Boom-or-bust budgeting or the termination of funded project work can interrupt this funding
Any interruption in proactive preservation care can lead to irretrievable data loss
Assumptions: Investment return
Preservation service providers can carry forward budgetary surpluses across fiscal years
Surplus funds can be invested with the return supplementing the surplus
Paid-up cash flow
[Cash flow diagram, periods t = 0, 1, 2, 3:
t = 0: income F; surplus F
t = 1: income r·F; expense G; surplus (1+r)·F – G
t = 2: income r·[(1+r)·F – G]; expense (1–d)·G; surplus (1+r)·[(1+r)·F – G] – (1–d)·G
t = 3: income r·[(1+r)·[(1+r)·F – G] – (1–d)·G]; expense (1–d)²·G; surplus (1+r)·[(1+r)·[(1+r)·F – G] – (1–d)·G] – (1–d)²·G]
Paid-up price for time period T
\[ F(T, d, r) = \sum_{t=0}^{T-1} \frac{(1-d)^t}{(1+r)^{t+1}} \cdot G \]
where
F: paid-up price
G: cost of a single Producer
(1+r)^{t+1}: investment return
Paid-up price as a function of time T
[Chart: cumulative pay-as-you-go G(T); discounted pay-as-you-go G(T, d) and G(∞, d); paid-up price F(T, d, r) and F(∞, d, r). Horizontal axis: Year (T), 0 to 30. Vertical axis: Cost ($), $0 to $16,000.]
(1–d)^t discount factor
(1+r)^t investment return
Paid-up price for time period T
\[ F(T, d, r) = G \cdot \frac{(1+r)^T - (1-d)^T}{(1+r)^T \cdot (r+d)} \]
… for “forever”
\[ F(\infty, d, r) = \frac{G}{r+d} \]
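The paid-up price can be computed either as the sum of each year's discounted cost divided by the investment growth available to cover it, or from the closed form of that geometric series. The sketch below compares the two; the function names are hypothetical, and the inputs (G = $650, d = 5%, r = 2%, T = 10) are those of the deck's paid-up example.

```python
def paid_up_sum(G, d, r, T):
    """Sum over t of G * (1-d)**t / (1+r)**(t+1)."""
    return sum(G * (1 - d) ** t / (1 + r) ** (t + 1) for t in range(T))


def paid_up_closed_form(G, d, r, T):
    """Closed form of the same series."""
    if r + d == 0:
        return G * T / (1 + r)  # common ratio is 1: T equal terms
    growth = (1 + r) ** T
    return G * (growth - (1 - d) ** T) / (growth * (r + d))


def paid_up_forever(G, d, r):
    """Limit as T grows without bound; finite only when r + d > 0."""
    if r + d <= 0:
        raise ValueError("the series converges only for r + d > 0")
    return G / (r + d)


G, d, r, T = 650.0, 0.05, 0.02, 10
print(round(paid_up_sum(G, d, r, T), 2))
print(round(paid_up_closed_form(G, d, r, T), 2))
print(round(paid_up_forever(G, d, r), 2))
```

The exponent t + 1 on the investment return reflects billing at the end of each period of service: the first expense falls due after one year of return on F.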
Paid-up example
Pay-as-you-go price, G: $650 (1 TB)
Discount factor, d: 5%
Investment return, r: 2%
Term, T: 10 years
Paid-up price, F: $4,725

Year   Income       Expense    Surplus
0      $4,725.00    –          $4,725.00
1      $94.32       $650.00    $4,169.32
2      $83.39       $617.50    $3,635.21
3      $72.70       $586.63    $3,121.28
4      $62.43       $557.29    $2,626.42
5      $52.53       $529.43    $2,149.52
6      $42.99       $502.96    $1,689.55
7      $33.79       $477.81    $1,245.53
8      $24.91       $453.92    $816.52
9      $16.33       $431.22    $401.63
10     $8.03        $409.66    $0.00

$4,725 < $5,216 < $6,500 (paid-up price < discounted pay-as-you-go total < cumulative pay-as-you-go total)
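The year-by-year mechanics of the paid-up example can be replayed with a short simulation: each year the carried-forward surplus earns a return of r, the discounted expense is paid, and the remainder is carried forward again. This is a sketch under the example's stated parameters; the function names are hypothetical, and because it uses the unrounded paid-up price, individual cents may differ slightly from a hand-built table.

```python
def paid_up_price(G, d, r, T):
    """Up-front price that exactly funds T years of discounted expenses."""
    return sum(G * (1 - d) ** t / (1 + r) ** (t + 1) for t in range(T))


def schedule(F, G, d, r, T):
    """Rows of (year, income, expense, surplus) for a paid-up deposit F."""
    rows = [(0, F, 0.0, F)]
    surplus = F
    for year in range(1, T + 1):
        income = r * surplus                 # return on the carried-forward surplus
        expense = G * (1 - d) ** (year - 1)  # billed at the end of the period
        surplus = surplus + income - expense
        rows.append((year, income, expense, surplus))
    return rows


G, d, r, T = 650.0, 0.05, 0.02, 10
rows = schedule(paid_up_price(G, d, r, T), G, d, r, T)
for year, income, expense, surplus in rows:
    print(f"{year:>2}  {income:>9.2f}  {expense:>8.2f}  {surplus:>9.2f}")
```

If the surplus after the final year is not (approximately) zero, the price was either too high or too low for the assumed r and d.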
Coefficient of permanence
It is useful to be able to transition from a pay-as-you-go to a paid-up price basis
If you’re currently paying G on a pay-as-you-go basis, you can upgrade to a paid-up basis with a one-time payment of F = G·φ, where, for T = ∞,
\[ \varphi = \frac{1}{r+d} \]
► Princeton DataSpace, φ ≈ 30 (T = ∞)
► USC digital repository, φ ≈ 1.2 (T = 20)
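As a quick worked check, assuming the T = ∞ form of the paid-up price, F(∞, d, r) = G/(r + d), a coefficient of permanence can be turned back into the combined rate it implies. The line below only restates that formula; the actual values of r and d behind either example are not given here.

```latex
\varphi = \frac{F(\infty, d, r)}{G} = \frac{1}{r + d}
\quad\Longrightarrow\quad
r + d = \frac{1}{\varphi},
\qquad
\varphi \approx 30 \;\Rightarrow\; r + d \approx 0.033
```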
Problems with R&D
TCP modeling is dependent on the predictive reliability of r and d
► For d, extrapolate from Moore’s and Kryder’s laws?
► For r, extrapolate from 30-year Treasury bonds?
[Charts: Moore’s law, 1971 – 2011, and Kryder’s law, 1980 – 2012 (Source: Wikipedia); 30-year treasuries, 2007 – 2012 (Source: http://ycharts.com/indicators/30_year_treasury_rate); 30-year treasuries, 1882 – 2012 (Source: Robert Shiller)]
Model the risk
Round up r and d, i.e., add a fixed “risk premium”
Add an additional risk component R to the formula for G
\[ G = \frac{A + m \cdot W + \ell \cdot C + j \cdot M + i \cdot V + O}{n} + P + k \cdot S + R \]
► Its influence on the price can grow over time, reflecting increasing uncertainty, by setting a negative discount factor d_R so that 1 – d_R > 1
► Note, however, that if the weighted sum d becomes less than 0 and |d| > r, then the paid-up price F(T, d, r) will not converge to a limit as T grows
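The convergence caveat can be seen numerically. In the sketch below (hypothetical function name and rates), a mildly negative d with |d| < r still lets the paid-up sum settle toward G/(r + d), while a negative d with |d| > r makes the sum keep growing as the term lengthens.

```python
def paid_up_sum(G, d, r, T):
    """Sum over t of G * (1-d)**t / (1+r)**(t+1)."""
    return sum(G * (1 - d) ** t / (1 + r) ** (t + 1) for t in range(T))


G, r = 650.0, 0.02
terms = (10, 100, 1000, 5000)

# d = -1%: costs rise by 1% a year, but the 2% return still outpaces them.
converging = [paid_up_sum(G, -0.01, r, T) for T in terms]

# d = -3%: costs rise faster than the return, so no finite price suffices.
diverging = [paid_up_sum(G, -0.03, r, T) for T in terms]

for T, c, v in zip(terms, converging, diverging):
    print(f"T={T:>5}  d=-1%: {c:>12.2f}  d=-3%: {v:>.3e}")
```

In practice this means a "forever" price only exists while the expected investment return exceeds any expected growth in aggregate cost.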
Recalibrate the model
G and F do not have to be fixed values over time
► Periodically recalculate based on current conditions (actual costs for G) and predictions (r and d), and apply prospectively
► Previously agreed service contracts remain “locked in”
Hybrid price model
Distinguish between costs that are (relatively) easy to quantify and forecast, and those that aren’t
► Use the paid-up model for the former and pay-as-you-go for the latter

Easy            Difficult
Archive         Intervention
Producer
Workflow
Content Type
Monitoring
Storage
► Bit preservation only

Easy            Difficult
Archive         Content Type
Producer        Workflow
Storage         Monitoring
                Intervention
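One way to picture the hybrid split is as a two-part quote: a one-time paid-up price covering the components judged easy to forecast, plus an annual pay-as-you-go charge for the rest. The sketch below assumes a single d and r applied to the easy portion; the function names and the per-Producer contribution figures are hypothetical.

```python
def paid_up_price(G, d, r, T):
    """Sum over t of G * (1-d)**t / (1+r)**(t+1)."""
    return sum(G * (1 - d) ** t / (1 + r) ** (t + 1) for t in range(T))


def hybrid_quote(contributions, easy, d, r, T):
    """Return (one-time paid-up price, first-year pay-as-you-go charge)."""
    easy_cost = sum(v for name, v in contributions.items() if name in easy)
    hard_cost = sum(v for name, v in contributions.items() if name not in easy)
    return paid_up_price(easy_cost, d, r, T), hard_cost


# Hypothetical annual per-Producer contributions to G (total 650).
contributions = {
    "Archive": 150.0, "Producer": 100.0, "Storage": 300.0,
    "Workflow": 40.0, "Content Type": 30.0, "Monitoring": 20.0,
    "Intervention": 10.0,
}

# "Bit preservation only": Archive, Producer, and Storage are paid up front.
upfront, annual = hybrid_quote(contributions,
                               easy={"Archive", "Producer", "Storage"},
                               d=0.05, r=0.02, T=10)
print(f"paid-up: {upfront:,.2f}  pay-as-you-go: {annual:,.2f} per year")
```

The pay-as-you-go portion would be re-billed each year at its then-current cost, which is what keeps the hard-to-forecast components out of the up-front price.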
Bound the uncertainty
The discounted cash flow (DCF) approach is problematic on practical and theoretical grounds
► Difficulty in setting fixed values for r and d that realistically represent financial and technological trends over time
Stochastic modeling to determine the probability distribution of possible outcomes
► Cf. David Rosenthal, FAST ’12 http://blog.dshr.org/2012/02/fast-2012.html and CNI Fall 2011 http://www.youtube.com/watch?v=_5lQxmyz3xY
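To give a feel for what stochastic modeling of the paid-up price might look like, the sketch below draws a fresh d and r each year and records the up-front amount that would have been needed along each simulated path. This is not Rosenthal's model: the independent normal distributions, their parameters, and the function name are assumptions chosen only for illustration.

```python
import random
import statistics


def simulate_paid_up(G, T, d_mean, d_sd, r_mean, r_sd, trials=10_000, seed=0):
    """Sorted list of up-front amounts needed across random (d, r) paths."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(trials):
        cost = G      # cost of the coming year of service
        growth = 1.0  # cumulative investment growth since t = 0
        needed = 0.0  # up-front amount required on this path
        for _ in range(T):
            growth *= 1 + rng.gauss(r_mean, r_sd)  # a year of return accrues
            needed += cost / growth                # bill paid at end of period
            cost *= 1 - rng.gauss(d_mean, d_sd)    # next year's cost
        outcomes.append(needed)
    return sorted(outcomes)


outcomes = simulate_paid_up(650.0, 10, d_mean=0.05, d_sd=0.02,
                            r_mean=0.02, r_sd=0.01, trials=2_000, seed=1)
median = statistics.median(outcomes)
p95 = outcomes[int(0.95 * len(outcomes))]
print(f"median: {median:,.2f}  95th percentile: {p95:,.2f}")
```

Quoting a high percentile rather than the median is one way to turn such a distribution into a price that bounds the risk of under-collection.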
Preservation forever
Some things are intended to last forever…
Source: John Church Company Source: United Artists
Preservation for …
A fixed term – 10 years? 20 years? – may be appropriate for much content
► Give content an opportunity to prove its worth, as evidenced by someone’s commitment to pay for its subsequent preservation
Transparency and opportunity
Possible outcomes…
► We overestimate our costs and collect too much
  ● Fund a higher level of service
  ● Refund some portion
► We underestimate
  ● Ask for additional funds
  ● Lower service levels
  ● De-accession content – but at least it was preserved up to that point and had a chance to prove its value, and gain an advocate
Conclusions
Different customers have different funding capabilities
► Flexibility in price models is important
Any price model is based on an idealization of the real world
► Assumptions matter
Understanding all of your costs is a precondition to a policy decision to recover all or part of those costs
► Cost accounting is difficult
If investment return and discount factor can be reliably projected, DCF can be used to model long-term costs
► What if not?
Even if we don’t have a perfect model, we need to move forward now with a “good enough” model
For more information
Total Cost of Preservation: Cost Modeling for Sustainable Services http://wiki.ucop.edu/display/Curation/Cost+Modeling
UC Curation Center http://www.cdlib.org/[email protected]
Stephen Abrams, Patricia Cruse, Scott Fisher, Erik Hetzner, Greg Janée, John Kunze, Margaret Low, David Loy, Mark Reyes, Abhishek Salve, Joan Starr, Tracy Seneca, Carly Strasser, Marisa Strong, Adrian Turner, Perry Willett