THE DPP GUIDE TO DIGITAL ARCHIVING
CONTENTS
Foreword
Introduction: The Case for a Digital Archive
1 Why Store My Media?
   The risks inherent in the world of digital storage
   Forever is a Long Time
2 What is a Store?
   An Introduction to the OAIS Model
3 What Should I Keep?
   Identifying the Key Metadata
   Risk-Value Analysis
   Formats and Standards
4 Where Should I Keep It?
   Differentiating Characteristics
   Underlying Technologies
   Solution Design
   The Land of Lost Content
5 How Will I Know It's Safe?
   Application of technology to provide assurance of archiving
6 How Will I Find It?
   Minimum Metadata Set
   Unique Identifiers
   Embedded Metadata
7 How Can I Stop the Wrong People Getting In?
   It's Not Easy Being Secure
   The Threat Within
   Authentication
   Encryption at Rest
8 How Long Should I Keep It?
   Policies and Guidelines
   What is a Policy?
   What are Guidelines?
9 How Do I Know It Will Always Play?
   Migration
   Integrity Management During Migration
   Data Tape Migration
   All good things must come to an end
10 How Much Will It Cost Me?
Conclusion
Further Reading
FOREWORD
When the DPP published our very first report, The Reluctant Revolution, in 2011, we said that we had surveyed independent production companies and they'd told us digital archive was "not a key issue yet".

Their view was reasonable: in a tape-based production environment a move to digital archiving represented nothing but pain, expense and risk. We predicted this situation would change when the industry moved to file-based delivery; and sure enough it did. Almost the moment production companies became aware of the shift programmed for October 2014, they also began to enquire about digital storage solutions.

The DPP responded to this interest by publishing an introductory guide called 10 Things You Need To Know About Digital Storage. We promised at the time that a more comprehensive guide would follow. And here it is.

The introductory guide aimed to help people understand the basic principles behind storing on file rather than tape. This fuller guide has a more specific purpose. The DPP Guide to Digital Archiving is designed for anyone who needs to create and maintain a long-term collection of digital media assets. It is intended to help independent production companies, cultural organisations and other bodies who need to migrate and then grow major collections of audio-visual media. Such companies and organisations will almost certainly hold the rights to this material, have a direct interest in exploiting it (either commercially, or by making it accessible to the public, or both), and will need to guarantee the material will still be findable and usable for many generations to come.

We have kept the format of the 10 Things guide, with the same section headings, allowing us to elaborate on specific topics to complement the original content. By maintaining the same 10 Things structure we have also made it easier for you to use this more detailed guide as a reference document.

If you don't immediately have time to digest the entire document, we direct you to the recommendations in the Conclusion section.

Anyone creating such a digital media collection would be wise to employ the services of someone with proven experience in this area of archiving. However, this guide will provide you with the knowledge to understand what the experts are talking about – and a framework to undertake what could be the most nerve-racking investment of your professional life.

After all, who wants to be the modern-day equivalent of the person who decided in the 1970s not to keep episodes of Doctor Who? But then, how were they to know? They didn't have the benefit of a guide like this.

Mark Harrison
Managing Director
Digital Production Partnership Ltd
INTRODUCTION
The Case for a Digital Archive
Before we take you into the detail of what a Digital Archive is and how to set one up, we ask you to consider two crucial questions:

• Why do you want to keep the content in the first place?
• Why is it worth investing in doing it properly?

It is more expensive to store digital content than to keep a tape on a shelf – not least because the susceptibility of digital content to re-versioning means there are likely to be more copies of it. It also requires more technology than a physical collection, and technology decisions and investments are rarely trivial for any company or organisation. So it is necessary to have a clear sense of purpose as to why you need to have an archive, and why you need one that is worthy of the name 'archive'.

The Hoarding Reflex

It's always tempting to keep everything, particularly in the digital age where files are 'invisible' and search systems promise so much.
But this tendency existed in the pre-digital world too. Many physical archives took an approach to 'archiving' that actually meant putting everything aside – and thereby putting off the awful day when the material needed to be sorted. Anyone who has ever needed to clear out a tape 'archive' (aka storage cupboard) will be familiar with how chaotic it tends to be. Much of the problem stems from the fact that material initially labelled for one purpose becomes impossible to identify unless decoded by the original producer or the librarian who set up the collection – both of whom are likely to have left years ago.
But let's not kid ourselves: it is hard to throw things away. When did you last hear of a technology vendor promoting its fantastic deletion capability? It doesn't feel overly bold to assert that there is no company or organisation creating or handling content that cheerily throws it all away as soon as it has served its original purpose.
Hidden Value
Just as we know there is always a tendency to hoard, we also know that the most common justification for such a reflex is 'just in case'. When it comes to digital media, 'just in case' might relate to the possibility of a legal or compliance issue; but more commonly it relates to the possibility that the material might have some kind of future value – either for re-use or for sale.

Perhaps the greatest potential – but least realised – benefit of digital media is its reusability. Such media is inherently searchable in a way physical media just isn't. And the first pre-requisite for reusing or retrieving content is being able to find it.

This is the heart of any decision to create a digital archive: we can't merely delete everything we create; and if we are going to keep any of it, a digital archive makes it possible to find it and extract its value.
But just how much value does old digital content have – and is it worth the cost of maintaining it in an archive? This is an impossible question to answer, of course: it all depends on the material. If you are lucky enough to be the owner and rights holder of FA Cup footage, then the value will be rather a lot. If you are the proud possessor of hundreds of hours of interviews with people who never made the final cut for a reality show, then rather less (until one of them becomes famous, much later).

This is where the greatest challenge of setting up a digital archive becomes its greatest benefit. Throwing a tape in a cupboard didn't really entail a decision – which meant you never had to address whether it had any future value or not. But as soon as you make the decision to keep digital media in a form in which it will remain usable and findable, you are forced to make judgements about the worthwhileness of keeping it. This guide will assist you in making those judgements.
Less is More
The overall cost of keeping a programme as a tape on a shelf currently remains lower than keeping it on a server or storage system – meaning the cost of storage is a more important consideration in the digital domain. The reality of making a decision to set up a digital archive is likely to be that you will, over time, actually keep less material than you otherwise would – simply because you now have the capability to make retention and deletion decisions, and because your confidence that you can retrieve the right copy of something will reduce the temptation to keep all the just-in-case copies.

But as may now be apparent, we are making one major assumption: you employ, or have the benefit of, a person or persons qualified to manage a digital archive. Make no mistake: a digital store with a search capability is not an archive. This guide will provide a formal explanation of what you need before you can reasonably describe your collection as an archive – and, to be frank, you may find the pre-requisites a little daunting. But they become a lot less daunting if you have access to a professional Media Manager or archivist. What may feel complex and off-putting to you will be second nature to them. You might think of them as your best system investment: they will give you something that no amount of technology investment can buy.
The Human Factor
Just as it's a simple truth that we find it hard to throw content away, it's also a simple truth that those who are employed to make content are not going to be as focused on maintaining it. It's all about priorities, skills and motivation. No-one whose primary function is production (the creation of content) is going to be as focused on archiving (the retention or deletion of content) as someone who is employed for precisely that purpose.

So does every organisation need a specialist role? This depends on the size and complexity of the collection and its workflows. The demands of the BBC or ITV archives, for example, are very different from those of a small independent production company or facility house. But at the very least your collection will need the benefit of a specialist individual – it's just that that individual might work for someone else, and be overseeing your material as part of a larger archiving service.
If you do decide you need a specialist Media Manager, be aware that the role now demands a sophisticated skillset. Traditional information-management skills are still important, but should be accompanied by practical business sense, excellent communication abilities and particularly good influencing skills: this is the person who will need to articulate the importance of good metadata within the end-to-end process. In short, the Media Manager will be the person persuading those who might be moving on to another job next week that they should perform the tasks necessary for the content they created to be kept for a lifetime – and beyond.
Digital Archiving: It’s a State of Mind
This is a fitting point to leave you with before we embark on this guide: just as the very decision to create a digital archive forces an organisation into disciplined thinking that can ultimately save money, increase value and improve efficiency, so the employment of a professional Media Manager can change the culture of the workplace.

The presence of a Media Manager will make it much easier to create frameworks for how to work with digital media, to assign roles and responsibilities, and to establish policies. It signals 'we understand what it means to work with digital media.' And that, after all, is your business.
1 Why Store My Media?
The risks inherent in the world of digital storage
Technologies for the storage of content have a long history, and an exciting and rapidly evolving future.

Each advance in storage technology brings with it a hitherto inconceivable increase in storage density. However, these developments are accompanied by a corresponding reduction in the lifespan of the storage medium.

The table below highlights major milestones in storage development throughout the ages, and shows how each major advance delivers an increase in storage density of a factor of one thousand but a corresponding decrease in lifespan of a factor of ten.

For this and other reasons, it is often said that we are the generation likely to leave least behind us for future generations.

For cultural organisations this is a worrying trend; but for content owners it also creates a challenge to the financial and commercial security that the long-term custodianship of content can bring.
Moreover, moving from a physical to a digital world introduces other, non-technical challenges. For legacy physical content such as film, the disposal of content required a conscious decision. In the ephemeral digital world, however, quite the opposite applies. Unless a conscious decision is made to secure content that would otherwise just flow from one 'cloud' to another, it has the potential to be lost through the absence of a decision to capture it, rather than through a conscious decision to dispose of it.

These are not just theoretical concerns: they affect media professionals on a daily basis, whether they are aware of it or not. It is difficult to retro-fit solutions, and suitable planning must be undertaken before content loss occurs, rather than after the fact.
Medium    Storage Density (bits/cm²)    Lifespan (years)
Stone     10                            10,000
Paper     10,000                        1,000
Film      10,000,000                    100
Disc      10,000,000,000                10

Source: "Preserving Moving Pictures and Sound", Wright, 2012
STORAGE VS PRESERVATION
You will hear archivists use the term preservation with regard to the long-term storage of content. Generally, storage is about keeping your content safe today, but preservation is about prolonging the life of content so that it lasts as long as you need it to – which may sometimes be forever.
Of course, not all content is worthy of such attention, and hence it is useful to consider the lifespan that you would like your content to have and act accordingly, applying a suitable level of effort and funding.

If your content will outlive the hardware and/or software solution that it currently resides within, then you will need to consider the long-term preservation management approaches that we discuss within this document.
Forever is a Long Time
Many archives consider the "100 year question": namely, how can you be sure that your content will be safe and accessible in one hundred years' time? This is not an unreasonable question, and small changes you can make today to the way you manage your content can make this a realistic prospect. If you plan for such a long-term view, then short-term preservation comes as a natural consequence.

A crucial archiving principle is to document the decisions you have made and the reasons behind them, so that future custodians of your content can treat it accordingly. Unless you are planning to live for another hundred years, your content collections need to be self-describing.
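As a concrete illustration of such self-description, here is a minimal sketch of what a documented retention decision could look like when stored alongside the content itself. The record layout, field names and identifier are our own assumptions, not a prescribed standard:

```python
import json
from datetime import date

# A minimal sketch of a preservation-decision record, kept next to the
# content it describes so the collection stays self-describing.
# All field names and values here are illustrative assumptions.
decision = {
    "asset_id": "PROG-2015-0042",          # hypothetical house identifier
    "decision": "retain",                   # retain / delete / review
    "reason": "rights held; expected re-use value for clip sales",
    "decided_by": "media.manager@example.com",
    "decided_on": date.today().isoformat(),
    "review_after_years": 10,               # prompt future custodians to re-assess
}

# Plain JSON next to the media file means a future custodian can read the
# record with no special software.
with open("PROG-2015-0042.decision.json", "w") as f:
    json.dump(decision, f, indent=2)
```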
In this document we'll discuss the relationship you have with your content and also, by implication, with the technology it's stored upon. Irrespective of who provides your storage solution, you should never put blind trust in the technology. You should take as much responsibility for your media storage as is necessary to feel you have adequately discharged the duty of care that you have as current custodian of the content.

Although it's important that your content is kept safe, it's equally important that it's kept alive and accessible – not least because this accessibility, and the ability to commercialise or exploit your content, is likely to justify the funding that you'll require to ensure its continued safe and secure storage.
2 What is a Store?
As outlined in the previous section, a store is a place where you can keep your content safe. For any organisation, not just in broadcasting, keeping material secure and accessible over an extended period of time requires a methodical approach. Even the traditional filing cabinet requires some kind of system if content is to be retrieved efficiently.
An Introduction to the OAIS Model
Fortuitously, an international standard exists to define an approach to address exactly this issue.

This standard is ISO 14721 – the Open Archival Information System – more commonly known as OAIS. It is a conceptual framework for setting the standard for archiving activities, rather than a method for carrying out those activities. As such, the OAIS model does not require the use of any particular computing platform, database management system, technology or media.

OAIS will already be well known to professionals managing large archive collections. In fact, a working knowledge of this standard could be considered an essential requirement for anyone you are considering employing to look after your content.
If you have not come across OAIS before, understanding this framework is likely to help you organise your archive. At first glance the model might seem to apply to larger archives employing dedicated teams of people, but it is equally valid for smaller operations and can act as a checklist to ensure that no critical function has been overlooked and that each function is mapped to a specific role or individual.

It also provides a common language to use when discussing and agreeing archiving policies, and is now used as a reference model by a wide variety of organisations with digital archiving needs in the UK and internationally, including the BBC, ITV and other European archives.

The Further Reading section of this document includes a link to a thorough introductory guide to OAIS, but the key points are presented here.
STAKEHOLDERS
There are three main stakeholders or 'entities' defined in OAIS:

[Diagram: the Producer, Management and the Consumer (Designated Community), each interacting with the OAIS (Archive).]
Producer
This doesn't mean a television producer, but rather the person or system transferring content into the Archive.

Consumer
Also known as the Designated Community, these are the persons or systems expected to use the information preserved in the archive.

Management
The persons defining how the Archive should operate and function.
It is a principle of OAIS that decisions are made primarily with reference to the Consumer (or Designated Community), as it is ultimately only for their benefit that content is being stored. This is a key point that can help you make decisions on what content you store in your archive, and how.
INFORMATION MODEL
OAIS defines three primary packages of information that get managed as part of an archive system. For some systems they may all be exactly the same file format, but they are more often tailored towards the specific practicalities, requirements and constraints of the people or systems creating, storing and consuming the content respectively.
Submission Information Package (SIP)
The package transferred from the Producer to the Archive for Ingest. For media archives this is likely to be the item or collection of items that you want to archive, including all accompanying metadata such as title, episode and version production number.
Archival Information Package (AIP)
The form of the package that is stored within Archival Storage. This is the content that you are preserving, accompanied by additional metadata to support its long-term management – for example checksums, additional IDs, or information regarding the classification, provenance, retention period and restrictions on usage or access.
Dissemination Information Package (DIP)
The form of the package that is delivered to Consumers for Access. This is your content as it leaves your archive, having been rendered, packaged, trimmed or assembled into a form suitable for a specific customer or consumer of your content.
It's worth noting that in the parlance of OAIS, these packages refer to both the media and the metadata. OAIS doesn't mandate a single SIP, AIP and DIP to be used within an individual archive, and actively encourages different definitions for different producers and consumers. Likewise, there may not be a one-to-one relationship between these packages – for example, a DIP is often formed from multiple AIPs.
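To make the three packages concrete, here is a minimal sketch of how an archive system might model them. OAIS mandates none of this: the class names, fields and choice of checksum are all illustrative assumptions.

```python
import hashlib
from dataclasses import dataclass, field

@dataclass
class SubmissionPackage:            # SIP: what the Producer hands over
    media_path: str
    title: str
    episode: str
    version: str

@dataclass
class ArchivalPackage:              # AIP: what the archive actually stores
    sip: SubmissionPackage
    checksum_sha256: str            # fixity information for long-term integrity
    provenance: list = field(default_factory=list)
    retention: str = "indefinite"

def ingest(sip: SubmissionPackage) -> ArchivalPackage:
    """Turn a SIP into an AIP by adding long-term management metadata."""
    with open(sip.media_path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    return ArchivalPackage(sip=sip, checksum_sha256=digest,
                           provenance=[f"ingested from {sip.media_path}"])

# A DIP would then be rendered from one or more AIPs – for example a
# trimmed, transcoded copy tailored to a specific consumer – rather
# than being stored itself.
```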
FUNCTIONS
The OAIS model, shown in the diagram below, describes six primary services:
1 Ingest
Accepting content into the Archive from Producers.

2 Archival Storage
Management of the long-term storage and maintenance of content.

3 Data Management
Maintenance of the databases describing the content in the Archive.

4 Preservation Planning
Defining the strategy for preserving content in the presence of changing technologies and user needs.

5 Access
The process by which Consumers locate, request and receive content from the Archive.
6 Administration
Management of the day-to-day operations of the Archive and co-ordination of the five other activities.
Although we have focused on the procedural aspects of OAIS, at its heart is the need for an organisation to make a commitment to long-term digital preservation and to accept the responsibilities that this brings.
MAYBE IT IS ROCKET SCIENCE
You may notice that the OAIS documentation makes reference to Space Data Systems, revealing that the primary author of the document was NASA, the American space agency.

NASA is particularly notable for running very long projects – for example the Space Shuttle programme, which ran from 1969 to 2011: a 42-year project. It was crucial that all their data was kept safe and accessible throughout the project, and that information created on the first day of the project was readable on the last day.

The timespan of the Space Shuttle programme naturally saw the obsolescence of many technologies, highlighting the need for continual migration of information and the need to manage the long-term life of content in the presence of changing technologies and user needs.
[Diagram: the OAIS functional model, showing the six functions – Ingest, Archival Storage, Data Management, Preservation Planning, Access and Administration – with the Producer, Consumer and Management entities around them.]
3 What Should I Keep?
Identifying the Key Metadata
A common misconception is that every piece of data produced by an organisation or company has value for the archive. In fact, much of it relates to a point in time during the development of a programme, and can conflict with decisions taken later. For example, the original script may contain lines that were later edited out; a piece of music that was included may later be changed; or a unique programme identifier may be superseded when a newer version is created.

So rather than retaining all data, you will need to decide the relative value that each piece of data has, and the trust that you place in it. There is no issue with storing data created at any point in the production process, so long as you understand which pieces of metadata are known to be accurate and universally authoritative.

It is therefore essential to have a complete picture of the data flow through your organisation and to understand the processes that cause data to be added or modified. You can then decide which data is unarguably trustworthy, and which could become so with additional quality control.
Accurate, relevant metadata is often seen as a key contributor to helping organisations gain a competitive edge.
Risk-Value Analysis
Given unlimited budget you might choose to apply the same level of diligence and quality control to all your content. However, given limited funding and resource, most archives have to decide on a set of priorities that direct effort to the content which is most valuable to the business. 'Value' in this context might not just be commercial value: it could reflect other priorities, such as re-use value, heritage value, legal requirements for retention or public-access availability.

One example of a commercial model for this is illustrated in the diagram below, where the most detailed metadata and widest format availability is applied to the percentage of the collection that is predicted to generate the most sales interest.

[Diagram: a three-tier pyramid – GOLD: generated top sales; SILVER: generated very profitable sales; BRONZE: generated solid sales.]
For example, in this scenario, it would be possible to weight the amount of effort devoted to metadata augmentation according to the expected exploitation and commercial activity. If clips from news programmes do not attract high-volume sales, it would not be worth using valuable journalists' time to shot-list. In contrast, high-profile, high-value natural history footage may warrant employing a dedicated Media Manager to apply detailed tagging information to the content.

In the preceding diagram, maximum quality control and data effort is expended on the Gold content, ensuring that it can easily be discovered and that it is available in a form to allow quick and cost-effective re-use.

The Bronze category, meanwhile, would have a light-touch minimum metadata set of information. This category might include older programmes that have shown no commercial activity over a set period of time. If commercial interest in this tier of content emerged over time, a business case might be made to fund enhancement of this metadata.
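To make the tiering idea concrete, the sketch below expresses a gold/silver/bronze policy as a simple lookup table. The effort levels and format lists attached to each tier are illustrative assumptions only:

```python
# A sketch of the gold/silver/bronze weighting as explicit policy.
# The tiers come from the diagram above; what is attached to each tier
# here is an invented example, not a DPP recommendation.
TIER_POLICY = {
    "gold":   {"metadata_effort": "full shot-list and tagging",
               "formats_held":   ["mezzanine", "browse", "delivery"]},
    "silver": {"metadata_effort": "programme-level description",
               "formats_held":   ["mezzanine", "browse"]},
    "bronze": {"metadata_effort": "minimum metadata set only",
               "formats_held":   ["mezzanine"]},
}

def effort_for(asset_tier: str) -> str:
    """Look up how much cataloguing effort an asset should receive."""
    return TIER_POLICY[asset_tier]["metadata_effort"]

print(effort_for("bronze"))   # -> "minimum metadata set only"
```

Writing the policy down in this explicit form also makes it easy to revisit when, say, commercial interest in a Bronze title emerges.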
Formats and Standards
“The nice thing about standards is that you have so many to choose from”
(Andrew S. Tanenbaum)
In digital media preservation there is no single 'correct' answer to the question "what file format should I use for the AIP?"

It's often impractical to hope to standardise on a single format whilst allowing for tiers of multiple qualities, and also supporting the desire to occasionally store the original source files.
NOT ALL CONTENT IS CREATED EQUAL
Archiving of content at the highest quality and with the highest level of integrity management may not be relevant for all your content.

In considering technology solutions, a crucial approach is the ability to consider the relative value of all the content being stored and architect the solution appropriately.

For example, would you consider a single archived rushes clip to have equal value to a fully-finished programme file – and hence do they warrant the same degree of content resilience, speed of access, or choice of file format?

Designing or procuring an archive storage solution supporting tiers of quality and service will allow more cost-effective use of limited budget than ascribing identical value to all content.

Even within a tiered model, it's not essential that a particular piece of content exists only at a single level. It's common for easily accessible copies to be held for all content, but additional higher-quality, less frequently required versions can also be held in a slower storage tier.
Even if you were to decide today on a single archive storage format, the advance in content consumption technologies (e.g. Ultra High Definition and beyond) means that you will rapidly need a new format; and even if you were to attempt to migrate all content to this new format, it's likely that you will never reach an equilibrium with all your content stored in a single format within your archive. Unless, that is, you are dealing with a static historic collection of content.

It's important to consider the future life of your content – for example, whether there is any expectation that it would be published on a future platform where the native quality exceeds what is currently required, such as transmitting SD programmes on an HD channel, or HD programmes as UHD. Where such up-conversions are a possibility, it's worth considering the additional value that might be gained from holding a higher quality version of the content, even if this is not in your normal delivery format. For example, you might choose to hold a higher quality version of your content in an edit-platform codec such as Apple ProRes or Avid DNxHD.

FROM PHYSICAL TO DIGITAL
In selecting archive formats, it's useful to make the distinction between content which started life as a digital file, and content migrated from analogue or tape-based media.

Content which has only ever existed as a digital file is already at its zenith of quality, and care must be taken not to compromise this.

Where content is being digitised from videotape, your starting position is often the baseband video output of the tape machine – and you need to make a conscious decision about the quality at which you capture the content. A natural modern choice for digitisation of content is to use the same format that you would if the content was to be delivered for transmission. As future content will only be naturally available in this quality, you may decide that this effectively sets an acceptable quality level for your archive.

In the UK this would therefore mean one of the family of AS-11 DPP formats, although it's important to understand the distinction between the AMWA AS-11 DPP file-format specification and the UK DPP delivery specification: you may find that your legacy archive content can easily be made to conform to the former but, having been created when a previous delivery specification prevailed, may not conform to the latter.

If, though, as highlighted above, there is a suggestion that the content is of suitable value or importance to be worthy of archiving at a higher quality, then alternative higher-quality codecs could be used – with the pinnacle of quality being an uncompressed or lossless file format. Similarly, it's possible that your legacy archive content may need to undergo processing, such as aspect ratio conversion, before use in a new programme, and it would therefore benefit from storage at a higher quality to ensure an acceptable result in the completed programme.

When using very high quality formats, you may find that these are often less well supported by commercial systems, and you should be careful about interchange and compatibility before committing to a format.

For some specific digital videotape formats, notably the DV family, it's possible to transfer the content in its native form without introducing an additional compression pass – creating files which contain an exact copy of the compressed essence from the videotape. In these cases, it's worth being clear what method is being used to create files from these tapes, and ensuring it is optimal.

It's also a good time to consider what additional work would be required on the archive content following the completion of the digitisation process, but before the future intended use is achievable. For example, because modern delivery standards differ from those prevailing when video master tapes were created, it is likely that content will need re-versioning before being sent for distribution or publication. Likewise, digitisation from video tape can be an imperfect process, and you will want to consider the level of quality check required before deeming the media file an acceptable representation of the source video tape.

THE RAW MATERIALS
For some categories of content, you will not be in control of the format and codec choice, as these will be dictated by the programme-making process – for example, RAW camera rushes.
Here you are faced with the decision of converting these into a more normalised form, or leaving them in their proprietary formats. It's worth considering the future use of the content, and how readily accessible and controlled it needs to be.
A useful parallel is to consider film negatives, which are often held by media archives as the highest quality raw materials associated with legacy high-value productions, even though a more readily accessible video tape copy of the content is held. It is accepted that there will be a greater cost in handling film negatives than in accessing traditional video tape; however, this is outweighed by the benefits brought by access to this higher quality version of the content (for example, re-mastering for commercial sale).
As with film negatives, there is sometimes a creative decision to be made when converting camera raw files (e.g. REDCODE) into normalised formats. Decisions on dynamic range, framing and grading are required, and hence it is neither cost-effective nor practical to perform this conversion work up-front without knowledge of the final use of the content.

Similarly, the moment at which you know for sure that you want to re-use content is the future moment at which its value has become apparent. It is at this time that you are best placed to consider the level of investment you want to make in converting the raw materials that you may hold. It is likely to be more cost-effective to hold a large quantity of raw footage and accept the cost of having to convert a small amount of it when commercialisation opportunities arise, than to transfer all material up-front to the latest commonly-used format.
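A back-of-envelope calculation, with invented figures, shows why deferring conversion is usually cheaper:

```python
# A sketch of the convert-on-demand argument above, using invented numbers.
# Assume 1,000 hours of raw footage, conversion costing £50 per hour, and
# only 2% of the material ever being requested commercially.
hours         = 1_000
cost_per_hour = 50      # £ to convert one hour (assumed)
demand_rate   = 0.02    # fraction of footage ever re-used (assumed)

convert_everything_now = hours * cost_per_hour
convert_on_demand      = hours * demand_rate * cost_per_hour

print(f"Up-front conversion:  £{convert_everything_now:,}")   # £50,000
print(f"On-demand conversion: £{convert_on_demand:,.0f}")     # £1,000
```

The real trade-off also includes the ongoing cost of storing the raw material, but with demand rates this low the on-demand approach tends to win comfortably.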
A PROBLEM SHARED
In making any decision around formats, standards and technologies, an important factor to consider is the breadth of adoption of the product or the approach – i.e. how many other people in the world are also using it. The more organisations and institutions that share your specific preservation and migration needs, the more likely that jointly you will have access to the solutions you require.

For example, Panasonic D3 video tape was only adopted by a small subset of the global broadcast industry (notably the BBC and NHK), and migrating from this now-obsolete format brought with it a number of practical and technical challenges. LTO data tape, however, is used globally across a wide range of industries including medical, banking, pharmaceutical and science. Problems with the future migration of, and access to, this format are shared with such a wide and varied range of users that greater confidence can be placed in the likelihood of solutions becoming available when needed.

Similarly with file formats: although there are a large number of standards to choose from, picking a format with wide adoption and wide product compatibility will always make your future requirements easier to deliver. For this reason, the work that the DPP has undertaken in creating and promoting the adoption of the AS-11 DPP format makes it worthy of consideration as a primary format within media archives.
4 Where Should I Keep It?
As a potential consumer of digital storage technologies, it is likely you will be presented with a plethora of commercial products and services, and you will need to choose the best fit for your requirements and budget.

Not all content owners and creators have the necessary technical resources to manage a functioning digital archive in-house. Similarly, not everyone will be able to secure ongoing technology investment if their operating model is based upon project-focused activities, with corresponding peaks and troughs in funding resulting from content creation and commercialisation.
MANAGED SERVICES
Managed services are a very practical and efficient way of delivering archiving solutions. Rather than each potential customer needing to become an expert in archive technologies, you can look to suppliers who specialise in these capabilities and provide services to a number of customers.

There are several specific factors that become relevant when considering managed services, but which would be less relevant for in-house offerings.
Primarily, it isn't possible to fully transfer the risk of content loss to a contracted third party, as a failure by your supplier is likely to have a greater impact on you than any contractually allowed financial damages can remedy.

If you value your content, it isn't viable to fully devolve responsibility for its long-term life to a managed service provider under a purely contractual relationship. You owe more to your content, and have a duty of care to ensure that its lifespan is managed effectively. You should therefore take an active interest in how it is being managed and how its long-term life is being assured. It's not unreasonable to want to understand how your content is being stored, and even to consider independent assurance of the approach, rather than focusing only on service levels.

Nonetheless, with sufficient caution around risk-ownership, managed service offerings can play a very important role in a complete archive solution – especially for small and medium-sized organisations, who can effectively 'buy in' archive expertise as part of a managed service if they don't have it in-house.
CLOUD
The word 'Cloud' is often misused and misunderstood, but it is, simplistically, a general term used to describe managed services provided over the internet. You will often find it used in relation to media and archive service offerings; these should simply be considered and evaluated in the way that you would any other managed service offering.

Generic Cloud storage, as provided by products such as Amazon S3 and Microsoft Azure, is different in nature, and is generally just a storage component which can be used as part of a complete architected technical solution. For example, end-users would not directly access Amazon S3 storage: instead it would be used as the 'backend' to a more user-friendly product.
All Cloud systems add an additional layer of abstraction between the end-user and the technology providing the service. The advantage of this abstraction is to insulate you from technology changes and release you from the need to support and maintain complex infrastructure and operations. Particularly where you require global resilience, Cloud offerings can make this available to small and medium-sized organisations for whom creating such capability in-house would be too complex and cost-prohibitive.
PAGE 16
WHERE SHOULD I KEEP IT?
4
A consequence of this approach is that you often aren't made aware of the actual technology being used to provide the service; instead the solution is described and defined purely in terms of the Service Level Agreements (SLAs) offered by the supplier. It's therefore important to ensure you fully understand the nature of these SLAs and the consequences of their not being met. Simply getting a percentage of your monthly fee refunded in the case of downtime or content loss may not be adequate compensation, so you may choose to design your solution with this in mind. It's best not to be fully reliant on purely SLA-defined services if the value you place on the content outweighs the financial remedies on offer.
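A simple worked example, with invented figures, shows how far an SLA credit can fall short of the value at risk:

```python
# A sketch of why an SLA credit may not be adequate compensation.
# Assume a £500/month storage fee with a 10% service credit for a
# breach, against content you value at £250,000. All figures invented.
monthly_fee   = 500
credit_rate   = 0.10
content_value = 250_000

sla_remedy = monthly_fee * credit_rate          # £50 refunded
shortfall  = content_value - sla_remedy

print(f"SLA remedy: £{sla_remedy:,.0f}")
print(f"Uncompensated exposure if content is lost: £{shortfall:,.0f}")
```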
Differentiating Characteristics
Different storage media each have a range of properties, and selecting the most relevant medium will involve balancing each of these for your specific scenario (one simple way of weighing these factors is sketched after this list). Factors include:

• Cost
• Read/Write performance – how fast content can flow to and from your storage
• Access speed – the delay to retrieve content from the system
• Data permanence (e.g. manufacturer's expected error rate)
• Physical degradation profile
• Environmental needs for storage of the media, including space, temperature and humidity control
• Long-term support for the devices necessary to provide access to the content
• Technology requirements for continued storage (e.g. power, tech support)
• Compatibility and interchange requirements (for example, do you have partners or subsidiaries with whom you may need to share your archive media?)
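As trailed above, one simple way of balancing these factors is a weighted scoring matrix. The sketch below is purely illustrative – the factor weights and per-medium scores are assumptions you would replace with your own:

```python
# Score each candidate medium per factor (1–5), weight the factors by
# how much they matter to you, and compare totals. All numbers invented.
WEIGHTS = {"cost": 0.3, "access_speed": 0.2,
           "data_permanence": 0.35, "interchange": 0.15}

CANDIDATES = {
    "managed_disk": {"cost": 2, "access_speed": 5,
                     "data_permanence": 3, "interchange": 4},
    "lto_tape":     {"cost": 4, "access_speed": 2,
                     "data_permanence": 5, "interchange": 3},
}

def score(medium: dict) -> float:
    """Weighted sum across all factors for one candidate medium."""
    return sum(WEIGHTS[k] * medium[k] for k in WEIGHTS)

for name, props in CANDIDATES.items():
    print(f"{name}: {score(props):.2f}")
```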
THE EARLY BIRD CATCHES THE WORM
Finding that you have worms in your archive would not seem a pleasant proposition, especially for paper archives! However, WORM stands for "Write Once, Read Many" and is an option available for some storage types to provide an un-erasable, permanent version of normally modifiable media.

For example, you can buy WORM versions of LTO tapes which behave in all ways like traditional media, except that once data is written it cannot be modified.

This provides an additional degree of protection against unwanted system or human error which could otherwise cause content loss.

Whilst not always applicable, it's worth considering this technology when evaluating different storage options.
Underlying Technologies
PORTABLE DISK
Hard disk drives have a very important part to play in modern broadcast workflows but, being complex electro-mechanical devices, they are prone to failure.

HARD DRIVES FAIL: PLAN FOR WHEN THEY DO

As hard drive sizes increase, the potential to lose large amounts of data or media in a single incident increases dramatically. Portable hard drives have a very valid use for short-term storage, such as moving files between systems, but are a bad choice for any form of medium- or long-term storage.
MANAGED DISK
Professional systems making use of hard drives, such as IT and broadcast servers or storage arrays, take into account the possibility of drive malfunction and keep sufficient copies of the data to allow recovery from failure.

Various redundancy methods are used to provide varying levels of protection against content loss, ranging from full mirroring of content between disks to more complex 'parity' arrangements which provide the ability to recover from the failure of multiple drives without a corresponding linear increase in the storage volumes used. A commonly-used redundancy method is known as RAID, which stands for Redundant Array of Independent Disks.

All redundancy methods add an overhead to the efficiency of storage usage, and hence you will normally see storage server capacities quoted as both 'raw' and 'usable' figures, reflecting the differing availability of storage capacity once these protection methods have been applied.
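The arithmetic behind 'raw' versus 'usable' is easy to sketch. The drive count and capacities below are illustrative, and the two schemes stand in for mirroring and single-parity arrangements:

```python
# Why 'raw' and 'usable' capacities differ, for two common schemes.
drives, drive_tb = 8, 10          # eight 10 TB drives (assumed)

raw = drives * drive_tb           # 80 TB raw

# Full mirroring (RAID 1 style): every byte is stored twice.
usable_mirror = raw / 2           # 40 TB usable

# Single parity (RAID 5 style): one drive's worth of capacity holds
# parity, allowing recovery from a single drive failure.
usable_parity = (drives - 1) * drive_tb   # 70 TB usable

print(raw, usable_mirror, usable_parity)  # 80 40.0 70
```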
OPTICAL DISC
We are all familiar with optical discs such as CD and DVD but, despite all looking ostensibly similar, there are considerable differences in the physical and chemical make-up of these discs which affect their long-term stability.

Commercially mass-produced music CDs and video DVDs are created by a pressing process similar to how vinyl discs are made, and the resulting data is effectively held within a robust metallic layer. Barring rare manufacturing defects, and potential degradation from very excessive temperatures, these are a stable long-term medium.
Recordable discs (CD-R, DVD+/-R, CD-RW, DVD-RW) can be filled with data using two different physical processes. In both cases a laser is used to write the data, either by 'burning' it into a dye layer for permanent (-R) formats, or by causing a physical 'phase change' in a crystal structure for re-writeable (-RW) formats. Although there are inherent differences in the resulting longevity of the formats, experience shows that neither technology provides a good long-term archiving format. Some manufacturers sell 'archive grade' discs, such as the Millenniata M-DISC or JVC Archival Grade products, which claim to offer improved lifespan; but it is often the manual handling of the discs – printing, labelling, writing on them, and general usage – which can introduce factors causing early, unwanted degradation.
A new generation of re-writable 'phase change' discs has emerged as a relative of Blu-ray technologies, and some manufacturers have added a robust caddy around the discs to remove issues relating to manual handling and to provide somewhere to safely affix labels.

Formats such as Sony XDCAM and Optical Disc Archive, the Hitachi Digital Preservation Platform, and the cross-manufacturer Archive Disc product provide viable options for long-term storage. Like other media such as LTO, some of
these formats can be ejected for storage on shelves to reduce energy consumption and, in this case, generally do not need any special storage conditions such as temperature, dust and humidity control. Many are available in WORM configurations, allowing greater confidence in data permanence, and can provide read-verify functionality, where content is automatically re-read after writing to make sure it has been successfully stored.

As with any storage medium, you would be advised to compare the characteristics of each format against your particular requirements.

For all optical discs, there is an international standard (ISO 18925) offering guidance on the correct usage and storage of these discs to ensure continued viability.
DATA TAPE
A popular storage medium frequently used for long-term archive storage is data tape, such as LTO or T10000. The properties of this format are less familiar to many users and often misunderstood, so some key advantages and disadvantages are presented below.
There are a number of reasons why data tape can often be a good format to form part of an archive storage system:

• It's a very stable format which, if stored in suitable conditions, does not suffer significant degradation over time. This is partly due to the data on the tape being less densely packed than on other media such as hard drives, and hence it is less susceptible to random corruption through mechanisms such as thermal instability.
• The format has inherent verification built in. Any content written to data tape is immediately read back by a different tape-head within the drive to ensure that the information was written perfectly to tape, with the write process being re-tried if issues are detected.
• The tapes themselves hold embedded digital information (on a chip within the tape shell) regarding their history of usage, errors and issues, allowing systems to interrogate this and act accordingly.
• Where tapes are ejected from robots and put on shelves, they are effectively decoupled from the live archive system, and this provides some security in the event of an issue affecting the copies in the live system.
However, data tape does have the following limitations:

• It is not a random-access medium, and there can often be many minutes of delay for content to be recalled from tapes in a library. It's worth remembering that your copy on data tape doesn't have to be the copy used for access: additional copies on disk or cloud storage can provide more expedient access.
You’ll notice that sometimes this word ends with a ‘c’ and sometimes with a ‘k’ .
The reasoning for this is a matter of history, but in current parlance ‘disc’ is used to refer to optical or vinyl media – generally where the physical item itself is circular . ‘Disk’ meanwhile is reserved for magnetic media such as hard drives, generally where the physical items are hidden within rectangular housings .
DISC VS DISK
![Page 19: THE DPP GUIDE TO DIGITAL - Amazon Web Servicesdpp-assets.s3.amazonaws.com/wp-content/uploads/... · assumption: you employ or have the benefit of a person or persons who is qualified](https://reader034.vdocuments.mx/reader034/viewer/2022050223/5f68f8013426bb45c01374a3/html5/thumbnails/19.jpg)
PAGE 19
WHERE SHOULD I KEEP IT?
4
• A number of incompatible methods exist for how data can be stored on the tapes, leading to interchange problems between different products. Even emerging standards such as LTFS (Linear Tape File System), whilst laudable in allowing easy interchange of data tapes between systems from different vendors, can bring complications when used in long-term archive scenarios, and these need careful consideration. Variants of the industry-standard TAR format, or the emerging AXF format, are worthy of consideration in archive scenarios, but the exact choice of standard is non-trivial and should be made in consultation with experts.
• Data tape needs good storage conditions, ideally with managed temperature, humidity and dust control.
• Tapes can wear out after a limited number of read or write cycles. Systems expecting to make frequent use of tapes need to take this into account and migrate content before the manufacturer-recommended limits are reached.
• The entry-level cost, and the level of technical expertise needed to implement data tape systems correctly, can be higher than for some other storage media.
In particular, when storing archive content on data tape, be careful to actually create data tapes suitable for archive use and not simply IT backups of the content. A number of common characteristics of IT backups don't lend themselves to archive scenarios, such as the use of block-based incremental copies (where a file can be split across multiple tapes) and reliance on a central database to interpret the stored content rather than creating self-describing and self-contained tapes. Put simply, IT backup tapes are often of no use without the systems that created them and the databases that describe them.
COLD STORAGE
A particular property of archive storage is that a very small proportion of the stored content is likely to be accessed in any particular time period. It is therefore not an efficient use of energy or cooling for all your content to sit in permanently powered disk arrays.
Solutions based on optical disc or data tape naturally use no additional power for content which is stored but not being accessed; however, products are also available which apply the same principles to hard disk storage and ‘spin down’ or de-power drives which are not currently being used. This principle is employed by large organisations such as Facebook, which need to store huge volumes of data with low usage profiles. As with any storage medium, you would need to ensure that adequate consideration has been given to management of the long-term integrity of content, especially if the underlying storage media has not been designed with archive usage in mind.
Solution Design
We have described the various technologies that can be used when architecting a solution, but the real skill comes in using these components in combination to design a solution which meets your requirements in the most cost-effective way. As reiterated throughout this document, meeting the various needs of access, preservation and security is rarely achieved by the use of a single product. It is often cost-effective to use a blended approach – for example using less secure drive arrays for fast access, with the content also being stored on less accessible but more secure media.
The Land of Lost Content
In considering how to ensure the longevity of your content, it is prudent to have a thorough understanding of the mechanisms that can act to prevent it.
In general, it is important to remember that all media has a limited lifespan and an inevitable degree of ongoing degradation and corruption. It's simply a case of understanding the risk that this introduces, and balancing this against the cost of mitigating it. Storing content digitally means entering into a game of chance, and you need to be clear about the odds and the consequences.
Professionals involved in transmitting content over satellite links are familiar with the concept of a naturally imperfect transport medium, due to the expected level of corruption and interference that occurs on such links. They are comfortable acting accordingly to mitigate these issues and ensure adequate transmission of content.
Storage professionals who take a similar approach and accept the imperfect nature of file storage are in a better position than those who pretend that the issue doesn't exist. Planning for the day when content will be lost, either through catastrophic failure or through gradual degradation, will focus the mind on mechanisms to ensure the continued lifespan of content in the presence of inevitable issues with the underlying media.
For low volumes of content, particularly content where each item has a limited intrinsic value, it can be realistic to ignore the possibility of data corruption or loss. But for the archiving of complete radio and television programmes, where the accumulated cost and effort expended in the creation of content now rests purely within the single surviving media file, the value of the content will normally outweigh any desire to overlook the potential for data corruption.
Risks to data permanence take a number of forms:
Catastrophic loss of storage medium
This is a well-understood failure scenario, where a single copy of content could be lost through technology failure or environmental action such as fire or flood.
Technology storage solutions naturally provide a degree of capability to recover from internal storage medium failures. RAID storage, for example, allows for failures of one or more hard drives without resultant data loss, through the creation of data redundancy across the array of drives and the provision of automated recovery of the lost data.
Recovery after the failure of an entire collection of content due to environmental action is possible by holding one or more additional copies of content in a different geographical location. This can also form part of your disaster recovery provision – allowing your business still to operate during temporary functional loss of one site.
Both these failure modes and mitigation approaches are well understood and commonly deployed in archive scenarios.
An important consideration in deciding on the creation of multiple copies of content is how closely coupled these are, and hence what scenarios could cause multiple or all copies to be simultaneously lost. For example, if all copies are managed by the same software system, is it possible that human or system action could equally affect all copies? For this reason, externalised, decoupled copies are often created – for example by ejecting data tapes from robotic storage libraries and storing these off-site so that they are out of reach of erroneous or malicious action.
Gradual physical degradation of storage medium
Many storage media suffer practical physical degradation at a level greater than that quoted by the manufacturer. The most prevalent example is ‘disc rot’, affecting writeable optical discs such as CD-R and DVD-R and caused by a range of chemical and physical degradations resulting in catastrophic data loss. This is generally irreversible; the only learning is to consider the possibility of similar loss occurring in the future and to plan accordingly. On a practical note, content on simple writeable CD or DVD should not be considered as having been safely archived, and these formats are prime examples of media warranting preservation activities.
Natural gradual variation in the content stored
A less familiar phenomenon is the gradual decay of storage media, where randomly occurring environmental factors such as cosmic radiation and thermal instability can cause low-level, infrequent, random corruption of data which can be both undetectable and unrecoverable. This is colloquially known as ‘bit rot’.
Manufacturers quote the likelihood of uncorrected data corruption occurring in their media with figures such as 1×10¹⁵ bits of data (125 terabytes) as the amount of content that would need to pass through a medium before a single bit of data is corrupted. Raw media error rates are improved by error detection and error correction coding; however, these only provide a degree of protection and don't eradicate all potential data loss.
Although these probability figures appear incredibly small, and corruption therefore unlikely, the volumes of content being considered mean that they cannot be ignored. These are also quoted as average, best-case error rates, so you should expect to encounter some corruption earlier than the figures might suggest. A long-term study by the scientific research organisation CERN gives real-world experience of data corruption at a frequency of the order of 1×10⁻⁷. This would be a worrying statistic if carried through to media storage situations.
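To get a feel for the scale of these numbers, here is a minimal sketch – a rough model only, assuming decimal terabytes and a single pass through the medium:

```python
# A rough scale check, not a rigorous model: the expected number of
# corrupted bits when a volume of data passes once through a medium
# with a quoted bit error rate (BER).
def expected_bit_errors(terabytes: float, ber: float = 1e-15) -> float:
    bits = terabytes * 8e12  # 1 decimal terabyte = 8 x 10^12 bits
    return bits * ber

print(expected_bit_errors(125))        # ~1.0 bit, matching the quoted figure
print(expected_bit_errors(125, 1e-7))  # ~1e8 bits at the CERN-observed rate
```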
These figures also only relate to one journey of content through the medium in question. In practice, content is likely to flow through many hard drives and multiple parts of computer memory on its way to a more secure storage medium, and the cumulative risk that this concatenation of probabilities brings about should be given due consideration.
The integrity management processes discussed within this document can help to alleviate these issues, but don't be complacent just because you have multiple copies of content unless you can say, at any point in time, which of them are uncorrupted. This is particularly true when you come to migrate content from one storage solution to another. In this case you are likely to migrate from only one source copy of each media file, and not from each instance, hence it's essential to know the integrity of any content you are using in your migration.
Technology obsolescence reducing the ability to read the storage medium
Many storage media, especially those designed for archive use, are sufficiently stable that they will not degrade to any significant level over time – if correctly handled. You will therefore often see media such as LTO data tape quoted as having a 25-year life.
In essence this is a commitment that the data will, for the most part, still be intact on the storage media after 25 years. It is not, however, a commitment that technologies will still exist, and be supportable, to read that medium.
In the case of LTO, there is a continual evolution in the generations of the format, from LTO-1, holding 100 GB of data, through to the current LTO-6 format, holding 2,500 GB. As an example of support lifecycles, LTO-3 was the prevailing format until approximately 2008; however, by 2015 it was becoming harder to purchase devices capable of reading LTO-3 tapes. Anyone still needing to migrate from this format would be well advised to stockpile suitable hardware.
Technology obsolescence reducing the ability to understand the stored content
Even if the storage media hasn't degraded and you have the technology to read the medium, you are not guaranteed to be able to understand the data that is stored on it. As formats and standards evolve over time, not all software maintains the ability to read every previously existing format.
If you expect to need to read a format long into the future without migrating the content between file formats along the way, you would be advised to consider also archiving the knowledge and capability of how to read that format, rather than relying on the capabilities of future software solutions in this respect.
Sometimes, the stored information cannot be understood without corresponding data held in a separate system. IT backup systems, for example, may store raw information on data tapes but rely on information held in the database of the backup product to understand the context for the data: the file names and folder structure may only be held in the database and not on the tapes themselves.
Good practice for archive systems is that the storage media is self-contained and self-describing, such that it doesn't rely on any external metadata or systems to be accessed and understood.
Human Error
Even if your solution takes into account all possible ways in which a storage medium could fail, you may still find it susceptible to human error.
In the digital world, the potential for a single, seemingly insignificant action to destroy large volumes of data is far greater than in the era of physical storage. Furthermore, such a data loss may go unnoticed by the custodians of the content in a way that a corresponding large-scale destruction of physical content never would.
Particular attention should be given to software upgrades, systems administration activities and housekeeping tasks. Limiting the ability of a single person to delete all digital copies of content is greatly advised – for example, not allowing an engineer to simultaneously have administrative access to your primary and secondary copies.
BBC DOMESDAY
In 1986, to commemorate the 900th anniversary of the Domesday Book (William the Conqueror’s original survey of the country in 1086), the BBC ran a groundbreaking project to create an equivalent digital survey of the UK.
The country was divided up into 26,000 rectangles, and schools across the country contributed images, sound, videos and text capturing the essence of life, work and play in the country in 1986.
To this day, it is one of the largest crowd-sourced projects in the country, with over one million people contributing. The user interface was ground-breaking in a number of ways, allowing users to walk around towns in the style of Google Street View and allowing search and navigation of the huge array of data – all pre-dating the invention of the World Wide Web.
The content was stored on Laser Discs, and these have not suffered any degradation over time and are still perfectly intact.
The original Domesday Book has lasted for 930 years so far; our ability to read the contents of the Domesday discs lasted about 10 years. This wasn't due to format degradation, but to obsolescence of the software and hardware needed to read and understand the data they contained.
Interestingly, this issue was greatly compounded by the restrictive licences under which the original content was obtained, effectively limiting some content to be usable only in its original context and on the now-obsolete storage medium.
Some projects have successfully extracted some specific content from the discs, but they remain largely inaccessible.
The lesson: if you are at the forefront of technology, be very careful about your archiving decisions and consider what you could do now, both from a technology and a contractual standpoint, to ensure that your data has the lifespan that you would like it to have.
5
How will I know it’s safe?
Application of technology to provide assurance of archiving
The OAIS model provides an excellent framework within which to consider how best to apply technology to meet the needs of an archive system. It is therefore useful to address technology aspects in relation to key principles of OAIS, beginning with Ingest.
INGEST
If you don’t successfully capture the content you had intended, or aren’t able confidently to assure that you have, then there is little point in expending effort and cost in managing its long-term life.
The OAIS principle of Submission Information Packages (SIPs) can be interpreted to imply an approach to control the form in which content is presented for ingest to a digital archive, both in terms of media and metadata. It’s important to note that this doesn’t mandate a single universal input format, but instead invites you to understand your input formats to ensure that system behaviour in relation to each of these is defined.
In reality you may find that you are only able to control some of your input formats. Maybe you have a small list of defined formats used within your organisation, but you may also wish to maintain the ability to archive arbitrary files, such as RAW outputs from whatever camera format your content creators have chosen to use. In this case, you could create a variety of SIP definitions for the recognised controlled formats, but would also need to create a generic SIP with a minimum metadata set for archiving files which themselves may not be of a normalised and controlled specification.
The SIP does not determine how the content is stored in the archive, only the form in which it is presented for ingest. We therefore need to consider the conversion of SIPs into Archival Information Packages (AIPs) as content is stored within an archive.
A crucial consideration is whether to modify content as it is ingested into an archive. With a single defined AIP, any content not conforming to this definition would be converted or ‘normalised’ as it is ingested.
This gives benefits through ‘standardisation’ of content to be managed within the archive and for onward distribution, but adds complexity to the ingest process and to the assurance of complete and valid capture of content.
Most notably, it becomes crucial to ensure that no information is lost and that no unintended degradation in content quality is experienced during the conversion of a file from a SIP to an AIP. It is not a trivial task to automate this, or to keep up with changes to formats and systems, while being unequivocally confident that the process is performing perfectly.
Where content is received which deviates from expected specifications, the best case is that the system notices this discrepancy and reports the issue for attention. The worst case is that it fails silently and unwittingly creates an inferior or degraded copy of the source file. In practice, a combination of the two often occurs: although good system design can plan for every known eventuality, the design may not be exhaustive and may not cope with future technological advancements not conceived of at the time of system design.
Importantly, creating a new rendition of the media file at import means that the checksum of the media file in the AIP will not be the same as the checksum of the media file in the SIP, which is inconvenient. You therefore need to be totally sure that the AIP for which you create a new checksum contains a perfect copy of all the content that has value within the SIP, and hence that you can treat the new checksum as authoritative from this point on.
Archiving systems from other industries often maintain two instances of a file – a ‘bitstream’ copy, which is an exact replica of the original, and a ‘logical’ copy, which preserves the meaning and useful content of the file. For document archiving, storage of two copies isn’t a huge overhead given the size of the files, but it needs greater consideration when dealing with larger media files.
This is not to say that normalisation on ingest isn’t a valid option – it’s regularly performed when content is ingested into some asset management platforms – however, it’s important to appreciate the complexity that it brings.
In summary:
• Choosing not to normalise content provides simplification up-front, but can create complexities downstream.
• Normalising upon import can lead to simplifications downstream, but you need to invest in ensuring and assuring that your normalisation process works perfectly.
If the ultimate goal of normalisation upon ingest is to ensure the long-term viability of the content, then you may want to consider archiving the ability and knowledge of how to normalise the content, rather than actually performing it upfront on all content.
KILL TWO BIRDS WITH TWO STONES
PRESERVATION VS ACCESS
The two primary goals of archive technologies are to keep content safe and to provide access to it. What is often not given due consideration is that it is not always necessary to achieve both these objectives with the same technologies, or with the same formats and standards. It is quite normal to use a blend of approaches, for example:
• Using cloud or cost-effective disk storage to provide access to content, whilst using data tape for an archive copy with managed integrity but less expedient access.
• Storing a copy of media in a widely accessible format (e.g. AS-11 DPP) but also, where the content warrants it, storing a higher quality copy in a file format that is less easy to access or edit but preserves a better quality version of the content.
ASSURANCE OF ARCHIVING
It is important to assure, wherever possible, that content has been archived successfully, and this principle is equally important whether you deliver the archive technology in-house or outsource it to a managed service provider.
Checksums should be created as soon as a SIP is received into the archive. Where no normalisation of content occurs on ingest, and where checksums were created as early as possible in the content creation process (such as when the content is QC’d), you will be able to determine programmatically that a perfect copy of the content has been secured, by comparing a checksum of your archived content (re-read from archive storage if necessary) with the original source checksum.
Where automated archiving solutions are employed to capture content and deliver it to an archive system, it is possible to make use of automated quality check products, but it is also recommended that manual QC processes, in the form of random spot-checks, be employed to catch issues which hadn’t been considered when designing a solution. Where you want to have ultimate confidence in your QC processes, do not have people or systems check their own homework: where a software tool processes some files, use a different tool to validate them, and where a person performs a task, have someone else check it.
INTEGRITY MANAGEMENT
Given the likelihood of decay in stored content, it becomes essential to ensure that your content is not adversely affected. This can be achieved by actively managing your content.
CHECKSUMS
File checksums, or ‘hashes’, are key to any system that aims to manage content integrity.
They take any file, however large, and create a very short fingerprint which can be used, for all practical purposes, to unambiguously identify the file and determine that it is still identical to when the checksum was created.
The most common hash function used for validating media content is the MD5 checksum, which creates a 32-character value in the style of…
252e60baf2658f6ea5237c45f47c6fde
…although other algorithms such as SHA-1, SHA-256 and CRC-32 are also used.
If a checksum is created as early as possible in the life of a piece of media – for example when it is quality checked – and if this checksum is then stored as the master content fingerprint for that file, then during any future processing or migration task the content can be validated as being identical to when the checksum was created.
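As an illustration, a minimal sketch of checksum creation in Python, reading the file in chunks so that large media files never need to fit in memory:

```python
import hashlib

def file_checksum(path: str, algorithm: str = "md5",
                  chunk_size: int = 1024 * 1024) -> str:
    """Compute a checksum of a file of any size, reading it in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Record the fingerprint as early as possible, e.g. at QC time:
# print(file_checksum("programme.mxf"))             # MD5 by default
# print(file_checksum("programme.mxf", "sha256"))   # stronger alternative
```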
In archive situations, it is typical to archive content and then not access it for a number of years – but that’s a bad time to discover that both your primary and secondary copies have been lost.
It’s worth setting a policy defining that all content undergoes a scheduled, albeit infrequent, integrity check to ensure it is still intact. This frequency should be selected to balance the impact of the process against the desire to detect errors before more than one copy of your content is affected. It can make sense to align these infrequent checks with the need to migrate content between storage media.
If your interest in this topic is sparked and you choose to read more about integrity management, you will undoubtedly encounter the word ‘fixity’. This term is used by archive professionals to refer to verifying that a file is unchanged since it was stored. Validating that files haven’t been corrupted is therefore sometimes referred to as ‘fixity checking’, but this is simply the same checksum management process described above.
Keeping multiple copies, ideally in geographically resilient locations, provides the mechanism to recover from failure. However, it is also necessary to provide the mechanism to detect failure. Keeping more copies does not in itself provide full protection against gradual degradation, unless you have a way of knowing which instances are uncorrupted at any point in time.
If it moves, checksum it
A simple approach to allow confidence that no corruption has occurred is to create a checksum as early in the life of the content as possible, and to re-check this whenever content is restored, migrated, moved or delivered. This capability is frequently offered as part of asset management and/or storage management solutions.
It’s worth noting that when performing a partial restore of content, where only a small portion of the entire file is retrieved, it’s not practical to validate full-file checksums, and hence these activities cannot easily contribute to the ongoing assurance of archive integrity.
If it doesn’t move, checksum it
An approach that is less common is additionally to check the integrity of content which hasn’t been accessed for some time. Manufacturers of archive technology products often call this ‘scrubbing’.
6
How will I find it?
Minimum Metadata Set
Even with thorough selection processes, there is little point in storing content if you are unable to find it again successfully. This puts the emphasis on getting your metadata correct at the time of archiving, to ensure that it adequately describes your content in a way that will let you find it again.
The OAIS model described earlier in this document introduces the concept of a Submission Information Package (SIP), which allows you to define the descriptive metadata you wish to capture. It also encourages you to consider the eventual consumer of the content when defining this. It should be noted that the OAIS approach doesn’t mandate a single SIP for all content, so you may choose to define a minimum metadata set which is needed for all content but allow variants for different content types. You might, for example, accept that you will use different sets of information to describe a completed programme from those applied to rushes material.
Even with only a limited number of potential metadata templates, you will still benefit from having a universal minimum metadata set defining a small subset of fields common to all content types and sufficient to allow the assets to be managed in the archive. This might simply include some basic editorial information and a unique identifier, but should ideally also contain detail on how long the content is expected to be held for.
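To make this concrete, here is a sketch of what such a minimum set might look like; the field names are illustrative assumptions, not a published standard:

```python
# Hypothetical example only: a small, universal core that every asset
# carries, whatever its content type.
minimum_metadata = {
    "unique_id": "3f2504e0-4f89-11d3-9a0c-0305e82c3301",  # globally unique ID
    "title": "Example Programme - Episode 1",             # basic editorial info
    "content_type": "completed_programme",                # vs. e.g. "rushes"
    "retention": "review_in_10_years",                    # how long to hold it
    "issuing_body": "Example Productions",                # who assigned the ID
}
```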
The DPP have defined a metadata set for delivering programme files to transmission. This is detailed in the AS-11 file format specification – specifically in the ‘shim’ (a name given to a constraint on an existing standard) which customises it for DPP file delivery scenarios. Given the purpose of this delivery specification, it’s unsurprising that it focuses primarily on the metadata needed to support programme playout, rather than long-term archiving. Nonetheless, it would be advisable to consider this standard as you design your archive ‘data model’, as it provides a useful starting point for editorial metadata standardisation, and it’s likely that your content will need to be contained within an AS-11 DPP file at some point in its life.
Although it’s common to focus on adding file-level metadata, people can often overlook the benefits arising from simply providing a high-level description of a collection, capturing its reason for existence and how it came to be. Often this is the missing piece of the puzzle that brings context to the low-level granular metadata and informs the user on how best to interpret it.
When defining and documenting metadata fields, it’s worth considering the following characteristics:
• Should the metadata field be constrained to a limited set of values? For example, allowing only numeric values, or requiring selection from a drop-down list. Constraining a field can improve consistency, but could devalue it if the constraints are too severe.
• Should all internal users be able to view the metadata field, or should it be visible only to a limited group?
• Who, if anyone, is able to modify the metadata field?
• Would you like users to be able to search on the contents of the metadata field, either when typing into a single ‘Google-style’ search box, or when using advanced search?
• Are there any sensitivities associated with the field that would prohibit wide publication? With the desire to expose content catalogues to the internet for commercial exploitation, it’s essential to understand what subset of your data would be suitable for such a purpose.
In addition to the metadata you explicitly provide when content is archived, it’s useful to consider other data you may already hold which could be used to augment the asset-level metadata. For example, you may hold or be able to obtain subtitle files for your completed programmes, and many asset management systems are now making use of these to provide enhanced searching.
Similarly, there may be information effectively embedded within content which new technologies may allow you to unlock. Speech-to-text, topic extraction, video text recognition, automated tagging and phonetic search are all technologies that are moving from the research and development laboratories into mainstream products – greatly improving the ability to find your content.
Search technologies are currently a major area of focus for innovation, in contrast to storage and archiving technologies where advancement is less rapid. Consequently it would be wise, when selecting asset management technologies, to be sure that the chosen supplier and product are able to benefit from, and ideally drive, new innovation in the search experience. This will improve the efficiency of the archive operation and keep pace with increasing volumes of content being stored.
Unique Identifiers
You may find many identifiers within your organisation which are described as being ‘unique’, but in practice some are more unique than others.
Often, you will find alphanumeric editorial or technical IDs used to describe assets within your production process, such as Programme Numbers, Make Numbers or Version Codes. Although these can be unique within a certain environment, or with reference to a certain level in your commissioning structure (e.g. identifying a programme version), they rarely define a piece of media with sufficient uniqueness to be used as your primary ID in a digital archive system.
For example, you might hold multiple copies of a single editorial programme version in different systems, but would want to unambiguously distinguish between them even though they all share the same Programme ID.
Similarly, your Programme Number may be unique within your organisation or broadcaster, but there may be no guarantee that it is globally unique. This may not have been a problem in the past, but with growing needs to share content between organisations these internal IDs are often not sufficiently unique to be of use in all situations. If you use your internal IDs publicly, and where there is a risk of these clashing with IDs from other organisations, ensure you include the issuing body (e.g. your organisation name) in your metadata to allow disambiguation.
A number of standards exist to assist here, such as SMPTE 330M – defining the Unique Material Identifier (UMID) used by a variety of media systems such as Avid editing systems or tapeless capture products such as XDCAM. Other initiatives focusing specifically on global identifiers for the exchange and publishing of content include the International Standard Audiovisual Number (ISAN) and the Entertainment Identifier Registry (EIDR).
Some systems, such as Media Asset Management systems, will assign unique private identifiers, such as UMIDs, to all content they hold, but you will need to consider whether these are suitable for use as public identifiers.
In the absence of asset management systems, it is common to construct file names from metadata fields and effectively use these as unique IDs for the files. Whilst this is a pragmatic solution to the lack of a metadata management system, it introduces a range of complexities if carried through to use in digital archive systems. You would be advised to keep your IDs purely for that simple purpose – to uniquely identify an entity – and store your metadata in a dedicated system, with your ID providing the link between the two.
Similarly, when digitising content from videotape to file, the use of filenames based on sequences of editorial and publication information can lead to unforeseen problems stemming from poor metadata quality or clashing IDs, which can cause delays to your project. Decoupling the media conversion activity from metadata issues, through use of the simplest possible identifiers, can streamline this process.
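A minimal sketch of this decoupling, with hypothetical metadata fields for illustration:

```python
import uuid

# Give each file an opaque, globally unique identifier, name the file
# with nothing but that ID, and keep all descriptive metadata in a
# separate store keyed by the same ID.
media_id = str(uuid.uuid4())
filename = media_id + ".mxf"  # the filename carries no editorial meaning

metadata_store = {
    media_id: {
        "programme_number": "ABC123/001",       # hypothetical internal ID
        "issuing_body": "Example Broadcaster",  # disambiguates internal IDs
    }
}
```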
This can be a complex topic, but it is essential that someone in your organisation takes responsibility for ensuring you have a defined approach and that it is implemented consistently and effectively.
Embedded Metadata
Many file formats allow metadata to be embedded such that content becomes self-describing. This is good practice and can help with the future life of the content you create, but it’s important to decide and document which copy of your metadata is the master.
It is normal that the metadata embedded in a file is just a point-in-time snapshot and not necessarily your authoritative copy. It therefore exists only as a ‘reference of last resort’ to improve understanding of the content in extreme scenarios, and wouldn’t necessarily need to be updated if and when your metadata changes.
The most important metadata to embed is a globally unique identifier, which will allow the file to be unambiguously linked to the authoritative metadata in whichever system holds the master copy.
7
How can I stop the wrong people from getting in?
It’s Not Easy Being Secure
Information Security is difficult. Even organisations with multi-million dollar IT budgets fall prey to hackers and suffer content and data breaches.
If you are not fully confident in your organisation’s capabilities in data security, it is highly recommended that you take specialist advice. Even if you believe yourselves to be fully competent in this area, there is little harm in getting third-party assurance, and potentially commissioning a ‘penetration test’, given the potential consequences of a breach.
Like many other topics in this document, there is already an international standard, ISO 27001:2013, providing guidance, standards and means of certification for providers of technology services. Reading the ISO 27001 standards document may allow you to gain confidence in your own provision, or highlight that you might benefit from independent advice.
Below are three specific topics commonly discussed in relation to modern security provision.
The Threat Within
Previous common practice was to strongly defend the boundaries of your organisation, but to assume that people within your facility and IT network could be allowed relatively unfettered access to systems. You may, of course, have access rights and permissions applying to your applications, but you’re unlikely to be applying the same level of rigour to your internal systems as you do to those on the public internet.
Recent experience has shown that data breaches and leaks often come from within an organisation, with disgruntled or otherwise-motivated employees making use of ‘social engineering’ (such as tricking people into exposing their login details) or exploiting security weaknesses to access and carry away high-value content.
Coinciding with this is the desire to make internal systems available remotely to support flexible working or global collaboration, and also to make use of cloud hosting to enable cost-effective deployment and scaling.
For these reasons, it is becoming advisable to specify and design your internal IT systems as if they were on the public internet, and to pay closer attention to the level of access that your own staff are given.
Authentication
We all have first-hand experience of information security practices in our home and work lives. We are continually reminded of the need for long and complex passwords, and of the need to keep them unique and safe from prying eyes.
Passwords, however, are becoming an incomplete and imperfect way of securing systems, and new approaches and technologies are needed.
First, technologies for cracking passwords using brute-force techniques are increasing in performance and effectiveness faster than we can convince people to use longer and stronger passwords. Secondly, passwords are easy to share and lose, and don’t actually give you confidence that they are being entered only by the person to whom you initially issued them.
A common approach in high-security systems is to require Two-Factor Authentication (2FA), where you need to prove you have two unrelated components to gain access. This is most commonly ‘something you have’ and ‘something you know’. We will all be familiar with bank cards, where you supply the card you have and the PIN you know in order to get access to your money.
Likewise, some internet banking systems and corporate remote access services require you to enter a password whilst also supplying a rolling code from a physical token or device.
Two-factor authentication is becoming more common in public systems such as Gmail, Facebook and iCloud, which all support this approach using your mobile phone as the thing you have, although it is often not enabled by default.
In considering the security model for your system, you should consider whether you would benefit from such an approach, to be sure that the person gaining access is who you believe them to be, and not just someone who has seen a password written on a Post-it note stuck to a monitor.
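The rolling codes used by these services are typically time-based one-time passwords (TOTP). A minimal sketch of the idea, using the third-party pyotp library:

```python
import pyotp  # third-party library: pip install pyotp

secret = pyotp.random_base32()  # provisioned once onto the user's device

totp = pyotp.TOTP(secret)
code = totp.now()         # six-digit rolling code, changes every 30 seconds
print(totp.verify(code))  # True - the check a server performs at login
```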
Encryption at Rest
It is becoming commonplace to use encryption when moving content over the internet. This is good practice and no longer introduces delays or overhead to the process.
What is less common is to make use of ‘encryption at rest’ to ensure that unwarranted physical access to storage media will not result in access to content.
While not everyone may choose to make use of this technology for content stored within their own premises, it is good practice to use it when content leaves your site, whether on hard drives, memory sticks, laptops or in the cloud.
All modern computer operating systems include the option of automatic full drive encryption, to ensure that a laptop or hard drive falling into the wrong hands will not release the secrets stored within, and it is good practice to mandate or enforce the application of these technologies. Likewise, when choosing cloud-based services you may want to consider whether they offer such protection.
There are, however, disadvantages to making greater use of encryption at rest within your systems, for example:
• Encrypted content cannot be manipulated or processed without decryption. If you need to transcode, create proxies, or partially restore content, then it will need to be decrypted while these processes occur, creating a potential weak point in the system.
• Particularly with partial restore, it is normally necessary to restore and decrypt the entire file in order to extract a small portion from it, thereby negating the major benefit that partial restore offers.
• Management of encryption keys becomes a critically important task. With a sufficiently secure system, loss of the encryption key equates to loss of all content.
• Encrypted content is more susceptible to data loss, for example bit rot. If you are lucky, corruption of a single ‘bit’ in a media file could just result in a minor amount of digital dropout, but the same corruption in an encrypted file is likely to result in complete file loss.
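A minimal sketch of encryption at rest using the third-party cryptography library; note how the key becomes the single point of failure:

```python
from cryptography.fernet import Fernet  # pip install cryptography

# Whole-file encryption for illustration; real media files would be
# processed in chunks rather than read fully into memory.
key = Fernet.generate_key()  # losing this key means losing the content
f = Fernet(key)

with open("programme.mxf", "rb") as src:
    encrypted = f.encrypt(src.read())
with open("programme.mxf.enc", "wb") as dst:
    dst.write(encrypted)

# Decryption raises an exception if the data has been tampered with
# or corrupted - this scheme rejects the whole file on a failed check.
original = f.decrypt(encrypted)
```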
8
How long should I keep it?
Without endless capacity to store content and unlimited budget to manage it, you will need to define the criteria for what comes in and out of the archive.
Furthermore, the business analysis that should be the foundation for understanding the purpose of having an archive will determine the valuable lifetime of a piece of content, and when it should be deleted or relocated to another place.
The important thing is to have consistency and agreed processes, so that the holdings of the archive do not reflect the mindset of the manager in charge at any particular time.
Policies and Guidelines
These decisions will provide the overarching criteria for a set of written rules that will act as a constant reference guide to the archive owner. The policies will be agreed with all stakeholders in the archive, be endorsed top-down, and be regularly reviewed to assure their continued relevance to the business. Any fundamental changes need to be agreed, approved and communicated. Without these checks in place, an archive can quickly become unmanageable and costs can spiral out of control.
What is a Policy?
A policy is a high-level statement that may consist of just a few words: e.g. “we collect a minimum set of good quality data to support the production of science programmes”. That statement alone indicates that you select data and that there are rules around type, source and accuracy. The reader immediately gets a sense of the worth of the material to the organisation. It’s more typical for a policy to include a little more detail, but generally the objective is to make such statements accessible and understandable to anyone at any level in the company.
It may be useful to consider readability from the point of view of a Board member of the organisation and also a new apprentice, both of whom have the same requirement to read and understand the purpose of having an archive.
What are Guidelines?
Guidelines make the initial policy statement more tangible to an individual or department, interpreting what is required of them to fulfil archive responsibilities.
They could detail the intake policy and what technical and metadata standards are expected when a piece of content enters the archive. They could explain a schedule for reviewing and selecting content, and who has overall approval. Metadata guidelines may set down where data is acquired or added, or where producers are expected to submit shot-lists.
One golden rule for guidelines is always to use role names, as opposed to specific people, so that amendments can be kept to a minimum when staff come and go.
9
How do I know it will always play?
Migration
As discussed above, if you are intending to keep content in perpetuity, it is inevitable that it will outlive a specific technology solution or storage medium.
Although in some rare cases you may commit to preserve forever the capability to access the source media, for example by continuing to maintain specialist hardware and software, it will generally be necessary to consider the need to migrate content between formats, carriers or technologies.
Migration is just as important for metadata as it is for media files and, in both cases, may need to occur on a number of levels:
1 Physical media storage
2 File wrapper or metadata container
3 File codec or metadata structure
At its simplest, this is just an exercise in copying or converting media and metadata, en masse, between storage media and data containers. There are, however, a number of real-world constraints that need to be considered.
In deciding when to migrate content, you will need to consider how your ability to access the content may be degrading in relation to the emergence of new formats onto which to migrate that content. For example, you may want to migrate LTO-3 content to LTO-6 tape before LTO-3 drives become obsolete, but after LTO-6 has become a stable and cost-effective format. It is possible to adjust the practicality of this timing by pre-purchasing equipment which is due to become obsolete, although consideration should be given to the availability of support, spares and consumables.
A primary factor in migrating content which currently resides in a live environment is the requirement not to affect the live operation through the additional content access that the migration process requires. A common approach, where a resilient second copy of content is held, is to migrate from the secondary copies rather than the main instance.
Access to data to drive the migration process is crucial. Content stored in an archive may vary over time, for example with superseded files still residing on media even though they are not referenced by the live repository. You may therefore only want to migrate current valid data rather than the entire content of tapes, but would need data to drive this process.
Similarly, when migrating metadata it is also normal to perform a data-cleansing exercise before migration, to ensure that you are not needlessly migrating redundant or erroneous metadata.
Integrity Management During Migration
The most effective checksum lives with a piece of content through its entire life; however, the potential need to migrate between different formats or codecs introduces a discontinuity, during which extra attention should be given to ensuring the integrity of content is maintained.
Where multiple copies of content are held to reduce the likelihood that content is corrupted at rest, it’s important not to undermine this approach by migrating from one copy without also validating the integrity of these files. The normal migration approach would be to use file checksums to ensure that content is identical to when it was archived, and to make use of the additional copies to replace any corrupted content.
If you don’t have access to checksums created when content was originally archived, then you will need to take great care to ensure that the migration has happened successfully, for example comparing all the resilient copies you hold to give extra confidence.
You can make use of a variety of tools to assist with validating that no loss of content integrity has occurred during migration, for example:
File-level checksums: When only migrating between storage platforms, these can unequivocally prove perfect migration.
Content checksums: Where lossless migrations between format carriers are undertaken, some formats support media-level checksums on a per-frame or per-track basis, which allow unambiguous verification that the content within a new wrapper format is identical to that contained in the original.
Content comparison: Where transcoding is being performed, it becomes harder to ensure that the created file is equivalent to the source. Simple comparisons of file properties such as duration and number of audio tracks can provide a level of confidence, but some tools allow the content of two files to be compared and a measure of the extent to which the media content differs to be calculated as a Peak Signal-to-Noise Ratio (PSNR) figure.
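For illustration, a minimal sketch of the PSNR calculation over two decoded frames, using NumPy and assuming 8-bit samples:

```python
import numpy as np

def psnr(original: np.ndarray, processed: np.ndarray,
         max_value: float = 255.0) -> float:
    """Peak Signal-to-Noise Ratio between two equally-sized frames.
    Higher means closer; identical frames give infinity."""
    diff = original.astype(np.float64) - processed.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_value ** 2 / mse)
```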
Whilst seemingly obvious, it’s important to remember that if you go to the effort of creating checksums then you need to store them safely somewhere too. A ‘verifiable manifest’ of all your content will also allow you to confirm that no content is missing, as well as confirming that the content present is intact.
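A minimal sketch of such a manifest, reusing the file_checksum() helper sketched earlier; the CSV layout is an assumption, not a standard:

```python
import csv
import os

def write_manifest(root: str, manifest_path: str) -> None:
    """Record a relative path and checksum for every file under 'root'.
    The manifest itself should then be stored safely."""
    with open(manifest_path, "w", newline="") as out:
        writer = csv.writer(out)
        for folder, _, files in os.walk(root):
            for name in files:
                full = os.path.join(folder, name)
                writer.writerow([os.path.relpath(full, root),
                                 file_checksum(full)])

def verify_manifest(root: str, manifest_path: str) -> list:
    """Return (path, problem) pairs for missing or corrupted files."""
    problems = []
    with open(manifest_path, newline="") as src:
        for rel, expected in csv.reader(src):
            full = os.path.join(root, rel)
            if not os.path.exists(full):
                problems.append((rel, "missing"))
            elif file_checksum(full) != expected:
                problems.append((rel, "corrupted"))
    return problems
```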
Data Tape Migration
Where content is stored in robotic data tape libraries, there are two characteristics of these systems that need specific consideration during migration activities:
• There is a delay while the robotic mechanism selects the required tape from a library and loads it into a tape drive for access.
• The content is stored linearly on the tape, so the system must spool to the relevant portion of the tape before content is read from it.
Both these factors introduce a delay in random access to a specific piece of content. Where large volumes of content need to be migrated, the cumulative effect of these delays can make the operation prohibitive. For example, one million small audio files migrated using a single data tape drive, in a system where there is a three-minute delay to access a random file, will take over five years!
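The arithmetic behind that figure: 1,000,000 files × 3 minutes per random access = 3,000,000 minutes of drive time, which is roughly 5.7 years of continuous, round-the-clock operation.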
These delays can be drastically reduced by using information about how the content is stored within the system to drive an efficient migration process. This would result in content on each tape being migrated together rather than swapping between tapes, and in the content being migrated in the exact order in which it is stored on the linear tape, rather than requiring continual spooling to access it. A sketch of this ordering appears below. Some storage management systems include this capability as part of the product, but where it is not provided, special consideration should be given to whether it is possible to access the required data and drive a process to migrate the content in a timely manner.
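A minimal sketch of the ordering itself, assuming the storage management system can report a tape identifier and on-tape position for each file (the field names are illustrative):

```python
def migration_order(files: list) -> list:
    """Visit each tape once, reading files in on-tape order, so the
    robot mounts every tape a single time and never spools backwards."""
    return sorted(files, key=lambda f: (f["tape_id"], f["tape_position"]))

queue = migration_order([
    {"name": "clip_b.wav", "tape_id": "T0002", "tape_position": 10},
    {"name": "clip_a.wav", "tape_id": "T0001", "tape_position": 57},
    {"name": "clip_c.wav", "tape_id": "T0001", "tape_position": 3},
])
# -> clip_c (T0001, pos 3), clip_a (T0001, pos 57), clip_b (T0002, pos 10)
```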
All good things must come to an end
Exit planning is a specific subset of migration and is important for all solutions, but especially relevant for managed services.
Put simply, don’t put any media or metadata into a system unless you are very clear about how to get it out again later.
A common error is to focus more on the media and editorial metadata and not ascribe sufficient value to the organisational metadata that accompanies it. For example, the cataloguing and organising of content stored within a media asset management system effectively results in the creation of metadata which can have a high value and which should be considered in migration and exit planning.
Your exit plan must be agnostic of destination and provide a generic capability to access all media and metadata in a defined way, such that they can be migrated into a replacement system even if this system is not conceived or defined at the time of entry into the initial system.
It’s good practice to create an exit plan, but it’s even better practice to keep it up to date as technology changes.
Let us consider a scenario where you chose to store
your media and metadata with a service provider .
It is likely that you will change or re-evaluate your
solution or supplier before your content has reached
the end of its useful life and hence you need to plan
for how you will eventually move your content to a
new solution or provider .
Also, imagine that you are continually augmenting
the metadata you hold by your interactions with
the system, such as through the organisation,
cataloguing and accessing of content .
Your exit planning should take place before you
begin to store content in your system . It is also
advisable to run a proof-of-concept of any exit
procedures once you have a small volume of content
stored, but before it’s too late to change your
approach .
You would want to confirm that the agreed process
would allow a defined subset of media to be
delivered back to you in a usable form, and that the
accompanying associated metadata contains all the
information you require including both the original
and added data . An example of this additional
metadata would be organisation information, for
example resulting from putting the media into a
folder, bin, or collection within the system .
Where this metadata is passed back to you in a physical form, such as a data tape, you would be advised to confirm that you are fully able to access the metadata without using the system from which you are testing exit. For example, you should confirm that you can read the data tape using a generic system, play the video, and read the metadata.
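One way to make that confirmation independent of the exited system is to verify the restored files against checksums recorded at ingest. The sketch below, in Python, assumes you hold such a manifest (a mapping of file names to SHA-256 digests); that manifest is an assumption of the example, not a feature of any particular product.

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in 1 MiB chunks so large media files
    do not exhaust memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_restore(manifest: dict[str, str], restore_dir: Path) -> list[str]:
    """Compare each restored file against the checksum recorded at
    ingest; return the names of any missing or corrupted files."""
    failures = []
    for name, expected in manifest.items():
        target = restore_dir / name
        if not target.exists() or sha256_of(target) != expected:
            failures.append(name)
    return failures
```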
For metadata interchange, it is likely that XML files will be used, and you would ideally want a competent person in your organisation to validate these and confirm they contain the required information. If this isn't possible, you might enlist the help of your supplier to walk you through the files and highlight the information of interest.
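A first-pass validation of such files can be scripted rather than done by eye. The sketch below uses Python's standard library to confirm that an export is well-formed XML and that each asset record carries a set of required fields; the element names (asset, assetId, and so on) are invented placeholders, since the real names would come from your supplier's documented export schema.

```python
import xml.etree.ElementTree as ET

# Fields we expect on every asset record; the names are placeholders
# standing in for your supplier's documented export schema.
REQUIRED_FIELDS = ("assetId", "title", "checksum", "collection")

def check_export(xml_path: str) -> list[str]:
    """Parse an exported metadata file and report any asset records
    that are missing required fields."""
    problems = []
    tree = ET.parse(xml_path)  # raises ParseError if not well-formed XML
    for asset in tree.getroot().iter("asset"):
        missing = [field for field in REQUIRED_FIELDS
                   if not asset.findtext(field)]
        if missing:
            problems.append(f"{asset.get('id', 'unknown')}: missing {missing}")
    return problems
```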
You would be advised to re-run this exercise at defined, albeit infrequent, intervals. You should also trigger it upon significant changes to the underlying technology infrastructure, such as major version upgrades.
10
THE PRACTICE
“In theory there is no difference between theory and practice,
but in practice there is.” (various)
Now that you have an understanding of the key aspects of Digital Archiving and Preservation, you may find yourself looking to build or procure a technology solution or service to deliver your archiving needs.
Putting aside business aspects such as organisational design and training for a moment, and focusing purely on technology aspects, you should first be clear who in your organisation is making your technology decisions and whether they have the experience and expertise needed to do so.
If you don’t have suitable knowledge within your own
organisation you may want to call upon knowledgable
experts from other establishments; however always ensure
that you have clearly defined accountability for technology
decisions and that those being held accountable are aware of
their position and given the support that they need .
How much will it cost me?

A major factor in considering technology solutions is cost, as this will ultimately dictate the limitations and capabilities of your system. When considering cost in relation to other factors it is useful to reiterate the following points:
The same solution need not necessarily be used to provide both preservation and access.

Architecting a single system to have the required levels of content integrity and assurance whilst also providing wide and expedient access may not be cost effective. For example, you may choose to make your content available via cheap, fast hard disk arrays, while keeping a secure copy on less accessible but more secure data tape.
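To make this trade-off concrete, a back-of-the-envelope calculation is often enough at the planning stage. The sketch below is purely illustrative; the per-terabyte prices, copy counts, and overhead fractions are invented placeholders to be replaced with quotes from your own suppliers.

```python
def annual_cost(capacity_tb: float, price_per_tb: float,
                copies: int, overhead: float) -> float:
    """Rough yearly storage cost: capacity x unit price x number of
    copies, plus a fractional overhead for power, space, and staff."""
    return capacity_tb * price_per_tb * copies * (1 + overhead)

archive_tb = 500  # size of the archive in terabytes (illustrative)

# Invented prices -- substitute figures from your own quotes.
disk = annual_cost(archive_tb, price_per_tb=120.0, copies=1, overhead=0.40)
tape = annual_cost(archive_tb, price_per_tb=15.0, copies=2, overhead=0.10)

print(f"Access tier (disk, 1 copy):           {disk:>10,.0f} per year")
print(f"Preservation tier (tape, 2 copies):   {tape:>10,.0f} per year")
```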
Not all content is created equal

In considering technology solutions, you can weigh the relative value of all the content being stored and architect the solution appropriately. For example, is a single archived rushes clip of equal value to a fully-finished programme file, and hence do they warrant the same degree of content resilience, speed of access, or choice of file format?
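One way to act on this is to record the decision as an explicit policy per class of content, which your tooling and your supplier agreements can then both reference. The sketch below shows one possible shape for such a policy table in Python; the content classes and the numbers attached to them are invented for illustration, not recommendations.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StoragePolicy:
    copies: int       # number of independent copies held
    media: str        # target storage technology
    fixity_days: int  # how often checksums are re-verified

# Invented tiers and values, for illustration only.
POLICIES = {
    "finished_programme": StoragePolicy(copies=3, media="tape + disk", fixity_days=90),
    "rushes":             StoragePolicy(copies=1, media="tape", fixity_days=365),
}

def policy_for(content_class: str) -> StoragePolicy:
    """Look up the agreed policy; default to the most cautious tier
    so unclassified content is never under-protected."""
    return POLICIES.get(content_class, POLICIES["finished_programme"])
```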
Naturally, you will be considering whether to self-provide the solution or look to a managed service. Either way, remember that your information belongs to you, and ensure that you don't relinquish more control of it than you would like.

GENERAL CONSIDERATIONS

In this document we have focused on the goal of an organisation having access to a Digital Archive. Whether you deliver this in-house or look to a managed service provider, you should ensure that you and your 'supplier' agree on the defining characteristics of your system, and document them through the creation of (at least) the following plans:
• Predicted Capacity and Usage Plan
• Service Definition
• Migration Plan
• Exit Plan
• Quality Assurance Plan – to allow you to have confidence
that your system is working as intended without simply
putting all your faith in the technology
• Resilience and Replication Plan
• Disaster Recovery and Business Continuity Plan
Considering these points will help you reach a common understanding between you and your supplier about expectations of the solution being provided. This is as relevant for internal provision as it is for managed services, and will ensure that any desired characteristics of the service which are liable to have a cost implication are discussed and consensus reached on a practical solution.
For example, there is undoubtedly an additional cost arising from processes to assure the safe storage of content, and from considering preservation needs such as integrity management, migration planning, and exit planning. All of these costs should be balanced against the risk and financial impact of content loss. For some categories of content, such as raw rushes material, you may decide that the balance swings in favour of limited preservation management and reduced storage cost; but it is likely that the reverse will be true for completed programmes.
For each of the plans described above, you should consider whether you want to set aside funding for rehearsals and trial runs at service commencement, and potentially at regular intervals thereafter, to ensure the viability of the approach in the face of changing technology. This is commonly practised for disaster recovery, to ensure that content is still available even in the presence of defined system failures, but less commonly practised for exit and migration planning outside of specific archive-focused scenarios.
Where you become heavily reliant on a particular solution you may consider entering into an escrow agreement. Here, the software and information necessary to continue using a solution are lodged with a third party and passed on to you, by means of a legal agreement, if your provider were to cease trading.
Another way to guard against the cessation of your supplier is for them to enter into contractual agreements with other providers of similar services, such that these other parties can take on the custodianship and management of your content if your initial supplier were to cease trading.
In managed service offerings in some other industries you naturally get immediate feedback on the performance of the contract: if a catering contract fails or degrades, your staff will soon make you aware of it. For archive managed services, whether or not your content is being appropriately managed is often a matter of trust. It is therefore recommended that additional assurance measures are put in place to ensure that the archive and preservation management practices that you are expecting are actually being delivered.
To ensure that your content is being managed effectively, it is advisable to set aside a proportion of your budget for quality assurance to be performed within your own organisation.
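A lightweight form of such assurance is a periodic spot check: recall a random sample of assets from the service and verify them end to end (retrieval, playback, checksum). The sketch below shows only the sampling step, in Python; the one-per-cent fraction and minimum sample size are arbitrary illustrative values, not recommendations.

```python
import random

def choose_audit_sample(asset_ids: list[str], fraction: float = 0.01,
                        minimum: int = 10) -> list[str]:
    """Pick a random sample of archived assets to recall and verify
    end to end as part of a periodic quality assurance exercise."""
    size = max(minimum, round(len(asset_ids) * fraction))
    return random.sample(asset_ids, min(size, len(asset_ids)))
```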
In selecting a vendor, you may look to a certification such as 'Trustworthy Repository', 'Trusted Digital Repository', or an equivalent accreditation, to give you comfort that your supplier has considered all essential factors relating to long-term storage of content. At the very least, peruse the checklists provided in relation to these certifications to acquaint yourself with the factors that expert bodies consider to be characteristic of such an organisation, and consider whether the additional effort required to meet these requirements is worth the potential cost incurred as a result.
The National Digital Stewardship Alliance (NDSA) has also created a simple approach for evaluating the maturity of a particular service or solution which you might find useful. This focuses on the practical aspects of archive storage and takes the form of a tiered set of recommendations entitled 'Levels of Digital Preservation'. Links to this and other approaches are provided in the Further Reading section of this document.
CONCLUSION

You may feel overwhelmed by how many things this guide gives you to think about. But if you hold on to the following eight principles, and approach the topics discussed in this document methodically, there is every reason to believe you will create a first-class digital archive.

1 Consider the eventual consumer of your content when making all decisions
2 Remember that not all content has equal value
3 Be clear about owners and responsibilities for all archive functions
4 Don't put archiving at the end of your production process – consider it throughout
5 Document your decisions
6 Consider the lifespan that you'd like your content to have, and plan accordingly
7 Remember that Preservation and Access can be delivered using different solutions
8 Don't put blind trust in technology – always assure your processes and plan for failure

You and your colleagues may then be amazed at the positive impact this has on your business. Never have we been so aware of how precious and rare good content is; to have a place where you can keep yours safely, and retrieve it easily, will feel extraordinarily empowering.
FURTHER READING

DPP: 10 Things You Need to Know About Digital Storage
Available to DPP Members at http://www.digitalproductionpartnership.co.uk

Digital Preservation Coalition: Preserving Moving Pictures and Sound
http://dx.doi.org/10.7207/twr12-01

The Open Archive Information System Reference Model: Introductory Guide
http://dx.doi.org/10.7207/twr14-02

OAIS: Full Specification
http://public.ccsds.org/publications/archive/650x0m2.pdf

AMWA AS-11 DPP Format Specification
http://www.amwa.tv/projects/AS-11.shtml

CERN study on data integrity
http://indico.cern.ch/event/13797/session/0/contribution/3/attachments/115080/163419/Data_integrity_v3.pdf

NDSA Levels of Digital Preservation
http://www.digitalpreservation.gov/ndsa/activities/levels.html

Digital Preservation Metrics (including TRAC and TDR checklists)
http://www.crl.edu/archiving-preservation/digital-archives/metrics

Digital Curation Centre: Lifecycle model
http://www.dcc.ac.uk/resources/curation-lifecycle-model

Practical Digital Preservation
http://www.facetpublishing.co.uk/title.php?id=047555#.VS1xLjHF9SQ

Personal Digital Archiving
http://www.digitalpreservation.gov/personalarchiving/documents/NDIIP_PA_poster.pdf
V1.1

This DPP production was brought to you by Steve Daly and Heather Powell, both of whom have many years of practical experience of working with media archives. They were assisted by Mark Harrison, Emma Vandore, Rachel Baldwin and Abdul Hakim. We'd very much like to thank the numerous DPP Members who have also contributed to this publication: it has benefited greatly from their collective expertise.
Design by Vlad Cohen http://www.thunder-and-lightning.co.uk
Copyright Notice:
This publication is copyright © Digital Production Partnership Ltd 2015. All rights are reserved and it is prohibited to reproduce or redistribute all or any part of this content. It is intended for members' use only and must not be distributed outside of an organisation. For clarity, this prohibits distribution to members of a trade association, educational body or not-for-profit organisation as defined by the DPP membership categories. Any exception to this must be with the permission of the DPP.