

Digital Preservation at Norfolk Record Office

A report prepared by Pawel Jaskulski (Digital Preservation trainee)

for Gary Tuson (County Archivist).

March 11th, 2016


Executive Summary

The Digital Preservation strategy at Norfolk Record Office evolved from the archive’s own active interest in the emerging domain and from the anticipated need to integrate the accessioning of digitally born archives into the regular archival processing framework.

The launch of the Norfolk Sound Archive in 2003 and Norfolk Record Office’s (NRO) participation in the Skills for the Future programme signalled a strong commitment to building expertise in the field of digital technologies. Senior archivists, with support from Norfolk County Hall’s ICT services, have over time developed a bitstream preservation capability with elements of a ‘parsimonious approach’ to digital preservation.

The archive approved the first version of its Digital Preservation Policy in 2007, in which it addressed the need to advance its digital preservation strategy to include format migration pathways and clearly defined accessioning workflows. NRO’s influential role within the East of England Regional Archive Council led to a regional pilot project (currently at its proof-of-concept stage) employing a cloud-hosted instance of Archivematica connected to a cloud storage system provided by Arkivum.

Acknowledgement

I would like to thank my line manager Ian Palfrey (Senior Archivist/Collection

Management) and Gary Tuson (County Archivist) for guiding me in the process of

completing this report.


Contents

1. Introduction

2.1 Background

2.2 Wider Regional Context

2.3 NRO Digital Preservation Policy

2.4 Requirements

2.5 Parsimonious Approach – Bitstream Preservation

2.6 Towards Logical Preservation

3. Conclusion

4. Recommendations

References

Appendices


1. Introduction

The aim of this report is to survey approaches to digital preservation at Norfolk

Record Office in order to make recommendations for future improvements. Digital

preservation has now become one of the key priorities for the NRO service plan.

The archive has been involved in the Skills for the Future traineeship scheme since 2014, signalling its commitment to filling the skills gaps within the sector. This, together with its contribution to collaborative working within the East of England Regional Archive Council (EERAC), demonstrates NRO’s strong dedication to developing a digital preservation strategy for the region and within the institutional context of a local authority archive service.

The report traces the evolution of the NRO’s interest in digital preservation, from its original concerns about the need to develop a strategic plan for the preservation of electronic records to the latest developments with EERAC and its current pilot project.


2.1 Background

Norfolk Record Office collects and preserves unique archives relating to the history

of Norfolk and makes them accessible to as wide a range of people as possible. It is

a joint service of Norfolk County Council and the District Councils of Norfolk and is

democratically accountable via the joint Norfolk Records Committee.1 NRO is located

at The Archive Centre in Norwich with additional services operating from Norfolk

Heritage Centre at Norwich Millennium Library and King's Lynn Borough Archive.

In April 2003 the record office launched the Norfolk Sound Archive with the purpose of collecting, preserving and providing public access to sound recordings relevant to life in Norfolk. Its remit includes:

• Provide information on holdings and access to original recordings

• Preserve sound recordings

• Locate existing sound recordings that are worth preserving for the future

• Provide support and training for on-going and new oral history projects

• Promote the use of sound recordings, particularly within education

• Links to organizations who also hold collections of sound recordings relating to

Norfolk and who are carrying out oral history work in the county2

Overall, NRO incorporates three repositories: Norfolk Record Office, Norfolk Sound Archive and King’s Lynn, with a variety of holdings and record types in its custody.3

2.2 Wider Regional Context

The creation of Regional Archives Councils for the nine English Regions in 1999 led

to the publication of a series of regional archive strategy documents. The report

created for the East of England Regional Archive Council (EERAC) in 2003 sets out the

aims of the Preserving the Present for the Future project:

This project encompasses a whole range of issues concerning records management and the preservation of electronic records and aims to ensure that contemporary records in all forms – public and private - are both properly managed now and preserved for the future. It will involve creating the infrastructure and building confidence to turn theoretical knowledge into practical action across the Region. It will also seek to identify collaborative solutions for the preservation of digital data.4

1 Norfolk County Council has two Joint Committees: http://www.norfolk.gov.uk/Council_and_Democracy/Our_budget_and_council_tax/Statement_of_accounts/NCC152976, accessed 09/03/2016

2 Norfolk Sound Archive has a well-established digitisation workflow based on the work of the British Library Sound Archive.

3 Norfolk Record Office Archive Collections http://www.archives.norfolk.gov.uk/Archive-Collections/index.htm; Additionally, Norfolk Sound Archive collections feature in Directory of UK Sound Collections: http://www.bl.uk/projects/uk-sound-directory, accessed 09/03/2016

With the archive sector development strategy clearly stated, the next important step forward was the East of England Digital Preservation Regional Pilot Project, which took place between August 2004 and March 2005.5 Although NRO was not directly involved in the project, it participated as an observer.6 The test bed project aimed at a better understanding of the processes and costs of preserving digitalised material by assessing the feasibility of outsourcing specialist services on a regional basis to the UK Data Archive, based at the University of Essex. Among the recommendations were two lessons learned from the project that remain relevant to digital preservation at a local authority archive:

- Need to consider further modelling of costs and benefits linked with the three

scenarios of: in-house provision, working through consortia and contracting

out;

- Need to develop and clarify the OAIS model to make it more intelligible and better aligned with the more conventional terminology applied to archive administration and records management.7

2.3 NRO Digital Preservation Policy

In July 2007 the NRO approved a Digital Preservation Policy. It is currently undergoing revision and will be supported by two related documents: the Digital Records Accessioning Checklist and Advice to Creators of Digital Records.8 The policy recognises the urgent need to develop an OAIS-compliant digital preservation strategy, especially in view of the Norfolk Sound Archive expanding its activities. Key points from the policy are:

1.6 The NRO (and NSA) expect to receive for appraisal and preservation an

increasing number of ‘born-digital’ records, mainly, in the first instance, from

private organisations, groups and individuals, which due to their functionality

or quantity cannot be preserved by printing out hard copies. This expectation

is based on contact with depositors and with colleagues in other local

government archive services.

4 Chris Pickford, Eastern Promise – A Strategy for Archival Development in the East of England, EERAC 2003, p. 33

5 Digital Archives Regional Pilot profile on UK Data Archive http://www.data-archive.ac.uk/about/projects/darp, accessed 09/03/2016

6 The Chairman of EERAC at the time was John Alban, Norfolk County Archivist, who co-wrote the foreword to the Report of the East of England Digital Preservation Regional Pilot Project, MLA East of England and EERAC 2006, http://www.data-archive.ac.uk/media/1680/DARP_finalreport.pdf, accessed 08/03/2016

7 Ibid., p. 43

8 See appendices A and B for draft versions. The Digital Preservation Policy is an internal document and as such is not publicly available or published online at the moment. An important point of reference for the Digital Preservation Policy is the Archive Collecting Policy http://www.archives.norfolk.gov.uk/view/NCC098771 accessed 09/03/2016


1.7 Digital records, i.e., the type of data, format of carrier and associated

hardware and software needs, are likely to be varied and in many cases

cannot be anticipated.

The appraisal section outlines the scope of digital records collecting policy:

3.2 The NRO will not necessarily accept custody of every digital record offered to

it. Records which in effect we cannot use or whose use imposes

unacceptable costs or conditions on the NRO might not be selected for

retention. Factors which may affect this are:

- the existence of adequate metadata

- the removal or neutralisation of security features

- the provision of a free licensed copy of the original software, if necessary, to access and maintain the record

The document also sets great store by open (non-proprietary) standards, which are ‘to be employed as far as possible for preservation purposes. Where those standards are absent, file formats accepted as the industry standard should be used, e.g., TIFF files.’

2.4 Requirements

A quick survey of the digitally born archives received by NRO gives an insight into the requirements for digital preservation. The total volume of data held within the archive amounted to 87.8 GB in November 2015, with the majority of the records being raster and vector images, text documents, and audio and video files, as shown below (please refer also to appendix D):


Figure 1: Collection Profile of all Digitally Born Archives at the NRO


The Norfolk Sound Archive’s digital assets will soon need 3 TB of storage, and various digitisation projects occupy a similar amount of space on shared network drives.

Improving the accessioning system with respect to digitally born archives is the main driver for establishing a digital preservation strategy. The aim is to build digital preservation capabilities so as to be able to choose digital records over their paper-based equivalents (hard copies/printouts).

2.5 Parsimonious Approach - Bitstream Preservation

NRO has been collecting digitally born archives since 1997 and has gradually introduced different elements of the ‘parsimonious’ approach to digital preservation, resulting in a systematic method initiated by the Senior Archivist (Collection Management).9 All digital accessions have been processed manually in the same manner:

- One month quarantine

- Virus scan performed

- Integrity checks conducted by generating checksums

- Digital objects extracted from removable media and transferred to a secure

and designated network drive location

- Technical metadata generated (file count, file sizes, directory and file listings)

- File format identified by creating profiling reports with DROID

- Top level descriptions created in CALM cataloguing system

All metadata generated in the above process is stored alongside the digital records within the top-level directory to which the digital objects were transferred.

This process is time-consuming and relies on more than one tool, each creating metadata in separate files. It will be difficult to sustain in the long term.
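
As an illustration only, the following Python sketch gathers the file listing, file sizes and SHA-512 checksums for an accession in a single pass and writes them to one CSV file, rather than to separate files. The function names (`profile_accession`, `write_report`) and the CSV column names are hypothetical and are not part of the NRO workflow; format identification with DROID is not covered.

```python
import csv
import hashlib
from pathlib import Path


def sha512_of(path, chunk_size=1024 * 1024):
    """Return the SHA-512 hex digest of a file, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def profile_accession(root):
    """Collect relative path, size in bytes and checksum for every file under root."""
    root = Path(root)
    rows = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rows.append({
            "path": path.relative_to(root).as_posix(),
            "size_bytes": path.stat().st_size,
            "sha512": sha512_of(path),
        })
    return rows


def write_report(rows, report_path):
    """Write the collected rows to a single CSV file (hypothetical column names)."""
    with open(report_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["path", "size_bytes", "sha512"])
        writer.writeheader()
        writer.writerows(rows)
```

With the rows in hand, `len(rows)` gives the file count and `sum(r["size_bytes"] for r in rows)` the total volume of the accession.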

2.6 Towards Logical Preservation

The service plan for 2014/2015 identifies the lack of compliance with the OAIS standard, a consideration that has been gaining significance on the NRO Risk Register.10 With a greater understanding of the OAIS functional model and further research into currently available digital preservation systems (Preservica, Rosetta etc.), NRO decided to explore Archivematica as the preferred solution, since it offers normalisation for preservation and access purposes. It acts as an Archival Information Package creator, managing the workflow from transfer, through ingest, to archival storage and dissemination.

9 Norfolk Record Office Service Plan for 2014-2015 enclosed in Norfolk Records Committee meeting agenda from Thursday 1 May 2014 http://www.norfolk.gov.uk/download/norfrec010514agendapdf, p. 43, accessed 06/03/2016; Tim Gollins, Parsimonious preservation: preventing pointless processes!, http://www.nationalarchives.gov.uk/documents/information-management/parsimonious-preservation.pdf, accessed 06/03/2016

10 Ibid., p. 38

The software was first installed in a test environment (Ubuntu 14.04.4 LTS running on a Linux-compatible HP machine) and used to process a sample dataset comprising file formats similar to those accessioned by NRO in the past. After the initial results it became apparent that, in order to fully assess Archivematica’s capabilities, it needed to be deployed in a production environment connected to appropriate storage systems.11

The software is commonly referred to as a pipeline, since in itself it is not a repository

(storage system), nor an access system but a processing pipeline connecting them,

supporting pre-ingest and ingest activities of a digital repository.

Arkivum, a company specialising in digital data archiving, has offered a fully hosted service, cloud storage integrated with Archivematica, since January 2016.12 In order to reduce costs and exploit economies of scale, NRO proposed a pilot project to EERAC members in February 2016, demonstrating why Archivematica is the preferred system for the ingest of digital records.13 In summary, the main advantages are:

- It is OAIS compliant and supports PREMIS metadata schema together with

METS and Dublin Core standards14

- The Archival Information Package is structured in accordance with Library of

Congress BagIt specification15

- It uses The National Archives’ file format registry PRONOM

- It is open source and under active development, with a substantial user community from across the heritage, arts and academic sectors.16

- It runs a series of configurable micro-processes provided by open source tools integrated within Archivematica, which can be replaced as technologies change, fulfilling the requirement of OAIS’ Manage System Configuration function17
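
To make the BagIt point in the list above more concrete, here is a minimal Python sketch of the general bag layout: a `bagit.txt` declaration, a `data/` payload directory and a checksum manifest. It illustrates the structure only and is not how Archivematica builds its Archival Information Packages; the function name `make_minimal_bag`, the version string `0.97` and the choice of SHA-512 for the manifest are assumptions made for the example.

```python
import hashlib
import shutil
from pathlib import Path


def make_minimal_bag(source_dir, bag_dir):
    """Copy source_dir into bag_dir/data and write bagit.txt plus a SHA-512 manifest."""
    source_dir, bag_dir = Path(source_dir), Path(bag_dir)
    payload = bag_dir / "data"
    # copytree creates bag_dir and data/; it fails if data/ already exists.
    shutil.copytree(source_dir, payload)

    # Bag declaration: BagIt version (assumed here) and tag file encoding.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )

    # Payload manifest: one "<checksum>  <path relative to the bag>" line per file.
    # Files are read whole for brevity; large files would be hashed in chunks.
    lines = []
    for path in sorted(p for p in payload.rglob("*") if p.is_file()):
        checksum = hashlib.sha512(path.read_bytes()).hexdigest()
        lines.append(f"{checksum}  {path.relative_to(bag_dir).as_posix()}\n")
    (bag_dir / "manifest-sha512.txt").write_text("".join(lines), encoding="utf-8")
    return bag_dir
```

Because the manifest travels with the payload, anyone receiving the bag can recompute the checksums and confirm that nothing changed in transit or storage.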

It was discussed with EERAC members that working together could entail shared resources: cloud-based infrastructure with digital preservation software accessible through a browser (Archivematica) and linked cloud storage (provided by Arkivum). The main focus of the project is to evaluate Archivematica as a digital preservation tool, but it will also look at integrating it with AtoM, an access and cataloguing system supporting digital objects and developed by the same organisation that created Archivematica: Artefactual Systems.18

11 Norfolk County Council’s ICT security restriction prohibited integration with the system.

12 http://arkivum.com

13 Presentation is available on SlideShare: http://www.slideshare.net/PaweJaskulski1/archivematica-and-local-authority-archive-services accessed 09/03/2016

14 PREMIS http://www.loc.gov/standards/premis/, METS http://www.loc.gov/standards/mets/, Dublin Core http://dublincore.org/, accessed 11/03/2013

15 E-Ark Report on Available Formats and Restrictions: http://www.eark-project.com/resources/project-deliverables/7-e-ark-d41-report-on-available-formats-and-restrictions/file, p. 24, accessed 10/03/2016

16 Archivematica users group forum: https://groups.google.com/forum/#!forum/archivematica

17 Reference Model for an Open Archival Information System (OAIS). Magenta Book. Issue 2. June 2012: http://public.ccsds.org/publications/archive/650x0m2.pdf, p. 4-12, accessed 11/03/2016

3. Conclusion

Advocacy and consultancy aimed at promoting digital archiving in Norfolk, especially amongst community archives and other likely donors and depositors, should also form part of the Digital Preservation Policy. Data must be capable of re-use, with sufficient metadata (detailed documentation of the chain of custody, creators, rights and technical provenance: which software and operating system the files were created with, in which file format, etc.).19 Advice on best practice in electronic record keeping should be embedded within the policy to foster a better understanding of digital preservation concepts among electronic records creators, donors and depositors.

4. Recommendations

With NRO continuing to explore Archivematica and its application to the archival processing workflow, this report recommends:

- The NRO improves its Preservation Planning by compiling an Action Plan for all file formats received within digital accessions. The Action Plan will inform staff what must be done to normalise a digital object at Ingest into preservation and/or dissemination formats, and will in turn inform the format migration strategy. For example, TIFF is the currently preferred preservation format for images as it is less prone to data loss than other raster image file formats. NRO is interested in exploring the PNG file format as a preservation format for certain types of digital records.

- The NRO improves its Preservation Planning by identifying and compiling a

list of Significant Properties per type (text, audio, etc.) to help staff decide

whether a format migration has produced acceptable results, retaining its

authenticity and evidential value.20

- The designated community of the NRO is the general public, which demands a robust access system and intellectual property rights management procedures. The NRO should review its designated community and identify any sub-sets or new communities whose access requirements are specialised or more complex and demanding, and how these can be met.

- The NRO considers how it will provide timely delivery of digital records to users in its facilities (search room and online access).

18 The current cataloguing system used by EERAC members is CALM. If the project is successful it would require migration of the records to a new system supporting digital objects like Access to Memory (AtoM): https://www.accesstomemory.org. A concern shared also by ARCW Digital Preservation Working Group: http://www.nationalarchives.gov.uk/documents/Cloud-Storage-casestudy_Wales_2015.pdf, accessed 09/03/2016

19 Jenny Mitcham, Preservation of Digital Objects at the Archaeology Data Service; in: Janet Delve and David Anderson, Preserving Complex Digital Objects, Facet Publishing 2014, p. 50-51

20 Margaret Hedstrom, Christopher A. Lee, Significant properties of digital objects: definitions, applications, implications, https://www.ils.unc.edu/callee/sigprops_dlm2002.pdf accessed 09/03/2016

- Although it is not an immediate priority, NRO audits its other digital assets. Please see Appendix C for suggested actions.

Additionally, with a view to NRO and EERAC continuing their collaborative work as a consortium, this report suggests:

- Archives associated within EERAC occupy disparate geographical locations, which could translate into a network of geographically separate storage locations, improving security and supporting a disaster recovery plan. Assuming that each archive has its own ICT infrastructure, or is willing to develop it, that would fulfil the first requirement of the NDSA Levels of Digital Preservation.21

- EERAC could continue its work towards a Distributed Digital Preservation model, in which the members own the preservation infrastructure and expertise rather than outsourcing this core service to external vendors, as in the MetaArchive Cooperative example.22

- EERAC agrees on best practices and standards to support the interoperability and sustainability of the project. It is important to refer to standards for trusted digital repositories (DRAMBORA, Data Seal of Approval or TRAC) in order to help concentrate efforts on achievable goals.23

21 Megan Phillips et al., The NDSA Levels of Digital Preservation: An Explanation and Uses, http://www.digitalpreservation.gov/documents/NDSA_Levels_Archiving_2013.pdf, accessed 10/03/2016

22 Adrian Brown, Practical Digital Preservation, Facet 2013, p. 103-106; Also https://educopia.org/presentations/long-term-preservation-strategies-architecture-views-implementers, accessed 09/03/2016

23 Main Certification Standards include: peer-reviewed self-assessment Data Seal of Approval Assessment http://datasealofapproval.org, DRAMBORA Digital Repository Audit Method Based on Risk Assessment http://www.repositoryaudit.eu and TRAC Trustworthy Repositories Audit and Certification https://www.crl.edu/sites/default/files/d6/attachments/pages/trac_0.pdf, accessed 11/03/2016


References

Kilian Amrhein and Marco Klindt, One Core Preservation System for All your Data. No Exceptions!,

https://opus4.kobv.de/opus4-zib/files/5663/iprespaper-finaledit.pdf, accessed 11/03/2016

Philip C. Bantin, Strategies for Managing Electronic Records: A New Archival Paradigm? An

Affirmation of Our Archival Traditions?, http://www.indiana.edu/~libarch/ER/macpaper12.pdf,

accessed 11/03/2016

Adrian Brown, Practical Digital Preservation, Facet 2013

Edward M. Corrado and Heather Lea Moulaison, Digital Preservation for Libraries, Archives, and

Museums, Rowman & Littlefield 2014

Tim Gollins, Parsimonious preservation: preventing pointless processes!,

http://www.nationalarchives.gov.uk/documents/information-management/parsimonious-preservation.pdf, accessed 06/03/2016

Margaret Hedstrom, Christopher A. Lee, Significant properties of digital objects: definitions,

applications, implications, https://www.ils.unc.edu/callee/sigprops_dlm2002.pdf accessed 09/03/2016

Helen Heslop, An Approach to the Preservation of Digital Records, National Archives of Australia

2002, http://www.naa.gov.au/Images/An-approach-Green-Paper_tcm16-47161.pdf, accessed

11/03/2016

Sarah Higgins, The DCC Curation Lifecycle Model, The International Journal of Digital Curation,

Volume 3, Issue 1, 2008, p. 134-140

Kirnn Kaur, Report on testing of cost models and further analysis of cost parameters, APARSEN 2013

https://rd-alliance.org/system/files/filedepot/113/APARSEN-REP-D32_2-01-1_0.pdf, accessed

11/03/2016

Anna Kugler, Hannes Kulovits, From TIFF to JPEG 2000?, D-Lib Magazine, Volume 15, Issue 11/12,

2009, http://www.dlib.org/dlib/november09/kulovits/11kulovits.html, accessed 11/03/2016

Brian F. Lavoie, The Open Archival Information System Reference Model: Introductory Guide (DPC

Technology Watch), OCLC and DPC 2004, http://www.dpconline.org/docs/lavoie_OAIS.pdf and its 2nd

2014 edition http://www.dpconline.org/component/docman/doc_download/1359-dpctw14-02,

accessed 11/03/2016

Jenny Mitcham, Preservation of Digital Objects at the Archaeology Data Service; in: Janet Delve and

David Anderson, Preserving Complex Digital Objects, Facet Publishing 2014

Jenny Mitcham, Chris Awre, Julie Allinson, Richard Green and Simon Wilson, Filling the Digital

Preservation Gap. A Jisc Research Data Spring project. Phase One report - July 2015,

https://dx.doi.org/10.6084/m9.figshare.1481170.v1, accessed 11/03/2016

Jenny Mitcham, Chris Awre, Julie Allinson, Richard Green and Simon Wilson, Filling the Digital

Preservation Gap. A Jisc Research Data Spring project Phase Two report - February 2016,

https://dx.doi.org/10.6084/m9.figshare.2073220.v1, accessed 11/03/2016


David Pearson and Colin Webb, Defining File Format Obsolescence: A Risk Journey, The

International Journal of Digital Curation, Volume 3, Issue 1, 2008, p. 89-106

Megan Phillips et al., The NDSA Levels of Digital Preservation: An Explanation and Uses,

http://www.digitalpreservation.gov/documents/NDSA_Levels_Archiving_2013.pdf, accessed 10/03/2016

Chris Pickford, Eastern Promise – A Strategy for Archival Development in the East of England, EERAC

2003

Jeff Rothenberg, Ensuring Longevity of Digital Information, CLIR 1999,

http://www.clir.org/pubs/archives/ensuring.pdf, accessed 11/03/2016

Jeff Rothenberg, Preserving Authentic Digital Information, CLIR 2000,

http://www.clir.org/pubs/reports/pub92/rothenberg.html, accessed 11/03/2016

Bronwen Sprout et al., Archivematica As a Service: COPPUL's Shared Digital Preservation Platform,

http://summit.sfu.ca/system/files/iritems1/15519/CJILS39.2-9-Sprout.pdf, accessed 11/03/2016

Adam Tovell and James Knight, Directory of UK Sound Collections, British Library 2015:

http://www.bl.uk/britishlibrary/~/media/subjects%20images/sound/directory%20of%20uk%20sound%20collections.pdf, p. 250-294, accessed 11/03/2016

Colin Webb, David Pearson and Paul Koerbin, 'Oh, you wanted us to preserve that?!' Statements of

Preservation Intent for the National Library of Australia's Digital Collections, D-Lib Magazine, Volume

19, Issue 1/2, 2013 http://www.dlib.org/dlib/january13/webb/01webb.html, accessed 05/03/2016

Geoffrey Yeo, Trust and context in cyberspace, Archives and Records, Volume 34, Issue 2, Routledge

2013, p. 214-234

Archives and Records Council Wales Digital Preservation Working Group Case Study:

http://www.nationalarchives.gov.uk/documents/Cloud-Storage-casestudy_Wales_2015.pdf, accessed

09/03/2016

East of England Digital Preservation Regional Pilot Project, MLA East of England and EERAC 2006,

http://www.data-archive.ac.uk/media/1680/DARP_finalreport.pdf, accessed 08/03/2016

Reference Model for an Open Archival Information System (OAIS). Magenta Book. Issue 2.,

Consultative Committee for Space Data Systems 2012,

http://public.ccsds.org/publications/archive/650x0m2.pdf, accessed 11/03/2016

Trustworthy Repositories Audit & Certification: Criteria and Checklist, OCLC and CRL 2007,

https://www.crl.edu/sites/default/files/d6/attachments/pages/trac_0.pdf, accessed 11/03/2016


Appendix A Digital Records Accessioning Checklist (Draft Version January 2016)


26 January 2016

Accessioning Digital Records Guide

Accessioning Digital Records does not differ from the standard Accession Procedure. Please refer to it first and follow the steps as described. Accessioning Digital Records corresponds to the Ingest entity within the OAIS functional model. For the purpose of this guide it is broken down into Transfer and Appraisal, since donors and depositors do not always perform selection and arrangement before transferring digitally born archives. If the digital material is in need of preservation (conversion to preservation file formats), this should also precede Ingest.

[Figure: workflow stages: Transfer, Appraisal, Preservation, Ingest, Archival Storage, Access]

(1.1.1) At the point of contact, negotiate technical details for the delivery of digitally born archives. Please refer to (and explain to the donor/depositor) the Advice to Creators of Digital Records document.

(1.1.2) Obtain Intellectual Property Rights and permission to manipulate and/or destroy (delete) digital material for the purposes of preservation, appraisal (in line with the NRO Collecting Policy) and access. Permission is needed to authorise the procedures necessary to meet preservation objectives, for example: creating a new version of the archived item so that it can be rendered by current technologies, discarding material that does not meet the criteria of the NRO Collecting Policy, or creating access copies to be published online on the NRO website.

(1.2.1) Run a virus scan on the received digital records on a designated Digital Preservation workstation (quarantined work area) that accepts all incoming digital records accessions. This will happen automatically if the digital content is sent as a file transfer over the internet or another network (FTP transfer, e-mail, downloadable link etc.).

(1.2.2) If the digital records are received on removable media (DVD, CD-ROM, USB memory stick etc.), connect them to the designated DP workstation to run the virus scan.

(1.2.3) Make sure to update the Removable Media Inventory if the media are to be retained in the Cold Store. Number all carriers consecutively based on their accession number, e.g., ACC 2012/33 RM 2 of 5 (where RM stands for Removable Media).

(1.3.1) Irrespective of how the digital records are delivered (an accession might comprise both file transfers and removable media, as a hybrid/mixed accession), two copies should be created of the entire dataset comprising the accession: one respecting original order and preserving all digital material as it was deposited/donated, and a second, working copy, created in exactly the same manner but intended for an archivist to perform appraisal (also known as digital curation). As mentioned, both datasets should be structured as originally delivered. In the case of a mixed accession including different removable media, a separate directory should be created for each carrier of digital material. These should be named according to the convention <Sequential Number 000>_<Type of Carrier>_<Label if Applicable>, for example 001_Floppy-Disc_Accounts-1993, and all placed within a removableMedia folder located at the top-level directory you are working in.

(1.3.2) Review the content of the accession and create intellectual arrangements. This is the Appraisal stage: it is desirable to decide at this point how to structure the digital records into a Submission Information Package. If some of the files fall outside the scope of the collection policy, they should be deleted now, subject to the donor’s/depositor’s permission. Please refer to the File Formats Action List document to identify superfluous objects.

(1.3.3) As part of the above process, ensure that all changes are recorded. Create a metadata folder containing a metadata.csv file, based on an Excel spreadsheet template (coming soon), next to the submissionDocumentation folder within the directory storing the digital objects you are working on. Provide descriptive information at the appropriate level, complying with the Dublin Core metadata standard.

(1.4) Create disk images for removable media using BitCurator if it is important to preserve the functionality or particular features of the physical carrier of the digital content (for example the menu of a DVD Video or an interactive CD-ROM).

(1.5.1) Collect Fixity Information: check whether fixity information has been delivered by the donor/depositor and verify that the files are not corrupted.

(1.5.2) Generate Fixity Information: check checksums before and after transfer whenever data are copied from one storage system (physical discs, CD-ROM, DVD, USB memory stick etc.) to another (local file system, server storage, shared drives etc.). Check the file count and file sizes. Tool: Fsum Frontend (http://fsumfe.sourceforge.net/index.php?page=usage). To create a .sha2 checksum file for all files within a directory:

1. Select « Generate check file » in the menu.

2. Select the folder containing your files.

3. To choose where the result is saved, select the second option from the drop-down menu: “1 file in any place”. Otherwise, with the first option, “1 file in tree root”, the program creates the file within the directory containing your files.

4. Select the format SHA2 512 (used by Archivematica) and click the « Generate » button.
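The before-and-after comparison described in step (1.5.2) can also be scripted. The following is a minimal Python sketch, not a replacement for Fsum Frontend and not its file format: the function names (sha512_of_file, build_manifest, compare_manifests) are hypothetical and it assumes only that SHA-512 is the chosen algorithm.

```python
import hashlib
from pathlib import Path


def sha512_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-512 hex digest of one file, reading it in chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to its SHA-512 digest."""
    return {
        p.relative_to(root).as_posix(): sha512_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_manifests(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Report files that went missing, appeared, or changed during a transfer."""
    shared = before.keys() & after.keys()
    return {
        "missing": sorted(set(before) - set(after)),
        "unexpected": sorted(set(after) - set(before)),
        "changed": sorted(name for name in shared if before[name] != after[name]),
    }
```

Building one manifest on the source medium and another on the destination, then comparing them, covers both the checksum check and the file count check; three empty lists mean that the copy matches the original.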

(1.6) Scan for sensitive and personal information (credit card details, addresses, phone numbers etc.).

(1.7) Identify file formats using DROID (http://www.nationalarchives.gov.uk/information-management/manage-information/policy-process/digital-continuity/file-profiling-tool-droid/)
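For plain-text content, a first pass at step (1.6) could look like the Python sketch below. It is illustrative only: the three patterns (email address, UK-style phone number, card-like number) and the function names are assumptions made for this example, they will miss many cases and raise false positives, and postal addresses are not covered at all.

```python
import re

# Illustrative patterns only; a real survey needs broader, locally agreed rules.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
UK_PHONE_RE = re.compile(r"(?<!\d)(?:\+44\s?|0)\d{2,4}[\s-]?\d{3,4}[\s-]?\d{3,4}(?!\d)")
CARD_RE = re.compile(r"(?<!\d)(?:\d[ -]?){12,18}\d(?!\d)")


def luhn_valid(number: str) -> bool:
    """Apply the Luhn checksum used by payment card numbers."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:  # double every second digit, counting from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return len(digits) >= 13 and total % 10 == 0


def scan_text(text: str) -> dict[str, list[str]]:
    """Return candidate sensitive strings found in text, grouped by kind."""
    return {
        "email": EMAIL_RE.findall(text),
        "phone": [m.group() for m in UK_PHONE_RE.finditer(text)],
        "card": [m.group() for m in CARD_RE.finditer(text) if luhn_valid(m.group())],
    }
```

Any hit is only a prompt for an archivist to review the file; an empty result does not show that a file is free of personal data.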

(1.8) Include a digital copy of your email correspondence with the donor/depositor and any other documentation related to the transfer within the submissionDocumentation folder, which should be created within the top directory you are working on. With a view to using Archivematica in the future, all metadata generated throughout the accessioning process (checksums, DROID results etc.) should be put into the submissionDocumentation folder, next to the metadata folder that will contain the metadata.csv file created by an archivist according to the Excel template.

(1.9) Assess the overall size of the accession in bytes and convert the figure to an easily readable format (e.g. MB, GB, TB).

Glossary

Fixity information - hash, message digest, checksum, manifest file.

Bit Rot - On magnetic media the binary digits are (essentially) represented by individual particles of magnetic material whose polarity represents either 1 or 0. Sometimes interference or just general degradation of the media can cause these particles to flip, reversing the meaning of a particular bit. On optical media, physical damage or decay of the dyes used in writable DVDs and CDs has similar effects. There is usually a degree of error correction built in, but eventually errors can build up and corrupt data irretrievably.

Disk Image - copy of the bitstream that is read off the disk through the computer’s input/output equipment. The standard forensics software that creates a disk image also generates a cryptographic hash of the entire disk image.
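The size assessment in step (1.9) above can be sketched in Python as follows. The function names are hypothetical, and the conversion assumes decimal units (1 MB = 1,000,000 bytes); a file manager that reports binary units will show slightly different figures.

```python
from pathlib import Path


def accession_size_bytes(root: Path) -> int:
    """Total size in bytes of every file below root."""
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())


def human_readable(size_bytes: int) -> str:
    """Convert a byte count to bytes, KB, MB, GB or TB (decimal units assumed)."""
    if size_bytes < 1000:
        return f"{size_bytes} bytes"
    size = float(size_bytes)
    for unit in ("KB", "MB", "GB"):
        size /= 1000
        if size < 1000:
            return f"{size:.1f} {unit}"
    return f"{size / 1000:.1f} TB"
```

Recording both the exact byte count and the readable figure keeps the accession record precise while remaining easy to read.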

Appendix B

Advice to Depositors of Digital Records

Transfer of Intellectual Property Rights

Archives may not be able to assign their limited resources to the task of preserving data whose value is unknown, but at the same time there is a need to preserve ‘valuable’ datasets. This is why we ask for permission to manipulate and/or destroy digital content donated to us, so that we can ensure the best use of our resources and prioritise accepting deposits in accordance with our collecting policy (see our website: http://www.archives.norfolk.gov.uk/view/NCC098771). Permission to destroy is needed in order to perform preservation tasks as well as to ensure that Norfolk Record Office meets the requirements of its collecting policy.

Please be aware that by signing the accession form you give Norfolk Record Office the authority to process, migrate and destroy the data for the purpose of preservation. This means that the original data carriers (removable media on which the data were stored, such as USB memory sticks, CDs, floppy discs etc.) may be discarded.

Norfolk Record Office preserves the collections donated to it so that they are accessible to the general public. Please let us know if the digital records contain any sensitive or confidential information, so that public access can be restricted for a suitable period of time.

Metadata

In order to ensure that the digital records are accessible in the future we must collect all necessary information required to open and view digital files. To the best of your knowledge please provide us with information about:

- What software (including version) was used to create, open, read, edit and save the file/s;

- What operating system was used (including version; for example Windows XP Service Pack 2003, Mac OS X 10.6.8 etc.);

- For what purpose the data were generated/created, and around what time.

Please complete the Digital Files and Removable Media Inventory form (Excel spreadsheet template), listing the content of your deposit and, whenever possible, its size on disk.

For Current Records managers

Preferred Deposit Format

In order to ensure long-term sustainability of access to the records, it is recommended that records managers use current preservation formats. If it is within the means and resources of your organisation, export the data that you want to preserve and deposit with NRO in the following formats.

Media Type | File Formats
Text | PDF/A: Portable Document Format (Archival; ISO 19005-3 compliant)
Image (Raster) | TIFF: Uncompressed Baseline Tagged Image File Format v.6 (no LZW compression)
Image (Raster) | PNG: Portable Network Graphics (lossless compression)
Image (Vector) | SVG: Scalable Vector Graphics File
Sound | WAV: Broadcast Wave Format

For Existing Digital Records

Accepted Deposit Format

Media Type | File Formats
Text | DOCX: MS Word Open XML Document (created in MS Office 2007 and above)
Text | XLSX: MS Excel Open XML Document (created in MS Office 2007 and above)
Text | PPTX: MS PowerPoint Open XML Document (created in MS Office 2007 and above)
Text | ODT: OpenDocument Text Document (created in OpenOffice)
Text | ODS: OpenDocument Spreadsheet (created in OpenOffice)
Text | ODP: OpenDocument Presentation (created in OpenOffice)
Text | PDF/A: Portable Document Format (Archival)
Text | TXT: Plain Text File (ANSI or UTF-8 encoded)
Text | RTF: Rich Text Format File
Text | XML: Extensible Markup Language Data File
Text | CSV: Comma Separated Values File
Image (Raster) | TIFF: Tagged Image File Format
Image (Raster) | PNG: Portable Network Graphics
Image (Vector) | SVG: Scalable Vector Graphics File
Sound | WAV: Waveform Audio File Format
Sound | AIFF: Audio Interchange File Format
Sound | MP3: Moving Picture Experts Group Layer 3 compression
Sound | FLAC: Free Lossless Audio Codec File
Sound | OGG: Ogg Vorbis Audio File
Video* | MPEG-1/2: Moving Picture Experts Group
Video* | AVI: Audio Video Interleave File (uncompressed)
Video* | MOV: Quicktime Movie (uncompressed)
Video* | MP4: Moving Picture Experts Group (with H.264 encoding)
Email | EML: Electronic Mail Format
3D Graphics | OBJ: Wavefront Object files

DROID

If you are depositing a large amount of data, on the scale of a system migration, please run DROID before submitting your deposit (http://www.nationalarchives.gov.uk/documents/information-management/droid-how-to-use-it-and-interpret-results.pdf).

Appendix C

Digital Audit Questionnaire

The aim of this questionnaire is to identify the requirements for future storage and preservation of any digital material in the possession of NRO. This will mainly encompass:

- Digitally born records being deposited with NRO or already held by NRO

- Outputs of digitisation projects (both images and sounds)

- Electronic records generated by NRO itself (organisational records like office administration, email correspondence etc.)

It is important that all employees take part in this exercise (needs assessment) so that the scope of the necessary actions can be fully understood.

PAST ARCHIVE SERVICE ACTIVITIES

1. Are you aware of any important digital material within your department that must be kept (digital files/electronic records like text documents, spreadsheets, scanned images or digital photographs) and that is stored on a network drive, an external hard drive or any type of removable media (DVDs, CDs, floppy disks, memory cards, USB memory sticks etc.)?

2. If yes, please provide details below:

Type of Storage (network drive, CD, DVD, external HDD, floppy disk etc.)

Type of content (spreadsheets, word documents, PDFs, digital images/photographs, scanned documents)

Volume (size on disk in either MB, GB or TB; if small put less than 1MB)

3. Email: do you know of any emails that you have sent or received that should have been kept for future reference? In this situation would you normally print off the email and file the printout? Would you consider, as an alternative, printing the email to a PDF file and saving it to a designated network drive location?

CURRENT ARCHIVE SERVICE ACTIVITIES

1. In your everyday tasks at work do you work with digital material (files)?

2. What are they?

3. Do you think they are important to an extent that they would need to be preserved over time for future access?

4. How strongly would you identify the need to do so? In other words, what is the value of the digital material that you produce/work on? Does it need to be kept by NRO? If yes, for how long?

5. If you’ve answered yes to the above, can you estimate the volume of digital material that is being produced (the amount of data that needs to be kept)?

Small (can be specified in MB), Medium (can be specified in GB), Large (can be specified in TB, PB)

Thank you

Appendix D

A survey of digitally born archives received by the Norfolk Record Office, compiled with The National Archives’ DROID profiling tool, identified 107 different file formats.

File Formats per Type

Image (Raster): 64%
Miscellaneous: 10%
Word Processor: 8%
Text (Mark-up): 7%
Email: 6%
Page Description: 2%
Text (Structured): 2%
Image (Raster), Aggregate: 1%
Presentation: 0%
Image (Vector): 0%
Audio: 0%
Video: 0%
Spreadsheet: 0%
Audio, Video: 0%
Text (Unstructured): 0%
Dataset: 0%
Aggregate: 0%
Database: 0%
Image (Vector), Text (Mark-up): 0%
