Digitisation Infrastructure - June 2007


Upload: alastair-dunning

Posted on 12-May-2015

Category: Education


DESCRIPTION

The presentation looks at some of the key capabilities that are required, whether at a campus-wide, regional or national level to make sure that digitisation happens effectively, as rapidly as possible and offers value for money in the medium and long term.

TRANSCRIPT

Page 1: Digitisation Infrastructure - June 2007

Joint Information Systems Committee April 12, 2023 | Programme Meeting | Slide 1

Publishing Cultural Heritage

Alastair Dunning

Digitisation Programme Manager JISC (Joint Information Systems Committee) [email protected], 0203 006 6065

UCL Presentation, 19th June

Page 2: Digitisation Infrastructure - June 2007

JISC Digitisation Programme

Manager for 8 projects, part of a 16-project programme to digitise UK cultural heritage. For example:

– British Newspapers 1620-1900

– Pre-Raphaelite Art

– Images from Scott Polar Research Institute

– Nineteenth-Century Pamphlets

– 20th-century Government Cabinet Papers

– http://www.jisc.ac.uk/digitisation

Started April 2007, finishing March 2009

Page 3: Digitisation Infrastructure - June 2007

Digitisation is easy

http://homepage.mac.com/xcia0069/lizzie-innes/index.htm

Page 4: Digitisation Infrastructure - June 2007

Growth of Digitisation

The possibilities of the Internet inspired rapid data capture of precious objects all over the world

But maybe this started out as a reactive cottage industry?

– Museums, Libraries and Archives rushing to digitise material and dump it on the web

How long does this material last on the Internet? Is it good quality? Can people locate it? Can they use it?

The quantity of material and the issue of long-term digitisation also affect published material. Added pressure is supplied by Google's digitisation programme

… Digitisation is difficult

Page 5: Digitisation Infrastructure - June 2007

Need for an infrastructure

To address the issues raised in the previous slide

– How long does this material last on the Internet? Is it good quality? Can users locate it? Can they use it?

Illustrations from the British model; other countries’ models may be different

Demonstration that mass digitisation is complex, involving multiple players and technologies

Good infrastructure allows publication of cultural heritage to happen quickly; to show value for money; to be usable; to be easily accessible by educational communities and general public

Page 6: Digitisation Infrastructure - June 2007

Data capture

To convert the physical to digital

– Flat scanners, robotic scanners, 3D scanners, direct capture via digital camera, remote controlled camera, conversion via medium (e.g. microfilm), reel-to-digital, millions of typists

To cope with all kinds of material (newspapers, stained glass, banners, posters, maps, census, reports, grey literature, artefacts, film, audio … )

Need to have a keen idea of priorities for digitisation

Ensure competition but not redundancy (Keep machines working; keep staff in place)

Requires research on the success of methodologies, and dialogue with other subject areas (e.g. the sciences)

Page 7: Digitisation Infrastructure - June 2007

If you don’t have a range of options for data capture – cultural heritage won’t get digitised

University of Southampton Robotic Scanner – Details at

http://www.soton.ac.uk/mediacentre/news/2004/nov/04_181.shtml

Page 8: Digitisation Infrastructure - June 2007

Standards and Formats

Which file formats ensure high-quality, long-term use?

– Images – TIFF, but also JPEG2000, PNG

– Text – XML (and flavours thereof), but also RTF, Word

– Sound – WAV, AIFF, MP3, Ogg (formats and wrappers)

– Film – MJPEG, MPEG4, AVI, Quicktime, Flash (ditto)

Normally developed internationally, but local variations occur

Co-ordination, certification, co-operation, involvement and decisiveness at national and international levels

As with all parts of infrastructure, research and innovation

If you don’t have this – see current mess over video!

Page 9: Digitisation Infrastructure - June 2007

Metadata

Requires the sophisticated knowledge of experts who know the digital objects (e.g. newspapers, sound recordings, census reports)

As before, requires international co-ordination, certification and co-operation to develop international schemas and vocabularies

These are required at the subject, format, technical and preservation levels. For example:

– Dublin Core, MODS – generic resource description

– VRA4 – digital image description, including technical details

– METS – wraps together different information on a digital object

– PREMIS – preservation metadata over long term

If you don’t have this – trust and authenticity, interoperability, resource discovery are severely hindered
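As a rough illustration of the generic resource description mentioned above, the following Python sketch builds a minimal Dublin Core record using only the standard library. The namespace is the published Dublin Core element set namespace; the wrapper element name `record`, the helper `dublin_core_record` and all field values are assumptions made for illustration, not taken from any JISC project.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"


def dublin_core_record(fields):
    """Build an XML record from (element name, value) pairs.

    The <record> wrapper is a hypothetical container chosen for this
    sketch; real services usually define their own wrapper schema.
    """
    ET.register_namespace("dc", DC_NS)
    record = ET.Element("record")
    for name, value in fields:
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record


# Illustrative values only; not drawn from a real catalogue.
example = dublin_core_record([
    ("title", "Example newspaper issue"),
    ("type", "Text"),
    ("format", "image/tiff"),
    ("language", "en"),
])

xml_text = ET.tostring(example, encoding="unicode")
print(xml_text)
```

Because every field uses a shared, internationally agreed element name, a record like this can be read by other systems without knowledge of the local catalogue, which is the interoperability point the slide makes.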

Page 10: Digitisation Infrastructure - June 2007

Data Delivery

i.e. the people who build websites

Complex engagement between commercial (Google, ProQuest, Thomson Gale, JSTOR) and non-commercial suppliers (universities, museums etc.)

Huge range of potential business models

– Institutional subscription, Personal subscription

– Pay-per-view, Google Ads

– Open Access

– Mixed model

But no definitive answers about which are the more successful

Page 11: Digitisation Infrastructure - June 2007

Data Delivery – What is required

Ability to regularly serve up websites and data

Systems to deliver a range of digital content (e.g. newspapers, audio, posters, artefacts)

Low overheads and low year-on-year costs

Good understanding of end-users

Working in partnership with other content providers

Commitment to innovation and good practice

If you don’t have this – the wheel will be constantly reinvented, users will be driven away, material will be siloed

Page 12: Digitisation Infrastructure - June 2007

Preservation Facilities

Digital objects become obsolete with time. Experts are required to ensure this does not happen

– Expertise in handling digital assets (content and all metadata) in the long term, and preferably also the hardware and media that hold such content

– Must be trusted and reliable

– Good relationship with data delivery providers

– Continual research – why, what and how to preserve?

Without this, digital data will be lost, endangering the entire investment made in digitisation
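One routine task behind "trusted and reliable" preservation is checking that a stored file has not silently changed. The Python sketch below shows a checksum-based fixity check, which is one common technique rather than anything the slides prescribe. The function names, the file name `master.tif` and the stand-in file contents are all assumptions for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, recorded_digest):
    """True if the file still matches the digest recorded earlier."""
    return sha256_of(path) == recorded_digest


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical master file; the bytes are a stand-in for image data.
    master = Path(tmp) / "master.tif"
    master.write_bytes(b"stand-in for image data")

    # Record the digest at ingest, alongside the other metadata.
    recorded = sha256_of(master)
    unchanged = verify(master, recorded)

    # Simulate later corruption of the stored file.
    master.write_bytes(b"stand-in for image data, corrupted")
    change_detected = not verify(master, recorded)

print(unchanged, change_detected)
```

The digest only detects change; recovering the original still depends on keeping independent copies, which is part of why dedicated preservation facilities are needed.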

Page 13: Digitisation Infrastructure - June 2007

Preservation Facilities – Case Study

A good example from the late 1990s

Orphaned archaeological data rescued from obsolescence

CDs, floppy discs, PCs, databases, Word files, CAD files all left behind

But lack of metadata meant not all data could be retrieved

http://ahds.ac.uk/creating/case-studies/newham/

Page 14: Digitisation Infrastructure - June 2007

Digitisation Infrastructure

Network capabilities

Authentication

Tools Development

Usability testing

Copyright clearing houses

Consultants

Trained expert staff

Suitable courses

Data capture

Standards, Formats

Metadata

Data Delivery

Preservation

And of course Money

The skill is in making sure these pieces fit together