Digitisation Infrastructure - June 2007
DESCRIPTION
The presentation looks at some of the key capabilities that are required, whether at a campus-wide, regional or national level, to make sure that digitisation happens effectively and as rapidly as possible, and offers value for money in the medium and long term.
TRANSCRIPT
Joint Information Systems Committee April 12, 2023 | Programme Meeting | Slide 1
Publishing Cultural Heritage
Alastair Dunning
Digitisation Programme Manager, JISC (Joint Information Systems Committee), 0203 006 6065
UCL Presentation, 19th June
JISC Digitisation Programme
Manager for 8 projects, part of a 16-project programme to digitise UK cultural heritage. For example:
– British Newspapers 1620-1900
– Pre-Raphaelite Art
– Images from Scott Polar Research Institute
– Nineteenth-Century Pamphlets
– 20th-century Government Cabinet Papers
– http://www.jisc.ac.uk/digitisation
Started April 2007, finishing March 2009
Digitisation is easy
http://homepage.mac.com/xcia0069/lizzie-innes/index.htm
Growth of Digitisation
The possibilities of the Internet inspired rapid data capture of precious objects all over the world
But maybe this started out as a reactive cottage industry?
– Museums, Libraries and Archives rushing to digitise material and dump it on the web
How long does this material last on the Internet? Is it good quality? Can people locate it? Can they use it?
The quantity of material and the issue of long-term digitisation affect published material. Added pressure is supplied by the Google digitisation programme
… Digitisation is difficult
Need for an infrastructure
To address the issues raised in the previous slide
– How long does this material last on the Internet? Is it good quality? Can users locate it? Can they use it?
Illustrations from the British model; other countries’ models may be different
Demonstration that mass digitisation is complex, involving multiple players and technologies
Good infrastructure allows publication of cultural heritage to happen quickly; to show value for money; to be usable; to be easily accessible by educational communities and general public
Data capture
To convert the physical to digital
– Flat scanners, robotic scanners, 3D scanners, direct capture via digital camera, remote controlled camera, conversion via medium (e.g. microfilm), reel-to-digital, millions of typists
To cope with all kinds of material (newspapers, stained glass, banners, posters, maps, census, reports, grey literature, artefacts, film, audio … )
Need to have a keen idea of priorities for digitisation
Ensure competition but not redundancy (keep machines working; keep staff in place)
Requires research on the success of methodologies, and dialogue with other subject areas (e.g. the sciences)
If you don’t have a range of options for data capture – cultural heritage won’t get digitised
University of Southampton Robotic Scanner – Details at
http://www.soton.ac.uk/mediacentre/news/2004/nov/04_181.shtml
Standards and Formats
Which file formats ensure high-quality, long-term use?
– Images - TIFF, but also JPEG2000, PNG
– Text – XML (and flavours thereof), but also RTF, Word
– Sound – WAV, AIFF, MP3, Ogg (formats and wrappers)
– Film – MJPEG, MPEG4, AVI, Quicktime, Flash (ditto)
Normally developed internationally, but local variations occur
Co-ordination, certification, co-operation, involvement and decisiveness at national and international levels
As with all parts of the infrastructure, research and innovation are needed
If you don’t have this – see current mess over video!
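One practical consequence of settling on formats is that incoming files can be checked against them. The following is a minimal Python sketch, not part of the presentation, that guesses a file's format from its leading "magic bytes" for some of the formats listed above. The function names and the choice of formats covered are illustrative assumptions; a production workflow would more likely rely on a dedicated format-identification tool.

```python
def sniff_format(header: bytes) -> str:
    """Guess a file format from its first 12 bytes; 'unknown' if no match."""
    # TIFF: little-endian ("II") or big-endian ("MM") byte-order mark plus 42.
    if header[:4] in (b"II*\x00", b"MM\x00*"):
        return "TIFF"
    # PNG: fixed 8-byte signature.
    if header[:8] == b"\x89PNG\r\n\x1a\n":
        return "PNG"
    # JPEG 2000 (JP2 container): 12-byte signature box.
    if header[:12] == b"\x00\x00\x00\x0cjP  \r\n\x87\n":
        return "JPEG2000"
    # RIFF is a wrapper; the form type at bytes 8-11 names the content.
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "WAV"
    if header[:4] == b"RIFF" and header[8:12] == b"AVI ":
        return "AVI"
    # AIFF uses the IFF "FORM" wrapper.
    if header[:4] == b"FORM" and header[8:12] in (b"AIFF", b"AIFC"):
        return "AIFF"
    # Ogg pages start with the capture pattern "OggS".
    if header[:4] == b"OggS":
        return "Ogg"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read only the first 12 bytes of a file and guess its format."""
    with open(path, "rb") as handle:
        return sniff_format(handle.read(12))
```

Note how WAV and AVI share the same RIFF wrapper and differ only in the form type, which is the "formats and wrappers" distinction made in the list above.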
Metadata
Requires the sophisticated knowledge of experts who know the digital objects (e.g. newspapers, sound recordings, census reports)
As before, international co-ordination, certification and co-operation are needed to develop international schemas and vocabularies
These are required at subject level, format level, technical level and preservation level. For example:
– Dublin Core, MODS – generic resource description
– VRA4 – digital image description, including technical details
– METS – wraps together different information on a digital object
– PREMIS – preservation metadata over long term
If you don’t have this – trust and authenticity, interoperability, resource discovery are severely hindered
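To make the idea of generic resource description concrete, here is a minimal Python sketch, not taken from the presentation, that builds a simple Dublin Core record as XML using only the standard library. The element names (title, type, format, language) come from the Dublin Core element set; the `record` wrapper element, the function name and all field values are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the 15-element Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"


def dublin_core_record(fields: dict) -> str:
    """Serialise a mapping of Dublin Core element names to values as XML."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields.items():
        child = ET.SubElement(root, "{%s}%s" % (DC_NS, name))
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Illustrative values only; not real catalogue data.
example = dublin_core_record({
    "title": "Sample pamphlet (hypothetical item)",
    "type": "Text",
    "format": "image/tiff",
    "language": "en",
})
print(example)
```

Because every element is qualified with the shared Dublin Core namespace, a harvester that has never seen this collection can still pick out the title or format, which is the interoperability benefit the slide points to.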
Data Delivery
That is, the people who build websites
Complex engagement between commercial (Google, ProQuest, Thomson Gale, JSTOR) and non-commercial suppliers (universities, museums etc.)
Huge range of potential business models
– Institutional subscription, Personal subscription
– Pay-per-view, Google Ads
– Open Access
– Mixed model
But no definitive answers about which are the more successful
Data Delivery – What is required
Ability to regularly serve up websites and data
Systems to deliver a range of digital content (e.g. newspapers, audio, posters, artefacts)
Low overheads and year-on-year costs
Good understanding of end-users
Working in partnership with other content providers
Commitment to innovation and good practice
If you don’t have this – the wheel will be constantly reinvented, users will be driven away, material will be siloed
Preservation Facilities
Digital objects become obsolete with time. Experts are required to ensure this does not happen
– Expertise in handling digital assets (content and all metadata) in the long term, and preferably also the hardware and media that hold such content
– Must be trusted and reliable
– Good relationship with data delivery providers
– Continual research – why, what and how to preserve?
Without this, digital data will be lost, endangering the entire investment made in digitisation
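One routine building block of digital preservation is fixity checking: recording a checksum for every file and re-checking it later to detect silent corruption or loss. The presentation does not prescribe a method, so the following Python sketch is an assumption for illustration only; the function names and the choice of SHA-256 are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file under root (by relative path) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root: Path, manifest: dict) -> list:
    """List files from the manifest that are now missing or altered."""
    current = build_manifest(root)
    return sorted(
        name for name, digest in manifest.items()
        if current.get(name) != digest
    )
```

A manifest built at ingest and stored alongside the preservation metadata can be re-verified on a schedule; any name returned by `changed_files` signals an object that needs to be restored from another copy.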
Preservation Facilities – Case Study
A good example from the late 1990s
Orphaned archaeological data rescued from obsolescence
CDs, floppy discs, PCs, databases, Word files and CAD files were all left behind
But lack of metadata meant not all data could be retrieved
http://ahds.ac.uk/creating/case-studies/newham/
Digitisation Infrastructure
Network capabilities
Authentication
Tools Development
Usability testing
Copyright clearing houses
Consultants
Trained expert staff
Suitable courses
Data capture
Standards, Formats
Metadata
Data Delivery
Preservation
And, of course, money
The skill is in making sure these pieces fit together