Open Repositories 2007, San Antonio, 23 Jan. 2007 1

Funded by:

© AHDS

Grid activities at the Arts and Humanities Data Service

Mark Hedges

Arts and Humanities Data Service, King’s College London

OGF 20, Manchester, 7 May 2007

Overview

• What is the AHDS?

• Grid applications at the AHDS

• Next steps

What is the Arts and Humanities Data Service?

What does the AHDS do?

• Preserves and distributes digital resources for research and teaching in arts and humanities subjects

• These resources are free for educational and private use

• Generally available online

How is the AHDS Organised?

• Established in 1996

• Evolved considerably over 10 years

• Managing Executive

• Geographically distributed centres for particular disciplines: History, Visual Arts, Performing Arts, Archaeology, Literature/Languages/Linguistics

• Virtual centres for other disciplines

Collections

History

Archaeology

Literature/Linguistics

Visual Arts

Performing Arts

• Highly diverse in terms of type and size

• Images, text, databases, video, sound, multimedia

• Complex internal structures

• Require discipline-specific knowledge to process

• Increased acquisition rate

Museum of London Archaeological Archive

New Survey of London Life and Labour, 1929-1931

Corpus of Romanesque Sculpture in Britain

Imperial War Museum

Electronic Corpus of Tyneside English

Grids and the AHDS

Grid activities at the AHDS

• DARIAH

• Preservation environment

• Repositories and grids

Digital Preservation

“Ensuring the usability of a digital resource through changing technological regimes with a minimum loss of the resource’s intellectual content.”

AHDS Preservation Glossary

AHDS preservation approach

• Complies with OAIS (Open Archival Information System) reference model

• Preservation actions on ingest

- Capture preservation metadata set

- Format normalisation

• Post-ingest: monitoring format/tool obsolescence and format migration

Data grid based preservation

• First approach - based on Storage Resource Broker (SRB)

• Virtualisation of storage

• Distributed across heterogeneous resources (within AHDS and elsewhere)

• Multiple replicas

• Metadata associated with data object

• Transparent migration to new media
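The idea of storage virtualisation with replicas can be sketched in a few lines. This is only an illustration of the concept, not SRB's API: the class names, resource names and paths below are all hypothetical. A logical name maps to metadata and to one or more physical copies, so a copy can move to new media without the logical name changing.

```python
from dataclasses import dataclass, field


@dataclass
class DataObject:
    """A logical data object with metadata and its physical replicas."""
    logical_name: str
    metadata: dict = field(default_factory=dict)
    replicas: dict = field(default_factory=dict)  # resource name -> physical path


class ReplicaCatalogue:
    """Hypothetical catalogue mapping logical names to physical copies."""

    def __init__(self):
        self._objects = {}

    def register(self, logical_name, resource, physical_path, **metadata):
        """Record a replica on a resource and attach any metadata given."""
        obj = self._objects.setdefault(logical_name, DataObject(logical_name))
        obj.replicas[resource] = physical_path
        obj.metadata.update(metadata)
        return obj

    def migrate(self, logical_name, old_resource, new_resource, new_path):
        """Move a replica to new media; the logical name is unchanged."""
        obj = self._objects[logical_name]
        del obj.replicas[old_resource]
        obj.replicas[new_resource] = new_path

    def locate(self, logical_name):
        """Return (resource, path) pairs for every replica, sorted by resource."""
        return sorted(self._objects[logical_name].replicas.items())


# All names and paths here are made up for illustration.
name = "/ahds/history/survey.csv"
catalogue = ReplicaCatalogue()
catalogue.register(name, "disk-kcl", "/data/01/survey.csv", format="CSV")
catalogue.register(name, "tape-remote", "/vault/a9/survey.csv")
catalogue.migrate(name, "tape-remote", "disk-remote", "/pool/7/survey.csv")
```

A client that only knows the logical name keeps working after the migration, which is the sense in which the move is transparent.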

Drawbacks

• Difficult to integrate specialised preservation requirements

• Implemented as external client code

• Metadata limited in scope (compared to what we want to store for our complex objects)

Enhanced approach

• Based on iRODS (integrated Rule-Oriented Data System)

• Data management or preservation actions encoded as rules built up from atomic services

• Rules integrated with system, yet easily changeable
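To make "rules built up from atomic services" concrete, here is a minimal sketch in Python. It is not iRODS rule syntax and does not use any iRODS API; `make_rule`, the two services and the format list are hypothetical names chosen for illustration. A rule is a condition plus an ordered chain of small services, and the rules sit in a plain table that can be edited without touching the services.

```python
def make_rule(condition, *services):
    """Build a rule: when condition(obj) holds, apply each service in order."""
    def rule(obj):
        if not condition(obj):
            return obj
        for service in services:
            obj = service(obj)
        return obj
    return rule


# Hypothetical atomic services; each takes and returns a dict describing an object.
def record_format(obj):
    """Derive a format label from the file extension."""
    return {**obj, "format": obj["name"].rsplit(".", 1)[-1].lower()}


def flag_for_normalisation(obj):
    """Mark objects whose format is on an (assumed) at-risk list."""
    return {**obj, "needs_normalisation": obj["format"] in {"doc", "wpd"}}


# Changing policy means editing this table, not the services themselves.
RULES = {
    "on_ingest": make_rule(lambda obj: True, record_format, flag_for_normalisation),
}

result = RULES["on_ingest"]({"name": "report.DOC"})
```

The design point is the separation: services stay small and reusable, while the rule table carries the policy.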

Simple example

• In the preservation archive, files are periodically checked for fixity and repaired as necessary

• Define a rule set implementing this, with multiple possibilities for corrective action.

• An advantage: the services can access more complex metadata held in external digital repository systems.
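A fixity check with more than one corrective action might look like the sketch below. It assumes SHA-256 checksums and in-memory replicas keyed by hypothetical site names; the talk does not say which checksum or which corrective actions the AHDS used, so both are assumptions made for illustration.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return the SHA-256 hex digest of the data."""
    return hashlib.sha256(data).hexdigest()


def check_fixity(replicas: dict, expected: str) -> str:
    """Verify every replica against the expected checksum and repair if possible.

    replicas maps a (hypothetical) resource name to the bytes stored there.
    Returns "ok", "repaired", or "unrecoverable".
    """
    good = [name for name, data in replicas.items() if checksum(data) == expected]
    bad = [name for name in replicas if name not in good]
    if not bad:
        return "ok"
    if not good:
        # No verified copy exists, so the only action left is to escalate.
        return "unrecoverable"
    # Overwrite each damaged replica from a verified copy.
    for name in bad:
        replicas[name] = replicas[good[0]]
    return "repaired"


original = b"Corpus of Romanesque Sculpture"
expected = checksum(original)
replicas = {"site-a": original, "site-b": b"corrupted bytes"}
status = check_fixity(replicas, expected)
```

Running the check a second time on the same replicas reports that everything is intact, since the damaged copy has been replaced.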

Nature of humanities research data

• “Hard” sciences – requirements derived from need for fast access to large distributed data sets, simulations

• Humanities – complexity and context dependency of research material

Digital repositories

• Need to represent humanities digital content so as to reflect its complexity and context.

• So: Store using flexible digital repository systems (Fedora at AHDS).

• Need seamless integration between these highly structured repositories.

• So: Integrate repository software with grid middleware.

Repository-Grid Integration

Two broad approaches:

• Grid as virtualised distributed storage.

• Repositories as data resources on grid.

Grid storage for Fedora

• Fedora has been integrated with SRB, providing virtualised storage.

• Currently looking at iRODS integration.

• Will be able to make use of the complex metadata stored within Fedora (for discovery as well as preservation).

Fedora repositories as grid resources

• Use grid technologies to allow access to distributed Fedoras belonging to different administrative domains.

• Registries to store information about repositories and contents.

• Grid AuthN and AuthZ mechanisms providing uniform access.

Contact

[email protected]