Aggregation Using Linked Data – LOCAH Project Experiences

Post on 16-May-2015


DESCRIPTION

Workshop with Paul Walk and Herbert Van De Sompel at OAI7, Geneva, http://indico.cern.ch/conferenceTimeTable.py?confId=103325#20110622

TRANSCRIPT

                                                             

www.bath.ac.uk

UKOLN is supported by:

Aggregation Using Linked Data – LOCAH Project Experiences

23rd June 2011

OAI7, Geneva, Switzerland

Adrian Stevenson

LOCAH Project Manager

                                                             


LOCAH Project

• Linked Open Copac and Archives Hub

• Funded by #JiscEXPO 2/10 ‘Expose’ call

– 1 year project. Started August 2010

• Partners & Consultants:

– UKOLN – Adrian Stevenson, Julian Cheal

– Mimas – Jane Stevenson, Bethan Ruddock, Yogesh Patel

– Eduserv – Pete Johnston

– Talis – Leigh Dodds, Tim Hodson

– OCLC – Ralph LeVan, Thom Hickey

– Ed Summers

• http://blogs.ukoln.ac.uk/locah/ tag: #locah

                                                             


Archives Hub and Copac

• UK National Data Services based at Mimas

• Archives Hub is an aggregation of archival descriptions from archive repositories across the UK

– http://archiveshub.ac.uk

• Copac provides access to the merged library catalogues of libraries throughout the UK, including all national libraries

– http://copac.ac.uk

                                                             


What is LOCAH Doing?

• Part 1: Exposing Archives Hub & Copac data as Linked Data

• Part 2: Creating a prototype visualisation

• Part 3: Reporting on opportunities and barriers

                                                             


We’re Aggregating

• If something is identified, it can be linked to

• We take items from one dataset and link them to items from other datasets

[Diagram of datasets: BBC, VIAF, DBPedia, Archives Hub, Copac, GeoNames]
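The linking idea on this slide can be sketched in a few lines of Python. This is only an illustration: the example.org URIs and the labels are hypothetical placeholders rather than real Archives Hub or VIAF identifiers, and matching on a normalised label is an assumption made for brevity, not a description of how LOCAH matched names.

```python
# Sketch: link items in one dataset to items in another with owl:sameAs triples.
# All example.org URIs and labels below are hypothetical placeholders.

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def normalise(label):
    """Lower-case and collapse whitespace so trivially different labels compare equal."""
    return " ".join(label.lower().split())


def link_datasets(local, remote):
    """Return (subject, predicate, object) triples for items whose labels match.

    `local` and `remote` each map a URI to a human-readable label.
    """
    remote_by_label = {normalise(label): uri for uri, label in remote.items()}
    links = []
    for uri, label in sorted(local.items()):
        match = remote_by_label.get(normalise(label))
        if match is not None:
            links.append((uri, OWL_SAME_AS, match))
    return links


hub_people = {
    "http://example.org/hub/person/1": "Example, Alice",
    "http://example.org/hub/person/2": "Unmatched, Person",
}
authority_people = {
    "http://example.org/authority/42": "example,  alice",
}

links = link_datasets(hub_people, authority_people)
for s, p, o in links:
    # Print each link as an N-Triples statement.
    print(f"<{s}> <{p}> <{o}> .")
```

Because every item has its own URI, the link itself is just one more statement that either dataset can publish.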

                                                             


Enhancing our data

• Already have some links:

– Time - reference.data.gov.uk URIs

– Location - UK Postcodes URIs and Ordnance Survey URIs

– Names - Virtual International Authority File

• Matches and links widely-used authority files - http://viaf.org/

– Names - DBPedia

• Also looking at:

– Subjects - Library of Congress Subject Headings and DBPedia

http://data.archiveshub.ac.uk/

‘Aggregates’ property points to http://www.openarchives.org/ore/terms/aggregates
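As a rough illustration of what a statement using this property looks like, the sketch below writes N-Triples lines with the property URI given above. The example.org URIs are hypothetical; the slide does not say which Archives Hub resources aggregate which, so the subject and objects here are assumptions.

```python
# Sketch: state that one resource aggregates others, serialised as N-Triples.
# Only the property URI comes from the slide; the example.org URIs are hypothetical.

ORE_AGGREGATES = "http://www.openarchives.org/ore/terms/aggregates"


def aggregation_triples(aggregation_uri, member_uris):
    """Yield one N-Triples line per aggregated resource."""
    for member in member_uris:
        yield f"<{aggregation_uri}> <{ORE_AGGREGATES}> <{member}> ."


lines = list(
    aggregation_triples(
        "http://example.org/aggregation/hub",
        [
            "http://example.org/description/a",
            "http://example.org/description/b",
        ],
    )
)
print("\n".join(lines))
```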

Visualisation Prototype

• Using Timemap

– Google Maps and Simile

– http://code.google.com/p/timemap/

• Early stages with this

• Will give location and ‘extent’ of archive

• Will link through to Archives Hub

                                                             


BBC Music

                                                             


APIs, Mashups and Linked Data

• Mashups work against a fixed set of data sources

• Hand crafted by humans

• Don’t integrate well

• Linked Data promises an unbound global data space

• Easy dataset integration

• Generic ‘mesh-up’ tools
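The ‘easy dataset integration’ point can be illustrated with a small sketch: if each dataset is a set of (subject, predicate, object) triples, merging is plain set union, and a generic function can follow owl:sameAs links without knowing anything about either source. The example.org URIs, the `born` predicate and the values are hypothetical, and real tools would use an RDF library rather than Python sets.

```python
# Sketch: graphs as sets of triples; integration is set union plus link-following.
# All example.org URIs and values are hypothetical placeholders.

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
EX_BORN = "http://example.org/vocab/born"  # hypothetical predicate


def merge(*graphs):
    """Merge any number of triple sets into one graph."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def facts_about(graph, uri):
    """Collect (predicate, object) pairs for `uri` and everything linked to it
    by owl:sameAs, in either direction and transitively."""
    seen = set()
    todo = [uri]
    while todo:
        current = todo.pop()
        if current in seen:
            continue
        seen.add(current)
        for s, p, o in graph:
            if p == OWL_SAME_AS:
                if s == current:
                    todo.append(o)
                elif o == current:
                    todo.append(s)
    return {(p, o) for s, p, o in graph if s in seen and p != OWL_SAME_AS}


dataset_a = {
    ("http://example.org/a/person/1", RDFS_LABEL, "Example Person"),
    ("http://example.org/a/person/1", OWL_SAME_AS, "http://example.org/b/p42"),
}
dataset_b = {
    ("http://example.org/b/p42", EX_BORN, "1900"),
}

merged = merge(dataset_a, dataset_b)
facts = facts_about(merged, "http://example.org/a/person/1")
for predicate, value in sorted(facts):
    print(predicate, value)
```

Nothing in `merge` or `facts_about` is specific to the two sources, which is the sense in which such tools are generic rather than hand crafted.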

                                                             


Aggregation / Integration Challenges

                                                             


Sustainability

• Can you rely on data sources long-term?

• Ed Summers at the Library of Congress created http://lcsh.info

• Linked Data interface for LOC subject headings

• People started using it

                                                             


Library of Congress Subject Headings

                                                             


Scalability

• Will the Web of Data scale?

Example by Bradley Allen, Elsevier at LOD LAM Summit, SF, USA

                                                             


Data Modelling

• Complexity

– Archival description is hierarchical and multi-level

• Dirty Data
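To make the hierarchy point concrete, the sketch below flattens a nested, multi-level description into triples, one per parent-child relationship. The slides do not say how LOCAH modelled the levels, so the use of dcterms:isPartOf, the dictionary layout and the example.org URIs are all assumptions for illustration.

```python
# Sketch: turn a nested archival description into flat isPartOf triples.
# The nested structure and the example.org URIs are hypothetical.

DCTERMS_IS_PART_OF = "http://purl.org/dc/terms/isPartOf"


def flatten(unit, parent_uri=None):
    """Walk a nested description and return isPartOf triples, upper levels first."""
    triples = []
    if parent_uri is not None:
        triples.append((unit["uri"], DCTERMS_IS_PART_OF, parent_uri))
    for child in unit.get("children", []):
        triples.extend(flatten(child, unit["uri"]))
    return triples


collection = {
    "uri": "http://example.org/archive/collection",
    "children": [
        {
            "uri": "http://example.org/archive/series-1",
            "children": [{"uri": "http://example.org/archive/item-1"}],
        },
    ],
}

triples = flatten(collection)
for s, p, o in triples:
    print(f"<{s}> <{p}> <{o}> .")
```

Each level needs its own URI before it can appear in a triple, which is part of why multi-level descriptions take more modelling effort than flat records.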

Licensing

• ‘Ownership’ of data

• Hard to track attribution

• CC0 for Archives Hub and Copac data

                                                             


Linked Data the Way for Aggregation?

• Enables ‘straightforward’ aggregation of wide variety of data sources

• New channels into your data services

• Researchers are more likely to discover sources

• ‘Hidden’ collections of repositories become part of the Web

                                                             


Questions for Discussion

• Will using vocabularies and ontologies always be too difficult?

– Or will the tools appear?

– MS Access for Linked Data?

• Will the Web of Data scale?

                                                             


Questions if you’ve bought in

– What constitutes data worth linking to?

– How to find datasets suitable for interlinking?

– How to make my dataset worth linking to?

– How to encourage others to link to my data?

– What is the added value of links?

– How to determine the quality of a link?

                                                             


Attribution and CC License

• Sections of this presentation adapted from materials created by other members of the LOCAH Project

• This presentation is available under Creative Commons Non Commercial-Share Alike:

http://creativecommons.org/licenses/by-nc/2.0/uk/
