linked open data for cultural heritage

Linked Open Data Projects for Cultural Heritage: Evolution of an Information Technology. Julia Marsden – Carolyn Li-Madeo – Jeff Edelstein – Noreen Whysel – Lola Galla – Alison Rhonemus. Cultural Heritage: Description & Access, Pratt SILS LIS 670 – Spring 2013, Prof. Cristina Pattuelli

Upload: noreen-whysel

Post on 26-Jan-2015


Category:

Technology



DESCRIPTION

This paper surveys the landscape of linked open data projects in cultural heritage, examining the work of groups from around the world. Traditionally, linked open data has been ranked using the five-star method proposed by Tim Berners-Lee. We found this ranking to be lacking when evaluating how cultural heritage groups not merely develop linked open datasets, but find ways to use linked data to augment user experience. Building on the five-star method, we developed a six-stage life cycle describing both dataset development and dataset usage. We use this framework to describe and evaluate fifteen linked open data projects in the realm of cultural heritage.

TRANSCRIPT

Page 1: Linked Open Data for Cultural Heritage

Linked Open Data Projects for Cultural Heritage:

Evolution of an Information Technology

Julia Marsden – Carolyn Li-Madeo
Jeff Edelstein – Noreen Whysel
Lola Galla – Alison Rhonemus

Cultural Heritage: Description & Access

Pratt SILS LIS 670 – Spring 2013
Prof. Cristina Pattuelli

Page 2: Linked Open Data for Cultural Heritage

WHAT IS LINKED OPEN DATA?

Linked Data provides a mechanism for representing databases (RDF) and a mechanism for querying those databases (SPARQL)*

Linked Open Data uses W3C Semantic Web standards to create relationships between previously isolated data silos

Behind almost every website is a database, and although these sites are linkable, the information in their databases is left unconnected

*From the New York Times’ OPEN blog

Page 3: Linked Open Data for Cultural Heritage

REVIEW OF TERMINOLOGIES

RDF Triple
• Subject
• Predicate
• Object

URI
• Uniform Resource Identifier
• URLs and URNs are both kinds of URI

API
• An Application Programming Interface
• Allows software programs to interact with one another

SPARQL Query
• SPARQL Protocol and RDF Query Language
• Query language for RDF / Databases
• Allows users to write unambiguous queries
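To make the terms above concrete, here is a minimal sketch of triples and a pattern-style query in plain Python. It is a toy in-memory model, not an RDF library or a real SPARQL engine; the `http://example.org/` URIs, the `isKindOf` predicate, and the `match` helper are all hypothetical names chosen for illustration.

```python
from typing import List, Optional, Tuple

# A triple is (subject, predicate, object); each part is identified by a URI.
Triple = Tuple[str, str, str]

EX = "http://example.org/"  # hypothetical namespace, for illustration only

triples: List[Triple] = [
    (EX + "guppy", EX + "isKindOf", EX + "fish"),
    (EX + "salmon", EX + "isKindOf", EX + "fish"),
    (EX + "pony", EX + "isKindOf", EX + "horse"),
]


def match(store: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return triples matching the pattern; None plays the role of a variable."""
    return [
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Roughly analogous to the SPARQL query:
#   PREFIX ex: <http://example.org/>
#   SELECT ?s WHERE { ?s ex:isKindOf ex:fish }
for subject, _, _ in match(triples, p=EX + "isKindOf", o=EX + "fish"):
    print(subject)
```

Leaving a position as `None` behaves like a SPARQL variable: the query above asks for every subject that is a kind of fish, without naming any subject in advance.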

Page 4: Linked Open Data for Cultural Heritage

METHODOLOGY

• Affiliation / Mission / Intended Audience
• Knowledge Organization / Data Models & Vocabulary
• Technology Platform
• Usability / Interface Design
• Discovery (search & navigation)
• Data Shareability (i.e., availability of an API)
• Sustainability (i.e., digital preservation, documentation or available code)
• Project Leaders
• Funding Sources
• Level of Collaboration
• Analysis
• Star-Rating (i.e., Tim Berners-Lee's coffee cup)

Page 5: Linked Open Data for Cultural Heritage

LINKED DATA LIFE CYCLES

Stage 1. Developing Datasets: Release one or more datasets in linked open format, expressed as RDF triples, that others may use. Projects: Library of Congress; Pan-Canadian Documentary Heritage Network

Stage 2. Linking Data: Cultural heritage institutions link their datasets to others (e.g., DBpedia, VIAF, GeoNames) to enhance discovery and reuse of their collections. Projects: Hungarian National Library; Civil War Data 150; Linking Lives; Bibliothèque nationale de France

Stage 3. Documenting Processes for Reuse: Explain linked open data and ways that cultural heritage professionals can use datasets. Projects: New York Times; Deutsche Nationalbibliothek

Stage 4. Developing User Interfaces: Institutional or collaborative projects use the datasets to develop applications, including interfaces, visualizations, and augmented reality. Projects: Agora; Pan-Canadian Documentary Heritage Network; Amsterdam Mobile City App; Linked Jazz

Stage 5. Promoting Reuse: Institutions go beyond the creation of their own test projects, encouraging users to develop innovative applications. Projects: Open Cultuur Data; EUScreen

Stage 6. Expanding the Definition of Cultural Heritage: Efforts from outside the cultural heritage framework, such as government agencies and international aid organizations, can serve to strengthen societies and their cultural institutions. Project: Open Data for Resilience Initiative

Page 6: Linked Open Data for Cultural Heritage

Stage 1. Developing Datasets

Page 7: Linked Open Data for Cultural Heritage

Pan-Canadian Documentary Heritage Network
• Formed in 2010; highly collaborative effort across a broad spectrum of LAMs.
• Pilot project results published July 2012:
  • RDF metadata
  • Detailed project report
  • Demonstration video, “Out of the Trenches”
• Project content submitted in various formats:
  • War songs (MARC records; BAnQ)
  • War posters (spreadsheets; McGill)
  • Newspaper articles, postcards, and wartime records (MODS XML; University of Alberta)
  • Portrait archives of CEF soldiers; WWI documents (spreadsheets; University of Calgary)
  • Archival material from Saskatchewan War Experience Project (DC RDF; University of Saskatchewan)
• Use of external LOD datasets:
  • GeoNames, VIAF, LCSH, TGM, Rameau, LACSH
  • Metadata then mapped to ontologies (e.g., events, places, persons)
• Principal findings:
  • Good approach for resource integration and discovery
  • Considered “reuse” in terms of using element sets in multiple contexts (e.g., “role” as predicate or as object) and repurposing vocabularies
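The mapping step on this slide, where spreadsheet-style records are turned into triples that share a common vocabulary, might look roughly like the sketch below. This is not the project's actual pipeline: the column names, the `http://example.org/poster/` base URI, the sample row, and the choice of Dublin Core terms as target predicates are all assumptions made for illustration.

```python
import csv
import io
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical mapping from spreadsheet columns to Dublin Core term URIs.
FIELD_TO_PREDICATE = {
    "title": "http://purl.org/dc/terms/title",
    "creator": "http://purl.org/dc/terms/creator",
    "place": "http://purl.org/dc/terms/spatial",
}

# Made-up sample data standing in for one spreadsheet row.
SAMPLE_CSV = (
    "id,title,creator,place\n"
    "p001,Sample poster title,Sample artist,Sample city\n"
)


def rows_to_triples(csv_text: str,
                    base_uri: str = "http://example.org/poster/") -> List[Triple]:
    """Convert each CSV row into (subject, predicate, object) triples."""
    triples: List[Triple] = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = base_uri + row["id"]  # mint a URI for the described item
        for field, predicate in FIELD_TO_PREDICATE.items():
            value = (row.get(field) or "").strip()
            if value:  # skip empty cells rather than emit empty literals
                triples.append((subject, predicate, value))
    return triples


for triple in rows_to_triples(SAMPLE_CSV):
    print(triple)
```

In a fuller version, literal values such as the place name would be replaced by URIs from an external dataset such as GeoNames, which is where the linking described on this slide comes in.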

developing datasets – linking data – documenting processes – developing user interfaces – promoting reuse – expanding definitions

Page 8: Linked Open Data for Cultural Heritage

LIBRARY OF CONGRESS


Page 9: Linked Open Data for Cultural Heritage

LIBRARY OF CONGRESS

Dereferenceable URI

Name Variants

Related Terms

Promotes existing Library of Congress resources to Linked Open Data web resources, uncovers and connects related names and terms


Page 10: Linked Open Data for Cultural Heritage

LIBRARY OF CONGRESS

Multiple formats are available for wider use

LC Classification Numbers are related to each entry

Connects with and acknowledges other schemes


Page 11: Linked Open Data for Cultural Heritage

Stage 2. Linking Data

Page 12: Linked Open Data for Cultural Heritage


Page 13: Linked Open Data for Cultural Heritage


Page 14: Linked Open Data for Cultural Heritage
Page 15: Linked Open Data for Cultural Heritage

CIVIL WAR DATA 150

Project was designed to encourage the contribution of a wide variety of data sources, from institutions to individuals

Partnership between The Archives of Michigan, The Internet Archive and Freebase

Celebrating the sesquicentennial of the American Civil War


Page 16: Linked Open Data for Cultural Heritage

CIVIL WAR DATA 150

Project Goals:

Create web apps to enable users to add to or modify shared metadata with strong identifiers

Engage the public in the process of interacting with and adding value to the data

Identify sources and map metadata into Freebase


Page 17: Linked Open Data for Cultural Heritage

LOCAH and Linking Lives
• Projects of Archives Hub UK (http://archiveshub.ac.uk), which represents more than 220 institutions
• LOCAH (Linked Open Copac & Archives Hub; 2010-2011):
  • Published data from Archives Hub finding aids and Copac, a union catalog of more than 70 major UK libraries
  • Created LOD resources:
    1. SPARQL endpoint
    2. Query box for trying out SPARQL queries
    3. RDF dump of the dataset
    4. Archives Hub EAD to RDF XSLT stylesheet
• Linking Lives (2011-2012) expanded on LOCAH
  • Test project focusing on biography
  • Brought in more external datasets (DBpedia, VIAF, Freebase, OpenLibrary, BBC Programmes, Linked Open British National Biography)
  • Developed interface model (wireframe)
• Principal findings:
  • Even when expressed in triples, data may lack uniformity, requiring time-consuming clean-up
  • Difficulty of firmly establishing identity when there are variant forms of names or identifying roles (e.g., “author” vs. “writer”) and when different people have the same name
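The clean-up problem in the principal findings can be illustrated with a small normalization sketch. This is only an assumption about what such clean-up might involve, not the method the Linking Lives team used; the `ROLE_SYNONYMS` table, the function names, and the placeholder name "Doe, Jane" are hypothetical.

```python
import unicodedata

# Hypothetical table collapsing variant role labels onto one preferred term.
ROLE_SYNONYMS = {
    "author": "author",
    "writer": "author",
}


def normalize_name(name: str) -> str:
    """Reduce variant name forms to a comparable key."""
    # Turn inverted forms ("Surname, Forename") into direct order.
    if "," in name:
        surname, _, forename = name.partition(",")
        name = f"{forename} {surname}"
    # Strip accents by decomposing characters and dropping combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    unaccented = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Lowercase and collapse runs of whitespace.
    return " ".join(unaccented.lower().split())


def normalize_role(role: str) -> str:
    """Map a role label to its preferred form, if one is known."""
    key = role.strip().lower()
    return ROLE_SYNONYMS.get(key, key)


print(normalize_name("Doe, Jane"))
print(normalize_name("Jane  Doé"))
print(normalize_role("Writer"))
```

Matching keys only make two records candidates for the same identity. As the slide notes, different people can share a name, so a match still needs supporting evidence such as dates or an identifier from an external authority file like VIAF.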


Page 18: Linked Open Data for Cultural Heritage

Stage 3. Documenting Processes for Reuse

Page 19: Linked Open Data for Cultural Heritage

DEUTSCHE NATIONALBIBLIOTHEK

• Linked Data Service
• Library scientist led
• Authority names and bibliographic data
• Downloadable dataset
• SRU and OAI-PMH interfaces
• Extensive documentation


Page 20: Linked Open Data for Cultural Heritage

THE NEW YORK TIMES


Page 21: Linked Open Data for Cultural Heritage

THE NEW YORK TIMES

The OPEN Blog
• Documents and contextualizes the APIs
• Platform for sharing open source code
• Forum for troubleshooting and ideas

Downloadable SKOS Files
• The entire dataset is downloadable
• Developers can also choose by topic

Users are invited to utilize the datasets and APIs through downloads, documentation, support and explanation of LOD terminology, code and uses


Page 22: Linked Open Data for Cultural Heritage

THE NEW YORK TIMES

Available APIs

Developer Network

API Request Tool allows developers to search through the expansive list of APIs and set parameters for their search using a widget. The tool then formats the URL and request results
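What "formats the URL" means in practice can be sketched with Python's standard library. The endpoint `https://api.example.org/search.json`, the parameter names, and the `YOUR_KEY` placeholder below are hypothetical and do not describe the actual New York Times API; the sketch only shows how a set of chosen parameters becomes a request URL.

```python
from typing import Dict, Optional, Union
from urllib.parse import urlencode

# Hypothetical endpoint, used for illustration only.
BASE_URL = "https://api.example.org/search.json"

ParamValue = Optional[Union[str, int]]


def build_request_url(base_url: str, params: Dict[str, ParamValue]) -> str:
    """Drop unset parameters, URL-encode the rest, and append them to the base URL."""
    cleaned = {k: v for k, v in params.items() if v not in (None, "")}
    # Sort keys so the same parameters always produce the same URL.
    return f"{base_url}?{urlencode(sorted(cleaned.items()))}"


url = build_request_url(
    BASE_URL,
    {"q": "linked data", "page": 1, "api-key": "YOUR_KEY", "fq": ""},
)
print(url)
# prints: https://api.example.org/search.json?api-key=YOUR_KEY&page=1&q=linked+data
```

The empty `fq` filter is omitted from the result, and the space in the search term is encoded, which is the kind of detail a request tool handles on the developer's behalf.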


Page 23: Linked Open Data for Cultural Heritage

Stage 4. Developing User Interfaces

Page 24: Linked Open Data for Cultural Heritage

AUSTRALIAN WAR MEMORIAL

• Proof of concept
• Developer led
• Embedded RDF tags
• Page-based API
• No documentation or downloadable dataset


Page 25: Linked Open Data for Cultural Heritage

THE AMSTERDAM MUSEUM

• Mobile app parses data from the Amsterdam Museum and linked ontologies

• Proposal for visual interface that enables user to become tour guide

• Current problem: search and download speed


Page 26: Linked Open Data for Cultural Heritage

Out of the Trenches Demonstration Video

Subjects can be explored across a range of dimensions

Source: http://www.canadiana.ca/sites/pub.canadiana.ca/files/LOD-Demo-ENG_0.mp4


Page 27: Linked Open Data for Cultural Heritage


Page 28: Linked Open Data for Cultural Heritage

Stage 5. Promoting Reuse

Page 29: Linked Open Data for Cultural Heritage

OPEN CULTUUR DATA INITIATIVE

• Offered workshops on how cultural heritage orgs could open their data

• Hosted hackathons to encourage developers to turn datasets into apps

• Three award-winners:
  • VISTORY (using LOD Open Images dataset)
  • Rijksmonumenten.info
  • Connected Collection


Page 30: Linked Open Data for Cultural Heritage

OPEN CULTUUR DATA INITIATIVE

Screenshot from http://www.glimworm.com/vistory.shtml


Page 31: Linked Open Data for Cultural Heritage

EUSCREEN

• Linked Data Pilot
• International collaboration
• Open, international standards
• Downloadable datasets
• Fully documented
• Showcase of projects in blog
• Active in promoting reuse


Page 32: Linked Open Data for Cultural Heritage

Stage 6. Expanding the Definition of Cultural Heritage

Page 33: Linked Open Data for Cultural Heritage


Page 34: Linked Open Data for Cultural Heritage

CONCLUSIONS

• (Most) LOD projects:
  • Proof of concept
  • No access to a dataset
  • Not highly documented
  • Highly curated
  • Experimental
  • Promising
• The number of LOD datasets continues to increase
  • Actual use by cultural heritage institutions appears to remain limited
• Trust remains an obstacle
  • Compare: “A guppy is_a_Kind_of fish” (TRUE) and “A pony is_a_Kind_of fish” (UNTRUE). Computers see these as equally valid.
  • Verifying or identifying source of a statement may become a best practice
  • Information added to triples? “A guppy is_a_Kind_of fish [source] DBpedia”
• Published datasets hold great potential for making the content of an archive's collections known
  • Researcher studying Person A finds that a collection of Person X's letters includes letters to or from Person A
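The idea raised under the trust point, attaching a source to each triple, can be sketched as a fourth field on every statement. This is an illustrative model only: the `Statement` type, the `trusted` helper, and the choice of which sources to trust are assumptions, and the guppy and pony statements are taken from the slide's own example.

```python
from typing import List, NamedTuple, Set


class Statement(NamedTuple):
    """A triple extended with the source that asserted it."""
    subject: str
    predicate: str
    obj: str
    source: str


statements: List[Statement] = [
    Statement("guppy", "is_a_Kind_of", "fish", "DBpedia"),
    Statement("pony", "is_a_Kind_of", "fish", "unknown"),
]

# Assumption: the consumer of the data decides which sources it trusts.
TRUSTED_SOURCES: Set[str] = {"DBpedia"}


def trusted(stmts: List[Statement], sources: Set[str]) -> List[Statement]:
    """Keep only statements whose source is in the trusted set."""
    return [s for s in stmts if s.source in sources]


for s in trusted(statements, TRUSTED_SOURCES):
    print(f"{s.subject} {s.predicate} {s.obj} [source] {s.source}")
```

Recording the source does not make a statement true. It only lets a consumer trace where a claim came from and filter accordingly, which is the best practice the slide anticipates.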