D3.1 State of the art assessment on Linked Data and Digital Preservation

Midterm Workshop, Catania, April 2014 D3.1 State of the art assessment on Linked Data and Digital Preservation René van Horik, Data Archiving & Networked Services, The Netherlands

Upload: prelida-project

Post on 02-Dec-2014

DESCRIPTION

The presentation was given by René van Horik from Data Archiving & Networked Services, The Netherlands, at the PRELIDA Midterm Workshop in Catania, April 2014.

TRANSCRIPT

Page 1: D3.1 State of the art assessment on Linked Data and Digital Preservation

Midterm Workshop, Catania, April 2014

D3.1 State of the art assessment on Linked Data and Digital Preservation

René van Horik, Data Archiving & Networked Services, The Netherlands

Page 2: D3.1 State of the art assessment on Linked Data and Digital Preservation

Outline

• Introduction / Context

• Summary of Deliverable D3.1

Page 3: D3.1 State of the art assessment on Linked Data and Digital Preservation

Objectives of the PRELIDA Project

• collect, organize and publish use cases related to the long-term access to LD

• create a comprehensive state of the art on LD and DP technologies

• set up a technology observatory

• bring together scientists and stakeholders for identifying relevant challenges and paths for addressing them in the near future

• draw the attention of standardization bodies

Page 4: D3.1 State of the art assessment on Linked Data and Digital Preservation

WP3 of the PRELIDA project

• Objectives of WP3:

– to provide an overview of the state of the art in Digital Preservation and Linked Data

– to transfer information between the two communities

• Partners: CNR / APA / HUD / UIBK

• Contributors: Sotiris Batsakis, David Giaretta, Christophe Gueret, René van Horik, Maarten Hogerwerf, Antoine Isaac, Carlo Meghini, Andrea Scharnhorst

• Deliverables:

– WP3.1 State of the art (month 12)

– WP3.2 Consolidated state of the art (month 24)

Page 5: D3.1 State of the art assessment on Linked Data and Digital Preservation

Two communities: Linked Data & Digital Preservation

Plato and Aristotle. Fragment of fresco “the School of Athens” by Raphael (1509-1510)

Aristotle: only sensorial experience can lead to knowledge. Pointing to the earth (realism).

Plato: applied mathematical methods lead to knowledge. Pointing to heaven (mystical nature of the universe).

Page 6: D3.1 State of the art assessment on Linked Data and Digital Preservation

Table of contents of D3.1

1. Introduction

2. Definitions and terminology

3. Relevant dimensions addressed by DP projects

4. Initial ideas on preserving Linked (Open) Data

5. Use cases

– CEDAR

– DBpedia

– Europeana

6. Conclusions

7. Bibliography

Page 7: D3.1 State of the art assessment on Linked Data and Digital Preservation

Definitions and terminology

Page 8: D3.1 State of the art assessment on Linked Data and Digital Preservation

Digital Preservation

Long Term Preservation as defined by the OAIS reference model:

Components:

• Long Term

• Independently Understandable

• Designated Community

• Authenticity

• Information

• Data

• Representation Information

(see page 15)

Page 9: D3.1 State of the art assessment on Linked Data and Digital Preservation

Threats to digital preservation

Page 10: D3.1 State of the art assessment on Linked Data and Digital Preservation

The obvious slide on data

Page 11: D3.1 State of the art assessment on Linked Data and Digital Preservation

Linked (Open) Data

• Use the Web as a platform to publish and re-use identifiers that refer to data

• Use a single data model for expressing the data (RDF)

• 3 ways to publish RDF data:

– As annotation to Web documents (RDF data included within the HTML code of the Web page)

– As Web documents (RDF documents are served next to HTML documents)

– As a database (“triple stores”, query language: SPARQL)

(see page 23 for an example related to DBpedia)
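
The RDF data model and the database option listed above can be illustrated with a toy example. This is a minimal sketch in plain Python, not a real triple store or SPARQL engine. It represents RDF statements as (subject, predicate, object) tuples, writes them out as N-Triples lines (roughly what a served RDF document could contain), and answers a pattern query in which `None` plays the role of a SPARQL variable. The `example.org` IRIs and the names `TRIPLES`, `to_ntriples` and `match` are assumptions made for illustration and do not come from the presentation.

```python
from typing import List, Optional, Tuple

# An RDF statement: (subject, predicate, object), all given here as IRIs.
Triple = Tuple[str, str, str]

# Hypothetical sample data; the example.org IRIs are placeholders.
TRIPLES: List[Triple] = [
    ("http://example.org/doc1", "http://example.org/creator", "http://example.org/alice"),
    ("http://example.org/doc2", "http://example.org/creator", "http://example.org/bob"),
    ("http://example.org/doc1", "http://example.org/cites", "http://example.org/doc2"),
]


def to_ntriples(triples: List[Triple]) -> str:
    """Serialize IRI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


if __name__ == "__main__":
    print(to_ntriples(TRIPLES))
    # All statements that use the (hypothetical) creator predicate.
    for triple in match(TRIPLES, p="http://example.org/creator"):
        print(triple)
```

A real deployment would use an RDF library and a SPARQL endpoint instead of list filtering; the sketch only shows that all three publishing options carry the same underlying triples.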

Page 12: D3.1 State of the art assessment on Linked Data and Digital Preservation

Three ways to publish RDF data

Page 13: D3.1 State of the art assessment on Linked Data and Digital Preservation

Relevant dimensions addressed by DP projects

Page 14: D3.1 State of the art assessment on Linked Data and Digital Preservation

What is specific for LOD for the DP community?

• A number of formats (RDF / Triple Store / …)

• No clear boundary

• Dynamic, changes over time

• Unclear ownership

Page 15: D3.1 State of the art assessment on Linked Data and Digital Preservation

DP topics potentially relevant for preservation of LD objects

• Object classification and validation

• Persistent identifiers

• Audit & Certification / Trustworthy Digital Repositories

Page 16: D3.1 State of the art assessment on Linked Data and Digital Preservation

Some initial ideas on preserving LOD

• Object classification

• Representation information -> Representation network

• (Persistent Identifiers)

• (Audit & Certification of TDR)

Page 17: D3.1 State of the art assessment on Linked Data and Digital Preservation

Classification of objects

• We must at least make sure we consider different types of data:

– rendered vs non-rendered

– composite vs simple

– dynamic vs static

– active vs passive

RDF Triple: dynamic/complex/non-rendered/passive (page 38)

Page 18: D3.1 State of the art assessment on Linked Data and Digital Preservation

OAIS Information model: Representation Information

The Information Model is key

Recursion ends at KNOWLEDGEBASE of the DESIGNATED COMMUNITY

(this knowledge will change over time and region)

Does not demand that ALL Representation Information be collected at once.

A process which can be tested
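
The recursion described above can be sketched in a few lines of Python. This is an illustrative sketch only, not something taken from the OAIS standard or from D3.1: the network contents, the dictionary `REP_NETWORK` and the function `required_rep_info` are hypothetical. Each item points to the Representation Information needed to understand it, and the walk stops as soon as it reaches something already in the knowledge base of the designated community.

```python
from typing import Dict, List, Set

# Hypothetical representation network: each item maps to the Representation
# Information needed to understand it. All entries are illustrative.
REP_NETWORK: Dict[str, List[str]] = {
    "dataset.ttl": ["Turtle syntax", "vocabulary definition"],
    "Turtle syntax": ["RDF data model", "Unicode"],
    "vocabulary definition": ["RDF data model"],
    "RDF data model": [],
    "Unicode": [],
}


def required_rep_info(
    item: str, network: Dict[str, List[str]], knowledge_base: Set[str]
) -> List[str]:
    """List the Representation Information to keep alongside `item`.

    The recursion ends at anything the designated community already knows,
    so a larger knowledge base means less to collect.
    """
    needed: List[str] = []
    seen: Set[str] = set()

    def visit(node: str) -> None:
        for dep in network.get(node, []):
            if dep in knowledge_base or dep in seen:
                continue  # already understood, or already collected
            seen.add(dep)
            needed.append(dep)
            visit(dep)

    visit(item)
    return needed


if __name__ == "__main__":
    # A community that already understands RDF and Unicode.
    print(required_rep_info("dataset.ttl", REP_NETWORK, {"RDF data model", "Unicode"}))
    # A community with no relevant background knowledge.
    print(required_rep_info("dataset.ttl", REP_NETWORK, set()))
```

Because the knowledge base is a parameter, the same check can be repeated later with an updated knowledge base. This reflects the slide's points that the knowledge changes over time and that not all Representation Information has to be collected at once.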

Page 19: D3.1 State of the art assessment on Linked Data and Digital Preservation

Use Cases

• CEDAR

• DBpedia

• Europeana

Page 20: D3.1 State of the art assessment on Linked Data and Digital Preservation

Page 21: D3.1 State of the art assessment on Linked Data and Digital Preservation

Thank you