Provenance Information in the Web of Data

Olaf Hartig
Humboldt-Universität zu Berlin
http://olafhartig.de/foaf.rdf#olaf

Upload: olaf-hartig

Post on 10-May-2015

DESCRIPTION

The slides for my presentation at the Linked Data Workshop at WWW 2009.

TRANSCRIPT

Page 1: Provenance Information in the Web of Data

Provenance Information in the Web of Data

Olaf Hartig, Humboldt-Universität zu Berlin

http://olafhartig.de/foaf.rdf#olaf

Page 2: Provenance Information in the Web of Data

Olaf Hartig - Provenance Information in the Web of Data 2

● Provenance of a data item: information about the history

Page 5: Provenance Information in the Web of Data

Outline

Towards a model of Web data provenance

Provenance information in the Web of data today

Upcoming tasks

Page 6: Provenance Information in the Web of Data

Existing Provenance Research

● Main research areas: (scientific) workflows, DBMSs

● General focus: data creation

Page 11: Provenance Information in the Web of Data

Web data provenance comprises two dimensions: Data Creation and Data Access

Page 12: Provenance Information in the Web of Data

Basics of the Provenance Model

● Provenance graph describes provenance of a data item
● Nodes: provenance elements – pieces of provenance info
● Edges: relate provenance elements to each other
● Subgraphs for related data items possible
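
The graph structure described on this slide can be sketched as a small data structure. This is only an illustration under assumptions, not the model from the talk: the class name `ProvenanceGraph`, its methods, and the relation labels `created_by`, `performed_by`, and `used` are hypothetical names chosen for the example.

```python
from collections import defaultdict


class ProvenanceGraph:
    """Directed graph: nodes are provenance elements, edges relate them."""

    def __init__(self):
        self.nodes = {}                 # element id -> description
        self.edges = defaultdict(list)  # element id -> [(relation, target id)]

    def add_element(self, element_id, description):
        self.nodes[element_id] = description

    def relate(self, source_id, relation, target_id):
        # Both endpoints must already be known provenance elements.
        if source_id not in self.nodes or target_id not in self.nodes:
            raise KeyError("unknown provenance element")
        self.edges[source_id].append((relation, target_id))

    def provenance_of(self, item_id):
        """Return the ids of all elements reachable from item_id."""
        seen, stack = set(), [item_id]
        while stack:
            current = stack.pop()
            for _, target in self.edges[current]:
                if target not in seen:
                    seen.add(target)
                    stack.append(target)
        return seen


# Hypothetical example data: one data item and three provenance elements.
graph = ProvenanceGraph()
graph.add_element("item", "a data item")
graph.add_element("creation", "the execution that created the item")
graph.add_element("creator", "the actor responsible for the creation")
graph.add_element("source", "a source data item used by the creation")
graph.relate("item", "created_by", "creation")
graph.relate("creation", "performed_by", "creator")
graph.relate("creation", "used", "source")

print(sorted(graph.provenance_of("item")))  # ['creation', 'creator', 'source']
```

Because the source data item is itself a node, its own provenance would appear as a subgraph reachable from the same starting point, which is one way to read the last bullet.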

Page 14: Provenance Information in the Web of Data

Basics of the Provenance Model

● Provenance model defines:
  ● Types of provenance elements
  ● Relationships

● High level of abstraction (only main element types)

Page 15: Provenance Information in the Web of Data

Basics of the Provenance Model

● General differentiation: Actors, Executions, Artifacts

Page 16: Provenance Information in the Web of Data

Data Access Dimension

[Diagram: elements of the data access dimension]

● Elements: Data Item; Information Resource; Data Access; Relation to the provided Information Resource; Data Providing Service (Non-Human); Data Publisher (Human); Service Provider; Data Accessor (Non-Human); Access Time
● Edge labels: contains; uses; controls

Page 17: Provenance Information in the Web of Data

Data Access Dimension cont.

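[Diagram: integrity-related elements of the data access dimension]

● Elements: Public Key; (Signed) Artifact; Integrity Assurance; Relation to the signed Data; Signer; Verification Result; Digital Signature
● Edge labels: owns; signs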

Page 18: Provenance Information in the Web of Data

Data Creation Dimension

[Diagram: elements of the data creation dimension]

● Elements: Provenance Information (shown three times); Data Creator (Human or Non-human); Relation to the created Data; Creation Time; Creation Guidelines; Data Creation; Data Creating Service (e.g. Software Agent); Data Creating Entity (e.g. Person, Group, Orga.); Data Creating Device (e.g. Sensor); Source Data; Data Item; (Encompassing) Data Item
● Edge labels: responsible for (shown twice); part of
● Constraint label: {complete, disjoint}

Page 19: Provenance Information in the Web of Data

Provenance information in the Web of data today

Page 20: Provenance Information in the Web of Data

Provenance-related Vocabularies

● DC – Dublin Core Metadata Terms
● FOAF – Friend of a Friend
● SIOC – Semantically-Interlinked Online Communities
● SWP – Semantic Web Publishing vocabulary
● WOT – Web of Trust schema
● OMV – Ontology Metadata Vocabulary
● PML – Proof Markup Language
● Changeset vocabulary
● Ouzo Provenance Ontology

Page 23: Provenance Information in the Web of Data

Provenance-related Vocabularies

DC – Dublin Core Metadata Terms

● dc:creator
● dc:contributor
● dc:source
● dc:created
● dc:modified
● dc:publisher – “an entity responsible for making the resource available”
● dc:provenance
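
As a rough illustration of how some of these properties might be attached to a published resource, the sketch below emits N-Triples statements with plain string formatting. It assumes that the `dc:` prefix on the slide stands for the DCMI Metadata Terms namespace `http://purl.org/dc/terms/`; the function names, the `example.org` URIs, and the date are hypothetical values made up for the example.

```python
# Assumption: the slide's "dc:" prefix maps to the DCMI Metadata Terms namespace.
DCTERMS = "http://purl.org/dc/terms/"


def literal(value):
    """Serialize a plain string as an N-Triples literal."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def describe(resource, creator, publisher, created):
    """Return N-Triples lines carrying Dublin Core provenance metadata."""
    statements = [
        ("creator", f"<{creator}>"),
        ("publisher", f"<{publisher}>"),
        ("created", literal(created)),
    ]
    return [f"<{resource}> <{DCTERMS}{prop}> {obj} ." for prop, obj in statements]


lines = describe(
    resource="http://example.org/data/doc1",       # hypothetical document URI
    creator="http://example.org/people/alice#me",  # hypothetical creator URI
    publisher="http://example.org/org/acme",       # hypothetical publisher URI
    created="2009-02-07",                          # illustrative date only
)
print("\n".join(lines))
```

Note that a single publisher statement like this does not say which role the named entity plays in providing the data; it only records that some entity made the resource available.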

Page 24: Provenance Information in the Web of Data

Provenance-related Vocabularies

DC – Dublin Core Metadata Terms (same property list as on the previous slide)

● dc:publisher – “an entity responsible for making the resource available”

[Diagram: excerpt of the data access dimension]

● Elements: Data Access; Data Providing Service (Non-Human); Data Publisher (Human); Service Provider
● Edge labels: uses; controls

Page 25: Provenance Information in the Web of Data

Main Issues Today

● Vocabularies:
  ● Partly unsuitable
  ● Lack of certain features
  ● Coverage of provenance model impossible

Page 26: Provenance Information in the Web of Data

Provenance-related Vocabularies

DC – Dublin Core Metadata Terms

Property          Occurrences*
dc:creator        about 24,284
dc:contributor    476
dc:source         about 3,631
dc:created        about 82,720
dc:modified       about 12,020
dc:provenance     7

*Measured by querying Sindice on Feb. 7, 2009 (at that time Sindice indexed about 48.99 million documents)
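
To put these counts in proportion to the size of the index, a quick calculation can relate each one to the roughly 48.99 million indexed documents. This is a back-of-the-envelope sketch: it treats the "about" figures as exact, and the helper name `per_million_documents` is hypothetical. The slide does not say whether one occurrence corresponds to one document, so the ratios are only a rough indication.

```python
# Figures copied from the slide; "about" values are treated as exact here.
INDEXED_DOCUMENTS = 48.99e6  # documents indexed by Sindice on Feb. 7, 2009

occurrences = {
    "dc:creator": 24_284,
    "dc:contributor": 476,
    "dc:source": 3_631,
    "dc:created": 82_720,
    "dc:modified": 12_020,
    "dc:provenance": 7,
}


def per_million_documents(count, documents=INDEXED_DOCUMENTS):
    """Occurrences per one million indexed documents."""
    return count / documents * 1_000_000


# Print properties from most to least frequent with their relative frequency.
for prop, count in sorted(occurrences.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{prop:<16}{count:>8,}{per_million_documents(count):>10.1f}")
```

Even the most frequent property, dc:created, comes out at well under 2,000 occurrences per million indexed documents.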

Page 27: Provenance Information in the Web of Data

Main Issues Today

● Vocabularies:
  ● Partly unsuitable
  ● Lack of certain features
  ● Coverage of provenance model impossible

● General lack of provenance-related metadata on the Web of data

Page 28: Provenance Information in the Web of Data

Possible Reasons

● Lack of suitable vocabularies
● Lack of usable tools
● Ignorance / lack of sensitization

Page 29: Provenance Information in the Web of Data

Upcoming tasks

Page 32: Provenance Information in the Web of Data

Address the Issues

● Let's develop a vocabulary for Web data provenance
  ● Proposal: refine the presented provenance model
  ● Integrate existing vocabularies for specific types of provenance elements
● Let's develop usable tools for data providers
  ● Edit and publish provenance-related metadata
  ● Automatic generation if possible
● Let's raise awareness of data providers
  ● Probably the hardest task
  ● Maybe voiD can help

Page 33: Provenance Information in the Web of Data

Olaf Hartig, Humboldt-Universität zu Berlin

http://olafhartig.de/foaf.rdf#olaf

Thank you!

Page 34: Provenance Information in the Web of Data

These slides have been created by Olaf Hartig

http://olafhartig.de

This work is licensed under a Creative Commons Attribution-Share Alike 3.0 License

(http://creativecommons.org/licenses/by-sa/3.0/)

Attribution:
● http://www.flickr.com/photos/adrenalin/3032734/
● http://www.hasslefreeclipart.com
● http://www.flickr.com/photos/dullhunk/428079229/
● http://www.flickr.com/photos/darwinbell/1337963794/
● http://www.flickr.com/photos/alandd/2780700767/
● http://www.flickr.com/photos/simeon_barkas/2872099696/
● http://www.flickr.com/photos/robinh00d/122544491/
● http://www.flickr.com/photos/adrenalin/3032747/