RDF Data Model - imranihsan.com/upload/lecture/SWS1703.pdf

SEMANTIC WEB IMRAN IHSAN ASSISTANT PROFESSOR, AIR UNIVERSITY, ISLAMABAD WWW.IMRANIHSAN.COM 03 RDF DATA MODEL RESOURCE DESCRIPTION FRAMEWORK


Post on 24-Apr-2020


MOTIVATION

• How do you encode the piece of knowledge:

"The theory of relativity was discovered by Albert Einstein."

<theory>

<name>Theory of Relativity</name>

<discoverer>Albert Einstein</discoverer>

</theory>

• or

<person>

<name>Albert Einstein</name>

<discovered>Theory of Relativity</discovered>

</person>

• or

<person name="Albert Einstein">

<discovered>Theory of Relativity</discovered>

</person>

• There is no unique way (in XML) to represent knowledge.

• Information represented in such ways is not easy to integrate. (Why?)

• RDF helps to solve this problem.
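
The integration problem can be made concrete with a small Python sketch. It parses the three XML layouts above and maps each one to the same (subject, predicate, object) statement. The predicate name `discoveredBy` and the function `to_triple` are assumptions made for illustration; the point is only that every XML layout needs its own hand-written rule, while the resulting triple is the same.

```python
import xml.etree.ElementTree as ET

# The three XML encodings from the slide, each on one line.
XML_A = "<theory><name>Theory of Relativity</name><discoverer>Albert Einstein</discoverer></theory>"
XML_B = "<person><name>Albert Einstein</name><discovered>Theory of Relativity</discovered></person>"
XML_C = '<person name="Albert Einstein"><discovered>Theory of Relativity</discovered></person>'


def to_triple(xml_text):
    """Map one of the three layouts to a (subject, predicate, object) tuple.

    The predicate name 'discoveredBy' is a hypothetical choice.
    """
    root = ET.fromstring(xml_text)
    if root.tag == "theory":
        return (root.findtext("name"), "discoveredBy", root.findtext("discoverer"))
    # <person> layouts: the name is either a child element or an attribute.
    person = root.findtext("name") or root.get("name")
    return (root.findtext("discovered"), "discoveredBy", person)


# All three layouts collapse into a single statement.
triples = {to_triple(x) for x in (XML_A, XML_B, XML_C)}
print(triples)
```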

GOALS

• Understand the RDF data model, including

• URI and IRI concepts

• Triples

• Resources

• Literals

• Blank nodes

• Lists

RDF OVERVIEW

• RDF = Resource Description Framework

• W3C Recommendation since 1999

• Version 1.1 since 2014

• RDF is a data model

• Originally used for metadata for web resources, then generalized

• Encodes structured information

• Universal, machine readable exchange format

• Data structured in graphs

• Vertices, edges

PARTS OF THE RDF GRAPH

• URIs

• Used to reference resources unambiguously

• Literals

• Describe data values with no clear identity like "100 km/h"

• Blank nodes

• Facilitate existential quantification for an individual with certain properties without naming it

EXAMPLE OF AN RDF GRAPH

RDF TRIPLE

COMPONENTS OF AN RDF TRIPLE

• Named after linguistic categories (subject, predicate, object), although the analogy is not always consistent

• Allowed assignments:

• Subject: URI or blank node

• Predicate: URI (a.k.a. property)

• Object: URI, blank node or literal

• Node and edge labels should be unambiguous, so that the original graph can be reconstructed from the triple list
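
The allowed assignments can be written down as a small type check. The following Python sketch is only an illustration: the class names `URI`, `BlankNode`, `Literal` and the function `make_triple` are assumptions, not part of any RDF library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    lexical: str


def make_triple(subject, predicate, obj):
    """Build a triple, enforcing the position rules from the slide."""
    if not isinstance(subject, (URI, BlankNode)):
        raise TypeError("subject must be a URI or a blank node")
    if not isinstance(predicate, URI):
        raise TypeError("predicate must be a URI")
    if not isinstance(obj, (URI, BlankNode, Literal)):
        raise TypeError("object must be a URI, a blank node or a literal")
    return (subject, predicate, obj)


label = make_triple(
    URI("http://dbpedia.org/resource/Leipzig"),
    URI("http://www.w3.org/2000/01/rdf-schema#label"),
    Literal("Leipzig"),
)
```

A literal in subject or predicate position is rejected with a `TypeError`.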

URI

• URI = Uniform Resource Identifier

• Used to create globally unique names for resources

• Every object with a clear identity can be a resource

• Books, places, organizations ...

• In the book domain, the ISBN serves the same purpose

URI SYNTAX

• Extension of the URL concept

• Not every URI denotes a web document, but the URL is often used as URI for web documents

• Starts with the URI scheme, which is separated from the rest by ":"

• examples: http, ftp, mailto, file

• Typically hierarchical structure

• scheme:[//authority]path[?query][#fragment]
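
The hierarchical structure can be inspected with Python's standard `urllib.parse.urlsplit`. The two URIs below are hypothetical examples chosen for illustration.

```python
from urllib.parse import urlsplit

# A hypothetical hierarchical URI with all five components.
parts = urlsplit("http://www.example.org/wiki/Othello?lang=de#URI")
print(parts.scheme)    # scheme
print(parts.netloc)    # authority
print(parts.path)      # path
print(parts.query)     # query
print(parts.fragment)  # fragment

# Not every scheme uses an authority part: mailto has only scheme and path.
mail = urlsplit("mailto:someone@example.org")
print(mail.scheme, mail.netloc, mail.path)
```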

SELF-DEFINED URIS

• Necessary if resource has no URI yet or URI is not known

• Use HTTP URIs of own website to avoid naming collisions

• Makes it possible to publish documentation of the URI at that location

• Example: http://jens-lehmann.org/foaf.rdf#i

• Separation of URI for …

• a resource (a real-world thing)

• and its documentation (e.g. an HTML page)

• … with the help of URI references (with “#”-attached fragments) or content negotiation

• Example: URI for Shakespeare's "Othello":

• bad (why?): http://de.wikipedia.org/wiki/Othello

• good: http://de.wikipedia.org/wiki/Othello#URI
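
Why the fragment helps can be sketched in Python: a client that dereferences a URI with a fragment fetches the document URI without the fragment, so the thing and its documentation get two different identifiers. The helper name `document_of` is an assumption; `urldefrag` is the standard-library function that strips the fragment.

```python
from urllib.parse import urldefrag


def document_of(uri):
    """Return the URI of the document that would be retrieved for `uri`."""
    return urldefrag(uri).url


person = "http://jens-lehmann.org/foaf.rdf#i"
bad = "http://de.wikipedia.org/wiki/Othello"
good = "http://de.wikipedia.org/wiki/Othello#URI"

# With a fragment, resource and documentation have distinct URIs.
print(person, "->", document_of(person))
print(good, "->", document_of(good))

# Without a fragment, the same URI names both the play and the web page about it.
print(bad, "->", document_of(bad))
```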

IRI

• IRI = Internationalized Resource Identifier

• Generalization of URI concept

• IRIs can contain Unicode characters

• Example:

• http://www.example.org/Wüste

• http://www.example.org/사막
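
An IRI can be turned into a plain ASCII URI by percent-encoding the UTF-8 bytes of its non-ASCII characters. The Python sketch below shows this for the two examples. The function name `iri_to_uri` is an assumption, and the sketch deliberately ignores internationalized host names, which need separate treatment.

```python
from urllib.parse import quote, unquote


def iri_to_uri(iri):
    """Percent-encode non-ASCII characters as UTF-8, keeping URI delimiters."""
    return quote(iri, safe=":/?#[]@!$&'()*+,;=%")


for iri in ("http://www.example.org/Wüste", "http://www.example.org/사막"):
    uri = iri_to_uri(iri)
    # Decoding the percent-escapes gives back the original IRI.
    print(iri, "->", uri, "->", unquote(uri))
```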

LITERALS

• Used to model data values

• Representation as strings

• Interpretation through datatype

• Literals without datatype are treated as strings

• Literals may never be the origin of an edge in an RDF graph, i.e. never the subject of a triple

• Edges may never be labeled with literals

TURTLE SYNTAX

• Language to serialize RDF Triples to strings

• Turtle – Terse RDF Triple Language

• URIs in angle brackets: <http://dbpedia.org/resource/Leipzig>

• Literals in quotes

• "Leipzig"@de

• "51.333332"^^xsd:float

• Triples are subject-predicate-object sentences terminated with a dot.

<http://dbpedia.org/resource/Leipzig>

<http://www.w3.org/2000/01/rdf-schema#label>

"Leipzig"@de .

• Whitespace and line breaks are ignored outside of identifiers

• Status: W3C Recommendation, http://www.w3.org/TR/turtle/

TURTLE ABBREVIATIONS

• In Turtle one can use abbreviations

• Syntax: @prefix abbr ':' <URI> .

• E.g. @prefix dbr: <http://dbpedia.org/resource/> .

• One can transform

<http://dbpedia.org/resource/Leipzig>

<http://www.w3.org/2000/01/rdf-schema#label>

"Leipzig"@de .

• into

@prefix dbr: <http://dbpedia.org/resource/> .

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

dbr:Leipzig rdfs:label "Leipzig"@de .
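
Prefix expansion is plain string concatenation, which a few lines of Python can illustrate. The dictionary `PREFIXES` and the function `expand` are hypothetical helpers. Note that the trailing "/" or "#" belongs to the namespace URI: leaving it out produces a different, unintended URI.

```python
PREFIXES = {
    "dbr": "http://dbpedia.org/resource/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def expand(prefixed_name, prefixes=PREFIXES):
    """Expand a prefixed name such as 'rdfs:label' into a full URI."""
    prefix, local = prefixed_name.split(":", 1)
    return prefixes[prefix] + local


print(expand("dbr:Leipzig"))
print(expand("rdfs:label"))

# A namespace declared without the final '#' silently yields another URI.
broken = {"rdfs": "http://www.w3.org/2000/01/rdf-schema"}
print(expand("rdfs:label", broken))
```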

TURTLE ABBREVIATIONS

• Triples with the same subject can be grouped together

@prefix rdf:

...

@prefix geo:

dbr:Leipzig dbp:hasMayor dbr:Burkhard_Jung ;

rdfs:label "Leipzig"@de ;

geo:lat "51.333332"^^xsd:float ;

geo:long "12.383333"^^xsd:float .

• Even triples with the same subject and predicate can be grouped together

@prefix dbr: <http://dbpedia.org/resource/> .

@prefix dbp: .

dbr:Leipzig dbp:locatedIn dbr:Saxony, dbr:Germany;

dbp:hasMayor dbr:Burkhard_Jung .
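
The two grouping rules can be read as a simple algorithm: collect triples by subject, then by predicate, and join with ";" and ",". The following Python sketch does this for triples given as prefixed-name strings. The function `to_turtle` is an assumption for illustration, it emits no @prefix lines and is not a complete Turtle serializer.

```python
def to_turtle(triples):
    """Group triples by subject (';') and by subject plus predicate (',')."""
    grouped = {}
    for s, p, o in triples:
        grouped.setdefault(s, {}).setdefault(p, []).append(o)

    statements = []
    for s, by_predicate in grouped.items():
        parts = [f"{p} {', '.join(objs)}" for p, objs in by_predicate.items()]
        statements.append(f"{s} " + " ;\n    ".join(parts) + " .")
    return "\n".join(statements)


triples = [
    ("dbr:Leipzig", "dbp:locatedIn", "dbr:Saxony"),
    ("dbr:Leipzig", "dbp:locatedIn", "dbr:Germany"),
    ("dbr:Leipzig", "dbp:hasMayor", "dbr:Burkhard_Jung"),
]
print(to_turtle(triples))
```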

LITERALS II – DATATYPES

• Example: xsd:decimal

DATATYPES IN RDF

• So far: literals are untyped, treated as strings: "02" < "100" < "11" < "2"

• Typing allows a better, that is semantic, interpretation of values

• Datatypes are identified by URIs and can be chosen freely

• XML Schema Datatypes (XSD) are typically used

• Syntax: "data value"^^<datatype-URI>

• rdf:HTML and rdf:XMLLiteral are the only predefined datatypes in RDF

• Used for HTML and XML fragments
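
The difference between untyped and typed interpretation is easy to reproduce in Python. The sketch below sorts the four lexical forms from the slide once as strings and once as decimal numbers, using `Decimal` as a stand-in for xsd:decimal (an assumption made for illustration).

```python
from decimal import Decimal

values = ["2", "11", "100", "02"]

# Untyped: compared character by character, as in the slide.
as_strings = sorted(values)

# Typed: each lexical form is mapped to its numeric value before comparing.
as_numbers = sorted(values, key=Decimal)

print(as_strings)
print(as_numbers)

# "2" and "02" are different strings but denote the same decimal value.
print("2" == "02", Decimal("2") == Decimal("02"))
```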

EXAMPLE

• Graph:

• Turtle:

@prefix dbr: <http://dbpedia.org/resource/> .

@prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>.

@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

dbr:Leipzig geo:lat "51.333332"^^xsd:float ;

geo:long "12.383333"^^xsd:float .

LANGUAGE DECLARATION

• A language tag can only be attached to untyped literals

• Example:

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

<http://dbpedia.org/resource/Leipzig>
    rdfs:label "Leipzig"@de, "Лейпциг"@ru .

• In RDF 1.0 the following literals were all different, but implementations typically treated them the same.

@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

@prefix dbr: <http://dbpedia.org/resource/> .

dbr:Leipzig

rdfs:label "Leipzig", "Leipzig"@de, "Leipzig"^^xsd:string .

• As of RDF 1.1 "Leipzig" is a shorthand for "Leipzig"^^xsd:string.
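
The RDF 1.1 rule can be sketched as a normalization step in Python. The class `Literal` below is a hypothetical model, not a library API: a literal without datatype and language tag gets xsd:string, and a language-tagged literal gets rdf:langString, so the three labels above collapse into two distinct literals.

```python
from dataclasses import dataclass
from typing import Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_LANGSTRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"


@dataclass(frozen=True)
class Literal:
    lexical: str
    datatype: Optional[str] = None
    lang: Optional[str] = None

    def __post_init__(self):
        # Frozen dataclass: fields are filled in via object.__setattr__.
        if self.lang is not None:
            if self.datatype not in (None, RDF_LANGSTRING):
                raise ValueError("a language tag cannot be combined with another datatype")
            object.__setattr__(self, "datatype", RDF_LANGSTRING)
        elif self.datatype is None:
            object.__setattr__(self, "datatype", XSD_STRING)


plain = Literal("Leipzig")
tagged = Literal("Leipzig", lang="de")
typed = Literal("Leipzig", datatype=XSD_STRING)

print(plain == typed)   # plain literal is shorthand for the xsd:string one
print(plain == tagged)  # the language-tagged literal stays distinct
```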

N-ARY RELATIONS I

• Cooking with RDF

• "For the preparation of mango chutney you need 450g of green mango, a teaspoon of cayenne pepper ..."

• 1st attempt to model this recipe:

@prefix ex: <http://example.org/> .

ex:Chutney ex:hasIngredient "450g green mango", "1tsp Cayenne pepper" .

• Not satisfying:

• Ingredients and amounts are coded as strings

• Searching for recipes that contain green mango is not easily possible

N-ARY RELATIONS II

• Cooking with RDF

• "For the preparation of mango chutney you need 450g of green mango, a teaspoon of cayenne pepper ..."

• 2nd attempt to model this recipe:

@prefix ex: <http://example.org/> .

ex:Chutney
    ex:ingredient ex:GreenMango ;
    ex:amount "450g" ;
    ex:ingredient ex:CayennePepper ;
    ex:amount "1tsp" .

• Even worse:

• No unambiguous association between ingredient and amount is possible

N-ARY RELATIONS III

• Problem: this is a genuinely ternary (three-place) relationship (compare tables in relational databases)

Recipe          Ingredient        Amount
Mango Chutney   Green mango       450g
Mango Chutney   Cayenne pepper    1 tsp

• Cannot be expressed directly in RDF, because a triple connects only two nodes

• Solution: introduction of helper nodes

N-ARY RELATIONS IV

• Helper nodes in RDF:

• As graph:

• In Turtle Syntax:

@prefix ex: <http://example.org/> .

ex:Chutney ex:hasIngredient ex:ChutneyIngredient1.

ex:ChutneyIngredient1 ex:ingredient ex:GreenMango;

ex:amount "450g" .
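
With helper nodes, the question "which recipes contain green mango, and how much?" becomes a two-step lookup. The Python sketch below runs it over a plain set of triples. The second helper node `ex:ChutneyIngredient2` and the functions `objects` and `recipes_with` are assumptions added for illustration.

```python
# Triples as (subject, predicate, object) strings; the second helper node is hypothetical.
TRIPLES = {
    ("ex:Chutney", "ex:hasIngredient", "ex:ChutneyIngredient1"),
    ("ex:ChutneyIngredient1", "ex:ingredient", "ex:GreenMango"),
    ("ex:ChutneyIngredient1", "ex:amount", "450g"),
    ("ex:Chutney", "ex:hasIngredient", "ex:ChutneyIngredient2"),
    ("ex:ChutneyIngredient2", "ex:ingredient", "ex:CayennePepper"),
    ("ex:ChutneyIngredient2", "ex:amount", "1tsp"),
}


def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def recipes_with(ingredient):
    """Map each recipe using `ingredient` to the amounts attached to it."""
    result = {}
    for recipe, predicate, helper in TRIPLES:
        if predicate != "ex:hasIngredient":
            continue
        if ingredient in objects(helper, "ex:ingredient"):
            result.setdefault(recipe, set()).update(objects(helper, "ex:amount"))
    return result


print(recipes_with("ex:GreenMango"))
```

Because each amount hangs off its own helper node, the amount of green mango can no longer be confused with the amount of cayenne pepper.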

BLANK NODES

• Blank nodes can be used for resources which don't need to be named

• Can be read as existential statements

• As graph:

• In Turtle Syntax:

@prefix ex: <http://example.org/> .

ex:Chutney ex:hasIngredient _:id1 .

_:id1 ex:ingredient ex:GreenMango;

ex:amount "450g" .

# can be shortened:

ex:Chutney ex:hasIngredient

[ ex:ingredient ex:GreenMango;

ex:amount "450g" ] .
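
The bracket shorthand can be thought of as "invent a fresh blank node label for every bracket". A minimal Python sketch of that expansion follows. The function `ingredient_triples` and the label scheme `_:idN` are assumptions; blank node labels only have meaning inside one document, so any scheme that yields distinct labels would do.

```python
def ingredient_triples(recipe, entries):
    """Expand (ingredient, amount) pairs into triples with one blank node each."""
    triples = []
    for number, (ingredient, amount) in enumerate(entries, start=1):
        node = f"_:id{number}"  # fresh label, unique within this document
        triples.append((recipe, "ex:hasIngredient", node))
        triples.append((node, "ex:ingredient", ingredient))
        triples.append((node, "ex:amount", amount))
    return triples


chutney = ingredient_triples(
    "ex:Chutney",
    [("ex:GreenMango", "450g"), ("ex:CayennePepper", "1tsp")],
)
for triple in chutney:
    print(triple)
```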

LISTS

• General data structures for enumerating arbitrarily many resources

• Distinction between

• Containers: open, new elements can be added; ordered and unordered container types exist

• Collections: closed, ordered lists; new elements cannot be added

• Can be modeled with previously presented tools, so no additional expressiveness

TYPES OF CONTAINER

• The list root node is assigned one of the following rdf:types:

• rdf:Seq

• Interpretation as ordered list, sequence

• rdf:Bag

• Interpretation as unordered collection (duplicates allowed)

• Order coded in RDF not relevant

• rdf:Alt

• Set of alternatives

• Usually only one list element relevant
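
A container is encoded with ordinary triples: the root node gets its rdf:type, and the members are attached with the numbered membership properties rdf:_1, rdf:_2, and so on. The Python sketch below generates these triples. The function `container_triples` and the root label `_:ingredients` are assumptions for illustration.

```python
CONTAINER_TYPES = {"rdf:Seq", "rdf:Bag", "rdf:Alt"}


def container_triples(root, kind, members):
    """Encode a container as a type triple plus rdf:_1, rdf:_2, ... triples."""
    if kind not in CONTAINER_TYPES:
        raise ValueError(f"unknown container type: {kind}")
    triples = [(root, "rdf:type", kind)]
    for position, member in enumerate(members, start=1):
        triples.append((root, f"rdf:_{position}", member))
    return triples


bag = container_triples("_:ingredients", "rdf:Bag", ["ex:GreenMango", "ex:CayennePepper"])
for triple in bag:
    print(triple)
```

Nothing in these triples says that rdf:_2 is the last member, which is why further elements can always be added to a container.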

COLLECTIONS

• Idea: recursive partition of list into a head element and (possibly empty) rest list

• Turtle Syntax (Shortened Notation with brackets)

@prefix ex: <http://example.org/> .

ex:AKSW ex:groupLeaders (ex:Sören ex:Jens ex:Axel) .
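
The bracket notation stands for a chain of rdf:first / rdf:rest triples that ends in rdf:nil. The Python sketch below spells out that expansion for the example. The function `collection_triples` and the blank node labels `_:b1`, `_:b2`, ... are assumptions made for illustration.

```python
def collection_triples(items):
    """Return (head, triples) for a closed list built from rdf:first/rdf:rest."""
    if not items:
        return "rdf:nil", []
    nodes = [f"_:b{i}" for i in range(1, len(items) + 1)]
    triples = []
    for index, (node, item) in enumerate(zip(nodes, items)):
        rest = nodes[index + 1] if index + 1 < len(nodes) else "rdf:nil"
        triples.append((node, "rdf:first", item))  # head element
        triples.append((node, "rdf:rest", rest))   # remaining list
    return nodes[0], triples


leaders = ["ex:Sören", "ex:Jens", "ex:Axel"]
head, list_triples = collection_triples(leaders)
all_triples = [("ex:AKSW", "ex:groupLeaders", head)] + list_triples
for triple in all_triples:
    print(triple)
```

The final rdf:rest rdf:nil triple closes the list, which is why no further elements can be appended to a collection.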

SUMMARY

• Extensively supported standard for storing and exchanging data

• Enables largely syntax-independent representation of distributed information in a graph-based data model

• Pure RDF is strongly oriented toward individuals (instances)

• It offers almost no means of representing schema knowledge

MINI PROJECT – II

1. Create a small knowledge base in Turtle describing a domain (your family)!

2. Write an RDF resource description describing yourself in Turtle with labels in two different languages, your birthday and age!

3. Draw an RDF graph for representing a recipe for cup cakes!

4. Create an RDF list of European countries!