Graph and RDF Databases Context : Course of Advanced Databases Prepared by : Nassim BAHRI Nabila HOSNI February 19th, 2015


Graph and RDF Databases

Context : Course of Advanced Databases

Prepared by : Nassim BAHRI Nabila HOSNI

February 19th, 2015


Table of contents

I. Introduction : Overview of Big Data & NoSQL

II. Graph Databases

III. RDF Databases

IV. Application example

V. Scientific article

VI. Conclusion and Q&A

Introduction

• Not Only SQL
• Storing data in memory
• Distributed databases

Introduction : Data Model

• Key-Value (Voldemort, Riak)
• Big Table Column (HBase, Cassandra, Hypertable)
• Document Databases (MongoDB)
• Graph Databases (Neo4J)

Introduction : Data Model

[Figure: NoSQL families plotted by data size against data complexity: Key-Value Stores, Column Family, Document Databases, Graph Databases. Annotations: "90% of use cases" and "This is what we are interested in".]

Source : Neo Technology webinar

Graph Databases

What is a Graph Database?

A graph database is a database whose specific purpose is the storage of graph-oriented data structures.

Put simply, it is an object-oriented database based on graph theory.

Graph Databases

Representation

• Nodes

• Relationships between nodes

• Properties on both

[Figure: three nodes numbered 1, 2 and 3, with the properties "Name : John, Age : 43", "Name : Google" and "Type : Ford, Color : blue", and a relationship "Work in" with the property "Since : 2013".]
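The node / relationship / property model above can be sketched in a few lines of Python. This is only an illustration of the data model, not how any graph database stores data; the class names, the relationship type spelling WORK_IN and the direction of the relationship (John to Google) are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A graph node: a bag of key/value properties."""
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, typed edge that can carry its own properties."""
    start: Node
    type: str
    end: Node
    properties: dict = field(default_factory=dict)


# Property values come from the slide; the direction of the edge is assumed.
john = Node({"name": "John", "age": 43})
google = Node({"name": "Google"})
works_in = Relationship(john, "WORK_IN", google, {"since": 2013})
```

Both nodes and relationships hold properties, which is the point of the third bullet.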

Graph Databases
The power of Graph Databases

• Performance
• Flexibility
• Agility

Graph VS Relational Databases
Relational Database Modeling

Person
ID   Name
1    Larry Page
2    Sergey Brin
3    Larry Ellison
N    …

Company
ID   Name
1    Google
2    Oracle
…    …
N    …

WorksIn
PersonID   CompanyID   Since
1          1           1998
2          1           2001
3          2           2010

Google's employees?

SELECT Person.Name
FROM Person, Company, WorksIn
WHERE Company.Name = 'Google'
  AND WorksIn.CompanyID = Company.ID
  AND WorksIn.PersonID = Person.ID;

[Figure annotations: "Lookup", three times.]
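To try the relational version of "Google's employees?", the three tables can be loaded into an in-memory SQLite database with Python's standard sqlite3 module. The column types and the ORDER BY clause (added only to make the output order deterministic) are assumptions; the rows and the join are those of the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person  (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Company (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE WorksIn (PersonID INTEGER, CompanyID INTEGER, Since INTEGER);
INSERT INTO Person  VALUES (1, 'Larry Page'), (2, 'Sergey Brin'), (3, 'Larry Ellison');
INSERT INTO Company VALUES (1, 'Google'), (2, 'Oracle');
INSERT INTO WorksIn VALUES (1, 1, 1998), (2, 1, 2001), (3, 2, 2010);
""")

# The join touches all three tables to answer one question.
rows = conn.execute("""
    SELECT Person.Name
    FROM Person, Company, WorksIn
    WHERE Company.Name = 'Google'
      AND WorksIn.CompanyID = Company.ID
      AND WorksIn.PersonID = Person.ID
    ORDER BY Person.ID
""").fetchall()
google_employees = [name for (name,) in rows]
```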

Graph VS Relational Databases
Graph Database Modeling

Person 1 (Name : Larry Page)    -[WorksIn, Since : 1998]-> Company 1 (Name : Google)
Person 2 (Name : Sergey Brin)   -[WorksIn, Since : 2001]-> Company 1 (Name : Google)
Person 3 (Name : Larry Ellison) -[WorksIn, Since : 2010]-> Company 2 (Name : Oracle)

[Figure annotation: "Lookup", once.]

Graph Databases
Graph storage and graph processing

1. The underlying storage
• Some databases use native graph storage.
• Other databases store the graph in a relational database, an object-oriented database, …

2. The processing engine
• The nodes are physically connected to each other in the database.
• This property is called index-free adjacency.
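A rough way to picture index-free adjacency, assuming a toy in-memory model: each node keeps direct references to its neighbours, so a hop is a pointer dereference, whereas a non-native layout goes through a shared lookup structure for every hop. The class and variable names below are hypothetical and do not describe the internals of any particular product.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.neighbours = []  # direct references to adjacent Node objects

    def connect(self, other):
        self.neighbours.append(other)


# Index-free adjacency: follow object references from node to node.
a, b, c = Node("a"), Node("b"), Node("c")
a.connect(b)
b.connect(c)
two_hops = [n2.name for n1 in a.neighbours for n2 in n1.neighbours]

# Non-native alternative: every hop is a lookup by key in a shared structure.
edge_index = {"a": ["b"], "b": ["c"], "c": []}
two_hops_indexed = [n2 for n1 in edge_index["a"] for n2 in edge_index[n1]]
```

Both traversals return the same answer; the difference is whether each hop depends on a global lookup.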

Graph Databases
Graph Database Management System

[Figure not recovered. Source [1]]

Graph Databases : Example
Visual Modeling

[Figure: two person nodes (Name : John, Age : 27 and Name : Sally, Age : 32) and one book node (Title : Graph Databases, Authors : Ian Robinson, Jim Webber). Two FRIEND_OF relationships, each with Since : 01/09/2013, connect John and Sally. Two HAS_READ relationships (On : 02/09/2013, Rating : 4 and On : 02/03/2013, Rating : 5) connect the persons to the book.]

Graph Databases : Example
Create a simple dataset

// Create Sally
CREATE (sally:Person { name: 'Sally', age: 32 })

// Create John
CREATE (john:Person { name: 'John', age: 27 })

// Create Graph Databases book
CREATE (gdb:Book { title: 'Graph Databases', authors: ['Ian Robinson', 'Jim Webber'] })

// Connect Sally and John as friends
CREATE (sally)-[:FRIEND_OF { since: 1357718400 }]->(john)

// Connect Sally to Graph Databases book
CREATE (sally)-[:HAS_READ { rating: 4, on: 1360396800 }]->(gdb)

// Connect John to Graph Databases book
CREATE (john)-[:HAS_READ { rating: 5, on: 1359878400 }]->(gdb)


Graph Databases : Example
Simple selection from node:

Query 1 : How old is Sally?
MATCH (sally:Person { name: 'Sally' })
RETURN sally.age as sally_age

Graph Databases : Example
Simple selection from node:

Query 2 : Who are the authors of Graph Databases?
MATCH (gdb:Book { title: 'Graph Databases' })
RETURN gdb.authors as authors

Graph Databases : Example
Selection using relationship:

Query 3 : Who are Sally's friends?
MATCH (sally:Person { name: 'Sally' })
MATCH (sally)-[r:FRIEND_OF]-(person)
RETURN person.name as sally_friend

Graph Databases : Example
Selection using relationship and group function:

Query 4 : What is the average rating of Graph Databases?
MATCH (gdb:Book { title: 'Graph Databases' })
MATCH (gdb)<-[r:HAS_READ]-()
RETURN avg(r.rating) as average_rating

Graph Databases : Example
Using order and limit in query:

Query 5 : Who read Graph Databases first, Sally or John?
MATCH (people:Person)
WHERE people.name = 'John' OR people.name = 'Sally'
MATCH (people)-[r:HAS_READ]->(gdb:Book { title: 'Graph Databases' })
RETURN people.name as first_reader
ORDER BY r.on
LIMIT 1
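To check what Query 4 and Query 5 should return on the sample dataset, the two HAS_READ relationships can be replayed in plain Python. This is an analogue for reasoning about the expected results, not a Cypher engine; the tuple layout and variable names are assumptions, while the ratings and timestamps are those of the "Create a simple dataset" slide.

```python
from statistics import mean

# (reader, book title, rating, "on" timestamp in seconds since the Unix epoch)
has_read = [
    ("Sally", "Graph Databases", 4, 1360396800),
    ("John", "Graph Databases", 5, 1359878400),
]

# Query 4 analogue: avg(r.rating) over all HAS_READ edges pointing at the book.
ratings = [rating for _, title, rating, _ in has_read if title == "Graph Databases"]
average_rating = mean(ratings)

# Query 5 analogue: ORDER BY r.on LIMIT 1 keeps the reader with the smallest timestamp.
first_reader = min(
    (r for r in has_read if r[1] == "Graph Databases"),
    key=lambda r: r[3],
)[0]
```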

Graph Databases : Example
Visual Modeling

[Figure: three person nodes (Name : John, Age : 27; Name : Sally, Age : 32; Name : Alain, Age : 19) connected by three FRIEND_OF relationships, two with Since : 01/09/2013 and one with Since : 01/11/2014.]

Graph Databases : Example
Completing our schema

// Create Alain
CREATE (alain:Person { name: 'Alain', age: 19 })

// Connect Sally and Alain as friends
MATCH (alain:Person { name: 'Alain' })
MATCH (sally:Person { name: 'Sally' })
CREATE (sally)-[:FRIEND_OF { since: 1358818400 }]->(alain)

[Figure: the resulting graph, with the nodes Alain, Sally, John and GDB book.]

Graph Databases : Example
Node / relationship navigation:

Query 6 : Which friend do Alain and John have in common?
MATCH (alain:Person { name: 'Alain' })
MATCH (john:Person { name: 'John' })
MATCH (alain)-[:FRIEND_OF]-(person)-[:FRIEND_OF]-(john)
RETURN person.name as friend
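The two-hop pattern of Query 6 amounts to intersecting the friend sets of Alain and John. A minimal Python sketch, assuming only the two FRIEND_OF relationships created by the CREATE statements on the earlier slides; the helper name friends is hypothetical.

```python
# FRIEND_OF relationships created earlier, as (start, end) pairs.
friend_of = [("Sally", "John"), ("Sally", "Alain")]


def friends(person):
    """Neighbours over FRIEND_OF, ignoring direction as -[:FRIEND_OF]- does."""
    outgoing = {end for start, end in friend_of if start == person}
    incoming = {start for start, end in friend_of if end == person}
    return outgoing | incoming


# A shared friend is adjacent to both Alain and John.
shared = friends("Alain") & friends("John")
```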

Graph Databases : Example
Update node's properties:

Query 7 : Change Alain's name to Larry
MATCH (n { name: 'Alain' })
SET n.name = 'Larry'

Query 8 : Remove a property
MATCH (n { name: 'Larry' })
SET n.name = NULL

Query 9 : Add properties
MATCH (n { name: 'John' })
SET n += { hungry: TRUE, position: 'Entrepreneur' }
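The effect of the three SET forms can be mimicked on plain dictionaries: assigning a value overwrites the property, assigning NULL removes it, and += merges a map into the existing properties without dropping the others. The helper names below are hypothetical, and the sketch only mirrors those semantics.

```python
def set_property(props, key, value):
    """Mimics SET n.key = value; a None value removes the property."""
    if value is None:
        props.pop(key, None)
    else:
        props[key] = value


def merge_properties(props, extra):
    """Mimics SET n += {...}: add or overwrite, keep every other property."""
    for key, value in extra.items():
        set_property(props, key, value)


alain = {"name": "Alain", "age": 19}
set_property(alain, "name", "Larry")  # Query 7
set_property(alain, "name", None)     # Query 8

john = {"name": "John", "age": 27}
merge_properties(john, {"hungry": True, "position": "Entrepreneur"})  # Query 9
```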

RDF Databases
The principle of the web

[Figure: HTTP Request / HTTP Response exchange.]

• URL : http://website.com
• Communication protocol : HTTP
• Representation language : HTML

RDF Databases
Changing status

URL   Uniform Resource Locator                http://website.com
URI   Uniform Resource Identifier             http://animals.com#lion
IRI   Internationalized Resource Identifier   http://الحيوانات.tn#lion

RDF Databases
W3C Standards

• Identification
• Representation
• Query
• Reasoning

RDF Databases

RDF means :

• Resource : pages, persons, animals, ideas, …
• Description : attributes, characteristics, relationships, …
• Framework : model, language and syntax to build descriptions

RDF Databases
Data model & syntax

Description : (Subject, Predicate, Object)

Example : "doc.html is created by John and belongs to the music theme"

• doc.html is created by John
• doc.html belongs to the music theme

RDF Databases
Data model & syntax

(Subject, Predicate, Object)

(Vertex, Edge, Vertex)

[Figure: Doc.html -Author-> John ; Doc.html -Theme-> Music]
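The correspondence between (Subject, Predicate, Object) and (Vertex, Edge, Vertex) can be made concrete with a short Python sketch. The two statements are those of the example; representing them as tuples, and the names triples, vertices and edges, are assumptions made for illustration.

```python
# Each statement is one (subject, predicate, object) tuple.
triples = [
    ("Doc.html", "Author", "John"),
    ("Doc.html", "Theme", "Music"),
]

# Read as a graph: subjects and objects are vertices, predicates label the edges.
vertices = {s for s, _, _ in triples} | {o for _, _, o in triples}
edges = {(s, o): p for s, p, o in triples}
```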

RDF Databases
Labeled graph with URI and literals

[Figure: the same graph with URIs and a literal.]
<http://www.website.com/doc.html> -<http://www.website.com/schema#author>-> <http://www.website.com/john#me>
<http://www.website.com/doc.html> -<http://www.website.com/schema#theme>-> "Music"

RDF Databases
RDF Syntaxes
RDF/XML, Turtle, TriG, JSON-LD, …

Turtle syntax

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix site: <http://www.website.com/schema#> .

<http://www.website.com/doc.html>
    site:author <http://www.website.com/john#me> ;
    site:theme "Music" .

RDF Databases
SPARQL Protocol And RDF Query Language

• Syntax similar to SQL

SELECT data
FROM data source
WHERE { conditions }

RDF Databases
SPARQL Protocol And RDF Query Language

Get all persons :
?x rdf:type ex:Person

Get the full graph database :
SELECT ?subject ?property ?value
WHERE { ?subject ?property ?value }

Get all persons who have a name :
SELECT ?x
WHERE { ?x rdf:type ex:Person . ?x :name ?name . }
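How a WHERE clause made of triple patterns is matched against a graph can be sketched in plain Python: every term starting with "?" is a variable, and each pattern extends the bindings found so far. This is a toy evaluator for illustration only, not a SPARQL implementation; the function names and the sample data (ex:alice, ex:bob) are hypothetical.

```python
def match(pattern, triple, bindings):
    """Try to match one triple pattern; terms starting with '?' are variables."""
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if result.setdefault(term, value) != value:
                return None  # variable already bound to a different value
        elif term != value:
            return None      # constant term does not match
    return result


def select(patterns, graph):
    """Evaluate a list of triple patterns, joining them on shared variables."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for bindings in solutions
            for triple in graph
            if (extended := match(pattern, triple, bindings)) is not None
        ]
    return solutions


# Hypothetical data, written with the prefixed names used on the slide.
graph = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:alice", ":name", "Alice"),
    ("ex:bob", "rdf:type", "ex:Person"),
]
persons = [s["?x"] for s in select([("?x", "rdf:type", "ex:Person")], graph)]
named = [s["?x"] for s in select(
    [("?x", "rdf:type", "ex:Person"), ("?x", ":name", "?name")], graph)]
```

Adding the second pattern filters out ex:bob, who has a type but no name.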

RDF Databases
SPARQL Protocol And RDF Query Language

Declaring prefixes

PREFIX esen: <http://esen.tn#>
SELECT ?student
WHERE {
  ?student esen:registeredAt ?x .
}

RDF Databases
SPARQL Protocol And RDF Query Language

Optional pattern

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person ?name
WHERE {
  ?person foaf:homepage <http://john.info> .
  OPTIONAL { ?person foaf:name ?name . }
}

If the person has no foaf:name, the solution is still returned, with ?name unbound.
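OPTIONAL behaves like a left outer join: a solution of the required pattern is kept even when the optional pattern finds nothing. A hedged Python sketch of that behaviour follows; the identifier ex:john, the function name optional_name and the sample triples are assumptions made for illustration.

```python
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical data: one person with the homepage but no foaf:name.
graph = [
    ("ex:john", FOAF + "homepage", "http://john.info"),
]


def optional_name(graph):
    """Required pattern: homepage is http://john.info. Optional pattern: foaf:name."""
    rows = []
    for person, predicate, obj in graph:
        if predicate == FOAF + "homepage" and obj == "http://john.info":
            names = [o for s, p, o in graph if s == person and p == FOAF + "name"]
            if names:
                rows.extend({"person": person, "name": n} for n in names)
            else:
                rows.append({"person": person})  # ?name stays unbound
    return rows


rows = optional_name(graph)
with_name = optional_name(graph + [("ex:john", FOAF + "name", "John")])
```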

RDF Databases
SPARQL Protocol And RDF Query Language

Union

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name
WHERE {
  ?person foaf:name ?name .
  {
    { ?person foaf:homepage <http://john.info> . }
    UNION
    { ?person foaf:homepage <http://paul.info> . }
  }
}

RDF Databases
SPARQL Protocol And RDF Query Language

Minus

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ex: <http://website.com#>
SELECT ?person
WHERE {
  { ?person rdf:type ?type }
  MINUS { ?person rdf:type ex:student }
}
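When only ?person is projected, UNION and MINUS can be pictured as set union and set difference over the persons matched by each side. The sketch below is a simplification (it ignores the foaf:name join of the UNION slide), and the data, prefixed names and helper name subjects are hypothetical.

```python
RDF_TYPE = "rdf:type"

# Hypothetical data using prefixed names in the style of the slides.
graph = [
    ("ex:anna", RDF_TYPE, "ex:student"),
    ("ex:anna", RDF_TYPE, "ex:person"),
    ("ex:marc", RDF_TYPE, "ex:teacher"),
    ("ex:john", "foaf:homepage", "http://john.info"),
    ("ex:paul", "foaf:homepage", "http://paul.info"),
]


def subjects(predicate, obj=None):
    """Subjects of triples with this predicate (and this object, when given)."""
    return {s for s, p, o in graph if p == predicate and (obj is None or o == obj)}


# UNION: solutions of either branch.
john_or_paul = subjects("foaf:homepage", "http://john.info") | subjects(
    "foaf:homepage", "http://paul.info")

# MINUS: drop left-hand solutions compatible with a right-hand solution.
# The only shared variable is ?person, so for SELECT ?person this is a set difference.
not_students = subjects(RDF_TYPE) - subjects(RDF_TYPE, "ex:student")
```

Note that ex:anna disappears entirely, even though she also has a non-student type.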

RDF Databases
Use case : Google rich snippets

<div xmlns:v="http://rdf.data-vocabulary.org/#" typeof="v:Person">
  My name is <span property="v:name">Pierre Dumoulin</span>.
  My personal homepage: <a href="http://www.example.com" rel="v:url">www.homepage.com</a>
  I live at
  <span rel="v:address" typeof="v:address">
    <span property="v:street-address">12 street name</span>
    <span property="v:locality">city name</span>,
    <span property="v:region">XY</span>
    <span property="v:postal-code">12345</span>.
  </span>
</div>

Application example (RDF)

Data storage
# Default graph (stored at http://example.org/foaf/aliceFoaf)
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
_:a foaf:name "Alice" .
_:b foaf:mbox <mailto:[email protected]> .
_:a foaf:mbox <mailto:[email protected]> .

Query
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name
FROM <http://example.org/foaf/aliceFoaf>
WHERE { ?x foaf:name ?name }

Result
Name
Alice

Application example (Neo4J)
Question : Who is older, Sally or John?

[Figure: three person nodes (Name : John, Age : 27; Name : Sally, Age : 32; Name : Alain, Age : 19) connected by three FRIEND_OF relationships, two with Since : 01/09/2013 and one with Since : 01/11/2014.]

Application example (Neo4J)
Who is older, Sally or John?

MATCH (people:Person)
WHERE people.name = 'John' OR people.name = 'Sally'
RETURN people.name as oldest
ORDER BY people.age DESC
LIMIT 1

Scientific article

Title : Querying RDF Data from a Graph Database Perspective
Book title : The Semantic Web: Research and Applications
Pages : 346-360
Online ISBN : 978-3-540-31547-6
Series Volume : 3532
Publisher : Springer Berlin Heidelberg
Copyright : 2005
Authors : Renzo Angles, Claudio Gutierrez

Scientific article

MODEL        LEVEL              DATA COMPLEXITY   CONNECTIVITY   TYPE OF DATA
Network      physical           simple            high           homogeneous
Relational   logical            simple            low            homogeneous
Semantic     user               simple/medium     high           homogeneous
Object-O     logical/physical   complex           medium         heterogeneous
XML          logical            medium            medium         heterogeneous
RDF          logical            medium            high           heterogeneous

Table 1 : Summary of comparison among different database models

Scientific article

PROPERTY                     G     G+    GraphLog   Gram   GraphDB   Lorel   F-G
Adjacent nodes               +/-   √     √          √      +/-       √       +/-
Adjacent edges               +/-   √     √          √      +/-       √       +/-
Degree of a node             x     √     √          x      ?         x       x
Path                         √     √     √          √      √         √       √
Fixed-length Path            √     √     √          √      √         √       √
Distance between two nodes   x     √     √          x      ?         x       x
Diameter                     x     √     √          x      ?         x       x

Table 2 : Support of some graph database query languages for the example graph properties

Scientific article

PROPERTY                     RQL   SeRQL   RDQL   Triple   N3    Versa   RxPath
Adjacent nodes               +/-   +/-     +/-    +/-      +/-   +/-     x
Adjacent edges               +/-   +/-     +/-    +/-      x     x       x
Degree of a node             +/-   x       x      x        x     x       x
Path                         x     x       x      x        x     x       +/-
Fixed-length Path            +/-   +/-     +/-    +/-      +/-   x       +/-
Distance between two nodes   x     x       x      x        x     x       x
Diameter                     x     x       x      x        x     x       x

Table 3 : Support of some current RDF query languages for some example graph properties

Conclusion

• Use a graph database to store data that has a graph form or a hierarchical tree structure.

• Graph database : Performance, Agility, Flexibility

• The shortest path


Thanks for your attention

Bibliography

[1] Ian Robinson, Jim Webber, and Emil Eifrem. «Graph Databases». O'Reilly, 2013.
[2] Serge Miranda, Fabien Gandon. «Des Bases de Données à Big Data». Course at Nice Sophia university, MOOC, 2015.
[3] Michel Domenjoud. «Bases de données graphes : un tour d'horizon». Available on <http://blog.octo.com/bases-de-donnees-graphes-un-tour-dhorizon> (consulted 18/02/2015).
[4] Neo4J community. «Cypher Query Language». Available on <http://neo4j.com/developer/data-modeling/> (consulted 18/02/2015).
[5] Frank Manola, Eric Miller, Brian McBride. «RDF 1.1 Primer». Available on <http://www.w3.org/TR/2014/NOTE-rdf11-primer-20140225/> (consulted 18/02/2015).
[6] Eric Prud'hommeaux, Andy Seaborne. «SPARQL Query Language for RDF». Available on <http://www.w3.org/TR/rdf-sparql-query/> (consulted 18/02/2015).
[7] Neo4J community. «Introduction to graph databases webinar». Available on <http://www.neo4j.org/learn/videos_webinar> (consulted 18/02/2015).