how caching improves efficiency and result completeness for querying linked data

Post on 11-May-2015

900 Views

Category:

Technology

0 Downloads

Preview:

Click to see full reader

DESCRIPTION

That's the slides with which I presented my paper at the Linked Data on the Web workshop (LDOW) at WWW 2011, Hyderabad, India.

TRANSCRIPT

How Caching ImprovesEfficiency and Result Completeness

for Querying Linked Data

Olaf Hartighttp://olafhartig.de/foaf.rdf#olaf

Database and Information Systems Research GroupHumboldt-Universität zu Berlin

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 2

SELECT DISTINCT ?i ?labelWHERE {

?prof rdf:type <http://res ... data/dbprofs#DBProfessor> ; foaf:topic_interest ?i .

OPTIONAL { ?i rdfs:label ?label FILTER( LANG(?label)="en" || LANG(?label)="") }}ORDER BY ?label

?

Can we query the Web of Dataas of it were a single,giant database?

Our approach: Link Traversal Based Query Execution [ISWC'09]

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 3

Main Idea

query-localdataset

● Intertwine query evaluation with traversal of data links

● We alternate between:● Evaluate parts of the query (triple patterns)

on a continuously augmented set of data● Look up URIs in intermediate

solutions and add retrieved datato the query-local dataset

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 4

Main Idea

query-localdataset

● Intertwine query evaluation with traversal of data links

● We alternate between:● Evaluate parts of the query (triple patterns)

on a continuously augmented set of data● Look up URIs in intermediate

solutions and add retrieved datato the query-local dataset

n

ame

project ?prj

Query

know

s

?acq

?prjName

http://bob.name

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 5

query-localdataset

Main Idea

http://bob.name

?

● Intertwine query evaluation with traversal of data links

● We alternate between:● Evaluate parts of the query (triple patterns)

on a continuously augmented set of data● Look up URIs in intermediate

solutions and add retrieved datato the query-local dataset

n

ame

project ?prj

Query

know

s

?acq

?prjName

http://bob.name

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 6

query-localdataset

Main Idea

http://bob.name

?

● Intertwine query evaluation with traversal of data links

● We alternate between:● Evaluate parts of the query (triple patterns)

on a continuously augmented set of data● Look up URIs in intermediate

solutions and add retrieved datato the query-local dataset

n

ame

project ?prj

Query

know

s

?acq

?prjName

http://bob.name

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 7

query-localdataset

Main Idea

http://bob.name

?

● Intertwine query evaluation with traversal of data links

● We alternate between:● Evaluate parts of the query (triple patterns)

on a continuously augmented set of data● Look up URIs in intermediate

solutions and add retrieved datato the query-local dataset

n

ame

project ?prj

Query

know

s

?acq

?prjName

http://bob.name

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 8

query-localdataset

Main Idea

http://bob.name

?

● Intertwine query evaluation with traversal of data links

● We alternate between:● Evaluate parts of the query (triple patterns)

on a continuously augmented set of data● Look up URIs in intermediate

solutions and add retrieved datato the query-local dataset

n

ame

project ?prj

Query

know

s

?acq

?prjName

http://bob.name

“Descriptor object”

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 9

query-localdataset

● Intertwine query evaluation with traversal of data links

● We alternate between:● Evaluate parts of the query (triple patterns)

on a continuously augmented set of data● Look up URIs in intermediate

solutions and add retrieved datato the query-local dataset

Main Idea

n

ame

project ?prj

Query

?prjName

know

s

http://bob.name

?acq

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 10

query-localdataset

● Intertwine query evaluation with traversal of data links

● We alternate between:● Evaluate parts of the query (triple patterns)

on a continuously augmented set of data● Look up URIs in intermediate

solutions and add retrieved datato the query-local dataset

Main Idea

knows

http://bob.name

http://alice.name

n

ame

project ?prj

Query

?prjName

know

s

http://bob.name

?acq n

ame

project ?prj

Query

?prjName

know

s

http://bob.name

?acq

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 11

query-localdataset

● Intertwine query evaluation with traversal of data links

● We alternate between:● Evaluate parts of the query (triple patterns)

on a continuously augmented set of data● Look up URIs in intermediate

solutions and add retrieved datato the query-local dataset

Main Idea

http://alice.name

?acq

n

ame

project ?prj

Query

?prjName

know

s

http://bob.name

?acq

knows

http://bob.name

http://alice.name

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 12

query-localdataset

● Intertwine query evaluation with traversal of data links

● We alternate between:● Evaluate parts of the query (triple patterns)

on a continuously augmented set of data● Look up URIs in intermediate

solutions and add retrieved datato the query-local dataset

Main Idea

http

://al

ice.

nam

e

?

http://alice.name

?acq

n

ame

project ?prj

Query

know

s

?acq

?prjName

http://bob.name

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 13

query-localdataset

● Intertwine query evaluation with traversal of data links

● We alternate between:● Evaluate parts of the query (triple patterns)

on a continuously augmented set of data● Look up URIs in intermediate

solutions and add retrieved datato the query-local dataset

Main Idea

http

://al

ice.

nam

e

?

http://alice.name

?acq

n

ame

project ?prj

Query

know

s

?acq

?prjName

http://bob.name

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 14

query-localdataset

● Intertwine query evaluation with traversal of data links

● We alternate between:● Evaluate parts of the query (triple patterns)

on a continuously augmented set of data● Look up URIs in intermediate

solutions and add retrieved datato the query-local dataset

Main Idea

http

://al

ice.

nam

e

?

http://alice.name

?acq

n

ame

project ?prj

Query

know

s

?acq

?prjName

http://bob.name

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 15

query-localdataset

● Intertwine query evaluation with traversal of data links

● We alternate between:● Evaluate parts of the query (triple patterns)

on a continuously augmented set of data● Look up URIs in intermediate

solutions and add retrieved datato the query-local dataset

Main Idea

n

ame

project ?prj

Query

know

s

?acq

?prjName

http://bob.name

http://alice.name

?acq

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 16

query-localdataset

● Intertwine query evaluation with traversal of data links

● We alternate between:● Evaluate parts of the query (triple patterns)

on a continuously augmented set of data● Look up URIs in intermediate

solutions and add retrieved datato the query-local dataset

Main Idea

http://alice.name

?acq

n

ame

?prjName

know

s

http://bob.nameQuery

project ?prj?acq

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 17

query-localdataset

● Intertwine query evaluation with traversal of data links

● We alternate between:● Evaluate parts of the query (triple patterns)

on a continuously augmented set of data● Look up URIs in intermediate

solutions and add retrieved datato the query-local dataset

Main Idea

project http://.../AlicesPrj

http://alice.name

http://alice.name

?acq

n

ame

?prjName

know

s

http://bob.nameQuery

project ?prj?acq

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 18

query-localdataset

● Intertwine query evaluation with traversal of data links

● We alternate between:● Evaluate parts of the query (triple patterns)

on a continuously augmented set of data● Look up URIs in intermediate

solutions and add retrieved datato the query-local dataset

Main Idea

http://alice.name http://.../AlicesPrj

?prj?acq

http://alice.name

?acq

n

ame

?prjName

know

s

http://bob.nameQuery

project ?prj?acq

project http://.../AlicesPrj

http://alice.name

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 19

query-localdataset

● Intertwine query evaluation with traversal of data links

● We alternate between:● Evaluate parts of the query (triple patterns)

on a continuously augmented set of data● Look up URIs in intermediate

solutions and add retrieved datato the query-local dataset

Main Idea

http://alice.name

?acq

n

ame

?prjName

know

s

http://bob.nameQuery

project ?prj?acq

http://alice.name http://.../AlicesPrj

?prj?acq

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 20

query-localdataset

● Intertwine query evaluation with traversal of data links

● We alternate between:● Evaluate parts of the query (triple patterns)

on a continuously augmented set of data● Look up URIs in intermediate

solutions and add retrieved datato the query-local dataset

Main Idea

http://alice.name

?acq

n

ame

know

s

http://bob.nameQuery

project?acq

?prjName

?prj

http://alice.name http://.../AlicesPrj

?prj?acq

http://.../AlicesPrj “ … “

?prjName?prj

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 21

query-localdataset

Main Idea

http://alice.name

?acq

n

ame

know

s

http://bob.nameQuery

project?acq

?prjName

?prj

http://alice.name http://.../AlicesPrj

?prj?acq

http://.../AlicesPrj “ … “

?prjName?prj

http://alice.name

?acqhttp://.../AlicesPrj “ … “

?prjName?prj

● Intertwine query evaluation with traversal of data links

● We alternate between:● Evaluate parts of the query (triple patterns)

on a continuously augmented set of data● Look up URIs in intermediate

solutions and add retrieved datato the query-local dataset

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 22

Characteristics

● Link traversal based query execution:● Evaluation on a continuously augmented dataset● Discovery of potentially relevant data during execution● Discovery driven by intermediate solutions

● Main advantage:● No need to know all data sources in advance

● Limitations:● Query has to contain a URI as a starting point● Ignores data that is not reachable* by the query execution

* formal definition in the paper

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 23

query-localdataset

The Issue

lab

el

interest?i

Querykn

ows

?acq

?iLabel

http://bob.name

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 24

query-localdataset

The Issue

lab

el

interest?i

Querykn

ows

?acq

?iLabel

http://bob.name

http://bob.name?

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 25

query-localdataset

The Issue

lab

el

interest?i

Querykn

ows

?acq

?iLabel

http://bob.name

knows

http://bob.name

http://alice.name

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 26

query-localdataset

The Issue

lab

el

interest?i

Querykn

ows

?acq

?iLabel

http://bob.name

knows

http://bob.name

http://alice.name

?acq ?iLabel?i

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 27

query-localdataset

query-localdataset

The Issue

lab

el

interest?i

Querykn

ows

?acq

?iLabel

http://bob.name

nam

e

know

s

http://bob.nameQuery

project?acq

?prjName

?prj

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 28

query-localdataset

query-localdataset

Reusing the Query-Local Dataset

lab

el

interest?i

Querykn

ows

?acq

?iLabel

http://bob.name

nam

e

know

s

http://bob.nameQuery

project?acq

?prjName

?prj

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 29

query-localdataset

Reusing the Query-Local Dataset

lab

el

interest?i

Querykn

ows

?acq

?iLabel

http://bob.name

nam

e

know

s

http://bob.nameQuery

project?acq

?prjName

?prj

knows

http://alice.name

http://bob.name

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 30

query-localdataset

Reusing the Query-Local Dataset

lab

el

interest?i

Querykn

ows

?iLabel

?acq

http://bob.name

nam

e

know

s

http://bob.nameQuery

project?acq

?prjName

?prj

http://alice.name

?acq

knows

http://bob.name

http://alice.name

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 31

Re-using the query-local dataset (a.k.a. data caching)may benefit

query performance + result completeness

Hypothesis

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 32

Contributions

● Systematic analysis of the impact of data caching● Theoretical foundation* ● Conceptual analysis* ● Empirical evaluation of the potential impact

● Out of scope: Caching strategies (replacement, invalidation)

*see paper

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 33

Experiment – Scenario

● Information about thedistributed social network of FOAF profiles● 5 types of queries

● Experiment Setup:● 23 persons● Sequential use➔ 115 queries

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 34

ContactInfoPhillipe

UnsetPropsPhillipe

2ndDegree1Phillipe

2ndDegree2Phillipe

IncomingPhillipe

0 0,2 0,4 0,6 0,8 1

0 0,2 0,4 0,6 0,8 1

no reuse given order

Experiment – Complete Sequence

(Query No. 36)

(Query No. 37)

(Query No. 38)

(Query No. 39)

(Query No. 40)

hit rate

● no reuse experiment:● No data caching

● given order experiment● Reuse of the query-local

dataset for the complete sequence of all 115 queries

● Hit rate:

look-ups answered from cacheall look-up requests

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 35

Experiment – Complete Sequence

(Query No. 36)

(Query No. 37)

(Query No. 38)

(Query No. 39)

(Query No. 40)

hit rate

● no reuse experiment:● No data caching

● given order experiment● Reuse of the query-local

dataset for the complete sequence of all 115 queries

● Hit rate:

look-ups answered from cacheall look-up requests

ContactInfoPhillipe

UnsetPropsPhillipe

2ndDegree1Phillipe

2ndDegree2Phillipe

IncomingPhillipe

0 0,2 0,4 0,6 0,8 1

0 0,2 0,4 0,6 0,8 1

no reuse given order

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 36

Experiment – Complete Sequence

(Query No. 36)

(Query No. 37)

(Query No. 38)

(Query No. 39)

(Query No. 40)

hit rate query execution time(in seconds)

number of query results

0 5 10 15 20 25 30

0 5 10 15 20 25 30

0 20 40 60 80

0 20 40 60 80

ContactInfoPhillipe

UnsetPropsPhillipe

2ndDegree1Phillipe

2ndDegree2Phillipe

IncomingPhillipe

0 0,2 0,4 0,6 0,8 1

0 0,2 0,4 0,6 0,8 1

no reuse given order

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 37

Experiment – Complete Sequence

(Query No. 36)

(Query No. 37)

(Query No. 38)

(Query No. 39)

(Query No. 40)

hit rate query execution time(in seconds)

number of query results

0 5 10 15 20 25 30

0 5 10 15 20 25 30

0 20 40 60 80

0 20 40 60 80

ContactInfoPhillipe

UnsetPropsPhillipe

2ndDegree1Phillipe

2ndDegree2Phillipe

IncomingPhillipe

0 0,2 0,4 0,6 0,8 1

0 0,2 0,4 0,6 0,8 1

no reuse given order

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 38

Summary

● Contributions:● Theoretical foundation● Conceptual analysis● Empirical evaluation

● Main findings:● Additional results possible (for semantically similar queries)● Impact on performance may be positive but also negative

● Future work:● Analysis of caching strategies in our context● Main issue: invalidation

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 39

Backup Slides

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 40

Contributions

● Theoretical foundation (extension of the original definition)

● Reachability by a Dseed

-initialized execution of a BGP query b

● Dseed

-dependent solution for a BGP query b

● Reachability R(B) for a serial execution of B = b1 , … , b

n

➔ Each solution for bcur

is also R(B)-dependent solution for bcur

● Conceptual analysis of the impact of data caching

● Performance factor: p( bcur

, B ) = c( bcur

, [ ] ) – c( bcur

, B )

● Serendipity factor: s( bcur

, B ) = b( bcur

, B ) – b( bcur

, [ ] )

● Empirical verification of the potential impact

● Out of scope: Caching strategies (replacement, invalidation)

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 41

Query Template Contact

SELECT * WHERE { <PERSON> foaf:knows ?p .

OPTIONAL { ?p foaf:name ?name } OPTIONAL { ?p foaf:firstName ?firstName } OPTIONAL { ?p foaf:givenName ?givenName } OPTIONAL { ?p foaf:givenname ?givenname } OPTIONAL { ?p foaf:familyName ?familyName } OPTIONAL { ?p foaf:family_name ?family_name } OPTIONAL { ?p foaf:lastName ?lastName } OPTIONAL { ?p foaf:surname ?surname }

OPTIONAL { ?p foaf:birthday ?birthday }

OPTIONAL { ?p foaf:img ?img }

OPTIONAL { ?p foaf:phone ?phone } OPTIONAL { ?p foaf:aimChatID ?aimChatID } OPTIONAL { ?p foaf:icqChatID ?icqChatID } OPTIONAL { ?p foaf:jabberID ?jabberID } OPTIONAL { ?p foaf:msnChatID ?msnChatID } OPTIONAL { ?p foaf:skypeID ?skypeID } OPTIONAL { ?p foaf:yahooChatID ?yahooChatID }

}

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 42

Query Template UnsetProps

SELECT DISTINCT ?result ?resultLabel WHERE{

?result rdfs:isDefinedBy <http://xmlns.com/foaf/0.1/> .?result rdfs:domain foaf:Person .

OPTIONAL { <PERSON> ?result ?var0 } FILTER ( !bound(?var0) )

<PERSON> foaf:knows ?var2 .?var2 ?result ?var3 .?result rdfs:label ?resultLabel .?result vs:term_status ?var1 .

}ORDER BY ?var1

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 43

Query Template Incoming

SELECT DISTINCT ?result WHERE{

?result foaf:knows <PERSON> .

OPTIONAL{

?result foaf:knows ?var1 .FILTER ( <PERSON> = ?var1 )<PERSON> foaf:knows ?result .

}FILTER ( !bound(?var1) )

}

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 44

Query Template 2ndDegree1

SELECT DISTINCT ?result WHERE{

<PERSON> foaf:knows ?p1 .<PERSON> foaf:knows ?p2 .FILTER ( ?p1 != ?p2 )

?p1 foaf:knows ?result .FILTER ( <PERSON> != ?result )?p2 foaf:knows ?result .

OPTIONAL {<PERSON> ?knows ?result .FILTER ( ?knows = foaf:knows )

}FILTER ( !bound(?knows) )

}

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 45

Query Template 2ndDegree2

SELECT DISTINCT ?result WHERE{

<PERSON> foaf:knows ?p1 .<PERSON> foaf:knows ?p2 .FILTER ( ?p1 != ?p2 )

?result foaf:knows ?p1 .FILTER ( <PERSON> != ?result )?result foaf:knows ?p2 .

OPTIONAL {<PERSON> ?knows ?result .FILTER ( ?knows = foaf:knows )

}FILTER ( !bound(?knows) )

}

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 46

ContactInfoPhillipe

UnsetPropsPhillipe

2ndDegree1Phillipe

2ndDegree2Phillipe

IncomingPhillipe

0 0,2 0,4 0,6 0,8 1

0 0,2 0,4 0,6 0,8 1

no reuse upper bound

Experiment – Single Query

hit rate

(Query No. 36)

(Query No. 37)

(Query No. 38)

(Query No. 39)

(Query No. 40)

● no reuse experiment:● No data caching

● upper bound experiment● Reuse of query-local dataset

for 3 executions of each query● Third execution measured

● Hit rate:

look-ups answered from cacheall look-up requests

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 47

ContactInfoPhillipe

UnsetPropsPhillipe

2ndDegree1Phillipe

2ndDegree2Phillipe

IncomingPhillipe

0 0,2 0,4 0,6 0,8 1

0 0,2 0,4 0,6 0,8 1

no reuse upper bound

Experiment – Single Query

hit rate

(Query No. 36)

(Query No. 37)

(Query No. 38)

(Query No. 39)

(Query No. 40)

● no reuse experiment:● No data caching

● upper bound experiment● Reuse of query-local dataset

for 3 executions of each query● Third execution measured

● Hit rate:

look-ups answered from cacheall look-up requests

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 48

ContactInfoPhillipe

UnsetPropsPhillipe

2ndDegree1Phillipe

2ndDegree2Phillipe

IncomingPhillipe

0 0,2 0,4 0,6 0,8 1

0 0,2 0,4 0,6 0,8 1

no reuse upper bound

Experiment – Single Query

hit rate query execution time(in seconds)

number of query results

(Query No. 36)

(Query No. 37)

(Query No. 38)

(Query No. 39)

(Query No. 40)

0 5 10 15 20 25 30

0 5 10 15 20 25 30

0 20 40 60 80

0 20 40 60 80

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 49

ContactInfoPhillipe

UnsetPropsPhillipe

2ndDegree1Phillipe

2ndDegree2Phillipe

IncomingPhillipe

0 0,2 0,4 0,6 0,8 1

0 0,2 0,4 0,6 0,8 1

no reuse upper bound

Experiment – Single Query

hit rate query execution time(in seconds)

number of query results

(Query No. 36)

(Query No. 37)

(Query No. 38)

(Query No. 39)

(Query No. 40)

0 5 10 15 20 25 30

0 5 10 15 20 25 30

0 20 40 60 80

0 20 40 60 80

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 50

Experiment – Single Query

● In the ideal case for Bupper

= [ bcur

, bcur

] :

● pupper

( bcur

, Bupper

) = c( bcur

, [ ] ) – c( bcur

, Bupper

) = c( bcur

, [ ] )

● supper

( bcur

, Bupper

) = b( bcur

, Bupper

) – b( bcur

, [ ] ) = 0

Experiment Avg.1 number of Query Results

(std.dev.)

Average1

Hit Rate

(std.dev.)

Avg.1 query Execution Time

(std.dev.)

no reuse4.983

(11.658)

0.576(0.182)

30.036 s(46.708)

upper bound5.070

(11.813)

0.996(0.017)

1.943 s(11.375)

1 Averaged over all 115 queries

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 51

Experiment – Single Query

● Summary (measurement errors aside):● Same number of query results● Significant improvements in query performance

Experiment Avg.1 number of Query Results

(std.dev.)

Average1

Hit Rate

(std.dev.)

Avg.1 query Execution Time

(std.dev.)

no reuse4.983

(11.658)

0.576(0.182)

30.036 s(46.708)

upper bound5.070

(11.813)

0.996(0.017)

1.943 s(11.375)

1 Averaged over all 115 queries

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 52

ContactInfoPhillipe

UnsetPropsPhillipe

2ndDegree1Phillipe

2ndDegree2Phillipe

IncomingPhillipe

0 0,2 0,4 0,6 0,8 1

0 0,2 0,4 0,6 0,8 1

no reuse upper bound

given order

Experiment – Complete Sequence

(Query No. 36)

(Query No. 37)

(Query No. 38)

(Query No. 39)

(Query No. 40)

hit rate

0 5 10 15 20 25 30

0 5 10 15 20 25 30

0 20 40 60 80

0 20 40 60 80

query execution time(in seconds)

number of query results

● given order experiment:● Reuse of the query-local

dataset for the complete sequence of all 115 queries

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 53

0 5 10 15 20 25 30

0 5 10 15 20 25 30

0 20 40 60 80

0 20 40 60 80

query execution time(in seconds)

number of query results

Experiment – Complete Sequence

(Query No. 36)

(Query No. 37)

(Query No. 38)

(Query No. 39)

(Query No. 40)

hit rate

● given order experiment:● Reuse of the query-local

dataset for the complete sequence of all 115 queries

ContactInfoPhillipe

UnsetPropsPhillipe

2ndDegree1Phillipe

2ndDegree2Phillipe

IncomingPhillipe

0 0,2 0,4 0,6 0,8 1

0 0,2 0,4 0,6 0,8 1

no reuse upper bound

given order

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 54

Experiment – Complete Sequence

(Query No. 36)

(Query No. 37)

(Query No. 38)

(Query No. 39)

(Query No. 40)

0 5 10 15 20 25 30

0 5 10 15 20 25 30

0 20 40 60 80

0 20 40 60 80

hit rate query execution time(in seconds)

number of query results

ContactInfoPhillipe

UnsetPropsPhillipe

2ndDegree1Phillipe

2ndDegree2Phillipe

IncomingPhillipe

0 0,2 0,4 0,6 0,8 1

0 0,2 0,4 0,6 0,8 1

no reuse upper bound

given order

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 55

Experiment – Complete Sequence

(Query No. 36)

(Query No. 37)

(Query No. 38)

(Query No. 39)

(Query No. 40)

0 5 10 15 20 25 30

0 5 10 15 20 25 30

0 20 40 60 80

0 20 40 60 80

hit rate query execution time(in seconds)

number of query results

ContactInfoPhillipe

UnsetPropsPhillipe

2ndDegree1Phillipe

2ndDegree2Phillipe

IncomingPhillipe

0 0,2 0,4 0,6 0,8 1

0 0,2 0,4 0,6 0,8 1

no reuse upper bound

given order

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 56

ContactInfoPhillipe

UnsetPropsPhillipe

2ndDegree1Phillipe

2ndDegree2Phillipe

IncomingPhillipe

0 0,2 0,4 0,6 0,8 1

0 0,2 0,4 0,6 0,8 1

no reuse upper bound

given order

Experiment – Complete Sequence

(Query No. 36)

(Query No. 37)

(Query No. 38)

(Query No. 39)

(Query No. 40)

0 5 10 15 20 25 30

0 5 10 15 20 25 30

0 20 40 60 80

0 20 40 60 80

hit rate query execution time(in seconds)

number of query results

Bgiven order

= [ q1 , … , q

38 ]

s( q39

, Bgiven order

) = b( q39

, Bgiven order

) – b( q39

, [ ] )

= 9 – 1

= 8

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 57

ContactInfoPhillipe

UnsetPropsPhillipe

2ndDegree1Phillipe

2ndDegree2Phillipe

IncomingPhillipe

0 0,2 0,4 0,6 0,8 1

0 0,2 0,4 0,6 0,8 1

no reuse upper bound

given order

Experiment – Complete Sequence

(Query No. 36)

(Query No. 37)

(Query No. 38)

(Query No. 39)

(Query No. 40)

0 5 10 15 20 25 30

0 5 10 15 20 25 30

0 20 40 60 80

0 20 40 60 80

hit rate query execution time(in seconds)

number of query results

Bgiven order

= [ q1 , … , q

38 ]

p'( q39

, Bgiven order

) = c'( q39

, [ ] ) – c'( q39

, Bgiven order

)

= 31.48 s – 68.64 s

= – 37.16 s

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 58

Experiment – Complete Sequence

● Summary:● Data cache may provide for additional query results● Impact on performance may be positive but also negative

Experiment Avg.1 number of Query Results

(std.dev.)

Average1

Hit Rate

(std.dev.)

Avg.1 query Execution Time

(std.dev.)

no reuse4.983

(11.658)

0.576(0.182)

30.036 s(46.708)

upper bound5.070

(11.813)

0.996(0.017)

1.943 s(11.375)

given order6.878

(12.158)

0.932(0.139)

39.845 s(145.898)

1 Averaged over all 115 queries

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 59

Experiment – Complete Sequence

● Executing the query sequence in a random order results in measurements similar to the given order.

Experiment Avg.1 number of Query Results

(std.dev.)

Average1

Hit Rate

(std.dev.)

Avg.1 query Execution Time

(std.dev.)

no reuse4.983

(11.658)

0.576(0.182)

30.036 s(46.708)

upper bound5.070

(11.813)

0.996(0.017)

1.943 s(11.375)

given order6.878

(12.158)

0.932(0.139)

39.845 s(145.898)

random orders6.652

(11.966)

0.954(0.036)

36.994 s(118.700)

Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 60

These slides have been created byOlaf Hartig

http://olafhartig.de

This work is licensed under aCreative Commons Attribution-Share Alike 3.0 License

(http://creativecommons.org/licenses/by-sa/3.0/)

top related