Searching the Web of Data (Tutorial)


Upload: gerard-de-melo

Post on 28-Jul-2015


TRANSCRIPT

Page 1: Searching the Web of Data (Tutorial)

Searching the Web of Data
Tutorial at ECIR 2013, Moscow

Gerard de Melo
ICSI Berkeley
http://lexvo.org/gdm/

Katja Hose
Aalborg University
http://www.katja-hose.de/

Page 2: Searching the Web of Data (Tutorial)

Our Tutorial Today

- Schedule: 10:00-13:30
- Coffee Break: 11:30-12:00
- Questions: any time
- Slides: will be available online

Page 3: Searching the Web of Data (Tutorial)

The Plan

- Introduction
- Structured Data on the Web
- Querying Linked Open Data
- Querying the Web of Data
- Query Understanding
- Outlook and Conclusion

Page 4: Searching the Web of Data (Tutorial)

Introduction

- Input: Keywords, Docs
- Desired Output: Document Ranking

Page 5: Searching the Web of Data (Tutorial)

Introduction

- Input: Keywords, the world's Data
- True Desired Output: Address User's Needs

Page 6: Searching the Web of Data (Tutorial)

User's Needs

Needn't be plain information needs

Page 7: Searching the Web of Data (Tutorial)

User's Needs

Needn't be genuine (information) needs

Page 8: Searching the Web of Data (Tutorial)

Our Focus: Information Needs

Page 10: Searching the Web of Data (Tutorial)

Data-Driven Answers

Page 17: Searching the Web of Data (Tutorial)

Data-Driven Answers

addressing other Intents

Page 19: Searching the Web of Data (Tutorial)

The Web as a Database: Entities

Page 21: Searching the Web of Data (Tutorial)

The Web as a Database: Attributes/Relationships

Job Title, Location, Starting Date, Salary, etc.

New search capabilities

Page 23: Searching the Web of Data (Tutorial)

Searching the Web of Data

Outline I

1 Structured data on the Web
  - Semantic markup
  - Semantic Web and Linked Open Data
  - Data management

2 Querying Linked Open Data
  - Browser-based link traversal
  - Keyword search for Linked Data
  - Structured queries

Katja Hose Searching the Web of Data March 24, 2013 – ECIR 2013 2 / 73

Page 24: Searching the Web of Data (Tutorial)

Structured data on the Web

- Semantic markup (metadata, tags) embedded in HTML
  - Microformats, hCard, hCalendar, RDFa
- Knowledge bases
  - Large collections of RDF data
- Linked (Open) Data
  - References between collections of RDF data

Page 25: Searching the Web of Data (Tutorial)

Semantic markup

If a search engine gets some help in better “understanding” the content of a web page, it can create rich snippets that highlight and display certain information.

http://support.google.com/webmasters/bin/answer.py?hl=en&answer=99170

Page 26: Searching the Web of Data (Tutorial)

Microformats

- Efforts started in 2003
- Fixed formats for specific types of information
  - hCard: people, companies, organizations, and places
  - hCalendar: calendaring and events
  - hReview: reviews of products, companies, events, ...
- Cannot represent arbitrary data
- Indexed by Google and Yahoo since 2009

Page 27: Searching the Web of Data (Tutorial)

hCard example

Plain HTML:

<div>
  <img src="www.example.com/bobsmith.jpg" />
  <strong>Bob Smith</strong>
  Senior editor at ACME Reviews
  200 Main St
  Desertville, AZ 12345
</div>

With hCard markup:

<div class="vcard">
  <img class="photo" src="www.example.com/bobsmith.jpg" />
  <strong class="fn">Bob Smith</strong>
  <span class="title">Senior editor</span> at
  <span class="org">ACME Reviews</span>
  <span class="adr">
    <span class="street-address">200 Main St</span>
    <span class="locality">Desertville</span>,
    <span class="region">AZ</span>
    <span class="postal-code">12345</span>
  </span>
</div>

http://support.google.com/webmasters/bin/answer.py?hl=en&answer=146897
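As a rough illustration of how a consumer might read such markup, the sketch below pulls the hCard fields out of the annotated snippet using only Python's standard `html.parser`. The class names come from the slide; the `HCardParser` class, the `parse_hcard` helper, and the choice of fields to collect are assumptions made for this example and not part of any microformats library.

```python
from html.parser import HTMLParser

# hCard class names used on the slide; which ones to collect is our choice.
HCARD_FIELDS = {"fn", "title", "org", "street-address",
                "locality", "region", "postal-code"}
VOID_TAGS = {"img", "br"}  # elements without a closing tag

SAMPLE = """
<div class="vcard">
  <img class="photo" src="www.example.com/bobsmith.jpg" />
  <strong class="fn">Bob Smith</strong>
  <span class="title">Senior editor</span> at
  <span class="org">ACME Reviews</span>
  <span class="adr">
    <span class="street-address">200 Main St</span>
    <span class="locality">Desertville</span>,
    <span class="region">AZ</span>
    <span class="postal-code">12345</span>
  </span>
</div>
"""


class HCardParser(HTMLParser):
    """Collects the text of elements whose class names an hCard field."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._stack = []  # one entry per open element: field name or None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        field = next((c for c in classes if c in HCARD_FIELDS), None)
        self._stack.append(field)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Only text directly inside a field element is recorded.
        field = self._stack[-1] if self._stack else None
        text = data.strip()
        if field and text:
            self.fields[field] = self.fields.get(field, "") + text


def parse_hcard(html):
    parser = HCardParser()
    parser.feed(html)
    parser.close()
    return parser.fields


CARD = parse_hcard(SAMPLE)
```

The plain text " at" and the comma are skipped because they sit outside any field element, which is exactly the information the unannotated version of the snippet cannot provide.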

Page 28: Searching the Web of Data (Tutorial)

RDFa

- Proposed in 2004, W3C recommendation
- Can be used together with any vocabulary (no restriction on schema)
- Can assign URIs as global primary keys to entities

Page 29: Searching the Web of Data (Tutorial)

RDFa example

Plain HTML:

<div>
  My name is Bob Smith but people call me Smithy. Here is my home page:
  <a href="http://www.example.com">www.example.com</a>.
  I live in Albuquerque, NM and work as an engineer at ACME Corp.
</div>

With RDFa markup:

<div xmlns:v="http://rdf.data-vocabulary.org/#" typeof="v:Person">
  My name is <span property="v:name">Bob Smith</span>,
  but people call me <span property="v:nickname">Smithy</span>.
  Here is my homepage:
  <a href="http://www.example.com" rel="v:url">www.example.com</a>.
  I live in Albuquerque, NM and work as an
  <span property="v:title">engineer</span>
  at <span property="v:affiliation">ACME Corp</span>.
</div>

http://support.google.com/webmasters/bin/answer.py?hl=en&answer=146898

Page 30: Searching the Web of Data (Tutorial)

Facebook’s Open Graph Protocol

- Introduction of “like” buttons in 2010
- Allows site owners to determine how entities are displayed in Facebook
- Relies on RDFa for encoding data in HTML pages

http://developers.facebook.com/docs/opengraphprotocol/

Page 31: Searching the Web of Data (Tutorial)

Microdata

- Proposed in 2009 as part of HTML5
- Alternative technique for embedding structured data
- Tries to be simpler than RDFa

Page 32: Searching the Web of Data (Tutorial)

Microdata example

Plain HTML:

<div>
  My name is Bob Smith but people call me Smithy. Here is my home page:
  <a href="http://www.example.com">www.example.com</a>
  I live in Albuquerque, NM and work as an engineer at ACME Corp.
</div>

With Microdata markup:

<div itemscope itemtype="http://data-vocabulary.org/Person">
  My name is <span itemprop="name">Bob Smith</span>
  but people call me <span itemprop="nickname">Smithy</span>.
  Here is my homepage:
  <a href="http://www.example.com" itemprop="url">www.example.com</a>
  I live in Albuquerque, NM and work as an
  <span itemprop="title">engineer</span>
  at <span itemprop="affiliation">ACME Corp</span>.
</div>

http://support.google.com/webmasters/bin/answer.py?hl=en&answer=176035

Page 33: Searching the Web of Data (Tutorial)

schema.org

- “Standardized” vocabulary (2011), supported by Bing, Google, Yahoo, and Yandex
- Asks site owners to embed data to enrich search results
- 200+ types: event, organization, person, place, product, review, ...
- Encoding: basically microdata (RDFa)
- Main usage: highlighting and enriching data snippets in search results

http://support.google.com/webmasters/bin/answer.py?hl=en&answer=99170

Page 34: Searching the Web of Data (Tutorial)

Embedded data in HTML

- RDFa and Microdata usage is growing; microformats are still present
- A rather small set of vocabularies is used
- The content and the vocabularies are strongly focused on the major consumers

Page 35: Searching the Web of Data (Tutorial)

Google Knowledge Graph

- Currently*, more than 500 million entities, such as celebrities, cities, movies, ...
- Consists of commercial third-party data and Web data
- Enriching search results with summaries
- Is increasingly being used by Google to answer queries

* Google Official Blog, http://googleblog.blogspot.co.uk/2012/05/introducing-knowledge-graph-things-not.html

Page 39: Searching the Web of Data (Tutorial)

Semantic Web

[Figure; source: http://www.slideshare.net/lod2project/the-semantic-data-web-sren-auer-university-of-leipzig]

Page 40: Searching the Web of Data (Tutorial)

Semantic Web

- Web 1.0 (standard Web)
- Web 2.0 (data generated by users)
- Web 3.0 (Semantic Web)
  - Machine-readable data
  - URIs for documents and concepts (entities)
  - “Web of data”

[Figure: Web server & Web 1.0]

Page 41: Searching the Web of Data (Tutorial)

[Figure; source: http://www.slideshare.net/cloudofdata/toward-the-data-cloud]

Page 42: Searching the Web of Data (Tutorial)

Standards

Sharing structured data across the Web relies on standards

- Standard graph-based data model: RDF
- Different syntaxes and formats: RDF/XML, RDFa
- Powerful, logic-based schema languages and reasoning: OWL
- Query languages and protocols: HTTP, SPARQL

Page 43: Searching the Web of Data (Tutorial)

Linked Open Data: design issues

Design issues (rules)

1 Use URIs as names for things

2 Use HTTP URIs so that people can look up those names

3 When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)

4 Include links to URIs in other datasets
  - Goal: linking URIs in different data sets that describe the same real-world entity

Page 44: Searching the Web of Data (Tutorial)

Growth: Semantic Web and Linked Open Data

[Figures: snapshots from May 2007, March 2008, July 2009, and September 2011]

Page 49: Searching the Web of Data (Tutorial)

Knowledge bases

- Large collections of semantic data
- YAGO [9], Freebase [10], DBpedia [2], ...
- Mostly the result of information extraction
- Data format is generally RDF
- Often participate as sources in the Linked Open Data cloud

Page 50: Searching the Web of Data (Tutorial)

RDF

- Each resource (entity) is identified by a globally unique URI
- Data is stored in the form of facts
- Triple: (subject, property, object)
  - Subject: URI
  - Predicate/Property: URI
  - Object: URI or literal (strings, integers, booleans, etc.)

Example triple with full URIs:

(http://dbpedia.org/resource/Aalborg, http://dbpedia.org/ontology/country, http://dbpedia.org/resource/Denmark)

Using prefixes:

(dbpedia:Aalborg, dbpedia-owl:country, dbpedia:Denmark)
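The prefix notation is only an abbreviation for full URIs. A minimal sketch of that expansion, assuming the two namespace URIs shown in the example triple; the `PREFIXES` table and the `expand` helper are names invented for this illustration:

```python
# Namespace URIs as they appear in the full-URI form of the example triple.
PREFIXES = {
    "dbpedia": "http://dbpedia.org/resource/",
    "dbpedia-owl": "http://dbpedia.org/ontology/",
}


def expand(term):
    """Turn a prefixed name such as 'dbpedia:Aalborg' into a full URI.

    Anything that is not a string with a known prefix (for example an
    integer literal) is returned unchanged.
    """
    if isinstance(term, str) and ":" in term:
        prefix, local = term.split(":", 1)
        if prefix in PREFIXES:
            return PREFIXES[prefix] + local
    return term


TRIPLE = ("dbpedia:Aalborg", "dbpedia-owl:country", "dbpedia:Denmark")
EXPANDED = tuple(expand(term) for term in TRIPLE)
```

Subject and property are always expanded to URIs, while the object may also be a literal, which `expand` passes through untouched.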

Page 52: Searching the Web of Data (Tutorial)

RDF

Triples connect to graphs ... and possibly other sources

[Figure: RDF graph in which dbpedia:Aalborg is linked to dbpedia:Denmark (dbpedia-owl:country), to dbpedia:North_Denmark_Region (dbpedia-owl:isPartOf), and to the literal 123432 (dbpedia-owl:populationTotal); dbpedia:North_Denmark_Region is linked to dbpedia:Denmark (dbpedia-owl:country). The graph also reaches nodes in other sources: yago:Denmark and geonames:2624886.]

Page 54: Searching the Web of Data (Tutorial)

Schema

- The schema (vocabulary, ontology) can be expressed in OWL
- Definition of classes, properties, restrictions, ...
- Allows for validation and reasoning
- Schema information is also represented as RDF triples

Page 55: Searching the Web of Data (Tutorial)

Managing large amounts of RDF data

Relational RDF data management

- A single relational table
  - Three columns (subject, property, object)
- Property tables
  - n-ary tables with columns for the properties of the same subject
- Binary tables
  - One two-column table for each property

Page 56: Searching the Web of Data (Tutorial)

A single relational table

Example triples

(dbpedia:Aalborg, dbpedia-owl:country, dbpedia:Denmark)
(dbpedia:Aalborg, dbpedia-owl:isPartOf, dbpedia:North_Denmark_Region)
(dbpedia:North_Denmark_Region, dbpedia-owl:country, dbpedia:Denmark)
(dbpedia:Aalborg, dbpedia-owl:populationTotal, 123432)

subject                       property                     object
dbpedia:Aalborg               dbpedia-owl:country          dbpedia:Denmark
dbpedia:Aalborg               dbpedia-owl:isPartOf         dbpedia:North_Denmark_Region
dbpedia:North_Denmark_Region  dbpedia-owl:country          dbpedia:Denmark
dbpedia:Aalborg               dbpedia-owl:populationTotal  123432

- Works with standard relational DBMS and SQL
- Problems: self joins, query optimization
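To make the self-join problem concrete, here is a small sketch that loads the four example triples into one three-column table in an in-memory SQLite database and asks for every subject located in Denmark together with its population. The table name `triples`, the function name, and the decision to store the population as text are assumptions of this example.

```python
import sqlite3

TRIPLES = [
    ("dbpedia:Aalborg", "dbpedia-owl:country", "dbpedia:Denmark"),
    ("dbpedia:Aalborg", "dbpedia-owl:isPartOf", "dbpedia:North_Denmark_Region"),
    ("dbpedia:North_Denmark_Region", "dbpedia-owl:country", "dbpedia:Denmark"),
    ("dbpedia:Aalborg", "dbpedia-owl:populationTotal", "123432"),
]


def danish_subjects_with_population(triples):
    """Return (subject, population) pairs using a self join on the triple table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE triples (subject TEXT, property TEXT, object TEXT)"
    )
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", triples)
    # Every additional property in the question adds one more join of the
    # table with itself.
    rows = conn.execute(
        """
        SELECT c.subject, p.object
        FROM triples AS c
        JOIN triples AS p ON c.subject = p.subject
        WHERE c.property = 'dbpedia-owl:country'
          AND c.object = 'dbpedia:Denmark'
          AND p.property = 'dbpedia-owl:populationTotal'
        """
    ).fetchall()
    conn.close()
    return rows


ROWS = danish_subjects_with_population(TRIPLES)
```

A question touching two properties already needs the table twice; a question touching k properties needs k copies of it, which is what makes optimization hard for this layout.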

Page 58: Searching the Web of Data (Tutorial)

Property tables

Example triples

(dbpedia:Aalborg, dbpedia-owl:country, dbpedia:Denmark)
(dbpedia:Aalborg, dbpedia-owl:isPartOf, dbpedia:North_Denmark_Region)
(dbpedia:North_Denmark_Region, dbpedia-owl:country, dbpedia:Denmark)
(dbpedia:Aalborg, dbpedia-owl:populationTotal, 123432)

Table "city":

subject          country          isPartOf                      populationTotal
dbpedia:Aalborg  dbpedia:Denmark  dbpedia:North_Denmark_Region  123432
dbpedia:Kassel   dbpedia:Germany  NULL                          195530

Table "region":

subject                       country
dbpedia:North_Denmark_Region  dbpedia:Denmark

- Grouping information about entities with similar properties
- n-ary tables for the same subject
- Difficult to create a proper layout
- Null values
- Problems with multi-valued attributes

Page 61: Searching the Web of Data (Tutorial)

Binary tables

Example triples

(dbpedia:Aalborg, dbpedia-owl:country, dbpedia:Denmark)
(dbpedia:Aalborg, dbpedia-owl:isPartOf, dbpedia:North_Denmark_Region)
(dbpedia:North_Denmark_Region, dbpedia-owl:country, dbpedia:Denmark)
(dbpedia:Aalborg, dbpedia-owl:populationTotal, 123432)

Table "dbpedia-owl:country":

subject                       object
dbpedia:Aalborg               dbpedia:Denmark
dbpedia:North_Denmark_Region  dbpedia:Denmark

Table "dbpedia-owl:isPartOf":

subject          object
dbpedia:Aalborg  dbpedia:North_Denmark_Region

Table "dbpedia-owl:populationTotal":

subject          object
dbpedia:Aalborg  123432

- Create a separate table for each property
- Can become inefficient for queries involving many common properties
- Becomes inefficient if there are too many different properties
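The binary-table layout can be sketched in a few lines by grouping the example triples by property. The function names `partition_by_property` and `join_on_subject` are made up for this illustration, and plain Python lists stand in for database tables.

```python
from collections import defaultdict

TRIPLES = [
    ("dbpedia:Aalborg", "dbpedia-owl:country", "dbpedia:Denmark"),
    ("dbpedia:Aalborg", "dbpedia-owl:isPartOf", "dbpedia:North_Denmark_Region"),
    ("dbpedia:North_Denmark_Region", "dbpedia-owl:country", "dbpedia:Denmark"),
    ("dbpedia:Aalborg", "dbpedia-owl:populationTotal", 123432),
]


def partition_by_property(triples):
    """Build one two-column (subject, object) table per property."""
    tables = defaultdict(list)
    for subject, prop, obj in triples:
        tables[prop].append((subject, obj))
    return dict(tables)


def join_on_subject(tables, prop_a, prop_b):
    """Pair up the objects of two properties for every shared subject."""
    right = defaultdict(list)
    for subject, obj in tables.get(prop_b, []):
        right[subject].append(obj)
    return [
        (subject, a, b)
        for subject, a in tables.get(prop_a, [])
        for b in right.get(subject, [])
    ]


TABLES = partition_by_property(TRIPLES)
JOINED = join_on_subject(
    TABLES, "dbpedia-owl:country", "dbpedia-owl:populationTotal"
)
```

A lookup on a single property touches exactly one small table, but every further property in a query adds another join across tables, and a dataset with many distinct properties produces many tables.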

Page 63: Searching the Web of Data (Tutorial)

Native triple stores

RDF-3X [7]

- Dictionary encoding to reduce storage space
- Extensive use of B+-tree indexes
  - All six permutations: SPO, OPS, PSO, SOP, OSP, POS
  - Aggregated indexes: S, P, O, SP, SO, PO, PS, OP, OS
- Triples are materialized in the indexes
- Histograms provide the query optimizer with further statistics
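The two central ideas, dictionary encoding and one index per permutation of subject, property, and object, can be imitated in a few lines. This is only a toy sketch: sorted Python lists with binary search stand in for B+-trees, the aggregated indexes and histograms are left out, and `build_store` and `prefix_scan` are names chosen for this example rather than anything from RDF-3X itself.

```python
from bisect import bisect_left
from itertools import permutations

TRIPLES = [
    ("dbpedia:Aalborg", "dbpedia-owl:country", "dbpedia:Denmark"),
    ("dbpedia:Aalborg", "dbpedia-owl:isPartOf", "dbpedia:North_Denmark_Region"),
    ("dbpedia:North_Denmark_Region", "dbpedia-owl:country", "dbpedia:Denmark"),
    ("dbpedia:Aalborg", "dbpedia-owl:populationTotal", "123432"),
]


def build_store(triples):
    """Dictionary-encode all terms and build six sorted permutation indexes."""
    ids = {}

    def encode(term):
        # Each distinct term gets the next free integer id.
        return ids.setdefault(term, len(ids))

    encoded = [tuple(encode(term) for term in triple) for triple in triples]
    indexes = {}
    for order in permutations(range(3)):
        name = "".join("SPO"[i] for i in order)
        indexes[name] = sorted(tuple(t[i] for i in order) for t in encoded)
    return ids, indexes


def prefix_scan(index, prefix):
    """Return all rows of a sorted index that start with the given prefix."""
    start = bisect_left(index, prefix)
    rows = []
    for row in index[start:]:
        if row[:len(prefix)] != prefix:
            break
        rows.append(row)
    return rows


IDS, INDEXES = build_store(TRIPLES)
# Which subjects have country Denmark? Use the index ordered (P, O, S).
MATCHES = prefix_scan(
    INDEXES["POS"], (IDS["dbpedia-owl:country"], IDS["dbpedia:Denmark"])
)
```

Because every ordering is available, any triple pattern with bound positions can be answered by a single range scan over the index whose leading columns are the bound ones.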

Page 64: Searching the Web of Data (Tutorial)

Column stores

SW-Store [1]

- Binary tables in combination with a column-oriented DBMS (C-Store)
- Sorted tables
- Supports multi-valued attributes (listed in successive rows)
- Increased costs for updates and tuple reconstruction

Page 65: Searching the Web of Data (Tutorial)

More alternatives

- Store RDF data in a matrix with bit-vector compression
- Store RDF as XML and use XML technology
- Graph databases
- ...

Page 67: Searching the Web of Data (Tutorial)

Browser-based link traversal

Who is Carlo Pedersoli?

Page 79: Searching the Web of Data (Tutorial)

Browser-based link traversal

- Browser-based link traversal is the most “natural” way of looking up information using Linked Data
- Might be very tedious and frustrating
- Takes much time
- But you will discover much information that you never intended to search for

Page 81: Searching the Web of Data (Tutorial)

Keyword search for Linked Data

Given a set of keywords, find all relevant information.

Page 82: Searching the Web of Data (Tutorial)

Falcons

- Focus: entities
- Crawling
  - Follow links contained in RDF documents
- Construct virtual documents for entities
  - Literals
  - Human-readable names and descriptions (rdfs:label, rdfs:comment)
- Create indexes
  - Terms in virtual documents
  - Entity classes

Page 83: Searching the Web of Data (Tutorial)

Falcons

Query processing

- Keywords
- Filtering based on classes/types
- Output: entities with snippets

Page 84: Searching the Web of Data (Tutorial)

Falcons

http://ws.nju.edu.cn/falcons/objectsearch/index.jsp

Page 87: Searching the Web of Data (Tutorial)

Sindice

- Original focus: documents and sources
- Indexes
  - URI (Uniform Resource Identifier)
  - IFP (Inverse Functional Properties)
  - Literal

Page 88: Searching the Web of Data (Tutorial)

Sindice

http://sindice.com

Page 90: Searching the Web of Data (Tutorial)

Indexing

- Indexes on virtual documents
- Creating virtual documents based on the data (entity, source, triple, ...)
- Create inverted indexes on URIs, properties, classes, literals, ... or combinations
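The virtual-document idea can be sketched as follows: for each entity, collect the text of its human-readable properties into one document, then build an inverted index from terms to entities. The `ex:` entities and their labels are made-up sample data, and `build_index` and `search` are hypothetical helpers; real systems such as Falcons or Sindice add ranking, class filters, and far more careful text processing.

```python
import re
from collections import defaultdict

# Properties whose objects are treated as searchable text.
TEXT_PROPERTIES = {"rdfs:label", "rdfs:comment"}

# Made-up sample triples for illustration only.
TRIPLES = [
    ("ex:BudSpencer", "rdfs:label", "Bud Spencer"),
    ("ex:BudSpencer", "rdfs:comment", "Actor, born Carlo Pedersoli"),
    ("ex:TerenceHill", "rdfs:label", "Terence Hill"),
    ("ex:TerenceHill", "rdfs:comment", "Actor, born Mario Girotti"),
    ("ex:BudSpencer", "ex:workedWith", "ex:TerenceHill"),
]


def tokenize(text):
    return re.findall(r"\w+", text.lower())


def build_index(triples):
    """Build an inverted index: term -> set of entities whose virtual
    document (concatenated label and comment text) contains the term."""
    documents = defaultdict(list)
    for subject, prop, obj in triples:
        if prop in TEXT_PROPERTIES:
            documents[subject].append(obj)
    index = defaultdict(set)
    for entity, texts in documents.items():
        for term in tokenize(" ".join(texts)):
            index[term].add(entity)
    return index


def search(index, query):
    """Return the entities whose virtual document contains every query term."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


INDEX = build_index(TRIPLES)
```

Indexing by entity is only one choice; building the virtual documents per source or per triple instead changes what a hit means without changing the index structure.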

Page 92: Searching the Web of Data (Tutorial)

Structured queries

- What are the movies that both Carlo Pedersoli (Bud Spencer) and Mario Girotti (Terence Hill) acted in?
- Or what are the movies that only one of them (without the other) acted in?

Page 93: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Structured queries

Structured query language

SPARQL

Query processing strategies

Materialized query processing

Lookup-based query processing

Federated query processing

Page 94: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

SPARQL

Example query

PREFIX dbpedia: <http://dbpedia.org/property/>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
SELECT ?city ?pop WHERE {
  ?city dbpedia-owl:country dbpedia:Denmark .
  ?city dbpedia:populationTotal ?pop .
  FILTER (?pop > 100000)
}

SPARQL

Similar to SQL

Variables start with “?”

Queries consist of triple patterns, e.g.: ?city dbpedia-owl:country dbpedia:Denmark

Joins between triple patterns are expressed by common variables

Filters express additional constraints
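
How triple patterns, shared variables, and filters interact can be sketched with a tiny in-memory evaluator. It is an illustration under simplifying assumptions, not a SPARQL engine: the Aalborg triples follow the slides' example data, `ex:SmallTown` and its population are hypothetical, and `match` and `evaluate` are names invented here.

```python
# Toy data: the Aalborg triples follow the example; ex:SmallTown is hypothetical.
TRIPLES = [
    ("dbpedia:Aalborg", "dbpedia-owl:country", "dbpedia:Denmark"),
    ("dbpedia:Aalborg", "dbpedia:populationTotal", 123432),
    ("ex:SmallTown", "dbpedia-owl:country", "dbpedia:Denmark"),
    ("ex:SmallTown", "dbpedia:populationTotal", 5000),
]

def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    result = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            if result.setdefault(p_term, t_term) != t_term:
                return None  # variable already bound to a different value
        elif p_term != t_term:
            return None  # constant does not match
    return result

def evaluate(patterns, triples):
    """Join all triple patterns: a shared variable must take the same value."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for t in triples
                    if (b2 := match(pattern, t, b)) is not None]
    return bindings

patterns = [
    ("?city", "dbpedia-owl:country", "dbpedia:Denmark"),
    ("?city", "dbpedia:populationTotal", "?pop"),
]
# Corresponds to FILTER (?pop > 100000)
rows = [b for b in evaluate(patterns, TRIPLES) if b["?pop"] > 100000]
print(rows)  # [{'?city': 'dbpedia:Aalborg', '?pop': 123432}]
```

The join happens implicitly: the second pattern is only matched against triples that agree with the value already bound to ?city.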

Page 97: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

SPARQL

[Figure: query graph and data graph. Query graph: nodes ?city, dbpedia:Denmark, ?pop; edges dbpedia-owl:country, dbpedia:populationTotal. Data graph: nodes dbpedia:Aalborg, dbpedia:Denmark, dbpedia:North_Denmark_Region, yago:Denmark, geonames:2624886, and the literal 123432; edges dbpedia-owl:country, dbpedia-owl:isPartOf, dbpedia:populationTotal.]

Page 99: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Query processing strategies

Query processing strategies

Materialized query processing

Lookup-based query processing

Federated query processing

Page 101: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Materialized query processing

Characteristics

Centralized storage

Crawl and download the data

Evaluate queries locally

As an alternative to evaluating queries on huge data sets on a single machine, we can make use of distributed architectures and parallel processing, e.g., MapReduce, NoSQL, P2P, Grid, ...

Problem

Hash partitioning is not optimal for complex queries

Possible solution

Clustered RDF management using graph partitioning [6]
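
Why hash partitioning hurts complex queries can be seen in a small sketch. It assumes hashing on the subject, a common choice but not stated on the slide; the triples, `NUM_NODES`, and the helper names are hypothetical, and `zlib.crc32` is used only because it is stable across runs.

```python
import zlib

NUM_NODES = 3

# Hypothetical triples forming a path: player -> club -> region -> population.
TRIPLES = [
    ("ex:player1", "ex:playsFor", "ex:club1"),
    ("ex:club1", "ex:region", "ex:region1"),
    ("ex:region1", "ex:population", "500000"),
]

def node_for(subject, num_nodes=NUM_NODES):
    """Stable hash partitioning on the subject."""
    return zlib.crc32(subject.encode("utf-8")) % num_nodes

def partition(triples, num_nodes=NUM_NODES):
    """Assign every triple to the node chosen by its subject's hash."""
    nodes = {i: [] for i in range(num_nodes)}
    for triple in triples:
        nodes[node_for(triple[0], num_nodes)].append(triple)
    return nodes

nodes = partition(TRIPLES)
# Nodes that a query following the whole path has to touch:
touched = {node_for(s) for s, _, _ in TRIPLES}
print(nodes)
print(touched)
```

Triples with the same subject always end up together, so star-shaped queries stay local. A path query joins subjects with objects, and those are placed independently of each other, so its triples can be scattered over several nodes.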

Page 104: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Clustered RDF management

Data graph

Page 105: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Clustered RDF management

Graph partitioning

Page 106: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Clustered RDF management

Assigning triples to partitions – 1-hop guarantee

Page 107: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Clustered RDF management

2-hop guarantee
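
One possible reading of the 1-hop and 2-hop guarantees, sketched in Python: after the vertices have been partitioned, each partition additionally stores every triple within n hops of its own vertices. The sketch assumes hops ignore edge direction; the path-shaped data and the function name `n_hop_triples` are hypothetical.

```python
# Hypothetical path-shaped data graph: a -> b -> c -> d.
TRIPLES = [
    ("ex:a", "ex:p", "ex:b"),
    ("ex:b", "ex:p", "ex:c"),
    ("ex:c", "ex:p", "ex:d"),
]

def n_hop_triples(triples, core_vertices, n):
    """Triples a partition stores for an n-hop guarantee (direction ignored)."""
    frontier = set(core_vertices)
    selected = set()
    for _ in range(n):
        # Every triple touching the current frontier is at most one more hop away.
        new = {t for t in triples if t[0] in frontier or t[2] in frontier}
        selected |= new
        frontier |= {t[0] for t in new} | {t[2] for t in new}
    return selected

one_hop = n_hop_triples(TRIPLES, {"ex:a"}, 1)
two_hop = n_hop_triples(TRIPLES, {"ex:a"}, 2)
print(len(one_hop), len(two_hop))  # 1 2
```

A larger n replicates more triples across partitions but lets larger queries be answered without communication.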

Page 108: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Clustered RDF management

Query execution is more efficient in RDF stores than in Hadoop [6]

Goals

Pushing as much of the processing as possible into RDF stores

Minimizing the number of Hadoop jobs

The larger the hop guarantee, the more work is done in RDF stores

Query processing

Choose center of the query graph

Calculate distance from the center to the furthest edge

If distance <= n (the hop guarantee): the query can be handled by the nodes independently, without communication

If distance > n: communication is needed; the query is split up into smaller subqueries whose results are combined in Hadoop
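
The decision procedure above can be sketched as follows, using the footballer query from this tutorial as the query graph. The distance definition is an assumption of this sketch (an edge is one hop further away than its nearer endpoint, edge direction ignored), and the function names are invented for illustration.

```python
from collections import deque

# Query graph of the footballer example, as (subject, predicate, object) edges.
QUERY = [
    ("?player", "rdf:type", "ex:footballer"),
    ("?player", "ex:playsFor", "?club"),
    ("?player", "ex:bornIn", "?region"),
    ("?club", "ex:region", "?region"),
    ("?region", "ex:population", "?pop"),
]

def bfs_distances(edges, start):
    """Hop distances from `start`, ignoring edge direction."""
    neighbours = {}
    for s, _, o in edges:
        neighbours.setdefault(s, set()).add(o)
        neighbours.setdefault(o, set()).add(s)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in neighbours[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def distance_to_furthest_edge(edges, vertex):
    """Assumed definition: an edge is 1 + the distance of its nearer endpoint away."""
    dist = bfs_distances(edges, vertex)
    return max(min(dist[s], dist[o]) + 1 for s, _, o in edges)

def choose_center(edges):
    """Vertex with the smallest distance to its furthest edge."""
    vertices = sorted({v for s, _, o in edges for v in (s, o)})
    return min(vertices, key=lambda v: distance_to_furthest_edge(edges, v))

def needs_communication(edges, n):
    """True if the query exceeds the n-hop guarantee and must be split."""
    center = choose_center(edges)
    return distance_to_furthest_edge(edges, center) > n

print(needs_communication(QUERY, 1), needs_communication(QUERY, 2))  # True False
```

Under these assumptions the footballer query fits a 2-hop guarantee but not a 1-hop guarantee.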

Page 110: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Clustered RDF management

Find football players playing for clubs in a region where they were born

SELECT ?player ?club ?region WHERE {
  ?player rdf:type ex:footballer .
  ?player ex:playsFor ?club .
  ?player ex:bornIn ?region .
  ?club ex:region ?region .
  ?region ex:population ?pop .
}

[Figure: query graph with nodes ?player, ?club, ?region, ?pop, ex:footballer and edges rdf:type, ex:playsFor, ex:bornIn, ex:region, ex:population]

Page 112: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Clustered RDF management

Find football players playing for clubs in a region where they were born

[Figure: query graph with nodes ?player, ?club, ?region, ?pop, ex:footballer and edges rdf:type, ex:playsFor, ex:bornIn, ex:region, ex:population]

Page 114: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Clustered RDF management

If the query is too “big”, the query is decomposed into multiple smaller subqueries, and the results are combined using MapReduce.

[Figure: query graph with nodes ?art1, ?art2, ?author1, ?author2, ?name1, ?name2, ?title2 and edges ex:cites, ex:writtenBy, ex:hasTitle, ex:hasName, isOwned]

Workload-Aware Replication [5]

Replicate additional triples at the partition boundaries, based on the query workload

Avoid MapReduce by using a designated coordinator node

Page 116: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Query processing strategies

Query processing strategies

Materialized query processing

Lookup-based query processing
  No statistics or indexes
  Evaluate parts of the query locally
  Dereference URIs in intermediate solutions, download the data
  Use downloaded data to compute other parts of the query, dereference ...

Federated query processing
  Based on technologies originally developed for distributed database systems, P2P systems, and data integration
  Data is stored at the sources
  Evaluate parts of the query on remote sources
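
The lookup-based strategy can be sketched as a loop that alternates between dereferencing URIs and evaluating on the downloaded data. Everything here is simulated: the `WEB` dictionary stands in for HTTP lookups, the movie title is hypothetical, and `dereference` and `traverse` are names made up for this sketch.

```python
# Hypothetical "Web": dereferencing a URI returns the triples published there.
WEB = {
    "ex:BudSpencer": [("ex:BudSpencer", "ex:actedIn", "ex:m1234")],
    "ex:m1234": [("ex:m1234", "ex:title", "Example Movie"),
                 ("ex:TerenceHill", "ex:actedIn", "ex:m1234")],
    "ex:TerenceHill": [("ex:TerenceHill", "ex:actedIn", "ex:m1234")],
}

def dereference(uri):
    """Stand-in for an HTTP lookup of `uri`; unknown URIs yield no data."""
    return WEB.get(uri, [])

def traverse(seed_uris, max_lookups=10):
    """Dereference URIs found in downloaded data until nothing new appears."""
    local_data, seen, todo = set(), set(), list(seed_uris)
    while todo and len(seen) < max_lookups:
        uri = todo.pop()
        if uri in seen:
            continue
        seen.add(uri)
        for triple in dereference(uri):
            local_data.add(triple)
            # Subjects and objects of downloaded triples are candidates for lookup.
            for term in (triple[0], triple[2]):
                if term.startswith("ex:") and term not in seen:
                    todo.append(term)
    return local_data, seen

data, looked_up = traverse(["ex:BudSpencer"])
co_actors = {s for s, p, o in data if p == "ex:actedIn" and o == "ex:m1234"}
print(sorted(co_actors))  # ['ex:BudSpencer', 'ex:TerenceHill']
```

No index or statistics are consulted: which sources get contacted is determined purely by the URIs that show up in intermediate results.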

Page 118: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Federated query processing

Federated SPARQL query processing [8]

[Figure: federated SPARQL query processing pipeline. A SPARQL request passes through parsing, source selection (per triple pattern, using SPARQL ASK queries and a cache), global optimizations (groupings + join order), and query execution (bound joins), which yields the query result. Subquery generation: evaluation at the relevant endpoints (SPARQL endpoint 1, SPARQL endpoint 2, ..., SPARQL endpoint N), followed by local aggregation of partial results.]

Assumption

The sources are able (and willing) to evaluate SPARQL queries (SPARQL endpoints).

Page 119: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Source selection

Goal

Identify sources that might contribute to the query result

Approaches

Naive

SPARQL ASK requests and caching

Statistics and indexes
  Keyword indexes
  Predicate URIs, types of instances
  URI indexes
  Frequent paths
  Service-level descriptions
  VoiD statistics
  Histograms
  ...

Page 121: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Naive federated query processing

Scenario

SPARQL endpoints

No statistics/indexes

Example query

SELECT ?Country ?Capital ?CountryPop ?CapitalPop WHERE {
  ?Country ex:capital ?Capital .
  ?Country ex:population ?CountryPop .
  ?Capital ex:population ?CapitalPop .
}

Page 124: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Naive federated query processing

Send message with the 1st triple pattern (?Country ex:capital ?Capital) to all 4 sources → 4 requests

Receive 200 mappings for ?Country and ?Capital, e.g., ?Country=ex:Germany, ?Capital=ex:Berlin

Page 125: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Naive federated query processing

Use results to evaluate 2nd triple pattern (nested loop)

200 × 4 requests, e.g., SELECT ?CountryPop WHERE { ex:Germany ex:population ?CountryPop . }

150 mappings

Page 126: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Naive federated query processing

Use results to evaluate 3rd triple pattern (nested loop)

150 × 4 requests, e.g., SELECT ?CapitalPop WHERE { ex:Berlin ex:population ?CapitalPop . }

Page 127: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Naive federated query processing

In total: 4 + 200 × 4 + 150 × 4 = 1404 requests

Many (unnecessary) requests sent to the sources!
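
The request count follows directly from the nested-loop scheme and can be reproduced in a few lines. The function name `naive_request_count` is made up for this sketch; the numbers are the ones from the example.

```python
def naive_request_count(num_sources, mappings_per_step):
    """Requests for a nested-loop evaluation without source selection.

    The first triple pattern is sent once to every source; each later pattern
    is sent once per intermediate mapping to every source.
    """
    total = num_sources  # first triple pattern
    for mappings in mappings_per_step:
        total += mappings * num_sources
    return total

# 4 sources; 200 mappings feed the 2nd pattern, 150 mappings feed the 3rd.
print(naive_request_count(4, [200, 150]))  # 1404
```

The count grows with the number of sources times the size of every intermediate result, which is why source selection pays off.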

Page 129: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Source selection with SPARQL ASK

Does not require special cooperation from the sources

Example query

SELECT ?Country ?Capital ?CountryPop ?CapitalPop WHERE {
  ?Country ex:capital ?Capital .
  ?Country ex:population ?CountryPop .
  ?Capital ex:population ?CapitalPop .
}

Each source receives a message for each triple pattern

Response: true/false

ASK { ?Country ex:capital ?Capital . }

ASK { ?Capital ex:population ?CapitalPop . }

ASK { ?Country ex:population ?CountryPop . }
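
ASK-based source selection with a cache can be sketched against simulated endpoints. The three sources and their predicate sets are hypothetical, and `ask` only stands in for sending a real `ASK` query; the cache key used here (endpoint plus predicate) is a simplification chosen for this sketch.

```python
# Hypothetical endpoints, each reduced to the set of predicates it holds data for.
ENDPOINTS = {
    "source1": {"ex:capital", "ex:population"},
    "source2": {"ex:population"},
    "source3": {"ex:birthPlace"},
}

ask_cache = {}
requests_sent = 0

def ask(endpoint, predicate):
    """Stand-in for `ASK { ?s <predicate> ?o }`; answers are cached per pair."""
    global requests_sent
    key = (endpoint, predicate)
    if key not in ask_cache:
        requests_sent += 1  # only cache misses cause a remote request
        ask_cache[key] = predicate in ENDPOINTS[endpoint]
    return ask_cache[key]

def select_sources(triple_patterns):
    """Return, per triple pattern, the endpoints that answered true."""
    return {
        pattern: [e for e in sorted(ENDPOINTS) if ask(e, pattern[1])]
        for pattern in triple_patterns
    }

patterns = [
    ("?Country", "ex:capital", "?Capital"),
    ("?Country", "ex:population", "?CountryPop"),
    ("?Capital", "ex:population", "?CapitalPop"),
]
relevant = select_sources(patterns)
print(relevant[patterns[0]], requests_sent)  # ['source1'] 6
```

The third pattern costs no requests because its ASK answers are already cached from the second one, and source3 is never contacted during query execution.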

Page 133: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Source selection with VoID statistics

Example: DBpedia (footnote 2)

Prefixes
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix dbpo: <http://dbpedia.org/ontology/> .
...

General information

Basic statistics

Predicate statistics

Class statistics

Footnote 2: http://code.google.com/p/fbench/source/browse/trunk/EvalBenchmark/suites/SPLENDID/void/dbpedia3.5.1_subset-void.n3?r=119

Page 134: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Source selection with VoID statistics

Example: DBpedia (footnote 2)

Prefixes

General information
void:sparqlEndpoint <http://dbpedia.org/sparql> ;
...

Basic statistics

Predicate statistics

Class statistics

Page 135: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Source selection with VoID statistics

Example: DBpedia (footnote 2)

Prefixes

General information

Basic statistics
void:triples "43620475"^^xsd:integer ;
void:entities "2222456"^^xsd:integer ;
void:properties "1063"^^xsd:integer ;
void:distinctSubjects "9495865"^^xsd:integer ;
void:distinctObjects "13636604"^^xsd:integer ;

Predicate statistics

Class statistics

Page 136: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Source selection with VoID statistics

Example: DBpedia (footnote 2)

Prefixes

General information

Basic statistics

Predicate statistics
void:propertyPartition [
  void:property dbpo:aSide ;
  void:triples "1600"^^xsd:integer ;
  void:distinctSubjects "1552"^^xsd:integer ;
  void:distinctObjects "1554"^^xsd:integer
] , [
  void:property dbpo:abbreviation ;
  void:triples "1144"^^xsd:integer ;
  void:distinctSubjects "1141"^^xsd:integer ;
  void:distinctObjects "1096"^^xsd:integer
] , [...] ;

Class statistics

Page 137: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Source selection with VoID statistics

Example: DBpedia (footnote 2)

Prefixes

General information

Basic statistics

Predicate statistics

Class statistics
void:classPartition [
  void:class dbpo:Activity ;
  void:entities "1234"^^xsd:integer
] , [
  void:class dbpo:Actor ;
  void:entities "37898"^^xsd:integer
] , [...]
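
How such statistics can drive source selection is sketched below. The DBpedia counts are the predicate statistics from the VoID example; the second source `other`, its count, and the function names are hypothetical, and a real system would parse the VoID files rather than hard-code a dictionary.

```python
# Per-source predicate statistics in the spirit of void:propertyPartition.
# DBpedia counts come from the VoID example; "other" is hypothetical.
VOID_STATS = {
    "dbpedia": {"dbpo:aSide": 1600, "dbpo:abbreviation": 1144},
    "other": {"dbpo:abbreviation": 20},
}

def select_sources(predicate):
    """Sources whose statistics list the predicate, with the triple count as estimate."""
    return {
        source: stats[predicate]
        for source, stats in VOID_STATS.items()
        if stats.get(predicate, 0) > 0
    }

def order_patterns(predicates):
    """Order triple patterns by estimated result size, smallest first."""
    return sorted(predicates, key=lambda p: sum(select_sources(p).values()))

print(select_sources("dbpo:aSide"))  # {'dbpedia': 1600}
print(order_patterns(["dbpo:aSide", "dbpo:abbreviation"]))
```

Unlike ASK requests, this needs no communication at query time, but it depends on the sources publishing (and maintaining) the statistics.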

Page 139: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Histogram-based indexing

Index complete triples

Index (s,p,o) in combinations capturing correlation

Based on multidimensional histograms

Identify relevant sources for triple patterns

Identify relevant sources for joins

Page 140: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Histogram-based indexing

Histograms

Transform triple statements into numerical space (hash functions)
(ex:BudSpencer ex:actedIn ex:m1234) → (323, 232, 124)

Insert into the matching bucket

Lookup: transform triple pattern into numerical space
(ex:BudSpencer ex:actedIn ?m)
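
The insert and lookup steps can be sketched with a three-dimensional histogram kept in a `Counter`. The hash function (`zlib.crc32`), the value range, and the number of buckets per dimension are arbitrary assumptions of this sketch, so the coordinates differ from the (323, 232, 124) shown above.

```python
import zlib
from collections import Counter

BUCKETS_PER_DIM = 4
HASH_RANGE = 1000  # assumed size of each numerical dimension

def h(term):
    """Map an RDF term to [0, HASH_RANGE); crc32 is an arbitrary stable choice."""
    return zlib.crc32(term.encode("utf-8")) % HASH_RANGE

def bucket(value):
    """Index of the equi-width bucket that contains `value`."""
    return value * BUCKETS_PER_DIM // HASH_RANGE

def insert(histogram, triple):
    """Hash (s, p, o) into a point and count it in the matching 3-D bucket."""
    histogram[tuple(bucket(h(t)) for t in triple)] += 1

def lookup(histogram, pattern):
    """Sum the counts of all buckets compatible with the pattern.

    Constants select one bucket range per dimension; variables match any.
    """
    wanted = [None if t.startswith("?") else bucket(h(t)) for t in pattern]
    return sum(count for key, count in histogram.items()
               if all(w is None or w == k for w, k in zip(wanted, key)))

source_histogram = Counter()
insert(source_histogram, ("ex:BudSpencer", "ex:actedIn", "ex:m1234"))
print(lookup(source_histogram, ("ex:BudSpencer", "ex:actedIn", "?m")))  # 1
```

A source is considered relevant if the lookup returns a count above zero. Because many terms share a bucket, the histogram can report false positives but never misses a triple that was inserted.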

Page 143: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Histogram-based indexing

[Figure: two-dimensional histogram over the subject (s) and object (o) dimensions, showing the number of triples per bucket]

Page 144: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Histogram-based indexing

Alternative to histograms (QTree or clustering) [11]:

[Figure: regions A (with A1, A2), B (with B1, B2), and C over the subject (s) and object (o) dimensions]

Page 145: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Structured queries

Histogram-based indexing

Join cardinality estimation and source selection

Determine buckets for 1st triple pattern

Determine buckets for 2nd triple pattern

Determine buckets that overlap in the join dimension (e.g., subject)

Estimate cardinality based on the degree of overlap
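
The four steps can be sketched for a single join dimension. The estimate below assumes that values are spread uniformly inside each bucket, which is a simplification chosen for this sketch and not necessarily the estimator used in the cited work; the bucket boundaries and counts are hypothetical.

```python
def overlap(b1, b2):
    """Width of the intersection of two half-open ranges [lo, hi)."""
    return max(0, min(b1[1], b2[1]) - max(b1[0], b2[0]))

def estimate_join(buckets1, buckets2):
    """Estimated join size, assuming uniformly spread values inside each bucket."""
    total = 0.0
    for lo1, hi1, count1 in buckets1:
        for lo2, hi2, count2 in buckets2:
            width = overlap((lo1, hi1), (lo2, hi2))
            if width == 0:
                continue  # these two buckets cannot produce join results
            in1 = count1 * width / (hi1 - lo1)  # share of bucket 1 inside the overlap
            in2 = count2 * width / (hi2 - lo2)  # share of bucket 2 inside the overlap
            total += in1 * in2 / width  # pairs that hit the same value
    return total

# Hypothetical buckets on the join dimension: (low, high, number of triples).
first_pattern = [(0, 100, 50), (100, 200, 10)]
second_pattern = [(50, 150, 20), (300, 400, 99)]
print(estimate_join(first_pattern, second_pattern))  # 6.0
```

For source selection, a pair of sources is only worth querying for the join if the estimate for their buckets is above zero.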

[Figure: buckets of the 1st BGP and the 2nd BGP plotted over the subject and object dimensions]

Page 146: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

Summary: querying Linked Open Data

Querying Linked Open Data (LOD)

Browser-based link traversal
  Most natural way of looking up Linked Data

Keyword search for Linked Open Data
  Several search engines available, coming in different flavors

SPARQL query processing
  Materialized query processing
  Lookup-based query processing
  Federated query processing
    Relies on powerful and available sources
    Statistics require additional cooperation

Page 147: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

References I

[1] Daniel J. Abadi, Adam Marcus, Samuel Madden, and Kate Hollenbach. SW-Store: a vertically partitioned DBMS for Semantic Web data management. VLDB J., 18(2):385–406, 2009.

[2] Sören Auer, Christian Bizer, Georgi Kobilarov, Jens Lehmann, Richard Cyganiak, and Zachary G. Ives. DBpedia: A nucleus for a Web of open data. In ISWC, 2007.

[3] Christian Bizer. Topology of the Web of Data, 2012. Keynote LWDM 2012. http://www.wiwiss.fu-berlin.de/en/institute/pwo/bizer/research/publications/Bizer-TopologyWoD-LWDM2012-BEWEB2012.pdf?1361005355.

[4] Gianluca Demartini, Peter Mika, Thanh Tran, and Arjen P. de Vries. From Expert Finding to Entity Search on the Web, 2012. Tutorial ECIR 2012. http://diuf.unifr.ch/main/xi/EntitySearchTutorial.

[5] Katja Hose and Ralf Schenkel. WARP: Workload-Aware Replication and Partitioning for RDF. In DESWEB’13, 2013.

[6] Jiewen Huang, Daniel J. Abadi, and Kun Ren. Scalable SPARQL Querying of Large RDF Graphs. PVLDB, 4(11):1123–1134, 2011.

Page 148: Searching the Web of Data (Tutorial)

Searching the Web of Data

Querying Linked Open Data

References II

[7] Thomas Neumann and Gerhard Weikum. RDF-3X: a RISC-style engine for RDF. PVLDB, 1(1):647–659, 2008.

[8] Andreas Schwarte, Peter Haase, Katja Hose, Ralf Schenkel, and Michael Schmidt. FedX: Optimization techniques for federated query processing on linked data. In ISWC 2011, pages 601–616, 2011.

[9] Fabian M. Suchanek, Gjergji Kasneci, and Gerhard Weikum. Yago: a core of semantic knowledge. In WWW, 2007.

[10] Metaweb Technologies. The freebase project. http://freebase.com.

[11] Jürgen Umbrich, Katja Hose, Marcel Karnstedt, Andreas Harth, and Axel Polleres. Comparing data summaries for processing live queries over linked data. World Wide Web, 14:495–544, 2011.

Page 149: Searching the Web of Data (Tutorial)

The Plan

• Introduction

• Structured Data on the Web

• Querying Linked Open Data

• Querying the Web of Data

• Query Understanding

• Outlook and Conclusion

Page 150: Searching the Web of Data (Tutorial)

Attribute-Based Entity Search

Page 151: Searching the Web of Data (Tutorial)

Graph-Based Entity Search

Page 152: Searching the Web of Data (Tutorial)

Graph-Based Entity Search

YAGO2 Browser: “Which companies were created during the last century in Silicon Valley?”

"#$$%&'!('!%)*!+,-../*!000!,-..!1(2'!3(4#!56%&7

Page 153: Searching the Web of Data (Tutorial)

Graph Query Processing

• Smart indexing of triples

• Join Order Optimization: all people vs. all Nobel prize winners

RDF-3X (Neumann et al. 2008): “Find Nobel Prize winners who survived both world wars and their own child”

Page 154: Searching the Web of Data (Tutorial)


Page 155: Searching the Web of Data (Tutorial)

Graph Query Relaxation

• Problem: Structured Data often incomplete

• Solution: Careful Query Relaxation, e.g.
bornIn → citizenOf, livesIn
wonAward → nominatedFor, shortListedFor

• Possible Approach: Language Models (Elbassuoni et al. ESWC 2011)
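
A minimal sketch of predicate relaxation, using the two rules from the slide. The facts, the `answer` function and the fixed penalty of 0.5 are hypothetical; the cited approach scores relaxed answers with language models, which this stand-in does not attempt.

```python
# Relaxation rules following the slide's examples.
RELAXATIONS = {
    "bornIn": ["citizenOf", "livesIn"],
    "wonAward": ["nominatedFor", "shortListedFor"],
}

# Toy facts (subject, predicate, object), invented for illustration.
FACTS = {
    ("anna", "livesIn", "Berlin"),
    ("ben", "bornIn", "Berlin"),
    ("cleo", "nominatedFor", "BookerPrize"),
}

def answer(predicate, obj, penalty=0.5):
    """Return {subject: score}; exact matches score 1.0, relaxed ones less."""
    candidates = [(predicate, 1.0)]
    candidates += [(p, penalty) for p in RELAXATIONS.get(predicate, [])]
    scores = {}
    for pred, score in candidates:
        for s, p, o in FACTS:
            if p == pred and o == obj:
                scores[s] = max(scores.get(s, 0.0), score)
    return scores

print(answer("bornIn", "Berlin"))
```

Answers found only through a relaxed predicate stay in the result but rank below exact matches.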

Page 156: Searching the Web of Data (Tutorial)

Deduplication

Page 157: Searching the Web of Data (Tutorial)

Deduplication

Page 158: Searching the Web of Data (Tutorial)

Deduplication

Balog (IJCAI LHD Workshop 2011): extract product attributes to improve product deduplication

Page 159: Searching the Web of Data (Tutorial)

Deduplication

Page 160: Searching the Web of Data (Tutorial)

Deduplication

• LINDA (Böhm et al. CIKM 2012)

• Automatically create new links while accounting for dependencies

• MapReduce-based algorithm for scalability

Page 161: Searching the Web of Data (Tutorial)

Deduplication

de Melo & Weikum (2010). Untangling the Cross-Lingual Link Structure of Wikipedia. ACL 2010

Bad entity links lead to inconsistent equivalence classes

→ Use distinctness evidence to separate distinct entities
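
How a single bad link produces an inconsistent equivalence class can be illustrated as follows. The links and page names are hypothetical, and the distinctness criterion used here (two pages from the same language edition cannot be the same entity) is one simple assumption. The sketch only detects suspicious classes; the separation algorithm of the cited paper is not reproduced.

```python
from collections import defaultdict

def components(links):
    """Group pages into connected components using union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for page in list(parent):
        groups[find(page)].add(page)
    return list(groups.values())

def inconsistent(component):
    """A class is suspicious if it holds two pages from one language edition."""
    languages = [page.split(":", 1)[0] for page in component]
    return len(languages) != len(set(languages))

# Hypothetical cross-lingual links; the last one is a bad link.
links = [
    ("en:Television", "de:Fernsehen"),
    ("en:TV set", "de:Fernsehgerät"),
    ("de:Fernsehen", "en:TV set"),
]
for comp in components(links):
    print(sorted(comp), inconsistent(comp))
```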

Page 162: Searching the Web of Data (Tutorial)

Deduplication Example: Lexvo.org

de Melo & Weikum (ISWC 2008)

Languages, Scripts, Countries, Characters, Words, Word Senses, etc.

Hub in Digital Libraries and Linked Linguistic Data clouds

Page 163: Searching the Web of Data (Tutorial)

Deduplication

Dalton, Mika & Blanco (CIKM 2011):

Ranking needs to take into account coreference/deduplication results

Page 164: Searching the Web of Data (Tutorial)

Next: Ranking

Page 165: Searching the Web of Data (Tutorial)

INEX Entity Track

• Goal: given a list query, find and rank relevant entities from an XML collection (Wikipedia)

• Example: "European countries where I can pay with Euros"

• Supplementary input information

• a) Category: Countries (Entity Ranking Task)

• b) Example entities: France, Germany, Spain (List Completion Task)

Page 166: Searching the Web of Data (Tutorial)

TREC Entity Track

• Ranking entities found in a plaintext collection (ClueWeb)

• 2009 task: find related entities

<num>7</num>
<entity_name>Boeing 747</entity_name>
<entity_URL>clueweb09-en0005-75-02292</entity_URL>
<target_entity>organization</target_entity>
<narrative>Airlines that currently use Boeing 747 planes.</narrative>

Page 167: Searching the Web of Data (Tutorial)

Profile-Based Ranking

1. Use named entity detection tools to find named entities in the document collection.

2. For each entity, use the occurrence contexts to create a virtual document (serving as a profile).

3. Then use regular IR methods to rank these profile documents.

(Balog et al. TOIS 2011)
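
The three steps above can be sketched as follows. The corpus and its entity annotations are invented (a real pipeline would run a named entity detection tool), and the summed term frequency is only a stand-in for a regular IR ranking function such as BM25 or a language model.

```python
from collections import Counter, defaultdict

# Toy corpus with pre-annotated entity mentions (an assumption).
DOCS = [
    ("Marie Curie won the Nobel Prize in physics", ["Marie Curie"]),
    ("Marie Curie discovered polonium and radium", ["Marie Curie"]),
    ("Albert Einstein developed the theory of relativity", ["Albert Einstein"]),
]

def build_profiles(docs):
    """Merge each entity's occurrence contexts into one virtual document."""
    profiles = defaultdict(Counter)
    for text, entities in docs:
        for entity in entities:
            profiles[entity].update(text.lower().split())
    return profiles

def rank(query, profiles):
    """Score profiles by summed query-term frequency, highest first."""
    terms = query.lower().split()
    scored = [(sum(p[t] for t in terms), e) for e, p in profiles.items()]
    return [e for score, e in sorted(scored, reverse=True) if score > 0]

print(rank("nobel prize physics", build_profiles(DOCS)))
```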

Page 168: Searching the Web of Data (Tutorial)

Text Result-Based Ranking

1. First find relevant documents

2. Extract entities from these documents

3. Use various sorts of ranking techniques (e.g. frequency, language modeling, Wikipedia as background knowledge)
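
A minimal sketch of this pipeline, using frequency as the ranking technique. The documents, their entity annotations and the term-overlap notion of "relevant" are all illustrative assumptions.

```python
from collections import Counter

# Toy documents with pre-extracted entities (invented for illustration).
DOCS = [
    ("Paris and Lyon are cities in France", ["Paris", "Lyon", "France"]),
    ("Paris hosts the Louvre museum", ["Paris", "Louvre"]),
    ("Berlin is the capital of Germany", ["Berlin", "Germany"]),
]

def rank_entities(query, docs):
    terms = set(query.lower().split())
    # Step 1: find relevant documents (here: any query term overlap).
    relevant = [(text, ents) for text, ents in docs
                if terms & set(text.lower().split())]
    # Steps 2 and 3: collect entities and rank them by frequency.
    counts = Counter(e for _, ents in relevant for e in ents)
    return counts.most_common()

print(rank_entities("louvre france", DOCS))
```

Entities that recur across many relevant documents rise to the top, which is why this is different from ranking per-entity profiles.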


Page 169: Searching the Web of Data (Tutorial)

Taxonomic Filtering

• Filter entities based on their ontological type (e.g. person, organization, or more specific: museum, movie)

• Example: try finding example entities for the target type and then comparing candidates with them (Vechtomova 2010)
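
Type-based filtering can be sketched with a small subclass hierarchy. The taxonomy, the entity-to-type assignments and the function names are hypothetical; a real system would take them from a knowledge base.

```python
# Hypothetical mini-taxonomy: child type -> parent type.
SUBCLASS_OF = {
    "museum": "organization",
    "movie": "creative_work",
    "organization": "entity",
    "creative_work": "entity",
    "person": "entity",
}

# Hypothetical type assignments for a few candidate entities.
ENTITY_TYPE = {
    "Louvre": "museum",
    "Casablanca": "movie",
    "Ada Lovelace": "person",
}

def is_a(entity_type, target):
    """Walk up the taxonomy until the target type or the root is reached."""
    while entity_type is not None:
        if entity_type == target:
            return True
        entity_type = SUBCLASS_OF.get(entity_type)
    return False

def filter_by_type(candidates, target):
    """Keep only candidates whose type is the target or a subtype of it."""
    return [c for c in candidates if is_a(ENTITY_TYPE.get(c), target)]

print(filter_by_type(["Louvre", "Casablanca", "Ada Lovelace"], "organization"))
```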

Page 170: Searching the Web of Data (Tutorial)
Page 171: Searching the Web of Data (Tutorial)
Page 172: Searching the Web of Data (Tutorial)

MENTA: Multilingual Taxonomy (from WordNet + 270 Wikipedias)

de Melo & Weikum. CIKM 2010. http://tiny.cc/uwnapi

Page 173: Searching the Web of Data (Tutorial)

Ranking Subgraphs

Before: Find entities
Now: Find whole subgraphs, e.g. matching keywords

Elbassuoni & Blanco. CIKM 2011: Language Model-based Ranking for "comedy academy award"

Page 174: Searching the Web of Data (Tutorial)

Ranking Subgraphs

• NAGA (Kasneci et al. 2008)

Statistical language model for ranking entity-relationship subgraphs

They all have doctoral degrees from German universities

(honorary doctorate for the Dalai Lama).

Page 175: Searching the Web of Data (Tutorial)

Ranking Subgraphs

Subgraphs should be dense and informative

Typical Approach: find Steiner Trees

Example: Angela Merkel, Arnold Schwarzenegger

STAR. Kasneci et al. ICDE 2009
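
With only two query entities and unweighted edges, a minimal Steiner tree reduces to a shortest path, which makes for a compact illustration. The graph below is a toy assumption, and this breadth-first search is not the STAR algorithm, which handles more terminals and edge weights.

```python
from collections import deque

# Toy undirected knowledge graph; the edges are illustrative assumptions.
EDGES = [
    ("Angela Merkel", "Germany"),
    ("Angela Merkel", "politician"),
    ("Arnold Schwarzenegger", "politician"),
    ("Arnold Schwarzenegger", "Austria"),
    ("Austria", "Europe"),
    ("Germany", "Europe"),
]

def shortest_path(edges, source, target):
    """Breadth-first search; returns one shortest path as a list of nodes."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, []).append(a)
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in graph.get(node, []):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None  # the two entities are not connected

print(shortest_path(EDGES, "Angela Merkel", "Arnold Schwarzenegger"))
```

The short connection via "politician" is preferred over the longer one via Germany, Europe and Austria, which matches the intuition that result subgraphs should be dense.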

Page 176: Searching the Web of Data (Tutorial)

Questions?

Page 177: Searching the Web of Data (Tutorial)

The Plan

• Introduction

• Structured Data on the Web

• Querying Linked Open Data

• Querying the Web of Data

• Query Understanding

• Outlook and Conclusion

Page 178: Searching the Web of Data (Tutorial)

Named Entity Recognition

More than 40% of Web search engine queries aim at entities (Pound et al. WWW 2010)

e.g. people, places, restaurants, products, companies

Page 179: Searching the Web of Data (Tutorial)

Named Entity Recognition

Google Zeitgeist 2012 Top Trending Searches

(credit: Robert Neumayer)

Page 180: Searching the Web of Data (Tutorial)

Named Entity Recognition

Page 181: Searching the Web of Data (Tutorial)

Named Entity Recognition

Research Challenge: How to measure user satisfaction? Often apparently no feedback although user is happier.

Page 182: Searching the Web of Data (Tutorial)

Named Entity Disambiguation

Page 183: Searching the Web of Data (Tutorial)

Named Entity Disambiguation

Page 184: Searching the Web of Data (Tutorial)

List Query Detection

12% of all queries (Mika et al.)

Page 185: Searching the Web of Data (Tutorial)

List Query Detection

Page 186: Searching the Web of Data (Tutorial)

List Query Detection

Page 187: Searching the Web of Data (Tutorial)

List Query Detection

Google as of 2013-03

Page 188: Searching the Web of Data (Tutorial)

List Query Detection

• Fine-grained, with attributes: "Intellectual property lawyers in Palo Alto"

• Ranking/Opinion Mining: "Good dentists in Brooklyn", "best ramen in Oakland"

• Type-Specific UIs?

Page 189: Searching the Web of Data (Tutorial)

List Query Answers

• Tonon, Demartini & Cudré-Mauroux (SIGIR 2012):

• First find relevant entities in snippets found in regular inverted index

• Then re-rank and extend list by discovering similar entities using Structured Data repository

Page 190: Searching the Web of Data (Tutorial)

Attribute Queries

• Currently around 5% of Web queries, e.g. "zip code waterville maine" (Peter Mika)

Page 191: Searching the Web of Data (Tutorial)

Attribute Queries

• "meg ryan romance" matches different attributes (fields)
cast: meg ryan
genre: romance

• Naïve solution: Compute BM25 for each field, combine

• Problem: each BM25 value remains unaware of which other query terms were matched in other fields

Page 192: Searching the Web of Data (Tutorial)

Attribute Queries with BM25F

• For each term, aggregate across different fields.

• Then aggregate across the terms.

• Also: Use field-specific weights

Source: Robertson and Zaragoza (2007)
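
The order of aggregation can be made concrete with a simplified BM25F-style score. This is a sketch under stated assumptions, not the full formula from the source: field length normalization is omitted, and the field weights, `K1` and the idf values are made up.

```python
FIELD_WEIGHTS = {"cast": 2.0, "genre": 1.0}  # hypothetical field weights
K1 = 1.2  # saturation parameter

def bm25f(query_terms, doc, idf):
    """Simplified BM25F: combine fields per term first, then saturate.

    Field length normalization is left out to keep the sketch short.
    """
    score = 0.0
    for term in query_terms:
        # Aggregate the term's weighted frequency across all fields.
        tf = sum(weight * doc.get(field, "").lower().split().count(term)
                 for field, weight in FIELD_WEIGHTS.items())
        # Saturate once per term, then aggregate across the terms.
        score += idf.get(term, 0.0) * tf / (K1 + tf)
    return score

movie = {"cast": "meg ryan tom hanks", "genre": "romance comedy"}
other = {"cast": "ryan gosling", "genre": "drama"}
idf = {"meg": 1.0, "ryan": 0.5, "romance": 1.0}  # made-up idf values
query = ["meg", "ryan", "romance"]
print(bm25f(query, movie, idf), bm25f(query, other, idf))
```

Because saturation is applied after the fields are combined, a term that matches in several fields is not counted as several independent matches, which is the weakness of the per-field combination.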

Page 193: Searching the Web of Data (Tutorial)

Attribute Queries with PRMS

• Idea: Field-specific weights should also depend on specific query word!

Source: Kim, Xue & Croft (2009)
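
A sketch of term-dependent field weights: for each query word, estimate how likely it is to refer to each field from collection statistics. The collection is invented, a uniform prior over fields is assumed, and only this mapping step is shown, not the complete retrieval model of the source.

```python
from collections import Counter

# Toy collection; field contents are invented for illustration.
COLLECTION = [
    {"cast": "meg ryan tom hanks", "genre": "romance comedy"},
    {"cast": "ryan gosling emma stone", "genre": "romance musical"},
    {"cast": "tom cruise", "genre": "action"},
]
FIELDS = ["cast", "genre"]

def field_term_counts(collection):
    """Count how often each term occurs in each field across the collection."""
    counts = {f: Counter() for f in FIELDS}
    for doc in collection:
        for f in FIELDS:
            counts[f].update(doc[f].split())
    return counts

def mapping_probabilities(term, counts):
    """P(field | term), assuming a uniform prior over fields."""
    likelihood = {f: counts[f][term] / sum(counts[f].values()) for f in FIELDS}
    total = sum(likelihood.values())
    if total == 0:  # unseen term: fall back to the uniform prior
        return {f: 1.0 / len(FIELDS) for f in FIELDS}
    return {f: p / total for f, p in likelihood.items()}

counts = field_term_counts(COLLECTION)
for word in ["ryan", "romance"]:
    print(word, mapping_probabilities(word, counts))
```

Here "ryan" is mapped to the cast field and "romance" to the genre field, so each query word gets its own field weights.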

Page 194: Searching the Web of Data (Tutorial)

Attribute Queries over Text

ESTER (Bast et al. SIGIR 2007): While indexing a document collection, add artificial semantic words to inverted index, e.g. "person:tony blair", to reference an artificial document about Tony Blair

Join Queries: "born in year:…"
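
The artificial-word idea can be illustrated with a tiny inverted index. The documents and the annotator output are hypothetical, and the sketch covers only the indexing trick, not ESTER's index structure or its join processing.

```python
from collections import defaultdict

def build_index(docs, annotations):
    """Index ordinary words plus artificial 'type:entity' words."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
        for artificial_word in annotations.get(doc_id, []):
            index[artificial_word].add(doc_id)
    return index

docs = {
    1: "Blair met the press in London",
    2: "The prime minister visited Paris",
}
# Hypothetical output of an entity annotator: both documents mention the
# same person, although only one of them uses the surname.
annotations = {1: ["person:tony blair"], 2: ["person:tony blair"]}

index = build_index(docs, annotations)
print(sorted(index["person:tony blair"]))
```

Looking up the artificial word retrieves both documents, including the one that never spells out the name.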

Page 195: Searching the Web of Data (Tutorial)

Attribute Queries over Text

Hakia's QDEX: Index meaning discovered in text.

Sentence found as a response to query "What drug treats headache?"

Source: hakia.com Whitepaper on Semantic Search Technology. January 2010.

Page 196: Searching the Web of Data (Tutorial)

Attribute Queries over Text

Research Challenges:

• What forms of text analysis are scalable enough for integration into an inverted index?

• How to avoid index size blow-up

• Robustness: How to deal with noise, long tail entities, etc.

• HCI: What are the best (interactive) UI paradigms?

Page 197: Searching the Web of Data (Tutorial)

Semantic Answers for Arbitrary Queries

• NTCIR 1CLICK Task: Given a query, return a short summary of relevant web pages.

• Example: word definitions, artist information, etc.

• Constraint: Size limits (e.g. 1000 characters for desktop, 280 for mobile run)

Page 198: Searching the Web of Data (Tutorial)

Question Answering over Linked Data

Yahya et al. EMNLP 2011

Page 199: Searching the Web of Data (Tutorial)

Question Answering over Linked Data

Yahya et al. EMNLP 2011

Page 200: Searching the Web of Data (Tutorial)

Question Answering over Linked Data

Yahya et al. EMNLP 2011

Page 201: Searching the Web of Data (Tutorial)

Question Answering over Linked Data

Yahya et al. EMNLP 2011

Page 202: Searching the Web of Data (Tutorial)

Question Answering over Linked Data

Yahya et al. EMNLP 2011

Page 203: Searching the Web of Data (Tutorial)

Question Answering over Linked Data

Pythia by Unger and Cimiano (2011)

Page 204: Searching the Web of Data (Tutorial)

Question Answering over Data

Page 205: Searching the Web of Data (Tutorial)

Question Answering over Data

15 million lines of Mathematica code

(2011)

Page 206: Searching the Web of Data (Tutorial)

Question Answering over Data

15 million lines of Mathematica code

(2011)

Page 207: Searching the Web of Data (Tutorial)

Question Answering: Watson

IBM's Watson: beat human Jeopardy! champions

Page 208: Searching the Web of Data (Tutorial)

Question Answering: Watson

• First Watson version: two hours per question

• Jeopardy! version: massively parallel (2880 cores, 16,000GB of RAM)

Page 209: Searching the Web of Data (Tutorial)

Question Answering: Watson

• First Step: Question Classification

• Definition questions

• Math questions

• Puzzle questions

• What kind of answer is required?

• How: Text classification techniques with additional rules
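
The rule part of such a classifier might look like the sketch below. The two patterns and the fallback label are hypothetical and cover only definition and math questions; they do not describe Watson's classifier, which combines learned text classification with rules.

```python
import re

# Hypothetical hand-written rules, checked in order.
RULES = [
    ("math", re.compile(r"\d+\s*[-+*/]\s*\d+")),
    ("definition", re.compile(r"^(what is|define)\b", re.IGNORECASE)),
]

def classify(question):
    """Return the label of the first matching rule, or 'other'."""
    for label, pattern in RULES:
        if pattern.search(question):
            return label
    return "other"  # a learned classifier would take over here

for q in ["What is 2 + 2?", "What is a quark?", "Name a river in Spain"]:
    print(q, "->", classify(q))
```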

Page 210: Searching the Web of Data (Tutorial)

Question Answering: Watson

• Deep parsing, text classification in order to decompose questions into parts

Page 211: Searching the Web of Data (Tutorial)

Question Answering: Watson

• Try to look up (parts of answers) in knowledge bases and databases: DBpedia, YAGO, many others

• However: relations very heterogeneous

• On-the-fly Information Retrieval and Answer Extraction

• 1TB of Text: encyclopedias, dictionaries, thesauri, news, music databases, IMDB, and the Web

Page 212: Searching the Web of Data (Tutorial)

Question Answering: Watson

• Possibly thousands of candidate answers

• Retrieve evidence for each candidate and rank answers

• Lexical compatibility

• Passage reliability

• Popular vs. Obscure

• Taxonomic compatibility

• Spatial, temporal compatibility

• ... (over 50 scoring components)

Page 213: Searching the Web of Data (Tutorial)

Question Answering: Watson

• Answer merging ("Abraham Lincoln" vs. "Honest Abe"): essentially co-reference analysis

• Final overall ranking: combining scores in different ways depending on type of question (logistic regression models)
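
How per-type logistic regression models could combine component scores is sketched below. The weights, the two score names and the candidate scores are invented for illustration and are not Watson's actual model, which uses over 50 scoring components.

```python
import math

# Hypothetical learned weights per question type; not Watson's real model.
WEIGHTS = {
    "definition": {"bias": -1.0, "lexical": 2.0, "taxonomic": 0.5},
    "puzzle": {"bias": -1.0, "lexical": 0.5, "taxonomic": 2.0},
}

def confidence(question_type, scores):
    """Logistic regression: weighted sum of scores passed through a sigmoid."""
    w = WEIGHTS[question_type]
    z = w["bias"] + sum(w[name] * value for name, value in scores.items())
    return 1.0 / (1.0 + math.exp(-z))

def rank_answers(question_type, candidates):
    """Order candidate answers by decreasing confidence."""
    return sorted(candidates,
                  key=lambda c: confidence(question_type, candidates[c]),
                  reverse=True)

# Made-up component scores for two candidate answers.
candidates = {
    "Abraham Lincoln": {"lexical": 0.9, "taxonomic": 0.8},
    "Gettysburg": {"lexical": 0.7, "taxonomic": 0.1},
}
print(rank_answers("definition", candidates))
```

Using a separate weight vector per question type lets the same evidence count differently, for example taxonomic compatibility mattering more for one type than for another.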

Page 214: Searching the Web of Data (Tutorial)

The Plan

• Introduction

• Structured Data on the Web

• Querying Linked Open Data

• Querying the Web of Data

• Query Understanding

• Outlook and Conclusion

Page 215: Searching the Web of Data (Tutorial)

Personal Assistance

• Speech Recognition: Incremental improvements, task-specific language modeling

• Full audio sent to servers

• → speech recognition needn't run on phone

• → huge potential for data collection

Page 216: Searching the Web of Data (Tutorial)

Personal Assistance

• Context-Awareness (location, schedule, etc.)

"Any good burger joints around here?"

• From individual questions to dialogs

"Hmm. How about tacos?"

• Clarifications: "Did you mean pizza restaurants in Chicago or Chicago-style pizza places near you?"

• Join queries over Structured Data: "Italian restaurants in the SOMA neighborhood of San Francisco that have tables available tonight"

Page 217: Searching the Web of Data (Tutorial)

Personal Assistance

Source: http://shitthatsirisays.tumblr.com/post/11911810178

Information Integration:

Use information from web sources like geonames.org (Linked Data).

Delegate some requests to other services on the Web like Wolfram Alpha.

Page 218: Searching the Web of Data (Tutorial)

Personal Assistance

Source: http://siriquestions.com/I_need_a_drink_q192

Intent Detection:

From Questions to Tasks

Page 219: Searching the Web of Data (Tutorial)

Personal Assistance

Source: http://www.pcmag.com/article2/0,2817,2394834,00.asp#fbid=iMUpEAhTMoK

Intent Detection:

From Questions to Tasks

Page 220: Searching the Web of Data (Tutorial)

Personal Assistance

• Nuance's Nina and SRI's Lola for Customer Service

• "Can I set up a regular monthly payment of my mortgage?"

• "How long have I got until I have to pay my Visa bill"

• "Can I get on an earlier flight to Boston"

Image: Nuance

Page 221: Searching the Web of Data (Tutorial)

Personal Assistance

Botego service used by Unilever, Coca Cola etc.

Page 222: Searching the Web of Data (Tutorial)

Not Just Phones

• "What's on TBS at 9 p.m. tonight?"

• "Find comedies with Ben Stiller"

• "Play some Radiohead" (music)

• "Call Jane on Skype"

• "Send message to John: Want to come to watch the show?"

Page 223: Searching the Web of Data (Tutorial)

Not Just Phones

• Siri eyes free mode in some GM cars

• Microsoft Tellme: used in cars (Ford, Kia), TVs (via Kinect), and coming to PCs

• "I saw one of my kids trying to talk and give commands to our home stereo, without much success" (Zig Serafin, Microsoft)

Page 224: Searching the Web of Data (Tutorial)

Not Just Phones

• On-the-fly Information (e.g. directions, weather reports, messages)

• Voice interface

Google's Project Glass

Page 225: Searching the Web of Data (Tutorial)

Conclusion

• Structured Data on the Web: markup, Linked Data, Knowledge Graph, etc.

• Querying the Web of Data: keyword search, structured queries, ranking

• Query Understanding: Smart Question Answering, the future of user interfaces

Source: Wikipedia

Page 226: Searching the Web of Data (Tutorial)

More Information

Gerard de Melo: http://lexvo.org/gdm/

Katja Hose: http://www.katja-hose.de/

Questions?