
Querying on the Web: XQuery, RDQL, SparQL

Semantic Web - Spring 2006

Computer Engineering Department

Sharif University of Technology


Outline

• XQuery
– Querying XML data

• RDQL
– Querying RDF data

• SparQL
– Another RDF query language (under development)


Requirements for an XML Query Language

David Maier, W3C XML Query Requirements:

• Closedness: output must be XML

• Composability: wherever a set of XML elements is required, a subquery is allowed as well

• Can benefit from a schema, but should also be applicable without one

• Retains the order of nodes

• Formal semantics


How Does One Design a Query Language?

• In most query languages, there are two aspects to a query:

– Retrieving data (e.g., from … where … in SQL)

– Creating output (e.g., select … in SQL)

• Retrieval consists of

– Pattern matching (e.g., from … )

– Filtering (e.g., where … )

… although these cannot always be clearly distinguished


XQuery Principles

• A language for querying XML documents

• Data model identical with the XPath data model

– documents are ordered, labeled trees

– nodes have identity

– nodes can have simple or complex types (defined in XML Schema)

• XQuery can be used without schemas, but queries can be checked against DTDs and XML schemas

• XQuery is a functional language

– no statements

– evaluation of expressions


Sample data


A Query over the Recipes Document

<titles>
  {for $r in doc("recipes.xml")//recipe
   return $r/title}
</titles>

returns

<titles>
  <title>Beef Parmesan with Garlic Angel Hair Pasta</title>
  <title>Ricotta Pie</title>
</titles>


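Query Features

<titles>
  {for $r in doc("recipes.xml")//recipe
   return
     $r/title}
</titles>

• doc(String) returns the input document

• Text outside {...} is returned as it is given; text inside {...} is evaluated

• for performs iteration; $var denotes a variable

• doc("recipes.xml")//recipe and $r/title are XPath expressions

• return yields a sequence of results, one for each variable binding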


Features: Summary

• The result is a new XML document

• A query consists of parts that are returned as is

• ... and others that are evaluated (everything in {...} )

• Calling the function doc(String) returns an input document

• XPath is used to retrieve node sets and values

• Iteration over node sets:

for binds a variable to each node in a node set, one node at a time

• Variables can be used in XPath expressions

• return returns a sequence of results, one for each binding of a variable
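As a rough analogy outside XQuery, the for/return pattern summarized above can be sketched in Python with the standard xml.etree.ElementTree module. The inline document is a hypothetical stand-in for recipes.xml (only the two titles come from the slides), and ElementTree supports just a small XPath subset, so this illustrates the evaluation model rather than XQuery itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for recipes.xml; only the titles are taken from the slides.
RECIPES_XML = (
    "<collection>"
    "<recipe><title>Beef Parmesan with Garlic Angel Hair Pasta</title></recipe>"
    "<recipe><title>Ricotta Pie</title></recipe>"
    "</collection>"
)

doc = ET.fromstring(RECIPES_XML)

# The literal part of the query, <titles>...</titles>, is returned as is.
titles = ET.Element("titles")

# for $r in doc("recipes.xml")//recipe return $r/title
for r in doc.iter("recipe"):          # one binding of r per recipe node
    for title in r.findall("title"):  # the path step $r/title
        titles.append(title)          # results are collected in binding order

result = ET.tostring(titles, encoding="unicode")
print(result)
```

The outer element plays the role of the part that is returned as is; the loop body plays the role of the part in {...}.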


XPath is a Fragment of XQuery

• doc("recipes.xml")//recipe[1]/title

returns an element:

<title>Beef Parmesan with Garlic Angel Hair Pasta</title>

• doc("recipes.xml")//recipe[position()<=3]/title

returns a list of elements:

<title>Beef Parmesan with Garlic Angel Hair Pasta</title>,
<title>Ricotta Pie</title>,
<title>Linguine Pescadoro</title>


Beware: XPath Attributes

• doc("recipes.xml")//recipe[1]/ingredient[1]/@name

→ attribute name {"beef cube steak"}

(a constructor for an attribute node)

• string(doc("recipes.xml")//recipe[1]/ingredient[1]/@name)

→ "beef cube steak"

(a value of type string)


XPath Attributes (cntd.)

• <first-ingredient>{string(doc("recipes.xml")//recipe[1]/ingredient[1]/@name)}</first-ingredient>

→ <first-ingredient>beef cube steak</first-ingredient>

an element with string content


XPath Attributes (cntd.)

• <first-ingredient>{doc("recipes.xml")//recipe[1]/ingredient[1]/@name}</first-ingredient>

→ <first-ingredient name="beef cube steak"/>

an element with an attribute


XPath Attributes (cntd.)

• <first-ingredient oldName="{doc("recipes.xml")//recipe[1]/ingredient[1]/@name}">Beef</first-ingredient>

→ <first-ingredient oldName="beef cube steak">Beef</first-ingredient>

Inside an attribute value, the attribute node is cast to a string


Iteration with the For-Clause

Syntax: for $var in xpath-expr

Example: for $r in doc("recipes.xml")//recipe return string($r)

• The expression creates a list of bindings for a variable $var

• If $var occurs in an expression exp, then exp is evaluated for each binding

• For-clauses can be nested:

for $r in doc("recipes.xml")//recipe
for $v in doc("vegetables.xml")//vegetable
return ...
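Nested for-clauses evaluate the return expression once per combination of bindings, with the outer variable changing slowest. A minimal Python analogy, using hypothetical lists in place of the two node sets:

```python
# Hypothetical bindings standing in for //recipe and //vegetable.
recipes = ["Ricotta Pie", "Linguine Pescadoro"]
vegetables = ["onion", "garlic"]

# for $r in ...  for $v in ...  return ($r, $v)
pairs = [(r, v) for r in recipes for v in vegetables]

for pair in pairs:
    print(pair)
```

Two recipes and two vegetables give four result tuples, in the order induced by the outer loop.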


Nested For-clauses: Example

<my-recipes>
  {for $r in doc("recipes.xml")//recipe
   return
     <my-recipe title="{$r/title}">
       {for $i in $r//ingredient
        return
          <my-ingredient>
            {string($i/@name)}
          </my-ingredient>
       }
     </my-recipe>
  }
</my-recipes>

Returns my-recipes with titles as attributes and my-ingredients with names as text content


The Let Clause

Syntax: let $var := xpath-expr

• binds variable $var to a list of nodes, with the nodes in document order

• does not iterate over the list

• allows one to keep intermediate results for reuse (not possible in SQL)

Example:

let $ooreps := doc("recipes.xml")//recipe[.//ingredient/@name="olive oil"]
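The difference between let and for shows up when counting: let produces a single binding that holds the whole list, whereas for produces one binding per item. A small Python sketch with a hypothetical list of recipe names:

```python
# Hypothetical stand-in for the node list selected by the path expression.
ooreps = ["Beef Parmesan", "Linguine Pescadoro", "Zuppa Inglese"]

# let $x := ooreps return count($x)   -> one result for the whole list
let_result = [len(ooreps)]

# for $x in ooreps return count($x)   -> one result per item
for_result = [len([x]) for x in ooreps]

print(let_result, for_result)
```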


Let Clause: Example

Calories of recipes with olive oil:

<calory-content>
  {let $ooreps := doc("recipes.xml")//recipe[.//ingredient/@name="olive oil"]
   for $r in $ooreps return
     <calories>
       {$r/title/text()}
       {": "}
       {string($r/nutrition/@calories)}
     </calories>}
</calory-content>

Note the implicit string concatenation


Let Clause: Example (cntd.)

The query returns:

<calory-content>

<calories>Beef Parmesan: 1167</calories>

<calories>Linguine Pescadoro: 532</calories>

</calory-content>


The Where Clause

Syntax: where <condition>

• occurs before the return clause

• similar to predicates in XPath

• comparisons on nodes:

– "is" for node identity ("=" compares values, not nodes)

– "<<" and ">>" for document order

• Example:

for $r in doc("recipes.xml")//recipe
where $r//ingredient/@name="olive oil"
return ...


Quantifiers

• Syntax: some/every $var in <node-set> satisfies <expr>

• $var is bound to all nodes in <node-set>

• Test succeeds if <expr> is true for some/every binding

• Note: if <node-set> is empty, then “some” is false and “every” is true


Quantifiers (Example)

• Recipes that have some compound ingredient

for $r in doc("recipes.xml")//recipe
where some $i in $r/ingredient satisfies $i/ingredient
return $r/title

• Recipes where every ingredient is non-compound

for $r in doc("recipes.xml")//recipe
where every $i in $r/ingredient satisfies not($i/ingredient)
return $r/title
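The two quantifiers behave like Python's any() and all(), including on an empty node set. The sketch below uses xml.etree.ElementTree on three hypothetical recipe fragments; the element and attribute names follow the slides, the data does not.

```python
import xml.etree.ElementTree as ET

def has_compound(recipe):
    # some $i in $r/ingredient satisfies $i/ingredient
    return any(i.find("ingredient") is not None
               for i in recipe.findall("ingredient"))

def all_simple(recipe):
    # every $i in $r/ingredient satisfies not($i/ingredient)
    return all(i.find("ingredient") is None
               for i in recipe.findall("ingredient"))

# Hypothetical recipes: one with a compound ingredient, one without, one empty.
pie = ET.fromstring(
    '<recipe><ingredient name="filling">'
    '<ingredient name="ricotta cheese"/></ingredient>'
    '<ingredient name="white sugar"/></recipe>'
)
soup = ET.fromstring('<recipe><ingredient name="water"/></recipe>')
empty = ET.fromstring("<recipe/>")

for name, recipe in [("pie", pie), ("soup", soup), ("empty", empty)]:
    print(name, has_compound(recipe), all_simple(recipe))
```

For the recipe without ingredients, the "some" test fails and the "every" test succeeds, which is the empty-set rule stated on the Quantifiers slide.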


Element Fusion

“To every recipe, add the attribute calories!”

<result>
  {let $rs := doc("recipes.xml")//recipe
   for $r in $rs return
     <recipe>
       {$r/nutrition/@calories}
       {$r/title}
     </recipe>}
</result>

{$r/nutrition/@calories} yields an attribute; {$r/title} yields an element


Element Fusion (cntd.)

The query result:

<result>

<recipe calories="1167">

<title>Beef Parmesan with Garlic Angel Hair Pasta</title>

</recipe>

<recipe calories="349">

<title>Ricotta Pie</title>

</recipe>

<recipe calories="532">

<title>Linguine Pescadoro</title>

</recipe>

</result>


Eliminating Duplicates

The function distinct-values(Node Set)

– extracts the values of a sequence of nodes

– creates a duplicate free sequence of values

Note the coercion: nodes are cast as values!

Example:

let $rs := doc("recipes.xml")//recipe
return distinct-values($rs//ingredient/@name)

yields

"beef cube steak",
"onion, sliced into thin rings",
...
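The two steps of distinct-values (extract the values, then drop duplicates) can be sketched in Python. The ingredient names are hypothetical, and the sketch keeps the first occurrence of each value, which is an assumption: XQuery leaves the order of the result to the implementation.

```python
# Hypothetical attribute values, as they might be extracted from @name nodes.
names = ["olive oil", "salt", "olive oil", "garlic", "salt"]

# dict.fromkeys keeps one entry per value, in order of first appearance.
distinct = list(dict.fromkeys(names))

print(distinct)
```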


The Order By Clause

Syntax: order by expr [ ascending | descending ]

for $iname in doc("recipes.xml")//@name
order by $iname descending
return string($iname)

yields

"whole peppercorns",
"whole baby clams",
"white sugar",
...


The Order By Clause (cntd.)

The interpreter must be told whether the values should be regarded as numbers or as strings (alphanumerical sorting is default)

for $r in $rs
order by number($r/nutrition/@calories)
return $r/title

Note:

– The query returns titles ...

– but the ordering is according to calories, which do not appear in the output

Not possible in SQL-92, where ORDER BY may only refer to output columns (later SQL versions relax this)
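Why the number() call matters can be seen by sorting the same values both ways. The calorie values below are the three that appear on the Element Fusion slide; keeping them as strings mimics untyped attribute values.

```python
# (title, calories) pairs; calories kept as strings, like untyped attribute values.
recipes = [
    ("Ricotta Pie", "349"),
    ("Beef Parmesan with Garlic Angel Hair Pasta", "1167"),
    ("Linguine Pescadoro", "532"),
]

# order by $r/nutrition/@calories          (compared as strings)
by_string = [title for title, cal in sorted(recipes, key=lambda rc: rc[1])]

# order by number($r/nutrition/@calories)  (compared as numbers)
by_number = [title for title, cal in sorted(recipes, key=lambda rc: float(rc[1]))]

print(by_string)
print(by_number)
```

As strings, "1167" sorts before "349", so the highest-calorie recipe comes first; as numbers it comes last. In both cases the sort key itself does not appear in the output.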


Grouping and Aggregation

Aggregation functions count, sum, avg, min, max

Example: The number of simple ingredients per recipe

for $r in doc("recipes.xml")//recipe

return

<number>

{attribute {"title"} {$r/title/text()}}

{count($r//ingredient[not(ingredient)])}

</number>


Grouping and Aggregation (cntd.)

The query result:

<number title="Beef Parmesan with Garlic Angel Hair Pasta">11</number>,

<number title="Ricotta Pie">12</number>,

<number title="Linguine Pescadoro">15</number>,

<number title="Zuppa Inglese">8</number>,

<number title="Cailles en Sarcophages">30</number>
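The counting query can be mimicked with xml.etree.ElementTree: walk all descendant ingredient elements and keep those without an ingredient child. The recipe below is a hypothetical fragment, so its count does not match the numbers listed above.

```python
import xml.etree.ElementTree as ET

# Hypothetical recipe with one compound ingredient ("filling").
RECIPE = ET.fromstring("""
<recipe>
  <title>Ricotta Pie</title>
  <ingredient name="filling">
    <ingredient name="ricotta cheese"/>
    <ingredient name="eggs"/>
  </ingredient>
  <ingredient name="white sugar"/>
</recipe>
""")

def count_simple(recipe):
    # count($r//ingredient[not(ingredient)])
    return sum(1 for i in recipe.iter("ingredient")
               if i.find("ingredient") is None)

# <number title="...">{count(...)}</number>
number = ET.Element("number", {"title": RECIPE.findtext("title")})
number.text = str(count_simple(RECIPE))

output = ET.tostring(number, encoding="unicode")
print(output)
```

The compound ingredient "filling" is not counted, but its two parts are.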


Nested Aggregation

“The recipe with the maximal number of calories!”

let $rs := doc("recipes.xml")//recipe
let $maxCal := max($rs//@calories)
for $r in $rs
where $r//@calories = $maxCal
return string($r/title)

returns

"Cailles en Sarcophages"


Running Queries with Galax

• Galax is an open-source implementation of XQuery (http://www.galaxquery.org/)

– The main developers have taken part in the definition of XQuery


RDQL

Querying on RDF data


Introduction

• RDF Data Query Language

• JDBC/ODBC friendly

• Simple:

SELECT some information
FROM somewhere
WHERE this match
AND these constraints
USING these vocabularies


Example


Example

• q1 contains a query:

SELECT ?x
WHERE (?x, <http://www.w3.org/2001/vcard-rdf/3.0#FN>, "John Smith")

• To execute q1 against a model m1.rdf:

java jena.rdfquery --data m1.rdf --query q1

• The outcome is:

x
=============================
<http://somewhere/JohnSmith/>


Example

• Return all the resources that have property FN and the associated values:

SELECT ?x, ?fname
WHERE (?x, <http://www.w3.org/2001/vcard-rdf/3.0#FN>, ?fname)

• The outcome is:

x                              | fname
================================================
<http://somewhere/JohnSmith/>  | "John Smith"
<http://somewhere/SarahJones/> | "Sarah Jones"
<http://somewhere/MattJones/>  | "Matt Jones"


Example

• Return the given names of all resources whose family name is Jones:

SELECT ?givenName

WHERE (?y, <http://www.w3.org/2001/vcard-rdf/3.0#Family>, "Jones"),

(?y, <http://www.w3.org/2001/vcard-rdf/3.0#Given>, ?givenName)

• The outcome is:

givenName

=========

"Matthew"

"Sarah"


URI Prefixes : USING

• RDQL has a syntactic convenience that allows prefix strings to be defined in the USING clause:

SELECT ?x
WHERE (?x, vCard:FN, "John Smith")
USING vCard FOR <http://www.w3.org/2001/vcard-rdf/3.0#>

SELECT ?givenName
WHERE (?y, vCard:Family, "Smith"),
      (?y, vCard:Given, ?givenName)
USING vCard FOR <http://www.w3.org/2001/vcard-rdf/3.0#>
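A WHERE clause with two triple patterns that share ?y is a join: each pattern is matched against the triples, and a binding survives only if it is consistent across both patterns. The Python sketch below is a minimal illustration of that idea, not how Jena evaluates RDQL. The triples and blank-node labels are hypothetical, chosen to reproduce the outcomes of the Jones and Smith queries, and any term starting with "?" is treated as a variable.

```python
VCARD = "http://www.w3.org/2001/vcard-rdf/3.0#"

# Hypothetical triples (subject, predicate, object); not the real m1.rdf.
TRIPLES = [
    ("_:b1", VCARD + "Family", "Jones"),
    ("_:b1", VCARD + "Given", "Matthew"),
    ("_:b2", VCARD + "Family", "Jones"),
    ("_:b2", VCARD + "Given", "Sarah"),
    ("_:b3", VCARD + "Family", "Smith"),
    ("_:b3", VCARD + "Given", "John"),
]

def match(pattern, triple, binding):
    """Extend binding so that pattern matches triple, or return None."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended

def query(patterns, triples):
    """Match the patterns one after another, joining on shared variables."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings

def given_names(family):
    rows = query([("?y", VCARD + "Family", family),
                  ("?y", VCARD + "Given", "?givenName")], TRIPLES)
    return [row["?givenName"] for row in rows]

print(given_names("Jones"))
print(given_names("Smith"))
```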


Filters

• RDQL allows constraints on the values of variables to be stated in the AND clause:

SELECT ?resource
WHERE (?resource, info:age, ?age)
AND ?age >= 24
USING info FOR <http://somewhere/peopleInfo#>
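The division of labor between WHERE and AND can be sketched in two steps: pattern matching produces bindings, and the constraint then filters them. The resource URIs below are taken from the earlier examples; the ages are hypothetical.

```python
INFO = "http://somewhere/peopleInfo#"

# Hypothetical (subject, predicate, object) triples with numeric ages.
TRIPLES = [
    ("http://somewhere/JohnSmith/", INFO + "age", 25),
    ("http://somewhere/SarahJones/", INFO + "age", 21),
    ("http://somewhere/MattJones/", INFO + "age", 24),
]

# WHERE (?resource, info:age, ?age): pattern matching yields bindings.
bindings = [{"?resource": s, "?age": o}
            for s, p, o in TRIPLES if p == INFO + "age"]

# AND ?age >= 24: the constraint filters the bindings.
resources = [b["?resource"] for b in bindings if b["?age"] >= 24]

print(resources)
```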


Another Example

SELECT ?title ?description ?orbit ?satellite ?sensor ?date
FROM <http://earth.esa.int/showcase/ers/dublin.rdf>
WHERE (?item <dc:title> ?title)
      (?item <dc:description> ?description)
      (?item <isc:orbit> ?orbit)
      (?item <isc:satellite> ?satellite)
      (?item <isc:sensor> ?sensor)
      (?item <dc:date> ?date)
USING isc FOR <http://earth.esa.int/standards/showcase/>
      dc FOR <http://purl.org/dc/elements/1.1/>
      rdf FOR <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
      rdfs FOR <http://www.w3.org/2000/01/rdf-schema#>


Implementations

• Jena
– http://jena.sourceforge.net/

• Sesame
– http://sesame.aidministrator.nl/

• RDFStore
– http://rdfstore.sourceforge.net/


Limitation

• Does not take into account the semantics of RDFS

• For example:

ex:human rdfs:subClassOf ex:animal
ex:student rdfs:subClassOf ex:human
ex:john rdf:type ex:student

Query: “To which class does the resource John belong?”

Expected answer: ex:student, ex:human, ex:animal

However, the query:

SELECT ?x
WHERE (<http://example.org/#john>, rdf:type, ?x)
USING rdf FOR <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

yields only:

<http://example.org/#student>

• Solution: Inference Engines
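What an inference engine adds for this example can be sketched as a walk up the rdfs:subClassOf hierarchy, starting from the asserted rdf:type. This is a minimal illustration of subclass closure only; real RDFS reasoners apply further rules.

```python
# The three statements from the example, split by predicate.
SUBCLASS_OF = {
    "ex:human": ["ex:animal"],
    "ex:student": ["ex:human"],
}
TYPE = {
    "ex:john": ["ex:student"],
}

def inferred_types(resource):
    """Return the asserted classes plus all their superclasses."""
    found = []
    pending = list(TYPE.get(resource, []))
    while pending:
        cls = pending.pop()
        if cls not in found:  # also guards against cycles in the hierarchy
            found.append(cls)
            pending.extend(SUBCLASS_OF.get(cls, []))
    return found

print(inferred_types("ex:john"))
```

Without the walk, only the asserted class ex:student is found, which is what the plain RDQL query returns.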


SparQL


Introduction

• An RDF query language currently under development by the W3C

• Builds on previous RDF query languages such as rdfDB, RDQL, and SeRQL.


Example RDF


Example

• Simple Query:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?url
FROM <bloggers.rdf>
WHERE {
  ?contributor foaf:name "Jon Foobar" .
  ?contributor foaf:weblog ?url .
}


Example (cont.)

• Optional block:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?depiction
WHERE {
  ?person foaf:name ?name .
  OPTIONAL { ?person foaf:depiction ?depiction . }
}
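OPTIONAL behaves like a left outer join: every match of the required pattern is kept, and the optional variable stays unbound when nothing matches. A Python sketch with hypothetical data, assuming at most one depiction per person and using None for an unbound variable:

```python
# Hypothetical data keyed by person node.
names = {"_:a": "Alice", "_:b": "Bob"}
depictions = {"_:a": "http://example.org/alice.png"}

# ?person foaf:name ?name . OPTIONAL { ?person foaf:depiction ?depiction }
rows = [(name, depictions.get(person)) for person, name in names.items()]

for row in rows:
    print(row)
```

Bob has no depiction but still appears in the result, with ?depiction left unbound.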


Example (cont.)

• Alternative matches:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?name ?mbox
WHERE {
  ?person foaf:name ?name .
  {
    { ?person foaf:mbox ?mbox } UNION { ?person foaf:mbox_sha1sum ?mbox }
  }
}
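UNION keeps a solution if either branch matches, and both branches bind the same variable ?mbox. Unlike OPTIONAL, a person matching neither branch produces no row. A Python sketch with hypothetical data (the hash value is a placeholder, not a real SHA-1 sum):

```python
# Hypothetical data keyed by person node.
names = {"_:a": "Alice", "_:b": "Bob", "_:c": "Carol"}
mbox = {"_:a": "mailto:alice@example.org"}
mbox_sha1sum = {"_:b": "placeholder-hash"}

rows = []
for person, name in names.items():
    if person in mbox:                 # left branch of the UNION
        rows.append((name, mbox[person]))
    if person in mbox_sha1sum:         # right branch of the UNION
        rows.append((name, mbox_sha1sum[person]))

for row in rows:
    print(row)
```

Carol matches neither branch and is absent from the result.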

• SparQL has many other features that are out of scope for this class. Refer to the references for more information.


References

• http://www.w3.org/TR/xquery/

• A Programmer's Introduction to RDQL
– http://jena.sourceforge.net/tutorial/RDQL/

• http://rdfstore.sourceforge.net/

• http://jena.sourceforge.net

• http://sesame.aidministrator.nl/

• http://www.w3.org/TR/2004/WD-rdf-sparql-query-20041012/

• http://www-128.ibm.com/developerworks/java/library/j-sparql/