Foundations of RDF Databases
Claudio Gutierrez
Department of Computer Science, Universidad de Chile
European Semantic Web Conference - ESWC 2008
Joint Work With
• Renzo Angles
• Marcelo Arenas
• Carlos Hurtado
• Sergio Muñoz
• Jorge Pérez
C. Gutierrez – Foundations of RDF Databases - ESWC 2008
To the memory of Alberto Mendelzon,
database theoretician and Web enthusiast
Inspired by…
Agenda
1. RDF and Databases
2. RDF and Database models
3. RDF Query Language
   – Requirements and Domains
   – Manifold Views
4. SPARQL
   – Syntax and Semantics
   – Complexity
   – Expressive Power
Disclaimers
• More like a “Computer Science” than a “Web Science” talk
• A particular view on the subject
• Not a survey!
• Apologies to Web Scientists
The base of the Semantic Web is RDF
“The Semantic Web is the representation of data on the World Wide Web. It is a collaborative effort led by W3C with participation from a large number of researchers and industrial partners. It is based on the Resource Description Framework (RDF)”
http://www.w3.org/2001/sw/
RDF Recommendation (1999)
• Language for representing information about resources in the Web
• Particularly metadata about Web resources
• Automation of processing: “RDF is intended for situations in which this information needs to be processed by applications, rather than only displayed to people”
A Data Processing perspective
[Figure: Semantic Web layer stack. Layers: Unicode, URI; XML + NS + xmlschema; RDF + rdfschema; Logic + Ontology vocabulary; Proof; Trust; Digital Signature. Annotations: (Text + Links), (entities + relations), (Concepts + knowledge)]
The Database Approach
[Figure: RDF + DB]
• Manage huge volumes of data with logical precision
• Separate modeling from implementation levels
As opposed to AI: DB primary concern is scalability. Then expressive power.
As opposed to IR: DB primary concern is precision. Then scalability (recall).
RDF Database Technology
[Figure: architecture stack. Applications, Services; APIs; Query language (SPARQL, SeRQL, RDQL, RQL, …); Data Structure: RDF Graphs; storage: Native Data Store, RDBMS (MySQL, MSQL, Oracle, DB2, Postgres), Files]
This Talk: Database Modeling Level
Hence leaving out:
• Visualization, APIs, Services, etc.
• Indexing, storing, transactions, etc.
But also leaving out: Updating / Constraints / Temporality / Optimization / Aggregation / Flexibility / etc.
Agenda
1. RDF and Databases
2. RDF and Database models
3. RDF Query Language
   – Requirements and Domains
   – Manifold Views
4. SPARQL
   – Syntax and Semantics
   – Complexity
   – Expressive Power
Database Models: Codd's definition
• Data structures
• Query Language
• Integrity constraints
RDF Data Structure: three main blocks
• Graph (Triple) structure
• Blank Nodes (∃ ?X ?Y ?Z)
• RDFS Vocabulary: subClass, subProperty, type, domain, range, class, property
RDF Data Structure: the core
• Triple structure: set of statements
• Graph structure: linked network of statements
RDF Data Structure: Relational Tables (Triple) view
Columns: Subject, Predicate, Object
• Triples as tuples
• Sets of triples as tables
Advantages:
+ Well studied and well understood
+ Reuse relational technologies
Problems (Questions):
- Why yet another syntax for the relational model?
- Was this the intended objective of RDF?
- Expressive power limitations
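The triple view can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Triple` type, the `select` helper, and the sample data (including the address `john@example.org`) are hypothetical and do not come from the talk.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    """One row of the three-column triple table."""
    subject: str
    predicate: str
    obj: str


# A set of triples plays the role of a table (hypothetical sample data).
TABLE = {
    Triple("R1", "name", "john"),
    Triple("R1", "email", "john@example.org"),
    Triple("R2", "name", "paul"),
}


def select(table, subject: Optional[str] = None,
           predicate: Optional[str] = None, obj: Optional[str] = None):
    """Relational selection: keep the rows that match every given column value."""
    return {
        t for t in table
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    }


# Projection on the object column of all "name" rows.
names = sorted(t.obj for t in select(TABLE, predicate="name"))
```

Selection and projection come for free in this view; anything that follows paths of unknown length through the data needs more than these two operations, which is one face of the "expressive power limitations" listed above.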
RDF Data Structure: Graph Database Model view
Graph Database Models:
• Data and/or schema are represented by graphs
• Query language able to capture main graph operations and properties
• Studied by DB community, but still not well understood
RDF Data Structure: Graph query languages
[Table: graph properties supported by each graph query language; cell values not recoverable. Properties: Adjacent Edges, Neighborhoods, Degree of a Node, Path, Fixed-length path, Distance, Diameter. Languages: G, G+, GraphLog, Gram, GraphDB, Lorel, F-G. Annotation: “Golden age”]
Green light for graph features!
RDF Data Structure: Triple structure + Blank nodes
Complexity / Semantics issues:
• Deciding entailment becomes NP-complete
• Deciding core is DP-complete
• Semantics of querying not simple
RDF Data Structure: Ground fragment
Good News: Blank nodes can be treated orthogonally to ground fragment.
More good news:
• Vocabulary can be reduced to { type, domain, range, subClassOf, subPropertyOf }
• Complex semantic rules and axioms can be avoided
• Structural (internal) constraints of the language can be separated from user-features, e.g. (Class, type, Resource)
• Features which do not add expressive power can be avoided, e.g. reflexivity of subClass and subProperty
RDF Data Structure: A minimal fragment
Vocabulary: {subClass, subProperty, type, domain, range}
Theorem: Simple proof system sound and complete for the semantics of RDF in this fragment. That is:
G |= F under RDF semantics iff G |= F under mRDF semantics
Theorem: Let G be a restricted graph in the fragment, and t a ground tuple. Deciding if G |= t can be done in time O(|G| log |G|)
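To make entailment in the ground fragment concrete, here is a minimal sketch in Python. It assumes the usual RDFS-style rules for the five keywords (transitivity of subClass and subProperty, inheritance along them, and typing by domain and range); these rules, the `closure` and `entails` helpers, and the sample graph are assumptions for illustration, not the proof system from the talk. The naive fixpoint is also far slower than the O(|G| log |G|) bound stated in the theorem.

```python
SC, SP, TYPE, DOM, RANGE = "subClass", "subProperty", "type", "domain", "range"


def closure(graph):
    """Naive fixpoint of RDFS-style rules over a set of ground triples."""
    g = set(graph)
    while True:
        new = set()
        for (a, p, b) in g:
            for (x, q, y) in g:
                # Transitivity: (a sc b), (b sc y) => (a sc y); same for sp.
                if p == q and p in (SP, SC) and b == x:
                    new.add((a, p, y))
                # Subproperty inheritance: (a sp b), (x a y) => (x b y).
                if p == SP and q == a:
                    new.add((x, b, y))
                # Subclass typing: (a sc b), (x type a) => (x type b).
                if p == SC and q == TYPE and y == a:
                    new.add((x, TYPE, b))
                # Domain typing: (a dom b), (x a y) => (x type b).
                if p == DOM and q == a:
                    new.add((x, TYPE, b))
                # Range typing: (a range b), (x a y) => (y type b).
                if p == RANGE and q == a:
                    new.add((y, TYPE, b))
        if new <= g:          # nothing new was derived: fixpoint reached
            return g
        g |= new


def entails(graph, triple):
    """Ground entailment check by membership in the closure (sketch only)."""
    return triple in closure(graph)


# Hypothetical sample graph.
G = {
    ("paints", SP, "creates"),
    ("creates", DOM, "artist"),
    ("artist", SC, "person"),
    ("picasso", "paints", "guernica"),
}
```

With this graph, `("picasso", "type", "person")` is derived in three rounds: first the `creates` triple, then the `artist` typing, then the `person` typing.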
Agenda
1. RDF and Databases
2. RDF and Database models
3. RDF Query Language
   – Requirements and Domains
   – Manifold Views
4. SPARQL
   – Syntax and Semantics
   – Complexity
   – Expressive Power
RDF Query Language: Social Networks domain
Use cases (local), by chapter title:
• Ranking: find triads by type
• Diffusion: selective counting of neighbors; operations between attributes; change relation direction based on attributes
• Cohesive Subgroups: extract the subnetwork induced by cliques of size n; build a hierarchy of cliques
• Friendship: extract subnetwork by time
• Affiliations: two-mode network to one-mode network
• Center and Periphery: group multiple binary relations
• Brokers and Bridges: extract egonetwork of an actor; remove relations between groups
• Prestige: discretize an attribute
• Genealogies and Citations: loop removal
• Attributes and Relations: extract a subnetwork based on attributes; group actors based on attributes; selective grouping of actors based on attributes
• Looking for Social Structure: directed to undirected binary relations; remove relations
Use cases (global), by subgraph family:
• Connected components: connected components; clustering; bicomponents and brokers
• Groups (k-neighbors, k-core, n-cliques, k-plex, etc.): detect cohesive subgroups; egonetworks; input domain
• Paths and Cycles: geodesics
RDF Query Language: Biology domain
Use case: graph query
• Enzyme taxonomies: subsumption testing
• To find biopathways graph motifs: frequent subgraph recognition
• Chemical info retrieval: subgraph isomorphism
• Kinase enzyme: subgraph homomorphism
• Find all products ultimately derived from a particular reaction: transitive closure
• Observe multiple products are co-regulated: least common ancestor
• Identify “important” paths from nutrients to chemical outputs: shortest path queries
• To construct pathways from individual reactions: graph composition
• Chemical structure associated with a node: node matching
• Find the difference in metabolisms between two microbes: graph intersection, union, difference
• To combine multiple protein interaction graphs: majority graph query
• To connect pathways, metabolism of co-existing organisms: graph composition
RDF Query Language: Web domain
Use case: graph query
• What is/are the most cited paper/s?: degree of a node
• What is the influence of article D?: paths
• All relatives of degree one of Alice: adjacency
• Are suspects A and B related?: paths
• What is the Erdős distance between author X and author Y?: distance
RDF Query Language: Tagging domain
Tags: “A tag is simply a word you use to describe a bookmark. Unlike folders, you make up tags when you need them and you can use as many as you like.”
Minimalist design:
– Tags + Bundles (classes)
– No inheritance, no intersection, etc.
– Renaming
RDF Query Language: Standardization's view (Jim Melton, Oracle, 2006)
• SQL: Great for finding data from tabular representations, can get complex when many tables are involved in a given query
• XQuery: Great for finding data in tree representations, can get complex when many relationships have to be traversed
• SPARQL: Good pattern matching paradigm, especially when relationships have to be used to answer a query
Source: “SQL, XQuery and SPARQL: What's Wrong with this Picture?”, Jim Melton (Oracle; XML Query Working Group, XML Coord. Group), Sixth annual W3C Technical Plenary (March 2006)
SPARQL = Sympathy Queen?
RDF Query Language: Logician's view
• RDF is the first level of a logical tower
• Emphasis on logic features of the RDF model
• Keep an eye on extensions to more expressive logics
• Bad news: complexity issues
RDF Query Language: Developer's view
• How do we answer the most common queries?
• How do we cope with APIs and store developments?
• Design usually influenced by current programming and system tools.
• Not always concerned with scalability and long term.
RDF Query Language: Database theoretician's view
• RDF as a graph data model? (Graphs)
• RDF as a relational model? (Relations)
RDF Query Language: Database theoretician's view
Local vs. Global
Theorem (Gaifman). A property of graphs is expressible by a closed first-order formula iff it is equivalent to a combination of properties of the form
∃v1 … ∃vs ( ⋀_{i<j} d(vi, vj) > 2r ∧ ⋀_i ψ(vi) )
where v1,…,vs denote vertices, d(x,y) denotes distance, and ψ is a first-order formula that refers only to vertices within distance r of its argument.
[Figure: vertices v1,…,v5, each with a neighborhood of radius r, pairwise more than 2r apart]
Want Local (relational) or global (graph) queries?
W3C Working Group's view
SPARQL (W3C Recommendation, 2008)
– Relational view of querying
– RDF = triples + blanks
– Pattern matching
Good News: there is a standard!
SPARQL Query (General Structure)
• Query Form: SELECT, ASK, CONSTRUCT, DESCRIBE
• Dataset Clause: FROM, FROM NAMED
• Where Clause (Graph Pattern): triple patterns combined with AND, UNION, OPTIONAL, FILTER
[Figure: a query is evaluated over a Dataset; results shown as a table of variable bindings or as TRUE / FALSE]
Outline
◮ Overview of syntax and semantics of SPARQL
◮ Formal semantics of SPARQL
◮ Complexity of the SPARQL evaluation problem
◮ Expressive Power of SPARQL
Core of the language: Example
SELECT ?Name ?Email
WHERE
{
  ?X :name ?Name .
  ?X :email ?Email
}
In general, in a query we have:
H ← P
◮ Head: processing of some variables.
◮ Body: pattern matching expression.
We focus on P.
But things can become more complex ...
Interesting features of pattern matching on graphs
◮ Grouping
◮ Optional parts
◮ Nesting
◮ Union of patterns
◮ Filtering
◮ ...
{ { P1
    P2
    OPTIONAL { P5 } }
  { P3
    P4
    OPTIONAL { P7
               OPTIONAL { P8 } } }
}
UNION
{ P9
  FILTER ( R ) }
A standard algebraic syntax
◮ Triple patterns: RDF triples + variables
◮ Graph patterns: full parenthesized algebra
original SPARQL syntax          algebraic syntax
?X :name "john"                 (?X, name, john)
{ P1 P2 }                       (P1 AND P2)
{ P1 OPTIONAL { P2 } }          (P1 OPT P2)
{ P1 } UNION { P2 }             (P1 UNION P2)
{ P1 FILTER ( R ) }             (P1 FILTER R)
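One way to hold the algebraic syntax in a program is a small expression tree. The sketch below is an assumption-laden illustration: the class names `TriplePattern`, `Binary`, `Filter` and the `show` function are hypothetical, and the filter condition R is kept as an opaque string.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class TriplePattern:
    s: str
    p: str
    o: str


@dataclass(frozen=True)
class Binary:
    op: str              # one of "AND", "OPT", "UNION"
    left: "Pattern"
    right: "Pattern"


@dataclass(frozen=True)
class Filter:
    pattern: "Pattern"
    condition: str       # the built-in condition R, not interpreted here


Pattern = Union[TriplePattern, Binary, Filter]


def show(p) -> str:
    """Render a pattern in the fully parenthesized algebraic syntax."""
    if isinstance(p, TriplePattern):
        return f"({p.s}, {p.p}, {p.o})"
    if isinstance(p, Binary):
        return f"({show(p.left)} {p.op} {show(p.right)})"
    return f"({show(p.pattern)} FILTER {p.condition})"


query = Binary("OPT",
               TriplePattern("?X", "name", "?Y"),
               TriplePattern("?X", "email", "?E"))
text = show(query)
```

Because every operator is binary and parenthesized, the tree has no ambiguity about how OPTIONAL, UNION and FILTER nest, which is what makes a compositional semantics straightforward to state.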
A formal semantics for SPARQL
A formal approach is beneficial for:
◮ Providing the user an ultimate guide of language behavior
◮ Clarifying corner cases and making them explicit
◮ Helping and simplifying the implementation process
◮ Providing sound foundations
Desiderata for semantics:
◮ Compositional approach: The meaning of an expression is determined by the meaning of its parts and the way they are combined.
◮ Denotational approach: Meaning of expressions is formalized by assigning mathematical objects which describe the meaning.
Will present:
◮ A denotational and compositional semantics.
◮ A comparison of it with W3C Semantics of SPARQL
Mappings: building block for the semantics
Definition
A mapping is a partial function from variables to RDF terms.
The evaluation of a pattern results in a set of mappings.
Example (Relational view)
◮ Variables ↦ Attributes
◮ Mappings ↦ Tuples
◮ Set of mappings ↦ Tables
The semantics of triple patterns
Given an RDF graph and a triple pattern t
Definition
The evaluation of t is the set of mappings that
◮ make t match the graph
◮ have as domain the variables in t.
Example
graph: (R1, name, john), (R1, email, [email protected]), (R2, name, paul)
triple: (?X, name, ?Y)
evaluation:
      ?X   ?Y
µ1:   R1   john
µ2:   R2   paul
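The definition of triple pattern evaluation translates almost directly into code. The following is a minimal sketch, assuming mappings are Python dicts, variables are strings starting with `?`, and triples are 3-tuples; `eval_triple` and `is_var` are hypothetical names, and `john@example.org` is a stand-in address.

```python
def is_var(term):
    """Variables are written with a leading question mark, as in SPARQL."""
    return term.startswith("?")


def eval_triple(pattern, graph):
    """Return the mappings whose domain is the variables of `pattern`
    and that turn `pattern` into a triple of `graph` (set semantics)."""
    result = []
    for triple in graph:
        mu = {}
        for p_term, g_term in zip(pattern, triple):
            if is_var(p_term):
                # A repeated variable must be bound to the same term.
                if mu.setdefault(p_term, g_term) != g_term:
                    break
            elif p_term != g_term:
                break                  # a constant does not match
        else:
            if mu not in result:       # keep each mapping once
                result.append(mu)
    return result


GRAPH = {
    ("R1", "name", "john"),
    ("R1", "email", "john@example.org"),
    ("R2", "name", "paul"),
}
answers = eval_triple(("?X", "name", "?Y"), GRAPH)
```

Each returned dict corresponds to one row of the evaluation table; the email triple contributes nothing because its predicate does not match the constant `name`.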
The bag semantics of triple patterns
Definition (Bag Semantics)
The evaluation of t is the multiset (bag) of mappings that
◮ make t match the graph
◮ have as domain the variables in t.
Bag Semantics
◮ Reflects real world practice
◮ Not well understood from a theoretical point of view
◮ For RDF/SPARQL, really set/bag semantics (sets for data input, bag for subsequent processing and output).
This talk will avoid bag semantics details.
Compatible mappings
Definition
Two mappings are compatible if they agree on their shared variables.
Example
            ?X   ?Y     ?Z                 ?V
µ1:         R1   john
µ2:         R1          [email protected]
µ3:                     [email protected]  R2
µ1 ∪ µ2:    R1   john   [email protected]
µ1 ∪ µ3:    R1   john   [email protected]  R2
◮ µ2 and µ3 are not compatible (their values for ?Z differ)
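Compatibility is a one-line check once mappings are represented as dicts. This is a sketch under that assumption; `compatible` and `merge` are hypothetical helper names, and the two addresses are stand-ins chosen to differ, as the example requires.

```python
def compatible(mu1, mu2):
    """True when the two mappings agree on every variable they share."""
    return all(mu1[v] == mu2[v] for v in mu1.keys() & mu2.keys())


def merge(mu1, mu2):
    """Union of two compatible mappings."""
    if not compatible(mu1, mu2):
        raise ValueError("mappings are not compatible")
    return {**mu1, **mu2}


# Stand-in values: the two addresses are assumed to be different.
mu1 = {"?X": "R1", "?Y": "john"}
mu2 = {"?X": "R1", "?Z": "j@example.org"}
mu3 = {"?Z": "p@example.org", "?V": "R2"}
```

Note that `mu1` and `mu3` share no variable at all, so they are trivially compatible; this is the case that distinguishes mapping compatibility from equality on a fixed set of attributes.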
Sets of mappings and operations
Let M1 and M2 be sets of mappings:
Definition
Join: M1 ⋈ M2
◮ extending mappings in M1 with compatible mappings in M2
Difference: M1 ∖ M2
◮ mappings in M1 that cannot be extended with mappings in M2
Union: M1 ∪ M2
◮ mappings in M1 plus mappings in M2 (set theoretical union)
Definition
Left Outer Join: M1 ⟕ M2 = (M1 ⋈ M2) ∪ (M1 ∖ M2)
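The four operations can be sketched over Python sets. The representation is an assumption made for this example: a mapping is a frozenset of (variable, term) pairs so that it can be a set element, and the helper names `m`, `join`, `diff`, `union`, `left_outer_join` as well as the sample data are hypothetical.

```python
def m(d):
    """Hashable mapping: a frozenset of (variable, term) pairs."""
    return frozenset(d.items())


def compatible(a, b):
    """True when the two mappings agree on their shared variables."""
    da, db = dict(a), dict(b)
    return all(da[v] == db[v] for v in da.keys() & db.keys())


def join(m1, m2):
    """Unions of all compatible pairs, one mapping from each side."""
    return {a | b for a in m1 for b in m2 if compatible(a, b)}


def diff(m1, m2):
    """Mappings of m1 that are compatible with no mapping of m2."""
    return {a for a in m1 if not any(compatible(a, b) for b in m2)}


def union(m1, m2):
    """Plain set-theoretical union."""
    return m1 | m2


def left_outer_join(m1, m2):
    """(m1 join m2) union (m1 diff m2)."""
    return join(m1, m2) | diff(m1, m2)


# Hypothetical sample data; the address is a stand-in.
M1 = {m({"?X": "R1", "?Y": "john"}), m({"?X": "R2", "?Y": "paul"})}
M2 = {m({"?X": "R1", "?E": "john@example.org"})}
```

Here `left_outer_join(M1, M2)` extends the R1 mapping with `?E` and keeps the R2 mapping unchanged, since nothing in `M2` is compatible with it.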
Semantics of SPARQL operators
Compositional semantics at work:
Definition
Given P1,P2 graph patterns and D an RDF graph:
[[P1 AND P2]]D →
[[P1 UNION P2]]D →
[[P1 OPT P2]]D →
– C. Gutierrez - Foundations of RDF Databases - ESWC 2008 12 / 29
Semantics of SPARQL operators
Compositional semantics at work:
Definition
Given P1,P2 graph patterns and D an RDF graph:
[[P1 AND P2]]D → [[P1]]D [[P2]]D
[[P1 UNION P2]]D →
[[P1 OPT P2]]D →
– C. Gutierrez - Foundations of RDF Databases - ESWC 2008 12 / 29
Semantics of SPARQL operators
Compositional semantics at work:
Definition
Given P1,P2 graph patterns and D an RDF graph:
[[P1 AND P2]]D → [[P1]]D [[P2]]D
[[P1 UNION P2]]D → [[P1]]D ∪ [[P2]]D
[[P1 OPT P2]]D →
– C. Gutierrez - Foundations of RDF Databases - ESWC 2008 12 / 29
Semantics of SPARQL operators
Compositional semantics at work:
Definition
Given P1,P2 graph patterns and D an RDF graph:
[[P1 AND P2]]D → [[P1]]D ⋈ [[P2]]D
[[P1 UNION P2]]D → [[P1]]D ∪ [[P2]]D
[[P1 OPT P2]]D → [[P1]]D ⟕ [[P2]]D
Simple example
Example
Graph:
(R1, name, john)
(R1, email, [email protected])
(R2, name, paul)
Pattern:
( (?X, name, ?Y) OPT (?X, email, ?E) )
Evaluation of (?X, name, ?Y):
?X   ?Y
R1   john
R2   paul
Evaluation of (?X, email, ?E):
?X   ?E
R1   [email protected]
Evaluation of the whole pattern:
?X   ?Y     ?E
R1   john   [email protected]     ◮ from the Join
R2   paul                        ◮ from the Difference
◮ the complete answer comes from the Union of both
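The compositional semantics and the example above can be replayed with a small recursive evaluator. This is a minimal sketch under stated assumptions: patterns are encoded as nested tuples, mappings as frozensets of (variable, value) pairs, and EMAIL is a placeholder value, since the address is not legible in the transcript. None of these names come from the talk.

```python
def match_triple(pattern, graph):
    """[[t]]D for a triple pattern t: one mapping per triple of the graph that t matches."""
    results = set()
    for triple in graph:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must receive the same value every time.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.add(frozenset(binding.items()))
    return results

def compatible(m1, m2):
    d1, d2 = dict(m1), dict(m2)
    return all(d1[v] == d2[v] for v in d1.keys() & d2.keys())

def join(M1, M2):
    return {a | b for a in M1 for b in M2 if compatible(a, b)}

def difference(M1, M2):
    return {a for a in M1 if not any(compatible(a, b) for b in M2)}

def evaluate(pattern, graph):
    """Patterns are either a triple of strings or (op, P1, P2) with op in AND, UNION, OPT."""
    if pattern[0] in ("AND", "UNION", "OPT") and isinstance(pattern[1], tuple):
        op, p1, p2 = pattern
        m1, m2 = evaluate(p1, graph), evaluate(p2, graph)
        if op == "AND":
            return join(m1, m2)
        if op == "UNION":
            return m1 | m2
        return join(m1, m2) | difference(m1, m2)  # OPT is the left outer join
    return match_triple(pattern, graph)

EMAIL = "john@example.org"  # placeholder, not the value shown on the slide
D = {("R1", "name", "john"), ("R1", "email", EMAIL), ("R2", "name", "paul")}
P = ("OPT", ("?X", "name", "?Y"), ("?X", "email", "?E"))
answer = evaluate(P, D)
```

The answer contains one mapping binding ?X, ?Y and ?E (from the join) and one binding only ?X and ?Y (from the difference), matching the table in the example.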
Semantics of FILTER patterns
In a pattern (P FILTER R), the filter expression R is a Boolean combination of atoms.
A mapping satisfies an atom:
◮ (?X = c) if it gives the value c to variable ?X
◮ (?X = ?Y) if it gives the same value to ?X and ?Y
◮ bound(?X) if it is defined for ?X
Definition
[[P FILTER R]]D = { µ ∈ [[P]]D : µ |= R }
= set of mappings in [[P]]D that satisfy R.
Makes sense only if var(R) ⊆ var(P) (safe filters).
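A sketch of filter satisfaction and of the safety check, assuming filter expressions are encoded as nested tuples and mappings as Python dicts. The encoding is hypothetical. One further assumption goes beyond the slide: an equality atom over an unbound variable is treated as not satisfied.

```python
def satisfies(mu, cond):
    """Does mapping mu (a dict) satisfy the filter expression cond?"""
    kind = cond[0]
    if kind == "bound":                      # ("bound", "?X")
        return cond[1] in mu
    if kind == "eq":                         # ("eq", "?X", c) or ("eq", "?X", "?Y")
        _, var, term = cond
        if var not in mu:
            return False                     # assumption: unbound variable, atom fails
        if isinstance(term, str) and term.startswith("?"):
            return term in mu and mu[var] == mu[term]
        return mu[var] == term
    if kind == "not":
        return not satisfies(mu, cond[1])
    if kind == "and":
        return satisfies(mu, cond[1]) and satisfies(mu, cond[2])
    if kind == "or":
        return satisfies(mu, cond[1]) or satisfies(mu, cond[2])
    raise ValueError(f"unknown filter constructor: {kind!r}")

def variables(cond):
    """var(R): the variables mentioned in a filter expression."""
    kind = cond[0]
    if kind == "bound":
        return {cond[1]}
    if kind == "eq":
        return {t for t in cond[1:] if isinstance(t, str) and t.startswith("?")}
    return set().union(*(variables(c) for c in cond[1:]))

def is_safe(cond, pattern_vars):
    """A filter is safe when var(R) is a subset of var(P)."""
    return variables(cond) <= set(pattern_vars)

def filter_mappings(mappings, cond):
    """[[P FILTER R]]: keep the mappings of [[P]] that satisfy R."""
    return [mu for mu in mappings if satisfies(mu, cond)]

# Illustrative mappings (hypothetical values).
ms = [{"?X": "R1", "?Y": "john", "?E": "e1"}, {"?X": "R2", "?Y": "paul"}]
no_email = filter_mappings(ms, ("not", ("bound", "?E")))
```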
Complexity (The evaluation problem)
Input:
a mapping µ, a graph pattern P, an RDF graph D.
Question:
Is the mapping in the evaluation of the pattern against the graph?
Formally: is it true that µ ∈ [[P]]D?
Evaluation of simple patterns is polynomial.
Theorem
For patterns using only AND and FILTER operators, the evaluation problem is polynomial:
O(size of the pattern × size of the graph).
Proof idea
◮ Check that the mapping makes every triple pattern match a triple of the graph.
◮ Then check that the mapping satisfies the FILTERs.
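The proof idea can be sketched as a direct membership test. The sketch assumes the AND/FILTER pattern has already been flattened into a list of triple patterns plus a list of safe filter conditions, that filters are given as Python predicates over the mapping, and that the graph is a Python set of triples. These are illustrative choices, not part of the talk.

```python
def in_evaluation(mu, triple_patterns, filters, graph):
    """Decide whether mu belongs to [[P]]D for an AND/FILTER pattern P."""
    pattern_vars = {t for tp in triple_patterns for t in tp if t.startswith("?")}
    # mu must be defined exactly on the variables of the pattern.
    if set(mu) != pattern_vars:
        return False
    # Step 1: every triple pattern, instantiated by mu, must be a triple of the graph.
    for tp in triple_patterns:
        instantiated = tuple(mu.get(term, term) for term in tp)
        if instantiated not in graph:
            return False
    # Step 2: mu must satisfy every filter condition.
    return all(condition(mu) for condition in filters)

# Illustrative data (hypothetical).
D = {("R1", "name", "john"), ("R1", "age", "30"), ("R2", "name", "paul")}
P = [("?X", "name", "?Y"), ("?X", "age", "?A")]
F = [lambda m: m["?Y"] == "john"]
ok = in_evaluation({"?X": "R1", "?Y": "john", "?A": "30"}, P, F, D)
```

Each triple pattern is looked up once in the graph, so even with a plain list scan per lookup the work stays within the O(size of the pattern × size of the graph) bound of the theorem.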
Evaluation including UNION is NP-complete.
Theorem
For patterns using only AND, FILTER and UNION operators, the evaluation problem is NP-complete.
Proof idea
◮ Reduction from 3SAT.
◮ A pattern encodes the propositional formula.
◮ ¬ bound is used to encode negation.
In general: Evaluation problem is PSPACE-complete.
Theorem
For general patterns that include the OPT operator, the evaluation problem is PSPACE-complete.
Proof idea
◮ Reduction from QBF.
◮ A pattern encodes a quantified propositional formula:
∀x1 ∃y1 ∀x2 ∃y2 · · · ψ.
◮ Nested OPTs are used to encode quantifier alternation.
Data–complexity is polynomial
Theorem
When patterns are considered to be fixed (data complexity), the evaluation problem is in LOGSPACE.
Proof idea
From data–complexity of first–order logic.
Expressive Power of SPARQL
◮ A query is a function from the set of input data to the set of output data.
◮ The expressive power of a query language is given by the set of queries it can express.
Definition (Equivalence of languages)
Two query languages L1 and L2 have the same expressive power if they can express the same queries.
(If the languages operate over different data inputs and outputs, these have to be normalized first.)
Expressive Power of SPARQL
Three languages we will consider:
SPARQL
W3C Syntax and Semantics (as in W3C Recommendation 15 Jan 2008).
SPARQL-S
W3C Syntax and Semantics. Only safe filters allowed.
SPARQL-C
SPARQL with compositional semantics (as presented in this talk).
Expressive Power of SPARQL: Safe Patterns
What is the meaning of (P FILTER R) when var(R) ⊈ var(P) (non-safe filters)?
Example
Possible meanings of (?X name ?Y) FILTER (?Z > 3)
1. Non-defined variable ?Z. (Error, False, empty set)
2. All values of ?X, ?Y, ?Z such that the expression matches.
3. W3C uses the following:
◮ IF the expression is inside an optional, e.g.
P OPT ( (?X name ?Y) FILTER (?Z > 3) )
and variable ?Z occurs in P, THEN (2.)
◮ ELSE (1.)
Expressive Power of SPARQL: Safe Patterns
◮ Patterns with non-safe filters are rare.
◮ Patterns with non-safe filters can be simulated with safe ones.
Why not avoid them?
Theorem
SPARQL and SPARQL-S have the same expressive power.
Proof idea
◮ There exists a generic procedure to translate non-safe queries into equivalent safe queries.
◮ It uses the case-by-case W3C evaluation rules for non-safe queries.
Expressive Power of SPARQL: Compositional semantics
◮ Compositional and denotational semantics are desirable.
◮ The W3C semantics of SPARQL has a complex three-level operational procedure for evaluating patterns.
Occam’s razor: Why not keep things simple and clean?
Theorem
SPARQL-S and SPARQL-C have the same expressive power.
Proof idea
The only non-trivial case is the semantics of patterns of the form (P1 OPT (P2 FILTER C)). Just check that both definitions coincide.
Expressive Power of SPARQL: Relational Algebra
Interesting but not surprising:
Theorem
SPARQL-C and Relational Algebra have the same expressive power.
Proof idea.
1. Use the known equivalence between Relational Algebra and a version of Datalog.
2. From SPARQL-C to Relational Algebra: the idea of the transformation was known, e.g., Cyganiak (to Relational Algebra), Polleres (to Datalog). It had to be extended to bag semantics.
3. From Datalog to SPARQL-C: the key issue is a sound translation of negation.
Expressive Power of SPARQL: Relational Algebra
Interesting and surprising:
Theorem
W3C SPARQL and Relational Algebra have the same expressive power.
Proof Idea.
Use previous results.
◮ Results hold for bag and set semantics.
Expressive Power of SPARQL: Some consequences
A. Domestic:
◮ Expressive power of SPARQL (limitations and potentialities) completely clarified.
◮ Negation (difference) expressible in SPARQL.
◮ Extension with ASK queries does not add expressive power.
◮ Could bring to SPARQL most of the machinery of basic SQL.
B. Foundational:
◮ SPARQL is a pattern matching version of SQL.
◮ Only local queries expressible in SPARQL.
◮ Still waiting for the third query paradigm: SQL/Tables, XQuery/Trees, ?/Graphs
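As an illustration of the point that negation (difference) is expressible, here is a sketch of one common idiom: OPT followed by a FILTER on ¬ bound. It covers only the simple case where the optional part binds a variable (?E here) that the mandatory part never binds, and is not the general translation. The frozenset representation and the sample values are assumptions made for the sketch.

```python
def compatible(m1, m2):
    d1, d2 = dict(m1), dict(m2)
    return all(d1[v] == d2[v] for v in d1.keys() & d2.keys())

def join(M1, M2):
    return {a | b for a in M1 for b in M2 if compatible(a, b)}

def difference(M1, M2):
    return {a for a in M1 if not any(compatible(a, b) for b in M2)}

def bound(m, var):
    """bound(?X): the mapping is defined for the variable."""
    return var in dict(m)

# [[ (?X, name, ?Y) ]] and [[ (?X, email, ?E) ]] over a hypothetical graph.
M1 = {frozenset({("?X", "R1"), ("?Y", "john")}),
      frozenset({("?X", "R2"), ("?Y", "paul")})}
M2 = {frozenset({("?X", "R1"), ("?E", "john@example.org")})}

# ( (?X, name, ?Y) OPT (?X, email, ?E) ) FILTER (not bound(?E))
opt = join(M1, M2) | difference(M1, M2)
without_email = {m for m in opt if not bound(m, "?E")}
```

Every mapping produced by the join binds ?E and every mapping produced by the difference leaves it unbound, so the filter keeps exactly the difference: the resources that have a name but no email.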
Conclusions, Final Thoughts
◮ RDF is becoming a very relevant data model. Develop it!
◮ We have a "marriage of convenience" with SPARQL. Learn to love it. RDF and SPARQL are our assets. Be patient!
◮ Simplify, simplify, simplify. Keep everything (but your hope) minimal!
◮ Do not stop the search for "El Dorado". Missing graph features will be needed!
◮ Do not reinvent the wheel. Before designing new features or extensions for SPARQL, check if it was tried for SQL.
◮ SPARQL standardization + popularization of RDF + pervasiveness of social networks = explosive combination.
Comments, Questions, etc.
Thanks for your attention!