Expressive Query Answering For Semantic Wikis
Jie Bao, Rensselaer Polytechnic Institute
http://www.cs.rpi.edu/~baojie
Outline
  Background: Semantic MediaWiki
  General Design Issues: Semantics and Expressivity
  Formalizing SMW with Datalog
  Extending SMW Modeling and Query Languages
  Implementations and Experimental Results
Jan 18, 2011
Semantic Wiki as a Data Store
[Figure: layered architecture of a semantic wiki used as a data store]
  Data Layer: Wiki DB, Triple Store, Online data
  Wiki Layer: (Semantic) Wiki Scripting, Semantic Template, Semantic Query, (PHP, Javascript) Wiki Extensions, Halo Extension, Parser Function, Concept Modeling (RDF, Relational Modeling, Rules), Semantic Forms
  App. Layer: Map, Data Evaluation, Publication Management, Project Management, Remote SemWiki, Group Info. Management; still many not yet mentioned…
Semantic MediaWiki (SMW)
It is the most popular semantic wiki system extending MediaWiki
[Figure: MediaWiki, what you edit vs. what you see]
Semantic MediaWiki
[Figure: SMW, what you edit (Modeling Language) vs. what you see]
  Typed link (property)
  To author knowledge
Semantic MediaWiki
[Figure: SMW, what you edit (Querying Language) vs. what you see]
  To retrieve knowledge
Why SMW?
  Low-cost solution for light-weight semantic applications
  Integrated environment for modeling and querying
  Simple to set up, easy to use
  Can work with hundreds of other MW/SMW extensions
    Templating, Visualization, Editing, I/O, Workflow…
    Access Control, Forms, Maps, SPARQL…
Expressivity (SMW 1.5.4)
SMW-ML (Modeling Language)
  category instantiation, e.g., [[Category:C]]
  property instantiation, e.g., [[P::v]]
  subclass, e.g., [[Category:C]] (on a category page)
  subproperty, e.g., [[Subproperty of::Property:P]] (on a property page)
SMW-QL (Query Language)
  conjunction, e.g., [[Category:C]][[P::v]]
  disjunction, e.g., [[Category:C]] or [[P::v]], [[A||B]] or [[P::v||w]]
  property chain, e.g., [[P.Q::v]]
  property wildcard, e.g., [[P::+]]
  subquery, e.g., [[P::<q>[[Category:C]]</q>]]
  inverse property, e.g., [[-P::v]]
  value comparison, e.g., [[P::>3]][[P::<7]][[P::!5]]
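To make the two basic SMW-ML constructs above concrete, here is a minimal Python sketch that pulls category and property annotations out of a page's wikitext. The function name `parse_annotations` and the sample sentence are illustrative assumptions, not part of SMW; the pattern is deliberately simplified and ignores nested links, templates, and the query-only operators.

```python
import re

# Simplified pattern for [[...]] annotations; nested brackets are not handled.
ANNOTATION = re.compile(r"\[\[([^\[\]]+?)\]\]")

def parse_annotations(wikitext):
    """Return (categories, properties) found in a page's wikitext."""
    categories, properties = [], []
    for body in ANNOTATION.findall(wikitext):
        if "::" in body:
            # Property instantiation: [[P::v]]
            prop, value = body.split("::", 1)
            properties.append((prop.strip(), value.strip()))
        elif body.startswith("Category:"):
            # Category instantiation: [[Category:C]]
            categories.append(body[len("Category:"):].strip())
        # Anything else is an ordinary wiki link and carries no annotation.
    return categories, properties

cats, props = parse_annotations(
    "Troy is a [[Category:City]] located in [[located in::New York]]."
)
print(cats)
print(props)
```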
However, we often need more expressivity
Modeling
  Domain and Range: “has author” is from “document” to “person”
  Inverse property: “has author” <-> “author of”
  Transitive property: “part of”
  …
Query
  Negation: find cities that are not capitals
  Counting: find professors who advise more than 5 students
Extending SMW
Goal: offer additional expressivity without losing “wikiness” (i.e., collaborative, simple, easy to learn, tolerant of informality, and able to evolve)
Design Issues: Semantics and Expressivity
Design Issue 1: Open or Closed World?
OWL/DL-like or DB/rule-like
Design Issue 2: Expressivity Supported
A subset of OWL that
  can be implemented using rules
  is syntactically simple for common wiki users
Why not full OWL 2 RL or OWL 2 QL?
  Too complicated for most wiki users
Design Issue 3: Implementation
  Reuse existing tools if we can
  Low learning curve: hide details from users; incremental changes from SMW
  Portability: allow users to choose different backend stores (MySQL, SQL Server, etc.)
  Fast enough for a typical semantic wiki (has < O(10^4) pages [1])
[1] http://semantic-mediawiki.org/wiki/Sites_using_Semantic_MediaWiki
Solution
Formalizing SMW modeling and query languages using datalog
  Descriptive, closed-world semantics
  Well-understood complexity and many known optimizations
Implementation:
  Leverage highly-optimized LP solvers for reasoning, e.g., DLV, Clasp, and Smodels
  Reuse SMW UI for rendering query results
Expressivity
Modeling Language: a subset of OWL Prime (called RDFS++ by others)
  rdfs:subClassOf, subPropertyOf, domain, range
  owl:TransitiveProperty, SymmetricProperty, FunctionalProperty, InverseFunctionalProperty, inverseOf
  owl:sameAs, equivalentClass, equivalentProperty
Query Language: SMW-QL, plus
  Negation as failure
  Cardinality
Modeling SMW with datalog
Translation Rules for SMW-ML
  Subproperty:        P(x,y) :- Q(x,y) .
  Subclass:           C(x) :- D(x) .
  Class instance:     C(a) .
  Property instance:  P(a,b) .
  Redirection:        a=b.
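The five translation rules above can be sketched as a small Python function that emits one datalog clause per modeling statement. The tuple encoding of statements and the helper `ident` are assumptions made for this illustration. The output follows the informal notation of the slides; a real solver such as DLV has its own lexical rules for variables and constants, which this sketch does not try to reproduce.

```python
def ident(name):
    """Turn a wiki name into a single token (a convention assumed by this sketch)."""
    return name.strip().replace(" ", "_")

def to_datalog(stmt):
    """Translate one SMW-ML statement, given as a tuple, into a datalog clause."""
    kind, *args = stmt
    if kind == "subproperty":          # Q is a subproperty of P
        q, p = map(ident, args)
        return f"{p}(x,y) :- {q}(x,y)."
    if kind == "subclass":             # D is a subclass of C
        d, c = map(ident, args)
        return f"{c}(x) :- {d}(x)."
    if kind == "instance":             # page a is in category C
        a, c = map(ident, args)
        return f"{c}({a})."
    if kind == "property":             # page a has value b for property P
        a, p, b = map(ident, args)
        return f"{p}({a},{b})."
    if kind == "redirect":             # page a redirects to page b
        a, b = map(ident, args)
        return f"{a}={b}."
    raise ValueError(f"unknown statement kind: {kind}")

for s in [("subclass", "Capital", "City"),
          ("instance", "Albany", "Capital"),
          ("property", "Albany", "capital of", "New York")]:
    print(to_datalog(s))
```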
Translation Rules for SMW-QL
{{#ask: [[Category:City]] [[capital of::+]] }}
result(x) :- City(x), capital_of(x, y) .
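As a rough illustration of how such a query could be turned into a rule, the following Python sketch handles only a conjunction of category conditions and property conditions with a constant or the `+` wildcard. The function `translate_ask` and its variable-naming scheme (`y0`, `y1`, …) are assumptions for this example; disjunction, subqueries, property chains, and comparisons are left out.

```python
import re

def translate_ask(conditions):
    """Translate a conjunction of simple SMW-QL conditions into one datalog rule."""
    body = []
    fresh = 0  # counter for fresh value variables
    for cond in re.findall(r"\[\[(.+?)\]\]", conditions):
        if cond.startswith("Category:"):
            body.append(f"{cond[len('Category:'):].strip()}(x)")
        elif "::" in cond:
            prop, value = (s.strip() for s in cond.split("::", 1))
            var = f"y{fresh}"
            fresh += 1
            body.append(f"{prop.replace(' ', '_')}(x,{var})")
            if value != "+":  # '+' is the wildcard: any value matches
                body.append(f"{var}={value.replace(' ', '_')}")
        else:
            raise ValueError(f"unsupported condition: {cond}")
    if not body:
        raise ValueError("query has no conditions")
    return "result(x) :- " + ", ".join(body) + "."

print(translate_ask("[[Category:City]] [[capital of::+]]"))
```

Up to the name of the value variable, the printed rule has the same shape as the one on the slide.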
Translation Rules for SMW-QL
{{#ask: [[Category:A]][[p3::category:B]] or
  [[p.p1.p2::<q>
      [[Category:D]] or [[p1::<q>[[SomePage]]</q>]]
    </q> ||!v ||<q>[[Category:E]]</q> ]]}}

result(x) :- _tmp0(x).
_tmp0(x) :- A(x), p3(x,x0), x0=category:B.
_tmp0(x) :- p(x,x2), p1(x2,x3), p2(x3,x1), _tmp9(x1).
_tmp9(x1) :- _tmp12(x1).
_tmp12(x1) :- D(x1).
_tmp12(x1) :- p1(x1,x4), x4=SomePage.
_tmp9(x1) :- thing(x1), x1 != v.
_tmp9(x1) :- E(x1).

Features illustrated: conjunction, disjunction, subquery, inequality, property chain
Extending SMW-ML and SMW-QL
SMW-ML+
On page “Property:P”:
  [[Domain::C]]                  C(x) :- P(x,y)
  [[Range::C]]                   C(y) :- P(x,y)
  [[Type::Transitive]]           P(x,y) :- P(x,z), P(z,y)
  [[Type::Symmetric]]            P(x,y) :- P(y,x)
  [[Type::Functional]]           SameAs(x,y) :- P(z,x), P(z,y)
  [[Type::InverseFunctional]]    SameAs(x,y) :- P(x,z), P(y,z)
  [[Inverse of::Q]]              Q(x,y) :- P(y,x)
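To show what rules of this kind derive, here is a small Python sketch of naive bottom-up evaluation for three of them (transitive, symmetric, inverse). It is only an illustration of the semantics, not how DLV or any other solver evaluates programs; the function `saturate`, the triple encoding, and the sample facts are assumptions.

```python
def saturate(facts, transitive=(), symmetric=(), inverse_of=()):
    """Apply the rules repeatedly until no new fact is derived (naive fixpoint).

    facts:      iterable of (predicate, subject, object) triples
    transitive: predicates P with  P(x,y) :- P(x,z), P(z,y)
    symmetric:  predicates P with  P(x,y) :- P(y,x)
    inverse_of: pairs (P, Q) with  Q(x,y) :- P(y,x)
    """
    facts = set(facts)
    while True:
        new = set()
        for (p, x, z) in facts:
            if p in symmetric:
                new.add((p, z, x))
            if p in transitive:
                for (p2, z2, y) in facts:
                    if p2 == p and z2 == z:
                        new.add((p, x, y))
            for (src, dst) in inverse_of:
                if p == src:
                    new.add((dst, z, x))
        if new <= facts:      # nothing new: fixpoint reached
            return facts
        facts |= new

closure = saturate(
    {("part_of", "wheel", "car"), ("part_of", "car", "fleet"),
     ("has_author", "paper1", "bao")},
    transitive={"part_of"},
    inverse_of=[("has_author", "author_of")],
)
print(("part_of", "wheel", "fleet") in closure)
print(("author_of", "bao", "paper1") in closure)
```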
SMW-QL+: Negations
{{#askplus: [[<>Category:C]] [[Category:D]]}}
  result(x) :- D(x), not C(x) .
{{#askplus: [[Category:C]] [[<>P::+]]}}
  result(x) :- C(x), #count{y: P(x,y)}<=0 .
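Under closed-world semantics, a missing fact counts as false, so both negation patterns above reduce to set differences. The following Python sketch shows that reading on the "cities that are not capitals" example; the function names and sample data are assumptions for illustration.

```python
def not_in_category(members, positive, negated):
    """result(x) :- positive(x), not negated(x)."""
    return members.get(positive, set()) - members.get(negated, set())

def lacking_property(members, category, triples, prop):
    """result(x) :- category(x), #count{y : prop(x,y)} <= 0."""
    has_value = {x for (p, x, _y) in triples if p == prop}
    return members.get(category, set()) - has_value

members = {"City": {"Troy", "Albany", "Boston"}, "Capital": {"Albany", "Boston"}}
triples = [("capital_of", "Albany", "New York")]

print(not_in_category(members, "City", "Capital"))
print(lacking_property(members, "City", triples, "capital_of"))
```

Note that Boston appears in the second result only because no `capital_of` fact was recorded for it: the answer reflects what the wiki states, not what is true in the world.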
SMW-QL+: (Non)qualified Cardinality
{{#askplus: [[>=3#P::+]]}}
  result(x) :- thing(x), #count{y: P(x,y)}>=3 .
{{#askplus: [[>=3#P::<q>[[Category:D]]</q>]]}}
  result(x) :- thing(x), #count{y: P(x,y), D(y)}>=3 .
(thing(x) is added for safeness)
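The two cardinality patterns above amount to counting distinct property values per subject, optionally restricted to values in a given category. A minimal Python sketch, assuming a threshold of at least 1 (so only subjects with some value need to be considered); the function `at_least` and the sample data are illustrative assumptions.

```python
from collections import defaultdict

def at_least(triples, prop, n, qualifier=None):
    """result(x) :- thing(x), #count{y : prop(x,y), qualifier(y)} >= n   (n >= 1).

    qualifier is an optional set of pages; when given, only values in it count.
    """
    values = defaultdict(set)
    for p, x, y in triples:
        if p == prop and (qualifier is None or y in qualifier):
            values[x].add(y)          # a set, so duplicate triples count once
    return {x for x, ys in values.items() if len(ys) >= n}

triples = [("advises", "ann", "s1"), ("advises", "ann", "s2"),
           ("advises", "ann", "s3"), ("advises", "bob", "s1")]
phd_students = {"s1", "s2"}

print(at_least(triples, "advises", 3))
print(at_least(triples, "advises", 2, qualifier=phd_students))
```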
Theoretical Complexity
           SMW                                    RDF
SMW-ML     NL-complete                            NP-complete; P-complete for grounded graph [Bruijn and Heymans 2007]
SMW-ML+    NL-complete
SMW-QL     P-complete; in L without subqueries    (SPARQL) P-complete [Perez et al 2006]
SMW-QL+    P-complete
Recall that L ⊆ NL ⊆ P ⊆ NP
Implementation and Experimental Results
Implementation
Using DLV as the reasoner
  Other LP solvers may be used as well
Two work modes
  File-based: reasoning based on a static dump (snapshot) of wiki semantic data.
  Database-based: reasoning based on a shadow database via ODBC; changes to instance data are reflected in real time.
Optimization
  Caching
Example:
[Figure: example with callouts for caching, inverse property, and transitive property]
Scalability: Data Complexity
Test machine: 2 * Xeon 5365 Quad 3.0GHz 1333MHz / 16G / 2 * 1TB
Dataset: part of DBLP, 10,396 pages, 100,736 triples
Query: {{#askplus: [[Category:Person]] }}
[Figure: query time (s) vs. dataset size (triples), 10k to 100k; near linear]
Scalability: Query Complexity
Dataset: DBLP 100k triples
Query: {{#askplus: [[Knows::<q>[[Knows::<q>[[Knows::<q>…</q>]]</q>]]</q>]] }}
[Figure: query time (s) vs. subquery depth, 1 to 10; near constant]
Scalability: Query Complexity
Dataset: DBLP 100k triples
Query: {{#askplus: [[Knows::+]] or [[Knows::+]] or [[Knows::+]] …}}
[Figure: query time (s) vs. number of disjunctions, 1 to 10; near constant]
The SemanticQueryRDFS++ extension
http://www.mediawiki.org/wiki/Extension:SemanticQueryRDFS++
Conclusions and Future Work
Formalizing SMW using datalog allows us to
  analyze the reasoning complexity of SMW
  extend SMW modeling and query languages for an expressive subset of OWL
  implement an SMW query engine based on DLV that is scalable for typical uses.
Future Work
  Incremental reasoning
  Customized reasoning rules
  SPARQL <-> SMW-QL+ translations