ontological conjunctive query answering over large, semi-structured knowledge bases
DESCRIPTION
Ontological Conjunctive Query Answering is attracting renewed interest in knowledge systems that support expressive inferences. In the Semantic Web domain in particular, the problem is known as Ontology-Based Data Access. Given a knowledge base made of factual knowledge (very often a relational database) and universal knowledge (an ontology), the problem is to check whether a conjunctive query has an answer in the knowledge base. The problem has been studied successfully in the past, but the emergence of large, semi-structured knowledge bases and the growing interest in non-relational databases have slightly changed its nature.
This presentation highlights the following aspects. First, we introduce the problem and the way we have chosen to address it. We then discuss how the size of the knowledge base affects our approach. Next, we introduce the ALASKA platform, a framework for performing knowledge representation & reasoning operations over heterogeneously stored data. Finally, we present preliminary results comparing the efficiency of existing storage systems when storing knowledge bases of different sizes on disk, and discuss future implications.
TRANSCRIPT
Introduction Research Problem ALASKA platform Tests & Results Current & Future work Questions
Ontological Conjunctive Query Answering over Large, Semi-Structured Knowledge Bases
Bruno Paiva Lima da Silva
GraphIK Research Team, LIRMM
FOSDEM 2012 - February 5th
Ontological Conjunctive Query Answering over Large, Semi-Structured Knowledge Bases
PAIVA LIMA DA SILVA Bruno ([email protected]) 1 / 32
1 Introduction
2 Research Problem
3 ALASKA platform
4 Tests & Results
5 Current & Future work
6 Questions
About me
Bruno PAIVA LIMA DA SILVA
2nd year PhD Student @GraphIK Research Team
(http://www2.lirmm.fr/graphik)
GraphIK team is located at LIRMM, Montpellier, France.
Research topics: Knowledge representation (querying knowledge bases), record linkage & argumentation problems
Ontological Conjunctive Query Answering
Problem:
Ontological Conjunctive Query Answering (OCQA)
[Also known as Ontology-Based Data Access (OBDA)]
Given:
Knowledge base (KB)
Factual knowledge
Ontology (universal knowledge)
(Boolean) Conjunctive Query
OCQA consists in checking whether the query has an answer in the KB, that is, whether the query can be deduced from the KB.
Example
Let us describe the problem through a quick example:
Factual knowledge:
Alice and Bob are animals.
Alice is a clownfish. Bob is a parrot.
Ontology:
“A clownfish is a fish.” “A fish swims.”
“A parrot is a bird.” “A bird flies.”
Query #1: Is there a clownfish?
Yes, Alice.
Example
Query #2: Is there an animal that flies?
Factual knowledge (after applying the rules of the ontology):
Alice and Bob are animals.
Alice is a clownfish. Bob is a parrot.
Alice is a fish. Alice swims.
Bob is a bird. Bob flies.
Ontology:
“A clownfish is a fish.” “A fish swims.”
“A parrot is a bird.” “A bird flies.”
Answer: Yes, Bob.
First-Order Logic
We use a decidable subset of First-Order Logic (FOL) to represent the problem:
Definitions:
Terms: Alice, Bob
Predicates: flies(x), swims(x), friend(x,y), between(x,y,z)
Atoms: parrot(Bob), friend(Alice,Bob)
Rules: ∀x [hypothesis] bird(x) → [conclusion] flies(x)
According to this formalism, we have:
Factual knowledge: conjunctions of atoms
Ontology: set of rules
Conjunctive Query: conjunctions of atoms
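To make the formalism concrete, here is a minimal Java sketch of atoms and rules. The class names (FolSketch, Atom, Rule) and the convention that variables start with a lower-case letter are assumptions made for this illustration; they are not taken from the ALASKA code base.

```java
import java.util.List;

// Hypothetical data types for illustration; these are not ALASKA's actual classes.
final class FolSketch {

    /** An atom: a predicate applied to terms, e.g. parrot(Bob) or friend(Alice,Bob). */
    static final class Atom {
        final String predicate;
        final List<String> terms;

        Atom(String predicate, String... terms) {
            this.predicate = predicate;
            this.terms = List.of(terms);
        }

        @Override
        public String toString() {
            return predicate + "(" + String.join(",", terms) + ")";
        }
    }

    /** A rule: if all hypothesis atoms hold, the conclusion atom holds. */
    static final class Rule {
        final List<Atom> hypothesis;
        final Atom conclusion;

        Rule(List<Atom> hypothesis, Atom conclusion) {
            this.hypothesis = hypothesis;
            this.conclusion = conclusion;
        }

        @Override
        public String toString() {
            StringBuilder sb = new StringBuilder();
            for (Atom a : hypothesis) {
                if (sb.length() > 0) sb.append(" & ");
                sb.append(a);
            }
            return sb + " -> " + conclusion;
        }
    }

    /** Assumed convention: variables start with a lower-case letter, constants do not. */
    static boolean isVariable(String term) {
        return Character.isLowerCase(term.charAt(0));
    }

    public static void main(String[] args) {
        Atom fact = new Atom("friend", "Alice", "Bob");
        Rule rule = new Rule(List.of(new Atom("bird", "x")), new Atom("flies", "x"));
        System.out.println(fact);
        System.out.println(rule);
    }
}
```

In this sketch, factual knowledge and conjunctive queries would both be lists of Atom, and the ontology a list of Rule.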
Equivalences
Depending on the chosen set of rules, our problem is semantically equivalent to other problems that are being studied, or have already been studied, in the literature.
If O is empty, our problem becomes equivalent to the entailment problem in the RDF language.
If O is a set of ∀-rules, we enter the scope of RDFS, Datalog and Conceptual Graphs (CGs).
[“if x has a car, then x has a driving licence”]
If O is a set of ∀∃-rules, we obtain an equivalence to the problems found in Datalog± and CGs with rules.
[“if x is a human, then there exists y, another human, who is a parent of x”]
Deduction
F ⊨ Q iff there is a substitution S that maps every term of the query to a term of the facts, so that each atom of the query becomes an atom of the facts.
Problem: Finding substitutions (also known as ENTAILMENT)
{F, O} ⊨ Q iff, once F has been enriched by O, there is a substitution S that maps every term of the query to a term of the enriched facts.
Problem: Applying rules, finding substitutions (also known as RULE-ENTAILMENT)
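The search for a substitution can be sketched as a simple backtracking procedure. This is an illustrative sketch, not the algorithm used in ALASKA: atoms are encoded as string arrays {predicate, term1, ..., termN}, and terms starting with a lower-case letter are assumed to be variables. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative backtracking search for a substitution; names are hypothetical.
final class HomomorphismSketch {

    /** Builds an atom as {predicate, term1, ..., termN}. */
    static String[] atom(String... parts) {
        return parts;
    }

    /** Assumed convention: variables start with a lower-case letter. */
    static boolean isVariable(String term) {
        return Character.isLowerCase(term.charAt(0));
    }

    /** Returns a substitution from query variables to fact terms, or null if there is none. */
    static Map<String, String> findSubstitution(List<String[]> query, List<String[]> facts) {
        Map<String, String> substitution = new HashMap<>();
        return extend(query, 0, facts, substitution) ? substitution : null;
    }

    private static boolean extend(List<String[]> query, int i, List<String[]> facts,
                                  Map<String, String> s) {
        if (i == query.size()) return true; // every query atom has been mapped
        String[] q = query.get(i);
        for (String[] f : facts) {
            if (!f[0].equals(q[0]) || f.length != q.length) continue;
            List<String> boundHere = new ArrayList<>();
            boolean ok = true;
            for (int k = 1; k < q.length && ok; k++) {
                if (!isVariable(q[k])) {
                    ok = q[k].equals(f[k]);            // constants must match exactly
                } else if (s.containsKey(q[k])) {
                    ok = s.get(q[k]).equals(f[k]);     // variable already bound
                } else {
                    s.put(q[k], f[k]);                 // bind a new variable
                    boundHere.add(q[k]);
                }
            }
            if (ok && extend(query, i + 1, facts, s)) return true;
            for (String v : boundHere) s.remove(v);    // undo bindings and try the next fact
        }
        return false;
    }

    public static void main(String[] args) {
        List<String[]> facts = List.of(atom("animal", "Alice"), atom("animal", "Bob"),
                                       atom("bird", "Bob"), atom("flies", "Bob"));
        List<String[]> query = List.of(atom("animal", "x"), atom("flies", "x"));
        System.out.println(findSubstitution(query, facts));
    }
}
```

The sketch scans all facts for each query atom; the elementary operations listed later in the talk are what a storage system would need to make this step efficient.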
Rule application
There are two distinct methods for applying rules:
Forward chaining (seen in the example): the information in the knowledge base is extended by applying the rules. Queries are evaluated against the facts (homomorphism computation) once no more information can be added (the base is saturated).
Backward chaining: the initial query is decomposed/rewritten according to the rules of the ontology. The resulting queries are then evaluated against the knowledge base, which is left unmodified.
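Forward chaining on the Alice and Bob example can be sketched as a saturation loop. This sketch makes strong simplifying assumptions: it only handles unary ∀-rules of the form body(x) → head(x), facts are encoded as [predicate, individual] pairs, and all names are hypothetical. With ∀∃-rules, saturation is not guaranteed to terminate, which this sketch does not address.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative saturation for unary rules body(x) -> head(x); names are hypothetical.
final class ForwardChainingSketch {

    /** A unary fact such as clownfish(Alice), encoded as [predicate, individual]. */
    static List<String> fact(String predicate, String individual) {
        return List.of(predicate, individual);
    }

    /** Applies the rules {body, head} until no new fact can be added. */
    static Set<List<String>> saturate(Set<List<String>> facts, List<String[]> rules) {
        Set<List<String>> base = new LinkedHashSet<>(facts);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (String[] rule : rules) {
                // Collect derived facts first: the base cannot be modified while iterating.
                List<List<String>> derived = new ArrayList<>();
                for (List<String> f : base) {
                    if (f.get(0).equals(rule[0])) derived.add(fact(rule[1], f.get(1)));
                }
                if (base.addAll(derived)) changed = true;
            }
        }
        return base;
    }

    public static void main(String[] args) {
        Set<List<String>> facts = new LinkedHashSet<>(List.of(
            fact("animal", "Alice"), fact("animal", "Bob"),
            fact("clownfish", "Alice"), fact("parrot", "Bob")));
        List<String[]> rules = List.of(
            new String[] {"clownfish", "fish"}, new String[] {"fish", "swims"},
            new String[] {"parrot", "bird"}, new String[] {"bird", "flies"});
        System.out.println(saturate(facts, rules));
    }
}
```

Once the base is saturated, Query #2 from the example reduces to finding an individual that appears in both an animal fact and a flies fact.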
Elementary operations
The efficiency of the substitution-finding and rule-application steps depends on the efficiency of a few elementary operations:
Finding substitutions (homomorphism):
Retrieving a term in the knowledge base.
Retrieving adjacent terms (neighbourhood) of a given term.
Checking the existence of an atom with given terms.
Rule application:
Finding substitutions.
Inserting new pieces of information incrementally (not all at once).
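The elementary operations above can be illustrated with a small in-memory store. This is only a sketch under assumed names (InMemoryFactSketch and its methods are hypothetical); a real storage system would implement the same operations on disk, which is precisely what the talk sets out to compare.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative in-memory store exposing the elementary operations; names are hypothetical.
final class InMemoryFactSketch {
    private final Set<List<String>> atoms = new HashSet<>();
    private final Map<String, Set<String>> neighbours = new HashMap<>();

    private static List<String> key(String predicate, String... terms) {
        List<String> atom = new ArrayList<>();
        atom.add(predicate);
        atom.addAll(Arrays.asList(terms));
        return atom;
    }

    /** Incremental insertion: one atom at a time. */
    void add(String predicate, String... terms) {
        atoms.add(key(predicate, terms));
        for (String t : terms) {
            Set<String> adjacent = neighbours.computeIfAbsent(t, k -> new TreeSet<>());
            for (String u : terms) {
                if (!u.equals(t)) adjacent.add(u);
            }
        }
    }

    /** Retrieving a term in the knowledge base. */
    boolean containsTerm(String term) {
        return neighbours.containsKey(term);
    }

    /** Retrieving the terms that share at least one atom with the given term. */
    Set<String> neighbourhood(String term) {
        return neighbours.getOrDefault(term, Collections.emptySet());
    }

    /** Checking the existence of an atom with given terms. */
    boolean containsAtom(String predicate, String... terms) {
        return atoms.contains(key(predicate, terms));
    }

    public static void main(String[] args) {
        InMemoryFactSketch kb = new InMemoryFactSketch();
        kb.add("friend", "Alice", "Bob");
        kb.add("parrot", "Bob");
        System.out.println(kb.neighbourhood("Alice"));
        System.out.println(kb.containsAtom("parrot", "Bob"));
    }
}
```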
Overview
Until very recently...
Factual knowledge = RDBMS
However, several new factors have appeared, changing the nature of the problem...
New factors
Semi-structured data (Abiteboul, 1997)
Knowledge bases with: “irregular, partial or implicit structure”, “very large schema”, “schema is ignored”, “schema evolving rapidly”, “difficult distinction between schema and data”, etc.
Emergence of semi-structured knowledge bases on the web.
KBs can now be very large (see the Semantic Web).
For our work: Large → Does not fit in main memory.
State of the art
What we already know about the subject...
RDBs handle data stored in secondary memory very well; however:
Using SQL for querying is not the best solution, as it relies on joins, which become very costly on larger queries.
Homomorphism algorithms use SQL statements for elementary operations: their complexity also depends on the size of the tables.
Graph homomorphism algorithms work very well on graphs stored in memory. They have not yet been tested on GDBs.
Objectives
Three different approaches to this problem exist in the literature:
1 Approximate and probabilistic algorithms.
2 Algorithm optimization.
3 Analysis of storage methods.
We try to show that items 2 and 3 are tightly correlated. How?
Investigating different storage models (RDBs, GDBs & Triple Stores) and their internal data structures.
Using an abstract architecture to compare their efficiency on elementary operations.
Writing an efficient algorithm for deduction.
ALASKA platform
ALASKA platform
Abstract Logic-based Architecture Storage systems & Knowledge base Analysis
Its goal is to make it possible to perform OCQA in a logical, generic manner over existing, heterogeneous storage systems.
Graph to RDB, RDB to Graph, all using an intermediary translation into logic.
Features & details
Multi-layered architecture: the program goes from higher-level operations down to disk I/O functions.
Classes and interfaces ensure that all connected storage systems expose the same methods, using a common data type (based on FOL).
Written in Java: very easy to plug several pieces of code in, but at a significant cost in speed and efficiency.
Systems already connected: TSs (Jena, Sesame), RDBs (MySQL, SQLite), GDBs (DEX, Neo4j). Non-definitive list.
All layers below the application layer serve as the lower-level part of OCQA (and other KR problems) computation.
Class diagram
[Figure: Class diagram of ALASKA architecture. Elements shown: KRR operations; interfaces IFact, IAtom, ITerm; classes Predicate, Term, Atom; GDB, RDB and TS connectors; GDB, RDB and TS storage systems. Layers: Application layer (1), Abstract layer (2), Translation layer (3), Data layer (4).]
ALASKA for OCQA
Our current goal is to use ALASKA to measure the efficiency of the connected systems on elementary operations:
Storage tests:
Measuring the time taken and the resulting size when storing smaller, then larger knowledge bases on disk.
Querying tests:
Measuring the time each system takes to answer a set of queries using different algorithms/query engines.
Once both tests are done, there will be a result analysis stage:
Is the best system for storage also the best for querying?
Is there a system that performs excellently on a certain task?
Storage algorithm
input = getInputManager();
fact = new XFact(DB location);   // A fact is created or loaded. X ∈ {DEX, SQLite, Neo4j, MySQL, etc.}
atoms = input.parse(content);    // Content is parsed; an atom iterator is returned.
fact.store(atoms);               // Atoms are added to the fact according to the storage type.
Storage algorithm
Case for a graph database:
Algorithm 1: KB to Hypergraph
Input: A, an atom iterator
Output: a boolean value
begin
    g ← empty graph;
    foreach Atom a in A do
        foreach Term t in a.terms do
            if !exists node with label t then
                if t is a constant term then t ← c:t;
                else t ← v:t;
        add hyperedge (t1,...,tn) with label a.predicate to g;
    return true;
end
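Algorithm 1 can be approximated in Java with an in-memory hypergraph. This is a sketch, not the DEX or Neo4j connector: the class name is hypothetical, atoms are encoded as {predicate, term1, ..., termN}, and constants are assumed to start with an upper-case letter. The sketch also makes the node insertion explicit, which the pseudocode above leaves implicit.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative in-memory hypergraph; not an actual graph database connector.
final class HypergraphSketch {
    final Set<String> nodes = new LinkedHashSet<>();
    /** Each hyperedge is [predicate, node1, ..., nodeN]. */
    final List<List<String>> hyperedges = new ArrayList<>();

    /** Prefixes a term with c: (constant) or v: (variable), as in Algorithm 1. */
    static String label(String term) {
        // Assumed convention: constants start with an upper-case letter.
        return (Character.isUpperCase(term.charAt(0)) ? "c:" : "v:") + term;
    }

    /** Stores every atom {predicate, term1, ..., termN} as a labelled hyperedge. */
    boolean store(Iterable<String[]> atoms) {
        for (String[] atom : atoms) {
            List<String> edge = new ArrayList<>();
            edge.add(atom[0]);
            for (int i = 1; i < atom.length; i++) {
                String node = label(atom[i]);
                nodes.add(node); // no effect if the node already exists
                edge.add(node);
            }
            hyperedges.add(edge);
        }
        return true;
    }

    public static void main(String[] args) {
        HypergraphSketch g = new HypergraphSketch();
        g.store(List.of(new String[] {"friend", "Alice", "Bob"},
                        new String[] {"parrot", "Bob"}));
        System.out.println(g.nodes);
        System.out.println(g.hyperedges);
    }
}
```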
Storage algorithm
Case for a relational database:
Algorithm 2: KB to RDB
Input: A, an atom iterator
Output: a boolean value
begin
    foreach Atom a in A do
        p ← a.predicate;
        if !exists table with label p then
            create table with name p;
        foreach Term t in a.terms do
            if t is a constant term then t ← c:t;
            else t ← v:t;
        insert (t1,...,tn) into table p;
    return true;
end
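Algorithm 2 can be sketched in Java as a translation of atoms into SQL statements. The sketch only builds the statement strings and does not connect to a database. The table layout (one table per predicate, text columns t1..tn), the class name and the upper-case convention for constants are assumptions made for illustration. Values are inlined without escaping for readability; real code would use JDBC prepared statements instead.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative translation of atoms into SQL statements; the schema is an assumption.
final class RdbStoreSketch {

    /** Returns the CREATE TABLE / INSERT statements for atoms {predicate, term1, ..., termN}. */
    static List<String> toSql(Iterable<String[]> atoms) {
        List<String> sql = new ArrayList<>();
        Set<String> knownTables = new HashSet<>();
        for (String[] atom : atoms) {
            String p = atom[0];
            int arity = atom.length - 1;
            if (knownTables.add(p)) { // first time this predicate is seen
                List<String> columns = new ArrayList<>();
                for (int i = 1; i <= arity; i++) columns.add("t" + i + " TEXT");
                sql.add("CREATE TABLE " + p + " (" + String.join(", ", columns) + ")");
            }
            List<String> values = new ArrayList<>();
            for (int i = 1; i <= arity; i++) {
                // Assumed convention: constants start with an upper-case letter.
                String prefix = Character.isUpperCase(atom[i].charAt(0)) ? "c:" : "v:";
                values.add("'" + prefix + atom[i] + "'");
            }
            sql.add("INSERT INTO " + p + " VALUES (" + String.join(", ", values) + ")");
        }
        return sql;
    }

    public static void main(String[] args) {
        List<String[]> atoms = List.of(new String[] {"friend", "Alice", "Bob"},
                                       new String[] {"friend", "Bob", "x"});
        for (String statement : toSql(atoms)) System.out.println(statement);
    }
}
```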
Workflow
[Figure: Testing protocol workflow for storing a knowledge base in RDF. Elements shown: RDF File, Input Manager, RDF Parser, IFact Manager, IFact to GDB Translation, IFact to RDB Translation, Graph DB, Relational DB, Triple Store. Layers: (1) to (4).]
Input elements
For our tests, we use knowledge bases from the SP2B Project:
Presented in 2008 at ISWC.
Initially a SPARQL benchmark.
Defines a set of queries that covers all SPARQL specifications.
Also features a knowledge base generator, inspired by the DBLP structure.
The generator can create bases of any size while maintaining the same structure.
Preliminary results
Using our platform, we have evaluated the insertion efficiency of different storage systems:
[Figure: Knowledge Base, Transformation into IFact, then Relational Database, Graph Database and Triple Store.]
Issues
Results have shown that our method was not really appropriate for the size of the knowledge bases we aim to work with:
Parsing issues:
The amount of memory our program uses depends on the parsing method.
Higher memory consumption at parsing = less memory available for the storage system.
Transaction sizes:
Beyond a certain size, it is impossible to store all the information at once (most systems started swapping).
Creation of an atom buffer: information is processed in pieces, parsed and then stored in smaller transactions.
Garbage collection:
GC overhead limit errors occurred on almost every storage system for bases beyond 20M triples.
Recycling Java objects became mandatory: setting/resetting object attributes instead of creating/destroying objects.
Changes & improvements
The algorithm was then changed to the following version:
input = getInputManager();
fact = new XFact(DB location);
input.store(fact, content);
Calling the store method makes the parser create the atom buffer (array). An event is fired when the parser finishes parsing a statement.
Event handling method:
if (buffer is full) { fact.store(buffer); position = 0; }   // The fact now stores only N (buffer size) atoms at a time.
buffer[position].setPredicate(stmtPredicate);
buffer[position].setTerms([stmtSubject, stmtObject]);        // Atoms in the buffer are recycled (number of atoms created/destroyed = buffer size).
position++;
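The buffered, object-recycling scheme can be sketched as follows. This is an illustration under assumptions, not ALASKA's code: the class and method names are hypothetical, the storage system is replaced by an in-memory list, and an explicit final flush() is added for the last, partially filled buffer, a step the slide does not show.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative atom buffer with object recycling; names are hypothetical.
final class AtomBufferSketch {

    /** Mutable atom whose fields are overwritten instead of allocating a new object. */
    static final class MutableAtom {
        String predicate;
        String subject;
        String object;
    }

    private final MutableAtom[] buffer;
    private int position = 0;
    /** Stands in for the storage system (RDB, GDB or triple store). */
    final List<String> stored = new ArrayList<>();
    int transactions = 0;

    AtomBufferSketch(int size) {
        buffer = new MutableAtom[size];
        for (int i = 0; i < size; i++) buffer[i] = new MutableAtom(); // allocated only once
    }

    /** Called each time the parser finishes a statement. */
    void onStatement(String subject, String predicate, String object) {
        if (position == buffer.length) flush(); // buffer is full: store it first
        MutableAtom atom = buffer[position++];
        atom.predicate = predicate;             // recycle the existing object
        atom.subject = subject;
        atom.object = object;
    }

    /** Stores the buffered atoms as one small transaction. */
    void flush() {
        if (position == 0) return;
        for (int i = 0; i < position; i++) {
            MutableAtom a = buffer[i];
            stored.add(a.predicate + "(" + a.subject + "," + a.object + ")");
        }
        transactions++;
        position = 0;
    }

    public static void main(String[] args) {
        AtomBufferSketch sink = new AtomBufferSketch(2);
        for (int i = 0; i < 5; i++) sink.onStatement("s" + i, "p" + i, "o" + i);
        sink.flush(); // store the last, partially filled buffer
        System.out.println(sink.stored + " in " + sink.transactions + " transactions");
    }
}
```

The buffer size trades memory for transaction count: a larger buffer means fewer, bigger transactions, a smaller one keeps more memory free for the storage system.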
New results
Interrogation tests
The next step in the project is to run interrogation (querying) tests.
Using the platform and a Datalog-to-SQL algorithm, we aim to evaluate the querying performance of the selected storage systems:
For GDBs: comparing the efficiency of each system using the same algorithm.
For RDBs: comparing the efficiency of our algorithm against the native SQL interface.
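The talk does not detail the Datalog-to-SQL algorithm. As a rough idea of what such a translation involves, here is a sketch of the textbook way to turn a Boolean conjunctive query into a SQL join. It assumes one table per predicate with columns t1..tn and constants stored with a c: prefix; the class name, schema and conventions are assumptions and not ALASKA's actual algorithm.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative conjunctive-query-to-SQL translation; schema and names are assumptions.
final class QueryToSqlSketch {

    /** Translates atoms {predicate, term1, ..., termN} into a Boolean SQL query. */
    static String toSql(List<String[]> query) {
        List<String> from = new ArrayList<>();
        List<String> where = new ArrayList<>();
        Map<String, String> firstColumnOf = new HashMap<>(); // variable -> first column seen
        for (int i = 0; i < query.size(); i++) {
            String[] atom = query.get(i);
            String alias = "a" + i;
            from.add(atom[0] + " AS " + alias);
            for (int k = 1; k < atom.length; k++) {
                String column = alias + ".t" + k;
                String term = atom[k];
                if (Character.isUpperCase(term.charAt(0))) {
                    where.add(column + " = 'c:" + term + "'");            // constant: selection
                } else if (firstColumnOf.containsKey(term)) {
                    where.add(firstColumnOf.get(term) + " = " + column);  // repeated variable: join
                } else {
                    firstColumnOf.put(term, column);
                }
            }
        }
        String sql = "SELECT 1 FROM " + String.join(", ", from);
        if (!where.isEmpty()) sql += " WHERE " + String.join(" AND ", where);
        return sql + " LIMIT 1";
    }

    public static void main(String[] args) {
        List<String[]> query = List.of(new String[] {"animal", "x"},
                                       new String[] {"flies", "x"});
        System.out.println(toSql(query));
    }
}
```

Each repeated variable becomes a join condition, which is why larger queries translate into SQL statements with many joins.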
Workflow
The workflow of the tests is detailed below:
[Figure: F ⊨ Q is submitted to the abstract architecture, which translates the query (Q → SQL, Q → Graph) and sends it to the relational DB and the graph DB. Two result tables report query size (in terms) and time (in seconds): one comparing BT and SQL, the other comparing BT and Graph.]
Current & Future work
Currently, we focus on finding (really) difficult queries to check how the algorithms behave in these cases.
But some questions in this field are still open:
Traversal queries: can they enhance homomorphism computation? How?
Real-world KBs vs. generated KBs.
Can we integrate a constraint-solving program for computing homomorphisms?
Questions
Thank you!
Questions & comments...