An Ontology-Extended Relational Algebra

Piero Bonatti, Università di Napoli "Federico II"

Yu Deng and V.S. Subrahmanian, University of Maryland College Park

10/28/2003 Ontology Extended Algebra 2

Outline

Problem statement

Approach

Motivating example

Ontology-extended relational algebra

HOME system

Contributions

Related work

Future work

Problem Statement

Integrating heterogeneous data sources is an important problem. Many projects address it, but only at the syntactic level.

Our goal:

Integrate data sources with diverse structures and assumptions at the semantic level.

Answer queries correctly under the user’s assumptions about the semantic meaning of the terms being used.

Approach

Associate ontologies with data sources.

Ontology interoperation.

Extend the relational data model and relational algebra.

Motivating Example

Two parts relations:

Relation Parts1 with the schema (Name, Cost, Shipping)

Relation Parts2 with the schema (Item, Price, ShipCost)

Two insurance claim relations:

Relation Claims1 with the schema (ClaimId, Type, Cost)

Relation Claims2 with the schema (ClaimNumber, Type, Value)

Parts1 and Parts2 Relations

Parts1 relation

Name         Cost     Shipping
Tire         54.19    20.05
Gasket       3.05     1.55
Valve        3.35     1.55
Brake pads   78.50    8.50
Evaporator   305.00   11.50

Parts2 relation

Item         Price    ShipCost
Wheel        50.05    18.00
Air Gasket   3.00     1.70
Valve        3.35     1.55
Hubcap       11.50    6.00
Spark Plug   20.00    8.50

Problems (1)

When users specify a query spanning these two relations, they may wonder:

Do the fields Cost and Price mean the same thing?

Is wheel a part of tire?

Is air gasket a gasket?

Furthermore, is the field Cost expressed in US dollars? Is the field Price expressed in euros?

Users cannot answer these questions just by looking at the fields.

Claims1 and Claims2 Relations

Claims1 relation

ClaimId   Type       Cost
1         burglary   2000
2         theft      150
3         mugging    860
4         arson      1800

Claims2 relation

ClaimNumber   Type            Value
1             robbery         400
2             fire            550
3             auto accident   500
4             burglary        250

Problem (2)

Users may have a query such as “Find all the thefts that involved a cost of over $1000”. The system should automatically recognize that burglaries, muggings and robberies count as thefts.

In addition, unit conversions are needed if the costs in this query are represented in different units.

Ontology Extended Relation (OER)

We use ontologies to convey the semantics of terms in a domain, and we associate ontologies with relations.

Intuitively, an ontology extended relation is an ordinary relation together with an associated ontology.

Ontology

Suppose ∑ is some finite set of strings and S is some set. An ontology w.r.t. ∑ is a partial mapping Θ from ∑ to hierarchies for S.

For example, ∑ = {isa, part_of, affects}.

A hierarchy can be regarded as a Hasse diagram associated with a partial ordering. We provide a formal definition in our paper.
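The definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `Hierarchy` and its method `leq` are hypothetical, a hierarchy is stored as the (lower, upper) edges of its Hasse diagram, and the two isa edges are taken from the talk's statement that burglaries and muggings count as thefts.

```python
from collections import defaultdict


class Hierarchy:
    """A partial order stored as the edges of its Hasse diagram (lower -> upper)."""

    def __init__(self, edges, nodes=()):
        self.up = defaultdict(set)
        self.nodes = set(nodes)
        for lower, upper in edges:
            self.up[lower].add(upper)
            self.nodes.update((lower, upper))

    def leq(self, x, y):
        """Return True when x <= y, i.e. y is reachable from x (reflexive)."""
        seen, stack = set(), [x]
        while stack:
            node = stack.pop()
            if node == y:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(self.up[node])
        return False


# An ontology w.r.t. {"isa", "part_of", "affects"} is a PARTIAL mapping:
# only "isa" is defined here, so the other two strings have no hierarchy.
claims1_ontology = {
    "isa": Hierarchy(
        edges=[("mugging", "theft"), ("burglary", "theft")],  # assumed edges
        nodes=["arson"],
    )
}

isa = claims1_ontology["isa"]
print(isa.leq("mugging", "theft"))     # True
print(isa.leq("arson", "theft"))       # False
print("part_of" in claims1_ontology)   # False
```

Storing only the Hasse edges keeps the structure small; the full ordering is recovered on demand by reachability.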

Ontology Example

[Figure: ontology associated with Claims1 relation (∑ = {isa}); nodes: theft, mugging, arson, burglary]

[Figure: ontology associated with Parts2 relation (∑ = {part_of}); nodes: Wheel, Valve, Air Gasket, Hubcap, Spark Plug]

Ontology Integration

Example query: Find all the thefts that involved a cost of over $1000.

Ontology integration is needed to answer this query, and more generally whenever a binary operation is performed between two ontology extended relations.

Interoperation constraints specify the connections between ontologies. We consider x = y, x ≤ y, x ≠ y and x ≰ y, where x and y come from two different hierarchies.

Definition of Hierarchy Integration

Suppose (Hi, ≤i), 1 ≤ i ≤ n, are n different hierarchies and suppose IC is a finite set of interoperation constraints. A hierarchy (H, ≤) is said to be an integration of (Hi, ≤i), 1 ≤ i ≤ n, iff there are n injective mappings φ1,…,φn from H1,…,Hn respectively to H such that:

(∀i ∈ {1,…,n}) x ≤i y ⇒ φi(x) ≤ φi(y).

(∀x ∈ Hi)(∀y ∈ Hj) (x:i op y:j) ∈ IC ⇒ φi(x) op φj(y).

[Figure: the hierarchies H1, H2, …, Hn are mapped into H by φ1, φ2, …, φn]

Example of Hierarchy Integration

[Figure: isa hierarchy with Claims1 relation; nodes: theft, mugging, arson, burglary]

[Figure: isa hierarchy with Claims2 relation; nodes: robbery, fire, auto-accident, burglary]

[Figure: integrated isa hierarchy for Claims1 and Claims2; nodes: theft, mugging, arson, burglary, fire, auto-accident]

IC = {theft:1 = robbery:2, arson:1 ≤ fire:2}

With the integrated hierarchy, the system can recognize that burglaries, muggings and robberies count as thefts.

Canonical Hierarchy

Suppose (Hi, ≤i), 1 ≤ i ≤ n, are n different hierarchies and suppose IC is a finite set of interoperation constraints. The canonical hierarchy (H*, ≤*) of (Hi, ≤i), 1 ≤ i ≤ n, is defined as follows.

H* is the set of all strongly connected components of the graph associated with (Hi, ≤i), 1 ≤ i ≤ n.

If x*, y* ∈ H*, then x* ≤* y* iff either x* = y* or there exists a directed path from x:i to y:j (for some x:i ∈ x* and y:j ∈ y*) in the hierarchy graph associated with (Hi, ≤i), 1 ≤ i ≤ n.

Example of Canonical Hierarchy

[Figure: isa hierarchy with Claims1 relation; nodes: theft, mugging, arson, burglary]

[Figure: isa hierarchy with Claims2 relation; nodes: robbery, fire, auto-accident, burglary]

IC = {theft:1 = robbery:2, arson:1 ≤ fire:2}

[Figure: canonical hierarchy with Claims1 and Claims2; nodes: theft/robbery (merged), burglary, mugging, fire, arson, auto-accident]
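The construction of the canonical hierarchy can be sketched as follows. The function name `canonical_hierarchy` and its input format are hypothetical, and so are the Hasse edges: mugging and burglary below theft in Claims1, and burglary below robbery in Claims2, are assumptions read off the figures. The sketch always tags a term with its source, so burglary:1 and burglary:2 stay distinct unless a constraint relates them. Components are found by mutual reachability, which is adequate for small graphs; a linear-time SCC algorithm could be swapped in.

```python
from itertools import product


def canonical_hierarchy(hierarchies, constraints):
    """Build (H*, <=*) from per-source Hasse edges and '=' / '<=' constraints.

    hierarchies: {source_id: (nodes, edges)}, edges given as (lower, upper).
    constraints: iterable of ((x, i), op, (y, j)) with op in {"=", "<="}.
    """
    succ = {}
    for i, (nodes, edges) in hierarchies.items():
        for x in nodes:
            succ.setdefault((x, i), set())
        for lower, upper in edges:
            succ.setdefault((lower, i), set()).add((upper, i))
            succ.setdefault((upper, i), set())
    for a, op, b in constraints:
        succ.setdefault(a, set()).add(b)      # x <= y gives an edge x -> y
        succ.setdefault(b, set())
        if op == "=":
            succ[b].add(a)                    # x = y gives edges both ways

    def reachable(start):
        seen, stack = {start}, [start]
        while stack:
            for nxt in succ[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    reach = {v: reachable(v) for v in succ}
    # Strongly connected component of v: the nodes mutually reachable with v.
    component = {v: frozenset(u for u in reach[v] if v in reach[u]) for v in succ}
    h_star = set(component.values())
    # c <=* d iff some node of d is reachable from some node of c
    # (this covers c == d, because every node reaches itself).
    leq_star = {(c, d) for c, d in product(h_star, repeat=2)
                if any(y in reach[x] for x in c for y in d)}
    return component, h_star, leq_star


claims1 = ({"theft", "mugging", "burglary", "arson"},
           [("mugging", "theft"), ("burglary", "theft")])      # assumed edges
claims2 = ({"robbery", "burglary", "fire", "auto-accident"},
           [("burglary", "robbery")])                          # assumed edge
ic = [(("theft", 1), "=", ("robbery", 2)), (("arson", 1), "<=", ("fire", 2))]

component, h_star, leq_star = canonical_hierarchy({1: claims1, 2: claims2}, ic)
print(component[("theft", 1)] == component[("robbery", 2)])               # True
print((component[("burglary", 2)], component[("theft", 1)]) in leq_star)  # True
```

In the result, theft:1 and robbery:2 fall into one component, and a burglary from Claims2 sits below that component, which matches the requirement that burglaries and robberies count as thefts.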

Theorems about Hierarchy Integrability

Let (Hi, ≤i), 1 ≤ i ≤ n, be a family of hierarchies and suppose (H*, ≤*) is its canonical hierarchy. Suppose (H, ≤), φ1,…,φn is an arbitrary witness to the integration of (Hi, ≤i), 1 ≤ i ≤ n. Then: [x:i] ≤* [y:j] ⇒ φi(x) ≤ φj(y).

A set (Hi, ≤i), 1 ≤ i ≤ n, of hierarchies is integrable if and only if the canonical witness of (Hi, ≤i), 1 ≤ i ≤ n, is a witness to the integrability of (Hi, ≤i), 1 ≤ i ≤ n.

This gives an efficient procedure for integrating hierarchies: compute the canonical hierarchy and check whether it is a witness.
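One possible reading of that check is sketched below. It is an interpretation, not the authors' algorithm: the function name `integrable`, the input format and the operator strings "!=" and "!<=" are hypothetical. The sketch builds the graph behind the canonical hierarchy, then verifies that the canonical mapping x:i to [x:i] is injective on each source hierarchy and that every ≠ and ≰ constraint holds. The = and ≤ constraints hold by construction.

```python
def integrable(hierarchies, constraints):
    """Return True if the canonical hierarchy is a witness (sketch).

    hierarchies: {i: list of (lower, upper) Hasse edges}
    constraints: list of ((x, i), op, (y, j)), op in {"=", "<=", "!=", "!<="}
    """
    succ = {}

    def add_edge(a, b):
        succ.setdefault(a, set()).add(b)
        succ.setdefault(b, set())

    for i, edges in hierarchies.items():
        for lower, upper in edges:
            add_edge((lower, i), (upper, i))
    for a, op, b in constraints:
        succ.setdefault(a, set())
        succ.setdefault(b, set())
        if op in ("=", "<="):
            add_edge(a, b)
        if op == "=":
            add_edge(b, a)

    def reachable(start):
        seen, stack = {start}, [start]
        while stack:
            for nxt in succ[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    reach = {v: reachable(v) for v in succ}

    def merged(a, b):
        """True when a and b lie in the same strongly connected component."""
        return b in reach[a] and a in reach[b]

    # Injectivity: two distinct terms of ONE hierarchy must not be merged.
    for a in succ:
        for b in succ:
            if a != b and a[1] == b[1] and merged(a, b):
                return False
    # Negative constraints must hold in the canonical hierarchy.
    for a, op, b in constraints:
        if op == "!=" and merged(a, b):
            return False
        if op == "!<=" and b in reach[a]:
            return False
    return True


hierarchies = {1: [("mugging", "theft"), ("burglary", "theft")],  # assumed edges
               2: [("burglary", "robbery")]}                      # assumed edge
ic = [(("theft", 1), "=", ("robbery", 2)), (("arson", 1), "<=", ("fire", 2))]

print(integrable(hierarchies, ic))                                            # True
print(integrable(hierarchies, ic + [(("robbery", 2), "<=", ("mugging", 1))]))  # False
```

The second call fails because the extra constraint closes a cycle mugging:1 ≤ theft:1 = robbery:2 ≤ mugging:1, which would merge two distinct terms of Claims1.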

Definition of Ontology Integrability

Suppose ∑ is some finite set of strings, S is some set, and Θ1,…,Θn are ontologies w.r.t. ∑, S. Suppose IC is a finite set of interoperation constraints. The ontologies Θ1,…,Θn are integrable iff for every x ∈ ∑, Θ1(x),…,Θn(x) are integrable.

Definition of OER

An ontology extended relation is a triple (R, S, Hisa), where S is a schema (A1:τ1, …, An:τn), Hisa is an isa hierarchy, and the following constraints are satisfied:

τ1,…,τn ∈ Tisa

R ⊆ belowHisa(τ1) × … × belowHisa(τn)

where belowH(τ) = {τ’ | τ’ ≤ τ} ∪ dom(τ).

Ontology Extended Relational Algebra (1)

Example query: Find the car parts from the Parts1 relation that are more expensive than Wheel in the Parts2 relation. A conversion function is needed to answer this query.

Conversion function: for each pair of types τi and τj, we assume there exists at most one conversion function τi2τj : dom(τi) → dom(τj).

Given a term X and a tuple t of relation R, Xt is defined as:

t.Ai, if X = Ai;

τ, if X = τ;

v, if X = v:τ.
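A minimal sketch of how conversion functions could be applied to the example query. Everything unit-related here is an assumption made for illustration: the slides only ask whether Cost is in US dollars and Price in euros, the type names "usd" and "eur" are hypothetical, and the exchange rate is invented. The helpers `register` and `convert` are hypothetical names as well.

```python
# ASSUMPTIONS: Parts1.Cost is in USD, Parts2.Price is in EUR,
# and EUR_TO_USD is an illustrative rate, not a real one.
EUR_TO_USD = 1.2

conversions = {}


def register(src, dst, fn):
    """Register the single conversion function allowed for the pair (src, dst)."""
    if (src, dst) in conversions:
        raise ValueError(f"a conversion {src}2{dst} already exists")
    conversions[(src, dst)] = fn


def convert(value, src, dst):
    """Apply src2dst; the identity is used when both types coincide."""
    if src == dst:
        return value
    return conversions[(src, dst)](value)


register("eur", "usd", lambda v: v * EUR_TO_USD)

parts1 = [("Tire", 54.19), ("Gasket", 3.05), ("Valve", 3.35),
          ("Brake pads", 78.50), ("Evaporator", 305.00)]
wheel_price_eur = 50.05  # Price of Wheel in Parts2

# Both sides are compared in a common supertype, assumed here to be "usd".
threshold = convert(wheel_price_eur, "eur", "usd")
answer = [name for name, cost in parts1 if convert(cost, "usd", "usd") > threshold]
print(answer)  # ['Brake pads', 'Evaporator']
```

Under these assumptions Tire is excluded, although 54.19 is numerically larger than 50.05. A comparison that ignores units would have returned it.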

Ontology Extended Relational Algebra (2)

Operations in simple select conditions:

X op Y, op ∈ {=, <>, <, ≤, >, ≥}: let τ be the least common supertype of X and Y; the condition holds iff (type(X)2τ)(Xt) op (type(Y)2τ)(Yt) is true.

X instance_of Y: Yt ∈ T, type(X) ≤H Yt, and Xt ∈ dom(Yt).

X subtype_of Y: Xt ∈ T, Yt ∈ T, Xt ≤H Yt.

If c1, c2 are select conditions, then c1 ∧ c2, c1 ∨ c2, and ¬c1 are select conditions.

Complex operations in select conditions:

X below Y: X instance_of Y ∨ X subtype_of Y.

X above Y: Y below X.

The operators instance_of, subtype_of, below and above are applicable to arbitrary hierarchies.
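The four hierarchy operators can be transcribed almost literally. The sketch below makes several assumptions of its own: terms are already evaluated, a typed value v:τ is represented as a Python tuple `(v, tau)`, `dom` is modelled as a dictionary of membership predicates, and the type names (money, usd, crime) are hypothetical.

```python
def make_leq(edges):
    """Return the reflexive-transitive order induced by (lower, upper) edges."""
    up = {}
    for lower, upper in edges:
        up.setdefault(lower, set()).add(upper)

    def leq(x, y):
        seen, stack = set(), [x]
        while stack:
            node = stack.pop()
            if node == y:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(up.get(node, ()))
        return False

    return leq


# Hypothetical set of types T, hierarchy H and domains.
T = {"money", "usd", "crime", "theft", "mugging"}
leq = make_leq([("usd", "money"), ("theft", "crime"), ("mugging", "theft")])


def is_number(v):
    return isinstance(v, (int, float))


dom = {"money": is_number, "usd": is_number}


def subtype_of(x, y):
    """X subtype_of Y: both are types and X <=_H Y."""
    return x in T and y in T and leq(x, y)


def instance_of(x, y):
    """X instance_of Y: Y is a type, type(X) <=_H Y, and the value is in dom(Y)."""
    if not (isinstance(x, tuple) and y in T):
        return False
    value, tau = x
    return leq(tau, y) and y in dom and dom[y](value)


def below(x, y):
    return instance_of(x, y) or subtype_of(x, y)


def above(x, y):
    return below(y, x)


print(below("mugging", "crime"))       # True  (subtype)
print(below((860, "usd"), "money"))    # True  (instance)
print(below((860, "usd"), "crime"))    # False
print(above("crime", "mugging"))       # True
```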

Ontology Extended Relational Algebra (3)

Suppose (R1, S1, H1),…,(RZ, SZ, HZ) are ontology extended relations, and F is a fusion of H1,…,HZ via witness trF.

If E is a relation Ri, then [E]F = (R, S, F), where R = trF(Ri) and S = (A1:trF(τ1), …, An:trF(τn)).

If E is πAi1,…,Aik(E’) (1 ≤ ij ≤ n, 1 ≤ j ≤ k) and [E’]F = (R’, (A1:τ1, …, An:τn), F), then [E]F = (R, S, F), where R = πAi1,…,Aik(R’) and S = (Ai1:τi1, …, Aik:τik).

If E is E1 × E2 and [Ei]F = (Ri, Si, F) (i = 1, 2), then [E]F = (R, S, F), where R = R1 × R2 and S = S1S2.

If E is σc(E’) and [E’]F = (R’, S, F), then [E]F = (R, S, F), where R = {t ∈ R’ | (R’, S, F), t |= c}.

Example of Selection

Example query: Find all the items from the Parts1 relation that are parts of Tire.

To answer this query:

Use the ontology of Parts1, including its part_of hierarchy.

Retrieve the set of subtypes of Tire with regard to the part_of relationship.

Transform the query based on this set of subtypes.
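The three steps above can be sketched as a simple query rewrite. The part_of edges are hypothetical, since the slides do not list the hierarchy of Parts1, and `parts_below` is an illustrative helper name. The sketch returns the terms strictly below Tire, so Tire itself is excluded. That is a choice of the sketch, not something the slides state.

```python
# HYPOTHETICAL part_of edges (part, whole) for Parts1.
PART_OF = [("Valve", "Tire"), ("Gasket", "Valve")]

parts1 = [("Tire", 54.19, 20.05), ("Gasket", 3.05, 1.55), ("Valve", 3.35, 1.55),
          ("Brake pads", 78.50, 8.50), ("Evaporator", 305.00, 11.50)]


def parts_below(whole, edges):
    """Return every term strictly below `whole` in the part_of hierarchy."""
    below, frontier = set(), [whole]
    while frontier:
        current = frontier.pop()
        for part, parent in edges:
            if parent == current and part not in below:
                below.add(part)
                frontier.append(part)
    return below


# Step 2: retrieve the subtypes of Tire with regard to part_of.
tire_parts = parts_below("Tire", PART_OF)

# Step 3: the query becomes an ordinary selection, roughly
# "SELECT * FROM Parts1 WHERE Name IN tire_parts".
answer = [row for row in parts1 if row[0] in tire_parts]
print([row[0] for row in answer])  # ['Gasket', 'Valve']
```

Gasket is returned only through transitivity (Gasket part_of Valve part_of Tire), which a purely syntactic match on the Name field could not find.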

Example of Join

Example query: Find the items from the Claims2 relation that are a kind of theft and cost more than the item theft in the Claims1 relation.

To answer this query:

Use the integrated ontology of Claims1 and Claims2, including the isa hierarchy.

Apply the conversion function between the corresponding units.

Transform the query with regard to the ontology and the conversion function.

Ontology Extended Relational Algebra (4)

If E = E1 op E2 where op ∈ {∪, ∩, −}, [Ei]F = (Ri, Si, F) (i = 1, 2), and S1, S2 have a least common super schema S, then [E]F = (R, S, F), where R = S12S(R1) op S22S(R2).

If E = (S)E’, where S is a schema and [E’]F = (R, S’, F), then [E]F = (S’2S(R), S, F).

Here Si2S denotes the conversion of a relation from schema Si to schema S.

Example of Union

Example query: Find all the items from Claims1 and Claims2 that are a kind of theft and involve a cost of over $1000.

To answer this query:

Use the integrated ontology, including an isa hierarchy that contains not only values but also field names, such as Cost and Value.

Apply the conversion function between the corresponding units.

Compute the least common super schema of Claims1 and Claims2.

Convert the selected records to the least common super schema and compute their union.
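The union example can be sketched end to end on the two claims relations. Several things below are assumptions made for illustration: Claims1.Cost is taken to be in USD and Claims2.Value in EUR, the exchange rate is invented, the least common super schema is assumed to be (Source, Id, Type, Amount in USD), and the `ISA` table keeps only the integrated isa edges that the query needs. For brevity the sketch converts every record before selecting, whereas the steps above select first.

```python
# ASSUMPTIONS: units and rate are illustrative, not taken from the slides.
EUR_TO_USD = 1.2

claims1 = [(1, "burglary", 2000), (2, "theft", 150),
           (3, "mugging", 860), (4, "arson", 1800)]
claims2 = [(1, "robbery", 400), (2, "fire", 550),
           (3, "auto accident", 500), (4, "burglary", 250)]

# Integrated isa edges (term -> direct supertypes); theft and robbery point
# at each other because of the constraint theft:1 = robbery:2.
ISA = {"burglary": {"theft"}, "mugging": {"theft"},
       "robbery": {"theft"}, "theft": {"robbery"}}


def is_a(term, target):
    """True when `target` is reachable from `term` along isa edges."""
    seen, stack = set(), [term]
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(ISA.get(node, ()))
    return False


def claims1_to_common(row):
    claim_id, kind, cost = row
    return ("Claims1", claim_id, kind, float(cost))        # already in USD


def claims2_to_common(row):
    number, kind, value = row
    return ("Claims2", number, kind, value * EUR_TO_USD)   # EUR -> USD


union = ([claims1_to_common(r) for r in claims1]
         + [claims2_to_common(r) for r in claims2])
answer = [r for r in union if is_a(r[2], "theft") and r[3] > 1000]
print(answer)  # [('Claims1', 1, 'burglary', 2000.0)]
```

The arson claim of 1800 is rejected because arson is not below theft, and none of the converted Claims2 amounts exceed 1000 under the assumed rate.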

HOME

We built the HOME (Heterogeneous Ontology Management Engine) system as a proof of concept for the proposed ideas and as an implementation of the algorithms.

The main components of HOME:

GUI

Ontology maker

Rule maker

Ontology inference

Query executor

Current Status of HOME

HOME is implemented in Java. Briefly, HOME has the following major functionalities:

Learn ontologies from relational and XML data sources.

Modify ontologies with a rule maker.

Browse ontologies with a zoomable interface.

Import ontologies from XML files and write them back to XML files.

Ontology integration.

Ontology extended query processing for relational data sources and XML sources.


Experimental Results (1)

Performance of HOME for conjunctive selection queries based on GNIS data sets


Experimental Results (2)

Performance of HOME for join queries based on GNIS data sets


Experimental Results (3)

Join queries with varying selectivity and number of tuples based on GNIS data sets


Experimental Results (4)

Performance of ontology integration algorithms

Contributions

Theory about ontologies and ontology integration.

Theory about ontology extended relational algebra.

HOME: a platform for ontology-based data integration.

Related Work

Integrating heterogeneous data sources:

TSIMMIS from Stanford

HERMES from UMD

SIMS from USC

DISCO from INRIA and UMD

Ontology algebra:

Scalable Knowledge Composition Project from Stanford

Focused on computing union, intersection, and difference of ontologies, rather than on answering queries with ontologies.

Did not consider embedding ontologies into existing data models.

Future Work

Integrate non-relational data sources, such as semi-structured sources, textual sources, etc.

More effort on Semantic Web, DAML+OIL, RDF, metadata, etc.

Extension to richer ontology structures.

Indexing for ontology based data retrieval.

Scaling ontology integration.


Finally

Thank you!