
Page 1: Ontologies Presented by: Rokhlenko Oleg olegro@cs.technion.ac.il Data Integration Seminar Spring 2002 Supervisor: Ron Y. Pinter

Ontologies

Presented by: Rokhlenko Oleg
olegro@cs.technion.ac.il
Data Integration Seminar, Spring 2002
Supervisor: Ron Y. Pinter

References

1. Ontologies: Principles, Methods and Applications
   by: Mike Uschold, The University of Edinburgh, Scotland, and Michael Gruninger, University of Toronto, Canada

2. Ontology-Driven Integration of Scientific Repositories
   by: Vassilis Christophides, Catherine Houstis, Spyros Lalis, Hariklia Tsalapata, Department of CS, University of Crete, Greece

Agenda

- Why ontologies, and what are they?
- Uses of ontologies.
- A skeletal methodology for building ontologies.
- Ontologies in practice.

What are the problems?

The lack of shared understanding leads to:

- Poor communication within and between people and their organizations.
- Difficulties in identifying requirements, and thus in defining a specification of the system.
- Disparate modeling methods, paradigms, languages and software tools, which severely limit:
  - inter-operability;
  - the potential for re-use and sharing, so that much effort is wasted re-inventing the wheel.

How can we solve them?

The way to address these problems is to reduce or eliminate conceptual and terminological confusion and come to a shared understanding. Such an understanding can function as a unifying framework for the different viewpoints and serve as the basis for:

- Communication between people.
- Inter-operability between systems.
- System engineering benefits such as:
  - re-usability
  - reliability
  - specification

Examples (1): Unifying Research Fields

Situation/Problem: Researchers in the different but related fields of AI Planning, Decision Theory and Distributed Systems Theory cannot readily make use of each other’s results. This is because they have different perspectives on, and use different terms to describe, the same underlying ideas.

Solution: Develop a unifying conceptual framework which enables research results in one field to be applied to the other fields.

Examples (2): Semiconductor Fabrication

Situation/Problem: Software bought in from outside includes a WIP tracking system and a production line simulation package. The simulation package requires as input a very large description of a model of the product flow in the factory, which incorporates various details of the WIP tracking mechanism. When new versions of the simulation package are released, or if a new supplier is chosen, the model must be converted to a new format. This conversion is both time consuming and error prone.

Solution: Automate the process of converting the model when new external software is introduced. This both saves time and ensures model fidelity.

Examples (3): Spacecraft Mission Operations

Situation/Problem: Various knowledge based systems were developed independently to assist in different aspects of spacecraft operations (e.g. planning, anomaly detection, diagnosis). Each uses its own approach to structuring and representing the relevant concepts in a large knowledge base. It is desirable to integrate these systems, so that each can make use of the knowledge of the others.

Solution: Use a federated, agent-based approach to knowledge sharing. The overall system is called ATOS: Advanced Technology Operations System.

What is an ontology?

- From Greek: ontos = being, logos = science.
- ‘Ontology’ is the term used to refer to the shared understanding of some domain of interest, which may be used as a unifying framework to solve the problems described above.
- An ontology necessarily entails or embodies some sort of world view with respect to a given domain. The world view is often conceived as a set of concepts (e.g. entities, attributes, processes), their definitions and their inter-relationships; this is referred to as a conceptualization.

What is an ontology? (cont.)

- Such a conceptualization may be implicit, e.g. existing only in someone's head, or embodied in a piece of software. For example, an accounting package presumes some world view encompassing such concepts as an invoice and a department in an organization. The word ‘ontology’ is sometimes used to refer to this implicit conceptualization.
- However, the more standard usage, and the one we will adopt, is that the ontology is an explicit account or representation of [some part of] a conceptualization.

What does an ontology look like?

An [explicit] ontology may take a variety of forms, but it will necessarily include a vocabulary of terms and some specification of their meaning (i.e. definitions). The degree of formality by which a vocabulary is created and meaning is specified varies considerably:

- highly informal: expressed loosely in natural language;
- semi-informal: expressed in a restricted and structured form of natural language;
- semi-formal: expressed in an artificial, formally defined language;
- rigorously formal: meticulously defined terms with formal semantics, theorems and proofs of such properties as soundness and completeness.

What have we seen so far?

- Why ontologies, and what are they?
- Uses of ontologies.
- A skeletal methodology for building ontologies.
- Ontologies in practice.

Uses of ontologies

- COMMUNICATION between people and organizations
- INTER-OPERABILITY between systems
- SYSTEM ENGINEERING: reusable components, reliability, specification

We identify three main categories of uses for ontologies. Within each, other distinctions may be important, such as the nature of the software, who the intended users are, and how general the domain is.

Communication

- Normative Models: Within any large scale integrated software system, different people must have a shared understanding of the system and its objectives.
- Networks of Relationships: We can also use ontologies to create a network of relationships, keep track of what is linked, and explore and navigate through this network.
- Consistency and Lack of Ambiguity: One of the most important roles an ontology plays in communication is that it provides unambiguous definitions for terms used in a software system.
- Integrating Different User Perspectives: If we have a system with multiple communicating agents, this integration through shared understanding becomes vital.

Inter-Operability

Many applications of ontologies address the issue of inter-operability, in which different users need to exchange data or are using different software tools. A major theme for the use of ontologies in domains such as enterprise modeling and multi-agent architectures is the creation of an integrating environment for different software tools.

Example

The term ‘procedure’ used by one tool is translated into the term ‘method’ used by the other via the ontology, whose term for the same underlying concept is ‘process’.

[Figure: a procedure viewer and a method library communicate through two translators and a shared ontology. The viewer asks "give me the procedure for…"; its translator resolves procedure = process and passes on "give me the process for…"; the library's translator resolves process = METHOD and asks the library "give me the METHOD for…". The reply "here is the METHOD for…" travels back as "here is the process for…" and finally "here is the procedure for…".]
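The exchange in this example can be sketched in a few lines of Python. This is only an illustration: the dictionaries `VIEWER_TERMS` and `LIBRARY_TERMS` and the function `translate` are hypothetical names, and a real translator would map whole messages rather than single terms.

```python
from typing import Dict

# Each tool's vocabulary, mapped onto the shared ontology's term.
# Both tables are illustrative assumptions based on the slide's example.
VIEWER_TERMS: Dict[str, str] = {"procedure": "process"}
LIBRARY_TERMS: Dict[str, str] = {"method": "process"}


def translate(term: str, source: Dict[str, str], target: Dict[str, str]) -> str:
    """Translate a term between two tools by going through the ontology."""
    concept = source[term]  # tool term -> ontology term
    for target_term, target_concept in target.items():
        if target_concept == concept:  # ontology term -> other tool's term
            return target_term
    raise KeyError(f"no term for concept {concept!r} in the target vocabulary")


print(translate("procedure", VIEWER_TERMS, LIBRARY_TERMS))  # method
print(translate("method", LIBRARY_TERMS, VIEWER_TERMS))     # procedure
```

Neither tool needs to know the other's vocabulary; each only maintains its own mapping to the ontology.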

Ontologies as Inter-Lingua

To assist inter-operability, ontologies can be used to support translation between different languages and representations.

One approach is to design unique translators for every two-party exchange; however, this would require O(n²) translators for n different ontologies.

[Figure: one diagram shows four languages L1 to L4 connected pairwise; a second diagram shows L1 to L4 each connected through its own translator (T1 to T4) to a central interlingua.]

Using ontologies as an inter-lingua to support translation, we can reduce the number of translators to O(n) for n different ontologies, since it would only require translators from a native ontology into the interchange ontology.
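The difference between the two growth rates is easy to tabulate. The sketch below assumes that each translator works in one direction only, so a pair of ontologies needs two translators and each ontology needs one translator to and one from the interlingua; the function names are illustrative.

```python
def pairwise_translators(n: int) -> int:
    """One-directional translators needed when every pair is connected directly."""
    return n * (n - 1)


def interlingua_translators(n: int) -> int:
    """One translator into and one out of the interlingua per ontology."""
    return 2 * n


for n in (4, 10, 50):
    print(n, pairwise_translators(n), interlingua_translators(n))
# 4 12 8
# 10 90 20
# 50 2450 100
```

For the four languages in the figure the saving is small, but it grows quickly with the number of ontologies.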

System Engineering

The applications of ontologies that we have considered to this point have focused on the role that ontologies play in the operation of software systems. In this section we consider applications of ontologies that support the design and development of the software systems themselves:

- Specification
- Reliability
- Reusability

1. Specification

A shared understanding of the problem and the task at hand can assist in the specification of software systems. The ontology’s role in specification varies with the degree of formality within the system design methodology:

- In an informal approach, ontologies facilitate the process of identifying the requirements of the system and understanding the relationships among the components of the system. This is particularly important for systems involving distributed teams of designers working in different domains.
- In a formal approach, an ontology provides a declarative specification of a software system, which allows us to reason about what the system is designed for, rather than how the system supports this functionality.

2. Reliability

- Informal ontologies can improve the reliability of software systems by serving as a basis for manual checking of the design against the specification.
- Using formal ontologies enables [semi-]automated consistency checking of the software system with respect to the declarative specification. In addition, formal ontologies can be used to make explicit the various assumptions made by different components of a software system, facilitating their integration.
- Declaratively specified assumptions may explicitly restrict the applicability of a particular ontology to a problem domain. By proving that the ontology is capable of supporting various reasoning problems, we can demonstrate the reliability of the software system within the domain.

3. Reusability

- To be effective, ontologies must also support reusability, so that we can import and export modules among different software systems.
- The problem is that when software tools are applied to new domains, they may not perform as expected, since they relied on assumptions that were satisfied in the original applications but not in the new ones.
- By characterizing classes of domains and tasks within these domains, ontologies provide a framework for determining which aspects of an ontology are reusable between different domains and tasks.

What have we seen so far?

- Why ontologies, and what are they?
- Uses of ontologies.
- A skeletal methodology for building ontologies.
- Ontologies in practice.

A Skeletal Methodology for Building Ontologies

Although there is much collective experience in developing and using ontologies, there are no standard methodologies for building them.

The proposed comprehensive methodology for developing ontologies includes the following stages:

- Identify Purpose and Scope
- Building the Ontology
- Evaluation
- Documentation


Purpose and Scope

It is important to be clear about why the ontology is being built and what its intended uses are. The previous section explores the space of possible uses; this can be a starting point in identifying the purpose of an ontology yet to be constructed.

It will also be useful to identify and characterize the range of intended users of the ontology.

Building the Ontology

The identification of the purpose and scope of the ontology, at least in general terms, serves to provide a reasonably well-defined target for building the ontology.

Three aspects to this are: capture, coding, and integration of existing ontologies.

Capture

By ontology capture, we mean:

1) identification of the key concepts and relationships in the domain of interest (scoping);
2) production of precise, unambiguous text definitions for such concepts and relationships;
3) agreeing on all of the above.

1) Scoping

- Brainstorming: Hold a brainstorming session to produce all potentially relevant terms and phrases.
- Grouping: Structure the terms loosely into work areas corresponding to naturally arising sub-groups.
- Connecting: Identify semantic cross-references between the areas, i.e. concepts that are likely to refer to, or be referred to by, concepts in other areas. This information can be used to help identify which work area to tackle first to minimize the likelihood of re-work.

2) Produce Definitions

- Determining the meta-ontology: Let the careful consideration of the concepts and their inter-relationships determine the requirements for the meta-ontology. Keep in mind various possibilities, and use words and phrases in a consistent manner where appropriate (e.g. role, entity, relationship, type, instance).
- Work areas: Address each work area in turn. Start with the work areas that have the most semantic overlap with other work areas.
- Terms: Proceed in a middle-out fashion rather than top-down or bottom-up. That is, define the most fundamental terms in each work area before moving on to more abstract and more specific terms within that work area.

What counts as fundamental, or basic, is a psychological phenomenon. For example, ‘dog’ is basic, ‘mammal’ is a generalization, and ‘cocker spaniel’ is a specialization.
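The advice to start with the work areas that have the most semantic overlap can be made concrete with a small sketch. It assumes the cross-references found during scoping are recorded as pairs of work-area names; the area names and the function `order_work_areas` are hypothetical and serve only as an illustration.

```python
from collections import Counter
from typing import List, Tuple


def order_work_areas(cross_refs: List[Tuple[str, str]]) -> List[str]:
    """Order work areas by how many cross-references touch them, most first."""
    overlap: Counter = Counter()
    for area_a, area_b in cross_refs:
        overlap[area_a] += 1
        overlap[area_b] += 1
    # Ties are broken alphabetically so that the order is reproducible.
    return sorted(overlap, key=lambda area: (-overlap[area], area))


# Hypothetical cross-references between work areas.
refs = [
    ("Activity", "Organisation"),
    ("Activity", "Time"),
    ("Activity", "Strategy"),
    ("Organisation", "Strategy"),
]
print(order_work_areas(refs))
# ['Activity', 'Organisation', 'Strategy', 'Time']
```

A work area with no cross-references at all does not appear in the result and can be scheduled last.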

Why Middle-Out Approach?

Bottom-up approach:
- High level of detail
- Difficult to spot commonality
- Inconsistency
- Re-work and more effort

Top-down approach:
- Better control of the level of detail
- Choosing arbitrary high-level categories
- Less stability
- Re-work and more effort

Middle-out approach:
- Starting with the most important concepts
- Balance in terms of the level of detail
- Capture commonality
- Consistency and accuracy
- Stability
- Less re-work and less effort

3) Reaching Agreement

- There is considerable variation in the degree of effort required to agree on definitions and terms for underlying concepts. For some terms, consensus on the definition of a single concept can be reached fairly easily. In other cases, several terms seem to correspond to one concept definition.
- In practice, there are only a few cases where commonly used terms have significantly different informal usages but no usefully different definitions can be agreed. Such cases should be recorded in notes against the definition.
- Finally, some highly ambiguous terms are identified as corresponding to several closely related but different concepts. In this situation, the term itself gets in the way of a shared understanding.

Coding

By coding, we mean the explicit representation, in some formal language, of the conceptualization captured in the previous stage. This will involve:

- committing to the basic terms that will be used to specify the ontology (e.g. class, entity, relation); this is often called a ‘meta-ontology’ because it is, in essence, the [underlying] ontology of representational terms that will be used to express the main ontology;
- choosing a representation language (one capable of supporting the meta-ontology);
- writing the code.
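As a rough illustration of these three steps, the sketch below commits to a tiny meta-ontology (a concept with a definition and an optional "is a" link), uses Python as the representation language, and codes the dog example from the capture stage. The class names `Concept` and `Ontology`, the `is_a` field and the definitions are assumptions made for this sketch, not part of any particular ontology language.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Concept:
    """Meta-ontology primitive: a named concept with a text definition."""
    name: str
    definition: str
    is_a: Optional[str] = None  # name of the more general concept, if any


@dataclass
class Ontology:
    concepts: Dict[str, Concept] = field(default_factory=dict)

    def add(self, concept: Concept) -> None:
        self.concepts[concept.name] = concept

    def generalizations(self, name: str) -> List[str]:
        """Follow is_a links upwards (assumes the links contain no cycle)."""
        chain: List[str] = []
        parent = self.concepts[name].is_a
        while parent is not None:
            chain.append(parent)
            parent = self.concepts[parent].is_a
        return chain


animals = Ontology()
# Middle-out: the basic term first, then its generalization and specialization.
animals.add(Concept("dog", "a domesticated canine", is_a="mammal"))
animals.add(Concept("mammal", "a warm-blooded vertebrate"))
animals.add(Concept("cocker spaniel", "a breed of dog", is_a="dog"))
print(animals.generalizations("cocker spaniel"))  # ['dog', 'mammal']
```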


Integrating Existing Ontologies

During either or both of the capture and coding processes, there is the question of how and whether to use [all or part of] ontologies that already exist. In general this is a very difficult problem. One way forward is to make explicit all assumptions underlying the ontology.

Overall, provision of guidance and tools in this area may be one of the biggest challenges in developing a comprehensive methodology for building ontologies. It is easy enough to identify synonyms, and to extend an ontology where no concepts readily exist. However, when there are obviously similar concepts defined in existing ontologies, it is rarely clear how and whether such concepts can be adapted and reused.

Evaluation

- Gómez-Pérez provides a good definition of evaluation in the context of knowledge sharing technology: “to make a technical judgment of the ontologies, their associated software environment, and documentation with respect to a frame of reference … The frame of reference may be requirements specifications, competency questions, and/or the real world.”
- Some detailed work has been done on the evaluation of ontologies, which could contribute to a comprehensive methodology for building ontologies. The approach taken in some of this work is to look first at what has been done in the field of KBS and to adapt it for ontologies.

Documentation

- It may be desirable to have established guidelines for documenting ontologies, possibly differing according to the type and purpose of the ontology.
- As pointed out by Skuce, one of the main barriers to effective knowledge sharing is the inadequate documentation of existing knowledge bases and ontologies. To address this problem, all important assumptions should be documented, both about the main concepts defined in the ontology and about the primitives used to express the definitions in the ontology (i.e. the meta-ontology).

What have we seen so far?

- Why ontologies, and what are they?
- Uses of ontologies.
- A skeletal methodology for building ontologies.
- Ontologies in practice.

A Scenario for Coastal Zone Management (CZM)

- Environmental scientists and public institutions working on CZM often need to extract and combine data from different scientific disciplines, such as marine biology, physical and chemical oceanography, geology and engineering, stored in distributed repositories.
- For example: the transport of waste in a particular coastal area given a pollution source. Local authorities could require this information to determine the best location for installing a waste pipeline.
- This data is typically generated through a 2-step process, involving the execution of 2 different programs:

Example

Combination of data and programs for producing waste transport data:

[Figure: the Ocean Circulation Model takes Currents and Bathymetry as input and produces Sea Circulation; the Waste Transport Model takes Sea Circulation, Bathymetry and the Pollution Source as input and produces Waste.]

A scenario for CZM (cont.)

Provided that the user has no knowledge of this information, the following actions are necessary to discover which productions can be used to obtain Waste data for a particular coastal area using the available resources:

1. Locate Waste data stored in the distributed repositories.
2. Determine the usability of such search results.
3. If no Waste data that satisfies the user requirements is available, locate programs capable of producing this data.
4. Having identified an appropriate program, i.e. a Waste Transport model, determine the required input.
5. For each input, locate appropriate sources or determine ways to produce the corresponding data sets.

An Ontology for the Waste Transport Scenario

[Figure: ontology diagram relating the concepts CZM, Physical Oceanography, Chemical Oceanography, Ocean Circulation Model, Waste Transport Model, Global Currents, Sea Circulation, Bathymetry and Waste.]

The Knowledge Base

- In order to allow reasoning on the combination alternatives between data and programs, we advocate a definition of the ontology notions in a KBS using Horn clauses.
- An ontology notion N is defined as a clause N(A1, A2, …, An), where A1, A2, …, An are its attributes.
- Relations between concepts are expressed as rules of the form:

  N(A1, A2, …, An) :- N1(A1, …, An), … , Nn(A1, …, An), Expr(A1, …, An)

  where “:-” denotes implication and “,” conjunction.

The Knowledge Base (cont.)

- The rule body includes program and data concepts Ni as well as constraints Expr, e.g. parameter restrictions, for deducing the notion appearing as the consequent in the rule head.
- Exactly one literal in the body describes the corresponding program notion. The rest of the literals describe the input data required by that program.
- The following clauses define the notions introduced in the above ontology:

  Bathymetry(Location, GridRes)
  ExtCurrents(Location, GridRes)
  SeaCirc(Location, GridRes)
  OceanCircModel(Location, GridRes)
  WasteTransportModel(Location, GridRes)

The Knowledge Base (cont.)

The ontology relations are formalized using 2 rules:

1. SeaCirc(Location, GridRes) :-
       OceanCircModel(Location, GridRes),
       ExtCurrents(Location, GridRes'),
       Bathymetry(Location, GridRes''),
       GridRes <= GridRes',
       GridRes <= GridRes''

“<=” denotes higher or equal grid resolution.

Rule 1 states that Sea Circulation data for a specific location and grid resolution can be derived from local Bathymetry and external Current data using an Ocean Circulation program.

The Knowledge Base (cont.)

2. Waste(Location, GridRes) :-
       WasteTransportModel(Location, GridRes),
       SeaCirc(Location, GridRes'),
       Bathymetry(Location, GridRes''),
       GridRes <= GridRes',
       GridRes <= GridRes''

Rule 2 states that Waste data for a specific location and grid resolution can be produced by combining Sea Circulation with local Bathymetry data via a Waste Transport program.
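A single application of Rule 2 can be sketched in Python. The sketch rests on two assumptions that the slides do not spell out: a grid resolution is reduced to one number, the cell size in metres (so 2-D versus 3-D grids are not distinguished), and `GridRes <= GridRes'` is read as "the input data is at least as fine as the requested output". The function name `apply_rule_2` and the tuple layout are illustrative.

```python
from typing import Optional, Tuple

# (notion, location, cell size in metres); a smaller cell means higher resolution.
Fact = Tuple[str, str, int]


def apply_rule_2(model: Fact, sea_circ: Fact, bathymetry: Fact) -> Optional[Fact]:
    """Return the Waste fact that Rule 2 derives, or None if a condition fails."""
    notions_ok = (model[0], sea_circ[0], bathymetry[0]) == (
        "WasteTransportModel", "SeaCirc", "Bathymetry")
    same_location = model[1] == sea_circ[1] == bathymetry[1]
    # Assumed reading of GridRes <= GridRes': inputs are no coarser than the output.
    inputs_fine_enough = sea_circ[2] <= model[2] and bathymetry[2] <= model[2]
    if notions_ok and same_location and inputs_fine_enough:
        return ("Waste", model[1], model[2])
    return None


coarse_model = ("WasteTransportModel", "HER", 50)
fine_model = ("WasteTransportModel", "HER", 10)
sea_circ = ("SeaCirc", "HER", 25)
bathymetry = ("Bathymetry", "HER", 10)

print(apply_rule_2(coarse_model, sea_circ, bathymetry))  # ('Waste', 'HER', 50)
print(apply_rule_2(fine_model, sea_circ, bathymetry))    # None
```

The second call fails because Sea Circulation data on a 25 m grid is too coarse for a program that produces output on a 10 m grid.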

The Knowledge Base (cont.)

Clauses without a body, called facts, are instances of abstract notions. For example:

- SeaCirc(HER, 10m3) stands for 3-D Sea Circulation data for the area of Heraklion with a grid resolution of ten cubic meters.
- WasteTransportModel(HER, 1m2) stands for a Waste Transport program that computes 2-D Waste data for the area of Heraklion with a grid resolution of one square meter.

Facts are either extensional, indicating available data sets or programs, or intentional, denoting data sets that can be generated through programs.

There is no need to explicitly store facts in the KBS. Intentional facts are dynamically deduced through rules. Extensional facts can be constructed “on the fly” via a metadata search engine that locates the corresponding resources.

On-Demand Generation of Data Production Paths

- Given this formal representation of the ontology, requests for data productions translate into queries to the knowledge base.
- A query is a description of the desired resource in terms of an ontology concept. It must be satisfied through extensional or intentional facts, the latter being sub-queries that require further expansion. This iterative matching process takes into account all possible combinations of rules and extensional facts.
- The result is a set of trees, whose inner nodes are intentional facts and whose leaves are extensional facts, embodying all valid production paths through which data for the queried concept can be generated.
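The matching process can be sketched as a small backward-chaining search. This is a simplified illustration, not the system's implementation: grid resolution is again assumed to be a single cell size in metres (smaller is finer), an input qualifies when it is no coarser than the program's output grid, the rule table is assumed to be acyclic, and the fact set below is hypothetical. The names `Fact`, `Tree`, `RULES` and `derive` are introduced only for this sketch.

```python
from itertools import product
from typing import List, NamedTuple, Optional, Tuple


class Fact(NamedTuple):
    notion: str
    location: str
    cell: int  # grid cell size in metres; smaller means higher resolution


class Tree(NamedTuple):
    fact: Fact
    children: Tuple["Tree", ...] = ()  # empty for extensional facts (leaves)


# head notion -> (program notion, input data notions), following Rules 1 and 2
RULES = {
    "SeaCirc": ("OceanCircModel", ("ExtCurrents", "Bathymetry")),
    "Waste": ("WasteTransportModel", ("SeaCirc", "Bathymetry")),
}


def derive(facts: List[Fact], notion: str, location: Optional[str] = None,
           max_cell: Optional[int] = None) -> List[Tree]:
    """Return every production tree for `notion`; None leaves an attribute open."""
    def fits(f: Fact, wanted: str) -> bool:
        return (f.notion == wanted
                and (location is None or f.location == location)
                and (max_cell is None or f.cell <= max_cell))

    # Extensional facts satisfy the query directly.
    trees = [Tree(f) for f in facts if fits(f, notion)]
    # Intentional facts: expand the rule for this notion, one program at a time.
    if notion in RULES:
        program, inputs = RULES[notion]
        for prog in (f for f in facts if fits(f, program)):
            options = [derive(facts, i, prog.location, prog.cell) for i in inputs]
            for combo in product(*options):  # yields nothing if an input is missing
                head = Fact(notion, prog.location, prog.cell)
                trees.append(Tree(head, (Tree(prog),) + combo))
    return trees


facts = [
    Fact("Bathymetry", "HER", 10),
    Fact("ExtCurrents", "HER", 10),
    Fact("OceanCircModel", "HER", 10),
    Fact("SeaCirc", "HER", 25),
    Fact("WasteTransportModel", "HER", 10),
]
for tree in derive(facts, "Waste"):
    print(tree.fact, [child.fact.notion for child in tree.children])
```

With these facts the query for Waste yields a single tree: the stored SeaCirc data on a 25 m grid is too coarse for the 10 m Waste Transport program, so Sea Circulation is itself derived through the Ocean Circulation program.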

Example

To illustrate the on-demand generation of data production paths, let us assume that the following resources are available in the system repositories, expressed as extensional facts:

  Bathymetry(HER, 10m2)
  ExtCurrents(HER, 10m3)
  OceanCircModel(HER, 10m2)
  SeaCirc(HER, 25m3)
  WasteTransportModel(HER, 10m2)
  WasteTransportModel(HER, 50m3)

The user can inquire about the concept of Waste without restricting any attributes by posing the query Waste(X, Y).

Example

Productions for Waste data as presented to the user by the GUI:

Waste(HER, 10m2)  [intentional fact]
  WasteTransportModel(HER, 10m2)  [program]
  Bathymetry(HER, 10m2)  [extensional fact]
  SeaCirc(HER, 10m2)  [intentional fact]
    OceanCircModel(HER, 10m2)  [program]
    Bathymetry(HER, 10m2)  [extensional fact]
    ExtCurrents(HER, 10m3)  [extensional fact]

Waste(HER, 50m3)  [intentional fact]
  WasteTransportModel(HER, 50m3)  [program]
  Bathymetry(HER, 10m2)  [extensional fact]
  SeaCirc(HER, 25m3)  [extensional fact]

Architecture (Overview)

[Figure: architecture diagram. Components: Graphical User Interface, Knowledge Base System, Metadata Search Engine, Workflow Runtime, and wrappers around data sets and models, all within the middleware system. Arrow labels: query, productions, workflow specification, resources, invoke/access, export resources.]

Architecture (cont.)

- The Metadata Search Engine is responsible for locating data sets or programs. It accepts metadata queries on the properties of resources and returns a list of metadata descriptions and references. References point to repository wrappers, which provide an access and invocation interface to the underlying legacy systems where the data and programs reside.
- The Knowledge Base System accepts queries regarding the availability of ontology concepts. It generates and returns the corresponding data productions based on the available resources and the constraints imposed by the ontology rules.
- The Workflow Runtime System monitors and coordinates the execution of workflows. It executes each intermediate step of a workflow specification, accessing data and invoking programs through the repository wrappers.