
Page 1

ICS-FORTH October 2, 2006

Introduction to Knowledge Representation and Conceptual Modeling

Martin Doerr

Institute of Computer Science
Foundation for Research and Technology – Hellas
Heraklion – Crete, Greece

Page 2

Knowledge Representation
Outline

Introduction

From tables to Knowledge Representation: Individual Concepts and Relationships

Instances, classes and properties

Generalization

Multiple IsA and instantiation

A simple data model and notation

Page 3

Knowledge Representation
Introduction: Basic Notions

Knowledge Representation:

Representation of concepts, items (particulars) and their interrelations as perceived by humans, expressed in terms of a formal language such as logic, conceptual graphs, etc.

The intended meaning (semantics) is the interpretation (identification) of the symbols used as things and categories in the described universe (“domain”, “world”, “real world”), and the interpretation of expressions using those symbols as statements about their structure and interrelations (early Wittgenstein).

A set of related knowledge representation expressions is called a model (of a domain).

IT Jargon (due to limited scope): “KNOWLEDGE” instead of model

“SEMANTICS” instead of expressions
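To make the notions above concrete, here is a minimal, hedged sketch (not part of the original slides; all names such as Person or dwells_in are hypothetical) of a model as a set of knowledge representation expressions: symbols interpreted as items and categories of the domain, and statements relating them.

```python
# A minimal sketch (hypothetical names): a "model" is a set of expressions
# (subject, predicate, object) over symbols whose intended meaning is their
# interpretation as things and categories in the described domain.

model = {
    ("Martin", "instance_of", "Person"),    # "Martin" denotes an item, "Person" a category
    ("Heraklion", "instance_of", "City"),
    ("Martin", "dwells_in", "Heraklion"),   # a statement relating two items
}

def statements_about(symbol, expressions):
    """Return all expressions that mention the given symbol."""
    return {e for e in expressions if symbol in (e[0], e[2])}

print(statements_about("Martin", model))
```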

Page 4

Knowledge Representation
Introduction: Reservations

Limitations:

“Principles of equivalence”: Given a model accepted as correct by a human, the logical (automatable) inferences from that model should conform to the expectations of a human. Only in this sense does knowledge representation (KR) represent knowledge. KR is a means of communication.

Expressions are rarely (or never) definitions, but partial constraints (see also the late Wittgenstein, Eleanor Rosch, George Lakoff).

Formal languages only partially fit the way we think.

Psychological obstacles to creating KR:

The true structure of our thoughts is unconscious.

BEWARE of compressions (Gilles Fauconnier, “The Way We Think”). See “Jargon”

Methodological questions reveal part of it (e.g. change of context).

Page 5

Knowledge Representation
From Forms to Classes (a “decompression”)

Table name: Patient

Attributes (sometimes called “part-of”) and their value types:
  Name         String
  Weight       Number
  Birth Date   Time
  Birth Place  String
  Address      String

What does that mean as statements about the world? Is it correct, e.g., “Address”?

Relational database tables are:
- an abstraction from forms,
- a model for (statistical) information processing.

Page 6

Knowledge Representation
From Forms to Classes (a “decompression”)

[Diagram: the Patient table (Name: String, Weight: Number, Birth Date: Time, Birth Place: String) now points via a “has” relation to a separate Address table.]

Address:
- shared with others
- changes over time
- can be multiple
- an independent entity

What about Birth Date?

Page 7

Knowledge Representation
From Forms to Classes (a “decompression”)

[Diagram: the Patient table (Name: String, Weight: Number) now has “has” relations to the Address table and to a separate Birth entity (Date: Time, Place: String), the latter with cardinality 1.]

Birth Date, Birth Place:
- shared with others
- the Birth itself is shared with others (twins)!
- an independent entity

Page 8

Knowledge Representation
From Forms to Classes (a “decompression”)

[Diagram: the Patient entity (Name: String) now has “has” relations to Address, to Birth (Date: Time, Place: String; cardinality 1), and to a separate Patient’s Weight entity (cardinality 1).]

Weight:
- similar, but not shared!
- multiple units, measurements
- a dependent, but distinct entity

What about the name?

Page 9

Knowledge Representation
From Forms to Classes (a “decompression”)

[Diagram: the Patient entity now has “has” relations to Address, to Birth (Date: Time, Place: String; cardinality 1), to Patient’s Weight (cardinality 1), and to a separate Name entity (String).]

Name:
- shared
- context specific
- an independent entity

Who is the Patient then?

Page 10

Knowledge Representation
From Forms to Classes (a “decompression”)

Summary:

In the end, no “private” attributes are left over.

Widening or changing the context reveals them as hidden, distinct entities.

The “table” becomes a graph of related, but distinct entities – a MODEL.

Things are only identified by unique keys – and the knowledge of the reality!

Do we describe a reality now? Are we closer to reality? Do we agree that this is correct? (“Ontological commitment”).

For a database schema, a projection (birth!) of perceived reality can be sufficient and more efficient.

For exchange of knowledge, it is misleading.

For a database schema, it can hinder extension.
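As an illustration of this “decompression” (a sketch with assumed identifiers and values, not part of the original slides), the same patient information is shown first as one flat table row and then as a graph of related, but distinct entities, each identified by its own key:

```python
# A minimal sketch (assumed identifiers and values): from a flat table row to a
# graph of distinct entities with their own identity.

flat_row = {                 # relational view: all values are "private" attributes
    "Name": "Costas",
    "Weight": 85,
    "Birth Place": "Heraklion",
    "Address": "Odos Evans 6, GR-71500 Heraklion, Crete, Greece",
}

# Graph view: every former attribute becomes a node with its own key;
# the edges ("has", "dwells at") carry the intended semantics.
nodes = {
    "patient1": {"class": "Patient"},
    "name1":    {"class": "Name", "string": "Costas"},
    "birth1":   {"class": "Birth", "place": "Heraklion"},
    "weight1":  {"class": "Patient's Weight", "value": 85, "unit": "kg"},
    "address1": {"class": "Address", "string": "Odos Evans 6, GR-71500 Heraklion, Crete, Greece"},
}
edges = [
    ("patient1", "has name",   "name1"),     # shared, context-specific
    ("patient1", "has birth",  "birth1"),    # shared with others (twins)!
    ("patient1", "has weight", "weight1"),   # dependent, but distinct
    ("patient1", "dwells at",  "address1"),  # shared, changes over time
]
```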


Page 11

Knowledge Representation
Classes and Instances

In KR we call these distinct entities classes:

A class is a category of items that share one or more common traits serving as criteria to identify the items belonging to the class. These properties need not be explicitly formulated in logical terms, but may be described in a text (here called a scope note) that refers to a common conceptualisation of domain experts. The sum of these traits is called the intension of the class. A class may be the domain or range of none, one or more properties formally defined in a model. The formally defined properties need not be part of the intension of their domains or ranges: such properties are optional. An item that belongs to a class is called an instance of this class. A class is associated with an open set of real life instances, known as the extension of the class. Here “open” is used in the sense that it is generally beyond our capabilities to know all instances of a class in the world and indeed that the future may bring new instances about at any time. (related terms: universals, categories, sortal concepts).
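A hedged sketch of this definition (not from the original slides; the class Patient and its scope note are assumed for illustration): a class carries an intension described by a scope note, optional formally defined properties, and an open extension, i.e. a set of known instances that is never claimed to be complete.

```python
# A minimal sketch (assumed names): class = scope note (intension in text),
# optional formal properties, and an open extension of instances.

from dataclasses import dataclass, field

@dataclass
class KRClass:
    name: str
    scope_note: str                                      # textual intension
    properties: list = field(default_factory=list)       # optional, formally defined
    known_instances: set = field(default_factory=set)    # open: never complete

patient = KRClass(
    name="Patient",
    scope_note="Persons that are or have been under medical treatment.",
    properties=["has weight", "dwells at"],
)
patient.known_instances.add("Costas 65")   # an item that belongs to the class
```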


Page 12

Knowledge Representation
Particulars

Distinguish particulars from universals as a perceived truth. Particulars do not have specializations. Universals have instances, which can be either particulars or universals.

particulars: me, “hello”, 2, WW II, the Mona Lisa, the text on the Rosetta Stone, 2-10-2006, 34N 26E.

universals: patient, word, number, war, painting, text

“ambiguous” particulars: numbers, saints, measurement units, geopolitical units.

“strange” universals: colors, materials, mythological beasts.

Dualisms:
- Texts as equivalence classes of documents containing the same text.
- Classes as objects of discourse, e.g. “chaffinch” and ‘Fringilla coelebs Linnaeus, 1758’ as Linné defined it.

Page 13

Knowledge Representation
Classes and Instances

[Diagram: “George 1” is an instance of the class Doctor; “Costas 65” is an instance of the class Patient. “Costas 65” weighs “85 kg” (an instance of Weight) and dwells at “Odos Evans 6, GR-71500 Heraklion, Crete, Greece” (an instance of Address). The links are labelled “instance” and “property”; a “?” marks the question of what Doctor and Patient have in common.]

In KR, instances are independent units of models, not restricted to the records of one table.

Identity is separated from description.

We can do “multiple instantiation”.

What do doctors and patients have in common?
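As a hedged illustration of these points (assumed data, not from the original slides): the identifier of an instance is separate from whatever is said about it, and one item can be an instance of more than one class at once (“multiple instantiation”).

```python
# A minimal sketch (assumed data): identity vs. description, and multiple instantiation.

instance_of = {
    "George 1":  {"Doctor", "Patient"},   # multiple instantiation: the same person is both
    "Costas 65": {"Patient"},
}

descriptions = [
    ("Costas 65", "weighs",    "85 kg"),
    ("Costas 65", "dwells at", "Odos Evans 6, GR-71500 Heraklion, Crete, Greece"),
]

# The identifier "Costas 65" keeps its identity even if every description changes;
# the descriptions are statements about it, not part of what identifies it.
print(instance_of["George 1"])
```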

Page 14

Knowledge Representation
Generalization and Inheritance

[Diagram: Doctor and Patient are subclasses (isA) of Person, which is in turn a subclass of Physical Object. “George 1” is an instance of Doctor, “Costas 65” an instance of Patient. The properties “weighs” (range Weight, e.g. “85 kg”) and “dwells at” (range Address, e.g. “Odos Evans 6, GR-71500 Heraklion, Crete, Greece”) are attached to the superclasses rather than repeated on Doctor and Patient.]

An instance of a class is an instance of all its superclasses.

A subclass inherits the properties of all superclasses. (properties “move up”)
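A hedged sketch of these two rules (the class and property names follow the diagram above; the code itself is not from the original slides): superclass membership is computed transitively, so an instance of Doctor is also an instance of Person and of Physical Object and carries the properties declared on those superclasses.

```python
# A minimal sketch (assumed names): transitive isA, instance membership in all
# superclasses, and inheritance of properties declared on superclasses.

is_a = {
    "Doctor":  {"Person"},
    "Patient": {"Person"},
    "Person":  {"Physical Object"},
}
declared_properties = {
    "Physical Object": ["weighs"],
    "Person": ["dwells at"],
}

def all_superclasses(cls):
    """The class itself plus all direct and indirect superclasses."""
    result = {cls}
    for parent in is_a.get(cls, set()):
        result |= all_superclasses(parent)
    return result

def properties_of(cls):
    """Properties a class carries, including those inherited from superclasses."""
    return [p for c in all_superclasses(cls) for p in declared_properties.get(c, [])]

print(all_superclasses("Doctor"))   # {'Doctor', 'Person', 'Physical Object'}
print(properties_of("Doctor"))      # e.g. ['dwells at', 'weighs'] (order may vary)
```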

Page 15

Knowledge Representation
Ontology and Information Systems

An ontology is a logical theory accounting for the intended meaning of a formal vocabulary, i.e. its ontological commitment to a particular conceptualization of the world. The intended models* of a logical language using such a vocabulary are constrained by its ontological commitment. An ontology indirectly reflects this commitment (and the underlying conceptualization) by approximating these intended models.

Nicola Guarino, Formal Ontology and Information Systems, 1998.

* “models” are meant as models of possible states of affairs.

Ontologies pertain to a perceived truth: a model commits to a conceptualization, typically shared by a group, of how we imagine the things in the world to be related.

Any information system compromises perceived reality with what can be represented in a database (dates!) and with what performs well. An RDF Schema is then no longer a “pure” ontology, and the mere use of RDF does not make an ontology.

Page 16

Knowledge Representation
Limitations

Complex logical rules may become difficult to identify for the domain expert and difficult to handle for an information system or user community.

Distinguish between modeling knowing (epistemology) and modeling being (ontology): necessary properties may nevertheless be unknown. Knowledge may be inconsistent or express alternatives.

Human knowledge does not fit with First Order Logic: There are prototype effects (George Lakoff), counter-factual reasoning (Gilles Fauconnier), analogies, fuzzy concepts. KR is an approximation.

Concepts only become discrete if restricted to a context and a function! Paul Feyerabend maintains they must not be fixed.


Page 17

Ontology Engineering
Scope Constraints for an Ontology

[Diagram: an ontology maps onto a conceptual framework that talks about real-world things and serves activities such as communication, research and domain work. Its scope is constrained by selecting disciplines and viewpoints, by current domain priorities, and by the precision/detail that is affordable at a given technical complexity.]

Page 18

E53 Place

A place is an extent in space, determined diachronically with regard to a larger, persistent constellation of matter (often continents) – by coordinates, geophysical features, artefacts, communities, political systems, objects – but not identical to these.

A “CRM Place” is not a landscape, not a seat - it is an abstraction from temporal changes - “the place where…”

A means to reason about the “where” in multiple reference systems.

Examples:
- figures from the bow of a ship
- African dinosaur footprints appearing in Portugal by continental drift
- where Nelson died

Page 19

Knowledge Representation
A Graphical Annotation for Ontologies

[Diagram: a class-level excerpt of the CIDOC CRM around E53 Place:
- E44 Place Appellation (e.g. E45 Address, E48 Place Name, E47 Spatial Coordinates, E46 Section Definition) – P87 identifies (is identified by) – E53 Place
- E53 Place – P88 consists of (forms part of) – E53 Place
- E46 Section Definition – P58 defines section of (has section definition) – E18 Physical Stuff
- E53 Place – P59 is located on or within (has section) – E18 Physical Stuff
- E19 Physical Object – P53 has former or current location (is former or current location of) – E53 Place
- E9 Move – P25 moved (moved by) – E19 Physical Object; P26 moved to (was destination of) / P27 moved from (was origin of) – E53 Place
- E12 Production Event – P108 has produced (was produced by) – E24 Physical Man-Made Stuff; P7 took place at (witnessed) – E53 Place]

Page 20

Knowledge Representation
A Graphical Annotation for Ontology Instances

[Diagram: an instance-level example, “How I came to Madrid…”:
- E9 Move “My walk 16-9-2006 13:45” – P25 moved – E20 Person “Martin Doerr”; P27 moved from – E53 Place “Frankfurt Airport-B10”; P26 moved to – E53 Place “EC-IYG seat 4A”
- E9 Move “Flight JK 126” – P25 moved – E19 Physical Object “Spanair EC-IYG”; P26 moved to – E53 Place “Madrid Airport”
- E19 Physical Object “Spanair EC-IYG” – P59 has section – E53 Place “EC-IYG seat 4A”]
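A hedged sketch of the same instance-level example as plain triples (the identifiers and CRM labels follow the diagram; the tuple encoding itself is illustrative, not a normative CRM or RDF syntax):

```python
# A minimal sketch: the "How I came to Madrid" example as (subject, property, object)
# triples plus a class assignment for each identifier.

triples = [
    ("My walk 16-9-2006 13:45", "P25 moved",       "Martin Doerr"),
    ("My walk 16-9-2006 13:45", "P27 moved from",  "Frankfurt Airport-B10"),
    ("My walk 16-9-2006 13:45", "P26 moved to",    "EC-IYG seat 4A"),
    ("Flight JK 126",           "P25 moved",       "Spanair EC-IYG"),
    ("Flight JK 126",           "P26 moved to",    "Madrid Airport"),
    ("Spanair EC-IYG",          "P59 has section", "EC-IYG seat 4A"),
]

class_of = {
    "Martin Doerr":            "E20 Person",
    "Spanair EC-IYG":          "E19 Physical Object",
    "My walk 16-9-2006 13:45": "E9 Move",
    "Flight JK 126":           "E9 Move",
    "Frankfurt Airport-B10":   "E53 Place",
    "EC-IYG seat 4A":          "E53 Place",
    "Madrid Airport":          "E53 Place",
}

# Answering "Where was Martin on Sept. 16, 15:00?" means following such links
# (walk -> seat 4A -> section of the plane -> flight -> Madrid Airport).
```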

Page 21

Ontology Engineering
Process Planning

Define methods of decision taking and revision (=> consensus, conflict resolution by analysis of the implicit/unconscious purposes of defenders and of their examples).

Engineer the vocabulary – get rid of the language bias. (words are not concepts: “the child is safe” “I bought a book”).

Carry out a bottom-up process for IsA hierarchies – monotonicity!

Write scope notes and definitions (but note the limitations of definitions!).

Do experimental “overmodelling” of the domain to understand the impact of simplifications

Page 22

Ontology Engineering
The Bottom-Up Structuring Process

1. Take a list of intuitive, specific terms, typically from domain documents (“practical scope”).

Concepts that are too abstract are often mistaken or missed!

2. Create a list of properties for these terms

- essential properties to infer identity (coming into being, ceasing to be)
- relevant properties (behaviour) for the discourse (change the mental context!)
- split a term into several concepts if necessary (“Where was the university when it decided to take more students?”)

3. Detect new classes from property ranges.

Typically strings, names and numbers hide concepts. Identify the concepts independently of the relation: “Who can be a creator?” (see the sketch below)
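A hedged sketch of step 3 (the schema and the heuristic are assumed for illustration, not part of the original slides): primitive value types such as String or Number in an existing schema often hide concepts that deserve classes of their own.

```python
# A minimal sketch (assumed schema): flag attributes whose primitive value type
# probably hides a concept, as prompts for new candidate classes.

schema = {                 # attribute -> declared value type
    "Name": "String",
    "Weight": "Number",
    "Birth Place": "String",
    "Creator": "String",
}

def candidate_hidden_classes(schema):
    """Attributes typed as bare strings/numbers are candidates for hidden classes."""
    primitive = {"String", "Number"}
    return [attr for attr, vtype in schema.items() if vtype in primitive]

print(candidate_hidden_classes(schema))
# Each flagged attribute prompts a question independent of the relation,
# e.g. "Who can be a creator?" -> a class such as Actor, not a string.
```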

Page 23

Ontology Engineering
The Bottom-Up Structuring Process

4. Detect entities hidden in attributes, find their properties

5. Property consistency test

Test domain queries
Revise properties and classes

6. Create the class hierarchy

Revise properties and classes

7. Create property hierarchies

Revise properties and classes

8. Closing up the model - reducing the model

Delete properties and classes not needed to implement the required functions.

Page 24

Ontology Engineering
Evidence, Relevance and Evaluation

An ontology must be:

- capable of representing the relevant senses appearing in a source (empirical base):
  analysing texts and dictionaries, mapping database schemata

- capable of answering queries useful for its purpose/function:
  “Where was Martin on Sept. 16, 15:00?”

Its concepts should be:

- valid under change of context:
  e.g., is “This object has the name ‘pencil’” valid in a pencil shop?

- objectively recognizable and likely to be recognized (useful for integration):
  e.g., hero or criminal?

- relevant, measured by dominance/frequency of occurrence in a source collection.

Balance subject coverage! Do not let experts get lost in details!

Page 25

Ontology Engineering
Practical Tips: Theory of Identity

A class of individual particulars must define for them a substance of their own (scope note!), without depending on relations: “Learning Object”, “Subtask”, “Expression”, “Manifestation” are not classes! “Work” is a class…

We must know how to decide when they come into/ go out of existence.

We must know what holds them together (unity criterion, scope note!), but we need not be able to decide on all parts!

Instances of a class must not have conflicting properties! (Dead and alive? At one place and at many places?) Is a collection material or immaterial? Is “Germany” one thing?

Essential properties of a class may be unknown! (Plato’s man.) The scope note only “reminds” of a common concept, restricted by limited variation in the real world.

Page 26

Ontology Engineering
Practical Tips: How to Use IsA

No repetition of properties: what do a Doctor, a Person, an Animal and a Physical Object have in common? (Count the freight weight of a plane.)

Dangers of multiple IsA: Don’t confuse polysemy with multiple nature: Is the Zugspitze a place? Can a museum take decisions?

Identify “primitives”. Distinguish constraints from definitions: Plato’s man; a “washing machine” may be any machine that washes.

IsA is a decrease of knowledge: “If I don’t know if he’s a hero, I know he’s a human…”

In an open world, never define a complement: the number of siblings is open! Caution with disjoint classes! (See the sketch below.)
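A hedged sketch of the open-world caution (assumed data, not from the original slides): the absence of a statement from the model does not license its negation, so complements and exact counts must not be inferred.

```python
# A minimal sketch (assumed data): open-world caution. Not finding a statement
# does not mean it is false - the number of siblings is open.

known = {("Anna", "has sibling", "Ben")}

def known_siblings(person, statements):
    """Return only the siblings explicitly stated in the model."""
    return {o for s, p, o in statements if s == person and p == "has sibling"}

print(known_siblings("Anna", known))   # {'Ben'} - at least one sibling is known,
# but we must NOT conclude "Anna has exactly one sibling" or define a class
# "only child" as the complement of "has a sibling".
```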

Page 27

Ontology Engineering
Other Tips

Avoid concepts that depend on accidental and uncontextual properties (“creator”, “museum object”, “Buhmann” [German: bogeyman]).

Maintain independence from scale (“hamlet” – “village”); at least introduce a scale-independent superclass.

Independence from point of view: Buying – Selling.

Most non-binary relationships acquire substance as temporal entities. Never model activities as links.

Epistemological bias: distinguish the structure of what you can know (information system) from the structure of what you believe exists (ontology). Quantification!