TRANSCRIPT
ICS-FORTH October 2, 2006
Introduction to Knowledge Representation and Conceptual
Modeling
Martin Doerr
Institute of Computer Science, Foundation for Research and Technology – Hellas
Heraklion – Crete, Greece
Knowledge Representation – Outline
Introduction
From tables to Knowledge Representation: Individual Concepts and Relationships
Instances, classes and properties
Generalization
Multiple IsA and instantiation
A simple data model and notation
Knowledge Representation – Introduction: Basic Notions
Knowledge Representation:
Representation of concepts, items (particulars) and their interrelation as perceived by humans and expressed in terms of a formal language, such as logic, conceptual graphs etc.
The intended meaning (semantics) is the interpretation (identification) of the symbols used as things and categories in the described universe (“domain”, “world”, “real world”), and the interpretation of expressions which use those symbols as statements about their structure and interrelations (early Wittgenstein).
A set of related knowledge representation expressions is called a model (of a domain).
IT jargon (due to limited scope): “KNOWLEDGE” is said instead of model, and “SEMANTICS” instead of expressions.
Knowledge Representation – Introduction: Reservations
Limitations:
“Principles of equivalence”: Given a model accepted as correct by a human, logical (automatable) inferences from the model should conform with the expectations of a human. Only in this sense does knowledge representation (KR) represent knowledge. KR is a means of communication.
Expressions are rarely, if ever, definitions, but partial constraints (see also the late Wittgenstein, Eleanor Rosch, George Lakoff).
Formal languages fit only partially the way we think.
Psychological obstacles to creating KR:
The true structure of our thoughts is unconscious.
BEWARE of compressions (Gilles Fauconnier, “The Way We Think”). See “Jargon”
Methodological questions reveal part of it (e.g. change of context).
Knowledge Representation – From Forms to Classes (a “decompression”)

[Form/table diagram:]
Patient (table name)
  Name        String
  Weight      Number
  Birth Date  Time
  Birth Place String
  Address     String
(attributes, sometimes called “part-of”, with their value types)

What does that mean as statements about the world? Is it correct, e.g., “Address”?

Relational database tables are:
- an abstraction from forms,
- a model for (statistical) information processing.
Knowledge Representation – From Forms to Classes (a “decompression”)

[Diagram: Patient (Name String, Weight Number, Birth Date Time, Birth Place String) has (∞:∞) Address table]

Address:
- shared with others
- changes over time
- can be multiple
- an independent entity

What about Birth Date?
Knowledge Representation – From Forms to Classes (a “decompression”)

[Diagram: Patient (Name String, Weight Number) has (∞:∞) Address table; Patient has (∞:1) Birth (Date Time, Place String)]

Birth Date, Birth Place:
- shared with others
- a Birth is shared with others (twins)!
- an independent entity
Knowledge Representation – From Forms to Classes (a “decompression”)

[Diagram: Patient (Name String) has (∞:∞) Address; has (∞:1) Birth (Date Time, Place String); has (1:∞) Patient’s Weight]

Weight:
- similar, but not shared!
- multiple units, measurements
- a dependent, but distinct entity

What about the name?
Knowledge Representation – From Forms to Classes (a “decompression”)

[Diagram: Patient has (∞:∞) Address; has (∞:1) Birth (Date Time, Place String); has (1:∞) Patient’s Weight; has (∞:∞) Name (String)]

Name:
- shared
- context specific
- an independent entity

Who is the Patient then?
Knowledge Representation – From Forms to Classes (a “decompression”)

Summary:
- In the end, no “private” attributes are left over.
- Widening or changing the context reveals them as hidden, distinct entities.
- The “table” becomes a graph of related, but distinct entities: a MODEL.
- Things are only identified by unique keys, and by knowledge of the reality!
- Do we describe a reality now? Are we closer to reality? Do we agree that this is correct? (“Ontological commitment”)
- For a database schema, a projection (birth!) of perceived reality can be sufficient and more efficient. For exchange of knowledge, it is misleading. For a database schema, it can hinder extension.
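The decompression above can be sketched in Python (hypothetical class names, not from the slides): each former column becomes a distinct entity that can be shared between patients, so the record turns into a graph of related objects.

```python
from dataclasses import dataclass

# Each former table column becomes a distinct, shareable entity.
@dataclass(frozen=True)
class Address:
    text: str                 # street, city, country as one string

@dataclass(frozen=True)
class Birth:
    date: str
    place: str                # a birth can be shared, e.g. by twins

@dataclass
class Patient:
    names: list               # names are shared, context-specific
    birth: Birth
    addresses: list           # change over time, can be multiple
    weights: list             # (value, unit): dependent but distinct

# Twins share one Birth instance: the row has become a graph.
b = Birth(date="1941-10-02", place="Heraklion")
costas = Patient(names=["Costas"], birth=b,
                 addresses=[Address("Odos Evans 6, GR-71500 Heraklion")],
                 weights=[(85, "kg")])
maria = Patient(names=["Maria"], birth=b, addresses=[], weights=[])
assert costas.birth is maria.birth   # a shared entity, not a copied value
```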
Knowledge Representation – Classes and Instances

In KR we call these distinct entities classes:

A class is a category of items that share one or more common traits serving as criteria to identify the items belonging to the class. These properties need not be explicitly formulated in logical terms, but may be described in a text (here called a scope note) that refers to a common conceptualisation of domain experts. The sum of these traits is called the intension of the class.

A class may be the domain or range of none, one or more properties formally defined in a model. The formally defined properties need not be part of the intension of their domains or ranges: such properties are optional.

An item that belongs to a class is called an instance of this class. A class is associated with an open set of real-life instances, known as the extension of the class. Here “open” is used in the sense that it is generally beyond our capabilities to know all instances of a class in the world, and indeed that the future may bring new instances about at any time. (Related terms: universals, categories, sortal concepts.)
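As an illustration (hypothetical names, not from the slides), the intension can be modeled as a membership criterion while the extension stays open: we can test whether an item is an instance, but never enumerate all instances.

```python
# The intension of a class is its membership criterion; the extension
# (the set of all instances) is open, so we test rather than enumerate.
class KRClass:
    def __init__(self, name, criterion):
        self.name = name
        self.criterion = criterion  # the intension, as a predicate

    def has_instance(self, item):
        return self.criterion(item)

painting = KRClass("Painting", lambda x: x.get("kind") == "painting")
mona_lisa = {"kind": "painting", "title": "Mona Lisa"}
assert painting.has_instance(mona_lisa)          # an instance
assert not painting.has_instance({"kind": "text"})
```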
Knowledge Representation – Particulars
Distinguish particulars from universals as a perceived truth. Particulars do not have specializations. Universals have instances, which can be either particulars or universals.
particulars: me, “hello”, 2, WW II, the Mona Lisa, the text on the Rosetta Stone, 2-10-2006, 34N 26E.
universals: patient, word, number, war, painting, text
“ambiguous” particulars: numbers, saints, measurement units, geopolitical units.
“strange” universals: colors, materials, mythological beasts.
Dualisms:
- Texts as equivalence classes of documents containing the same text.
- Classes as objects of discourse, e.g. “chaffinch” and ‘Fringilla coelebs Linnaeus, 1758’ as Linné defined it.
Knowledge Representation – Classes and Instances

[Instance diagram: “George 1” is an instance of class Doctor; “Costas 65” is an instance of class Patient; Costas 65 weighs “85 Kg” (an instance of Weight) and dwells at “Odos Evans 6, GR-71500 Heraklion, Crete, Greece” (an instance of Address)]

- In KR, instances are independent units of models, not restricted to the records of one table.
- Identity is separated from description.
- We can do “multiple instantiation”.
- What do doctors and patients have in common?
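Multiple instantiation, i.e. one item being an instance of several classes at once, can be sketched (hypothetical names, not from the slides) by keeping an item’s classes as data rather than as a fixed record type:

```python
# An instance is an identity plus an open set of classes and property
# statements, not a row locked into one table.
class Instance:
    def __init__(self, identifier, classes):
        self.identifier = identifier
        self.classes = set(classes)   # multiple instantiation
        self.properties = []          # (property, value) statements

    def add(self, prop, value):
        self.properties.append((prop, value))

# "George 1" is both a Doctor and a Patient: one identity, two classes;
# identity is separated from description.
george = Instance("George 1", {"Doctor", "Patient"})
george.add("weighs", "85 kg")
george.add("dwells at", "Odos Evans 6, GR-71500 Heraklion")
assert {"Doctor", "Patient"} <= george.classes
```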
Knowledge Representation – Generalization and Inheritance

[Diagram: Doctor and Patient are subclasses (isA) of Person; Person isA Physical Object (the superclass). The property “weighs” (range Weight, e.g. “85 Kg”) is declared on Physical Object; “dwells at” (range Address, e.g. “Odos Evans 6, GR-71500 Heraklion, Crete, Greece”) on Person. The instances “George 1” (Doctor) and “Costas 65” (Patient) inherit both.]

- An instance of a class is an instance of all its superclasses.
- A subclass inherits the properties of all superclasses. (Shared properties “move up” to the superclass.)
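The two rules above can be sketched in Python (hypothetical class names): the isA chain becomes a class hierarchy, `weighs` is declared once on the topmost superclass, and every Patient instance is also an instance of all superclasses.

```python
# isA chain: Doctor / Patient isA Person isA PhysicalObject.
class PhysicalObject:
    def weighs(self, weight):      # declared once, on the superclass
        self.weight = weight

class Person(PhysicalObject):
    def dwells_at(self, address):
        self.address = address

class Doctor(Person):
    pass

class Patient(Person):
    pass

costas = Patient()
costas.weighs("85 kg")             # inherited from PhysicalObject
costas.dwells_at("Odos Evans 6, GR-71500 Heraklion")

# An instance of a class is an instance of all its superclasses:
assert all(isinstance(costas, c)
           for c in (Patient, Person, PhysicalObject))
```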
Knowledge Representation – Ontology and Information Systems
An ontology is a logical theory accounting for the intended meaning of a formal vocabulary, i.e. its ontological commitment to a particular conceptualization of the world. The intended models* of a logical language using such a vocabulary are constrained by its ontological commitment. An ontology indirectly reflects this commitment (and the underlying conceptualization) by approximating these intended models.
Nicola Guarino, Formal Ontology and Information Systems, 1998.
* “models” are meant as models of possible states of affairs.
Ontology pertains to a perceived truth: a model commits to a conceptualization, typically that of a group, of how we imagine things in the world to be related.
Any information system compromises perceived reality with what can be represented in a database (dates!), and with what is performant. An RDF Schema is then no longer a “pure” ontology, and using RDF does not by itself make an ontology.
Knowledge Representation – Limitations

- Complex logical rules may become difficult to identify for the domain expert and difficult to handle for an information system or user community.
- Distinguish between modeling knowing (epistemology) and modeling being (ontology): necessary properties may nevertheless be unknown. Knowledge may be inconsistent or express alternatives.
- Human knowledge does not fit First Order Logic: there are prototype effects (George Lakoff), counterfactual reasoning (Gilles Fauconnier), analogies, fuzzy concepts. KR is an approximation.
- Concepts only become discrete if restricted to a context and a function! Paul Feyerabend maintains they must not be fixed.
Ontology Engineering – Scope Constraints for an Ontology

[Diagram: a conceptual framework talks about Real World Things and maps to the Ontology; it serves activities (communication, research, domain work). Current domain priorities select (constraint) the disciplines and viewpoints covered, and select (constraint) the precision/detail versus the affordable technical complexity.]
E53 Place

A place is an extent in space, determined diachronically with regard to a larger, persistent constellation of matter (often continents): by coordinates, geophysical features, artefacts, communities, political systems, objects, but not identical to these.

A “CRM Place” is not a landscape, not a seat: it is an abstraction from temporal changes, “the place where…”. It is a means to reason about the “where” in multiple reference systems.

Examples:
- figures from the bow of a ship
- African dinosaur foot-prints appearing in Portugal by continental drift
- where Nelson died
Knowledge Representation – A Graphical Annotation for Ontologies

[Class diagram around E53 Place:]
- E53 Place P88 consists of (forms part of) E53 Place
- E44 Place Appellation, with subclasses E45 Address, E48 Place Name, E47 Spatial Coordinates and E46 Section Definition, P87 identifies (is identified by) E53 Place
- E46 Section Definition P58 defines section of (has section definition) E18 Physical Stuff
- E18 Physical Stuff P59 is located on or within (has section) E53 Place
- E19 Physical Object and E24 Ph. M.-Made Stuff are subclasses of E18 Physical Stuff; P53 has former or current location (is former or current location of) links them to E53 Place
- E9 Move P25 moved (moved by) E19 Physical Object; P26 moved to (was destination of) and P27 moved from (was origin of) link E9 Move to E53 Place
- E12 Production Event P108 has produced (was produced by) E24 Ph. M.-Made Stuff, and P7 took place at (witnessed) E53 Place
Knowledge Representation – A Graphical Annotation for Ontology Instances

How I came to Madrid…

[Instance diagram:]
- “My walk 16-9-2006 13:45” (E9 Move) P25 moved “Martin Doerr” (E20 Person); P27 moved from “Frankfurt Airport-B10” (E53 Place); P26 moved to “EC-IYG seat 4A” (E53 Place)
- “Spanair EC-IYG” (E19 Physical Object) P59B has section “EC-IYG seat 4A”
- “Flight JK 126” (E9 Move) P25 moved “Spanair EC-IYG”; P26 moved to “Madrid Airport” (E53 Place)
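The instance graph above can be written down as plain (subject, property, object) triples (simplified identifiers, assumed for illustration) and queried to reconstruct how the walk plus the flight got the passenger to Madrid:

```python
# The "How I came to Madrid" instance graph as triples.
triples = [
    ("My walk 16-9-2006", "P25 moved",       "Martin Doerr"),
    ("My walk 16-9-2006", "P27 moved from",  "Frankfurt Airport-B10"),
    ("My walk 16-9-2006", "P26 moved to",    "EC-IYG seat 4A"),
    ("Spanair EC-IYG",    "P59 has section", "EC-IYG seat 4A"),
    ("Flight JK 126",     "P25 moved",       "Spanair EC-IYG"),
    ("Flight JK 126",     "P26 moved to",    "Madrid Airport"),
]

def objects(subject, prop):
    return [o for s, p, o in triples if s == subject and p == prop]

# Follow the walk to the seat, the seat to its whole (the plane), and
# the plane's move to the destination.
seat = objects("My walk 16-9-2006", "P26 moved to")[0]
plane = [s for s, p, o in triples
         if p == "P59 has section" and o == seat][0]
flight = [s for s, p, o in triples
          if p == "P25 moved" and o == plane][0]
assert objects(flight, "P26 moved to") == ["Madrid Airport"]
```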
Ontology Engineering – Process Planning

- Define methods of decision taking and revision (=> consensus, conflict resolution by analysis of the implicit/unconscious purpose of defenders, and examples).
- Engineer the vocabulary and get rid of the language bias (words are not concepts: “the child is safe”, “I bought a book”).
- Carry out a bottom-up process for IsA hierarchies; monotony!
- Make scope notes and definitions (but note the limitations of definition!).
- Do experimental “overmodelling” of the domain to understand the impact of simplifications.
Ontology Engineering – The Bottom-Up Structuring Process

1. Take a list of intuitive, specific terms, typically from domain documents (“practical scope”).
   Too abstract concepts are often mistaken or missed!
2. Create a list of properties for these terms:
   - essential properties to infer identity (coming into being, ending to be)
   - relevant properties (behaviour) for the discourse (change mental context!)
   - split a term into concepts if necessary (“Where was the university when it decided to take more students?”)
3. Detect new classes from property ranges.
   Typically strings, names and numbers hide concepts. Identify concepts independently of the relation: “Who can be a creator?”
Ontology Engineering – The Bottom-Up Structuring Process

4. Detect entities hidden in attributes; find their properties.
5. Property consistency test:
   test domain queries; revise properties and classes.
6. Create the class hierarchy.
   Revise properties and classes.
7. Create property hierarchies.
   Revise properties and classes.
8. Close up the model, i.e. reduce the model:
   delete properties and classes not needed to implement the required functions.
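Step 3 above (strings, names and numbers hide concepts) can be sketched as a mechanical first pass over a draft schema; the structure and names below are assumptions for illustration. Every property whose range is still a literal type is flagged as a candidate hidden class.

```python
# Draft schema: property -> (domain, range). Literal-typed ranges are
# candidates for hidden classes (step 3 of the bottom-up process).
schema = {
    "has name":        ("Patient", "String"),
    "weighs":          ("Patient", "Number"),
    "born at":         ("Patient", "Birth"),
    "has birth place": ("Birth",   "String"),
}

LITERALS = {"String", "Number", "Time"}

candidates = sorted(p for p, (dom, rng) in schema.items()
                    if rng in LITERALS)
# Each flagged property probably points at a concept, not a mere value:
# e.g. "has name" -> Name, "has birth place" -> Place.
assert candidates == ["has birth place", "has name", "weighs"]
```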
Ontology Engineering – Evidence, Relevance and Evaluation

An ontology must be:
- capable of representing the relevant senses appearing in a source (empirical base):
  analyzing texts and dictionaries, mapping database schemata;
- capable of answering queries useful for its purpose/function:
  “Where was Martin on Sept. 16, 15:00?”

Its concepts should be:
- valid under change of context:
  e.g., is “This object has name ‘pencil’” valid in a pencil shop?
- objectively recognizable and likely to be recognized (useful for integration):
  e.g., hero or criminal?
- relevant, as measured by dominance/frequency of occurrence in a source collection.

Balance subject coverage! Do not let experts get lost in details!
Ontology Engineering – Practical Tips: Theory of Identity

- A class of individual particulars must define for them a substance of their own (scope note!), without depending on relations: “Learning Object”, “Subtask”, “Expression”, “Manifestation” are not classes! “Work” is a class…
- We must know how to decide when instances come into / go out of existence.
- We must know what holds them together (unity criterion, scope note!), but we need not be able to decide on all parts!
- Instances of a class must not have conflicting properties (dead and alive, at one place and at many places?): is a collection material or immaterial? Is “Germany” one thing?
- Essential properties of a class may be unknown (Plato’s man)! The scope note only “reminds” us of a common concept, restricted by limited variation in the real world.
Ontology Engineering – Practical Tips: How to Use IsA

- No repetition of properties: what do a Doctor, a Person, an Animal and a Physical Object have in common? (Count the freight weight of a plane.)
- Dangers of multiple IsA: don’t confuse polysemy with multiple nature. Is the Zugspitze a place? Can a museum take decisions?
- Identify “primitives”. Distinguish constraints from definitions: Plato’s man; a “washing machine” may be any machine washing.
- IsA is a decrease of knowledge: “If I don’t know if he’s a hero, I know he’s a human…”
- In an open world, never define a complement: open number of siblings! Caution with disjoint classes!
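The open-world caution can be sketched (hypothetical data): what is not stated is unknown, not false, so a query must never compute a complement such as “all non-siblings”.

```python
# Open world: absence of a statement is not negation.
known = {("Nikos", "sibling of", "Anna")}

def query(s, p, o):
    if (s, p, o) in known:
        return "known"
    return "unknown"   # never "false": the set of siblings is open

assert query("Nikos", "sibling of", "Anna") == "known"
assert query("Nikos", "sibling of", "Maria") == "unknown"
```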
Ontology Engineering – Other Tips

- Avoid concepts depending on accidental and uncontextual properties (“creator”, “museum object”, “Buhmann”).
- Maintain independence from scale (“hamlet” vs. “village”); at least introduce a scale-independent superclass.
- Maintain independence from point of view: buying vs. selling.
- Most non-binary relationships acquire substance as temporal entities. Never model activities as links.
- Epistemological bias: distinguish the structure of what you can know (information system) from what you believe is (ontology). Quantification!