Synergy between Ontologization and Quality Management Michael Ellsworth, ICSI, Berkeley (Joint work with Jan Scheffczyk)

Synergy between Ontologization and Quality Management

Michael Ellsworth,

ICSI, Berkeley

(Joint work with Jan Scheffczyk)

1. Defining Quality Management

2. The necessity of ontology: The story of FE fillers

3. Metonymy: what ontologies don’t have

4. Ontologies need metonymy; does metonymy need ontologies?

Defining Quality Management for FN

• How well do data correspond to what they should be

• How do the data correspond to our expectations

– Considering only expectations based on our definitions of data categories

Simple example

• The semantic type “Non_lexical_frame” on a frame should mean that the frame has no evoking lexical units, and is purely for frame relations

• NB: The lack of this ST should signify that there are lexical units

• (Used, e.g., to remind frame-makers to add at least 1 LU, non-lexical frames excluded)

Automating QM

• Data is human-produced, so definitions of the data categories are designed to be understood by humans

• But! There is too much data to inspect every piece

• Automation requires formally or operationally defining the data categories

Formalizing data-category definitions

• For every frame F: F has at least one LU L if and only if F does not bear the ST “Non_lexical_frame”

• Presupposition: “Non_lexical_frame” should not mark anything other than a frame

• Usage restriction: All non-lexical frames should have at least two frame-to-frame relations to lexical frames
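The first rule on this slide can be sketched as an executable check. This is a minimal illustration, not CDET and not FrameNet's actual data model: the `Frame` class, its field names, and the toy frames are all assumptions made for the example.

```python
from dataclasses import dataclass, field

NON_LEXICAL = "Non_lexical_frame"

@dataclass
class Frame:
    # Hypothetical, simplified stand-in for a FrameNet frame record.
    name: str
    semantic_types: set = field(default_factory=set)
    lexical_units: list = field(default_factory=list)

def violates_lu_rule(frame):
    """True when the biconditional fails.

    The rule: a frame has at least one LU iff it is NOT marked
    Non_lexical_frame. It fails when both hold or when neither holds.
    """
    is_non_lexical = NON_LEXICAL in frame.semantic_types
    has_lus = len(frame.lexical_units) > 0
    return is_non_lexical == has_lus

# Illustrative data only.
frames = [
    Frame("Toy_lexical_frame", lexical_units=["attack.v", "assault.n"]),
    Frame("Toy_abstract_frame", semantic_types={NON_LEXICAL}),
    Frame("Toy_empty_frame"),  # no LUs, not marked: reminder to add an LU
    Frame("Toy_mismarked_frame", semantic_types={NON_LEXICAL},
          lexical_units=["x.v"]),  # marked non-lexical but has an LU
]

violations = [f.name for f in frames if violates_lu_rule(f)]
print(violations)
```

The presupposition and the usage restriction would be separate checks of the same shape, each reporting the offending items rather than silently fixing them.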

CDET (plus “glue”)

• Quality definitions in first-order logic can be evaluated with CDET (Scheffczyk 2004)

• If the fix is already known, CDET can automatically correct inconsistencies (not implemented for FrameNet)

• Otherwise, CDET produces a report on the inconsistencies with enough information for a user to understand and correct the problem

1. Defining Quality Management

2. The necessity of ontology: The story of FE fillers

3. Metonymy: what ontologies don’t have

4. Ontologies need metonymy; does metonymy need ontologies?

What’s ontology got to do with it?

• Broadly: the process of formalizing data category definitions is ontologization

• But also: formal connection to “official” ontologies and/or WordNet is necessary

– Other resources have formalizations that FrameNet doesn’t

More complex requirement

• The phrases that fill frame elements should denote the kind of entity specified by the semantic type on the frame element

– She batted her eyelashes[Body_part].

– Here the Body_part frame element is required to have the semantic type “Body_part”

Formalizing

• The semantic head of a filler phrase should be interpretable as a subtype of the FrameNet Semantic Type

– So eyelashes in the above example must be a subtype of “Body_part”

• But! Most fillers of FEs (pronouns, entity nouns, names) are not described in FrameNet itself

WordNet

• WordNet has coverage of a very large number of word-senses

• Word senses, via the connection to synsets, are hierarchically connected via the “is-a” relation, so subtypes are determinable

• WordNet does not have synset nodes to cover all FN semantic types
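Determining subtypes over “is-a” links amounts to walking up the hierarchy until the target type is found. The sketch below uses a small invented hierarchy rather than real WordNet data, and assumes one parent per node for brevity (WordNet synsets can have several hypernyms).

```python
# Toy "is-a" links (child -> parent). A real checker would read these from
# WordNet hypernym links; the entries below are invented for illustration.
IS_A = {
    "eyelash": "hair",
    "hair": "body_part",
    "body_part": "physical_entity",
    "cab": "vehicle",
    "vehicle": "physical_entity",
}

def is_subtype(node, ancestor):
    """Walk up the is-a chain from node; True if ancestor is reached."""
    seen = set()  # guards against accidental cycles in the toy data
    while node is not None and node not in seen:
        if node == ancestor:
            return True
        seen.add(node)
        node = IS_A.get(node)
    return False

eyelash_ok = is_subtype("eyelash", "body_part")
cab_ok = is_subtype("cab", "body_part")
print(eyelash_ok, cab_ok)
```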

Wrinkle: Finding headwords

• The checker subtracts the preposition and uses headwords from Minipar

• For relative clauses

– Substitutes the antecedent phrase for the relative

– It had a sharp pointed face and a feathery tail[Ant] that[Rel] arched over its back.

• Ought to take account of transparent nouns

– One of his eyebrows arched ironically.

– One is a quantifier; eyebrows is the category

• Ought to handle conjunction

Which synset for polysemous words?

• Most appropriate synset highly dependent on genre and frame element

• Therefore: we use all possible synsets
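Using all possible synsets means a filler passes if any one of its senses is a subtype of the required type. A minimal sketch, assuming a toy sense inventory and hierarchy (the sense labels and links are invented, not WordNet identifiers):

```python
# Toy sense inventory: headword -> candidate senses (illustrative labels).
SENSES = {"head": ["head.body_part", "head.leader"]}

# Toy is-a links from senses to the types used in the slides.
IS_A = {
    "head.body_part": "Body_part",
    "head.leader": "Sentient_agent",
}

def is_subtype(node, ancestor):
    """Walk up the is-a chain from node; True if ancestor is reached."""
    seen = set()
    while node is not None and node not in seen:
        if node == ancestor:
            return True
        seen.add(node)
        node = IS_A.get(node)
    return False

def filler_consistent(headword, required_type):
    """Accept the filler if ANY of its senses fits the required type."""
    return any(is_subtype(s, required_type) for s in SENSES.get(headword, []))

print(filler_consistent("head", "Body_part"))
print(filler_consistent("head", "Nation"))
```

The trade-off of this permissive choice is that a filler is only flagged when none of its senses fits, so an annotation that fits only under an unintended sense goes unnoticed.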

SUMO connects semantic types and WN

• FrameNet semantic types correspond relatively well to SUMO concepts (see PDF1)

• Cases of mismatch are handled with new nodes defined in SUO-KIF (the language of SUMO) connecting to pre-existing SUMO concepts

• WordNet is already mapped to SUMO (Niles & Pease 2003)

Headwords not in WordNet

• Pronouns

– Ex: She[Agent] swung her head in his direction.

– She, he, who, etc. connected to SUMO Sentient_agent

• Named entities

– Depending on NEs used, readily mappable to SUMO

Prepositions

• Currently subtracted and not analyzed

• Correct for marker prepositions; noun’s type is assigned regardless of preposition or lack of preposition:

– He gives money to local charities.

– I’m just going to give her some milk.

– You’re doing it for the child she’s foisting on you.

• This is incorrect for relation-defining prepositions:

– We walked together to the cab.

– To here correctly maps to the SUMO node Goal
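The difference between the two treatments can be sketched as follows. The slides do not say how a marker use would be told apart from a relation-defining use, so the `relation_defining` flag is simply passed in as an assumption; the noun types and the last-word head finder are also toy stand-ins. Only the mapping of "to" to Goal comes from the slide.

```python
PREPOSITIONS = {"to", "for", "by", "in", "on"}
RELATION_NODE = {"to": "Goal"}  # the one mapping given in the slides
NOUN_TYPE = {"charities": "Organization", "cab": "Vehicle"}  # toy labels

def type_filler(phrase, relation_defining=False):
    """Return a type label for a filler phrase, or None if unknown.

    Marker preposition: strip it and type the noun.
    Relation-defining preposition: the preposition itself fixes the type.
    """
    words = phrase.lower().split()
    if not words:
        return None
    prep = words[0] if words[0] in PREPOSITIONS else None
    head = words[-1]  # toy head finder: last word of the phrase
    if prep and relation_defining:
        return RELATION_NODE.get(prep)
    return NOUN_TYPE.get(head)

print(type_filler("to local charities"))
print(type_filler("to the cab", relation_defining=True))
print(type_filler("to the cab"))
```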

Actual errors

• Rare!

– E.g.: He swung himself[Body_part] around the corner …

• Himself maps to SUMO Sentient_agent, which is not consistent with Body_part

• The above sentence actually evokes Cause_to_move_in_place
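How such an error surfaces can be sketched with a pronoun fallback table and a type check. The pronoun list follows the slides; the two-link hierarchy and the three-way result labels are assumptions for the example.

```python
# Pronouns are not in WordNet, so they get a direct mapping (per the slides).
PRONOUN_TYPE = {p: "Sentient_agent" for p in ("she", "he", "who", "himself")}

# Toy fragment of a type hierarchy; the parents are illustrative.
IS_A = {"Sentient_agent": "Agent", "Body_part": "Object"}

def is_subtype(node, ancestor):
    """Walk up the is-a chain from node; True if ancestor is reached."""
    seen = set()
    while node is not None and node not in seen:
        if node == ancestor:
            return True
        seen.add(node)
        node = IS_A.get(node)
    return False

def check_filler(headword, required_type):
    """Classify a filler as 'consistent', 'inconsistent', or 'unknown'."""
    filler_type = PRONOUN_TYPE.get(headword.lower())
    if filler_type is None:
        return "unknown"  # nothing to check against
    if is_subtype(filler_type, required_type):
        return "consistent"
    return "inconsistent"

print(check_filler("himself", "Body_part"))
print(check_filler("She", "Sentient_agent"))
```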

1. Defining Quality Management

2. The necessity of ontology: The story of FE fillers

3. Metonymy: what ontologies don’t have

4. Ontologies need metonymy; does metonymy need ontologies?

Assailants, Victims, and Metonymy

• The method so far does not correctly account for many other FEs in other frames

• E.g. the Assailant FE, especially so in a narrow genre

– For the following, annotation of text from the Nuclear Threat Initiative and related texts is used

Background:

• Assailant has the semantic type Sentient, mapped to SUMO Sentient_agent

• The most common fillers of the Assailant FE of the Attack frame are as follows

Filler            Headword    Frequency
It                It          3
Its               Its         3
Iraqi             Iraqi       2
Iran              Iran        2
Terrorist         Terrorist   2
The US            US          2
Iraq              Iraq        1
Al-Qaida          Al-Qaida    1
His forces        Force       1
By Iraq           Iraq        1
US                US          1
U.S.              U.S.        1
Chadian forces    Force       1

This results in the following hierarchy:

(See PDF2)

• The problem:

– 38% of the fillers evoke the Nation concept in SUMO

– Nation is not a subtype of Sentient_agent

• Clash with linguistic intuition: fillers like “Iraq” are completely unobjectionable as Assailants

• Since we know Iraq is a legitimate instance of SUMO Nation, and yet it fits, the problem could only be in the SUMO hierarchy (no) or in FN’s semantic type assignment…
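The 38% figure is consistent with the filler table above if the headwords Iran, Iraq, US and U.S. are the ones counted as evoking Nation. That grouping is an assumption made for this sketch (the slides do not list which headwords were counted), and the table shows only the most common fillers.

```python
# (filler, headword, frequency) rows copied from the table of Assailant fillers.
fillers = [
    ("It", "It", 3), ("Its", "Its", 3), ("Iraqi", "Iraqi", 2),
    ("Iran", "Iran", 2), ("Terrorist", "Terrorist", 2), ("The US", "US", 2),
    ("Iraq", "Iraq", 1), ("Al-Qaida", "Al-Qaida", 1), ("His forces", "Force", 1),
    ("By Iraq", "Iraq", 1), ("US", "US", 1), ("U.S.", "U.S.", 1),
    ("Chadian forces", "Force", 1),
]

# Assumption: these headwords are the ones that map to SUMO Nation.
NATION_HEADWORDS = {"Iran", "Iraq", "US", "U.S."}

total = sum(freq for _, _, freq in fillers)
nation = sum(freq for _, head, freq in fillers if head in NATION_HEADWORDS)
share = nation / total

print(f"{nation} of {total} fillers = {share:.0%}")
```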

Poorly fitting types? Disjunctive Types?

• Lifting the FN semantic type to a more general level connects it to the SUMO Agent node, covering Nation

– But this also covers Geopolitical_area (etc.), including things like city and senate district that are very unlikely to launch literal attacks

• Or should we make a disjunctive type?
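The two options can be compared on a toy fragment of the hierarchy. The links below follow the relationships described on this slide, simplified; they are not taken from the SUMO files, and `City` is added only as an example of an unwanted filler.

```python
# Toy fragment: child -> parent, as described in the slides (simplified).
IS_A = {
    "Sentient_agent": "Agent",
    "Geopolitical_area": "Agent",
    "Nation": "Geopolitical_area",
    "City": "Geopolitical_area",
}

def is_subtype(node, ancestor):
    """Walk up the is-a chain from node; True if ancestor is reached."""
    seen = set()
    while node is not None and node not in seen:
        if node == ancestor:
            return True
        seen.add(node)
        node = IS_A.get(node)
    return False

def accepts(filler_type, allowed_types):
    """A filler is accepted if it is a subtype of any allowed type."""
    return any(is_subtype(filler_type, t) for t in allowed_types)

current = {"Sentient_agent"}                 # semantic type as it stands
lifted = {"Agent"}                           # option 1: lift to Agent
disjunctive = {"Sentient_agent", "Nation"}   # option 2: disjunctive type

for name, allowed in [("current", current), ("lifted", lifted),
                      ("disjunctive", disjunctive)]:
    print(name, accepts("Nation", allowed), accepts("City", allowed))
```

Lifting admits both Nation and City; the disjunctive type admits Nation without City, at the cost of a type that no longer corresponds to a single node.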

… or not!

• All instances of Attack require that there be some actual person(s) filling the Assailant role:

– The initial Iraqi attack destroyed most of the Kuwaiti jet fighters.

• This implicates the following:

– A person or people empowered to act for Iraq made an attack.

Metonymy

• The implication that a covert entity (the people) fills the same role as a related overtly mentioned entity (Iraq) is the hallmark of metonymy

• Metonymy is pervasive:

– Where am I (= my car) parked?

– Alternations like possession (relation) and possession (something in that relation)

Adding Metonymy to the Ontology?

• The solution to the above quandary is to add in an explicit metonymy link to the ontology connecting Nation to People

• Metonymy is normally contextually limited

– By Frame and FE

• ##The nation kissed her/them

– At least statistically, also by genre

• Nations don’t occur as Assailant in the general domain
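A metonymy link that is licensed only for particular frame and FE pairs can be sketched as a lookup consulted when the direct subtype check fails. The table layout, the assumption that People is a subtype of Sentient_agent, and the frame name `Hypothetical_kissing` are all illustrative; genre is left out.

```python
# Toy hierarchy; "People" under Sentient_agent is an assumption of this sketch.
IS_A = {
    "People": "Sentient_agent",
    "Sentient_agent": "Agent",
    "Nation": "Geopolitical_area",
    "Geopolitical_area": "Agent",
}

# Metonymy links, licensed per (frame, frame element).
METONYMY = {("Attack", "Assailant"): {"Nation": "People"}}

def is_subtype(node, ancestor):
    """Walk up the is-a chain from node; True if ancestor is reached."""
    seen = set()
    while node is not None and node not in seen:
        if node == ancestor:
            return True
        seen.add(node)
        node = IS_A.get(node)
    return False

def consistent(frame, fe, filler_type, required_type):
    """Direct subtype check first, then a context-licensed metonymy link."""
    if is_subtype(filler_type, required_type):
        return True
    target = METONYMY.get((frame, fe), {}).get(filler_type)
    return target is not None and is_subtype(target, required_type)

print(consistent("Attack", "Assailant", "Nation", "Sentient_agent"))
print(consistent("Hypothetical_kissing", "Agent", "Nation", "Sentient_agent"))
```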

Metonymy unloved?

• Some associated axioms would involve a difficult-to-define relation of association and subtypes thereof

• Given its contextual dependence, metonymy is unlikely to win the acceptance of run-of-the-mill ontologists, who want fixed facts, not fixed meta-facts

• Axioms concerning communication itself could encode contextual dependence

Metaphor

• May also change selectional restrictions drastically

– I chewed on the question for a few days.

• But not always

– She shoved several superiors out of her way in her climb to the top.

• Similarly contextual to metonymy

1. Defining Quality Management

2. The necessity of ontology: The story of FE fillers

3. Metonymy: what ontologies don’t have

4. Ontologies need metonymy; does metonymy need ontologies?

Ontologies and language

• Ontologists clearly see the need to connect to language

– Ease of accessibility concerns with ontologies

– Connections to WordNet

• Unclear if ontologies can ever have a reasonable interface with language without coming to grips with metaphor and metonymy

• Unclear if current ontologies can incorporate these notions

Ontologies and language?

• There are other vital and pervasive aspects of language-based reasoning absent from ontologies:

– Fuzzy, radial categories

– Use of underspecification

– Contextuality

• Are these difficulties sufficient grounds to discard ontologies?

Language-based Ontology

• Ontologies (or something) will be necessary for reasoning

• Older ontologies may have great difficulty incorporating language in any deep way

• Newer ontologies, some of which seem to be built on more convenient principles (e.g. the Generic Concept Library), might be more attractive

FIN

Kinds of Contradictions

• When the data have well-defined properties that bear on other data:

– The data can be directly checked against itself

– Such checks already largely implemented with CDET

• When the data have properties that are only confirmable by outside knowledge:

– The data can only be checked if the outside knowledge can somehow be accessed by the automated checker
