Interlingua-based MT

Upload: lynn-malone

Post on 17-Dec-2015


Page 1: Interlingua-based MT Interlingua-based Machine Translation Syntactic transfer-based MT – Couples the syntax of the two languages What if we abstract

Interlingua-based MT

Interlingua-based Machine Translation

• Syntactic transfer-based MT
  – Couples the syntax of the two languages

• What if we abstract away the syntax?
  – All that remains is meaning
  – Meaning is the same across languages
  – Simplicity: only N components (one analyzer/generator pair per language) are needed to translate among N languages

• Two “small” problems:
  – What is meaning?
  – How do we represent meaning?
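The simplicity argument can be checked with a little arithmetic. The sketch below assumes the usual accounting: a transfer system needs one module per ordered language pair, while an interlingua system needs one analyzer and one generator per language. The function names are illustrative only.

```python
def transfer_modules(n_languages: int) -> int:
    """One transfer module per ordered (source, target) language pair."""
    return n_languages * (n_languages - 1)


def interlingua_modules(n_languages: int) -> int:
    """One analyzer plus one generator per language."""
    return 2 * n_languages


for n in (3, 5, 10):
    print(n, transfer_modules(n), interlingua_modules(n))
```

Under these assumptions the two designs cost the same at three languages, and the interlingua design grows linearly while the transfer design grows quadratically.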

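[Figure: the MT pyramid between Source and Target. Direct MT links the two at the base; Transfer-based MT links their Syntactic Structures, reached by Parsing and left by Syntactic Generation; the Interlingua sits at the apex, reached by Semantic Interpretation and left by Semantic Generation.]

[Figure: English, Spanish and Japanese analyzers feed a single interlingual representation, from which English, Spanish and Japanese generators produce the output.]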

Example of Interlingua Machine Translation

Logical form: Need(I, e1); e1: Make(I, e2); collect_call(e2)

English dependency tree:

need
  I
  make
    to call
      a collect

Interlingua representation:

Event: Need
Tense: present
Agent: I
Theme:
  Event: Make
  Tense: infinitive
  Agent: I
  Theme: call
    attributes: collect
    Definiteness: indef

Japanese dependency tree:

必要があります (need)
  私は (I)
  かける (make)
    コールを (call)
      コレクト (collect)
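A minimal sketch of why such a frame is language neutral: the same nested structure can be looked up in per-language lexicons. The dictionary layout, the concept names in capitals and the helper functions are assumptions made for illustration; the word glosses are the ones shown on the slide. This is word lookup only, not sentence generation.

```python
# Interlingua frame for "I need to make a collect call" (hypothetical encoding).
frame = {
    "event": "NEED", "tense": "present", "agent": "I",
    "theme": {
        "event": "MAKE", "tense": "infinitive", "agent": "I",
        "theme": {"concept": "CALL", "attributes": "COLLECT",
                  "definiteness": "indef"},
    },
}

# Toy per-language lexicons: concept -> word (glosses taken from the slide).
LEXICON = {
    "en": {"NEED": "need", "MAKE": "make", "CALL": "call",
           "COLLECT": "collect", "I": "I"},
    "ja": {"NEED": "必要があります", "MAKE": "かける", "CALL": "コールを",
           "COLLECT": "コレクト", "I": "私は"},
}


def concepts(node):
    """Yield the concepts of a frame head-first, depth-first."""
    if "event" in node:
        yield node["event"]
        yield node["agent"]
        yield from concepts(node["theme"])
    else:
        yield node["concept"]
        yield node["attributes"]


def lexicalize(node, lang):
    """Look up every concept of the frame in one language's lexicon."""
    return [LEXICON[lang][c] for c in concepts(node)]


print(lexicalize(frame, "en"))
print(lexicalize(frame, "ja"))
```

A real generator would also have to choose word order and morphology per language; the sketch deliberately stops before that step.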

Ingredients of a semantic representation

• language neutral
  – Syntactic variations should result in the same semantics
• sense of a word
• deep semantic role labels
• scope of quantifiers, adverbials, adjectives
• polarity information

Distinguish between surface structure (syntactic structure) and deep structure (semantic structure) of sentences.

Different forms of semantic representation:

• logic formalisms
• ontology / semantic representation languages
  – Case Frame Structures (Fillmore)
  – Conceptual Dependency Theory (Schank)
  – Description Logic (DL) and similar KR languages
  – Ontologies

Text Meaning Representation

• Lexicon has two components
  – Syntactic part
  – Semantic constraints part

• Given a sentence, the syntactic part analyzes the input syntactically, and the semantic constraints create semantic expressions that can be evaluated.

• Ontology specifies the type hierarchy
  – Used for checking selectional restrictions
  – Selectional restrictions are used for word-sense disambiguation
  – e.g. accident is an event; organization has humans
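How a type hierarchy supports selectional restrictions can be sketched in a few lines. Only "accident is an event" and the agent/theme restriction on manufacturing-activity come from these slides; the remaining is-a links and all function names are hypothetical.

```python
# Hypothetical is-a links (child -> parent).
IS_A = {
    "accident": "event",
    "manufacturing-activity": "event",
    "human": "animal",
    "animal": "object",
    "tool": "artifact",
    "artifact": "object",
}

# Selectional restrictions: event concept -> {role: required type}.
RESTRICTIONS = {
    "manufacturing-activity": {"agent": "human", "theme": "artifact"},
}


def is_a(concept, ancestor):
    """Walk up the hierarchy until the ancestor is found or the root is passed."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = IS_A.get(concept)
    return False


def satisfies(event, fillers):
    """Check every restricted role of the event against its filler's type."""
    wanted = RESTRICTIONS[event]
    return all(is_a(fillers[role], required) for role, required in wanted.items())


print(satisfies("manufacturing-activity", {"agent": "human", "theme": "tool"}))
print(satisfies("manufacturing-activity", {"agent": "human", "theme": "accident"}))
```

A word sense whose fillers fail the check can be discarded, which is the disambiguation use mentioned above.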

Constructing a Semantic Representation

General approach:

• Start with the surface structure derived from the parser.

• Map the surface structure to a semantic structure:
  – Use phrases as sub-structures.
  – Find concepts and representations for central phrases (e.g. VP, NP, then PP).
  – Assign phrases to appropriate roles around central concepts (e.g. bind PP into VP representation).

Semantic Representation

Semantic Representations are based on some form of (formal) Representation Language.

• Semantic Networks

• Conceptual Dependency Graphs

• Case Frames

• Ontologies

• DL and similar KR languages

Important note: relations between text strings are not the same as relations between referents in the world.


Ontology (Interlingua) approach

Ontology: a language-independent classification of objects, events, relations

A Semantic Lexicon, which connects lexical items to nodes (concepts) in the ontology

An analyzer that constructs Interlingua representations and selects an appropriate one


Semantic Lexicon

Provides a syntactic context for the appearance of the lexical item

Provides a mapping for the lexical item to a node in the ontology (or more complex associations)

Provides connections from the syntactic context to semantic roles and constraints on these roles


Constructing an InterLingua Representation

For each syntactic analysis:

Access all semantic mappings and contexts for each lexical item.

Create all possible semantic representations.

Test them for coherency of structure and content.
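The three steps (access all mappings, create all combinations, test for coherency) amount to generate-and-filter. The sketch below is one way to picture it; the sense inventory, the second sense of "makes" and the constraint table are invented for illustration.

```python
from itertools import product

# Hypothetical sense inventory: each word maps to candidate concepts.
SENSES = {
    "john": ["human"],
    "makes": ["manufacturing-activity", "cause-event"],
    "tools": ["tool"],
}

# Hypothetical constraints: event concept -> (allowed agents, allowed themes).
CONSTRAINTS = {
    "manufacturing-activity": ({"human"}, {"tool"}),
    "cause-event": ({"human"}, {"event"}),
}


def candidates(subject, verb, obj):
    """Create every combination of senses for one syntactic analysis."""
    for agent, event, theme in product(SENSES[subject], SENSES[verb], SENSES[obj]):
        yield {"event": event, "agent": agent, "theme": theme}


def coherent(rep):
    """Test one candidate against the constraints of its event concept."""
    agents, themes = CONSTRAINTS[rep["event"]]
    return rep["agent"] in agents and rep["theme"] in themes


def analyze(subject, verb, obj):
    """Keep only the coherent candidates."""
    return [rep for rep in candidates(subject, verb, obj) if coherent(rep)]


print(analyze("john", "makes", "tools"))
```

The number of candidates is the product of the sense counts, which is why the ambiguity figures reported later in the deck grow so quickly.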

Basic Semantic Dependency - Example

Input: John makes tools

Syntactic Analysis:

cat      verb
root     make
tense    present
subject
  root   john
  cat    noun-proper
object
  root   tool
  cat    noun
  number plural

Lexicon Entries for John and tool

John-n1
  syn-struc
    root    john
    cat     noun-proper
  sem-struc
    human
      name    john
      gender  male

tool-n1
  syn-struc
    root    tool
    cat     n
  sem-struc
    tool

Ontological Representation - Example

Relevant extract from the specification of the ontological concept used to describe the appropriate meaning of make:

manufacturing-activity
  ...
  agent  human     (who)
  theme  artifact  (what)

Semantic Dependency Component

The basic semantic dependency component of the “Text Meaning Representation” (TMR) for:

John makes tools

manufacturing-activity-7
  agent  human-3
  theme  set-1
    element      tool
    cardinality  > 1
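Putting the previous slides together, the step from the syntactic analysis to the semantic dependency can be sketched as follows. The dictionary encoding, the helper names and the instance numbering are assumptions; the concepts and roles follow the "John makes tools" example.

```python
from itertools import count

# Syntactic analysis of "John makes tools" (hypothetical dictionary encoding).
syntax = {
    "root": "make", "cat": "verb", "tense": "present",
    "subject": {"root": "john", "cat": "noun-proper"},
    "object": {"root": "tool", "cat": "noun", "number": "plural"},
}

# Lexicon: root -> ontological concept.
LEXICON = {"john": "human", "tool": "tool", "make": "manufacturing-activity"}

_ids = count(1)


def instance(concept):
    """Name a fresh instance of a concept; the numbering is arbitrary."""
    return f"{concept}-{next(_ids)}"


def filler(np):
    """Turn a noun phrase into a role filler; plurals become sets."""
    concept = LEXICON[np["root"]]
    if np.get("number") == "plural":
        return {"id": instance("set"), "element": concept, "cardinality": "> 1"}
    return {"id": instance(concept)}


def build_tmr(analysis):
    """Map subject to agent and object to theme around the event concept."""
    return {
        "id": instance(LEXICON[analysis["root"]]),
        "agent": filler(analysis["subject"]),
        "theme": filler(analysis["object"]),
    }


tmr = build_tmr(syntax)
print(tmr)
```

The fixed subject-to-agent and object-to-theme mapping is a simplification; in the lexicon entries shown in this deck that linking is stated per word sense.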

Semantic representation of “try-v3”

try-v3
  syn-struc
    root   try
    cat    v
    subj   root  $var1
           cat   n
    xcomp  root  $var2
           cat   v
           form  OR infinitive gerund
  sem-struc
    set-1     element-type  refsem-1
              cardinality   >=1
    refsem-1  sem     event
              agent   ^$var1
              effect  refsem-2
    modality
      modality-type   epiteuctic
      modality-scope  refsem-2
      modality-value  < 1
    refsem-2  value  ^$var2
              sem    event

The modality value < 1 means “non finished action; outcome unclear”.

“Why is Iraq developing weapons of mass destruction?”

REQUEST-INFO-130
  THEME          DEVELOP-2601.PURPOSE DEVELOP-2601.REASON
  TEXT-POINTER   why
  INSTANCE-OF    REQUEST-INFO

DEVELOP-2601
  THEME          SET-2555
  AGENT          NATION-97
  PHASE          CONTINUOUS
  TIME           FIND-ANCHOR-TIME
  INSTANCE-OF    DEVELOP
  TEXT-POINTER   developing

NATION-97
  HAS-NAME       Iraq
  INSTANCE-OF    NATION
  TEXT-POINTER   Iraq

SET-2555
  ELEMENT-TYPE   WEAPON
  CARDINALITY    > 1
  INSTRUMENT-OF  KILL-1864
  THEME-OF       DEVELOP-2601
  INSTANCE-OF    WEAPON
  TEXT-POINTER   weapons

KILL-1864
  THEME          SET-2556
  INSTRUMENT     SET-2555
  INSTANCE-OF    KILL
  TEXT-POINTER   destruction

SET-2556
  THEME-OF       KILL-1225
  ELEMENT-TYPE   HUMAN
  CARDINALITY    > 100
  INSTANCE-OF    HUMAN
  TEXT-POINTER   mass

Word Sense Disambiguation

Methods

• Constraint checking
  – make sure the constraints imposed on context are met

• Graph traversal
  – is-a links are inexpensive
  – other links are more expensive
  – the “cheapest” structure is the most coherent one

• Hunter-gatherer processing
  – find (hunt) and eliminate (kill) unlikely interpretations
  – collect (gather) remaining interpretations
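The graph-traversal method can be pictured as a shortest-path search in which is-a links are cheap. The sketch below uses Dijkstra's algorithm as a stand-in; the ontology fragment, the link names and the cost values are all invented, and the slides do not say which search procedure the actual system uses.

```python
import heapq

# Hypothetical ontology fragment: concept -> [(neighbor, link type)].
LINKS = {
    "cancer": [("disease", "is-a")],
    "disease": [("diagnose", "theme-of")],
    "cancer-zodiac": [("zodiac-sign", "is-a")],
    "zodiac-sign": [("person", "describes")],
    "person": [("diagnose", "agent-of")],
}
COST = {"is-a": 1}   # is-a links are inexpensive
DEFAULT_COST = 5     # every other link is more expensive (assumed weight)


def path_cost(start, goal):
    """Cheapest path cost from start to goal, or None if unreachable."""
    queue, seen = [(0, start)], set()
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            return cost
        if node in seen:
            continue
        seen.add(node)
        for neighbor, kind in LINKS.get(node, []):
            heapq.heappush(queue, (cost + COST.get(kind, DEFAULT_COST), neighbor))
    return None


def cheapest_sense(senses, context):
    """Pick the sense with the cheapest path to the context concept."""
    scored = [(path_cost(sense, context), sense) for sense in senses]
    scored = [(cost, sense) for cost, sense in scored if cost is not None]
    return min(scored)[1] if scored else None


print(cheapest_sense(["cancer", "cancer-zodiac"], "diagnose"))
```

With these weights the disease sense of "cancer" wins in the context of DIAGNOSE, because its path uses one is-a link and one other link, while the zodiac sense needs two non-is-a links.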

Ontological Semantics: An example semantic representation language

slides from S. Nirenburg

Ontological semantics is a computationally tractable theory of meaning in natural language as well as a suite (OntoSem) of implemented NLP programs and a set of static knowledge resources that support these programs.

Ontological semantics deals directly with extraction, representation and manipulation of text meaning.

OntoSem text analyzers produce interpreted knowledge ready to be used in reasoning-heavy applications that include question answering, cross-document and cross-lingual text summarization, machine translation and others.

Support of intelligent human-computer interaction in domain- and task-oriented environments is squarely within the purview of ontological semantics.

Ontological semantics concentrates on the content of representations and is adaptable to a number of different representation formats.

Ontological semantics is both a producer and a consumer of knowledge: deriving text meaning is itself a knowledge-intensive task.

OntoSem

• is devoted to processing naturally occurring texts
• strives for high-quality results first, followed by concern for broad coverage
• expects “unexpected” inputs
• seeks quality heuristics of any provenance (knowledge-based or probabilistic, co-occurrence-based)
• does not grant syntax a privileged position among the providers of heuristics for semantic processing
• does not make a strong distinction between semantics and pragmatics
• is applicable to any natural language


Ontological-semantic analyzers take natural language texts as inputs and generate machine-tractable text meaning representations (TMRs) that form the basis of various reasoning processes.

Sample Input Sentence:

Iran, Iraq and North Korea on Wednesday rejected an accusation by President Bush that they are developing weapons of mass destruction.

The TMR (presented graphically) for the above is as follows:

Output: A Text Meaning Representation (TMR)

This presentation is simplified; the system, in fact, derives much more from text. Event instances are shown in ellipses; object instances, in rectangles. Only case role and set membership relations are shown (as labels on links). Numerical constraints can be fuzzy, as in the cardinality of SET-1226.

A pretty-printed fragment of the actual TMR representation for the sample input

DENY-1224   ;; speech act
  AGENT         SET-1224
  THEME         DEVELOP-1224
  TIME          < FIND-ANCHOR-TIME WEDNESDAY-1224
  INSTANCE-OF   DENY
  TEXT-POINTER  reject

ACCUSE-1224   ;; President Bush's accusation
  AGENT         HUMAN-15691
  BENEFICIARY   SET-1224
  THEME         DEVELOP-1224
  INSTANCE-OF   ACCUSE
  TEXT-POINTER  accusation

DEVELOP-1224   ;; developing weapons
  THEME         SET-1225
  THEME-OF      DENY-1224 ACCUSE-1224
  AGENT         SET-1224
  PHASE         CONTINUOUS
  PURPOSE       WARN-1224
  TIME          FIND-ANCHOR-TIME
  INSTANCE-OF   DEVELOP
  TEXT-POINTER  developing

Annotations on the fragment:

• Word Sense Disambiguation
• Instances of Ontological Concepts
• Semantic Dependencies (fillers of ontological properties mentioned in text; not simply relations among textual strings)
• Triggers for further context-dependent processing
• Many additional properties stored with concepts underlying instances

Ontological-semantic systems centrally rely on the following static knowledge resources:

• a language-independent ontology that includes knowledge about types of entities in the world, e.g., ATHLETE, WELD or SPEED;
• ontology-oriented lexicons (and onomasticons, or lexicons of proper names) for each natural language in the system; and
• a fact repository containing instances of ontological concepts, e.g., Andre Agassi (ATHLETE-3176) or the Apollo 13 mission (SPACEFLIGHT-142)


A Sample Screen of the Ontology/Lexicon/Fact Repository Browsing and Editing Environment

(diagnosis
  (diagnosis-n1
    (cat n)
    (anno
      (def "")
      (ex "The diagnosis (of cancer) (by the specialist) was made quickly")
      (comments ""))
    (syn-struc
      ((root $var0) (cat n)                                     ; diagnosis
       (pp-adjunct ((root of) (root $var1) (cat prep) (opt +)   ; of
                    (obj ((root $var2) (cat n)))))              ; disease
       (pp-adjunct ((root by) (root $var3) (cat prep) (opt +)   ; by
                    (obj ((root $var4) (cat n)))))))            ; someone
    (sem-struc
      (DIAGNOSE                      ; the ontological mapping
        (agent (value ^$var4))       ; the case roles
        (theme (value ^$var2)))
      (^$var1 (null-sem +))          ; blocks compositional analysis of preps
      (^$var3 (null-sem +))))
)
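The linking that this entry encodes (syntactic variables on one side, case roles on the other) can be sketched as a small binding step. The Python rendering of the entry and the function name are assumptions; the roles, variables and null-sem marks are the ones in the entry above.

```python
# Simplified, hypothetical rendering of the diagnosis-n1 entry as Python data.
DIAGNOSIS_N1 = {
    "concept": "DIAGNOSE",
    "roles": {"agent": "$var4", "theme": "$var2"},  # case roles -> variables
    "null_sem": {"$var1", "$var3"},                 # the prepositions "of", "by"
}


def instantiate(entry, bindings):
    """Fill the case roles from syntactic variable bindings.

    Variables marked null-sem contribute no meaning of their own, and
    optional adjuncts missing from the input leave their role unfilled.
    """
    frame = {"concept": entry["concept"]}
    for role, var in entry["roles"].items():
        if var in bindings and var not in entry["null_sem"]:
            frame[role] = bindings[var]
    return frame


# Bindings a parser might produce for
# "The diagnosis of cancer by the specialist was made quickly".
bindings = {"$var0": "diagnosis", "$var1": "of", "$var2": "cancer",
            "$var3": "by", "$var4": "specialist"}
print(instantiate(DIAGNOSIS_N1, bindings))
```

Dropping the by-phrase from the bindings yields a frame with only the theme filled, which mirrors the (opt +) marking on both adjuncts.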

(cancer
  (cancer-n1
    (cat n)
    (anno (def "a disease") (ex "") (comments ""))
    (syn-struc
      ((n ((root $var1) (cat n) (opt +)))   ; animal part as modifier
       (root $var0) (cat n)))               ; cancer
    (sem-struc
      (CANCER
        (location (value ^$var1) (sem animal-part)))))
  (cancer-n2
    (cat n)
    (anno (def "a sign of the zodiac") (ex "") (comments ""))
    (syn-struc
      ((root $var0) (cat n)))
    (sem-struc
      (CANCER-ZODIAC))))

Currently Available Static Knowledge Sources for English:

• Ontology of about 6,500 concepts (about 95,000 property-value pairs)
• English lexicon of about 40,000 entries
• Fact repository of about 20,000 facts (outside medical domain)
• English Onomasticon of about 350,000 entries
• Tokenization knowledge, morphological and syntactic grammars for a number of languages

The analyzer’s conceptual architecture (in reality, not strictly pipelined)

[Figure: Processing modules: Input Text, Preprocessor, Syntactic Analyzer, Semantic Analyzer, TMR. Static knowledge resources: Grammar (Ecology, Morphology, Syntax); Lexicon and Onomasticon; Ontology and Fact Repository.]


The basic (“who did what to whom”) semantic dependency is derived, in the general case, on the basis of

a) lexical-semantic expectations (selectional restrictions) recorded in the ontology and the lexicon and

b) syntactic dependency derived from the results of syntactic analysis.

The beginnings of system evaluation

Run I: “raw”
Run II: preprocessor output correct
Run III: preprocessor and syntactic analysis output correct

Sentence                     1      2      3     4      5      6      Average
Words                        28     33     8     24     33     26     25.33
Senses                       79     86     29    150    96     76     86
Words in / not in lexicon    28/0   32/1   5/3   24/0   31/2   24/2   24/1.33
Syntactic ambiguity count    192    32     16    19     63     47     61.5
Overall ambiguity count      >1.7M  >149M  64    >199M  >418M  >268K  >120M
WS disambiguation I          52%    48%    50%   46%    30%    50%    48.0%
Semantic dependencies I      67%    33%    17%   40%    33%    29%    36.5%
WS disambiguation II         96%    68%    67%   83%    88%    54%    76.0%
Semantic dependencies II     69%    50%    63%   33%    69%    29%    52.2%
WS disambiguation III        96%    100%   67%   88%    90%    100%   90.2%
Semantic dependencies III    85%    100%   63%   90%    100%   86%    87.3%

In addition to the basic semantic dependency, TMRs also include parameterized information provided by the microtheories of aspect, modality (including speaker attitudes), time, style and others.

Most of these microtheories have been implemented. All would benefit from further work. We are also actively looking into possibilities of borrowing some microtheories, either in toto or partially.


FrameNet: Another example of semantic representation

Frame Semantics (Fillmore 1976, 1977, ..)

• Frame: a conceptual structure or prototypical situation

• Frame elements (roles)
  – Identify participants of the situation
  – Are local to their frame

• Frame evoking elements (verbs, nouns, adjectives) introduce frames

• E.g. VERDICT:

[The jury]Judge convicted [him]Defendant [on the counts of theft]Charges.

On Thursday [a jury]Judge found [the youth]Defendant [guilty of wounding Mr Lay]Finding.
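The bracket notation used in these examples is easy to read into a role-to-filler mapping. The sketch below is a convenience for this transcript's notation only, not a FrameNet tool; the regular expression and function name are assumptions, and it handles flat (non-nested) annotations where the role label directly follows the closing bracket.

```python
import re

# Matches "[filler]Role": bracketed text followed by a role label.
ANNOTATION = re.compile(r"\[([^\[\]]+)\]\s*([A-Za-z_]+)")


def frame_elements(annotated):
    """Return {role: filler} for a flat annotated sentence."""
    return {role: filler for filler, role in ANNOTATION.findall(annotated)}


sentence = "[The jury]Judge convicted [him]Defendant [on the counts of theft]Charges."
print(frame_elements(sentence))
```

Nested annotations, such as those in the case study later in the deck, would need a real bracket parser rather than a regular expression.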

Berkeley FrameNet Project

• Database of frames for core lexicon of English

• Current release: 610 frames, about 9000 lexical units


Types of Relations

FrameNet Relations

• Frame hierarchy: inherits

• Subframes

Contextual Relations between instantiated frames and roles

• Syntactic and/or semantic embedding

• Discourse relations

• Anaphoric relations

Inferences

• On the basis of both


A Case Study

In the first trial in the world in connection with the terrorist attacks of 11 September 2001, the Higher Regional Court of Hamburg has passed down the maximum sentence. Mounir al Motassadeq will spend 15 years in prison. The 28-year-old Moroccan was found guilty as an accessory to murder in more than 3000 cases.

FrameNet “as a Net”: Frame-to-Frame Relations

Subframe relation

• Super frame represents complex event

• Subframes represent sub-events

• Subframes usually inherit some roles of the super frame

[Figure: the super frame Criminal process with subframes Arraignment, Arrest, Sentencing and Trial. Role labels shown: Charge, Judge, Defendant, Defense, Court, Jury, Offense, Prosecution; the subframes repeat Charge and Defendant, among others.]


Local Roles

In the first trial in the world in connection with [the [terrorist]Assailant attacks of [11 September 2001]Time]Case, [the Higher Regional Court of Hamburg]Court has passed down the [maximum]Type sentence.


Local Roles

[Mounir al Motassadeq]Inmates will spend [15 years]Duration in prison.


Local Roles

[The 28-year-old Moroccan]Defendant was found [guilty]Finding as [an accessory to [murder]FocalEntity [in more than 3000 cases]Victim ]Charge.

Unfilled Roles

Target     Frame       Frame role         Filler (given vs. induced)
trial      TRIAL       CASE               terrorist attacks (1)
                       CHARGE             accessory to murder (2)
                       COURT              Higher Regional Court (3)
                       DEFENDANT          28-year-old Moroccan (4)
                       ...
attacks    ATTACK      ASSAILANT          terrorist (5)
                       VICTIM             (6)
                       ...
                       TIME (exth.)       11 September 2001 (7)
sentence   SENTENCING  CONVICT            Mounir al Motassadeq (8)
                       COURT              Higher Regional Court (9)
                       TYPE               maximum sentence (10)
                       ...
prison     PRISON      INMATES            Mounir al Motassadeq (11)
                       ...
                       DURATION (exth.)   15 years (12)
found      VERDICT     CASE               terrorist attacks (13)
                       CHARGE             accessory to murder (14)
                       DEFENDANT          28-year-old Moroccan (15)
                       FINDING            guilty (16)
                       ...
accessory  ASSISTANCE  CO-AGENT           (17)
                       FOCAL_ENTITY       murder (18)
                       HELPER             28-year-old Moroccan (19)
                       ...
murder     KILLING     KILLER             (20)
                       VICTIM             more than 3000 cases (21)
                       ...


Linking Frames and Roles in Context

At the instance level

• given frame instances f1:F1 and f2:F2, where
  – f1 and f2 stand in a contextual relation (syn, sem, discourse)
  – frame types F1 and F2 stand in some frame relation

=> identify role instances (referents) of f1 and f2 (r1 (= r0) = r2)

[Legend: frame relation; context-related instances; inferred relation]
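The instance-level rule can be sketched as follows: when two frame instances are contextually related and their frame types share a role through a common super frame, a filler known for one instance is carried over to the other. The subframe links follow the Criminal process example in this deck; which roles each subframe inherits, and all names in the code, are assumptions.

```python
# Frame relations: both frames are subframes of Criminal_process.
SUBFRAME_OF = {"Trial": "Criminal_process", "Sentencing": "Criminal_process"}
# Assumed inherited roles per subframe.
INHERITED_ROLES = {"Trial": {"Court", "Defendant", "Charge"},
                   "Sentencing": {"Court"}}


def shared_roles(frame1, frame2):
    """Roles both frames inherit from a common super frame (empty if none)."""
    parent = SUBFRAME_OF.get(frame1)
    if parent is None or parent != SUBFRAME_OF.get(frame2):
        return set()
    return INHERITED_ROLES[frame1] & INHERITED_ROLES[frame2]


def link_roles(f1, f2, contextually_related):
    """Identify shared role instances: a known filler is copied to the other frame."""
    roles1, roles2 = dict(f1["roles"]), dict(f2["roles"])
    if contextually_related:
        for role in shared_roles(f1["frame"], f2["frame"]):
            a, b = roles1.get(role), roles2.get(role)
            if a is None and b is not None:
                roles1[role] = b
            elif b is None and a is not None:
                roles2[role] = a
    return {**f1, "roles": roles1}, {**f2, "roles": roles2}


trial = {"frame": "Trial", "roles": {"Case": "terrorist attacks"}}
sentencing = {"frame": "Sentencing",
              "roles": {"Court": "Higher Regional Court of Hamburg"}}
linked_trial, linked_sentencing = link_roles(trial, sentencing, True)
print(linked_trial["roles"])
```

Here the Court of the sentencing instance becomes the Court of the trial instance, which is the inferred relation r1 = r2 in the notation above.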


Linking Frames and Roles in Context

In the first trial (f1) in the world in connection with the terrorist attacks of 11 September 2001, [the Higher Regional Court of Hamburg] (r2 = r0 = r1) has passed down the maximum sentence (f2).

[Figure: Trial and Sentencing are linked to Criminal Process and its Court role (frame relation); the trial and sentence instances are linked by functional embedding (context-related instances); the Court role of both is filled by the Higher Regional Court of Hamburg (inferred relation).]

Linking Frames and Roles in Context

At the type level (more involved)

• If role instances of frame instances f1:F1 and f2:F2 are often found to be co-referent within particular contextual relations

=> Hypothesize a frame relation between F1 and F2

[Legend: (no) frame relation; context-related instances; inferred relation]
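The type-level step is essentially counting: role pairs that keep turning out co-referent become candidate frame relations. The sketch below assumes the observations have already been collected; the threshold and the data are illustrative, with the Convict/Inmates pair taken from the case study in this deck.

```python
from collections import Counter


def hypothesize_relations(observations, min_count=2):
    """Propose role bindings between frame types from co-reference counts.

    Each observation is ((frame1, role1), (frame2, role2)), recorded whenever
    the two role fillers were found co-referent in a contextual relation.
    The threshold min_count is an arbitrary illustrative choice.
    """
    counts = Counter(observations)
    return {pair for pair, n in counts.items() if n >= min_count}


observations = [
    (("Sentencing", "Convict"), ("Prison", "Inmates")),
    (("Sentencing", "Convict"), ("Prison", "Inmates")),
    (("Trial", "Court"), ("Prison", "Duration")),
]
print(hypothesize_relations(observations))
```

A realistic version would normalize the counts by how often the two frames co-occur at all, so that frequent frames do not dominate.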

Linking Frames and Roles in Context

… the Higher Regional Court of Hamburg has passed down the maximum sentence. [Mounir al Motassadeq] will spend 15 years in prison.

[Figure: no frame relation holds between Sentencing (role Convict) and Prison (role Inmates); the two frame instances are linked by a discourse relation, and the Convict and Inmates fillers are co-referent.]

• New Frame Relation

• (Role Binding: Convict = Inmates)

Frame, Contextual, and Inferred Relations

[Figure: frame graph for the case study. Frames, with the sentence number in parentheses: CRIMINAL PROCESS, SENTENCING (1), TRIAL (1), PRISON (2), VERDICT (3), ASSISTANCE (3), KILLING (3). Role labels shown: Convict, Type, Court, Case, Charge, Defendant, Finding, Inmates, Duration, Helper, Co_agent, Focal_entity, Killer, Victim. Link types: Subframe/FE, Contextual Relation, Inferred Relation.]

In the first trial .. the Higher Regional Court .. has passed down the maximum sentence. Mounir al Motassadeq will spend 15 years in prison. The 28-year-old Moroccan was found guilty as an accessory to murder in .. 3000 cases.

[Figure: the same frame graph (CRIMINAL PROCESS, SENTENCING, TRIAL, VERDICT, PRISON, ASSISTANCE, KILLING) with fillers: Defendant (the Moroccan), Inmates (Motassadeq), Duration (15Y), Goal (murder), Victim (3000), Duration (maximum), Court (Hamburg), Case (9/11), Charge (accessory); Convict, Helper, Co_agent and Killer are shown without fillers. Link types: Hierarchy/Subframe/FE, Contextual Relations, Inference.]


Statistical Semantic Role Labeling

References

Jurafsky, D. & J. H. Martin, Speech and Language Processing, Prentice-Hall, 2000. (Chapters 9 and 10)

Helmreich, S., From Syntax to Semantics, Presentation in the 74.419 Course, November 2003.

Nirenburg, S. & V. Raskin, Ontological Semantics, MIT Press, 2004.

WordNet, http://wordnet.princeton.edu/

Suggested Upper Merged Ontology (SUMO), http://www.ontologyportal.org/