Dependency Tree-to-Dependency Tree Machine Translation

November 4, 2011. Presented by: Jeffrey Flanigan (CMU), Lori Levin, Jaime Carbonell. In collaboration with: Chris Dyer, Noah Smith, Stephan Vogel.


TRANSCRIPT

Page 1: Dependency Tree-to-Dependency Tree Machine Translation

Dependency Tree-to-Dependency Tree Machine Translation

November 4, 2011
Presented by: Jeffrey Flanigan (CMU)
Lori Levin, Jaime Carbonell

In collaboration with: Chris Dyer, Noah Smith, Stephan Vogel

Page 2: Dependency Tree-to-Dependency Tree Machine Translation

Problem

Swahili: Watoto ni kusoma vitabu.
Gloss: children AUX-PRES read books
English: Children are reading books.
MT (phrase-based): Children are reading books.

Swahili: Watoto ni kusoma vitabu tatu mpya.
Gloss: children AUX-PRES read books three new
English: Children are reading three new books.
MT (phrase-based): Children are three new books.

Why? The phrase table contains Pr(reading books | kusoma vitabu) and Pr(books | kusoma vitabu), and the language model must choose among hypotheses such as "Children are three new reading books." and "Children are reading books three new."

Page 3: Dependency Tree-to-Dependency Tree Machine Translation

Problem: Grammatical Encoding Missing

Swahili: Nimeona samaki waliokula mashua.
Gloss: I-found fish who-ate boat
English: I found the fish that ate the boat.

MT system: I found that eating fish boat.

The predicate-argument structure was corrupted.

Page 4: Dependency Tree-to-Dependency Tree Machine Translation

Grammatical Relations

I found the fish that ate the boat.

[Figure: dependency tree over the sentence, with the grammatical relations SUBJ and OBJ highlighted and arcs labeled ROOT, DET, RCMOD, DOBJ, REF]

⇒ Dependency trees on source and target!

Page 5: Dependency Tree-to-Dependency Tree Machine Translation

Approach

Source Sentence → (undo grammatical encoding: parse) → Source Dependency Tree
Source Dependency Tree → (translate) → Target Dependency Tree
Target Dependency Tree → (grammatical encoding: choose surface forms, linearize) → Target Sentence

All stages are statistical.
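The three stages above can be sketched end to end as a toy pipeline. Every function body here is a stub invented for illustration (the real stages are statistical models, and the lexicon and fixed SVO order are assumptions of this sketch):

```python
def parse(sentence):
    """Toy stage 1 (undo grammatical encoding): produce a dependency tree
    as a root plus (head, relation, dependent) arcs. Real parsing is
    statistical; this stub assumes a 3-word SVO-shaped input."""
    w = sentence.split()
    return {"root": w[1], "arcs": [(w[1], "NSUBJ", w[0]), (w[1], "DOBJ", w[2])]}

def translate(tree, lexicon):
    """Toy stage 2: word-for-word transfer that keeps the tree structure.
    The real system substitutes tree-fragment pairs."""
    t = lexicon.__getitem__
    return {"root": t(tree["root"]),
            "arcs": [(t(h), r, t(d)) for h, r, d in tree["arcs"]]}

def linearize(tree):
    """Toy stage 3 (grammatical encoding): emit a fixed SVO order.
    The real system searches over projective orders with a target LM."""
    subj = next(d for _, r, d in tree["arcs"] if r == "NSUBJ")
    obj = next(d for _, r, d in tree["arcs"] if r == "DOBJ")
    return " ".join([subj, tree["root"], obj])

# Hypothetical Kinyarwanda-English lexicon for the running example.
lexicon = {"Abaana": "children", "barasoma": "read", "ibitabo": "books"}
print(linearize(translate(parse("Abaana barasoma ibitabo"), lexicon)))  # children read books
```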

Page 6: Dependency Tree-to-Dependency Tree Machine Translation

Extracting the rules: extract all consistent tree-fragment pairs

[Figure: word-aligned dependency trees for English "Children are reading three new books" (arcs NSUBJ, AUX, DOBJ, NUM, AMOD) and Kinyarwanda "Abaana barasoma ibitabo bitatu bishya" (arcs NSUBJ, DOBJ, NUM, AMOD)]

Example extracted pairs (source side ↔ target side):
- Children ↔ Abaana
- "are reading" with NSUBJ [1], AUX "are", DOBJ [2] ↔ barasoma with NSUBJ [1], DOBJ [2]
- "are reading books" with NSUBJ [1], AUX "are", DOBJ "books" ↔ "barasoma ibitabo" with NSUBJ [1], DOBJ "ibitabo"
- [1] with NUM "three", AMOD "new" ↔ [1] with NUM "bitatu", AMOD "bishya"
- Children with NUM [1] ↔ Abaana with NUM [1]
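The "consistent" condition can be illustrated with the alignment-consistency check familiar from phrase extraction, lifted to sets of tree nodes: no alignment link may cross the fragment boundary. The word indices, alignment, and helper name below are assumptions of this sketch, not the paper's notation:

```python
def is_consistent(src_nodes, tgt_nodes, alignment):
    """Alignment consistency for a candidate fragment pair: every link
    touching the source side must land inside the target side and vice
    versa, and at least one link must fall inside the pair."""
    for s, t in alignment:
        if (s in src_nodes) != (t in tgt_nodes):
            return False
    return any(s in src_nodes and t in tgt_nodes for s, t in alignment)

# Assumed word indices for the running example:
# English:     Children(0) are(1) reading(2) three(3) new(4) books(5)
# Kinyarwanda: Abaana(0) barasoma(1) ibitabo(2) bitatu(3) bishya(4)
alignment = {(0, 0), (1, 1), (2, 1), (3, 3), (4, 4), (5, 2)}

print(is_consistent({3, 4, 5}, {2, 3, 4}, alignment))  # True: "three new books" / "ibitabo bitatu bishya"
print(is_consistent({1, 2}, {0, 1}, alignment))        # False: the Children-Abaana link crosses the boundary
```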

Page 7: Dependency Tree-to-Dependency Tree Machine Translation

Translating

An extension of phrase-based SMT:
• Linear strings → dependency trees
• Phrase pairs → tree-fragment pairs
• Language model → dependency language model

Search is top-down on the target side, using a beam-search decoder.
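A minimal sketch of what a dependency language model estimates: Pr(word | head, relation), here by relative frequency over (head, relation, word) training triples. The class, the toy triples, and the unsmoothed estimate are all assumptions of this sketch; a real model would be smoothed and conditioned on richer context:

```python
import math
from collections import defaultdict

class DepLM:
    """Toy dependency LM: relative-frequency estimate of
    Pr(word | head, relation) from (head, relation, word) triples."""
    def __init__(self, triples):
        self.counts = defaultdict(int)    # (head, rel, word) -> count
        self.context = defaultdict(int)   # (head, rel) -> count
        for head, rel, word in triples:
            self.counts[(head, rel, word)] += 1
            self.context[(head, rel)] += 1

    def logprob(self, head, rel, word):
        c = self.counts[(head, rel, word)]
        return math.log(c / self.context[(head, rel)]) if c else float("-inf")

lm = DepLM([("reading", "DOBJ", "books"),
            ("reading", "DOBJ", "books"),
            ("reading", "DOBJ", "novels"),
            ("reading", "AUX", "are")])
print(lm.logprob("reading", "DOBJ", "books"))  # log(2/3)
```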

Page 8: Dependency Tree-to-Dependency Tree Machine Translation

Translation Example

Input: the Kinyarwanda dependency tree for "umwaana arasoma ibitabo" — arasoma with NSUBJ umwaana [3] and DOBJ ibitabo [4]. Target sentence being built: "The child is reading books".

Inventory of rules:
- umwaana ↔ child, P(e|f) = .5
- arasoma with NSUBJ [1], DOBJ [2] ↔ "is reading" with AUX "is", NSUBJ [1], DOBJ [2], P(e|f) = .8
- ibitabo ↔ books, P(e|f) = .7
- umwaana ↔ "the child" with DET "the", P(e|f) = .1
- umwaana ↔ "a child" with DET "a"

Page 9: Dependency Tree-to-Dependency Tree Machine Translation

Translation Example

Partial target tree: "is reading" with AUX "is" and open slots [3] (NSUBJ) and [4] (DOBJ).

Score = w1·ln(.5) + w2·ln(Pr(reading | ROOT)) + w2·ln(Pr(is | reading, AUX))

The language model is over the target dependency tree.

Page 10: Dependency Tree-to-Dependency Tree Machine Translation

Translation Example

Partial target tree: "is reading books" with AUX "is", DOBJ "books", and open slot [3] (NSUBJ).

Score = w1·ln(.5) + w1·ln(.7) + w2·ln(Pr(reading | ROOT)) + w2·ln(Pr(is | reading, AUX)) + w2·ln(Pr(books | reading, DOBJ))

Page 11: Dependency Tree-to-Dependency Tree Machine Translation

Translation Example

Final target tree: "the child is reading books" — root "reading" with AUX "is", NSUBJ "child" (with DET "the"), and DOBJ "books".

Score(Translation) = w1·ln(.5) + w1·ln(.7) + w1·ln(.8) + w2·ln(Pr(reading | ROOT)) + w2·ln(Pr(is | reading, AUX)) + w2·ln(Pr(books | reading, DOBJ)) + w2·ln(Pr(child | reading, NSUBJ)) + w2·ln(Pr(the | (child, DET), (reading, ROOT)))
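The log-linear score built up across pages 8-11 can be computed directly: a weighted sum of log rule probabilities and log dependency-LM probabilities. The rule probabilities (.5, .7, .8) come from the worked example; the dependency-LM values below are placeholders, since the slides leave those factors symbolic:

```python
import math

def score(rule_probs, deplm_probs, w1, w2):
    """Log-linear model score: w1 times the summed log rule translation
    probabilities plus w2 times the summed log dependency-LM probabilities."""
    return (w1 * sum(math.log(p) for p in rule_probs)
            + w2 * sum(math.log(p) for p in deplm_probs))

rules = [0.5, 0.7, 0.8]                  # the three rules applied in the example
deplm = [0.3, 0.9, 0.6, 0.4, 0.2]        # placeholder values for Pr(reading|ROOT), Pr(is|reading,AUX), ...
print(score(rules, deplm, w1=1.0, w2=0.5))
```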

Page 12: Dependency Tree-to-Dependency Tree Machine Translation

Linearization

• Generate projective trees
• A* search, left to right, with the target LM
• Admissible heuristic: highest-scoring completion without the LM

[Figure: dependency tree — "is" with NSUBJ "He" and COP "strong"; "strong" with ADVMOD "enough"]

Page 13: Dependency Tree-to-Dependency Tree Machine Translation

Linearization

Partial hypothesis: He is enough strong

Score = Pr(He | START) · Pr(<NSUBJ,HEAD,COP> | is) · Pr(<HEAD,ADVMOD> | strong)

Page 14: Dependency Tree-to-Dependency Tree Machine Translation

Linearization

Partial hypothesis: He is enough strong

Score = Pr(He | START) · Pr(is | He, START) · Pr(<NSUBJ,HEAD,COP> | is) · Pr(<HEAD,ADVMOD> | strong)

Page 15: Dependency Tree-to-Dependency Tree Machine Translation

Linearization

Partial hypothesis: He is strong enough

Score = Pr(He | START) · Pr(is | He, START) · Pr(strong | He, is) · Pr(<NSUBJ,HEAD,COP> | is) · Pr(<HEAD,ADVMOD> | strong)

Page 16: Dependency Tree-to-Dependency Tree Machine Translation

Linearization

Complete: He is strong enough

Score = Pr(He) · Pr(is | He) · Pr(strong | He, is) · Pr(enough | strong, is) · Pr(<NSUBJ,HEAD,COP> | is) · Pr(<HEAD,ADVMOD> | strong)
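A brute-force stand-in for the linearizer above: score every order of a head and its dependents by an order model over the <...,HEAD,...> relation pattern times a bigram LM, and keep the best. The real system uses A* with the no-LM completion as an admissible heuristic; the model values and helper names here are invented for the sketch:

```python
import itertools
import math

def best_order(head, deps, order_prob, lm_prob):
    """Toy exhaustive linearizer: enumerate all orders of head + dependents,
    score each as log order_prob(relation pattern) + summed log bigram LM,
    and return the highest-scoring word sequence."""
    words = [head] + [w for w, _ in deps]
    rel = {w: r for w, r in deps}
    best, best_score = None, float("-inf")
    for perm in itertools.permutations(words):
        pattern = tuple(rel.get(w, "HEAD") for w in perm)
        s = math.log(order_prob(pattern))
        prev = "<s>"
        for w in perm:
            s += math.log(lm_prob(prev, w))
            prev = w
        if s > best_score:
            best, best_score = list(perm), s
    return best

def order_prob(pattern):
    # invented order model: prefer subject, copula, head, adverb
    return 0.6 if pattern == ("NSUBJ", "COP", "HEAD", "ADVMOD") else 0.1

def lm_prob(prev, word):
    # invented bigram table; unseen bigrams get a small floor
    table = {("<s>", "He"): 0.5, ("He", "is"): 0.6,
             ("is", "strong"): 0.4, ("strong", "enough"): 0.7}
    return table.get((prev, word), 0.01)

print(best_order("strong", [("He", "NSUBJ"), ("is", "COP"), ("enough", "ADVMOD")],
                 order_prob, lm_prob))  # ['He', 'is', 'strong', 'enough']
```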

Page 17: Dependency Tree-to-Dependency Tree Machine Translation

Comparison to Major Approaches

Approach | Similarities | Differences
Old-style analyze-transfer-generate | Separate analysis, transfer, and generation models | Statistical; rules are learned
Synchronous CFGs [Chiang 2005] [Zollmann et al. 2006] | Model of grammatical encoding | Allows adjunction and head switching
Tree transducers [Graehl & Knight 2004] | Model of grammatical encoding | Different decoding
Quasi-synchronous grammars [Gimpel & Smith 2009] | Dependency trees on source and target | Different rules and decoding
Synchronous tree-insertion grammars [DeNeefe & Knight 2009] | Allows adjuncts | Allows head switching
Dependency treelets [Quirk et al. 2005] [Shen et al. 2008] | Dependency trees on source and target | Word order not in rules; separate linearization procedure
String-to-dependency MT [Shen et al. 2008] | Target dependency language model | Dependency trees on both source and target
Dependency tree to dependency tree (JHU Summer Workshop 2002) [Čmejrek et al. 2003] [Eisner 2003] | Dependency trees on source and target; linearization step | Different rule learning and decoding procedures

Page 18: Dependency Tree-to-Dependency Tree Machine Translation

Conclusion
• Separates translation from reordering
• Dependency trees capture grammatical relations
• Phrase-based MT can be extended to dependency trees
• Complements ISI's approach nicely

Work in progress!

Page 19: Dependency Tree-to-Dependency Tree Machine Translation


Backup Slides

Page 20: Dependency Tree-to-Dependency Tree Machine Translation

Allowable Rules
• Nodes consistent with alignments
• All variables aligned
• Nodes ∪ variables ∪ arcs ∪ alignments = connected graph

Optional Constraints
• Nodes on source side connected
• Nodes on target side connected
• Nodes on source and target connected

Decoding Constraint
• Target tree connected
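The connectivity requirement (nodes ∪ variables ∪ arcs ∪ alignments form one connected graph) can be checked with a plain graph traversal. The node names and the example fragment below are illustrative assumptions:

```python
from collections import defaultdict

def is_connected(nodes, edges):
    """Check the rule constraint: treating tree arcs and alignment links
    as undirected edges over the rule's nodes and variables, a BFS from
    any node must reach every node."""
    nodes = set(nodes)
    if not nodes:
        return True
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, stack = set(), [next(iter(nodes))]
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(adj[n] - seen)
    return nodes <= seen

# Fragment for "three new" / "bitatu bishya": the shared variable [1]
# (the head slot) plus tree arcs and alignment links tie everything together.
nodes = {"[1]", "three", "new", "bitatu", "bishya"}
edges = [("[1]", "three"), ("[1]", "new"),        # source arcs
         ("[1]", "bitatu"), ("[1]", "bishya"),    # target arcs
         ("three", "bitatu"), ("new", "bishya")]  # alignment links
print(is_connected(nodes, edges))  # True
```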

Page 21: Dependency Tree-to-Dependency Tree Machine Translation

Head-Switching Example

[Figure: word-aligned dependency trees for French "Le bébé vient de tomber" (vient with NSUBJ bébé, DET "Le" under bébé, and PREP "de" with POBJ tomber) and English "The child just fell" (fell with NSUBJ child, DET "The" under child, and ADVMOD just)]

Example extracted pairs (head switching: the French head "vient" corresponds to the English adverb "just", while its embedded verb corresponds to the English head):
- vient with NSUBJ [1], PREP "de", POBJ [2] ↔ [2] with NSUBJ [1], ADVMOD "just"
- [2] with NSUBJ [1], PREP "de", POBJ "tomber" ↔ fell with NSUBJ [1], ADVMOD [2]

Page 22: Dependency Tree-to-Dependency Tree Machine Translation

Moving Up the Triangle

• Propositional semantic dependencies
• Deep syntactic dependencies
• Surface syntactic dependencies

Page 23: Dependency Tree-to-Dependency Tree Machine Translation

Comparison to Synchronous Phrase-Structure Rules

• Training data:
  Kinyarwanda: Abaana baasoma igitabo gishya kyose cyaa Karooli.
  English: The children are reading all of Charles 's new book.

• Test sentence:
  Kinyarwanda: Abaana baasoma igitabo cyaa Karooli gishya kyose.

• Synchronous decoders (SAMT, Hiero, etc.) produce:
  The children are reading book 's Charles new all of.
  The children are reading book Charles 's all of new.

Problem: the grammatical encoding is tied to word order.