lecture 4: machine translation and ibm model 1...

105
1 Lecture 4: Machine Translation and IBM Model 1 (Learning) Intro to NLP, CS585, Fall 2014 http://people.cs.umass.edu/~brenocon/inlp2014/ Brendan O’Connor (http://brenocon.com ) Thursday, September 11, 14

Upload: others

Post on 01-Apr-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

1

Lecture 4:Machine Translation andIBM Model 1 (Learning)

Intro to NLP, CS585, Fall 2014http://people.cs.umass.edu/~brenocon/inlp2014/

Brendan O’Connor (http://brenocon.com)

Thursday, September 11, 14

Page 2: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

• Homework questions?

• Today

• Quick LM review

• Machine translation -- learning

• Next week

• Machine translation -- decoding, broader issues

2

Thursday, September 11, 14

Page 3: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Interpolation LM [from last time]

3

be the MLE probability distributionP̂Let

P (w|w�1, w�2) = �2P̂ (w|w�1, w�2) + �1P̂ (w|w�1) + �0P̂ (w)

Interpolation estimate is:

�0 + �1 + �2 = 1Mixing weights:

Sparse Denser Dense

Thursday, September 11, 14

Page 4: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

= �2P̂ (w|blacke as) + �1P̂ (w|as) + �0P̂ (w)

P (w|blacke as) Hamlet from NLTK, V=4812http://people.cs.umass.edu/~brenocon/inlp2014/lectures/04-hamlet_ngrams.txt

Thursday, September 11, 14

Page 5: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

= �2P̂ (w|blacke as) + �1P̂ (w|as) + �0P̂ (w)

3 blacke as1 blacke as death1 blacke as hell1 blacke as his

3 non-zeros

P (w|blacke as) Hamlet from NLTK, V=4812http://people.cs.umass.edu/~brenocon/inlp2014/lectures/04-hamlet_ngrams.txt

Thursday, September 11, 14

Page 6: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

= �2P̂ (w|blacke as) + �1P̂ (w|as) + �0P̂ (w)

3 blacke as1 blacke as death1 blacke as hell1 blacke as his

3 non-zeros

205 as1 as 'twer6 as 'twere3 as a1 as actiuely1 as against1 as all1 as an2 as any1 as are1 as bad2 as before3 as by1 as chast1 as checking1 as common1 as cunning1 as damn1 as day2 as death1 as deepe1 as dicers1 as doth1 as easie1 as england1 as ere1 as false1 as farre1 as fits1 as foule1 as fresh2 as from1 as gaming1 as girdle1 as had1 as hamlet1 as hardy

7 as he1 as healthfull1 as hell1 as her3 as his1 as how1 as hush18 as i1 as ice6 as if1 as in8 as it1 as iust2 as kill1 as kinde1 as leuell1 as liue1 as loue1 as low1 as lying1 as mad1 as made1 as make1 as many2 as may1 as meditation1 as mine1 as mortall1 as most3 as much2 as my1 as needfull2 as of1 as oft2 as one2 as our1 as patient

1 as peace1 as proper1 as pure1 as quick-siluer1 as reason1 as sharpe1 as sinewes1 as sinnes3 as snow1 as so1 as some1 as swift1 as th'art13 as the1 as therein2 as they3 as this3 as thou1 as thus1 as thy5 as to1 as vnuallued1 as vulcans1 as watchmen1 as waxe4 as we2 as well1 as white2 as will1 as womans1 as would12 as you2 as your1 as yours

107 non-zeros

P (w|blacke as) Hamlet from NLTK, V=4812http://people.cs.umass.edu/~brenocon/inlp2014/lectures/04-hamlet_ngrams.txt

Thursday, September 11, 14

Page 7: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

= �2P̂ (w|blacke as) + �1P̂ (w|as) + �0P̂ (w)

3 blacke as1 blacke as death1 blacke as hell1 blacke as his

3 non-zeros

205 as1 as 'twer6 as 'twere3 as a1 as actiuely1 as against1 as all1 as an2 as any1 as are1 as bad2 as before3 as by1 as chast1 as checking1 as common1 as cunning1 as damn1 as day2 as death1 as deepe1 as dicers1 as doth1 as easie1 as england1 as ere1 as false1 as farre1 as fits1 as foule1 as fresh2 as from1 as gaming1 as girdle1 as had1 as hamlet1 as hardy

7 as he1 as healthfull1 as hell1 as her3 as his1 as how1 as hush18 as i1 as ice6 as if1 as in8 as it1 as iust2 as kill1 as kinde1 as leuell1 as liue1 as loue1 as low1 as lying1 as mad1 as made1 as make1 as many2 as may1 as meditation1 as mine1 as mortall1 as most3 as much2 as my1 as needfull2 as of1 as oft2 as one2 as our1 as patient

1 as peace1 as proper1 as pure1 as quick-siluer1 as reason1 as sharpe1 as sinewes1 as sinnes3 as snow1 as so1 as some1 as swift1 as th'art13 as the1 as therein2 as they3 as this3 as thou1 as thus1 as thy5 as to1 as vnuallued1 as vulcans1 as watchmen1 as waxe4 as we2 as well1 as white2 as will1 as womans1 as would12 as you2 as your1 as yours

107 non-zeros

17 !25 &61 '200 'd2 'em5 'gainst3 'm14 're119 's61 't1 'tane7 'tis1 'twas1 'tweene1 'twer10 'twere4 'twill1 'twixt45 (44 )2892 ,1879 .3 1.play1 1.player1 1599566 :298 ;459 ?6 [6 ]497 a2 a'th1 a-crosse1 a-downe1 a-downe-a1 a-dreames1 a-foot1 a-sleepe3 a-while1 a-worke1 abhominably1 abhorred1 abilitie3 aboord4 aboue19 about1 abridgements1 abroad1 absent1 absolute1 abstinence1 abstracts2 absurd1 abus1 abuse1 abuses2 accent1 accepts1 accesse3 accident1 accidentall1 accord2 according1 account1 accounted1 accurst1 accuse1 acquaint1 acquire1 acquittance9 act4 acte1 acted1 acting10 action1 actions1 actiuely3 actor2 actors2 acts2 actus1 adaies1 adam1 adams1 addicted1 addition1 addresse1 adheres1 adieu1 adiew1 adioyn1 admirable3 admiration2 admit1 admittance

1 adoption1 adores1 aduanc2 aduancement1 aduantage5 adue1 aduenturous3 aduice1 aduise1 adulterate1 aeneas1 afarre3 affaire1 affaires1 affayre1 affear1 affectation4 affection1 affections1 afflict3 affliction1 affraide1 affrighted1 affront13 after1 afternoone1 afterwards34 againe20 against8 age1 ago1 agreeing2 ah2 aint1 ake1 akers1 al3 alacke1 alarme1 alarum9 alas5 alexander1 alijs109 all1 allegeance1 allies1 allow2 allowance2 allowed7 almost10 alone3 along2 aloofe4 already1 altitude1 altogether2 alwayes52 am1 amaz1 amaze2 amazement1 amb1 ambass1 ambassador3 ambassadors1 ambassadours1 amber1 ambiguous5 ambition3 ambitious1 amble1 amen1 amis1 amisse1 amities1 among53 an1 ancient1 anckle862 and1 and't1 angel2 angell3 angels1 anger

1 angle1 angry1 animals1 annexment1 annoint1 annuall6 anon8 another1 anothers9 answer4 answere1 answered1 answerest1 answers1 anticipation2 anticke1 antike1 antiquity17 any1 apale2 apart2 ape1 apparell2 apparition2 appear1 appear'd3 appeare3 appeares1 appetite1 applaud1 appliance1 appointment2 apprehension1 approue1 appurtenance2 apt1 ar't1 ardure121 are3 argall1 argues3 argument4 arm5 arme9 armes1 armie2 armour1 armours1 arraigne2 arrant3 arras1 arrest1 arrests1 arriued1 arrow2 arrowes16 art1 article1 articles1 artire1 artlesse205 as1 asham1 aside1 ask't2 aske2 asking1 aslant1 asleepe1 aspect1 assaid1 assaies1 assaile1 assault3 assay3 asse1 asse-2 assignes1 assis2 assistant1 associates3 assume1 assumes1 assur1 assur'd2 assurance1 assure1 astonish1 asunder81 at4 attendant2 attendants1 attended1 attends1 attent

1 attractiue5 audience1 audit1 augury1 aunt1 auoid1 auouch2 auoyd1 auspicious2 author1 authorities2 awake27 away2 awe4 awhile2 axe1 ayde2 aye1 aygre1 ayme12 ayre1 ayres1 ayrie1 ayry1 babe2 baby1 bace2 back6 backe1 backward7 bad1 bait2 bak1 bakers1 bakt-meats1 ban1 bands1 banke3 bap1 bapt1 baptista7 bar1 barbars2 barbary2 bare1 bare-foot1 barke10 barn8 barnardo1 barr'd1 barre1 barren4 base1 basenesse2 baser2 basket1 bastard1 bat1 bated1 battalians1 batten1 battery1 battlements1 baudry1 bawd1 bawdy191 be1 be-ratled1 beame1 bear't6 beard1 beards13 beare1 bearers2 beares1 bearing5 beast2 beasts1 beaten3 beating1 beats1 beauer1 beauteous4 beautie1 beautied1 beauties1 beautifed1 beautified3 beauty1 became1 because1 becke1 beckens1 beckons2 becomes11 bed1 bedded1 bedrid16 bee3 been13 beene1 beer

4812 non-zeros

P (w|blacke as) Hamlet from NLTK, V=4812http://people.cs.umass.edu/~brenocon/inlp2014/lectures/04-hamlet_ngrams.txt

Thursday, September 11, 14

Page 8: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

= �2P̂ (w|blacke as) + �1P̂ (w|as) + �0P̂ (w)

3 blacke as1 blacke as death1 blacke as hell1 blacke as his

3 non-zeros

205 as1 as 'twer6 as 'twere3 as a1 as actiuely1 as against1 as all1 as an2 as any1 as are1 as bad2 as before3 as by1 as chast1 as checking1 as common1 as cunning1 as damn1 as day2 as death1 as deepe1 as dicers1 as doth1 as easie1 as england1 as ere1 as false1 as farre1 as fits1 as foule1 as fresh2 as from1 as gaming1 as girdle1 as had1 as hamlet1 as hardy

7 as he1 as healthfull1 as hell1 as her3 as his1 as how1 as hush18 as i1 as ice6 as if1 as in8 as it1 as iust2 as kill1 as kinde1 as leuell1 as liue1 as loue1 as low1 as lying1 as mad1 as made1 as make1 as many2 as may1 as meditation1 as mine1 as mortall1 as most3 as much2 as my1 as needfull2 as of1 as oft2 as one2 as our1 as patient

1 as peace1 as proper1 as pure1 as quick-siluer1 as reason1 as sharpe1 as sinewes1 as sinnes3 as snow1 as so1 as some1 as swift1 as th'art13 as the1 as therein2 as they3 as this3 as thou1 as thus1 as thy5 as to1 as vnuallued1 as vulcans1 as watchmen1 as waxe4 as we2 as well1 as white2 as will1 as womans1 as would12 as you2 as your1 as yours

107 non-zeros

17 !25 &61 '200 'd2 'em5 'gainst3 'm14 're119 's61 't1 'tane7 'tis1 'twas1 'tweene1 'twer10 'twere4 'twill1 'twixt45 (44 )2892 ,1879 .3 1.play1 1.player1 1599566 :298 ;459 ?6 [6 ]497 a2 a'th1 a-crosse1 a-downe1 a-downe-a1 a-dreames1 a-foot1 a-sleepe3 a-while1 a-worke1 abhominably1 abhorred1 abilitie3 aboord4 aboue19 about1 abridgements1 abroad1 absent1 absolute1 abstinence1 abstracts2 absurd1 abus1 abuse1 abuses2 accent1 accepts1 accesse3 accident1 accidentall1 accord2 according1 account1 accounted1 accurst1 accuse1 acquaint1 acquire1 acquittance9 act4 acte1 acted1 acting10 action1 actions1 actiuely3 actor2 actors2 acts2 actus1 adaies1 adam1 adams1 addicted1 addition1 addresse1 adheres1 adieu1 adiew1 adioyn1 admirable3 admiration2 admit1 admittance

1 adoption1 adores1 aduanc2 aduancement1 aduantage5 adue1 aduenturous3 aduice1 aduise1 adulterate1 aeneas1 afarre3 affaire1 affaires1 affayre1 affear1 affectation4 affection1 affections1 afflict3 affliction1 affraide1 affrighted1 affront13 after1 afternoone1 afterwards34 againe20 against8 age1 ago1 agreeing2 ah2 aint1 ake1 akers1 al3 alacke1 alarme1 alarum9 alas5 alexander1 alijs109 all1 allegeance1 allies1 allow2 allowance2 allowed7 almost10 alone3 along2 aloofe4 already1 altitude1 altogether2 alwayes52 am1 amaz1 amaze2 amazement1 amb1 ambass1 ambassador3 ambassadors1 ambassadours1 amber1 ambiguous5 ambition3 ambitious1 amble1 amen1 amis1 amisse1 amities1 among53 an1 ancient1 anckle862 and1 and't1 angel2 angell3 angels1 anger

1 angle1 angry1 animals1 annexment1 annoint1 annuall6 anon8 another1 anothers9 answer4 answere1 answered1 answerest1 answers1 anticipation2 anticke1 antike1 antiquity17 any1 apale2 apart2 ape1 apparell2 apparition2 appear1 appear'd3 appeare3 appeares1 appetite1 applaud1 appliance1 appointment2 apprehension1 approue1 appurtenance2 apt1 ar't1 ardure121 are3 argall1 argues3 argument4 arm5 arme9 armes1 armie2 armour1 armours1 arraigne2 arrant3 arras1 arrest1 arrests1 arriued1 arrow2 arrowes16 art1 article1 articles1 artire1 artlesse205 as1 asham1 aside1 ask't2 aske2 asking1 aslant1 asleepe1 aspect1 assaid1 assaies1 assaile1 assault3 assay3 asse1 asse-2 assignes1 assis2 assistant1 associates3 assume1 assumes1 assur1 assur'd2 assurance1 assure1 astonish1 asunder81 at4 attendant2 attendants1 attended1 attends1 attent

1 attractiue5 audience1 audit1 augury1 aunt1 auoid1 auouch2 auoyd1 auspicious2 author1 authorities2 awake27 away2 awe4 awhile2 axe1 ayde2 aye1 aygre1 ayme12 ayre1 ayres1 ayrie1 ayry1 babe2 baby1 bace2 back6 backe1 backward7 bad1 bait2 bak1 bakers1 bakt-meats1 ban1 bands1 banke3 bap1 bapt1 baptista7 bar1 barbars2 barbary2 bare1 bare-foot1 barke10 barn8 barnardo1 barr'd1 barre1 barren4 base1 basenesse2 baser2 basket1 bastard1 bat1 bated1 battalians1 batten1 battery1 battlements1 baudry1 bawd1 bawdy191 be1 be-ratled1 beame1 bear't6 beard1 beards13 beare1 bearers2 beares1 bearing5 beast2 beasts1 beaten3 beating1 beats1 beauer1 beauteous4 beautie1 beautied1 beauties1 beautifed1 beautified3 beauty1 became1 because1 becke1 beckens1 beckons2 becomes11 bed1 bedded1 bedrid16 bee3 been13 beene1 beer

4812 non-zeros

• Generative view: • Choose history size by prob (λ2,λ1,λ0,).

• Generate w by MLE of that history size.

• Size 1 or 0 are “backoff” choices.

• Prediction time: marginalize (average) over history size selection.

• Many other methods have different ways of combining different sized histories.

• Sharing statistical strength: “X as”

• Extension: common prefixes require less smoothing. “of the _” has 59 non-zeros, so “the _” less important.

• To think about: shouldn’t contexts share statistical strength beyond suffix matching?

P (w|blacke as) Hamlet from NLTK, V=4812http://people.cs.umass.edu/~brenocon/inlp2014/lectures/04-hamlet_ngrams.txt

Thursday, September 11, 14

Page 9: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

MT as Noisy Channel

6

ForeignEnglish

Convention in Collins/Knight: translating from f into e

Inference task

Thursday, September 11, 14

Page 10: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

MT as Noisy Channel

7

EnglishForeign

Convention today (sorry!): translating from e into f

Inference task

Early 90’s, IBM ResearchThursday, September 11, 14

Page 11: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

MT as Noisy Channel

7

EnglishForeign

Convention today (sorry!): translating from e into f

Inference task

ForeignLM

Hypothesized LM:LM generates F

P(f)Hypothesized

translation model:F generates E

P(e | f)

Early 90’s, IBM ResearchThursday, September 11, 14

Page 12: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

MT as Noisy Channel

7

EnglishForeign

Convention today (sorry!): translating from e into f

Inference task

ForeignLM

Hypothesized LM:LM generates F

P(f)Hypothesized

translation model:F generates E

P(e | f)

P(f | e) P(e | f) P(f) / P(e)=

Early 90’s, IBM ResearchThursday, September 11, 14

Page 13: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

MT as Noisy Channel

7

EnglishForeign

Convention today (sorry!): translating from e into f

Inference task

ForeignLM

Hypothesized LM:LM generates F

P(f)Hypothesized

translation model:F generates E

P(e | f)

P(f | e) P(e | f) P(f)argmaxf

argmaxf=Inference task

(“Decoding”)

P(f | e) P(e | f) P(f) / P(e)=

Early 90’s, IBM ResearchThursday, September 11, 14

Page 14: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

MT as Noisy Channel

7

EnglishForeign

Convention today (sorry!): translating from e into f

Inference task

ForeignLM

Hypothesized LM:LM generates F

P(f)Hypothesized

translation model:F generates E

P(e | f)

P(f | e) P(e | f) P(f)argmaxf

argmaxf=Inference task

(“Decoding”)

Learning task P(f): from large monolingual corpus

P(e | f) today:lexical translation models

P(f | e) P(e | f) P(f) / P(e)=

Early 90’s, IBM ResearchThursday, September 11, 14

Page 15: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Data to learn statistical MT

• Statistical machine translation is almost entirely based on parallel corpora, a.k.a. bitexts.

• Training data: human-translated sentence pairs (e, f)

8

Thursday, September 11, 14

Page 16: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Lexical Translation

• How do we translate a word? Look it up in the dictionary

• Multiple translations

• Different word senses, different registers, different inflections (?)

• house, home are common

• shell is specialized (the Haus of a snail is a shell)

Haus : house, home, shell, household

Thursday, January 24, 13Thursday, September 11, 14

Page 17: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

How common is each?Translation Count

house 5000

home 2000

shell 100

household 80

Thursday, January 24, 13Thursday, September 11, 14

Page 18: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

MLE

p̂MLE(e | Haus) =

8>>>>>><

>>>>>>:

0.696 if e = house

0.279 if e = home

0.014 if e = shell

0.011 if e = household

0 otherwise

Thursday, January 24, 13Thursday, September 11, 14

Page 19: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Lexical Translation• Goal: a model

• where and are complete English and Foreign sentences

• Lexical translation makes the following assumptions:

• Each word in in is generated from exactly one word in

• Thus, we have an alignment that indicates which word “came from”, specifically it came from .

• Given the alignments , translation decisions are conditionally independent of each other and depend only on the aligned source word .

p(e | f,m)

e f

eeif

aiei fai

a

e = he1, e2, . . . , emi f = hf1, f2, . . . , fni

Thursday, January 24, 13Thursday, September 11, 14

Page 20: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Lexical Translation• Goal: a model

• where and are complete English and Foreign sentences

• Lexical translation makes the following assumptions:

• Each word in in is generated from exactly one word in

• Thus, we have an alignment that indicates which word “came from”, specifically it came from .

• Given the alignments , translation decisions are conditionally independent of each other and depend only on the aligned source word .

p(e | f,m)

e f

eeif

aiei fai

a

fai

Thursday, January 24, 13Thursday, September 11, 14

Page 21: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

e = he1, e2, ...emi f = hf1, f2, ...fnia = ha1, a2, ...ami each ai 2 {0, 1, ..., n}

=

p(e | f ,m) =X

a2{0,1,..,n}m

p(a | f ,m)⇥ p(e | a, f ,m)

p(e | f ,m)

Lexical Translation

• Putting our assumptions together, we have:

Alignment Translation | Alignment⇥

p(e | f,m) =X

a2[0,n]m

p(a | f,m)⇥mY

i=1

p(ei | fai)

Thursday, January 24, 13

X

a2{0,1,..,n}m

p(a | f ,m)⇥mY

i=1

p(ei | fai)

[Alignment] x [Translation | Alignment]

Modeling assumptions

Chain rule

Thursday, September 11, 14

Page 22: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Lexical Translation

p(ei | fai)

Thursday, January 24, 13Thursday, September 11, 14

Page 23: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Lexical Translation

p(ei | fai)

p(house | Haus)

Thursday, January 24, 13Thursday, September 11, 14

Page 24: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Lexical Translation

p(ei | fai)

p(house | Haus) p(shell | Haus)

Thursday, January 24, 13Thursday, September 11, 14

Page 25: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

e = he1, e2, ...emi f = hf1, f2, ...fnia = ha1, a2, ...ami each ai 2 {0, 1, ..., n}

Lexical Translation

• Putting our assumptions together, we have:

Alignment Translation | Alignment⇥

p(e | f,m) =X

a2[0,n]m

p(a | f,m)⇥mY

i=1

p(ei | fai)

Thursday, January 24, 13

=p(e | f ,m)X

a2{0,1,..,n}m

p(a | f ,m)⇥mY

i=1

p(ei | fai)

[Alignment] x [Translation | Alignment]

Modeling assumptions

Thursday, September 11, 14

Page 26: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Alignment

p(a | f,m)Most of the action for the first 10 yearsof MT was here. Words weren’t the problem,word order was hard.

Thursday, January 24, 13Thursday, September 11, 14

Page 27: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Alignment• Alignments can be visualized in by drawing

links between two sentences, and they are represented as vectors of positions:

a = (1, 2, 3, 4)>

Thursday, January 24, 13Thursday, September 11, 14

Page 28: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Reordering• Words may be reordered during

translation.

a = (3, 4, 2, 1)>

Thursday, January 24, 13Thursday, September 11, 14

Page 29: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Word Dropping

• A source word may not be translated at all

a = (2, 3, 4)>

Thursday, January 24, 13Thursday, September 11, 14

Page 30: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Word Insertion• Words may be inserted during translation

English just does not have an equivalent

But it must be explained - we typically assumeevery source sentence contains a NULL token

a = (1, 2, 3, 0, 4)>

Thursday, January 24, 13Thursday, September 11, 14

Page 31: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

One-to-many Translation

• A source word may translate into more than one target word

a = (1, 2, 3, 4, 4)>

Thursday, January 24, 13Thursday, September 11, 14

Page 32: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Many-to-one Translation

• More than one source word may not translate as a unit in lexical translation

das Haus brach zusammen

the house collapsed

1 2 3 4

1 2 3

a =???

Thursday, January 24, 13Thursday, September 11, 14

Page 33: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Many-to-one Translation

• More than one source word may not translate as a unit in lexical translation

das Haus brach zusammen

the house collapsed

1 2 3 4

1 2 3

a =??? a = (1, 2, (3, 4)>)> ?

Thursday, January 24, 13Thursday, September 11, 14

Page 34: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

IBM Model 1• Simplest possible lexical translation model

• Additional assumptions

• The m alignment decisions are independent

• The alignment distribution for each is uniform over all source words and NULL

ai

for each i 2 [1, 2, . . . ,m]

ai ⇠ Uniform(0, 1, 2, . . . , n)

ei ⇠ Categorical(✓fai)

Thursday, January 24, 13Thursday, September 11, 14

Page 35: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

IBM Model 1for each i 2 [1, 2, . . . ,m]

ai ⇠ Uniform(0, 1, 2, . . . , n)

ei ⇠ Categorical(✓fai)

mY

i=1

p(e,a | f,m) =

Thursday, January 24, 13Thursday, September 11, 14

Page 36: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

IBM Model 1for each i 2 [1, 2, . . . ,m]

ai ⇠ Uniform(0, 1, 2, . . . , n)

ei ⇠ Categorical(✓fai)

mY

i=1

1

1 + np(e,a | f,m) =

Thursday, January 24, 13Thursday, September 11, 14

Page 37: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

IBM Model 1for each i 2 [1, 2, . . . ,m]

ai ⇠ Uniform(0, 1, 2, . . . , n)

ei ⇠ Categorical(✓fai)

mY

i=1

1

1 + np(ei | fai)p(e,a | f,m) =

Thursday, January 24, 13Thursday, September 11, 14

Page 38: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

IBM Model 1for each i 2 [1, 2, . . . ,m]

ai ⇠ Uniform(0, 1, 2, . . . , n)

ei ⇠ Categorical(✓fai)

mY

i=1

1

1 + np(ei | fai)p(e,a | f,m) =

Thursday, January 24, 13Thursday, September 11, 14

Page 39: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

IBM Model 1for each i 2 [1, 2, . . . ,m]

ai ⇠ Uniform(0, 1, 2, . . . , n)

ei ⇠ Categorical(✓fai)

mY

i=1

1

1 + np(ei | fai)p(e,a | f,m) =

p(ei, ai | f,m) =1

1 + np(ei | fai)

p(e,a | f,m) =mY

i=1

p(ei, ai | f,m)

Thursday, January 24, 13Thursday, September 11, 14

Page 40: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Marginal probabilityp(ei, ai | f,m) =

1

1 + np(ei | fai)

p(ei | f,m) =nX

ai=0

1

1 + np(ei | fai)

Recall our independence assumption: all alignment decisions are independent of each other, and given alignments all translation decisions are independent of each other, so all translation decisions are independent of each other.

Thursday, January 24, 13Thursday, September 11, 14

Page 41: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Marginal probabilityp(ei, ai | f,m) =

1

1 + np(ei | fai)

p(ei | f,m) =nX

ai=0

1

1 + np(ei | fai)

Recall our independence assumption: all alignment decisions are independent of each other, and given alignments all translation decisions are independent of each other, so all translation decisions are independent of each other.

p(a, b, c, d) = p(a)p(b)p(c)p(d)

Thursday, January 24, 13Thursday, September 11, 14

Page 42: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Marginal probabilityp(ei, ai | f,m) =

1

1 + np(ei | fai)

p(ei | f,m) =nX

ai=0

1

1 + np(ei | fai)

Recall our independence assumption: all alignment decisions are independent of each other, and given alignments all translation decisions are independent of each other, so all translation decisions are independent of each other.

Thursday, January 24, 13Thursday, September 11, 14

Page 43: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Marginal probabilityp(ei, ai | f,m) =

1

1 + np(ei | fai)

p(ei | f,m) =nX

ai=0

1

1 + np(ei | fai)

Recall our independence assumption: all alignment decisions are independent of each other, and given alignments all translation decisions are independent of each other, so all translation decisions are independent of each other.

p(e | f,m) =mY

i=1

p(ei | f,m)

Thursday, January 24, 13Thursday, September 11, 14

Page 44: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Marginal probabilityp(ei, ai | f,m) =

1

1 + np(ei | fai)

p(ei | f,m) =nX

ai=0

1

1 + np(ei | fai)

p(e | f,m) =mY

i=1

p(ei | f,m)

Thursday, January 24, 13Thursday, September 11, 14

Page 45: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Marginal probabilityp(ei, ai | f,m) =

1

1 + np(ei | fai)

p(ei | f,m) =nX

ai=0

1

1 + np(ei | fai)

p(e | f,m) =mY

i=1

p(ei | f,m)

=mY

i=1

nX

ai=0

1

1 + np(ei | fai)

=1

(1 + n)m

mY

i=1

nX

ai=0

p(ei | fai)

Thursday, January 24, 13Thursday, September 11, 14

Page 46: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Marginal probabilityp(ei, ai | f,m) =

1

1 + np(ei | fai)

p(ei | f,m) =nX

ai=0

1

1 + np(ei | fai)

p(e | f,m) =mY

i=1

p(ei | f,m)

=mY

i=1

nX

ai=0

1

1 + np(ei | fai)

=1

(1 + n)m

mY

i=1

nX

ai=0

p(ei | fai)

Thursday, January 24, 13Thursday, September 11, 14

Page 47: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Example

das Haus ist klein1 2 3 4

1 2 43

NULL0

Start with a foreign sentence and a target length.

Thursday, January 24, 13Thursday, September 11, 14

Page 48: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Example

das Haus ist klein1 2 3 4

1 2 43

NULL0

Thursday, January 24, 13Thursday, September 11, 14

Page 49: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Example

das Haus ist klein1 2 3 4

1 2 43

NULL0

Thursday, January 24, 13Thursday, September 11, 14

Page 50: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Example

das Haus ist klein

the

1 2 3 4

1 2 43

NULL0

Thursday, January 24, 13Thursday, September 11, 14

Page 51: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Example

das Haus ist klein

the

1 2 3 4

1 2 43

NULL0

Thursday, January 24, 13Thursday, September 11, 14

Page 52: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Example

das Haus ist klein

the house

1 2 3 4

1 2 43

NULL0

Thursday, January 24, 13Thursday, September 11, 14

Page 53: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Example

das Haus ist klein

the house

1 2 3 4

1 2 43

NULL0

Thursday, January 24, 13Thursday, September 11, 14

Page 54: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Example

das Haus ist klein

the house

1 2 3 4

1 2 4is3

NULL0

Thursday, January 24, 13Thursday, September 11, 14

Page 55: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Example

das Haus ist klein

the house

1 2 3 4

1 2 4is3

NULL0

Thursday, January 24, 13Thursday, September 11, 14

Page 56: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Example

das Haus ist klein

the house

1 2 3 4

1 2 4is small3

NULL0

Thursday, January 24, 13Thursday, September 11, 14

Page 57: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Example

das Haus ist klein1 2 3 4

1 2 43

NULL0

Thursday, January 24, 13Thursday, September 11, 14

Page 58: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Example

das Haus ist klein1 2 3 4

1 2 43

NULL0

Thursday, January 24, 13Thursday, September 11, 14

Page 59: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Example

das Haus ist klein

the

1 2 3 4

1 2 43

NULL0

Thursday, January 24, 13Thursday, September 11, 14

Page 60: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Example

das Haus ist klein

the

1 2 3 4

1 2 43

NULL0

Thursday, January 24, 13Thursday, September 11, 14

Page 61: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Example

das Haus ist klein

thehouse

1 2 3 4

1 2 43

NULL0

Thursday, January 24, 13Thursday, September 11, 14

Page 62: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Example

das Haus ist klein

thehouse

1 2 3 4

1 2 43

NULL0

Thursday, January 24, 13Thursday, September 11, 14

Page 63: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Example

das Haus ist klein

thehouse

1 2 3 4

1 2 4is

3

NULL0

Thursday, January 24, 13Thursday, September 11, 14

Page 64: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Example

das Haus ist klein

thehouse

1 2 3 4

1 2 4is

3

NULL0

Thursday, January 24, 13Thursday, September 11, 14

Page 65: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Example

das Haus ist klein

thehouse

1 2 3 4

1 2 4is small

3

NULL0

Thursday, January 24, 13Thursday, September 11, 14

Page 66: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

What is this good for?

• 1. Evaluate translation quality

• 2. Find the best alignment

• The IBM models are still used for this, though not as translation models

59

Thursday, September 11, 14

Page 67: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Finding the Viterbi Alignment

a⇤ = arg max

a2[0,1,...,n]mp(a | e, f)

= arg max

a2[0,1,...,n]m

p(e,a | f)Pa0 p(e,a0 | f)

= arg max

a2[0,1,...,n]mp(e,a | f)

a⇤i = arg

nmax

ai=0

1

1 + np(ei | fai)

= arg

nmax

ai=0p(ei | fai)

Thursday, January 24, 13

= p(a|f) p(e|a, f)

Thursday, September 11, 14

Page 68: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Finding the Viterbi Alignment

das Haus ist klein1 2 3 4

NULL0

the home1 2 4

is little3

Thursday, January 24, 13Thursday, September 11, 14

Page 69: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Finding the Viterbi Alignment

das Haus ist klein1 2 3 4

NULL0

the home1 2 4

is little3

Thursday, January 24, 13Thursday, September 11, 14

Page 70: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Finding the Viterbi Alignment

das Haus ist klein1 2 3 4

NULL0

the home1 2 4

is little3

Thursday, January 24, 13Thursday, September 11, 14

Page 71: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Finding the Viterbi Alignment

das Haus ist klein1 2 3 4

NULL0

the home1 2 4

is little3

Thursday, January 24, 13Thursday, September 11, 14

Page 72: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Finding the Viterbi Alignment

das Haus ist klein1 2 3 4

NULL0

the home1 2 4

is little3

Thursday, January 24, 13Thursday, September 11, 14

Page 73: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Finding the Viterbi Alignment

das Haus ist klein1 2 3 4

NULL0

the home1 2 4

is little3

Thursday, January 24, 13Thursday, September 11, 14

Page 74: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Finding the Viterbi Alignment

das Haus ist klein1 2 3 4

NULL0

the home1 2 4

is little3

Thursday, January 24, 13Thursday, September 11, 14

Page 75: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Finding the Viterbi Alignment

das Haus ist klein1 2 3 4

NULL0

the home1 2 4

is little3

Thursday, January 24, 13Thursday, September 11, 14

Page 76: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Finding the Viterbi Alignment

das Haus ist klein1 2 3 4

NULL0

the home1 2 4

is little3

Thursday, January 24, 13Thursday, September 11, 14

Page 77: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Finding the Viterbi Alignment

das Haus ist klein1 2 3 4

NULL0

the home1 2 4

is little3

Thursday, January 24, 13Thursday, September 11, 14

Page 78: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Finding the Viterbi Alignment

das Haus ist klein1 2 3 4

NULL0

the home1 2 4

is little3

Thursday, January 24, 13Thursday, September 11, 14

Page 79: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Finding the Viterbi Alignment

das Haus ist klein1 2 3 4

NULL0

the home1 2 4

is little3

Thursday, January 24, 13Thursday, September 11, 14

Page 80: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Finding the Viterbi Alignment

das Haus ist klein1 2 3 4

NULL0

the home1 2 4

is little3

Thursday, January 24, 13Thursday, September 11, 14

Page 81: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Finding the Viterbi Alignment

das Haus ist klein1 2 3 4

NULL0

the home1 2 4

is little3

Thursday, January 24, 13Thursday, September 11, 14

Page 82: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Finding the Viterbi Alignment

das Haus ist klein1 2 3 4

NULL0

the home1 2 4

is little3

Thursday, January 24, 13Thursday, September 11, 14

Page 83: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Finding the Viterbi Alignment

das Haus ist klein1 2 3 4

NULL0

the home1 2 4

is little3

Thursday, January 24, 13Thursday, September 11, 14

Page 84: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Finding the Viterbi Alignment

das Haus ist klein1 2 3 4

NULL0

the home1 2 4

is little3

Thursday, January 24, 13Thursday, September 11, 14

Page 85: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Finding the Viterbi Alignment

das Haus ist klein1 2 3 4

NULL0

the home1 2 4

is little3

Thursday, January 24, 13Thursday, September 11, 14

Page 86: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Finding the Viterbi Alignment

das Haus ist klein1 2 3 4

NULL0

the home1 2 4

is little3

Thursday, January 24, 13Thursday, September 11, 14

Page 87: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Finding the Viterbi Alignment

das Haus ist klein1 2 3 4

NULL0

the home1 2 4

is little3

Thursday, January 24, 13Thursday, September 11, 14

Page 88: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Finding the Viterbi Alignment

das Haus ist klein1 2 3 4

NULL0

the home1 2 4

is little3

Thursday, January 24, 13Thursday, September 11, 14

Page 89: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Finding the Viterbi Alignment

das Haus ist klein1 2 3 4

NULL0

the home1 2 4

is little3

Thursday, January 24, 13Thursday, September 11, 14

Page 90: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Finding the Viterbi Alignment

das Haus ist klein1 2 3 4

NULL0

the home1 2 4

is little3

Thursday, January 24, 13Thursday, September 11, 14

Page 91: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Finding the Viterbi Alignment

das Haus ist klein1 2 3 4

NULL0

the home1 2 4

is little3

Thursday, January 24, 13Thursday, September 11, 14

Page 92: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Finding the Viterbi Alignment

das Haus ist klein1 2 3 4

NULL0

the home1 2 4

is little3

Thursday, January 24, 13Thursday, September 11, 14

Page 93: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Finding the Viterbi Alignment

das Haus ist klein1 2 3 4

NULL0

the home1 2 4

is little3

Thursday, January 24, 13Thursday, September 11, 14

Page 94: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Finding the Viterbi Alignment

das Haus ist klein1 2 3 4

NULL0

the home1 2 4

is little3

Thursday, January 24, 13Thursday, September 11, 14

Page 95: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Finding the Viterbi Alignment

das Haus ist klein1 2 3 4

NULL0

the home1 2 4

is little3

Thursday, January 24, 13Thursday, September 11, 14

Page 96: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Learning Lexical Translation Models• How do we learn the parameters

• “Chicken and egg” problem

• If we had the alignments, we could estimate the parameters (MLE)

• If we had parameters, we could find the most likely alignments

p(e | f)

Thursday, January 24, 13Thursday, September 11, 14

Page 97: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

How to learn?

• But a is a latent variable. The marginal log-likelihood needs to sum-out the alignments. No closed form.

90

• Training data: tons of (e,f) pairs. Want to learn theta: lexical translation probabilities.

• If knew the alignments a, MLE would be easy!

max

X

(e,f)

log p(e | f ; ✓)

=

X

(e,f)

log

X

a

p(a | f ; ✓)⇥ p(e | a, f ; ✓)!

max

X

(e,f)

log p(e | a, f ; ✓)

Thursday, September 11, 14

Page 98: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

EM Algorithm• pick some random (or uniform) parameters

• Repeat until you get bored (~ 5 iterations for lexical translation models)

• using your current parameters, compute “expected” alignments for every target word token in the training data

• keep track of the expected number of times f translates into e throughout the whole corpus

• keep track of the expected number of times that f is used as the source of any translation

• use these expected counts as if they were “real” counts in the standard MLE equation

p(ai | e, f) (on board)

Thursday, January 24, 13Thursday, September 11, 14

Page 99: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

EM for Model 1

Thursday, January 24, 13Thursday, September 11, 14

Page 100: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

EM for Model 1

Thursday, January 24, 13Thursday, September 11, 14

Page 101: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

EM for Model 1

Thursday, January 24, 13Thursday, September 11, 14

Page 102: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

EM for Model 1

Thursday, January 24, 13Thursday, September 11, 14

Page 103: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

EM for Model 1

Thursday, January 24, 13Thursday, September 11, 14

Page 104: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Convergence

Thursday, January 24, 13Thursday, September 11, 14

Page 105: Lecture 4: Machine Translation and IBM Model 1 (Learning)brenocon/inlp2014/lectures/04-mt.pdf•Homework questions? • Today • Quick LM review • Machine translation -- learning

Now what?

• Decoding: find the best translation for a sentence

• Alignments: find the best alignment

• Evaluation?

• Other approaches?

98

Thursday, September 11, 14