Machine Translation
Day 20
EVALUATING MT
MT Evaluation
Source: ズキズキ 痛み ます 。
16 human translations:
• I have a throbbing pain.
• I am experiencing a throbbing pain.
• I am suffering from a throbbing pain.
• I am feeling a throbbing pain.
• It is a throbbing pain.
• It's throbbing and it really hurts.
• It's painful and it's throbbing.
• It's throbbing with pain.
• It's in throbbing pain.
• It hurts so much it's throbbing.
• I've got a throbbing pain.
• I can feel a throbbing pain.
• I am suffering from a throbbing pain.
• I am experiencing a throbbing pain.
• I have a painful throbbing.
• I feel a painful throbbing.
Data from International Workshop on Spoken Language Translation
MT Evaluation
• No “right answer”!
• What can we test instead?
– Human adequacy / fluency ratings
– Human efficacy in an application (e.g. question answering from translated foreign documents vs. native documents)
– Very accurate, but slow & expensive
• Agreement with reference translations
– BLEU (BiLingual Evaluation Understudy: IBM)
– Fast system development
BLEU (Papineni, ACL 2002)
• MT output:
1: It is a guide to action which ensures that the military always obeys the commands of the party.
2: It is to insure the troops forever hearing the activity guidebook that party direct.
• Human (reference) translations:
1: It is a guide to action that ensures that the military will forever heed Party commands.
2: It is the guiding principle which guarantees the military forces always being under the command of the Party.
3: It is the practical guide for the army always to heed the directions of the party.
BLEU: observations
1: It is a guide to action which ensures that the military always obeys the commands of the party.
2: It is to insure the troops forever hearing the activity guidebook that party direct.
• Observations
– Word overlap is indicative
– n-gram (word sequence) overlap is even more distinctive
– Drawing from multiple reference translations helps
BLEU metric
• Compute n-gram precisions: $P_n = \dfrac{c(\text{matched } n\text{-grams})}{c(n\text{-grams in candidate})}$
• Compute a brevity penalty (prevents candidates from deleting difficult words): $\mathrm{BP} = \exp(\min(1 - r/c,\ 0))$, where $r$ = reference length and $c$ = candidate length
• Combine using geometric mean: $\mathrm{BLEU} = \mathrm{BP} \cdot \left( \prod_{i=1}^{n} P_i \right)^{1/n}$
• Produces score on a 0-1 scale, often expressed as a “percentage” (i.e., multiplied by 100)
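The metric above can be sketched in a few lines of Python. This is a minimal sentence-level illustration, not a reference implementation. It makes several assumptions the slide does not state: matches are clipped to the maximum count in any single reference, $r$ is taken as the reference length closest to the candidate length, $n$ runs up to 4, and tokenization is a plain lowercase whitespace split. The helper names (`ngrams`, `modified_precision`, `bleu`, `tokenize`) are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """P_n = c(matched n-grams) / c(n-grams in candidate).

    Assumption: a candidate n-gram is credited at most as many times
    as it occurs in any single reference (clipping).
    """
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    matched = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return matched / sum(cand_counts.values())


def bleu(candidate, references, max_n=4):
    """BP times the geometric mean of P_1 .. P_max_n, on a 0-1 scale."""
    if not candidate:
        return 0.0
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # the geometric mean is zero as soon as one P_n is zero
    c = len(candidate)
    # Assumption: r is the reference length closest to c (ties: shorter one).
    r = min((len(ref) for ref in references),
            key=lambda length: (abs(length - c), length))
    bp = math.exp(min(1.0 - r / c, 0.0))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)


def tokenize(sentence):
    """Hypothetical tokenizer: lowercase, drop full stops, split on spaces."""
    return sentence.lower().replace(".", "").split()


references = [
    tokenize("It is a guide to action that ensures that the military will forever heed Party commands."),
    tokenize("It is the guiding principle which guarantees the military forces always being under the command of the Party."),
    tokenize("It is the practical guide for the army always to heed the directions of the party."),
]
candidate_1 = tokenize("It is a guide to action which ensures that the military always obeys the commands of the party.")
candidate_2 = tokenize("It is to insure the troops forever hearing the activity guidebook that party direct.")

print(bleu(candidate_1, references), bleu(candidate_2, references))
```

On the example sentences from the slides, candidate 1 scores higher than candidate 2 under these assumptions. Candidate 2 has no matching 3-grams or 4-grams, so without smoothing its sentence-level score collapses to zero; this is one reason BLEU is normally computed over a whole test set rather than a single sentence.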
BLEU results circa 2002
[from Papineni et al., ACL 2002] Distinguishes humans from machines…
[from G. Doddington, NIST] …correlates well with human judgments
However, nowadays we’re starting to see problems:
– Some systems score better than human translations
– In competitions, some “gaming of BLEU”
– Rule-based systems are at a disadvantage after tuning
MT Evaluation: Human
• Absolute evaluation
– Given a reference translation, human evaluators are asked to rate translation quality on a scale of 1-4
  4 = Ideal: grammatically correct, all information included
  3 = Acceptable: not perfect, but definitely comprehensible, AND with accurate transfer of all important information
  2 = Possibly acceptable: may be interpretable given context/time, some information transferred accurately
  1 = Unacceptable: absolutely not comprehensible and/or little or no information transferred accurately
• Relative evaluation
– Human judges are presented with a reference translation and two machine translations in random order, and must pick the better of the two
– Criteria for decision are left up to the individual judge
Absolute quality: Spanish-English
[Figure: histogram of the number of sentences (0 to 120) at each quality score (1 to 4, in steps of 0.5) for Babelfish and MSR-MT]
Average quality scores: Babelfish = 2.344, MSR-MT = 2.727
Extrinsic evaluation: Microsoft product support site
• Microsoft support knowledge base
– Thousands of customer support articles available at http://support.microsoft.com
– However, most are only available in English
– Translating all articles by hand is too expensive
– Instead we present unedited MT articles
– Available in Spanish, French, German, Japanese, etc.
• Some of the publicly available data-driven translations (2002-2003)
http://support.microsoft.com
PSS survey results (Spanish)
• Overall satisfaction with the article (scale: 1 to 9)
– 86.0% scored between 5 and 9; US English = 74.2%
• Technical accuracy of the article (1 to 9)
– 75.3% scored between 5 and 9
• Task success
– “Did the information in the (machine translated) knowledge base article help answer your question?” Yes:
  • Machine translated Spanish = 49.7%
  • Human translated Spanish = 51.2%
  • US English = 53.6%
WORD ALIGNMENT
A very simple MT system
• Get a translation dictionary
• Assign a uniform distribution over all translations of each source word
• Tokenize input sentence, replace each word with its English translation:
weil er gestern gegangen ist
because he yesterday gone is
• Not terrible, but not very fluent
Simple Statistical Machine Translation
• Given foreign f, find best English translation e*: $e^* = \arg\max_e P(e \mid f)$
• Use Bayes’ rule to get the “noisy channel” model:
$P(e \mid f) = P(f \mid e) \cdot P(e) / P(f)$
$\arg\max_e P(e \mid f) = \arg\max_e P(f \mid e) \cdot P(e)$
• P(f | e) is the channel or translation model
• P(e) is the language model
Toy System A
• Channel model reversed, otherwise identical
– Now gives a probability of source given target
– Uniform distribution over all source translations of a given target word
• Word-based bigram model as language model
– Improves translations in context
– Improves fluency overall
• Looks like an HMM tagger:
– Find Viterbi path through a lattice or trellis
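The Viterbi search described above can be sketched as follows. This is a toy illustration under stated assumptions: the dictionary entries follow the candidate words shown on the slides, the bigram log-probabilities in `LM` are invented for the example, and unseen bigrams fall back to a flat -5.0. The function name `viterbi_translate` is hypothetical.

```python
import math
from collections import defaultdict


def viterbi_translate(source, dictionary, bigram_logprob):
    """Monotone word-for-word translation with a bigram language model.

    Channel model: P(f | e) is uniform over the source words that list e
    as a translation. Only the best hypothesis ending in each target word
    is kept, because a bigram LM cannot see further back than that.
    """
    sources_of = defaultdict(set)
    for src, targets in dictionary.items():
        for tgt in targets:
            sources_of[tgt].add(src)

    # best[word] = (total log score, words generated so far)
    best = {"<s>": (0.0, [])}
    for src in source:
        new_best = {}
        for word in dictionary[src]:
            channel = -math.log(len(sources_of[word]))
            score, path = max(
                ((prev_score + bigram_logprob(prev, word) + channel, prev_path)
                 for prev, (prev_score, prev_path) in best.items()),
                key=lambda item: item[0],
            )
            new_best[word] = (score, path + [word])
        best = new_best
    return max(best.values(), key=lambda item: item[0])


dictionary = {
    "weil": ["because"],
    "er": ["he", "him", "his"],
    "gestern": ["yesterday"],
    "gegangen": ["gone", "left"],
    "ist": ["is", "had", "are", "has"],
}

# Hypothetical bigram log-probabilities, made up for this example.
LM = {
    ("<s>", "because"): -1.0,
    ("because", "he"): -1.0,
    ("he", "yesterday"): -1.0,
    ("yesterday", "left"): -1.0,
    ("left", "had"): -1.5,
}


def bigram_logprob(prev, word):
    return LM.get((prev, word), -5.0)  # flat fallback for unseen bigrams


score, words = viterbi_translate(
    "weil er gestern gegangen ist".split(), dictionary, bigram_logprob)
print(score, " ".join(words))
```

With these invented scores the search picks "because he yesterday left had": the language model chooses better words in context, but the output still follows the source word order.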
Toy System A: search
Source: weil er gestern gegangen ist
Candidate translations per source word: weil: because; er: he / him / his; gestern: yesterday; gegangen: gone / left; ist: is / had / are / has
[Figure: search lattice with partial-hypothesis scores: <s> 0, because -3.2, he -5.6, him -5.4, his -5.9, yesterday -8.3, gone -9.9, left -10.4, eat -10.3, eat -9.8]
• Only need to keep the best hypothesis ending in some word – bigram LM can’t see beyond that (Viterbi!)
• Each partial hypothesis keeps track of the last word generated (for LM score) and the total score so far
Learning the translation model
• Start from seminal work by IBM in the late 1980s and early 1990s
• They developed models for identifying word correspondences (word alignments) in parallel data
Learning the translation model
• Say we had some word-aligned parallel data
• How would we estimate a translation model?
[Figure: three word-aligned sentence pairs: “the house” / “la maison”, “the flower” / “la fleur”, “the blue house” / “la maison bleu”]
Learning the translation model
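• Say I had a model of P(french | english)
• How can I find alignments?
[Figure: the same three sentence pairs (“the house” / “la maison”, “the flower” / “la fleur”, “the blue house” / “la maison bleu”), shown next to two translation tables]
P(Word | blue): bleu 0.8, …
P(Word | the): la 0.3, le 0.3, les 0.2, …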
Parameter estimation
• Given lists of parallel sentences (e, f)
• If we had the hidden alignments a, then we could estimate multinomial parameters based on counts:
c(e, f) := number of times e was aligned to f
c(e) := number of occurrences of e
t(f | e) := c(e, f) / c(e)
• On the other hand, if we knew the parameters t(· | ·), we could find the most likely alignments
• Bit of a chicken-and-egg problem…
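If the alignments were given, the count-based estimate above is a one-liner. The sketch below assumes every English word occurrence carries exactly one alignment link, so c(e) can be counted from the links themselves. The function name `estimate_t` is hypothetical, and the last sentence pair ("the book" / "le livre") is an invented addition to the slide data so that the estimates are not all 1.0.

```python
from collections import Counter


def estimate_t(links):
    """links: list of (english_word, french_word) alignment links.

    Returns t[(f, e)] = c(e, f) / c(e), the relative-frequency estimate.
    """
    pair_counts = Counter(links)
    english_counts = Counter(e for e, _ in links)
    return {(f, e): count / english_counts[e]
            for (e, f), count in pair_counts.items()}


links = [
    ("the", "la"), ("house", "maison"),                    # the house / la maison
    ("the", "la"), ("flower", "fleur"),                    # the flower / la fleur
    ("the", "la"), ("blue", "bleu"), ("house", "maison"),  # the blue house / la maison bleu
    ("the", "le"), ("book", "livre"),                      # hypothetical pair, not from the slides
]

t = estimate_t(links)
print(t[("la", "the")], t[("le", "the")])  # 0.75 0.25
```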
Expectation-Maximization
• Enter the Expectation-Maximization algorithm
– Method for optimizing parameters / finding hidden state in unsupervised problems
• A procedural description for now
– Pick an initial set of parameters $t_0(f \mid e)$, set $k = 0$
– Until convergence…
  • Find expected values of the hidden states $a_{k+1}$ for each pair assuming parameters $t_k$ are correct (Expectation)
  • Find the most likely parameters $t_{k+1}$ assuming that hidden states $a_{k+1}$ are correct (Maximization)
  • Increment $k$
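The procedure above can be sketched for a word-translation table t(f | e). This is a minimal illustration in the style of IBM Model 1, not the original implementation. It assumes uniform initialisation, a fixed number of iterations instead of a convergence test, and a `[null]` token prepended to every English sentence; the function name `train_model1` is hypothetical.

```python
from collections import defaultdict

NULL = "[null]"


def train_model1(pairs, iterations=10):
    """pairs: list of (english_tokens, french_tokens).

    Returns a mapping t[(f, e)] approximating t(f | e).
    """
    pairs = [([NULL] + list(es), list(fs)) for es, fs in pairs]
    french_vocab = {f for _, fs in pairs for f in fs}
    uniform = 1.0 / len(french_vocab)
    t = defaultdict(lambda: uniform)  # t_0: uniform initial parameters

    for _ in range(iterations):
        count = defaultdict(float)  # expected c(e, f)
        total = defaultdict(float)  # expected c(e)
        # Expectation: distribute each French word over the English words
        # of its sentence, in proportion to the current t(f | e).
        for es, fs in pairs:
            for f in fs:
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    posterior = t[(f, e)] / z
                    count[(f, e)] += posterior
                    total[e] += posterior
        # Maximization: re-estimate t(f | e) from the expected counts.
        t = defaultdict(float, {(f, e): c / total[e]
                                for (f, e), c in count.items()})
    return t


pairs = [
    ("the house".split(), "la maison".split()),
    ("the flower".split(), "la fleur".split()),
    ("the blue house".split(), "la maison bleu".split()),
]
t = train_model1(pairs, iterations=10)
for e in ["the", "house", "flower", "blue"]:
    best_f = max(["la", "maison", "fleur", "bleu"], key=lambda f: t[(f, e)])
    print(e, "->", best_f, round(t[(best_f, e)], 2))
```

Even on three sentence pairs, the word "la" ends up explained by "the" (and the null word), which frees "maison", "fleur" and "bleu" to pair with "house", "flower" and "blue".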
Model 1
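[Figure: the three sentence pairs with a [null] word added on the English side: “[null] the house” / “la maison”, “[null] the flower” / “la fleur”, “[null] the blue house” / “la maison bleu”]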
Model 1, EM iteration 0
[Figure: the three sentence pairs with every alignment link weighted equally: 0.33 on each link of “the house” / “la maison” and of “the flower” / “la fleur”, 0.25 on each link of “the blue house” / “la maison bleu”]
Model 1, EM iteration 1
[Figure: the same alignment links after re-estimation; the highest link weight in the three sentence pairs is now 0.42, 0.60 and 0.45 respectively]
Model 1, EM iteration 2
[Figure: the same alignment links after re-estimation; the highest link weight in the three sentence pairs is now 0.46, 0.74 and 0.60 respectively]
Model 1, EM iteration 6
[Figure: the same alignment links after re-estimation; the highest link weight in the three sentence pairs is now 0.64, 0.96 and 0.91 respectively]
IBM Word-based translation (Brown et al., 1993)
• Model P(f | e): French translations given English
[Figure: word alignment between “[null] I do not speak French” and “je ne parle pas francais”]
Model 1
• Lots of simplifying assumptions:
– All lengths are equally likely: $P(m \mid e) = \epsilon$ (uniform)
– All word alignments are equally likely: $P(a_j \mid a_1^{j-1}, f_1^{j-1}, m, e) = 1 / (l + 1)$ (uniform)
– French word depends only on the English word it’s aligned to: $P(f_j \mid a_1^{j}, f_1^{j-1}, m, e) = t(f_j \mid e_{a_j})$ (one multinomial per English word)
• Resulting model: $P(f, a \mid e) = \dfrac{\epsilon}{(l + 1)^m} \prod_{j=1}^{m} t(f_j \mid e_{a_j})$
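A generative story (IBM Models 1-2, HMM)
$P(f, a \mid e) = P(m \mid e) \cdot \prod_{j=1}^{m} \left( P(a_j \mid a_1^{j-1}, f_1^{j-1}, m, e) \cdot P(f_j \mid a_1^{j}, f_1^{j-1}, m, e) \right)$
Exact – chain rule!
• $P(m \mid e)$: pick the length of the French sentence
• $\prod_{j=1}^{m}$: for each position in the French sentence…
• $P(a_j \mid a_1^{j-1}, f_1^{j-1}, m, e)$: pick the English word aligned to the French word in that position, then…
• $P(f_j \mid a_1^{j}, f_1^{j-1}, m, e)$: pick the French word in that position
Notation:
• $E$, $F$: English, French vocabularies
• $e = e_1^l = (e_1, \ldots, e_l)$: English sentence, $e_i \in E$
• $f = f_1^m = (f_1, \ldots, f_m)$: French sentence, $f_j \in F$
• $a = a_1^m = (a_1, \ldots, a_m)$: word alignment, $a_j \in [0..l]$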
Progression of alignment models
• Models of increasing complexity
– Only Model 1 is convex
• Models 3, 4, 5 each capture new aspects of the sentence
– Capture “fertility”
– Different movement models
– Each model can initialize its successor – helps avoid local minima
• Freely available tools for this task
– GIZA++
– Berkeley aligner

| Model | Translation | Distortion | Fertility |
| --- | --- | --- | --- |
| 1 | Yes | --- | --- |
| 2 | Yes | Abs | --- |
| HMM | Yes | Rel | --- |
| 3 | Yes | Abs | Yes |
| 4 | Yes | Rel | Yes |
| 5 | Yes | Rel | Yes |
Toy System A’
• Our prior toy system used a uniform distribution for translations
• Now we can plug in Model 1 parameters
• Language model helps pick translations that are fluent
• Translation model helps pick translations that are adequate
• Looks just like an HMM!
Toy System A’
[Figure: the Toy System A search lattice for “weil er gestern gegangen ist”, with the same candidate words and partial-hypothesis scores]
• Each translation is like a part-of-speech tag
• Becomes bigram LM + Model 1!
Some questions:
• What about standard translation dictionaries? Should we include them, and how?
• What translation phenomena are we covering and what are we missing?
• Does it work?
Toy System B
• System A: finds better translations in context, but can’t reorder
“er gestern gegangen ist” → “he yesterday left had” (should be “he had left yesterday”)
• System B: allow all possible permutations
– Each hypothesis now remembers:
  • Last target word generated
  • Set of source words already translated
– 5! = 120 permutations, 10! = 3.6M, 20! = 2.43e18
– No way we can afford to keep all translations!
• Group into stacks based on count of words covered
– Histogram pruning: limited number of hypotheses on any stack
– Threshold pruning: only keep hypotheses within d of best on stack
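A stack-based search of this kind can be sketched as below. This is a toy illustration with several simplifications that should be read as assumptions: only the bigram language model is scored (the channel model is treated as flat), only histogram pruning is applied, and hypotheses with the same last word and the same covered source positions are merged. The function name `stack_decode`, the `stack_size` parameter and the log-probabilities in `LM` are hypothetical.

```python
def stack_decode(source, dictionary, bigram_logprob, stack_size=10):
    """Word-based decoding that may translate source words in any order.

    stacks[k] holds hypotheses covering exactly k source words, keyed by
    (last target word, set of covered source positions).
    """
    n = len(source)
    stacks = [dict() for _ in range(n + 1)]
    stacks[0][("<s>", frozenset())] = (0.0, [])

    for k in range(n):
        # Histogram pruning: expand only the best `stack_size` hypotheses.
        survivors = sorted(stacks[k].items(),
                           key=lambda item: item[1][0],
                           reverse=True)[:stack_size]
        for (last, covered), (score, words) in survivors:
            for i in range(n):
                if i in covered:
                    continue  # this source word is already translated
                for word in dictionary[source[i]]:
                    new_score = score + bigram_logprob(last, word)
                    key = (word, covered | {i})
                    if key not in stacks[k + 1] or new_score > stacks[k + 1][key][0]:
                        stacks[k + 1][key] = (new_score, words + [word])
    return max(stacks[n].values(), key=lambda item: item[0])


dictionary = {
    "er": ["he", "him", "his"],
    "gestern": ["yesterday"],
    "gegangen": ["gone", "left"],
    "ist": ["is", "had", "are", "has"],
}

# Hypothetical bigram log-probabilities, made up for this example.
LM = {
    ("<s>", "he"): -1.0,
    ("he", "had"): -1.0,
    ("had", "left"): -1.0,
    ("left", "yesterday"): -1.0,
}


def bigram_logprob(prev, word):
    return LM.get((prev, word), -5.0)  # flat fallback for unseen bigrams


score, words = stack_decode("er gestern gegangen ist".split(),
                            dictionary, bigram_logprob)
print(score, " ".join(words))
```

With these invented scores the decoder can now reach "he had left yesterday", which the monotone search of System A could not produce. Threshold pruning could be added by discarding survivors that score more than d below the best hypothesis on the stack.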
Toy System B: search
[Figure: hypotheses for “weil er gestern gegangen ist” grouped into Stack 0, Stack 1, Stack 2, …; each hypothesis shows its last word, its score and a coverage bit vector over the source words, e.g. <s> 0 (00000) in Stack 0 and because -3.2 (10000), he -3.5 (01000), yesterday -1.9 (00100) in Stack 1]
• Like an expanded Viterbi search, but each hypothesis also needs to remember which source words have been translated already!
Beyond Toy System B
• Many problems with this system:
– System allows all possible reorderings, but some are much more likely than others
– Contextual information is only captured by the target language model, not in the source
• Multiple paths from here:
– Better word alignment
– Phrase-based translation: learn bigger translation units – this is crucial!
– Better reordering models: syntax can help here
Word-based MT results
SRC: 对外经济贸易合作部今天提供的数据表明,今年至十一月中国实际利用外资四百六十九点五九亿美元,其中包括外商直接投资四百点零七亿美元。
REF: According to the data provided today by the Ministry of Foreign Trade and Economic Cooperation, as of November this year, China has actually utilized 46.959 billion US dollars of foreign capital, including 40.007 billion US dollars of direct investment from foreign businessmen.
WB: The Ministry of Foreign Trade and Economic Cooperation, including foreign direct investment 40.007 billion US dollars today provide data include that year to November china actually using 46.959 billion US dollars and

SRC: Le politique de la haine
REF: Politics of hate
WB: The policy of the hatred

SRC: Nous avons signé le protocole.
REF: We did sign the memorandum of agreement.
WB: We have signed the protocol.

SRC: Où était le plan solide?
REF: But where was the solid plan?
WB: Where was the economic base?
Word alignment and phrase extraction (Koehn, Och, Marcu 2003)
[Figure: word-alignment grid for “the blue house” / “a casa azul”; the phrase pairs consistent with the alignment are added to the list one at a time]
• the ↔ a
• blue ↔ azul
• house ↔ casa
• blue house ↔ casa azul
• the blue house ↔ a casa azul
Phrase table
• Extract phrases from all sentence pairs
• Estimate P(src | tgt) with c(src, tgt) / c(tgt)

| Portuguese | English | Prob |
| --- | --- | --- |
| ver | see | 0.533 |
| ver | view | 0.129 |
| ver | to see | 0.044 |
| ver | viewing | 0.009 |
| ver | seeing | 0.008 |
| ver | watch | 0.007 |
| … | … | … |
| ver o mundo através | view the world through | 1.000 |
| ver e adquirir | browse and purchase | 1.000 |
| ver ou editar | view or edit | 0.875 |
| ver filmes | watch movies | 0.667 |
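Phrase extraction and the relative-frequency estimate can be sketched as follows. This is a simplified illustration, assuming a phrase pair is kept when no word inside it is aligned to a word outside it; the usual extension over unaligned boundary words is left out. The function names `extract_phrases` and `phrase_probs`, and the `max_len` limit, are hypothetical.

```python
from collections import Counter


def extract_phrases(src, tgt, alignment, max_len=7):
    """Return the set of (src_phrase, tgt_phrase) pairs consistent with
    `alignment`, a set of (src_index, tgt_index) links."""
    pairs = set()
    for i1 in range(len(src)):
        for i2 in range(i1, min(i1 + max_len, len(src))):
            tgt_points = [j for i, j in alignment if i1 <= i <= i2]
            if not tgt_points:
                continue  # source span has no aligned words
            j1, j2 = min(tgt_points), max(tgt_points)
            # Consistency: no target word in [j1, j2] may be linked
            # to a source word outside [i1, i2].
            if any(j1 <= j <= j2 and not (i1 <= i <= i2) for i, j in alignment):
                continue
            pairs.add((" ".join(src[i1:i2 + 1]), " ".join(tgt[j1:j2 + 1])))
    return pairs


def phrase_probs(pair_occurrences):
    """P(src | tgt) = c(src, tgt) / c(tgt) over a list of extracted pairs."""
    pair_counts = Counter(pair_occurrences)
    tgt_counts = Counter(tgt for _, tgt in pair_occurrences)
    return {(s, t): c / tgt_counts[t] for (s, t), c in pair_counts.items()}


src = "the blue house".split()
tgt = "a casa azul".split()
alignment = {(0, 0), (1, 2), (2, 1)}  # the-a, blue-azul, house-casa

phrases = extract_phrases(src, tgt, alignment)
for pair in sorted(phrases):
    print(pair)

table = phrase_probs(list(phrases))
```

On the "the blue house" / "a casa azul" example this yields the same five phrase pairs as the slides; "the blue" is rejected because its target span would also contain "casa", which is aligned to "house". In a real system the pair occurrences would be pooled over all sentence pairs before the probabilities are estimated.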
Word-based vs. phrase-based (BLEU score vs. training data size)
[Figure: BLEU score (y-axis, 20 to 30) against training data size (x-axis: 40k, 80k, 160k, 320k) for two systems: “Phrases from word alignment” and “Word-based”] [Koehn, Och, and Marcu 2003]
These systems, with sufficient data, produce better translations than rule-based systems… mostly.
Syntax in translation
• Phrases capture contextual translation and local reordering surprisingly well
• However this information is brittle:
– “author of the book” → “本書的作者” tells us nothing about how to translate “author of the pamphlet” or “author of the play”
– The Chinese phrase “NOUN1 的 NOUN2” becomes “NOUN2 of NOUN1” in English
• No information about global reordering
– In Chinese, prepositional phrases often come before verbs; in English, they come after
Syntax-based source reordering
• Language is hierarchical – our models should capture this
• Phrasal cohesion (Fox, 2002): most often, each source constituent translates to a contiguous target constituent
• Source parse trees can inform reordering
– First parse the source sentence
– Then use information about the source to guide reordering
Wang, Collins, Koehn (2007): Parse the Chinese, reorder like English
Some pertinent rules
Syntax-directed translation
• Begin by parsing source sentence
– Syntactic analysis can guide reordering and inform translation
• One approach: Treelet translation (Quirk, Menezes, and Cherry, 2005)
– Use dependency trees: minimal amount of syntactic information (just head node)
![Page 57: Machine Translation Day 20. EVALUATING MT 2 MT Evaluation I have a throbbing pain. I am experiencing a throbbing pain. I am suffering from a throbbing](https://reader035.vdocuments.mx/reader035/viewer/2022062421/56649da65503460f94a91d2e/html5/thumbnails/57.jpg)
57
Treelet and template extraction
• Start from word aligned sentence pairs
the blue house
a casa azul
![Page 58: Machine Translation Day 20. EVALUATING MT 2 MT Evaluation I have a throbbing pain. I am experiencing a throbbing pain. I am suffering from a throbbing](https://reader035.vdocuments.mx/reader035/viewer/2022062421/56649da65503460f94a91d2e/html5/thumbnails/58.jpg)
58
Treelet and template extraction
• Parse source:
[Figure: source dependency tree with house/NN as head and the/DT, blue/JJ as its dependents, above the target string “a casa azul”]
![Page 59: Machine Translation Day 20. EVALUATING MT 2 MT Evaluation I have a throbbing pain. I am experiencing a throbbing pain. I am suffering from a throbbing](https://reader035.vdocuments.mx/reader035/viewer/2022062421/56649da65503460f94a91d2e/html5/thumbnails/59.jpg)
59
Treelet and template extraction
• Project tree:
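[Figure: the source dependency tree (house/NN heading the/DT and blue/JJ) projected through the word alignment onto the target words a, casa, azul]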
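Projection can be sketched as copying each source dependency through the word alignment. The sketch below assumes a one-to-one alignment, which holds for this example but not in general; the function name and the dictionary encoding are hypothetical.

```python
def project_heads(src_heads, alignment):
    """Map each source dependency (child -> head) onto the target side.

    Assumes every source word is aligned to exactly one target word.
    """
    tgt_heads = {}
    for child, head in src_heads.items():
        tgt_child = alignment[child]
        tgt_heads[tgt_child] = alignment[head] if head is not None else None
    return tgt_heads

# Source dependency tree: "house" is the root; "the" and "blue" depend on it.
src_heads = {"the": "house", "blue": "house", "house": None}
# Word alignment for the example sentence pair.
alignment = {"the": "a", "blue": "azul", "house": "casa"}

projected = project_heads(src_heads, alignment)
print(projected)  # "casa" becomes the root; "a" and "azul" depend on it
```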
![Page 60: Machine Translation Day 20. EVALUATING MT 2 MT Evaluation I have a throbbing pain. I am experiencing a throbbing pain. I am suffering from a throbbing](https://reader035.vdocuments.mx/reader035/viewer/2022062421/56649da65503460f94a91d2e/html5/thumbnails/60.jpg)
60
Treelet and template extraction
• Extract treelet pairs:
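• Treelet: connected subgraph of the dependency tree
[Figure: source dependency tree (house/NN heading the/DT and blue/JJ) and its projection onto a, casa, azul]
Extracted treelet pairs (source ↔ target):
– the ↔ a
– blue ↔ azul
– house ↔ casa
– blue house ↔ casa azul
– the blue house ↔ a casa azul
– the house ↔ a casa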
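The six pairs can be reproduced by enumerating every connected subgraph of the source dependency tree and reading off the aligned target words. This is a brute-force sketch for a three-word sentence, assuming a one-to-one alignment; the variable and function names are hypothetical, and a real extractor would bound the treelet size instead of enumerating all subsets.

```python
from itertools import combinations

SRC = ["the", "blue", "house"]   # source words in sentence order
TGT = ["a", "casa", "azul"]      # target words in sentence order
HEADS = {0: 2, 1: 2, 2: None}    # source dependency heads, by word index
ALIGN = {0: 0, 1: 2, 2: 1}       # source index -> aligned target index

def is_connected(nodes):
    """In a tree, a node set is connected iff exactly one of its nodes
    has its head outside the set."""
    nodes = set(nodes)
    roots = [n for n in nodes if HEADS[n] not in nodes]
    return len(roots) == 1

def treelet_pairs():
    """Return (source treelet, target treelet) strings, each in word order."""
    pairs = []
    for size in range(1, len(SRC) + 1):
        for nodes in combinations(range(len(SRC)), size):
            if is_connected(nodes):
                src = " ".join(SRC[i] for i in nodes)
                tgt = " ".join(TGT[j] for j in sorted(ALIGN[i] for i in nodes))
                pairs.append((src, tgt))
    return pairs

for src, tgt in treelet_pairs():
    print(f"{src} <-> {tgt}")
```

Note that “the blue” is not extracted: both words attach to “house”, so without the head they do not form a connected subgraph.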
![Page 61: Machine Translation Day 20. EVALUATING MT 2 MT Evaluation I have a throbbing pain. I am experiencing a throbbing pain. I am suffering from a throbbing](https://reader035.vdocuments.mx/reader035/viewer/2022062421/56649da65503460f94a91d2e/html5/thumbnails/61.jpg)
61
Treelet and template extraction
• Extract templates:
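[Figure: the treelet pair (the/DT blue/JJ house/NN, a casa azul) alongside its template, in which every word is replaced by a wildcard: */DT, */JJ, */NN on the source side and *, *, * on the target side]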
![Page 62: Machine Translation Day 20. EVALUATING MT 2 MT Evaluation I have a throbbing pain. I am experiencing a throbbing pain. I am suffering from a throbbing](https://reader035.vdocuments.mx/reader035/viewer/2022062421/56649da65503460f94a91d2e/html5/thumbnails/62.jpg)
62
Europarl English-Spanish
[Chart: BLEU (axis from 20% to 35%) for the Phrasal and Template systems on the devtest, in-domain, and out-of-domain sets]
![Page 63: Machine Translation Day 20. EVALUATING MT 2 MT Evaluation I have a throbbing pain. I am experiencing a throbbing pain. I am suffering from a throbbing](https://reader035.vdocuments.mx/reader035/viewer/2022062421/56649da65503460f94a91d2e/html5/thumbnails/63.jpg)
Impact of preserving ambiguity
• Start with treelet systems
– Technical English-German, English-Japanese
– Newswire Chinese-English
• Translate each of k-best parses independently
• Keep the translation with the best score
• Evaluate using BLEU
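BLEU by number of parses (EG = English-German, EJ = English-Japanese, CE = Chinese-English):

| parses | EG | EJ | CE |
|---|---|---|---|
| 1 | 33.6 | 36.0 | 28.2 |
| 2 | 33.8 | 36.1 | 28.5 |
| 4 | 34.1 | 36.3 | 28.9 |
| 8 | 34.3 | 36.6 | 29.2 |
| 16 | 34.5 | 36.8 | 29.7 |
| 32 | 34.8 | 37.1 | 30.0 |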
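The selection procedure on this slide can be sketched as follows. The `translate` callable is a hypothetical stand-in for a treelet decoder that returns a translation together with its model score; the parses and scores below are made up and unrelated to the results in the table.

```python
def translate_k_best(parses, translate):
    """Translate each parse independently and keep the best-scoring result.

    `translate` is a hypothetical callable: parse -> (translation, model_score).
    """
    best = None
    for parse in parses:
        candidate = translate(parse)
        if best is None or candidate[1] > best[1]:
            best = candidate
    return best

# Stub standing in for a real decoder; the scores are invented.
TOY_OUTPUTS = {
    "parse_1": ("translation A", -12.4),
    "parse_2": ("translation B", -11.9),
    "parse_3": ("translation C", -13.0),
}

def toy_translate(parse):
    return TOY_OUTPUTS[parse]

k_best = ["parse_1", "parse_2", "parse_3"]  # ordered by parser score
for k in (1, 2, 3):
    print(k, translate_k_best(k_best[:k], toy_translate))
```

Because each larger k-best list contains the smaller one, the selected model score can only stay the same or improve as k grows; whether BLEU improves as well is an empirical question, which is what the table measures.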
![Page 64: Machine Translation Day 20. EVALUATING MT 2 MT Evaluation I have a throbbing pain. I am experiencing a throbbing pain. I am suffering from a throbbing](https://reader035.vdocuments.mx/reader035/viewer/2022062421/56649da65503460f94a91d2e/html5/thumbnails/64.jpg)
64
Target language syntax
• If we want a grammatical translation, shouldn’t we use a grammar?
• Use a parser in the target language instead
– Translation becomes cross-lingual parsing: find the best English parse tree for a Chinese sentence
– Great for translating into English or other languages with lots of linguistic resources
• Later approaches capture larger synchronous rules at a time (Marcu et al., 2006) and are pretty successful, though somewhat slow in comparison
– Ongoing research to speed things up
![Page 65: Machine Translation Day 20. EVALUATING MT 2 MT Evaluation I have a throbbing pain. I am experiencing a throbbing pain. I am suffering from a throbbing](https://reader035.vdocuments.mx/reader035/viewer/2022062421/56649da65503460f94a91d2e/html5/thumbnails/65.jpg)
What about morphology?
• Most of these approaches treat words as indivisible units
• Some recent work addresses this problem:
– Phrasal translations of morpheme sequences (requires morphological segmentation)
![Page 66: Machine Translation Day 20. EVALUATING MT 2 MT Evaluation I have a throbbing pain. I am experiencing a throbbing pain. I am suffering from a throbbing](https://reader035.vdocuments.mx/reader035/viewer/2022062421/56649da65503460f94a91d2e/html5/thumbnails/66.jpg)
Remaining limitations
• Most systems consider only a single sentence at a time
– What about discourse phenomena?
– Coreference?
• How do we handle unknown words?
• Where do we get the data?
![Page 67: Machine Translation Day 20. EVALUATING MT 2 MT Evaluation I have a throbbing pain. I am experiencing a throbbing pain. I am suffering from a throbbing](https://reader035.vdocuments.mx/reader035/viewer/2022062421/56649da65503460f94a91d2e/html5/thumbnails/67.jpg)
Thanks!