Jan 2005 Statistical MT 1
CSA4050: Advanced Techniques in NLP
Machine Translation III
Statistical MT
Statistical Translation
• Robust
• Domain independent
• Extensible
• Does not require language specialists
• Uses noisy channel model of translation
Noisy Channel Model: Sentence Translation (Brown et al. 1990)
[Diagram: source sentence → noisy channel → target sentence]
The Problem of Translation
• Given a sentence T of the target language, seek the sentence S from which a translator produced T, i.e. find S that maximises P(S|T)
• By Bayes' theorem
P(S|T) = P(S) * P(T|S) / P(T)
whose denominator is independent of S.
• Hence it suffices to maximise P(S) * P(T|S)
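The decision rule on this slide can be sketched in a few lines of Python. This is only a toy illustration: the candidate sentences and every probability value are invented, and `best_source` is a hypothetical helper name, not part of any system described in the slides.

```python
# Toy illustration of the noisy-channel decision rule.
# All probabilities below are invented for illustration only.

def best_source(candidates, lm_prob, tm_prob, target):
    """Return the candidate S that maximises P(S) * P(T|S)."""
    return max(candidates, key=lambda s: lm_prob[s] * tm_prob[(target, s)])

target = "Jean aime Marie"
candidates = ["John loves Mary", "Mary loves John", "John Mary loves"]

# Hypothetical language-model probabilities P(S).
lm_prob = {
    "John loves Mary": 0.02,
    "Mary loves John": 0.02,
    "John Mary loves": 0.0001,
}

# Hypothetical translation-model probabilities P(T|S), keyed as (T, S).
tm_prob = {
    (target, "John loves Mary"): 0.3,
    (target, "Mary loves John"): 0.05,
    (target, "John Mary loves"): 0.3,
}

print(best_source(candidates, lm_prob, tm_prob, target))
```

P(T) never appears in the code: it is the same for every candidate S, so leaving it out does not change which candidate wins.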
A Statistical MT System
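[Diagram: the Source Language Model generates S with probability P(S); the Translation Model generates T from S with probability P(T|S); the Decoder takes T and recovers S.]
P(S) * P(T|S) = P(S,T), which is proportional to P(S|T) for a fixed T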
The Three Components of a Statistical MT model
1. Method for computing language model probabilities (P(S))
2. Method for computing translation probabilities (P(T|S))
3. Method for searching amongst source sentences for one that maximises P(S) * P(T|S)
Probabilistic Language Models
• General: P(s1 s2 ... sn) = P(s1) * P(s2|s1) * ... * P(sn|s1 ... s(n-1))
• Trigram: P(s1 s2 ... sn) = P(s1) * P(s2|s1) * P(s3|s1,s2) * ... * P(sn|s(n-2),s(n-1))
• Bigram: P(s1 s2 ... sn) = P(s1) * P(s2|s1) * ... * P(sn|s(n-1))
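A minimal sketch of the bigram case, assuming unsmoothed relative-frequency estimates from a tiny invented corpus. The function names and the `<s>` start marker are assumptions made for this example; using the marker means P(s1) is modelled as P(s1|<s>).

```python
from collections import Counter

def train_bigram(sentences):
    """Collect context and bigram counts; <s> marks the sentence start."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.split()
        unigrams.update(words[:-1])            # each word counted as a context
        bigrams.update(zip(words, words[1:]))  # adjacent word pairs
    return unigrams, bigrams

def bigram_prob(sentence, unigrams, bigrams):
    """P(s1 ... sn) as a product of P(si | s(i-1)), with no smoothing."""
    words = ["<s>"] + sentence.split()
    prob = 1.0
    for prev, cur in zip(words, words[1:]):
        if unigrams[prev] == 0:
            return 0.0  # unseen context: the unsmoothed estimate is zero
        prob *= bigrams[(prev, cur)] / unigrams[prev]
    return prob

# Invented toy corpus, for illustration only.
corpus = ["John loves Mary", "John loves Jane", "Mary loves John"]
unigrams, bigrams = train_bigram(corpus)

print(bigram_prob("John loves Mary", unigrams, bigrams))
print(bigram_prob("Mary John loves", unigrams, bigrams))
```

A word sequence containing any unseen bigram gets probability zero here, which is why practical language models add smoothing.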
A Simple Alignment Based Translation Model
Assumption: target sentence is generated from the source sentence word-by-word
S: John loves Mary
T: Jean aime Marie
Sentence Translation Probability
• According to this model, the translation probability of the sentence is just the product of the translation probabilities of the words.
• P(T|S) = P(Jean aime Marie|John loves Mary) = P(Jean|John) * P(aime|loves) * P(Marie|Mary)
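The product above can be computed directly. The sketch below assumes a strict one-to-one alignment in the same word order, as in the John/Jean example. The probability values in `t_prob` are invented and `sentence_translation_prob` is a hypothetical helper.

```python
from math import prod

# Hypothetical word translation probabilities P(t|s), keyed as (t, s).
t_prob = {
    ("Jean", "John"): 0.9,
    ("aime", "loves"): 0.8,
    ("Marie", "Mary"): 0.9,
}

def sentence_translation_prob(target, source, t_prob):
    """P(T|S) under a strict one-to-one, same-order word alignment."""
    t_words, s_words = target.split(), source.split()
    if len(t_words) != len(s_words):
        raise ValueError("this simple model needs equal-length sentences")
    # Unknown word pairs get probability 0.0.
    return prod(t_prob.get((t, s), 0.0) for t, s in zip(t_words, s_words))

print(sentence_translation_prob("Jean aime Marie", "John loves Mary", t_prob))
```

With these invented values the result is 0.9 * 0.8 * 0.9, roughly 0.648.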
More Realistic Example
The proposal will not now be implemented
Les propositions ne seront pas mises en application maintenant
Some Further Parameters
• Word Translation Probability: P(t|s)
• Fertility: the number of words in the target that are paired with each source word: (0 – N)
• Distortion: the difference in sentence position between the source word and the target word: P(i|j,l), where i is the target position, j the source position and l the target length
Searching
• Maintain list of hypotheses. Initial hypothesis: (Jean aime Marie | *)
• Search proceeds iteratively. At each iteration we extend the most promising hypotheses with additional words
– Jean aime Marie | John(1) *
– Jean aime Marie | * loves(2) *
– Jean aime Marie | * Mary(3) *
– Jean aime Marie | Jean(1) *
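The idea of keeping a list of hypotheses and extending the most promising ones can be sketched as a small beam search. This is a deliberate simplification of the search on the slide: it assumes one source word per target word and builds the source sentence strictly left to right. The function name, the beam width and all probability values are assumptions made for this example.

```python
import heapq

def beam_search(target, t_prob, lm_prob, vocab, beam_width=2):
    """Extend partial source hypotheses one word at a time, left to right."""
    hypotheses = [(1.0, ("<s>",))]  # initial hypothesis: no source words yet
    for t_word in target.split():
        extended = []
        for score, words in hypotheses:
            for s_word in vocab:
                lm = lm_prob.get((words[-1], s_word), 0.0)  # P(s_word | previous)
                tm = t_prob.get((t_word, s_word), 0.0)      # P(t_word | s_word)
                if lm * tm > 0.0:
                    extended.append((score * lm * tm, words + (s_word,)))
        # Keep only the most promising hypotheses.
        hypotheses = heapq.nlargest(beam_width, extended)
    if not hypotheses:
        return None, 0.0
    score, words = max(hypotheses)
    return " ".join(words[1:]), score

# Invented toy parameters, for illustration only.
vocab = ["John", "Jean", "loves", "Mary"]
t_prob = {("Jean", "John"): 0.9, ("Jean", "Jean"): 0.1,
          ("aime", "loves"): 0.8, ("Marie", "Mary"): 0.9}
lm_prob = {("<s>", "John"): 0.5, ("<s>", "Jean"): 0.05,
           ("John", "loves"): 0.4, ("Jean", "loves"): 0.4,
           ("loves", "Mary"): 0.3}

print(beam_search("Jean aime Marie", t_prob, lm_prob, vocab))
```

Both "John" and "Jean" survive the first step as hypotheses for position 1, mirroring the competing hypotheses John(1) and Jean(1) on the slide. The language model then favours the hypothesis starting with "John".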
Parameter Estimation
• In general - large quantities of data
• For language model, we need only source language text.
• For translation model, we need pairs of sentences that are translations of each other.
• Use EM Algorithm (Baum 1972) to optimize model parameters.
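As a rough illustration of EM on sentence pairs, the sketch below estimates word translation probabilities P(t|s) only. It ignores fertility and distortion, so it is much simpler than the model discussed in these slides, being closer to what is usually called IBM Model 1. The corpus, the function name and the iteration count are invented for this example.

```python
from collections import defaultdict

def train_word_translation(pairs, iterations=10):
    """Estimate P(t|s) by EM from (source, target) sentence pairs."""
    pairs = [(s.split(), t.split()) for s, t in pairs]
    target_vocab = {t for _, t_words in pairs for t in t_words}
    # Uniform start: every target word equally likely for every source word.
    uniform = 1.0 / len(target_vocab)
    t_prob = defaultdict(lambda: uniform)

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of t paired with s
        total = defaultdict(float)  # expected count of s paired with anything
        # E-step: share each target word among the source words in its sentence.
        for s_words, t_words in pairs:
            for t in t_words:
                norm = sum(t_prob[(t, s)] for s in s_words)
                for s in s_words:
                    frac = t_prob[(t, s)] / norm
                    count[(t, s)] += frac
                    total[s] += frac
        # M-step: renormalise expected counts into probabilities.
        for (t, s), c in count.items():
            t_prob[(t, s)] = c / total[s]
    return dict(t_prob)

# Invented toy parallel corpus (source English, target French).
pairs = [("the house", "la maison"),
         ("the flower", "la fleur"),
         ("a flower", "une fleur")]
model = train_word_translation(pairs)
print(round(model[("la", "the")], 3), round(model[("maison", "the")], 3))
```

No alignments are given in the training data. The pairing of "the" with "la" emerges only because the two words co-occur in more than one sentence pair.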
Experiment 1 (Brown et al. 1990)
• Hansard. 40,000 pairs of sentences = approx. 800,000 words in each language.
• Considered 9,000 most common words in each language.
• Assumptions (initial parameter values)
– each of the 9,000 target words equally likely as a translation of each of the source words
– each of the fertilities from 0 to 25 equally likely for each of the 9,000 source words
– each target position equally likely given each source position and target length
English: the
French Probability
le .610
la .178
l’ .083
les .023
ce .013
il .012
de .009
à .007
que .007
Fertility Probability
1 .871
0 .124
2 .004
English: not
French Probability
pas .469
ne .460
non .024
pas du tout .003
faux .003
plus .002
ce .002
que .002
jamais .002
Fertility Probability
2 .758
0 .133
1 .106
English: hear
French Probability
bravo .992
entendre .005
entendu .002
entends .001
Fertility Probability
0 .584
1 .416
Bajada 2003/4
• 400 sentence pairs from Malta/EU accession treaty
• Three different types of alignment
– Paragraph (precision 97%, recall 97%)
– Sentence (precision 91%, recall 95%)
– Word: 2 translation models (Model 1: distortion independent; Model 2: distortion dependent)
Bajada 2003/4
Measure                 Model 1   Model 2
word pairs present      244       244
word pairs identified   145       145
correct                 58        77
incorrect               87        68
precision               40%       53%
recall                  24%       32%
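The precision and recall figures follow from the counts in the table: precision is correct pairs over pairs identified, and recall is correct pairs over pairs present. A short check, with `precision_recall` as a hypothetical helper name:

```python
def precision_recall(correct, identified, present):
    """Precision = correct / identified; recall = correct / present."""
    return correct / identified, correct / present

# Counts taken from the table above.
for name, correct in [("Model 1", 58), ("Model 2", 77)]:
    p, r = precision_recall(correct, identified=145, present=244)
    print(f"{name}: precision {p:.0%}, recall {r:.0%}")
```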
Experiment 2
• Perform translation using 1000 most frequent words in the English corpus.
• French vocabulary: the 1,700 most frequently used French words in translations of sentences completely covered by the 1,000-word English vocabulary.
• 117,000 pairs of sentences completely covered by both vocabularies.
• Parameters of the English language model estimated from 570,000 sentences in the English part of the corpus.
Experiment 2 contd
• 73 French sentences tested from elsewhere in corpus. Results were classified as
– Exact: same as actual translation
– Alternate: same meaning
– Different: legitimate translation but different meaning
– Wrong: could not be interpreted as a translation
– Ungrammatical: grammatically deficient
• Corrections to the last three categories were made and keystrokes were counted
Results
Category # sentences percent
Exact 4 5
Alternate 18 25
Different 13 18
Wrong 11 15
Ungrammatical 27 37
Total 73
Results - Discussion
• According to Brown et al., the system performed successfully 48% of the time (first three categories).
• 776 keystrokes were needed to repair the system's output, compared with 1,916 keystrokes to generate all 73 translations from scratch.
• According to the authors, the system therefore reduces work by about 60%.
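Both headline figures can be reproduced from the numbers reported on these slides. The variable names below are chosen for this example only.

```python
# Sentence counts per category, from the results table.
counts = {"Exact": 4, "Alternate": 18, "Different": 13,
          "Wrong": 11, "Ungrammatical": 27}
total = sum(counts.values())

# "Successful" means the first three categories.
successful = counts["Exact"] + counts["Alternate"] + counts["Different"]
success_rate = successful / total

# Keystrokes to repair the output versus typing every translation from scratch.
repair_keystrokes, scratch_keystrokes = 776, 1916
reduction = 1 - repair_keystrokes / scratch_keystrokes

print(f"success rate: {success_rate:.0%}")
print(f"work reduction: {reduction:.1%}")
```

The success rate is 35 of 73 sentences, and the keystroke reduction comes out slightly under the rounded 60% figure.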