Avenue Architecture
[Figure: Avenue architecture diagram. Stage labels: Elicitation, Rule Learning, Run-Time System, Rule Refinement, Morphology. Component labels: Elicitation Tool, Elicitation Corpus, Word-Aligned Parallel Corpus, Learning Module, Learned Transfer Rules, Handcrafted Rules, Morphology Analyzer, Lexical Resources, Run Time Transfer System, Decoder, Translation Correction Tool, Rule Refinement Module. The diagram also shows INPUT TEXT and OUTPUT TEXT.]
Interactive and Automatic Refinement of Translation Rules
• Problem: Improve Machine Translation Quality.
• Proposed Solution: Put bilingual speakers back into the loop; use their corrections to detect the source of the error and automatically improve the lexicon and the grammar.
• Approach: Automate post-editing efforts by feeding them back into the MT system. Go beyond post-editing by automatically refining the translation rules that caused an error.
• Goal: Improve MT coverage and overall quality.
Technical Challenges
• Elicit minimal MT information from non-expert users
• Automatically refine and expand translation rules (manually written and automatically learned) minimally
• Automatic evaluation of the refinement process
Error Typology for Automatic Rule Refinement (simplified)
Interactive elicitation of error information
• Missing word
• Extra word
• Wrong word order
  - Local vs. long distance
  - Word vs. phrase
  - + Word change
• Incorrect word
  - Sense
  - Form
  - Selectional restrictions
  - Idiom
• Wrong agreement
  - Missing constraint
  - Extra constraint
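The typology above can be held in a small lookup structure so that elicited errors are labelled consistently. This is a minimal sketch: the nesting of sub-types under each top-level error is read off the slide layout and is an assumption, and the names ERROR_TYPOLOGY and error_label are hypothetical.

```python
# Assumed nesting: each top-level error type maps to its sub-types.
ERROR_TYPOLOGY = {
    "missing word": [],
    "extra word": [],
    "wrong word order": ["local vs. long distance", "word vs. phrase", "+ word change"],
    "incorrect word": ["sense", "form", "selectional restrictions", "idiom"],
    "wrong agreement": ["missing constraint", "extra constraint"],
}

def error_label(top, sub=None):
    """Validate an error type (and optional sub-type) and return a label."""
    if top not in ERROR_TYPOLOGY:
        raise KeyError(f"unknown error type: {top}")
    if sub is None:
        return top
    if sub not in ERROR_TYPOLOGY[top]:
        raise ValueError(f"{sub!r} is not a sub-type of {top!r}")
    return f"{top} / {sub}"

print(error_label("wrong agreement", "missing constraint"))
```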
TCTool (Demo)
Actions:
• Add a word
• Delete a word
• Modify a word
• Change word order
Interactive elicitation of error information
                      precision   recall
error detection       90%         89%
error classification  72%         71%
Types of Refinement Operations
Automatic Rule Adaptation
1. Refine a translation rule: R0 → R1 (change R0 to make it more specific or more general)
R0: NP [DET ADJ N] → NP [DET N ADJ]
    a nice house → una casa bonito
R1: NP [DET ADJ N] → NP [DET N ADJ], with N gender = ADJ gender
    a nice house → una casa bonita
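The refine operation can be sketched in Python as follows. This is only an illustration: the Rule class, the licenses check and the small gender lexicon are hypothetical and are not the Avenue transfer formalism.

```python
from dataclasses import dataclass

# Hypothetical toy lexicon: Spanish word -> grammatical gender.
GENDER = {"casa": "fem", "bonito": "masc", "bonita": "fem"}

@dataclass(frozen=True)
class Rule:
    name: str
    source: tuple                         # source-side constituent order
    target: tuple                         # target-side constituent order
    constraints: frozenset = frozenset()  # category pairs that must agree in gender

def refine(rule, constraint, new_name):
    """Return a more specific copy of `rule` with one extra agreement constraint."""
    return Rule(new_name, rule.source, rule.target, rule.constraints | {constraint})

def licenses(rule, words):
    """True if `words` (category -> target word) satisfy every gender constraint."""
    return all(GENDER[words[a]] == GENDER[words[b]] for a, b in rule.constraints)

r0 = Rule("R0", ("DET", "ADJ", "N"), ("DET", "N", "ADJ"))
r1 = refine(r0, ("N", "ADJ"), "R1")

bad = {"DET": "una", "N": "casa", "ADJ": "bonito"}
good = {"DET": "una", "N": "casa", "ADJ": "bonita"}
print(licenses(r0, bad), licenses(r1, bad), licenses(r1, good))
```

R0 has no constraints, so it accepts "una casa bonito"; R1 differs only in the added N/ADJ gender constraint and rejects it.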
2. Bifurcate a translation rule: R0 → R0 (same, general rule) + R1 (add a new, more specific rule)
R0: NP [DET ADJ N] → NP [DET N ADJ]
    a nice house → una casa bonita
R1: NP [DET ADJ N] → NP [DET ADJ N], with ADJ type: pre-nominal
    a great artist → un gran artista
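A minimal sketch of bifurcation, assuming lexical transfer has already chosen the Spanish words and only the word order remains to be decided. The Rule class, the ADJ_TYPE lexicon and both helper functions are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    name: str
    source: tuple
    target: tuple
    adj_type: str = None   # None means the rule applies to any adjective

# Hypothetical lexicon marking which Spanish adjectives are pre-nominal.
ADJ_TYPE = {"bonita": "post-nominal", "gran": "pre-nominal"}

def bifurcate(rule, new_target, adj_type, new_name):
    """Keep `rule` unchanged and add a more specific sibling rule."""
    specific = Rule(new_name, rule.source, new_target, adj_type)
    return [specific, rule]   # the specific rule is tried first

def translate_np(rules, det, adj, noun):
    """Order the target words with the first rule whose adjective type matches."""
    words = {"DET": det, "ADJ": adj, "N": noun}
    for rule in rules:
        if rule.adj_type is None or ADJ_TYPE.get(adj) == rule.adj_type:
            return " ".join(words[cat] for cat in rule.target)
    raise ValueError("no applicable rule")

r0 = Rule("R0", ("DET", "ADJ", "N"), ("DET", "N", "ADJ"))
grammar = bifurcate(r0, ("DET", "ADJ", "N"), "pre-nominal", "R1")
print(translate_np(grammar, "una", "bonita", "casa"))
print(translate_np(grammar, "un", "gran", "artista"))
```

Unlike refinement, R0 stays in the grammar, so NPs with ordinary adjectives are still covered by the general rule.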
Error Information Elicitation
Refinement Operation Typology
Automatic Rule Adaptation
A concrete example
Action: Change word order
SL: Gaudí was a great artist
MT system output (TL): Gaudí era un artista grande
User correction: *Gaudí era un artista grande → Gaudí era un gran artista
Annotations on the slide: clue word, error, correction
Finding Triggering Feature(s): (error word, corrected word) =
need to postulate a new binary feature: feat1
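The step above can be sketched as a comparison of the feature sets of the error word and the corrected word: if no existing feature tells them apart, a new binary feature is postulated. The function names and the flat feature dictionaries are hypothetical; the feature values mirror the lexical entries for "grande" and "gran" in this deck.

```python
def find_triggering_features(error_feats, corrected_feats):
    """Return the names of features on which the two words already differ."""
    keys = set(error_feats) | set(corrected_feats)
    return {k for k in keys if error_feats.get(k) != corrected_feats.get(k)}

def postulate_feature(error_feats, corrected_feats, name="feat1"):
    """If no existing feature separates the words, add a new binary one."""
    if find_triggering_features(error_feats, corrected_feats):
        return error_feats, corrected_feats   # an existing feature suffices
    return {**error_feats, name: "-"}, {**corrected_feats, name: "+"}

grande = {"num": "sg", "gen": "masc"}   # error word
gran = {"num": "sg", "gen": "masc"}     # corrected word
new_grande, new_gran = postulate_feature(grande, gran)
print(new_grande)
print(new_gran)
```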
Blame assignment (from MT system output)
tree: <((S,1 (NP,2 (N,5:1 "GAUDI") )
(VP,3 (VB,2 (AUX,17:2 "ERA") )
(NP,8 (DET,0:3 "UN")
(N,4:5 "ARTISTA")
(ADJ,5:4 "GRANDE") ) ) ) )>
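One way to assign blame from such a trace is to find the lowest node that dominates both the clue word and the error word. The sketch below is an assumption about how this could be done, not the Xfer engine's own procedure; parse, leaves and lowest_rule are hypothetical helpers, and the enclosing angle brackets of the trace are omitted.

```python
import re

TRACE = '''((S,1 (NP,2 (N,5:1 "GAUDI") )
 (VP,3 (VB,2 (AUX,17:2 "ERA") )
 (NP,8 (DET,0:3 "UN")
 (N,4:5 "ARTISTA")
 (ADJ,5:4 "GRANDE") ) ) ) )'''

def parse(text):
    """Parse the bracketed trace into nested lists of tokens."""
    tokens = re.findall(r'\(|\)|"[^"]*"|[^\s()"]+', text)
    stack = [[]]
    for tok in tokens:
        if tok == "(":
            stack.append([])
        elif tok == ")":
            node = stack.pop()
            stack[-1].append(node)
        else:
            stack[-1].append(tok)
    return stack[0][0]

def leaves(node):
    """All quoted words under a node, in order, without the quotes."""
    out = []
    for item in node:
        if isinstance(item, list):
            out.extend(leaves(item))
        elif item.startswith('"'):
            out.append(item.strip('"'))
    return out

def lowest_rule(node, words):
    """Label of the lowest node whose yield contains every word in `words`."""
    for child in node:
        if isinstance(child, list) and words <= set(leaves(child)):
            return lowest_rule(child, words)
    return node[0] if node and isinstance(node[0], str) else None

tree = parse(TRACE)
print(lowest_rule(tree, {"ARTISTA", "GRANDE"}))
```

For the clue word ARTISTA and the error word GRANDE this points at NP,8, the rule that is bifurcated in the next step.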
Automatic Rule Adaptation
Grammar: S,1 … NP,1 … NP,8 …

ADJ::ADJ |: [great] -> [grande]
((X1::Y1)
 ((x0 form) = great)
 ((y0 agr num) = sg)
 ((y0 agr gen) = masc))

ADJ::ADJ |: [great] -> [gran]
((X1::Y1)
 ((x0 form) = great)
 ((y0 agr num) = sg)
 ((y0 agr gen) = masc))
Refining Rules
• Bifurcate NP,8 → NP,8 (R0) + NP,8’ (R1) (flip order of ADJ-N)

{NP,8’}
NP::NP : [DET ADJ N] -> [DET ADJ N]
(
 (X1::Y1) (X2::Y2) (X3::Y3)
 ((x0 def) = (x1 def))
 (x0 = x3)
 ((y1 agr) = (y3 agr)) ; det-noun agreement
 ((y2 agr) = (y3 agr)) ; adj-noun agreement
 (y2 = x3)
 ((y2 feat1) =c +)
)
Automatic Rule Adaptation
Refining Lexical Entries

ADJ::ADJ |: [great] -> [grande]
((X1::Y1)
 ((x0 form) = great)
 ((y0 agr num) = sg)
 ((y0 agr gen) = masc)
 ((y0 feat1) = -))

ADJ::ADJ |: [great] -> [gran]
((X1::Y1)
 ((x0 form) = great)
 ((y0 agr num) = sg)
 ((y0 agr gen) = masc)
 ((y0 feat1) = +))
Automatic Rule Adaptation
Evaluating Improvement
Given the initial and final translation lattices, the Rule Refinement module needs to take into account whether the following are present:
- Corrected translation sentence
- Original translation sentence (labelled as incorrect by the user)

Candidate translations, with only the user-rejected output starred:
  un artista gran
  un gran artista
  un grande artista
  *un artista grande

The same candidates, with every incorrect one starred:
  *un artista gran
  un gran artista
  *un grande artista
  *un artista grande
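The two checks named above can be sketched as set-membership tests over the lattices. This is a simplified illustration: lattices are reduced to plain sets of strings, the function name is hypothetical, and the example lattices are invented for the demonstration rather than taken from a real system run.

```python
def evaluate_refinement(initial, final, corrected, original):
    """Report whether the corrected and original translations appear in each lattice."""
    return {
        "corrected_in_initial": corrected in initial,
        "corrected_in_final": corrected in final,
        "original_in_initial": original in initial,
        "original_in_final": original in final,
    }

# Hypothetical lattices, reduced to sets of candidate strings.
initial_lattice = {"un artista grande", "un grande artista"}
final_lattice = {"un gran artista", "un artista gran"}

report = evaluate_refinement(
    initial_lattice,
    final_lattice,
    corrected="un gran artista",
    original="un artista grande",
)
print(report)
```

A refinement would count as an improvement when the corrected sentence appears in the final lattice, and as a stricter improvement when the sentence the user rejected no longer does.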
Challenges and future work
• Credit and Blame assignment from TCTool Log Files and Xfer engine’s trace
• Order of corrections matters ~ explore rule interactions
• Explore the space between batch mode and fully interactive system
• Keep an online TCTool always running to collect corrections from bilingual speakers; make it into a game with rewards for the best users
Publications
• Font Llitjós, A., J.G. Carbonell and A. Lavie. "A Framework for Interactive and Automatic Refinement of Transfer-based Machine Translation". EAMT 10th Annual Conference, 30-31 May 2005, Budapest, Hungary.
• Font Llitjós, A., R. Aranovich and L. Levin. "Building Machine Translation Systems for Indigenous Languages". Second Conference on the Indigenous Languages of Latin America (CILLA II), 27-29 October 2005, Texas, USA.
• Font Llitjós, A., K. Probst and J.G. Carbonell. "Error Analysis of Two Types of Grammar for the Purpose of Automatic Rule Refinement". AMTA, 2004, Washington, USA.
• Font Llitjós, A. and J.G. Carbonell. "The Translation Correction Tool: English-Spanish User Studies". LREC, 2004, Lisbon, Portugal.
Quechua-Spanish MT
• V-Unit: funded summer project in Cusco (Peru), June-August 2005 [preparations and data collection started earlier]
• Intensive Quechua course at the Centro Bartolomé de las Casas (CBC)
• Worked together with two native Quechua speakers and one non-native speaker on developing infrastructure (correcting elicited translations, segmenting and translating a list of the most frequent words)