motivations for transfer-based translation
![Page 1: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/1.jpg)
Motivations for transfer-based translation
• lexical ambiguity
• structural differences
See further Ingo 91
![Page 2: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/2.jpg)
Example 1
Sv. Fyll på olja i växellådan.
En. Fill gearbox with oil. (from the Scania corpus)
• fyll på → fill
• obj → adv
• adv → obj
![Page 3: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/3.jpg)
Example 2
Sv. I oljefilterhållaren sitter en överströmningsventil.
En. The oil filter retainer has an overflow valve. (from the Scania corpus)
• sitter → has
• adv → subj
• subj → obj
![Page 4: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/4.jpg)
Transfer-based translation
• intermediary sentence structure
• basic processes
  – analysis
  – transfer
  – generation (synthesis)
• language modules
  – dictionary and grammar of SL
  – transfer dictionary and transfer rules
  – dictionary and grammar of TL
![Page 5: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/5.jpg)
[Figure: diagram relating SL and TL via direct translation, transfer, and interlingua; system labels: Multra, Metal]
![Page 6: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/6.jpg)
Levels of intermediary structure
• cf. J&M, Chapter 21
• word order
![Page 7: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/7.jpg)
Metal
• See H&S
![Page 8: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/8.jpg)
MULTRA
Multilingual Support for Translation and Writing
• translation engine
• transfer-based
  – shake-and-bake
• modular
• unification-based
• preference machinery
• traceable
![Page 9: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/9.jpg)
![Page 10: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/10.jpg)
Analysis
• chart parser (Lisp C)
  – procedural formalism
    • unification and other kinds of operations
• sentence structure
  – feature structure
  – grammatical relations
  – surface order implicit via grammatical relations
See further Sågvall Hein & Starbäck (99), Weijnitz (02), Dahllöf (89)
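Unification of feature structures can be illustrated with a minimal Python sketch. It assumes feature structures are plain nested dicts with atomic string values, and it leaves out variables and structure sharing, so it is not the Multra parser's formalism. The function name `unify` and the sample structures are illustrative only.

```python
def unify(a, b):
    """Unify two feature structures (nested dicts); return None on a clash."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for feature, value in b.items():
            if feature in result:
                merged = unify(result[feature], value)
                if merged is None:
                    return None  # conflicting values under the same feature
                result[feature] = merged
            else:
                result[feature] = value
        return result
    # Atomic values unify only when they are identical.
    return a if a == b else None


# Illustrative structures; feature names follow the slides.
noun_phrase = {"PHR.CAT": "NP", "HEAD": {"LEX": "HYTT.NN.1"}}
singular_noun = {"NUMB": "SING", "HEAD": {"WORD.CAT": "NOUN"}}
clause = {"PHR.CAT": "CL"}

print(unify(noun_phrase, singular_noun))  # compatible: the features are merged
print(unify(noun_phrase, clause))         # None, because the PHR.CAT values clash
```

Unification is order-independent in this sketch: compatible information from both structures is accumulated, and any clash makes the whole operation fail.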
![Page 11: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/11.jpg)
Transfer
• unification-based
• declarative formalism
  – Multra transfer formalism (Beskow 93)
• lexical and structural rules
• rules are partially ordered
• a more specific rule takes precedence over a less specific one
  – specificity in terms of number of transfer equations
• all applicable rules are applied
• written in Prolog
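The specificity ordering can be sketched in Python. This is a rough illustration, not Multra's Prolog implementation. It assumes a rule is a dict holding a list of (feature path, value) equations, and that specificity is simply the number of equations. The rule `tänka-think` is a hypothetical general rule added for contrast. The helper names are made up for this sketch.

```python
def get_path(structure, path):
    """Follow a feature path through nested dicts; return None if it is absent."""
    for feature in path:
        if not isinstance(structure, dict) or feature not in structure:
            return None
        structure = structure[feature]
    return structure


def is_applicable(rule, structure):
    """A rule applies when every one of its equations holds in the structure."""
    return all(get_path(structure, path) == value
               for path, value in rule["equations"])


def rank_by_specificity(rules, structure):
    """Return all applicable rules, most specific (most equations) first."""
    matching = [rule for rule in rules if is_applicable(rule, structure)]
    return sorted(matching, key=lambda rule: len(rule["equations"]), reverse=True)


RULES = [
    {"label": "tänka-think",  # hypothetical general rule, not from the slides
     "equations": [(("verb", "lex"), "tänka.vb.1")]},
    {"label": "tänka.på-think.about",
     "equations": [(("verb", "lex"), "tänka.vb.1"),
                   (("obj.prep", "prep", "lex"), "på.pp.1")]},
]

clause = {"verb": {"lex": "tänka.vb.1"},
          "obj.prep": {"prep": {"lex": "på.pp.1"}}}
print([rule["label"] for rule in rank_by_specificity(RULES, clause)])
# ['tänka.på-think.about', 'tänka-think']
```

Both rules apply to the clause, but the contextual rule has more equations and is therefore preferred.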
![Page 12: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/12.jpg)
Generation
• syntactic generation
  – Multra syntactic generation formalism (Beskow 97a)
  – PATR-like style
    • unification
    • concatenation
    • typed features
• morphological generation (Beskow 97b)
  – lexical insertion rules
  – morphological realisation and phonological finish in Prolog
• written in Prolog
![Page 13: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/13.jpg)
An example: Tippa hytten.
(* = (PHR.CAT = CL
      MODE = IMP
      SUBJ = 2ND
      VERB = (WORD.CAT = VERB
              INFF = IMP
              DIAT = ACT
              LEX = TIPPA.VB.1
              VSURF = +)
      OBJ.DIR = (PHR.CAT = NP
                 NUMB = SING
                 GENDER = UTR
                 CASE = BASIC
                 DEF = DEF
                 HEAD = (LEX = HYTT.NN.1
                         WORD.CAT = NOUN))
      REG = (V1.LEM = TIPPA.VB)
      SEP = (WORD.CAT = SEP
             LEX = STOP.SR.0)))
![Page 14: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/14.jpg)
Transfer structure
[VERB : [WORD.CAT : VERB
         LEX : TILT.VB.0
         DIAT : ACT
         INFF : IMP]
 OBJ.DIR : [PHR.CAT : NP
            DEF : DEF
            NUMB : SING
            HEAD : [WORD.CAT : NOUN
                    LEX : CAB.NN.0]]
 MODE : IMP
 SUBJ : 2ND
 VSURF : +
 SEP : [WORD.CAT : SEP
        LEX : STOP.SR.0]
 PHR.CAT : CL]
![Page 15: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/15.jpg)
Generation
Tilt the cab.
![Page 16: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/16.jpg)
A grammar rule
defrule legal.obj {
  <?1 phr.cat> = 'np,
  not <?1 case> = 'gen,
  not <?1 case> = 'subj
}
![Page 17: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/17.jpg)
Transfer rules
• copy feature
• delete feature
• transfer feature
• assign feature
![Page 18: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/18.jpg)
Copy feature
LABEL mode
SOURCE <* mode> = ?x1
TARGET <* mode> = ?x2
TRANSFER
![Page 19: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/19.jpg)
Delete feature
LABEL REG
SOURCE <* REG> = ANY
TARGET <*> = <*>
TRANSFER
![Page 20: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/20.jpg)
Transfer feature
LABEL OBJ.DIR
SOURCE <* OBJ.DIR> = ?x1
TARGET <* OBJ.DIR> = ?x2
TRANSFER ?x1 <=> ?x2
![Page 21: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/21.jpg)
Define feature
LABEL trycka.in-press
SOURCE <* lex sym>=trycka.vb+in.ab.1
       <* word.cat>=VERB
TARGET <* lex>=press.vb.1
       <* word.cat>=VERB
TRANSFER
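The combined effect of the four rule types on the "Tippa hytten." example can be approximated in Python. This is an imperative sketch over nested dicts, not the declarative Multra formalism. The lexical pairs are taken from the analysis and transfer structures shown earlier. Deleting GENDER and CASE is an assumption inferred from comparing those two structures. The source structure is reduced for brevity. The names `LEXICON`, `DELETED` and `transfer` are made up for this sketch.

```python
# Lexical correspondences read off the "Tippa hytten." slides.
LEXICON = {"TIPPA.VB.1": "TILT.VB.0", "HYTT.NN.1": "CAB.NN.0"}
# REG is deleted by the rule on the slide; GENDER and CASE are assumed here.
DELETED = {"REG", "GENDER", "CASE"}


def transfer(source):
    """Map a source feature structure to a target feature structure."""
    target = {}
    for feature, value in source.items():
        if feature in DELETED:
            continue                           # delete feature
        if isinstance(value, dict):
            target[feature] = transfer(value)  # transfer feature (?x1 <=> ?x2)
        elif feature == "LEX":
            target[feature] = LEXICON[value]   # assign feature via a lexical rule
        else:
            target[feature] = value            # copy feature
    return target


analysis = {
    "PHR.CAT": "CL",
    "MODE": "IMP",
    "VERB": {"WORD.CAT": "VERB", "LEX": "TIPPA.VB.1"},
    "OBJ.DIR": {"PHR.CAT": "NP", "NUMB": "SING", "GENDER": "UTR", "DEF": "DEF",
                "HEAD": {"LEX": "HYTT.NN.1", "WORD.CAT": "NOUN"}},
    "REG": {"V1.LEM": "TIPPA.VB"},
}
print(transfer(analysis))
```

A real transfer component would select among declarative rules by unification and specificity rather than hard-coding the four cases in one function.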
![Page 22: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/22.jpg)
A generation rule
LABEL CL.IMP
X1 ---> X2 X3 X4 :
<X1 PHR.CAT> = CL
<X1 VERB> = <X2>
<X1 TYPE> = IMP
<X1 OBJ.DIR> = <X3>
<X1 SEP> = <X4>
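The ordering imposed by CL.IMP (verb, then direct object, then separator) can be mimicked in a few lines of Python. This is only a sketch. The `SURFACE` table of word forms and both function names are assumptions for illustration; Multra itself handles morphology with separate lexical insertion and realisation rules.

```python
# Hypothetical surface forms for the lexemes in the transfer structure.
SURFACE = {"TILT.VB.0": "tilt", "CAB.NN.0": "cab", "STOP.SR.0": "."}


def generate_np(np):
    """Definite noun phrase: article followed by the head noun."""
    words = ["the"] if np.get("DEF") == "DEF" else []
    words.append(SURFACE[np["HEAD"]["LEX"]])
    return words


def generate_cl_imp(cl):
    """Linearise an imperative clause as X2 X3 X4 = verb, object, separator."""
    if cl.get("PHR.CAT") != "CL":
        raise ValueError("CL.IMP only applies to clauses")
    verb = [SURFACE[cl["VERB"]["LEX"]]]   # X2
    obj = generate_np(cl["OBJ.DIR"])      # X3
    sep = SURFACE[cl["SEP"]["LEX"]]       # X4
    text = " ".join(verb + obj) + sep
    return text[0].upper() + text[1:]


transfer_structure = {
    "PHR.CAT": "CL",
    "MODE": "IMP",
    "VERB": {"WORD.CAT": "VERB", "LEX": "TILT.VB.0"},
    "OBJ.DIR": {"PHR.CAT": "NP", "DEF": "DEF", "NUMB": "SING",
                "HEAD": {"WORD.CAT": "NOUN", "LEX": "CAB.NN.0"}},
    "SEP": {"WORD.CAT": "SEP", "LEX": "STOP.SR.0"},
}
print(generate_cl_imp(transfer_structure))  # Tilt the cab.
```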
![Page 23: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/23.jpg)
A contextual lexical rule
LABEL tänka.på-think.about
SOURCE <* verb lex sym> = tänka.vb.1
       <* obj.prep phr.cat> = pp
       <* obj.prep prep> = ?prep
       <* obj.prep prep lex sym> = på.pp.1
       <* obj.prep rect> = ?rect1
TARGET <* obj.prep phr.cat> = pp
       <* obj.prep prep word.cat> = PREP
       <* obj.prep prep lex> = about.pp.1
       <* obj.prep rect> = ?rect2
TRANSFER ?rect1 <=> ?rect2
![Page 24: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/24.jpg)
A generation trace
1-Applying Rule cl-sep
1- Applying Rule cl.imp
1- Applying Rule subj2nd-verb-obj.dir
1- Applying Rule verb.main.act
1- Applying Rule np.the-df
1- Applying Rule ng.noun-def
1-Success!
![Page 25: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/25.jpg)
Language resources in the MATS system
• dictionary in a database with different views
• analysis grammar
• transfer grammar– incl. contextually defined lexical rules
• generation grammar
![Page 26: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/26.jpg)
sv-en_LinkLexicon
![Page 27: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/27.jpg)
en-Inflections
![Page 28: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/28.jpg)
en_LemmaLexicon
![Page 29: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/29.jpg)
en_LexemeLexicon
![Page 30: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/30.jpg)
en_Lexicon
![Page 31: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/31.jpg)
en_StemLexicon
![Page 32: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/32.jpg)
sv_Inflections
![Page 33: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/33.jpg)
sv_LemmaLexicon
![Page 34: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/34.jpg)
sv_LexemeLexicon
![Page 35: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/35.jpg)
sv_Lexicon
![Page 36: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/36.jpg)
sv_StemLexicon
![Page 37: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/37.jpg)
The MATS system
Frozen demo…
![Page 38: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/38.jpg)
Assignment 2: Working with MATS
http://stp.ling.uu.se/~evapet/mt04/assignment2.html
![Page 39: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/39.jpg)
Lexicalistic translation
• Identify (lexical) translation units in the source sentence
• Translate each unit separately (considering the context)
• Order the result in agreement with a model of the target language
Formulation due to Lars Ahrenberg; see further AH (in the reading list); see also Beaven, John L., Shake-and-Bake Machine Translation. COLING-92, Nantes, 23-28 August 1992.
![Page 40: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/40.jpg)
T4F – a lexicalistic system
• processes in T4F
  – tokenisation
  – tagging
  – transfer
  – transposition
  – filtering
See further AH (in the reading list)
![Page 41: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/41.jpg)
Interlingua translation
• See SN
![Page 42: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/42.jpg)
![Page 43: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/43.jpg)
![Page 44: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/44.jpg)
![Page 45: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/45.jpg)
Applications of alignment
• translation memories
• translation dictionaries
• lexicalistic translation
• statistical machine translation
• example-based translation
![Page 46: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/46.jpg)
Translation memories
• based on sentence links
• optionally, sub-sentence links
See further Macklovitch, E. (2000)
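A translation memory built on sentence links can be sketched as a lookup table. The sketch is in Python and uses sentence pairs from the earlier slides. The fuzzy fallback with `difflib.SequenceMatcher` and the 0.8 threshold are assumptions added for illustration; they are not something the slides prescribe.

```python
from difflib import SequenceMatcher

# Sentence links (source sentence -> stored translation) from earlier slides.
MEMORY = {
    "Fyll på olja i växellådan.": "Fill gearbox with oil.",
    "Tippa hytten.": "Tilt the cab.",
}


def lookup(sentence, memory, threshold=0.8):
    """Return (translation, score); translation is None below the threshold."""
    if sentence in memory:
        return memory[sentence], 1.0  # exact match
    best_source, best_score = None, 0.0
    for source in memory:
        score = SequenceMatcher(None, sentence, source).ratio()
        if score > best_score:
            best_source, best_score = source, score
    if best_source is not None and best_score >= threshold:
        return memory[best_source], best_score  # fuzzy match
    return None, best_score


print(lookup("Tippa hytten.", MEMORY))  # exact match
print(lookup("Tippa hytten", MEMORY))   # fuzzy match, final full stop missing
```

Sub-sentence links would need the same kind of lookup over aligned fragments instead of whole sentences.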
![Page 47: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/47.jpg)
Translation dictionaries
• based on word links
• refinement of word links
![Page 48: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/48.jpg)
Refinement of word alignment data
• neutralise capital letters where appropriate
• lemmatise or tag source and target units
• identify ambiguities
  – search for criteria to resolve them
• identify partial links
  – compounds?
  – remove or complete them
• manual revision?
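Two of the refinement steps above, case neutralisation and ambiguity detection, can be sketched in Python. The sketch assumes word links arrive as (source, target) string pairs, and it lowercases everything, which is cruder than "where appropriate". The sample links are invented for illustration, apart from the motorn pair, which reuses an example from these slides.

```python
from collections import Counter, defaultdict


def refine(links):
    """Lowercase word links, count them, and flag ambiguous source units."""
    counts = Counter((source.lower(), target.lower()) for source, target in links)
    by_source = defaultdict(dict)
    for (source, target), frequency in counts.items():
        by_source[source][target] = frequency
    # A source unit is ambiguous when it is linked to more than one target.
    ambiguous = {s: targets for s, targets in by_source.items() if len(targets) > 1}
    return dict(by_source), ambiguous


# Illustrative link data.
LINKS = [
    ("Motorn", "the motor"), ("motorn", "the engine"), ("motorn", "the motor"),
    ("Olja", "oil"), ("olja", "oil"),
]

dictionary, ambiguous = refine(LINKS)
print(dictionary)
print(ambiguous)  # only 'motorn' has competing translations
```

Lemmatisation, tagging and the handling of partial links would slot in before the counting step.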
![Page 49: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/49.jpg)
Informally about statistical MT
• build a translation dictionary based on word alignment
• aim for as big fragments as possible
• keep information on link frequency
• build an n-gram model of the target language
• implement a direct translation strategy
  – including alternatives ordered by length and frequency
• process the output with the n-gram model, selecting the best alternatives, and adjust the translation accordingly
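The recipe above can be condensed into a toy Python sketch. It assumes a bigram model scored by raw counts rather than probabilities, with link frequency used only as a tie-breaker. The dictionary and the target corpus are invented for illustration. A real system would use smoothed probabilities and a proper search instead of enumerating every combination.

```python
from collections import Counter
from itertools import product


def train_bigrams(sentences):
    """Count bigrams, with sentence boundary markers, in a target corpus."""
    counts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        counts.update(zip(tokens, tokens[1:]))
    return counts


def score(tokens, bigrams):
    """Crude fluency score: sum of the corpus counts of the bigrams used."""
    padded = ["<s>"] + tokens + ["</s>"]
    return sum(bigrams[pair] for pair in zip(padded, padded[1:]))


def translate(source_tokens, dictionary, bigrams):
    """Direct translation: try every alternative, keep the best-scoring one."""
    options = [list(dictionary[word].items()) for word in source_tokens]
    best_tokens, best_key = None, None
    for choice in product(*options):
        tokens = " ".join(target for target, _ in choice).split()
        # Rank by n-gram score first, then by total link frequency.
        key = (score(tokens, bigrams), sum(freq for _, freq in choice))
        if best_key is None or key > best_key:
            best_tokens, best_key = tokens, key
    return " ".join(best_tokens)


# Invented toy data: link frequencies and a three-sentence target corpus.
DICTIONARY = {"motorn": {"the motor": 2, "the engine": 1},
              "startar": {"starts": 3}}
BIGRAMS = train_bigrams(["the engine starts", "the engine stops", "the motor hums"])

print(translate(["motorn", "startar"], DICTIONARY, BIGRAMS))  # the engine starts
```

Here the n-gram model overrides the more frequent link "the motor", because "the engine starts" is better supported by the target corpus.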
![Page 50: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/50.jpg)
Example-based MT
HS (in the reading list)
![Page 51: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/51.jpg)
Some current research topics
• intersentential dependencies
• hybrid systems: data-driven and rule-driven
• improved alignment techniques
• improved language modeling in ST
• automatic learning from post-editing
• translation by structural correspondences
• translation of spoken language
• improved preference strategies
• ambiguity preserving translation
![Page 52: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/52.jpg)
Intersentential dependencies
• pronoun resolution
• lexical ambiguity resolution, such as
  – (torkar)motorn → the motor
  – (förbrännings)motorn → the engine
• fluency
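The motorn case can be sketched as a lookup that remembers the most recent compound across sentence boundaries. This is a toy Python heuristic, not a method described in these slides. The table `COMPOUNDS`, the function name, and the default choice for a bare "motorn" without an antecedent are all assumptions.

```python
# Compound -> translation of the bare head "motorn", as on the slide.
COMPOUNDS = {"torkarmotorn": "the motor", "förbränningsmotorn": "the engine"}


def translate_motorn(sentences, default="the engine"):
    """Return one translation per bare 'motorn', using the nearest earlier compound."""
    antecedent = None
    choices = []
    for sentence in sentences:
        for raw in sentence.split():
            word = raw.strip(".,").lower()
            if word in COMPOUNDS:
                antecedent = COMPOUNDS[word]  # remember across sentences
            elif word == "motorn":
                choices.append(antecedent or default)
    return choices


text = [
    "Torkarmotorn M2 är sammankopplad med omkopplare S24 och intervallrelä R22.",
    "För att inte motorn skall överbelastas finns en inbyggd termovakt "
    "som bryter strömmen till motorn.",
]
print(translate_motorn(text))  # ['the motor', 'the motor']
```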
![Page 53: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/53.jpg)
Preserving the information structure
• information structure is expressed in different ways in the source and the target
• syntactic clues are exploited in the analysis to compute the information structure (topic-focus articulation)
• information structure is used to guide the generation
![Page 54: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/54.jpg)
An example
Torkarmotorn M2 är sammankopplad med omkopplare S24 och intervallrelä R22. För att inte motorn skall överbelastas, t.ex. om torkarbladen fastnat, finns en inbyggd termovakt som bryter strömmen till motorn när …
Wiper motor M2 is connected to switch S24 and intermittent relay R22. To prevent motor overload, e.g. if the wiper blade gets stuck, there is an integral thermal sensor which breaks the current to the motor when …
![Page 55: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/55.jpg)
Preferences
• syntactic preferences
  – the principle of right association
  – the principle of minimal attachment
  – two-stage processing
• semantic preferences
  – lexical selectional restrictions
  – lexical contextual rules
  – conceptual taxonomies
  – likelihood of occurrence
See further Bennett, P. & Paggio, P., 1993, Preference in Eurotra.
![Page 56: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/56.jpg)
Preferences in Multra
• parsing
  – a formalism for expressing syntactic preferences in the parse
    • not fully developed
• transfer
  – contextual lexical rules
  – rule specificity
• generation
  – rule specificity
![Page 57: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/57.jpg)
Hybrid systems
• aims
• components
• problems
• architecture
• scores
![Page 58: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/58.jpg)
Aims of a hybrid system
• simple techniques for simple tasks
• complex techniques for complex tasks
![Page 59: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/59.jpg)
Components of a hybrid system
• component strategies
  – translation memory
    • full sentences
    • fragments
• direct translation
  – statistical translation
  – EBMT
![Page 60: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/60.jpg)
Component strategies, cont’d
• rule-based translation
  – simplistic analysis (cf. direct translation)
    • word by word (S sequence of words)
    • phrase by phrase (S sequence of phrases)
  – partial parsing
  – full parsing
![Page 61: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/61.jpg)
Problems of a hybrid system
• how does the system know when a simple technique is appropriate?
  – does the source tell?
  – does the target tell?
![Page 62: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/62.jpg)
Architecture and scores
• simple first?
• concerting results?
• scoring?
![Page 63: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/63.jpg)
Improved techniques for re-use of translation
• combining clues for word alignment (Tiedemann 2003)
• interactive word alignment (Ahrenberg et al. 2003)
• parallel treebanks
![Page 64: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/64.jpg)
Translation by structural correspondences
• LFG
• HPSG
![Page 65: Motivations for transfer-based translation](https://reader035.vdocuments.mx/reader035/viewer/2022062518/56814680550346895db39fd1/html5/thumbnails/65.jpg)
Translation of spoken language
See
Krauwer, Steven (ed.), 2000, Machine Translation, June 2000. Volume 15, Issue 1-2, Special issue on Spoken Language Translation.