statistical machine translation with rule based re-ordering of source sentences

31
Statistical Machine Translation with Rule Based Re-ordering of Source Sentences Amit Sangodkar Vasudevan N Om P. Damani (CSE, IIT Bombay)

Upload: odina

Post on 20-Jan-2016

57 views

Category:

Documents


1 download

DESCRIPTION

Statistical Machine Translation with Rule Based Re-ordering of Source Sentences. Amit Sangodkar Vasudevan N Om P. Damani (CSE, IIT Bombay). Motivation. Combining Linguistic knowledge with Statistical Machine Translation. - PowerPoint PPT Presentation

TRANSCRIPT

  • Statistical Machine Translation with Rule Based Re-ordering of Source SentencesAmit SangodkarVasudevan NOm P. Damani(CSE, IIT Bombay)

  • MotivationCombining Linguistic knowledge with Statistical Machine Translation.Can re-ordering source language sentences as per target language improve the alignment?

  • ExampleEnglish: Many Bengali poets have sung songs in praise of this land.

    Hindi:

    Re-order: Many Bengali poets this land of praise in songs sung have

  • Translation Architecture

  • Dependency ParserMany Bengali poets have sung songs in praise of this land.amod (poets-3, Many-1)nn (poets-3, Bengali-2)nsubj (sung-5, poets-3)aux (sung-5, have-4)dobj (sung-5, songs-6)prep_in (sung-5, praise-8)det (land-11, this-10)prep_of (praise-8, land-11)

    ------------------------------------Output of Stanford Parser

  • Tree ProcessingHandling Auxiliary Verbsremove and postfix to their respective verbe.g. aux(sung, have) sung_haveHandling Prepositions/Conjunctionsextract the preposition from the relation and attach to parent/childe.g. prep_in(sung, praise) prep(sung, praise_in)

  • Modified Dependency Tree

  • Re-orderingParent-Child PositioningPrioritizing the Relations

  • Re-ordering (Parent-Child Positioning)parent before child conj (conjunction), appos (apposition), advcl (adverbial clause), ccomp (clausal complement), rcmod (relative clause modifier)e.g. John cried because he fell advcl(cry, fell). In Hindi, cry is ordered before fell.child before parent nsubj(subject), dobj(object)e.g. Ram eats mangodobj(eat,mango). In Hindi, mango ordered before eat.

  • Re-ordering (Relation Priority)Deciding the order in case of multiple childrenPriority among relation pairs

  • Illustration - Re-ordering Input Dependency Treesung_havepoetspraise_insongsland_ofthisManyBengalinsubjprepdobjamodnnprepdet

  • Illustration - Re-ordering sung_havepoetspraise_insongsland_ofthisManyBengalinsubjprepdobjamodnnprepdet

  • Illustration - Re-ordering sung_havepoetspraise_insongsland_ofthisManyBengalinsubjprepdobjamodnnprepdet

  • Illustration - Re-ordering

    Output: Manysung_havepoetspraise_insongsland_ofthisManyBengalinsubjprepdobjamodnnprepdet

  • Illustration - Re-ordering

    Output: Manysung_havepoetspraise_insongsland_ofthisManyBengalinsubjprepdobjamodnnprepdet

  • Illustration - Re-ordering

    Output: Many Bengalisung_havepoetspraise_insongsland_ofthisManyBengalinsubjprepdobjamodnnprepdet

  • Illustration - Re-ordering

    Output: Many Bengali poetssung_havepoetspraise_insongsland_ofthisManyBengalinsubjprepdobjamodnnprepdet

  • Illustration - Re-ordering

    Output: Many Bengali poetssung_havepoetspraise_insongsland_ofthisManyBengalinsubjprepdobjamodnnprepdet

  • Illustration - Re-ordering

    Output: Many Bengali poets thissung_havepoetspraise_insongsland_ofthisManyBengalinsubjprepdobjamodnnprepdet

  • Illustration - Re-ordering

    Output: Many Bengali poets this land ofsung_havepoetspraise_insongsland_ofthisManyBengalinsubjprepdobjamodnnprepdet

  • Illustration - Re-ordering

    Output: Many Bengali poets this land of praise insung_havepoetspraise_insongsland_ofthisManyBengalinsubjprepdobjamodnnprepdet

  • Illustration - Re-ordering

    Output: Many Bengali poets this land of praise insung_havepoetspraise_insongsland_ofthisManyBengalinsubjprepdobjamodnnprepdet

  • Illustration - Re-ordering

    Output: Many Bengali poets this land of praise in songs sung_havepoetspraise_insongsland_ofthisManyBengalinsubjprepdobjamodnnprepdet

  • Illustration - Re-ordering

    Output: Many Bengali poets this land of praise in songs sung have sung_havepoetspraise_insongsland_ofthisManyBengalinsubjprepdobjamodnnprepdet

  • Experimental SetupProcedureTrain Moses using Training data with 6-gram language modelTune the Moses using Development dataDecode Testing data using trained MosesThis experimentation procedure on pure data and reordered data

  • Results

  • Translation Example - IActual : .Baseline : .Re-ordered : .

  • Translation Example - IIActual : .Baseline : deliverance .Re-ordered : .

  • ConclusionUsing Linguistic knowledge appears to improve the SMT qualityBLEU score applicability in this context needs to be investigated

  • AcknowledgementsWe acknowledge the Department of IT (DIT), Government of India and the English-to-Indian Languages (EILMT) consortium for making the EILMT tourism dataset available.IIIT Data Set: Data acquired during DARPA TIDES MT project 2003 and later refined at LTRC,IIIT-H.

  • References[Hieu2008] Hieu Hoang, Philipp Koehn, Design of the Moses Decoder for Statistical Machine Translation, ACL Workshop on Software engineering, testing, and quality assurance for NLP 2008.[Marie2006] Marie-Catherine de Marneffe, Bill MacCartney and Christopher D. Manning, Generating Typed Dependency Parses from Phrase Structure Parses. In Proceedings of LREC-06. 2006.[Manual2008] Stanford Dependencies Manual, Available at http://nlp.stanford.edu/software/dependencies_manual.pdf..[Moses] Moses Tutorial, Available at http://www.statmt.org/moses/?n=Moses.Tutorial. .[Singh2007] Smriti. Singh, Mrugunk. Dalal, Vishal Vachhani, Pushpak Bhattacharyya, Om P. Damani. Hindi Generation from Interlingua (UNL), Machine Translation Summit XI, 2007.