Linguistically-motivated Tree-based Probabilistic Phrase Alignment
Toshiaki Nakazawa, Sadao Kurohashi (Kyoto University)
Outline
- Background
- Tree-based Probabilistic Phrase Alignment Model
- Model Training
- Symmetrization Algorithm
- Experiments
- Conclusions
Background
- Many state-of-the-art SMT systems are based on "word-based" alignment results
  - Phrase-based SMT [Koehn et al., 2003], Hierarchical Phrase-based SMT [Chiang, 2005], and so on
- Some of them incorporate syntactic information "after" word-based alignment
  - [Quirk et al., 2005], [Galley et al., 2006], and so on
- Is this enough? Can it achieve "practical" translation quality?
Background (cont.)
- The word-based alignment model works well for structurally similar language pairs
- It is not effective for language pairs with large differences in linguistic structure, such as Japanese and English (SOV versus SVO)
- For such language pairs, syntactic information is necessary even during the alignment process
Related Work
- Syntactic tree-based models: [Yamada and Knight, 2001], [Gildea, 2003], ITG [Wu, 1997]
  - They incorporate operations that manipulate sub-trees (re-order, insert, delete, clone) to reproduce the opposite tree structure
  - Our model does not require any such operations, and it utilizes dependency trees
- Dependency tree-based model: [Cherry and Lin, 2003]
  - Word-to-word, one-to-one alignment
  - Our model makes phrase-to-phrase alignments and can produce many-to-many links
Features of the Proposed Tree-based Probabilistic Phrase Alignment Model
- A generative model similar to the IBM models
- Uses phrase dependency structures
  - "Phrase" here means a linguistic phrase (cf. phrase-based SMT)
- Phrase-to-phrase alignment model
  - Each phrase (node) basically consists of 1 content word and 0 or more function words
  - Source-side content words can be aligned only to target-side content words (and likewise for function words)
- Generation starts from the root node and ends at one of the leaf nodes (cf. the IBM models generate from the first word to the last)
Outline
- Background
- Tree-based Probabilistic Phrase Alignment Model
- Model Training
- Symmetrization Algorithm
- Experiments
- Conclusions
Dependency Analysis of Sentences
プロピレングリコールは血中グルコースインスリンを上昇させ、血中NEFA 濃度を減少させる
Propylene glycol increases in blood glucose and insulin and decreases in NEFA concentration in the blood
(Figure: phrase dependency trees of the source and target sentences, with word order, head nodes, and root nodes marked)
IBM Model vs. Tree-based Model

IBM Model [Brown et al., 93]:

$\hat{a} = \arg\max_a p(f \mid e, a)\, p(a \mid e)$

$\hat{\theta} = \arg\max_{\theta} \prod_{s=1}^{S} \sum_{a} p(f_s \mid e_s, a)\, p(a \mid e_s)$

Tree-based Model:

$\hat{a} = \arg\max_a p(T_f \mid T_e, a)\, p(a \mid T_e)$

$\hat{\theta} = \arg\max_{\theta} \prod_{s=1}^{S} \sum_{a} p(T_{f,s} \mid T_{e,s}, a)\, p(a \mid T_{e,s})$

where $f$: source sentence, $e$: target sentence, $a$: alignment, $\theta$: parameters, $T_f$: source tree, $T_e$: target tree.
Model Decomposition: Lexicon Probability

Suppose $T_f$ consists of $J$ nodes and $T_e$ consists of $I$ nodes. Then:

$p(T_f \mid T_e, a) = \prod_{j=1}^{J} p(f_j \mid e_{a_j})$

The phrase translation probability $p(f_j \mid e_{a_j})$ is calculated as a product of two probabilities:

$p(f_j \mid e_{a_j}) = p_{cont}(f_{j.cont} \mid e_{a_j.cont})\, p_{func}(f_{j.func} \mid e_{a_j.func})$

Ex) 濃度を - in concentration: $p_{cont}(濃度 \mid \text{concentration})\, p_{func}(を \mid \text{in})$

上昇させ - increase: $p_{cont}(上昇 \mid \text{increase})\, p_{func}(させ \mid \text{EMPTY})$

(This is the first factor of $\hat{a} = \arg\max_a p(T_f \mid T_e, a)\, p(a \mid T_e)$.)
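The decomposition above drops straight into code: each phrase link contributes one content-word factor and one function-word factor. This is only a sketch; the probability tables, the romanized tokens, and the back-off value for unseen pairs are illustrative assumptions, not trained parameters.

```python
# Toy content-word and function-word translation tables (assumed values).
CONT = {("noudo", "concentration"): 0.6, ("joushou", "increase"): 0.5}
FUNC = {("wo", "in"): 0.3, ("sase", "EMPTY"): 0.4}

def lexicon_prob(alignment):
    """p(T_f | T_e, a) = prod_j p_cont(f_j.cont | e_aj.cont) * p_func(f_j.func | e_aj.func)."""
    p = 1.0
    for (f_cont, f_func), (e_cont, e_func) in alignment:
        p *= CONT.get((f_cont, e_cont), 1e-9)  # content aligns to content only
        p *= FUNC.get((f_func, e_func), 1e-9)  # function aligns to function only
    return p

# The two example links from the slide: "noudo wo" - "in concentration",
# "joushou sase" - "increase" (function side EMPTY).
a = [(("noudo", "wo"), ("concentration", "in")),
     (("joushou", "sase"), ("increase", "EMPTY"))]
print(lexicon_prob(a))  # 0.6 * 0.3 * 0.5 * 0.4
```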
Model Decomposition: Alignment Probability

Define the parent node of $f_j$ as $f_{\tilde{j}}$. $p(a \mid T_e)$ is decomposed as a product of target-side dependency relation probabilities, each conditioned on the corresponding source-side relation:

$p(a \mid T_e) = \prod_{j=1}^{J} p(rel(e_{a_j}, e_{a_{\tilde{j}}}) \mid rel(f_j, f_{\tilde{j}}))$

If the parent node $f_{\tilde{j}}$ has been aligned to NULL, $f_{\tilde{j}}$ instead indicates the grandparent of $f_j$, and this continues upward until a node aligned to something other than NULL is found.

The dependency relation probability $p(a \mid T_e)$ models tree-based reordering.

(This is the second factor of $\hat{a} = \arg\max_a p(T_f \mid T_e, a)\, p(a \mid T_e)$.)
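The NULL-skipping rule (fall back to the grandparent while the parent is aligned to NULL) is a short climb up the source tree. A minimal sketch, with an assumed encoding of the tree and the alignment:

```python
def effective_parent(j, parent, aligned):
    """Return the nearest ancestor of node j aligned to something other than NULL."""
    k = parent[j]
    while k is not None and aligned[k] is None:
        k = parent[k]  # parent aligned to NULL: climb to the grandparent
    return k

parent = {2: 1, 1: 0, 0: None}         # node -> parent in the source tree
aligned = {2: "e2", 1: None, 0: "e0"}  # node -> target node, None = NULL

print(effective_parent(2, parent, aligned))  # parent 1 is NULL-aligned, so node 0 is used
```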
Outline
- Background
- Tree-based Probabilistic Phrase Alignment Model
- Model Training
- Symmetrization Algorithm
- Experiments
- Conclusions
Model Training
- The proposed model is trained by the EM algorithm
- First, the phrase translation probability is learned (Model 1)
  - Model 1 can be learned efficiently without approximation (cf. IBM models 1 and 2)
- Next, the dependency relation probability is learned (Model 2), with the probabilities learned in Model 1 as initial parameters
  - Model 2 needs some approximation (cf. IBM models 3 and higher); we use a beam-search algorithm
Model 1

Each phrase $f_j\ (1 \le j \le J)$ on the source side can correspond to an arbitrary phrase $e_i\ (1 \le i \le I)$ on the target side, or to the NULL phrase $e_0$.

The probability of one possible alignment is:

$p(a, T_f \mid T_e) = \prod_{j=1}^{J} p_{cont}(f_{j.cont} \mid e_{a_j.cont})\, p_{func}(f_{j.func} \mid e_{a_j.func})$

Then the tree translation probability is:

$p(T_f \mid T_e) = \sum_a p(a, T_f \mid T_e)$

which is efficiently calculated as:

$\sum_a \prod_{j=1}^{J} p(f_j \mid e_{a_j}) = \prod_{j=1}^{J} \sum_{i=0}^{I} p(f_j \mid e_i)$
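The "efficiently calculated" step is the usual Model-1 factorization: the sum over all alignments of a product of per-node probabilities equals the product of per-node sums. A toy numerical check, with made-up probabilities:

```python
from itertools import product

# Assumed toy values for p(f_j | e_i); e0 plays the role of the NULL phrase.
P = {("f1", "e0"): 0.1, ("f1", "e1"): 0.4, ("f1", "e2"): 0.2,
     ("f2", "e0"): 0.3, ("f2", "e1"): 0.1, ("f2", "e2"): 0.5}
F, E = ["f1", "f2"], ["e0", "e1", "e2"]

# Brute force: enumerate all (I+1)^J alignments and sum the products.
brute = sum(P[("f1", a1)] * P[("f2", a2)] for a1, a2 in product(E, repeat=2))

# Factorized: one sum per source node, then multiply.
fast = 1.0
for f in F:
    fast *= sum(P[(f, e)] for e in E)

print(brute, fast)  # the two quantities agree
```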
Model 2 (imaginary ROOT node)
- The root node of a sentence is supposed to depend on an imaginary ROOT node, which works like the Start-Of-Sentence (SOS) symbol in word-based models
- The ROOT node of the source tree always corresponds to that of the target tree

(Figure: the example source and target dependency trees, each extended with an imaginary ROOT node)
Model 2 (beam-search algorithm)
- It is impossible to enumerate all possible alignments
- We therefore consider only a subset of "good-looking" alignments using a beam-search algorithm
- Ex) beam width = 4

(Figure: the example sentence pair with candidate alignment links, including a NULL target)
Model 2 (beam-search algorithm)

(Figure: the four highest-scoring partial alignments kept in the beam, shown as four copies of the example sentence pair with different link sets)
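The pruning in the figures can be sketched as a generic beam search over partial alignments: extend every hypothesis by one source node, then keep only the beam_width best. The scoring function and candidate sets below are toy stand-ins for the model's probabilities:

```python
def beam_align(source_nodes, candidates, score, beam_width=4):
    beams = [((), 1.0)]  # (partial alignment, probability)
    for f in source_nodes:
        expanded = [(al + ((f, e),), p * score(f, e))
                    for al, p in beams
                    for e in candidates[f] + ["NULL"]]
        expanded.sort(key=lambda h: -h[1])
        beams = expanded[:beam_width]  # prune to the beam width
    return beams

# Assumed toy scores: aligning anything to e1 is likely, NULL is a weak fallback.
toy_score = lambda f, e: {"e1": 0.6, "e2": 0.3, "NULL": 0.1}[e]
best = beam_align(["f1", "f2"], {"f1": ["e1", "e2"], "f2": ["e1", "e2"]}, toy_score)
print(best[0])  # the highest-probability alignment surviving the beam
```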
Model 2 (parameter notations)

The dependency relation $rel(P_1, P_2)$ between two phrases $P_1$ and $P_2$ is defined as a path from $P_1$ to $P_2$, using the following notations:
- "c-" if $P_2$ is a pre-child of $P_1$
- "c+" if $P_2$ is a post-child of $P_1$
- "p-" if $P_1$ is a post-child of $P_2$
- "p+" if $P_1$ is a pre-child of $P_2$
- "INCL" if $P_1$ and $P_2$ are the same phrase
- "ROOT" if $P_2$ is the imaginary ROOT node
- "NULL" if $P_1$ is aligned to NULL

(Figure: tree diagrams illustrating the c-, c+, p-, p+, and ROOT relations between $P_1$ and $P_2$)
Model 2 (parameter notations, cont.)

When $P_1$ and $P_2$ are two or more nodes distant from each other, the relation is described by combining the notations.

Ex) $rel(P_1, P_2) =$ "c-;c+" or "p-;c+;c-"

(Figure: tree diagrams showing the combined paths "c-;c+" and "p-;c+;c-" from $P_1$ to $P_2$)
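One way to compute rel(P1, P2) is to climb from P1 to the lowest common ancestor and then descend to P2, labeling each step. This sketch assumes the sign convention listed above (upward from a post-child is "p-", from a pre-child "p+"; downward to a pre-child is "c-", to a post-child "c+") and an ad-hoc tree encoding:

```python
def rel(p1, p2, parent, side):
    """side[n] is '-' if n is a pre-child of its parent, '+' if a post-child."""
    if p1 == p2:
        return "INCL"
    up, n = [p1], p1                     # ancestors of p1, bottom-up
    while parent[n] is not None:
        n = parent[n]
        up.append(n)
    down, n = [p2], p2                   # climb from p2 until we hit that chain
    while n not in up:
        n = parent[n]
        down.append(n)
    lca = up.index(n)
    steps = ["p-" if side[x] == "+" else "p+" for x in up[:lca]]  # climb to LCA
    steps += ["c" + side[x] for x in reversed(down[:-1])]         # descend to p2
    return ";".join(steps)

parent = {"a": None, "b": "a", "c": "a", "d": "b"}
side = {"b": "-", "c": "+", "d": "-"}
print(rel("d", "c", parent, side))  # two upward steps, then one downward step
```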
Dependency Relation Probability Examples

(Figure: four candidate alignments of the example sentence pair, each scored by a product of dependency relation probabilities such as:)

$p(\text{ROOT} \mid \text{ROOT})\, p(\text{c-} \mid \text{c-})$

$p(\text{ROOT;c-} \mid \text{ROOT})\, p(\text{p-} \mid \text{c-})$

$p(\text{ROOT} \mid \text{ROOT})\, p(\text{c-;c+} \mid \text{c-})$

$p(\text{NULL} \mid \text{ROOT})\, p(\text{ROOT} \mid \text{ROOT;c-})$
Example

(Figure: the aligned dependency trees of 「事例を通して援助の視点に必要なポイントを確認した」 and its English translation, both rooted in ROOT)

$p(a, T_f \mid T_e) = p(確認 \mid \text{was confirmed})\, p(した \mid \text{EMPTY})\, p(\text{ROOT} \mid \text{ROOT})$

$\times\ p(事例 \mid \text{case})\, p(を通して \mid \text{through})\, p(\text{c-;c+} \mid \text{c-})$

$\times\ p(ポイント \mid \text{point})\, p(を \mid \text{EMPTY})\, p(\text{c-} \mid \text{c-})$

$\times\ p(必要な \mid \text{necessary})\, p(\text{EMPTY} \mid \text{EMPTY})\, p(\text{c-} \mid \text{c-})$

$\times\ p(視点 \mid \text{viewpoint})\, p(に \mid \text{in})\, p(\text{c-} \mid \text{c-})$

$\times\ p(援助 \mid \text{assist})\, p(の \mid \text{of})\, p(\text{c-} \mid \text{c-})$
Outline
- Background
- Tree-based Probabilistic Phrase Alignment Model
- Model Training
- Symmetrization Algorithm
- Experiments
- Conclusions
Symmetrization Algorithm
- Since our model is directional, we run it in both directions and symmetrize the two alignment results heuristically
- The symmetrization algorithm is similar to [Koehn et al., 2003], which uses the 1-best GIZA++ word alignment of each direction
- Our algorithm exploits the n-best alignment results of each direction
- Three steps: 1. Superimposition, 2. Growing, 3. Handling isolations
Symmetrization Algorithm 1. Superimposition

The 5-best source-to-target alignments and the 5-best target-to-source alignments are superimposed into a single score matrix.

(Figure: the two 5-best alignment lists and the resulting superimposed score matrix)
Symmetrization Algorithm 1. Superimposition (cont.)
- Definitive alignment points are adopted
  - Points that have no equally or higher scored point in their row or column
- Conflicting points are discarded
  - Points that are in the same row or column as an adopted point and are not contiguous to the adopted point on the tree

(Figure: the score matrix before and after adopting definitive points and discarding conflicting ones)
Symmetrization Algorithm 2. Growing
- Adopt points contiguous to already adopted points in both the source and the target tree
  - In descending order of score, from top to bottom, from left to right
- Discard conflicting points
  - Points that have an adopted point in both their row and their column

(Figure: the score matrix before and after the growing step)
Symmetrization Algorithm 3. Handling Isolation
- Adopt points whose phrases are not aligned to any phrase in either the source or the target language

(Figure: the score matrix before and after adopting isolated points)
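The superimposition and "definitive point" steps can be sketched as follows. The slides do not specify how the n-best scores are combined, so the rank-based weighting here (rank 1 of an n-best list contributes n, rank 2 contributes n-1, ...) is an assumption for illustration:

```python
def superimpose(nbest_s2t, nbest_t2s, rows, cols):
    m = [[0] * cols for _ in range(rows)]
    for nbest in (nbest_s2t, nbest_t2s):
        for rank, links in enumerate(nbest):
            for i, j in links:
                m[i][j] += len(nbest) - rank  # better-ranked alignments weigh more
    return m

def definitive_points(m):
    """Adopt points with no equal-or-higher scored point in their row or column."""
    pts = []
    for i, row in enumerate(m):
        for j, v in enumerate(row):
            if v == 0:
                continue
            rivals = [x for k, x in enumerate(row) if k != j]
            rivals += [m[k][j] for k in range(len(m)) if k != i]
            if all(v > x for x in rivals):
                pts.append((i, j))
    return pts

m = superimpose([[(0, 0), (1, 1)], [(0, 0), (1, 2)]],  # 2-best source-to-target
                [[(0, 0), (1, 1)]],                     # 1-best target-to-source
                rows=2, cols=3)
print(m, definitive_points(m))
```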
Alignment Experiment
- Training corpus
  - Japanese-English paper abstract corpus provided by JST, consisting of about 1M parallel sentences
- Gold-standard alignment
  - 100 sentence pairs from the training corpus, manually annotated
  - Sure (S) alignments only [Och and Ney, 2003]
- Evaluation unit
  - Morpheme-based for Japanese, word-based for English
- Iterations
  - 5 iterations for Model 1, and 5 iterations for Model 2
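With only Sure (S) links annotated, precision, recall, and F-measure reduce to simple set overlaps between the system alignment and the gold links [Och and Ney, 2003]. A minimal sketch with made-up link sets:

```python
def prf(system, sure):
    tp = len(system & sure)          # links found in both sets
    precision = tp / len(system)
    recall = tp / len(sure)
    f = 2 * precision * recall / (precision + recall)
    return precision, recall, f

A = {(0, 0), (1, 2), (2, 2)}          # system alignment links (illustrative)
S = {(0, 0), (1, 1), (1, 2), (3, 3)}  # gold Sure links (illustrative)
print(prf(A, S))  # precision 2/3, recall 2/4
```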
Alignment Experiment (cont.)
- Comparative experiment (word-based alignment)
  - GIZA++ and various symmetrization heuristics [Koehn et al., 2007]
  - Default settings for GIZA++
  - Original forms of words used for both Japanese and English
Results

| Method | Setting | Precision | Recall | F-measure |
|--------|---------|-----------|--------|-----------|
| proposed | 1-best-intersection | 90.92 | 41.69 | 57.17 |
| proposed | 1-best-grow | 83.30 | 54.33 | 65.76 |
| proposed | 3-best-grow | 81.21 | 56.52 | 66.65 |
| proposed | 5-best-grow | 80.59 | 57.33 | 67.00 |
| GIZA++ | intersection | 88.14 | 40.18 | 55.20 |
| GIZA++ | grow | 83.50 | 49.65 | 62.27 |
| GIZA++ | grow-final | 67.19 | 56.91 | 61.63 |
| GIZA++ | grow-final-and | 78.00 | 52.93 | 63.06 |
| GIZA++ | grow-diag | 77.34 | 53.18 | 63.03 |
| GIZA++ | grow-diag-final | 67.24 | 56.63 | 61.48 |
| GIZA++ | grow-diag-final-and | 74.95 | 54.26 | 62.95 |
Example of Alignment Improvement

(Figure: alignment matrices produced by the proposed model and by word-based alignment)
Translation Experiments
- Training corpus: same as the alignment experiments
- Test corpus: 500 paper abstract sentences
- Decoder: Moses [Koehn et al., 2007]
  - Default options except for the phrase table limit (20 -> 10) and the distortion limit (6 -> -1)
  - No minimum error rate training
- Evaluation: BLEU, with no punctuation and case-insensitive
Results

| Method | Setting | Pre | Rec | F | BLEU |
|--------|---------|-----|-----|---|------|
| proposed | 1-best-intersection | 90.92 | 41.69 | 57.17 | 12.73 |
| proposed | 5-best-grow | 80.59 | 57.33 | 67.00 | 15.40 |
| GIZA++ | intersection | 88.14 | 40.18 | 55.20 | 16.35 |
| GIZA++ | grow-diag | 77.34 | 53.18 | 63.03 | 17.89 |
| GIZA++ | grow-diag-final-and | 74.95 | 54.26 | 62.95 | 17.76 |

- The definition of function words is improper (articles? auxiliary verbs? ...)
- A tree-based decoder is necessary: BLEU is essentially insensitive to syntactic structure
- Translation quality is potentially improved
Potentially Improved Example
- Input: これ は LB 膜 の 厚み が アビジン を 吸着 する こと で 増加 した こと に よる 。
- Proposed (30.13): this is due to the increase in the thickness of the lb film avidin adsorb
- GIZA++ (33.78): the thickness of the lb film avidin to adsorption increased by it
- Reference: this was due to increased thickness of the lb film by adsorbing avidin
Conclusion
- A tree-based probabilistic phrase alignment model using dependency tree structures
  - Phrase translation probability
  - Dependency relation probability
- An n-best symmetrization algorithm
- Achieves high alignment accuracy compared to word-based models
  - Syntactic information is useful during the alignment process
- BUT: unable to improve the BLEU scores of translation
Future Work
- A more flexible model
  - Content words sometimes correspond to function words, and vice versa
- Integrate parsing probabilities into the model
  - Parsing errors easily lead to alignment errors
  - By integrating parsing probabilities, parsing results and alignments can be revised complementarily
- More syntactic information
  - Use POS tags or phrase categories in the model
Thank You!