CHAPTER V.
DEVELOPMENT OF TRANSLATION ALGORITHM AND ITS IMPLEMENTATION
As discussed earlier, a human translator recognises
various sentence parts on the basis of the typology of
sentence-constituent formations. In other words, a
translator matches a sentential component and/or a combination
thereof in the given sentence with the typologies of
sentence parts for the purpose of their identification.
Similarly, identification of different syntactic structures
is carried out by phrase structure analysis. Based on the
typology of sentence parts, discussed in Chapter III, it is
possible to formulate a large number of tree structures.
In fact, various types of subjects, predicates, attributes,
objects and adverbial modifiers find reflection in the
so-called NP and VP/RVP structures (traditionally known as
subject and predicate of the sentence respectively), and it
becomes practicable to perform mechanical translation of
Russian sentences, including fairly long syntactic
formations, that fit in NP and VP structures involving
different types of subjects, predicates, etc. It is true
that all permutations and combinations of the primary and
secondary sentence parts may not be logically possible (due
to the predicativity condition), and this factor has been taken
care of while developing the translation program.
Given below are a few examples of different parse
trees/phrase structure trees which explain natural language
grammar (here Russian grammar) in the form of context-free
grammar/logic grammar, showing formal representation of
various sentence parts in the form of phrase structures and
the structure of Russian sentences. (In fact, it is these
parse trees which are to be constructed automatically for
the Russian grammar: our parsing problem, which we shall
discuss next.)
FIG- 1
Grammar:
S    ---> NP VP              N    ---> material
NP   ---> Pro                Pro  ---> on
NP   ---> AP N               Adj  ---> tekstovoi
VP   ---> V NP AdvP          V    ---> chital
AP   ---> Adj                Adv  ---> gromko
AdvP ---> Adv Conj Adv       Adv  ---> pravil'no
                             Conj ---> i
Sentence: on (Sub) chital (Pred) tekstovoi (Attr) material (Obj) gromko i pravil'no
LT: he read textual material loudly and correctly
AT: he read the textual material loudly and correctly
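The grammar of FIG- 1 can be sketched as a Prolog definite clause grammar (DCG) that returns the parse tree as a term. This is only an illustrative sketch in standard (ISO/SWI-style) Prolog, not the program developed in this work; the tree functors (s/2, np/1, vp/3 and so on) are names assumed for the example, and the word pravil'no is written as a quoted atom because it contains an apostrophe.

```prolog
% Phrase-structure rules of FIG- 1; each nonterminal returns its subtree.
s(s(NP, VP))           --> np(NP), vp(VP).
np(np(Pro))            --> pro(Pro).
np(np(AP, N))          --> ap(AP), n(N).
vp(vp(V, NP, AdvP))    --> v(V), np(NP), advp(AdvP).
ap(ap(Adj))            --> adj(Adj).
advp(advp(A1, C, A2))  --> adv(A1), conj(C), adv(A2).

% Lexical rules of FIG- 1.
n(n(material))         --> [material].
pro(pro(on))           --> [on].
adj(adj(tekstovoi))    --> [tekstovoi].
v(v(chital))           --> [chital].
adv(adv(gromko))       --> [gromko].
adv(adv('pravil''no')) --> ['pravil''no'].
conj(conj(i))          --> [i].

% Example query (binds T to the parse tree of the sentence):
% ?- phrase(s(T), [on, chital, tekstovoi, material, gromko, i, 'pravil''no']).
```

The query succeeds only when the whole word list is consumed, which corresponds to the requirement that the sentence fit the NP and VP structures of the figure.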
FIG- 2
Grammar:
S    ---> NP VP              N    ---> studenty
NP   ---> N                  N    ---> kanikuly
NP   ---> AP N               Adj  ---> zimnye
VP   ---> AdvP V NP          V    ---> proveli
AP   ---> Adj                Adv  ---> veselo
AdvP ---> Adv
Sentence: studenty (Sub) veselo (AM) proveli (Pred) zimnye (Attr) kanikuly (Obj)
LT: students merrily spent winter holidays
AT: the students spent winter holidays merrily
FIG- 3
Grammar:
S    ---> NP VP              N    ---> parke
NP   ---> Pro                Pro  ---> ona
NP   ---> N                  Pro  ---> nami
VP   ---> Aux V PP PP        Aux  ---> khochet
PP   ---> Prep NP            V    ---> igrat'
                             Prep ---> s
                             Prep ---> v
Sentence: ona (Sub) khochet igrat' (Pred) s nami (I Obj) v parke (AM)
LT: she wants to play with us in park
AT: she wants to play with us in the park
FIG- 4
Grammar:
S    ---> NP VP              Proper N ---> Ivana
NP   ---> Pro                Pro      ---> oni
NP   ---> Proper N           V        ---> poprosili
VP   ---> V NP VP            V        ---> udalit'sya
VP   ---> V
Sentence: oni (Sub) poprosili (Pred) Ivana (Obj) udalit'sya (I Obj)
LT: they asked Ivan to leave
AT: they asked Ivan to leave
FIG- 5
Grammar:
S    ---> NP VP              Proper N ---> Vladimir
NP   ---> Pro                Proper N ---> Volgograda
NP   ---> Proper N           Pro      ---> nam
VP   ---> V PP PP            V        ---> priekhal
PP   ---> Prep NP            Prep     ---> k
                             Prep     ---> iz
Sentence: Vladimir (Sub) priekhal (Pred) k nam (I Obj) iz Volgograda (AM)
LT: Vladimir came to us from Volgograd
AT: Vladimir came to us from Volgograd
FIG- 6
Grammar:
S    ---> NP VP              N    ---> turisty
NP   ---> N                  N    ---> marte
NP   ---> AP N               Adj  ---> e'ti
VP   ---> Aux V PP           Adj  ---> inostrannye
PP   ---> Prep NP            Aux  ---> khotyat
AP   ---> Adj AP             V    ---> puteshestvovat'
AP   ---> Adj                Prep ---> v
Sentence: e'ti inostrannye (Attribute) turisty (Sub) khotyat puteshestvovat' (Predicate) v marte (AM)
LT: these foreign tourists want to travel in March
AT: these foreign tourists want to travel in March
FIG- 7
Grammar:
S    ---> NP VP              N    ---> lyudi
NP   ---> N                  N    ---> podvig
NP   ---> AP N               N    ---> rodiny
VP   ---> V NP PP            Adj  ---> e'ti
PP   ---> Prep NP            Adj  ---> molodye
AP   ---> Adj AP             Adj  ---> e'tot
AP   ---> Adj                V    ---> sovershili
                             Prep ---> radi
Sentence: e'ti molodye (Attr) lyudi (Sub) sovershili e'tot podvig radi rodiny
LT: these young people accomplished this great feat for the sake of motherland
AT: these young people accomplished this great feat for the sake of motherland
FIG- 8
Grammar:
S    ---> NP VP              Proper N ---> Rossia
NP   ---> Proper N           Proper N ---> OON
NP   ---> N                  N        ---> zakonodatel'stva
NP   ---> AP N               Adj      ---> e'togo
VP   ---> Aux V PP PP        Adj      ---> nespravedlivogo
PP   ---> Prep NP            Aux      ---> prodolzhaet
AP   ---> Adj AP             V        ---> vozrazhat'
AP   ---> Adj                Prep     ---> protiv
                             Prep     ---> v
Sentence: Rossia prodolzhaet vozrazhat' protiv e'togo nespravedlivogo zakonodatel'stva v OON
LT: Russia continues to protest against this unjustified legislation in UNO
AT: Russia continues to protest against this unjustified legislation in UNO
FIG- 9
Grammar:
S    ---> NP VP              N1   ---> knigi
NP   ---> NP1 Conj NP2       N2   ---> karandashi
NP1  ---> N1                 N    ---> stole
NP2  ---> N2                 V    ---> lezhat
NP   ---> N                  Prep ---> na
VP   ---> V PP               Conj ---> i
PP   ---> Prep NP
Sentence: knigi i karandashi (Homogeneous subjects) lezhat (Pred) na stole (AM)
LT: books and pencils are lying on table
AT: books and pencils are lying on the table
FIG- 10
Grammar:
S    ---> NP VP              N    ---> risunki
NP   ---> AP N               N    ---> rassmotreniya
NP   ---> N                  Adj  ---> e'ti
VP   ---> V NP               Adj  ---> nachertannye
AP   ---> Adj AP             Adj  ---> dal'neishego
AP   ---> Adv AP             V    ---> trebuyut
AP   ---> Adj                Adv  ---> neyavno
Sentence: e'ti neyavno nachertannye (Attribute) risunki (Sub) trebuyut (Pred) dal'neishego (Attribute) rassmotreniya (I Obj)
LT: these poorly traced diagrams need further examination
AT: these poorly traced diagrams need further examination
FIG- 11
Grammar:
S    ---> NP VP              N    ---> shofer
NP   ---> N                  N    ---> detei
VP   ---> V NP AdvP PP       N    ---> shkoli
AdvP ---> Adv                V    ---> vez
PP   ---> Prep NP            Adv  ---> domoi
                             Prep ---> iz
Sentence: shofer (Sub) vez (Pred) detei (Obj) domoi (AM) iz shkoli (AM)
LT: chauffeur drove children home from school
AT: the chauffeur drove the children home from the school
FIG- 12
Grammar:
S    ---> NP VP              N1      ---> sestry
NP   ---> NP1 Conj NP2       N2      ---> brata
NP1  ---> Numeral N1         N       ---> ministerstve
NP2  ---> Numeral N2         Adj     ---> e'tom
NP   ---> AP N               Numeral ---> dve
VP   ---> V PP               Numeral ---> tri
PP   ---> Prep NP            V       ---> rabotayut
AP   ---> Adj                Prep    ---> v
Sentence: dve sestry i tri brata (Sub) rabotayut (Pred) v e'tom ministerstve
LT: two sisters and three brothers work in this ministry
AT: two sisters and three brothers work in this ministry
FIG- 13
Grammar:
S    ---> NP Copula AP       N      ---> chelovek
NP   ---> N                  Copula ---> vyglyadel
AP   ---> Adj                Adj    ---> ustalym
Sentence: chelovek (Sub) vyglyadel ustalym (Predicate)
LT: man looked tired
AT: the man looked tired
FIG- 14
Grammar:
S    ---> NP VP              N    ---> professor
NP   ---> N                  N    ---> tetradi
NP   ---> Pro                Pro  ---> nas
NP   ---> AP N               Adj  ---> novye
VP   ---> V NP VP            V    ---> poprosil
VP   ---> V NP               V    ---> prinesti
AP   ---> Adj
Sentence: professor (Sub) poprosil (Pred) nas (Obj) prinesti (I Obj) novye tetradi
LT: professor asked us to bring ...
AT: the professor asked us to bring ...
FIG- 15
Grammar:
S    ---> NP VP              Proper N ---> Galina
NP   ---> Proper N           Pro      ---> tebya
NP   ---> Pro                V        ---> lyubit
VP   ---> NP V
Sentence: Galina (Sub) tebya (Obj) lyubit (Pred)
LT: Galina you loves
AT: Galina loves you
FIG- 16
Grammar:
S    ---> NP VP              Proper N1 ---> Ivan
NP   ---> NP1 Conj NP2       Proper N2 ---> Sergei
NP1  ---> Proper N1          N         ---> zadachi
NP2  ---> Proper N2          Adj       ---> trudneishie
NP   ---> AP N               Aux       ---> pytayutsya
VP   ---> Aux V NP           V         ---> reshit'
AP   ---> Adj                Conj      ---> i
Sentence: Ivan i Sergei (Subject) pytayutsya reshit' (Predicate) trudneishie (Attr) zadachi (Obj)
LT: Ivan and Sergei are trying to solve the most difficult problems
AT: Ivan and Sergei are trying to solve the most difficult problems
FIG- 17
Grammar:
S    ---> NP VP              Proper N ---> Igor
NP   ---> Proper N           N        ---> armiyu
NP   ---> N                  V        ---> postupil
VP   ---> V AdvP PP          V        ---> sluzhit'
PP   ---> Prep NP            Prep     ---> v
AdvP ---> V
Sentence: Igor (Sub) postupil (Pred) sluzhit' (AM of purpose) v armiyu (AM of place)
LT: Igor came to serve in army
AT: Igor came to serve in the army
FIG- 18
Grammar:
S    ---> NP VP              N    ---> knigi
NP   ---> N AP               N    ---> stole
NP   ---> N                  Adj  ---> Vladimira
VP   ---> V PP               Adj  ---> pis'mennom
PP   ---> Prep NP            V    ---> lezhat
AP   ---> Adj                Prep ---> na
(Note: Since Russian does not follow the 'criterion of interruptability', the word "Vladimira" (genitive of Vladimir) with the inflectional suffix 'a' has been taken as a single lexical unit.)
Sentence: knigi (Sub) Vladimira (Attr) lezhat (Pred) na pis'mennom (Attr) stole (AM)
LT: books Vladimir's are lying on writing table
AT: Vladimir's books are lying on the writing table
FIG- 19
Grammar:
S    ---> NP VP              N    ---> mal'chik
NP   ---> N PP               N    ---> shapke
NP   ---> N                  N    ---> sadu
NP   ---> AP N               Adj  ---> mekhovoi
VP   ---> V PP               V    ---> gulyal
PP   ---> Prep NP            Prep ---> v
AP   ---> Adj
Sentence: mal'chik (Sub) v mekhovoi shapke (Attr, without agreement) gulyal (Pred) v sadu (AM)
LT: boy in fur cap strolled in park
AT: the boy in fur cap strolled in the park
FIG- 20
Grammar:
S    ---> NP VP              N    ---> otets
NP   ---> N PP               N    ---> synom
NP   ---> N                  N    ---> plany
VP   ---> V NP               Adj  ---> budushchie
PP   ---> Prep NP            V    ---> razrabatyvali
AP   ---> Adj                Prep ---> s
Sentence: otets (Sub) s synom (Sub) razrabatyvali (Pred) budushchie (Attr) plany (Obj)
LT: father with son worked out future plans
AT: father and son worked out future plans
FIG- 21
Grammar:
S    ---> NP VP              N    ---> syn
NP   ---> AP N               Adj  ---> ego
VP   ---> AdvP V AdvP        Adv  ---> khorosho
AP   ---> Adj                Adv  ---> po-russki
AdvP ---> Adv                V    ---> govorit
Sentence: ego (Attr) syn (Sub) khorosho (AM) govorit (Pred) po-russki (AM)
LT: his son well speaks Russian
AT: his son speaks Russian well
FIG- 22
Grammar:
S    ---> NP VP              N    ---> uchenye
NP   ---> AP N               N    ---> nagradu
VP   ---> V NP PP            N    ---> otkrytie
PP   ---> Prep NP            Adj  ---> e'ti
AP   ---> Adj AP             Adj  ---> molodye
AP   ---> Adj                Adj  ---> vysochaishuyu
                             Adj  ---> novoe
                             V    ---> poluchili
                             Prep ---> za
Sentence: e'ti molodye uchenye poluchili vysochaishuyu nagradu za novoe otkrytie
LT: young scientists got the highest award for their new invention
AT: the young scientists got the highest award for their new invention
FIG- 23
Grammar:
S    ---> NP Copula AP       N      ---> dom
NP   ---> N AP               Adj    ---> moego
AP   ---> Adj AP             Adj    ---> druga
AP   ---> Adj                Adj    ---> bol'shoi
                             Copula ---> φ
Sentence: dom (Sub) moego druga φ bol'shoi (Pred)
LT: house my friend's is big
AT: my friend's house is big
FIG- 24
Grammar:
S    ---> NP Copula NP       Proper N ---> Ivan
NP   ---> Proper N           N        ---> artist
NP   ---> AP N               Adj      ---> talantlivyi
AP   ---> Adj                Copula   ---> φ
(Note: Copula and the link verb are synonymous in our context.)
Sentence: Ivan φ talantlivyi (Attr of the Pred) artist (Pred)
LT: Ivan is a talented artist
AT: Ivan is a talented artist

FIG- 24(a)
Grammar:
S    ---> NP VP              N        ---> komnate
NP   ---> Pro                Pro      ---> nikto
NP   ---> AP N               Adj      ---> e'toi
VP   ---> Particle V PP      V        ---> spal
PP   ---> Prep NP            Prep     ---> v
AP   ---> Adj                Particle ---> ne
Sentence: nikto ne spal v e'toi komnate
LT: nobody not slept in this room
AT: nobody slept in this room
FIG- 25
Grammar:
S    ---> NP VP              N    ---> studentam
NP   ---> Pro                N    ---> knigi
NP   ---> AP N               N    ---> tsentre
VP   ---> V NP NP PP         Pro  ---> oni
PP   ---> Prep NP            Adj  ---> nashim
AP   ---> Adj                Adj  ---> interesnye
                             Adj  ---> kul'turnom
                             V    ---> podarili
                             Prep ---> v
Sentence: oni (Sub) podarili (Pred) nashim (Attr) studentam (I Obj) interesnye (Attr) knigi (Obj) v kul'turnom (Attr) tsentre (AM)
LT: they presented our students interesting books in Cultural centre
AT: they presented our students interesting books in the Cultural centre
FIG- 26
Grammar:
S    ---> NP VP              Numeral ---> dve
NP   ---> Numeral AP N       N       ---> knigi
NP   ---> N                  N       ---> stole
VP   ---> V PP               Adj     ---> malen'kie
PP   ---> Prep NP            V       ---> lezhat
AP   ---> Adj                Prep    ---> na
(Note: Here the term Numeral has been used both as a functional label and as a class label.)
Sentence: dve (Numeral) malen'kie (Attr) knigi (Sub) lezhat (Pred) na stole (AM)
LT: two small books are lying on table
AT: two small books are lying on the table
FIG- 27
Grammar:
S    ---> NP VP              Proper N ---> Viktorom
NP   ---> Pro PP             N        ---> slovar'
NP   ---> Proper N           Pro      ---> my
NP   ---> AP N               Adj      ---> novyi
VP   ---> V NP               Adj      ---> matematicheskii
PP   ---> Prep NP            V        ---> sostavili
AP   ---> Adj AP             Prep     ---> s
AP   ---> Adj
Sentence: my (Sub) s Viktorom sostavili (Pred) novyi matematicheskii (Attr) slovar' (Obj)
LT: we with Victor compiled new mathematical dictionary
AT: Victor and I compiled a new mathematical dictionary
FIG- 28
Grammar:
S    ---> NP VP              N    ---> mesta
NP   ---> Pro                N    ---> gorode
NP   ---> AP N               Pro  ---> ya
VP   ---> Aux V NP PP        Adj  ---> interesnye
PP   ---> Prep NP            Adj  ---> e'tom
AP   ---> Adj AP             Adj  ---> istoricheskom
AP   ---> Adj                Aux  ---> khochu
                             V    ---> posetit'
                             Prep ---> v
Sentence: ya (Sub) khochu posetit' (Predicate) interesnye (Attr) mesta (Obj) v e'tom istoricheskom (Attribute) gorode (AM)
LT: I want to visit interesting places in this historical city
AT: I want to visit interesting places in this historical city
We have listed above a few illustrative examples of parse trees.
Here the following points may be noted:
1) For the purpose of simplification we have not
reflected punctuation marks in the parse trees.
However, this could be done as follows:
[Parse tree: the sentence node S branches into the main sentence (NP VP) and a final punctuation node, illustrated on the English sentence "He is reading a book."]
2) In Russian, numerals, in a way, are attributes;
however, in the present work they (like determiners in
English) have been used both as a functional label
and as a class label. In English, determiners
("members of a subclass of English adjectival words
that limit the nouns they modify in a special way
and that usually are placed before descriptive
adjectives") include a variety of words, for
instance, pre-articles (several of, many of, both
of), articles (def and nondef: the, a, some),
demonstrative pronouns (this, these ...), numbers
(cardinal, ordinal: one, two, three ..., first,
second ...). In our presentation, we have taken
pronominal adjectives, demonstrative pronouns and ordinal
numbers in the category of adjectives as they are
clearly marked as attributes (normally adjectives).
Here it may be added that the term "determiner" has
been used, and not only "articles" (a term used for
determiners in traditional English grammar), as the latter is
generally restricted to the definite article (the)
and the indefinite article (a).
3) In schemes (parse trees) and in earlier explanations
also we have indicated the preposition as a constituent
element of a sentence part (cf. subject) under the "head"
word. For instance:
On zhivet v Moskve.
AM: v Moskve
(He lives in Moscow).
Ona priekhala s det'mi.
I Obj: s det'mi
(She came with children)
4) Earlier, we discussed the structure of a Russian
sentence in terms of sentence parts (cf. Subject with
Attribute + Predicate + Object with Attribute +
Adverbial Modifier). However, after the realisation
of different sentence parts by various parts of
speech (noun, pronoun ...) the structure of a
Russian sentence may appear as:
Adjective + Noun + Verb + Adjective + Noun +
Preposition + Noun
Thus, sentences (small and large) could be represented in
terms of the parts of speech. In fact, the above
structure of a Russian sentence is a combination of
"groups" (called "phrases") (cf. Noun phrase/Subject
Noun phrase + Verb phrase + Noun phrase/Object Noun
phrase + Prepositional phrase, i.e. NP + VP + NP +
PP). It implies that the identification of these
phrases is actually the identification of the
structure of a sentence. In other words, the
identification of the structure of a sentence
(whether it is a permissible structure or not, i.e.
whether it is a sentence or not) based on different
phrase definitions implies identification of the
constituent phrases themselves.
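The point that identifying the phrases amounts to identifying the sentence structure can be sketched over a list of part-of-speech labels. The following is a minimal illustration in standard Prolog DCG notation, not the program of this work; the label atoms (adjective, noun, verb, preposition) and the nonterminal name structure are assumptions made for the example, and the verb phrase is taken here as a bare verb, as in the NP + VP + NP + PP formula above.

```prolog
% Phrase definitions over part-of-speech labels (assumed label names).
np --> [adjective], [noun].
np --> [noun].
vp --> [verb].
pp --> [preposition], np.

% The sentence structure NP + VP + NP + PP from the text.
structure --> np, vp, np, pp.

% Example query (succeeds for the part-of-speech sequence given above):
% ?- phrase(structure,
%           [adjective, noun, verb, adjective, noun, preposition, noun]).
```

A sequence such as [noun, verb] is rejected by this sketch, because the object noun phrase and the prepositional phrase cannot be identified.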
5) The homonymous word forms (homographs), for
instance, "Ivana" (genitive singular of the proper
noun "Ivan") and "Ivana" (accusative singular of the
proper noun "Ivan"), have been kept in the
categories of adjectives and nouns respectively, as
the former denotes an attribute and the latter acts as
an object. The following comparison of two Russian
sentences and their adequate translations may well
illustrate this point.
Knigi Ivana lezhat na stole (Attr/Adj)
(Ivan's books are lying on the table)
Ya videl Ivana v biblioteke (Object/Noun)
(I saw Ivan in the library)
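The treatment of the homograph can be sketched by entering the same word form twice in the vocabulary, once per category, and letting the context select the reading. This is a hedged illustration in standard Prolog only; the predicate reading/2 and the small vocabulary are hypothetical and are not taken from the program of this work.

```prolog
% Vocabulary: the form "ivana" is entered twice, once per category.
adj(ivana).      % genitive singular, attribute ("Ivan's")
noun(ivana).     % accusative singular, object ("Ivan")
noun(knigi).
pro(ya).
verb(lezhat).
verb(videl).

% reading(+Words, -Role): which reading of the second or third word
% fits its context (hypothetical helper for this illustration).
% A noun followed by an adjective-class word: attribute (NP ---> N AP).
reading([N, W | _], attribute) :- noun(N), adj(W).
% A verb followed by a noun-class word: object (VP ---> V NP).
reading([_, V, W | _], object) :- verb(V), noun(W).

% Example queries:
% ?- reading([knigi, ivana, lezhat, na, stole], R).
% ?- reading([ya, videl, ivana, v, biblioteke], R).
```

The first query selects the attribute reading and the second the object reading, without any explicit representation of case.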
Since we are not representing case distinctions in PROLOG,
the above approach has helped in resolving the problems of
homographs. We shall now explain the process of mechanical
translation by taking a grammar for simple Russian.
Russian-English Mechanical Translation (using PROLOG)
Often in translation it is necessary to understand
what is being said before a proper translation can be made.
Since a sentence in a language is much more than just an
arbitrary sequence of words, a crude word-for-word approach
to translation obviously cannot be acceptable. To make any
headway, we have got to be able to analyse the structure of
a sentence, i.e. to parse it. To do this we will first
require a simple Russian grammar to be defined.
A grammar for a language such as Russian is a set of
rules for specifying what sequences of words are acceptable
as sentences of that language. It specifies how the words
must group together into phrases and what orderings of these
phrases are allowed. Given a grammar for a language (here
Russian) we can look at any sequence of words and see
whether it meets the criteria for being an acceptable sentence.
Let us define the general structure of a subset of Russian
simple sentences with the help of the following context-free
grammar:
sentence     ---> noun_phrase, verb_phrase.
noun_phrase  ---> numeral, adjective, noun.
noun_phrase  ---> adjective, noun.
noun_phrase  ---> name.
verb_phrase  ---> verb, prep_phrase.
verb_phrase  ---> verb.
prep_phrase  ---> preposition, noun_phrase.
prep_phrase  ---> noun_phrase.

numeral      ---> dve.           ("two")
numeral      ---> tri.           ("three")
noun         ---> knigi.         ("books")
noun         ---> ruchki.        ("pens")
noun         ---> stole.         ("table")
name         ---> Ivana.         (-)
name         ---> Vladimir.      (-)
verb         ---> lezhat.        ("are lying")
verb         ---> stoyat.        ("are standing")
verb         ---> spit.          ("sleeps")
verb         ---> napisal.       ("wrote")
adjective    ---> malen'kie.     ("small")
adjective    ---> pis'mennom.    ("writing")
adjective    ---> derevyannom.   ("wooden")
preposition  ---> na.            ("on")
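Why a crude word-for-word approach is inadequate can be seen by substituting the glosses listed above for the Russian words one by one. The sketch below is an illustration in standard Prolog only; the predicate names gloss/2 and literal/2 are assumptions for the example, and words containing an apostrophe are written as quoted atoms.

```prolog
% Vocabulary with the English glosses from the grammar listing.
gloss(dve, two).
gloss(tri, three).
gloss(knigi, books).
gloss(ruchki, pens).
gloss(stole, table).
gloss(lezhat, 'are lying').
gloss(stoyat, 'are standing').
gloss(spit, sleeps).
gloss(napisal, wrote).
gloss('malen''kie', small).
gloss('pis''mennom', writing).
gloss(derevyannom, wooden).
gloss(na, on).

% literal(+RussianWords, -EnglishWords): word-for-word substitution,
% with no analysis of the sentence structure at all.
literal([], []).
literal([R | Rs], [E | Es]) :- gloss(R, E), literal(Rs, Es).

% Example query:
% ?- literal([dve, 'malen''kie', knigi, lezhat, na, 'pis''mennom', stole], E).
```

The query yields the word sequence "two small books are lying on writing table", a literal rendering that lacks the article of the adequate translation; supplying it requires knowing that "writing table" forms a noun phrase, which is what parsing provides.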
The grammar consists of a set of rules, here shown one
to a line. Each rule specifies a form that a certain kind
of phrase can take. The first rule says that a sentence
consists of a phrase called a noun_phrase followed by a
phrase called a verb_phrase (a sentence can take the form:
a noun_phrase followed by a verb_phrase). The
second to eighth rules of the grammar tell us what
constitute grammatical forms of the other phrases. The
other rules in the grammar say how some phrases can be
made up in terms of actual words, rather than in
terms of smaller phrases. The things on the right hand side
name actual words of the language (Russian), so that the
rule
numeral ---> dve.
can be read as:
A numeral can take the form: the word dve. Having
explained the grammar, we can begin to see what sequences of
words are actually grammatical sentences according to it
(since this is a small grammar, it will accept only sentences
formed out of fifteen different words). If we wish to
investigate whether a given sequence of words is actually a
sentence according to these criteria, we need to apply the
first rule to ask:
Does the sequence decompose into two phrases, such that the first is an acceptable noun_phrase and the second is a valid verb_phrase?
Then in order to test whether the first phrase is a
noun_phrase, we need to apply the second rule, asking:
Does it decompose into a numeral followed by an adjective and then followed by a noun?
and so on. At the end, if we succeed, we will have looked
at all the phrases and sub-phrases of the sentence, as
specified by the grammar, and will have established a
structure like, for instance, the one shown in the figure.
sentence
   noun_phrase
      numeral        dve          ("two")
      adjective      malen'kie    ("small")
      noun           knigi        ("books")
   verb_phrase
      verb           lezhat       ("are lying")
      prep_phrase
         preposition    na           ("on")
         noun_phrase
            adjective   pis'mennom   ("writing")
            noun        stole        ("table")

This diagram (a parse tree for the sentence) shows
the phrase structure of the sentence.
We have seen how having a grammar for Russian means
that we can construct parse trees to show the structure of
Russian sentences. The problem of constructing a parse
tree for a sentence, given a grammar, is called, as
mentioned earlier, the parsing problem. A computer program
that constructs parse trees for sentences of a language is
called a parser.
Now let us see how a Russian sentence is parsed
according to the above-mentioned grammar. Since a Prolog
program (parser) which parses simple Russian sentences
automatically involves testing to see if something is a
sentence, let us define a predicate sentence. This predicate
will succeed if a sentence can be parsed properly, and fail if
it cannot. For example, if this predicate is applied to the
sentence
dve malen'kie knigi lezhat na e'tom pis'mennom stole
it should fail, since this sentence cannot be parsed
according to our grammar (the word e'tom is not in the
vocabulary, and no rule allows two adjectives before a noun).
Now consider the problem of checking that a list of
words (a sentence is held as a Prolog list) is a valid
sentence. A list of words is a valid sentence if:
1) the list has a valid noun phrase at the front;
2) what is left after the noun phrase is a valid verb
phrase.
Therefore, given the list of words:
| dve | malen'kie | knigi | lezhat | na | pis'mennom | stole |
the first question which needs to be asked is:
Does this list have a valid noun phrase at the front?
If so, remove the noun phrase from the list.
The answer to this question will turn out to be yes
(according to our grammar); removing the noun phrase from
the front leaves the list:
| lezhat | na | pis'mennom | stole |
The question to be asked at this stage is:
Does this list have a valid verb phrase at the front?
If so, remove the verb phrase from the list.
Again the answer to this turns out to be yes; removing the
verb phrase from the list leaves just the empty list.
So the strategy for parsing a list of words may be
described as follows:
1) Take the input list.
2) Identify each of the expected components or phrases,
one at a time.
3) Each time a phrase is identified, remove it from the
front of the list.
4) This remainder is used as the list for identifying
the next component.
5) At the very end, there will be a remainder list
(possibly empty); this will be returned by the
process as left over.
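The five steps above can be sketched as one generic predicate that works through a list of expected components, handing the remainder of each step to the next. This is a minimal illustration in standard Prolog, assuming the built-in call/3; the predicate name parse_all/3 and the two toy components are hypothetical and serve only to make the sketch runnable.

```prolog
% parse_all(+Components, +In, -Out): steps 1 to 5 of the strategy.
% Each component is the name of a predicate taking an input list and
% returning the remainder after stripping itself from the front.
parse_all([], Rest, Rest).            % step 5: return what is left over
parse_all([C | Cs], In, Out) :-
    call(C, In, Temp),                % steps 2 and 3: identify and remove
    parse_all(Cs, Temp, Out).         % step 4: continue with the remainder

% Two toy components for illustration.
numeral([dve | Rest], Rest).
noun([knigi | Rest], Rest).

% Example query:
% ?- parse_all([numeral, noun], [dve, knigi, lezhat], Out).
% Out = [lezhat].
```

In the parser described in the following pages the same threading of remainder lists is written out by hand in each predicate rather than through a generic helper.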
Parsing a sentence is just a special case of parsing
any type of phrase or structure. Therefore, for each sort
of structure (a sentence, a noun phrase, or any other
structure) we will have a separate predicate. Each
predicate will take an input list, and will specify the
conditions under which there is an occurrence of that
structure at the front of the list. If the predicate
succeeds, it will return, as an output list, what is left
over after the structure has been stripped off at the front. If
the input list does not meet the conditions, the predicate
should fail.
Based on the above understanding, a parser comprises
a set of predicates defining the validity requirements of
different structures, and vocabulary definitions.
Let us first consider the predicate which defines
what a valid sentence is. As with any other predicate which
parses a structure, there will be an input list and an output
(remainder) list:
   sentence([---input list---], [---output list---]) :-
In completing this predicate, we will assume the existence
of whatever other predicates are required. The actual
predicate is:
   sentence(In, Out) :- noun_phrase(In, Temp), verb_phrase(Temp, Out).
This states that the input list In has a valid sentence at
the front (returning the left-over list of words called Out)
if In starts with a noun phrase (leaving an intermediate, or
temporary, list Temp) and Temp starts with a verb phrase
(leaving the remainder list Out). The division of the list
can be shown diagrammatically as follows:
   |---noun phrase---|---verb phrase---|---Out---|
   |<-------------- input list In -------------->|
                     |<--------- Temp ---------->|
                                       |remainder|
Assuming that the predicates noun_phrase and verb_phrase
have been defined, we could now ask
   ?- sentence([dve, malen'kie, knigi, lezhat, na, pis'mennom, stole], []).
This would return the answer yes. Here the query specifies
that the remainder list must be empty.
Now consider the definition of noun_phrase:
   noun_phrase([---input list---], [---output list---]).
According to our grammar, there are three possible forms a
valid noun phrase can take, so we will have three separate
Prolog clauses (i.e. statements of facts in Prolog), each of
which will define one valid form of a noun_phrase:
1st case:
   noun_phrase ---> numeral, adjective, noun
The detailed format of the input list of words which has a
noun phrase of this sort at the front (and which has a
left-over list called Out) is as follows:
   | numeral | adjective | noun |----Out----|
   |<--------------- input ---------------->|
                                |-remainder-|
The predicate which checks that the input list conforms to
this pattern is:
   noun_phrase([Num, Adj, Nn | Out], Out) :- numeral(Num), adjective(Adj), noun(Nn).
(Here it may be noted that if number and gender features are
to be included, then the above predicate will look like:
   noun_phrase([Num, Adj, Nn | Out], Out, N, G) :-
       numeral(Num, N, G),
       adjective(Adj, N, G),
       noun(Nn, N, G).
However, as mentioned earlier, in the present work we are
not considering feature set representation in this form.)
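How such number and gender features would enforce agreement can be sketched with a small feature-bearing vocabulary. This is an illustration in standard Prolog only and is not part of the program of this work; the feature values pl, sg, f and m attached to each word are assumptions made for the example.

```prolog
% Vocabulary with number (sg/pl) and gender (f/m) features
% (illustrative feature assignments, assumed for this sketch).
numeral(dve, pl, f).
adjective('malen''kie', pl, _).   % plural form, gender left open
noun(knigi, pl, f).
noun(stole, sg, m).

% Noun phrase with agreement: the same N and G must be shared by
% the numeral, the adjective and the noun.
noun_phrase([Num, Adj, Nn | Out], Out, N, G) :-
    numeral(Num, N, G),
    adjective(Adj, N, G),
    noun(Nn, N, G).

% Example queries:
% ?- noun_phrase([dve, 'malen''kie', knigi, lezhat], Out, N, G).
% ?- noun_phrase([dve, 'malen''kie', stole], Out, N, G).
```

The first query succeeds with the remainder [lezhat], while the second fails because the features of stole do not agree with those of dve.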
The above predicate takes an input list of words and
decomposes it; the first word is called Num, the second Adj,
the third Nn, and the remaining list of words is called
Out. It may be noted that it is by placing this same
remainder list in the output position that we strip off the noun
phrase from the input list. Next, the first three words
are tested to check that they are of the right class. If
the decomposition and the checks all succeed, then the
predicate succeeds. For instance, the query (Goal in Turbo
Prolog)
   ?- noun_phrase([dve, malen'kie, knigi, stoyat], Out).
would succeed, setting Out to the list [stoyat], whereas the
query
   ?- noun_phrase([dve, malen'kie, interesnye, knigi, stoyat], Out).
would fail at the third condition, in which
the third word in the list (referred to within the
predicate as Nn) is being tested by the predicate noun. The
condition:
   noun(interesnye)
would fail, assuming that interesnye has been defined as an
adjective.
2nd case:
   noun_phrase ---> adjective, noun
The detailed format of an input list which has a noun
phrase of this second type at its front could be shown
diagrammatically as follows:
   | adjective | noun |----Out----|
   |<---------- input ----------->|
                      |-remainder-|
A second definition of the predicate noun_phrase (which
succeeds only if the input list has this structure) is as
follows:
   noun_phrase([Adj, Nn | Out], Out) :- adjective(Adj), noun(Nn).
3rd case:
   noun_phrase ---> name
In this case, the input list must merely start with a
word which has been defined to be a name. Everything after
this word in the list is the remainder. The predicate is
therefore:
   noun_phrase([Name | Out], Out) :- name(Name).
The following example queries may well illustrate the
operation of noun_phrase:
(i)   ?- noun_phrase([ivan, spit], [spit]).
This succeeds; the first two definitions of noun_phrase are
tried and fail, but the third clause matches.
(ii)  ?- noun_phrase([dve, malen'kie, knigi, lezhat, na, pis'mennom, stole], Out).
The first clause defining noun_phrase successfully parses
this input list, and produces the answer:
   Out = [lezhat, na, pis'mennom, stole]
(iii) ?- noun_phrase([na, pis'mennom, stole], Out).
(cf. Is the input list na, pis'mennom, stole a noun phrase?)
This will fail as none of the three clauses for
noun_phrase will match this input list.
Verb phrases
A valid verb phrase can be defined by two alternative
clauses for a predicate verb_phrase. This predicate should
succeed if the input list of words begins with a valid verb
phrase, in which case the list of remaining words should be
returned as the output list:
   verb_phrase([---input list---], [---output list---]).
Again, let us consider each of the two cases of a
verb phrase as defined in the grammar separately.
1st case:
   verb_phrase ---> verb, prep_phrase
For the input list of words to start with a verb phrase of
this particular form, it must have the following structure:
   | verb |---prep phrase---|---Out---|
   |<------------ input ------------->|
          |<--------- Temp ---------->|
                            |remainder|
An input list which has this structure is defined by
the predicate:
   verb_phrase([V | Temp], Out) :- verb(V), prep_phrase(Temp, Out).
This states that an input list of words is a valid verb
phrase of this type if the first word is a verb, and if what
comes after this verb (the list Temp) starts with a
preposition phrase. The left-over from prep_phrase is then
the left-over from the whole verb phrase.
2nd case:
   verb_phrase ---> verb
The clause which defines this alternative succeeds if the
first word is a verb, and returns the input list of words
minus this first word:
   verb_phrase([V | Out], Out) :- verb(V).
For example, for the query:
   ?- verb_phrase([lezhat, na, pis'mennom, stole], Out).
the first clause is selected, setting Out to the empty list.
Likewise, the query
   ?- verb_phrase([spit], []).
will succeed.
Preposition phrases
The grammar states that there are two forms which a
valid preposition phrase can take. We will consider each
case separately, and define a clause for each one.
1st case:
   prep_phrase ---> preposition, noun_phrase
An input list of words which has this structure is defined
by the predicate:
   prep_phrase([Prep | Temp], Out) :- preposition(Prep), noun_phrase(Temp, Out).
2nd case:
   prep_phrase ---> noun_phrase
This is defined by:
   prep_phrase(In, Out) :- noun_phrase(In, Out).
For example, the following query will succeed:
   ?- prep_phrase([na, pis'mennom, stole], []).
The first clause for prep_phrase will be selected to parse
this input list of words.
Gathering together all the above-mentioned predicates,
plus the vocabulary definitions, gives us a parsing program
which checks for proper structure of simple Russian
sentences.
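The assembled recogniser can be sketched as follows. This is only a minimal sketch under stated assumptions: the noun_phrase and sentence clauses are reconstructed from the trace and grammar given in this chapter, the vocabulary is the word list used in the examples, and atoms that contain an apostrophe are written quoted with the apostrophe doubled (e.g. 'malen''kie'), which standard Prolog requires.

```prolog
% Vocabulary (Russian word classes used in the examples of this chapter).
numeral(dve).
numeral(tri).
adjective('malen''kie').      % apostrophe inside an atom: quote and double it
adjective('pis''mennom').
noun(knigi).
noun(ruchki).
noun(stole).
name(ivan).
name(vladimir).
verb(lezhat).
verb(stoyat).
verb(spit).
verb(napisal).
preposition(na).

% sentence --> noun_phrase, verb_phrase
sentence(In, Out) :- noun_phrase(In, Temp), verb_phrase(Temp, Out).

% noun_phrase --> numeral, adjective, noun
noun_phrase([Num, Adj, Nn|Out], Out) :- numeral(Num), adjective(Adj), noun(Nn).
% noun_phrase --> adjective, noun
noun_phrase([Adj, Nn|Out], Out) :- adjective(Adj), noun(Nn).
% noun_phrase --> name
noun_phrase([Nm|Out], Out) :- name(Nm).

% verb_phrase --> verb, prep_phrase
verb_phrase([V|Temp], Out) :- verb(V), prep_phrase(Temp, Out).
% verb_phrase --> verb
verb_phrase([V|Out], Out) :- verb(V).

% prep_phrase --> preposition, noun_phrase
prep_phrase([Prep|Temp], Out) :- preposition(Prep), noun_phrase(Temp, Out).
% prep_phrase --> noun_phrase
prep_phrase(In, Out) :- noun_phrase(In, Out).
```

A query such as ?- sentence([ivan, napisal, dve, 'malen''kie', knigi], []). can then be used to check whether a list of words is a well-formed simple sentence of this grammar.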
Having defined the parsing program, let us now trace
the evaluation of the query (i.e. evaluation of the goal):
sentence([ivan, napisal, dve, malen'kie, knigi], Out)?
  noun_phrase([ivan, napisal, dve, malen'kie, knigi], Temp)?
    {Try first clause for noun_phrase}
    numeral(ivan)? - fails
    {Try second clause for noun_phrase}
    adjective(ivan)? - fails
    {Try final clause for noun_phrase}
    name(ivan)? - succeeds
  - succeeds, setting Temp to [napisal, dve, malen'kie, knigi]
  verb_phrase([napisal, dve, malen'kie, knigi], Out)?
    {Try the first clause for verb_phrase:}
    {initially sets V = napisal, Temp = [dve, malen'kie, knigi]}
    verb(napisal)? - succeeds
    prep_phrase([dve, malen'kie, knigi], Out)?
      {Try the first clause for prep_phrase:}
      preposition(dve)? - fails
      {Try the second clause for prep_phrase}
      noun_phrase([dve, malen'kie, knigi], Out)?
        {Try the first clause for noun_phrase:}
        {initially sets Out = []}
        numeral(dve)? (i.e. is dve a numeral?) - succeeds
        adjective(malen'kie)? - succeeds
        noun(knigi)? - succeeds
      - succeeds, returning output list []
    - succeeds, returning output list []
  - succeeds, returning output list []
As is known, the translation of a sentence (say,
from Russian to English) has to be done by building the
English structure tree/parse tree from the Russian text
version of the sentence which is recognised (parsed). For
instance, the Russian sentence, in the text form:
dve malen'kie knigi lezhat na pis'mennom stole
would be 'translated' (transferred) into a
representation which we will call structured English,
represented by the following tree:
[Figure: structured English tree of the sentence]
Now we will look at the task of how to build an
English structure tree from a Russian sentence, by extending
our parser program so that it builds a description of the
structure tree as the various parts of the sentence are
recognised by the parser program.
Building a Structure Tree from Russian
We can already recognise the different components of
a Russian sentence using our parser program. Now all we
have to do is to build a Prolog structure for every part
which is recognised.
A component of a structure tree can be represented in
Prolog as a structured object. Each different kind of
structure will be represented by its own kind of object.
The components of this object may themselves be structured
objects. In general, a structure tree which represents a
grammatical structure or phrase of a particular kind, and
with several components, is represented by the Prolog object:
kind(C1, C2, C3, ..., Cn)
First we will work through each kind of structure, and
outline the kind of object used to represent each case.
Then the problem of building these structure trees
automatically in Prolog will be tackled. We will start with
the simplest kind of structure: a basic word.
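To make the idea of a tree as a structured object concrete, the following small sketch shows how such an object can be taken apart again into its kind and its components. The predicate names example_tree and tree_kind are our own illustrative choices and are not part of the translation program; the sketch relies only on the standard Prolog operator =.. ("univ").

```prolog
% A structure tree is an ordinary Prolog compound term: kind(C1, ..., Cn).
% example_tree/1 is a hypothetical fact holding one such tree.
example_tree(noun_phrase(numeral(two), adjective(small), noun(books))).

% tree_kind(+Tree, -Kind, -Components):
% split a tree into the name of its kind and the list of its components.
tree_kind(Tree, Kind, Components) :-
    Tree =.. [Kind|Components].
```

For the example tree, tree_kind yields the kind noun_phrase together with a list of three component trees, each of which can be inspected in the same way.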
Representing Basic Words
Each word will be held as a simple structure, which
specifies the type of word it is and what the word itself
is. Each kind of word is represented in Prolog by its own
kind of structured object
(cf. name(ivan)).
Representing Higher Level Structure
The object representations which are built for
individual words as shown above must be put together to
build higher level object descriptions of phrases and
structures within a sentence. Let us consider each kind of
structure in turn.
Noun Phrases
When we parse a Russian noun phrase, we need to build
a tree structure which will hold the components of the noun
phrase. Since there are three different cases of a noun
phrase, we will consider each case separately.
1st case:
noun_phrase --> numeral, adjective, noun
A noun phrase of this type will be represented by a tree
with three branches (numeral, adjective, noun).
A Prolog object which represents this would be:
noun_phrase(Numeral, Adjective, Noun)
2nd case:
noun_phrase --> adjective, noun
A representation of this type of noun phrase appears to have
two components. However, to avoid confusion between a
noun_phrase object with three components and one with two, we
will regard this second case as though the numeral is
missing, and fill up the vacant numeral position in the
object with a dummy component. The result is a standard
representation for noun phrases, which normally (until we have
a case of unbounded dependencies) have three components. We
can call this dummy component whatever we want, say nil. This
gives a structure tree for the second type of noun phrase as
follows:
[Figure: tree with root noun_phrase and branches nil, adjective, noun]
3rd case:
noun_phrase --> name
There is no need to build a higher level tree which has only
one component; we will just use the tree which was built for
the name.
Verb Phrases
1st case:
verb_phrase --> verb, prep_phrase
The structure tree for this form of phrase is:
[Figure: tree with root verb_phrase and branches verb, prep_phrase]
which is represented in Prolog by the object
verb_phrase(Verb, Prep_phrase)
2nd case:
verb_phrase --> verb
Here we will just use the tree which was built for the verb.
Preposition Phrases
1st case:
prep_phrase --> preposition, noun_phrase
The structure tree for this type of phrase is:
[Figure: tree with root prep_phrase and branches preposition, noun_phrase]
This is represented in Prolog by the object:
prep_phrase(Preposition, Noun_phrase)
2nd case:
prep_phrase --> noun_phrase
In this case, it is sufficient to return the tree for
noun_phrase.
Sentences
The tree for a complete sentence has two components:
[Figure: tree with root sentence and branches noun_phrase, verb_phrase]
which can be held in Prolog as the object:
sentence(Noun_phrase, Verb_phrase)
Building a Structure Tree from Russian (Automatically)
To translate from Russian into English, as mentioned
earlier, will be to translate from Russian text into
Structured English: in other words, to input a list of
Russian words, and produce a representation in Structured
English. For this, we extend our Russian parser so that it
not only analyses the structure of a sentence in Russian,
but also builds an English structure tree for the sentence.
Thus, the structure tree which our extended Russian parser
produces from a list of Russian words must contain
English words, rather than Russian words. For instance, the
Russian noun_phrase:
dve malen'kie knigi
will be represented by the structure tree
            noun_phrase
          /      |      \
    numeral  adjective   noun
       |         |         |
      two      small     books
Thus, it is in this process of building a structured English
representation of a Russian sentence where the actual
translation takes place.
Building Structures for Basic Russian Words
The vocabulary-defining primitives, i.e. basic words
(r_noun, r_verb, and so on) have now to be modified to
return a tree for each basic word. Again, the clause which
builds the tree for the word has to be separated out from
the actual definition of the word itself in the vocabulary,
since the vocabulary proper does not define the structure
tree for each word. So, between the vocabulary and the
parser we interpose a set of simple predicates which build
the structure tree for each Russian word. Again, we rename
the vocabulary predicates, from r_noun, r_verb, and so on,
to be noun_r, verb_r, etc. These old names, such as r_noun,
are used for the intermediate tree-building predicates. The
vocabulary itself now becomes:
numeral_r(dve).
numeral_r(tri).
noun_r(knigi).
noun_r(ruchki).
noun_r(stole).
name_r(ivan).
name_r(vladimir).
verb_r(lezhat).
verb_r(stoyat).
verb_r(spit).
verb_r(napisal).
adjective_r(malen'kie).
adjective_r(pis'mennom).
preposition_r(na).
Now let us consider what English structure trees have to be
built for each type of basic word. For instance, for knigi,
which satisfies:
noun_r(knigi)
the corresponding tree represented as a Prolog object should
be: noun(books)
To construct this object, we will obviously need to know the
English word which corresponds to each Russian word. As
mentioned earlier, this could be specified using the
predicate means:
means(knigi, books).
means(malen'kie, small).
means(pis'mennom, writing).
However, to choose the right English word to store in the
tree, it is desirable to include the English translation of
a Russian word in the vocabulary. Including this as an
extra component gives the revised vocabulary:
numeral_r(dve, two).
numeral_r(tri, three).
noun_r(knigi, books).
noun_r(ruchki, pens).
noun_r(stole, table).
name_r(ivan, ivan).
name_r(vladimir, vladimir).
verb_r(lezhat, are_lying).
verb_r(stoyat, are_standing).
verb_r(spit, sleeps).
verb_r(napisal, wrote).
adjective_r(malen'kie, small).
adjective_r(pis'mennom, writing).
preposition_r(na, on).
In general, the English structure tree for any Russian word
is now defined by a predicate:
r_noun(Russian word, structure tree)
which can be defined in full by:
r_noun(R_word, noun(E_word)) :- noun_r(R_word, E_word).
For instance:
?- r_noun(knigi, Tree).
would produce the response:
Tree = noun(books)
The predicates for other types of Russian word can be
defined similarly:
r_numeral(R_word, numeral(E_word)) :- numeral_r(R_word, E_word).
r_name(R_word, name(E_word)) :- name_r(R_word, E_word).
r_verb(R_word, verb(E_word)) :- verb_r(R_word, E_word).
r_adjective(R_word, adjective(E_word)) :- adjective_r(R_word, E_word).
r_preposition(R_word, preposition(E_word)) :- preposition_r(R_word, E_word).
(Here it may be noted that E_word, in a way, is a translated word.)
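The word-level layer can be tried out on its own with a small sketch such as the one below. It uses only a handful of entries from the revised vocabulary; the predicate r_word is a hypothetical helper of ours (it is not part of the original program) which simply tries each word class in turn. Atoms containing an apostrophe are quoted, as standard Prolog requires.

```prolog
% A few entries of the revised (Russian, English) vocabulary.
numeral_r(dve, two).
noun_r(knigi, books).
name_r(ivan, ivan).
verb_r(spit, sleeps).
adjective_r('malen''kie', small).   % apostrophe quoted and doubled
preposition_r(na, on).

% Intermediate tree-building predicates, one per word class.
r_numeral(R_word, numeral(E_word))         :- numeral_r(R_word, E_word).
r_noun(R_word, noun(E_word))               :- noun_r(R_word, E_word).
r_name(R_word, name(E_word))               :- name_r(R_word, E_word).
r_verb(R_word, verb(E_word))               :- verb_r(R_word, E_word).
r_adjective(R_word, adjective(E_word))     :- adjective_r(R_word, E_word).
r_preposition(R_word, preposition(E_word)) :- preposition_r(R_word, E_word).

% r_word(+R_word, -Tree): hypothetical helper, returns the tree of a
% Russian word whatever its word class.
r_word(R_word, Tree) :- r_numeral(R_word, Tree).
r_word(R_word, Tree) :- r_noun(R_word, Tree).
r_word(R_word, Tree) :- r_name(R_word, Tree).
r_word(R_word, Tree) :- r_verb(R_word, Tree).
r_word(R_word, Tree) :- r_adjective(R_word, Tree).
r_word(R_word, Tree) :- r_preposition(R_word, Tree).
```

Keeping the vocabulary facts apart from the tree-building predicates in this way means that new words can be added without touching the parser clauses.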
Building Trees for Russian Phrases
Now let us consider the higher level trees which are
built using the basic trees for individual Russian words.
For each clause in the Russian parser, we add an extra
parameter, which is the structured English tree to be built.
Consider each clause in turn.
Noun Phrases
1st case:
Extending the earlier clause for parsing Russian
noun phrases of this form gives the following clause:
r_noun_phrase([Num, Adj, Nn|Out], Out, noun_phrase(Num_tree, Adj_tree, Nn_tree)) :-
    r_numeral(Num, Num_tree), r_adjective(Adj, Adj_tree), r_noun(Nn, Nn_tree).
It should be obvious from this that the tree object returned
for the input list: [dve, malen'kie, knigi]
is: noun_phrase(numeral(two), adjective(small), noun(books))
2nd case:
To make sure that both types of noun_phrase object have
three components, the 'missing' numeral should have its
position filled with the dummy object nil. Note that nil
appears only in the tree, not in the input list of words:
r_noun_phrase([Adj, Nn|Out], Out, noun_phrase(nil, Adj_tree, Nn_tree)) :-
    r_adjective(Adj, Adj_tree), r_noun(Nn, Nn_tree).
3rd case:
The tree for this form of noun phrase is just the tree for
the name:
r_noun_phrase([Name|Out], Out, Tree) :- r_name(Name, Tree).
Verb Phrases
1st case:
The previous predicate must now have an extra third
argument, which is the tree for a verb phrase of this form:
r_verb_phrase([V|Temp], Out, verb_phrase(V_tree, P_tree)) :-
    r_verb(V, V_tree), r_prep_phrase(Temp, Out, P_tree).
2nd case:
r_verb_phrase --> r_verb
The tree returned in this case is just the tree built for
the verb, called V_tree:
r_verb_phrase([V|Out], Out, V_tree) :- r_verb(V, V_tree).
Preposition Phrases
1st case:
The previous predicate, which had two arguments, now has a
third, which is the object built to represent the structure
tree for the preposition phrase:
r_prep_phrase([Prep|Temp], Out, prep_phrase(P_tree, N_tree)) :-
    r_preposition(Prep, P_tree), r_noun_phrase(Temp, Out, N_tree).
2nd case:
r_prep_phrase --> r_noun_phrase
Here the tree for the preposition phrase is just the tree
for the noun phrase:
r_prep_phrase(In, Out, P_tree) :- r_noun_phrase(In, Out, P_tree).
Sentences
The last clause to be extended to build the tree
representation is that for processing a sentence. This
builds an object with two components: the trees for the noun
phrase (N_tree) and the verb phrase (V_tree):
r_sentence(In, Out, sentence(N_tree, V_tree)) :-
    r_noun_phrase(In, Temp, N_tree),
    r_verb_phrase(Temp, Out, V_tree).
Putting all these clauses together, and adding structure
trees for basic Russian words and the vocabulary (Russian-
English), gives us a program which parses a Russian sentence
and builds the structure tree for it.
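Assembled in one place, the tree-building parser might look like the sketch below. It follows the clauses given above; the assumptions are that atoms with an apostrophe are written quoted (e.g. 'pis''mennom'), that the two-word renderings of lezhat and stoyat are held as the single atoms are_lying and are_standing, and that the shortened variable names (NumT, AdjT and so on) are ours.

```prolog
% Russian-English vocabulary.
numeral_r(dve, two).
numeral_r(tri, three).
noun_r(knigi, books).
noun_r(ruchki, pens).
noun_r(stole, table).
name_r(ivan, ivan).
name_r(vladimir, vladimir).
verb_r(lezhat, are_lying).
verb_r(stoyat, are_standing).
verb_r(spit, sleeps).
verb_r(napisal, wrote).
adjective_r('malen''kie', small).
adjective_r('pis''mennom', writing).
preposition_r(na, on).

% Trees for basic words.
r_numeral(R, numeral(E))         :- numeral_r(R, E).
r_noun(R, noun(E))               :- noun_r(R, E).
r_name(R, name(E))               :- name_r(R, E).
r_verb(R, verb(E))               :- verb_r(R, E).
r_adjective(R, adjective(E))     :- adjective_r(R, E).
r_preposition(R, preposition(E)) :- preposition_r(R, E).

% Trees for noun phrases (nil fills the missing numeral position).
r_noun_phrase([Num, Adj, Nn|Out], Out, noun_phrase(NumT, AdjT, NnT)) :-
    r_numeral(Num, NumT), r_adjective(Adj, AdjT), r_noun(Nn, NnT).
r_noun_phrase([Adj, Nn|Out], Out, noun_phrase(nil, AdjT, NnT)) :-
    r_adjective(Adj, AdjT), r_noun(Nn, NnT).
r_noun_phrase([Name|Out], Out, Tree) :-
    r_name(Name, Tree).

% Trees for verb phrases.
r_verb_phrase([V|Temp], Out, verb_phrase(VT, PT)) :-
    r_verb(V, VT), r_prep_phrase(Temp, Out, PT).
r_verb_phrase([V|Out], Out, VT) :-
    r_verb(V, VT).

% Trees for preposition phrases.
r_prep_phrase([Prep|Temp], Out, prep_phrase(PT, NT)) :-
    r_preposition(Prep, PT), r_noun_phrase(Temp, Out, NT).
r_prep_phrase(In, Out, Tree) :-
    r_noun_phrase(In, Out, Tree).

% Tree for a complete sentence.
r_sentence(In, Out, sentence(NT, VT)) :-
    r_noun_phrase(In, Temp, NT),
    r_verb_phrase(Temp, Out, VT).
```

Calling r_sentence with the empty list as second argument forces the whole input to be consumed, so that only complete sentences yield a tree.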
As an example of this program's operation, consider the
following query:
?- r_sentence([dve, malen'kie, knigi, lezhat, na, pis'mennom, stole], [], Tree).
This query will succeed, and will produce the
representation of the structure tree as a Prolog object Tree
as shown below.
Tree = sentence(
           noun_phrase(numeral(two), adjective(small), noun(books)),
           verb_phrase(verb(are_lying),
                       prep_phrase(preposition(on),
                                   noun_phrase(nil, adjective(writing), noun(table)))))
Implementation
New fifth generation programming languages,
including Turbo Prolog, and new implementations of these
languages have always been the stimulus for developing
exciting new applications in areas previously untouched by
electronic computing machines and computer technology. In
recent years, Turbo Prolog has become an important
instrument in implementations of AI technology in areas such
as expert systems development and natural language
processing.
MDTS has been implemented in Turbo Prolog (version
2.0). We chose this programming language on account of
its outstanding features, such as closeness to natural
language, modularity, one language for program and data,
logical variables, computation of relations,
understandability, learnability, inbuilt search strategy,
etc.
The MDTS programme mainly consists of five modules:
INPUT, CONTROL, SYNTAC, DICT and OUTPUT. Module INPUT handles
the natural input sentences by transforming the sequence of
written words and punctuation marks into lists of words and
marks, and it checks whether every sentential component is
known.
Module CONTROL is the high level control structure of
all processes occurring during the translation process. It
calls the module SYNTAC for analysing the sentence, deciding
upon its result whether to accept or to refuse it. In the case
of acceptance, it calls the modules DICT and OUTPUT to
interpret it and to perform the appropriate translation.
Module SYNTAC handles a CFG. It takes a list
representing the natural language input sentence and
applies several knowledge sources to yield a logical
structure that contains all the information for a semantic
interpretation. It is able to extract from a set of
possible readings the appropriate reading. The grammar,
being able to generate all and only the sentences of a
sub-set of Russian, recognises whether the input sentences
belong to that sub-set of Russian syntax patterns and
establishes a relation between the written sentence and its
meaning.
Module DICT contains the language specific knowledge
about the written input and output sentences, i.e. a
dictionary of lexical items to support SL-TL rendition.
Module OUTPUT handles the formation of TL sentences.
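The internal design of the OUTPUT module is not described here, so the following is only a hedged sketch of one way in which a TL word list could be read off a structure tree of the kind built earlier in this chapter. The predicates tree_words, parts_words and join are hypothetical names of ours and should not be taken as the actual MDTS code.

```prolog
% tree_words(+Tree, -Words): collect the English words at the leaves of
% a structure tree, left to right. The dummy component nil yields nothing.
tree_words(nil, []).
tree_words(Tree, [Word]) :-
    Tree \== nil,
    Tree =.. [_Kind, Word],      % a basic word: kind(Word)
    atomic(Word).
tree_words(Tree, Words) :-
    Tree =.. [_Kind|Parts],
    Parts = [_, _|_],            % a phrase: two or more components
    parts_words(Parts, Words).

% parts_words(+Trees, -Words): words of a list of component trees.
parts_words([], []).
parts_words([Part|Parts], Words) :-
    tree_words(Part, Ws1),
    parts_words(Parts, Ws2),
    join(Ws1, Ws2, Words).

% join(+Xs, +Ys, -Zs): Zs is Xs followed by Ys (list concatenation).
join([], Ys, Ys).
join([X|Xs], Ys, [X|Zs]) :- join(Xs, Ys, Zs).
```

A full OUTPUT stage would still have to deal with word order, articles and punctuation; the sketch covers only the step from tree to word list.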
Performance of the System (MDTS)
1) The system can handle a variety of syntactic structures
involving principal and secondary sentence parts
(Subject, Predicate, Object, Attribute and Adverbial
Modifier) within the category of grammatically simple
sentences (see appendix III).
2) The system (on experimental basis) can handle a few
types of compound (cf. S --> S1 Conj S2) and complex
sentences (see appendix IV).
3) The system can handle word order at the segment level:
--- dom ottsa: house father's --> father's house
(cf. r_Sub([Word|[Word1|[]]], [T_word|[T_word1|[]]]) :-
         r_noun(Word, T_word1), r_adj(Word1, T_word).)
--- khorosho govorit po-russki: well speaks Russian --> speaks Russian well
(cf. r_Pred([Word|[Word1|[Word2|[]]]], [T_word|[T_word1|[T_word2|[]]]]) :-
         r_adv(Word, T_word2), r_verb(Word1, T_word), r_adv(Word2, T_word1).)
4) The system can handle translational transformations such
as lexical additions and omissions.
Addition:
(On) talantlivyi artist: (he) talented artist --> (he) is talented artist
(cf. r_Pred([Word|[Word1|[]]], [T_word|[T_word1|[T_word2|[]]]]) :-
         r_link_verb(T_word), r_adj(Word, T_word1), r_noun(Word1, T_word2).)
Omission (deletion):
(Nikto) ne spit: (Nobody) not sleeps --> Nobody sleeps
(cf. r_Pred([Word|[Word1|[]]], [T_word1|[]]) :-
         r_particle(Word), r_verb(Word1, T_word1).)
Here it may be added that some regrouping may be required
if the system is extended to include various other types of
syntactic structures.
5) The system can partially resolve homographs by
identifying them as different word classes (in different
syntactic structures)
(cf. 1) uchenyi provodil interesnye e'ksperimenty
        Noun + (Verb + Adj + Noun)
        scientist conducted interesting experiments.
     2) uchenyi sovet prinyal pravil'noe reshenie
        Adj + Noun + (Verb + Adj + Noun)
        academic council took correct decision.)
6) The number agreement (on experimental basis) could be
achieved by distinguishing two kinds of noun phrases and
two kinds of verb phrases (cf. sentence --> singular_noun_phrase,
singular_verb_phrase; sentence --> plural_noun_phrase,
plural_verb_phrase ---). This type of
grammar works correctly but is quite redundant. As
mentioned earlier, additional arguments could be used
to handle grammatical agreement; however, in the
present work we are restricting ourselves only to the
context-free grammar, as per our initial work plan.
7) The system (supported by a large database and a variety
of syntactic structures) works effectively; however, in
the present form, it
i) can generate non-sentences;
ii) cannot differentiate between the homonymous word
forms belonging to the same grammatical category/word class.
cf. 1) professora pishut nauchnye stat'i
       (noun in the nominative plural)
       Professors write research articles.
    2) ya videl professora
       (noun in the accusative singular)
       I saw professor.
Here it may be added that in many cases we have different
word forms for reflecting the above grammatical information.
For instance:
nominative plural           accusative singular
studenty (students)         studenta (student)
pisateli (writers)          pisatelya (writer)
prepodavateli (teachers)    prepodavatelya (teacher)
(As regards the homonymous word form 'professora' (noun in
the genitive singular), we could still differentiate by
including this word form in the grammatical category of
adjectives, for the same acts as an attribute.) In case of
inanimate nouns, the homographs, generally, do not pose any
problems as their lexical meanings are the
same. For instance:
knigi (nominative plural) lezhat na stole
(Books are lying on the table)
ya videl interesnye knigi (accusative plural)
(I saw interesting books)
iii) cannot handle those typical situations where the
sentence parts (say Subject and I Obj) could be
identified only on the basis of grammatical
agreement. This point may be illustrated by
comparing the following two sentences having the
same syntactic structure:
1) My s Viktorom poshli na rynok.
   (Sub)         (plural-verb)
   Victor and I went to the market.
2) Ya s Viktorom poshel na rynok.
   (Sub)         (singular-verb)
   I went to market with Victor.
Here it may be added that the Russian preposition
"na" in the present context (verb of motion +
preposition + noun in the accusative case) means
"to".
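The number-agreement grammar mentioned in point 6 can be illustrated with a very small sketch. It follows the two sentence rules quoted there; the four-word vocabulary (kniga, knigi, lezhit, lezhat) and the one-word phrase clauses are illustrative assumptions of ours and are not taken from the MDTS dictionary.

```prolog
% Illustrative vocabulary, split by number.
singular_noun(kniga).     % "book"
plural_noun(knigi).       % "books"
singular_verb(lezhit).    % "lies"
plural_verb(lezhat).      % "lie"

% sentence --> singular_noun_phrase, singular_verb_phrase
sentence(In, Out) :-
    singular_noun_phrase(In, Temp), singular_verb_phrase(Temp, Out).
% sentence --> plural_noun_phrase, plural_verb_phrase
sentence(In, Out) :-
    plural_noun_phrase(In, Temp), plural_verb_phrase(Temp, Out).

% One-word phrases are enough to show the idea.
singular_noun_phrase([N|Out], Out) :- singular_noun(N).
plural_noun_phrase([N|Out], Out)   :- plural_noun(N).
singular_verb_phrase([V|Out], Out) :- singular_verb(V).
plural_verb_phrase([V|Out], Out)   :- plural_verb(V).
```

With these clauses "knigi lezhat" and "kniga lezhit" are accepted while "kniga lezhat" is rejected, but every phrase rule has to be written twice, which is the redundancy noted in point 6.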