discourse processing algorithms

  • 8/6/2019 Discourse Processing Algorithms

    1/11

    EVALUATING DISCOURSE PROCESSING ALGORITHMS

    Marilyn A. Walker

    Hewlett Packard Laboratories, Filton Rd., Bristol, England BS12 6QZ, U.K.
    & University of Pennsylvania

    lyn%lwalker@hplb.hpl.hp.com

    Abstract

    In order to take steps towards establishing a methodology for evaluating Natural Language systems, we conducted a case study. We attempt to evaluate two different approaches to anaphoric processing in discourse by comparing the accuracy and coverage of two published algorithms for finding the co-specifiers of pronouns in naturally occurring texts and dialogues. We present the quantitative results of hand-simulating these algorithms, but this analysis naturally gives rise to both a qualitative evaluation and recommendations for performing such evaluations in general. We illustrate the general difficulties encountered with quantitative evaluation. These are problems with: (a) allowing for underlying assumptions, (b) determining how to handle underspecifications, and (c) evaluating the contribution of false positives and error chaining.

    1 Introduction

    In the course of developing natural language interfaces, computational linguists are often in the position of evaluating different theoretical approaches to the analysis of natural language (NL). They might want to (a) evaluate and improve on a current system, (b) add a capability to a system that it didn't previously have, or (c) combine modules from different systems.

    Consider the goal of adding a discourse component to a system, or evaluating and improving one that is already in place. A discourse module might combine theories on, e.g., centering or local focusing [GJW83, Sid79], global focus [Gro77], coherence relations [Hob85], event reference [Web86], intonational structure [PH87], system vs. user beliefs [Pol86], plan or intent recognition or production [Coh78, AP86, Sid81], control [WS88], or complex syntactic structures [Pri85]. How might one evaluate the relative contributions of each of these factors or compare two approaches to the same problem?

    In order to take steps towards establishing a methodology for doing this type of comparison, we conducted a case study. We attempt to evaluate two different approaches to anaphoric processing in discourse by comparing the accuracy and coverage of two published algorithms for finding the co-specifiers of pronouns in naturally occurring texts and dialogues [Hob76b, BFP87]. Thus there are two parts to this paper: we present the quantitative results of hand-simulating these algorithms (henceforth Hobbs algorithm and BFP algorithm), but this analysis naturally gives rise to both a qualitative evaluation and recommendations for performing such evaluations in general. We illustrate the general difficulties encountered with quantitative evaluation. These are problems with: (a) allowing for underlying assumptions, (b) determining how to handle underspecifications, and (c) evaluating the contribution of false positives and error chaining.

    Although both algorithms are part of theories of discourse that posit the interaction of the algorithm with an inference or intentional component, we will not use reasoning in tandem with the algorithm's operation. We have made this choice because we want to be able to analyse the performance of the algorithms across different domains. We focus on the linguistic basis of these approaches, using only selectional restrictions, so that our analysis is independent of the vagaries of a particular knowledge representation. Thus what we are evaluating is the extent to which these algorithms suffice to narrow the search of an inference component¹. This analysis gives us some indication of the contribution of syntactic constraints, task structure and global focus to anaphoric processing.

    ¹ But note the definition of success in section 2.1.

    The data on which we compare the algorithms are important if we are to evaluate claims of generality. If we look at types of NL input, one clear division is between textual and interactive input. A related, though not identical, factor is whether the language being analysed is produced by more than one person, although this distinction may be conflated in textual material such as novels that contain reported conversations. Within two-person interactive dialogues, there are the task-oriented master-slave type, where all the expertise, and hence much of the initiative, rests with one person. In other two-person dialogues, both parties may contribute discourse entities to the conversation on a more equal basis. Other factors of interest are whether the dialogues are human-to-human or human-to-computer, as well as the modality of communication, e.g. spoken or typed, since some researchers have indicated that dialogues, and particularly uses of reference within them, vary along these dimensions [Coh84, Tho80, GSBC86, DJ89, WS89].

    We analyse the performance of the algorithms on three types of data. Two of the samples are those that Hobbs used when developing his algorithm. One is an excerpt from a novel and the other a sample of journalistic writing. The remaining sample is a set of 5 human-human, keyboard-mediated, task-oriented dialogues about the assembly of a plastic water pump [Coh84]. This covers only a subset of the above types. Obviously it would be instructive to conduct a similar analysis on other textual types.

    2 Quantitative Evaluation - Black Box

    2.1 The Algorithms

    When embarking on such a comparison, it would be convenient to assume that the inputs to the algorithms are identical and compare their outputs. Unfortunately, since researchers do not even agree on which phenomena can be explained syntactically and which semantically, the boundaries between two modules are rarely the same in NL systems. In this case the BFP centering algorithm and Hobbs algorithm both make ASSUMPTIONS about other system components. These are, in some sense, a further specification of the operation of the algorithms that must be made in order to hand-simulate the algorithms. There are two major sets of assumptions, based on discourse segmentation and syntactic representation. We attempt to make these explicit for each algorithm and pinpoint where the algorithms might behave differently were these assumptions not well-founded.

    In addition, there may be a number of UNDERSPECIFICATIONS in the descriptions of the algorithms. These often arise because theories that attempt to categorize naturally occurring data, and algorithms based on them, will always be prey to previously unencountered examples. For example, since the BFP salience hierarchy for discourse entities is based on grammatical relation, an implicit assumption is that an utterance only has one subject. However the novel Wheels has many examples of reported dialogue such as She continued, unperturbed, "Mr. Vale quotes the Bible about air pollution." One might wonder whether the subject is She or Mr. Vale. In some cases, the algorithm might need to be further specified in order to be able to process any of the data, whereas in others they may just highlight where the algorithm needs to be modified (see section 3.2). In general we count underspecifications as failures.

    Finally, it may not be clear what the DEFINITION OF SUCCESS is. In particular it is not clear what to do in those cases where an algorithm produces multiple or partial interpretations. In this situation a system might flag the utterance as ambiguous and draw in support from other discourse components. This arises in the present analysis for two reasons: (1) the constraints given by [GJW86] do not always allow one to choose a preferred interpretation, (2) the BFP algorithm proposes equally ranked interpretations in parallel. This doesn't happen with the Hobbs algorithm because it proposes interpretations in a sequential manner, one at a time. We chose to count as a failure those situations in which the BFP algorithm only reduces the number of possible interpretations, but Hobbs algorithm stops with a correct interpretation. This ignores the fact that Hobbs may have rejected a number of interpretations before stopping. We also have not needed to make a decision on how to score an algorithm that only finds one interpretation for an utterance that humans find ambiguous.
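    The scoring convention just described can be written down as a small predicate. This is only a sketch of how we read the convention: the function name and the representation of interpretations as a list of strings are assumptions made for illustration, not part of either algorithm.

```python
def is_success(proposed, correct):
    """Score one pronoun under the convention described in the text.

    `proposed` is the list of interpretations an algorithm ends up with
    (hypothetical representation). An algorithm succeeds only if it
    commits to exactly one interpretation and that one is correct;
    merely narrowing the candidates, or proposing nothing, is a failure.
    """
    return len(proposed) == 1 and proposed[0] == correct


# Hypothetical outcomes for a pronoun whose co-specifier is "the pump".
single_correct = is_success(["the pump"], "the pump")
narrowed_only = is_success(["the pump", "the handle"], "the pump")
no_proposal = is_success([], "the pump")
```

    Under this convention a sequential proposer that stops at a correct candidate scores a success, while a parallel proposer that leaves two equally ranked candidates scores a failure even though the correct one is among them.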

    2.1.1 Centering algorithm

    The centering algorithm as defined by Brennan, Friedman and Pollard (BFP algorithm) is derived from a set of rules and constraints put forth by Grosz, Joshi and Weinstein [GJW83, GJW86]. We shall not reproduce this algorithm here (see [BFP87]). There are two main structures in the centering algorithm: the CB, the BACKWARD LOOKING CENTER, which is what the discourse is 'about', and an ordered list, CF, of FORWARD LOOKING CENTERS, which are the discourse entities available to the next utterance for pronominalization. The centering framework predicts that in a local coherent stretch of dialogue, speakers will prefer to CONTINUE talking about the same discourse entity, that the CB will be the highest ranked entity of the previous utterance's forward centers that is realized in the current utterance, and that if anything is pronominalized the CB must be.

    In the centering framework, the order of the forward-centers list is intended to reflect the salience of discourse entities. The BFP algorithm orders this list by grammatical relation of the complements of the main verb, i.e. first the subject, then object, then indirect object, then other subcategorized-for complements, then noun phrases found in adjunct clauses. This captures the intuition that subjects are more salient than other discourse entities.
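    The ordering of the forward-centers list and the choice of the CB can be illustrated with a short sketch. It is not the BFP algorithm itself, which we do not reproduce; it only shows the ranking by grammatical relation and the definition of the CB given above. The relation labels, the Entity type, and the two example utterances are assumptions made for illustration.

```python
from dataclasses import dataclass

# Ranking of grammatical relations as described in the text:
# subject > object > indirect object > other complements > adjunct NPs.
# The label strings are assumptions made for this sketch.
RELATION_RANK = {
    "subject": 0,
    "object": 1,
    "indirect_object": 2,
    "other_complement": 3,
    "adjunct": 4,
}


@dataclass(frozen=True)
class Entity:
    name: str
    relation: str  # grammatical function through which the entity was realized


def forward_centers(entities):
    """Order an utterance's discourse entities into a CF list by relation."""
    return sorted(entities, key=lambda e: RELATION_RANK[e.relation])


def backward_center(previous_cf, current_entities):
    """Return the highest-ranked member of the previous CF that is realized
    in the current utterance, or None if no member is realized."""
    realized = {e.name for e in current_entities}
    for entity in previous_cf:
        if entity.name in realized:
            return entity.name
    return None


# Hypothetical two-utterance sequence (not taken from the analysed data).
u1 = [Entity("pump", "object"), Entity("Susan", "subject")]
u2 = [Entity("pump", "subject"), Entity("table", "adjunct")]
cf1 = forward_centers(u1)
cb2 = backward_center(cf1, u2)
```

    Here the subject of the first utterance heads its CF list, but since only the pump is realized again in the second utterance, the pump becomes the CB of the second utterance.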

    The BFP algorithm added linguistic constraints on CONTRA-INDEXING to the centering framework. These constraints are exemplified by the fact that, in the sentence he likes him, the entity cospecified by he cannot be the same as that cospecified by him. We say that he and him are CONTRA-INDEXED. The BFP algorithm depends on semantic processing to precompute these constraints, since they are derived from the syntactic structure, and depend on some notion of c-command [Rei76]. The other assumption that is dependent on syntax is that the representations of discourse entities can be marked with the grammatical function through which they were realized, e.g. subject.

    The BFP algorithm assumes that some other mechanism can structure both written texts and task-oriented dialogues into hierarchical segments. The present concern is not with whether there might be a grammar of discourse that determines this structure, or whether it is derived from the cues that cooperative speakers give hearers to aid in processing. Since centering is a local phenomenon and is intended to operate within a segment, we needed to deduce a segmental structure in order to analyse the data. Speaker's intentions, task structure, cue words like O.K. now.., intonational properties of utterances, coherence relations, the scoping of modal operators, and mechanisms for shifting control between discourse participants have all been proposed as ways of determining discourse segmentation [Gro77, GS86, Rei85, PH87, HL87, Hob78, Hob85, Rob88, WS88]. Here, we use a combination of orthography, anaphora distribution, cue words and task structure. The rules are:

    • In published texts, a paragraph is a new segment unless the first sentence has a pronoun in subject position or a pronoun where none of the preceding sentence-internal noun phrases match its syntactic features.

    • In the task-oriented dialogues, the action PICK-UP marks task boundaries, hence segment boundaries. Cue words like next, then, and now also mark segment boundaries. These will usually co-occur but either one is sufficient for marking a segment boundary.
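    The two segmentation rules above can be sketched as simple predicates. This is a rough illustration under stated assumptions: the function names and arguments are hypothetical, the syntactic judgements for the paragraph rule are assumed to be supplied by hand, and matching a cue word anywhere in an utterance is a simplification that will over-generate on uses of then or now that are not discourse cues.

```python
CUE_WORDS = {"next", "then", "now"}  # cue words named in the text


def paragraph_starts_segment(subject_is_pronoun, has_unmatched_pronoun):
    """Published-text rule: a paragraph opens a new segment unless its first
    sentence has a pronoun in subject position, or a pronoun that no
    preceding sentence-internal noun phrase matches in syntactic features.
    Both judgements are assumed to come from a hand analysis."""
    return not (subject_is_pronoun or has_unmatched_pronoun)


def utterance_starts_segment(utterance, action=None):
    """Task-dialogue rule: a PICK-UP action or a cue word marks a boundary.
    `action` is a hypothetical hand-assigned task label."""
    words = [w.strip(".,!?").lower() for w in utterance.split()]
    has_cue = any(w in CUE_WORDS for w in words)
    return action == "PICK-UP" or has_cue


boundary_by_cue = utterance_starts_segment("Now put it in the pan of water.")
boundary_by_task = utterance_starts_segment("Take the plunger.", action="PICK-UP")
no_boundary = utterance_starts_segment("Stand it up.")
```

    Either condition alone is treated as sufficient, which mirrors the statement that the task action and the cue words usually co-occur but need not.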

    BFP never state that cospecifiers for pronouns within the same segment are preferred over those in previous segments, but this is an implicit assumption, since this line of research is derived from Sidner's work on local focusing. Segment-initial utterances therefore are the only situation where the BFP algorithm will prefer a within-sentence noun phrase as the cospecifier of a pronoun.

    2.1.2 Hobbs' algorithm

    The Hobbs algorithm is based on searching for a pronoun's co-specifier in the syntactic parse tree of input sentences [Hob76b]. We reproduce this algorithm in full in the appendix along with an example. Hobbs algorithm operates on one sentence at a time, but the structure of previous sentences in the discourse is available. It is stated in terms of searches on parse trees. When looking for an intrasentential antecedent, these searches are conducted in a left-to-right, breadth-first manner. However, when looking for a pronoun's antecedent within a sentence, it will go sequentially further and further up the tree to the left of the pronoun, and that failing will look in the previous sentence. Hobbs does not assume a segmentation of discourse structure in this algorithm; the algorithm will go back arbitrarily far in the text to find an antecedent. In more recent work, Hobbs uses the notion of COHERENCE RELATIONS to structure the discourse [HM87].

    The order by which Hobbs' algorithm traverses the parse tree is the closest thing in his framework to predictions about which discourse entities are salient. In the main it prefers co-specifiers for pronouns that are within the same sentence, and also ones that are closer to the pronoun in the sentence. This amounts to a claim that different discourse entities are salient, depending on the position of a pronoun in a sentence. When seeking an intersentential co-specification, Hobbs algorithm searches the parse tree of the previous utterance breadth-first, from left to right. This predicts that entities realized in subject position are more salient, since even if an adjunct clause linearly precedes the main subject, any noun phrases within it will be deeper in the parse tree. This also means that objects and indirect objects will be among the first possible antecedents found, and in general that the depth of syntactic embedding is an important determiner of discourse prominence.
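    The effect of a breadth-first, left-to-right search on the order in which candidate antecedents are found can be seen in a small sketch. This covers only the search of a previous sentence's parse tree; it is not the full Hobbs algorithm, which is given in the appendix. The Node type, the node labels, and the example parse are assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    text: str = ""
    children: list = field(default_factory=list)


def np_candidates_bfs(root):
    """Return the text of NP nodes in left-to-right, breadth-first order."""
    found = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node.label == "NP":
            found.append(node.text)
        queue.extend(node.children)  # children are kept in left-to-right order
    return found


# Hypothetical parse of "When the aide arrived, the president called the senator".
# The adjunct clause linearly precedes the main subject.
tree = Node("S", children=[
    Node("SBAR", children=[
        Node("COMP", "when"),
        Node("S", children=[
            Node("NP", "the aide"),
            Node("VP", children=[Node("V", "arrived")]),
        ]),
    ]),
    Node("NP", "the president"),
    Node("VP", children=[Node("V", "called"), Node("NP", "the senator")]),
])
candidates = np_candidates_bfs(tree)
```

    Although the aide comes first in the string, it is the most deeply embedded noun phrase and is reached last; the main subject is found first and the object second.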

    Turning to the assumptions about syntax, we note that Hobbs assumes that one can produce the correct syntactic structure for an utterance, with all adjunct phrases attached at the proper point of the parse tree. In addition, in order to obey linguistic constraints on coreference, the algorithm depends on the existence of an N-bar parse tree node, which denotes a noun phrase without its determiner (see the example in the Appendix). Hobbs algorithm procedurally encodes contra-indexing constraints by skipping over NP nodes whose N-bar node dominates the part of the parse tree in which the pronoun is found, which means that he cannot guarantee that two contra-indexed pronouns will not choose the same NP as a co-specifier.

    Hobbs also assumes that his algorithm can somehow collect discourse entities mentioned alone into sets as co-specifiers of plural anaphors. Hobbs discusses at length other assumptions that he makes about the capabilities of an interpretive process that operates before the algorithm [Hob76b]. This includes such things as being able to recover syntactically recoverable omitted text, such as elided verb phrases, and the identities of the speakers and hearers in a dialogue.

    2.1.3 Summary

    A major component of any discourse algorithm is the prediction of which entities are salient, even though all the factors that contribute to the salience of a discourse entity have not been identified [Pri81, Pri85, BF83, HTD86]. So an obvious question is when the two algorithms actually make different predictions. The main difference is that the choice of a co-specifier for a pronoun in the Hobbs algorithm depends in part on the position of that pronoun in the sentence. In the centering framework, no matter what criteria one uses to order the forward-centers list, pronouns take the most salient entities as antecedents, irrespective of that pronoun's position. Hobbs' ordering of entities from a previous utterance varies from BFP in that possessors come before case-marked objects and indirect objects, and there may be some other differences as well, but none of them were relevant to the analysis that follows.

    The effects of some of the assumptions are measurable and we will attempt to specify exactly what these effects are. Some, however, are not: e.g. we cannot measure the effect of Hobbs' syntax assumption, since it is difficult to say how likely one is to get the wrong parse. We adopt the set collection assumption for both algorithms, as well as the ability to recover the identity of speakers and hearers in dialogue.

    2.2 Quantitative Results of the Algorithms

    The texts on which the algorithms are analysed are the first chapter of Arthur Hailey's novel Wheels, and the July 7, 1975 edition of Newsweek. The sentences in Wheels are short and simple with long sequences consisting of reported conversation, so it is similar to a conversational text. The articles from Newsweek are typical of journalistic writing. For each text, the first 100 occurrences of singular and plural third-person pronouns were used to test the performance of the algorithms. The task-dialogues contain a total of 81 uses of it and no other pronouns except for I and you. In the figures below note that possessives like his are counted along with he and that accusatives like him and her are counted as he and she².

                N     Hobbs   BFP
    Wheels      100   88      90
    Newsweek    100   89      79
    Tasks       81    51      49

    Figure 1: Number correct for both algorithms for Wheels, Newsweek and Task Dialogues

    ² Hobbs reports his algorithm's performance and the examples it fails on in [Hob76b, Hob76a]. The numbers reported here vary slightly from those. This is probably due to a discrepancy in exactly what the data set consisted of.

    We performed three analyses on the quantitative results. A comparison of the two algorithms on each data set individually and an overall analysis on the three data sets combined revealed no significant differences in the performance of the two algorithms (χ² = 3.25, not significant). In addition, for each algorithm alone we tested whether there were significant differences in performance for different textual types. Both of the algorithms performed significantly worse on the task dialogues (χ² = 22.05 for Hobbs, χ² = 21.55 for BFP, p < 0.05).
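    For readers who want to run this kind of test themselves, the following is a minimal sketch of a Pearson chi-square statistic computed over the counts in Figure 1, comparing each algorithm's performance across the three text types. The construction of the contingency table (text type by correct/incorrect, no continuity correction) is our assumption; the text does not say how the reported statistics were set up, so the values this sketch produces need not equal the ones reported above.

```python
def pearson_chi_square(table):
    """Pearson chi-square statistic for a table of observed counts
    (a list of rows). No continuity correction is applied."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# Counts from Figure 1: pronouns tested (N) and number correct.
FIGURE_1 = {
    "Wheels": {"N": 100, "Hobbs": 88, "BFP": 90},
    "Newsweek": {"N": 100, "Hobbs": 89, "BFP": 79},
    "Tasks": {"N": 81, "Hobbs": 51, "BFP": 49},
}


def by_text_type(algorithm):
    """Build a (text type x [correct, incorrect]) table for one algorithm."""
    return [[d[algorithm], d["N"] - d[algorithm]] for d in FIGURE_1.values()]


chi2_hobbs = pearson_chi_square(by_text_type("Hobbs"))
chi2_bfp = pearson_chi_square(by_text_type("BFP"))

# Critical value of chi-square with 2 degrees of freedom at p = 0.05.
CRITICAL_DF2_P05 = 5.991
```

    With three text types and two outcomes the table has 2 degrees of freedom, so a statistic above the critical value indicates that performance differs across text types at the 0.05 level.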

    We might wonder with what confidence we should view these numbers. A significant factor that must be considered is the contribution of FALSE POSITIVES and ERROR CHAINING. A FALSE POSITIVE is when an algorithm gets the right answer for the wrong reason. A very simple example of this phenomenon is illustrated by this sequence from one of the task dialogues.

    Exp1: Now put IT in the pan of water.
    Exp2: Stand IT up.
    Exp3: Pump the little handle with the red cap on IT.
    Cli1: ok
    Exp4: Does IT work??

    The first it in Exp1 refers to the pump. Hobbs algorithm gets the right antecedent for it in Exp3, which is the little handle, but then fails on it in Exp4, whereas the BFP algorithm has the pump centered at Exp1 and continues to select that as the antecedent for it throughout the text. This means BFP gets the wrong co-specifier in Exp3 but this error allows it to get the correct co-specifier in Exp4.

    Another type of false positive example is "Everybody and HIS brother suddenly wants to be the President's friend," said one aide. Hobbs gets this correct as long as one is willing to accept that Everybody is really the antecedent of his. It seems to me that this might be an idiomatic use.

    ERROR CHAINING refers to the fact that once an algorithm makes an error, other errors can result. Consider:

    Cli1: Sorry no luck.
    Exp1: I bet IT's the stupid red thing.
    Exp2: Take IT out.
    Cli2: Ok. IT is stuck.

    In this example, once an algorithm fails at Exp1 it will fail on Exp2 and Cli2 as well, since the choices of a cospecifier in the following examples are dependent on the choice in Exp1.

    It isn't possible to measure the effect of false positives, since in some sense they are subjective judgements. However one can and should measure the effects of error chaining, since reporting numbers that correct for error chaining is misleading, but if the error that produced the error chain can be corrected then the algorithm might show a significant improvement. In this analysis, error chains contributed 22 failures to Hobbs' algorithm and 19 failures to BFP.
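    One way to separate chained failures from the failures that start a chain is sketched below. The representation is an assumption made for illustration: each pronoun in discourse order is annotated by hand with whether the algorithm resolved it correctly and whether its resolution depends on the choice made for the previous pronoun. The counts reported above were obtained from the hand-simulation, not from this code.

```python
def split_failures(outcomes):
    """Count failures that start a chain versus failures inherited from it.

    `outcomes` is a list of (correct, depends_on_previous) pairs in
    discourse order (hypothetical hand annotation). A failure is counted
    as chained when it depends on the previous pronoun's resolution and
    that previous pronoun was itself resolved incorrectly.
    """
    initial, chained = 0, 0
    previous_correct = True
    for correct, depends_on_previous in outcomes:
        if not correct:
            if depends_on_previous and not previous_correct:
                chained += 1
            else:
                initial += 1
        previous_correct = correct
    return initial, chained


# The dialogue above, for an algorithm that fails at Exp1:
# the failures at Exp2 and Cli2 follow from the choice made at Exp1.
stuck_dialogue = [(False, False), (False, True), (False, True)]
initial_failures, chained_failures = split_failures(stuck_dialogue)
```

    For this sequence one failure starts the chain and two are inherited from it, so correcting the first error could remove all three.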

    3 Qualitative Evaluation - Glass Box

    The numbers presented in the previous section are intuitively unsatisfying. They tell us nothing about what makes the algorithms more or less general, or how they might be improved. In addition, given the assumptions that we needed to make in order to produce them, one might wonder to what extent the data is a result of these assumptions. Figure 1 also fails to indicate whether the two algorithms missed the same examples or are covering a different set of phenomena, i.e. what the relative distribution of the successes and failures is. But having done the hand-simulation in order to produce such numbers, all of this information is available. In this section we will first discuss the relative importance of various factors that go into producing the numbers above, then discuss whether the algorithms can be modified, since the flexibility of a framework in allowing one to make modifications is an important dimension of evaluation.

    3.1 Distributions

    Figures 2, 3 and 4 show, for each pronominal category, the distribution of successes and failures for both algorithms.

                     Both   Neither   Hobbs only   BFP only
    HE, SHE, THEY    [cell values garbled in source: 66 1 166 3 35 1]
    Total             83       5          5           7

    Figure 2: Distribution on Wheels

    Since the main purpose of evaluation must be to improve the theory that we are evaluating, the most interesting cases are the ones on which the algorithms' performance varies and those that neither algorithm gets correct. We discuss these below.


             Both   Neither   Hobbs only   BFP only
    HE        53                  8           2
    IT        11       5          4           1
    THEY      13       3
    Total     77       8         12           3

    Figure 3: Distribution on Newsweek

             Both   Neither   Hobbs only   BFP only
    IT        48      29          3           1

    Figure 4: Distribution on Task Dialogues
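
    The number correct for each algorithm can be recovered from these distributions, since an algorithm is correct on the cases that both get plus the cases that it alone gets. The following is a minimal sketch that uses only the Total rows of Figures 2 to 4; the dictionary layout and the function name are our own illustrative choices and are not part of the original analysis.

```python
# Total rows of Figures 2-4: (both, neither, hobbs_only, bfp_only).
DISTRIBUTIONS = {
    "Wheels": (83, 5, 5, 7),
    "Newsweek": (77, 8, 12, 3),
    "Tasks": (48, 29, 3, 1),
}


def totals(both, neither, hobbs_only, bfp_only):
    """Return (N, number correct for Hobbs, number correct for BFP)."""
    n = both + neither + hobbs_only + bfp_only
    return n, both + hobbs_only, both + bfp_only


for corpus, counts in DISTRIBUTIONS.items():
    n, hobbs, bfp = totals(*counts)
    print(f"{corpus}: N={n} Hobbs={hobbs} BFP={bfp}")
```

    For the Task Dialogues, for example, this gives N = 81 with 51 correct for Hobbs and 49 for BFP.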

    3.1.1 Both

    In the Wheels data, 4 examples rest on the assumption that the identities of speakers and hearers are recoverable. For example in The GM president smiled. "Except Henry will be damned forceful and the papers won't print all HIS language.", getting the his correct here depends on knowing that it is the GM president speaking. Only 4 examples rest on being able to produce collections or discourse entities, and 2 of these occurred with an explicit instruction to the hearer to produce such a collection by using the phrase them both.

    3.1.2 Hobbs only

    There are 21 cases that Hobbs gets that BFP don't, and of these a few classes stand out. In every case the relevant factor is Hobbs' preference for intrasentential co-specifiers.

    One class, (n = 3), is exemplified by Put the little black ring into the the large blue CAP with the hole in IT. All three involved using the preposition with in a descriptive adjunct on a noun phrase. It may be that with-adjuncts are common in visual descriptions, since they were only found in our data in the task dialogues, and a quick inspection of Grosz's task-oriented dialogues revealed some as well [Deu74].

    Another class, (n = 7), are possessives. In some cases the possessive co-specified with the subject of the sentence, e.g. The SENATE took time from ITS paralyzing New Hampshire election debate to vote agreement, and in others it was within a relative clause and co-specified with the subject of that clause, e.g. The auto industry should be able to produce a totally safe, defect-free CAR that doesn't pollute ITS environment.

    Other cases seem to be syntactically marked subject matching with constructions that link two S clauses (n = 8). These are uses of more-than in e.g. but Chamberlain grossed about $8.3 million more than HE could have made by selling on the home front.

    There also are S-if-S cases, as in Mondale said: "I think THE MAFIA would be broke if IT conducted all its business that way." We also have subject matching in AS-AS examples as in ... and the resulting EXPOSURE to daylight has become as uncomfortable as IT was unaccustomed, as well as in sentential complements, such as But another liberal, Minnesota's Walter MONDALE, said HE had found a lot of incompetence in the agency's operations. The fact that quite a few of these are also marked with But may be significant.

    In terms of the possible effects that we noted earlier, the DEFINITION OF SUCCESS (see section 2.1) favors Hobbs (n = 2). Consider:

        K: Next take the red piece that is the smallest and insert it into the hole in the side of the large plastic tube. IT goes in the hole nearest the end with the engravings on IT.

    The Hobbs algorithm will correctly choose the end as the antecedent for the second it. The BFP algorithm on the other hand will get two interpretations, one in which the second it co-specifies the red piece and one in which it co-specifies the end. They are both CONTINUING interpretations since the first it co-specifies the CB, but the constraints don't make a choice.
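
    Why the ranking cannot decide here can be seen by classifying each interpretation. The sketch below assumes the transition definitions published in [BFP87]: CONTINUING when the CB is unchanged and is also the highest-ranked forward center, RETAINING when the CB is unchanged but is not highest-ranked, and SHIFTING otherwise. The function name, the entity labels and the forward-center orderings are our own illustrative reading of the example, not output of the hand-simulation.

```python
def transition(cb_prev, cb, cf):
    """Classify a centering transition.

    cb_prev -- CB of the previous utterance
    cb      -- CB of the current utterance
    cf      -- forward centers of the current utterance, highest-ranked first
    """
    if cb != cb_prev:
        return "SHIFTING"
    return "CONTINUING" if cb == cf[0] else "RETAINING"


# CB carried over from the preceding clause: the red piece.
cb_prev = "red piece"

# Two readings of "IT goes in the hole nearest the end with the engravings on IT".
# The subject IT co-specifies the red piece in both readings; only the final IT
# differs, so the CB and the highest-ranked forward center are the same in both.
readings = {
    "second IT = red piece": ["red piece", "hole", "end", "engravings"],
    "second IT = end": ["red piece", "hole", "end", "engravings"],
}

labels = {name: transition(cb_prev, "red piece", cf) for name, cf in readings.items()}
print(labels)
```

    Both readings receive the same label, so a preference ordering over transitions alone gives no basis for choosing between them.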

    3.1.3 BFP only

    All of the examples on which BFP succeeds and Hobbs fails have to do with extended discussion of one discourse entity. For instance:

        Exp1: Now take the blue cap with the two prongs sticking out (CB = blue cap)
        Exp2: and fit the little piece of pink plastic on IT. Ok? (CB = blue cap)
        Cli1: ok.
        Exp3: Insert the rubber ring into that blue cap. (CB = blue cap)
        Exp4: Now screw IT onto the cylinder.

    On this example, Hobbs fails by choosing the co-specifier of it in Exp4 to be the rubber ring, even though the whole segment has been about the blue cap.

    Another example from the novel WHEELS is given below. On this one Hobbs gets the first use of he but then misses the next four, as a result of missing the second one by choosing a housekeeper as the co-specifier for HIS.

        ... An executive vice-president of Ford was preparing to leave for Detroit Metropolitan Airport. HE had already breakfasted, alone. A housekeeper had brought a tray to HIS desk in the softly lighted study where, since 5 a.m., HE had been alternately reading memoranda (mostly on special blue stationery which Ford vice-presidents used in implementing policy) and dictating crisp instructions into a recording machine. HE had scarcely looked up, either as the meal arrived, or while eating, as HE accomplished in an hour what would have taken ...

    Since an executive vice-president is centered in the first sentence, and continued in each following sentence, the BFP algorithm will correctly choose the co-specifier.

    3.1.4 Neither

    Among the examples that neither algorithm gets correctly are 20 examples from the task dialogues of it referring to the global focus, the pump. In 15 cases, these shifts to global focus are marked syntactically with a cue word such as Now, and are not marked in 5 cases. Presumably they are felicitous since the pump is visually salient. Besides the global focus cases, pronominal references to entities that were not linguistically introduced are rare. The only other example is an implicit reference to 'the problem' of the pump not working:

        Cli1: Sorry no luck.
        Exp1: I bet IT's the stupid red thing.

    We have only two examples of sentential or VP anaphora altogether, such as Madam Chairwoman, said Colby at last, I am trying to run a secret intelligence service. IT was a forlorn hope. Neither Hobbs' algorithm nor BFP attempts to cover these examples.

    Three of the examples are uses of it that seem to be lexicalized with certain verbs, e.g. They hit IT off real well. One can imagine these being treated as phrasal lexical items, and therefore not handled by an anaphoric processing component [AS89].

    Most of the interchanges in the task dialogues consist of the client responding to commands with cues such as O.K. or Ready to let the expert know when they have completed a task. When both parties contribute discourse entities to the common ground, both algorithms may fail (n = 4).

    Consider:

        Exp1: Now we have a little red piece left
        Exp2: and I don't know what to do with IT.
        Cli1: Well, there is a hole in the green plunger inside the cylinder.
        Exp3: I don't think IT goes in THERE.
        Exp4: I think IT may belong in the blue cap onto which you put the pink piece of plastic.

    In Exp3, one might claim that it and there are contra-indexed, and that there can be properly resolved to a hole, so that it cannot be any of the noun phrases in the prepositional phrases that modify a hole, but whether any theory of contra-indexing actually gives us this is questionable.

    The main factor seems to be that even though Exp1 is not syntactically a question, the little red piece is the focus of a question, and as such is in focus despite the fact that the syntactic construction there is supposedly focuses a hole in the green plunger ... [Sid79]. These examples suggest that a questioned entity is left focused until the point in the dialogue at which the question is resolved. The fact that well has been noted as a marker of response to questions supports this analysis [Sch87]. Thus the relevant factor here may be the switching of control among discourse participants [WS88]. These mixed-initiative features make these sequences inherently different than text.

    3.2 Modifiability

    Task structure in the pump dialogues is an important factor especially as it relates to the use of global focus. Twenty of the cases on which both algorithms fail are references to the pump, which is the global focus. We can include a global focus in the centering framework, as a separate notion from the current CB. This means that in the 15 out of 20 cases where the shift to global focus is identifiably marked with a cue-word such as now, the segment rules will allow BFP to get the global focus examples.

    BFP can add the VP and the S onto the end of the forward centers list, as Sidner does in her algorithm for local focusing [Sid79]. This lets BFP get the two examples of event anaphora. Hobbs discusses the fact that his algorithm cannot be modified to get event anaphora in [Hob76b].

    Another interesting fact is that in every case in which Hobbs' algorithm gets the correct co-specifier and BFP didn't, the relevant factor is Hobbs' preference for intrasentential co-specifiers. One view on these cases may be that these are not discourse anaphora, but there seems to be no principled way to make this distinction. However, Carter has proposed some extensions to Sidner's algorithm for local focusing that seem to be relevant here (chap. 6, [Car87]). He argues that intra-sentential candidates (ISCs) should be preferred over candidates from the previous utterance, ONLY in the cases where no discourse center has been established or the discourse center is rejected for syntactic or selectional reasons. He then uses Hobbs' algorithm to produce an ordering of these ISCs. This is compatible with the centering framework since it is underspecified as to whether one should always choose to establish a discourse center with a co-specifier from a previous utterance.

    If we adopt Carter's rule into the centering framework, we find that of the 21 cases that Hobbs gets that BFP don't, in 7 cases there is no discourse center established, and in another 4 the current center can be rejected on the basis of syntactic or sortal information. Of these Carter's rule clearly gets 5, and another 3 seem to rest on whether one might want to establish a discourse entity from a previous utterance. Since the addition of this constraint does not allow BFP to get any examples that neither algorithm got, it seems that this combination is a way of making the best out of both algorithms.
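
    Carter's rule, as summarized above, can be sketched as a small candidate-ordering function. This is only an illustration under our own assumptions: the function name, its arguments and the string candidates are hypothetical, the ISCs are assumed to arrive already ordered (for instance by Hobbs' algorithm), and the syntactic and selectional checks are reduced to a single boolean.

```python
def order_candidates(center, center_rejected, iscs, previous):
    """Order co-specifier candidates for a pronoun.

    center          -- the established discourse center, or None if there is none
    center_rejected -- True if the center fails syntactic or selectional checks
    iscs            -- intra-sentential candidates, already ordered
    previous        -- candidates from the previous utterance, already ordered
    """
    others = [c for c in previous if c != center]
    if center is None or center_rejected:
        # Only in these two cases are intra-sentential candidates preferred.
        return list(iscs) + others
    # Otherwise the established center comes first, then the remaining
    # candidates from the previous utterance, and the ISCs last.
    return [center] + others + list(iscs)


# No center established: the intra-sentential candidate is tried first.
print(order_candidates(None, False, ["the Senate"], ["the debate"]))

# Center established and acceptable: it stays ahead of any ISC.
print(order_candidates("blue cap", False, ["ISC"], ["blue cap", "rubber ring"]))
```

    The design point is that the intrasentential preference is conditional, so that extended discussion of one entity, where centering does well, is left undisturbed.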

    The addition of these modifications changes the quantitative results. See Figure 5.

                  N    Hobbs   BFP
    Wheels       100     88     93
    Newsweek     100     89     84
    Tasks         81     51     64

    Figure 5: Number correct for both algorithms after Modifications, for Wheels, Newsweek and Task Dialogues

    However, the statistical analyses still show that there is no significant difference in the performance of the algorithms in general. It is also still the case that the performance of each algorithm significantly varies depending on the data. The only significant difference as a result of the modifications is that the BFP algorithm now performs significantly better on the pump dialogues alone (χ² = 4.31, p < .05).
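
    The reported statistic can be approximately reproduced from the Task Dialogues counts in Figure 5. The sketch below assumes, since the test is not spelled out here, that it is a chi-square test on the 2 x 2 table of correct and incorrect counts for the two algorithms over the 81 task-dialogue pronouns, with Yates' continuity correction. The function name is ours.

```python
def chi_square_yates(a, b, c, d):
    """Chi-square with Yates' correction for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (abs(a * d - b * c) - n / 2) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


N_TASKS = 81
hobbs_correct, bfp_correct = 51, 64  # Task Dialogues row of Figure 5

chi2 = chi_square_yates(
    hobbs_correct, N_TASKS - hobbs_correct,
    bfp_correct, N_TASKS - bfp_correct,
)
print(round(chi2, 2))
```

    Under these assumptions the statistic comes out at about 4.32, close to the reported 4.31 and above the 3.84 critical value for p < .05 with one degree of freedom.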

    4 Conclusion

    We can benefit in two ways from performing such evaluations: (a) we get general results on a methodology for doing evaluation, (b) we discover ways we can improve current theories. A split of evaluation efforts into quantitative versus qualitative is incoherent. We cannot trust the results of a quantitative evaluation without doing a considerable amount of qualitative analyses and we should perform our qualitative analyses on those components that make a significant contribution to the quantitative results; we need to be able to measure the effect of various factors. These measurements must be made by doing comparisons at the data level.

    In terms of general results, we have identified some factors that make evaluations of this type more complicated and which might lead us to evaluate solely quantitative results with care. These are: (a) To decide how to evaluate UNDERSPECIFICATIONS and the contribution of ASSUMPTIONS, and (b) To determine the effects of FALSE POSITIVES and ERROR CHAINING. We advocate an approach in which the contribution of each underspecification and assumption is tabulated as well as the effect of error chains. If a principled way could be found to identify false positives, their effect should be reported as well as part of any quantitative evaluation.

    In addition, we have taken a few steps towards determining the relative importance of different factors to the successful operation of discourse modules. The percent of successes that both algorithms get indicates that syntax has a strong influence, and that at the very least we can reduce the amount of inference required. In 59% to 82% of the cases both algorithms get the correct result. This probably means that in a large number of cases there was no potential conflict of co-specifiers. In addition, this analysis has shown that at least for task-oriented dialogues global focus is a significant factor, and in general discourse structure is more important in the task dialogues. However simple devices such as cue words may go a long way toward determining this structure.

    Finally, we should note that doing evaluations such as this allows us to determine the GENERALITY of our approaches. Since the performance of both Hobbs and BFP varies according to the type of the text, and in fact was significantly worse on the task dialogues than on the texts, we might question how their performance would vary on other inputs. An annotated corpus comprising some of the various NL input types such as those I discussed in the introduction would go a long way towards giving us a basis against which we could evaluate the generality of our theories.

    5 Acknowledgements

    David Carter, Phil Cohen, Nick Haddock, Jerry Hobbs, Aravind Joshi, Don Knuth, Candy Sidner, Phil Stenton, Bonnie Webber, and Steve Whittaker have provided valuable insights toward this endeavor and critical comments on a multiplicity of earlier versions of this paper. Steve Whittaker advised me on the statistical analyses. I would like to thank Jerry Hobbs for encouraging me to do this in the first place.

    References

    [AP86] James F. Allen and C. Raymond Perrault. Analyzing intention in utterances. In Barbara J. Grosz, Karen Sparck Jones, and Bonnie Lynn Webber, editors, Readings in Natural Language Processing, pages 419-422, Morgan Kaufmann, Los Altos, Ca., 1986.

    [AS89] Anne Abeille and Yves Schabes. Parsing idioms in lexicalized tags. In Proc. 27th Annual Meeting of the ACL, Association of Computational Linguistics, pages 161-65, 1989.

    [BF83] Roger Brown and Deborah Fish. The psychological causality implicit in language. Cognition, 14:237-273, 1983.

    [BFP87] Susan E. Brennan, Marilyn Walker Friedman, and Carl J. Pollard. A centering approach to pronouns. In Proc. 25th Annual Meeting of the ACL, Association of Computational Linguistics, pages 155-162, Stanford University, Stanford, Ca., 1987.

    [Car87] David M. Carter. Interpreting Anaphors in Natural Language Texts. Ellis Horwood, 1987.

    [Coh78] Phillip R. Cohen. On Knowing What to Say: Planning Speech Acts. Technical Report 118, University of Toronto; Department of Computer Science, 1978.

    [Coh84] Phillip R. Cohen. The pragmatics of referring and the modality of communication. Computational Linguistics, 10:97-146, 1984.

    [Deu74] Barbara Grosz Deutsch. Typescripts of task oriented dialogs. August 1974.

    [DJ89] Nils Dahlback and Arne Jonsson. Empirical studies of discourse representations for natural language interfaces. In Proc. 27th Annual Meeting of the ACL, Association of Computational Linguistics, pages 291-298, 1989.

    [GJW83] Barbara J. Grosz, Aravind K. Joshi, and Scott Weinstein. Providing a unified account of definite noun phrases in discourse. In Proc. 21st Annual Meeting of the ACL, Association of Computational Linguistics, pages 44-50, Cambridge, MA, 1983.

    [GJW86] Barbara J. Grosz, Aravind K. Joshi, and Scott Weinstein. Towards a computational theory of discourse interpretation. 1986. Preliminary draft.

    [Gro77] Barbara J. Grosz. The Representation and Use of Focus in Dialogue Understanding. Technical Report 151, SRI International, 333 Ravenswood Ave, Menlo Park, Ca. 94025, 1977.

    [GS86] Barbara J. Grosz and Candace L. Sidner. Attentions, intentions and the structure of discourse. Computational Linguistics, 12:175-204, 1986.

    [GSBC86] Raymonde Guindon, P. Sladky, H. Brunner, and J. Conner. The structure of user-adviser dialogues: is there method in their madness? In Proc. 24th Annual Meeting of the ACL, Association of Computational Linguistics, pages 224-230, 1986.

    [HL87] Julia Hirschberg and Diane Litman. Now lets talk about now: identifying cue phrases intonationally. In Proc. 25th Annual Meeting of the ACL, Association of Computational Linguistics, pages 163-171, Stanford University, Stanford, Ca., 1987.

    [HM87] Jerry R. Hobbs and Paul Martin. Local Pragmatics. Technical Report, SRI International, 333 Ravenswood Ave., Menlo Park, Ca 94025, 1987.

    [Hob76a] Jerry R. Hobbs. A Computational Approach to Discourse Analysis. Technical Report 76-2, Department of Computer Science, City College, City University of New York, 1976.

    [Hob76b] Jerry R. Hobbs. Pronoun Resolution. Technical Report 76-1, Department of Computer Science, City College, City University of New York, 1976.

    [Hob78] Jerry R. Hobbs. Why is Discourse Coherent? Technical Report 176, SRI International, 333 Ravenswood Ave., Menlo Park, Ca 94025, 1978.

    [Hob85] Jerry R. Hobbs. On the Coherence and Structure of Discourse. Technical Report CSLI-85-37, Center for the Study of Language and Information, Ventura Hall, Stanford University, Stanford, CA 94305, 1985.

    [HTD86] Susan B. Hudson, Michael K. Tanenhaus, and Gary S. Dell. The effect of the discourse center on the local coherence of a discourse. Technical Report, University of Rochester, 1986.

    [PH87] Janet Pierrehumbert and Julia Hirschberg. The meaning of intonational contours in the interpretation of discourse. In Proc. Symposium on Intentions and Plans in Communication and Discourse, Monterey, Ca., 1987.

    [Pol86] Martha Pollack. A model of plan inference that distinguishes between the beliefs of actors and observers. In Proc. 24th Annual Meeting of the ACL, Association of Computational Linguistics, pages 207-214, Columbia University, New York, N.Y., 1986.

    [Pri81] Ellen F. Prince. Toward a taxonomy of given-new information. In Radical Pragmatics, Academic Press, 1981.

    [Pri85] Ellen F. Prince. Fancy syntax and shared knowledge. Journal of Pragmatics, pp. 65-81, 1985.

    [Rei76] T. Reinhart. The Syntactic Domain of Anaphora. PhD thesis, MIT, Cambridge Mass., 1976.

    [Rei85] Rachel Reichman. Getting Computers to Talk Like You and Me. MIT Press, Cambridge, MA, 1985.

    [Rob88] Craige Roberts. Modal Subordination and Pronominal Anaphora in Discourse. Technical Report No. 127, CSLI, May, 1988. Also to appear in Linguistics and Philosophy.

    [Sch87] Deborah Schiffrin. Discourse Markers. Cambridge University Press, 1987.

    [SI81] Candace Sidner and David Israel. Recognizing intended meaning and speakers plans. In Proc. International Joint Conference on Artificial Intelligence, pages 203-208, Vancouver, BC, Canada, 1981.

    [Sid79] Candace L. Sidner. Toward a computational theory of definite anaphora comprehension in English. Technical Report AI-TR-537, MIT, 1979.

    [Tho80] Bozena Henisz Thompson. Linguistic analysis of natural language communication with computers. In COLING80: Proc. 8th International Conference on Computational Linguistics. Tokyo, pages 190-201, 1980.

    [Web86] Bonnie Lynn Webber. Two Steps Closer to Event Reference. Technical Report MS-CIS-86-74, Linc Lab 42, Department of Computer and Information Science, University of Pennsylvania, 1986.

    [WS88] Steve Whittaker and Phil Stenton. Cues and control in expert client dialogues. In Proc. 26th Annual Meeting of the ACL, Association of Computational Linguistics, 1988.

    [WS89] Steve Whittaker and Phil Stenton. User studies and the design of natural language systems. In Proc. 27th Annual Meeting of the ACL, Association of Computational Linguistics, pages 116-123, 1989.


    A The Hobbs algorithm

    The algorithm and an example are reproduced below. In it, NP denotes NOUN PHRASE and S denotes SENTENCE.

    1. Begin at the NP node immediately dominating the pronoun in the parse tree of S.

    2. Go up the tree until you encounter an NP or S node. Call this node X, and call the path used to reach it p.

    3. Traverse all branches below node X to the left of path p in a left-to-right breadth-first fashion. Propose as the antecedent any NP node encountered that has an NP or S node on the path from it to X.

    4. If X is not the highest S node in the sentence, continue to step 5. Otherwise traverse the surface parse trees of previous sentences in the text in reverse chronological order until an acceptable antecedent is found; each tree is traversed in a left-to-right, breadth-first manner, and when an NP node is encountered, it is proposed as the antecedent.

    5. From node X, go up the tree to the first NP or S node encountered. Call this new node X, and call the path traversed to reach it p.

    6. If X is an NP node and if the path p to X did not pass through the N node that X immediately dominates, propose X as the antecedent.

    7. Traverse all branches below node X to the left of path p in a left-to-right, breadth-first manner, but do not go below any NP or S node encountered. Propose any NP or S node encountered as the antecedent.

    8. Go to step 4.
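
    The search over previous sentences in step 4 can be sketched in a few lines. This is not an implementation of the whole algorithm: it covers only the reverse-chronological, left-to-right, breadth-first proposal of NP nodes, and omits the intrasentential steps and any test of whether a proposed antecedent is acceptable. The tuple encoding of trees, the function names, and the particular bracketing of the sentence Lyn's mom is a gardener are our own assumptions for illustration.

```python
from collections import deque

# A node is (label, child, ...); a child is another node or a word (a string).
# This bracketing of "Lyn's mom is a gardener" is an assumed parse.
U1 = ("S",
      ("NP",
       ("Det", ("NP", "Lyn"), "'s"),
       ("N", "mom")),
      ("VP",
       ("V", "is"),
       ("NP", ("Det", "a"), ("N", "gardener"))))


def words(node):
    """Return the words covered by a node, left to right."""
    if isinstance(node, str):
        return [node]
    out = []
    for child in node[1:]:
        out.extend(words(child))
    return out


def np_nodes(tree):
    """Yield the NP nodes of one tree in left-to-right, breadth-first order."""
    queue = deque([tree])
    while queue:
        node = queue.popleft()
        if isinstance(node, str):
            continue
        if node[0] == "NP":
            yield node
        queue.extend(node[1:])


def previous_sentence_candidates(previous_trees):
    """Propose NPs from earlier sentences, most recent sentence first."""
    for tree in reversed(previous_trees):
        for np in np_nodes(tree):
            yield " ".join(words(np))


candidates = list(previous_sentence_candidates([U1]))
print(candidates)
```

    On this tree the first NP proposed is the one for Lyn's mom, ahead of a gardener and Lyn, because breadth-first order reaches the subject NP before anything embedded inside it or inside the VP.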

    The purpose of steps 2 and 3 is to observe the contra-indexing constraints. Let us consider a simple conversational sequence.

        U1: Lyn's mom is a gardener.
        U2: Craige likes her.

    We are trying to find the antecedent for her in the second utterance. Let us go through the algorithm step by step, using the parse trees for U1 and U2 in the figure.

    1. NP5 labels the starting point of step 1.

    [Figure: parse trees S1 for U1 and S2 for U2]