

International Conference on Intelligent Computer Communication, C. Unger & A. Leţia (eds.), Casa Cărţii de Ştiinţă, Cluj-Napoca, 1995

LEARNING THE MEANING OF UNKNOWN WORDS

Dan Tufiş

Romanian Centre for Artificial Intelligence
13, "13 Septembrie", 74311, Bucharest 5, Romania

fax: +(40 1)411 39 16
e-mail: [email protected]

Abstract

Lexical acquisition is one of the most difficult problems in building up operational natural language processing systems. Automatic learning of new words (morphological, syntactic and semantic properties) is even harder. The paper discusses our solution for overcoming lexical gaps. Learning what some unknown words might mean is abductively driven by the world knowledge and the local context of the unknowns, according to the pragmatic principle stating that abductive inference is inference to the best explanation. A generalization of the weight assignment for the weighted abduction inference strategy is proposed in order to overcome some inherent difficulties existing in standard cost-based abductive processing.

1. Introduction

Recent years have seen a great interest in abductive inference, as a complement to the older deductive approach, applied in the area of natural language processing [Hobbs,1986, Hobbs,1990a, Stickel,1989, Appelt,1990, Konolige,1991, Tufiş,1992], etc. It was rightfully claimed that the abductive approach yields not only a simplification but also a significant extension of the range of phenomena that can be captured: reference resolution, compound nominal interpretation, resolution of syntactic ambiguity, understanding metonymy, etc. Hobbs has shown [Hobbs,1990a] how “interpretation as abduction” can be naturally combined with the former view of “parsing as deduction” [Pereira,1983] to produce an elegant and thorough integration of syntax, semantics and pragmatics, accommodating both interpretation and generation. Translation also fits nicely within this abductive framework, being regarded as a matter of interpreting in the source language and generating in the target language [Hobbs,1990b].

In the following we will show the use of abduction in dealing with one of the most sensitive problems in natural language processing: overcoming lexical gaps, that is, learning from context the meaning of unknown words.

Modern parsers can predict the syntactic category of an unknown word without much difficulty. Some of them may even extend this ability to offer a complete feature-based syntactic description of the unknown word. But few of them offer a uniform approach to the contextual meaning prediction of a word missing from the system’s lexicon.

Weighted abduction provides a natural environment for solving this problem. Informally, understanding an unknown word UW contextually may be stated as:

1. identify a new word NW which makes it possible for the sentence to be interpreted;

2. assume that in the given context, UW and NW are interchangeable, that is, they are (at least partially) synonymous.

To interpret a sentence means:

1. proving the logical form of the sentence, together with the constraints that predicates impose on their arguments, allowing for coercion;

2. merging redundancies where possible;

3. making assumptions where necessary.

2. Weighted abduction

Let Σ be a consistent theory of a domain, expressed as a set of sentences in a given first-order language L. Let Ψ be a set of facts (expressed in the same language L) about the domain modeled by the theory Σ, and φ a set of explanations, according to Σ, for the facts in Ψ. Let us further consider |— the inferential operation of Σ, connecting a set of facts in Ψ to a set of explanations in φ.

Now, consider F a set of newly observed facts about the domain. If there exists a set E ⊆ φ such that E |— F, then E is said to represent a deductive set of explanations for F. If no such E is found but there exists a set A of sentences, called assumptions, such that:

(1) A ∩ φ = { }

(2) A is admissible1 for the theory Σ

(3) ∃ E' ⊆ φ, A ∪ E' |— F

then A ∪ E' is said to represent an abductive set of explanations for F.

Abductive reasoning requires augmenting the set of known facts with a set of facts assumed to be true. This set should not contradict what is already known.
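The definitions above can be made concrete with a small propositional sketch (the rule format and function name are hypothetical, not the paper's prover): backward-chaining from an observation, any literal that is neither known nor derivable is collected as an assumption.

```python
# Minimal propositional abduction: a rule is a pair (antecedents, consequent).
# Literals that are neither known nor derivable become assumptions, so the
# returned set plays the role of the set A in the definition above.
def explain(rules, known, goal):
    """Return an assumption set A such that known U A derives goal."""
    if goal in known:
        return set()                       # deductively explained, A is empty
    for antecedents, consequent in rules:
        if consequent == goal:
            assumptions = set()
            for p in antecedents:
                assumptions |= explain(rules, known, p)
            return assumptions
    return {goal}                          # no rule applies: assume the goal

rules = [(("car",), "mobile")]
print(explain(rules, {"car"}, "mobile"))   # set()    - deductive explanation
print(explain(rules, set(), "mobile"))     # {'car'}  - abductive explanation
```

Note that this sketch omits the admissibility check of condition (2); a real engine must also verify that the assumptions do not contradict Σ.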

1 The notion of admissibility as used here is very close to the one in [Appelt,1990]. For an algorithm checking the admissibility of an assumption set with respect to a first-order theory, see [Appelt,1990].


Obviously, in trying to abductively explain some observed facts, one may use different sets of assumptions. The problem in an abductive inference engine is to provide for the “best” set of assumptions. The weighted abductive prover underlying the work reported here was developed by Mark Stickel [Stickel,1988], and its control strategy [Appelt,1990] is implemented as an alternative to statistical methods. The abduction is carried out based on model preferences, according to which making an assumption is viewed as restricting the models of the background knowledge. The preference ordering is supported by the use of annotations on the rules encoding the underlying theory. These annotations are expressed as numeric weights. The “bestness” of an assumption set among other competing ones may be defined in many different ways, out of which the cost criterion is a tempting solution. In weighted abduction, the competing assumption sets are preferentially ordered according to their costs, the cheapest being preferred.

During the abductive inferential process, the costs of the different competing assumptions are computed on the basis of the weights annotating the applicable rules. An annotated inference rule is represented as:

P1^w1 & P2^w2 & ... & Pk^wk ⊃ Q

and is to be interpreted (from the cost point of view) as follows:

“if the cost of assuming Q is C, then the cost of assuming Pi is wi • C”

A term in the antecedent of a rule appearing without a weight accounts for a predicate which cannot be assumed but must be proved. In proving it, some finer-grained pieces of knowledge, necessary for proof completion, could be assumed if not stated otherwise. The weight assignment over the terms in a rule is very important for abduction control.

So, if w1 + w2 + ... + wk > 1, then assuming Q is cheaper than assuming P1, P2, ..., Pk; thus least-specific abduction is favoured.

If w1 + w2 + ... + wk < 1, then assuming P1, P2, ..., Pk is cheaper than assuming Q, and most-specific abduction is encouraged.

On the other hand, even if w1 + w2 + ... + wk > 1, when some of the Pi’s have already been derived, the cost of assuming the rest of the terms in the antecedent might be cheaper than assuming the consequent.

Theorem proving techniques include, among other operations, the factorization of the logical expressions generated during a proof. This is necessary not only for efficiency reasons but also for the completeness of the demonstration procedure [Tufiş,1981, Shapiro,1979]. With weighted abduction, there is another benefit of factorization: by allowing logical expressions to be factored, weighted abduction provides an elegant way of exploiting the natural redundancy of texts, overriding least-specific abduction.


Consider that one is supposed to derive Q1 & Q2 with an equal cost, say $10, for each conjunct. Assuming Q1 & Q2 would therefore cost $20. But suppose there are two inference rules, stating:

P1^0.6 & P2^0.6 ⊃ Q1
P2^0.6 & P3^0.6 ⊃ Q2

Although each rule favors least-specific abduction, factoring over P2 makes the most-specific resolution $2 cheaper: P1, P2 and P3 are assumed at $6 each, for a total of $18, against the $20 needed to assume Q1 and Q2 directly.

3. Overcoming the Lexical Gaps

As one would expect, the interpretation of a sentence containing unknown words will lead to a set of assumptions, some of them referring to the meaning of the unknowns.

To exemplify our approach to understanding unknown words by abduction, let us consider the simple example2:

EX1) The car veered right.

containing the unknown (to the system) word “veered”.

Now, suppose that the knowledge base of the system contains (among others) the following rules:

(1) mobile(x)^1.0 & move1(e,x)^1.0 ⊃ constraint(right1,e)

(2) car(x)^1.0 ⊃ mobile(x)

The formula “constraint(p,e)” stands for the restrictions that the predicate “p” places on its eventuality argument “e”. This predicate belongs to the same class of coercion predicates as “Req” and “rel” (see [Hobbs,1990a]). Unlike “Req” and “rel”, which restrict the type of the participants in an event, “constraint” restricts the type of the event itself. These coercion predicates seem to be second-order, but as shown in [Hobbs,1985,1990a] they are easily expressed equivalently as first-order predicates.

The axioms above read as follows:

2 In the following, for the sake of simplicity, we'll leave out some details not relevant here, such as tense, agreement, etc.


(1') One way of satisfying the requirements imposed by the adverbial reading of the word right (right1) on its argument “e” is to hypothesize that e is the event described by a move action carried out by a mobile agent.

(2') One way to prove that something is a mobile thing is to prove it is a car.

Let us further consider that there exist (among other similar rules) the following rules:

(3) unk(i,j,w1)^0.5 & w1(e,x)^0.1 & cat(w2,verb1)^0.1 & w2(e,x)^0.1 & ass-syn(e,w1,w2)^0.5 ⊃ verb1(i,j,w1)

The reading of this particular axiom is:

(3') One way of proving the existence of a verb w1 between the interword positions i and j is to prove it is an unknown word (unk(i,j,w1)) referring to an eventuality e. The agent of e is some x (w1(e,x)), and there exists a word w2 of the same category, here an intransitive verb (cat(w2,verb1)), such that w2 could refer to the event e implying the agent x (w2(e,x)). In the context of “e”, w2 is a synonym for w1 (ass-syn(e,w1,w2)).

The last atomic formula in the antecedent of (3) will not be provable but always assumable.

To show how the sentence in example 1 is interpreted, let us consider the grammar rules below, augmented with semantic and pragmatic information3.

(4) np(i,j,x)^0.4 & vp(j,k,e,x)^0.8 ⊃ s(i,k,e)

(5) det(i,j,w1)^0.2 & n(j,k,w2)^1.0 & w2(x)^0.05 ⊃ np(i,k,x)

(6) verb1(i,j,w1)^0.8 & w1(e,x)^0.05 & adverb(j,k,w2)^0.05 & w2(e)^0.05 & constraint(w2,e)^0.1 ⊃ vp(i,k,e,x)

Stated in English, the rules above say that:

(4') To prove that a string of words between the positions i and k represents a sentence referring to an event e, one has to prove an np from i to j, the referent of which is x, and a vp from j to k denoting an event in which x is the agent.

(5') To prove an np between the positions i and k, the referent of which is x, one has to prove the existence of a determiner (from i to j) in front of a noun the referent of which is x.

3 All the variables in these formulae are implicitly universally quantified.


(6') To prove a vp between the positions i and k denoting an event the agent of which is x, one has to prove an intransitive verb (verb1) between the positions i and j which could identify an event the agent of which is x; adjacent to the verb, there must be an adverb which could modify the event e. The event e must satisfy the constraints imposed by the adverbial modifier.

The weight assignment is not our main concern here, but as we will see in the next section, annotating the constituents of the phrase-structure rules nicely allows for dealing with extragrammatical input. The next section will also discuss a cost-based treatment of noun-phrase referent identification. Here it will suffice to say that the weight assignment is based on empirical evidence (for instance, in rule 4, the weight on the “vp” is higher than that on the “np”, since presumably a verb phrase provides more evidence for the existence of a sentence than a noun phrase does).

To assign an interpretation to EX1) means to prove the following statement (having, let's say, a $100 assumption budget):

∃ e s(0,4,e)

Because of the weight assignment in rule 4 (w1 + w2 > 1), we cannot assume the existence of both the np and the vp; we need to be able to prove, even partially, at least one of them:

np(0,j,x)^0.4 & vp(j,4,e,x)^0.8

The proof of the first term can be done according to rule 5 (almost) for nothing:

det(0,1,the), from the input string, matches det(0,j,w1)^0.2
n(1,2,car), from the input string, matches n(1,k,w2)^1.0

To complete the “np” it remains to prove or assume car(x), depending on whether or not an object denoted by the word car, say car00017, is known in the knowledge base. If it is not known, its existence is assumed for only $2 (0.4 * 0.05 * $100), as part of the new information carried by the sentence.

To prove vp(2,4,e,car00017)^0.8, rule 6 is relevant; when fired, the new goal to prove would be:

verb1(2,j,w1)^0.64 & w1(e,car00017)^0.04 & adverb(j,4,w2)^0.04 & w2(e)^0.04 & constraint(w2,e)^0.08

The new weights annotating the predicates above result from multiplying the initial weights in rule 6 by the weight of the previous vp goal (0.8).
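This multiplication step can be spelled out numerically (a plain-arithmetic sketch; the literal strings are just labels, not prover syntax):

```python
# Firing rule 6 on the goal vp(2,4,e,car00017)^0.8: every antecedent weight
# of rule 6 is multiplied by the weight of the goal being expanded.
rule6 = {"verb1(2,j,w1)": 0.8, "w1(e,car00017)": 0.05,
         "adverb(j,4,w2)": 0.05, "w2(e)": 0.05, "constraint(w2,e)": 0.1}
goal_weight = 0.8
new_goals = {lit: round(w * goal_weight, 2) for lit, w in rule6.items()}
print(new_goals["verb1(2,j,w1)"], new_goals["constraint(w2,e)"])   # 0.64 0.08
```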


The word “veered” being unknown to the system, it will be assigned the interpretation unk(2,3,veered). Since a match between verb1(2,j,w1) and unk(2,3,veered) is not possible, rule 3 will eventually be fired.

Now, what is to be proved is:

unk(2,j,w1)^0.32 & w1(e',x)^0.064 & cat(w3,verb1)^0.064 & w3(e',x)^0.064 & ass-syn(e',w1,w3)^0.32 &
w1(e,car00017)^0.04 & adverb(j,4,w2)^0.04 & w2(e)^0.04 & constraint(w2,e)^0.08

By factoring (see [Hobbs,1990a]) the expression above would yield4:

unk(2,j,w1)^0.32 & w1(e,car00017)^0.064 & cat(w3,verb1)^0.064 & w3(e,car00017)^0.064 & ass-syn(e,w1,w3)^0.32 &
adverb(j,4,w2)^0.04 & w2(e)^0.04 & constraint(w2,e)^0.08

The input unk(2,3,veered) matches unk(2,j,w1)^0.32, so after variable binding the expression still to be proved becomes:

veered(e,car00017)^0.064 & cat(w3,verb1)^0.064 & w3(e,car00017)^0.064 & ass-syn(e,veered,w3)^0.32 &
adverb(3,4,w2)^0.04 & w2(e)^0.04 & constraint(w2,e)^0.08

The term adverb(3,4,w2)^0.04 matches the input adverb(3,4,right), so that proving constraint(right,e)^0.08 fires rule 1. One obtains:

veered(e,car00017)^0.064 & cat(w3,verb1)^0.064 & w3(e,car00017)^0.064 & ass-syn(e,veered,w3)^0.32 &
right(e)^0.04 & move1(e,x)^0.08 & mobile(x)^0.08

The predicate mobile(x)^0.08 is easily proved (rule 2), binding x to car00017. Again, by factoring, the above expression is reduced to:

veered(e,car00017)^0.064 & cat(move1,verb1)^0.064 & move1(e,car00017)^0.08 & ass-syn(e,veered,move1)^0.32 & right(e)^0.04

By dictionary check-up, cat(move1,verb1)^0.064 is proved, and finally one gets:

4 The factor of two unifiable predicates annotated by the weights w1 and w2 will be assigned the weight min(w1, w2).


veered(e,car00017)^0.064 & move1(e,car00017)^0.08 & right(e)^0.04 & ass-syn(e,veered,move1)^0.32

Since nothing can be proved further on, these four conjuncts have to be assumed. The system assumes (for $6.4) the word “veered” to be an (intransitive) verb, and (for another $32) that it is (partially) synonymous with the verb “to move” (veering is a kind of moving5).

Under these assumptions, the new information carried by the sentence is that a specific car participated in an event of type move1 (which could more precisely be referred to as a veering event) and that the direction in which this event progressed was to the right.

4. Extragrammaticality and noun-phrase referent identification

We have seen in the previous sections that weights annotate not only semantic and pragmatic information but syntactic information as well. As already suggested, using weights on the different syntactic constituents allows for dealing with syntactically deviant sentences.

Thus, if the input matched no rule in the grammar, the matching condition could be relaxed by making abductive assumptions. Recall, for instance, rule 4 (and its English formulation) used in section 3:

(4) np(i,j,x)^0.4 & vp(j,k,e,x)^0.8 ⊃ s(i,k,e)

The annotations on the categorial predicates provide an opportunity to accept incomplete sentences or ill-formed constituents. A missing noun phrase can be assumed (according to rule 4) for 40% of the sentence interpretation cost. According to rule 5, a noun phrase has to be made up of a determiner and a noun, so if either of them is missing, the noun phrase is ill-formed. However, the relatively low cost on the determiner allows, in our case, interpreting a noun as a noun phrase for only 8% (0.4 * 0.2) of the sentence interpretation cost. On the other hand, the failure to identify a noun, even with a determiner found, would force the assumption of the whole noun phrase (the “least specific” strategy is here cheaper than the “most specific” one). The same discussion holds for the verb phrase too, but the assumption cost is significantly higher.
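The recovery options above can be compared directly, assuming rule 5 annotates the noun with weight 1.0 and the referent predicate with 0.05 (C is the sentence interpretation budget; variable names are illustrative):

```python
C = 100.0
np_share = 0.4                              # weight of np in rule 4
whole_np = np_share * C                     # assume the entire np
noun_only = np_share * C * 0.2              # noun found, determiner assumed
det_only = np_share * C * (1.0 + 0.05)      # det found, noun + referent assumed
print(whole_np, noun_only, det_only)        # 40.0 8.0 42.0
```

With only a determiner found, assuming the whole noun phrase (least specific, $40) is indeed cheaper than assuming the noun and its referent ($42), as claimed.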

In the following, we will present an extension to the abductive engine described so far. The necessity of this extension will become apparent shortly.

5 This approach to partial synonymy could be elegantly expressed by the axiom: (∀ x,e) ass-syn(e,w1,w2) & w1(e,x) & etc(x) |— w2(e,x), where “etc(x)” is a predicate standing for whatever unspecified properties of x. It will never be provable, but always assumable. The predicate “etc” is a logical device analogous to the abnormality predicate in circumscription logic [McCarthy,1987].


Let us focus our attention on the interpretation of the determiner with respect to identifying the referent of a noun phrase.

Most NL designers take a definite noun phrase to refer to an entity already known in the context, while an indefinite noun phrase is taken to introduce a new object into the context.

Let us consider the rule:

(7) det(i,j,w1)^q1 & noun(j,k,w2)^q2 & w2(x)^q3 ⊃ np(i,k,x)

In this rule, the predicate “w2(x)” accounts for identifying the referent of the string “w1 w2”, proved (or assumed) as the structure of the noun group. It is obvious that the expectation of proving w2(x) heavily depends on the actual string making up the noun group.

Then, if we adopt the same convention, w2(x) would be provable if w1 represented a definite determiner, and w2(x) would be assumable if w1 represented an indefinite determiner.

To account for making such a distinction, a solution might be to split rule 7 into two more specific rules:

(8) def-det(i,j,w1)^q'1 & noun(j,k,w2)^q'2 & w2(x)^q'3 ⊃ np(i,k,x)

(9) indef-det(i,j,w1)^q"1 & noun(j,k,w2)^q"2 & w2(x)^q"3 ⊃ np(i,k,x)

In order to follow the above convention, in rule 8 the assumption cost for the referent of the noun group should be very high, while in rule 9 it should be very low.

There are some problems with this approach. For instance, suppose that the assumption budget for finding an “np” is C. Let us further suppose a state of the world, STATE1, in which an object “car00017” is known, so that “car(car00017)” is true, and another one, STATE2, in which there is no such object.

Now, let's examine what happens when the system receives the following strings:

Ex2) ... the car ...

Ex3) ... a car ...

Ex4) ... car ...

Being in STATE1, the interpretations of the noun groups in Ex2)-Ex4) will cost:

cost-ex2-STATE1 = 0 (according to rule 8)

cost-ex3-STATE1 = 0 (according to rule 9)

10

cost-ex4-STATE1 = min(q'1 • C, q"1 • C)

Being in STATE2, the interpretations of the same substrings will yield the following costs:

cost-ex2-STATE2 = q'3 • C (according to rule 8)

cost-ex3-STATE2 = q"3 • C (according to rule 9)

cost-ex4-STATE2 = min[(q'1 + q'3) • C, (q"1 + q"3) • C]

Since there is no serious argument for assigning different assumption costs to the two types of determiners, presumably q'1 ≅ q"1 = q1. According to the previous discussion, q'3 >> q"3. In this case,

cost-ex4-STATE1 = q1 • C (according to either rule 8 or rule 9)

cost-ex4-STATE2 = (q"1 + q"3) • C (according to rule 9).

Let us notice that although q"1 + q"3 < q'3, it is not possible to assume “the” to be an “indef-det” in order to apply rule 9 instead of rule 8 when interpreting “the car” in STATE2. Such an assumption would contradict the rest of the known facts, namely that “the” is a definite determiner. The results are summarized below:

Table 1

Sub-string   STATE1 cost   applied rule       STATE2 cost        applied rule
the car      0             rule 8             C • q'3            rule 8
a car        0             rule 9             C • q"3            rule 9
car          C • q1        rule 8 or rule 9   C • (q"1 + q"3)    rule 9

There are some queer results here. First, it seems odd that, irrespective of the definiteness of the noun group, when interpreted in STATE1 its referent is the same (“car00017”). This association is reflected in the table above by the null cost on the noun phrases “the car” and “a car”.

Second, it might be of interest to be able to identify precisely what rule should have been applicable when an assumption was made to overcome a certain grammatical deviance6.

6 This might be useful, for instance, in a CALL system equipped with an explanatory module dealing with ungrammatical input.


In our examples (the STATE1 case), because of the presumably equal weights on the definite and indefinite determiner predicates, the incomplete noun phrase “car”, although recovered (correctly or not), lost the information of being definite or indefinite.

A local solution to this drawback would be to assign a slightly lower weight to “def-det” than to “indef-det”, so that rule 8 is chosen. This patch would indeed favor rule 8, but the problem of indiscriminate referent resolution still remains.

Now, suppose we decided that whenever an indefinite noun phrase is interpreted, it will always introduce into the context a new object of the appropriate type, irrespective of whether an object of the type in question already exists. The use of the type predicate w2(x) is not possible in this case: indeed, if an appropriate object existed in the context, x would be bound to it.

To handle this problem, we will introduce the always assumable predicate Ass(p,x), which asserts the truth of the predicate p(x), x being an arbitrary new object.

A further enhancement of the abduction mechanism is to generalize weights even beyond what Stickel already did. Stickel (1989) generalized the weights assigned to the terms in the antecedent of an axiom to arbitrary functions of the assumability cost of the consequent of the axiom. While considerably more powerful, this generalization is still not enough. What we would like is to allow the terms in the antecedent of a rule to be annotated by what might be called “assumption-cost-tracking functions” (ACTF). An ACTF specifies the cost of assuming a predicate not just as a percentage, or even a function, of a given figure, but in relation to the costs of other assumptions already made.
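A sketch of such a cost function (names and the default weight 0.05 are illustrative assumptions): the weight attached to an always-assumable literal is computed from the record of assumptions already made, rather than being a fixed annotation.

```python
# Assumption-cost-tracking function in the spirit of the discussion above:
# if the referent literal w2(x) was itself assumed, the new-object assumption
# Ass(w2,y) comes for free; otherwise it carries a (small) default weight.
def actf_ass(assumed_so_far, q4_default=0.05):
    return 0.0 if "w2(x)" in assumed_so_far else q4_default

print(actf_ass({"w2(x)"}))   # 0.0
print(actf_ass(set()))       # 0.05
```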

Consider the new rule, replacing rule 9:

(10) indef-det(i,j,w1)^q1 & noun(j,k,w2)^q2 & w2(x)^q3 & Ass(w2,y)^q4 ⊃ np(i,k,y)

where

q4 = ACTF(w2(x)) = 0 if w2(x) was assumed, and q"4 otherwise.

The results of introducing rule 10 into the grammar are shown in the table below:

Table 2

Sub-string   STATE1 cost   applied rule   STATE2 cost        applied rule
the car      0             rule 8         C • q'3            rule 8
a car        0             rule 10        C • q"3            rule 10
car          C • q1        rule 8         C • (q"1 + q"3)    rule 10

As one can see, the net result is that in the case of an indefinite noun phrase, its referent (the variable y) will be bound to a newly created object of the appropriate type.


Also, the preferred reading of an incomplete noun phrase (as a definite noun phrase if there exists an appropriate referent, or as an indefinite noun phrase otherwise) is obtained in a more rigorous manner.

5. Conclusions

We have shown how the difficult problem of overcoming lexical gaps in understanding natural language can be solved within an abductive environment. Understanding unknown words is modeled in terms of the partial synonymy relation. The contextual meanings of the newly learnt words are imported from concepts which can be proved to satisfy all the restrictions the context imposes on them, these synonyms being ranked according to their merits in providing the best explanation for why the input string should be a meaningful sentence.

The question of the role the determiner plays in the referent identification problem (in the presence of grammatical deviance) is uniformly solved by virtue of the same basic principle: abductive inference is inference to the best explanation.

Acknowledgements

Research reported here was supported in part by a grant from the International Research & Exchanges Board (IREX), with funds provided by the Andrew W. Mellon Foundation, the National Endowment for the Humanities, the Association for Computational Linguistics and the U.S. Department of State. None of these organizations is responsible for the views expressed.

I would like to warmly thank the people at SRI working on the TACITUS project. A special mention is due to Jerry Hobbs. Without his kindness, patience and encouragement, this paper would never have been written.


REFERENCES

Appelt Douglas E., Pollack Martha E.: Weighted Abduction for Plan Ascription. SRI International, Technical Note 491, May 1990

Goodman Bradley A.: Reference Identification and Reference Identification Failures. Journal of Computational Linguistics, vol. 12, no. 4, pp. 273-305, 1986

Hobbs Jerry R.: Overview of the TACITUS Project. Journal of Computational Linguistics, vol. 12, no. 3, 1986

Hobbs Jerry R., Stickel Mark, Appelt Douglas and Martin Paul: Interpretation as Abduction. SRI International, Technical Note 499, December 1990

Hobbs Jerry R., Kameyama Megumi: Translation by Abduction. SRI International, Technical Note 484, May 1990

Konolige Kurt: Abduction vs Closure in Causal Theory. SRI International, Technical Note 505, April 1991

McCarthy John: Circumscription: A Form of Nonmonotonic Reasoning. In M. Ginsberg (ed.): Readings in Nonmonotonic Reasoning, pp. 145-152, Morgan Kaufmann Publishers, Los Altos, California, 1987

Pereira Fernando C.N., Warren David: Parsing as Deduction. Proceedings of the 21st Annual Meeting of the Association for Computational Linguistics, pp. 137-144, 1983

Shapiro Stuart C.: Techniques of Artificial Intelligence. D. Van Nostrand Company, 1979

Stickel Mark: A Prolog-Technology Theorem Prover: Implementation by an Extended Prolog Compiler. Journal of Automated Reasoning, no. 4, pp. 353-380, 1988

Stickel Mark: Rationale and Methods for Abductive Reasoning in Natural Language Interpretation. Lecture Notes in Artificial Intelligence, no. 459, pp. 233-252, Springer-Verlag, Berlin, 1989

Tufiş Dan I.: Abductive Natural Language Processing. Tutorial Notes for the International Summer School “Current Topics in Computational Linguistics”, Tzigov Chark, 62 pp., September 1992

Tufiş Dan I.: Theorem Proving Techniques in Natural Language Question Answering. Proceedings of the 3rd INFO-IASI Symposium, pp. 264-280, 1981 (in Romanian)