Light Semantic Processing for QA
AQUAINT 18-Month Workshop
Language Technologies Institute, Carnegie Mellon
B. Van Durme, Y. Huang, A. Kupsc and E. Nyberg
Towards Light Semantic Processing for Question Answering
Overview of This Talk
• Motivation
• Components of the Approach
  – Logical Form
  – Similarity Measure
  – Unification Strategy
• Incorporation into JAVELIN
• Future Work / Next Steps
Example of Extraction Error
• Question: “When was Wendy’s founded?”
• Passage candidate:
  – “The renowned Murano glassmaking industry, on an island in the Venetian lagoon, has gone through several reincarnations since it was founded in 1291. Three exhibitions of 20th-century Murano glass are coming up in New York. By Wendy Moonan.”
• Statistical extractor: 20th-century
Basic Idea
Q: “xxx xxxx xxxx xxxx xxxxxxxxxx xx xxxxx?” → extract → A(?,C)
P: “xxx xxxx xxxx xxxx xxxxx xx xxxxx.” → extract → A(B,C)
Unifying A(?,C) with A(B,C) gives ? = B
Unification on simple predicates representing basic argument structure will provide a more accurate way to match questions with appropriate answer(s).
Two Challenges:
• Where do predicates come from?
• Flexibility in interpretation…
partial interpretation
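The exact-match case of this idea can be sketched in a few lines of Python. This is a minimal illustration, not the JAVELIN implementation: the tuple encoding of predicates and the helper names `is_variable` and `unify` are assumptions made for the example, and it deliberately ignores the flexibility the slide lists as a challenge.

```python
from typing import Dict, Optional, Tuple

# Assumed encoding: a predicate is (name, arguments), e.g. ("A", ("?", "C")).
Predicate = Tuple[str, Tuple[str, ...]]


def is_variable(term: str) -> bool:
    # Assumption: variables are written with a leading "?", as on the slide.
    return term.startswith("?")


def unify(question: Predicate, passage: Predicate) -> Optional[Dict[str, str]]:
    """Return variable bindings if the two predicates match exactly, else None."""
    q_name, q_args = question
    p_name, p_args = passage
    if q_name != p_name or len(q_args) != len(p_args):
        return None
    bindings: Dict[str, str] = {}
    for q_arg, p_arg in zip(q_args, p_args):
        if is_variable(q_arg):
            # A variable may be bound to one term only.
            if bindings.setdefault(q_arg, p_arg) != p_arg:
                return None
        elif q_arg != p_arg:
            # Constants must be identical in this strict version.
            return None
    return bindings


print(unify(("A", ("?", "C")), ("A", ("B", "C"))))  # {'?': 'B'}
```

Strict equality on constants is exactly what fails on real text, which is why the rest of the talk replaces it with a similarity score.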
Associating Tokens with Concepts
• Imprecise Reference, e.g.:
  “John W. was greeted by William Clinton”
  “Bill greeted Mr. Wright”
• Definite Description, e.g. “Mr. Bush” vs. “the president”
• Anaphoric Reference
UNIFY( {GREET(“William Clinton”, “John W.”)}, {GREET(“Bill”, “Mr. Wright”)} )
Interpretation of tokens must be:
• Approximate, not exact
• Context-sensitive
Language Processing Tools
• BBN IdentiFinder (BBN, 2000)
• Link Grammar parser (Grinberg et al., 1995)
• KANTOO parser (Nyberg & Mitamura, 2000)
• Brill part-of-speech tagger (Brill, 1995)
• WordNet (Fellbaum, 1998)
• Lexical Conceptual Structure (LCS) Database (Dorr, 2001)
Representation
• Formula: a set of literals
• Literal: a predicate, plus two terms
• Extrinsic literal: a relation mapping a label to a label
  – SUBJECT(x1,x2)
• Intrinsic literal: a relation mapping a label to a value
  – ROOT(x1,|Benjamin|)
• Value: EVENT, past, +, |Mary Smith|,…
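One way to hold this representation in code is sketched below. The class name `Literal`, its field names, and the `intrinsic` flag are assumptions chosen for illustration; only the two example literals come from the slide.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Literal:
    predicate: str   # e.g. "SUBJECT" or "ROOT"
    first: str       # always a label, e.g. "x1"
    second: str      # a label (extrinsic) or a value (intrinsic)
    intrinsic: bool  # True when `second` is a value rather than a label


# A formula is a set of literals.
Formula = FrozenSet[Literal]

formula: Formula = frozenset({
    Literal("SUBJECT", "x1", "x2", intrinsic=False),   # label -> label
    Literal("ROOT", "x1", "Benjamin", intrinsic=True),  # label -> value
})

extrinsic = [lit for lit in formula if not lit.intrinsic]
intrinsic = [lit for lit in formula if lit.intrinsic]
print(len(extrinsic), len(intrinsic))  # 1 1
```

Frozen dataclasses are hashable, so literals can sit in a set, which matches the "set of literals" definition directly.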
Example
Q = Who killed Jefferson?
ROOT(x1,?a0), ROOT(x2,|kill|), ROOT(x3,|Jefferson|),
TYPE(x2,|event|), TYPE(x1,|person|), TYPE(x3,|person|),
SUBJECT(x2,x1), OBJECT(x2,x3), ANS(?a0)
P = Benjamin murdered Jefferson.
ROOT(y1,|Benjamin|), ROOT(y2,|murder|), ROOT(y3,|Jefferson|),
TYPE(y2,|event|), TYPE(y1,|person|), TYPE(y3,|person|),
SUBJECT(y2,y1), OBJECT(y2,y3)
Graphically
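[Figure: the question and passage formulae drawn as two graphs. The labels x1, x2, x3 (question) and y1, y2, y3 (passage) are nodes; SUBJECT and OBJECT edges connect labels to labels, and ROOT and TYPE edges connect each label to its values (?a0, kill, Jefferson, Benjamin, murder, person, event).]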
Similarity Functions
• A zero-to-one function that returns a value representing similarity between the formulae for question, passage
• Unification requires similarity measurement between literal values
• sim(“Who killed Jefferson?”, “Benjamin murdered Jefferson.”) = 0.9
sim(formula0,formula1)
Given two formulae, we define the similarity to be the geometric mean of the similarity between the separate extrinsic literals.
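The aggregation step alone can be illustrated as follows. The sketch assumes the extrinsic literals of the two formulae have already been paired and scored, since how they are paired is not spelled out here; the function name `geometric_mean`, the input scores, and the choice to return 0.0 for an empty list are all assumptions.

```python
import math
from typing import Sequence


def geometric_mean(scores: Sequence[float]) -> float:
    """Geometric mean of similarity scores in [0, 1]."""
    if not scores:
        return 0.0  # assumption: nothing to compare means no similarity
    return math.prod(scores) ** (1.0 / len(scores))


# Hypothetical per-literal scores for two paired extrinsic literals.
print(round(geometric_mean([0.9, 0.9]), 3))
```

A geometric mean drops to zero as soon as any paired literal scores zero, so a single incompatible relation rules out the whole match.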
sim(extrinsicLiteral0,extrinsicLiteral1)
To measure the similarity between two extrinsic literals, we take the square root of the product of the similarity between each of the two pairs of labels.
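A minimal sketch of this step, assuming an extrinsic literal is a (predicate, label, label) triple and that a label-level similarity function is supplied by the caller. Returning 0.0 when the predicates differ is an assumption of the example, as are the toy label scores.

```python
import math
from typing import Callable, Tuple

ExtrinsicLiteral = Tuple[str, str, str]  # (predicate, label0, label1)


def sim_extrinsic(
    lit0: ExtrinsicLiteral,
    lit1: ExtrinsicLiteral,
    sim_label: Callable[[str, str], float],
) -> float:
    pred0, a0, a1 = lit0
    pred1, b0, b1 = lit1
    if pred0 != pred1:
        return 0.0  # assumption: different relations never match
    # Square root of the product of the two label-pair similarities.
    return math.sqrt(sim_label(a0, b0) * sim_label(a1, b1))


# Hypothetical label similarities for SUBJECT(x2,x1) vs. SUBJECT(y2,y1).
toy_scores = {("x2", "y2"): 0.8, ("x1", "y1"): 1.0}
score = sim_extrinsic(
    ("SUBJECT", "x2", "x1"),
    ("SUBJECT", "y2", "y1"),
    lambda a, b: toy_scores.get((a, b), 0.0),
)
print(round(score, 3))
```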
sim(label0,label1)
To measure the similarity of two labels, we find the maximum possible value of taking the geometric mean of the similarity of each pairwise combination of intrinsic literals that are shared by the two labels.
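One possible reading of this definition is sketched below. It assumes that "shared" means both labels carry an intrinsic literal with the same predicate (such as ROOT or TYPE), that a label may have several candidate values per predicate, and that the pairing for each predicate can be chosen independently, so the maximum geometric mean equals the geometric mean of the per-predicate maxima. The dictionary encoding and the toy value similarity are assumptions.

```python
import math
from typing import Callable, Dict, List

# Assumed encoding: predicate -> non-empty list of candidate values for a label.
Intrinsics = Dict[str, List[str]]


def sim_label(
    lits0: Intrinsics,
    lits1: Intrinsics,
    sim_value: Callable[[str, str], float],
) -> float:
    shared = sorted(set(lits0) & set(lits1))
    if not shared:
        return 0.0  # assumption: no shared intrinsic literals, no similarity
    # Best-scoring value pair for each shared predicate.
    best = [
        max(sim_value(v0, v1) for v0 in lits0[p] for v1 in lits1[p])
        for p in shared
    ]
    return math.prod(best) ** (1.0 / len(best))


def toy_sim(v0: str, v1: str) -> float:
    # Hypothetical value similarity: identical values score 1.0.
    if v0 == v1:
        return 1.0
    return 0.8 if {v0, v1} == {"kill", "murder"} else 0.0


x2 = {"ROOT": ["kill"], "TYPE": ["event"]}
y2 = {"ROOT": ["murder"], "TYPE": ["event"]}
print(round(sim_label(x2, y2, toy_sim), 3))
```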
sim(intrinsicLiteral0,intrinsicLiteral1)
The similarity between two intrinsic literals is measured by similarity of the paired words, times the weight of the first literal.
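As a small illustration, assume each intrinsic literal carries its own weight and that a word-level similarity function is passed in. The tuple layout, the weight of 0.5, and the stand-in word similarity are hypothetical; where the weights come from is not stated here.

```python
from typing import Callable, Tuple

# Assumed encoding: (predicate, label, value, weight).
IntrinsicLiteral = Tuple[str, str, str, float]


def sim_intrinsic(
    lit0: IntrinsicLiteral,
    lit1: IntrinsicLiteral,
    sim_word: Callable[[str, str], float],
) -> float:
    # Word similarity scaled by the weight of the FIRST literal only,
    # so the measure is not symmetric when the weights differ.
    return sim_word(lit0[2], lit1[2]) * lit0[3]


q_lit = ("ROOT", "x2", "kill", 0.5)    # hypothetical weight
p_lit = ("ROOT", "y2", "murder", 1.0)  # hypothetical weight
value = sim_intrinsic(q_lit, p_lit, lambda a, b: 0.8)
print(value)  # 0.4
```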
sim(word0,word1)
• sim(|kill|,|murder|) = 0.8
  – via WordNet distance function
• sim(?a0,|Benjamin|) = 1.0
  – zero cost for variable binding
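The two cases above can be mimicked with a small lookup table standing in for the WordNet distance function. The table `WORD_SIM`, the scores for identical and unknown word pairs, and the function names are assumptions; only the kill/murder value and the variable-binding rule come from the slide.

```python
from typing import Dict, FrozenSet

# Hypothetical stand-in for a WordNet-based distance; only this one
# value is taken from the slide.
WORD_SIM: Dict[FrozenSet[str], float] = {frozenset({"kill", "murder"}): 0.8}


def is_variable(term: str) -> bool:
    return term.startswith("?")


def sim_word(word0: str, word1: str) -> float:
    if is_variable(word0) or is_variable(word1):
        return 1.0  # zero cost for variable binding
    if word0 == word1:
        return 1.0  # assumption: identical words are fully similar
    # Assumption: word pairs missing from the table score 0.0.
    return WORD_SIM.get(frozenset({word0, word1}), 0.0)


print(sim_word("kill", "murder"))    # 0.8
print(sim_word("?a0", "Benjamin"))   # 1.0
```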
Example
Answer
• Find the maximum possible similarity score, return the term bound to ?a0
• ?a0/|Benjamin|
• sim(Q,P) = 0.9
• Answer = Benjamin, 0.9
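The selection step can be sketched as picking the highest-scoring unification and reading off the binding for ?a0. The candidate list format and the function name `best_answer` are assumptions, and the second candidate with score 0.4 is invented purely to show the comparison.

```python
from typing import Dict, List, Optional, Tuple

# Assumed encoding: (similarity score, variable bindings) per unification.
Candidate = Tuple[float, Dict[str, str]]


def best_answer(
    candidates: List[Candidate], variable: str = "?a0"
) -> Optional[Tuple[str, float]]:
    """Return (term, score) for the best candidate that binds `variable`."""
    bound = [(score, b[variable]) for score, b in candidates if variable in b]
    if not bound:
        return None
    score, term = max(bound)
    return term, score


candidates = [
    (0.9, {"?a0": "Benjamin"}),   # from the slide
    (0.4, {"?a0": "Jefferson"}),  # hypothetical weaker unification
]
print(best_answer(candidates))  # ('Benjamin', 0.9)
```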
Current Status, Future Work
• First version implemented, testing now
• Short Term: Test “NLP IX” against statistical extraction module on factoid questions
• Longer Term:
  – Support simple reasoning about questions and passages
  – Investigate approach in narrower domains
    • Question answering based on CNS data on terrorism and weapons of mass destruction
  – Extend similarity metric at word level
    • Word co-occurrence information
    • Distance metrics on ontologies other than WordNet
  – Incorporate LCS Lexicon
Summary
• We believe complex question answering requires more than statistical extraction methods
• Knowledge bottleneck forces compromise in depth of language processing
• Robust unification based on heuristic measure of similarity offers short-term solution
Additional Resources
• Paper available:
B. Van Durme, Y. Huang, A. Kupsc and E. Nyberg (2003). “Towards Light Semantic Processing for Question Answering”, presented at the HLT/NAACL 2003 Workshop on Text Meaning.
• This and other papers at the JAVELIN web site:
http://www.lti.cs.cmu.edu/Research/JAVELIN
Questions?