
AQUAINT

BBN’s AQUA Project

Ana Licuanan, Jonathan May, Scott Miller, Ralph Weischedel, Jinxi Xu

3 December 2002

BBN’s Approach to QA

• Theme: Use document retrieval, entity recognition, & proposition recognition

• Analyze the question

– Reduce question to propositions and a bag of words

– Predict the type of the answer

• Rank candidate answers using passage retrieval from the primary corpus (the AQUAINT corpus)

• Other knowledge sources (e.g. the Web) are optionally used to rerank answers

• Re-rank candidates based on propositions

• Estimate confidence for answers
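A toy illustration of the bag-of-words retrieval step in this pipeline (the corpus, tokenization, and scoring below are simplified stand-ins, not BBN’s system):

```python
# Toy bag-of-words retrieval: rank passages by overlap with the question.
# Corpus, tokenization, and scoring are simplified stand-ins.

CORPUS = [
    "Dell, beating Compaq, sold the most PCs in 2001.",
    "Compaq sold servers and PCs throughout the 1990s.",
]

def bag_of_words(text):
    # Lowercase and strip trailing punctuation to approximate tokenization.
    return {w.strip(".,?").lower() for w in text.split()}

def retrieve(question):
    # Rank passages by the number of shared question words.
    q = bag_of_words(question)
    return sorted(CORPUS, key=lambda p: len(q & bag_of_words(p)), reverse=True)

for passage in retrieve("Which company sold the most PCs in 2001?"):
    print(passage)
```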

System Diagram

[Diagram: a Question flows through Question Classification, Document Retrieval, Passage Retrieval, Web Search, Name Extraction, NP Labeling, Parsing, Description Classification, Proposition Finding, and Confidence Estimation to yield an Answer & Confidence Score. Training resources: Treebank, Name Annotation, Regularization, Proposition Bank.]

Question Classification

• A hybrid approach based on rules, statistical parsing, and question templates

– Match question templates against statistical parses

– Back off to statistical bag-of-words classification

• Example features used for classification

– The type of WHNP starting the question (e.g., “Who”, “What”, “When”, …)

– The headword of the core NP

– WordNet definition

– Bag of words

– Main verb of the question

• Performance

– TREC8&9 questions for training

– ~85% when testing on TREC10
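A minimal sketch of this hybrid strategy (the templates and cue words are invented for illustration, not BBN’s actual rules): try question templates first, then back off to a crude bag-of-words guess.

```python
import re

# Toy hybrid question classifier: match question templates first, then
# back off to a bag-of-words cue lookup. Patterns and cues are invented
# for illustration, not BBN's actual templates.

TEMPLATES = [
    (re.compile(r"^where\b", re.I), "LOCATION"),
    (re.compile(r"^who\b", re.I), "PERSON"),
    (re.compile(r"^when\b", re.I), "DATE"),
    (re.compile(r"^how (many|much)\b", re.I), "QUANTITY"),
]

BACKOFF_CUES = {"cost": "MONEY", "year": "DATE", "city": "GPE"}

def classify(question):
    for pattern, qtype in TEMPLATES:
        if pattern.search(question):
            return qtype
    # Back off: look for cue words anywhere in the question.
    for word in question.lower().strip("?").split():
        if word in BACKOFF_CUES:
            return BACKOFF_CUES[word]
    return "OTHER"

print(classify("Where is the Taj Mahal?"))         # LOCATION
print(classify("What city hosted the Olympics?"))  # GPE
```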

Examples of Question Analysis

• Where is the Taj Mahal?

– WHNP=where

– Answer type: Location or GPE

• Which pianist won the last International Tchaikovsky Competition?

– Headword of core NP = pianist

– WordNet definition=person

– Answer type: Person

Question-Answer Types

Type: Subtypes

ORGANIZATION: CORPORATION, EDUCATIONAL, GOVERNMENT, HOSPITAL, HOTEL, MUSEUM, OTHER, POLITICAL, RELIGIOUS

LOCATION: CONTINENT, LAKE_SEA_OCEAN, OTHER, REGION, RIVER, BORDER

FAC: AIRPORT, ATTRACTION, BRIDGE, BUILDING, HIGHWAY_STREET, OTHER

GAME

PRODUCT: DRUG, OTHER, VEHICLE, WEAPON

NATIONALITY: NATIONALITY, OTHER, POLITICAL, RELIGION

LANGUAGE

FAC_DESC: AIRPORT, ATTRACTION, BRIDGE, BUILDING, HIGHWAY_STREET, OTHER

MONEY

GPE_DESC: CITY, COUNTRY, OTHER, STATE_PROVINCE

ORG_DESC: CORPORATION, EDUCATIONAL, GOVERNMENT, HOSPITAL, HOTEL, MUSEUM, OTHER, POLITICAL, RELIGIOUS

CONTACT_INFO: ADDRESS, OTHER, PHONE

WORK_OF_ART: BOOK, OTHER, PAINTING, PLAY, SONG

*Thanks to USC/ISI and IBM groups for sharing the conclusions of their analyses.

Question-Answer Types (cont’d)

PRODUCT_DESC: OTHER, VEHICLE, WEAPON

PERSON

EVENT: HURRICANE, OTHER, WAR

SUBSTANCE: CHEMICAL, DRUG, FOOD, OTHER

PER_DESC

PRODUCT: OTHER

ORDINAL

ANIMAL

QUANTITY: 1D, 1D_SPACE, 2D, 2D_SPACE, 3D, 3D_SPACE, ENERGY, OTHER, SPEED, WEIGHT, TEMPERATURE

GPE: CITY, COUNTRY, OTHER, STATE_PROVINCE

DISEASE

CARDINAL

AGE

TIME

PLANT

PERCENT

LAW

DATE: AGE, DATE, DURATION, OTHER

Frequency of Q Types

[Bar chart: number of questions of each type in TREC 8, 9, and 10 (counts from 0 to ~250). Types shown: Person, Quantity, Money, Percent, Organization, Organization-Desc, Product-Name, Product-Desc, Facility, Disease, Reason, GPE, GPE-Desc, Work-of-Art, Date, Event, Time, Language, Nationality, Location-Name, Definition, Use, Other, Cardinal, Ordinal, Game, Contact Info, Animal, Plant, Bio, Cause-Effect-Influence, Law. Y-axis: # in TREC 8, 9, 10.]

Interpretation

IdentiFinder™ Status

• Current IdentiFinder performance on types

• IdentiFinder easily trainable for other languages, e.g., Arabic and Chinese

[Bar chart of current IdentiFinder performance (percent):

             Recall  Precision   F
Category       88       89     88.4
Subcategory    87       88     87.3]

Proposition Indexing

• A shallow semantic representation

– Deeper than bags of words

– But broad enough to cover all the text

• Characterizes documents by

– The entities they contain

– Propositions involving those entities

• Resolves all references to entities

– Whether named, described, or pronominal

• Represents all propositions that are directly stated in the text
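One plausible data layout for such an index (an assumption for illustration, not BBN’s actual structure) maps (predicate, role, entity) triples back to the documents that contain them:

```python
from collections import defaultdict

# Hypothetical proposition index: (predicate, role, entity) triples map
# back to the documents that contain them. Layout is an assumption for
# illustration, not BBN's actual data structure.

index = defaultdict(set)

def add_document(doc_id, propositions):
    for predicate, args in propositions:
        for role, entity in args.items():
            index[(predicate, role, entity.lower())].add(doc_id)

add_document("d1", [("sold", {"subj": "Dell", "obj": "the most PCs", "in": "2001"})])
add_document("d2", [("merged", {"subj": "Compaq", "with": "HP"})])

# Which documents assert that something sold "the most PCs"?
print(index[("sold", "obj", "the most pcs")])  # {'d1'}
```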

Proposition Finding Example

• Question: Which company sold the most PCs in 2001?

• Text: Dell, beating Compaq, sold the most PCs in 2001.

• Propositions:

– (e1: “Dell”)

– (e2: “Compaq”)

– (e3: “the most PCs”)

– (e4: “2001”)

– (sold subj:e1, obj:e3, in:e4)

– (beating subj:e1, obj:e2)

• Passage retrieval alone would select the wrong answer; proposition matching identifies e1 (“Dell”) as the answer
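A toy reconstruction of this matching step (the representation and matching logic are our assumptions, not BBN’s implementation): the question’s proposition leaves the wh-phrase as an open slot, and a text proposition answers it only when the predicate and all filled arguments agree.

```python
# Toy reconstruction of the example above. The question's proposition
# leaves the wh-phrase as an open slot (None).

text_props = [
    ("sold", {"subj": "Dell", "obj": "the most PCs", "in": "2001"}),
    ("beating", {"subj": "Dell", "obj": "Compaq"}),
]

# "Which company sold the most PCs in 2001?" with subj as the open slot.
question_prop = ("sold", {"subj": None, "obj": "the most PCs", "in": "2001"})

def find_answers(q_prop, props):
    q_pred, q_args = q_prop
    for pred, args in props:
        if pred != q_pred:
            continue
        # Every filled question argument must agree with the text proposition.
        if all(v is None or args.get(k) == v for k, v in q_args.items()):
            # The entity filling the open slot is the candidate answer.
            yield next(args[k] for k, v in q_args.items() if v is None)

print(list(find_answers(question_prop, text_props)))  # ['Dell']
```

Note that bag-of-words overlap alone cannot distinguish Dell from Compaq here; the subj role of “sold” does.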

Proposition Recognition Strategy

• Start with a lexicalized, probabilistic (LPCFG) parsing model

• Distinguish names by replacing NP labels with NPP

• Currently, rules normalize the parse tree to produce propositions

• At a later date, extend the statistical model to

– Predict argument labels for clauses

– Resolve references to entities
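As a concrete illustration of rule-based normalization (the tree encoding and the single rule below are invented for illustration), a simple subject-verb-object clause can be mapped onto a proposition:

```python
# Invented normalization rule: map a simple clause
# (S (NPP subj) (VP verb (NP obj))) onto a proposition. Real rules would
# handle many more constructions (passives, modifiers, subclauses, ...).

tree = ("S", ("NPP", "Dell"),
             ("VP", "sold", ("NP", "the most PCs")))

def clause_to_prop(node):
    label, subj, vp = node
    assert label == "S" and subj[0] == "NPP" and vp[0] == "VP"
    verb, obj = vp[1], vp[2]
    return (verb, {"subj": subj[1], "obj": obj[1]})

print(clause_to_prop(tree))  # ('sold', {'subj': 'Dell', 'obj': 'the most PCs'})
```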

Confidence Estimation

• Compute the probability P(correct|Q,A) from the following features:

P(correct|Q,A) ≈ P(correct | type(Q), <m,n>, PropSat)

– type(Q): question type

– m: question length

– n: number of matched question words in the answer context

– PropSat: whether the answer satisfies the propositions in the question

• Confidence for answers found on the Web:

P(correct|Q,A) ≈ P(correct | Freq, InTrec)

– Freq = number of Web hits, using Google

– InTrec = whether the answer was also a top answer from the AQUAINT corpus
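A sketch of how such a table-lookup estimate could be computed from held-out question/answer pairs (the add-prior smoothing is our own choice for unseen feature combinations, not necessarily BBN’s):

```python
from collections import Counter

# Estimate P(correct | type(Q), <m, n>, PropSat) by counting outcomes on
# held-out question/answer pairs.

totals, corrects = Counter(), Counter()

def observe(qtype, m, n, prop_sat, correct):
    key = (qtype, m, n, prop_sat)
    totals[key] += 1
    corrects[key] += int(correct)

def p_correct(qtype, m, n, prop_sat, prior=0.25, strength=1.0):
    # Add-prior smoothing so unseen feature combinations fall back to the prior.
    key = (qtype, m, n, prop_sat)
    return (corrects[key] + strength * prior) / (totals[key] + strength)

observe("PERSON", 4, 3, True, True)
observe("PERSON", 4, 3, True, False)
print(round(p_correct("PERSON", 4, 3, True), 3))  # 0.417
```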

Dependence of Answer Correctness on Question Type

[Bar chart: P(correct|Type) by question type, on a scale of 0 to 0.5.]

Dependence on Proposition Satisfaction

[Bar chart: P(correct|PropSat) for PropSat=True vs. PropSat=False, on a scale of 0 to 0.6.]

Dependence on Number of Matched Words

[Line chart: p(correct) vs. number of matched words (0 to 6), with separate curves for question length 3, 4, and 5; p(correct) ranges from 0 to ~0.5.]

Dependence of Answer Correctness on Web Frequency

[Line chart: P(correct|F, InTrec) vs. frequency of the answer in Google summaries (0 to 150), with separate curves for InTrec=true and InTrec=false; P ranges from 0 to 1.]

Official Results of TREC 2002 QA

RunTag     Unranked Average Precision   Ranked Average Precision   Upper-bound

BBN2002A            0.186                       0.257                 0.498
BBN2002B            0.288                       0.468                 0.646
BBN2002C            0.284                       0.499                 0.641

• BBN2002A did not use the Web

• BBN2002B and BBN2002C used the Web

• Unranked average precision = percentage of questions for which the first answer is correct

• Ranked average precision = confidence-weighted score, the official metric for TREC 2002

• Upper-bound = confidence-weighted score given perfect confidence estimation
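The confidence-weighted score sorts a system’s answers by decreasing confidence and averages, over ranks i, the fraction of the first i answers that are correct. A small sketch:

```python
# Confidence-weighted score (the TREC 2002 metric): sort answers by
# decreasing confidence; the score is the mean over ranks i of
# (number correct among the first i answers) / i.

def confidence_weighted_score(results):
    # results: one (confidence, is_correct) pair per question
    ordered = sorted(results, key=lambda r: r[0], reverse=True)
    correct_so_far, total = 0, 0.0
    for i, (_, is_correct) in enumerate(ordered, start=1):
        correct_so_far += int(is_correct)
        total += correct_so_far / i
    return total / len(ordered)

# With perfect confidence estimation, correct answers sort first,
# which is what the "Upper-bound" column reflects.
print(confidence_weighted_score([(0.9, True), (0.8, True), (0.1, False)]))  # ~0.889
```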

Recent Progress

• In the last six months, we have:

– Retrained our name tagger (IdentiFinder™) for roughly 29 question types

– Distributed the re-trained English version of IdentiFinder to other sites

– Participated in the Question Answering track of TREC 2002

– Participated in a pilot evaluation of automatically answering definitional/biographical questions

– Developed a demonstration of our question answering system AQUA against streaming news
