AQUAINT Kickoff Meeting: Advanced Techniques for Answer Extraction and Formulation. Language Computer Corporation (www.languagecomputer.com), Dallas, Texas. PI: Dan Moldovan

AQUAINT Kickoff Meeting

Advanced Techniques for Answer Extraction and Formulation

Language Computer Corporation

www.languagecomputer.com

Dallas, Texas

PI: Dan Moldovan

AQUAINT Kickoff Meeting - AnsQA

Advanced Techniques for Answer Extraction and Formulation

People
  Dan Moldovan, PI
  Sanda Harabagiu, Co-PI
  Mihai Surdeanu
  Marius Pasca
  John Lehmann
  Vasile Rus
  Earl Hood

Tasks

Task 1. QA System Taxonomy
Task 2. Answer fusion
Task 3. Develop methods for on-line ontology construction
Task 4. Develop an inference engine capable of providing answer justification
Task 5. Formulate concise and coherent answers
Task 6. Explore new QA System Architectures.

T1: QA System Taxonomy

Need for a taxonomy

Goal: Develop elaborate taxonomies for QA systems and applications

Approach:
  Develop QA System theoretical models
  Study tradeoffs and theoretical limits

T1: QA System Taxonomy

Model QA system performance as a function of:
  Question space: scope, context, judgement
  Answer space: multiple sources, fusion, interpretations
  Document space: type, indexing
  System architecture: modules, feedbacks
  Computing power: processing time, computer parameters
  Resources used: dictionaries, knowledge bases, knowledge acquisition, parsers, theorem provers

T2: Answer Fusion

Goal: Develop methods to handle questions whose answers are spread across several documents.

Approach:
  Map the original question into simpler queries
  Collect answers to these simple queries
  Assemble an answer by fusing partial answers.
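
The three steps of this approach can be sketched in a few lines of Python. This is only an illustrative sketch, not the AnsQA implementation: the function names (`decompose`, `collect`, `fuse`), the regular-expression query, and the toy corpus with its fictional countries are all assumptions made for the example.

```python
import re

# Toy corpus with invented sentences about fictional countries (illustrative only).
TOY_CORPUS = {
    "doc1": "Ruritania imports sugar from Cuba.",
    "doc2": "Freedonia imports sugar from Cuba.",
    "doc3": "Ruritania imports sugar from Cuba under a barter deal.",
}

def decompose(question):
    """Step 1 (hypothetical): map the question into simpler queries.

    A real system would derive the queries from the parsed question;
    here a single hand-written pattern stands in for that step.
    """
    return [r"(?P<answer>\w+) imports sugar from Cuba"]

def collect(query, corpus):
    """Step 2: gather (doc_id, partial answer) pairs for one simple query."""
    hits = []
    for doc_id, text in corpus.items():
        match = re.search(query, text)
        if match:
            hits.append((doc_id, match.group("answer")))
    return hits

def fuse(partial_answers):
    """Step 3: merge partial answers, recording which documents support each."""
    fused = {}
    for doc_id, candidate in partial_answers:
        fused.setdefault(candidate, []).append(doc_id)
    return fused

def answer(question, corpus=TOY_CORPUS):
    partial = []
    for query in decompose(question):
        partial.extend(collect(query, corpus))
    return fuse(partial)

result = answer("What countries import sugar from Cuba?")
print(result)
```

The fused result is a list-type answer in which each item keeps its supporting documents, so an answer found in several sources appears once with all of its evidence.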

T2: Answer Fusion

Study answer fusion at various levels of complexity:
  Questions asking simple facts
    What countries import sugar from Cuba?
  Questions that require on-line ontology development
    What software products does Microsoft sell?
    What causes asthma?
    What are the effects of alcohol on the brain?
  Speculative questions about future events
    Will the Fed cut the interest rate at its next meeting?

T2: Answer Fusion

Use IE and Text Mining to map semantic relations into lexico-syntactic patterns that in turn help to develop ontologies.

Q: What causes asthma?
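
As a rough illustration of how a semantic relation such as CAUSE can be mapped into lexico-syntactic patterns, the sketch below matches two hand-written patterns against sentences. The patterns, the function name `extract_causes`, and the example sentences are all invented for this sketch; they are not the patterns or data used in the project.

```python
import re

# Hypothetical lexico-syntactic patterns for the CAUSE relation.
# Named groups mark which side of the pattern is the cause and which the effect.
CAUSE_PATTERNS = [
    # "<cause> causes/triggers/leads to <effect>"
    re.compile(
        r"(?P<cause>[\w\s]+?) (?:causes|cause|triggers|trigger|leads to|lead to) "
        r"(?P<effect>[\w\s]+)",
        re.IGNORECASE,
    ),
    # "<effect> is/are caused/triggered by <cause>"
    re.compile(
        r"(?P<effect>[\w\s]+?) (?:is|are) (?:caused|triggered) by (?P<cause>[\w\s]+)",
        re.IGNORECASE,
    ),
]

def extract_causes(sentences, effect):
    """Return the causes found for `effect`, in order of first appearance."""
    causes = []
    for sentence in sentences:
        sentence = sentence.strip().rstrip(".")
        for pattern in CAUSE_PATTERNS:
            match = pattern.fullmatch(sentence)
            if match and effect.lower() in match.group("effect").lower():
                cause = match.group("cause").strip()
                if cause not in causes:
                    causes.append(cause)
                break
    return causes

# Invented example sentences, not corpus data.
sentences = [
    "Tobacco smoke triggers asthma attacks.",
    "Asthma is caused by airway inflammation.",
    "Asthma causes wheezing.",          # asthma is the cause here, so it is skipped
    "The museum opens at noon.",        # no causal pattern
]
causes = extract_causes(sentences, "asthma")
print(causes)
```

Each extracted cause is a candidate concept for an ontology node linked to the question keyword by the CAUSE relation.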

T3: On-line Ontology Development

Goal: Automatically develop ontological structures that help answer some complex questions

Approach: Use methods for knowledge acquisition from text to extract and classify concepts and relations relevant to question keywords

T3: On-line Ontology Development

Example: What software products does Microsoft sell?

T4: Answer Justification

Goal: Develop an inference engine capable of justifying an answer via a logical proof

Approach:
  Transform questions and document paragraphs into logical representations
  Use world knowledge axioms extracted from WordNet glosses
  Construct lexical chains between query concepts and candidate answer sentence concepts
  Apply unification on lexical chains.

T4: Answer Justification

Example Q481: Who shot Billy the Kid?

P1: The scene called for Phillips’ character to be saved from a lynching when Billy the Kid (Emilio Estevez) shot the rope in half just as he was about to be hanged.

P2: In 1881, outlaw William H. Bonney Jr., alias Billy the Kid, was shot and killed by Sheriff Pat Garrett in Fort Sumner, N.M.

LFT:

Q: PERSON(x1) & shoot(e1, x1, x2) & Billy_the_Kid(x2)
P1: Billy_the_Kid(x1') & shoot(e1', x1', x2') & rope(x2')
P2: Billy_the_Kid(x1') & shoot(e1', x2', x1') & Sheriff_Pat_Garrett(x2')

Unification: e1 = e1', x1 = x2', x2 = x1'

The logic proof succeeds for P2, and the agent of the shooting, Sheriff_Pat_Garrett, unifies with PERSON.
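
The unification step in the Q481 example can be reproduced with a small backtracking matcher. This is a minimal sketch under stated assumptions, not the project's inference engine: the tuple encoding of logic forms, the `PERSON_ENTITIES` lookup that stands in for named-entity typing, and the function names are hypothetical. Predicate arguments follow the slide: shoot(event, agent, object).

```python
# Logic forms as lists of (predicate, argument-tuple) pairs.
QUESTION = [("PERSON", ("x1",)), ("shoot", ("e1", "x1", "x2")), ("Billy_the_Kid", ("x2",))]
P1 = [("Billy_the_Kid", ("x1'",)), ("shoot", ("e1'", "x1'", "x2'")), ("rope", ("x2'",))]
P2 = [("Billy_the_Kid", ("x1'",)), ("shoot", ("e1'", "x2'", "x1'")), ("Sheriff_Pat_Garrett", ("x2'",))]

# Assumption: a named-entity recognizer has typed these predicates as persons.
PERSON_ENTITIES = {"Billy_the_Kid", "Sheriff_Pat_Garrett"}

def predicate_matches(q_name, p_name):
    """The answer type PERSON matches any person entity; other names must be equal."""
    if q_name == "PERSON":
        return p_name in PERSON_ENTITIES
    return q_name == p_name

def unify_args(q_args, p_args, bindings):
    """Extend bindings so question variables map to passage variables, or return None."""
    if len(q_args) != len(p_args):
        return None
    extended = dict(bindings)
    for q_var, p_var in zip(q_args, p_args):
        if extended.setdefault(q_var, p_var) != p_var:
            return None  # variable already bound to something else
    return extended

def prove(question, passage, bindings=None):
    """Backtracking search for bindings under which every question predicate holds."""
    bindings = {} if bindings is None else bindings
    if not question:
        return bindings
    (q_name, q_args), rest = question[0], question[1:]
    for p_name, p_args in passage:
        if predicate_matches(q_name, p_name):
            extended = unify_args(q_args, p_args, bindings)
            if extended is not None:
                result = prove(rest, passage, extended)
                if result is not None:
                    return result
    return None

def answer_entity(passage, bindings, answer_var="x1"):
    """Return the person predicate whose variable is bound to the answer variable."""
    for p_name, p_args in passage:
        if p_name in PERSON_ENTITIES and p_args == (bindings[answer_var],):
            return p_name
    return None

proof_p1 = prove(QUESTION, P1)
proof_p2 = prove(QUESTION, P2)
print(proof_p1)                      # None: the proof fails for P1
print(proof_p2)                      # bindings for P2
print(answer_entity(P2, proof_p2))   # the entity that unifies with PERSON
```

The proof fails for P1 because Billy_the_Kid is the agent of shoot there, so the question's object variable x2 cannot be bound to him. For P2 the matcher backtracks from the first person candidate and arrives at the bindings shown in the example.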

T4: Answer Justification

Example Q045: When did Lucelly Garcia, former ambassador of Columbia to Honduras die?

A: “Several gunmen on a highway leading to the Columbian city of Ibaque murdered Colombian ambassador to Honduras Lucelly Garcia today.”

WordNet axioms:
  Columbian: of Columbia
  murder: kill intentionally with premeditation
  kill: cause to die
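
The way these gloss axioms connect the question verb "die" to the answer verb "murder" can be illustrated with a breadth-first search over the glosses. This is a simplified sketch: the glosses are copied from the slide into a hand-written dictionary rather than read from WordNet, the function name `lexical_chain` is hypothetical, and words are assumed to be already lemmatized (so "murdered" is looked up as "murder").

```python
from collections import deque

# Gloss axioms from the example, written by hand as word -> gloss words.
GLOSS_AXIOMS = {
    "Columbian": ["of", "Columbia"],
    "murder": ["kill", "intentionally", "with", "premeditation"],
    "kill": ["cause", "to", "die"],
}

def lexical_chain(source, target, axioms=GLOSS_AXIOMS):
    """Return the shortest chain of words linking source to target via glosses."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for word in axioms.get(path[-1], []):
            if word not in seen:
                seen.add(word)
                queue.append(path + [word])
    return None  # no chain found

chain = lexical_chain("murder", "die")
print(chain)
```

A chain such as murder, kill, die is what lets the proof treat a sentence about a murder as evidence for a question about a death.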

T5: Answer formulation

Goal: Develop methods to formulate concise and coherent answers

Approach:
  Answer formulation receives inputs from answer extraction, dialogue, and other modules.
  Language generation operators piece together the information provided by the answer elements.

T5: Answer formulation

Example: How do trade liberalization and foreign aid affect international migration?

T6: Explore new QA System Architectures

Goal: Study innovative QA system architectures that are high performance, modular, and tunable.

Approach: Study, via analytical modeling, simulations, and implementation, the architectural implications of features such as:
  High recall
  High precision
  Fast response time
  Portability
  Large volume and diversity of documents