AQUAINT Year I Mid-Year Review

JAVELIN Project Briefing

Language Technologies Institute
Carnegie Mellon University

Status Update for Mid-Year Program Review

May 15, 2002

Current Team

Eric Nyberg
Jamie Callan
Jaime Carbonell
Bob Frederking
Alon Lavie
Teruko Mitamura
Dave Svoboda
Jeongwoo Ko
Michael Duggan
Krzysztof Czuba
Vasco Pedro
Yifen Huang
Laurie Hiyakumoto
Lucian Lita

Research Objectives

• QA as Planning
  – Create a general QA planning system
  – How should a QA system represent its chain of reasoning?
• QA and Auditability
  – How can we improve a QA system’s ability to justify its steps?
  – How can we make QA systems open to machine learning?

Research Objectives [2]

• Utility-Based Information Fusion
  – Perceived utility is a function of many different factors
  – Create and tune utility metrics, e.g.:

    U = argmax_k [ F(Rel(I,Q,T), Nov(I,T,A), Ver(S,Sup(I,S)), Div(S), Cmp(I,A)), Cst(I,A) ]

    I: Info item, Q: Question, S: Source, T: Task context, A: Analyst

    Rel: relevance; Nov: novelty; Ver, Sup: veracity, support; Div: diversity; Cmp: comprehensibility; Cst: cost
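
A minimal sketch of how a utility metric of this kind might be computed and maximized over candidate information items. The slide does not specify the combining function F, so the weighted sum, the weight values, and the decision to subtract cost are all illustrative assumptions, as are the names `Factors`, `utility`, and `best_item`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Factors:
    """Factor scores for one information item, each assumed to lie in [0, 1]."""
    rel: float  # Rel(I,Q,T): relevance
    nov: float  # Nov(I,T,A): novelty
    ver: float  # Ver(S,Sup(I,S)): veracity, support
    div: float  # Div(S): diversity
    cmp: float  # Cmp(I,A): comprehensibility
    cst: float  # Cst(I,A): cost


# Hypothetical weights; these are the values one would tune.
WEIGHTS = {"rel": 0.4, "nov": 0.2, "ver": 0.2, "div": 0.1, "cmp": 0.1}
COST_WEIGHT = 0.5


def utility(f: Factors) -> float:
    """Assumed form of F: weighted sum of benefits minus weighted cost."""
    benefit = sum(w * getattr(f, name) for name, w in WEIGHTS.items())
    return benefit - COST_WEIGHT * f.cst


def best_item(candidates: dict[str, Factors]) -> str:
    """Return the key of the candidate with maximal utility (the argmax)."""
    return max(candidates, key=lambda k: utility(candidates[k]))


# Made-up scores for two candidate items.
candidates = {
    "item_a": Factors(rel=0.9, nov=0.2, ver=0.8, div=0.5, cmp=0.7, cst=0.6),
    "item_b": Factors(rel=0.7, nov=0.8, ver=0.7, div=0.5, cmp=0.8, cst=0.1),
}
print(best_item(candidates))
```

With these toy numbers the more relevant item loses to the cheaper, more novel one, which is the kind of trade-off a tuned metric would have to settle.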

Project Status Summary

• Started in November 2001
• Fully staffed in December 2001
• Basic end-to-end architecture operational with limited coverage
• Working on question types for TREC evaluation
• Working to integrate advanced JAVELIN capabilities (e.g., Planner)

Research Plan & Progress

• Develop end-to-end system
  – Architecture, Planner, Repository
  – Individual QA Modules
  First version being tested now
• Evaluation:
  – English queries
  – English, Chinese and Japanese documents
  Starting with English only

QA Evaluation in the Large

• Information-Based Evaluation
  current focus, for TREC QA track
• Utility-Based Evaluation
  when planner is integrated
• Architectural Evaluation
  future task: specify and analyze the properties of the design

JAVELIN team participated in LREC ’02 workshop on QA systems evaluation

JAVELIN Basic Architecture

All objects created or retrieved are stored centrally for reuse

Details of module implementation are hidden

Planner is independent from the particular QA modules being used

First End-to-End System Completed

Next Step: Integrate Planner
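
The three architectural properties listed above (central storage, hidden module implementations, control logic that does not depend on particular modules) can be illustrated with a small sketch. This is not JAVELIN code: `Repository`, `QAModule`, `run_pipeline`, and the two toy modules are hypothetical names chosen for the example.

```python
from abc import ABC, abstractmethod
from itertools import count


class Repository:
    """Central store: every object created or retrieved is kept for reuse."""

    def __init__(self):
        self._objects = {}
        self._next_id = count(1)

    def store(self, obj) -> int:
        obj_id = next(self._next_id)
        self._objects[obj_id] = obj
        return obj_id

    def fetch(self, obj_id: int):
        return self._objects[obj_id]


class QAModule(ABC):
    """Uniform interface; callers never see how a module is implemented."""

    @abstractmethod
    def run(self, obj):
        ...


class Tokenizer(QAModule):  # toy stand-in, not a real JAVELIN module
    def run(self, obj):
        return obj.split()


class TokenCounter(QAModule):  # toy stand-in, not a real JAVELIN module
    def run(self, obj):
        return len(obj)


def run_pipeline(repo: Repository, modules: list[QAModule], input_id: int) -> int:
    """Control logic that depends only on the QAModule interface."""
    current_id = input_id
    for module in modules:
        result = module.run(repo.fetch(current_id))
        current_id = repo.store(result)  # intermediate results are kept too
    return current_id


repo = Repository()
question_id = repo.store("When was the Bush administration?")
final_id = run_pipeline(repo, [Tokenizer(), TokenCounter()], question_id)
print(repo.fetch(final_id))
```

Because `run_pipeline` only calls `run`, swapping in a different module needs no change to the control logic, and every intermediate object stays retrievable by its identifier.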

JAVELIN Data Flow

Merged

Revised Architecture

Diagram labels:
• Data Repository
• JAVELIN GUI
• Question Analyst
• Answer Generator
• Retrieval Strategist
• Execution Manager
• ... search engines & document collections
• process history and results
• operator (action) models
• Request Filler
• Planner
• Domain Model

Module Integration

• Via XML DTDs for each object type
• Modules use simple XML object-passing protocol built on TCP/IP
• Execution Manager takes care of checking objects in/out of Repository
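
A rough sketch of what XML object passing over a stream socket could look like. The slides do not describe the actual wire format, so the 4-byte length prefix, the element names, and the helpers `encode_object` and `recv_object` are assumptions made for illustration. The demo uses `socket.socketpair()` to stay self-contained; real modules would connect over TCP. Note that Python's standard `xml.etree.ElementTree` does not validate against a DTD, so that step is omitted here.

```python
import socket
import struct
import xml.etree.ElementTree as ET


def encode_object(obj_type: str, fields: dict[str, str]) -> bytes:
    """Serialize an object as XML, prefixed with its length (assumed framing)."""
    root = ET.Element(obj_type)
    for name, value in fields.items():
        ET.SubElement(root, name).text = value
    payload = ET.tostring(root, encoding="utf-8")
    return struct.pack("!I", len(payload)) + payload


def _recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes; a stream socket may deliver them in pieces."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed connection mid-message")
        buf += chunk
    return buf


def recv_object(sock: socket.socket) -> tuple[str, dict[str, str]]:
    """Read one length-prefixed XML object and return (type, fields)."""
    (length,) = struct.unpack("!I", _recv_exact(sock, 4))
    root = ET.fromstring(_recv_exact(sock, length))
    return root.tag, {child.tag: child.text or "" for child in root}


# Demo: one end plays the sending module, the other the receiver.
sender, receiver = socket.socketpair()
sender.sendall(encode_object("Question", {"id": "q1", "text": "Who founded CMU?"}))
obj_type, fields = recv_object(receiver)
sender.close()
receiver.close()
print(obj_type, fields)
```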

End-to-End System

• User Interface: Java client
• Repository: Microsoft SQL Server
• Execution Manager: Java server/IIM
• Question Analyzer: Module server/KANTOO
• Retrieval Strategist: Module server/Inquery
• Request Filler: Module server
• Answer Generator: Module server
• Planner: Module server

Evaluation: Current

• Execution Manager can run in “lights out” batch mode

• Nightly tests on different test suites (starting with TREC QA)

• Results include scores and logs for debugging

• Working up to full TREC QA evaluation for TREC ’02
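
The unattended batch mode described above might be sketched as follows. Everything here is hypothetical: `run_batch`, `toy_system`, the test-suite layout, and the scoring rule (fraction of questions whose answer matches a regular-expression pattern) are assumptions, not the Execution Manager's actual behavior or the official TREC metric.

```python
import logging
import re

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("batch_eval")


def run_batch(system, test_suite):
    """Run every question unattended; return (score, per-question results)."""
    results = []
    for qid, question, pattern in test_suite:
        try:
            answer = system(question)
        except Exception as exc:  # one failure must not stop a nightly run
            log.error("%s failed: %r", qid, exc)
            answer = ""
        correct = re.search(pattern, answer, re.IGNORECASE) is not None
        log.info("%s correct=%s answer=%r", qid, correct, answer)
        results.append((qid, correct))
    score = sum(c for _, c in results) / len(results) if results else 0.0
    return score, results


def toy_system(question: str) -> str:
    """Stand-in QA system with limited coverage; raises on unknown questions."""
    canned = {"What is the capital of France?": "Paris is the capital."}
    return canned[question]


# Made-up test suite: (question id, question text, answer pattern).
suite = [
    ("q1", "What is the capital of France?", r"\bParis\b"),
    ("q2", "Who wrote Hamlet?", r"\bShakespeare\b"),
]
score, results = run_batch(toy_system, suite)
print(score, results)
```

Catching per-question exceptions and logging them is what lets a run finish without supervision while still leaving a trail for debugging.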

Sample Test Results Page

Sample Results

Sample Log File Excerpt

JAVELIN User Interface

Repository ERD

(Entity Relationship Diagram)

Short-Term Goals

• Prepare for TREC QA evaluation!
• Integrate Planner with end-to-end system
• Acquire Japanese and Chinese resources

Post-TREC Tasks

• Execution Manager & UI:
  – Support for interactive dialog
  – Extended evaluation capability
  – Ability to re-run prior question with modifications
• Repository:
  – Support for answer justification
  – Support for end-user repository search

Post-TREC Tasks [2]

• Planner:
  – Ablation studies
  – Advanced question types
  – Planning parameter variations
• Question Analyzer:
  – Broaden coverage of question parsing
  – Support for additional question types
  – Produce interlingua for request object

Post-TREC Tasks [3]

• Retrieval Strategist:
  – Add Japanese and Chinese document collections, relational DB support
  – Add Google as a document collection
  – Support for answer verification
  – Switch to Lemur Toolkit
• Request Filler:
  – Reference resolution
  – Deeper NL analysis

Post-TREC Tasks [4]

• Answer Generator:
  – Combining more evidence types (predicate argument structure, event boundaries)
  – Extended answer types (hypotheticals)

Gathering Evidence in Answer Generation

• Q: Name all the bills that were passed during the Bush administration.
• Not likely to find passages mentioning ‘bill’, ‘pass’, ‘Bush administration’.
• When was the Bush administration?
• ‘Symbolic’ QA: look for an explicit answer in the collection; it might not be present.
• ‘Statistical’ QA: look at the distribution of documents mentioning the Bush administration.
• Combining evidence of different sorts!

Gathering Evidence [2]

• Can we figure out whether the Bush administration was around when a document was written?
• Look at tense/aspect/wording.
• Forward time references: Bush administration will do something
• Backward time references: Bush administration has done something
• Hypothesis:
  – Backward time references provide information about onset of event;
  – Forward time references provide information about end of event.
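
As a very crude illustration of looking at tense and wording, the sketch below labels a sentence as a forward or backward time reference using a handful of surface cues. The cue lists, the regular expressions, and the function name `time_reference` are assumptions for this example; the slides do not say how tense and aspect were actually analyzed, and a real system would need proper linguistic analysis.

```python
import re

# Hypothetical surface cues; far from complete and prone to false matches.
FORWARD = re.compile(r"\b(will|shall|is going to|plans? to|intends? to)\b", re.I)
BACKWARD = re.compile(r"\b(has|have|had)\s+\w+(ed|en|ne)\b", re.I)


def time_reference(sentence: str) -> str:
    """Return 'forward', 'backward', or 'unknown' for one sentence."""
    if FORWARD.search(sentence):
        return "forward"
    if BACKWARD.search(sentence):
        return "backward"
    return "unknown"


for s in (
    "Bush administration will do something",
    "Bush administration has done something",
    "Bush administration officials",
):
    print(s, "->", time_reference(s))
```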

Clustering Evidence

• Bush administration forward references

[Plot: # docs mentioning Bush adm. on given day, against time stamps; annotations: administration change, event end]

Clustering Evidence [2]

• Bush administration backward references

[Plot: # docs mentioning Bush adm. on given day, against time stamps; annotations: administration change, event onset]
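
One possible reading of the two plots, sketched under stated assumptions: count documents per day separately for backward and forward references, then take the earliest well-supported backward-reference day as a bound on the event onset and the latest well-supported forward-reference day as a bound on the event end. The threshold `min_docs`, the function names, and the dates are all made up for illustration; the slides do not give the actual clustering method.

```python
from collections import Counter
from datetime import date


def daily_counts(docs, direction):
    """Number of documents per time stamp with the given reference direction."""
    return Counter(day for day, ref in docs if ref == direction)


def estimate_interval(docs, min_docs=2):
    """Estimate (onset, end) of an event from dated references.

    Backward references ("has done") can only occur after onset, so the
    earliest supported day bounds the onset; forward references ("will do")
    occur before the end, so the latest supported day bounds the end.
    """
    backward = daily_counts(docs, "backward")
    forward = daily_counts(docs, "forward")
    onset_days = [d for d, n in backward.items() if n >= min_docs]
    end_days = [d for d, n in forward.items() if n >= min_docs]
    onset = min(onset_days) if onset_days else None
    end = max(end_days) if end_days else None
    return onset, end


# Made-up toy data: (document time stamp, reference direction).
docs = [
    (date(2000, 1, 2), "backward"),  # isolated mention, below threshold
    (date(2000, 1, 3), "backward"),
    (date(2000, 1, 3), "backward"),
    (date(2000, 1, 10), "forward"),
    (date(2000, 1, 10), "forward"),
    (date(2000, 1, 12), "forward"),  # isolated mention, below threshold
]
onset, end = estimate_interval(docs)
print(onset, end)
```

Requiring a minimum number of documents per day is a simple guard against isolated mentions; a real system would more likely use a proper clustering or change-point method.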