
MITRE Dialog Management Workshop – a review

Dan Bohus

Dialogs on Dialogs reading group, CMU, November 2003

MITRE Dialog Management Workshop

The Workshop

MITRE Dialog Workshop, held at MITRE, Bedford/Boston, October 27-28, 2003

Idea
• Bring together researchers working on dialog management
• Give them a homework
   • Adapt your dialog manager to a medical diagnosis domain (details in a sec)
• Discuss, compare, learn

workshop : godis : ravenclaw : collagen : themes


The Homework

• Implement a dialog system for the medical diagnosis domain
• Task left open-ended (diagnosis, tutoring, etc.)
• No speech, just text in and out
• Backend provided (backend.doc)
   • Java version and web-based interface version
   • 3 diseases: malaria, coccidioidomycosis, another one
   • List of symptoms: headache, nausea, muscle pain, etc.
   • Decision tree involving symptoms and tests (fever, blood tests, travel patterns, etc.)
• Small enough to presumably not be lots of work, but large enough to allow illustration of functionalities, and provide some skeleton to the discussions…


Participants

• MITRE (Carl Burke et al.): MiDiKi
• Gothenburg (Staffan Larsson): GoDiS (TRINDIKit)
• USC ICT (David Traum): ICT Dialogue Manager
• NTT/CMU (Matthias Denecke): Ariadne
• CMU (Dan, Alex): RavenClaw
• Ames (Beth-Ann Hockey): NASA Dialogue Manager
• DFKI (Norbert Reithinger): DFKI Dialogue Manager
• MERL (Candy Sidner, Charles Rich): COLLAGEN

… and others invited but not present


GoDiS

TRINDIKit: an information state update dialogue management toolkit
• Information state
   • Private: dialog plan, beliefs, agenda (short-term goals)
   • Shared: established facts, QUD (questions under discussion), last utterance information
• Dialog moves
• Update rules

GoDiS: a dialog management system implemented in TRINDIKit, handling:
• information-oriented dialogue
• action-oriented dialogue
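To make the information state idea concrete, here is a minimal Python sketch. It is not TRINDIKit's API (TRINDIKit itself is Prolog-based); the field names follow the slide, and the two update rules, `raise_next_issue` and `integrate_answer`, are hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Private:
    plan: list = field(default_factory=list)     # dialog plan: questions still to raise
    beliefs: set = field(default_factory=set)    # private beliefs
    agenda: list = field(default_factory=list)   # short-term goals

@dataclass
class Shared:
    facts: dict = field(default_factory=dict)    # established facts
    qud: list = field(default_factory=list)      # questions under discussion (top is last)
    last_utterance: Optional[tuple] = None       # (speaker, move, content)

@dataclass
class InfoState:
    private: Private = field(default_factory=Private)
    shared: Shared = field(default_factory=Shared)

def raise_next_issue(state):
    """Hypothetical update rule: if nothing is under discussion, ask the next planned question."""
    if not state.shared.qud and state.private.plan:
        question = state.private.plan.pop(0)
        state.shared.qud.append(question)
        state.shared.last_utterance = ("sys", "ask", question)
        return True
    return False

def integrate_answer(state, speaker, value):
    """Hypothetical update rule: an answer resolves the topmost question under discussion."""
    if state.shared.qud:
        question = state.shared.qud.pop()
        state.shared.facts[question] = value
        state.shared.last_utterance = (speaker, "answer", value)
        return True
    return False

state = InfoState()
state.private.plan = ["have_fever", "recent_travel"]
raise_next_issue(state)
integrate_answer(state, "usr", "yes")
print(state.shared.facts)   # {'have_fever': 'yes'}
```

Each rule has a precondition on the state and an effect on the state, which is the general shape of an update rule; everything else here is a simplification.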


TRINDIKit / GoDiS architecture

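[Figure: TRINDIKit / GoDiS architecture. Modules: input, interpret, update, select, generate, output, backend interface, control; update and select are labeled DME. Shared state: TIS, with DEVICES, LEXICON and DOMAIN. Resources: lexicon, domain knowledge (dialog plans, ontology), and the backend via a connection to Java.]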


GoDiS: Task Representation

• Plans; propositional logic
• Dialogue plans for dealing with diagnosis (issues opened at dialogue start)
   • ?x.disease(x): “Which disease is diagnosed?”
   • ?confirmed_by_interview: “Is the diagnosis confirmed by additional information?”
   • ?confirmed_by_tests: “Is the diagnosis confirmed by medical tests?”
• Additional plans
   • ?x.info(x): “What information is there about a given disease?”
   • ?x.treatment(x): “What treatment is there for a given disease?”
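A rough Python sketch of how such a plan of issues could be represented and checked against established facts. The classes `WhQuestion` and `YNQuestion`, the tuple encoding of facts, and the `resolves` test are all assumptions for illustration, not GoDiS code.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class WhQuestion:       # stands for ?x.pred(x)
    pred: str

@dataclass(frozen=True)
class YNQuestion:       # stands for ?prop
    prop: str

# Issues opened at dialogue start, as listed on the slide.
DIAGNOSIS_PLAN = [
    WhQuestion("disease"),
    YNQuestion("confirmed_by_interview"),
    YNQuestion("confirmed_by_tests"),
]

def resolves(fact, question):
    """True if a fact answers the question.

    Assumed fact encoding: ("disease", "malaria") for pred(value),
    ("confirmed_by_tests",) for a proposition, ("not", "confirmed_by_tests") for its negation.
    """
    if isinstance(question, WhQuestion):
        return len(fact) == 2 and fact[0] == question.pred
    return fact == (question.prop,) or fact == ("not", question.prop)

def open_issues(plan, facts):
    """Questions in the plan that no established fact resolves yet."""
    return [q for q in plan if not any(resolves(f, q) for f in facts)]

facts = {("disease", "malaria"), ("not", "confirmed_by_tests")}
print(open_issues(DIAGNOSIS_PLAN, facts))
# [YNQuestion(prop='confirmed_by_interview')]
```

Note that a negative answer also closes a yes/no issue; only issues with no answer at all stay open.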


GoDiS: Alternate Tasks

• User-driven dialogue (implemented)
   • Issues are not loaded when resetting; the user has to raise all issues
   • User can ask the system to
      • Provide a diagnosis
      • Confirm whether the user has a given disease
• Decision trees as dialogue plans
   • Move backend knowledge into dialogue plans
   • Information conversion could be done automatically
• Separate genre: expert system dialogue
   • Add special-purpose update rules
   • Dynamic dialogue planning by expert


GoDiS: Highlights / Lowlights

Highlights
• Reuse; you get for free:
   • Grounding
   • Accommodation / plan recognition
   • Multiple simultaneous issues & info sharing
• High-level abstraction for dialog plans
• Rapid prototyping

Lowlights
• Not used in this type of domain so far, so not entirely straightforward (update rule changes)
• Dynamic dialog plans (backend decides)


RavenClaw

Dialog Task (Specification)
• Captures all domain-specific dialog (task) logic with a hierarchical description
• The authoring effort is focused entirely here

Domain-independent Dialog Engine
• Manages dialog by executing the dialog task specification
• Provides domain-independent conversational strategies


RavenClaw Architecture

[Figure, built up over several slides: the Madeleine dialog task tree, the dialog stack, and the expectation agenda.]

Dialog task tree
Madeleine
   I:Welcome
   E:LoadSymptoms
   GeneralFeel
      R:HowAreYou?
      I:Glad
      I:Sorry
   Diagnose
      Fever
         R:AskFever
         E:MeasureTemp
         I:InformFever
      Travel

Concepts: general_feeling, chart, have_fever, diagnostic

Execution trace
• Madeleine is pushed on the dialog stack
• Welcome is pushed on top of Madeleine
• System: “Hi, this is Madeleine, the automated…”; Welcome leaves the stack
• LoadSymptoms is pushed; R:Headache and further request agents (R: R: R:) are added to the tree, and a headache concept appears
• LoadSymptoms leaves the stack
• GeneralFeel is pushed
• HowAreYou is pushed on top of GeneralFeel
• System: “How are you feeling today?”

Expectation Agenda
• general_feeling: [good], [bad], [soso]
• have_fever: [fever], ![yes], ![no]
• headache: [headache], ![yes], ![no]
• cough: [cough], ![yes], ![no]
• …

User input
• User: “Not so good, I think I have a fever”
• Parse: [soso](not so good) [fever](I think I have a fever)
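The interplay of task tree, dialog stack and expectation agenda can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not RavenClaw's implementation: the `Agent` class, the rule that an agency's agenda level collects the expectations of its whole subtree, and the prompt “Do you have a fever?” are all made up for the example, and the tree is cut down to three branches.

```python
class Agent:
    def __init__(self, name, kind="agency", prompt=None, expects=None,
                 children=(), concept=None):
        self.name, self.kind, self.prompt = name, kind, prompt
        self.expects = expects or {}     # grammar slot -> concept it fills
        self.children = list(children)
        self.concept = concept           # concept a request agent asks for
        self.done = False

def all_expects(agent):
    """Expectations of an agent plus those of its subtree (assumed policy)."""
    found = dict(agent.expects)
    for child in agent.children:
        for slot, concept in all_expects(child).items():
            found.setdefault(slot, concept)
    return found

def expectation_agenda(stack):
    """One agenda level per agent on the stack, focused agent first."""
    return [all_expects(agent) for agent in reversed(stack)]

def bind(agenda, slots, concepts):
    """Bind each parsed slot at the first agenda level that expects it."""
    for slot in slots:
        for level in agenda:
            if slot in level:
                concepts[level[slot]] = slot
                break

def run(root, inputs, concepts):
    stack, said, inputs = [root], [], iter(inputs)   # top of stack is the last item
    while stack:
        top = stack[-1]
        if top.kind == "agency":
            # Skip finished children and requests whose concept is already known.
            todo = [c for c in top.children
                    if not c.done and c.concept not in concepts]
            if todo:
                stack.append(todo[0])
                continue
        else:
            said.append(top.prompt)
            if top.kind == "request":
                bind(expectation_agenda(stack), next(inputs), concepts)
        top.done = True
        stack.pop()
    return said

feel_slots = {s: "general_feeling" for s in ("good", "bad", "soso")}
tree = Agent("Madeleine", children=[
    Agent("Welcome", "inform", "Hi, this is Madeleine, the automated…"),
    Agent("GeneralFeel", children=[
        Agent("HowAreYou", "request", "How are you feeling today?",
              feel_slots, concept="general_feeling")]),
    Agent("Diagnose", children=[
        Agent("AskFever", "request", "Do you have a fever?",   # hypothetical prompt
              {"fever": "have_fever"}, concept="have_fever")]),
])

concepts = {}
said = run(tree, [["soso", "fever"]], concepts)
print(said)       # ['Hi, this is Madeleine, the automated…', 'How are you feeling today?']
print(concepts)   # {'general_feeling': 'soso', 'have_fever': 'fever'}
```

Because [fever] is bound at a lower agenda level while HowAreYou is in focus, the sketch never asks the fever question: the over-answer in “Not so good, I think I have a fever” is absorbed.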


Illustrated Features

• Dynamic generation of dialog task structure
   • Symptoms loaded from backend; appropriate structures to “talk about them” created on-the-fly
   • New symptoms: no DM changes
• Dynamic dialog control policy
   • The order in which symptoms are addressed is controlled by the backend
• Conversational skills
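As an illustration of the first point, a small Python sketch that builds one request agent description per symptom returned by the backend. The dictionary layout, the `load_symptoms` helper and the prompt wording are assumptions; only the slot pattern ([symptom], ![yes], ![no]) is taken from the expectation agenda shown earlier.

```python
def load_symptoms(symptoms):
    """Create one request-agent description per backend symptom (hypothetical structure)."""
    agents = []
    for symptom in symptoms:
        concept = symptom.lower().replace(" ", "_")
        agents.append({
            "name": "R:" + symptom.title().replace(" ", ""),
            "concept": concept,
            "prompt": f"Are you experiencing {symptom}?",   # assumed wording
            "expects": [f"[{concept}]", "![yes]", "![no]"],
        })
    return agents

# Attach the generated agents to a Diagnose agency at run time.
diagnose = {"name": "Diagnose", "children": []}
diagnose["children"].extend(load_symptoms(["headache", "muscle pain"]))
print([agent["name"] for agent in diagnose["children"]])
# ['R:Headache', 'R:MusclePain']
```

Adding a symptom to the backend list changes the tree without touching the dialog manager code, which is the point made on the slide.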


Dynamic Dialog Control …

[Figure: the same task tree, dialog stack and expectation agenda as on the RavenClaw Architecture slides. Diagnose is now on the dialog stack above Madeleine, and a Backend Decision Tree box is connected to the dialog manager.]

• System: “Hi, this is Madeleine, the automated…”
• System: “How are you today?”
• User: “Not so good, I think I have a headache”
• System: “Sorry to hear you’re not feeling so good. Tell me more about your symptoms…”
• System: “Do you have abdominal pain?”
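A sketch of what backend-driven control could look like: the dialog manager repeatedly asks the backend for the next step, and the backend walks its decision tree using what is already known. The tree below is a toy made up for illustration (it is neither the workshop backend nor medical knowledge), and `next_step` is a hypothetical helper.

```python
# Toy decision tree: inner nodes ask about a symptom, leaves name an outcome.
TREE = {
    "ask": "fever",
    "yes": {"ask": "recent travel",
            "yes": {"diagnosis": "malaria"},
            "no": {"diagnosis": "unknown"}},
    "no": {"ask": "abdominal pain",
           "yes": {"diagnosis": "unknown"},
           "no": {"diagnosis": "unknown"}},
}

def next_step(tree, known):
    """Follow answered questions down the tree; return the first open question or the leaf."""
    node = tree
    while "ask" in node:
        answer = known.get(node["ask"])
        if answer is None:
            return ("ask", node["ask"])
        node = node["yes" if answer else "no"]
    return ("diagnosis", node["diagnosis"])

print(next_step(TREE, {}))                                        # ('ask', 'fever')
print(next_step(TREE, {"fever": False}))                          # ('ask', 'abdominal pain')
print(next_step(TREE, {"fever": True, "recent travel": True}))    # ('diagnosis', 'malaria')
```

The dialog manager only needs to map ('ask', symptom) onto the matching request agent, so the question order lives entirely in the backend.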


Conversational Skills

Corresponding agencies are added automatically to the dialog task tree:
• Help
• What Can I Say?
• Repeat
• Suspend / Resume
• Start Over
• Timeout handling (not illustrated)

Still need all the language generation prompts and grammar, but some of those are develop-once, too
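One way to picture “added automatically”: the engine attaches a fixed set of domain-independent agencies to whatever task tree the author supplies. The sketch below is an assumption about the mechanism, using plain dictionaries; the skill names are the ones on the slide, and `add_conversational_skills` is a hypothetical helper.

```python
SKILLS = ["Help", "What Can I Say?", "Repeat", "Suspend / Resume", "Start Over"]

def add_conversational_skills(root):
    """Return a copy of the task tree root with the domain-independent agencies attached."""
    present = {child["name"] for child in root["children"]}
    extra = [{"name": skill, "children": [], "domain_independent": True}
             for skill in SKILLS if skill not in present]
    return {**root, "children": root["children"] + extra}

task = {"name": "Madeleine", "children": [{"name": "Diagnose", "children": []}]}
full = add_conversational_skills(task)
print([child["name"] for child in full["children"]])
# ['Diagnose', 'Help', 'What Can I Say?', 'Repeat', 'Suspend / Resume', 'Start Over']
```

The author's tree is left untouched, and applying the function twice adds nothing new.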


RavenClaw Conclusion

Highlights
• Set task posed no challenges to the framework
   • Easy to implement
   • Dynamic dialog structure and control
   • Automatic use of domain-independent conversational skills

Lowlights?
• Toolkit perspective: how easy would it be for someone else to build it?
• Asynchronous behaviors? (timing)
• Couple of bugs / fixes (or is that a highlight?)


COLLAGEN

Collaborative Interface Agent

[Figure: the user and the agent communicate with each other; each interacts with the application and observes. Collagen maintains the plan tree and the focus stack.]


COLLAGEN Systems

• air travel planning
• email reading and responding (w. IBM/Lotus)
• GUI design tool operation
• car navigation system operation
• airport landing path planning (w. MITRE)
• gas turbine operator training (w. USC/ISI)
• personal video recorder operation
• programmable thermostat operation (w. Delft U.)
• multi-modal web-based form-filling


Collagen: Theory and Implementation

SharedPlan Discourse Theory (Grosz, Sidner, Kraus, Lochbaum 1974-1998)
• Intentional: purposes, contributes
• Linguistic: segments, lexical items
• Attentional: focus spaces, focus stack

Java Implementation
• focus stack
• purpose tree


Collagen: Discourse Segments and Purposes

(Grosz, 1974; fixing an air compressor, E = expert, A = apprentice)

Segment: replace pump and belt
   E: Replace the pump and belt please.
   Segment: replace belt
      A: Ok, I found a belt in the back.
      A: Is that where it should be?
      A: [removes belt]
      A: It’s done.
   Segment: replace pump
      E: Now remove the pump.
      E: First you have to remove the flywheel.
      E: Now take the pump off the base plate.
      A: Already did.


Discourse state representation

(Grosz & Sidner, 1986)

E: Replace the pump and belt please.
A: Ok, I found a belt in the back.
A: Is that where it should be?
A: [removes belt]
A: It’s done.

Focus Stack (top first)
• replace belt (current focus space)
• replace pump and belt

Purpose Tree
replace pump and belt
   replace pump
   replace belt


Discourse interpretation algorithm

(Lochbaum, 1998)

The current (communication or manipulation) act either (focus stack):
• starts a new segment/focus space (push)
• ends the current segment/focus space (pop)
• continues (contributes to) the current segment/... (add)

An act contributes to the purpose of a segment if it (purpose tree):
• directly achieves the purpose
• is a step in the plan for the purpose *
• identifies the recipe used to achieve the purpose
• identifies who should perform the purpose or a step in the plan
• identifies a parameter of the purpose or a step in the plan

* does not include recursive plan recognition (see later topic)
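The push / pop / add decision can be sketched in Python for the air compressor dialogue. This is a deliberately reduced illustration, not Collagen's algorithm: it handles only the “is a step in the plan for the purpose” case of the contributes relation, and the `RECIPES` table is an assumption read off the dialogue rather than a real recipe library.

```python
# Assumed recipes: purpose -> steps. Steps with their own recipe are sub-purposes.
RECIPES = {
    "replace pump and belt": ["replace belt", "replace pump"],
    "replace belt": ["remove belt"],
    "replace pump": ["remove flywheel", "take pump off base plate"],
}

def interpret(act, stack, tree):
    """Explain an act against the focus stack (top is last).

    Returns the list of operations performed, or None if the act cannot be
    explained, in which case the stack and tree are left unchanged.
    """
    trial, ops = list(stack), []
    while trial:
        purpose = trial[-1]
        if act in RECIPES.get(purpose, []):
            tree.setdefault(purpose, []).append(act)   # record in the purpose tree
            if act in RECIPES:
                trial.append(act)      # sub-purpose: start a new segment
                ops.append("push")
            else:
                ops.append("add")      # primitive step: continue the current segment
            stack[:] = trial
            return ops
        trial.pop()                    # act does not contribute here: end this segment
        ops.append("pop")
    return None

stack, tree = ["replace pump and belt"], {}
for act in ["replace belt", "remove belt", "replace pump", "remove flywheel"]:
    print(act, "->", interpret(act, stack, tree))
# replace belt -> ['push']
# remove belt -> ['add']
# replace pump -> ['pop', 'push']
# remove flywheel -> ['add']
```

“Now remove the pump” does not contribute to replacing the belt, so the belt segment is popped before the pump segment is pushed, matching the segment structure in the example dialogue.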


COLLAGEN … my take

• Separation of task from dialog/discourse engine
• Recipes / Domain plans / Task tree
   • Full-blown HTN (hierarchical task network)
      • Hierarchical
      • Preconditions (constraints)
      • Effects
      • Completion / failure
      • Live nodes
• Stack to keep track of focus and discourse structure
• Tree explicitly contains agent and user nodes
• Formalized / descriptive recipe specs (actually Java underneath), with procedure overwrites…


Themes: Task Representation

• Task representation
   • Separation of task representation from dialog engine
   • High-level representations of task
   • Descriptive rather than procedural
      • Procedural will be unavoidable for complex tasks
   • Expressive power
• GoDiS, RavenClaw, Collagen: plan-based representations of task


Themes: Task/Domain/Genre

• The notion of dialog genre
   • Tutoring
   • Diagnosis
   • Information Access
• Where to fold it in a dialog manager?
   • GoDiS: update/select rules
   • Ariadne: plugins
   • RavenClaw: collapsed with task
• How clear is that separation: task vs. genre?


Themes: Development time

• Systems took on the order of 3-5 days to develop
• Significant effort in the backend connection
   • Some sites shortcut it
• Significant effort in grammar/language generation development
   • Some sites shortcut it
• Everyone that had an implementation: “fixed a couple of bugs, but no major changes required”


Themes: Development tools

• Regression testing (GoDiS)
   • Systems are complex. If you change something in a dialog management framework, can you prove that it did not screw up things that used to work?
   • System-wise, very intractable
   • Component-wise, maybe: i.e. DM with DM inputs/outputs
• System diagnosis / log visualization tools (Collagen)
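Component-wise regression testing of a dialog manager can be as simple as replaying logged DM inputs and comparing the outputs turn by turn. The sketch below assumes the DM can be driven as a function from (state, input) to (state, output); `toy_dm`, its output strings and the log format are stand-ins invented for the example, not any of the workshop systems.

```python
def toy_dm(state, user_input):
    """Stand-in dialog manager: returns the new state and the next system act."""
    state = dict(state)
    if "fever" in user_input:
        state["have_fever"] = True
    if "have_fever" not in state:
        return state, "ask(have_fever)"
    return state, "ask(recent_travel)"

def replay(dm, log):
    """Re-run recorded DM inputs; report (turn, expected, actual) for every changed output."""
    state, failures = {}, []
    for turn, (user_input, expected) in enumerate(log):
        state, actual = dm(state, user_input)
        if actual != expected:
            failures.append((turn, expected, actual))
    return failures

# Log recorded from a version that was known to work.
LOG = [("hello", "ask(have_fever)"),
       ("I have a fever", "ask(recent_travel)")]

def broken_dm(state, user_input):
    """A 'changed' DM that forgot how to move on."""
    return state, "ask(have_fever)"

print(replay(toy_dm, LOG))      # []
print(replay(broken_dm, LOG))   # [(1, 'ask(recent_travel)', 'ask(have_fever)')]
```

This only works at the DM boundary because the inputs there are already symbolic; replaying at the whole-system level would also have to pin down recognition and generation behavior.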


Themes: Timing

• (Micro)timing unaddressed
• Turn-taking models in general, very rudimentary
• Asynchronous behaviors
   • Could be accomplished, but no one seemed to have it
• Multi-party conversation unaddressed


Themes: the important problems

Different people have different views of what those are:
• Plan / Intention recognition
• Reference resolution
• Backup in complex systems
• Tense problems
• Negations
• Grounding; error prevention / recovery


Themes: Reasoning

• Dialog Managers vs. Backends
   • Where to draw the line?
   • Who does the reasoning?
   • Can we avoid duplicating it?
   • How rich is the interaction between them?
• Dialog systems use language to act in a domain, so they are generally strongly tied
• Basic set of conversational skills can be identified
• Drawing that line is still an “art”; no general agreement or solutions exist


Themes: Science of Dialog?

• How much science do we have?
   • Theory vs. experiment
• Interesting Collagen / RavenClaw similarities
• Representation or not?
• GUI analogy
   • Do we have the checkboxes and radio-buttons?
