RavenClaw

Yet another (please read “An improved”) dialog management architecture for task-oriented spoken dialog systems

Presented by: Dan Bohus (dbohus@cs.cmu.edu)

Work by: Dan Bohus, Alex Rudnicky

Carnegie Mellon University, 2002

New DM Architecture: Goals

- Able to handle complex, goal-directed dialogs
  - Go beyond (information access systems and) the slot-filling paradigm
- Easy to develop and maintain systems
  - Developer focuses only on the dialog task
  - Automatically ensure a minimum set of task-independent conversational skills
- Open to learning (hopefully at both the task and discourse levels)
- Open to dynamic SDS generation
- More careful, more structured code, logs, etc.: provide a robust basis for future research.

A View from far, far away

[Figure: three-layer view of the system: Conversational Skills, Dialog Task Specification, Backend. Example utterances and actions shown: “What did you just say?”, “What’s your name?”, “SELECT * WHERE …”, “Since that failed, I need you to push button B”, “Can you repeat that, please?”, “Suspend… Resume…”]

- Let the developer focus only on the dialog task specification:
  - Don’t worry about misunderstandings, repeats, focus shifts, etc.; merely describe (program) the task, assuming perfect knowledge of the world
- Automatically generate the conversational mechanisms
- Examples

Outline

- Goals
- A view from far away
- Main ideas
  - Dialog Task Specification / Execution
  - Conversational skills
- In more detail
  - Dialog Task Specification / Execution
  - Conversational skills

Dialog Task Spec & Execution

[Figure: dialog task tree, one level per line]
Communicator
    Welcome, Login, Travel, Locals, Bye
        AskRegistered, AskName, GreetUser, GetProfile, Leg1
            DepartLocation, ArriveLocation

- Agencies and microagents (for input, request, execute, …)
- Handle concepts
- Execution with interleaved input passes
  - Execute the agents by top-down “planning”
  - Do input passes when information is required
- REMEMBER: this is just the dialog task
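To make the execution idea concrete, here is a minimal sketch of top-down execution with interleaved input passes. It is not RavenClaw's implementation: the `Agent` class, the `execute` function, the scripted `input_pass`, and the parent/child grouping of the tree nodes are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Agent:
    """Hypothetical task-tree node: an agency or a microagent."""
    name: str
    kind: str = "AGENCY"            # AGENCY, REQUEST, INFORM or EXECUTE
    concept: Optional[str] = None   # concept a REQUEST agent asks for
    children: List["Agent"] = field(default_factory=list)


def execute(agent: Agent, concepts: Dict[str, object],
            input_pass: Callable[[Agent], Dict[str, object]],
            trace: List[str]) -> None:
    """Depth-first, left-to-right walk of the task tree."""
    trace.append(agent.name)
    if agent.kind == "REQUEST" and agent.concept not in concepts:
        # Information is required here, so interleave an input pass.
        concepts.update(input_pass(agent))
    for child in agent.children:
        execute(child, concepts, input_pass, trace)


def scripted_input_pass(agent: Agent) -> Dict[str, object]:
    """Stand-in for the user: returns canned answers (assumed values)."""
    answers = {"registered": True, "name": "alice"}
    return {agent.concept: answers[agent.concept]}


# Assumed grouping: Login owns the two REQUEST agents.
tree = Agent("Communicator", children=[
    Agent("Welcome", "INFORM"),
    Agent("Login", children=[
        Agent("AskRegistered", "REQUEST", "registered"),
        Agent("AskName", "REQUEST", "name"),
    ]),
    Agent("Bye", "INFORM"),
])

trace: List[str] = []
concepts: Dict[str, object] = {}
execute(tree, concepts, scripted_input_pass, trace)
```

The walk visits agents in tree order and only pauses for input at REQUEST agents whose concept is still missing, which is the sense in which input passes are interleaved with execution.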

Handling inputs

[Figure: the same dialog task tree, rooted at Communicator]

Input Pass
- Assemble an agenda of expectations (open concepts)
- Bind values from the input to the concepts
- Process non-understandings (if any), analyze need for focus shifts
- Continue execution
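The first two steps of the input pass can be sketched as follows. This is an illustrative sketch only: the dictionary layout of agents, the `expects` key, and the function names `assemble_agenda` and `bind` are assumptions, not part of the architecture described in the slides.

```python
def assemble_agenda(focus_stack):
    """Collect open concepts, most focused agent first, without duplicates."""
    agenda = []
    for agent in focus_stack:            # focus_stack[0] is the focused agent
        for concept in agent["expects"]:
            if concept not in agenda:
                agenda.append(concept)
    return agenda


def bind(agenda, parsed_input, concepts):
    """Bind input slots to expected concepts; return the slots nobody expected."""
    unbound = {}
    for slot, value in parsed_input.items():
        if slot in agenda:
            concepts[slot] = value
        else:
            unbound[slot] = value
    return unbound


# Hypothetical state: DepartLocation is in focus, Leg1 is its parent.
focus_stack = [
    {"name": "DepartLocation", "expects": ["depart_city"]},
    {"name": "Leg1", "expects": ["depart_city", "arrive_city"]},
]
agenda = assemble_agenda(focus_stack)

concepts = {}
leftover = bind(agenda, {"arrive_city": "Boston", "hotel": "Hilton"}, concepts)
```

Here the user answered a question the system had not asked yet (`arrive_city`), which still binds because the parent agency expects it; the `hotel` slot is left over for the non-understanding and focus-shift steps.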

Conversational Skills / Mechanisms

- A lot of problems in SDS are caused by a lack of conversational skills. “It’s all in the little details!”
  - Dealing with misunderstandings
  - Generic channel/dialog mechanisms: repeats, focus shift, context establishment, help, start over, etc.
  - Timing
- Even when these mechanisms are in, they lack uniformity and consistency.
- Development and maintenance are time consuming.

Conversational Skills / Mechanisms

- More or less task-independent mechanisms:
  - Implicit/explicit confirmations, clarifications, disambiguation = the whole misunderstandings problem
  - Context reestablishment
  - Timeout and barge-in control
  - Back-channel absorption
  - Generic dialog mechanisms: Repeat, Suspend… Resume, Help, Start over, Summarize, Undo, Querying the system’s belief
- The core takes care of these by dynamically inserting into the task tree agencies that handle these mechanisms.

- Goal: able to handle complex domains, beyond information-access, frame-based, slot-filling systems, e.g. Symphony, Intelligent checklists, Navigation, Route planning
- We need a powerful enough formalism to describe all these tasks:
  - C++ code?
  - Declarative would be nice… but is it powerful enough?
  - Templatized C++ code…?
- A possible more formalized approach: a tree of agents with:
  - Preconditions
  - Success criteria
  - Focus criteria (triggers)
- Expressed mostly in terms of concepts
  - Data, Type (basic, struct, array)
  - Confidence, Availability, Ambiguousness, Groundedness, System/User, TurnAcquired, TurnConveyed, etc.
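As a rough illustration of what a concept carrying these properties might look like, here is a minimal sketch. The `Concept` class and every attribute name in it are assumptions; the slides only list the properties, not how they are stored or computed.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional, Tuple


@dataclass
class Concept:
    """Hypothetical concept: a set of (value, confidence) hypotheses."""
    name: str
    hypotheses: List[Tuple[Any, float]] = field(default_factory=list)
    grounded: bool = False
    turn_acquired: Optional[int] = None

    @property
    def available(self) -> bool:
        # A concept is available once it holds at least one hypothesis.
        return bool(self.hypotheses)

    @property
    def ambiguous(self) -> bool:
        # More than one competing hypothesis means the value is ambiguous.
        return len(self.hypotheses) > 1

    def top(self) -> Optional[Tuple[Any, float]]:
        """Return the most confident hypothesis, or None if unavailable."""
        if not self.hypotheses:
            return None
        return max(self.hypotheses, key=lambda h: h[1])


depart = Concept("depart_city", [("Seattle", 0.71), ("SF", 0.29)], turn_acquired=3)
empty = Concept("arrive_city")
```

Preconditions, success criteria, and focus triggers could then be written as small predicates over such objects, for example "`depart.available` and not `depart.ambiguous`".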

An example DTS

UserLogin: AGENCY
    concepts: registered(BOOL), name(STRING), id(STRING), profile(PROFILE), profile_found(BOOL)
    achieves_when: profile || InformProfileNotFound

    AskRegistered: REQUEST(registered)
        grammar: {[yes]->true, [no]->false, [guest]->false}
    AskName: REQUEST(name)
        precond: registered==no
        grammar: [user_name]
        max_attempts: 2
    InformGreetUser: INFORM
        precond: name
    AskID: REQUEST(id)
        precond: registered==yes
        mapping: [user_id]
    DoProfileRetrieval: EXECUTE
        precond: name || id
        call: ABEProfile.Call >name, >id, <profile, <profile_found
    InformProfileNotFound: INFORM
        precond: !profile_found

Given that the baseline is 259 lines of C++ code, this is pretty good.
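The preconditions in the example above can be mirrored in a few lines of Python to show how a core might evaluate them. This is a sketch under stated assumptions: the `USER_LOGIN` table, the `preconditions_met` helper, and the reading of `registered==no` and `!profile_found` as "known and false" are illustrative choices, not the actual DTS semantics.

```python
# Each entry: (agent name, microagent type, precondition over the concepts).
# Concepts are a plain dict; a missing key means "not yet available".
USER_LOGIN = [
    ("AskRegistered", "REQUEST", lambda c: True),
    ("AskName", "REQUEST", lambda c: c.get("registered") is False),
    ("InformGreetUser", "INFORM", lambda c: "name" in c),
    ("AskID", "REQUEST", lambda c: c.get("registered") is True),
    ("DoProfileRetrieval", "EXECUTE", lambda c: "name" in c or "id" in c),
    ("InformProfileNotFound", "INFORM", lambda c: c.get("profile_found") is False),
]


def preconditions_met(concepts):
    """Names of the agents whose precondition holds for these concepts."""
    return [name for name, _kind, precond in USER_LOGIN if precond(concepts)]


at_start = preconditions_met({})
guest = preconditions_met({"registered": False})
member = preconditions_met({"registered": True, "id": "u42"})
```

The helper only checks preconditions; deciding which of the eligible agents has already completed, and which one runs next, is left to the agency's execution policy.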

Can a formalism cut it?

- People have repeatedly tried formalizing dialog… and failed
- We’re focusing only on the task (like in robotics/execution)
- Actually, these agents are all C++ classes, so we can back off to code; the hope is that most of the behaviors can be easily expressed as above.

Other Ideas for DTS

- 4 microagents: Inform, Request, Expect, Execute
- Provide a library of “common task” and “common discourse” agencies
  - Frame agency
  - List browse agency
  - Choose agency
  - Disambiguate agency, Ground agency, …
  - Etc.

DTS execution

- Agency.Execute() decides what is executed next
- Various simple policies can be implemented
  - Left-to-right (open/closed), choice, etc.
- But free to do more sophisticated things (MDPs, etc.) ~ learning at the task level
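One way to read the points above is that the policy is a pluggable function inside the agency. The sketch below assumes that reading; the `Agency` class, the `state` dictionary, the `Date` child, and the `highest_value` policy (a stand-in for something learned) are hypothetical and not taken from the slides.

```python
class Agency:
    """Hypothetical agency whose execute() delegates to a policy function."""

    def __init__(self, name, children, policy):
        self.name = name
        self.children = children      # child agent names, in declared order
        self.policy = policy

    def execute(self, state):
        """Return the child to run next, or None when all are done."""
        candidates = [c for c in self.children if c not in state["done"]]
        if not candidates:
            return None
        return self.policy(candidates, state)


def left_to_right(candidates, state):
    """Simple policy: first unfinished child in declared order."""
    return candidates[0]


def highest_value(candidates, state):
    """Stand-in for a learned policy: pick the child with the best estimate."""
    return max(candidates, key=lambda c: state["values"].get(c, 0.0))


children = ["DepartLocation", "ArriveLocation", "Date"]
state = {"done": {"DepartLocation"}, "values": {"ArriveLocation": 0.2, "Date": 0.9}}

next_simple = Agency("Leg1", children, left_to_right).execute(state)
next_learned = Agency("Leg1", children, highest_value).execute(state)
```

Swapping the policy changes the dialog flow without touching the task description, which is where learning at the task level could plug in.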

Input Pass

1. Construct an agenda of expectations
   - (Partially?) ordered list of concepts expected by the system
2. Bind values/confidences to concepts
   - The system-initiative (SI) <> mixed-initiative (MI) spectrum can be expressed in terms of the way the agenda is constructed and of the binding policies, independent of task
3. Process non-understandings (if any): try to detect the source and inform the user:
   - Channel (SNR, clipping)
   - Decoding (confidence score, prosody)
   - Parsing ([garble])
   - Dialog level (POK, but no expectation)
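Step 3 can be sketched as a cascade of checks from the channel up to the dialog level. Everything concrete here is an assumption made for illustration: the function name, its parameters, and especially the two thresholds, which are placeholders rather than tuned or published values. Prosody is left out.

```python
def diagnose_non_understanding(snr_db, clipped, confidence, parse, agenda,
                               min_snr_db=10.0, min_confidence=0.3):
    """Return the likely source of a non-understanding, or None if understood.

    parse is a dict of slots extracted by the parser (empty if only garble);
    agenda is the list of concepts the system currently expects.
    """
    if clipped or snr_db < min_snr_db:
        return "channel"      # bad audio: noise or clipping
    if confidence < min_confidence:
        return "decoding"     # recognizer is unsure of its hypothesis
    if not parse:
        return "parsing"      # nothing but [garble] came out of the parser
    if not any(slot in agenda for slot in parse):
        return "dialog"       # parsed fine, but nothing was expected
    return None


agenda = ["depart_city", "arrive_city"]
noisy = diagnose_non_understanding(4.0, False, 0.8, {"depart_city": "Seattle"}, agenda)
unsure = diagnose_non_understanding(20.0, False, 0.1, {"depart_city": "Seattle"}, agenda)
garbled = diagnose_non_understanding(20.0, False, 0.8, {}, agenda)
unexpected = diagnose_non_understanding(20.0, False, 0.8, {"hotel": "Hilton"}, agenda)
understood = diagnose_non_understanding(20.0, False, 0.8, {"depart_city": "Seattle"}, agenda)
```

The returned label could then select a matching prompt to inform the user, for example asking them to speak closer to the microphone when the source is the channel.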

Input Pass (continued)

4. Focus shifts
   - Focus shifts seem to be task dependent: the decision to shift focus is taken by the task (DTS)
   - But they also have a task-independent (TI) side (sub-dialog size, context reestablishment). Context reestablishment is handled automatically, in the core (see later)

Task-Independent, Conversational Mechanisms

- Should be transparently handled by the core; little or no effort from the developer
- However, the developer should be able to write his own customized mechanisms if needed
- Handled by inserting extra “discourse” agents on the fly in the dialog task specification

Conversational Skills

- Universal dialog mechanisms: Repeat, Suspend… Resume, Help, Start over, Summarize, Undo, Querying the system’s belief
- The grounding / misunderstanding problems
- Timing and barge-in control
- Focus shifts, context establishment
- Back-channel absorption

Q: To what extent can we abstract these away from the dialog task?

Repeat

- Repeat (simple)
  - The dialog task tree (DTT) is adorned with a “Repeat” agency automatically at start-up
  - Which calls upon the OutputManager
  - Not all outputs are “repeatable” (e.g. implicit confirms, GUI, …)… which ones exactly…?
- Repeat (with referents)
  - only 3%, they are mostly [summarize]
- User-defined custom repeat agency
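The simple repeat case can be sketched as an agency that asks the output manager for the last repeatable prompt. Only the name OutputManager comes from the slides; its methods, the `repeatable` flag, the `RepeatAgency` class, and the fallback prompt are assumptions for illustration.

```python
class OutputManager:
    """Hypothetical output manager that remembers what was said."""

    def __init__(self):
        self.history = []             # list of (prompt, repeatable) pairs

    def say(self, prompt, repeatable=True):
        self.history.append((prompt, repeatable))
        return prompt

    def last_repeatable(self):
        """Most recent prompt marked repeatable, or None."""
        for prompt, repeatable in reversed(self.history):
            if repeatable:
                return prompt
        return None


class RepeatAgency:
    """Hypothetical discourse agency added to the task tree at start-up."""

    def __init__(self, output_manager):
        self.output_manager = output_manager

    def execute(self):
        prompt = self.output_manager.last_repeatable()
        if prompt is None:
            return "Sorry, there is nothing to repeat."
        return prompt


om = OutputManager()
om.say("Where are you leaving from?")
om.say("Leaving from Seattle...", repeatable=False)   # implicit confirm
repeated = RepeatAgency(om).execute()
nothing = RepeatAgency(OutputManager()).execute()
```

A user-defined repeat agency would replace `RepeatAgency` while keeping the same `execute()` entry point.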

Help

- DTT adorned at start-up with a help agency
- Can capture and issue:
  - Local help (obtained from focused agent)
  - ExplainMore help (obtained from focused agent)
  - What can I say?
  - Contextual help (obtained from main topic)
  - Generic help (give_me_tips)
- Obtains help prompts from the focused agent and the main topic (defaults provided)
- Default help agency can be overridden by the user
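The lookup with defaults described above might look like the sketch below. The prompt texts, the dictionary layout of agents, the `help_prompt` function, and the choice to source "What can I say?" from the focused agent are all assumptions; the slides do not specify them.

```python
# Hypothetical default prompts, used when an agent defines no help of its own.
DEFAULT_PROMPTS = {
    "local": "Please answer the question I just asked.",
    "explain_more": "I need this information to continue with the task.",
    "what_can_i_say": "You can answer the question, or say repeat or start over.",
    "contextual": "We are working through the current task step by step.",
    "generic": "Speak clearly and answer one question at a time.",
}


def help_prompt(kind, focused_agent, main_topic):
    """Pick the help prompt of the given kind, falling back to the default."""
    if kind == "contextual":
        source = main_topic          # contextual help comes from the main topic
    elif kind == "generic":
        source = {}                  # generic help is not tied to any agent
    else:
        source = focused_agent       # local help comes from the focused agent
    return source.get("help", {}).get(kind, DEFAULT_PROMPTS[kind])


focused = {"name": "AskName", "help": {"local": "Please say your full name."}}
topic = {"name": "Login", "help": {"contextual": "I am logging you in."}}

local = help_prompt("local", focused, topic)
contextual = help_prompt("contextual", focused, topic)
fallback = help_prompt("explain_more", focused, topic)
```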

Suspend… Resume

- DTT adorned with a SuspendResume agency.
- Forces a context reestablishment on the current main topic upon resume.
- Context reestablishment also happens when focusing back after a sub-dialog
  - Can maybe construct a model for that (given size of sub-dialog, time issues, etc.)

Start over, Summarize, Querying

- Start over: DTT adorned with a Start-Over agency
- Summarize: DTT adorned with a Summarize agency; prompt generated automatically, problem shifted to NLG: can we do something corpus-based… work on automated summarization?
- Querying the system’s beliefs: still thinking… problem with the grammars… can meaningful Phoenix grammars for “what is [slot]” be automatically generated?

Timing & barge-in control

- Knowledge of barge-in location
- Information on what got conveyed is fed back to the DM, through the concepts to the task level
- Special agencies can take special action based on that (e.g. list browsing)
- Can we determine which utterances are non-barge-in-able in a task-independent (TI) manner?

Confirmation, Clarif., Disamb., Misunderstandings, Grounding…

- Largely unsolved in my head: this is next!
- 2 components:
  - Confidence scores on concepts
    - Obtaining them
    - Updating them
  - Taking the “right” decision based on those scores:
    - Insert appropriate agencies on the fly in the dialog task tree: opportunity for learning
    - What’s the set of decisions / agencies? How does one decide?

Confidence scores

- Obtaining confidence scores: from annotator
- Updating them, from different sources:
  - (Un)attacked implicit/explicit confirms
  - Correction detection
  - Elapsed time?
  - Domain knowledge
  - Priors?
- But how do you integrate all these in a principled way?

Mechanisms

DepartureCity = <Seattle,0.71><SF,0.29>

- Implicit / explicit confirmations
  - When do you leave from Seattle?
  - So you’re leaving from Seattle… When?
- Clarifications
  - Did you say you were leaving from Seattle?
- Disambiguation
  - I’m sorry, was that Seattle or San Francisco?
- How do you decide which? Learning?
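The slides leave the decision rule open. As a placeholder for it, here is one possible hand-written baseline that a learned policy could later replace. The function name, the action labels, and all three thresholds are assumptions chosen for illustration only.

```python
def choose_mechanism(hypotheses, accept=0.9, implicit=0.6, margin=0.2):
    """Pick a grounding action from (value, confidence) hypotheses.

    Thresholds are illustrative placeholders, not tuned values.
    """
    ranked = sorted(hypotheses, key=lambda h: h[1], reverse=True)
    top_value, top_conf = ranked[0]
    if top_conf >= accept:
        return ("accept", [top_value])
    if len(ranked) > 1 and top_conf - ranked[1][1] < margin:
        # Two hypotheses are too close to call: ask the user to choose.
        return ("disambiguate", [top_value, ranked[1][0]])
    if top_conf >= implicit:
        return ("implicit_confirm", [top_value])
    return ("explicit_confirm", [top_value])


slide_example = choose_mechanism([("Seattle", 0.71), ("SF", 0.29)])
close_call = choose_mechanism([("Seattle", 0.52), ("SF", 0.48)])
confident = choose_mechanism([("Seattle", 0.95)])
doubtful = choose_mechanism([("Seattle", 0.40)])
```

With these placeholder thresholds, the DepartureCity example from the slide leads to an implicit confirmation of Seattle, while a near tie between the two cities leads to a disambiguation question.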

Software Engineering

- Provide a robust basis for future research.
- Modularity
  - Separability between task and discourse
  - Separability of concepts and confidence computations
- Portability
- Multiple servers
- Aggressive, structured, timed logging
