![Page 1: eNTERFACE 08 Project #1 “ MultiParty Communication with a Tour Guide ECA” Final presentation](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56814580550346895db25906/html5/thumbnails/1.jpg)
eNTERFACE 08 Project #1 “MultiParty Communication with a Tour Guide ECA”
Final presentation
August 29th, 2008
![Page 2: eNTERFACE 08 Project #1 “ MultiParty Communication with a Tour Guide ECA” Final presentation](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56814580550346895db25906/html5/thumbnails/2.jpg)
Outline
• Project Overview
• Objectives, Issues & Work Done
• System Overview
• Configuration and Design
• Conclusion
![Page 3: eNTERFACE 08 Project #1 “ MultiParty Communication with a Tour Guide ECA” Final presentation](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56814580550346895db25906/html5/thumbnails/3.jpg)
Project Objectives
• Main objective: develop an ECA Tour Guide system that can interact with one or two users
• Research features:
• Multiparty dialogue model and scenario between two humans and the ECA
• Handling and combining input data: users' presence and behaviors (speech, tracking)
• Gaze behavior control and a nonverbal model for the ECA
![Page 4: eNTERFACE 08 Project #1 “ MultiParty Communication with a Tour Guide ECA” Final presentation](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56814580550346895db25906/html5/thumbnails/4.jpg)
Work done: Component Functionality Overview
• We implemented components that support a scenario based on narration and interruptions
• The ECA is the narrator; users can ask context-related questions (“where”, “how”, “when”)
• Speaker, addressee and listener identification; ECA gaze model
• The ECA can ask users simple “yes/no” questions to keep their attention
• The system can detect users' appearance and dynamically initiate or end a session
• The system can detect and handle situations in which users are paying less attention
• The system can recover from failures (e.g. speech recognition (SR) does not recognize a user's speech)
![Page 5: eNTERFACE 08 Project #1 “ MultiParty Communication with a Tour Guide ECA” Final presentation](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56814580550346895db25906/html5/thumbnails/5.jpg)
Work done...about to be done...
• Components are implemented
• System is being integrated
• Debugging and full testing are still needed
• Not supported:
• Detection of situations in which users start their own conversation
• Detection of speech collisions between users
• Smart scheduling and control of the ECA's behaviors
![Page 6: eNTERFACE 08 Project #1 “ MultiParty Communication with a Tour Guide ECA” Final presentation](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56814580550346895db25906/html5/thumbnails/6.jpg)
System Configuration
[Diagram: system configuration with Input, Central Part, and Output stages. Components shown: Okao Vision, OpenCV, Speech Recognition 1, Speech Recognition 2, NonVerbal Input Understanding, Decision Making Planner (Scenario Component), Animation Player]
![Page 7: eNTERFACE 08 Project #1 “ MultiParty Communication with a Tour Guide ECA” Final presentation](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56814580550346895db25906/html5/thumbnails/7.jpg)
Speech Recognition
![Page 8: eNTERFACE 08 Project #1 “ MultiParty Communication with a Tour Guide ECA” Final presentation](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56814580550346895db25906/html5/thumbnails/8.jpg)
Speech Recognition
• Functionality:
• Detects users' requests (“Where”, “How”, “When”, “Who”)
• Detects users' willingness to leave the system
• Detects answers to simple “yes/no” questions
• Detects unknown words
• Implementation:
• Keyword detection with confidence score and speech duration, implemented using the Loquendo API
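The keyword handling described on this slide can be sketched as follows. This is a minimal illustration, not the project's code and not the Loquendo API: the `SpeechResult` fields, the intent names, the "bye" keyword for leaving, and both thresholds are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Keywords come from the slide; the intent names and the "bye" keyword
# are hypothetical and only serve this illustration.
KEYWORD_INTENTS = {
    "where": "ask_where",
    "how": "ask_how",
    "when": "ask_when",
    "who": "ask_who",
    "yes": "answer_yes",
    "no": "answer_no",
    "bye": "leave",
}


@dataclass
class SpeechResult:
    """One recognizer result (field names are assumptions)."""
    channel: int              # 1 or 2, one speech channel per user
    keyword: Optional[str]    # None when nothing was matched
    confidence: float         # assumed to lie in 0.0 .. 1.0
    duration_s: float         # length of the detected speech in seconds


def interpret(result: SpeechResult,
              min_confidence: float = 0.5,
              min_duration_s: float = 0.2) -> str:
    """Map a recognizer result to an intent, or "unknown" on failure."""
    if result.keyword is None:
        return "unknown"
    # Reject low-confidence or very short detections (assumed thresholds).
    if result.confidence < min_confidence or result.duration_s < min_duration_s:
        return "unknown"
    return KEYWORD_INTENTS.get(result.keyword.lower(), "unknown")
```

An "unknown" result is what a decision component would treat as an SR failure to recover from.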
![Page 9: eNTERFACE 08 Project #1 “ MultiParty Communication with a Tour Guide ECA” Final presentation](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56814580550346895db25906/html5/thumbnails/9.jpg)
Nonverbal Inputs and Understanding
![Page 10: eNTERFACE 08 Project #1 “ MultiParty Communication with a Tour Guide ECA” Final presentation](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56814580550346895db25906/html5/thumbnails/10.jpg)
Nonverbal Inputs: Users' appearance and face orientation
• Functionality of components:
• Detect motion and users' appearance/disappearance
• Detect the number of users present
• Detect users' face orientation and increased/decreased attention
• left, right user
• Implementation:
• OpenCV (motion) & Okao Vision (face orientation, gazing)
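One way to turn per-frame face orientation into an "increased/decreased attention" signal is a sliding window over recent frames. The sketch below assumes a face tracker that reports a yaw angle in degrees per frame (or nothing when no face is found); it does not call OpenCV or Okao Vision, and the window size and thresholds are invented for illustration.

```python
from collections import deque
from typing import Optional


class AttentionTracker:
    """Sliding-window attention estimate for one user (hypothetical)."""

    def __init__(self, window: int = 30, max_yaw_deg: float = 20.0,
                 low: float = 0.4):
        self.samples = deque(maxlen=window)  # True = facing the screen
        self.max_yaw_deg = max_yaw_deg       # assumed "facing" tolerance
        self.low = low                       # assumed low-attention level

    def update(self, yaw_deg: Optional[float]) -> None:
        """Add one frame; yaw_deg is None when no face was detected."""
        facing = yaw_deg is not None and abs(yaw_deg) <= self.max_yaw_deg
        self.samples.append(facing)

    def attention(self) -> float:
        """Fraction of recent frames in which the user faced the screen."""
        if not self.samples:
            return 0.0
        return sum(self.samples) / len(self.samples)

    def is_low(self) -> bool:
        return self.attention() < self.low
```

A separate tracker instance per user (left and right) would keep the two attention estimates independent.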
![Page 11: eNTERFACE 08 Project #1 “ MultiParty Communication with a Tour Guide ECA” Final presentation](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56814580550346895db25906/html5/thumbnails/11.jpg)
Decision Making Component
![Page 12: eNTERFACE 08 Project #1 “ MultiParty Communication with a Tour Guide ECA” Final presentation](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56814580550346895db25906/html5/thumbnails/12.jpg)
Decision Making Component - Functionalities
• Makes decisions on “when and what to do to whom”:
• Handles multimodal input events (number of users, attention, speech channels)
• Handles user interruptions while the ECA is speaking
• Handles failures from the SR component
• Generates multimodal output and controls the ECA's gaze
• Simple rule: “First one will be served”
• The “yes/no” question is the exception
• No domain knowledge or behavior scheduling
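The “first one will be served” rule and its “yes/no” exception can be illustrated with a small event filter. This is a sketch under assumptions: the event tuple layout and the function name are hypothetical, and the slides do not say how the actual component represents events.

```python
from typing import List, Tuple

# (timestamp in seconds, user label, intent); layout is an assumption.
Event = Tuple[float, str, str]


def select_events(events: List[Event], expecting_yes_no: bool) -> List[Event]:
    """Pick which user events the ECA should react to."""
    ordered = sorted(events, key=lambda e: e[0])
    if not ordered:
        return []
    if expecting_yes_no:
        # Exception: after a "yes/no" question, keep the first answer
        # from each user so both can be taken into account.
        seen = set()
        kept = []
        for event in ordered:
            if event[1] not in seen:
                seen.add(event[1])
                kept.append(event)
        return kept
    # Default rule: only the earliest request is served.
    return ordered[:1]
```

For example, if the right user asks "how" at 2.0 s and the left user asks "where" at 1.5 s during narration, only the left user's request is returned.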
![Page 13: eNTERFACE 08 Project #1 “ MultiParty Communication with a Tour Guide ECA” Final presentation](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56814580550346895db25906/html5/thumbnails/13.jpg)
Decision Making Component - Implementation
• The Decision Making Component uses ideas from information state theory [Larsson’00] and AIML:
• The progress of the dialogue is represented by a set of variables
• The most appropriate plans are selected and scheduled by simple inference
• Timing control collects the messages from both speech channels for “yes/no” questions
• The component is being developed using the MIDIKI toolkit as a reference
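The idea of a dialogue state held in variables, with plans chosen by simple inference, can be sketched as an ordered rule list. The variable names, rule order, and plan names below are all assumptions made for illustration; this is not the MIDIKI API and not the project's rule set.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class InfoState:
    """Dialogue progress as a set of variables (names are hypothetical)."""
    users_present: int = 0
    attention_low: bool = False
    pending_intent: Optional[str] = None  # e.g. "ask_where"
    sr_failed: bool = False


@dataclass
class Rule:
    name: str
    condition: Callable[[InfoState], bool]
    plan: str


# Ordered by priority: the first rule whose condition holds wins.
RULES: List[Rule] = [
    Rule("end_session", lambda s: s.users_present == 0, "say_goodbye"),
    Rule("recover", lambda s: s.sr_failed, "ask_to_repeat"),
    Rule("answer", lambda s: s.pending_intent is not None, "answer_question"),
    Rule("regain_attention", lambda s: s.attention_low, "ask_yes_no_question"),
    Rule("narrate", lambda s: True, "continue_narration"),
]


def select_plan(state: InfoState, rules: List[Rule] = RULES) -> str:
    """Return the plan of the first matching rule, or "idle" if none match."""
    for rule in rules:
        if rule.condition(state):
            return rule.plan
    return "idle"
```

Keeping the rules in a plain ordered list makes the priority explicit, which suits a system without behavior scheduling.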
![Page 14: eNTERFACE 08 Project #1 “ MultiParty Communication with a Tour Guide ECA” Final presentation](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56814580550346895db25906/html5/thumbnails/14.jpg)
Animation Player
![Page 15: eNTERFACE 08 Project #1 “ MultiParty Communication with a Tour Guide ECA” Final presentation](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56814580550346895db25906/html5/thumbnails/15.jpg)
Animation Player
• Functionality:
• The animation player uses scripted behaviors (written in GSML) to generate speech and animation
• A model of gaze in multiparty communication is supported:
• Gaze is controlled at the utterance level
• The gaze pattern follows conversational rules (who is the addressee, who is the listener)
• Implementation:
• Visage SDK (based on MPEG-4 standard)
• 3ds Max
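Utterance-level gaze control can be illustrated by choosing one gaze target per utterance from its addressee. The sketch below is an assumption-heavy example: the user labels, the use of `None` for an utterance addressed to both users, and the alternation policy for such utterances are not taken from the slides, and no GSML is generated.

```python
from typing import List, Optional, Sequence


def gaze_targets(addressees: Sequence[Optional[str]],
                 users: Sequence[str] = ("left", "right")) -> List[str]:
    """Return one gaze target per utterance.

    addressees holds, for each utterance, the addressed user or None
    when the utterance is addressed to everyone.
    """
    targets = []
    turn = 0  # rotates through users for group-addressed utterances
    for addressee in addressees:
        if addressee in users:
            # Conversational rule: look at the addressee.
            targets.append(addressee)
        else:
            # Assumed policy: alternate between users so that no
            # listener is ignored for long.
            targets.append(users[turn % len(users)])
            turn += 1
    return targets
```

For the sequence left, both, both, right, both this yields left, left, right, right, left.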
![Page 16: eNTERFACE 08 Project #1 “ MultiParty Communication with a Tour Guide ECA” Final presentation](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56814580550346895db25906/html5/thumbnails/16.jpg)
Conclusion
• Components to support context-based two-party human-ECA communication are implemented
• The system is being integrated, but is not yet fully tested
• Component issues:
• Missing face tracking and domain knowledge about users' behaviors
• Simple dialogue management and control (no smart scheduling and no smart gaze control)
• Future directions: system debugging and testing, implementing tracking, improving gaze control, studying users' behaviors and gaze, system evaluation