

fname="Yevgen" lname="Borodin" zip="11790" …

user="Yevgen" session="browse" …

title="circuit city" mode="formFill" …

[Figure: VXMLSurfer architecture. Core: Processor with Variable Space Manager, Input Manager, Event-Handling Manager, Output Manager, and File Manager. Interpreters: SRGS Grammar Interpreter, JavaScript Interpreter. Engines and interfaces: JSAPI, Speech Recognition Engine, Text-To-Speech Engine, SUN Audio Player, Keyboard Interface, Console. Buffers: Input Queue, Output Buffer. Inputs and outputs: Keyboard Input, Voice Input, Voice Output, Audio Output, VXML, Submitted Variables. Cached files: main.vxml, allLinks.vxml, history.vxml, favorites.vxml, keyList.vxml, commands.vxml, …]

VXMLSurfer in Action

A Flexible VXML (Voice XML) Interpreter

Yevgen Borodin

Computer Science Department, Stony Brook University

Back-End Processing

HELP

…
<rule id="phone_number">
  <item>
    <item repeat="0-1">d</item>
    <tag><![CDATA[$.phone="ddd-ddd-dddd";]]></tag>
  </item>
</rule>
…

<?xml version='1.0'?>
<vxml>
  …
  <form id="shipping">
    <field name="fname">
      <prompt>Enter first name</prompt>
      <grammar src="customer.xml#fname" type="application/grammar+xml"/>
    </field>
    <field name="lname">
      <prompt>Enter last name</prompt>
      <grammar src="customer.xml#lname" type="application/grammar+xml"/>
    </field>
    …
  </form>
</vxml>
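To illustrate what an interpreter has to extract from a form like the shipping dialog, the following Python sketch walks a VoiceXML document and lists each field with its prompt and grammar reference. It is only an illustration, not VXMLSurfer's code (which the poster describes as Java); `SAMPLE_VXML` and `list_fields` are hypothetical names, and the sample omits the parts elided on the poster.

```python
import xml.etree.ElementTree as ET

# Well-formed stand-in for the poster's shipping form (assumption: the
# elided parts of the original dialog are simply left out here).
SAMPLE_VXML = """<?xml version='1.0'?>
<vxml>
  <form id="shipping">
    <field name="fname">
      <prompt>Enter first name</prompt>
      <grammar src="customer.xml#fname" type="application/grammar+xml"/>
    </field>
    <field name="lname">
      <prompt>Enter last name</prompt>
      <grammar src="customer.xml#lname" type="application/grammar+xml"/>
    </field>
  </form>
</vxml>"""


def list_fields(vxml_text):
    """Return (field name, prompt text, grammar src) for every <field>."""
    root = ET.fromstring(vxml_text)
    fields = []
    for field in root.iter("field"):
        prompt = field.find("prompt")
        grammar = field.find("grammar")
        fields.append((
            field.get("name"),
            prompt.text.strip() if prompt is not None and prompt.text else "",
            grammar.get("src") if grammar is not None else None,
        ))
    return fields


if __name__ == "__main__":
    for name, prompt, src in list_fields(SAMPLE_VXML):
        print(name, "|", prompt, "|", src)
```

A real interpreter would then speak each prompt, listen with the referenced grammar, and store the recognized value under the field name.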

Big Picture: HearSay Browser

[Figure: HearSay browser architecture. Components: Interface Manager, Browser Object, Context Analyzer, Frame Tree Processor, VXML Dialog Generator. Other labels: VXML Interpreter, Mozilla Engine, WEB. Labeled flows: HTTP request, HTML, Frame Tree.]
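The last stage of the Big Picture flow, where a processed frame tree is turned into VXML dialogs, could look roughly like the sketch below. It assumes the frame tree has already been reduced to a list of (field name, spoken label) pairs; that reduction, the function name `fields_to_vxml`, and the default grammar file are hypothetical and chosen only to mirror the shipping form shown on the poster.

```python
import xml.etree.ElementTree as ET


def fields_to_vxml(form_id, fields, grammar_file="customer.xml"):
    """Build a VoiceXML form (as a string) from (name, label) pairs.

    Assumption: each web-form input found in the frame tree is represented
    by its variable name and a human-readable label.
    """
    vxml = ET.Element("vxml")
    form = ET.SubElement(vxml, "form", {"id": form_id})
    for name, label in fields:
        field = ET.SubElement(form, "field", {"name": name})
        prompt = ET.SubElement(field, "prompt")
        prompt.text = "Enter " + label
        # One grammar rule per field, referenced by fragment identifier.
        ET.SubElement(field, "grammar", {
            "src": grammar_file + "#" + name,
            "type": "application/grammar+xml",
        })
    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    print(fields_to_vxml("shipping", [("fname", "first name"),
                                      ("lname", "last name")]))
```

Generating the dialog through an XML builder rather than string concatenation keeps the output well formed even when labels contain characters such as `&` or `<`.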

<catch event="Insert+F1 help"> <prompt> You are at Circuit City check out </prompt></catch>

<catch event="Ctrl+S submit"> <submit namelist="fname lname …"/> </catch> …
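The two handlers above bind a keyboard shortcut and a named event to the same action: in VoiceXML the `event` attribute of `<catch>` holds a space-separated list of event names. A minimal Python sketch of such dispatch is shown below. It is not the interpreter's actual event-handling code; `build_event_table` and `handle` are hypothetical names, and the sample `namelist` contains only `fname` and `lname` because the rest is elided on the poster.

```python
import xml.etree.ElementTree as ET

# Stand-in for the poster's handlers (assumption: namelist shortened to the
# two variables that are actually spelled out).
SAMPLE_CATCHES = """<form id="shipping">
  <catch event="Insert+F1 help">
    <prompt>You are at Circuit City check out</prompt>
  </catch>
  <catch event="Ctrl+S submit">
    <submit namelist="fname lname"/>
  </catch>
</form>"""


def build_event_table(xml_text):
    """Map every event name listed in a <catch> to that <catch> element."""
    table = {}
    for catch in ET.fromstring(xml_text).iter("catch"):
        for event in catch.get("event", "").split():
            table[event] = catch
    return table


def handle(event, table, variables):
    """Return the action triggered by an event as an (action, payload) pair."""
    catch = table.get(event)
    if catch is None:
        return ("unhandled", event)
    prompt = catch.find("prompt")
    if prompt is not None:
        return ("speak", (prompt.text or "").strip())
    submit = catch.find("submit")
    if submit is not None:
        names = submit.get("namelist", "").split()
        # Only variables named in namelist are returned to the caller.
        return ("submit", {n: variables[n] for n in names if n in variables})
    return ("noop", event)


if __name__ == "__main__":
    table = build_event_table(SAMPLE_CATCHES)
    filled = {"fname": "Yevgen", "lname": "Borodin", "zip": "11790"}
    print(handle("Insert+F1", table, filled))
    print(handle("Ctrl+S", table, filled))
```

In this sketch the `submit` action carries the variables that would be returned to the calling application, which corresponds to the step where HearSay receives them and fills in the web form.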

INS + F1

SRGS Grammar

Variables | Files | Return | Events

fname="Yevgen" lname="Borodin" zip="11790" …

Motivation

- Inadequacy of screen readers in Web browsing
- Development of an Interface Manager for the HearSay voice browser
- Absence of fully implemented open-source VXML interpreters
- Need for specialization in Web-browsing applications
- Need for a multiplatform, extensible, modular, flexible system

Uses of VoiceXML

- VXML is typically used in telephony applications
- Computer games use VXML to program interactive dialogs
- VXML dialogs disseminate information through public terminals
- VoiceXML can be used in voice browsing!

Features of VXMLSurfer

- Compliant with the VoiceXML 2.0 specification
- Geared to Web browsing as opposed to telephony
- Modular, extensible, multi-platform (Java)
- Extends VoiceXML 2.0 to give more control over dialog flow
- Loaded with add-ons: spell check, TTS, SR, etc.

Application of VXMLSurfer

- VXMLSurfer is the interface of the HearSay voice browser
- Users interact with VXMLSurfer through keyboard and microphone
- The HTTP request is forwarded to the Mozilla browser engine
- The loaded Web page is converted into a frame tree
- The frame tree is processed and converted to VXML dialogs

- VoiceXML dialog files are sent to the interpreter for processing
- Variables are returned to the calling application (HearSay)
- HearSay invokes the form-filling module to fill and submit the form

Future Work

- Complete the VoiceXML 2.0 specification
- Messaging between VXMLSurfer and the calling application
- Speech recognition (CMU Sphinx)
- JavaScript Interpreter and Grammar Interpreter modules
- Multilingual TTS, etc.

This material is based upon work supported by the National Science Foundation (Awards IIS-0534419, CNS-0751083, IIS-0808678) and the National Institute on Disability and Rehabilitation Research (NIDRR) (Award H133S090065).