speech-enabling web apps

Post on 18-Dec-2014

DESCRIPTION

An overview of the technology options for adding speech to web applications. It covers the HTML5 Speech Input API for speech recognition, the Audio tag combined with third-party APIs for text-to-speech, and the application possibilities of WebRTC. Presented at the Atlanta Ruby Users Group meeting on November 13, 2013.

TRANSCRIPT

Speech-Enabling Web Apps

CAN YOU SPEAK MAGIC?

Ben Klang

ADD SPEECH TO THE WEB
• Speech Input API
• Text-To-Speech (<Audio/>)
• WebRTC

http://bit.ly/HTML5_Speech_Input_API
http://www.w3.org/TR/webrtc/

SPEECH INPUT API

<input type="text" x-webkit-speech />

ANNYANG!

DEMO

SPEECH INPUT API CAVEATS
• Chrome Only :(
• Uses Google ASR (duh)
• Partial Firefox implementation from GSoC
• Requires ASR Server
• Only Google runs one today
• serviceURI attribute not yet implemented
• Specification maturity seems slow
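Because of the "Chrome Only" caveat, a page would normally check for speech recognition support before offering a microphone button. The following is a minimal sketch of such a check. It assumes the JavaScript-side Web Speech API constructors `SpeechRecognition` and `webkitSpeechRecognition`, which the slides do not name; `SpeechGlobals` and `detectSpeechSupport` are hypothetical names chosen for illustration.

```typescript
// Minimal shape of the global properties this sketch inspects.
// (Assumption: the browser exposes one of these constructors when
// speech recognition is available.)
interface SpeechGlobals {
  SpeechRecognition?: unknown;
  webkitSpeechRecognition?: unknown;
}

type SpeechSupport = "standard" | "webkit-prefixed" | "none";

// Reports which flavour of the recognition constructor is present.
function detectSpeechSupport(globals: SpeechGlobals): SpeechSupport {
  if (typeof globals.SpeechRecognition === "function") {
    return "standard";
  }
  if (typeof globals.webkitSpeechRecognition === "function") {
    return "webkit-prefixed";
  }
  return "none";
}
```

In a browser the argument would be the global object; taking it as a parameter keeps the check easy to exercise with stand-in objects.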

TEXT-TO-SPEECH

TTS API + <AUDIO/>

TTS API OPTIONS
• AT&T: http://developer.att.com
• Nuance NDEV: http://nuancemobiledeveloper.com/
• Google: http://translate.google.com/translate_tts?tl=en&q=TEXT
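The Google option above is just a URL with a language (`tl`) and the text to speak (`q`), so the main work on the page is encoding the text safely. Below is a minimal sketch of that step. The endpoint is copied from the slide and is not assumed to be an official or currently available service; `buildTtsUrl` is a hypothetical helper name.

```typescript
// Endpoint as shown on the slide. Assumption: it still accepts
// `tl` (language) and `q` (text) query parameters.
const GOOGLE_TTS_ENDPOINT = "http://translate.google.com/translate_tts";

// Builds the URL an <audio> element would load to speak `text`.
function buildTtsUrl(text: string, language: string = "en"): string {
  const query =
    "tl=" + encodeURIComponent(language) +
    "&q=" + encodeURIComponent(text);
  return GOOGLE_TTS_ENDPOINT + "?" + query;
}

const exampleUrl = buildTtsUrl("Hello, world!");
```

The resulting string can be used as the `src` of an `<audio>` element, which is the "TTS API + <AUDIO/>" pattern from the earlier slide.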

<AUDIO/> CAVEATS
• You can’t pay for Google TTS
• No specified Mandatory To Implement (MTI) codecs
• Broad consensus
• Everyone: MP3 (+containers H.264, MP4)
• Except IE: Ogg/Vorbis, Opus, WebM
• http://bit.ly/Browser_Audio_Codecs
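With no mandatory codec, a page that has the same speech clip in several formats has to ask the browser which one it can play. The sketch below shows one way to make that choice. It assumes the caller passes in a function that behaves like the audio element's `canPlayType` method (returning "probably", "maybe" or an empty string); `AudioSource` and `pickAudioSource` are hypothetical names.

```typescript
interface AudioSource {
  url: string;
  mimeType: string; // e.g. "audio/mpeg" or "audio/ogg"
}

// Same contract as HTMLMediaElement.canPlayType:
// returns "probably", "maybe", or "" (cannot play).
type CanPlayType = (mimeType: string) => string;

// Picks the first source the browser is confident about, falling back
// to the first one it might be able to play.
function pickAudioSource(
  sources: AudioSource[],
  canPlayType: CanPlayType
): AudioSource | undefined {
  let fallback: AudioSource | undefined;
  for (const source of sources) {
    const answer = canPlayType(source.mimeType);
    if (answer === "probably") {
      return source;
    }
    if (answer === "maybe" && fallback === undefined) {
      fallback = source;
    }
  }
  return fallback;
}
```

In a page, the second argument could wrap an existing audio element, for example `(type) => audioElement.canPlayType(type)`.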

WHAT IS WEBRTC TO ME?

Telephones in Web Browsers!

How does WebRTC Work?

Alice, via http://: Get me Bob please!
SDP:
  v=0
  o=alice 20518 0 IN IP4 0.0.0.0
  s=-
  t=0 0
  m=audio 54609 RTP/SAVPF 109

Bob, via http://:
SDP:
  v=0
  o=bob 19915 0 IN IP4 0.0.0.0
  s=-
  t=0 0
  m=audio 61001 RTP/SAVPF 109

Alice, Bob: SRTP
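The SDP blobs in the diagram are plain text, one `key=value` field per line, and the interesting parts for a call are the originator (`o=`) and the media description (`m=`). The sketch below pulls those out of the abbreviated offer shown on the slide. `SdpSummary` and `summarizeSdp` are hypothetical names, and SDP generated by a real browser carries many more lines than this example handles.

```typescript
interface SdpSummary {
  username: string;       // first field of the o= line
  media: string;          // e.g. "audio"
  port: number;           // port offered for the media stream
  protocol: string;       // e.g. "RTP/SAVPF"
  payloadTypes: number[]; // RTP payload type numbers
}

// Extracts the first o= and m= lines from an SDP body.
function summarizeSdp(sdp: string): SdpSummary | undefined {
  let username: string | undefined;
  let mediaLine: string | undefined;

  for (const rawLine of sdp.split(/\r?\n/)) {
    const line = rawLine.trim();
    const kind = line.slice(0, 2);
    const value = line.slice(2);
    if (kind === "o=" && username === undefined) {
      username = value.split(" ")[0];
    } else if (kind === "m=" && mediaLine === undefined) {
      mediaLine = value;
    }
  }
  if (username === undefined || mediaLine === undefined) {
    return undefined;
  }

  const [media, port, protocol, ...payloads] = mediaLine.split(" ");
  if (media === undefined || port === undefined || protocol === undefined) {
    return undefined;
  }
  return {
    username,
    media,
    port: Number(port),
    protocol,
    payloadTypes: payloads.map(Number),
  };
}

// Alice's offer, as shown on the slide.
const aliceOffer = [
  "v=0",
  "o=alice 20518 0 IN IP4 0.0.0.0",
  "s=-",
  "t=0 0",
  "m=audio 54609 RTP/SAVPF 109",
].join("\r\n");

const aliceSummary = summarizeSdp(aliceOffer);
```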

Alice, Bob

Alice: Get me Bob please!
SDP:
  v=0
  o=alice 20518 0 IN IP4 0.0.0.0
  s=-
  t=0 0
  m=audio 54609 RTP/SAVPF 109

Alice Calling!
SDP:
  v=0
  o=freeswitch 19915 0 IN IP4 0.0.0.0
  s=-
  t=0 0
  m=audio 61001 RTP/SAVPF 109

Alice, Bob: SRTP

Example RTC Apps

2 Examples

“Communicating isn’t going to be what you’re doing - it’s what you’ll be doing while you’re doing something else”
- Geoff Hollingworth, Ericsson Head of AT&T Foundry

1. Incident Response

INCIDENT RESPONSE
• Timely, Contextual Information
• Adapt for mobile vs. desktop users
• Group-based communication
• Inherit from existing organizational groups
• Allow ad-hoc participants (“guest” parties)
• Federate with external services
• Incident recording/logging
• “Lessons learned” and process improvement
• Links from/to issue tracking systems

2. Medical Records Management

MEDICAL RECORDS MGMT
• Automate Medical Claims
• Secure Caller Authentication
• Reuse primary auth via website
• Verify with voice biometrics
• Cross-check against caller location
• Call recording/transcription
• Medical advice given to patient automatically added to patient file
• Auditing/Service Quality Assurance

HTTPS://TALKY.IO/ATLRUG

WEBRTC CAVEATS
• Bleeding edge, developing standard
• Only available on Chrome, Firefox
• Only available on Desktop
• Well funded/backed development
• Expect to see it mainstream (Desktop + Mobile) as soon as 2014
• http://iswebrtcreadyyet.com/

adhearsionconf.com
Early Bird Discount: atlrug

http://mojolingo.com
@MojoLingo
@bklang
bklang@mojolingo.com

http://bit.ly/HTML5_Speech_Input_API
http://www.w3.org/TR/webrtc/
http://iswebrtcreadyyet.com/