© 2015 Nuance Communications, Inc. All rights reserved.

Assistants that Make Sense

Nuance Corporate Research, Machine Learning, and AI Overview

Dr. Nils Lenke

Strong customer and brand preference

With leading global relationships, it’s rare to go a day without Nuance

Healthcare, Consumer Electronics, Document Imaging, Automotive, Telecommunications, Government, Financial Services, Travel, Consumer goods

World-class technology portfolio

– 14 billion customer engagements per year

– 4,300 patents and applications

– 40 text-to-speech languages and voices

– 800 million mobile keyboards shipped annually

– 30,000 mobile app developers

– 80 languages

– 309 million patient stories shared annually

– 130 million voice-enabled vehicles shipped globally

– 6,500 companies use Nuance Enterprise solutions

Nuance Corporate Research

– Very large global research group, part of the larger R&D organization (1700 people)

– Approx. 300 researchers in ASR, Machine Learning, NL and AI worldwide

– research.nuance.com/research

– Closely cooperates with divisional R&D (Healthcare, Mobile, Enterprise); directly involved in select customer engagements

– Co-organizer of the “Winograd Schema Challenge” to replace the “Turing test”

– Shareholder of DFKI, the world’s largest AI research lab

The Science Behind the Buzzwords

“Deep Learning”

“Deep Neural Networks”

“Artificial Intelligence”

“Natural Language Understanding”

“Cloud”

“Machine Learning”

“The problem is that the concept of "artificial intelligence" is way too potent for its own good, conjuring images of supercomputers that operate spaceships, rather than particularly clever spam filters. The next thing you know, people are worrying about exactly how and when AI is going to doom humanity.” (The Verge, Feb 29, 2016)

Structure of Talk (and of Nuance Research)

Application Layer:

– Enterprise: Assisting Customers & Customer Agents

– Mobile: Assisting the Driver & the IoT user

– Healthcare: Assisting the Specialist

Core Technology:

– ASR Research

– NLU Research

– Symbolic AI

– TTS Research

– Voice Biometrics Research

– NN & Machine Learning

Machine Learning

The big themes for ML:

– Which model type works best for which tasks (HMMs, NNs, DNNs, CNNs, RNNs, CRF, SVM, Classifier, Logistic Regression, Clustering…)

– Where to find supervised or at least “lightly” supervised training data?

– How to get it to work?

Diagram:

– Data → Learning (Training) → Model

– Unseen object → Model → Prediction

– Data: this is where “big data”, the Internet and “social media” come in
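The data → training → model → prediction flow in the diagram can be sketched in a few lines of Python. This is only a minimal illustration, assuming a nearest-centroid classifier (chosen here for brevity, not because the talk uses it); the names `train` and `predict` and the toy data are hypothetical.

```python
from collections import defaultdict


def train(examples):
    """Learning (training): turn labeled data into a model.

    The "model" here is simply the mean feature vector (centroid) per label.
    """
    sums = {}
    counts = defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in vec] for label, vec in sums.items()}


def predict(model, features):
    """Prediction: assign an unseen object to the label with the closest centroid."""
    def squared_distance(centroid):
        return sum((c - f) ** 2 for c, f in zip(centroid, features))

    return min(model, key=lambda label: squared_distance(model[label]))


# Hypothetical labeled training data: (feature vector, label)
training_data = [
    ([1.0, 1.0], "short"),
    ([1.2, 0.8], "short"),
    ([5.0, 5.0], "long"),
    ([4.8, 5.2], "long"),
]
model = train(training_data)
print(predict(model, [0.9, 1.1]))
```

The split into a training step that consumes labeled data and a prediction step that only sees the model mirrors the two arrows in the diagram; any of the model types listed above could be swapped in behind the same two functions.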

“Deep” Neural Nets

– Layers: input → hidden layer 1 → hidden layer 2 → hidden layer 3 → output layer

– Learn hierarchical feature representations

– Backpropagation

– Learning from labeled examples (= “supervised learning”)

– Figure labels: “Mary”, “?”
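Backpropagation on a labeled example can be illustrated with a very small network. The sketch below is an assumption-laden toy, not the architecture from the slide: it uses one hidden layer with tanh units, a single sigmoid output and a squared-error loss, and all names (`forward`, `backprop`, `train_step`) and numbers are hypothetical.

```python
import math


def forward(params, x):
    """Forward pass: input -> tanh hidden layer -> sigmoid output."""
    W1, b1, W2, b2 = params
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    z = sum(w * hi for w, hi in zip(W2, h)) + b2
    y = 1.0 / (1.0 + math.exp(-z))
    return h, y


def loss(params, x, t):
    """Squared error between the network output and the label t."""
    _, y = forward(params, x)
    return 0.5 * (y - t) ** 2


def backprop(params, x, t):
    """Backward pass: gradients of the loss with respect to every parameter."""
    W1, b1, W2, b2 = params
    h, y = forward(params, x)
    dz = (y - t) * y * (1.0 - y)                                # gradient at the output pre-activation
    dW2 = [dz * hi for hi in h]
    db2 = dz
    da = [dz * w * (1.0 - hi ** 2) for w, hi in zip(W2, h)]     # gradient at the hidden pre-activations
    dW1 = [[d * xi for xi in x] for d in da]
    db1 = da
    return dW1, db1, dW2, db2


def train_step(params, x, t, lr=0.1):
    """One gradient-descent update on a single labeled example."""
    W1, b1, W2, b2 = params
    dW1, db1, dW2, db2 = backprop(params, x, t)
    W1 = [[w - lr * g for w, g in zip(row, grow)] for row, grow in zip(W1, dW1)]
    b1 = [b - lr * g for b, g in zip(b1, db1)]
    W2 = [w - lr * g for w, g in zip(W2, dW2)]
    b2 = b2 - lr * db2
    return W1, b1, W2, b2


# Hypothetical starting weights and one labeled example (features x, label t)
params = ([[0.1, -0.2], [0.3, 0.4]], [0.0, 0.1], [0.5, -0.6], 0.05)
x, t = [1.0, 2.0], 1.0
trained = params
for _ in range(100):
    trained = train_step(trained, x, t)
print(loss(params, x, t), loss(trained, x, t))
```

A "deep" network repeats the hidden-layer step several times; the backward pass then applies the same chain rule layer by layer from the output back to the input.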

DNNs in Speech Recognition (ASR)

Hidden Markov Models (HMMs)

“Deep” Neural Nets

Nuance Speech Recognition (and NLU) increasingly deployed “in the cloud”

– Map shows position of devices (smartphones, TVs, cars, wearables, …) sending requests to cloud servers

– > 1 bn transactions / month

– Important source of training data

Voice Biometrics vs. Voice Recognition (ASR)

Example utterance: “I want to transfer money”

Voice Recognition (ASR):

– Eliminate this (speaker) variation by training on lots of data

– Extract only what was said from the speech signal

Voice Biometrics:

– Use the (speaker) characteristics to identify or verify speaker identity

– Eliminate (vocal password) or ignore (Freespeech) content variation

Natural Language Understanding (NLU)

NLU does NOT mean understanding the complete meaning of any utterance.

Instead, it works for a specific domain (or a set of domains).

The primary task of NLU is to return the most likely user intent and the associated concepts or named entities (aka slots/mentions) given an input utterance.

Example: intent = navigation; drive to [name] joe beef [/name] in [location] montreal [/location]

• Good accuracy gains with more modern ML models over the last few years:

– Baseline: HMMs

– CRF (Conditional Random Fields, “invented for labeling tasks”): +20% rel. in accuracy

– “Neuro CRF” (= combination of RNNs and CRFs): +15% rel. in accuracy
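The bracketed annotation in the navigation example can be turned into a structured intent-plus-slots result with a short parser. This is a sketch of the output format only, assuming the `[tag] … [/tag]` notation from the slide; it does not perform the actual ML labeling, and the function name `parse_annotation` is hypothetical.

```python
import re

# Matches "[tag] value [/tag]" and captures the tag name and the value.
SLOT_PATTERN = re.compile(r"\[(\w+)\]\s*(.*?)\s*\[/\1\]")


def parse_annotation(intent, annotated):
    """Return the intent, the plain utterance and the list of (slot, value) pairs."""
    slots = SLOT_PATTERN.findall(annotated)
    plain = SLOT_PATTERN.sub(lambda match: match.group(2), annotated)
    plain = " ".join(plain.split())  # collapse any leftover double spaces
    return {"intent": intent, "utterance": plain, "slots": slots}


result = parse_annotation(
    "navigation",
    "drive to [name] joe beef [/name] in [location] montreal [/location]",
)
print(result)
```

An HMM, CRF or "Neuro CRF" tagger would produce the slot boundaries automatically; the structure returned here is the kind of result that downstream dialog logic would consume.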

Text-to-Speech Technology - Giving the Assistant a Voice

Pipeline: Input Text → Front-End Linguistic Processing → Back-End Signal Generation → Speech

– Front-End Linguistic Processing: Text Preprocessor, Linguistic Preprocessor (Language data)

– Back-End Signal Generation: Unit Selection or Model Synthesizer (Voice data, Voice Talent)

Example input text: “American Airlines and US Airways have settled an anti-trust suit with US regulators. As part of the agreement, which must still be approved by a judge, the airlines will give up slots at several US airports.”

Selection of Voice Talent = TTS Voice = Assistant Persona

Or go for a celebrity / custom voice

Symbolic AI (around since the 1960s)

– Capturing “knowledge” in logical forms

– Ontologies play an important role: “Pizza IS-A Italian Dish”

– Allows reasoning and drawing conclusions

– Planning as an important technique

– Develop action plans to fulfill a user request

– Complex dialog behavior based on planning (rather than fixed dialog strategies)

– Syntax parsing and semantic analysis beyond shallow NLU

– Dependency trees as interface between syntax and semantics

Logical form (example): Intend(Sys, (Bel(Sys, x.close(User,x) cafe(x) tell( Sys, User, close(User,x))))

Dependency tree (example): send → agent: I, obj: message, target: John
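Reasoning over an ontology can be illustrated with IS-A links. The sketch below assumes a tiny hand-written ontology: only “Pizza IS-A Italian Dish” comes from the slide, while the other links and the function name `is_a` are hypothetical additions used to show a chained conclusion.

```python
# Each concept maps to its direct parents. Only the first entry is from the slide.
ONTOLOGY = {
    "Pizza": ["Italian Dish"],
    "Italian Dish": ["Dish"],   # hypothetical link
    "Dish": ["Food"],           # hypothetical link
}


def is_a(concept, ancestor, ontology=ONTOLOGY):
    """Return True if `ancestor` is reachable from `concept` via IS-A links.

    A concept is treated as being an instance of itself.
    """
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(ontology.get(current, []))
    return False


print(is_a("Pizza", "Food"))
print(is_a("Dish", "Pizza"))
```

The first query succeeds only by chaining three facts, which is the kind of conclusion a purely statistical intent classifier would not derive on its own.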

Big Knowledge & Semantic Routing

Request: “Find a good charging station near the AMC”

NLU Output:

– Find the AMC near the driver’s location

– Find charging stations near the AMC returned by Fandango

– Combine the results from charge point with Yelp to get the highest rated stations

[Chart: scores on a 1–5 axis for DMA+SR, DMA, Siri and Google Now in the categories MovieTV and Business:Restaurant; plotted values: 3.87, 4.13, 1.8, 3.93, 3.7, 3.47, 2.9, 3.42]

Architecture: Big Knowledge Repository (BKR), Ingestion, Knowledge Interface Layer / Semantic Router, OpenCyc
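The three routing steps for the charging-station request can be sketched as a chain of lookups. Everything below is a hypothetical stand-in: the in-memory tables replace the external services named on the slide, coordinates are arbitrary 2-D points, and the function names are invented for illustration.

```python
from math import hypot

# Hypothetical in-memory stand-ins for the external knowledge sources.
CINEMA_LISTINGS = [("AMC", (2.0, 1.0)), ("AMC", (20.0, 20.0))]
CHARGING_STATIONS = {
    "Station A": (2.1, 1.1),
    "Station B": (2.2, 0.9),
    "Station C": (9.0, 9.0),
}
STATION_RATINGS = {"Station A": 3.5, "Station B": 4.5, "Station C": 5.0}


def distance(a, b):
    """Straight-line distance between two points."""
    return hypot(a[0] - b[0], a[1] - b[1])


def find_cinema(name, near):
    """Step 1: resolve the named cinema to the location closest to the driver."""
    locations = [loc for cinema, loc in CINEMA_LISTINGS if cinema == name]
    return min(locations, key=lambda loc: distance(loc, near))


def stations_near(center, radius):
    """Step 2: keep only charging stations within `radius` of the cinema."""
    return [
        name
        for name, loc in CHARGING_STATIONS.items()
        if distance(loc, center) <= radius
    ]


def rank_by_rating(stations):
    """Step 3: combine with ratings and sort the best-rated station first."""
    return sorted(stations, key=lambda name: STATION_RATINGS[name], reverse=True)


def route_request(driver_location, radius=1.0):
    """Chain the three steps for 'a good charging station near the AMC'."""
    cinema = find_cinema("AMC", near=driver_location)
    return rank_by_rating(stations_near(cinema, radius))


print(route_request((0.0, 0.0)))
```

Note that the best-rated station overall (Station C) is excluded because it is not near the resolved cinema; the output of each step constrains the next, which is the point of routing the request across several sources.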

Assistants that Make Sense…

“I don’t know what you mean by X, do you want me to do a web search?”

– The problem with many personal virtual assistants is that they are too general: you don’t know what they can and cannot do for you

– Rather focus on specialized assistants that cover a well-defined task and have an intuitive scope:

– Automotive Assistant for the Driver

– Device-centric Assistants for the IoT

– Assistant for the Clinical Specialist

– Virtual Customer Service Agent

Acceptance of Conversational / Intelligent Experiences

A recent Nuance survey found:

– 89% prefer conversation with virtual assistants over search

– 73% prefer personalized conversations

– 83% want an alternative to PINs and passwords

– 90% would prefer voice biometrics over passwords or questions

Assisting the Driver

Today: Dragon Drive in BMW 7 Series

The Future: ASR, NLU, and AI form the Automotive Assistant

User: “Book a table at Joe’s Pizza after my last meeting and let Tom and Brian know to meet me there”

A: “Sorry, but there aren’t any tables open until 9pm. Would you like me to find you another Italian restaurant along your way home?”

Services involved: Calendar, Restaurant reservation, Address book, Messaging, Maps

Big Knowledge + Semantic Routing + Planning + Deep Syntax / Semantics + Dialog

Nuance “Mix” – Build your own IoT Assistant

– Self-service platform covering ASR and NLU

– ASR tools allow customization of vocabulary

– NLU GUI tools allow defining intents and named entities (ontology)

– Fast creation of seed data and annotation of real data

– Push-button ML NLU model training

– Cloud-based; create & test app for free; pay as you go for real traffic

Assisting the Specialist: CA-CDI

– Computer-Assisted Clinical Documentation Improvement

– Ensure diagnoses suggested by case record are explicitly documented

– If not, submit a clarification request to the physician / document author to ensure proper subsequent coding and thus (maximal) reimbursement

– Human CDS (“Clinical Documentation Specialist”) use 42 strategies

– Rule-based system in production mimicking the strategies

– Now adding ML solutions on top

[Diagram labels: Billing; Automated Assistant to the CDS]

[Chart: F-Score, with a baseline, for Rules, Simple ML, DNNs on text, and DNNs on text and rule output; annotation: “Combining Rules and DNN ML”]
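The F-score used to compare the rule-based and ML systems combines precision and recall into one number. The sketch below shows the standard F-beta formula, assuming the balanced variant (beta = 1) since the slide does not say which variant was used; the counts in the example are invented and are not results from the talk.

```python
def f_score(true_positives, false_positives, false_negatives, beta=1.0):
    """F-beta score from raw counts; beta=1 gives the balanced F1 score."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if true_positives == 0 or predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted   # share of raised requests that were warranted
    recall = true_positives / actual         # share of warranted requests that were raised
    beta_squared = beta ** 2
    return (1 + beta_squared) * precision * recall / (beta_squared * precision + recall)


# Hypothetical counts for a system that raises clarification requests
print(f_score(true_positives=6, false_positives=2, false_negatives=6))
```

Because the score is a harmonic mean, a system cannot reach a high value by raising requests on every record (high recall, low precision) or by raising almost none (high precision, low recall).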

Assisting the Customer & the Customer Service Agent: Nina today

– “Transfer 73€ to A. van Dijck” (annotated with intent and named entities)

– Uses ML-ASR for spoken input

– Uses ML-NLU to discover intent & named entities

– Building these apps takes a lot of manual work (albeit with great GUI tools):

– Define intents

– Collect and annotate data to train the NLU models per intent

– Define dialog strategies per intent

– Define the answers / answer strategies per intent

Can we automate this with ML?

http://whatsnext.nuance.com/customer-experience/ing-intelligent-virtual-assistant-mobile-app/

Assisting the Customer & the Customer Agent: Nina & HAVA

Participants: User (voice, chat, web, …), Web Virtual Assistant, Hidden Agent

– User asks a question: “Has my check cleared?”

– VA does not know the answer

– Hidden Agent supplies an answer: “Yes, your check has cleared”

– Apply ML to learn from Agent interventions and improve Automatic Agent
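The escalation loop between the virtual assistant and the hidden agent can be sketched as follows. This is a deliberately simplified stand-in: instead of ML it uses a plain lookup table that remembers agent answers, and the class name, method names and normalization rule are all hypothetical. A real system would generalize across paraphrases rather than match normalized strings, and would not cache account-specific answers such as this one.

```python
class VirtualAssistant:
    """Answers known questions itself and escalates unknown ones to a hidden agent."""

    def __init__(self, ask_hidden_agent):
        self.ask_hidden_agent = ask_hidden_agent  # callable: question -> answer
        self.known_answers = {}                   # grows with every agent intervention
        self.escalations = 0

    @staticmethod
    def _normalize(question):
        """Lowercase, drop surrounding punctuation and collapse whitespace."""
        return " ".join(question.lower().strip(" ?!.").split())

    def answer(self, question):
        key = self._normalize(question)
        if key not in self.known_answers:
            # The VA does not know the answer: the hidden agent supplies one,
            # and the VA keeps it for the next time the question comes up.
            self.escalations += 1
            self.known_answers[key] = self.ask_hidden_agent(question)
        return self.known_answers[key]


def hidden_agent(question):
    """Hypothetical human agent; here it always gives the slide's example answer."""
    return "yes your check has cleared"


assistant = VirtualAssistant(hidden_agent)
print(assistant.answer("Has my check cleared"))
print(assistant.answer("has my check cleared?"))
print(assistant.escalations)
```

The second question is answered without involving the agent, which is the effect the slide aims for; replacing the lookup table with a trained model is what "apply ML" would add.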