Dialogue Systems Group
Task-oriented Neural Dialogue Systems
Apple, 08/03/2017
Tsung-Hsien (Shawn) Wen
Outline
¤ Intro
¤ Neural Dialogue Systems
¤ Wizard-of-Oz Data Collection
¤ Experiments
¤ Conclusion
The two paradigms
M: Hello, welcome to the Cambridge dialogue system. What kind of food would you like?
H: Yeah I want to find a restaurant that serves European food.
M: Hotel du Vin and Bistro is a nice place. It serves European food.
H: Uh what is the address and phone number?
M: Hotel du Vin and Bistro is a nice place. Their address is XXX and their phone is YYY.
H: Thank you goodbye.
M: Thank you for using this system.
[Su et al, 2016]
A: How old are you?
B: I'm 16. Why are you asking?
A: I thought you were 12.
B: What made you think so?
A: I don't know what you are talking about.
B: You don't know what you are saying.
[Li et al, 2016]
The two paradigms
¤ Task-Oriented Dialogue Systems
  ¤ Goal-oriented
  ¤ Require precise understanding, hard to collect data.
  ¤ Modular, highly handcrafted, restricted ability, but meaningful/useful systems.
¤ Chat-based Conversational Agents
  ¤ Chit-chat (non-goal).
  ¤ Vast amount of data (but probably not helpful).
  ¤ End-to-end, highly data-driven, but meaningless/inappropriate responses, unreliable systems.
¤ Can we train a useful (complete tasks) dialogue system directly from data?
¤ How can we collect the data to train this model?
Traditional Dialogue Systems
[Figure: a Dialogue System built from separate Speech Recognition, Language Understanding, Dialogue Manager, Language Generation and Speech Synthesis modules, exchanging text and connected to a KB and the Web.]
Neural Dialogue Systems
[Figure: a single Neural Dialogue System placed between Speech Recognition and Speech Synthesis, exchanging text with both and connected to a KB and the Web.]
Input: Can I have Korean
Output: Little Seoul serves great Korean.
A Network-based End-to-End Trainable Task-Oriented Dialogue System, Wen et al, 2016
Input: Can I have <v.food>
Output: <v.name> serves great <v.food>.
Delexicalisation
[Figure: an Intent Network reads "Can I have <v.food>" and outputs zt; a Generation Network produces "<v.name> serves great <v.food>."; together they form a Seq2Seq model.]
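The delexicalisation step on these slides can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the `ONTOLOGY` dictionary and both function names are hypothetical, and the only values taken from the slides are "Korean" and "Little Seoul".

```python
import re

# Hypothetical ontology for illustration; only "Korean" and "Little Seoul"
# appear on the slides, the other values are made-up placeholders.
ONTOLOGY = {
    "food": ["korean", "british", "french"],
    "name": ["little seoul"],
}


def delexicalise(utterance, ontology=ONTOLOGY):
    """Replace known slot values with <v.slot> tokens and record what was replaced."""
    text = utterance
    found = {}
    # Names are handled first so multi-word venue names are consumed as a unit.
    for slot in ("name", "food"):
        for value in ontology[slot]:
            pattern = re.compile(r"\b" + re.escape(value) + r"\b", re.IGNORECASE)
            if pattern.search(text):
                found[slot] = value
                text = pattern.sub("<v." + slot + ">", text)
    return text, found


def relexicalise(template, values):
    """Fill <v.slot> tokens in a generated template with concrete values."""
    for slot, value in values.items():
        template = template.replace("<v." + slot + ">", value)
    return template


user_delex, user_slots = delexicalise("Can I have Korean")
reply = relexicalise("<v.name> serves great <v.food>.",
                     {"name": "Little Seoul", "food": "Korean"})
```

Working on delexicalised text lets the network share parameters across slot values; the concrete values are put back only after generation.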
Language Grounding
[Figure: the Belief Tracker reads "Can I have korean" and outputs a distribution over food values (Korean 0.7, British 0.2, French 0.1, …), labelled pt. The Intent Network reads the delexicalised "Can I have <v.food>" and outputs zt. The Generation Network produces "<v.name> serves great <v.food>."]
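Jordan RNN-CNN belief trackers [Henderson et al, 2014]
[Figure: a Delexicalised CNN (1st conv., 2nd conv., 3rd conv., max-pool, avg-pool) maps the turn "<nil> I want Korean food <nil>", delexicalised as "<nil> I want v.food s.food <nil>", to a sentence representation. That representation feeds the turn-t input layer of a network with a hidden layer and an output layer over the slot values.]
Output layer example (two rows of values as shown on the slide):
British  French  Korean  …  Chinese
1.3      2.3     9.7     …  1.2
.01      .02     .85     …  .01
Figure annotations:
¤ Memorise the delex. position
¤ Pad zeros to have the same length
¤ Slot-specific delex. ngram feature
¤ Value-specific delex. ngram placeholder
¤ Value-specific delex. ngram feature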
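Jordan RNN-CNN belief trackers
[Figure: the same tracker with both inputs shown. Delexicalised CNNs (1st conv., 2nd conv., 3rd conv., max-pool, avg-pool) encode user turn t and system turn t-1, for example "<nil> I want v.food s.food <nil>", into sentence representations. These feed the turn-t input layer of a Jordan RNN with a hidden layer and an output layer.]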
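A Jordan-type recurrence feeds the previous output, rather than the previous hidden state, back into the network. The toy update below shows that idea for a single slot. It is a sketch under strong assumptions: the hand-set `turn_features` stand in for the CNN sentence features, the two weights are arbitrary, and none of the names come from the slides.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def jordan_step(turn_features, prev_belief, w_feature=4.0, w_memory=2.0):
    """One Jordan-style update: the previous OUTPUT (belief) is fed back as input.

    turn_features[i] is a toy evidence score for value i in the current turn
    (standing in for the CNN sentence features); the weights are arbitrary.
    """
    scores = [w_feature * f + w_memory * b
              for f, b in zip(turn_features, prev_belief)]
    return softmax(scores)


VALUES = ["british", "french", "korean"]
belief = [1.0 / len(VALUES)] * len(VALUES)   # uniform belief before the first turn

# Turn 1 mentions Korean; turn 2 carries no food evidence at all.
for features in ([0.0, 0.0, 1.0], [0.0, 0.0, 0.0]):
    belief = jordan_step(features, belief)

most_likely = VALUES[belief.index(max(belief))]
```

Because the previous belief is part of the input, the tracker keeps preferring "korean" in the second turn even though that turn says nothing about food.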
Database Accessing
[Figure: the Language Grounding diagram extended with a Database Operator. The belief over food (Korean 0.7, British 0.2, French 0.1, …) is turned into the MySQL query "Select * where food=Korean", labelled qt. The Database holds Seven Days, Curry Prince, Nirala, Royal Standard, Little Seoul, …, and the query result is a binary DB pointer xt = 0 0 0 … 0 1. The Belief Tracker (pt), Intent Network (zt) and Generation Network are unchanged.]
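The database access step can be sketched as: take the most likely value from the belief, form a query, and mark the matching entities in a binary vector. The snippet below assumes an in-memory SQLite table as a stand-in for the MySQL database on the slide. The venue names are from the slide, but every food type except Little Seoul's is a made-up placeholder, and `db_pointer` is a hypothetical helper name.

```python
import sqlite3

# Venue names are from the slide; the food types other than Little Seoul's
# are made-up placeholders so that the query has something to filter.
VENUES = [
    ("Seven Days", "chinese"),
    ("Curry Prince", "indian"),
    ("Nirala", "indian"),
    ("Royal Standard", "british"),
    ("Little Seoul", "korean"),
]


def db_pointer(belief, venues=VENUES):
    """Turn a food belief into a query, run it, and return a binary vector over venues."""
    food = max(belief, key=belief.get)        # most likely value under the belief
    conn = sqlite3.connect(":memory:")        # in-memory stand-in for the MySQL database
    conn.execute("CREATE TABLE restaurants (name TEXT, food TEXT)")
    conn.executemany("INSERT INTO restaurants VALUES (?, ?)", venues)
    rows = conn.execute(
        "SELECT name FROM restaurants WHERE food = ?", (food,)
    ).fetchall()
    conn.close()
    matched = {name for (name,) in rows}
    # One entry per venue: 1 if the venue satisfies the query, else 0.
    return food, [1 if name in matched else 0 for name, _ in venues]


food, pointer = db_pointer({"korean": 0.7, "british": 0.2, "french": 0.1})
```

With this toy table the pointer is 1 only in the last position, which mirrors the "0 0 0 … 0 1" vector drawn on the slide.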
Decision Making
[Figure: the Database Accessing diagram extended with a Policy Network, which combines the intent vector zt, the belief pt and the DB pointer xt before the Generation Network produces "<v.name> serves great <v.food>."]
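One simple way to fuse the three vectors in the diagram is a single tanh layer over the sum of three linear projections. The sketch below assumes that form. The vector sizes, the random weights and the names `policy_step`, `matvec` and `random_matrix` are all illustrative and are not taken from the slides.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    """Small random weights; a trained system would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def policy_step(z_t, p_t, x_t, W_z, W_p, W_x):
    """Fuse intent, belief and DB vectors into one action vector with a tanh layer."""
    projected = zip(matvec(W_z, z_t), matvec(W_p, p_t), matvec(W_x, x_t))
    return [math.tanh(a + b + c) for a, b, c in projected]


rng = random.Random(0)
z_t = [0.2, -0.4, 0.1, 0.7]      # toy intent vector
p_t = [0.7, 0.2, 0.1]            # belief over Korean / British / French
x_t = [0, 0, 0, 0, 1]            # DB pointer: only the last venue matches
ACTION_SIZE = 6                  # arbitrary size chosen for the sketch

o_t = policy_step(
    z_t, p_t, x_t,
    random_matrix(ACTION_SIZE, len(z_t), rng),
    random_matrix(ACTION_SIZE, len(p_t), rng),
    random_matrix(ACTION_SIZE, len(x_t), rng),
)
```

The resulting vector `o_t` plays the role of the system action that conditions the Generation Network.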
[Figure: the Decision Making diagram with a "Copy field" label added beside the Policy Network. All other components (Belief Tracker, Database Operator, Database, DB pointer xt, Intent Network, Generation Network, and the vectors zt, pt, qt) are unchanged.]
Wizard of Oz Data Collection (step 1)
User: Hi, I want a cheap Korean restaurant.
Wizard: What area are you looking for?
Task: Find a restaurant, cheap, Korean, North. Ask phone number.
What user wants?
Food   Korean
Price  Cheap
Area   N/A
Search Table
Little Seoul …
Best Korea …
…
Wizard of Oz Data Collection (step 2)
User: Hi, I want a cheap Korean restaurant.
Wizard: What area are you looking for?
User: Somewhere in the north.
Wizard: Little Seoul is nice one in the north.
Task: Find a restaurant, cheap, Korean, North. Ask phone number.
What user wants?
Food   Korean
Price  Cheap
Area   North
Search Table
Little Seoul …
Wizard of Oz Data Collection (step 3)
User: Hi, I want a cheap Korean restaurant.
Wizard: What area are you looking for?
User: Somewhere in the north.
Wizard: Little Seoul is nice one in the north.
User: What is the phone number?
Wizard: Its phone number is 01223456789.
Task: Find a restaurant, cheap, Korean, North. Ask phone number.
What user wants?
Food   Korean
Price  Cheap
Area   North
Search Table
Little Seoul …
Wizard of Oz Data Collection (step 4)
User: Hi, I want a cheap Korean restaurant.
Wizard: What area are you looking for?
User: Somewhere in the north.
Wizard: Little Seoul is nice one in the north.
User: What is the phone number?
Wizard: Its phone number is 01223456789.
User: Thank you very much, goodbye.
Wizard: Thank you for using the system.
Task: Find a restaurant, cheap, Korean, North. Ask phone number.
What user wants?
Food   Korean
Price  Cheap
Area   North
Search Table
Little Seoul …
Wizard of Oz Data Collection
User: Hi, I want a cheap Korean restaurant.
Wizard: What area are you looking for?
User: Somewhere in the north.
Wizard: Little Seoul is nice one in the north.
User: What is the phone number?
Wizard: Its phone number is 01223456789.
User: Thank you very much, goodbye.
Wizard: Thank you for using the system.
What user wants?
Food   Korean
Price  Cheap
Area   North
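The wizard's side of the walkthrough above amounts to updating a small slot table after each user turn and filtering the search table with it. A minimal sketch follows. The venue attributes are hypothetical (the slides only show the names "Little Seoul" and "Best Korea" in the search table), and `search` is an illustrative helper name.

```python
# Venue attributes are hypothetical; the slides only show the names
# "Little Seoul" and "Best Korea" in the search table.
VENUES = [
    {"name": "Little Seoul", "food": "Korean", "price": "Cheap", "area": "North"},
    {"name": "Best Korea", "food": "Korean", "price": "Cheap", "area": "South"},
]


def search(venues, wants):
    """Return names of venues that match every slot the wizard has filled in."""
    filled = {slot: value for slot, value in wants.items() if value != "N/A"}
    return [v["name"] for v in venues
            if all(v[slot.lower()] == value for slot, value in filled.items())]


wants = {"Food": "N/A", "Price": "N/A", "Area": "N/A"}

# Turn 1: "Hi, I want a cheap Korean restaurant."
wants.update({"Food": "Korean", "Price": "Cheap"})
after_turn_1 = search(VENUES, wants)

# Turn 2: "Somewhere in the north."
wants["Area"] = "North"
after_turn_2 = search(VENUES, wants)
```

The slot table recorded by the wizard at each turn is what later serves as the label for training the belief trackers.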
Wizard of Oz Data Collection
¤ Online parallel version of WOZ on MTurk
  ¤ Randomly hire a worker to be user/wizard.
  ¤ Task: Enter an appropriate response for one turn.
  ¤ Repeat the process until all dialogues are finished.
¤ Example user page
Wizard of Oz Data Collection
¤ Example wizard page
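The turn-level, parallel collection scheme can be simulated as a loop in which each HIT adds exactly one turn to one unfinished dialogue. This is a simplified sketch: it assumes every dialogue ends after a fixed number of turns, whereas real dialogues end when the task is complete, and the `collect` function and its parameters are hypothetical.

```python
import random


def collect(num_dialogues, turns_per_dialogue, seed=0):
    """Simulate turn-level crowdsourcing: each HIT adds one turn to one open dialogue."""
    rng = random.Random(seed)
    dialogues = [[] for _ in range(num_dialogues)]
    hits = 0
    while True:
        unfinished = [d for d in dialogues if len(d) < turns_per_dialogue]
        if not unfinished:
            break
        dialogue = rng.choice(unfinished)
        # The role alternates: the user opens, the wizard replies, and so on.
        role = "user" if len(dialogue) % 2 == 0 else "wizard"
        dialogue.append((role, "<response typed by worker %d>" % hits))
        hits += 1
    return dialogues, hits


dialogues, hits = collect(num_dialogues=3, turns_per_dialogue=4)
```

Since no worker has to wait for a partner, many dialogues progress at once, which matches the "no latency, parallel" property claimed in the conclusion.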
CamRest676 dataset
¤ Ontology:
  ¤ Cambridge restaurant domain, 99 venues.
  ¤ 3 informable slots: area, price range, food type
  ¤ 3 requestable slots: address, phone, postcode
¤ Dataset
  ¤ 676 dialogues, ~2750 turns
  ¤ 3000 HITs, takes 3 days, costs ~400 USD
  ¤ Data cleaning takes 2-3 days for one person
Link: https://www.repository.cam.ac.uk/handle/1810/260970
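The headline figures above imply a few derived quantities that are worth spelling out. The short calculation below uses only the approximate numbers from the slide, so the results are rough estimates.

```python
DIALOGUES = 676
TURNS = 2750          # approximate, from the slide
HITS = 3000
COST_USD = 400        # approximate, from the slide

turns_per_dialogue = TURNS / DIALOGUES      # about 4.07
cost_per_hit = COST_USD / HITS              # about 0.13 USD
cost_per_dialogue = COST_USD / DIALOGUES    # about 0.59 USD

print(f"{turns_per_dialogue:.2f} turns per dialogue")
print(f"{cost_per_hit:.3f} USD per HIT")
print(f"{cost_per_dialogue:.2f} USD per dialogue")
```

So a dialogue averages roughly four turns and costs well under one US dollar to collect, before the data cleaning time is counted.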
Experiments
¤ Experimental details
  ¤ Train/valid/test: 3/1/1
  ¤ SGD, l2 regularisation, early stopping, gradient clip = 1
  ¤ Hidden size = 50, Vocab size: ~500
¤ Two-stage training:
  ¤ Training trackers with label cross entropy
  ¤ Training other parts with response cross entropy
¤ Decoding
  ¤ Beam search w/ beam width 10
  ¤ Decode with average word likelihood
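The decoding setup can be illustrated with a toy beam search that ranks hypotheses by average per-word log-likelihood instead of total log-likelihood. This is a sketch, not the authors' decoder: "average word likelihood" is interpreted here as length-normalised log-probability, and the `MODEL` table of next-word probabilities is invented for the example.

```python
import math

# Toy next-word distributions keyed by the previous word; purely illustrative.
MODEL = {
    "<s>": {"<v.name>": 0.6, "thank": 0.4},
    "<v.name>": {"serves": 0.9, "</s>": 0.1},
    "serves": {"<v.food>": 0.5, "great": 0.5},
    "great": {"<v.food>": 1.0},
    "<v.food>": {"</s>": 1.0},
    "thank": {"you": 1.0},
    "you": {"</s>": 1.0},
}


def average_score(hypothesis):
    """Summed log-probability divided by the number of generated words."""
    tokens, logp = hypothesis
    return logp / (len(tokens) - 1)       # the start symbol is not counted


def beam_search(model, beam_width=10, max_len=10):
    """Return finished hypotheses ranked by average per-word log-likelihood."""
    beams = [(["<s>"], 0.0)]              # (tokens, summed log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for tokens, logp in beams:
            for word, prob in model[tokens[-1]].items():
                candidates.append((tokens + [word], logp + math.log(prob)))
        candidates.sort(key=average_score, reverse=True)
        beams = []
        for hypothesis in candidates[:beam_width]:
            # Hypotheses ending in the stop symbol leave the beam.
            (finished if hypothesis[0][-1] == "</s>" else beams).append(hypothesis)
        if not beams:
            break
    finished.sort(key=average_score, reverse=True)
    return [(tokens[1:-1], average_score((tokens, logp))) for tokens, logp in finished]


ranked = beam_search(MODEL)
best_words, best_score = ranked[0]
```

In this toy model the short "thank you" has the highest total probability (0.4), but the longer "<v.name> serves great <v.food>" wins on average word likelihood, which shows how length normalisation counters the bias towards short responses.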
Response Generation Task
Model                               Match(%)  Success(%)  BLEU
Seq2Seq [Sutskever et al, 2014]     -         -           0.1718
HRED [Serban et al, 2015]           -         -           0.1861
Our model w/o req. trackers         89.70     30.60       0.1799
Our full model                      86.34     75.16       0.2313
Our full model + attention          90.88     80.02       0.2388
Human evaluation
[Figure panels: Quality assessment; System Comparison]
Example dialogues
Visualising action embedding
Conclusion
¤ An end-to-end trainable task-oriented dialogue system architecture is introduced.
¤ A complementary WOZ data collection is used to collect the training data (no latency, parallel, cheap).
¤ Results show that it can learn from human-human conversations and help users to complete tasks.
¤ Explicit language grounding is crucial, but what is the best way to represent semantics?
Future Work
¤ Latent Intention Dialogue Models (under review)
  ¤ Learn an embedded latent policy from a supervised corpus.
  ¤ Fine-tune policy using reinforcement learning.
¤ Multi-domain Neural Dialogue Systems
  ¤ Collect WOZ data across several domains.
  ¤ Train a neural controller to read/write memory tapes (trackers) and emit responses.
The paper
¤ Tsung-Hsien Wen, David Vandyke, Nikola Mrksic, Milica Gasic, Lina M. Rojas-Barahona, Pei-Hao Su, Stefan Ultes, and Steve Young. A Network-based End-to-End Trainable Task-oriented Dialogue System. To appear, EACL 2017.
¤ Tsung-Hsien Wen, Milica Gasic, Nikola Mrksic, Lina M. Rojas-Barahona, Pei-Hao Su, Stefan Ultes, David Vandyke, and Steve Young. Conditional Generation and Snapshot Learning in Neural Dialogue Systems. EMNLP 2016.
References
¤ P-H. Su, M. Gasic, N. Mrksic, L. Rojas-Barahona, S. Ultes, D. Vandyke, T-H. Wen, and S. Young. On-line Active Reward Learning for Policy Optimisation in Spoken Dialogue Systems, ACL 2016.
¤ M. Henderson, B. Thomson and S. Young. Word-Based Dialog State Tracking with Recurrent Neural Networks, SigDial 2014.
¤ J. Li, W. Monroe, A. Ritter, D. Jurafsky. Deep Reinforcement Learning for Dialogue Generation, EMNLP 2016.
Dialogue Systems Group
Thank you! Questions?
Tsung-Hsien Wen is supported by a studentship funded by Toshiba Research Europe Ltd, Cambridge Research Laboratory.