![Page 1: No Metrics Are Perfect: Adversarial REwardLearning for ...xwang/slides/AREL.pdf · for Visual Storytelling Xin (Eric) Wang*, Wenhu Chen*, Yuan-Fang Wang, William Wang 1. 2 Caption:](https://reader030.vdocuments.mx/reader030/viewer/2022040909/5e80cc55fc719c45a64d4349/html5/thumbnails/1.jpg)
NoMetricsArePerfect:AdversarialREward Learning
forVisualStorytellingXin(Eric)Wang*,Wenhu Chen*,Yuan-FangWang,WilliamWang
1
![Page 2: No Metrics Are Perfect: Adversarial REwardLearning for ...xwang/slides/AREL.pdf · for Visual Storytelling Xin (Eric) Wang*, Wenhu Chen*, Yuan-Fang Wang, William Wang 1. 2 Caption:](https://reader030.vdocuments.mx/reader030/viewer/2022040909/5e80cc55fc719c45a64d4349/html5/thumbnails/2.jpg)
2
Caption:Twoyoungkidswithbackpackssittingontheporch.
Image Captioning
![Page 3: No Metrics Are Perfect: Adversarial REwardLearning for ...xwang/slides/AREL.pdf · for Visual Storytelling Xin (Eric) Wang*, Wenhu Chen*, Yuan-Fang Wang, William Wang 1. 2 Caption:](https://reader030.vdocuments.mx/reader030/viewer/2022040909/5e80cc55fc719c45a64d4349/html5/thumbnails/3.jpg)
VisualStorytelling
3
Story:Thebrotherdidnotwanttotalktohissister.Thesiblingsmadeup.Theystartedtotalkandsmile.Theirparentsshowedup.Theywerehappytoseethem.
Story:Thebrother didnotwanttotalktohissister.Thesiblings madeup.Theystartedtotalkandsmile.Theirparents showedup.Theywerehappy toseethem.
Imagination
Story:Thebrother didnotwanttotalktohissister.Thesiblings madeup.Theystartedtotalkandsmile.Theirparents showedup.Theywerehappytoseethem.
Emotion Subjectiveness
![Page 4: No Metrics Are Perfect: Adversarial REwardLearning for ...xwang/slides/AREL.pdf · for Visual Storytelling Xin (Eric) Wang*, Wenhu Chen*, Yuan-Fang Wang, William Wang 1. 2 Caption:](https://reader030.vdocuments.mx/reader030/viewer/2022040909/5e80cc55fc719c45a64d4349/html5/thumbnails/4.jpg)
4
Story#2:Thebrotherandsisterwerereadyforthefirstdayofschool.Theywereexcitedtogototheirfirstdayandmeetnewfriends.Theytoldtheirmomhowhappytheywere.Theysaidtheyweregoingtomakealotofnewfriends.Thentheygotupandgotreadytogetinthecar.
VisualStorytelling
![Page 5: No Metrics Are Perfect: Adversarial REwardLearning for ...xwang/slides/AREL.pdf · for Visual Storytelling Xin (Eric) Wang*, Wenhu Chen*, Yuan-Fang Wang, William Wang 1. 2 Caption:](https://reader030.vdocuments.mx/reader030/viewer/2022040909/5e80cc55fc719c45a64d4349/html5/thumbnails/5.jpg)
5
Behavioralcloningmethods(e.g.MLE)arenotgoodenoughforvisualstorytelling
![Page 6: No Metrics Are Perfect: Adversarial REwardLearning for ...xwang/slides/AREL.pdf · for Visual Storytelling Xin (Eric) Wang*, Wenhu Chen*, Yuan-Fang Wang, William Wang 1. 2 Caption:](https://reader030.vdocuments.mx/reader030/viewer/2022040909/5e80cc55fc719c45a64d4349/html5/thumbnails/6.jpg)
ReinforcementLearning
6
o Directlyoptimizetheexistingmetrics• BLEU,METEOR,ROUGE,CIDEr• Reduceexposurebias
ReinforcementLearning(RL)
Environment
RewardFunction
OptimalPolicy
Rennie 2017, “Self-critical Sequence Training for Image Captioning”
![Page 7: No Metrics Are Perfect: Adversarial REwardLearning for ...xwang/slides/AREL.pdf · for Visual Storytelling Xin (Eric) Wang*, Wenhu Chen*, Yuan-Fang Wang, William Wang 1. 2 Caption:](https://reader030.vdocuments.mx/reader030/viewer/2022040909/5e80cc55fc719c45a64d4349/html5/thumbnails/7.jpg)
7
We had a great time to have a lot of the. They were to be a of the. They were to be in the. The and it were to be the. The, and it were to be the.
AverageMETEORscore:40.2(SOTAmodel:35.0)
![Page 8: No Metrics Are Perfect: Adversarial REwardLearning for ...xwang/slides/AREL.pdf · for Visual Storytelling Xin (Eric) Wang*, Wenhu Chen*, Yuan-Fang Wang, William Wang 1. 2 Caption:](https://reader030.vdocuments.mx/reader030/viewer/2022040909/5e80cc55fc719c45a64d4349/html5/thumbnails/8.jpg)
8
I had a great time at the restaurant today. The food was delicious. I had a lot of food. I had a great time.
BLEU-4score:0
![Page 9: No Metrics Are Perfect: Adversarial REwardLearning for ...xwang/slides/AREL.pdf · for Visual Storytelling Xin (Eric) Wang*, Wenhu Chen*, Yuan-Fang Wang, William Wang 1. 2 Caption:](https://reader030.vdocuments.mx/reader030/viewer/2022040909/5e80cc55fc719c45a64d4349/html5/thumbnails/9.jpg)
9
NoMetricsArePerfect!
![Page 10: No Metrics Are Perfect: Adversarial REwardLearning for ...xwang/slides/AREL.pdf · for Visual Storytelling Xin (Eric) Wang*, Wenhu Chen*, Yuan-Fang Wang, William Wang 1. 2 Caption:](https://reader030.vdocuments.mx/reader030/viewer/2022040909/5e80cc55fc719c45a64d4349/html5/thumbnails/10.jpg)
Inverse Reinforcement Learning
10
InverseReinforcementLearning(IRL)
Environment
RewardFunction
OptimalPolicy
ReinforcementLearning(RL)
Environment
RewardFunction
OptimalPolicy
Reinforcement Learning
![Page 11: No Metrics Are Perfect: Adversarial REwardLearning for ...xwang/slides/AREL.pdf · for Visual Storytelling Xin (Eric) Wang*, Wenhu Chen*, Yuan-Fang Wang, William Wang 1. 2 Caption:](https://reader030.vdocuments.mx/reader030/viewer/2022040909/5e80cc55fc719c45a64d4349/html5/thumbnails/11.jpg)
AdversarialREward Learning(AREL)
11
Environment
AdversarialObjective PolicyModelRewardModel
Reward
GeneratedStory
ReferenceStory
RL
InverseRL
![Page 12: No Metrics Are Perfect: Adversarial REwardLearning for ...xwang/slides/AREL.pdf · for Visual Storytelling Xin (Eric) Wang*, Wenhu Chen*, Yuan-Fang Wang, William Wang 1. 2 Caption:](https://reader030.vdocuments.mx/reader030/viewer/2022040909/5e80cc55fc719c45a64d4349/html5/thumbnails/12.jpg)
PolicyModel𝜋"
12
CNN
Mybrotherrecentlygraduatedcollege.
Itwasaformalcapandgownevent.
Mymomanddadattended.
Later,myauntandgrandmashowedup.
Whentheeventwasoverheevengotcongratulatedbythemascot.
Encoder Decoder
![Page 13: No Metrics Are Perfect: Adversarial REwardLearning for ...xwang/slides/AREL.pdf · for Visual Storytelling Xin (Eric) Wang*, Wenhu Chen*, Yuan-Fang Wang, William Wang 1. 2 Caption:](https://reader030.vdocuments.mx/reader030/viewer/2022040909/5e80cc55fc719c45a64d4349/html5/thumbnails/13.jpg)
RewardModel𝑅$
13
Story Convolution FClayerPooling
CNN
mymomanddad
attended.
<EOS>
Reward
+
Kim 2014, “Convolutional Neural Networks for Sentence Classification”
![Page 14: No Metrics Are Perfect: Adversarial REwardLearning for ...xwang/slides/AREL.pdf · for Visual Storytelling Xin (Eric) Wang*, Wenhu Chen*, Yuan-Fang Wang, William Wang 1. 2 Caption:](https://reader030.vdocuments.mx/reader030/viewer/2022040909/5e80cc55fc719c45a64d4349/html5/thumbnails/14.jpg)
AssociatingReward withStory
Approximatedatadistribution Partitionfunction
Story
Optimalrewardfunction𝑅$∗(𝑊) isachievedwhen
LeCun et al. 2006, “A tutorial on energy-based learning”
Energy-basedmodels associateanenergyvalue𝐸$(𝑥) withasample𝑥,modelingthedataasaBoltzmanndistribution
𝑝$ 𝑥 =exp(−𝐸$(𝑥))
𝑍
RewardFunctionRewardBoltzmannDistribution
![Page 15: No Metrics Are Perfect: Adversarial REwardLearning for ...xwang/slides/AREL.pdf · for Visual Storytelling Xin (Eric) Wang*, Wenhu Chen*, Yuan-Fang Wang, William Wang 1. 2 Caption:](https://reader030.vdocuments.mx/reader030/viewer/2022040909/5e80cc55fc719c45a64d4349/html5/thumbnails/15.jpg)
ARELObjective
15
• TheobjectiveofRewardModel𝑅$:
Empiricaldistribution Policydistribution
RewardBoltzmanndistribution
• TheobjectiveofPolicyModel𝜋":
Therefore,wedefineanadversarialobjectivewithKL-divergence
![Page 16: No Metrics Are Perfect: Adversarial REwardLearning for ...xwang/slides/AREL.pdf · for Visual Storytelling Xin (Eric) Wang*, Wenhu Chen*, Yuan-Fang Wang, William Wang 1. 2 Caption:](https://reader030.vdocuments.mx/reader030/viewer/2022040909/5e80cc55fc719c45a64d4349/html5/thumbnails/16.jpg)
RewardVisualization
16
![Page 17: No Metrics Are Perfect: Adversarial REwardLearning for ...xwang/slides/AREL.pdf · for Visual Storytelling Xin (Eric) Wang*, Wenhu Chen*, Yuan-Fang Wang, William Wang 1. 2 Caption:](https://reader030.vdocuments.mx/reader030/viewer/2022040909/5e80cc55fc719c45a64d4349/html5/thumbnails/17.jpg)
Method BLEU-1 BLEU-2 BLEU-3 BLEU-4 METEOR ROUGE CIDErSeq2seq (Huangetal.) - - - - 31.4 - -HierAttRNN (Yuetal.) - - 21.0 - 34.1 29.5 7.5AREL(ours) 63.7 39.0 23.1 14.0 35.0 29.6 9.5
AutomaticEvaluation
17
Method BLEU-1 BLEU-2 BLEU-3 BLEU-4 METEOR ROUGE CIDErSeq2seq (Huangetal.) - - - - 31.4 - -HierAttRNN (Yuetal.) - - 21.0 - 34.1 29.5 7.5XE 62.3 38.2 22.5 13.7 34.8 29.7 8.7AREL(ours) 63.7 39.0 23.1 14.0 35.0 29.6 9.5
Method BLEU-1 BLEU-2 BLEU-3 BLEU-4 METEOR ROUGE CIDErSeq2seq (Huangetal.) - - - - 31.4 - -HierAttRNN (Yuetal.) - - 21.0 - 34.1 29.5 7.5XE 62.3 38.2 22.5 13.7 34.8 29.7 8.7BLEU-RL 62.1 38.0 22.6 13.9 34.6 29.0 8.9METEOR-RL 68.1 35.0 15.4 6.8 40.2 30.0 1.2ROUGE-RL 58.1 18.5 1.6 0.0 27.0 33.8 0.0CIDEr-RL 61.9 37.8 22.5 13.8 34.9 29.7 8.1AREL(ours) 63.7 39.0 23.1 14.0 35.0 29.6 9.5
Method BLEU-1 BLEU-2 BLEU-3 BLEU-4 METEOR ROUGE CIDErSeq2seq (Huangetal.) - - - - 31.4 - -HierAttRNN (Yuetal.) - - 21.0 - 34.1 29.5 7.5XE 62.3 38.2 22.5 13.7 34.8 29.7 8.7BLEU-RL 62.1 38.0 22.6 13.9 34.6 29.0 8.9METEOR-RL 68.1 35.0 15.4 6.8 40.2 30.0 1.2ROUGE-RL 58.1 18.5 1.6 0.0 27.0 33.8 0.0CIDEr-RL 61.9 37.8 22.5 13.8 34.9 29.7 8.1AREL(ours) 63.7 39.0 23.1 14.0 35.0 29.6 9.5
Method BLEU-1 BLEU-2 BLEU-3 BLEU-4 METEOR ROUGE CIDErSeq2seq (Huangetal.) - - - - 31.4 - -HierAttRNN (Yuetal.) - - 21.0 - 34.1 29.5 7.5XE 62.3 38.2 22.5 13.7 34.8 29.7 8.7BLEU-RL 62.1 38.0 22.6 13.9 34.6 29.0 8.9METEOR-RL 68.1 35.0 15.4 6.8 40.2 30.0 1.2ROUGE-RL 58.1 18.5 1.6 0.0 27.0 33.8 0.0CIDEr-RL 61.9 37.8 22.5 13.8 34.9 29.7 8.1GAN 62.8 38.8 23.0 14.0 35.0 29.5 9.0AREL(ours) 63.7 39.0 23.1 14.0 35.0 29.6 9.5
Huang et al. 2016, “Visual Storytelling” Yu et al. 2017, “Hierarchically-Attentive RNN for Album Summarization and Storytelling”
![Page 18: No Metrics Are Perfect: Adversarial REwardLearning for ...xwang/slides/AREL.pdf · for Visual Storytelling Xin (Eric) Wang*, Wenhu Chen*, Yuan-Fang Wang, William Wang 1. 2 Caption:](https://reader030.vdocuments.mx/reader030/viewer/2022040909/5e80cc55fc719c45a64d4349/html5/thumbnails/18.jpg)
HumanEvaluation
18
0% 10% 20% 30% 40% 50%
XE BLEU-RL CIDEr-RL GAN AREL
TuringTest
Win Unsure
-17.5 -13.7-26.1
-6.3
![Page 19: No Metrics Are Perfect: Adversarial REwardLearning for ...xwang/slides/AREL.pdf · for Visual Storytelling Xin (Eric) Wang*, Wenhu Chen*, Yuan-Fang Wang, William Wang 1. 2 Caption:](https://reader030.vdocuments.mx/reader030/viewer/2022040909/5e80cc55fc719c45a64d4349/html5/thumbnails/19.jpg)
HumanEvaluation
19
Relevance:thestoryaccuratelydescribeswhatishappeninginthephotostreamandcoversthemainobjects.Expressiveness:coherence,grammaticallyandsemanticallycorrect,norepetition,expressivelanguagestyle.Concreteness:thestoryshouldnarrateconcretelywhatisintheimagesratherthangivingverygeneraldescriptions.
PairwiseComparison
![Page 20: No Metrics Are Perfect: Adversarial REwardLearning for ...xwang/slides/AREL.pdf · for Visual Storytelling Xin (Eric) Wang*, Wenhu Chen*, Yuan-Fang Wang, William Wang 1. 2 Caption:](https://reader030.vdocuments.mx/reader030/viewer/2022040909/5e80cc55fc719c45a64d4349/html5/thumbnails/20.jpg)
20
XE-ss Wetookatriptothemountains.
Thereweremanydifferentkindsofdifferentkinds.
Wehadagreattime.
Hewasagreattime.
Itwasabeautifulday.XE-ss Wetookatripto
themountains.
Thereweremanydifferentkindsofdifferentkinds.
Wehadagreattime.
Hewasagreattime.
Itwasabeautifulday.
ARELThefamilydecidedtotakeatriptothecountryside.
Thereweresomanydifferentkindsofthingstosee.
Thefamilydecidedtogoonahike. Ihadagreattime.
Attheendoftheday,wewereabletotakeapictureofthebeautifulscenery.
Human-createdStory
Wewentonahikeyesterday.
Therewerealotofstrangeplantsthere.
Ihadagreattime.Wedrankalotofwaterwhilewewerehiking.
Theviewwasspectacular.
XE-ss Wetookatriptothemountains.
Thereweremanydifferentkindsofdifferentkinds.
Wehadagreattime.
Hewasagreattime.
Itwasabeautifulday.
ARELThefamilydecidedtotakeatriptothecountryside.
Thereweresomanydifferentkindsofthingstosee.
Thefamilydecidedtogoonahike. Ihadagreattime.
Attheendoftheday,wewereabletotakeapictureofthebeautifulscenery.
![Page 21: No Metrics Are Perfect: Adversarial REwardLearning for ...xwang/slides/AREL.pdf · for Visual Storytelling Xin (Eric) Wang*, Wenhu Chen*, Yuan-Fang Wang, William Wang 1. 2 Caption:](https://reader030.vdocuments.mx/reader030/viewer/2022040909/5e80cc55fc719c45a64d4349/html5/thumbnails/21.jpg)
Takeaway
21
o Generatingandevaluatingstoriesarebothchallengingduetothecomplicatednatureofstories
o Noexistingmetricsareperfectforeithertrainingortestingo ARELisabetterlearningframeworkforvisualstorytelling
§ Canbeappliedtoothergenerationtaskso Ourapproachismodel-agnostic
§ Advancedmodelsà betterperformance
![Page 22: No Metrics Are Perfect: Adversarial REwardLearning for ...xwang/slides/AREL.pdf · for Visual Storytelling Xin (Eric) Wang*, Wenhu Chen*, Yuan-Fang Wang, William Wang 1. 2 Caption:](https://reader030.vdocuments.mx/reader030/viewer/2022040909/5e80cc55fc719c45a64d4349/html5/thumbnails/22.jpg)
Thanks!
22
Paper:https://arxiv.org/abs/1804.09160Code: https://github.com/littlekobe/AREL