005 20151130 adversary_networks
TRANSCRIPT
Adversarial Networks
Tran Quoc Hoan, Paper Alert @ 2015-11-30
Introduction
• Successes in deep learning: discriminative models that map a high-dimensional, rich sensory input to a class label
-> Problem: what is different from human recognition ability? (today's topic)
• Promise of deep learning: discover rich, hierarchical models that represent probability distributions over the data
-> Generative models by DNN (last week + next week)
Today's papers
• Intriguing Properties of Neural Networks — Szegedy, Christian, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus (2013)
• Explaining and Harnessing Adversarial Examples — Goodfellow, Ian J., Jonathon Shlens, and Christian Szegedy (ICLR 2015)
• Generative Adversarial Networks — Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio (NIPS 2014)
• Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images — Nguyen, Anh, Jason Yosinski, and Jeff Clune (CVPR 2015)
Intriguing Properties of Neural Networks
Szegedy, Christian, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus (2013)
Adversarial Examples
• What is an adversarial example?
– Adding an imperceptible (to humans) perturbation that makes the network misclassify an image
• Why do adversarial examples exist?
– DNNs learn input-output mappings that are discontinuous to a significant extent
• Interesting observation
– Adversarial examples generated for network A can also make network B fail
Generate Adversarial Examples
• Input image x ∈ R^m
• Classifier f : R^m -> {1 ... k}
• Target label l
• Minimize ||r||_2 subject to f(x + r) = l and x + r ∈ [0, 1]^m
• The minimizer x + r is the closest image to x classified as l by f
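The optimization above can be sketched on a toy model. The paper solves it with box-constrained L-BFGS (plus a line search over the penalty weight); the sketch below is a simplified stand-in that runs plain gradient descent on the penalized objective c·||r||^2 + loss(f(x+r), l), with a hypothetical linear softmax classifier in place of a DNN.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def adversarial_perturbation(W, x, target, c=0.05, lr=0.1, steps=300):
    """Minimize c*||r||^2 + cross_entropy(softmax(W(x+r)), target),
    keeping x + r inside the box [0, 1]^n by clipping."""
    r = np.zeros_like(x)
    onehot = np.zeros(W.shape[0])
    onehot[target] = 1.0
    for _ in range(steps):
        x_adv = np.clip(x + r, 0.0, 1.0)
        p = softmax(W @ x_adv)
        # gradient of the cross-entropy term w.r.t. the input: W^T (p - onehot)
        r -= lr * (2 * c * r + W.T @ (p - onehot))
    return np.clip(x + r, 0.0, 1.0) - x

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 8))          # toy "classifier" f(x) = argmax Wx
x = rng.uniform(size=8)
target = int(np.argmin(W @ x))       # least-likely class as target label l
r = adversarial_perturbation(W, x, target)
```

With the penalty weight c small, the solver trades a slightly larger r for pushing probability onto the target label; the paper instead searches over c to find the minimal perturbation.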
Adversarial examples (figure slides)
Intriguing properties
• Properties
– The generated adversarial examples are visually hard to distinguish from the originals
– Cross-model generalization (different hyper-parameters)
– Cross-training-set generalization (different training sets)
• Observations
– Adversarial examples are universal
– Feeding adversarial examples back into training might improve generalization of the model (but remains hard)
Fooling DNN
• Imperceptible adversarial examples that cause misclassification
• Opposite direction: producing images that are completely unrecognizable to humans, but which DNNs believe to be recognizable objects with high confidence (99%)
Nguyen A, Yosinski J, Clune J. Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images. (CVPR 2015)
Generating Images with Evolution (one class)
• Organisms (images) are randomly perturbed and selected based on a fitness function
• Fitness function: the prediction value (confidence) the DNN assigns to the image for the target class
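The perturb-and-select loop above can be sketched as a simple hill climber. The DNN's confidence is replaced here by a hypothetical similarity-based scorer, since the point is only the structure of the evolutionary loop, not the model being fooled.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a DNN's confidence in one class; any black-box scorer works.
target_pattern = rng.uniform(size=(8, 8))
def fitness(image):
    return 1.0 / (1.0 + np.abs(image - target_pattern).mean())

def evolve_one_class(generations=300, mutation_scale=0.1):
    """Single-class EA sketch: randomly perturb the organism and keep the
    mutant only if the fitness (the scorer's confidence) increases."""
    image = rng.uniform(size=(8, 8))
    best = fitness(image)
    for _ in range(generations):
        mutant = np.clip(image + rng.normal(scale=mutation_scale,
                                            size=image.shape), 0, 1)
        score = fitness(mutant)
        if score > best:
            image, best = mutant, score
    return image, best

image, best = evolve_one_class()
```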
Generating Images with Evolution (multi-class)
• Algorithm: multi-dimensional archive of phenotypic elites, MAP-Elites (A. Cully et al. 2015)
• Procedure:
– Randomly choose an organism and mutate it randomly
– Show the mutated organism to the DNN. If the prediction score is higher than the current highest score of ANY class, make the organism the champion of that class
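The multi-class procedure keeps one champion slot per class. A minimal sketch, with a random linear softmax standing in for the DNN (the scorer and dimensions are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
N_CLASSES, DIM = 5, 16

# Stand-in for DNN class scores: softmax over a fixed random linear map.
W = rng.normal(size=(N_CLASSES, DIM))
def class_scores(image):
    z = W @ image
    e = np.exp(z - z.max())
    return e / e.sum()

def map_elites(iterations=1000, mutation_scale=0.2):
    """MAP-Elites sketch: pick a random champion, mutate it, and if the
    mutant scores higher than the current champion of ANY class, it
    becomes that class's champion."""
    archive = {}                      # class index -> (image, score)
    for _ in range(iterations):
        if archive:
            parent = archive[rng.choice(list(archive))][0]
        else:
            parent = rng.uniform(size=DIM)
        mutant = np.clip(parent + rng.normal(scale=mutation_scale, size=DIM),
                         0, 1)
        scores = class_scores(mutant)
        for c in range(N_CLASSES):
            if c not in archive or scores[c] > archive[c][1]:
                archive[c] = (mutant, scores[c])
    return archive

archive = map_elites()
```

Champion scores only ever increase, which is why the fooling images end up with such high confidences.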
Encoding an Image
• Direct encoding:
– For MNIST: 28x28 pixels
– For ImageNet: 256x256 pixels, each pixel has 3 channels (H, S, V)
• Values are independently mutated
– 10% chance of being chosen; the chance drops by half every 1000 generations
– Mutated via the polynomial mutation operator
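The mutation scheme can be sketched as follows. The 10% rate halving every 1000 generations is from the slide; the distribution index eta and the [0, 1] bounds are my assumptions for Deb's polynomial mutation operator.

```python
import numpy as np

rng = np.random.default_rng(0)

def polynomial_mutation(genome, rate, eta=20.0, low=0.0, high=1.0):
    """Polynomial mutation: each gene mutates with probability `rate`;
    the offset is drawn from a polynomial distribution concentrated near
    the parent value (larger eta means smaller steps)."""
    mutant = genome.copy()
    mask = rng.random(genome.shape) < rate
    u = rng.random(genome.shape)
    delta = np.where(u < 0.5,
                     (2 * u) ** (1 / (eta + 1)) - 1,
                     1 - (2 * (1 - u)) ** (1 / (eta + 1)))
    mutant[mask] += delta[mask] * (high - low)
    return np.clip(mutant, low, high)

# Per-gene mutation rate: starts at 10%, halves every 1000 generations.
def mutation_rate(generation, start=0.10, half_life=1000):
    return start * 0.5 ** (generation // half_life)

genome = rng.uniform(size=28 * 28)
child = polynomial_mutation(genome, mutation_rate(0))
```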
Directly Encoded Images (figure)
Encoding an Image
• Indirect encoding:
– Very likely to produce regular images with meaningful patterns
– Both humans and DNNs can recognize them
– Compositional pattern-producing network (CPPN), similar to an ANN (K. O. Stanley 2007): takes the position of a pixel (x, y) as input and outputs its colour value via the network
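A minimal CPPN sketch. The specific architecture and activations below are illustrative assumptions (real CPPNs evolve both topology and weights); the point is that periodic and symmetric activations over pixel coordinates produce regular, patterned images.

```python
import numpy as np

rng = np.random.default_rng(0)

def cppn_image(size=32, hidden=8):
    """Tiny random network mapping each pixel position (x, y) — plus the
    distance from the centre — to an intensity in [0, 1]."""
    xs, ys = np.meshgrid(np.linspace(-1, 1, size), np.linspace(-1, 1, size))
    coords = np.stack([xs.ravel(), ys.ravel(),
                       np.hypot(xs, ys).ravel()], axis=1)
    W1 = rng.normal(scale=2.0, size=(coords.shape[1], hidden))
    W2 = rng.normal(size=(hidden, hidden))
    W3 = rng.normal(size=(hidden, 1))
    h1 = np.sin(coords @ W1)               # periodic activation
    h2 = np.exp(-(h1 @ W2) ** 2)           # gaussian activation
    out = 1 / (1 + np.exp(-(h2 @ W3)))     # sigmoid -> [0, 1] intensity
    return out.reshape(size, size)

img = cppn_image()
```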
CPPN-encoded Images (figure)
MNIST - Irregular Images (figure)
LeNet: 99.99% median confidence, 200 generations
MNIST - Regular Images (figure)
LeNet: 99.99% median confidence, 200 generations
ImageNet - Irregular Images (figure)
AlexNet: 21.59% median confidence, 20,000 generations; 45 classes: >99% confidence
ImageNet - Irregular Images (more figures)
ImageNet - Regular Images (figures)
AlexNet: 88.11% median confidence, 5,000 generations (evolution figures)
Dogs and cats
Fooling Closely Related Classes (figure)
Repetition of Patterns (figure)
Comparison with related research (figure)
Explaining and Harnessing Adversarial Examples
Goodfellow, Ian J., Jonathon Shlens, and Christian Szegedy (ICLR 2015)
Why Do Adversarial Examples Exist?
• Past explanations
– Extreme nonlinearity of DNNs
– Insufficient model averaging
– Insufficient regularization
• New explanation
– Linear behavior in high-dimensional spaces is sufficient to cause adversarial examples
Linear Explanation of Adversarial Examples
• Adversarial example: x~ = x + η, with perturbation ||η||_∞ < ε
• Pixel value precision: typically 1/255. A perturbation with ε below this precision should be meaningless (the classifier is expected to respond identically)
• Activation on the adversarial example: w^T x~ = w^T x + w^T η
• η = sign(w) · ε maximizes the increase of the activation subject to the ∞-norm constraint
• Assume the average magnitude of the weight vector is m and the dimension is n; then the increase of the activation is ε m n
• A simple linear model can have adversarial examples as long as its input has sufficient dimensionality.
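The dimensionality argument can be checked numerically. A minimal sketch, with random Gaussian weights standing in for a learned w: with η = ε·sign(w), the activation change w^T η equals ε||w||_1 = εmn, which grows linearly in n even though each coordinate changes by less than the pixel precision.

```python
import numpy as np

rng = np.random.default_rng(0)
eps = 1.0 / 255          # at the typical pixel precision

# w^T eta with eta = eps*sign(w) equals eps*||w||_1 = eps*m*n,
# where m is the average weight magnitude and n the input dimension.
for n in (100, 10_000, 1_000_000):
    w = rng.normal(size=n)
    eta = eps * np.sign(w)
    m = np.abs(w).mean()
    print(n, w @ eta, eps * m * n)   # the two values coincide
```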
Faster Way to Generate Adversarial Examples
• Cost function: J(θ, x, y)
• Perturbation: η = ε · sign(∇_x J(θ, x, y)) — the "fast gradient sign method"
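The fast gradient sign method is a one-liner once the gradient is available. A sketch on logistic regression (a toy model of my choosing, labels y ∈ {−1, +1}; the FGSM formula itself is from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost(w, b, x, y):
    """Logistic-regression cost J for a label y in {-1, +1}."""
    return np.log1p(np.exp(-y * (w @ x + b)))

def fgsm(w, b, x, y, eps):
    """eta = eps * sign(grad_x J); for this model
    grad_x J = -y * sigmoid(-y*(w@x+b)) * w."""
    grad_x = -y * sigmoid(-y * (w @ x + b)) * w
    return eps * np.sign(grad_x)

n = 1000
w, b = rng.normal(size=n), 0.0
x, y = rng.uniform(size=n), 1
eta = fgsm(w, b, x, y, eps=0.1)
print(cost(w, b, x, y), cost(w, b, x + eta, y))
```

Because sign(grad_x) = −y·sign(w) here, the perturbation subtracts ε||w||_1 from the margin, so the cost rises sharply in high dimensions.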
Adversarial Training of Linear Models
• Simple case: logistic regression with labels y ∈ {−1, 1}
• Train by gradient descent on the expected loss ζ(−y(w^T x + b)), where ζ(z) = log(1 + e^z)
• The adversarial training version trains on the worst-case perturbed input, giving the regularized cost function ζ(ε||w||_1 − y(w^T x + b))
• On MNIST: error rate drops from 0.94% to 0.84%
• On adversarial examples: error rate drops from 89.4% to 17.9%
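The regularized cost follows from plugging the worst-case L∞ perturbation into the logistic loss. A sketch with toy weights (the derivation mirrors the paper's linear-model analysis; the data here is random for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def softplus(z):
    return np.log1p(np.exp(z))

def standard_loss(w, b, x, y):
    return softplus(-y * (w @ x + b))

def adversarial_loss(w, b, x, y, eps):
    """Worst-case loss under ||eta||_inf <= eps: the optimal attack
    x - y*eps*sign(w) turns the loss into
    softplus(eps*||w||_1 - y*(w@x+b)), an L1-style penalty on w."""
    return softplus(eps * np.abs(w).sum() - y * (w @ x + b))

w, b = rng.normal(size=50), 0.1
x, y = rng.uniform(size=50), -1
worst_x = x - y * 0.1 * np.sign(w)
print(adversarial_loss(w, b, x, y, 0.1),
      standard_loss(w, b, worst_x, y))    # identical by construction
```

Note the worst-case loss always upper-bounds the clean loss, which is why training on it acts as a regularizer.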
Why Adversarial Examples Generalize
• An adversarial example generated for one model is often misclassified by other models as well
• When different models misclassify an adversarial example, they often agree with each other on the wrong class
• As long as the perturbation has a positive dot product with the gradient of the cost function, adversarial examples work
• Hypothesis: neural networks trained on the same training set all resemble the linear classifier learned on that training set
– Such stability of the underlying classification weights causes the stability of adversarial examples
Generative Adversarial Networks
Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio (NIPS 2014)
Generative Adversarial Nets
• Two types of models:
– Generative model: learns the joint probability distribution of the data, p(x, y)
– Discriminative model: learns the conditional probability distribution of the data, p(y|x)
– A discriminative model is much easier to train than a generative model
Main idea
• Adversarial process:
– Simultaneously train two models:
• A generative model G that captures the data distribution
• A discriminative model D that tells whether a sample comes from the training data or not
– Optimal solution:
• G recovers the data distribution
• D is 1/2 everywhere
Formulation
• Purpose: learn the generator's distribution p_g over data x
– Prior input noise p_z(z)
– Mapping to data space G(z; θ_g), a differentiable function represented by a multilayer perceptron with parameters θ_g
– A multilayer perceptron discriminative function D(x; θ_d) represents the probability that x came from the data rather than from p_g
Two-player minimax game (equation slide)
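The equation on this slide is the value function from the paper, together with the optimal discriminator that yields the "D is 1/2 everywhere" solution stated on the previous slide:

```latex
\min_G \max_D V(D,G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\!\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z(z)}\!\left[\log\!\left(1 - D(G(z))\right)\right],
\qquad
D^{*}_{G}(x) = \frac{p_{\mathrm{data}}(x)}{p_{\mathrm{data}}(x) + p_g(x)},
```

so when G recovers the data distribution (p_g = p_data), the optimal D is 1/2 everywhere.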
Algorithm
Minibatch stochastic gradient descent: alternate k steps of updating D with one step of updating G (Algorithm 1 in the paper).
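The alternating-SGD algorithm can be sketched on a deliberately tiny instance. Everything model-specific below is an assumption for illustration: data from N(4, 1), a generator that only learns a mean shift, a logistic discriminator, and hand-derived gradients.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

theta = 0.0          # generator parameter: G(z) = z + theta
a, b = 1.0, 0.0      # discriminator parameters: D(x) = sigmoid(a*x + b)
lr, batch = 0.05, 64

for step in range(2000):
    x = rng.normal(4.0, 1.0, batch)          # real samples
    z = rng.normal(0.0, 1.0, batch)
    g = z + theta                            # fake samples
    # D step: ascend log D(x) + log(1 - D(G(z)))
    dx, dg = sigmoid(a * x + b), sigmoid(a * g + b)
    a += lr * ((1 - dx) * x - dg * g).mean()
    b += lr * ((1 - dx) - dg).mean()
    # G step: ascend log D(G(z)) (the non-saturating variant from the paper)
    g = z + theta
    dg = sigmoid(a * g + b)
    theta += lr * ((1 - dg) * a).mean()

print(theta)   # drifts toward the data mean of 4
```

The G step uses the paper's non-saturating trick (maximize log D(G(z)) rather than minimize log(1 − D(G(z)))), which keeps gradients alive early in training.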
Experiments (figures)
Today's papers (recap of the four papers listed at the beginning)
Adversarial Networks
Tran Quoc Hoan, Paper Alert @ 2015-11-30