005 20151130 adversary_networks

Adversarial Networks
Tran Quoc Hoan, Paper alert@2015-11-30

TRANSCRIPT

Page 1: 005 20151130 adversary_networks

Adversarial Networks

Tran Quoc Hoan, Paper alert@2015-11-30

Page 2:

Introduction

•  Successes in deep learning: discriminative models that map a high-dimensional, rich sensory input to a class label
   -> Problem: what is different from human recognition ability? (today's topic)

•  Promise of deep learning: discover rich, hierarchical models that represent probability distributions over the data
   -> Generative model by DNN (last week + next week)

2/40 Paper Alert @ 2015-11-30

Page 3:

Today's papers

•  Intriguing Properties of Neural Networks
   Szegedy, Christian, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus (2013)

•  Explaining and Harnessing Adversarial Examples
   Goodfellow, Ian J., Jonathon Shlens, and Christian Szegedy (2015)

•  Generative Adversarial Nets
   Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio (2015)

•  Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images
   Nguyen, Anh, Jason Yosinski, and Jeff Clune (2014)

Page 4:

Intriguing Properties of Neural Networks

Szegedy, Christian, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus (2013)

Page 5:

Adversarial Examples

•  What is an adversarial example?
   – Adding an imperceptible (to humans) perturbation to make the network misclassify an image

•  Why do adversarial examples exist?
   – DNNs learn input-output mappings that are discontinuous to a significant extent

•  Interesting observation
   – The adversarial examples generated for network A can also make network B fail

Page 6:

Generate Adversarial Examples

•  Input image x ∈ [0, 1]^m
•  Classifier f: R^m -> {1, ..., k}
•  Target label l
•  Minimize ||r||_2 subject to f(x + r) = l and x + r ∈ [0, 1]^m
•  When f(x) ≠ l, x + r is the closest image to x classified as l by f
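The paper solves this with box-constrained L-BFGS. As a stand-in, the same idea can be sketched with projected gradient descent on a penalized objective, c·||r||² + loss(f(x+r), l), using a toy linear softmax classifier; the classifier W, sizes, step count, and the squared penalty are illustrative assumptions, not the paper's setup.

```python
import numpy as np

# Sketch of the targeted-attack optimization above: minimize
#   c*||r||^2 + cross_entropy(f(x+r), target)   s.t.  x + r in [0, 1]^m
# by projected gradient descent. W is a toy linear classifier, not a DNN.

def softmax(u):
    e = np.exp(u - u.max())
    return e / e.sum()

def targeted_adversary(W, x, target, c=1e-3, lr=0.1, steps=500):
    """Find a small r with argmax(W @ (x+r)) == target, keeping x+r in [0,1]."""
    r = np.zeros_like(x)
    onehot = np.zeros(W.shape[0])
    onehot[target] = 1.0
    for _ in range(steps):
        p = softmax(W @ (x + r))
        grad = W.T @ (p - onehot) + 2 * c * r     # d/dr [CE + c*||r||^2]
        r = np.clip(x + r - lr * grad, 0.0, 1.0) - x   # project onto the box
    return r

# 3 classes, 10-dimensional inputs; class scores are the first 3 coordinates.
W = np.zeros((3, 10))
W[0, 0] = W[1, 1] = W[2, 2] = 1.0
x = np.full(10, 0.1)
x[0] = 0.9                                  # clean input: class 0
r = targeted_adversary(W, x, target=1)
print(np.argmax(W @ x), np.argmax(W @ (x + r)))   # prediction flips to target
```

The projection step enforces the paper's constraint that x + r stays a valid image.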

Page 7:

Adversarial examples

Page 8:

Adversarial examples

Page 9:

Intriguing properties

•  Properties
   – Visually hard to distinguish the generated adversarial examples from the originals
   – Cross-model generalization (different hyper-parameters)
   – Cross training-set generalization (different training set)

•  Observation
   – Adversarial examples are universal
   – Back-feeding adversarial examples to training might improve generalization of the model (but remains hard)

Page 10:

Fooling DNN

•  Imperceptible adversarial examples that cause misclassification
•  Opposite direction: unrecognizable images that the DNN believes in
   Nguyen A, Yosinski J, Clune J. Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images (CVPR 2015)
•  Producing images that are completely unrecognizable to humans, but DNNs believe them to be recognizable objects with high confidence (99%)

Page 11:

Generating Images with Evolution (one class)

•  Organisms (images) will be randomly perturbed and selected based on a fitness function
•  Fitness function: the prediction value with which the DNN believes the image belongs to a class

Page 12:

Generating Images with Evolution (multi-class)

•  Algorithm: multi-dimensional archive of phenotypic elites, MAP-Elites (A. Cully et al. 2015)
•  Procedure:
   •  Randomly choose an organism, mutate it randomly
   •  Show the mutated organism to the DNN. If the prediction score is higher than the current highest score of ANY class, make the organism the champion of that class
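The loop above can be sketched in a few lines. Here a fixed random linear softmax classifier stands in for the DNN being fooled, and a Gaussian perturbation stands in for the mutation operator; the sizes, classifier, and mutation scheme are illustrative assumptions, not the paper's setup.

```python
import numpy as np

# Minimal MAP-Elites-style loop: keep one champion image per class and let a
# mutated copy of any champion take over every class whose score it improves.

rng = np.random.default_rng(0)
N_CLASSES, DIM = 5, 16
W = rng.normal(size=(N_CLASSES, DIM))            # toy "DNN" to fool

def confidences(img):
    u = W @ img
    e = np.exp(u - u.max())
    return e / e.sum()                           # softmax confidence per class

# One champion (elite) per class, initialised randomly.
champions = [rng.uniform(0, 1, DIM) for _ in range(N_CLASSES)]
best = [confidences(c)[k] for k, c in enumerate(champions)]
init_best = list(best)

for _ in range(2000):
    parent = champions[rng.integers(N_CLASSES)]  # pick a random elite
    child = np.clip(parent + rng.normal(0, 0.1, DIM), 0, 1)  # mutate it
    s = confidences(child)
    for k in range(N_CLASSES):   # becomes champion of ANY class it improves
        if s[k] > best[k]:
            best[k], champions[k] = s[k], child.copy()

print([round(b, 3) for b in best])    # per-class champion scores only go up
```

Because a child can replace the champion of a different class than its parent's, the archive explores all classes at once, which is the point of the multi-class variant.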

Page 13:

Encoding an Image

•  Direct encoding:
   – For MNIST: 28x28 pixels
   – For ImageNet: 256x256 pixels, each pixel has 3 channels (H, S, V)

•  Values are independently mutated
   – 10% chance of being chosen. The chance drops by half every 1000 generations
   – Mutate via the polynomial mutation operator
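The mutation scheme above can be written down directly: a per-value mutation chance that halves every 1000 generations, with chosen values perturbed by Deb's polynomial mutation operator. The distribution index eta below is a conventional default, an assumption rather than a value from the paper.

```python
import numpy as np

# Direct-encoding mutation: 10% per-value chance, halved every 1000
# generations; selected values get polynomial mutation (Deb & Goyal).

def mutation_rate(generation, base=0.10, halve_every=1000):
    return base * 0.5 ** (generation // halve_every)

def polynomial_mutation(x, eta=20.0, rng=None):
    """Polynomial mutation of values in [0, 1]; eta is the distribution index."""
    if rng is None:
        rng = np.random.default_rng()
    u = rng.uniform(size=x.shape)
    delta = np.where(u < 0.5,
                     (2 * u) ** (1 / (eta + 1)) - 1,
                     1 - (2 * (1 - u)) ** (1 / (eta + 1)))
    return np.clip(x + delta, 0.0, 1.0)

def mutate_image(img, generation, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    mask = rng.uniform(size=img.shape) < mutation_rate(generation)
    out = img.copy()
    out[mask] = polynomial_mutation(img[mask], rng=rng)
    return out

img = np.random.default_rng(1).uniform(size=(28, 28))   # MNIST-sized genome
print(mutation_rate(0), mutation_rate(1000), mutation_rate(2000))
```

The decaying rate means early generations explore broadly while later ones fine-tune an already high-scoring image.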

Page 14:

Directly Encoded Images

Page 15:

Encoding an Image

•  Indirect encoding:
   – Very likely to produce regular images with meaningful patterns
   – Both humans and DNNs can recognize them
   – Compositional pattern-producing network (CPPN), similar to an ANN (K. O. Stanley 2007): takes the position of a pixel (x, y) and outputs the colour value via the network
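A CPPN can be sketched as a small network applied independently at every pixel position. The fixed layer sizes and the fixed cycle of activation functions below are simplifying assumptions; the real encoding evolves the network's weights and topology, which is where the regular patterns come from.

```python
import numpy as np

# CPPN sketch: map each pixel's (x, y) position, plus its distance r from the
# centre, through composed heterogeneous activations (sine, Gaussian, tanh)
# to colour channels in (0, 1).

def render_cppn(size=64, hidden=8, channels=3, seed=0):
    rng = np.random.default_rng(seed)
    ys, xs = np.meshgrid(np.linspace(-1, 1, size), np.linspace(-1, 1, size),
                         indexing="ij")
    r = np.sqrt(xs**2 + ys**2)
    inp = np.stack([xs, ys, r], axis=-1).reshape(-1, 3)   # one row per pixel
    h = inp @ rng.normal(size=(3, hidden))
    acts = [np.sin, lambda v: np.exp(-v**2), np.tanh]
    for i in range(hidden):                 # different activation per unit
        h[:, i] = acts[i % len(acts)](h[:, i])
    out = 1 / (1 + np.exp(-(h @ rng.normal(size=(hidden, channels)))))
    return out.reshape(size, size, channels)    # e.g. H, S, V values

img = render_cppn()
print(img.shape)
```

Because nearby pixels feed similar (x, y, r) values through the same smooth functions, the output is spatially regular, unlike the noise produced by direct encoding.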

Page 16:

CPPN-encoded Images

Page 17:

MNIST - Irregular Images

LeNet: 99.99% median confidence, 200 generations

Page 18:

MNIST - Regular Images

LeNet: 99.99% median confidence, 200 generations

Page 19:

ImageNet - Irregular Images

AlexNet: 21.59% median confidence. 45 classes: >99% confidence

Page 20:

ImageNet - Irregular Images

Page 21:

ImageNet - Regular Images

AlexNet: 88.11% median confidence

Dogs and cats

Page 22:

ImageNet - Regular Images

Page 23:

Fooling Closely Related Classes

Page 24:

Repetition of Patterns

Page 25:

Related research

Page 26:

Explaining and Harnessing Adversarial Examples

Goodfellow, Ian J., Jonathon Shlens, and Christian Szegedy (ICLR 2015)

Page 27:

Why Do Adversarial Examples Exist?

•  Past explanations
   – Extreme nonlinearity of DNNs
   – Insufficient model averaging
   – Insufficient regularization

•  New explanation
   – Linear behavior in high-dimensional spaces is sufficient to cause adversarial examples

Goodfellow, Ian J., Jonathon Shlens, and Christian Szegedy. "Explaining and Harnessing Adversarial Examples." (ICLR 2015)

Page 28:

Linear Explanation of Adversarial Examples

Adversarial example: x~ = x + η (perturbation η)

Pixel value precision: typically ε = 1/255; the perturbation is meaningless if ||η||_∞ < ε

Activation of adversarial examples: w·x~ = w·x + w·η

η = ε sign(w) maximizes the increase of activation subject to ||η||_∞ ≤ ε

Assume the average magnitude of an element of the weight vector is m and the dimension is n; then the increase of activation is εmn

A simple linear model can have adversarial examples as long as its input has sufficient dimensionality.
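The εmn claim is easy to check numerically: the aligned perturbation η = ε sign(w) moves the activation by exactly ε·||w||_1 = εmn, while random ±ε noise of the same infinity-norm only moves it by roughly εm√n. The sizes below are arbitrary.

```python
import numpy as np

# Numeric check: aligned sign-perturbation grows the activation linearly in
# the dimension n, random same-sized noise only like sqrt(n).

rng = np.random.default_rng(0)
n, eps = 10_000, 1 / 255
w = rng.normal(size=n)
m = np.abs(w).mean()                      # average weight magnitude

aligned = w @ (eps * np.sign(w))          # = eps * ||w||_1 = eps * m * n
noise = w @ (eps * rng.choice([-1.0, 1.0], size=n))

print(aligned, eps * m * n)               # these two agree
print(abs(noise))                         # far smaller than `aligned`
```

This is why many infinitesimal, imperceptible per-pixel changes can add up to a large change in a linear model's output.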

Page 29:

Faster way to generate adversarial examples

Cost function: J(θ, x, y)

Perturbation: η = ε sign(∇_x J(θ, x, y)) (the "fast gradient sign method")
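For logistic regression the input gradient has a closed form, so the fast gradient sign method fits in a few lines. The weights and input are random toy values; ε = 0.25 matches the value the paper uses on MNIST.

```python
import numpy as np

# Fast gradient sign method on logistic regression:
#   eta = eps * sign(grad_x J), with the gradient written out by hand.

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def loss(w, b, x, y):
    """Softplus loss log(1 + exp(-y * logit)) with labels y in {-1, +1}."""
    return np.log1p(np.exp(-y * (w @ x + b)))

def fgsm(w, b, x, y, eps):
    grad_x = -y * (1 - sigmoid(y * (w @ x + b))) * w   # d loss / d x
    return x + eps * np.sign(grad_x)                   # eta = eps*sign(grad)

rng = np.random.default_rng(0)
w, b = rng.normal(size=100), 0.0
x = rng.normal(size=100)
y = 1.0 if w @ x + b > 0 else -1.0     # label x with the model's own class

x_adv = fgsm(w, b, x, y, eps=0.25)
print(loss(w, b, x, y), loss(w, b, x_adv, y))   # adversarial loss is larger
```

One gradient evaluation per example is what makes this "faster" than the box-constrained L-BFGS search of the earlier paper.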

Page 30:

Adversarial Training of Linear Models

Simple case: logistic regression, P(y = 1 | x) = σ(w·x + b)

Train by gradient descent on E[ζ(−y(w·x + b))], with labels y ∈ {−1, 1} and softplus ζ(z) = log(1 + e^z)

The adversarial training version is the regularized cost function E[ζ(y(ε||w||_1 − w·x − b))]

On MNIST: error rate drops from 0.94% to 0.84%

For adversarial examples: error rate drops from 89.4% to 17.9%
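Both objectives can be trained with the same hand-written gradient step; setting ε = 0 recovers ordinary logistic regression. The two-Gaussian dataset, learning rate, step count, and ε below are illustrative choices, not the paper's MNIST setup.

```python
import numpy as np

# Gradient descent on the adversarially regularized logistic objective
#   E[softplus(y * (eps*||w||_1 - w.x - b))]   (eps = 0: plain objective).

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def train(X, y, eps=0.0, lr=0.1, steps=300):
    """Labels y in {-1, +1}."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        margin = y * (eps * np.abs(w).sum() - X @ w - b)
        s = sigmoid(margin)                    # softplus'(margin)
        gw = ((s * y)[:, None] * (eps * np.sign(w) - X)).mean(axis=0)
        gb = -(s * y).mean()
        w -= lr * gw
        b -= lr * gb
    return w, b

rng = np.random.default_rng(0)
n = 200
X = np.vstack([rng.normal(+2, 1, size=(n, 5)), rng.normal(-2, 1, size=(n, 5))])
y = np.concatenate([np.ones(n), -np.ones(n)])

w0, b0 = train(X, y, eps=0.0)      # plain logistic regression
w1, b1 = train(X, y, eps=0.5)      # adversarial-objective version
acc = lambda w, b: np.mean(np.sign(X @ w + b) == y)
print(acc(w0, b0), acc(w1, b1))
```

The ε||w||_1 term sits inside the softplus, so unlike plain L1 weight decay it stops penalizing once the model classifies confidently despite the worst-case ε-perturbation.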

Page 31:

Why Adversarial Examples Generalize

•  An adversarial example generated for one model is often misclassified by other models
•  When different models misclassify an adversarial example, they often agree with each other
•  As long as the perturbation has a positive dot product with the gradient of the cost function, adversarial examples work
•  Hypothesis: trained neural networks all resemble the linear classifier learned on the same training set
   – Such stability of underlying classification weights causes the stability of adversarial examples

Page 32:

Generative Adversarial Networks

Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio (2015)

Page 33:

Generative Adversarial Nets

•  Two types of models:
   – Generative model: learns the joint probability distribution of the data, p(x, y)
   – Discriminative model: learns the conditional probability distribution of the data, p(y | x)
   – Much easier to train a discriminative model than a generative model

Page 34:

Main idea

•  Adversarial process:
   – Simultaneously train two models:
      •  A generative model G captures the data distribution
      •  A discriminative model D tells whether a sample comes from the training data or not
   – Optimal solution:
      •  G recovers the data distribution
      •  D is 1/2 everywhere

Page 35:

Formulation

•  Purpose -> learn the data distribution p_g over data x
   – Prior input noise p_z(z)
   – Mapping to data space G(z; Θ_g), a differentiable function represented by a multilayer perceptron with parameters Θ_g
   – Multilayer perceptron discriminative function D(x; Θ_d) represents the probability that x came from the data rather than p_g

Page 36:

Two-player minimax game
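Written out, the value function of the two-player minimax game from the paper is:

```latex
\min_G \max_D V(D, G) =
\mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\left[\log D(x)\right]
+ \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]
```

D is trained to maximize V (classify real vs. generated correctly); G is trained to minimize it, i.e. to make D(G(z)) large.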

Page 37:

Algorithm
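The algorithm alternates k discriminator updates with one generator update, both by minibatch stochastic gradient steps. A deliberately tiny runnable version: real data is N(4, 1), the generator is affine, the discriminator is a one-feature logistic unit, and all gradients are derived by hand. Every modeling choice here (sizes, learning rate, k, batch size) is an illustrative assumption; the paper uses multilayer perceptrons. The generator ascends log D(G(z)), the non-saturating alternative the paper suggests to descending log(1 − D(G(z))).

```python
import numpy as np

# Toy GAN training loop: k discriminator steps, then one generator step.
# G(z) = mu + sigma_g * z,  D(x) = sigmoid(a*x + c).

rng = np.random.default_rng(0)
sig = lambda u: 1 / (1 + np.exp(-u))

mu, sigma_g = 0.0, 1.0           # generator parameters
a, c = 0.0, 0.0                  # discriminator parameters
lr, k, batch = 0.05, 1, 64

for step in range(3000):
    for _ in range(k):           # k discriminator updates per generator update
        x = rng.normal(4.0, 1.0, batch)              # minibatch of real data
        g = mu + sigma_g * rng.normal(size=batch)    # minibatch of fakes
        px, pg = sig(a * x + c), sig(a * g + c)
        # gradient ascent on  log D(x) + log(1 - D(G(z)))
        a += lr * ((1 - px) * x - pg * g).mean()
        c += lr * ((1 - px) - pg).mean()
    z = rng.normal(size=batch)
    g = mu + sigma_g * z
    up = (1 - sig(a * g + c)) * a        # d/dG of log D(G(z))
    mu += lr * up.mean()                 # chain rule: dG/dmu = 1
    sigma_g += lr * (up * z).mean()      # chain rule: dG/dsigma_g = z

print(round(mu, 2), round(sigma_g, 2))
```

Under these settings the generator mean should drift from 0 toward the data mean while D's output on fakes moves back toward 1/2, matching the optimum described on the "Main idea" slide.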

Page 38:

Experiments

Page 39:

Today's papers

•  Intriguing Properties of Neural Networks
   Szegedy, Christian, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus (2013)

•  Explaining and Harnessing Adversarial Examples
   Goodfellow, Ian J., Jonathon Shlens, and Christian Szegedy (2015)

•  Generative Adversarial Nets
   Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio (2015)

•  Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images
   Nguyen, Anh, Jason Yosinski, and Jeff Clune (2014)

Page 40:

Adversarial Networks

Tran Quoc Hoan, Paper alert@2015-11-30