Convolutional neural networks III. October 2nd, 2019. Yong Jae Lee, UC Davis. Many slides from Rob Fergus, Svetlana Lazebnik, Jia-Bin Huang, Derek Hoiem, Adriana Kovashka, Andrej Karpathy.

Page 1

Convolutional neural networks III

October 2nd, 2019

Yong Jae Lee, UC Davis

Many slides from Rob Fergus, Svetlana Lazebnik, Jia-Bin Huang, Derek Hoiem, Adriana Kovashka, Andrej Karpathy

Page 2

Announcements
• Sign up for paper presentations
• First paper review due Thurs 11:59 PM

Page 3

Gradient descent
• We’ll update weights iteratively
• Move in direction opposite to gradient:

[Update equation not recovered; its surviving annotations are "L", "Learning rate", and "Time".]

[Figure from Andrej Karpathy: loss function landscape over weights W_1 and W_2, showing the original W and the negative gradient direction.]
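The iterative update can be sketched in a few lines. This is a minimal illustration, assuming the standard rule W <- W - (learning rate) x (gradient of the loss); the quadratic loss over two weights, the starting point, and the learning rate are all made up here to stand in for the landscape in the figure.

```python
# Gradient descent on a toy loss L(w1, w2) = (w1 - 3)^2 + 2 * (w2 + 1)^2.
# The loss, starting point, and learning rate are illustrative assumptions.

def loss(w):
    return (w[0] - 3.0) ** 2 + 2.0 * (w[1] + 1.0) ** 2

def grad(w):
    # Analytic gradient of the toy loss above.
    return [2.0 * (w[0] - 3.0), 4.0 * (w[1] + 1.0)]

def gradient_descent(w, learning_rate=0.1, steps=100):
    for _ in range(steps):
        g = grad(w)
        # Move in the direction opposite to the gradient.
        w = [wi - learning_rate * gi for wi, gi in zip(w, g)]
    return w

w_final = gradient_descent([0.0, 0.0])
```

With these settings the weights approach the minimum of the toy loss at (3, -1); a much larger learning rate would overshoot instead.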

Page 4

Gradient descent in multi-layer nets
• We’ll update weights
• Move in direction opposite to gradient: [equation not recovered]
• How to update the weights at all layers?
• Answer: backpropagation of loss from higher layers to lower layers

Page 5

Backpropagation: Graphic example

• First calculate error of output units and use this to change the top layer of weights.

[Figure adapted from Ray Mooney: network with input, hidden, and output layers, unit indices k, j, i, and weight layers w(1) and w(2). Annotation: "Update weights into j".]

Page 6

Backpropagation: Graphic example

• Next calculate error for hidden units based on errors on the output units it feeds into.

[Figure adapted from Ray Mooney: the same input, hidden, and output network with unit indices k, j, i.]

Page 7

Backpropagation: Graphic example

• Finally update bottom layer of weights based on errors calculated for hidden units.

[Figure adapted from Ray Mooney: the same input, hidden, and output network with unit indices k, j, i. Annotation: "Update weights into i".]
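The three steps of the graphic example (output errors, then hidden errors, then weight updates) can be written out for a tiny network. This is a sketch under assumptions the slides do not state: sigmoid units, a squared-error loss, no bias terms, and hypothetical names (`backprop_step`, `delta_out`, `delta_hid`) and starting weights.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def forward(x, w1, w2):
    # w1[j][i]: weight from input i into hidden unit j.
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    # w2[o][j]: weight from hidden unit j into output unit o.
    output = [sigmoid(sum(w * hj for w, hj in zip(row, hidden))) for row in w2]
    return hidden, output

def squared_error(x, target, w1, w2):
    _, output = forward(x, w1, w2)
    return 0.5 * sum((t - o) ** 2 for t, o in zip(target, output))

def backprop_step(x, target, w1, w2, lr=0.5):
    hidden, output = forward(x, w1, w2)
    # Step 1: error of each output unit (assumes squared error, sigmoid units).
    delta_out = [(t - o) * o * (1.0 - o) for t, o in zip(target, output)]
    # Step 2: error of each hidden unit, from the output units it feeds into.
    delta_hid = [
        h * (1.0 - h) * sum(delta_out[o] * w2[o][j] for o in range(len(w2)))
        for j, h in enumerate(hidden)
    ]
    # Step 3: update the top layer of weights, then the bottom layer.
    new_w2 = [[w + lr * delta_out[o] * hidden[j] for j, w in enumerate(row)]
              for o, row in enumerate(w2)]
    new_w1 = [[w + lr * delta_hid[j] * x[i] for i, w in enumerate(row)]
              for j, row in enumerate(w1)]
    return new_w1, new_w2

# Made-up starting weights: 2 inputs, 2 hidden units, 1 output unit.
w1 = [[0.1, -0.2], [0.3, 0.4]]
w2 = [[0.2, -0.1]]
x, target = [1.0, 0.0], [1.0]
before = squared_error(x, target, w1, w2)
for _ in range(50):
    w1, w2 = backprop_step(x, target, w1, w2)
after = squared_error(x, target, w1, w2)
```

Note that the hidden errors are computed from the old top-layer weights before either layer is updated.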

Page 8

Backpropagation
• Easier if we use computational graphs, especially when we have complicated functions typical in deep neural networks

Figure from Karpathy

Pages 9 to 20

Backpropagation: a simple example (Andrej Karpathy; from Fei-Fei Li & Andrej Karpathy & Justin Johnson, Lecture 4, 13 Jan 2016)

e.g. x = -2, y = 5, z = -4

Want: [target gradients not recovered]

Chain rule: each gradient is the upstream gradient times the local gradient.

[Figure: a computational graph stepped through backward over twelve slides; the function and the intermediate values were not recovered.]
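The function being differentiated is not stated in the text here. Assuming it is f(x, y, z) = (x + y) * z, the example used in the cited Lecture 4, the forward and backward passes for the given inputs can be traced as follows; the intermediate name `q` is likewise an assumption.

```python
x, y, z = -2.0, 5.0, -4.0

# Forward pass (assumed function: f = (x + y) * z).
q = x + y              # q = 3
f = q * z              # f = -12

# Backward pass, from the output toward the inputs.
df_df = 1.0            # gradient of f with respect to itself
df_dq = z * df_df      # multiply gate: local gradient w.r.t. q is z
df_dz = q * df_df      # multiply gate: local gradient w.r.t. z is q
df_dx = 1.0 * df_dq    # add gate: local gradient is 1, upstream is df_dq
df_dy = 1.0 * df_dq    # add gate: local gradient is 1, upstream is df_dq
```

Each backward line is one application of the chain rule: upstream gradient times local gradient.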

Pages 21 to 26

[Figure sequence from Andrej Karpathy (Fei-Fei Li & Andrej Karpathy & Justin Johnson, Lecture 4, 13 Jan 2016): a single node f in a computational graph. Activations flow forward through f; on the backward pass, gradients arriving at f are combined with its "local gradient".]
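One way to read the single-node picture is as a small object that caches its activations on the forward pass and applies its local gradient on the backward pass. The class names `MultiplyGate` and `AddGate` and the forward/backward interface below are hypothetical, chosen only to illustrate the idea.

```python
class MultiplyGate:
    """Hypothetical node f(a, b) = a * b."""

    def forward(self, a, b):
        # Cache the input activations; the local gradients depend on them.
        self.a, self.b = a, b
        return a * b

    def backward(self, upstream):
        # Local gradients: d(ab)/da = b and d(ab)/db = a.
        return upstream * self.b, upstream * self.a


class AddGate:
    """Hypothetical node f(a, b) = a + b."""

    def forward(self, a, b):
        return a + b

    def backward(self, upstream):
        # Local gradients are both 1, so the upstream gradient passes through.
        return upstream, upstream


gate = MultiplyGate()
out = gate.forward(3.0, -4.0)
grad_a, grad_b = gate.backward(2.0)   # upstream gradient of 2.0 is made up
```

Because every node needs only its own inputs and the gradient arriving from above, nodes like these can be chained into arbitrarily deep graphs.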

Page 27

Backpropagation: another example

[Figure from Andrej Karpathy not recovered.]

Page 28

Convolutional Neural Networks (CNN)
• Neural network with specialized connectivity structure
• Stack multiple stages of feature extractors
• Higher stages compute more global, more invariant, more abstract features
• Classification layer at the end

Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition, Proceedings of the IEEE 86(11): 2278–2324, 1998.

Adapted from Rob Fergus

Page 29

Convolutional Neural Networks (CNN)
• Feed-forward feature extraction:
  1. Convolve input with learned filters
  2. Apply non-linearity
  3. Spatial pooling (downsample)
• Supervised training of convolutional filters by back-propagating classification error

Adapted from Lana Lazebnik

[Figure: Input Image -> Convolution (Learned) -> Non-linearity -> Spatial pooling -> Output (class probs)]
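The three feed-forward stages can be sketched on a tiny single-channel image. This is an illustration only: the filter is set by hand rather than learned, the non-linearity is assumed to be ReLU, the pooling is assumed to be 2x2 max pooling, and the function names are hypothetical.

```python
def convolve2d(image, kernel):
    """Slide a 2D kernel over a 2D image (stride 1, no padding), taking dot products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def relu(fmap):
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Non-overlapping size x size max pooling; rows/columns that do not fit are dropped."""
    return [[max(fmap[r + i][c + j] for i in range(size) for j in range(size))
             for c in range(0, len(fmap[0]) - size + 1, size)]
            for r in range(0, len(fmap) - size + 1, size)]

# A 5x5 image with a vertical bright stripe.
image = [[0, 0, 1, 1, 0],
         [0, 0, 1, 1, 0],
         [0, 0, 1, 1, 0],
         [0, 0, 1, 1, 0],
         [0, 0, 1, 1, 0]]
kernel = [[-1.0, 1.0]]   # hand-set 1x2 filter that responds to dark-to-bright edges
features = max_pool(relu(convolve2d(image, kernel)))
```

The filter fires on the left edge of the stripe, ReLU discards the negative response on the right edge, and pooling keeps the strongest response in each 2x2 block.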

Page 30

Convolutions: More detail

32x32x3 image: width 32, height 32, depth 3

Andrej Karpathy

Page 31

Convolutions: More detail

32x32x3 image, 5x5x3 filter

Convolve the filter with the image, i.e. “slide over the image spatially, computing dot products”

Andrej Karpathy

Page 32

Convolutions: More detail

Convolution Layer: 32x32x3 image, 5x5x3 filter

1 number: the result of taking a dot product between the filter and a small 5x5x3 chunk of the image (i.e. 5*5*3 = 75-dimensional dot product + bias)

Andrej Karpathy
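The "1 number" at a single filter position can be computed directly. In this sketch the chunk and filter values are random placeholders, the bias value is made up, and the [row][col][channel] storage order is an assumption.

```python
import random

rng = random.Random(0)

# A 5x5x3 image chunk and a 5x5x3 filter, stored as [row][col][channel].
chunk = [[[rng.random() for _ in range(3)] for _ in range(5)] for _ in range(5)]
filt = [[[rng.random() for _ in range(3)] for _ in range(5)] for _ in range(5)]
bias = 0.1   # illustrative value

# One product per filter weight: 5 * 5 * 3 = 75 terms.
terms = [chunk[r][c][d] * filt[r][c][d]
         for r in range(5) for c in range(5) for d in range(3)]
activation = sum(terms) + bias
```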

Page 33

Convolutions: More detail

Convolution Layer: 32x32x3 image, 5x5x3 filter

Convolve (slide) over all spatial locations to get a 28x28x1 activation map.

Andrej Karpathy
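The 28x28 size follows from counting the positions where a 5-wide filter fits inside a 32-wide image: 32 - 5 + 1 = 28. A sketch, assuming stride 1 and no padding; the general formula in `conv_output_size` is the usual one rather than something stated on the slide, and the all-ones image and filter are placeholders.

```python
def conv_output_size(input_size, filter_size, stride=1, pad=0):
    # Number of positions at which the filter fits along one dimension.
    return (input_size + 2 * pad - filter_size) // stride + 1

def activation_map(image, filt, bias=0.0):
    """Slide one filter over an image stored as [row][col][channel] (stride 1, no padding)."""
    fh, fw, depth = len(filt), len(filt[0]), len(filt[0][0])
    out_h = conv_output_size(len(image), fh)
    out_w = conv_output_size(len(image[0]), fw)
    return [[bias + sum(image[r + i][c + j][d] * filt[i][j][d]
                        for i in range(fh) for j in range(fw) for d in range(depth))
             for c in range(out_w)]
            for r in range(out_h)]

image = [[[1.0] * 3 for _ in range(32)] for _ in range(32)]   # 32x32x3, all ones
filt = [[[1.0] * 3 for _ in range(5)] for _ in range(5)]      # 5x5x3, all ones
amap = activation_map(image, filt)
```

With all-ones inputs every entry of the map equals 75, one per term of the dot product.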

Page 34

Convolutions: More detail

Convolution Layer: 32x32x3 image, 5x5x3 filter

Consider a second, green filter: convolving (sliding) it over all spatial locations gives a second 28x28x1 activation map.

Andrej Karpathy

Page 35

Convolutions: More detail

Convolution Layer

For example, if we had 6 5x5 filters, we’ll get 6 separate activation maps:

[Figure: 32x32x3 input and a stack of 6 activation maps, each 28x28.]

We stack these up to get a “new image” of size 28x28x6!

Andrej Karpathy

Page 36

Convolutions: More detail

Preview: ConvNet is a sequence of Convolution Layers, interspersed with activation functions

32x32x3 -> CONV, ReLU (e.g. 6 5x5x3 filters) -> 28x28x6

Andrej Karpathy

Page 37

Convolutions: More detail

Preview: ConvNet is a sequence of Convolutional Layers, interspersed with activation functions

32x32x3 -> CONV, ReLU (e.g. 6 5x5x3 filters) -> 28x28x6 -> CONV, ReLU (e.g. 10 5x5x6 filters) -> 24x24x10 -> CONV, ReLU -> ….

Andrej Karpathy
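The shapes in this chain can be checked mechanically, and the same bookkeeping gives the number of learnable parameters per layer. A sketch assuming stride 1, no padding, and one bias per filter; `conv_layer` is a hypothetical helper.

```python
def conv_layer(in_shape, num_filters, filter_size):
    """Return (output shape, parameter count) for a stride-1, unpadded conv layer."""
    h, w, depth = in_shape
    out_shape = (h - filter_size + 1, w - filter_size + 1, num_filters)
    # Each filter spans the full input depth and carries one bias (assumed).
    params = num_filters * (filter_size * filter_size * depth + 1)
    return out_shape, params

shape0 = (32, 32, 3)
shape1, params1 = conv_layer(shape0, num_filters=6, filter_size=5)
shape2, params2 = conv_layer(shape1, num_filters=10, filter_size=5)
```

The depth of each filter must match the depth of its input, which is why the second layer uses 5x5x6 filters.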

Page 38

Convolutions: More detail

preview: [figure not recovered]

Andrej Karpathy

Page 39

A Common Architecture: AlexNet

Figure from http://www.mdpi.com/2072-4292/7/11/14680/htm

Page 40

Case Study: VGGNet

[Simonyan and Zisserman, 2014]

Only 3x3 CONV stride 1, pad 1 and 2x2 MAX POOL stride 2

Best model: 11.2% top 5 error in ILSVRC 2013 -> 7.3% top 5 error

Andrej Karpathy
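With these settings a 3x3 convolution leaves the spatial size unchanged and a 2x2 max pool halves it. A sketch of the arithmetic, assuming a 224x224 input and five conv-plus-pool stages; neither number appears on the slide, and `out_size` is a hypothetical helper.

```python
def out_size(n, f, stride, pad):
    # Spatial output size for input size n, filter size f, given stride and padding.
    return (n + 2 * pad - f) // stride + 1

size = 224                 # assumed input resolution
sizes = [size]
for _ in range(5):         # assumed number of conv + pool stages
    size = out_size(size, f=3, stride=1, pad=1)   # 3x3 CONV, stride 1, pad 1
    size = out_size(size, f=2, stride=2, pad=0)   # 2x2 MAX POOL, stride 2
    sizes.append(size)
```

Because the convolutions preserve size, all downsampling is left to the pooling layers.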

Page 41

Case Study: GoogLeNet

[Szegedy et al., 2014]

Inception module

ILSVRC 2014 winner (6.7% top 5 error)

Andrej Karpathy

Page 42

Case Study: ResNet

[He et al., 2015]

ILSVRC 2015 winner (3.6% top 5 error)

Slide from Kaiming He’s presentation https://www.youtube.com/watch?v=1PGLj-uKT1w

Andrej Karpathy

Page 43

Case Study: ResNet

(slide from Kaiming He’s presentation)

Andrej Karpathy

Page 44

Case Study: ResNet

[He et al., 2015]

ILSVRC 2015 winner (3.6% top 5 error)

• 2-3 weeks of training on 8 GPU machine
• At runtime: faster than a VGGNet! (even though it has 8x more layers)

(slide from Kaiming He’s presentation)

Andrej Karpathy

Page 45

Practical matters

Page 46

Comments on training algorithm
• Not guaranteed to converge to zero training error, may converge to local optima or oscillate indefinitely.
• However, in practice, does converge to low error for many large networks on real data.
• Thousands of epochs (epoch = network sees all training data once) may be required, hours or days to train.
• To avoid local-minima problems, run several trials starting with different random weights (random restarts), and take results of trial with lowest training set error.
• May be hard to set learning rate and to select number of hidden units and layers.
• Neural networks had fallen out of fashion in 90s, early 2000s; back with a new name and significantly improved performance (deep networks trained with dropout and lots of data).

Ray Mooney, Carlos Guestrin, Dhruv Batra

Page 47

Over-training prevention
• Running too many epochs can result in over-fitting.
• Keep a hold-out validation set and test accuracy on it after every epoch. Stop training when additional epochs actually increase validation error.

[Figure adapted from Ray Mooney: error versus number of training epochs, with one curve for error on training data and one for error on test data.]
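The stopping rule can be sketched as a scan over per-epoch validation errors. The `patience` parameter is an added assumption (the rule as stated corresponds to `patience=1`), and the error values are made up for illustration.

```python
def early_stopping(validation_errors, patience=1):
    """Return the index of the epoch whose weights should be kept.

    Training stops once the validation error has failed to improve for
    `patience` consecutive epochs (patience is an assumption, not from the slide).
    """
    best_epoch, best_error, bad_epochs = 0, float("inf"), 0
    for epoch, err in enumerate(validation_errors):
        if err < best_error:
            best_epoch, best_error, bad_epochs = epoch, err, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break
    return best_epoch

# Illustrative validation errors, one per epoch.
val_errors = [0.40, 0.31, 0.25, 0.22, 0.24, 0.27, 0.21]
stop_at = early_stopping(val_errors)
patient_stop = early_stopping(val_errors, patience=3)
```

A larger patience tolerates temporary bumps in validation error, at the cost of extra epochs.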

Page 48

Training: Best practices
• Use mini-batch
• Use regularization
• Use cross-validation for your parameters
• Use ReLU or leaky ReLU, don’t use sigmoid
• Center (subtract mean from) your data
• Learning rate: too high? too low?
• Use BatchNorm
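Two of these practices, centering the data and iterating in mini-batches, are easy to show in isolation. A minimal sketch with made-up data and hypothetical helper names; per-feature centering is one common choice, not something the slide specifies.

```python
import random

def center(data):
    """Subtract the per-feature mean, computed over the training set."""
    n, dims = len(data), len(data[0])
    mean = [sum(row[d] for row in data) / n for d in range(dims)]
    return [[row[d] - mean[d] for d in range(dims)] for row in data], mean

def mini_batches(data, batch_size, rng):
    """Yield shuffled mini-batches that together cover the data once (one epoch)."""
    order = list(range(len(data)))
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        yield [data[i] for i in order[start:start + batch_size]]

data = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]
centered, mean = center(data)
batches = list(mini_batches(centered, batch_size=2, rng=random.Random(0)))
```

The same training-set mean would be subtracted from validation and test data as well.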

Page 49

Data Augmentation (Jittering)
• Create virtual training samples
  – Horizontal flip
  – Random crop
  – Color casting
  – Geometric distortion

Jia-bin Huang, Image: https://github.com/aleju/imgaug
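The first two jittering operations can be sketched on an image stored as a list of rows. The function names, the 50% flip probability, and the tiny example image are illustrative assumptions; this does not use the imgaug library referenced above.

```python
import random

def horizontal_flip(image):
    # Reverse each row, mirroring the image left to right.
    return [row[::-1] for row in image]

def random_crop(image, crop_h, crop_w, rng):
    # Pick a random top-left corner such that the crop fits inside the image.
    top = rng.randint(0, len(image) - crop_h)
    left = rng.randint(0, len(image[0]) - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]

def augment(image, crop_h, crop_w, rng):
    # Flip with probability 0.5 (assumed), then take a random crop.
    if rng.random() < 0.5:
        image = horizontal_flip(image)
    return random_crop(image, crop_h, crop_w, rng)

image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12]]
flipped = horizontal_flip(image)
sample = augment(image, crop_h=2, crop_w=3, rng=random.Random(0))
```

Each call to `augment` with a fresh random state yields a different virtual training sample from the same source image.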

Page 50

Regularization: Dropout

Dropout: A simple way to prevent neural networks from overfitting [Srivastava JMLR 2014]

• Randomly turn off some neurons
• Allows individual neurons to independently be responsible for performance

Adapted from Jia-bin Huang
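"Turning off" neurons amounts to zeroing a random subset of activations during training. The sketch below uses the "inverted" variant, which rescales the surviving activations at training time so nothing changes at test time; that variant, the function signature, and the 0.5 drop probability are assumptions rather than details from the slide.

```python
import random

def dropout(activations, drop_prob, rng, training=True):
    if not training or drop_prob == 0.0:
        # At test time every neuron is kept and nothing is rescaled.
        return list(activations)
    keep_prob = 1.0 - drop_prob
    # Zero each unit with probability drop_prob; rescale the survivors so the
    # expected value of each activation is unchanged ("inverted" dropout).
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]

rng = random.Random(0)
acts = [1.0] * 1000
dropped = dropout(acts, drop_prob=0.5, rng=rng)
kept = sum(1 for a in dropped if a != 0.0)
```

A different random mask is drawn on every forward pass, so no neuron can rely on a particular neighbor being present.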

Page 51

Transfer Learning

“You need a lot of data if you want to train/use CNNs”

Andrej Karpathy

Page 52

Transfer Learning with CNNs

• The more weights you need to learn, the more data you need
• That’s why with a deeper network, you need more data for training than for a shallower network
• One possible solution:

[Figure: one group of layers is labeled "Set these to the already learned weights from another network"; another group is labeled "Learn these on your own task".]

Page 53

Transfer Learning with CNNs

Source: classification on ImageNet. Target: some other task/data.

1. Train on ImageNet
2. Small dataset: freeze most of the network ("Freeze these") and train only the remaining part ("Train this")
3. Medium dataset: finetuning. More data = retrain more of the network (or all of it); fewer layers are frozen ("Freeze these") and more are trained ("Train this")

Adapted from Andrej Karpathy
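Freezing can be sketched as an update step that skips every layer not marked trainable. The layer names, weights, and gradients below are hypothetical stand-ins for a network pretrained on ImageNet, and `sgd_step` is an illustrative helper, not a library call.

```python
def sgd_step(params, grads, trainable, lr=0.01):
    """Update only the layers named in `trainable`; frozen layers keep their weights."""
    return {name: ([w - lr * g for w, g in zip(weights, grads[name])]
                   if name in trainable else list(weights))
            for name, weights in params.items()}

# Hypothetical pretrained weights and gradients from the new task.
pretrained = {"conv1": [0.5, -0.2], "conv2": [0.1, 0.3], "fc": [0.0, 0.0]}
grads = {"conv1": [1.0, 1.0], "conv2": [1.0, 1.0], "fc": [1.0, -1.0]}

# Small dataset: train only the last layer.
small_data = sgd_step(pretrained, grads, trainable={"fc"})
# Medium dataset: finetune more of the network.
medium_data = sgd_step(pretrained, grads, trainable={"conv2", "fc"})
```

Growing the `trainable` set as more data becomes available corresponds to retraining more of the network.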

Page 54

Transfer Learning with CNNs

[Figure: network layers range from more generic to more specific.]

                      very similar dataset                  very different dataset
very little data      Use linear classifier on top layer    You’re in trouble… Try linear classifier from different stages
quite a lot of data   Finetune a few layers                 Finetune a larger number of layers

Andrej Karpathy

Page 55

Summary
• We use deep neural networks because of their strong performance in practice
• Convolutional neural networks (CNN)
  • Convolution, nonlinearity, max pooling
• Training deep neural nets
  • We need an objective function that measures and guides us towards good performance
  • We need a way to minimize the loss function: stochastic gradient descent
  • We need backpropagation to propagate error through all layers and change their weights
• Practices for preventing overfitting
  • Dropout; BatchNorm; data augmentation; transfer learning

Page 56

Questions?

See you Friday!