cs6501: deep learning for visual recognition neural...

31
CS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer Perceptron

Upload: others

Post on 05-Oct-2020

11 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

CS6501: Deep Learning for Visual RecognitionNeural Networks / Multi-layer

Perceptron

Page 2: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

Today’s Class

Neural Networks• The Perceptron Model• The Multi-layer Perceptron (MLP)• Forward-pass in an MLP (Inference)• Backward-pass in an MLP (Backpropagation)

Page 3: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

Perceptron ModelFrank Rosenblatt (1957) - Cornell University

More: https://en.wikipedia.org/wiki/Perceptron

! " = $1, if )*+,

-.*"* + 0 > 0

0, otherwise

":

";

"<

"=

)

.:.;.<.=

Activationfunction

Page 4: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

Perceptron ModelFrank Rosenblatt (1957) - Cornell University

More: https://en.wikipedia.org/wiki/Perceptron

! " = $1, if )*+,

-.*"* + 0 > 0

0, otherwise

":

";

"<

"=

)

.:.;.<.=

!?

Page 5: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

Perceptron ModelFrank Rosenblatt (1957) - Cornell University

More: https://en.wikipedia.org/wiki/Perceptron

! " = $1, if )*+,

-.*"* + 0 > 0

0, otherwise

":

";

"<

"=

)

.:.;.<.=

Activationfunction

Page 6: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

Activation Functions

ReLU(x) = max(0, x)Tanh(x)

Sigmoid(x)Step(x)

Page 7: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

Two-layer Multi-layer Perceptron (MLP)

!"!#!$!%

&

'"'#'$'%

&

&

&

&

()"

”hidden" layer

)"

Loss / Criterion

Page 8: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

8

Linear Softmax[1 0 0]!" =$" = [$"& $"' $"( $")] +!" = [,- ,. ,/]

0- = 1-&$"& + 1-'$"' + 1-($"( + 1-)$") + 3-0. = 1.&$"& + 1.'$"' + 1.($"( + 1.)$") + 3.0/ = 1/&$"& + 1/'$"' + 1/($"( + 1/)$") + 3/

,- = 456/(456+459 + 45:),. = 459/(456+459 + 45:),/ = 45:/(456+459 + 45:)

Page 9: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

9

Linear Softmax[1 0 0]!" =$" = [$"& $"' $"( $")] +!" = [,- ,. ,/]

0- = 1-&$"& + 1-'$"' + 1-($"( + 1-)$") + 3-0. = 1.&$"& + 1.'$"' + 1.($"( + 1.)$") + 3.0/ = 1/&$"& + 1/'$"' + 1/($"( + 1/)$") + 3/

,- = 456/(456+459 + 45:),. = 459/(456+459 + 45:),/ = 45:/(456+459 + 45:)

1 =1-& 1-' 1-( 1-)1.& 1.' 1.( 1.)1/& 1/' 1/( 1/)

3 = 3- 3. 3/

Page 10: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

10

Linear Softmax[1 0 0]!" =$" = [$"& $"' $"( $")] +!" = [,- ,. ,/]

0 = 1$2 + 42 1 =1-& 1-' 1-( 1-)1.& 1.' 1.( 1.)1/& 1/' 1/( 1/)

4 = 4- 4. 4/

,- = 567/(567+56: + 56;),. = 56:/(567+56: + 56;),/ = 56;/(567+56: + 56;)

Page 11: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

11

Linear Softmax[1 0 0]!" =$" = [$"& $"' $"( $")] +!" = [,- ,. ,/]

0 = 1$2 + 42

, = 56,789$(0)

1 =1-& 1-' 1-( 1-)1.& 1.' 1.( 1.)1/& 1/' 1/( 1/)

4 = 4- 4. 4/

Page 12: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

12

Linear Softmax[1 0 0]!" =$" = [$"& $"' $"( $")] +!" = [,- ,. ,/]

, = 01,234$(6$7 + 97)

Page 13: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

13

Two-layer MLP + Softmax[1 0 0]!" =$" = [$"& $"' $"( $")] +!" = [,- ,. ,/]

0& = 1234526(8[&]$9 + ;[&]9 ), = 15,=40$(8[']$9 + ;[']9 )

Page 14: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

14

N-layer MLP + Softmax[1 0 0]!" =$" = [$"& $"' $"( $")] +!" = [,- ,. ,/]

0& = 1234526(8[&]$9 + ;[&]9 )

, = 15,=40$(8[>]0>?&9 + ;[>]9 )

0' = 1234526(8[']0&9 + ;[']9 )…

0A = 1234526(8[A]0A?&9 + ;[A]9 )

Page 15: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

15

How to train the parameters?[1 0 0]!" =$" = [$"& $"' $"( $")] +!" = [,- ,. ,/]

0& = 1234526(8[&]$9 + ;[&]9 )

, = 15,=40$(8[>]0>?&9 + ;[>]9 )

0' = 1234526(8[']0&9 + ;[']9 )…

0A = 1234526(8[A]0A?&9 + ;[A]9 )

Page 16: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

Forward pass (Forward-propagation)

!"!#!$!%

&

'"'#'$'%

&

&

&

&

()" )"

*+ =&+-.

/0"+1'+ + 3" !+ = 4567859(*+)

<" =&+-.

/0#+!+ + 3#

)" = 4567859(<+)

=8>> = =()", ()")

Page 17: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

17

How to train the parameters?[1 0 0]!" =$" = [$"& $"' $"( $")] +!" = [,- ,. ,/]

0& = 1234526(8[&]$9 + ;[&]9 )

, = 15,=40$(8[>]0>?&9 + ;[>]9 )

0' = 1234526(8[']0&9 + ;[']9 )…

0A = 1234526(8[A]0A?&9 + ;["]9 )

BCB8[A]"D

BCB; A "

We need!

We can still use SGD

Page 18: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

18

How to train the parameters?[1 0 0]!" =$" = [$"& $"' $"( $")] +!" = [,- ,. ,/]

0& = 1234526(8[&]$9 + ;[&]9 )

, = 15,=40$(8[>]0>?&9 + ;[>]9 )

0' = 1234526(8[']0&9 + ;[']9 )…

ABA8[C]"D

ABA; C "

We need!

We can still use SGD

B = B511(,, !)

0" = 1234526(8[C]0C?&9 + ;[C]9 )

Page 19: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

19

How to train the parameters?[1 0 0]!" =$" = [$"& $"' $"( $")] +!" = [,- ,. ,/]

0& = 1234526(8[&]$9 + ;[&]9 )

, = 15,=40$(8[>]0>?&9 + ;[>]9 )

0' = 1234526(8[']0&9 + ;[']9 )…

ABA8[C]"D

ABA; C "

We need!

We can still use SGD

B = B511(,, !)

0" = 1234526(8[C]0C?&9 + ;[C]9 )

Page 20: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

20

How to train the parameters?[1 0 0]!" =$" = [$"& $"' $"( $")] +!" = [,- ,. ,/]

0& = 1234526(8[&]$9 + ;[&]9 )

, = 15,=40$(8[>]0>?&9 + ;[>]9 )

0' = 1234526(8[']0&9 + ;[']9 )…

0" = 1234526(8[A]0A?&9 + ;[A]9 )

BCB8[A]"D

= BCB0>?&

B0>?&B0>?'

…B0AE&B0AB0AB8 A "D

C = C511(,, !)

Page 21: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

Backward pass (Back-propagation)

!"!#!$!%

&

'"'#'$'%

&

&

&

&

()" )"

*+*,-

= **,-

/012304(,-)*+*!7

898:;

= 88:;

/012304(<-) 898 (=;

898 (=;

= 88 (=;

+()", ()")

*+*'7

= ( **'7&

-?@

AB"-C'- + E")

*+*,-

*+*B"-C

= *,-*B"-C

*+*,-

*+*!7

= ( **!7&

-?@

AB#-!- + E#)

*+*<"

*+*B#-

= *<"*B#-

*+*<"

GradInputs

GradParams

Page 22: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

Softmax+ Negative

LogLikelihood

Page 23: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

Linearlayer

Page 24: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

ReLUlayer

Page 25: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

Two-layer Neural Network – Forward Pass

Page 26: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

Two-layer Neural Network – Backward Pass

Page 27: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

Automatic Differentiation

You only need to write code for the forward pass,backward pass is computed automatically.

Pytorch (Facebook -- mostly):

Tensorflow (Google -- mostly):

DyNet (team includes UVA Prof. Yangfeng Ji):

https://pytorch.org/

https://www.tensorflow.org/

http://dynet.io/

Page 28: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

Defining a Model in Pytorch (Two Layer NN)

Page 29: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

1. Creating Model, Loss, Optimizer

Page 30: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

2. Running forward and backward on a batch

Compare this to what we had to do for toynn

Page 31: CS6501: Deep Learning for Visual Recognition Neural ...cs.virginia.edu/~vicente/deeplearning/slides/lecture05.pdfCS6501: Deep Learning for Visual Recognition Neural Networks / Multi-layer

Questions?

31