deep learning for computer vision: recurrent neural networks (upc 2016)


  • Day 2 Lecture 6

    Recurrent Neural Networks

    Xavier Giró-i-Nieto

    [course site]

    https://imatge.upc.edu/web/people/xavier-giro
    http://imatge-upc.github.io/telecombcn-2016-dlcv/

  • 2

    Acknowledgments

    Santi Pascual

  • 3

    General idea

    ConvNet (or CNN)

  • 5

    Multilayer Perceptron

    Alex Graves, Supervised Sequence Labelling with Recurrent Neural Networks

    The output depends ONLY on the current input.

    https://www.cs.toronto.edu/~graves/preprint.pdf

  • 6

    Recurrent Neural Network (RNN)

    Alex Graves, Supervised Sequence Labelling with Recurrent Neural Networks

    The hidden layers and the output depend on previous states of the hidden layers.

    https://www.cs.toronto.edu/~graves/preprint.pdf

  • 8

    Recurrent Neural Network (RNN)

    [Figure: the unrolled RNN drawn in front view and side view, related by a 90° rotation, with time along one axis.]

  • 9

    Recurrent Neural Networks (RNN)

    Alex Graves, Supervised Sequence Labelling with Recurrent Neural Networks

    Each node represents a layer of neurons at a single timestep.

    t-1, t, t+1

    https://www.cs.toronto.edu/~graves/preprint.pdf

  • 10

    Recurrent Neural Networks (RNN)

    Alex Graves, Supervised Sequence Labelling with Recurrent Neural Networks

    t-1, t, t+1

    The input is a SEQUENCE x(t) of any length.

    https://www.cs.toronto.edu/~graves/preprint.pdf

  • 11

    Recurrent Neural Networks (RNN)

    Common visual sequences:

    Still image: spatial scan (zigzag, snake)

    The input is a SEQUENCE x(t) of any length.

  • 12

    Recurrent Neural Networks (RNN)

    Common visual sequences:

    Video: temporal sampling

    The input is a SEQUENCE x(t) of any length.

  • 13

    Recurrent Neural Networks (RNN)

    Alex Graves, Supervised Sequence Labelling with Recurrent Neural Networks

    Must learn the temporally shared weights w2, in addition to w1 & w3.

    t-1, t, t+1

    https://www.cs.toronto.edu/~graves/preprint.pdf
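The recurrence on this slide can be sketched in NumPy: the same recurrent matrix w2 is reused at every timestep, while w1 projects the input and w3 projects the hidden state to the output. This is a minimal illustration, not the lecture's code; all sizes and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 4-dim input, 8-dim hidden state, 3-dim output.
n_in, n_hid, n_out = 4, 8, 3

# w1: input -> hidden, w2: shared recurrent hidden -> hidden, w3: hidden -> output
w1 = rng.normal(scale=0.1, size=(n_hid, n_in))
w2 = rng.normal(scale=0.1, size=(n_hid, n_hid))
w3 = rng.normal(scale=0.1, size=(n_out, n_hid))

def rnn_forward(xs):
    """Unroll the RNN over a sequence; the same w2 is applied at every step."""
    h = np.zeros(n_hid)
    ys = []
    for x in xs:                      # one iteration per timestep t
        h = np.tanh(w1 @ x + w2 @ h)  # hidden state depends on x(t) and h(t-1)
        ys.append(w3 @ h)             # output at timestep t
    return np.array(ys), h

xs = rng.normal(size=(5, n_in))       # a sequence of length 5
ys, h_last = rnn_forward(xs)
```

Because w2 is shared across timesteps, the network handles sequences of any length with a fixed number of parameters.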

  • 14

    Bidirectional RNN (BRNN)

    Alex Graves, Supervised Sequence Labelling with Recurrent Neural Networks

    Must learn weights w2, w3, w4 & w5, in addition to w1 & w6.

    https://www.cs.toronto.edu/~graves/preprint.pdf

  • 15

    Alex Graves, Supervised Sequence Labelling with Recurrent Neural Networks

    Bidirectional RNN (BRNN)

    https://www.cs.toronto.edu/~graves/preprint.pdf
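A BRNN runs two independent recurrences, one left-to-right and one right-to-left, and each output sees context from both the past and the future. A minimal NumPy sketch (variable names and sizes are hypothetical; they loosely correspond to the slide's w1-w6 numbering but are not the lecture's code):

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_hid, n_out = 4, 8, 3

# Forward-direction input and recurrent weights, backward-direction
# input and recurrent weights, and one output projection over both.
w_in_f  = rng.normal(scale=0.1, size=(n_hid, n_in))
w_rec_f = rng.normal(scale=0.1, size=(n_hid, n_hid))
w_in_b  = rng.normal(scale=0.1, size=(n_hid, n_in))
w_rec_b = rng.normal(scale=0.1, size=(n_hid, n_hid))
w_out   = rng.normal(scale=0.1, size=(n_out, 2 * n_hid))

def brnn_forward(xs):
    T = len(xs)
    hf = np.zeros((T, n_hid))
    hb = np.zeros((T, n_hid))
    h = np.zeros(n_hid)
    for t in range(T):                      # left-to-right pass
        h = np.tanh(w_in_f @ xs[t] + w_rec_f @ h)
        hf[t] = h
    h = np.zeros(n_hid)
    for t in reversed(range(T)):            # right-to-left pass
        h = np.tanh(w_in_b @ xs[t] + w_rec_b @ h)
        hb[t] = h
    # Each output combines the forward and backward hidden states at t.
    return np.array([w_out @ np.concatenate([hf[t], hb[t]]) for t in range(T)])

ys = brnn_forward(rng.normal(size=(5, n_in)))
```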

  • 16

    Slide: Santi Pascual

    Formulation: One hidden layer

    Delay unit (z-1)

  • 17

    Slide: Santi Pascual

    Formulation: Single recurrence

    One-time recurrence

  • 18

    Slide: Santi Pascual

    Formulation: Multiple recurrences

    Recurrence

    One time-step recurrence

    T time-step recurrences

  • 19

    Slide: Santi Pascual

    RNN problems

    Long-term memory vanishes because of the T nested multiplications by U.

    ...

  • 20

    Slide: Santi Pascual

    RNN problems

    During training, gradients may explode or vanish because of temporal depth.

    Example: Back-propagation in time with 3 steps.
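The effect described here, gradients shrinking through T nested multiplications by the recurrent matrix U, can be demonstrated numerically. A sketch with a randomly drawn U whose spectral radius is below 1 (the assumption behind the vanishing case; a radius above 1 would give explosion instead):

```python
import numpy as np

rng = np.random.default_rng(2)
n_hid = 8

# Small random recurrent matrix: spectral radius well below 1.
u = rng.normal(scale=0.1, size=(n_hid, n_hid))

grad = np.ones(n_hid)   # gradient arriving at the last timestep
norms = []
for t in range(30):     # back-propagate 30 steps through time
    grad = u.T @ grad   # each step multiplies by U (tanh' factors, <= 1, omitted)
    norms.append(np.linalg.norm(grad))

# The gradient norm decays roughly geometrically with temporal depth,
# so early timesteps receive almost no learning signal.
```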

  • 21

    Long Short-Term Memory (LSTM)

  • 22

    Hochreiter, Sepp, and Jürgen Schmidhuber. "Long short-term memory." Neural Computation 9, no. 8 (1997): 1735-1780.

    Long Short-Term Memory (LSTM)

    http://web.eecs.utk.edu/~itamar/courses/ECE-692/Bobby_paper1.pdf

  • 23

    Figure: Christopher Olah, Understanding LSTM Networks (2015)

    Long Short-Term Memory (LSTM)

    Based on a standard RNN whose neurons activate with tanh...

    http://colah.github.io/posts/2015-08-Understanding-LSTMs/

  • 24

    Long Short-Term Memory (LSTM)

    Ct is the cell state, which flows through the entire chain...

    Figure: Christopher Olah, Understanding LSTM Networks (2015)

    http://colah.github.io/posts/2015-08-Understanding-LSTMs/

  • 25

    Long Short-Term Memory (LSTM)

    ...and is updated with a sum instead of a product. This avoids memory vanishing and exploding/vanishing backprop gradients.

    Figure: Christopher Olah, Understanding LSTM Networks (2015)

    http://colah.github.io/posts/2015-08-Understanding-LSTMs/

  • 26

    Long Short-Term Memory (LSTM)

    Three gates, governed by sigmoid units (in [0,1]), control the flow of information into and out of the cell.

    Figure: Christopher Olah, Understanding LSTM Networks (2015)

    http://colah.github.io/posts/2015-08-Understanding-LSTMs/

  • 27

    Long Short-Term Memory (LSTM)

    Forget Gate:

    Concatenate

    Figure: Christopher Olah, Understanding LSTM Networks (2015) / Slide: Alberto Montes

    http://colah.github.io/posts/2015-08-Understanding-LSTMs/

  • 28

    Long Short-Term Memory (LSTM)

    Input Gate Layer

    New contribution to cell state

    Classic neuron

    Figure: Christopher Olah, Understanding LSTM Networks (2015) / Slide: Alberto Montes

    http://colah.github.io/posts/2015-08-Understanding-LSTMs/

  • 29

    Long Short-Term Memory (LSTM)

    Update Cell State (memory):

    Figure: Christopher Olah, Understanding LSTM Networks (2015) / Slide: Alberto Montes

    http://colah.github.io/posts/2015-08-Understanding-LSTMs/

  • 30

    Long Short-Term Memory (LSTM)

    Output Gate Layer

    Output to next layer

    Figure: Christopher Olah, Understanding LSTM Networks (2015) / Slide: Alberto Montes

    http://colah.github.io/posts/2015-08-Understanding-LSTMs/

  • 31

    Long Short-Term Memory (LSTM)

    Figure: Christopher Olah, Understanding LSTM Networks (2015) / Slide: Alberto Montes

    http://colah.github.io/posts/2015-08-Understanding-LSTMs/
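The gates walked through on the preceding slides (forget, input, cell update, output) fit together into one cell step. A minimal NumPy sketch of the standard formulation, in which each gate acts on the concatenation of h(t-1) and x(t); sizes and weight names are hypothetical, not the lecture's code:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(3)
n_in, n_hid = 4, 8
n_cat = n_in + n_hid   # gates see [h(t-1), x(t)] concatenated

# One weight matrix per gate: forget, input, cell candidate, output.
wf, wi, wc, wo = (rng.normal(scale=0.1, size=(n_hid, n_cat)) for _ in range(4))
bf = bi = bc = bo = np.zeros(n_hid)

def lstm_step(x, h_prev, c_prev):
    z = np.concatenate([h_prev, x])
    f = sigmoid(wf @ z + bf)          # forget gate: what to erase from C
    i = sigmoid(wi @ z + bi)          # input gate: what to write
    c_tilde = np.tanh(wc @ z + bc)    # new candidate contribution
    c = f * c_prev + i * c_tilde      # cell state updated with a SUM
    o = sigmoid(wo @ z + bo)          # output gate
    h = o * np.tanh(c)                # output to the next layer / timestep
    return h, c

h, c = lstm_step(rng.normal(size=n_in), np.zeros(n_hid), np.zeros(n_hid))
```

The additive update of `c` is the point made on slide 25: gradients flow through the sum without the repeated matrix products that make plain RNNs vanish or explode.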

  • 32

    Gated Recurrent Unit (GRU)

    Cho, Kyunghyun, Bart van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. "Learning phrase representations using RNN encoder-decoder for statistical machine translation." EMNLP 2014.

    Similar performance to LSTM with less computation.

    http://arxiv.org/abs/1406.1078

  • 33

    Figure: Christopher Olah, Understanding LSTM Networks (2015) / Slide: Alberto Montes

    Gated Recurrent Unit (GRU)

    http://colah.github.io/posts/2015-08-Understanding-LSTMs/
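The GRU merges the cell and hidden states and uses only two gates (update and reset), which is where the computational saving over the LSTM comes from: three weight matrices instead of four, and no separate cell state. A minimal single-step sketch with hypothetical sizes and names:

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

rng = np.random.default_rng(4)
n_in, n_hid = 4, 8
n_cat = n_in + n_hid

# Update gate, reset gate, and candidate-state weights.
wz, wr, wh = (rng.normal(scale=0.1, size=(n_hid, n_cat)) for _ in range(3))

def gru_step(x, h_prev):
    cat = np.concatenate([h_prev, x])
    z = sigmoid(wz @ cat)       # update gate: how much of h to replace
    r = sigmoid(wr @ cat)       # reset gate: how much history the candidate sees
    h_tilde = np.tanh(wh @ np.concatenate([r * h_prev, x]))  # candidate state
    # No separate cell state: h itself is interpolated by the update gate.
    return (1.0 - z) * h_prev + z * h_tilde

h = gru_step(rng.normal(size=n_in), np.zeros(n_hid))
```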

  • 34

    Applications: Machine Translation

    Cho, Kyunghyun, Bart van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. "Learning phrase representations using RNN encoder-decoder for statistical machine translation." arXiv preprint arXiv:1406.1078 (2014).

    Language IN

    Language OUT

    http://arxiv.org/abs/1406.1078

  • 35

    Applications: Image Classification

    van den Oord, Aaron, Nal Kalchbrenner, and Koray Kavukcuoglu. "Pixel Recurrent Neural Networks." arXiv preprint arXiv:1601.06759 (2016).

    Row LSTM / Diagonal BiLSTM / MNIST classification

    http://arxiv.org/abs/1601.06759

  • 36

    Applications: Segmentation

    Francesco Visin, Marco Ciccone, Adriana Romero, Kyle Kastner, Kyunghyun Cho, Yoshua Bengio, Matteo Matteucci, Aaron Courville, ReSeg: A Recurrent Neural Network-Based Model for Semantic Segmentation. DeepVision CVPRW 2016.

    http://www.cv-foundation.org/openaccess/content_cvpr_2016_workshops/w12/html/Visin_ReSeg_A_Recurrent_CVPR_2016_paper.html

  • 37

    Learn more

    Nando de Freitas (University of Oxford)

    http://www.youtube.com/watch?v=56TYLaQN4N8
    https://www.cs.ox.ac.uk/people/nando.defreitas/

  • 38

    Thanks! Q&A? Follow me at

    https://imatge.upc.edu/web/people/xavier-giro

    @DocXavi/ProfessorXavi

    https://twitter.com/DocXavi
    https://www.facebook.com/ProfessorXavi/
