Overview of Recurrent Neural Networks

  • Recurrent Neural Networks for Sequential Pattern Modeling

    2017-03-22

    Lecture by Kim, Byoung-Hee

    Biointelligence Laboratory

    School of Computer Science and Engineering

    Seoul National University

    http://bi.snu.ac.kr

  • Part 1: Introduction to Recurrent Neural Networks (RNN)

    Part 2: How RNNs work

    Part 3:

    Part 4:

  • Part 1


  • Artificial neural networks

    In a basic neural network, information flows in one direction, from input to output (feedforward).

  • Activation functions: Neural Network Activation Functions

    (A. Graves, 2012)

    Rectified Linear Unit (ReLU): f(x) = max(x, 0)

    [Properties] encourages hidden-unit sparsity and alleviates gradient vanishing

    A ReLU unit is simply ON or OFF, in contrast to the smooth S-shaped sigmoid and tanh curves.

    https://www.quora.com/What-is-special-about-rectifier-neural-units-used-in-NN-learning
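    A minimal NumPy sketch of the three activations compared on the slide (the function names are mine):

    import numpy as np

    def sigmoid(x):
        # Smooth S-shaped curve with output in (0, 1); it saturates for
        # large |x|, which is one source of vanishing gradients.
        return 1.0 / (1.0 + np.exp(-x))

    def tanh(x):
        # Also S-shaped, but zero-centered with output in (-1, 1).
        return np.tanh(x)

    def relu(x):
        # f(x) = max(x, 0): a unit is simply ON (x) or OFF (0), giving
        # sparse hidden activations and a non-saturating gradient for x > 0.
        return np.maximum(x, 0.0)

    x = np.linspace(-3, 3, 7)
    print(relu(x))  # negative inputs are zeroed out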

  • Models for sequential data

    Differential equations: ODE, PDE

    Stochastic processes: ARMA, ARIMA

    HMM (Hidden Markov Model) & SSM (State Space Model)

    Kalman filter

    RNN (Recurrent Neural Networks)

  • Application example: speech-recognition software

    Hound by Soundhound

  • Speech recognition progress (Figure from ICML 2014 tutorial by Li Deng)

    Error rate on the Switchboard benchmark

    Microsoft Research: ~23% as of 2010

  • What does recurrent mean?

    An RNN contains feedback (recurrent) connections: the hidden state is fed back into the network at the next step.

  • RNN as API

    API: application program interface

    An RNN behaves like a function with internal state: each call maps an input to an output and updates the state.
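    A hedged sketch of this "RNN as API" view in NumPy; the class and method names are illustrative, not from the lecture:

    import numpy as np

    class RNNCellAPI:
        # An RNN viewed as an API: step(x) -> y, with the hidden state
        # kept inside the object between calls.
        def __init__(self, input_size, hidden_size, output_size, seed=0):
            rng = np.random.default_rng(seed)
            self.Wxh = rng.normal(0, 0.01, (hidden_size, input_size))
            self.Whh = rng.normal(0, 0.01, (hidden_size, hidden_size))
            self.Why = rng.normal(0, 0.01, (output_size, hidden_size))
            self.h = np.zeros(hidden_size)  # the state the API maintains

        def step(self, x):
            # The new state depends on the previous state and the input,
            # so the same call can return different outputs over time.
            self.h = np.tanh(self.Wxh @ x + self.Whh @ self.h)
            return self.Why @ self.h

    rnn = RNNCellAPI(input_size=4, hidden_size=8, output_size=3)
    for x in np.eye(4):   # a short sequence of one-hot inputs
        y = rnn.step(x)   # the output depends on the history of inputs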

  • RNN application: sequence labeling

    A. Graves, Supervised Sequence Labelling with Recurrent Neural Networks, 2012.

    Example: sentiment classification

  • RNN application: time series prediction

    Airline Passengers (1949~1960)

    Blue=Whole Dataset, Green=Training, Red=Predictions

  • RNN application: spike or periodic function generation

    Generating Timed Spikes (GTS), Periodic Function Generation (PFG)

    Gers, Schraudolph and Schmidhuber, Learning Precise Timing with LSTM Recurrent Networks, JMLR, 2002.


  • RNN applications by input and output type

    INPUT → OUTPUT

    Examples: "Doosan. Building your tomorrow today", "Proud Global Doosan"

  • Demo 1: handwriting generation

    Alex Graves

    http://www.cs.toronto.edu/~graves/handwriting.html

    Type up to 100 characters of text and the network generates handwriting for it.

    Options: handwriting Style and a bias setting; the result can be saved as a jpg.

    Example inputs: "Theres no place like home." "Lets count: 1, 2, 3"

  • RNN application: speech recognition

    Baidu, DeepSpeech2 (2015)

    An end-to-end speech recognition system built on RNNs

  • RNN application: Neural Machine Translation (NMT)

    An encoder RNN reads the source sentence and a decoder RNN generates the translation.

    https://research.googleblog.com/2016/09/a-neural-network-for-machine.html

  • Google Neural Machine Translation (2016)

    Encoder-Decoder architecture

    Both the Encoder and the Decoder are stacks of 8 LSTM layers.

  • RNN application: question answering

    Memory Network

  • Image captioning (Y. Lecun, Y. Bengio, and G. Hinton, 2015)

    A CNN encodes the image and an RNN generates the caption.

    http://www.nature.com/nature/journal/v521/n7553/fig_tab/nature14539_F3.html


  • Part 1 summary

    Recurrent neural networks (RNN) are models for sequential data.

    Sequential data: text, audio, video, etc.

    Typical tasks: sequence labeling, time series prediction, sequence generation

    Example applications: handwriting generation, speech recognition, machine translation, image captioning

  • Part 2


  • A basic unit computes y = wx + b

    x: input, y: output

    w: weight, b: bias

    The unit combines a linear transformation with a non-linear activation.
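    A one-line sketch of such a unit (illustrative code, not from the slides):

    import numpy as np

    def unit(x, w, b, activation=np.tanh):
        # Linear part (w * x + b) followed by a non-linear activation;
        # without the non-linearity, stacked units would stay linear.
        return activation(w * x + b)

    print(unit(x=2.0, w=0.5, b=-0.1))  # tanh(0.9)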

  • Activation functions: Neural Network Activation Functions

    (A. Graves, 2012)

    Rectified Linear Unit (ReLU): f(x) = max(x, 0)

    [Properties] encourages hidden-unit sparsity and alleviates gradient vanishing

    A ReLU unit is simply ON or OFF, in contrast to the smooth S-shaped sigmoid and tanh curves.

    https://www.quora.com/What-is-special-about-rectifier-neural-units-used-in-NN-learning

  • State

    State-transition diagram

  • State

    Some states cannot be observed directly: hidden states
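    A state-transition diagram boils down to a lookup table from (state, input) to the next state; the states and events below are invented for illustration:

    # (state, event) -> next state; unknown events leave the state unchanged.
    transitions = {
        ("idle", "start"): "running",
        ("running", "pause"): "idle",
        ("running", "stop"): "done",
    }

    state = "idle"
    for event in ["start", "pause", "start", "stop"]:
        state = transitions.get((state, event), state)
    print(state)  # -> "done"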

  • Key idea

    The network maintains a hidden state, and learning adjusts the connection weights.

  • (Slide from Stanford CS231n 2015~2016 winter class)

    An RNN processes a sequence of inputs one step at a time.

  • (Slide from Stanford CS231n 2015~2016 winter class)

    The same parameters are reused at every time step!

  • (Slide from Stanford CS231n 2015~2016 winter class)

    Parameters: Whh (hidden-to-hidden), Wxh (input-to-hidden), Why (hidden-to-output)

    (bias terms omitted)

  • Computational Graphs for RNN

    Parameters: Whh, Wxh, bh

    h_t = tanh(Whh h_{t-1} + Wxh x_t + bh)

    An RNN unrolled in time can be drawn as a computational graph whose nodes are operations and tensors.

    Images from Graham Neubig, Neural Machine Translation and Sequence-to-sequence Models: A Tutorial, arXiv:1703.01619v1
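    The recurrence above as a minimal NumPy sketch (the sizes are illustrative):

    import numpy as np

    def rnn_step(h_prev, x_t, Whh, Wxh, bh):
        # h_t = tanh(Whh @ h_{t-1} + Wxh @ x_t + bh); the same three
        # parameters are reused at every time step.
        return np.tanh(Whh @ h_prev + Wxh @ x_t + bh)

    H, D = 5, 3
    rng = np.random.default_rng(0)
    Whh = rng.normal(0, 0.1, (H, H))
    Wxh = rng.normal(0, 0.1, (H, D))
    bh = np.zeros(H)

    h = np.zeros(H)
    for x in rng.normal(size=(10, D)):  # unroll over a length-10 sequence
        h = rnn_step(h, x, Whh, Wxh, bh)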

  • Recurrent Neural Networks

    s_t = f(U x_t + W s_{t-1})

    y_t = g(V s_t)

    x_t: input at time t

    s_t: hidden state at time t (the memory of the network)

    f: an activation function (e.g., tanh or ReLU)

    U, V, W: network parameters (unlike a feedforward neural network, an RNN shares the same parameters across all time steps)

    g: activation function for the output layer (typically a softmax function)

    y_t: the output of the network at time t

    (Slide from Deep Learning Book Seminar of BI Lab)
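    The full forward pass in the slide's U, V, W notation, as a hedged NumPy sketch (the sizes are made up):

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    def rnn_forward(xs, U, V, W):
        # s_t = tanh(U @ x_t + W @ s_{t-1}),  y_t = softmax(V @ s_t)
        s = np.zeros(W.shape[0])
        states, ys = [], []
        for x in xs:
            s = np.tanh(U @ x + W @ s)   # f = tanh
            states.append(s)
            ys.append(softmax(V @ s))    # g = softmax
        return states, ys

    D, H, K = 4, 6, 4  # input, hidden, and output sizes (illustrative)
    rng = np.random.default_rng(1)
    U = rng.normal(0, 0.1, (H, D))
    V = rng.normal(0, 0.1, (K, H))
    W = rng.normal(0, 0.1, (H, H))
    states, ys = rnn_forward(np.eye(D), U, V, W)  # a toy one-hot sequence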

  • Computing the Gradient in a Recurrent Neural Network

    The use of back-propagation on the unrolled graph is called the back-propagation through time (BPTT) algorithm.

    The backpropagation algorithm can be extended to BPTT by unfolding the RNN in time and stacking identical copies of the RNN.

    As the parameters to be learned (U, V and W) are shared by all time steps in the network, the gradient at each output depends not only on the calculations of the current time step, but also on those of the previous time steps.

    (Slide from Deep Learning Book Seminar of BI Lab)
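    A hedged NumPy sketch of BPTT for the network above (biases omitted for brevity); note how the gradients for the shared U, V, W accumulate over all time steps, and how ds_next carries gradient from later steps back to earlier ones:

    import numpy as np

    def bptt(xs, targets, U, V, W):
        # Forward pass, keeping every hidden state for the backward pass.
        ss = {-1: np.zeros(W.shape[0])}
        ps = {}
        for t, x in enumerate(xs):
            ss[t] = np.tanh(U @ x + W @ ss[t - 1])
            z = V @ ss[t]
            e = np.exp(z - z.max())
            ps[t] = e / e.sum()
        # Backward pass: walk the unrolled graph from the last step back.
        dU, dV, dW = np.zeros_like(U), np.zeros_like(V), np.zeros_like(W)
        ds_next = np.zeros(W.shape[0])
        for t in reversed(range(len(xs))):
            dy = ps[t].copy()
            dy[targets[t]] -= 1.0          # cross-entropy gradient at the logits
            dV += np.outer(dy, ss[t])
            ds = V.T @ dy + ds_next        # from this output and from later steps
            dz = (1.0 - ss[t] ** 2) * ds   # back through tanh
            dU += np.outer(dz, xs[t])
            dW += np.outer(dz, ss[t - 1])
            ds_next = W.T @ dz             # hand the gradient to step t-1
        return dU, dV, dW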

  • Character-level language model (Slide from Stanford CS231n 2015~2016 winter class)

    Each input character is represented as a one-hot encoding.
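    One-hot encoding for the CS231n "hello" example, as a small sketch:

    import numpy as np

    vocab = sorted(set("hello"))                    # ['e', 'h', 'l', 'o']
    char_to_ix = {ch: i for i, ch in enumerate(vocab)}

    def one_hot(ch):
        # All zeros except a single 1 at the character's index.
        v = np.zeros(len(vocab))
        v[char_to_ix[ch]] = 1.0
        return v

    print(one_hot("h"))  # [0. 1. 0. 0.]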

  • (Slide from Stanford CS231n 2015~2016 winter class)

    Training: the model learns to predict the next character, updating the weight matrices W_*.

  • Demo 2: Quick, Draw!

    1. Open the Google AI Experiment Quick, Draw!

    https://aiexperiments.withgoogle.com/quick-draw

    2. You are given an object to draw.

    3. Draw while the neural network guesses what the sketch is.

    4. Each round has a 20-second time limit.

  • The recognizer behind Quick, Draw! uses LSTM networks.

  • Long Short-Term Memory (LSTM): motivation

    Plain RNNs are hard to train on long sequences:

    gradient exploding or vanishing means a big model cannot learn well.

    (Slide from Stanford CS231n 2015~2016 winter class)

  • Long-Term Dependencies

    "The clouds are in the sky"

    When the gap between the relevant context and where it is needed is small, an RNN can learn to bridge it (short-term memory).

    http://colah.github.io/posts/2015-08-Understanding-LSTMs/

  • Longer-Term Dependencies

    As the gap grows, a plain RNN fails to connect the information; bridging it requires long-term memory.

    http://colah.github.io/posts/2015-08-Understanding-LSTMs/

  • LSTM (Long Short-Term Memory)
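    As a preview, a hedged sketch of the standard LSTM cell equations (the gate names are conventional, not from the slides; the weights act on the concatenation [h_prev; x]). The additive cell update c = f * c_prev + i * c_tilde is what lets information, and gradients, survive across long gaps:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def lstm_step(x, h_prev, c_prev, Wf, Wi, Wc, Wo, bf, bi, bc, bo):
        hx = np.concatenate([h_prev, x])
        f = sigmoid(Wf @ hx + bf)        # forget gate: what to erase from the cell
        i = sigmoid(Wi @ hx + bi)        # input gate: what to write
        c_tilde = np.tanh(Wc @ hx + bc)  # candidate values to write
        c = f * c_prev + i * c_tilde     # cell state: the long-term memory
        o = sigmoid(Wo @ hx + bo)        # output gate: what to reveal
        h = o * np.tanh(c)               # hidden state: the short-term output
        return h, c

    H, D = 4, 3  # hidden and input sizes (illustrative)
    rng = np.random.default_rng(2)
    Ws = [rng.normal(0, 0.1, (H, H + D)) for _ in range(4)]
    bs = [np.zeros(H) for _ in range(4)]
    h, c = lstm_step(rng.normal(size=D), np.zeros(H), np.zeros(H), *Ws, *bs)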
