understanding rnn and lstm

15
RNN and LSTM (Oct 12, 2016) YANG Jiancheng

Upload: -

Post on 06-Jan-2017

52 views

Category:

Data & Analytics


0 download

TRANSCRIPT

Page 1: Understanding RNN and LSTM

RNN and LSTM(Oct 12, 2016)

YANG Jiancheng

Page 2: Understanding RNN and LSTM

Outline

• I. Vanilla RNN

• II. LSTM

• III. GRU and Other Structures

Page 3: Understanding RNN and LSTM

• I. Vanilla RNN

In theory, RNNs are absolutely capable of handling such “long-term dependencies.” A human could carefully pick parameters for them to solve toy problems of this form. Sadly, in practice, RNNs don’t seem to be able to learn them. 

GREAT Intro: Understanding LSTM Networks

Page 4: Understanding RNN and LSTM

• I. Vanilla RNN

WILDML has a series of articles to introduce RNN (4 articles, 2 GitHub repos).

Page 5: Understanding RNN and LSTM

• I. Vanilla RNN• Back Prop Through Time (BPTT)

Page 6: Understanding RNN and LSTM

• I. Vanilla RNN• Back Prop Through Time (BPTT)

Page 7: Understanding RNN and LSTM

• I. Vanilla RNN• Gradient Vanishing Problem

tanh and derivative. Source: http://nn.readthedocs.org/en/rtd/transfer/

RNNs tend to be very deep

Page 8: Understanding RNN and LSTM

• II. LSTM• Differences of LSTM and Vanilla RNN

Page 9: Understanding RNN and LSTM

• II. LSTM• Core Idea Behind LSTMs

Cell state Gates

Page 10: Understanding RNN and LSTM

• II. LSTM• Step-by-Step Walk Through

 0 ~ 1

 0 ~ 1

Page 11: Understanding RNN and LSTM

• II. LSTM• Step-by-Step Walk Through

 0 ~ 1

 0 ~ 1

Page 12: Understanding RNN and LSTM

• III. GRU and other structures• Gated Recurrent Unit (GRU)

• Combines the forget and input gates into a single “update gate.” • Merges the cell state and hidden state• Other changes

Page 13: Understanding RNN and LSTM

• III. GRU and other structures• Variants on Long Short Term Memory

Greff, et al. (2015) do a nice comparison of popular variants, finding that they’re all about the same.

Page 15: Understanding RNN and LSTM

Thanks for listening!