Understanding RNN and LSTM
TRANSCRIPT
RNN and LSTM (Oct 12, 2016)
YANG Jiancheng
Outline
• I. Vanilla RNN
• II. LSTM
• III. GRU and Other Structures
• I. Vanilla RNN
In theory, RNNs are absolutely capable of handling such “long-term dependencies.” A human could carefully pick parameters for them to solve toy problems of this form. Sadly, in practice, RNNs don’t seem to be able to learn them.
Great intro: Understanding LSTM Networks
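For concreteness, here is a minimal NumPy sketch of the vanilla RNN recurrence h_t = tanh(W_xh x_t + W_hh h_{t-1} + b); the weight names and sizes below are made up purely for illustration.

```python
import numpy as np

def rnn_forward(xs, h0, W_xh, W_hh, b_h):
    """Run a vanilla RNN over a sequence of input vectors xs.

    h_t = tanh(W_xh @ x_t + W_hh @ h_{t-1} + b_h)
    """
    h = h0
    hs = []
    for x in xs:
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
        hs.append(h)
    return hs

# Toy dimensions, chosen only for illustration.
input_dim, hidden_dim, T = 8, 16, 5
rng = np.random.default_rng(0)
xs = [rng.normal(size=input_dim) for _ in range(T)]
params = dict(
    h0=np.zeros(hidden_dim),
    W_xh=rng.normal(scale=0.1, size=(hidden_dim, input_dim)),
    W_hh=rng.normal(scale=0.1, size=(hidden_dim, hidden_dim)),
    b_h=np.zeros(hidden_dim),
)
hs = rnn_forward(xs, **params)
```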
• I. Vanilla RNN
WILDML has a series of articles introducing RNNs (4 articles, 2 GitHub repos).
• I. Vanilla RNN: Back Propagation Through Time (BPTT)
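In BPTT the RNN is unrolled over time and the chain rule is applied through every step. For the tanh recurrence above, the gradient of the loss at step t with respect to the recurrent weights takes the usual form (standard notation, not copied from the slides):

\[
\frac{\partial L_t}{\partial W_{hh}} \;=\; \sum_{k=0}^{t} \frac{\partial L_t}{\partial h_t}\left(\prod_{j=k+1}^{t} \frac{\partial h_j}{\partial h_{j-1}}\right)\frac{\partial h_k}{\partial W_{hh}}
\]

The product of Jacobians over long spans (large t - k) is exactly where learning long-term dependencies becomes difficult.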
• I. Vanilla RNN: Gradient Vanishing Problem
Figure: tanh and its derivative. Source: http://nn.readthedocs.org/en/rtd/transfer/
Unrolled over time, an RNN is effectively a very deep network, so these small tanh derivatives get multiplied over many steps and the gradient shrinks.
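A small numeric sketch of this effect (arbitrary weights and sizes, not from the talk): the gradient flowing backwards through the tanh RNN is multiplied at each step by the Jacobian diag(1 - h_t^2) W_hh, and its norm typically decays geometrically.

```python
import numpy as np

# Toy setup: measure how a gradient vector shrinks as it is
# back-propagated through many tanh RNN steps.
hidden_dim, T = 16, 50
rng = np.random.default_rng(1)
W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))

h = np.zeros(hidden_dim)
hs = []
for _ in range(T):
    h = np.tanh(W_hh @ h + rng.normal(size=hidden_dim))
    hs.append(h)

grad = np.ones(hidden_dim)          # pretend this is dL/dh_T
norms = []
for h in reversed(hs):
    # Jacobian of h_t w.r.t. h_{t-1} for the tanh RNN: diag(1 - h_t^2) @ W_hh
    grad = W_hh.T @ (grad * (1.0 - h ** 2))
    norms.append(np.linalg.norm(grad))

print(norms[0], norms[-1])          # gradient norm after 1 step vs. after T steps
```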
• II. LSTM: Differences between LSTM and Vanilla RNN
• II. LSTM: Core Idea Behind LSTMs
The cell state, regulated by gates.
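In the standard LSTM formulation, the cell state c_t is updated almost linearly, and sigmoid gates decide what to erase, what to write, and what to expose:

\[
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t, \qquad h_t = o_t \odot \tanh(c_t)
\]

where f_t, i_t and o_t are the forget, input and output gates.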
• II. LSTM: Step-by-Step Walk Through
Each gate is a sigmoid layer, so its output lies between 0 and 1 and describes how much of each component to let through (0 means "let nothing through", 1 means "let everything through").
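Putting the walk-through together, here is a minimal single-step LSTM cell in NumPy. The weight names and dimensions are illustrative and the biases are folded into one vector; this follows the standard LSTM equations rather than anything specific to these slides.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step. W maps [h_prev; x] to the four gate pre-activations."""
    z = W @ np.concatenate([h_prev, x]) + b
    f, i, o, g = np.split(z, 4)
    f = sigmoid(f)              # forget gate: what to erase from the cell state (0..1)
    i = sigmoid(i)              # input gate: how much of the candidate to write (0..1)
    o = sigmoid(o)              # output gate: how much of the cell state to expose (0..1)
    g = np.tanh(g)              # candidate cell values
    c = f * c_prev + i * g      # new cell state
    h = o * np.tanh(c)          # new hidden state
    return h, c

# Toy usage with arbitrary sizes.
input_dim, hidden_dim = 8, 16
rng = np.random.default_rng(2)
W = rng.normal(scale=0.1, size=(4 * hidden_dim, hidden_dim + input_dim))
b = np.zeros(4 * hidden_dim)
h, c = np.zeros(hidden_dim), np.zeros(hidden_dim)
h, c = lstm_step(rng.normal(size=input_dim), h, c, W, b)
```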
• III. GRU and Other Structures: Gated Recurrent Unit (GRU)
• Combines the forget and input gates into a single "update gate"
• Merges the cell state and hidden state
• Other changes (a minimal GRU step is sketched below)
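For comparison with the LSTM cell above, a minimal GRU step under the same conventions (illustrative weight names; this is the standard GRU formulation with update gate z and reset gate r).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x, h_prev, W_z, W_r, W_h, b_z, b_r, b_h):
    """One GRU step; each W maps [h; x] to a hidden-sized vector."""
    hx = np.concatenate([h_prev, x])
    z = sigmoid(W_z @ hx + b_z)     # update gate: plays the role of both forget and input gates
    r = sigmoid(W_r @ hx + b_r)     # reset gate: how much past state feeds the candidate
    h_tilde = np.tanh(W_h @ np.concatenate([r * h_prev, x]) + b_h)  # candidate state
    # No separate cell state: h is both the memory and the output.
    return (1.0 - z) * h_prev + z * h_tilde
```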
• III. GRU and Other Structures: Variants on Long Short-Term Memory
Greff et al. (2015) do a nice comparison of popular variants, finding that they're all about the same.
Bibliography
• [1] Understanding LSTM Networks
• [2] Back Propagation Through Time and Vanishing Gradients
Thanks for listening!