[한국어] neural architecture search with reinforcement learning

of 39 /39
Presentation on “Neural Architecture Search with Reinforcement Learning” Kiho Suh Modulabs( ), June 12th 2017

Upload: kiho-suh

Post on 22-Jan-2018

1.947 views

Category:

Data & Analytics


0 download

TRANSCRIPT

Page 1: [한국어] Neural Architecture Search with Reinforcement Learning

Presentation on “Neural Architecture Search with Reinforcement Learning”

Kiho Suh Modulabs( ), June 12th 2017

Page 2: [한국어] Neural Architecture Search with Reinforcement Learning

About Paper

• 2016 11 5 v1

• v2

• Google Brain

• Barret Zoph, Quoc V. Le

• ICLR 2017 Oral Presentation

Kiho Suh, Modulabs ( )

Page 3: [한국어] Neural Architecture Search with Reinforcement Learning

Google’s AutoML

https://futurism.com/googles-new-ai-is-better-at-creating-ai-than-the-companys-engineers/ http://techm.kr/bbs/?t=154

• AI ?…

• AutoML

Kiho Suh, Modulabs ( )

Page 4: [한국어] Neural Architecture Search with Reinforcement Learning

Motivation for Architecture Search

• . .

• “ ” .

• Feature Engineering Neural Network Engineering .

• ?

• Kiho Suh, Modulabs ( )slide modified from Zoph, Le

Page 5: [한국어] Neural Architecture Search with Reinforcement Learning

Related Work

• Hyperparameter optimization

• Modern neuro-evolution algorithms

• Neural Architecture Search

• Learning to learn or Meta-Learning

Learning to learn by gradient descent by gradient descent, Andrychowicz et al. 2016

Kiho Suh, Modulabs ( )

Page 6: [한국어] Neural Architecture Search with Reinforcement Learning

Neural Architecture Search

• Configuration string . (Caffe )

- layer Configuration: [“Filter Width: 5”, “Filter Height: 3”, “Num Filters: 24”]

• RNN (“Controller”) Neural Network Architecture string .

• RNN . string .

• Validation set (“Child Network”) .

• Child model Controller model .

• Reward signal .

Kiho Suh, Modulabs ( )slide modified from Zoph, Le

Page 7: [한국어] Neural Architecture Search with Reinforcement Learning

Neural Architecture Search

1. Controller RNN hyperparameter . Layer

. training

.

2. Controller RNN

train .

3. validation set .

4. Expected Validation Accuracy Controller RNN θc .

5. Policy gradient θc .

Kiho Suh, Modulabs ( )

Page 8: [한국어] Neural Architecture Search with Reinforcement Learning

Neural Architecture Search for Convolutional Networks

Softmax classifier

Embedding

Controller RNN

Kiho Suh, Modulabs ( )slide from Zoph, Le

Page 9: [한국어] Neural Architecture Search with Reinforcement Learning

Training with REINFORCE

Architecture predicted by the controller RNN viewed as a sequence of actions.

, Reward signal

• REINFORCE Q-learning . .

• Layer CNN T 3 . a1 filter height, a2 filter width, a3 number of filters .

• Reward signal R non-differentiable policy gradient θc .

Standard REINFORCE Update Rule

Controller hyperparameter

Controller RNN

Kiho Suh, Modulabs ( )

Page 10: [한국어] Neural Architecture Search with Reinforcement Learning

Training with REINFORCE

Sample many architectures at one time and average them across mini batch

high variance baseline ( )

mini batch

Controller hyperparameter Controller RNN

Kiho Suh, Modulabs ( )

Page 11: [한국어] Neural Architecture Search with Reinforcement Learning

Distributed Training

• Controller parameter S parameter server . Parameter server parameter K controller replicas .

• controller replica m architecture sample child model .

• child model parameter server θc gradient .

• 10 parameter servers

• 800 GPUs. Accuracy train .

• 13,000~15,000 train . Google 2~3 .Kiho Suh, Modulabs ( )

Shards

Page 12: [한국어] Neural Architecture Search with Reinforcement Learning

Overview of Experiments

• CIFAR-10 Penn Treebank .

• CIFAR-10 CNN Penn Treebank RNN cell .

• Penn Treebank State of the Art CIFAR-10 State of the Art

• Penn Treebank cell LSTM language modeling datasets .

https://www.cs.toronto.edu/~kriz/cifar.html https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html

Kiho Suh, Modulabs ( )

Page 13: [한국어] Neural Architecture Search with Reinforcement Learning

Neural Architecture Search for CIFAR-10

• CIFAR-10 convolutional network Neural Architecture Search .

• layer (15, 20, 13) :

- Filter width/height

- Stride width/height

- Number of filters

Kiho Suh, Modulabs ( )slide modified from Zoph, Le

Page 14: [한국어] Neural Architecture Search with Reinforcement Learning

Neural Architecture Search for CIFAR-10

• Skip Connection ( )

• Filter Height Width [1,3,5,7] softmax . Continuous , .

• One layer = 128 unit RNN (pretty small)

• Skip Connection (e.g. ResNet, DenseNet)

[1,3,5,7] [1,3,5,7] [1,2,3] [1,2,3] [24,36,48,64]

Kiho Suh, Modulabs ( )

Page 15: [한국어] Neural Architecture Search with Reinforcement Learning

CIFAR-10 Prediction Method

• Branching residual connections .

• Skip Connection .

• Layer N layer layer N input N-1 sigmoid .

• layer , minibatch .

• layer layer concatenate .

Kiho Suh, Modulabs ( )slide modified from Zoph, Le

Page 16: [한국어] Neural Architecture Search with Reinforcement Learning

Skip Connection in ResNet

The formulation of F(x) +x can be realized by feedforward neural networks with “shortcut connections” (Fig. 2). Shortcut connections [2, 34, 49] are those skipping one or more

layers. In our case, the shortcut connections simply perform identity mapping, and their outputs are added to the outputs of the stacked layers (Fig. 2). Identity shortcut

connections add neither extra parameter nor computational complexity.

Kiho Suh, Modulabs ( )

Page 17: [한국어] Neural Architecture Search with Reinforcement Learning

Neural Architecture Search for CIFAR-10

Weight Matrices

Kiho Suh, Modulabs ( )

Page 18: [한국어] Neural Architecture Search with Reinforcement Learning

CIFAR-10 Experiment Details

• 100 Controller Replica training 8 child network .

• 800 GPUs .

• Reward given to the Controller is the maximum validation accuracy of the last 5 epochs squared

• 50,000 Training 45,000 training 5,000 validation .

• child model 50 epoch train . .

• 12,800 child model .

• Controller curriculum training layer .

• : 1030~1040

Kiho Suh, Modulabs ( )

Page 19: [한국어] Neural Architecture Search with Reinforcement Learning

DenseNet and ResNet

DenseNet, Huang et al. 2016 ResNet, He et al. 2015

DenseNets connects each layer to every other layer in a feed-forward fashion. They alleviate the vanishing-gradient problem, strengthen feature

propagation, encourage feature reuse, and substantially reduce the number of parameters.

Kiho Suh, Modulabs ( )

Page 20: [한국어] Neural Architecture Search with Reinforcement Learning

Generated Convolutional Network from Neural Architecture Search

5% faster

Best result of evolution (Real et al. 2017): 5.4% Best result of Q-learning (Baker et al. 2017): 6.92%

Kiho Suh, Modulabs ( )slide modified from Zoph, Le

Page 21: [한국어] Neural Architecture Search with Reinforcement Learning

Generated Convolutional Network from Neural Architecture Search

• Skip Connection

• (e.g. 7 x 4 filter)

• convolution layer .

Kiho Suh, Modulabs ( )

Page 22: [한국어] Neural Architecture Search with Reinforcement Learning

Recurrent Cell Prediction Method

• LSTM GRU RNN cell search space .

• LSTM cell Search space .Kiho Suh, Modulabs ( )slide modified from Zoph, Le

Page 23: [한국어] Neural Architecture Search with Reinforcement Learning

Recurrent Cell Prediction Method

Controller RNNCell Search Space Created New Cell

• activation function

• LSTM “ ”

• leaf node 8 ( 2)

• Controller RNN activation function

label

• Controller RNN

• cell

.

Kiho Suh, Modulabs ( )

Page 24: [한국어] Neural Architecture Search with Reinforcement Learning

Penn Treebank Experiment Details

• Run Neural Architecture Search with our cell prediction method on the Penn Treebank language modeling dataset

• 1 child network training 400 Controller Replica .

• 400 CPU .

• 15,000 child model . ( search space 1018 )

• Controller Reward c/(validation perplexity)2

Kiho Suh, Modulabs ( )slide modified from Zoph, Le

Page 25: [한국어] Neural Architecture Search with Reinforcement Learning

Penn Treebank Results

LSTM Cell Neural Architecture Search (NAS) Cell

Kiho Suh, Modulabs ( )

Page 26: [한국어] Neural Architecture Search with Reinforcement Learning

Other Experiment Info.

• PennTree Bank: LSTM cell . Attention .

• Google Brain CIFAR-10 CNN .

• . NAS , NAS . inverse matrix

matrix (20,000 by 20,000) . NAS inversion .

• . . string lists .

• network . architecture . overfit .

Kiho Suh, Modulabs ( )

Page 27: [한국어] Neural Architecture Search with Reinforcement Learning

Penn Treebank Results

2x as fast

Kiho Suh, Modulabs ( )slide from Zoph, Le

Page 28: [한국어] Neural Architecture Search with Reinforcement Learning

RHN (Recurrent Highway Network)

Recurrent Highway Network, Zilly et al. 2016

Kiho Suh, Modulabs ( )

Page 29: [한국어] Neural Architecture Search with Reinforcement Learning

Comparison to Random Search

• Policy gradient random search .

• policy gradient .Kiho Suh, Modulabs ( )

Page 30: [한국어] Neural Architecture Search with Reinforcement Learning

Transfer Learning on Character Level Language Modeling

• Took the RNN cell that we evolved on Penn Treebank and tried it in on other datasets. Here, we tried on character level Penn Treebank datasets.

Kiho Suh, Modulabs ( )

Page 31: [한국어] Neural Architecture Search with Reinforcement Learning

Transfer Learning on Neural Machine Translation

LSTM Cell

Google Neural Machine Translation, Wu et al. 2016

Model WMT’14 en->de Test Set BLEU

GNMT LSTM 24.1

GNMT NAS Cell 24.6

• GNMT LSTM NAS cell .

• LSTM Hyperparameter(learning rate, weight initialization) .

• 0.5 BLEU . .

• 96 GPU 1 train .Kiho Suh, Modulabs ( )slide modified from Zoph, Le

Page 32: [한국어] Neural Architecture Search with Reinforcement Learning

Designing Neural Network Architectures Using RL

• 2016 11 30

• MetaQNN, a meta-modeling algorithm based on RL to automatically generate high-performing CNN architectures for a given learning.

• The learning agent is trained to sequentially choose CNN layers using Q-Learning with an epsilon-greedy exploration strategy and experience replay.

• The agent explores a large but finite space of possible architectures and iteratively discovers designs with improved performance on the learning task.

Kiho Suh, Modulabs ( )

Page 33: [한국어] Neural Architecture Search with Reinforcement Learning

UNDERSTANDING DEEP LEARNING REQUIRES RETHINKING GENERALIZATION (Zhang et al. 2016)

1.The effective capacity of neural networks is large enough for a brute-force memorization of the entire data set.

2.Even optimization on random labels remains easy. In fact, training time increases only by a small constant factor compared with training on the true labels.

3.Randomizing labels is solely a data transformation, leaving all other properties of the learning problem unchanged.

‘ ’ .

Kiho Suh, Modulabs ( )

Page 34: [한국어] Neural Architecture Search with Reinforcement Learning

Reference• https://arxiv.org/pdf/1611.01578.pdf

• https://arxiv.org/pdf/1608.06993.pdf

• https://arxiv.org/pdf/1606.04474.pdf

• https://arxiv.org/pdf/1512.03385.pdf

• https://arxiv.org/pdf/1611.02167.pdf

• https://arxiv.org/pdf/1607.03474.pdf

• https://arxiv.org/abs/1609.08144.pdf

• https://arxiv.org/abs/1602.07261.pdf

• https://openreview.net/pdf?id=Sy8gdB9xx

• https://futurism.com/googles-new-ai-is-better-at-creating-ai-than-the-companys-engineers/

• http://www.iclr.cc/doku.php?id=iclr2017:schedule

• http://techm.kr/bbs/?t=154

• https://medium.com/intuitionmachine/deep-learning-the-unreasonable-effectiveness-of-randomness-14d5aef13f87

• http://rll.berkeley.edu/deeprlcourse/docs/quoc_barret.pdf

• http://www.1-4-5.net/~dmm/ml/nnrnn.pdf

• https://blog.acolyer.org/2017/05/10/neural-architecture-search-with-reinforcement-learning/

Kiho Suh, Modulabs ( )

Page 35: [한국어] Neural Architecture Search with Reinforcement Learning

Appendix 1: REINFORCE in depth

likelihood parametrized by θ

Kiho Suh, Modulabs ( )note by David Meyer

Page 36: [한국어] Neural Architecture Search with Reinforcement Learning

Appendix 1: REINFORCE in depth

Kiho Suh, Modulabs ( )note by David Meyer

Page 37: [한국어] Neural Architecture Search with Reinforcement Learning

Appendix 2: Proof of the Policy Gradient Theorem

Kiho Suh, Modulabs ( )

http://ufal.mff.cuni.cz/~straka/courses/npfl114/2016/sutton-bookdraft2016sep.pdf

Page 269 in Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto

Page 38: [한국어] Neural Architecture Search with Reinforcement Learning

Appendix 2: Large-Scale Evolution of Image Classifiers

Large-Scale Evolution of Image Classifiers, Real et al. 2016

Kiho Suh, Modulabs ( )

Page 39: [한국어] Neural Architecture Search with Reinforcement Learning

Appendix 3: Inception V4

Inception V4, Szegedy et al. 2016

Kiho Suh, Modulabs ( )