
Lecture 6 - AutoML

Resources:

AutoML book: https://www.automl.org/book/

Section 1. Motivation

Success of deep learning: we see deployments everywhere and successes across many different domains and applications

However, there is a big problem: the performance of deep learning models is very sensitive to many hyperparameters!

Architectural hyperparameters

Optimization algorithm, learning rates, momentum, batch normalization, batch sizes, dropout rates, weight decay, data augmentation, …

Easily 20-50 design decisions

Section 2. Goal of AutoML

The current deep learning practice requires the expert to choose the architecture and the hyperparameters, train the deep learning model end-to-end, and then iterate

The promise of AutoML is true end-to-end learning


The learning box can contain multiple ML-related operations:
Clean and preprocess the data
Select/engineer better features
Select the model family
Set the hyperparameters
Construct ensembles of models
…
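
To make this concrete, here is a minimal sketch of such a learning box as a scikit-learn Pipeline. The specific steps and settings (StandardScaler, SelectKBest, RandomForestClassifier, and their parameters) are illustrative choices, not the ones an AutoML system would necessarily pick; AutoML's job is exactly to search over these decisions automatically.

    # A minimal sketch of the "learning box" as a scikit-learn pipeline.
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.feature_selection import SelectKBest
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler

    pipeline = Pipeline([
        ("preprocess", StandardScaler()),           # clean/preprocess the data
        ("features", SelectKBest(k=10)),            # select better features
        ("model", RandomForestClassifier(           # model family + hyperparameters
            n_estimators=100, max_depth=8, random_state=0)),
    ])

    X, y = load_breast_cancer(return_X_y=True)
    score = cross_val_score(pipeline, X, y, cv=5).mean()
    print(f"CV accuracy: {score:.3f}")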

Section 3. Modern Hyperparameter Optimization

Section 3.1 AutoML as Hyperparameter Optimization

What is hyperparameter optimization?

We have different types of hyperparameters: continuous (e.g., learning rate), discrete (e.g., number of units), and categorical (finite domain, e.g., algorithm, activation function, etc.)
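
To make the three types concrete, a minimal sketch that samples one configuration with plain Python (the names and ranges below are illustrative assumptions):

    # Sampling one value of each hyperparameter type.
    import random

    def sample_config(rng: random.Random) -> dict:
        return {
            # continuous: learning rate, sampled log-uniformly in [1e-5, 1e-1]
            "learning_rate": 10 ** rng.uniform(-5, -1),
            # discrete: number of units per layer
            "num_units": rng.randrange(32, 513),
            # categorical: activation function from a finite domain
            "activation": rng.choice(["relu", "tanh", "sigmoid"]),
        }

    rng = random.Random(0)
    print(sample_config(rng))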


We also have conditional hyperparameters: they are active only given another hyperparameter's value.

Example 1:
A = choice of optimizer (Adam or SGD)
B = Adam's second momentum hyperparameter (only active if A = Adam)

Example 2:
A = choice of classifier (RF or SVM)
B = SVM's kernel parameter (only active if A = SVM)

In the extreme case there is a single top-level hyperparameter, and all other hyperparameters are conditional on it.
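
The ConfigSpace library from automl.org expresses exactly such spaces. Here is a sketch of Example 1 using its classic API (exact signatures may differ between ConfigSpace versions):

    # Example 1 as a ConfigSpace search space with a condition.
    from ConfigSpace import ConfigurationSpace
    from ConfigSpace.conditions import EqualsCondition
    from ConfigSpace.hyperparameters import (
        CategoricalHyperparameter,
        UniformFloatHyperparameter,
    )

    cs = ConfigurationSpace(seed=0)

    # A = choice of optimizer (Adam or SGD)
    optimizer = CategoricalHyperparameter("optimizer", ["adam", "sgd"])
    # B = Adam's second momentum hyperparameter
    beta2 = UniformFloatHyperparameter("beta2", 0.9, 0.999)

    cs.add_hyperparameters([optimizer, beta2])
    # B is only active when A = Adam
    cs.add_condition(EqualsCondition(beta2, optimizer, "adam"))

    # Sampled configurations with optimizer=sgd simply omit beta2.
    for config in cs.sample_configuration(5):
        print(config)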

Section 3.2 Blackbox Optimization



In blackbox optimization we can only query the objective function (e.g., validation error as a function of the hyperparameters) point by point. Evaluating the function is expensive: each evaluation means training and validating a model (remember our previous discussions on model evaluation in the last class).

Methods for blackbox optimization:

Grid Search

Random Search (a minimal sketch follows this list)

Bayesian optimization
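
As promised, a minimal random-search sketch over a toy objective. In a real run, evaluate_config (a hypothetical name) would train and validate a model, which is the expensive part:

    # Random search with a fixed evaluation budget.
    import math
    import random

    def evaluate_config(config: dict) -> float:
        # Hypothetical stand-in for "train + validate"; the real objective
        # is a blackbox we can only query point by point.
        return (math.log10(config["learning_rate"]) + 3) ** 2 \
            + 0.01 * config["num_units"] / 512

    rng = random.Random(0)
    best_config, best_loss = None, float("inf")
    for _ in range(20):                       # fixed evaluation budget
        config = {
            "learning_rate": 10 ** rng.uniform(-5, -1),
            "num_units": rng.randrange(32, 513),
        }
        loss = evaluate_config(config)        # expensive in real life
        if loss < best_loss:
            best_config, best_loss = config, loss

    print(best_loss, best_config)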


See lecture notes for details

Bayesian optimization can be very slow; there are different methods to speed things up!
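
For intuition, here is a minimal Bayesian-optimization loop on a 1-D toy problem, using scikit-learn's Gaussian process regressor and the expected-improvement acquisition function. The objective, kernel, and budget are illustrative assumptions, and none of the speed-up methods are included:

    # Minimal Bayesian optimization with a GP surrogate and EI acquisition.
    import numpy as np
    from scipy.stats import norm
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import Matern

    def objective(x):                     # toy stand-in for the expensive blackbox
        return np.sin(3 * x) + 0.1 * x ** 2

    grid = np.linspace(-3, 3, 400).reshape(-1, 1)    # candidate points
    X = np.array([[-2.0], [0.5], [2.5]])             # initial design
    y = objective(X).ravel()

    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    for _ in range(10):
        gp.fit(X, y)                                 # fit surrogate to observations
        mu, sigma = gp.predict(grid, return_std=True)
        best = y.min()
        # expected improvement for minimization
        z = (best - mu) / np.maximum(sigma, 1e-9)
        ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)
        x_next = grid[np.argmax(ei)].reshape(1, -1)  # most promising point
        X = np.vstack([X, x_next])                   # evaluate expensive function
        y = np.append(y, objective(x_next).ravel())

    print(f"best x={X[np.argmin(y)][0]:.3f}, f={y.min():.3f}")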

Section 3.3 Beyond Hyperparameter Optimization

See: https://media.neurips.cc/Conferences/NIPS2018/Slides/hutter-vanschoren-part1-2.pdf

Section 4. Neural Architecture Search Spaces


Neural architecture search uses reinforcement learning combined with a recurrent neural network controller to learn better architectures

Read:
https://arxiv.org/pdf/1611.01578.pdf
http://rll.berkeley.edu/deeprlcoursesp17/docs/quoc_barret.pdf
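
For intuition only, a toy REINFORCE sketch of the controller idea: sample architecture choices, score them, and push the policy toward higher rewards. The real controller in the paper is an RNN trained on actual validation accuracy; the flat softmax policy and fake_validation_accuracy below are simplifications for illustration:

    # Toy REINFORCE controller over a tiny architecture search space.
    import numpy as np

    choices = {"num_layers": [2, 4, 8], "width": [64, 128, 256]}
    rng = np.random.default_rng(0)
    logits = {k: np.zeros(len(v)) for k, v in choices.items()}  # one softmax per decision

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    def fake_validation_accuracy(arch):
        # Hypothetical stand-in: in reality we would train the sampled
        # architecture and measure accuracy on held-out data.
        return 0.7 + 0.05 * (arch["num_layers"] == 4) + 0.1 * (arch["width"] == 256)

    baseline, lr = 0.0, 0.5
    for step in range(200):
        # controller samples one architecture
        idx = {k: rng.choice(len(v), p=softmax(logits[k])) for k, v in choices.items()}
        arch = {k: choices[k][i] for k, i in idx.items()}
        reward = fake_validation_accuracy(arch)
        baseline = 0.9 * baseline + 0.1 * reward       # moving-average baseline
        # REINFORCE: grad log pi(a) * (reward - baseline), per decision
        for k, i in idx.items():
            grad = -softmax(logits[k])
            grad[i] += 1.0
            logits[k] += lr * (reward - baseline) * grad

    best = {k: choices[k][int(np.argmax(logits[k]))] for k in choices}
    print("controller's preferred architecture:", best)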
