TRANSCRIPT
Empirical Results
• Almost perfect for low-dimensional benchmarks and still acceptable for higher dimensions
• Reduce benchmark overhead to <1 sec
Answer: Surrogates work well, especially those based on Random Forests
Efficient Benchmarking of Hyperparameter
Optimizers via Surrogates
Katharina Eggensperger and Frank Hutter University of Freiburg
{eggenspk,fh}@cs.uni-freiburg.de
Holger H. Hoos and Kevin Leyton-Brown University of British Columbia
{hoos, kevinlb}@cs.ubc.ca
What is Bayesian Optimization?
…in 30 seconds
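To ground the heading, here is a minimal sketch of the Bayesian optimization loop: fit a probabilistic model to the observations so far, pick the next configuration by maximizing an acquisition function, and evaluate the expensive objective there. The function names, the scikit-learn model choice, and the random-candidate acquisition optimization are illustrative assumptions, not taken from the poster.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def expected_improvement(model, candidates, y_best):
    # EI acquisition: expected improvement over the incumbent (minimization).
    mu, sigma = model.predict(candidates, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (y_best - mu) / sigma
    return (y_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def bayesian_optimization(f, bounds, n_init=5, n_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    X = rng.uniform(lo, hi, size=(n_init, len(bounds)))   # initial random design
    y = np.array([f(x) for x in X])
    for _ in range(n_iter):
        model = GaussianProcessRegressor().fit(X, y)      # 1. fit model to observations
        cand = rng.uniform(lo, hi, size=(1000, len(bounds)))
        x_next = cand[np.argmax(expected_improvement(model, cand, y.min()))]  # 2. maximize acquisition
        X = np.vstack([X, x_next])                        # 3. evaluate expensive objective
        y = np.append(y, f(x_next))
    return X[np.argmin(y)], y.min()
```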
• Benchmarking hyperparameter optimization methods is costly, as it requires many evaluations of the underlying benchmark problem.
• We propose cheap but realistic surrogate benchmarks based on predictive performance models.
• Our benchmarks are publicly available and allow extensive white-box tests and fast evaluation of optimization algorithms.
Experiments
• Collect data by conducting optimization runs and random search
• Evaluate quality of regression models
• Compare optimizer performance on true vs. surrogate benchmarks
Regression Model
Evaluate different regression models with respect to the time needed to train/predict and the prediction quality, and select promising models (see the sketch after this list):
• Tree-based models
• Gaussian Processes
• Support Vector Regression
• k-Nearest Neighbours
• Linear Regression
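A sketch of this comparison using scikit-learn stand-ins for the model families above, measuring fit/predict time and RMSE (the specific model classes, hyperparameters, and metric are assumptions; the poster's exact setup may differ):

```python
import time
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.svm import SVR
from sklearn.neighbors import KNeighborsRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

MODELS = {
    "Random Forest":    RandomForestRegressor(n_estimators=100, random_state=0),
    "Gaussian Process": GaussianProcessRegressor(),
    "SVR":              SVR(),
    "k-NN":             KNeighborsRegressor(),
    "Linear":           LinearRegression(),
}

def compare_models(X, y):
    """X: hyperparameter configurations, y: observed performance (e.g. validation error)."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    for name, model in MODELS.items():
        t0 = time.perf_counter(); model.fit(X_tr, y_tr); t_fit = time.perf_counter() - t0
        t0 = time.perf_counter(); pred = model.predict(X_te); t_pred = time.perf_counter() - t0
        rmse = mean_squared_error(y_te, pred) ** 0.5
        print(f"{name:16s} fit={t_fit:6.2f}s  predict={t_pred:.4f}s  RMSE={rmse:.4f}")
```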
Surrogate Benchmarks
• Train a regression model
• Replace each call to the true benchmark function with a surrogate prediction (see the sketch after this list)
• Evaluate the surrogate benchmark and compare to optimizer performance on the real benchmark
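A minimal sketch of such a drop-in replacement, assuming the optimizer treats the benchmark as a plain callable (`run_optimizer` is a hypothetical entry point, not the poster's API):

```python
from sklearn.ensemble import RandomForestRegressor

class SurrogateBenchmark:
    """Stands in for an expensive benchmark: same call interface, near-instant evaluation."""

    def __init__(self, configs, performances):
        # Fit once on <configuration, performance> pairs collected offline.
        self.model = RandomForestRegressor(n_estimators=100).fit(configs, performances)

    def __call__(self, config):
        # Optimizers call this exactly as they would call the true benchmark.
        return float(self.model.predict([config])[0])

# surrogate = SurrogateBenchmark(logged_configs, logged_performances)
# run_optimizer(surrogate, budget=100)   # hypothetical optimizer entry point
```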
Problems with existing benchmark functions
Realistic benchmark problems:
+ Complex & interesting
- Expensive to evaluate
- Complicated to set up (libraries, dependencies, special hardware, etc.)
Examples: Deep Neural Networks, onlineLDA, AutoWEKA
Synthetic test functions:
+ Easy to set up
+ Cheap to evaluate
- Unrealistic shape & too smooth
Examples: BBOB, Table-Lookups, Branin
Figures: Best performance found by different optimizers over time on the logistic regression benchmark. We show the median and quartiles across 10 runs on the real benchmark (top) and on surrogate benchmarks based on random forests (left, top) and Gaussian Processes (left, bottom).
Benefit
Our surrogate benchmarks …
• provide a 60× to 3600× speedup
• require negligible computational resources
• allow unit tests (see the sketch below)
• help to analyze how optimizers work on complex benchmarks
… and are publicly available.
For more information visit: www.automl.org/benchmarks.html
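For example, because a full optimization run on a surrogate costs only seconds, it can serve as an end-to-end unit test in a CI suite. A sketch reusing the `SurrogateBenchmark` class from above; `my_optimizer` and the synthetic training data are hypothetical stand-ins:

```python
import numpy as np

def test_optimizer_beats_random_search():
    # Synthetic stand-in for logged <configuration, performance> data.
    rng = np.random.default_rng(0)
    X = rng.uniform(0, 1, size=(500, 2))
    y = ((X - 0.3) ** 2).sum(axis=1)       # toy landscape, optimum near (0.3, 0.3)
    surrogate = SurrogateBenchmark(X, y)   # cheap drop-in for the true benchmark

    best_found = my_optimizer(surrogate, bounds=[(0, 1)] * 2, budget=100)  # hypothetical API
    random_best = min(surrogate(x) for x in rng.uniform(0, 1, size=(100, 2)))
    assert best_found <= random_best       # optimizer should beat plain random search
```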
Our contribution
Example usage: comparing three optimizers on a single benchmark, running each optimizer 10 times with a budget of 100 configuration evaluations, requires 3 × 10 × 100 = 3000 evaluations of f(λ).
True benchmark: 3000 × 1 min ≈ 50 hours.
Surrogate benchmarks: 3000 × 1 sec = 50 minutes.
Implementation publicly available: https://github.com/KEggensperger/SurrogateBenchmarks
…in 15 sec
Hyperparameter optimization is essential, and a wide range of methods exists for automating this process. However, empirically comparing these hyperparameter optimization methods can be very costly, as it requires many evaluations of the considered benchmark problem.
To speed up benchmarking, we propose a new option: surrogate benchmarks, which feature the complexity of expensive benchmark functions but require negligible computational resources.
Figure: Prediction quality of different regression models on configurations evaluated by the optimizer SMAC.
Our Approach
1. Collect <configuration, performance> pairs from different benchmarks (see the sketch after this list)
2. Train a regression model
3. Replace each call to the benchmark function with a model prediction
Question: How well does this work?
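A sketch of step 1: wrap the true (expensive) benchmark so that every evaluation made during optimization runs or random search is logged as a <configuration, performance> pair. The wrapper is an illustrative assumption, not the poster's implementation:

```python
import numpy as np

class LoggingBenchmark:
    """Wraps the true benchmark and records every evaluation for later surrogate training."""

    def __init__(self, true_benchmark):
        self.f = true_benchmark
        self.configs, self.performances = [], []

    def __call__(self, config):
        performance = self.f(config)              # the expensive call
        self.configs.append(np.asarray(config))   # log the <configuration, performance> pair
        self.performances.append(performance)
        return performance

# Run optimizers and random search through the wrapper, then train the
# regression model (step 2) on wrapper.configs / wrapper.performances.
```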
Setup
8 regression models, including:
• Tree-based models
• Gaussian Processes
• Support Vector Regression
• k-Nearest Neighbours
• Linear Regression
9 benchmark functions, e.g.:
• Logistic Regression
• (Deep) Neural Networks
• onlineLDA
Data
Collect data by conducting optimization runs and random search on the actual benchmark functions:
• Logistic Regression
• (Deep) Neural Network
• onlineLDA
Figures: Best performance found by different optimizers over time. We show the median and quartiles across 10 runs on the real benchmark (left) and on surrogate benchmarks based on random forests (middle) and Gaussian Processes (right).
[Figure row labels: Logistic Regression; HP-DBNet mrbi]
Table: Properties of the benchmarks for which we provide surrogate benchmarks.
Table: Empirical comparison of three optimizers on various real and surrogate-based benchmarks.
SMAC: F. Hutter, H. Hoos, and K. Leyton-Brown. Sequential model-based optimization for general algorithm configuration. 2011.
TPE: J. Bergstra, R. Bardenet, Y. Bengio, and B. Kégl. Algorithms for hyper-parameter optimization. 2011.
Spearmint: J. Snoek, H. Larochelle, and R. Adams. Practical Bayesian optimization of machine learning algorithms. 2012.
HPOlib: K. Eggensperger, M. Feurer, F. Hutter, J. Bergstra, J. Snoek, H. Hoos, and K. Leyton-Brown. Towards an empirical foundation for assessing Bayesian optimization of hyperparameters. 2013.