Derivative-Free Optimization (DFO) Introduction · 2020-01-20


Derivative-Free Optimization (DFO) Introduction

MTH8418

S. Le Digabel, Polytechnique Montréal

Winter 2020 (v2)

MTH8418: Introduction 1/44


Plan

Problem definition

Algorithms

Example 1

Example 2

Example 3

Example 4

Example 5

Problem definition

General optimization problem

Optimization is the field that studies problems of the form

$$\min_{x \in X} \{ f(x) : x \in \Omega \}$$

where

• X is an n-dimensional space corresponding to the optimization variables. Typically, in continuous optimization, $X = \mathbb{R}^n$

• $\Omega \subseteq X$ is the set of feasible points, defined by constraints

• The objective function f is evaluated at points of X


Optimization terms

• Local optimum vs global optimum

• Exact solution vs heuristic solution

• Different types of optimization:
  ◦ Discrete optimization
  ◦ Continuous optimization
  ◦ Linear optimization
  ◦ Nonlinear optimization
  ◦ Derivative-free optimization
  ◦ Blackbox optimization


Blackbox optimization problems

Slight reformulation of the general optimization problem:

• Optimization problem:

$$\min_{x \in \Omega} f(x)$$

• Evaluations of f (the objective function) and of the functions defining Ω are usually the result of a computer code (a blackbox)

• n variables, m general constraints

• $\Omega = \{ x \in X : c_j(x) \le 0,\ j \in \{1, 2, \ldots, m\} \} \subseteq \mathbb{R}^n$

• X: Bounds and/or nonquantifiable constraints (typically)
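As a minimal illustration of this formulation, the sketch below treats a small Python function as the blackbox: it returns f(x) together with the constraint values c_j(x), and a point is in Ω when it lies in X and every c_j(x) ≤ 0. The function `toy_blackbox`, its two constraints, and the choice of X as a box are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def toy_blackbox(x: Sequence[float]) -> Tuple[float, List[float]]:
    """Hypothetical blackbox: returns f(x) and the list [c_1(x), c_2(x)]."""
    f = (x[0] - 1.0) ** 2 + (x[1] + 0.5) ** 2  # objective value f(x)
    c = [
        x[0] + x[1] - 1.0,  # c_1(x) <= 0
        -x[0],              # c_2(x) <= 0
    ]
    return f, c


def in_X(x: Sequence[float], lower: Sequence[float], upper: Sequence[float]) -> bool:
    """Assumption: X is a box defined by lower and upper bounds."""
    return all(lo <= xi <= up for xi, lo, up in zip(x, lower, upper))


def is_feasible(x: Sequence[float], lower: Sequence[float], upper: Sequence[float]) -> bool:
    """x is in Omega iff x is in X and c_j(x) <= 0 for every j."""
    if not in_X(x, lower, upper):
        return False  # do not even call the blackbox outside X
    _, c = toy_blackbox(x)
    return all(cj <= 0.0 for cj in c)
```

Checking membership in X before calling the blackbox reflects the usual split between cheap bound checks and the constraints that require a blackbox evaluation.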


Blackboxes as illustrated by a Boeing engineer


BBO vs DFO vs SO

• “Blackbox Optimization (BBO) is the study of design and analysis of algorithms that assume the objective and/or constraints functions are given by blackboxes” [Audet and Hare, 2017]
  ◦ A simulation, or a blackbox, is involved
  ◦ Obj./constraints may be analytical functions of the outputs
  ◦ Derivatives may be available (ex.: PDEs)
  ◦ Sometimes referred to as Simulation-Based Optimization (SBO)

• “Derivative-Free Optimization (DFO) is the mathematical study of optimization algorithms that do not use derivatives” [Audet and Hare, 2017]
  ◦ Optimization without using derivatives
  ◦ Derivatives may exist but are not available
  ◦ Obj./constraints may be analytical or given by a blackbox

• Simulation Optimization (SO):
  ◦ SO ≠ SBO
  ◦ Stochastic properties of the simulation are exploited


Global view


Extensions

• Global optimization

• Multiobjective optimization

• Stochastic optimization

• Robust optimization

• ...

Algorithms

Typical setting

[Figure: starting from x0, the algorithm sends trial points x1, x2, ... to the blackbox f, receives f(x1), f(x2), ... in return, and outputs x*]

Unconstrained case, with one initial starting solution
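The interaction described on this slide, where an algorithm sends trial points to the blackbox and receives values back, can be sketched as a thin wrapper around the blackbox. The class below is a hypothetical helper (its name and behavior are assumptions, not part of the course material): it counts evaluations against a budget, caches already evaluated points, and returns infinity when the blackbox crashes or returns NaN.

```python
import math
from typing import Callable, Dict, Sequence, Tuple


class BudgetedBlackbox:
    """Hypothetical wrapper around an expensive function f: R^n -> R."""

    def __init__(self, func: Callable[[Sequence[float]], float], max_evals: int):
        self.func = func
        self.max_evals = max_evals
        self.n_evals = 0  # number of true blackbox calls so far
        self.cache: Dict[Tuple[float, ...], float] = {}

    def __call__(self, x: Sequence[float]) -> float:
        key = tuple(x)
        if key in self.cache:
            return self.cache[key]  # cached points cost no evaluation
        if self.n_evals >= self.max_evals:
            raise RuntimeError("evaluation budget exhausted")
        self.n_evals += 1
        try:
            value = float(self.func(x))
        except Exception:
            value = math.inf  # a failed evaluation is treated as infinitely bad
        if math.isnan(value):
            value = math.inf
        self.cache[key] = value
        return value
```

Any optimization routine that only needs function values can then be given the wrapped object in place of the raw blackbox.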


Algorithms for blackbox optimization

A method for blackbox optimization should ideally:

• Be efficient given a limited budget of evaluations

• Be robust to noise and blackbox failures

• Natively handle general constraints

• Have convergence properties ensuring first-order local optimality in the smooth case; otherwise, why use it on more complicated problems?

• Easily exploit parallelism

• Deal with multiobjective optimization

• Deal with integer and categorical variables

• Have a publicly available implementation


Families of methods

• “Computer science” methods:
  ◦ Heuristics such as genetic algorithms
  ◦ No convergence properties
  ◦ Cost a lot of evaluations
  ◦ Should be used only as a last resort for desperate cases

• Statistical methods:
  ◦ Design of experiments: out of date compared to modern DFO methods
  ◦ EGO algorithm based on surrogates and expected improvement
  ◦ Still limited in terms of dimension
  ◦ Does not natively handle constraints
  ◦ Better to use these tools in conjunction with DFO methods

• Derivative-Free Optimization methods (DFO)
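To make the expected improvement criterion mentioned for EGO more concrete, here is a minimal sketch of its usual closed form for minimization, assuming the surrogate predicts a Gaussian value with mean `mu` and standard deviation `sigma` at a candidate point, and `f_best` is the best objective value found so far. The function name and argument names are illustrative only.

```python
import math


def expected_improvement(mu: float, sigma: float, f_best: float) -> float:
    """Closed-form E[max(f_best - Y, 0)] for Y ~ N(mu, sigma^2) (minimization)."""
    if sigma <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(f_best - mu, 0.0)
    z = (f_best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    return (f_best - mu) * cdf + sigma * pdf
```

The first term rewards points predicted to be better than `f_best`, and the second rewards points where the surrogate is uncertain.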


DFO methods

• Model-based methods:
  ◦ Derivative-Free Trust-Region (DFTR) methods
  ◦ Based on quadratic models or radial-basis functions
  ◦ Use of a trust-region
  ◦ Better for DFO \ BBO
  ◦ Not resilient to noise and hidden constraints
  ◦ Not easy to parallelize

• Direct-search methods:
  ◦ Classical methods: Coordinate search, Nelder-Mead (the other simplex method)
  ◦ Modern methods: Generalized Pattern Search (GPS), Generating Set Search (GSS), Mesh Adaptive Direct Search (MADS)

So far, the size of the instances (variables and constraints) is typically limited to ≃ 50, and we target local optimization
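As an illustration of the simplest direct-search method listed above, the following is a minimal sketch of coordinate search for an unconstrained problem. It polls the points x ± step·e_i, moves as soon as one of them improves the objective, and halves the step when no poll point improves. The parameter names and default values are assumptions made for this sketch, not settings prescribed by the course.

```python
from typing import Callable, List, Sequence, Tuple


def coordinate_search(
    f: Callable[[Sequence[float]], float],
    x0: Sequence[float],
    step: float = 1.0,
    tol: float = 1e-6,
    max_evals: int = 1000,
) -> Tuple[List[float], float, int]:
    """Minimize f from x0 using only function values (no derivatives)."""
    x = list(x0)
    fx = f(x)
    n_evals = 1
    while step > tol and n_evals < max_evals:
        improved = False
        for i in range(len(x)):
            for direction in (1.0, -1.0):
                if n_evals >= max_evals:
                    break
                trial = list(x)
                trial[i] += direction * step  # poll along +/- e_i
                f_trial = f(trial)
                n_evals += 1
                if f_trial < fx:
                    x, fx = trial, f_trial   # accept the first improvement
                    improved = True
                    break                    # go on to the next coordinate
        if not improved:
            step *= 0.5  # unsuccessful iteration: refine the step
    return x, fx, n_evals
```

For example, `coordinate_search(lambda x: (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2, [0.0, 0.0])` reaches the minimizer (1, -2) of this smooth quadratic; the stopping rule on the step size plays the role of a convergence test.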

Example 1

Snow Water Equivalent (SWE) estimation

• [Alarie et al., 2013]; context as discussed on Radio-Canada

• An accurate estimate of the water stored in snow is crucial to optimize the management of hydroelectric plants

• Exact snow measurement is impossible

• SWE is measured at specific sites and then interpolated over the territory

• The territory is huge: Hydro-Quebec (HQ) operates 565 dams, 75 reservoirs, and 56 hydroelectric power plants, located over 90 watersheds and covering more than 550,000 km²


Previous SWE estimation

• Done manually by weighing snow cores at specific sites

• Each measurement campaign requires 2 weeks

• Missing measurements due to adverse meteorological conditions


GMON device

• A new measuring instrument that provides daily automatic SWE measurements

• GMON for Gamma-MONitoring device: it measures the natural Gamma radiation emitted from the soil

• Communicates via satellites


SWE estimation from GMON measures

• Kriging interpolation is used to obtain an SWE estimate together with an error map

• How to find the device locations that minimize the kriging interpolation error of the SWE?


Problem formulation

• $x \in \mathbb{R}^{2N}$ are the locations of N stations

• Typically, N ≤ 10, so we do not consider it as a variable

• $\Omega \subseteq \mathbb{R}^2$ is the feasible domain where the stations can be located

• f(x) is a score based on the standard deviation map obtained by the kriging simulation and is considered as a blackbox

• Each simulation requires ≃ 2 seconds, and can only be launched within the Hydro-Quebec research center (IREQ)
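A small sketch may help picture the decision vector: x stacks the coordinates of the N stations, and each station must fall in the feasible domain. The interleaved layout (x1, y1, x2, y2, ...) and the description of the domain as a union of rectangles are assumptions made for illustration; the actual domain is not described this way in the source.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def decode_stations(x: Sequence[float]) -> List[Point]:
    """Split x in R^(2N) into N station locations (assumed interleaved layout)."""
    if len(x) % 2 != 0:
        raise ValueError("x must contain an even number of coordinates")
    return [(x[2 * i], x[2 * i + 1]) for i in range(len(x) // 2)]


def station_allowed(p: Point, zones: Sequence[Rect]) -> bool:
    """A station is allowed if it lies in at least one hypothetical zone."""
    return any(
        xmin <= p[0] <= xmax and ymin <= p[1] <= ymax
        for (xmin, ymin, xmax, ymax) in zones
    )


def all_stations_allowed(x: Sequence[float], zones: Sequence[Rect]) -> bool:
    """Feasibility of the whole design: every station must be in the domain."""
    return all(station_allowed(p, zones) for p in decode_stations(x))
```

A domain made of several disjoint zones is what makes the feasible set fragmented from the optimizer's point of view.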


Constraints

• GMON stations cannot be located just anywhere

• Restrictions on:
  ◦ subsoil properties
  ◦ slope
  ◦ vegetation
  ◦ exploitation
  ◦ etc.

• Highly fragmented domain

Example 2

Calibration of a hydrologic model [Minville et al., 2014]

credit: NASA

Evaporation + Transpiration = Evapotranspiration (ETR)


Objectives

• Define a calibration (= parameter optimization) approach in order to improve the transposability of the hydrologic model

• A transposable model should adequately reproduce hydrologic processes when it is used with data other than those used to obtain the parameters (e.g. climate change)

• Emphasis on a realistic representation of ETR

• Characteristics of the optimization problem: Nonsmoothness, multiple regions of attraction, and many local optima within each region of attraction


The model

• HSAMI (Service hydrometeorologique apports modules intermediaires) [Bisson and Roberge, 1983, Fortin, 1999]

• Hydrologic model developed by and used at Hydro-Quebec

• 23 parameters: Optimization variables

• One evaluation takes ≃ 1-2 seconds

• We compare the simulated and observed streamflows and minimize the Nash-Sutcliffe criterion

$$\frac{\sum_{t=1}^{T} (Q_t^o - Q_t^s)^2}{\sum_{t=1}^{T} (Q_t^o - \bar{Q}^o)^2}$$

where $Q_t^o$ and $Q_t^s$ are the observed and simulated streamflows at time t, and $\bar{Q}^o$ is the mean observed streamflow

• Cross-validation typically over half the data
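The Nash-Sutcliffe ratio on this slide can be computed directly from the two streamflow series. The sketch below is a plain implementation of that ratio, with hypothetical function and argument names; the commonly reported Nash-Sutcliffe efficiency is one minus this ratio, so minimizing the ratio is equivalent to maximizing the efficiency.

```python
from typing import Sequence


def nash_sutcliffe_ratio(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Sum of squared errors divided by the variability of the observations."""
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    numerator = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    denominator = sum((o - mean_obs) ** 2 for o in observed)
    if denominator == 0.0:
        raise ValueError("observed series is constant; the ratio is undefined")
    return numerator / denominator
```

A value of 0 means a perfect fit, and a value of 1 means the simulation does no better than predicting the mean observed streamflow at every time step.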


Definition of the ETR constraint

Calibration of the ETR is achieved by considering a climatic model (MRCC) for known values of P, T, and ETR

Example 3

#2: Aircraft takeoff trajectories

• [Torres et al., 2011]

• AIRBUS problem involving (among others): O. Babando, C. Bes, J. Chaptal, J.-B. Hiriart-Urruty, B. Talgorn, B. Tessier, and R. Torres

• Biobjective optimization

• Must execute on different platforms including some old Solaris distributions


Definition of the optimization problem

• Concept: Optimization of the vertical flight path based on procedures designed to reduce noise emission at departure to protect the airport vicinity

• Minimization of environmental and economic impact: Noise and fuel consumption

• NADP (Noise Abatement Departure Procedure), variables: During the departure phase, the aircraft will target its climb configuration:
  ◦ Increase the speed up to climb speed (acceleration phase)
  ◦ Reduce the engine rate to climb thrust (reduction phase)
  ◦ Gain altitude
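Because this problem has two objectives to minimize (noise and fuel consumption), the result of an optimization is a set of trade-off points rather than a single solution. The sketch below shows the standard Pareto dominance test and a simple filter that keeps the nondominated points; the function names and the numeric values in the usage note are hypothetical.

```python
from typing import List, Sequence, Tuple

Objectives = Tuple[float, ...]  # e.g. (noise, fuel), both to be minimized


def dominates(a: Objectives, b: Objectives) -> bool:
    """a dominates b if a is no worse in every objective and strictly better in one."""
    no_worse = all(ai <= bi for ai, bi in zip(a, b))
    strictly_better = any(ai < bi for ai, bi in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: Sequence[Objectives]) -> List[Objectives]:
    """Keep the points that no other evaluated point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For instance, among the hypothetical (noise, fuel) pairs (1, 5), (2, 3), (3, 4) and (4, 1), the point (3, 4) is dominated by (2, 3) and the other three form the front.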


Parametric Trajectory: 5 optimization variables (*)


The blackbox: MCDP: Multi-Criteria Departure Procedure

One evaluation ≃ 2 seconds

Example 4

Characterization of objects from radiographs - LANL

We want to identify an unknown object inside a box, using an x-ray source that gives an image on a detector

LA-UR-11-0342

In this work, the unknown object is assumed to be spherical


Radiograph

A radiograph is the observed image on the detector. For example:


Description of the problem

• The problem consists in identifying the unknown object with sufficient precision so that the object can be classified as dangerous or not

• Must work rapidly

• Must work for radiographs not created in a well-controlled experimental environment

• Must not crash for unreasonable user inputs


Definition of the optimization problem

• Variables:
  ◦ They represent a spherical object
  ◦ Categorical variables: Number of layers and type of material of each layer
  ◦ Continuous variables: Radius of each layer
  ◦ The number of variables can change depending on the number of layers

• Objective function:
  ◦ A score associated with the difference between the observed image on the detector and a simulated image obtained from the candidate object (inverse problem)
  ◦ A numerical code (the blackbox) produces this simulated radiograph, using ray tracing
  ◦ Quick to compute
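One possible way to picture these variables is a small data structure holding one material and one radius per layer, so that the number of entries changes with the number of layers. The class below and the sum-of-squared-differences score are assumptions made for illustration only; the source does not specify how the score is defined, and the ray-tracing blackbox is not reproduced here.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class LayeredSphere:
    """Hypothetical candidate object: concentric layers, innermost first."""
    materials: List[str]  # categorical: one material per layer
    radii: List[float]    # continuous: outer radius of each layer

    def __post_init__(self) -> None:
        if len(self.materials) != len(self.radii):
            raise ValueError("one radius is needed per layer")
        if any(r <= 0.0 for r in self.radii):
            raise ValueError("radii must be positive")
        if any(a >= b for a, b in zip(self.radii, self.radii[1:])):
            raise ValueError("radii must be strictly increasing")

    @property
    def n_layers(self) -> int:
        return len(self.materials)


def image_score(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Assumed score: sum of squared pixel differences between flattened images."""
    if len(observed) != len(simulated):
        raise ValueError("images must have the same number of pixels")
    return sum((o - s) ** 2 for o, s in zip(observed, simulated))
```

Changing the categorical variable `n_layers` changes the length of both lists, which is the variable-dimension aspect noted in the list above.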

Example 5

Hyper-Parameters Optimization (HPO)

• PhD project of Dounia Lakhmiri

• We focus on the HPO of deep neural networks

• Our advantages:
  ◦ Blackbox optimization problem: One blackbox call = One training + one validation for a fixed set of hyper-parameters
  ◦ Presence of categorical variables (ex.: number of layers)
  ◦ Existing methods are mostly heuristics (grid search, random search, GAs, GPs, etc.)

• Average results on MNIST:
  ◦ Random search: 94.0%
  ◦ RBFOpt [Diaz et al., 2017]: 95.7%
  ◦ HYPERNOMAD: 95.4% (w/o categorical), 97.5% (categorical)


CIFAR-10 (1/2)

• $5 n_1 + n_2 + 10$ variables:
  ◦ 2 categorical variables: $n_1$ (number of convolution layers) and $n_2$ (number of fully connected layers)
  ◦ Type of optimizer, 4 HPs related to the optimizer (example: learning rate, momentum, weight decay, dampening), dropout rate, activation function, batch size, number of epochs
  ◦ For each layer: n output, kernel size, stride, padding, pooling and output size

• Training with 50,000 images, validation on 10,000 images

• One evaluation (training+test) ≃ 2 hours (CPU: i7-6700 @ 3.4 GHz, GPU: Nvidia GeForce GTX 1070)

• Current best solution: 96.58%
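The variable count stated on this slide depends on the two categorical variables, so the dimension of the problem changes during the optimization. A minimal sketch of that arithmetic, using a hypothetical helper name:

```python
def n_hyperparameters(n1: int, n2: int) -> int:
    """Number of variables 5*n1 + n2 + 10 for n1 convolution layers
    and n2 fully connected layers, as stated on the slide."""
    if n1 < 0 or n2 < 0:
        raise ValueError("layer counts must be non-negative")
    return 5 * n1 + n2 + 10
```

For example, a network with 2 convolution layers and 1 fully connected layer gives 5·2 + 1 + 10 = 21 variables, while 6 convolution layers and 3 fully connected layers give 43.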


CIFAR-10 (2/2): Comparison with Hyperopt

[Figure: two plots of the objective against the number of evaluations (0 to 100), each showing all valid evaluations and the successes. Panel titles: "CIFAR-10 with HYPERNOMAD" and "CIFAR-10 with TPE from Hyperopt"]


References

Alarie, S., Audet, C., Garnier, V., Le Digabel, S., and Leclaire, L.-A. (2013). Snow water equivalent estimation using blackbox optimization. Pacific Journal of Optimization, 9(1):1–21.

Audet, C. and Dennis, Jr., J. (2006). Mesh Adaptive Direct Search Algorithms for Constrained Optimization. SIAM Journal on Optimization, 17(1):188–217.

Audet, C. and Hare, W. (2017). Derivative-Free and Blackbox Optimization. Springer Series in Operations Research and Financial Engineering. Springer International Publishing, Berlin.

Bisson, J. and Roberge, F. (1983). Previsions des apports naturels: Experience d'Hydro-Quebec. In Proceedings IEEE/Workshop on Flow Predictions, Toronto, On.

Conn, A., Scheinberg, K., and Vicente, L. (2009). Introduction to Derivative-Free Optimization. MOS-SIAM Series on Optimization. SIAM, Philadelphia.

Diaz, G., Fokoue, A., Nannicini, G., and Samulowitz, H. (2017). An effective algorithm for hyperparameter optimization of neural networks. IBM Journal of Research and Development, 61(4):9:1–9:11.


Fermi, E. and Metropolis, N. (1952). Numerical solution of a minimum problem. Los Alamos Unclassified Report LA–1492, Los Alamos National Laboratory, Los Alamos, USA.

Fortin, V. (1999). Le Modele Meteo-Apport HSAMI: Historique, Theorie et Application. Technical Report IREQ-1999-0255, Institut de recherche d'Hydro-Quebec, Varennes, Qc.

Jones, D., Schonlau, M., and Welch, W. (1998). Efficient Global Optimization of Expensive Black Box Functions. Journal of Global Optimization, 13(4):455–492.

Kolda, T., Lewis, R., and Torczon, V. (2003). Optimization by direct search: New perspectives on some classical and modern methods. SIAM Review, 45(3):385–482.

Lakhmiri, D. (2019). HyperNOMAD. https://github.com/DouniaLakhmiri/HyperNOMAD.

Lakhmiri, D., Le Digabel, S., and Tribes, C. (2019). HyperNOMAD: Hyperparameter optimization of deep neural networks using mesh adaptive direct search. Technical Report G-2019-46, Les cahiers du GERAD.


Minville, M., Cartier, D., Guay, C., Leclaire, L.-A., Audet, C., Le Digabel, S., and Merleau, J. (2014). Improving process representation in conceptual hydrological model calibration using climate simulations. Water Resources Research, 50:5044–5073.

Nelder, J. and Mead, R. (1965). A simplex method for function minimization. The Computer Journal, 7(4):308–313.

Torczon, V. (1997). On the convergence of pattern search algorithms. SIAM Journal on Optimization, 7(1):1–25.

Torres, R., Bes, C., Chaptal, J., and Hiriart-Urruty, J.-B. (2011). Optimal, Environmentally-Friendly Departure Procedures for Civil Aircraft. Journal of Aircraft, 48(1):11–22.
