TRANSCRIPT
Anne Auger, INRIA Saclay – Île-de-France
http://www.lri.fr/~auger/
Workshop on Advanced Methods and Perspectives in Nonlinear Optimization and Control
Toulouse, February 3–5
New perspectives in stochastic derivative free optimization
Problem statement: Black-Box Optimization
Given: a set of possible solutions
and a quality criterion (objective function)
What makes a function difficult to solve?
non-linear, non-quadratic, non-convex
dimensionality
(considerably) larger than 3
non-separability
dependencies between variables
ill-conditioning
ruggedness
non-smooth, discontinuous, multi-modal, and/or noisy
Stochastic algorithms try to deal with all of these difficulties.
Overview
Stochastic optimization algorithms
Step-size adaptive ESs
Covariance Matrix Adaptation ESs (CMA-ES)
Benchmarking stochastic and deterministic DFOs
Theoretical convergence results
A stochastic black-box-search template
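Most algorithms in this talk fit a common template: maintain the parameters of a search distribution, sample candidates, evaluate them, and update the parameters from the ranked candidates. A minimal sketch (the function names and the toy instantiation are illustrative, not from the slides):

```python
import numpy as np

def stochastic_search(f, theta, sample, update, iterations=100, seed=0):
    """Generic stochastic black-box search.

    theta  : parameters of the search distribution (e.g. mean, step-size)
    sample : (theta, rng) -> array of candidate solutions
    update : (theta, candidates, fitnesses) -> new theta
    Only f-values of sampled points are used: f is a black box.
    """
    rng = np.random.default_rng(seed)
    for _ in range(iterations):
        candidates = sample(theta, rng)               # sample new candidates
        fitnesses = [f(x) for x in candidates]        # black-box evaluations
        theta = update(theta, candidates, fitnesses)  # adapt the distribution
    return theta
```

For instance, a (1,λ)-ES with fixed step-size is obtained by letting theta be the mean, sampling Gaussian offspring around it, and moving the mean to the best offspring.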
Evolution Strategies (ES) [Rechenberg, Schwefel, 1970s]
ES: typical update of mean value
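The typical update moves the mean to a weighted average of the μ best of λ sampled offspring (weighted recombination). A sketch; the log-rank weights below are one common choice and may differ from the exact formulas on the slide:

```python
import numpy as np

def update_mean(mean, sigma, f, lam=10, mu=5, rng=None):
    """One ES iteration: sample lam offspring around the mean, then move
    the mean to the weighted average of the mu best (weighted recombination)."""
    rng = rng or np.random.default_rng(0)
    n = len(mean)
    offspring = mean + sigma * rng.standard_normal((lam, n))
    order = np.argsort([f(x) for x in offspring])      # rank by fitness
    weights = np.log(mu + 0.5) - np.log(np.arange(1, mu + 1))
    weights /= weights.sum()                           # positive, sum to 1
    return weights @ offspring[order[:mu]]             # new mean
```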
Why step-size control?
One-fifth success rule [Rechenberg 70]
One-fifth success rule – the equations
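The rule can be written as a multiplicative step-size update that is neutral exactly at a 1/5 success probability: increase σ after a success, decrease it otherwise. A sketch; the damping constant is illustrative:

```python
import math

def one_fifth_update(sigma, success, c=math.exp(1.0 / 3.0)):
    """One-fifth success rule: multiply sigma by c on success, by c**(-1/4)
    on failure.  With 1 success per 5 trials, sigma is unchanged on average,
    since c * (c**(-1/4))**4 == 1."""
    return sigma * c if success else sigma * c ** (-0.25)
```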
Step-size adaptation → linear convergence
Why covariance matrix adaptation?
Ill-conditioned problems:
What do we want to achieve?
Covariance Matrix Adaptation ES (CMA-ES) [N. Hansen, A. Ostermeier 2001]
Rank-one update
Hansen, N. and A. Ostermeier (2001). Completely Derandomized Self-Adaptation in Evolution Strategies. Evolutionary Computation, 9(2):159–195.
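The rank-one update itself is a convex combination of the old covariance matrix and the outer product of the evolution path p_c, which stretches the sampling distribution along recently successful search directions. A minimal sketch (the learning rate c1 is illustrative):

```python
import numpy as np

def rank_one_update(C, p_c, c1=0.1):
    """Rank-one CMA update: shrink C slightly and add the outer product of
    the evolution path p_c, increasing variance along recent good steps."""
    return (1.0 - c1) * C + c1 * np.outer(p_c, p_c)
```

Repeated updates with a path pointing along one axis make the covariance anisotropic, which is exactly the adaptation shown in the slide animation.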
Experimentum crucis (1)
[Code available at www.lri.fr/~hansen/]
Experimentum crucis (2)
Invariance to Monotonically Increasing Functions
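Rank-based algorithms only use the ordering of fitness values, so composing f with any strictly increasing g leaves their behavior unchanged. A quick check of this ordering invariance (the functions below are illustrative examples):

```python
import math
import numpy as np

def ranking(f, points):
    """Indices of points sorted by fitness: all a rank-based ES ever sees."""
    return np.argsort([f(x) for x in points]).tolist()

# Sphere function and a strictly increasing transformation of it.
sphere = lambda x: float(np.dot(x, x))
transformed = lambda x: math.atan(sphere(x)) ** 3  # g increasing on f >= 0

points = np.random.default_rng(0).standard_normal((8, 3))
assert ranking(sphere, points) == ranking(transformed, points)
```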
Experimentum crucis (2')
Overview
Stochastic optimization algorithms
Benchmarking stochastic and deterministic DFOs
Theoretical convergence results
Benchmarking of several DFO algorithms
Algorithms tested:
BFGS: gradient estimated by finite differences, Matlab implementation [Broyden, Fletcher, Goldfarb, Shanno]
NEWUOA (NEW Unconstrained Optimization Algorithm): C translation by M. Guibert of Powell's Fortran code [Powell 2006]
CMA-ES (Covariance Matrix Adaptation ES) [Hansen et al. 2001–2004]
PSO (Particle Swarm Optimization) [Kennedy, Eberhart 1995]
DE (Differential Evolution) [Storn, Price 1997]
Empirical comparisons of several derivative free optimization algorithms, Auger, A. et al., Actes du 9ème colloque national en calcul des structures, Giens, 2009.
“Optimal” case for deterministic algorithms
Convex quadratic function:
Impact of condition number - Ellipsoid
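The ellipsoid is the standard convex quadratic test function whose axis scalings span the condition number; rotating it makes it non-separable. A sketch of the common convention (the slide's exact scaling may differ):

```python
import numpy as np

def ellipsoid(x, alpha=1e6):
    """f(x) = sum_i alpha**(i/(n-1)) * x_i**2, with condition number alpha."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    scales = alpha ** (np.arange(n) / max(n - 1, 1))
    return float(np.sum(scales * x ** 2))

def rotated_ellipsoid(x, R, alpha=1e6):
    """Non-separable variant: apply a rotation R before the axis scaling."""
    return ellipsoid(R @ np.asarray(x, dtype=float), alpha)
```

Because CMA-ES is invariant under rotations of the search space (once the covariance is adapted), its performance on the two variants is essentially identical, unlike coordinate-wise methods.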
Impact of condition number – Rotated Ellipsoid
Impact of condition number – (Rotated Ellipsoid)¼
Black-Box-Optimization-Benchmarking (BBOB)
Black Box Optimization Benchmarking (BBOB) workshop, GECCO-2009, GECCO-2010
[http://coco.gforge.inria.fr/doku.php?id=bbob-2010]
Function testbed:
mainly non-convex and non-separable
scalable with the search space dimension
not too easy to solve, yet comprehensible
Provides code for data acquisition (only need to plug in the optimizers)
Provides code for post-processing the experiments
Data presentation yields a quantitative assessment, stratified by function properties
Comprehensive comparison of 28 algorithms
BBOB non-noisy testbed, d=20
Cumulative distribution of run lengths (a.k.a. data profile)
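A data profile is simply the empirical CDF, over all (function, target) pairs, of the number of evaluations needed to reach the target. A minimal sketch:

```python
import numpy as np

def data_profile(run_lengths, budgets):
    """Fraction of (function, target) pairs solved within each budget.
    run_lengths: evaluations needed per problem (np.inf if never solved)."""
    run_lengths = np.asarray(run_lengths, dtype=float)
    return [float(np.mean(run_lengths <= b)) for b in budgets]
```

Reading the curves: higher and further left is better; the value at the maximal budget is the overall fraction of problems solved.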
Comprehensive comparison of 28 algorithms
BBOB non-noisy testbed, d=2
Cumulative distribution of run lengths
Comprehensive comparison of 28 algorithms
BBOB non-noisy testbed, d=20, unimodal functions
Cumulative distribution of run lengths
Comprehensive comparison of 28 algorithms
BBOB non-noisy testbed, d=20, multi-modal functions
Cumulative distribution of run lengths
Comprehensive comparison of 28 algorithms
Cumulative distribution of run lengths
d=20, noisy functions
Overview
Stochastic optimization algorithms
Benchmarking stochastic and deterministic DFOs
Theoretical convergence results
Convergence
can be rather meaningless ...
Linear convergence for a sequence of random variables (RVs)
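For a sequence of random variables $(X_t)$ and optimum $x^\star$, almost-sure linear convergence is usually stated via the log-progress rate (a standard formulation in this literature; the constant $c>0$ is the convergence rate):

```latex
\frac{1}{t}\,\ln \frac{\lVert X_t - x^\star \rVert}{\lVert X_0 - x^\star \rVert}
\;\xrightarrow[t \to \infty]{\text{a.s.}}\; -c, \qquad c > 0 .
```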
General lower bounds for convergence
O. Teytaud et al. (2006–2009)
Class of functions
Exploitation of the invariance properties of rank-based algorithms
Linear convergence – sufficient conditions
Law of Large Numbers for Markov Chains
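The stability argument rests on the Law of Large Numbers for Markov chains: for a positive Harris-recurrent chain $(\Phi_k)$ with invariant probability measure $\pi$ and a $\pi$-integrable function $g$ (standard statement, as in Meyn and Tweedie's monograph):

```latex
\frac{1}{t} \sum_{k=0}^{t-1} g(\Phi_k)
\;\xrightarrow[t \to \infty]{\text{a.s.}}\; \int g \, d\pi .
```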
Verifying stability conditions
Anne Auger (2005). Convergence results for (1,λ)-SA-ES using the theory of φ-irreducible Markov chains. Theoretical Computer Science, 334(1–3):35–69.
Auger and Hansen, in preparation.
● Acknowledgments: Nikolaus Hansen, Raymond Ros
● Forthcoming events:
● GECCO Workshop Black Box Optimization Benchmarking 2010
– Deadline to submit your favorite algorithm: March 25th
[http://coco.gforge.inria.fr/doku.php?id=bbob-2010]
● Dagstuhl seminar on “Theory of Evolutionary Algorithms” September 2010
[co-organized with J. Shapiro, D. Whitley, C. Witt]