new perspectives in stochastic derivative free optimization · workshop on advanced methods and...

49
Anne Auger INRIA – Saclay Ile-de-France http://www.lri.fr/~auger/ Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives in stochastic derivative free optimization

Upload: others

Post on 12-Oct-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

Anne AugerINRIA – Saclay Ile-de-France

http://www.lri.fr/~auger/

Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control

Toulouse, Febuary 3-5

New perspectives in stochastic derivative free optimization

Page 2: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

2New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 2

Données:ensemble de solutions possibles

critère de qualité

Problem statement: Black-Box-OptimizationOptimisation

Page 3: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

3New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 3

What makes a function difficult to solve?

non-linear, non-quadratic, non-convex

dimensionality

(considerably) larger than 3

non-separability

dependencies between variables

Ill-conditioning

ruggedness

non-smooth, discontinuous, multi-modal, and/or noisy

Stochastic algorithms: try to deal with any of those difficulties

Page 4: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

4New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 4

Overview

Stochastic optimization algorithms

Step-size adaptive ESs

Covariance Matrix Adaptation ESs (CMA-ES)

Benchmarking stochastic and deterministic DFOs

Theoretical convergence results

Page 5: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

5New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 5

A stochastic black-box-search template

Page 6: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

6New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 6

Evolution Strategies (ES)[Rechenberg, Schwefel, 70's]

Page 7: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

7New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 7

Evolution Strategies (ES)[Rechenberg, Schwefel, 70's]

Page 8: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

8New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 8

ES: typical update of mean value

Page 9: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

9New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 9

Why step-size control?

Page 10: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

10New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 10

One-fifth success rule[Rechenberg 70]

Page 11: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

11New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 11

One-fifth success rule – the equations

Page 12: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

12New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 12

Step-size adaptation → linear convergence

Page 13: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

13New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 13

Why covariance matrix adaptation?

Ill-conditioned problems:

What do we want to achieve?

Page 14: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

14New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 14

Covariance Matrix Adaptation - ES (CMA-ES)[N. Hansen, A Ostermeier 2001]Rank-one update

Page 15: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

15New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 15

Covariance Matrix Adaptation - ES (CMA-ES)[N. Hansen, A Ostermeier 2001]Rank-one update

Page 16: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

16New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 16

Covariance Matrix Adaptation - ES (CMA-ES)[N. Hansen, A Ostermeier 2001]Rank-one update

Page 17: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

17New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 17

Covariance Matrix Adaptation - ES (CMA-ES)[N. Hansen, A Ostermeier 2001]Rank-one update

Page 18: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

18New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 18

Covariance Matrix Adaptation - ES (CMA-ES)[N. Hansen, A Ostermeier 2001]Rank-one update

Page 19: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

19New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 19

Covariance Matrix Adaptation - ES (CMA-ES)[N. Hansen, A Ostermeier 2001]Rank-one update

Page 20: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

20New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 20

Covariance Matrix Adaptation - ES (CMA-ES)[N. Hansen, A Ostermeier 2001]Rank-one update

Page 21: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

21New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 21

Covariance Matrix Adaptation - ES (CMA-ES)[N. Hansen, A Ostermeier 2001]Rank-one update

Page 22: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

22New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 22

Covariance Matrix Adaptation - ES (CMA-ES)[N. Hansen, A Ostermeier 2001]Rank-one update

Page 23: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

23New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 23

Covariance Matrix Adaptation - ES (CMA-ES)

Rank-one update

Hansen, N. and A. Ostermeier (2001). Completely Derandomized Self-Adaptation in Evolution Strategies. Evolutionary Computation,

Page 24: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

24New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 24

Experimentum crucis (1)

[Code available at www.lri.fr/~hansen/]

Page 25: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

25New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 25

Experimentum crucis (2)

Page 26: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

26New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 26

Invariance to Monotonically Increasing Functions

Page 27: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

27New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 27

Experimentum crucis (2')

Page 28: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

28New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 28

Overview

Stochastic optimization algorithms

Benchmarking stochastic and deterministic DFOs

Theoretical convergence results

Page 29: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

29New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 29

Benchmarking of several DFO

BFGS: gradient estimated by finite differences, Matlab implementation

[Broyden, Fletcher, Goldfarb, Shanno]

NEWUOA (NEW Unconstrained Optimization Algorithm): C-translation by M. Guibert of Fortran Powell code [Powell 2006]

CMA-ES (Covariance Matrix Adaptation ES) [Hansen et al. 2001-2004]

PSO (Particle Swarm Optimization) [Kennedy, Eberhart 1995]

DE (Differential Evolution) [Storn-Price 1997]

Algorithms tested:

Empirical comparisons of several derivative free optimization algorithms, Auger, A. et al., Acte du 9ime colloque national en calcul des structures, Giens, 2009.

Page 30: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

30New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 30

“Optimal” case for deterministic algorithms

Convex quadratic function:

Page 31: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

31New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 31

Impact of condition number - Ellipsoid

Page 32: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

32New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 32

Impact of condition number – Rotated Ellipsoid

Page 33: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

33New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 33

Impact of condition number – (Rotated Ellipsoid)¼

Page 34: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

34New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 34

Black-Box-Optimization-Benchmarking (BBOB)

Black Box Optimization Benchmarking (BBOB) workshop, GECCO-2009, GECCO-2010

[http://coco.gforge.inria.fr/doku.php?id=bbob-2010]

function testbed

mainly non-convex and non-separable

Scalable with the search space dimension

not too easy to solve, but yet comprehensible

provide code for data acquisition (only need to plug the optimizers)

provide code for post-processing experiments

Data presentation yields quantitative assessment, stratified by function properties

Page 35: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

35New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 35

Comprehensive comparison of 28 algorithms

BBOB non-noisy testbed d=20

Cumulative distribution of running length distribution (aka data profile)

d=20

Page 36: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

36New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 36

Comprehensive comparison of 28 algorithms

BBOB non-noisy testbed

Cumulative distribution of running length distribution

d=2

Page 37: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

37New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 37

Comprehensive comparison of 28 algorithms

BBOB non-noisy testbed

Cumulative distribution of running length distribution

d=20unimodal functions

Page 38: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

38New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 38

Comprehensive comparison of 28 algorithms

BBOB non-noisy testbed d=20

Cumulative distribution of running length distribution

d=20, multi-modal functions

Page 39: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

39New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 39

Comprehensive comparison of 28 algorithms

Cumulative distribution of running length distribution

d=20, noisy functions

Page 40: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

40New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 40

Overview

Stochastic optimization algorithms

Benchmarking stochastic and deterministic DFOs

Theoretical convergence results

Page 41: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

41New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 41

Convergence

can be rather meaningless ...

Page 42: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

42New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 42

Linear convergence for sequence of RVs

Page 43: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

43New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 43

Linear convergence for sequence of RVs

Page 44: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

44New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 44

General lower bounds for convergence

O. Teytaud et al (2006-2009)

Page 45: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

45New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 45

Class of functions

Exploitation of invariance property of rank-based algorithms

Page 46: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

46New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 46

Linear convergence – sufficient conditions

Page 47: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

47New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 47

Law of Large Numbers for Markov Chains

Page 48: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

48New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 48

Verifying stability conditions

Anne Auger (2005), Convergence results for (1,λ)-SA-ES using the theory of φ-irreducible Markov chains.

Auger – Hansen in preparation

Page 49: New perspectives in stochastic derivative free optimization · Workshop on Advanced Methods and Perspectives in Nonlinear optimization and control Toulouse, Febuary 3-5 New perspectives

49New perspectives in stochastic derivative free optimizationAnne Auger INRIA Saclay Ile-de-France 49

● Acknowledgments: Nikolaus Hansen, Raymond Ros

● Forthcoming events:

● GECCO Workshop Black Box Optimization Benchmarking 2010

– Deadline to submit your favorite algorithm: March 25th

[http://coco.gforge.inria.fr/doku.php?id=bbob-2010]

● Dagstuhl seminar on “Theory of Evolutionary Algorithms” September 2010

[co-organized with J. Shapiro, D. Whitley, C. Witt]