Introduction to Optimization: Benchmarking · researchers.lille.inria.fr/.../optimizationsaclay/...

Introduction to Optimization: Benchmarking Dimo Brockhoff Inria Saclay Ile-de-France September 13, 2016 TC2 - Optimisation Université Paris-Saclay, Orsay, France Anne Auger Inria Saclay Ile-de-France


Page 1: Introduction to Optimization: Benchmarking

Introduction to Optimization: Benchmarking

Dimo Brockhoff

Inria Saclay – Ile-de-France

September 13, 2016

TC2 - Optimisation

Université Paris-Saclay, Orsay, France

Anne Auger

Inria Saclay – Ile-de-France

Page 2:

TC2: Introduction to Optimization, U. Paris-Saclay, Sep. 16, 2016 © Anne Auger and Dimo Brockhoff, Inria

Course Overview

#   Date               Topic
1   Fri, 16.9.2016     Introduction to Optimization
    Wed, 21.9.2016     groups defined via wiki
    Thu, 22.9.2016     everybody went (actively!) through the Getting Started part of github.com/numbbo/coco
2   Fri, 23.9.2016     Today's lecture: Benchmarking; final adjustments of groups; everybody can run and postprocess the example experiment (~1h for final questions/help during the lecture)
3   Fri, 30.9.2016     Lecture
4   Fri, 7.10.2016     Lecture
    Mon, 10.10.2016    deadline for intermediate wiki report: what has been done and what remains to be done?
5   Fri, 14.10.2016    Lecture
6   Tue, 18.10.2016    Lecture
    Tue, 18.10.2016    deadline for submitting data sets
    Fri, 21.10.2016    deadline for paper submission
                       vacation
7   Fri, 4.11.2016     Final lecture
    7.-11.11.2016      oral presentations (individual time slots)
    14.-18.11.2016     Exam (exact date to be confirmed)

All deadlines: 23:59 Paris time

Page 4:

challenging optimization problems appear in many scientific, technological and industrial domains

Page 5:

Optimize $f: \Omega \subset \mathbb{R}^n \to \mathbb{R}^k$

derivatives not available or not useful

$x \in \mathbb{R}^n \mapsto f(x) \in \mathbb{R}^k$

Numerical Blackbox Optimization

Page 6:

Given: $x \in \mathbb{R}^n \mapsto f(x) \in \mathbb{R}^k$

Not clear: which of the many algorithms should I use on my problem?

Practical Blackbox Optimization

Page 7:

Deterministic algorithms

Quasi-Newton with estimation of gradient (BFGS) [Broyden et al. 1970]

Simplex downhill [Nelder & Mead 1965]

Pattern search [Hooke and Jeeves 1961]

Trust-region methods (NEWUOA, BOBYQA) [Powell 2006, 2009]

Stochastic (randomized) search methods

Evolutionary Algorithms (continuous domain)

• Differential Evolution [Storn & Price 1997]

• Particle Swarm Optimization [Kennedy & Eberhart 1995]

• Evolution Strategies, CMA-ES [Rechenberg 1965, Hansen & Ostermeier 2001]

• Estimation of Distribution Algorithms (EDAs) [Larrañaga, Lozano, 2002]

• Cross Entropy Method (same as EDA) [Rubinstein, Kroese, 2004]

• Genetic Algorithms [Holland 1975, Goldberg 1989]

Simulated annealing [Kirkpatrick et al. 1983]

Simultaneous perturbation stochastic approx. (SPSA) [Spall 2000]

Numerical Blackbox Optimizers

Page 8:

• choice typically not immediately clear

• although practitioners have knowledge about problem difficulties (e.g. multi-modality, non-separability, ...)

Numerical Blackbox Optimizers

Page 9:

• understanding of algorithms

• algorithm selection

• putting algorithms to a standardized test

  • simplify judgement

  • simplify comparison

  • regression test under algorithm changes

Almost everybody has to do it (and it is tedious):

• choosing (and implementing) problems, performance measures, visualization, statistical tests, ...

• running a set of algorithms

Need: Benchmarking

Page 10:

that's where COCO comes into play

Comparing Continuous Optimizers Platform

https://github.com/numbbo/coco

Page 11:

automated benchmarking

Page 12:

How to benchmark algorithms with COCO?

Pages 13 to 19:

https://github.com/numbbo/coco

Page 20:

https://github.com/numbbo/coco

requirements & download

Page 21:

https://github.com/numbbo/coco

installation I & test

installation II (post-processing)

Page 22:

https://github.com/numbbo/coco

example experiment

Page 23:

example_experiment.c

/* Iterate over all problems in the suite */
while ((PROBLEM = coco_suite_get_next_problem(suite, observer)) != NULL) {

  size_t dimension = coco_problem_get_dimension(PROBLEM);

  /* Run the algorithm at least once */
  for (run = 1; run <= 1 + INDEPENDENT_RESTARTS; run++) {

    /* Budget left for this problem: total budget minus evaluations already used */
    size_t evaluations_done = coco_problem_get_evaluations(PROBLEM);
    long evaluations_remaining =
        (long)(dimension * BUDGET_MULTIPLIER) - (long)evaluations_done;

    if (... || (evaluations_remaining <= 0))
      break;

    my_random_search(evaluate_function, dimension,
                     coco_problem_get_number_of_objectives(PROBLEM),
                     coco_problem_get_smallest_values_of_interest(PROBLEM),
                     coco_problem_get_largest_values_of_interest(PROBLEM),
                     (size_t) evaluations_remaining,
                     random_generator);
  }
}
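The optimizer called inside the loop is a plain random search. As a rough, self-contained illustration of that idea (not the code shipped with COCO), the sketch below samples points uniformly within box bounds and keeps the best function value. The names `sphere` and `random_search_sketch`, the simplified callback signature, and the use of `rand()` are assumptions made for this example only.

```c
#include <math.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical objective used only for illustration: the sphere function. */
double sphere(const double *x, size_t dimension) {
  double sum = 0.0;
  size_t i;
  for (i = 0; i < dimension; i++) {
    sum += x[i] * x[i];
  }
  return sum;
}

/* Pure random search: draws max_budget points uniformly at random from the
 * box [lower, upper] and returns the best (smallest) function value seen.
 * Returns HUGE_VAL if max_budget is 0 or if memory allocation fails. */
double random_search_sketch(double (*f)(const double *, size_t),
                            size_t dimension,
                            const double *lower, const double *upper,
                            size_t max_budget, unsigned int seed) {
  double best = HUGE_VAL;
  double *x = (double *)malloc(dimension * sizeof(double));
  size_t evaluation, i;
  if (x == NULL) {
    return best;
  }
  srand(seed);
  for (evaluation = 0; evaluation < max_budget; evaluation++) {
    double value;
    for (i = 0; i < dimension; i++) {
      double u = (double)rand() / (double)RAND_MAX; /* uniform in [0, 1] */
      x[i] = lower[i] + u * (upper[i] - lower[i]);
    }
    value = f(x, dimension);
    if (value < best) {
      best = value;
    }
  }
  free(x);
  return best;
}
```

With a fixed seed, a larger budget extends the same sample sequence, so the best value can only stay equal or improve as the budget grows.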

Page 24:

https://github.com/numbbo/coco

run:

! choose the right test suite: bbob or bbob-biobj !

Page 25:

https://github.com/numbbo/coco

postprocess

tip: start with small #funevals (until bugs fixed), then increase budget to get a feeling how long a "long run" will take

Page 26:

result folder

Pages 27 to 29:

automatically generated results

Page 30:

so far:

data for about 165 algorithm variants

[in total on single- and multiobjective problems]

118 workshop papers

by 79 authors from 25 countries

Page 31:

On

• real world problems

  • expensive

  • comparison typically limited to certain domains

  • experts have limited interest in publishing

• "artificial" benchmark functions

  • cheap

  • controlled

  • data acquisition is comparatively easy

  • problem of representativeness

Measuring Performance

Page 32:

• define the "scientific question"

the relevance can hardly be overestimated

• should represent "reality"

• are often too simple?

recall separability

• a number of testbeds are around

• account for invariance properties

prediction of performance is based on “similarity”, ideally equivalence classes of functions

Test Functions

Page 33:

Available Test Suites in COCO

bbob         24 noiseless fcts       140+ algo data sets
bbob-noisy   30 noisy fcts           40+ algo data sets
bbob-biobj   55 bi-objective fcts    15 algo data sets (new in 2016)

Page 34:

Meaningful quantitative measure

• quantitative on the ratio scale (highest possible)

"algo A is two times better than algo B" is a meaningful statement

• assume a wide range of values

• meaningful (interpretable) with regard to the real world

possible to transfer from benchmarking to real world

How Do We Measure Performance?

Page 35:

runtime or first hitting time is the prime candidate (we don't have many choices anyway)

Page 36:

Two objectives:

• Find solution with small(est possible) function/indicator value

• With the least possible search costs (number of function evaluations)

For measuring performance: fix one and measure the other

How Do We Measure Performance?

Page 37:

convergence graphs are all we have to start with...

Measuring Performance Empirically

[Figure: convergence graphs; y-axis label begins "function value or"]

Page 38:

ECDF:

Empirical Cumulative Distribution Function of the Runtime

[aka data profile]

Page 39:

A Convergence Graph

Page 40:

First Hitting Time is Monotonic

Page 41:

15 Runs

Page 42:

target

15 Runs ≤ 15 Runtime Data Points
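Each run contributes at most one runtime data point for a given target: the first evaluation at which the target is reached, if it is reached at all. A minimal sketch of that extraction, assuming minimization and a recorded sequence of function values (the function name and the convention of returning 0 for "target never hit" are choices made for this example):

```c
#include <stddef.h>

/* Returns the 1-based number of the first evaluation whose function value
 * is <= target (minimization), or 0 if the run never reaches the target. */
size_t first_hitting_time(const double *f_values, size_t n_evals,
                          double target) {
  size_t i;
  for (i = 0; i < n_evals; i++) {
    if (f_values[i] <= target) {
      return i + 1;
    }
  }
  return 0;
}
```

Runs that return 0 produce no data point, which is why 15 runs yield at most 15 runtimes.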

Page 43:

[Figure: Empirical CDF, y-axis from 0 to 1]

the ECDF of run lengths to reach the target

● has for each data point a vertical step of constant size

● displays for each x-value (budget) the count of observations to the left (first hitting times)

Empirical Cumulative Distribution
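The ECDF value at a given budget can be computed by counting the first hitting times that do not exceed the budget and dividing by the number of runs. The sketch below assumes that unsuccessful runs are stored as 0 (a convention chosen here, not taken from COCO), so they never count as "to the left" of any budget.

```c
#include <stddef.h>

/* Fraction of runs whose first hitting time is <= budget.
 * runtimes[i] == 0 marks a run that never reached the target. */
double ecdf_at(const size_t *runtimes, size_t n_runs, size_t budget) {
  size_t i, count = 0;
  if (n_runs == 0) {
    return 0.0;
  }
  for (i = 0; i < n_runs; i++) {
    if (runtimes[i] != 0 && runtimes[i] <= budget) {
      count++;
    }
  }
  return (double)count / (double)n_runs;
}
```

Every successful run raises the curve by the same step 1/n_runs, and the curve saturates at the fraction of successful runs.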

Page 44:

[Figure: Empirical CDF, y-axis from 0 to 1]

interpretations possible:

● 80% of the runs reached the target

● e.g. 60% of the runs need between 2000 and 4000 evaluations

Empirical Cumulative Distribution

Page 45:

Reconstructing A Single Run

Page 46:

50 equally spaced targets

Reconstructing A Single Run

Pages 47 to 48:

Reconstructing A Single Run

Page 49:

the empirical CDF makes a step for each star, is monotonic, and displays for each budget the fraction of targets achieved within the budget

[Figure: empirical CDF, y-axis from 0 to 1]

Reconstructing A Single Run

Pages 50 to 51:

the ECDF recovers the monotonic graph, discretized and flipped

[Figure: empirical CDF, y-axis from 0 to 1]

Reconstructing A Single Run

Page 52:

Aggregation

15 runs

Page 53:

Aggregation

15 runs

50 targets

Page 55:

15 runs

50 targets

ECDF with 750 steps

Aggregation

Page 56:

50 targets from 15 runs

...integrated in a single graph

Aggregation

Page 57:

area over the ECDF curve = average log runtime (or geometric avg. runtime) over all targets (difficult and easy) and all runs

50 targets from 15 runs integrated in a single graph

Interpretation
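A short derivation of why the area over the curve equals the average log runtime, under two assumptions that are not stated on the slide: the x-axis is logarithmic, and all $N$ (run, target) pairs reach their target with runtimes $T_1, \dots, T_N \ge 1$. With $F(t) = \frac{1}{N}\,\#\{i : T_i \le t\}$ and $s = \log t$:

```latex
\int_0^\infty \bigl(1 - F(e^{s})\bigr)\,ds
  = \frac{1}{N}\sum_{i=1}^{N} \int_0^\infty \mathbf{1}\{\log T_i > s\}\,ds
  = \frac{1}{N}\sum_{i=1}^{N} \log T_i
  = \log\Bigl(\prod_{i=1}^{N} T_i\Bigr)^{1/N}
```

The last expression is the logarithm of the geometric mean runtime. When some pairs never reach their target, the curve stays below 1 and the area depends on where the x-axis is cut off.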

Page 58:

Fixed-target: Measuring Runtime

Page 59:

Fixed-target: Measuring Runtime

• Algo Restart A: runtime $RT_A^r$, with $p_s(\text{Algo Restart A}) = 1$

• Algo Restart B: runtime $RT_B^r$, with $p_s(\text{Algo Restart B}) = 1$

Page 60:

Fixed-target: Measuring Runtime

• Expected running time of the restarted algorithm:

$$E[RT^r] = \frac{1 - p_s}{p_s}\, E[RT_{\text{unsuccessful}}] + E[RT_{\text{successful}}]$$

• Estimator average running time (aRT):

$$p_s = \frac{\#\text{successes}}{\#\text{runs}}$$

$$RT_{\text{unsucc}} = \text{average evals of unsuccessful runs}$$

$$RT_{\text{succ}} = \text{average evals of successful runs}$$

$$aRT = \frac{\text{total } \#\text{evals}}{\#\text{successes}}$$
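The aRT estimator amounts to dividing the evaluations spent in all runs by the number of successful runs. A minimal sketch; the function name, the argument layout, and the choice to return infinity when no run succeeds are assumptions made for this example:

```c
#include <math.h>
#include <stddef.h>

/* evals[i]:   evaluations spent in run i (until success or until it stopped)
 * success[i]: nonzero if run i reached the target
 * Returns total #evals / #successes, or INFINITY if no run was successful. */
double average_runtime(const size_t *evals, const int *success,
                       size_t n_runs) {
  size_t i, n_success = 0;
  double total = 0.0;
  for (i = 0; i < n_runs; i++) {
    total += (double)evals[i];
    if (success[i]) {
      n_success++;
    }
  }
  if (n_success == 0) {
    return INFINITY;
  }
  return total / (double)n_success;
}
```

For example, runs with 100, 500, 200 and 500 evaluations, of which the first and third are successful, give 1300 / 2 = 650. Plugging the estimates into the expected running time formula gives the same number: (0.5 / 0.5) · 500 + 150 = 650.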

Page 61:

ECDFs with Simulated Restarts

What we typically plot are ECDFs of the simulated restarted algorithms:
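One way to simulate a restarted algorithm from recorded data is to draw runs uniformly at random with replacement until a successful run is drawn, and to add up the evaluations of all runs drawn. The sketch below follows that idea; it is not COCO's implementation. The small linear congruential generator, the function names, the limit of 32768 runs, and the convention of returning 0 when no run was successful are assumptions made to keep the example short and reproducible.

```c
#include <stddef.h>

/* Small linear congruential generator so that the sketch is reproducible. */
unsigned long lcg_next(unsigned long *state) {
  *state = (*state * 1103515245UL + 12345UL) % 2147483648UL;
  return *state;
}

/* Draws recorded runs uniformly with replacement until a successful run is
 * drawn and returns the summed evaluations of all drawn runs.
 * Assumes n_runs <= 32768. Returns 0 if no recorded run was successful,
 * because the simulated restarted algorithm would then never hit the target. */
size_t simulated_restart_runtime(const size_t *evals, const int *success,
                                 size_t n_runs, unsigned long *rng_state) {
  size_t i, total = 0;
  int any_success = 0;
  for (i = 0; i < n_runs; i++) {
    if (success[i]) {
      any_success = 1;
    }
  }
  if (!any_success) {
    return 0;
  }
  for (;;) {
    /* use the higher-order bits, which are better distributed */
    size_t pick = (size_t)((lcg_next(rng_state) >> 16) % n_runs);
    total += evals[pick];
    if (success[pick]) {
      return total;
    }
  }
}
```

Repeating this many times per target yields a sample of simulated runtimes from which an ECDF can be built.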

Page 62:

Worth Noting: ECDFs in COCO

In COCO, ECDF graphs

• never aggregate over dimension

• but often over targets and functions

• can show data of more than 1 algorithm at a time

[Figure: 150 algorithms from BBOB-2009 till BBOB-2015]

Page 63:

...comparing aRT values over several algorithms

Another Interesting Plot...

Page 64:

...comparing aRT values over several algorithms

Another Interesting Plot...

[Figure annotations:]

• aRT value [if < ∞] to reach given target precision

• a star indicates statistically significant results compared to all other displayed algos

• median runlength of unsuccessful runs

Page 65:

...comparing aRT values over several algorithms

Another Interesting Plot...

[Figure annotations:]

• artificial best algorithm from BBOB-2016

• scaling with dimension

• linear

Page 66:

...are scatter plots

Interesting for 2 Algorithms...

[Figure: scatter plot; x-axis: aRT for algorithm A; y-axis: aRT for algorithm B; legend: dimensions; one marker per target]

Page 67:

...but they are probably less interesting for us here

There are more Plots...

Page 68:

The single-objective BBOB functions

Page 69:

• 24 functions in 5 groups:

• 6 dimensions: 2, 3, 5, 10, 20, (40 optional)

bbob Testbed

Page 70:

• All COCO problems come in the form of instances

• e.g. as translated/rotated versions of the same function

• Prescribed instances typically change from year to year

• avoid overfitting

• 5 instances are always kept the same

Plus:

• the bbob functions are locally perturbed by non-linear transformations

Notion of Instances

Page 71:

Notion of Instances

[Figure: f10 (Ellipsoid) and f15 (Rastrigin)]

Page 72:

the recent extension to multi-objective optimization

Page 73:

A Brief Introduction to Multiobjective Optimization

Multiobjective Optimization (MOO)

Multiple objectives that have to be optimized simultaneously

[Figure: solutions plotted by cost (min) and performance (max); regions labelled better, worse, and incomparable]

Page 74:

A Brief Introduction to Multiobjective Optimization

Observations: there is no single optimal solution, but some solutions are better than others

[Figure: solutions plotted by cost (min) and performance (max); regions labelled better, worse, and incomparable]

Page 75:

A Brief Introduction to Multiobjective Optimization

[Figure: solutions plotted by cost (min) and performance (max); regions labelled better, worse, and incomparable]

Page 76:

A Brief Introduction to Multiobjective Optimization

[Figure: solutions plotted by cost (min) and performance (max); regions labelled dominating, dominated, and incomparable]

Page 77:

A Brief Introduction to Multiobjective Optimization

Pareto set: set of all non-dominated solutions (decision space)

Pareto front: its image in the objective space

[Figure: solutions plotted by cost (min) and performance (max), showing the currently non-dominated front (approximation); portrait of Vilfredo Pareto (1848-1923), source: wikipedia]
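The dominance relation behind these definitions can be written down in a few lines. The sketch below assumes that every objective is minimized (an objective that is maximized, such as performance, can be negated first); the function names and the row-wise array layout are choices made for this example.

```c
#include <stddef.h>

/* Returns 1 if objective vector a dominates b under minimization:
 * a is no worse in every objective and strictly better in at least one. */
int dominates(const double *a, const double *b, size_t m) {
  int strictly_better = 0;
  size_t j;
  for (j = 0; j < m; j++) {
    if (a[j] > b[j]) {
      return 0;
    }
    if (a[j] < b[j]) {
      strictly_better = 1;
    }
  }
  return strictly_better;
}

/* points holds n objective vectors of length m, stored row by row.
 * Sets is_non_dominated[i] to 1 if no other point dominates point i,
 * and returns the number of non-dominated points. */
size_t mark_non_dominated(const double *points, size_t n, size_t m,
                          int *is_non_dominated) {
  size_t i, k, count = 0;
  for (i = 0; i < n; i++) {
    is_non_dominated[i] = 1;
    for (k = 0; k < n; k++) {
      if (k != i && dominates(points + k * m, points + i * m, m)) {
        is_non_dominated[i] = 0;
        break;
      }
    }
    count += (size_t)is_non_dominated[i];
  }
  return count;
}
```

Applied to the objective vectors of all solutions evaluated so far, the marked points form the currently non-dominated front; two points where neither dominates the other are incomparable.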

Page 78:

A Brief Introduction to Multiobjective Optimization

[Figure: solutions plotted by cost (min) and performance (max), showing the true Pareto front (Pareto efficient frontier); portrait of Vilfredo Pareto (1848-1923), source: wikipedia]

Page 79:

A Brief Introduction to Multiobjective Optimization

[Figure: decision space (axes x1, x2, x3) and objective space (axes f1, f2; max, min)]

Legend, decision space: solution of Pareto-optimal set; non-optimal decision vector

Legend, objective space: vector of Pareto-optimal front; non-optimal objective vector

Page 80:

A Brief Introduction to Multiobjective Optimization

ideal point: best values obtained for Pareto-optimal points

nadir point: worst values obtained for Pareto-optimal points

[Figures: two plots with axes f1 and f2 (both min), labelled Shape and Range, with the ideal point and the nadir point marked]
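Given the objective vectors of the Pareto-optimal points, both reference points are per-objective extremes. A minimal sketch, assuming minimization of all objectives, at least one point, and an input that already contains only Pareto-optimal (non-dominated) vectors; the function name and array layout are illustrative.

```c
#include <stddef.h>

/* front holds n >= 1 Pareto-optimal objective vectors of length m, stored
 * row by row. Writes the per-objective minimum to ideal and the
 * per-objective maximum to nadir (minimization of all objectives). */
void ideal_and_nadir(const double *front, size_t n, size_t m,
                     double *ideal, double *nadir) {
  size_t i, j;
  for (j = 0; j < m; j++) {
    ideal[j] = front[j];
    nadir[j] = front[j];
  }
  for (i = 1; i < n; i++) {
    for (j = 0; j < m; j++) {
      double value = front[i * m + j];
      if (value < ideal[j]) {
        ideal[j] = value;
      }
      if (value > nadir[j]) {
        nadir[j] = value;
      }
    }
  }
}
```

If dominated points are passed in by mistake, the nadir point can come out too pessimistic, so the input should be filtered for non-dominance first.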

Page 81:

Idea:

transform the multiobjective problem into a set problem

define an objective function ("quality indicator") on sets

Important:

The underlying dominance relation (on sets) should be reflected by the resulting set comparisons!

Quality Indicator Approach to MOO

[Figure: two example plots (axes: max, min), one showing $A \preceq B$ and one showing neither $A \preceq B$ nor $B \preceq A$]

Page 82:

Examples of Quality Indicators

unary hypervolume indicator

I(A) = volume of the weakly dominated area in objective space

A ≼ B ⇒ I(A) ≥ I(B)

binary epsilon indicator

I(A,B) = how much A needs to be moved to weakly dominate B

A ≼ B ⇒ I(A,B) ≤ I(B,A)

[Figure: two objective-space plots, both axes maximized. Left: set A and the area I(A) it weakly dominates. Right: sets A and B, with A moved to A'.]
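Both indicators can be sketched in a few lines for two objectives. The code below assumes minimization (the slide's figure draws maximization), a user-chosen reference point that bounds the hypervolume, and the additive variant of the epsilon indicator; function names and data are illustrative only.

```python
from typing import List, Sequence, Tuple


def hypervolume_2d(points: List[Sequence[float]], ref: Tuple[float, float]) -> float:
    """Area weakly dominated by the points and bounded by ref (2 objectives, minimization)."""
    # Only points that are strictly better than ref in both objectives contribute.
    pts = sorted(tuple(p) for p in points if p[0] < ref[0] and p[1] < ref[1])
    area = 0.0
    best_f2 = ref[1]
    for f1, f2 in pts:  # sweep in increasing f1
        if f2 < best_f2:  # point adds a new horizontal slab
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area


def additive_epsilon(A: List[Sequence[float]], B: List[Sequence[float]]) -> float:
    """Smallest shift eps such that A, moved by eps in every objective, weakly dominates B."""
    return max(
        min(max(a_j - b_j for a_j, b_j in zip(a, b)) for a in A)
        for b in B
    )


A = [(1, 4), (2, 2), (4, 1)]
B = [(2, 5), (3, 3), (5, 2)]
ref = (6, 6)

print(hypervolume_2d(A, ref), hypervolume_2d(B, ref))
print(additive_epsilon(A, B), additive_epsilon(B, A))
```

Here A weakly dominates B, and both indicators agree with that relation: A has the larger hypervolume, and A needs a smaller (negative) shift to cover B than B needs to cover A.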

Page 83:

Examples of Quality Indicators II

unary epsilon indicator

I(A,R) = how much A needs to be moved to weakly dominate R

A ≼ B ⇒ I(A,R) ≤ I(B,R)

unary R2 indicator

$$I(A) = \frac{1}{|\Lambda|} \sum_{\lambda \in \Lambda} \min_{a \in A} \max_{j=1..m} \lambda_j \, |z_j^* - a_j|$$

A ≼ B ⇒ I(A) ≤ I(B)

[Figure: two objective-space plots, both axes maximized. Left: set A moved to A' so that it weakly dominates the reference set R. Right: set A and the point z*, with a line whose slope is based on λ.]
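The R2 formula translates almost directly into code. The sketch below evaluates it for a small hand-picked weight set Λ and z* = (0, 0); the weight vectors, the choice of z*, and the example sets are assumptions made for illustration.

```python
from typing import List, Sequence


def r2_indicator(
    A: List[Sequence[float]],
    weights: List[Sequence[float]],
    z_star: Sequence[float],
) -> float:
    """Average over weight vectors of the best weighted Tchebycheff distance to z_star."""
    total = 0.0
    for lam in weights:
        total += min(
            max(l_j * abs(z_j - a_j) for l_j, z_j, a_j in zip(lam, z_star, a))
            for a in A
        )
    return total / len(weights)


weights = [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]  # hypothetical weight set Lambda
z_star = (0.0, 0.0)                             # hypothetical ideal point
A = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]
B = [(2.0, 5.0), (3.0, 3.0), (5.0, 2.0)]

print(r2_indicator(A, weights, z_star))
print(r2_indicator(B, weights, z_star))
```

Smaller values are better for this indicator, so the set A, which weakly dominates B, obtains the smaller value.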



Page 87:

bbob-biobj Testbed

• 55 functions by combining 2 bbob functions

• 15 function groups with 3-4 functions each
  • separable – separable, separable – moderate, separable – ill-conditioned, ...

• 6 dimensions: 2, 3, 5, 10, 20, (40 optional)

• instances derived from bbob instances:

• no normalization (algo has to cope with different orders of magnitude)

• for performance assessment: ideal/nadir points known
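The counts on this slide (55 functions, 15 groups, 3-4 functions per group) can be reproduced by a small enumeration. The sketch assumes that two single-objective functions are taken from each of five bbob function groups and that unordered pairs, including a function paired with itself, are formed; the last two group names and the two-per-group choice are assumptions here, not stated on the slide.

```python
from collections import Counter
from itertools import combinations_with_replacement

# Group names beyond the three shown on the slide are assumptions.
GROUPS = ["separable", "moderate", "ill-conditioned", "multi-modal", "weakly-structured"]
FUNCTIONS_PER_GROUP = 2  # assumption: two bbob functions chosen per group

# Each single-objective function is identified by (group, index within group).
functions = [(g, k) for g in GROUPS for k in range(FUNCTIONS_PER_GROUP)]

# Bi-objective functions: unordered pairs, a function may be paired with itself.
biobj_functions = list(combinations_with_replacement(functions, 2))

# Function groups: unordered pairs of single-objective groups.
biobj_groups = list(combinations_with_replacement(GROUPS, 2))

# How many bi-objective functions fall into each group pair.
group_sizes = Counter((f1[0], f2[0]) for f1, f2 in biobj_functions)

print(len(biobj_functions), len(biobj_groups), sorted(set(group_sizes.values())))
```

Pairs drawn from the same group contain 3 functions and pairs drawn from two different groups contain 4, which matches the "3-4 functions each" on the slide.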

Page 88:

bbob-biobj Testbed (cont'd)

• Pareto set and Pareto front unknown
  • but we have a good idea of where they are from running a number of algorithms and keeping track of all non-dominated points found so far

• Various types of shapes
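Keeping track of all non-dominated points found so far amounts to maintaining an archive that is updated with every evaluated point. The following is a minimal sketch of such an update, assuming minimization; it is not COCO's archive implementation, and `update_archive` is a hypothetical name.

```python
from typing import List, Tuple

Point = Tuple[float, ...]


def update_archive(archive: List[Point], candidate: Point) -> List[Point]:
    """Offer one objective vector to the archive of non-dominated points (minimization)."""
    # Reject the candidate if an archived point is at least as good everywhere.
    if any(all(a_j <= c_j for a_j, c_j in zip(a, candidate)) for a in archive):
        return archive
    # Otherwise drop every archived point that the candidate weakly dominates.
    kept = [
        a for a in archive
        if not all(c_j <= a_j for c_j, a_j in zip(candidate, a))
    ]
    kept.append(candidate)
    return kept


archive: List[Point] = []
for point in [(3, 3), (1, 5), (2, 2), (4, 4), (5, 1), (2, 2)]:
    archive = update_archive(archive, point)
print(archive)
```

Merging the archives of many algorithm runs in this way gives an approximation of the Pareto front that can only improve as more data is added.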

Page 89:

bbob-biobj Testbed (cont'd)

[Figure: example problems shown in search space and in objective space. Panel labels: disconnected, multi-modal, connected, uni-modal.]

Page 90:

Bi-objective Performance Assessment

algorithm quality =

normalized* hypervolume (HV) of all non-dominated solutions, if a point dominates nadir

closest normalized* negative distance to region of interest [0,1]^2, if no point dominates nadir

* such that ideal=[0,0] and nadir=[1,1]
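The two-case quality measure can be sketched as follows for points that are already normalized so that ideal = [0, 0] and nadir = [1, 1]. The sketch assumes that the hypervolume is measured with the nadir point as reference point, that the distance is Euclidean, and that "dominates nadir" means strictly better in both objectives; these details and all names are assumptions of the sketch, not a description of COCO's code.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def hypervolume_2d(points: List[Point], ref: Point = (1.0, 1.0)) -> float:
    """Area dominated by the points and bounded by ref (minimization)."""
    area, best_f2 = 0.0, ref[1]
    for f1, f2 in sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1]):
        if f2 < best_f2:
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area


def distance_to_unit_square(p: Point) -> float:
    """Euclidean distance from p to the region of interest [0, 1]^2."""
    return math.hypot(*(max(0.0, -x, x - 1.0) for x in p))


def quality(normalized_points: List[Point]) -> float:
    """Hypervolume if some point dominates the nadir, else a negative distance."""
    if any(p[0] < 1.0 and p[1] < 1.0 for p in normalized_points):
        return hypervolume_2d(normalized_points)
    return -min(distance_to_unit_square(p) for p in normalized_points)


print(quality([(0.2, 0.8), (0.5, 0.5), (1.5, 0.1)]))  # some points dominate the nadir
print(quality([(1.3, 0.4), (2.0, 1.4)]))              # no point dominates the nadir
```

The negative branch gives an algorithm credit for approaching the region of interest even before it has found a point with positive hypervolume, so the quality value increases monotonically as the search progresses.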

Page 91:

Bi-objective Performance Assessment

We measure runtimes to reach (HV indicator) targets:

• relative to a reference set, given as the best Pareto front approximation known (since exact Pareto set not known)

• actual absolute hypervolume targets used are

HV(refset) – targetprecision

with 58 fixed targetprecisions between 1 and −10^−4 (same for all functions, dimensions, and instances) in the displays
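The target construction and the runtime measurement can be sketched as below. The grid in `target_precisions` is one that yields 58 values between 1 and −10^−4 (51 positive values on a log scale, zero, and 6 negative values); its exact values are an assumption here and should be checked against the COCO documentation. The recorded history and the reference-set hypervolume are made-up numbers.

```python
from typing import Dict, List, Optional, Tuple


def target_precisions() -> List[float]:
    """A grid of 58 precisions from 1 down to -1e-4 (exact values are an assumption)."""
    positive = [10 ** (e / 10) for e in range(-50, 1)]      # 1e-5 ... 1, 51 values
    negative = [-(10 ** (e / 5)) for e in range(-25, -19)]  # -1e-5 ... -1e-4, 6 values
    return sorted(positive + [0.0] + negative, reverse=True)


def runtimes_to_targets(
    hv_history: List[Tuple[int, float]],
    hv_refset: float,
    precisions: List[float],
) -> Dict[float, Optional[int]]:
    """Number of evaluations needed to reach each target HV(refset) - precision.

    hv_history holds (evaluations, indicator value) pairs in increasing order of
    evaluations; None marks a target that was never reached.
    """
    runtimes: Dict[float, Optional[int]] = {}
    for precision in precisions:
        target = hv_refset - precision
        runtimes[precision] = next(
            (evals for evals, hv in hv_history if hv >= target), None
        )
    return runtimes


precisions = target_precisions()
history = [(1, -0.3), (10, 0.2), (50, 0.75), (200, 0.79995)]  # hypothetical run
runtimes = runtimes_to_targets(history, hv_refset=0.8, precisions=[1.0, 0.1, 1e-4, 0.0])
print(len(precisions), runtimes)
```

Negative precisions correspond to targets above HV(refset), which an algorithm can only reach by finding a better front approximation than the reference set.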

Page 92:

Course Overview

No.  Date              Topic

1    Fri, 16.9.2016    Introduction to Optimization

     Wed, 21.9.2016    groups defined via wiki

     Thu, 22.9.2016    everybody went (actively!) through the Getting Started part of github.com/numbbo/coco

2    Fri, 23.9.2016    Today's lecture: Benchmarking; final adjustments of groups; everybody can run and postprocess the example experiment (~1h for final questions/help during the lecture)

3    Fri, 30.9.2016    Lecture

4    Fri, 7.10.2016    Lecture

     Mon, 10.10.2016   deadline for intermediate wiki report: what has been done and what remains to be done?

5    Fri, 14.10.2016   Lecture

6    Tue, 18.10.2016   Lecture

     Tue, 18.10.2016   deadline for submitting data sets

     Fri, 21.10.2016   deadline for paper submission

                       vacation

7    Fri, 4.11.2016    Final lecture

     7.-11.11.2016     oral presentations (individual time slots)

     14 - 18.11.2016   Exam (exact date to be confirmed)

All deadlines: 23:59 Paris time

Page 93:

Conclusions: Benchmarking Continuous Optimizers

I hope it became clear...

...what the important issues in algorithm benchmarking are

...which functionality is behind the COCO platform

...and how to measure performance in particular

...what the basics of multiobjective optimization are

...and what the next important steps are:

read the assigned paper and implement the algorithm

document everything on the wiki

Monday in 2 weeks: intermediate report on wiki

Page 94:

And now...

...time for your questions and problems around COCO