sequential parameter optimization for symbolic regression · seven introducing rgp rgp overview...

28
SEVEN Sequential Parameter Optimization for Symbolic Regression Thomas Bartz-Beielstein Oliver Flasch Martin Zaefferer {firstname.lastname}@fh-koeln.de SPOT SEVEN Cologne University of Applied Sciences Faculty for Computer and Engineering Science July 2012 Bartz-Beielstein, Flasch, Zaefferer (CUAS) SPO for Symbolic Regression July 2012 1 / 27

Upload: others

Post on 15-Aug-2020

9 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Sequential Parameter Optimization for Symbolic Regression · SEVEN Introducing RGP RGP Overview General Features I modular GP implementation in R I simplicity beats complexity I convention

SEVEN

Sequential Parameter Optimization forSymbolic Regression

Thomas Bartz-Beielstein Oliver Flasch Martin Zaefferer{firstname.lastname}@fh-koeln.de

SPOT SEVENCologne University of Applied Sciences

Faculty for Computer and Engineering Science

July 2012

Bartz-Beielstein, Flasch, Zaefferer (CUAS) SPO for Symbolic Regression July 2012 1 / 27

Page 2: Sequential Parameter Optimization for Symbolic Regression · SEVEN Introducing RGP RGP Overview General Features I modular GP implementation in R I simplicity beats complexity I convention

SEVEN

Agenda

Introducing RGP

Introducing SPOT

Experiments with SPOT

thomas section

Bartz-Beielstein, Flasch, Zaefferer (CUAS) SPO for Symbolic Regression July 2012 2 / 27

Page 3: Sequential Parameter Optimization for Symbolic Regression · SEVEN Introducing RGP RGP Overview General Features I modular GP implementation in R I simplicity beats complexity I convention

SEVEN

Introducing RGP

R: Programming Language for Statistics

I R is “GNU S”, a freely available language andenvironment for statistical computing and graphics.

I R provides a wide variety of statistical and graphicaltechniques:

I linear and nonlinear modelling,I statistical tests,I time series analysis,I classification,I clustering, etc.

I Useful platform for GP, providing:I flexible interactive environmentI fast expression maniplutaion and evaluationI powerful visualization toolsI tools for parallel computing

I See R project homepagehttp://cran.r-project.org/ for further information.

Bartz-Beielstein, Flasch, Zaefferer (CUAS) SPO for Symbolic Regression July 2012 3 / 27

Page 4: Sequential Parameter Optimization for Symbolic Regression · SEVEN Introducing RGP RGP Overview General Features I modular GP implementation in R I simplicity beats complexity I convention

SEVEN

Introducing RGP

RGP OverviewGeneral Features

I modular GP implementation in RI simplicity beats complexityI convention over configuration

I large feature setI multiple search heuristics (Pareto GP, TinyGP, . . . )I multiple representations (tree GP and linear GP)I multiple sets of variation operatorsI support for strongly-typed GP

↪→ complex parameterization

I performance-critical functions also implemented in C

I comprehensive documentation

I Freely available (GPL-2) on CRAN:install.packages("rgp")

I See RGP project homepage http://rsymbolic.org/

for details and “bleeding edge releases”.

Bartz-Beielstein, Flasch, Zaefferer (CUAS) SPO for Symbolic Regression July 2012 4 / 27

Page 5: Sequential Parameter Optimization for Symbolic Regression · SEVEN Introducing RGP RGP Overview General Features I modular GP implementation in R I simplicity beats complexity I convention

SEVEN

Introducing RGP

GP Search Heuristics

I concrete search strategy employed by a GP system

I independent of the concrete GP search space

↪→ decouple the search heuristic from the search space

I GP search heuristic components:I selection strategyI variation pipeline (order of variation operator application)I diversity preservation

I GP system components independent of the search heuristic:I GP individual representationI GP individual initialization and variation (mutation and crossover)I GP individual evaluation

I examples of GP search heuristics:I classical single-objective steady-state EAs with tournament selectionI modern multi-objective steady-state heuristics

Bartz-Beielstein, Flasch, Zaefferer (CUAS) SPO for Symbolic Regression July 2012 5 / 27

Page 6: Sequential Parameter Optimization for Symbolic Regression · SEVEN Introducing RGP RGP Overview General Features I modular GP implementation in R I simplicity beats complexity I convention

SEVEN

Introducing RGP

TinyGP

I popular small GP implementation mainly used in teaching

I steady-state single-objective search heuristic with tournament selection

I no direct means of diversity preservation

I included as a reference with well-known performance characteristics

Table : Parameters of the TinyGP search heuristic.

Variable (Symbol) Domain Default

Population Size mu (µ) N 300Tournament Size tournamentSize (stournament) N 2Recombination Probability recombinationProbability (prec) [0, 1] 0.9

Bartz-Beielstein, Flasch, Zaefferer (CUAS) SPO for Symbolic Regression July 2012 6 / 27

Page 7: Sequential Parameter Optimization for Symbolic Regression · SEVEN Introducing RGP RGP Overview General Features I modular GP implementation in R I simplicity beats complexity I convention

SEVEN

Introducing RGP

Generational Multi-Objective GP (GMOGP)

I based on the well-known multi-objective generational (µ+ λ) EA NSGAII

I coarsely scalable complexity through optional selection criteria:individual age, individual complexity

I diversity preservation through age-layering

I included as search heuristic with scalable complexity

Table : Parameters of the GMOGP search heuristic.

Variable (Symbol) Domain Default

Population Size mu (µ) N 300Children per Generation lambda (λ) N 20New Individuals per Generation nu (ν) N0 1Enable Complexity Criterion complexityCriterion B true

Enable Age Criterion ageCriterion B true

Recombination Probability recombinationProbability (prec) [0, 1] 0.1

Bartz-Beielstein, Flasch, Zaefferer (CUAS) SPO for Symbolic Regression July 2012 7 / 27

Page 8: Sequential Parameter Optimization for Symbolic Regression · SEVEN Introducing RGP RGP Overview General Features I modular GP implementation in R I simplicity beats complexity I convention

SEVEN

Introducing RGP

Experiment Setup

I Research goal: Quantify influence of RGP search heuristic parameters onalgorithm performance (single algorithm, single problem).

Table : RGP parameters independent of the search heuristic.

Problem Symbolic Regression of f (x) := sin(x) + cos(2 · x)Fitness Cases 200 equidistant samples in [0, 4 · π]Error Measure sample RMSE

Function Set {+,−, ·,÷}Input Variable Set {x}Constant Set uniform random constants in [−1, 1]Individual Size Limit 64Mutation Operator Set { insert/delete subtree, change function/constant }Crossover Operator Set { random subtree crossover }

Time Budget per GP Run 5 minutesInitial Experiment Design Size 10Number of Sequential GP Runs 100

Bartz-Beielstein, Flasch, Zaefferer (CUAS) SPO for Symbolic Regression July 2012 8 / 27

Page 9: Sequential Parameter Optimization for Symbolic Regression · SEVEN Introducing RGP RGP Overview General Features I modular GP implementation in R I simplicity beats complexity I convention

SEVEN

Introducing SPOT

SPOT Introduction

Sequential Parameter Optimization [1] Toolbox (SPOT1)

I Based on statistical methods and Design of Experiment

Create initial design

Terminate?

Evaluate design on target function

Report + End

Exploit model: choose new design

yes

Build surrogate model

no

Optional User Input,

e.g.:

Check output, remove

outliers

Add specific points to design

(expert knowledge)

1SPOT and all other used R packages can be retrieved from the CRAN homepage, i.e.http://cran.r-project.org.

Bartz-Beielstein, Flasch, Zaefferer (CUAS) SPO for Symbolic Regression July 2012 9 / 27

Page 10: Sequential Parameter Optimization for Symbolic Regression · SEVEN Introducing RGP RGP Overview General Features I modular GP implementation in R I simplicity beats complexity I convention

SEVEN

Experiments with SPOT SPOT Setup

SPOT Setup

Table : Parameters influencing SPOT performance

Initial Design Size 10Initial Design Repeats 2Maximum Repeats 5Budget (GP-Runs) 100Old Best Size 3New Design Size 1Budget Allocation Linearly increasing

Surrogate Model Kriging ModelSurrogate Optimization Method CMA-ESSurrogate Optimization Budget 1000

Bartz-Beielstein, Flasch, Zaefferer (CUAS) SPO for Symbolic Regression July 2012 10 / 27

Page 11: Sequential Parameter Optimization for Symbolic Regression · SEVEN Introducing RGP RGP Overview General Features I modular GP implementation in R I simplicity beats complexity I convention

SEVEN

Experiments with SPOT Code Example

Interfacing RGP and SPOT I

> spotRgpTargetFunction <- function(x, time = 120) {+ populationSize <- x[1]+ tournamentSizePercentage <- x[2]+ crossoverProbability <- x[3]++ ## [...] ## repair parameters as necessary++ ## [...] ## define problem data, to be solved by symb. regress.++ ## [...] ## run symbolic regression with parameters++ ## [...] ## calculate RMSE (fitness) of best individual in population++ return (bestFitness)+ }

Bartz-Beielstein, Flasch, Zaefferer (CUAS) SPO for Symbolic Regression July 2012 11 / 27

Page 12: Sequential Parameter Optimization for Symbolic Regression · SEVEN Introducing RGP RGP Overview General Features I modular GP implementation in R I simplicity beats complexity I convention

SEVEN

Experiments with SPOT Code Example

Configuring and starting SPOT

> conf <- list(alg.func = spotRgpTargetFunction,+ alg.roi = spotROI(lower = c(10, 0.0, 0.0),+ upper = c(1000, 1.0, 1.0),+ type = c("INT", "FLOAT", "FLOAT")),+ alg.seed = 1,+ spot.seed = 0,+ seq.predictionModel.func = "spotPredictForrester",+ seq.predictionOpt.func = "spotPredictOptMulti",+ seq.predictionOpt.budget = 1000,+ seq.predictionOpt.method = "cmaes",+ io.verbosity = 3,+ report.interactive = TRUE,+ spot.ocba = FALSE, # no variance at some points causes OCBA to crash+ init.design.size = 10,+ init.design.repeats = 2,+ seq.design.oldBest.size = 3,+ seq.design.size = 1000,+ seq.design.new.size = 1,+ seq.design.maxRepeats = 5,+ auto.loop.nevals = 100)

> time <- 600 # set per RGP run time budget (in seconds), default is 120> res <- spot(spotConfig = conf, time = time)

Bartz-Beielstein, Flasch, Zaefferer (CUAS) SPO for Symbolic Regression July 2012 12 / 27

Page 13: Sequential Parameter Optimization for Symbolic Regression · SEVEN Introducing RGP RGP Overview General Features I modular GP implementation in R I simplicity beats complexity I convention

SEVEN

Experiments with SPOT Results of Code Example

Results of the automated SPOT run

> spot(spotConfig=append(+ list(report.func="spotReportSens"),+ res),spotTask="rep")

−1.0 −0.5 0.0 0.5 1.00.89

50.

905

0.91

50.

925

NULL

normalized ROI

Y

●● ●●●●

VARX1VARX2VARX3

VARX1 (pop. size) VARX2 (tournament size %) VARX3 (crossover prob.)

Bartz-Beielstein, Flasch, Zaefferer (CUAS) SPO for Symbolic Regression July 2012 13 / 27

Page 14: Sequential Parameter Optimization for Symbolic Regression · SEVEN Introducing RGP RGP Overview General Features I modular GP implementation in R I simplicity beats complexity I convention

SEVEN

Experiments with SPOT Results of Code Example

> spot(spotConfig=append(+ list(report.func="spotReportContour",report.observations.all=T),+ res),spotTask="rep")

0.925

0.930

0.935

0.940

0.945

0.950

0.955●

200 400 600 800 1000

0.0

0.2

0.4

0.6

0.8

1.0

VARX1

VA

RX

2

0.93

0.94

0.95

0.96

0.97

0.98

200 400 600 800 1000

0.0

0.2

0.4

0.6

0.8

1.0

VARX1

VA

RX

3

0.93

0.94

0.95

0.96

0.97

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

VARX2

VA

RX

3

Bartz-Beielstein, Flasch, Zaefferer (CUAS) SPO for Symbolic Regression July 2012 14 / 27

Page 15: Sequential Parameter Optimization for Symbolic Regression · SEVEN Introducing RGP RGP Overview General Features I modular GP implementation in R I simplicity beats complexity I convention

SEVEN

Experiments with SPOT Results of Code Example

> spot(spotConfig=append(+ list(report.func="spotReport3d"),+ res),spotTask="rep")

Bartz-Beielstein, Flasch, Zaefferer (CUAS) SPO for Symbolic Regression July 2012 15 / 27

Page 16: Sequential Parameter Optimization for Symbolic Regression · SEVEN Introducing RGP RGP Overview General Features I modular GP implementation in R I simplicity beats complexity I convention

SEVEN

thomas section

Factor Variables

I Categorical VariablesI Enable Complexity CriterionI Enable Age Criterion

I OptionsI Special treatment: Mapping to R not adequateI Analyze each factor setting separately: Many extra runsI Random forestsI Use linear model with dummy variables

Bartz-Beielstein, Flasch, Zaefferer (CUAS) SPO for Symbolic Regression July 2012 16 / 27

Page 17: Sequential Parameter Optimization for Symbolic Regression · SEVEN Introducing RGP RGP Overview General Features I modular GP implementation in R I simplicity beats complexity I convention

SEVEN

thomas section

Linear Models with Factors

I Initial design: 20 design points × 2 repeats = 40 GP runs

I 6 variables, 2 factors

I Linear model shows VARX3 (crossover prob.) has significant effect

I Refinement of the automated analysis:I Perform stepwise model selection by AIC

Df Sum Sq Mean Sq F value Pr(>F)VARX3 1.000000 0.001389 0.001389 13.577399 0.000711Residuals 38.000000 0.003887 0.000102

Bartz-Beielstein, Flasch, Zaefferer (CUAS) SPO for Symbolic Regression July 2012 17 / 27

Page 18: Sequential Parameter Optimization for Symbolic Regression · SEVEN Introducing RGP RGP Overview General Features I modular GP implementation in R I simplicity beats complexity I convention

SEVEN

thomas section

Linear Models: Screening, Model Selection

0.0 0.2 0.4 0.6 0.8 1.0

0.95

50.

960

0.96

50.

970

VARX3

aov(formula=y~VARX3,data=cbind(y,x))

I Starting 6 additional GP runs to refine the model: 3 old models, 1 new withthree repeats

VARX1 VARX2 VARX3 VARX4 VARX5 VARX6 CONFIG REPEATS STEP SEED

519 0.46 0.65 0 0 0.04 2 1 1 3

208 0.40 0.02 1 1 0.14 9 1 1 3

108 0.26 0.49 0 0 0.96 4 1 1 3

75 0.24 0.00 0 0 0.37 21 3 1 1

Bartz-Beielstein, Flasch, Zaefferer (CUAS) SPO for Symbolic Regression July 2012 18 / 27

Page 19: Sequential Parameter Optimization for Symbolic Regression · SEVEN Introducing RGP RGP Overview General Features I modular GP implementation in R I simplicity beats complexity I convention

SEVEN

thomas section

Linear Models: Regression and Dummy Variables

1.0 1.2 1.4 1.6 1.8 2.00.95

260.

9532

0.95

38

Eval: 46 , Y: 0.952603021030235

step

Y

1.0 1.2 1.4 1.6 1.8 2.0

200

300

400

500

step

VA

RX

1

1.0 1.2 1.4 1.6 1.8 2.0

0.41

0.43

0.45

step

VA

RX

2

1.0 1.2 1.4 1.6 1.8 2.0

0.0

0.2

0.4

0.6

step

VA

RX

3

1.0 1.2 1.4 1.6 1.8 2.0

0.0

0.4

0.8

step

VA

RX

4

1.0 1.2 1.4 1.6 1.8 2.0

0.0

0.4

0.8

step

VA

RX

5

1.0 1.2 1.4 1.6 1.8 2.0

0.04

0.08

0.12

step

VA

RX

6

I SPOT search path at step 2

Bartz-Beielstein, Flasch, Zaefferer (CUAS) SPO for Symbolic Regression July 2012 19 / 27

Page 20: Sequential Parameter Optimization for Symbolic Regression · SEVEN Introducing RGP RGP Overview General Features I modular GP implementation in R I simplicity beats complexity I convention

SEVEN

thomas section

Linear Models: Regression Analysis

I Second step shows similar results.

I Suggested design points after step 2

VARX1 VARX2 VARX3 VARX4 VARX5 VARX6 CONFIG REPEATS STEP SEED

208 0.40 0.02 1 1 0.14 9 1 2 4

108 0.26 0.49 0 0 0.96 4 1 2 4

555 0.54 0.50 1 0 0.91 15 2 2 3

464 0.46 0.00 0 1 0.29 22 4 2 1

Bartz-Beielstein, Flasch, Zaefferer (CUAS) SPO for Symbolic Regression July 2012 20 / 27

Page 21: Sequential Parameter Optimization for Symbolic Regression · SEVEN Introducing RGP RGP Overview General Features I modular GP implementation in R I simplicity beats complexity I convention

SEVEN

thomas section

Linear Models: Regression and Dummy Variables

1.0 1.5 2.0 2.5 3.00.95

260.

9532

0.95

38

Eval: 54 , Y: 0.953649013942794

step

Y

● ●

1.0 1.5 2.0 2.5 3.0

200

300

400

500

step

VA

RX

1

● ●

1.0 1.5 2.0 2.5 3.0

0.41

0.43

0.45

step

VA

RX

2

● ●

1.0 1.5 2.0 2.5 3.0

0.0

0.2

0.4

0.6

step

VA

RX

3

● ●

1.0 1.5 2.0 2.5 3.0

0.0

0.4

0.8

step

VA

RX

4

● ●

1.0 1.5 2.0 2.5 3.0

0.0

0.4

0.8

step

VA

RX

5

● ●

1.0 1.5 2.0 2.5 3.0

0.04

0.08

0.12

step

VA

RX

6

I SPOT search path at step 3

Bartz-Beielstein, Flasch, Zaefferer (CUAS) SPO for Symbolic Regression July 2012 21 / 27

Page 22: Sequential Parameter Optimization for Symbolic Regression · SEVEN Introducing RGP RGP Overview General Features I modular GP implementation in R I simplicity beats complexity I convention

SEVEN

thomas section

Linear Models: Regression Analysis

I Second step shows similar results.

VARX1 VARX2 VARX3 VARX4 VARX5 VARX6 CONFIG REPEATS STEP SEED

208 0.40 0.02 1 1 0.14 9 1 3 5

108 0.26 0.49 0 0 0.96 4 1 3 5

75 0.24 0.00 0 0 0.37 21 2 3 4

352 0.60 0.00 1 1 0.99 23 5 3 1

Bartz-Beielstein, Flasch, Zaefferer (CUAS) SPO for Symbolic Regression July 2012 22 / 27

Page 23: Sequential Parameter Optimization for Symbolic Regression · SEVEN Introducing RGP RGP Overview General Features I modular GP implementation in R I simplicity beats complexity I convention

SEVEN

thomas section

Linear Models: Regression and Dummy Variables

1.0 1.5 2.0 2.5 3.0 3.5 4.0

0.95

300.

9540

0.95

50

Eval: 63 , Y: 0.954956008081695

step

Y

● ●

1.0 1.5 2.0 2.5 3.0 3.5 4.0

100

300

500

step

VA

RX

1

● ●

1.0 1.5 2.0 2.5 3.0 3.5 4.0

0.25

0.35

0.45

step

VA

RX

2

● ●●

1.0 1.5 2.0 2.5 3.0 3.5 4.0

0.0

0.2

0.4

0.6

step

VA

RX

3

● ●

1.0 1.5 2.0 2.5 3.0 3.5 4.0

0.0

0.4

0.8

step

VA

RX

4

● ●

1.0 1.5 2.0 2.5 3.0 3.5 4.0

0.0

0.4

0.8

step

VA

RX

5

● ●

1.0 1.5 2.0 2.5 3.0 3.5 4.0

0.05

0.15

0.25

0.35

step

VA

RX

6

I SPOT search path at step 4

Bartz-Beielstein, Flasch, Zaefferer (CUAS) SPO for Symbolic Regression July 2012 23 / 27

Page 24: Sequential Parameter Optimization for Symbolic Regression · SEVEN Introducing RGP RGP Overview General Features I modular GP implementation in R I simplicity beats complexity I convention

SEVEN

thomas section

Linear Models with Factors

I Refinement of the automated analysis:I Perform stepwise model selection by AIC

Df Sum Sq Mean Sq F value Pr(>F)VARX1 1.000000 0.000588 0.000588 9.403483 0.003309VARX3 1.000000 0.001004 0.001004 16.056302 0.000180VARX5 1.000000 0.000048 0.000048 0.765542 0.385272VARX3:VARX5 1.000000 0.000290 0.000290 4.640992 0.035457VARX1:VARX3 1.000000 0.000550 0.000550 8.790953 0.004414Residuals 57.000000 0.003564 0.000063

Bartz-Beielstein, Flasch, Zaefferer (CUAS) SPO for Symbolic Regression July 2012 24 / 27

Page 25: Sequential Parameter Optimization for Symbolic Regression · SEVEN Introducing RGP RGP Overview General Features I modular GP implementation in R I simplicity beats complexity I convention

SEVEN

thomas section

Linear Models: Regression and Dummy Variables

0 200 400 600 800 1000

0.95

0.96

0.97

0.98

0.99

1 VARX1

0.0 0.2 0.4 0.6 0.8 1.0

0.95

0.96

0.97

0.98

0.99

2 VARX3

0.95

0.96

0.97

0.98

0.99

3 VARX5

0.95

0.96

0.97

0.98

0.99

0 1

VARX1

VARX3

1 VARX1: VARX3

VARX1

VARX5

2 VARX1: VARX5

VARX3

VARX5

3 VARX3: VARX5

aov(formula=y~VARX1+VARX3+VARX5+VARX3:VARX5+VAR...

I VARX1 + VARX3 + VARX5 + VARX3:VARX5 + VARX1:VARX3

I Interactions between x1 and x5 and between x3 and x5

Bartz-Beielstein, Flasch, Zaefferer (CUAS) SPO for Symbolic Regression July 2012 25 / 27

Page 26: Sequential Parameter Optimization for Symbolic Regression · SEVEN Introducing RGP RGP Overview General Features I modular GP implementation in R I simplicity beats complexity I convention

SEVEN

thomas section

Summary: Regression Analysis

I Categorical parameters can be easily integrated into the SPOT tuningframework

I More complex models (MARS, GLM) can be used as well, however: Occam’srazor

I Crossover prob. has significant impact, should be low

I Children per generation and age criterion: no effect

I Interactions

I Further steps: nested designs for complicated settings

Bartz-Beielstein, Flasch, Zaefferer (CUAS) SPO for Symbolic Regression July 2012 26 / 27

Page 27: Sequential Parameter Optimization for Symbolic Regression · SEVEN Introducing RGP RGP Overview General Features I modular GP implementation in R I simplicity beats complexity I convention

SEVEN

Acknowledgments

Acknowledgments

I This work has been supported by the Federal Ministry of Education andResearch (BMBF) under the grants FIWA (AIF FKZ 17N1009) and CIMO(FKZ 17002X11)

Bartz-Beielstein, Flasch, Zaefferer (CUAS) SPO for Symbolic Regression July 2012 27 / 27

Page 28: Sequential Parameter Optimization for Symbolic Regression · SEVEN Introducing RGP RGP Overview General Features I modular GP implementation in R I simplicity beats complexity I convention

SEVEN

Acknowledgments

[1] Thomas Bartz-Beielstein, Konstantinos E. Parsopoulos, and Michael N.Vrahatis.Design and analysis of optimization algorithms using computational statistics.Applied Numerical Analysis and Computational Mathematics (ANACM),1(2):413–433, 2004.

Bartz-Beielstein, Flasch, Zaefferer (CUAS) SPO for Symbolic Regression July 2012 27 / 27