Towards prediction of algorithm performance in real world problems
Tommy Messelis*
Stefaan Haspeslagh, Burak Bilgin
Patrick De Causmaecker, Greet Vanden Berghe
Overview
scope
performance prediction
two approaches
experimental setup & results
conclusions
future work
Scope
Real world timetabling and personnel scheduling problems
example: the Nurse Rostering Problem (NRP): assign nurses to shifts for some planning horizon
pre-determined demands
satisfying hard constraints, e.g. no nurse can work in two places at the same time
and soft constraints, e.g. a nurse should not work more than 7 days in a row
objective: Find an assignment that satisfies all hard constraints and as many soft constraints as possible.
NP-hard combinatorial optimisation problem
Scope
in practice, finding the optimal solution is computationally infeasible, even for small real world problems
we rely on good-enough, fast-enough solutions provided by (meta)heuristics
Performance prediction
There are many solution methods:
some perform very well, while others perform very badly on the same instances
no method outperforms the others on all instances
It would be good to know in advance how well an algorithm will perform on a given problem instance:
choose the best algorithm and use the resources as efficiently as possible
decisions should be made without spending the (possibly scarce) resources
based on basic, quickly computable properties of the instance itself
Empirical hardness models
idea: map efficiently computable features onto empirical hardness measures
'empirical': we need to run some algorithm to get an idea of the hardness of an instance
'hardness': measured by some performance criteria of the algorithm
General framework
Introduced by Leyton-Brown et al.
1. identify a problem instance distribution
2. select one or more algorithms
3. select a set of inexpensive features
4. generate a training set of instances, run all algorithms and determine runtimes, compute all feature values for all instances
5. eliminate redundant/uninformative features
6. build empirical hardness models (functions of the features that predict the runtime)
K. Leyton-Brown, E. Nudelman, Y. Shoham. Learning the empirical hardness of optimisation problems: The case of combinatorial auctions. In LNCS, 2002.
K. Leyton-Brown, E. Nudelman, Y. Shoham. Empirical hardness models: Methodology and a case study on combinatorial auctions. Journal of the ACM, 2009.
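The six framework steps above can be sketched end to end. Everything in this sketch is a hypothetical stand-in (a synthetic instance distribution and a made-up "runtime" oracle), not the authors' actual setup; it only illustrates the shape of the pipeline.

```python
import numpy as np

rng = np.random.default_rng(42)

# 1.-2. Stand-in instance distribution and "algorithm": each instance is a
#       vector of cheap features, and runtime is an unknown function of them.
def sample_instance():
    return rng.uniform(0.0, 1.0, size=3)

def measured_runtime(f):                       # hypothetical oracle
    return 2.0 * f[0] + 0.5 * f[1] + rng.normal(0.0, 0.05)

# 3.-4. Generate a training set, "run the algorithm", compute the features.
X = np.array([sample_instance() for _ in range(300)])
y = np.array([measured_runtime(f) for f in X])

# 5. Eliminate uninformative (here: constant) features.
keep = np.std(X, axis=0) > 1e-12
X = X[:, keep]

# 6. Build an empirical hardness model: linear regression via least squares.
Xd = np.column_stack([np.ones(len(y)), X])     # add an intercept column
beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)

def predict(f):
    return beta[0] + f[keep] @ beta[1:]
```

The learned model can then be applied to a fresh instance's features without running the algorithm at all, which is the whole point of the framework.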
Performance prediction
We will use this framework to predict other performance criteria as well:
quality of the solutions
quality gap (distance between the found solution and the optimal solution)
for both a complete solver and a metaheuristic
a proof-of-concept study, on a small 'real world'-like instance distribution
Approaches
1. NRP-specific context
feature set for nurse rostering problems
build empirical hardness models based on these features
2. General SAT context
translate the NRP instances into SAT instances
use an existing extensive feature set for SAT problems
build empirical hardness models based on SAT features
Experimental setup
1. Instance distribution
we focus on an instance distribution that produces small instances, still solvable by a complete solver in a reasonable amount of time
6 nurses, 14 days, fluctuating sequence constraints, coverage and availabilities
2. Algorithm selection + performance criteria
integer program representation, CPLEX solver: runtime, quality of the optimal solution
variable neighbourhood search (metaheuristic): quality of the approximate solution, quality gap between optimal and approximate solution
Experimental setup
3. Feature set
NRP features
structural property values:
min & max total nr of assignments
min & max nr of consecutive working days
min & max nr of consecutive free days
ratios, expressing the 'tightness' of the constraints:
hardness: availability / coverage demand
tightness: max / min ratios (the smaller, the less freedom)
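The ratio features above are simple arithmetic on instance properties. A minimal sketch, assuming a toy dict representation of an instance; the field names are illustrative only, not the authors' actual format:

```python
# Sketch of the NRP 'tightness' ratio features; hypothetical field names.
def nrp_ratio_features(inst):
    return {
        # hardness: available capacity per unit of coverage demand
        "hardness": inst["availability"] / inst["coverage_demand"],
        # tightness: max/min ratios -- the closer to 1, the less freedom
        "tight_assignments": inst["max_assignments"] / inst["min_assignments"],
        "tight_work_days": inst["max_consecutive_work"] / inst["min_consecutive_work"],
        "tight_free_days": inst["max_consecutive_free"] / inst["min_consecutive_free"],
    }

# Example toy instance (invented numbers, for illustration).
example = {"availability": 84, "coverage_demand": 60,
           "min_assignments": 8, "max_assignments": 10,
           "min_consecutive_work": 2, "max_consecutive_work": 5,
           "min_consecutive_free": 1, "max_consecutive_free": 3}
feats = nrp_ratio_features(example)
```

Features like these are cheap to compute directly from the instance data, which is exactly the property the framework requires.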
Experimental setup
3. Feature set (continued)
After translation to a SAT formula, we can use (a subset of) an existing feature set for SAT problems
SAT features:
problem size: #clauses, #variables, c/v ratio, ...
problem structure: different graph representations lead to various node and edge degree statistics
balance features: fraction of positive literals per clause, positive occurrences per variable, ...
proximity to Horn formulae
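A few of the cheapest SAT features can be computed directly from a clause list in DIMACS style (each clause a list of signed integers). A small sketch, not the full existing feature set the slides refer to:

```python
# Compute a handful of cheap SAT features from a CNF clause list.
def sat_features(clauses, n_vars):
    n_clauses = len(clauses)
    lits = [l for c in clauses for l in c]
    pos = sum(1 for l in lits if l > 0)
    # Horn clause: at most one positive literal.
    horn = sum(1 for c in clauses if sum(1 for l in c if l > 0) <= 1)
    return {
        "clauses": n_clauses,
        "vars": n_vars,
        "cv_ratio": n_clauses / n_vars,            # problem size
        "frac_pos_literals": pos / len(lits),      # balance feature
        "frac_horn_clauses": horn / n_clauses,     # proximity to Horn
    }

# Toy formula: (x1 or not x2) and (not x1 or not x2 or x3) and (not x3)
example = [[1, -2], [-1, -2, 3], [-3]]
feats = sat_features(example, n_vars=3)
```

The graph-based degree statistics mentioned on the slide would sit on top of the same clause list, but need the variable/clause incidence graphs to be built first.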
Experimental setup
4. Sampling and measuring
2 training sets: 500 instances & 2000 instances
for computational reasons, the performance criteria that depend on CPLEX results are modelled using the smaller set
both algorithms are run on the training instances, all performance criteria are determined, all feature values are computed
5. Feature elimination
useless features (i.e. univalued)
correlated features (based on correlation analysis): if correlation coefficient > 0.7, then filter the feature
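The elimination step above can be sketched as a greedy filter. This is one plausible reading of the slide's rule (drop a feature once its absolute Pearson correlation with an already kept feature exceeds 0.7), not necessarily the authors' exact procedure:

```python
import numpy as np

def filter_features(X, names, threshold=0.7):
    """Drop univalued features, then greedily drop any feature whose
    |Pearson correlation| with an already kept feature exceeds threshold."""
    kept = []
    for j in range(X.shape[1]):
        if np.std(X[:, j]) == 0:                   # univalued -> useless
            continue
        if all(abs(np.corrcoef(X[:, j], X[:, k])[0, 1]) <= threshold
               for k in kept):
            kept.append(j)
    return [names[j] for j in kept]

# Synthetic demo: a correlated copy and a constant column get filtered out.
rng = np.random.default_rng(7)
a = rng.normal(size=200)
X = np.column_stack([a,
                     2.0 * a + 0.01 * rng.normal(size=200),  # ~duplicate of a
                     rng.normal(size=200),                   # independent
                     np.ones(200)])                          # univalued
selected = filter_features(X, ["a", "2a", "b", "const"])
```

Note that the result depends on the order in which features are visited; a more careful variant would keep, from each correlated group, the feature most predictive of the target.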
Experimental setup
6. Model learning
statistical regression techniques: linear regression, a relatively simple technique, but with good results
models for all performance criteria, based on NRP features and based on SAT features
models are built iteratively:
start with a model consisting of all uncorrelated features
features are iteratively removed from the regression model when their P-value is higher than 0.05
evaluation using test sets of 100 and 1000 instances
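The iterative model building described above is backward elimination on p-values. A self-contained sketch using ordinary least squares (the slides only say "linear regression", so the details here are an assumption):

```python
import numpy as np
from scipy import stats

def ols_pvalues(X, y):
    """Least-squares fit with intercept; returns coefficients and two-sided
    p-values for the non-intercept terms."""
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ beta
    dof = n - Xd.shape[1]
    sigma2 = resid @ resid / dof                     # residual variance
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(Xd.T @ Xd)))
    t = beta / se
    return beta[1:], (2.0 * stats.t.sf(np.abs(t), dof))[1:]

def backward_eliminate(X, y, names, alpha=0.05):
    """Repeatedly drop the least significant feature until every remaining
    p-value is below alpha (the 0.05 threshold from the slide)."""
    keep = list(range(X.shape[1]))
    while keep:
        _, p = ols_pvalues(X[:, keep], y)
        worst = int(np.argmax(p))
        if p[worst] <= alpha:
            break
        keep.pop(worst)
    return [names[i] for i in keep]

# Synthetic demo: two informative features, one pure-noise feature.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 3))
y = 3.0 * X[:, 0] - 1.0 * X[:, 1] + 0.1 * rng.normal(size=300)
kept = backward_eliminate(X, y, ["f0", "f1", "f2"])
```

The "better systematic building of models" mentioned under future work would replace exactly this manual-style loop with an automated model selection procedure.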
Results
CPLEX runtime
models are not very accurate (R2 = 0.10), due to high variability in the runtimes
most instances (70%) are solved in 4 ms, some take up to 4 hours
models for log(runtime) are 'better', but still not very accurate (R2 = 0.17)
Results
quality of the optimal solution
[Figure: scatter plots of predicted vs. real optimal value on the test set; SAT model R2 = 0.98, NRP model R2 = 0.94]
Results
quality of the approximate solution
[Figure: scatter plots of predicted vs. real approximate quality on the test set; NRP model R2 = 0.96, SAT model R2 = 0.94]
Results
quality gap between approximate and optimal solution
[Figure: scatter plots of predicted vs. real gap on the test set; SAT model R2 = 0.54, NRP model R2 = 0.40]
Conclusions
We obtain good results for predicting solution quality:
complete CPLEX solver
approximate metaheuristic
quality gap prediction is less accurate, though still not bad
CPLEX runtime could not be modelled, but is this really necessary?
Conclusions
importance of the translation to SAT
representing the instances as general, abstract SAT problems helps:
expressing hidden structures
context-free modelling
SAT models are slightly better than NRP models, due to the quantity/quality of the NRP features
though, still very good results, even with this limited NRP feature set!
We can build empirical hardness models!
Future work
'real' real world instances
improve the runtime prediction, using other features better suited for runtime prediction, e.g. based on solving relaxations of integer programs
for some performance criteria, other machine learning techniques might be more suitable, e.g. a classification tool for runtime prediction: very short / feasible / infeasible
better systematic building of models, instead of manual deletion of features in the iterative models