Towards prediction of algorithm performance in real world problems
Tommy Messelis*
Stefaan Haspeslagh, Burak Bilgin
Patrick De Causmaecker, Greet Vanden Berghe
Overview
scope
performance prediction
two approaches
experimental setup & results
conclusions
future work
Scope
Real world timetabling and personnel scheduling problems
example: the Nurse Rostering Problem (NRP): assign nurses to shifts for some planning horizon
pre-determined demands
satisfying hard constraints, e.g. no nurse can work in two places at the same time
and soft constraints, e.g. a nurse should not work more than 7 days in a row
objective: Find an assignment that satisfies all hard constraints and as many soft constraints as possible.
NP-hard combinatorial optimisation problem
Scope
in practice, finding the optimal solution is computationally infeasible, even for small real world problems
we rely on good-enough, fast-enough solutions provided by (meta)heuristics
Performance prediction
There are many solution methods:
some perform very well, while others perform very badly on the same instances
no method outperforms the others on all instances
It would be good to know in advance how well an algorithm will perform on a given problem instance:
choose the best algorithm and use the resources as efficiently as possible
decisions should be made without spending the (possibly scarce) resources
based on basic, quickly computable properties of the instance itself
Empirical hardness models
idea: map efficiently computable features onto empirical hardness measures
'empirical': we need to run some algorithm to get an idea of the hardness of an instance
'hardness': measured by some performance criteria of the algorithm
General framework
Introduced by Leyton-Brown et al.
1. identify a problem instance distribution
2. select one or more algorithms
3. select a set of inexpensive features
4. generate a training set of instances, run all algorithms and determine runtimes, compute all feature values for all instances
5. eliminate redundant/uninformative features
6. build empirical hardness models (functions of the features that predict the runtime)
K. Leyton-Brown, E. Nudelman, Y. Shoham. Learning the empirical hardness of optimisation problems: The case of combinatorial auctions. In LNCS, 2002.
K. Leyton-Brown, E. Nudelman, Y. Shoham. Empirical hardness models: Methodology and a case study on combinatorial auctions. Journal of the ACM, 2009.
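The six framework steps above can be sketched end to end. Everything in this sketch is a hypothetical stand-in (a synthetic instance distribution and a made-up "runtime" oracle), not the authors' actual setup; it only illustrates the shape of the pipeline.

```python
import numpy as np

rng = np.random.default_rng(42)

# 1.-2. Stand-in instance distribution and "algorithm": each instance is a
#       vector of cheap features, and runtime is an unknown function of them.
def sample_instance():
    return rng.uniform(0.0, 1.0, size=3)

def measured_runtime(f):                       # hypothetical oracle
    return 2.0 * f[0] + 0.5 * f[1] + rng.normal(0.0, 0.05)

# 3.-4. Generate a training set, "run the algorithm", compute the features.
X = np.array([sample_instance() for _ in range(300)])
y = np.array([measured_runtime(f) for f in X])

# 5. Eliminate uninformative (here: constant) features.
keep = np.std(X, axis=0) > 1e-12
X = X[:, keep]

# 6. Build an empirical hardness model: linear regression via least squares.
Xd = np.column_stack([np.ones(len(y)), X])     # add an intercept column
beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)

def predict(f):
    return beta[0] + f[keep] @ beta[1:]
```

The learned model can then be applied to a fresh instance's features without running the algorithm at all, which is the whole point of the framework.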
Performance prediction
We will use this framework to predict other performance criteria as well:
quality of the solutions
quality gap (distance between the found solution and the optimal solution)
for both a complete solver and a metaheuristic
a proof-of-concept study, on a small 'real world'-like instance distribution
Approaches
1. NRP-specific context
feature set for nurse rostering problems
build empirical hardness models based on these features
2. General SAT context
translate the NRP instances into SAT instances
use an existing extensive feature set for SAT problems
build empirical hardness models based on SAT features
Experimental setup
1. Instance distribution
we focus on an instance distribution that produces small instances, still solvable by a complete solver in a reasonable amount of time
6 nurses, 14 days, fluctuating sequence constraints, coverage and availabilities
2. Algorithm selection + performance criteria
integer program representation, CPLEX solver: runtime, quality of the optimal solution
variable neighbourhood search (metaheuristic): quality of the approximate solution, quality gap between optimal and approximate solution
Experimental setup
3. Feature set
NRP features
structural property values:
min & max total nr of assignments
min & max nr of consecutive working days
min & max nr of consecutive free days
ratios, expressing the 'tightness' of the constraints:
hardness: availability / coverage demand
tightness: max / min ratios (the smaller, the less freedom)
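The ratio features above are simple arithmetic on instance properties. A minimal sketch, assuming a toy dict representation of an instance; the field names are illustrative only, not the authors' actual format:

```python
# Sketch of the NRP 'tightness' ratio features; hypothetical field names.
def nrp_ratio_features(inst):
    return {
        # hardness: available capacity per unit of coverage demand
        "hardness": inst["availability"] / inst["coverage_demand"],
        # tightness: max/min ratios -- the closer to 1, the less freedom
        "tight_assignments": inst["max_assignments"] / inst["min_assignments"],
        "tight_work_days": inst["max_consecutive_work"] / inst["min_consecutive_work"],
        "tight_free_days": inst["max_consecutive_free"] / inst["min_consecutive_free"],
    }

# Example toy instance (invented numbers, for illustration).
example = {"availability": 84, "coverage_demand": 60,
           "min_assignments": 8, "max_assignments": 10,
           "min_consecutive_work": 2, "max_consecutive_work": 5,
           "min_consecutive_free": 1, "max_consecutive_free": 3}
feats = nrp_ratio_features(example)
```

Features like these are cheap to compute directly from the instance data, which is exactly the property the framework requires.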
Experimental setup
3. Feature set (continued)
After translation to a SAT formula, we can use (a subset of) an existing feature set for SAT problems
SAT features:
problem size: #clauses, #variables, c/v ratio, ...
problem structure: different graph representations lead to various node and edge degree statistics
balance features: fraction of positive literals per clause, positive occurrences per variable, ...
proximity to Horn formulae
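A few of the cheapest SAT features can be computed directly from a clause list in DIMACS style (each clause a list of signed integers). A small sketch, not the full existing feature set the slides refer to:

```python
# Compute a handful of cheap SAT features from a CNF clause list.
def sat_features(clauses, n_vars):
    n_clauses = len(clauses)
    lits = [l for c in clauses for l in c]
    pos = sum(1 for l in lits if l > 0)
    # Horn clause: at most one positive literal.
    horn = sum(1 for c in clauses if sum(1 for l in c if l > 0) <= 1)
    return {
        "clauses": n_clauses,
        "vars": n_vars,
        "cv_ratio": n_clauses / n_vars,            # problem size
        "frac_pos_literals": pos / len(lits),      # balance feature
        "frac_horn_clauses": horn / n_clauses,     # proximity to Horn
    }

# Toy formula: (x1 or not x2) and (not x1 or not x2 or x3) and (not x3)
example = [[1, -2], [-1, -2, 3], [-3]]
feats = sat_features(example, n_vars=3)
```

The graph-based degree statistics mentioned on the slide would sit on top of the same clause list, but need the variable/clause incidence graphs to be built first.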
Experimental setup
4. Sampling and measuring
2 training sets: 500 instances & 2000 instances
for computational reasons, the performance criteria that depend on CPLEX results are modelled using the smaller set
both algorithms are run on the training instances, all performance criteria are determined, all feature values are computed
5. Feature elimination
useless features (i.e. univalued)
correlated features (based on correlation analysis): if correlation coefficient > 0.7, then filter the feature
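The elimination step above can be sketched as a greedy filter. This is one plausible reading of the slide's rule (drop a feature once its absolute Pearson correlation with an already kept feature exceeds 0.7), not necessarily the authors' exact procedure:

```python
import numpy as np

def filter_features(X, names, threshold=0.7):
    """Drop univalued features, then greedily drop any feature whose
    |Pearson correlation| with an already kept feature exceeds threshold."""
    kept = []
    for j in range(X.shape[1]):
        if np.std(X[:, j]) == 0:                   # univalued -> useless
            continue
        if all(abs(np.corrcoef(X[:, j], X[:, k])[0, 1]) <= threshold
               for k in kept):
            kept.append(j)
    return [names[j] for j in kept]

# Synthetic demo: a correlated copy and a constant column get filtered out.
rng = np.random.default_rng(7)
a = rng.normal(size=200)
X = np.column_stack([a,
                     2.0 * a + 0.01 * rng.normal(size=200),  # ~duplicate of a
                     rng.normal(size=200),                   # independent
                     np.ones(200)])                          # univalued
selected = filter_features(X, ["a", "2a", "b", "const"])
```

Note that the result depends on the order in which features are visited; a more careful variant would keep, from each correlated group, the feature most predictive of the target.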
Experimental setup
6. Model learning
statistical regression techniques: linear regression, a relatively simple technique, but with good results
models for all performance criteria, based on NRP features and based on SAT features
models are built iteratively:
start with a model consisting of all uncorrelated features
features are iteratively removed from the regression model when their P-value is higher than 0.05
evaluation using test sets of 100 and 1000 instances
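The iterative model building described above is backward elimination on p-values. A self-contained sketch using ordinary least squares (the slides only say "linear regression", so the details here are an assumption):

```python
import numpy as np
from scipy import stats

def ols_pvalues(X, y):
    """Least-squares fit with intercept; returns coefficients and two-sided
    p-values for the non-intercept terms."""
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ beta
    dof = n - Xd.shape[1]
    sigma2 = resid @ resid / dof                     # residual variance
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(Xd.T @ Xd)))
    t = beta / se
    return beta[1:], (2.0 * stats.t.sf(np.abs(t), dof))[1:]

def backward_eliminate(X, y, names, alpha=0.05):
    """Repeatedly drop the least significant feature until every remaining
    p-value is below alpha (the 0.05 threshold from the slide)."""
    keep = list(range(X.shape[1]))
    while keep:
        _, p = ols_pvalues(X[:, keep], y)
        worst = int(np.argmax(p))
        if p[worst] <= alpha:
            break
        keep.pop(worst)
    return [names[i] for i in keep]

# Synthetic demo: two informative features, one pure-noise feature.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 3))
y = 3.0 * X[:, 0] - 1.0 * X[:, 1] + 0.1 * rng.normal(size=300)
kept = backward_eliminate(X, y, ["f0", "f1", "f2"])
```

The "better systematic building of models" mentioned under future work would replace exactly this manual-style loop with an automated model selection procedure.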
Results
CPLEX runtime
models are not very accurate (R2 = 0.10), due to high variability in the runtimes
most instances (70%) are solved in 4 ms, some take up to 4 hours
models for log(runtime) are 'better', but still not very accurate (R2 = 0.17)
Results
quality of the optimal solution
[Figure: scatter plots of predicted vs. real optimal value on the test set; SAT model R2 = 0.98, NRP model R2 = 0.94]
Results
quality of the approximate solution
[Figure: scatter plots of predicted vs. real approximate quality on the test set; NRP model R2 = 0.96, SAT model R2 = 0.94]
Results
quality gap between approximate and optimal solution
[Figure: scatter plots of predicted vs. real gap on the test set; SAT model R2 = 0.54, NRP model R2 = 0.40]
Conclusions
We obtain good results for predicting solution quality:
complete CPLEX solver
approximate metaheuristic
quality gap prediction is less accurate, though still not bad
CPLEX runtime could not be modelled, but is this really necessary?
Conclusions
importance of the translation to SAT
representing the instances as general, abstract SAT problems helps:
expressing hidden structures
context-free modelling
SAT models are slightly better than NRP models, due to the quantity/quality of the NRP features
though, still very good results, even with this limited NRP feature set!
We can build empirical hardness models!
Future work
'real' real world instances
improve the runtime prediction, using other features better suited for runtime prediction, e.g. based on solving relaxations of integer programs
for some performance criteria, other machine learning techniques might be more suitable, e.g. a classification tool for runtime prediction: very short / feasible / infeasible
better systematic building of models, instead of manual deletion of features in the iterative models