and analysis of algorithm parameters automatic algorithm ... · automatic algorithm configuration...
TRANSCRIPT
Automatic algorithm configuration and analysis of algorithm parameters
Nguyen Dang
School of Computer Science, University of St Andrews
2
Algorithm design and implementation
Automatic algorithm configuration
Analysis of algorithm components and parameters
Algorithm development process
Concepts
● Optimisation problem
● Problem instance and solution
● Parameterised algorithm
● Automatic algorithm configuration
4
Optimisation problem
9
Travelling Salesman Problem (TSP): given a list of places and the distances between each pair of places, what is the shortest possible route that visits each place exactly once and returns to the origin place?
Problem instance
10
A problem instance of the Travelling Salesman Problem (TSP) includes
● number of customers● distance between every pair of customers
Solution
11
A solution of a TSP instance: a route
Number of possible solutions● n customers: n! solutions● 10 customers: 3.628.800 solutions● 20 customers: ~ 2.4 million milion million solutions
Optimisation algorithm
12
okie! let’s me call my TSP algorithm to solve your problem instance!
Hey! Here are the orders I have for
today
Step 1: do AStep 2: do BStep 3: do CStep 4: do DStep 5: do E
a TSP problem instance
a (good) solution
Parameterised algorithms
13
Step 1: do AStep 2: do BStep 3: do CStep 4: do DStep 5: do E
Step 1: do AStep 2: do BStep 3: do C’Step 4: do DStep 5: do E
Town X City Y
Algorithm parameters and algorithm configuration
14
Step 1: do AStep 2: do BStep 3: do xStep 4: do DStep 5: do E
x ∈{C,C’}
x is a categorical parameter of this TSP algorithmOther parameter types:
● ordinal, e.g., x ∈ {A,B,C,D,E}● integer, e.g., x ∈ [1..5]● continuous, e.g., x ∈ [1.2, 2.5]
Conditions on parameters: y is activated only when x==’C’ an algorithm configuration: an instantiation of all algorithm’s parameters
Step 1: do AStep 2: do BStep 3: do CStep 4: do DStep 5: do E
configuration 1
Step 1: do AStep 2: do BStep 3: do C’Step 4: do DStep 5: do E
configuration 2
Automatic algorithm configuration problem
15
Executable of a parameterised algorithm
Problem instances
#Name #Type #Range #Conditionparam1 categorical {a,b,c}param2 int [1,5]param3 real [2.5,10.0] | param1==’a’
Configurator
Best algorithm configuration to be used
Racing technique
16
Algorithm configurationsIn
stan
ces
Algorithm configurations
Configuration budget is used efficiently!
Maron and Moore, 1994
Inst
ance
s
Automatic algorithm configurators
● irace (López-Ibánez et al 2011, 2016)
● ParamILS (Hutter 2009)
● SMAC (Hutter 2009)
● GGA and GGA++ (Ansótegui et al 2009, 2015)
17
❖ well-packaged, easy to use
❖ support all types of parameters
❖ work on high-dimensional case studies
Automatic algorithm configurators
irace (López-Ibánez, Dubois-Lacoste, Pérez Cáceres, Birattari, and Stützle 2016)
● http://iridia.ulb.ac.be/irace/
● available as an R-package on CRAN
● irace = iterated racing
○ a simple sampling model + racing
● parallelisation supported
18
Automatic algorithm configurators
ParamILS (Hutter, Hoos, Leyton-Brown and Stützle 2009)
● http://www.cs.ubc.ca/labs/beta/Projects/ParamILS/
● written in Ruby
● ParamILS = Iterated Local Search in parameter configuration space
○ a sequential local search approach + racing between 2 configurations
● integer and continuous parameters must be discretised
● parallelisation not supported
→ multiple independent runs, take the best configuration in the end.19
Automatic algorithm configurators
SMAC (Hutter, Hoos, Leyton-Brown and Stützle 2009)
● http://www.cs.ubc.ca/labs/beta/Projects/SMAC/
● SMAC in Java, and SMAC3 in Python
● SMAC = Sequential Model-based Algorithm Configuration
○ Bayesian Optimisation + racing between 2 configurations
● parallelisation: two options
○ Multiple independent runs
○ Multiple runs with shared model
20
Fit/update a prediction model
Evaluate promising configurations
Automatic algorithm configurators
GGA (Ansótegui, Sellmann, and Tierney 2009)
GGA++ (Ansótegui, Malitsky, Samulowitz, Sellmann, and Tierney 2015)
● GGA : https://bitbucket.org/gga_ac/dgga
GGA++: https://bitbucket.org/mlindauer/aclib2
● written in C++
● GGA = Gender-based Genetic Algorithm
○ Genetic Algorithm + racing
● parallelisation supported
21
Applications● Improve performance of state-of-the-art algorithms on several hard
computational problems: SAT, MIP, SMT, ASP, AI planning, TSP, BBOB,
MOEA, robotics,...
○ CPLEX (ParamILS & SMAC, Hutter et al 2010): speed up with several order of
magnitudes over default configuration, while CPLEX’s tuning tool can’t
● Unified algorithm frameworks → automatic construction of algorithms
○ Multi-objective evolutionary algorithm framework (irace, Bezerra et al 2014)
○ SATenstein (SMAC, Khudabukhsh et al 2009, 2016)
○ ...
22
Applications
● Hyper-parameter optimisation in machine learning
○ SMAC has been successfully applied on a wide range of applications
■ SVM (Feurer et al 2015)■ Neural networks (Domhan et 2015, Mendoza et al 2016)■ Deep reinforcement learning (BOHB, Falkner et al 2018)■ ...
Used inside the auto-sklearn machine learning toolkit (Python)
23
https://www.automl.org/
Analysis of algorithm parameters
● Output of an algorithm configuration procedure
○ the best algorithm configuration to be used
○ an algorithm performance dataset
→ can re-used to gain better insights into algorithm parameters
using parameter analysis methods with prediction models
25
Parameter analysis methods using prediction models
● forward selection (Hutter, Hoos & Leyton-Brown, 2013)
● fANOVA (Hutter, Hoos & Leyton-Brown, 2014)and their pairwise interactions
● ablation analysis (Fawcett & Hoos, 2016; Biedenkapp, Lindauer & Eggensperger, 2017)
Implemented in the Python tool PyImp
https://github.com/automl/ParameterImportance
26
Analysis of algorithm parameters
Each method addresses a different analysis aspect
● forward selection (Hutter, Hoos & Leyton-Brown, 2013)
○ Identify a subset of key parameters
● fANOVA (Hutter, Hoos & Leyton-Brown, 2014)
○ Quantify the importance of every parameter and their pairwise interactions
● ablation analysis (Fawcett & Hoos, 2016; Biedenkapp, Lindauer & Eggensperger, 2017)
○ Quantify the importance of every parameter on the local path between two configurations
27
Analysis of algorithm parameters
Analysis of algorithm parametersExample: Ant Colony Optimization algorithms for the Travelling Salesman Problem
fANOVA analysis
28
localsearch: ● 0: no local search● 1: 2-opt (default)● 2: 2.5-opt● 3: 3-opt (irace’s choice)
Summary
● Automatic algorithm configuration tools (SMAC, ParamILS, irace, GGA, GGA++)
○ easy to use
○ flexible, scalable
○ have a wide-range of applications
● Various analysis approaches after the configuration process
● Many more to try...
29
Useful links
● https://www.automl.org/
● http://iridia.ulb.ac.be/irace/
● https://bitbucket.org/gga_ac/
30
Automatic algorithm configuratorsirace - input
1. A parameter definition file
2. A set of problem instances (training/test)
3. An executable of the parameterised algorithmInput: <configuration_id> <instance_id> <random_seed> <instance_name> <param1>
<param1_value> <param2> <param2_value> …
Output: <cost>,[<time>]
4. Forbidden configurations (optional)
5. Default configurations (optional)
6. A scenario file: put everything together33