nk landscapes, problem difficulty, and hybrid evolutionary algorithms

24
NK Landscapes, Problem Difficulty, and Hybrid Evolutionary Algorithms Martin Pelikan Missouri Estimation of Distribution Algorithms Laboratory (MEDAL) University of Missouri, St. Louis, MO http://medal.cs.umsl.edu/ [email protected] Download MEDAL Report No. 2010001 http://medal.cs.umsl.edu/files/2010001.pdf Martin Pelikan NK Landscapes, Problem Difficulty, and Hybrid EAs

Upload: martin-pelikan

Post on 17-May-2015

1.158 views

Category:

Technology


0 download

DESCRIPTION

http://medal.cs.umsl.edu/show_abstract.php?type=tr&number=2010001

TRANSCRIPT

Page 1: NK Landscapes, Problem Difficulty, and Hybrid Evolutionary Algorithms

NK Landscapes, Problem Difficulty, and HybridEvolutionary Algorithms

Martin PelikanMissouri Estimation of Distribution Algorithms Laboratory (MEDAL)

University of Missouri, St. Louis, MOhttp://medal.cs.umsl.edu/

[email protected]

Download MEDAL Report No. 2010001

http://medal.cs.umsl.edu/files/2010001.pdf

Martin Pelikan NK Landscapes, Problem Difficulty, and Hybrid EAs

Page 2: NK Landscapes, Problem Difficulty, and Hybrid Evolutionary Algorithms

Motivation

Problem difficulty measures

I Important for understanding and estimating problem difficulty.I Useful in designing, chosing and setting up optimization

algorithms.

Little work on problem difficulty of large sets of instances

I Most past work considers isolated instances.I Little work on analyzing difficulty of large sets of instances.

This study

I ConsiderI Two classes of NK landscapes.I Several popular problem difficulty measures.

I Analyze difficulty measures on large sets of problem instances.

Martin Pelikan NK Landscapes, Problem Difficulty, and Hybrid EAs

Page 3: NK Landscapes, Problem Difficulty, and Hybrid Evolutionary Algorithms

Outline

1. Problem difficulty measures.I Fitness distance correlation.I Correlation length.I Escape rate.I Distance of local and global optima.

2. NK landscapes.

3. Experiments.

4. Conclusions and future work.

Martin Pelikan NK Landscapes, Problem Difficulty, and Hybrid EAs

Page 4: NK Landscapes, Problem Difficulty, and Hybrid Evolutionary Algorithms

Fitness Landscapes

Components of fitness landscape

I Set S of admissible solutions.I Fitness function that assigns a real value to each solution.I Distance measure that defines distance between any two

solutions in S.

Problem vs. operators

I Set of solutions and fitness function depend on the problem.I Distance measure depends on the operators (sometimes also

on problem).

Example distance measures

I Hamming distance for binary strings (num. mismatched bits).I Euclidean distance metric for continuous vectors.

Martin Pelikan NK Landscapes, Problem Difficulty, and Hybrid EAs

Page 5: NK Landscapes, Problem Difficulty, and Hybrid Evolutionary Algorithms

Fitness Distance Correlation (FDC): Intuition

Loose definitionI FDC measures correlation between fitness and distance to global optimum.

Simple functionsI Being closer to a global optimum leads to higher fitness.I Climbing directly to the optimum is rather easy.

Difficult functionsI Being closer to a global optimum may lead to lower fitness.I Climbing directly to the optimum is not that easy.

Martin Pelikan NK Landscapes, Problem Difficulty, and Hybrid EAs

Page 6: NK Landscapes, Problem Difficulty, and Hybrid Evolutionary Algorithms

Fitness Distance Correlation (FDC): Definition

Basic setup

I Set of n candidate solutions x1, x2, . . . , xn.I Fitness values: F = {f1, f2, . . . , fn}.I Distances from closest global optimum: D = {d1, d2, . . . , dn}.

I Can also focus only on local optima (useful for hybrid EAs).I Important statistics

I Average of F is f̄ ; standard deviation of F is σF .I Average of D is d̄; standard deviation of D is σD.I Covariance of F and D

cFD =1

n

n∑i=1

(fi − f̄)(di − d̄)·

Fitness distance correlation (FDC)

r =cFD

σF σD

I Range of values for FDC is [−1, 1].I For maximization, small FDC values indicate simpler problems.

Martin Pelikan NK Landscapes, Problem Difficulty, and Hybrid EAs

Page 7: NK Landscapes, Problem Difficulty, and Hybrid Evolutionary Algorithms

Correlation Length: Intuition

Loose definitionI Correlation length measures ruggedness of the landscape.

Simple functionsI Solutions close to each other have similar fitness values.I Fitness differences increase slowly with distance.

Difficult functionsI Solutions close to each other have different fitness values.I Fitness differences increase fast with distance.

Martin Pelikan NK Landscapes, Problem Difficulty, and Hybrid EAs

Page 8: NK Landscapes, Problem Difficulty, and Hybrid Evolutionary Algorithms

Correlation Length: Definition

Basic setup

I Random walk of n steps x1, x2, . . . , xn.I xi−1 and xi are neighbors.I Fitness values: F = {f1, f2, . . . , fn}.

Autocorrelation function for gap s

ρ(s) =1

σ2F (m − s)

m−s∑t=1

(ft − f̄

) (ft+s − f̄

),

Correlation length

l = − 1

ln |ρ(1)|

I Smaller correlation lengths indicates harder problems.

Martin Pelikan NK Landscapes, Problem Difficulty, and Hybrid EAs

Page 9: NK Landscapes, Problem Difficulty, and Hybrid Evolutionary Algorithms

Escape Rate

Intuition

I Problem difficulty relates to ease with which we can escape local optima.

Escape rate for s steps

I Start in an arbitrary local optimum xl.I Generate n solutions at distance s from xl.I Climb (steepest ascent) to nearest local optimum from each of them.I Escape rate is percentage of starting points that end up in a different local

optimum.I Repeat the above for multiple local optima.

Comments

I Escape rate increases with number of steps s.I The larger escape rates may mean an easier problem because

it is easier to escape local optima.I The larger escape rates may mean increased ruggedness,

which indicates greater difficulty.

Martin Pelikan NK Landscapes, Problem Difficulty, and Hybrid EAs

Page 10: NK Landscapes, Problem Difficulty, and Hybrid Evolutionary Algorithms

Distance of Local and Global Optima

Intuition

I Problem difficulty is related to distance of global and localoptima.

Definition

I Distance of local and global optima is the average distance ofa local optimum to its closest global optimum.

Some thoughts

I Short distances indicate an easier problem, because it is easyto climb near the global optimum.

I Long distances may indicate decreased ruggedness, andtherefore an easier problem.

Martin Pelikan NK Landscapes, Problem Difficulty, and Hybrid EAs

Page 11: NK Landscapes, Problem Difficulty, and Hybrid Evolutionary Algorithms

NK Landscape

NK landscape

I Proposed by Kauffman (1989).

I Model of rugged landscape and popular test function.I An NK landscape is defined by

I Number of bits, n.I Number of neighbors per bit, k.I Set of k neighbors Π(Xi) for i-th bit, Xi.I Subfunction fi defining contribution of Xi and Π(Xi).

I The objective function fnk to maximize is then defined as

fnk(X0, X1, . . . , Xn−1) =n−1∑i=0

fi(Xi,Π(Xi)).

Martin Pelikan NK Landscapes, Problem Difficulty, and Hybrid EAs

Page 12: NK Landscapes, Problem Difficulty, and Hybrid Evolutionary Algorithms

NK Landscape

Exmaple for n = 9 and k = 2:

Martin Pelikan NK Landscapes, Problem Difficulty, and Hybrid EAs

Page 13: NK Landscapes, Problem Difficulty, and Hybrid Evolutionary Algorithms

Types of NK Landscapes

Unrestricted NK landscapes

I Neighbors are chosen arbitrarily.I Each subfunction is look-up table from uniform distr. over [0, 1).I NP-complete problem.

Nearest-neighbor NK landscapes

I Neighbors are in subsequent positions (neighborhoods wrap around).I Each subfunction is look-up table from uniform distr. over [0, 1).I Solvable in polynomial time using dynamic programming.

Parameters

I Number of bits n ranges from 20 to 100.I Neighborhood size k ranges from 2 to 6.I For each n and k, we generate 104 random instances.

Martin Pelikan NK Landscapes, Problem Difficulty, and Hybrid EAs

Page 14: NK Landscapes, Problem Difficulty, and Hybrid Evolutionary Algorithms

Hybrid hBOA

Hybrid hierarchical Bayesian optimization algorithm (hBOA)

I Build and sample a Bayes net instead of crossover/mutation.I Model structure and parameters adapt to problem landscape.I Use niching to preserve diversity.I Use single-bit-flip hill climbing for all candidate solutions.

Current

population

Selected

population

New

population

Bayesian

network

Martin Pelikan NK Landscapes, Problem Difficulty, and Hybrid EAs

Page 15: NK Landscapes, Problem Difficulty, and Hybrid Evolutionary Algorithms

Focus of Experiments

What we already know from prior work

I Effects of n and k on hBOA performance.I Effects of neighborhood type on hBOA performance.

Main areas of focus of this study

I Effects of n and k on problem-difficulty measures.I Problem-difficulty measures and performance for fixed n and k.I Effects of neighborhood type on problem-difficulty measures.I Effects of escape rate on performance.

Martin Pelikan NK Landscapes, Problem Difficulty, and Hybrid EAs

Page 16: NK Landscapes, Problem Difficulty, and Hybrid Evolutionary Algorithms

Effects of k

Standard NK landscapes (n = 36)

Nearest-neighbor NK landscapes (n = 100)

Martin Pelikan NK Landscapes, Problem Difficulty, and Hybrid EAs

Page 17: NK Landscapes, Problem Difficulty, and Hybrid Evolutionary Algorithms

Effects of n for Standard NK LandscapesTable 1: Some of the measures of problem difficulty for instances of varying problem size. Standarddeviations are shown in brackets.

(a) Standard NK landscapes for k = 2 and k = 6.

DHC steps until fitness dist. corr. avg. dist. ofn k opt. (hBOA+DHC) fitness dist. corr. for local optima corr. length local from global

20 2 147.8 (88.6) −0.3202 (0.105) −0.6862 (0.18) 5.30 (0.41) 14.92 (0.63)28 2 355.9 (195.9) −0.3177 (0.090) −0.6624 (0.17) 7.63 (0.50) 20.91 (0.74)36 2 666.3 (340.2) −0.3181 (0.079) −0.6542 (0.16) 9.97 (0.56) 26.91 (0.83)44 2 1105.0 (515.4) −0.3145 (0.072) −0.6424 (0.15) 12.29 (0.61) 32.88 (0.91)52 2 1633.6 (727.7) −0.3145 (0.068) −0.6384 (0.14) 14.62 (0.67) 38.88 (0.98)

20 6 563.2 (308.9) −0.0496 (0.050) −0.2086 (0.11) 2.30 (0.05) 15.83 (0.30)28 6 1916.5 (1148.3) −0.0491 (0.043) −0.1605 (0.09) 3.44 (0.06) 22.28 (0.32)36 6 5583.8 (3940.5) −0.0489 (0.038) −0.1355 (0.08) 4.58 (0.07) 28.72 (0.33)

(b) Nearest-neighbor NK landscapes for k = 2 and k = 6.

DHC steps until opt. fitness dist. corr. avg. dist. ofn k (hBOA+DHC) fitness dist. corr. for local optima corr. length local from global

20 2 173.0 (93.0) −0.3428 (0.104) −0.7271 (0.151) 5.31 (0.39) 14.96 (0.64)40 2 873.4 (286.7) −0.3412 (0.074) −0.7180 (0.107) 11.13 (0.56) 29.89 (0.91)60 2 1881.4 (435.9) −0.3416 (0.061) −0.7151 (0.087) 16.95 (0.68) 44.85 (1.11)80 2 3055.0 (691.9) −0.3416 (0.054) −0.7143 (0.076) 22.76 (0.80) 59.79 (1.29)100 2 4436.2 (1019.5) −0.3423 (0.049) −0.7134 (0.067) 28.59 (0.88) 74.75 (1.44)

20 6 634.3 (350.1) −0.0494 (0.051) −0.2518 (0.122) 2.30 (0.06) 15.80 (0.33)40 6 5747.2 (3770.7) −0.0486 (0.036) −0.2430 (0.091) 5.14 (0.08) 31.68 (0.45)60 6 16569.5 (10933.3) −0.0481 (0.030) −0.2416 (0.075) 7.97 (0.10) 47.53 (0.54)80 6 34485.9 (23700.1) −0.0488 (0.026) −0.2419 (0.065) 10.80 (0.11) 63.37 (0.62)100 6 60774.8 (42442.8) −0.0486 (0.024) −0.2431 (0.058) 13.63 (0.13) 79.21 (0.69)

5.4.2 Measures and actual performance

The previous subsection focused on the influence of n and k on the performance of hBOA+DHC andthe measures of problem difficulty. But what is the relationship between the measures of problemdifficulty and the actual performance of hBOA+DHC for fixed values of n and k? Furthermore, it iswell known that for NK landscapes and other difficult classes of additively decomposable problems,performance of metaheuristics varies substantially from one problem instance to another. Willthese variations be captured by the considered measures of problem difficulty?

The relationship between the measures of problem difficulty and the actual computationalrequirements of hBOA+DHC is illustrated in tables 2 and 3. Only the results for several valuesof n and k are shown; the results for other values of n and k were qualitatively similar. Theresults show that the correlation length and both versions of the fitness distance correlation areall in agreement with the actual problem difficulty measured by the number of DHC flips untiloptimum. However, the distance of local and global optima is misleading in that for the easierproblem instances the local optima are in fact further from the global ones than for the moredifficult instances. This result is quite surprising. It is also of note that the measures of problemdifficulty vary only a little across the instances for the same value of n and k.

5.4.3 Standard vs. nearest-neighbor instances

Table 4 summarizes the results for standard and nearest-neighbor NK landscapes for n = 36,and k = 2 to k = 6. The table compares the actual performance of hBOA+DHC as well as themeasures of problem difficulty. The fixed problem size was chosen to focus only on the effects of

9

I Fitness distance correlationI Fitness distance correlation agrees with results.I The differences are very small.I Focusing on local optima improves the measure.

I Correlation lengthI Correlation length indicates opposite behavior as expected.I Does not provide a good indicator in this case.

I Distance of optimaI Distance of local and global optima increase with n.

Martin Pelikan NK Landscapes, Problem Difficulty, and Hybrid EAs

Page 18: NK Landscapes, Problem Difficulty, and Hybrid Evolutionary Algorithms

Effects of n for Nearest-Neighbor NK Landscapes

Table 1: Some of the measures of problem difficulty for instances of varying problem size. Standarddeviations are shown in brackets.

(a) Standard NK landscapes for k = 2 and k = 6.

DHC steps until fitness dist. corr. avg. dist. ofn k opt. (hBOA+DHC) fitness dist. corr. for local optima corr. length local from global

20 2 147.8 (88.6) −0.3202 (0.105) −0.6862 (0.18) 5.30 (0.41) 14.92 (0.63)28 2 355.9 (195.9) −0.3177 (0.090) −0.6624 (0.17) 7.63 (0.50) 20.91 (0.74)36 2 666.3 (340.2) −0.3181 (0.079) −0.6542 (0.16) 9.97 (0.56) 26.91 (0.83)44 2 1105.0 (515.4) −0.3145 (0.072) −0.6424 (0.15) 12.29 (0.61) 32.88 (0.91)52 2 1633.6 (727.7) −0.3145 (0.068) −0.6384 (0.14) 14.62 (0.67) 38.88 (0.98)

20 6 563.2 (308.9) −0.0496 (0.050) −0.2086 (0.11) 2.30 (0.05) 15.83 (0.30)28 6 1916.5 (1148.3) −0.0491 (0.043) −0.1605 (0.09) 3.44 (0.06) 22.28 (0.32)36 6 5583.8 (3940.5) −0.0489 (0.038) −0.1355 (0.08) 4.58 (0.07) 28.72 (0.33)

(b) Nearest-neighbor NK landscapes for k = 2 and k = 6.

DHC steps until opt. fitness dist. corr. avg. dist. ofn k (hBOA+DHC) fitness dist. corr. for local optima corr. length local from global

20 2 173.0 (93.0) −0.3428 (0.104) −0.7271 (0.151) 5.31 (0.39) 14.96 (0.64)40 2 873.4 (286.7) −0.3412 (0.074) −0.7180 (0.107) 11.13 (0.56) 29.89 (0.91)60 2 1881.4 (435.9) −0.3416 (0.061) −0.7151 (0.087) 16.95 (0.68) 44.85 (1.11)80 2 3055.0 (691.9) −0.3416 (0.054) −0.7143 (0.076) 22.76 (0.80) 59.79 (1.29)100 2 4436.2 (1019.5) −0.3423 (0.049) −0.7134 (0.067) 28.59 (0.88) 74.75 (1.44)

20 6 634.3 (350.1) −0.0494 (0.051) −0.2518 (0.122) 2.30 (0.06) 15.80 (0.33)40 6 5747.2 (3770.7) −0.0486 (0.036) −0.2430 (0.091) 5.14 (0.08) 31.68 (0.45)60 6 16569.5 (10933.3) −0.0481 (0.030) −0.2416 (0.075) 7.97 (0.10) 47.53 (0.54)80 6 34485.9 (23700.1) −0.0488 (0.026) −0.2419 (0.065) 10.80 (0.11) 63.37 (0.62)100 6 60774.8 (42442.8) −0.0486 (0.024) −0.2431 (0.058) 13.63 (0.13) 79.21 (0.69)

5.4.2 Measures and actual performance

The previous subsection focused on the influence of n and k on the performance of hBOA+DHC andthe measures of problem difficulty. But what is the relationship between the measures of problemdifficulty and the actual performance of hBOA+DHC for fixed values of n and k? Furthermore, it iswell known that for NK landscapes and other difficult classes of additively decomposable problems,performance of metaheuristics varies substantially from one problem instance to another. Willthese variations be captured by the considered measures of problem difficulty?

The relationship between the measures of problem difficulty and the actual computationalrequirements of hBOA+DHC is illustrated in tables 2 and 3. Only the results for several valuesof n and k are shown; the results for other values of n and k were qualitatively similar. Theresults show that the correlation length and both versions of the fitness distance correlation areall in agreement with the actual problem difficulty measured by the number of DHC flips untiloptimum. However, the distance of local and global optima is misleading in that for the easierproblem instances the local optima are in fact further from the global ones than for the moredifficult instances. This result is quite surprising. It is also of note that the measures of problemdifficulty vary only a little across the instances for the same value of n and k.

5.4.3 Standard vs. nearest-neighbor instances

Table 4 summarizes the results for standard and nearest-neighbor NK landscapes for n = 36,and k = 2 to k = 6. The table compares the actual performance of hBOA+DHC as well as themeasures of problem difficulty. The fixed problem size was chosen to focus only on the effects of

9

I Fitness distance correlationI Fitness distance correlation agrees with results.I The differences are very small.I Focusing on local optima improves the measure.

I Correlation lengthI Correlation length indicates opposite behavior as expected.I Does not provide a good indicator in this case.

I Distance of optimaI Distance of local and global optima increase with n.

Martin Pelikan NK Landscapes, Problem Difficulty, and Hybrid EAs

Page 19: NK Landscapes, Problem Difficulty, and Hybrid Evolutionary Algorithms

Measures vs. Performance for Fixed n, k (Standard)

Standard NK landscapes (n = 36, k = 6)

Table 2: Some of the measures of problem difficulty for instances of varying difficulty. StandardNK landscapes are considered with n = 36, and k = 2 or k = 6. The difficulty of each instanceis measured by the number of DHC steps until optimum using hBOA+DHC. Standard deviationsare shown in brackets.

(a) Standard NK landscapes, n = 36, k = 2.

desc. of DHC steps until fitn. dist. corr. avg. dist. ofinstances opt. (hBOA+DHC) fitn. dist. corr. for local optima corr. length local from global

10% easiest 239.0 (54.6) −0.3691 (0.074) −0.7828 (0.112) 10.2259 (0.567) 27.3003 (0.847)25% easiest 317.9 (78.6) −0.3549 (0.074) −0.7457 (0.127) 10.1340 (0.548) 27.1788 (0.839)50% easiest 422.4 (123.6) −0.3411 (0.076) −0.7081 (0.141) 10.0599 (0.553) 27.0721 (0.835)all instances 666.3 (340.2) −0.3181 (0.079) −0.6542 (0.160) 9.9729 (0.555) 26.9069 (0.831)50% hardest 910.1 (311.8) −0.2952 (0.076) −0.6004 (0.160) 9.8859 (0.543) 26.7417 (0.793)25% hardest 1119.1 (323.4) −0.2819 (0.075) −0.5635 (0.161) 9.8345 (0.547) 26.6541 (0.781)10% hardest 1391.1 (358.7) −0.2695 (0.076) −0.5259 (0.158) 9.8102 (0.570) 26.5672 (0.757)

(b) Standard NK landscapes, n = 36, k = 6.

desc. of DHC steps until fitn. dist. corr. avg. dist. ofinstances opt. (hBOA+DHC) fitn. dist. corr. for local optima corr. length local from global

10% easiest 1785.8 (315.3) −0.0737 (0.035) −0.2113 (0.070) 4.5874 (0.069) 28.9966 (0.370)25% easiest 2279.8 (488.9) −0.0675 (0.035) −0.1909 (0.068) 4.5836 (0.067) 28.8979 (0.356)50% easiest 3034.4 (882.5) −0.0604 (0.036) −0.1682 (0.069) 4.5803 (0.067) 28.8242 (0.340)all instances 5583.8 (3940.5) −0.0489 (0.038) −0.1355 (0.075) 4.5752 (0.067) 28.7155 (0.332)50% hardest 8133.3 (4156.7) −0.0374 (0.035) −0.1029 (0.067) 4.5701 (0.066) 28.6067 (0.285)25% hardest 10738.5 (4534.7) −0.0316 (0.036) −0.0868 (0.066) 4.5699 (0.066) 28.5572 (0.272)10% hardest 14542.1 (5101.1) −0.0242 (0.036) −0.0716 (0.064) 4.5652 (0.066) 28.5074 (0.258)

Table 3: Some of the measures of problem difficulty for instances of varying difficulty. Nearest-neighbor NK landscapes are considered with n = 100, and k = 2 or k = 6. The difficulty of eachinstance is measured by the number of DHC steps until optimum using hBOA+DHC. Standarddeviations are shown in brackets.

(a) Nearest-neighbor NK landscapes, n = 100, k = 2.

desc. of DHC steps until fitn. dist. corr. avg. dist. of lo-instances opt. (hBOA+DHC) fitn. dist. corr. for local optima corr. length cal from global

10% easiest 3330.9 (163.9) −0.3763 (0.045) −0.7594 (0.054) 28.9495 (0.878) 75.2075 (1.370)25% easiest 3550.2 (217.0) −0.3657 (0.045) −0.7466 (0.058) 28.8338 (0.878) 75.1053 (1.395)50% easiest 3758.6 (265.2) −0.3578 (0.046) −0.7364 (0.058) 28.7432 (0.873) 74.9838 (1.427)all instances 4436.2 (1019.5) −0.3423 (0.049) −0.7134 (0.067) 28.5919 (0.885) 74.7529 (1.440)50% hardest 5113.8 (1044.2) −0.3269 (0.047) −0.6905 (0.067) 28.4405 (0.871) 74.5221 (1.415)25% hardest 5805.5 (1089.4) −0.3178 (0.047) −0.6728 (0.069) 28.3831 (0.854) 74.3572 (1.390)10% hardest 6767.6 (1152.3) −0.3115 (0.048) −0.6632 (0.069) 28.3347 (0.865) 74.3459 (1.372)

(b) Nearest-neighbor NK landscapes, n = 100, k = 6.

desc. of DHC steps until fitn. dist. corr. avg. dist. of lo-instances opt. (hBOA+DHC) fitn. dist. corr. for local optima corr. length cal from global

10% easiest 21364.1 (2929.9) −0.0601 (0.023) −0.2962 (0.050) 13.6468 (0.135) 79.6971 (0.720)25% easiest 26787.7 (5261.6) −0.0564 (0.023) −0.2778 (0.053) 13.6432 (0.131) 79.5491 (0.706)50% easiest 34276.6 (8833.1) −0.0529 (0.023) −0.2626 (0.055) 13.6395 (0.132) 79.4021 (0.687)all instances 60774.8 (42442.8) −0.0486 (0.024) −0.2431 (0.058) 13.6344 (0.131) 79.2125 (0.690)50% hardest 87272.9 (46049.2) −0.0444 (0.023) −0.2237 (0.055) 13.6294 (0.130) 79.0228 (0.638)25% hardest 114418.9 (52085.3) −0.0430 (0.023) −0.2163 (0.056) 13.6274 (0.131) 78.9531 (0.638)10% hardest 154912.8 (62794.1) −0.0418 (0.024) −0.2107 (0.057) 13.6243 (0.130) 78.9277 (0.633)

10

I Fitness distance correlationI Provides good indication of problem difficulty.I Differences smaller than expected based on performance.

I Correlation lengthI Correlation length agrees with the results.I Differences are almost negligible though.

I Distance of optimaI Greater distances observed for simpler problems.I Indicates that ruggedness is important.

Martin Pelikan NK Landscapes, Problem Difficulty, and Hybrid EAs

Page 20: NK Landscapes, Problem Difficulty, and Hybrid Evolutionary Algorithms

Measures vs. Performance for Fixed n, k (Nearest Neighb.)

Nearest-neighbor NK landscapes (n = 100, k = 6)

Table 2: Some of the measures of problem difficulty for instances of varying difficulty. StandardNK landscapes are considered with n = 36, and k = 2 or k = 6. The difficulty of each instanceis measured by the number of DHC steps until optimum using hBOA+DHC. Standard deviationsare shown in brackets.

(a) Standard NK landscapes, n = 36, k = 2.

desc. of DHC steps until fitn. dist. corr. avg. dist. ofinstances opt. (hBOA+DHC) fitn. dist. corr. for local optima corr. length local from global

10% easiest 239.0 (54.6) −0.3691 (0.074) −0.7828 (0.112) 10.2259 (0.567) 27.3003 (0.847)25% easiest 317.9 (78.6) −0.3549 (0.074) −0.7457 (0.127) 10.1340 (0.548) 27.1788 (0.839)50% easiest 422.4 (123.6) −0.3411 (0.076) −0.7081 (0.141) 10.0599 (0.553) 27.0721 (0.835)all instances 666.3 (340.2) −0.3181 (0.079) −0.6542 (0.160) 9.9729 (0.555) 26.9069 (0.831)50% hardest 910.1 (311.8) −0.2952 (0.076) −0.6004 (0.160) 9.8859 (0.543) 26.7417 (0.793)25% hardest 1119.1 (323.4) −0.2819 (0.075) −0.5635 (0.161) 9.8345 (0.547) 26.6541 (0.781)10% hardest 1391.1 (358.7) −0.2695 (0.076) −0.5259 (0.158) 9.8102 (0.570) 26.5672 (0.757)

(b) Standard NK landscapes, n = 36, k = 6.

desc. of DHC steps until fitn. dist. corr. avg. dist. ofinstances opt. (hBOA+DHC) fitn. dist. corr. for local optima corr. length local from global

10% easiest 1785.8 (315.3) −0.0737 (0.035) −0.2113 (0.070) 4.5874 (0.069) 28.9966 (0.370)25% easiest 2279.8 (488.9) −0.0675 (0.035) −0.1909 (0.068) 4.5836 (0.067) 28.8979 (0.356)50% easiest 3034.4 (882.5) −0.0604 (0.036) −0.1682 (0.069) 4.5803 (0.067) 28.8242 (0.340)all instances 5583.8 (3940.5) −0.0489 (0.038) −0.1355 (0.075) 4.5752 (0.067) 28.7155 (0.332)50% hardest 8133.3 (4156.7) −0.0374 (0.035) −0.1029 (0.067) 4.5701 (0.066) 28.6067 (0.285)25% hardest 10738.5 (4534.7) −0.0316 (0.036) −0.0868 (0.066) 4.5699 (0.066) 28.5572 (0.272)10% hardest 14542.1 (5101.1) −0.0242 (0.036) −0.0716 (0.064) 4.5652 (0.066) 28.5074 (0.258)

Table 3: Some of the measures of problem difficulty for instances of varying difficulty. Nearest-neighbor NK landscapes are considered with n = 100, and k = 2 or k = 6. The difficulty of eachinstance is measured by the number of DHC steps until optimum using hBOA+DHC. Standarddeviations are shown in brackets.

(a) Nearest-neighbor NK landscapes, n = 100, k = 2.

desc. of DHC steps until fitn. dist. corr. avg. dist. of lo-instances opt. (hBOA+DHC) fitn. dist. corr. for local optima corr. length cal from global

10% easiest 3330.9 (163.9) −0.3763 (0.045) −0.7594 (0.054) 28.9495 (0.878) 75.2075 (1.370)25% easiest 3550.2 (217.0) −0.3657 (0.045) −0.7466 (0.058) 28.8338 (0.878) 75.1053 (1.395)50% easiest 3758.6 (265.2) −0.3578 (0.046) −0.7364 (0.058) 28.7432 (0.873) 74.9838 (1.427)all instances 4436.2 (1019.5) −0.3423 (0.049) −0.7134 (0.067) 28.5919 (0.885) 74.7529 (1.440)50% hardest 5113.8 (1044.2) −0.3269 (0.047) −0.6905 (0.067) 28.4405 (0.871) 74.5221 (1.415)25% hardest 5805.5 (1089.4) −0.3178 (0.047) −0.6728 (0.069) 28.3831 (0.854) 74.3572 (1.390)10% hardest 6767.6 (1152.3) −0.3115 (0.048) −0.6632 (0.069) 28.3347 (0.865) 74.3459 (1.372)

(b) Nearest-neighbor NK landscapes, n = 100, k = 6.

desc. of DHC steps until fitn. dist. corr. avg. dist. of lo-instances opt. (hBOA+DHC) fitn. dist. corr. for local optima corr. length cal from global

10% easiest 21364.1 (2929.9) −0.0601 (0.023) −0.2962 (0.050) 13.6468 (0.135) 79.6971 (0.720)25% easiest 26787.7 (5261.6) −0.0564 (0.023) −0.2778 (0.053) 13.6432 (0.131) 79.5491 (0.706)50% easiest 34276.6 (8833.1) −0.0529 (0.023) −0.2626 (0.055) 13.6395 (0.132) 79.4021 (0.687)all instances 60774.8 (42442.8) −0.0486 (0.024) −0.2431 (0.058) 13.6344 (0.131) 79.2125 (0.690)50% hardest 87272.9 (46049.2) −0.0444 (0.023) −0.2237 (0.055) 13.6294 (0.130) 79.0228 (0.638)25% hardest 114418.9 (52085.3) −0.0430 (0.023) −0.2163 (0.056) 13.6274 (0.131) 78.9531 (0.638)10% hardest 154912.8 (62794.1) −0.0418 (0.024) −0.2107 (0.057) 13.6243 (0.130) 78.9277 (0.633)

10I Fitness distance correlation

I Provides good indication of problem difficulty.

I Correlation lengthI Correlation length agrees with the results.I Differences are almost negligible though.

I Distance of optimaI Greater distances observed for simpler problems.I Indicates that ruggedness is important.

Martin Pelikan NK Landscapes, Problem Difficulty, and Hybrid EAs

Page 21: NK Landscapes, Problem Difficulty, and Hybrid Evolutionary Algorithms

Escape Rate for Nearest-Neighbor Instances (n = 100)

0 5 10 15 20 250

0.10.20.30.40.50.60.70.80.9

1

Number of steps

Esc

ape

rate

k=6k=5k=4k=3k=2

Figure 3: Escape rate for nearest-neighbor NK landscapes of n = 100 and k = 6.

The paper shows that in most cases the measures of problem difficulty correlate with the actualcomputational requirements of hybrid hBOA. Most measures capture the effects of n and k onproblem difficulty, and they provide input on problem difficulty even for fixed n and k. Nonetheless,in some cases the measures of problem difficulty do not provide a clear indication of what probleminstances are difficult and what instances are easy. Furthermore, there is no single measure thatwould be more accurate than others. Finally, while the differences between the computationalrequirements of hybrid hBOA across a set of problem instances are substantial, the variance of theproblem difficulty measures for this set of instances is typically very low.

One of the interesting topics for future work is to further analyze the results obtained in thispaper. Furthermore, analytical approaches may be designed to provide more accurate and fasterestimates of the measures of problem difficulty. Additional measures of problem difficulty shouldalso be studied and the relationship between the different measures of problem difficulty should beinvestigated in more detail. Finally, practical approaches should be proposed to use the measuresof problem difficulty for choosing an appropriate optimization method and adequate parameters.

Acknowledgments

This project was sponsored by the National Science Foundation under CAREER grant ECS-0547013, by the Air Force Office of Scientific Research, Air Force Materiel Command, USAF, undergrant FA9550-06-1-0096, and by the Univ. of Missouri in St. Louis through the High PerformanceComputing Collaboratory sponsored by Information Technology Services, and the Research Awardand Research Board programs. The U.S. Government is authorized to reproduce and distributereprints for government purposes notwithstanding any copyright notation thereon. Any opinions,findings, and conclusions or recommendations expressed in this material are those of the authorsand do not necessarily reflect the views of the National Science Foundation, the Air Force Office ofScientific Research, or the U.S. Government. Some experiments were done using the hBOA soft-ware developed by M. Pelikan and D. E. Goldberg at the Univ. of Illinois at Urbana-Champaignand most experiments were performed on the Beowulf cluster maintained by ITS at the Univ. ofMissouri in St. Louis.

12

I Escape rate increases with k.I Increased k leads to more ruggedness and tougher problems.I Escape rate increases with the number of steps.

Martin Pelikan NK Landscapes, Problem Difficulty, and Hybrid EAs

Page 22: NK Landscapes, Problem Difficulty, and Hybrid Evolutionary Algorithms

Neighborhood Type

Table 3: Some of the measures of problem difficulty for instances of varying difficulty. Nearest-neighbor NKlandscapes are considered with n = 100, and k = 2 or k = 6. The difficulty of each instance is measured by thenumber of DHC steps until optimum using hBOA+DHC. Standard deviations are shown in brackets.

(a) Nearest-neighbor NK landscapes, n = 100, k = 2.

desc. of DHC steps until fitness dist. corr. avg. dist. ofinstances opt. (hBOA+DHC) fitness dist. corr. for local optima correlation length local from global10% easiest 3330.92 (163.922) −0.37630 (0.0449) −0.75937 (0.0545) 28.94946 (0.8780) 75.20749 (1.3696)25% easiest 3550.22 (216.959) −0.36573 (0.0452) −0.74656 (0.0579) 28.83381 (0.8778) 75.10535 (1.3955)50% easiest 3758.58 (265.189) −0.35776 (0.0455) −0.73643 (0.0583) 28.74320 (0.8725) 74.98378 (1.4267)all instances 4436.19 (1019.527) −0.34231 (0.0487) −0.71345 (0.0671) 28.59186 (0.8848) 74.75294 (1.4396)50% hardest 5113.80 (1044.154) −0.32685 (0.0469) −0.69046 (0.0674) 28.44052 (0.8712) 74.52211 (1.4153)25% hardest 5805.50 (1089.351) −0.31783 (0.0470) −0.67277 (0.0689) 28.38310 (0.8537) 74.35724 (1.3899)10% hardest 6767.56 (1152.300) −0.31149 (0.0479) −0.66316 (0.0690) 28.33472 (0.8653) 74.34589 (1.3725)

(b) Nearest-neighbor NK landscapes, n = 100, k = 6.

desc. of DHC steps until fitness dist. corr. avg. dist. ofinstances opt. (hBOA+DHC) fitness dist. corr. for local optima correlation length local from global10% easiest 21364.12 (2929.912) −0.06013 (0.0232) −0.29623 (0.0502) 13.64677 (0.1347) 79.69713 (0.7199)25% easiest 26787.73 (5261.648) −0.05645 (0.0235) −0.27778 (0.0534) 13.64324 (0.1312) 79.54913 (0.7063)50% easiest 34276.64 (8833.067) −0.05293 (0.0235) −0.26257 (0.0548) 13.63947 (0.1320) 79.40213 (0.6873)all instances 60774.79 (42442.760) −0.04865 (0.0236) −0.24311 (0.0582) 13.63444 (0.1313) 79.21247 (0.6899)50% hardest 87272.93 (46049.197) −0.04436 (0.0229) −0.22366 (0.0549) 13.62941 (0.1305) 79.02281 (0.6384)25% hardest 114418.93 (52085.301) −0.04304 (0.0231) −0.21626 (0.0558) 13.62740 (0.1308) 78.95310 (0.6377)10% hardest 154912.82 (62794.146) −0.04179 (0.0236) −0.21073 (0.0571) 13.62429 (0.1300) 78.92773 (0.6326)

Table 4: Comparison of the results for the standard NK landscapes (std) and for the nearest-neighbor NKlandscapes (nn). In both cases, n = 36 and k varies from 2 to 6. Standard deviations are shown in brackets.

kDHC steps fit. dist. corr. fit. dist. corr. (local) corr. length dist. of local from globalstd nn std nn std nn std nn std nn

2 666 697 −0.32 (0.08) −0.34 (0.08) −0.65 (0.16) −0.72 (0.11) 9.973 (0.555) 9.962 (0.527) 13.06 (0.72) 12.58 (0.84)3 1029 983 −0.20 (0.07) −0.21 (0.07) −0.50 (0.16) −0.60 (0.13) 7.912 (0.342) 7.911 (0.333) 11.33 (0.41) 10.74 (0.55)4 1629 1437 −0.13 (0.06) −0.13 (0.06) −0.35 (0.13) −0.48 (0.13) 6.450 (0.200) 6.446 (0.202) 9.96 (0.24) 9.33 (0.33)5 2846 2397 −0.08 (0.05) −0.08 (0.05) −0.23 (0.10) −0.35 (0.11) 5.379 (0.116) 5.379 (0.123) 8.86 (0.14) 8.25 (0.20)6 5584 4255 −0.05 (0.04) −0.05 (0.04) −0.14 (0.08) −0.24 (0.09) 4.575 (0.067) 4.576 (0.073) 7.95 (0.10) 7.39 (0.12)

[13] T. Jones and S. Forrest. Fitness distance correlation as ameasure of problem difficulty for genetic algorithms. Proc.of the Int. Conf. on Genetic Algorithms (ICGA-95), pages184–192, 1995.

[14] S. Kauffman. Adaptation on rugged fitness landscapes. InD. L. Stein, editor, Lecture Notes in the Sciences ofComplexity, pages 527–618. Addison Wesley, 1989.

[15] S. Kauffman. The origins of Order: Self-organization andselection in evolution. Oxford University Press, 1993.

[16] P. Merz. Advanced fitness landscape analysis and theperformance of memetic algorithms. EvolutionaryComputation, 12(3):303–325, 2004.

[17] P. Merz and B. Freisleben. Greedy and local searchheuristics for unconstrained binary quadratic programming.Journal of Heuristics, 8(2):197–213, 2002.

[18] M. Pelikan. Hierarchical Bayesian optimization algorithm:Toward a new generation of evolutionary algorithms.Springer, 2005.

[19] M. Pelikan and D. E. Goldberg. Escaping hierarchical trapswith competent genetic algorithms. Genetic andEvolutionary Comp. Conf. (GECCO-2001), pages 511–518,2001.

[20] M. Pelikan and D. E. Goldberg. A hierarchy machine:Learning to optimize from nature and humans. Complexity,8(5):36–45, 2003.

[21] M. Pelikan, J. Ocenasek, S. Trebst, M. Troyer, and F. Alet.Computational complexity and simulation of rare events ofising spin glasses. Genetic and Evolutionary Comp. Conf.(GECCO-2004), 2:36–47, 2004.

[22] M. Pelikan, K. Sastry, M. V. Butz, and D. E. Goldberg.Performance of evolutionary algorithms on random

decomposable problems. Parallel Problem Solving fromNature, pages 788–797, 2006.

[23] M. Pelikan, K. Sastry, M. V. Butz, and D. E. Goldberg.Analysis of estimation of distribution algorithms andgenetic algorithms on NK landscapes. Genetic andEvolutionary Comp. Conf. (GECCO-2008), pages1033–1040, 2008.

[24] M. Pelikan, K. Sastry, D. E. Goldberg, M. V. Butz, andM. Hauschild. Performance of evolutionary algorithms onNK landscapes with nearest neighbor interactions andtunable overlap. Genetic and Evolutionary Comp. Conf.(GECCO-2009), pages 851–858, 2009.

[25] K. Sastry. Evaluation-relaxation schemes for genetic andevolutionary algorithms. Master’s thesis, University ofIllinois at Urbana-Champaign, Department of GeneralEngineering, Urbana, IL, 2001.

[26] P. F. Stadler. Landscapes and their correlation functions.Journal of Mathematical Chemistry, 20:1–45, 1996.

[27] A. M. Sutton, L. D. Whitley, and A. E. Howe. A polynomialtime computation of the exact correlation structure ofk-satisfiability landscapes. Genetic and EvolutionaryComp. Conf. (GECCO-2009), pages 365–372, 2009.

[28] D. Thierens, D. E. Goldberg, and A. G. Pereira. Dominoconvergence, drift, and the temporal-salience structure ofproblems. Proc. of the Int. Conf. on EvolutionaryComputation (ICEC-98), pages 535–540, 1998.

[29] E. Weinberger. Correlated and uncorrelated fitnesslandscapes and how to tell the difference. BiologicalCybernetics, 63(5):325–336, 1990.

[30] A. H. Wright, R. K. Thompson, and J. Zhang. Thecomputational complexity of N-K fitness functions. IEEETrans. on Evolutionary Computation, 4(4):373–379, 2000.

I Fitness distance correlation for arbitrary solutionsI For arbitrary solutions, differences are negligible.I Situation doesn’t seem to change with k.

I Fitness distance correlation for local optimaI Correctly indicates that nearest-neighbor instances are easier.I Differences increase with k (as expected).

I Correlation lengthI Differences are negligible.I Situation doesn’t seem to change with k.

Martin Pelikan NK Landscapes, Problem Difficulty, and Hybrid EAs

Page 23: NK Landscapes, Problem Difficulty, and Hybrid Evolutionary Algorithms

Conclusions and Future Work

Conclusions

I Different measures focus on different facets of difficulty.I Most success with fitness distance correlation for local optima.I Differences in measures usually very small.I Need better measures to make these results useful in practice.I Most measures depend on information that is typically unavailable.

Future work

I Compare to other existing measures (signal/noise, scaling).I Look at other optimization algorithms (e.g. GAs and ESs).I Define measures that do not depend on problem-specific

information.I Define measures that are more practically useful

I Tune algorithm to problem (parameters, operators).I Choose best optimization algorithm.I Drive design of new optimization algorithms.

Martin Pelikan NK Landscapes, Problem Difficulty, and Hybrid EAs

Page 24: NK Landscapes, Problem Difficulty, and Hybrid Evolutionary Algorithms

Acknowledgments

Acknowledgments

I NSF; NSF CAREER grant ECS-0547013.

I University of Missouri; High Performance ComputingCollaboratory sponsored by Information Technology Services;Research Award; Research Board.

Martin Pelikan NK Landscapes, Problem Difficulty, and Hybrid EAs