from mating pool distributions to model overfitting
DESCRIPTION
This paper addresses selection as a source of overfitting in Bayesian estimation of distribution algorithms (EDAs). The purpose of the paper is twofold. First, it shows how the selection operator can lead to model overfitting in the Bayesian optimization algorithm (BOA). Second, the metric score that guides the search foran adequate model structure is modified to take into account the non-uniform distribution of the mating pool generated by tournament selection.TRANSCRIPT
IntroductionSelection as a Source of Overfitting
Improving Model Accuracy
From Mating Pool Distributionsto Model Overfitting
Claudio F. Lima1 Fernando G. Lobo1 Martin Pelikan2
1Department of Electronics and Computer Science EngineeringUniversity of Algarve, Portugal
2Missouri Estimation of Distribution Algorithm LaboratoryUniversity of Missouri at St. Louis, USA
GECCO 2008, Atlanta, USA
Lima, Lobo, & Pelikan From Mating Pool Distributions to Model Overfitting
IntroductionSelection as a Source of Overfitting
Improving Model Accuracy
MotivationBOA: The Bayesian Optimization AlgorithmExperimental Setup
Probabilistic Modeling in EDAs
The coreEDA is defined by its model complexity (expressiveness).
The spectrumSimpler EDAs only learn the parameters of the model.More powerful EDAs learn both structure and parameters.
Bayesian EDAsSolve a broad class of problems efficiently.But oftentimes models do not exactly reflect the underlyingproblem structure.
Lima, Lobo, & Pelikan From Mating Pool Distributions to Model Overfitting
IntroductionSelection as a Source of Overfitting
Improving Model Accuracy
MotivationBOA: The Bayesian Optimization AlgorithmExperimental Setup
Motivation
Importance of model accuracyModel-based efficiency enhancement techniques.
Substructural local search.Fitness estimation (surrogate fitness function).
Offline interpretation of probabilistic models.Bias structural learning in EDAs (Mark Hauschild’s talk).Design structure-aware variation operators.
Empirical findings with BOA: Tournament vs. truncationPopulation size requirements.Model quality difference.
Lima, Lobo, & Pelikan From Mating Pool Distributions to Model Overfitting
IntroductionSelection as a Source of Overfitting
Improving Model Accuracy
MotivationBOA: The Bayesian Optimization AlgorithmExperimental Setup
Bayesian Optimization Algorithm (BOA)
BOAEDA which uses Bayesian networks (BNs).
Model selected solutions to generate new ones.Similar algorithms:
Estimation of Bayesian networks algorithm (EBNA).Learning factorized distribution algorithm (LFDA).
Learning BNsDecision trees express conditional probabilities.Greedy algorithm for learning.Scoring metrics considered: K2 and BIC.
Lima, Lobo, & Pelikan From Mating Pool Distributions to Model Overfitting
IntroductionSelection as a Source of Overfitting
Improving Model Accuracy
MotivationBOA: The Bayesian Optimization AlgorithmExperimental Setup
Experimental Setup
Test problemTo investigate MSA we use a problem of known structure.Clear identification of:
Necessary dependencies→ successful tractability.Unnecessary dependencies→ poor model interpretability.
m − k trap problemConsists of m concatenated k -bit trap functions.For k ≥ 3, the model should be able to maintain k−orderstatistics, otherwise→ exponential scalability.k = 5 used in experiments.
Lima, Lobo, & Pelikan From Mating Pool Distributions to Model Overfitting
IntroductionSelection as a Source of Overfitting
Improving Model Accuracy
MotivationBOA: The Bayesian Optimization AlgorithmExperimental Setup
Ideal Model for the m − k trap problem
Joint distribution in BNsp(Xa,Xb,Xc ,Xd ) = p(Xa)p(Xb|Xa)p(Xc |Xa,Xb)p(Xd |Xa,Xb,Xc)
Clique between variables of the same trap function.While between different traps there are no dependencies.
Lima, Lobo, & Pelikan From Mating Pool Distributions to Model Overfitting
IntroductionSelection as a Source of Overfitting
Improving Model Accuracy
MotivationBOA: The Bayesian Optimization AlgorithmExperimental Setup
Measuring Model Structural Accuracy
Definition 1The model structural accuracy (MSA) is defined as the ratio ofcorrect edges over the total number of edges in the BN.
Definition 2An edge is correct if it connects two variables that are linkedaccording to the objective function definition.
Definition 3Model overfitting is defined as the inclusion of incorrect (orunnecessary) edges to the BN.
Lima, Lobo, & Pelikan From Mating Pool Distributions to Model Overfitting
IntroductionSelection as a Source of Overfitting
Improving Model Accuracy
Influence of Selection in Model AccuracySelection: The Mating-Pool Distribution Generator
Selection Methods
Selection methods consideredTournament selection. Randomly pick s individuals andchoose the best. Repeat n times.Truncation selection. Choose the best τ proportion of thepopulation.
Equivalent selection intensitySelection intensity, I 0.56 0.84 1.03 1.16 1.35 1.54 1.87Tournament size, s 2 3 4 5 7 10 20
Truncation threshold, τ(%) 66 47 36 30 22 15 8
Lima, Lobo, & Pelikan From Mating Pool Distributions to Model Overfitting
IntroductionSelection as a Source of Overfitting
Improving Model Accuracy
Influence of Selection in Model AccuracySelection: The Mating-Pool Distribution Generator
Selection-Metric Comparison
0.5 1 1.5 20
0.2
0.4
0.6
0.8
1
Selection intensity, I
Mod
el S
truc
tura
l Acc
urac
y, M
SA
Tournament w/ K2Tournament w/ BICTruncation w/ K2Truncation w/ BIC
0.5 1 1.5 20
1
2
3
4
5
6
7
x 105
Selection intensity, I
Num
ber
of fu
nctio
n ev
alua
tions
, nfe
Tournament w/ K2Tournament w/ BICTruncation w/ K2Truncation w/ BIC
Observations1 Truncation performs better than tournament selection.2 K2 metric performs better than BIC metric.
Lima, Lobo, & Pelikan From Mating Pool Distributions to Model Overfitting
IntroductionSelection as a Source of Overfitting
Improving Model Accuracy
Influence of Selection in Model AccuracySelection: The Mating-Pool Distribution Generator
A Simple Analysis of Tournament Selection
Probability of an individual with rank i to win a tournament
pi =s−1∏j=1
i − jn − j
, for s ≥ 2. (1)
Number of copies in the mating pool
ci = s pi . (2)
Which can be approximated by a power distribution with p.d.f.
f (x) = α xα−1, 0 < x ≤ 1, α = s, x = i/n. (3)
Lima, Lobo, & Pelikan From Mating Pool Distributions to Model Overfitting
IntroductionSelection as a Source of Overfitting
Improving Model Accuracy
Influence of Selection in Model AccuracySelection: The Mating-Pool Distribution Generator
A Simple Analysis of Truncation Selection
Number of copies in the mating pool
ci =
{0, if i < n (1− (τ/100))1, otherwise.
(4)
Lima, Lobo, & Pelikan From Mating Pool Distributions to Model Overfitting
IntroductionSelection as a Source of Overfitting
Improving Model Accuracy
Influence of Selection in Model AccuracySelection: The Mating-Pool Distribution Generator
Distribution of the Expected Number of Copies
0 0.2 0.4 0.6 0.8 10
5
10
15
20
Rank, i (in percentile)
Exp
ecte
d nu
mbe
r of
cop
ies,
ci
s = 2s = 3s = 4s = 5s = 7s = 10s = 20
0 0.2 0.4 0.6 0.8 10
1
Rank, i (in percentile)
Exp
ecte
d nu
mbe
r of
cop
ies,
ci
τ = 66%
τ = 47%
τ = 36%
τ = 30%
τ = 22%
τ = 15%
τ = 8%
Tournament and truncation selection differ in:1 Window size.2 Distribution shape.
Lima, Lobo, & Pelikan From Mating Pool Distributions to Model Overfitting
IntroductionSelection as a Source of Overfitting
Improving Model Accuracy
Influence of Selection in Model AccuracySelection: The Mating-Pool Distribution Generator
Mating Pool Distribution: An Example
Functionfmk−trap = f1(X1X2X3) + f2(X4X5X6) + f3(X7X8X9)
ExampleRank Individual Copies
8 111 100 000 1 4 87 111 111 101 1 3 46 111 000 110 1 2 25 000 001 000 1 1 14 111 010 011 0 0 03 000 101 001 0 0 02 000 110 101 0 0 01 101 000 101 0 0 0
Mating pool distributions:Uniform Linear Exponential
X6X7 freq. X6X7 freq. X6X7 freq.00 0.25 00 0.4 00 0.5301 0.25 01 0.2 01 0.1310 0.25 10 0.1 10 0.0711 0.25 11 0.3 11 0.27
Lima, Lobo, & Pelikan From Mating Pool Distributions to Model Overfitting
IntroductionSelection as a Source of Overfitting
Improving Model Accuracy
Influence of Selection in Model AccuracySelection: The Mating-Pool Distribution Generator
Mating Pool w/ Non-Uniform Distribution
What’s good about it?
Detection of existent patterns is achieved with less data.Tournament selection requires smaller population sizesthan truncation.
What’s bad about it?Irrelevant features can also be detected as patterns.
Excessive complexity→ model overfitting.
So what?Recognize that we are learning from imbalanced data sets.Change scoring metric accordingly!
Lima, Lobo, & Pelikan From Mating Pool Distributions to Model Overfitting
IntroductionSelection as a Source of Overfitting
Improving Model Accuracy
Modeling Metric Gain When OverfittingAdaptive Scoring Metric
Improving Model Accuracy for Tournament
Agenda1 Compute how the power-law-based frequencies differ from
uniform ones in the mating pool.2 Use computed frequency deviation to estimate scoring
metric gain due to overfitting.3 Change the complexity penalty to counterbalance the
misleading metric gain.
Lima, Lobo, & Pelikan From Mating Pool Distributions to Model Overfitting
IntroductionSelection as a Source of Overfitting
Improving Model Accuracy
Modeling Metric Gain When OverfittingAdaptive Scoring Metric
Deviation of power-law-based frequencies
Deviation is linear w.r.t. tournament sizeExpected market share of top-ranked individuals afterselection is given by the (right-side area of the) c.d.f. of thepower distribution.
F (x) = xs, 0 < x < 1, s ≥ 2. (5)
In the worst case, market share grows linearly withtournament size.Thus, misleading frequencies deviate linearly from the“true” ones (uniform distribution).More details in the paper.
Lima, Lobo, & Pelikan From Mating Pool Distributions to Model Overfitting
IntroductionSelection as a Source of Overfitting
Improving Model Accuracy
Modeling Metric Gain When OverfittingAdaptive Scoring Metric
Computing Metric Gain
Consider the decision of adding an edge from X2 to X1
Metric gain must be calculated:Gmetric = ScoreAfter − ScoreBefore − ComplexityPenaltyUniform mating pool revealsp00 = p01 = p10 = p11 = 0.25→ Gmetric < 0.Simulate tournament mating pool by assigning
p00 ≈ 0.25 + ∆(s − 1),p01 ≈ 0.25−∆(s − 1),p10 ≈ 0.25−∆(s − 1),p11 ≈ 0.25 + ∆(s − 1).
Where ∆ is related to the proportion of top-individualswhich lead to overfitting (more info in the paper).
Lima, Lobo, & Pelikan From Mating Pool Distributions to Model Overfitting
IntroductionSelection as a Source of Overfitting
Improving Model Accuracy
Modeling Metric Gain When OverfittingAdaptive Scoring Metric
Computing Metric Gain (2)
Approximated metric gain
G′ ≈ 2 (0.25−∆(s − 1)) log2 (0.25−∆(s − 1)) + 1. (6)
100
101
102
10−4
10−3
10−2
10−1
100
101
102
Tournament size, s
Met
ric g
ain,
G’ B
IC
ApproximationO(s)
O(s2)
Lima, Lobo, & Pelikan From Mating Pool Distributions to Model Overfitting
IntroductionSelection as a Source of Overfitting
Improving Model Accuracy
Modeling Metric Gain When OverfittingAdaptive Scoring Metric
Comparison of Different Penalty Corrections
Complexity penalty for the K2 metric
p(B) = 2−0.5cs log2(n)P`
i=1 |Li | → cs = {1,√
s, s, and s log2(s)}
5 10 15 200
0.2
0.4
0.6
0.8
1
Tournament size, s
Mod
el S
truc
tura
l Acc
urac
y, M
SA
cs=1
cs=sqrt(s)
cs=s
cs=s*log
2(s)
5 10 15 200
1
2
3
4
5x 10
5
Tournament size, s
Num
ber
of fu
nctio
n ev
alua
tions
, nfe
c
s=1
cs=sqrt(s)
cs=s
cs=s*log
2(s)
Lima, Lobo, & Pelikan From Mating Pool Distributions to Model Overfitting
IntroductionSelection as a Source of Overfitting
Improving Model Accuracy
Modeling Metric Gain When OverfittingAdaptive Scoring Metric
Results for the s−penalty (cs = s)
5 10 15 200.9
0.92
0.94
0.96
0.98
1
Tournament size, s
Mod
el S
truc
tura
l Acc
urac
y, M
SA
l = 40l = 60l = 80l = 100
5 100.992
0.994
0.996
0.998
1
5 10 15 200
2
4
6
8
10
12
14x 10
5
Tournament size, s
Num
ber
of fu
nctio
n ev
alua
tions
, nfe
l = 40l = 60l = 80l = 100
RemarkModel structural accuracy is superior to 99%!
Lima, Lobo, & Pelikan From Mating Pool Distributions to Model Overfitting
IntroductionSelection as a Source of Overfitting
Improving Model Accuracy
Modeling Metric Gain When OverfittingAdaptive Scoring Metric
Results for Truncation with the Standard Penalty
102030405060700.9
0.92
0.94
0.96
0.98
1
Truncation threshold, τ
Mod
el S
truc
tura
l Acc
urac
y, M
SA
l = 40l = 60l = 80l = 100
102030405060700
0.5
1
1.5
2x 10
6
Truncation threshold, τ
Num
ber
of fu
nctio
n ev
alua
tions
, nfe
l = 40l = 60l = 80l = 100
RemarkGood results, but not as good as for tournament selectionwith the s−penalty.
Lima, Lobo, & Pelikan From Mating Pool Distributions to Model Overfitting
IntroductionSelection as a Source of Overfitting
Improving Model Accuracy
Modeling Metric Gain When OverfittingAdaptive Scoring Metric
Summary & Conclusions
Selection addressed as a source of overfitting.Influence of selection method in BN learning:
Truncation selection→ uniform mating pool is moreappropriate for learning (standard scoring metrics).Tournament selection→ non-uniform mating pool leads tomodel overfitting.
Scoring metric adapted to the power distribution of themating pool generated by tournament selection.
Model accuracy has been considerably improved!
Improved model interpretability to assist efficiencyenhancement techiniques and human researchers.
Lima, Lobo, & Pelikan From Mating Pool Distributions to Model Overfitting
IntroductionSelection as a Source of Overfitting
Improving Model Accuracy
Modeling Metric Gain When OverfittingAdaptive Scoring Metric
From Mating Pool Distributionsto Model Overfitting
Claudio F. Lima1 Fernando G. Lobo1 Martin Pelikan2
1Department of Electronics and Computer Science EngineeringUniversity of Algarve, Portugal
2Missouri Estimation of Distribution Algorithm LaboratoryUniversity of Missouri at St. Louis, USA
GECCO 2008, Atlanta, USA
Lima, Lobo, & Pelikan From Mating Pool Distributions to Model Overfitting
IntroductionSelection as a Source of Overfitting
Improving Model Accuracy
Modeling Metric Gain When OverfittingAdaptive Scoring Metric
Scoring Metrics
(Bayesian-Dirichlet) K2 metric
K 2(B) = p(B)∏̀i=1
∏l∈Li
1Γ(mi(l) + 2)
∏xi
Γ(mi(xi , l) + 1), (7)
where p(B) = 2−0.5 log2(n)P`
i=1 |Li |.
Bayesian information criteria (BIC) metric
BIC(B) =∑̀i=1
∑l∈Li
∑xi
(mi(xi , l) log2
mi(xi , l)mi(l)
)− |Li |
log2(n)
2
.
(8)
Lima, Lobo, & Pelikan From Mating Pool Distributions to Model Overfitting