GAs and Premature Convergence

Premature convergence – the GA converges too early to a suboptimal solution
o as the population evolves, only little new genetic material can be produced

Reasons for premature convergence:
o improper selection pressure
o insufficient population size
o deception
o improper representation and genetic operators

Motivation and Realization

Motivation – to maintain a diversity of the evolved population and extend the explorative power of the algorithm

Realization
o Convergence of the population is allowed up to a specified extent
o Convergence at individual positions of the representation is controlled
o Convergence rate – specifies the maximal difference in the frequency of ones and zeroes in every column of the population
  – ranges from 0 to PopSize/2
o Principal condition – at any position of the representation, neither ones nor zeroes can exceed the frequency constraint
o Specific way of modifying the population genotype

Algorithm of GALCO

1. Generate initial population
2. Choose parents
3. Create offspring
4. if (offspring > parents)
   then replace parents with offspring
   else {
      find(replacement)
      replace_with_mask(child1, replacement)
      find(replacement)
      replace_with_mask(child2, replacement)
   }
5. if (not finished) then go to step 2

Operator replace_with_mask

Mask – vector of integer counters; stores the number of 1s at each bit position of the representation
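A minimal Python sketch of how a mask of per-column counters could enforce the frequency constraint. The names `build_mask` and the argument list of `replace_with_mask`, the bit-by-bit copy rule, and the reading of the convergence rate `c` as the allowed deviation of the ones count from PopSize/2 are assumptions made for illustration; the slides do not spell out the operator.

```python
from typing import List


def build_mask(population: List[List[int]]) -> List[int]:
    """Count the ones in every column of a binary population."""
    return [sum(column) for column in zip(*population)]


def replace_with_mask(child: List[int], replacement: List[int],
                      mask: List[int], pop_size: int, c: float) -> None:
    """Copy child bits into `replacement` in place, but only where the
    resulting count of ones stays within [pop_size/2 - c, pop_size/2 + c].
    The mask is updated for every bit that is actually changed.
    (Assumed rule, not taken from the slides.)"""
    low, high = pop_size / 2 - c, pop_size / 2 + c
    for i, (new_bit, old_bit) in enumerate(zip(child, replacement)):
        if new_bit == old_bit:
            continue  # nothing changes at this position
        new_count = mask[i] + (new_bit - old_bit)
        if low <= new_count <= high:
            replacement[i] = new_bit
            mask[i] = new_count
```

With this rule a child can overwrite an individual only as far as the population keeps the prescribed balance of ones and zeroes in every column.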


Test problems – static

o F101(x, y)
o Deceptive function
o Hierarchical function
o Royal Road problem

GALCO – influence of parameter C

GALCO vs. SGA

Multimodal Optimization

[Figure: initial population and SIGA results; panels labelled "with" and "without"]

Multimodal Optimization (cont.)

[Figure: initial population, GALCO results, SIGA results]

GA with Real-coded Binary Representation (GARB)

Pseudo-binary representation – bits are encoded by a real number r ∈ [0.0, 1.0]
o interpretation(r) = 1, for r > 0.5
                    = 0, for r < 0.5
o the code is redundant
o Example: ch1 = [0.92 0.07 0.23 0.62]
           ch2 = [0.65 0.19 0.41 0.86]
           interpretation(ch1) = interpretation(ch2) = [1 0 0 1]

Gene strength – expresses the degree of stability of a gene

o The closer a gene is to 0.5, the weaker (less stable) it is

o "One genes", from strongest to weakest: 0.92 > 0.86 > 0.65 > 0.62

o "Zero genes", from strongest to weakest: 0.07 > 0.19 > 0.23 > 0.41
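The interpretation rule and the notion of gene strength can be sketched in a few lines of Python. The function names `interpret` and `strength` are hypothetical, measuring strength as the distance from 0.5 is one natural reading of the slide, and mapping r = 0.5 to 0 is an assumption because the slide leaves that case undefined.

```python
def interpret(chromosome):
    """Map each real-valued gene to the bit it represents.
    A gene exactly at 0.5 is mapped to 0 here (assumption)."""
    return [1 if r > 0.5 else 0 for r in chromosome]


def strength(r):
    """Gene strength read as the distance from the 0.5 threshold (assumption)."""
    return abs(r - 0.5)


ch1 = [0.92, 0.07, 0.23, 0.62]
ch2 = [0.65, 0.19, 0.41, 0.86]

# Both chromosomes decode to the same bit string: the code is redundant.
print(interpret(ch1), interpret(ch2))

# Genes sorted from strongest to weakest.
print(sorted([0.92, 0.62, 0.65, 0.86], key=strength, reverse=True))
print(sorted([0.07, 0.23, 0.19, 0.41], key=strength, reverse=True))
```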

Gene-strength adjustment mechanism

Genes of chromosomes created by crossover are adjusted
o depending on their interpretation
o and on the relative frequency of ones (zeros) at the given position in the population, P[]

Example: P = [0.82 0.17 0.35 0.68] means that the population has
  82% ones at position 1,
  17% ones at position 2,
  35% ones at position 3,
  68% ones at position 4.

Genes that prevail in the population are weakened; the others are strengthened.

Strengthening and weakening of genes

Weakening
  gene' = gene + c*(1.0-P[i]), if (gene<0.5) and (P[i]<0.5)
  (the gene has the value zero and zeros prevail at position i in the population)
and
  gene' = gene – c*P[i], if (gene>0.5) and (P[i]>0.5)

Strengthening
  gene' = gene – c*P[i], if (gene<0.5) and (P[i]>0.5)
  (the gene has the value zero and ones prevail at position i in the population)
and
  gene' = gene + c*(1.0-P[i]), if (gene>0.5) and (P[i]<0.5)

The constant c determines the speed of gene adaptation: c ∈ (0.0, 0.2]
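The four update rules can be written as one Python function, shown here as a sketch. The name `adjust_genes` is hypothetical, and clamping the result to [0.0, 1.0] is an assumption added so that a gene cannot leave the representation's range; the slide does not say how overshoots are handled.

```python
def adjust_genes(child, p, c):
    """Weaken genes that agree with the population majority and
    strengthen genes that disagree with it.

    child : list of real-valued genes in [0, 1]
    p     : p[i] is the relative frequency of ones at position i
    c     : adaptation speed
    """
    adjusted = []
    for gene, freq in zip(child, p):
        if gene < 0.5 and freq < 0.5:      # zero gene, zeros prevail: weaken
            gene = gene + c * (1.0 - freq)
        elif gene > 0.5 and freq > 0.5:    # one gene, ones prevail: weaken
            gene = gene - c * freq
        elif gene < 0.5 and freq > 0.5:    # zero gene, ones prevail: strengthen
            gene = gene - c * freq
        elif gene > 0.5 and freq < 0.5:    # one gene, zeros prevail: strengthen
            gene = gene + c * (1.0 - freq)
        # Clamp to the valid range (assumption, not stated in the slides).
        adjusted.append(min(1.0, max(0.0, gene)))
    return adjusted
```

For example, with c = 0.1 the gene 0.92 at a position holding 82% ones moves to 0.92 - 0.1*0.82 = 0.838, one step closer to the 0.5 threshold.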

Stabilization of promising individuals

Offspring that are better than their parents should be more stable than the other, low-quality generated solutions

o Chromosomes of promising individuals are regenerated with strong genes
  ch  = (0.71, 0.45, 0.18, 0.57)
  ch' = (0.97, 0.03, 0.02, 0.99)

o Genes of promising individuals survive more generations without being changed by weakening

Pseudocode for GARB

begin
  initialize(OldPop)
  repeat
    calculate P[] from OldPop
    repeat
      select Parents from OldPop
      generate Children
      adjust Children genes
      evaluate Children
      if Child is better than Parents
        then rescale Child
      insert Children to NewPop
    until NewPop is completed
    switch OldPop and NewPop
  until termination condition
end

Test problems – dynamic

Ošmera's dynamic problem
  g(x,t) = 1 - exp(-200 (x - c(t))^2)
  c(t) = 0.04 ⌊t/20⌋
  The minimum g(x,t) = 0.0 moves every 20 generations

Oscillating Knapsack Problem
  14 objects, w_i = 2^i, i = 0,...,13
  f(x) = 1 / (1 + |target - Σ w_i x_i|)
  The target oscillates between the values 12643 and 2837, whose binary representations differ in 9 bits
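Both dynamic test functions are short enough to write out in Python. This is a sketch under two assumptions: c(t) uses integer division of t by 20 (so the optimum moves every 20 generations), and the knapsack fitness uses the absolute difference between the target and the packed weight. All function names are hypothetical.

```python
import math


def osmera_c(t):
    """Position of the optimum; it shifts by 0.04 every 20 generations."""
    return 0.04 * (t // 20)


def osmera_g(x, t):
    """Dynamic function to be minimized; its minimum value is 0.0 at x = c(t)."""
    return 1.0 - math.exp(-200.0 * (x - osmera_c(t)) ** 2)


WEIGHTS = [2 ** i for i in range(14)]   # w_i = 2^i, i = 0..13
TARGETS = (12643, 2837)                 # the two oscillating targets


def knapsack_fitness(bits, target):
    """Fitness is 1.0 when the selected weights sum exactly to the target."""
    total = sum(w * b for w, b in zip(WEIGHTS, bits))
    return 1.0 / (1.0 + abs(target - total))


def to_bits(value):
    """14-bit little-endian encoding, matching the order of WEIGHTS."""
    return [(value >> i) & 1 for i in range(14)]


# Hamming distance between the optimal solutions for the two targets.
hamming = sum(a != b for a, b in zip(to_bits(TARGETS[0]), to_bits(TARGETS[1])))
print(hamming)  # 9
```

Because the weights are powers of two, every target has exactly one optimal bit string, so each switch of the target forces the population to change 9 bits.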

Results on static problems

[Figure: fitness vs. fitness evaluations (x1000), GARB compared with SGA, on the DF3, H-IFF and F101 problems]

Results on static problems (cont.)

[Figure: frequency of ones at given positions (gene60 and gene62; gene51 and gene59; gene80 and gene200) together with best and average fitness, plotted against fitness evaluations (x1000)]

Results on dynamic problems

Oscillating knapsack problem

Results on dynamic problems (cont.)
• Ošmera's dynamic problem

                  Immediately after optimum change    Overall
Algorithm         MTE      StDev                      MTE      StDev
GARB c = 0.025    83.3     30.6                       50.4     25.2
GARB c = 0.075    25.6     34.6                       2.4      7.4
GARB c = 0.125    12.8     22.4                       1.0      3.9
GARB c = 0.175    10.2     19.7                       0.7      3.0
GARB c = 0.225    9.2      19.3                       0.6      2.7
SGA binary        N/A      N/A                        57.343.61
SGA Gray          N/A      N/A                        47.66    42.94
CBM-B             N/A      N/A                        19.39    33.13

MTE – Mean Tracking Error [%] – the mean deviation between the best individual in the population and the optimal solution, computed over all generations.

Recovery from a homogeneous population

[Figure: diversity measure, average fitness and best fitness over generations 0 to 100 for the DF3 problem and the Knapsack problem]

Weakness of Simple Selectorecombinative GAs

Scale poorly on hard problems, largely as a result of their mixing behaviour
o Inability of SGA to correctly identify and adequately mix the appropriate BBs in subsequent generations
o Exponential computational complexity of SGA

Crossover operators or other exchange mechanisms are needed that adapt to the problem at hand
o Linkage adaptation

Naive approaches – the inversion operator
Reverses the order of genes in a randomly chosen substring of the chromosome

  10011 – (1,1) | (2,0)(3,0)(4,1) | (5,1)
  after inversion: (1,1) (4,1)(3,0)(2,0) (5,1)

Unusable because the signal for improving linkage is out of balance with the signal for learning alleles.
o tα < tλ
  - alleles undergo more direct selection than linkage
  → the GA decides on the optimal allele settings before it finds out which gene combinations to form together and mix with each other.
o Solution: reverse the inequality to tα > tλ (BUT HOW?)
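The inversion example can be reproduced with a short Python sketch. Genes are (name, allele) pairs, and the helper names `invert`, `random_inversion` and `decode` are hypothetical. Inversion changes only the order of the tagged genes, so the decoded bit string stays the same.

```python
import random


def invert(chromosome, start, end):
    """Reverse the order of the genes in chromosome[start:end]."""
    return chromosome[:start] + chromosome[start:end][::-1] + chromosome[end:]


def random_inversion(chromosome, rng):
    """Apply inversion to a randomly chosen substring."""
    start, end = sorted(rng.sample(range(len(chromosome) + 1), 2))
    return invert(chromosome, start, end)


def decode(chromosome):
    """Read the alleles in order of gene name, ignoring gene position."""
    return [allele for _, allele in sorted(chromosome)]


chromosome = [(1, 1), (2, 0), (3, 0), (4, 1), (5, 1)]   # encodes 10011
inverted = invert(chromosome, 1, 4)
print(inverted)          # [(1, 1), (4, 1), (3, 0), (2, 0), (5, 1)]
print(decode(inverted))  # [1, 0, 0, 1, 1]
```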

Competent GAs

Can solve
o hard problems (multimodal, deceptive, high degree of subsolution interaction, noise, ...),
o quickly,
o accurately,
o reliably.

Messy GAs – mGA, fmGA, gemGA
Learning linkage GAs – LLGA
Compact GAs – cGA, ECGA
Bayesian optimization algorithm – BOA

Messy Genetic Algorithms - mGAs

Inspiration from nature – evolution starts from the simplest forms of life

mGA departed from SGA in four ways:
o messy codings
o messy operators
o separation of processing into three heterogeneous phases
o epoch-wise iteration to improve the complexity of the solution

mGA’s codings

Tagged alleles:
o Variable-length strings: (name1, allele1) … (nameN, alleleN)
  ((4,0) (1,1) (2,0) (4,1) (4,1) (5,1))

Over-specification – multiple gene instances (gene 4)
o Majority voting – would express deceptive genes too readily
o First-come first-served (left-to-right expression) – positional priority

Underspecification – missing gene instances (gene 3)
o Average schema value – variance is too high
o Competitive template – a solution that is locally optimal with respect to k-bit perturbations
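Left-to-right expression with a competitive template can be sketched in Python as follows. The function name `express` and the template values used in the example are hypothetical; gene names are 1-based as on the slide.

```python
def express(messy, template):
    """Decode a messy string into a full solution.

    Over-specification: the first instance of a gene (left to right) wins.
    Underspecification: missing genes take their value from the template.
    """
    solution = list(template)
    seen = set()
    for name, allele in messy:
        if name not in seen:          # first-come first-served
            seen.add(name)
            solution[name - 1] = allele
    return solution


messy = [(4, 0), (1, 1), (2, 0), (4, 1), (4, 1), (5, 1)]
template = [0, 0, 1, 0, 0]            # illustrative competitive template
print(express(messy, template))       # [1, 0, 1, 0, 1]
```

Gene 4 appears three times and takes the value 0 from its leftmost instance; gene 3 is missing and is filled in from the template.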

Messy operators: cut & splice

Cut – divides a single string into two parts
Splice – joins the head of one string with the tail of the other one

o When short strings are mated – the probability of cut is small → mostly the strings will just be spliced
  – the strings' length is doubled

o When long strings are mated – the probability of cut is large → one-point crossover
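A simplified Python sketch of cut and splice shows both limiting cases. The assumptions here are that the cut probability grows linearly with string length as `p_kappa * (len - 1)`, that splicing always happens, and that an uncut first string acts as a head while an uncut second string acts as a tail. The names `maybe_cut` and `cut_and_splice` are hypothetical.

```python
import random


def maybe_cut(string, p_kappa, rng):
    """Cut the string at a random point with probability p_kappa * (len - 1).
    Returns (head, tail), or None if the string is not cut."""
    p_cut = min(1.0, p_kappa * (len(string) - 1))
    if rng.random() < p_cut:
        point = rng.randint(1, len(string) - 1)
        return string[:point], string[point:]
    return None


def cut_and_splice(a, b, p_kappa, rng):
    """Join the head of one string with the tail of the other.
    Short strings are rarely cut, so they are usually just concatenated;
    long strings are usually both cut, which acts like one-point crossover."""
    cut_a = maybe_cut(a, p_kappa, rng)
    cut_b = maybe_cut(b, p_kappa, rng)
    head_a, tail_a = cut_a if cut_a else (a, [])
    head_b, tail_b = cut_b if cut_b else ([], b)
    children = [head_a + tail_b, head_b + tail_a]
    return [child for child in children if child]   # drop empty strings
```

With `p_kappa = 0.0` the two parents are simply concatenated into one child; with a large `p_kappa` both parents are cut and two recombined children are produced.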

mGAs: three heterogeneous phases

Initialization
o Enumerative initialization of the population with all sub-strings of a certain length k << l
  – C(l,k) 2^k = O(l^k) computations
o Guaranteed that all BBs of a certain size are present in the population

Primordial phase
o Only selection is used, to dope the population with good BBs
o Good linkage groups are selected before their alleles are allowed to be mixed

Juxtapositional phase
o selection + cut & splice
o Mixing of the BBs

Fast messy genetic algorithms - fmGAs

Probabilistically complete enumeration
o A population of strings of length l' close to l is generated
o Assumption: each string contains many different BBs of length k << l

Building block filtering – extracts highly fit and effectively linked BBs
o Repeated (1) selection and (2) gene deletion
o Only O(l) computations to converge

Extended thresholding – tournaments are held only between strings that have a threshold number of genes in common

fmGA vs. mGA: 150-bit long problem, 30 five-bit deceptive functions
o 1.9·10^5 vs. 5.9·10^8 evaluations

Gene expression messy GA - gemGA

Messy ???
o No variable-length strings
o No under- or over-specification
o No left-to-right expression

Messy use of heterogeneous phases of processing in gemGA
o Linkage learning phase – first identifies linkage groups
o Mixing phase – selection + recombination
  – exchanges good allele combinations within those groups to find the optimal solution

gemGA: The idea

Linkage learning phase
o Transcription I (antimutation)
  – Each string undergoes l one-bit perturbations
  – Improvements are ignored ?!? (the bit does not belong to an optimal BB)
  – Changes that degrade the structure are marked as possible linkage group candidates
  Ex.: two 3-bit deceptive BBs
      111           101
      marked        not marked
      (degrades)    (improves)

o Transcription II
  – Identifies the exact relations among the genes by checking nonlinearities
    IF f(X'i) + f(X'j) != f(X'ij) THEN link(i,j)
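The nonlinearity check of Transcription II can be sketched in Python. This sketch assumes that the terms in the condition stand for the change in fitness caused by perturbing bit i, bit j, or both, measured against the unperturbed string X; the slide does not state this explicitly. The names `flip` and `is_linked` are hypothetical.

```python
def flip(x, positions):
    """Return a copy of x with the bits at the given positions inverted."""
    y = list(x)
    for p in positions:
        y[p] = 1 - y[p]
    return y


def is_linked(f, x, i, j, tol=1e-9):
    """Genes i and j are linked if the effect of flipping both differs
    from the sum of the effects of flipping each one alone."""
    base = f(x)
    d_i = f(flip(x, [i])) - base
    d_j = f(flip(x, [j])) - base
    d_ij = f(flip(x, [i, j])) - base
    return abs(d_i + d_j - d_ij) > tol
```

For an additively separable fitness function, bits belonging to different subfunctions give d_i + d_j = d_ij and are not linked, while bits inside the same deceptive block generally violate the equality.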

Linkage Learning GA - LLGA

More "messy" than gemGA
o Variable-length strings
o Left-to-right expression
o Always over-specification

NO primordial or juxtapositional phase – more SGA-like

Idea:
o Probabilistic expression that slows down the convergence of alleles
o Crossover that adapts linkage at the same time that alleles are exchanged

LLGA – Probabilistic expression

Clockwise interpretation

(3,1)(2,0)(5,1)(1,1)(4,0)

1 0 1 0 1

LLGA – probabilistic expression cont.

The allele 1 is expressed with the probability δ/l and 1/l respectively

The allele 0 is expressed with the probability (l-δ)/l and (l-1)/l respectively

LLGA: Effect of PE on BBs

Assume a 6-bit problem with a BB that requires genes 4, 5, and 6 to take the value 1 in a trap function.

o Initially the block 111 will be expressed roughly 1/8th of the time

o After the linkage has evolved properly, the BB success rate increases

  (6,1) (4,1) (5,1)             (4,0) (5,0) (6,0)
  expressed most of the time    almost never expressed

Extended probabilistic expression EPE-q
o q is the number of copies of the unexpressed allele (q=2)

LLGA – introns

• Introns – non-coding genes (97% of DNA is non-coding)

o The number of introns required for proper functioning grows exponentially → compressed introns

Probabilistic Model-Building GAs

1. Initialize population at random

2. Select promising solutions

3. Build probabilistic model of selected solutions

4. Sample built model to generate new solutions

5. Incorporate new solutions into original population

6. Go to 2 (if not finished)
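The six steps can be illustrated with the simplest possible model, a vector of independent per-bit frequencies (the model used by UMDA). The Python sketch below makes several choices the slide does not prescribe: OneMax as the fitness function, truncation selection of the better half, and replacing the worse half with sampled solutions. The name `pmbga` and all parameter values are hypothetical.

```python
import random


def onemax(bits):
    """Toy fitness function: the number of ones."""
    return sum(bits)


def pmbga(n_bits=20, pop_size=40, generations=30, seed=0):
    rng = random.Random(seed)
    # 1. Initialize population at random
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):          # 6. Go to 2 (if not finished)
        # 2. Select promising solutions (truncation selection, assumption)
        population.sort(key=onemax, reverse=True)
        selected = population[: pop_size // 2]
        # 3. Build probabilistic model: frequency of ones per position
        model = [sum(col) / len(selected) for col in zip(*selected)]
        # 4. Sample the model to generate new solutions
        offspring = [[1 if rng.random() < p else 0 for p in model]
                     for _ in range(pop_size - len(selected))]
        # 5. Incorporate new solutions: they replace the worse half
        population = selected + offspring
        best_history.append(max(onemax(ind) for ind in population))
    return best_history


history = pmbga()
print(history[0], history[-1])
```

Because the selected half is always kept, the best fitness recorded in `history` never decreases from one generation to the next.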

Compact GA - cGA

5-bit trap problem

UMDA performance

UMDA with “good” statistics

Extended compact GA - ECGA

Marginal product model (MPM)

o Groups of bits (partitions) treated as chunks

o Partitions represent subproblems

o Onemax: [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]

o Traps: [1 2 3 4 5] [6 7 8 9 10]

Learning structure in ECGA

Two components
o Scoring metric: minimum description length (MDL)
  – Number of bits for storing probabilities: Cm = log2(N) Σ_i 2^(S_i)
  – Number of bits for storing the population using the model: Cp = N Σ_i E(M_i)
  – Minimize C = Cm + Cp
  (N is the population size, S_i the number of bits in group i, and E(M_i) the entropy of the marginal distribution of group i)

o Search procedure: a greedy algorithm
  – Start with one-bit groups
  – Merge the two groups that give the most improvement
  – No more improvement possible → finish.
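The MDL score and the greedy merge can be sketched in Python. The sketch follows the two cost terms in the form given here, with N read as the population size, S_i as the size of group i and E(M_i) as the entropy of its marginal distribution (assumptions where the slide does not define them). The names `mdl_score` and `greedy_mpm` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def mdl_score(population, partition):
    """C = Cm + Cp for a marginal product model given as a list of groups."""
    n = len(population)
    # Cm: bits needed to store the probabilities of every group
    model_bits = math.log2(n) * sum(2 ** len(group) for group in partition)
    # Cp: bits needed to store the population compressed with the model
    data_bits = 0.0
    for group in partition:
        counts = Counter(tuple(ind[i] for i in group) for ind in population)
        entropy = -sum((k / n) * math.log2(k / n) for k in counts.values())
        data_bits += n * entropy
    return model_bits + data_bits


def greedy_mpm(population):
    """Start with one-bit groups and keep merging the best pair of groups
    for as long as the merge lowers the MDL score."""
    partition = [[i] for i in range(len(population[0]))]
    best = mdl_score(population, partition)
    while len(partition) > 1:
        candidates = []
        for a, b in combinations(range(len(partition)), 2):
            merged = [g for k, g in enumerate(partition) if k not in (a, b)]
            merged.append(partition[a] + partition[b])
            candidates.append((mdl_score(population, merged), merged))
        score, merged = min(candidates, key=lambda cand: cand[0])
        if score >= best:
            break
        best, partition = score, merged
    return partition, best
```

For the population `[[0, 0], [1, 1], [0, 0], [1, 1]]` the two one-bit groups cost 16 bits in total, while the merged group `[0, 1]` costs 12, so the greedy search merges the two perfectly correlated bits.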

ECGA model

[0,2,5]                [1,4]       [3]
000            0.5     00  0.5     0  0.7
111            0.5     01  0.0     1  0.3
001, 010, 100  0.0     10  0.0
011, 101, 110  0.0     11  0.5
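Sampling a new solution from this model means drawing one configuration per group, independently of the other groups. A minimal Python sketch, using the probabilities from the table and omitting zero-probability entries; the names `MPM` and `sample` are hypothetical.

```python
import random

# Marginal product model: bit positions of each group and the
# probability of every configuration that actually occurs.
MPM = {
    (0, 2, 5): {(0, 0, 0): 0.5, (1, 1, 1): 0.5},
    (1, 4): {(0, 0): 0.5, (1, 1): 0.5},
    (3,): {(0,): 0.7, (1,): 0.3},
}


def sample(mpm, n_bits, rng):
    """Draw one individual by sampling every group independently."""
    individual = [0] * n_bits
    for positions, table in mpm.items():
        values = rng.choices(list(table), weights=list(table.values()))[0]
        for pos, bit in zip(positions, values):
            individual[pos] = bit
    return individual


rng = random.Random(1)
print([sample(MPM, 6, rng) for _ in range(3)])
```

Every sampled individual has equal bits at positions 0, 2 and 5 and equal bits at positions 1 and 4, because the model only ever generates configurations that were observed together.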

ECGA example
