GAs and Premature Convergence


Upload: brinda

Post on 18-Mar-2016


Page 1: GAs and Premature Convergence

GAs and Premature Convergence

Premature convergence: the GA converges too early to a suboptimal solution
o as the population evolves, only a little that is new can be produced

Reasons for premature convergence:
o improper selection pressure
o insufficient population size
o deception
o improper representation and genetic operators

Page 2: GAs and Premature Convergence

Motivation and Realization

Motivation – to maintain the diversity of the evolved population and extend the explorative power of the algorithm

Realization
o Convergence of the population is allowed only up to a specified extent
o Convergence at individual positions of the representation is controlled
o Convergence rate – specifies the maximal difference in the frequency of ones and zeroes in every column of the population
– ranges from 0 to PopSize/2
o Principal condition – at any position of the representation, neither ones nor zeroes may exceed the frequency constraint
o Specific way of modifying the population genotype
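
The principal condition can be checked column by column. The sketch below is only an illustration: it assumes the constraint means that the number of ones at each position must stay within PopSize/2 ± C. The function name and that reading of C are assumptions, not taken from the slides.

```python
def within_convergence_limit(population, c):
    """Return True if no column has converged further than allowed.

    Assumption: the number of ones in each column must lie in
    [PopSize/2 - c, PopSize/2 + c], with 0 <= c <= PopSize/2.
    """
    half = len(population) / 2
    for column in zip(*population):      # iterate over bit positions
        ones = sum(column)
        if abs(ones - half) > c:
            return False
    return True


population = [
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 1],
]
print(within_convergence_limit(population, c=1))  # True: column 0 has 3 ones
print(within_convergence_limit(population, c=0))  # False: column 0 is not split evenly
```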

Page 3: GAs and Premature Convergence

Algorithm of GALCO

1. Generate initial population
2. Choose parents
3. Create offspring
4. if (offspring > parents)
      then replace parents with offspring
   else {
      find(replacement)
      replace_with_mask(child1, replacement)
      find(replacement)
      replace_with_mask(child2, replacement)
   }
5. if (not finished) then go to step 2

Page 4: GAs and Premature Convergence

Operator replace_with_mask

Mask – a vector of integer counters; stores the number of 1s for each bit position of the representation

Page 5: GAs and Premature Convergence

Test problems - static

F101(x, y)
Deceptive function

Hierarchical function
Royal Road Problem

Page 6: GAs and Premature Convergence

GALCO – effect of parameter C

Page 7: GAs and Premature Convergence

GALCO vs. SGA

Page 8: GAs and Premature Convergence

GALCO – effect of parameter C

Page 9: GAs and Premature Convergence

GALCO vs. SGA

Page 10: GAs and Premature Convergence

Multimodal Optimization

[Figure panels: Initial population; SIGA with / without]

Page 11: GAs and Premature Convergence

Multimodal Optimization (cont.)

[Figure panels: Initial population; GALCO; SIGA]

Page 12: GAs and Premature Convergence

GA with Real-coded Binary Representation (GARB)

Pseudo-binary representation: bits are encoded by a real number r ∈ ⟨0.0, 1.0⟩
o interpretation(r) = 1, for r > 0.5
                    = 0, for r < 0.5
  → redundancy of the code
o Example: ch1 = [0.92 0.07 0.23 0.62]
           ch2 = [0.65 0.19 0.41 0.86]
  interpretation(ch1) = interpretation(ch2) = [1 0 0 1]

Gene strength – expresses the degree of stability of a gene

o The closer a gene is to 0.5, the weaker (less stable) it is

o "One" genes, from strongest to weakest: 0.92 > 0.86 > 0.65 > 0.62

o "Zero" genes, from strongest to weakest: 0.07 > 0.19 > 0.23 > 0.41
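
The interpretation rule and the notion of gene strength can be sketched in a few lines. The `strength` measure (distance from 0.5 scaled to [0, 1]) and the treatment of r = 0.5 as zero are assumptions made for illustration; the slides only say that genes closer to 0.5 are weaker.

```python
def interpret(chromosome):
    """Map each real-valued gene to a bit: 1 if r > 0.5, else 0.

    Assumption: r == 0.5 is read as 0; the slides leave this case open.
    """
    return [1 if r > 0.5 else 0 for r in chromosome]


def strength(r):
    """Distance from 0.5 scaled to [0, 1]; larger means more stable."""
    return abs(r - 0.5) * 2


ch1 = [0.92, 0.07, 0.23, 0.62]
ch2 = [0.65, 0.19, 0.41, 0.86]
print(interpret(ch1))  # [1, 0, 0, 1]
print(interpret(ch2))  # [1, 0, 0, 1]
```

Both chromosomes decode to the same bit string even though their genes differ in strength, which is the redundancy the slide refers to.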

Page 13: GAs and Premature Convergence

Gene-strength adjustment mechanism

Genes of chromosomes created by crossover are adjusted
o depending on their interpretation
o and on the relative frequency of ones (zeroes) at the given position in the population, P[]

Example: P = [0.82 0.17 0.35 0.68] means that in the population there are
o 82% ones at position 1,
o 17% ones at position 2,
o 35% ones at position 3,
o 68% ones at position 4.

Genes that prevail in the population are weakened; the others are strengthened.

Page 14: GAs and Premature Convergence

Strengthening and weakening of genes

Weakening
gene' = gene + c*(1.0-P[i]), if (gene<0.5) and (P[i]<0.5)
(the gene has the value zero and zeroes prevail at the i-th position in the population)
and
gene' = gene – c*P[i], if (gene>0.5) and (P[i]>0.5)

Strengthening
gene' = gene – c*P[i], if (gene<0.5) and (P[i]>0.5)
(the gene has the value zero and ones prevail at the i-th position in the population)
and
gene' = gene + c*(1.0-P[i]), if (gene>0.5) and (P[i]<0.5)

The constant c determines the speed of gene adaptation: c ∈ (0.0, 0.2]
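
The four update rules collapse into two cases that depend only on P[i], which the following sketch makes explicit. Clamping the result to [0, 1] is an assumption; the slides do not say how out-of-range values are handled.

```python
def adjust_gene(gene, p_i, c):
    """Apply the gene-strength adjustment at one position.

    p_i is the relative frequency of ones at this position in the population.
    Clamping to [0, 1] is an assumption, not part of the slides.
    """
    if p_i < 0.5:
        gene = gene + c * (1.0 - p_i)   # zeroes prevail: push the gene toward 1
    elif p_i > 0.5:
        gene = gene - c * p_i           # ones prevail: push the gene toward 0
    return min(1.0, max(0.0, gene))


P = [0.82, 0.17, 0.35, 0.68]
child = [0.92, 0.07, 0.23, 0.62]
adjusted = [adjust_gene(g, p, c=0.1) for g, p in zip(child, P)]
print([round(g, 3) for g in adjusted])  # [0.838, 0.153, 0.295, 0.552]
```

In this example every gene agrees with the majority at its position, so all four are weakened (moved toward 0.5).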

Page 15: GAs and Premature Convergence

Stabilization of promising individuals

Offspring that are better than their parents should be more stable than the other, low-quality generated solutions

o Chromosomes of promising individuals are generated with strong genes
  ch = (0.71, 0.45, 0.18, 0.57)
  ch' = (0.97, 0.03, 0.02, 0.99)

o Genes of promising individuals survive more generations without being changed by weakening

Page 16: GAs and Premature Convergence

Pseudocode for GARB

begin
  initialize(OldPop)
  repeat
    calculate P[] from OldPop
    repeat
      select Parents from OldPop
      generate Children
      adjust Children genes
      evaluate Children
      if Child is better than Parents
        then rescale Child
      insert Children to NewPop
    until NewPop is completed
    switch OldPop and NewPop
  until termination condition
end

Page 17: GAs and Premature Convergence

Test problems - dynamic

Ošmera's dynamic problem
$g(x,t) = 1 - \exp\left(-200\,(x - c(t))^2\right)$
$c(t) = 0.04 \lfloor t/20 \rfloor$

The minimum g(x,t) = 0.0 changes every 20 generations

Oscillating Knapsack Problem
14 objects, $w_i = 2^i$, $i = 0, \dots, 13$

$f(x) = \dfrac{1}{1 + \left| \mathrm{target} - \sum_i w_i x_i \right|}$

The target oscillates between the values 12643 and 2837, whose binary representations differ in 9 bits
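
Both dynamic test functions are short enough to write out. The sketch below assumes the floor in c(t) and the absolute value in the knapsack fitness, which is how the garbled slide formulas are read here; the helper names are illustrative only.

```python
import math

WEIGHTS = [2 ** i for i in range(14)]   # w_i = 2^i, i = 0..13
TARGETS = (12643, 2837)


def osmera(x, t):
    """g(x, t) = 1 - exp(-200 (x - c(t))^2) with c(t) = 0.04 * floor(t / 20)."""
    c = 0.04 * (t // 20)
    return 1.0 - math.exp(-200.0 * (x - c) ** 2)


def knapsack_fitness(bits, target):
    """f(x) = 1 / (1 + |target - sum(w_i * x_i)|)."""
    total = sum(w * b for w, b in zip(WEIGHTS, bits))
    return 1.0 / (1.0 + abs(target - total))


def to_bits(value):
    """Least-significant bit first, matching the order of WEIGHTS."""
    return [(value >> i) & 1 for i in range(14)]


hamming = sum(a != b for a, b in zip(to_bits(TARGETS[0]), to_bits(TARGETS[1])))
print(hamming)  # 9
```

The Hamming distance confirms the statement that the two targets differ in 9 bits, so every oscillation forces the population to change most of the chromosome.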

Page 18: GAs and Premature Convergence

Results on static problems

[Figure: fitness vs. fitness evaluations (x1000), GARB compared with SGA; panels: DF3, H-IFF, F101]

Page 19: GAs and Premature Convergence

Results on static problems

[Figure: frequency of ones at given positions (gene60 and gene62; gene51 and gene59; gene80 and gene200) and best and average fitness, each plotted against fitness evaluations (x1000)]

Page 20: GAs and Premature Convergence

Results on dynamic problems

Oscillating knapsack problem

Page 21: GAs and Premature Convergence

Results on dynamic problems
• Ošmera's dynamic problem

                    Immediately after optimum change    Overall
Algorithm           MTE       StDev                     MTE       StDev
GARB c = 0.025      83.3      30.6                      50.4      25.2
GARB c = 0.075      25.6      34.6                      2.4       7.4
GARB c = 0.125      12.8      22.4                      1.0       3.9
GARB c = 0.175      10.2      19.7                      0.7       3.0
GARB c = 0.225      9.2       19.3                      0.6       2.7
SGA binary          N/A       N/A                       57.343.61
SGA Gray            N/A       N/A                       47.66     42.94
CBM-B               N/A       N/A                       19.39     33.13

MTE – Mean Tracking Error [%]: the mean deviation of the best individual in the population from the optimal solution, computed over all generations.

Page 22: GAs and Premature Convergence

Recovery from a homogeneous population

[Figure: diversity measure, average fitness and best fitness plotted over generations (0 to 100); panels: DF3 problem, Knapsack problem]

Page 23: GAs and Premature Convergence

Weakness of Simple Selectorecombinative GAs

Scale poorly on hard problems, largely as a result of their mixing behaviour
o Inability of SGA to correctly identify and adequately mix the appropriate BBs in subsequent generations
o Exponential computational complexity of SGA

Crossover operators or other exchange mechanisms are needed that adapt to the problem at hand
o Linkage adaptation

Page 24: GAs and Premature Convergence

Naive approaches – inversion operator

Reverses the order of the genes in a randomly chosen substring of the chromosome

10011 – (1,1) | (2,0)(3,0)(4,1) | (5,1)
after inversion
(1,1) (4,1)(3,0)(2,0) (5,1)

Unusable because the signal for improving linkage is out of balance with the signal for learning alleles.
o tα < tλ
- alleles undergo more direct selection than linkage, so the GA decides on the optimal allele settings before it finds out which gene combinations to form together and mix with each other.

o Solution: reverse the inequality to tα > tλ (BUT HOW?)
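
The inversion example above can be reproduced with tagged genes stored as (position, allele) pairs. This is a minimal sketch; the function names are illustrative. Because each gene carries its position tag, inversion changes the linkage (gene order) but not the decoded solution.

```python
def invert(chromosome, start, end):
    """Reverse the order of the tagged genes in chromosome[start:end]."""
    return chromosome[:start] + chromosome[start:end][::-1] + chromosome[end:]


def decode(chromosome):
    """Place every allele at the position given by its tag (tags start at 1)."""
    bits = [None] * len(chromosome)
    for position, allele in chromosome:
        bits[position - 1] = allele
    return bits


chromosome = [(1, 1), (2, 0), (3, 0), (4, 1), (5, 1)]
inverted = invert(chromosome, 1, 4)
print(inverted)          # [(1, 1), (4, 1), (3, 0), (2, 0), (5, 1)]
print(decode(inverted))  # [1, 0, 0, 1, 1]
```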

Page 25: GAs and Premature Convergence

Competent GAs

Can solve
o hard problems (multimodal, deceptive, high degree of subsolution interaction, noise, ...),
o quickly,
o accurately,
o reliably.

Messy GAs – mGA, fmGA, gemGA
Learning linkage GAs – LLGA
Compact GAs – cGA, ECGA
Bayesian optimization algorithm – BOA

Page 26: GAs and Premature Convergence

Messy Genetic Algorithms - mGAs

Inspiration from nature – evolution starts from the simplest forms of life

mGA departed from SGA in four ways:
o messy codings
o messy operators
o separation of processing into three heterogeneous phases
o epoch-wise iteration to improve the complexity of solution

Page 27: GAs and Premature Convergence

mGA’s codings

Tagged alleles:
o Variable-length strings: (name1, allele1) … (nameN, alleleN)
  ((4,0) (1,1) (2,0) (4,1) (4,1) (5,1))

Over-specification – multiple gene instances (gene 4)
o Majority voting – would express deceptive genes too readily
o First-come first-served (left-to-right expression) – positional priority

Underspecification – missing gene instances (gene 3)
o Average schema value – variance is too high
o Competitive template – a solution locally optimal with respect to k-bit perturbations
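
Left-to-right expression with a competitive template can be sketched as follows, using the messy string from this slide. The template values are hypothetical and chosen only to make the example concrete.

```python
def express(messy, template):
    """Decode a messy chromosome of (gene, allele) pairs.

    Over-specification: the first (leftmost) occurrence of a gene wins.
    Underspecification: missing genes are copied from the competitive template.
    Genes are numbered from 1.
    """
    solution = list(template)
    seen = set()
    for gene, allele in messy:
        if gene not in seen:
            solution[gene - 1] = allele
            seen.add(gene)
    return solution


messy = [(4, 0), (1, 1), (2, 0), (4, 1), (4, 1), (5, 1)]
template = [0, 0, 1, 1, 0]   # hypothetical competitive template
print(express(messy, template))  # [1, 0, 1, 0, 1]
```

Gene 4 takes the value 0 from its first occurrence, and gene 3, which is missing from the string, is filled in from the template.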

Page 28: GAs and Premature Convergence

Messy operators: cut & splice

Cut – divides a single string into two parts
Splice – joins the head of one string with the tail of the other one

o When short strings are mated, the probability of a cut is small, so mostly the strings are just spliced
– the string length is doubled

o When long strings are mated, the probability of a cut is large, so the operators act like one-point crossover
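
The interplay of the two operators can be sketched as below. The length-proportional cut probability and the handling of the case where only one string is cut are assumptions made for this illustration; the slides state only that short strings are rarely cut and long strings usually are.

```python
import random


def cut(string, rng):
    """Split a string at a random interior point into [head, tail]."""
    point = rng.randrange(1, len(string))
    return [string[:point], string[point:]]


def splice(head, tail):
    """Join the head of one string with the tail of another."""
    return head + tail


def cut_and_splice(a, b, p_cut_per_gene, rng):
    """Mate two messy strings and return the list of offspring.

    Assumption: a string of length n is cut with probability
    min(1, p_cut_per_gene * (n - 1)), so longer strings are cut more often.
    """
    pieces = []
    for s in (a, b):
        p = min(1.0, p_cut_per_gene * (len(s) - 1))
        pieces.extend(cut(s, rng) if rng.random() < p else [s])
    if len(pieces) == 2:        # neither string was cut: the length doubles
        return [splice(a, b)]
    if len(pieces) == 4:        # both were cut: acts like one-point crossover
        head_a, tail_a, head_b, tail_b = pieces
        return [splice(head_a, tail_b), splice(head_b, tail_a)]
    # Exactly one string was cut (assumed handling): splice the first piece
    # with the last one and keep the middle piece as a separate offspring.
    return [splice(pieces[0], pieces[2]), pieces[1]]


rng = random.Random(0)
short_a, short_b = [(1, 1), (2, 0)], [(3, 1), (4, 1)]
print(cut_and_splice(short_a, short_b, 0.0, rng))  # one string of length 4
```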

Page 29: GAs and Premature Convergence

mGAs: three heterogeneous phases

Initialization
o Enumerative initialization of the population with all sub-strings of a certain length k << l: $\binom{l}{k} 2^k$, i.e. $O(l^k)$ computations
o Guarantees that all BBs of a certain size are present in the population

Primordial phase
o Only selection is used, to dope the population with good BBs
o Good linkage groups are selected before their alleles are allowed to be mixed

Juxtapositional phase
o Selection + cut & splice
o Mixing of the BBs
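
The size of the enumerative initialization is easy to verify for small l and k. The generator below is a sketch with illustrative names: it lists every choice of k gene positions combined with every assignment of alleles to them.

```python
from itertools import combinations, product
from math import comb


def enumerate_substrings(l, k):
    """Yield every messy string that specifies exactly k of the l genes."""
    for genes in combinations(range(1, l + 1), k):
        for alleles in product((0, 1), repeat=k):
            yield list(zip(genes, alleles))


population = list(enumerate_substrings(l=4, k=2))
print(len(population), comb(4, 2) * 2 ** 2)  # 24 24
```

The count grows as O(l^k), which is why the enumerative phase becomes the bottleneck for larger building blocks.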

Page 30: GAs and Premature Convergence

Fast messy genetic algorithms - fmGAs

Probabilistically complete enumeration
o A population of strings of length l' close to l is generated
o Assumption: each string contains many different BBs of length k << l

Building block filtering – extracts highly fit and effectively linked BBs
o Repeated (1) selection and (2) gene deletion
o Only O(l) computations to converge

Extended thresholding – tournaments are held only between strings that have a threshold number of genes in common

fmGA vs. mGA: 150-bit problem composed of 30 5-bit deceptive functions
o $1.9 \times 10^5$ vs. $5.9 \times 10^8$ evaluations

Page 31: GAs and Premature Convergence

Gene expression messy GA - gemGA

Messy ???
o No variable-length strings
o No under- or over-specification
o No left-to-right expression

Messy use of heterogeneous phases of processing in gemGA
o Linkage learning phase – first identifies linkage groups
o Mixing phase – selection + recombination
– exchanges good allele combinations within those groups to find the optimal solution

Page 32: GAs and Premature Convergence

gemGA: The idea

Linkage learning phase

o Transcription I (antimutation)
– Each string undergoes l one-bit perturbations
– Improvements are ignored ?!? (the bit does not belong to an optimal BB)
– Changes that degrade the structure are marked as possible linkage group candidates
Ex.: two 3-bit deceptive BBs
   111           101
   marked        not marked
   (degrades)    (improves)

o Transcription II
– Identifies the exact relations among the genes by checking nonlinearities
IF f(X’i) + f(X’j) != f(X’ij) THEN link(i,j)
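
The nonlinearity check of Transcription II can be sketched as a pairwise perturbation test. One assumption is made loudly here: f(X’i) is read as the change in fitness caused by flipping bit i, so the test compares the sum of the two single-bit changes with the change from flipping both bits. The test function is hypothetical.

```python
def flip(x, *positions):
    """Return a copy of x with the bits at the given positions inverted."""
    y = list(x)
    for p in positions:
        y[p] = 1 - y[p]
    return y


def linked(f, x, i, j, tol=1e-9):
    """Report whether genes i and j interact nonlinearly around string x.

    Assumption: the slide's f(X'_i) denotes the fitness change caused by
    perturbing bit i, not the raw fitness of the perturbed string.
    """
    base = f(x)
    d_i = f(flip(x, i)) - base
    d_j = f(flip(x, j)) - base
    d_ij = f(flip(x, i, j)) - base
    return abs(d_i + d_j - d_ij) > tol


def trap3(bits):
    """3-bit deceptive trap: 3 for 111, otherwise 2 minus the number of ones."""
    u = sum(bits)
    return 3 if u == 3 else 2 - u


def fitness(x):
    """Hypothetical test function: a trap on bits 0..2 plus a linear bit 3."""
    return trap3(x[:3]) + x[3]


x = [1, 1, 1, 0]
print(linked(fitness, x, 0, 1))  # True: both bits belong to the trap
print(linked(fitness, x, 0, 3))  # False: bit 3 contributes independently
```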

Page 33: GAs and Premature Convergence

Linkage Learning GA - LLGA

More “messy” than gemGA
o Variable-length strings
o Left-to-right expression
o Always over-specification

NO primordial or juxtapositional phase – more SGA-like

Idea:
o Probabilistic expression that slows down the convergence of alleles
o Crossover that adapts linkage at the same time that alleles are exchanged

Page 34: GAs and Premature Convergence

LLGA – Probabilistic expression

Clockwise interpretation

(3,1)(2,0)(5,1)(1,1)(4,0)

1 0 1 0 1

Page 35: GAs and Premature Convergence

LLGA – probabilistic expression cont.

The allele 1 is expressed with the probability δ/l and 1/l respectively

The allele 0 is expressed with the probability (l-δ)/l and (l-1)/l respectively
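
Clockwise interpretation of a circular chromosome can be sketched as follows. The sketch assumes that the point of interpretation is chosen uniformly at random, so the expression probability of an allele is the fraction of starting points from which it is reached first; function names are illustrative.

```python
def express_clockwise(chromosome, start, n_genes):
    """Read a circular chromosome clockwise from index `start`.

    For each gene the first allele encountered is expressed; later copies
    of the same gene are ignored. Genes are numbered from 1.
    """
    expressed = {}
    length = len(chromosome)
    for offset in range(length):
        gene, allele = chromosome[(start + offset) % length]
        expressed.setdefault(gene, allele)
    return [expressed.get(g) for g in range(1, n_genes + 1)]


def expression_rate(chromosome, gene, allele, n_genes):
    """Fraction of starting points at which `allele` is expressed for `gene`."""
    hits = sum(
        express_clockwise(chromosome, s, n_genes)[gene - 1] == allele
        for s in range(len(chromosome))
    )
    return hits / len(chromosome)


chromosome = [(3, 1), (2, 0), (5, 1), (1, 1), (4, 0)]
print(express_clockwise(chromosome, 0, 5))  # [1, 0, 1, 0, 1]
```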

Page 36: GAs and Premature Convergence

LLGA: Effect of PE on BBs

Assume a 6-bit problem in which a BB requires genes 4, 5, and 6 to take on the value 1 in a trap function.

o Initially the block 111 will be expressed roughly 1/8th of the time

o After the linkage has evolved properly, the BB success rate increases

(6,1) (4,1) (5,1): expressed most of the time
(4,0) (5,0) (6,0): almost never expressed

Extended probabilistic expression EPE-q
o q is the number of copies of the unexpressed allele (q=2)

Page 37: GAs and Premature Convergence

LLGA – introns

• Introns – non-coding genes (97% of DNA is non-coding)

o The number of introns required for proper functioning grows exponentially → compressed introns

Page 38: GAs and Premature Convergence

Probabilistic Model-Building GAs

1. Initialize population at random

2. Select promising solutions

3. Build probabilistic model of selected solutions

4. Sample built model to generate new solutions

5. Incorporate new solutions into original population

6. Go to 2 (if not finished)
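
The six steps above can be sketched with the simplest possible model, a vector of independent per-position frequencies (the univariate model used by UMDA). The OneMax fitness, truncation selection and all parameter values are assumptions chosen to keep the example short.

```python
import random


def umda_onemax(n_bits=20, pop_size=100, n_select=50, generations=30, seed=0):
    """Probabilistic model-building GA with a univariate model on OneMax."""
    rng = random.Random(seed)
    fitness = sum                                   # OneMax: count the ones
    # 1. Initialize population at random
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(generations):                    # 6. Repeat until finished
        # 2. Select promising solutions (truncation selection)
        population.sort(key=fitness, reverse=True)
        selected = population[:n_select]
        # 3. Build the model: one frequency of ones per position
        model = [sum(ind[i] for ind in selected) / n_select
                 for i in range(n_bits)]
        # 4. Sample the model to generate new solutions
        offspring = [[1 if rng.random() < p else 0 for p in model]
                     for _ in range(pop_size - n_select)]
        # 5. Incorporate the new solutions into the population
        population = selected + offspring
    return max(population, key=fitness)


best = umda_onemax()
print(sum(best), "ones out of", len(best))
```

A univariate model cannot represent dependencies between positions, which is the limitation the multivariate models on the following slides address.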

Page 39: GAs and Premature Convergence

Compact GA - cGA

Page 40: GAs and Premature Convergence

5-bit trap problem

Page 41: GAs and Premature Convergence

UMDA performance

Page 42: GAs and Premature Convergence

UMDA with “good” statistics

Page 43: GAs and Premature Convergence

Extended compact GA - ECGA

Marginal product model (MPM)

o Groups of bits (partitions) treated as chunks

o Partitions represent subproblems

o Onemax: [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]

o Traps: [1 2 3 4 5] [6 7 8 9 10]

Page 44: GAs and Premature Convergence

Learning structure in ECGA

Two components
o Scoring metric: minimum description length (MDL)
– Number of bits for storing probabilities: $C_m = \log_2 N \sum_i 2^{S_i}$
– Number of bits for storing the population using the model: $C_p = N \sum_i E(M_i)$
– Minimize $C = C_m + C_p$

o Search procedure: a greedy algorithm
– Start with one-bit groups
– Merge the two groups that give the most improvement
– No more improvement possible → finish.
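
The score and the greedy search can be sketched directly from the formulas above, reading N as the population size, S_i as the size of group i, and E(M_i) as the entropy of the marginal distribution of group i. That reading and the function names are assumptions of this sketch.

```python
from collections import Counter
from math import log2


def mdl_score(population, partition):
    """C = C_m + C_p for a marginal product model over `partition`."""
    n = len(population)
    model_bits = log2(n) * sum(2 ** len(group) for group in partition)
    data_bits = 0.0
    for group in partition:
        counts = Counter(tuple(ind[i] for i in group) for ind in population)
        entropy = -sum((c / n) * log2(c / n) for c in counts.values())
        data_bits += n * entropy
    return model_bits + data_bits


def greedy_partition(population):
    """Start with one-bit groups and repeatedly apply the best merge."""
    partition = [[i] for i in range(len(population[0]))]
    best = mdl_score(population, partition)
    while True:
        candidates = []
        for a in range(len(partition)):
            for b in range(a + 1, len(partition)):
                merged = [g for k, g in enumerate(partition) if k not in (a, b)]
                merged.append(sorted(partition[a] + partition[b]))
                candidates.append((mdl_score(population, merged), merged))
        if not candidates:
            return partition
        score, merged = min(candidates, key=lambda item: item[0])
        if score >= best:           # no merge improves the score: finish
            return partition
        best, partition = score, merged


# Bits 0 and 1 always agree; bit 2 is independent of them.
population = [[0, 0, 0], [0, 0, 1], [1, 1, 0], [1, 1, 1]] * 4
print(sorted(greedy_partition(population)))  # [[0, 1], [2]]
```

Merging the two correlated bits halves their contribution to C_p without increasing C_m, while merging in the independent bit would only enlarge the model.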

Page 45: GAs and Premature Convergence

ECGA model

Partitions: [0,2,5] [1,4] [3]

[0,2,5]                 [1,4]        [3]
000             0.5     00   0.5     0   0.7
111             0.5     01   0.0     1   0.3
001, 010, 100   0.0     10   0.0
011, 101, 110   0.0     11   0.5
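
New solutions are generated from such a model by sampling each partition independently. The sketch below encodes the table above, omitting entries with probability 0.0; the dictionary layout and function name are illustrative choices.

```python
import random

# Marginal product model from the table: one distribution per partition.
# Entries with probability 0.0 are omitted.
MODEL = {
    (0, 2, 5): {"000": 0.5, "111": 0.5},
    (1, 4): {"00": 0.5, "11": 0.5},
    (3,): {"0": 0.7, "1": 0.3},
}


def sample(model, n_bits, rng):
    """Draw one solution by sampling each partition independently."""
    solution = [None] * n_bits
    for positions, table in model.items():
        values = list(table)
        weights = [table[v] for v in values]
        chunk = rng.choices(values, weights=weights, k=1)[0]
        for position, bit in zip(positions, chunk):
            solution[position] = int(bit)
    return solution


rng = random.Random(1)
for _ in range(3):
    print(sample(MODEL, 6, rng))
```

Every sampled string has equal bits at positions 0, 2 and 5 and equal bits at positions 1 and 4, so the building blocks captured by the partitions are never disrupted.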

Page 46: GAs and Premature Convergence

ECGA example