genetic algorithms tanmay, abhijit, ameya, saurabh

GENETIC ALGORITHMS

Tanmay, Abhijit, Ameya, Saurabh

Inspiration - Evolution

• Natural Selection:
  – “Survival of the Fittest”
  – favourable traits become common and unfavourable traits become uncommon in successive generations
• Sexual Reproduction:
  – chromosomal crossover and genetic recombination
  – population is genetically variable
  – adaptive evolution is facilitated
  – unfavourable mutations are eliminated

Overview

• Inspiration
• The basic algorithm
  – Encoding
  – Selection
  – Crossover
  – Mutation
• Why Genetic Algorithms work?
  – Schemas
  – Hyper-planes
  – Schema Theorem
• Strengths and Weaknesses
• Applications
  – TSP
• Conclusion

THE BASIC ALGORITHM

Ameya Muley

Encoding of Solution Space

• Represent the solution space by strings of fixed length over some alphabet
• TSP: ordering of points
  – e.g. A D B E C and B E D A C
  – [Figure: five points labelled A to E]
• Knapsack: inclusion in knapsack
  – e.g. 0 0 1 0 1 1 0 1 1 0
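The two encodings above can be sketched in a few lines of Python. This is only an illustration: the helper names `is_valid_tour` and `selected_items` are hypothetical, and reading bit i as "item i is in the knapsack" is an assumption based on the slide's wording.

```python
def is_valid_tour(chromosome, points):
    """A TSP chromosome is an ordering that visits every point exactly once."""
    return sorted(chromosome) == sorted(points)


def selected_items(bits):
    """A knapsack chromosome marks item i as included when bits[i] == 1
    (0-based item indices, an assumption of this sketch)."""
    return [i for i, bit in enumerate(bits) if bit == 1]


tour = list("ADBEC")                      # ordering of the five points
knapsack = [0, 0, 1, 0, 1, 1, 0, 1, 1, 0]  # inclusion bits from the slide
items = selected_items(knapsack)
```

Both chromosomes have a fixed length, which is what lets the same crossover and mutation machinery operate on them.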

Selection

• Fitness function:
  – f(x), where x is a chromosome in the solution space
  – f(x) may be:
    • a well-defined objective function to be optimised, e.g. TSP and knapsack
    • a heuristic, e.g. N-Queens
• Probability distribution for selection:
  – Fitness proportional selection

$$P(X = x_i) = \frac{f(x_i)}{\sum_{j=1}^{M} f(x_j)}$$
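Fitness proportional selection gives each chromosome a share of the draw equal to its fitness divided by the total fitness of the population. A minimal sketch, assuming non-negative fitness values with a positive sum; the function names are illustrative and not from the slides.

```python
import random


def selection_probabilities(fitnesses):
    """Return f(x_i) / sum_j f(x_j) for every chromosome in the population."""
    total = sum(fitnesses)
    if total <= 0:
        raise ValueError("fitness values must have a positive sum")
    return [f / total for f in fitnesses]


def select(population, fitnesses, rng=random):
    """Draw one chromosome with probability proportional to its fitness."""
    return rng.choices(population, weights=fitnesses, k=1)[0]


population = ["1000", "0100", "0101", "0010"]
fitnesses = [1, 1, 2, 3]
probs = selection_probabilities(fitnesses)  # [1/7, 1/7, 2/7, 3/7]
parent = select(population, fitnesses, random.Random(0))
```

`random.choices` normalises the weights itself, so `select` does not need the explicit probabilities; they are computed separately only to mirror the formula.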

Operators - Crossover and Mutation

• Crossover:
  – Applied with high probability
  – Position for crossover on the two parent chromosomes randomly selected
  – Offspring share characteristics of well-performing parents
  – Combinations of well-performing characteristics generated
• Mutation:
  – Applied with low probability
  – Bit for mutation randomly selected
  – New characteristics introduced into the population
  – Prevents algorithm from getting trapped in a local optimum
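For bit-string chromosomes, the two operators can be sketched as one-point crossover and a single bit flip. The crossover point and the mutated position are passed in explicitly here so the result is reproducible; in a real run both would be drawn at random. The function names are illustrative.

```python
def one_point_crossover(parent_a, parent_b, point):
    """Swap the tails of two equal-length parents after position `point`."""
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


def flip_bit(chromosome, index):
    """Return a copy of the chromosome with the bit at `index` inverted."""
    mutated = list(chromosome)
    mutated[index] = 1 - mutated[index]
    return mutated


child_a, child_b = one_point_crossover([0, 1, 0, 1], [0, 0, 1, 0], point=3)
mutant = flip_bit(child_a, index=2)
```

Each child keeps a prefix from one parent and a suffix from the other, which is how characteristics of both parents are combined.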

The Basic Algorithm

1. Fix population size M
2. Randomly generate M strings in the solution space
3. Observe the fitness of each chromosome
4. Repeat:
   1. Select two fittest strings to reproduce
   2. Apply crossover with high probability to produce offspring
   3. Apply mutation to parent or offspring with low probability
   4. Observe the fitness of each new string
   5. Replace weakest strings of the population with the offspring
   until
   i. fixed number of iterations completed, OR
   ii. average/best fitness above a threshold, OR
   iii. average/best fitness value unchanged for a fixed number of consecutive iterations
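The loop above can be sketched for bit-string chromosomes as follows. This is one possible reading, not the presenters' implementation: parents are drawn by fitness proportional selection, mutation is applied to the offspring only, and the parameter names (`p_c`, `p_m`, `good_enough`, `seed`) and default values are assumptions made for illustration. It needs `length >= 2`, `pop_size >= 2` and a non-negative fitness function.

```python
import random


def run_ga(fitness, length, pop_size=4, p_c=0.9, p_m=0.1,
           max_iters=200, good_enough=None, seed=0):
    rng = random.Random(seed)
    # Steps 1-2: fix M and generate M random strings.
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(max_iters):                      # stopping rule (i)
        scores = [fitness(x) for x in population]   # step 3 / 4.4
        if good_enough is not None and max(scores) >= good_enough:
            break                                   # stopping rule (ii)
        # 4.1: pick two parents; the small offset keeps weights valid
        # when every score is zero.
        weights = [s + 1e-9 for s in scores]
        mother, father = rng.choices(population, weights=weights, k=2)
        child_a, child_b = mother[:], father[:]
        if rng.random() < p_c:                      # 4.2: crossover
            point = rng.randint(1, length - 1)
            child_a = mother[:point] + father[point:]
            child_b = father[:point] + mother[point:]
        for child in (child_a, child_b):            # 4.3: mutation
            if rng.random() < p_m:
                i = rng.randrange(length)
                child[i] = 1 - child[i]
        # 4.5: the two weakest strings are replaced by the offspring.
        weakest = sorted(range(pop_size), key=scores.__getitem__)[:2]
        population[weakest[0]] = child_a
        population[weakest[1]] = child_b
    return max(population, key=fitness)


def matches_target(x, target=(0, 0, 1, 1)):
    """Example fitness: number of bits equal to the target string."""
    return sum(1 for a, b in zip(x, target) if a == b)


best = run_ga(matches_target, length=4, good_enough=4)
```

Stopping rule (iii), no improvement for a fixed number of iterations, is left out to keep the sketch short.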

Example

• Problem specification:
  – string of length 4
  – two 0’s and two 1’s
  – 0’s to the left of the 1’s, i.e. the solution is 0 0 1 1
• Solution space: $\{0,1\}^4$
• Fitness function (heuristic):
  – f(x) = number of bits that match the ones in the solution
• Initialization (M = 4):

  A = 1 0 0 0    f(A) = 1
  B = 0 1 0 0    f(B) = 1
  C = 0 1 0 1    f(C) = 2
  D = 0 0 1 0    f(D) = 3
  f_av = 1.75

• Iteration 1:
  – Crossover: parents 0 1 0 1 and 0 0 1 0 (cut after bit 3) produce X = 0 1 0 0 with f(X) = 1 and Y = 0 0 1 1 with f(Y) = 4
  – Mutation: 0 1 0 0 becomes Z = 0 1 1 0 with f(Z) = 2

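Example (contd.)

• After iteration 1:

  A = 0 1 0 1    f(A) = 2
  B = 0 1 1 0    f(B) = 2
  C = 0 0 1 0    f(C) = 3
  D = 0 0 1 1    f(D) = 4
  f_av = 2.75

• Iteration 2:
  – Crossover: parents 0 1 0 1 and 0 0 1 0 (cut after bit 2) produce X = 0 1 1 0 with f(X) = 2 and Y = 0 0 0 1 with f(Y) = 3

• After iteration 2:

  A = 0 1 0 1    f(A) = 2
  B = 0 0 0 1    f(B) = 3
  C = 0 0 1 0    f(C) = 3
  D = 0 0 1 1    f(D) = 4
  f_av = 3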
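The fitness values in this example can be re-computed directly. The sketch below takes 0 0 1 1 as the solution, which is the string the listed fitness values are consistent with, and recomputes the average fitness of each generation.

```python
TARGET = [0, 0, 1, 1]


def fitness(x):
    """Number of bits that match the solution."""
    return sum(1 for a, b in zip(x, TARGET) if a == b)


def average_fitness(population):
    return sum(fitness(x) for x in population) / len(population)


initial = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
after_1 = [[0, 1, 0, 1], [0, 1, 1, 0], [0, 0, 1, 0], [0, 0, 1, 1]]
after_2 = [[0, 1, 0, 1], [0, 0, 0, 1], [0, 0, 1, 0], [0, 0, 1, 1]]

averages = [average_fitness(p) for p in (initial, after_1, after_2)]
```

The average rises from 1.75 to 2.75 to 3 as weaker strings are replaced by fitter offspring.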

WHY GENETIC ALGORITHMS WORK?

Tanmay Khirwadkar

Schemas

• Population
  – Strings over alphabet {0,1} of length L
  – E.g. s = 10010
• Schema
  – A schema H is a template: a string over alphabet {0,1,*} of length L
  – It stands for the subset of all possible individuals whose genes match the template at every non-‘*’ position
  – E.g. H = [1**10] = {10010, 10110, 11010, 11110}
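Treating a schema as a template with `*` as a wildcard, membership and enumeration can be sketched as below. Strings are used for both schemas and individuals; the function names are illustrative.

```python
from itertools import product


def matches(schema, individual):
    """True when every non-'*' gene of the schema equals the individual's gene."""
    return len(schema) == len(individual) and all(
        s in ("*", g) for s, g in zip(schema, individual)
    )


def members(schema):
    """List every individual covered by the schema."""
    options = [("0", "1") if s == "*" else (s,) for s in schema]
    return ["".join(genes) for genes in product(*options)]


covered = members("1**10")
```

A schema with k wildcards covers 2^k individuals, here 2^2 = 4.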

Hyper-plane model

• Search space
  – A hyper-cube in L-dimensional space
• Individuals
  – Vertices of the hyper-cube
• Schemas
  – Hyper-planes formed by vertices
  – [Figure: cube for L = 3 with the hyper-plane 0** marked]

Sampling Hyper-planes

• Look for hyper-planes (schemas) with good fitness value instead of vertices (individuals) to reduce the search space
• Each vertex
  – Is a member of 2^L of the 3^L possible hyper-planes
  – Samples all of those hyper-planes at once
• Average fitness of a hyper-plane can be estimated by sampling the fitness of its members in the population
• Selection retains hyper-planes with good estimated fitness values and discards others
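A small sketch of hyper-plane sampling: it lists the schemas a single individual belongs to (each gene is either kept or replaced by `*`, giving 2^L schemas for length L) and estimates a schema's fitness from the population members that fall in it. The population and the fitness function (bits matching 0011) are illustrative assumptions.

```python
from itertools import product


def matches(schema, individual):
    return all(s in ("*", g) for s, g in zip(schema, individual))


def schemas_containing(individual):
    """Every schema the individual is a member of: keep each gene or wildcard it."""
    options = [(g, "*") for g in individual]
    return ["".join(choice) for choice in product(*options)]


def estimated_fitness(schema, population, fitness):
    """Average fitness of the population members that lie in the hyper-plane."""
    sample = [fitness(x) for x in population if matches(schema, x)]
    return sum(sample) / len(sample) if sample else None


def fitness(x):
    return sum(1 for a, b in zip(x, "0011") if a == b)


population = ["1000", "0100", "0101", "0010"]
good = estimated_fitness("0***", population, fitness)
poor = estimated_fitness("1***", population, fitness)
```

Here the hyper-plane `0***` looks better than `1***`, so selection would tend to keep strings that start with 0.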

Schema Theorem

• Schema Order O(H)
  – Schema order, O(H), is the number of non-‘*’ genes in schema H
  – E.g. O(1**1*) = 2
• Schema Defining Length δ(H)
  – Schema defining length, δ(H), is the distance between the first and last non-‘*’ genes in schema H
  – E.g. δ(1**1*) = 4 – 1 = 3
• Schemas with short defining length, low order and above-average fitness are favored by GAs
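Both quantities are easy to compute from the schema string. A minimal sketch with illustrative function names; a schema with no fixed genes is given defining length 0 here, which is a convention chosen for this sketch.

```python
def schema_order(schema):
    """O(H): number of fixed (non-'*') genes."""
    return sum(1 for gene in schema if gene != "*")


def defining_length(schema):
    """delta(H): distance between the first and last fixed genes."""
    fixed = [i for i, gene in enumerate(schema) if gene != "*"]
    return fixed[-1] - fixed[0] if fixed else 0


order = schema_order("1**1*")
delta = defining_length("1**1*")
```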

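Formal Statement

Here m(H,t) is the number of members of schema H in generation t, f(H,t) is their average fitness, $\bar{f}(t)$ is the average fitness of the population, and $p_c$, $p_m$ are the crossover and mutation probabilities.

• Selection probability

$$E\big(m(H,t+1)\big) = m(H,t)\,\frac{f(H,t)}{\bar{f}(t)}$$

• Crossover probability (of disrupting H)

$$P_{\text{crossover}}(H) = p_c\,\frac{\delta(H)}{L-1}$$

• Mutation probability (of disrupting H)

$$P_{\text{mutation}}(H) = O(H)\,p_m$$

• Expected number of members of a schema

$$E\big(m(H,t+1)\big) \ge m(H,t)\,\frac{f(H,t)}{\bar{f}(t)}\left(1 - p_c\,\frac{\delta(H)}{L-1} - O(H)\,p_m\right)$$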
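A worked instance of the bound may help. The sketch below uses the standard form of the schema theorem, in which the expected count is at least m(H,t) · f(H,t)/f̄(t) · (1 − p_c·δ(H)/(L−1) − O(H)·p_m); all numeric values are made up for illustration.

```python
def schema_bound(m, f_schema, f_avg, order, delta, length, p_c, p_m):
    """Lower bound on the expected number of members of H in the next generation."""
    survival = 1 - p_c * delta / (length - 1) - order * p_m
    return m * (f_schema / f_avg) * survival


# Schema 1**1* (L = 5, O(H) = 2, delta(H) = 3) with 2 members whose
# average fitness is 3 in a population of average fitness 2.
bound = schema_bound(m=2, f_schema=3, f_avg=2, order=2, delta=3,
                     length=5, p_c=0.8, p_m=0.01)

# A shorter schema 11*** (delta(H) = 1) with the same fitness survives better.
short_bound = schema_bound(m=2, f_schema=3, f_avg=2, order=2, delta=1,
                           length=5, p_c=0.8, p_m=0.01)
```

With these numbers the long schema is only guaranteed about 1.14 expected members while the short one is guaranteed about 2.34, which illustrates why short, low-order, above-average schemas are favoured.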

Why crossover and mutation?

• Crossover
  – Produces new solutions while ‘remembering’ the characteristics of old solutions
  – Partially preserves distribution of strings across schemas
• Mutation
  – Randomly generates new solutions which cannot be produced from the existing population
  – Avoids local optimum

STRENGTHS AND WEAKNESSES

Abhijit Bhole

Area of application

GAs can be used for:
• Non-analytical problems.
• Non-linear models.
• Uncertainty.
• Large state spaces.

Non-analytical problems

• Fitness functions may not always be expressible analytically.
• Domain-specific knowledge may not be computable from the fitness function.
• Domain knowledge to guide the search may be scarce.

Non-linear models

• Solutions depend on starting values.
• Non-linear models may converge to a local optimum.
• They impose conditions on fitness functions, such as convexity.
• They may require the problem to be approximated to fit the non-linear model.

Uncertainty

• Noisy / approximated fitness functions.
• Changing parameters.
• Changing fitness functions.
• Why do GAs work? Because uncertainty is common in nature.

Large state spaces

• Heuristics focus only on the immediate area of initial solutions.
• State-explosion problem: number of states huge or even infinite! Too large to be handled.
• State space may not be completely understood.

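Characteristics of GAs

• Simple, Powerful, Adaptive, Parallel
• Search for global optimum solutions rather than settling in the first local optimum, although reaching the global optimum is not guaranteed.
• Give solutions of the un-approximated form of the problem.
• Finer granularity of search spaces.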

When not to use GA!

• Constrained mathematical optimization problems, especially when there are few solutions.
• Constraints are difficult to incorporate into a GA.
• Guided domain search is possible and efficient.

PRACTICAL EXAMPLE - TSP

Saurabh Chakradeo

TSP Description

• Problem Statement: Given a complete weighted undirected graph on n nodes, find the shortest Hamiltonian cycle.
• The size of the solution space is (n-1)!/2
• Dynamic Programming gives us a solution in time $O(n^2 2^n)$
• TSP is NP-hard (its decision version is NP-complete)
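The dynamic programming bound refers to the Held-Karp algorithm. A compact sketch follows, assuming a symmetric distance matrix with at least two cities and city 0 as the fixed start; the 4-city matrix is made-up illustration data.

```python
from itertools import combinations
from math import factorial


def tour_count(n):
    """Number of distinct tours of n cities on an undirected graph: (n-1)!/2."""
    return factorial(n - 1) // 2


def held_karp(dist):
    """Length of the shortest Hamiltonian cycle, in O(n^2 * 2^n) time."""
    n = len(dist)
    # best[(mask, j)]: cheapest path that starts at city 0, visits exactly
    # the cities in bitmask `mask` (cities 1..n-1 only) and ends at city j.
    best = {(1 << j, j): dist[0][j] for j in range(1, n)}
    for size in range(2, n):
        for subset in combinations(range(1, n), size):
            mask = sum(1 << j for j in subset)
            for j in subset:
                prev = mask & ~(1 << j)
                best[(mask, j)] = min(
                    best[(prev, k)] + dist[k][j] for k in subset if k != j
                )
    full = (1 << n) - 2  # every city except city 0
    return min(best[(full, j)] + dist[j][0] for j in range(1, n))


dist = [
    [0, 10, 15, 20],
    [10, 0, 35, 25],
    [15, 35, 0, 30],
    [20, 25, 30, 0],
]
shortest = held_karp(dist)
```

For 4 cities there are only 3 distinct tours, but the count grows factorially, which is why exact methods give way to heuristics such as GAs for larger n.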

TSP Encoding

• Binary representation
  – Tour 1-3-2 is represented as ( 00 10 01 )
• Path representation
  – Natural – ( 1 3 2 )
• Adjacency representation
  – Tour 1-3-2 is represented as ( 3 1 2 )
• Ordinal representation
  – A reference list is used. Let that be ( 1 2 3 ).
  – Tour 1-3-2 is represented as ( 1 2 1 )
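The three non-trivial encodings can all be derived from the path representation. The sketch below is one reading of them: in the adjacency form, position i holds the city visited after city i; in the ordinal form, each entry is the 1-based position of the next city in the shrinking reference list; in the binary form, city c is written as c − 1 in fixed-width binary. Function names are illustrative.

```python
def path_to_adjacency(path):
    """Entry i-1 holds the city that follows city i in the tour."""
    n = len(path)
    adjacency = [0] * n
    for i, city in enumerate(path):
        adjacency[city - 1] = path[(i + 1) % n]
    return adjacency


def path_to_ordinal(path):
    """Encode each city by its position in a reference list, then remove it."""
    reference = sorted(path)
    ordinal = []
    for city in path:
        index = reference.index(city)
        ordinal.append(index + 1)
        reference.pop(index)
    return ordinal


def path_to_binary(path):
    """Write city c as (c - 1) in fixed-width binary."""
    width = max(1, (len(path) - 1).bit_length())
    return [format(city - 1, f"0{width}b") for city in path]


tour = [1, 3, 2]
encodings = (path_to_binary(tour), path_to_adjacency(tour), path_to_ordinal(tour))
```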

TSP – Crossover operator

• Order Based crossover (OX2)
  – Selects at random several positions in the parent tour
  – Imposes the order of nodes in selected positions of one parent on the other parent
• Example
  – Parents: (1 2 3 4 5 6 7 8) and (2 4 6 8 7 5 3 1)
  – Selected positions: 2nd, 3rd and 6th
  – Impose order on (2 4 6 8 7 5 3 1) and (1 2 3 4 5 6 7 8)
  – Children are (2 4 3 8 7 5 6 1) and (1 2 3 4 6 5 7 8)
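OX2 can be sketched as follows: take the cities found at the selected positions of one parent, then rewrite the other parent so that those cities appear in that same relative order while every other city stays where it was. Positions are 0-based in the code, so the 2nd, 3rd and 6th positions become 1, 2 and 5; the function and parameter names are illustrative.

```python
def ox2(keep, order_from, positions):
    """Return `keep` with the cities at `positions` of `order_from`
    re-ordered to follow their order in `order_from`."""
    selected = [order_from[p] for p in sorted(positions)]
    chosen = set(selected)
    replacement = iter(selected)
    return [next(replacement) if city in chosen else city for city in keep]


parent_1 = [1, 2, 3, 4, 5, 6, 7, 8]
parent_2 = [2, 4, 6, 8, 7, 5, 3, 1]
positions = [1, 2, 5]  # 2nd, 3rd and 6th positions

child_1 = ox2(parent_2, parent_1, positions)
child_2 = ox2(parent_1, parent_2, positions)
```

Because cities are only permuted among the slots they already occupy, each child is guaranteed to remain a valid tour.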

TSP – Mutation Operators

• Exchange Mutation Operator (EM)
  – Randomly select two nodes and interchange their positions.
  – ( 1 2 3 4 5 6 ) can become ( 1 2 6 4 5 3 )
• Displacement Mutation Operator (DM)
  – Select a random sub-tour, remove it and insert it in a different location.
  – ( 1 2 [3 4 5] 6 ) becomes ( 1 2 6 3 4 5 )
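Both operators can be sketched on a path-encoded tour. The positions are passed in explicitly (0-based) so the results are reproducible; in practice they would be chosen at random. The function and parameter names are illustrative.

```python
def exchange_mutation(tour, i, j):
    """EM: swap the cities at positions i and j."""
    mutated = list(tour)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated


def displacement_mutation(tour, start, end, insert_at):
    """DM: cut out tour[start:end] and re-insert it at index `insert_at`
    of the remaining cities."""
    subtour = tour[start:end]
    rest = tour[:start] + tour[end:]
    return rest[:insert_at] + subtour + rest[insert_at:]


tour = [1, 2, 3, 4, 5, 6]
swapped = exchange_mutation(tour, 2, 5)
displaced = displacement_mutation(tour, start=2, end=5, insert_at=3)
```

Both operators only rearrange cities, so the mutated chromosome is always a valid tour.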

Conclusions

• Plethora of applications
  – Molecular biology, scheduling, cryptography, parameter optimization
• General algorithmic model applicable to a large variety of classes of problems
• Another in the list of algorithms inspired by biological processes – scope for more parallels?
• Philosophical Implication:
  – Are humans actually moving towards their global optimum?

