csnb234 artificial intelligence
UNIVERSITI TENAGA NASIONAL 1
CSNB234 ARTIFICIAL INTELLIGENCE
Chapter 9: Genetic Algorithms
Instructor: Alicia Tang Y. C.
(Chapter 12, pp. 509-519, Textbook) (pp. 116-119, Ref. #3)
Genetic Algorithms - I
Genetic Algorithms (GAs):
are an efficient and robust search technique for complex search spaces
normally find near-optimal solutions to problems
base their solutions on natural selection and genetics, i.e. survival of the fittest
evaluate a number of solutions simultaneously
Genetic Algorithms - II
developed by John Holland in 1975
inspired by biological mutation & evolution
stochastic (i.e. non-deterministic) search techniques based on the mechanisms of natural selection and genetics
implicit parallel search of solution space
Genetic Algorithms - III
Each solution is evaluated for its fitness.
The better the solution, the greater its chance of survival
GAs use an iterative process, with better solutions normally evolving over time
Applications
GAs are used for optimization problems such as maximising profits or minimising costs/time
Some problem areas are:
scheduling
design
financial management
The Algorithm … Evolutionary Algorithm
[Flowchart: start -> Generate initial population -> Evaluate function -> Criteria met? If yes: Best result -> stop. If no: selection -> crossover -> mutation -> back to Evaluate function.]
Genetic Representation
Chromosomes (Strings)
a chromosome encodes an instance of the problem to be solved, and is also called a genetic structure
a chromosome consists of one or more bits, e.g. 01001001
a chromosome with n bits can represent 2^n solutions (of which some may be invalid)
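The 2^n count can be checked by enumerating every bit string of a given length. This is only a small Python sketch; the function name all_chromosomes is made up for illustration and does not come from the slides.

```python
from itertools import product

def all_chromosomes(n_bits):
    """Return every bit string of length n_bits (there are 2**n_bits of them)."""
    return ["".join(bits) for bits in product("01", repeat=n_bits)]

chromosomes = all_chromosomes(3)
print(len(chromosomes))   # 8, i.e. 2**3
print(chromosomes[:3])    # ['000', '001', '010']
```

Enumeration like this is only practical for tiny n; the point of a GA is to avoid it for long chromosomes.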
Chromosome Representation
bit strings (0011 .. 1101) - as seen in earlier slide
permutation of elements (e3 e4 e1 e7 e9 .. e15)
list of rules (R1 R2 R3 .. R10)
tree-structured expression (+ a b)
etc.
GAs have been successful with chromosome sizes of 1000s of bits, 100s of rules and 1000s of permutation elements
Genes
A chromosome may be divided into parts (positions) called genes
Each gene may represent a particular aspect or parameter of a problem
Its value is called an allele
As a gene is binary, a 1-bit gene can hold 1 or 2 values, a 2-bit gene can hold 3 or 4 values, a 3-bit gene 5 to 8 values, etc.
In general, an n-bit gene can hold from 2^(n-1) + 1 to 2^n values
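Read the other way round, the rule gives the gene width needed for a parameter with a known number of values. The sketch below assumes that reading; gene_bits is an illustrative name, not something defined in the slides.

```python
def gene_bits(n_values):
    """Smallest number of bits whose 2**bits patterns cover n_values values."""
    if n_values < 1:
        raise ValueError("a gene needs at least one value")
    # (n_values - 1).bit_length() is the ceiling of log2(n_values) for n_values >= 2;
    # a single value still occupies one bit, matching the 1-bit case on the slide.
    return max(1, (n_values - 1).bit_length())

for n_values in (1, 2, 3, 4, 5, 8, 9):
    print(n_values, "values ->", gene_bits(n_values), "bit(s)")
```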
Population
A GA uses a number of chromosomes at a time (e.g. 50), called the population, each representing a solution to the problem
The chromosomes in the population compete to survive based on their fitness and are manipulated by genetic operators
The population evolves over a number of generations towards a better solution
Genetic Operators
Main genetic operators are:
reproduction
crossover
mutation
inversion
Reproduction
“Reproduce” by “selection”: individual chromosomes are selected to be parents
chromosomes with a higher fitness value will have a higher probability of contributing one or more offspring in the next generation
Roulette wheel selection technique
It is one of the chromosome selection techniques. Each chromosome is given a slice of the circular roulette wheel. The area of the slice within the wheel is equal to the chromosome fitness ratio.
To select a chromosome for mating, a random number is generated in the interval [0, 100], and the chromosome whose segment spans the random number is selected.
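Roulette wheel selection as described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the function name roulette_select and the optional spin argument (which lets the random number be fixed for a walk-through) are made up here, and fitness values are assumed non-negative with a positive total.

```python
import random

def roulette_select(fitnesses, spin=None):
    """Return the index of the chromosome whose slice contains the spin."""
    total = sum(fitnesses)
    if spin is None:
        spin = random.uniform(0, 100)   # random number in the interval [0, 100]
    cumulative = 0.0
    for index, value in enumerate(fitnesses):
        cumulative += 100.0 * value / total   # slice size = fitness ratio in percent
        if spin <= cumulative:
            return index
    return len(fitnesses) - 1   # guard against floating-point rounding

# Three chromosomes with fitness ratios of 10%, 30% and 60%.
print(roulette_select([10, 30, 60], spin=5))    # 0
print(roulette_select([10, 30, 60], spin=25))   # 1
print(roulette_select([10, 30, 60], spin=95))   # 2
```

Because the slices are proportional to fitness, this form suits a maximisation problem; a minimisation problem would need the fitness values transformed first.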
Crossover
Mixing of genetic material (mating)
Two structures in the current generation are allowed to mate randomly, with each pair producing two chromosomes (offspring)
A crossover point is selected at random and parts of the two parent chromosomes are swapped to create two child chromosomes.
Crossover in action
Before crossing over:
0 1 0 0 1 0 0 1
0 0 1 1 0 0 1 0
After it is done:
0 1 0 0 0 0 0 1
0 0 1 1 1 0 1 0
Select a position at random and swap the two bits (here, the fifth bit of each parent)
Crossover can lead to effective combination of partial solutions on different chromosomes
By doing this, it helps accelerate the search at an early stage of evolution
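The two forms of crossover mentioned so far can be sketched in Python as follows. The function names are illustrative, and chromosomes are assumed to be equal-length strings of '0' and '1'. single_point_crossover swaps everything after a crossover point, while swap_bit exchanges one position only, which is what the bit strings in the picture show.

```python
import random

def single_point_crossover(parent_a, parent_b, point=None):
    """Swap the tails of two equal-length bit strings after `point`."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    if point is None:
        point = random.randint(1, len(parent_a) - 1)   # random crossover point
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b

def swap_bit(parent_a, parent_b, pos):
    """Exchange only the bit at 0-based index `pos` between the two parents."""
    child_a = parent_a[:pos] + parent_b[pos] + parent_a[pos + 1:]
    child_b = parent_b[:pos] + parent_a[pos] + parent_b[pos + 1:]
    return child_a, child_b

print(swap_bit("01001001", "00110010", 4))        # ('01000001', '00111010')
print(single_point_crossover("1011", "0101", 2))  # ('1001', '0111')
```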
Types of crossover
single-point
multi-point
arithmetic
reduced surrogate
uniform
shuffle
tree, etc.
Mutation
Mutation is a small copy error from one generation to the next
the mutation rate is the probability that a bit changes from 0 to 1 or from 1 to 0
the mutation rate must be very small (e.g. 0.001), or it may result in a random search rather than a guided search
Mutation in action
Mutation may be random or heuristic
Movement can be made global or local to some sub-units
0 1 0 0 1 0 0 1
becomes
0 1 0 1 1 0 0 1
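Random bit-flip mutation can be sketched in Python as below. The names mutate and flip_bit are illustrative only; mutate applies the mutation rate independently to every bit, and flip_bit reproduces the single flipped bit in the picture.

```python
import random

def flip_bit(chromosome, pos):
    """Return the chromosome with the bit at 0-based index `pos` inverted."""
    flipped = "1" if chromosome[pos] == "0" else "0"
    return chromosome[:pos] + flipped + chromosome[pos + 1:]

def mutate(chromosome, rate=0.001):
    """Flip each bit independently with probability `rate`."""
    for pos in range(len(chromosome)):
        if random.random() < rate:
            chromosome = flip_bit(chromosome, pos)
    return chromosome

print(flip_bit("01001001", 3))      # 01011001
print(mutate("01001001", rate=0.0)) # 01001001 (a rate of 0 never changes a bit)
```

With a rate as small as 0.001 most chromosomes pass through unchanged, which keeps the search guided rather than random.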
An Example
Data population: RGB colours
Aim: to obtain the darkest colour, represented by (0, 0, 0)
This is a minimisation problem, i.e. a good colour is one whose fitness (colour) --> 0.
We now tabulate our data as shown (see next slide):
GA: Fitness (I)
Start at a random pattern like this:
Colour  Red  Green  Blue
C1      80   170    689
C2      130  690    15
C3      24   8      317
where
Fitness for (C1) = 80 + 170 + 689 = 939
Fitness for (C2) = 130 + 690 + 15 = 835
Fitness for (C3) = 24 + 8 + 317 = 349
GA: Fitness (II)
With the same starting pattern and fitness values, the fittest is placed top:
C3
C2
C1
GA: Selection
After a Selection is done on the sample:
Colour  Fitness
C3      349
C2      835
C1      939
** Remember, this is a minimisation problem.
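The fitness sums and the ranking can be reproduced with a short Python sketch. The dictionary layout and the names population, fitness and ranking are choices made for this illustration.

```python
# Starting pattern from the slides: colour name -> (red, green, blue).
population = {
    "C1": (80, 170, 689),
    "C2": (130, 690, 15),
    "C3": (24, 8, 317),
}

def fitness(colour):
    """Sum of the three components; lower is better (closer to black)."""
    return sum(colour)

# Minimisation problem, so sort in ascending order of fitness.
ranking = sorted(population, key=lambda name: fitness(population[name]))
for name in ranking:
    print(name, fitness(population[name]))
# C3 349
# C2 835
# C1 939
```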
GA: Reproduction & Crossover
So far, we have this:
Colour  Red  Green  Blue
C1      80   170    689
C2      130  690    15
C3      24   8      317
Colour  Fitness
C3      349
C2      835
C1      939
Next step is to reproduce the pattern, like this, by crossing over:
Colour  Red  Green  Blue
C4      24   8      15
C5      24   8      689
C6      130  690    689
C4 is crossover(C3, C2) = (24, 8, 15)
C5 is crossover(C3, C1) = (24, 8, 689)
C6 is crossover(C2, C1) = (130, 690, 689)
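In each of the three offspring the red and green values come from the first parent and the blue value from the second. A Python sketch of that pattern follows; the default crossover point of 2 is inferred from the numbers in the slide rather than stated in it.

```python
def crossover(first, second, point=2):
    """Take the genes before `point` from `first` and the rest from `second`."""
    return first[:point] + second[point:]

c1 = (80, 170, 689)
c2 = (130, 690, 15)
c3 = (24, 8, 317)

c4 = crossover(c3, c2)
c5 = crossover(c3, c1)
c6 = crossover(c2, c1)
print(c4, c5, c6)   # (24, 8, 15) (24, 8, 689) (130, 690, 689)
```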
Mutation in GA
Perform mutation, and we have a new population of 3 chromosomes:
Colour  Red  Green  Blue
C7      24   8      13
C8      25   9      689
C9      128  688    689
C7 is obtained by mutating C4 = (24, 8, 13)
C8 is obtained by mutating C5 = (25, 9, 689)
C9 is obtained by mutating C6 = (128, 688, 689)
Conclusion (up to “mutation” to get a new data set)
Some solutions have improved (after the first iteration):
Fitness for C7 = (24 + 8 + 13) = 45 (getting better rather fast)
Fitness for C8 = (25 + 9 + 689) = 723 (slightly improved)
Fitness for C9 = (128 + 688 + 689) = 1505 (worse here)
If the process is iterated, the population will converge to a fitness near zero, i.e. (colour) --> 0.
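The sums for the new population can be recomputed directly and compared with the best value of the starting population. This is a small Python check; the variable names are illustrative.

```python
old_population = {"C1": (80, 170, 689), "C2": (130, 690, 15), "C3": (24, 8, 317)}
new_population = {"C7": (24, 8, 13), "C8": (25, 9, 689), "C9": (128, 688, 689)}

new_fitness = {name: sum(colour) for name, colour in new_population.items()}
print(new_fitness)   # {'C7': 45, 'C8': 723, 'C9': 1505}

best_old = min(sum(colour) for colour in old_population.values())
best_new = min(new_fitness.values())
print(best_old, "->", best_new)   # 349 -> 45
```

The best fitness drops from 349 to 45 in one iteration even though one of the new chromosomes is worse than every chromosome in the starting population.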
We shall look at the so-called ‘fitness function’ with an example
Fitness Function
The GA performs a search amongst possible solutions
The GA search is guided by a fitness function, which returns a single numeric value indicating the fitness of a chromosome
The fitness is maximised or minimised depending on the problem
Problem Description
To make the best use of disk space and avoid fragmentation, files should be allocated so as to minimise the number of locations used, which frees storage for other files.
Assumption: each file must be placed in a single location
Problem Description
Disk space
Location  Size (KB)
0         1.0
1         1.5
2         4.0
3         0.3
Total     6.8
Files to be stored are:
Identifier Size (KB)
A 0.2
B 0.1
C 1.2
D 3.0
E 0.9
SOLUTION
Step 1: design the structure of the chromosome
use one gene per file: 5 genes (for this example)
the first gene represents the location for file A, etc.
as there are 4 locations, 2-bit genes can be used to indicate a file's location (00 = 0, 01 = 1, 10 = 2, 11 = 3) - base two
chromosome size = 5 * 2 = 10 bits
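Decoding such a 10-bit chromosome back into a file-to-location mapping can be sketched in Python as follows. The function name decode and the example chromosome are made up for illustration.

```python
FILES = ["A", "B", "C", "D", "E"]   # one 2-bit gene per file, in this order
GENE_BITS = 2

def decode(chromosome):
    """Map each file to the location number stored in its 2-bit gene."""
    if len(chromosome) != len(FILES) * GENE_BITS:
        raise ValueError("expected a 10-bit chromosome")
    return {
        name: int(chromosome[i * GENE_BITS:(i + 1) * GENE_BITS], 2)
        for i, name in enumerate(FILES)
    }

print(decode("0001101110"))
# {'A': 0, 'B': 1, 'C': 2, 'D': 3, 'E': 2}
```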
Step 2: determine the fitness function
The fitness is to be maximised
Factors affecting the fitness function:
free memory locations (min = 0, max = 4), i.e. locations not being used
memory overflow (disk space in KB): a solution is not ‘fit’ if there is a lot of overflow
valid/invalid solutions: a solution is valid when all files are allocated
Fitness calculation:
Valid solution: fitness = bonus for getting a valid solution + number of free memory locations
The bonus in this case is the maximum that can be obtained for an invalid solution
Invalid solution: fitness = total disk space - memory overflow
With a bonus of 6.8, the fitness of a valid solution ranges from 6.8 + 0 to 6.8 + 4
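One possible reading of this fitness calculation, written as a Python sketch. Several details are assumptions made here rather than taken from the slides: overflow is taken as the sum, over all locations, of the amount by which the stored files exceed the location size; a solution is treated as valid when that overflow is zero; and sizes are held as integer tenths of a KB to avoid floating-point rounding.

```python
LOCATION_SIZES = [10, 15, 40, 3]    # tenths of a KB: 1.0, 1.5, 4.0, 0.3
FILE_SIZES = {"A": 2, "B": 1, "C": 12, "D": 30, "E": 9}   # 0.2, 0.1, 1.2, 3.0, 0.9
TOTAL_DISK = sum(LOCATION_SIZES)    # 68 tenths = 6.8 KB, also used as the bonus

def decode(chromosome):
    """Map each file to the location stored in its 2-bit gene."""
    return {name: int(chromosome[2 * i:2 * i + 2], 2)
            for i, name in enumerate(FILE_SIZES)}

def fitness(chromosome):
    """Fitness to be maximised, in KB units plus a count of free locations."""
    used = [0] * len(LOCATION_SIZES)
    for name, location in decode(chromosome).items():
        used[location] += FILE_SIZES[name]
    overflow = sum(max(0, amount - size)
                   for amount, size in zip(used, LOCATION_SIZES))
    if overflow == 0:
        free_locations = used.count(0)
        return TOTAL_DISK / 10 + free_locations   # valid: bonus + free locations
    return (TOTAL_DISK - overflow) / 10           # invalid: total - overflow

# A, B, C in location 1 and D, E in location 2: valid, two locations free.
print(round(fitness("0101011010"), 1))   # 8.8
# Everything in location 2: 5.4 KB into 4.0 KB overflows by 1.4 KB.
print(round(fitness("1010101010"), 1))   # 5.4
```

Under this reading every valid solution scores at least 6.8, and every invalid one scores below it, so a valid allocation always beats an invalid one.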
Genetic Algorithm - revisited
Algorithm overview
Initialise the population
Evaluate the fitness of the population
WHILE the termination condition has not been satisfied
    select chromosomes for reproduction to form the new population
    perform crossover on the new population
    perform mutation on the new population
    evaluate the fitness of the new population
    make the new population the old population
ENDWHILE
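The overview above can be turned into a compact Python sketch. It is an illustration under stated assumptions, not the course's reference implementation: the function name run_ga, the fixed generation count used as the termination condition, fitness-proportionate selection and single-point crossover are choices made here, and the fitness function is assumed to return non-negative values to be maximised.

```python
import random

def run_ga(fitness, n_bits, pop_size=50, generations=20,
           crossover_rate=0.6, mutation_rate=0.001, seed=0):
    rng = random.Random(seed)
    # Initialise the population with random bit lists and evaluate it.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    scores = [fitness(c) for c in population]
    best_score, best = max(zip(scores, population), key=lambda pair: pair[0])
    for _ in range(generations):   # termination: fixed number of generations
        # Selection: fitter chromosomes are more likely to become parents.
        weights = [score + 1e-9 for score in scores]   # avoids an all-zero wheel
        parents = rng.choices(population, weights=weights, k=pop_size)
        # Crossover: pair the parents; a share of the pairs swap tails.
        new_population = []
        for i in range(0, pop_size, 2):
            a = parents[i][:]
            b = parents[(i + 1) % pop_size][:]
            if rng.random() < crossover_rate:
                point = rng.randint(1, n_bits - 1)
                a, b = a[:point] + b[point:], b[:point] + a[point:]
            new_population.extend([a, b])
        new_population = new_population[:pop_size]
        # Mutation: flip each bit with a small probability.
        for chromosome in new_population:
            for j in range(n_bits):
                if rng.random() < mutation_rate:
                    chromosome[j] ^= 1
        # The new population becomes the old population.
        population = new_population
        scores = [fitness(c) for c in population]
        gen_score, gen_best = max(zip(scores, population), key=lambda pair: pair[0])
        if gen_score > best_score:
            best_score, best = gen_score, gen_best[:]
    return best, best_score

# Toy problem (hypothetical): maximise the number of 1-bits in a 16-bit string.
best, score = run_ga(sum, n_bits=16)
print(best, score)
```

The sketch keeps track of the best chromosome seen so far, so the returned result is never worse than the best member of the initial population.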
Procedures
Step 1: Pick a population of random codes (population size = 4)
Step 2: Evaluate the fitness of each chromosome
Step 3: Select for reproduction with a probability based on the fitness value
Step 4: Crossover chromosomes to form the new generation
randomly select pairs
crossover
Step 5: Mutation
Step 6: Loop back to step 2
Control Parameters
The main GA control parameters are:
• Number of generations and trials
• Population size
• Crossover rate
• Mutation rate
Fitness value calculation
Example:
Find the fitness value of the 2 chromosomes 1011 and 0101 for the boolean expression:
b(X1,X2,X3,X4) = (X1 ∨ X2) ∧ (X3 ∨ ¬X4) ∧ ¬X1
Given the formulas,
Fitness(X) = 1 if X is a 1-bit, and 0 if X is a 0-bit
Fitness(¬X) = 1 – fitness(X)
Fitness(X1 ∨ X2 ∨ … ∨ Xn) = max(fitness(X1), fitness(X2), …, fitness(Xn))
Fitness(X1 ∧ X2 ∧ … ∧ Xn) = (1/n) * Σ(i=1..n) fitness(Xi)
Exercise
Evaluate the fitness of each binary string for the expression
b(X1,X2,X3,X4) = (X1 ∨ X2) ∧ (X3 ∨ ¬X4) ∧ ¬X1
Fitness(1011)
= (1/3) * (sum of the fitness values of the 3 sub-expressions)
= (1/3) * (max(fitness(X1), fitness(X2)) + max(fitness(X3), fitness(¬X4)) + (1 - fitness(X1)))
= (1/3) * (max(1, 0) + max(1, (1 - 1)) + (1 - 1))
= (1/3) * (1 + 1 + 0) = 2/3
Similarly, fitness(0101) = 2/3
Apply crossover at bit position 2 and compute new fitness values:
1011 becomes 1001 & 0101 becomes 0111
Working out the fitness for these strings, you will get a fitness value of 1 for 0111; hence it provides the (best) solution
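The worked values can be checked with a short Python sketch that applies the formulas to the expression (X1 or X2) and (X3 or not X4) and (not X1). The function name fitness and the use of exact fractions are choices made for this illustration.

```python
from fractions import Fraction

def fitness(chromosome):
    """Average, over the three and-ed sub-expressions, of their fitness."""
    x1, x2, x3, x4 = (int(bit) for bit in chromosome)
    clauses = [
        max(x1, x2),        # X1 or X2
        max(x3, 1 - x4),    # X3 or not X4
        1 - x1,             # not X1
    ]
    return Fraction(sum(clauses), len(clauses))

for chromosome in ("1011", "0101", "1001", "0111"):
    print(chromosome, fitness(chromosome))
# 1011 2/3
# 0101 2/3
# 1001 1/3
# 0111 1
```

The two parents both score 2/3; after crossover one child drops to 1/3 and the other reaches the maximum of 1.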
Supplementary slides
Number of generations
The number of cycles (generations) of processing
Not all chromosomes in a population need to be evaluated in each generation, as some of them do not change
Number of trials
the fitness evaluation of a chromosome takes 1 trial
the number of trials is approximately equal to population size * number of generations
Genesis default: 1000
Population size
The number of chromosomes in the population
Constant throughout a GA run
Genesis default: 50
Crossover rate
The proportion of chromosomes that are crossed over; the remainder are copied to the new population unchanged
For example, a crossover rate of 0.6 results in 60% of the chromosomes selected for reproduction being crossed over, and the other 40% being carried through unchanged to the new population
Genesis default: 0.6
Mutation rate
The probability of a bit being mutated (changed)
Usually in the range 0.01 to 0.001
For example a mutation rate of 0.001 means there is a 1 in 1000 chance of a bit being changed
Genesis default: 0.001
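The four control parameters and what their default values imply can be gathered in a small Python sketch. The class GAParameters and its method names are hypothetical and not part of Genesis; the default of 1000 is read here as the number of trials, which is an assumption based on where it appears in these slides.

```python
from dataclasses import dataclass

@dataclass
class GAParameters:
    trials: int = 1000              # assumed to be the trial budget
    population_size: int = 50
    crossover_rate: float = 0.6
    mutation_rate: float = 0.001

    def generations(self):
        """Approximate generations, since trials ~ population size * generations."""
        return self.trials // self.population_size

    def expected_crossed(self):
        """Expected number of chromosomes crossed over per generation."""
        return self.crossover_rate * self.population_size

    def expected_mutations(self, chromosome_bits):
        """Expected number of flipped bits per generation."""
        return self.mutation_rate * chromosome_bits * self.population_size

params = GAParameters()
print(params.generations())                      # 20
print(round(params.expected_crossed(), 3))       # 30.0
print(round(params.expected_mutations(10), 3))   # 0.5
```

With 10-bit chromosomes, the defaults give on average only one mutated bit every two generations across the whole population.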
Genetic Algorithms Software Packages
ANT: PC implementation of 'John Muir Trail' experiment
CFS-C: Domain Independent Subroutines for Implementing Classifier Systems in Arbitrary, User-Defined Environments
DGENESIS: Distributed GA
EM: Evolution Machine
GAucsd: Genetic Algorithm Software Package
GAC: Simple GA in C
GACC: Genetic Aided Cascade-Correlation
GAGA: A Genetic Algorithm for General Application
GAGS: Genetic algorithm application generator and C++ class library
GAL: Simple GA in Lisp
GAME: Genetic Algorithms Manipulation Environment
GAMusic: Genetic Algorithm to Evolve Musical Melodies
GANNET: Genetic Algorithm / Neural NETwork
GAW: Genetic Algorithm Workbench
GECO: Genetic Evolution through Combination of Objects
GENALG: Genetic Algorithm package written in Pascal
GENESIS: GENEtic Search Implementation System
GENEsYs: Experimental GA based on GENESIS
GenET: Domain-independent generic GA software package
Genie: GA-based modeling/forecasting system
GENITOR: Modular GA package with floating-point support.
GENlib: Genetic Algorithms and Neural Networks
mGA: C and Common Lisp implementations of a messy GA