UNIVERSITI TENAGA NASIONAL 1 CSNB234 ARTIFICIAL INTELLIGENCE Chapter 9 Genetic Algorithms Instructor: Alicia Tang Y. C. (Chapter 12, pp. 509-519, Textbook) (pp. 116-119, Ref. #3)

Upload: talib

Post on 04-Jan-2016

Page 1: CSNB234 ARTIFICIAL INTELLIGENCE

UNIVERSITI TENAGA NASIONAL 1

CSNB234 ARTIFICIAL INTELLIGENCE

Chapter 9: Genetic Algorithms

Instructor: Alicia Tang Y. C.

(Chapter 12, pp. 509-519, Textbook) (pp. 116-119, Ref. #3)


Genetic Algorithms - I

Genetic Algorithms (GAs):

• are an efficient and robust search technique for complex search spaces
• normally find near-optimal solutions to problems
• base their solutions on natural selection and genetics, i.e. survival of the fittest
• evaluate a number of solutions simultaneously


Genetic Algorithms - II

• developed by John Holland in 1975
• inspired by biological mutation and evolution
• stochastic (i.e. non-deterministic) search techniques based on the mechanisms of natural selection and genetics
• implicitly parallel search of the solution space


Genetic Algorithms - III

Each solution is evaluated for its fitness.

The better the solution, the greater its chance of survival.

GAs use an iterative process, with better solutions normally evolving over time.


Applications

GAs are used for optimization problems, such as maximising profits or minimising costs/time.

Some problem areas are:

• scheduling
• design
• financial management


The Algorithm … Evolutionary Algorithm

[Flowchart: start → generate initial population → evaluate function → criteria met? If yes: best result → stop. If no: selection → crossover → mutation, then evaluate again.]


Genetic Representation

Chromosomes (Strings)

• a chromosome represents one candidate solution to the problem to be solved; it is also called a genetic structure
• a chromosome consists of one or more bits, e.g. 01001001
• a chromosome with n bits can represent 2^n solutions (of which some may be invalid)


Chromosome Representation

• bit strings (0011 .. 1101), as seen in the earlier slide
• permutation of elements (e3 e4 e1 e7 e9 .. e15)
• list of rules (R1 R2 R3 .. R10)
• tree-structured expression (+ a b)
• etc.

GAs have been successful with chromosome sizes of thousands of bits, hundreds of rules and thousands of permutation elements.


Genes

A chromosome may be divided into parts (positions) called genes.

Each gene may represent a particular aspect or parameter of a problem, and its value is called an allele.

As a gene is in binary, a 1-bit gene can hold 1 or 2 values, a 2-bit gene can hold 3 or 4 values, a 3-bit gene 5 to 8 values, etc.

In general, an n-bit gene can hold from 2^(n-1) + 1 to 2^n values.
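The rule above can be turned around to ask how many bits a gene needs for a given number of allele values. The following is a minimal sketch; the function name `bits_needed` is hypothetical and not part of the slides.

```python
def bits_needed(num_values: int) -> int:
    """Smallest gene width n (in bits) such that 2**n >= num_values."""
    if num_values < 1:
        raise ValueError("a gene needs at least one possible value")
    # (k - 1).bit_length() is the smallest n with 2**n >= k; a gene has at least 1 bit.
    return max(1, (num_values - 1).bit_length())


if __name__ == "__main__":
    for k in (1, 2, 3, 4, 5, 8, 9):
        print(k, "values ->", bits_needed(k), "bit(s)")
```

For instance, a parameter with 4 possible values fits in a 2-bit gene, while 5 values already require 3 bits.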


Population

A GA uses a number of chromosomes at a time (e.g. 50), called the population, each representing a solution to the problem.

The chromosomes in the population compete to survive based on their fitness and are manipulated by genetic operators.

The population evolves over a number of generations towards a better solution.


Genetic Operators

The main genetic operators are:

• reproduction
• crossover
• mutation
• inversion


Reproduction

“Reproduce” by “selection”: individual chromosomes are selected to reproduce, i.e. to be parents.

Chromosomes with a higher fitness value have a higher probability of contributing one or more offspring to the next generation.


Roulette wheel selection technique

It is one of the chromosome selection techniques. Each chromosome is given a slice of the circular roulette wheel. The area of the slice within the wheel is equal to the chromosome fitness ratio.

To select a chromosome for mating, a random number is generated in the interval [0, 100], and the chromosome whose segment spans the random number is selected.
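As an illustration of the roulette wheel described above, here is a minimal sketch. It assumes non-negative fitness values that are to be maximised; the function name `roulette_select` and its `spin` parameter are hypothetical names chosen for this example.

```python
import random
from itertools import accumulate


def roulette_select(fitnesses, spin=None, rng=random):
    """Return the index of the chromosome whose wheel segment contains `spin`.

    `spin` is a number in [0, 100]; if omitted, one is drawn at random.
    Each slice is the chromosome's fitness as a percentage of the total.
    """
    total = sum(fitnesses)
    if total <= 0:
        raise ValueError("total fitness must be positive")
    if spin is None:
        spin = rng.uniform(0.0, 100.0)
    # Upper edge of each slice, as a cumulative percentage of total fitness.
    edges = list(accumulate(100.0 * f / total for f in fitnesses))
    for index, edge in enumerate(edges):
        if spin <= edge:
            return index
    return len(fitnesses) - 1  # guard against floating-point round-off


if __name__ == "__main__":
    # Slices of 10%, 30% and 60%: segments [0, 10], (10, 40], (40, 100].
    print(roulette_select([10, 30, 60], spin=25))
```

Passing `spin` explicitly makes the selection reproducible; leaving it out draws the random number as the slide describes.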


Crossover

Mixing of genetic material (mating).

Two structures in the current generation are allowed to mate randomly, with each pair producing two chromosomes (offspring).

A crossover point is selected at random and parts of the two parent chromosomes are swapped to create two child chromosomes.


Crossover in action

Before crossing over:

0 1 0 0 1 0 0 1
0 0 1 1 0 0 1 0

After it is done:

0 1 0 0 0 0 0 1
0 0 1 1 1 0 1 0

Select a position at random and swap the two bits (here, the fifth bit of each parent).

Crossover can lead to an effective combination of partial solutions on different chromosomes.

By doing this, it helps accelerate the search at an early stage of evolution.
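The swap shown on this slide can be sketched as exchanging a segment between two parents. The helper names `swap_segment` and `single_point_crossover` are hypothetical; the first reproduces the slide's one-bit swap, and the second is the common single-point variant, in which everything after a cut point is exchanged.

```python
import random


def swap_segment(parent_a: str, parent_b: str, start: int, end: int):
    """Exchange the bits at positions start..end-1 (0-based) between two parents."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    child_a = parent_a[:start] + parent_b[start:end] + parent_a[end:]
    child_b = parent_b[:start] + parent_a[start:end] + parent_b[end:]
    return child_a, child_b


def single_point_crossover(parent_a: str, parent_b: str, point=None, rng=random):
    """Swap everything after a cut point (chosen at random if not given)."""
    if point is None:
        point = rng.randint(1, len(parent_a) - 1)
    return swap_segment(parent_a, parent_b, point, len(parent_a))


if __name__ == "__main__":
    # The slide's example: only the fifth bit (index 4) is exchanged.
    print(swap_segment("01001001", "00110010", 4, 5))
    # Single-point crossover on the same parents, cutting after the fourth bit.
    print(single_point_crossover("01001001", "00110010", point=4))
```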


Types of crossover

• single-point
• multi-point
• arithmetic
• reduced surrogate
• uniform
• shuffle
• tree, etc.


Mutation

Mutation is a small copy error from one generation to the next.

The mutation rate is the probability that a bit changes from 0 to 1 or from 1 to 0.

The mutation rate must be very small (e.g. 0.001), or it may result in a random search rather than a guided search.


Mutation in action

Mutation may be random or heuristic.

Movement can be made global or local to some sub-units.

0 1 0 0 1 0 0 1

becomes

0 1 0 1 1 0 0 1
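A minimal sketch of random bit-flip mutation follows. The function names `flip_bit` and `mutate` are hypothetical; `flip_bit` reproduces the slide's example, where the fourth bit is flipped, and `mutate` applies the mutation rate to each bit independently.

```python
import random


def flip_bit(chromosome: str, position: int) -> str:
    """Return a copy of the bit string with the bit at `position` (0-based) inverted."""
    flipped = "1" if chromosome[position] == "0" else "0"
    return chromosome[:position] + flipped + chromosome[position + 1:]


def mutate(chromosome: str, rate: float = 0.001, rng=random) -> str:
    """Flip each bit independently with probability `rate`."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < rate else bit
        for bit in chromosome
    )


if __name__ == "__main__":
    print(flip_bit("01001001", 3))        # the slide's example
    print(mutate("01001001", rate=0.25))  # exaggerated rate so a change is likely visible
```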


An Example

Data population: RGB colours

Aim: to obtain the darkest colour, represented by (0, 0, 0)

This is a minimisation problem, i.e. a good colour is one for which fitness(colour) --> 0.

We now tabulate our data as shown (see next slide):


GA: Fitness (I)

Start at a random pattern like this:

Colour  Red  Green  Blue
C1       80    170   689
C2      130    690    15
C3       24      8   317

where

Fitness for (C1) = 80 + 170 + 689 = 939
Fitness for (C2) = 130 + 690 + 15 = 835
Fitness for (C3) = 24 + 8 + 317 = 349


GA: Fitness (II)

Start at a random pattern like this:

Colour  Red  Green  Blue
C1       80    170   689
C2      130    690    15
C3       24      8   317

Fitness (C1) = 80 + 170 + 689 = 939
Fitness (C2) = 130 + 690 + 15 = 835
Fitness (C3) = 24 + 8 + 317 = 349

FITTEST PLACED TOP: C3, C2, C1


GA: Selection

After a selection is done on the sample:

Colour  Fitness
C3          349
C2          835
C1          939

** Remember, this is a minimisation problem.


GA: Reproduction & Crossover

So far, we have this:

Colour  Red  Green  Blue
C1       80    170   689
C2      130    690    15
C3       24      8   317

Colour  Fitness
C3          349
C2          835
C1          939

Next step is to reproduce the pattern, like this, by crossing over:

Colour  Red  Green  Blue
C4       24      8    15
C5       24      8   689
C6      130    690   689

C4 is crossover(C3, C2) = (24, 8, 15)
C5 is crossover(C3, C1) = (24, 8, 689)
C6 is crossover(C2, C1) = (130, 690, 689)


Mutation in GA

Perform mutation, and we have a new population of 3 chromosomes:

Colour  Red  Green  Blue
C7       24      8    13
C8       25      9   689
C9      128    688   689

C7 is obtained by mutating C4 = (24, 8, 13)
C8 is obtained by mutating C5 = (25, 9, 689)
C9 is obtained by mutating C6 = (128, 688, 689)


Conclusion (up to “mutation” to get a new data set)

Some solutions have improved (after the first iteration):

Fitness for C7 = (24 + 8 + 13) = 45 (getting better rather fast)
Fitness for C8 = (25 + 9 + 689) = 723 (slightly improved answer)
Fitness for C9 = (128 + 688 + 689) = 1505 (worse here)

If the process is iterated, the population will converge to a fitness near zero, i.e. fitness(colour) --> 0.
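The colour example can be replayed in a few lines. This is a sketch under stated assumptions: fitness is the sum of the three components (as on the slides), and the hypothetical `crossover` helper takes red and green from the first parent and blue from the second, which is the pattern that produces C4 to C6. The mutated values C7 to C9 are copied from the slides rather than generated.

```python
def fitness(colour):
    """Sum of the R, G and B components; lower is better (darker)."""
    return sum(colour)


def crossover(first, second):
    """Red and green from `first`, blue from `second` (assumed from C4 to C6)."""
    return (first[0], first[1], second[2])


population = {"C1": (80, 170, 689), "C2": (130, 690, 15), "C3": (24, 8, 317)}

# Selection: rank by fitness, fittest (smallest sum) first.
ranked = sorted(population, key=lambda name: fitness(population[name]))
best, middle, worst = (population[name] for name in ranked)

# Crossover between the ranked parents.
children = {
    "C4": crossover(best, middle),
    "C5": crossover(best, worst),
    "C6": crossover(middle, worst),
}

# Mutated offspring, taken directly from the slides.
mutated = {"C7": (24, 8, 13), "C8": (25, 9, 689), "C9": (128, 688, 689)}

if __name__ == "__main__":
    print("ranking:", ranked)
    print("children:", children)
    for name, colour in mutated.items():
        print(name, colour, "fitness =", fitness(colour))
```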


We shall look at the so-called ‘fitness function’ with an example.


Fitness Function

The GA performs a search amongst possible solutions.

The GA search is guided by a fitness function, which returns a single numeric value indicating the fitness of a chromosome.

The fitness is maximised or minimised depending on the problem.


Problem Description

To make the best use of disk space and avoid fragmentation, files should be allocated to minimize the number of locations used, so that some storage is freed for other files' use.

Assumption: each file must be placed in a single location.


Problem Description

Disk space

Location  Size (KB)
0         1.0
1         1.5
2         4.0
3         0.3
Total     6.8


Files to be stored are:

Identifier Size (KB)

A 0.2

B 0.1

C 1.2

D 3.0

E 0.9


SOLUTION

Step 1: design the structure of the chromosome

• use one gene per file: 5 genes (for this example); the first gene represents the location for file A, etc.
• as there are 4 locations, 2-bit genes can be used to indicate a file's location (00 = 0, 01 = 1, 10 = 2, 11 = 3), i.e. base two
• chromosome size = 5 * 2 = 10 bits


Step 2: determine the fitness function

The fitness is to be maximised.

Factors affecting the fitness function:

• free memory locations, i.e. those not being used (min = 0, max = 4)
• memory overflow (disk space, KB): a solution is not ‘fit’ if there is a lot of overflow
• valid/invalid solutions

Fitness calculation:

• Valid solution (so that all files are allocated): fitness = bonus for getting a valid solution + number of free memory locations. The bonus in this case is the maximum that can be obtained for an invalid solution, so the fitness ranges from 6.8 + 0 to 6.8 + 4.
• Invalid solution: fitness = total disk space - memory overflow
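The chromosome design and fitness rules for the file-allocation problem can be sketched as follows. The assumptions are labelled in the comments: sizes are stored as integers in tenths of a KB to avoid rounding issues, a solution counts as invalid when any location overflows, and the names `decode` and `fitness` are hypothetical.

```python
# Sizes in tenths of a KB (assumption: integers avoid floating-point round-off).
CAPACITY = [10, 15, 40, 3]                               # locations 0..3
FILE_SIZE = {"A": 2, "B": 1, "C": 12, "D": 30, "E": 9}   # files A..E
TOTAL_SPACE = sum(CAPACITY)                              # 68, i.e. 6.8 KB


def decode(chromosome: str):
    """Split a 10-bit string into five 2-bit genes, one location per file."""
    if len(chromosome) != 2 * len(FILE_SIZE):
        raise ValueError("expected one 2-bit gene per file")
    return {
        name: int(chromosome[2 * i:2 * i + 2], 2)
        for i, name in enumerate(FILE_SIZE)
    }


def fitness(chromosome: str) -> float:
    """Fitness to be maximised, following the valid/invalid rules on the slide."""
    used = [0] * len(CAPACITY)
    for name, location in decode(chromosome).items():
        used[location] += FILE_SIZE[name]
    overflow = sum(max(0, u - c) for u, c in zip(used, CAPACITY))
    if overflow:
        # Invalid solution: total disk space minus the overflow (in KB).
        return (TOTAL_SPACE - overflow) / 10
    # Valid solution: bonus (total disk space in KB) plus number of free locations.
    return TOTAL_SPACE / 10 + used.count(0)


if __name__ == "__main__":
    # A, B, C in location 1 (1.5 KB used); D, E in location 2 (3.9 KB used).
    print(decode("0101011010"), fitness("0101011010"))
    # Every file in location 2: 5.4 KB does not fit into 4.0 KB.
    print(fitness("1010101010"))
```

With this scoring, every valid allocation scores at least 6.8 and every invalid one scores less, so the search is pushed towards valid solutions first and towards fewer used locations second.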


Genetic Algorithm - revisited

Algorithm overview

Initialise the population
Evaluate the fitness of the population
WHILE the termination condition has not been satisfied
    select chromosomes for the new population (reproduction)
    perform crossover on the new population
    perform mutation on the new population
    evaluate the fitness of the new population
    make the new population the old population
ENDWHILE
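The overview above can be sketched as a small runnable loop. Everything specific here is an assumption made for illustration: the function name `run_ga`, a fixed number of generations as the termination condition, fitness-proportionate selection via `random.choices`, single-point crossover, and a toy fitness that counts 1-bits. Fitness is evaluated at the top of each pass, which is equivalent to evaluating before the loop and again at the end of each pass.

```python
import random


def run_ga(fitness, n_bits, pop_size=50, generations=20,
           crossover_rate=0.6, mutation_rate=0.001, seed=None):
    """Minimal generational GA over bit strings; fitness is maximised."""
    rng = random.Random(seed)
    # Initialise the population with random bit strings.
    population = ["".join(rng.choice("01") for _ in range(n_bits))
                  for _ in range(pop_size)]
    for _ in range(generations):  # assumed termination condition
        # Evaluate; a tiny offset keeps every selection weight positive.
        weights = [fitness(c) + 1e-9 for c in population]
        # Selection: fitness-proportionate, with replacement.
        parents = rng.choices(population, weights=weights, k=pop_size)
        # Crossover: pair up parents; cross a pair with probability crossover_rate.
        offspring = []
        for a, b in zip(parents[0::2], parents[1::2]):
            if rng.random() < crossover_rate:
                point = rng.randint(1, n_bits - 1)
                a, b = a[:point] + b[point:], b[:point] + a[point:]
            offspring += [a, b]
        offspring += parents[len(offspring):]  # odd pop_size: carry the last parent
        # Mutation: flip each bit with probability mutation_rate;
        # the result becomes the population for the next generation.
        population = [
            "".join(("1" if bit == "0" else "0") if rng.random() < mutation_rate else bit
                    for bit in c)
            for c in offspring
        ]
    return max(population, key=fitness), population


if __name__ == "__main__":
    best, final_population = run_ga(lambda c: c.count("1"), n_bits=10, seed=1)
    print(best, len(final_population))
```

The default parameter values mirror the ones quoted in the slides for population size, crossover rate and mutation rate; the number of generations is arbitrary.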


Procedures

Step 1: Pick a population of random codes. (Population size = 4)

Step 2: Evaluate the fitness of each chromosome

Step 3: Select for reproduction with a probability based on the fitness value

Step 4: Crossover chromosomes to form the new generation

• randomly select pairs
• crossover


Step 5: Mutation

Step 6: Loop back to step 2

Control Parameters

The main GA control parameters are:

• Number of generations and trials
• Population size
• Crossover rate
• Mutation rate


Fitness value calculation

Example (exercise):

Find the fitness value of the 2 chromosomes 1011 and 0101 for the boolean expression:

b(X1, X2, X3, X4) = (X1 ∨ X2) ∧ (X3 ∨ ¬X4) ∧ ¬X1

Given the formulas,

Fitness(X) = 1 if X is a 1-bit, and 0 if X is a 0-bit
Fitness(¬X) = 1 - Fitness(X)
Fitness(X1 ∨ X2 ∨ … ∨ Xn) = max(Fitness(X1), Fitness(X2), …, Fitness(Xn))
Fitness(X1 ∧ X2 ∧ … ∧ Xn) = (1/n) * Σ(i = 1..n) Fitness(Xi)


Evaluate the fitness of each binary string for the above expression, b(X1, X2, X3, X4) = (X1 ∨ X2) ∧ (X3 ∨ ¬X4) ∧ ¬X1:

Fitness(1011)
= (1/3) * (sum of the fitness values of the 3 sub-expressions)
= (1/3) * (max(fitness(X1), fitness(X2)) + max(fitness(X3), fitness(¬X4)) + (1 - fitness(X1)))
= (1/3) * (max(1, 0) + max(1, (1 - 1)) + (1 - 1))
= (1/3) * (1 + 1 + 0) = 2/3

Similarly, fitness(0101) = (1/3) * (max(0, 1) + max(0, (1 - 1)) + (1 - 0)) = (1/3) * (1 + 0 + 1) = 2/3

Apply crossover at bit position 2 and compute the new fitness values:

1011 becomes 1001 & 0101 becomes 0111

Working out the fitness for these strings, you will get a fitness value of 1 for 0111, and hence it provides the (best) solution.
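The worked fitness values can be checked mechanically. This sketch encodes the four fitness formulas as small helpers (the names `f_not`, `f_or`, `f_and` and `fitness` are hypothetical) and uses exact fractions so that 2/3 is not rounded.

```python
from fractions import Fraction


def f_not(value):
    """Fitness(¬X) = 1 - Fitness(X)."""
    return 1 - value


def f_or(*values):
    """Fitness of a disjunction: the maximum of its parts."""
    return max(values)


def f_and(*values):
    """Fitness of a conjunction: the average of its parts."""
    return sum(values) / len(values)


def fitness(chromosome: str) -> Fraction:
    """Fitness of a 4-bit string for (X1 ∨ X2) ∧ (X3 ∨ ¬X4) ∧ ¬X1."""
    x1, x2, x3, x4 = (Fraction(int(bit)) for bit in chromosome)
    return f_and(f_or(x1, x2), f_or(x3, f_not(x4)), f_not(x1))


if __name__ == "__main__":
    for chromosome in ("1011", "0101", "1001", "0111"):
        print(chromosome, fitness(chromosome))
```

Running it on the two parents and the two offspring shows that only 0111 satisfies all three sub-expressions.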


Supplementary slides


Number of generations

The number of cycles (generations) of processing.

Not all chromosomes in a population need to be evaluated in each generation, as those that do not change keep their fitness.

Number of trials

• the fitness evaluation of a chromosome takes 1 trial
• the number of trials is approximately equal to population size * number of generations

Genesis default: 1000


Population size

The number of chromosomes in the population

Constant throughout a GA run

Genesis default: 50


Crossover rate

The proportion of chromosomes that are crossed over; the remainder are copied to the new population unchanged.

For example, a crossover rate of 0.6 results in 60% of the chromosomes selected for reproduction being crossed over, and the other 40% being carried through unchanged to the new population.

Genesis default: 0.6


Mutation rate

The probability of a bit being mutated (changed)

Usually in the range 0.01 to 0.001

For example a mutation rate of 0.001 means there is a 1 in 1000 chance of a bit being changed

Genesis default: 0.001
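To get a feel for how small this rate is, the expected number of flipped bits per generation can be estimated as mutation rate * population size * chromosome length. The sketch below is only an illustration: the function name `expected_bit_flips` is hypothetical, and the 10-bit chromosome length is borrowed from the file-allocation example rather than being a Genesis setting.

```python
def expected_bit_flips(mutation_rate: float, pop_size: int, n_bits: int) -> float:
    """Expected number of mutated bits in one generation."""
    return mutation_rate * pop_size * n_bits


if __name__ == "__main__":
    # Rate 0.001, population of 50, 10-bit chromosomes.
    print(expected_bit_flips(0.001, 50, 10))
```

With these numbers, roughly one bit is flipped every two generations, which keeps the search guided rather than random.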


Genetic Algorithms Software Packages

ANT: PC implementation of 'John Muir Trail' experiment

CFS-C: Domain Independent Subroutines for Implementing Classifier Systems in Arbitrary, User-Defined Environments

DGENESIS: Distributed GA

EM: Evolution Machine

GAucsd: Genetic Algorithm Software Package

GAC: Simple GA in C

GACC: Genetic Aided Cascade-Correlation

GAGA: A Genetic Algorithm for General Application

GAGS: Genetic algorithm application generator and C++ class library

GAL: Simple GA in Lisp

GAME: Genetic Algorithms Manipulation Environment


Genetic Algorithms Software Packages

GAMusic: Genetic Algorithm to Evolve Musical Melodies

GANNET: Genetic Algorithm / Neural NETwork

GAW: Genetic Algorithm Workbench

GECO: Genetic Evolution through Combination of Objects

GENALG: Genetic Algorithm package written in Pascal

GENESIS: GENEtic Search Implementation System

GENEsYs: Experimental GA based on GENESIS

GenET: Domain-independent generic GA software package

Genie: GA-based modeling/forecasting system

GENITOR: Modular GA package with floating-point support.

GENlib: Genetic Algorithms and Neural Networks

mGA: C and Common Lisp implementations of a messy GA