MAE 552 Heuristic Optimization Instructor: John Eddy Lecture #12 2/18/02 Intro to Evolutionary Algorithms

Post on 22-Dec-2015

TRANSCRIPT

<ul><li> Slide 1 </li> <li> MAE 552 Heuristic Optimization Instructor: John Eddy Lecture #12 2/18/02 Intro to Evolutionary Algorithms </li> <li> Slide 2 </li> <li> So Far: Up until now, all the algorithms discussed operated on a single current design. Simulated Annealing randomized perturbations of a single point. Greedy Algorithms maximized local improvement. </li> <li> Slide 3 </li> <li> Now: Consider operating on and maintaining an entire population of points simultaneously. So what? It would be easier to just run my single-point algorithm many times, or maybe on multiple processors to save wall-clock time. </li> <li> Slide 4 </li> <li> Advantage: We can now simulate the processes of natural selection and competition within our population. We can have our candidate designs fight for places in the population of future generations (iterations). </li> <li> Slide 5 </li> <li> Evolutionary Algorithms: Date back to the 1950s. Many researchers independently developed different versions. Examples are: Genetic Algorithms, Evolution Strategies, Evolutionary Programming. </li> <li> Slide 6 </li> <li> Basic Terminology: Most of the terminology is borrowed from biology. Phenotype: the "outward, physical manifestation" of an organism. The physical parts, the sum of the atoms, molecules, macromolecules, cells, structures, metabolism, energy utilization, tissues, organs, reflexes and behaviors; anything that is part of the observable structure, function or behavior of a living organism. Genotype: the "internally coded, heritable information" carried by all living organisms. This stored information is used as a "blueprint" or set of instructions for building and maintaining a living creature. </li> <li> Slide 7 </li> <li> Basic Terminology: Gene: The fundamental physical and functional unit of heredity. A gene is an ordered sequence of nucleotides located in a particular position on a particular chromosome that encodes a specific functional product. 
Chromosome: The self-replicating genetic structures of cells, each containing the entire genome of an organism. </li> <li> Slide 8 </li> <li> Basic Terminology: Alleles: Alternative forms of a genetic locus. Crossing Over: The breaking during meiosis of one maternal and one paternal chromosome, the exchange of corresponding sections of DNA, and the rejoining of the chromosomes. This process can result in an exchange of alleles between chromosomes. Mutation: A heritable change in the genetic makeup of an organism. </li> <li> Slide 9 </li> <li> Important Note: We are not constrained by any of the rules of biological systems. For example, we can have as many parents as we wish contribute to the makeup of our offspring, and we can have members that live forever (don't age). What is important to note here is that we are using nature as a model for our mathematical algorithms. </li> <li> Slide 10 </li> <li> 5 Basic Components: (1) An encoded representation of solutions to the problem. Ex. binary encoding, real number encoding, integer encoding, data structure encoding. (2) A means of generating an initial population. Ex. random initialization, patterned initialization. (3) A means of evaluating design fitness. Need a consistent means of determining which designs are better than others. (4) Operators for producing and selecting new designs. Ex. selection, crossover, mutation. (5) Values for the parameters of the algorithm. Ex. how much crossover and mutation, how big the population is. </li> <li> Slide 11 </li> <li> General Approach: The general equation describing most evolutionary algorithms is x[t + 1] = s(v(x[t])), where x[t] is the population at time t, v(*) is/are the variation operator(s), and s(*) is the selection operator. </li> <li> Slide 13 </li> <li> Encoding: Vectors of integers. Useful for TSP and integer problems. [Figure: nodes labeled 1 to 8.] TSP: Possible trips: [ 1 8 6 5 2 3 4 7 ], [ 8 2 5 6 3 1 7 4 ], [ 2 4 6 3 7 5 1 8 ] (where the return home is implied). </li> <li> Slide 14 </li> <li> Encoding: Vectors of real numbers. Useful for continuous problems. Min: f(x) = x[1]^2 + x[2] x[3]^3 50 s.t. g(x) ≤ 0, h(x) = 0, x_l ≤ x ≤ x_u. Possible design configurations: [ 13.65, -1.25, 30.98 ], [ 0.67, 14.81, 67.15 ], [ 53.74, 12.54, -21.32 ]. </li> <li> Slide 15 </li> <li> Encoding: Vectors of binary bits. Useful for packing and shipping problems. [Figure: items labeled 1 to 4.] What's in the bag: [ 0 1 1 0 ], [ 1 0 1 0 ], [ 1 1 1 1 ]. </li> <li> Slide 16 </li> <li> Encoding: Combination of the previous types. Useful for variable-length lists (perhaps a list of continuous numbers where an integer indicates the size of the list). [Figure: example plot of y against t.] What do we do: form: [ N, a_0, a_1, a_2, ..., a_N ]. Possible solutions: [ 3, 2.65, 4.25, 3.14 ], [ 2, 5.32, 2.81 ], [ 4, 3.21, 4.25, 9.65, 7.28 ]. </li> <li> Slide 17 </li> <li> Encoding: Symbolic expressions. Useful for mapping problems (control problems, etc.). Typically requires a parser; the data structure is typically a tree. Combine: *, +, -, /, % (mod), sin, cos, tan, etc. With: Possible Solutions: </li> <li> Slide 18 </li> <li> Encoding, Common Method: It is common to use binary encoding for problems involving integer, real, and binary type variables. Previously we saw that vectors of bits may be useful for problems involving binary state variables (T, F), such as "is item 1 in the bag?" </li> <li> Slide 19 </li> <li> Common Method: How in a minute; first, why? 
Primarily because of flexibility (it handles many types of variables) and because we can take advantage of the way that computers work. Also, this method of encoding lends itself to a number of common variation operators, as we will see. </li> <li> Slide 20 </li> <li> Common Method: A note on flexibility: Flexibility usually comes at the expense of optimality. In our case, this method may not work best for many of the problems we do, but it should work fairly well for many of them. Specialization on a problem-by-problem basis will usually improve performance. </li> <li> Slide 21 </li> <li> Binary Encoding: How? Each of our design variable values may be represented as a vector of 1s and 0s. For example (binary = decimal): 00000000 = 0; 00101101 = 45; 11111111 = 255; 1101.11 = 13.75. </li> <li> Slide 22 </li> <li> Binary Encoding: How? Therefore, since our design is defined by the collection of its variables, the design can be written as a long string of bits. Example: for a design [ 4, 6, 2 ] we can equivalently write [ 0100, 0110, 0010 ]. </li> <li> Slide 23 </li> <li> Binary Encoding: Back to why. We will not likely be doing this by hand. We will probably use a computer. All numbers in a computer are represented as a string of bits (1s and 0s). We can take advantage of this. </li> <li> Slide 24 </li> <li> Binary Encoding: Because of this, it is not necessary to explicitly create vectors of bits to represent our design variables. Advantages (assuming we decided on BE): Memory efficient: a 32-bit integer value requires 4 bytes, instead of what would be a minimum of 32 bytes if each bit were stored as its own one-byte element. </li> <li> Slide 25 </li> <li> Binary Encoding: Advantages (vs. explicit vectors of bits): Code simplification: no explicit conversion from binary to decimal is necessary for use of the design variables. Most languages support operations directly on the bits of integers. 
</li> <li> Slide 26 </li> <li> Binary Encoding: Disadvantages: Most bitwise operators only work on integral types, and probably most of our variables are real. Solution: Specify a precision with which to keep each design variable and convert to a long integer before any bit manipulation. </li> <li> Slide 27 </li> <li> Binary Encoding: Example of conversion using precision. Given X1 = 12.6345 and a desired precision of 3: X1_int = (int)[(12.6345)(10^3)] = 12634. To improve accuracy, perhaps round X1 prior to conversion. To get back the value at the kept precision (here 12.634), simply divide X1_int by 10^3. </li> <li> Slide 28 </li> <li> Binary Encoding: In general: Xi_int = Xi * 10^prec (truncated), and Xi = Xi_int / 10^prec. </li> <li> Slide 30 </li> <li> Initial Population: Quite simply, the population may be initialized in any way you wish. Some possibilities: [Figure: two panels, Random and Patterned, each with axes x1 and x2.] </li> <li> Slide 32 </li> <li> Fitness Evaluation: It is necessary to provide a consistent means of evaluating the fitness of a design. A ≻ B ≻ C implies A ≻ C (those who have studied utility theory and preference ranking know that this does not always hold). </li> <li> Slide 33 </li> <li> Fitness Evaluation: The closer to optimal a point is, the better its fitness should be (this provides direction). Consider the case of functions of binary bits x_i ∈ {0, 1}. What will be the result of these two functions (what is the difference)? Maximize: </li> <li> Slide 34 </li> <li> Fitness Evaluation: How can we avoid this problem? In this particular case, perhaps we can make our fitness value a count of the number of DVs with a value of 1. This is a very problem-specific question and will usually require knowledge about the problem. </li> <li> Slide 35 </li> <li> Fitness Evaluation: An important concept is that fitness is not limited to the objective function value, and commonly is not. Creative measures of fitness can greatly improve the performance of the algorithm and may have a strong dependency on the choice of encoding and use of operators. </li> <li> Slide 37 </li> <li> Variation Operators: The variation operators provide a means of generating new designs. 
They should be set up to leverage information discovered in previous design evaluations. </li> <li> Slide 38 </li> <li> Variation Operators: Choice of variation operators is tightly coupled with choice of encoding (as we will see as we progress). Problems are most efficiently solved when the proper operators are chosen and tailored to the problem at hand. </li> <li> Slide 39 </li> <li> Variation Operators, Crossover: Crossover is the inclusion of or combination of genetic material from one or more designs to create new designs (recall the biological definition). Appropriate choice of a crossover strategy is highly dependent on choice of encoding and evaluation function. </li> <li> Slide 40 </li> <li> Crossover on Vectors of Integers: Consider a plain integer problem (not TSP) and two possible designs. Could probabilistically choose values from the vectors: X1 = [ 10, 15, 9, 7, 19 ] and X2 = [ 17, 2, 14, 31, 3 ] give C1 = [ 10, 2, 14, 7, 19 ] and C2 = [ 17, 15, 9, 31, 3 ]. Could choose 1 or more random crossover point(s): X1 = [ 10, 15, 9, 7, 19 ] and X2 = [ 17, 2, 14, 31, 3 ] give C1 = [ 10, 15, 14, 31, 3 ] and C2 = [ 17, 2, 9, 7, 19 ]. Could increment or decrement each value according to a Gaussian distribution with a mean of zero; the standard deviation would then be a measure of the probability of large changes. </li> <li> Slide 41 </li> <li> Crossover on Vectors of Reals: Could use the same strategies listed for ints. Another approach is to use arithmetic crossover (taken from convex set theory). Basic equations: C1 = λ1 X1 + λ2 X2 and C2 = λ1 X2 + λ2 X1. This is a weighted average approach: Convex combination: λ1 + λ2 = 1 and λ1, λ2 &gt; 0. Affine combination: λ1 + λ2 = 1. Linear combination: λ1, λ2 ∈ E^n. Note: there are many other approaches you can read about. </li> <li> Slide 42 </li> <li> Crossover on Vectors of Bits: First define Hamming distance: Given 2 expressions, the Hamming distance is the number of characters that must be changed to make the expressions equivalent. 
So D_H for 01101 and 10100 is 3, and D_H for 01111 and 10000 is 5. We will consider this later. </li> <li> Slide 43 </li> <li> Crossover on Vectors of Bits: Could use the first two strategies for ints (the third strategy would not make sense). Could iterate through the vector and, according to a given probability, change the bits. </li> <li> Slide 44 </li> <li> Good Crossover for BE: BE lends itself well to Parameterized Crossover (we have already seen a parameterized approach). Because each variable (parameter) is in itself a string of bits, we can operate on each variable separately. </li> <li> Slide 45 </li> <li> Good Crossover for BE: Example, Single Point Parameterized: Given two designs [ 0100, 0110, 0010 ] = [ 4, 6, 2 ] and [ 1010, 0111, 1111 ] = [ 10, 7, 15 ]. </li> <li> Slide 46 </li> <li> Good Crossover for BE: Example, Single Point Parameterized: Choose a crossover point for each variable (marked with |): [ 0|100, 01|10, 001|0 ] = [ 4, 6, 2 ] and [ 1|010, 01|11, 111|1 ] = [ 10, 7, 15 ]. Then perform crossover as before. </li> <li> Slide 47 </li> <li> Good Crossover for BE: Example, Single Point Parameterized: Results for this case: Parents: [ 0|100, 01|10, 001|0 ] = [ 4, 6, 2 ] and [ 1|010, 01|11, 111|1 ] = [ 10, 7, 15 ]. Children: [ 0|010, 01|11, 001|1 ] = [ 2, 7, 3 ] and [ 1|100, 01|10, 111|0 ] = [ 12, 6, 14 ]. Notice that one child tends to be like one parent and the other tends to be like the other parent. </li> <li> Slide 48 </li> <li> Crossover on Combinations A few of the strat...</li></ul>
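The single-point parameterized crossover example from Slides 45 to 47 can be sketched in code as follows. This is an illustrative sketch only: it assumes each variable is a 4-bit non-negative integer, the crossover points are passed in rather than drawn at random so the result is repeatable, and the function names `crossover_variable` and `parameterized_crossover` are hypothetical. The parents are the bit strings from the slides, 0100, 0110, 0010 and 1010, 0111, 1111, which are [ 4, 6, 2 ] and [ 10, 7, 15 ] in decimal.

```python
def crossover_variable(a, b, point, width=4):
    """Single-point crossover of two `width`-bit integers.

    `point` counts bits from the left (most significant) end. Each child keeps
    the first `point` bits of one parent and takes the rest from the other.
    """
    low_mask = (1 << (width - point)) - 1      # bits to the right of the cut
    high_mask = ((1 << width) - 1) ^ low_mask  # bits to the left of the cut
    child_a = (a & high_mask) | (b & low_mask)
    child_b = (b & high_mask) | (a & low_mask)
    return child_a, child_b


def parameterized_crossover(parent_a, parent_b, points, width=4):
    """Apply single-point crossover to each design variable separately."""
    pairs = [
        crossover_variable(a, b, p, width)
        for a, b, p in zip(parent_a, parent_b, points)
    ]
    child_a = [pair[0] for pair in pairs]
    child_b = [pair[1] for pair in pairs]
    return child_a, child_b


# Cuts after 1, 2 and 3 bits, matching the crossover points in the slides.
children = parameterized_crossover([4, 6, 2], [10, 7, 15], points=[1, 2, 3])
```

Because the masks act on the integers themselves, no conversion to and from explicit bit vectors is needed. In practice the crossover points would typically be chosen at random for each variable.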