Genetic Algorithm Page 27
CHAPTER 2
GENETIC ALGORITHM
“Genetic algorithm (GA) is basically a method for solving constrained and unconstrained
optimization problems. GA is based on Darwin’s theory of natural evolution as set out in
The Origin of Species, and on the concept of ‘survival of the fittest’. Just as in nature
the fit species survive while the unfit are eliminated, so among a number of available
solutions only the fitter solutions survive, while the less fit solutions are discarded.
GA represents solutions in the form of chromosomes, and the fitness of each chromosome is
evaluated. The fitter solutions are selected for reproduction using the crossover operator,
while the mutation operator is used to maintain the diversity of the population. Fitter
chromosomes replace less fit chromosomes, and the process continues until an optimal
solution is found on the basis of some pre-specified criteria. This chapter provides an
overview of GA. GA works on a population of multiple points, in contrast to traditional
approaches which work on a single point. The various types of encoding, selection,
crossover, mutation and replacement mechanisms are discussed in this chapter. The major
advantage of GA is that it can be used in situations where numerical or mathematical models
fail. As it is an evolutionary algorithm, one can easily view the progress within each
iteration. GA can be used in a number of application areas such as optimization, design,
robotics, image processing, machine learning, automatic programming, etc. GA can also
be used in YM for purposes such as airline booking, the hotel industry, air traffic control,
choice-based network revenue models, etc. Overall, GA can prove to be a very effective tool
for YM.”
2.1 Genetic Algorithm
In the previous chapter, an overview of yield management and optimization was given. It was
observed that yield management is basically an optimization problem. The major conditions
for yield management, as discussed, are fixed capacity, perishable inventory and price
discrimination. Yield management problems are stochastic in nature, since when and how many
customers will arrive is not known but merely predicted. These conditions form an ideal
platform for applying a genetic algorithm to the yield management problem. An insight into
the genetic algorithm is now presented.
A genetic algorithm (GA) is a method for solving both constrained and unconstrained
optimization problems based on a natural selection process that mimics biological evolution.
The algorithm repeatedly modifies a population of individual solutions. At each step, the
genetic algorithm randomly selects individuals from the current population and uses them as
parents to produce the children for the next generation. Over successive generations, the
population evolves toward an optimal solution.
One can apply the genetic algorithm to solve problems that are not well suited for standard
optimization algorithms, including problems in which the objective function is discontinuous,
nondifferentiable, stochastic, or highly nonlinear.
The genetic algorithm is based on the theory of natural evolution that Charles Darwin set
out in The Origin of Species. According to this theory, “over several generations,
biological organisms evolve based on the principle of natural selection, ‘survival of the
fittest’, to reach certain remarkable tasks”.
In nature, individuals in a population compete with each other for the basic resources of
life such as food, shelter, etc. Within the same species, individuals also compete to
attract mates for reproduction. In this selection procedure, poorly performing individuals
have less chance to survive, while the most adapted or “fit” individuals produce a
relatively large number of offspring. It can also be observed that during reproduction, a
recombination of the good characteristics of each ancestor can produce “best fit” offspring
whose fitness is, in general, greater than that of either parent. After a few generations,
species evolve spontaneously to become more and more adapted to their environment and may
be sustained for a longer period of time.
In 1975, Holland developed this idea in his book “Adaptation in Natural and Artificial
Systems”. He described how to apply the principles of natural evolution to optimization
problems and built the first genetic algorithms. Holland’s theory has since been developed
by leaps and bounds, and Genetic Algorithms (GAs) now stand as a powerful tool
for solving search and optimization problems. Genetic algorithms are based on the basic
principle of genetics and evolution.
2.2 Historical Background
Holland’s influence in the development of the GA has been very important, but several other
scientists with different backgrounds were also involved in developing similar ideas. 1975
was a pivotal year in the development of genetic algorithms. It was in that year that Holland’s
book was published, but perhaps more relevantly for those interested in metaheuristics, that
year also saw the completion of a doctoral thesis by one of Holland’s graduate students, Ken
DeJong (1975). Other students of Holland’s had completed theses in this area before, but this
was the first to provide a thorough treatment of the GA’s capabilities in optimization.
Another graduate student of Holland’s, David Goldberg, first produced an award-winning
doctoral thesis on the application of GAs to gas pipeline optimization, and then, in 1989,
an influential book, Genetic Algorithms in Search, Optimization, and Machine Learning. This
was the final catalyst in setting off a sustained development of GA theory and applications
that is still growing rapidly.
Optimization had a fairly small place in Holland’s work on adaptive systems, yet the majority
of research on GAs tends to assume this is their purpose. Nevertheless, using GAs for
optimization is very popular, and frequently successful in real applications, and to those
interested in metaheuristics, it will undoubtedly be the viewpoint that is most useful.
When GA is used to solve optimization problems, good results are obtained quite quickly. A
heuristic is a part of an optimization algorithm that uses the information currently gathered by
the algorithm and acts as a carrier to decide which solution candidate should be tested next,
or how the next individual can be produced [Thomas (2007)]. Genetic algorithms are a guided
random search, and among evolutionary algorithms one of the most popular optimization
techniques for multi-objective optimization problems. Genetic algorithms have been found to
be capable of finding solutions for a wide variety of problems for which no acceptable
algorithmic solutions exist. GAs have been used for solving various NP-complete problems
[Vijay Lakshmi and Radha Krishnan (2007)]. GA attempts to arrive at optimal solutions
through a process similar to biological evolution. To use a genetic algorithm, it is required to
represent the solution of the problem as a genome (or chromosome). The genetic algorithm
then creates a population of solutions and applies genetic operators such as mutation and
crossover to evolve the solutions in order to find the best one. These operate on a population
of potential solutions, applying the principle of survival of the fittest to generate improved
estimations to a solution. At each generation, a new set of approximations is created by the
process of selecting individuals according to their level of fitness and breeding them together
using genetic operators inspired by natural genetics. This process leads to the evolution of
better populations than the previous populations [Eiben & Smith (2003), Michalewicz
(1996)]. The GA consists of an iterative process that evolves a working set of individuals
called a population toward an objective function, or fitness function [Goldberg (1989),
Whitley (1994)]. Genetic algorithms are typically implemented using computer simulations
in which an optimization problem is specified.
2.3 Natural Selection
The Origin of Species is based on the “preservation of favourable variations and rejection
of unfavourable variations”. Variation refers to the changes shown by the individuals of a
species and by offspring of the same parents. Far more individuals are born than can
survive, so there is a continuous struggle for life. Individuals with an advantage have a
greater chance of survival, i.e., the survival of the fittest. For example, giraffes with
long necks can feed from tall trees as well as from the ground, whereas goats and deer with
short necks can feed only from the ground. As a result, natural selection plays a major
role in this survival process [S.N. Sivanandam et al. (2008)]. On similar lines, in each
iteration of a genetic algorithm the favourable (fit) individuals survive while the
unfavourable (unfit) individuals die out. The process continues iteration after iteration
until a stage is reached at which stable or optimized solutions are obtained, which can be
termed adaptability.
The following Table 2.1 lists the corresponding terms used in natural evolution and in
genetic algorithms.
Table 2.1 Comparison of natural evolution and genetic algorithm terminology
Natural evolution Genetic algorithm
Chromosome String
Gene Feature or character
Allele Feature value
Locus String position
Genotype Structure or coded string
Phenotype Parameter set, a decoded structure
2.4 Basic Principle
The working principle of a standard GA is illustrated in algorithm 2.1. The major steps
involved are the generation of a population of solutions, finding the objective function and
fitness function and the application of genetic operators. These aspects are described with the
help of a basic genetic algorithm as below.
Algorithm 2.1 Basic Genetic Algorithm
1. [Start] Generate a random population of n chromosomes/individuals (suitable and possible
solutions for the problem).
2. [Fitness] Evaluate the fitness f(x) of each chromosome/individual x in the population.
3. [New population] Create a new population by repeating the following steps until the new
population is complete:
(a) [Selection] Select two parent chromosomes from the population according to their
fitness (the better the fitness, the bigger the chance of being selected).
(b) [Crossover] With a crossover probability, cross over the parents to form new offspring
(children). If no crossover is performed, the offspring are exact copies of the parents.
(c) [Mutation] With a mutation probability, mutate the new offspring at each locus
(position in the chromosome).
(d) [Accepting] Place the new offspring in the new population.
4. [Replace] Use the newly generated population for a further run of the algorithm.
5. [Test] If the end condition is satisfied, stop, and return the best solution in the
current population.
6. [Loop] Go to step 2 for fitness evaluation.
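The steps of Algorithm 2.1 can be sketched in code. The following is a minimal illustration only: the toy problem (maximizing f(x) = x² over 5-bit integers), the tournament selection scheme and all parameter values are assumptions chosen for the sketch, not taken from this thesis.

```python
import random

random.seed(42)

# Assumed toy problem: maximize f(x) = x^2 for x in [0, 31],
# each chromosome being a 5-bit string that encodes x.
N_BITS, POP_SIZE, GENERATIONS = 5, 20, 40
P_CROSSOVER, P_MUTATION = 0.8, 0.05

def fitness(chrom):
    x = int("".join(map(str, chrom)), 2)   # decode bit list -> integer
    return x * x

def select(pop):
    # Tournament selection: the fitter of two random individuals wins.
    a, b = random.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b

def crossover(p1, p2):
    if random.random() < P_CROSSOVER:
        cut = random.randint(1, N_BITS - 1)        # single-point crossover
        return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
    return p1[:], p2[:]                            # exact copies of parents

def mutate(chrom):
    return [1 - g if random.random() < P_MUTATION else g for g in chrom]

# [Start] random initial population
pop = [[random.randint(0, 1) for _ in range(N_BITS)] for _ in range(POP_SIZE)]
for _ in range(GENERATIONS):
    new_pop = []
    while len(new_pop) < POP_SIZE:                 # [New population]
        c1, c2 = crossover(select(pop), select(pop))
        new_pop += [mutate(c1), mutate(c2)]
    pop = new_pop                                  # [Replace]

best = max(pop, key=fitness)
print(best, fitness(best))   # the population should drift toward x = 31
```

A fixed stopping criterion (a generation count) stands in for the [Test] step; any of the termination conditions discussed later in this chapter could be substituted.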
The basic principle behind GAs is that they create and maintain a population of individuals
represented by chromosomes. Chromosomes are essentially character strings analogous to
biological chromosomes in DNA. These chromosomes are typically encoded solutions to
a problem. The chromosomes then undergo a process of evolution according to rules of
selection, reproduction and mutation. Each individual in the environment (represented by a
chromosome) receives a measure of its fitness in the environment. Reproduction selects
individuals with high fitness values in the population, and through crossover and mutation of
such individuals, a new population is derived in which individuals may be even better fitted
to their environment. The process of crossover involves two chromosomes swapping chunks
of data and is analogous to the process of sexual reproduction. Mutation introduces slight
changes into a small proportion of the population and is representative of an evolutionary
step.
2.5 Difference between Traditional and Genetic Approach
An algorithm is a series of steps for solving a problem. A genetic algorithm is a problem
solving method that uses genetics as its model of problem solving. It’s a search technique to
find approximate solutions to optimization and search problems. One can easily differentiate
between a traditional algorithm and a genetic algorithm.
Table 2.2 Difference between Traditional and Genetic Approach
Traditional Algorithm:
• Generates a single point at each iteration; the sequence of points approaches an optimal
solution.
• Selects the next point in the sequence by a deterministic computation.
• Improvement in each iteration is problem specific.
Genetic Algorithm:
• Generates a population of points at each iteration; the best point in the population
approaches an optimal solution.
• Selects the next population by computations that use random number generators.
• Convergence in each iteration is problem independent.
The differences can also be shown with the help of the following Fig. 2.1.
Fig. 2.1 Comparison of traditional and genetic approaches
2.6 Exploitation and Exploration
Search is one of the more universal problem-solving methods for problems in which one
cannot determine a priori a sequence of steps leading to a solution. Search can be
performed with either blind strategies or heuristic strategies. Blind search strategies do
not use information about the problem domain, while heuristic search strategies use
additional information to guide the search along the most promising directions.
There are two important issues in search strategies: exploiting the best solution and exploring
the search space. Michalewicz(1996) gave a comparison on hill climbing search, random
search and genetic search. Hill climbing is an example of a strategy which exploits the best
solution for possible improvement, ignoring the exploration of the search space. Random
search is an example of a strategy which explores the search space, ignoring the exploitation
of the promising regions of the search space. GA is a class of general-purpose search
methods combining elements of directed and stochastic search, which can produce a
remarkable balance between exploitation and exploration of the search space. At the
beginning of a genetic search there is a widely random and diverse population, and the
crossover operator tends to perform a wide-spread search, exploring the whole solution
space. As high-fitness solutions develop, the crossover operator searches in the
neighbourhood of each of them. In other words, the kind of search (exploitation or
exploration) a crossover performs is determined by the environment of the genetic system
(the diversity of the population) and not by the operator itself.
2.7 Population-based Search
Generally, an algorithm for solving optimization problems is a sequence of computational
steps which asymptotically converges to an optimal solution. Most classical optimization
methods generate a deterministic sequence of computations based on the gradient or
higher-order derivatives of the objective function, applied to a single point in the
search space. The point is then gradually improved along the steepest descent direction
through iterations. This point-to-point approach carries the danger of falling into local
optima. GA, in contrast, performs a multi-directional search by maintaining a population of
potential solutions. This population-to-population approach helps the search escape from
local optima. The population undergoes a simulated evolution: at each generation the
relatively good solutions are reproduced, while the relatively bad solutions die. GA uses
probabilistic transition rules to select which solutions reproduce and which die, so as to
guide the search toward regions of the search space with likely improvement.
2.8 Building block hypothesis
Genetic algorithms are simple to implement, but their behaviour is difficult to understand. In
particular it is difficult to understand why these algorithms frequently succeed at generating
solutions of high fitness when applied to practical problems. The building block hypothesis
consists of:
(i) A description of a heuristic that performs adaptation by identifying and recombining
building blocks, i.e. low order, low defining-length schemata with above average fitness.
(ii) A hypothesis that a genetic algorithm performs adaptation by implicitly and efficiently
implementing this heuristic.
2.9 Implementation of Genetic Algorithm
GAs encode the decision variables of a search problem into finite-length strings over an
alphabet of certain cardinality. To evolve good solutions and to implement natural
selection, one needs
a measure for distinguishing good solutions from bad solutions. The measure could be an
objective function that is a mathematical model or a computer simulation. In essence, the
fitness measure must determine a candidate solution’s relative fitness, which will
subsequently be used by the GA to guide the evolution of good solutions.
Another important concept of GAs is the notion of population. The population size, which is
usually a user-specified parameter, is one of the important factors affecting the scalability and
performance of genetic algorithms. For example, small population sizes might lead to
premature convergence and yield substandard solutions. On the other hand, large population
sizes lead to unnecessary expenditure of valuable computational time [Sastry et.al. (2005)].
Once the problem is encoded in a chromosomal manner and a fitness measure for
discriminating good solutions from bad ones has been chosen, the GA can start evolving
solutions to the search problem using the steps already specified in Algorithm 2.1.
In the next subsections details of these steps will be discussed.
2.9.1 Initialization
Usually there are only two main components of most genetic algorithms that are problem
dependent: the problem encoding and the evaluation function. Consider a parameter
optimization problem where one must optimize a set of variables either to maximize some
target such as a profit, or to minimize cost or some measure of error. The goal is to set the
various parameters so as to optimize some output. In more traditional terms, some function
‘f’ should be minimized, or maximized.
The most common form of representing a solution as a chromosome is a string of binary
digits. Each bit in this string is a gene. The process of converting the solution from its
original form into the bit string is known as encoding. The specific encoding scheme used is
application dependent. The solution bit strings are decoded to enable their evaluation using a
fitness measure. Chromosomes are all of the same type and same length. The population size
remains constant from generation to generation.
2.9.2 Encoding
Encoding of chromosomes is the first question to ask when starting to solve a problem with
a genetic algorithm. Genetic algorithms work on the encoded space and the solution space
alternately: genetic operations work on the encoded space, i.e. chromosomes, while
evaluation and selection work on the solution space [Gen & Mitsuo (1996)]. A chromosome
should in some way contain information about the solution it represents. The most common
way of encoding is a binary string, in which each bit can represent some characteristic of
the solution; alternatively, the whole string can represent a number. Of course, there are
many other ways of encoding, and the choice depends mainly on the problem to be solved.
For example, one can encode integers or real numbers directly; sometimes it is useful to
encode permutations, and so on. Various types of encodings are available and can be
selected according to the nature of the problem.
Genetic algorithms follow two basic principles for choosing the encoding method namely:
1. The principle of meaningful building blocks: The schemata should be short, of low order,
and relatively unrelated to schemata over other fixed positions.
2. The principle of minimal alphabets: The alphabet of the encoding should be as small as
possible while still allowing a natural representation of solutions.
The first principle states that the user should select a coding such that the building blocks of
the underlying problem are small and relatively unrelated to building blocks at other
positions. The principle of meaningful building blocks is directly motivated by the schema
theorem. If schemata are highly fit, short and of low order, then their numbers exponentially
increase over the generations. If the high-quality schemata are long or of high order, they are
disrupted by crossover and mutation and they cannot be propagated properly. The second
principle states that the user should select the smallest alphabet that permits a natural
expression of the problem, so that the number of exploitable schemata is maximized
[Goldberg (1989)]. The principle of minimal alphabets tells us to increase the potential
number of schemata by reducing the cardinality of the alphabet: with a minimal alphabet the
number of possible schemata is maximal. This is why Goldberg advises the use of bit-string
representations, since high-quality schemata are more difficult to find when using
alphabets of higher cardinality.
2.9.2.1 Binary Encoding
The most common way of encoding is a binary string, which can be represented as in Fig.
2.2. Each chromosome is encoded as a binary (bit) string, and each bit can represent some
characteristic of the solution. Every bit string is therefore a solution, but not
necessarily the best solution. Alternatively, the whole string can represent a number. How
bit strings encode solutions differs from problem to problem.
Binary encoding gives many possible chromosomes with a small number of alleles. On the
other hand, this encoding is not natural for many problems, and sometimes corrections must
be made after a genetic operation is completed. Binary-coded strings of 1s and 0s are the
most widely used; the length of the string depends on the required accuracy. In this
encoding integers are represented exactly, while only a finite number of real numbers can
be represented; the number of representable real numbers increases with the string length.
Chromosome 1 1100011101110010
Chromosome 2 0110010101110011
Fig. 2.2 Binary Encoding
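The accuracy remark above can be made concrete with a small sketch. This is an assumed illustration (the [0, 10] range, the 16-bit length and the function names are invented): a real variable x in [lo, hi] maps to an n-bit string, so the resolution is (hi − lo)/(2ⁿ − 1).

```python
# Sketch of binary encoding of a real-valued decision variable.

def encode(x, lo, hi, n_bits):
    # Map x in [lo, hi] onto one of 2**n_bits discrete levels.
    level = round((x - lo) / (hi - lo) * (2**n_bits - 1))
    return format(level, f"0{n_bits}b")       # a bit string like those in Fig. 2.2

def decode(bits, lo, hi):
    # Inverse mapping: bit string back to a real value.
    level = int(bits, 2)
    return lo + level * (hi - lo) / (2**len(bits) - 1)

chrom = encode(3.7, 0.0, 10.0, 16)
x = decode(chrom, 0.0, 10.0)
print(chrom, round(x, 4))   # decoded value lies within one resolution step of 3.7
```

With 16 bits the resolution on [0, 10] is 10/65535 ≈ 0.00015; longer strings give finer accuracy, matching the statement that string length depends on the accuracy required.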
2.9.2.2 Octal Encoding
This encoding uses strings made up of octal numbers (0–7). The basic advantage of this
encoding scheme over binary encoding is its smaller size.
Chromosome 1 20346151
Chromosome 2 12670231
Fig. 2.3 Octal Encoding
2.9.2.3 Hexadecimal Encoding
This encoding uses strings made up of hexadecimal numbers (0–9, A–F). The advantage of
this encoding scheme over binary and octal encoding is again its smaller size.
Chromosome 1 A09B
Chromosome 2 932F
Fig. 2.4 Hexadecimal Encoding
2.9.2.4 Gray Encoding
An ordinary binary representation of the variable values may slow the convergence of a GA,
and increasing the number of bits in the representation magnifies the problem [Haupt and
Haupt (1998)]. Gray code can avoid this problem by redefining the binary numbers so that
consecutive numbers have a Hamming distance of one [Taub and Schilling (1986)]. Gray codes
speed up convergence by keeping the algorithm's attention on converging toward a solution
[Caruana and Schaffer (1988)]. As in binary strings, however, a change to a single bit in
an arbitrary location of a Gray-coded string may still cause a large change in the decoded
integer value, and decoding Gray-coded strings into the corresponding decision variables
introduces an artificial non-linearity in the relationship between the string and the
decoded value.
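The Hamming-distance-of-one property can be illustrated with the standard binary-reflected Gray code. This is a sketch using the textbook conversion formulas, not an operator taken from this thesis:

```python
# Binary-reflected Gray code: consecutive integers differ in exactly one bit.

def to_gray(n):
    return n ^ (n >> 1)

def from_gray(g):
    # Undo the XOR cascade to recover the ordinary binary value.
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

# Consecutive Gray codes always have Hamming distance 1:
for i in range(7):
    diff = to_gray(i) ^ to_gray(i + 1)
    assert bin(diff).count("1") == 1

print([format(to_gray(i), "03b") for i in range(8)])
# → ['000', '001', '011', '010', '110', '111', '101', '100']
```

Note that under ordinary binary coding the step from 3 (011) to 4 (100) flips three bits, whereas in the Gray sequence above every step flips exactly one, which is what keeps small moves in the integer space small in the string space.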
2.9.2.5 Permutation Encoding
In permutation encoding, every chromosome is a string of integer/real values which
represents a number in a sequence. Permutation encoding is useful only for ordering
problems. Even for these problems, for some types of crossover and mutation, corrections
must be made after the genetic operation is completed in order to keep the chromosome
consistent, i.e., a valid permutation.
Chromosome 1 1 4 2 7 3 9 8 6 5
Chromosome 2 2 6 1 9 3 7 4 5 8
Fig. 2.5 Permutation Encoding
Permutations are also important for scheduling applications, variants of which are often
NP-complete. This encoding is also called path representation or order representation
[Starkweather et al. (1991)].
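The consistency corrections mentioned above arise because a naive one-point crossover of two permutations generally duplicates some values and drops others. Ordering problems therefore use permutation-preserving operators; the order crossover (OX) below is one standard example, given here as an assumed illustration rather than the operator used later in this thesis:

```python
import random

random.seed(0)

# Order crossover (OX): every child is guaranteed to be a valid permutation.
def order_crossover(p1, p2):
    n = len(p1)
    a, b = sorted(random.sample(range(n), 2))
    child = [None] * n
    child[a:b] = p1[a:b]                      # copy one slice from parent 1
    fill = [g for g in p2 if g not in child]  # remaining genes, in parent 2's order
    for i in range(n):
        if child[i] is None:
            child[i] = fill.pop(0)
    return child

p1 = [1, 4, 2, 7, 3, 9, 8, 6, 5]              # the chromosomes of Fig. 2.5
p2 = [2, 6, 1, 9, 3, 7, 4, 5, 8]
child = order_crossover(p1, p2)
print(child)
assert sorted(child) == sorted(p1)            # still a valid permutation
```

A naive single-point cut of these two parents could, for example, produce a child containing the value 9 twice and missing 8 entirely; OX avoids the need for any after-the-fact repair.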
2.9.2.6 Value Encoding
In value encoding, every chromosome is a string of values, and the values can be anything
connected to the problem: numbers, real numbers, characters or even complicated objects.
Direct value encoding is used in problems where complicated values, such as real numbers,
are involved and where binary encoding would be very difficult. This encoding produces very
good results for some special problems; on the other hand, it is often necessary to develop
new crossover and mutation operators specific to the problem.
Chromosome 1 2.125 0.2398 8.0127 0.0932 3.2917
Chromosome 2 AFJBADCEHISTKJHTP
Chromosome 3 (back) (back) (right) (left) (forward)
Fig. 2.6 Value Encoding
2.9.2.7 Tree encoding
Tree encoding is used mainly for genetic programming. In the tree encoding every
chromosome is a tree of some objects, such as functions or commands in programming
language. The representation space is defined by defining the set of functions and terminals
to label the nodes in the trees. Trees provide rich representation that is sufficient to represent
computer programs, analytical functions, and variable length structure, even computer
hardware. The parse tree is a popular representation for evolving executable structures
[Back et al. (1997)]. It incorporates a natural recursive definition, which allows for
dynamically sized structures, though most parse tree representations place a restriction on
the size of evolving programs. In a parse tree representation, the contents of the parse
tree determine the power and suitability of the representation. Due to the acyclic nature
of parse trees, iterative computations are not naturally represented and it is very
difficult to identify the stopping criteria, so the evolved function is evaluated within an
implied loop that re-executes it until some predetermined stopping criterion is satisfied.
Tree encoding allows the search space to be open ended, but as a consequence a tree may
grow in an uncontrolled way. Large trees are difficult to understand and simplify, and they
hinder structured, hierarchical candidate solutions. If no restriction is placed on the
size of evolving programs, their growth can swamp the available computational resources.
Size restriction is implemented in two ways: depth limitation restricts the size of an
evolving parse tree based on a user-defined maximal depth parameter, while node limitation
places a limit on the total number of nodes available to an individual parse tree. Node
limitation is preferred over depth limitation because it encodes fewer restrictions on the
structural organization of evolving programs [Angeline (1996)].
Fig.2.7 Image courtesy: http://www.myreaders.info/09 Genetic Algorithms.pdf
2.9.3 Fitness Evaluation
A fitness function is a particular type of objective function that prescribes the
optimality of a solution in a genetic algorithm, so that a particular chromosome may be
ranked against all the other chromosomes. Optimal chromosomes, or at least the more nearly
optimal ones, are allowed to breed and mix their genetic material by any of several
techniques, producing a new generation that will hopefully be even better. An ideal fitness
function correlates closely with the algorithm's goal and can be computed quickly. Speed of
execution is very important, as a typical genetic algorithm [DeJong (2006)] must be
iterated many times to produce a usable result for a non-trivial problem. This is one of
the main drawbacks of GAs in real-world applications and limits their applicability in some
industries.
Sometimes approximate models may be one of the most promising approaches, especially in
the following cases:
• Fitness computation time of a single solution is extremely high,
• Precise model for fitness computation is missing,
• The fitness function is uncertain or noisy.
Another way of looking at fitness functions is in terms of a fitness landscape, which shows
the fitness for each possible chromosome. Definition of the fitness function is not
straightforward in many cases and often is performed iteratively if the fittest solutions
produced by GA are not what is desired. In some cases, it is very hard or impossible to come
up even with a guess of what the fitness function definition might be. Interactive genetic
algorithms [Kershenbaum (1996), Davis (1987)] address this difficulty to some extent.
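One simple mitigation of expensive fitness computation, complementary to the approximate models listed above, is to cache evaluations when identical chromosomes recur across generations. The sketch below is an assumed illustration; the x² fitness is a stand-in for a costly simulation or model:

```python
from functools import lru_cache

calls = 0   # counts how often the expensive evaluation actually runs

@lru_cache(maxsize=None)
def fitness(bits):
    # The chromosome is passed as a hashable string so results can be cached.
    global calls
    calls += 1
    x = int(bits, 2)
    return x * x        # stand-in for a costly simulation or model

for _ in range(3):
    fitness("11111")    # repeated chromosomes hit the cache, not the model

print(fitness("11111"), calls)   # → 961 1
```

Caching helps only when duplicates are frequent (small alphabets, converging populations); for genuinely unique chromosomes, surrogate models remain the more promising approach, as noted above.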
2.9.4 Genetic Operators
A Genetic Operator is an operator used in genetic algorithms to maintain genetic diversity.
Genetic variation is a necessity for the process of evolution. Genetic operators used in genetic
algorithms are analogous to those which occur in the natural world: survival of the fittest, or
selection; reproduction (crossover, also called recombination); and mutation. Genetic
diversity, the level of biodiversity, refers to the total number of genetic characteristics in the
genetic makeup of a species. It is distinguished from genetic variability, which describes the
tendency of genetic characteristics to vary.
When GA proceeds, both the search direction to optimal solution and the search speed should
be considered as important factors, in order to keep a balance between exploration and
exploitation in search space. In general, the exploitation of the accumulated information
resulting from GA search is done by the selection mechanism, while the exploration to new
regions of the search space is accounted for by genetic operators. The genetic operators
mimic the process of heredity of genes to create new offspring at each generation. The
operators are used to alter the genetic composition of individuals during reproduction. In
essence, the operators perform a random search and cannot guarantee an improved offspring.
There are three common genetic operators: selection, crossover and mutation.
2.9.4.1 Selection
Selection is the process of selecting two or more parents from the population for crossing.
After deciding on an encoding, the next step is to decide how to perform selection i.e., how to
choose individuals in the population that will create offspring for the next generation and how
many offspring each will create. The purpose of selection is to emphasize fitter individuals in
the population in hopes that their off springs have higher fitness. Chromosomes are selected
from the initial population to be parents for reproduction. The problem is how to select these
chromosomes. According to Darwin’s theory of evolution the best ones survive to create new
offspring.
Selection is a method that randomly picks chromosomes out of the population according to
their evaluation function. The higher the fitness function, the more chance an individual has
to be selected. The selection pressure is defined as the degree to which the better individuals are favoured: the higher the selection pressure, the more the better individuals are favoured.
This selection pressure drives the GA to improve the population fitness over the successive
generations.
The convergence rate of GA is largely determined by the magnitude of the selection pressure,
with higher selection pressures resulting in higher convergence rates. Genetic Algorithms
should be able to identify optimal or near-optimal solutions under a wide range of selection pressures. However, if the selection pressure is too low, the convergence rate will be slow, and the GA will take an unnecessarily long time to find the optimal solution. If the selection pressure is too high, there is an increased chance of the GA prematurely converging to an incorrect (sub-optimal) solution. In addition to providing selection pressure, selection schemes should also preserve population diversity, as this helps to avoid premature convergence [Sivanandam et al. (2008)].
Typically one can distinguish two types of selection scheme, proportionate selection and
ordinal-based selection. Proportionate-based selection picks out individuals based upon their
fitness values relative to the fitness of the other individuals in the population. Ordinal-based
selection schemes select individuals not upon their raw fitness, but upon their rank within the population. This makes the selection pressure independent of the fitness distribution of the population; it is based solely on the relative ordering (ranking) of the population.
It is also possible to use a scaling function to redistribute the fitness range of the population
in order to adapt the selection pressure. Selection has to be balanced with variation from crossover and mutation: too strong a selection means that sub-optimal but highly fit individuals will take over the population, reducing the diversity needed for change and progress, while too weak a selection will result in too slow an evolution. The selection methods generally used are:
Roulette Wheel Selection
Roulette selection is one of the traditional GA selection techniques. This reproduction
operator is the proportionate reproductive operator where a string is selected from the mating
pool with a probability proportional to the fitness. The principle of roulette selection is a
linear search through a roulette wheel with the slots in the wheel weighted in proportion to
the individual’s fitness values. A target value is set, which is a random proportion of the sum
of the fitnesses in the population. The population is stepped through until the target value is
reached. This is only a moderately strong selection technique, since fit individuals are not guaranteed to be selected, but merely have a greater chance. A fit individual will contribute more to the target value, but if the target is not yet exceeded, the next chromosome in line has a chance, and it may be weak. It is essential that the population not be sorted by fitness, since this would dramatically bias the selection. The roulette wheel selection mechanism is shown in the figure below:
Fig. 2.8 Roulette Wheel Selection Mechanism for four chromosomes
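The linear search described above can be sketched in Python. This is a minimal illustration, not a production implementation; the population and fitness values passed to it are whatever the caller supplies.

```python
import random

def roulette_wheel_select(population, fitnesses):
    """Pick one individual with probability proportional to its fitness.

    A random target value is drawn from [0, total fitness), and the
    population is stepped through until the running sum of fitnesses
    reaches the target, exactly as described in the text.
    """
    total = sum(fitnesses)
    target = random.uniform(0, total)
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if running >= target:
            return individual
    return population[-1]  # guard against floating-point round-off
```

Note that the individual with the largest fitness slice is merely more likely to be chosen; a weak individual can still win, which is what keeps this selection pressure moderate.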
Random Selection
This technique randomly selects a parent from the population. In terms of disruption of
genetic codes, random selection is a little more disruptive, on average, than roulette wheel
selection.
Rank Selection
Roulette selection has problems when fitness values differ greatly. For example, if the best chromosome's fitness occupies 90% of the roulette wheel, then the other chromosomes will have very little chance of being selected.
Rank selection first ranks the population, and then every chromosome receives a fitness value from this ranking. The worst will have fitness 1, the second worst fitness 2, and so on; the best will have fitness N (the number of chromosomes in the population).
Figures 2.9(a) and 2.9(b) show how the situation changes when raw fitness values are replaced by rank (order) numbers.
Fig. 2.9(a) Situation before Ranking (Graph of fitness)
Fig. 2.9(b) Situation after Ranking (Graph of order numbers)
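A sketch of this scheme in Python: each individual is assigned its rank (worst = 1, best = N) and the roulette-style draw is then made over ranks rather than raw fitnesses, so a single dominant fitness value cannot monopolise the wheel. This is illustrative only.

```python
import random

def rank_select(population, fitnesses):
    """Select one individual by rank rather than raw fitness.

    The worst individual receives rank 1 and the best rank N, as in
    the text; selection probability is proportional to rank.
    """
    n = len(population)
    # Sort indices from worst to best fitness; rank = position + 1.
    order = sorted(range(n), key=lambda i: fitnesses[i])
    ranks = [0] * n
    for position, idx in enumerate(order):
        ranks[idx] = position + 1          # worst -> 1, best -> n
    target = random.uniform(0, n * (n + 1) / 2)  # sum of ranks 1..n
    running = 0.0
    for individual, rank in zip(population, ranks):
        running += rank
        if running >= target:
            return individual
    return population[-1]
```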
Tournament Selection
An ideal selection strategy should be such that it is able to adjust its selective pressure and
population diversity so as to fine-tune GA search performance. Unlike roulette wheel selection, the tournament selection strategy provides selective pressure by holding a tournament competition among Nu randomly chosen individuals. The winner of the tournament is the individual with the highest fitness among the Nu competitors. Tournament winners are then inserted into the mating pool. The
tournament competition is repeated until the mating pool for generating new offspring is
filled. The mating pool, comprising the tournament winners, has a higher average population fitness. The fitness difference provides the selection pressure, which drives the GA to improve the fitness of succeeding generations. This method is efficient and leads to an optimal
solution.
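Tournament selection is short enough to sketch directly. In this illustrative Python version the tournament size plays the role of Nu in the text; a size of 2 (binary tournament) is a common default, though that choice is an assumption here.

```python
import random

def tournament_select(population, fitnesses, tournament_size=2):
    """Hold a tournament among `tournament_size` randomly chosen
    individuals and return the fittest competitor (the winner)."""
    competitors = random.sample(range(len(population)), tournament_size)
    winner = max(competitors, key=lambda i: fitnesses[i])
    return population[winner]
```

Raising the tournament size raises the selection pressure: with a tournament the size of the whole population, the best individual always wins.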
2.9.4.2 Crossover
Crossover depends upon the encoding scheme used for the problem. Crossover operates on
selected genes from parent chromosomes and creates new offspring. The simplest way of
performing crossover is to choose randomly some crossover point and copy everything before
this point from the first parent and then copy everything after the crossover point from the
other parent. There exist many other ways to perform crossover like n-point crossover,
uniform crossover, order crossover etc. Crossover can be quite complicated and depends
mainly on the encoding of chromosomes [Beasley, Bull & Martin (1993)].
The various types of crossover are discussed below:
One Point Crossover
The traditional genetic algorithm uses single point crossover, where the two mating
chromosomes are cut once at corresponding points and the sections after the cuts exchanged.
Here, a cross-site or crossover point is selected randomly along the length of the mated strings, and the bits next to the cross-site are exchanged. If an appropriate site is chosen, better children can be obtained by combining good parents; otherwise, string quality is severely hampered.
The following Fig. 2.10 illustrates single point crossover and it can be observed that the bits
next to the crossover point are exchanged to produce children. The crossover point can be
chosen randomly.
11001011 + 11011111 = 11001111
Fig. 2.10 One Point Crossover
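A minimal Python sketch of one-point crossover on bit strings; when the random cut falls after the fourth bit it reproduces the Fig. 2.10 example ('11001011' with '11011111' giving '11001111' as one child).

```python
import random

def one_point_crossover(parent1, parent2):
    """Cut both parent strings at one random point and swap the tails,
    producing two children."""
    assert len(parent1) == len(parent2)
    point = random.randint(1, len(parent1) - 1)  # cut strictly inside
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2
```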
Two Point Crossover
Apart from single point crossover, many different crossover algorithms have been devised,
often involving more than one cut point. It should be noted that adding further crossover
points reduces the performance of the GA. The problem with adding additional crossover
points is that building blocks are more likely to be disrupted.
However, an advantage of having more crossover points is that the problem space may be
searched more thoroughly. In two-point crossover, two crossover points are chosen and the
contents between these points are exchanged between two mated parents as shown in
fig.2.11.
The main problem with one-point crossover is that the head and the tail of one chromosome cannot be passed together to the offspring. If both the head and the tail of a chromosome contain good genetic information, none of the offspring obtained directly with
one-point crossover will share the two good features. Two-point crossover avoids this drawback and is therefore generally considered better than one-point crossover.
11001011 + 11011111 = 11011111
Fig. 2.11 Two Point Crossover
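The two-point variant can be sketched the same way: two cut points are drawn and only the middle segment is exchanged, so each child keeps one parent's head and tail together. This is an illustrative sketch.

```python
import random

def two_point_crossover(parent1, parent2):
    """Choose two cut points and exchange the segment between them,
    so the head and tail of each parent stay together in one child."""
    assert len(parent1) == len(parent2)
    a, b = sorted(random.sample(range(1, len(parent1)), 2))
    child1 = parent1[:a] + parent2[a:b] + parent1[b:]
    child2 = parent2[:a] + parent1[a:b] + parent2[b:]
    return child1, child2
```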
Multi-Point Crossover (N-Point crossover)
The problem found in one-point crossover may also occur in two-point crossover. In fact, this problem can be generalized to each gene position in a chromosome. Genes that are close together on a chromosome have a greater chance of being passed together to the offspring obtained through an N-point crossover. Consequently, the efficiency of an N-point crossover depends on the position of the genes within the chromosome. In a genetic representation, genes that encode dependent characteristics of the solution should therefore be placed close together. This situation in GA is known as the gene locus problem, which can be eliminated with the help of uniform crossover.
Uniform Crossover
Uniform crossover is quite different from the N-point crossover. Each gene in the offspring is
created by copying the corresponding gene from one or the other parent chosen according to
a randomly generated binary crossover mask of the same length as the chromosomes. Where there is a 1 in the crossover mask, the gene is copied from the first parent, and where there is a 0 in the mask, the gene is copied from the second parent. A new crossover mask is randomly generated for each pair of parents. Offspring therefore contain a mixture of genes from each parent. The number of effective crossing points is not fixed, but averages half the chromosome length. In Fig. 2.12, new children are produced using the uniform crossover
approach.
11001011 + 11011101 = 11011111
Fig. 2.12 Uniform Crossover
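The mask-based copying described above can be sketched as follows; the second child simply uses the mask with the roles of the parents reversed. This is a minimal illustration.

```python
import random

def uniform_crossover(parent1, parent2):
    """Build each child gene-by-gene from a random binary mask: a 1
    copies the gene from the first parent, a 0 from the second (and
    vice versa for the second child)."""
    assert len(parent1) == len(parent2)
    mask = [random.randint(0, 1) for _ in parent1]
    child1 = "".join(a if m else b
                     for a, b, m in zip(parent1, parent2, mask))
    child2 = "".join(b if m else a
                     for a, b, m in zip(parent1, parent2, mask))
    return child1, child2
```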
Arithmetic Crossover
Chromosomes having real value or floating point representation undergo arithmetic
crossover. This crossover creates a new allele at each gene position in the offspring, whose value lies between the values of the parent alleles. The new allele values for the offspring are computed using the following equations:
Offspring1 = w*Parent1 + (1-w)*Parent2
Offspring2 = (1-w)*Parent1 + w*Parent2
where w (0 ≤ w ≤ 1) is a constant weight factor.
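The two equations translate directly into code for real-valued chromosomes represented as lists of floats; the weight value used in the example below is arbitrary.

```python
def arithmetic_crossover(parent1, parent2, w=0.7):
    """Blend two real-valued chromosomes gene-by-gene using the
    equations above; each child allele lies between the parent
    alleles whenever 0 <= w <= 1."""
    child1 = [w * a + (1 - w) * b for a, b in zip(parent1, parent2)]
    child2 = [(1 - w) * a + w * b for a, b in zip(parent1, parent2)]
    return child1, child2
```

With w = 0.5 both children coincide at the midpoint of the parents, e.g. parents [1.0, 2.0] and [3.0, 6.0] both yield [2.0, 4.0].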
2.9.4.3 Mutation
Mutation is a background operator which produces spontaneous random changes in various
chromosomes. A simple way to achieve mutation would be to alter one or more genes. In
GA, mutation serves the crucial role of either (a) replacing the genes lost from the population
during the selection process so that they can be tried in a new context or (b) providing the
genes that were not present in the initial population. The mutation probability is defined as the percentage of the total number of genes in the population that are mutated. It
controls the probability with which new genes are introduced into the population for trial. If it
is too low, many genes that would have been useful are never tried out, while if it is too high,
there will be much random perturbation, the offspring will start losing their resemblance to
the parents, and the algorithm [Knuth (1997)] will lose the ability to learn from the history of
the search.
There are a number of techniques for mutation, some of which are discussed below:
Flipping
Flipping of a bit involves changing a 0 to 1 and a 1 to 0, based on a generated mutation chromosome. Fig. 2.13 illustrates the flipping concept.
11001001 => 10001001
Fig.2.13 Flipping Mutation
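A sketch of bit-flip mutation in Python: rather than a pre-generated mutation chromosome, this version flips each bit independently with the given mutation probability, which is the usual equivalent formulation.

```python
import random

def flip_mutation(chromosome, mutation_rate=0.01):
    """Flip each bit independently with probability `mutation_rate`
    (0 becomes 1 and 1 becomes 0)."""
    return "".join(
        ("1" if bit == "0" else "0")
        if random.random() < mutation_rate else bit
        for bit in chromosome
    )
```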
Interchanging
Two random positions of the string are chosen and the bits corresponding to those positions
are interchanged. This is shown in Fig. 2.14.
Parent 10100110
Child 11100010
Fig. 2.14 Interchanging Mutation
Reversing
A random position is chosen, the bits following that position are reversed, and a child chromosome is produced. This is shown in Fig. 2.15.
Parent 10100110
Child 10100011
Fig. 2.15 Reversing Mutation
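The interchanging and reversing operators of Figs. 2.14 and 2.15 can be sketched together; both permute existing bits, so neither changes the multiset of genes in the chromosome. Illustrative only.

```python
import random

def interchange_mutation(chromosome):
    """Swap the bits at two randomly chosen positions (Fig. 2.14)."""
    i, j = random.sample(range(len(chromosome)), 2)
    bits = list(chromosome)
    bits[i], bits[j] = bits[j], bits[i]
    return "".join(bits)

def reverse_mutation(chromosome):
    """Reverse the bits following a randomly chosen position
    (Fig. 2.15)."""
    point = random.randrange(len(chromosome))
    return chromosome[:point] + chromosome[point:][::-1]
```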
2.9.5 Replacement
This is the last stage of the breeding cycle and acts as a building block for the next iteration. When a new generation of offspring is produced, the major question is which of these newly generated offspring should move forward to the next generation, and which chromosomes of the current generation they should replace. The answer to this question again lies in Darwin's principle of "survival of the fittest" [Fogel (1995)]: better-fit individuals should have more chances to survive and be carried forward to the next generation, leaving behind the less fit ones. The process of forming the next generation of individuals by replacing or removing some offspring or parent individuals is carried out by a replacement scheme [Sivanandam et al. (2008)]. Basically, there are two kinds of replacement strategies for maintaining the population: generational replacement and steady-state replacement. In
generational replacement, the entire population of genomes is replaced at each generation. With elitism, the complete population is replaced except for the best member of each generation, which is carried over to the next generation without modification [Affenzeller, Winkler & Wagner (2009)]. In this case, generations are non-overlapping. Steady-state
replacement involves overlapping populations, in which only a small fraction of the population is replaced in each iteration. In steady-state replacement, new individuals are inserted into the population as soon as they are created [Sarma & De Jong (1997)]. The various replacement techniques are discussed below:
Random Replacement
The children replace two randomly chosen individuals in the population. The parents are also
candidates for selection. This can be useful for continuing the search in small populations,
since weak individuals can be introduced into the population.
Weak Parent Replacement
In weak parent replacement, a weaker parent is replaced by a strong child. Of the four individuals involved, only the fittest two, parent or child, return to the population. This process improves the overall fitness of the population.
Both Parents
Both-parents replacement is simple: the children replace the parents, so each individual only gets to breed once. As a result, the population and its genetic material move around, but this leads to a problem when combined with a selection technique that strongly favours fit parents: the fit breed and are then disposed of.
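As an illustration of one of the strategies above, generational replacement with elitism can be sketched as follows: the whole population is replaced by the offspring, except that the single best member of the old generation survives unchanged. The function signature is a sketch, not a standard API.

```python
def generational_replacement_with_elitism(population, offspring, fitness):
    """Replace the whole population with the offspring, except that
    the best member of the old generation is carried over unchanged.
    `fitness` is a callable mapping an individual to a number."""
    elite = max(population, key=fitness)
    # Keep the population size constant: drop one offspring slot
    # to make room for the elite individual.
    return offspring[:len(population) - 1] + [elite]
```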
2.9.6 Termination
This step terminates the algorithm. Although termination depends on the problem and the user, the following are some general criteria under which the algorithm can be terminated.
• Maximum generations–The genetic algorithm stops when the specified number of generations has evolved.
• Elapsed time–The genetic process will end when a specified time has elapsed.
• No change in fitness–The genetic process will end if there is no change to the population’s
best fitness for a specified number of generations.
• Stall generations–The algorithm stops if there is no improvement in the objective function
for a sequence of consecutive generations of length Stall generations.
• Stall time limit–The algorithm stops if there is no improvement in the objective function
during an interval of time in seconds equal to stall time limit.
The termination or convergence criterion finally brings the search to a halt.
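Two of the criteria above, maximum generations and stall generations, can be combined into a single stopping test. The parameter values below are placeholders; in practice they are chosen per problem.

```python
def should_terminate(generation, best_history,
                     max_generations=100, stall_generations=20):
    """Check two criteria: a maximum generation count, and a stall
    limit (no improvement in the best fitness for a run of
    consecutive generations). `best_history` holds the best fitness
    recorded at each generation so far."""
    if generation >= max_generations:
        return True
    if len(best_history) > stall_generations:
        recent = best_history[-stall_generations:]
        # No recent value beats the value just before the window.
        if max(recent) <= best_history[-stall_generations - 1]:
            return True
    return False
```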
2.9.7 Parameters
There are a number of parameters [Merek (1998)] that control the precise operation of the genetic algorithm. Some of the most important are as follows:
Crossover probability: This is a measure of how often crossover will be performed. If the crossover probability is 100%, then all offspring are made by crossover; if it is 0%, the whole new generation is made from exact copies of chromosomes from the old population. Crossover is performed in the hope that the new chromosomes will contain good parts of the old chromosomes and will therefore be better. Crossover rates should generally be high, about 80-95%, though in some problems a crossover rate of 60% is sufficient.
Mutation probability: This is a measure of how often parts of a chromosome will be mutated. If the mutation probability is 100%, the whole chromosome is changed; if it is 0%, nothing is
changed. Mutation generally prevents the genetic algorithm from falling into local extremes and helps in recovering lost genetic material. Mutation should not occur very often, because the genetic algorithm would then degenerate into a random search. The mutation rate should generally be very low, around 0.5-1%.
Population size: This is the number of chromosomes present in the population (representing one generation). If there are too few chromosomes, the genetic algorithm has few options available for crossover and only a small part of the search space is explored. Conversely, if there are too many chromosomes in one population, the genetic algorithm slows down. It is quite surprising that a higher population size does not always improve the performance of a GA. A population size of 20-30 is usually found to be good enough, though in some specialized problems population sizes of 50-100 are reported as best.
Other parameters: Encoding depends upon the problem type and the size of the problem instance. The selection method also depends upon the problem, although the generally used selection methods are roulette wheel, tournament and rank selection.
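The parameters above can be tied together in a minimal generational GA loop. The sketch below uses binary tournament selection, one-point crossover applied with the crossover rate, and per-bit mutation with the mutation rate; all default parameter values are illustrative, and the one-max fitness at the end is a toy example, not a YM objective.

```python
import random

def run_ga(fitness, chromosome_length=8, population_size=20,
           crossover_rate=0.9, mutation_rate=0.01, generations=50):
    """Minimal generational GA. `fitness` maps a bit string to a
    number to be maximised."""
    def mutate(chrom):
        # Flip each bit independently with probability mutation_rate.
        return "".join(
            ("1" if b == "0" else "0")
            if random.random() < mutation_rate else b
            for b in chrom)

    population = ["".join(random.choice("01")
                          for _ in range(chromosome_length))
                  for _ in range(population_size)]
    for _ in range(generations):
        offspring = []
        while len(offspring) < population_size:
            # Binary tournament selection for each parent.
            p1 = max(random.sample(population, 2), key=fitness)
            p2 = max(random.sample(population, 2), key=fitness)
            if random.random() < crossover_rate:
                point = random.randint(1, chromosome_length - 1)
                c1 = p1[:point] + p2[point:]
                c2 = p2[:point] + p1[point:]
            else:
                c1, c2 = p1, p2
            offspring += [mutate(c1), mutate(c2)]
        population = offspring[:population_size]
    return max(population, key=fitness)

# Toy usage: one-max, i.e. maximise the number of 1s in the string.
best = run_ga(lambda s: s.count("1"))
```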
2.10 Advantages of GA systems
There are a number of advantages of Genetic Algorithms, some of which are as follows:
The main advantage of the GA lies in its parallelism. Most search techniques start from one point and work with a single point in each iteration until a final solution is reached, so a problem of local maxima may exist in them. The GA, by contrast, starts with multiple points in the search space, and hence the problem of local maxima generally does not arise.
The GA is much easier to implement as compared to other techniques as it requires no
knowledge or gradient information about the response surface. The advantage of the
GA approach is the ease with which it can handle arbitrary kinds of constraints and
objectives; all such things can be handled as weighted components of the fitness
function, making it easy to adapt the GA scheduler to the particular requirements of a
very wide range of possible overall objectives.
GA can be used when no algorithms or heuristics are available for solving a problem.
A GA based system can be built as long as a solution representation and an evaluation
scheme can be worked out. Since it only requires the description of a good solution
and not how to achieve it, the need for expert access is minimized.
Optimization problems in which the constraints and objective functions are non-linear
and/or discontinuous are not amenable to solution by traditional methods such as
linear programming. The GA can solve such problems. It does not guarantee optimal solutions, but produces near-optimal solutions which are likely to be very good.
Solution time with GA is highly predictable – it is determined by the size of the
population, time taken to decode and evaluate a solution and the number of
generations of population.
GAs use simple operations, but are able to solve problems which are computationally prohibitive for traditional algorithmic and numerical techniques. One example is the travelling salesman problem (TSP).
2.11 Limitations of GA based systems
Although GAs have a number of advantages, they also have some limitations, some of which are described below:
One of the biggest problems in implementing a GA is the identification of the fitness function. As the optimal solution depends heavily on the fitness function, it must be determined accurately. There are no standard techniques available to define a fitness function; it is the sole responsibility of the user to define it.
Sometimes premature convergence can occur, and the diversity in the population is therefore lost; maintaining diversity is one of the major objectives of a GA.
Another problem relates to the choice of various parameters, such as the population size, mutation rate, crossover rate, and the selection method and its strength.
The termination criteria are also not standardized; to date, no single effective termination criterion has been identified.
GAs themselves are blind to the optimization process, as they only look at the fitness value of each chromosome, without knowing what that fitness value actually means. As a result, their capability to explain why a particular solution was arrived at is practically nil.
Although GA are moderately scalable – an increased number of variables can be
accommodated by increasing the length of the chromosome – a longer chromosome
also makes finding the solution more time-consuming. The longer the chromosome, the larger the population needs to be, since there are more potential combinations of genes. This results in more time required for decoding and fitness evaluation.
In general, GAs do not require extensive access to data. But some applications may need to access and process data from the organization's databases in order to
evaluate the fitness of solutions. For these applications, the quality and quantity of data are important.
2.12 Applications of Genetic Algorithm
Genetic algorithms have been used for difficult problems (such as NP-hard problems), for
machine learning, and for evolving simple programs. They have also been used for art, evolving pictures and music. A few applications of GA are as follows:
Business: Genetic Algorithms have been used to solve many different types
of business problems in functional areas such as finance, marketing, information
systems, and production/operations. Within these functional areas, GAs have performed a variety of applications such as tactical asset allocation, job scheduling, machine-part grouping, and computer network design.
Optimization: GAs have been used in a wide variety of optimization tasks, including
numerical optimization, and combinatorial optimization problems such as traveling
salesman problem (TSP), circuit design [Louis (1993)], job shop scheduling [Goldstein (1991)], video and sound quality optimization, telecommunication routing, the state assignment problem, timetabling, and traffic and shipment routing.
Automatic Programming: They are used to evolve computer programs for specific
tasks and to design other computational structures as in Cellular automata and sorting
networks.
Design: They are also used to optimize the structure and operational design of
buildings, factories, machines etc. They are used to design heat exchangers, robot
gripping arms, flywheels, turbines etc.
Robotics: A robot's design depends on the job it is intended to do. With the help of genetic algorithms, a range of optimal designs and components can be searched for each specific use, yielding entirely new types of robots.
Machine Learning: These algorithms are used for machine learning applications such as prediction and protein structure prediction. They are also used to design neural networks, and to evolve rules for learning classifier systems and symbolic production systems.
Evolvable Hardware: Genetic algorithms are used to develop computer models that use stochastic operators to evolve new configurations from old ones, so as to develop new electronic circuits that can be termed evolvable hardware.
Game Playing: Genetic algorithms are also applied in game theory, and so are widely used in developing computer games and simulated environments.
Encryption and code breaking: Genetic algorithms can be used both to create encryption for sensitive data and to break such codes.
Image processing: With medical X-rays or satellite images, there is often a need to
align two images of the same area, taken at different times. By comparing a random
sample of points on the two images, a GA can efficiently find a set of equations that transform one image to fit onto the other [Goldberg (1989)].
2.13 Applications of Genetic Algorithm in Yield Management
Yield management, as discussed in the previous chapter, is the problem of maximizing revenue by selling the right inventory unit to the right type of customer, at the right time and for the right price. The basic conditions for applying yield management are perishable inventory, price discrimination and fixed capacity. On the basis of these conditions, it was observed that YM is basically an optimization problem. In the present chapter, it has been observed that GA is a very effective approach for optimization.
So far, not much work applying GA to YM has been found in the literature. Still, some applications have been identified. These comprise a decision-making tool for YM [Pulugutha et al. (2003); Jeng (2011)], an air traffic control system [Xiao-Bing et al. (2007)], a choice-based network revenue model [Etebari et al. (2011)], crop yield management [Martin (2009)], airline booking [George et al. (2012)], pricing inventory [Ganji et al. (2013)], advertising time allocation [Reza Alaei et al. (2011)], and project planning [Karova et al. (2008)].
2.14 Summary
Genetic Algorithm is an algorithm based on Darwin's theory of "survival of the fittest". It tries to replicate this theory in various problems and has been found to be
quite successful. The algorithm is based on the various steps such as initialization, selection,
crossover, mutation and replacement. The biggest problem in implementing a GA is
identifying the fitness function. However, if the fitness function is accurately identified, the
GA can converge in a speedy manner. Another important aspect in using a GA is the use of
operators which are selection, crossover and mutation. The selection mechanism depends on
the problem, though it should be selected so that neither the convergence is premature nor it
is very slow. Crossover and mutation are very important aspects for maintaining the diversity
in the population. The probabilities of crossover and mutation should be chosen such that neither are the more fit solutions lost nor is diversity lost. The algorithm should terminate in a finite number of steps, depending on various criteria. The major advantage of the GA is its parallelism, i.e. it works on multiple solutions in the search space simultaneously, as compared with methods such as hill climbing which work from a single point.
The advantage of starting with multiple points is that the solution will not be trapped in local
maxima and the chances of finding the global maximum are very high. GAs can be used in a number of applications such as optimization, business, robotics, machine learning, networking, image processing, etc. GAs are very helpful when the developer does not have precise domain expertise, because they possess the ability to explore and learn from the problem domain. Predictions have been made that advances in mathematics, fuzzy logic, chaos and fractals will promote and enhance the work currently being undertaken with GAs. The future will bring forth new applications of genetic algorithms and new techniques which will allow GAs to be fully exploited.