Proceedings of 2013 2nd International Conference on Advances in Electrical Engineering (ICAEE 2013), 19-21 December, 2013, Dhaka, Bangladesh. 978-1-4799-2465-3/13/$31.00 ©2013 IEEE
Preserving Rotation Invariant Properties in Differential Evolution Algorithm
Md. Tanvir Alam Anik¹, Abu Saleh Md Noman² and Sabbir Ahmed³
Department of Computer Science and Engineering (CSE)
Bangladesh University of Engineering and Technology (BUET), Dhaka-1000, Bangladesh
e-mail: [email protected], [email protected], [email protected]
ABSTRACT
Differential evolution (DE) is an efficient and powerful population-based stochastic direct search method for solving optimization problems over continuous spaces. It uses both crossover and mutation to produce offspring. Mutation is rotation-invariant, whereas crossover is not. As a result, the performance of DE degrades on problems with strong linkage among variables. In this paper, we propose a new DE algorithm that uses rotation-invariant crossover operators to achieve better optimization performance when solving rotated problems. The proposed algorithm has been examined on a test suite of 12 benchmark functions, and the experimental results demonstrate its effectiveness.
KEY WORDS: Rotation-invariant, crossover,
mutation, differential evolution, function
optimization, Gram-Schmidt process
1. INTRODUCTION
Evolutionary algorithms (EAs) have been successfully applied to numerous optimization problems in diverse fields. They are stochastic search methods that operate on a population of potential solutions. Differential evolution (DE) is an EA proposed by Storn and Price [2]. DE has a much faster convergence rate than classical EAs because of its improved local search ability and a special self-adaptive mutation scheme [5], combined with crossover and a greedy replacement policy. It has been shown that DE is a very fast and robust algorithm.
DE conventionally has several candidate mutation strategies and three control parameters: population size µ, differential scaling factor F and crossover rate CR. Apart from µ, which is common to all population-based algorithms, the three most important issues in DE research are the selection of the mutation strategy and the adaptation of F and CR. Much work has been done to adapt these parameters optimally during evolution. The relationship between F, CR and population diversity has been analyzed in [6]. Prominent research works in the DE literature include Self-adaptive DE (SaDE) [1], DE with Neighborhood Search (NSDE) [3], and Self-adaptive Neighborhood Search DE (SaNSDE) [4]. SaDE [1] uses previous experience in generating promising solutions to gradually self-adapt the differential mutation operators and their associated parameter values. NSDE [3] combines the benefits of exploitative neighborhood search operators with the more explorative component of differential evolution. Combining SaDE and NSDE, Yang et al. [4] proposed SaNSDE to improve the performance of NSDE. SaNSDE inherits the self-adapted mutation selection scheme of SaDE and adopts a self-adaptive strategy to adjust the parameters of NSDE.
Gamperle et al. [7] proposed empirical parameter settings for DE along with experimental parameter studies. Although suggestions for parameter settings already exist [10], [11], the interaction between the parameter setting and the optimization performance is still complicated and not completely understood. This is mainly because no fixed parameter setting is suitable for all problems, or even for different evolution stages of a single problem. Moreover, the performance of DE degrades on problems with strong linkage among variables (where the variables are strongly related to each other). A desirable property of optimization algorithms for such problems is rotation invariance: a rotation-invariant algorithm solves a rotated problem, in which the variables are strongly related, in the same way as it solves the non-rotated problem.
Conventionally, DE employs mutation and crossover operations to promote variation in producing offspring. The mutation operation adds the weighted difference between two individuals to a third individual. The crossover operation decomposes the difference between a parent and the mutant individual into its elements along the axes of a coordinate system, selects some of these elements probabilistically, and combines the selected elements. The mutation operation is rotation-invariant, but the crossover operation is not. In this study, we propose a new DE algorithm that generates offspring using rotation-invariant mutation and crossover operations. Instead of the fixed coordinate system, the crossover uses a new coordinate system built with the Gram-Schmidt process [8]. Experimental studies are carried out on a set of benchmark functions and the results are compared with a number of prominent evolutionary systems. The experimental results demonstrate the effectiveness of the proposed scheme, which outperforms the compared algorithms on many of the functions.
The rest of the paper is organized as follows. Section 2 reviews the classical DE algorithm, explains why its crossover is not rotation-invariant, and incorporates the rotation-invariant property into the basic crossover operators using the Gram-Schmidt process. Section 3 presents the proposed algorithm in detail. Section 4 presents the experimental results of the proposed algorithm, along with discussion and comparison with some other popular research works. Finally, Section 5 draws conclusions and makes a few suggestions for future study.
2. PRELIMINARIES
2.1 Differential Evolution (DE)
Differential Evolution follows the general framework of an evolutionary algorithm. It evolves a population of µ D-dimensional parameter vectors, known as individuals, which represent candidate solutions in a search space S. The initial population should cover as much of the search space as possible, so individuals are sampled uniformly at random within the prescribed minimum and maximum parameter bounds.
In this paper we are concerned with bounded, real-valued optimization problems. A problem is a pair $(S, f)$, where $S \subseteq \mathbb{R}^D$ is a bounded set and $f : S \to \mathbb{R}$ is a D-dimensional fitness function, also known as the objective function. Without loss of generality we assume all problems are stated as minimization problems. Ideally our goal is to find a point $x_{min} \in S$ such that $f(x_{min})$ is a global minimum on S, that is:
$$\forall x \in S : f(x_{min}) \le f(x),$$
where $x = (x_1, x_2, \ldots, x_D)$. In practice, the goal of the algorithm is to find the smallest fitness value (closest to $f(x_{min})$) that it can before reaching a stopping criterion. In this paper the stopping criterion is a specified maximum number of fitness function evaluations. Following the description by Storn and Price [2], the pseudocode of classical DE is summarized in Fig. 1.
It is apparent from the pseudocode of Fig. 1 that the classical DE algorithm uses uniform crossover and the DE/rand/1 mutation strategy. The crossover operation is rotation-variant, while the mutation operation is rotation-invariant. Rotation invariance is a desirable property when solving function optimization problems, because it allows an algorithm to solve rotated problems. Since mutation is already rotation-invariant, preserving rotation invariance in a DE algorithm requires making the crossover operation rotation-invariant. Because classical DE uses uniform crossover, we first show why uniform crossover is not rotation-invariant and then propose a rotation-invariant crossover operator. Details are discussed in the following subsections.
Algorithm 1: Classical Differential Evolution Algorithm
Generate an initial population P consisting of µ individuals
    within the specified upper and lower bounds ;
FE = 0 ;
while( FE < FEmax ) {
  for(i=1; i <= µ; i++) {
    Randomly select 3 individuals: xr1, xr2, and xr3 ;
    Generate mutated offspring vi from xr1, xr2, and xr3 using (1) ;
    Generate offspring ui from parent xi and mutated offspring vi
        by performing crossover operation using (5) ;
    // survival selection: keep the better individual between xi and ui
    if( f(ui) <= f(xi) )
      zi = ui ;
    else zi = xi ;
    FE = FE + 1 ;
  }
  P = { zi, i = 1,2,…,µ } ;
}
Figure 1. The pseudocode of classical DE. FE is the number of function evaluations.
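The loop summarized in Fig. 1 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `classical_de`, the default parameter values, the decision to count the initial evaluations toward the budget, and the absence of any bound handling for mutants are all assumptions made here.

```python
import random


def classical_de(f, lower, upper, mu=20, F=0.5, CR=0.6, fe_max=10000, seed=0):
    """Sketch of Fig. 1: DE/rand/1 mutation, uniform crossover, greedy selection."""
    rng = random.Random(seed)
    D = len(lower)
    # Uniform random initialization within the bounds.
    pop = [[rng.uniform(lower[j], upper[j]) for j in range(D)] for _ in range(mu)]
    fit = [f(x) for x in pop]
    fe = mu  # assumption: the initial evaluations count toward the budget
    while fe + mu <= fe_max:
        new_pop, new_fit = [], []
        for i in range(mu):
            # Three mutually distinct indices, all different from i.
            r1, r2, r3 = rng.sample([k for k in range(mu) if k != i], 3)
            # DE/rand/1 mutation (no bound handling in this sketch).
            v = [pop[r1][j] + F * (pop[r2][j] - pop[r3][j]) for j in range(D)]
            # Uniform crossover; j_rand forces at least one mutant element.
            j_rand = rng.randrange(D)
            u = [v[j] if (j == j_rand or rng.random() <= CR) else pop[i][j]
                 for j in range(D)]
            fu = f(u)
            fe += 1
            # Greedy survival selection between parent and trial vector.
            if fu <= fit[i]:
                new_pop.append(u)
                new_fit.append(fu)
            else:
                new_pop.append(pop[i])
                new_fit.append(fit[i])
        pop, fit = new_pop, new_fit
    best = min(range(mu), key=lambda k: fit[k])
    return pop[best], fit[best]


def sphere(x):
    """Simple unimodal test function with minimum 0 at the origin."""
    return sum(t * t for t in x)


if __name__ == "__main__":
    x_best, f_best = classical_de(sphere, [-100.0] * 5, [100.0] * 5)
    print(f_best)
```

Because selection is greedy, the best fitness in the population never gets worse from one generation to the next.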
2.2 Mutation Operation
After initialization, classical DE employs the mutation operation to produce a mutant vector $v_i$ with respect to each individual $x_i$, the so-called target vector, in the current population. For each target vector $x_i$, its associated mutant vector $v_i$ can be generated via the DE/rand/1 mutation strategy as follows:
$$v_i = x_{r_1} + F\,(x_{r_2} - x_{r_3}) \qquad (1)$$
The indices $r_1$, $r_2$ and $r_3$ are mutually exclusive integers randomly generated within the range [1, µ], which are also different from the index i. These indices are randomly generated once for each mutant vector. The scaling factor F is a positive control parameter for scaling the difference vector, and µ is the population size. The following mutation strategies are frequently used in DE literature:
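DE/best/1:
$$v_i = x_{best} + F\,(x_{r_1} - x_{r_2}) \qquad (2)$$
DE/rand-to-best/1:
$$v_i = x_i + F\,(x_{best} - x_i) + F\,(x_{r_1} - x_{r_2}) \qquad (3)$$
DE/best/2:
$$v_i = x_{best} + F\,(x_{r_1} - x_{r_2}) + F\,(x_{r_3} - x_{r_4}) \qquad (4)$$
Here $x_{best}$ is the best individual in the current population, and $r_1, \ldots, r_4$ are mutually exclusive random indices different from i.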
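The four strategies named above can be collected in one helper. The sketch below assumes the forms of DE/best/1, DE/rand-to-best/1 and DE/best/2 that are commonly used in the DE literature; the function name `mutate` and the strategy labels are hypothetical, chosen only for this illustration.

```python
import random


def mutate(pop, fit, i, strategy, F, rng):
    """Return the mutant vector v_i for the named strategy (assumed standard forms)."""
    mu, D = len(pop), len(pop[0])
    best = pop[min(range(mu), key=lambda k: fit[k])]
    # Four mutually distinct indices, all different from i (needs mu >= 5).
    r1, r2, r3, r4 = rng.sample([k for k in range(mu) if k != i], 4)
    a, b, c, d, x = pop[r1], pop[r2], pop[r3], pop[r4], pop[i]
    if strategy == "rand/1":
        return [a[j] + F * (b[j] - c[j]) for j in range(D)]
    if strategy == "best/1":
        return [best[j] + F * (a[j] - b[j]) for j in range(D)]
    if strategy == "rand-to-best/1":
        return [x[j] + F * (best[j] - x[j]) + F * (a[j] - b[j]) for j in range(D)]
    if strategy == "best/2":
        return [best[j] + F * (a[j] - b[j]) + F * (c[j] - d[j]) for j in range(D)]
    raise ValueError("unknown strategy: " + strategy)
```

With F = 0 every strategy collapses to its base vector (a random individual, the best individual, or the target vector), which is a quick sanity check on the index handling.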
2.3 Crossover Operation
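After the mutation phase, the uniform crossover operation is applied to each pair of target vector $x_i$ and its corresponding mutant vector $v_i$ to generate a trial vector $u_i = \{u_i^1, \ldots, u_i^D\}$ as follows:
$$u_i^j = \begin{cases} v_i^j, & \text{if } rand_j[0,1) \le CR \text{ or } j = j_{rand} \\ x_i^j, & \text{otherwise} \end{cases} \qquad (5)$$
Here, j = 1, 2, …, D. The crossover rate CR is a user-specified constant within the range [0,1) that controls the fraction of parameter values copied from the mutant vector, $rand_j[0,1)$ is a uniform random number drawn anew for each j, and $j_{rand}$ is a randomly chosen integer in the range [1, D], which ensures that the trial vector receives at least one element from the mutant vector.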
Fig. 2 shows the uniform crossover. Black circles correspond to the parents, and the child is one of the white circles. When a given problem is rotated and the search points are rotated with it, the child is one of the circles with (red) diagonal lines and not one of the gray (green) circles. Therefore, the uniform crossover is not rotation-invariant [8].
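The argument can also be checked numerically in two dimensions. The sketch below enumerates every child that uniform crossover can produce from a parent and a mutant, then compares "rotate, then cross" with "cross, then rotate". All helper names are hypothetical and the points are arbitrary illustrative values.

```python
import math
from itertools import product


def rotate(p, theta):
    """Rotate a 2-D point counter-clockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])


def uniform_children(x, v):
    """Every point uniform crossover can return: each coordinate from x or v."""
    return [tuple(v[j] if take else x[j] for j, take in enumerate(mask))
            for mask in product((False, True), repeat=len(x))]


def same_points(A, B, tol=1e-9):
    """True if A and B contain the same points up to tol."""
    def near(p, Q):
        return any(math.dist(p, q) <= tol for q in Q)
    return all(near(a, B) for a in A) and all(near(b, A) for b in B)


def crossover_commutes_with_rotation(x, v, theta):
    rotate_then_cross = uniform_children(rotate(x, theta), rotate(v, theta))
    cross_then_rotate = [rotate(c, theta) for c in uniform_children(x, v)]
    return same_points(rotate_then_cross, cross_then_rotate)


def mutation_commutes_with_rotation(a, b, c, F, theta):
    def mutant(p, q, r):
        return tuple(p[j] + F * (q[j] - r[j]) for j in range(2))
    lhs = rotate(mutant(a, b, c), theta)
    rhs = mutant(rotate(a, theta), rotate(b, theta), rotate(c, theta))
    return math.dist(lhs, rhs) <= 1e-9


if __name__ == "__main__":
    x, v = (0.0, 0.0), (2.0, 1.0)
    print(crossover_commutes_with_rotation(x, v, math.pi / 6))
    print(mutation_commutes_with_rotation((1.0, 2.0), (0.5, -1.0), (3.0, 0.0), 0.5, math.pi / 6))
```

For a generic angle the two candidate sets differ, whereas the mutation check holds for any angle because DE/rand/1 is a linear combination of points. Rotations by multiples of 90 degrees are a special case in which the axis-aligned rectangle maps onto itself.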
2.3.1 Rotation-Invariant Crossover Operation
In uniform crossover, the child is a vertex of the axis-aligned hyper-rectangle whose diagonally opposite corners are the parent and the mutant vector. As Fig. 2 shows, this fixed coordinate system must be replaced in order to obtain a rotation-invariant crossover operation. For this purpose, we introduce a coordinate system that is defined by the search points and constructed with the Gram-Schmidt process.
2.3.2 Gram-Schmidt process
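1) Calculating the centroid of the search points:
$$x_g = \frac{1}{\mu} \sum_{i=1}^{\mu} x_i \qquad (6)$$
2) Calculating directional vectors from the centroid:
$$d_i = x_i - x_g, \quad i = 1, 2, \ldots, \mu \qquad (7)$$
3) Selecting p vectors randomly from the directional vectors:
$$\{\, y_k = d_{r_k} \mid k = 1, \ldots, p,\ r_k \in \{1, \ldots, \mu\} \,\} \qquad (8)$$
4) Orthonormalizing the selected vectors using the Gram-Schmidt process:
$$b_1 = \frac{y_1}{\| y_1 \|}, \qquad b_k = \frac{y_k - \sum_{j=1}^{k-1} (y_k, b_j)\, b_j}{\left\| y_k - \sum_{j=1}^{k-1} (y_k, b_j)\, b_j \right\|}, \quad k = 2, \ldots, p$$
where (y, b) is the inner product of y and b.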
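Steps 1) to 4) can be sketched as follows. The function name `gram_schmidt_basis`, sampling the p directional vectors without replacement, and dropping vectors that turn out to be (nearly) linearly dependent are assumptions of this sketch; the text does not specify how such cases are handled.

```python
import math
import random


def dot(a, b):
    """Inner product (a, b)."""
    return sum(s * t for s, t in zip(a, b))


def gram_schmidt_basis(pop, p, rng, eps=1e-12):
    """Orthonormal vectors built from the population, following steps 1) to 4)."""
    mu, D = len(pop), len(pop[0])
    # 1) centroid of the search points
    centroid = [sum(x[j] for x in pop) / mu for j in range(D)]
    # 2) directional vectors from the centroid
    directions = [[x[j] - centroid[j] for j in range(D)] for x in pop]
    # 3) select p of them at random (assumption: without replacement)
    selected = rng.sample(directions, p)
    # 4) Gram-Schmidt: subtract the components along earlier basis vectors
    basis = []
    for y in selected:
        w = list(y)
        for b in basis:
            coeff = dot(y, b)
            w = [wj - coeff * bj for wj, bj in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > eps:  # assumption: drop (nearly) dependent vectors
            basis.append([wj / norm for wj in w])
    return basis


if __name__ == "__main__":
    rng = random.Random(0)
    pop = [[rng.uniform(-5, 5) for _ in range(3)] for _ in range(8)]
    for b in gram_schmidt_basis(pop, 3, rng):
        print(b)
```

Because the basis is computed from the search points themselves, rotating the population rotates the basis with it, which is what the crossover operators in the next subsection rely on.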
2.3.3 Rotation-Invariant Uniform and Exponential Crossover
In uniform crossover, each element of the child is taken from either the parent $x_i$ or the mutant vector $v_i$. Thus, the rotation-invariant uniform crossover operation using the Gram-Schmidt process can be defined as follows:
$$\Delta_i = v_i - x_i, \quad i = 1, 2, \ldots, \mu \qquad (9)$$
$$u_i = x_i + \sum_{k \in K} (\Delta_i, e_k)\, e_k \qquad (10)$$
$\Delta_i$ is the vector from parent $x_i$ to the mutant $v_i$, K is the set of indexes of the selected elements, and $e_k$ is the unit vector whose kth element is 1 and whose other elements are 0. Replacing the $e_k$'s with the $b_k$'s, the new coordinate vectors from subsection 2.3.2, makes the operation rotation-invariant. The two-dimensional rotation-invariant uniform crossover (RIUC) is illustrated in Fig. 2 and its pseudocode is presented in Fig. 3. We have also incorporated rotation-invariant exponential crossover (RIEC) in this paper. RIEC is quite similar to RIUC; its pseudocode is presented in Fig. 4. More details about both operators can be found in [8].
Fig. 2: Crossover operation, panels (a) and (b).
Algorithm 2: Rotation-Invariant Uniform Crossover
Δ = vi − xi ;
ui = xi ;
j = rand(1,p) ;
for(k=1; k ≤ p; k++)
{
  if(k==j || u(0,1)<CR)
    ui = ui + (Δ, bk) bk ;
}
Figure 3. The pseudocode of Rotation-Invariant Uniform Crossover, where rand(1,p) generates a random integer in the range [1,p] and u(0,1) generates a uniform random real number in the range [0,1].
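A possible Python rendering of the RIUC steps is sketched below. It takes the orthonormal vectors as an argument so that it is self-contained; the function name `riuc`, the 0-based indexing and the example basis (the standard axes rotated by 45 degrees) are assumptions of this illustration.

```python
import math
import random


def dot(a, b):
    """Inner product (a, b)."""
    return sum(s * t for s, t in zip(a, b))


def riuc(x, v, basis, CR, rng):
    """Rotation-invariant uniform crossover, following the steps of Fig. 3."""
    delta = [vj - xj for vj, xj in zip(v, x)]  # vector from parent to mutant
    u = list(x)
    j = rng.randrange(len(basis))              # component that is always taken
    for k, b in enumerate(basis):
        if k == j or rng.random() < CR:
            coeff = dot(delta, b)              # component of delta along b_k
            u = [uj + coeff * bj for uj, bj in zip(u, b)]
    return u


if __name__ == "__main__":
    s = math.sqrt(0.5)
    basis = [[s, s], [-s, s]]                  # axes rotated by 45 degrees
    print(riuc([0.0, 0.0], [2.0, 1.0], basis, CR=0.6, rng=random.Random(0)))
```

When the basis spans the whole space and every component is selected, the child coincides with the mutant; when the basis is the standard one, the operator reduces to ordinary uniform crossover.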
Algorithm 3: Rotation-Invariant Exponential Crossover
Δ = vi − xi ;
ui = xi ;
j = rand(1,p) ;
for(k=1; k ≤ p && u(0,1) < CR; k++)
{
  ui = ui + (Δ, bj) bj ;
  j = (j mod p) + 1 ;
}
Figure 4. The pseudocode of Rotation-Invariant Exponential Crossover.
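The exponential variant can be sketched in the same way. The function name `riec` and the 0-based indexing are assumptions; the loop keeps the condition of Fig. 4 as written, so the random test is made before the first component is copied and the child may equal the parent.

```python
import math
import random


def dot(a, b):
    """Inner product (a, b)."""
    return sum(s * t for s, t in zip(a, b))


def riec(x, v, basis, CR, rng):
    """Rotation-invariant exponential crossover, following the steps of Fig. 4."""
    p = len(basis)
    delta = [vj - xj for vj, xj in zip(v, x)]  # vector from parent to mutant
    u = list(x)
    j = rng.randrange(p)                       # 0-based start index
    k = 0
    # Copy consecutive components until the random test fails or all p are used.
    while k < p and rng.random() < CR:
        b = basis[j]
        coeff = dot(delta, b)
        u = [uj + coeff * bj for uj, bj in zip(u, b)]
        j = (j + 1) % p                        # wrap around to the first vector
        k += 1
    return u


if __name__ == "__main__":
    s = math.sqrt(0.5)
    basis = [[s, s], [-s, s]]                  # axes rotated by 45 degrees
    print(riec([0.0, 0.0], [2.0, 1.0], basis, CR=0.6, rng=random.Random(0)))
```

Compared with RIUC, the selected components always form one contiguous (cyclic) run of basis vectors.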
3. ALGORITHM OUTLINE
To achieve the most satisfactory optimization performance when applying classical DE to a given problem, it is common to perform a trial-and-error search for the most appropriate mutation strategy and to fine-tune its associated control parameter values. This can incur a large computational cost. Moreover, how well each mutation and crossover operator works depends on the problem being optimized. Motivated by these observations, we propose a mixed mutation and mixed crossover strategy based DE (MMCDE) algorithm that integrates the advantages of several mutation and crossover operators in one algorithm. MMCDE contains two pools: 1) a mutation pool that contains four different mutation strategies, and 2) a crossover pool that contains two different crossover strategies. The crossover operators that MMCDE incorporates are rotation-invariant. Thus the proposed algorithm is expected to perform better when solving rotated, noisy optimization problems. The major steps of MMCDE can be described as follows:
Step 1) Initialization: Randomly initialize a population of µ individuals $P = \{x_1, x_2, \ldots, x_\mu\}$ with $x_i = \{x_i^1, x_i^2, \ldots, x_i^D\}$, i = 1, 2, …, µ, uniformly distributed in the range $[x_{min}, x_{max}]$, where $x_{min} = \{x_{min}^1, x_{min}^2, \ldots, x_{min}^D\}$ and $x_{max} = \{x_{max}^1, x_{max}^2, \ldots, x_{max}^D\}$. Set the generation counter G = 0. Initialize the crossover rate CR and the scaling factor F.
Step 2) Evaluate the fitness of each individual of the population P.
Step 3) Termination condition: If the number of function evaluations exceeds the maximum number of evaluations FEmax, the algorithm terminates.
Step 4) Mutation: Choose a random integer t uniformly from {1, 2, 3, 4}.
If t=1: Use DE/rand/1 mutation strategy to produce mutated offspring vi. (using (1))
Else If t=2: Use DE/best/1 mutation strategy to produce mutated offspring vi. (using (2))
Else If t=3: Use DE/rand-to-best/1 mutation strategy to produce mutated offspring vi. (using (3))
Else If t=4: Use DE/best/2 mutation strategy to produce mutated offspring vi. (using (4))
Step 5) Crossover: Choose a random integer t uniformly from {1, 2}.
If t=1: Use RIUC crossover operation to produce offspring ui. (Pseudocode of Fig. 3)
Else If t=2: Use RIEC crossover operation to produce offspring ui. (Pseudocode of Fig. 4)
Step 6) Selection:
$$z_i = \begin{cases} u_i, & \text{if } f(u_i) \le f(x_i) \\ x_i, & \text{otherwise} \end{cases} \qquad (11)$$
where $z_i$ is the individual that survives into the next generation.
Step 7) Set G = G+1. Go back to Step 2.
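Steps 1) to 7) can be put together in a compact sketch. It is not the authors' code: the function names, the choice p = D, rebuilding the Gram-Schmidt basis once per generation, the assumed standard forms of the best-based mutation strategies, and the fallback to the mutant when the population has collapsed to a single point are all assumptions made for this illustration.

```python
import math
import random


def dot(a, b):
    return sum(s * t for s, t in zip(a, b))


def build_basis(pop, p, rng, eps=1e-12):
    """Gram-Schmidt basis from p randomly chosen centroid-to-point vectors."""
    mu, D = len(pop), len(pop[0])
    g = [sum(x[j] for x in pop) / mu for j in range(D)]
    basis = []
    for x in rng.sample(pop, p):
        w = [xj - gj for xj, gj in zip(x, g)]
        for b in basis:
            c = dot(w, b)
            w = [wj - c * bj for wj, bj in zip(w, b)]
        n = math.sqrt(dot(w, w))
        if n > eps:
            basis.append([wj / n for wj in w])
    return basis


def mmcde(f, lower, upper, mu=30, F=0.5, CR=0.6, fe_max=20000, seed=0):
    rng = random.Random(seed)
    D = len(lower)
    # Step 1 and Step 2: initialize and evaluate.
    pop = [[rng.uniform(lower[j], upper[j]) for j in range(D)] for _ in range(mu)]
    fit = [f(x) for x in pop]
    fe = mu
    while fe + mu <= fe_max:                       # Step 3
        basis = build_basis(pop, min(D, mu), rng)  # assumption: p = D, once per generation
        best = pop[min(range(mu), key=lambda k: fit[k])]
        new_pop, new_fit = [], []
        for i, x in enumerate(pop):
            a, b, c, d = (pop[r] for r in rng.sample([k for k in range(mu) if k != i], 4))
            t = rng.randint(1, 4)                  # Step 4: random mutation strategy
            if t == 1:
                v = [a[j] + F * (b[j] - c[j]) for j in range(D)]
            elif t == 2:
                v = [best[j] + F * (a[j] - b[j]) for j in range(D)]
            elif t == 3:
                v = [x[j] + F * (best[j] - x[j]) + F * (a[j] - b[j]) for j in range(D)]
            else:
                v = [best[j] + F * (a[j] - b[j]) + F * (c[j] - d[j]) for j in range(D)]
            p = len(basis)                         # Step 5: random crossover operator
            if p == 0:
                chosen, u = [], list(v)            # degenerate population: use the mutant
            else:
                u = list(x)
                j = rng.randrange(p)
                if rng.randint(1, 2) == 1:         # RIUC
                    chosen = [k for k in range(p) if k == j or rng.random() < CR]
                else:                              # RIEC
                    chosen = []
                    while len(chosen) < p and rng.random() < CR:
                        chosen.append(j)
                        j = (j + 1) % p
            delta = [vj - xj for vj, xj in zip(v, x)]
            for k in chosen:
                coeff = dot(delta, basis[k])
                u = [uj + coeff * bj for uj, bj in zip(u, basis[k])]
            fu = f(u)
            fe += 1
            if fu <= fit[i]:                       # Step 6: greedy selection
                new_pop.append(u)
                new_fit.append(fu)
            else:
                new_pop.append(x)
                new_fit.append(fit[i])
        pop, fit = new_pop, new_fit                # Step 7
    k_best = min(range(mu), key=lambda k: fit[k])
    return pop[k_best], fit[k_best]
```

A call such as `mmcde(lambda x: sum(t * t for t in x), [-100.0] * 5, [100.0] * 5)` returns the best individual found and its fitness.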
The proposed algorithm picks the mutation and crossover operators at random, which involves a trade-off. The disadvantage is that the choice is not self-adaptive. However, self-adaptation often introduces bias: depending on the structure of the fitness landscape, a self-adaptive scheme may yield better optimization for a certain group of problems, whereas random selection introduces no such bias. The choice of mutation and crossover operators at different stages of evolution is therefore purely random. As a result, MMCDE often produces better optimization, as is evident from the experimental studies discussed in the next section.
4. EXPERIMENTAL STUDIES
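A function is multimodal if it has multiple local optima. In order to minimize such a function, the search process must be able to avoid being trapped in the regions around local minima so that it can reach the global minimum. The problem becomes more difficult as the dimensionality grows, because the number of local minima increases exponentially with the number of dimensions. As our main objective is to achieve a reliable optimization performance and most of the unstable cases occur on multimodal problems, we choose four unimodal functions (f1-f4) and eight multimodal functions (f5-f12). Table I lists these functions with their initial ranges. More details can be found in the original reference [9]. Among the eight multimodal functions, f5-f9 are high-dimensional multimodal functions and f10-f12 are low-dimensional multimodal functions.
TABLE I. BENCHMARK FUNCTIONS USED IN THE EXPERIMENTAL STUDY. D IS THE DIMENSIONALITY OF THE FUNCTION.
Test Functions | Initial Range
$f_1(x) = \sum_{i=1}^{D} x_i^2$ | $[-100, 100]^D$
$f_2(x) = \sum_{i=1}^{D} \lvert x_i \rvert + \prod_{i=1}^{D} \lvert x_i \rvert$ | $[-10, 10]^D$
$f_3(x) = \sum_{i=1}^{D-1} [100 (x_{i+1} - x_i^2)^2 + (x_i - 1)^2]$ | $[-30, 30]^D$
$f_4(x) = \sum_{i=1}^{D} i x_i^4 + random[0, 1)$ | $[-1.28, 1.28]^D$
$f_5(x) = \sum_{i=1}^{D} [x_i^2 - 10 \cos(2 \pi x_i) + 10]$ | $[-5.12, 5.12]^D$
$f_6(x) = -20 \exp\left(-0.2 \sqrt{\frac{1}{D} \sum_{i=1}^{D} x_i^2}\right) - \exp\left(\frac{1}{D} \sum_{i=1}^{D} \cos(2 \pi x_i)\right) + 20 + e$ | $[-32, 32]^D$
$f_7(x) = 1 + \frac{1}{4000} \sum_{i=1}^{D} x_i^2 - \prod_{i=1}^{D} \cos\left(x_i / \sqrt{i}\right)$ | $[-600, 600]^D$
$f_8(x) = \frac{\pi}{D} \left\{ 10 \sin^2(\pi y_1) + \sum_{i=1}^{D-1} (y_i - 1)^2 [1 + 10 \sin^2(\pi y_{i+1})] + (y_D - 1)^2 \right\} + \sum_{i=1}^{D} u(x_i, 10, 100, 4)$ | $[-50, 50]^D$
$f_9(x) = 0.1 \left\{ \sin^2(3 \pi x_1) + \sum_{i=1}^{D-1} (x_i - 1)^2 [1 + \sin^2(3 \pi x_{i+1})] + (x_D - 1)^2 [1 + \sin^2(2 \pi x_D)] \right\} + \sum_{i=1}^{D} u(x_i, 5, 100, 4)$ | $[-50, 50]^D$
$f_{10}(x) = \left[ \frac{1}{500} + \sum_{j=1}^{25} \frac{1}{j + \sum_{i=1}^{2} (x_i - a_{ij})^6} \right]^{-1}$ | $[-65.536, 65.536]^D$
$f_{11}(x) = \sum_{i=1}^{11} \left[ a_i - \frac{x_1 (b_i^2 + b_i x_2)}{b_i^2 + b_i x_3 + x_4} \right]^2$ | $[-5, 5]^D$
$f_{12}(x) = \left( x_2 - \frac{5.1}{4 \pi^2} x_1^2 + \frac{5}{\pi} x_1 - 6 \right)^2 + 10 \left( 1 - \frac{1}{8 \pi} \right) \cos x_1 + 10$ | $[-5, 10] \times [0, 15]$
In f8, $y_i = 1 + (x_i + 1)/4$. The penalty function u in f8 and f9 and the constants $a_{ij}$, $a_i$ and $b_i$ in f10 and f11 are defined in [9].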
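A few of these benchmarks are short enough to write out directly. The sketch below assumes the standard definitions from [9] for the sphere (f1), Rastrigin (f5), Ackley (f6) and Griewank (f7) functions. The `rotated` wrapper is a hypothetical helper, not part of the experimental setup described here; it only illustrates what rotating a two-dimensional problem means.

```python
import math


def sphere(x):       # f1
    return sum(t * t for t in x)


def rastrigin(x):    # f5
    return sum(t * t - 10.0 * math.cos(2.0 * math.pi * t) + 10.0 for t in x)


def ackley(x):       # f6
    D = len(x)
    s1 = sum(t * t for t in x) / D
    s2 = sum(math.cos(2.0 * math.pi * t) for t in x) / D
    return -20.0 * math.exp(-0.2 * math.sqrt(s1)) - math.exp(s2) + 20.0 + math.e


def griewank(x):     # f7
    s = sum(t * t for t in x) / 4000.0
    p = math.prod(math.cos(t / math.sqrt(i)) for i, t in enumerate(x, start=1))
    return s - p + 1.0


def rotated(f, theta):
    """Hypothetical helper: evaluate a 2-D function in coordinates rotated by theta."""
    c, s = math.cos(theta), math.sin(theta)
    return lambda x: f([c * x[0] - s * x[1], s * x[0] + c * x[1]])


if __name__ == "__main__":
    print(rastrigin([1.0, 0.0]), rotated(rastrigin, math.pi / 4)([1.0, 0.0]))
    print(sphere([1.0, 2.0]), rotated(sphere, 0.7)([1.0, 2.0]))
```

The sphere function is unchanged by rotation, while a separable function such as Rastrigin is not: after rotation its variables become coupled, which is the kind of linkage that motivates a rotation-invariant crossover.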
4.1 Parameter Settings
In the experiments, the following parameters are used: population size μ = 100, crossover rate CR = 0.6, and scaling factor F = 0.5. These values are chosen to make a fair comparison with previous works. We primarily consider DE [2] and NSDE [3] for comparison.
4.2 Result Analysis
All the results shown in this section are error values of the solutions found with respect to the optimal solution of the problem. The error value is computed as:
Error = f(x) − f(x*) (12)
where f(x) is the fitness of the solution obtained by the algorithm and f(x*) is the known global minimum of the benchmark function. For each benchmark function, 50 independent runs were performed and the error was averaged over the 50 runs. Table II shows the mean error of MMCDE on the 12 test functions in comparison with DE and NSDE. For the unimodal functions f1-f4, MMCDE achieves much better results than DE and NSDE on three out of four functions. For the multimodal functions f5-f9, MMCDE achieves a lower mean error than DE on four and than NSDE on two out of five functions. Among these functions, DE performs better than MMCDE only on Ackley's function (f6). Meanwhile, NSDE performs significantly better than MMCDE on f6 and f7 and shows almost equal performance on f9. The performance of MMCDE, DE, and NSDE is almost the same on the low-dimensional multimodal functions f10-f12.
Table II also shows the results of t-tests at the 5% significance level between MMCDE and each of the other algorithms. “+” and “−” indicate that MMCDE is significantly better and worse than the compared algorithm, respectively, and “≈” indicates that the difference is not statistically significant. According to the t-test results, MMCDE is significantly better than DE on six and than NSDE on five of the 12 functions, and significantly worse on two and three functions, respectively.
TABLE II. COMPARISON BETWEEN MMCDE, DE [2], AND NSDE [3] ON 12 CLASSICAL BENCHMARK FUNCTIONS (MEAN ERROR AVERAGED OVER 50 RUNS)
No | MMCDE Mean Error | DE Mean Error | NSDE Mean Error | Vs. DE | Vs. NSDE
f1 | 2.47E−21 | 1.81E−13 | 7.10E−17 | + | +
f2 | 9.66E−20 | 6.43E−07 | 6.49E−11 | + | +
f3 | 4.01E−26 | 0 | 5.90E−28 | − | −
f4 | 6.54E−04 | 4.84E−03 | 4.97E−03 | + | +
f5 | 7.46E−06 | 138.70 | 3.98E−02 | + | +
f6 | 2.77E−04 | 1.20E−07 | 1.69E−09 | − | −
f7 | 3.57E−05 | 1.97E−04 | 5.80E−16 | ≈ | −
f8 | 0.0 | 1.98E−14 | 5.37E−18 | + | +
f9 | 9.95E−16 | 1.16E−13 | 6.37E−17 | + | ≈
f10 | 2.00E−03 | 2.00E−03 | 2.00E−03 | ≈ | ≈
f11 | 1.20E−03 | 1.60E−03 | 1.60E−03 | ≈ | ≈
f12 | 2.00E−03 | 2.00E−03 | 2.00E−03 | ≈ | ≈
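The significance marks can be derived from the per-run errors of two algorithms. The text does not say which variant of the t-test was used, so the sketch below assumes Welch's unequal-variance t-test; the function names, the fixed critical value of about 1.98 (two-sided 5% level for roughly 98 degrees of freedom, that is, two samples of 50 runs) and the example data, which are synthetic, are all assumptions.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples.
    Both samples must not be constant at the same time (non-zero variance)."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a) / na, statistics.variance(b) / nb
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


def mark(errors_mmcde, errors_other, t_crit=1.984):
    """Return '+', '-' or '≈' in the sense of Table II (lower error is better).
    t_crit is an assumed two-sided 5% critical value for about 98 degrees of freedom."""
    t, _ = welch_t(errors_mmcde, errors_other)
    if t < -t_crit:
        return "+"
    if t > t_crit:
        return "-"
    return "≈"


if __name__ == "__main__":
    # Synthetic per-run errors, for illustration only.
    a = [1.0 + 0.01 * (k % 5) for k in range(50)]
    b = [2.0 + 0.01 * (k % 5) for k in range(50)]
    print(mark(a, b), mark(b, a), mark(a, a))
```

For sample sizes other than 50 runs per algorithm, the critical value would have to be looked up for the degrees of freedom returned by `welch_t`.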
5. CONCLUSION AND FUTURE WORKS
In this paper, we have proposed a differential evolution algorithm that incorporates rotation-invariant crossover operators so that the rotation-invariant property is preserved throughout the evolutionary process. The Gram-Schmidt process is applied to the candidate solutions to obtain orthonormal vectors, and these vectors form the new coordinate system that makes the crossover rotation-invariant. As the evolutionary process progresses, the mutation and crossover operators are selected randomly. The performance of the proposed MMCDE algorithm is evaluated and discussed on a set of 12 classical benchmark functions. MMCDE is significantly better than both DE and NSDE on most of the unimodal functions and on several of the multimodal functions.
A self-adaptive strategy has not been introduced in the proposed MMCDE algorithm. In future, we will apply a self-adaptive strategy to gradually self-adapt the choice of crossover and mutation operators to promote search space exploration and exploitation. We will also reconstruct the crossover and mutation pools to make room for more crossover and mutation operators. The optimal choice of crossover and mutation operators is an attractive research issue and deserves further investigation.
REFERENCES
[1] A. K. Qin and P. N. Suganthan, “Self-adaptive differential evolution algorithm for numerical optimization,” Proc. of the 2005 IEEE Congress on Evolutionary Computation, vol. 2, pp. 1785–1791, 2005.
[2] R. Storn and K. V. Price, “Differential evolution - A simple and efficient heuristic for global optimization over continuous spaces,” J. Global Optim., vol. 11, pp. 341–359, 1997.
[3] Z. Yang, J. He, and X. Yao, “Making a difference to differential evolution,” in Advances in Metaheuristics for Hard Optimization, Z. Michalewicz and P. Siarry, Eds. Springer, 2007, pp. 397–414.
[4] Z. Yang, K. Tang, and X. Yao, “Self-adaptive differential evolution with neighborhood search,” Proc. of the 2008 IEEE Congress on Evolutionary Computation, vol. 2, pp. 1110–1116, 2008.
[5] J. H. Holland, Adaptation in Natural and Artificial Systems. Ann Arbor, MI: University of Michigan Press, 1975.
[6] D. Zaharie, “Critical values for the control parameters of differential evolution algorithms,” Proc. of the 8th International Conference on Soft Computing, pp. 62–67, 2002.
[7] R. Gamperle, S. Muller, and P. Koumoutsakos, “A parameter study for differential evolution,” Proc. WSEAS International Conference on Advances in Intelligent Systems, Fuzzy Systems, Evolutionary Computation, pp. 293–298, 2002.
[8] T. Takahama and S. Sakai, “Solving nonlinear optimization problems by differential evolution with a rotation-invariant crossover operation using Gram-Schmidt process,” Proc. of the 2010 Second World Congress on Nature and Biologically Inspired Computing (NaBIC), pp. 526–533, Dec. 2010.
[9] X. Yao, Y. Liu, and G. Lin, “Evolutionary programming made faster,” IEEE Trans. Evol. Comput., vol. 3, no. 2, pp. 82–102, 1999.
[10] K. V. Price, R. M. Storn, and J. A. Lampinen, Differential Evolution: A Practical Approach to Global Optimization, 1st ed. New York: Springer-Verlag, Dec. 2005.
[11] R. Gamperle, S. D. Muller, and P. Koumoutsakos, “A parameter study for differential evolution,” Proc. of Advances Intell. Syst., Fuzzy Syst., Evol. Comput., Crete, Greece, 2002, pp. 293–298.