
Preserving Rotation Invariant Properties in Differential Evolution Algorithm

Md. Tanvir Alam Anik1, Abu Saleh Md Noman2 and Sabbir Ahmed3

Department of Computer Science and Engineering (CSE)

Bangladesh University of Engineering and Technology (BUET), Dhaka-1000, Bangladesh

e-mail: [email protected], [email protected], [email protected]

ABSTRACT

Differential evolution (DE) is an efficient and

powerful population-based stochastic direct search

method for solving optimization problems over

continuous space. It uses both crossover and

mutation for producing offspring. Mutation is

rotation-invariant while crossover is not rotation-

invariant. As a result, the performance of DE

degrades in problems with strong linkage among

variables. In this paper, we propose a new DE

algorithm that uses rotation-invariant crossover

operators to achieve better optimization performance

when solving rotated problems. The proposed

algorithm has been examined on a test-suite of 12

benchmark functions. Experimental results have

demonstrated the effectiveness of the proposed

algorithm.

KEY WORDS: Rotation-invariant, crossover,

mutation, differential evolution, function

optimization, Gram-Schmidt process

1. INTRODUCTION

Evolutionary Algorithms have been successfully

applied to solve numerous optimization problems in

diverse fields. They are stochastic search methods

that operate on a population of potential solutions.

Differential evolution (DE) is a newly proposed EA

by Storn and Price [2]. DE has much faster

convergence rate than classical EA because of its

improved local search ability along with a special

self-adaptive mutation scheme [5], combined with

crossover and greedy replacement policy. It has been

shown that DE is a very fast and robust algorithm.

DE conventionally has several candidate mutation

strategies, and three control parameters, i.e.,

population size µ, differential scaling factor F and

crossover rate CR. Apart from the parameter µ which

is common for all population-based algorithms,

mutation strategy selection, parameters F and CR

adaptations are the three most important issues of DE

research. Many works have been done to optimally

adapt these parameters during evolution. The

relationship between the F, CR and population

diversity has been analyzed in [6]. Some prominent

research works in DE literature include Self-adaptive

DE (SaDE) [1], DE with Neighborhood Search

(NSDE) [3], self-adaptive neighborhood search DE

(SaNSDE) [4]. SaDE [1] uses previous experience in

generating promising solutions to gradually self-

adapt the differential mutation operators and their

associated parameter values. NSDE [3] combines the

benefits of exploitative neighborhood search

operators with its more explorative component of

differential evolution. Combining SaDE and NSDE,

Yang et al. [4] proposed SaNSDE to improve the

performance of NSDE. SaNSDE inherits the self-

adapted mutation selection schemes of SaDE, and

adopts a self-adaptive strategy to adjust the

parameters of NSDE.

Gamperle et al. [7] proposed empirical parameter

settings of DE along with experimental parameter

studies. Although there are already suggestions for

parameter settings [10], [11], the interaction between

the parameter setting and the optimization

performance is still complicated and not completely

understood. This is mainly because of the fact that

there is no fixed parameter setting that is suitable for

various problems or even at different evolution

stages of a single problem. Moreover, the

performance of DE degrades in problems with strong

linkage among variables (that is, where the variables are
strongly interdependent). A desirable property of
optimization algorithms for such problems is rotation
invariance: a rotation-invariant algorithm solves a
rotated problem, in which the variables are strongly
related, in the same way as it solves the non-rotated
problem.

Conventionally, DE employs mutation and

crossover operation to promote variation in

producing offspring. Mutation operation is

performed by adding the weighted difference

between two individuals to a third individual.

The crossover operation decomposes the difference
between a parent and the mutant individual into
components along the axes of a coordinate system,
selects some of these components probabilistically,
and combines them. The mutation operation is rotation-invariant,

but crossover operation is not rotation-invariant. In

this study, we propose a new DE algorithm that

generates offspring employing rotation-invariant

mutation and crossover operation. That is, we will

introduce crossover operation which will be rotation-

invariant in nature. Instead of using the fixed

coordinate system, a new coordinate system based on

Gram-Schmidt process [8] has been introduced in

this paper. Experimental studies are carried out on a

set of benchmark functions and the results have been

compared with a number of prominent evolutionary

systems. Experimental results demonstrate the

Proceedings of 2013 2nd International Conference on Advances in Electrical Engineering (ICAEE 2013)19-21 December, 2013, Dhaka, Bangladesh

978-1-4799-2465-3/13/$31.00 ©2013 IEEE

effectiveness of the proposed scheme as it often

outperforms other algorithms on most of the

functions.

The rest of the paper is organized as follows:

Section 2 highlights the classical DE algorithm along

with its non-rotation-invariant nature and

incorporates rotation-invariant property into the basic

crossover operators using Gram-Schmidt process.

Section 3 presents the proposed algorithm in detail.

Section 4 presents the experimental results of the

proposed algorithm, along with discussion and

comparison with some other popular research works.

Finally, Section 5 draws conclusions and makes a few

suggestions for future study.

2. PRELIMINARIES

2.1 Differential Evolution (DE)

Differential Evolution follows the general

framework of an evolutionary algorithm. It aims at

evolving a population of µ D-dimensional parameter

vectors, known as individuals. These individuals

represent the candidate solutions in a search space S.

The initial population should cover the search space
as well as possible, which is achieved by drawing the
individuals uniformly at random within the prescribed
minimum and maximum parameter bounds.

In this paper we are concerned with bounded, real-

valued optimization problems. A problem is a pair

(S, f) where S ⊆ R^D is a bounded subset of R^D, and

f : S → R is a D-dimensional fitness function, also

known as objective function. Without loss of

generality we assume all problems are stated as

minimization problems. Ideally our goal is to find a

point xmin ∈ S such that f(xmin) is a global minimum

on S, that is:

∀x ∈ S : f(xmin) ≤ f(x),

where x = (x1, x2, …, xD). In practice, the goal of

our algorithm is to find the smallest fitness value

(closest to f(xmin)) that it can before reaching a stopping

criterion. We will consider in this paper that the

stopping criterion for the algorithms is a specified

maximum number of fitness function evaluations.

According to the description by Storn and Price [2],

the pseudocode of classical DE has been summarized

in Fig.1.

It is apparent from the pseudocode of Fig. 1 that

classical DE algorithm uses uniform crossover and

DE/rand/1 mutation strategy. Crossover operation is

rotation variant while mutation operation is rotation-

invariant. But rotation-invariant property is one of

the desirable properties while solving function

optimization problems. The rotation-invariant

property helps algorithms to solve rotated problems.

Thus the only way to preserve rotation-invariant
properties in a DE algorithm is to make the crossover
operation rotation-invariant. Since classical DE uses
uniform crossover, we will first show why uniform
crossover is not rotation-invariant. Then we will
propose a rotation-invariant crossover operator.
Details are discussed in the following subsections.

Algorithm 1: Classical Differential Evolution Algorithm

Generate an initial population P consisting of µ individuals
    within the specified upper and lower bounds ;
FE = 0 ;
while (FE < FEmax) {
    for (i = 1; i <= µ; i++) {
        Randomly select 3 mutually distinct individuals xr1, xr2,
            and xr3, all different from xi ;
        Generate mutated offspring vi from xr1, xr2, and xr3 using (1) ;
        Generate offspring ui from parent xi and mutated offspring vi
            by performing crossover operation using (5) ;
        // Survival selection: keep the better individual between xi and ui
        if ( f(ui) <= f(xi) )
            zi = ui ;
        else
            zi = xi ;
        FE = FE + 1 ;
    }
    P = { zi, i = 1,2,…,µ } ;
}

Figure 1. The pseudocode of classical DE. FE is the number of function evaluations.
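The classical DE loop of Fig. 1 can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function name `classical_de`, its default parameter values, and the choice to count the evaluations of the initial population are assumptions, and no bound handling is applied to mutants that leave the search box because the text does not specify any.

```python
import random

def classical_de(f, lower, upper, mu=30, F=0.5, CR=0.6, fe_max=6000, seed=0):
    """Minimize f over the box [lower, upper] with DE/rand/1 and uniform crossover."""
    rng = random.Random(seed)
    D = len(lower)
    # Uniform random initialization inside the prescribed bounds.
    pop = [[rng.uniform(lower[j], upper[j]) for j in range(D)] for _ in range(mu)]
    fit = [f(x) for x in pop]
    fe = mu  # assumption: the initial evaluations count toward the budget
    while fe < fe_max:
        new_pop, new_fit = [], []
        for i in range(mu):
            # Three mutually distinct indices, all different from i.
            r1, r2, r3 = rng.sample([k for k in range(mu) if k != i], 3)
            # Mutation: weighted difference of two individuals added to a third.
            v = [pop[r1][j] + F * (pop[r2][j] - pop[r3][j]) for j in range(D)]
            # Uniform crossover: j_rand guarantees at least one mutant component.
            j_rand = rng.randrange(D)
            u = [v[j] if (rng.random() < CR or j == j_rand) else pop[i][j]
                 for j in range(D)]
            fu = f(u)
            fe += 1
            # Greedy survival selection between parent and trial vector.
            if fu <= fit[i]:
                new_pop.append(u)
                new_fit.append(fu)
            else:
                new_pop.append(pop[i])
                new_fit.append(fit[i])
        pop, fit = new_pop, new_fit
    best = min(range(mu), key=lambda i: fit[i])
    return pop[best], fit[best]

def sphere(x):
    """Simple test objective: sum of squares."""
    return sum(xi * xi for xi in x)

best_x, best_f = classical_de(sphere, [-100.0] * 5, [100.0] * 5)
print(best_f)
```

The whole generation is built from the old population before it replaces it, which mirrors the assignment P = { zi } at the end of each pass in Fig. 1.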

2.2 Mutation Operation

After initialization, classical DE employs mutation

operation to produce a mutant vector vi with respect

to each individual xi, so-called target vector, in the

current population. For each target vector xi, its

associated mutant vector vi can be generated via

DE/rand/1 mutation strategy as follows:

$$ v_i = x_{r_1} + F\,(x_{r_2} - x_{r_3}) \qquad (1) $$

The indices r1, r2 and r3 are mutually distinct

integers randomly generated within the range [1, µ],

which are also different from the index i. These

indices are randomly generated once for each mutant

vector. The scaling factor F is a positive control

parameter for scaling the difference vector. µ is the

population size. The following mutation strategies

are frequently used in DE literature:

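DE/best/1:

$$ v_i = x_{best} + F\,(x_{r_1} - x_{r_2}) \qquad (2) $$

DE/rand-to-best/1:

$$ v_i = x_i + F\,(x_{best} - x_i) + F\,(x_{r_1} - x_{r_2}) \qquad (3) $$

DE/best/2:

$$ v_i = x_{best} + F\,(x_{r_1} - x_{r_2}) + F\,(x_{r_3} - x_{r_4}) \qquad (4) $$

Here x_best is the best individual in the current population, and r1, r2, r3 and r4 are mutually distinct random indices different from i.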
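The four mutation strategies can be collected in one small Python function, sketched below. The helper names and the `mutate` signature are hypothetical. The forms used for DE/best/1, DE/rand-to-best/1 and DE/best/2 are the ones commonly found in the DE literature and are an assumption here, as is drawing four distinct indices up front for every strategy.

```python
import random

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def scale(c, a):
    return [c * x for x in a]

def mutate(strategy, pop, i, best, F, rng):
    """Return the mutant v_i for target index i; best is the index of the best individual."""
    others = [k for k in range(len(pop)) if k != i]
    # Four mutually distinct individuals, all different from x_i (needs len(pop) >= 5).
    r1, r2, r3, r4 = (pop[k] for k in rng.sample(others, 4))
    x_i, x_best = pop[i], pop[best]
    if strategy == "rand/1":
        return add(r1, scale(F, sub(r2, r3)))
    if strategy == "best/1":
        return add(x_best, scale(F, sub(r1, r2)))
    if strategy == "rand-to-best/1":
        return add(x_i, add(scale(F, sub(x_best, x_i)), scale(F, sub(r1, r2))))
    if strategy == "best/2":
        return add(x_best, add(scale(F, sub(r1, r2)), scale(F, sub(r3, r4))))
    raise ValueError("unknown strategy: " + strategy)

STRATEGIES = ["rand/1", "best/1", "rand-to-best/1", "best/2"]
```

Every strategy is a linear combination of population members whose coefficients sum to one, which is why rotating the population rotates the mutant in the same way.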


2.3 Crossover Operation

After the mutation phase, uniform crossover

operation is applied to each pair of the target vector

xi and its corresponding mutant vector vi to generate

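a trial vector ui = {ui^1, …, ui^D} as follows:

$$ u_i^j = \begin{cases} v_i^j, & \text{if } \mathrm{rand}_j[0,1) < CR \text{ or } j = j_{rand} \\ x_i^j, & \text{otherwise} \end{cases} \qquad (5) $$

Here, j = 1, 2, …, D, and rand_j[0,1) is a uniform random number drawn anew for each j. The crossover rate CR is a user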

specified constant within the range [0,1), which

controls the fraction of parameter values copied from

the mutant vector. jrand is a randomly chosen integer

in the range [1, D].

Fig. 2 illustrates uniform crossover. Black circles
correspond to the parents and one of the white circles
corresponds to the child. When a given problem is
rotated and the search points are rotated with it, the child
corresponds to one of the circles with (red) diagonal
lines and does not correspond to one of the gray (green)
circles. Therefore, uniform crossover is not
rotation-invariant [8].
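The difference between the two operators can be checked numerically in two dimensions. The sketch below is an illustration with arbitrarily chosen points, a fixed rotation angle and a fixed crossover mask, all of which are assumptions made for the demonstration. It compares "rotate the result" with "apply the operator to rotated inputs" for DE/rand/1 mutation and for uniform crossover.

```python
import math

def rotate(x, theta):
    """Rotate a 2-D point counter-clockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [c * x[0] - s * x[1], s * x[0] + c * x[1]]

def mutation(x1, x2, x3, F=0.5):
    """DE/rand/1 mutant: x1 + F * (x2 - x3)."""
    return [a + F * (b - c) for a, b, c in zip(x1, x2, x3)]

def uniform_crossover(x, v, mask):
    """Take v[j] where mask[j] is true, otherwise x[j] (mask fixed for the demo)."""
    return [vj if m else xj for xj, vj, m in zip(x, v, mask)]

def close(a, b, tol=1e-9):
    return all(abs(p - q) <= tol for p, q in zip(a, b))

theta = math.pi / 6
x, x1, x2, x3 = [1.0, 2.0], [0.0, 1.0], [3.0, -1.0], [2.0, 2.0]
mask = [True, False]  # child takes coordinate 0 from the mutant, coordinate 1 from the parent

v = mutation(x1, x2, x3)
v_rot = mutation(rotate(x1, theta), rotate(x2, theta), rotate(x3, theta))
mutation_commutes = close(rotate(v, theta), v_rot)

u = uniform_crossover(x, v, mask)
u_rot = uniform_crossover(rotate(x, theta), v_rot, mask)
crossover_commutes = close(rotate(u, theta), u_rot)

print(mutation_commutes, crossover_commutes)  # True False
```

Mutation commutes with the rotation because it is a linear combination of points, whereas the coordinate-wise choice in uniform crossover depends on the axes and therefore gives a different child after rotation.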

2.3.1 Rotation-Invariant Crossover Operation

In uniform crossover, the child is selected from the
vertices of a hyper-rectangle whose diagonally opposite
corners are occupied by the parent and the mutant
vector. The fixed coordinate system used in uniform
crossover, as seen in Fig. 2, needs to be replaced to
obtain a rotation-invariant crossover operation. For
this purpose, we use the Gram-Schmidt process to build
a coordinate system that is defined by the search points.

2.3.2 Gram-Schmidt process

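1) Calculating the centroid of the search points:

$$ x_g = \frac{1}{\mu} \sum_{i=1}^{\mu} x_i \qquad (6) $$

2) Calculating directional vectors from the centroid:

$$ d_i = x_i - x_g, \quad i = 1, 2, \ldots, \mu \qquad (7) $$

3) Selecting p vectors randomly from the directional vectors:

$$ \{\, y_k \mid y_k \in \{d_1, \ldots, d_\mu\},\ k = 1, 2, \ldots, p \,\} \qquad (8) $$

4) Orthonormalizing the selected vectors using Gram-Schmidt process:

$$ b_1 = \frac{y_1}{\| y_1 \|}, \qquad b_k = \frac{y_k - \sum_{j=1}^{k-1} (y_k, b_j)\, b_j}{\left\| y_k - \sum_{j=1}^{k-1} (y_k, b_j)\, b_j \right\|}, \quad k = 2, \ldots, p $$

where (y, b) is the inner product of y and b.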
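The four steps above can be sketched in plain Python as follows. The function name `gram_schmidt_basis`, the `eps` tolerance and the decision to raise an error on linearly dependent vectors are assumptions made for this illustration; the text does not say how such a case is handled.

```python
import math
import random

def dot(a, b):
    """Inner product (a, b)."""
    return sum(x * y for x, y in zip(a, b))

def gram_schmidt_basis(pop, p, rng, eps=1e-12):
    """Build p orthonormal vectors from the search points in pop (steps 1 to 4)."""
    mu, D = len(pop), len(pop[0])
    # Step 1: centroid of the search points.
    centroid = [sum(x[j] for x in pop) / mu for j in range(D)]
    # Step 2: directional vectors from the centroid.
    directions = [[x[j] - centroid[j] for j in range(D)] for x in pop]
    # Step 3: select p directional vectors at random.
    selected = rng.sample(directions, p)
    # Step 4: classical Gram-Schmidt orthonormalization.
    basis = []
    for y in selected:
        w = list(y)
        for b in basis:
            c = dot(y, b)  # component of y along an earlier basis vector
            w = [wj - c * bj for wj, bj in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm < eps:
            raise ValueError("selected vectors are linearly dependent")
        basis.append([wj / norm for wj in w])
    return basis
```

Because the directional vectors sum to zero, at most µ − 1 of them are linearly independent, so p should not exceed min(D, µ − 1).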

2.3.3 Rotation-Invariant Uniform and

Exponential Crossover

In uniform crossover, each element of the child is
taken from either the parent xi or the mutant vector vi.
Thus, the rotation-invariant uniform crossover operation
using Gram-Schmidt process can be defined as follows:

$$ \Delta_i = v_i - x_i, \quad i = 1, 2, \ldots, \mu \qquad (9) $$

$$ u_i = x_i + \sum_{k \in K} (\Delta_i, e_k)\, e_k \qquad (10) $$

Δi is the vector from parent xi to the mutant vi. K is
the set of indexes of selected elements, and ek is a
unit vector of which the kth element is 1 and the other
elements are 0. The ek's can be replaced by the bk's, the new
coordinate vectors obtained in subsection 2.3.2.
Finally, the two-dimensional rotation-invariant
uniform crossover (RIUC) is illustrated in Fig. 2
and its pseudocode is presented in Fig. 3. We have
also incorporated rotation-invariant exponential
crossover (RIEC) in this paper. RIEC is quite
similar to RIUC; its pseudocode is presented
in Fig. 4. More details about both operators can be found in
[8].

Fig. 2: Crossover operation, panels (a) and (b).


Algorithm 2: Rotation-Invariant Uniform Crossover

Δ = vi − xi ;
ui = xi ;
j = rand(1,p) ;
for (k = 1; k ≤ p; k++)
{
    if (k == j || u(0,1) < CR)
        ui = ui + (Δ, bk) bk ;
}

Figure 3. The pseudocode of Rotation-Invariant Uniform Crossover, where rand(1,p) generates a random integer in the range [1,p] and u(0,1) generates a random real number in the range [0,1].

Algorithm 3: Rotation-Invariant Exponential Crossover

Δ = vi − xi ;
ui = xi ;
j = rand(1,p) ;
for (k = 1; k ≤ p && u(0,1) < CR; k++)
{
    ui = ui + (Δ, bj) bj ;
    j = (j mod p) + 1 ;
}

Figure 4. The pseudocode of Rotation-Invariant Exponential Crossover.
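Both crossover operators can be sketched in Python as shown below. The sketch assumes that an orthonormal `basis` (the vectors produced by the Gram-Schmidt process) is passed in by the caller, and it uses 0-based indices instead of the 1-based indices of the pseudocode. The function names `riuc` and `riec` and their signatures are hypothetical.

```python
import random

def dot(a, b):
    """Inner product (a, b)."""
    return sum(x * y for x, y in zip(a, b))

def riuc(x, v, basis, CR, rng):
    """Rotation-invariant uniform crossover along the orthonormal vectors in basis."""
    delta = [vj - xj for xj, vj in zip(x, v)]  # vector from parent to mutant
    u = list(x)
    p = len(basis)
    j = rng.randrange(p)  # this direction is always used
    for k in range(p):
        if k == j or rng.random() < CR:
            c = dot(delta, basis[k])  # component of delta along basis[k]
            u = [uj + c * bj for uj, bj in zip(u, basis[k])]
    return u

def riec(x, v, basis, CR, rng):
    """Rotation-invariant exponential crossover along the orthonormal vectors in basis."""
    delta = [vj - xj for xj, vj in zip(x, v)]
    u = list(x)
    p = len(basis)
    j = rng.randrange(p)  # random starting direction
    k = 0
    # As in the pseudocode, the loop may stop before any component is copied.
    while k < p and rng.random() < CR:
        c = dot(delta, basis[j])
        u = [uj + c * bj for uj, bj in zip(u, basis[j])]
        j = (j + 1) % p  # wrap around, 0-based
        k += 1
    return u
```

With a complete orthonormal basis and CR close to 1, both operators reproduce the mutant, and with the standard unit vectors as the basis `riuc` reduces to ordinary uniform crossover.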

3. ALGORITHM OUTLINE

To achieve the most satisfactory optimization

performance by applying the classical DE to a given

problem, it is common to perform a trial-and-error

search for the most appropriate mutation strategy and

fine-tune its associated control parameter values.

This search can consume a large amount of
computation. Moreover, the suitability of each mutation
and crossover operator depends on the
problem being optimized. Motivated by these
observations, we propose a mixed mutation and
mixed crossover strategy based DE (MMCDE)
algorithm that integrates the advantages
of several mutation and crossover operators in one
algorithm. MMCDE contains two pools: 1) a mutation
pool that contains four different mutation strategies,
and 2) a crossover pool that contains two different
crossover strategies. The crossover operators that
MMCDE incorporates are rotation-invariant in
nature. Thus the proposed algorithm is expected to perform better
when solving rotated, noisy optimization problems.

The major steps of MMCDE can be described as

follows:

Step 1) Initialization: Randomly initialize a

population of µ individuals P = {x1, x2,…,xµ} with xi

= {xi1, xi2, …, xiD}, i = 1, 2,…,µ uniformly distributed

in the range [xmin , xmax] where xmin = {xmin1, xmin2, …,

xminD} and xmax = {xmax1, xmax2, …, xmaxD}. Set the

generation counter G=0. Initialize crossover rate CR,

scaling factor F.

Step 2) Evaluate the fitness of each individual of the

population P.

Step 3) Termination condition: If the number of

function evaluations exceeds the maximum number

of evaluations FEmax, the algorithm is terminated.

Step 4) Mutation: Choose a random integer t in the

range [1, 4].

If t=1: Use DE/rand/1 mutation strategy to produce

mutated offspring vi. (using (1))

Else If t=2: Use DE/best/1 mutation strategy to

produce mutated offspring vi. (using (2))

Else If t=3: Use DE/rand-to-best/1 mutation strategy

to produce mutated offspring vi. (using (3))

Else If t=4: Use DE/best/2 mutation strategy to

produce mutated offspring vi. (using (4))

Step 5) Crossover: Choose a random integer t in the

range [1, 2].

If t=1: Use RIUC crossover operation to produce

offspring ui. (Pseudocode of Fig. 3)

Else If t=2: Use RIEC crossover operation to produce

offspring ui. (Pseudocode of Fig. 4)

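Step 6) Selection:

$$ z_i = \begin{cases} u_i, & \text{if } f(u_i) \le f(x_i) \\ x_i, & \text{otherwise} \end{cases} \qquad (11) $$

where zi is the individual that survives to the next

generation.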

Step 7) Set G=G+1. Go back to Step 2.
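Steps 4 to 6 can be sketched as one pass over the population, as shown below. This is an outline under stated assumptions rather than the authors' code: the operator pools are passed in as lists of callables with hypothetical signatures, the operators are chosen uniformly at random, and the three toy operators exist only so that the sketch runs on its own.

```python
import random

def mmcde_generation(pop, fit, f, mutations, crossovers, rng):
    """One pass of Steps 4 to 6 over the population.

    mutations:  list of callables (pop, fit, i, rng) -> mutant v_i
    crossovers: list of callables (x_i, v_i, pop, rng) -> trial u_i
    Both signatures are hypothetical.
    """
    new_pop, new_fit = [], []
    for i, x in enumerate(pop):
        mutate = rng.choice(mutations)   # Step 4: pick a mutation strategy
        cross = rng.choice(crossovers)   # Step 5: pick a crossover operator
        v = mutate(pop, fit, i, rng)
        u = cross(x, v, pop, rng)
        fu = f(u)
        # Step 6: greedy selection between the parent and the trial vector.
        if fu <= fit[i]:
            new_pop.append(u)
            new_fit.append(fu)
        else:
            new_pop.append(x)
            new_fit.append(fit[i])
    return new_pop, new_fit

# Toy stand-ins so the sketch is runnable; they are not the operators of the paper.
def toward_zero(pop, fit, i, rng):
    return [0.5 * xj for xj in pop[i]]

def away_from_zero(pop, fit, i, rng):
    return [2.0 * xj for xj in pop[i]]

def take_mutant(x, v, pop, rng):
    return list(v)

def sphere(x):
    return sum(xi * xi for xi in x)

rng = random.Random(3)
pop = [[1.0, 1.0], [2.0, -2.0]]
fit = [sphere(x) for x in pop]
pop, fit = mmcde_generation(pop, fit, sphere,
                            [toward_zero, away_from_zero], [take_mutant], rng)
```

Keeping the pools as plain lists makes it straightforward to add further operators, which matches the extension mentioned as future work.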

The proposed algorithm picks the mutation and
crossover operators at random. This choice involves a
trade-off. Its disadvantage is that it is not
self-adaptive. A self-adaptive scheme, however, often
introduces bias: depending on the structure of the
fitness landscape, self-adaptation may give better
optimization for a certain group of problems.
Randomization, in contrast, introduces no such bias,
so the choice of mutation and crossover operators at
different stages of evolution is purely random. As a
result, MMCDE often produces better optimization, as
is evident from the experimental studies discussed in
the next section.

4. EXPERIMENTAL STUDIES

A function is multimodal if it has multiple local
optima. In order to minimize such a function, the
search process must be able to avoid being trapped in
the regions around local minima so that it can reach the
global minimum. The difficulty grows with the
dimensionality of the problem, because the number
of local minima increases exponentially with the
number of dimensions. As our main objective is to
achieve a reliable optimization performance and
most of the unstable cases occur on multimodal
problems, we choose four unimodal functions and
eight multimodal functions. Table I presents these
functions with their initial ranges. More details can
be found in the original reference [9]. Among the
eight multimodal functions, f5-f9 are high-dimensional
multimodal functions and f10-f12 are low-dimensional
multimodal functions.

TABLE I. BENCHMARK FUNCTIONS USED IN THE EXPERIMENTAL STUDY. D IS THE DIMENSIONALITY OF THE FUNCTION.

Test Functions                                                                   Initial Range

$f_1(x) = \sum_{i=1}^{D} x_i^2$                                                  [-100, 100]^D
$f_2(x) = \sum_{i=1}^{D} |x_i| + \prod_{i=1}^{D} |x_i|$                          [-10, 10]^D
$f_3(x) = \sum_{i=1}^{D-1} [100 (x_{i+1} - x_i^2)^2 + (x_i - 1)^2]$              [-30, 30]^D
$f_4(x) = \sum_{i=1}^{D} i x_i^4 + \mathrm{random}[0, 1)$                        [-1.28, 1.28]^D
$f_5(x) = \sum_{i=1}^{D} [x_i^2 - 10 \cos(2 \pi x_i) + 10]$                      [-5.12, 5.12]^D
$f_6(x) = -20 \exp(-0.2 \sqrt{\frac{1}{D} \sum_{i=1}^{D} x_i^2}) - \exp(\frac{1}{D} \sum_{i=1}^{D} \cos(2 \pi x_i)) + 20 + e$    [-32, 32]^D
$f_7(x) = 1 + \frac{1}{4000} \sum_{i=1}^{D} x_i^2 - \prod_{i=1}^{D} \cos(x_i / \sqrt{i})$    [-600, 600]^D
f8(x): see [9]                                                                   [-50, 50]^D
f9(x): see [9]                                                                   [-50, 50]^D
f10(x): see [9]                                                                  [-65.536, 65.536]^D
f11(x): see [9]                                                                  [-5, 5]^D
f12(x): see [9]                                                                  [-5, 10] × [0, 15]
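Some of the benchmark functions are short enough to write out directly. The Python sketch below assumes that f1, f5, f6 and f7 are the sphere, Rastrigin, Ackley and Griewank functions of [9], which is what the initial ranges and the mention of Ackley's function in the text suggest. The function names are chosen here for illustration.

```python
import math

def sphere(x):
    """Assumed f1: sum of squares, global minimum 0 at the origin."""
    return sum(xi * xi for xi in x)

def rastrigin(x):
    """Assumed f5: highly multimodal, global minimum 0 at the origin."""
    return sum(xi * xi - 10.0 * math.cos(2.0 * math.pi * xi) + 10.0 for xi in x)

def ackley(x):
    """Assumed f6 (Ackley's function): global minimum 0 at the origin."""
    d = len(x)
    mean_sq = sum(xi * xi for xi in x) / d
    mean_cos = sum(math.cos(2.0 * math.pi * xi) for xi in x) / d
    return -20.0 * math.exp(-0.2 * math.sqrt(mean_sq)) - math.exp(mean_cos) + 20.0 + math.e

def griewank(x):
    """Assumed f7: global minimum 0 at the origin."""
    s = sum(xi * xi for xi in x) / 4000.0
    p = math.prod(math.cos(xi / math.sqrt(i)) for i, xi in enumerate(x, start=1))
    return s - p + 1.0

def error(f, x, f_star=0.0):
    """Error of a solution x with respect to the known global minimum f_star."""
    return f(x) - f_star
```

All four functions have a known global minimum of 0, so the error of a solution is simply its function value.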

4.1 Parameter Settings

In the experiments, the following parameters are used:
population size μ = 100, crossover rate CR = 0.6, and
scaling factor F = 0.5. These values are chosen to
make a fair comparison with previous works. We
primarily consider DE [2] and NSDE [3] for
comparison.

TABLE II. COMPARISON BETWEEN MMCDE, DE [2], AND NSDE [3] ON 12 CLASSICAL BENCHMARK FUNCTIONS. BEST RESULTS ARE MARKED WITH BOLDFACE FONTS (AVERAGED OVER 50 RUNS)

4.2 Result Analysis

All the results shown in this section are error
values with respect to the optimal solution of the
problem. The error value is computed as:

Error = f(x) − f(x*) (12)

where f(x) is the best function value obtained by the
algorithm, while f(x*) is the known global minimum of
the particular benchmark function. For each benchmark
function, 50 independent runs were performed and the
error was averaged over the 50 runs. Table II shows
the mean error of MMCDE on the 12 test functions
in comparison with DE and NSDE. For unimodal
functions f1-f4, MMCDE achieves much better results
than DE and NSDE on three out of four functions.
For multimodal functions f5-f9, MMCDE achieves a
lower mean error than DE and NSDE on 4 and 2 out of 5
functions respectively. Among these functions, DE
performs better than MMCDE only on Ackley's function
(f6). Meanwhile, NSDE performs significantly better
than MMCDE on f6 and f7 but shows almost equal
performance on f9. The performance of MMCDE,
DE, and NSDE is almost the same for the low-dimensional
multimodal functions f10-f12.

Table II also shows the results of a t-test at the
5% significance level between MMCDE and
each of the other algorithms. “+” and “−” indicate
that MMCDE is significantly better and worse
than the compared algorithm, respectively, and “≈”
indicates that the difference is not statistically
significant. The t-test results show that MMCDE is
significantly better than DE on 6 of the 12 functions
and significantly worse on 2, and significantly better
than NSDE on 5 functions and significantly worse on 3.

No     MMCDE Mean Error   DE Mean Error   NSDE Mean Error   Vs. DE   Vs. NSDE
f1     2.47E−21           1.81E−13        7.10E−17          +        +
f2     9.66E−20           6.43E−07        6.49E−11          +        +
f3     4.01E−26           0               5.90E−28          −        −
f4     6.54E−04           4.84E−03        4.97E−03          +        +
f5     7.46E−06           138.70          3.98E−02          +        +
f6     2.77E−04           1.20E−07        1.69E−09          −        −
f7     3.57E−05           1.97E−04        5.80E−16          ≈        −
f8     0.0                1.98E−14        5.37E−18          +        +
f9     9.95E−16           1.16E−13        6.37E−17          +        ≈
f10    2.00E−03           2.00E−03        2.00E−03          ≈        ≈
f11    1.20E−03           1.60E−03        1.60E−03          ≈        ≈
f12    2.00E−03           2.00E−03        2.00E−03          ≈        ≈
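The comparison marks of Table II could be produced along the following lines. The paper does not state which variant of the t-test was used, so this sketch assumes a two-sided Student's t-test with pooled variance. The critical value must be looked up for the degrees of freedom at hand (about 1.984 for two samples of 50 runs, 2.306 for the two samples of 5 used below), and the sample values are made up for illustration.

```python
import math
import statistics

def two_sample_t(a, b):
    """Student's two-sample t statistic with pooled variance, and its degrees of freedom."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    # Assumes the pooled variance is non-zero.
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled * (1.0 / na + 1.0 / nb))
    return t, na + nb - 2

def verdict(errors_new, errors_ref, t_crit):
    """'+' if errors_new is significantly smaller, '−' if significantly larger, '≈' otherwise."""
    t, _ = two_sample_t(errors_new, errors_ref)
    if t < -t_crit:
        return "+"
    if t > t_crit:
        return "−"
    return "≈"

# Hypothetical per-run errors of two algorithms (not data from the paper).
errors_a = [1.0, 1.1, 0.9, 1.2, 0.8]
errors_b = [2.0, 2.1, 1.9, 2.2, 1.8]
t_value, dof = two_sample_t(errors_a, errors_b)
mark = verdict(errors_a, errors_b, t_crit=2.306)  # two-sided 5% critical value for 8 degrees of freedom
```

Lower errors are better, so a significantly negative t statistic for the first sample is reported as “+”.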

5. CONCLUSION AND FUTURE WORKS

In this paper, we have proposed a differential
evolution algorithm that incorporates rotation-invariant
crossover operators so that the rotation-invariant
property is preserved throughout the evolutionary
process. The Gram-Schmidt process is applied to the
candidate solutions in order to obtain orthonormal
vectors, and these vectors form the new coordinate
system that provides the rotation-invariant property.
As the evolutionary process progresses, the mutation
and crossover operators are selected randomly. The
performance of the proposed MMCDE algorithm is
evaluated and discussed on a set of 12 classical
benchmark functions. MMCDE is significantly better
than DE on 6 and than NSDE on 5 of these functions.

A self-adaptive strategy has not been introduced in
the proposed MMCDE algorithm. In future, we will
apply a self-adaptive strategy to gradually adapt
the choice of crossover and mutation operators to
promote search space exploration and exploitation.
We will also reconstruct the crossover and mutation
pools to make room for more crossover and mutation
operators. The optimal choice of the crossover and
mutation operators is an attractive research issue and
deserves further investigation.

REFERENCES

[1] A. K. Qin and P. N. Suganthan, “Self-adaptive differential evolution algorithm for numerical optimization,” Proc. of the 2005 IEEE Congress on Evolutionary Computation, vol. 2, pp. 1785–1791, 2005.

[2] R. Storn and K. V. Price, “Differential evolution - A simple and efficient heuristic for global optimization over continuous spaces,” J. Global Optim., vol. 11, pp. 341–359, 1997.

[3] Z. Yang, J. He, and X. Yao, “Making a difference to differential evolution,” in Advances in Metaheuristics for Hard Optimization, Z. Michalewicz and P. Siarry, Eds. Springer, 2007, pp. 397–414.

[4] Z. Yang, K. Tang, and X. Yao, “Self-adaptive differential evolution with neighborhood search,” Proc. of the 2008 IEEE Congress on Evolutionary Computation, vol. 2, pp. 1110–1116, 2008.

[5] J. H. Holland, Adaptation in Natural and Artificial Systems. Ann Arbor, MI: University of Michigan Press, 1975.

[6] D. Zaharie, “Critical values for the control parameters of differential evolution algorithms,” Proc. of the 8th International Conference on Soft Computing, pp. 62–67, 2002.

[7] R. Gamperle, S. Muller, and P. Koumoutsakos, “A parameter study for differential evolution,” Proc. WSEAS International Conference on Advances in Intelligent Systems, Fuzzy Systems, Evolutionary Computation, pp. 293–298, 2002.

[8] T. Takahama and S. Sakai, “Solving nonlinear optimization problems by differential evolution with a rotation-invariant crossover operation using Gram-Schmidt process,” Proc. of the 2010 Second World Congress on Nature and Biologically Inspired Computing (NaBIC), pp. 526–533, Dec. 2010.

[9] X. Yao, Y. Liu, and G. Lin, “Evolutionary programming made faster,” IEEE Trans. Evol. Comput., vol. 3, no. 2, pp. 82–102, 1999.

[10] K. V. Price, R. M. Storn, and J. A. Lampinen, Differential Evolution: A Practical Approach to Global Optimization, 1st ed. New York: Springer-Verlag, Dec. 2005.

[11] R. Gamperle, S. D. Muller, and P. Koumoutsakos, “A parameter study for differential evolution,” Proc. of Advances Intell. Syst., Fuzzy Syst., Evol. Comput., Crete, Greece, 2002, pp. 293–298.
