

    JNM: Journal of Natural Sciences and Mathematics

    Vol. 2, No. 1, pp. 17-40 (June 2008/Jumada Al-Akher 1429 H)

© Qassim University Publications

Reduced Gradient Method and its Generalization via Stochastic Perturbation

    Abdelkrim El Mouatasim

    Department of Mathematics,

    Faculty of Sciences, Jazan University,

    B.P. 114, Jazan, Saudi Arabia.

    E-mail: [email protected]

Abstract: In this paper, the global optimization of a nonconvex objective function under linear and nonlinear differentiable constraints is studied. A reduced gradient and generalized reduced gradient (GRG) descent method with random perturbations is proposed, and the global convergence of the algorithm is established. Some numerical examples are also given: statistical, octagon, mixture, alkylation and pooling problems.

Keywords: Stochastic perturbation, constrained optimization, reduced gradient method and its generalization, nonconvex optimization.

    Received for JNM on November 19, 2007.

    1 Introduction

    The general constrained optimization problem is to minimize a nonlinear function

    subject to linear or nonlinear constraints. Two equivalent formulations of this problem

    are useful for describing algorithms. They are

Minimize:   f(x)
subject to: g_i(x) = 0,  i = 1, ..., m_l,
            g_i(x) ≤ 0,  i = m_l + 1, ..., m,
            ℓ ≤ x ≤ u,                                              (1.1)

where f and each g_i : ℝ^n → ℝ are twice continuously differentiable functions and the lower- and upper-bound vectors ℓ and u may contain some infinite components; and

Minimize:   f(x)
Subject to: h(x) = 0,
            x ≥ 0,                                                  (1.2)


where h maps ℝ^n into ℝ^m, the objective function f : ℝ^n → ℝ and the restrictions h_j : ℝ^n → ℝ, j = 1, ..., m (the components of h) are twice continuously differentiable functions. For the case of multi-objective optimization see [10].

The main technique proposed for solving constrained optimization problems in this work is the reduced gradient method and its generalization. We are mainly interested in the situation where, on the one hand, f is not convex and, on the other hand, the constraints are in general not linear; to get equality constraints one might think about introducing slack variables and incorporating them into the functions g_i.
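For instance, an inequality constraint is converted as

g_i(x) ≤ 0   ⇔   g_i(x) + z_i = 0,   z_i ≥ 0,

with the slack variable z_i appended to the vector of unknowns; this is how (1.1) is brought into the equality-constrained form (1.2), and it is the device used again in Section 5.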

The problem (1.2) can be numerically approached by using the reduced gradient method or its generalizations, which generate a sequence {x_k}_{k≥0}, where x_0 is an initial feasible point and, for each k ≥ 0, a new feasible point x_{k+1} is generated from x_k by using an operator Q_k (see Section 3). Thus the iterations are given by:

∀k ≥ 0:  x_{k+1} = Q_k(x_k).                                        (1.3)

In order to prevent convergence to a local minimum, various modifications of these basic methods have been introduced in the literature. For instance, the reduced gradient method has been considered by [1]; the random perturbations projected gradient method in [2]; the sequential convex programming in [24]; the generalized reduced gradient in [5, 7, 25, 26, 32, 34]; sequential quadratic programming in [15, 30]; nonlinear programming in [3, 20, 27]; stochastic perturbation of the active set in [11]. Moreover, stochastic or evolutionary search can be introduced, but these methods are usually considered expensive and not accurate: on the one hand, the flexibility introduced by randomization often implies a large number of evaluations of the objective function and, on the other hand, pure random search generates many points which do not improve the value of the objective function. This remark has led some people to introduce a controlled random search (see for instance [2, 4, 6, 9, 23, 28, 33]).

In such a method, the sequence {x_k}_{k≥0} is replaced by a sequence of random vectors {X_k}_{k≥0} and the iterations are modified as follows:

∀k ≥ 0:  X_{k+1} = Q_k(X_k) + P_k,                                  (1.4)

where P_k is a suitable random variable, called the stochastic perturbation. The sequence {P_k}_{k≥0} goes to zero slowly enough in order to prevent convergence to a local minimum (see Section 4). The reduced gradient method and its generalization are recalled in Section 3, while the notations are introduced in Section 2. The results of numerical experiments will be given in Section 5.

    2 Notations and assumptions

We denote by ℝ the set of real numbers (−∞, +∞), by ℝ+ the set of nonnegative real numbers [0, +∞), and by E = ℝ^n the n-dimensional real Euclidean space. For


x = (x_1, x_2, ..., x_n)^t ∈ E, x^t denotes the transpose of x. We denote by ‖x‖ = (x^t x)^{1/2} = (x_1^2 + ... + x_n^2)^{1/2} the Euclidean norm of x.

The general optimization problem (1.1) is equivalent to the global optimization problem (1.2) with equality constraints and nonnegative variables.

Remark 2.1  The assumed conditions can be satisfied by using a two-phase method, since we can add artificial variables to the constraints.

Linear case: h(x) = Ax − b, where A is an m × n matrix with rank m and b is an m-vector. Any m columns of A are linearly independent, and every extreme point of the feasible region has m strictly positive variables.

We introduce basic and nonbasic variables according to

A = [B, N],   x = (x_B, x_N),   x_N ≥ 0,   x_B > 0,                 (2.1)

by the nondegeneracy assumption. Furthermore, the gradient of f may be conformally partitioned as

∇f(x)^t = (∇_B f(x)^t, ∇_N f(x)^t),

where ∇_B f and ∇_N f are the basic gradient and the nonbasic gradient of f, respectively.

Nonlinear case: for simplicity we also assume that the gradients of the constraint functions h_j are linearly independent at every point x ≥ 0. This assumption, like the assumption of the full rank of the matrix A in the linear case, ensures that we can choose a basis, express the basic variables as a linear combination of the nonbasic variables and then reduce the problem. Hence a similar generalized reduced gradient procedure applies.

Let a feasible solution x^k ≥ 0 with h_j(x^k) = 0 for all j be given. By assumption the Jacobian matrix of the constraints h(x) = (h_1(x), ..., h_m(x))^t has full rank at each x ≥ 0; for simplicity, at the point x^k it will be denoted by

A_k = J_h(x^k),   with   b_k = A_k x^k.

Then a construction similar to the linear case applies.

Let

C = {x ∈ E | h(x) = 0, x ≥ 0}.

The objective function is f : E → ℝ; its lower bound on C is denoted by l*: l* = min_C f. Let us introduce

C_α = S_α ∩ C,   S_α = {x ∈ E | f(x) ≤ α}.


We assume that

f is twice continuously differentiable on E;                        (2.2)

∀α > l*:  C_α is not empty, closed and bounded;                     (2.3)

∀α > l*:  meas(C_α) > 0,                                            (2.4)

where meas(C_α) is the measure of C_α.

Since E is a finite dimensional space, assumption (2.3) is verified when C is bounded or f is coercive, i.e., lim_{‖x‖→+∞} f(x) = +∞. Assumption (2.4) is verified when C contains a sequence of neighborhoods of an optimum point x* having strictly positive measure, i.e., when x* can be approximated by a sequence of points of the interior of C. We observe that assumptions (2.2)-(2.3) yield that

C = ∪_{α>l*} C_α,   i.e.,   ∀x ∈ C, ∃α > l* such that x ∈ C_α.

From (2.2) and (2.3) we get

ν_1 = sup{ ‖∇f(x)‖ : x ∈ C_α } < +∞.

Then

ν_2 = sup{ ‖d‖ : x ∈ C_α } < +∞,

where d is the direction of the reduced gradient method (see Section 3). Thus,

γ(α, β) = sup{ ‖y − (x + ηd)‖ : (x, y) ∈ C_α × C_α, 0 ≤ η ≤ β } < +∞,          (2.5)

where α, β are positive real numbers.

    3 Reduced Gradient and its Generalization

    3.1 Reduced Gradient Method

The reduced gradient method is closely related to the simplex method of linear programming in that the problem variables are partitioned into basic and nonbasic groups. From a theoretical viewpoint, the method can be shown to behave very much like the gradient projection method [1]. Some variants of the method applied to convex nonlinear problems with linear constraints are also referred to as the convex simplex method.

Again we are dealing with the same basic descent algorithm structure and are interested in still other means for computing descent directions. We use the following form of the linearly constrained problem, with f(·) continuously differentiable on ℝ^n:

Minimize:   f(x)
subject to: Ax = b,
            x ≥ 0,                                                  (3.1)


where A is an m × n matrix with full row rank and b ∈ ℝ^m. The reduced gradient method begins with a basis B and a feasible solution x^k = (x^k_B, x^k_N) such that x^k_B > 0. The solution x^k is not necessarily a basic solution, i.e. x_N does not have to be identically zero. Such a solution can be obtained, e.g., by the usual first phase procedure of linear optimization. Using the basis B, from Bx_B + Nx_N = b we have

x_B = B^{-1}b − B^{-1}Nx_N,

hence the basic variables x_B can be eliminated from the problem (3.1):

Minimize:   f_N(x_N)
subject to: B^{-1}b − B^{-1}Nx_N ≥ 0,
            0 ≤ x_N,                                                (3.2)

where f_N(x_N) = f(x) = f(B^{-1}b − B^{-1}Nx_N, x_N). Using the notation

∇f(x)^t = (∇_B f(x)^t, ∇_N f(x)^t),

the gradient of f_N, which is the so-called reduced gradient, can be expressed as

∇f_N(x_N)^t = −∇_B f(x)^t B^{-1}N + ∇_N f(x)^t.

Now let us assume that the basis is nondegenerate, i.e. only the nonnegativity constraints x_N ≥ 0 might be active at the current iterate x^k. Let the search direction be a vector d^t = (d_B^t, d_N^t) in the null space of the matrix A, defined by d_B = −B^{-1}Nd_N with d_N chosen from the reduced gradient. With this definition, the feasibility of x^k + λd is guaranteed as long as x^k_B + λd_B ≥ 0, i.e. as long as λ ≤ λ_max, where

λ_max = min{ −x^k_i / d_i : i ∈ B, d_i < 0 },

with λ_max = +∞ if no component of d_B is negative.
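A minimal sketch of this computation in a dense numpy setting (the helper name, the rule used to freeze nonbasic variables at their bounds, and the data layout are our assumptions, not prescriptions of the paper):

```python
import numpy as np

def reduced_gradient_direction(A, grad, x, basic, nonbasic):
    """One reduced-gradient direction for min f(x) s.t. Ax = b, x >= 0.

    A        : (m, n) constraint matrix with full row rank
    grad     : gradient of f at x, shape (n,)
    basic    : indices of the m basic variables (x[basic] > 0 assumed)
    nonbasic : indices of the remaining variables
    """
    B = A[:, basic]                      # basis matrix
    N = A[:, nonbasic]
    BinvN = np.linalg.solve(B, N)        # B^{-1} N
    # reduced gradient: grad_N - (B^{-1} N)^T grad_B
    r = grad[nonbasic] - BinvN.T @ grad[basic]
    # nonbasic part: move against the reduced gradient, but freeze a
    # nonbasic variable that sits at zero and would be pushed negative
    dN = np.where((x[nonbasic] <= 0.0) & (r > 0.0), 0.0, -r)
    dB = -BinvN @ dN                     # keeps A d = 0
    d = np.zeros_like(x)
    d[basic], d[nonbasic] = dB, dN
    # largest feasible step: x + lam * d >= 0
    neg = d < 0
    lam_max = np.min(-x[neg] / d[neg]) if neg.any() else np.inf
    return d, lam_max
```

The rule for d_N above is only one common variant; the basis is also kept fixed here, whereas practical reduced gradient codes change it when a basic variable reaches its bound.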


    3.2 Generalized reduced gradient (GRG)

The reduced gradient method can be generalized to nonlinearly constrained optimization problems. Similar to the linearly constrained case, we consider the problem with equality constraints and nonnegative variables.

    The reduced gradient method and its generalization try to extend the methods of

    linear optimization to the nonlinear case. These methods are close to, or equivalent

    to projected gradient methods; just the presentation of methods is frequently quite

    different.

    We generate a reduced gradient search direction by virtually keeping the linearized

    constraints valid. This direction by construction will be in the null space of Ak. More

specifically, for the linearized constraints we have

h(x^k) + J_h(x^k)(x − x^k) = 0 + A_k(x − x^k) = 0.

From this, one has

B_k x_{B_k} + N_k x_{N_k} = A_k x^k,

and by introducing the notation b_k = A_k x^k, we have

x_{B_k} = B_k^{-1} b_k − B_k^{-1} N_k x_{N_k}.

Hence the basic variables x_{B_k} can be eliminated from the linearization of the problem (1.2), resulting in the problem (3.2).

From this point the generation of the search direction d goes exactly the same way as in the linearly constrained case. Due to the nonlinearity of the constraints, h(x^{k+1}) = h(x^k + λd) = 0 will not hold in general. Hence something more has to be done to restore feasibility.

In older versions of the GRG, Newton's method is applied to the nonlinear equality system h(x) = 0 from the initial point x^{k+1} to produce a gradient direction, which is combined with a direction from the orthogonal subspace (the range space of A_k^t), and then a modified (nonlinear, discrete) line search is performed. These schemes are quite complicated and are not discussed here in detail.
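As an illustration of the restoration idea only, the following sketch applies Newton's method to h(x) = 0 in the basic variables, keeping the nonbasic variables fixed (our own simplified reading; the schemes used in actual GRG codes are, as said above, more elaborate):

```python
import numpy as np

def restore_feasibility(h, jac, x, basic, tol=1e-10, max_iter=20):
    """Newton correction of the basic variables so that h(x) = 0 again,
    with the nonbasic variables held fixed (GRG restoration phase).

    h   : callable, h(x) -> residual vector of shape (m,)
    jac : callable, jac(x) -> Jacobian of h at x, shape (m, n)
    """
    x = x.copy()
    for _ in range(max_iter):
        r = h(x)
        if np.linalg.norm(r) < tol:
            break
        JB = jac(x)[:, basic]            # square (m, m) block w.r.t. basic vars
        x[basic] -= np.linalg.solve(JB, r)
    return x
```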

Let us recall briefly the essential points of the reduced gradient method: an initial feasible guess x^0 ∈ C is given and a sequence {x^k}_{k≥0} ⊂ C is generated by using iterations of the general form:

∀k ≥ 0:  x^{k+1} = Q_k(x^k) = x^k + λ_k d^k.                        (3.3)

The optimal choice for λ_k is

λ_k ∈ arg min{ f(x^k + λd^k) : 0 ≤ λ ≤ λ_max },                     (3.4)

where

λ_max = min_{1≤j≤n} { −x^k_j / d^k_j : d^k_j < 0 }   if d^k has a negative component,
λ_max = +∞                                            if d^k ≥ 0.

We have f(x^k + λ_k d^k) ≤ f(x^k).
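The one-dimensional minimization in (3.4) can be approximated, as is done in the experiments of Section 5, by an exhaustive search with a fixed step on [0, λ_max]; a minimal sketch (the grid size and the cap on an unbounded step are our own choices):

```python
import numpy as np

def line_search(f, x, d, lam_max, n_steps=100):
    """Approximate argmin of f(x + lam*d) over 0 <= lam <= lam_max by
    exhaustive search on a uniform grid (cf. (3.4) and Section 5)."""
    if not np.isfinite(lam_max):
        lam_max = 1.0                     # assumption: cap an unbounded step
    lams = np.linspace(0.0, lam_max, n_steps + 1)
    vals = np.array([f(x + lam * d) for lam in lams])
    return lams[np.argmin(vals)]
```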


    4 Stochastic perturbation of the reduced gradient

The main difficulty remains the lack of convexity: if f is not convex, the Kuhn-Tucker points may not correspond to a global minimum. In the following we shall improve this point by using an appropriate random perturbation. The deterministic sequence {x_k}_{k≥0} is replaced by a sequence of random vectors {X_k}_{k≥0} involving a random perturbation P_k of the deterministic iteration (3.3). Then we have X_0 = x_0 and

∀k ≥ 0:  X_{k+1} = Q_k(X_k) + P_k = X_k + λ_k d^k + P_k,            (4.1)

where

∀k ≥ 1:  P_k is independent of (X_{k−1}, ..., X_0),                 (4.2)

∀x ∈ C:  Q_k(x) + P_k ∈ C.                                          (4.3)

Equation (4.1) can be viewed as a perturbation of the descent direction d^k, which is replaced by a new direction D_k = d^k + P_k/λ_k, so that the iterations (4.1) become

X_{k+1} = X_k + λ_k D_k.

General properties defining convenient sequences of perturbations {P_k}_{k≥0} can be found in the literature; usually, a sequence of Gaussian laws may be used in order to satisfy these properties.
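One concrete way to realize such a perturbation, anticipating (4.7) and Theorem 2 below (a Gaussian vector truncated by the indicator of C; the feasibility test `in_C` is a problem-specific placeholder of ours, and in practice the perturbed point may additionally need a restoration step such as the one sketched in Section 3.2):

```python
import numpy as np

rng = np.random.default_rng(0)

def perturbation(in_C, xi_k, sigma, n):
    """Stochastic perturbation P_k = xi_k * 1_C(Z_k) * Z_k as in (4.7),
    with Z_k ~ N(0, sigma^2 * Id) as in Theorem 2.

    in_C : callable returning True if its argument is feasible
           (placeholder for the indicator of C)
    """
    z = sigma * rng.standard_normal(n)
    if not in_C(z):
        z = np.zeros(n)                  # truncation 1_C(Z_k) Z_k
    return xi_k * z
```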

We introduce a random vector Z_k; we denote by Φ_k and φ_k the cumulative distribution function and the probability density of Z_k, respectively.

We denote by F_{k+1}(y | X_k = x) the conditional cumulative distribution function

F_{k+1}(y | X_k = x) = P(X_{k+1} < y | X_k = x);

the conditional probability density of X_{k+1} is denoted by f_{k+1}.

Let us introduce a sequence of n-dimensional random vectors {Z_k}_{k≥0} ⊂ C. We consider also {ξ_k}_{k≥0}, a suitable decreasing sequence of strictly positive real numbers converging to 0 and such that ξ_0 ≤ 1. The optimal choice for λ_k is

λ_k ∈ arg min{ f(X_k + λD_k) : 0 ≤ λ ≤ λ_max },                     (4.4)

where

λ_max = min_{1≤j≤n} { −X_{k,j} / D_{k,j} : D_{k,j} < 0 }   if D_k has a negative component,
λ_max = +∞                                                  if D_k ≥ 0.

Then

F_{k+1}(y | X_k = x) = P( Z_k < (y − Q_k(x)) / ξ_k )


and we have

F_{k+1}(y | X_k = x) = Φ_k( (y − Q_k(x)) / ξ_k ),

f_{k+1}(y | X_k = x) = (1 / ξ_k^n) φ_k( (y − Q_k(x)) / ξ_k ),   ∀y ∈ C.        (4.5)

Note that (2.5) shows that

‖y − Q_k(x)‖ ≤ γ(α, β)   for (x, y) ∈ C_α × C_α.

We assume that there exists a decreasing function t ↦ g_k(t), with g_k(t) > 0 on ℝ+, such that

∀y ∈ C_α:  φ_k( (y − Q_k(x)) / ξ_k ) ≥ g_k( γ(α, β) / ξ_k ).                   (4.6)

For simplicity, let

Z̃_k = 1_C(Z_k) Z_k.                                                            (4.7)

The procedure generates a sequence U_k = f(X_k). By construction this sequence is decreasing and lower bounded by l*:

∀k ≥ 0:  l* ≤ U_{k+1} ≤ U_k.                                                   (4.8)

Thus there exists U ≥ l* such that U_k → U for k → +∞.

Lemma 4.1  Let P_k = ξ_k Z̃_k and α = f(x_0), where Z̃_k is given by (4.7). Then there exists ν > 0 such that

P(U_{k+1} < θ | U_k ≥ θ) ≥ ( meas(C_θ ∩ C_α) / ξ_k^n ) g_k( γ(α, β) / ξ_k ) > 0   for all θ ∈ (l*, l* + ν],

where n = dim(E).

Proof. Let C_θ = {x ∈ C | f(x) < θ} for θ ∈ (l*, l* + ν]. Since C_θ ⊂ C_α for l* < θ ≤ α, it follows from (2.4) that C_θ is non empty and has a strictly positive measure.

If meas(C_θ ∩ C_α) = 0 for any θ ∈ (l*, l* + ν], the result is immediate, since we then have f(x) = l* on C.

Let us assume that there exists ν > 0 such that meas(C_θ ∩ C_α) > 0. For θ ∈ (l*, l* + ν] we have C_θ ⊂ C_α and meas(C_θ ∩ C_α) > 0. Since the sequence {U_i}_{i≥0} is decreasing, we have also

{X_i}_{i≥0} ⊂ C_α.                                                              (4.9)

Therefore

P(X_k ∈ C_θ) = P(X_k ∈ C_θ ∩ C_α) = ∫_{C_θ ∩ C_α} P(X_k ∈ dx) > 0   for any θ ∈ (l*, l* + ν].


Let θ ∈ (l*, l* + ν]. We have from (4.8)

P(U_{k+1} < θ | U_k ≥ θ) = P(X_{k+1} ∈ C_θ | X_i ∉ C_θ, i = 0, ..., k).

But (4.4) and (4.5) yield that

P(X_{k+1} ∈ C_θ | X_i ∉ C_θ, i = 0, ..., k) = P(X_{k+1} ∈ C_θ | X_k ∉ C_θ).

Thus

P(X_{k+1} ∈ C_θ | X_k ∉ C_θ) = P(X_{k+1} ∈ C_θ, X_k ∉ C_θ) / P(X_k ∉ C_θ).

Moreover

P(X_{k+1} ∈ C_θ, X_k ∉ C_θ) = ∫_{C_α \ C_θ} P(X_k ∈ dx) ∫_{C_θ ∩ C_α} f_{k+1}(y | X_k = x) dy.

Using (4.9) we get

P(X_{k+1} ∈ C_θ, X_k ∉ C_θ) ≥ ( inf_{x ∈ C_α \ C_θ} ∫_{C_θ ∩ C_α} f_{k+1}(y | X_k = x) dy ) ∫_{C_α \ C_θ} P(X_k ∈ dx).

Thus

P(X_{k+1} ∈ C_θ | X_k ∉ C_θ) ≥ inf_{x ∈ C_α \ C_θ} ∫_{C_θ ∩ C_α} f_{k+1}(y | X_k = x) dy.

Taking (4.5) into account, we have

P(X_{k+1} ∈ C_θ | X_k ∉ C_θ) ≥ (1 / ξ_k^n) inf_{x ∈ C_α \ C_θ} ∫_{C_θ ∩ C_α} φ_k( (y − Q_k(x)) / ξ_k ) dy.

Note that (2.5) shows that

‖y − Q_k(x)‖ ≤ γ(α, β),

and (4.6) yields that

φ_k( (y − Q_k(x)) / ξ_k ) ≥ g_k( γ(α, β) / ξ_k ).

Hence

P(X_{k+1} ∈ C_θ | X_k ∉ C_θ) ≥ (1 / ξ_k^n) inf_{x ∈ C_α \ C_θ} ∫_{C_θ ∩ C_α} g_k( γ(α, β) / ξ_k ) dy.

It follows that

P(X_{k+1} ∈ C_θ | X_k ∉ C_θ) ≥ ( meas(C_θ ∩ C_α) / ξ_k^n ) g_k( γ(α, β) / ξ_k ).

The global convergence is a consequence of the following result, which follows from the Borel-Cantelli lemma (see, for instance, [28]).


Lemma 4.2  Let {U_k}_{k≥0} be a decreasing sequence, lower bounded by l*. Then there exists U such that U_k → U for k → +∞. Assume that there exists ν > 0 such that, for any θ ∈ (l*, l* + ν], there is a sequence of strictly positive real numbers {c_k(θ)}_{k≥0} such that

∀k ≥ 0:  P(U_{k+1} < θ | U_k ≥ θ) ≥ c_k(θ) > 0   and   Σ_{k=0}^{+∞} c_k(θ) = +∞.      (4.10)

Then U = l* almost surely.

    For the proof we refer the reader to [21] or [28].

Theorem 1  Let α = f(x_0) and let λ_k satisfy (4.4). Assume that x_0 ∈ C, that the sequence {ξ_k} is non-increasing and that

Σ_{k=0}^{+∞} g_k( γ(α, β) / ξ_k ) = +∞.                             (4.11)

Then U = l* almost surely.

Proof. Let

c_k(θ) = ( meas(C_θ ∩ C_α) / ξ_k^n ) g_k( γ(α, β) / ξ_k ) > 0.      (4.12)

Since the sequence {ξ_k}_{k≥0} is non-increasing,

c_k(θ) ≥ ( meas(C_θ ∩ C_α) / ξ_0^n ) g_k( γ(α, β) / ξ_k ) > 0.

Thus, equation (4.11) shows that

Σ_{k=0}^{+∞} c_k(θ) ≥ ( meas(C_θ ∩ C_α) / ξ_0^n ) Σ_{k=0}^{+∞} g_k( γ(α, β) / ξ_k ) = +∞.

Using Lemma 4.1 and Lemma 4.2 we obtain U = l* almost surely.

Theorem 2  Let Z̃_k = 1_C(Z)Z, where Z is a random variable following N(0, σ²Id) (σ > 0), and let

ξ_k = a / √(log(k + d)),                                            (4.13)

where a > 0, d > 0 and k is the iteration number. If x_0 ∈ C then, for a large enough, U = l* almost surely.

Proof. We have

φ_k(z) = ( 1 / ((2π)^{n/2} σ^n) ) exp( −‖z‖² / (2σ²) ) = g_k(‖z‖) > 0.


So

g_k( γ(α, β) / ξ_k ) = ( 1 / ((2π)^{n/2} σ^n) ) exp( −γ(α, β)² / (2σ²ξ_k²) ) = ( 1 / ((2π)^{n/2} σ^n) ) (k + d)^{−γ(α, β)²/(2σ²a²)}.

For a such that

0 < γ(α, β)² / (2σ²a²) < 1,

we have

Σ_{k=0}^{∞} g_k( γ(α, β) / ξ_k ) = +∞,

and, from the preceding theorem, we have U = l* almost surely.
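For concreteness, the schedule (4.13) decreases very slowly; with the purely illustrative values a = 1 and d = 2 (our numbers, not taken from the paper):

```python
import numpy as np

a, d = 1.0, 2.0                          # illustrative values only
xi = lambda k: a / np.sqrt(np.log(k + d))
print([round(xi(k), 3) for k in (0, 10, 100, 1000)])
# -> approximately [1.2, 0.63, 0.47, 0.38]: xi_k tends to 0, but slowly,
#    which is what keeps the sum in (4.11) divergent.
```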

    5 Numerical results

In order to apply the method presented in Section 4, we start at the initial value X_0 = x_0 ∈ C. At step k ≥ 0, X_k is known and X_{k+1} is determined.

We denote by ksto the number of perturbations; the case ksto = 0 corresponds to the unperturbed descent method. We add slack variables z1, z2, ... to the inequality constraints in order to obtain equality constraints.

We introduce a maximum iteration number kmax such that the iterations are stopped when k = kmax or x^k is a Kuhn-Tucker point. For each descent direction, the optimal λ_k is calculated by unidimensional exhaustive search with a fixed step on the interval defined by (3.4). This procedure can imply a large number of calls of the objective function. The global number of function evaluations is eval and includes the evaluations for the unidimensional search. We denote by kend the value of k when the iterations are stopped (it corresponds to the number of evaluations of the gradient of f). The optimal value and optimal point are fopt and xopt, respectively. The perturbation is normally distributed and samples are generated by using the log-trigonometric generator and the standard random number generator of the FORTRAN library. We use

ξ_k = a / √(log(k + 2)),

where a > 0.

The maximum number of iterations has been fixed at kmax = 100. The results for ksto = 0, 250, 500, 1000 and 2000 are given in the tables below for each problem. The experiments were performed on an HP workstation with an Intel(R) Celeron(R) M processor at 1.30 GHz and 224 MB RAM. The row cpu gives the mean CPU time in seconds for one run.
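Putting the preceding sketches together, a rough driver for the perturbed iteration (4.1) could look as follows; the stopping test and the acceptance rule for the perturbed point are our simplifications, the basis is kept fixed, and the parameter ksto of the experiments is not modelled here, so this is not the paper's Fortran implementation:

```python
import numpy as np

def step_bound(x, d):
    """Largest lam with x + lam*d >= 0 (cf. the definition of lam_max)."""
    neg = d < 0
    return np.min(-x[neg] / d[neg]) if neg.any() else np.inf

def sprg(f, grad, A, x0, basic, nonbasic, in_C,
         a=1.0, sigma=1.0, kmax=100, tol=1e-8):
    """Sketch of the perturbed reduced-gradient iteration (4.1) for
    min f(x) s.t. Ax = b, x >= 0, starting from a feasible x0."""
    x = x0.copy()
    for k in range(kmax):
        d, _ = reduced_gradient_direction(A, grad(x), x, basic, nonbasic)
        lam = line_search(f, x, d, step_bound(x, d))       # deterministic step
        xi_k = a / np.sqrt(np.log(k + 2))                   # schedule of Section 5
        trial = x + lam * d + perturbation(in_C, xi_k, sigma, x.size)
        # keep the perturbation only if the perturbed point stays feasible
        # (condition (4.3)) and the sequence U_k remains decreasing
        if in_C(trial) and f(trial) <= f(x + lam * d):
            x = trial
        else:
            x = x + lam * d
        if np.linalg.norm(lam * d) < tol:                   # crude stopping test
            break
    return x, f(x)
```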

    5.1 Statistical problem

The statistical problem of estimating a bounded mean with a minimax procedure is nonconvex and nonlinear; we will restrict ourselves to this problem (see, for instance, [16] and [17]). Thus, our problem is reduced to the following: for a suitable m_2 > m_1 ≈ 1.27 and for each m ∈ (m_1, m_2], find a_1 and a_2. In this case, the problem reduces to a global optimization problem with 5 unknowns (v, a_1, a_2, a_3, b) and 3 equality constraints. This


problem is equivalent to the maximization of a convex combination of the several g_i functions without the constraint on v (see for instance [14] and [22]). Using the equality a_1 + a_2 + a_3 = 1, one variable, a_1 for example, can be eliminated. The problem we eventually studied is the following:

Minimize:  {(1 − a_2 − a_3)g_1 + a_2 g_2 + a_3 g_3}
subject to:
    a_2 + a_3 + z_1 = 1
    a_i + z_i = 1,  i = 2, 3
    b + z_4 = 1
    0 ≤ a_i, i = 2, 3;   0 ≤ b;   0 ≤ z_i, i = 1, ..., 4,

where

g_1(a_2, a_3, b):  ρ(1)² = v,

g_2(a_2, a_3, b):  bm exp(−bm) + Σ_{x=1}^{∞} (bm − ρ(x))² ((bm)^{x−1} / x!) exp(−bm) = v,

g_3(a_2, a_3, b):  m exp(−m) + Σ_{x=1}^{∞} (m − ρ(x))² (m^{x−1} / x!) exp(−m) = v,

and

ρ(x) = m ( a_2 b^x exp((1 − b)m) + a_3 ) / ( a_2 b^{x−1} exp((1 − b)m) + a_3 )   if x ≠ 0, 1,

ρ(1) = m ( a_2 b exp(−bm) + a_3 exp(−m) ) / ( 1 − a_2 − a_3 + a_2 exp(−bm) + a_3 exp(−m) ).

There are 7 variables and 10 constraints. We use a = 1 and m = 1.5; the Fortran code furnishes the following optimal solution (see also Table 1):

a*_1 = 0.0958,   a*_2 = 0.9042,   a*_3 = 0.00000,   b* = 0.907159.

Table 1: Results for Statistics

ksto     0        250      500      1000     2000
fopt     0.3401   0.4022   0.4030   0.4044   0.4044
eval     1        1183     4773     10743    17635
kend     2        7        12       13       11
cpu      0.00     37.16    134.72   288.91   491.86


    5.2 Octagon problem

Consider polygons in the plane with eight sides (octagons for short) and unit diameter. Which one of them has maximum area? (See for instance [18, 31].) Using QP and geometric reasoning, the optimal octagon is determined in [6, 10] to have an area about 2.8% larger than the regular octagon.

Graham's conjecture states that the optimal octagon can be illustrated as in Figure 1, in which a solid line between two vertices indicates that the distance between these points is one.

In 1996, Hansen formulated this question as the quadratically constrained quadratic optimization problem defining this configuration, which appears below (see for instance [8]). By symmetry, and without loss of generality, the constraint x2 ≥ x3 is added to reduce the size of the feasible region.

[Figure 1: Definition of variables for the configuration conjectured by Graham to have maximum area. The vertices are A0 = (0, 0), A1 = (x1 − x2, y1 − y2), A2 = (x3 − x1 − x5, y1 − y3 + y5), A3 = (−x1, y1), A4 = (0, 1), A5 = (x1, y1), A6 = (x1 − x2 + x4, y1 − y2 + y4), A7 = (x3 − x1, y1 − y3).]


Minimize:  −(1/2){ (x2 + x3 − 4x1)y1 + (3x1 − 2x3 + x5)y2 + (3x1 − 2x2 + x4)y3
                   + (x3 − 2x1)y4 + (x2 − 2x1)y5 } − x1
Subject to:
    A0A1 ≤ 1:  (x1 − x2)² + (y1 − y2)² + z1 = 1
    A0A2 ≤ 1:  (x3 − x1 − x5)² + (y1 − y3 + y5)² + z2 = 1
    A0A6 ≤ 1:  (x1 − x2 + x4)² + (y1 − y2 + y4)² + z3 = 1
    A0A7 ≤ 1:  (x3 − x1)² + (y1 − y3)² + z4 = 1
    A1A2 ≤ 1:  (2x1 − x2 − x3 + x5)² + (y2 − y3 + y5)² + z5 = 1
    A1A3 ≤ 1:  (2x1 − x2)² + y2² + z6 = 1
    A1A4 ≤ 1:  (x1 − x2)² + (y1 − y2 − 1)² + z7 = 1
    A1A7 ≤ 1:  (2x1 − x2 − x3)² + (y2 − y3)² + z8 = 1
    A2A3 ≤ 1:  (x3 − x5)² + (y3 − y5)² + z9 = 1
    A2A4 ≤ 1:  (x3 − x1 − x5)² + (y1 − y3 + y5 − 1)² + z10 = 1
    A2A5 ≤ 1:  (2x1 − x3 + x5)² + (y3 − y5)² + z11 = 1
    A2A6 = 1:  (2x1 − x2 − x3 + x4 + x5)² + (y2 − y3 − y4 + y5)² = 1
    A3A6 ≤ 1:  (2x1 − x2 + x4)² + (y2 − y4)² + z12 = 1
    A4A6 ≤ 1:  (x1 − x2 + x4)² + (y1 − y2 + y4 − 1)² + z13 = 1
    A4A7 ≤ 1:  (x1 − x3)² + (1 − y1 + y3)² + z14 = 1
    A5A6 ≤ 1:  (x2 − x4)² + (y2 − y4)² + z15 = 1
    A5A7 ≤ 1:  (2x1 − x3)² + y3² + z16 = 1
    A6A7 ≤ 1:  (2x1 − x2 − x3 + x4)² + (y2 − y3 − y4)² + z17 = 1
    x2 − x3 − z18 = 0
    xi² + yi² = 1,  i = 1, 2, 3, 4, 5
    x1 + z19 = 0.5
    xi + z(18+i) = 1,  i = 2, 3, 4, 5
    0 ≤ xi, yi, i = 1, ..., 5;   0 ≤ zi, i = 1, ..., 23.

There are 33 variables and 62 constraints.

We use a = 5; the Fortran code furnishes the following optimal solution (see also Table 2):

xopt = (0.25682, 0.67357, 0.67060, 0.91187, 0.91342),
yopt = (0.96619, 0.73978, 0.74245, 0.41172, 0.40820).

Table 2: Results for Octagon

ksto     0          250       500        1000      2000
fopt     0.726637   0.72712   0.727377   0.72738   0.72739
eval     2          526       1654       2634      4585
kend     3          11        12         10        15
cpu      0.00       1.48      6.70       10.67     12.53


    5.3 Mixture problem

In this example of petrochemical mixture we have four reservoirs: R1, R2, R3 and R4. The first two receive three distinct source products. Then their content is combined in the other two reservoirs to create the desired blends. The question is to determine the quantity of each product to buy in order to maximize the profit.

The R1 reservoir receives two products of quality (e.g., sulfur content) 3 and 1 in quantities q0 and q1 (see for instance [12, 19]). The quality of the mixture contained in R1 is s = (3q0 + q1)/(q0 + q1) ∈ (1, 3). The R2 reservoir contains a product of the third source, of quality 2, in quantity q2. One wants to obtain in the reservoirs R3 and R4, of capacity 10 and 20, products of quality at most 2.5 and 1.5 respectively. Figure 2, where the variables x1 to x4 represent the corresponding quantities, illustrates this situation. The unit prices of the bought products are respectively 60, 160 and 100; those of the finished products are 90 and 150. The difference between the purchase and sale costs is therefore 60q0 + 160q1 + 100q2 − 90(x1 + x3) − 150(x2 + x4). The qualities of the final blends are (sx1 + 2x3)/(x1 + x3) and (sx2 + 2x4)/(x2 + x4). The addition of the volume conservation constraints q0 + q1 = x1 + x2 and q2 = x3 + x4 permits the elimination of the variables q0, q1 and q2:

q0 = (s − 1)(x1 + x2)/2,   q1 = (3 − s)(x1 + x2)/2,   q2 = x3 + x4.

[Figure 2: A problem of mixture. The sources of quality 3 and 1 (quantities q0, q1) feed R1, whose blend has quality s; the third source of quality 2 (quantity q2) feeds R2. Flows x1, x2 (quality s) from R1 and x3, x4 (quality 2) from R2 feed R3 (capacity 10, quality at most 2.5) and R4 (capacity 20, quality at most 1.5).]


The transformed mathematical model (the variable s is denoted x5) is as follows:

Minimize:  120x1 + 60x2 + 10x3 − 50x4 − 50x1x5 − 50x2x5
Subject to:
    2.5x1 + 0.5x3 − x1x5 − z1 = 0
    1.5x2 − 0.5x4 − x2x5 − z2 = 0
    x1 + x3 + z3 = 10
    x2 + x4 + z4 = 20
    x5 + z5 = 3
    x5 − z6 = 1
    0 ≤ xi, i = 1, ..., 5;   0 ≤ zi, i = 1, ..., 6.

There are 11 variables and 17 constraints. We use a = 0.05; the Fortran code furnishes the following optimal solution (see also Table 3):

xopt = (0, 10, 0, 10, 1)

Table 3: Results for Mixture

ksto     0       250     500     1000    2000
fopt     306     353     367     374     400
eval     2       28      51      105     28667
kend     3       7       7       9       101
cpu      0.00    0.02    0.36    1.27    38.53

We obtain the optimum value of the objective function (fopt = 400) for the Mixture problem with ksto = 2000.
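For readers who want to reproduce this model quickly, here is a hedged sketch of its five-variable form (before slack variables are added) using scipy's SLSQP as a stand-in local solver; this is not the paper's Fortran SPRGM code, and, being a local method on a nonconvex problem, it may stop at a local optimum rather than at the global value of −400, depending on the starting point, which is precisely the behaviour the stochastic perturbation is designed to overcome:

```python
import numpy as np
from scipy.optimize import minimize

# variables: x = (x1, x2, x3, x4, s), with s the shared quality in [1, 3]
def objective(x):
    x1, x2, x3, x4, s = x
    return 120*x1 + 60*x2 + 10*x3 - 50*x4 - 50*x1*s - 50*x2*s

cons = [
    {"type": "ineq", "fun": lambda x: 2.5*x[0] + 0.5*x[2] - x[0]*x[4]},  # quality of R3 <= 2.5
    {"type": "ineq", "fun": lambda x: 1.5*x[1] - 0.5*x[3] - x[1]*x[4]},  # quality of R4 <= 1.5
    {"type": "ineq", "fun": lambda x: 10.0 - x[0] - x[2]},               # capacity of R3
    {"type": "ineq", "fun": lambda x: 20.0 - x[1] - x[3]},               # capacity of R4
]
bounds = [(0, None)] * 4 + [(1, 3)]

x0 = np.array([1.0, 1.0, 1.0, 1.0, 2.0])     # arbitrary starting point (ours)
res = minimize(objective, x0, bounds=bounds, constraints=cons, method="SLSQP")
print(res.x, res.fun)    # the global optimum reported in the paper is
                         # x = (0, 10, 0, 10, 1) with objective value -400
```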

    5.4 Alkylation problem

The alkylation process is common in the petroleum industry; it is an important unit used in refineries to upgrade light olefins and isobutane into a much more highly valued gasoline component. The alkylate is usually blended into gasoline in order to improve its performance; see problem 5.3. A simplified process flow diagram of an alkylation process is shown in Figure 3 below. The process model seeks to determine the optimum set of operating conditions for the process, based on a mathematical model, which allows the maximization of profit. The problem is formulated as a direct nonlinear programming model with mixed nonlinear inequality and equality constraints and a nonlinear profit function to be maximized (see for instance [3]). A typical reaction [29] is

(CH3)2C=CH2 (olefin feed) + (CH3)3CH (isobutane make-up)  --fresh acid-->  (CH3)2CHCH2C(CH3)3 (alkylate product)

As shown in Figure 3, an olefin feed (100% butylene), a pure isobutane recycle and a 100% isobutane make-up stream are introduced in a reactor together with an acid


    catalyst. The reactor product stream is then passed through a fractionator where the

    isobutane and the alkylate product are separated. The spent acid is also removed

    from the reactor.

[Figure 3: Simplified alkylation process flow sheet. Olefin feed, isobutane make-up, isobutane recycle and fresh acid enter the reactor; the reactor product passes to a fractionator, which returns the isobutane recycle and delivers the hydrocarbon (alkylate) product; spent acid leaves the reactor.]

    The notation used is shown in Table 4 along with the upper and lower bounds on

    each variable. The bounds represent economic, physical and performance constraints.

Table 4: Variables and their bounds

Symbol   Variable                                  Lower bound   Upper bound
x1       Olefin feed (barrels/day)                 0             2000
x2       Isobutane recycle (barrels/day)           0             16000
x3       Acid addition rate (x1000 pounds/day)     0             120
x4       Alkylate yield (barrels/day)              0             5000
x5       Isobutane make-up (barrels/day)           0             2000
x6       Acid strength (wt.%)                      85            93
x7       Motor octane no.                          90            95
x8       External isobutane-olefin ratio           3             12
x9       Acid dilution factor                      1.2           4
x10      F-4 performance no.                       145           162

The external isobutane-to-olefin ratio, x8, was equal to the sum of the isobutane recycle, x2, and the isobutane make-up, x5, divided by the olefin feed, x1:

x8 = (x2 + x5) / x1.


The motor octane number, x7, was a function of the external isobutane-to-olefin ratio, x8, and the acid strength by weight percent, x6 (for the same reactor temperatures and acid strengths as for the alkylate yield x4):

x7 = 86.35 + 1.098x8 − 0.038x8² + 0.325(x6 − 89).

The acid strength by weight percent, x6, could be derived from an equation that expressed the acid addition rate, x3, as a function of the alkylate yield, x4, the acid dilution factor, x9, and the acid strength by weight percent, x6 (the added acid was assumed to have a strength of 98%):

1000x3 = x4x6x9 / (98 − x6).

This equation can be rewritten as quadratic constraints with the help of an artificial variable x11:

x3 = x6x11   and   98000x11 − 1000x3 = x4x9.

The alkylate yield, x4, was a function of the olefin feed, x1, and the external isobutane-to-olefin ratio, x8. The relationship, determined by holding the reactor temperature between 80°F and 90°F and the reactor acid strength by weight percent between 85 and 93, was:

x4 ≤ x1(1.12 + 0.13167x8 − 0.00667x8²).

By adding an artificial variable x12, one gets the equivalent quadratic constraints:

x4 = x1x12   and   x12 ≤ 1.12 + 0.13167x8 − 0.00667x8².
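For convenience, the relations above can be restated as plain functions in the original numbering x1, ..., x12 (a direct transcription of the correlations just given; the function names are our own):

```python
def external_ratio(x1, x2, x5):
    """x8: external isobutane-to-olefin ratio."""
    return (x2 + x5) / x1

def motor_octane(x6, x8):
    """x7: motor octane number correlation."""
    return 86.35 + 1.098*x8 - 0.038*x8**2 + 0.325*(x6 - 89.0)

def acid_addition_rate(x4, x6, x9):
    """x3 (x1000 pounds/day): acid balance 1000*x3 = x4*x6*x9/(98 - x6)."""
    return x4 * x6 * x9 / (98.0 - x6) / 1000.0

def alkylate_yield_bound(x1, x8):
    """Upper bound on the alkylate yield x4."""
    return x1 * (1.12 + 0.13167*x8 - 0.00667*x8**2)
```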

The objective function is linear in the cost of the purchased products x1, x3 and x5 and in the cost of the recycled isobutane x2, and depends linearly on the product of the alkylate quantity by the octane rating, x4x7.

The other constraints are physical bounds on the variables and linear relations. After the elimination of variables, a scaling and a


renumbering of the indices, one gets the following optimization problem:

Minimize:  614.88x1 − 171.5x2 − 6.3x1x4 + 4.27x1x5 − 3.5x2x5 + 10x3x6
Subject to:
    −0.325x3 + x4 − 1.098x5 + 0.038x5² + z1 = 57.425
    −0.13167x5 + x7 + 0.00667x5² + z2 = 1.12
    12.2x1 − 10x2 + z3 = 200,    12.2x1 − 10x2 − z4 = 0.1
    −10x2 + 12.2x1x5 − 10x2x5 + z5 = 1600,    −10x2 + 12.2x1x5 − 10x2x5 − z6 = 0.6
    5.346x1 − 980x6 − 0.666x1x4 + 10x2x5 = 0
    x1 − 1.22x1x7 + x2x7 = 0
    x3x6 + z7 = 120
    x1 + z8 = 32.786885,    x1 − z9 = 0.01
    x5 + z10 = 12,    x5 − z11 = 3
    x2 + z12 = 20,    x6 + z13 = 1.411765
    x3 + z14 = 95,    x3 − z15 = 85
    x4 + z16 = 95,    x4 − z17 = 92.66666
    0.8196722 + z18 = x7,
    xi ≥ 0, i = 1, ..., 7;    zj ≥ 0, j = 1, ..., 18.

There are 25 variables and 38 constraints; we use a = 10 and kmax = 100.

The SPRGM algorithm furnishes the following optimal solution (see also Table 5):

xopt = (30.4713, 20.0000, 90.9999, 94.1900, 10.4100, 1.0697, 1.7743)

Table 5: Results for Alkylation

ksto     0          250        500        1000       2000
fopt     1161.626   1176.192   1176.191   1176.186   1176.193
eval     2          274        531        1019       2063
kend     3          14         14         14         14
cpu      0.00       0.97       1.95       3.86       7.86

    5.5 Pooling problem

We present in Figure 4 (and Table 6) an example in which the intermediate pools are allowed to be interconnected (see for instance [12]). The pooling problem is thus extended to the case where exiting blends of some intermediate pools are feeds of others.

The proportion model of the generalized pooling problem is not a bilinear program, since the variables are not partitioned into two sets. Therefore, this formulation belongs to the class of quadratically constrained quadratic programs; cf. for instance [19].


[Figure 4: A generalized pooling problem with feeds F1, F2, F3, intermediate pools P1, P2 and blends B1, B2, B3. The flows include x11, x32, y12, y21, y23 and the pool-to-pool flow v12.]

Table 6: Characteristics of GP

Feed   Price ($/bbl)   Attribute quality   Max supply
F1     6               3                   18
F2     16              1                   18
F3     10              2                   18

Pool   Max capacity
P1     20
P2     20

Blend   Price ($/bbl)   Max demand   Attribute max
B1      9               10           2.5
B2      13              15           1.75
B3      14              20           1.5

A hybrid model for GP is given below, where v12 denotes the flow from P1 to P2 (all other variables are defined as before).

Maximize (over q, t, v, w, x, y):
    −6(x11 + y12 + v12 − q21(y12 + v12)) − 16q21(y12 + v12)
    − 10(x32 + y21 + y23 − v12) + 9(x11 + y21) + 13(y12 + x32) + 14y23

Subject to:

supply:
    x11 + y12 + v12 − q21(y12 + v12) + z1 = 18
    q21(y12 + v12) + z2 = 18
    x32 + y21 + y23 − v12 + z3 = 18

demand:
    x11 + y21 + z4 = 10
    y12 + x32 + z5 = 15
    y23 + z6 = 20

capacity:
    y12 + v12 + z7 = 20
    y21 + y23 + z8 = 20

attribute:
    (3 − 2q21)v12 + 2(y21 + y23 − v12) − t12(y21 + y23) = 0
    3x11 + t12y21 − 2.5(x11 + y21) + z9 = 0
    2x32 + (3 − 2q21)y12 − 1.75(x32 + y12) + z10 = 0
    t12 + z11 = 1.5

pos. flow:
    y21 + y23 − v12 − z12 = 0
    q21 + z13 = 1

    q ≥ 0, t ≥ 0, v ≥ 0, w ≥ 0, x ≥ 0, y ≥ 0, zi ≥ 0, i = 1, ..., 13.


There are 21 variables and 35 constraints; we use a = 50.

The SPRGM algorithm furnishes the following optimal solution (see also Table 7):

xopt = (x11, x32) = (0.1506, 2.6564);   yopt = (y12, y21, y23) = (0.7425, 0.0005, 19.9988);
vopt = v12 = 10.0013;   qopt = q21 = 0.9949;   topt = t12 = 1.5000.

Table 7: Results for Pooling

ksto     0       250     500     1000    2000
fopt     26.15   26.63   26.71   26.72   26.73
eval     2       181     441     813     1114
kend     3       5       5       5       5
cpu      0.00    0.03    0.03    0.11    0.17
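As a quick sanity check (our own computation, using only the objective stated above and the reported optimal flows), the profit evaluates to about 26.73, in agreement with Table 7 up to rounding of the printed digits:

```python
# reported optimal flows from the text above
x11, x32 = 0.1506, 2.6564
y12, y21, y23 = 0.7425, 0.0005, 19.9988
v12, q21 = 10.0013, 0.9949

f2 = q21 * (y12 + v12)                      # amount bought from F2
f1 = x11 + y12 + v12 - f2                   # amount bought from F1
f3 = x32 + y21 + y23 - v12                  # amount bought from F3
profit = (-6*f1 - 16*f2 - 10*f3
          + 9*(x11 + y21) + 13*(y12 + x32) + 14*y23)
print(round(profit, 2))                     # -> 26.73
```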

    6 Concluding remarks

We have presented a stochastic modification of the reduced gradient method for linear and nonlinear constraints, involving the adjunction of a stochastic perturbation. This approach leads to a stochastic descent method where the deterministic sequence generated by the reduced gradient is replaced by a sequence of random variables. We have established a global convergence result for this stochastic descent method.

The numerical experiments show that our method is effective for nonconvex optimization problems satisfying a nondegeneracy assumption. The use of stochastic perturbations generally improves the results furnished by the reduced gradient.

Here again, we observe that the adjunction of the stochastic perturbation improves the result, at the price of a larger number of evaluations of the objective function. The final points Xkend are very close or practically identical for ksto ≥ 250.

The main difficulty in the practical use of the stochastic perturbation is connected to the tuning of the parameters. Even if the combination with the deterministic reduced gradient increases the robustness, the values of a and ksto have an important influence on the quality of the result. The random number generator may also influence aspects such as the number of evaluations. All the sequences {ξ_k}_{k≥0} tested have furnished analogous results, but for different sets of parameters.

    References

[1] Beck, P., Lasdon, L. and Engquist, M. A reduced gradient algorithm for nonlinear network problems. ACM Trans. Math. Softw., 9 (1983), 57-70.

[2] Bouhadi, M., Ellaia, R. and Souza de Cursi, J. E. Random perturbations of the projected gradient for linearly constrained problems. Nonconvex Optim. Appl., 54 (2001), 487-499.


[3] Bracken, J. and McCormick, G. P. Selected applications of nonlinear programming, New York-London-Sydney: John Wiley and Sons, Inc. (1986), XII, 110 p.

[4] Carson, Y. and Maria, A. Simulation optimization: methods and applications, Proceedings of the Winter Simulation Conference, USA, 1997.

[5] Conn, A.R., Gould, N. and Toint, P.L. Methods for nonlinear constraints in optimization calculations, Inst. Math. Appl. Conf. Ser., New Ser., 63 (1997), 363-390.

[6] Dorea, C. C. Y. Stopping rules for a random optimization method. SIAM Journal on Control and Optimization, Vol. 28, No. 4 (1990), 841-850.

[7] Drud, A.S. CONOPT - A large-scale GRG code. ORSA J. Comput., Vol. 6, No. 2 (1994), 207-216.

[8] El Mouatasim, A. Stochastic perturbation in global optimization: analyses and application, PhD Thesis, Morocco: Mohammadia School of Engineers, 2007.

[9] El Mouatasim, A., Ellaia, R. and Souza de Cursi, J. E. Random perturbation of variable metric method for unconstrained nonsmooth global optimization. International Journal of Applied Mathematics and Computer Science, Vol. 16, No. 4 (2006), 463-474.

[10] Ellaia, R., El Mouatasim, A., Banga, J. R. and Sendin, O. H. NBI-RPRGM for multi-objective optimization design of bioprocesses, ESAIM: Proceedings, 20 (2007), 118-128.

[11] El Mouatasim, A., Ellaia, R. and Souza de Cursi, J.E. Projected variable metric method for linear constrained nonsmooth global optimization via stochastic perturbation. Submitted to RAIRO Journal (2008).

[12] Foulds, L.R., Haugland, D. and Jornsten, K. A bilinear approach to the pooling problem. Optimization, 24 (1992), 165-180.

[13] Floudas, C. A. and Pardalos, P. M. A collection of test problems for constrained global optimization algorithms, Lecture Notes in Computer Science, 455. Berlin: Springer-Verlag, XIV, 180 p., 1990.

[14] Ghosh, M. N. Uniform approximation of minimax point estimates. Ann. Math. Statist., 35 (1964), 1031-1047.

[15] Gould, N.I.M. and Toint, P.L. SQP methods for large-scale nonlinear programming, System modelling and optimization. Methods, theory and applications, 19th IFIP TC7 Conference, Cambridge, GB, July 12-16, 1999, Boston: Kluwer Academic Publishers, IFIP, Int. Fed. Inf. Process. 46, 149-178, 2000.


[16] Gourdin, E. Global optimization algorithms for the construction of least favorable priors and minimax estimators, Engineering Undergraduate Thesis, Évry, France: Institut d'Informatique d'Entreprise, 1989.

[17] Gourdin, E., Jaumard, B. and MacGibbon, B. Global optimization decomposition methods for bounded parameter minimax evaluation, SIAM Journal on Sci. Comput., Vol. 15, No. 1 (1994), 16-35.

[18] Graham, R. L. The largest small hexagon. Journal of Combinatorial Theory, Series A, 18 (1975), 165-170.

[19] Haverly, C. A. Studies of the behaviour of recursion for the pooling problem, ACM SIGMAP Bulletin, 25 (1996), 19-28.

[20] Hock, W. and Schittkowski, K. Test examples for nonlinear programming codes, Lecture Notes in Economics and Mathematical Systems, 187, Berlin-Heidelberg-New York: Springer-Verlag, V, 178 p., 1981.

[21] L'Ecuyer, P. and Touzin, R. On the Deng-Lin random number generators and related methods. Statistics and Computing, Vol. 14, No. 1 (2003), 5-9.

Johnstone, I.M. and MacGibbon, K.B. Minimax estimation of a constrained Poisson vector, Ann. Stat., Vol. 20, No. 2 (1992), 807-831.

[22] Kall, P. Solution methods in stochastic programming, Lect. Notes Control Inf. Sci., 197, 3-22, 1994.

[23] Kiwiel, K. C. Free-steering relaxation methods for problems with strictly convex costs and linear constraints, Math. Oper. Res., Vol. 22, No. 2 (1997), 326-349.

[24] Lasdon, L.S., Waren, A.D. and Ratner, M. Design and testing of a generalized reduced gradient code for nonlinear programming, ACM Trans. Math. Softw., 4 (1978), 34-50.

[25] Luenberger, D. G. Introduction to linear and nonlinear programming, Addison-Wesley Publishing Company, XII, 356 p., 1973.

[26] Martinez, J. M. A direct search method for nonlinear programming. ZAMM, Z. Angew. Math. Mech., Vol. 79, No. 4 (1999), 267-276.

Pogu, M. and Souza de Cursi, J.E. Global optimization by random perturbation of the gradient method with a fixed parameter. J. Glob. Optim., Vol. 5, No. 2 (1994), 159-180.

[27] Pine, S. H. Organic chemistry, New York-London: McGraw-Hill, 1987.

[28] Raber, U. Nonconvex all-quadratic global optimization problems: solution methods, application and related topics, Dissertation, Universität Trier, 1999.


[29] Reinhardt, K. Extremale Polygone gegebenen Durchmessers, Jber. Dtsch. Math.-Ver., 31 (1922), 251-270.

[30] Smeers, Y. Generalized reduced gradient method as an extension of feasible direction methods, J. Optimization Theory Appl., 22 (1977), 209-226.

[31] Sadegh, P. and Spall, J. C. Optimal random perturbations for stochastic approximation using a simultaneous perturbation gradient approximation, IEEE Trans. Autom. Control, Vol. 44, No. 1 (1999), 231-232.

[32] Stolpe, M. On models and methods for global optimization of structural topology, Doctoral Thesis, Stockholm, 2003.