

NORTH-HOLLAND

Trust Region Algorithm for Nonsmooth Optimization

R. J. B. de Sampaio and Jin-Yun Yuan

Departamento de Matemática, Universidade Federal do Paraná, Centro Politécnico, Cx. P. 19.081, CEP 81531-990, Curitiba, PR, Brazil

Wen-Yu Sun

Department of Mathematics Nanjing University, Nanjing, 210093, People's Republic of China

ABSTRACT

Minimization of a composite function $h(f(x))$ is considered here, where $f: \mathbb{R}^n \to \mathbb{R}^m$ is a locally Lipschitzian function and $h: \mathbb{R}^m \to \mathbb{R}$ is a continuously differentiable convex function. The theory of trust region algorithms for nonsmooth optimization given by Fletcher, Powell, and Yuan is extended to this case. A trust region algorithm and its global convergence are studied. Finally, some applications to nonlinear and nonsmooth least squares problems are given. © Elsevier Science Inc., 1997

1. INTRODUCTION

The problem of trust region algorithms for nonsmooth optimization and nonsmooth equations has been considered by Fletcher [1], Powell [2], Yuan [3], Qi and Sun [4], Martinez and Qi [5], and Sun and Yuan [6]. Fletcher [1] and Yuan [3] consider trust region methods for the composite NDO (nondifferentiable optimization) problem

$$\min_{x \in \mathbb{R}^n} F(x) = g(x) + h(f(x)) \qquad (1.1)$$

APPLIED MATHEMATICS AND COMPUTATION 85:109-116 (1997) © Elsevier Science Inc., 1997, 655 Avenue of the Americas, New York, NY 10010. 0096-3003/97/$17.00, PII S0096-3003(96)00112-9


where $g: \mathbb{R}^n \to \mathbb{R}$ and $f: \mathbb{R}^n \to \mathbb{R}^m$ are continuously differentiable functions, and $h: \mathbb{R}^m \to \mathbb{R}$ is a convex but nonsmooth function bounded below. For Yuan [3], $g(x)$ is identically zero for all $x \in \mathbb{R}^n$. However, several practical problems from engineering and statistics fall into the following minimization problem for a nonsmooth composite function:

$$\min_{x \in \mathbb{R}^n} F(x) = h(f(x)) \qquad (1.2)$$

where $f: \mathbb{R}^n \to \mathbb{R}^m$ is a locally Lipschitzian function, and $h: \mathbb{R}^m \to \mathbb{R}$ is a continuously differentiable convex function bounded below. For instance, the problem of solving a system of nonsmooth equations and the least squares problem with nonsmooth data are special cases of (1.2). Therefore, in this paper we use trust region methods to deal with (1.2) and extend the results of Fletcher [1] and Powell [2] to the case of (1.2). The trust region algorithm is iterative, and at each iteration we need to solve a constrained subproblem. At the $k$th iterate $x_k$, a step bound $\Delta_k > 0$ and an $n \times n$ symmetric matrix $B_k$ are given. At each iteration, the subproblem is defined as

$$\min_{\|d\| \le \Delta_k} \phi_k(d) = h(f(x_k) + Z^T d) + \frac{1}{2} d^T B_k d \qquad (1.3)$$

where $Z \in \partial f(x_k)$, the so-called generalized Jacobian of $f$ at $x_k$, is fixed. The vector $x_{k+1}$ is given the value

$$x_{k+1} = \begin{cases} x_k + d_k, & \text{if } F(x_k + d_k) < F(x_k), \\ x_k, & \text{if } F(x_k + d_k) \ge F(x_k), \end{cases}$$

where $d_k$ is the solution of (1.3), and then $\Delta_{k+1}$ is defined for the next iteration. If

$$F(x_k) - F(x_k + d_k) \ge c_2 [F(x_k) - \phi_k(d_k)], \qquad (1.4)$$

then
$$\|d_k\| \le \Delta_{k+1} \le \min\{c_1 \Delta_k, \bar{\Delta}\};$$
otherwise, let
$$c_3 \|d_k\| \le \Delta_{k+1} \le c_4 \Delta_k,$$


where $c_1 \ge 1$, $c_2 < 1$, $c_3 < c_4 < 1$, and $\bar{\Delta}$ is a positive constant which may be taken equal to the diameter of a set $B$ that will be defined below. Then $B_k$ is updated according to some rule.
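The iteration just described (solve the subproblem, accept the step only if it decreases $F$, then expand or shrink the step bound according to test (1.4)) can be sketched in code. The sketch below is illustrative only: the helper `phi_solver`, the numeric constants, and the calling convention are our own assumptions, not part of the paper, and the model minimizer is assumed to be supplied by the caller.

```python
def trust_region_step(F, phi_solver, x, Delta, B,
                      c1=2.0, c2=0.25, c4=0.5, Delta_bar=10.0):
    """One iteration of the trust region scheme of Section 1.

    F          -- the objective F(x) = h(f(x))
    phi_solver -- returns (d_k, phi_k(d_k)), a minimizer of the model
                  phi_k over ||d|| <= Delta (subproblem (1.3))
    The constants c1 >= 1, c2 < 1, c4 < 1 and Delta_bar play the roles of
    the c_i and bar-Delta of Section 1; the numeric defaults are
    illustrative choices only.
    """
    d, phi_d = phi_solver(x, Delta, B)
    actual = F(x) - F(x + d)             # actual reduction
    predicted = F(x) - phi_d             # predicted reduction
    x_next = x + d if actual > 0 else x  # accept only strict descent
    if actual >= c2 * predicted:         # test (1.4): allow expansion
        Delta_next = min(c1 * Delta, Delta_bar)
    else:                                # otherwise shrink the bound
        Delta_next = c4 * Delta
    return x_next, Delta_next
```

As a sanity check, one can take the smooth scalar case $F(x) = x^2$ with a quadratic model minimized in closed form over $[-\Delta, \Delta]$; the iterates then converge to the minimizer $x = 0$.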

Fletcher [1] proved that if the sequence $\{x_k\}$ lies entirely in a bounded set $B$, and if $B_k$ is the Hessian of the Lagrangean function at the $k$th iteration, then the sequence $\{x_k\}$ has an accumulation point $x^*$ at which the first-order condition holds; that is, the generalized directional derivative of $F$ at $x^*$ is nonnegative for all $d \in \mathbb{R}^n$, which means

$$\max_{Z \in \partial f(x^*)} \langle d, Z^T \nabla h(f(x^*)) \rangle \ge 0. \qquad (1.5)$$

Condition (1.5) is called Fletcher's condition, and it holds in particular if $\nabla h(f(x^*)) = 0$. Following Yuan [3], we are able to prove that Fletcher's necessary condition (1.5) holds if each matrix $B_k$ satisfies
$$\|B_k\| \le c_5 + c_6 \sum_{i=1}^{k} \Delta_i.$$

2. PRELIMINARIES

Let $f: \mathbb{R}^n \to \mathbb{R}^m$ be a locally Lipschitzian function. By Rademacher's theorem [7], $f$ is Fréchet differentiable almost everywhere. If we denote by $\Omega_f$ the subset of $\mathbb{R}^n$ where $f$ fails to be differentiable, we define the generalized Jacobian of $f$ at $x$, denoted $\partial f(x)$, as the convex hull of all $m \times n$ matrices $Z$ obtained as the limit of a sequence of the form $\{J(x_i)\}$, where $x_i \to x$, $x_i \notin \Omega_f$, and $J(x_i)$ stands for the Jacobian of $f$ at $x_i$. Symbolically, one has
$$\partial f(x) = \operatorname{co}\{\lim J(x_i) : x_i \to x,\ x_i \notin \Omega_f\}.$$
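For intuition, this definition can be exercised numerically on the simplest nonsmooth example, $f(x) = |x|$, whose only point of nondifferentiability is $x = 0$: the limiting Jacobians (here, scalar derivatives) along sequences approaching $0$ from either side are $+1$ and $-1$, so $\partial f(0) = \operatorname{co}\{-1, +1\} = [-1, 1]$. The sketch below, including the forward-difference `jacobian` helper, is our own illustration, not part of the paper.

```python
def jacobian(f, x, eps=1e-8):
    # forward-difference derivative; valid where f is differentiable
    return (f(x + eps) - f(x)) / eps

f = abs  # f(x) = |x|: locally Lipschitzian, nondifferentiable only at 0

# limiting Jacobians J(x_i) along sequences x_i -> 0 with x_i outside Omega_f
limits = {round(jacobian(f, t)) for t in (1e-3, 1e-4, -1e-3, -1e-4)}

# the generalized Jacobian at 0 is the convex hull of these limits
lo, hi = min(limits), max(limits)  # the interval [-1, 1]
```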

Following Clarke [7], the next two propositions, the first about a locally Lipschitzian function at $x$ and the second about composite functions, are true.

PROPOSITION 1. If $f: \mathbb{R}^n \to \mathbb{R}^m$ is a locally Lipschitzian function at $x$, then:

1. $\partial f(x)$ is a nonempty convex compact subset of $\mathbb{R}^{m \times n}$;


2. $\partial f$ is closed at $x$; that is, if $x_i \to x$, $Z_i \in \partial f(x_i)$, and $Z_i \to Z$, then $Z \in \partial f(x)$;

3. $\partial f$ is upper semicontinuous at $x$; that is, for any $\varepsilon > 0$ there is a $\delta > 0$ such that for all $y \in x + \delta B_n$ it holds that $\partial f(y) \subset \partial f(x) + \varepsilon B_{m \times n}$, where $B_n$ stands for the unit ball in $\mathbb{R}^n$ and $B_{m \times n}$ stands for the unit ball in $\mathbb{R}^{m \times n}$;

4. if each component function $f_i$ of $f$ is Lipschitzian of rank $k_i$ at $x$, then $f$ is Lipschitzian at $x$ with rank $k = \|(k_1, k_2, \ldots, k_m)\|$, and $\partial f(x) \subset k B_{m \times n}$.

PROPOSITION 2. Let $F = h \circ f$, where $f: \mathbb{R}^n \to \mathbb{R}^m$ is locally Lipschitzian at $x$, and $h: \mathbb{R}^m \to \mathbb{R}$ is a continuously differentiable convex function. Then $F$ is a locally Lipschitzian function at $x$ and one has $\partial F(x) = Z^T \nabla h(f(x))$, where $Z \in \partial f(x)$.

For simplicity of notation, we denote

- $\Phi(x; d; Z) = h(f(x)) - h(f(x) + Z^T d)$,
- $\Psi_r(x; Z) = \max_d \{\Phi(x; d; Z) : \|d\| \le r\}$, (2.1)
- $F^\circ(x; d) = \max_{Z \in \partial f(x)} \langle d, Z^T \nabla h(f(x)) \rangle$.

If $F^\circ(x^*; d) \ge 0$ for all $d \in \mathbb{R}^n$, then $x^*$ is a critical point of $F(x) = h(f(x))$. The following elementary results are obtained immediately from Yuan [3]:

PROPOSITION 3. Let $F$, $\Phi$, and $\Psi$ be as above. Then:

1. $F^\circ(x; d)$ exists for all $x$ and $d$ in $\mathbb{R}^n$;
2. $\Phi(x; \cdot; Z)$ is a concave function on $\mathbb{R}^n$, and given $d \in \mathbb{R}^n$, its directional derivative in the direction $d$ evaluated at $d = 0$ is $F^\circ(x; d)$;
3. $\Phi(x; d; \cdot)$ is a concave function on $\partial f(x)$;
4. $\Psi_r(x; Z) \ge 0$ for any $r > 0$; $\Psi_r(x; Z) = 0$ if and only if $x$ is a stationary point of $h(f(x))$;
5. $\Psi_r(x; Z)$ is a concave function in $r$;
6. $\Psi_r(\cdot; Z)$ is continuous for any given $r \ge 0$.
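Item 4 can be checked numerically on a small example. Below we take $h(y) = \frac{1}{2}y^2$ and $f(x) = |x|$ (our own illustrative choices, not from the paper) and evaluate $\Psi_r$ by brute-force grid search over $\|d\| \le r$: away from the minimizer $\Psi_r > 0$, while at the stationary point $x = 0$ of $h(f(x))$ we get $\Psi_r = 0$.

```python
import numpy as np

def Phi(h, f, x, d, Z):
    # Phi(x; d; Z) = h(f(x)) - h(f(x) + Z^T d), as in (2.1)
    return h(f(x)) - h(f(x) + Z * d)

def Psi(h, f, x, Z, r, grid=10001):
    # Psi_r(x; Z): brute-force maximization of Phi over ||d|| <= r
    return max(Phi(h, f, x, d, Z) for d in np.linspace(-r, r, grid))

h = lambda y: 0.5 * y * y  # continuously differentiable convex h (our choice)
f = abs                    # locally Lipschitzian, nonsmooth f (our choice)

psi_away = Psi(h, f, 2.0, 1.0, r=1.0)  # away from the minimizer: Psi > 0
psi_stat = Psi(h, f, 0.0, 1.0, r=1.0)  # at the stationary point x = 0: Psi = 0
```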

3. THE MAIN RESULT

Next we shall prove that, under the conditions of Section 1 on $h$, $f$, and $B_k$, Fletcher's condition (1.5) holds.


PROPOSITION 4. The following lower bound for the predicted reduction holds at each iteration:
$$h(f(x_k)) - \phi_k(d_k) \ge \frac{1}{2} \Psi_{\Delta_k}(x_k; Z) \min\{1,\ \Psi_{\Delta_k}(x_k; Z) / (\|B_k\| \Delta_k^2)\}.$$

PROOF. Since $d_k$ minimizes $\phi_k(d)$, we have
$$h(f(x_k)) - \phi_k(d_k) \ge h(f(x_k)) - \phi_k(d), \quad \forall d \in \mathbb{R}^n,\ \|d\| \le \Delta_k.$$

Let $\bar{d}_k$ be such that the maximum in (2.1) is achieved; thus,
$$\Psi_{\Delta_k}(x_k; Z) = h(f(x_k)) - h(f(x_k) + Z^T \bar{d}_k).$$

Then, using the convexity of $h(f(x_k) + Z^T(\cdot))$, we have for all $\alpha \in (0, 1]$,
$$\begin{aligned}
h(f(x_k)) - \phi_k(d_k) &\ge h(f(x_k)) - \phi_k(\alpha \bar{d}_k) \\
&= h(f(x_k)) - h(f(x_k) + \alpha Z^T \bar{d}_k) - \tfrac{1}{2}\alpha^2 \bar{d}_k^T B_k \bar{d}_k \\
&\ge h(f(x_k)) - \alpha h(f(x_k) + Z^T \bar{d}_k) - (1 - \alpha) h(f(x_k)) - \tfrac{1}{2}\alpha^2 \bar{d}_k^T B_k \bar{d}_k \\
&= \alpha [h(f(x_k)) - h(f(x_k) + Z^T \bar{d}_k)] - \tfrac{1}{2}\alpha^2 \bar{d}_k^T B_k \bar{d}_k \\
&= \alpha \Psi_{\Delta_k}(x_k; Z) - \tfrac{1}{2}\alpha^2 \bar{d}_k^T B_k \bar{d}_k \\
&\ge \alpha \Psi_{\Delta_k}(x_k; Z) - \tfrac{1}{2}\alpha^2 \|B_k\| \Delta_k^2.
\end{aligned}$$


The above inequality means
$$h(f(x_k)) - \phi_k(d_k) \ge \max_{0 < \alpha \le 1} \left[\alpha \Psi_{\Delta_k}(x_k; Z) - \tfrac{1}{2}\alpha^2 \|B_k\| \Delta_k^2\right] \ge \min\left\{\tfrac{1}{2}\Psi_{\Delta_k}(x_k; Z),\ \tfrac{1}{2}\frac{\Psi_{\Delta_k}(x_k; Z)^2}{\|B_k\| \Delta_k^2}\right\}.$$
The proof is complete. □
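Proposition 4 can be sanity-checked numerically. In the sketch below we again take $h(y) = \frac{1}{2}y^2$, $f(x) = |x|$, $Z = 1$, $B_k = 1$, and $\Delta_k = 1$ (all illustrative choices of ours, not from the paper); the subproblem minimizer $d_k$ is available in closed form for this scalar quadratic-in-$d$ model, and $\Psi_{\Delta_k}$ is evaluated by grid search.

```python
import numpy as np

# Illustrative check of Proposition 4 with h(y) = y^2/2, f(x) = |x|,
# Z = 1, B_k = 1, Delta_k = 1 (assumed choices, not from the paper).
h, f = (lambda y: 0.5 * y * y), abs
x, Z, B, Delta = 2.0, 1.0, 1.0, 1.0

# exact minimizer of phi_k(d) = h(f(x) + Z d) + (1/2) B d^2 over |d| <= Delta
d_unc = -Z * f(x) / (Z * Z + B)       # unconstrained stationary point
d_k = max(-Delta, min(Delta, d_unc))  # project onto the trust interval
phi_k = h(f(x) + Z * d_k) + 0.5 * B * d_k * d_k

# Psi_Delta(x; Z) from definition (2.1), by brute-force grid search
Psi = max(h(f(x)) - h(f(x) + Z * d) for d in np.linspace(-Delta, Delta, 10001))

lhs = h(f(x)) - phi_k                                    # predicted reduction
rhs = 0.5 * Psi * min(1.0, Psi / (abs(B) * Delta ** 2))  # Proposition 4 bound
```

With these values the predicted reduction is $1$ while the bound of Proposition 4 evaluates to $0.75$, so the inequality holds with slack.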

Now we are able to prove, under certain conditions on $B_k$, that if the sequence $\{x_k\}$ generated by the algorithm of Section 1 lies in a bounded set $B$ whose diameter is not bigger than $\bar{\Delta}$, then the sequence $\{x_k\}$ has an accumulation point that satisfies Fletcher's condition (1.5).

THEOREM 5. If $h(f(x))$ satisfies all the conditions stated in Section 1, if the sequence $\{x_k\}$ generated by the algorithm of Section 1 lies in a bounded set $B$ whose diameter is not bigger than $\bar{\Delta}$, and if $B_k$ satisfies
$$\|B_k\| \le c_5 + c_6 \sum_{i=1}^{k} \Delta_i,$$
then the sequence $\{x_k\}$ has an accumulation point $x^*$ that satisfies condition (1.5).

PROOF. Suppose, on the contrary, that the sequence $\{x_k\}$ is bounded away from the stationary points of $h(f(x))$. By item 4 of Proposition 3, there exists $\delta > 0$ such that $\Psi_1(x_k; Z) > \delta$ for all $k$.

Proposition 4 and Proposition 3(5) indicate that
$$h(f(x_k)) - \phi_k(d_k) \ge c_7 \min\left\{\Delta_k, \frac{1}{\|B_k\|}\right\} \ge \frac{c_7 \Delta_k}{1 + \|B_k\| \Delta_k}$$
holds for all $k$. Using the fact that $h$ is bounded below, we have that $\sum_k' [h(f(x_k)) - h(f(x_{k+1}))]$ is finite, and so is $\sum_k' [h(f(x_k)) - \phi_k(d_k)]$, where $\sum'$ stands for the sum over the iterations on which (1.4) holds. If we choose $c_5$, $c_6$ such that
$$1 + \|B_k\| \Delta_k \le c_5 + c_6 \sum_{i=1}^{k} \Delta_i,$$


then $\sum_k' \Delta_k / (c_5 + c_6 \sum_{i=1}^{k} \Delta_i)$ is finite. Therefore, $\sum_k' \Delta_k$ is convergent. By the definition of $\Delta_k$, we have, due to Powell [2],
$$\sum_{i=1}^{\infty} \Delta_i \le \left(1 + \frac{c_1}{1 - c_4}\right)\left[\Delta_1 + \sum_k{}' \Delta_k\right]. \qquad (3.2)$$
We notice by (3.2) that $\sum_{i=1}^{\infty} \Delta_i$ is finite, and then $B_k$ is bounded over all $k$; that is, $B_k$ is uniformly bounded. Thus, $\Psi_1(x_k; Z)$ cannot be bounded away from zero, by a result of Fletcher [1]. This contradiction shows that our theorem is true. □

4. APPLICATIONS

There are two obvious applications of the above theory: least squares problems with nonsmooth data and nonsmooth equations. It is enough to identify the function $h$ with the squared Euclidean norm. Take $h(\cdot) = \frac{1}{2}\|\cdot\|^2$ and $f(\cdot) = F(\cdot) - b$, where $F(x)^T = (f_1(x), f_2(x), \ldots, f_m(x))$ and $b^T = (b_1, b_2, \ldots, b_m)$, and the $f_i$ are not necessarily smooth. In this case, problem (1.2) is the nonsmooth least squares problem

$$\min_{x \in \mathbb{R}^n} \frac{1}{2}\|F(x) - b\|_2^2. \qquad (4.1)$$

Since $h$ is a continuously differentiable convex function bounded below, and $F(\cdot) - b$ is locally Lipschitzian, we can apply our trust region method to solve (4.1).

ALGORITHM.

1. Set initial values: given $x_0 \in \mathbb{R}^n$, $B_0 \in \mathbb{R}^{n \times n}$, $\bar{\Delta} > \Delta_0 > 0$; given positive constants $c_1 \ge 1$, $c_2 < 1$, $c_3 < c_4 \le 1$.
2. For $k = 0, 1, 2, \ldots$:

2.1 Solve the subproblem
$$\min_{\|d\| \le \Delta_k} \frac{1}{2}\|F(x_k) - b + Z^T d\|_2^2 + \frac{1}{2} d^T B_k d \qquad (5.1)$$
where $Z \in \partial F(x_k)$ is conveniently fixed.

2.2 If $\|F(x_k + d_k) - b\|_2^2 < \|F(x_k) - b\|_2^2$, then $x_{k+1} = x_k + d_k$; else $x_{k+1} = x_k$, where $d_k$ is a solution of (5.1).


2.3 Check convergence. If some convergence rule is satisfied, then stop; else go to step 2.4.

2.4 Update $\Delta_k$: set
$$r_k = \frac{\|F(x_k) - b\|_2^2 - \|F(x_k + d_k) - b\|_2^2}{\|F(x_k) - b\|_2^2 - 2\phi_k(d_k)},$$
where $\phi_k$ denotes the objective function of (5.1), and
$$\Delta_{k+1} \in \begin{cases} [\Delta_k, \min\{c_1 \Delta_k, \bar{\Delta}\}] & \text{if } r_k \ge c_2, \\ [c_3 \Delta_k, c_4 \Delta_k] & \text{otherwise.} \end{cases}$$

2.5 Update $B_k$ according to some update formula.
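Putting the pieces together, the algorithm above can be sketched for a toy problem. Everything below is an illustration under stated assumptions, not the paper's method verbatim: the inner solver only approximates subproblem (5.1) by truncating the unconstrained minimizer to the trust region ball (the paper leaves the inner solver open), $B_k$ is held fixed at the identity in place of step 2.5, the convergence test of step 2.3 is replaced by a fixed iteration count, and the test function $F(x) = (|x_1|, |x_2|)$ with its generalized Jacobian element is our own choice.

```python
import numpy as np

def solve_subproblem(r, Z, B, Delta):
    """Approximate solution of subproblem (5.1),
    min_{||d|| <= Delta} (1/2)||r + Z^T d||^2 + (1/2) d^T B d,
    by truncating the unconstrained minimizer to the ball (inexact)."""
    d = np.linalg.solve(Z @ Z.T + B, -Z @ r)  # stationarity: (ZZ^T + B)d = -Zr
    nd = np.linalg.norm(d)
    return d if nd <= Delta else (Delta / nd) * d

def nonsmooth_least_squares(F, jac_elt, b, x0, Delta0=1.0, Delta_bar=10.0,
                            c1=2.0, c2=0.25, c4=0.5, iters=50):
    """Trust region scheme of Section 4 (simplified). jac_elt(x) returns
    one fixed element Z of the generalized Jacobian of F at x."""
    x, Delta = np.asarray(x0, float), Delta0
    B = np.eye(len(x))                        # B_k fixed (in place of step 2.5)
    obj = lambda y: 0.5 * np.dot(F(y) - b, F(y) - b)
    for _ in range(iters):
        Z = jac_elt(x)
        r = F(x) - b
        d = solve_subproblem(r, Z, B, Delta)  # step 2.1
        model = 0.5 * np.dot(r + Z.T @ d, r + Z.T @ d) + 0.5 * d @ B @ d
        actual, predicted = obj(x) - obj(x + d), obj(x) - model
        if actual > 0:                        # step 2.2: accept descent steps
            x = x + d
        if predicted > 0 and actual >= c2 * predicted:
            Delta = min(c1 * Delta, Delta_bar)  # step 2.4: expand
        else:
            Delta = c4 * Delta                  #           shrink
    return x
```

For $F(x) = (|x_1|, |x_2|)$ and $b = (1, 2)$, a fixed Jacobian element is the diagonal sign matrix (taking $+1$ at the kink), and the iterates converge to a point with $|x| = b$.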

Similarly, we can solve the nonsmooth equation $F(x) = 0$ by using our trust region technique to solve the following minimization problem:
$$\min_{x \in \mathbb{R}^n} \frac{1}{2}\|F(x)\|_2^2.$$

The work was supported by CNPq/Brazil, the Foundation of the Federal University of Paraná, Brazil, and the National Natural Science Foundation of China. This work was done while the last author was visiting the Federal University of Paraná, Brazil.

REFERENCES

1. R. Fletcher, Practical Methods of Optimization, Vol. 2: Constrained Optimization, John Wiley and Sons, New York, 1981.

2. M. J. D. Powell, Convergence properties of a class of minimization algorithms, in: O. L. Mangasarian, R. R. Meyer, and S. M. Robinson, Eds., Nonlinear Programming 2, Academic Press, New York, 1975.

3. Y. Yuan, Conditions for convergence of trust region algorithms for nonsmooth optimization, Mathematical Programming 31:220-228 (1985).

4. L. Qi and J. Sun, A nonsmooth version of Newton's method, Mathematical Programming 58:353-367 (1993).

5. J. M. Martinez and L. Qi, Inexact Newton methods for solving nonsmooth equations, J. Comput. Appl. Math., to appear.

6. W. Sun and Y. Yuan, Optimization Theory and Methods, Academic Press, Beijing, 1995.

7. F. H. Clarke, Optimization and Nonsmooth Analysis, Wiley, New York, 1983.