
AN INTRODUCTION TO OPTIMIZATION


WILEY SERIES IN DISCRETE MATHEMATICS AND OPTIMIZATION

A complete list of titles in this series appears at the end of this volume.


AN INTRODUCTION TO OPTIMIZATION

Fourth Edition

Edwin K. P. Chong
Colorado State University

Stanislaw H. Zak
Purdue University

WILEY
A JOHN WILEY & SONS, INC., PUBLICATION


Copyright © 2013 by John Wiley & Sons, Inc. All rights reserved

Published by John Wiley & Sons, Inc., Hoboken, New Jersey

Published simultaneously in Canada

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging-in-Publication Data

Chong, Edwin Kah Pin.
An introduction to optimization / Edwin K. P. Chong, Colorado State University, Stanislaw H. Zak, Purdue University. — Fourth edition.
pages cm

Summary: "The purpose of the book is to give the reader a working knowledge of optimization theory and methods" — Provided by publisher.

Includes bibliographical references and index. ISBN 978-1-118-27901-4 (hardback)

1. Mathematical optimization. I. Zak, Stanislaw H. II. Title. QA402.5.C476 2012 519.6—dc23 2012031772

Printed in the United States of America.

10 9 8 7 6 5 4 3 2 1


To my wife, Yat-Yee, and to my parents, Paul and Julienne Chong.
Edwin K. P. Chong

To JMJ; my wife, Mary Ann; and my parents, Janina and Konstanty Zak.
Stanislaw H. Zak


CONTENTS

Preface xiii

PART I MATHEMATICAL REVIEW

1 Methods of Proof and Some Notation 3

1.1 Methods of Proof 3
1.2 Notation 5

Exercises 6

2 Vector Spaces and Matrices 7

2.1 Vector and Matrix 7
2.2 Rank of a Matrix 13
2.3 Linear Equations 17
2.4 Inner Products and Norms 19

Exercises 22

3 Transformations 25

3.1 Linear Transformations 25
3.2 Eigenvalues and Eigenvectors 26
3.3 Orthogonal Projections 29
3.4 Quadratic Forms 31
3.5 Matrix Norms 35

Exercises 40

4 Concepts from Geometry 45

4.1 Line Segments 45
4.2 Hyperplanes and Linear Varieties 46
4.3 Convex Sets 48
4.4 Neighborhoods 50
4.5 Polytopes and Polyhedra 52

Exercises 53

5 Elements of Calculus 55

5.1 Sequences and Limits 55
5.2 Differentiability 62
5.3 The Derivative Matrix 63
5.4 Differentiation Rules 67
5.5 Level Sets and Gradients 68
5.6 Taylor Series 72

Exercises 77

PART II UNCONSTRAINED OPTIMIZATION

6 Basics of Set-Constrained and Unconstrained Optimization 81

6.1 Introduction 81
6.2 Conditions for Local Minimizers 83

Exercises 93

7 One-Dimensional Search Methods 103

7.1 Introduction 103
7.2 Golden Section Search 104
7.3 Fibonacci Method 108
7.4 Bisection Method 116
7.5 Newton's Method 116
7.6 Secant Method 120
7.7 Bracketing 123


7.8 Line Search in Multidimensional Optimization 124

Exercises 126

8 Gradient Methods 131

8.1 Introduction 131
8.2 The Method of Steepest Descent 133
8.3 Analysis of Gradient Methods 141

Exercises 153

9 Newton's Method 161

9.1 Introduction 161
9.2 Analysis of Newton's Method 164
9.3 Levenberg-Marquardt Modification 168
9.4 Newton's Method for Nonlinear Least Squares 168

Exercises 171

10 Conjugate Direction Methods 175

10.1 Introduction 175
10.2 The Conjugate Direction Algorithm 177
10.3 The Conjugate Gradient Algorithm 182
10.4 The Conjugate Gradient Algorithm for Nonquadratic Problems 186

Exercises 189

11 Quasi-Newton Methods 193

11.1 Introduction 193
11.2 Approximating the Inverse Hessian 194
11.3 The Rank One Correction Formula 197
11.4 The DFP Algorithm 202
11.5 The BFGS Algorithm 207

Exercises 211

12 Solving Linear Equations 217

12.1 Least-Squares Analysis 217
12.2 The Recursive Least-Squares Algorithm 227
12.3 Solution to a Linear Equation with Minimum Norm 231
12.4 Kaczmarz's Algorithm 232
12.5 Solving Linear Equations in General 236


Exercises 244

13 Unconstrained Optimization and Neural Networks 253

13.1 Introduction 253
13.2 Single-Neuron Training 256
13.3 The Backpropagation Algorithm 258

Exercises 270

14 Global Search Algorithms 273

14.1 Introduction 273
14.2 The Nelder-Mead Simplex Algorithm 274
14.3 Simulated Annealing 278
14.4 Particle Swarm Optimization 282
14.5 Genetic Algorithms 285

Exercises 298

PART III LINEAR PROGRAMMING

15 Introduction to Linear Programming 305

15.1 Brief History of Linear Programming 305
15.2 Simple Examples of Linear Programs 307
15.3 Two-Dimensional Linear Programs 314
15.4 Convex Polyhedra and Linear Programming 316
15.5 Standard Form Linear Programs 318
15.6 Basic Solutions 324
15.7 Properties of Basic Solutions 327
15.8 Geometric View of Linear Programs 330

Exercises 335

16 Simplex Method 339

16.1 Solving Linear Equations Using Row Operations 339
16.2 The Canonical Augmented Matrix 346
16.3 Updating the Augmented Matrix 349
16.4 The Simplex Algorithm 350
16.5 Matrix Form of the Simplex Method 357
16.6 Two-Phase Simplex Method 361
16.7 Revised Simplex Method 364

Exercises 369


17 Duality 379

17.1 Dual Linear Programs 379
17.2 Properties of Dual Problems 387

Exercises 394

18 Nonsimplex Methods 403

18.1 Introduction 403
18.2 Khachiyan's Method 405
18.3 Affine Scaling Method 408
18.4 Karmarkar's Method 413

Exercises 426

19 Integer Linear Programming 429

19.1 Introduction 429
19.2 Unimodular Matrices 430
19.3 The Gomory Cutting-Plane Method 437

Exercises 447

PART IV NONLINEAR CONSTRAINED OPTIMIZATION

20 Problems with Equality Constraints 453

20.1 Introduction 453
20.2 Problem Formulation 455
20.3 Tangent and Normal Spaces 456
20.4 Lagrange Condition 463
20.5 Second-Order Conditions 472
20.6 Minimizing Quadratics Subject to Linear Constraints 476

Exercises 481

21 Problems with Inequality Constraints 487

21.1 Karush-Kuhn-Tucker Condition 487
21.2 Second-Order Conditions 496

Exercises 501

22 Convex Optimization Problems 509

22.1 Introduction 509
22.2 Convex Functions 512
22.3 Convex Optimization Problems 521


22.4 Semidefinite Programming 527

Exercises 540

23 Algorithms for Constrained Optimization 549

23.1 Introduction 549
23.2 Projections 549
23.3 Projected Gradient Methods with Linear Constraints 553
23.4 Lagrangian Algorithms 557
23.5 Penalty Methods 564

Exercises 571

24 Multiobjective Optimization 577

24.1 Introduction 577
24.2 Pareto Solutions 578
24.3 Computing the Pareto Front 581
24.4 From Multiobjective to Single-Objective Optimization 585
24.5 Uncertain Linear Programming Problems 588

Exercises 596

References 599

Index 609


PREFACE

Optimization is central to any problem involving decision making, whether in engineering or in economics. The task of decision making entails choosing among various alternatives. This choice is governed by our desire to make the "best" decision. The measure of goodness of the alternatives is described by an objective function or performance index. Optimization theory and methods deal with selecting the best alternative in the sense of the given objective function.

The area of optimization has received enormous attention in recent years, primarily because of the rapid progress in computer technology, including the development and availability of user-friendly software, high-speed and parallel processors, and artificial neural networks. A clear example of this phenomenon is the wide accessibility of optimization software tools such as the Optimization Toolbox of MATLAB¹ and the many other commercial software packages.

There are currently several excellent graduate textbooks on optimization theory and methods (e.g., [3], [39], [43], [51], [87], [88], [104], [129]), as well as undergraduate textbooks on the subject with an emphasis on engineering design (e.g., [1] and [109]). However, there is a need for an introductory

¹MATLAB is a registered trademark of The MathWorks, Inc.



textbook on optimization theory and methods at a senior undergraduate or beginning graduate level. The present text was written with this goal in mind. The material is an outgrowth of our lecture notes for a one-semester course in optimization methods for seniors and beginning graduate students at Purdue University, West Lafayette, Indiana. In our presentation, we assume a working knowledge of basic linear algebra and multivariable calculus. For the reader's convenience, a part of this book (Part I) is devoted to a review of the required mathematical background material. Numerous figures throughout the text complement the written presentation of the material. We also include a variety of exercises at the end of each chapter. A solutions manual with complete solutions to the exercises is available from the publisher to instructors who adopt this text. Some of the exercises require using MATLAB. The student edition of MATLAB is sufficient for all of the MATLAB exercises included in the text. The MATLAB source listings for the MATLAB exercises are also included in the solutions manual.

The purpose of the book is to give the reader a working knowledge of optimization theory and methods. To accomplish this goal, we include many examples that illustrate the theory and algorithms discussed in the text. However, it is not our intention to provide a cookbook of the most recent numerical techniques for optimization; rather, our goal is to equip the reader with sufficient background for further study of advanced topics in optimization.

The field of optimization is still a very active research area. In recent years, various new approaches to optimization have been proposed. In this text, we have tried to reflect at least some of the flavor of recent activity in the area. For example, we include a discussion of randomized search methods—these include particle swarm optimization and genetic algorithms, topics of increasing importance in the study of complex adaptive systems. There has also been a recent surge of applications of optimization methods to a variety of new problems. An example of this is the use of descent algorithms for the training of feedforward neural networks. An entire chapter in the book is devoted to this topic. The area of neural networks is an active area of ongoing research, and many books have been devoted to this subject. The topic of neural network training fits perfectly into the framework of unconstrained optimization methods. Therefore, the chapter on feedforward neural networks not only provides an example of application of unconstrained optimization methods but also gives the reader an accessible introduction to what is currently a topic of wide interest.

The material in this book is organized into four parts. Part I contains a review of some basic definitions, notations, and relations from linear algebra, geometry, and calculus that we use frequently throughout the book. In Part II we consider unconstrained optimization problems. We first discuss some theoretical foundations of set-constrained and unconstrained optimization, including necessary and sufficient conditions for minimizers and maximizers. This is followed by a treatment of various iterative optimization algorithms, including line search methods, together with their properties. A discussion of global


search algorithms is included in this part. We also analyze the least-squares optimization problem and the associated recursive least-squares algorithm. Parts III and IV are devoted to constrained optimization. Part III deals with linear programming problems, which form an important class of constrained optimization problems. We give examples and analyze properties of linear programs, and then discuss the simplex method for solving linear programs. We also provide a brief treatment of dual linear programming problems. We then describe some nonsimplex algorithms for solving linear programs: Khachiyan's method, the affine scaling method, and Karmarkar's method. We wrap up Part III by discussing integer linear programming problems. In Part IV we treat nonlinear constrained optimization. Here, as in Part II, we first present some theoretical foundations of nonlinear constrained optimization problems, including convex optimization problems. We then discuss different algorithms for solving constrained optimization problems. We also treat multiobjective optimization.

Although we have made every effort to ensure an error-free text, we suspect that some errors remain undetected. For this purpose, we provide online updated errata that can be found at the Web site for the book, accessible via

http://www.wiley.com/mathematics

We are grateful to several people for their help during the course of writing this book. In particular, we thank Dennis Goodman of Lawrence Livermore Laboratories for his comments on early versions of Part II and for making available to us his lecture notes on nonlinear optimization. We thank Moshe Kam of Drexel University for pointing out some useful references on nonsimplex methods. We are grateful to Ed Silverman and Russell Quong for their valuable remarks on Part I of the first edition. We also thank the students of ECE 580 at Purdue University and ECE 520 and MATH 520 at Colorado State University for their many helpful comments and suggestions. In particular, we are grateful to Christopher Taylor for his diligent proofreading of early manuscripts of this book. This fourth edition incorporates many valuable suggestions of users of the first, second, and third editions, to whom we are grateful.

E. K. P. CHONG AND S. H. ZAK Fort Collins, Colorado, and West Lafayette, Indiana


PART I

MATHEMATICAL REVIEW


CHAPTER 1

METHODS OF PROOF AND SOME NOTATION

1.1 Methods of Proof

Consider two statements, "A" and "B," which could be either true or false. For example, let "A" be the statement "John is an engineering student," and let "B" be the statement "John is taking a course on optimization." We can combine these statements to form other statements, such as "A and B" or "A or B." In our example, "A and B" means "John is an engineering student, and he is taking a course on optimization." We can also form statements such as "not A," "not B," "not (A and B)," and so on. For example, "not A" means "John is not an engineering student." The truth or falsity of the combined statements depends on the truth or falsity of the original statements, "A" and "B." This relationship is expressed by means of truth tables; see Tables 1.1 and 1.2.

From the tables, it is easy to see that the statement "not (A and B)" is equivalent to "(not A) or (not B)" (see Exercise 1.3). This is called DeMorgan's law.

In proving statements, it is convenient to express a combined statement by a conditional, such as "A implies B," which we denote "A=>B." The conditional



Table 1.1 Truth Table for "A and B" and "A or B"

A  B  A and B  A or B
F  F     F       F
F  T     F       T
T  F     F       T
T  T     T       T

Table 1.2 Truth Table for "not A"

A  not A
F    T
T    F

Table 1.3 Truth Table for Conditionals and Biconditionals

A  B  A=>B  A<=B  A<=>B
F  F    T     T      T
F  T    T     F      F
T  F    F     T      F
T  T    T     T      T

"A=>B" is simply the combined statement "(not A) or B" and is often also read "A only if B," or "if A then B," or "A is sufficient for B," or "B is necessary for A."

We can combine two conditional statements to form a biconditional statement of the form "A<=>B," which simply means "(A=>B) and (B=>A)." The statement "A<=>B" reads "A if and only if B," or "A is equivalent to B," or "A is necessary and sufficient for B." Truth tables for conditional and biconditional statements are given in Table 1.3.

It is easy to verify, using the truth table, that the statement "A=>B" is equivalent to the statement "(not B)=>(not A)." The latter is called the contrapositive of the former. If we take the contrapositive of DeMorgan's law, we obtain the assertion that "not (A or B)" is equivalent to "(not A) and (not B)."
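These equivalences can also be checked mechanically. The short Python sketch below (an illustration added here; the book's computational exercises use MATLAB) enumerates all truth assignments of A and B and verifies DeMorgan's law and the equivalence of a conditional with its contrapositive. The helper name implies is ours, not a standard library function.

```python
from itertools import product

def implies(a, b):
    # "A => B" is defined as "(not A) or B"
    return (not a) or b

for A, B in product([False, True], repeat=2):
    # DeMorgan's law: not (A and B) <=> (not A) or (not B)
    assert (not (A and B)) == ((not A) or (not B))
    # Contrapositive: (A => B) <=> ((not B) => (not A))
    assert implies(A, B) == implies(not B, not A)
    # Dual form: not (A or B) <=> (not A) and (not B)
    assert (not (A or B)) == ((not A) and (not B))

print("All equivalences hold for every truth assignment.")
```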

Most statements we deal with have the form "A=>B." To prove such a statement, we may use one of the following three different techniques:

1. The direct method


2. Proof by contraposition

3. Proof by contradiction or reductio ad absurdum

In the case of the direct method, we start with "A," then deduce a chain of various consequences to end with "B."

A useful method for proving statements is proof by contraposition, based on the equivalence of the statements "A=>B" and "(not B)=>(not A)." We start with "not B," then deduce various consequences to end with "not A" as a conclusion.

Another method of proof that we use is proof by contradiction, based on the equivalence of the statements "A=>B" and "not (A and (not B))." Here we begin with "A and (not B)" and derive a contradiction.

Occasionally, we use the principle of induction to prove statements. This principle may be stated as follows. Assume that a given property of positive integers satisfies the following conditions:

■ The number 1 possesses this property.

■ If the number n possesses this property, then the number n + 1 possesses it too.

The principle of induction states that under these assumptions any positive integer possesses the property.

The principle of induction is easily understood using the following intuitive argument. If the number 1 possesses the given property, then the second condition implies that the number 2 possesses the property. But, then again, the second condition implies that the number 3 possesses this property, and so on. The principle of induction is a formal statement of this intuitive reasoning.
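A standard illustration of the principle (included here for concreteness): consider the claim that 1 + 2 + · · · + n = n(n + 1)/2 for every positive integer n. The number 1 possesses this property, since 1 = 1(1 + 1)/2. If the number n possesses it, then 1 + 2 + · · · + n + (n + 1) = n(n + 1)/2 + (n + 1) = (n + 1)(n + 2)/2, so the number n + 1 possesses it too. By the principle of induction, the formula holds for every positive integer.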

For a detailed treatment of different methods of proof, see [130].

1.2 Notation

Throughout, we use the following notation. If X is a set, then we write x ∈ X to mean that x is an element of X. When an object x is not an element of a set X, we write x ∉ X. We also use the "curly bracket notation" for sets, writing down the first few elements of a set followed by three dots. For example, {x_1, x_2, x_3, ...} is the set containing the elements x_1, x_2, x_3, and so on. Alternatively, we can explicitly display the law of formation. For example, {x : x ∈ R, x > 5} reads "the set of all x such that x is real and x is greater than 5." The colon following x reads "such that." An alternative notation for the same set is {x ∈ R : x > 5}.

If X and Y are sets, then we write X ⊂ Y to mean that every element of X is also an element of Y. In this case, we say that X is a subset of Y. If X and Y are sets, then we denote by X \ Y ("X minus Y") the set of all points in X that are not in Y. Note that X \ Y is a subset of X. The


notation f : X → Y means "f is a function from the set X into the set Y." The symbol := denotes arithmetic assignment. Thus, a statement of the form x := y means "x becomes y." The symbol ≜ means "equals by definition."

Throughout the text, we mark the end of theorems, lemmas, propositions, and corollaries using the symbol □. We mark the end of proofs, definitions, and examples by ■.

We use the IEEE style when citing reference items. For example, [77] represents reference number 77 in the list of references at the end of the book.

EXERCISES

1.1 Construct the truth table for the statement "(not B)=>(not A)," and use it to show that this statement is equivalent to the statement "A=^B."

1.2 Construct the truth table for the statement "not (A and (not B))," and use it to show that this statement is equivalent to the statement "A=>B."

1.3 Prove DeMorgan's law by constructing the appropriate truth tables.

1.4 Prove that for any statements A and B, we have "A <=> ((A and B) or (A and (not B)))." This is useful because it allows us to prove a statement A by proving the two separate cases "(A and B)" and "(A and (not B))." For example, to prove that |x| ≥ x for any x ∈ R, we separately prove the cases "|x| ≥ x and x ≥ 0" and "|x| ≥ x and x < 0." Proving the two cases turns out to be easier than proving the statement |x| ≥ x directly (see Section 2.4 and Exercise 2.7).

1.5 (This exercise is adapted from [22, pp. 80-81].) Suppose that you are shown four cards, laid out in a row. Each card has a letter on one side and a number on the other. On the visible side of the cards are printed the symbols

S 8 3 A

Determine which cards you should turn over to decide if the following rule is true or false: "If there is a vowel on one side of the card, then there is an even number on the other side."


CHAPTER 2

VECTOR SPACES AND MATRICES

2.1 Vector and Matrix

We define a column n-vector to be an array of n numbers, denoted

$$a = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix}.$$

The number a_i is called the ith component of the vector a. Denote by R the set of real numbers and by R^n the set of column n-vectors with real components. We call R^n an n-dimensional real vector space. We commonly denote elements of R^n by lowercase bold letters (e.g., x). The components of x ∈ R^n are denoted x_1, ..., x_n.

We define a row n-vector as

[a_1, a_2, ..., a_n].



The transpose of a given column vector a is a row vector with corresponding elements, denoted a^T. For example, if

$$a = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix},$$

then a^T = [a_1, a_2, ..., a_n]. Equivalently, we may write a = [a_1, a_2, ..., a_n]^T. Throughout the text we adopt the convention that the term vector (without the qualifier row or column) refers to a column vector.

Two vectors a = [a_1, a_2, ..., a_n]^T and b = [b_1, b_2, ..., b_n]^T are equal if a_i = b_i, i = 1, 2, ..., n.

The sum of the vectors a and b, denoted a + b, is the vector

a + b = [a_1 + b_1, a_2 + b_2, ..., a_n + b_n]^T.

The operation of addition of vectors has the following properties:

1. The operation is commutative:

a + b = b + a.

2. The operation is associative:

(a + b) + c = a + (b + c).

3. There is a zero vector 0 = [0, 0, ..., 0]^T such that

a + 0 = 0 + a = a.

The vector

[a_1 - b_1, a_2 - b_2, ..., a_n - b_n]^T

is called the difference between a and b and is denoted a - b. The vector 0 - b is denoted -b. Note that

b + (a - b) = a,
-(a - b) = b - a.


The vector b - a is the unique solution of the vector equation

a + x = b.

Indeed, suppose that x = [x_1, x_2, ..., x_n]^T is a solution to a + x = b. Then,

a_1 + x_1 = b_1,
a_2 + x_2 = b_2,
...
a_n + x_n = b_n,

and thus x = b - a.

We define an operation of multiplication of a vector a ∈ R^n by a real scalar α ∈ R as

αa = [αa_1, αa_2, ..., αa_n]^T.

This operation has the following properties:

1. The operation is distributive: for any real scalars α and β,

α(a + b) = αa + αb,
(α + β)a = αa + βa.

2. The operation is associative:

α(βa) = (αβ)a.

3. The scalar 1 satisfies 1a = a.

4. Any scalar α satisfies α0 = 0.

5. The scalar 0 satisfies 0a = 0.

6. The scalar -1 satisfies (-1)a = -a.

Note that αa = 0 if and only if α = 0 or a = 0. To see this, observe that αa = 0 is equivalent to αa_1 = αa_2 = ... = αa_n = 0. If α = 0 or a = 0, then αa = 0. If a ≠ 0, then at least one of its components a_i ≠ 0. For this component, αa_i = 0, and hence we must have α = 0. A similar argument applies to the case α ≠ 0.
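As an illustration (ours, in Python/NumPy rather than the MATLAB used in the book's exercises), the vector-space properties above can be checked numerically on concrete vectors:

```python
import numpy as np

a = np.array([1.0, -2.0, 3.0])
b = np.array([4.0, 0.5, -1.0])
alpha, beta = 2.0, -3.0

# Commutativity and associativity of vector addition
assert np.allclose(a + b, b + a)
assert np.allclose((a + b) + a, a + (b + a))

# The zero vector is the additive identity
zero = np.zeros(3)
assert np.allclose(a + zero, a)

# b - a is the unique solution x of a + x = b
x = b - a
assert np.allclose(a + x, b)

# Distributivity and associativity of scalar multiplication
assert np.allclose(alpha * (a + b), alpha * a + alpha * b)
assert np.allclose((alpha + beta) * a, alpha * a + beta * a)
assert np.allclose(alpha * (beta * a), (alpha * beta) * a)
```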


A set of vectors {a_1, ..., a_k} is said to be linearly independent if the equality

α_1 a_1 + α_2 a_2 + ... + α_k a_k = 0

implies that all coefficients α_i, i = 1, ..., k, are equal to zero. A set of vectors {a_1, ..., a_k} is linearly dependent if it is not linearly independent.

Note that the set composed of the single vector 0 is linearly dependent, for if α ≠ 0, then α0 = 0. In fact, any set of vectors containing the vector 0 is linearly dependent.

A set composed of a single nonzero vector a ≠ 0 is linearly independent, since αa = 0 implies that α = 0.

A vector a is said to be a linear combination of vectors a_1, a_2, ..., a_k if there are scalars α_1, ..., α_k such that

a = α_1 a_1 + α_2 a_2 + ... + α_k a_k.

Proposition 2.1 A set of vectors {a_1, a_2, ..., a_k} is linearly dependent if and only if one of the vectors from the set is a linear combination of the remaining vectors. □

Proof. ⇒: If {a_1, a_2, ..., a_k} is linearly dependent, then

α_1 a_1 + α_2 a_2 + ... + α_k a_k = 0,

where at least one of the scalars α_i ≠ 0, whence

a_i = -(α_1/α_i) a_1 - ... - (α_{i-1}/α_i) a_{i-1} - (α_{i+1}/α_i) a_{i+1} - ... - (α_k/α_i) a_k.

⇐: Suppose that

a_1 = α_2 a_2 + α_3 a_3 + ... + α_k a_k;

then

(-1)a_1 + α_2 a_2 + ... + α_k a_k = 0.

Because the first scalar is nonzero, the set of vectors {a_1, a_2, ..., a_k} is linearly dependent. The same argument holds if a_i, i = 2, ..., k, is a linear combination of the remaining vectors. ■
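Linear independence of {a_1, ..., a_k} can also be tested numerically by checking whether the matrix whose columns are a_1, ..., a_k has rank k. The following NumPy sketch (an illustration, not the book's method) tests one independent set and one dependent set:

```python
import numpy as np

def is_linearly_independent(vectors):
    # Stack the vectors as columns of a matrix and compare its rank to k.
    A = np.column_stack(vectors)
    return np.linalg.matrix_rank(A) == len(vectors)

a1 = np.array([1.0, 0.0, 2.0])
a2 = np.array([0.0, 1.0, 1.0])
a3 = 2.0 * a1 - 3.0 * a2          # a linear combination of a1 and a2

print(is_linearly_independent([a1, a2]))      # True
print(is_linearly_independent([a1, a2, a3]))  # False: a3 depends on a1, a2
```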

A subset V of R^n is called a subspace of R^n if V is closed under the operations of vector addition and scalar multiplication. That is, if a and b are vectors in V, then the vectors a + b and αa are also in V for every scalar α.

Every subspace contains the zero vector 0, for if a is an element of the subspace, so is (-1)a = -a. Hence, a - a = 0 also belongs to the subspace.

Let a_1, a_2, ..., a_k be arbitrary vectors in R^n. The set of all their linear combinations is called the span of a_1, a_2, ..., a_k and is denoted

span[a_1, a_2, ..., a_k] = { α_1 a_1 + α_2 a_2 + ... + α_k a_k : α_1, ..., α_k ∈ R }.


Given a vector a, the subspace span[a] is composed of the vectors αa, where α is an arbitrary real number (α ∈ R). Also observe that if a is a linear combination of a_1, a_2, ..., a_k, then

span[a_1, a_2, ..., a_k, a] = span[a_1, a_2, ..., a_k].

The span of any set of vectors is a subspace. Given a subspace V, any set of linearly independent vectors {a_1, a_2, ..., a_k} ⊂ V such that V = span[a_1, a_2, ..., a_k] is referred to as a basis of the subspace V. All bases of a subspace V contain the same number of vectors. This number is called the dimension of V, denoted dim V.

Proposition 2.2 If {a_1, a_2, ..., a_k} is a basis of V, then any vector a of V can be represented uniquely as

a = α_1 a_1 + α_2 a_2 + ... + α_k a_k,

where α_i ∈ R, i = 1, 2, ..., k. □

Proof. To prove the uniqueness of the representation of a in terms of the basis vectors, assume that

a = α_1 a_1 + α_2 a_2 + ... + α_k a_k

and

a = β_1 a_1 + β_2 a_2 + ... + β_k a_k.

We now show that α_i = β_i, i = 1, ..., k. We have

α_1 a_1 + α_2 a_2 + ... + α_k a_k = β_1 a_1 + β_2 a_2 + ... + β_k a_k

or

(α_1 - β_1)a_1 + (α_2 - β_2)a_2 + ... + (α_k - β_k)a_k = 0.

Because the set {a_i : i = 1, 2, ..., k} is linearly independent, α_1 - β_1 = α_2 - β_2 = ... = α_k - β_k = 0, which implies that α_i = β_i, i = 1, ..., k. ■

Suppose that we are given a basis {a_1, a_2, ..., a_k} of V and a vector a ∈ V such that

a = α_1 a_1 + α_2 a_2 + ... + α_k a_k.

The coefficients α_i, i = 1, ..., k, are called the coordinates of a with respect to the basis {a_1, a_2, ..., a_k}.
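If the basis vectors are arranged as the columns of a matrix B, the coordinates α_1, ..., α_k of a are obtained by solving the linear system Bα = a. A brief NumPy sketch (ours, for illustration):

```python
import numpy as np

# A basis of R^2 (the columns of B) and a vector a in R^2
B = np.array([[1.0, 1.0],
              [0.0, 2.0]])
a = np.array([3.0, 4.0])

# Coordinates alpha of a with respect to the basis: solve B @ alpha = a
alpha = np.linalg.solve(B, a)
print(alpha)                    # [1. 2.]

# Verify the (unique) representation a = alpha_1 b_1 + alpha_2 b_2
assert np.allclose(B @ alpha, a)
```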

The natural basis for R^n is the set of vectors

$$e_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad e_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad \dots, \quad e_n = \begin{bmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}.$$


The reason for calling these vectors the natural basis is that

$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = x_1 e_1 + x_2 e_2 + \cdots + x_n e_n.$$

We can similarly define complex vector spaces. For this, let C denote the set of complex numbers and C^n the set of column n-vectors with complex components. As the reader can easily verify, the set C^n has properties similar to those of R^n, where scalars can take complex values.

A matrix is a rectangular array of numbers, commonly denoted by uppercase bold letters (e.g., A). A matrix with m rows and n columns is called an m × n matrix, and we write

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}.$$

The real number a_{ij} located in the ith row and jth column is called the (i, j)th entry. We can think of A in terms of its n columns, each of which is a column vector in R^m. Alternatively, we can think of A in terms of its m rows, each of which is a row n-vector.

Consider the m × n matrix A above. The transpose of matrix A, denoted A^T, is the n × m matrix

$$A^T = \begin{bmatrix} a_{11} & a_{21} & \cdots & a_{m1} \\ a_{12} & a_{22} & \cdots & a_{m2} \\ \vdots & \vdots & & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{mn} \end{bmatrix};$$

that is, the columns of A^T are the rows of A, and vice versa.

Let the symbol R^{m×n} denote the set of m × n matrices whose entries are real numbers. We treat column vectors in R^n as elements of R^{n×1}. Similarly, we treat row n-vectors as elements of R^{1×n}. Accordingly, vector transposition is simply a special case of matrix transposition, and we will no longer distinguish between the two. Note that there is a slight inconsistency in the notation of row vectors when identified as 1 × n matrices: we separate the components of a row vector with commas, whereas in matrix notation we do not generally use commas. However, the use of commas helps to clarify the separation of the elements in a row. We use such commas even in separating matrices arranged in a horizontal row.
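The column/row view of a matrix and the transpose operation can likewise be illustrated in NumPy (again a sketch of ours, not the text's notation):

```python
import numpy as np

# A 2 x 3 matrix: two rows, three columns
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

print(A.shape)        # (2, 3): m = 2 rows, n = 3 columns
print(A[:, 0])        # first column, a column vector in R^m = R^2
print(A[1, :])        # second row, a row 3-vector

# The transpose A^T is the 3 x 2 matrix whose columns are the rows of A
print(A.T.shape)      # (3, 2)
assert np.array_equal(A.T[:, 1], A[1, :])
```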