
Source: web.csulb.edu/~okoroste/instr_materials/lecturesMATH247.pdf

MATH 247 Introductory Linear Algebra

Textbook “Linear Algebra and Its Applications” by David Lay, Pearson Education, Inc., 2006 (3rd edition update).

1.1. Systems of Linear Equations.

Definition. A linear equation in the variables x1, . . . , xn is an equation of the form

a1x1 + a2x2 + · · ·+ anxn = b

where a1, . . . , an, and b are real or complex numbers.

Example. x1 − 2x2 = 3, ix2 − √2 x1 = π are linear equations in x1 and x2. Equations x1^2 = 5 and x1 ln(x2) = −7 are not linear.

Definition. A system of linear equations is a collection of linear equations in the same variables x1, . . . , xn.

Example. The system

    x1 − x2 = 1
    x1 + x2 = 5

is a system of two linear equations in two unknowns x1 and x2. The system

    x1 + x3 = 8
    x3 − x2 = 4

is a system of two linear equations in three unknowns x1, x2, and x3.

Definition. A solution of a system of linear equations is a set of numbers s1, . . . , sn that make the equations true statements if x1 = s1, . . . , xn = sn.

Example. The system

    x1 − x2 = 1
    x1 + x2 = 5

has solution x1 = 3 and x2 = 2.

Definition. A system of linear equations has either
(i) no solution (inconsistent system),
(ii) exactly one solution (consistent system), or
(iii) infinitely many solutions (consistent system).

Example. A graph of a linear equation is a straight line. Therefore, finding a solution of a system of two linear equations is equivalent to finding an intersection of two lines. The lines can either (i) be parallel (no solution), (ii) intersect at a point (exactly one solution), or (iii) coincide (infinitely many solutions). For example, the system

    x1 + x2 = 3
    2x1 + 2x2 = 4

has no solution, the system

    x1 + x2 = 4
    x1 − x2 = 2

has exactly one solution, and the system

    x1 + x2 = 4
    2x1 + 2x2 = 8

has infinitely many solutions.

Definition. Two systems of equations are called equivalent if they have the same solution set.

Example. The systems

    x1 + 3x2 = 7        and        x1 + 3x2 = 7
    2x1 − x2 = 0                   3x1 + 2x2 = 7

are equivalent since they have the same solution x1 = 1 and x2 = 2.

Definition. Given the system

    a11 x1 + a12 x2 + · · · + a1n xn = b1
    a21 x1 + a22 x2 + · · · + a2n xn = b2
    . . .
    an1 x1 + an2 x2 + · · · + ann xn = bn,

the matrix

    [ a11  a12  . . .  a1n ]
    [ a21  a22  . . .  a2n ]
    [        . . .         ]
    [ an1  an2  . . .  ann ]

is called the coefficient matrix or matrix of coefficients. The matrix

    [ a11  a12  . . .  a1n  b1 ]
    [ a21  a22  . . .  a2n  b2 ]
    [          . . .           ]
    [ an1  an2  . . .  ann  bn ]

is called the augmented matrix.

Example. Consider the system

    −x1 + x3 = −3
    2x1 + x2 − x3 = 4
    x1 + 2x2 = 1.

Its augmented matrix is

    [ −1  0   1  −3 ]
    [  2  1  −1   4 ]
    [  1  2   0   1 ].


Definition. A matrix of size m× n has m rows and n columns.

Example. In the above example, we have a 3 × 4 matrix.

1.1. Solving a Linear System.

To solve a system of linear equations we replace it with an equivalent one that is easier to solve. Three basic operations are used to simplify a linear system:
(i) replace an equation by the sum of itself and a multiple of another equation,
(ii) interchange two equations, and
(iii) multiply an equation by a nonzero constant.

Example.

    x1 − x2 = 1
    x1 + x2 = 5

e1 → e1 + e2:

    2x1 = 6
    x1 + x2 = 5

e1 → e1/2:

    x1 = 3
    x1 + x2 = 5

e2 → e2 − e1:

    x1 = 3
    x2 = 2.

This solution can be written in terms of augmented matrices as follows:

    [ 1  −1  1 ]
    [ 1   1  5 ]

r1 → r1 + r2:

    [ 2  0  6 ]
    [ 1  1  5 ]

r1 → r1/2:

    [ 1  0  3 ]
    [ 1  1  5 ]

r2 → r2 − r1:

    [ 1  0  3 ]
    [ 0  1  2 ].

To solve a system of three or more equations, it is advisable to work with augmented matrices. Three elementary row operations lead to a row equivalent augmented matrix:
(i) replace one row by the sum of itself and a multiple of another row,
(ii) interchange two rows, and
(iii) multiply a row by a nonzero constant.
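As a check, the example above can be replayed with a short Python sketch (the helper names are mine, not from the notes); Fraction keeps the arithmetic exact:

```python
from fractions import Fraction

def add_multiple(M, i, j, c):
    """Replace row i by (row i) + c * (row j)."""
    M[i] = [a + c * b for a, b in zip(M[i], M[j])]

def scale(M, i, c):
    """Multiply row i by a nonzero constant c."""
    M[i] = [c * a for a in M[i]]

def interchange(M, i, j):
    """Interchange rows i and j."""
    M[i], M[j] = M[j], M[i]

# Augmented matrix of the system x1 - x2 = 1, x1 + x2 = 5.
M = [[Fraction(1), Fraction(-1), Fraction(1)],
     [Fraction(1), Fraction(1), Fraction(5)]]

add_multiple(M, 0, 1, Fraction(1))    # r1 -> r1 + r2
scale(M, 0, Fraction(1, 2))           # r1 -> r1 / 2
add_multiple(M, 1, 0, Fraction(-1))   # r2 -> r2 - r1
# M is now [[1, 0, 3], [0, 1, 2]], i.e. x1 = 3, x2 = 2.
```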

1.2. Row Reduction and Echelon Forms.

Example. Consider two matrices

    [ 2  −2  0  1 ]
    [ 0   1  2  3 ]
    [ 0   0  0  5 ]
    [ 0   0  0  0 ]

and

    [ 1  0  0  −1 ]
    [ 0  1  0   3 ]
    [ 0  0  1   2 ].

Definition. In a matrix, a leading entry of a nonzero row is the leftmost nonzero entry of the row.


In our example, the leading entries are 2, 1, and 5 in the first matrix, and 1, 1, and 1 in the second matrix.

Definition. A matrix is in row echelon form if:
(i) all nonzero rows are above any rows of all zeros,
(ii) each leading entry of a row is to the right of the leading entry of the row above it, and
(iii) all entries in a column below a leading entry are zeros.

In our example, both matrices are in row echelon form.

Definition. A matrix in row echelon form is in reduced row echelon form if, in addition:
(i) the leading entry in each nonzero row is 1, and
(ii) each leading 1 is the only nonzero entry in its column.

In our example, the second matrix is in reduced row echelon form.

Definition. A pivot position in a matrix is a location in this matrix that corresponds to a leading 1 in its reduced row echelon form. A pivot column is a column that contains a pivot position.

Algorithm for Solving a Linear System.

To solve a system of equations, we reduce the augmented matrix to its row echelon form and then to its reduced row echelon form. It is done in several steps.

Step 1. Take the leftmost nonzero column. This will be your pivot column. Make sure the top entry (pivot) is not zero. Interchange rows if necessary.

Example.

    x1 + x2 + 2x3 = 0
    2x2 + x3 = 4
    x1 + 2x3 = −3

    [ 1  1  2   0 ]
    [ 0  2  1   4 ]
    [ 1  0  2  −3 ]

The first column is a pivot column.

Step 2. Use the elementary row operations to create zeros in all positions below the pivot.

In our example,

r3 → r3 − r1:

    [ 1   1  2   0 ]
    [ 0   2  1   4 ]
    [ 0  −1  0  −3 ].

Step 3. Select a pivot column in the matrix with the first row ignored. Repeat the previous steps.


In our example,

r3 → 2r3 + r2:

    [ 1  1  2   0 ]
    [ 0  2  1   4 ]
    [ 0  0  1  −2 ].

This matrix is in row echelon form.

Step 4. Starting with the rightmost pivot, create zeros above each pivot. Make pivots equal to 1 by rescaling rows if necessary.

In our example,

r2 → r2 − r3:

    [ 1  1  2   0 ]
    [ 0  2  0   6 ]
    [ 0  0  1  −2 ]

r1 → r1 − 2r3:

    [ 1  1  0   4 ]
    [ 0  2  0   6 ]
    [ 0  0  1  −2 ]

r2 → r2/2:

    [ 1  1  0   4 ]
    [ 0  1  0   3 ]
    [ 0  0  1  −2 ]

r1 → r1 − r2:

    [ 1  0  0   1 ]
    [ 0  1  0   3 ]
    [ 0  0  1  −2 ]

Thus x1 = 1, x2 = 3, x3 = −2.
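Steps 1–4 can be collected into one routine. A minimal Python sketch (my code, not part of the notes) that reduces the augmented matrix of this example to reduced row echelon form with exact fractions:

```python
from fractions import Fraction

def rref(M):
    """Reduce M (rows of Fractions) to reduced row echelon form."""
    M = [row[:] for row in M]
    rows, cols = len(M), len(M[0])
    pivot_row = 0
    for col in range(cols):
        pr = next((r for r in range(pivot_row, rows) if M[r][col] != 0), None)
        if pr is None:
            continue                                    # no pivot in this column
        M[pivot_row], M[pr] = M[pr], M[pivot_row]       # interchange rows if necessary
        piv = M[pivot_row][col]
        M[pivot_row] = [a / piv for a in M[pivot_row]]  # rescale pivot to 1
        for r in range(rows):                           # clear the rest of the column
            if r != pivot_row and M[r][col] != 0:
                c = M[r][col]
                M[r] = [a - c * b for a, b in zip(M[r], M[pivot_row])]
        pivot_row += 1
    return M

F = Fraction
aug = [[F(1), F(1), F(2), F(0)],
       [F(0), F(2), F(1), F(4)],
       [F(1), F(0), F(2), F(-3)]]
R = rref(aug)
solution = [row[-1] for row in R]   # last column of [I | x] is the solution
```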

1.6. Applications of Linear Systems.

Example. (Applications in Economics: Example 1, page 58). Consider the coal, electric, and steel sectors of an economy. Suppose coal sells 60% of its output to electric and 40% to steel; electric sells 40% to coal, 10% to electric, and 50% to steel; steel sells 60% to coal, 20% to electric, and 20% to steel. This can be summarized by a table:

    Coal  Electric  Steel   Purchased by
     0      .4       .6     coal
    .6      .1       .2     electric
    .4      .5       .2     steel

Let pc, pe, and ps denote the prices of the total outputs of coal, electric, and steel, respectively. Find prices that balance each sector's income and expenditures.

Solution: The prices must satisfy the system

    pc = .4pe + .6ps
    pe = .6pc + .1pe + .2ps
    ps = .4pc + .5pe + .2ps.

The reduced row echelon form of the augmented matrix of this system is

    [ 1  0  −.94  0 ]
    [ 0  1  −.85  0 ]
    [ 0  0    0   0 ].


Thus, the solution is pc = .94ps, pe = .85ps and ps is free.

Example. (Applications in Chemistry: Example on page 59). When propane burns, the propane C3H8 combines with oxygen O2 and forms carbon dioxide CO2 and water H2O. Write a balanced chemical equation for this reaction.

Solution: The number of atoms is preserved in a chemical reaction. Hence, we have to find integers x1, x2, x3, and x4 such that x1 C3H8 + x2 O2 → x3 CO2 + x4 H2O. Counting carbon, hydrogen, and oxygen atoms, we have to solve the system

    3x1 = x3
    8x1 = 2x4
    2x2 = 2x3 + x4.

The solution is

    x1 = .25x4
    x2 = 1.25x4
    x3 = .75x4.

Since all x's must be integers, take x4 = 4; then x1 = 1, x2 = 5, and x3 = 3.
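A quick Python check that x1 = 1, x2 = 5, x3 = 3, x4 = 4 balances the reaction (the atom-count tuples (C, H, O) are read off the chemical formulas):

```python
# Atoms (C, H, O) per molecule of each species.
C3H8, O2, CO2, H2O = (3, 8, 0), (0, 0, 2), (1, 0, 2), (0, 2, 1)
x1, x2, x3, x4 = 1, 5, 3, 4

# Left and right side of x1*C3H8 + x2*O2 -> x3*CO2 + x4*H2O, atom by atom.
left = tuple(x1 * a + x2 * b for a, b in zip(C3H8, O2))
right = tuple(x3 * a + x4 * b for a, b in zip(CO2, H2O))
balanced = (left == right)   # True: C3H8 + 5 O2 -> 3 CO2 + 4 H2O balances
```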

Example. (Network Flow: Example 2 on page 61). In a network, the in-flow and out-flow of every node are equal. Consider intersections A, B, C, and D. Picture. Determine the flows at these intersections.

Solution: The unknown flows satisfy the linear system

    x1 + x2 = 800
    x2 + x4 = x3 + 300
    x4 + x5 = 500
    x1 + x5 = 600.

The solution is

    x1 = 600 − x5
    x2 = 200 + x5
    x3 = 400
    x4 = 500 − x5.

The flow x5 is free, though it must satisfy x5 ≤ 500 since x4 ≥ 0.

1.3. Vector Equations.


Definition. A column vector, or simply a vector, is a list of numbers arranged in a column.

Example.

    [  1 ]
    [ −3 ]
    [  7 ]

Notation. Vectors are usually denoted by u, v, w, x, y.

Notation. The set of real numbers is denoted by R.

Notation. The set of all vectors with n entries is denoted by Rn. We write u ∈ Rn.

Definition. Two vectors are equal iff their corresponding entries are equal. That is, equality of vectors is defined entry-wise.

Example. Let

    u = [  1 ],   v = [  1  ],   w = [ e^(πi) ]
        [ −1 ]        [ i^2 ]        [  −1   ]

Since i^2 = −1 and e^(πi) = −1, entry-wise u = v = w.

Definition. Vector addition and scalar multiplication are defined entry-wise.

Example. In our example,

    u + w = [ 1 + (−1)  ] = [  0 ],    and    2u = [ 2(1)  ] = [  2 ].
            [ −1 + (−1) ]   [ −2 ]                 [ 2(−1) ]   [ −2 ]

Geometric Presentation of a Vector.

A vector

    v = [ v1 ]
        [ v2 ]
        [ ...]
        [ vn ]

is plotted in an n-dimensional coordinate system as a pointed line from the origin 0 to the point (v1, v2, . . . , vn). Picture in R2. Example:

    u = [ 2 ],   v = [ −2 ]
        [ 3 ]        [  1 ]

How to picture the sum of two vectors and a vector multiplied by a scalar?


Parallelogram Rule for Addition. The vector u + v points from the origin and lies on the diagonal of the parallelogram formed by vectors u and v. Picture. Example. Where is u − v? v − u?

Scalar Multiplication. Vector cv lies on the same line as v, its length is |c| times the length of v, and it points in the direction of v if c > 0 and in the opposite direction if c < 0. Picture. Example.

Definition. For any vectors v1, . . . , vp in Rn and any c1, . . . , cp ∈ R, the vector c1v1 + · · · + cpvp is called a linear combination of v1, . . . , vp with weights c1, . . . , cp.

Give an example.

Example. Let

    u = [  1 ],   v = [ 2 ],   w = [ 3 ].
        [  0 ]        [ 1 ]        [ 0 ]
        [ −1 ]        [ 0 ]        [ 1 ]

Is the vector

    y = [  0 ]
        [  2 ]
        [ −1 ]

a linear combination of u, v, and w?

Solution: The goal is to find, if possible, weights cu, cv, and cw such that y = cu u + cv v + cw w, that is,

    [  0 ]      [  1 ]      [ 2 ]      [ 3 ]   [ cu + 2cv + 3cw ]
    [  2 ] = cu [  0 ] + cv [ 1 ] + cw [ 0 ] = [       cv       ].
    [ −1 ]      [ −1 ]      [ 0 ]      [ 1 ]   [    −cu + cw    ]

The answer is y = −(1/4)u + 2v − (5/4)w.
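The weights come from solving the 3 × 3 system above. A small Gauss–Jordan sketch in Python (solve3 is my helper name, not from the notes):

```python
from fractions import Fraction

def solve3(A, b):
    """Solve a 3x3 system A x = b by Gauss-Jordan elimination, exactly."""
    M = [[Fraction(A[i][j]) for j in range(3)] + [Fraction(b[i])] for i in range(3)]
    for col in range(3):
        pr = next(r for r in range(col, 3) if M[r][col] != 0)   # find a pivot
        M[col], M[pr] = M[pr], M[col]
        piv = M[col][col]
        M[col] = [a / piv for a in M[col]]
        for r in range(3):
            if r != col:
                c = M[r][col]
                M[r] = [a - c * p for a, p in zip(M[r], M[col])]
    return [M[i][3] for i in range(3)]

u, v, w, y = [1, 0, -1], [2, 1, 0], [3, 0, 1], [0, 2, -1]
A = [[u[i], v[i], w[i]] for i in range(3)]   # columns are u, v, w
weights = solve3(A, y)
# weights == [-1/4, 2, -5/4], so y = -(1/4)u + 2v - (5/4)w, as claimed.
```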

Definition. The set of all linear combinations of vectors v1, v2, . . . , vp ∈ Rn is called the subset of Rn spanned (or generated) by v1, v2, . . . , vp and is denoted by Span{v1, v2, . . . , vp}.

Example. Span{u} where u ∈ R3. Picture.

Example. Span{u,v} where the vectors are in R3. Picture.

Definition. A vector equation is a linear equation of the form x1v1 + x2v2 + · · · + xpvp = b. It is equivalent to a linear system with the augmented matrix [v1 v2 . . . vp b].

1.4. The Matrix Equation Ax = b.

Definition. A system of linear equations with augmented matrix [A|b] can be written as a matrix equation Ax = b.

1.5. Solution Sets of Linear Systems.


Homogeneous Linear Systems.

Definition. A system Ax = 0 is called homogeneous. It has a trivial solution x = 0. A non-trivial solution exists iff the system has at least one free variable.

Example (Example 1 on page 50). Describe the solution set of Ax = 0 where

    A = [  3   5  −4 ]
        [ −3  −2   4 ].
        [  6   1  −8 ]

Solution: A row echelon form of [A|0] is

    [ 3  5  −4  0 ]
    [ 0  3   0  0 ],
    [ 0  0   0  0 ]

meaning x1 = (4/3)x3, x2 = 0, and x3 is free. Thus, any solution

    x = x3 [ 4/3 ]
           [  0  ] = x3 v,
           [  1  ]

that is, a non-trivial solution of this homogeneous system is in Span{v}.

Example (Example 2 on page 51). Describe the solution set of 10x1 − 3x2 − 2x3 = 0.

Solution: From this equation, x1 = .3x2 + .2x3 where x2 and x3 are free. Any solution

    x = [ x1 ]   [ .3x2 + .2x3 ]      [ .3 ]      [ .2 ]
        [ x2 ] = [      x2     ] = x2 [  1 ] + x3 [  0 ] = x2 u + x3 v,
        [ x3 ]   [      x3     ]      [  0 ]      [  1 ]

that is, a non-trivial solution of this equation is in Span{u, v}.

Generally speaking, for a homogeneous system with k free variables, a non-trivial solution is in the span of k vectors.


Nonhomogeneous Linear Systems.

Definition. A system Ax = b with nontrivial right side (b ≠ 0) is called a nonhomogeneous linear system.

Proposition. The solution set of a nonhomogeneous system Ax = b is the set of all vectors of the form w = p + vh, where p is some particular solution of the nonhomogeneous system and vh is any solution of the homogeneous system Ax = 0. In other words, in order to solve a nonhomogeneous system, one has to find all solutions of the homogeneous system and add a particular solution of the nonhomogeneous system.

Example (Example 3 on page 52). Let

    A = [  3   5  −4 ]          b = [  7 ]
        [ −3  −2   4 ]   and        [ −1 ].
        [  6   1  −8 ]              [ −4 ]

The reduced row echelon form of [A|b] is

    [ 1  0  −4/3  −1 ]
    [ 0  1    0    2 ],
    [ 0  0    0    0 ]

meaning

    x1 = −1 + (4/3)x3
    x2 = 2
    x3 free.

Thus,

    x = [ x1 ]   [ −1 + (4/3)x3 ]   [ −1 ]   [ (4/3)x3 ]   [ −1 ]      [ 4/3 ]
        [ x2 ] = [       2      ] = [  2 ] + [    0    ] = [  2 ] + x3 [  0  ] = p + x3 v.
        [ x3 ]   [      x3      ]   [  0 ]   [    x3   ]   [  0 ]      [  1  ]

Picture of Span{v}, translated by a vector.

1.7. Linear Independence.

Definition. Vectors v1, . . . , vp ∈ Rn are linearly independent if the vector equation x1v1 + · · · + xpvp = 0 has only the trivial solution (there are no free variables). Otherwise, the vectors are linearly dependent, and a vector equation with a non-trivial solution is called a linear dependence relation.

Example (Example 1 on page 65). Decide whether the vectors

    v1 = [ 1 ],   v2 = [ 4 ],   v3 = [ 2 ]
         [ 2 ]         [ 5 ]         [ 1 ]
         [ 3 ]         [ 6 ]         [ 0 ]

are linearly independent.

Solution: A row echelon form of the augmented matrix is

    [ 1   4   2  0 ]
    [ 0  −3  −3  0 ].
    [ 0   0   0  0 ]

Thus, a non-trivial solution is possible. The vectors are not linearly independent.

Find a linear dependence relation.

Solution: The reduced row echelon form is

    [ 1  0  −2  0 ]
    [ 0  1   1  0 ].
    [ 0  0   0  0 ]

Thus, for example, 10v1 − 5v2 + 5v3 = 0.
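A one-line Python check of this dependence relation:

```python
# Check that 10*v1 - 5*v2 + 5*v3 is the zero vector.
v1, v2, v3 = (1, 2, 3), (4, 5, 6), (2, 1, 0)
combo = tuple(10 * a - 5 * b + 5 * c for a, b, c in zip(v1, v2, v3))
# combo == (0, 0, 0), confirming the vectors are linearly dependent.
```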

Proposition. The columns of a matrix A are linearly independent iff the equation Ax = 0 has only the trivial solution.

Example (Example 2 on pages 66 – 67). Determine if the columns of the matrix

    A = [ 0  1   4 ]
        [ 1  2  −1 ]
        [ 5  8   0 ]

are independent.

Solution: A row echelon form of the augmented matrix is

    [ 1  2  −1  0 ]
    [ 0  1   4  0 ].
    [ 0  0  13  0 ]

There are no free variables; therefore only the trivial solution exists and the columns are independent.

Proposition. (i) One vector is linearly independent iff it is a non-zero vector. (ii) Two vectors are linearly dependent iff one is a multiple of the other (that is, if they lie on the same line through the origin).
Proof: (i) The equation x1v1 = 0 has a non-trivial solution iff v1 = 0. (ii) The equation x1v1 + x2v2 = 0 has a non-trivial solution iff v1 = −(x2/x1)v2 (assuming x1 ≠ 0). □

Example. The vector

    [ 1 ]
    [ 3 ]

is linearly independent. The vectors

    [  1 ]  and  [ −3 ]
    [ −3 ]       [  9 ]

are linearly dependent (the second is −3 times the first), while

    [  1 ]  and  [ 2 ]
    [ −3 ]       [ 6 ]

are linearly independent (they are not multiples of each other).


Theorem 7 on page 68. Vectors v1, . . . , vp, p ≥ 2, are linearly dependent iff at least one of the vectors is a linear combination of the others (that is, at least one vector is in the set spanned by the others).
Proof: The p vectors are linearly dependent iff the equation x1v1 + · · · + xpvp = 0 has a non-trivial solution iff (say, with x1 ≠ 0) v1 = −(x2/x1)v2 − · · · − (xp/x1)vp. □

Example. (Example 4 on page 68). Consider vectors

    v1 = [ 3 ]   and   v2 = [ 1 ].
         [ 0 ]              [ 6 ]
         [ 1 ]              [ 0 ]

They are not multiples of each other; therefore, Span{v1, v2} is a plane through the origin. A vector in this plane is linearly dependent with v1 and v2. A vector not in this plane is linearly independent of them. Picture.

Theorem 8 on page 69. Vectors v1, . . . , vp ∈ Rn are linearly dependent if p > n. In other words, if a set contains more vectors than there are entries in each vector, then the set is linearly dependent.
Proof: Let A = [v1 . . . vp]. Then the equation Ax = 0 is a system of n equations in p unknowns. If p > n, there must be free variables, and, thus, a non-trivial solution. □

Example 5 on page 69. The three vectors

    [ 2 ],   [  4 ],   and   [ −2 ]
    [ 1 ]    [ −1 ]          [  2 ]

are linearly dependent, since 3 = p > n = 2. A dependence relation is

    − [ 2 ] + [  4 ] + [ −2 ] = [ 0 ].
      [ 1 ]   [ −1 ]   [  2 ]   [ 0 ]

Theorem 9 on page 69. If a set of vectors contains the zero vector, the set is linearly dependent.
Proof: Suppose v1 = 0. Then 1 · v1 + 0 · v2 + · · · + 0 · vp = 0 solves the equation non-trivially. □

Example 6(b) on page 69. The set of vectors

    [ 2 ],   [ 0 ],   and   [ 1 ]
    [ 3 ]    [ 0 ]          [ 1 ]
    [ 5 ]    [ 0 ]          [ 8 ]

is linearly dependent since it contains the zero vector.

1.8. Linear Transformations.

Definition. A transformation (or function or mapping) T from Rn to Rm is a rule that assigns to each vector x in Rn a vector T(x) in Rm. Picture.


Definition. The set Rn is called the domain of T, and Rm is called the codomain of T. The notation T : Rn → Rm means that T maps Rn into Rm. The vector T(x) is called the image of x. The set of all images is called the range of T. Picture.

A matrix equation Ax = b can be considered as a transformation.

Definition. Let x ∈ Rn. The transformation T(x) = Ax, where A is an m × n matrix, is called a matrix transformation and is denoted by x ↦ Ax.

Example 1 on page 74. Define a matrix transformation T : R2 → R3 by T(x) = Ax where

    A = [  1  −3 ]
        [  3   5 ].
        [ −1   7 ]

(i) Find T(u) where

    u = [  2 ].
        [ −1 ]

Solution:

    T(u) = Au = [  1  −3 ] [  2 ]   [  5 ]
                [  3   5 ] [ −1 ] = [  1 ].
                [ −1   7 ]          [ −9 ]

(ii) Find x whose image is

    [  3 ]
    [  2 ].
    [ −5 ]

Solution: the vector

    x = [ x1 ]
        [ x2 ]

solves

    [  1  −3 ] [ x1 ]   [  3 ]
    [  3   5 ] [ x2 ] = [  2 ].
    [ −1   7 ]          [ −5 ]

The solution is x1 = 1.5, x2 = −.5, and it is unique.

(iii) Determine if

    [ 3 ]
    [ 2 ]
    [ 5 ]

is in the range. Answer: no.
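Parts (i) and (ii) can be verified with a short Python sketch (mat_vec is my name for the matrix-vector product, not from the notes):

```python
def mat_vec(A, x):
    """Compute the matrix transformation x |-> Ax."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[1, -3],
     [3, 5],
     [-1, 7]]

Tu = mat_vec(A, [2, -1])        # (i): the image of u is [5, 1, -9]
Tx = mat_vec(A, [1.5, -0.5])    # (ii): x = (1.5, -0.5) maps to [3.0, 2.0, -5.0]
```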

Example 2 on page 76. The matrix transformation with

    A = [ 1  0  0 ]
        [ 0  1  0 ]
        [ 0  0  0 ]

is a projection onto the x1x2-plane since

    [ x1 ]     [ 1  0  0 ] [ x1 ]   [ x1 ]
    [ x2 ]  ↦  [ 0  1  0 ] [ x2 ] = [ x2 ].
    [ x3 ]     [ 0  0  0 ] [ x3 ]   [  0 ]

Define projections onto the x1x3- and x2x3-planes.

Definition. A transformation T is called linear if
(i) T(u + v) = T(u) + T(v) for any vectors u and v in the domain of T, and
(ii) T(cu) = cT(u) for any vector u and any scalar c.

Example. A matrix transformation is linear.

Exercise. Show the superposition principle T(c1v1 + · · · + cpvp) = c1T(v1) + · · · + cpT(vp).

Exercise. Define T : R2 → R2 by T(u) = ru for a fixed scalar r ≥ 0. If 0 ≤ r ≤ 1, T is called a contraction; if r > 1, T is called a dilation. Show that T is a linear transformation.

Exercise. Show that if T is a linear transformation, then T (0) = 0.

1.9. The Matrix of a Linear Transformation.

It turns out that any linear transformation is in fact a matrix transformation.

Proposition. Let T : Rn → Rm be a linear transformation. Then there exists a unique matrix A such that T(u) = Au for any vector u ∈ Rn. This is the m × n matrix

    A = [ T(e1) . . . T(en) ],

where

    e1 = [ 1 ]
         [ 0 ]
         [...]
         [ 0 ]

is the first standard basis vector of Rn (an n × 1 vector), etc.
Proof: Write u as u1e1 + · · · + unen. Thus,

    T(u) = u1T(e1) + · · · + unT(en) = [ T(e1) . . . T(en) ] [ u1 ]
                                                            [ ...] = Au. □
                                                            [ un ]

Example 1 on page 82. Let T : R2 → R3 be a linear transformation such that

    T(e1) = [  5 ]   and   T(e2) = [ −3 ].
            [ −7 ]                 [  8 ]
            [  2 ]                 [  0 ]

Find A, the standard matrix for T.

2.1. Matrix Operations.

Definition. An m × n matrix A has the form

    A = [ a11  . . .  a1j  . . .  a1n ]
        [             . . .          ]
        [ ai1  . . .  aij  . . .  ain ]
        [             . . .          ]
        [ am1  . . .  amj  . . .  amn ]

where aij is called the (i, j)-entry.


Definition. The diagonal entries of A are a11, a22, . . . , and they form the main diagonal of A.

Definition. A diagonal matrix is a square matrix whose entries off the main diagonal are zero.

Definition. The identity matrix In is a diagonal matrix with ones on the main diagonal.

Definition. Two matrices are equal if they have the same dimensions and their corresponding entries are equal.

Definition. The addition of matrices of equal dimensions and the scalar multiplication of a matrix are defined entrywise.

Definition. If A is an m × n matrix and B is an n × p matrix, then the product AB is an m × p matrix with (i, j) entry

    (AB)ij = ai1 b1j + ai2 b2j + · · · + ain bnj.

Example.

    [ 1   0  −4 ] [  1  3 ]   [ 5  −1 ]
    [ 2  −3   2 ] [  0  2 ] = [ 0   2 ].
    [ 6   1   0 ] [ −1  1 ]   [ 6  20 ]

Remark. (i) The product of matrices is not commutative; that is, in general, AB ≠ BA. Example.
(ii) The cancellation law does not hold for matrix multiplication; that is, AB = AC does not, in general, imply B = C. Give example.
(iii) If AB = 0, then, in general, it is not true that A = 0 or B = 0. Give example.
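The remark can be made concrete with small matrices (the code and the tiny example matrices are mine, not from the notes); the first product also checks the Example above:

```python
def mat_mul(A, B):
    """(i, j) entry of AB: dot product of row i of A with column j of B."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# The product from the Example above.
P = mat_mul([[1, 0, -4], [2, -3, 2], [6, 1, 0]],
            [[1, 3], [0, 2], [-1, 1]])           # [[5, -1], [0, 2], [6, 20]]

# (i) Non-commutativity: AB != BA.
A = [[1, 1], [0, 0]]
B = [[1, 0], [1, 0]]
AB, BA = mat_mul(A, B), mat_mul(B, A)            # [[2, 0], [0, 0]] vs [[1, 1], [1, 1]]

# (iii) AB = 0 although A != 0 and B != 0.
Z = mat_mul([[1, 0], [0, 0]], [[0, 0], [0, 1]])  # [[0, 0], [0, 0]]
```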

Definition. The kth power of a matrix A is defined as the product of k copies of A, that is, Ak = A · · · A (k times).

Definition. The transpose of an n × m matrix A is the m × n matrix AT with aji as its (i, j) entry. Example.

Theorem 3 on page 115. (i) (AT)T = A, (ii) (A + B)T = AT + BT, (iii) (rA)T = rAT for any scalar r, and (iv) (AB)T = BTAT.

Examples.

2.2. The Inverse of a Matrix.

Definition. The inverse of a square matrix A is the (unique) matrix A−1 such that AA−1 = A−1A = I.


Theorem 4 on page 119. The inverse of a 2 × 2 matrix

    A = [ a  b ]
        [ c  d ]

is

    A−1 = (1/(ad − bc)) [  d  −b ].
                        [ −c   a ]

If ad − bc = 0, the inverse does not exist, and A is called singular or not invertible.
Proof:

Theorem 5 on page 120. If A is invertible, then the matrix equation Ax = b has the unique solution x = A−1b.

Example. Solve

    3x1 + 4x2 = 3
    5x1 + 6x2 = 7.
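Using Theorems 4 and 5, this example can be solved as x = A−1b; a Python sketch with exact fractions (inv2 is my helper name, not from the notes):

```python
from fractions import Fraction

def inv2(a, b, c, d):
    """Inverse of [[a, b], [c, d]] by Theorem 4; assumes ad - bc != 0."""
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det],
            [-c / det, a / det]]

Ainv = inv2(3, 4, 5, 6)                 # det = 18 - 20 = -2
rhs = [3, 7]
x = [sum(Ainv[i][j] * rhs[j] for j in range(2)) for i in range(2)]
# x == [5, -3]: indeed 3*5 + 4*(-3) = 3 and 5*5 + 6*(-3) = 7.
```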

Theorem 6 on page 121. (i) (A−1)−1 = A, (ii) (AB)−1 = B−1A−1, and (iii) (AT)−1 = (A−1)T.
Proof:

We know how to find the inverse of a 2 × 2 matrix. For larger matrices there is no comparably simple formula (though see the cofactor formula in Section 3.3). However, one can find A−1 by applying the following algorithm, based on the fact that AA−1 = I.

Algorithm for Finding A−1: Take the augmented matrix [A I]. Applying elementary row operations, reduce the matrix to [I A−1]. If this is impossible, then A is not invertible.

Example 7 on page 124. Find the inverse of the matrix

    A = [ 0   1  2 ]
        [ 1   0  3 ].
        [ 4  −3  8 ]

Solution:

    [A I] = [ 0   1  2  1  0  0 ]        [ 1  0  0  −9/2   7  −3/2 ]
            [ 1   0  3  0  1  0 ]  =⇒   [ 0  1  0   −2    4   −1  ].
            [ 4  −3  8  0  0  1 ]        [ 0  0  1   3/2  −2   1/2 ]

It is useful to check the answer.
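The algorithm is easy to automate. A Python sketch (my code, not part of the notes) that row-reduces [A I] with exact fractions and reproduces the inverse above:

```python
from fractions import Fraction

def inverse(A):
    """Row-reduce [A I] to [I A^{-1}]; return None if A is not invertible."""
    n = len(A)
    M = [[Fraction(A[i][j]) for j in range(n)] +
         [Fraction(1 if i == j else 0) for j in range(n)] for i in range(n)]
    for col in range(n):
        pr = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pr is None:
            return None                     # no pivot: A is singular
        M[col], M[pr] = M[pr], M[col]       # interchange rows if necessary
        piv = M[col][col]
        M[col] = [a / piv for a in M[col]]  # rescale pivot to 1
        for r in range(n):                  # clear the rest of the column
            if r != col:
                c = M[r][col]
                M[r] = [a - c * p for a, p in zip(M[r], M[col])]
    return [row[n:] for row in M]

Ainv = inverse([[0, 1, 2],
                [1, 0, 3],
                [4, -3, 8]])
# Ainv == [[-9/2, 7, -3/2], [-2, 4, -1], [3/2, -2, 1/2]]
```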

3.1. Determinants.

Notation. Denote by Aij the matrix obtained from matrix A by removing the ith row and the jth column.


Example. Find A11 for

    A = [ 1   5   0 ]
        [ 2   4  −1 ].
        [ 0  −2   0 ]

Definition. The determinant of an n × n matrix A = [aij] is a number given recursively by

    det A = Σ (j = 1, . . . , n) (−1)^(1+j) a1j det A1j.

Definition. The quantity (−1)^(i+j) det Aij is called the (i, j)-cofactor of A.

Definition. The determinant is defined as a cofactor expansion across the first row.

Example. Find

    det [ 1   5   0 ]
        [ 2   4  −1 ].
        [ 0  −2   0 ]
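The recursive definition translates directly into code; a Python sketch (my det, not from the notes), expanding across the first row:

```python
def det(A):
    """Determinant by cofactor expansion across the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]  # delete row 1 and column j
        total += (-1) ** j * A[0][j] * det(minor)         # sign (-1)^(1+j) in 1-based indexing
    return total

D = det([[1, 5, 0],
         [2, 4, -1],
         [0, -2, 0]])    # D == -2
```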

Theorem 1 on page 188. The determinant of a matrix can be computed by a cofactor expansion across any row or any column.

Example. In the above example, it is simpler to expand the determinant along the last row or last column.

Exercise. Show that if one row or one column of a square matrix is zero, its determinant is zero (singular matrix).

Definition. A matrix is called triangular if all entries below or above the main diagonal are zeros. Example. Special case: diagonal matrix.

Theorem 2 on page 189. The determinant of a triangular matrix is the product of the entries on the main diagonal.

3.2. Properties of Determinants.

Theorem 3 on page 192. (i) If a multiple of one row of a matrix A is added to another row to produce a matrix B, then det B = det A.
(ii) If two rows of A are interchanged to produce B, then det B = −det A.
(iii) If one row of A is multiplied by c to produce B, then det B = c det A.

Example 1 on page 193.

    det [  1  −4   2 ]       det [  1  −4   2 ]       det [ 1  −4   2 ]
        [ −2   8  −9 ]   =       [  0   0  −5 ]   =       [ 0   0  −5 ]
        [ −1   7   0 ]           [ −1   7   0 ]           [ 0   3   2 ]

                         = −det [ 1  −4   2 ]
                                [ 0   3   2 ]   = −(1)(3)(−5) = 15.
                                [ 0   0  −5 ]

Exercise. Compute

    det [ 0  2  2 ]
        [ 1  0  3 ].
        [ 2  1  1 ]

Answer: 12.

Exercise. Show that if two rows of a square matrix are equal, its determinant is zero.

Exercise. Show that if one column of a square matrix is a linear combination of the others, then its determinant is zero. The converse is also true.

Theorem 4 on page 194. A square matrix is invertible iff its determinant is nonzero.

Remark. It follows from the above Exercise and Theorem 4 that a matrix is invertible iff its determinant is nonzero iff its columns are linearly independent.

Theorem 5 on page 196. detAT = detA.

Theorem 6 on page 196. detAB = (detA)(detB).

Exercise. Let det A = 5. Find det A−1.

Exercise. A matrix A is such that A = A−1. Find detA.

3.3. Cramer’s Rule.

Notation. Let Ai(b) denote the matrix obtained from an n × n matrix A by replacing its ith column by the vector b ∈ Rn.

Theorem 7 on page 201 (Cramer's Rule). For invertible A, the unique solution of the matrix equation Ax = b is

    xi = det Ai(b) / det A,   i = 1, . . . , n.

Proof: Let A = [a1 . . . an] and I = [e1 . . . en]. If Ax = b, then

    A Ii(x) = [Ae1 . . . Ax . . . Aen] = [a1 . . . b . . . an] = Ai(b).

Thus, det(A Ii(x)) = det A · det Ii(x) = det Ai(b), but det Ii(x) = xi. □

Example 1 on page 202. Use Cramer's rule to solve

    3x1 − 2x2 = 6
    −5x1 + 4x2 = 8.

Answer: x1 = 20, x2 = 27.
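Cramer's rule for this 2 × 2 system in Python (the helper names are mine, not from the notes):

```python
from fractions import Fraction

def det2(M):
    """2x2 determinant."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

A = [[3, -2],
     [-5, 4]]
b = [6, 8]

A1 = [[b[0], A[0][1]], [b[1], A[1][1]]]   # replace column 1 of A by b
A2 = [[A[0][0], b[0]], [A[1][0], b[1]]]   # replace column 2 of A by b
x = [Fraction(det2(A1), det2(A)), Fraction(det2(A2), det2(A))]
# x == [20, 27], matching the answer above.
```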

Exercise 6 on page 209. Use Cramer's rule to solve

    2x1 + x2 + x3 = 4
    −x1 + 2x3 = 2
    3x1 + x2 + 3x3 = −2.

A Formula for A−1.

Notation. Denote by Cij = (−1)^(i+j) det Aij the (i, j)-cofactor of A.

Theorem 8 on page 203. The inverse is

    A−1 = (1/det A) [ C11  C21  . . .  Cn1 ]
                    [ C12  C22  . . .  Cn2 ]
                    [         . . .        ].
                    [ C1n  C2n  . . .  Cnn ]

The transpose of the matrix of cofactors is called the adjugate or adjoint of A and is denoted by adj A. Thus, A−1 = (1/det A) adj A.

Example 3 on pages 203 – 204. Find the inverse of

    A = [ 2   1   3 ]
        [ 1  −1   1 ].
        [ 1   4  −2 ]

Answer:

    A−1 = (1/14) [ −2  14   4 ]
                 [  3  −7   1 ].
                 [  5  −7  −3 ]
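Theorem 8 in code for the 3 × 3 case (the helper names are mine, not from the notes), reproducing the answer above:

```python
from fractions import Fraction

def det3(A):
    """3x3 determinant, cofactor expansion across the first row."""
    (a, b, c), (d, e, f), (g, h, i) = A
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def adj_inverse(A):
    """A^{-1} = (1/det A) adj A; adj A is the transposed matrix of cofactors."""
    d = det3(A)
    def cof(i, j):
        # 2x2 minor with row i and column j deleted, with sign (-1)^(i+j).
        m = [[A[r][s] for s in range(3) if s != j] for r in range(3) if r != i]
        return (-1) ** (i + j) * (m[0][0] * m[1][1] - m[0][1] * m[1][0])
    # Entry (i, j) of the inverse is C_ji / det A (note the transpose).
    return [[Fraction(cof(j, i), d) for j in range(3)] for i in range(3)]

A = [[2, 1, 3],
     [1, -1, 1],
     [1, 4, -2]]
Ainv = adj_inverse(A)                              # det A = 14
scaled = [[14 * v for v in row] for row in Ainv]
# scaled == [[-2, 14, 4], [3, -7, 1], [5, -7, -3]], matching (1/14) adj A above.
```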

Exercise 12 on page 210. Find the inverse of

    [ 1   1  3 ]
    [ 2  −2  1 ].
    [ 0   1  0 ]

Determinants as Area or Volume.

Theorem 9 on page 205 (Geometric Interpretation of Determinants). The absolute value of the determinant of a 2 × 2 matrix is the area of the parallelogram determined by the columns of the matrix. The absolute value of the determinant of a 3 × 3 matrix is the volume of the parallelepiped determined by the columns of the matrix.


Example. Draw the picture for

    | det [ a  0 ] |        | det [ a  0  0 ] |
    |     [ 0  d ] |   and  |     [ 0  b  0 ] |.
                            |     [ 0  0  c ] |

Example 4 on page 206. Find the area of the parallelogram with vertices (−2, −2), (0, 3), (4, −1), and (6, 4).

Solution: First move the parallelogram to the origin, by subtracting the vertex (−2, −2) from each vertex, for example. Then the area of the parallelogram with vertices (0, 0), (2, 5), (6, 1), and (8, 6) is

    | det [ 2  6 ] | = |−28| = 28.
    |     [ 5  1 ] |
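Example 4 as a Python sketch (parallelogram_area is my name, not from the notes):

```python
def parallelogram_area(p0, p1, p2):
    """Area of the parallelogram with vertex p0 and adjacent vertices p1, p2:
    the absolute value of det [p1 - p0  p2 - p0]."""
    a, c = p1[0] - p0[0], p1[1] - p0[1]   # first column: p1 - p0
    b, d = p2[0] - p0[0], p2[1] - p0[1]   # second column: p2 - p0
    return abs(a * d - b * c)

area = parallelogram_area((-2, -2), (0, 3), (4, -1))   # 28
```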

4.1. Vector Spaces and Subspaces.

Definition. A vector space V is a collection of objects, called vectors, for which two operations are defined: addition and multiplication by a scalar. For any u, v, w ∈ V and any c, d ∈ R, the following ten axioms hold:
1. u + v ∈ V.
2. u + v = v + u (commutative law for addition).
3. (u + v) + w = u + (v + w) (associative law for addition).
4. There exists a zero vector 0 ∈ V such that u + 0 = u.
5. For every u, there exists a vector −u ∈ V, called the negative of u, such that u + (−u) = 0.
6. Vector cu ∈ V.
7. c(u + v) = cu + cv (distributive law).
8. (c + d)u = cu + du (distributive law).
9. c(du) = (cd)u.
10. 1 · u = u.

Example 1 on page 217. The space Rn is a vector space. Each element is represented by a column vector.

Example 2 on pages 217 – 218. The space of all arrows is a vector space. Here two vectors are equal if they have the same length and point in the same direction. Addition is defined by the parallelogram rule, and multiplication by a scalar as contracting or dilating a vector by |c| and reversing the direction if c < 0.

Example 3 on page 218. The space of all doubly infinite sequences of numbers (. . . , x−2, x−1, x0, x1, x2, . . . ) is a vector space.

Example 4 on pages 218 – 219. The space of all polynomials of degree at most n is a vector space.

Example 5 on page 219. The space of all real-valued functions is a vector space.

Example. The space of n× n matrices is a vector space.


Example. The set of natural numbers N = {1, 2, 3, . . . } is not a vector space: there are no negative vectors and no zero.

Example. The space of integers Z = {0, 1,−1, 2,−2, . . . } is not a vector space. Multiplication by a real scalar takes us outside of Z.

Proposition 1. The negative of u is unique.

Proof: Let a and b be two negatives of u, so that u + a = 0 and u + b = 0. Then
b = b + 0 = {by axiom 4} = b + (u + a) = {by axiom 3} = (b + u) + a = {by axiom 2} = (u + b) + a = 0 + a = {by axioms 2 and 4} = a.
So, a = b. □

Proposition 2. 0 · u = 0.

Proof: By axiom 4, it suffices to show that u + 0 · u = u. Indeed, u + 0 · u = {by axiom 10} = 1 · u + 0 · u = {by axiom 8} = (1 + 0)u = 1 · u = {by axiom 10 again} = u. □

Proposition 3. −u = (−1)u.

Proof: By axiom 5 and Proposition 1, it suffices to show that u + (−1)u = 0. Indeed, u + (−1)u = {by axiom 10} = 1 · u + (−1)u = {by axiom 8} = (1 − 1)u = 0 · u = {by Proposition 2} = 0. □

Proposition 4. c0 = 0.

Proof: By axiom 5, c0 = c(u + (−u)) = {by axiom 7} = cu + c(−u) = {by Proposition 3} = cu + c((−1)u) = {by axiom 9} = cu + (c(−1))u = cu + ((−1)c)u = {by axiom 9 again} = cu + (−1)(cu) = {by Proposition 3} = cu + (−(cu)) = {by axiom 5} = 0. □

Subspaces.

Definition. A subspace of a vector space V is a subset H with three properties: (i) H contains the zero vector, (ii) H is closed under vector addition, and (iii) H is closed under scalar multiplication. These properties guarantee that a subspace is itself a vector space.

Example 6 on page 220. The set consisting only of the zero vector is a subspace. It is denoted by {0}.


Example 7 on page 220. The set of all polynomials with real coefficients, P, is a subspace of the space of real-valued functions. The set of all polynomials of degree at most n is a subspace of P.

Example 8 on page 220. The set of all vectors of the form (x1, x2, 0) is a subspace of R3. It looks and acts like R2 but is not R2.

Example 10 on page 220. If v1, . . . , vp ∈ V, then Span{v1, . . . , vp} is a subspace of V.

4.2. Null Spaces, Column Spaces, and Linear Transformations.

Definition. The null space of an m × n matrix A is the set of all solutions of the homogeneous equation Ax = 0. In set notation,

NulA = {x : x ∈ Rn, Ax = 0}.

Geometrically, the null space of A is the set that is mapped into zero by the linear transformation x 7→ Ax. Picture.

Theorem 2 on page 227. The null space of an m × n matrix A is a subspace of Rn.
Proof: (i) 0 ∈ Rn is in the null space of A since A0 = 0.
(ii) If Au = 0 and Av = 0, then A(u + v) = Au + Av = 0.
(iii) If Au = 0, then A(cu) = c(Au) = c0 = 0. □

Example. Find the null space of A = [1 −3 −2; −5 9 1].
Solution: A vector x = (x1, x2, x3) is in NulA if it solves the system

x1 − 3x2 − 2x3 = 0
−5x1 + 9x2 + x3 = 0.

Therefore, NulA = {x ∈ R3 : x1 = −(5/2)x3, x2 = −(3/2)x3}. It is a subspace of R3.
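A membership check for this null space can be done numerically; a NumPy sketch (the value of the free variable t is an arbitrary choice):

```python
import numpy as np

A = np.array([[1, -3, -2],
              [-5, 9, 1]])

# Any vector of the form (-(5/2)t, -(3/2)t, t) should lie in Nul A.
t = 4.0
x = np.array([-(5/2)*t, -(3/2)*t, t])
print(A @ x)  # [0. 0.]
```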

An Explicit Description of NulA.

Example 3 on page 228. Find a spanning set for the null space of the matrix

A = [−3 6 −1 1 −7; 1 −2 2 3 −1; 2 −4 5 8 −4].

Solution: The general solution of the equation Ax = 0 is x1 = 2x2 + x4 − 3x5, x3 = −2x4 + 2x5, where x2, x4, and x5 are free variables. We can now decompose any vector in NulA into a linear combination of vectors whose weights are the free variables. That is,

(x1, x2, x3, x4, x5) = (2x2 + x4 − 3x5, x2, −2x4 + 2x5, x4, x5)
= x2 (2, 1, 0, 0, 0) + x4 (1, 0, −2, 1, 0) + x5 (−3, 0, 2, 0, 1)
= x2 u + x4 v + x5 w.

Notice that u, v, and w are linearly independent since the weights are free variables. Thus, NulA = Span{u, v, w}.
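The claim that u, v, and w lie in NulA and are independent can be verified numerically (a NumPy sketch):

```python
import numpy as np

A = np.array([[-3, 6, -1, 1, -7],
              [ 1, -2, 2, 3, -1],
              [ 2, -4, 5, 8, -4]])
u = np.array([2, 1, 0, 0, 0])
v = np.array([1, 0, -2, 1, 0])
w = np.array([-3, 0, 2, 0, 1])

# Each spanning vector is mapped to zero by A.
print(A @ u, A @ v, A @ w)  # all zero vectors

# Independence: the 5x3 matrix [u v w] has full column rank.
rank = np.linalg.matrix_rank(np.column_stack([u, v, w]))
print(rank)  # 3
```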

The Column Space of a Matrix.

Definition. The column space of an m × n matrix A = [a1 a2 . . . an] is the set of all linear combinations of its columns, that is,

ColA = Span{a1, a2, . . . , an} = {b : b = Ax for some x ∈ Rn}.

Theorem 3 on page 229. ColA is a subspace of Rm.
Proof: exercise.

Example 4 on pages 229 – 230. Find a matrix A such that W = ColA where

W = {(6a − b, a + b, −7a) : a, b ∈ R}.

Solution:

W = {a (6, 1, −7) + b (−1, 1, 0) : a, b ∈ R} = Span{(6, 1, −7), (−1, 1, 0)}.

Thus, the matrix is

A = [6 −1; 1 1; −7 0].

Kernel and Range of a Linear Transformation.

Recall that a linear transformation preserves vector addition and multiplication by a scalar, that is, T (u + v) = T (u) + T (v) and T (cu) = cT (u).

Definition. The kernel of a linear transformation is the set of vectors that are mapped into zero. That is, for T : V → W the kernel is the set of all v ∈ V such that T (v) = 0.

Definition. The range of a linear transformation T : V → W is its image, that is, the set of all vectors w ∈ W such that w = T (v) for some v ∈ V.


Picture.

Remark. Recall that any linear transformation from Rn to Rm is a matrix transformation, that is, T (x) = Ax for some A. Therefore, by definition, the kernel of T is the null space of A, and the range of T is the column space of A.

4.3. Bases.

Definition. A set of vectors B = {b1, . . . , bp} is a basis for a vector space V if (i) B is a linearly independent set, and (ii) B spans the vector space, that is, V = Span{b1, . . . , bp}.

Example 3 on page 238. Let A be an invertible n × n matrix. Then the columns of A form a basis for Rn because they are linearly independent and they span Rn. Explain more.

Example 4 on page 238. The set

{e1 = (1, 0, . . . , 0), e2 = (0, 1, . . . , 0), . . . , en = (0, 0, . . . , 1)}

is called the standard basis for Rn. Picture.

Example 5 on page 238. Determine whether {u = (3, 0, −6), v = (−4, 1, 7), w = (−2, 1, 5)} is a basis for R3.
Solution: These vectors are linearly independent iff

det [3 −4 −2; 0 1 1; −6 7 5] ≠ 0.

The determinant equals 6, which is nonzero. Thus, by Example 3, this set is a basis.
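The determinant test can be carried out numerically (a NumPy sketch):

```python
import numpy as np

# The three vectors form a basis for R^3 exactly when the matrix
# with these vectors as columns has a nonzero determinant.
M = np.array([[ 3, -4, -2],
              [ 0,  1,  1],
              [-6,  7,  5]])
d = round(np.linalg.det(M))
print(d)  # 6, nonzero, so the set is a basis
```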

Example 6 on pages 238 – 239. Show that {1, t, t^2, . . . , t^n} forms a basis, called the standard basis, for Pn, the set of all polynomials of degree at most n.

Theorem 5 on page 239 (The Spanning Set Theorem). Suppose a vector space V = Span{v1, . . . , vp}. If one of the vectors, say v1, is a linear combination of the others, then the set with this vector removed still spans V, that is, V = Span{v2, . . . , vp}. The set with all such vectors removed is a basis for V since it contains only linearly independent vectors spanning V.

Example 7 on page 239. Let v1 = (0, 2, −1), v2 = (2, 2, 0), and v3 = (6, 16, −5), and suppose V = Span{v1, v2, v3}. Find a basis for V.
Solution: Note that v3 = 5v1 + 3v2. By the theorem, {v1, v2} is a basis for V.

Bases for NulA.

Recall Example 3 on page 228 discussed in the last lecture. We learned how to produce a linearly independent set of vectors that spans the null space of a matrix. That is, we learned how to find a basis for NulA.

Bases for ColA.

Example 8 on page 240. Find a basis for ColB where

B = [1 4 0 2 0; 0 0 1 −1 0; 0 0 0 0 1; 0 0 0 0 0] = [b1 . . . b5].

Solution: Each nonpivot column of B is a linear combination of the pivot columns. In fact, the pivot columns are b1, b3, and b5, and the nonpivot columns are b2 = 4b1 and b4 = 2b1 − b3. By the Spanning Set Theorem, we may discard the nonpivot columns. The pivot columns are linearly independent and form a basis for ColB.

Notice that the matrix B is in reduced row echelon form. Suppose a matrix is not in this form. How do we find a basis for its column space?

Theorem 6 on page 241. (i) Elementary row operations on a matrix do not affect the linear dependence relations among the columns of the matrix. (ii) The pivot columns of a matrix form a basis for its column space.

Example 9 on page 241. Find a basis for ColA where

A = [1 4 0 2 −1; 3 12 1 5 5; 2 8 1 3 2; 5 20 2 8 8].

Solution: It can be shown that A is row equivalent to the matrix B in Example 8. Thus, the columns a1, a3, and a5 are the pivot columns and form a basis.

4.4. Coordinate Systems.

Theorem 7 on page 246 (The Unique Representation Theorem). Let B = {b1, . . . , bn} be a basis for a vector space V. Then for any x ∈ V, there exists a unique set of scalars c1, . . . , cn such that x = c1b1 + · · · + cnbn.

Proof: by contradiction, use linear independence.

Definition. The coordinates of x relative to a basis B = {b1, . . . , bn} (or B-coordinates of x) are the weights c1, . . . , cn such that x = c1b1 + · · · + cnbn.

Definition. The vector in Rn

[x]B = (c1, . . . , cn)

is the coordinate vector of x (relative to B), or the B-coordinate vector of x. The mapping x 7→ [x]B is the coordinate mapping determined by B.

Examples 1 and 2 on page 247. Consider B = {b1 = (1, 0), b2 = (1, 2)}. It is a basis for R2. Suppose a vector x ∈ R2 has the coordinate vector [x]B = (−2, 3). Find x.
Solution: x = c1b1 + c2b2 = −2 (1, 0) + 3 (1, 2) = (1, 6). Note that the entries of this vector are the coordinates of x relative to the standard basis {e1 = (1, 0), e2 = (0, 1)}. Indeed, x = (1, 6) = 1 · e1 + 6 · e2.

Graphically, the coordinates can be interpreted in the following way. Plot the standard basis and x, then plot B. On the B-graph paper, x has coordinates (−2, 3).

Example 4 on page 249. Let b1 = (2, 1), b2 = (−1, 1), and x = (4, 5). Find the B-coordinates of x.
Solution: [x]B = (c1, c2) where

x = (4, 5) = c1b1 + c2b2 = c1 (2, 1) + c2 (−1, 1).

Solving the system, we obtain c1 = 3, c2 = 2.
Note that the augmented matrix of this system is [b1 b2 x]. This can be generalized to Rn.
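Finding B-coordinates is exactly solving the linear system whose coefficient matrix has b1 and b2 as columns; a NumPy sketch:

```python
import numpy as np

# Solving [b1 b2 | x] amounts to solving P_B c = x for the B-coordinates c.
b1 = np.array([2, 1])
b2 = np.array([-1, 1])
x = np.array([4, 5])
P = np.column_stack([b1, b2])
c = np.linalg.solve(P, x)
print(c)  # [3. 2.]
```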

Definition. Consider a basis B = {b1, . . . , bn} in Rn. Let PB = [b1 . . . bn]. The change-of-coordinates equation x = c1b1 + · · · + cnbn is equivalent to x = PB [x]B. The matrix PB is called the change-of-coordinates matrix from B to the standard basis in Rn. The coordinate mapping from the standard basis to B is given by [x]B = PB−1 x. Note that PB is invertible since its columns are linearly independent.

The Coordinate Mapping.

Definition (on page 87). A mapping T : U → V is said to be onto V if each vector in V is the image of at least one vector in U.
Picture.

Definition (on page 87). A mapping T : U → V is said to be one-to-one if each vector in V is the image of at most one vector in U.
Picture.

Theorem 8 on page 250. Let B = {b1, . . . , bn} be a basis for a vector space V. Then the coordinate mapping x 7→ [x]B is a one-to-one linear transformation from V onto Rn.
Proof: The coordinate mapping preserves vector addition and scalar multiplication (show!). Hence it is a linear transformation. Show the rest.

Definition. A one-to-one linear transformation from a vector space V onto a vector space W is called an isomorphism between V and W. The vector spaces are called isomorphic. The word “isomorphism” is from Greek: “iso” = “the same” and “morph” = “form”. The name reflects the fact that two isomorphic spaces are indistinguishable as vector spaces. Every vector space calculation in V is accurately reproduced in W and vice versa.

The above theorem states that the possibly unfamiliar space V is isomorphic to the familiar Rn.

Example 5 on page 251. Let B = {1, t, t^2, t^3} be the standard basis for P3. A typical vector in P3 has the form p(t) = a0 + a1t + a2t^2 + a3t^3, which corresponds to the coordinate vector [p]B = (a0, a1, a2, a3) in R4. By the theorem, the coordinate mapping p 7→ [p]B is an isomorphism between P3 and R4.

4.5. The Dimension of a Vector Space.

Theorem 9 on page 256. If a vector space V has a basis B = {b1, . . . , bn}, then any set in V containing more than n vectors must be linearly dependent. This implies that any linearly independent set in V has no more than n vectors.
Proof: Suppose {u1, . . . , up} is a set in V and p > n. Then the set {[u1]B, . . . , [up]B} is a linearly dependent set in Rn since there are more vectors (p) than entries (n) in each vector. Therefore, there exist scalars c1, . . . , cp, not all zero, such that

0 = c1[u1]B + · · · + cp[up]B = {the coordinate mapping is a linear transformation} = [c1u1 + · · · + cpup]B.

Hence, c1u1 + · · · + cpup = 0. The scalars are not all zero; therefore, the set {u1, . . . , up} is linearly dependent. □

Theorem 10 on page 257. If a vector space V has a basis consisting of n vectors, then every basis of V must have exactly n vectors.
Proof: Let B1 be a basis consisting of n vectors, and let B2 be any other basis. Since B1 is a basis and B2 is linearly independent, B2 has no more than n vectors by Theorem 9. Also, since B2 is a basis and B1 is linearly independent, B2 has at least n vectors. Thus, B2 consists of exactly n vectors. □

Definition. The dimension of a vector space V, dimV, is the number of vectors in a basis of V. The dimension of the zero vector space {0} is defined to be zero. If V is not spanned by a finite number of vectors, then dimV = ∞.

Remark. This is a consistent definition since, by Theorem 10, all bases of V have the same number of vectors.

Example 1 on page 257. dimRn = n since the standard basis consists of n vectors. dimPn = n + 1 since the set {1, t, t^2, . . . , t^n} is a basis. dimP = ∞.

Example 3 on page 258. Find the dimension of the subspace of R4

H = {(a − 3b + 6c, 5a + 4d, b − 2c − d, 5d) : a, b, c, d ∈ R}.

Solution: Note that

H = Span{v1 = (1, 5, 0, 0), v2 = (−3, 0, 1, 0), v3 = (6, 0, −2, 0), v4 = (0, 4, −1, 5)}.

Since v3 = −2v2 and the other vectors are linearly independent, H = Span{v1, v2, v4}. Thus, dimH = 3.

Theorem 12 on page 259 (The Basis Theorem). Let V be a p-dimensional vector space, p ≥ 1. Any linearly independent set of exactly p vectors in V is a basis for V. Any set of exactly p vectors that spans V is a basis for V.


Proposition. The dimension of NulA is the number of free variables in the equation Ax = 0. The dimension of ColA is the number of pivot columns in A.

Example 5 on page 260. Find the dimensions of the null space and the column space of

A = [−3 6 −1 1 −7; 1 −2 2 3 −1; 2 −4 5 8 −4].

Solution: Row reduce the augmented matrix [A 0] to echelon form:

[1 −2 2 3 −1 0; 0 0 1 2 −2 0; 0 0 0 0 0 0].

There are three free variables: x2, x4, and x5. Hence dimNulA = 3. Also, dimColA = 2 because A has two pivot columns.

4.6. Rank.

Definition. The rank of a matrix A is the dimension of the column space of A. That is, rankA = dimColA.

Theorem 14 on page 265 (The Rank Theorem). For an m × n matrix A,

rankA + dimNulA = n.

“Proof:” The rank of A is the number of pivot columns, which correspond to the basic (non-free) variables. The remaining n − rankA variables are free. □

Note that in Example 5, rankA+ dimNulA = 2 + 3 = 5.
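Both dimensions can be obtained numerically: NumPy's `matrix_rank` computes dim ColA, and the Rank Theorem then gives dim NulA (a sketch, using the matrix of Example 5):

```python
import numpy as np

A = np.array([[-3, 6, -1, 1, -7],
              [ 1, -2, 2, 3, -1],
              [ 2, -4, 5, 8, -4]])
rank = np.linalg.matrix_rank(A)   # dim Col A
nullity = A.shape[1] - rank       # dim Nul A, by the Rank Theorem
print(rank, nullity)  # 2 3
```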

5.1. Eigenvectors and Eigenvalues.

Definition. An eigenvector of an n × n matrix A is a nonzero vector x such that Ax = λx, where the scalar λ is called the eigenvalue of A corresponding to the eigenvector x. In other words, the matrix transformation x 7→ Ax stretches or shrinks x.

Definition. The eigenspace of a matrix corresponding to an eigenvalue λ is the set consisting of the zero vector and all the eigenvectors corresponding to λ.

Example 2 on page 303. Let A = [1 6; 5 2], u = (6, −5), and v = (3, −2). Are u and v eigenvectors of A?
Solution: Au = (−24, 20) = −4u, so u is an eigenvector with eigenvalue −4. On the other hand, Av = (−9, 11), which is not a multiple of v, so v is not an eigenvector.

5.2. The Characteristic Equation.

To find the eigenvalues of a matrix A, one has to find all λ’s such that the equation Ax = λx or, equivalently, (A − λI)x = 0 has a nontrivial solution. This problem is equivalent to finding all λ’s such that the matrix A − λI is not invertible, which is equivalent to solving the characteristic equation of A, det(A − λI) = 0.

Definition. If A is an n × n matrix, then det(A − λI) is a polynomial of degree n called the characteristic polynomial of A.

Example 3 on pages 313 – 314. Find the characteristic equation of

A = [5 −2 6 −1; 0 3 −8 0; 0 0 5 4; 0 0 0 1].

Answer: Since A is triangular, det(A − λI) = (5 − λ)^2(3 − λ)(1 − λ), so the characteristic equation is (λ − 1)(λ − 3)(λ − 5)^2 = 0.

Definition. The multiplicity of an eigenvalue λ1 is its multiplicity as a root of the characteristic equation; that is, if the characteristic polynomial of A has the factor (λ − λ1)^k, then the multiplicity of λ1 is k.

In our example, λ1 = 1 and λ2 = 3 have multiplicity 1, while λ3 = 5 has multiplicity 2.

Example. Find the eigenvalues and eigenvectors of A = [1 5 0; 0 4 −1; 0 0 −2].
Solution: Solve

0 = det(A − λI) = det [1−λ 5 0; 0 4−λ −1; 0 0 −2−λ] = (1 − λ)(4 − λ)(−2 − λ).

The solutions are λ1 = −2, λ2 = 1, and λ3 = 4. Note that the eigenvalues of a triangular matrix are the diagonal entries.
The eigenvector x1 corresponding to λ1 = −2 solves (A + 2I)x = 0. The augmented matrix of this equation is

[3 5 0 0; 0 6 −1 0; 0 0 0 0] =⇒ [1 0 5/18 0; 0 1 −1/6 0; 0 0 0 0].

The solution is x1 = (−(5/18)a, (1/6)a, a) where a is free.
The eigenvector x2 solves (A − I)x = 0. The augmented matrix is

[0 5 0 0; 0 3 −1 0; 0 0 −3 0] =⇒ [0 1 0 0; 0 0 1 0; 0 0 0 0].

The solution is x2 = (b, 0, 0) where b is free.
The eigenvector x3 solves (A − 4I)x = 0. The augmented matrix is

[−3 5 0 0; 0 0 −1 0; 0 0 −6 0] =⇒ [1 −5/3 0 0; 0 0 1 0; 0 0 0 0].

The solution is x3 = ((5/3)c, c, 0) where c is free.
For example, we may take x1 = (−5, 3, 18), x2 = (1, 0, 0), and x3 = (5, 3, 0).

Theorem 2 on page 307. If v1, . . . , vr are eigenvectors corresponding to distinct eigenvalues λ1, . . . , λr of a matrix A, then the vectors are linearly independent.

Indeed, in our example, consider the linear combination

α (−5, 3, 18) + β (1, 0, 0) + γ (5, 3, 0) = 0 ⇐⇒ α = β = γ = 0.
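Eigenvalues and eigenvectors can also be computed numerically as a check; a NumPy sketch (numerical eigenvectors are normalized, so they differ from ours by scaling):

```python
import numpy as np

A = np.array([[1.0, 5, 0],
              [0, 4, -1],
              [0, 0, -2]])
vals, vecs = np.linalg.eig(A)
# Each column of `vecs` is a unit eigenvector: A @ vecs[:, i] = vals[i] * vecs[:, i].
print(np.sort(vals))  # eigenvalues -2, 1, 4
```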

Exercise 14 on page 317. Find the eigenvalues and eigenvectors of

A = [5 −2 3; 0 1 0; 6 7 −2].

Solution: det(A − λI) = −(λ + 4)(λ − 1)(λ − 7) = 0, thus the eigenvalues are λ1 = −4, λ2 = 1, and λ3 = 7. The eigenvector x1 solves the system with augmented matrix

[9 −2 3 0; 0 5 0 0; 6 7 2 0] =⇒ x1 = (−(1/3)a, 0, a).

The eigenvector x2 solves

[4 −2 3 0; 0 0 0 0; 6 7 −3 0] =⇒ x2 = (−(3/8)b, (3/4)b, b).

The eigenvector x3 solves

[−2 −2 3 0; 0 −6 0 0; 6 7 −9 0] =⇒ x3 = ((3/2)c, 0, c).

For example, we can take x1 = (−1, 0, 3), x2 = (−3, 6, 8), and x3 = (3, 0, 2). Again, they are linearly independent.

5.3. Diagonalization.

Definition. A square matrix A is diagonalizable if there exist an invertible matrix P and a diagonal matrix D such that A = PDP−1. The matrix A is then said to be similar to D.

Example 2 on page 320. Let A = [7 2; −4 1], P = [1 1; −1 −2], and D = [5 0; 0 3]. Check that A = PDP−1 and find A^3.
Solution: Check that AP = PD. Then

A^3 = PD^3P−1 = [1 1; −1 −2] [125 0; 0 27] [2 1; −1 −1] = [223 98; −196 −71].

Theorem 5 on page 320 (The Diagonalization Theorem). An n × n matrix A is diagonalizable iff it has n linearly independent eigenvectors. In fact, A = PDP−1 iff the columns of P are n linearly independent eigenvectors of A, and the diagonal entries of D are the corresponding eigenvalues of A.

In our example, det(A− λI) = (7− λ)(1− λ) + 8 = (λ− 5)(λ− 3) = 0, etc.

Example 3 on pages 321 – 322. If possible, diagonalize the matrix

A = [1 3 3; −3 −5 −3; 3 3 1].

Solution: The characteristic equation is −(λ − 1)(λ + 2)^2 = 0. Hence, the eigenvalues are 1 and −2. Three linearly independent eigenvectors are v1 = (1, −1, 1), v2 = (−1, 1, 0), and v3 = (−1, 0, 1), so A is diagonalizable.

Example 4 on page 323. Diagonalize A = [2 4 3; −4 −6 −3; 3 3 1].
Solution: The characteristic equation is −(λ − 1)(λ + 2)^2 = 0. Hence, the eigenvalues are 1 and −2. We can find only two linearly independent eigenvectors, v1 = (1, −1, 1) and v2 = (−1, 1, 0), so A is not diagonalizable.

Theorem 6 on page 323. An n × n matrix with n distinct eigenvalues is diagonalizable.

5.4. Eigenvectors and Linear Transformations.

Suppose T : V → W where dimV = n and dimW = m. Let B = {b1, . . . , bn} and C be bases of V and W, respectively. Given x ∈ V, the coordinate vector [x]B is in Rn, and the coordinate vector of T (x) ∈ W, [T (x)]C, is in Rm. The connection between [x]B and [T (x)]C is as follows. If x = r1b1 + · · · + rnbn, then [x]B = (r1, . . . , rn) and T (x) = r1T (b1) + · · · + rnT (bn) because T is linear. In terms of the C-coordinate vectors, [T (x)]C = r1[T (b1)]C + · · · + rn[T (bn)]C. Hence,

[T (x)]C = M [x]B

where M = [[T (b1)]C . . . [T (bn)]C] is the matrix representation of T, called the matrix for T relative to the bases B and C.

Example 1 on page 329. Let B = {b1, b2} and C = {c1, c2, c3}, and suppose T (b1) = 3c1 − 2c2 + 5c3 and T (b2) = 4c1 + 7c2 − c3. Then the matrix for T relative to B and C is

M = [[T (b1)]C [T (b2)]C] = [3 4; −2 7; 5 −1].

Definition. If T : V → V and the basis C is the same as B, then the matrix M is called the matrix for T relative to B (or the B-matrix for T), and is denoted by [T ]B.

Example 2 on page 329. Consider differentiation in P2. It is defined as T : P2 → P2 such that T (a0 + a1t + a2t^2) = a1 + 2a2t. Suppose B = {1, t, t^2}. Since T (1) = 0, T (t) = 1, and T (t^2) = 2t, the B-matrix for T is

[T ]B = [0 1 0; 0 0 2; 0 0 0].

For a general p(t) = a0 + a1t + a2t^2, the coordinate vector is [p]B = (a0, a1, a2), and the image is

[T (p)]B = [a1 + 2a2t]B = (a1, 2a2, 0) = [0 1 0; 0 0 2; 0 0 0] (a0, a1, a2) = [T ]B[p]B.


Theorem 8 on page 331. Suppose A = PDP−1 where D is a diagonal n × n matrix. If B is the basis for Rn formed from the columns of P, then D is the B-matrix for the transformation T (x) = Ax.

Example 3 on page 331. Let A = [7 2; −4 1]. Then P = [1 1; −1 −2] and D = [5 0; 0 3]. D is the B-matrix for T when B = {(1, −1), (1, −2)}.

6.1. Inner Product, Length, and Orthogonality.

Definition. If u = (u1, . . . , un) and v = (v1, . . . , vn) are two vectors in Rn, then the inner product or dot product is the number

u · v = uTv = u1v1 + · · · + unvn.

Example. Compute the dot product of u = (1, −3, 2) and v = (5, 0, −2). Answer: u · v = 5 + 0 − 4 = 1.

Theorem 1 on page 376. The dot product has the following properties:
(a) u · v = v · u (commutative law)
(b) (u + v) · w = u · w + v · w (distributive law)
(c) (cu) · v = c(u · v) = u · (cv) (associative law)
(d) u · u ≥ 0 for all u, and u · u = 0 iff u = 0.

Definition. The length (or norm) of a vector v is ∥v∥ = √(v · v) = √(v1^2 + · · · + vn^2).

Example.

Definition. The distance between u and v is dist(u, v) = ∥u − v∥.
Example. Picture.

Definition. Vectors u and v are orthogonal iff u · v = 0.

Theorem 2 on page 380 (The Pythagorean Theorem). Vectors u and v are orthogonal iff ∥u + v∥^2 = ∥u∥^2 + ∥v∥^2.
Proof:

Definition. Let W be a subspace of Rn. The orthogonal complement of W, denoted W⊥, is the set of all vectors that are orthogonal to every vector in W.

Example. Let W be a plane through the origin in R3. Then W⊥ is the line through the origin perpendicular to that plane.


6.2. Orthogonal Sets.

Definition. A set of vectors {u1, . . . , up} is an orthogonal set if ui · uj = 0 for any i ≠ j.

Example 1 on page 384. The set {(3, 1, 1), (−1, 2, 1), (−1/2, −2, 7/2)} is an orthogonal set.

Theorem 4 on page 384. If S = {u1, . . . , up} is an orthogonal set of nonzero vectors, then S is linearly independent and therefore is a basis for the subspace spanned by S.
Proof: The vector equation 0 = c1u1 + · · · + cpup has only the trivial solution. Indeed, 0 = 0 · u1 = c1u1 · u1, hence c1 = 0, etc.

Theorem 5 on page 385. Let {u1, . . . , up} be an orthogonal basis for a vector space W. Then for any y ∈ W,

y = ∑_{i=1}^p ((y · ui)/(ui · ui)) ui.

Proof: y · u1 = c1u1 · u1, etc.

Example 2 on pages 385 – 386. Let y = (6, 1, −8). Write y as a linear combination of the vectors in S in Example 1. Answer: y = u1 − 2u2 − 2u3.

Definition. Let L = Span{u} for some vector u ∈ Rn, and take any y ∈ Rn. The orthogonal projection of y onto L is

ŷ = projL y = ((y · u)/(u · u)) u.

Example. u = (1, 0). Then projL y = (y1, 0), the projection onto the x1-axis.

Definition. A set of vectors is orthonormal if it is an orthogonal set of unit vectors.
Example.

Theorem 6 on page 390. A matrix U has orthonormal columns iff UTU = I.

Theorem 7 on page 390. Matrices with orthonormal columns preserve the inner product: (Ux) · (Uy) = x · y. In particular, ∥Ux∥ = ∥x∥.
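Both properties can be checked numerically for a concrete U; a NumPy sketch (the columns are the first two vectors of Example 1, normalized):

```python
import numpy as np

# Build a 3x2 matrix with orthonormal columns from an orthogonal pair.
U = np.column_stack([[3.0, 1, 1], [-1.0, 2, 1]])
U = U / np.linalg.norm(U, axis=0)   # normalize each column

x = np.array([1.0, 2.0])
preserves_identity = np.allclose(U.T @ U, np.eye(2))              # U^T U = I
preserves_norm = np.isclose(np.linalg.norm(U @ x), np.linalg.norm(x))
print(preserves_identity, preserves_norm)  # True True
```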


6.3. Orthogonal Projections.

Theorem 8 on page 395 (The Orthogonal Decomposition Theorem). Let W be a subspace of Rn. Then any y ∈ Rn can be written uniquely as

y = ŷ + z

where ŷ ∈ W and z ∈ W⊥. In fact, if {u1, . . . , up} is an orthogonal basis for W, then

ŷ = ∑_{i=1}^p ((y · ui)/(ui · ui)) ui

and z = y − ŷ.

Picture.

Example 2 on pages 396 – 397. Let u1 = (2, 5, −1), u2 = (−2, 1, 1), and y = (1, 2, 3). Then the decomposition of y is

(1, 2, 3) = (−2/5, 2, 1/5) + (7/5, 0, 14/5).

Theorem 9 on page 398. The orthogonal projection ŷ of a vector y onto W is the closest vector in W to y in the sense that dist(y, ŷ) = ∥y − ŷ∥ = min_{v ∈ W} ∥y − v∥.

Picture.

Definition. The distance between y and W is the distance between y and the closest vector in W, that is, dist(y, W) = ∥y − projW y∥.

Example 4 on page 399. Find the distance between y = (−1, −5, 10) and W = Span{u1 = (5, −2, 1), u2 = (1, 2, −1)}.
Solution:

ŷ = (1/2)u1 − (7/2)u2 = (−1, −8, 4), y − ŷ = (0, 3, 6), ∥y − ŷ∥ = 3√5.


Theorem 10 on page 399. If {u1, . . . , up} is an orthonormal basis for W, then projW y = ∑_{i=1}^p (y · ui) ui. If U = [u1 · · · up], then projW y = UUTy.

6.4. The Gram – Schmidt Orthogonalization Process.

Theorem 11 on page 404 (The Gram – Schmidt Process). Suppose {x1, . . . , xp} is a basis for W. The following algorithm produces an orthogonal basis for W. Take

v1 = x1,
v2 = x2 − ((x2 · v1)/(v1 · v1)) v1,
. . .
vp = xp − ((xp · v1)/(v1 · v1)) v1 − ((xp · v2)/(v2 · v2)) v2 − · · · − ((xp · vp−1)/(vp−1 · vp−1)) vp−1.

The set {v1, . . . , vp} is an orthogonal basis for W.

Example 2 on pages 402 – 403. Suppose

x1 = (1, 1, 1, 1), x2 = (0, 1, 1, 1), x3 = (0, 0, 1, 1).

Then

v1 = (1, 1, 1, 1), v2 = (−3/4, 1/4, 1/4, 1/4), v3 = (0, −2/3, 1/3, 1/3).

6.7. Inner Product Spaces.

Definition. An inner product on a vector space V is a function that, for any u and v ∈ V, assigns a real number ⟨u,v⟩ satisfying the following properties:
1. ⟨u,v⟩ = ⟨v,u⟩ (commutative law)
2. ⟨u + v,w⟩ = ⟨u,w⟩ + ⟨v,w⟩ (distributive law)
3. ⟨cu,v⟩ = c⟨u,v⟩
4. ⟨u,u⟩ ≥ 0, and ⟨u,u⟩ = 0 iff u = 0.
A vector space with an inner product is called an inner product space.

Example 1 on page 428. Let u = (u1, u2) and v = (v1, v2). Show that ⟨u,v⟩ = 4u1v1 + 5u2v2 is an inner product.

Generally, ⟨u,v⟩ = ∑_{i=1}^n ai ui vi with positive weights ai defines an inner product on Rn.


Example 2 on page 429. Fix distinct numbers t0, . . . , tn ∈ R. For any p and q ∈ Pn define ⟨p,q⟩ = ∑_{i=0}^n p(ti)q(ti). Show that this is an inner product.
Solution: The only nontrivial axiom is the last one: ⟨p,p⟩ = 0 iff the polynomial vanishes at the n + 1 points t0, . . . , tn, which happens iff p is the zero polynomial since its degree is at most n.

Example 3 on page 429. Let t0 = 0, t1 = 1/2, t2 = 1, and let p(t) = 12t^2 and q(t) = 2t − 1. Then ⟨p,q⟩ = p(0)q(0) + p(1/2)q(1/2) + p(1)q(1) = 0 · (−1) + 3 · 0 + 12 · 1 = 12.

Definition. The length (or norm) of a vector v in an inner product space V is ∥v∥ = √⟨v,v⟩.

Example 4 on page 429. In Example 3, ∥p∥ = √(0 + 9 + 144) = √153 and ∥q∥ = √(1 + 0 + 1) = √2.

The Gram – Schmidt orthogonalization process applies in any inner product space.

Example 5 on pages 430 – 431. Consider P2 with the inner product ⟨p,q⟩ = p(−1)q(−1) + p(0)q(0) + p(1)q(1). Take {1, t, t^2}, the standard basis in P2. Orthogonalize this basis.
Solution: For any polynomial p ∈ P2, consider the vector (p(−1), p(0), p(1)) ∈ R3. The standard basis corresponds to the set {v1 = (1, 1, 1), v2 = (−1, 0, 1), v3 = (1, 0, 1)}. Notice that 1 and t are already orthogonal. Apply the Gram – Schmidt process to find

v3 − (⟨t^2, 1⟩/⟨1, 1⟩) v1 − (⟨t^2, t⟩/⟨t, t⟩) v2 = (1, 0, 1) − (2/3)(1, 1, 1) − 0 = (1/3, −2/3, 1/3).

This vector corresponds to a polynomial a + bt + ct^2 ∈ P2 whose coefficients satisfy

a − b + c = 1/3
a = −2/3
a + b + c = 1/3,

namely t^2 − 2/3. Thus, {1, t, t^2 − 2/3} is an orthogonal basis for P2. Dividing each polynomial by its norm gives the orthonormal basis {1/√3, t/√2, (3t^2 − 2)/√6}.

Theorem 16 on page 432 (The Cauchy – Schwarz Inequality). For any u and v,

|⟨u,v⟩| ≤ ∥u∥∥v∥.

Three applications of the Cauchy – Schwarz Inequality.


1. Theorem 17 on page 433 (The Triangle Inequality). For any u and v,

∥u + v∥ ≤ ∥u∥ + ∥v∥.

Picture.
Proof: ∥u + v∥^2 = ⟨u + v, u + v⟩ = ⟨u,u⟩ + 2⟨u,v⟩ + ⟨v,v⟩ ≤ ∥u∥^2 + 2|⟨u,v⟩| + ∥v∥^2 ≤ {by Cauchy – Schwarz} ≤ ∥u∥^2 + 2∥u∥∥v∥ + ∥v∥^2 = (∥u∥ + ∥v∥)^2. □

2. Exercise 19 on page 436. Show that √(ab) ≤ (a + b)/2 for a, b ≥ 0, that is, the geometric mean is less than or equal to the arithmetic mean.
Solution: Consider u = (√a, √b) and v = (√b, √a). The inner product is ⟨u,v⟩ = 2√(ab). The norms are ∥u∥ = ∥v∥ = √(a + b). By the Cauchy – Schwarz inequality, 2√(ab) ≤ a + b.

3. Exercise 20 on page 436. Show that (a + b)^2 ≤ 2a^2 + 2b^2.
Solution: Consider u = (a, b) and v = (1, 1). The inner product is ⟨u,v⟩ = a + b. The norms are ∥u∥ = √(a^2 + b^2) and ∥v∥ = √2. By the Cauchy – Schwarz inequality, |a + b| ≤ √(2(a^2 + b^2)); squaring both sides gives the result.

7.1. Diagonalization of Symmetric Matrices.

Definition. A square matrix A is symmetric iff A = AT .

Example.

Theorem 1 on page 450. If A is symmetric, then any two eigenvectors from different eigenspaces are orthogonal.
Proof: Let v1 and v2 be two eigenvectors corresponding to two distinct eigenvalues λ1 and λ2, respectively. We want to show that v1 · v2 = 0. We have

λ1 v1 · v2 = (λ1v1) · v2 = (λ1v1)T v2 = (Av1)T v2 = v1T AT v2 = v1T (Av2) = v1T (λ2v2) = λ2 v1 · v2.

Since λ1 ≠ λ2, v1 · v2 = 0. □

Example 2 on pages 449 – 450. Consider A = [6 −2 −1; −2 6 −1; −1 −1 5]. The characteristic equation of A is −(λ − 3)(λ − 6)(λ − 8) = 0. The eigenvectors are

λ1 = 3 : v1 = (1, 1, 1), λ2 = 6 : v2 = (−1, −1, 2), λ3 = 8 : v3 = (−1, 1, 0).

They are orthogonal.


Since a nonzero multiple of an eigenvector is still an eigenvector, we can normalize the orthogonal eigenvectors {v1, v2, v3} to produce the unit (orthonormal) eigenvectors {u1, u2, u3}. In our example,

u1 = (1/√3, 1/√3, 1/√3), u2 = (−1/√6, −1/√6, 2/√6), u3 = (−1/√2, 1/√2, 0).

Definition. Let P = [u₁ ... uₙ] where {u₁, ..., uₙ} are the orthonormal eigenvectors of an n × n matrix A that correspond to not necessarily distinct eigenvalues λ₁, ..., λₙ. The matrix P is called an orthogonal matrix. It has the property that P⁻¹ = Pᵀ. Let

D = [ λ₁   0  ...   0
       0  λ₂  ...   0
              ...
       0   0  ...  λₙ ]

be the diagonal matrix with the eigenvalues on the main diagonal. Then A = PDP⁻¹ = PDPᵀ. The matrix A is said to be orthogonally diagonalizable.

In our example, P = . . . , D = . . . .
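Continuing the example, the properties P⁻¹ = Pᵀ and A = PDPᵀ can be confirmed numerically (a NumPy sketch using the eigenvectors found above):

```python
import numpy as np

A = np.array([[ 6, -2, -1],
              [-2,  6, -1],
              [-1, -1,  5]], dtype=float)

u1 = np.array([ 1,  1, 1]) / np.sqrt(3)
u2 = np.array([-1, -1, 2]) / np.sqrt(6)
u3 = np.array([-1,  1, 0]) / np.sqrt(2)

P = np.column_stack([u1, u2, u3])  # orthonormal eigenvectors as columns
D = np.diag([3.0, 6.0, 8.0])       # eigenvalues on the diagonal

assert np.allclose(P.T @ P, np.eye(3))  # P is orthogonal: P⁻¹ = Pᵀ
assert np.allclose(P @ D @ P.T, A)      # A = PDPᵀ
```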

Theorem 2 on page 451. An n × n matrix A is orthogonally diagonalizable iff A is symmetric.

Example 3 on pages 451 – 452. Orthogonally diagonalize the matrix

A = [  3  −2   4
      −2   6   2
       4   2   3 ].

Solution: The characteristic polynomial is −(λ + 2)(λ − 7)². The eigenvectors are

λ₁ = −2: v₁ = (−1, −1/2, 1)ᵀ,  λ₂ = 7: v₂ = (1, 0, 1)ᵀ, v₃ = (−1/2, 1, 0)ᵀ.

The basis {v₂, v₃} can be orthogonalized by using the Gram – Schmidt process. Normalizing then produces

u₁ = (−2/3, −1/3, 2/3)ᵀ,  u₂ = (1/√2, 0, 1/√2)ᵀ,  u₃ = (−1/√18, 4/√18, 1/√18)ᵀ.

Then A = PDPᵀ.
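As a hedge against arithmetic slips, NumPy's `eigh` routine, which returns ascending eigenvalues and orthonormal eigenvectors for a symmetric matrix, reproduces this diagonalization:

```python
import numpy as np

A = np.array([[ 3, -2, 4],
              [-2,  6, 2],
              [ 4,  2, 3]], dtype=float)

# eigh is designed for symmetric (Hermitian) matrices.
eigvals, P = np.linalg.eigh(A)

assert np.allclose(eigvals, [-2, 7, 7])          # spectrum from Example 3
assert np.allclose(P.T @ P, np.eye(3))           # columns are orthonormal
assert np.allclose(P @ np.diag(eigvals) @ P.T, A)  # A = PDPᵀ
```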

Definition. Write A = PDPᵀ = [u₁ ... uₙ] diag(λ₁, ..., λₙ) Pᵀ = [λ₁u₁ ... λₙuₙ] Pᵀ, where Pᵀ is the matrix with rows u₁ᵀ, ..., uₙᵀ. Thus, we can write

A = λ₁u₁u₁ᵀ + ··· + λₙuₙuₙᵀ.

This representation of A is called a spectral decomposition of A. The set of eigenvalues of A is called the spectrum.

Example 4 on page 453. Construct a spectral decomposition of

A = [7 2; 2 4] = [2/√5 −1/√5; 1/√5 2/√5] [8 0; 0 3] [2/√5 1/√5; −1/√5 2/√5]

(rows of each matrix separated by semicolons).

Solution: A = 8u₁u₁ᵀ + 3u₂u₂ᵀ = [32/5 16/5; 16/5 8/5] + [3/5 −6/5; −6/5 12/5].
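The spectral decomposition in Example 4 can be checked by forming the two rank-one outer products directly (a NumPy sketch, not in the text):

```python
import numpy as np

A = np.array([[7, 2],
              [2, 4]], dtype=float)

u1 = np.array([ 2, 1]) / np.sqrt(5)   # unit eigenvector for eigenvalue 8
u2 = np.array([-1, 2]) / np.sqrt(5)   # unit eigenvector for eigenvalue 3

# Spectral decomposition: A = 8·u1u1ᵀ + 3·u2u2ᵀ.
S = 8*np.outer(u1, u1) + 3*np.outer(u2, u2)
assert np.allclose(S, A)
assert np.allclose(8*np.outer(u1, u1), [[32/5, 16/5], [16/5, 8/5]])
assert np.allclose(3*np.outer(u2, u2), [[3/5, -6/5], [-6/5, 12/5]])
```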

7.2. Quadratic Forms.

Definition. A quadratic form on Rⁿ is Q(x) = xᵀAx, where A is an n × n symmetric matrix called the matrix of the quadratic form.

Example 1 on page 456. Let A = [3 −2; −2 7]. The quadratic form is Q(x) = xᵀAx = 3x₁² − 4x₁x₂ + 7x₂².

Example 2 on page 456. Let Q(x) = 5x₁² + 3x₂² + 2x₃² − x₁x₂ + 8x₂x₃. The matrix of the quadratic form is

A = [    5  −1/2  0
      −1/2     3  4
         0     4  2 ].

Definition. If x represents a variable vector in Rⁿ, then a change of variable is an equation of the form x = Py or y = P⁻¹x.

Definition. If the change of variable is made in the quadratic form, then xᵀAx = (Py)ᵀA(Py) = yᵀ(PᵀAP)y, and the new matrix of the quadratic form is PᵀAP. If P orthogonally diagonalizes A, then PᵀAP = D. Thus, the matrix of the new quadratic form is diagonal.

Example 4 on pages 457 – 458. Let A = [1 −4; −4 −5]. Then P = [2/√5 1/√5; −1/√5 2/√5] and D = [3 0; 0 −7]. The orthogonal change of variable is x = Py, where x = (x₁, x₂)ᵀ and y = (y₁, y₂)ᵀ. The quadratic form is xᵀAx = x₁² − 8x₁x₂ − 5x₂² = yᵀDy = 3y₁² − 7y₂².
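A short check that the change of variable in this example removes the cross-product term (NumPy assumed):

```python
import numpy as np

A = np.array([[ 1, -4],
              [-4, -5]], dtype=float)
P = np.array([[ 2,  1],
              [-1,  2]]) / np.sqrt(5)
D = np.diag([3.0, -7.0])

# PᵀAP = D, so in the y-coordinates the form has no cross-product term.
assert np.allclose(P.T @ A @ P, D)

# Spot check at an arbitrary point: xᵀAx = yᵀDy when x = Py.
y = np.array([1.3, -0.7])
x = P @ y
assert abs(x @ A @ x - (3*y[0]**2 - 7*y[1]**2)) < 1e-9
```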

Theorem 4 on page 458. If A is an n × n symmetric matrix, then there exists an orthogonal change of variable x = Py that transforms the quadratic form xᵀAx into a quadratic form yᵀDy with no cross-product terms.

Definition. The columns of P are called the principal axes of the quadratic form xᵀAx. The vector y is the coordinate vector of x relative to the orthonormal basis of Rⁿ given by the principal axes.

Geometric Interpretation of Quadratic Forms in R².


Consider Q(x) = xᵀDx with D diagonal, and let c be a constant. Then the equation xᵀDx = c can be written in one of the following six forms:

(i) x₁²/a² + x₂²/b² = 1, a ≥ b > 0 (an ellipse if a > b, a circle if a = b),
(ii) x₁²/a² + x₂²/b² = 0 (a single point (0, 0)),
(iii) x₁²/a² + x₂²/b² = −1 (empty set of points),
(iv) x₁²/a² − x₂²/b² = 1, a ≥ b > 0 (a hyperbola),
(v) x₁²/a² − x₂²/b² = 0 (two intersecting lines x₂ = ±(b/a)x₁), and
(vi) x₁²/a² − x₂²/b² = −1 (a hyperbola x₂²/b² − x₁²/a² = 1).

Consider Q(x) = xᵀAx where A is a 2 × 2 symmetric matrix. If A is diagonal, the graph is in standard position. If A is not diagonal, the graph is rotated out of standard position. Finding the principal axes amounts to finding the new coordinate system with respect to which the graph is in standard position.

Pictures of an ellipse, hyperbola, rotation of axes, etc.

Definition. A quadratic form Q is:
a. positive definite if Q(x) > 0 for all x ≠ 0,
b. negative definite if Q(x) < 0 for all x ≠ 0, and
c. indefinite if Q(x) assumes both positive and negative values.

Example. Q(x) = x₁² + 2x₂² is positive definite, Q(x) = −x₁² − 2x₂² is negative definite, and Q(x) = x₁² − 2x₂² is indefinite.

Theorem 5 on page 461. Let A be an n × n symmetric matrix. Then the quadratic form xᵀAx is
a. positive definite iff the eigenvalues of A are all positive,
b. negative definite iff the eigenvalues of A are all negative, and
c. indefinite iff some eigenvalues of A are positive and some are negative.

Proof: With the orthogonal change of variable x = Py, xᵀAx = yᵀDy = λ₁y₁² + ··· + λₙyₙ². □
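Theorem 5 suggests a simple classifier based on eigenvalue signs. The helper below is an illustration, not from the text; it also reports a "semidefinite" case for borderline zero eigenvalues, which the theorem's three categories do not cover:

```python
import numpy as np

def classify(A, tol=1e-10):
    """Classify a symmetric matrix by the signs of its eigenvalues (Theorem 5)."""
    w = np.linalg.eigvalsh(A)  # eigenvalues of a symmetric matrix
    if np.all(w > tol):
        return "positive definite"
    if np.all(w < -tol):
        return "negative definite"
    if w.min() < -tol and w.max() > tol:
        return "indefinite"
    return "semidefinite"  # some eigenvalue is (numerically) zero

# The three diagonal forms from the Example above.
assert classify(np.diag([1.0, 2.0])) == "positive definite"
assert classify(np.diag([-1.0, -2.0])) == "negative definite"
assert classify(np.diag([1.0, -2.0])) == "indefinite"
```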

Definition. An n × n symmetric matrix A is called a positive definite matrix (negative definite or indefinite) if the corresponding quadratic form xᵀAx is positive definite (negative definite or indefinite).

Example 5 on page 461. Is Q(x) = 3x₁² + 2x₂² + x₃² + 4x₁x₂ + 4x₂x₃ positive definite?

Solution:

A = [ 3  2  0
      2  2  2
      0  2  1 ].

The eigenvalues are −1, 2, and 5. Hence, Q is indefinite.
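Verifying the eigenvalues in Example 5 numerically (NumPy assumed):

```python
import numpy as np

A = np.array([[3, 2, 0],
              [2, 2, 2],
              [0, 2, 1]], dtype=float)

w = np.linalg.eigvalsh(A)            # eigenvalues in ascending order
assert np.allclose(w, [-1, 2, 5])
# A mix of signs, so by Theorem 5 the form Q is indefinite.
assert w.min() < 0 < w.max()
```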
