
Exercises 1

Solutions Manual, Linear Algebra Theory And Applications

F.12 Exercises 1.6

1. Let z = 5 + i9. Find z^{-1}.

(5 + i9)^{-1} = 5/106 − (9/106) i

2. Let z = 2 + i7 and let w = 3 − i8. Find zw, z + w, z², and w/z.

zw = 62 + 5i, z + w = 5 − i, z² = −45 + 28i, and w/z = −50/53 − (37/53) i.
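These arithmetic answers are easy to spot-check with Python's built-in complex type (a quick sketch; Python writes the imaginary unit as j rather than i):

```python
# Spot-check of Exercises 1 and 2 using Python's complex numbers.
z = 5 + 9j
print(1 / z)                        # z^{-1}; compare with 5/106 - (9/106) i

z, w = 2 + 7j, 3 - 8j
print(z * w, z + w, z ** 2, w / z)  # 62+5i, 5-i, -45+28i, -50/53-(37/53) i
```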

3. Give the complete solution to x⁴ + 16 = 0.

x⁴ + 16 = 0, Solution is: (1 − i)√2, −(1 + i)√2, −(1 − i)√2, (1 + i)√2.

4. Graph the complex cube roots of −8 in the complex plane. Do the same for the four fourth roots of −16.

The cube roots are the solutions to z³ + 8 = 0, Solution is: 1 + i√3, 1 − i√3, −2.

The fourth roots are the solutions to z⁴ + 16 = 0, Solution is: (1 − i)√2, −(1 + i)√2, −(1 − i)√2, (1 + i)√2.

When you graph these, you will have three equally spaced points on the circle of radius 2 for the cube roots and four equally spaced points on the circle of radius 2 for the fourth roots. (The pictures which should result are omitted here.)
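The polar-form recipe behind both answers can be sketched in a few lines: the n-th roots of w = r e^{iθ} are r^{1/n} e^{i(θ + 2πk)/n} for k = 0, …, n − 1 (the function name below is my own):

```python
import cmath

# n distinct n-th roots of a complex number, from its polar form:
# root_k = r^(1/n) * e^{i (theta + 2*pi*k)/n}, k = 0, ..., n-1.
def nth_roots(w, n):
    r, theta = cmath.polar(w)
    return [cmath.rect(r ** (1.0 / n), (theta + 2 * cmath.pi * k) / n)
            for k in range(n)]

cube_roots = nth_roots(-8, 3)     # 1 + i*sqrt(3), -2, 1 - i*sqrt(3)
fourth_roots = nth_roots(-16, 4)  # (+-1 +- i) * sqrt(2)
print([round(abs(z), 6) for z in fourth_roots])  # all on the circle of radius 2
```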

5. If z is a complex number, show there exists ω a complex number with |ω| = 1 and ωz = |z|.

If z = 0, let ω = 1. If z ≠ 0, let ω = z̄/|z|. Then |ω| = 1 and ωz = z̄z/|z| = |z|²/|z| = |z|.

6. De Moivre's theorem says [r (cos t + i sin t)]ⁿ = rⁿ (cos nt + i sin nt) for n a positive integer. Does this formula continue to hold for all integers n, even negative integers? Explain.

Yes, it holds for all integers. First of all, it clearly holds if n = 0. Suppose now that n is a negative integer. Then −n > 0 and so

[r (cos t + i sin t)]ⁿ = 1 / [r (cos t + i sin t)]^{−n} = 1 / (r^{−n} (cos (−nt) + i sin (−nt)))

= rⁿ / (cos (nt) − i sin (nt)) = rⁿ (cos (nt) + i sin (nt)) / [(cos (nt) − i sin (nt)) (cos (nt) + i sin (nt))] = rⁿ (cos (nt) + i sin (nt))

because (cos (nt) − i sin (nt)) (cos (nt) + i sin (nt)) = 1.

Saylor URL: http://www.saylor.org/courses/ma212/ The Saylor Foundation
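A quick numerical spot-check of the negative-exponent case (the values of r, t, n below are arbitrary test choices, not from the text):

```python
import math

# De Moivre for a negative integer exponent, checked numerically.
r, t, n = 1.7, 0.9, -5
lhs = (r * (math.cos(t) + 1j * math.sin(t))) ** n
rhs = r ** n * (math.cos(n * t) + 1j * math.sin(n * t))
print(abs(lhs - rhs))  # essentially 0
```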

7. You already know formulas for cos (x + y) and sin (x + y) and these were used to prove De Moivre's theorem. Now using De Moivre's theorem, derive a formula for sin (5x) and one for cos (5x).

sin (5x) = 5 cos⁴x sin x − 10 cos²x sin³x + sin⁵x
cos (5x) = cos⁵x − 10 cos³x sin²x + 5 cos x sin⁴x
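Both quintuple-angle formulas can be verified numerically at an arbitrary angle (a sketch; x = 0.7 is just a sample value):

```python
import math

# Check the sin(5x) and cos(5x) expansions at a sample angle.
x = 0.7
s, c = math.sin(x), math.cos(x)
sin5 = 5 * c**4 * s - 10 * c**2 * s**3 + s**5
cos5 = c**5 - 10 * c**3 * s**2 + 5 * c * s**4
print(abs(sin5 - math.sin(5 * x)) < 1e-12, abs(cos5 - math.cos(5 * x)) < 1e-12)
```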

8. If z and w are two complex numbers and the polar form of z involves the angle θ while the polar form of w involves the angle φ, show that in the polar form for zw the angle involved is θ + φ. Also, show that in the polar form of a complex number z, r = |z|.

You have z = |z| (cos θ + i sin θ) and w = |w| (cos φ + i sin φ). Then when you multiply these, you get

|z| |w| (cos θ + i sin θ) (cos φ + i sin φ)
= |z| |w| (cos θ cos φ − sin θ sin φ + i (cos θ sin φ + cos φ sin θ))
= |z| |w| (cos (θ + φ) + i sin (θ + φ))
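The moduli-multiply, angles-add behavior is easy to observe with `cmath` (sample moduli and angles chosen arbitrarily):

```python
import cmath

# Moduli multiply and polar angles add under complex multiplication.
z = cmath.rect(2.0, 0.5)   # |z| = 2, angle 0.5
w = cmath.rect(3.0, 1.1)   # |w| = 3, angle 1.1
print(abs(z * w))          # ~6.0
print(cmath.phase(z * w))  # ~1.6 = 0.5 + 1.1
```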

9. Factor x³ + 8 as a product of linear factors.

x³ + 8 = 0, Solution is: 1 + i√3, 1 − i√3, −2, and so this polynomial equals

(x + 2) (x − (1 + i√3)) (x − (1 − i√3))

10. Write x³ + 27 in the form (x + 3)(x² + ax + b) where x² + ax + b cannot be factored any more using only real numbers.

x³ + 27 = (x + 3)(x² − 3x + 9)

11. Completely factor x⁴ + 16 as a product of linear factors.

x⁴ + 16 = 0, Solution is: (1 − i)√2, −(1 + i)√2, −(1 − i)√2, (1 + i)√2. These are just the fourth roots of −16. Then to factor this, you get

(x − (1 − i)√2) (x + (1 + i)√2) (x + (1 − i)√2) (x − (1 + i)√2)

12. Factor x⁴ + 16 as the product of two quadratic polynomials each of which cannot be factored further without using complex numbers.

x⁴ + 16 = (x² − 2√2 x + 4)(x² + 2√2 x + 4). You can use the information in the preceding problem. Note that (x − z)(x − z̄) has real coefficients.
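The quadratic factorization can be confirmed by expanding at a handful of sample points (a sketch; the test points are arbitrary):

```python
import math

# Check x^4 + 16 = (x^2 - 2*sqrt(2)*x + 4)(x^2 + 2*sqrt(2)*x + 4) numerically.
r = 2 * math.sqrt(2)
for x in (-2.0, -0.5, 0.0, 1.3, 3.7):
    assert abs((x*x - r*x + 4) * (x*x + r*x + 4) - (x**4 + 16)) < 1e-9
print("factorization verified at sample points")
```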

13. If z, w are complex numbers prove \overline{zw} = z̄ w̄ and then show by induction that \overline{z_1 ⋯ z_m} = z̄_1 ⋯ z̄_m. Also verify that \overline{Σ_{k=1}^m z_k} = Σ_{k=1}^m z̄_k. In words this says the conjugate of a product equals the product of the conjugates and the conjugate of a sum equals the sum of the conjugates.

\overline{(a + ib)(c + id)} = \overline{ac − bd + i (ad + bc)} = (ac − bd) − i (ad + bc)


Also z̄ w̄ = (a − ib)(c − id) = ac − bd − i (ad + bc), which is the same thing. Thus it holds for a product of two complex numbers. Now suppose it is true for the product of n complex numbers. Then

\overline{z_1 ⋯ z_{n+1}} = \overline{z_1 ⋯ z_n} · z̄_{n+1}

and now, by induction this equals

z̄_1 ⋯ z̄_n z̄_{n+1}

As to sums, this is even easier.

\overline{Σ_{j=1}^n (x_j + iy_j)} = \overline{Σ_{j=1}^n x_j + i Σ_{j=1}^n y_j} = Σ_{j=1}^n x_j − i Σ_{j=1}^n y_j = Σ_{j=1}^n (x_j − iy_j) = Σ_{j=1}^n \overline{(x_j + iy_j)}.

14. Suppose p(x) = a_n xⁿ + a_{n−1} x^{n−1} + ⋯ + a_1 x + a_0 where all the a_k are real numbers. Suppose also that p(z) = 0 for some z ∈ C. Show it follows that p(z̄) = 0 also.

You just use the above problem. If p(z) = 0, then taking conjugates and using that the a_k are real,

0 = \overline{p(z)} = \overline{a_n zⁿ + a_{n−1} z^{n−1} + ⋯ + a_1 z + a_0}
= a_n z̄ⁿ + a_{n−1} z̄^{n−1} + ⋯ + a_1 z̄ + a_0
= p(z̄)
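This conjugate-pair behavior is visible numerically; a sketch with an assumed sample polynomial (numpy is assumed available):

```python
import numpy as np

# For a polynomial with real coefficients, p(conj(z)) = 0 whenever p(z) = 0.
coeffs = [1, -2, 1, -2]   # p(x) = x^3 - 2x^2 + x - 2 = (x - 2)(x^2 + 1)
for z in np.roots(coeffs):
    assert abs(np.polyval(coeffs, np.conj(z))) < 1e-8
print("the conjugate of every root is again a root")
```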

15. I claim that 1 = −1. Here is why.

−1 = i² = √(−1) √(−1) = √((−1)²) = √1 = 1.

This is clearly a remarkable result but is there something wrong with it? If so, what is wrong?

Something is wrong. There is no single √(−1); each nonzero complex number has two square roots, and the rule √a √b = √(ab) fails once you leave the nonnegative real numbers.

16. De Moivre's theorem is really a grand thing. I plan to use it now for rational exponents, not just integers.

1 = 1^{1/4} = (cos 2π + i sin 2π)^{1/4} = cos (π/2) + i sin (π/2) = i.

Therefore, squaring both sides it follows 1 = −1 as in the previous problem. What does this tell you about De Moivre's theorem? Is there a profound difference between raising numbers to integer powers and raising numbers to non integer powers?

It doesn't work for non-integer exponents. This is because there are four fourth roots of 1, and the formula singles out only one of them.

17. Show that C cannot be considered an ordered field. Hint: Consider i² = −1.

It is clear that 1 > 0 because 1 = 1². (In general a² > 0 for a ≠ 0. This is clear if a > 0. If a < 0, then adding −a to both sides yields 0 < −a. Also recall that −a = (−1) a and that (−1)² = 1. Therefore, (−a)² = (−1)² a² = a² > 0.) Now if C could be ordered, then −1 = i² would be a square and hence −1 > 0, but this is a problem because it implies 0 > 1 = 1² > 0, a contradiction.


18. Say a + ib < x + iy if a < x or if a = x, then b < y. This is called the lexicographic order. Show that any two different complex numbers can be compared with this order. What goes wrong in terms of the other requirements for an ordered field?

From the definition of this funny order, 0 < i, and so if this were an order compatible with the field operations, you would need to have 0 < i² = −1. Now add 1 to both sides and obtain 0 > 1 = 1² > 0, a contradiction.

19. With the order of Problem 18, consider for n ∈ N the complex numbers 1 − 1/n. Show that with the lexicographic order just described, each of 1 − im is an upper bound to all these numbers. Therefore, this is a set which is "bounded above" but has no least upper bound with respect to the lexicographic order on C.

This follows from the definition: 1 − im > 1 − 1/n for each m, because the real part 1 exceeds 1 − 1/n. Therefore, the numbers 1 − 1/n form a nonempty set which has upper bounds but no least upper bound.

F.13 Exercises 1.11

1. Give the complete solution to the system of equations 3x − y + 4z = 6, y + 8z = 0, and −2x + y = −4.

x = 2 − 4t, y = −8t, z = t.

2. Give the complete solution to the system of equations x + 3y + 3z = 3, 3x + 2y + z = 9, and −4x + z = −9.

x = y = 2, z = −1.

3. Consider the system −5x + 2y − z = 0 and −5x − 2y − z = 0. Both equations equal zero and so −5x + 2y − z = −5x − 2y − z, which is equivalent to y = 0. Thus x and z can equal anything. But when x = 1, z = −4, and y = 0 are plugged in to the equations, it doesn't work. Why?

Setting the two left sides equal is not a valid row operation: it discards information. The system still requires −5x − z = 0, so x and z cannot in fact be arbitrary.

4. Give the complete solution to the system of equations x + 2y + 6z = 5, 3x + 2y + 6z = 7, −4x + 5y + 15z = −7.

No solution.

5. Give the complete solution to the system of equations

x + 2y + 3z = 5, 3x + 2y + z = 7,
−4x + 5y + z = −7, x + 3z = 5.

x = 2, y = 0, z = 1.
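The stated point can be checked against all four equations at once (a sketch; numpy is assumed available):

```python
import numpy as np

# Exercise 5: four equations in three unknowns, with the stated solution
# substituted back in.
A = np.array([[1, 2, 3],
              [3, 2, 1],
              [-4, 5, 1],
              [1, 0, 3]], dtype=float)
b = np.array([5, 7, -7, 5], dtype=float)
x = np.array([2, 0, 1], dtype=float)
print(np.allclose(A @ x, b))  # True
```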

6. Give the complete solution of the system of equations

x + 2y + 3z = 5, 3x + 2y + 2z = 7,
−4x + 5y + 5z = −7, x = 5.

No solution.


7. Give the complete solution of the system of equations

x + y + 3z = 2, 3x − y + 5z = 6,
−4x + 9y + z = −8, x + 5y + 7z = 2.

x = 2 − 2t, y = −t, z = t.

8. Determine a such that there are infinitely many solutions and then find them. Next determine a such that there are no solutions. Finally determine which values of a correspond to a unique solution. The system of equations is

3za² − 3a + x + y + 1 = 0
3x − a − y + z (a² + 4) − 5 = 0
za² − a − 4x + 9y + 9 = 0

If a = 1, there are infinitely many solutions of the form x = 2 − 2t, y = −t, z = t. If a = −1, then there are no solutions. If a is anything else, there is a unique solution:

x = 2a/(a + 1), y = −1/(a + 1), z = 1/(a + 1)

9. Find the solutions to the following system of equations for x, y, z, w.

y + z = 2, z + w = 0, y − 4z − 5w = 2, 2y + z − w = 4

x = t, y = s + 2, z = −s, w = s

10. Find all solutions to the following equations.

x + y + z = 2, z + w = 0,
2x + 2y + z − w = 4, x + y − 4z − 5w = 2

x = −t + s + 2, y = t, z = −s, w = s, where s, t are each in F.

F.14 Exercises 1.14

1. Verify all the properties 1.11-1.18.

2. Compute 5 (1, 2 + 3i, 3, −2) + 6 (2 − i, 1, −2, 7).

3. Draw a picture of the points in R² which are determined by the following ordered pairs.

(a) (1, 2)
(b) (−2, −2)
(c) (−2, 3)
(d) (2, −5)

4. Does it make sense to write (1, 2) + (2, 3, 1)? Explain.

5. Draw a picture of the points in R³ which are determined by the following ordered triples. If you have trouble drawing this, describe it in words.

(a) (1, 2, 0)
(b) (−2, −2, 1)
(c) (−2, 3, −2)


F.15 Exercises 1.17

1. Show that a · b = (1/4) [ |a + b|² − |a − b|² ].

2. Prove from the axioms of the inner product the parallelogram identity, |a + b|² + |a − b|² = 2 |a|² + 2 |b|².

3. For a, b ∈ Rⁿ, define a · b ≡ Σ_{k=1}^n β_k a_k b_k where β_k > 0 for each k. Show this satisfies the axioms of the inner product. What does the Cauchy Schwarz inequality say in this case?

The product satisfies all axioms for the inner product, so the Cauchy Schwarz inequality holds: (Σ_k β_k a_k b_k)² ≤ (Σ_k β_k a_k²)(Σ_k β_k b_k²).

4. In Problem 3 above, suppose you only know β_k ≥ 0. Does the Cauchy Schwarz inequality still hold? If so, prove it.

Yes, it does. You don't need the part which says that the only way a · a = 0 is for a = 0 in the argument for the Cauchy Schwarz inequality.

5. Let f, g be continuous functions and define

f · g ≡ ∫₀¹ f(t) g(t) dt

Show this satisfies the axioms of an inner product if you think of continuous functions in the place of a vector in Fⁿ. What does the Cauchy Schwarz inequality say in this case?

The only part which is not obvious for the axioms is the one which says that if ∫₀¹ |f|² = 0 then f = 0. However, this is obvious from continuity considerations.

6. Show that if f is a real valued continuous function, (∫ₐᵇ f(t) dt)² ≤ (b − a) ∫ₐᵇ f(t)² dt.

By the Cauchy Schwarz inequality applied to f and the constant function 1,

|∫ₐᵇ f(t) dt| ≤ (∫ₐᵇ 1² dt)^{1/2} (∫ₐᵇ |f(t)|² dt)^{1/2} = (b − a)^{1/2} (∫ₐᵇ |f(t)|² dt)^{1/2}

which yields the desired inequality when you square both sides.
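The inequality is easy to observe numerically for a sample integrand; a sketch using a simple midpoint rule (the function and interval are arbitrary choices, not from the text):

```python
import math

# Midpoint-rule check of (int_a^b f)^2 <= (b - a) * int_a^b f^2.
a, b, n = 0.0, 2.0, 100000
f = lambda t: math.exp(-t) * math.cos(3 * t)
h = (b - a) / n
I1 = I2 = 0.0
for i in range(n):
    x = a + (i + 0.5) * h
    I1 += f(x) * h        # approximates the integral of f
    I2 += f(x) ** 2 * h   # approximates the integral of f^2
print(I1 ** 2 <= (b - a) * I2)  # True
```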


F.16 Exercises 2.2

1. In 2.1 - 2.8 describe −A and 0.

2. Let A be an n × n matrix. Show A equals the sum of a symmetric and a skew symmetric matrix.

A = (A + A^T)/2 + (A − A^T)/2

3. Show every skew symmetric matrix has all zeros down the main diagonal. The main diagonal consists of every entry of the matrix which is of the form a_{ii}. It runs from the upper left down to the lower right.

You know that A_{ij} = −A_{ji}. Let j = i to conclude that A_{ii} = −A_{ii} and so A_{ii} = 0.
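Both facts can be illustrated on a sample matrix (a sketch; numpy assumed available):

```python
import numpy as np

# Symmetric/skew-symmetric decomposition A = (A + A^T)/2 + (A - A^T)/2,
# and the zero diagonal of the skew part.
A = np.arange(9.0).reshape(3, 3)
S = (A + A.T) / 2          # symmetric part
K = (A - A.T) / 2          # skew-symmetric part
assert np.allclose(S, S.T) and np.allclose(K, -K.T)
assert np.allclose(S + K, A)
print(np.diag(K))          # [0. 0. 0.]
```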

4. Using only the properties 2.1 - 2.8 show −A is unique.

Suppose that B also works. Then

−A = −A+ (A+B) = (−A+A) +B = B

5. Using only the properties 2.1 - 2.8 show 0 is unique.

If 0′ is another additive identity, then 0′ = 0 + 0′ = 0

6. Using only the properties 2.1 - 2.8 show 0A = 0. Here the 0 on the left is the scalar 0 and the 0 on the right is the zero for m × n matrices.

0A = (0 + 0) A = 0A + 0A. Now add the additive inverse of 0A to both sides.

7. Using only the properties 2.1 - 2.8 and previous problems show (−1) A = −A.

0 = 0A = (1 + (−1)) A = A + (−1) A. Hence, (−1) A is the unique additive inverse of A. Thus −A = (−1) A.

8. Prove 2.17.

((AB)^T)_{ij} ≡ (AB)_{ji} = Σ_k A_{jk} B_{ki} = Σ_k (B^T)_{ik} (A^T)_{kj} = (B^T A^T)_{ij}. Hence the formula holds.

9. Prove that I_m A = A where A is an m × n matrix.

(I_m A)_{ij} ≡ Σ_k I_{ik} A_{kj} = A_{ij} and so I_m A = A.

10. Let A be a real m × n matrix and let x ∈ Rⁿ and y ∈ Rᵐ. Show (Ax, y)_{Rᵐ} = (x, A^T y)_{Rⁿ} where (·, ·)_{Rᵏ} denotes the dot product in Rᵏ.

(Ax, y) = Σ_i (Ax)_i y_i = Σ_i Σ_k A_{ik} x_k y_i
(x, A^T y) = Σ_k x_k Σ_i (A^T)_{ki} y_i = Σ_k Σ_i x_k A_{ik} y_i,

the same as above. Hence the two are equal.

11. Use the result of Problem 10 to verify directly that (AB)^T = B^T A^T without making any reference to subscripts.

((AB)^T x, y) ≡ (x, (AB) y) = (A^T x, By) = (B^T A^T x, y). Since this holds for every x, y, you have for all y

((AB)^T x − B^T A^T x, y) = 0

Let y = (AB)^T x − B^T A^T x. Then (AB)^T x = B^T A^T x, and since x is arbitrary, the result follows.


12. Let x = (−1, −1, 1)^T and y = (0, 1, 2)^T. Find x^T y and x y^T if possible.

x^T y = (−1)(0) + (−1)(1) + (1)(2) = 1

x y^T = [0 −1 −2; 0 −1 −2; 0 1 2]

13. Give an example of matrices A, B, C such that B ≠ C, A ≠ 0, and yet AB = AC.

[1 1; 1 1] [1 −1; −1 1] = [0 0; 0 0]
[1 1; 1 1] [−1 1; 1 −1] = [0 0; 0 0]

14. Let A = [1 1; −2 −1; 1 2], B = [1 −1 −2; 2 1 −2], and C = [1 1 −3; −1 2 0; −3 −1 0]. Find if possible.

(a) AB
(b) BA
(c) AC
(d) CA
(e) CB
(f) BC

15. Consider the following digraph. (The picture of the digraph on the four nodes 1, 2, 3, 4 is omitted here.)

Write the matrix associated with this digraph and find the number of ways to go from 3 to 4 in three steps.

The matrix for the digraph is

A = [0 1 1 0; 1 0 0 1; 1 1 0 2; 0 1 0 1]

A³ = [1 5 2 4; 3 2 0 5; 4 5 1 8; 1 3 1 3]

Since the (3, 4) entry of A³ is 8, there are 8 ways to do this.
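The same walk count can be reproduced by a matrix power (a sketch; numpy assumed available, with 0-based indexing):

```python
import numpy as np

# Walk counting: entry (i, j) of A^3 counts three-step walks from
# node i+1 to node j+1 in the digraph of Exercise 15.
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 1, 0, 2],
              [0, 1, 0, 1]])
A3 = np.linalg.matrix_power(A, 3)
print(A3[2, 3])  # 8 three-step walks from node 3 to node 4
```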

16. Show that if A^{-1} exists for an n × n matrix, then it is unique. That is, if BA = I and AB = I, then B = A^{-1}.

From the given equations, multiply on the right by A^{-1}. Then B = A^{-1}.

17. Show (AB)^{-1} = B^{-1} A^{-1}.

AB B^{-1} A^{-1} = A I A^{-1} = I
B^{-1} A^{-1} AB = B^{-1} I B = I

Then by the definition of the inverse and its uniqueness, it follows that (AB)^{-1} exists and (AB)^{-1} = B^{-1} A^{-1}.

18. Show that if A is an invertible n × n matrix, then so is A^T and (A^T)^{-1} = (A^{-1})^T.

A^T (A^{-1})^T = (A^{-1} A)^T = I
(A^{-1})^T A^T = (A A^{-1})^T = I

Then from the definition of the inverse and its uniqueness, it follows that (A^T)^{-1} exists and equals (A^{-1})^T.

19. Show that if A is an n × n invertible matrix and x is an n × 1 matrix such that Ax = b for b an n × 1 matrix, then x = A^{-1} b.

Multiply both sides on the left by A^{-1}.

20. Give an example of a matrix A such that A² = I and yet A ≠ I and A ≠ −I.

[0 1; 1 0]² = [1 0; 0 1]

21. Give an example of matrices A, B such that neither A nor B equals zero and yet AB = 0.

[1 1; 1 1] [1 −1; −1 1] = [0 0; 0 0]

22. Write (x1 − x2 + 2x3, 2x3 + x1, 3x3, 3x4 + 3x2 + x1)^T in the form A (x1, x2, x3, x4)^T where A is an appropriate matrix.

[1 −1 2 0; 1 0 2 0; 0 0 3 0; 1 3 0 3] (x1, x2, x3, x4)^T = (x1 − x2 + 2x3, x1 + 2x3, 3x3, x1 + 3x2 + 3x4)^T

23. Give another example other than the one given in this section of two square matrices A and B such that AB ≠ BA.

Almost anything works.

[1 2; 3 4] [1 2; 2 0] = [5 2; 11 6]
[1 2; 2 0] [1 2; 3 4] = [7 10; 2 4]

24. Suppose A and B are square matrices of the same size. Which of the following are correct?

(a) (A − B)² = A² − 2AB + B² Not this.
(b) (AB)² = A²B² Not this.
(c) (A + B)² = A² + 2AB + B² Not this.
(d) (A + B)² = A² + AB + BA + B² This is all right.
(e) A²B² = A (AB) B This is all right.
(f) (A + B)³ = A³ + 3A²B + 3AB² + B³ Not this.
(g) (A + B)(A − B) = A² − B² Not this.
(h) None of the above. They are all wrong.
(i) All of the above. They are all right.

25. Let A = [−1 −1; 3 3]. Find all 2 × 2 matrices B such that AB = 0.

[−1 −1; 3 3] [x y; z w] = [−x − z, −y − w; 3x + 3z, 3y + 3w]

so B = [−z −w; z w], z, w arbitrary.

26. Prove that if A^{-1} exists and Ax = 0 then x = 0.

Multiply on the left by A^{-1}.

27. Let A = [1 2 3; 2 1 4; 1 0 2]. Find A^{-1} if possible. If A^{-1} does not exist, determine why.

[1 2 3; 2 1 4; 1 0 2]^{-1} = [−2 4 −5; 0 1 −2; 1 −2 3]

28. Let A = [1 0 3; 2 3 4; 1 0 2]. Find A^{-1} if possible. If A^{-1} does not exist, determine why.

[1 0 3; 2 3 4; 1 0 2]^{-1} = [−2 0 3; 0 1/3 −2/3; 1 0 −1]

29. Let A = [1 2 3; 2 1 4; 4 5 10]. Find A^{-1} if possible. If A^{-1} does not exist, determine why.

[1 2 3; 2 1 4; 4 5 10], row echelon form: [1 0 5/3; 0 1 2/3; 0 0 0]. A has no inverse.
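The singularity is also visible numerically: the third row equals 2·(row 1) + (row 2), so the determinant is zero and the rank is only 2 (a sketch; numpy assumed available):

```python
import numpy as np

# Exercise 29's matrix is singular: row 3 = 2*row 1 + row 2.
A = np.array([[1, 2, 3],
              [2, 1, 4],
              [4, 5, 10]], dtype=float)
print(abs(np.linalg.det(A)) < 1e-9, np.linalg.matrix_rank(A))  # True 2
```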

30. Let A = [1 2 0 2; 1 1 2 0; 2 1 −3 2; 1 2 1 2]. Find A^{-1} if possible. If A^{-1} does not exist, determine why.

[1 2 0 2; 1 1 2 0; 2 1 −3 2; 1 2 1 2]^{-1} = [−1 1/2 1/2 1/2; 3 1/2 −1/2 −5/2; −1 0 0 1; −2 −3/4 1/4 9/4]

F.17 Exercises 2.7

1. Show the map T : Rⁿ → Rᵐ defined by T(x) = Ax where A is an m × n matrix and x is an n × 1 column vector is a linear transformation.

This follows from matrix multiplication rules: A (ax + by) = aAx + bAy.

2. Find the matrix for the linear transformation which rotates every vector in R² through an angle of π/3.

[cos (π/3) −sin (π/3); sin (π/3) cos (π/3)] = [1/2, −√3/2; √3/2, 1/2]

3. Find the matrix for the linear transformation which rotates every vector in R² through an angle of π/4.

[cos (π/4) −sin (π/4); sin (π/4) cos (π/4)] = [√2/2, −√2/2; √2/2, √2/2]

4. Find the matrix for the linear transformation which rotates every vector in R² through an angle of −π/3.

[cos (−π/3) −sin (−π/3); sin (−π/3) cos (−π/3)] = [1/2, √3/2; −√3/2, 1/2]

5. Find the matrix for the linear transformation which rotates every vector in R² through an angle of 2π/3.

[cos (2π/3) −sin (2π/3); sin (2π/3) cos (2π/3)] = [−1/2, −√3/2; √3/2, −1/2]

6. Find the matrix for the linear transformation which rotates every vector in R² through an angle of π/12. Hint: Note that π/12 = π/3 − π/4.

[cos (π/3) −sin (π/3); sin (π/3) cos (π/3)] [cos (−π/4) −sin (−π/4); sin (−π/4) cos (−π/4)]
= [(√6 + √2)/4, −(√6 − √2)/4; (√6 − √2)/4, (√6 + √2)/4]
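The composition argument behind the hint is easy to confirm numerically (a sketch; numpy assumed available and `rot` is my own helper name):

```python
import numpy as np

# Rotating by pi/3 and then by -pi/4 is the same as rotating by pi/12.
def rot(t):
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t),  np.cos(t)]])

print(np.allclose(rot(np.pi / 3) @ rot(-np.pi / 4), rot(np.pi / 12)))  # True
```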

7. Find the matrix for the linear transformation which rotates every vector in R² through an angle of 2π/3 and then reflects across the x axis.

[1 0; 0 −1] [cos (2π/3) −sin (2π/3); sin (2π/3) cos (2π/3)] = [−1/2, −√3/2; −√3/2, 1/2]

8. Find the matrix for the linear transformation which rotates every vector in R² through an angle of π/3 and then reflects across the x axis.

[1 0; 0 −1] [cos (π/3) −sin (π/3); sin (π/3) cos (π/3)] = [1/2, −√3/2; −√3/2, −1/2]

9. Find the matrix for the linear transformation which rotates every vector in R² through an angle of π/4 and then reflects across the x axis.

[1 0; 0 −1] [cos (π/4) −sin (π/4); sin (π/4) cos (π/4)] = [√2/2, −√2/2; −√2/2, −√2/2]

10. Find the matrix for the linear transformation which rotates every vector in R² through an angle of π/6 and then reflects across the x axis followed by a reflection across the y axis.

[−1 0; 0 1] [1 0; 0 −1] [cos (π/6) −sin (π/6); sin (π/6) cos (π/6)] = [−√3/2, 1/2; −1/2, −√3/2]

11. Find the matrix for the linear transformation which reflects every vector in R² across the x axis and then rotates every vector through an angle of π/4.

[cos (π/4) −sin (π/4); sin (π/4) cos (π/4)] [1 0; 0 −1] = [√2/2, √2/2; √2/2, −√2/2]

12. Find the matrix for the linear transformation which rotates every vector in R² through an angle of π/4 and next reflects every vector across the x axis. Compare with the above problem.

[1 0; 0 −1] [cos (π/4) −sin (π/4); sin (π/4) cos (π/4)] = [√2/2, −√2/2; −√2/2, −√2/2]

13. Find the matrix for the linear transformation which reflects every vector in R² across the x axis and then rotates every vector through an angle of π/6.

[cos (π/6) −sin (π/6); sin (π/6) cos (π/6)] [1 0; 0 −1] = [√3/2, 1/2; 1/2, −√3/2]

14. Find the matrix for the linear transformation which reflects every vector in R² across the y axis and then rotates every vector through an angle of π/6.

[cos (π/6) −sin (π/6); sin (π/6) cos (π/6)] [−1 0; 0 1] = [−√3/2, −1/2; −1/2, √3/2]


15. Find the matrix for the linear transformation which rotates every vector in R² through an angle of 5π/12. Hint: Note that 5π/12 = 2π/3 − π/4.

[cos (2π/3) −sin (2π/3); sin (2π/3) cos (2π/3)] [cos (−π/4) −sin (−π/4); sin (−π/4) cos (−π/4)]
= [(√6 − √2)/4, −(√6 + √2)/4; (√6 + √2)/4, (√6 − √2)/4]

16. Find the matrix for proj_u(v) where u = (1, −2, 3)^T.

T e_i = (e_i · u / |u|²) u, so the matrix has i-th column (e_i · u / |u|²) u:

(1/14) [1 −2 3; −2 4 −6; 3 −6 9]

17. Find the matrix for proj_u(v) where u = (1, 5, 3)^T.

(1/35) [1 5 3; 5 25 15; 3 15 9]

18. Find the matrix for proj_u(v) where u = (1, 0, 3)^T.

(1/10) [1 0 3; 0 0 0; 3 0 9]
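All three projection matrices have the form u u^T/(u^T u); a sketch for the u of Exercise 16, which also checks idempotence (numpy assumed available):

```python
import numpy as np

# Projection onto span(u): P = u u^T / (u^T u). Here u = (1, -2, 3)^T,
# so P = (1/14) u u^T.
u = np.array([1.0, -2.0, 3.0]).reshape(3, 1)
P = (u @ u.T) / (u.T @ u).item()
assert np.allclose(P @ P, P)   # projecting twice changes nothing
print(np.allclose(P @ u, u))   # True: u projects to itself
```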

19. Give an example of a 2 × 2 matrix A which has all its entries nonzero and satisfies A² = A. Such a matrix is called idempotent.

You know it can't be invertible. So try A = [a a; b b]:

[a a; b b]² = [a² + ab, a² + ab; b² + ab, b² + ab]

Let a² + ab = a and b² + ab = b. A solution which yields a nonzero matrix is [2 2; −1 −1].

20. Let A be an m × n matrix and let B be an n × m matrix where n < m. Show that AB cannot have an inverse.

This follows right away from Theorem 2.3.8. This theorem says there exists a vector x ≠ 0 such that Bx = 0. Therefore, ABx = 0 also and so AB cannot be invertible.

21. Find ker (A) for

A = [1 2 3 2 1; 0 2 1 1 2; 1 4 4 3 3; 0 2 1 1 2].

Recall ker (A) is just the set of solutions to Ax = 0. Row reducing the augmented matrix [1 2 3 2 1 | 0; 0 2 1 1 2 | 0; 1 4 4 3 3 | 0; 0 2 1 1 2 | 0] gives

[1 0 2 1 −1 | 0; 0 1 1/2 1/2 1 | 0; 0 0 0 0 0 | 0; 0 0 0 0 0 | 0]

A solution is x3 = t1, x4 = t2, x5 = t3, x2 = −(1/2) t1 − (1/2) t2 − t3, x1 = −2t1 − t2 + t3 where the ti are arbitrary.
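The kernel description can be checked by setting one parameter to 1 at a time and verifying that each resulting vector is annihilated by A (a sketch; numpy assumed available):

```python
import numpy as np

# Each vector below is the kernel solution with one parameter set to 1.
A = np.array([[1, 2, 3, 2, 1],
              [0, 2, 1, 1, 2],
              [1, 4, 4, 3, 3],
              [0, 2, 1, 1, 2]], dtype=float)
basis = [np.array([-2.0, -0.5, 1, 0, 0]),   # t1 = 1
         np.array([-1.0, -0.5, 0, 1, 0]),   # t2 = 1
         np.array([ 1.0, -1.0, 0, 0, 1])]   # t3 = 1
print(all(np.allclose(A @ v, 0) for v in basis))  # True
```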

22. If A is a linear transformation and A x_p = b, show that the general solution to the equation Ax = b is of the form x_p + y where y ∈ ker (A). By this I mean to show that whenever Az = b there exists y ∈ ker (A) such that x_p + y = z.

If Az = b, then A (z − x_p) = Az − A x_p = b − b = 0, so y ≡ z − x_p ∈ ker (A) and x_p + y = z.

23. Using Problem 21, find the general solution to the following linear system.

[1 2 3 2 1; 0 2 1 1 2; 1 4 4 3 3; 0 2 1 1 2] (x1, x2, x3, x4, x5)^T = (11, 7, 18, 7)^T

(−2t1 − t2 + t3, −(1/2) t1 − (1/2) t2 − t3, t1, t2, t3)^T + (4, 7/2, 0, 0, 0)^T, ti ∈ F

That second vector is a particular solution.

24. Using Problem 21, find the general solution to the following linear system.

[1 2 3 2 1; 0 2 1 1 2; 1 4 4 3 3; 0 2 1 1 2] (x1, x2, x3, x4, x5)^T = (6, 7, 13, 7)^T

(−2t1 − t2 + t3, −(1/2) t1 − (1/2) t2 − t3, t1, t2, t3)^T + (−1, 7/2, 0, 0, 0)^T, ti ∈ F

25. Show that the function T_u defined by T_u(v) ≡ v − proj_u(v) is also a linear transformation.

This is the difference of two linear transformations, so it is linear.

26. If u = (1, 2, 3)^T and T_u is given in the above problem, find the matrix A_u which satisfies A_u x = T_u(x).

[1 0 0; 0 1 0; 0 0 1] − (1/14) [1 2 3; 2 4 6; 3 6 9] = [13/14 −1/7 −3/14; −1/7 5/7 −3/7; −3/14 −3/7 5/14]


27. ↑Suppose V is a subspace of Fⁿ and T : V → Fᵖ is a nonzero linear transformation. Show that there exists a basis for Im (T) ≡ T (V),

{Tv1, · · · , Tvm}

and that in this situation,

{v1, · · · , vm}

is linearly independent.

Im (T) is a subspace of Fᵖ. Therefore, it has a basis {Tv1, · · · , Tvm} for some m ≤ p. Say

Σ_{i=1}^m ci vi = 0

Then apply T to both sides:

Σ_{i=1}^m ci Tvi = T0 = 0

Since the Tvi are a basis, each ci = 0. (T0 = T (0 + 0) = T0 + T0 and so T0 = 0.)

28. ↑In the situation of Problem 27 where V is a subspace of Fⁿ, show that there exists {z1, · · · , zr} a basis for ker (T). (Recall Theorem 2.4.12. Since ker (T) is a subspace, it has a basis.) Now for an arbitrary Tv ∈ T (V), explain why

Tv = a1 Tv1 + · · · + am Tvm

and why this implies

v − (a1 v1 + · · · + am vm) ∈ ker (T).

Then explain why V = span (v1, · · · , vm, z1, · · · , zr).

ker (T) is also a subspace so it has a basis {z1, · · · , zr} for some r ≤ n. Now let the basis for T (V) be as above. Then for v an arbitrary vector, there exist unique scalars ai such that

Tv = a1 Tv1 + · · · + am Tvm

Then it follows that

T (v − (a1 v1 + · · · + am vm)) = Tv − (a1 Tv1 + · · · + am Tvm) = 0

Hence the vector v − (a1 v1 + · · · + am vm) is in ker (T) and so there exist scalars bi such that

v − (a1 v1 + · · · + am vm) = Σ_{i=1}^r bi zi

and therefore, V = span (v1, · · · , vm, z1, · · · , zr).

29. ↑In the situation of the above problem, show {v1, · · · , vm, z1, · · · , zr} is a basis for V and therefore, dim (V) = dim (ker (T)) + dim (T (V)).

The claim that {v1, · · · , vm, z1, · · · , zr} is a basis will be complete if it is shown that these vectors are linearly independent. Suppose then that

Σ_i ai zi + Σ_j bj vj = 0.

Then apply T to both sides. You get Σ_j bj Tvj = 0 and so each bj = 0. Then use the linear independence of the zi to conclude that each ai = 0.


30. ↑Let A be a linear transformation from V to W and let B be a linear transformation from W to U where V, W, U are all subspaces of some Fᵖ. Explain why

A (ker (BA)) ⊆ ker (B), ker (A) ⊆ ker (BA).

(A diagram showing A carrying ker (BA) into ker (B), with ker (A) ⊆ ker (BA), accompanies this solution in the original.)

If x ∈ ker (BA), then BAx = 0 and so B (Ax) = 0, that is, Ax ∈ ker (B). It follows that

A (ker (BA)) ⊆ ker (B)

The second inclusion is obvious because if x is sent to 0 by A, then B will send Ax to 0.

31. ↑Let {x1, · · · , xn} be a basis of ker (A) and let {Ay1, · · · , Aym} be a basis of A (ker (BA)). Let z ∈ ker (BA). Explain why

Az ∈ span {Ay1, · · · , Aym}

and why there exist scalars ai such that

A (z − (a1 y1 + · · · + am ym)) = 0

and why it follows z − (a1 y1 + · · · + am ym) ∈ span {x1, · · · , xn}. Now explain why

ker (BA) ⊆ span {x1, · · · , xn, y1, · · · , ym}

and so

dim (ker (BA)) ≤ dim (ker (B)) + dim (ker (A)).

This important inequality is due to Sylvester. Show that equality holds if and only if A (ker BA) = ker (B).

Let {Ax1, · · · , Axr} be a basis for A (ker (BA)). Then let y ∈ ker (BA). It follows that

Ay ∈ A (ker (BA)).

Then there are scalars ai such that

Ay = Σ_{i=1}^r ai Axi

Hence y − Σ_{i=1}^r ai xi ∈ ker (A). Let {z1, · · · , zm} be a basis for ker (A). This shows that

dim ker (BA) ≤ m + r = dim ker (A) + dim A (ker (BA)) ≤ dim ker (A) + dim ker (B)

It is interesting to note that the set of vectors {z1, · · · , zm, x1, · · · , xr} is independent. To see this, say

Σ_j aj zj + Σ_i bi xi = 0

Then apply A to both sides and obtain each bi = 0. Then it follows that each aj is also zero because of the independence of the zj. Thus when A (ker BA) = ker (B),

dim ker (BA) = dim ker (A) + dim (A ker (BA))

32. Generalize the result of the previous problem to any finite product of linear mappings.

dim (ker Π_{i=1}^s Ai) ≤ Σ_{i=1}^s dim ker (Ai). This follows by induction.

33. If W ⊆ V for W,V two subspaces of Fn and if dim (W ) = dim (V ) , show W = V .

Let a basis for W be w1, · · · , wr. Then if there existed v ∈ V \ W, you could add v to the basis and obtain a linearly independent set of vectors of V, which implies that the dimension of V is at least r + 1, contrary to assumption.

34. Let V be a subspace of F^n and let V1, · · · , Vm be subspaces, each contained in V. Then

V = V1 ⊕ · · · ⊕ Vm (6.27)

if every v ∈ V can be written in a unique way in the form

v = v1 + · · ·+ vm

where each vi ∈ Vi. This is called a direct sum. If this uniqueness condition does not hold, then one writes

V = V1 + · · ·+ Vm

and this symbol means all vectors of the form

v1 + · · ·+ vm, vj ∈ Vj for each j.

Show this is equivalent to saying that if

0 = v1 + · · ·+ vm, vj ∈ Vj for each j,

then each vj = 0. Next show that in this situation, if βi = {u^i_1, · · · , u^i_{m_i}} is a basis for Vi, then {β1, · · · , βm} is a basis for V.

span (β1, · · · , βm) is given to equal V. It only remains to verify that β1, · · · , βm is linearly independent. Suppose vi ∈ Vi with

∑_i v_i = 0

It is also true that ∑_i 0 = 0, and so each v_i = 0 by assumption. Now suppose

∑_i ∑_j c_{ij} u^i_j = 0

Then from what was just observed, for each i,

∑_j c_{ij} u^i_j = 0

and now, since these u^i_j form a basis for Vi, it follows that each c_{ij} = 0 for each j and each i. Thus β1, · · · , βm is a basis.


35. ↑Suppose you have finitely many linear mappings L1, L2, · · · , Lm which map V to V, where V is a subspace of F^n, and suppose they commute. That is, LiLj = LjLi for all i, j. Also suppose Lk is one to one on ker (Lj) whenever j ≠ k. Letting P denote the product of these linear transformations, P = L1L2 · · · Lm, first show

ker (L1) + · · ·+ ker (Lm) ⊆ ker (P )

Next show Lj : ker (Li)→ ker (Li) . Then show

ker (L1) + · · ·+ ker (Lm) = ker (L1)⊕ · · · ⊕ ker (Lm) .

Using Sylvester’s theorem, and the result of Problem 33, show

ker (P ) = ker (L1)⊕ · · · ⊕ ker (Lm)

Hint: By Sylvester’s theorem and the above problem,

dim (ker (P )) ≤∑

i

dim (ker (Li))

= dim (ker (L1)⊕ · · · ⊕ ker (Lm)) ≤ dim (ker (P ))

Now consider Problem 33.

First note that, since these operators commute, it follows that Lk : ker (Li) → ker (Li). Let vi ∈ ker (Li) and consider ∑_i v_i. It is obvious, since the linear transformations commute, that P (∑_i v_i) = 0. Thus

ker (L1) + · · ·+ ker (Lm) ⊆ ker (P )

However, by Sylvester's theorem,

dim ker (P) ≤ ∑_i dim ker (L_i)

If v_i ∈ ker (L_i) and ∑_i v_i = 0, then apply ∏_{i≠k} L_i to both sides. This yields

(∏_{i≠k} L_i) v_k = 0

Since each of these Li is one to one on ker (Lk) , it follows that vk = 0. Thus

ker (L1) + · · ·+ ker (Lm) = ker (L1)⊕ · · · ⊕ ker (Lm)

Now it follows that a basis for ker (L1) + · · · + ker (Lm) has ∑_i dim ker (L_i) vectors in it.

dim ker (P) ≤ ∑_i dim ker (L_i) = dim (ker (L1) + · · · + ker (Lm)) ≤ dim ker (P)

Thus the inequalities are all equal signs and so

ker (L1)⊕ · · · ⊕ ker (Lm) = ker (P )


36. Let M (F^n, F^n) denote the set of all n × n matrices having entries in F. With the usual operations of matrix addition and scalar multiplication, explain why M (F^n, F^n) can be considered as F^{n²}. Give a basis for M (F^n, F^n). If A ∈ M (F^n, F^n), explain why there exists a monic polynomial of the form

λ^k + a_{k−1} λ^{k−1} + · · · + a_1 λ + a_0

such that

A^k + a_{k−1} A^{k−1} + · · · + a_1 A + a_0 I = 0

The minimal polynomial of A is the polynomial of the above form, with p (A) = 0, which has smallest degree. I will discuss the uniqueness of this polynomial later. Hint:

Consider the matrices I, A, A², · · · , A^{n²}. There are n² + 1 of these matrices. Can they be linearly independent? Now consider all polynomials p with p (A) = 0, pick one of smallest degree, and then divide by the leading coefficient.

A basis for M (F^n, F^n) is obviously the matrices E_{ij}, where E_{ij} has a 1 in the ijth place and zeros everywhere else. Thus the dimension of this vector space is n². It follows that the list of matrices I, A, A², · · · , A^{n²} is linearly dependent. Hence there exists a polynomial p (λ) of smallest possible degree such that p (A) = 0, by the well ordering principle of the natural numbers. Then divide by the leading coefficient. If you insist that its leading coefficient be 1 (monic), then the polynomial is unique and it is called the minimal polynomial. It is unique thanks to the division algorithm, because if q (λ) is another one, then

q (λ) = p (λ) l (λ) + r (λ)

where the degree of r (λ) is less than the degree of p (λ) or else r (λ) = 0. If it is not zero, then r (A) = 0 and this would contradict the minimality of the degree of p (λ). Hence q (λ) = p (λ) l (λ), where l (λ) must be monic. Since q (λ) has smallest possible degree, this monic polynomial can only be 1. Thus q (λ) = p (λ).

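The dependence argument above can be made concrete for a single 2 × 2 example (an assumed matrix, not from the text): by the Cayley–Hamilton theorem the monic polynomial λ² − tr (A) λ + det (A) already annihilates A, so a monic annihilating polynomial of degree at most 2 exists, well below the n² + 1 = 5 bound from the dependence argument.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[2, 1], [0, 3]]                          # illustrative matrix
tr = A[0][0] + A[1][1]                        # trace = 5
dA = A[0][0] * A[1][1] - A[0][1] * A[1][0]    # determinant = 6

# p(lambda) = lambda^2 - tr*lambda + dA annihilates A (Cayley-Hamilton),
# so the minimal polynomial of A has degree at most 2.
A2 = matmul(A, A)
pA = [[A2[i][j] - tr * A[i][j] + dA * (i == j) for j in range(2)]
      for i in range(2)]
print(pA)   # [[0, 0], [0, 0]]
```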
37. ↑Suppose the field of scalars is C and A is an n × n matrix. From the preceding problem and the fundamental theorem of algebra, this minimal polynomial factors as

(λ − λ1)^{r1} (λ − λ2)^{r2} · · · (λ − λk)^{rk}

where rj is the algebraic multiplicity of λj. Thus

(A − λ1I)^{r1} (A − λ2I)^{r2} · · · (A − λkI)^{rk} = 0

and so, letting P = (A − λ1I)^{r1} (A − λ2I)^{r2} · · · (A − λkI)^{rk} and Lj = (A − λjI)^{rj},

apply the result of Problem 35 to verify that

Cn = ker (L1)⊕ · · · ⊕ ker (Lk)

and that A : ker (Lj) → ker (Lj). In this context, ker (Lj) is called the generalized eigenspace for λj. You need to verify that the conditions of the result of this problem hold.

Let Li = (A − λiI)^{ri}. Then obviously these commute since they are just polynomials in A. Is Lk one to one on ker (Li)?

(A − λkI)^{rk} = (A − λiI + (λi − λk) I)^{rk}


= ∑_{j=0}^{rk} \binom{rk}{j} (A − λiI)^j (λi − λk)^{rk−j}

= (λi − λk)^{rk} I + ∑_{j=1}^{rk} \binom{rk}{j} (A − λiI)^j (λi − λk)^{rk−j}

Now raise both sides to the ri power.

(A − λkI)^{rk ri} = (λi − λk)^{rk ri} I + g (A) (A − λiI)^{ri}

where g (A) is some polynomial in A. Let vi ∈ ker (Li) and suppose (A − λkI)^{rk} vi = 0. Then

0 = (A − λkI)^{rk ri} vi = (λi − λk)^{rk ri} vi

and since λi ≠ λk, this requires that vi = 0. Thus Lk is one to one on ker (Li) as hoped. Therefore, since C^n = ker (∏_i L_i), it follows from the above problems that

C^n = ker (L1) ⊕ · · · ⊕ ker (Lk)

Note that there was nothing sacred about C; all you needed for the above to hold is that the minimal polynomial factors completely into a product of linear factors. In other words, all of the above works fine for F^n provided the minimal polynomial “splits”.

38. In the context of Problem 37, show there exists a nonzero vector x such that

(A− λjI)x = 0.

This is called an eigenvector and the λj is called an eigenvalue. Hint: There must exist a vector y such that

(A − λ1I)^{r1} (A − λ2I)^{r2} · · · (A − λjI)^{rj−1} · · · (A − λkI)^{rk} y = z ≠ 0

Why? Now what happens if you do (A− λjI) to z?

The hint gives it away.

(A − λjI) z = (A − λjI) ((A − λ1I)^{r1} (A − λ2I)^{r2} · · · (A − λjI)^{rj−1} · · · (A − λkI)^{rk} y)

= (A − λ1I)^{r1} (A − λ2I)^{r2} · · · (A − λjI)^{rj} · · · (A − λkI)^{rk} y = 0

39. Suppose Q (t) is an orthogonal matrix. This means Q (t) is a real n × n matrix which satisfies

Q (t) Q (t)^T = I

Suppose also the entries of Q (t) are differentiable. Show (Q^T)′ = −Q^T Q′ Q^T.

This is just the product rule.

Q′ (t) Q (t)^T + Q (t) Q′ (t)^T = 0

Hence

Q′ (t)^T = −Q (t)^T Q′ (t) Q (t)^T


40. Remember the Coriolis force was 2Ω × vB, where Ω was a particular vector which came from the matrix Q (t) as described above. Show that

Q (t) =

[ i (t) · i (t0), j (t) · i (t0), k (t) · i (t0) ]
[ i (t) · j (t0), j (t) · j (t0), k (t) · j (t0) ]
[ i (t) · k (t0), j (t) · k (t0), k (t) · k (t0) ]

There will be no Coriolis force exactly when Ω = 0, which corresponds to Q′ (t) = 0. When will Q′ (t) = 0?

Recall that, letting i = e1, j = e2, k = e3 in the usual way,

Q (t)u = u1e1 (t) + u2e2 (t) + u3e3 (t)

where

u ≡ u1e1 (t0) + u2e2 (t0) + u3e3 (t0)

Note that uj = u · ej (t0). Thus

Q (t)u = ∑_j (u · ej (t0)) ej (t)

So what is the rsth entry of Q (t)? It equals

er (t0)^T Q (t) es (t0) = er (t0) · ∑_j (es (t0) · ej (t0)) ej (t) = er (t0) · es (t)

which shows the desired result.

41. An illustration used in many beginning physics books is that of firing a rifle horizontally and dropping an identical bullet from the same height above the perfectly flat ground, followed by an assertion that the two bullets will hit the ground at exactly the same time. Is this true on the rotating earth, assuming the experiment takes place over a large perfectly flat field so the curvature of the earth is not an issue? Explain. What other irregularities will occur? Recall the Coriolis acceleration is 2ω [(−y′ cos φ) i + (x′ cos φ + z′ sin φ) j − (y′ sin φ) k], where k points away from the center of the earth, j points East, and i points South.

Obviously not. Because of the Coriolis force experienced by the fired bullet, which is not experienced by the dropped bullet, it will not be as simple as in the physics books. For example, if the bullet is fired East, then y′ sin φ > 0 and this contributes a force on the fired bullet which will cause it to hit the ground faster than the one dropped. Of course at the North pole or the South pole, things should be closer to what is expected in the physics books because there sin φ = 0. Also, if you fire it North or South, there seems to be no extra force because y′ = 0.

F.18 Exercises 3.2

1. Find the determinants of the following matrices.

(a)

[ 1 2 3 ]
[ 3 2 2 ]
[ 0 9 8 ]

(The answer is 31.)


(b)

[ 4  3 2 ]
[ 1  7 8 ]
[ 3 −9 3 ]

(The answer is 375.)

(c)

[ 1 2 3 2 ]
[ 1 3 2 3 ]
[ 4 1 5 0 ]
[ 1 2 1 2 ]

(The answer is −2.)

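As a quick numerical sanity check (a sketch, not part of the original solutions), a small pure-Python cofactor-expansion routine reproduces the three stated answers:

```python
def det(M):
    # Laplace (cofactor) expansion along the first row.
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

d1 = det([[1, 2, 3], [3, 2, 2], [0, 9, 8]])
d2 = det([[4, 3, 2], [1, 7, 8], [3, -9, 3]])
d3 = det([[1, 2, 3, 2], [1, 3, 2, 3], [4, 1, 5, 0], [1, 2, 1, 2]])
print(d1, d2, d3)   # 31 375 -2
```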
2. If A−1 exists, what is the relationship between det (A) and det (A−1)? Explain your answer.

1 = det (AA−1) = det (A) det (A−1), so det (A−1) = 1/ det (A).

3. Let A be an n × n matrix where n is odd. Suppose also that A is skew symmetric. This means A^T = −A. Show that det (A) = 0.

det (A) = det (A^T) = det (−A) = det (−I) det (A) = (−1)^n det (A) = −det (A), since n is odd, and so det (A) = 0.

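A numerical illustration of this fact (the skew-symmetric matrix below is an assumed example, not from the text):

```python
def det(M):
    # Laplace expansion along the first row.
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

# An assumed 3x3 skew-symmetric matrix (A^T = -A); n = 3 is odd.
A = [[0, 2, -3], [-2, 0, 5], [3, -5, 0]]
assert all(A[i][j] == -A[j][i] for i in range(3) for j in range(3))
dA = det(A)
print(dA)   # 0
```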
4. Is it true that det (A + B) = det (A) + det (B)? If this is so, explain why it is so, and if it is not so, give a counterexample.

Almost anything shows that this is not true. For example,

det ([ 1 0 ; 0 1 ] + [ −1 0 ; 0 −1 ]) = det [ 0 0 ; 0 0 ] = 0

while

det [ 1 0 ; 0 1 ] + det [ −1 0 ; 0 −1 ] = 1 + 1 = 2

5. Let A be an r×r matrix and suppose there are r−1 rows (columns) such that all rows(columns) are linear combinations of these r − 1 rows (columns). Show det (A) = 0.

Without loss of generality, assume the last row is a linear combination of the first r − 1 rows. Then the matrix is of the form

[ r1^T ]
[ ⋮ ]
[ r_{r−1}^T ]
[ ∑_{i=1}^{r−1} a_i r_i^T ]

Then from the linear property of determinants, the determinant equals

∑_{i=1}^{r−1} a_i det [ r1^T ; ⋮ ; r_{r−1}^T ; r_i^T ] = ∑_{i=1}^{r−1} a_i det [ r1^T ; ⋮ ; r_{r−1}^T ; 0^T ] = 0

where the first equality above is obtained by taking −1 times the ith row from the top and adding it to the last row.

6. Show det (aA) = an det (A) where here A is an n× n matrix and a is a scalar.

Each time you factor an a out of a row, you multiply the determinant of the matrix which remains by a. Since there are n rows, you do this n times, hence you get a^n.


7. Suppose A is an upper triangular matrix. Show that A−1 exists if and only if all elements of the main diagonal are nonzero. Is it true that A−1 will also be upper triangular? Explain. Is everything the same for lower triangular matrices?

This is obvious because the determinant of A is the product of these diagonal entries. When you consider the usual process of finding the inverse, you get that A−1 must be upper triangular. Everything is similar for lower triangular matrices.

8. Let A and B be two n × n matrices. A ∼ B (A is similar to B) means there exists an invertible matrix S such that A = S−1BS. Show that if A ∼ B, then B ∼ A. Show also that A ∼ A and that if A ∼ B and B ∼ C, then A ∼ C.

This is easy except possibly for the last claim. Say A = P−1BP and B = Q−1CQ. Then

A = P−1Q−1CQP = (QP)−1 C (QP)

9. In the context of Problem 8 show that if A ∼ B, then det (A) = det (B) .

det A = det (P−1BP) = det (P−1) det (B) det (P) = det (B) det (P−1P) = det (B).

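This invariance of the determinant under similarity can be checked on a small assumed example (the matrices S and B below are illustrative, not from the text):

```python
def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

S = [[1, 1], [0, 1]]        # a unit shear
S_inv = [[1, -1], [0, 1]]   # its inverse
B = [[2, 0], [1, 3]]        # assumed example matrix
A = matmul(S_inv, matmul(B, S))   # A = S^{-1} B S, so A ~ B
print(det2(A), det2(B))   # 6 6
```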
10. Let A be an n × n matrix and let x be a nonzero vector such that Ax = λx for some scalar λ. When this occurs, the vector x is called an eigenvector and the scalar λ is called an eigenvalue. It turns out that not every number is an eigenvalue. Only certain ones are. Why? Hint: Show that if Ax = λx, then (λI − A) x = 0. Explain why this shows that (λI − A) is not one to one and not onto. Now use Theorem 3.1.15 to argue det (λI − A) = 0. What sort of equation is this? How many solutions does it have?

If you have (λI − A) x = 0 for x ≠ 0, then (λI − A)−1 cannot exist, because if it did, you could multiply on the left by it and then conclude that x = 0. Therefore, (λI − A) is not one to one and not onto.

11. Suppose det (λI − A) = 0. Show using Theorem 3.1.15 there exists x ≠ 0 such that (λI − A) x = 0.

If that determinant equals 0, then the matrix λI − A has no inverse. It is not one to one and so there exists x ≠ 0 such that (λI − A) x = 0. Also recall the process for finding the inverse.

12. Let F (t) = det [ a (t) b (t) ; c (t) d (t) ]. Verify

F ′ (t) = det [ a′ (t) b′ (t) ; c (t) d (t) ] + det [ a (t) b (t) ; c′ (t) d′ (t) ].

Now suppose

F (t) = det
[ a (t) b (t) c (t) ]
[ d (t) e (t) f (t) ]
[ g (t) h (t) i (t) ]


Use Laplace expansion and the first part to verify that F ′ (t) =

det
[ a′ (t) b′ (t) c′ (t) ]
[ d (t) e (t) f (t) ]
[ g (t) h (t) i (t) ]
+ det
[ a (t) b (t) c (t) ]
[ d′ (t) e′ (t) f ′ (t) ]
[ g (t) h (t) i (t) ]
+ det
[ a (t) b (t) c (t) ]
[ d (t) e (t) f (t) ]
[ g′ (t) h′ (t) i′ (t) ]

Conjecture a general result valid for n × n matrices and explain why it will be true. Can a similar thing be done with the columns?

The way to see this holds in general is to use the usual proof of the product rule and the theorem about the determinant and row operations.

F (t + h) − F (t) = det
[ a (t+h) b (t+h) c (t+h) ]
[ d (t+h) e (t+h) f (t+h) ]
[ g (t+h) h (t+h) i (t+h) ]
− det
[ a (t) b (t) c (t) ]
[ d (t) e (t) f (t) ]
[ g (t) h (t) i (t) ]

And so this equals

det
[ a (t+h) b (t+h) c (t+h) ]
[ d (t+h) e (t+h) f (t+h) ]
[ g (t+h) h (t+h) i (t+h) ]
− det
[ a (t) b (t) c (t) ]
[ d (t+h) e (t+h) f (t+h) ]
[ g (t+h) h (t+h) i (t+h) ]

+ det
[ a (t) b (t) c (t) ]
[ d (t+h) e (t+h) f (t+h) ]
[ g (t+h) h (t+h) i (t+h) ]
− det
[ a (t) b (t) c (t) ]
[ d (t) e (t) f (t) ]
[ g (t+h) h (t+h) i (t+h) ]

+ det
[ a (t) b (t) c (t) ]
[ d (t) e (t) f (t) ]
[ g (t+h) h (t+h) i (t+h) ]
− det
[ a (t) b (t) c (t) ]
[ d (t) e (t) f (t) ]
[ g (t) h (t) i (t) ]

Now multiply by 1/h to obtain the following for the difference quotient (F (t + h) − F (t)) /h.

det
[ (a (t+h) − a (t))/h (b (t+h) − b (t))/h (c (t+h) − c (t))/h ]
[ d (t+h) e (t+h) f (t+h) ]
[ g (t+h) h (t+h) i (t+h) ]

+ det
[ a (t) b (t) c (t) ]
[ (d (t+h) − d (t))/h (e (t+h) − e (t))/h (f (t+h) − f (t))/h ]
[ g (t+h) h (t+h) i (t+h) ]

+ det
[ a (t) b (t) c (t) ]
[ d (t) e (t) f (t) ]
[ (g (t+h) − g (t))/h (h (t+h) − h (t))/h (i (t+h) − i (t))/h ]

Now passing to the limit yields the desired formula. Obviously this holds for any size determinant.

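The 2 × 2 case of the row-by-row differentiation formula can be checked numerically. The entries below (a = t², b = t³, c = t, d = t⁴) are assumed for illustration, so F (t) = t⁶ − t⁴ and F ′ (t) = 6t⁵ − 4t³:

```python
# Check F'(t) = det[a' b'; c d] + det[a b; c' d'] for assumed entries
# a = t^2, b = t^3, c = t, d = t^4, where F(t) = t^6 - t^4.
def det2(a, b, c, d):
    return a * d - b * c

t = 1.5
F_prime_exact = 6 * t**5 - 4 * t**3
# First determinant differentiates row one, second differentiates row two.
F_prime_formula = det2(2 * t, 3 * t**2, t, t**4) + det2(t**2, t**3, 1, 4 * t**3)
print(F_prime_formula, F_prime_exact)
```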
13. Use the formula for the inverse in terms of the cofactor matrix to find the inverse of the matrix

A =
[ e^t, 0, 0 ]
[ 0, e^t cos t, e^t sin t ]
[ 0, e^t cos t − e^t sin t, e^t cos t + e^t sin t ]


[ e^t, 0, 0 ]
[ 0, e^t cos t, e^t sin t ]
[ 0, e^t cos t − e^t sin t, e^t cos t + e^t sin t ]^{−1}

=

[ e^{−t}, 0, 0 ]
[ 0, e^{−t} (cos t + sin t), −(sin t) e^{−t} ]
[ 0, −e^{−t} (cos t − sin t), (cos t) e^{−t} ]

14. Let A be an r × r matrix and let B be an m × m matrix such that r + m = n. Consider the following n × n block matrix

C = [ A 0 ; D B ].

where the D is an m × r matrix and the 0 is an r × m matrix. Letting Ik denote the k × k identity matrix, tell why

C = [ A 0 ; D Im ] [ Ir 0 ; 0 B ].

Now explain why det (C) = det (A) det (B). Hint: Part of this will require an explanation of why

det [ A 0 ; D Im ] = det (A).

See Corollary 3.1.9.

The first follows right away from block multiplication. Now

det (C) = det [ A 0 ; D Im ] det [ Ir 0 ; 0 B ] = det [ A 0 ; 0 Im ] det [ Ir 0 ; 0 B ] = det (A) det (B)

from expanding along the last m columns for the first factor and along the first r columns for the second.

15. Suppose Q is an orthogonal matrix. This means Q is a real n × n matrix which satisfies

QQT = I

Find the possible values for det (Q).

You have det (Q) det (Q^T) = det (Q)² = 1 and so det (Q) = ±1.

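Both possible values occur, as a quick check with a rotation and a reflection illustrates (the angle and matrices below are assumed examples, not from the text):

```python
import math

theta = 0.7   # an arbitrary angle (assumption)
c, s = math.cos(theta), math.sin(theta)

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

Q_rot = [[c, -s], [s, c]]   # rotation: orthogonal, det +1
Q_ref = [[c, s], [s, -c]]   # reflection: orthogonal, det -1
d_rot, d_ref = det2(Q_rot), det2(Q_ref)
print(d_rot, d_ref)
```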
16. Suppose Q (t) is an orthogonal matrix. This means Q (t) is a real n × n matrix which satisfies

Q (t) Q (t)^T = I

Suppose Q (t) is continuous for t ∈ [a, b], some interval. Also suppose det (Q (a)) = 1. Show that it follows that det (Q (t)) = 1 for all t ∈ [a, b].

You have from the given equation that det (Q (t)) is always either 1 or −1. Since Q (t) is continuous, so is t → det (Q (t)), and so if it starts off at 1, it cannot jump to −1, because this would violate the intermediate value theorem.


F.19 Exercises 3.6

1. Let m < n and let A be an m × n matrix. Show that A is not one to one. Hint: Consider the n × n matrix A1 which is of the form

A1 ≡ [ A ; 0 ]

where the 0 denotes an (n − m) × n matrix of zeros. Thus det A1 = 0 and so A1 is not one to one. Now observe that A1x is the vector

A1x = [ Ax ; 0 ]

which equals zero if and only if Ax = 0.

The hint gives it away. You could simply consider a vector of the form [ 0 ; a ] where a ≠ 0.

2. Let v1, · · · ,vn be vectors in Fn and let M (v1, · · · ,vn) denote the matrix whose ith column equals vi. Define

d (v1, · · · ,vn) ≡ det (M (v1, · · · ,vn)) .

Prove that d is linear in each variable (multilinear), that

d (v1, · · · ,vi, · · · ,vj , · · · ,vn) = −d (v1, · · · ,vj , · · · ,vi, · · · ,vn) , (6.28)

and

d (e1, · · · , en) = 1 (6.29)

where here ej is the vector in Fn which has a zero in every position except the jth position, in which it has a one.

This follows from the properties of determinants which are discussed above.

3. Suppose f : Fn × · · · × Fn → F satisfies 6.28 and 6.29 and is linear in each variable. Show that f = d.

Consider f (x1, · · · ,xn). Then by the assumptions on f, it equals

f (x1, · · · ,xn) = ∑_{i1,··· ,in} x_{1i1} · · · x_{nin} f (e_{i1}, · · · , e_{in})
= ∑_{i1,··· ,in} x_{1i1} · · · x_{nin} sgn (i1, · · · , in) f (e1, · · · , en)
= ∑_{i1,··· ,in} x_{1i1} · · · x_{nin} sgn (i1, · · · , in) d (e1, · · · , en)
= ∑_{i1,··· ,in} x_{1i1} · · · x_{nin} d (e_{i1}, · · · , e_{in}) = d (x1, · · · ,xn)


4. Show that if you replace a row (column) of an n × n matrix A with itself added to some multiple of another row (column), then the new matrix has the same determinant as the original one.

This was done above.

5. Use the result of Problem 4 to evaluate by hand the determinant

det
[ 1 2 3 2 ]
[ −6 3 2 3 ]
[ 5 2 2 3 ]
[ 3 4 6 4 ]

Row operations reduce this determinant to 5.

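A cofactor-expansion check of this answer (a sketch, not part of the original solutions):

```python
def det(M):
    # Laplace expansion along the first row.
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

d = det([[1, 2, 3, 2], [-6, 3, 2, 3], [5, 2, 2, 3], [3, 4, 6, 4]])
print(d)   # 5
```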
6. Find the inverse, if it exists, of the matrix

[ e^t, cos t, sin t ]
[ e^t, −sin t, cos t ]
[ e^t, −cos t, −sin t ]

[ e^t, cos t, sin t ]
[ e^t, −sin t, cos t ]
[ e^t, −cos t, −sin t ]^{−1}

=

[ (1/2) e^{−t}, 0, (1/2) e^{−t} ]
[ (1/2) cos t + (1/2) sin t, −sin t, (1/2) sin t − (1/2) cos t ]
[ (1/2) sin t − (1/2) cos t, cos t, −(1/2) cos t − (1/2) sin t ]

7. Let Ly = y^{(n)} + a_{n−1} (x) y^{(n−1)} + · · · + a_1 (x) y′ + a_0 (x) y, where the ai are given continuous functions defined on a closed interval (a, b) and y is some function which has n derivatives, so it makes sense to write Ly. Suppose Lyk = 0 for k = 1, 2, · · · , n. The Wronskian of these functions yi is defined as

W (y1, · · · , yn) (x) ≡ det
[ y1 (x), · · · , yn (x) ]
[ y′1 (x), · · · , y′n (x) ]
[ ⋮ ⋮ ]
[ y1^{(n−1)} (x), · · · , yn^{(n−1)} (x) ]

Show that, writing W (x) = W (y1, · · · , yn) (x) to save space,

W ′ (x) = det
[ y1 (x), · · · , yn (x) ]
[ y′1 (x), · · · , y′n (x) ]
[ ⋮ ⋮ ]
[ y1^{(n)} (x), · · · , yn^{(n)} (x) ]

Now use the differential equation Ly = 0, which is satisfied by each of these functions yi, and the properties of determinants presented above to verify that W ′ + a_{n−1} (x) W = 0. Give an explicit solution of this linear differential equation, Abel's formula, and use


your answer to verify that the Wronskian of these solutions to the equation, Ly = 0either vanishes identically on (a, b) or never.

The last formula above follows because W ′ (x) equals the sum of determinants of matrices which have two equal rows, except for the last one in the sum, which is the displayed expression. Now let

m_i (x) = −(a_{n−1} (x) y_i^{(n−1)} + · · · + a_1 (x) y_i′ + a_0 (x) y_i)

Since each yi is a solution to Ly = 0, it follows that y_i^{(n)} (x) = m_i (x). Now from the properties of determinants, being linear in each row,

W ′ (x) = −a_{n−1} (x) det
[ y1 (x), · · · , yn (x) ]
[ y′1 (x), · · · , y′n (x) ]
[ ⋮ ⋮ ]
[ y1^{(n−1)} (x), · · · , yn^{(n−1)} (x) ]
= −a_{n−1} (x) W (x)

Now let A′ (x) = a_{n−1} (x). Then

d/dx (e^{A(x)} W (x)) = 0

and so W (x) = C e^{−A(x)}. Thus the Wronskian either vanishes for all x or for no x.

8. Two n × n matrices A and B are similar if B = S−1AS for some invertible n × n matrix S. Show that if two matrices are similar, they have the same characteristic polynomial. The characteristic polynomial of A is det (λI − A).

Say A = S−1BS. Then

det (λI − A) = det (λI − S−1BS) = det (λS−1S − S−1BS)
= det (S−1 (λI − B) S) = det (S−1) det (λI − B) det (S)
= det (S−1S) det (λI − B) = det (λI − B)

9. Suppose the characteristic polynomial of an n × n matrix A is of the form

t^n + a_{n−1} t^{n−1} + · · · + a_1 t + a_0

and that a_0 ≠ 0. Find a formula for A−1 in terms of powers of the matrix A. Show that A−1 exists if and only if a_0 ≠ 0. In fact a_0 = (−1)^n det (A).

From the Cayley–Hamilton theorem,

A^n + a_{n−1} A^{n−1} + · · · + a_1 A + a_0 I = 0

Also the characteristic polynomial is

det (tI − A)

and the constant term is (−1)^n det (A). Thus a_0 ≠ 0 if and only if det (A) ≠ 0 if and only if A has an inverse. Thus if A−1 exists, it follows that

a_0 I = −(A^n + a_{n−1} A^{n−1} + · · · + a_1 A) = A (−A^{n−1} − a_{n−1} A^{n−2} − · · · − a_1 I)


and also

a_0 I = (−A^{n−1} − a_{n−1} A^{n−2} − · · · − a_1 I) A

Therefore, the inverse is

A−1 = (1/a_0) (−A^{n−1} − a_{n−1} A^{n−2} − · · · − a_1 I)

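The n = 2 case of this inverse formula can be verified directly (the matrix below is an assumed invertible example, not from the text). Here the characteristic polynomial is t² + a_1 t + a_0 with a_1 = −tr (A) and a_0 = det (A), so A−1 = (1/a_0)(−A − a_1 I):

```python
A = [[2, 1], [1, 1]]                         # assumed invertible example
a1 = -(A[0][0] + A[1][1])                    # a_1 = -trace(A) for n = 2
a0 = A[0][0] * A[1][1] - A[0][1] * A[1][0]   # a_0 = det(A)

# A^{-1} = (1/a_0) * (-A - a_1 I), the n = 2 case of the formula above.
Ainv = [[(-A[i][j] - a1 * (i == j)) / a0 for j in range(2)] for i in range(2)]
prod = [[sum(A[i][k] * Ainv[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)]
print(prod)   # [[1.0, 0.0], [0.0, 1.0]]
```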
10. ↑Letting p (t) denote the characteristic polynomial of A, show that

p_ε (t) ≡ p (t − ε)

is the characteristic polynomial of A + εI. Then show that if det (A) = 0, it follows that det (A + εI) ≠ 0 whenever |ε| is sufficiently small.

p (t) ≡ det (tI − A). Hence p (t − ε) = det ((t − ε) I − A) = det (tI − (εI + A)), and by definition, this is the characteristic polynomial of εI + A. Therefore, the constant term of this is the constant term of p (t − ε). Say

p (t) = t^n + a_{n−1} t^{n−1} + · · · + a_1 t + a_0

where the smallest a_i which is nonzero is a_k (here a_0 = 0, since det (A) = 0). Then p (t) reduces to

t^n + a_{n−1} t^{n−1} + · · · + a_k t^k

Then

p_ε (t) = p (t − ε) = (t − ε)^n + a_{n−1} (t − ε)^{n−1} + · · · + a_k (t − ε)^k

Then the constant term of this p_ε (t) is a_k (−1)^k ε^k plus terms having ε raised to a higher power. Hence for all ε small enough, this term will dominate the sum of all the others, and it follows that the constant term is nonzero for all |ε| small enough.

11. In constitutive modeling of the stress and strain tensors, one sometimes considers sums of the form ∑_{k=0}^{∞} a_k A^k, where A is a 3 × 3 matrix. Show using the Cayley–Hamilton theorem that if such a thing makes any sense, you can always obtain it as a finite sum having no more than 3 terms.

Say the characteristic polynomial is q (t). Then if n ≥ 3,

t^n = q (t) l (t) + r (t)

where the degree of r (t) is either less than 3 or else r (t) = 0. Thus

A^n = q (A) l (A) + r (A) = r (A)

and so all the terms A^n for n ≥ 3 can be replaced with some r (A), where the degree of r (t) is no more than 2. Thus, assuming there are no convergence issues, the infinite sum must be of the form ∑_{k=0}^{2} b_k A^k.

12. Recall you can find the determinant by expanding along the jth column:

det (A) = ∑_i A_{ij} (cof (A))_{ij}

Think of det (A) as a function of the entries A_{ij}. Explain why the ijth cofactor is really just

∂ det (A) / ∂A_{ij}.

This follows from the formula for the determinant. If you take the partial derivative, you get (cof (A))_{ij}.


13. Let U be an open set in Rn and let g : U → Rn be such that all the first partial derivatives of all components of g exist and are continuous. Under these conditions, form the matrix Dg (x) given by

Dg (x)_{ij} ≡ ∂g_i (x)/∂x_j ≡ g_{i,j} (x)

The best kept secret in calculus courses is that the linear transformation determined by this matrix Dg (x) is called the derivative of g and is the correct generalization of the concept of derivative of a function of one variable. Suppose the second partial derivatives also exist and are continuous. Then show that

∑_j (cof (Dg))_{ij,j} = 0.

Hint: First explain why

∑_i g_{i,k} cof (Dg)_{ij} = δ_{jk} det (Dg)

Next differentiate with respect to x_j and sum on j using the equality of mixed partial derivatives. Assume det (Dg) ≠ 0 to prove the identity in this special case. Then explain why there exists a sequence ε_k → 0 such that for g_{ε_k} (x) ≡ g (x) + ε_k x, det (Dg_{ε_k}) ≠ 0, and so the identity holds for g_{ε_k}. Then take a limit to get the desired result in general. This is an extremely important identity which has surprising implications.

First assume det (Dg) ≠ 0. Then from the cofactor expansion, you get

∑_i g_{i,k} cof (Dg)_{ij} = δ_{jk} det (Dg)

If j = k you get det (Dg) on the right, but if j ≠ k, then the left side is the expansion of a determinant which has two equal columns. Now differentiate both sides with respect to x_j using the above problem. You have to use the chain rule.

∑_i g_{i,kj} cof (Dg)_{ij} + ∑_i g_{i,k} cof (Dg)_{ij,j} = δ_{jk} ∑_{r,s} (∂ det (Dg)/∂g_{r,s}) g_{r,sj} = ∑_{r,s} δ_{jk} cof (Dg)_{rs} g_{r,sj}

Next sum on j:

∑_{i,j} g_{i,kj} cof (Dg)_{ij} + ∑_{i,j} g_{i,k} cof (Dg)_{ij,j} = ∑_{r,s,j} δ_{jk} cof (Dg)_{rs} g_{r,sj}

Of course the terms on the right are 0 unless j = k, and so

∑_{i,j} g_{i,kj} cof (Dg)_{ij} + ∑_{i,j} g_{i,k} cof (Dg)_{ij,j} = ∑_{r,s} cof (Dg)_{rs} g_{r,sk} = ∑_{i,j} cof (Dg)_{ij} g_{i,jk}

Subtract the terms at the end from each side using the equality of mixed partial derivatives. This gives

0 = ∑_{i,j} g_{i,k} cof (Dg)_{ij,j} = ∑_i g_{i,k} ∑_j cof (Dg)_{ij,j}


Since Dg is assumed invertible, this requires

∑_j cof (Dg)_{ij,j} = 0

In case det (Dg (x)) = 0 for some x, consider ε_k, a sequence of numbers converging to 0 with the property that det (Dg (x) + ε_k I) ≠ 0. Then let g_k (x) = g (x) + ε_k x. From the above, you have

∑_j cof (Dg)_{ij,j} (x) = lim_{k→∞} ∑_j cof (Dg_k)_{ij,j} (x) = 0

14. A determinant of the form

| 1, 1, · · · , 1 |
| a_0, a_1, · · · , a_n |
| a_0², a_1², · · · , a_n² |
| ⋮ ⋮ ⋮ |
| a_0^{n−1}, a_1^{n−1}, · · · , a_n^{n−1} |
| a_0^n, a_1^n, · · · , a_n^n |

is called a Vandermonde determinant. Show this determinant equals

∏_{0≤i<j≤n} (a_j − a_i)

By this is meant the product of all terms of the form (a_j − a_i) such that j > i. Hint: Show it works if n = 1, so you are looking at

| 1, 1 |
| a_0, a_1 |

Then suppose it holds for n − 1 and consider the case n. Consider the following polynomial.

p (t) ≡
| 1, 1, · · · , 1 |
| a_0, a_1, · · · , t |
| a_0², a_1², · · · , t² |
| ⋮ ⋮ ⋮ |
| a_0^{n−1}, a_1^{n−1}, · · · , t^{n−1} |
| a_0^n, a_1^n, · · · , t^n |

Explain why p (a_i) = 0 for i = 0, · · · , n − 1. Thus

p (t) = c ∏_{i=0}^{n−1} (t − a_i).

Of course c is the coefficient of t^n. Find this coefficient from the above description of p (t) and the induction hypothesis. Then plug in t = a_n and observe you have the formula valid for n.

In the case of n = 1, you have

| 1, 1 |
| a_0, a_1 |
= a_1 − a_0

so the formula holds. Now suppose it holds for n and consider p (t) as above. Then it is obvious that p (a_i) = 0


for i = 0, · · · , n − 1, because you have the determinant of a matrix with two equal columns. Thus

p (t) = c ∏_{i=0}^{n−1} (t − a_i)

because p (t) has degree n and so it has no more than n roots. So what is c? The coefficient of t^n is the Vandermonde determinant which is n × n and, by induction, it equals ∏_{0≤i<j≤n−1} (a_j − a_i). From the above product, the coefficient of t^n is c. Thus

p (t) = ∏_{0≤i<j≤n−1} (a_j − a_i) ∏_{i=0}^{n−1} (t − a_i)

Now plug in t = a_n to get the (n + 1) × (n + 1) Vandermonde determinant

p (a_n) = ∏_{0≤i<j≤n−1} (a_j − a_i) ∏_{i=0}^{n−1} (a_n − a_i) = ∏_{0≤i<j≤n} (a_j − a_i)

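The Vandermonde product formula can be checked numerically for a small case (the nodes below are assumed example values, not from the text):

```python
def det(M):
    # Laplace expansion along the first row.
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

nodes = [1, 2, 4, 7]   # assumed values a_0, ..., a_3 (so n = 3)
V = [[a ** i for a in nodes] for i in range(len(nodes))]  # rows 1, a, a^2, a^3

prod = 1
for i in range(len(nodes)):
    for j in range(i + 1, len(nodes)):
        prod *= nodes[j] - nodes[i]

dV = det(V)
print(dV, prod)   # 540 540
```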
F.20 Exercises 4.6

1. Let u1, · · · ,un be vectors in Rn. The parallelepiped determined by these vectors, P (u1, · · · ,un), is defined as

P (u1, · · · ,un) ≡ { ∑_{k=1}^{n} t_k u_k : t_k ∈ [0, 1] for all k }.

Now let A be an n × n matrix. Show that

{Ax : x ∈ P (u1, · · · ,un)}

is also a parallelepiped.

A typical element of {Ax : x ∈ P (u1, · · · ,un)} is

∑_{k=1}^{n} t_k Au_k, t_k ∈ [0, 1]

and so this set is just P (Au1, · · · , Aun).

2. In the context of Problem 1, draw P (e1, e2) where e1, e2 are the standard basis vectors for R2. Thus e1 = (1, 0), e2 = (0, 1). Now suppose

E = [ 1 1 ; 0 1 ]

where E is the elementary matrix which adds the second row to the first. Draw

{Ex : x ∈ P (e1, e2)}.


In other words, draw the result of doing E to the vectors in P (e1, e2). Next draw theresults of doing the other elementary matrices to P (e1, e2).

For E the given elementary matrix, the result is as follows.

P (e1, e2) E(P (e1, e2))

It is called a shear. Note that it does not change the area.

In case
\[ E = \begin{pmatrix} 1 & 0 \\ 0 & \alpha \end{pmatrix}, \]
the given square $P(\mathbf{e}_1, \mathbf{e}_2)$ becomes a rectangle. If $\alpha > 0$, the square is stretched by a factor of $\alpha$ in the $y$ direction. If $\alpha < 0$, the rectangle is reflected across the $x$ axis and in addition is stretched by a factor of $|\alpha|$ in the $y$ direction. When
\[ E = \begin{pmatrix} \alpha & 0 \\ 0 & 1 \end{pmatrix}, \]
the result is similar, only it features changes in the $x$ direction. These elementary matrices do change the area unless $|\alpha| = 1$, in which case the area is unchanged. The permutation
\[ \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \]
simply switches $\mathbf{e}_1$ and $\mathbf{e}_2$, so the result appears to be a square just like the one you began with.
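The area statements can be read off from determinants, since $|\det E|$ is the factor by which $E$ scales area. A minimal sketch (the value $\alpha = 3$ is an arbitrary choice):

```python
# det2 returns the determinant of a 2x2 matrix given as [[a, b], [c, d]].
def det2(M):
    (a, b), (c, d) = M
    return a * d - b * c

shear = [[1, 1], [0, 1]]   # adds one row to another: area unchanged
scale = [[1, 0], [0, 3]]   # multiplies a row by alpha = 3: area scaled by 3
swap  = [[0, 1], [1, 0]]   # switches the two rows: area unchanged
print(det2(shear), det2(scale), det2(swap))  # 1 3 -1
```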

3. In the context of Problem 1, either draw or describe the result of doing elementary matrices to $P(\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3)$. Describe geometrically the conclusion of Corollary 4.3.7.

It is the same sort of thing. The elementary matrix either switches the $\mathbf{e}_i$ about, or it produces a shear or a magnification in one direction.

4. Consider a permutation of $\{1, 2, \cdots, n\}$. This is an ordered list of numbers taken from this list with no repeats, $i_1, i_2, \cdots, i_n$. Define the permutation matrix $P(i_1, i_2, \cdots, i_n)$ as the matrix which is obtained from the identity matrix by placing the $j$th column of $I$ as the $i_j$th column of $P(i_1, i_2, \cdots, i_n)$. What does this permutation matrix do to the column vector $(1, 2, \cdots, n)^T$?

It produces the column $(i_1, i_2, \cdots, i_n)^T$.

5. Consider the $3 \times 3$ permutation matrices. List all of them and then determine the dimension of their span. Recall that you can consider an $m \times n$ matrix as something in $\mathbb{F}^{nm}$.

Here they are.
\[ \begin{pmatrix} 1&0&0\\0&1&0\\0&0&1 \end{pmatrix}, \begin{pmatrix} 0&0&1\\1&0&0\\0&1&0 \end{pmatrix}, \begin{pmatrix} 0&1&0\\0&0&1\\1&0&0 \end{pmatrix}, \begin{pmatrix} 0&1&0\\1&0&0\\0&0&1 \end{pmatrix}, \begin{pmatrix} 1&0&0\\0&0&1\\0&1&0 \end{pmatrix}, \begin{pmatrix} 0&0&1\\0&1&0\\1&0&0 \end{pmatrix} \]
So what is the dimension of the span of these? One way to systematically accomplish this is to unravel them and then use the row reduced echelon form. Unraveling these yields the column vectors
\[ (1,0,0,0,1,0,0,0,1)^T, \ (0,0,1,1,0,0,0,1,0)^T, \ (0,1,0,0,0,1,1,0,0)^T, \]
\[ (0,1,0,1,0,0,0,0,1)^T, \ (1,0,0,0,0,1,0,1,0)^T, \ (0,0,1,0,1,0,1,0,0)^T. \]
Then arranging these as the columns of a matrix yields the following along with its row reduced echelon form.
\[ \begin{pmatrix} 1&0&0&0&1&0\\0&0&1&1&0&0\\0&1&0&0&0&1\\0&1&0&1&0&0\\1&0&0&0&0&1\\0&0&1&0&1&0\\0&0&1&0&0&1\\0&1&0&0&1&0\\1&0&0&1&0&0 \end{pmatrix}, \text{ row echelon form: } \begin{pmatrix} 1&0&0&0&0&1\\0&1&0&0&0&1\\0&0&1&0&0&1\\0&0&0&1&0&-1\\0&0&0&0&1&-1\\0&0&0&0&0&0\\0&0&0&0&0&0\\0&0&0&0&0&0\\0&0&0&0&0&0 \end{pmatrix} \]
The dimension is 5.
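The dimension can also be confirmed with a rank computation instead of hand row reduction. A minimal sketch using NumPy (assumed available):

```python
import numpy as np
from itertools import permutations

# Build the six 3x3 permutation matrices, flatten each to a 9-vector,
# and stack the vectors as the columns of a 9x6 matrix.
cols = []
for p in permutations(range(3)):
    P = np.zeros((3, 3))
    for row, col in enumerate(p):
        P[row, col] = 1.0
    cols.append(P.flatten())
M = np.column_stack(cols)
print(np.linalg.matrix_rank(M))  # 5
```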

6. Determine which matrices are in row reduced echelon form.

(a) $\begin{pmatrix} 1&2&0\\0&1&7 \end{pmatrix}$  This one is not.

(b) $\begin{pmatrix} 1&0&0&0\\0&0&1&2\\0&0&0&0 \end{pmatrix}$  This one is.

(c) $\begin{pmatrix} 1&1&0&0&0&5\\0&0&1&2&0&4\\0&0&0&0&1&3 \end{pmatrix}$  This one is.

7. Row reduce the following matrices to obtain the row reduced echelon form. List the pivot columns in the original matrix.

(a)
\[ \begin{pmatrix} 1&2&0&3\\2&1&2&2\\1&1&0&3 \end{pmatrix}, \text{ row echelon form: } \begin{pmatrix} 1&0&0&3\\0&1&0&0\\0&0&1&-2 \end{pmatrix} \]

(b)
\[ \begin{pmatrix} 1&2&3\\2&1&-2\\3&0&0\\3&2&1 \end{pmatrix}, \text{ row echelon form: } \begin{pmatrix} 1&0&0\\0&1&0\\0&0&1\\0&0&0 \end{pmatrix} \]

(c)
\[ \begin{pmatrix} 1&2&1&3\\-3&2&1&0\\3&2&1&1 \end{pmatrix}, \text{ row echelon form: } \begin{pmatrix} 1&0&0&0\\0&1&\frac12&0\\0&0&0&1 \end{pmatrix} \]


8. Find the rank and nullity of the following matrices. If the rank is $r$, identify $r$ columns in the original matrix which have the property that every other column may be written as a linear combination of these.

(a)
\[ \begin{pmatrix} 0&1&0&2&1&2&2\\0&3&2&12&1&6&8\\0&1&1&5&0&2&3\\0&2&1&7&0&3&4 \end{pmatrix}, \text{ row echelon form: } \begin{pmatrix} 0&1&0&2&0&1&1\\0&0&1&3&0&1&2\\0&0&0&0&1&1&1\\0&0&0&0&0&0&0 \end{pmatrix} \]
Rank equals 3, nullity equals 4. A basis is columns 2, 3, 5.

(b)
\[ \begin{pmatrix} 0&1&0&2&0&1&0\\0&3&2&6&0&5&4\\0&1&1&2&0&2&2\\0&2&1&4&0&3&2 \end{pmatrix}, \text{ row echelon form: } \begin{pmatrix} 0&1&0&2&0&1&0\\0&0&1&0&0&1&2\\0&0&0&0&0&0&0\\0&0&0&0&0&0&0 \end{pmatrix} \]
Rank is 2, nullity equals 5. A basis is columns 2, 3.

(c)
\[ \begin{pmatrix} 0&1&0&2&1&1&2\\0&3&2&6&1&5&1\\0&1&1&2&0&2&1\\0&2&1&4&0&3&1 \end{pmatrix}, \text{ row echelon form: } \begin{pmatrix} 0&1&0&2&0&1&0\\0&0&1&0&0&1&0\\0&0&0&0&1&0&0\\0&0&0&0&0&0&1 \end{pmatrix} \]
Rank is 4 and nullity is 3. A basis is columns 2, 3, 5, 7.
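These rank/nullity pairs are easy to double check numerically. A minimal sketch for part (a) using NumPy (assumed available):

```python
import numpy as np

A = np.array([[0, 1, 0, 2, 1, 2, 2],
              [0, 3, 2, 12, 1, 6, 8],
              [0, 1, 1, 5, 0, 2, 3],
              [0, 2, 1, 7, 0, 3, 4]], dtype=float)
rank = np.linalg.matrix_rank(A)
nullity = A.shape[1] - rank   # rank plus nullity equals the number of columns
print(rank, nullity)  # 3 4
```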

9. Find the rank of the following matrices. If the rank is $r$, identify $r$ columns in the original matrix which have the property that every other column may be written as a linear combination of these. Also find a basis for the row and column spaces of the matrices.

(a)
\[ \begin{pmatrix} 1&2&0\\3&2&1\\2&1&0\\0&2&1 \end{pmatrix}, \text{ row echelon form: } \begin{pmatrix} 1&0&0\\0&1&0\\0&0&1\\0&0&0 \end{pmatrix}. \]
The rank is 3.

(b)
\[ \begin{pmatrix} 1&0&0\\4&1&1\\2&1&0\\0&2&0 \end{pmatrix}, \text{ row echelon form: } \begin{pmatrix} 1&0&0\\0&1&0\\0&0&1\\0&0&0 \end{pmatrix}. \]
The rank is 3.

(c)
\[ \begin{pmatrix} 0&1&0&2&1&2&2\\0&3&2&12&1&6&8\\0&1&1&5&0&2&3\\0&2&1&7&0&3&4 \end{pmatrix}, \text{ row echelon form: } \begin{pmatrix} 0&1&0&2&0&1&1\\0&0&1&3&0&1&2\\0&0&0&0&1&1&1\\0&0&0&0&0&0&0 \end{pmatrix}. \]
The rank is 3.

(d)
\[ \begin{pmatrix} 0&1&0&2&0&1&0\\0&3&2&6&0&5&4\\0&1&1&2&0&2&2\\0&2&1&4&0&3&2 \end{pmatrix}, \text{ row echelon form: } \begin{pmatrix} 0&1&0&2&0&1&0\\0&0&1&0&0&1&2\\0&0&0&0&0&0&0\\0&0&0&0&0&0&0 \end{pmatrix}. \]
The rank is 2.

(e)
\[ \begin{pmatrix} 0&1&0&2&1&1&2\\0&3&2&6&1&5&1\\0&1&1&2&0&2&1\\0&2&1&4&0&3&1 \end{pmatrix}, \text{ row echelon form: } \begin{pmatrix} 0&1&0&2&0&1&0\\0&0&1&0&0&1&0\\0&0&0&0&1&0&0\\0&0&0&0&0&0&1 \end{pmatrix}. \]
The rank is 4.

10. Suppose $A$ is an $m \times n$ matrix. Explain why the rank of $A$ is always no larger than $\min(m,n)$.

It is because you cannot have more than $\min(m,n)$ nonzero rows in the row reduced echelon form. Recall that the number of pivot columns is the same as the number of nonzero rows, from the description of the row reduced echelon form.

11. Suppose $A$ is an $m \times n$ matrix in which $m \le n$. Suppose also that the rank of $A$ equals $m$. Show that $A$ maps $\mathbb{F}^n$ onto $\mathbb{F}^m$. Hint: The vectors $\mathbf{e}_1, \cdots, \mathbf{e}_m$ occur as columns in the row reduced echelon form for $A$.

It follows from the fact that $\mathbf{e}_1, \cdots, \mathbf{e}_m$ occur as columns in the row reduced echelon form that the dimension of the column space of $A$ is $m$, and so, since this column space is $A(\mathbb{F}^n)$, it follows that it equals $\mathbb{F}^m$.

12. Suppose $A$ is an $m \times n$ matrix and that $m > n$. Show there exists $\mathbf{b} \in \mathbb{F}^m$ such that there is no solution to the equation
\[ A\mathbf{x} = \mathbf{b}. \]

Since $m > n$, the dimension of the column space of $A$ is no more than $n$ and so the columns of $A$ cannot span $\mathbb{F}^m$.

13. Suppose $A$ is an $m \times n$ matrix in which $m \ge n$. Suppose also that the rank of $A$ equals $n$. Show that $A$ is one to one. Hint: If not, there exists a vector $\mathbf{x} \neq \mathbf{0}$ such that $A\mathbf{x} = \mathbf{0}$, and this implies at least one column of $A$ is a linear combination of the others. Show this would require the column rank to be less than $n$.

The hint gives it away. The matrix has at most $n$ independent columns. If one is a combination of the others, then its rank is less than $n$.

14. Explain why an $n \times n$ matrix $A$ is both one to one and onto if and only if its rank is $n$.

If $A$ is one to one, then its columns are linearly independent, hence a basis, and so it is also onto. Since the columns are independent, its rank is $n$. If its rank is $n$, this just says that the columns are independent.

15. Suppose $A$ is an $m \times n$ matrix and $\{\mathbf{w}_1, \cdots, \mathbf{w}_k\}$ is a linearly independent set of vectors in $A(\mathbb{F}^n) \subseteq \mathbb{F}^m$. Suppose also that $A\mathbf{z}_i = \mathbf{w}_i$. Show that $\{\mathbf{z}_1, \cdots, \mathbf{z}_k\}$ is also linearly independent.

If $\sum_i c_i \mathbf{z}_i = \mathbf{0}$, apply $A$ to both sides to obtain $\sum_i c_i \mathbf{w}_i = \mathbf{0}$. By assumption, each $c_i = 0$.

16. Let $A, B$ be $m \times n$ matrices. Show that $\operatorname{rank}(A+B) \le \operatorname{rank}(A) + \operatorname{rank}(B)$.

This is obvious from the following.
\begin{align*}
\operatorname{rank}(A+B) &= \dim(\operatorname{span}(\mathbf{a}_1 + \mathbf{b}_1, \cdots, \mathbf{a}_n + \mathbf{b}_n)) \\
&\le \dim(\operatorname{span}(\mathbf{a}_1, \cdots, \mathbf{a}_n, \mathbf{b}_1, \cdots, \mathbf{b}_n)) \\
&\le \dim(\operatorname{span}(\mathbf{a}_1, \cdots, \mathbf{a}_n)) + \dim(\operatorname{span}(\mathbf{b}_1, \cdots, \mathbf{b}_n)) \\
&= \operatorname{rank}(A) + \operatorname{rank}(B)
\end{align*}

17. Suppose $A$ is an $m \times n$ matrix, $m \ge n$, and the columns of $A$ are independent. Suppose also that $\{\mathbf{z}_1, \cdots, \mathbf{z}_k\}$ is a linearly independent set of vectors in $\mathbb{F}^n$. Show that $\{A\mathbf{z}_1, \cdots, A\mathbf{z}_k\}$ is linearly independent.

Since the columns are independent, if $A\mathbf{x} = \mathbf{0}$, then $\mathbf{x} = \mathbf{0}$. Therefore, the matrix is one to one. If
\[ \sum_{j=1}^{k} c_j A \mathbf{z}_j = \mathbf{0}, \]
then $A\left( \sum_{j=1}^{k} c_j \mathbf{z}_j \right) = \mathbf{0}$ and so $\sum_{j=1}^{k} c_j \mathbf{z}_j = \mathbf{0}$. Therefore each $c_j = 0$.

18. Suppose $A$ is an $m \times n$ matrix and $B$ is an $n \times p$ matrix. Show that
\[ \dim(\ker(AB)) \le \dim(\ker(A)) + \dim(\ker(B)). \]
Hint: Consider the subspace $B(\mathbb{F}^p) \cap \ker(A)$ and suppose a basis for this subspace is $\{\mathbf{w}_1, \cdots, \mathbf{w}_k\}$. Now suppose $\{\mathbf{u}_1, \cdots, \mathbf{u}_r\}$ is a basis for $\ker(B)$. Let $\mathbf{z}_1, \cdots, \mathbf{z}_k$ be such that $B\mathbf{z}_i = \mathbf{w}_i$ and argue that
\[ \ker(AB) \subseteq \operatorname{span}(\mathbf{u}_1, \cdots, \mathbf{u}_r, \mathbf{z}_1, \cdots, \mathbf{z}_k). \]
Here is how you do this. Suppose $AB\mathbf{x} = \mathbf{0}$. Then $B\mathbf{x} \in \ker(A) \cap B(\mathbb{F}^p)$ and so $B\mathbf{x} = \sum_{i=1}^{k} a_i B \mathbf{z}_i$, showing that
\[ \mathbf{x} - \sum_{i=1}^{k} a_i \mathbf{z}_i \in \ker(B). \]

Let $\{\mathbf{w}_1, \cdots, \mathbf{w}_k\}$ be a basis for $B(\mathbb{F}^p) \cap \ker(A)$ and suppose $\{\mathbf{u}_1, \cdots, \mathbf{u}_r\}$ is a basis for $\ker(B)$. Let $B\mathbf{z}_i = \mathbf{w}_i$. Now suppose $\mathbf{x} \in \ker(AB)$. Then $B\mathbf{x} \in \ker(A) \cap B(\mathbb{F}^p)$ and so
\[ B\mathbf{x} = \sum_{i=1}^{k} a_i \mathbf{w}_i = \sum_{i=1}^{k} a_i B \mathbf{z}_i, \]
so
\[ \mathbf{x} - \sum_{i=1}^{k} a_i \mathbf{z}_i \in \ker(B). \]
Hence
\[ \mathbf{x} - \sum_{i=1}^{k} a_i \mathbf{z}_i = \sum_{j=1}^{r} b_j \mathbf{u}_j. \]
This shows that
\[ \dim(\ker(AB)) \le k + r \le \dim(\ker(A)) + \dim(\ker(B)). \]
Note that a little more can be said: $\{\mathbf{u}_1, \cdots, \mathbf{u}_r, \mathbf{z}_1, \cdots, \mathbf{z}_k\}$ is independent. Here is why. Suppose
\[ \sum_i a_i \mathbf{u}_i + \sum_{j=1}^{k} b_j \mathbf{z}_j = \mathbf{0}. \]
Then apply $B$ to both sides and conclude that $\sum_j b_j B\mathbf{z}_j = \mathbf{0}$, which forces each $b_j = 0$ because the $B\mathbf{z}_j = \mathbf{w}_j$ are independent. Then the independence of the $\mathbf{u}_i$ implies each $a_i = 0$. It follows that in fact,
\[ \dim(\ker(AB)) = k + r \le \dim(\ker(A)) + \dim(\ker(B)), \]
and the remaining inequality is an equality if and only if $B(\mathbb{F}^p) \supseteq \ker(A)$.
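The inequality can be sanity checked on random integer matrices, using that $\dim \ker M$ equals the number of columns of $M$ minus its rank. A minimal sketch with NumPy (assumed available; sizes and the random seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def nullity(M):
    # dim ker M = (number of columns) - rank M
    return M.shape[1] - np.linalg.matrix_rank(M)

for _ in range(200):
    A = rng.integers(-2, 3, size=(4, 5)).astype(float)
    B = rng.integers(-2, 3, size=(5, 6)).astype(float)
    assert nullity(A @ B) <= nullity(A) + nullity(B)
print("ok")
```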


19. Let $m < n$ and let $A$ be an $m \times n$ matrix. Show that $A$ is not one to one.

There are more columns than rows, and at most $m$ can be pivot columns, so it follows at least one column is a linear combination of the others. Hence $A$ is not one to one.

20. Let $A$ be an $m \times n$ real matrix and let $\mathbf{b} \in \mathbb{R}^m$. Show there exists a solution $\mathbf{x}$ to the system
\[ A^T A \mathbf{x} = A^T \mathbf{b}. \]
Next show that if $\mathbf{x}, \mathbf{x}_1$ are two solutions, then $A\mathbf{x} = A\mathbf{x}_1$. Hint: First show that $(A^T A)^T = A^T A$. Next show if $\mathbf{x} \in \ker(A^T A)$, then $A\mathbf{x} = \mathbf{0}$. Finally apply the Fredholm alternative. Show $A^T \mathbf{b} \in \ker(A^T A)^{\perp}$. This will give existence of a solution.

$(A^T A)^T = A^T (A^T)^T = A^T A$. Also if $A^T A \mathbf{x} = \mathbf{0}$, then
\[ (A^T A \mathbf{x}, \mathbf{x}) = |A\mathbf{x}|^2 = 0 \]
and so $A\mathbf{x} = \mathbf{0}$. Is $A^T \mathbf{b} \in \ker(A^T A)^{\perp}$? Say $A^T A \mathbf{x} = \mathbf{0}$. Then $A\mathbf{x} = \mathbf{0}$. Then
\[ (A^T \mathbf{b}, \mathbf{x}) = (\mathbf{b}, A\mathbf{x}) = (\mathbf{b}, \mathbf{0}) = 0. \]
Hence $A^T \mathbf{b} \in \ker(A^T A)^{\perp}$ and so there exists a solution to the above system.

21. Show that in the context of Problem 20 that if $\mathbf{x}$ is the solution there, then $|\mathbf{b} - A\mathbf{x}| \le |\mathbf{b} - A\mathbf{y}|$ for every $\mathbf{y}$. Thus $A\mathbf{x}$ is the point of $A(\mathbb{R}^n)$ which is closest to $\mathbf{b}$ among all points of $A(\mathbb{R}^n)$. This is a solution to the least squares problem.
\begin{align*}
|\mathbf{b} - A\mathbf{y}|^2 &= |\mathbf{b} - A\mathbf{x} + A\mathbf{x} - A\mathbf{y}|^2 \\
&= |\mathbf{b} - A\mathbf{x}|^2 + |A\mathbf{x} - A\mathbf{y}|^2 + 2(\mathbf{b} - A\mathbf{x}, A(\mathbf{x} - \mathbf{y})) \\
&= |\mathbf{b} - A\mathbf{x}|^2 + |A\mathbf{x} - A\mathbf{y}|^2 + 2(A^T\mathbf{b} - A^T A \mathbf{x}, \mathbf{x} - \mathbf{y}) \\
&= |\mathbf{b} - A\mathbf{x}|^2 + |A\mathbf{x} - A\mathbf{y}|^2
\end{align*}
and so $A\mathbf{x}$ is closest to $\mathbf{b}$ out of all vectors $A\mathbf{y}$.
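In floating point this characterization can be checked directly: solving the normal equations gives the same minimizer as a library least squares routine. A minimal sketch with NumPy (assumed available; the matrix and right-hand side are arbitrary full-rank examples, not from the text):

```python
import numpy as np

A = np.array([[1., 1.], [1., 2.], [1., 3.], [1., 4.]])
b = np.array([2., 1., 4., 3.])

x_normal = np.linalg.solve(A.T @ A, A.T @ b)      # solve A^T A x = A^T b
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)   # library least squares
print(np.allclose(x_normal, x_lstsq))  # True
```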

22. Here is a point in $\mathbb{R}^4$: $(1,2,3,4)^T$. Find the point in $\operatorname{span}\left( (1,0,2,3)^T, (0,1,3,2)^T \right)$ which is closest to the given point.

This just says to find the least squares solution to the equations
\[ \begin{pmatrix} 1&0\\0&1\\2&3\\3&2 \end{pmatrix} \begin{pmatrix} x\\y \end{pmatrix} = \begin{pmatrix} 1\\2\\3\\4 \end{pmatrix}, \]
that is, to solve
\[ \begin{pmatrix} 1&0\\0&1\\2&3\\3&2 \end{pmatrix}^T \begin{pmatrix} 1&0\\0&1\\2&3\\3&2 \end{pmatrix} \begin{pmatrix} x\\y \end{pmatrix} = \begin{pmatrix} 1&0\\0&1\\2&3\\3&2 \end{pmatrix}^T \begin{pmatrix} 1\\2\\3\\4 \end{pmatrix}, \]
which is
\[ \begin{pmatrix} 14&12\\12&14 \end{pmatrix} \begin{pmatrix} x\\y \end{pmatrix} = \begin{pmatrix} 19\\19 \end{pmatrix}, \quad \text{Solution is: } \begin{pmatrix} \frac{19}{26}\\[2pt] \frac{19}{26} \end{pmatrix}. \]
Then the desired point is
\[ \frac{19}{26} \begin{pmatrix} 1\\0\\2\\3 \end{pmatrix} + \frac{19}{26} \begin{pmatrix} 0\\1\\3\\2 \end{pmatrix} = \begin{pmatrix} \frac{19}{26}\\[2pt] \frac{19}{26}\\[2pt] \frac{95}{26}\\[2pt] \frac{95}{26} \end{pmatrix}. \]
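The arithmetic can be verified in exact rational arithmetic. A minimal sketch solving the $2 \times 2$ normal equations by Cramer's rule:

```python
from fractions import Fraction as F

u = [F(1), F(0), F(2), F(3)]   # first spanning vector
v = [F(0), F(1), F(3), F(2)]   # second spanning vector
b = [F(1), F(2), F(3), F(4)]   # the given point

def dot(x, y):
    return sum(p * q for p, q in zip(x, y))

# Normal equations: (u.u)x + (u.v)y = u.b and (u.v)x + (v.v)y = v.b.
uu, uv, vv = dot(u, u), dot(u, v), dot(v, v)
ub, vb = dot(u, b), dot(v, b)
d = uu * vv - uv * uv
x = (ub * vv - uv * vb) / d
y = (uu * vb - ub * uv) / d
closest = [x * p + y * q for p, q in zip(u, v)]
print(x, y, [str(c) for c in closest])
# 19/26 19/26 ['19/26', '19/26', '95/26', '95/26']
```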

23. ↑Here is a point in $\mathbb{R}^4$: $(1,2,3,4)^T$. Find the point on the plane described by $x + 2y - 4z + 4w = 0$ which is closest to the given point.

The plane consists of the vectors
\[ \begin{pmatrix} -2y + 4z - 4w \\ y \\ z \\ w \end{pmatrix} \]
where $y, z, w$ are arbitrary. Thus it is the span of the vectors
\[ \begin{pmatrix} -2\\1\\0\\0 \end{pmatrix}, \begin{pmatrix} 4\\0\\1\\0 \end{pmatrix}, \begin{pmatrix} -4\\0\\0\\1 \end{pmatrix}, \]
and so you are looking for a least squares solution to the system
\[ \begin{pmatrix} -2&4&-4\\1&0&0\\0&1&0\\0&0&1 \end{pmatrix} \begin{pmatrix} a\\b\\c \end{pmatrix} = \begin{pmatrix} 1\\2\\3\\4 \end{pmatrix}. \]
To find this, you use the method of least squares:
\[ \begin{pmatrix} -2&4&-4\\1&0&0\\0&1&0\\0&0&1 \end{pmatrix}^T \begin{pmatrix} -2&4&-4\\1&0&0\\0&1&0\\0&0&1 \end{pmatrix} \begin{pmatrix} a\\b\\c \end{pmatrix} = \begin{pmatrix} -2&4&-4\\1&0&0\\0&1&0\\0&0&1 \end{pmatrix}^T \begin{pmatrix} 1\\2\\3\\4 \end{pmatrix}, \]
that is,
\[ \begin{pmatrix} 5&-8&8\\-8&17&-16\\8&-16&17 \end{pmatrix} \begin{pmatrix} a\\b\\c \end{pmatrix} = \begin{pmatrix} 0\\7\\0 \end{pmatrix}, \quad \text{Solution is: } \begin{pmatrix} \frac{56}{37}\\[2pt] \frac{147}{37}\\[2pt] \frac{112}{37} \end{pmatrix}. \]
The closest point is then
\[ \frac{56}{37} \begin{pmatrix} -2\\1\\0\\0 \end{pmatrix} + \frac{147}{37} \begin{pmatrix} 4\\0\\1\\0 \end{pmatrix} + \frac{112}{37} \begin{pmatrix} -4\\0\\0\\1 \end{pmatrix} = \begin{pmatrix} \frac{28}{37}\\[2pt] \frac{56}{37}\\[2pt] \frac{147}{37}\\[2pt] \frac{112}{37} \end{pmatrix}. \]
Check:
\[ \begin{pmatrix} 1&2&-4&4 \end{pmatrix} \begin{pmatrix} \frac{28}{37}\\[2pt] \frac{56}{37}\\[2pt] \frac{147}{37}\\[2pt] \frac{112}{37} \end{pmatrix} = 0. \]

24. Suppose $A, B$ are two invertible $n \times n$ matrices. Show there exists a sequence of row operations which when done to $A$ yield $B$. Hint: Recall that every invertible matrix is a product of elementary matrices.

Both $A, B$ have row reduced echelon form equal to $I$. Therefore, they are row equivalent:
\[ E_1 \cdots E_p A = I, \quad F_1 \cdots F_r B = I, \]
and so
\[ E_1 \cdots E_p A = F_1 \cdots F_r B. \]
Now multiply on both sides by the inverses of the $F_i$ to obtain a product of elementary matrices which multiplied by $A$ gives $B$.

25. If $A$ is invertible and $n \times n$ and $B$ is $n \times p$, show that $AB$ has the same null space as $B$ and also the same rank as $B$.

From the above problem, $AB$ and $B$ have exactly the same row reduced echelon form. Thus the result follows.

26. Here are two matrices in row reduced echelon form
\[ A = \begin{pmatrix} 1&0&1\\0&1&1\\0&0&0 \end{pmatrix}, \quad B = \begin{pmatrix} 1&0&0\\0&1&1\\0&0&0 \end{pmatrix}. \]
Does there exist a sequence of row operations which when done to $A$ will yield $B$? Explain.

No. The row reduced echelon form is unique.

27. Is it true that an upper triangular matrix has rank equal to the number of nonzero entries down the main diagonal?

No. Consider
\[ \begin{pmatrix} 1&0&2&0\\0&1&1&7\\0&0&0&1\\0&0&0&0 \end{pmatrix}, \]
which has rank 3 but only two nonzero entries on the main diagonal.

28. Let $\mathbf{v}_1, \cdots, \mathbf{v}_{n-1}$ be vectors in $\mathbb{F}^n$. Describe a systematic way to obtain a vector $\mathbf{v}_n$ which is perpendicular to each of these vectors. Hint: You might consider something like this
\[ \det \begin{pmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \cdots & \mathbf{e}_n \\ v_{11} & v_{12} & \cdots & v_{1n} \\ \vdots & \vdots & & \vdots \\ v_{(n-1)1} & v_{(n-1)2} & \cdots & v_{(n-1)n} \end{pmatrix} \]
where $v_{ij}$ is the $j$th entry of the vector $\mathbf{v}_i$. This is a lot like the cross product.

This is a good hint. You formally expand along the top row. It will work. If you replace $\mathbf{e}_i$ with $v_{ki}$ so that the top row becomes
\[ \begin{pmatrix} v_{k1} & v_{k2} & \cdots & v_{kn} \end{pmatrix}, \]
then the determinant will be equal to 0 because it has two equal rows. However, this determinant is also what you get when you take the dot product of the above formal determinant having the $\mathbf{e}_i$ on the top row with $\mathbf{v}_k$.

29. Let $A$ be an $m \times n$ matrix. Then $\ker(A)$ is a subspace of $\mathbb{F}^n$. Is it true that every subspace of $\mathbb{F}^n$ is the kernel or null space of some matrix? Prove or disprove.

Let $M$ be a subspace of $\mathbb{F}^n$. If it equals $\{\mathbf{0}\}$, consider the matrix $I$. Otherwise, it has a basis $\{\mathbf{m}_1, \cdots, \mathbf{m}_k\}$. Consider the matrix
\[ \begin{pmatrix} \mathbf{m}_1 & \cdots & \mathbf{m}_k & \mathbf{0} \end{pmatrix} \]
where $\mathbf{0}$ is either not there in case $k = n$ or has $n - k$ columns.


30. Let $A$ be an $n \times n$ matrix and let $P^{ij}$ be the permutation matrix which switches the $i$th and $j$th rows of the identity. Show that $P^{ij} A P^{ij}$ produces a matrix which is similar to $A$ which switches the $i$th and $j$th entries on the main diagonal.

This is easy to see when you consider that $P^{ij}$ is its own inverse and that $P^{ij}$ multiplied on the right switches the $i$th and $j$th columns. Thus you switch the columns and then you switch the rows. This has the effect of switching $A_{ii}$ and $A_{jj}$. For example,
\[ \begin{pmatrix} 1&0&0&0\\0&0&0&1\\0&0&1&0\\0&1&0&0 \end{pmatrix} \begin{pmatrix} a&b&c&d\\e&f&z&h\\j&k&l&m\\n&t&h&g \end{pmatrix} \begin{pmatrix} 1&0&0&0\\0&0&0&1\\0&0&1&0\\0&1&0&0 \end{pmatrix} = \begin{pmatrix} a&d&c&b\\n&g&h&t\\j&m&l&k\\e&h&z&f \end{pmatrix}. \]
More formally, the $ii$th entry of $P^{ij} A P^{ij}$ is
\[ \sum_{s,p} P^{ij}_{is} A_{sp} P^{ij}_{pi} = P^{ij}_{ij} A_{jj} P^{ij}_{ji} = A_{jj}. \]
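A quick numerical check of this similarity, using NumPy (assumed available; the $4 \times 4$ size and the switch of rows 2 and 4 are arbitrary choices):

```python
import numpy as np

P = np.eye(4)[[0, 3, 2, 1]]           # permutation matrix switching rows 2 and 4
A = np.arange(16, dtype=float).reshape(4, 4)
B = P @ A @ P                         # P is its own inverse, so B is similar to A
print(B[1, 1] == A[3, 3], B[3, 3] == A[1, 1])  # True True
```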

31. Recall the procedure for finding the inverse of a matrix on Page 48. It was shown that the procedure, when it works, finds the inverse of the matrix. Show that whenever the matrix has an inverse, the procedure works.

If $A$ has an inverse, then it is one to one. Hence the columns are independent. Therefore, they are each pivot columns. Therefore, the row reduced echelon form of $A$ is $I$. This is what was needed for the procedure to work.

F.21 Exercises

5.8

1. Find an LU factorization of $\begin{pmatrix} 1&2&0\\2&1&3\\1&2&3 \end{pmatrix}$.
\[ \begin{pmatrix} 1&2&0\\2&1&3\\1&2&3 \end{pmatrix} = \begin{pmatrix} 1&0&0\\2&1&0\\1&0&1 \end{pmatrix} \begin{pmatrix} 1&2&0\\0&-3&3\\0&0&3 \end{pmatrix} \]
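Any claimed LU factorization is easy to verify by multiplying the factors back together. A minimal sketch for this exercise:

```python
# Multiply the claimed factors and compare with the original matrix.
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

L = [[1, 0, 0], [2, 1, 0], [1, 0, 1]]
U = [[1, 2, 0], [0, -3, 3], [0, 0, 3]]
A = [[1, 2, 0], [2, 1, 3], [1, 2, 3]]
print(matmul(L, U) == A)  # True
```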

2. Find an LU factorization of $\begin{pmatrix} 1&2&3&2\\1&3&2&1\\5&0&1&3 \end{pmatrix}$.
\[ \begin{pmatrix} 1&2&3&2\\1&3&2&1\\5&0&1&3 \end{pmatrix} = \begin{pmatrix} 1&0&0\\1&1&0\\5&-10&1 \end{pmatrix} \begin{pmatrix} 1&2&3&2\\0&1&-1&-1\\0&0&-24&-17 \end{pmatrix} \]

3. Find a PLU factorization of $\begin{pmatrix} 1&2&1\\1&2&2\\2&1&1 \end{pmatrix}$.
\[ \begin{pmatrix} 1&2&1\\1&2&2\\2&1&1 \end{pmatrix} = \begin{pmatrix} 1&0&0\\0&0&1\\0&1&0 \end{pmatrix} \begin{pmatrix} 1&0&0\\2&1&0\\1&0&1 \end{pmatrix} \begin{pmatrix} 1&2&1\\0&-3&-1\\0&0&1 \end{pmatrix} \]

4. Find a PLU factorization of $\begin{pmatrix} 1&2&1&2&1\\2&4&2&4&1\\1&2&1&3&2 \end{pmatrix}$.
\[ \begin{pmatrix} 1&2&1&2&1\\2&4&2&4&1\\1&2&1&3&2 \end{pmatrix} = \begin{pmatrix} 1&0&0\\0&1&0\\0&0&1 \end{pmatrix} \begin{pmatrix} 1&0&0\\2&1&0\\1&0&1 \end{pmatrix} \begin{pmatrix} 1&2&1&2&1\\0&0&0&0&-1\\0&0&0&1&1 \end{pmatrix} \]

5. Find a PLU factorization of $\begin{pmatrix} 1&2&1\\1&2&2\\2&4&1\\3&2&1 \end{pmatrix}$.
\[ \begin{pmatrix} 1&2&1\\1&2&2\\2&4&1\\3&2&1 \end{pmatrix} = \begin{pmatrix} 1&0&0&0\\0&0&0&1\\0&0&1&0\\0&1&0&0 \end{pmatrix} \begin{pmatrix} 1&0&0&0\\3&1&0&0\\2&0&1&0\\1&0&-1&1 \end{pmatrix} \begin{pmatrix} 1&2&1\\0&-4&-2\\0&0&-1\\0&0&0 \end{pmatrix} \]


6. Is there only one LU factorization for a given matrix? Hint: Consider the equation
\[ \begin{pmatrix} 0&1\\0&1 \end{pmatrix} = \begin{pmatrix} 1&0\\1&1 \end{pmatrix} \begin{pmatrix} 0&1\\0&0 \end{pmatrix}. \]
Apparently, LU factorization is not unique:
\[ \begin{pmatrix} 0&1\\0&1 \end{pmatrix} = \begin{pmatrix} 1&0\\1&1 \end{pmatrix} \begin{pmatrix} 0&1\\0&0 \end{pmatrix} = \begin{pmatrix} 1&0\\0&1 \end{pmatrix} \begin{pmatrix} 0&1\\0&1 \end{pmatrix}. \]

7. Here is a matrix and an LU factorization of it.
\[ A = \begin{pmatrix} 1&2&5&0\\1&1&4&9\\0&1&2&5 \end{pmatrix} = \begin{pmatrix} 1&0&0\\1&1&0\\0&-1&1 \end{pmatrix} \begin{pmatrix} 1&2&5&0\\0&-1&-1&9\\0&0&1&14 \end{pmatrix} \]
Use this factorization to solve the system of equations
\[ A\mathbf{x} = \begin{pmatrix} 1\\2\\3 \end{pmatrix}. \]
First,
\[ \begin{pmatrix} 1&0&0\\1&1&0\\0&-1&1 \end{pmatrix} \begin{pmatrix} a\\b\\c \end{pmatrix} = \begin{pmatrix} 1\\2\\3 \end{pmatrix}, \quad \text{Solution is: } \begin{pmatrix} 1\\1\\4 \end{pmatrix}. \]
Then
\[ \begin{pmatrix} 1&2&5&0\\0&-1&-1&9\\0&0&1&14 \end{pmatrix} \begin{pmatrix} x\\y\\z\\w \end{pmatrix} = \begin{pmatrix} 1\\1\\4 \end{pmatrix}, \quad \text{Solution is: } \begin{pmatrix} 24t_4 - 9\\ 23t_4 - 5\\ 4 - 14t_4\\ t_4 \end{pmatrix}, \ t_4 \in \mathbb{R}. \]
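Since $U$ here is $3 \times 4$, the second solve yields a one-parameter family; substituting it back confirms every member solves the original system. A minimal sketch with NumPy (assumed available; the sampled parameter values are arbitrary):

```python
import numpy as np

A = np.array([[1., 2., 5., 0.],
              [1., 1., 4., 9.],
              [0., 1., 2., 5.]])
b = np.array([1., 2., 3.])
for t in (0.0, 1.0, -2.5):
    x = np.array([24 * t - 9, 23 * t - 5, 4 - 14 * t, t])
    assert np.allclose(A @ x, b)   # the whole family solves Ax = b
print("ok")
```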

8. Find a QR factorization for the matrix $\begin{pmatrix} 1&2&1\\3&-2&1\\1&0&2 \end{pmatrix}$.
\[ \begin{pmatrix} 1&2&1\\3&-2&1\\1&0&2 \end{pmatrix} = \begin{pmatrix} \frac{1}{11}\sqrt{11} & \frac{13}{66}\sqrt{2}\sqrt{11} & -\frac{1}{6}\sqrt{2} \\[4pt] \frac{3}{11}\sqrt{11} & -\frac{5}{66}\sqrt{2}\sqrt{11} & -\frac{1}{6}\sqrt{2} \\[4pt] \frac{1}{11}\sqrt{11} & \frac{1}{33}\sqrt{2}\sqrt{11} & \frac{2}{3}\sqrt{2} \end{pmatrix} \cdot \begin{pmatrix} \sqrt{11} & -\frac{4}{11}\sqrt{11} & \frac{6}{11}\sqrt{11} \\[4pt] 0 & \frac{6}{11}\sqrt{2}\sqrt{11} & \frac{2}{11}\sqrt{2}\sqrt{11} \\[4pt] 0 & 0 & \sqrt{2} \end{pmatrix} \]
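A library QR routine gives a quick check that such a factorization exists, using NumPy (assumed available; note its $Q$ and $R$ may differ from the hand computation by signs):

```python
import numpy as np

A = np.array([[1., 2., 1.],
              [3., -2., 1.],
              [1., 0., 2.]])
Q, R = np.linalg.qr(A)
orthogonal = np.allclose(Q.T @ Q, np.eye(3))   # columns of Q are orthonormal
reproduces = np.allclose(Q @ R, A)             # the product recovers A
print(orthogonal, reproduces)  # True True
```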

9. Find a QR factorization for the matrix $\begin{pmatrix} 1&2&1&0\\3&0&1&1\\1&0&2&1 \end{pmatrix}$.
\[ \begin{pmatrix} 1&2&1&0\\3&0&1&1\\1&0&2&1 \end{pmatrix} = \begin{pmatrix} \frac{1}{11}\sqrt{11} & \frac{1}{11}\sqrt{10}\sqrt{11} & 0 \\[4pt] \frac{3}{11}\sqrt{11} & -\frac{3}{110}\sqrt{10}\sqrt{11} & -\frac{1}{10}\sqrt{2}\sqrt{5} \\[4pt] \frac{1}{11}\sqrt{11} & -\frac{1}{110}\sqrt{10}\sqrt{11} & \frac{3}{10}\sqrt{2}\sqrt{5} \end{pmatrix} \cdot \begin{pmatrix} \sqrt{11} & \frac{2}{11}\sqrt{11} & \frac{6}{11}\sqrt{11} & \frac{4}{11}\sqrt{11} \\[4pt] 0 & \frac{2}{11}\sqrt{10}\sqrt{11} & \frac{1}{22}\sqrt{10}\sqrt{11} & -\frac{2}{55}\sqrt{10}\sqrt{11} \\[4pt] 0 & 0 & \frac{1}{2}\sqrt{2}\sqrt{5} & \frac{1}{5}\sqrt{2}\sqrt{5} \end{pmatrix} \]

10. If you had a QR factorization, $A = QR$, describe how you could use it to solve the equation $A\mathbf{x} = \mathbf{b}$. Note, this is not how people usually solve systems of equations.

You could first solve $Q\mathbf{y} = \mathbf{b}$ to obtain $\mathbf{y} = Q^T \mathbf{b}$. Then you would solve $R\mathbf{x} = \mathbf{y}$, which would be easy because $R$ is upper triangular.

11. If $Q$ is an orthogonal matrix, show the columns are an orthonormal set. That is, show that for
\[ Q = \begin{pmatrix} \mathbf{q}_1 & \cdots & \mathbf{q}_n \end{pmatrix} \]
it follows that $\mathbf{q}_i \cdot \mathbf{q}_j = \delta_{ij}$. Also show that any orthonormal set of vectors is linearly independent.

Say $\sum_{i=1}^{n} c_i \mathbf{q}_i = \mathbf{0}$. Then take the inner product of both sides with $\mathbf{q}_k$ to obtain
\[ \sum_i c_i \delta_{ik} = 0. \]
Therefore, $c_k = 0$ for each $k$ and so the vectors are independent.

12. Show you can't expect uniqueness for QR factorizations. Consider
\[ \begin{pmatrix} 0&0&0\\0&0&1\\0&0&1 \end{pmatrix} \]
and verify this equals
\[ \begin{pmatrix} 0&1&0\\ \frac12\sqrt{2}&0&\frac12\sqrt{2} \\ \frac12\sqrt{2}&0&-\frac12\sqrt{2} \end{pmatrix} \begin{pmatrix} 0&0&\sqrt{2}\\0&0&0\\0&0&0 \end{pmatrix} \]
and also
\[ \begin{pmatrix} 1&0&0\\0&1&0\\0&0&1 \end{pmatrix} \begin{pmatrix} 0&0&0\\0&0&1\\0&0&1 \end{pmatrix}. \]
Using Definition 5.7.4, can it be concluded that if $A$ is an invertible matrix it will follow there is only one QR factorization?
\[ \begin{pmatrix} 0&1&0\\ \frac12\sqrt{2}&0&\frac12\sqrt{2} \\ \frac12\sqrt{2}&0&-\frac12\sqrt{2} \end{pmatrix} \begin{pmatrix} 0&0&\sqrt{2}\\0&0&0\\0&0&0 \end{pmatrix} = \begin{pmatrix} 0&0&0\\0&0&1\\0&0&1 \end{pmatrix} \]
\[ \begin{pmatrix} 1&0&0\\0&1&0\\0&0&1 \end{pmatrix} \begin{pmatrix} 0&0&0\\0&0&1\\0&0&1 \end{pmatrix} = \begin{pmatrix} 0&0&0\\0&0&1\\0&0&1 \end{pmatrix} \]
Now if $A^{-1}$ exists and $A = QR = Q'R'$ where $R, R'$ have positive entries down the main diagonal, then you get
\[ R(R')^{-1} = Q^T Q' \equiv U, \]
where $U$ is an orthogonal matrix. Thus you have an upper triangular matrix which has positive entries down the diagonal equal to an orthogonal matrix. Obviously, this requires that the upper triangular matrix is the identity. Hence $R = R'$ and $Q = Q'$. Why is this so obvious? Consider the following case of a $3 \times 3$ upper triangular matrix which is orthogonal:
\[ \begin{pmatrix} a&b&c\\0&d&e\\0&0&f \end{pmatrix} \]
Then this matrix times its transpose is the identity.
\[ \begin{pmatrix} a&b&c\\0&d&e\\0&0&f \end{pmatrix} \begin{pmatrix} a&0&0\\b&d&0\\c&e&f \end{pmatrix} = \begin{pmatrix} a^2+b^2+c^2 & bd+ce & cf \\ bd+ce & d^2+e^2 & ef \\ cf & ef & f^2 \end{pmatrix} \]
\[ \begin{pmatrix} a&0&0\\b&d&0\\c&e&f \end{pmatrix} \begin{pmatrix} a&b&c\\0&d&e\\0&0&f \end{pmatrix} = \begin{pmatrix} a^2 & ab & ac \\ ab & b^2+d^2 & bc+de \\ ac & bc+de & c^2+e^2+f^2 \end{pmatrix} \]
Thus both of these products equal the identity. Since the diagonal entries are all positive, the second product gives $a = 1$, then $b = c = 0$, then $d = 1$, $e = 0$, and $f = 1$, so the matrix is the identity. In the general $n \times n$ case, the off diagonal entries must likewise all equal 0, since otherwise the magnitude of some column would not equal 1.

13. Suppose $\mathbf{a}_1, \cdots, \mathbf{a}_n$ are linearly independent vectors in $\mathbb{R}^n$ and let
\[ A = \begin{pmatrix} \mathbf{a}_1 & \cdots & \mathbf{a}_n \end{pmatrix}. \]
Form a QR factorization for $A$,
\[ \begin{pmatrix} \mathbf{a}_1 & \cdots & \mathbf{a}_n \end{pmatrix} = \begin{pmatrix} \mathbf{q}_1 & \cdots & \mathbf{q}_n \end{pmatrix} \begin{pmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ 0 & r_{22} & \cdots & r_{2n} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & r_{nn} \end{pmatrix}. \]
Show that for each $k \le n$,
\[ \operatorname{span}(\mathbf{a}_1, \cdots, \mathbf{a}_k) = \operatorname{span}(\mathbf{q}_1, \cdots, \mathbf{q}_k). \]
Prove that every subspace of $\mathbb{R}^n$ has an orthonormal basis. The procedure just described is similar to the Gram Schmidt procedure which will be presented later.

From the way we multiply matrices,
\[ \mathbf{a}_k \in \operatorname{span}(\mathbf{q}_1, \cdots, \mathbf{q}_k). \]
Hence $\operatorname{span}(\mathbf{a}_1, \cdots, \mathbf{a}_k) \subseteq \operatorname{span}(\mathbf{q}_1, \cdots, \mathbf{q}_k)$. In addition to this, each $r_{ii} > 0$ because the $\mathbf{a}_i$ are a linearly independent set; if this were not so, then the triangular matrix on the right would fail to be invertible, which would be a contradiction since you could express it as the product of two invertible matrices. Multiplying on the right by its inverse, you can get a similar expression for the $\mathbf{q}_i$ and see that $\operatorname{span}(\mathbf{q}_1, \cdots, \mathbf{q}_k) \subseteq \operatorname{span}(\mathbf{a}_1, \cdots, \mathbf{a}_k)$. Thus
\[ \operatorname{span}(\mathbf{a}_1, \cdots, \mathbf{a}_k) = \operatorname{span}(\mathbf{q}_1, \cdots, \mathbf{q}_k). \]
If you have a subspace, let $\{\mathbf{a}_1, \cdots, \mathbf{a}_k\}$ be a basis and then extend to a basis of $\mathbb{R}^n$, $\{\mathbf{a}_1, \cdots, \mathbf{a}_n\}$. Then form the above QR factorization and you will have that $\{\mathbf{q}_1, \cdots, \mathbf{q}_k\}$ is an orthonormal basis for the subspace.

14. Suppose $Q_n R_n$ converges to an orthogonal matrix $Q$ where $Q_n$ is orthogonal and $R_n$ is upper triangular having all positive entries on the diagonal. Show that then $Q_n$ converges to $Q$ and $R_n$ converges to the identity.

Let
\[ Q = (\mathbf{q}_1, \cdots, \mathbf{q}_n), \quad Q_k = (\mathbf{q}_1^k, \cdots, \mathbf{q}_n^k), \]
where the $\mathbf{q}$ are the columns. Also denote by $r_{ij}^k$ the $ij$th entry of $R_k$. Thus
\[ Q_k R_k = (\mathbf{q}_1^k, \cdots, \mathbf{q}_n^k) \begin{pmatrix} r_{11}^k & & * \\ & \ddots & \\ 0 & & r_{nn}^k \end{pmatrix}. \]
It follows
\[ r_{11}^k \mathbf{q}_1^k \to \mathbf{q}_1 \]
and so
\[ r_{11}^k = \left| r_{11}^k \mathbf{q}_1^k \right| \to 1. \]
Therefore,
\[ \mathbf{q}_1^k \to \mathbf{q}_1. \]
Next consider the second column.
\[ r_{12}^k \mathbf{q}_1^k + r_{22}^k \mathbf{q}_2^k \to \mathbf{q}_2 \]
Taking the inner product of both sides with $\mathbf{q}_1^k$, it follows
\[ \lim_{k \to \infty} r_{12}^k = \lim_{k \to \infty} \left( \mathbf{q}_2 \cdot \mathbf{q}_1^k \right) = \left( \mathbf{q}_2 \cdot \mathbf{q}_1 \right) = 0. \]
Therefore,
\[ \lim_{k \to \infty} r_{22}^k \mathbf{q}_2^k = \mathbf{q}_2, \]
and since $r_{22}^k > 0$, it follows, as in the first part, that $r_{22}^k \to 1$. Hence
\[ \lim_{k \to \infty} \mathbf{q}_2^k = \mathbf{q}_2. \]
Continuing this way, it follows
\[ \lim_{k \to \infty} r_{ij}^k = 0 \]
for all $i \neq j$ and
\[ \lim_{k \to \infty} r_{jj}^k = 1, \quad \lim_{k \to \infty} \mathbf{q}_j^k = \mathbf{q}_j. \]
Thus $R_k \to I$ and $Q_k \to Q$.


F.22 Exercises

6.6

1. Maximize and minimize $z = x_1 - 2x_2 + x_3$ subject to the constraints $x_1 + x_2 + x_3 \le 10$, $x_1 + x_2 + x_3 \ge 2$, and $x_1 + 2x_2 + x_3 \le 7$ if possible. All variables are nonnegative.

The constraints lead to the augmented matrix
\[ \begin{pmatrix} 1&1&1&1&0&0&10\\1&1&1&0&-1&0&2\\1&2&1&0&0&1&7 \end{pmatrix} \]
The obvious solution is not feasible. Do a row operation.
\[ \begin{pmatrix} 1&1&1&1&0&0&10\\0&0&0&1&1&0&8\\0&1&0&-1&0&1&-3 \end{pmatrix} \]
An obvious solution is still not feasible. Do a couple more row operations.
\[ \begin{pmatrix} 1&1&1&0&-1&0&2\\0&0&0&1&1&0&8\\0&1&0&0&1&1&5 \end{pmatrix} \]
At this point, you can spot an obvious feasible solution. Now assemble the simplex tableau.
\[ \begin{pmatrix} 1&1&1&0&-1&0&0&2\\0&0&0&1&1&0&0&8\\0&1&0&0&1&1&0&5\\-1&2&-1&0&0&0&1&0 \end{pmatrix} \]
Now preserve the simple columns.
\[ \begin{pmatrix} 1&1&1&0&-1&0&0&2\\0&0&0&1&1&0&0&8\\0&1&0&0&1&1&0&5\\0&3&0&0&-1&0&1&2 \end{pmatrix} \]
First let's work on minimizing this. There is a $+3$ in the second column. The ratios are then 2 and 5, so the pivot is the 1 on the top of the second column. The next tableau is
\[ \begin{pmatrix} 1&1&1&0&-1&0&0&2\\0&0&0&1&1&0&0&8\\-1&0&-1&0&2&1&0&3\\-3&0&-3&0&2&0&1&-4 \end{pmatrix} \]
There is a 2 on the bottom. The ratios of interest for that column are $3/2$ and $8$, and so the pivot is the 2 in that column. Then the next tableau is
\[ \begin{pmatrix} \frac12&1&\frac12&0&0&\frac12&0&\frac72\\[2pt] \frac12&0&\frac12&1&0&-\frac12&0&\frac{13}{2}\\[2pt] -1&0&-1&0&2&1&0&3\\ -2&0&-2&0&0&-1&1&-7 \end{pmatrix} \]
Now you stop because there are no more positive numbers to the left of 1 on the bottom row. The minimum is $-7$ and it occurs when $x_1 = x_3 = x_6 = 0$ and $x_2 = 7/2$, $x_4 = 13/2$, $x_5 = 3/2$.


Next consider maximization. The simplex tableau was
\[ \begin{pmatrix} 1&1&1&0&-1&0&0&2\\0&0&0&1&1&0&0&8\\0&1&0&0&1&1&0&5\\-1&2&-1&0&0&0&1&0 \end{pmatrix} \]
This time you work on getting rid of the negative entries. Consider the $-1$ in the first column. There is only one ratio to consider, so 1 is the pivot.
\[ \begin{pmatrix} 1&1&1&0&-1&0&0&2\\0&0&0&1&1&0&0&8\\0&1&0&0&1&1&0&5\\0&3&0&0&-1&0&1&2 \end{pmatrix} \]
There remains a $-1$. The ratios are 5 and 8, so the next pivot is the 1 in the third row and column 5.
\[ \begin{pmatrix} 1&2&1&0&0&1&0&7\\0&-1&0&1&0&-1&0&3\\0&1&0&0&1&1&0&5\\0&4&0&0&0&1&1&7 \end{pmatrix} \]
Then no more negatives remain, so the maximum is 7 and it occurs when $x_1 = 7$, $x_2 = 0$, $x_3 = 0$, $x_4 = 3$, $x_5 = 5$, $x_6 = 0$.
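Because a linear program attains its extrema at vertices of the feasible region, both answers can be confirmed by brute-force vertex enumeration in exact arithmetic. A minimal sketch (this is a check, not the simplex method itself):

```python
from fractions import Fraction as F
from itertools import combinations

# Each row (a, b, c, r) is the plane a*x1 + b*x2 + c*x3 = r obtained by
# making one of the six constraints of the exercise active.
planes = [
    (1, 1, 1, 10),                              # x1 + x2 + x3 <= 10
    (1, 1, 1, 2),                               # x1 + x2 + x3 >= 2
    (1, 2, 1, 7),                               # x1 + 2*x2 + x3 <= 7
    (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0),   # x_i >= 0
]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
          - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
          + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(rows):
    # Cramer's rule on three active planes; None when they don't meet in a point.
    a = [[F(r[j]) for j in range(3)] for r in rows]
    b = [F(r[3]) for r in rows]
    d = det3(a)
    if d == 0:
        return None
    sol = []
    for j in range(3):
        m = [row[:] for row in a]
        for i in range(3):
            m[i][j] = b[i]
        sol.append(det3(m) / d)
    return sol

def feasible(x):
    s = x[0] + x[1] + x[2]
    return (all(v >= 0 for v in x) and F(2) <= s <= F(10)
            and x[0] + 2 * x[1] + x[2] <= F(7))

values = []
for rows in combinations(planes, 3):
    x = solve3(rows)
    if x is not None and feasible(x):
        values.append(x[0] - 2 * x[1] + x[2])   # objective z
print(max(values), min(values))  # 7 -7
```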

2. Maximize and minimize the following if possible. All variables are nonnegative.

(a) $z = x_1 - 2x_2$ subject to the constraints $x_1 + x_2 + x_3 \le 10$, $x_1 + x_2 + x_3 \ge 1$, and $x_1 + 2x_2 + x_3 \le 7$.

An augmented matrix for the constraints is
\[ \begin{pmatrix} 1&1&1&1&0&0&10\\1&1&1&0&-1&0&1\\1&2&1&0&0&1&7 \end{pmatrix} \]
The obvious solution is not feasible. Do some row operations.
\[ \begin{pmatrix} 0&0&0&1&1&0&9\\1&1&1&0&-1&0&1\\-1&0&-1&0&2&1&5 \end{pmatrix} \]
Now the obvious solution is feasible. Then include the objective function.
\[ \begin{pmatrix} 0&0&0&1&1&0&0&9\\1&1&1&0&-1&0&0&1\\-1&0&-1&0&2&1&0&5\\-1&2&0&0&0&0&1&0 \end{pmatrix} \]
First preserve the simple columns.
\[ \begin{pmatrix} 0&0&0&1&1&0&0&9\\1&1&1&0&-1&0&0&1\\-1&0&-1&0&2&1&0&5\\-3&0&-2&0&2&0&1&-2 \end{pmatrix} \]


Let's try to maximize first. Begin with the first column. The only pivot is the 1. Use it.
\[ \begin{pmatrix} 0&0&0&1&1&0&0&9\\1&1&1&0&-1&0&0&1\\0&1&0&0&1&1&0&6\\0&3&1&0&-1&0&1&1 \end{pmatrix} \]
There is still a $-1$ on the bottom row to the left of the 1. The ratios are 9 and 6, so the new pivot is the 1 on the third row.
\[ \begin{pmatrix} 0&-1&0&1&0&-1&0&3\\1&2&1&0&0&1&0&7\\0&1&0&0&1&1&0&6\\0&4&1&0&0&1&1&7 \end{pmatrix} \]
Then the maximum is 7 when $x_1 = 7$ and $x_2, x_3 = 0$.

Next consider the minimum.
\[ \begin{pmatrix} 0&0&0&1&1&0&0&9\\1&1&1&0&-1&0&0&1\\-1&0&-1&0&2&1&0&5\\-3&0&-2&0&2&0&1&-2 \end{pmatrix} \]
There is a positive 2 in the bottom row left of 1. The pivot in that column is the 2.
\[ \begin{pmatrix} \frac12&0&\frac12&1&0&-\frac12&0&\frac{13}{2}\\[2pt] \frac12&1&\frac12&0&0&\frac12&0&\frac72\\[2pt] -1&0&-1&0&2&1&0&5\\ -2&0&-1&0&0&-1&1&-7 \end{pmatrix} \]
The minimum is $-7$ and it happens when $x_1 = 0$, $x_2 = 7/2$, $x_3 = 0$.

(b) $z = x_1 - 2x_2 - 3x_3$ subject to the constraints $x_1 + x_2 + x_3 \le 8$, $x_1 + x_2 + 3x_3 \ge 1$, and $x_1 + x_2 + x_3 \le 7$.

This time, let's use artificial variables to find an initial simplex tableau. Thus you add in an artificial variable and then do a minimization procedure.
\[ \begin{pmatrix} 1&1&1&1&0&0&0&0&8\\1&1&3&0&-1&0&1&0&1\\1&1&1&0&0&1&0&0&7\\0&0&0&0&0&0&-1&1&0 \end{pmatrix} \]
First preserve the seventh column as a simple column by a row operation.
\[ \begin{pmatrix} 1&1&1&1&0&0&0&0&8\\1&1&3&0&-1&0&1&0&1\\1&1&1&0&0&1&0&0&7\\1&1&3&0&-1&0&0&1&1 \end{pmatrix} \]
Now use the third column.
\[ \begin{pmatrix} \frac23&\frac23&0&1&\frac13&0&-\frac13&0&\frac{23}{3}\\[2pt] 1&1&3&0&-1&0&1&0&1\\[2pt] \frac23&\frac23&0&0&\frac13&1&-\frac13&0&\frac{20}{3}\\ 0&0&0&0&0&0&-1&1&0 \end{pmatrix} \]


It follows that a basic solution is feasible once the artificial variable is dropped:
\[ \begin{pmatrix} \frac23&\frac23&0&1&\frac13&0&\frac{23}{3}\\[2pt] 1&1&3&0&-1&0&1\\[2pt] \frac23&\frac23&0&0&\frac13&1&\frac{20}{3} \end{pmatrix} \]
Now assemble the simplex tableau.
\[ \begin{pmatrix} \frac23&\frac23&0&1&\frac13&0&0&\frac{23}{3}\\[2pt] 1&1&3&0&-1&0&0&1\\[2pt] \frac23&\frac23&0&0&\frac13&1&0&\frac{20}{3}\\ -1&2&3&0&0&0&1&0 \end{pmatrix} \]
Preserve the simple columns by doing row operations.
\[ \begin{pmatrix} \frac23&\frac23&0&1&\frac13&0&0&\frac{23}{3}\\[2pt] 1&1&3&0&-1&0&0&1\\[2pt] \frac23&\frac23&0&0&\frac13&1&0&\frac{20}{3}\\ -2&1&0&0&1&0&1&-1 \end{pmatrix} \]

Let's do minimization first. Work with the second column.
\[ \begin{pmatrix} 0&0&-2&1&1&0&0&7\\1&1&3&0&-1&0&0&1\\0&0&-2&0&1&1&0&6\\-3&0&-3&0&2&0&1&-2 \end{pmatrix} \]
Recall how you have to pick the pivot correctly. There is still a positive number in the bottom row left of the 1. Work with that column.
\[ \begin{pmatrix} 0&0&0&1&0&-1&0&1\\1&1&1&0&0&1&0&7\\0&0&-2&0&1&1&0&6\\-3&0&1&0&0&-2&1&-14 \end{pmatrix} \]
There is still a positive number to the left of 1 on the bottom row.
\[ \begin{pmatrix} 0&0&0&1&0&-1&0&1\\1&1&1&0&0&1&0&7\\2&2&0&0&1&3&0&20\\-4&-1&0&0&0&-3&1&-21 \end{pmatrix} \]
It follows that the minimum is $-21$ and it occurs when $x_1 = x_2 = 0$, $x_3 = 7$.

Next consider the maximum. The simplex tableau was
\[ \begin{pmatrix} \frac23&\frac23&0&1&\frac13&0&0&\frac{23}{3}\\[2pt] 1&1&3&0&-1&0&0&1\\[2pt] \frac23&\frac23&0&0&\frac13&1&0&\frac{20}{3}\\ -2&1&0&0&1&0&1&-1 \end{pmatrix} \]
Use the first column.
\[ \begin{pmatrix} 0&0&-2&1&1&0&0&7\\1&1&3&0&-1&0&0&1\\0&0&-2&0&1&1&0&6\\0&3&6&0&-1&0&1&1 \end{pmatrix} \]
There is still a negative on the bottom row to the left of 1.
\[ \begin{pmatrix} 0&0&0&1&0&-1&0&1\\1&1&1&0&0&1&0&7\\0&0&-2&0&1&1&0&6\\0&3&4&0&0&1&1&7 \end{pmatrix} \]
There are no more negatives on the bottom row left of 1, so stop. The maximum is 7 and it occurs when $x_1 = 7$, $x_2 = 0$, $x_3 = 0$.

(c) $z = 2x_1 + x_2$ subject to the constraints $x_1 - x_2 + x_3 \le 10$, $x_1 + x_2 + x_3 \ge 1$, and $x_1 + 2x_2 + x_3 \le 7$.

The augmented matrix for the constraints is
\[ \begin{pmatrix} 1&-1&1&1&0&0&10\\1&1&1&0&-1&0&1\\1&2&1&0&0&1&7 \end{pmatrix} \]
The basic solution is not feasible because of that $-1$. Let's do a row operation to change this. I used the 1 in the second column as a pivot and zeroed out what was above and below it. Now it seems that the basic solution is feasible.
\[ \begin{pmatrix} 2&0&2&1&-1&0&11\\1&1&1&0&-1&0&1\\-1&0&-1&0&2&1&5 \end{pmatrix} \]
Assemble the simplex tableau.
\[ \begin{pmatrix} 2&0&2&1&-1&0&0&11\\1&1&1&0&-1&0&0&1\\-1&0&-1&0&2&1&0&5\\-2&-1&0&0&0&0&1&0 \end{pmatrix} \]
Then do a row operation to preserve the simple columns.
\[ \begin{pmatrix} 2&0&2&1&-1&0&0&11\\1&1&1&0&-1&0&0&1\\-1&0&-1&0&2&1&0&5\\-1&0&1&0&-1&0&1&1 \end{pmatrix} \]
Let's do minimization first. Work with the third column because there is a positive entry on the bottom.
\[ \begin{pmatrix} 0&-2&0&1&1&0&0&9\\1&1&1&0&-1&0&0&1\\0&1&0&0&1&1&0&6\\-2&-1&0&0&0&0&1&0 \end{pmatrix} \]
It follows that the minimum is 0 and it occurs when $x_1 = x_2 = 0$, $x_3 = 1$.

Now lets maximize.

2 0 2 1 −1 0 0 111 1 1 0 −1 0 0 1−1 0 −1 0 2 1 0 5−1 0 1 0 −1 0 1 1

Saylor URL: http://www.saylor.org/courses/ma212/ The Saylor Foundation

Page 51: SolutionsManual,LinearAlgebraTheoryAnd Applications · SolutionsManual,LinearAlgebraTheoryAnd Applications ... Factor x3 +8 as a product of linear factors. ... Find the solutions

Exercises 51

Let's begin with the first column.

0  −2  0  1   1  0  0  9
1   1  1  0  −1  0  0  1
0   1  0  0   1  1  0  6
0   1  2  0  −2  0  1  2

There is still a −2 to the left of 1 in the bottom row.

0  −3  0  1  0  −1  0  3
1   2  1  0  0   1  0  7
0   1  0  0  1   1  0  6
0   3  2  0  0   2  1  14

There are no negatives left, so the maximum is 14 and it happens when x1 = 7, x2 = x3 = 0.
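As a numerical cross-check of part (c) (not part of the original hand computation), the same maximum and minimum can be obtained with SciPy's linprog, assuming SciPy is available. The ≥ constraint is negated to fit the A_ub x ≤ b_ub form, and maximization is done by negating the objective.

```python
import numpy as np
from scipy.optimize import linprog

# Constraints: x1 - x2 + x3 <= 10, x1 + x2 + x3 >= 1, x1 + 2x2 + x3 <= 7, x >= 0.
A_ub = np.array([[1.0, -1.0, 1.0],
                 [-1.0, -1.0, -1.0],   # x1 + x2 + x3 >= 1, negated
                 [1.0, 2.0, 1.0]])
b_ub = np.array([10.0, -1.0, 7.0])

# Maximize z = 2x1 + x2 by minimizing -z; linprog's default bounds give x >= 0.
res_max = linprog([-2.0, -1.0, 0.0], A_ub=A_ub, b_ub=b_ub, method="highs")
# Minimize z = 2x1 + x2 directly.
res_min = linprog([2.0, 1.0, 0.0], A_ub=A_ub, b_ub=b_ub, method="highs")

print(-res_max.fun, res_max.x)   # maximum 14 at (7, 0, 0)
print(res_min.fun)               # minimum 0
```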

(d) z = x1 + 2x2 subject to the constraints x1 − x2 + x3 ≤ 10, x1 + x2 + x3 ≥ 1, and x1 + 2x2 + x3 ≤ 7.

The augmented matrix for the constraints is

1  −1  1  1   0  0  10
1   1  1  0  −1  0  1
1   2  1  0   0  1  7

Of course the obvious or basic solution is not feasible. Do a row operation involving a pivot in the second row to try and fix this.

 2  0   2  1  −1  0  11
 1  1   1  0  −1  0  1
−1  0  −1  0   2  1  5

Now all is well. Begin to assemble the simplex tableau.

 2   0   2  1  −1  0  0  11
 1   1   1  0  −1  0  0  1
−1   0  −1  0   2  1  0  5
−1  −2   0  0   0  0  1  0

Do a row operation to preserve the simple columns.

 2  0   2  1  −1  0  0  11
 1  1   1  0  −1  0  0  1
−1  0  −1  0   2  1  0  5
 1  0   2  0  −2  0  1  2

Next let's maximize. There is only one negative number in the bottom row to the left of the 1.

3/2  0  3/2  1  0  1/2  0  27/2
1/2  1  1/2  0  0  1/2  0  7/2
−1   0  −1   0  2  1    0  5
 0   0   1   0  0  1    1  7

Thus the maximum is 7 and it happens when x2 = 7/2, x3 = x1 = 0.


Next let's find the minimum.

 2  0   2  1  −1  0  0  11
 1  1   1  0  −1  0  0  1
−1  0  −1  0   2  1  0  5
 1  0   2  0  −2  0  1  2

Start with the column which has a 2.

 0  −2  0  1   1  0  0  9
 1   1  1  0  −1  0  0  1
 0   1  0  0   1  1  0  6
−1  −2  0  0   0  0  1  0

There are no more positive numbers so the minimum is 0 when x1 = x2 = 0, x3 = 1.
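Part (d) can be cross-checked the same way (a sketch assuming SciPy is available, not part of the original solution):

```python
from scipy.optimize import linprog

# Constraints of part (d): x1 - x2 + x3 <= 10, x1 + x2 + x3 >= 1, x1 + 2x2 + x3 <= 7.
A_ub = [[1.0, -1.0, 1.0], [-1.0, -1.0, -1.0], [1.0, 2.0, 1.0]]
b_ub = [10.0, -1.0, 7.0]

# maximize x1 + 2x2 (negate the objective), then minimize x1 + 2x2
res_max = linprog([-1.0, -2.0, 0.0], A_ub=A_ub, b_ub=b_ub, method="highs")
res_min = linprog([1.0, 2.0, 0.0], A_ub=A_ub, b_ub=b_ub, method="highs")

print(-res_max.fun)   # maximum 7
print(res_min.fun)    # minimum 0
```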

3. Consider contradictory constraints, x1 + x2 ≥ 12 and x1 + 2x2 ≤ 5. You know these two contradict, but show they contradict using the simplex algorithm.

You can do this by using an artificial variable, x5. Thus

1  1  −1  0   1  0  12
1  2   0  1   0  0  5
0  0   0  0  −1  1  0

Do a row operation to preserve the simple columns.

1  1  −1  0  1  0  12
1  2   0  1  0  0  5
1  1  −1  0  0  1  12

Next start with the 1 in the first column.

0  −1  −1  −1  1  0  7
1   2   0   1  0  0  5
0  −1  −1  −1  0  1  7

Thus the minimum value of z = x5 is 7 but, for there to be a feasible solution, you would need to have this minimum value be 0.
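The same contradiction can be exhibited programmatically: a feasibility solve (zero objective) on the two constraints reports infeasibility. This is a sketch assuming SciPy's HiGHS backend, where a returned status of 2 means the problem is infeasible.

```python
from scipy.optimize import linprog

# Feasibility check only (zero objective):
# x1 + x2 >= 12 becomes -x1 - x2 <= -12; x1 + 2x2 <= 5; x >= 0.
res = linprog([0.0, 0.0],
              A_ub=[[-1.0, -1.0], [1.0, 2.0]],
              b_ub=[-12.0, 5.0],
              method="highs")
print(res.status)   # status 2: problem is infeasible
```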

4. Find a solution to the following inequalities for x, y ≥ 0 if it is possible to do so. If it is not possible, prove it is not possible.

(a) 6x + 3y ≥ 4, 8x + 4y ≤ 5

Use an artificial variable. Let x1 = x, x2 = y and slack variables x3, x4 with artificial variable x5. Then minimize x5 as described earlier.

6  3  −1  0   1  0  4
8  4   0  1   0  0  5
0  0   0  0  −1  1  0

Keep the simple columns.

6  3  −1  0  1  0  4
8  4   0  1  0  0  5
6  3  −1  0  0  1  4


Now proceed to minimize.

0  0  −1  −3/4  1  0  1/4
8  4   0   1    0  0  5
0  0  −1  −3/4  0  1  1/4

It appears that the minimum value of x5 is 1/4 and so there is no solution to these inequalities with x1, x2 ≥ 0.

(b) 6x1 + 4x3 ≤ 11, 5x1 + 4x2 + 4x3 ≥ 8, 6x1 + 6x2 + 5x3 ≤ 11

The augmented matrix is

6  0  4  1   0  0  11
5  4  4  0  −1  0  8
6  6  5  0   0  1  11

It is not clear whether there is a solution which has all variables nonnegative. However, if you do a row operation using the 5 in the first column as a pivot, you get

0  −24/5  −4/5  1  6/5  0  7/5
5   4      4    0  −1   0  8
0   6/5    1/5  0  6/5  1  7/5

and so a solution is x1 = 8/5, x2 = x3 = 0.

(c) 6x1 + 4x3 ≤ 11, 5x1 + 4x2 + 4x3 ≥ 9, 6x1 + 6x2 + 5x3 ≤ 9

The augmented matrix is

6  0  4  1   0  0  11
5  4  4  0  −1  0  9
6  6  5  0   0  1  9

Let's include an artificial variable and seek to minimize x7.

6  0  4  1   0  0   0  0  11
5  4  4  0  −1  0   1  0  9
6  6  5  0   0  1   0  0  9
0  0  0  0   0  0  −1  1  0

Preserving the simple columns,

6  0  4  1   0  0  0  0  11
5  4  4  0  −1  0  1  0  9
6  6  5  0   0  1  0  0  9
5  4  4  0  −1  0  0  1  9

Use the first column.

0  −6  −1    1   0  −1    0  0  2
0  −1  −1/6  0  −1  −5/6  1  0  3/2
6   6   5    0   0   1    0  0  9
0  −1  −1/6  0  −1  −5/6  0  1  3/2

It appears that the minimum value for x7 is 3/2 and so there will be no solution to these inequalities for which all the variables are nonnegative.


(d) x1 − x2 + x3 ≤ 2, x1 + 2x2 ≥ 4, 3x1 + 2x3 ≤ 7

The augmented matrix is

1  −1  1  1   0  0  2
1   2  0  0  −1  0  4
3   0  2  0   0  1  7

Let's add in an artificial variable and set things up to minimize this artificial variable.

1  −1  1  1   0  0   0  0  2
1   2  0  0  −1  0   1  0  4
3   0  2  0   0  1   0  0  7
0   0  0  0   0  0  −1  1  0

Then

1  −1  1  1   0  0  0  0  2
1   2  0  0  −1  0  1  0  4
3   0  2  0   0  1  0  0  7
1   2  0  0  −1  0  0  1  4

Work with the second column.

3/2  0  1  1  −1/2  0   1/2  0  4
1    2  0  0  −1    0   1    0  4
3    0  2  0   0    1   0    0  7
0    0  0  0   0    0  −1    1  0

It appears the minimum value of x7 is 0 and so there is a solution when x2 = 2, x3 = 0, x1 = 0.

(e) 5x1 − 2x2 + 4x3 ≤ 1, 6x1 − 3x2 + 5x3 ≥ 2, 5x1 − 2x2 + 4x3 ≤ 5

The augmented matrix is

5  −2  4  1   0  0  1
6  −3  5  0  −1  0  2
5  −2  4  0   0  1  5

Let's introduce an artificial variable x7 and then minimize x7.

5  −2  4  1   0  0   0  0  1
6  −3  5  0  −1  0   1  0  2
5  −2  4  0   0  1   0  0  5
0   0  0  0   0  0  −1  1  0

Then, preserving the simple columns,

5  −2  4  1   0  0  0  0  1
6  −3  5  0  −1  0  1  0  2
5  −2  4  0   0  1  0  0  5
6  −3  5  0  −1  0  0  1  2


Work with the first column.

5  −2    4    1     0   0  0  0  1
0  −3/5  1/5  −6/5  −1  0  1  0  4/5
0   0    0    −1     0  1  0  0  4
0  −3/5  1/5  −6/5  −1  0  0  1  4/5

There is still a positive entry.

 5     −2    4  1     0   0  0  0  1
−1/4  −1/2   0  −5/4  −1  0  1  0  3/4
 0      0    0  −1     0  1  0  0  4
−1/4  −1/2   0  −5/4  −1  0  0  1  3/4

It appears that there is no solution to this system of inequalities because the minimum value of x7 is not 0.
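All five systems in this exercise can be screened the same way. The sketch below (assuming SciPy is available) runs a zero-objective feasibility solve on each system, with ≥ rows negated, and compares against the conclusions reached above by hand.

```python
from scipy.optimize import linprog

# Each system is written as A_ub @ x <= b_ub with x >= 0 (>= rows negated).
# The expected feasibility is the conclusion from the hand computations.
systems = {
    "a": ([[-6, -3], [8, 4]],                    [-4, 5],      False),
    "b": ([[6, 0, 4], [-5, -4, -4], [6, 6, 5]],  [11, -8, 11], True),
    "c": ([[6, 0, 4], [-5, -4, -4], [6, 6, 5]],  [11, -9, 9],  False),
    "d": ([[1, -1, 1], [-1, -2, 0], [3, 0, 2]],  [2, -4, 7],   True),
    "e": ([[5, -2, 4], [-6, 3, -5], [5, -2, 4]], [1, -2, 5],   False),
}
feasible = {}
for name, (A, b, expected) in systems.items():
    res = linprog(c=[0.0] * len(A[0]), A_ub=A, b_ub=b, method="highs")
    feasible[name] = (res.status == 0)   # status 0: an optimal (feasible) point found
    print(name, feasible[name], "expected", expected)
```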

5. Minimize z = x1 + x2 subject to x1 + x2 ≥ 2, x1 + 3x2 ≤ 20, x1 + x2 ≤ 18. Change to a maximization problem and solve as follows: Let yi = M − xi. Formulate in terms of y1, y2.

You could find the maximum of 2M − x1 − x2 for the given constraints and this would happen when x1 + x2 is as small as possible. Thus you would maximize y1 + y2 subject to the constraints

M − y1 +M − y2 ≥ 2

M − y1 + 3 (M − y2) ≤ 20

M − y1 +M − y2 ≤ 18

To simplify, this would be

2M − 2 ≥ y1 + y2

4M − 20 ≤ y1 + 3y2

2M − 18 ≤ y1 + y2

You could simply regard M as large enough that yi ≥ 0 and use the techniques just developed. The augmented matrix for the constraints is then

1  1  1   0   0  2M − 2
1  3  0  −1   0  4M − 20
1  1  0   0  −1  2M − 18

Here M is large. Use the 3 as a pivot to zero out above and below it.

2/3  0  1  1/3   0  (2/3)M + 14/3
1    3  0  −1    0  4M − 20
2/3  0  0  1/3  −1  (2/3)M − 34/3

Then it is still the case that the basic solution is not feasible. Let's use the bottom row and pick the 2/3 as a pivot.

0    0  1   0     1    16
0    3  0  −3/2   3/2  3M − 3
2/3  0  0   1/3  −1    (2/3)M − 34/3


Now it appears that the basic solution is feasible provided M is large. Then assemble the simplex tableau.

 0   0  1   0     1    0  16
 0   3  0  −3/2   3/2  0  3M − 3
2/3  0  0   1/3  −1    0  (2/3)M − 34/3
−1  −1  0   0     0    1  0

Do row operations to preserve the simple columns.

 0   0  1   0     1    0  16
 0   3  0  −3/2   3/2  0  3M − 3
2/3  0  0   1/3  −1    0  (2/3)M − 34/3
 0   0  0   0    −1    1  2M − 18

There is a negative number to the left of the 1 on the bottom row and we want to maximize, so work with this column. Assume M is very large. Then the pivot should be the top entry in this column.

 0   0  1     0     1  0  16
 0   3  −3/2  −3/2  0  0  3M − 27
2/3  0  1      1/3  0  0  (2/3)M + 14/3
 0   0  1      0    0  1  2M − 2

It follows that the maximum of y1 + y2 is 2M − 2 and it happens when y1 = M + 7, y2 = M − 9, y3 = 0. Thus the minimum of x1 + x2 is

M − y1 + M − y2 = M − (M + 7) + M − (M − 9) = 2
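The minimum of 2 can also be confirmed directly, without the change of variables, by a small linear program (a cross-check assuming SciPy is available):

```python
from scipy.optimize import linprog

# minimize x1 + x2 subject to x1 + x2 >= 2, x1 + 3x2 <= 20, x1 + x2 <= 18, x >= 0
res = linprog([1.0, 1.0],
              A_ub=[[-1.0, -1.0], [1.0, 3.0], [1.0, 1.0]],
              b_ub=[-2.0, 20.0, 18.0],
              method="highs")
print(res.fun)   # minimum 2
```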

F.23 Exercises

7.3

1. If A is the matrix of a linear transformation which rotates all vectors in R2 through 30°, explain why A cannot have any real eigenvalues.

Because the vectors which result are not parallel to the vector you begin with.

2. If A is an n×n matrix and c is a nonzero constant, compare the eigenvalues of A and cA.

λ→ cλ

3. If A is an invertible n × n matrix, compare the eigenvalues of A and A^−1. More generally, for m an arbitrary integer, compare the eigenvalues of A and A^m.

λ → λ^−1 and λ → λ^m.

4. Let A, B be invertible n × n matrices which commute. That is, AB = BA. Suppose x is an eigenvector of B. Show that then Ax must also be an eigenvector for B.

Say Bx = λx. Then

B(Ax) = A(Bx) = Aλx = λ(Ax)


5. Suppose A is an n × n matrix and it satisfies A^m = A for some m a positive integer larger than 1. Show that if λ is an eigenvalue of A then |λ| equals either 0 or 1.

Let x be the eigenvector. Then A^m x = λ^m x and A^m x = Ax = λx, so

λ^m = λ

Hence if λ ≠ 0, then

λ^(m−1) = 1

and so |λ| = 1.

6. Show that if Ax = λx and Ay = λy, then whenever a, b are scalars,

A (ax+ by) = λ (ax+ by) .

Does this imply that ax + by is an eigenvector? Explain.

The formula is obvious from properties of matrix multiplication. However, this vector might not be an eigenvector because it might equal 0, and eigenvectors are never 0.

7. Find the eigenvalues and eigenvectors of the matrix

−1 −1 7
−1  0 4
−1 −1 5

Determine whether the matrix is defective.

Eigenvectors: (3, 1, 1)^T ↔ 1 and (2, 1, 1)^T ↔ 2. This is a defective matrix: the eigenvalue 1 is repeated but its eigenspace is only one dimensional.
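A quick numerical way to see the defectiveness (a sketch using NumPy, not part of the original solution): the eigenvalue 1 has algebraic multiplicity 2, but A − I has rank 2, so its nullspace — the eigenspace — is only one dimensional.

```python
import numpy as np

A = np.array([[-1.0, -1.0, 7.0],
              [-1.0,  0.0, 4.0],
              [-1.0, -1.0, 5.0]])

lam = np.linalg.eigvals(A)
print(np.sort(lam.real))   # eigenvalues 1, 1, 2

# rank(A - I) = 2, so the eigenspace for lambda = 1 has dimension 3 - 2 = 1,
# which is less than the algebraic multiplicity 2: the matrix is defective.
rank = np.linalg.matrix_rank(A - np.eye(3))
print(rank)
```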

8. Find the eigenvalues and eigenvectors of the matrix

−3 −7 19
−2 −1  8
−2 −3 10

Determine whether the matrix is defective.

Eigenvectors: (3, 1, 1)^T ↔ 1, (1, 2, 1)^T ↔ 2, (2, 1, 1)^T ↔ 3.

This matrix has distinct eigenvalues so it is not defective.

9. Find the eigenvalues and eigenvectors of the matrix

−7 −12 30
−3  −7 15
−3  −6 14

Determine whether the matrix is defective.

Eigenvectors: (−2, 1, 0)^T and (5, 0, 1)^T ↔ −1, (2, 1, 1)^T ↔ 2.

This matrix is not defective because, even though λ = −1 is a repeated eigenvalue, it has a 2 dimensional eigenspace.


10. Find the eigenvalues and eigenvectors of the matrix

 7 −2 0
 8 −1 0
−2  4 6

Determine whether the matrix is defective.

Eigenvectors: (−1/2, −1, 1)^T ↔ 3, (0, 0, 1)^T ↔ 6.

This matrix is defective.

11. Find the eigenvalues and eigenvectors of the matrix

3 −2 −1
0  5  1
0  2  4

Determine whether the matrix is defective.

Eigenvectors: (1, 0, 0)^T and (0, −1/2, 1)^T ↔ 3, (−1, 1, 1)^T ↔ 6.

This matrix is not defective.

12. Find the eigenvalues and eigenvectors of the matrix

6 8 −23
4 5 −16
3 4 −12

Determine whether the matrix is defective.

13. Find the eigenvalues and eigenvectors of the matrix

 5 2  −5
12 3 −10
12 4 −11

Determine whether the matrix is defective.

Eigenvectors: (−1/3, 1, 0)^T and (5/6, 0, 1)^T ↔ −1.

This matrix is defective. In this case, there is only one eigenvalue, −1, of multiplicity 3, but the dimension of the eigenspace is only 2.
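This, too, can be spot-checked numerically (a NumPy sketch, not part of the original solution): the characteristic polynomial should be (λ + 1)³, while A + I has rank 1, i.e., a 2-dimensional nullspace.

```python
import numpy as np

A = np.array([[5.0, 2.0, -5.0],
              [12.0, 3.0, -10.0],
              [12.0, 4.0, -11.0]])

# characteristic polynomial coefficients: (lambda + 1)^3 = lambda^3 + 3 lambda^2 + 3 lambda + 1
print(np.poly(A))

# eigenspace for lambda = -1 is the nullspace of A + I
rank = np.linalg.matrix_rank(A + np.eye(3))
print(3 - rank)   # nullity 2 < algebraic multiplicity 3: defective
```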

14. Find the eigenvalues and eigenvectors of the matrix

20  9 −18
 6  5  −6
30 14 −27

Determine whether the matrix is defective.

Eigenvectors: (3/4, 1/4, 1)^T ↔ −1, (1/2, 1, 1)^T ↔ 2, (9/13, 3/13, 1)^T ↔ −3.

Not defective.

15. Find the eigenvalues and eigenvectors of the matrix

 1  26 −17
 4  −4   4
−9 −18   9

Determine whether the matrix is defective.

16. Find the eigenvalues and eigenvectors of the matrix

 3 −1 −2
11  3 −9
 8  0 −6

Determine whether the matrix is defective.

Eigenvectors: (3/4, 1/4, 1)^T ↔ 0. This one is defective.

17. Find the eigenvalues and eigenvectors of the matrix

 −2  1 2
−11 −2 9
 −8  0 7

Determine whether the matrix is defective.

Eigenvectors: (3/4, 1/4, 1)^T ↔ 1.

This is defective.

18. Find the eigenvalues and eigenvectors of the matrix

2 1 −1
2 3 −2
2 2 −1

Determine whether the matrix is defective.

Eigenvectors: (−1, 1, 0)^T and (1, 0, 1)^T ↔ 1, (1/2, 1, 1)^T ↔ 2.

This is nondefective.

19. Find the complex eigenvalues and eigenvectors of the matrix

4 −2 −2
0  2 −2
2  0  2

Eigenvectors: (1, −1, 1)^T ↔ 4, (−i, −i, 1)^T ↔ 2 − 2i, (i, i, 1)^T ↔ 2 + 2i.

20. Find the eigenvalues and eigenvectors of the matrix

 9  6 −3
 0  6  0
−3 −6  9

Determine whether the matrix is defective.

Eigenvectors: (−2, 1, 0)^T and (1, 0, 1)^T ↔ 6, (−1, 0, 1)^T ↔ 12.

This is nondefective.

21. Find the complex eigenvalues and eigenvectors of the matrix

4 −2 −2
0  2 −2
2  0  2

Determine whether the matrix is defective.

Eigenvectors: (1, −1, 1)^T ↔ 4, (−i, −i, 1)^T ↔ 2 − 2i, (i, i, 1)^T ↔ 2 + 2i.

The eigenvalues are distinct, so it is not defective.

22. Find the complex eigenvalues and eigenvectors of the matrix

−4  2  0
 2 −4  0
−2  2 −2

Determine whether the matrix is defective.

Eigenvectors: (0, 0, 1)^T and (1, 1, 0)^T ↔ −2, (1, −1, 1)^T ↔ −6.

This is not defective.

23. Find the complex eigenvalues and eigenvectors of the matrix

 1  1 −6
 7 −5 −6
−1  7  2

Determine whether the matrix is defective.

Eigenvectors: (1, −1, 1)^T ↔ −6, (−i, −i, 1)^T ↔ 2 − 6i, (i, i, 1)^T ↔ 2 + 6i.

This is not defective.

24. Find the complex eigenvalues and eigenvectors of the matrix

 4 2 0
−2 4 0
−2 2 6

Determine whether the matrix is defective.

Eigenvectors: (0, 0, 1)^T ↔ 6, (1, −i, 1)^T ↔ 4 − 2i, (1, i, 1)^T ↔ 4 + 2i.

This is not defective.

25. Here is a matrix.

1 a 0 0
0 1 b 0
0 0 2 c
0 0 0 2

Find values of a, b, c for which the matrix is defective and values of a, b, c for which it is nondefective.

First consider the eigenvalue λ = 1. The system (A − I)x = 0 is

0 a 0 0 | 0
0 0 b 0 | 0
0 0 1 c | 0
0 0 0 1 | 0

The last two rows force x4 = 0 and then x3 = 0, so the equations reduce to ax2 = 0. If a ≠ 0, the eigenspace for λ = 1 is one dimensional while the eigenvalue has multiplicity 2, so the matrix is defective; if a = 0, this eigenspace is two dimensional. A similar computation for λ = 2 shows its eigenspace is two dimensional exactly when c = 0. Thus the matrix is nondefective exactly when a = 0 and c = 0 (b plays no role), and defective whenever a ≠ 0 or c ≠ 0.

26. Here is a matrix.

a 1 0
0 b 1
0 0 c

where a, b, c are numbers. Show this is sometimes defective depending on the choice of a, b, c. What is an easy case which will ensure it is not defective?

An easy case which will ensure that it is not defective is for a, b, c to be distinct. If there are any repeats among a, b, c, then this matrix is defective.

27. Suppose A is an n×n matrix consisting entirely of real entries but a + ib is a complex eigenvalue having the eigenvector x + iy. Here x and y are real vectors. Show that then a − ib is also an eigenvalue with the eigenvector x − iy. Hint: You should remember that the conjugate of a product of complex numbers equals the product of the conjugates. Here a + ib is a complex number whose conjugate equals a − ib.

A (x+ iy) = (a+ ib) (x+ iy) . Now just take complex conjugates of both sides.


28. Recall an n×n matrix is said to be symmetric if it has all real entries and if A = A^T. Show the eigenvalues of a real symmetric matrix are real and for each eigenvalue, it has a real eigenvector.

Let λ be an eigenvalue which corresponds to x ≠ 0. Conjugating Ax = λx gives Ax̄ = λ̄x̄ (A is real), and transposing, x̄^T A^T = λ̄ x̄^T. Then

λ̄ x̄^T x = x̄^T A^T x = x̄^T Ax = x̄^T x λ

which shows that λ̄ = λ. We have Ax = λx and Ax̄ = λx̄, so

A(x + x̄) = λ(x + x̄)

and x + x̄ is a real vector. (If x + x̄ = 0, use the real vector i(x − x̄) instead.) Hence it has a real eigenvector.

29. Recall an n × n matrix is said to be skew symmetric if it has all real entries and if A = −A^T. Show that any nonzero eigenvalues must be of the form ib where i² = −1. In words, the eigenvalues are either 0 or pure imaginary.

Let A be skew symmetric. Then if x is an eigenvector for λ, as in the previous problem,

λ̄ x̄^T x = x̄^T A^T x = −x̄^T Ax = −x̄^T x λ

and so λ̄ = −λ. Thus, writing λ = a + ib, a − ib = −(a + ib) and so a = 0.

30. Is it possible for a nonzero matrix to have only 0 as an eigenvalue?

Sure. Try this one.

−12 −16
  9  12

31. Show that the eigenvalues and eigenvectors of a real matrix occur in conjugate pairs.

This follows from the observation that, taking conjugates, if Ax = λx then Ax̄ = λ̄x̄.

32. Suppose A is an n × n matrix having all real eigenvalues which are distinct. Show there exists S such that S^−1 A S = D, a diagonal matrix. If

D = diag(λ1, . . . , λn)

define e^D by

e^D ≡ diag(e^λ1, . . . , e^λn)

and define

e^A ≡ S e^D S^−1.

Next show that if A is as just described, so is tA where t is a real number and the eigenvalues of At are tλk. If you differentiate a matrix of functions entry by entry so that for the ijth entry of A′(t) you get a′ij(t) where aij(t) is the ijth entry of A(t), show

d/dt (e^(At)) = A e^(At)

Next show det(e^(At)) ≠ 0. This is called the matrix exponential. Note I have only defined it for the case where the eigenvalues of A are real, but the same procedure will work even for complex eigenvalues. All you have to do is to define what is meant by e^(a+ib).

It is clear that the eigenvalues of tA are tλ where λ is an eigenvalue of A. Thus e^(At) is of the form

e^(At) = S diag(e^(tλ1), . . . , e^(tλn)) S^−1

And so

d/dt (e^(At)) = S diag(λ1 e^(tλ1), . . . , λn e^(tλn)) S^−1
             = S diag(λ1, . . . , λn) diag(e^(tλ1), . . . , e^(tλn)) S^−1
             = S diag(λ1, . . . , λn) S^−1 S diag(e^(tλ1), . . . , e^(tλn)) S^−1
             = A e^(tA)
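Both the diagonalization formula for e^(At) and the differential equation d/dt e^(At) = A e^(At) can be checked numerically. The 2×2 matrix below is a hypothetical example with real, distinct eigenvalues; the sketch assumes NumPy and SciPy are available.

```python
import numpy as np
from scipy.linalg import expm

# A has real, distinct eigenvalues, so A = S D S^{-1} and e^{At} = S e^{Dt} S^{-1}.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
t = 0.5

lam, S = np.linalg.eig(A)
eAt_diag = S @ np.diag(np.exp(lam * t)) @ np.linalg.inv(S)

# compare with SciPy's general-purpose matrix exponential
eAt_scipy = expm(A * t)
print(np.allclose(eAt_diag, eAt_scipy))

# d/dt e^{At} = A e^{At}: central finite difference with a small step h
h = 1e-6
deriv = (expm(A * (t + h)) - expm(A * (t - h))) / (2 * h)
print(np.allclose(deriv, A @ eAt_scipy, atol=1e-4))
```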

33. Find the principal directions determined by the matrix

 7/12  −1/4   1/6
−1/4    7/12 −1/6
 1/6   −1/6   2/3

The eigenvalues are 1/3, 1, and 1/2, listed according to multiplicity.

(1, −1, 1)^T ↔ 1,  (−1/2, 1/2, 1)^T ↔ 1/2,  (1, 1, 0)^T ↔ 1/3

34. Find the principal directions determined by the matrix

 5/3  −1/3  −1/3
−1/3   7/6   1/6
−1/3   1/6   7/6

The eigenvalues are 1, 2, and 1. What is the physical interpretation of the repeated eigenvalue?

There are apparently two directions in which there is no stretching:

(1/2, 1, 0)^T and (1/2, 0, 1)^T ↔ 1,  (−2, 1, 1)^T ↔ 2

35. Find oscillatory solutions to the system of differential equations x′′ = Ax where

A =
−3 −1 −1
−1 −2  0
−1  0 −2

The eigenvalues are −1, −4, and −2.

The eigenvalues are given and so there are oscillatory solutions of the form

(−1, 1, 1)^T (a cos t + b sin t),
(0, −1, 1)^T (c sin(√2 t) + d cos(√2 t)),
(2, 1, 1)^T (e cos 2t + f sin 2t)

where a, b, c, d, e, f are scalars.
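Each claimed eigenpair, and the fact that v cos(ωt) solves x′′ = Ax when Av = −ω²v, can be verified with a short NumPy sketch (not part of the original solution):

```python
import numpy as np

A = np.array([[-3.0, -1.0, -1.0],
              [-1.0, -2.0,  0.0],
              [-1.0,  0.0, -2.0]])

# each eigenpair A v = -w^2 v gives an oscillatory solution v (a cos wt + b sin wt)
pairs = [(np.array([-1.0, 1.0, 1.0]), 1.0),
         (np.array([0.0, -1.0, 1.0]), np.sqrt(2.0)),
         (np.array([2.0, 1.0, 1.0]), 2.0)]
for v, w in pairs:
    print(np.allclose(A @ v, -w**2 * v))

# spot-check that x(t) = v cos(wt) satisfies x'' = A x at an arbitrary time t = 0.3
v, w = pairs[1]
t = 0.3
x = v * np.cos(w * t)
xpp = -w**2 * v * np.cos(w * t)   # second derivative of x(t)
print(np.allclose(xpp, A @ x))
```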

36. Let A and B be n × n matrices and let the columns of B be b1, · · · , bn and the rows of A be a1^T, · · · , an^T. Show the columns of AB are Ab1, · · · , Abn and the rows of AB are a1^T B, · · · , an^T B.

The ith entry of the jth column of AB is ai^T bj. Therefore, these columns are as indicated. Also the ith row of AB will be (ai^T b1, · · · , ai^T bn). Thus this row is just ai^T B.

37. Let M be an n × n matrix. Then define the adjoint of M, denoted by M*, to be the transpose of the conjugate of M. For example,

( 2    i )*   ( 2   1−i )
( 1+i  3 )  = ( −i   3  )

A matrix M is self adjoint if M* = M. Show the eigenvalues of a self adjoint matrix are all real.

First note that (AB)* = B*A*. Say Mx = λx, x ≠ 0. Then

λ̄ |x|² = (λx)* x = (Mx)* x = x* M* x = x* Mx = x* λx = λ |x|²

Hence λ̄ = λ.

38. Let M be an n × n matrix and suppose x1, · · · , xn are n eigenvectors which form a linearly independent set. Form the matrix S by making the columns these vectors. Show that S^−1 exists and that S^−1 M S is a diagonal matrix (one having zeros everywhere except on the main diagonal) having the eigenvalues of M on the main diagonal. When this can be done the matrix is said to be diagonalizable.

Since the vectors are linearly independent, the matrix S has an inverse. Denoting the rows of this inverse by w1^T, . . . , wn^T, it follows by definition that wi^T xj = δij. Therefore,

S^−1 M S = S^−1 (Mx1, · · · , Mxn) = S^−1 (λ1 x1, · · · , λn xn) = diag(λ1, . . . , λn)

39. Show that an n×n matrix M is diagonalizable if and only if F^n has a basis of eigenvectors. Hint: The first part is done in Problem 38. It only remains to show that if the matrix can be diagonalized by some matrix S giving D = S^−1 M S for D a diagonal matrix, then it has a basis of eigenvectors. Try using the columns of the matrix S.

The formula says that MS = SD. Letting xk denote the kth column of S, it follows from the way we multiply matrices that Mxk = λk xk where λk is the kth diagonal entry of D.

40. Let

A =
1 2 2
3 4 0
0 1 3

and let

B =
0 1
1 1
2 1

Multiply AB verifying the block multiplication formula. Here A11 = (1 2 ; 3 4), A12 = (2 ; 0), A21 = (0 1), and A22 = (3).

AB =
1 2 2     0 1     6 5
3 4 0  ·  1 1  =  4 7
0 1 3     2 1     7 4

Now consider doing it by block multiplication, where B1 = (0 1 ; 1 1) is the top 2×2 block of B and B2 = (2 1) is its last row. The top block of AB is

A11 B1 + A12 B2 = (1 2 ; 3 4)(0 1 ; 1 1) + (2 ; 0)(2 1) = (2 3 ; 4 7) + (4 2 ; 0 0) = (6 5 ; 4 7)

and the bottom block is

A21 B1 + A22 B2 = (0 1)(0 1 ; 1 1) + 3 (2 1) = (1 1) + (6 3) = (7 4)

so block multiplication reproduces AB.
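The same block computation can be reproduced with NumPy slicing (a sketch; the partition indices follow the exercise):

```python
import numpy as np

A = np.array([[1, 2, 2],
              [3, 4, 0],
              [0, 1, 3]])
B = np.array([[0, 1],
              [1, 1],
              [2, 1]])

AB = A @ B
print(AB)   # [[6 5] [4 7] [7 4]]

# block multiplication with the same partition as the exercise
A11, A12 = A[:2, :2], A[:2, 2:]
A21, A22 = A[2:, :2], A[2:, 2:]
B1, B2 = B[:2, :], B[2:, :]
top = A11 @ B1 + A12 @ B2        # top two rows of AB
bottom = A21 @ B1 + A22 @ B2     # last row of AB
print(np.vstack([top, bottom]))
```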

41. Suppose A, B are n×n matrices and λ is a nonzero eigenvalue of AB. Show that then it is also an eigenvalue of BA. Hint: Use the definition of what it means for λ to be an eigenvalue. That is,

ABx = λx

where x ≠ 0. Maybe you should multiply both sides by B.

B(ABx) = λBx, hence (BA)(Bx) = λ(Bx). Is Bx = 0? No, it can't be, since λ ≠ 0: if Bx = 0, then the left side of the given equation would be 0 and the right side wouldn't be.

42. Using the above problem show that if A, B are n×n matrices, it is not possible that AB − BA = aI for any a ≠ 0. Hint: First show that if A is a matrix, then the eigenvalues of A − aI are λ − a where λ is an eigenvalue of A.

If it were true that AB − BA = aI for a ≠ 0, then you would get AB − aI = BA. If λ1, · · · , λn are the eigenvalues of AB, then the eigenvalues of AB − aI = BA are λ1 − a, · · · , λn − a, which cannot be the same set, contrary to the above problem.

43. Consider the following matrix.

C =
0 · · · 0  −a0
1   0      −a1
    ⋱       ⋮
0      1   −a_{n−1}

Show det(λI − C) = a0 + a1 λ + · · · + a_{n−1} λ^(n−1) + λ^n. This matrix is called a companion matrix for the given polynomial.

λI − C is of the form

 λ   0  · · ·   a0
−1   λ          a1
     ⋱   ⋱      ⋮
 0      −1   λ + a_{n−1}

Expand along the bottom row. The entry λ + a_{n−1} is multiplied by the determinant of the lower bidiagonal block

 λ   0
−1   λ
     ⋱   ⋱
        −1   λ

which equals λ^(n−1), contributing (λ + a_{n−1}) λ^(n−1). The entry −1 contributes +1 times the determinant

 λ   0   0  · · ·  a0
−1   λ             a1
    −1   λ         a2
         ⋱   ⋱     ⋮
 0          −1    a_{n−2}

That last determinant has the same companion form once the bottom-right entry a_{n−2} is written as λ + (a_{n−2} − λ), so by induction it equals

λ^(n−1) + (a_{n−2} − λ) λ^(n−2) + · · · + a1 λ + a0

and so the sum of the two contributions is

λ^(n−1) (λ + a_{n−1}) + λ^(n−1) + (a_{n−2} − λ) λ^(n−2) + · · · + a1 λ + a0
= λ^n + a_{n−1} λ^(n−1) + a_{n−2} λ^(n−2) + · · · + a1 λ + a0

It is routine to verify that when n = 2 this determinant gives the right polynomial.
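A numerical spot-check of the companion-matrix formula (assuming NumPy; the cubic p(λ) = λ³ + 2λ² + 3λ + 4 is a hypothetical example):

```python
import numpy as np

# companion matrix of p(lambda) = lambda^3 + 2 lambda^2 + 3 lambda + 4,
# coefficients a0 = 4, a1 = 3, a2 = 2, in the form used in the exercise
a = [4.0, 3.0, 2.0]
n = len(a)
C = np.zeros((n, n))
C[1:, :-1] = np.eye(n - 1)        # subdiagonal of ones
C[:, -1] = [-ai for ai in a]      # last column: -a0, -a1, -a2
print(C)

# characteristic polynomial of C: np.poly returns [1, a2, a1, a0]
print(np.poly(C))
```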

44. A discrete dynamical system is of the form

x(k + 1) = Ax(k), x(0) = x0

where A is an n × n matrix and x(k) is a vector in R^n. Show first that

x(k) = A^k x0

for all k ≥ 1. If A is nondefective so that it has a basis of eigenvectors, v1, · · · , vn, where Avj = λj vj, you can write the initial condition x0 in a unique way as a linear combination of these eigenvectors. Thus

x0 = Σ_{j=1}^n aj vj

Now explain why

x(k) = Σ_{j=1}^n aj A^k vj = Σ_{j=1}^n aj λj^k vj

which gives a formula for x(k), the solution of the dynamical system.

The first formula is obvious from induction. Thus

A^k x0 = Σ_{j=1}^n aj A^k vj = Σ_{j=1}^n aj λj^k vj

because if Av = λv and A^k x = λ^k x, then applying A to both sides gives A^(k+1) x = λ^(k+1) x.
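The eigenvector formula for x(k) can be compared against direct iteration. The 2×2 matrix below is a hypothetical nondefective example (a NumPy sketch, not part of the original solution):

```python
import numpy as np

A = np.array([[0.9, 0.2],
              [0.1, 0.8]])     # hypothetical nondefective example
x0 = np.array([1.0, 3.0])
k = 25

# direct iteration x(k+1) = A x(k)
x = x0.copy()
for _ in range(k):
    x = A @ x

# eigenvector formula x(k) = sum_j a_j lambda_j^k v_j
lam, V = np.linalg.eig(A)
a = np.linalg.solve(V, x0)      # coordinates of x0 in the eigenbasis
x_formula = V @ (a * lam**k)
print(np.allclose(x, x_formula))
```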


45. Suppose A is an n × n matrix and let v be an eigenvector such that Av = λv. Also suppose the characteristic polynomial of A is

det(λI − A) = λ^n + a_{n−1} λ^(n−1) + · · · + a1 λ + a0

Explain why

(A^n + a_{n−1} A^(n−1) + · · · + a1 A + a0 I) v = 0

If A is nondefective, give a very easy proof of the Cayley–Hamilton theorem based on this. Recall this theorem says A satisfies its characteristic equation,

A^n + a_{n−1} A^(n−1) + · · · + a1 A + a0 I = 0.

Let v be any vector in a basis of eigenvectors of A. Since A is nondefective, such a basis exists. Then

(A^n + a_{n−1} A^(n−1) + · · · + a1 A + a0 I) v = (λ^n + a_{n−1} λ^(n−1) + · · · + a1 λ + a0) v = 0

because λ is by definition a root of the characteristic polynomial. Since the above holds for the vectors in a basis, it holds for any v and so

A^n + a_{n−1} A^(n−1) + · · · + a1 A + a0 I = 0
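For a concrete nondefective (here symmetric, hence diagonalizable) example, substituting A into its characteristic polynomial does give the zero matrix up to roundoff — a NumPy sketch:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])     # symmetric, hence nondefective

coeffs = np.poly(A)            # [1, a1, a0] of det(lambda I - A)
n = len(A)
# evaluate A^n + a_{n-1} A^{n-1} + ... + a0 I  (Cayley-Hamilton says this is 0)
p_of_A = sum(c * np.linalg.matrix_power(A, n - i) for i, c in enumerate(coeffs))
print(np.allclose(p_of_A, 0))
```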

46. Suppose an n × n nondefective matrix A has only 1 and −1 as eigenvalues. Find A^12.

Let x be some vector and write x = Σ_j aj vj where the vj are eigenvectors. Then

A^12 x = Σ_{j=1}^n aj λj^12 vj = Σ_{j=1}^n aj vj = x

since (±1)^12 = 1. It follows that A^12 = I.

47. Suppose the characteristic polynomial of an n×n matrix A is 1 − λ^n. Find A^(mn) where m is an integer. Hint: Note first that A is nondefective. Why?

The eigenvalues are distinct because they are the n distinct nth roots of 1. Hence from the above formula, if x is a given vector with x = Σ_{j=1}^n aj vj, then

A^(nm) x = Σ_{j=1}^n aj A^(nm) vj = Σ_{j=1}^n aj λj^(nm) vj = Σ_{j=1}^n aj vj = x

since λj^(nm) = (λj^n)^m = 1. So A^(nm) = I.

48. Sometimes sequences come in terms of a recursion formula. An example is the Fibonacci sequence.

x0 = 1 = x1, x_{n+1} = x_n + x_{n−1}

Show this can be considered as a discrete dynamical system as follows.

( x_{n+1} )   ( 1 1 ) ( x_n     )    ( x1 )   ( 1 )
( x_n     ) = ( 1 0 ) ( x_{n−1} ),   ( x0 ) = ( 1 )

Now use the technique of Problem 44 to find a formula for x_n.

What are the eigenvalues and eigenvectors of this matrix?

( 1 1 ; 1 0 ), eigenvectors:

(1/2 − (1/2)√5, 1)^T ↔ 1/2 − (1/2)√5,   ((1/2)√5 + 1/2, 1)^T ↔ (1/2)√5 + 1/2

Now also

(1/2 − (1/10)√5) (1/2 − (1/2)√5, 1)^T + ((1/10)√5 + 1/2) ((1/2)√5 + 1/2, 1)^T = (1, 1)^T

Therefore, the solution is of the form

(x_{n+1}, x_n)^T = (1/2 − (1/10)√5) (1/2 − (1/2)√5)^n (1/2 − (1/2)√5, 1)^T
                 + ((1/10)√5 + 1/2) ((1/2)√5 + 1/2)^n ((1/2)√5 + 1/2, 1)^T

In particular,

x_n = (1/2 − (1/10)√5) (1/2 − (1/2)√5)^n + ((1/10)√5 + 1/2) ((1/2)√5 + 1/2)^n
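The closed form can be checked against the recursion for the first several terms (a sketch assuming NumPy; rounding absorbs the floating-point error):

```python
import numpy as np

phi = (1 + np.sqrt(5)) / 2       # (1/2) sqrt(5) + 1/2
psi = (1 - np.sqrt(5)) / 2       # 1/2 - (1/2) sqrt(5)
c_phi = np.sqrt(5) / 10 + 1 / 2
c_psi = 1 / 2 - np.sqrt(5) / 10

def x_closed(n):
    return c_psi * psi**n + c_phi * phi**n

# compare with the recursion x0 = x1 = 1, x_{n+1} = x_n + x_{n-1}
seq = [1, 1]
for _ in range(10):
    seq.append(seq[-1] + seq[-2])
closed = [int(round(x_closed(n))) for n in range(len(seq))]
print(seq)
print(closed)
```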

49. Let A be an n × n matrix having characteristic polynomial

det(λI − A) = λ^n + a_{n−1} λ^(n−1) + · · · + a1 λ + a0

Show that a0 = (−1)^n det(A).

The characteristic polynomial equals det(λI − A). To get the constant term, you plug in λ = 0 and obtain det(−A) = (−1)^n det(A).

F.24 Exercises

7.10

1. Explain why it is typically impossible to compute the upper triangular matrix whose existence is guaranteed by Schur's theorem.

To get it, you must be able to get the eigenvalues and this is typically not possible.

2. Now recall the QR factorization of Theorem 5.7.5 on Page 133. The QR algorithm is a technique which does compute the upper triangular matrix in Schur's theorem. There is much more to the QR algorithm than will be presented here. In fact, what I am about to show you is not the way it is done in practice. One first obtains what is called a Hessenberg matrix for which the algorithm will work better. However, the idea is as follows. Start with A an n × n matrix having real eigenvalues. Form A = QR where Q is orthogonal and R is upper triangular. (Right triangular.) This


can be done using the technique of Theorem 5.7.5 using Householder matrices. Next take A1 ≡ RQ. Show that A = Q A1 Q^T. In other words these two matrices, A, A1, are similar. Explain why they have the same eigenvalues. Continue by letting A1 play the role of A. Thus the algorithm is of the form An = Qn Rn and An+1 = Rn Qn. Explain why A = Qn An Qn^T for some Qn orthogonal. Thus An is a sequence of matrices each similar to A. The remarkable thing is that often these matrices converge to an upper triangular matrix T and A = Q T Q^T for some orthogonal matrix, the limit of the Qn, where the limit means the entries converge. Then the process computes the upper triangular Schur form of the matrix A. Thus the eigenvalues of A appear on the diagonal of T. You will see approximately what these are as the process continues.

Suppose you have found An. Then An = Qn Rn and then

An+1 ≡ Rn Qn = Qn^T An Qn = Qn^T Qn−1^T An−1 Qn−1 Qn = · · ·

Now the product of orthogonal matrices is orthogonal and so this shows by induction that each An is orthogonally similar to A.

3. Try the QR algorithm on

[ −1  −2 ; 6  6 ]

which has eigenvalues 3 and 2. I suggest you use a computer algebra system to do the computations.

[ −1  −2 ; 6  6 ] = [ −(1/37)√37  −(6/37)√37 ; (6/37)√37  −(1/37)√37 ] [ √37  (38/37)√37 ; 0  (6/37)√37 ]

A1 = [ √37  (38/37)√37 ; 0  (6/37)√37 ] [ −(1/37)√37  −(6/37)√37 ; (6/37)√37  −(1/37)√37 ] = [ 191/37  −260/37 ; 36/37  −6/37 ]

Factoring again,

[ 191/37  −260/37 ; 36/37  −6/37 ] = [ (191/37777)√37√1021  −(36/37777)√37√1021 ; (36/37777)√37√1021  (191/37777)√37√1021 ] [ (1/37)√37√1021  −(1348/37777)√37√1021 ; 0  (6/1021)√37√1021 ]

A2 = [ 3959/1021  −7952/1021 ; 216/1021  1146/1021 ]

Switching to decimals, A2 ≈ [ 3.8776  −7.7884 ; 0.21156  1.1224 ], and

A2 = [ 0.99851  −0.054479 ; 0.054479  0.99851 ] [ 3.8833  −7.7157 ; 0  1.5451 ]

A3 = [ 3.8833  −7.7157 ; 0  1.5451 ] [ 0.99851  −0.054479 ; 0.054479  0.99851 ] = [ 3.4572  −7.9158 ; 0.084176  1.5428 ]

You can keep on going. You see that it appears to be converging to an upper triangular matrix having 3, 2 down the diagonal.
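The computation above can be continued numerically. A minimal sketch (numpy is an assumption here; the text only suggests using a computer algebra system) of the iteration An = QnRn, An+1 = RnQn:

```python
import numpy as np

def qr_algorithm(A, iterations=60):
    """Unshifted QR algorithm: factor A = QR, then replace A by RQ."""
    A = np.array(A, dtype=float)
    for _ in range(iterations):
        Q, R = np.linalg.qr(A)
        A = R @ Q          # R Q = Q^T A Q, orthogonally similar to the iterate
    return A

An = qr_algorithm([[-1.0, -2.0],
                   [6.0, 6.0]])
# The subdiagonal entry decays and the diagonal approaches the eigenvalues 3 and 2.
```

Each step is an orthogonal similarity, so all iterates have the same eigenvalues as the original matrix.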

4. Now try the QR algorithm on

[ 0  −1 ; 2  0 ]

Show that the algorithm cannot converge for this example. Hint: Try a few iterations of the algorithm.

Saylor URL: http://www.saylor.org/courses/ma212/ The Saylor Foundation


[ 0  −1 ; 2  0 ] = [ 0  −1 ; 1  0 ] [ 2  0 ; 0  1 ]

A1 = [ 2  0 ; 0  1 ] [ 0  −1 ; 1  0 ] = [ 0  −2 ; 1  0 ]

[ 0  −2 ; 1  0 ] = [ 0  −1 ; 1  0 ] [ 1  0 ; 0  2 ]

A2 = [ 1  0 ; 0  2 ] [ 0  −1 ; 1  0 ] = [ 0  −1 ; 2  0 ]

Now it is back to where you started. Thus the algorithm merely bounces between the two matrices [ 0  −1 ; 2  0 ] and [ 0  −2 ; 1  0 ] and so it can’t possibly converge.
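The cycling can also be seen numerically. A small check (numpy assumed): after two QR steps the entries return to their original magnitudes, so the iteration is periodic rather than convergent.

```python
import numpy as np

A0 = np.array([[0.0, -1.0],
               [2.0, 0.0]])
A = A0.copy()
for _ in range(2):             # two steps of A -> RQ
    Q, R = np.linalg.qr(A)
    A = R @ Q
# up to signs fixed by the QR convention, the iterate is back where it started
same_magnitudes = np.allclose(np.abs(A), np.abs(A0))
```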

5. Show the two matrices A ≡ [ 0  −1 ; 4  0 ] and B ≡ [ 0  −2 ; 2  0 ] are similar; that is, there exists a matrix S such that A = S⁻¹BS but there is no orthogonal matrix Q such that Q^TBQ = A. Show the QR algorithm does converge for the matrix B although it fails to do so for A.

Both matrices have the distinct eigenvalues ±2i and so they are both similar to a diagonal matrix having these eigenvalues on the diagonal. Therefore, the matrices are indeed similar. Let’s try the QR algorithm on the second.

[ 0  −2 ; 2  0 ] = [ 0  −1 ; 1  0 ] [ 2  0 ; 0  2 ]

A1 = [ 2  0 ; 0  2 ] [ 0  −1 ; 1  0 ] = [ 0  −2 ; 2  0 ]

and so the algorithm keeps on returning the same matrix in the case of B. Now consider the matrix A.

[ 0  −1 ; 4  0 ] = [ 0  −1 ; 1  0 ] [ 4  0 ; 0  1 ]

A1 = [ 4  0 ; 0  1 ] [ 0  −1 ; 1  0 ] = [ 0  −4 ; 1  0 ]

[ 0  −4 ; 1  0 ] = [ 0  −1 ; 1  0 ] [ 1  0 ; 0  4 ]

A2 = [ 1  0 ; 0  4 ] [ 0  −1 ; 1  0 ] = [ 0  −1 ; 4  0 ]

In this case, the algorithm bounces between [ 0  −1 ; 4  0 ] and [ 0  −4 ; 1  0 ].

6. Let F be an m × n matrix. Show that F∗F has all real eigenvalues and furthermore, they are all nonnegative.

The eigenvalues are real because the matrix F∗F is Hermitian. If λ is one of them with eigenvector x, then

λ (x,x) = (F∗Fx,x) = (Fx, Fx) ≥ 0

7. If A is a real n × n matrix and λ is a complex eigenvalue of A having eigenvector z + iw, show that w ≠ 0.

Writing λ = a + ib,

A (z + iw) = (a + ib) (z + iw)

and, taking conjugates (A is real),

A (z − iw) = (a − ib) (z − iw)


Now subtract to obtain A (2iw) = 2iaw + 2ibz. Now if w = 0, then you would have 0 = 2ibz, and this requires b = 0 since otherwise the eigenvector would equal 0, which it isn’t. Hence λ was not complex after all.

8. Suppose A = Q^T DQ where Q is an orthogonal matrix and all the matrices are real. Also D is a diagonal matrix. Show that A must be symmetric.

A^T = Q^T D^T Q = Q^T DQ = A.

9. Suppose A is an n× n matrix and there exists a unitary matrix U such that

A = U∗DU

where D is a diagonal matrix. Explain why A must be normal.

A∗A = U∗D∗UU∗DU = U∗D∗DU and AA∗ = U∗DUU∗D∗U = U∗DD∗U. But D∗D = DD∗ and both are equal to the diagonal matrix which has the squares of the absolute values of the diagonal entries of D down the diagonal.

10. If A is Hermitian, show that det (A) must be real.

A = U∗DU where D is a real diagonal matrix. Therefore,

det (A) = det (U⁻¹) det (D) det (U) = det (D) ∈ R.

11. Show that every unitary matrix preserves distance. That is, if U is unitary,

|Ux| = |x| .

|Ux|² = (Ux, Ux) = (U∗Ux,x) = (x,x) = |x|².

12. Show that if a matrix does preserve distances, then it must be unitary.

Let U preserve distances. Then

|x|² + |y|² + 2 Re (x,y) = |U (x + y)|² = (Ux + Uy, Ux + Uy)
= |Ux|² + |Uy|² + 2 Re (Ux, Uy)
= |x|² + |y|² + 2 Re (Ux, Uy)

Therefore,

Re (Ux, Uy) − Re (x,y) = 0
Re (U∗Ux − x, y) = 0

Let |θ| = 1 be such that θ (U∗Ux − x, y) = |(U∗Ux − x, y)|. Then replace y in the above with θ̄y. Then you get

Re (U∗Ux − x, θ̄y) = Re θ (U∗Ux − x, y) = |(U∗Ux − x, y)| = 0

and since y is arbitrary, (U∗Ux − x, y) = 0 for all y. In particular, this holds for y = U∗Ux − x and so U∗U = I.


13. Show that a complex normal matrix A is unitary if and only if its eigenvalues have magnitude equal to 1.

Since A is normal, it follows that there is a unitary matrix U such that

AU = UD

where D is a diagonal matrix. Then from the above problems, A is unitary if and only if it preserves distances. Let the columns of U be the orthonormal set u1, · · · , un. Then we have Auk = λkuk. For A to preserve distances, you must have |λk| = 1. Conversely, if |λk| = 1 for all k, then {λ1u1, · · · , λnun} is also an orthonormal set and so A preserves distances and is therefore unitary.

14. Suppose A is an n × n matrix which is diagonally dominant. Recall this means

∑_{j≠i} |aij| < |aii|

Show A⁻¹ must exist.

This follows from Gerschgorin’s theorem. The condition implies that 0 is not an eigenvalue and so the matrix is one to one and hence invertible.

15. Give some disks in the complex plane whose union contains all the eigenvalues of the matrix

[ 1 + 2i  4  2 ; 0  i  3 ; 5  6  7 ]

B (1 + 2i, 6), B (i, 3), B (7, 11)
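As a numerical illustration (numpy assumed), the disks can be computed directly from the rows, and Gerschgorin’s theorem guarantees every eigenvalue lands in their union:

```python
import numpy as np

A = np.array([[1 + 2j, 4, 2],
              [0, 1j, 3],
              [5, 6, 7]], dtype=complex)

centers = np.diag(A)                                 # disk centers a_ii
radii = np.sum(np.abs(A), axis=1) - np.abs(centers)  # r_i = sum_{j != i} |a_ij|
eigs = np.linalg.eigvals(A)
# every eigenvalue must lie in at least one disk B(a_ii, r_i)
covered = [min(abs(lam - centers) - radii) <= 1e-9 for lam in eigs]
```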

16. Show a square matrix is invertible if and only if it has no zero eigenvalues.

To say the matrix has no zero eigenvalues is to say that it is one to one because it maps no nonzero vector to 0.

17. Using Schur’s theorem, show the trace of an n × n matrix equals the sum of the eigenvalues and the determinant of an n × n matrix is the product of the eigenvalues.

Let A be an n × n matrix. By Schur, there exists a unitary matrix U and an upper triangular matrix T such that

A = U∗TU

Then det (A) = det (T) = the product of the diagonal entries of T. These are the eigenvalues of T, and both A and T have the same eigenvalues because they have the same characteristic polynomial. Thus the determinant of A equals the product of the eigenvalues. Since A and T are similar, they have the same trace also. However, the trace of T is just the sum of the eigenvalues of A.

18. Using Schur’s theorem, show that if A is any complex n × n matrix having eigenvalues λi listed according to multiplicity, then ∑_{i,j} |Aij|² ≥ ∑_{i=1}^{n} |λi|². Show that equality holds if and only if A is normal.

By Schur’s theorem, there is a unitary U and upper triangular T such that U∗AU = T. Then

∑_{i,j} |Aij|² = trace (AA∗) = trace (UTU∗UT∗U∗)


= trace (UTT∗U∗) = trace (TT∗) ≥ ∑_{i=1}^{n} |λi|²

Equality holds if and only if there are no nonzero off diagonal terms. This happens if and only if T is a diagonal matrix. But if this is so, then U∗AU = D, a diagonal matrix, and this implies that A is normal.

19. Here is a matrix.

[ 1234  6  5  3 ; 0  −654  9  123 ; 98  123  10,000  11 ; 56  78  98  400 ]

I know this matrix has an inverse before doing any computations. How do I know?

The matrix is diagonally dominant, so Gerschgorin’s theorem shows that there are no zero eigenvalues and so the matrix is invertible.

20. Show the critical points of the following function are (0,−3, 0), (2,−3, 0), and (1,−3,−1/3) and classify them as local minima, local maxima or saddle points.

f (x, y, z) = −(3/2)x⁴ + 6x³ − 6x² + zx² − 2zx − 2y² − 12y − 18 − (3/2)z².

Setting 0 = ∇f gives the system

2xz − 2z − 12x + 18x² − 6x³ = 0
−4y − 12 = 0
x² − 2x − 3z = 0

After much fuss, you find the above solutions to these systems of equations. The Hessian is

[ −18x² + 36x + 2z − 12  0  2x − 2 ; 0  −4  0 ; 2x − 2  0  −3 ]

At (0,−3, 0),

[ −12  0  −2 ; 0  −4  0 ; −2  0  −3 ]

The eigenvalues are all negative and so this point is a local maximum.

At (2,−3, 0),

[ −12  0  2 ; 0  −4  0 ; 2  0  −3 ]

The eigenvalues are all negative so this is a local maximum.

At (1,−3,−1/3),

[ 16/3  0  0 ; 0  −4  0 ; 0  0  −3 ]

This is a saddle point.
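The classifications above can be checked numerically. A sketch (numpy assumed) that classifies a critical point from the eigenvalues of its Hessian:

```python
import numpy as np

def classify(H, tol=1e-9):
    """Second derivative test from the eigenvalues of a symmetric Hessian."""
    w = np.linalg.eigvalsh(np.array(H, dtype=float))
    if np.all(w > tol):
        return "local minimum"
    if np.all(w < -tol):
        return "local maximum"
    if np.any(w > tol) and np.any(w < -tol):
        return "saddle point"
    return "test fails"            # a zero eigenvalue

# Hessians at the three critical points of the exercise
result_0 = classify([[-12, 0, -2], [0, -4, 0], [-2, 0, -3]])   # (0, -3, 0)
result_2 = classify([[-12, 0, 2], [0, -4, 0], [2, 0, -3]])     # (2, -3, 0)
result_1 = classify([[16/3, 0, 0], [0, -4, 0], [0, 0, -3]])    # (1, -3, -1/3)
```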


21. Here is a function of three variables.

f (x, y, z) = 13x² + 2xy + 8xz + 13y² + 8yz + 10z²

Change the variables so that in the new variables there are no mixed terms, terms involving xy, yz etc. Two eigenvalues are 12 and 18.

The function is

f (x, y, z) = (x, y, z) [ 13  1  4 ; 1  13  4 ; 4  4  10 ] (x, y, z)^T

[ 13  1  4 ; 1  13  4 ; 4  4  10 ], eigenvectors:

(−1/2, −1/2, 1)^T ↔ 6,  (−1, 1, 0)^T ↔ 12,  (1, 1, 1)^T ↔ 18

Then in terms of the new variables, the quadratic form is 6(x′)² + 12(y′)² + 18(z′)². Note how I only needed to find the eigenvalues. This changes in the next problems.
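Numerically (numpy assumed), `eigh` produces both the eigenvalues and an orthogonal Q whose columns are the new coordinate directions:

```python
import numpy as np

A = np.array([[13.0, 1.0, 4.0],
              [1.0, 13.0, 4.0],
              [4.0, 4.0, 10.0]])

lams, Q = np.linalg.eigh(A)    # ascending eigenvalues, orthonormal columns
# Q^T A Q is diagonal, so with x = Q x' the form becomes
# 6 x'^2 + 12 y'^2 + 18 z'^2 with no mixed terms.
D = Q.T @ A @ Q
```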

22. Here is a function of three variables.

f (x, y, z) = 2x² − 4x + 2 + 9yx − 9y − 3zx + 3z + 5y² − 9zy − 7z²

Change the variables so that in the new variables there are no mixed terms, terms involving xy, yz etc.

The function is

f (x, y, z) = (x, y, z) A (x, y, z)^T + (−4, −9, 3) (x, y, z)^T + 2

where

A = [ 2  9/2  −3/2 ; 9/2  5  −9/2 ; −3/2  −9/2  −7 ], eigenvectors:

(5, −3, 1)^T ↔ −1,  (0, 1, 3)^T ↔ −17/2,  (−2, −3, 1)^T ↔ 19/2

Appropriate orthogonal matrix:

Q = [ (1/7)√35  0  −(1/7)√14 ; −(3/35)√35  (1/10)√10  −(3/14)√14 ; (1/35)√35  (3/10)√10  (1/14)√14 ]

Q^T AQ = [ −1  0  0 ; 0  −17/2  0 ; 0  0  19/2 ],  A = Q [ −1  0  0 ; 0  −17/2  0 ; 0  0  19/2 ] Q^T

Thus you need to have the new variables be

(x′, y′, z′)^T = Q^T (x, y, z)^T

Then in terms of the new variables, the quadratic form is

(x′, y′, z′) [ −1  0  0 ; 0  −17/2  0 ; 0  0  19/2 ] (x′, y′, z′)^T + (−4, −9, 3) Q (x′, y′, z′)^T + 2

= −(x′)² + (2/7)√5√7 x′ − (17/2)(y′)² + (19/2)(z′)² + (19/7)√2√7 z′ + 2

23. Here is a function of three variables.

f (x, y, z) = −x² + 2xy + 2xz − y² + 2yz − z² + x

Change the variables so that in the new variables there are no mixed terms, terms involving xy, yz etc.

The function is

f (x, y, z) = (x, y, z) [ −1  1  1 ; 1  −1  1 ; 1  1  −1 ] (x, y, z)^T + (1, 0, 0) (x, y, z)^T

[ −1  1  1 ; 1  −1  1 ; 1  1  −1 ], eigenvectors:

(1, 1, 1)^T ↔ 1,  (−1, 1, 0)^T, (−1, −1, 2)^T ↔ −2

Appropriate orthogonal matrix:

Q = [ (1/3)√3  −(1/2)√2  −(1/6)√6 ; (1/3)√3  (1/2)√2  −(1/6)√6 ; (1/3)√3  0  (1/3)√6 ]

New variables:

(x′, y′, z′)^T = Q^T (x, y, z)^T

Quadratic form in the new variables:

(x′, y′, z′) [ 1  0  0 ; 0  −2  0 ; 0  0  −2 ] (x′, y′, z′)^T + (1, 0, 0) Q (x′, y′, z′)^T

= (x′)² + (1/3)√3 x′ − 2(y′)² − (1/2)√2 y′ − 2(z′)² − (1/6)√6 z′

24. Show the critical points of the function

f (x, y, z) = −2yx² − 6yx − 4zx² − 12zx + y² + 2yz

are points of the form

(x, y, z) = (t, 2t² + 6t, −t² − 3t)

for t ∈ R and classify them as local minima, local maxima or saddle points.


∇f = ( −6y − 12z − 4xy − 8xz, −2x² − 6x + 2y + 2z, −4x² − 12x + 2y ), so the critical points solve

−6y − 12z − 4xy − 8xz = 0
−2x² − 6x + 2y + 2z = 0
−4x² − 12x + 2y = 0

Let x = t. Then y = 2t² + 6t. Now place these into the middle equation and you then find that −2t² − 6t + 2(2t² + 6t) + 2z = 0, Solution is: z = −t² − 3t. This also works in the top equation:

−6(2t² + 6t) − 12(−t² − 3t) − 4t(2t² + 6t) − 8t(−t² − 3t) = 0

The Hessian of f is

[ −4y − 8z  −4x − 6  −8x − 12 ; −4x − 6  2  2 ; −8x − 12  2  0 ]

Plug in the critical points:

[ 0  −4t − 6  −8t − 12 ; −4t − 6  2  2 ; −8t − 12  2  0 ]

You need to consider the sign of the eigenvalues.

det [ −λ  −4t − 6  −8t − 12 ; −4t − 6  2 − λ  2 ; −8t − 12  2  −λ ] = 80t²λ + 240tλ − λ³ + 2λ² + 184λ

Thus you need to be looking at the solutions to

λ³ − 2λ² − 80t²λ − 240tλ − 184λ = 0

You get an eigenvalue which equals 0. Then you need to solve

λ² − 2λ − (184 + 80t² + 240t) = 0

184 + 80t² + 240t has a minimum value when t = −3/2 and at this point, the value of this function is 184 + 80(3/2)² + 240(−3/2) = 4. Therefore, from the quadratic formula, there is a positive eigenvalue and a negative eigenvalue for any value of t. Thus this function always has a direction in which it appears to have a local min and a direction in which it appears to have a local max for each of the critical points.

25. Show the critical points of the function

f (x, y, z) = (1/2)x⁴ − 4x³ + 8x² − 3zx² + 12zx + 2y² + 4y + 2 + (1/2)z²

are (0,−1, 0), (4,−1, 0), and (2,−1,−12) and classify them as local minima, local maxima or saddle points.

Hessian:

[ 6x² − 24x − 6z + 16  0  12 − 6x ; 0  4  0 ; 12 − 6x  0  1 ]

At (0,−1, 0),

[ 16  0  12 ; 0  4  0 ; 12  0  1 ]

This is a saddle point because it has a negative and two positive eigenvalues.

At (4,−1, 0),

[ 16  0  −12 ; 0  4  0 ; −12  0  1 ]


This is also a saddle point because it has a negative and two positive eigenvalues.

At (2,−1,−12),

[ 64  0  0 ; 0  4  0 ; 0  0  1 ]

This is a local minimum because all eigenvalues are positive.

26. Let f (x, y) = 3x⁴ − 24x² + 48 − yx² + 4y. Find and classify the critical points using the second derivative test.

∇f = ( 12x³ − 2xy − 48x, 4 − x² )

Critical points: (2, 0), (−2, 0).

The Hessian is

[ 36x² − 2y − 48  −2x ; −2x  0 ]

At (2, 0),

[ 96  −4 ; −4  0 ]

Clearly it has eigenvalues of different sign because the determinant is negative. Therefore, this is a saddle point. (−2, 0) works out the same way. To have some fun, graph this function of two variables and see if you can see the critical points are saddle points from the picture.

27. Let f (x, y) = 3x⁴ − 5x² + 2 − y²x² + y². Find and classify the critical points using the second derivative test.

∇f = ( 12x³ − 2xy² − 10x, 2y − 2x²y )

Critical points: (1, 1), (−1, 1), (1,−1), (−1,−1), (0, 0), (−(1/6)√5√6, 0), ((1/6)√5√6, 0)

The Hessian is

[ 36x² − 2y² − 10  −4xy ; −4xy  2 − 2x² ]

At (1, 1),

[ 24  −4 ; −4  0 ]

Saddle point. The eigenvalues have opposite sign. (−1, 1) is a saddle point also, as are (1,−1) and (−1,−1). At (0, 0) the Hessian is [ −10  0 ; 0  2 ], again a saddle point. At (−(1/6)√5√6, 0),

[ 20  0 ; 0  1/3 ]

Local minimum. ((1/6)√5√6, 0) is also a local minimum.

28. Let f (x, y) = 5x⁴ − 7x² − 2 − 3y²x² + 11y² − 4y⁴. Find and classify the critical points using the second derivative test.

∇f = ( 20x³ − 6xy² − 14x, −6x²y − 16y³ + 22y )

Critical points: (1, 1), (−1, 1), (0, 0), (−(1/10)√7√10, 0), ((1/10)√7√10, 0)


The Hessian is

[ 60x² − 6y² − 14  −12xy ; −12xy  −6x² − 48y² + 22 ]

At (1, 1),

[ 40  −12 ; −12  −32 ]

The determinant is negative and so this is a saddle point.

At (−1, 1),

[ 40  12 ; 12  −32 ]

The determinant is negative and so this is also a saddle point.

At (0, 0),

[ −14  0 ; 0  22 ]

This is a saddle point.

At ((1/10)√7√10, 0),

[ 28  0 ; 0  89/5 ]

This is a local minimum. So is (−(1/10)√7√10, 0).

29. Let f (x, y, z) = −2x⁴ − 3yx² + 3x² + 5x²z + 3y² − 6y + 3 − 3zy + 3z + z². Find and classify the critical points using the second derivative test.

∇f = ( 6x − 6xy + 10xz − 8x³, −3x² + 6y − 3z − 6, 5x² − 3y + 2z + 3 )

Critical points: (0, 1, 0)

The Hessian is

[ −24x² − 6y + 10z + 6  −6x  10x ; −6x  6  −3 ; 10x  −3  2 ]

At the point of interest this is

[ 0  0  0 ; 0  6  −3 ; 0  −3  2 ]

It has a zero eigenvalue and two positive eigenvalues, so the second derivative test fails. Nevertheless, the point is a saddle point: in the y, z directions the quadratic form is positive definite, while along the line (x, 1, 0) one has f (x, 1, 0) = −2x⁴ < f (0, 1, 0) = 0.

30. Let f (x, y, z) = 3yx² − 3x² − x²z − y² + 2y − 1 + 3zy − 3z − 3z². Find and classify the critical points using the second derivative test.

∇f = ( 6xy − 6x − 2xz, 3x² − 2y + 3z + 2, −x² + 3y − 6z − 3 )

Critical points: (0, 1, 0)


The Hessian is

[ 6y − 2z − 6  6x  −2x ; 6x  −2  3 ; −2x  3  −6 ]

At the critical point, this is

[ 0  0  0 ; 0  −2  3 ; 0  3  −6 ]

eigenvalues: √13 − 4, −√13 − 4, 0

This has two directions in which it appears to have a local maximum. However, the test fails because of the zero eigenvalue.

31. Let Q be orthogonal. Find the possible values of det (Q).

±1

32. Let U be unitary. Find the possible values of det (U).

U is unitary, which means it preserves length. Therefore, if Ux = λx, it must be the case that |λ| = 1, and since det (U) is the product of the eigenvalues, |det (U)| = 1 as well. Hence det (U) = e^{iθ} for some θ. In other words, det (U) can be any point on the unit circle.

33. If a matrix is nonzero can it have only zero for eigenvalues?

Definitely yes. Consider this one:

[ −1  1 ; −1  1 ]

34. A matrix A is called nilpotent if A^k = 0 for some positive integer k. Suppose A is a nilpotent matrix. Show it has only 0 for an eigenvalue.

Say λ is an eigenvalue. Then for some x ≠ 0,

Ax = λx

Then λ^k is an eigenvalue for A^k and so λ^k = 0. Hence λ = 0.

35. If A is a nonzero nilpotent matrix, show it must be defective.

If it were not defective, there would exist S such that S⁻¹AS = D with D diagonal; since the only eigenvalue is 0, D = 0, and then A = 0. Hence, if A ≠ 0 is nilpotent, then A is defective.

36. Suppose A is a nondefective n × n matrix and its eigenvalues are all either 0 or 1. Show A² = A. Could you say anything interesting if the eigenvalues were all either 0, 1, or −1? By De Moivre’s theorem, an nth root of unity is of the form

cos (2kπ/n) + i sin (2kπ/n)

Could you generalize the sort of thing just described to get Aⁿ = A? Hint: Since A is nondefective, there exists S such that S⁻¹AS = D where D is a diagonal matrix.

S⁻¹AS = D and so S⁻¹A²S = D² = D = S⁻¹AS and so, in the first case, A² = A.

If the eigenvalues are 0, 1, −1, then all even powers of A are equal and all odd powers are equal.

In the last case, you could have the eigenvalues of A equal to nth roots of unity, cos (2kπ/n) + i sin (2kπ/n). Then S⁻¹AⁿS = I and so Aⁿ = I.


37. This and the following problems will present most of a differential equations course. To begin with, consider the scalar initial value problem

y′ = ay, y (t0) = y0

When a is real, show the unique solution to this problem is y = y0 e^{a(t−t0)}. Next suppose

y′ = (a + ib) y, y (t0) = y0 (6.30)

where y (t) = u (t) + iv (t). Show there exists a unique solution and it is

y (t) = y0 e^{a(t−t0)} (cos b (t − t0) + i sin b (t − t0)) ≡ e^{(a+ib)(t−t0)} y0. (6.31)

Next show that for a real or complex there exists a unique solution to the initial value problem

y′ = ay + f, y (t0) = y0

and it is given by

y (t) = e^{a(t−t0)} y0 + e^{at} ∫_{t0}^{t} e^{−as} f (s) ds.

Hint: For the first part write as y′ − ay = 0 and multiply both sides by e^{−at}. Then explain why you get

(d/dt) (e^{−at} y (t)) = 0, y (t0) = y0.

Now you finish the argument. To show uniqueness in the second part, suppose

y′ = (a + ib) y, y (0) = 0

and verify this requires y (t) = 0. To do this, note

ȳ′ = (a − ib) ȳ, ȳ (0) = 0

and that

(d/dt) |y (t)|² = y′ (t) ȳ (t) + ȳ′ (t) y (t) = (a + ib) y (t) ȳ (t) + (a − ib) ȳ (t) y (t) = 2a |y (t)|², |y|² (t0) = 0

Thus from the first part, |y (t)|² = 0 · e^{2a(t−t0)} = 0. Finally observe by a simple computation that 6.30 is solved by 6.31. For the last part, write the equation as

y′ − ay = f

and multiply both sides by e^{−at} and then integrate from t0 to t using the initial condition.
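The solution formula can be sanity-checked numerically (a sample a and f, not from the text): with a = 2, f (s) = cos s, t0 = 0, y0 = 3, the formula must reproduce the exact solution y (t) = (17/5)e^{2t} − (2/5) cos t + (1/5) sin t.

```python
import math

a, y0, t0 = 2.0, 3.0, 0.0
f = math.cos

def y(t, n=20000):
    """y(t) = e^{a(t-t0)} y0 + e^{at} * integral_{t0}^{t} e^{-as} f(s) ds,
    with the integral done by the trapezoid rule."""
    h = (t - t0) / n
    g = [math.exp(-a * (t0 + k * h)) * f(t0 + k * h) for k in range(n + 1)]
    integral = h * (sum(g) - 0.5 * (g[0] + g[-1]))
    return math.exp(a * (t - t0)) * y0 + math.exp(a * t) * integral

exact = lambda t: 3.4 * math.exp(2 * t) - 0.4 * math.cos(t) + 0.2 * math.sin(t)
```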

38. Now consider A an n × n matrix. By Schur’s theorem there exists unitary Q such that

Q⁻¹AQ = T

where T is upper triangular. Now consider the first order initial value problem

x′ = Ax, x (t0) = x0.


Show there exists a unique solution to this first order system. Hint: Let y = Q⁻¹x and so the system becomes

y′ = Ty, y (t0) = Q⁻¹x0 (6.32)

Now letting y = (y1, · · · , yn)^T, the bottom equation becomes

yn′ = tnn yn, yn (t0) = (Q⁻¹x0)n.

Then use the solution you get in this to get the solution to the initial value problem which occurs one level up, namely

yn−1′ = t(n−1)(n−1) yn−1 + t(n−1)n yn, yn−1 (t0) = (Q⁻¹x0)n−1

Continue doing this to obtain a unique solution to 6.32.
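The back substitution in the hint can be carried out explicitly for a sample 2 × 2 upper triangular system (values chosen for illustration): the bottom equation is scalar, and its solution feeds the equation one level up, whose integral evaluates in closed form when t11 ≠ t22.

```python
import math

t11, t12, t22 = 1.0, 1.0, 2.0      # sample upper triangular T = [[1, 1], [0, 2]]
y10, y20 = 1.0, 1.0                # initial conditions at t0 = 0

def y2(t):
    # bottom equation: y2' = t22 y2
    return math.exp(t22 * t) * y20

def y1(t):
    # y1' = t11 y1 + t12 y2, solved with the scalar formula of the last problem
    return (math.exp(t11 * t) * y10
            + t12 * y20 * (math.exp(t22 * t) - math.exp(t11 * t)) / (t22 - t11))

# check the top equation by a central difference at t = 0.7
t, h = 0.7, 1e-6
residual = (y1(t + h) - y1(t - h)) / (2 * h) - (t11 * y1(t) + t12 * y2(t))
```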

39. Now suppose Φ (t) is an n × n matrix of the form

Φ (t) = ( x1 (t) · · · xn (t) ) (6.33)

where

xk′ (t) = Axk (t).

Explain why

Φ′ (t) = AΦ (t)

if and only if Φ (t) is given in the form of 6.33. Also explain why if c ∈ Fⁿ, then

y (t) ≡ Φ (t) c

solves the equation y′ (t) = Ay (t).

This is because y′ (t) = Φ′ (t) c = AΦ (t) c = Ay (t).

40. In the above problem, consider the question whether all solutions to

x′ = Ax (6.34)

are obtained in the form Φ (t) c for some choice of c ∈ Fⁿ. In other words, is the general solution to this equation Φ (t) c for c ∈ Fⁿ? Prove the following theorem using linear algebra.

Theorem F.24.1 Suppose Φ (t) is an n × n matrix which satisfies Φ′ (t) = AΦ (t). Then the general solution to 6.34 is Φ (t) c if and only if Φ (t)⁻¹ exists for some t. Furthermore, if Φ′ (t) = AΦ (t), then either Φ (t)⁻¹ exists for all t or Φ (t)⁻¹ never exists for any t.

(det (Φ (t)) is called the Wronskian and this theorem is sometimes called the Wronskian alternative.)

Hint: Suppose first the general solution is of the form Φ (t) c where c is an arbitrary constant vector in Fⁿ. You need to verify Φ (t)⁻¹ exists for some t. In fact, show


Φ (t)⁻¹ exists for every t. Suppose then that Φ (t0)⁻¹ does not exist. Explain why there exists c ∈ Fⁿ such that there is no solution x to

c = Φ (t0)x

By the existence part of Problem 38 there exists a solution to

x′ = Ax, x (t0) = c

but this cannot be in the form Φ (t) c. Thus for every t, Φ (t)⁻¹ exists. Next suppose for some t0, Φ (t0)⁻¹ exists. Let z′ = Az and choose c such that

z (t0) = Φ (t0) c

Then both z (t) and Φ (t) c solve

x′ = Ax, x (t0) = z (t0)

Apply uniqueness to conclude z = Φ (t) c. Finally, consider that Φ (t) c for c ∈ Fⁿ either is the general solution or it is not. If it is, then Φ (t)⁻¹ exists for all t. If it is not, then Φ (t)⁻¹ cannot exist for any t from what was just shown.

41. Let Φ′ (t) = AΦ (t). Then Φ (t) is called a fundamental matrix if Φ (t)⁻¹ exists for all t. Show there exists a unique solution to the equation

x′ = Ax + f, x (t0) = x0 (6.35)

and it is given by the formula

x (t) = Φ (t)Φ (t0)⁻¹ x0 + Φ (t) ∫_{t0}^{t} Φ (s)⁻¹ f (s) ds

Now these few problems have done virtually everything of significance in an entire undergraduate differential equations course, illustrating the superiority of linear algebra. The above formula is called the variation of constants formula.

Hint: Uniqueness is easy. If x1, x2 are two solutions then let u (t) = x1 (t) − x2 (t) and argue u′ = Au, u (t0) = 0. Then use Problem 38. To verify there exists a solution, you could just differentiate the above formula using the fundamental theorem of calculus and verify it works. Another way is to assume the solution in the form

x (t) = Φ (t) c (t)

and find c (t) to make it all work out. This is called the method of variation of parameters.

For the method of variation of parameters,

x′ (t) = AΦ (t) c (t) + Φ (t) c′ (t)
x′ (t) = AΦ (t) c (t) + f (t)

and so Φ (t) c′ (t) = f (t), giving

c (t) = ∫_{t0}^{t} Φ (s)⁻¹ f (s) ds + k


Then the solution desired is

x (t) = Φ (t)k + Φ (t) ∫_{t0}^{t} Φ (s)⁻¹ f (s) ds

Plug in t = t0. This yields x0 = Φ (t0)k and so k = Φ (t0)⁻¹ x0, and so the solution is

x (t) = Φ (t)Φ (t0)⁻¹ x0 + Φ (t) ∫_{t0}^{t} Φ (s)⁻¹ f (s) ds

Note that by approximating with Riemann sums and passing to a limit, it follows that the constant Φ (t) can be taken into the integral to write

x (t) = Φ (t)Φ (t0)⁻¹ x0 + ∫_{t0}^{t} Φ (t)Φ (s)⁻¹ f (s) ds

42. Show there exists a special Φ such that Φ′ (t) = AΦ (t), Φ (0) = I, and suppose Φ (t)⁻¹ exists for all t. Show using uniqueness that

Φ (−t) = Φ (t)⁻¹

and that for all t, s ∈ R,

Φ (t + s) = Φ (t)Φ (s)

Explain why with this special Φ, the solution to 6.35 can be written as

x (t) = Φ (t − t0)x0 + ∫_{t0}^{t} Φ (t − s) f (s) ds.

Hint: Let Φ (t) be such that the jth column is xj (t) where

xj′ = Axj, xj (0) = ej.

Use uniqueness as required.

Just follow the hint. Such a Φ (t) clearly exists. By the Wronskian alternative in one of the above problems, Φ (t)⁻¹ exists for all t.

Consider Ψ (t) ≡ AΦ (t) − Φ (t)A. Then

Ψ′ (t) = AΦ′ (t) − Φ′ (t)A = AAΦ (t) − AΦ (t)A = AΨ (t)

and also Ψ (0) = 0. By uniqueness given above, Ψ (t) = 0 for all t; that is, A commutes with Φ (t). Therefore,

(Φ (t)Φ (−t))′ = AΦ (t)Φ (−t) − Φ (t)AΦ (−t) = Φ (t)AΦ (−t) − Φ (t)AΦ (−t) = 0

and so Φ (t)Φ (−t) = C, a constant. However, when t = 0, C = I and so Φ (t)Φ (−t) = I. Also, fixing s, let

Ψ (t) ≡ Φ (t + s) − Φ (t)Φ (s)


Ψ′ (t) = Φ′ (t + s) − Φ′ (t)Φ (s) = AΦ (t + s) − AΦ (t)Φ (s) = AΨ (t)

and Ψ (0) = 0. Therefore, Ψ (t) = 0 for all t by uniqueness. It follows that for all t, s,

Φ (t + s) = Φ (t)Φ (s)

Therefore, the variation of constants formula reduces to

x (t) = Φ (t − t0)x0 + ∫_{t0}^{t} Φ (t − s) f (s) ds

One thing might not be clear. If Ψ′ (t) = 0, why is Ψ (t) a constant? This works for scalar valued functions by a use of the mean value theorem. However, this is a matrix valued function. Nevertheless it is true. You just consider the ijth entry, which is

ei^T Ψ (t) ej

Its derivative equals 0 and so it is constant.
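For a concrete Φ the two identities can be verified directly (numpy assumed; A = [ 0 1 ; −1 0 ] again, for which Φ (t) is rotation by t and the identities are the trigonometric addition formulas):

```python
import numpy as np

def Phi(t):
    # the special fundamental matrix with Phi(0) = I for A = [[0, 1], [-1, 0]]
    return np.array([[np.cos(t), np.sin(t)],
                     [-np.sin(t), np.cos(t)]])

t, s = 0.6, 1.9
group_law = np.allclose(Phi(t + s), Phi(t) @ Phi(s))    # Phi(t+s) = Phi(t) Phi(s)
inverse_law = np.allclose(Phi(-t) @ Phi(t), np.eye(2))  # Phi(-t) = Phi(t)^{-1}
```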

43. You can see more on this problem and the next one in the latest version of Horn and Johnson, [16]. Two n × n matrices A, B are said to be congruent if there is an invertible P such that

B = PAP∗

Let A be a Hermitian matrix. Thus it has all real eigenvalues. Let n+ be the number of positive eigenvalues, n− the number of negative eigenvalues, and n0 the number of zero eigenvalues. For k a positive integer, let Ik denote the k × k identity matrix and Ok the k × k zero matrix. Then the inertia matrix of A is the following block diagonal n × n matrix:

diag (In+, −In−, On0)

Show that A is congruent to its inertia matrix. Next show that congruence is an equivalence relation. Finally, show that if two Hermitian matrices have the same inertia matrix, then they must be congruent. Hint: First recall that there is a unitary matrix U such that

U∗AU = diag (Dn+, Dn−, On0)

where Dn+ is a diagonal matrix having the positive eigenvalues of A on its diagonal, Dn− being defined similarly. Now let |Dn−| denote the diagonal matrix which replaces each entry of Dn− with its absolute value. Consider the two diagonal matrices

D = D∗ = diag (Dn+^{−1/2}, |Dn−|^{−1/2}, In0)

Now consider D∗U∗AUD.


The hint gives it away. When you block multiply, the thing which results is the inertia matrix. Congruence is obviously an equivalence relation. Consider the transitive law for example. Say A = PBP∗ and B = QCQ∗. Then A = PQCQ∗P∗ = (PQ)C (PQ)∗. Now if you have two Hermitian matrices which have the same inertia matrix, then since each is congruent to that common inertia matrix and congruence is an equivalence relation, it follows that the two matrices are congruent.

44. Show that if A, B are two congruent Hermitian matrices, then they have the same inertia matrix. Hint: Let A = SBS∗ where S is invertible. Show that A, B have the same rank and this implies that they are each unitarily similar to a diagonal matrix which has the same number of zero entries on the main diagonal. Therefore, letting VA be the span of the eigenvectors associated with positive eigenvalues of A and VB being defined similarly, it suffices to show that these have the same dimensions. Show that (Ax,x) > 0 for all x ∈ VA. Next consider S∗VA. For x ∈ VA, explain why

(BS∗x, S∗x) = (S⁻¹A (S∗)⁻¹ S∗x, S∗x) = (S⁻¹Ax, S∗x) = (Ax, (S⁻¹)∗ S∗x) = (Ax,x) > 0

Next explain why this shows that S∗VA is a subspace of VB and so the dimension of VB is at least as large as the dimension of VA. Hence there are at least as many positive eigenvalues for B as there are for A. Switching A, B you can turn the inequality around. Thus the two have the same inertia matrix.

The above manipulation is straightforward. Now an orthonormal basis exists for Fⁿ which consists of eigenvectors of B. Say you have v1, · · · , vq, w1, · · · , wr, z1, · · · , zs where the vi correspond to the positive eigenvalues, the wi to the negative eigenvalues, and the zi to the eigenvalue 0. We know that B is positive on span (v1, · · · , vq). Could it be positive on any larger subspace? Could it be positive on span (v1, · · · , vq, u) where

u ∈ span (w1, · · · , wr, z1, · · · , zs)?

No, it couldn’t, because you could then consider (Bu,u) and it would end up being no larger than 0. It follows that S∗VA ⊆ VB = span (v1, · · · , vq), and so the dimension of S∗VA, which is the same as the dimension of VA, which is the same as the number of positive eigenvalues of A counted according to multiplicity, is no larger than q. Now reverse the argument letting A ←→ B. Hence the two have the same number of positive eigenvalues. Since they have the same number of zero eigenvalues, this implies they have the same inertia matrix.

45. Let A be an m × n matrix. Then if you unraveled it, you could consider it as a vector in Cⁿᵐ. The Frobenius inner product on the vector space of m × n matrices is defined as

(A,B) ≡ trace (AB∗)

Show that this really does satisfy the axioms of an inner product space and that it also amounts to nothing more than considering m × n matrices as vectors in Cⁿᵐ.

First note that

(A,B) = ∑_{i,j} Aij B̄ij, which is the complex conjugate of ∑_{i,j} Bij Āij = (B,A), and

(A,A) = ∑_{i,j} Aij Āij = ∑_{i,j} |Aij|²

Thus (A,A) = 0 if and only if A = 0. Clearly for a, b scalars,

(aA + bB, C) = a (A,C) + b (B,C)


and so this does satisfy the axioms of the inner product. Also, this inner product coincides with the standard inner product on C^{nm}.
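A quick numerical sanity check of this identification (a sketch with an assumed random example, not part of the text): the trace formula and the standard C^{nm} inner product of the flattened matrices agree.

```python
import numpy as np

# Sketch (assumed example): check that (A,B) = trace(AB*) equals the
# standard C^{nm} inner product of A and B flattened into vectors.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))
B = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))

frobenius = np.trace(A @ B.conj().T)        # (A,B) = trace(AB*)
as_vectors = np.vdot(B.ravel(), A.ravel())  # sum_{i,j} A_ij * conj(B_ij)

assert np.isclose(frobenius, as_vectors)
```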

46. ↑ Consider the n × n unitary matrices. Show that whenever U is such a matrix, it follows that

|U|_{C^{n²}} = √n

Next explain why if U_k is any sequence of unitary matrices, there exists a subsequence {U_{k_m}}_{m=1}^∞ such that lim_{m→∞} U_{k_m} = U where U is unitary. Here the limit takes place in the sense that the entries of U_{k_m} converge to the corresponding entries of U.

From the above and the fact that U is unitary,

|U|²_{C^{n²}} = trace(UU∗) = trace(I) = n.

Let U_k be a sequence of unitary matrices. Since these, considered as vectors in C^{n²}, are contained in a closed and bounded set, it follows that there is a subsequence which converges to U. Why is U a unitary matrix? You have

I = U_{k_m} U∗_{k_m} → UU∗

and so U is unitary.
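The computation |U|² = trace(UU∗) = n above can be checked numerically (a sketch on an assumed random unitary, obtained as the Q factor of a QR factorization):

```python
import numpy as np

# Sketch (assumed example): the Frobenius norm of any n x n unitary is sqrt(n),
# since |U|^2 = trace(UU*) = trace(I) = n.
rng = np.random.default_rng(1)
n = 5
Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
U, _ = np.linalg.qr(Z)                            # Q factor is unitary

assert np.allclose(U @ U.conj().T, np.eye(n))     # U really is unitary
assert np.isclose(np.linalg.norm(U), np.sqrt(n))  # Frobenius norm is sqrt(n)
```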

47. ↑ Let A, B be two n × n matrices. Denote by σ(A) the set of eigenvalues of A. Define

dist(σ(A), σ(B)) = max_{λ∈σ(A)} min{|λ − µ| : µ ∈ σ(B)}

Explain why dist(σ(A), σ(B)) is small if and only if every eigenvalue of A is close to some eigenvalue of B. Now prove the following theorem using the above problem and Schur's theorem. This theorem says roughly that if A is close to B then the eigenvalues of A are close to those of B in the sense that every eigenvalue of A is close to an eigenvalue of B.

Theorem F.24.2 Suppose lim_{k→∞} A_k = A. Then

lim_{k→∞} dist(σ(A_k), σ(A)) = 0

By Schur’s theorem, there exist unitary matrices Uk such that

U∗_k A_k U_k = T_k,

where T_k is upper triangular. I claim that there exists a subsequence {T_{k_m}}_{m=1}^∞ such that

T_{k_m} → T

where T is unitarily similar to A.

Using the above problem, there exists a subsequence k_m such that

lim_{m→∞} U_{k_m} = U

a unitary matrix. It follows then that

lim_{m→∞} T_{k_m} = lim_{m→∞} U∗_{k_m} A_{k_m} U_{k_m} = U∗AU ≡ T


Since U∗_{k_m} A_{k_m} U_{k_m} is upper triangular, it follows that so is T. This shows the claim.

Next suppose that the conclusion does not hold. Then there exists ε > 0 and a subsequence, still called k, such that

dist (σ (Ak) , σ (A)) ≥ ε

It follows that for every choice of U_k such that U∗_k A_k U_k = T_k, a triangular matrix, and for every choice of unitary U such that U∗AU = T, a triangular matrix,

|T_k − T|_{C^{n²}} ≥ dist(σ(A_k), σ(A)) ≥ ε

Recall from the proof of Schur's theorem, you change the order of the eigenvalues on the diagonal by adjusting the unitary matrix U. No matter how you permute the entries on the diagonals, the two are always far apart. Now pick U_k for each k, such that U∗_k A_k U_k = T_k, a triangular matrix. By the first part of the proof, there exists a subsequence k_m such that T_{k_m} → T where T is unitarily similar to A. This contradicts the above.
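The distance between spectra defined above can be computed directly; a sketch (with an assumed family A_k = A + 10⁻ᵏE) showing it shrink as A_k → A:

```python
import numpy as np

# Sketch (assumed example): dist(sigma(A_k), sigma(A)) as defined above, for
# A_k = A + 10^{-k} E with E the all-ones matrix; it shrinks toward 0.
def spec_dist(A, B):
    sa, sb = np.linalg.eigvals(A), np.linalg.eigvals(B)
    return max(min(abs(lam - mu) for mu in sb) for lam in sa)

A = np.diag([1.0, 2.0, 3.0])
E = np.ones((3, 3))
dists = [spec_dist(A + 10.0 ** (-k) * E, A) for k in range(1, 6)]

assert all(d2 < d1 for d1, d2 in zip(dists, dists[1:]))
```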

48. Let A = [a b; c d] be a 2 × 2 matrix which is not a multiple of the identity. Show that A is similar to a 2 × 2 matrix which has one entry equal to 0. Hint: First note that there exists a vector a such that Aa is not a multiple of a. Then consider

B = (a  Aa)⁻¹ A (a  Aa)

Show B has a zero on the main diagonal.

Let the vector be (p, q)ᵀ, chosen so that this vector and A(p, q)ᵀ are linearly independent. Then by brute force you do the following computation, where D = cp² − bq² − apq + dpq is the determinant of the matrix with columns (p, q)ᵀ and A(p, q)ᵀ:

(1/D) [cp + dq  −ap − bq; −q  p] [a  b; c  d] [p  ap + bq; q  cp + dq]

Then you do the multiplication. This yields

(1/D) [0  (ad − bc)(bq² − cp² + apq − dpq); cp² − bq² − apq + dpq  −(a + d)(bq² − cp² + apq − dpq)]

and when the trace of A equals 0, the term in the bottom right is also equal to 0; this is the case used in the next problem.
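The similarity in the hint can be verified numerically (a sketch with assumed numbers): B = (a  Aa)⁻¹ A (a  Aa) sends e₁ to e₂, so its top-left entry is 0.

```python
import numpy as np

# Sketch (assumed numbers): with a chosen so that a and Aa are independent,
# B = (a  Aa)^{-1} A (a  Aa) has first column e_2, so its top-left entry is 0.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
a = np.array([1.0, 0.0])              # Aa = (1, 3) is not a multiple of a
S = np.column_stack([a, A @ a])
B = np.linalg.inv(S) @ A @ S

assert abs(B[0, 0]) < 1e-12
assert np.allclose(B[:, 0], [0.0, 1.0])   # first column of B is e_2
```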

49. ↑ Let A be a complex n × n matrix which has trace equal to 0. Show that A is similar to a matrix which has all zeros on the main diagonal. Hint: Use Problem 30 on Page 122 to argue that a given matrix is similar to one which has the diagonal entries permuted in any order desired. Then use the above problem and block multiplication to show that if A has k nonzero diagonal entries, then it is similar to a matrix which has k − 1 nonzero diagonal entries. Finally, when A is similar to one which has at most one nonzero diagonal entry, this one must also be zero because of the condition on the trace.

Suppose A has k nonzero diagonal entries. First of all, we can assume k ≥ 2 because the trace equals 0. Then from Problem 30 on Page 122 it is similar to a block matrix of the form

[P  Q; R  S]


where the diagonal entries are just reordered and P is a 2 × 2 matrix which is not a multiple of the identity and has both diagonal entries nonzero; if all its diagonal entries were equal, the trace could not equal zero. Then from the above problem, there exists L such that L⁻¹PL has at least one zero on the diagonal.

[L⁻¹  0; 0  I] [P  Q; R  S] [L  0; 0  I] = [L⁻¹PL  L⁻¹Q; RL  S]

Thus this new matrix is similar to the one above and it has at least one fewer nonzero entry on the main diagonal. Continue this process, obtaining a sequence of similar matrices, each having fewer nonzero diagonal entries than the preceding one, till there is at most one nonzero entry. It follows that this one must be zero also because the trace equals 0.

50. ↑ An n × n matrix X is a commutator if there are n × n matrices A, B such that X = AB − BA. Show that the trace of any commutator is 0. Next show that if a complex matrix X has trace equal to 0, then it is in fact a commutator. Hint: Use the above problem to show that it suffices to consider X having all zero entries on the main diagonal. Then define

A = diag(1, 2, . . . , n),   B_ij = X_ij/(i − j) if i ≠ j, and B_ij = 0 if i = j.

The first part is easy because the trace of a product is the same in either order. Suppose then that X has zero trace. Then by the above problem, it is similar to a matrix Y which has a zero diagonal. Suppose you can show that Y = AB − BA. Then if Y = S⁻¹XS, you could write

S−1XS = AB −BA,

X = S (AB −BA)S−1 = SAS−1SBS−1 − SBS−1SAS−1

Thus X will be a commutator. It suffices then to show that whenever a matrix has all zero diagonal entries, it must be a commutator. Now use the definition of A, B in the hint.

(AB − BA)_ij ≡ ∑_s A_is B_sj − ∑_r B_ir A_rj

Since A is a diagonal matrix, this reduces for i ≠ j to

iB_ij − B_ij j = i · X_ij/(i − j) − j · X_ij/(i − j) = X_ij

while if i = j, you have iB_ii − B_ii i = 0.

Thus it yields X_ij in every case.
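The construction in the hint is concrete enough to test directly (a sketch on an assumed random zero-diagonal X):

```python
import numpy as np

# Sketch (assumed example): for X with zero diagonal, take A = diag(1,...,n)
# and B_ij = X_ij/(i-j) for i != j; then (AB-BA)_ij = (i-j)B_ij = X_ij.
n = 4
rng = np.random.default_rng(2)
X = rng.standard_normal((n, n))
np.fill_diagonal(X, 0.0)                 # zero diagonal forces trace(X) = 0

A = np.diag(np.arange(1.0, n + 1))
I, J = np.indices((n, n))
B = np.zeros((n, n))
mask = I != J
B[mask] = X[mask] / (I[mask] - J[mask])

assert np.allclose(A @ B - B @ A, X)     # X really is the commutator AB - BA
```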

F.25 Exercises 8.4


1. Let H denote span((1, 2, 0)ᵀ, (1, 4, 0)ᵀ, (1, 3, 1)ᵀ, (0, 1, 1)ᵀ). Find the dimension of H and determine a basis.

[1 1 1 0; 2 4 3 1; 0 0 1 1], row echelon form: [1 0 0 −1; 0 1 0 0; 0 0 1 1]. The first three vectors form a basis and the dimension is 3.
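The row reduction can be double-checked by computing the rank of the matrix whose columns are the four spanning vectors (a sketch):

```python
import numpy as np

# Sketch: dim H equals the rank of the matrix whose columns are the spanning
# vectors; rank 3 matches the row reduction above, and the first three
# columns already have rank 3, so they form a basis.
M = np.array([[1.0, 1.0, 1.0, 0.0],
              [2.0, 4.0, 3.0, 1.0],
              [0.0, 0.0, 1.0, 1.0]])

assert np.linalg.matrix_rank(M) == 3
assert np.linalg.matrix_rank(M[:, :3]) == 3
```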

2. Let M = {u = (u1, u2, u3, u4) ∈ R^4 : u3 = u1 = 0}. Is M a subspace? Explain.

Sure. It is closed with respect to linear combinations.

3. Let M = {u = (u1, u2, u3, u4) ∈ R^4 : u3 ≥ u1}. Is M a subspace? Explain.

No. Not a subspace. Consider (0, 0, 1, 0) and multiply by −1.

4. Let w ∈ R^4 and let M = {u = (u1, u2, u3, u4) ∈ R^4 : w · u = 0}. Is M a subspace? Explain.

Of course. It is the kernel of a linear transformation.

5. Let M = {u = (u1, u2, u3, u4) ∈ R^4 : ui ≥ 0 for each i = 1, 2, 3, 4}. Is M a subspace? Explain.

No. Multiply any nonzero element by −1.

6. Let w, w1 be given vectors in R^4 and define

M = {u = (u1, u2, u3, u4) ∈ R^4 : w · u = 0 and w1 · u = 0}.

Is M a subspace? Explain.

Yes. It is the intersection of two subspaces.

7. Let M = {u = (u1, u2, u3, u4) ∈ R^4 : |u1| ≤ 4}. Is M a subspace? Explain.

No. Take something nonzero in M where say u1 = 1. Now multiply by 100.

8. Let M = {u = (u1, u2, u3, u4) ∈ R^4 : sin (u1) = 1}. Is M a subspace? Explain.

No. Let u1 = π/2 and multiply by 2.

9. Suppose {x1, · · · , xk} is a set of vectors from F^n. Show that 0 is in span(x1, · · · , xk).

0 = ∑_i 0 · x_i

10. Consider the vectors of the form

{(2t + 3s, s − t, t + s)ᵀ : s, t ∈ R}.

Is this set of vectors a subspace of R^3? If so, explain why, give a basis for the subspace and find its dimension.

Yes. A basis is {(2, −1, 1)ᵀ, (3, 1, 1)ᵀ}.


11. Consider the vectors of the form

{(2t + 3s + u, s − t, t + s, u)ᵀ : s, t, u ∈ R}.

Is this set of vectors a subspace of R^4? If so, explain why, give a basis for the subspace and find its dimension.

It is a subspace. It is spanned by (3, 1, 1, 0)ᵀ, (2, −1, 1, 0)ᵀ, (1, 0, 0, 1)ᵀ. It will be a basis if it is a linearly independent set.

[3 2 1; 1 −1 0; 1 1 0; 0 0 1], row echelon form: [1 0 0; 0 1 0; 0 0 1; 0 0 0]. Thus the vectors are linearly independent.

12. Consider the vectors of the form

{(2t + u + 1, t + 3u, t + s + v, u)ᵀ : s, t, u, v ∈ R}.

Is this set of vectors a subspace of R^4? If so, explain why, give a basis for the subspace and find its dimension.

Because of that 1 in the first component, this is not a subspace; it does not contain 0.

13. Let V denote the set of functions defined on [0, 1]. Vector addition is defined as (f + g)(x) ≡ f(x) + g(x) and scalar multiplication is defined as (αf)(x) ≡ α(f(x)). Verify V is a vector space. What is its dimension, finite or infinite?

Pick n points x1, · · · , xn. Then let e_i(x) = 0 unless x = x_i, where it equals 1. Then {e_i}_{i=1}^n is linearly independent, and this holds for any n, so the dimension is infinite.

14. Let V denote the set of polynomial functions defined on [0, 1]. Vector addition is defined as (f + g)(x) ≡ f(x) + g(x) and scalar multiplication is defined as (αf)(x) ≡ α(f(x)). Verify V is a vector space. What is its dimension, finite or infinite?

This is clearly a subspace of the set of all functions defined on [0, 1]. Its dimension is still infinite. For example, you can see that 1, x, x², · · · , xⁿ is linearly independent for each n. You could use Theorem 8.2.9 to show this, for example.

15. Let V be the set of polynomials defined on R having degree no more than 4. Give a basis for this vector space.

{1, x, x², x³, x⁴}


16. Let the vectors be of the form a + b√2 where a, b are rational numbers and let the field of scalars be F = Q, the rational numbers. Show directly this is a vector space. What is its dimension? What is a basis for this vector space?

A basis is {1, √2}. In fact it is actually a field:

(a + b√2)⁻¹ = (a − b√2)/(a² − 2b²).

Since a, b are rational, it follows that the denominator is not 0 whenever a + b√2 ≠ 0. Thus every nonzero element has an inverse, so it is actually a field. It is certainly a vector space over the rational numbers.
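The inverse formula can be checked with exact rational arithmetic (a sketch; the pair representation for a + b√2 is an assumed convention, not from the text):

```python
from fractions import Fraction

# Sketch (assumed helper): represent a + b*sqrt(2) by the rational pair (a, b)
# and compute the inverse (a - b*sqrt(2)) / (a^2 - 2 b^2).
def inverse(a, b):
    d = a * a - 2 * b * b    # nonzero for (a, b) != (0, 0), since sqrt 2 is irrational
    return (a / d, -b / d)

a, b = Fraction(3), Fraction(2)
c, d = inverse(a, b)
# (a + b sqrt2)(c + d sqrt2) has rational part ac + 2bd and sqrt2 part ad + bc
assert a * c + 2 * b * d == 1
assert a * d + b * c == 0
```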

17. Let V be a vector space with field of scalars F and suppose {v1, · · · , vn} is a basis for V. Now let W also be a vector space with field of scalars F. Let L : {v1, · · · , vn} → W be a function such that Lv_j = w_j. Explain how L can be extended to a linear transformation mapping V to W in a unique way.

L(∑_{i=1}^n c_i v_i) ≡ ∑_{i=1}^n c_i w_i

18. If you have 5 vectors in F^5 and the vectors are linearly independent, can it always be concluded they span F^5? Explain.

Yes. If not, you could add in a vector not in the span and obtain a linearly independent set of 6 vectors, which is not possible.

19. If you have 6 vectors in F^5, is it possible they are linearly independent? Explain.

No. There is a spanning set having 5 vectors, and a spanning set must be at least as long as any linearly independent set.

20. Suppose V, W are subspaces of F^n. Show V ∩ W, defined to be all vectors which are in both V and W, is a subspace also.

If x, y ∈ V ∩ W then αx + βy ∈ V and is also in W because these are subspaces. Hence V ∩ W is a subspace.

21. Suppose V and W both have dimension equal to 7 and they are subspaces of a vector space of dimension 10. What are the possibilities for the dimension of V ∩ W? Hint: Remember that a linearly independent set can be extended to form a basis.

See the next problem.

22. Suppose V has dimension p and W has dimension q and they are each contained in a subspace, U, which has dimension equal to n where n > max(p, q). What are the possibilities for the dimension of V ∩ W? Hint: Remember that a linearly independent set can be extended to form a basis.

Let {x1, · · · , xk} be a basis for V ∩ W. Then there are bases for V and W which are respectively

{x1, · · · , xk, y_{k+1}, · · · , y_p},   {x1, · · · , xk, z_{k+1}, · · · , z_q}

It follows that you must have k + (p − k) + (q − k) ≤ n and so you must have

p + q − n ≤ k

23. If b ≠ 0, can the solution set of Ax = b be a plane through the origin? Explain.

No. It can’t. It does not contain 0.


24. Suppose a system of equations has fewer equations than variables and you have found a solution to this system of equations. Is it possible that your solution is the only one? Explain.

No. There must then be infinitely many solutions. If the system is Ax = b, then there are infinitely many solutions to Ax = 0, and so the solutions to Ax = b are a particular solution to Ax = b added to the solutions to Ax = 0, of which there are infinitely many.

25. Suppose a system of linear equations has a 2 × 4 augmented matrix and the last column is a pivot column. Could the system of linear equations be consistent? Explain.

No. This would lead to 0 = 1.

26. Suppose the coefficient matrix of a system of n equations with n variables has the property that every column is a pivot column. Does it follow that the system of equations must have a solution? If so, must the solution be unique? Explain.

Yes. It has a unique solution.

27. Suppose there is a unique solution to a system of linear equations. What must be true of the pivot columns in the augmented matrix?

The last one must not be a pivot column and the ones to the left must each be pivot columns.

28. State whether each of the following sets of data are possible for the matrix equation Ax = b. If possible, describe the solution set. That is, tell whether there exists a unique solution, no solution, or infinitely many solutions.

(a) A is a 5 × 6 matrix, rank(A) = 4 and rank(A|b) = 4. Hint: This says b is in the span of four of the columns. Thus the columns are not independent.

Infinite solution set.

(b) A is a 3× 4 matrix, rank (A) = 3 and rank (A|b) = 2.

This surely can’t happen. If you add in another column, the rank does not get smaller.

(c) A is a 4 × 2 matrix, rank(A) = 4 and rank(A|b) = 4. Hint: This says b is in the span of the columns and the columns must be independent.

You can’t have the rank equal 4 if you only have two columns.

(d) A is a 5 × 5 matrix, rank(A) = 4 and rank(A|b) = 5. Hint: This says b is not in the span of the columns.

In this case, there is no solution to the system of equations represented by the augmented matrix.

(e) A is a 4× 2 matrix, rank (A) = 2 and rank (A|b) = 2.

In this case, there is a unique solution since the columns of A are independent.

29. Suppose A is an m × n matrix in which m ≤ n. Suppose also that the rank of A equals m. Show that A maps F^n onto F^m. Hint: The vectors e_1, · · · , e_m occur as columns in the row reduced echelon form for A.

This says that the columns of A have a subset of m vectors which are linearly independent. Therefore, this set of vectors is a basis for F^m. It follows that the span of the columns is all of F^m. Thus A is onto.


30. Suppose A is an m × n matrix in which m ≥ n. Suppose also that the rank of A equals n. Show that A is one to one. Hint: If not, there exists a vector x such that Ax = 0, and this implies at least one column of A is a linear combination of the others. Show this would require the column rank to be less than n.

The columns are independent. Therefore, A is one to one.

31. Explain why an n × n matrix A is both one to one and onto if and only if its rank is n.

Saying the rank is n is the same as saying the columns are independent, which is the same as saying A is one to one, which is the same as saying the columns are a basis. Thus the span of the columns of A is all of F^n and so A is onto. If A is onto, then the columns must be linearly independent, since otherwise the span of these columns would have dimension less than n and so the dimension of F^n would be less than n.

32. If you have not done this problem already, here it is again. It is a very important result. Suppose A is an m × n matrix and B is an n × p matrix. Show that

dim (ker (AB)) ≤ dim (ker (A)) + dim (ker (B)) .

Let {w1, · · · , wk} be a basis for B(F^p) ∩ ker(A) and suppose {u1, · · · , ur} is a basis for ker(B). Let Bz_i = w_i. Now suppose x ∈ ker(AB). Then Bx ∈ ker(A) ∩ B(F^p) and so

Bx = ∑_{i=1}^k a_i w_i = ∑_{i=1}^k a_i B z_i

so

x − ∑_{i=1}^k a_i z_i ∈ ker(B)

Hence

x − ∑_{i=1}^k a_i z_i = ∑_{j=1}^r b_j u_j

This shows that

dim (ker (AB)) ≤ k + r ≤ dim (ker (A)) + dim (ker (B))

Note that a little more can be said: {u1, · · · , ur, z1, · · · , zk} is independent. Here is why. Suppose

∑_i a_i u_i + ∑_{j=1}^k b_j z_j = 0

Then do B to both sides and conclude that

∑_j b_j B z_j = 0

which forces each b_j = 0. Then the independence of the u_i implies each a_i = 0. It follows that in fact,

dim (ker (AB)) = k + r ≤ dim (ker (A)) + dim (ker (B))

and the remaining inequality is an equality if and only if B (Fp) ⊇ ker (A).

This is because B(F^p) ∩ ker(A) ⊆ ker(A) and these are equal exactly when they have the same dimension.
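The inequality dim ker(AB) ≤ dim ker(A) + dim ker(B) can be spot-checked numerically via rank-nullity (a sketch with an assumed random example):

```python
import numpy as np

# Sketch (assumed random example): check dim ker(AB) <= dim ker A + dim ker B,
# computing dim ker M as (#columns of M) - rank(M) by rank-nullity.
def nullity(M):
    return M.shape[1] - np.linalg.matrix_rank(M)

rng = np.random.default_rng(3)
A = rng.integers(-2, 3, size=(4, 5)).astype(float)
B = rng.integers(-2, 3, size=(5, 6)).astype(float)

assert nullity(A @ B) <= nullity(A) + nullity(B)
```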


33. Recall that every positive integer can be factored into a product of primes in a unique way. Show there must be infinitely many primes. Hint: Show that if you have any finite set of primes and you multiply them and then add 1, the result cannot be divisible by any of the primes in your finite set. This idea in the hint is due to Euclid, who lived about 300 B.C.

Consider p1 p2 · · · pn + 1. Then if

p1 p2 · · · pn + 1 = l p_k,

you would have

1/p_k + m = l

where m is an integer, the product of the other primes different than p_k. This is a contradiction. Hence there must be another prime to divide p1 p2 · · · pn + 1 and so there are infinitely many primes.
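Euclid's argument is easy to illustrate concretely (a sketch with an assumed list of primes):

```python
# Sketch of the hint: the product of any finite list of primes, plus 1, leaves
# remainder 1 on division by each prime in the list, so none of them divides it.
primes = [2, 3, 5, 7, 11, 13]
N = 1
for p in primes:
    N *= p
N += 1      # 2*3*5*7*11*13 + 1 = 30031

assert all(N % p == 1 for p in primes)
```

(Indeed 30031 = 59 × 509, so the new prime divisors need not be N itself.)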

34. There are lots of fields. This will give an example of a finite field. Let Z denote the set of integers. Thus Z = {· · · , −3, −2, −1, 0, 1, 2, 3, · · ·}. Also let p be a prime number. We will say that two integers a, b are equivalent and write a ∼ b if a − b is divisible by p. Thus they are equivalent if a − b = px for some integer x. First show that a ∼ a. Next show that if a ∼ b then b ∼ a. Finally show that if a ∼ b and b ∼ c then a ∼ c. For a an integer, denote by [a] the set of all integers which are equivalent to a, the equivalence class of a. Show first that it suffices to consider only [a] for a = 0, 1, 2, · · · , p − 1 and that for 0 ≤ a < b ≤ p − 1, [a] ≠ [b]. That is, [a] = [r] where r ∈ {0, 1, 2, · · · , p − 1}. Thus there are exactly p of these equivalence classes. Hint:

Recall the Euclidean algorithm. For a > 0, a = mp + r where r < p. Next define the following operations.

[a] + [b] ≡ [a+ b]

[a] [b] ≡ [ab]

Show these operations are well defined. That is, if [a] = [a′] and [b] = [b′], then [a] + [b] = [a′] + [b′], with a similar conclusion holding for multiplication. Thus for addition you need to verify [a + b] = [a′ + b′] and for multiplication you need to verify [ab] = [a′b′]. For example, if p = 5 you have [3] = [8] and [2] = [7]. Is [2 × 3] = [8 × 7]? Is [2 + 3] = [8 + 7]? Clearly so in this example because when you subtract, the result is divisible by 5. So why is this so in general? Now verify that [0], [1], · · · , [p − 1] with these operations is a field. This is called the integers modulo a prime and is written Z_p. Since there are infinitely many primes p, it follows there are infinitely many of these finite fields. Hint: Most of the axioms are easy once you have shown the operations are well defined. The only two which are tricky are the ones which give the existence of the additive inverse and the multiplicative inverse. Of these, the first is not hard: −[x] = [−x]. Since p is prime, there exist integers x, y such that 1 = px + ky, and so 1 − ky = px, which says 1 ∼ ky and so [1] = [ky]. Now you finish the argument. What is the multiplicative identity in this collection of equivalence classes? Of course you could now consider field extensions based on these fields.

The only substantive issue is why Z_p is a field. Let [x] ∈ Z_p where [x] ≠ [0]. Thus x is not a multiple of p. Then x and p are relatively prime. Hence from Theorem 1.9.3, there exist a, b such that

1 = ap+ bx

Then

[1 − bx] = [ap] = [0]


and it follows that

[b][x] = [1]

so [b] = [x]⁻¹.
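The existence of these inverses is easy to check computationally (a sketch; Python's three-argument `pow` computes the modular inverse via the same extended Euclidean idea):

```python
# Sketch: in Z_p every nonzero class [x] has an inverse [b] coming from
# 1 = ap + bx; pow(x, -1, p) computes such a b for us.
p = 7
inverses = {x: pow(x, -1, p) for x in range(1, p)}

for x, b in inverses.items():
    assert (b * x) % p == 1    # [b][x] = [1] in Z_p
```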

35. Suppose the field of scalars is Z_2 described above. Show that

[0 1; 0 0] [0 0; 1 0] − [0 0; 1 0] [0 1; 0 0] = [1 0; 0 1]

Thus the identity is a commutator. Compare this with Problem 50 on Page 198.

This is easy. The left side equals

[1 0; 0 −1] = [1 0; 0 1]

since −1 = 1 in Z_2.
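The same computation, carried out with all arithmetic mod 2 (a sketch with a hypothetical helper `matmul_mod2`):

```python
# Sketch: redo the 2 x 2 computation over Z_2 and check that the commutator
# E12 E21 - E21 E12 really is the identity when all arithmetic is mod 2.
def matmul_mod2(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) % 2 for j in range(n)]
            for i in range(n)]

E12 = [[0, 1], [0, 0]]
E21 = [[0, 0], [1, 0]]
P = matmul_mod2(E12, E21)
Q = matmul_mod2(E21, E12)
C = [[(P[i][j] - Q[i][j]) % 2 for j in range(2)] for i in range(2)]

assert C == [[1, 0], [0, 1]]   # the identity, even though trace(C) = 2 = 0 in Z_2
```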

36. Suppose V is a vector space with field of scalars F. Let T ∈ L(V, W), the space of linear transformations mapping V onto W where W is another vector space. Define an equivalence relation on V as follows: v ∼ w means v − w ∈ ker(T). Recall that ker(T) ≡ {v : Tv = 0}. Show this is an equivalence relation. Now for [v] an equivalence class define T′[v] ≡ Tv. Show this is well defined. Also show that with the operations

[v] + [w] ≡ [v +w]

α [v] ≡ [αv]

this set of equivalence classes, denoted by V/ker(T), is a vector space. Show next that T′ : V/ker(T) → W is one to one, linear, and onto. This new vector space, V/ker(T), is called a quotient space. Show its dimension equals the difference between the dimension of V and the dimension of ker(T).

It is obviously an equivalence relation. Is the operation of addition well defined? If [v] = [v′], and similarly [w] = [w′], is it the case that [v + w] = [v′ + w′]? Is

v +w− (v′ +w′) ∈ ker (T )?

Of course so. It is just the sum of two things in ker(T). Similarly the operation of scalar multiplication is well defined. Is T′ well defined? If v ∼ w, is it true that Tv = Tw? Of course so:

T (v −w) = 0

Is T′ linear? This is obvious. Is T′ one to one? If T′[v] = 0, then it says that Tv = 0 and so v ∈ ker(T). Hence [v] = [0]. Thus T′ is certainly one to one. If T is onto, it is obvious that T′ is onto. Just pick w ∈ W. Then Tv = w for some v. Then T′[v] = w.

37. Let V be an n dimensional vector space and let W be a subspace. Generalize the above problem to define and give properties of V/W. What is its dimension? What is a basis?

Here you define an equivalence relation by v ∼ u if v − u ∈ W. This is clearly an equivalence relation. Define [v] + [u] ≡ [u + v] and α[u] ≡ [αu]. Then by a repeat of the above, everything is well defined. It is clear that V/W, the set of equivalence classes, is a vector space with these operations as just defined. What is its dimension? Let {w1, · · · , wr} be a basis for W. Now extend to get a basis for V, {w1, · · · , wr, v1, · · · , v_{n−r}}. Then if

∑_{i=1}^{n−r} c_i [v_i] = [0],


it follows that ∑_{i=1}^{n−r} c_i v_i ∈ W and so there exist scalars d_i such that

∑_{i=1}^r d_i w_i + ∑_{i=1}^{n−r} c_i v_i = 0

Now, since {w1, · · · , wr, v1, · · · , v_{n−r}} is a basis, it follows that all the d_i and c_i are equal to 0. Therefore, in particular, the vectors [v_i] form a linearly independent set in V/W. Do these vectors span V/W? Suppose [v] ∈ V/W. Then

v = ∑_{i=1}^r d_i w_i + ∑_{i=1}^{n−r} c_i v_i

for suitable scalars ci and di. Now it follows that

[v] = ∑_{i=1}^{n−r} c_i [v_i]

and so the dimension of V/W equals n− r where r is the dimension of W .

38. If F and G are two fields and F ⊆ G, can you consider G as a vector space with field of scalars F? Explain.

Yes. This is obvious.

39. Let A denote the algebraic numbers, those real numbers which are roots of polynomials having rational coefficients. Show A can be considered a vector space with field of scalars Q. What is the dimension of this vector space, finite or infinite?

As in the above problem, since A is a field, it can be considered as a vector space over Q. So what is the dimension? It is clearly infinite. To see this, consider xⁿ − 7. By Eisenstein's criterion with the prime 7, this polynomial is irreducible over Q. Therefore, the minimal polynomial for ⁿ√7 is xⁿ − 7, and so for α = ⁿ√7, the numbers 1, α, α², · · · , αⁿ⁻¹ are linearly independent, since otherwise xⁿ − 7 would not be the minimal polynomial of α. Since n is arbitrary, this shows that the dimension of A is infinite.

40. As mentioned, for distinct algebraic numbers α_i, the complex numbers {e^{α_i}}_{i=1}^n are linearly independent over the field of scalars A, where A denotes the algebraic numbers, those which are roots of a polynomial having integer (rational) coefficients. What is the dimension of the vector space C with field of scalars A, finite or infinite? If the field of scalars were C instead of A, would this change? What if the field of scalars were R?

How many distinct algebraic numbers can you get? Clearly as many as desired. Just consider the nth roots of some complex number. If you have n distinct algebraic numbers α_i, then from the Lindemann–Weierstrass theorem mentioned above, {e^{α_i}}_{i=1}^n is linearly independent, so the dimension of C over A is infinite. Of course if the field of scalars were C this would change completely. Then the dimension of C over C is clearly 1.

41. Suppose F is a countable field and let A be the algebraic numbers, those numbers which are roots of a polynomial having coefficients in F which are in G, some other field containing F. Show A is also countable.

You consider each polynomial of degree n which has coefficients in the countable field F. There are countably many of these. Each has finitely many roots in A. Therefore,


there are countably many things in A which correspond to polynomials of degree n. The totality of A is obtained then from taking a countable union of countable sets, and this is countable.

42. This problem is on partial fractions. Suppose you have

R(x) = p(x)/(q1(x) · · · qm(x)),   degree of p(x) < degree of denominator,

where the polynomials q_i(x) are relatively prime and all the polynomials p(x) and q_i(x) have coefficients in a field of scalars F. Thus there exist polynomials a_i(x) having coefficients in F such that

1 = ∑_{i=1}^m a_i(x) q_i(x)

Explain why

R(x) = p(x) ∑_{i=1}^m a_i(x) q_i(x) / (q1(x) · · · qm(x)) = ∑_{i=1}^m a_i(x) p(x) / ∏_{j≠i} q_j(x)

This is because you just did

R(x) = p(x)/(q1(x) · · · qm(x)) = p(x) ∑_{i=1}^m a_i(x) q_i(x) / (q1(x) · · · qm(x))

Then simplifying this yields

R(x) = ∑_{i=1}^m a_i(x) p(x) / ∏_{j≠i} q_j(x)

Now continue doing this on each term in the above sum till finally you obtain an expression of the form

∑_{i=1}^m b_i(x)/q_i(x)

Using the Euclidean algorithm for polynomials, explain why the above is of the form

M(x) + ∑_{i=1}^m r_i(x)/q_i(x)

where the degree of each r_i(x) is less than the degree of q_i(x) and M(x) is a polynomial.

You just note that b_i(x) = l_i(x) q_i(x) + r_i(x)

where the degree of r_i(x) is less than the degree of q_i(x). Then replacing each b_i(x) by this expression, you obtain the above expression.

Now argue that M (x) = 0.

You have

M(x) + ∑_{i=1}^m r_i(x)/q_i(x) = p(x)/(q1(x) · · · qm(x))


Now multiply both sides by q1(x) · · · qm(x). If M(x) is not zero, then the degree of the left would be larger than the degree of the right in the equation which results. Hence M(x) = 0.

From this explain why the usual partial fractions expansion of calculus must be true. You can use the fact that every polynomial having real coefficients factors into a product of irreducible quadratic polynomials and linear polynomials having real coefficients. This follows from the fundamental theorem of algebra in the appendix.

What if some q_i(x) = k_i(x)ⁿ? In this case, you can go further. You have the partial fractions expansion described above and you have a term of the form

a(x)/k(x)ⁿ,   degree of a(x) < degree of k(x)ⁿ

I claim this can be written as

∑_{i=1}^n a_i(x)/k(x)^i

where the degree of a_i(x) < degree of k(x). This can be proved by induction. If n = 1, there is nothing to show. Suppose then it is true for n − 1.

a(x)/k(x)ⁿ = (1/k(x)) · (a(x)/k(x)ⁿ⁻¹)

If the degree of a(x) is less than the degree of k(x)ⁿ⁻¹, then there is nothing to show; induction takes care of it. If this is not the case, you can write

a(x)/k(x)ⁿ = (1/k(x)) · (a(x)/k(x)ⁿ⁻¹) = (1/k(x)) · ((b(x) k(x)ⁿ⁻¹ + r(x))/k(x)ⁿ⁻¹)

where the degree of r(x) is less than the degree of k(x)ⁿ⁻¹. Then this reduces to

a(x)/k(x)ⁿ = b(x)/k(x) + (r(x)/k(x)ⁿ⁻¹) · (1/k(x))

The second term gives what is desired by induction. I claim that the degree of b(x) must be less than the degree of k(x). If not, you could multiply both sides by k(x)ⁿ and obtain

a(x) = b(x) k(x)ⁿ⁻¹ + r(x)

and the degree of the left is less than the degree of k(x)ⁿ while the degree of the right is at least the degree of k(x)ⁿ. Therefore, the desired decomposition has been obtained. Now the usual procedure for partial fractions follows. If you have a rational function p(x)/q(x) where the degree of q(x) is larger than the degree of p(x), then q(x) factors into a product of linear and irreducible (over R) quadratic factors by the fundamental theorem of algebra. Then you just apply the above decomposition result.
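The first step of the procedure can be illustrated on a small assumed example: with q1 = x − 1, q2 = x − 2, one may take a1 = 1, a2 = −1 in 1 = a1·q1 + a2·q2, which gives 1/((x−1)(x−2)) = 1/(x−2) − 1/(x−1). A sketch checking this with exact rational arithmetic:

```python
from fractions import Fraction

# Sketch (assumed example): 1 = 1*(x-1) + (-1)*(x-2), so the expansion of
# 1/((x-1)(x-2)) is 1/(x-2) - 1/(x-1).  Verified at a few rational points.
for x in (Fraction(3), Fraction(5, 2), Fraction(-1)):
    lhs = 1 / ((x - 1) * (x - 2))
    rhs = 1 / (x - 2) - 1 / (x - 1)
    assert lhs == rhs
```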

43. Suppose {f1, · · · , fn} is an independent set of smooth functions defined on some interval (a, b). Now let A be an invertible n × n matrix. Define new functions g1, · · · , gn as follows:

(g1, . . . , gn)ᵀ = A (f1, . . . , fn)ᵀ


Is it the case that g1, · · · , gn is also independent? Explain why.

Suppose ∑_{i=1}^{n} a_i g_i = 0. Then

0 = ∑_i a_i ∑_j A_ij f_j = ∑_j f_j ∑_i A_ij a_i

It follows that

∑_i A_ij a_i = 0

for each j. Therefore, since Aᵀ is invertible, it follows that each a_i = 0. Hence the functions g_i are linearly independent.
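A quick numerical sanity check of this, assuming nothing beyond NumPy: sample three independent smooth functions at many points (full rank of the sample matrix is a proxy for independence on the interval), apply an invertible A, and confirm the rank is preserved. The particular functions and matrix here are arbitrary choices for illustration.

```python
import numpy as np

# Sample f1 = sin, f2 = cos, f3 = exp on (0, 1); rows of F are the functions.
x = np.linspace(0.1, 0.9, 50)
F = np.vstack([np.sin(x), np.cos(x), np.exp(x)])
assert np.linalg.matrix_rank(F) == 3        # the f_i are independent

A = np.array([[2., 1., 0.],
              [0., 1., 1.],
              [1., 0., 1.]])                # invertible: det(A) = 3
G = A @ F                                   # g_i = sum_j A_ij f_j, sampled
assert np.linalg.matrix_rank(G) == 3        # the g_i remain independent
```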

F.26 Exercises 9.5

1. If A, B, and C are each n × n matrices and ABC is invertible, why are each of A, B, and C invertible?

This is because ABC is one to one.

2. Give an example of a 3 × 2 matrix with the property that the linear transformation determined by this matrix is one to one but not onto.

[ 1 0 ]
[ 0 1 ]
[ 0 0 ]

3. Explain why Ax = 0 always has a solution whenever A is a linear transformation.

Since A is linear, A0 = A(0 + 0) = A0 + A0, so A0 = 0. Thus x = 0 is always a solution.

4. Review problem: Suppose det (A − λI) = 0. Show using Theorem 3.1.15 there exists x ≠ 0 such that (A − λI)x = 0.

Since the matrix (A − λI) is not invertible, it follows that there exists x ≠ 0 which it sends to 0.

5. How does the minimal polynomial of an algebraic number relate to the minimal polynomial of a linear transformation? Can an algebraic number be thought of as a linear transformation? How?

They are pretty much the same thing. If you have an algebraic number α, it can be thought of as a linear transformation mapping the field F(α) to itself according to the rule α(β) = αβ. The minimal polynomial of α is then the minimal polynomial of this linear transformation.

6. Recall the fact from algebra that if p (λ) and q (λ) are polynomials, then there exists l (λ), a polynomial such that

q (λ) = p (λ) l (λ) + r (λ)

where the degree of r (λ) is less than the degree of p (λ) or else r (λ) = 0. With this in mind, why must the minimal polynomial always divide the characteristic polynomial? That is, why does there always exist a polynomial l (λ) such that p (λ) l (λ) = q (λ)? Can you give conditions which imply the minimal polynomial equals the characteristic polynomial? Go ahead and use the Cayley Hamilton theorem.

If q (λ) is the characteristic polynomial and p (λ) is the minimal polynomial, then q (A) = 0 by the Cayley Hamilton theorem and p (A) = 0 by definition, so from the above equation,

0 = r (A)

and this cannot happen unless r (λ) = 0, because if not, r (λ) would have smaller degree than p (λ) and p (λ) was not the minimal polynomial after all.

7. In the following examples, a linear transformation T is given by specifying its action on a basis β. Find its matrix with respect to this basis.

(a) T(1, 2)ᵀ = 2(1, 2)ᵀ + 1(−1, 1)ᵀ,  T(−1, 1)ᵀ = (−1, 1)ᵀ

The basis is (1, 2)ᵀ, (−1, 1)ᵀ. Formally,

(2(1, 2)ᵀ + 1(−1, 1)ᵀ, (−1, 1)ᵀ) = ((1, 2)ᵀ, (−1, 1)ᵀ) A

[ 1 −1 ]   [ 1 −1 ]
[ 5  1 ] = [ 2  1 ] A

A = [ 1 −1 ]⁻¹ [ 1 −1 ]   [ 2 0 ]
    [ 2  1 ]   [ 5  1 ] = [ 1 1 ]

(b) T(0, 1)ᵀ = 2(0, 1)ᵀ + 1(−1, 1)ᵀ,  T(−1, 1)ᵀ = (0, 1)ᵀ

(2(0, 1)ᵀ + 1(−1, 1)ᵀ, (0, 1)ᵀ) = ((0, 1)ᵀ, (−1, 1)ᵀ) A

A = [ 2 1 ]
    [ 1 0 ]

(c) T(1, 0)ᵀ = 2(1, 2)ᵀ + 1(1, 0)ᵀ,  T(1, 2)ᵀ = 1(1, 0)ᵀ − (1, 2)ᵀ

(2(1, 2)ᵀ + 1(1, 0)ᵀ, 1(1, 0)ᵀ − (1, 2)ᵀ) = ((1, 0)ᵀ, (1, 2)ᵀ) A

A = [ 1  1 ]
    [ 2 −1 ]
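The computation in each part is a single linear solve, C = B A where the columns of B are the basis vectors and the columns of C are their images in standard coordinates. A minimal NumPy check of part (a):

```python
import numpy as np

B = np.array([[1., -1.],      # columns are the basis vectors (1,2), (-1,1)
              [2.,  1.]])
C = np.array([[1., -1.],      # columns are T(b1), T(b2) in standard coordinates
              [5.,  1.]])
A = np.linalg.solve(B, C)     # solves B A = C, i.e. A = B^{-1} C
assert np.allclose(A, [[2., 0.],
                       [1., 1.]])
```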

8. Let β = u1, · · · , un be a basis for Fⁿ and let T : Fⁿ → Fⁿ be defined as follows.

T(∑_{k=1}^{n} a_k u_k) = ∑_{k=1}^{n} a_k b_k u_k

First show that T is a linear transformation. Next show that the matrix of T with respect to this basis is

[T]_β = diag(b1, · · · , bn)

Show that the above definition is equivalent to simply specifying T on the basis vectors of β by

T(u_k) = b_k u_k.

Formally,

(b1 u1, b2 u2, · · · , bn un) = (u1, u2, · · · , un) [T]_β

Thus [T]_β is the diagonal matrix just described.


9. ↑In the situation of the above problem, let γ = e1, · · · , en be the standard basis for Fⁿ where e_k is the vector which has 1 in the kth entry and zeros elsewhere. Show that

[T]_γ = (u1 · · · un) [T]_β (u1 · · · un)⁻¹    (6.36)

We have T u_i = b_i u_i and so we know [T]_β. The problem is to find [T]_γ where γ is the good basis. Consider the diagram

Fⁿ —[T]_γ→ Fⁿ
↓           ↓
Fⁿ ——T——→ Fⁿ
↑           ↑
Fⁿ —[T]_β→ Fⁿ

On the left side going down you get x → x → (u1 · · · un)⁻¹ x. Going up on the right, you need to have x → (u1 · · · un) x → (u1 · · · un) x. Then you need

[T]_γ = (u1 · · · un) [T]_β (u1 · · · un)⁻¹
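A small numerical check of 6.36, with an arbitrary basis u1 = (1, 2), u2 = (−1, 1) and b = (2, 3) chosen purely for illustration:

```python
import numpy as np

U = np.array([[1., -1.],
              [2.,  1.]])                 # columns u1, u2 (an example basis)
T_beta = np.diag([2., 3.])                # [T]_beta for b = (2, 3)
T_gamma = U @ T_beta @ np.linalg.inv(U)   # formula 6.36

# T should send u1 to 2*u1 and u2 to 3*u2:
assert np.allclose(T_gamma @ U[:, 0], 2 * U[:, 0])
assert np.allclose(T_gamma @ U[:, 1], 3 * U[:, 1])
```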

10. ↑Generalize the above problem to the situation where T is given by specifying its action on the vectors of a basis β = u1, · · · , un as follows.

T u_k = ∑_{j=1}^{n} a_jk u_j.

Letting A = (a_ij), verify that for γ = e1, · · · , en, 6.36 still holds and that [T]_β = A.

This isn't really any different. As explained earlier, the matrix of T with respect to the given basis is just A where it is as described. Thus A = [T]_β and by a repeat of the above argument,

[T]_γ = (u1 · · · un) [T]_β (u1 · · · un)⁻¹

11. Let P3 denote the set of real polynomials of degree no more than 3, defined on an interval [a, b]. Show that P3 is a subspace of the vector space of all functions defined on this interval. Show that a basis for P3 is 1, x, x², x³. Now let D denote the differentiation operator which sends a function to its derivative. Show D is a linear transformation which sends P3 to P3. Find the matrix of this linear transformation with respect to the given basis.

(0, 1, 2x, 3x²) = (1, x, x², x³) A where A is the desired matrix. Then clearly

A = [ 0 1 0 0 ]
    [ 0 0 2 0 ]
    [ 0 0 0 3 ]
    [ 0 0 0 0 ]

12. Generalize the above problem to Pn, the space of polynomials of degree no more than n with basis 1, x, · · · , xⁿ.

The pattern seems pretty clear. A will be an (n + 1) × (n + 1) matrix which has zeros everywhere except on the super diagonal where you have, starting at the top and going towards the bottom, the numbers in the following order: 1, 2, 3, · · · , n.


13. In the situation of the above problem, let the linear transformation be T = D² + 1, defined as Tf = f″ + f. Find the matrix of this linear transformation with respect to the given basis 1, x, · · · , xⁿ.

First consider the operator D².

(0, 0, 2, 6x, 12x², · · ·) = (1, x, x², x³, x⁴, · · ·) A

To see the pattern, let n = 4. Then you would have

A = [ 0 0 2 0 0  ]
    [ 0 0 0 6 0  ]
    [ 0 0 0 0 12 ]
    [ 0 0 0 0 0  ]
    [ 0 0 0 0 0  ]

Then the matrix will be this one added to the identity. Hence it is the matrix

[ 1 0 2 0 0  ]
[ 0 1 0 6 0  ]
[ 0 0 1 0 12 ]
[ 0 0 0 1 0  ]
[ 0 0 0 0 1  ]
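The same matrix can be produced mechanically with SymPy by applying T to each basis monomial and reading off the coordinates; this is just a check of the n = 4 computation above.

```python
import sympy as sp

x = sp.symbols('x')
basis = [x**k for k in range(5)]          # 1, x, x^2, x^3, x^4  (n = 4)

def coords(p):
    """Coordinates of the polynomial p in the basis 1, x, ..., x^4."""
    p = sp.expand(p)
    return [p.coeff(x, k) for k in range(5)]

# Columns of M are the coordinates of T(x^k) = (x^k)'' + x^k.
M = sp.Matrix([coords(sp.diff(b, x, 2) + b) for b in basis]).T
print(M)
```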

14. In calculus, the following situation is encountered. There exists a vector valued function f : U → Rᵐ where U is an open subset of Rⁿ. Such a function is said to have a derivative or to be differentiable at x ∈ U if there exists a linear transformation T : Rⁿ → Rᵐ such that

lim_{v→0} |f(x + v) − f(x) − Tv| / |v| = 0.

First show that this linear transformation, if it exists, must be unique. Next show that for β = e1, · · · , en, the standard basis, the kth column of [T]_β is

∂f/∂x_k (x).

Actually, the result of this problem is a well kept secret. People typically don't see this in calculus. It is seen for the first time in advanced calculus, if then.

Suppose that T′ also works. Then letting t > 0,

lim_{t→0} |f(x + tv) − f(x) − tTv| / t = lim_{t→0} |f(x + tv) − f(x) − tT′v| / t = 0

In particular,

lim_{t→0} |t(Tv − T′v)| / t = 0

which says that Tv = T′v and since v is arbitrary, this shows T = T′.

15. Recall that A is similar to B if there exists a matrix P such that A = P⁻¹BP. Show that if A and B are similar, then they have the same determinant. Give an example of two matrices which are not similar but have the same determinant.

If A = P⁻¹BP, then det(A) = det(P⁻¹) det(B) det(P) = det(B). These two are not similar but have the same determinant.

[ 1 0 ]   [ 1 1 ]
[ 0 1 ] , [ 0 1 ]

You can see these are not similar by noticing that the second has an eigenspace of dimension equal to 1, so it is not similar to any diagonal matrix, which is what the first one is.

16. Suppose A ∈ L (V, W) where dim (V) > dim (W). Show ker (A) ≠ {0}. That is, show there exist nonzero vectors v ∈ V such that Av = 0.

If not, then A would be one to one and so it would take a basis to a linearly independent set, which is impossible because W has smaller dimension than V.

17. A vector v is in the convex hull of a nonempty set S if there are finitely many vectors of S, v1, · · · , vm, and nonnegative scalars t1, · · · , tm such that

v = ∑_{k=1}^{m} t_k v_k,  ∑_{k=1}^{m} t_k = 1.

Such a linear combination is called a convex combination. Suppose now that S ⊆ V, a vector space of dimension n. Show that if v = ∑_{k=1}^{m} t_k v_k is a vector in the convex hull for m > n + 1, then there exist other scalars t′_k such that

v = ∑_{k=1}^{m−1} t′_k v_k.

Thus every vector in the convex hull of S can be obtained as a convex combination of at most n + 1 points of S. This incredible result is in Rudin [23]. Hint: Consider L : Rᵐ → V × R defined by

L(a) ≡ (∑_{k=1}^{m} a_k v_k, ∑_{k=1}^{m} a_k)

Explain why ker (L) ≠ {0}. Next, letting a ∈ ker (L) \ {0} and λ ∈ R, note that λa ∈ ker (L). Thus for all λ ∈ R,

v = ∑_{k=1}^{m} (t_k + λa_k) v_k.

Now vary λ till t_k + λa_k = 0 for some a_k ≠ 0.

If any of the t_k = 0, there is nothing to prove. Hence you can assume each of these t_k > 0. Suppose then that m > n + 1 and consider

L(a) ≡ (∑_{k=1}^{m} a_k v_k, ∑_{k=1}^{m} a_k)

a mapping from Rᵐ to V × R, a vector space of dimension n + 1. Then since m > n + 1, this mapping L cannot be one to one. Hence there exists a ≠ 0 such that La = 0. Thus also λa ∈ ker (L). By assumption, v = ∑_{k=1}^{m} t_k v_k. Then for each λ ∈ R,

v = ∑_{k=1}^{m} (t_k + λa_k) v_k

because ∑_{k=1}^{m} a_k v_k = 0. Now ∑_k (t_k + λa_k) = 1 because ∑_k a_k = 0. Also, for λ = 0, each of the t_k + λa_k > 0. Adjust λ so that each of these is still ≥ 0 but one vanishes. Then you have gotten v as a convex combination of fewer than m vectors.
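The hint is effectively an algorithm. Here is a sketch in NumPy (floating point with tolerances; the function names are mine): find a ∈ ker L via the SVD, slide λ until some coefficient vanishes, and repeat while m > n + 1.

```python
import numpy as np

def caratheodory_step(points, t):
    """One reduction step: points is (m, n), t nonnegative weights summing
    to 1 with m > n + 1.  Returns fewer points/weights, same combination."""
    m, n = points.shape
    L = np.vstack([points.T, np.ones(m)])   # the map a -> (sum a_k v_k, sum a_k)
    a = np.linalg.svd(L)[2][-1]             # a nonzero element of ker(L)
    if not np.any(a < -1e-12):              # ensure some a_k < 0 (sum a_k = 0)
        a = -a
    neg = a < -1e-12
    lam = np.min(t[neg] / -a[neg])          # largest lambda keeping t + lam*a >= 0
    t_new = np.clip(t + lam * a, 0.0, None)
    keep = t_new > 1e-10                    # at least one weight has vanished
    t_new = t_new[keep]
    return points[keep], t_new / t_new.sum()

def caratheodory(points, t):
    """Reduce a convex combination to at most n + 1 points (Problem 17)."""
    while points.shape[0] > points.shape[1] + 1:
        points, t = caratheodory_step(points, t)
    return points, t

rng = np.random.default_rng(0)
pts = rng.random((6, 2))                    # 6 points in R^2
w = np.full(6, 1 / 6)
v = w @ pts                                 # a convex combination
P, W = caratheodory(pts, w)
assert P.shape[0] <= 3                      # n + 1 = 3 points suffice
assert np.allclose(W @ P, v) and W.min() >= 0
```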


18. For those who know about compactness, use Problem 17 to show that if S ⊆ Rⁿ and S is compact, then so is its convex hull.

Suppose that the convex hull of S is not compact. Then there exists a sequence ((v_{1k}, v_{2k}, · · · , v_{(n+1)k}))_{k=1}^{∞}, with each v_{ik} in S, and s_k ∈ C ≡ {s : ∑_{i=1}^{n+1} s_i = 1, s_i ∈ [0, 1]} such that

(∑_{i=1}^{n+1} s_{ik} v_{ik})_{k=1}^{∞}

has no convergent subsequence. However, there exists a further subsequence, still denoted as k, such that for each i, lim_{k→∞} v_{ik} = v_i. This is because S is given to be compact. Also, taking yet another subsequence, you can assume that lim_{k→∞} s_{ik} = s_i because of the compactness of C. But this yields an obvious contradiction to the assertion that the above has no convergent subsequence.

19. Suppose Ax = b has a solution. Explain why the solution is unique precisely when Ax = 0 has only the trivial (zero) solution.

This is because the general solution is y_p + y where Ay_p = b and Ay = 0. Now A0 = 0 and so the solution is unique precisely when y = 0 is the only solution to Ay = 0.

20. Let A be an n × n matrix of elements of F. There are two cases. In the first case, F contains a splitting field of p_A (λ) so that p_A (λ) factors into a product of linear polynomials having coefficients in F. It is the second case which is of interest here, where p_A (λ) does not factor into linear factors having coefficients in F. Let G be a splitting field of p_A (λ) and let q_A (λ) be the minimal polynomial of A with respect to the field G. Explain why q_A (λ) must divide p_A (λ). Now why must q_A (λ) factor completely into linear factors?

In the field G,

p_A (λ) = q_A (λ) l (λ) + r (λ)

where r (λ) either equals 0 or has degree less than the degree of q_A (λ). Then since p_A (A) = 0 and q_A (A) = 0, it follows that r (A) = 0 and so it must be the case that r (λ) = 0. Hence q_A (λ) must divide p_A (λ). Now G was the splitting field of p_A (λ) and this new polynomial divides this one. Therefore, this new polynomial also must factor completely into linear polynomials having coefficients in G:

p_A (λ) = q_A (λ) l (λ)

and all roots of p_A (λ) are in G, so the same is true of the roots of q_A (λ) because these are all roots of p_A (λ).

21. In Lemma 9.2.2 verify that L is linear.


L(a ∑_{i=1}^{n} α_i v_i + b ∑_{i=1}^{n} β_i v_i) = L(∑_{i=1}^{n} (aα_i + bβ_i) v_i)
= ∑_{i=1}^{n} (aα_i + bβ_i) L v_i
= a ∑_{i=1}^{n} α_i L v_i + b ∑_{i=1}^{n} β_i L v_i
= a L(∑_{i=1}^{n} α_i v_i) + b L(∑_{i=1}^{n} β_i v_i)

F.27 Exercises 10.6

1. In the discussion of Nilpotent transformations, it was asserted that if two n × n matrices A, B are similar, then Aᵏ is also similar to Bᵏ. Why is this so? If two matrices are similar, why must they have the same rank?

Say A = S⁻¹BS. Then Aᵏ = S⁻¹BᵏS and so these are also similar. You can write

S⁻¹BS S⁻¹BS S⁻¹BS · · · S⁻¹BS

and note that the inside SS⁻¹ cancels. Therefore, the result of the above is as claimed. Hence Aᵏ is similar to Bᵏ.

Now the dimension of A (Fⁿ) is the dimension of S⁻¹BS (Fⁿ) which is the same as the dimension of S⁻¹B (Fⁿ) which is the same as the dimension of B (Fⁿ) because S, S⁻¹ are one to one and onto. Hence they each take a basis to a basis.
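A numeric illustration of both claims, with an arbitrary nilpotent B and a random well-conditioned S chosen for the example:

```python
import numpy as np

rng = np.random.default_rng(1)
B = np.array([[0., 1., 0.],
              [0., 0., 0.],
              [0., 0., 0.]])               # a nilpotent example
S = rng.random((3, 3)) + 3 * np.eye(3)     # diagonally dominant, hence invertible
A = np.linalg.inv(S) @ B @ S               # A is similar to B

for k in range(1, 4):
    Ak = np.linalg.matrix_power(A, k)
    Bk = np.linalg.matrix_power(B, k)
    # A^k = S^{-1} B^k S, so the ranks agree for every k
    assert (np.linalg.matrix_rank(Ak, tol=1e-8)
            == np.linalg.matrix_rank(Bk, tol=1e-8))
```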

2. If A, B are both invertible, then they are both row equivalent to the identity matrix. Are they necessarily similar? Explain.

Obviously not. Consider

[ 1 1 ]   [ 1 0 ]
[ 0 1 ] , [ 0 1 ]

These are both in Jordan form.

3. Suppose you have two nilpotent matrices A, B and Aᵏ and Bᵏ both have the same rank for all k. Does it follow that A, B are similar? What if it is not known that A, B are nilpotent? Does it follow then?

Yes, this is so. You have A = S⁻¹JS and B = T⁻¹J′T where J, J′ are both in Jordan canonical form. Then you have that J = SAS⁻¹ and J′ = TBT⁻¹ and so Jᵏ and J′ᵏ have the same rank for all k. Therefore, J = J′ because these matrices are nilpotent, and for nilpotent Jordan forms the ranks of the powers determine the block sizes. It follows that

SAS⁻¹ = TBT⁻¹

and consequently,

B = T⁻¹SAS⁻¹T = (S⁻¹T)⁻¹ A (S⁻¹T)

so that A is similar to B. In case they are not nilpotent, the above problem gives a counterexample.


4. When we say a polynomial equals zero, we mean that all the coefficients equal 0. If we assign a different meaning to it which says that a polynomial

p (λ) = ∑_{k=0}^{n} a_k λᵏ = 0

when the value of the polynomial equals zero whenever a particular value of λ ∈ F is placed in the formula for p (λ), can the same conclusion be drawn? Is there any difference in the two definitions for ordinary fields like Q? Hint: Consider Z₂, the integers mod 2.

This sort of thing, where a polynomial equals 0 if it sends everything to 0, works fine when we think of polynomials as functions. However, consider λ² + λ where the coefficients come from Z₂. It sends everything to 0, but we don't want to consider this polynomial to be the zero polynomial because its coefficients are not all 0. Over an infinite field like Q, the two definitions agree.

5. Let A ∈ L (V, V) where V is a finite dimensional vector space with field of scalars F. Let p (λ) be the minimal polynomial and suppose φ (λ) is any nonzero polynomial such that φ (A) is not one to one and φ (λ) has smallest possible degree such that φ (A) is nonzero and not one to one. Show φ (λ) must divide p (λ).

Say φ (A) x = 0 where x ≠ 0. Consider all monic polynomials ψ (λ) such that ψ (A) x = 0. One such is the minimal polynomial p (λ). This condition is less severe than saying that ψ (A) = 0. Therefore, there are more polynomials considered. If you take the one of least degree, its degree must be no larger than the degree of p (λ). By assumption there is a polynomial φ (λ) for which φ (A) ≠ 0 but such that φ (A) x = 0. You can do the usual thing.

p (λ) = l (λ) φ (λ) + r (λ)

where r (λ) = 0 or else it has smaller degree than φ (λ). Then

0 = l (A) φ (A) x + r (A) x

It follows that r (A) x = 0. If r (λ) ≠ 0 this would be a contradiction to the definition of φ (λ). Therefore, φ (λ) divides p (λ).

6. Let A ∈ L (V, V) where V is a finite dimensional vector space with field of scalars F. Let p (λ) be the minimal polynomial and suppose φ (λ) is an irreducible polynomial with the property that φ (A) x = 0 for some specific x ≠ 0. Show that φ (λ) must divide p (λ). Hint: First write

p (λ) = φ (λ) g (λ) + r (λ)

where r (λ) is either 0 or has degree smaller than the degree of φ (λ). If r (λ) = 0 you are done. Suppose it is not 0. Let η (λ) be the monic polynomial of smallest degree with the property that η (A) x = 0. Now use the Euclidean algorithm to divide φ (λ) by η (λ). Contradict the irreducibility of φ (λ).

Say φ (A) x = 0 where x ≠ 0. Consider all monic polynomials ψ (λ) such that ψ (A) x = 0. One such is the minimal polynomial p (λ). This condition is less severe than saying that ψ (A) = 0. Therefore, there are more polynomials considered. If you take the one of least degree, η (λ), its degree must be no larger than the degree of p (λ). You can do the usual thing.

p (λ) = g (λ) η (λ) + r (λ)

where r (λ) = 0 or else it has smaller degree than η (λ). Then

0 = g (A) η (A) x + r (A) x

It follows that r (A) x = 0. If r (λ) ≠ 0 this would be a contradiction to the definition of η (λ). Therefore, η (λ) divides p (λ). Also

φ (λ) = η (λ) l (λ) + r (λ)

where the degree of r (λ) is less than the degree of η (λ) or else r (λ) = 0. If r (λ) ≠ 0, then r (A) x = 0 and r (λ) would have degree too small. Hence r (λ) = 0 and so η (λ) must divide φ (λ). But since φ (λ) is irreducible, this requires that the two must be scalar multiples of each other. Hence φ (λ) must also divide p (λ).

7. Suppose A is a linear transformation and let the characteristic polynomial be

det (λI − A) = ∏_{j=1}^{q} φ_j (λ)^{n_j}

where the φ_j (λ) are irreducible. Explain using Corollary 8.3.11 why the irreducible factors of the minimal polynomial are φ_j (λ) and why the minimal polynomial is of the form

∏_{j=1}^{q} φ_j (λ)^{r_j}

where r_j ≤ n_j. You can use the Cayley Hamilton theorem if you like.

This follows right away from the Cayley Hamilton theorem and that corollary. Since the minimal polynomial divides the characteristic polynomial, it must be of the form just mentioned.

8. Let

A = [ 1 0  0 ]
    [ 0 0 −1 ]
    [ 0 1  0 ]

Find the minimal polynomial for A.

You know that this satisfies a polynomial of degree no more than 3. Consider the powers I, A, A², A³:

[ 1 0 0 ]  [ 1 0  0 ]  [ 1  0  0 ]  [ 1  0 0 ]
[ 0 1 0 ], [ 0 0 −1 ], [ 0 −1  0 ], [ 0  0 1 ]
[ 0 0 1 ]  [ 0 1  0 ]  [ 0  0 −1 ]  [ 0 −1 0 ]

Thus it is desired to find a linear combination of these which equals 0, and out of all of them to pick the one using as few of them as possible, starting with the left side. An easy way to do this is to use the row reduced echelon form. Write them as column vectors and then do row operations.


Writing each matrix as a column vector in R⁹ (listing its entries row by row) gives the 9 × 4 matrix

[ 1  1  1  1 ]
[ 0  0  0  0 ]
[ 0  0  0  0 ]
[ 0  0  0  0 ]
[ 1  0 −1  0 ]
[ 0 −1  0  1 ]
[ 0  0  0  0 ]
[ 0  1  0 −1 ]
[ 1  0 −1  0 ]

whose row echelon form is

[ 1 0 0  1 ]
[ 0 1 0 −1 ]
[ 0 0 1  1 ]

with the remaining rows zero. Then it follows that A³ = I − A + A² and so the minimal polynomial is

λ³ − λ² + λ − 1

Does it work?

[ 1 0  0 ]³   [ 1 0  0 ]²   [ 1 0  0 ]   [ 1 0 0 ]   [ 0 0 0 ]
[ 0 0 −1 ]  − [ 0 0 −1 ]  + [ 0 0 −1 ] − [ 0 1 0 ] = [ 0 0 0 ]
[ 0 1  0 ]    [ 0 1  0 ]    [ 0 1  0 ]   [ 0 0 1 ]   [ 0 0 0 ]
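A NumPy check of this minimal polynomial, including that no polynomial of smaller degree would do (I, A, A² are independent as vectors in R⁹):

```python
import numpy as np

A = np.array([[1., 0.,  0.],
              [0., 0., -1.],
              [0., 1.,  0.]])
I = np.eye(3)
P = np.linalg.matrix_power

# lambda^3 - lambda^2 + lambda - 1 annihilates A:
assert np.allclose(P(A, 3) - P(A, 2) + A - I, 0)

# No quadratic works: I, A, A^2 are linearly independent in R^9
M = np.column_stack([I.ravel(), A.ravel(), P(A, 2).ravel()])
assert np.linalg.matrix_rank(M) == 3
```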

9. Suppose A is an n × n matrix and let v be a vector. Consider the A cyclic set of vectors v, Av, · · · , A^{m−1}v where this is an independent set of vectors but Aᵐv is a linear combination of the preceding vectors in the list. Show how to obtain a monic polynomial of smallest degree m, φ_v (λ), such that

φ_v (A) v = 0

Now let w1, · · · , wn be a basis and let φ (λ) be the least common multiple of the φ_{w_k} (λ). Explain why this must be the minimal polynomial of A. Give a reasonably easy algorithm for computing φ_v (λ).

You could simply form the cyclic set of vectors described, make them the columns of a matrix and then do the row reduced echelon form to reveal linear relations. This is how you can compute φ_v (λ). Then, having found this for each vector in a basis, and letting φ (λ) be the least common multiple, it follows that φ (A) v = 0 for each v in a basis and so φ (A) = 0. Also, if η (A) = 0, then η (λ) must be divisible by each of the φ_v (λ) for v in the given basis. Here is why.

η (λ) = l (λ) φ_v (λ) + r (λ), degree of r (λ) < degree of φ_v (λ) or r (λ) = 0

Then if r (λ) ≠ 0, since r (A) v = 0, you could contradict the fact that φ_v (λ) has smallest degree out of
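The algorithm just described can be sketched in a few lines of NumPy (floating point least squares in place of exact row reduction; the function name `phi_v` is mine):

```python
import numpy as np

def phi_v(A, v, tol=1e-8):
    """Coefficients [c0, ..., c_{m-1}, 1] of the monic polynomial of least
    degree m with c0*v + c1*A v + ... + A^m v = 0 (Problem 9)."""
    cols = [np.asarray(v, dtype=float)]
    while True:
        nxt = A @ cols[-1]                        # next vector in the cyclic list
        M = np.column_stack(cols)
        coef = np.linalg.lstsq(M, nxt, rcond=None)[0]
        if np.linalg.norm(M @ coef - nxt) < tol:  # nxt depends on the earlier ones
            return np.append(-coef, 1.0)
        cols.append(nxt)

# Example: A = [[2, 1], [0, 2]].  For v = e2 the relation A^2 v = 4 A v - 4 v
# holds, so phi_v(lambda) = lambda^2 - 4 lambda + 4 = (lambda - 2)^2.
A = np.array([[2., 1.], [0., 2.]])
assert np.allclose(phi_v(A, [1., 0.]), [-2., 1.])        # (lambda - 2)
assert np.allclose(phi_v(A, [0., 1.]), [4., -4., 1.])    # (lambda - 2)^2
```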

10. Here is a matrix.

[ −7  −1 −1 ]
[ −21 −3 −3 ]
[ 70  10 10 ]

Using the process of Problem 9 find the minimal polynomial of this matrix. It turns out the characteristic polynomial is λ³.


The cyclic set starting from e1 is e1, Ae1, A²e1 where

Ae1 = (−7, −21, 70)ᵀ,  A²e1 = (0, 0, 0)ᵀ.

Making these the columns of a matrix and row reducing,

[ 1 −7  0 ]                     [ 1 0 0 ]
[ 0 −21 0 ],  row echelon form: [ 0 1 0 ]
[ 0 70  0 ]                     [ 0 0 0 ]

It follows that A²e1 = 0·e1 + 0·Ae1, so φ_{e1} (λ) = λ². The same sort of thing will happen with e2, e3. Therefore, the minimal polynomial is just λ². Note that

[ −7  −1 −1 ]²   [ 0 0 0 ]
[ −21 −3 −3 ]  = [ 0 0 0 ]
[ 70  10 10 ]    [ 0 0 0 ]

11. Find the minimal polynomial for

A = [ 1  2 3 ]
    [ 2  1 4 ]
    [ −3 2 1 ]

by the above technique. Is what you found also the characteristic polynomial?

First here is the string of vectors which comes from e1:

e1 = (1, 0, 0)ᵀ, Ae1 = (1, 2, −3)ᵀ, A²e1 = (−4, −8, −2)ᵀ, A³e1 = (−26, −24, −6)ᵀ

[ 1 1  −4 −26 ]                     [ 1 0 0 −14 ]
[ 0 2  −8 −24 ],  row echelon form: [ 0 1 0 0   ]
[ 0 −3 −2 −6  ]                     [ 0 0 1 3   ]

Thus A³e1 = −14 e1 + 3 A²e1, that is, λ³ = 3λ² − 14, so φ_{e1} (λ) = λ³ − 3λ² + 14.

The minimal polynomial is then λ³ − 3λ² + 14. We know it can't have any larger degree and so this must be it.

[ 1  2 3 ]³     [ 1  2 3 ]²         [ 0 0 0 ]
[ 2  1 4 ]  − 3 [ 2  1 4 ]  + 14I = [ 0 0 0 ]
[ −3 2 1 ]      [ −3 2 1 ]          [ 0 0 0 ]

This is also the characteristic polynomial. Note that no computation involving determinants was required.
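A NumPy confirmation of both claims, comparing against the determinant-based characteristic polynomial as a cross-check:

```python
import numpy as np

A = np.array([[ 1., 2., 3.],
              [ 2., 1., 4.],
              [-3., 2., 1.]])
P = np.linalg.matrix_power

# A^3 - 3 A^2 + 14 I = 0:
assert np.allclose(P(A, 3) - 3 * P(A, 2) + 14 * np.eye(3), 0)

# and lambda^3 - 3 lambda^2 + 14 is also the characteristic polynomial:
assert np.allclose(np.poly(A), [1., -3., 0., 14.])
```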

12. Let A be an n × n matrix with field of scalars C. Letting λ be an eigenvalue, show the dimension of the eigenspace equals the number of Jordan blocks in the Jordan canonical form which are associated with λ. Recall the eigenspace is ker (λI − A).

Let the blocks associated with λ in the Jordan form be

[ J1 (λ)                ]
[        J2 (λ)         ]
[               ⋱       ]
[                 Jr (λ) ]

Thus this is an upper triangular matrix which has λ down the main diagonal. If you take away λI from it, you get a matrix of the form

[ N1 (λ)                ]
[        N2 (λ)         ]
[               ⋱       ]
[                 Nr (λ) ]

where

Ni (λ) = [ 0 1     ]
         [   0 ⋱   ]
         [     ⋱ 1 ]
         [       0 ]

How many columns of zeros will there be? Exactly r, where r is the number of those blocks. You get one at the beginning of each block. Hence there will be exactly r free variables and so the dimension of the eigenspace will equal the number of blocks.

13. For any n × n matrix, why is the dimension of the eigenspace always less than orequal to the algebraic multiplicity of the eigenvalue as a root of the characteristicequation? Hint: Note the algebraic multiplicity is the size of the appropriate blockin the Jordan form.

This is obvious from the above problem. The dimension of the eigenspace equals thenumber of blocks which must be no larger than the length of the diagonal.

14. Give an example of two nilpotent matrices which are not similar but have the same minimal polynomial if possible.

[ 0 1 0 0 ]²   [ 0 0 0 0 ]    [ 0 1 0 0 ]²   [ 0 0 0 0 ]
[ 0 0 0 0 ]  = [ 0 0 0 0 ],   [ 0 0 0 0 ]  = [ 0 0 0 0 ]
[ 0 0 0 0 ]    [ 0 0 0 0 ]    [ 0 0 0 1 ]    [ 0 0 0 0 ]
[ 0 0 0 0 ]    [ 0 0 0 0 ]    [ 0 0 0 0 ]    [ 0 0 0 0 ]

These are two which work: both have minimal polynomial λ², but they are not similar because they have different ranks.

15. Use the existence of the Jordan canonical form for a linear transformation whose minimal polynomial factors completely to give a proof of the Cayley Hamilton theorem which is valid for any field of scalars. Hint: First assume the minimal polynomial factors completely into linear factors. If this does not happen, consider a splitting field of the minimal polynomial. Then consider the minimal polynomial with respect to this larger field. How will the two minimal polynomials be related? Show the minimal polynomial always divides the characteristic polynomial.

First suppose the minimal polynomial factors. Then the linear transformation has a Jordan form.

J = [ J (λ1)          ]
    [        ⋱        ]
    [          J (λr) ]

where each block is as described in the description of the Jordan form. Say J (λi) is ri × ri in size. Then the characteristic polynomial is the determinant of λI − J, a matrix of the form

[ (λ − λ1) I + N (λ1)                      ]
[                     ⋱                    ]
[                       (λ − λr) I + N (λr) ]

where the blocks N (λi) have some −1's on the super diagonal and zeros elsewhere. Thus the characteristic polynomial is the determinant of this matrix which is

q (λ) = ∏_{i=1}^{r} (λ − λi)^{ri}

Then q (J) is given by

∏_{i=1}^{r} (J − λi I)^{ri}

The matrix J − λi I is of the form

[ J (λ1 − λi)                          ]
[             ⋱                        ]
[               J (0)                  ]
[                     ⋱                ]
[                       J (λr − λi)    ]

where the block J (0) is nilpotent with J (0)^{ri} = 0. It follows that (J − λi I)^{ri} is of the form

[ J (λ1 − λi)^{ri}                              ]
[                  ⋱                            ]
[                    0                          ]
[                      ⋱                        ]
[                        J (λr − λi)^{ri}       ]

Therefore, the product ∏_{i=1}^{r} (J − λi I)^{ri} = 0 from block multiplication.

Now in the general case where the minimal polynomial p_F (λ) does not necessarily factor, consider a splitting field for the minimal polynomial. The new minimal polynomial divides the old one. Therefore, the new minimal polynomial with respect to this splitting field G can be completely factored. Hence the Jordan form exists with respect to this new splitting field and new minimal polynomial p_G (λ). Then from the above, it follows that p_G (J) = 0. However, p_G (λ) divides p_F (λ) and so it is also the case that p_F (J) = 0. Since J is a matrix of T with respect to a suitable basis, it follows that p_F (T) = 0 also, where T is the original linear transformation.

16. Here is a matrix. Find its Jordan canonical form by directly finding the eigenvectors and generalized eigenvectors based on these to find a basis which will yield the Jordan form. The eigenvalues are 1 and 2.

A = [ −3 −2 5 3 ]
    [ −1  0 1 2 ]
    [ −4 −3 6 4 ]
    [ −1 −1 1 3 ]

Why is it typically impossible to find the Jordan canonical form? (Doing so requires knowing the eigenvalues exactly, and typically they cannot be found exactly.)

Eigenvectors: (4, 1, 3, 1)ᵀ ↔ 1, (1, 0, 1, 0)ᵀ ↔ 2.

Now it is necessary to find generalized eigenvectors. For the eigenvalue 1, solve (A − I)x = (4, 1, 3, 1)ᵀ:

[ −4 −2 5 3 ]
[ −1 −1 1 2 ] x = (4, 1, 3, 1)ᵀ,  Solution is: x = (4t − 4, t + 1, 3t − 2, t)ᵀ.
[ −4 −3 5 4 ]
[ −1 −1 1 2 ]

Taking t = 0, a generalized eigenvector is (−4, 1, −2, 0)ᵀ.

For the eigenvalue 2, solve (A − 2I)x = (1, 0, 1, 0)ᵀ:

[ −5 −2 5 3 ]
[ −1 −2 1 2 ] x = (1, 0, 1, 0)ᵀ,  Solution is: x = (t, 1, t, 1)ᵀ.
[ −4 −3 4 4 ]
[ −1 −1 1 1 ]

Taking t = 0, a generalized eigenvector is (0, 1, 0, 1)ᵀ.

So let S have these vectors as columns, each eigenvector followed by its generalized eigenvector:

S = [ 4 −4 1 0 ]
    [ 1  1 0 1 ]
    [ 3 −2 1 0 ]
    [ 1  0 0 1 ]

Now to get the Jordan form you take S⁻¹AS, but of course we know the Jordan form at this point. There will be two blocks, one for 1 and one for 2. Nevertheless, here it is:

S⁻¹ A S = [ 1 1 0 0 ]
          [ 0 1 0 0 ]
          [ 0 0 2 1 ]
          [ 0 0 0 2 ]
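A quick NumPy confirmation that this S works:

```python
import numpy as np

A = np.array([[-3., -2., 5., 3.],
              [-1.,  0., 1., 2.],
              [-4., -3., 6., 4.],
              [-1., -1., 1., 3.]])
S = np.array([[4., -4., 1., 0.],
              [1.,  1., 0., 1.],
              [3., -2., 1., 0.],
              [1.,  0., 0., 1.]])
J = np.linalg.solve(S, A @ S)        # S^{-1} A S
expected = np.array([[1., 1., 0., 0.],
                     [0., 1., 0., 0.],
                     [0., 0., 2., 1.],
                     [0., 0., 0., 2.]])
assert np.allclose(J, expected)
```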

17. People like to consider the solutions of first order linear systems of equations which are of the form

x′ (t) = Ax (t)

where here A is an n × n matrix. From the theorem on the Jordan canonical form, there exist S and S⁻¹ such that A = SJS⁻¹ where J is a Jordan form. Define y (t) ≡ S⁻¹x (t). Show y′ = Jy.

This is easy. You have

x′ = SJS⁻¹x

and so

(S⁻¹x)′ = J (S⁻¹x),  that is, y′ = Jy

Now suppose Ψ (t) is an n × n matrix whose columns are solutions of the above differential equation. Thus

Ψ′ = AΨ

Now let Φ be defined by SΦS⁻¹ = Ψ. Show

Φ′ = JΦ.

You have Φ = S⁻¹ΨS and so

Φ′ = S⁻¹Ψ′S = S⁻¹AΨS = S⁻¹A (SΦS⁻¹) S = (S⁻¹AS) Φ = JΦ

18. In the above Problem show that

det (Ψ)′ = trace (A) det (Ψ)

and so

det (Ψ (t)) = C e^{trace(A) t}

This is called Abel's formula and det (Ψ (t)) is called the Wronskian.

It is easiest to consider Φ′ = JΦ and verify that det (Φ (t)) = C e^{trace(J) t}. This will do it because J and A are similar, and so are Φ and Ψ, and similar matrices have the same trace and determinant.

det (Φ)′ = ∑_{i=1}^{n} det (Φ_i)

where Φ_i has the ith row differentiated and the other rows of Φ left unchanged. Thus, letting φ_i denote the ith row of Φ, the jth entry of φ′_i is given by

φ′_ij (t) = ∑_k J_ik Φ_kj = λ_i Φ_ij + a_i Φ_{(i+1)j}

Thus

φ′_i (t) = λ_i φ_i (t) + a_i φ_{i+1} (t)

where a_i is either 0 or 1. Then since the determinant of a matrix having two equal rows equals 0, it follows that

(det (Φ))′ = ∑_{i=1}^{n} λ_i det (Φ) = trace (J) det (Φ)

and so

det (Φ) = C e^{trace(J) t}

19. Let A be an n× n matrix and let J be its Jordan canonical form. Recall J is a blockdiagonal matrix having blocks Jk (λ) down the diagonal. Each of these blocks is ofthe form

Jk (λ) =

λ 1 0

λ. . .

. . . 10 λ

Now for ε > 0 given, let the diagonal matrix Dε be given by

Dε =

1 0ε

. . .

0 εk−1

Show that $D_\varepsilon^{-1} J_k(\lambda) D_\varepsilon$ has the same form as $J_k(\lambda)$, but with ε down the super diagonal instead of ones. That is, $J_k(\lambda)$ is replaced with
$$\begin{pmatrix} \lambda & \varepsilon & & 0 \\ & \lambda & \ddots & \\ & & \ddots & \varepsilon \\ 0 & & & \lambda \end{pmatrix}$$
Now show that for A an n × n matrix, it is similar to one which is just like the Jordan canonical form except that instead of the blocks having 1 down the super diagonal, they have ε.

That $D_\varepsilon^{-1} J_k(\lambda) D_\varepsilon$ is of the right form is easy to see by considering the 4 × 4 case.
$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \varepsilon^{-1} & 0 & 0 \\ 0 & 0 & \varepsilon^{-2} & 0 \\ 0 & 0 & 0 & \varepsilon^{-3} \end{pmatrix}\begin{pmatrix} \lambda & 1 & 0 & 0 \\ 0 & \lambda & 1 & 0 \\ 0 & 0 & \lambda & 1 \\ 0 & 0 & 0 & \lambda \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \varepsilon & 0 & 0 \\ 0 & 0 & \varepsilon^{2} & 0 \\ 0 & 0 & 0 & \varepsilon^{3} \end{pmatrix}$$
$$= \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \varepsilon^{-1} & 0 & 0 \\ 0 & 0 & \varepsilon^{-2} & 0 \\ 0 & 0 & 0 & \varepsilon^{-3} \end{pmatrix}\begin{pmatrix} \lambda & \varepsilon & 0 & 0 \\ 0 & \lambda\varepsilon & \varepsilon^{2} & 0 \\ 0 & 0 & \lambda\varepsilon^{2} & \varepsilon^{3} \\ 0 & 0 & 0 & \lambda\varepsilon^{3} \end{pmatrix} = \begin{pmatrix} \lambda & \varepsilon & 0 & 0 \\ 0 & \lambda & \varepsilon & 0 \\ 0 & 0 & \lambda & \varepsilon \\ 0 & 0 & 0 & \lambda \end{pmatrix}$$
Finally, the Jordan form is of the form
$$\begin{pmatrix} J_1 & & 0 \\ & \ddots & \\ 0 & & J_m \end{pmatrix}$$
where the blocks $J_k$ are as just described. So multiply on the right by
$$\begin{pmatrix} D_{\varepsilon 1} & & 0 \\ & \ddots & \\ 0 & & D_{\varepsilon m} \end{pmatrix}$$
and on the left by the inverse of this matrix. Using block multiplication you get the desired modification. Note that this shows that every n × n matrix having entries in C is similar to one which is close to a diagonal matrix.
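The rescaling just described is easy to check numerically. The following is a sketch (my own illustration, not from the text) that conjugating a 4 × 4 Jordan block by $D_\varepsilon$ puts ε on the super diagonal; the helper names and the values λ = 2, ε = 0.01 are arbitrary choices.

```python
# Check that D^{-1} J D has epsilon on the super diagonal of a Jordan block.
def jordan_block(lam, n):
    return [[lam if i == j else (1.0 if j == i + 1 else 0.0)
             for j in range(n)] for i in range(n)]

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

n, lam, eps = 4, 2.0, 0.01
D = [[eps ** i if i == j else 0.0 for j in range(n)] for i in range(n)]
D_inv = [[eps ** -i if i == j else 0.0 for j in range(n)] for i in range(n)]
B = mat_mul(D_inv, mat_mul(jordan_block(lam, n), D))
# B has lam on the diagonal, eps on the super diagonal, zeros elsewhere.
```

Entry (i, j) of the product is $\varepsilon^{-i} J_{ij}\varepsilon^{j}$, which is exactly ε when j = i + 1, so taking ε small makes the conjugated matrix as close to diagonal as desired.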

20. Let A be in L(V, V) and suppose that $A^p x \neq 0$ for some $x \neq 0$. Show that $A^p e_k \neq 0$ for some $e_k \in \{e_1, \cdots, e_n\}$, a basis for V. If you have a matrix which is nilpotent ($A^m = 0$ for some m), will it always be possible to find its Jordan form? Describe how to do it if this is the case. Hint: First explain why all the eigenvalues are 0. Then consider the way the Jordan form for nilpotent transformations was constructed in the above.

Yes, it is possible to compute the Jordan form. In fact, the proof of existence tells how to do it. A nilpotent matrix has all eigenvalues equal to 0. This is because if Ax = λx with x ≠ 0, then $A^k x = \lambda^k x$, and so for large k you get $\lambda^k x = 0$, forcing λ = 0. Finding the eigenvalues is always the difficulty, and here they are all known.

21. Suppose A is an n × n matrix and that it has n distinct eigenvalues. How do the minimal polynomial and characteristic polynomials compare? Determine other conditions based on the Jordan canonical form which will cause the minimal and characteristic polynomials to be different.

If the eigenvalues are distinct, then the two polynomials are obviously the same. More generally, A has a Jordan form
$$\begin{pmatrix} J(\lambda_1) & & \\ & \ddots & \\ & & J(\lambda_r) \end{pmatrix}$$
where each $J(\lambda_k)$ is an $r_k \times r_k$ block diagonal matrix consisting of Jordan blocks, the $\lambda_k$ being the distinct eigenvalues. Corresponding to one of these $J(\lambda_k)$ you get the term
$$(\lambda - \lambda_k)^{r_k}$$
in the characteristic polynomial. If $J(\lambda_k)$ is a single Jordan block of the form
$$\begin{pmatrix} \lambda_k & 1 & & \\ & \lambda_k & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_k \end{pmatrix}$$
then $J - \lambda_i I$ is of the form
$$\begin{pmatrix} J(\lambda_1 - \lambda_i) & & & & \\ & \ddots & & & \\ & & J(0) & & \\ & & & \ddots & \\ & & & & J(\lambda_r - \lambda_i) \end{pmatrix}$$
where $J(0)^{r_i} = 0$ and no smaller exponent will work. Therefore, the characteristic polynomial and the minimal polynomial will coincide. However, if $J(\lambda_k)$ is composed of many blocks, the extreme case being when it is a diagonal matrix having $\lambda_k$ down the diagonal, a smaller exponent will work, and so there will be a difference between the characteristic and minimal polynomials.

For example, consider the minimal polynomial for the matrix
$$N = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$
The characteristic polynomial is just $\lambda^4$, but $N^2 = 0$. If you had an unbroken string of ones down the super diagonal, then the minimal polynomial would also be $\lambda^4$. Getting a difference between the two polynomials involves having repeated eigenvalues and several blocks in the Jordan form for an eigenvalue.
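This contrast can be checked directly; the following sketch (mine, not the text's) compares the matrix N above, built from two 2 × 2 nilpotent blocks, with the single 4 × 4 nilpotent Jordan block.

```python
# N (two blocks) dies at the 2nd power; the unbroken string of ones
# only dies at the 4th power, so its minimal polynomial is x^4.
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(A, p):
    R = [[int(i == j) for j in range(len(A))] for i in range(len(A))]
    for _ in range(p):
        R = mat_mul(R, A)
    return R

N = [[0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0]]
full = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [0, 0, 0, 0]]
is_zero = lambda M: all(x == 0 for row in M for x in row)
```

Both matrices have characteristic polynomial $\lambda^4$; only the block structure distinguishes their minimal polynomials.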

22. Suppose A is a 3 × 3 matrix and it has at least two distinct eigenvalues. Is it possible that the minimal polynomial is different from the characteristic polynomial?

You might have three distinct eigenvalues, in which case the two polynomials are the same, or you would have one of the following Jordan forms:
$$\begin{pmatrix} a & 1 & 0 \\ 0 & a & 0 \\ 0 & 0 & b \end{pmatrix} \quad \text{or} \quad \begin{pmatrix} a & 0 & 0 \\ 0 & a & 0 \\ 0 & 0 & b \end{pmatrix}$$
In the first case, the two polynomials are the same. In either case the characteristic polynomial is
$$(\lambda - a)^2(\lambda - b)$$
In the first case,
$$\left(\begin{pmatrix} a & 1 & 0 \\ 0 & a & 0 \\ 0 & 0 & b \end{pmatrix} - \begin{pmatrix} a & 0 & 0 \\ 0 & a & 0 \\ 0 & 0 & a \end{pmatrix}\right)\left(\begin{pmatrix} a & 1 & 0 \\ 0 & a & 0 \\ 0 & 0 & b \end{pmatrix} - \begin{pmatrix} b & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & b \end{pmatrix}\right) = \begin{pmatrix} 0 & a-b & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \neq 0$$
and so the characteristic polynomial and minimal polynomial are the same. In the second case, however,
$$\left(\begin{pmatrix} a & 0 & 0 \\ 0 & a & 0 \\ 0 & 0 & b \end{pmatrix} - \begin{pmatrix} a & 0 & 0 \\ 0 & a & 0 \\ 0 & 0 & a \end{pmatrix}\right)\left(\begin{pmatrix} a & 0 & 0 \\ 0 & a & 0 \\ 0 & 0 & b \end{pmatrix} - \begin{pmatrix} b & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & b \end{pmatrix}\right) = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$
and so the minimal polynomial, (λ − a)(λ − b), is different from the characteristic polynomial.
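The two cases above can be verified concretely; this is a hedged illustration (the values a = 2, b = 3 are my own choices for the demo).

```python
# (J - aI)(J - bI) is nonzero for the form with a Jordan block at a,
# but zero for the diagonal form, so only the latter has minimal
# polynomial (x - a)(x - b).
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def shift(A, c):
    return [[A[i][j] - (c if i == j else 0) for j in range(3)]
            for i in range(3)]

a, b = 2, 3
J1 = [[a, 1, 0], [0, a, 0], [0, 0, b]]   # one 2x2 Jordan block at a
J2 = [[a, 0, 0], [0, a, 0], [0, 0, b]]   # diagonal
p = lambda J: mat_mul(shift(J, a), shift(J, b))
is_zero = lambda M: all(x == 0 for row in M for x in row)
```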

23. If A is an n × n matrix of entries from a field of scalars and if the minimal polynomial of A splits over this field of scalars, does it follow that the characteristic polynomial of A also splits? Explain why or why not.

If the minimal polynomial splits, then the matrix has a Jordan form with respect to this field of scalars. Thus both the characteristic polynomial and minimal polynomial are of the form
$$\prod_{i=1}^{m} (\lambda - \lambda_i)^{r_i}$$
Hence the characteristic polynomial also splits.

24. In proving the uniqueness of the Jordan canonical form, it was asserted that if two n × n matrices A, B are similar, then they have the same minimal polynomial, and also that if this minimal polynomial is of the form
$$p(\lambda) = \prod_{i=1}^{s} \phi_i(\lambda)^{r_i}$$
where the $\phi_i(\lambda)$ are irreducible, then $\ker(\phi_i(A)^{r_i})$ has the same dimension as $\ker(\phi_i(B)^{r_i})$. Why is this so? This was what was responsible for the blocks corresponding to an eigenvalue being of the same size.

Since the two are similar, they come from the same linear transformation T. Therefore, this is pretty obvious. Also, if $B = S^{-1}AS$, then
$$\ker(\phi_i(B)^{r_i}) = \ker\left(\phi_i(S^{-1}AS)^{r_i}\right) = \ker\left(S^{-1}\phi_i(A)^{r_i}S\right)$$
This clearly has the same dimension as $\ker(\phi_i(A)^{r_i})$.

25. Show that a given matrix is non defective (diagonalizable) if and only if the minimal polynomial has no repeated roots.

If the matrix is non defective, then its Jordan form is
$$\begin{pmatrix} J(\lambda_1) & & \\ & \ddots & \\ & & J(\lambda_r) \end{pmatrix}$$
where each $J(\lambda_k)$ is a diagonal matrix of size $r_k \times r_k$. Then the minimal polynomial is
$$\prod_{i} (\lambda - \lambda_i).$$
Conversely, suppose the minimal polynomial is of this form. Also suppose that some $J(\lambda_i)$ is not diagonal. Then consider
$$\prod_{i} (J - \lambda_i I)$$
The $i$-th term in the product is
$$\begin{pmatrix} J_1(\lambda_1 - \lambda_i) & & & & \\ & \ddots & & & \\ & & J_i(0) & & \\ & & & \ddots & \\ & & & & J_r(\lambda_r - \lambda_i) \end{pmatrix}$$
where for some i, $J_i(0)$ is not the zero matrix. It follows that the above product cannot be zero either. In fact, you could let $x \in F^{r_i}$ be such that $J_i(0)x = y \neq 0$. Then
$$\prod_{i} (J - \lambda_i I)\begin{pmatrix} 0 \\ \vdots \\ x \\ \vdots \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ \vdots \\ z \\ \vdots \\ 0 \end{pmatrix}, \quad z \neq 0$$
because $J_k(\lambda_k - \lambda_i)$ is one to one on $F^{r_k}$ whenever $k \neq i$. This contradicts $\prod_i (\lambda - \lambda_i)$ being the minimal polynomial. It follows that each of the $J(\lambda_k)$ must be a diagonal matrix.

26. Describe a straightforward way to determine the minimal polynomial of an n × n matrix using row operations. Next show that if p(λ) and p′(λ) are relatively prime, then p(λ) has no repeated roots. With the above problem, explain how this gives a way to determine whether a matrix is non defective.

You take the matrix A and consider the sequence
$$I, A, A^2, A^3, \cdots, A^n$$
There is some linear combination of these which equals 0, and you want to find the one which involves the lowest exponents. Thus you string these out to make them vectors in $F^{n^2}$ and then make them the columns of an $n^2 \times (n+1)$ matrix. Find the row reduced echelon form of this matrix. This will allow you to identify a linear relation which gives the minimal polynomial.
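The procedure just described can be sketched in code. This is a hedged illustration using exact rational arithmetic; the helper names are my own, and the test matrix is the one from Problem 4 of the next exercise set.

```python
# Find the minimal polynomial by flattening I, A, A^2, ... and row reducing
# until the newest power is a combination of the earlier ones.
from fractions import Fraction

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dependence(flats):
    """Coefficients writing flats[-1] in terms of the earlier flats, or None."""
    m = len(flats) - 1
    if m == 0:
        return None
    rows = len(flats[0])
    # Augmented system: columns are the earlier flats, right side the last.
    M = [[flats[j][i] for j in range(m)] + [flats[m][i]] for i in range(rows)]
    pivots, r = [], 0
    for c in range(m):
        piv = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        M[r] = [x / M[r][c] for x in M[r]]
        for i in range(rows):
            if i != r and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    if any(M[i][m] != 0 for i in range(r, rows)):
        return None          # inconsistent: not in the span
    x = [Fraction(0)] * m
    for i, c in enumerate(pivots):
        x[c] = M[i][m]
    return x

def minimal_polynomial(A):
    """Monic coefficients [c0, c1, ..., 1] of the minimal polynomial."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    power = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    flats = []
    for _ in range(n + 1):
        flats.append([x for row in power for x in row])
        x = dependence(flats)
        if x is not None:
            return [-c for c in x] + [Fraction(1)]
        power = mat_mul(power, A)
```

For the matrix of Problem 4 below, this returns the coefficients of $\lambda^3 - 3\lambda^2 + 14$.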

If p(λ) and p′(λ) are relatively prime, then you would need to have
$$p(\lambda) = \prod_{i} (\lambda - \lambda_i)$$
the $\lambda_i$ being distinct. Thus the minimal polynomial would have no repeated roots, and so the matrix is non defective by the above problem.


27. In Theorem 10.3.4 show that the cyclic sets can be arranged in such a way that the length of $\beta_{x_{i+1}}$ divides the length of $\beta_{x_i}$.

You make this condition a part of the induction hypothesis. Then at the end, when you extend to $\varphi(A)^{m-1}(V) \subseteq \ker(\varphi(A))$, you note that, thanks to Lemma 10.3.2, each of the $\beta_{y_i}$ for $y_i \in \ker(\varphi(A)^{m-1})$ has length equal to a multiple of d, the degree of φ(λ), while the $\beta_x$ for $x \in \varphi(A)^{m-1}(V) \setminus \ker(\varphi(A)^{m-1})$ each have length equal to d, thanks to the assumption that φ(λ) is irreducible over the field of scalars and the fact that each of these is in $\ker(\varphi(A))$.

28. Show that if A is a complex n × n matrix, then A and $A^T$ are similar. Hint: Consider a Jordan block. Note that
$$\begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}\begin{pmatrix} \lambda & 1 & 0 \\ 0 & \lambda & 1 \\ 0 & 0 & \lambda \end{pmatrix}\begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix} = \begin{pmatrix} \lambda & 0 & 0 \\ 1 & \lambda & 0 \\ 0 & 1 & \lambda \end{pmatrix}$$
Such a backwards identity will work for any size Jordan block. Therefore, J is similar to $J^T$ because you can use a similarity matrix which is block diagonal, each block being appropriate for producing the transpose of the corresponding Jordan block. Thus there is a matrix S such that
$$J = S^{-1}J^{T}S$$
Now also $J = P^{-1}AP$ for a suitable P. Therefore,
$$P^{-1}AP = S^{-1}\left(P^{-1}AP\right)^{T}S$$
and so
$$A = PS^{-1}P^{T}A^{T}\left(P^{T}\right)^{-1}SP^{-1} = \left(\left(P^{T}\right)^{-1}SP^{-1}\right)^{-1}A^{T}\left(P^{T}\right)^{-1}SP^{-1}$$

29. Let A be a linear transformation defined on a finite dimensional vector space V. Let the minimal polynomial be
$$\prod_{i=1}^{q} \phi_i(\lambda)^{m_i}$$
and let $\left(\beta_{v_{i1}}, \cdots, \beta_{v_{ir_i}}\right)$ be the cyclic sets such that $\beta_{v_{i1}}, \cdots, \beta_{v_{ir_i}}$ is a basis for $\ker(\phi_i(A)^{m_i})$. Let $v = \sum_i \sum_j v_{ij}$. Now let q(λ) be any polynomial and suppose that
$$q(A)v = 0$$
Show that it follows q(A) = 0.

First suppose that V is a vector space with a cyclic basis $\{x, Ax, \cdots, A^{n-1}x\}$ such that $A^n x$ is a linear combination of the vectors just listed. Let φ(λ) be the monic polynomial of degree n which results from this. Thus φ(A)x = 0. Then actually, φ(A) = 0. This is because if v is arbitrary, then
$$v = \sum_{i=0}^{n-1} a_i A^i x.$$


Then
$$\varphi(A)v = \sum_{i=0}^{n-1} a_i A^i \varphi(A)x = 0$$
Now suppose that q(A)x = 0. Does it follow that q(A) = 0?
$$q(\lambda) = \varphi(\lambda)l(\lambda) + r(\lambda)$$
where r(λ) either has smaller degree than φ(λ) or r(λ) = 0. Suppose that r(λ) ≠ 0. Then
$$r(A)x = 0$$
and the degree of r(λ) is less than n, which contradicts the linear independence of $\{x, Ax, \cdots, A^{n-1}x\}$. Hence r(λ) = 0. It follows that φ(λ) divides q(λ), and so q(A) = 0.

Now consider the problem of interest. Since each $\operatorname{span}(\beta_{v_{ij}})$ is invariant, it follows that $q(A)v_{ij} \in \operatorname{span}(\beta_{v_{ij}})$. Therefore, $0 = \sum_{i,j} q(A)v_{ij}$. By independence, each vector in the sum equals 0, and so from the first part of the argument, q(A) restricted to $\ker(\phi_i(A)^{m_i})$ equals 0. Since the whole space is a direct sum of these, it follows that q(A) = 0.

F.28 Exercises

10.9

1. Letting A be a complex n × n matrix, in obtaining the rational canonical form, one obtains $\mathbb{C}^n$ as a direct sum of the form
$$\operatorname{span}(\beta_{x_1}) \oplus \cdots \oplus \operatorname{span}(\beta_{x_r})$$
where $\beta_x$ is an ordered cyclic set of vectors, $x, Ax, \cdots, A^{m-1}x$, such that $A^m x$ is in the span of the previous vectors. Now apply the Gram Schmidt process to the ordered basis $(\beta_{x_1}, \beta_{x_2}, \cdots, \beta_{x_r})$, the vectors in each $\beta_{x_i}$ listed according to increasing power of A, thus obtaining an ordered basis $(q_1, \cdots, q_n)$. Letting Q be the unitary matrix which has these vectors as columns, show that $Q^*AQ$ equals a matrix B which satisfies $B_{ij} = 0$ if $i - j \geq 2$. Such a matrix is called an upper Hessenberg matrix, and this shows that every n × n matrix is unitarily similar to an upper Hessenberg matrix.

You note that $q_k$ is in the span of
$$\underbrace{x_1, \cdots, A^{m_1}x_1}_{\beta_{x_1}},\ \underbrace{x_2, \cdots, A^{m_2}x_2}_{\beta_{x_2}},\ \cdots,\ x_l, Ax_l, \cdots, A^{j}x_l$$
where there are k vectors in the above list. Since each $\operatorname{span}(\beta_{x_i})$ is A invariant, $Aq_k$ is in the span of
$$x_1, \cdots, A^{m_1}x_1,\ x_2, \cdots, A^{m_2}x_2,\ \cdots,\ x_l, Ax_l, \cdots, A^{j}x_l, A^{j+1}x_l$$
and by the Gram Schmidt process, this is the same as
$$\operatorname{span}(q_1, \cdots, q_{k+1})$$
It follows that if $p \geq k + 2$, then
$$q_p^{*}Aq_k = 0$$
Thus the pk-th entry of $Q^*AQ$ equals 0 if $p - k \geq 2$. That is, this matrix is upper Hessenberg.

2. In the argument for Theorem 10.2.4 it was shown that $m(A)\phi_l(A)v = v$ whenever $v \in \ker(\phi_k(A)^{r_k})$. Show that m(A) restricted to $\ker(\phi_k(A)^{r_k})$ is the inverse of the linear transformation $\phi_l(A)$ on $\ker(\phi_k(A)^{r_k})$.

In this theorem you had polynomials m(λ), n(λ) such that
$$m(\lambda)\phi_l(\lambda) + n(\lambda)\phi_k(\lambda)^{r_k} = 1$$
Thus you also have
$$m(A)\phi_l(A) + n(A)\phi_k(A)^{r_k} = I$$
If $x \in \ker(\phi_k(A)^{r_k})$, then it follows that
$$m(A)\phi_l(A)x + n(A)\overbrace{\phi_k(A)^{r_k}x}^{=0} = x$$
and so m(A) is indeed the inverse of $\phi_l(A)$.

3. Suppose A is a linear transformation and let the characteristic polynomial be
$$\det(\lambda I - A) = \prod_{j=1}^{q} \phi_j(\lambda)^{n_j}$$
where the $\phi_j(\lambda)$ are irreducible. Explain using Corollary 8.3.11 why the irreducible factors of the minimal polynomial are $\phi_j(\lambda)$ and why the minimal polynomial is of the form
$$\prod_{j=1}^{q} \phi_j(\lambda)^{r_j}$$
where $r_j \leq n_j$. You can use the Cayley Hamilton theorem if you like.

From the Cayley Hamilton theorem, the minimal polynomial divides the characteristic polynomial, and so from the mentioned corollary, it follows that the minimal polynomial has the form described above with $r_j \leq n_j$.

4. Find the minimal polynomial for
$$A = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 4 \\ -3 & 2 & 1 \end{pmatrix}$$
by the above technique with the field of scalars being the rational numbers. Is what you found also the characteristic polynomial?

The characteristic polynomial of this matrix is $\lambda^3 - 3\lambda^2 + 14$. It is irreducible over the rationals by the rational root theorem. Therefore, the above must be the minimal polynomial as well.
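A quick sanity check (my own, not part of the text): the polynomial found above annihilates A, i.e. $A^3 - 3A^2 + 14I = 0$.

```python
# Verify the Cayley-Hamilton relation for this particular matrix.
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 2, 3], [2, 1, 4], [-3, 2, 1]]
A2 = mat_mul(A, A)
A3 = mat_mul(A2, A)
P = [[A3[i][j] - 3 * A2[i][j] + (14 if i == j else 0) for j in range(3)]
     for i in range(3)]
# P is the zero matrix.
```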

Saylor URL: http://www.saylor.org/courses/ma212/ The Saylor Foundation

Page 123: SolutionsManual,LinearAlgebraTheoryAnd Applications · SolutionsManual,LinearAlgebraTheoryAnd Applications ... Factor x3 +8 as a product of linear factors. ... Find the solutions

Exercises 123

5. Show, using the rational root theorem, that the minimal polynomial for A in the above problem is irreducible with respect to Q. Letting the field of scalars be Q, find the rational canonical form and a similarity transformation which will produce it.

To find the rational canonical form, you need to look for cyclic sets. You have a single irreducible polynomial, $\lambda^3 - 3\lambda^2 + 14 = \varphi(\lambda)$. Also φ(A) sends every vector to 0. Therefore, you can simply start with a vector. Take $\mathbf{e}_1$ and form
$$\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix},\ \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 4 \\ -3 & 2 & 1 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix},\ \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 4 \\ -3 & 2 & 1 \end{pmatrix}^{2}\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix},\ \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 4 \\ -3 & 2 & 1 \end{pmatrix}^{3}\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$$
These vectors are
$$\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix},\ \begin{pmatrix} 1 \\ 2 \\ -3 \end{pmatrix},\ \begin{pmatrix} -4 \\ -8 \\ -2 \end{pmatrix},\ \begin{pmatrix} -26 \\ -24 \\ -6 \end{pmatrix}$$
The last is definitely a linear combination of the first three because the first three are an independent set. Hence, you ought to have these as columns. Then you have
$$\begin{pmatrix} 1 & 1 & -4 \\ 0 & 2 & -8 \\ 0 & -3 & -2 \end{pmatrix}^{-1}\begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 4 \\ -3 & 2 & 1 \end{pmatrix}\begin{pmatrix} 1 & 1 & -4 \\ 0 & 2 & -8 \\ 0 & -3 & -2 \end{pmatrix} = \begin{pmatrix} 0 & 0 & -14 \\ 1 & 0 & 0 \\ 0 & 1 & 3 \end{pmatrix}$$
and this is the rational canonical form for this matrix.
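The similarity computation above can be checked with exact rational arithmetic; this sketch (helper names are my own) inverts S by Gauss-Jordan elimination and forms $S^{-1}AS$.

```python
# Verify S^{-1} A S equals the companion matrix of x^3 - 3x^2 + 14.
from fractions import Fraction

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_inv(M):
    """Gauss-Jordan inverse over the rationals."""
    n = len(M)
    aug = [[Fraction(M[i][j]) for j in range(n)]
           + [Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for c in range(n):
        piv = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[piv] = aug[piv], aug[c]
        aug[c] = [x / aug[c][c] for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                aug[r] = [a - aug[r][c] * b for a, b in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

A = [[Fraction(x) for x in row] for row in [[1, 2, 3], [2, 1, 4], [-3, 2, 1]]]
S = [[Fraction(x) for x in row] for row in [[1, 1, -4], [0, 2, -8], [0, -3, -2]]]
R = mat_mul(mat_inv(S), mat_mul(A, S))
```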

6. Find the rational canonical form for the matrix
$$A = \begin{pmatrix} 1 & 2 & 1 & -1 \\ 2 & 3 & 0 & 2 \\ 1 & 3 & 2 & 4 \\ 1 & 2 & 1 & 2 \end{pmatrix}$$
This was just pulled out of the air. To find the rational canonical form, look for cyclic sets. Consider
$$\mathbf{e}_1,\ A\mathbf{e}_1,\ A^{2}\mathbf{e}_1,\ A^{3}\mathbf{e}_1,\ A^{4}\mathbf{e}_1$$
These reduce to
$$\begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix},\ \begin{pmatrix} 1 \\ 2 \\ 1 \\ 1 \end{pmatrix},\ \begin{pmatrix} 5 \\ 10 \\ 13 \\ 8 \end{pmatrix},\ \begin{pmatrix} 30 \\ 56 \\ 93 \\ 54 \end{pmatrix},\ \begin{pmatrix} 181 \\ 336 \\ 600 \\ 343 \end{pmatrix}$$
$$\begin{pmatrix} 1 & 1 & 5 & 30 & 181 \\ 0 & 2 & 10 & 56 & 336 \\ 0 & 1 & 13 & 93 & 600 \\ 0 & 1 & 8 & 54 & 343 \end{pmatrix}, \text{ row echelon form: } \begin{pmatrix} 1 & 0 & 0 & 0 & -3 \\ 0 & 1 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 & -11 \\ 0 & 0 & 0 & 1 & 8 \end{pmatrix}$$
Thus the minimal polynomial is
$$\lambda^4 - 8\lambda^3 + 11\lambda^2 + \lambda + 3$$

Then you can find the rational canonical form from this. Also,
$$\begin{pmatrix} 1 & 1 & 5 & 30 \\ 0 & 2 & 10 & 56 \\ 0 & 1 & 13 & 93 \\ 0 & 1 & 8 & 54 \end{pmatrix}^{-1}\begin{pmatrix} 1 & 2 & 1 & -1 \\ 2 & 3 & 0 & 2 \\ 1 & 3 & 2 & 4 \\ 1 & 2 & 1 & 2 \end{pmatrix}\begin{pmatrix} 1 & 1 & 5 & 30 \\ 0 & 2 & 10 & 56 \\ 0 & 1 & 13 & 93 \\ 0 & 1 & 8 & 54 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 & -3 \\ 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & -11 \\ 0 & 0 & 1 & 8 \end{pmatrix}$$
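A check (mine, not the text's) that the polynomial found above annihilates A, i.e. $A^4 - 8A^3 + 11A^2 + A + 3I = 0$:

```python
# Since the minimal polynomial has degree 4 = n, it is also the
# characteristic polynomial, and p(A) must be the zero matrix.
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 2, 1, -1], [2, 3, 0, 2], [1, 3, 2, 4], [1, 2, 1, 2]]
A2 = mat_mul(A, A)
A3 = mat_mul(A2, A)
A4 = mat_mul(A3, A)
P = [[A4[i][j] - 8 * A3[i][j] + 11 * A2[i][j] + A[i][j]
      + (3 if i == j else 0) for j in range(4)] for i in range(4)]
```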

7. Let $A : \mathbb{Q}^3 \to \mathbb{Q}^3$ be linear. Suppose the minimal polynomial is $(\lambda - 2)(\lambda^2 + 2\lambda + 7)$. Find the rational canonical form. Can you give generalizations of this rather simple problem to other situations?

That second factor is irreducible over the rationals, and so you would need the following for the rational canonical form:
$$\begin{pmatrix} 2 & 0 & 0 \\ 0 & 0 & -7 \\ 0 & 1 & -2 \end{pmatrix}$$

8. Find the rational canonical form with respect to the field of scalars equal to Q for the matrix
$$A = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & -1 \\ 0 & 1 & 1 \end{pmatrix}$$
Observe that this particular matrix is already a companion matrix of $\lambda^3 - \lambda^2 + \lambda - 1$. Then find the rational canonical form if the field of scalars equals C or Q + iQ.

First note that the minimal polynomial is $\lambda^3 - \lambda^2 + \lambda - 1$, and a use of the rational root theorem shows that this factors over the rationals as $(\lambda - 1)(\lambda^2 + 1)$. Then you need to locate cyclic sets starting with a vector in $\ker(A^2 + I)$. Let's find this first.
$$\begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & -1 \\ 0 & 1 & 1 \end{pmatrix}^{2} + \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{pmatrix}$$
Clearly the kernel consists of the vectors of the form
$$\begin{pmatrix} -s - t \\ s \\ t \end{pmatrix}$$
Starting with one of these vectors, let's find cyclic sets.
$$\begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix},\ \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & -1 \\ 0 & 1 & 1 \end{pmatrix}\begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix},\ \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & -1 \\ 0 & 1 & 1 \end{pmatrix}^{2}\begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix}$$
These are
$$\begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix},\ \begin{pmatrix} 0 \\ -1 \\ 1 \end{pmatrix},\ \begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix}$$
These of course will be dependent, since otherwise we wouldn't really have the minimal polynomial. Thus the cycle can consist of only the first two. We will have these two as the first two columns, and then we will have the eigenvector corresponding to the eigenvalue 1, found from
$$\ker\begin{pmatrix} -1 & 0 & 1 \\ 1 & -1 & -1 \\ 0 & 1 & 0 \end{pmatrix}$$

−1 0 11 −1 00 1 1

−1

0 0 11 0 −10 1 1

−1 0 11 −1 00 1 1

=

− 12

12

12

− 12 − 1

212

12

12

12

0 0 11 0 −10 1 1

−1 0 11 −1 00 1 1

0 −1 01 0 00 0 1

If the field of scalars is Q+ iQ, then the minimal polynomial factors as(λ− 1) (λ+ i) (λ− i) and so the rational canonical form is just the diagonal matrix,

1 0 00 i 00 0 −i

9. Let q(λ) be a polynomial and C its companion matrix. Show that the characteristic and minimal polynomials of C are the same and both equal q(λ).

Let $q(\lambda) = a_0 + a_1\lambda + \cdots + a_{n-1}\lambda^{n-1} + \lambda^n$. Then its companion matrix is
$$C = \begin{pmatrix} 0 & \cdots & 0 & -a_0 \\ 1 & 0 & \cdots & -a_1 \\ & \ddots & & \vdots \\ 0 & & 1 & -a_{n-1} \end{pmatrix}$$
Then you can show by induction that the characteristic polynomial of C is q(λ). See Problem 43 on Page 66. It was also shown that q(C) = 0 (the Cayley Hamilton theorem for companion matrices). Let p(λ) be the minimal polynomial. Thus it has degree no more than the degree of q(λ) and thus must divide q(λ). Could it have smaller degree than q(λ)? No, it couldn't. To see this, consider, for example, a companion matrix for a monic polynomial of degree 4, denoted as A. The following gives the sequence $A^0, A, A^2, A^3$. Note that these are linearly independent. Watch the ones to see this.
$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix},\ \begin{pmatrix} 0 & 0 & 0 & a \\ 1 & 0 & 0 & b \\ 0 & 1 & 0 & c \\ 0 & 0 & 1 & d \end{pmatrix},\ \begin{pmatrix} 0 & 0 & a & ad \\ 0 & 0 & b & a + bd \\ 1 & 0 & c & b + cd \\ 0 & 1 & d & d^2 + c \end{pmatrix},\ \begin{pmatrix} 0 & a & ad & a(d^2 + c) \\ 0 & b & a + bd & b(d^2 + c) + ad \\ 0 & c & b + cd & a + c(d^2 + c) + bd \\ 1 & d & d^2 + c & b + d(d^2 + c) + cd \end{pmatrix}$$
This pattern will occur for any sized companion matrix. You start with ones down the main diagonal, and then the string of ones moves down one diagonal, then another, and so forth. After n − 1 multiplications, you get a 1 in the lower left corner. Each multiplication moves the ones into slots which had formerly been occupied by zeros. That is why these are independent. Therefore, the degree of the minimal polynomial is at least n, and so the minimal and characteristic polynomials must coincide for any companion matrix.
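The "watch the ones" argument amounts to the observation that $C^k \mathbf{e}_1 = \mathbf{e}_{k+1}$ for k < n. This is easy to confirm; the polynomial coefficients below are arbitrary choices of mine for the demo.

```python
# For the companion matrix of x^4 + 2x^3 + 5x^2 + 7x + 3, the vectors
# e1, C e1, C^2 e1, C^3 e1 are the standard basis, so I, C, C^2, C^3
# are linearly independent.
def mat_vec(A, v):
    return [sum(A[i][k] * v[k] for k in range(len(v))) for i in range(len(A))]

a0, a1, a2, a3 = 3, 7, 5, 2
C = [[0, 0, 0, -a0],
     [1, 0, 0, -a1],
     [0, 1, 0, -a2],
     [0, 0, 1, -a3]]
v = [1, 0, 0, 0]
powers_of_e1 = [v]
for _ in range(3):
    v = mat_vec(C, v)
    powers_of_e1.append(v)
```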

10. Use the existence of the rational canonical form to give a proof of the Cayley Hamilton theorem valid for any field, even fields like the integers mod p for p a prime. The earlier proof was fine for fields like Q or R where you could let λ → ∞, but it is not clear the same result holds in general.

The linear transformation has a matrix of the following form:
$$M = \begin{pmatrix} M_1 & & \\ & \ddots & \\ & & M_q \end{pmatrix}$$
where $M_k$ comes from $\ker(\phi_k(A)^{m_k})$, $\phi_k(\lambda)$ being irreducible over the field of scalars. Say the minimal polynomial is
$$p(\lambda) = \prod_{k=1}^{q} \phi_k(\lambda)^{m_k}$$
Furthermore, $M_k$ is of the form
$$\begin{pmatrix} C^k_1 & & \\ & \ddots & \\ & & C^k_{l_k} \end{pmatrix}$$
where $C^k_j$ is the companion matrix of $\eta^k_j(\lambda) = \phi_k(\lambda)^{r_j}$ with $r_j \leq m_k$. By the above problem, the minimal polynomial for $C^k_j$ is just $\phi_k(\lambda)^{r_j}$, $r_j \leq m_k$, which is also the characteristic polynomial of this matrix. The characteristic polynomial is then

$$q(\lambda) = \prod_{k=1}^{q} \det(\lambda I - M_k) = \prod_{k=1}^{q}\prod_{j=1}^{l_k} \det(\lambda I - C^k_j) = \prod_{k=1}^{q}\prod_{j=1}^{l_k} \eta^k_j(\lambda) = \prod_{k=1}^{q}\prod_{j=1}^{l_k} \phi_k(\lambda)^{r_j}, \quad r_j \leq m_k$$
It follows that
$$q(A) = \prod_{k=1}^{q}\prod_{j=1}^{l_k} \phi_k(A)^{r_j}$$
We know that $\phi_k(C^k_j)^{r_j} = 0$. Also, when you have a block diagonal matrix
$$B = \begin{pmatrix} B_1 & & \\ & \ddots & \\ & & B_q \end{pmatrix}$$
and φ(λ) is a polynomial,
$$\varphi(B) = \begin{pmatrix} \varphi(B_1) & & \\ & \ddots & \\ & & \varphi(B_q) \end{pmatrix}$$
Therefore, the above product for q(A) is a product of block diagonal matrices such that every position on the diagonal has a zero matrix in at least one of the block matrices in the product. This equals the zero matrix. Like this:
$$\begin{pmatrix} 0 & 0 & 0 \\ 0 & B & 0 \\ 0 & 0 & C \end{pmatrix}\begin{pmatrix} D & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & E \end{pmatrix}\begin{pmatrix} F & 0 & 0 \\ 0 & H & 0 \\ 0 & 0 & 0 \end{pmatrix}$$
When you multiply the block matrices, it involves simply multiplying together the blocks in the corresponding positions. Since at least one is zero, it follows that the product is the zero matrix.

11. Suppose you have two n × n matrices A, B whose entries are in a field F, and suppose G is an extension of F. For example, you could have F = Q and G = C. Suppose A and B are similar with respect to the field G. Can it be concluded that they are similar with respect to the field F? Hint: Use the uniqueness of the rational canonical form.

Do A, B have the same minimal polynomial over F? Denote these by $p_A(\lambda)$ and $p_B(\lambda)$. You have
$$p_A(A) = 0$$
and so $p_A(B) = 0$ also, because A and B are similar with respect to G. It follows that $p_B(\lambda)$ must divide $p_A(\lambda)$. Similar reasoning implies that $p_A(\lambda)$ must divide $p_B(\lambda)$. Therefore, these two minimal polynomials, having coefficients in F, coincide. Otherwise, you could write
$$p_B(\lambda) = p_A(\lambda) + r(\lambda)$$
where the degree of r(λ) is smaller than the common degree of $p_A(\lambda)$ and $p_B(\lambda)$. Then plugging in λ = B, you get a contradiction to the definition of $p_B(\lambda)$ as having the smallest possible degree. Say the common minimal polynomial over F is
$$\prod_{i=1}^{q} \phi_i(\lambda)^{m_i}$$
where $\phi_i(\lambda)$ is irreducible over F. Thus each of these $\phi_i$ is a polynomial with coefficients in F and hence in G, and so
$$\phi_i(A),\ \phi_i(B)$$
are similar over G. Letting $\beta_{v_i}$ and $\beta_{w_j}$ denote the cyclic sets associated with
$$\ker(\phi_i(A)^{m_i}),\ \ker(\phi_i(B)^{m_i})$$
respectively, it follows from Lemma 10.8.3 that the lengths of these $\beta_{v_i}$ and $\beta_{w_i}$ comprise exactly the same set of positive numbers. These lengths are identified with the sizes of the Jordan blocks of the two similar matrices $\phi_i(A)$, $\phi_i(B)$. However, the proof of this lemma exhibits a basis which yields the Jordan form, which happens to involve only the field F. The companion matrices corresponding to two of these which have the same length are therefore exactly the same, both being companion matrices of $\phi_i(\lambda)^k$ for a suitable k. It follows that the rational canonical forms for A, B over F are exactly the same. Hence, by uniqueness, the two matrices are similar over F.

F.29 Exercises

11.4

1. Suppose the migration matrix for three locations is
$$\begin{pmatrix} .5 & 0 & .3 \\ .3 & .8 & 0 \\ .2 & .2 & .7 \end{pmatrix}.$$
Find a comparison for the populations in the three locations after a long time.

A steady state is a vector in the kernel of A − I:
$$\begin{pmatrix} .5 & 0 & .3 \\ .3 & .8 & 0 \\ .2 & .2 & .7 \end{pmatrix} - \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} -0.5 & 0 & 0.3 \\ 0.3 & -0.2 & 0 \\ 0.2 & 0.2 & -0.3 \end{pmatrix}$$
Row reducing the augmented system,
$$\begin{pmatrix} -\frac{1}{2} & 0 & \frac{3}{10} & 0 \\ \frac{3}{10} & -\frac{1}{5} & 0 & 0 \\ \frac{1}{5} & \frac{1}{5} & -\frac{3}{10} & 0 \end{pmatrix}, \text{ row echelon form: } \begin{pmatrix} 1 & 0 & -\frac{3}{5} & 0 \\ 0 & 1 & -\frac{9}{10} & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$
It follows that the populations are
$$t\begin{pmatrix} .6 \\ .9 \\ 1 \end{pmatrix}$$
where t is chosen such that the sum of the entries corresponds to the total amount at the beginning.
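As a quick check (my own, not from the text), the vector (.6, .9, 1) is indeed fixed by the migration matrix, i.e. it is an eigenvector for eigenvalue 1:

```python
# Verify A v = v for the steady-state direction v = (.6, .9, 1).
A = [[0.5, 0.0, 0.3],
     [0.3, 0.8, 0.0],
     [0.2, 0.2, 0.7]]
v = [0.6, 0.9, 1.0]
Av = [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]
# Av equals v up to floating-point rounding.
```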

2. Show that if $\sum_i a_{ij} = 1$ and $A = (a_{ij})$, then the sum of the entries of Av equals the sum of the entries of v. Thus it does not matter whether $a_{ij} \geq 0$ for this to be so.
$$\sum_{i}\sum_{j} a_{ij}v_j = \sum_{j}\sum_{i} a_{ij}v_j = \sum_{j} v_j$$

3. If A satisfies the conditions of the above problem, can it be concluded that $\lim_{n\to\infty} A^n$ exists?

Not necessarily. Consider the following matrix:
$$\begin{pmatrix} 4 & 9 \\ -3 & -8 \end{pmatrix}, \quad \begin{pmatrix} 4 & 9 \\ -3 & -8 \end{pmatrix}^{4} = \begin{pmatrix} -311 & -936 \\ 312 & 937 \end{pmatrix}$$
etc. The eigenvalues of this matrix are 1, −5. Thus there is no way that the limit can exist. Just apply the powers of the matrix to the eigenvector which corresponds to −5.

4. Give an example of a non regular Markov matrix which has an eigenvalue equal to −1.
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \text{ eigenvalues: } 1, -1$$

5. Show that when a Markov matrix is non defective, all of the above theory can be proved very easily. In particular, prove the theorem about the existence of $\lim_{n\to\infty} A^n$ if the eigenvalues are either 1 or have absolute value less than 1.

It suffices to show that $\lim_{n\to\infty} A^n v$ exists for each $v \in \mathbb{R}^n$. This is because you could then assert that $\lim_{n\to\infty} e_i^T A^n e_j$ exists, but this is just the ij-th entry of $A^n$. Let the eigenpairs be
$$(1, \mathbf{v}_1),\ (\lambda_i, \mathbf{v}_i),\ i = 2, \cdots, n$$
and by assumption, these $\mathbf{v}_i$ form a basis. Therefore, if v is any vector, there exist unique scalars $c_i$ such that
$$\mathbf{v} = \sum_{i} c_i \mathbf{v}_i$$
It follows that
$$A^m \mathbf{v} = c_1\mathbf{v}_1 + \sum_{i=2}^{n} c_i \lambda_i^{m}\mathbf{v}_i \to c_1\mathbf{v}_1$$
Thus this is really easy.

6. Find a formula for $A^n$ where
$$A = \begin{pmatrix} \frac{5}{2} & -\frac{1}{2} & 0 & -1 \\ 5 & 0 & 0 & -4 \\ \frac{7}{2} & -\frac{1}{2} & \frac{1}{2} & -\frac{5}{2} \\ \frac{7}{2} & -\frac{1}{2} & 0 & -2 \end{pmatrix}$$
Does $\lim_{n\to\infty} A^n$ exist? Note that all the rows sum to 1. Hint: This matrix is similar to a diagonal matrix. The eigenvalues are 1, −1, 1/2, 1/2.
$$A = \begin{pmatrix} 1 & 1 & 1 & 0 \\ 3 & 2 & 1 & 0 \\ 2 & \frac{5}{4} & 1 & 1 \\ 2 & 1 & 1 & 0 \end{pmatrix}\begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & \frac{1}{2} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & \frac{1}{2} \end{pmatrix}\begin{pmatrix} -1 & 0 & 0 & 1 \\ 1 & 1 & 0 & -2 \\ 1 & -1 & 0 & 1 \\ -\frac{1}{4} & -\frac{1}{4} & 1 & -\frac{1}{2} \end{pmatrix}$$
Now it follows that
$$A^n = \begin{pmatrix} 1 & 1 & 1 & 0 \\ 3 & 2 & 1 & 0 \\ 2 & \frac{5}{4} & 1 & 1 \\ 2 & 1 & 1 & 0 \end{pmatrix}\begin{pmatrix} (-1)^n & 0 & 0 & 0 \\ 0 & \frac{1}{2^n} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & \frac{1}{2^n} \end{pmatrix}\begin{pmatrix} -1 & 0 & 0 & 1 \\ 1 & 1 & 0 & -2 \\ 1 & -1 & 0 & 1 \\ -\frac{1}{4} & -\frac{1}{4} & 1 & -\frac{1}{2} \end{pmatrix}$$
$$= \begin{pmatrix} \frac{1}{2^n} - (-1)^n + 1 & \frac{1}{2^n} - 1 & 0 & (-1)^n - \frac{2}{2^n} + 1 \\ \frac{2}{2^n} - 3(-1)^n + 1 & \frac{2}{2^n} - 1 & 0 & 3(-1)^n - \frac{4}{2^n} + 1 \\ \frac{1}{2^n} - 2(-1)^n + 1 & \frac{1}{2^n} - 1 & \frac{1}{2^n} & 2(-1)^n - \frac{3}{2^n} + 1 \\ \frac{1}{2^n} - 2(-1)^n + 1 & \frac{1}{2^n} - 1 & 0 & 2(-1)^n - \frac{2}{2^n} + 1 \end{pmatrix}$$
Because of the persisting $(-1)^n$ terms, $\lim_{n\to\infty} A^n$ does not exist.

7. Find a formula for $A^n$ where
$$A = \begin{pmatrix} 2 & -\frac{1}{2} & \frac{1}{2} & -1 \\ 4 & 0 & 1 & -4 \\ \frac{5}{2} & -\frac{1}{2} & 1 & -2 \\ 3 & -\frac{1}{2} & \frac{1}{2} & -2 \end{pmatrix}$$
Note that the rows sum to 1 in this matrix also. Hint: This matrix is not similar to a diagonal matrix, but you can find the Jordan form and consider this in order to obtain a formula for this product. The eigenvalues are 1, −1, 1/2, 1/2.
$$A = \begin{pmatrix} -\frac{7}{9} & 1 & -\frac{1}{6} & \frac{7}{9} \\ -\frac{23}{9} & 1 & -\frac{1}{3} & \frac{14}{9} \\ -\frac{13}{9} & 1 & -\frac{1}{6} & \frac{4}{9} \\ -\frac{16}{9} & 1 & -\frac{1}{6} & \frac{7}{9} \end{pmatrix}\begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \frac{1}{2} & 1 \\ 0 & 0 & 0 & \frac{1}{2} \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 & -1 \\ 1 & -1 & 0 & 1 \\ 0 & -6 & -14 & 20 \\ 1 & 0 & -3 & 2 \end{pmatrix}$$
Now consider the Jordan block on the bottom. When raising it to the n-th power, you get
$$\sum_{k=0}^{2}\binom{n}{k}\begin{pmatrix} \frac{1}{2^{n-k}} & 0 \\ 0 & \frac{1}{2^{n-k}} \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}^{k} = \begin{pmatrix} \frac{1}{2^n} & 0 \\ 0 & \frac{1}{2^n} \end{pmatrix} + n\begin{pmatrix} \frac{1}{2^{n-1}} & 0 \\ 0 & \frac{1}{2^{n-1}} \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} \frac{1}{2^n} & 2^{1-n}n \\ 0 & \frac{1}{2^n} \end{pmatrix}$$
It follows that $A^n$ is given by the formula
$$\begin{pmatrix} -\frac{7}{9} & 1 & -\frac{1}{6} & \frac{7}{9} \\ -\frac{23}{9} & 1 & -\frac{1}{3} & \frac{14}{9} \\ -\frac{13}{9} & 1 & -\frac{1}{6} & \frac{4}{9} \\ -\frac{16}{9} & 1 & -\frac{1}{6} & \frac{7}{9} \end{pmatrix}\begin{pmatrix} (-1)^n & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \frac{1}{2^n} & 2^{1-n}n \\ 0 & 0 & 0 & \frac{1}{2^n} \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 & -1 \\ 1 & -1 & 0 & 1 \\ 0 & -6 & -14 & 20 \\ 1 & 0 & -3 & 2 \end{pmatrix}$$
$$= \begin{pmatrix} \frac{7}{9\cdot 2^n} - \frac{1}{6}2^{1-n}n - \frac{7}{9}(-1)^n + 1 & \frac{1}{2^n} - 1 & \frac{1}{2}2^{1-n}n & \frac{7}{9}(-1)^n - \frac{16}{9\cdot 2^n} - \frac{1}{3}2^{1-n}n + 1 \\ \frac{14}{9\cdot 2^n} - \frac{1}{3}2^{1-n}n - \frac{23}{9}(-1)^n + 1 & \frac{2}{2^n} - 1 & 2^{1-n}n & \frac{23}{9}(-1)^n - \frac{32}{9\cdot 2^n} - \frac{2}{3}2^{1-n}n + 1 \\ \frac{4}{9\cdot 2^n} - \frac{1}{6}2^{1-n}n - \frac{13}{9}(-1)^n + 1 & \frac{1}{2^n} - 1 & \frac{1}{2}2^{1-n}n + \frac{1}{2^n} & \frac{13}{9}(-1)^n - \frac{22}{9\cdot 2^n} - \frac{1}{3}2^{1-n}n + 1 \\ \frac{7}{9\cdot 2^n} - \frac{1}{6}2^{1-n}n - \frac{16}{9}(-1)^n + 1 & \frac{1}{2^n} - 1 & \frac{1}{2}2^{1-n}n & \frac{16}{9}(-1)^n - \frac{16}{9\cdot 2^n} - \frac{1}{3}2^{1-n}n + 1 \end{pmatrix}$$

8. Find $\lim_{n\to\infty} A^n$ if it exists for the matrix
$$A = \begin{pmatrix} \frac{1}{2} & -\frac{1}{2} & -\frac{1}{2} & 0 \\ -\frac{1}{2} & \frac{1}{2} & -\frac{1}{2} & 0 \\ \frac{1}{2} & \frac{1}{2} & \frac{3}{2} & 0 \\ \frac{3}{2} & \frac{3}{2} & \frac{3}{2} & 1 \end{pmatrix}$$
The eigenvalues are 1/2, 1, 1, 1.

The matrix equals
$$\begin{pmatrix} -1 & 1 & -1 & 0 \\ -1 & 1 & 0 & 0 \\ 2 & -1 & 1 & 0 \\ 4 & -3 & 1 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \frac{1}{2} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 1 & 0 \\ 1 & 1 & 1 & 0 \\ -1 & 1 & 0 & 0 \\ 0 & 2 & -1 & 1 \end{pmatrix}$$
Hence the matrix raised to the n-th power is
$$\begin{pmatrix} -1 & 1 & -1 & 0 \\ -1 & 1 & 0 & 0 \\ 2 & -1 & 1 & 0 \\ 4 & -3 & 1 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \frac{1}{2^n} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 1 & 0 \\ 1 & 1 & 1 & 0 \\ -1 & 1 & 0 & 0 \\ 0 & 2 & -1 & 1 \end{pmatrix}$$
and so, in the limit, you get
$$\begin{pmatrix} -1 & 1 & -1 & 0 \\ -1 & 1 & 0 & 0 \\ 2 & -1 & 1 & 0 \\ 4 & -3 & 1 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 1 & 0 \\ 1 & 1 & 1 & 0 \\ -1 & 1 & 0 & 0 \\ 0 & 2 & -1 & 1 \end{pmatrix} = \begin{pmatrix} 0 & -1 & -1 & 0 \\ -1 & 0 & -1 & 0 \\ 1 & 1 & 2 & 0 \\ 3 & 3 & 3 & 1 \end{pmatrix}$$

9. Give an example of a matrix A which has eigenvalues which are either equal to 1, −1, or have absolute value strictly less than 1 but which has the property that $\lim_{n\to\infty} A^n$ does not exist.

An easy example is
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \text{ eigenvalues: } 1, -1$$
Its powers
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}^{n}$$
alternate between the identity and the matrix itself.

10. If A is an n × n matrix such that all the eigenvalues have absolute value less than 1, show $\lim_{n\to\infty} A^n = 0$.

The Jordan form is
$$\begin{pmatrix} J(\lambda_1) & & \\ & \ddots & \\ & & J(\lambda_r) \end{pmatrix}$$
where each $|\lambda_i| < 1$. Then, as shown above, $J(\lambda_i)^m \to 0$.
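A small numerical sketch of this claim (mine, not the text's): for a 2 × 2 Jordan block with eigenvalue 1/2, the n-th power is $\begin{pmatrix} 2^{-n} & n2^{1-n} \\ 0 & 2^{-n} \end{pmatrix}$, and every entry tends to 0.

```python
# Powers of a Jordan block with eigenvalue 1/2 shrink to the zero matrix.
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

J = [[0.5, 1.0], [0.0, 0.5]]
P = [[1.0, 0.0], [0.0, 1.0]]
for _ in range(50):
    P = mat_mul(P, J)
# All entries of J^50 are tiny.
```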

11. Find an example of a 3 × 3 matrix A such that $\lim_{n\to\infty} A^n$ does not exist but $\lim_{r\to\infty} A^{5r}$ does exist.

An easy example is the following:
$$A = \begin{pmatrix} e^{i(2\pi/5)} & 0 & 0 \\ 0 & e^{i(2\pi/5)} & 0 \\ 0 & 0 & e^{i(2\pi/5)} \end{pmatrix}, \quad A^5 = \begin{pmatrix} e^{i(2\pi/5)} & 0 & 0 \\ 0 & e^{i(2\pi/5)} & 0 \\ 0 & 0 & e^{i(2\pi/5)} \end{pmatrix}^{5} = I.$$
Therefore, $\lim_{r\to\infty} A^{5r}$ exists. However, $\lim_{r\to\infty} A^r$ cannot exist, because the powers just yield diagonal matrices which have various fifth roots of 1 on the diagonal.

12. If A is a Markov matrix and B is similar to A, does it follow that B is also a Markov matrix?

No. Start with a Markov matrix
$$\begin{pmatrix} 1/2 & 1/3 \\ 1/2 & 2/3 \end{pmatrix}$$
Now take a similarity transformation of it:
$$\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}^{-1}\begin{pmatrix} 1/2 & 1/3 \\ 1/2 & 2/3 \end{pmatrix}\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} = \begin{pmatrix} -\frac{1}{2} & -1 \\ 1 & \frac{5}{3} \end{pmatrix}$$
This is certainly not a Markov matrix.
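The 2 × 2 computation above is easy to confirm exactly; this is my own check using rational arithmetic, with the inverse of P written out by the usual 2 × 2 formula.

```python
# P^{-1} M P for the given Markov matrix M and P = [[1, 2], [3, 4]].
from fractions import Fraction as F

M = [[F(1, 2), F(1, 3)], [F(1, 2), F(2, 3)]]
P = [[F(1), F(2)], [F(3), F(4)]]
det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
P_inv = [[P[1][1] / det, -P[0][1] / det], [-P[1][0] / det, P[0][0] / det]]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

B = mat_mul(P_inv, mat_mul(M, P))
# B has a negative entry, so it is not a Markov matrix.
```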

13. In Theorem 11.1.3 suppose everything is unchanged except that you assume either $\sum_j a_{ij} \leq 1$ or $\sum_i a_{ij} \leq 1$. Would the same conclusion be valid? What if you don't insist that each $a_{ij} \geq 0$? Would the conclusion hold in this case?

You can't conclude that the limit exists if the sign of $a_{ij}$ is not restricted. For example,
$$\begin{pmatrix} 4 & 9 \\ -3 & -8 \end{pmatrix}$$
has an eigenvalue of −5, and so the limit of powers of this matrix cannot exist. What about the case where $\sum_i a_{ij} \leq 1$ instead of equal to 1? In this case, you do still get the conclusion of this theorem. The condition was only used to get the entries of $A^n$ bounded independent of n, and this can be accomplished just as well with the inequality.

14. Let V be an n dimensional vector space and let x ∈ V with x ≠ 0. Consider β_x ≡ {x, Ax, ..., A^{m-1}x} where

A^m x ∈ span(x, Ax, ..., A^{m-1}x)

and m is the smallest integer such that the above inclusion in the span takes place. Show that

x, Ax, ..., A^{m-1}x

must be linearly independent. Next suppose {v_1, ..., v_n} is a basis for V. Consider β_{v_i} as just discussed, having length m_i. Thus A^{m_i} v_i is a linear combination of v_i, Av_i, ..., A^{m_i-1} v_i for m_i as small as possible. Let p_{v_i}(λ) be the monic polynomial which expresses this linear combination. Thus p_{v_i}(A) v_i = 0 and the degree of p_{v_i}(λ) is as small as possible for this to take place. Show that the minimal polynomial for A must be the monic polynomial which is the least common multiple of these polynomials p_{v_i}(λ).

Let p(λ) be the minimal polynomial. Then there exists a monic polynomial k_i(λ) such that

p(λ) = k_i(λ) p_{v_i}(λ) + r_i(λ)

where r_i(λ) has degree smaller than that of p_{v_i}(λ). Since p(A) v_i = 0 and p_{v_i}(A) v_i = 0, it follows that r_i(A) v_i = 0, and so r_i(λ) must be zero, since otherwise this would contradict the minimality of the degree of p_{v_i}(λ). Therefore, each of these p_{v_i}(λ) divides p(λ). If q(λ) is their least common multiple, then it follows that q(A) = 0, because q(A) v_i = 0 for each i and every vector is a linear combination of the vectors v_i. Therefore, p(λ) divides q(λ). Thus

q(λ) = p(λ) k(λ)

Now each p_{v_i}(λ) divides p(λ), so p(λ) is a common multiple. It follows that k(λ) = 1, since otherwise q(λ) wouldn't really be the least common multiple.


15. If A is a complex Hermitian n × n matrix which has all eigenvalues nonnegative, show that there exists a complex Hermitian matrix B such that BB = A.

You have A = U*DU for D a diagonal matrix having all nonnegative entries and U unitary. Take B = U*D^{1/2}U, which is clearly Hermitian. Then

BB = U*D^{1/2}UU*D^{1/2}U = U*DU = A.
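A numerical illustration of this construction (numpy, with an illustrative 2×2 matrix; not part of the original text): diagonalize, take the square root of the eigenvalues, and conjugate back.

```python
import numpy as np

# A is symmetric with eigenvalues 1 and 3, both nonnegative.
A = np.array([[2., 1.],
              [1., 2.]])

d, U = np.linalg.eigh(A)            # A = U diag(d) U^T, columns of U orthonormal
B = U @ np.diag(np.sqrt(d)) @ U.T   # the Hermitian square root U diag(d)^{1/2} U^T

print(np.allclose(B, B.T))    # B is symmetric (Hermitian)
print(np.allclose(B @ B, A))  # BB = A
```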

16. Suppose A, B are n × n real Hermitian matrices and they both have all nonnegative eigenvalues. Show that det(A + B) ≥ det(A) + det(B).

Let P^2 = A, Q^2 = B where P, Q are Hermitian and nonnegative. Then

A + B = (P  Q) (P)
               (Q)

where (P  Q) is the n × 2n matrix obtained by placing P and Q side by side. Now apply the Cauchy Binet formula to both sides:

det(A + B) = ∑_i det(C_i)^2 ≥ det(P)^2 + det(Q)^2 = det(P^2) + det(Q^2) = det(A) + det(B)

where the C_i are the n × n submatrices of (P  Q). Note that this does not depend on the matrices being real. You could write

A + B = (P*  Q*) (P)
                 (Q)

Then

det(A + B) = ∑_i det(C_i*) det(C_i) = ∑_i det(C_i* C_i) ≥ det(P^2) + det(Q^2) = det(A) + det(B)

17. Suppose

B = (α  c*)
    (b  A )

is an (n+1) × (n+1) Hermitian nonnegative matrix, where α is a scalar and A is n × n. Show that α must be real, c = b, and A = A*, A is nonnegative, and that if α = 0, then b = 0.

B* = (ᾱ  b*)
     (c  A*)

and B* = B, so you must have α real and b = c while A = A*.

Suppose α = 0 and b ≠ 0. I need to show that B is not nonnegative. For a scalar a and a vector x,

(ā  x*) (0  b*) (a)   (ā  x*) (   b*x  )
        (b  A ) (x) =          (ab + Ax)  = ā b*x + a x*b + x*Ax

If x*Ax < 0 for some x, then let a = 0 and you see that B is not nonnegative. Therefore, A must be nonnegative. But now you could let x = b and pick a real, large, and negative; the expression becomes 2a|b|^2 + b*Ab < 0, and again B is not nonnegative. Hence if α = 0, then b = 0. Now consider the sign of α. For a real scalar β,

(β  0*) (α  b*) (β)   (β  0*) (αβ)
        (b  A ) (0) =          (βb)  = β^2 α

and so for B to be nonnegative you must have α ≥ 0; by the previous part, if b ≠ 0 then α > 0.


18. ↑If A is an n × n complex Hermitian and nonnegative matrix, show that there exists an upper triangular matrix B such that B*B = A. Hint: Prove this by induction. It is obviously true if n = 1. Now if you have an (n+1) × (n+1) Hermitian nonnegative matrix, then from the above problem, it is of the form

(α^2  αb*)
(αb   A  ),  α real.

From the above problem, a generic (n+1) × (n+1) Hermitian nonnegative matrix is of the form just claimed, where A is nonnegative and Hermitian and α is real. In case α = 0, the above problem shows b = 0, and you would just use

(0  0*) (0  0*)
(0  B*) (0  B )

where B*B = A. Otherwise, you consider

(α  0*) (α  b*)
(b  B*) (0  B )

where B is obtained, by induction, as an upper triangular matrix with B*B = A − bb*; block multiplication then shows the product equals the given matrix (one checks that A − bb* is itself Hermitian and nonnegative).

19. ↑ Suppose A is a nonnegative Hermitian matrix which is partitioned as

A = (A11  A12)
    (A21  A22)

where A11, A22 are square matrices. Show that det(A) ≤ det(A11) det(A22). Hint: Use the above problem to factor A, getting

A = (B11*  0*  ) (B11  B12)
    (B12*  B22*) (0    B22)

Next argue that A11 = B11* B11 and A22 = B12* B12 + B22* B22. Use the Cauchy Binet theorem to argue that det(A22) = det(B12* B12 + B22* B22) ≥ det(B22* B22). Then explain why

det(A) = det(B11*) det(B22*) det(B11) det(B22) = det(B11* B11) det(B22* B22)

The hint tells pretty much how to do it. A11 = B11* B11 and A22 = B12* B12 + B22* B22 follow from block multiplication. Why is the determinant of a block triangular matrix equal to the product of the determinants of the blocks? Consider

(B11  B12)
(0    B22)

If the rank of B22 is not maximal, then its determinant is zero and so is the determinant of the whole matrix. Assume then that B22 has full rank. Then suitable row operations can be applied to obtain that the determinant of this matrix equals the determinant of

(B11  0  )
(0    B22)

This determinant is clearly equal to the product of the determinants of B11 and B22. The inequality det(B12* B12 + B22* B22) ≥ det(B22* B22) follows from the Cauchy Binet theorem, since there are more nonnegative terms in the sum for the determinant on the left than for the one on the right. Note that these determinants are all real because the matrices are Hermitian. It follows that

det(A) = det(B11* B11) det(B22* B22) ≤ det(A11) det(B12* B12 + B22* B22) = det(A11) det(A22)


20. ↑ Prove the inequality of Hadamard. If A is a Hermitian matrix which is nonnegative, then

det(A) ≤ ∏_i A_ii

This follows from induction. It is clearly true if n = 1. Now let A be n × n, nonnegative and Hermitian, and assume the theorem is true for all nonnegative Hermitian matrices of smaller size. Partition

A = (B   a)
    (a*  α)

It follows that B and α are both nonnegative. Then from the above problem,

det(A) ≤ det(B) α

By induction, det(B) ≤ ∏_{i<n} B_ii, and since α = A_nn, this implies

det(A) ≤ ∏_i A_ii

F.30 Exercises 12.7

1. Find the best solution to the system

x + 2y = 6
2x − y = 5
3x + 2y = 0

Let A be the coefficient matrix and b = (6, 5, 0)^T. Then

A^T A = (14  6),   A^T b = (16)
        (6   9)            (7 )

Solving the normal equations

(14  6) (x)   (16)
(6   9) (y) = (7 )

gives the least squares solution x = 17/15, y = 1/45.
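The normal equations above are easy to check numerically (a numpy sketch, not part of the original text):

```python
import numpy as np

A = np.array([[1., 2.],
              [2., -1.],
              [3., 2.]])
b = np.array([6., 5., 0.])

# Solve the normal equations A^T A x = A^T b.
x = np.linalg.solve(A.T @ A, A.T @ b)
print(x)   # [17/15, 1/45] ≈ [1.1333, 0.0222]
```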

2. Find an orthonormal basis for R^3, {w_1, w_2, w_3}, given that w_1 is a multiple of the vector (1, 1, 2).

A basis to start from consists of

(1)  (0)  (1)
(1), (1), (0)
(2)  (0)  (0)

Applying the Gram Schmidt process (equivalently, computing the QR factorization of the matrix with these columns) gives

(1 0 1)   (√6/6   −√30/30    2√5/5) (√6   √6/6     √6/6  )
(1 1 0) = (√6/6    √30/6     0    ) (0    √30/6   −√30/30)
(2 0 0)   (√6/3   −√30/15   −√5/5 ) (0    0        2√5/5 )


Then one example of an orthonormal basis extending the given vector is

(√6/6)  (−√30/30)  ( 2√5/5)
(√6/6), ( √30/6 ), ( 0    )
(√6/3)  (−√30/15)  (−√5/5 )
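The same basis can be produced with a QR factorization (numpy sketch, not part of the original solution; up to signs the columns of Q agree with the hand computation):

```python
import numpy as np

# Columns: the given vector (1,1,2) plus two vectors completing a basis of R^3.
M = np.array([[1., 0., 1.],
              [1., 1., 0.],
              [2., 0., 0.]])
Q, R = np.linalg.qr(M)

print(np.allclose(Q.T @ Q, np.eye(3)))                   # columns of Q are orthonormal
print(np.allclose(np.cross(Q[:, 0], [1., 1., 2.]), 0))   # first column parallel to (1,1,2)
```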

3. Suppose A = A^T is a symmetric real n × n matrix which has all positive eigenvalues. Define

(x, y) ≡ (Ax, y).

Show this is an inner product on R^n. What does the Cauchy Schwarz inequality say in this case?

It satisfies most of the axioms of an inner product obviously. The ones which are not obvious are whether (Ax, x) ≥ 0 with equality only at x = 0. But A = U^T D U where D is diagonal and U is orthogonal. Therefore,

(Ax, x) = (U^T D U x, x) = (D U x, U x) ≥ μ |Ux|^2 = μ |x|^2

where μ is the smallest eigenvalue, which is assumed positive. Therefore, also (Ax, x) = 0 if and only if x = 0. The Cauchy Schwarz inequality says

|(Ax, y)| ≤ (Ax, x)^{1/2} (Ay, y)^{1/2}

4. Let

||x||_∞ ≡ max{|x_j| : j = 1, 2, ..., n}.

Show this is a norm on C^n. Here x = (x_1 ... x_n)^T. Show

||x||_∞ ≤ |x| ≡ (x, x)^{1/2}

where the above is the usual inner product on C^n.

The axioms of a norm are all obvious except for the triangle inequality. However,

||x + y||_∞ = max{|x_i + y_i| : i ≤ n} ≤ ||x||_∞ + ||y||_∞

so even this one is all right. Also, for some i,

||x||_∞ = |x_i| = (|x_i|^2)^{1/2} ≤ (∑_i |x_i|^2)^{1/2} = |x|.

5. Let

||x||_1 ≡ ∑_{j=1}^n |x_j|.

Show this is a norm on C^n. Here x = (x_1 ... x_n)^T. Show

||x||_1 ≥ |x| ≡ (x, x)^{1/2}

where the above is the usual inner product on C^n.

It is obvious that ||·||_1 is a norm. What of the inequality? Is

∑_i |x_i| ≥ (∑_i |x_i|^2)^{1/2} ?

Of course. Just square both sides: the left side has the nonnegative mixed terms |x_i||x_j|, which are not present on the right.


6. Show that if ||·|| is any norm on any vector space, then

| ||x|| − ||y|| | ≤ ||x − y||.

||x|| = ||x − y + y|| ≤ ||x − y|| + ||y||, and so

||x|| − ||y|| ≤ ||x − y||

Now repeat the argument with x and y switched. Finally, | ||x|| − ||y|| | equals either ||x|| − ||y|| or ||y|| − ||x||; either way, you get the inequality.

7. Relax the assumptions in the axioms for the inner product. Change the axiom about (x, x) ≥ 0 and equals 0 if and only if x = 0 to simply read (x, x) ≥ 0. Show the Cauchy Schwarz inequality still holds in the following form.

|(x, y)| ≤ (x, x)^{1/2} (y, y)^{1/2}.

The proof given included this case.

8. Let H be an inner product space and let {u_k}_{k=1}^n be an orthonormal basis for H. Show

(x, y) = ∑_{k=1}^n (x, u_k) (u_k, y)

(here (u_k, y) is the conjugate of (y, u_k)).

You have x = ∑_j (x, u_j) u_j and y = ∑_k (y, u_k) u_k. Therefore,

(x, y) = (∑_j (x, u_j) u_j, ∑_k (y, u_k) u_k) = ∑_{j,k} (x, u_j) (u_k, y) (u_j, u_k) = ∑_{k=1}^n (x, u_k) (u_k, y)

9. Let the vector space V consist of real polynomials of degree no larger than 3. Thus a typical vector is a polynomial of the form

a + bx + cx^2 + dx^3.

For p, q ∈ V define the inner product,

(p, q) ≡ ∫_0^1 p(x) q(x) dx.

Show this is indeed an inner product. Then state the Cauchy Schwarz inequality in terms of this inner product. Show {1, x, x^2, x^3} is a basis for V. Finally, find an orthonormal basis for V. This is an example of some orthonormal polynomials.

Apply the Gram Schmidt process to 1, x, x^2, x^3:

p_1(x) = 1

p_2(x) = (x − (x, p_1) p_1) / ||x − (x, p_1) p_1|| = (x − 1/2) / (∫_0^1 (x − 1/2)^2 dx)^{1/2} = √3 (2x − 1)

p_3(x) = (x^2 − (x^2, p_2) p_2 − (x^2, p_1) p_1) / ||x^2 − (x^2, p_2) p_2 − (x^2, p_1) p_1||
       = (x^2 − x + 1/6) / ||x^2 − x + 1/6|| = 6√5 (x^2 − x + 1/6)

p_4(x) = (x^3 − (x^3, p_3) p_3 − (x^3, p_2) p_2 − (x^3, p_1) p_1) / ||x^3 − (x^3, p_3) p_3 − (x^3, p_2) p_2 − (x^3, p_1) p_1||
       = 20√7 (x^3 − (3/2) x^2 + (3/5) x − 1/20)

It is clear the vectors 1, x, x^2, x^3 form a basis. The orthonormal basis is

1,  √3 (2x − 1),  6√5 (x^2 − x + 1/6),  20√7 (x^3 − (3/2) x^2 + (3/5) x − 1/20)
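These four polynomials can be checked for orthonormality numerically (a numpy sketch, not part of the original text) using Gauss-Legendre quadrature, which is exact for the polynomial products involved here:

```python
import numpy as np

# Gauss-Legendre nodes/weights on [-1,1], mapped to [0,1].
x, w = np.polynomial.legendre.leggauss(10)
x, w = (x + 1) / 2, w / 2

P = np.vstack([np.ones_like(x),
               np.sqrt(3) * (2 * x - 1),
               6 * np.sqrt(5) * (x**2 - x + 1/6),
               20 * np.sqrt(7) * (x**3 - 1.5 * x**2 + 0.6 * x - 0.05)])

G = (P * w) @ P.T        # Gram matrix of L^2(0,1) inner products
print(np.allclose(G, np.eye(4)))
```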

10. Let P_n denote the polynomials of degree no larger than n − 1 which are defined on an interval [a, b]. Let {x_1, ..., x_n} be n distinct points in [a, b]. Now define for p, q ∈ P_n,

(p, q) ≡ ∑_{j=1}^n p(x_j) q(x_j)

Show this yields an inner product on P_n. Hint: Most of the axioms are obvious. The one which says (p, p) = 0 if and only if p = 0 is the only interesting one. To verify this one, note that a nonzero polynomial of degree no more than n − 1 has at most n − 1 zeros.

If (p, p) = 0, then p equals zero at n distinct points, and so it must be the zero polynomial, because it has degree at most n − 1 and so can have no more than n − 1 roots.

11. Let C([0, 1]) denote the vector space of continuous real valued functions defined on [0, 1]. Let the inner product be given as

(f, g) ≡ ∫_0^1 f(x) g(x) dx

Show this is an inner product. Also let V be the subspace described in Problem 9. Using the result of this problem, find the vector in V which is closest to x^4.

It equals

(x^4, 1) 1 + (x^4, √3(2x−1)) √3(2x−1) + (x^4, 6√5(x^2−x+1/6)) 6√5(x^2−x+1/6)
  + (x^4, 20√7(x^3−(3/2)x^2+(3/5)x−1/20)) 20√7(x^3−(3/2)x^2+(3/5)x−1/20)

Now it is just a matter of working these out.

(x^4, 1) = ∫_0^1 x^4 dx = 1/5
(x^4, √3(2x−1)) = ∫_0^1 x^4 √3(2x−1) dx = (2/15)√3
(x^4, 6√5(x^2−x+1/6)) = ∫_0^1 x^4 · 6√5(x^2−x+1/6) dx = (2/35)√5
(x^4, 20√7(x^3−(3/2)x^2+(3/5)x−1/20)) = ∫_0^1 x^4 · 20√7(x^3−(3/2)x^2+(3/5)x−1/20) dx = (1/70)√7

Therefore, the closest point is

1/5 + (2/15)√3 · √3(2x−1) + (2/35)√5 · 6√5(x^2−x+1/6) + (1/70)√7 · 20√7(x^3−(3/2)x^2+(3/5)x−1/20)
  = 2x^3 − (9/7)x^2 + (2/7)x − 1/70
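The answer can be verified numerically (numpy sketch, not part of the original solution): the projection is characterized by the residual x^4 − p being orthogonal in L^2(0,1) to every cubic, i.e., to 1, x, x^2, x^3.

```python
import numpy as np

x, w = np.polynomial.legendre.leggauss(10)
x, w = (x + 1) / 2, w / 2                        # quadrature on [0,1]

p = 2 * x**3 - (9/7) * x**2 + (2/7) * x - 1/70   # the claimed closest cubic
r = x**4 - p                                     # the residual

# Orthogonality of the residual to 1, x, x^2, x^3 characterizes the projection.
errs = [np.sum(w * r * x**k) for k in range(4)]
print(np.allclose(errs, 0))
```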


12. A regular Sturm Liouville problem involves the differential equation, for an unknown function of x which is denoted here by y,

(p(x) y′)′ + (λ q(x) + r(x)) y = 0, x ∈ [a, b]

and it is assumed that p(t), q(t) > 0 for any t ∈ [a, b] and also there are boundary conditions,

C_1 y(a) + C_2 y′(a) = 0
C_3 y(b) + C_4 y′(b) = 0

where

C_1^2 + C_2^2 > 0, and C_3^2 + C_4^2 > 0.

There is an immense theory connected to these important problems. The constant λ is called an eigenvalue. Show that if y is a solution to the above problem corresponding to λ = λ_1 and if z is a solution corresponding to λ = λ_2 ≠ λ_1, then

∫_a^b q(x) y(x) z(x) dx = 0.    (6.37)

and this defines an inner product. Hint: Do something like this:

(p(x) y′)′ z + (λ_1 q(x) + r(x)) y z = 0,
(p(x) z′)′ y + (λ_2 q(x) + r(x)) z y = 0.

Now subtract and either use integration by parts or show

(p(x) y′)′ z − (p(x) z′)′ y = ((p(x) y′) z − (p(x) z′) y)′

and then integrate. Use the boundary conditions to show that y′(a) z(a) − z′(a) y(a) = 0 and y′(b) z(b) − z′(b) y(b) = 0. The formula 6.37 is called an orthogonality relation. It turns out there are typically infinitely many eigenvalues, and it is interesting to write given functions as an infinite series of these "eigenfunctions".

Let y go with λ and z go with μ.

z (p(x) y′)′ + (λ q(x) + r(x)) y z = 0
y (p(x) z′)′ + (μ q(x) + r(x)) z y = 0

Subtract:

z (p(x) y′)′ − y (p(x) z′)′ + (λ − μ) q(x) y z = 0

Now integrate from a to b. First note that

z (p(x) y′)′ − y (p(x) z′)′ = d/dx (p(x) y′ z − p(x) z′ y)

and so what you get is

p(b) y′(b) z(b) − p(b) z′(b) y(b) − (p(a) y′(a) z(a) − p(a) z′(a) y(a)) + (λ − μ) ∫_a^b q(x) y(x) z(x) dx = 0


Look at the boundary terms. From the assumptions on the boundary conditions,

C_1 y(a) + C_2 y′(a) = 0
C_1 z(a) + C_2 z′(a) = 0

Since (C_1, C_2) ≠ (0, 0), this homogeneous linear system has a nontrivial solution, so its determinant vanishes:

y(a) z′(a) − y′(a) z(a) = 0

Similarly,

y(b) z′(b) − y′(b) z(b) = 0

Hence the boundary terms equal zero, and since λ ≠ μ, the orthogonality condition holds.

13. Consider the continuous functions defined on [0, π], C([0, π]). Show

(f, g) ≡ ∫_0^π f g dx

is an inner product on this vector space. Show the functions {√(2/π) sin(nx)}_{n=1}^∞ are an orthonormal set. What does this mean about the dimension of the vector space C([0, π])? Now let V_N = span(√(2/π) sin(x), ..., √(2/π) sin(Nx)). For f ∈ C([0, π]) find a formula for the vector in V_N which is closest to f with respect to the norm determined from the above inner product. This is called the Nth partial sum of the Fourier series of f. An important problem is to determine whether and in what way this Fourier series converges to the function f. The norm which comes from this inner product is sometimes called the mean square norm.

Note that sin(nx) is a solution to the boundary value problem

y′′ + n^2 y = 0, y(0) = y(π) = 0

It is obvious that sin(nx) solves this boundary value problem. These boundary conditions are of the Sturm Liouville type above: with C_1 = 1, C_2 = 0 and C_3 = 1, C_4 = 0,

1 · sin 0 + 0 · cos 0 = 0
1 · sin π + 0 · cos π = 0

Then it follows from Problem 12 that

∫_0^π sin(nx) sin(mx) dx = 0 unless n = m.

Now also ∫_0^π sin^2(nx) dx = π/2, and so the functions (√2/√π) sin(nx) are orthonormal. Since C([0, π]) contains an infinite orthonormal set, it is infinite dimensional. By the usual projection formula, the closest vector in V_N to f is

∑_{n=1}^N (f, √(2/π) sin(nx)) √(2/π) sin(nx) = ∑_{n=1}^N (2/π) (∫_0^π f(t) sin(nt) dt) sin(nx)

14. Consider the subspace V ≡ ker(A) where

A = (1  4  −1  −1)
    (2  1   2   3)
    (4  9   0   1)
    (5  6   3   4)

Find an orthonormal basis for V. Hint: You might first find a basis and then use the Gram Schmidt procedure.

Solving Ax = 0 gives

(x, y, z, w) = t (−9/7, 4/7, 1, 0)

so V is spanned by (−9, 4, 7, 0). This is pretty easy: the space has dimension 1. Therefore, an orthonormal basis is just

(1/√(81 + 16 + 49)) (−9, 4, 7, 0)^T = (−9/√146, 4/√146, 7/√146, 0)^T
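A quick numerical confirmation (numpy sketch, not part of the original solution) that the stated vector is a unit vector in the kernel:

```python
import numpy as np

A = np.array([[1., 4., -1., -1.],
              [2., 1., 2., 3.],
              [4., 9., 0., 1.],
              [5., 6., 3., 4.]])

v = np.array([-9., 4., 7., 0.]) / np.sqrt(146.)
print(np.allclose(A @ v, 0))             # v lies in ker(A)
print(np.isclose(np.linalg.norm(v), 1))  # and is a unit vector
```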

15. The Gram Schmidt process starts with a basis for a subspace {v_1, ..., v_n} and produces an orthonormal basis for the same subspace {u_1, ..., u_n} such that

span(v_1, ..., v_k) = span(u_1, ..., u_k)

for each k. Show that in the case of R^m the QR factorization does the same thing. More specifically, if

A = (v_1 ... v_n)

and if

A = QR ≡ (q_1 ... q_n) R

then the vectors {q_1, ..., q_n} form an orthonormal set of vectors, and for each k,

span(q_1, ..., q_k) = span(v_1, ..., v_k)

This follows from the way we multiply matrices and the definition of an orthogonal matrix: since R is upper triangular, v_k = ∑_{i≤k} R_ik q_i, so each v_k lies in span(q_1, ..., q_k), and conversely.

16. Verify the parallelogram identity for any inner product space,

|x + y|^2 + |x − y|^2 = 2|x|^2 + 2|y|^2.

Why is it called the parallelogram identity?

|x + y|^2 + |x − y|^2 = (x + y, x + y) + (x − y, x − y) = |x|^2 + |y|^2 + 2(x, y) + |x|^2 + |y|^2 − 2(x, y) = 2|x|^2 + 2|y|^2.

It is called the parallelogram identity because it says the sum of the squares of the lengths of the diagonals of a parallelogram equals the sum of the squares of the lengths of its four sides.

17. ↑Let H be an inner product space and let K ⊆ H be a nonempty convex subset. This means that if k_1, k_2 ∈ K, then the line segment consisting of points of the form

t k_1 + (1 − t) k_2 for t ∈ [0, 1]

is also contained in K. Suppose for each x ∈ H, there exists Px defined to be a point of K closest to x. Show that Px is unique, so that P actually is a map. Hint: Suppose z_1 and z_2 both work as closest points. Consider the midpoint (z_1 + z_2)/2 and use the parallelogram identity of Problem 16 in an auspicious manner.

Suppose there are two closest points to x, say k_1 and k_2. Then (k_1 + k_2)/2 is also in K and, by the parallelogram identity,

|x − (k_1 + k_2)/2|^2 = |(x/2 − k_1/2) + (x/2 − k_2/2)|^2
  = −|(k_2 − k_1)/2|^2 + 2|x/2 − k_1/2|^2 + 2|x/2 − k_2/2|^2
  = −|(k_2 − k_1)/2|^2 + d^2

where d is the distance between x and each k_i. If k_1 ≠ k_2, this is strictly less than d^2, contradicting the fact that d is the minimum distance from x to K. Hence k_1 = k_2.

18. ↑In the situation of Problem 17 suppose K is a closed convex subset and that H is complete. This means every Cauchy sequence converges. Recall from calculus a sequence {k_n} is a Cauchy sequence if for every ε > 0 there exists N_ε such that whenever m, n > N_ε, it follows |k_m − k_n| < ε. Let {k_n} be a sequence of points of K such that

lim_{n→∞} |x − k_n| = inf{|x − k| : k ∈ K}

This is called a minimizing sequence. Show there exists a unique k ∈ K such that lim_{n→∞} |k_n − k| = 0 and that k = Px. That is, there exists a well defined projection map onto the convex subset of H. Hint: Use the parallelogram identity in an auspicious manner to show {k_n} is a Cauchy sequence which must therefore converge. Since K is closed, it follows this will converge to something in K which is the desired vector.

Let x be given and let {k_n} be a minimizing sequence:

|x − k_n| → λ ≡ inf{|x − y| : y ∈ K}

Then by the parallelogram identity,

|(k_n − k_m)/2|^2 + |(x − k_n)/2 + (x − k_m)/2|^2 = 2|(x − k_n)/2|^2 + 2|(x − k_m)/2|^2

Now it follows that

|(k_n − k_m)/2|^2 = (1/2)|x − k_n|^2 + (1/2)|x − k_m|^2 − |x − (k_n + k_m)/2|^2
  ≤ (1/2)|x − k_n|^2 + (1/2)|x − k_m|^2 − λ^2

since (k_n + k_m)/2 ∈ K by convexity. The right side converges to 0 as n, m → ∞, and so {k_n} is a Cauchy sequence; since the space is given to be complete, it follows that this converges to some k. Since K is closed, it follows that k ∈ K. Now

|x − k| = lim_{n→∞} |x − k_n| = λ

and so there exists a closest point.

19. ↑Let H be an inner product space which is also complete and let P denote the projection map onto a convex closed subset K. Show this projection map is characterized by the inequality

Re(k − Px, x − Px) ≤ 0

for all k ∈ K. That is, a point z ∈ K equals Px if and only if the above variational inequality holds. This is what the inequality is called, because k is allowed to vary and the inequality continues to hold for all k ∈ K.

Consider for t ∈ [0, 1] the following, where w ∈ K and x ∈ K:

|y − (x + t(w − x))|^2

It equals

f(t) = |y − x|^2 + t^2 |w − x|^2 − 2t Re(y − x, w − x)

Suppose x is the point of K which is closest to y. Then f(t) ≥ f(0) for t ∈ [0, 1], so f′(0) ≥ 0. However, f′(0) = −2 Re(y − x, w − x). Therefore, if x is closest to y,

Re(y − x, w − x) ≤ 0.

Next suppose this condition holds. Then you have

|y − (x + t(w − x))|^2 ≥ |y − x|^2 + t^2 |w − x|^2 ≥ |y − x|^2

By convexity of K, a generic point of K is of the form x + t(w − x) for w ∈ K and t ∈ [0, 1]. Hence x is the closest point.

20. ↑Using Problem 19 and Problems 17 - 18, show the projection map P onto a closed convex subset is Lipschitz continuous with Lipschitz constant 1. That is,

|Px − Py| ≤ |x − y|

From the variational inequality,

Re(Px − Py, y − Py) ≤ 0
Re(Py − Px, x − Px) ≤ 0

Thus

Re(Px − Py, x − Px) ≥ 0

Hence

Re(Px − Py, x − Px) − Re(Px − Py, y − Py) ≥ 0

and so

Re(Px − Py, x − y − (Px − Py)) ≥ 0

Therefore, by the Cauchy Schwarz inequality,

|x − y| |Px − Py| ≥ Re(Px − Py, x − y) ≥ (Px − Py, Px − Py) = |Px − Py|^2

and dividing by |Px − Py| (when it is nonzero) gives the result.

21. Give an example of two vectors x, y in R^4 and a subspace V such that x · y = 0 but Px · Py ≠ 0, where P denotes the projection map which sends x to its closest point on V.

Try this. Let V be the span of e_1 and e_2, and let x = e_1 + e_3, y = e_1 − e_3. Then x · y = 1 − 1 = 0, while

Px = (x, e_1) e_1 + (x, e_2) e_2 = e_1
Py = (y, e_1) e_1 + (y, e_2) e_2 = e_1

so Px · Py = 1 ≠ 0.

22. Suppose you are given the data (1, 2), (2, 4), (3, 8), (0, 0). Find the linear regression line using the formulas derived above. Then graph the given data along with your regression line.

You draw the graphs. You want to solve

a + b = 2
2a + b = 4
3a + b = 8
b = 0

Of course there is no solution. The least squares approach is to solve

(1 1)T (1 1)         (1 1)T (2)
(2 1)  (2 1) (a)   = (2 1)  (4)
(3 1)  (3 1) (b)     (3 1)  (8)
(0 1)  (0 1)         (0 1)  (0)

that is,

(14  6) (a)   (34)
(6   4) (b) = (14)

Solution is: a = 13/5, b = −2/5. Then the best line is

y = (13/5) x − 2/5
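The same regression line can be obtained with a least squares solver (a numpy sketch, not part of the original solution):

```python
import numpy as np

X = np.array([[1., 1.],
              [2., 1.],
              [3., 1.],
              [0., 1.]])          # columns: x-values and a constant term
y = np.array([2., 4., 8., 0.])

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)   # [13/5, -2/5] = [2.6, -0.4]
```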

23. Generalize the least squares procedure to the situation in which data is given and you desire to fit it with an expression of the form y = a f(x) + b g(x) + c, where the problem would be to find a, b and c in order to minimize the error. Could this be generalized to higher dimensions? How about more functions?

Say you had ordered pairs (x_i, y_i) for i ≤ n which you wanted the curve to approximate. Then you would find a least squares solution for a, b, c to the system

y_i = a f(x_i) + b g(x_i) + c, i = 1, ..., n

The principles are the same for higher dimensions and for more functions.

24. Let A ∈ L(X, Y) where X and Y are finite dimensional vector spaces with the dimension of X equal to n. Define rank(A) ≡ dim(A(X)) and nullity(A) ≡ dim(ker(A)). Show that nullity(A) + rank(A) = dim(X). Hint: Let {x_i}_{i=1}^r be a basis for ker(A) and let {x_i}_{i=1}^r ∪ {y_i}_{i=1}^{n−r} be a basis for X. Then show that {Ay_i}_{i=1}^{n−r} is linearly independent and spans AX.

Let {Ay_i}_{i=1}^m be a basis for AX and let {x_i}_{i=1}^r be a basis for ker(A). Then {y_1, ..., y_m, x_1, ..., x_r} is linearly independent. In fact it spans X, because if z ∈ X, then

Az = ∑_{i=1}^m c_i A y_i

and so z − ∑_{i=1}^m c_i y_i ∈ ker(A), hence it is in the span of {x_i}_{i=1}^r. Therefore, z is in the span of the given vectors, and so this list of vectors is a basis. Hence the dimension of X is equal to m + r = rank(A) + dim(ker(A)).

25. Let A be an m × n matrix. Show the column rank of A equals the column rank of A*A. Next verify the column rank of A*A is no larger than the column rank of A*. Next justify the following (in)equalities to conclude the column rank of A equals the column rank of A*.

rank(A) = rank(A*A) ≤ rank(A*) = rank(AA*) ≤ rank(A).

Hint: Start with an orthonormal basis {Ax_j}_{j=1}^r of A(F^n) and verify {A*Ax_j}_{j=1}^r is a basis for A*A(F^n).

Say you have ∑_{i=1}^r c_i A*A x_i = 0. Then for any appropriate y,

0 = (∑_{i=1}^r c_i A*A x_i, y) = (∑_{i=1}^r c_i A x_i, Ay)

In particular, you could have y = ∑_i c_i x_i. Then you would get

∑_{i=1}^r c_i A x_i = 0

and so each c_i = 0, by the independence of the Ax_i. Now if x ∈ F^n, then Ax is a linear combination of the Ax_j, and so A*Ax is a linear combination of the A*Ax_j, which shows that these vectors are a basis for A*A F^n. It follows that the rank of A and the rank of A*A are the same. Similarly, the rank of A* and of AA* are the same. Hence the above (in)equalities follow right away.

26. Let A be a real m × n matrix and let A = QR be the QR factorization with Q orthogonal and R upper triangular. Show that there exists a solution x to the equation

R^T R x = R^T Q^T b

and that this solution is also a least squares solution defined above, such that A^T A x = A^T b.

There exists a solution to this last equation. Hence you have a solution to

R^T Q^T Q R x = R^T Q^T b

But since Q^T Q = I, this is just the equation

R^T R x = R^T Q^T b

Of course this kind of system is very easy for a computer to solve.

F.31 Exercises 12.9

1. Here are three vectors in R^4: (1, 2, 0, 3)^T, (2, 1, −3, 2)^T, (0, 0, 1, 2)^T. Find the three dimensional volume of the parallelepiped determined by these three vectors.

(1, 2, 0, 3) · (1, 2, 0, 3) = 14
(1, 2, 0, 3) · (2, 1, −3, 2) = 10
(1, 2, 0, 3) · (0, 0, 1, 2) = 6
(2, 1, −3, 2) · (2, 1, −3, 2) = 18
(2, 1, −3, 2) · (0, 0, 1, 2) = 1
(0, 0, 1, 2) · (0, 0, 1, 2) = 5

det (14  10  6)
    (10  18  1) = 218
    (6   1   5)

The volume is √218.
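The Gram matrix computation above is U^T U for the matrix U whose columns are the three vectors; a numpy sketch (not part of the original text):

```python
import numpy as np

U = np.array([[1., 2., 0.],
              [2., 1., 0.],
              [0., -3., 1.],
              [3., 2., 2.]])   # the three vectors as columns

G = U.T @ U                    # the Gram matrix of dot products
vol = np.sqrt(np.linalg.det(G))
print(G)          # [[14,10,6],[10,18,1],[6,1,5]]
print(vol**2)     # 218
```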

2. Here are two vectors in R^4: (1, 2, 0, 3)^T, (2, 1, −3, 2)^T. Find the volume of the parallelepiped determined by these two vectors.

(1, 2, 0, 3) · (1, 2, 0, 3) = 14
(1, 2, 0, 3) · (2, 1, −3, 2) = 10
(2, 1, −3, 2) · (2, 1, −3, 2) = 18

det (14  10)
    (10  18) = 152

The volume is √152.

3. Here are three vectors in R^2: (1, 2)^T, (2, 1)^T, (0, 1)^T. Find the three dimensional volume of the parallelepiped determined by these three vectors. Recall that from the above theorem, this should equal 0.

It does equal 0: three vectors in R^2 are linearly dependent, so the Gram matrix is singular.

4. Find the equation of the plane through the three points (1, 2, 3), (2, −3, 1), (1, 1, 7).

The vectors (2, −3, 1) − (1, 2, 3) = (1, −5, −2) and (1, 1, 7) − (1, 2, 3) = (0, −1, 4) lie in the plane, and their cross product (−22, −4, −1) is normal to it. Hence the plane is −22(x − 1) − 4(y − 2) − (z − 3) = 0, that is, 22x + 4y + z = 33.

5. Let T map a vector space V to itself. Explain why T is one to one if and only if T is onto. It is in the text, but do it again in your own words.

This is because if T is one to one, it takes a basis to a basis. If T is onto, then it takes a basis to a spanning set which must be independent, since otherwise there would be a linearly independent set of vectors which would also span and yet have fewer vectors than a basis.

6. ↑Let all matrices be complex with complex field of scalars and let A be an n × n matrix and B an m × m matrix, while X will be an n × m matrix. The problem is to consider solutions to Sylvester's equation. Solve the following equation for X:

AX − XB = C

where C is an arbitrary n × m matrix. If q(λ) is a polynomial, show first that if AX − XB = 0, then q(A) X − X q(B) = 0. Next define the linear map T which maps the n × m matrices to the n × m matrices as follows:

TX ≡ AX − XB

Show that the only solution to TX = 0 is X = 0 if and only if σ(A) ∩ σ(B) = ∅. Conclude that there exists a unique solution for each C to Sylvester's equation if and only if σ(A) ∩ σ(B) = ∅.

Consider the first part. Suppose AX = XB. Then A^2 X = A(XB) = (AX)B = XB^2, and by induction A^k X = XB^k for every k. Taking linear combinations, the desired result q(A) X = X q(B) holds for any polynomial q. Now consider T, which maps the vector space of n × m matrices to itself. Let q(λ) = ∏_{i=1}^m (λ − λ_i) be the characteristic polynomial of B, and suppose TX = 0. Then, since q(B) = 0 by the Cayley Hamilton theorem,

q(A) X − X q(B) = q(A) X = 0.

In case there are no shared eigenvalues, no λ_i is an eigenvalue of A, so

q(A) = ∏_{i=1}^m (A − λ_i I)
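When the spectra are disjoint, the unique solution can be computed by vectorizing the equation (a numpy sketch with illustrative matrices, not part of the original solution): with column-stacking vec, vec(AX) = (I ⊗ A) vec(X) and vec(XB) = (B^T ⊗ I) vec(X).

```python
import numpy as np

A = np.array([[1., 2.],
              [0., 3.]])     # spectrum {1, 3}
B = np.array([[5., 0.],
              [1., 6.]])     # spectrum {5, 6}, disjoint from that of A
C = np.array([[1., 0.],
              [2., 1.]])

n, m = A.shape[0], B.shape[0]
# Matrix of the map T: X -> AX - XB acting on vec(X) (column-major order).
K = np.kron(np.eye(m), A) - np.kron(B.T, np.eye(n))
X = np.linalg.solve(K, C.flatten(order='F')).reshape((n, m), order='F')

print(np.allclose(A @ X - X @ B, C))   # X solves Sylvester's equation
```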

7. Compare Definition 12.8.2 with the Binet Cauchy theorem, Theorem 3.3.14. What is the geometric meaning of the Binet Cauchy theorem in this context?

Letting U be the matrix having u_i as the ith column, the above definition states that the volume is just det(U^T U)^{1/2}, and by the Cauchy Binet theorem, in the interesting case that U has more rows than columns, this equals

(∑_I det(U_I)^2)^{1/2}

where I ranges over the choices of m rows selected from U, so that U_I is the square matrix which results from these rows. The geometric meaning of the above expression is this: det(U_I) gives the m dimensional volume of the parallelepiped which is obtained from a projection onto the m dimensional coordinate subspace of R^n determined by setting the other n − m variables equal to 0. Thus the volume is the square root of the sum of the squares of the volumes of these projections.

F.32 Exercises 13.12

1. Show (A*)* = A and (AB)* = B*A*.

((A*)* x, y) = (x, A*y) = (Ax, y). Since y is arbitrary, (A*)* x = Ax.

((AB)* x, y) = (x, ABy) = (A*x, By) = (B*A*x, y)

Since y is arbitrary, this shows (AB)* x = B*A* x, and since x is arbitrary, this shows the formula.

2. Prove Corollary 13.3.9.

This is just like the proof of the theorem.

3. Show that if A is an n × n matrix which has an inverse, then A^+ = A^{-1}.

You have A A^{-1} A = A, A^{-1} A A^{-1} = A^{-1}, and A A^{-1} = I which is Hermitian, and A^{-1} A = I which is also Hermitian. Thus A^{-1} satisfies the conditions which characterize the Moore Penrose inverse.

4. Using the singular value decomposition, show that for any square matrix A, it follows that A∗A is unitarily similar to AA∗.

You have A = U∗ΣV where Σ is the singular value matrix and U, V are unitary of the right size. Therefore, A∗A = V∗ΣUU∗ΣV = V∗Σ^2 V. Similarly, AA∗ = U∗Σ^2 U. Then Σ^2 = UAA∗U∗ and so

A∗A = V∗UAA∗U∗V

Since these matrices are all square, this does it. U∗V is unitary.

5. Let A, B be m × n matrices. Define an inner product on the set of m × n matrices by

(A, B)_F ≡ trace(AB∗).


Show this is an inner product satisfying all the inner product axioms. Recall for M an n × n matrix, trace(M) ≡ ∑_{i=1}^{n} M_{ii}. The resulting norm, ‖·‖_F, is called the Frobenius norm and it can be used to measure the distance between two matrices.

This was done earlier.

6. Let A be an m × n matrix. Show

‖A‖_F^2 ≡ (A, A)_F = ∑_j σ_j^2

where the σ_j are the singular values of A.

‖A‖_F^2 = trace(AA∗) = trace(U∗ΣV V∗Σ∗U) = trace(ΣΣ∗) = sum of the squares of the singular values.
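As a numerical sanity check of this identity, one can compare trace(AA∗) with the sum of the squared singular values. This sketch is not part of the original solution; the complex 4 × 3 test matrix is arbitrary.

```python
import numpy as np

# Check that ||A||_F^2 = trace(A A*) equals the sum of squared singular values.
# The test matrix is an arbitrary complex 4 x 3 example.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))

frob_sq = np.trace(A @ A.conj().T).real       # (A, A)_F = trace(A A*)
sigma = np.linalg.svd(A, compute_uv=False)    # singular values of A
assert np.isclose(frob_sq, np.sum(sigma**2))
```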

7. If A is a general n × n matrix having possibly repeated eigenvalues, show there is a sequence {A_k} of n × n matrices having distinct eigenvalues which has the property that the ijth entry of A_k converges to the ijth entry of A for all ij. Hint: Use Schur's theorem.

A = U∗TU where T is upper triangular and U is unitary. Change the diagonal entries of T slightly so that the resulting upper triangular matrix T_k has all distinct diagonal entries and T_k → T in the sense that the ijth entry of T_k converges to the ijth entry of T. Then let A_k = U∗T_kU. It follows that A_k → A in the sense that corresponding entries converge.

8. Prove the Cayley Hamilton theorem as follows. First suppose A has a basis of eigenvectors {v_k}_{k=1}^{n}, Av_k = λ_k v_k. Let p(λ) be the characteristic polynomial. Show p(A)v_k = p(λ_k)v_k = 0. Then since {v_k} is a basis, it follows p(A)x = 0 for all x and so p(A) = 0. Next, in the general case, use Problem 7 to obtain a sequence A_k of matrices whose entries converge to the entries of A such that A_k has n distinct eigenvalues and therefore by Theorem 7.1.7 A_k has a basis of eigenvectors. Therefore, from the first part and for p_k(λ) the characteristic polynomial for A_k, it follows p_k(A_k) = 0. Now explain why and the sense in which

lim_{k→∞} p_k(A_k) = p(A).

First say A has a basis of eigenvectors {v_1, · · · , v_n}, Av_j = λ_j v_j. Then it follows that p(A)v_k = p(λ_k)v_k. Hence if x is any vector, let x = ∑_{k=1}^{n} x_k v_k and it follows that

p(A)x = p(A) (∑_{k=1}^{n} x_k v_k) = ∑_{k=1}^{n} x_k p(A)v_k
= ∑_{k=1}^{n} x_k p(λ_k)v_k = ∑_{k=1}^{n} x_k 0 v_k = 0

Hence p(A) = 0. Now drop the assumption that A is nondefective. From the above, there exists a sequence A_k which is nondefective which converges to A and also p_k(λ) → p(λ) uniformly on compact sets because these characteristic polynomials are defined in terms of determinants of the corresponding matrix. See the above construction of the A_k. It is probably easiest to use the Frobenius norm for the last part.

‖p_k(A_k) − p(A)‖_F ≤ ‖p_k(A_k) − p(A_k)‖_F + ‖p(A_k) − p(A)‖_F


The first term converges to 0 because the convergence of A_k to A implies all entries of A_k lie in a compact set. The second term converges to 0 also because the entries of A_k converge to the corresponding entries of A.
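The conclusion p(A) = 0 can be illustrated numerically. The following sketch is not part of the original solution; the 4 × 4 test matrix is arbitrary.

```python
import numpy as np

# Numerical check of the Cayley-Hamilton theorem: p(A) = 0 where p is the
# characteristic polynomial of A.
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))

coeffs = np.poly(A)        # coefficients of det(lambda*I - A), leading 1
n = len(coeffs) - 1
pA = sum(c * np.linalg.matrix_power(A, n - k) for k, c in enumerate(coeffs))
assert np.allclose(pA, 0.0, atol=1e-8)
```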

9. Prove that Theorem 13.4.6 and Corollary 13.4.7 can be strengthened so that the condition on the A_k is necessary as well as sufficient. Hint: Consider vectors of the form (x, 0)^T where x ∈ F^k.

Say the matrix is positive definite. Then you could just look at

(x^T 0) M(A) (x, 0)^T = x^T M(A)_k x

and this needs to be positive. Hence those principal minors have all positive determinants.

10. Show directly that if A is an n × n matrix and A = A∗ (A is Hermitian) then all the eigenvalues and eigenvectors are real and that eigenvectors associated with distinct eigenvalues are orthogonal (their inner product is zero).

λ(x, x) = (Ax, x) = (x, Ax) = λ̄(x, x) so λ = λ̄. Now if x is an eigenvector, conjugating both sides of Ax = λx gives Ax̄ = λx̄, and so x + x̄ is also an eigenvector. Hence, you can assume that the eigenvectors are real. If x is pure imaginary, you could simply multiply by i and get an eigenvector which is real.

11. Let {v_1, · · · , v_n} be an orthonormal basis for F^n. Let Q be a matrix whose ith column is v_i. Show

Q∗Q = QQ∗ = I.

This follows from how we multiply matrices.

12. Show that an n × n matrix Q is unitary if and only if it preserves distances. This means |Qv| = |v|. This was done in the text but you should try to do it for yourself.

13. Suppose {v_1, · · · , v_n} and {w_1, · · · , w_n} are two orthonormal bases for F^n and suppose Q is an n × n matrix satisfying Qv_i = w_i. Then show Q is unitary. If |v| = 1, show there is a unitary transformation which maps v to e_1.

This is easy because you show it preserves distances.

14. Finish the proof of Theorem 13.6.5.

15. Let A be a Hermitian matrix so A = A∗ and suppose all eigenvalues of A are larger than δ^2. Show

(Av, v) ≥ δ^2 |v|^2

where here, the inner product is

(v, u) ≡ ∑_{j=1}^{n} v_j ū_j.

Since it is Hermitian, there exists unitary U such that U∗AU = D. Then

(Ax, x) = (UDU∗x, x) = (DU∗x, U∗x) ≥ δ^2 |U∗x|^2 = δ^2 |x|^2


16. Suppose A + A∗ has all negative eigenvalues. Then show that the eigenvalues of A all have negative real parts.

You have

0 > ((A + A∗)x, x) = (Ax, x) + (A∗x, x) = (Ax, x) + (x, Ax)

Now let Ax = λx. Then you get

0 > λ|x|^2 + λ̄|x|^2 = 2 Re(λ) |x|^2

17. The discrete Fourier transform maps C^n → C^n as follows.

F(x) = z where z_k = (1/√n) ∑_{j=0}^{n−1} e^{−i(2π/n)jk} x_j.

Show that F^{−1} exists and is given by the formula

F^{−1}(z) = x where x_j = (1/√n) ∑_{k=0}^{n−1} e^{i(2π/n)jk} z_k

Here is one way to approach this problem. Note z = Ux where

U = (1/√n) ·
( e^{−i(2π/n)0·0}      e^{−i(2π/n)1·0}      e^{−i(2π/n)2·0}      · · ·  e^{−i(2π/n)(n−1)·0}     )
( e^{−i(2π/n)0·1}      e^{−i(2π/n)1·1}      e^{−i(2π/n)2·1}      · · ·  e^{−i(2π/n)(n−1)·1}     )
( e^{−i(2π/n)0·2}      e^{−i(2π/n)1·2}      e^{−i(2π/n)2·2}      · · ·  e^{−i(2π/n)(n−1)·2}     )
(        ...                  ...                  ...                          ...             )
( e^{−i(2π/n)0·(n−1)}  e^{−i(2π/n)1·(n−1)}  e^{−i(2π/n)2·(n−1)}  · · ·  e^{−i(2π/n)(n−1)·(n−1)} )

Now argue U is unitary and use this to establish the result. To show this, verify each row has length 1 and the inner product of two different rows gives 0. Now U_{kj} = e^{−i(2π/n)jk} and so (U∗)_{kj} = e^{i(2π/n)jk}.

Take the inner product of two rows.

∑_{j=0}^{n−1} e^{−i(2π/n)jk} · conj(e^{−i(2π/n)jl}) = ∑_{j=0}^{n−1} e^{−i(2π/n)jk} e^{i(2π/n)jl} = ∑_{j=0}^{n−1} e^{i(2π/n)j(l−k)}

= (1 − e^{i2π(l−k)}) / (1 − e^{i(2π/n)(l−k)}) = 0 if l ≠ k

because e^{i2π(l−k)} = 1. What if l = k? Then, restoring the 1/√n factors, the sum reduces to ∑_{j=0}^{n−1} (1/√n)(1/√n) = 1 and so this matrix is unitary. Therefore, its inverse is just the transpose conjugate which yields the other formula.
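A quick numerical sketch, not part of the original solution, confirms that this matrix is unitary and that the conjugate transpose inverts the transform; the choice n = 8 is arbitrary.

```python
import numpy as np

# Build U with jk-th entry e^{-i 2 pi j k / n} / sqrt(n) and verify that it
# is unitary, so the inverse DFT is the conjugate transpose.
n = 8
J, K = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
U = np.exp(-2j * np.pi * J * K / n) / np.sqrt(n)

assert np.allclose(U.conj().T @ U, np.eye(n))   # rows/columns are orthonormal

x = np.random.default_rng(2).standard_normal(n)
z = U @ x                                       # forward transform F(x)
assert np.allclose(U.conj().T @ z, x)           # inverse formula recovers x
```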

18. Let f be a periodic function having period 2π. The Fourier series of f is an expression of the form

∑_{k=−∞}^{∞} c_k e^{ikx} ≡ lim_{n→∞} ∑_{k=−n}^{n} c_k e^{ikx}

and the idea is to find c_k such that the above sequence converges in some way to f. If

f(x) = ∑_{k=−∞}^{∞} c_k e^{ikx}


and you formally multiply both sides by e^{−imx} and then integrate from 0 to 2π, interchanging the integral with the sum without any concern for whether this makes sense, show it is reasonable from this to expect

c_m = (1/2π) ∫_0^{2π} f(x) e^{−imx} dx.

Now suppose you only know f(x) at equally spaced points 2πj/n for j = 0, 1, · · · , n. Consider the Riemann sum for this integral obtained from using the left endpoint of the subintervals determined from the partition {(2π/n)j}_{j=0}^{n}. How does this compare with the discrete Fourier transform? What happens as n → ∞ to this approximation?

Multiply by e^{−imx} and formally integrate both sides. This leads to the formula for c_m. Now use a left sum. This would be

(1/2π) ∑_{j=0}^{n−1} f((2π/n)j) e^{−im(2π/n)j} ((2π/n)(j+1) − (2π/n)j) = (1/n) ∑_{j=0}^{n−1} f((2π/n)j) e^{−im(2π/n)j}

Thus an approximation to c_m comes from multiplying the vector f by the unitary matrix U above, where f is the vector whose jth component is f((2π/n)j)/√n.

19. Suppose A is a real 3 × 3 orthogonal matrix (recall this means AA^T = A^T A = I) having determinant 1. Show it must have an eigenvalue equal to 1. Note this shows there exists a vector x ≠ 0 such that Ax = x. Hint: Show first or recall that any orthogonal matrix must preserve lengths. That is, |Ax| = |x|.

If Ax = λx, then you can take the norm of both sides and conclude that |λ| = 1. It follows that the eigenvalues of A are e^{iθ}, e^{−iθ} and another one which has magnitude 1 and is real. This can only be 1 or −1. Since the determinant is given to be 1, it follows that it is 1. Therefore, there exists an eigenvector for the eigenvalue 1.
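A numerical sketch, not part of the original solution, illustrates the conclusion; the rotation below about the z-axis with angle 0.7 is an arbitrary example of such a matrix.

```python
import numpy as np

# A rotation about the z-axis: real, orthogonal, determinant 1.
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])

assert np.allclose(R @ R.T, np.eye(3))          # orthogonal
assert np.isclose(np.linalg.det(R), 1.0)        # determinant 1
eigs = np.linalg.eigvals(R)
assert np.min(np.abs(eigs - 1.0)) < 1e-10       # 1 is an eigenvalue
```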

20. Let A be a complex m × n matrix. Using the description of the Moore Penrose inverse in terms of the singular value decomposition, show that

lim_{δ→0+} (A∗A + δI)^{−1} A∗ = A^+

where the convergence happens in the Frobenius norm. Also verify, using the singular value decomposition, that the inverse exists in the above formula.

Recall that the Moore Penrose inverse is

V [ σ^{−1} 0 ; 0 0 ] U∗

where A = U [ σ 0 ; 0 0 ] V∗ (blocks written row by row). The matrices U, V are unitary and of the right size. First of all, observe that there is no problem about the existence of the inverse mentioned in the formula. This is because

((A∗A + δI)x, x) ≥ δ|x|^2

and so it is a one to one matrix. Now consider the limit assertion. The left side equals

Saylor URL: http://www.saylor.org/courses/ma212/ The Saylor Foundation

Page 152: SolutionsManual,LinearAlgebraTheoryAnd Applications · SolutionsManual,LinearAlgebraTheoryAnd Applications ... Factor x3 +8 as a product of linear factors. ... Find the solutions

152 Exercises

lim_{δ→0+} ( V [ σ 0 ; 0 0 ] U∗U [ σ 0 ; 0 0 ] V∗ + δI )^{−1} V [ σ 0 ; 0 0 ] U∗
= lim_{δ→0} ( V [ σ^2 0 ; 0 0 ] V∗ + δI )^{−1} V [ σ 0 ; 0 0 ] U∗

Now

( V [ σ^2 0 ; 0 0 ] V∗ + δI )^{−1} V [ σ 0 ; 0 0 ] U∗
= ( V [ σ^2 0 ; 0 0 ] V∗ + δVV∗ )^{−1} V [ σ 0 ; 0 0 ] U∗
= V ( [ σ^2 0 ; 0 0 ] + δI )^{−1} [ σ 0 ; 0 0 ] U∗
= V [ σ^2 + δI 0 ; 0 δI ]^{−1} [ σ 0 ; 0 0 ] U∗
= V [ (σ^2 + δI)^{−1} 0 ; 0 δ^{−1}I ] [ σ 0 ; 0 0 ] U∗
= V [ (σ^2 + δI)^{−1} σ 0 ; 0 0 ] U∗

Of course σ and σ^2 and so forth are diagonal matrices. Thus (σ^2 + δI)^{−1} σ is also a diagonal matrix having on the diagonal an expression of the form (σ_i^2 + δ)^{−1} σ_i and this clearly converges as δ → 0 to σ_i^{−1}. Therefore, the above converges to

V [ σ^{−1} 0 ; 0 0 ] U∗ = A^+
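The limit can be observed numerically. The following sketch is not part of the original solution; the real 5 × 3 test matrix is arbitrary, and numpy's pseudoinverse stands in for A^+.

```python
import numpy as np

# As delta -> 0+, (A^T A + delta I)^{-1} A^T approaches the pseudoinverse A^+.
rng = np.random.default_rng(3)
A = rng.standard_normal((5, 3))
A_plus = np.linalg.pinv(A)

errs = []
for delta in (1e-2, 1e-6, 1e-10):
    approx = np.linalg.solve(A.T @ A + delta * np.eye(3), A.T)
    errs.append(np.linalg.norm(approx - A_plus))   # Frobenius-norm error
assert errs[0] > errs[1] > errs[2]                 # error shrinks with delta
assert errs[2] < 1e-8
```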

21. Show that A^+ = (A∗A)^+ A∗. Hint: You might use the description of A^+ in terms of the singular value decomposition.

Recall that A^+ = V [ σ^{−1} 0 ; 0 0 ] U∗ where A = U [ σ 0 ; 0 0 ] V∗. The σ was the diagonal matrix consisting of square roots of eigenvalues of A∗A, arranged in decreasing order from top left toward lower right.

A∗A = V [ σ^2 0 ; 0 0 ] V∗

where the σ^2 are the square roots of the eigenvalues of (A∗A)^2. Thus

(A∗A)^+ A∗ = V [ σ^{−2} 0 ; 0 0 ] V∗V [ σ 0 ; 0 0 ] U∗ = V [ σ^{−1} 0 ; 0 0 ] U∗ = A^+


F.33 Exercises 14.7

1. Solve the system

( 4 1 1 )( x )   ( 1 )
( 1 5 2 )( y ) = ( 2 )
( 0 2 6 )( z )   ( 3 )

using the Gauss Seidel method and the Jacobi method. Check your answer by also solving it using row operations.

Solution is: x = 9/100, y = 21/100, z = 43/100.
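A minimal sketch of the two iterations, not part of the original solution: the splitting conventions below are the standard ones, A = D + L + U, with Jacobi updating x ← D^{−1}(b − (L + U)x) and Gauss Seidel sweeping through the components in order using the newest values. Both converge here because A is diagonally dominant (compare Problem 14).

```python
import numpy as np

def jacobi(A, b, iters=100):
    # Jacobi: x <- D^{-1} (b - (L + U) x), with D the diagonal of A.
    x = np.zeros(len(b))
    D = np.diag(A)
    R = A - np.diagflat(D)
    for _ in range(iters):
        x = (b - R @ x) / D
    return x

def gauss_seidel(A, b, iters=100):
    # Gauss Seidel: update each component in place using the newest values.
    n = len(b)
    x = np.zeros(n)
    for _ in range(iters):
        for i in range(n):
            x[i] = (b[i] - A[i, :i] @ x[:i] - A[i, i+1:] @ x[i+1:]) / A[i, i]
    return x

A = np.array([[4., 1., 1.], [1., 5., 2.], [0., 2., 6.]])
b = np.array([1., 2., 3.])
exact = np.array([9., 21., 43.]) / 100
assert np.allclose(jacobi(A, b), exact)
assert np.allclose(gauss_seidel(A, b), exact)
```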

2. Solve the system

( 4 1 1 )( x )   ( 1 )
( 1 7 2 )( y ) = ( 2 )
( 0 2 4 )( z )   ( 3 )

using the Gauss Seidel method and the Jacobi method. Check your answer by also solving it using row operations.

Solution is: x = 5/94, y = 7/94, z = 67/94.

3. Solve the system

( 5 1 1 )( x )   ( 1 )
( 1 7 2 )( y ) = ( 2 )
( 0 2 4 )( z )   ( 3 )

using the Gauss Seidel method and the Jacobi method. Check your answer by also solving it using row operations.

Solution is: x = 5/118, y = 9/118, z = 42/59.

4. If you are considering a system of the form Ax = b and A^{−1} does not exist, will either the Gauss Seidel or Jacobi methods work? Explain. What does this indicate about finding eigenvectors for a given eigenvalue?

No. These methods only work when there is a unique solution.

5. For ‖x‖_∞ ≡ max{|x_j| : j = 1, 2, · · · , n}, the parallelogram identity does not hold. Explain.

Let x = (1, 0), y = (0, 1). Then

‖x + y‖^2 + ‖x − y‖^2 = 1 + 1 = 2
2‖x‖^2 + 2‖y‖^2 = 4
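The counterexample can be checked directly; this sketch is not part of the original solution.

```python
import numpy as np

# Parallelogram identity fails for the max norm with x = (1,0), y = (0,1).
max_norm = lambda v: np.linalg.norm(v, ord=np.inf)
x, y = np.array([1.0, 0.0]), np.array([0.0, 1.0])

lhs = max_norm(x + y) ** 2 + max_norm(x - y) ** 2   # = 2
rhs = 2 * max_norm(x) ** 2 + 2 * max_norm(y) ** 2   # = 4
assert lhs == 2.0 and rhs == 4.0
```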

6. A norm ‖·‖ is said to be strictly convex if whenever ‖x‖ = ‖y‖, x ≠ y, it follows

‖(x + y)/2‖ < ‖x‖ = ‖y‖.


Show the norm |·| which comes from an inner product is strictly convex.

Let x, y be as described.

‖(x + y)/2‖^2 + ‖(x − y)/2‖^2 = (1/2)‖x‖^2 + (1/2)‖y‖^2 = ‖x‖^2

If x ≠ y, then

‖(x + y)/2‖^2 < ‖x‖^2

7. A norm ‖·‖ is said to be uniformly convex if whenever ‖x_n‖, ‖y_n‖ are equal to 1 for all n ∈ N and lim_{n→∞} ‖x_n + y_n‖ = 2, it follows lim_{n→∞} ‖x_n − y_n‖ = 0. Show the norm |·| coming from an inner product is always uniformly convex. Also show that uniform convexity implies strict convexity which is defined in Problem 6.

The usual norm satisfies the parallelogram law. Thus

|x_n − y_n|^2 = 2|x_n|^2 + 2|y_n|^2 − |x_n + y_n|^2 = 4 − |x_n + y_n|^2

and the right side converges to 0. Thus |x_n − y_n| → 0.

8. Suppose A : C^n → C^n is a one to one and onto matrix. Define

‖x‖ ≡ |Ax|.

Show this is a norm.

If ‖x‖ = 0, then Ax = 0 and since A is one to one, it follows that x = 0. Clearly ‖cx‖ = |c| ‖x‖. What about the triangle inequality? This is also easy.

‖x + y‖ ≡ |A(x + y)| ≤ |Ax| + |Ay| = ‖x‖ + ‖y‖

9. If X is a finite dimensional normed vector space and A, B ∈ L(X, X) such that ‖B‖ < ‖A‖, can it be concluded that ‖A^{−1}B‖ < 1?

Maybe not. Say

A = ( 1 0 ; 0 1/2 ), B = ( 1/2 0 ; 0 3/4 ).

Then A^{−1} = ( 1 0 ; 0 2 ) and

A^{−1}B = ( 1/2 0 ; 0 3/2 )

so it has norm 3/2 > 1. Also ‖B‖ < ‖A‖.

10. Let X be a vector space with a norm ‖·‖ and let V = span(v_1, · · · , v_m) be a finite dimensional subspace of X such that {v_1, · · · , v_m} is a basis for V. Show V is a closed subspace of X. This means that if w_n → w and each w_n ∈ V, then so is w. Next show that if w ∉ V,

dist(w, V) ≡ inf{‖w − v‖ : v ∈ V} > 0

is a continuous function of w and

|dist(w, V) − dist(w_1, V)| ≤ ‖w_1 − w‖

Next show that if w ∉ V, there exists z such that ‖z‖ = 1 and dist(z, V) > 1/2. For those who know some advanced calculus, show that if X is an infinite dimensional vector space having norm ‖·‖, then the closed unit ball in X cannot be compact. Thus closed and bounded is never compact in an infinite dimensional normed vector space.


Say w_n = ∑_{k=1}^{m} c_k^n v_k. Define a norm on V by ‖w‖ ≡ (∑_{k=1}^{m} |c_k|^2)^{1/2} where w = ∑_{k=1}^{m} c_k v_k. This gives a norm on V which is equivalent to the given norm on V. Thus if w_n → w, it follows that c^n → c in C^m and so w = ∑_{k=1}^{m} c_k v_k and so V is closed. Let w, w′ be two points and suppose without loss of generality that dist(w, V) − dist(w′, V) ≥ 0. Let v ∈ V be such that ‖w′ − v‖ < dist(w′, V) + ε. Then

|dist(w, V) − dist(w′, V)| = dist(w, V) − dist(w′, V)
≤ ‖w − v‖ − ‖w′ − v‖ + ε
≤ ‖w − w′‖ + ‖w′ − v‖ − ‖w′ − v‖ + ε
= ‖w − w′‖ + ε

Then since ε is arbitrary, this shows that |dist(w, V) − dist(w′, V)| ≤ ‖w − w′‖ and so this is continuous as claimed. Now let w ∉ V. Then dist(w, V) > 0 because if this were not so, there would exist v_n → w and by the first part, w ∈ V. Let v ∈ V be such that 2 dist(w, V) > ‖w − v‖. Then you can consider z = (w − v)/‖w − v‖. If v_1 ∈ V,

‖z − v_1‖ = ‖ (w − v)/‖w − v‖ − v_1 ‖w − v‖/‖w − v‖ ‖ ≥ (1/(2 dist(w, V))) ‖w − v − ‖w − v‖ v_1‖ ≥ dist(w, V)/(2 dist(w, V)) = 1/2

Since v_1 is arbitrary, it follows that dist(z, V) ≥ 1/2. If you have an infinite dimensional normed linear space, the above argument shows that you can obtain an infinite sequence of vectors {x_n}, each ‖x_n‖ = 1 and such that dist(x_{n+1}, span(x_1, · · · , x_n)) ≥ 1/2 for every n. Therefore, there is no convergent subsequence because no subsequence is a Cauchy sequence. Therefore, the unit ball is not sequentially compact.

11. Suppose ρ(A) < 1 for A ∈ L(V, V) where V is a p dimensional vector space having a norm ‖·‖. You can use R^p or C^p if you like. Show there exists a new norm |||·||| such that with respect to this new norm, |||A||| < 1 where |||A||| denotes the operator norm of A taken with respect to this new norm on V,

|||A||| ≡ sup{|||Ax||| : |||x||| ≤ 1}

Hint: You know from Gelfand's theorem that

‖A^n‖^{1/n} < r < 1

provided n is large enough, this operator norm taken with respect to ‖·‖. Show there exists 0 < λ < 1 such that

ρ(A/λ) < 1.

You can do this by arguing the eigenvalues of A/λ are the scalars µ/λ where µ ∈ σ(A). Now let Z_+ denote the nonnegative integers.

|||x||| ≡ sup_{n∈Z_+} ‖(A^n/λ^n) x‖


First show this is actually a norm. Next explain why

|||Ax||| ≡ λ sup_{n∈Z_+} ‖(A^{n+1}/λ^{n+1}) x‖ ≤ λ |||x|||.

First, it is obvious that there exists λ < 1 such that

ρ(A/λ) < 1

Just pick λ larger than the absolute value of all the eigenvalues, which are each less than 1. Now use the norm suggested. First of all, why is this a norm? If |||x||| = 0, then ‖x‖ = 0 because

‖x‖ ≤ |||x|||

It is also clear that |||cx||| = |c| |||x|||. What about the triangle inequality?

|||x + y||| ≡ sup_{n∈Z_+} ‖(A^n/λ^n)(x + y)‖ ≤ sup_{n∈Z_+} ( ‖(A^n/λ^n)x‖ + ‖(A^n/λ^n)y‖ )
≤ sup_{n∈Z_+} ‖(A^n/λ^n)x‖ + sup_{n∈Z_+} ‖(A^n/λ^n)y‖ ≡ |||x||| + |||y|||

Now why is the operator norm with respect to this new norm less than 1?

|||Ax||| ≡ sup_{n∈Z_+} ‖(A^n/λ^n) Ax‖ = λ sup_{n∈Z_+} ‖(A^{n+1}/λ^{n+1}) x‖ ≤ λ |||x|||

12. Establish a similar result to Problem 11 without using Gelfand's theorem. Use an argument which depends directly on the Jordan form or a modification of it.

ρ(A) < 1 and A = S^{−1}J_ε S where J_ε is in modified Jordan form having ε or 0 on the super diagonal, the diagonal entries of every block being less than 1 in absolute value. Let λ < 1 be larger than the absolute values of all eigenvalues. Now what if you define |||x||| ≡ ‖Sx‖_∞? Then

|||Ax||| ≡ ‖SAx‖_∞ = ‖J_ε Sx‖_∞ ≤ max_i |µ_i (Sx)_i + ε (Sx)_{i+1}|
≤ max_i λ |(Sx)_i| + ε max_i |(Sx)_i| ≤ (λ + ε) ‖Sx‖_∞ < ‖Sx‖_∞ ≡ |||x|||

provided ε is chosen small enough.

13. Using Problem 11 give an easier proof of Theorem 14.6.6 without having to use Corollary 14.6.5. It would suffice to use a different norm of this problem and the contraction mapping principle of Lemma 14.6.4.

The suggestion gives it away. You were given ρ(B^{−1}C) < 1. Now with the above problem, there is another norm such that with respect to this other norm, B^{−1}C is a contraction mapping. Therefore, it has a fixed point by the easier result on contraction maps.


14. A matrix A is diagonally dominant if |a_{ii}| > ∑_{j≠i} |a_{ij}|. Show that the Gauss Seidel method converges if A is diagonally dominant.

It is the eigenvalues of B^{−1}C which are important. Say

B^{−1}Cx = λx

Then you must have

det(λB − C) = 0

Illustrating what happens in the Gauss Seidel method in three dimensions, this requires

    ( λa_{11} −a_{12} −a_{13} )
det ( λa_{21} λa_{22} −a_{23} ) = 0
    ( λa_{31} λa_{32} λa_{33} )

In other words the above matrix must have a zero determinant. However, since the original matrix was given to be diagonally dominant, this matrix cannot have a zero determinant unless |λ| < 1.

15. Suppose f(λ) = ∑_{n=0}^{∞} a_n λ^n converges if |λ| < R. Show that if ρ(A) < R where A is an n × n matrix, then

f(A) ≡ ∑_{n=0}^{∞} a_n A^n

converges in L(F^n, F^n). Hint: Use Gelfand's theorem and the root test.

Since the given series converges if |λ| < R, it follows that

lim sup_{n→∞} |a_n λ^n|^{1/n} < 1

and so

lim sup_{n→∞} |a_n|^{1/n} |λ| < 1 if |λ| < R

Thus you have

lim sup_{n→∞} |a_n|^{1/n} ≤ 1/R

You need to look at ‖a_n A^n‖^{1/n}.

lim sup_{n→∞} ‖a_n A^n‖^{1/n} = lim sup_{n→∞} |a_n|^{1/n} ‖A^n‖^{1/n}
≤ lim sup_{n→∞} |a_n|^{1/n} · lim sup_{n→∞} ‖A^n‖^{1/n} < (1/R) · R = 1

and so the series converges.
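A concrete instance, added as a sketch and not part of the original solution: f(λ) = 1/(1 − λ) = ∑ λ^n has R = 1, so for ρ(A) < 1 the matrix series ∑ A^n should sum to (I − A)^{−1}; the 2 × 2 matrix below is an arbitrary example with small spectral radius.

```python
import numpy as np

# Matrix geometric series: sum of A^n equals (I - A)^{-1} when rho(A) < 1.
A = np.array([[0.2, 0.5], [0.1, 0.3]])       # spectral radius about 0.48
assert max(abs(np.linalg.eigvals(A))) < 1

S = np.zeros((2, 2))
P = np.eye(2)
for _ in range(200):                         # partial sums of sum_n A^n
    S += P
    P = P @ A
assert np.allclose(S, np.linalg.inv(np.eye(2) - A))
```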

16. Referring to Corollary 14.4.3, for λ = a + ib show

exp(λt) = e^{at}(cos(bt) + i sin(bt)).

Hint: Let y(t) = exp(λt) and let z(t) = e^{−at} y(t). Show

z″ + b^2 z = 0, z(0) = 1, z′(0) = ib.


Now letting z = u + iv where u, v are real valued, show

u″ + b^2 u = 0, u(0) = 1, u′(0) = 0
v″ + b^2 v = 0, v(0) = 0, v′(0) = b.

Next show u(t) = cos(bt) and v(t) = sin(bt) work in the above and that there is at most one solution to

w″ + b^2 w = 0, w(0) = α, w′(0) = β.

Thus z(t) = cos(bt) + i sin(bt) and so y(t) = e^{at}(cos(bt) + i sin(bt)). To show there is at most one solution to the above problem, suppose you have two, w_1, w_2. Subtract them. Let f = w_1 − w_2. Thus

f″ + b^2 f = 0

and f is real valued. Multiply both sides by f′ and conclude

(d/dt) ( (f′)^2/2 + b^2 f^2/2 ) = 0

Thus the expression in parentheses is constant. Explain why this constant must equal 0.

The first claim is a routine computation. The next claim is also a routine computation. What about the uniqueness assertion? It suffices to show that if α, β = 0 then the only solution is 0. Multiply both sides of the differential equation by w′. Then you get

(1/2)(d/dt)(w′)^2 + (b^2/2)(d/dt)(w^2) = 0

It follows that

(w′)^2 + b^2 w^2

is a constant. From the initial conditions, this constant can only be 0. The rest follows.
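The identity itself is easy to check numerically; this sketch is not part of the original solution, and the values of a, b, t below are arbitrary.

```python
import cmath
import math

# exp((a + ib) t) = e^{at} (cos(bt) + i sin(bt))
a, b, t = 0.3, 1.7, 2.0
lhs = cmath.exp(complex(a, b) * t)
rhs = math.exp(a * t) * complex(math.cos(b * t), math.sin(b * t))
assert abs(lhs - rhs) < 1e-12
```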

17. Let A ∈ L(R^n, R^n). Show the following power series converges in L(R^n, R^n).

∑_{k=0}^{∞} t^k A^k / k!

You might want to use Lemma 14.4.2. This is how you can define exp(tA). Next show, using arguments like those of Corollary 14.4.3,

(d/dt) exp(tA) = A exp(tA)

so that this is a matrix valued solution to the differential equation and initial condition

Ψ′(t) = AΨ(t), Ψ(0) = I.

This Ψ(t) is called a fundamental matrix for the differential equation y′ = Ay. Show t → Ψ(t)y_0 gives a solution to the initial value problem

y′ = Ay, y(0) = y_0.


The series converges absolutely by the ratio test and the observation that ‖A^k‖ ≤ ‖A‖^k. Now we need to show that you can differentiate it. Writing the difference quotient gives

(1/h) ∑_{k=0}^{∞} ((t + h)^k − t^k) A^k/k! = ∑_{k=1}^{∞} k s(k, h)^{k−1} A^k/k! = ∑_{k=1}^{∞} s(k, h)^{k−1} A^k/(k − 1)!

where s(k, h) is between t and t + h. Thus for each k, lim_{h→0} s(k, h) = t. I want to show that the limit of the series is ∑_{k=1}^{∞} t^{k−1} A^k/(k − 1)!.

∑_{k=1}^{∞} s(k, h)^{k−1} A^k/(k − 1)! − ∑_{k=1}^{∞} t^{k−1} A^k/(k − 1)! = ∑_{k=1}^{∞} (s(k, h)^{k−1} − t^{k−1}) A^k/(k − 1)!
= ∑_{k=1}^{∞} (k − 1) p(k, h)^{k−2} (s(k, h) − t) A^k/(k − 1)! = ∑_{k=2}^{∞} p(k, h)^{k−2} (s(k, h) − t) A^k/(k − 2)!

where p(k, h) is also between t and t + h. Now the norm of this is dominated by

|h| ∑_{k=2}^{∞} |p(k, h)|^{k−2} ‖A‖^k/(k − 2)!

The series converges by the ratio test and so

∑_{k=1}^{∞} s(k, h)^{k−1} A^k/(k − 1)! − ∑_{k=1}^{∞} t^{k−1} A^k/(k − 1)! → 0

as h → 0. Hence you can differentiate the series and it gives you

∑_{k=1}^{∞} t^{k−1} A^k/(k − 1)! = A ∑_{k=1}^{∞} t^{k−1} A^{k−1}/(k − 1)! = A exp(At)

When you plug in t = 0, you get I. Therefore, this is the solution to the given differential equation.

18. In Problem 17, Ψ(t) is defined by the given series. Denote by exp(tσ(A)) the numbers exp(tλ) where λ ∈ σ(A). Show exp(tσ(A)) = σ(Ψ(t)). This is like Lemma 14.4.7. Letting J be the Jordan canonical form for A, explain why

Ψ(t) ≡ ∑_{k=0}^{∞} t^k A^k/k! = S ( ∑_{k=0}^{∞} t^k J^k/k! ) S^{−1}

and you note that in J^k, the diagonal entries are of the form λ^k for λ an eigenvalue of A. Also J = D + N where N is nilpotent and commutes with D. Argue then that

∑_{k=0}^{∞} t^k J^k/k!

is an upper triangular matrix which has on the diagonal the expressions e^{λt} where λ ∈ σ(A). Thus conclude

σ(Ψ(t)) ⊆ exp(tσ(A))


Next take e^{tλ} ∈ exp(tσ(A)) and argue it must be in σ(Ψ(t)). You can do this as follows:

Ψ(t) − e^{tλ}I = ∑_{k=0}^{∞} t^k A^k/k! − ∑_{k=0}^{∞} (t^k λ^k/k!) I = ∑_{k=0}^{∞} (t^k/k!) (A^k − λ^k I)
= ∑_{k=0}^{∞} (t^k/k!) ( ∑_{j=0}^{k−1} A^{k−1−j} λ^j ) (A − λI)    (6.38)

Now you need to argue

∑_{k=0}^{∞} (t^k/k!) ∑_{j=0}^{k−1} A^{k−1−j} λ^j

converges to something in L(R^n, R^n). To do this, use the ratio test and Lemma 14.4.2 after first using the triangle inequality. Since λ ∈ σ(A), Ψ(t) − e^{tλ}I is not one to one and so this establishes the other inclusion. You fill in the details. This theorem is a special case of theorems which go by the name "spectral mapping theorem".

If λ ∈ σ(A), then Ax = λx and so ∑_{k=0}^{∞} (tA)^k x/k! = e^{λt} x and so exp(tσ(A)) ⊆ σ(exp(tA)). Now consider exp(tA). You have A = S^{−1}JS where J is Jordan form. Thus it is simple to write that

exp(tA) = S^{−1} ( ∑_{k=0}^{∞} (tJ)^k/k! ) S

Now J is block diagonal and the blocks have a constant down the main diagonal. Say

( α 1         )
(   α ...     )
(     ... 1   )
(         α   )

When you raise such a block to the exponent k, you get an upper triangular matrix which has α^k down the diagonal. Therefore, exp(tA) is of the form

S^{−1} B S

where B is an upper triangular matrix which has for its diagonal entries e^{λt} for λ ∈ σ(A). These diagonal entries are the eigenvalues of exp(tA). It follows that

σ(exp(tA)) ⊆ exp(tσ(A)).
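A numerical sketch of both inclusions, not part of the original solution: exp(tA) is summed here from its defining power series, and the 3 × 3 test matrix is arbitrary.

```python
import numpy as np

# Compare the eigenvalues of exp(tA) with exp(t * eigenvalues of A).
rng = np.random.default_rng(4)
A = rng.standard_normal((3, 3))
t = 0.5

E = np.zeros((3, 3))
P = np.eye(3)
for k in range(1, 31):            # E accumulates sum_{k=0}^{29} (tA)^k / k!
    E = E + P
    P = P @ (t * A) / k

spec_exp = np.linalg.eigvals(E)
exp_spec = np.exp(t * np.linalg.eigvals(A))
# every exp(t*lambda) occurs among the eigenvalues of exp(tA), and conversely
assert all(np.min(np.abs(spec_exp - mu)) < 1e-6 for mu in exp_spec)
assert all(np.min(np.abs(exp_spec - mu)) < 1e-6 for mu in spec_exp)
```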

19. Suppose Ψ(t) ∈ L(V, W) where V, W are finite dimensional inner product spaces and t → Ψ(t) is continuous for t ∈ [a, b]: for every ε > 0 there exists δ > 0 such that if |s − t| < δ then ‖Ψ(t) − Ψ(s)‖ < ε. Show t → (Ψ(t)v, w) is continuous. Here it is the inner product in W. Also define what it means for t → Ψ(t)v to be continuous and show this is continuous. Do it all for differentiable in place of continuous. Next show t → ‖Ψ(t)‖ is continuous.


|(Ψ(t)v, w) − (Ψ(s)v, w)| ≤ |Ψ(t)v − Ψ(s)v| |w| ≤ ‖Ψ(t) − Ψ(s)‖ |v| |w|

and so this is continuous.

‖Ψ(t)v − Ψ(s)v‖ ≤ ‖Ψ(t) − Ψ(s)‖ |v|

so t → Ψ(t)v is continuous. Differentiable works out similarly. If t → Ψ(t) is differentiable, then those other things are too. For example,

‖Ψ(t + h)v − Ψ(t)v − Ψ′(t)hv‖ ≤ ‖Ψ(t + h) − Ψ(t) − Ψ′(t)h‖ |v| = o(h)|v| = o(h)

As to continuity of t → ‖Ψ(t)‖,

|‖Ψ(t)‖ − ‖Ψ(s)‖| ≤ ‖Ψ(t) − Ψ(s)‖

from the triangle inequality.

20. If z(t) ∈ W, a finite dimensional inner product space, what does it mean for t → z(t) to be continuous or differentiable? If z is continuous, define

∫_a^b z(t) dt ∈ W

as follows.

( w, ∫_a^b z(t) dt ) ≡ ∫_a^b (w, z(t)) dt.

Show that this definition is well defined and furthermore the triangle inequality,

| ∫_a^b z(t) dt | ≤ ∫_a^b |z(t)| dt,

and fundamental theorem of calculus,

(d/dt) ( ∫_a^t z(s) ds ) = z(t)

hold along with any other interesting properties of integrals which are true.

Differentiability is the same as usual.

z(t + h) − z(t) − z′(t)h = o(h)

If t → z(t) is continuous, then from the above problem, so is t → (z(t), w) and so ∫_a^b (z(t), w) dt makes sense. Does there exist an element I of the inner product space such that (w, I) = ∫_a^b (w, z(t)) dt? This is so in a finite dimensional inner product space because

w → ∫_a^b (w, z(t)) dt

is a linear map and so it can be represented by a unique I ∈ W so that

(w, I) = ∫_a^b (w, z(t)) dt

for all w. Now what about the fundamental theorem of calculus?

( w, (1/h) ∫_t^{t+h} z(s) ds ) ≡ (1/h) ∫_t^{t+h} (w, z(s)) ds → (w, z(t))

Since the space is finite dimensional, this is the same as saying that

lim_{h→0} (1/h) ∫_t^{t+h} z(s) ds = z(t).

You can just apply the above to the finitely many Fourier coefficients. Therefore, the usual fundamental theorem of calculus holds. What about the triangle inequality?

| ∫_a^b z(t) dt | = sup_{|w|=1} | ( w, ∫_a^b z(t) dt ) | = sup_{|w|=1} | ∫_a^b (w, z(t)) dt |
≤ sup_{|w|=1} ∫_a^b |(w, z(t))| dt ≤ ∫_a^b |z(t)| dt.

21. In the situation of Problem 19 define

∫_a^b Ψ(t) dt ∈ L(V, W)

as follows.

( w, ∫_a^b Ψ(t) dt (v) ) ≡ ∫_a^b (w, Ψ(t)v) dt.

Show this is well defined and does indeed give ∫_a^b Ψ(t) dt ∈ L(V, W). Also show the triangle inequality

‖ ∫_a^b Ψ(t) dt ‖ ≤ ∫_a^b ‖Ψ(t)‖ dt

where ‖·‖ is the operator norm, and verify the fundamental theorem of calculus holds.

( ∫_a^t Ψ(s) ds )′ = Ψ(t).

Also verify the usual properties of integrals continue to hold such as the fact the integral is linear and

∫_a^b Ψ(t) dt + ∫_b^c Ψ(t) dt = ∫_a^c Ψ(t) dt

and similar things. Hint: On showing the triangle inequality, it will help if you use the fact that

|w|_W = sup_{|v|≤1} |(w, v)|.


You should show this also.

That last assertion follows by the Cauchy Schwarz inequality, which shows the right side is no larger than |w|, and then by taking v = w/|w|. That the integral is well defined follows because w → ∫_a^b (w, Ψ(t)v) dt is a linear mapping and so, by the Riesz representation theorem, there exists a unique I(v) ∈ W such that (w, I(v)) = ∫_a^b (w, Ψ(t)v) dt. Now also, I is a linear function of v because of the fact that each Ψ(t) is linear. Therefore, you can denote by ∫_a^b Ψ(t) dt ∈ L(V, W) this I. Thus the definition of the integral is well defined. The standard properties of the integral are now fairly obvious. As for the triangle inequality, it goes the same way as in the above problem.

∥∥∥∥∥

∫ b

a

Ψ (t) dt

∥∥∥∥∥≡ sup

|v|≤1

∥∥∥∥∥

∫ b

a

Ψ (t) dt (v)

∥∥∥∥∥= sup|v|≤1

sup|w|≤1

∣∣∣∣∣

(

w,

∫ b

a

Ψ (t) dt (v)

)∣

= sup|v|≤1

sup|w|≤1

∫ b

a

(w,Ψ (t) (v)) dt

≤∫ b

a

‖Ψ (t)‖ dt.

Now consider the fundamental theorem of calculus. For simplicity, let $h \to 0^+$.
$$\left\| \frac{1}{h} \int_t^{t+h} \Psi(s)\,ds - \Psi(t) \right\| = \left\| \frac{1}{h} \int_t^{t+h} \left( \Psi(s) - \Psi(t) \right)\,ds \right\| \le \frac{1}{h} \int_t^{t+h} \|\Psi(s) - \Psi(t)\|\,ds,$$
which converges to 0 by continuity of $\Psi$.

22. Prove Gronwall's inequality: Suppose $u(t) \ge 0$ is continuous and that for all $t \in [0,T]$,
$$u(t) \le u_0 + \int_0^t K u(s)\,ds,$$
where $K$ is some nonnegative constant. Then
$$u(t) \le u_0 e^{Kt}.$$
Hint: Let $w(t) = \int_0^t u(s)\,ds$. Then, using the fundamental theorem of calculus, $w(t)$ satisfies
$$u(t) - K w(t) = w'(t) - K w(t) \le u_0, \quad w(0) = 0.$$
Now use the usual techniques you saw in an introductory differential equations class. Multiply both sides of the above inequality by $e^{-Kt}$ and note that the resulting left side is a total derivative. Integrate both sides from 0 to $t$ and see what you have. If you have problems, look ahead in the book; this inequality is proved later, in Theorem C.4.3.

Following the hint,
$$\left( w e^{-Kt} \right)' \le u_0 e^{-Kt}$$
and therefore,
$$w(t) e^{-Kt} \le u_0 \int_0^t e^{-Ks}\,ds = u_0 \left( \frac{1}{K} - \frac{e^{-Kt}}{K} \right).$$
Therefore, $w(t) \le u_0 \left( \frac{e^{Kt}}{K} - \frac{1}{K} \right)$. Also, you get
$$u(t) \le u_0 + K w(t).$$
If $K > 0$, this implies
$$u(t) \le u_0 + K \left( u_0 \left( \frac{e^{Kt}}{K} - \frac{1}{K} \right) \right) = u_0 e^{Kt}.$$
If $K = 0$, the result requires no proof.
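A quick numerical illustration (a sketch, not part of the original argument): the solution of $u' = Ku$, $u(0) = u_0$, satisfies the hypothesis with equality, so the bound $u_0 e^{Kt}$ should be attained up to discretization error and never exceeded by a forward Euler approximation.

```python
import math

def euler_integral(K, u0, T, n):
    """Approximate the solution of u' = K*u, u(0) = u0, by Euler's method."""
    dt = T / n
    u = u0
    us = [u]
    for _ in range(n):
        u = u + dt * K * u  # one Euler step of u' = K u
        us.append(u)
    return us

K, u0, T, n = 2.0, 1.0, 1.0, 100000
us = euler_integral(K, u0, T, n)
bound = u0 * math.exp(K * T)   # the Gronwall bound at t = T
# Euler underestimates e^{KT} since (1 + x) <= e^x, so the bound holds.
print(us[-1], bound)
```

Here `euler_integral` is an illustrative helper; the discrete solution $u_0(1+K\,dt)^n$ stays below $u_0 e^{KT}$ because $1+x \le e^x$.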

23. With Gronwall's inequality and the integral defined in Problem 21 with its properties listed there, prove there is at most one solution to the initial value problem
$$y' = Ay, \quad y(0) = y_0.$$
Hint: If there are two solutions, subtract them and call the result $z$. Then
$$z' = Az, \quad z(0) = 0.$$
It follows that
$$z(t) = 0 + \int_0^t A z(s)\,ds$$
and so
$$\|z(t)\| \le \int_0^t \|A\| \|z(s)\|\,ds.$$
Now consider Gronwall's inequality of Problem 22. Use Problem 20 as needed.

From Gronwall's inequality with $u_0 = 0$ and $K = \|A\|$,
$$\|z(t)\| \le 0 \cdot e^{\|A\| t} = 0.$$

24. Suppose $A$ is a matrix which has the property that whenever $\mu \in \sigma(A)$, $\operatorname{Re}\mu < 0$. Consider the initial value problem
$$y' = Ay, \quad y(0) = y_0.$$
The existence and uniqueness of a solution to this equation have been established in the preceding problems, Problems 17 to 23. Show that in this case, where the real parts of the eigenvalues are all negative, the solution to the initial value problem satisfies
$$\lim_{t \to \infty} y(t) = 0.$$
Hint: A nice way to approach this problem is to show you can reduce it to the consideration of the initial value problem
$$z' = J_\varepsilon z, \quad z(0) = z_0,$$
where $J_\varepsilon$ is the modified Jordan canonical form in which, instead of ones down the super-diagonal, there are $\varepsilon$ down the super-diagonal. You have $y' = S^{-1} J_\varepsilon S y$, and so you just let $z = Sy$. Then $z \to 0$ if and only if $y \to 0$.

Then
$$z' = Dz + N_\varepsilon z,$$
where $D$ is the diagonal matrix obtained from the eigenvalues of $A$ and $N_\varepsilon$ is a nilpotent matrix commuting with $D$ which is very small provided $\varepsilon$ is chosen very small. Now let $\Psi(t)$ be the solution of
$$\Psi' = -D\Psi, \quad \Psi(0) = I,$$
described earlier as
$$\sum_{k=0}^\infty \frac{(-1)^k t^k D^k}{k!}.$$
Thus $\Psi(t)$ commutes with $D$ and $N_\varepsilon$. Tell why.

The reason these commute is that $D$ is block diagonal, each block having constant entries down the diagonal, while the same is true of $N_\varepsilon$. The blocks associated with $N_\varepsilon$ commute with the blocks associated with $D$, and so $D$ and $N_\varepsilon$ commute. Thus $\Psi(t)$ commutes with $D$ and $N_\varepsilon$.

Next argue that
$$(\Psi(t) z)' = \Psi(t) N_\varepsilon z(t)$$
and integrate from 0 to $t$.

Just do the differentiation:
$$(\Psi(t) z)' = \Psi'(t) z(t) + \Psi(t) z'(t) = -D\Psi(t) z(t) + \Psi(t)(D + N_\varepsilon) z(t) = \Psi(t) N_\varepsilon z(t).$$
Then, integrating,
$$\Psi(t) z(t) - z_0 = \int_0^t \Psi(s) N_\varepsilon z(s)\,ds.$$
It follows from the triangle inequality discussed above that
$$\|\Psi(t) z(t)\| \le \|z_0\| + \int_0^t \|N_\varepsilon\| \|\Psi(s) z(s)\|\,ds.$$
It follows from Gronwall's inequality that
$$\|\Psi(t) z(t)\| \le \|z_0\| e^{\|N_\varepsilon\| t}.$$
Now look closely at the form of $\Psi(t)$ to get an estimate which is interesting. Explain why
$$\Psi(t) = \begin{pmatrix} e^{\mu_1 t} & & 0 \\ & \ddots & \\ 0 & & e^{\mu_n t} \end{pmatrix}$$
and now observe that if $\varepsilon$ is chosen small enough, $\|N_\varepsilon\|$ is so small that each component of $z(t)$ converges to 0.

It follows right away from the definition of $\Psi(t)$ that $\Psi(t)$ is given by the above diagonal matrix, where here the $\mu_i$ are the negatives of the eigenvalues of $A$. Thus their real parts are all bounded below by some $\delta > 0$. It follows that
$$|(\Psi(t)x)_i| \ge e^{\delta t} |x_i|$$
and so
$$\|\Psi(t) z(t)\| \ge e^{\delta t} \|z(t)\|.$$
Therefore, from the above inequality,
$$\|z(t)\| \le \|z_0\| e^{-\delta t} + e^{-\delta t} \int_0^t \|N_\varepsilon\| \|z_0\| e^{\|N_\varepsilon\| s}\,ds = \|z_0\| e^{-\delta t} e^{\|N_\varepsilon\| t}.$$
Of course the right side converges to 0 if you choose $\varepsilon$ small enough that $\|N_\varepsilon\| < \delta$.
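The conclusion can be seen numerically (a sketch; the 2x2 matrix below is an arbitrary example with eigenvalues $-1 \pm 2i$, all with negative real part, not taken from the text): Euler-integrate $y' = Ay$ and watch $|y(t)| \to 0$.

```python
# Arbitrary example matrix with spectrum -1 +- 2i, so Re(mu) < 0.
A = [[-1.0, -2.0],
     [ 2.0, -1.0]]
y = [1.0, 1.0]
dt, steps = 1e-3, 10000   # integrate out to t = 10
for _ in range(steps):
    dy = [sum(A[i][j] * y[j] for j in range(2)) for i in range(2)]  # Ay
    y = [y[i] + dt * dy[i] for i in range(2)]                       # Euler step
norm = (y[0] ** 2 + y[1] ** 2) ** 0.5
print(norm)   # small: the exact solution decays like e^{-t}
```

The exact solution has norm $\sqrt{2}\,e^{-10} \approx 6 \times 10^{-5}$ at $t = 10$, and the Euler approximation stays close to that.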

25. Using Problem 24, show that if $A$ is a matrix having the real parts of all eigenvalues less than 0, then if
$$\Psi'(t) = A\Psi(t), \quad \Psi(0) = I,$$
it follows that
$$\lim_{t \to \infty} \Psi(t) = 0.$$
Hint: Consider the columns of $\Psi(t)$.

From the above problem, this happens. With $\Psi$ as just defined, the solution to $x_i' = Ax_i(t)$, $x_i(0) = e_i$ is $\Psi(t)e_i$. This follows from the fact that $\Psi(t)e_i$ satisfies the equation, and from uniqueness. Thus the $i$th column of $\Psi(t)$ converges to 0. In particular, the entries of $\Psi(t)$ converge to 0, and thus $\|\Psi(t)\| \to 0$ also.

26. Let $\Psi(t)$ be a fundamental matrix satisfying
$$\Psi'(t) = A\Psi(t), \quad \Psi(0) = I.$$
Show $\Psi(t)^n = \Psi(nt)$. Hint: Subtract and show the difference satisfies $\Phi' = A\Phi$, $\Phi(0) = 0$. Use uniqueness.

27. If the real parts of the eigenvalues of $A$ are all negative, show that for every positive $t$,
$$\lim_{n \to \infty} \Psi(nt) = 0.$$
Hint: Pick $\operatorname{Re}(\sigma(A)) < -\lambda < 0$ and use Problem 18 about the spectrum of $\Psi(t)$ and Gelfand's theorem for the spectral radius, along with Problem 26, to argue that $\left\| \Psi(nt)/e^{-\lambda nt} \right\| < 1$ for all $n$ large enough.

28. Let $H$ be a Hermitian matrix ($H = H^*$). Show that $e^{iH} \equiv \sum_{n=0}^\infty \frac{(iH)^n}{n!}$ is unitary.

You have $H = U^* D U$ where $U$ is unitary and $D$ is a real diagonal matrix. Then you have
$$e^{iH} = U^* \sum_{n=0}^\infty \frac{(iD)^n}{n!} U = U^* \begin{pmatrix} e^{i\lambda_1} & & \\ & \ddots & \\ & & e^{i\lambda_n} \end{pmatrix} U,$$
and this is clearly unitary because each matrix in the product is.
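The claim is easy to check numerically (a sketch; the 2x2 Hermitian matrix `H` below is an arbitrary example, not from the text): sum the series $e^{iH} = \sum (iH)^n/n!$ and verify $(e^{iH})^* e^{iH} = I$.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(A):
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def expm(A, terms=60):
    """Matrix exponential via its Taylor series (adequate for small matrices)."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    power = [row[:] for row in result]  # running power A^k, starting at I
    fact = 1.0
    for k in range(1, terms):
        power = matmul(power, A)
        fact *= k
        result = [[result[i][j] + power[i][j] / fact for j in range(n)]
                  for i in range(n)]
    return result

H = [[2.0 + 0j, 1.0 - 1.0j],
     [1.0 + 1.0j, 3.0 + 0j]]              # Hermitian: H = H*
iH = [[1j * x for x in row] for row in H]
U = expm(iH)
UstarU = matmul(adjoint(U), U)             # should be the identity
print(UstarU)
```

The off-diagonal entries of `UstarU` come out at roundoff level and the diagonal entries are 1, confirming unitarity.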

29. Show the converse of the above exercise: if $V$ is unitary, then $V = e^{iH}$ for some $H$ Hermitian.

To do this, note first that $V$ is normal because $V^*V = VV^* = I$. Therefore, there exists a unitary $U$ such that $V = U^* D U$, where $D$ is the diagonal matrix consisting of the eigenvalues of $V$ down the main diagonal. Since $V$ is unitary, it preserves all lengths, and so each diagonal entry of $D$ has magnitude 1; write the $k$th entry as $e^{i\lambda_k}$ with $\lambda_k$ real. Thus you have
$$V = U^* \begin{pmatrix} e^{i\lambda_1} & & \\ & \ddots & \\ & & e^{i\lambda_n} \end{pmatrix} U = U^* \sum_{k=0}^\infty \frac{(i\Lambda)^k}{k!} U,$$
where $\Lambda = \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix}$, each $\lambda_k$ real. Then the above equals
$$\sum_{k=0}^\infty \frac{(i U^* \Lambda U)^k}{k!} = e^{iH},$$
where $H = U^* \Lambda U$ is obviously Hermitian.

30. If $U$ is unitary and does not have $-1$ as an eigenvalue, so that $(I+U)^{-1}$ exists, show that
$$H = i(I-U)(I+U)^{-1}$$
is Hermitian. Then verify that
$$U = (I + iH)(I - iH)^{-1}.$$

To see this, note first that $(I-U)$ and $(I+U)^{-1}$ commute. Taking the adjoint of the above operator, it is easy to see that this equals
$$-i(I+U^*)^{-1}(I-U^*) = -i(U^*U + U^*)^{-1}(U^*U - U^*) = -i(U+I)^{-1} U U^* (U-I)$$
$$= i(I+U)^{-1}(I-U) = i(I-U)(I+U)^{-1}.$$
Next, using the fact that everything commutes,
$$\left( I + i\left( i(I-U)(I+U)^{-1} \right) \right) \left( I - i\left( i(I-U)(I+U)^{-1} \right) \right)^{-1}$$
$$= \left( I - (I-U)(I+U)^{-1} \right) \left( I + (I-U)(I+U)^{-1} \right)^{-1}$$
$$= (I + U - (I-U))(I+U)^{-1} \left( (I + U + I - U)(I+U)^{-1} \right)^{-1}$$
$$= 2U(I+U)^{-1} \left( 2I(I+U)^{-1} \right)^{-1} = 2U(I+U)^{-1} \frac{1}{2}(I+U) = U.$$
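Both identities can be verified numerically (a sketch; the rotation matrix `U` below is an arbitrary unitary example without eigenvalue $-1$, not from the text): compute $H = i(I-U)(I+U)^{-1}$, check it is Hermitian, and check the Cayley transform recovers $U$.

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inv2(A):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def add(A, B, sign=1):
    return [[A[i][j] + sign * B[i][j] for j in range(2)] for i in range(2)]

def scale(c, A):
    return [[c * A[i][j] for j in range(2)] for i in range(2)]

I = [[1 + 0j, 0j], [0j, 1 + 0j]]
t = 0.7
U = [[complex(math.cos(t)), complex(-math.sin(t))],
     [complex(math.sin(t)), complex(math.cos(t))]]  # unitary; eigenvalues e^{+-it}

# H = i (I - U)(I + U)^{-1} should be Hermitian.
H = scale(1j, matmul(add(I, U, -1), inv2(add(I, U))))
assert all(abs(H[i][j] - H[j][i].conjugate()) < 1e-12
           for i in range(2) for j in range(2))

# U = (I + iH)(I - iH)^{-1} should recover U.
U2 = matmul(add(I, scale(1j, H)), inv2(add(I, scale(1j, H), -1)))
```

For this rotation, each eigenvalue $e^{\pm it}$ of $U$ maps to the real number $\tan(t/2)$ under the transform, which is why $H$ comes out Hermitian.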

31. Suppose that $A \in L(V,V)$ where $V$ is a normed linear space. Also suppose that $\|A\| < 1$, where this refers to the operator norm of $A$. Verify that
$$(I-A)^{-1} = \sum_{i=0}^\infty A^i.$$
This is called the Neumann series. Suppose now that you only know the algebraic condition $\rho(A) < 1$. Is it still the case that the Neumann series converges to $(I-A)^{-1}$?

Consider partial sums of the series:
$$\left\| \sum_{i=p}^q A^i \right\| \le \sum_{i=p}^q \|A^i\| \le \sum_{i=p}^q \|A\|^i,$$
which converges to 0 as $p, q \to \infty$ because $\sum_{i=0}^\infty \|A\|^i = \frac{1}{1 - \|A\|} < \infty$. Therefore, by completeness of $L(V,V)$, this series converges. Why is it the desired inverse? This is obvious when you multiply both sides by $(I-A)$. Thus
$$(I-A) \sum_{i=0}^\infty A^i = \lim_{n \to \infty} (I-A) \sum_{i=0}^n A^i = \lim_{n \to \infty} \left( I - A^{n+1} \right) = I,$$
because $\|A^{n+1}\| \le \|A\|^{n+1}$, and this converges to 0 as $n \to \infty$ because $\|A\| < 1$. Yes, the series still converges. Let $\rho(A) < r < R < 1$. Then, by Gelfand's theorem for the spectral radius, for all $n$ large enough,
$$\|A^n\|^{1/n} < R,$$
and so $\|A^n\| < R^n$. It follows that the partial sums of the series form a Cauchy sequence, and so the series converges.

F.34 Exercises 15.3

1. In Example 15.1.10 an eigenvalue was found correct to several decimal places along with an eigenvector. Find the other eigenvalues along with their eigenvectors.

The matrix was
$$A = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 2 & 1 \\ 3 & 1 & 4 \end{pmatrix}.$$
Raising it to a power and factoring the result as $QR$,
$$A^{20} = \begin{pmatrix} 8.4857 \times 10^{15} & 6.1904 \times 10^{15} & 1.1888 \times 10^{16} \\ 6.1904 \times 10^{15} & 4.5159 \times 10^{15} & 8.6727 \times 10^{15} \\ 1.1888 \times 10^{16} & 8.6727 \times 10^{15} & 1.6656 \times 10^{16} \end{pmatrix}$$
$$= \begin{pmatrix} 0.53492 & -0.47081 & -0.70157 \\ 0.39023 & -0.59882 & 0.69938 \\ 0.74939 & 0.64789 & 0.13660 \end{pmatrix} \begin{pmatrix} 1.5864 \times 10^{16} & 1.1573 \times 10^{16} & 2.2225 \times 10^{16} \\ 0 & 2.2129 \times 10^{11} & 8.4031 \times 10^{11} \\ 0 & 0 & 4.1605 \times 10^{11} \end{pmatrix}.$$
Then, with $Q$ the orthogonal factor just found,
$$Q^T A Q = \begin{pmatrix} 6.6621 & 1.2188 \times 10^{-4} & 7.3749 \times 10^{-5} \\ 1.2188 \times 10^{-4} & 1.1395 & -1.1569 \\ 7.3749 \times 10^{-5} & -1.1569 & -0.80149 \end{pmatrix}.$$
The largest eigenvalue is close to $6.6621$ and an eigenvector is $\begin{pmatrix} 0.53492 & 0.39023 & 0.74939 \end{pmatrix}^T$. The others can be found by looking at the lower right block. You could do the same as the above, or you could just graph the characteristic function and zoom in on the roots. They are approximately $1.7$ and $-1.3$. Next use the shifted inverse power method to find them more exactly and to also find the eigenvectors.
$$(A - 1.7 I)^{-1} = \begin{pmatrix} -0.97792 & -5.0473 & 3.47 \\ -5.0473 & -33.47 & 21.136 \\ 3.47 & 21.136 & -13.281 \end{pmatrix}$$
Rather than use this method as described, I will just use the variation of it based on the QR algorithm, which involves raising the matrix to a power and which was just used to find the first eigenvalue.
$$(A - 1.7 I)^{-20} = \begin{pmatrix} 6.0448 \times 10^{31} & 3.8934 \times 10^{32} & -2.4588 \times 10^{32} \\ 3.8934 \times 10^{32} & 2.5077 \times 10^{33} & -1.5837 \times 10^{33} \\ -2.4588 \times 10^{32} & -1.5837 \times 10^{33} & 1.0002 \times 10^{33} \end{pmatrix}$$
$$= \begin{pmatrix} 0.13015 & -3.3228 \times 10^{-2} & 0.99094 \\ 0.83832 & -0.52998 & -0.12788 \\ -0.52942 & -0.84736 & 4.1123 \times 10^{-2} \end{pmatrix} \begin{pmatrix} 4.6443 \times 10^{32} & 2.9914 \times 10^{33} & -1.8892 \times 10^{33} \\ 0 & 9.7388 \times 10^{27} & -3.8598 \times 10^{28} \\ 0 & 0 & 3.3429 \times 10^{27} \end{pmatrix}.$$
With $Q$ this new orthogonal factor,
$$Q^T (A - 1.7 I)^{-1} Q = \begin{pmatrix} -47.602 & 6.7551 \times 10^{-5} & -2.9947 \times 10^{-5} \\ 6.7551 \times 10^{-5} & 6.3205 \times 10^{-2} & -0.23290 \\ -2.9947 \times 10^{-5} & -0.23290 & -0.19038 \end{pmatrix}.$$
Thus you find the eigenvalue by solving $\frac{1}{\lambda - 1.7} = -47.602$; the solution is $\lambda = 1.6790$, and a suitable eigenvector is just the first column of the orthogonal matrix, $\begin{pmatrix} 0.13015 & 0.83832 & -0.52942 \end{pmatrix}^T$. How well does it work?
$$A \begin{pmatrix} 0.13015 \\ 0.83832 \\ -0.52942 \end{pmatrix} = \begin{pmatrix} 0.21853 \\ 1.4075 \\ -0.88891 \end{pmatrix}, \quad \begin{pmatrix} 0.13015 \\ 0.83832 \\ -0.52942 \end{pmatrix} (1.6790) = \begin{pmatrix} 0.21852 \\ 1.4075 \\ -0.88890 \end{pmatrix}.$$
This worked well.

Now find the pair which goes with $-1.3$.
$$(A + 1.3 I)^{-1} = \begin{pmatrix} -16.948 & 7.8109 & 8.1192 \\ 7.8109 & -3.2785 & -3.8027 \\ 8.1192 & -3.8027 & -3.6896 \end{pmatrix}$$
$$(A + 1.3 I)^{-20} = \begin{pmatrix} 3.8274 \times 10^{27} & -1.7455 \times 10^{27} & -1.8230 \times 10^{27} \\ -1.7455 \times 10^{27} & 7.9603 \times 10^{26} & 8.3137 \times 10^{26} \\ -1.8230 \times 10^{27} & 8.3137 \times 10^{26} & 8.6827 \times 10^{26} \end{pmatrix}$$
$$= \begin{pmatrix} 0.83483 & -0.54413 & 8.3559 \times 10^{-2} \\ -0.38073 & -0.46103 & 0.80156 \\ -0.39763 & -0.70098 & -0.59205 \end{pmatrix} \begin{pmatrix} 4.5847 \times 10^{27} & -2.0908 \times 10^{27} & -2.1837 \times 10^{27} \\ 0 & 1.6589 \times 10^{22} & 2.7929 \times 10^{22} \\ 0 & 0 & 4.6201 \times 10^{21} \end{pmatrix}.$$
With $Q$ this orthogonal factor,
$$Q^T (A + 1.3 I)^{-1} Q = \begin{pmatrix} -24.377 & 9.5979 \times 10^{-5} & 5.267 \times 10^{-5} \\ 9.5979 \times 10^{-5} & 0.12702 & -1.8021 \times 10^{-2} \\ 5.267 \times 10^{-5} & -1.8021 \times 10^{-2} & 0.33417 \end{pmatrix}.$$
Solving $\frac{1}{\lambda + 1.3} = -24.377$ gives $\lambda = -1.341$. The approximate eigenvector is the first column. How well does it work?
$$A \begin{pmatrix} 0.83483 \\ -0.38073 \\ -0.39763 \end{pmatrix} = \begin{pmatrix} -1.1195 \\ 0.51057 \\ 0.53324 \end{pmatrix}, \quad \begin{pmatrix} 0.83483 \\ -0.38073 \\ -0.39763 \end{pmatrix} (-1.341) = \begin{pmatrix} -1.1195 \\ 0.51056 \\ 0.53322 \end{pmatrix}.$$
It works well. You could do more iterations if desired.
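The power-raising step above is doing power iteration. As a cross-check (a sketch, not the text's QR-based variant), plain power iteration on the same matrix reproduces the dominant eigenpair found above:

```python
def matvec(A, v):
    return [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]

A = [[1.0, 2.0, 3.0],
     [2.0, 2.0, 1.0],
     [3.0, 1.0, 4.0]]
v = [1.0, 1.0, 1.0]
for _ in range(100):
    w = matvec(A, v)                       # multiply by A
    norm = sum(x * x for x in w) ** 0.5
    v = [x / norm for x in w]              # renormalize

Av = matvec(A, v)
lam = sum(v[i] * Av[i] for i in range(3))  # Rayleigh quotient, |v| = 1
print(lam, v)   # ~ 6.6621 and ~ (0.535, 0.390, 0.749)
```

The iteration converges quickly here because the next-largest eigenvalue magnitude (about 1.7) is well separated from 6.6621.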

2. Find the eigenvalues and eigenvectors of the matrix $A = \begin{pmatrix} 3 & 2 & 1 \\ 2 & 1 & 3 \\ 1 & 3 & 2 \end{pmatrix}$ numerically. In this case the exact eigenvalues are $\pm\sqrt{3}$ and $6$. Compare with the exact answers.

$$A^{13} = \begin{pmatrix} 4.3536 \times 10^9 & 4.3536 \times 10^9 & 4.3536 \times 10^9 \\ 4.3536 \times 10^9 & 4.3536 \times 10^9 & 4.3536 \times 10^9 \\ 4.3536 \times 10^9 & 4.3536 \times 10^9 & 4.3536 \times 10^9 \end{pmatrix}$$
$$= \begin{pmatrix} 0.57735 & -0.81650 & -1.4378 \times 10^{-29} \\ 0.57735 & 0.40825 & -0.70711 \\ 0.57735 & 0.40825 & 0.70711 \end{pmatrix} \begin{pmatrix} 7.5407 \times 10^9 & 7.5407 \times 10^9 & 7.5407 \times 10^9 \\ 0 & 1.5333 \times 10^{-19} & 1.5333 \times 10^{-19} \\ 0 & 0 & 2.7369 \times 10^{-48} \end{pmatrix}.$$
With $Q$ the orthogonal factor just found,
$$Q^T A Q = \begin{pmatrix} 6.0000 & 0 & 0 \\ -1.2622 \times 10^{-29} & 1.5 & 0.86603 \\ -2.5244 \times 10^{-29} & 0.86603 & -1.5 \end{pmatrix}.$$
An eigenvalue is $6$ and an eigenvector is $\begin{pmatrix} 0.57735 & 0.57735 & 0.57735 \end{pmatrix}^T$. Now let's find the others.
$$(A - 1.5 I)^{-1} = \begin{pmatrix} 2.7407 & -0.59259 & -1.9259 \\ -0.59259 & 7.4074 \times 10^{-2} & 0.74074 \\ -1.9259 & 0.74074 & 1.4074 \end{pmatrix}$$
$$(A - 1.5 I)^{-20} = \begin{pmatrix} 3.0341 \times 10^{12} & -8.1299 \times 10^{11} & -2.2211 \times 10^{12} \\ -8.1299 \times 10^{11} & 2.1784 \times 10^{11} & 5.9515 \times 10^{11} \\ -2.2211 \times 10^{12} & 5.9515 \times 10^{11} & 1.6260 \times 10^{12} \end{pmatrix}$$
$$= \begin{pmatrix} 0.78868 & 0.42547 & 0.44381 \\ -0.21133 & -0.49028 & 0.84556 \\ -0.57735 & 0.76066 & 0.29676 \end{pmatrix} \begin{pmatrix} 3.8471 \times 10^{12} & -1.0308 \times 10^{12} & -2.8163 \times 10^{12} \\ 0 & 3.8507 \times 10^6 & 3.8437 \times 10^7 \\ 0 & 0 & 1.9276 \times 10^7 \end{pmatrix}.$$
With $Q$ this new orthogonal factor,
$$Q^T (A - 1.5 I)^{-1} Q = \begin{pmatrix} 4.3094 & -9.6141 \times 10^{-6} & -1.2765 \times 10^{-5} \\ -9.6141 \times 10^{-6} & -0.22359 & 0.19559 \\ -1.2765 \times 10^{-5} & 0.19559 & 0.13642 \end{pmatrix}.$$
Solving $\frac{1}{\lambda - 1.5} = 4.3094$ gives $\lambda = 1.7321$, and an eigenvector is $\begin{pmatrix} 0.78868 & -0.21133 & -0.57735 \end{pmatrix}^T$. How well does it work?
$$A \begin{pmatrix} 0.78868 \\ -0.21133 \\ -0.57735 \end{pmatrix} = \begin{pmatrix} 1.366 \\ -0.36602 \\ -1.0 \end{pmatrix}, \quad \begin{pmatrix} 0.78868 \\ -0.21133 \\ -0.57735 \end{pmatrix} (1.7321) = \begin{pmatrix} 1.3661 \\ -0.36604 \\ -1.0 \end{pmatrix}.$$
Next you can find the other one.
$$(A + 1.5 I)^{-1} = \begin{pmatrix} 4.4444 \times 10^{-2} & 0.71111 & -0.62222 \\ 0.71111 & -2.6222 & 2.0444 \\ -0.62222 & 2.0444 & -1.2889 \end{pmatrix}$$
$$(A + 1.5 I)^{-20} = \begin{pmatrix} 2.1784 \times 10^{11} & -8.1299 \times 10^{11} & 5.9515 \times 10^{11} \\ -8.1299 \times 10^{11} & 3.0341 \times 10^{12} & -2.2211 \times 10^{12} \\ 5.9515 \times 10^{11} & -2.2211 \times 10^{12} & 1.6260 \times 10^{12} \end{pmatrix}$$
$$= \begin{pmatrix} 0.21132 & -0.49028 & 0.84556 \\ -0.78868 & 0.42547 & 0.44381 \\ 0.57735 & 0.76066 & 0.29676 \end{pmatrix} \begin{pmatrix} 1.0308 \times 10^{12} & -3.8471 \times 10^{12} & 2.8163 \times 10^{12} \\ 0 & 1.4371 \times 10^7 & 2.7917 \times 10^7 \\ 0 & 0 & 1.9276 \times 10^7 \end{pmatrix}.$$
With $Q$ this orthogonal factor,
$$Q^T (A + 1.5 I)^{-1} Q = \begin{pmatrix} -4.3094 & -2.4683 \times 10^{-6} & -1.1678 \times 10^{-5} \\ -2.4683 \times 10^{-6} & 0.28095 & -6.4794 \times 10^{-2} \\ -1.1678 \times 10^{-5} & -6.4794 \times 10^{-2} & 0.16174 \end{pmatrix}.$$
Solving $\frac{1}{\lambda + 1.5} = -4.3094$ gives $\lambda = -1.7321$, and an eigenvector is $\begin{pmatrix} 0.21132 & -0.78868 & 0.57735 \end{pmatrix}^T$. How well does it work?
$$(A + 1.7321 I) \begin{pmatrix} 0.21132 \\ -0.78868 \\ 0.57735 \end{pmatrix} = \begin{pmatrix} -2.2628 \times 10^{-5} \\ -6.2628 \times 10^{-5} \\ 7.935 \times 10^{-6} \end{pmatrix},$$
which is pretty close to 0, so this worked well.
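The shifted inverse power method used above can also be sketched without forming the inverse explicitly, by solving $(A - \sigma I)x = v$ at each step (an illustrative reimplementation, not the text's QR-based variant). With shift $\sigma = 1.5$ it converges to the eigenvalue nearest the shift, $\sqrt{3} \approx 1.7321$:

```python
def solve3(M, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [b[i]] for i, row in enumerate(M)]   # augmented matrix
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for c in range(col, 4):
                M[r][c] -= f * M[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):                                 # back substitution
        x[r] = (M[r][3] - sum(M[r][c] * x[c] for c in range(r + 1, 3))) / M[r][r]
    return x

A = [[3.0, 2.0, 1.0],
     [2.0, 1.0, 3.0],
     [1.0, 3.0, 2.0]]
sigma = 1.5
B = [[A[i][j] - (sigma if i == j else 0.0) for j in range(3)] for i in range(3)]
v = [1.0, 0.0, 0.0]
for _ in range(50):
    w = solve3(B, v)                       # one inverse-power step
    norm = sum(x * x for x in w) ** 0.5
    v = [x / norm for x in w]

Av = [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]
lam = sum(v[i] * Av[i] for i in range(3))  # Rayleigh quotient, |v| = 1
print(lam)   # ~ 1.7321 = sqrt(3)
```

Solving the shifted system each step is the standard way to run this method in practice, since it avoids computing and storing an explicit inverse.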

3. Find the eigenvalues and eigenvectors of the matrix $A = \begin{pmatrix} 3 & 2 & 1 \\ 2 & 5 & 3 \\ 1 & 3 & 2 \end{pmatrix}$ numerically. The exact eigenvalues are $2$, $4 + \sqrt{15}$, and $4 - \sqrt{15}$. Compare your numerical results with the exact values. Is it much fun to compute the exact eigenvectors?

$$A^{20} = \begin{pmatrix} 1.4487 \times 10^{17} & 2.7135 \times 10^{17} & 1.6328 \times 10^{17} \\ 2.7135 \times 10^{17} & 5.0823 \times 10^{17} & 3.0581 \times 10^{17} \\ 1.6328 \times 10^{17} & 3.0581 \times 10^{17} & 1.8401 \times 10^{17} \end{pmatrix}$$
$$= \begin{pmatrix} 0.41599 & 0.80658 & 0.41997 \\ 0.77918 & -7.8039 \times 10^{-2} & -0.62192 \\ 0.46886 & -0.58595 & 0.66094 \end{pmatrix} \begin{pmatrix} 3.4825 \times 10^{17} & 6.5226 \times 10^{17} & 3.9248 \times 10^{17} \\ 0 & 1.539 \times 10^{13} & 1.3241 \times 10^{13} \\ 0 & 0 & 1.3999 \times 10^{12} \end{pmatrix}.$$
With $Q$ the orthogonal factor just found,
$$Q^T A Q = \begin{pmatrix} 7.8730 & 8.7689 \times 10^{-5} & 3.2956 \times 10^{-5} \\ 8.7689 \times 10^{-5} & 1.7462 & 0.64105 \\ 3.2956 \times 10^{-5} & 0.64105 & 0.38082 \end{pmatrix}.$$
An eigenvalue is $7.8730$ and the approximate eigenvector is $\begin{pmatrix} 0.41599 & 0.77918 & 0.46886 \end{pmatrix}^T$. How well does it work?
$$(A - 7.8730 I) \begin{pmatrix} 0.41599 \\ 0.77918 \\ 0.46886 \end{pmatrix} = \begin{pmatrix} 1.0073 \times 10^{-4} \\ -2.414 \times 10^{-5} \\ -8.478 \times 10^{-5} \end{pmatrix}.$$
Note that $4 + \sqrt{15} = 7.8730$, which is what was just obtained, to four decimal places. It works pretty well. Next we find the other eigenvalues and the eigenvectors which go with them.
$$(A - 1.75 I)^{-1} = \begin{pmatrix} 3.2956 & -1.0063 & -1.1069 \\ -1.0063 & 0.27673 & 0.7044 \\ -1.1069 & 0.7044 & -2.5157 \times 10^{-2} \end{pmatrix}$$
$$(A - 1.75 I)^{-20} = \begin{pmatrix} 8.9959 \times 10^{11} & -2.9987 \times 10^{11} & -2.9986 \times 10^{11} \\ -2.9987 \times 10^{11} & 9.9956 \times 10^{10} & 9.9954 \times 10^{10} \\ -2.9986 \times 10^{11} & 9.9954 \times 10^{10} & 9.9952 \times 10^{10} \end{pmatrix}$$
$$= \begin{pmatrix} 0.90453 & -0.40472 & 0.13423 \\ -0.30152 & -0.82969 & -0.46980 \\ -0.30151 & -0.38447 & 0.87251 \end{pmatrix} \begin{pmatrix} 9.9454 \times 10^{11} & -3.3152 \times 10^{11} & -3.3151 \times 10^{11} \\ 0 & 2.9950 \times 10^6 & 1.376 \times 10^6 \\ 0 & 0 & 5.3688 \times 10^5 \end{pmatrix}.$$
With $Q$ this new orthogonal factor,
$$Q^T (A - 1.75 I)^{-1} Q = \begin{pmatrix} 4.0000 & 1.8288 \times 10^{-6} & -8.3714 \times 10^{-6} \\ 1.8288 \times 10^{-6} & 0.15570 & -7.6692 \times 10^{-2} \\ -8.3714 \times 10^{-6} & -7.6692 \times 10^{-2} & -0.60853 \end{pmatrix}.$$
Solving $\frac{1}{\lambda - 1.75} = 4$ gives $\lambda = 2.0$. An eigenvector is then $\begin{pmatrix} 0.90453 & -0.30152 & -0.30151 \end{pmatrix}^T$. How well does it work?
$$(A - 2 I) \begin{pmatrix} 0.90453 \\ -0.30152 \\ -0.30151 \end{pmatrix} = \begin{pmatrix} -0.00002 \\ -0.00003 \\ -0.00003 \end{pmatrix}.$$
$$(A - 0.38 I)^{-1} = \begin{pmatrix} 0.49354 & 7.8154 \times 10^{-2} & -0.44938 \\ 7.8154 \times 10^{-2} & -1.0565 & 1.9083 \\ -0.44938 & 1.9083 & -2.6391 \end{pmatrix}$$
$$(A - 0.38 I)^{-20} = \begin{pmatrix} 7.5938 \times 10^9 & -4.4599 \times 10^{10} & 6.738 \times 10^{10} \\ -4.4599 \times 10^{10} & 2.6194 \times 10^{11} & -3.9573 \times 10^{11} \\ 6.738 \times 10^{10} & -3.9573 \times 10^{11} & 5.9787 \times 10^{11} \end{pmatrix}$$
$$= \begin{pmatrix} 9.3567 \times 10^{-2} & 0.10743 & -0.9898 \\ -0.54953 & 0.83458 & 3.8635 \times 10^{-2} \\ 0.83022 & 0.54031 & 0.13712 \end{pmatrix} \begin{pmatrix} 8.1159 \times 10^{10} & -4.7666 \times 10^{11} & 7.2013 \times 10^{11} \\ 0 & 4.4183 \times 10^6 & 1.3808 \times 10^6 \\ 0 & 0 & 6.6636 \times 10^5 \end{pmatrix}.$$
With $Q$ this orthogonal factor,
$$Q^T (A - 0.38 I)^{-1} Q = \begin{pmatrix} -3.9529 & -2.0444 \times 10^{-5} & 1.0214 \times 10^{-5} \\ -2.0444 \times 10^{-5} & 0.18225 & 0.14562 \\ 1.0214 \times 10^{-5} & 0.14562 & 0.56855 \end{pmatrix}.$$
Solving $\frac{1}{\lambda - 0.38} = -3.9529$ gives $\lambda = 0.12702$, and an approximate eigenvector is then $\begin{pmatrix} 9.3567 \times 10^{-2} & -0.54953 & 0.83022 \end{pmatrix}^T$. How well does it work?
$$(A - 0.12702 I) \begin{pmatrix} 9.3567 \times 10^{-2} \\ -0.54953 \\ 0.83022 \end{pmatrix} = \begin{pmatrix} -2.388 \times 10^{-5} \\ -5.4699 \times 10^{-5} \\ -3.7544 \times 10^{-5} \end{pmatrix}.$$
Note that $4 - \sqrt{15} = 0.12702$, which is what was just obtained.

4. Find the eigenvalues and eigenvectors of the matrix $A = \begin{pmatrix} 0 & 2 & 1 \\ 2 & 5 & 3 \\ 1 & 3 & 2 \end{pmatrix}$ numerically. I don't know the exact eigenvalues in this case. Check your answers by multiplying your numerically computed eigenvectors by the matrix.

$$A^{50} = \begin{pmatrix} 5.0454 \times 10^{42} & 1.4544 \times 10^{43} & 8.8269 \times 10^{42} \\ 1.4544 \times 10^{43} & 4.1923 \times 10^{43} & 2.5444 \times 10^{43} \\ 8.8269 \times 10^{42} & 2.5444 \times 10^{43} & 1.5442 \times 10^{43} \end{pmatrix}$$
$$= \begin{pmatrix} 0.28432 & 0.75678 & 0.5886 \\ 0.81959 & -0.51039 & 0.26032 \\ 0.49742 & 0.40840 & -0.76536 \end{pmatrix} \begin{pmatrix} 1.7745 \times 10^{43} & 5.1151 \times 10^{43} & 3.1045 \times 10^{43} \\ 0 & 7.0918 \times 10^{38} & 8.0729 \times 10^{37} \\ 0 & 0 & 3.0075 \times 10^{38} \end{pmatrix}.$$
A matrix similar to the original is, with $Q$ the orthogonal factor just found,
$$Q^T A Q = \begin{pmatrix} 7.5145 & 9.1309 \times 10^{-5} & 1.2487 \times 10^{-5} \\ 9.1309 \times 10^{-5} & -0.54146 & -0.34428 \\ 1.2487 \times 10^{-5} & -0.34428 & 2.6869 \times 10^{-2} \end{pmatrix}.$$
Thus an eigenvalue is $7.5145$ and an eigenvector is approximately $\begin{pmatrix} 0.28432 & 0.81959 & 0.49742 \end{pmatrix}^T$. How well does it work?
$$A \begin{pmatrix} 0.28432 \\ 0.81959 \\ 0.49742 \end{pmatrix} = \begin{pmatrix} 2.1366 \\ 6.1589 \\ 3.7379 \end{pmatrix}, \quad \begin{pmatrix} 0.28432 \\ 0.81959 \\ 0.49742 \end{pmatrix} (7.5145) = \begin{pmatrix} 2.1365 \\ 6.1588 \\ 3.7379 \end{pmatrix}.$$
This has essentially found it. However, we don't know the others yet. Begin with the one which is closest to $-0.54146$.
$$(A + 0.54146 I)^{-1} = \begin{pmatrix} -5.3238 & 2.1814 & -0.48023 \\ 2.1814 & -0.39389 & -0.39338 \\ -0.48023 & -0.39338 & 1.0468 \end{pmatrix}$$
$$(A + 0.54146 I)^{-20} = \begin{pmatrix} 5.4834 \times 10^{15} & -2.0558 \times 10^{15} & 2.5304 \times 10^{14} \\ -2.0558 \times 10^{15} & 7.7077 \times 10^{14} & -9.4870 \times 10^{13} \\ 2.5304 \times 10^{14} & -9.4870 \times 10^{13} & 1.1677 \times 10^{13} \end{pmatrix}$$
$$= \begin{pmatrix} 0.93548 & 0.35306 & 1.4916 \times 10^{-2} \\ -0.35073 & 0.93281 & -8.2863 \times 10^{-2} \\ 4.3169 \times 10^{-2} & -7.2285 \times 10^{-2} & -0.99645 \end{pmatrix} \begin{pmatrix} 5.8616 \times 10^{15} & -2.1976 \times 10^{15} & 2.7049 \times 10^{14} \\ 0 & 2.1671 \times 10^{10} & -1.7962 \times 10^9 \\ 0 & 0 & 8.2705 \times 10^7 \end{pmatrix}.$$
With $Q$ this new orthogonal factor,
$$Q^T (A + 0.54146 I)^{-1} Q = \begin{pmatrix} -6.1638 & 1.7420 \times 10^{-5} & -1.1593 \times 10^{-6} \\ 1.7420 \times 10^{-5} & 0.51351 & 0.57710 \\ -1.1593 \times 10^{-6} & 0.57710 & 0.97941 \end{pmatrix}.$$
Solving $\frac{1}{\lambda + 0.54146} = -6.1638$ gives $\lambda = -0.70370$, with approximate eigenvector $\begin{pmatrix} 0.93548 & -0.35073 & 4.3169 \times 10^{-2} \end{pmatrix}^T$. How well does it work?
$$(A + 0.70370 I) \begin{pmatrix} 0.93548 \\ -0.35073 \\ 4.3169 \times 10^{-2} \end{pmatrix} = \begin{pmatrix} 6.276 \times 10^{-6} \\ 8.299 \times 10^{-6} \\ 6.0253 \times 10^{-6} \end{pmatrix}.$$
Next find the eigenvalue closest to 0.
$$A^{-1} = \begin{pmatrix} -1.0 & 1.0 & -1.0 \\ 1.0 & 1.0 & -2.0 \\ -1.0 & -2.0 & 4.0 \end{pmatrix}$$
$$A^{-20} = \begin{pmatrix} 1.2871 \times 10^{13} & 2.7788 \times 10^{13} & -5.3143 \times 10^{13} \\ 2.7788 \times 10^{13} & 5.9996 \times 10^{13} & -1.1474 \times 10^{14} \\ -5.3143 \times 10^{13} & -1.1474 \times 10^{14} & 2.1942 \times 10^{14} \end{pmatrix}$$
$$= \begin{pmatrix} 0.20985 & -0.93085 & 0.29913 \\ 0.45305 & -0.17855 & -0.87342 \\ -0.86643 & -0.31881 & -0.38426 \end{pmatrix} \begin{pmatrix} 6.1335 \times 10^{13} & 1.3243 \times 10^{14} & -2.5325 \times 10^{14} \\ 0 & 1.5093 \times 10^9 & 1.7069 \times 10^9 \\ 0 & 0 & 6.1923 \times 10^9 \end{pmatrix}.$$
With $Q$ this orthogonal factor,
$$Q^T A^{-1} Q = \begin{pmatrix} 5.2880 & 2.0179 \times 10^{-5} & -2.8563 \times 10^{-5} \\ 2.0179 \times 10^{-5} & -0.91686 & 0.72758 \\ -2.8563 \times 10^{-5} & 0.72758 & -0.37112 \end{pmatrix}.$$
Solving $\frac{1}{\lambda} = 5.2880$ gives $\lambda = 0.18911$. An approximate eigenvector is $\begin{pmatrix} 0.20985 & 0.45305 & -0.86643 \end{pmatrix}^T$. How well does it work?
$$(A - 0.18911 I) \begin{pmatrix} 0.20985 \\ 0.45305 \\ -0.86643 \end{pmatrix} = \begin{pmatrix} -1.4734 \times 10^{-5} \\ -1.6286 \times 10^{-5} \\ -9.4227 \times 10^{-6} \end{pmatrix}.$$

5. Find the eigenvalues and eigenvectors of the matrix A =

0 2 12 0 31 3 2

numerically.

I don’t know the exact eigenvalues in this case. Check your answers by multiplyingyour numerically computed eigenvectors by the matrix.

I will do this one a little differently. You can easily find that the eigenvalues areapproximately 4.9,−2.7, .3. To find the eigen pair which goes with .3 to the following.

0 2 12 0 31 3 2

− .3

1 0 00 1 00 0 1

−1

=

Saylor URL: http://www.saylor.org/courses/ma212/ The Saylor Foundation

Page 178: SolutionsManual,LinearAlgebraTheoryAnd Applications · SolutionsManual,LinearAlgebraTheoryAnd Applications ... Factor x3 +8 as a product of linear factors. ... Find the solutions

178 Exercises

−1. 138 5 −4. 788 7× 10−2 0.754 22−4. 788 7× 10−2 −0.180 77 0.347 18

0.754 22 0.347 18 −0.468 10

−1. 138 5 −4. 788 7× 10−2 0.754 22−4. 788 7× 10−2 −0.180 77 0.347 18

0.754 22 0.347 18 −0.468 10

20

=

17805. 3431. 1 −12214.3431. 1 661. 21 −2353. 7−12214. −2353. 7 8378. 3

=

0.814 41 −0.321 53 −0.483 080.156 94 0.923 48 −0.350 07−0.558 67 −0.209 28 −0.802 55

21863. 4213.0 −14997.0 2. 261 7× 10−2 6. 348 1× 10−2

0 0 0.281 26

A similar matrix is

0.814 41 −0.321 53 −0.483 080.156 94 0.923 48 −0.350 07−0.558 67 −0.209 28 −0.802 55

T

−1. 138 5 −4. 788 7× 10−2 0.754 22−4. 788 7× 10−2 −0.180 77 0.347 18

0.754 22 0.347 18 −0.468 10

0.814 41 −0.321 53 −0.483 080.156 94 0.923 48 −0.350 07−0.558 67 −0.209 28 −0.802 55

=

−1. 665 1 7. 437 4× 10−6 1. 163 3× 10−5

7. 437 4× 10−6 −0.296 62 −0.142 051. 163 3× 10−5 −0.142 05 0.174 36

Thus an eigenvalue is close to −1.6651 and so the eigenvalue for the original matrixis the solution of 1

λ−.3 = −1.6651, Solution is: −0.300 56. Then what about theeigenvector? We have

(A− .3I)−1v =

1

λ− .3v

It follows that (λ− .3)v = (A− .3I)v. Thus you have Av = λv. Thus an eigenvectoris just the first column of that orthogonal matrix above. Check it.

0 2 12 0 31 3 2

0.814 410.156 94−0.558 67

=

−0.244 79−0.047 190.167 89

0.814 410.156 94−0.558 67

(−0.300 56) =

−0.244 78−4. 717 0× 10−2

0.167 91

This worked very well. Now lets find the pair associated with the eigenvalue beingclose to −2.7.

0 2 12 0 31 3 2

+ 2.7

1 0 00 1 00 0 1

−1

Saylor URL: http://www.saylor.org/courses/ma212/ The Saylor Foundation

Page 179: SolutionsManual,LinearAlgebraTheoryAnd Applications · SolutionsManual,LinearAlgebraTheoryAnd Applications ... Factor x3 +8 as a product of linear factors. ... Find the solutions

Exercises 179

=

7. 969 8 −13. 823 7. 127 4−13. 823 25. 248 −13. 1757. 127 4 −13. 175 7. 105 8

7. 969 8 −13. 823 7. 127 4−13. 823 25. 248 −13. 1757. 127 4 −13. 175 7. 105 8

20

=

1. 896 8× 1031 −3. 436 7× 1031 1. 799 7× 1031

−3. 436 7× 1031 6. 226 6× 1031 −3. 260 7× 1031

1. 799 7× 1031 −3. 260 7× 1031 1. 707 5× 1031

=

0.439 25 −0.891 55 0.110 45−0.795 85 −0.443 21 −0.412 550.416 76 9. 330 8× 10−2 −0.904 21

4. 318 3× 1031 −7. 823 9× 1031 4. 097 2× 1031

0 7. 478 5× 1026 −3. 772 7× 1026

0 0 3. 494 4× 1026

0.439 25 −0.891 55 0.110 45−0.795 85 −0.443 21 −0.412 550.416 76 9. 330 8× 10−2 −0.904 21

T

7. 969 8 −13. 823 7. 127 4−13. 823 25. 248 −13. 1757. 127 4 −13. 175 7. 105 8

0.439 25 −0.891 55 0.110 45−0.795 85 −0.443 21 −0.412 550.416 76 9. 330 8× 10−2 −0.904 21

=

39. 777 −1. 791 2× 10−4 6. 881× 10−5

−1. 791 2× 10−4 0.336 06 −0.128 956. 881× 10−5 −0.128 95 0.210 75

Eigenvalue, 1λ+2.7 = 39. 777, Solution is: −2. 674 9. The eigenvector would be the first

column of the above orthogonal matrix on the right.

0 2 12 0 31 3 2

0.439 25−0.795 850.416 76

=

−1. 174 92. 128 8−1. 114 8

0.439 25−0.795 850.416 76

(−2. 674 9) =

−1. 174 92. 128 8−1. 114 8

This worked very well.

Now consider the largest eigenvalue.

0 2 12 0 31 3 2

− 4.9

1 0 00 1 00 0 1

−1

=

1. 753 6 2. 962 0 3. 668 82. 962 0 4. 446 3 5. 6213. 668 8 5. 621 6. 735 1

1. 753 6 2. 962 0 3. 668 82. 962 0 4. 446 3 5. 6213. 668 8 5. 621 6. 735 1

40

=

1. 144 5× 1044 1. 765 1× 1044 2. 164 3× 1044

1. 765 1× 1044 2. 722 1× 1044 3. 337 8× 1044

2. 164 3× 1044 3. 337 8× 1044 4. 092 8× 1044

Saylor URL: http://www.saylor.org/courses/ma212/ The Saylor Foundation

Page 180: SolutionsManual,LinearAlgebraTheoryAnd Applications · SolutionsManual,LinearAlgebraTheoryAnd Applications ... Factor x3 +8 as a product of linear factors. ... Find the solutions

180 Exercises

0.379 20 0.739 6 −0.556 060.584 81 −0.657 26 −0.475 400.717 08 0.144 92 0.681 76

3. 018 2× 1044 4. 654 7× 1044 5. 707 5× 1044

0 6. 539 3× 1039 5. 579 8× 1039

0 0 4. 797 6× 1039

0.379 20 0.739 6 −0.556 060.584 81 −0.657 26 −0.475 400.717 08 0.144 92 0.681 76

T

1. 753 6 2. 962 0 3. 668 82. 962 0 4. 446 3 5. 6213. 668 8 5. 621 6. 735 1

0.379 20 0.739 6 −0.556 060.584 81 −0.657 26 −0.475 400.717 08 0.144 92 0.681 76

=

13. 259 1. 247 5× 10−4 1. 129 2× 10−5

1. 247 5× 10−4 −0.142 61 2. 290 2× 10−2

1. 129 2× 10−5 2. 290 2× 10−2 −0.181 74

1λ−4.9 = 13. 259, Solution is: 4. 975 4. Check this.

\[
\begin{pmatrix} 0 & 2 & 1 \\ 2 & 0 & 3 \\ 1 & 3 & 2 \end{pmatrix}
\begin{pmatrix} 0.37920 \\ 0.58481 \\ 0.71708 \end{pmatrix}
= \begin{pmatrix} 1.8867 \\ 2.9096 \\ 3.5678 \end{pmatrix},
\qquad
\begin{pmatrix} 0.37920 \\ 0.58481 \\ 0.71708 \end{pmatrix} (4.9754)
= \begin{pmatrix} 1.8867 \\ 2.9097 \\ 3.5678 \end{pmatrix}
\]

It worked well.
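The computation above is the shifted inverse power method: invert A − σI, raise the inverse to a high power, QR-factor the result, and read the eigenvalue nearest the shift off the (1,1) entry of Qᵀ(A − σI)⁻¹Q. A minimal numpy sketch of that procedure, using the matrix and shifts of this problem (the function name `shifted_inverse_eig` is ours, not the book's):

```python
import numpy as np

def shifted_inverse_eig(A, sigma, power=40):
    """Estimate the eigenvalue of A nearest the shift sigma by the
    procedure used above: B = (A - sigma*I)^(-1) is raised to a high
    power, the result is QR-factored, and Q^T B Q is nearly diagonal
    with 1/(lambda - sigma) in its (1,1) entry."""
    n = A.shape[0]
    B = np.linalg.inv(A - sigma * np.eye(n))
    Q, _ = np.linalg.qr(np.linalg.matrix_power(B, power))
    D = Q.T @ B @ Q
    lam = sigma + 1.0 / D[0, 0]   # solve 1/(lam - sigma) = D[0, 0]
    return lam, Q[:, 0]           # eigenvalue and approximate eigenvector

A = np.array([[0., 2., 1.],
              [2., 0., 3.],
              [1., 3., 2.]])
lam, v = shifted_inverse_eig(A, 4.9)    # shift 4.9 gives lambda near 4.9754
lam2, _ = shifted_inverse_eig(A, -2.7)  # shift -2.7 gives lambda near -2.6749
```

Up to the signs of the columns of Q, this reproduces the hand computations above.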

6. Consider the matrix

\[
A = \begin{pmatrix} 3 & 2 & 3 \\ 2 & 1 & 4 \\ 3 & 4 & 0 \end{pmatrix}
\]

and the vector (1, 1, 1)^T. Find the shortest distance between the Rayleigh quotient determined by this vector and some eigenvalue of A.

\[
q = \frac{1}{3} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}^{T}
\begin{pmatrix} 3 & 2 & 3 \\ 2 & 1 & 4 \\ 3 & 4 & 0 \end{pmatrix}
\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}
= 7.3333
\]

\[
|7.3333 - \lambda_q| \le
\frac{ \left| \begin{pmatrix} 3 & 2 & 3 \\ 2 & 1 & 4 \\ 3 & 4 & 0 \end{pmatrix}
\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}
- 7.3333 \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \right| }{ \sqrt{3} }
= 0.47141
\]

Thus there is an eigenvalue within 0.47141 of 7.3333.
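The bound used in Problems 6–8 — the Rayleigh quotient q = xᵀAx / xᵀx lies within |Ax − qx| / |x| of some eigenvalue of the symmetric matrix A — is easy to check numerically. A small sketch with the matrix of this problem (the helper name `rayleigh_bound` is ours):

```python
import numpy as np

def rayleigh_bound(A, x):
    """Rayleigh quotient q = x^T A x / x^T x together with the error
    bound |A x - q x| / |x|, which bounds the distance from q to the
    nearest eigenvalue of the symmetric matrix A."""
    q = (x @ A @ x) / (x @ x)
    err = np.linalg.norm(A @ x - q * x) / np.linalg.norm(x)
    return q, err

A = np.array([[3., 2., 3.],
              [2., 1., 4.],
              [3., 4., 0.]])
x = np.ones(3)
q, err = rayleigh_bound(A, x)   # q = 22/3 = 7.3333..., err = 0.47140...
```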

7. Consider the matrix

\[
A = \begin{pmatrix} 1 & 2 & 1 \\ 2 & 1 & 4 \\ 1 & 4 & 5 \end{pmatrix}
\]

and the vector (1, 1, 1)^T. Find the shortest distance between the Rayleigh quotient determined by this vector and some eigenvalue of A.


\[
q = \frac{1}{3} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}^{T}
\begin{pmatrix} 1 & 2 & 1 \\ 2 & 1 & 4 \\ 1 & 4 & 5 \end{pmatrix}
\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}
= 7
\]

\[
|7 - \lambda_q| \le
\frac{ \left| \begin{pmatrix} 1 & 2 & 1 \\ 2 & 1 & 4 \\ 1 & 4 & 5 \end{pmatrix}
\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}
- 7 \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \right| }{ \sqrt{3} }
= 2.4495
\]

8. Consider the matrix

\[
A = \begin{pmatrix} 3 & 2 & 3 \\ 2 & 6 & 4 \\ 3 & 4 & -3 \end{pmatrix}
\]

and the vector (1, 1, 1)^T. Find the shortest distance between the Rayleigh quotient determined by this vector and some eigenvalue of A.

\[
q = \frac{1}{3} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}^{T}
\begin{pmatrix} 3 & 2 & 3 \\ 2 & 6 & 4 \\ 3 & 4 & -3 \end{pmatrix}
\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}
= 8.0
\]

\[
|\lambda_q - 8| \le
\frac{ \left| \begin{pmatrix} 3 & 2 & 3 \\ 2 & 6 & 4 \\ 3 & 4 & -3 \end{pmatrix}
\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}
- 8 \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \right| }{ \sqrt{3} }
= 3.2660
\]

9. Using Gerschgorin's theorem, find upper and lower bounds for the eigenvalues of

\[
A = \begin{pmatrix} 3 & 2 & 3 \\ 2 & 6 & 4 \\ 3 & 4 & -3 \end{pmatrix}.
\]

The Gerschgorin intervals are [3 − 5, 3 + 5] = [−2, 8], [6 − 6, 6 + 6] = [0, 12], and [−3 − 7, −3 + 7] = [−10, 4], so −10 ≤ λ ≤ 12.
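These bounds come from intersecting the real line with the Gerschgorin disks: each eigenvalue lies within r_i = Σ_{j≠i} |a_ij| of the diagonal entry a_ii, and since A is symmetric the eigenvalues are real. A quick sketch of the computation (the helper name is ours):

```python
import numpy as np

def gerschgorin_bounds(A):
    """Lower and upper bounds on the (real) eigenvalues of a symmetric
    matrix from the union of its Gerschgorin intervals."""
    d = np.diag(A)
    r = np.sum(np.abs(A), axis=1) - np.abs(d)   # off-diagonal row sums
    return (d - r).min(), (d + r).max()

A = np.array([[3., 2., 3.],
              [2., 6., 4.],
              [3., 4., -3.]])
lo, hi = gerschgorin_bounds(A)   # (-10.0, 12.0), matching the answer above
```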

10. Tell how to find a matrix whose characteristic polynomial is a given monic polynomial. This is called a companion matrix. Find the roots of the polynomial x³ + 7x² + 3x + 7.

For p(x) = x³ + a₂x² + a₁x + a₀, place ones on the subdiagonal and make the last column (−a₀, −a₁, −a₂)^T; the characteristic polynomial of the resulting matrix is p(x). Here the companion matrix is

\[
\begin{pmatrix} 0 & 0 & -7 \\ 1 & 0 & -3 \\ 0 & 1 & -7 \end{pmatrix}
\]

\[
\begin{pmatrix} 0 & 0 & -7 \\ 1 & 0 & -3 \\ 0 & 1 & -7 \end{pmatrix}^{20}
= \begin{pmatrix} 8.0623\times 10^{14} & -5.4085\times 10^{15} & 3.6282\times 10^{16} \\ 2.2535\times 10^{14} & -1.5117\times 10^{15} & 1.0141\times 10^{16} \\ 7.7264\times 10^{14} & -5.1832\times 10^{15} & 3.477\times 10^{16} \end{pmatrix}
\]

\[
= \begin{pmatrix} 0.70772 & 0.25858 & 0.65747 \\ 0.19782 & 0.82086 & -0.53578 \\ 0.67823 & -0.50924 & -0.52979 \end{pmatrix}
\begin{pmatrix} 1.1392\times 10^{15} & -7.6422\times 10^{15} & 5.1266\times 10^{16} \\ 0 & 4.5704\times 10^{10} & 2.0791\times 10^{10} \\ 0 & 0 & 3.1514\times 10^{11} \end{pmatrix}
\]

\[
\begin{pmatrix} 0.70772 & 0.25858 & 0.65747 \\ 0.19782 & 0.82086 & -0.53578 \\ 0.67823 & -0.50924 & -0.52979 \end{pmatrix}^{T}
\begin{pmatrix} 0 & 0 & -7 \\ 1 & 0 & -3 \\ 0 & 1 & -7 \end{pmatrix}
\begin{pmatrix} 0.70772 & 0.25858 & 0.65747 \\ 0.19782 & 0.82086 & -0.53578 \\ 0.67823 & -0.50924 & -0.52979 \end{pmatrix}
\]
\[
= \begin{pmatrix} -6.7083 & 5.8506 & 5.2209 \\ 4.1472\times 10^{-5} & 0.15476 & 1.1876 \\ -1.3916\times 10^{-5} & -0.93681 & -0.44646 \end{pmatrix}
\]

Clearly a real root is close to −6.7083. The other roots can then be obtained from the lower right 2×2 block:

\[
\begin{pmatrix} 0.15476 & 1.1876 \\ -0.93681 & -0.44646 \end{pmatrix},
\quad \text{eigenvalues: } -0.14585 + 1.011i,\; -0.14585 - 1.011i
\]

How well do these work? Try the last one. Evaluating the polynomial at this value of x gives

\[
6.3505\times 10^{-4} + 2.0658\times 10^{-4} i
\]
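The construction above generalizes to any degree: ones on the subdiagonal, negated coefficients down the last column, and the eigenvalues of the companion matrix are the roots of the polynomial. A sketch that uses numpy's eigenvalue routine in place of the hand-run QR iteration (the helper name `companion` is ours):

```python
import numpy as np

def companion(coeffs):
    """Companion matrix of the monic polynomial
    x^n + coeffs[0]*x^(n-1) + ... + coeffs[-1], in the form used
    above: ones on the subdiagonal and the negated coefficients
    (constant term first) down the last column."""
    n = len(coeffs)
    C = np.zeros((n, n))
    C[1:, :-1] = np.eye(n - 1)
    C[:, -1] = -np.array(coeffs[::-1])   # constant term in the top-right
    return C

# x^3 + 7x^2 + 3x + 7, as in the exercise
C = companion([7.0, 3.0, 7.0])
roots = np.linalg.eigvals(C)   # one real root near -6.7083, plus a complex pair
```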

11. Find the roots of x⁴ + 3x³ + 4x² + x + 1. It has two pairs of complex conjugate roots.

\[
\begin{pmatrix} 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & -4 \\ 0 & 0 & 1 & -3 \end{pmatrix}^{20}
= \begin{pmatrix} 36482 & -27300 & -49899 & 2.4489\times 10^{5} \\ 14010 & 9182 & -77199 & 1.9499\times 10^{5} \\ 1.3180\times 10^{5} & -95190 & -1.9041\times 10^{5} & 9.0235\times 10^{5} \\ 27300 & 49899 & -2.4489\times 10^{5} & 5.4425\times 10^{5} \end{pmatrix}
\]

\[
= \begin{pmatrix} 0.26030 & -6.9651\times 10^{-2} & -0.94185 & 0.20079 \\ 9.9960\times 10^{-2} & 0.25172 & 0.20932 & 0.93959 \\ 0.94038 & -0.20295 & 0.25314 & -0.10207 \\ 0.19478 & 0.94371 & -7.0906\times 10^{-2} & -0.25775 \end{pmatrix}
\begin{pmatrix} 1.4016\times 10^{5} & -85984 & -2.4746\times 10^{5} & 1.0378\times 10^{6} \\ 0 & 70622 & -2.0842\times 10^{5} & 3.6251\times 10^{5} \\ 0 & 0 & 2.1305 & -5.038 \\ 0 & 0 & 0 & 2.6834 \end{pmatrix}
\]

\[
\begin{pmatrix} 0.26030 & -6.9651\times 10^{-2} & -0.94185 & 0.20079 \\ 9.9960\times 10^{-2} & 0.25172 & 0.20932 & 0.93959 \\ 0.94038 & -0.20295 & 0.25314 & -0.10207 \\ 0.19478 & 0.94371 & -7.0906\times 10^{-2} & -0.25775 \end{pmatrix}^{T}
\begin{pmatrix} 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & -4 \\ 0 & 0 & 1 & -3 \end{pmatrix}
\begin{pmatrix} 0.26030 & -6.9651\times 10^{-2} & -0.94185 & 0.20079 \\ 9.9960\times 10^{-2} & 0.25172 & 0.20932 & 0.93959 \\ 0.94038 & -0.20295 & 0.25314 & -0.10207 \\ 0.19478 & 0.94371 & -7.0906\times 10^{-2} & -0.25775 \end{pmatrix}
\]
\[
= \begin{pmatrix} -0.61347 & -4.2510 & 0.48569 & 2.0968 \\ 0.50389 & -2.3376 & 0.11542 & 0.33094 \\ 2.5476\times 10^{-7} & 8.4187\times 10^{-6} & -0.15734 & 0.30446 \\ 4.6118\times 10^{-6} & 7.5855\times 10^{-6} & -0.97448 & 0.10846 \end{pmatrix}
\]

Now there are two blocks of importance:

\[
\begin{pmatrix} -0.61347 & -4.2510 \\ 0.50389 & -2.3376 \end{pmatrix},
\quad \text{eigenvalues: } -1.4755 + 1.1827i,\; -1.4755 - 1.1827i
\]


\[
\begin{pmatrix} -0.15734 & 0.30446 \\ -0.97448 & 0.10846 \end{pmatrix},
\quad \text{eigenvalues: } -0.02444 + 0.52823i,\; -0.02444 - 0.52823i
\]

How well does one of these work? Try one.

\[
(-0.02444 + 0.52823i)^4 + 3(-0.02444 + 0.52823i)^3 + 4(-0.02444 + 0.52823i)^2 + (-0.02444 + 0.52823i) + 1 = 2.8879\times 10^{-5} - 3.0092\times 10^{-6} i
\]
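The steps just carried out — raise the companion matrix to a high power, QR-factor, form Qᵀ C Q, and read the conjugate pairs off the 2×2 diagonal blocks — can be sketched as follows. This is a simultaneous-iteration sketch under the same setup, not code from the text, and the function name is ours:

```python
import numpy as np

def quasi_triangularize(C, power=20):
    """One round of the scheme used above: QR-factor C^power and
    return T = Q^T C Q, which is close to block upper triangular
    with 1x1 and 2x2 diagonal blocks carrying the eigenvalues."""
    Q, _ = np.linalg.qr(np.linalg.matrix_power(C, power))
    return Q.T @ C @ Q

# companion matrix of x^4 + 3x^3 + 4x^2 + x + 1, as in the exercise
C = np.array([[0., 0., 0., -1.],
              [1., 0., 0., -1.],
              [0., 1., 0., -4.],
              [0., 0., 1., -3.]])
T = quasi_triangularize(C)
top = np.linalg.eigvals(T[:2, :2])      # pair near -1.4755 +/- 1.1827i
bottom = np.linalg.eigvals(T[2:, 2:])   # pair near -0.0244 +/- 0.5282i
```

The lower-left 2×2 block of T is nearly zero (entries on the order of 10⁻⁶ here, as in the hand computation), which is what justifies reading the roots off the diagonal blocks.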

12. Suppose A is a real symmetric matrix and the technique of reducing to an upper Hessenberg matrix is followed. Show the resulting upper Hessenberg matrix is actually equal to 0 on the top as well as the bottom.

Let QᵀAQ = H where H is upper Hessenberg and Q is orthogonal. Taking the transpose of both sides gives Hᵀ = QᵀAᵀQ = QᵀAQ = H, since A is symmetric. Thus H is symmetric, so H is zero on the top as well; that is, H is tridiagonal.
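This can also be seen numerically: running a Householder reduction to upper Hessenberg form on a symmetric matrix produces a tridiagonal H. Below is a sketch of the standard Householder reduction (not code from the text; the function name is ours), applied to the symmetric matrix from Problem 9:

```python
import numpy as np

def hessenberg_reduce(A):
    """Reduce A to upper Hessenberg form H = Q^T A Q with Householder
    reflections; for symmetric A, H comes out tridiagonal."""
    H = np.array(A, dtype=float)
    n = H.shape[0]
    Q = np.eye(n)
    for k in range(n - 2):
        x = H[k + 1:, k].copy()
        s = np.linalg.norm(x)
        if s == 0.0:
            continue
        alpha = -s if x[0] >= 0 else s   # sign chosen to avoid cancellation
        v = x
        v[0] -= alpha
        v /= np.linalg.norm(v)
        # apply the reflection P = I - 2 v v^T to rows and columns k+1 onward
        H[k + 1:, k:] -= 2.0 * np.outer(v, v @ H[k + 1:, k:])
        H[:, k + 1:] -= 2.0 * np.outer(H[:, k + 1:] @ v, v)
        Q[:, k + 1:] -= 2.0 * np.outer(Q[:, k + 1:] @ v, v)
    return Q, H

A = np.array([[3., 2., 3.],
              [2., 6., 4.],
              [3., 4., -3.]])
Q, H = hessenberg_reduce(A)   # H is tridiagonal since A is symmetric
```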
