
Linear Algebra Short Course, Lecture 4
Matthew J. Holland (matthew-h@is.naist.jp)
Mathematical Informatics Lab, Graduate School of Information Science, NAIST



Some useful references

- Finite-dimensional inner-product spaces, normal operators: Axler (1997, Ch. 6-7)
- Projection theorem on infinite-dimensional Hilbert spaces: Luenberger (1968, Ch. 3)
- Unitary matrices: Horn and Johnson (1985, Ch. 2)


Lecture contents

1. Inner products: motivations, terms, and basic properties
2. Projections, orthogonal complements, and related problems
3. Linear functionals and the adjoint
4. Normal operators and the spectral theorem
5. Positive operators and isometries
6. Some famous decompositions


Key idea: generalizing geometric notions

Our progress thus far:

- Built a framework for sets with a linearity property
- Built a framework for functions with a linearity property
- Looked at some deeper results based on this framework

Note our framework was very general (operations on linear spaces of functions, etc.).

Can we add length and angle to our general framework?

Yes, and the key notion is that of an "inner product" between vectors.


Geometric motivations from vector analysis on R^3 (1)

Typically we define a "projection" by its length, to start. That is, if proj(x; y) ∈ R^3 denotes the projection of x onto the direction of y, we require that proj(x; y) satisfy

‖proj(x; y)‖ = ‖x‖ |cos(∠xy)|,

which is natural considering the right triangle with hypotenuse of length ‖x‖.


Geometric motivations from vector analysis on R^3 (2)

To define the actual projection, just scale y. That is,

proj(x; y) := (‖x‖ cos(∠xy) / ‖y‖) y.

This naturally depends on "what goes where" (i.e., it is asymmetric in its arguments).

A convenient quantity for examining the direction of a vector pair x, y ∈ R^3 is

x · y := ‖x‖ ‖y‖ cos(∠xy).

Clearly x · y = y · x, and

x ⊥ y ⇐⇒ x · y = 0
∠xy acute ⇐⇒ x · y > 0
∠xy obtuse ⇐⇒ x · y < 0.


Geometric motivations from vector analysis on R^3 (3)

It is easy to geometrically motivate the scalar product being linear in both arguments, namely (x + z) · y = (x · y) + (z · y).

With this, and perpendicular unit coordinate vectors e1, e2, e3 ∈ R^3, note

x · y = (x1 e1 + x2 e2 + x3 e3) · (y1 e1 + y2 e2 + y3 e3)
      = x1 y1 + x2 y2 + x3 y3,

where xi := ‖x‖ cos(∠x ei), and similarly for the yj.


The inner product as a generalized scalar product (1)

Enough geometry; now we do algebra. The scalar product x · y captures both length and angle. Let's generalize.

For x, y ∈ R^n, it is natural to extend the length and angle quantifiers via

‖x‖ := √(x1² + · · · + xn²)
x · y := x1 y1 + · · · + xn yn.

What about the complex case? For u = a + ib ∈ C, just like R², i.e.,

‖u‖ := √(a² + b²) = (u ū)^(1/2) = √(|u|²).

Extending this length to u = (u1, . . . , un) ∈ C^n, we naturally try

‖u‖ := √(|u1|² + · · · + |un|²).

As ‖u‖² = u1 ū1 + · · · + un ūn, intuitively we'd like to consider defining

u · v := u1 v̄1 + · · · + un v̄n.
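As a quick numerical sanity check (not part of the slides), this extended scalar product on C^n is easy to verify with NumPy; the vectors below are arbitrary illustrative choices.

```python
import numpy as np

# Arbitrary illustrative vectors in C^3.
u = np.array([1 + 2j, -1j, 3 + 0j])
v = np.array([2 - 1j, 1 + 1j, 4j])

def ip(u, v):
    # u . v := u1*conj(v1) + ... + un*conj(vn)
    return np.sum(u * np.conj(v))

# Conjugate symmetry: v . u = conj(u . v)
assert np.isclose(ip(v, u), np.conj(ip(u, v)))

# The induced length matches sqrt(|u1|^2 + ... + |un|^2)
assert np.isclose(np.sqrt(ip(u, u).real), np.linalg.norm(u))
```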


The inner product as a generalized scalar product (2)

(*) Note that the following properties of our extended scalar products on both R^n and C^n hold:

- conjugate symmetry: v · u = conj(u · v)
- definiteness: u · u = 0 ⇐⇒ u = 0
- linearity in the first argument: (αu + βu′) · v = α(u · v) + β(u′ · v)

Recall that these properties are shared by the original scalar product.

In linear algebra, we start with the inner product properties as axioms.

We abandon the clunky "dot" for the more standard 〈·, ·〉 notation henceforth.


Inner product

Consider a vector space V on field F = R or C.

Defn. Call 〈·, ·〉 : V × V → F an inner product on V if ∀ u, v, w ∈ V, α ∈ F:

IP.1 〈u, v〉 = conj(〈v, u〉)
IP.2 〈αu, v〉 = α〈u, v〉
IP.3 〈u + w, v〉 = 〈u, v〉 + 〈w, v〉
IP.4 〈u, u〉 ≥ 0, and 〈u, u〉 = 0 ⇐⇒ u = 0.

(*) Additivity actually holds in both arguments. Also, 〈u, αv〉 = conj(α)〈u, v〉, and 〈0, v〉 = 〈v, 0〉 = 0 for any v ∈ V.

Defn. Call (V, 〈·, ·〉) an inner product space. A complete IP space is called a Hilbert space.


Inner product examples

(*) Note R^n and C^n with 〈·, ·〉 defined by our generalized dot product are IP spaces.

(*) Note Pm(R) equipped with

〈p, q〉 := ∫₀¹ p(x) q(x) dx

is a valid inner product space.

(**) Recalling the space of real sequences

ℓp := { (x1, x2, . . .) ∈ R^∞ : Σ_{i=1}^∞ |xi|^p < ∞ },

if we use the Hölder inequality to show finiteness, we can show that the natural IP

〈x, y〉 := Σ_{i=1}^∞ xi yi

makes ℓ2 an inner product space.


Inner product properties (1)

Defn. On IP space V, call ‖u‖ := √〈u, u〉 the norm on V.

Let's verify this naming is valid (considering the general norm definition).

(*) [Cauchy-Schwarz] On IP space V,

|〈u, v〉| ≤ ‖u‖ ‖v‖, ∀ u, v ∈ V.

Expand 0 ≤ 〈u − αv, u − αv〉 and cleverly pick α.

(*) We just need the triangle inequality. Expand ‖u + v‖² and use C-S to verify

‖u + v‖ ≤ ‖u‖ + ‖v‖, ∀ u, v ∈ V.

Be sure to check the other axioms to conclude that inner products induce valid norms.
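Both inequalities are easy to spot-check numerically; a minimal sketch with NumPy (the random vectors are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.standard_normal(5)
v = rng.standard_normal(5)

# Cauchy-Schwarz: |<u, v>| <= ||u|| ||v||
assert abs(u @ v) <= np.linalg.norm(u) * np.linalg.norm(v) + 1e-12

# Triangle inequality: ||u + v|| <= ||u|| + ||v||
assert np.linalg.norm(u + v) <= np.linalg.norm(u) + np.linalg.norm(v) + 1e-12
```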


Inner product properties (2)

(*) The generalized "Parallelogram Law" clearly follows:

‖x + y‖² + ‖x − y‖² = 2‖x‖² + 2‖y‖².

Defn. If 〈u, v〉 = 0, we say u and v are orthogonal, often denoted u ⊥ v. For any W ⊂ V, say u ⊥ W iff u ⊥ w, ∀ w ∈ W.

(*) Clearly the Pythagorean theorem extends nicely:

u ⊥ v =⇒ ‖u + v‖² = ‖u‖² + ‖v‖².
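Both identities can be spot-checked numerically; in this sketch, the vectors are chosen so that x ⊥ y.

```python
import numpy as np

n = np.linalg.norm
x = np.array([1.0, 2.0, 2.0])
y = np.array([2.0, 1.0, -2.0])   # <x, y> = 2 + 2 - 4 = 0, so x is orthogonal to y

# Parallelogram law (holds for any pair)
assert np.isclose(n(x + y)**2 + n(x - y)**2, 2*n(x)**2 + 2*n(y)**2)

# Pythagorean theorem (needs orthogonality)
assert np.isclose(x @ y, 0.0)
assert np.isclose(n(x + y)**2, n(x)**2 + n(y)**2)
```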

(*) If u ⊥ v, ∀ v ∈ V , then u = 0.

(*) A superb fact: the IP is continuous. That is, if sequences (un), (vn) in V converge, un → u and vn → v, then

〈un, vn〉 → 〈u, v〉.


Considering the notion of a projection again

We previously considered proj(u; v) only geometrically. Now it is quite easy. Intuitively, we seek αv ∈ [{v}] such that

u = αv + w, where w ⊥ v.

(*) Check that the scalar then must be α = 〈u, v〉/‖v‖², a nostalgic form indeed.

This "orthogonal projection" (formalized shortly) will play an important role moving ahead.
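A minimal NumPy sketch of this one-dimensional projection; the vectors are arbitrary illustrative choices.

```python
import numpy as np

def proj(u, v):
    # Orthogonal projection of u onto the line spanned by v (v != 0):
    # alpha = <u, v> / ||v||^2, and the projection is alpha * v.
    return (u @ v) / (v @ v) * v

u = np.array([3.0, 4.0])
v = np.array([1.0, 0.0])
p = proj(u, v)

assert np.allclose(p, [3.0, 0.0])
assert np.isclose((u - p) @ v, 0.0)   # the residual w = u - p satisfies w ⊥ v
```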


An optimization problem

Consider the following problem. Let V be an IP space, and X ⊂ V a subspace. Fix u0 ∈ V, and

find x̂ ∈ X which minimizes ‖x − u0‖ over x ∈ X.

Natural questions: Does a solution exist? Is it unique? What is it?

The answers to these questions are given by the "Projection Theorem", a truly classic result.

Note: no requirement that dim V < ∞ thus far.


The Projection Theorem

(*) Say x̂ ∈ X is such that ‖x̂ − u0‖ ≤ ‖x − u0‖ for all x ∈ X. Then x̂ (the minimizing vector in X) is unique.

(*) Element x̂ (uniquely) minimizes ‖x − u0‖ ⇐⇒ x̂ − u0 ⊥ X.

We have not shown that such an element need exist; to do this we need slightly stronger conditions:

(**) Let V be a Hilbert space, and X ⊂ V a closed subspace. Then, for any u0 ∈ V,

∃ x̂ ∈ X, ‖x̂ − u0‖ ≤ ‖x − u0‖, ∀ x ∈ X.

This result is typically called the classical Projection Theorem. Let's develop these ideas further.


Orthogonal complements (1)

Let V be an IP space.

Defn. Take any subset U ⊂ V, and denote by

U⊥ := {v ∈ V : u ⊥ v, ∀ u ∈ U},

called the orthogonal complement of U.

(*) Note {0}⊥ = V and V⊥ = {0}. Also, for any U, we have that U⊥ ⊂ V is a closed subspace.

(*) Some additional properties (still allowing dim V = ∞):

- U ⊂ U⊥⊥
- U ⊂ W =⇒ W⊥ ⊂ U⊥
- U⊥⊥⊥ = U⊥
- [U] ⊂ U⊥⊥ (in a Hilbert space, U⊥⊥ is the closure of [U])


Orthogonal complements (2)

The use of the term "complement" will now be justified.

(*) Let V be a Hilbert space and X ⊂ V a closed subspace. Then,

V = X ⊕ X⊥, and X⊥⊥ = X.

This result may be proved using the Projection Theorem.

Thus, orthogonal complements furnish a nice direct sum decomposition. Uniquely, we have v = x + x′ with x ∈ X, x′ ∈ X⊥.

(*) If we specialize to dim V < ∞, everything simplifies further:

- In this case, V an IP space =⇒ V is Hilbert (recall Lec 1).
- Thus, the above result and the Projection Theorem hold for any subspace.
- Similarly, for subspace U ⊂ V we have U⊥⊥ = U.
- Naturally, dim U⊥ = dim V − dim U.


Orthogonal projection (1)

With these terms down, we provide a general projection notion.

Defn. Let X ⊂ V be a closed subspace, and take u ∈ V. Uniquely, we have

u = x + x′

where x ∈ X, x′ ∈ X⊥. Define the orthogonal projection of u onto X by proj(u; X) := x = u − x′.

(*) This pops up naturally in the Projection Theorem, since

x̂ ∈ X minimizes ‖x − u0‖ ⇐⇒ x̂ − u0 ⊥ X,

and as u0 = (u0 − x̂) + x̂ with u0 − x̂ ∈ X⊥, we have x̂ = proj(u0; X).


Orthogonal projection (2)

(*) Projecting some x ∈ V in the direction of y ∈ V is tantamount to acquiring proj(x; [{y}]). It takes a familiar form. Decompose

x = αy + w, w ⊥ y. Thus, proj(x; [{y}]) = (〈x, y〉/‖y‖²) y,

as we would hope.


Properties of orthogonal projection

Let U ⊂ V be a subspace of V. Assume dim V < ∞. Denote PU(v) := proj(v; U) here.

(*) The following may readily be checked:

- PU ∈ L(V), a linear operator.
- range PU = U, null PU = U⊥.
- PU² = PU (idempotent map).
- ‖PU(v)‖ ≤ ‖v‖, ∀ v ∈ V (a contraction).

(*) Interestingly, the latter two properties characterize the orthogonal projections. That is, taking some S ∈ L(V),

S² = S and ‖S(v)‖ ≤ ‖v‖, ∀ v ∈ V =⇒ S = PU for some subspace U.
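These properties are easy to verify numerically for a concrete projection; here P = QQᵀ projects onto the column space of a matrix Q with orthonormal columns (the subspace is a random illustrative choice).

```python
import numpy as np

# Projection onto U = column space of an orthonormal Q: P_U = Q Q^T.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((4, 2)))
P = Q @ Q.T

v = rng.standard_normal(4)
assert np.allclose(P @ P, P)                                # idempotent
assert np.linalg.norm(P @ v) <= np.linalg.norm(v) + 1e-12   # contraction
assert np.allclose(P.T, P)                                  # also self-adjoint
```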


Orthogonal sets 1

Let S ⊂ V be a subset of IP space V .

Defn. We call S an orthogonal set if for any distinct u, v ∈ S, we have u ⊥ v. We call S orthonormal if it is orthogonal and each u ∈ S has ‖u‖ = 1.

In the following useful way, orthogonality connects with the more fundamental notion of independence seen earlier:

(*) If S ⊂ V is orthogonal and 0 ∉ S, it is (linearly) independent.


Orthogonal sets (2)

Conversely, given independent sets, we can always "orthonormalize", in the following sense.

(*) Given an independent sequence v1, v2, . . ., there exists an orthonormal sequence e1, e2, . . . such that

[{v1, . . . , vn}] = [{e1, . . . , en}], for any n > 0.

Proving this is straightforward, and can be done constructively. Initialize e1 := v1/‖v1‖. The rest are induced by

wn = vn − Σ_{i=1}^{n−1} 〈vn, ei〉 ei,    en = wn/‖wn‖.

This is often called the "Gram-Schmidt procedure".

(*) Thus every finite-dimensional inner product space has an orthonormal basis.
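The constructive procedure translates directly into code; a minimal sketch of classical Gram-Schmidt, assuming the input list is independent.

```python
import numpy as np

def gram_schmidt(vs):
    """Classical Gram-Schmidt on an independent list of vectors."""
    es = []
    for v in vs:
        w = v.astype(float)
        for e in es:
            w = w - (v @ e) * e   # subtract the projection of v onto each earlier e_i
        es.append(w / np.linalg.norm(w))
    return es

vs = [np.array([1.0, 1.0, 0.0]),
      np.array([1.0, 0.0, 1.0]),
      np.array([0.0, 1.0, 1.0])]
es = gram_schmidt(vs)

E = np.column_stack(es)
assert np.allclose(E.T @ E, np.eye(3))   # the output is orthonormal
```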


Orthogonal bases

Orthonormal sets play an important role as convenient bases.

Let V have dim V = n, with orthonormal basis {e1, . . . , en}. Then for any v ∈ V, we have

v = 〈v, e1〉e1 + · · · + 〈v, en〉en
‖v‖² = |〈v, e1〉|² + · · · + |〈v, en〉|².

A classical result is nice to check here.

(*) (Schur's theorem, 1909). For any finite-dim V on C and T ∈ L(V), there exists a basis B such that

M(T; B) is upper-triangular and B is orthonormal.

Show this with our "portmanteau theorem" for upper-triangular representations (Lec 3), and the previous result via Gram-Schmidt.


Optimization example 1

First example: find the optimal approximation of sin(x) on [−π, π] by a 5th-degree polynomial.
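One hedged way to sketch this example numerically: approximate the L² projection by a least-squares fit on a dense grid. The grid size and the error bound asserted below are illustrative choices, not part of the lecture.

```python
import numpy as np

# Discretized stand-in for the L^2([-pi, pi]) projection: a least-squares fit
# of sin(x) on a dense grid by a degree-5 polynomial.
x = np.linspace(-np.pi, np.pi, 2001)
coeffs = np.polynomial.polynomial.polyfit(x, np.sin(x), 5)
approx = np.polynomial.polynomial.polyval(x, coeffs)

max_err = np.max(np.abs(approx - np.sin(x)))
taylor_err = np.max(np.abs((x - x**3/6 + x**5/120) - np.sin(x)))

assert max_err < 0.05          # uniformly small over the whole interval
assert max_err < taylor_err    # far better than the degree-5 Taylor polynomial
```

Unlike the Taylor polynomial (accurate only near 0), the least-squares fit spreads its error over the whole interval.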


Optimization example 2

Second example: find the closest element in the subspace generated by m vectors to an arbitrary vector.
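A sketch of this problem via the normal equations; the spanning vectors and target vector are arbitrary illustrative choices.

```python
import numpy as np

# Columns a1, a2 span a subspace X of R^3; u0 is the vector to approximate.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])
u0 = np.array([1.0, 2.0, 3.0])

# Normal equations: G c = b, with Gram matrix G_ij = <a_j, a_i>.
G = A.T @ A
b = A.T @ u0
c = np.linalg.solve(G, b)
x_hat = A @ c

# Projection Theorem check: the residual is orthogonal to the subspace.
assert np.allclose(A.T @ (u0 - x_hat), 0.0)
```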


Optimization example 3

Third example: find the element of an affine set which has the smallest norm (this is of course the distance from any element in that affine set to the associated hyperplane through the origin).


Optimization example 4

Fourth example: minimum distance from an arbitrary element to a convex set.


Linear functionals 1

Linear maps which return scalars play an important role in linear algebra (and other related fields).

Defn. Let V be a vector space on F. Any f ∈ L(V, F) is called a linear functional.

(*) The following are linear functionals:

- On F^n, f(x) := Σ_{i=1}^n αi xi, for any fixed α ∈ F^n. In fact, every g ∈ L(F^n, F) takes this form.
- On real P6(R), f(x) := ∫₀¹ x(t) cos(t) dt.
- On C[0, 1], f(x) := x(0.5).
- On a Hilbert space H, f(x) := 〈x, h〉 for fixed h ∈ H.

The last example here will be of particular interest to us.


Linear functionals 2

A very neat fact:

(*) On IP space (V, 〈·, ·〉) with dim V = n, let f be a linear functional. Then, there exists a unique v ∈ V such that

f(u) = 〈u, v〉, ∀ u ∈ V.

To see this, recall we can always find an orthonormal basis {e1, . . . , en} of V. Expand arbitrary u with respect to this basis, and examine f(u) using the linearity of f.

This result is a special case of the Riesz-Fréchet theorem, which extends things to the infinite-dimensional case. See for example Luenberger (1968, Ch. 4).
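The constructive argument can be mirrored numerically in R³ with the standard basis; the functional below is an arbitrary illustrative choice.

```python
import numpy as np

# An arbitrary linear functional on R^3: f(u) = <u, a>.
a = np.array([2.0, -1.0, 0.5])
f = lambda u: a @ u

# Build the representing vector from the orthonormal (standard) basis:
# v = f(e1) e1 + f(e2) e2 + f(e3) e3  (real case; conjugate the f(e_i) over C).
e = np.eye(3)
v = sum(f(e[i]) * e[i] for i in range(3))

u = np.array([1.0, 4.0, -2.0])
assert np.isclose(f(u), u @ v)
assert np.allclose(v, a)   # here the representer is just a itself
```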


Adjoint of a linear map

A very important notion moving forward.

Defn. Let U, V be IP spaces on F, with dim U, dim V < ∞. Take any T ∈ L(U, V), fixed. For any fixed v ∈ V,

f(u) := 〈Tu, v〉

is clearly a linear functional f ∈ L(U, F). By Riesz-Fréchet, there exists a unique u* ∈ U such that

f(u) = 〈u, u*〉, ∀ u ∈ U.

The initial v was arbitrary, so we may define a map T* : V → U by

T*(v) := u* as above.

We call T* the adjoint of T. Somewhat subtle, but critical.

Critical to memorize: 〈T(u), v〉 = 〈u, T*(v)〉.


Properties of adjoints

(*) Define T : R³ → R² by

T(x1, x2, x3) := (x2 + 3x3, 2x1),

and verify with the usual inner products that T*(y) = (2y2, y1, 3y1). Simply note we must have 〈Tx, y〉 = 〈x, T*y〉 for any x ∈ R³, y ∈ R².
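The claimed adjoint can be checked numerically via matrix representations: the standard bases are orthonormal, so in the real case the adjoint's matrix is just the transpose.

```python
import numpy as np

# Matrix of T(x1, x2, x3) = (x2 + 3*x3, 2*x1) wrt the standard bases.
T = np.array([[0.0, 1.0, 3.0],
              [2.0, 0.0, 0.0]])
T_star = T.T   # real case: the adjoint's matrix is the transpose

x = np.array([1.0, -2.0, 0.5])
y = np.array([4.0, -1.0])

# Defining identity: <Tx, y> = <x, T*y>
assert np.isclose((T @ x) @ y, x @ (T_star @ y))
# And T*(y) = (2*y2, y1, 3*y1) as claimed
assert np.allclose(T_star @ y, [2*y[1], y[0], 3*y[0]])
```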

(*) For any T ∈ L(U, V) as in the previous slide, we have T* ∈ L(V, U).

(*) Verify the following properties of the map T ↦ T*. Take T, T′ ∈ L(U, V) and α ∈ F.

- (T + T′)* = T* + (T′)*
- (αT)* = conj(α) T*
- (T*)* = T.
- For T ∈ L(U, V), S ∈ L(V, W), we have (ST)* = T* S*, where W is any IP space.


More properties of adjoints

(*) Let T ∈ L(V) and take α ∈ F. Then,

α ∈ σ(T) ⇐⇒ conj(α) ∈ σ(T*).

(*) Let U ⊂ V be a subspace, and T ∈ L(V). Then,

U is T-invariant ⇐⇒ U⊥ is T*-invariant.

(*) Take T ∈ L(V, W). Prove:

- T is injective iff T* is surjective.
- T is surjective iff T* is injective.

(*) With this, take T ∈ L(V, W) and verify

dim null T* = dim null T + dim W − dim V,

as well as dim range T* = dim range T.

(*) Note the above result completes the generalization of Strang's first fundamental theorem (i.e., row/column spaces have the same dimension), mentioned in Lec 2.


Connections between a map and its adjoint

(*) Take any T ∈ L(U, V), with U, V both IP spaces. Then,

- null T* = (range T)⊥
- range T* = (null T)⊥
- null T = (range T*)⊥
- range T = (null T*)⊥

Defn. Denote the conjugate transpose of matrix A = [aij] ∈ F^{m×n} by A* := conj(A)ᵀ = [conj(aji)].

Given a proper matrix representation of a linear map, we can easily find the representation of its adjoint:

(*) Take T ∈ L(U, V) for finite-dim IP spaces U, V. Let B_U, B_V be respectively orthonormal bases of U and V. Then,

(M(T; B_U, B_V))* = M(T*; B_V, B_U).


Operators on inner product spaces

The flow through the first three lectures was:

- Linear spaces (sets with linearity)
- Linear maps (functions with linearity)
- Linear operators on general spaces

Now, considering what we've seen in this lecture, the next key point to tackle is

- Linear operators on inner product spaces

That is precisely what we look at now.


Self-adjoint operators

Let V be a finite-dim IP space, and take T ∈ L(V).

Defn. Call operator T self-adjoint (or Hermitian) when T = T*.

(*) Take T ∈ L(F²) defined to have matrix

M(T) = [ 19   γ
          7  59 ]

wrt the standard basis, of course. Note T is self-adjoint ⇐⇒ γ = 7.

(*) Similarly, we may confirm that for an arbitrary orthonormal basis B,

T = T* ⇐⇒ M(T; B) = (M(T; B))*.

A natural connection to the more familiar matrix territory.


Properties of self-adjoint operators

(*) If T, S ∈ L(V) are self-adjoint, then T + S is self-adjoint.

(*) If T ∈ L(V) is self-adjoint, then for α ∈ R, αT is self-adjoint.

(*) Let T ∈ L(V) be self-adjoint. Then every eigenvalue of T is real (recalling F may be either C or R).

(*) Of course, for T ∈ L(F^n) specified by A ∈ F^{n×n}, this is already a matrix wrt the standard basis, so just look at A.

A nice analogy: think of the self-adjoint operators among all operators like R as a subset of C (the adjoint operation T ↦ T* is analogous to complex conjugation).


Characterizing the self-adjoint operators

A characterization of the self-adjoint operators is given at the start of section 5, though technically it is used for upcoming results.


The normal operators

The next very important class of operators.

Defn. Take T ∈ L(V). When T commutes with its adjoint, that is, when

T T* = T* T,

we call T a normal operator.

(*) Every self-adjoint operator is normal.

(*) Let B be an orthonormal basis. Then, T is normal iff M(T; B) and M(T*; B) commute.

(*) Consider T ∈ L(F²) with matrix (wrt the standard basis)

M(T) = [ 2  −3
         3   2 ],

clearly normal but not self-adjoint. Thus the normal operators are a strictly larger class.
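This example is easy to confirm numerically:

```python
import numpy as np

A = np.array([[2.0, -3.0],
              [3.0,  2.0]])

assert np.allclose(A @ A.T, A.T @ A)   # normal: commutes with its adjoint
assert not np.allclose(A, A.T)         # but not self-adjoint
```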


Properties of normal operators

Normal operators need not equal their adjoints, yet they have a lot in common with them:

(**) They act with common norms:

T is normal ⇐⇒ ‖T(v)‖ = ‖T*(v)‖, ∀ v ∈ V.

(*) This implies for normal T ∈ L(V),

null T = null T*.

(*) Their eigenvectors are closely related. Let T be normal and α ∈ σ(T). We have that

if Tv = αv, then T*v = conj(α) v.

(*) This gives us a critical property. Let α1, . . . , αm be the distinct eigenvalues of normal T ∈ L(V), with corresponding eigenvectors v1, . . . , vm. Then,

{v1, . . . , vm} is orthogonal.

This clearly strengthens previous results (we only had independence).


The spectral theorem, intuitively

Recall we know that for T ∈ L(V), dim V = n,

T is "diagonalizable" ⇐⇒ ∃ basis {v1, . . . , vn} of eigenvectors of T.

While such a T is nice, in general we have no guarantee that the basis {v1, . . . , vn} is orthogonal, which is really the "nicest" setup.

The spectral theorem characterizes the very nicest operators:

C version: the nicest operators are the normal operators.

R version: the nicest operators are the self-adjoint operators.

Why is this useful? It gives us easy access to an orthonormal basis! (In general, we have only existence guarantees.)


The spectral theorem

Work on V, dim V = n. Take T ∈ L(V).

(**) Complex spectral theorem. Assume V on F = C.

T is normal ⇐⇒ ∃ orthonormal basis {v1, . . . , vn} of eigenvectors of T.

(**) Real spectral theorem. Assume V on F = R.

T is self-adjoint ⇐⇒ ∃ orthonormal basis {v1, . . . , vn} of eigenvectors of T.

Proving these results is somewhat involved (though we have the tools required), but absolutely worth doing.

Key take-aways:

For the "nicest" operators (and only the nicest operators), the eigenvectors furnish an orthonormal basis.

All self-adjoint operators (on either F) can be diagonalized via an orthonormal basis.


Illustrative examples 1

Example. (*) Define T ∈ L(C²) by

M(T) = [ 2  −3
         3   2 ]

wrt the standard basis of C². Confirm that

B = { (i, 1)/√2, (−i, 1)/√2 }

is an orthonormal basis, that both of its elements are eigenvectors of T, and indeed that M(T; B) is diagonal.
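A numerical confirmation of this example (the eigenvalues below are computed from the matrix, not assumed):

```python
import numpy as np

A = np.array([[2, -3],
              [3,  2]], dtype=complex)
b1 = np.array([1j, 1]) / np.sqrt(2)
b2 = np.array([-1j, 1]) / np.sqrt(2)

# Orthonormal in C^2 (note the conjugation built into np.vdot)
assert np.isclose(np.vdot(b1, b2), 0.0)
assert np.isclose(np.vdot(b1, b1).real, 1.0)

# Both are eigenvectors: A b = lambda b
lam1 = (A @ b1)[1] / b1[1]
lam2 = (A @ b2)[1] / b2[1]
assert np.allclose(A @ b1, lam1 * b1)
assert np.allclose(A @ b2, lam2 * b2)
```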


Illustrative examples 2

Example. (*) A similar deal, this time T ∈ L(R³), with matrix (wrt the standard basis)

M(T) = [  14  −13    8
         −13   14    8
           8    8   −7 ].

Check the same properties as in the previous slide, this time for the basis

B′ = { (1, −1, 0)/√2, (1, 1, 1)/√3, (1, 1, −2)/√6 }.
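A numerical check of this example; `numpy.linalg.eigh` is designed for symmetric/Hermitian matrices and returns an orthonormal eigenbasis directly.

```python
import numpy as np

A = np.array([[ 14.0, -13.0,   8.0],
              [-13.0,  14.0,   8.0],
              [  8.0,   8.0,  -7.0]])

vals, vecs = np.linalg.eigh(A)   # eigh handles symmetric/Hermitian matrices

# The columns of vecs form an orthonormal basis of eigenvectors,
# and they diagonalize A.
assert np.allclose(vecs.T @ vecs, np.eye(3))
assert np.allclose(vecs.T @ A @ vecs, np.diag(vals))
```

Here the eigenvalues come out as −15, 9, 27, with eigenvectors proportional to the elements of B′ above.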


Some comments on the R case

Even on R, when restricted to self-adjoint operators, things simplify.

(**) Let T ∈ L(V) on R be self-adjoint. Take a, b ∈ R s.t. a² < 4b. Then,

T² + aT + bI ∈ L(V) is invertible.

(*) This implies T has no "eigenpairs", and thus (recall Lec 3)

σ(T) ≠ ∅.

Of course, we know this last fact must hold, since we've already presented the real spectral theorem.

Note: we haven't characterized the normal operators in the real case. For this, see Axler (1997, Ch. 7).


Specialized structural results

With the extra assumptions of the "nice" operators, the structural results specialize nicely, providing us with mutually orthogonal subspaces.

(*) Let T ∈ L(V) be self-adjoint if F = R (normal if F = C), with distinct eigenvalues α1, . . . , αm. Then,

V = null(T − α1 I) ⊕ · · · ⊕ null(T − αm I),

and null(T − αi I) ⊥ null(T − αj I) for all i ≠ j.

Thus, the spectral information of any "nice" T yields an orthogonal decomposition of V.

Lecture contents

1. Inner products: motivations, terms, and basicproperties

2. Projections, orthogonal complements, andrelated problems

3. Linear functionals and the adjoint

4. Normal operators and the spectral theorem

5. Positive operators and isometries

6. Some famous decompositions

Characterizing the self-adjoint operators

Real case: The real spectral theorem characterizes self-adjoint operators.

Complex case: We haven’t discussed this yet.

(**) For any T ∈ L(V) on C, suppose

〈Tv, v〉 = 0, ∀ v ∈ V.

Then T = 0 (for self-adjoint T, this also holds in the R case).

(*) It follows that for T ∈ L(V) on C,

T self-adjoint ⇐⇒ 〈Tv, v〉 ∈ R, ∀ v ∈ V.

So, complex self-adjoint operators are precisely those for which any v and its image T(v) have a real inner product.

Important special case: when 〈Tv, v〉 ≥ 0, for all v ∈ V.

Positive operators and square roots

Let V be finite-dimensional, dim V < ∞.

Defn. Focus on self-adjoint T ∈ L(V). Call T a positive (semi-definite) operator if

〈Tv, v〉 ≥ 0, ∀ v ∈ V.

(*) Of course, for the C case, the self-adjoint requirement is superfluous.

(*) For any subspace U ⊂ V, the projection operator proj(·;U) is positive.

Defn. Take any operator T ∈ L(V). If there exists S ∈ L(V) such that

S² = T,

then call S a square root of the operator T.

(*) Find a square root of T ∈ L(F³), defined by T(z1, z2, z3) := (z3, 0, 0).
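One candidate (an illustration, not necessarily the intended answer) is S(z1, z2, z3) := (z2, z3, 0); a quick NumPy check that S² = T:

```python
import numpy as np

# T(z1, z2, z3) = (z3, 0, 0), written as a matrix acting on column vectors.
T = np.array([[0., 0., 1.],
              [0., 0., 0.],
              [0., 0., 0.]])

# Candidate square root S(z1, z2, z3) = (z2, z3, 0).
S = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [0., 0., 0.]])

assert np.allclose(S @ S, T)   # S^2 = T, so S is a square root of T
```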

Portmanteau theorem for positive operators

(**) Take any T ∈ L(V) on finite-dim V . The following are equivalent.

A T is positive; i.e., T = T∗ and 〈Tv, v〉 ≥ 0, all v.

B T = T∗ and eigenvalues of T are non-negative.

C Exists positive Q ∈ L(V) such that Q2 = T .

D Exists self-adjoint R ∈ L(V) such that R2 = T .

E Exists S ∈ L(V) such that S∗S = T .

(**) When T ∈ L(V) is positive, there exists a unique positive Q ∈ L(V) s.t. Q² = T. That is, T has a unique positive square root. Denote

√T := Q.
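The unique positive square root can be computed from the spectral theorem; a NumPy sketch (the example matrix is an arbitrary seeded random choice, not from the slides):

```python
import numpy as np

# Build a positive semi-definite T (of the form A*A, cf. condition E),
# then take the square root via an orthonormal eigenbasis.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
T = A.T @ A                          # positive semi-definite

w, V = np.linalg.eigh(T)             # real eigenvalues, orthonormal eigenvectors
w = np.clip(w, 0.0, None)            # guard against tiny negative round-off
Q = V @ np.diag(np.sqrt(w)) @ V.T    # Q = sqrt(T)

assert np.allclose(Q, Q.T)                        # Q self-adjoint
assert np.all(np.linalg.eigvalsh(Q) >= -1e-10)    # Q positive
assert np.allclose(Q @ Q, T)                      # Q^2 = T
```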

Great. We’ll sort out the implications in the next slide.

Key properties of positive operators

(*) From the results of the previous slide:

I Only positive operators have positive square roots

I If an operator has a positive square root, this root is unique.

I Positive operators form a subset of self-adjoint operators.

I Not only are the eigenvalues real, they’re non-negative.

I For any S ∈ L(V), S∗S is positive.

I If S is positive or self-adjoint, S² is positive.

(*) Let T ∈ L(V) be positive. Show

T invertible ⇐⇒ 〈Tv, v〉 > 0, ∀ v ≠ 0.

Isometries

Norm-preserving operators are also naturally of interest.

Defn. Call T ∈ L(V) an isometry if

‖Tv‖ = ‖v‖, ∀ v ∈ V.

This is a general term. For specific cases, other names are used:

I If F = C, call T a unitary operator.

I If F = R, call T an orthogonal operator.

(*) Let β ∈ F with |β| = 1. Note T := βI is an isometry.

(*) Let {v1, . . . , vn} be an orthonormal basis of V. Define T ∈ L(V) by

T(vi) := βivi,

with |βi| = 1 for each i = 1, . . . , n. Then T is an isometry.

(*) Counter-clockwise rotation on V = R2 is an isometry.

Many useful properties of isometries

(*) If T ∈ L(V) is an isometry, T⁻¹ exists.

(**) Let T ∈ L(V). The following are equivalent.

A T is an isometry.

B 〈Tu,Tv〉 = 〈u, v〉 for all u, v ∈ V (preserves IP)

C T∗T = I

D For any orthonormal set {e1, . . . , em}, the mapped {Te1, . . . , Tem} is also orthonormal (0 ≤ m ≤ n).

E Exists an orthonormal basis {v1, . . . , vn} such that {Tv1, . . . , Tvn} is orthonormal.

F T∗ is an isometry.

This is a nice collection of characterizations for the special case of isometries, which yields some critical implications.

Implications of isometry equivalences

Some key implications:

(*) Clearly, if T is an isometry, we have T⁻¹ = T∗.

(*) T preserves norms ⇐⇒ T preserves inner products.

(*) Every isometry is normal.

(*) Now a great equivalence. Let E := {e1, . . . , en} be any orthonormal basis of V on F. Then,

T an isometry ⇐⇒ columns of M(T;E) orthonormal

To see this, use A =⇒ D for the =⇒ direction, and E =⇒ A for the ⇐= direction.

(*) Using A ⇐⇒ F, show that the analogous condition holds using the rows of M(T;E) above.
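Both conditions are easy to check numerically; a sketch (the orthogonal matrix here is an arbitrary seeded example obtained via QR):

```python
import numpy as np

# A random orthogonal matrix: the matrix of an isometry on R^5 w.r.t. the
# standard (orthonormal) basis.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))

assert np.allclose(Q.T @ Q, np.eye(5))   # orthonormal columns (A <=> D/E)
assert np.allclose(Q @ Q.T, np.eye(5))   # orthonormal rows (via A <=> F)

v = rng.standard_normal(5)
assert np.isclose(np.linalg.norm(Q @ v), np.linalg.norm(v))   # norm preserved
```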

More concrete characterization of isometries

The previous characterizations of isometries were quite general. Let’s put forward a more concrete equivalence condition.

Complex case: (*) Let T ∈ L(V) on C. The following condition is both sufficient and necessary for T ∈ L(V) to be an isometry.

Exists {v1, . . . , vn}, an orthonormal basis of V, where the vi are eigenvectors of T, with eigenvalues satisfying |αi| = 1.

Real case: Similar to Lecture 3, a bit less elegant. See Axler (1997, Ch. 7).

Symmetric real matrices

In probability/statistics, symmetric real matrices appear frequently.

We’ve said a lot about how working on R is somewhat inconvenient. What’s so special about symmetric matrices?

That’s easy: Let A ∈ Rn×n be symmetric. Then,

I T ∈ L(Rn) defined by T(x) := Ax is self-adjoint.

I T is normal.

I T has eigenvalues, and Rn has an orthonormal basis {v1, . . . , vn} of T’s eigenvectors.

I T may be “diagonalized” by {v1, . . . , vn}.

I Specifically, A may be diagonalized by the COB matrix [v1 · · · vn].
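The bullets above can all be checked in a few lines of NumPy (a sketch with an arbitrary seeded symmetric matrix):

```python
import numpy as np

# A symmetric real matrix and its diagonalization by an orthonormal
# eigenvector basis, as described above.
rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2                    # symmetric

w, V = np.linalg.eigh(A)             # columns of V: orthonormal eigenvectors
assert np.allclose(V.T @ V, np.eye(4))        # orthonormal basis of R^4
assert np.allclose(V.T @ A @ V, np.diag(w))   # diagonalized by COB matrix V
```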

Lecture contents

1. Inner products: motivations, terms, and basicproperties

2. Projections, orthogonal complements, andrelated problems

3. Linear functionals and the adjoint

4. Normal operators and the spectral theorem

5. Positive operators and isometries

6. Some famous decompositions

Some famous decompositions

Here we look at conditions for some well-known decompositions:

I Schur

I Polar

I Singular value

I Spectral

Here we periodically switch over to “matrix language” to illustrate the generality of our results thus far.

Schur’s decomposition

See for example Magnus and Neudecker (1999).

(*) Let A be a complex n × n matrix. Then, there exists a unitary matrix R such that

R∗AR =
⎡ α1       ∗ ⎤
⎢    ⋱       ⎥
⎣ 0       αn ⎦

where the αi are eigenvalues of A.

To see this: Easy, just let T(z) := Az, which we know can always be upper-triangularized by an orthonormal basis E = {v1, . . . , vn}. Construct the COB matrix (here R) using these vi. Done.
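Numerically, SciPy exposes this as `scipy.linalg.schur` (a sketch, assuming SciPy is available; the matrix is an arbitrary seeded complex example):

```python
import numpy as np
from scipy.linalg import schur

# Complex Schur decomposition A = Z Tmat Z*, with Z unitary and Tmat
# upper-triangular carrying the eigenvalues of A on its diagonal.
rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

Tmat, Z = schur(A, output='complex')
assert np.allclose(Z.conj().T @ Z, np.eye(4))    # Z unitary (plays the role of R)
assert np.allclose(Z @ Tmat @ Z.conj().T, A)     # reconstructs A
assert np.allclose(np.tril(Tmat, -1), 0)         # upper-triangular

# diagonal entries match the eigenvalues of A (up to ordering)
d = np.diag(Tmat)
for lam in np.linalg.eigvals(A):
    assert np.min(np.abs(d - lam)) < 1e-8
```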

Polar decomposition: first, an analogy

A nice analogy exists between C and L(V):

z ∈ C · · · T ∈ L(V)
z̄ ∈ C · · · T∗ ∈ L(V)
z = z̄, i.e. R ⊂ C · · · T = T∗, i.e. {self-adjoint ops.} ⊂ L(V)
x ∈ R, x ≥ 0 · · · {positive ops.} ⊂ {self-adjoint ops.}
unit circle {z : z̄z = 1} · · · isometries, {T : T∗T = I}

Note any z ∈ C, z ≠ 0, can be written

z = (z/|z|)√(z̄z), of course noting z/|z| is on the unit circle.

Following the analogy, we wonder whether for any T ∈ L(V) we have an isometry S such that T breaks down into S√(T∗T) . . .

Polar decomposition

Indeed, the analogy leads us in a fruitful direction.

(**) (Polar decomposition). Let T ∈ L(V) over F. Then, there exists an isometry S ∈ L(V) such that

T = S√(T∗T).

The naming refers to z = eθir, θ ∈ [0, 2π), where r = |z|. Here S (like eθi) only changes direction. Magnitude is determined by √(T∗T) (like r).

Why is this nice? T is totally general, but

T = isometry × positive operator,

i.e., it breaks into two classes we know very well!

Polar decomposition, in matrix language

(*) Let A ∈ Fn×n. Then, there exist a unitary matrix Q and a positive semi-definite matrix P such that

A = QP.

To see this: Take the matrix of T ∈ L(Fn) defined by A with respect to the usual basis B, so

A = M(T;B) = M(S;B)M(√(T∗T);B),

where S is an isometry and √(T∗T) is positive. Verify M(S;B) is unitary and M(√(T∗T);B) is positive semi-definite. Done.

(*) Also note that in decomposing any T into an isometry/positive operator product, the only choice for the positive operator is √(T∗T).
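The factorization A = QP is easy to compute by hand from the spectral theorem; a NumPy sketch (the shift by 4I is just an illustrative way to get an invertible seeded matrix):

```python
import numpy as np

# Polar decomposition A = Q P for an invertible real A:
# P = sqrt(A* A) is positive semi-definite, Q = A P^{-1} is orthogonal.
rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4)) + 4 * np.eye(4)   # invertible for this seed

w, V = np.linalg.eigh(A.T @ A)
P = V @ np.diag(np.sqrt(np.clip(w, 0, None))) @ V.T   # P = sqrt(A*A)
Q = A @ np.linalg.inv(P)                              # isometry factor

assert np.allclose(Q.T @ Q, np.eye(4))            # Q orthogonal (unitary on R)
assert np.allclose(Q @ P, A)                      # A = QP
assert np.all(np.linalg.eigvalsh(P) >= -1e-10)    # P positive semi-definite
```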

Singular values of operators

For T ∈ L(V), the positive operator √(T∗T) clearly plays an important role. It pops up in both theory and practice.

Take any T ∈ L(V) on general F.

T need not have real eigenvalues, nor even any at all. However, √(T∗T) always has real, non-negative eigenvalues.

Defn. Call the eigenvalues si ∈ σ(√(T∗T)) the singular values of T.

Singular value decomposition (SVD) 1

(*) Let’s see the interesting role that σ(√(T∗T)) plays.

Take T ∈ L(V), dim V = n. Let s1, . . . , sn denote the eigenvalues of √(T∗T), up to multiplicity.

By the spectral theorem, there exist {b1, . . . , bn}, eigenvectors of √(T∗T), forming an orthonormal basis of V. Taking v ∈ V, recall

v = 〈v, b1〉b1 + · · ·+ 〈v, bn〉bn.

Note that via the polar decomposition T = S√(T∗T),

Tv = S√(T∗T)v = 〈v, b1〉s1Sb1 + · · ·+ 〈v, bn〉snSbn,

and as S is an isometry, {Sb1, . . . , Sbn} is an orthonormal basis of V.

Singular value decomposition (SVD) 2

(*) With this handy decomposition, one may easily check that for B1 := {b1, . . . , bn} and B2 := {Sb1, . . . , Sbn}, we have

M(T;B1,B2) =
⎡ s1       0 ⎤
⎢    ⋱       ⎥
⎣ 0       sn ⎦

another rare appearance of matrix representations with distinct bases.

To estimate singular values: Finding √(T∗T) explicitly may be hard. For a fixed basis B, let G be the diagonalizing matrix such that

G M(√(T∗T);B) G∗ = M(T;B1,B2);

clearly M(T;B1,B2)² = G M(T∗T;B) G∗. So

σ(T∗T) = {s1², . . . , sn²}.

Estimating the eigenvalues of the positive T∗T is easier. Take roots.
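This relationship is exactly what numerical SVD routines exploit; a NumPy sketch on an arbitrary seeded matrix:

```python
import numpy as np

# The singular values returned by np.linalg.svd are the square roots of
# the eigenvalues of A* A, as derived above.
rng = np.random.default_rng(5)
A = rng.standard_normal((4, 4))

U, s, Vh = np.linalg.svd(A)                   # A = U diag(s) Vh, s descending
assert np.allclose(U @ np.diag(s) @ Vh, A)
assert np.allclose(U.T @ U, np.eye(4))
assert np.allclose(Vh @ Vh.T, np.eye(4))

evals = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]   # eigenvalues of A*A, desc
assert np.allclose(s, np.sqrt(np.clip(evals, 0, None)))
```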

SVD, in (square) matrix language

(**) (Matrix SVD). Consider A ∈ Fn×n. Then we may factorize A into

A = QDR∗,

where Q, R are unitary matrices, and D is diagonal, whose diagonal entries are precisely the singular values of A.

To see this: Defining T(z) := Az, via the polar decomposition T = S√(T∗T), and letting B be the usual basis,

A = M(T;B) = M(S;B)M(√(T∗T);B) = M(S;B)GDG∗,

where G = M(I;E,B), and E = {v1, . . . , vn} is an orthonormal basis diagonalizing √(T∗T), with √(T∗T)vi = sivi.

G has orthonormal columns, equivalent to G being unitary. Let R∗ := G∗ = G⁻¹. Set Q := M(S;B)G, unitary as both M(S;B) and G are.

SVD-related additional properties

(*) Take T ∈ L(V), singular values s1, . . . , sn. Then,

T invertible ⇐⇒ si ≠ 0, i = 1, . . . , n.

(*) Take T ∈ L(V). Then,

dim range T = |{i : si ≠ 0}|, counting singular values with multiplicity.

(*) Take S ∈ L(V), singular values s1, . . . , sn. Then,

S an isometry ⇐⇒ si = 1, i = 1, . . . , n.

(*) Let s_* and s^* denote the smallest and largest singular values of T ∈ L(V). Then,

s_*‖v‖ ≤ ‖Tv‖ ≤ s^*‖v‖, for any v ∈ V.
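The last pair of bounds is easy to probe numerically (a sketch with a seeded random matrix and random test vectors):

```python
import numpy as np

# Check s_min ||v|| <= ||A v|| <= s_max ||v|| on random vectors.
rng = np.random.default_rng(6)
A = rng.standard_normal((5, 5))
s = np.linalg.svd(A, compute_uv=False)   # singular values, descending
s_max, s_min = s[0], s[-1]

for _ in range(100):
    v = rng.standard_normal(5)
    nv, nAv = np.linalg.norm(v), np.linalg.norm(A @ v)
    assert s_min * nv - 1e-10 <= nAv <= s_max * nv + 1e-10
```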

Singular values of general linear maps

In fact, the “usual” singular value decomposition extends to the more general case of A ∈ Fm×n quite easily.

(*) Note of course for finite-dim IP spaces U, V, taking T ∈ L(U,V),

T∗T ∈ L(U), (T∗T)∗ = T∗T,

thus T∗T is self-adjoint, and furthermore, for u ∈ U,

〈T∗Tu, u〉 = 〈T∗(Tu), u〉 = 〈Tu,Tu〉 ≥ 0,

and so T∗T is in fact positive, as we would hope.

Thus we may define the singular values of T ∈ L(U,V) by σ(√(T∗T)).

For more on the general SVD, see Horn and Johnson (1985, Ch. 7).
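The rectangular case is just as easy to check numerically (a sketch; the 6 × 3 shape is an arbitrary illustrative choice):

```python
import numpy as np

# For rectangular A (a map between spaces of different dimensions),
# A* A is symmetric positive semi-definite, so singular values are defined.
rng = np.random.default_rng(7)
A = rng.standard_normal((6, 3))          # T in L(U, V), dim U = 3, dim V = 6

G = A.T @ A                              # T* T in L(U), a 3 x 3 matrix
assert np.allclose(G, G.T)                         # self-adjoint
assert np.all(np.linalg.eigvalsh(G) >= -1e-10)     # positive semi-definite

singular_values = np.sqrt(np.clip(np.linalg.eigvalsh(G), 0, None))
assert np.allclose(np.sort(singular_values),
                   np.sort(np.linalg.svd(A, compute_uv=False)))
```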

Spectral (or eigen-) decomposition

Let A ∈ Fn×n be self-adjoint. We may then express A as

A = ∑_{i=1}^{n} αi vi vi∗,

where the αi are eigenvalues of A, with respective orthonormal eigenvectors vi (over R, vi∗ = viᵀ).

To see this: Letting T(z) := Az, as T is self-adjoint, we have an orthonormal basis of eigenvectors E := {v1, . . . , vn}. Let B be the usual basis. Then,

A = M(T;B) = M(I;E,B)DM(I;B,E),

where D is diagonal, populated by the eigenvalues αi, and M(I;E,B) = [v1 · · · vn]. Matrix multiplication yields our result.
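The rank-one sum above can be reproduced directly (a NumPy sketch over R, with an arbitrary seeded symmetric matrix):

```python
import numpy as np

# Rebuild a symmetric A from its eigendecomposition as
# A = sum_i alpha_i v_i v_i^T.
rng = np.random.default_rng(8)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2                        # symmetric (self-adjoint over R)

w, V = np.linalg.eigh(A)                 # eigenvalues w, orthonormal columns V
A_rebuilt = sum(w[i] * np.outer(V[:, i], V[:, i]) for i in range(4))
assert np.allclose(A_rebuilt, A)
```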

The rest of the decompositions

The rest of the basic famous decompositions can be shown using typically algorithmic approaches. For example:

QR factorization: For any A ∈ Fm×n, we can get A = QR, with Q ∈ Fm×n having orthonormal columns and R ∈ Fn×n upper-triangular.

Cholesky factorization: Any positive definite A ∈ Fn×n may be factorized as A = LL∗, with L lower-triangular with non-negative diagonal elements. Note A = S∗S for some square S. Applying the QR result to S yields the result.
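Both factorizations are available directly in NumPy; a sketch on arbitrary seeded inputs (the small diagonal shift just guarantees positive definiteness):

```python
import numpy as np

rng = np.random.default_rng(9)

# QR: A = QR with orthonormal columns in Q and upper-triangular R.
A = rng.standard_normal((5, 3))
Q, R = np.linalg.qr(A)
assert np.allclose(Q.T @ Q, np.eye(3))   # orthonormal columns
assert np.allclose(np.tril(R, -1), 0)    # R upper-triangular
assert np.allclose(Q @ R, A)

# Cholesky: P = L L^T with L lower-triangular, non-negative diagonal.
S = rng.standard_normal((4, 4))
P = S.T @ S + 1e-6 * np.eye(4)           # positive definite (P = S*S form)
L = np.linalg.cholesky(P)
assert np.allclose(L @ L.T, P)
assert np.allclose(np.triu(L, 1), 0)     # L lower-triangular
assert np.all(np.diag(L) >= 0)
```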

For QR and Cholesky, see Horn and Johnson (1985, Ch. 2).

For the LU decomposition (a lot of technical details), see Horn and Johnson (1985, Ch. 3).

Lecture contents

1. Inner products: motivations, terms, and basicproperties

2. Projections, orthogonal complements, andrelated problems

3. Linear functionals and the adjoint

4. Normal operators and the spectral theorem

5. Positive operators and isometries

6. Some famous decompositions

References

Axler, S. (1997). Linear Algebra Done Right. Springer, 2nd edition.

Horn, R. A. and Johnson, C. R. (1985). Matrix Analysis. Cambridge University Press, 1st edition.

Luenberger, D. G. (1968). Optimization by Vector Space Methods. Wiley.

Magnus, J. R. and Neudecker, H. (1999). Matrix Differential Calculus with Applications in Statistics and Econometrics. Wiley, 3rd edition.
