
Math 113: Linear Algebra
Adjoints of Linear Transformations

Ilya Sherman

November 12, 2008

1 Recap

Last time, we discussed the Gram-Schmidt process.

For example, if we take V to be the space of polynomials of degree ≤ N from [−1, 1] to R with inner product

⟨f, g⟩ = ∫_{−1}^{1} f(x) g(x) dx,

and start with the basis e1 = 1, e2 = x, e3 = x², . . . , then by the Gram-Schmidt process we get the orthonormal basis

f1 = e1/‖e1‖ = 1/√2
f′2 = e2 − ⟨e2, f1⟩ f1
f2 = f′2/‖f′2‖ = √(3/2) x
f3 = α(3x² − 1)
f4 = β(5x³ − 3x)

(where α and β are constants that scale the polynomials to the appropriate length). These fi's are called Legendre polynomials. They arise often in studying systems with spherical symmetry; the functions fi(cos θ) are examples of “spherical harmonics”.

More generally, given any interval [A, B] and a function w(x) > 0 for x ∈ [A, B], we can apply the Gram-Schmidt process to polynomials on [A, B] with respect to the inner product

⟨f, g⟩ = ∫_A^B w(x) f(x) g(x) dx.

This gives “orthogonal polynomials”; these arise very often (for various A, B, w) in math and physics.
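As an illustrative sketch of the computation above (with w(x) = 1 on [−1, 1]; the helper names are my own, not from the lecture), Gram-Schmidt can be run numerically on polynomial coefficient arrays:

```python
import numpy as np
from numpy.polynomial import polynomial as P

def inner(p, q):
    # <p, q> = integral_{-1}^{1} p(x) q(x) dx, with p, q given as
    # coefficient arrays in increasing-degree order; odd powers of x
    # integrate to zero, even powers x^k integrate to 2/(k+1)
    r = P.polymul(p, q)
    return sum(c * 2.0 / (k + 1) for k, c in enumerate(r) if k % 2 == 0)

def gram_schmidt(polys):
    # Orthonormalize a list of polynomials with respect to inner()
    ortho = []
    for e in polys:
        f = np.asarray(e, dtype=float)
        for g in ortho:
            f = P.polysub(f, inner(f, g) * g)  # subtract the projection
        ortho.append(f / np.sqrt(inner(f, f)))
    return ortho

# Start from 1, x, x^2 as in the lecture
f1, f2, f3 = gram_schmidt([[1.0], [0.0, 1.0], [0.0, 0.0, 1.0]])
# f1 is the constant 1/sqrt(2); f2 is sqrt(3/2)*x; f3 is proportional
# to 3x^2 - 1, the next Legendre polynomial up to normalization
```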


2 The Adjoint of a Linear Transformation

We will now look at the adjoint (in the inner-product sense) of a linear transformation. A self-adjoint linear transformation has an orthonormal basis of eigenvectors v1, . . . , vn.

Earlier, we defined for T : V → W the adjoint T̂ : W* → V*. If V and W are inner product spaces, we can “reinterpret” the adjoint as a map T* : W → V. The motivation for this construction is the following: earlier we saw that a bilinear pairing X × Y → F (where X, Y are vector spaces over F) induces maps X → Y* and Y → X*. In the case F = R, an inner product on V, which gives a bilinear map V × V → R, gives an isomorphism of V and V*. Roughly, an inner product gives a way to identify V with V*.

Definition 1 (Adjoint). If V and W are finite-dimensional inner product spaces and T : V → W is a linear map, then the adjoint T* is the linear transformation T* : W → V satisfying, for all v ∈ V and w ∈ W,

⟨T(v), w⟩ = ⟨v, T*(w)⟩.

Lemma 2.1 (Representation Theorem). If V is a finite-dimensional inner product space and ℓ : V → F (F = R or C) is a linear functional, then there exists a unique w ∈ V so that ℓ(v) = ⟨v, w⟩ for all v ∈ V.

Proof. Left as an exercise (use an orthonormal basis).
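A sketch of the suggested exercise, using the convention (as elsewhere in these notes) that the inner product is conjugate-linear in the second slot:

```latex
\textbf{Existence.} Fix an orthonormal basis $e_1, \dots, e_n$ of $V$ and set
\[
  w = \sum_{i=1}^{n} \overline{\ell(e_i)}\, e_i .
\]
Then for each basis vector $e_j$,
\[
  \langle e_j, w \rangle
    = \sum_{i=1}^{n} \ell(e_i)\, \langle e_j, e_i \rangle
    = \ell(e_j),
\]
and since both $v \mapsto \ell(v)$ and $v \mapsto \langle v, w \rangle$ are
linear, they agree on all of $V$.

\textbf{Uniqueness.} If $\langle v, w \rangle = \langle v, w' \rangle$ for all
$v$, then $\langle v, w - w' \rangle = 0$ for all $v$; taking $v = w - w'$
gives $\|w - w'\|^2 = 0$, hence $w = w'$.
```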

This lemma is called a “representation theorem” because it shows that you can represent a linear functional ℓ ∈ V* by a vector w ∈ V.

Why does T* (as in the definition of the adjoint) exist? For any w ∈ W, consider ⟨T(v), w⟩ as a function of v ∈ V. It is linear in v. By the lemma, there exists some y ∈ V so that ⟨T(v), w⟩ = ⟨v, y⟩. Now we define T*(w) = y. This gives a function W → V; we need only check that it is linear.

Properties of T*:

1. If ei is an orthonormal basis for V and fj is an orthonormal basis for W, then the matrix of T with respect to ei, fj is the conjugate transpose of the matrix of T* with respect to fj, ei.

For example, if V = C², W = C², the inner product is ⟨(z1, w1), (z2, w2)⟩ = z1 z̄2 + w1 w̄2, and T is defined by the matrix

( 1  i )
( 0  i )

then T* is defined by

(  1   0 )
( −i  −i ).

2. If T1, T2 ∈ L(V, W) and λ1, λ2 ∈ F, then

(λ1 T1 + λ2 T2)* = λ̄1 T1* + λ̄2 T2*;

that is, the adjoint is conjugate-linear in T.

3. (T*)* = T (follows directly from the definition).


4. The image (range) of T is the orthogonal complement of the null space of its adjoint, and vice versa:

image(T) = null(T*)⊥
null(T) = image(T*)⊥

and similarly exchanging T and T*.
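A quick numerical sanity check of properties 1–3, sketched in NumPy with the C² example matrix above; here T* is computed as the conjugate transpose, per property 1:

```python
import numpy as np

adj = lambda M: M.conj().T   # matrix of the adjoint in orthonormal bases

# Property 1: for the T from the example on C^2, T* is the conjugate transpose
T = np.array([[1, 1j],
              [0, 1j]])
assert np.allclose(adj(T), np.array([[1, 0], [-1j, -1j]]))

# The defining identity <T(v), w> = <v, T*(w)>, with the inner product
# conjugate-linear in the second slot
inner = lambda u, v: u @ np.conj(v)
rng = np.random.default_rng(0)
v = rng.standard_normal(2) + 1j * rng.standard_normal(2)
w = rng.standard_normal(2) + 1j * rng.standard_normal(2)
assert np.isclose(inner(T @ v, w), inner(v, adj(T) @ w))

# Property 2: the adjoint is conjugate-linear in T
S = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
lam1, lam2 = 2 + 1j, -1 + 3j
assert np.allclose(adj(lam1 * T + lam2 * S),
                   np.conj(lam1) * adj(T) + np.conj(lam2) * adj(S))

# Property 3: (T*)* = T
assert np.allclose(adj(adj(T)), T)
```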

Most of the proofs are very similar to previous ones about T̂ (see Axler, end of Chapter 6). For example,

Proposition 1. image(T ) = null(T ∗)⊥.

Proof. Note that both image(T), null(T*) ⊆ W.

We will actually show that image(T)⊥ = null(T*); this implies the desired result, since (image(T)⊥)⊥ = image(T).

Recall that w ∈ image(T)⊥ if and only if ⟨T(v), w⟩ = 0 for all v ∈ V. By the definition of the adjoint, this holds if and only if ⟨v, T*(w)⟩ = 0 for all v ∈ V, i.e. T*(w) = 0, i.e. w ∈ null(T*).

Note that at the last step, we used the fact that if v0 ∈ V satisfies ⟨v, v0⟩ = 0 for all v, then v0 = 0. Indeed, this implies that ⟨v0, v0⟩ = 0, and hence, by an axiom of inner products, v0 = 0.
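The proposition can also be checked numerically. A sketch for a real matrix (so T* is just the transpose), using the fact that the SVD T = U S Vᵀ hands us orthonormal bases of both image(T) and null(T*) as columns of U:

```python
import numpy as np

# A rank-2 map T : R^3 -> R^4, built as a product of random full-rank factors
rng = np.random.default_rng(1)
T = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 3))

# In the SVD T = U S V^T, the first `rank` columns of U span image(T)
# and the remaining columns of U span null(T^T) = null(T*)
U, s, Vt = np.linalg.svd(T)
rank = int(np.sum(s > 1e-10))
image_T = U[:, :rank]
null_T_star = U[:, rank:]

# image(T) = null(T*)^perp: the spans are orthogonal, and their
# dimensions add up to dim W = 4
assert np.allclose(image_T.T @ null_T_star, 0)
assert image_T.shape[1] + null_T_star.shape[1] == 4
```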

3 Self-Adjoint

Recall that we want:

Theorem 3.1. If T : V → V (where V is a finite-dimensional inner product space over F) satisfies T = T* (“self-adjoint”), then there is an orthonormal basis of eigenvectors of T, and all eigenvalues are real.

Proof. Why are all eigenvalues real? Given an eigenvector v with eigenvalue λ, i.e. T(v) = λv, we can consider

λ⟨v, v⟩ = ⟨λv, v⟩ = ⟨T(v), v⟩ = ⟨v, T(v)⟩ = ⟨v, λv⟩ = λ̄⟨v, v⟩.

Note that ⟨T(v), v⟩ = ⟨v, T(v)⟩ because T is self-adjoint. Since v ≠ 0 we have ⟨v, v⟩ ≠ 0, so λ = λ̄, i.e. λ is real.

Now, let’s prove the rest of the theorem for F = C. Since C is algebraically closed, there exists an eigenvector v ∈ V. Let U = span(v)⊥. Then T preserves U: if u ∈ U, then ⟨T(u), v⟩ = ⟨u, T(v)⟩ = λ⟨u, v⟩ = 0 (using that λ is real), so T(u) ∈ U. Now, by induction on the dimension, we can find an orthonormal basis of eigenvectors of T restricted to U; together with v/‖v‖, this gives an orthonormal basis of eigenvectors of V.
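The theorem is easy to see in action numerically. A sketch using NumPy's `eigh`, its routine for Hermitian (self-adjoint) matrices:

```python
import numpy as np

# A self-adjoint matrix on C^2: A equals its conjugate transpose
A = np.array([[2.0, 1 - 1j],
              [1 + 1j, 3.0]])
assert np.allclose(A, A.conj().T)

# eigh returns real eigenvalues and orthonormal eigenvector columns,
# exactly the content of the theorem
eigvals, Q = np.linalg.eigh(A)
assert eigvals.dtype.kind == 'f'               # eigenvalues are real
assert np.allclose(Q.conj().T @ Q, np.eye(2))  # orthonormal basis
assert np.allclose(A @ Q, Q * eigvals)         # columns are eigenvectors
```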

This is perhaps the most important result in the course.
