1 | Vector Spaces
1.1 The Algebra of Matrices over a Field
Definition. By a field F , we mean a non-empty set of elements with two laws of combination,
which we call an addition + and a multiplication · satisfying:
(F1) To every pair of elements a, b ∈ F there is associated a unique element, called their sum, which we denote by a + b.
(F2) Addition is associative: (a + b) + c = a + (b + c).
(F3) Addition is commutative: a + b = b + a.
(F4) There exists an element, which we denote by 0, such that a + 0 = a for all a ∈ F.
(F5) For each a ∈ F there exists an element, which we denote by −a, such that a + (−a) = 0.
(F6) To every pair of elements a, b ∈ F there is associated a unique element, called their product, which we denote by ab, or a · b.
(F7) Multiplication is associative: (ab)c = a(bc).
(F8) Multiplication is commutative: ab = ba.
(F9) There exists an element different from 0, which we denote by 1, such that a · 1 = a for all a ∈ F.
(F10) For each a ∈ F, a ≠ 0, there exists an element, which we denote by a−1, such that a · a−1 = 1.
(F11) Multiplication is distributive with respect to addition: (a + b)c = ac + bc.
Remark. Note that in a field F , 0 + 0 = 0.
We write Q for the set of rational numbers, R for the set of real numbers and C for the set of complex numbers. These sets are fields. The rigorous definition and treatment of fields can be found in any abstract algebra course, including 2301337 Abstract Algebra I. The definition of a field was presented once in Linear Algebra I. In this course, F always denotes any of Q, R, C, or another field. Its members are called scalars. However, almost nothing essential is lost if we assume that F is the real field R or the complex field C.
Example 1.1.1. A non-empty subset F of C such that for any x, y ∈ F, x − y ∈ F and xy ∈ F, and for any non-zero z ∈ F, 1/z ∈ F, is also a field. It is called a subfield of C. For example, Q(i) = {a + bi : a, b ∈ Q} is a subfield of C.
Example 1.1.2. Let p be a prime and Fp = {0, 1, . . . , p − 1}. For a and b in Fp, we define
a + b = the remainder when a + b is divided by p, and
ab = the remainder when ab is divided by p.
Then (Fp, +, ·) is a finite field with p elements. Note that if p = 2, we have 1 + 1 = 0.
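The arithmetic of Fp can be sketched in Python. This is only an illustration of Example 1.1.2; the inverse via `pow(a, p - 2, p)` uses Fermat's little theorem, a fact not stated in the text.

```python
# A minimal sketch of the finite field F_p from Example 1.1.2, assuming p is prime.
# Addition and multiplication are ordinary integer operations reduced mod p.
def fp_add(a, b, p):
    return (a + b) % p

def fp_mul(a, b, p):
    return (a * b) % p

def fp_inv(a, p):
    # By Fermat's little theorem, a^(p-2) is the multiplicative inverse of a != 0 in F_p.
    return pow(a, p - 2, p)

p = 5
print(fp_add(3, 4, p))   # 7 mod 5 = 2
print(fp_mul(3, 4, p))   # 12 mod 5 = 2
print(fp_inv(3, p))      # 3 * 2 = 6 = 1 mod 5, so the inverse is 2
```

Note that `fp_add(1, 1, 2)` returns 0, matching the observation that 1 + 1 = 0 in F2.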
Definition. Let F be a field. An m × n (m by n) matrix A with m rows and n columns with
entries over F is a rectangular array of the form
A = ⎡ a11 · · · a1j · · · a1n ⎤
    ⎢  ⋮         ⋮         ⋮  ⎥
    ⎢ ai1 · · · aij · · · ain ⎥
    ⎢  ⋮         ⋮         ⋮  ⎥
    ⎣ am1 · · · amj · · · amn ⎦ ,
where aij ∈ F for all i ∈ {1, 2, . . . ,m} and j ∈ {1, 2, . . . , n}. We write Mm,n(F ) for the set of
m×n matrices with entries in F and we write Mn(F ) for Mn,n(F ) the set of square matrices of
order n.
Remark. As a shortcut, we often use the notation A = [aij ] to denote the matrix A with entries aij . Notice that when we refer to the matrix we use the surrounding brackets, as in "[aij ]", and when we refer to a specific entry we do not, as in "aij ".
Definition. Two m × n matrices A = [aij ] and B = [bij ] are equal if aij = bij for all i ∈ {1, 2, . . . ,m} and j ∈ {1, 2, . . . , n}.
Definition. The m× n zero matrix 0m×n ∈Mm,n(F ) is the matrix with 0F ’s everywhere,
0m×n = ⎡ 0 0 0 · · · 0 ⎤
       ⎢ 0 0 0 · · · 0 ⎥
       ⎢ ⋮ ⋮ ⋮       ⋮ ⎥
       ⎣ 0 0 0 · · · 0 ⎦ .
When m = n we write 0n as an abbreviation for 0n×n.
The n × n identity matrix In ∈ Mn(F ) is the matrix with 1’s on the main diagonal and 0’s
everywhere else,
In = ⎡ 1 0 0 · · · 0 ⎤
     ⎢ 0 1 0 · · · 0 ⎥
     ⎢ ⋮ ⋮ ⋮       ⋮ ⎥
     ⎣ 0 0 0 · · · 1 ⎦ .
Definition. Let A = [aij ] and B = [bij ] be m × n matrices and let r ∈ F be a scalar. The matrix A + rB is the matrix C ∈ Mm,n(F ) with entries C = [cij ] where
cij = aij + rbij .
Theorem 1.1.1. Let A, B and C be matrices of the same size, and let r and s be scalars in F . Then
(a) A + B = B + A
(b) (A + B) + C = A + (B + C)
(c) A + 0 = A
(d) r(A + B) = rA + rB
(e) r0 = 0 and 0A = 0
(f) 1A = A
(g) (r + s)A = rA + sA
(h) r(sA) = (rs)A = (sr)A = s(rA)
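The entrywise definition cij = aij + r·bij translates directly into code. A minimal sketch with plain nested lists; the helper name `mat_add_scaled` is our own, not from the text.

```python
# Entrywise computation of A + rB from the definition c_ij = a_ij + r * b_ij,
# using plain nested lists; a sketch, not a production matrix library.
def mat_add_scaled(A, B, r):
    m, n = len(A), len(A[0])
    return [[A[i][j] + r * B[i][j] for j in range(n)] for i in range(m)]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(mat_add_scaled(A, B, 2))  # [[11, 14], [17, 20]]
```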
Definition. Let A be an m × n matrix with columns ~a1,~a2, . . . ,~an and let ~x be a column vector in Fn. The product of A and ~x, denoted by A~x, is the linear combination of the columns of A using the corresponding entries in ~x as weights. That is,
A~x = [~a1 ~a2 · · · ~an] ⎡ x1 ⎤
                          ⎢ x2 ⎥
                          ⎢  ⋮ ⎥  := x1~a1 + x2~a2 + · · · + xn~an.
                          ⎣ xn ⎦
If B is an n × p matrix with columns ~b1,~b2, . . . ,~bp, then the product of A and B, denoted by
AB, is the m× p matrix with columns A~b1, A~b2, . . . , A~bp. In other words,
AB = A[~b1 ~b2 · · · ~bp] := [A~b1 A~b2 · · · A~bp].
The above definition of AB is good for theoretical work. When A and B have small sizes, the following method is more efficient when working by hand. Let A = [aij ] ∈ Mm,n(F ) and B = [bij ] ∈ Mn,p(F ). Then the matrix product AB is defined as the matrix C = [cij ] ∈ Mm,p(F ) with entries
cij = ∑_{l=1}^{n} ail blj ,
that is,
⎡ a11 a12 · · · a1n ⎤ ⎡ b11 · · · b1j · · · b1p ⎤   ⎡ c11 · · · c1p ⎤
⎢  ⋮   ⋮         ⋮  ⎥ ⎢ b21 · · · b2j · · · b2p ⎥ = ⎢  ⋮   cij   ⋮  ⎥
⎢ ai1 ai2 · · · ain ⎥ ⎢  ⋮         ⋮         ⋮  ⎥   ⎣ cm1 · · · cmp ⎦
⎢  ⋮   ⋮         ⋮  ⎥ ⎣ bn1 · · · bnj · · · bnp ⎦
⎣ am1 am2 · · · amn ⎦
If A is a square matrix of order n, then we write Ak for the k-fold product A · A · · · A (k copies of A).
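The entry formula cij = ∑ ail blj can be written out directly. A sketch with nested lists, not a production matrix library; the helper name `mat_mul` is our own.

```python
# The entrywise product rule c_ij = sum_l a_il * b_lj, written out directly.
def mat_mul(A, B):
    m, n, p = len(A), len(B), len(B[0])
    assert len(A[0]) == n  # inner sizes must agree for AB to be defined
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(p)]
            for i in range(m)]

A = [[1, 2, 3], [4, 5, 6]]      # 2 x 3
B = [[1, 0], [0, 1], [1, 1]]    # 3 x 2
print(mat_mul(A, B))            # [[4, 5], [10, 11]], a 2 x 2 matrix
```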
Theorem 1.1.2. Let A be m × n and let B and C have sizes for which the indicated sums and
products are defined.
(a) A(B + C) = AB + AC and (B + C)A = BA + CA
(b) r(AB) = (rA)B = A(rB) for any scalar r
(c) A 0n×k = 0m×k and 0k×m A = 0k×n
(d) Im A = A = A In
(e) A(BC) = (AB)C
Remarks. The properties above are analogous to properties of real numbers. But NOT ALL real number properties correspond to matrix properties.
1. It is not the case that AB always equals BA.
2. Even if A ≠ 0 and AB = AC, B may not equal C. (A must have an inverse!)
3. It is possible for AB = 0 even if A ≠ 0 and B ≠ 0. E.g.,
⎡1 0⎤ ⎡0 0⎤   ⎡0 0⎤
⎣0 0⎦ ⎣1 0⎦ = ⎣0 0⎦ .
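Remarks 1 and 3 can be checked on 2 × 2 examples in a few lines; `mul2` is a hypothetical helper name.

```python
# The two failures noted in the remarks, checked on 2x2 examples:
# AB != BA in general, and AB can be 0 with A != 0 and B != 0.
def mul2(A, B):
    return [[sum(A[i][l] * B[l][j] for l in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 0], [0, 0]]
B = [[0, 0], [1, 0]]
print(mul2(A, B))  # [[0, 0], [0, 0]]  -- AB = 0 although A != 0 and B != 0
print(mul2(B, A))  # [[0, 0], [1, 0]]  -- and BA != AB for this pair
```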
Definition. The transpose of an m × n matrix A is the n × m matrix obtained from A by interchanging the rows and columns. We denote the transpose of A by AT . That is, if A = [aij ]m×n, then AT = [bji]n×m where bji = aij for all i, j. Moreover,
~xT = ⎡ x1 ⎤T
      ⎢ x2 ⎥
      ⎢  ⋮ ⎥  = [x1 x2 · · · xm]
      ⎣ xm ⎦
and so if A = [~a1 ~a2 · · · ~an], then
AT = ⎡ ~a1T ⎤
     ⎢ ~a2T ⎥
     ⎢   ⋮  ⎥
     ⎣ ~anT ⎦ .
Theorem 1.1.3. Let A and B denote matrices whose sizes are appropriate for the following sums
and products.
(a) (AT )T = A
(b) (A + B)T = AT + BT
(c) (rA)T = rAT for any scalar r
(d) (AB)T = BTAT
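A quick numerical check of property (d), (AB)T = BTAT, using throwaway helpers on a made-up example:

```python
# Numerical check of (AB)^T = B^T A^T on a small example.
def transpose(A):
    return [list(row) for row in zip(*A)]

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 2], [3, 4], [5, 6]]   # 3 x 2
B = [[7, 8, 9], [0, 1, 2]]     # 2 x 3
lhs = transpose(mat_mul(A, B))
rhs = mat_mul(transpose(B), transpose(A))
print(lhs == rhs)  # True
```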
1.2 Axioms of a Vector Space
Definition. A vector space V over a field F is a nonempty set of elements called vectors, with two laws of combination, called vector addition (or addition) and scalar multiplication, satisfying the following conditions.
(A1) ∀~u,~v ∈ V, ~u + ~v ∈ V .
(A2) ∀~u,~v ∈ V, ~u + ~v = ~v + ~u.
(A3) ∀~u,~v, ~w ∈ V, ~u + (~v + ~w) = (~u + ~v) + ~w.
(A4) ∃~0 ∈ V, ∀~u ∈ V, ~u + ~0 = ~u = ~0 + ~u.
(A5) ∀~u ∈ V, ∃~u′ ∈ V, ~u + ~u′ = ~0 = ~u′ + ~u.
(SM1) ∀a ∈ F, ∀~u ∈ V, a~u ∈ V .
(SM2) ∀a ∈ F, ∀~u,~v ∈ V, a(~u + ~v) = a~u + a~v.
(SM3) ∀a, b ∈ F, ∀~u ∈ V, (a + b)~u = a~u + b~u.
(SM4) ∀a, b ∈ F, ∀~u ∈ V, (ab)~u = a(b~u).
(SM5) ∀~u ∈ V, 1~u = ~u (1 ∈ F ).
We call ~0 the zero vector and ~u′ the negative of ~u.
Theorem 1.2.1. Let V be a vector space over a field F . Then
1. (Cancellation) ∀~u,~v, ~w ∈ V, ~u+ ~w = ~v + ~w ⇒ ~u = ~v and
∀~u,~v, ~w ∈ V, ~w + ~u = ~w + ~v ⇒ ~u = ~v.
2. The zero vector and the negative of ~u are unique. We shall denote the negative of ~u by −~u.
3. ∀~v ∈ V,−(−~v) = ~v.
4. ∀~v ∈ V, 0~v = ~0.
5. ∀a ∈ F, a~0 = ~0.
6. ∀a ∈ F, ∀~v ∈ V, (−a)~v = −(a~v) = a(−~v). In particular, (−1)~v = −(1~v) = −~v.
7. ∀a ∈ F, ∀~v ∈ V, a~v = ~0⇒ (a = 0 ∨ ~v = ~0).
Examples 1.2.1. 1. For any field F and n ≥ 1, we have Fn is a vector space over F where
(x1, . . . , xn) + (y1, . . . , yn) = (x1 + y1, . . . , xn + yn)
and
a(x1, . . . , xn) = (ax1, . . . , axn)
for all (x1, . . . , xn), (y1, . . . , yn) ∈ Fn and a ∈ F
2. Let m, n ∈ N, let F be a field and Mm,n(F ) the set of all m × n matrices over F . Then Mm,n(F ) is a vector space over F under the usual addition and scalar multiplication of matrices.
3. [The space of functions from a set to a field] Let S be a nonempty set and F a field. Let F^S = {f | f : S → F}. Then F^S is a vector space over F by defining f + g and cf for functions f, g ∈ F^S and a scalar c ∈ F as follows:
(f + g)(t) = f(t) + g(t) and (cf)(t) = cf(t)
for all t ∈ S. The zero function from S into F is the zero vector of F^S and the negative of f ∈ F^S is −f defined by (−f)(t) = −f(t) for all t ∈ S.
4. [The sequence space] Let FN = {(xn) : (xn) is a sequence in F}. Then FN is a vector
space over F under the usual addition and scalar multiplication of sequences. That is, for
sequences (an) and (bn) in FN and a scalar c ∈ F ,
(an) + (bn) = (an + bn) and c(an) = (c an).
Its zero is the zero sequence (zn) where zn = 0 for all n and the negative of (an) is the
sequence (bn) given by bn = −an for all n.
5. Let n be a non-negative integer and Fn[x] be the set of polynomials over F of degree at most n. That is,
Fn[x] = {a0 + a1x + a2x2 + · · · + anxn : ai ∈ F for all i ∈ {0, 1, 2, . . . , n}}.
We define the addition and scalar multiplication by
p(x) + q(x) = (a0 + b0) + (a1 + b1)x + (a2 + b2)x2 + · · · + (an + bn)xn
and c(p(x)) = (ca0) + (ca1)x + (ca2)x2 + · · · + (can)xn
for all polynomials p(x) = a0 + a1x + a2x2 + · · · + anxn and q(x) = b0 + b1x + b2x2 + · · · + bnxn in Fn[x] and c ∈ F . Then Fn[x] is a vector space over F . Observe that for each positive integer n, we have Fn−1[x] ⊂ Fn[x].
6. [The space of polynomials over a field] Let F [x] be the set of all polynomials over F . That is,
F [x] = {a0 + a1x + a2x2 + · · · + anxn : n ≥ 0 and ai ∈ F for all i ∈ {0, 1, 2, . . . , n}}.
Then F [x] = ⋃_{n≥0} Fn[x]. If we use the addition and scalar multiplication defined for Fn[x], then F [x] is a vector space over F . The zero polynomial 0(x) = 0 + 0x + 0x2 + · · · is its zero vector and for f(x) = c0 + c1x + · · · + cnxn ∈ F [x], the negative of f(x) is (−f)(x) = (−c0) + (−c1)x + · · · + (−cn)xn.
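Fn[x] can be modeled as coefficient lists [a0, a1, . . . , an]. A sketch assuming both polynomials are stored with the same fixed length n + 1:

```python
# F_n[x] modeled as coefficient lists [a0, a1, ..., an]; addition and scalar
# multiplication are coefficientwise, exactly as defined above.
def poly_add(p, q):
    return [a + b for a, b in zip(p, q)]

def poly_scale(c, p):
    return [c * a for a in p]

p = [1, 0, 2]    # 1 + 2x^2
q = [3, 1, -2]   # 3 + x - 2x^2
print(poly_add(p, q))    # [4, 1, 0], i.e. 4 + x
print(poly_scale(3, p))  # [3, 0, 6]
```

Note that p + q here has degree 1 even though p and q both have degree 2; this possible drop in degree is why Fn[x] is defined with "degree at most n" rather than "degree exactly n".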
Theorem 1.2.2. Let (V1,+1, ·1), (V2,+2, ·2), . . . , (Vn,+n, ·n) be vector spaces over a field F and let V = V1 × V2 × · · · × Vn. For (~v1, ~v2, . . . , ~vn), (~w1, ~w2, . . . , ~wn) ∈ V and c ∈ F , we define the addition and scalar multiplication on V by
(~v1, ~v2, . . . , ~vn) + (~w1, ~w2, . . . , ~wn) = (~v1 +1 ~w1, ~v2 +2 ~w2, . . . , ~vn +n ~wn)
and c(~v1, ~v2, . . . , ~vn) = (c ·1 ~v1, c ·2 ~v2, . . . , c ·n ~vn).
Then V is a vector space over F with the zero vector ~0 = (~01,~02, . . . ,~0n), and the negative of (~v1, ~v2, . . . , ~vn) is (−~v1,−~v2, . . . ,−~vn). V is called the direct product of V1, V2, . . . , Vn.
1.3 Subspaces
Definition. Let V be a vector space over a field F . A subspace of V is a subset of V which
is itself a vector space over F with the operations of vector addition and scalar multiplication
of V .
Theorem 1.3.1. Let W be a nonempty subset of V . Then the following statements are equivalent.
(i) W is a subspace of V .
(ii) ∀~u,~v ∈W, ∀c ∈ F, ~u+ ~v ∈W and c~u ∈W .
(iii) ∀~u,~v ∈W, ∀c, d ∈ F, c~u+ d~v ∈W .
(iv) ∀~u,~v ∈W, ∀c ∈ F, c~u+ ~v ∈W .
Examples 1.3.1. 1. For any vector space V over a field F , both {~0V } and V are subspaces of V , called the trivial subspaces.
2. For a non-negative integer n, Fn[x] is a subspace of F [x].
3. Let α ∈ F and Vα = {(x1, x2) : x1 = αx2}. Then Vα is a subspace of F2.
4. Let Bd(R) = {(an) ∈ RN : (an) is a bounded sequence},
C(R) = {(an) ∈ RN : (an) is a convergent sequence} and
C0(R) = {(an) ∈ RN : an → 0 as n → ∞}.
Then Bd(R), C(R) and C0(R) are subspaces of RN.
5. Let C0(−∞,∞) = {f ∈ RR : f is continuous on (−∞,∞)}. Then C0(−∞,∞) is a subspace of RR.
6. Let W = {f : R→ R | f ′′ = f}. Then W is a subspace of RR.
7. Let W1 = {p(x) ∈ F [x] : p(1) = 0} and W2 = {p(x) ∈ F [x] : p(0) = 1}. Then W1 is a subspace of F [x] but W2 is not.
8. Let A ∈ Mm,n(F ). Then NulA = {~x ∈ Fn : A~x = ~0m} is a subspace of Fn, called the null
space of A.
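Membership in NulA is a direct computation: ~x ∈ NulA iff A~x = ~0. A sketch with a made-up matrix A whose second row is twice its first:

```python
# Membership test for Nul A: x is in the null space iff A x = 0.
def mat_vec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[1, 2, -1], [2, 4, -2]]   # second row is twice the first (made-up example)
x = [1, 0, 1]                  # each entry of A x is 1 - 1 = 0, so x is in Nul A
y = [1, 1, 1]
print(mat_vec(A, x))  # [0, 0]
print(mat_vec(A, y))  # [2, 4], so y is not in Nul A
```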
Theorem 1.3.2. Let V be a vector space over a field F . The intersection of any collection of
subspaces of V is a subspace of V .
Definition. For non-empty subsets S1, S2, . . . , Sn of V , we define
S1 + S2 + · · · + Sn = ∑_{i=1}^{n} Si = {x1 + x2 + · · · + xn : x1 ∈ S1, x2 ∈ S2, . . . , xn ∈ Sn}.
Theorem 1.3.3. If W1, . . . ,Wn are subspaces of V , then W1 + · · ·+Wn is a subspace of V .
Remark. W1 +W2 is the smallest subspace of V containing W1 and W2, i.e., any subspace con-
taining W1 and W2 must contain W1 +W2.
Definition. Let V be a vector space over a field F .
A vector ~v is said to be a linear combination of ~v1, . . . , ~vn ∈ V if
∃a1, . . . , an ∈ F,~v = a1~v1 + · · ·+ an~vn.
Definition. Let S ⊆ V . The subspace of V spanned by S is defined to be the intersection of
all subspaces of V containing S. We denote this subspace by Span S.
For ~v1, . . . , ~vp ∈ V , we call Span{~v1, . . . , ~vp} the subspace of V spanned by ~v1, . . . , ~vp.
Since ∅ ⊂ {~0V }, which is the smallest of all subspaces of V , we have Span ∅ = {~0V }. Moreover, if W is a subspace of V , then SpanW = W . In particular, Span(SpanS) = SpanS.
Remark. Let S be a non-empty subset of V and let W be a subspace of V containing S. Note that for c1, . . . , cm ∈ F and ~v1, . . . , ~vm ∈ S, we have ~v1, . . . , ~vm ∈ W and so
c1~v1 + · · · + cm~vm ∈ W.
Thus, Y := {c1~v1 + · · ·+ cm~vm : c1, . . . , cm ∈ F and ~v1, . . . , ~vm ∈ S for some m ∈ N} ⊆ W for all
subspaces W of V containing S. Hence, Y ⊆ SpanS.
Theorem 1.3.4. SpanS is the smallest subspace of V containing S. That is, any subspace of V containing S must also contain SpanS. Moreover, Span ∅ = {~0} and
SpanS = {c1~v1 + · · · + cm~vm : c1, . . . , cm ∈ F and ~v1, . . . , ~vm ∈ S for some m ∈ N} if S ≠ ∅.
In particular,
Span{~v1, . . . , ~vp} = {c1~v1 + · · ·+ cp~vp : c1, . . . , cp ∈ F}.
Definition. Let A = [~a1 ~a2 · · · ~an] be an m × n matrix over a field F . Then ~ai ∈ Fm for all
i = 1, 2, . . . , n and Span{~a1,~a2, . . . ,~an} is a subspace of Fm, called the column space of A. We
denote this space by ColA.
By Theorem 1.3.4, we have
ColA = {c1~a1 + c2~a2 + · · ·+ cn~an : c1, c2, . . . , cn ∈ F}.
Definition. Let V and W be vector spaces over a field F . A function T : V →W is said to be a
linear transformation if the following conditions are satisfied:
(i) ∀~u,~v ∈ V, T (~u+ ~v) = T (~u) + T (~v) and
(ii) ∀~u ∈ V, ∀c ∈ F, T (c~u) = cT (~u).
Theorem 1.3.5. Let V and W be vector spaces over a field F and T : V → W a linear transfor-
mation. Then T (~0V ) = ~0W and ∀~v ∈ V, T (−~v) = −T (~v).
Theorem 1.3.6. Let V and W be vector spaces over a field F and T : V → W a function. The following statements are equivalent.
(i) T is a linear transformation.
(ii) ∀~u,~v ∈ V, ∀c ∈ F, T (~u + ~v) = T (~u) + T (~v) ∧ T (c~u) = cT (~u).
(iii) ∀~u,~v ∈ V, ∀c, d ∈ F, T (c~u + d~v) = cT (~u) + dT (~v).
(iv) ∀~u,~v ∈ V, ∀c ∈ F, T (c~u + ~v) = cT (~u) + T (~v).
Definition. Let V and W be vector spaces over a field F and T : V → W a linear transforma-
tion. Recall that the image or range of T is given by
imT = rangeT = {~w ∈W : ∃~v ∈ V, T (~v) = ~w} = {T (~v) : ~v ∈ V }.
The kernel of T is defined by
kerT = {~v ∈ V : T (~v) = ~0W } = T−1({~0W }).
Theorem 1.3.7. The kernel of T is a subspace of V and the image of T is a subspace of W .
Example 1.3.2. Let A = [~a1 ~a2 · · · ~an] be an m × n matrix over a field F .
Then the matrix transformation T : Fn → Fm given by
T (~x) = A~x
is a linear transformation. Its kernel is
NulA = {~x ∈ Fn : A~x = ~0m}
the null space of A, and its image is
imT = {A~x : ~x ∈ Fn} = {x1~a1 + · · ·+ xn~an : x1, . . . , xn ∈ F} = ColA,
which is the column space of A.
Remark. Since the image of T : ~x ↦ A~x is the column space of A,
T is onto ⇔ imT = Fm ⇔ ColA = Fm.
If ColA = Fm, we say that the columns of A span Fm.
Example 1.3.3. Let T : R[x]→ R be defined by
T (p(x)) = p(1)
for all p(x) ∈ R[x]. Show that T is an onto linear transformation and find its kernel.
Example 1.3.4. Let V be the space of differentiable functions on (−∞,∞) with continuous
derivative. Define a function T : V → C0(−∞,∞) by
T (f(x)) = f ′(x)
for all f ∈ V . Show that T is an onto linear transformation and find its kernel.
Definition. Let V be a vector space over a field F . Vectors ~u1, ~u2, . . . , ~un in V are linearly
independent if
∀c1, c2, . . . , cn ∈ F, c1~u1 + c2~u2 + · · ·+ cn~un = ~0⇒ c1 = c2 = · · · = cn = 0.
If there is a linear combination c1~u1 + c2~u2 + · · · + cn~un = ~0 with the scalars c1, c2, . . . , cn not all zero, we say that ~u1, ~u2, . . . , ~un are linearly dependent.
Example 1.3.5. Determine whether the set of vectors
{(1, 1, 1), (0, 1, 1), (0, 0, 1)}
is dependent or independent in R3.
Example 1.3.6. Determine whether the set of vectors
~u1 = ⎡2 2 1⎤ , ~u2 = ⎡0 0 1⎤ , ~u3 = ⎡1 1 1⎤
      ⎣0 0 1⎦        ⎣0 0 1⎦        ⎣0 0 1⎦
is dependent or independent in M2,3(R).
Remarks. 1. The empty set is linearly independent.
2. If ~0V is in S, then S is linearly dependent.
3. The singleton {~0V } is linearly dependent and {~u} is linearly independent unless ~u = ~0V .
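Independence in Fn can be tested by row reduction: the vectors are independent iff the matrix having them as rows has full row rank. A sketch over Q using exact `Fraction` arithmetic; the `rank` routine is our own, not from the text.

```python
from fractions import Fraction

# Vectors are linearly independent iff the matrix with them as rows has
# full row rank; a small Gaussian-elimination rank, exact over Q.
def rank(rows):
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def independent(vectors):
    return rank(vectors) == len(vectors)

print(independent([(1, 1, 1), (0, 1, 1), (0, 0, 1)]))  # True (Example 1.3.5)
print(independent([(1, 2), (2, 4)]))                   # False: (2,4) = 2*(1,2)
```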
Theorem 1.3.8. Let V be a vector space over a field F and S1 ⊆ S2 ⊆ V . Then
1. SpanS1 ⊆ SpanS2.
2. If S1 is linearly dependent, then S2 is linearly dependent.
3. If S2 is linearly independent, then S1 is linearly independent.
Example 1.3.7. Consider the space of continuous functions C0[−1, 1]. Determine whether the functions 1, x, x2 are dependent or independent.
Remark. Observe that the question of dependence and independence of sets of functions is re-
lated to the interval over which the space is defined. Consider the same interval [−1, 1] with the
functions f , g and h defined as follows:
f(x) = 1, −1 ≤ x ≤ 1,
g(x) = { 0 if −1 ≤ x ≤ 0,   and   h(x) = { 0 if −1 ≤ x ≤ 0,
        { x if 0 ≤ x ≤ 1,                { x2 if 0 ≤ x ≤ 1.
These functions are linearly independent. However, if we restrict these same functions to the
interval [−1, 0], then they are dependent because
0 · f(x) + 1 · g(x) + 0 · h(x) = 0
for −1 ≤ x ≤ 0.
Theorem 1.3.9. Let T : V →W be a linear transformation. Then T is 1-1⇔ kerT = {~0V }.
1.4 Bases and Dimensions
Definition. Let V be a vector space over F . A subset B ⊂ V is a basis for V if B is linearly
independent and Span B = V .
Theorem 1.4.1. Let V be a vector space over a field F and B = {~v1, . . . , ~vn} ⊆ V linearly
independent.
1. If ~v ∈ SpanB, then there exist unique c1, . . . , cn ∈ F such that
~v = c1~v1 + · · ·+ cn~vn.
2. If B is a basis for V , then every vector in V can be expressed uniquely as a linear combination
of ~v1, . . . , ~vn.
3. Let W be a vector space over a field F and ~w1, . . . , ~wn ∈W (not necessarily distinct). If B is a
basis for V , then there is a unique linear transformation from V to W such that T (~vi) = ~wi
for all i ∈ {1, . . . , n}.
Examples 1.4.1. 1. Find a linear transformation T that satisfies the following conditions
(i) T : C → R2[x] with T (1 − i) = 2x2 and T (1 + i) = 1 − x,
(ii) T : R2[x] → R2 with T (1) = (2, 1), T (1 − x) = (0, 1) and T (x + x2) = (1, 1).
2. Let T : R1[x] → R3 be a linear transformation with
T (2− x) = (1,−1, 1) and T (1 + x) = (0, 1,−1).
Find T (−1 + 2x).
Lemma 1.4.2. 1. If ~u,~v1, . . . , ~vn ∈ S and ~u = c1~v1 + · · · + cn~vn, then SpanS = Span(S ∖ {~u}).
2. If S is a linearly independent subset of V and ~u ∉ SpanS, then S ∪ {~u} is linearly independent.
Theorem 1.4.3. Let V be a vector space over F .
1. If B is a linearly independent subset of V which is maximal with respect to the property of being linearly independent (i.e., for every S ⊆ V with B ⊆ S and S ≠ B, S is linearly dependent), then B is a basis of V .
2. If B is a spanning set for V which is minimal with respect to the property of spanning (i.e., ∀S ⊆ B, S ≠ B ⇒ SpanS ⊊ V ), then B is a basis of V .
Theorem 1.4.4. [Replacement Theorem] Let V be a vector space that is spanned by a set G containing exactly n vectors. Let L be a linearly independent subset of V with m vectors. Then
1. m ≤ n,
2. there exists a subset H of G with n−m vectors such that L ∪H spans V .
Example 1.4.2. Extend {(1, 1, 1)} to a basis of R3.
Corollary 1.4.5. If a vector space V has a finite spanning set {~v1, . . . , ~vn}, then
1. {~v1, . . . , ~vn} has a subset which is a basis,
2. any linearly independent set in V can be extended to a basis,
3. V has a basis,
4. any two bases have the same finite number of elements, necessarily ≤ n.
Definition. If a vector space V has a finite spanning set, then we say that V is finite-
dimensional, and the number of elements in a basis is called the dimension of V , written
dimV . If V has no finite spanning set, we say that V is infinite-dimensional.
Examples 1.4.3. 1. The vector space {~0} has dimension zero with basis ∅.
2. The vector space Fn, n ≥ 1, is of dimension n with standard basis {(1, 0, . . . , 0), (0, 1, . . . , 0), . . . , (0, 0, . . . , 1)}. Similarly, Mm,n(F ) is of dimension mn where m, n ∈ N.
3. The vector space Fn[x] is of dimension n + 1 with standard basis {1, x, x2, . . . , xn}.
4. The vector spaces FN and F [x] are infinite-dimensional. A basis for F [x] is {1, x, x2, . . . }.
5. If we consider C as a vector space over C, it has dimension one with basis {1}. But if we consider C as a vector space over R, it has dimension two with basis {1, i}.
Remark. The above corollary is valid for a "finite" dimensional vector space. For a general (finite/infinite dimensional) vector space V , consider L = {L ⊆ V : L is linearly independent}. Then ∅ ∈ L . Partially order L by ⊆. We now show that every chain in L has an upper bound. Let C be a chain in L and consider ⋃C . Let ~v1, . . . , ~vn ∈ ⋃C and c1, . . . , cn ∈ F be such that c1~v1 + · · · + cn~vn = ~0V . Suppose ~vi ∈ Li for some Li ∈ C for all i ∈ {1, . . . , n}. Since C is a chain, we may suppose that L1 ⊆ . . . ⊆ Ln. Thus, ~v1, . . . , ~vn are in Ln, which is a linearly independent set. This implies c1 = · · · = cn = 0. Hence, ⋃C is a linearly independent set, so ⋃C is in L . By Zorn's lemma ("If a partially ordered set P has the property that every chain (i.e., totally ordered subset) has an upper bound in P , then P contains at least one maximal element."), L contains a maximal element, say B. This is a maximal linearly independent subset of V . By Theorem 1.4.3 (1), B is a basis for V . Hence, every vector space has a basis. Note that a basis for FN exists in this way and is not constructible explicitly.
Corollary 1.4.6. If V is a finite-dimensional vector space with dimV = n, then any spanning
set of n elements is a basis of V , and any linearly independent set of n elements is a basis of V .
Consequently, if W is an n-dimensional subspace of V , then W = V .
Corollary 1.4.7. If V is a finite-dimensional vector space and U is a proper subspace of V , then U is finite-dimensional and dimU < dimV .
Theorem 1.4.8. If W1 and W2 are finite dimensional subspaces of a vector space V over a field F ,
then W1 +W2 is finite dimensional and
dim(W1 +W2) = dimW1 + dimW2 − dim(W1 ∩W2).
Example 1.4.4. Consider two subspaces of R5, written here as sets of 5-tuples:
W1 = {(a, a − b, b, a + b, 0) ∈ R5 : a, b ∈ R} and W2 = {(c, d, 0, e, d − e) ∈ R5 : c, d, e ∈ R}.
Find bases for W1, W2 and W1 ∩W2. Determine the dimension of W1 +W2.
Definition. Let V and W be vector spaces over a field and T : V →W a linear transformation.
If V is finite dimensional, the rank of T , denoted by rankT , is dim(imT ) and the nullity of T ,
denoted by nullity T , is dim(kerT ).
Theorem 1.4.9. Let V and W be vector spaces over a field F and T : V → W a linear transfor-
mation. If V is finite dimensional, then
rankT + nullity T = dimV.
Theorem 1.4.10. Let V and W be finite dimensional and T : V → W a linear transformation
and dimV = dimW . Then T is one-to-one⇔ T is onto.
Corollary 1.4.11. If V is finite dimensional, S and T are linear transformations from V to V , and
T ◦ S is the identity map, then T = S−1.
From Theorem 1.4.1, we know that the representation of a given vector ~v ∈ V in terms of a given basis is unique.
Definition. Let V be an n-dimensional vector space over a field F with ordered basis B = {~v1, . . . , ~vn}. Then ∀~v ∈ V, ∃!(c1, . . . , cn) ∈ Fn, ~v = c1~v1 + · · · + cn~vn. The vector
[~v]B = ⎡ c1 ⎤
        ⎢  ⋮ ⎥  ∈ Fn
        ⎣ cn ⎦
is called the coordinate vector of ~v relative to the ordered basis B.
Theorem 1.4.12. For ~v, ~w ∈ V and c ∈ F , we have [~v + ~w]B = [~v]B + [~w]B and [c~v]B = c[~v]B.
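A concrete coordinate computation in R2 for the made-up ordered basis B = {(1, 1), (1, −1)}, solving c1~b1 + c2~b2 = ~v by Cramer's rule; the helper name `coords_2d` is our own.

```python
# Coordinates of v relative to the ordered basis B = {(1,1), (1,-1)} of R^2,
# found by solving the 2x2 system c1*b1 + c2*b2 = v via Cramer's rule.
def coords_2d(b1, b2, v):
    det = b1[0] * b2[1] - b2[0] * b1[1]   # nonzero since b1, b2 form a basis
    c1 = (v[0] * b2[1] - b2[0] * v[1]) / det
    c2 = (b1[0] * v[1] - v[0] * b1[1]) / det
    return (c1, c2)

c1, c2 = coords_2d((1, 1), (1, -1), (3, 1))
print(c1, c2)  # 2.0 1.0, since 2*(1,1) + 1*(1,-1) = (3,1)
```

By Theorem 1.4.12, these coordinates are additive: [~v + ~w]B = [~v]B + [~w]B.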
Definition. A one-to-one linear transformation from V onto W is called an isomorphism. If
there exists an isomorphism from V onto W , then we say that V is isomorphic to W and we
write V ∼= W .
Note that ∼= is an equivalence relation.
Theorem 1.4.13. Let V be an n-dimensional vector space over F .
If B is a basis for V , then the map ~v 7→ [~v]B is an isomorphism from V onto Fn. Hence, V ∼= Fn.
Therefore, the theory of finite-dimensional vector spaces can be studied from column vectors
and matrices which we shall pursue in the next chapter.
Corollary 1.4.14. If V and W are finite dimensional, then dimV = dimW ⇔ V ∼= W .
Exercises for Chapter 1. 1. Let V = R+, the set of all positive real numbers. Define a vector addition and a scalar multiplication on V as
v ⊕ w = vw and α ⊙ v = vα
for all positive real numbers v and w, and α ∈ R. Show that (V, ⊕, ⊙) is a vector space over R.
2. Let V be a vector space over a field F . For c ∈ F and ~v ∈ V , if c~v = ~v, prove that c = 1 or ~v = ~0V .
3. Which of the following are subspaces of M2(R)?
(a) {A ∈ M2(R) : detA = 0}
(b) {A ∈ M2(R) : A = AT }
(c) {A ∈ M2(R) : A = −AT }
(d) {A ∈ M2(R) : A2 = A}
4. Which of the following are subspaces of RN?
(a) All sequences like (1, 0, 1, 0, . . . ) that include infinitely many zeros.
(b) {(an) ∈ RN : ∃n0 ∈ N, ∀j ≥ n0, aj = 0}.
(c) All decreasing sequences: aj+1 ≤ aj for all j ∈ N.
(d) All arithmetic sequences: {(an) ∈ RN : ∃a, d ∈ R, ∀n ∈ N, an = a + (n − 1)d}.
(e) All geometric sequences: {(an) ∈ RN : ∃a, r ∈ R, ∀n ∈ N, r ≠ 0 ∧ an = ar^(n−1)}.
5. Which of the following are subspaces of V = C0[0, 1]?
(a) {f ∈ V : f(0) = 0}
(b) {f ∈ V : ∀x ∈ [0, 1], f(x) ≥ 0}
(c) All increasing functions: ∀x, y ∈ [0, 1], x < y ⇒ f(x) ≤ f(y).
6. Let V and W be vector spaces over a field F and T : V → W a linear transformation.
(a) If V1 is a subspace of V , then T (V1) = {T (~x) : ~x ∈ V1} is a subspace of W .
(b) If W1 is a subspace of W , then T−1(W1) = {~x ∈ V : T (~x) ∈ W1} is a subspace of V .
7. If L,M and N are three subspaces of a vector space V such that M ⊆ L, then show that
L ∩ (M +N) = (L ∩M) + (L ∩N) = M + (L ∩N).
Also give an example in which the result fails to hold when M ⊄ L. (Hint. Consider Vα of F2.)
8. Let S1 and S2 be subsets of a vector space V . Prove that Span(S1 ∪ S2) = SpanS1 + SpanS2.
9. If ~v1, ~v2, ~v3 ∈ V such that ~v1 + ~v2 + ~v3 = ~0, prove that Span{~v1, ~v2} = Span{~v2, ~v3}.
10. Let S = {~v1, . . . , ~vn} and c1, . . . , cn ∈ F ∖ {0}. Prove that:
(a) SpanS = Span{c1~v1, . . . , cn~vn}
(b) S is linearly independent ⇔ {c1~v1, . . . , cn~vn} is linearly independent.
11. If {~y,~v1, . . . , ~vn} is linearly independent, show that {~y + ~v1, . . . , ~y + ~vn} is also linearly independent.
12. Determine (with reason or counterexample) whether the following statements are TRUE or FALSE.
(a) If W1 and W2 are subspaces of V , then W1 ∪ W2 is a subspace of V .
(b) If {~v1, ~v2, ~v3} is a basis of R3, then {~v1, ~v1 + ~v2, ~v1 + ~v2 + ~v3} is a basis of R3.
13. Determine whether the following subsets are linearly independent.
(a) {(1, i,−1), (1 + i, 0, 1 − i), (i,−1,−i)} in C3
(b) {x, sinx, cosx} in C0(R)
14. Let V be a vector space over a field F and let ~v1, ~v2, . . . , ~vn be vectors in V . If ~w ∈ Span{~v1, ~v2, . . . , ~vn} ∖ Span{~v2, . . . , ~vn}, then ~v1 ∈ Span{~w,~v2, . . . , ~vn} ∖ Span{~v2, . . . , ~vn}.
15. Prove that if U and V are finite dimensional vector spaces, then dim(U × V ) = dimU + dimV .
16. Find a basis and the dimension of the following subspaces of M2(R).
(a) {A ∈ M2(R) : A = AT }
(b) {A ∈ M2(R) : A = −AT }
(c) {A ∈ M2(R) : ∀B ∈ M2(R), AB = BA}
17. Let B ∈ M2(R) and W = {A ∈ M2(R) : AB = BA}. Prove that W is a subspace of M2(R) and dimW ≥ 2.
18. Find a basis for the subspace W = {p(x) ∈ R3[x] : p(2) = 0} and extend to a basis for R3[x].
19. Let W1 = Span{(1, 0, 2), (1,−2, 2)} and W2 = Span{(1, 1, 0), (0, 1,−1)} in R3. Find dim(W1 ∩ W2) and dim(W1 + W2).
20. If T : V → W is a linear transformation and B is a basis for V , prove that SpanT (B) = imT .
21. Let T : R2[x] → R3[x] be given by T (p(x)) = xp(x).
(a) Prove that T is a linear transformation and determine its rank and nullity.
(b) Does T−1 exist? Explain.
22. Suppose that U and V are subspaces of R13, with dimU = 7 and dimV = 8.
(a) What are the smallest and largest possible dimensions of U ∩ V ? Explain.
(b) What are the smallest and largest possible dimensions of U + V ? Explain.
23. If V and W are finite-dimensional vector spaces such that dimV > dimW , then there is no one-to-one linear transformation T : V →W .
24. Let U and W be subspaces of a vector space V . If dimV = 3, dimU = dimW = 2 and U ≠ W , prove that dim(U ∩ W ) = 1.
25. Let U and W be subspaces of a vector space V such that U ∩ W = {~0}. Assume that ~u1, ~u2 are linearly independent in U and ~w1, ~w2, ~w3 are linearly independent in W .
(a) Prove that {~u1, ~u2, ~w1, ~w2, ~w3} is a linearly independent set in V .
(b) If dimV = 5, show that dimU = 2 and dimW = 3.
2 | Inner Product Spaces
2.1 Inner Products
We shall need the following properties of complex numbers.
Proposition 2.1.1. Let z = a + bi where a, b ∈ R.
1. Re z = a (real part) and Im z = b (imaginary part).
2. The conjugate z̄ = a − bi, and the absolute value |z| = √(a2 + b2). Moreover, zz̄ = |z|2.
3. The conjugate of z̄ is z, and |z| = 0 ⇔ a = b = 0.
4. If z, w ∈ C, then (z + w)‾ = z̄ + w̄ and (zw)‾ = z̄w̄.
Definition. Let F = R or C and let V be a vector space over F . Let ~u and ~v be vectors in V .
An inner product or scalar product on V is a function from V ×V to F , denoted by 〈·, ·〉, with
following properties:
(IN1) ∀~u,~v, ~w ∈ V, 〈~u + ~v, ~w〉 = 〈~u, ~w〉 + 〈~v, ~w〉.
(IN2) ∀~u,~v ∈ V, ∀c ∈ F, 〈c~u,~v〉 = c〈~u,~v〉.
(IN3) ∀~u,~v ∈ V, 〈~u,~v〉 = 〈~v, ~u〉‾, where ‾ is complex conjugation.
(IN4) ∀~u ∈ V, 〈~u, ~u〉 ≥ 0 and [〈~u, ~u〉 = 0 ⇒ ~u = ~0].
A vector space over F , in which an inner product is defined, is called an inner product space.
Remarks. 1. For all ~u,~v ∈ V , 〈~0, ~u〉 = 0 = 〈~u,~0〉 and 〈~u,~v〉 = 0⇔ 〈~v, ~u〉 = 0.
2. If F = R, then (IN3) reads ∀~u,~v ∈ V, 〈~u,~v〉 = 〈~v, ~u〉.
Example 2.1.1. Consider the complex vector space Cn of n-tuples of complex numbers. Let ~u = (u1, u2, . . . , un) and ~v = (v1, v2, . . . , vn). We define
〈~u,~v〉 = u1v̄1 + u2v̄2 + · · · + unv̄n.
Show that this is an inner product.
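A sketch of this inner product on C2, with a spot check of the conjugate-symmetry axiom (IN3); the vectors are arbitrary examples.

```python
# The standard inner product on C^n from Example 2.1.1:
# <u, v> = sum of u_i * conj(v_i), with a check of axiom (IN3).
def inner(u, v):
    return sum(ui * vi.conjugate() for ui, vi in zip(u, v))

u = [1 + 1j, 2]
v = [3, 1 - 2j]
print(inner(u, v))                             # (5+7j)
print(inner(u, v) == inner(v, u).conjugate())  # True, axiom (IN3)
```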
Remark. If we consider, on the other hand, Rn the space of n-tuples of real numbers, we have a
real-valued scalar product 〈~u,~v〉 = u1v1 + u2v2 + . . .+ unvn and the verification of the properties
is exactly like Example 2.1.1, where all conjugation symbols are removed.
Example 2.1.2. Consider V = C0[a, b] the vector space of real-valued continuous functions de-
fined on the interval [a, b]. Let
〈f, g〉 = ∫_a^b f(x)g(x) dx.
Show that this defines an inner product.
We can add to the list of properties of the scalar product by proving some theorems, assuming
of course that we are dealing with a complex vector space with a scalar product.
Theorem 2.1.2. 1. ∀~u,~v, ~w ∈ V, 〈~u,~v + ~w〉 = 〈~u,~v〉 + 〈~u, ~w〉.
2. ∀~u,~v ∈ V, ∀c ∈ F, 〈~u, c~v〉 = c̄〈~u,~v〉.
3. (∀~u ∈ V, 〈~u,~v〉 = 0) ⇒ ~v = ~0.
4. (∀~u ∈ V, 〈~u,~v〉 = 〈~u, ~w〉) ⇒ ~v = ~w. In fact, if 〈~v − ~w,~v〉 = 〈~v − ~w, ~w〉, then ~v = ~w.
Remark. Let c1, c2 ∈ F and ~u,~v ∈ V . Then
〈c1~u + c2~v, c1~u + c2~v〉 = c1c̄1〈~u, ~u〉 + c1c̄2〈~u,~v〉 + c2c̄1〈~v, ~u〉 + c2c̄2〈~v,~v〉.
Moreover, if 〈~u,~v〉 = 0, then 〈~v, ~u〉 = 0, so
〈c1~u + c2~v, c1~u + c2~v〉 = c1c̄1〈~u, ~u〉 + c2c̄2〈~v,~v〉 = |c1|2〈~u, ~u〉 + |c2|2〈~v,~v〉.
The quantity 〈~u, ~u〉 is non-negative and is zero if and only if ~u = ~0. Therefore, we associate
with it the square of the length of the vector.
Definition. For ~v ∈ V , we define the length or norm of ~v to be ‖~v‖ = √〈~v,~v〉.
Some of the properties of the norm are given by the next theorem.
Theorem 2.1.3. If V is an inner product space over F , then the norm ‖ · ‖ has the following
properties:
1. ∀~u ∈ V, ‖~u‖ ≥ 0 and ‖~u‖ = 0 ⇔ ~u = ~0.
2. ∀~u ∈ V, ∀a ∈ F, ‖a~u‖ = |a|‖~u‖.
3. ∀~u,~v ∈ V, |〈~u,~v〉| ≤ ‖~u‖‖~v‖ (the Cauchy-Schwarz inequality).
4. ∀~u,~v ∈ V, ‖~u+ ~v‖ ≤ ‖~u‖+ ‖~v‖ (the triangle inequality).
Example 2.1.3. Let f be a real-valued continuous function defined on the interval [a, b]. Prove that
| ∫_a^b f(x) dx | ≤ (b− a)M, where M = max_{x∈[a,b]} |f(x)|.
2.2 Orthonormal Bases
Definition. Let V be an inner product space over F . Two nonzero vectors ~u and ~v are orthogonal if 〈~u,~v〉 = 0. A vector ~u is a unit vector if ‖~u‖ = 1.
Definition. A subset S of V is called an orthogonal set if ∀~u,~v ∈ S, ~u 6= ~v ⇒ ~u and ~v are
orthogonal. Moreover, S is called an orthonormal set if S is orthogonal and ∀~v ∈ S, ‖~v‖ = 1.
Example 2.2.1. 1. The standard basis of Fn, n ∈ N is an orthonormal set.
2. Let V = C0[0, 2π] with inner product 〈f, g〉 = ∫_0^{2π} f(x)g(x) dx. Then
S = { 1/√(2π), (1/√π) cos x, (1/√π) sin x, (1/√π) cos 2x, (1/√π) sin 2x, . . . }
is an orthonormal set.
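As a numerical sanity check (an illustration of ours, not part of the notes), the inner product 〈f, g〉 = ∫_0^{2π} f(x)g(x) dx can be approximated by a midpoint Riemann sum, and the trigonometric set above then looks orthonormal up to quadrature error:

```python
# Midpoint-rule approximation of <f, g> = integral of f*g over [0, 2*pi],
# used to check orthonormality of the trigonometric set of Example 2.2.1.
import math

def inner(f, g, a=0.0, b=2 * math.pi, n=20000):
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) * g(a + (k + 0.5) * h) for k in range(n)) * h

c = lambda x: math.cos(x) / math.sqrt(math.pi)
s = lambda x: math.sin(x) / math.sqrt(math.pi)
one = lambda x: 1 / math.sqrt(2 * math.pi)

print(round(inner(c, c), 6), round(inner(one, one), 6))   # 1.0 1.0 (unit vectors)
print(abs(inner(c, s)) < 1e-9)                            # True (orthogonal)
```

For smooth periodic integrands over a full period, the midpoint rule is extremely accurate, so the check is sharp even though it is only numerical.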
Let V be an inner product space.
Lemma 2.2.1. Let S = {~v1, . . . , ~vn} be an orthogonal set.
1. ∀α1, . . . , αn ∈ F, ∀k ∈ {1, . . . , n}, 〈∑_{i=1}^n αi~vi, ~vk〉 = αk‖~vk‖².
2. ∀~v ∈ Span S, ~v = ∑_{i=1}^n (〈~v,~vi〉/‖~vi‖²) ~vi.
Theorem 2.2.2. If S is an orthogonal set, then S is linearly independent.
Theorem 2.2.3. [Gram-Schmidt Process] Let ~v1, ~v2, . . . , ~vn ∈ V be linearly independent. Then
∀m ∈ {1, . . . , n}, ∃~w1, . . . , ~wm ∈ V such that {~w1, . . . , ~wm} is an orthogonal set and it is a basis
for Span{~v1, . . . , ~vm}.
Proof. We prove this theorem by induction on m ≥ 1.
If m = 1, {~v1} is an orthogonal set. Choose ~w1 = ~v1. Then Span{~w1} = Span{~v1}. Let k ∈ {1, 2, . . . , n − 1} and assume that there exist ~w1, . . . , ~wk ∈ V such that {~w1, . . . , ~wk} is an orthogonal set and Span{~w1, . . . , ~wk} = Span{~v1, . . . , ~vk}. Choose
~wk+1 = ~vk+1 − ∑_{i=1}^k (〈~vk+1, ~wi〉/‖~wi‖²) ~wi. (2.2.1)
We have to show that:
(1) {~w1, . . . , ~wk, ~wk+1} is an orthogonal set. By induction hypothesis, {~w1, . . . , ~wk} is an orthog-
onal set, so it suffices to show that ~wk+1 is orthogonal to ~wj for all j ∈ {1, . . . , k}. Let
j ∈ {1, . . . , k}.
〈~wk+1, ~wj〉 = 〈~vk+1 − ∑_{i=1}^k (〈~vk+1, ~wi〉/‖~wi‖²) ~wi, ~wj〉
= 〈~vk+1, ~wj〉 − ∑_{i=1}^k (〈~vk+1, ~wi〉/‖~wi‖²) 〈~wi, ~wj〉
= 〈~vk+1, ~wj〉 − 〈~vk+1, ~wj〉 = 0.
(2) Span{~w1, . . . , ~wk, ~wk+1} = Span{~v1, . . . , ~vk, ~vk+1}. Again, by induction hypothesis,
Span{~w1, . . . , ~wk} = Span{~v1, . . . , ~vk}.
From Eq. (2.2.1), we have
~wk+1 ∈ Span{~w1, . . . , ~wk, ~vk+1} = Span{~v1, . . . , ~vk, ~vk+1}.
Then Span{~w1, . . . , ~wk, ~wk+1} ⊆ Span{~v1, . . . , ~vk, ~vk+1}. For the reverse, we note that
~vk+1 = ~wk+1 + ∑_{i=1}^k (〈~vk+1, ~wi〉/‖~wi‖²) ~wi ∈ Span{~w1, . . . , ~wk, ~wk+1}.
Note that ~wk+1 ≠ ~0 because ~vk+1 ∉ Span{~v1, . . . , ~vk} = Span{~w1, . . . , ~wk}. Since an orthogonal set is linearly independent, {~w1, . . . , ~wk, ~wk+1} is a basis for Span{~v1, . . . , ~vk, ~vk+1}.
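The construction (2.2.1) can be carried out verbatim in exact rational arithmetic. The sketch below (function names are ours) does this for linearly independent vectors with rational entries:

```python
# Gram-Schmidt, step (2.2.1), over Q^n with exact arithmetic so that
# orthogonality can be verified by equality rather than by tolerance.
from fractions import Fraction

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def gram_schmidt(vectors):
    """Return an orthogonal basis w_1, ..., w_m of Span{v_1, ..., v_m}."""
    ws = []
    for v in vectors:
        w = list(v)
        for u in ws:
            c = dot(v, u) / dot(u, u)      # <v, w_i>/||w_i||^2, as in (2.2.1)
            w = [wj - c * uj for wj, uj in zip(w, u)]
        ws.append(w)
    return ws

vs = [[Fraction(1), Fraction(1), Fraction(0)],
      [Fraction(1), Fraction(0), Fraction(1)]]
w1, w2 = gram_schmidt(vs)
print(w2)            # w2 = v2 - (1/2) w1 = (1/2, -1/2, 1)
print(dot(w1, w2))   # 0, as Theorem 2.2.3 guarantees
```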
Corollary 2.2.4. If V is a finite dimensional inner product space, then V has an orthonormal
basis.
Proof. Let B = {~v1, . . . , ~vm} be a basis for V . Then B is linearly independent. By the Gram-
Schmidt Process, we can construct an orthogonal subset {~w1, . . . , ~wm} of V which is a basis for
Span{~v1, . . . , ~vm} = V . Hence, {~w1, . . . , ~wm} is an orthogonal basis for V in which we can nor-
malize each vector to obtain an orthonormal basis as desired.
Example 2.2.2. Let H = Span{(1, 2i, 0), (2i, 6, −3)} ⊂ C3. Find an orthonormal basis for H.
Example 2.2.3. Let V be the space of continuous functions on [0, 1] and H = Span{1, 3√x, 10x}, a 3-dimensional subspace of V . Use the Gram-Schmidt process to find an orthogonal basis for H.
2.3 Orthogonal Complements
Definition. Let V be an inner product space over F . For S ⊆ V , the orthogonal complement
of S is the set S⊥, read “S perp”, defined by
S⊥ = {~v ∈ V : 〈~v, ~u〉 = 0 for all ~u ∈ S}.
Remark. ∅⊥ = V = {~0}⊥, V ⊥ = {~0} and S⊥ = (Span S)⊥.
Theorem 2.3.1. For any subset S of V , S⊥ is a subspace of V .
Lemma 2.3.2. Let S = {~v1, . . . , ~vn} be a set of distinct nonzero vectors.
If S is an orthogonal set, then ~v − ∑_{i=1}^n (〈~v,~vi〉/‖~vi‖²) ~vi ∈ S⊥ for all ~v ∈ V .
Theorem 2.3.3. [Bessel’s inequality] Let S = {~v1, . . . , ~vn} be a set of distinct nonzero vectors.
If S is an orthogonal set, then for all ~v ∈ V ,
∑_{i=1}^n |〈~v,~vi〉|²/‖~vi‖² ≤ ‖~v‖²
and equality holds if and only if ~v ∈ Span S.
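A quick numeric illustration of Bessel's inequality (the vectors are our own example, not from the notes), in R3 with the standard dot product:

```python
# Bessel's inequality: sum_i |<v, v_i>|^2 / ||v_i||^2  <=  ||v||^2
# for an orthogonal set S that need not span the space.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

v = [3, 1, 4]
S = [[1, 0, 0], [0, 1, 0]]   # an orthogonal set that does not span R^3

lhs = sum(dot(v, vi) ** 2 / dot(vi, vi) for vi in S)
print(lhs, dot(v, v))   # 10.0 26 (strict inequality, since v is not in Span S)
```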
Let W1 and W2 be subspaces of a vector space V . We know that W1 +W2 is a subspace of V . If V = W1 +W2, we say that V is a sum of W1 and W2. The sum is direct, denoted by W1 ⊕W2, if W1 ∩W2 = {~0V }. That is,
V = W1 ⊕W2 ⇔ [(1) V = W1 +W2 and (2) W1 ∩W2 = {~0V }].
Theorem 2.3.4. V = W1 ⊕W2
⇔ every vector ~v ∈ V can be expressed uniquely as ~v = ~w1 + ~w2 with ~w1 ∈W1 and ~w2 ∈W2.
Theorem 2.3.5. [Orthogonal Decomposition Theorem] Let W be a finite dimensional subspace
of an inner product space V . Then
1. V = W ⊕W⊥. In other words, every ~v in V decomposes uniquely as ~v = ~y + ~z with ~y ∈ W and ~z ∈ W⊥.
2. dimW + dimW⊥ = dimV .
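Given an orthogonal basis of a finite dimensional subspace W (for instance, one produced by the Gram-Schmidt process), the decomposition ~v = ~y + ~z can be computed directly. A sketch, with names and example of our choosing:

```python
# Orthogonal decomposition v = y + z with y in W and z in W^perp,
# assuming an orthogonal basis of W is given.
from fractions import Fraction as F

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def decompose(v, basis):
    """Project v onto Span(basis); return (y, z) with z orthogonal to W."""
    y = [F(0)] * len(v)
    for w in basis:
        c = dot(v, w) / dot(w, w)          # <v, w>/||w||^2, as in Lemma 2.3.2
        y = [yi + c * wi for yi, wi in zip(y, w)]
    z = [vi - yi for vi, yi in zip(v, y)]
    return y, z

v = [F(1), F(2), F(3)]
basis = [[F(1), F(0), F(0)], [F(0), F(1), F(0)]]   # W = the xy-plane in R^3
y, z = decompose(v, basis)
print(dot(z, basis[0]), dot(z, basis[1]))          # 0 0, so z lies in W^perp
```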
Exercises for Chapter 2. 1. Let Vn = {A ∈ Mn(R) : A = AT } be the vector space of all n× n symmetric matrices over R, and define the product of two matrices A and B by
〈A,B〉 = tr (AB),
where tr denotes the trace of a matrix.
(a) Show that this is an inner product on Vn.
(b) Obtain an orthonormal basis for the subspace H = Span{ [1 0; 0 1], [0 0; 0 2] } of V2.
2. Find an orthonormal basis for R2[x] with respect to the inner product
〈p(x), q(x)〉 = ∫_0^1 p(x)q(x) dx.
3. Let W = {y(x) ∈ RR : y′′ + 4y = 0}. Then W is a real vector space generated by {cos 2x, sin 2x}. Define an inner product 〈y, z〉 = ∫_0^π y(x)z(x) dx for all y, z ∈W . Find an orthonormal basis for W .
4. Let V and W be two vector spaces and T a one-to-one linear transformation from V into W . If W is an inner product space with inner product (·, ·), prove that the function 〈 , 〉 : V × V → F defined by
〈~u,~v〉 = (T (~u), T (~v))
for all ~u,~v ∈ V is an inner product on V .
5. Let V be an inner product space over F . Prove the following statements.
(a) If F = R, then ∀~u,~v ∈ V, 〈~u,~v〉 = (1/4)‖~u+ ~v‖² − (1/4)‖~u− ~v‖².
(b) If F = C, then ∀~u,~v ∈ V, 〈~u,~v〉 = (1/4)‖~u+ ~v‖² − (1/4)‖~u− ~v‖² + (i/4)‖~u+ i~v‖² − (i/4)‖~u− i~v‖².
(c) ∀~u,~v ∈ V, ‖~u+ ~v‖² + ‖~u− ~v‖² = 2‖~u‖² + 2‖~v‖².
(a) and (b) are called the polarization identity and (c) is called the parallelogram law.
6. Show that |‖~u‖ − ‖~v‖| ≤ ‖~u− ~v‖ for all ~u,~v ∈ V .
7. From the Cauchy-Schwarz inequality, |〈~u,~v〉| ≤ ‖~u‖‖~v‖, prove that equality holds if and only if ~u and ~v are linearly dependent.
8. By choosing a suitable vector ~b in the Cauchy-Schwarz inequality, prove that
(a1 + · · ·+ an)² ≤ n(a1² + · · ·+ an²).
When does equality hold?
9. Consider V = C0[a, b]. Let f ∈ V . Prove that
∫_a^b |f(x)|² dx ≤ ( ∫_a^b |f(x)| dx )^{1/2} ( ∫_a^b |f(x)|³ dx )^{1/2}.
10. Prove that the finite sequence a0, a1, . . . , an of positive real numbers is a geometric progression if and only if
(a0a1 + a1a2 + · · ·+ an−1an)² = (a0² + a1² + · · ·+ an−1²)(a1² + a2² + · · ·+ an²).
11. Let P (x) be a polynomial with positive real coefficients. Prove that
√(P (a)P (b)) ≥ P (√(ab))
for all a, b ≥ 0.
12. Let V be an n-dimensional inner product space and m < n. If {~v1, . . . , ~vm} is an orthonormal set, then there exist ~vm+1, . . . , ~vn ∈ V such that {~v1, . . . , ~vn} is an orthonormal basis for V .
13. Prove the following statements.
(a) ∀S1, S2 ⊆ V, S1 ⊆ S2 ⇒ S1⊥ ⊇ S2⊥.
(b) ∀S ⊆ V, (Span S)⊥ = S⊥.
(c) For S ⊆ V , if ~u ∈ S and ~v ∈ S⊥, then ‖~u+ ~v‖² = ‖~u‖² + ‖~v‖².
(d) For ~v1, . . . , ~vn ∈ V , {~v1}⊥ ∩ · · · ∩ {~vn}⊥ = (Span{~v1, . . . , ~vn})⊥.
14. Construct an orthonormal basis for the subspace H = {(1,−i, i)}⊥ of C3.
15. Let W be a subspace of an inner product space V over F . If ~v ∈ V satisfies
〈~v, ~w〉+ 〈~w,~v〉 ≤ 〈~w, ~w〉 for all ~w ∈W,
show that 〈~v, ~w〉 = 0 for all ~w ∈W .
16. Consider the inner product space C0[−1, 1]. Suppose that f and g are continuous on [−1, 1] and ‖f − g‖ ≤ 5. Let
u1(x) = 1/√2 and u2(x) = √(3/2) x for x ∈ [−1, 1].
Write
aj = ∫_{−1}^1 uj(x)f(x) dx and bj = ∫_{−1}^1 uj(x)g(x) dx
for j = 1, 2. Show that |a1 − b1|² + |a2 − b2|² ≤ 25. (Hint. Use Bessel's inequality.)
17. If V is a finite dimensional inner product space and W is a subspace of V , prove that (W⊥)⊥ = W .
18. If {~v1, ~v2} is a basis for V , show that V = Span{~v1} ⊕ Span{~v2}.
19. Consider the subspace Vα, α ∈ R, of R2. Prove that if α ≠ β, then R2 = Vα ⊕ Vβ .
20. Let V = RR be the space of all functions from R to R. Let
Ve = {f ∈ V : ∀x ∈ R, f(−x) = f(x)} and Vo = {f ∈ V : ∀x ∈ R, f(−x) = −f(x)},
the sets of all even and odd functions, respectively. Prove the following statements.
(a) Ve and Vo are subspaces of V . (b) V = Ve ⊕ Vo.
21. Let S be a set of vectors in a finite dimensional inner product space V . Suppose that "〈~u,~v〉 = 0 for all ~u ∈ S implies ~v = ~0". Show that V = Span S.
22. Let RN be the sequence space of real numbers. Let V = {(an) ∈ RN : only finitely many ai ≠ 0}.
(a) Prove that V is a subspace of RN.
(b) Given (an), (bn) ∈ V , define
〈(an), (bn)〉 = ∑_{n=1}^∞ anbn.
(Note that this makes sense since only finitely many ai and bi are nonzero.) Show that this defines an inner product on V .
(c) Let U = {(an) ∈ V : ∑_{n=1}^∞ an = 0}. Show that U is a subspace of V such that U⊥ = {~0}, U + U⊥ ≠ V and U ≠ U⊥⊥.
3 | Matrices
3.1 Solutions of Linear Systems
Definition. For any system of m linear equations in n unknowns with coefficients over a field F
a11x1 + a12x2 + · · ·+ a1nxn = b1
a21x1 + a22x2 + · · ·+ a2nxn = b2
. . .
am1x1 + am2x2 + · · ·+ amnxn = bm,
we can use the matrix notation
A~x = ~b,
where
A =
[ a11 a12 . . . a1n ]
[ a21 a22 . . . a2n ]
[ . . . ]
[ am1 am2 . . . amn ]
, ~x = (x1, x2, . . . , xn)T, and ~b = (b1, b2, . . . , bm)T,
considered as matrices over F . In this case, we usually call A the coefficient matrix of the
system. It is clear that A~x = ~b has a solution ⇔ ~b ∈ ColA. If all b1, . . . , bm are equal to 0,
the linear system is said to be homogeneous. Note that all solutions of a homogeneous system
form the null space of A.
There is another matrix which plays an important role in the study of linear systems. This is
the augmented matrix, which is formed by inserting ~b as a new last column into the coefficient
matrix. In other words, the augmented matrix is
[A : ~b] =
[ a11 a12 . . . a1n b1 ]
[ a21 a22 . . . a2n b2 ]
[ . . . ]
[ am1 am2 . . . amn bm ]
.
Remark. A homogeneous linear system
a11x1 + a12x2 + · · ·+ a1nxn = 0
a21x1 + a22x2 + · · ·+ a2nxn = 0
. . .
am1x1 + am2x2 + · · ·+ amnxn = 0,
always has a trivial solution, namely the solution obtained by letting all xj = 0. Other nonzero
solutions (if any) are called nontrivial solutions.
Definition. The rank of a matrix A is the dimension of the column space of A.
Remark. If A is an m×n matrix, then rankA ≤ n and rankA is the maximum number of linearly
independent columns of A by Corollary 1.4.5.
Theorem 3.1.1. Let A be an m× n matrix over a field F .
1. The homogeneous system A~x = ~0m has only the trivial solution ~x = ~0n ⇔ the columns of A are linearly independent ⇔ rankA = n.
2. If rankA < n, then the homogeneous system A~x = ~0m has a nontrivial solution in Fn.
For an m× n matrix A over a field F , recall that the matrix transformation
T : ~x 7→ A~x
is a linear transformation from Fn to Fm. Its kernel is NulA and its image is ColA.
Definition. The dimension of NulA is called the nullity of A, denoted by nullityA.
By Theorem 1.4.9, we have:
Corollary 3.1.2. Let A be an m× n matrix over a field F . Then
rankA+ nullityA = n = the number of columns of A.
Examples 3.1.1. Consider the following augmented matrices. Write down their general solutions
(if any).
1.
[ 1 −3 4 7 ]
[ 0 1 2 2 ]
[ 0 0 1 5 ]
2.
[ 1 −3 7 0 ]
[ 0 1 4 0 ]
[ 0 0 0 0 ]
and
[ 1 −3 7 1 ]
[ 0 1 4 0 ]
[ 0 0 0 −1 ]
3.
[ 1 0 3 0 ]
[ 0 1 −2 0 ]
and
[ 1 0 3 5 ]
[ 0 1 −2 1 ]
4.
[ 1 −4 −2 0 3 −5 ]
[ 0 1 0 0 −1 −1 ]
[ 0 0 0 1 0 0 ]
[ 0 0 0 0 0 0 ]
Theorem 3.1.3. Let A be an m× n matrix over a field F and ~b ∈ Fm.
1. A~x = ~b has a solution⇔ ~b ∈ ColA⇔ rank[A : ~b] = rankA.
2. If ~z ∈ Fn is a solution of A~x = ~b, then
~z = ~y + ~yp,
where ~y is a solution of the homogeneous system A~x = ~0m and A~yp = ~b.
Hence, the solution set of A~x = ~b is empty or given by
~yp + {~y ∈ Fn : A~y = ~0m},
where ~yp is a solution of A~x = ~b, called a particular solution.
Corollary 3.1.4. Let A be an m× n matrix over a field F and ~b ∈ Fm.
1. If A~x = ~b has a unique solution, then A~x = ~0m has a unique solution and rankA = n.
2. If A~x = ~0m has a nontrivial solution, then A~x = ~b has no solution or more than one solution.
3.2 Inverse of a Matrix and Elementary Matrices
Definition. The main part of the algorithms used for solving simultaneous linear systems with coefficients in F consists of the elementary row operations. These algorithms make repeated use of three operations on the linear system or on its augmented matrix, each of which preserves the set of solutions because its inverse is an operation of the same kind:
1. (Interchange, Rij) Interchange the ith row and the jth row.
2. (Scaling, cRi) Multiply the ith row by a nonzero scalar c.
3. (Replacement, Ri + cRj) Replace the ith row by the sum of it and c times the jth row.
The elementary column operations are defined in a similar way.
Remark. The elementary row operations are reversible as follows.
Operation Reverse
Rij Rij
cRi, c 6= 0 (1/c)Ri
Ri + cRj Ri − cRj
Definition. Two linear systems are said to be equivalent if they have the same set of solutions.
Theorem 3.2.1. Suppose that a sequence of elementary operations is performed on a linear system.
Then the resulting system has the same set of solutions as the original, so the two linear systems
are equivalent.
Proof. It is clear from the way we do the row reductions that if c1, c2, . . . , cn satisfy the original system, then they also satisfy the reduced system. Since the elementary row operations are reversible, if we start with the reduced system, the original system can be recovered. Hence, any solution of the reduced system is also a solution of the original system.
Definition. A rectangular matrix is in echelon form (or row-echelon form) if it has the fol-
lowing three properties:
1. All nonzero rows are above any rows of all zeros.
2. Each leading entry of a row is in a column to the right of the leading entry of the row
above it.
3. All entries in a column below a leading entry are zero.
If a matrix in echelon form satisfies the following additional conditions, then it is in reduced echelon form (or reduced row-echelon form):
4. The leading entry in each nonzero row is 1, called the leading 1.
5. Each leading 1 is the only nonzero entry in its column.
An echelon matrix (respectively, reduced echelon matrix) is one that is in echelon form (re-
spectively, reduced echelon form).
Theorem 3.2.2. Every matrix can be brought to a reduced echelon matrix by a finite sequence of
elementary row operations.
Proof. This can be done by an algorithm, called the Gaussian Algorithm.
1. If the matrix consists entirely of zeros, stop; it is already in row-echelon form.
2. Otherwise, find the first column from the left containing a nonzero entry (call it a), and
move the row containing that entry to the top position.
3. Now multiply the new top row by 1/a to create a leading 1.
4. By subtracting multiples of that row from rows below it, make each entry below the lead-
ing 1 zero.
This completes the first row, and all further row operations are carried out on the remaining rows.
5. Repeat steps 1–4 on the matrix consisting of the remaining rows.
The process stops when either no rows remain at Step 5 or the remaining rows consist entirely of zeros. Observe that the Gaussian algorithm is recursive.
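The algorithm above, continued through to reduced row-echelon form, can be sketched as follows (a minimal illustration of ours over the rationals; the function name is not from the notes):

```python
# Gaussian algorithm, continued to reduced row-echelon form, with the three
# elementary row operations: interchange, scaling, and replacement.
from fractions import Fraction as F

def rref(rows):
    """Return the reduced row-echelon form of the matrix given as a list of rows."""
    A = [[F(x) for x in row] for row in rows]
    m, n = len(A), len(A[0])
    lead = 0                                   # current pivot column
    for r in range(m):
        while lead < n and all(A[i][lead] == 0 for i in range(r, m)):
            lead += 1                          # skip columns with no pivot
        if lead == n:
            break
        i = next(i for i in range(r, m) if A[i][lead] != 0)
        A[r], A[i] = A[i], A[r]                # interchange
        A[r] = [x / A[r][lead] for x in A[r]]  # scaling: create the leading 1
        for i in range(m):
            if i != r and A[i][lead] != 0:     # replacement: clear the column
                A[i] = [x - A[i][lead] * y for x, y in zip(A[i], A[r])]
        lead += 1
    return A

R = rref([[2, 3, 1], [1, 2, 1]])
print([[int(x) for x in row] for row in R])    # [[1, 0, -1], [0, 1, 1]]
```

The example matrix is the A of Example 3.2.3, so this also exhibits its reduced row-echelon form.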
Definition. A pivot position in a matrix A is a location in A that corresponds to a leading entry
in an echelon form of A. A pivot column is a column of A that contains a pivot position.
Definition. Let A be an n × n matrix. We say that A is invertible or nonsingular and has the
n× n matrix B as inverse if AB = BA = In.
If B and C are n× n matrices with AB = In and CA = In, then the associativity of multipli-
cation implies that
B = InB = (CA)B = C(AB) = CIn = C.
Hence an inverse for A is unique if it exists and we write A−1 for this inverse.
Theorem 3.2.3. Suppose A and B are invertible matrices of the same size. Then the following
results hold:
(a) A−1 is invertible and (A−1)−1 = A, i.e., A is the inverse of A−1.
(b) AB is invertible and (AB)−1 = B−1A−1.
(c) AT is invertible and (AT )−1 = (A−1)T .
Theorem 3.2.4. [Invertible Matrix Theorem]
The following statements are equivalent for an n× n matrix A.
(i) A is invertible.
(ii) The homogeneous system A~x = ~0 has only the trivial solution ~x = ~0n.
(iii) A can be carried to the identity matrix In by elementary row operations.
(iv) The system A~x = ~b has at least one solution for any vector ~b ∈ Fn.
(v) There is an n× n matrix C such that AC = In.
Corollary 3.2.5. If A and C are square matrices such that AC = I, then also CA = I. In
particular, A and C are invertible, C = A−1 and A = C−1.
Corollary 3.2.6. An n× n matrix A is invertible if and only if rankA = n.
Definition. An elementary matrix is one that is obtained by performing a single elementary
row operation on an identity matrix.
Example 3.2.1. Consider the following elementary matrices (rows separated by semicolons):
E1 = [1 0 0; 0 2 0; 0 0 1], E2 = [1 0 0; 0 0 1; 0 1 0], and E3 = [1 0 0; 0 1 0; 3 0 1].
Let A = [a b c; d e f; g h i]. Compute the products E1A, E2A and E3A.
Theorem 3.2.7. If an elementary row operation is performed on an m×n matrix A, the resulting
matrix can be written as EA, where the m ×m matrix E is created by performing the same row
operation on Im.
Remark. Elementary matrices are invertible because row operations are reversible. To find the inverse of an elementary matrix E, determine the elementary row operation needed to transform E back into I and apply this operation to I to obtain the inverse.
Corollary 3.2.8. An elementary matrix is invertible. Moreover,
1. If I → E1 by Rij , then I → E1^{−1} by Rij .
2. If c ≠ 0 and I → E2 by cRi, then I → E2^{−1} by (1/c)Ri.
3. If c ∈ F and I → E3 by Ri + cRj , then I → E3^{−1} by Ri − cRj .
Example 3.2.2. Find the inverses of the elementary matrices given in Example 3.2.1.
Theorem 3.2.9. Suppose A is an m× n matrix and A→ B by elementary row operations.
1. B = UA for some m×m invertible matrix U .
2. U can be computed by [A : Im]→ [B : U ] using the operations carrying A→ B.
3. U = EkEk−1 . . . E2E1, where E1, E2, . . . , Ek−1, Ek are the elementary matrices corresponding (in order) to the elementary row operations carrying A→ B.
Example 3.2.3. If A = [2 3 1; 1 2 1], express the reduced row-echelon form R of A as R = UA, where U is invertible.
Theorem 3.2.10. A square matrix is invertible if and only if it is a product of elementary matrices.
Remark. From the above theorem, we obtain an algorithm to find A−1 if A is invertible. Namely,
we start with the block matrix [A : I] and row reduce it until we reach the final reduced echelon
form [I : U ] (because A is row equivalent to I by Theorem 3.2.4). Then we have U = A−1.
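The [A : I] → [I : A⁻¹] procedure can be sketched directly (exact rational arithmetic; the function name is ours, and the code assumes A is invertible so a pivot always exists):

```python
# Invert a square matrix by row-reducing the block matrix [A : I] to [I : A^{-1}].
from fractions import Fraction as F

def inverse(rows):
    n = len(rows)
    # Build the block matrix [A : I].
    M = [[F(x) for x in row] + [F(int(i == j)) for j in range(n)]
         for i, row in enumerate(rows)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)   # pivot search
        M[c], M[p] = M[p], M[c]                            # interchange
        M[c] = [x / M[c][c] for x in M[c]]                 # leading 1
        for r in range(n):
            if r != c and M[r][c] != 0:                    # clear the column
                M[r] = [x - M[r][c] * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]          # the right block is now A^{-1}

Ainv = inverse([[-2, 3], [1, 0]])   # A from Example 3.2.4; A^{-1} = [[0, 1], [1/3, 2/3]]
print(Ainv)
```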
Example 3.2.4. Express A = [−2 3; 1 0] as a product of elementary matrices.
3.3 More on Ranks
Definition. Let A be an m× n matrix over a field F .
The column space, ColA, of A is the subspace of Fm spanned by the columns of A.
The row space, RowA, of A is the subspace of Fn spanned by the rows of A.
Note that ColA = RowAT .
Lemma 3.3.1. Let V be a vector space over a field F . Let ~v1, . . . , ~vn be in V .
1. Span{~v1, . . . , ~vn} = Span{~v1, . . . , c~vi, . . . , ~vn} for all i ∈ {1, . . . , n} and nonzero c ∈ F .
2. Span{~v1, . . . , ~vn} = Span{~v1, . . . , ~vi + c~vj , . . . , ~vj , . . . , ~vn} for all i ≠ j and c ∈ F .
Lemma 3.3.2. Let A and B denote m× n matrices.
1. If A→ B by elementary row operations, then RowA = RowB.
2. If A→ B by elementary column operations, then ColA = ColB.
If A is any matrix, we can carry A → R by elementary row operations where R is a row-
echelon matrix. Hence, RowA = RowR by Lemma 3.3.2.
Lemma 3.3.3. If R is a row-echelon matrix, then
1. The nonzero rows of R form a basis for RowR.
2. The columns of R containing leading ones form a basis for ColR.
Theorem 3.3.4. Let A denote any m× n matrix of rank r. Then
dimColA = r = dimRowA.
Moreover, if A is carried to a row-echelon matrix R by row operations, then
1. The r nonzero rows of R form a basis for RowA.
2. If the pivot positions lie in columns j1, j2, . . . , jr of R, then columns j1, j2, . . . , jr of A are a
basis of ColA. That is, the pivot columns of A form a basis for ColA.
Corollary 3.3.5. 1. If A is any matrix, then rankA = rankAT .
2. If A is an m× n matrix, then rankA ≤ m and rankA ≤ n.
3. rankA = rankUA = rankAV whenever U and V are invertible.
Corollary 3.3.6. Let A,B,U and V be matrices of sizes for which the indicated products are
defined.
1. Col(AV ) ⊆ ColA, with equality if V is (square and) invertible.
2. Row(UA) ⊆ RowA, with equality if U is (square and) invertible.
3. rankAB ≤ rankA and rankAB ≤ rankB.
Let A be an m×n matrix of rank r, and let R be the reduced row-echelon form of A. Theorem 3.2.9 shows that R = UA where U is invertible, and that U can be found by [A : Im]→ [R : U ]. The matrix R has r leading ones (since rankA = r) so, as R is reduced, the n×m matrix RT contains each row of Ir in the first r columns. Thus, row operations will carry RT → [Ir 0; 0 0]n×m. Hence, Theorem 3.2.9 (again) shows that [Ir 0; 0 0]n×m = U1RT where U1 is an n× n invertible matrix. Writing V = U1^T, we obtain
UAV = RV = RU1^T = (U1RT)T = ([Ir 0; 0 0]n×m)T = [Ir 0; 0 0]m×n.
Moreover, the matrix U1 = V T can be computed by [RT : In]→ [[Ir 0; 0 0]n×m : V T]. This proves
Theorem 3.3.7. Let A be an m × n matrix of rank r. There exist invertible matrices U and V of
size m×m and n× n, respectively, such that
UAV = [Ir 0; 0 0]m×n,
called the Smith normal form of A.
Moreover, if R is a reduced row-echelon form of A, then:
1. U can be computed by [A : Im]→ [R : U ].
2. V can be computed by [RT : In]→ [[Ir 0; 0 0]n×m : V T].
Example 3.3.1. Given A = [1 −1 1 2; 2 −2 1 −1; −1 1 0 3], find invertible matrices U and V such that UAV is in the Smith normal form.
Theorem 3.3.8. [Uniqueness of the reduced row-echelon form]
If a matrix A is carried to reduced row-echelon matrices R and S by row operations, then R = S.
Proof. Observe first that UR = S for some invertible matrix U (by Theorem 3.2.9 there exist invertible matrices P and Q such that R = PA and S = QA; take U = QP −1). We show that R = S by induction on the number m of rows of A. The case m = 1 is trivial because we can perform only scaling. If ~rj and ~sj denote the jth columns of R and of S, respectively, the fact that UR = S gives
U~rj = ~sj for each j. (3.3.1)
Since U is invertible, this shows that R and S have the same zero columns. Hence, by passing to the matrices obtained by deleting the zero columns from R and S, we may assume that R and S have no zero columns.
But then the first column of R and S is the first column of Im because they are reduced row-echelon, so (3.3.1) forces the first column of U to be the first column of Im. Now, write U, R and S in block form as follows:
U = [1 X; 0 V ], R = [1 Y ; 0 R′] and S = [1 Z; 0 S′].
Since UR = S, block multiplication gives V R′ = S′ so, since V is invertible (U is invertible) and both R′ and S′ are reduced row-echelon, we obtain R′ = S′ by the induction hypothesis. Thus, R and S have the same number (say r) of leading 1's, and so both have m− r zero rows.
In fact, R and S have leading ones in the same columns, say r of them. Applying (3.3.1) to these columns shows that the first r columns of U are the first r columns of Im. Hence, we can write U, R and S in block form as follows:
U = [Ir M ; 0 W ], R = [R1 R2; 0 0] and S = [S1 S2; 0 0],
where R1 and S1 are r × r. Then block multiplication gives UR = R. That is, S = R. This
completes the proof.
3.4 Permutations and Determinants
Definition. Let n ∈ N. A permutation σ on the set {1, 2, . . . , n} is a one-to-one mapping of the
set onto itself or equivalently, a rearrangement of the numbers 1, 2, . . . , n. Such a permutation
σ is denoted by
σ =
( 1  2  . . .  n )
( j1 j2 . . . jn )
or σ = j1j2 . . . jn, where ji = σ(i).
The set of all such permutations is denoted by Sn, and the number of such permutations is n!.
Example 3.4.1. S2 = {12, 21} and S3 = {123, 132, 213, 231, 312, 321}.
Remark. If σ ∈ Sn, then the inverse mapping σ−1 ∈ Sn; and if σ, τ ∈ Sn, then the composition
mapping σ ◦ τ ∈ Sn. Also, the identity mapping ε = σ ◦ σ−1 = 123 . . . n ∈ Sn.
Definition. For a permutation σ in Sn, let
Iσ = {(i, k) : i, k ∈ {1, 2, . . . , n}, i < k and σ(i) > σ(k)}.
We say that σ is an even permutation⇔ |Iσ| is even, and an odd permutation⇔ |Iσ| is odd.
We then define the sign or parity of σ, written sgnσ, by
sgnσ = 1 if σ is even, and sgnσ = −1 if σ is odd.
Thus, sgnσ ∈ {−1, 1} for all σ ∈ Sn.
Example 3.4.2. Let σ = 2134 in S4 and τ = 21543 in S5.
1. Find σ−1 and τ−1.
2. Compute sgnσ and sgn τ .
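The sign can be computed straight from the definition by counting the inversion set Iσ; the sketch below (our own helper, with σ given in one-line notation) answers part 2 of the example:

```python
# sgn(sigma) via the inversion count |I_sigma|: pairs (i, k) with i < k
# and sigma(i) > sigma(k).

def sgn(sigma):
    n = len(sigma)
    inversions = sum(1 for i in range(n) for k in range(i + 1, n)
                     if sigma[i] > sigma[k])
    return 1 if inversions % 2 == 0 else -1

print(sgn([2, 1, 3, 4]))     # -1: sigma = 2134 has one inversion, so it is odd
print(sgn([2, 1, 5, 4, 3]))  # 1: tau = 21543 has four inversions, so it is even
```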
Theorem 3.4.1. Let n ≥ 2 and let g be the polynomial given by
g = g(x1, x2, . . . , xn) = ∏_{i<j} (xi − xj).
For σ ∈ Sn, define the polynomial
σ(g) = ∏_{i<j} (xσ(i) − xσ(j)).
Then σ(g) = g if σ is even, and σ(g) = −g if σ is odd. That is, σ(g) = (sgnσ)g.
Theorem 3.4.2. Let σ, τ ∈ Sn. Then
sgn(τ ◦ σ) = (sgn τ)(sgnσ).
Thus, the product of two even or two odd permutations is even, and the product of an odd and an
even permutation is odd.
Let A = [aij ] be a square matrix of size n× n. Consider a product of n elements of A such that one and only one element comes from each row and one and only one element comes from each column. Such a product can be written in the form
a1j1 a2j2 . . . anjn ,
that is, with the factors coming from successive rows, so the first subscripts are in the natural order 1, 2, . . . , n. Since the factors come from different columns, the sequence of second subscripts forms a permutation σ = j1j2 . . . jn in Sn. Conversely, each permutation in Sn determines a product of the above form. Thus the matrix A yields n! such products.
Definition. The determinant of A = [aij ], denoted by detA or |A|, is the sum of all the above
n! products where each such product is multiplied by sgnσ. That is,
|A| = ∑_{σ∈Sn} (sgnσ) a1j1 a2j2 . . . anjn = ∑_{σ∈Sn} (sgnσ) a1σ(1) a2σ(2) . . . anσ(n).
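This n!-term formula can be implemented verbatim (practical only for small n; the helper names are ours):

```python
# Determinant by the definition: sum over all sigma in S_n of
# (sgn sigma) * a_{1,sigma(1)} * ... * a_{n,sigma(n)}.
from itertools import permutations

def sgn(sigma):
    inv = sum(1 for i in range(len(sigma)) for k in range(i + 1, len(sigma))
              if sigma[i] > sigma[k])
    return -1 if inv % 2 else 1

def det(A):
    n = len(A)
    total = 0
    for sigma in permutations(range(n)):   # 0-based permutations of the columns
        term = sgn(sigma)
        for i in range(n):
            term *= A[i][sigma[i]]
        total += term
    return total

print(det([[1, 2], [3, 4]]))                    # -2
print(det([[2, 0, 0], [0, 3, 0], [0, 0, 4]]))   # 24: triangular, product of diagonal
```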
Lemma 3.4.3. Let A = [aij ] be an n× n matrix and σ ∈ Sn.
1. sgnσ−1 = sgnσ.
2. {(i, σ−1(i)) : i ∈ {1, 2, . . . , n}} = {(σ(i), i) : i ∈ {1, 2, . . . , n}}.
3. aσ(1),1 aσ(2),2 . . . aσ(n),n = a1,σ−1(1) a2,σ−1(2) . . . an,σ−1(n).
Theorem 3.4.4. The determinant of a matrix A and its transpose are equal. That is, |A| = |AT |.
Remark. By this theorem, any theorem about the determinant of a matrix A that concerns the
rows of A will have an analogous theorem concerning the columns of A.
Lemma 3.4.5. For k < l,
τ =
( 1 2 . . . k . . . l . . . n )
( 1 2 . . . l . . . k . . . n )
is an odd permutation in Sn.
Proof. Note that Iτ = {(k, j) : j ∈ {k + 1, k + 2, . . . , l}} ∪ {(i, l) : i ∈ {k + 1, k + 2, . . . , l − 1}}.Then |Iτ | = (l − k) + (l − k − 1) = 2(l − k) − 1 is odd. Thus, τ is an odd permutation and so
sgn τ = −1.
Theorem 3.4.6. If A→ B by interchanging two rows (columns) of A, then |B| = −|A|.
Theorem 3.4.7. Let A be a square matrix of size n× n.
(a) If A has a row (column) of zeros, then |A| = 0.
(b) If σ ≠ 12 . . . n, then ∃i ∈ {1, 2, . . . , n}, i > σ(i).
(c) If A is triangular, i.e., A has zeros above or below the diagonal, then |A| is the product of the diagonal elements. In particular, |I| = 1.
Theorem 3.4.8. If A has two identical rows (columns), then |A| = 0.
Proof. Assume that the kth and lth rows are identical with k < l. That is, akj = alj for all j ∈ {1, . . . , n}. In particular, for any σ ∈ Sn, akσ(l) = alσ(l) and akσ(k) = alσ(k). Let
τ =
( 1 2 . . . k . . . l . . . n )
( 1 2 . . . l . . . k . . . n )
.
Then sgn τ = −1 and σ(τ(j)) = σ(j) for all j ∈ {1, . . . , n} \ {k, l}. Also,
sgn(στ) = (sgnσ)(sgn τ) = − sgnσ.
As σ runs through all even permutations, στ runs through all odd permutations, and vice versa.
Thus
|A| = ∑_{σ∈Sn} (sgnσ) a1σ(1) a2σ(2) . . . akσ(k) . . . alσ(l) . . . anσ(n)
= ∑_{σ even} [ (sgnσ) a1σ(1) . . . akσ(k) . . . alσ(l) . . . anσ(n) + (sgn(στ)) a1στ(1) . . . akστ(k) . . . alστ(l) . . . anστ(n) ]
= ∑_{σ even} [ (sgnσ) a1σ(1) . . . akσ(k) . . . alσ(l) . . . anσ(n) − (sgnσ) a1σ(1) . . . akσ(l) . . . alσ(k) . . . anσ(n) ]
= ∑_{σ even} [ (sgnσ) a1σ(1) . . . akσ(k) . . . alσ(l) . . . anσ(n) − (sgnσ) a1σ(1) . . . alσ(l) . . . akσ(k) . . . anσ(n) ]
= 0.
Hence, we have the theorem.
Theorem 3.4.9. If A→ B by multiplying a row (column) of A by a scalar c ∈ F , then |B| = c|A|.
Remark. If A is an n× n matrix, then |cA| = c^n|A|.
Theorem 3.4.10. If A→ B by adding a multiple of a row (column) of A to another row (column)
of A, then |B| = |A|.
Corollary 3.4.11. If E is an elementary matrix, then detE ≠ 0.
Lemma 3.4.12. Let E be an elementary matrix. Then |EA| = |E||A| for any matrix A. In particular, if E1, E2, . . . , Es are elementary matrices, then
|E1E2 . . . Es| = |E1||E2| . . . |Es|.
Theorem 3.4.13. Let A be a square matrix. Then, A is invertible ⇔ detA ≠ 0.
Theorem 3.4.14. The determinant of a product of two matrices A and B is the product of their
determinants; that is |AB| = |A||B|.
Definition. Consider an n-square matrix A = [aij ]. Let Mij(A) denote the (n − 1)-square submatrix of A obtained by deleting its ith row and jth column. The determinant |Mij(A)| is called the minor of the element aij of A, and we define the cofactor of aij , denoted by Cij(A), to be the "signed" minor:
Cij(A) = (−1)^{i+j} |Mij(A)|.
Recall that
|A| = ∑_{σ∈Sn} (sgnσ) a1σ(1) a2σ(2) . . . anσ(n) = aij Ĉij(A) + (terms which do not contain aij as a factor),
where Ĉij(A) denotes the coefficient of aij in this expansion.
Lemma 3.4.15. Ĉij(A) = Cij(A) for all i, j ∈ {1, . . . , n}.
Granting this lemma, we observe that
|A| = ∑_{j=1}^n aij Ĉij(A) = ∑_{j=1}^n aij Cij(A).
Therefore, we have shown:
Theorem 3.4.16. [Laplace] The determinant of a square matrix A = [aij ] is equal to the sum of
the products obtained by multiplying the elements of any row (column) by their respective cofactors:
|A| = ai1Ci1(A) + ai2Ci2(A) + · · ·+ ainCin(A) = ∑_{j=1}^n aij Cij(A),
|A| = a1jC1j(A) + a2jC2j(A) + · · ·+ anjCnj(A) = ∑_{i=1}^n aij Cij(A),
for all i, j ∈ {1, 2, . . . , n}.
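The expansion along the first row gives a short recursive determinant (a sketch with our own helper names, fine for small matrices):

```python
# Laplace expansion along the first row: |A| = sum_j a_{1j} C_{1j}(A),
# with C_{1j}(A) = (-1)^(1+j) |M_{1j}(A)|.

def minor(A, i, j):
    """M_ij(A): delete row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]

def det(A):
    if len(A) == 1:
        return A[0][0]
    # (-1) ** j is the 0-based form of the sign (-1)^(1+j).
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

print(det([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))   # -3
```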
Remark. The above formulas for |A| are called the Laplace expansions of the determinant of A by the ith row and the jth column, respectively. Together with the elementary row operations, they offer a method of simplifying the computation of |A|.
Next we proceed to prove the lemma.
Proof of Lemma 3.4.15. Note that for any matrix B = [bij ],
|B| = ∑_{σ∈Sn} (sgnσ) b1σ(1) b2σ(2) . . . bn−1,σ(n−1) bnσ(n)
= bnn ∑_{σ∈Sn, σ(n)=n} (sgnσ) b1σ(1) b2σ(2) . . . bn−1,σ(n−1) + (other terms which do not contain bnn).
Thus, the coefficient Ĉnn(B) of bnn satisfies
Ĉnn(B) = ∑_{σ∈Sn, σ(n)=n} (sgnσ) b1σ(1) b2σ(2) . . . bn−1,σ(n−1)
= ∑_{τ∈Sn−1} (sgn τ) b1τ(1) b2τ(2) . . . bn−1,τ(n−1)
= the determinant of the matrix obtained by deleting the nth row and the nth column of B.
Write
A =
[ a11 a12 . . . a1j . . . a1n ]
[ a21 a22 . . . a2j . . . a2n ]
[ . . . ]
[ ai1 ai2 . . . aij . . . ain ]
[ . . . ]
[ an1 an2 . . . anj . . . ann ]
.
To compute Cij(A), we transform A into A′ by interchanging adjacent rows n− i times and adjacent columns n− j times, moving the ith row and the jth column to the last positions:
A′ =
[ a11 a12 . . . a1,j−1 a1,j+1 . . . a1n a1j ]
[ a21 a22 . . . a2,j−1 a2,j+1 . . . a2n a2j ]
[ . . . ]
[ ai−1,1 ai−1,2 . . . ai−1,j−1 ai−1,j+1 . . . ai−1,n ai−1,j ]
[ ai+1,1 ai+1,2 . . . ai+1,j−1 ai+1,j+1 . . . ai+1,n ai+1,j ]
[ . . . ]
[ an1 an2 . . . an,j−1 an,j+1 . . . ann anj ]
[ ai1 ai2 . . . ai,j−1 ai,j+1 . . . ain aij ]
.
Hence,
|A′| = (−1)^{(n−i)+(n−j)} |A| = (−1)^{i+j} |A|, that is, |A| = (−1)^{i+j} |A′|. Comparing the terms containing aij on both sides,
aij Ĉij(A) + (other terms) = (−1)^{i+j} aij Ĉnn(A′) + (other terms),
where Ĉij(A) is the coefficient of aij in |A| and Ĉnn(A′) is the coefficient of aij (which sits in position (n, n)) in |A′|.
Therefore,
Ĉij(A) = (−1)^{i+j} Ĉnn(A′)
= (−1)^{i+j} (the determinant of the matrix obtained by deleting the nth row and the nth column of A′)
= (−1)^{i+j} |Mij(A)|
= Cij(A).
This completes the proof of the lemma.
Definition. Let A = [aij ] be an n × n matrix and let Cij(A) denote the cofactor of aij . The
classical adjoint of A, denoted by adjA, is the transpose of the matrix of the cofactors of A,
namely,
adjA = [Cij(A)]T .
We say “classical adjoint” here instead of simply “adjoint” because the term “adjoint” will be
used for an entirely different concept.
Theorem 3.4.17. Let A be a square matrix. Then

A(adjA) = (adjA)A = |A| I

where I is the identity matrix. Thus, if |A| ≠ 0, then

A^{−1} = (1/|A|) adjA.
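A minimal sketch of the classical adjoint and the inverse formula A^{−1} = (1/|A|) adj A, built from cofactors as defined above (the 2 × 2 sample matrix is an arbitrary example, not from the text):

```python
from fractions import Fraction

def minor(A, i, j):
    return [row[:j] + row[j+1:] for k, row in enumerate(A) if k != i]

def det(A):
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def adj(A):
    """Classical adjoint: transpose of the cofactor matrix, adj A = [C_ij(A)]^T."""
    n = len(A)
    C = [[(-1) ** (i + j) * det(minor(A, i, j)) for j in range(n)] for i in range(n)]
    return [[C[j][i] for j in range(n)] for i in range(n)]  # transpose

A = [[Fraction(x) for x in row] for row in [[2, 1], [5, 3]]]
d = det(A)                                     # = 1 here
inv = [[c / d for c in row] for row in adj(A)]  # A^{-1} = adj(A) / |A|
print(inv)   # [[3, -1], [-5, 2]] as Fractions
```

One can multiply `A` by `inv` to confirm the identity A(adj A) = |A| I numerically.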
For any n × n matrix A and any ~b ∈ F^n, let Aj(~b) be the matrix obtained from A by replacing the jth column by the vector ~b, that is,

Aj(~b) = [ ~a1 · · · ~b · · · ~an ]   (~b in the jth column)

for all j = 1, 2, . . . , n.
Theorem 3.4.18. [Cramer’s rule] Let A be an invertible n×n matrix. For any~b ∈ Fn, the unique
solution ~x of A~x = ~b has entries given by
xj = |Aj(~b)| / |A|,   j = 1, 2, . . . , n.
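Cramer's rule can be implemented verbatim from the formula xj = |Aj(~b)|/|A|. This sketch solves a small arbitrary system (2x + y = 3, x + 3y = 4, our own example) with exact arithmetic:

```python
from fractions import Fraction

def minor(A, i, j):
    return [row[:j] + row[j+1:] for k, row in enumerate(A) if k != i]

def det(A):
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def cramer(A, b):
    """Solve Ax = b via x_j = |A_j(b)| / |A|; A is assumed invertible."""
    d = det(A)
    xs = []
    for j in range(len(A)):
        # A_j(b): replace column j of A by b
        Aj = [row[:j] + [b[i]] + row[j+1:] for i, row in enumerate(A)]
        xs.append(det(Aj) / d)
    return xs

A = [[Fraction(x) for x in r] for r in [[2, 1], [1, 3]]]
b = [Fraction(3), Fraction(4)]
print(cramer(A, b))   # [1, 1]
```

For large systems Gaussian elimination is far cheaper; Cramer's rule is mainly of theoretical interest.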
Exercises for Chapter 3.

1. The following matrices are echelon forms of coefficient matrices of linear systems. Which has a unique solution? Why?

(a) [ 1 2 3 4
      0 1 2 3
      0 0 1 2
      0 0 0 1 ]

(b) [ 1 2 3 4
      0 1 2 3
      0 0 0 1
      0 0 0 0 ]
2. Find the general solution to the linear system

x1 + 2x2 + x3 − 2x4 = 5
2x1 + 4x2 + x3 + x4 = 9
3x1 + 6x2 + 2x3 − x4 = 14
3. Consider the linear system with parameter a

(2a − 1)x + ay − (a + 1)z = 1
ax + y − 2z = 1
2x + (3 − a)y + (2a − 6)z = 1

Determine, with proof, for which a this system has
(a) no solution (b) a unique solution (c) more than one solution.
4. Consider the linear system

x + 2y + z = 3
ay + 5z = 10
2x + 7y + az = b

(a) Find those values of a for which the system has a unique solution.
(b) Find those pairs of values (a, b) for which the system has more than one solution.
5. If A~x = ~b has more than one solution, why is it impossible for A~x = ~c (new right-hand side) to have only one solution? Could A~x = ~c have no solution?
6. Let A~x = ~0 be a homogeneous system of n linear equations in n unknowns and let Q be an invertible n × n matrix. Show that A~x = ~0 has a nontrivial solution if and only if (QA)~x = ~0 has a nontrivial solution.
7. If A~x = ~b has two distinct solutions ~p and ~q, find two distinct solutions to A~x = ~0.

8. Under what conditions on b1 and b2 (if any) is A~x = ~b consistent (has a solution)?

A = [ 1 2 0 3
      2 4 0 7 ]   and   ~b = [b1, b2]^T.
9. Find the number c so that (if possible) the rank of A is (a) 1 (b) 2 (c) 3, where

A = [ 6 4 2
      −3 −2 −1
      9 6 c ].
10. Suppose A = [ 1 2 1 b
                  2 a 1 8
                  ∗ ∗ ∗ ∗ ]  has the reduced echelon form R = [ 1 2 0 3
                                                                0 0 1 2
                                                                0 0 0 0 ].

(a) Find a and b. (b) Solve A~x = ~0.

11. Let A be an m × n matrix for which A~x = [1, 1, 1]^T has no solutions and A~x = [0, 1, 0]^T has a unique solution.
(a) Give all possible information about m and n and the rank of A.
(b) Find all solutions of A~x = ~0 and explain your answer.
12. Let A be a 3 × 4 matrix for which A~x = [1, 1, 1]^T has no solutions and A~x = [0, 1, 0]^T has more than one solution.
(a) Give all possible values for rankA.
(b) Do we always have more than one solution for A~x = ~0? Explain your answer.
(c) Is it possible to have a vector ~b such that A~x = ~b has a unique solution? Why?
13. Find the value of c in the following n × n inverse:

if A = [ n −1 . . . −1
         −1 n . . . −1
         . . .
         −1 −1 . . . n ]   then A^{−1} = (1/(n + 1)) [ c 1 . . . 1
                                                       1 c . . . 1
                                                       . . .
                                                       1 1 . . . c ].
14. If E is an elementary matrix, prove that ET is an elementary matrix.
15. Let E1, E2 and E3 denote, respectively, the elementary row operations
“Interchange rows R1 and R2” “Multiply R3 by 5” “Replace R2 by −3R1 +R2”.
(a) Find the corresponding 3 × 3 elementary matrices E1, E2 and E3.
(b) Find the inverses of the matrices E1, E2 and E3.

16. Let A be a 3 × 3 invertible matrix. Construct B by replacing R3 of matrix A by R3 − 4R1. How do we find B^{−1} from A^{−1}? Explain.
17. Let A be a 3 × 3 matrix and B = [ 0 1 0
                                      1 0 0
                                      0 0 1 ].

Consider the augmented matrix C = [A : B]. After row reducing C, we get the following matrix

[ 1 0 1 | 2 −3 −4
  0 1 0 | −1 2 2
  0 0 −1 | 0 0 1 ].

Compute A^{−1}.
18. (a) For which values of the parameter c is A = [ −2 1 c
                                                     0 −1 1
                                                     1 2 0 ]  invertible?

(b) For which values of e is the matrix A = [ 5 e e
                                              e e e
                                              1 2 e ]  not invertible?
19. Let A = [ a b b
              a a b
              a a a ].

If a ≠ 0 and a ≠ b, prove that A is invertible and find A^{−1} in terms of a and b.
20. Show that if A = [ 1 0 0
                       0 1 0
                       a b c ]  is an elementary matrix, then at least one entry in the third row must be zero.

21. In each case find an elementary matrix E such that B = EA.

(a) A = [ 2 1 ; 3 −1 ], B = [ 3 −1 ; 2 1 ]

(b) A = [ 2 1 ; 3 −1 ], B = [ −1 −3 ; 3 −1 ]
22. In each case find an invertible matrix U such that UA = B, and express U as a product of elementary matrices.

(a) A = [ 2 1 3
          −1 1 2 ], B = [ 1 −1 −2
                          3 0 1 ]

(b) A = [ 2 −1 0
          1 1 1 ], B = [ 3 0 1
                         2 −1 0 ]
23. In each case find invertible matrices U and V such that UAV is in the Smith normal form.

(a) A = [ 1 1 −1
          −2 −2 4 ]

(b) A = [ 3 2
          2 1 ]

(c) A = [ 1 −2
          −2 4 ]

(d) A = [ 1 −1 2 1
          2 −1 0 3
          0 1 −4 1 ]
24. Let F be a field and A = [aij] ∈ Mn(F). Define the trace of A to be the sum of the diagonal elements, that is,

trA = ∑_{i=1}^{n} aii.

(a) Show that the trace is a linear transformation from Mn(F) onto F.
(b) If A and B are in Mn(F), then tr(AB) = tr(BA).
(c) If B is invertible, then tr(B^{−1}AB) = trA.
(d) Prove that there are no square real matrices A and B such that AB − BA = In.
25. Let A = [ 1 2 0 2 1
              −1 −2 1 1 0
              1 2 −3 −7 −2 ].

(a) Find bases for ColA and NulA. (b) Find rankA and nullityA.
26. Determine the rank and nullity of A = [ 1 2 0 3
                                            0 2 2 2
                                            0 0 0 0
                                            0 0 0 4 ].
27. If A is an n× n matrix such that A2 = A and rankA = n, prove that A = In.
28. Let A be a 5 × 7 matrix with rank 4.
(a) What is the dimension of the solution space of A~x = ~0?
(b) Does A~x = ~b have a solution for all ~b ∈ R^5? Explain.

29. Let A be a square matrix such that A^k = 0 for some positive integer k. Prove that I + A is invertible.

30. Let A and B be m × n and n × m matrices, respectively. If m > n, show that AB is not invertible.

31. Let A and B be m × n and n × m matrices, respectively. Show that AB = 0_{m×m} ⇔ ColB ⊆ NulA.
32. Determine the sign of all permutations in S4 and expand the determinant

| a11 a12 a13 a14
  a21 a22 a23 a24
  a31 a32 a33 a34
  a41 a42 a43 a44 |

by using permutations and their signs explicitly.

33. Determine the sign of the following permutations in S5.
(a) 12354 (b) 12534 (c) 15243 (d) 54321

34. Show that if two rows (columns) of A are proportional, i.e., Rk = cRl for some k < l, then |A| = 0.

35. Let A = [aij] be a square matrix of order n and σ ∈ Sn. If Aσ = [aσ(i),j], show that |Aσ| = (sgn σ)|A|.

36. Prove that if n is odd, 1 + 1 ≠ 0 and A is a square matrix of order n with A = −A^T, then A is not invertible.

37. After the indicated row operations on a 3 × 3 matrix A with detA = −540, matrices A1, A2, . . . , A5 are successively obtained:

A −(R1 + 3R2)→ A1 −(R2 ↔ R3)→ A2 −(3R2 − R1)→ A3 −(R1 − 3R2)→ A4 −(2R1)→ A5.

Determine the values of |A1|, |A2|, |A3|, |A4| and |A5|, respectively.

38. If A is an invertible square matrix of order n > 1, show that det(adjA) = (detA)^{n−1}.
What is det(adjA) if A is not invertible? Prove your answer.

39. Let A, B, C be 3 × 3 matrices with detA = 3, det(B³) = −8, detC = 2. Compute
(a) det(ABC) (b) det(5AC^T) (c) det(A³B^{−3}C^{−1}) (d) det[B^{−1}(adjC)].

40. Show that adjA^T = (adjA)^T.

41. Show that if A is invertible and n > 2, then adj(adjA) = (detA)^{n−2}A.

42. If A and B are invertible, show that

adj(AB) = (adjB)(adjA) and adj(BAB^{−1}) = B(adjA)B^{−1}.
43. Prove that if A is an invertible upper triangular matrix (all entries lying below the diagonal are zero), then adjA and A^{−1} are upper triangular.

44. Suppose the real-valued functions f1(x), f2(x), . . . , fk(x) are all defined and are differentiable k − 1 times on the interval [a, b]. The Wronskian of the set of functions is defined on this interval to be the determinant

W(x) = | f1(x)          f2(x)          · · ·  fk(x)
         f1′(x)         f2′(x)         · · ·  fk′(x)
         f1′′(x)        f2′′(x)        · · ·  fk′′(x)
         ...
         f1^{(k−1)}(x)  f2^{(k−1)}(x)  · · ·  fk^{(k−1)}(x) |.
Prove that a set of real-valued functions {f1(x), f2(x), . . . , fk(x)}, differentiable k − 1 times on the interval [a, b], is linearly independent if W(x0) ≠ 0 at some point x0 in the interval.
45. Consider the interval [−1, 1] and the two functions defined by

f(x) = { 0   if −1 ≤ x ≤ 0,         g(x) = { x²  if −1 ≤ x ≤ 0,
         x²  if 0 ≤ x ≤ 1,                    0   if 0 ≤ x ≤ 1.

These functions are both differentiable. Show that f and g are linearly independent but W(x) = 0 for all x ∈ [−1, 1]. This provides an example showing that the converse of the previous problem does not hold.
46. (a) Show that the functions 1, x, x², . . . , x^k are linearly independent in the function space C⁰[0, 1].
(b) Show that the functions sin x, sin 2x, sin 3x, . . . , sin kx are linearly independent in the function space C⁰[0, 2π]. (Hint. Use the Wronskian.)
47. Use induction to show that

| 1 1 1 · · · 1 1
  1 0 0 · · · 0 0
  0 1 0 · · · 0 0
  0 0 1 · · · 0 0
  ...
  0 0 0 · · · 1 0 |  = (−1)^{n+1}.
48. (a) Let x1, x2 and x3 be numbers. Show that

V2 = | 1 x1
       1 x2 |  = x2 − x1   and   V3 = | 1 x1 x1²
                                        1 x2 x2²
                                        1 x3 x3² |  = (x2 − x1)(x3 − x1)(x3 − x2).
(b) If x1, x2, . . . , xn are numbers, then show by induction that

Vn = | 1 x1 . . . x1^{n−1}
       1 x2 . . . x2^{n−1}
       . . .
       1 xn . . . xn^{n−1} |  = ∏_{i<j} (xj − xi).

This determinant is called the Vandermonde determinant. (Hint. To do the induction easily, multiply each column by x1 and subtract it from the next column on the right, starting from the right-hand side. We shall find that Vn = (xn − x1) · · · (x2 − x1)V_{n−1}.)
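The identity in (b) can be checked numerically before attempting the induction. This sketch (the nodes 2, 5, 7, 11 are an arbitrary choice of ours) compares a permutation-formula determinant with the product ∏_{i<j}(xj − xi):

```python
from fractions import Fraction
from itertools import permutations

def det(A):
    """Determinant via the permutation (Leibniz) formula |A| = sum (sgn s) prod a_{i,s(i)}."""
    n = len(A)
    total = Fraction(0)
    for perm in permutations(range(n)):
        # sgn(perm) = (-1)^{number of inversions}
        inv = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
        term = (-1) ** inv
        for i in range(n):
            term *= A[i][perm[i]]
        total += term
    return total

xs = [Fraction(v) for v in (2, 5, 7, 11)]
n = len(xs)
V = [[x ** k for k in range(n)] for x in xs]   # rows (1, x_i, ..., x_i^{n-1})
prod = Fraction(1)
for i in range(n):
    for j in range(i + 1, n):
        prod *= xs[j] - xs[i]
print(det(V) == prod)   # True
```

A numerical check of this kind is not a proof, of course; the induction in the hint is still needed.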
4 | Linear Transformations
4.1 Linear Functionals
Definition. Let V and W be two vector spaces over F . We write L(V,W ) for the set of all linear
transformations from V to W , that is,
L(V,W ) = {T : V →W |T is a linear transformation}.
Then L(V,W) is a vector space over F with the operations defined, for S, T ∈ L(V,W), by
(S + T )(~v) = S(~v) + T (~v) and (cT )(~v) = c T (~v)
for all ~v ∈ V and c ∈ F . Note that the zero function is its zero vector and (−T )(~v) = −T (~v) for
all ~v ∈ V .
Remark. By Theorem 1.4.1, for a given basis B = {~v1, ~v2, . . . , ~vn} for an n-dimensional vector space V and given vectors ~w1, . . . , ~wn ∈ W, there exists a unique linear transformation T : V → W such that T(~vi) = ~wi for all i ∈ {1, 2, . . . , n}. Then for S, T ∈ L(V,W), (S(~vi) = T(~vi) for all i ∈ {1, 2, . . . , n}) ⇒ S = T. Hence, to show that two linear transformations are identical, it suffices to check the equality on some basis of V.
Theorem 4.1.1. Let B = {~v1, . . . , ~vn} be a basis for V and let C = {~w1, . . . , ~wm} be a basis for W .
For each i ∈ {1, . . . , n} and j ∈ {1, . . . ,m}, we define
Tij(~vk) = { ~wj  if i = k,
            ~0W  if i ≠ k,
for all k ∈ {1, . . . , n}. By Theorem 1.4.1, Tij ∈ L(V,W ) for all i, j. Then
{Tij : i ∈ {1, . . . , n} and j ∈ {1, . . . ,m}}
is a basis for L(V,W ). Hence, if dimV = n and dimW = m, then dimL(V,W ) = mn.
Definition. Let V be a vector space over a field F .
A linear transformation from V to F is also called a linear functional. Let
V ∗ = L(V, F ) and V ∗∗ = (V ∗)∗(= L(V ∗, F ) = L(L(V, F ), F )).
By Theorem 4.1.1, we have that if V is finite dimensional, then
dimV = dimV ∗ = dimV ∗∗
and thus, by Corollary 1.4.14, V ∼= V ∗ ∼= V ∗∗.
Definition. The space V ∗ is called the dual space of V and V ∗∗ is called the double dual of V .
Examples 4.1.1. The following functions are linear functionals.

1. T : C⁰[0, 1] → R given by T(f) = ∫₀¹ f(x) dx.

2. T : F[x] → F given by T(p(x)) = p(1).
Remarks. 1. For f ∈ V∗,
(a) f ≠ 0 ⇒ im f = F
(b) if V is finite dimensional and f ≠ 0, then nullity f = (dimV) − 1.
2. For ~v ∈ V, if f(~v) = 0 for all f ∈ V∗, then ~v = ~0.
Theorem 4.1.2. Let dimV = n and let B = {~v1, . . . , ~vn} be a basis of V .
For each i ∈ {1, . . . , n}, let fi ∈ V ∗ be such that
fi(~vj) = { 1  if i = j,
           0  if i ≠ j.
Then the following statements hold.
1. {f1, . . . , fn} is a basis of V ∗ which is called the dual basis of B.
2. ∀f ∈ V∗, f = ∑_{i=1}^{n} f(~vi)fi = f(~v1)f1 + . . . + f(~vn)fn.

3. ∀~v ∈ V, ~v = ∑_{i=1}^{n} fi(~v)~vi = f1(~v)~v1 + . . . + fn(~v)~vn.
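For V = F^n the dual basis of Theorem 4.1.2 is easy to compute: if the basis vectors are the columns of a matrix M, the functionals fi are given by the rows of M^{−1}, since (M^{−1}M)ij = δij. A sketch for a basis of R² of our own choosing:

```python
from fractions import Fraction

# Basis B = {v1, v2} of R^2; the dual-basis functionals f1, f2 are the rows of
# the inverse of the matrix whose columns are v1, v2 (so that f_i(v_j) = delta_ij).
v1, v2 = [Fraction(1), Fraction(1)], [Fraction(1), Fraction(-1)]
M = [[v1[0], v2[0]], [v1[1], v2[1]]]           # columns are the basis vectors
d = M[0][0] * M[1][1] - M[0][1] * M[1][0]
Minv = [[M[1][1] / d, -M[0][1] / d], [-M[1][0] / d, M[0][0] / d]]

def f(i, v):
    """Dual-basis functional f_i applied to v (row i of Minv)."""
    return Minv[i][0] * v[0] + Minv[i][1] * v[1]

# Check item 3 of the theorem: v = f_1(v) v1 + f_2(v) v2 for a sample v.
v = [Fraction(3), Fraction(5)]
c1, c2 = f(0, v), f(1, v)
print([c1 * v1[k] + c2 * v2[k] for k in range(2)])   # reconstructs [3, 5]
```

The same row-of-the-inverse description is what Exercise 8 below asks you to carry out by hand in R³.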
For ~v ∈ V, define L~v : V ∗ → F by L~v(f) = f(~v) for all f ∈ V ∗.
Then L~v ∈ V ∗∗ for all ~v ∈ V . Hence, {L~v : ~v ∈ V } ⊆ V ∗∗.
Theorem 4.1.3. 1. The map θ : ~v 7→ L~v is a 1-1 linear transformation from V into V ∗∗.
2. If V is finite-dimensional, then
(a) the map θ : ~v 7→ L~v is an isomorphism of V onto V ∗∗
(b) ∀L ∈ V ∗∗, ∃!~v ∈ V, L = L~v.
Corollary 4.1.4. If V is finite dimensional, then each basis of V ∗ is the dual of some basis of V .
Example 4.1.2. Consider V = R2[x], the vector space of all polynomials of degree at most 2 over R. Let t1, t2, t3 be three distinct real numbers and let fi(p(x)) = p(ti) for all p(x) ∈ R2[x] and i = 1, 2, 3.
Show that {f1, f2, f3} is a basis of V ∗ and find a basis of V such that {f1, f2, f3} is its dual
basis.
Let V be an inner product space over a field F = R or C.
1. ∀~w ∈ V , the map ~v 7→ (~v, ~w) is a linear functional on V .
2. The maps ~v 7→ (~v, ~w1) and ~v 7→ (~v, ~w2) are identical⇔ ~w1 = ~w2.
Theorem 4.1.5. Let V be a finite dimensional inner product space and f ∈ V ∗.
Then ∃!~w ∈ V, f(~v) = (~v, ~w) for all ~v ∈ V .
Hence, V ∗ = {f~w : ~w ∈ V } where f~w(~v) = (~v, ~w) for all ~v ∈ V .
4.2 Quotient Spaces and Isomorphism Theorem
Let V be a vector space over a field F and let W be any subspace of V . For ~v ∈ V , define
~v +W = {~v + ~w : ~w ∈W}
which is called a coset of W . Then
(1) ∀~v1, ~v2 ∈ V,~v1 +W = ~v2 +W ⇔ ~v1 − ~v2 ∈W ,
(2) ∀~v1, ~v2 ∈ V, (~v1 +W ) ∩ (~v2 +W ) = ∅ or ~v1 +W = ~v2 +W and
(3) ∀~v1, ~v2 ∈ V, (~v1 +W ) + (~v2 +W ) = (~v1 + ~v2) +W .
For c ∈ F and ~v ∈ V , define c(~v +W ) = c~v +W .
Definition. Let V/W = {~v + W : ~v ∈ V }. It is a vector space over F with respect to the
operations
(~v1 + W) + (~v2 + W) = (~v1 + ~v2) + W   and   c(~v1 + W) = c~v1 + W,
and ~0 +W is the zero vector of V/W and −(~v +W ) = (−~v) +W for all ~v ∈ V .
The vector space V/W is called the quotient space of V by W .
Theorem 4.2.1. 1. There is a linear transformation π from V onto V/W given by
π : ~v 7→ ~v +W for all ~v ∈ V .
Its kernel is equal to W . This map π is called the canonical projection from V onto V/W .
2. If V is a finite dimensional vector space and W is a subspace of V , then V/W is finite
dimensional and dim(V/W ) = dimV − dimW .
Theorem 4.2.2. [Isomorphism Theorem] Let V and W be two vector spaces over a field F and
T : V →W a linear transformation. Then
V/(kerT ) ∼= imT.
Example 4.2.1. Let A be an m × n matrix and TA : Rn → Rm given by TA(~x) = A~x. Then we
have
Rn/(NulA) ∼= ColA.
Moreover, if ~b ∈ ColA, then A~x = ~b has a solution, say ~yp. Theorem 4.2.2 also gives the corre-
spondence
~yp +NulA←→ ~b.
This is Theorem 3.1.3 (2).
4.3 Matrix Representations
Definition. Let V be an n-dimensional vector space over a field F with an ordered basis B = {~v1, ~v2, . . . , ~vn}. Then ∀~v ∈ V, ∃!(c1, . . . , cn) ∈ F^n such that

~v = c1~v1 + c2~v2 + . . . + cn~vn,   and   [~v]B = [c1, c2, . . . , cn]^T ∈ F^n
Example 4.3.1. Let B = {(1, 1, 0, 0), (1, 0, 1, 0), (1, 1, 1, 0), (0, 0, 0, 2i)} be an ordered basis for C4.
Find [(2,−16, 3,−i)]B.
We recall Theorems 1.4.12 and 1.4.13 as follows.
Theorem 4.3.1. Let V be an n-dimensional vector space over F and B a basis for V .
1. For ~v, ~w ∈ V and c ∈ F , we have [~v + ~w]B = [~v]B + [~w]B and [c~v]B = c[~v]B.
2. The map ~v 7→ [~v]B is an isomorphism from V onto Fn.
This also implies ∀~u,~v ∈ V, [~u]B = [~v]B ⇔ ~u = ~v.
Theorem 4.3.2. Let T : V → W be a linear transformation where dimV = n and dimW = m, and let B = {~v1, . . . , ~vn} and C = {~w1, . . . , ~wm} be ordered bases of V and W, respectively. Then for each j ∈ {1, . . . , n}, we have

T(~vj) = d1j ~w1 + d2j ~w2 + · · · + dmj ~wm.

Hence, there exists a unique m × n matrix over the field F given by

[T]^C_B = [ [T(~v1)]C [T(~v2)]C · · · [T(~vn)]C ] = [ d11 d12 . . . d1n
                                                     d21 d22 . . . d2n
                                                     ...
                                                     dm1 dm2 . . . dmn ].

Furthermore, ϕ : T ↦ [T]^C_B is an isomorphism of L(V,W) onto Mm,n(F).
Definition. The matrix [T]^C_B is called the matrix for T relative to the ordered bases B and C. If V = W and B = C, then we write [T]B for [T]^B_B. In addition, if T : F^n → F^n is a linear transformation and B is the standard basis for F^n, we call [T]B the standard matrix for T.
Note that for ~v ∈ V ,
~v = c1~v1 + c2~v2 + · · ·+ cn~vn,
so that
T (~v) = c1T (~v1) + c2T (~v2) + · · ·+ cnT (~vn).
Thus,
[T(~v)]C = c1[T(~v1)]C + c2[T(~v2)]C + · · · + cn[T(~vn)]C = [ [T(~v1)]C [T(~v2)]C · · · [T(~vn)]C ] [c1, c2, . . . , cn]^T.

That is,

[T(~v)]C = [T]^C_B [~v]B for all ~v ∈ V.
We conclude this in the following diagram:

~v ────T────→ T(~v)
 │              │
 ↓              ↓
[~v]B ──[T]^C_B ·──→ [T(~v)]C = [T]^C_B [~v]B
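The identity [T(~v)]C = [T]^C_B [~v]B can be seen concretely with the differentiation map (our own example, not one from the text): T = d/dx : R2[x] → R1[x] with the monomial bases.

```python
# T = d/dx : R2[x] -> R1[x], with ordered bases B = {1, x, x^2} and C = {1, x}.
# Columns of [T]^C_B are the C-coordinates of T applied to each vector of B:
# T(1) = 0, T(x) = 1, T(x^2) = 2x.
T = [[0, 1, 0],
     [0, 0, 2]]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

p = [3, 2, 5]            # p(x) = 3 + 2x + 5x^2 in B-coordinates
print(matvec(T, p))      # [2, 10], i.e. p'(x) = 2 + 10x
```

Computing the derivative directly confirms the result: (3 + 2x + 5x²)′ = 2 + 10x, whose C-coordinates are exactly [T]^C_B [p]B.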
Example 4.3.2. 1. Let B = {1 + x, x} be an ordered basis for R1[x] and C = {1 + x, x, x² − 1, x³} an ordered basis for R3[x]. Let T : R1[x] → R3[x] be a linear transformation defined by

T(a + bx) = x²(a + bx) for all a, b ∈ R.

Find [T]^C_B.

2. Suppose T : M22(R) → R³ is a linear transformation with

[T]^C_B = [ 1 −1 0 0
            0 1 −1 0
            0 0 1 −1 ]

where B = { [ 1 0 ; 0 0 ], [ 0 1 ; 0 0 ], [ 0 0 ; 1 0 ], [ 0 0 ; 0 1 ] } and C = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}.

Compute T([ a b ; c d ]).
Theorem 4.3.3. Let V, W and Z be finite-dimensional vector spaces over a field F and let B, C and D be ordered bases of V, W and Z, respectively. If S : V → W and T : W → Z are linear transformations, then

[T ◦ S]^D_B = [T]^D_C [S]^C_B.

Moreover, if V = W = Z and B = C = D, then [T ◦ S]B = [T]B[S]B.
Corollary 4.3.4. Let V be a finite-dimensional vector space, B an ordered basis and T : V → V a linear transformation. Then

1. T is an isomorphism ⇒ [T]B is invertible and ([T]B)^{−1} = [T^{−1}]B.

2. [T]B is invertible ⇒ T is an isomorphism and [T^{−1}]B = ([T]B)^{−1}.
Theorem 4.3.5. Let T : V → W be a linear transformation where dimV = n and dimW = m.
If B and C are any ordered bases for V and W, respectively, then rankT = rank[T]^C_B.
Example 4.3.3. Define T : R2[x] → R3 by T (a + bx + cx2) = (a − 2b, 3c − 2a, 3c − 4b) for all
a, b, c ∈ R. Compute rankT .
4.4 Change of Bases
Definition. Let V be an n-dimensional vector space over a field F with an ordered basis B = {~v1, . . . , ~vn}. If B′ = {~v′1, . . . , ~v′n} is another ordered basis for V, we define the transition or change of coordinate matrix from B to B′ by PB→B′ = [I]^{B′}_B.
Theorem 4.4.1. Let B,B′ and B′′ be bases for V . Then
1. ∀~v ∈ V, [~v]B′ = PB→B′ [~v]B,
2. PB→B = In,
3. PB→B′ is invertible and (PB→B′)−1 = PB′→B,
4. PB→B′′ = PB′→B′′PB→B′ .
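A quick numerical check of parts 1 and 3 of Theorem 4.4.1, with B the standard basis of R² and B′ = {(1, 1), (1, −1)} (bases of our own choosing, not those of Example 4.4.1):

```python
from fractions import Fraction

half = Fraction(1, 2)
# B = standard basis of R^2, B' = {(1,1), (1,-1)}.
# P_{B'->B} has columns [v'_1]_B, [v'_2]_B; P_{B->B'} is its inverse.
P_BpB = [[1, 1], [1, -1]]
P_BBp = [[half, half], [half, -half]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(2)) for i in range(2)]

print(matmul(P_BBp, P_BpB))   # identity, so (P_{B->B'})^{-1} = P_{B'->B}
v = [3, 5]                    # [v]_B (B is standard, so coordinates = entries)
print(matvec(P_BBp, v))       # [v]_{B'} = [4, -1]: indeed 4(1,1) - 1(1,-1) = (3,5)
```

Hand-inverting a 2 × 2 matrix suffices here; for larger bases one would row reduce [P | I].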
Example 4.4.1. Let B = { [−1, 0]^T, [1, −1]^T } and B′ = { [1, 1]^T, [2, −1]^T } be ordered bases for R².

(a) Find PB→B′.

(b) If ~v = [0, −1]^T, find [~v]B and [~v]B′.
Definition. A linear transformation from V to V is called a linear operator on V .
Theorem 4.4.2. Let B and B′ be two bases for a finite dimensional vector space V. If T : V → V is a linear operator, then

[T]B′ = [I]^{B′}_B [T]B [I]^B_{B′} = PB→B′ [T]B (PB→B′)^{−1}.
Example 4.4.2. Let T : R³ → R³ be a linear transformation with standard matrix

[ 2 1 0
  6 1 −1
  0 0 1 ].

Find [T]B′ where B′ = { [1, 2, 0]^T, [1, −3, 0]^T, [−1, 1, −6]^T }.
From the above theorem, we have
det[T ]B = det[T ]B′ and rank[T ]B = rank[T ]B′
for any two bases B and B′ for V .
Definition. If T : V → V is a linear operator, we define the determinant of T by
detT = det[T ]B for some basis B for V .
Definition. For n × n matrices A and B, we say that A is similar to B, A ∼ B, if there exists
an invertible matrix P ∈Mn(F ) such that B = P−1AP .
Remarks. 1. ∼ is an equivalence relation on Mn(F).
2. If A ∼ B, then A^T ∼ B^T, A^k ∼ B^k for all k ∈ N, and A^{−1} ∼ B^{−1} (if the inverses exist).
3. [T ]B and [T ]B′ are similar for any two bases B and B′ of V .
Definition. The trace of an n× n matrix A is the sum of the diagonal elements.
Theorem 4.4.3. Let A and B be similar matrices. Then
1. detA = detB,
2. rankA = rankB,
3. trA = trB.
Exercises for Chapter 4.

1. If T : V → W is an isomorphism and B is a basis for V, prove that T(B) is a basis for W.

2. Let T : V → V be a linear transformation. Suppose that there exists a ~v ∈ V such that T(T(~v)) ≠ ~0 and T(T(T(~v))) = ~0. Prove that {~v, T(~v), T(T(~v))} is linearly independent.

3. Let S, T ∈ L(V,W) and c ∈ F. Prove that:
(a) kerS ∩ kerT ⊆ ker(S + cT) (b) im(S + T) ⊆ imS + imT.

4. Let E be a linear transformation on a vector space V such that E ◦ E = E. Prove that the following statements hold.
(a) ∀~v ∈ V, ~v ∈ imE ⇔ E(~v) = ~v (b) ∀~v ∈ V, ~v − E(~v) ∈ kerE (c) V = kerE ⊕ imE.

5. Let f, g ∈ V∗. If ker f ⊆ ker g, prove that g = cf for some c ∈ F.

6. Let V be an n-dimensional vector space over F. If f, g ∈ V∗ are linearly independent, find dim(ker f ∩ ker g).

7. If V and W are finite dimensional vector spaces which are isomorphic, prove that V∗ ∼= W∗.

8. Let B = {(1, 0,−1), (1, 1, 1), (2, 2, 0)} be a basis for R³. Find the dual basis of B.
9. Consider V = R1[x]. Let f1 : V → R and f2 : V → R be defined by

f1(p(x)) = ∫₀¹ p(x) dx   and   f2(p(x)) = ∫₀² p(x) dx.

Clearly, f1, f2 ∈ V∗. Prove that {f1, f2} is a basis for V∗ and find a basis of V such that {f1, f2} is its dual basis.
10. (a) Let W be a subspace of a finite dimensional vector space V. If B = {x1, . . . , xm} is a basis for W and {x1, . . . , xm, xm+1, . . . , xn} is a basis of V, show that {xm+1 + W, . . . , xn + W} is a basis for V/W.
(b) Let H = Span{(1, 1,−1)}. Determine a basis for R³/H.

11. Let W1 and W2 be two subspaces of a vector space V. Define T : W1 + W2 → W2/(W1 ∩ W2) by T(~w1 + ~w2) = ~w2 + (W1 ∩ W2) for all ~w1 ∈ W1 and ~w2 ∈ W2.
(a) Prove that T is well defined and is an onto linear transformation.
(b) Prove that kerT = W1.
(c) Conclude by Theorem 4.2.2 that (W1 + W2)/W1 ∼= W2/(W1 ∩ W2). This is a generalization of Theorem 1.4.8.

12. Let W1 and W2 be subspaces of V with W1 ⊆ W2. Define T : V/W1 → V/W2 by T(~v + W1) = ~v + W2 for all ~v ∈ V.
(a) Prove that T is well defined and is an onto linear transformation.
(b) Prove that kerT = W2/W1.
(c) Conclude by Theorem 4.2.2 that (V/W1)/(W2/W1) ∼= V/W2.

13. Let U, V and W be finite dimensional vector spaces over a field F. Let S : U → V and T : V → W be linear transformations such that T ◦ S is the zero map. Show that

dim(W/ imT) − dim(kerT/ imS) + nullityS = dimW − dimV + dimU.

14. Let V and W be finite dimensional vector spaces over a field F. Let U be a subspace of V and T : V → W a linear transformation.
(a) Prove that dim(V/U) ≥ dim(T(V)/T(U)).
(b) If T is 1-1, prove also that the inequality in (a) becomes an equality.

15. For S ⊆ V, let A(S) = {f ∈ V∗ : f(~v) = 0 for all ~v ∈ S}. It is called the annihilator of S. Prove that
(a) A(S) is a subspace of V∗ (b) If S1 ⊆ S2, then A(S1) ⊇ A(S2)
(c) If V is finite dimensional and W is a subspace of V, then V∗/A(W) ∼= W∗.
16. Prove that ∀S, T ∈ L(V, V), S ◦ T ∈ L(V, V).

17. Let T : V → W be a linear transformation where dimV = dimW = n. Prove that the following statements are equivalent.
(i) T is an isomorphism.
(ii) [T]^C_B is invertible for all ordered bases B and C of V and W, respectively.
(iii) [T]^C_B is invertible for some pair of ordered bases B and C of V and W, respectively.

18. Suppose the linear transformation T : R² → R² is given by
T(1, 1) = (2, 3) and T(−1, 1) = (4, 5).

Find the standard matrix for T.

19. Let B = {1, x, x²} be an ordered basis for R2[x] and C = {(1, 0), (1,−1)} an ordered basis for R². Find [T]^C_B if T : R2[x] → R² is a linear transformation defined by T(a + bx + cx²) = (a + c, 2b) for all a, b, c ∈ R.

20. Let B = {sin t, cos t} and B′ = {sin t + 2 cos t, sin t − cos t} be ordered bases for H = Span B = Span B′, which is a subspace of C¹(−∞,∞). Let D : H → H be defined by D(f) = f′′ for all f ∈ H.

21. If T : V → W is an isomorphism, prove that ([T]^C_B)^{−1} = [T^{−1}]^B_C for all ordered bases B and C of V and W, respectively.

22. Let T : R2[x] → R³ be a linear transformation defined by T(a + bx + cx²) = (a − c, b, a − 2c) for all a, b, c ∈ R. If B = {1, x, x²} and C = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}, find [T]^C_B and the formula for T^{−1}.

23. Let T : Rn[x] → Rn[x] be a linear transformation defined by

T(p(x)) = p(x) + xp′(x),

where p′(x) is the derivative of p(x). Show that T is an isomorphism by finding [T]B where B = {1, x, x², . . . , x^n}.
24. Let α be a real number. Define a linear transformation Tα : M2(R) → M2(R) by

Tα(A) = A + αA^T for all A ∈ M2(R).

If B = { [ 1 0 ; 0 0 ], [ 0 0 ; 0 1 ], [ 0 1 ; 1 0 ], [ 0 −1 ; 1 0 ] }, find [Tα]B and conclude that Tα is invertible if α² ≠ 1.

25. Let T : R² → R² be a linear transformation defined by T(x, y) = (−y, x) for all x, y ∈ R, and let A be the standard matrix for T. Prove that
(a) ∀c ∈ R, A − cI2 is invertible,
(b) if B is an ordered basis for R² and [T]B = [ a b ; c d ], then bc ≠ 0.
5 | Structure Theorems
5.1 Eigenvalues and Eigenvectors
We first recall some numerical examples.
Example 5.1.1. Diagonalize A = [ 1 0 ; −1 2 ]. That is, find an invertible matrix P and a diagonal matrix D (if any) such that A = PDP^{−1}.

Example 5.1.2. Let A = [ 3 1 −1
                         2 2 −1
                         2 2 0 ]

and T(~x) = A~x a matrix transformation on R³. Find a basis B (if any) such that [T]B is a diagonal matrix. Given: det(A − λI) = −(λ − 1)(λ − 2)².
Definition. Let V be a vector space over a field F and T ∈ L(V, V). A scalar λ ∈ F is called an eigenvalue or characteristic value of T if there exists a nonzero vector ~v ∈ V such that T(~v) = λ~v. If λ is an eigenvalue of T, then a nonzero ~v ∈ V such that T(~v) = λ~v is called an eigenvector or characteristic vector of T associated with the characteristic value λ. We have that

Eλ(T) = {~v ∈ V : T(~v) = λ~v} = {~v ∈ V : (T − λI)(~v) = ~0V} = ker(T − λI)

is a subspace of V, called the eigenspace or characteristic space of T associated with λ.

Remark. λ is an eigenvalue of T ⇔ ker(T − λI) ≠ {~0V} ⇔ T − λI is not 1-1.
For matrix theory, we restrict ourselves to the case where V is n-dimensional. Then L(V, V) ∼= Mn(F) via T ↦ [T]B for a fixed basis B of V. Hence, we may work entirely in Mn(F).
Definition. Let A ∈Mn(F ). The matrix transformation TA : Fn → Fn is given by
TA(~x) = A~x
for all ~x ∈ Fn. An eigenvalue of TA is called an eigenvalue of A and the eigenspace of TA is
called an eigenspace of A. In other words,
Eλ(A) = {~x ∈ F^n : A~x = λ~x} = {~x ∈ F^n : (A − λIn)~x = ~0} = Nul(A − λIn).

Then

λ is an eigenvalue of A ⇔ ker(TA − λI) ≠ {~0}
                        ⇔ Nul(A − λIn) ≠ {~0}
                        ⇔ A − λIn is not invertible
                        ⇔ det(A − λIn) = 0.
Definition. The polynomial cA(x) = det(xIn−A) is called the characteristic polynomial of A.
Thus we have proved
Theorem 5.1.1. For A ∈Mn(F ), λ is an eigenvalue of A⇔ det(A− λIn) = 0, i.e., λ is a root of
the characteristic polynomial of A.
Since an eigenvalue of an n × n matrix A is a root of cA(x) = det(xIn − A), which has degree n, and a polynomial of degree n over a field F has at most n roots in F, A has at most n eigenvalues.
Theorem 5.1.2. An n× n matrix has at most n eigenvalues.
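For a 2 × 2 matrix the characteristic polynomial is cA(x) = x² − (trA)x + detA, so the eigenvalues can be read off from the quadratic formula. A sketch on an arbitrary sample matrix of ours with real eigenvalues:

```python
import math

# Eigenvalues of a 2x2 matrix as roots of c_A(x) = x^2 - (tr A) x + det A,
# assumed real here for simplicity (disc >= 0).
A = [[4, 1],
     [2, 3]]
tr = A[0][0] + A[1][1]                        # 7
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]   # 10
disc = tr * tr - 4 * det
roots = sorted([(tr - math.sqrt(disc)) / 2, (tr + math.sqrt(disc)) / 2])
print(roots)   # [2.0, 5.0]
```

Indeed c_A(x) = x² − 7x + 10 = (x − 2)(x − 5), in line with Theorem 5.1.1.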
Remark. If A is similar to B, then detA = detB and
det(B − λIn) = det(P−1AP − λP−1InP ) = det(P−1(A− λIn)P ) = det(A− λIn).
Therefore, we have the following result.
Theorem 5.1.3. If A and B are similar n×n matrices, then A and B have the same characteristic
polynomial and eigenvalues (with same multiplicities).
Example 5.1.3. The matrices

A = [ 1 1 ; 0 1 ]   and   I = [ 1 0 ; 0 1 ]

have the same determinant, trace, characteristic polynomial and eigenvalues, but they are not similar because PIP^{−1} = I for any invertible matrix P.
Definition. A diagonal matrix D is a square matrix such that all the entries off the main diagonal are zero, that is, D is of the form

D = [ λ1 0  . . . 0
      0  λ2 . . . 0
      ...
      0  0  . . . λn ]  = diag(λ1, λ2, . . . , λn),
where λ1, λ2, . . . , λn ∈ F (not necessarily distinct).
Definition. An n× n matrix A over F is said to be diagonalizable if A is similar to a diagonal
matrix, that is, there are an invertible matrix P and a diagonal matrix D such that P−1AP = D.
In this case, we say that P diagonalizes A.
Definition. Let V be a finite dimensional vector space and T ∈ L(V, V ) a linear operator. We
say that T is diagonalizable if there exists a basis B for V such that [T ]B is a diagonal matrix.
Theorem 5.1.4. Let A be an n × n matrix.

1. A is diagonalizable ⇔ A has eigenvectors ~v1, . . . , ~vn such that P = [~v1 · · · ~vn] is invertible.

2. When this is the case, P^{−1}AP = diag(λ1, λ2, . . . , λn), where for each i, λi is the eigenvalue of A corresponding to ~vi.
Proof. Let P = [~v1 ~v2 . . . ~vn] and D = diag(λ1, λ2, . . . , λn). Then AP = PD becomes

A [~v1 ~v2 · · · ~vn] = [~v1 ~v2 · · · ~vn] diag(λ1, λ2, . . . , λn),

that is,

[A~v1 A~v2 · · · A~vn] = [λ1~v1 λ2~v2 · · · λn~vn].
Comparing columns shows that A~vi = λi~vi for each i, so
P−1AP = D ⇔ P is invertible and A~vi = λi~vi for all i ∈ {1, . . . , n}.
The results follow.
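Theorem 5.1.4 can be checked numerically: collect the eigenvectors into P and verify that P^{−1}AP is diagonal. The 2 × 2 matrix and its eigenvectors below are our own example, worked in exact rational arithmetic:

```python
from fractions import Fraction

A = [[Fraction(4), Fraction(1)], [Fraction(2), Fraction(3)]]
# Eigenvectors: A(1,1)^T = 5(1,1)^T and A(1,-2)^T = 2(1,-2)^T, so take them as columns of P.
P = [[Fraction(1), Fraction(1)], [Fraction(1), Fraction(-2)]]
d = P[0][0] * P[1][1] - P[0][1] * P[1][0]
Pinv = [[P[1][1] / d, -P[0][1] / d], [-P[1][0] / d, P[0][0] / d]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

D = matmul(matmul(Pinv, A), P)
print(D)   # diag(5, 2)
```

The diagonal entries appear in the order of the eigenvector columns of P, exactly as part 2 of the theorem states.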
Theorem 5.1.5. Let ~v1, . . . , ~vm be eigenvectors corresponding to distinct eigenvalues λ1, . . . , λm of
an n× n matrix A. Then {~v1, . . . , ~vm} is linearly independent.
Proof. We use induction on m.

If m = 1, then {~v1} is linearly independent because ~v1 ≠ ~0.

Let k ≥ 1 and assume the theorem is true for any k eigenvectors corresponding to distinct eigenvalues.
Let ~v1, . . . , ~vk+1 be eigenvectors corresponding to distinct eigenvalues λ1, . . . , λk+1 of A.
Let c1, . . . , ck+1 ∈ F be such that
c1~v1 + c2~v2 + · · ·+ ck+1~vk+1 = ~0. (5.1.1)
Since A~vi = λi~vi for all i, multiplying both sides by A gives
c1λ1~v1 + c2λ2~v2 + · · ·+ ck+1λk+1~vk+1 = ~0. (5.1.2)
Subtracting λ1 times (5.1.1) from (5.1.2), we have
c2(λ2 − λ1)~v2 + · · ·+ ck+1(λk+1 − λ1)~vk+1 = ~0.
Since ~v2, . . . , ~vk+1 are k eigenvectors corresponding to distinct eigenvalues, they are linearly independent by the induction hypothesis, so
c2(λ2 − λ1) = · · · = ck+1(λk+1 − λ1) = 0.
However, λ1, . . . , λk+1 are distinct, hence we get
c2 = · · · = ck+1 = 0.
This implies c1~v1 = ~0, so c1 = 0 because ~v1 ≠ ~0.
Therefore, {~v1, . . . , ~vk+1} is linearly independent.
Corollary 5.1.6. If A is an n× n matrix with n distinct eigenvalues, then A is diagonalizable.
Proof. Let ~v1, . . . , ~vn be eigenvectors corresponding to distinct eigenvalues λ1, . . . , λn of A.
Then they are linearly independent, and so P =[~v1 . . . ~vn
]is invertible
and P−1AP = diag(λ1, λ2, . . . , λn).
Lemma 5.1.7. Let {~v1, . . . , ~vk} be a linearly independent set of eigenvectors of an n×n matrix A,
extend it to a basis of Fn, and let
P = [~v1 . . . ~vk ~vk+1 . . . ~vn]

which is invertible. If λ1, . . . , λk are the (not necessarily distinct) eigenvalues of A corresponding to ~v1, . . . , ~vk, respectively, then P^{−1}AP has block form

P^{−1}AP = [ diag(λ1, . . . , λk)  B
             0                    C ]
where B has size k × (n− k) and C has size (n− k)× (n− k).
Definition. An eigenvalue λ of a square matrix A is said to have multiplicity m if it occurs m times as a root of the characteristic polynomial cA(x).
In other words,
cA(x) = (x− λ)mg(x)
for some polynomial g(x) such that g(λ) ≠ 0.
Lemma 5.1.8. Let λ be an eigenvalue of multiplicity m of a square matrix A.
Then nullity(A− λI) = dimEλ(A) ≤ m.
Proof. Assume that dimEλ(A) = d with basis {~v1, . . . , ~vd}. By Lemma 5.1.7, there exists an invertible n × n matrix P such that

P^{−1}AP = [ λId  B
             0    C ]  = M

where Id is the d × d identity matrix. Since M and A are similar,

cA(x) = cM(x) = det(xIn − M) = | (x − λ)Id  B
                                 0          xIn−d − C |
      = det((x − λ)Id) · det(xIn−d − C)
      = (x − λ)^d cC(x).
Hence, d ≤ m because m is the highest power of (x− λ) in cA(x).
Theorem 5.1.9. Let λ1, . . . , λk be all the distinct eigenvalues of an n × n matrix A. For each i ∈ {1, . . . , k}, let mi denote the multiplicity of λi and write di = nullity(A − λiIn). Then 1 ≤ di ≤ mi for all i, n = m1 + · · · + mk and

cA(x) = (x − λ1)^{m1} · · · (x − λk)^{mk}.
Moreover, the following statements are equivalent.
(i) A is diagonalizable.
(ii) di = nullity(A− λiIn) = dimEλi(A) = mi for all i.
(iii) n = d1 + · · ·+ dk.
5.2 Annihilating Polynomials
Let A be an n × n matrix over a field F. Since dim Mn(F) = n², the set {In, A, A², . . . , A^{n²}} is
linearly dependent. Then there exist c0, c1, . . . , c_{n²} in F, not all zero, such that
c0In + c1A + c2A² + · · · + c_{n²}A^{n²} = 0.
Let f(x) be the polynomial over F defined by f(x) = c0 + c1x + c2x² + · · · + c_{n²}x^{n²}. Then f(x) ≠ 0 and f(A) = 0.
Let g(x) = α−1f(x) where α is the leading coefficient of f(x). Then g(x) is monic (leading
coefficient = 1) and g(A) = 0. Thus there exists a polynomial p(x) over F such that
(a) p(A) = 0,
(b) p(x) is monic, and
(c) for every nonzero polynomial q(x), q(A) = 0 ⇒ deg p(x) ≤ deg q(x).
Such a p(x) is unique (Proof!) and is called the minimal polynomial of A. Note that if
k(x) ∈ F[x] and k(A) = 0, then p(x) | k(x).
Remark. If A and B in Mn(F ) are similar, then they have the same minimal polynomial.
Recall that the characteristic polynomial of A is given by
cA(x) = det(xIn −A).
Theorem 5.2.1. The characteristic polynomial and minimal polynomial for A have the same roots.
Remark. Although the minimal polynomial and the characteristic polynomial have the same
roots, they may not be the same.
Example 5.2.1. The characteristic polynomial of
A = [  5 −6 −6
      −1  4  2
       3 −6 −4 ]
is (x − 1)(x − 2)², while
(A− I)(A− 2I) = 0,
so the minimal polynomial of A is (x− 1)(x− 2). Notice that A is diagonalizable. In general, we
have:
Theorem 5.2.2. If an n× n matrix A is diagonalizable with distinct eigenvalues λ1, . . . , λk, then
(x− λ1) . . . (x− λk) is the minimal polynomial for A.
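The claim of Example 5.2.1 can be checked numerically (a sketch using NumPy, not part of the original notes): the product (A − I)(A − 2I) vanishes even though the characteristic polynomial is cubic.

```python
import numpy as np

# Matrix of Example 5.2.1 with characteristic polynomial (x - 1)(x - 2)^2.
A = np.array([[ 5, -6, -6],
              [-1,  4,  2],
              [ 3, -6, -4]], dtype=float)
I = np.eye(3)

# (A - I)(A - 2I) = 0, so the minimal polynomial is already (x - 1)(x - 2):
# one factor of (x - 2) fewer than the characteristic polynomial.
M = (A - I) @ (A - 2 * I)
print(np.allclose(M, np.zeros((3, 3))))  # True
```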
Theorem 5.2.3. [Cayley-Hamilton] If f(x) is the characteristic polynomial of a matrix A, then
f(A) = 0.
Proof. Write f(x) = det(xIn − A) = x^n + a_{n−1}x^{n−1} + · · · + a_1x + a_0. Let B = xIn − A. Each
entry of adj B is, up to sign, the determinant of an (n − 1) × (n − 1) submatrix of B, and hence a
polynomial in x of degree at most n − 1:
C_{ij}(B) = b_{ij}^{(n−1)}x^{n−1} + b_{ij}^{(n−2)}x^{n−2} + · · · + b_{ij}^{(1)}x + b_{ij}^{(0)}
for all i, j ∈ {1, . . . , n}. Thus
adj B = [C_{ij}(B)]^T
      = [b_{ij}^{(n−1)}x^{n−1} + b_{ij}^{(n−2)}x^{n−2} + · · · + b_{ij}^{(1)}x + b_{ij}^{(0)}]^T
      = B_{n−1}x^{n−1} + B_{n−2}x^{n−2} + · · · + B_1x + B_0
where B_i ∈ Mn(F). Recall that
(det B)In = B(adj B) = B(B_{n−1}x^{n−1} + B_{n−2}x^{n−2} + · · · + B_1x + B_0).
Then
(x^n + a_{n−1}x^{n−1} + · · · + a_1x + a_0)In
  = B(B_{n−1}x^{n−1} + B_{n−2}x^{n−2} + · · · + B_1x + B_0)
  = (xIn − A)(B_{n−1}x^{n−1} + B_{n−2}x^{n−2} + · · · + B_1x + B_0)
  = B_{n−1}x^n + B_{n−2}x^{n−1} + · · · + B_1x² + B_0x
    − AB_{n−1}x^{n−1} − AB_{n−2}x^{n−2} − · · · − AB_1x − AB_0.
This gives
In = B_{n−1}
a_{n−1}In = B_{n−2} − AB_{n−1}
a_{n−2}In = B_{n−3} − AB_{n−2}
...
a_1In = B_0 − AB_1
a_0In = −AB_0.
Therefore
A^n + a_{n−1}A^{n−1} + · · · + a_1A + a_0In
  = A^n B_{n−1} + A^{n−1}(B_{n−2} − AB_{n−1}) + A^{n−2}(B_{n−3} − AB_{n−2}) + · · ·
    + A(B_0 − AB_1) − AB_0
  = 0
as desired.
Example 5.2.2. Determine the minimal polynomial of
A = [ 3 1 −1
      2 2 −1
      2 2  0 ].
Some consequences of the Cayley-Hamilton Theorem are as follows.
Corollary 5.2.4. The minimal polynomial of A divides its characteristic polynomial.
Recall that
0 is an eigenvalue of A ⇔ 0 = det(A − 0I) = det A ⇔ A is not invertible.
Corollary 5.2.5. If f(x) = a_0 + a_1x + · · · + a_{n−1}x^{n−1} + x^n is the characteristic polynomial of an
invertible matrix A, then a_0 ≠ 0 and
A−1 = −(1/a_0)(a_1I + a_2A + · · · + A^{n−1}).
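Corollary 5.2.5 gives a way to invert a matrix using only its characteristic coefficients and matrix powers. A minimal numeric sketch, not part of the notes (it uses NumPy's `np.poly`, which returns the coefficients of det(xI − A); the 2 × 2 example matrix is chosen here for illustration):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])     # det A = 5, so A is invertible
n = A.shape[0]

# np.poly(A) gives the coefficients of det(xI - A), highest degree first;
# reverse so that a[k] is the coefficient of x^k (a[n] = 1, a[0] = a0).
a = np.poly(A)[::-1]

# Corollary 5.2.5: A^{-1} = -(1/a0)(a1*I + a2*A + ... + A^{n-1})
A_inv = -(1.0 / a[0]) * sum(a[k + 1] * np.linalg.matrix_power(A, k)
                            for k in range(n))
print(np.allclose(A_inv @ A, np.eye(n)))  # True
```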
5.3 Symmetric and Hermitian Matrices
Definition. Let F = R or C and let A = [aij] be a matrix over F.
The matrix A is said to be symmetric if A = A^T. The conjugate transpose of A is A^H = [āij]^T,
obtained by conjugating every entry and then transposing. We say that A is Hermitian or self-adjoint if A = A^H.
Notice that symmetric and Hermitian matrices are square matrices and they coincide if F = R.
Example 5.3.1. Let
A = [ 3  1
      1 −2 ]   and   B = [ −1      2 + 3i
                           2 − 3i    2    ].
Then A is symmetric and both of them are Hermitian.
Theorem 5.3.1. If A is a Hermitian matrix, then
(1) ~xHA~x is real for all ~x ∈ Cn and (2) the eigenvalues of A are real.
That is, if A is Hermitian, then all roots of cA(x) are real.
Example 5.3.2. For vectors ~x and ~y in Cn, we define (~x, ~y) = ~xH~y.
Then (·, ·) is an inner product on Cn so that
‖~x‖² = ~xH~x = |x1|² + · · · + |xn|² for all ~x = (x1, . . . , xn) ∈ Cn.
Theorem 5.3.2. Two eigenvectors corresponding to different eigenvalues of a Hermitian matrix
are orthogonal to one another.
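Theorems 5.3.1 and 5.3.2 can be observed numerically (a NumPy sketch, not part of the notes) on the Hermitian matrix B from Example 5.3.1:

```python
import numpy as np

B = np.array([[-1, 2 + 3j],
              [2 - 3j, 2]])          # B equals its conjugate transpose: Hermitian

# eigh is NumPy's routine for Hermitian matrices: it returns real eigenvalues
# (Theorem 5.3.1) and an orthonormal set of eigenvectors (Theorem 5.3.2).
vals, vecs = np.linalg.eigh(B)
print(np.isrealobj(vals))                             # True: eigenvalues are real
print(np.allclose(vecs.conj().T @ vecs, np.eye(2)))   # True: columns are orthonormal
```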
Definition. For F = R or C and U ∈ Mn(F ), U is called unitary if UHU = In = UUH . If
F = R, a unitary matrix satisfies UTU = In = UUT and may be called an orthonormal matrix.
Theorem 5.3.3. Let U ∈Mn(C) be a unitary matrix.
For the inner product defined in Example 5.3.2, we have
(U~x, U~y) = (~x, ~y) for all ~x, ~y ∈ Cn, so ‖U~x‖ = ‖~x‖ for all ~x ∈ Cn.
Corollary 5.3.4. If U = [~u1 ~u2 . . . ~un] ∈ Mn(C) is a unitary matrix, then for all j, k ∈ {1, 2, . . . , n} we have
(~uj, ~uk) = { 1 if j = k,
               0 if j ≠ k.
Remark. The converse of Corollary 5.3.4 is also true and its proof is left as an exercise.
Example 5.3.3.
U1 = (1/√2) [ 1 i
              i 1 ]   and   U2 = [ cos t  −sin t
                                   sin t   cos t ]
are unitary matrices.
Theorem 5.3.5. Every eigenvalue of a unitary matrix U has absolute value one, i.e., |λ| = 1.
Moreover, eigenvectors corresponding to different eigenvalues are orthogonal to each other.
We are going to explore some very remarkable facts about Hermitian and real symmetric
matrices. These matrices are diagonalizable, and moreover diagonalization can be accomplished
by a unitary matrix P . This means that P−1AP = PHAP is diagonal. In this situation, we say
that the matrix A is unitarily or orthogonally diagonalizable. Orthogonal and unitary matrices
are particularly attractive here because inverting them is essentially free and error-free: PH = P−1.
Theorem 5.3.6. If a real matrix A is orthogonally diagonalizable with an orthonormal matrix P,
that is P TAP is a diagonal matrix, then A is symmetric.
Remark. The converse of Theorem 5.3.6 is also true. In addition, we prove a stronger result.
Theorem 5.3.7. [Principal Axes Theorem] Every Hermitian matrix is unitarily diagonalizable.
In addition, every real symmetric matrix is orthogonally diagonalizable.
Proof. We shall show this statement by induction on n. It is clear for n = 1.
Assume that n > 1 and every (n−1)×(n−1) Hermitian matrix is unitarily diagonalizable. Consider
an n× n Hermitian matrix A.
Let λ1 be a real eigenvalue of A with unit eigenvector ~v.
Then A~v = λ1~v and ‖~v‖ = 1.
Let W = {~v}⊥ with orthonormal basis {~z1, . . . , ~zn−1}. Thus
R = [~v ~z1 . . . ~zn−1] is an n × n unitary matrix. Observe that
B = RHAR = [ λ1  0   . . .  0
             0   b22 . . .  b2n
             ...  ...       ...
             0   bn2 . . .  bnn ]
          = [ λ1  0
              0   C ]   (C of size (n−1) × (n−1))
and BH = (RHAR)H = B. Hence, B is Hermitian and so is C.
Since C is an (n− 1)× (n− 1) Hermitian matrix, by the induction hypothesis,
∃ an (n− 1)× (n− 1) unitary matrix Q such that QHCQ = diag(λ2, . . . , λn).
Let P = [ 1  0
          0  Q ]  (n × n). Then P is an n × n unitary matrix and
PHBP = PHRHARP = (RP )HA(RP ).
Choose U = RP. Then UH = (RP)H = PHRH = P−1R−1 = (RP)−1 = U−1 and
UHAU = PHBP = diag(λ1, λ2, . . . , λn).
Hence, A is unitarily diagonalizable.
Example 5.3.4. Diagonalize the Hermitian matrix
A = [ 1      1 − i
      1 + i    0   ].
Example 5.3.5. Orthogonally diagonalize the symmetric matrix
A = [ 1 2 0
      2 4 0
      0 0 5 ].
(Given λ = 0, 5, 5.)
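For a numeric check of Example 5.3.5 (a NumPy sketch, not part of the notes; `eigh` performs the orthogonal diagonalization guaranteed by the Principal Axes Theorem):

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [2.0, 4.0, 0.0],
              [0.0, 0.0, 5.0]])

# For a real symmetric matrix, eigh returns an orthonormal matrix P whose
# columns are eigenvectors, so P^T A P is diagonal.
vals, P = np.linalg.eigh(A)
print(np.allclose(np.sort(vals), [0.0, 5.0, 5.0]))   # True: lambda = 0, 5, 5
print(np.allclose(P.T @ A @ P, np.diag(vals)))       # True: P^T A P is diagonal
print(np.allclose(P.T @ P, np.eye(3)))               # True: P is orthonormal
```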
Definition. A square matrix A is normal if AHA = AAH .
Clearly, every Hermitian matrix is normal.
Theorem 5.3.8. A matrix is unitarily diagonalizable if and only if it is normal.
Proof. It is a consequence of the Schur Triangularization Theorem, which is beyond the scope of this
course.
Real versus Complex
Rn: ~x = (x1, . . . , xn)                   Cn: ~x = (x1, . . . , xn)
length: ‖~x‖² = x1² + · · · + xn²           ‖~x‖² = |x1|² + · · · + |xn|²
transpose: (AT)ij = Aji                     conjugate transpose: (AH)ij = Āji
(AB)T = BTAT                                (AB)H = BHAH
~x · ~y = ~xT~y = x1y1 + · · · + xnyn       (~x, ~y) = ~xH~y = x̄1y1 + · · · + x̄nyn
orthogonality: ~xT~y = 0                    ~xH~y = 0
orthonormal: PTP = In = PPT                 unitary: UHU = In = UUH
symmetric matrix: AT = A                    Hermitian matrix: AH = A
A = PDP−1 = PDPT (real D)                   A = UDU−1 = UDUH (real D)
orthogonally diagonalizable                 unitarily diagonalizable
5.4 Jordan Forms
Theorem 5.1.9 gives necessary and sufficient conditions for an n× n matrix to be diagonalizable,
namely that it should have n independent eigenvectors. We have also seen square matrices which
are not diagonalizable. In this section, we discuss the so-called Jordan canonical form, a form
of matrix to which every square matrix is similar.
Definition. Let A be an n × n matrix and let λ be an eigenvalue of A with nullity(A − λIn) = ℓ.
Assume that λ is of multiplicity m. Then 1 ≤ ℓ ≤ m.
If m = 1, then ℓ = m = 1.
If m > 1 and ℓ < m, then λ is said to be defective and the number m − ℓ > 0 of missing
eigenvector(s) is called the defect of λ.
Note that if A has a defective eigenvalue, then A is not diagonalizable.
Definition. The generalized eigenspace Gλ corresponding to an eigenvalue λ of A, consists
of all vectors ~v such that, for some k ∈ N, (A− λI)k~v = ~0, that is,
Gλ(A) = {~v ∈ Fn : (A − λI)^k ~v = ~0 for some k ∈ N} = ⋃_{k∈N} Nul(A − λI)^k.
Definition. A length r chain of generalized eigenvectors based on the eigenvector ~v for λ
is a set {~v = ~v1, ~v2, . . . , ~vr} of r linearly independent generalized eigenvectors such that
(A− λI)~vr = ~vr−1,
(A− λI)~vr−1 = ~vr−2,
...
(A− λI)~v2 = ~v1.
Since ~v1 is an eigenvector, (A− λI)~v1 = ~0. It follows that
(A − λI)^r ~vr = ~0.
We may denote the action of the matrix A− λI on the string of vectors by
~vr −→ ~vr−1 −→ · · · −→ ~v2 −→ ~v1 −→ ~0.
Now let W be the subspace of Gλ spanned by {~v1, . . . , ~vr}. Any vector ~x in W has a representation of the form
~x = c1~v1 + c2~v2 + · · ·+ cr~vr
and
A~x = c1(A~v1) + c2(A~v2) + · · ·+ cr(A~vr)
= c1(λ~v1) + c2(λ~v2 + ~v1) + · · ·+ cr(λ~vr + ~vr−1)
= (λc1 + c2)~v1 + · · ·+ (λcr−1 + cr)~vr−1 + λcr~vr.
Thus A~x is also in W . If B = {~v1, . . . , ~vr} is a basis for W , then
[A~x]B =
[ λc1 + c2
  λc2 + c3
  ...
  λc_{r−1} + cr
  λcr ]
=
[ λ 1
    λ 1
      . . .
        λ 1
          λ ]
[ c1
  c2
  ...
  c_{r−1}
  cr ]
= J [~x]B
where
J = J(λ; r) = [ λ 1
                  λ 1
                    . . .
                      λ 1
                        λ ]  (r × r)
is called the Jordan block of size r corresponding to λ.
Example 5.4.1. Let
A = [  1 1
      −1 3 ].
Find the generalized eigenspaces of A.
Example 5.4.2. Let
A1 = [  0  1  2
       −5 −3 −7
        1  0  0 ]   and   A2 = [ −1  1  0
                                  0 −1  0
                                  0  1 −1 ].
Then A1 and A2 have the same characteristic polynomial (x + 1)³. Find
(1) the minimal polynomials of A1 and A2, and
(2) the generalized eigenspaces of A1 and A2.
Theorem 5.4.1. If an n × n matrix A has t linearly independent eigenvectors, then it is similar to
a matrix J, that is, in Jordan form, with t square blocks on the diagonal:
Jordan form: J = M−1AM = [ J1
                              . . .
                                 Jt ].
Each block has one eigenvector, one eigenvalue, and 1s just above the diagonal:
Jordan block: Ji = J(λi; ri) = [ λi 1
                                    λi 1
                                       . . . 1
                                           λi ]  (ri × ri).
The same λi will appear in several blocks if it has several independent eigenvectors. Moreover, M consists of n generalized eigenvectors which are linearly independent.
Remark. Theorem 5.4.1 says that every n × n matrix A has n linearly independent generalized
eigenvectors. These n generalized eigenvectors may be arranged in chains, with the sum of the
lengths of the chains associated with a given eigenvalue λ equal to the multiplicity of λ. But the
structure of these chains depends on the defect of λ, and can be quite complicated. For instance,
a multiplicity-four eigenvalue can correspond to
• Four length 1 chains (defect 0);
• Two length 1 chains and a length 2 chain (defect 1);
• Two length 2 chains (defect 2);
• A length 1 chain and a length 3 chain (defect 2);
• A length 4 chain (defect 3).
Observe that, in each of these cases, the length of the longest chain is at most d + 1, where d
is the defect of the eigenvalue. Consequently, once we have found all the ordinary eigenvectors
corresponding to a multiple eigenvalue λ, and therefore know the defect d of λ, we can begin
with the equation
(A − λI)^{d+1} ~u = ~0 (5.4.1)
to start building the chains of generalized eigenvectors corresponding to λ.
Algorithm: Begin with a nonzero solution ~u1 of Eq. (5.4.1) and successively multiply by the
matrix A− λI until the zero vector is obtained. If
(A− λI)~u1 = ~u2 6= ~0
(A− λI)~u2 = ~u3 6= ~0
...
(A− λI)~uk−1 = ~uk 6= ~0
but (A− λI)~uk = ~0, then we get the string of k generalized eigenvectors
~u1 −→ ~u2 −→ · · · −→ ~uk.
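The chain-building algorithm above can be sketched in a few lines (a NumPy illustration, not part of the notes; the 2 × 2 matrix with defective eigenvalue 2 is an assumption chosen for the demo):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 2.0]])
lam = 2.0
N = A - lam * np.eye(2)

# Defect d = 1 here, so start from a nonzero solution u1 of (A - lam*I)^2 u = 0
# (every vector qualifies, since (A - 2I)^2 = 0) that is not already an eigenvector.
u1 = np.array([0.0, 1.0])

# Successively multiply by (A - lam*I) until the zero vector is obtained.
chain = [u1]
while not np.allclose(N @ chain[-1], np.zeros(2)):
    chain.append(N @ chain[-1])

print(len(chain))  # 2: the chain u1 -> u2 -> 0, where u2 is an ordinary eigenvector
```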
Example 5.4.3. Let
A = [  0  0  1  0
       0  0  0  1
      −2  2 −3  1
       2 −2  1 −3 ]
with characteristic polynomial x(x + 2)³. Find the chains of generalized eigenvectors
corresponding to each eigenvalue and the Jordan form of A.
Example 5.4.4. Let
A = [ 8 0 0 0
      0 8 0 3
      4 0 8 0
      0 0 0 8 ].
Find the minimal polynomial of A, chain(s) of generalized eigenvectors, and the Jordan form of A.
Example 5.4.5. Write down the Jordan form of the following matrices.
(1) [ 0 0 1 1
      0 0 1 1
      0 0 1 1
      0 0 0 0 ]
(2) [ 3 5 0 0
      0 3 6 0
      0 0 4 7
      0 0 0 4 ]
(3) [ 3 0 0 0
      0 3 5 0
      0 0 4 6
      0 0 0 4 ]
Let N(r) = J(0; r) denote the r × r matrix that has 1's immediately above the diagonal and
zeros elsewhere. For example,
N(2) = [ 0 1
         0 0 ],
N(3) = [ 0 1 0
         0 0 1
         0 0 0 ],
N(4) = [ 0 1 0 0
         0 0 1 0
         0 0 0 1
         0 0 0 0 ], etc.
Then J(λ; r) = λI + N(r), or in abbreviated form J = λI + N.
Suppose that f(x) is a polynomial of degree s. Then the Taylor expansion around a point c from
calculus gives us
f(c + x) = f(c) + f′(c)x + (f′′(c)/2!)x² + · · · + (f^{(s)}(c)/s!)x^s,
where f ′, f ′′, . . . , f (s) represent successive derivatives of f . In terms of matrices I and N , we
have
f(J) = f(λI + N) = f(λI) + f′(λI)N + (f′′(λI)/2!)N² + · · · + (f^{(s)}(λI)/s!)N^s
     = f(λ)I + f′(λ)N + (f′′(λ)/2!)N² + · · · + (f^{(s)}(λ)/s!)N^s
     = [ f(λ)  f′(λ)  f′′(λ)/2!  · · ·  f^{(r−1)}(λ)/(r−1)!
               f(λ)   f′(λ)      . . .
                      f(λ)       . . .  f′′(λ)/2!
                                 . . .  f′(λ)
                                        f(λ) ]  (r × r)
because the entries of Nk that are k steps above the diagonal are 1’s and all the other entries
are zeros.
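For f(x) = x^s this formula gives the entries of J^s directly: the entry k steps above the diagonal is f^(k)(λ)/k! = C(s, k)λ^{s−k}. A small NumPy sketch (not part of the notes) comparing the formula with direct matrix powering:

```python
import numpy as np
from math import comb

lam, r, s = 2.0, 3, 10
J = lam * np.eye(r) + np.diag(np.ones(r - 1), 1)   # J = lam*I + N

direct = np.linalg.matrix_power(J, s)

# Build f(J) for f(x) = x^s from the formula: superdiagonal k of N^k
# carries f^(k)(lam)/k! = C(s, k) * lam^(s - k).
formula = np.zeros((r, r))
for k in range(r):
    formula += comb(s, k) * lam ** (s - k) * np.diag(np.ones(r - k), k)

print(np.allclose(direct, formula))  # True
```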
Example 5.4.6. Compute J(λ; 4)², J(λ; 3)¹⁰ and J(λ; 2)^s.
Remark. If J = diag(J1, . . . , Jt) is in Jordan form, then J^s = diag(J1^s, . . . , Jt^s).
Example 5.4.7. Compute J^s for
J = [ 2 1 0
      0 2 0
      0 0 3 ].
Example 5.4.8. Given a square matrix A, use the Jordan form of A to determine its minimal
polynomial.
Solution. Let J be the Jordan form of A, say A = MJM−1. Since f(A) = Mf(J)M−1, f(A) = 0 if and only if
f(J) = 0. Also, if J(λ; r) is a Jordan block of J, then f(J(λ; r)) is the corresponding diagonal block of f(J). We must
thus find a polynomial f such that, for every Jordan block J(λ; r) of J, f(J(λ; r)) = 0 holds.
But we derived a formula for f(J(λ; r)), and it equals the zero matrix if and only if f(λ), f′(λ),
. . . , f^{(r−1)}(λ) are all zero. Thus f(x) and its first r − 1 derivatives must vanish at x = λ; in other
words, (x − λ)^r must be a factor of f(x).
Let λ1, . . . , λk be the distinct eigenvalues of A and mi the maximum size of the Jordan blocks
corresponding to the eigenvalue λi. Hence, we obtain that
f(x) = (x − λ1)^m1 · · · (x − λk)^mk
is the minimal polynomial of A.
Example 5.4.9. Find the minimal polynomial of the following matrices.
(1) [ 2 0  0
      0 2  0
      0 0 −1 ]
(2) [ 2 1  0
      0 2  0
      0 0 −1 ]
(3) [ 2 1  0
      0 1  0
      0 0 −1 ]
(4) the block diagonal matrix diag(J(3; 3), J(3; 1), J(8; 2), J(5; 1))
(5) the block diagonal matrix diag(J(3; 2), J(3; 2), J(8; 2), J(5; 1))
Exercises for Chapter 5.
1. Let A = [ 0 1 0
             0 0 1
             a b c ]. Find a, b, c so that det(A − λI3) = 9λ − λ³.
2. Let T : V → V be a linear operator. A subspace U of V is T-invariant if T(U) ⊆ U, i.e., ∀~u ∈ U, T(~u) ∈ U.
(a) Show that ker T and im T are T-invariant.
(b) If U and W are T-invariant, prove that U ∩ W and U + W are also T-invariant.
(c) Show that the eigenspace Eλ(T) is T-invariant.
3. Show that A and A^T have the same eigenvalues.
4. Show that if λ1, . . . , λk are eigenvalues of A, then λ1^m, . . . , λk^m are eigenvalues of A^m for all m ≥ 1.
Moreover, each eigenvector of A is an eigenvector of A^m.
5. Let A and B be n × n matrices over a field F. If I − AB is invertible, prove that I − BA is invertible
and (I − BA)−1 = I + B(I − AB)−1A.
6. Show that if A and B are the same size, then AB and BA have the same eigenvalues.
7. Determine all 2 × 2 diagonalizable matrices A with nonzero repeated eigenvalue a, a.
8. Let V be the space of all real-valued continuous functions. Define T : V → V by
(Tf)(x) = ∫_0^x f(t) dt.
Show that T has no eigenvalues.
9. Prove that if A is invertible and diagonalizable, then A−1 is diagonalizable.
10. Let V = Span{1, sin 2t, sin² t}. Let T : V → V be defined by T(f) = f′′.
Find all eigenvalues and eigenspaces of T. Is T diagonalizable? Explain.
11. Let A = [aij] be an n × n matrix such that for each i = 1, 2, . . . , n, we have
Σ_{j=1}^{n} aij = 0.
Show that 0 is an eigenvalue of A.
12. Let A be an n × n matrix with characteristic polynomial (x − λ1)^{d1} · · · (x − λk)^{dk}.
Show that tr A = d1λ1 + · · · + dkλk.
13. Let A be a 2 × 2 matrix. Prove that the characteristic polynomial of A is given by
x² − (tr A)x + det A.
14. If A and B are 2 × 2 matrices with determinant one, prove that
tr AB − (tr A)(tr B) + tr AB−1 = 0.
15. Find the 2 × 2 matrices with real entries that satisfy the equation
X³ − 3X² = [ −2 −2
             −2 −2 ].
(Hint: Apply the Cayley-Hamilton Theorem.)
16. Let A = [ 0 0 c
              1 0 b
              0 1 a ].
Prove that the minimal polynomial of A and the characteristic polynomial of A are the same.
17. A 3 × 3 matrix A has the characteristic polynomial x(x − 1)(x + 2).
What is the characteristic polynomial of A²?
18. Let V = Mn(F) be the vector space of n × n matrices over a field F. Let A be an n × n matrix,
and let TA be the linear operator on V defined by TA(B) = AB.
Show that the minimal polynomial for TA is the minimal polynomial for A.
19. Let U be an n × n real orthonormal matrix. Prove that
(a) |tr(U)| ≤ n, and (b) det(U² − In) = 0 if n is odd.
20. If U = [~u1 ~u2 . . . ~un] with (~uj, ~uk) = { 1 if j = k,
                                                  0 if j ≠ k,
prove that U is unitary.
21. Let A be an n× n symmetric matrix with distinct eigenvalues λ1, . . . , λk. Prove that
(A− λ1In) . . . (A− λkIn) = 0.
22. Unitarily diagonalize the following matrices.
(a) [  2 1
      −1 2 ]
(b) [  3 i
      −i 0 ]
(c) [  0 1  0
      −1 0  0
       0 0 −1 ]
(d) [ 0 1 0
      1 0 0
      0 0 1 ]
(e) [  2 i i
      −i 1 0
      −i 0 1 ]
23. Show that every unitarily diagonalizable matrix is normal.
24. Suppose that A is real symmetric and orthonormal. Prove that the only possible eigenvalues of A are ±1.
25. Show that if a real matrix A is skew-symmetric (i.e., A^T = −A), then iA is Hermitian.
26. Prove that if A is unitarily diagonalizable, then so is A^H.
27. Let A be any square real matrix. Show that the eigenvalues of A^TA are all non-negative.
28. Show that the generalized eigenspace Gλ corresponding to an eigenvalue λ of an n × n matrix A is
a subspace of Fn.
29. Suppose the characteristic polynomial of a 4 × 4 matrix A is (x − 1)²(x + 1)².
(a) Prove that A−1 = 2A − A³. (b) Write down all possible Jordan form(s) of A.
30. Let J = J(λ; r) be an r × r Jordan block with λ on its diagonal. Show that J has only one linearly
independent eigenvector corresponding to λ.
31. If J is in Jordan form with k Jordan blocks on the diagonal, prove that J has exactly k linearly
independent eigenvectors.
32. These Jordan matrices have eigenvalues 0, 0, 0, 0:
J = [ 0 1 0 0
      0 0 0 0
      0 0 0 1
      0 0 0 0 ]   and   K = [ 0 1 0 0
                              0 0 1 0
                              0 0 0 0
                              0 0 0 0 ].
For any matrix M, compare JM with MK. If they are equal, show that M is not invertible. Conclude that J and K are not similar.
33. Suppose that a square matrix A has two eigenvalues λ = 2, 5, and n_p(λ) = nullity(A − λI)^p, p ∈ N, are
as follows: n1(2) = 2, n2(2) = 4, np(2) = 5 for p ≥ 3, and n1(5) = 1, np(5) = 2 for p ≥ 2.
Write down the Jordan form of A.
34. Let J = J(0; 5) be the 5 × 5 Jordan block with λ = 0. Find J², count its eigenvectors, and write its Jordan form.
35. How many possible Jordan forms are there for a 6 × 6 matrix with characteristic polynomial (x − 1)²(x + 2)⁴?
36. Let A = [ 2 a b
              0 2 c
              0 0 1 ] ∈ M3(R).
(a) Prove that A is diagonalizable if and only if a = 0.
(b) Find the minimal polynomial of A when (i) a = 0, (ii) a ≠ 0.
37. Let V = {h(x, y) = ax² + bxy + cy² + dx + ey + f : a, b, c, d, e, f ∈ R} be a subspace of the space
of polynomials in two variables x and y over R. Then B = {x², xy, y², x, y, 1} is a basis for V. Define
T : V → V by
(T(h))(x, y) = ∂/∂y ( ∫ h(x, y) dx ).
(a) Prove that T is a linear transformation and find A = [T]B.
(b) Compute the characteristic polynomial and the minimal polynomial of A.
(c) Find the Jordan form of A.
38. True or False:
(a) [ 3 0        and [ 3 1        are similar.
      0 4 ]            0 4 ]
(b) [ 3 0        and [ 3 1        are similar.
      0 3 ]            0 3 ]
39. Show that
[ a 1 0        and [ b 0 0
  0 a 0              0 a 1
  0 0 b ]            0 0 a ]
are similar.
40. Write down the Jordan form of the following matrices and find its minimal polynomial.
(a) [ −2  1
      −1 −4 ]
(b) [ −1  0  1
       0 −1  1
       1 −1 −1 ]
(c) [  1  0  0
      −2 −2 −3
       2  3  4 ]
(d) [  2  0  0
      −7  9  7
       0  0  2 ]
(e) [ 3 1 −1
      2 2 −1
      2 2  0 ]
(f) [ −2 17  4
      −1  6  1
       0  1  2 ]
(g) [ −3  5 −5
       3 −1  3
       8 −8 10 ]
(h) [  5 −1  1
       1  3  0
      −3  2  1 ]
(i) [ 1  −4  0 −2
      0   1  0  0
      6 −12 −1 −6
      0  −4  0 −1 ]
(j) [ 2 1 0 1
      0 2 1 0
      0 0 2 1
      0 0 0 2 ]
(k) [ −1 −4 0 0
       1  3 0 0
       1  2 1 0
       0  1 0 1 ]
(l) [ 1  3   7  0
      0 −1  −4  0
      0  1   3  0
      0 −6 −14  1 ]
Eigenvalues: (b) −1, −1, −1 (c) 1, 1, 1 (d) 2, 2, 9 (e) 1, 2, 2 (f) 2, 2, 2 (g) 2, 2, 2 (h) 3, 3, 3
(i) −1, −1, 1, 1 (k) 1, 1, 1, 1 (l) 1, 1, 1, 1.