MATH 106: LINEAR ALGEBRA
LECTURE NOTES, FALL 2010-2011

(These lecture notes are not in final form; they are still subject to improvement.)
Contents

1 Systems of linear equations and matrices
  1.1 Introduction to systems of linear equations
  1.2 Gaussian elimination
  1.3 Matrices and matrix operations
  1.4 Inverses. Rules of matrix arithmetic
  1.5 Elementary matrices and a method for finding A−1
  1.6 Further results on systems of equations and invertibility
  1.7 Diagonal, Triangular and Symmetric Matrices

2 Determinants
  2.1 The determinant function
  2.2 Evaluating determinants by row reduction
  2.3 Properties of determinant function
  2.4 Cofactor expansion. Cramer's rule

3 Euclidean vector spaces
  3.1 Euclidean n-space
  3.2 Linear transformations from Rn to Rm
  3.3 Properties of linear transformations from Rn to Rm
Chapter 1
Systems of linear equations and
matrices
1.1 Introduction to systems of linear equations
Definition 1.1.1 A linear equation in the n variables x1 , . . . , xn , equally called unknowns, is an
equation of the form a1x1 + . . . + anxn = b, where a1 , . . . , an , b are constants. A solution of a
linear equation a1x1 + . . . + anxn = b is a sequence t1 , . . . , tn of n real numbers such that the
equation is satisfied when we replace x1 = t1 , . . . , xn = tn . The set of all solutions of the equation
is called its solution set or the general solution.
Definition 1.1.2 A finite set of linear equations in the variables x1 , . . . , xn is called a system of
linear equations or a linear system. A solution of a linear system

    a11x1 + a12x2 + . . . + a1nxn = b1
    a21x1 + a22x2 + . . . + a2nxn = b2
    ..........................................
    am1x1 + am2x2 + . . . + amnxn = bm        (1.1)

with m linear equations and n unknowns, is a sequence t1 , . . . , tn of n real numbers such that each
equation of the system is satisfied when we replace x1 = t1 , . . . , xn = tn . The set of all solutions of
the linear system is called its solution set or the general solution. A system of equations is said to
be consistent if it has at least one solution; otherwise it is called inconsistent. The augmented matrix
of the linear system (1.1) is

    [ a11  a12  . . .  a1n | b1 ]
    [ a21  a22  . . .  a2n | b2 ]
    [  .    .           .  |  . ]
    [ am1  am2  . . .  amn | bm ]
Remark 1.1.3 The solution set of a linear system is unchanged if we perform on the system one of
the following operations:
1. Multiply an equation with a nonzero constant;
2. Interchange two equations;
3. Add a multiple of one equation to another.
The corresponding operations at the level of augmented matrix are:
1. Multiply a row with a nonzero constant;
2. Interchange two rows;
3. Add a multiple of one row to another.
Example 1.1.4 Solve the following linear system:

    x + 2y − 3z +   aw =  4
    3x − y + 5z + 10aw = −2
    4x + y + 2z + 11aw =  2
Solution: The augmented matrix of the system is

    [ 1   2  −3    a |  4 ]
    [ 3  −1   5  10a | −2 ]
    [ 4   1   2  11a |  2 ]

and we reduce it step by step, performing the row operations on the matrix and the corresponding
operations on the equations in parallel:

    −3r1 + r2 → r2 , −4r1 + r3 → r3 :

    [ 1   2  −3    a |   4 ]          x + 2y − 3z +  aw =   4
    [ 0  −7  14   7a | −14 ]            − 7y + 14z + 7aw = −14
    [ 0  −7  14   7a | −14 ]            − 7y + 14z + 7aw = −14

    −r2 + r3 → r3 :

    [ 1   2  −3    a |   4 ]          x + 2y − 3z +  aw =   4
    [ 0  −7  14   7a | −14 ]            − 7y + 14z + 7aw = −14
    [ 0   0   0    0 |   0 ]                             0 = 0

    −(1/7)r2 :

    [ 1   2  −3    a |   4 ]          x + 2y − 3z + aw = 4
    [ 0   1  −2   −a |   2 ]              y − 2z −  aw = 2
    [ 0   0   0    0 |   0 ]                         0 = 0

    −2r2 + r1 → r1 :

    [ 1   0   1   3a |   0 ]          x     +  z + 3aw = 0
    [ 0   1  −2   −a |   2 ]              y − 2z −  aw = 2
    [ 0   0   0    0 |   0 ]                         0 = 0

Thus, the equivalent system is

    x + z + 3aw = 0
    y − 2z − aw = 2

and its solutions are x = −t − 3as, y = 2 + 2t + as, z = t, w = s, for s, t ∈ R.
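The general solution above can be spot-checked numerically: for any choice of the parameters a, s, t, substituting the claimed solution back into the three equations should give zero residuals. This is a minimal verification sketch (the helper names `solution` and `residual` are ours):

```python
import numpy as np

# Check the general solution of Example 1.1.4 for a few sample parameter values.
def solution(a, s, t):
    # x = -t - 3as, y = 2 + 2t + as, z = t, w = s
    return np.array([-t - 3*a*s, 2 + 2*t + a*s, t, s])

def residual(a, v):
    x, y, z, w = v
    eqs = np.array([
        x + 2*y - 3*z + a*w - 4,
        3*x - y + 5*z + 10*a*w + 2,
        4*x + y + 2*z + 11*a*w - 2,
    ])
    return np.abs(eqs).max()

for a in (0.0, 1.0, -2.5):
    for s, t in ((0.0, 0.0), (1.0, -1.0), (2.0, 3.0)):
        assert residual(a, solution(a, s, t)) < 1e-12
print("general solution verified for sample parameters")
```

Such a check does not prove the formula, but it catches sign and coefficient mistakes immediately.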
1.2 Gaussian elimination
Definition 1.2.1 A matrix is said to be in reduced row-echelon form if it has the following prop-
erties:
1. If a row does not consist entirely of zeros, then the first nonzero number in the row is a 1,
called a leading 1.
2. If there are any rows that consist entirely of zeros, then they are grouped together at the
bottom of the matrix.
3. In any two successive rows that do not consist entirely of zeros, the leading 1 in the lower
row occurs farther to the right than the leading 1 in the higher row.
4. Each column that contains a leading 1 has zero everywhere else.
A matrix satisfying just the first three properties, namely 1, 2 and 3, is said to be in row-echelon form.
We are now going to provide a number of steps in order to reduce a matrix to a (reduced) row-
echelon form. These are:
1. Locate the leftmost column that does not consist entirely of zeros;
2. Interchange the top row with another row, if necessary, to bring a nonzero entry to the top
of the column found in Step 1;
3. If the entry that is now at the top of the column found in Step 1 is a, multiply the first row
by 1/a in order to introduce a leading 1;
4. Add suitable multiples of the top row to the rows below so that all entries below the leading
1 become zeros;
5. Cover the top row in the matrix and begin again with the Step 1 applied to the submatrix
that remains. Continue in this way until the entire matrix is in row-echelon form.
6. Once you got the matrix in row-echelon form, beginning with the last nonzero row and working
upward, add suitable multiples of each row to the rows above to introduce zeros above the
leading 1’s.
The variables corresponding to the leading 1’s in row echelon form are called leading variables
and the others are called free variables.
The above procedure for reducing a matrix to reduced row-echelon form is called Gauss-Jordan
elimination. If we use only the first five steps, the procedure produces a row-echelon form and is
called Gaussian elimination.
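The six steps above can be sketched as a small NumPy routine; `rref` is our own name (NumPy does not ship a reduced-row-echelon function), and this is a teaching sketch rather than a numerically robust implementation:

```python
import numpy as np

# A minimal Gauss-Jordan elimination following steps 1-6 described above.
def rref(M, tol=1e-12):
    A = M.astype(float).copy()
    rows, cols = A.shape
    pivot_row = 0
    for col in range(cols):
        # Steps 1-2: find a row with a nonzero entry in this column, swap it up.
        candidates = np.where(np.abs(A[pivot_row:, col]) > tol)[0]
        if candidates.size == 0:
            continue
        swap = pivot_row + candidates[0]
        A[[pivot_row, swap]] = A[[swap, pivot_row]]
        # Step 3: scale to obtain a leading 1.
        A[pivot_row] /= A[pivot_row, col]
        # Steps 4 and 6: clear the column below and above the leading 1.
        for r in range(rows):
            if r != pivot_row:
                A[r] -= A[r, col] * A[pivot_row]
        pivot_row += 1
        if pivot_row == rows:
            break
    return A

# Augmented matrix of the system solved in Examples 1.2.2(1) below.
M = np.array([[1., -1, 2, -1, -1],
              [2, 1, -2, -2, -2],
              [-1, 2, -4, 1, 1],
              [3, 0, 0, -3, -3]])
R = rref(M)
print(R)
```

Because the loop clears entries both below and above each pivot, the routine performs Gauss-Jordan elimination; stopping after clearing only below the pivots would give plain Gaussian elimination.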
Examples 1.2.2 1. Solve the linear system
x − y + 2z − w = −1
2x + y − 2z − 2w = −2
−x + 2y − 4z + w = 1
3x − 3w = −3
(1.2)
Solution: The augmented matrix is

    [  1  −1   2  −1 | −1 ]
    [  2   1  −2  −2 | −2 ]
    [ −1   2  −4   1 |  1 ]
    [  3   0   0  −3 | −3 ]

and it can be brought to reduced row-echelon form as follows:

    −2r1 + r2 → r2 , r1 + r3 → r3 , −3r1 + r4 → r4 :

    [ 1  −1   2  −1 | −1 ]
    [ 0   3  −6   0 |  0 ]
    [ 0   1  −2   0 |  0 ]
    [ 0   3  −6   0 |  0 ]

    (1/3)r2 :

    [ 1  −1   2  −1 | −1 ]
    [ 0   1  −2   0 |  0 ]
    [ 0   1  −2   0 |  0 ]
    [ 0   3  −6   0 |  0 ]

    −r2 + r3 → r3 , −3r2 + r4 → r4 :

    [ 1  −1   2  −1 | −1 ]
    [ 0   1  −2   0 |  0 ]
    [ 0   0   0   0 |  0 ]
    [ 0   0   0   0 |  0 ]

    r2 + r1 → r1 :

    [ 1   0   0  −1 | −1 ]
    [ 0   1  −2   0 |  0 ]
    [ 0   0   0   0 |  0 ]
    [ 0   0   0   0 |  0 ]

such that the corresponding linear system, equivalent with the initial one, is

    x − w = −1
    y − 2z = 0

and its solutions are

    x = t − 1, y = 2s, z = s, w = t,    s, t ∈ R,
being infinitely many in this case. It is sometimes preferable to solve a linear system by using
the Gaussian elimination procedure to bring the augmented matrix just to a row-echelon form, and
to use the so-called back-substitution method afterwards. For the linear system (1.2),
a row-echelon form of the augmented matrix is
    [ 1  −1   2  −1 | −1 ]
    [ 0   1  −2   0 |  0 ]
    [ 0   0   0   0 |  0 ]
    [ 0   0   0   0 |  0 ]

and its corresponding linear system is

    x − y + 2z − w = −1
            y − 2z = 0
The back-substitution method consists in the following three steps:
(a) Solve the equations for the leading variables in terms of the free variables;
(b) Beginning with the bottom equation and working upward, successively substitute each
equation into all the equations above it.
(c) Assign arbitrary values to the free variables, if any.
In our particular case x, y are the leading variables and z, w are the free variables. Therefore
step (a) gives

    x = y − 2z + w − 1
    y = 2z

Step (b) gives

    x = 2z − 2z + w − 1
    y = 2z

or equivalently

    x = w − 1
    y = 2z

Finally, step (c) consists in assigning to the free variables z, w the arbitrary values z = s
and w = t, such that the solutions of the initial linear system are:

    x = t − 1, y = 2s, z = s, w = t,    s, t ∈ R.
2. Solve the linear system
−x + 7y − 2z = 1
3x − y + z = 4
2x + 6y − z = 5
Solution: The augmented matrix is

    [ −1   7  −2 | 1 ]
    [  3  −1   1 | 4 ]
    [  2   6  −1 | 5 ]

and it can be brought to reduced row-echelon form as follows:

    −r1 :

    [ 1  −7   2 | −1 ]
    [ 3  −1   1 |  4 ]
    [ 2   6  −1 |  5 ]

    −3r1 + r2 → r2 , −2r1 + r3 → r3 :

    [ 1  −7   2 | −1 ]
    [ 0  20  −5 |  7 ]
    [ 0  20  −5 |  7 ]

    −r2 + r3 → r3 :

    [ 1  −7   2 | −1 ]
    [ 0  20  −5 |  7 ]
    [ 0   0   0 |  0 ]

    (1/20)r2 :

    [ 1  −7    2   |  −1   ]
    [ 0   1  −1/4  | 7/20  ]
    [ 0   0    0   |   0   ]

    7r2 + r1 → r1 :

    [ 1   0   1/4  | 29/20 ]
    [ 0   1  −1/4  |  7/20 ]
    [ 0   0    0   |   0   ]

such that the corresponding linear system, equivalent with the initial one, is

    x + (1/4)z = 29/20
    y − (1/4)z = 7/20

and its solutions are

    x = (29 − 5t)/20, y = (7 + 5t)/20, z = t,    t ∈ R.
3. Solve the linear system
x1 − 2x2 + x3 − 4x4 = 1
x1 + 3x2 + 7x3 + 2x4 = 2
x1 − 12x2 − 11x3 − 16x4 = 5
Solution: The augmented matrix is

    [ 1   −2    1   −4 | 1 ]
    [ 1    3    7    2 | 2 ]
    [ 1  −12  −11  −16 | 5 ]

and we reduce it as follows:

    −r1 + r2 → r2 , −r1 + r3 → r3 :

    [ 1   −2    1    −4 | 1 ]
    [ 0    5    6     6 | 1 ]
    [ 0  −10  −12   −12 | 4 ]

    2r2 + r3 → r3 :

    [ 1  −2   1  −4 | 1 ]
    [ 0   5   6   6 | 1 ]
    [ 0   0   0   0 | 6 ]

Thus the system is inconsistent, that is, it has no solutions at all, since the last row reads 0 = 6.
4. For which values of a ∈ R does the following linear system have no solution? Exactly one
solution? Infinitely many solutions? When the system is consistent, find its solution set.

    x − y − 2z = 0
    2x − 4y − 8z = −6
    3x − 5y + (a² − 14)z = a − 4
Solution: The augmented matrix is

    [ 1  −1     −2    |   0   ]
    [ 2  −4     −8    |  −6   ]
    [ 3  −5  a² − 14  | a − 4 ]

and it will be reduced by performing successively the following row operations:

    −2r1 + r2 → r2 , −3r1 + r3 → r3 :

    [ 1  −1    −2   |   0   ]
    [ 0  −2    −4   |  −6   ]
    [ 0  −2  a² − 8 | a − 4 ]

    −(1/2)r2 :

    [ 1  −1    −2   |   0   ]
    [ 0   1     2   |   3   ]
    [ 0  −2  a² − 8 | a − 4 ]

    2r2 + r3 → r3 :

    [ 1  −1    −2   |   0   ]
    [ 0   1     2   |   3   ]          (*)
    [ 0   0  a² − 4 | a + 2 ]

If a ∉ {±2}, we may continue with (1/(a² − 4))r3 , using (a + 2)/(a² − 4) = 1/(a − 2):

    [ 1  −1  −2 |     0     ]
    [ 0   1   2 |     3     ]
    [ 0   0   1 | 1/(a − 2) ]

then −2r3 + r2 → r2 and 2r3 + r1 → r1 :

    [ 1  −1  0 |    2/(a − 2)     ]
    [ 0   1  0 | (3a − 8)/(a − 2) ]
    [ 0   0  1 |    1/(a − 2)     ]

and finally r2 + r1 → r1 :

    [ 1  0  0 |        3         ]
    [ 0  1  0 | (3a − 8)/(a − 2) ]
    [ 0  0  1 |    1/(a − 2)     ]

Therefore for a ∉ {±2} the given linear system has the unique solution x = 3, y = (3a − 8)/(a − 2),
z = 1/(a − 2).

If a = −2, the (*) reduced matrix becomes

    [ 1  −1  −2 | 0 ]
    [ 0   1   2 | 3 ]
    [ 0   0   0 | 0 ]

and r2 + r1 → r1 gives

    [ 1  0  0 | 3 ]
    [ 0  1  2 | 3 ]
    [ 0  0  0 | 0 ]

so in this case the given linear system has infinitely many solutions:

    x = 3, y = 3 − 2s, z = s,    s ∈ R.

Finally, when a = 2, the (*) reduced matrix becomes

    [ 1  −1  −2 | 0 ]
    [ 0   1   2 | 3 ]
    [ 0   0   0 | 4 ]

and the linear system has no solutions at all in this case.
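The three cases can be checked numerically by comparing ranks for sample values of a. This sketch takes the second equation as 2x − 4y − 8z = −6, consistent with the reduced rows above; the helper `classify` is our name:

```python
import numpy as np

# Classify the parametric system by comparing coefficient and augmented ranks.
def classify(a):
    A = np.array([[1., -1, -2],
                  [2, -4, -8],
                  [3, -5, a*a - 14]])
    b = np.array([0., -6, a - 4])
    rA = np.linalg.matrix_rank(A)
    rAb = np.linalg.matrix_rank(np.column_stack([A, b]))
    if rA < rAb:
        return "no solution"
    return "unique" if rA == 3 else "infinitely many"

print(classify(3))    # a outside {-2, 2}
print(classify(-2))
print(classify(2))
```

The outputs agree with the elimination above: a unique solution away from ±2, infinitely many at a = −2, and no solution at a = 2.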
1.3 Matrices and matrix operations
Definition 1.3.1 A matrix

    A = [ a11  a12  · · ·  a1n ]   ← row 1
        [ a21  a22  · · ·  a2n ]   ← row 2
        [  .    .           .  ]
        [ am1  am2  · · ·  amn ]   ← row m
           ↑    ↑           ↑
         col 1 col 2 ...  col n
equally written [aij ]m×n (or simply [aij ]), is a rectangular array of numbers. The numbers in the
array are called entries. The entry in row i and column j of a matrix A is commonly denoted by
(A)ij , in our particular case (A)ij = aij . The size of a matrix is described in terms of the number
of rows and the number of columns it has. The above matrix A has size m × n. A matrix

    x = [ x1 ]
        [ x2 ]
        [ .. ]
        [ xm ]

with only one column is called a column matrix (or a column vector), and a matrix

    y = [ y1  y2  . . .  yn ]

with only one row is called a row matrix (or a row vector). A matrix

    A = [ a11  a12  . . .  a1n ]
        [ a21  a22  . . .  a2n ]
        [  .    .           .  ]
        [ an1  an2  . . .  ann ]

with n rows and n columns, that is, of size n × n, is called a square matrix of order n, and the entries
a11 , a22 , . . . , ann are said to be on the main diagonal of A.
Definition 1.3.2 Two matrices are said to be equal if they have the same size and their corre-
sponding entries are equal.
Definition 1.3.3 If

    A = [ a11  a12  . . .  a1n ]        B = [ b11  b12  . . .  b1n ]
        [ a21  a22  . . .  a2n ]            [ b21  b22  . . .  b2n ]
        [  .    .           .  ]            [  .    .           .  ]
        [ am1  am2  . . .  amn ]            [ bm1  bm2  . . .  bmn ]

are matrices of the same size, then their sum is the matrix

    A + B = [ a11 + b11  a12 + b12  . . .  a1n + b1n ]
            [ a21 + b21  a22 + b22  . . .  a2n + b2n ]
            [     .          .                 .     ]
            [ am1 + bm1  am2 + bm2  . . .  amn + bmn ]

and their difference is the matrix

    A − B = [ a11 − b11  a12 − b12  . . .  a1n − b1n ]
            [ a21 − b21  a22 − b22  . . .  a2n − b2n ]
            [     .          .                 .     ]
            [ am1 − bm1  am2 − bm2  . . .  amn − bmn ]

One can shortly define the entries of the sum and the difference matrices in the following way:
(A + B)ij = (A)ij + (B)ij and (A − B)ij = (A)ij − (B)ij .
If the matrices A and B have different sizes, then their sum and difference are not defined. If c is
any scalar (number), then the product cA is the matrix

    cA = [ ca11  ca12  . . .  ca1n ]
         [ ca21  ca22  . . .  ca2n ]
         [   .     .            .  ]
         [ cam1  cam2  . . .  camn ]

obtained by multiplying each entry of A by c; its entries can be shortly written as (cA)ij = c(A)ij .
If A1 , A2 , . . . , An are matrices of the same size and c1 , c2 , . . . , cn are scalars, then the matrix
c1A1 + c2A2 + · · · + cnAn is defined and it is called a linear combination of A1 , A2 , . . . , An with
coefficients c1 , c2 , . . . , cn .
Examples 1.3.4 Consider the matrices

    A = [ 1  −3  0 ]    B = [  1  2  −2 ]    C = [  1  2  −3 ]    D = [ −1  6   2 ]
        [ 2   0  2 ]        [ −2  4   3 ]        [ −1  0   4 ]        [  2  1   3 ]
                                                 [  2  5  −1 ]        [  1  2  −4 ]

1. Compute, when possible, A − B, C + D, B − C, A + C.

Solution:

    A − B = [ 1  −3  0 ] − [  1  2  −2 ] = [ 0  −5   2 ]
            [ 2   0  2 ]   [ −2  4   3 ]   [ 4  −4  −1 ]

    C + D = [  1  2  −3 ]   [ −1  6   2 ]   [ 0  8  −1 ]
            [ −1  0   4 ] + [  2  1   3 ] = [ 1  1   7 ]
            [  2  5  −1 ]   [  1  2  −4 ]   [ 3  7  −5 ]

B − C and A + C are not defined, since the sizes differ.

2. Compute, when possible, (1/3)A, (−1)B, 2A − B, 3C + D, 5B + (1/2006)D.

Solution:

    (1/3)A = [ 1/3  −1   0  ]    (−1)B = [ −1  −2   2 ]
             [ 2/3   0  2/3 ]            [  2  −4  −3 ]

    2A − B = [ 2  −6  0 ] − [  1  2  −2 ] = [ 1  −8  2 ]
             [ 4   0  4 ]   [ −2  4   3 ]   [ 6  −4  1 ]

    3C + D = [  3   6  −9 ]   [ −1  6   2 ]   [  2  12  −7 ]
             [ −3   0  12 ] + [  2  1   3 ] = [ −1   1  15 ]
             [  6  15  −3 ]   [  1  2  −4 ]   [  7  17  −7 ]

5B + (1/2006)D is not defined, since B and D have different sizes.
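The computations of Examples 1.3.4 can be redone with NumPy, where `+`, `-` and scalar `*` act entrywise exactly as in Definition 1.3.3 (a sketch; NumPy raises a `ValueError` for mismatched sizes rather than leaving the operation "undefined"):

```python
import numpy as np

A = np.array([[1, -3, 0], [2, 0, 2]])
B = np.array([[1, 2, -2], [-2, 4, 3]])
C = np.array([[1, 2, -3], [-1, 0, 4], [2, 5, -1]])
D = np.array([[-1, 6, 2], [2, 1, 3], [1, 2, -4]])

print(A - B)      # = [[0, -5, 2], [4, -4, -1]]
print(C + D)      # = [[0, 8, -1], [1, 1, 7], [3, 7, -5]]
print(2*A - B)    # = [[1, -8, 2], [6, -4, 1]]
print(3*C + D)    # = [[2, 12, -7], [-1, 1, 15], [7, 17, -7]]
# A + C would raise ValueError: the shapes (2,3) and (3,3) do not match.
```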
14
Definition 1.3.5 Let A be an m × n matrix and B be an n × p matrix. Then the product AB is
the m × p matrix C whose entry cij in row i and column j is obtained as follows: sum the products
formed by multiplying, in order, each entry in row i of the matrix A with the "corresponding" entry
in column j of the matrix B. More precisely,

    cij = ai1 b1j + ai2 b2j + · · · + ain bnj = Σ aik bkj    (k = 1, . . . , n).

For the sizes:

      A        B      C = AB
    m × n    n × p     m × p
        └────┘
     must be the same;
     the outer numbers give the size of the product.

Thus the (i, j) entry of AB is row i of A "times" column j of B, and every entry of AB is such a
sum over k of the products aik bkj .
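The entry formula cij = Σ aik bkj can be written as an explicit triple loop and checked against NumPy's built-in matrix product `@`. This is a sketch for clarity, not an efficient implementation (`matmul` is our name):

```python
import numpy as np

# Explicit implementation of the entry formula c_ij = sum_k a_ik * b_kj.
def matmul(A, B):
    m, n = A.shape
    n2, p = B.shape
    assert n == n2, "inner sizes must agree"
    C = np.zeros((m, p))
    for i in range(m):
        for j in range(p):
            for k in range(n):
                C[i, j] += A[i, k] * B[k, j]
    return C

A = np.array([[1., 0, -1], [0, 2, 3]])                        # 2 x 3
B = np.array([[2., 1, 0, 1], [3, -1, -1, 0], [4, 3, 2, 1]])   # 3 x 4
C = matmul(A, B)                                              # 2 x 4
assert np.allclose(C, A @ B)
print(C)
```

These are the same matrices used in Example 1.3.7 below, so the result can be compared entry by entry with the hand computation there.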
Definition 1.3.6 The transpose of an m × n matrix A, denoted by AT , is the n × m matrix whose
i-th row is the i-th column of A. Thus (AT )ij = (A)ji . Observe that for any matrix A we have
(AT )T = A.
Example 1.3.7 If

    A = [ 1  0  −1 ]    and    B = [ 2   1   0  1 ]
        [ 0  2   3 ]               [ 3  −1  −1  0 ]
                                   [ 4   3   2  1 ]

find, when possible, the matrices AB, BA, (AB)T , (BA)T , AT BT , BT AT .

Solution: We have

    AB = [ 1  0  −1 ] [ 2   1   0  1 ]   [ −2  −2  −2  0 ]
         [ 0  2   3 ] [ 3  −1  −1  0 ] = [ 18   7   4  3 ]
           (2×3)      [ 4   3   2  1 ]        (2×4)
                           (3×4)

since, for instance, (AB)11 = 1 · 2 + 0 · 3 + (−1) · 4 = −2 and (AB)21 = 0 · 2 + 2 · 3 + 3 · 4 = 18.
Consequently

    (AB)T = [ −2  18 ]
            [ −2   7 ]
            [ −2   4 ]
            [  0   3 ]   (4×2).

Observe that the products BA, (BA)T and AT BT are not defined. However,

    BT = [ 2   3  4 ]          AT = [  1  0 ]
         [ 1  −1  3 ]               [  0  2 ]
         [ 0  −1  2 ]               [ −1  3 ]   (3×2),
         [ 1   0  1 ]   (4×3)

such that the product BT AT is also defined and

    BT AT = [ −2  18 ]
            [ −2   7 ]
            [ −2   4 ]
            [  0   3 ]  = (AB)T .
Definition 1.3.8 If A is a square matrix, then the trace of A, denoted by tr(A), is defined to be
the sum of the entries on the main diagonal, namely tr(A) := (A)11 + (A)22 + · · ·+ (A)nn , where n
is the order of A. The trace of A is not defined if A is not a square matrix.
Example 1.3.9 Find, when possible, tr(AAT ) and tr(AB), where

    A = [ 1  0  −1 ]    and    B = [ 2   1   0  1 ]
        [ 0  2   3 ]               [ 3  −1  −1  0 ]
                                   [ 4   3   2  1 ]

We first observe that

    AAT = [ 1  0  −1 ] [  1  0 ]   [  2  −3 ]
          [ 0  2   3 ] [  0  2 ] = [ −3  13 ]   (2×2),
                       [ −1  3 ]

such that tr(AAT ) = 2 + 13 = 15. Since

    AB = [ −2  −2  −2  0 ]
         [ 18   7   4  3 ]   (2×4)

is not a square matrix, its trace is not defined.
If Aij , 1 ≤ i ≤ m, 1 ≤ j ≤ n, are matrices of suitable sizes, then we can form a new matrix

    A = [ A11  A12  . . .  A1n ]
        [ A21  A22  . . .  A2n ]
        [  .    .           .  ]
        [ Am1  Am2  . . .  Amn ]

and we call such a subdivision a partition of the matrix A. For example

    [ −1  0 | 2   3 | −2  0 ]
    [  1  0 | 0  −1 |  3  5 ]    = [ A11  A12  A13 ]
    [ −1  1 | 0   1 | −1  0 ]      [ A21  A22  A23 ]
    [  0  5 | 2  −7 |  6  0 ]

where

    A11 = [ −1  0 ]   A12 = [ 2   3 ]   A13 = [ −2  0 ]
          [  1  0 ]         [ 0  −1 ]         [  3  5 ]

    A21 = [ −1  1 ]   A22 = [ 0   1 ]   A23 = [ −1  0 ]
          [  0  5 ]         [ 2  −7 ]         [  6  0 ]
An application of the matrix multiplication operation consists in transforming a system of linear
equations into a matrix equation. Indeed, for a linear system we have successively:

    a11x1 + a12x2 + . . . + a1nxn = b1
    a21x1 + a22x2 + . . . + a2nxn = b2
    ..........................................
    am1x1 + am2x2 + . . . + amnxn = bm

    ⇔   [ a11x1 + a12x2 + . . . + a1nxn ]   [ b1 ]
        [ a21x1 + a22x2 + . . . + a2nxn ]   [ b2 ]
        [               ...             ] = [ .. ]
        [ am1x1 + am2x2 + . . . + amnxn ]   [ bm ]

    ⇔   [ a11  a12  . . .  a1n ] [ x1 ]   [ b1 ]
        [ a21  a22  . . .  a2n ] [ x2 ]   [ b2 ]
        [  .    .           .  ] [ .. ] = [ .. ]
        [ am1  am2  . . .  amn ] [ xn ]   [ bm ]
                  A                X        B

    ⇔   AX = B.

The matrix

    A = [ a11  a12  . . .  a1n ]
        [ a21  a22  . . .  a2n ]
        [  .    .           .  ]
        [ am1  am2  . . .  amn ]

is called the coefficient matrix of the linear system. Observe that the augmented matrix of the
system is [ A | B ].
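Once a system is in the form AX = B, a square system with invertible A becomes a single call to a linear solver. A minimal sketch on a 2 × 2 system of our own choosing (`np.linalg.solve` requires A to be square and invertible):

```python
import numpy as np

# The system  2x + y = 3,  x + 3y = 5  in matrix form AX = B.
A = np.array([[2., 1],
              [1, 3]])
B = np.array([3., 5])

X = np.linalg.solve(A, B)
print(X)
assert np.allclose(A @ X, B)   # X satisfies AX = B
```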
1.4 Inverses. Rules of matrix arithmetic
Theorem 1.4.1 Assuming that the sizes of matrices are such that the indicated operations can be
performed, the following rules of matrix arithmetic are valid:
1. A + B = B + A;
2. (A + B) + C = A + (B + C);
3. (AB)C = A(BC);
4. A(B + C) = AB + AC;
5. (A + B)C = AC + BC;
6. A(B − C) = AB −AC;
7. (A−B)C = AC −BC;
8. a(B + C) = aB + aC;
9. a(B − C) = aB − aC;
10. (a + b)C = aC + bC;
11. (a− b)C = aC − bC;
12. a(bC) = (ab)C;
13. a(BC) = (aB)C = B(aC).
Remark 1.4.2 Matrix multiplication is not commutative. Indeed, for example, if

    A = [ −1  0 ]    B = [ 1  2 ]
        [  2  3 ]        [ 3  0 ]

then

    AB = [ −1  −2 ]   ≠   [  3  6 ]  = BA.
         [ 11   4 ]       [ −3  0 ]
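The counterexample of Remark 1.4.2 can be checked in one line each way (note that in NumPy `@` is matrix multiplication, while `*` would be the entrywise product):

```python
import numpy as np

A = np.array([[-1, 0], [2, 3]])
B = np.array([[1, 2], [3, 0]])

print(A @ B)   # = [[-1, -2], [11, 4]]
print(B @ A)   # = [[3, 6], [-3, 0]]
assert not np.array_equal(A @ B, B @ A)
```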
Define the m × n zero matrix to be the matrix

    Om×n := [ 0  0  · · ·  0 ]
            [ 0  0  · · ·  0 ]
            [ .  .         . ]
            [ 0  0  · · ·  0 ]   (m × n),

equally denoted by O when the size is understood from context.
Theorem 1.4.3 Assuming that the sizes of the matrices are such that the indicated operations can
be performed, the following rules of matrix arithmetic are valid
1. A + O = O + A = A;
2. A−A = O;
3. O −A = −A;
4. AO = O.
The identity matrix of order n is defined to be the square n × n matrix

    In := [ 1  0  · · ·  0 ]
          [ 0  1  · · ·  0 ]
          [ .  .         . ]
          [ 0  0  · · ·  1 ]   (n × n),

and it will be equally denoted by I when the order is understood from context.
Remark 1.4.4 If A is an m× n matrix, then AIn = ImA = A.
Theorem 1.4.5 If R is the reduced row-echelon form of an n × n matrix A, then either R has a
row of zeros, or R is the identity matrix.
Definition 1.4.6 A square matrix A is said to be invertible if there is another square matrix B,
of the same size, such that AB = BA = I.
Remark 1.4.7 If B, C are both inverses of A, then B = C. Indeed, we have successively:
B = BI = B(AC) = (BA)C = IC = C. Therefore, the inverse of an invertible matrix A is unique;
it is denoted by A−1 and characterized by the equalities AA−1 = A−1A = I.
Example 1.4.8 If a, b, c, d ∈ R are such that ad − bc ≠ 0, then the matrix

    A = [ a  b ]
        [ c  d ]

is invertible and

    A−1 = 1/(ad − bc) [  d  −b ]  =  [  d/(ad − bc)  −b/(ad − bc) ]
                      [ −c   a ]     [ −c/(ad − bc)   a/(ad − bc) ]

Indeed, we have

    [ a  b ] [  d/(ad − bc)  −b/(ad − bc) ]   [ (ad − bc)/(ad − bc)   (−ab + ba)/(ad − bc) ]   [ 1  0 ]
    [ c  d ] [ −c/(ad − bc)   a/(ad − bc) ] = [ (cd − dc)/(ad − bc)   (−cb + da)/(ad − bc) ] = [ 0  1 ]

and similarly

    [  d/(ad − bc)  −b/(ad − bc) ] [ a  b ]   [ (da − bc)/(ad − bc)   (db − bd)/(ad − bc) ]   [ 1  0 ]
    [ −c/(ad − bc)   a/(ad − bc) ] [ c  d ] = [ (−ca + ac)/(ad − bc)  (−cb + ad)/(ad − bc) ] = [ 0  1 ]
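The 2 × 2 inverse formula of Example 1.4.8 can be coded directly and verified against the defining equalities AA−1 = A−1A = I (`inv2` is our name for this helper):

```python
import numpy as np

# The 2x2 inverse formula: A^{-1} = (1/(ad-bc)) [[d, -b], [-c, a]].
def inv2(a, b, c, d):
    det = a*d - b*c
    assert det != 0, "matrix is not invertible"
    return np.array([[d, -b], [-c, a]]) / det

A = np.array([[3., 5], [1, 2]])   # det = 3*2 - 5*1 = 1
Ainv = inv2(3, 5, 1, 2)
assert np.allclose(A @ Ainv, np.eye(2))
assert np.allclose(Ainv @ A, np.eye(2))
print(Ainv)
```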
If A is a square matrix, then its nonnegative powers are defined by A0 = I and An = AA · · · A
(n factors) for n > 0. Moreover, if A is invertible, then A−n = A−1A−1 · · · A−1 (n factors), also
for n > 0.
Theorem 1.4.9 If A is an invertible matrix, then
1. A−1 is also invertible and (A−1)−1 = A;
2. An is invertible and (An)−1 = (A−1)n for n ∈ {0, 1, 2, . . .};
3. AT is invertible and (AT )−1 = (A−1)T ;
4. For any non-zero scalar k, the matrix kA is invertible and (kA)−1 = 1kA−1;
5. AmAn = Am+n for any integers m,n;
6. (Am)n = Amn for any integers m,n.
Theorem 1.4.10 If the sizes of the involved matrices are such that the stated operations can be
performed, then
1. (AT )T = A;
2. (A + B)T = AT + BT and (A−B)T = AT −BT ;
3. (kA)T = kAT , where k is any scalar;
4. (AB)T = BT AT .
Example 1.4.11 1. Find the matrix A knowing that

    (I2 + 2A)−1 = [ −1  2 ]
                  [  4  5 ]

Solution: Taking inverses on both sides and using the 2 × 2 inverse formula, we have successively

    I2 + 2A = [ −1  2 ]−1  =  1/((−1) · 5 − 2 · 4) [  5  −2 ]  =  −(1/13) [  5  −2 ]  =  [ −5/13  2/13 ]
              [  4  5 ]                            [ −4  −1 ]            [ −4  −1 ]     [  4/13  1/13 ]

    ⇔  2A = [ −5/13  2/13 ] − [ 1  0 ] = [ −18/13    2/13 ]
            [  4/13  1/13 ]   [ 0  1 ]   [   4/13  −12/13 ]

    ⇔  A = (1/2) [ −18/13    2/13 ] = [ −9/13   1/13 ]
                 [   4/13  −12/13 ]   [  2/13  −6/13 ]
2. If

    A = [ cos θ  − sin θ ]
        [ sin θ    cos θ ]

show that

    A2 = [ cos 2θ  − sin 2θ ]    and    A3 = [ cos 3θ  − sin 3θ ]
         [ sin 2θ    cos 2θ ]                [ sin 3θ    cos 3θ ]

Solution: Indeed, using the double-angle formulas,

    A2 = [ cos θ  − sin θ ] [ cos θ  − sin θ ]   [ cos²θ − sin²θ   −2 sin θ cos θ ]   [ cos 2θ  − sin 2θ ]
         [ sin θ    cos θ ] [ sin θ    cos θ ] = [ 2 sin θ cos θ   cos²θ − sin²θ  ] = [ sin 2θ    cos 2θ ]

and, using the addition formulas,

    A3 = A A2 = [ cos θ  − sin θ ] [ cos 2θ  − sin 2θ ]
                [ sin θ    cos θ ] [ sin 2θ    cos 2θ ]

              = [ cos θ cos 2θ − sin θ sin 2θ   −(cos θ sin 2θ + sin θ cos 2θ) ]
                [ sin θ cos 2θ + cos θ sin 2θ     cos θ cos 2θ − sin θ sin 2θ  ]

              = [ cos(θ + 2θ)  − sin(θ + 2θ) ]  =  [ cos 3θ  − sin 3θ ]
                [ sin(θ + 2θ)    cos(θ + 2θ) ]     [ sin 3θ    cos 3θ ]
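Numerically, the same pattern holds: the n-th power of the rotation matrix is the rotation by nθ. A quick check for n = 2 and n = 3 at an arbitrary angle (`rot` is our helper):

```python
import numpy as np

# Rotation matrix by angle theta, as in Example 1.4.11(2).
def rot(theta):
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

theta = 0.7
A = rot(theta)
assert np.allclose(A @ A, rot(2*theta))          # A^2 rotates by 2*theta
assert np.allclose(A @ A @ A, rot(3*theta))      # A^3 rotates by 3*theta
print("A^2 and A^3 are rotations by 2*theta and 3*theta")
```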
1.5 Elementary matrices and a method for finding A−1
Definition 1.5.1 An n × n matrix is called an elementary matrix if it can be obtained from the n × n
identity matrix In by performing a single elementary row operation.
Example 1.5.2 Performing the row operation 3r1 + r3 → r3 on I3 produces an elementary matrix:

    I3 = [ 1  0  0 ]   3r1 + r3 → r3    [ 1  0  0 ]
         [ 0  1  0 ]   ───────────→     [ 0  1  0 ] = E.
         [ 0  0  1 ]                    [ 3  0  1 ]

If

    A = [ −1  2   1   3 ]
        [  2  1   6  −3 ]
        [  4  2  −1   0 ]

then

    EA = [ −1  2   1   3 ]
         [  2  1   6  −3 ]
         [  1  8   2   9 ]

which is exactly the matrix obtained from A by performing the same row operation 3r1 + r3 → r3.
Theorem 1.5.3 If the elementary matrix E results from performing a certain row operation on
Im and A is an m × n matrix, then the product EA is the matrix that results when the same row
operation is performed on A.
If an elementary row operation is performed on the identity matrix I to obtain an elementary matrix
E, then there is another operation, called the corresponding inverse operation, which, when applied
to E, gives back I:

    Direct operation                              Corresponding inverse operation
    Multiply row i by c ≠ 0  (c·ri)               Multiply row i by 1/c  ((1/c)·ri)
    Interchange rows i and j  (ri ↔ rj)           Interchange rows i and j  (ri ↔ rj)
    Add c times row i to row j  (cri + rj → rj)   Add −c times row i to row j  (−cri + rj → rj)
Theorem 1.5.4 Every elementary matrix is invertible, and the inverse is also an elementary ma-
trix.
Proof. If I → E by a direct operation, then the corresponding inverse operation carries E back
to I. If E0 denotes the elementary matrix obtained by applying that inverse operation to I, then
E0E = I and EE0 = I, namely E is invertible. □
Theorem 1.5.5 If A is an n× n matrix, then the following statements are equivalent:
1. A is invertible;
2. The homogeneous linear system AX = O has only the trivial solution;
3. The reduced row-echelon form of A is In;
4. A is expressible as a product of elementary matrices.
The last statement, (4), of Theorem 1.5.5 is of particular importance, since it provides us with a
method of finding the reduced row-echelon form of a matrix and a method of finding the inverse of
an invertible matrix.
Let A be a matrix and R be its reduced row-echelon form, obtained through a sequence of row
operations

    A →(op1) A1 →(op2) A2 →(op3) · · · →(opn) R.

Then R = En · . . . · E1 A, where I →(op1) E1 , I →(op2) E2 , . . . , I →(opn) En . Therefore

    A = (En · . . . · E1)−1 R = E1−1 · . . . · En−1 R.

If, in particular, A is invertible, then R = I, such that A = E1−1 · . . . · En−1 and implicitly

    A−1 = En · . . . · E1 .

Thus, in order to find the inverse of an invertible matrix A, we first observe the general property
A[B | C] = [AB | AC], and then compute successively:

    [A | I] →(op1) [E1A | E1] →(op2) [E2E1A | E2E1] →(op3) · · · →(opn) [En · · · E1A | En · · · E1] = [I | A−1].
Examples 1.5.6 1. Find the inverse of the matrix

    A = [ −2  −1   4 ]
        [  3   1  −7 ]
        [  2   0  −5 ]

and write A−1 as a product of elementary matrices;

2. Express the matrix

    A = [  0   1  7   8 ]
        [  1   3  3   8 ]
        [ −2  −5  1  −8 ]

in the form A = EFGR, where E, F, G are elementary matrices and R is in row-echelon form.
(1) We reduce [A | I3] to [I3 | A−1]:

    [ −2  −1   4 | 1  0  0 ]   r2 + r1 → r1    [ 1  0  −3 | 1  1  0 ]
    [  3   1  −7 | 0  1  0 ]   ───────────→    [ 3  1  −7 | 0  1  0 ]
    [  2   0  −5 | 0  0  1 ]                   [ 2  0  −5 | 0  0  1 ]

    −3r1 + r2 → r2 , −2r1 + r3 → r3 :

    [ 1  0  −3 |  1   1  0 ]
    [ 0  1   2 | −3  −2  0 ]
    [ 0  0   1 | −2  −2  1 ]

    −2r3 + r2 → r2 , 3r3 + r1 → r1 :

    [ 1  0  0 | −5  −5   3 ]
    [ 0  1  0 |  1   2  −2 ]
    [ 0  0  1 | −2  −2   1 ]

so that

    A−1 = [ −5  −5   3 ]
          [  1   2  −2 ]
          [ −2  −2   1 ]

Therefore E5E4E3E2E1A = I3 , that is A−1 = E5E4E3E2E1 , where each Ei is obtained by applying
the corresponding operation to I3 :

    E1 = [ 1  1  0 ]  (r2 + r1 → r1)        E2 = [  1  0  0 ]  (−3r1 + r2 → r2)
         [ 0  1  0 ]                             [ −3  1  0 ]
         [ 0  0  1 ]                             [  0  0  1 ]

    E3 = [  1  0  0 ]  (−2r1 + r3 → r3)     E4 = [ 1  0   0 ]  (−2r3 + r2 → r2)
         [  0  1  0 ]                            [ 0  1  −2 ]
         [ −2  0  1 ]                            [ 0  0   1 ]

    E5 = [ 1  0  3 ]  (3r3 + r1 → r1)
         [ 0  1  0 ]
         [ 0  0  1 ]
(2) We reduce A to row-echelon form:

    A = [  0   1  7   8 ]   r1 ↔ r2    [  1   3  3   8 ]   2r1 + r3 → r3    [ 1  3  3  8 ]
        [  1   3  3   8 ]   ───────→   [  0   1  7   8 ]   ───────────→     [ 0  1  7  8 ]
        [ −2  −5  1  −8 ]              [ −2  −5  1  −8 ]                    [ 0  1  7  8 ]

    −r2 + r3 → r3 :

    [ 1  3  3  8 ]
    [ 0  1  7  8 ] = R, a matrix in row-echelon form.
    [ 0  0  0  0 ]

Consequently R = E3E2E1A, or equivalently A = E1−1E2−1E3−1R, where

    E1 = [ 0  1  0 ]  (r1 ↔ r2),           E1−1 = E1 ,
         [ 1  0  0 ]
         [ 0  0  1 ]

    E2 = [ 1  0  0 ]  (2r1 + r3 → r3),     E2−1 = [  1  0  0 ]
         [ 0  1  0 ]                              [  0  1  0 ]
         [ 2  0  1 ]                              [ −2  0  1 ]

    E3 = [ 1   0  0 ]  (−r2 + r3 → r3),    E3−1 = [ 1  0  0 ]
         [ 0   1  0 ]                             [ 0  1  0 ]
         [ 0  −1  1 ]                             [ 0  1  1 ]

Now it is enough to take E = E1−1 , F = E2−1 , G = E3−1 .
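The inverse computed by hand in Examples 1.5.6(1) can be checked against NumPy's general-purpose inverse, which internally uses an LU factorization rather than the [A | I] reduction shown above:

```python
import numpy as np

A = np.array([[-2., -1, 4],
              [3, 1, -7],
              [2, 0, -5]])
Ainv = np.array([[-5., -5, 3],
                 [1, 2, -2],
                 [-2, -2, 1]])

assert np.allclose(A @ Ainv, np.eye(3))       # A A^{-1} = I
assert np.allclose(Ainv @ A, np.eye(3))       # A^{-1} A = I
assert np.allclose(Ainv, np.linalg.inv(A))    # agrees with NumPy
print("A^{-1} verified")
```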
1.6 Further results on systems of equations and invertibility
Theorem 1.6.1 Every system of linear equations has either no solution, exactly one solution or
infinitely many solutions.
Theorem 1.6.2 If A is an invertible n × n matrix, then for each n × 1 matrix B, the system of
linear equations AX = B has exactly one solution, namely X = A−1B.
Example 1.6.3 Find the solution set of the following linear system:

    −2x − y + 4z = 1
    3x + y − 7z = −1
    2x − 5z = 0
Solution: We first observe that the coefficient matrix of the given linear system is

    A = [ −2  −1   4 ]
        [  3   1  −7 ]
        [  2   0  −5 ]

which is invertible, with inverse (computed in Examples 1.5.6)

    A−1 = [ −5  −5   3 ]
          [  1   2  −2 ]
          [ −2  −2   1 ]

The given linear system can be written in the form AX = B, where

    X = [ x ]        B = [  1 ]
        [ y ]            [ −1 ]
        [ z ]            [  0 ]

Consequently, the unique solution of the given linear system is

    X = A−1B = [ −5  −5   3 ] [  1 ]   [  0 ]
               [  1   2  −2 ] [ −1 ] = [ −1 ]
               [ −2  −2   1 ] [  0 ]   [  0 ]
Frequently, one has to solve a sequence of linear systems

    AX = B1 , AX = B2 , . . . , AX = Bk

each of which has the same square coefficient matrix A. If A is invertible, then the solutions are

    X1 = A−1B1 , X2 = A−1B2 , . . . , Xk = A−1Bk .

Otherwise, a method of solving those systems, which works whether A is invertible or not, consists
in forming the matrix

    [ A | B1 | B2 | · · · | Bk ]

and reducing it to reduced row-echelon form.
Example 1.6.4 Solve the following linear systems:

    −2x − y + 4z = 1          −2x − y + 4z = 2
    3x + y − 7z = −1          3x + y − 7z = −2
    2x − 5z = 0               2x − 5z = 3
Theorem 1.6.5 Let A be a square matrix.
1. If B is a square matrix satisfying BA = I, then B = A−1;
2. If B is a square matrix satisfying AB = I, then B = A−1.
Theorem 1.6.7 If A is an n × n matrix, then the following statements are equivalent:
1. A is invertible (*);
2. The homogeneous linear system AX = O has only the trivial solution;
3. The reduced row-echelon form of A is In;
4. A is expressible as a product of elementary matrices.
5. AX = B is consistent for every n× 1 matrix B;(*)
6. AX = B has exactly one solution for every n× 1 matrix B(*).
Theorem 1.6.8 Let A and B be square matrices of the same size. If AB is invertible, then A and
B are both invertible.
Fundamental Problem 1.6.9 Let A be a fixed m × n matrix. Find all m × 1 matrices B such
that the system of equations AX = B is consistent.
Example 1.6.10 What conditions must b1 , b2 , b3 satisfy in order for the linear system

    x + y − 2z = b1
    x − 2y + z = b2
    −2x + y + z = b3

to be consistent?
Solution: For the augmented matrix of the given linear system we have successively:

    [  1   1  −2 | b1 ]   −r1 + r2 → r2     [ 1   1  −2 | b1       ]
    [  1  −2   1 | b2 ]   2r1 + r3 → r3     [ 0  −3   3 | b2 − b1  ]
    [ −2   1   1 | b3 ]   ───────────→      [ 0   3  −3 | b3 + 2b1 ]

    r2 + r3 → r3 :

    [ 1   1  −2 | b1           ]
    [ 0  −3   3 | b2 − b1      ]
    [ 0   0   0 | b1 + b2 + b3 ]

Thus, the given linear system is consistent if and only if b1 + b2 + b3 = 0. □
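The criterion b1 + b2 + b3 = 0 can be spot-checked numerically by comparing the rank of the coefficient matrix with the rank of the augmented matrix for sample right-hand sides (`is_consistent` is our helper; a heuristic check, not a proof):

```python
import numpy as np

A = np.array([[1., 1, -2],
              [1, -2, 1],
              [-2, 1, 1]])

def is_consistent(b):
    aug = np.column_stack([A, b])
    return np.linalg.matrix_rank(aug) == np.linalg.matrix_rank(A)

assert is_consistent(np.array([1., 2, -3]))       # 1 + 2 - 3 = 0: consistent
assert not is_consistent(np.array([1., 2, 3]))    # 1 + 2 + 3 = 6: inconsistent
print("consistency criterion confirmed on two samples")
```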
1.7 Diagonal, Triangular and Symmetric Matrices
Definition 1.7.1 A square matrix in which all the entries off the main diagonal are zero is called a
diagonal matrix. The general form of a diagonal matrix is

    D = [ d1  0   · · ·  0  ]
        [ 0   d2  · · ·  0  ]
        [ .   .          .  ]
        [ 0   0   · · ·  dn ]
Remarks 1.7.2 1. The diagonal matrix

    D = [ d1  0   · · ·  0  ]
        [ 0   d2  · · ·  0  ]
        [ .   .          .  ]
        [ 0   0   · · ·  dn ]

is invertible iff all of its diagonal entries are nonzero. In this case

    D−1 = [ 1/d1   0    · · ·   0   ]
          [  0    1/d2  · · ·   0   ]
          [  .     .            .   ]
          [  0     0    · · ·  1/dn ]

2. Powers of diagonal matrices are easy to compute. More precisely, with D as above,

    Dk = [ d1^k   0    · · ·   0   ]
         [  0    d2^k  · · ·   0   ]
         [  .     .            .   ]
         [  0     0    · · ·  dn^k ]

3. If A = [aij ] is an m × n matrix, then multiplying A on the left by a diagonal matrix scales its
rows, and multiplying on the right scales its columns:

    [ δ1  · · ·  0  ] [ a11  . . .  a1n ]   [ δ1a11  . . .  δ1a1n ]
    [ .          .  ] [  .           .  ] = [   .             .   ]
    [ 0   · · ·  δm ] [ am1  . . .  amn ]   [ δmam1  . . .  δmamn ]

and

    [ a11  . . .  a1n ] [ d1  · · ·  0  ]   [ a11d1  . . .  a1ndn ]
    [  .           .  ] [ .          .  ] = [   .             .   ]
    [ am1  . . .  amn ] [ 0   · · ·  dn ]   [ am1d1  . . .  amndn ]
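These remarks mean that powers and inverses of diagonal matrices reduce to entrywise operations on the diagonal vector, which a quick NumPy sketch confirms:

```python
import numpy as np

d = np.array([2., 3, 5])
D = np.diag(d)

# D^k has diagonal entries d_i^k, and D^{-1} has diagonal entries 1/d_i.
assert np.allclose(np.linalg.matrix_power(D, 4), np.diag(d**4))
assert np.allclose(np.linalg.inv(D), np.diag(1/d))
print(np.diag(d**4))
```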
Examples 1.7.3 Find a diagonal matrix A satisfying

    A−2 = [ 9  0  0 ]
          [ 0  4  0 ]
          [ 0  0  1 ]

Solution: We have successively:

    A2 = (A−2)−1 = [ 9  0  0 ]−1   [ 1/9   0   0 ]   [ (1/3)²    0     0  ]
                   [ 0  4  0 ]   = [  0   1/4  0 ] = [    0   (1/2)²   0  ]
                   [ 0  0  1 ]     [  0    0   1 ]   [    0      0     1² ]

Thus, such a matrix is

    A = [ 1/3   0   0 ]
        [  0   1/2  0 ]
        [  0    0   1 ]
Definition 1.7.4 A square matrix in which all the entries above the main diagonal are zero is
called lower triangular, and a square matrix in which all the entries below the main diagonal are
zero is called upper triangular. A matrix which is either lower triangular or upper triangular is
called triangular. The general form of a lower triangular matrix is

    [ a11   0   · · ·   0  ]
    [ a21  a22  · · ·   0  ]
    [  .    .           .  ]
    [ an1  an2  · · ·  ann ]

and the general form of an upper triangular matrix is

    [ a11  a12  · · ·  a1n ]
    [  0   a22  · · ·  a2n ]
    [  .    .           .  ]
    [  0    0   · · ·  ann ]
Remark 1.7.5 1. A square matrix A = [aij ] is upper triangular iff aij = 0 for i > j;
2. A square matrix A = [aij ] is lower triangular iff aij = 0 for i < j.
Theorem 1.7.6 1. The transpose of a lower triangular matrix is upper triangular, and the
transpose of an upper triangular matrix is lower triangular;
2. The product of lower triangular matrices is a lower triangular matrix;
3. The product of upper triangular matrices is an upper triangular matrix;
4. A triangular matrix is invertible if and only if its diagonal entries are all nonzero;
5. The inverse of a lower triangular matrix is lower triangular and the inverse of an upper
triangular matrix is upper triangular.
Example 1.7.7 If
\[ A = \begin{bmatrix} 1 & -1 & 2 \\ 0 & 2 & 3 \\ 0 & 0 & 1 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} 2 & 3 & 1 \\ 0 & -1 & 2 \\ 0 & 0 & 5 \end{bmatrix}, \]
show that AB is also upper triangular and find (AB)^{-1} and B^{-1}A^{-1}.
Solution:
\[ \begin{bmatrix} 1 & -1 & 2 \\ 0 & 2 & 3 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 2 & 3 & 1 \\ 0 & -1 & 2 \\ 0 & 0 & 5 \end{bmatrix}
= \begin{bmatrix} 2 & 4 & 9 \\ 0 & -2 & 19 \\ 0 & 0 & 5 \end{bmatrix}. \]
\[ \left[\begin{array}{ccc|ccc} 2 & 4 & 9 & 1 & 0 & 0 \\ 0 & -2 & 19 & 0 & 1 & 0 \\ 0 & 0 & 5 & 0 & 0 & 1 \end{array}\right]
\xrightarrow{\frac12 R_1,\ -\frac12 R_2,\ \frac15 R_3}
\left[\begin{array}{ccc|ccc} 1 & 2 & \frac92 & \frac12 & 0 & 0 \\ 0 & 1 & -\frac{19}{2} & 0 & -\frac12 & 0 \\ 0 & 0 & 1 & 0 & 0 & \frac15 \end{array}\right] \]
\[ \xrightarrow[\frac{19}{2}R_3 + R_2 \to R_2]{-2R_2 + R_1 \to R_1}
\left[\begin{array}{ccc|ccc} 1 & 0 & \frac{47}{2} & \frac12 & 1 & 0 \\ 0 & 1 & 0 & 0 & -\frac12 & \frac{19}{10} \\ 0 & 0 & 1 & 0 & 0 & \frac15 \end{array}\right]
\xrightarrow{-\frac{47}{2}R_3 + R_1 \to R_1}
\left[\begin{array}{ccc|ccc} 1 & 0 & 0 & \frac12 & 1 & -\frac{47}{10} \\ 0 & 1 & 0 & 0 & -\frac12 & \frac{19}{10} \\ 0 & 0 & 1 & 0 & 0 & \frac15 \end{array}\right]. \]
Thus
\[ (AB)^{-1} = B^{-1}A^{-1} = \begin{bmatrix} \frac12 & 1 & -\frac{47}{10} \\ 0 & -\frac12 & \frac{19}{10} \\ 0 & 0 & \frac15 \end{bmatrix}. \]
Definition 1.7.8 A square matrix A is called symmetric if A^T = A, or equivalently (A)_{ij} = (A)_{ji} for all i, j.
Theorem 1.7.9 If A,B are symmetric matrices of the same size and k is any scalar, then:
1. AT is symmetric;
2. A + B and A−B are symmetric;
3. kA is symmetric.
Theorem 1.7.10 1. The product of two symmetric matrices is symmetric if and only if the
matrices commute;
2. If A is an invertible symmetric matrix, then A−1 is symmetric.
Theorem 1.7.11 1. If A is an arbitrary matrix, then AAT and AT A are symmetric matrices;
2. If A is an invertible matrix, then AAT and AT A are invertible matrices.
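Part 1 of Theorem 1.7.11 is easy to illustrate numerically: for any matrix A, both AA^T and A^T A equal their own transposes. A pure-Python sketch (helper names are ours):

```python
def transpose(M):
    """Transpose of a list-of-lists matrix."""
    return [list(row) for row in zip(*M)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2, 0],
     [3, -1, 4]]              # an arbitrary 2x3 matrix
S = matmul(A, transpose(A))   # 2x2 product A A^T
T = matmul(transpose(A), A)   # 3x3 product A^T A
print(S == transpose(S))  # True
print(T == transpose(T))  # True
```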
Example 1.7.12 Find the values of a, b, c ∈ R for which the matrix A is symmetric, where
\[ A = \begin{bmatrix} 2 & a+b+c & 4b-3c \\ 5 & 1 & 2a+c \\ 2 & 4 & 6 \end{bmatrix}. \]
Solution: The required values are the solutions of the linear system
\[ \begin{cases} a + b + c = 5 \\ 4b - 3c = 2 \\ 2a + c = 4 \end{cases} \]
that can be solved by performing row-reduction operations on the corresponding augmented matrix, namely:
\[ \left[\begin{array}{ccc|c} 1 & 1 & 1 & 5 \\ 0 & 4 & -3 & 2 \\ 2 & 0 & 1 & 4 \end{array}\right]
\xrightarrow{-2r_1 + r_3 \to r_3}
\left[\begin{array}{ccc|c} 1 & 1 & 1 & 5 \\ 0 & 4 & -3 & 2 \\ 0 & -2 & -1 & -6 \end{array}\right]
\xrightarrow{\frac14 r_2}
\left[\begin{array}{ccc|c} 1 & 1 & 1 & 5 \\ 0 & 1 & -\frac34 & \frac12 \\ 0 & -2 & -1 & -6 \end{array}\right] \]
\[ \xrightarrow{2r_2 + r_3 \to r_3}
\left[\begin{array}{ccc|c} 1 & 1 & 1 & 5 \\ 0 & 1 & -\frac34 & \frac12 \\ 0 & 0 & -\frac52 & -5 \end{array}\right]
\xrightarrow{-\frac25 r_3,\ -r_2 + r_1 \to r_1}
\left[\begin{array}{ccc|c} 1 & 0 & \frac74 & \frac92 \\ 0 & 1 & -\frac34 & \frac12 \\ 0 & 0 & 1 & 2 \end{array}\right]. \]
The linear system corresponding to the last matrix in the above row-reduction process, which is equivalent
to the initial one, has the unique solution a = 1, b = c = 2.
Chapter 2
Determinants
2.1 The determinant function
Definition 2.1.1 A permutation of the set {1, 2, . . . , n} is a bijective function σ : {1, 2, . . . , n} → {1, 2, . . . , n}, namely an arrangement of the integers 1, 2, . . . , n without omissions or repetitions. It
is also written
\[ \sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ \sigma(1) & \sigma(2) & \cdots & \sigma(n) \end{pmatrix}. \]
An inversion of the permutation σ is a pair (i, j) such that i < j and σ(i) > σ(j). Denote by m(σ)
the number of inversions of σ and by ε(σ) = (−1)^{m(σ)} the signature of σ. A permutation σ is called
even/odd if m(σ) is even/odd, or equivalently ε(σ) = +1/ε(σ) = −1.
Definition 2.1.2 The determinant of a square n × n matrix A = [a_{ij}] is defined to be
\[ \det(A) = \sum_{\sigma} \varepsilon(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)}, \]
where the sum ranges over all permutations σ of {1, 2, . . . , n}. It is denoted either by det(A) or by
\[ \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}. \]
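Definition 2.1.2 can be implemented verbatim: sum over all permutations, each term carrying the sign (−1)^{m(σ)}, where m(σ) counts inversions. A pure-Python sketch (function names are ours; this sum has n! terms, so practical code would use row reduction instead):

```python
from itertools import permutations

def sign(perm):
    """Signature (-1)**m, where m is the number of inversions of perm."""
    m = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
            if perm[i] > perm[j])
    return -1 if m % 2 else 1

def prod_along(A, p):
    """Product a_{1,p(1)} * a_{2,p(2)} * ... * a_{n,p(n)} (0-indexed)."""
    r = 1
    for i, j in enumerate(p):
        r *= A[i][j]
    return r

def det(A):
    """Determinant as the signed sum over all permutations."""
    n = len(A)
    return sum(sign(p) * prod_along(A, p) for p in permutations(range(n)))

print(det([[1, 2], [-1, 3]]))                     # 5
print(det([[1, -1, 2], [0, 2, 1], [-2, 2, 3]]))   # 14
```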
Remarks 2.1.3 1. If A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} is a 2 × 2 square matrix, then det(A) = a_{11}a_{22} − a_{12}a_{21}.
Indeed, the only permutations of {1, 2} are e = \begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix} and σ = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}, and ε(e) = 1,
ε(σ) = −1 since m(e) = 0 and m(σ) = 1. Thus
\[ \det\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = \varepsilon(e) a_{1e(1)} a_{2e(2)} + \varepsilon(\sigma) a_{1\sigma(1)} a_{2\sigma(2)} = a_{11}a_{22} - a_{12}a_{21}. \]
For example
\[ \begin{vmatrix} 1 & 2 \\ -1 & 3 \end{vmatrix} = 1 \cdot 3 - 2 \cdot (-1) = 3 + 2 = 5. \]
2. A 2 × 2 matrix A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} is invertible if and only if det(A) ≠ 0. In this case
\[ A^{-1} = \frac{1}{\det(A)} \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix}. \]
3. If A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} is a 3 × 3 square matrix, then
\[ \det(A) = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32}. \]
(This is the rule of Sarrus: copy the first two columns to the right of the matrix; the three products along the descending diagonals are taken with the sign +, and the three products along the ascending diagonals with the sign −.)
Indeed, the only permutations of {1, 2, 3} are
\[ e = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}, \quad \sigma_1 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}, \]
\[ \sigma_3 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}, \quad \sigma_4 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}, \quad \sigma_5 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}, \]
and ε(e) = ε(σ_4) = ε(σ_5) = 1 while ε(σ_1) = ε(σ_2) = ε(σ_3) = −1, since m(e) = 0, m(σ_4) =
m(σ_5) = 2 and m(σ_1) = 1, m(σ_2) = 3, m(σ_3) = 1.
For example
\[ \begin{vmatrix} 1 & -1 & 2 \\ 0 & 2 & 1 \\ -2 & 2 & 3 \end{vmatrix} = 1 \cdot 2 \cdot 3 + (-1) \cdot 1 \cdot (-2) + 2 \cdot 0 \cdot 2 - 2 \cdot 2 \cdot (-2) - (-1) \cdot 0 \cdot 3 - 1 \cdot 1 \cdot 2 = 6 + 2 + 0 + 8 - 0 - 2 = 14. \]
2.2 Evaluating determinants by row reduction
Theorem 2.2.1 Let A be a square matrix.
1. If A has a row of zeros, then det(A) = 0.
2. det(A) = det(AT ).
Theorem 2.2.2 If A is an n × n triangular matrix, then det(A) is the product of the entries on
the main diagonal, namely det(A) = a_{11} a_{22} · · · a_{nn}.
Theorem 2.2.3 Let A be an n× n square matrix.
1. If B is the matrix that results when a single row or a single column of A is multiplied by a
scalar k, then det(B) = k det(A); consequently, det(kA) = k^n det(A);
2. If B is the matrix that results when two rows or two columns of A are interchanged, then
det(B) = −det(A);
3. If B is the matrix that results when a multiple of one row of A is added to another row or
when a multiple of one column of A is added to another column, then det(B) = det(A).
Corollary 2.2.4 If A is a square matrix with two proportional rows or two proportional columns,
then det(A) = 0.
Example 2.2.5 Find the determinant of A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & -1 \end{bmatrix}.
Solution:
\[ \det(A) = \begin{vmatrix} 1 & 2 & 3 & 4 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & -1 \end{vmatrix}
\stackrel{-r_1 + r_2 \to r_2}{=}
\begin{vmatrix} 1 & 2 & 3 & 4 \\ 0 & -2 & -2 & -4 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & -1 \end{vmatrix}
= (-2) \begin{vmatrix} 1 & 2 & 3 & 4 \\ 0 & 1 & 1 & 2 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & -1 \end{vmatrix}
= (-2) \begin{vmatrix} 1 & 1 & 2 \\ 1 & 0 & 1 \\ 0 & 1 & -1 \end{vmatrix}
= -2(2 - 1 + 1) = -4. \]
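The row-reduction strategy of this section can be sketched as code: eliminate below each pivot (which leaves the determinant unchanged by Theorem 2.2.3.3), flip the sign on each row interchange, and finally multiply the diagonal entries of the resulting triangular matrix (Theorem 2.2.2). Exact rational arithmetic avoids round-off; the function name is ours:

```python
from fractions import Fraction

def det_by_row_reduction(A):
    """Determinant via Gaussian elimination with exact rationals."""
    M = [[Fraction(x) for x in row] for row in A]
    n, sign = len(M), 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)          # zero column => determinant is 0
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            sign = -sign                # a row interchange changes the sign
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            # adding a multiple of one row to another keeps det unchanged
            M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    result = Fraction(sign)
    for i in range(n):
        result *= M[i][i]               # product of the diagonal entries
    return result

A = [[1, 2, 3, 4], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, -1]]
print(det_by_row_reduction(A))  # -4, as in Example 2.2.5
```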
35
2.3 Properties of determinant function
Theorem 2.3.1 Let A,B and C be n×n matrices that differ only in a single row, say the rth, and assume
that the rth row of C can be obtained by adding corresponding entries in the rth row of A and B. Then
det(C) = det(A) + det(B).
In other words∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
a11 a12 · · · a1n
a21 a22 · · · a2n
......
. . ....
a11 + a′11
a12 + a′12
· · · a1n + a′1n
......
. . ....
am1 am2 · · · amn
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
=
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
a11 a12 · · · a1n
a21 a22 · · · a2n
......
. . ....
a11 a12 · · · a1n
......
. . ....
am1 a
m2 · · · amn
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
+
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
a11 a12 · · · a1n
a21 a22 · · · a2n
......
. . ....
a′11
a′12
· · · a′1n
......
. . ....
am1 a
m2 · · · amn
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
Theorem 2.3.2 If A and B are square matrices of the same size, then det(AB) = det(A) det(B).
Theorem 2.3.3 If A is an invertible matrix, then det(A^{-1}) = \frac{1}{\det(A)}.
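Both theorems can be spot-checked on 2 × 2 matrices, where the determinant and the inverse have closed forms (helper names and sample matrices are ours):

```python
from fractions import Fraction

def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 2], [-1, 3]]
B = [[2, 0], [1, 4]]
# det(AB) = det(A) det(B):
print(det2(matmul(A, B)) == det2(A) * det2(B))  # True

# det(A^{-1}) = 1/det(A), using the closed form (1/det) * [[d, -b], [-c, a]]:
d = Fraction(det2(A))
A_inv = [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]
print(det2(A_inv) == 1 / d)  # True
```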
Theorem 2.3.4 If A is an n × n matrix, then the following statements are equivalent:
1. A is invertible;
2. The homogeneous linear system AX = O has only the trivial solution;
3. The reduced row-echelon form of A is I_n;
4. A is expressible as a product of elementary matrices;
5. AX = B is consistent for every n × 1 matrix B;
6. AX = B has exactly one solution for every n × 1 matrix B;
7. det(A) ≠ 0.
Example 2.3.5 1. Consider the matrix A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ x & y & z & w \\ w & x & y & z \\ y+z & z+w & w+x & x+y \end{bmatrix}. Show that det(A) = 0
for all x, y, z, w ∈ R.
Solution: For the determinant of A we have successively:
\[ \begin{vmatrix} 1 & 1 & 1 & 1 \\ x & y & z & w \\ w & x & y & z \\ y+z & z+w & w+x & x+y \end{vmatrix}
\stackrel{r_2 + r_4 \to r_4}{=}
\begin{vmatrix} 1 & 1 & 1 & 1 \\ x & y & z & w \\ w & x & y & z \\ x+y+z & y+z+w & z+w+x & w+x+y \end{vmatrix}
\stackrel{r_3 + r_4 \to r_4}{=}
\begin{vmatrix} 1 & 1 & 1 & 1 \\ x & y & z & w \\ w & x & y & z \\ x+y+z+w & x+y+z+w & x+y+z+w & x+y+z+w \end{vmatrix} \]
\[ = (x+y+z+w) \begin{vmatrix} 1 & 1 & 1 & 1 \\ x & y & z & w \\ w & x & y & z \\ 1 & 1 & 1 & 1 \end{vmatrix} = 0 \]
for all x, y, z, w ∈ R, since the last determinant has two equal rows.
2. Let
\[ A = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}. \]
Assuming that det(A) = −7, find det(3A), det(A^{-1}), det(2A^{-1}) and det \begin{bmatrix} a & g & d \\ b & h & e \\ c & i & f \end{bmatrix}.
Solution: • det(3A) = 3^3 det(A) = 27 · (−7) = −189.
• det(A^{-1}) = \frac{1}{\det(A)} = \frac{1}{-7} = -\frac17.
• det(2A^{-1}) = 2^3 det(A^{-1}) = 8 · \left(-\frac17\right) = -\frac87.
• \[ \det \begin{bmatrix} a & g & d \\ b & h & e \\ c & i & f \end{bmatrix} = \det \begin{bmatrix} a & g & d \\ b & h & e \\ c & i & f \end{bmatrix}^{T} = \begin{vmatrix} a & b & c \\ g & h & i \\ d & e & f \end{vmatrix} = - \begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = -(-7) = 7. \]
3. Show that
\[ \begin{vmatrix} a_1 + b_1 t & a_2 + b_2 t & a_3 + b_3 t \\ a_1 t + b_1 & a_2 t + b_2 & a_3 t + b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = (1 - t^2) \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}. \]
Solution: Indeed, splitting the first row and then the second row by Theorem 2.3.1, we have successively:
\[ \begin{vmatrix} a_1 + b_1 t & a_2 + b_2 t & a_3 + b_3 t \\ a_1 t + b_1 & a_2 t + b_2 & a_3 t + b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
= \begin{vmatrix} a_1 & a_2 & a_3 \\ a_1 t + b_1 & a_2 t + b_2 & a_3 t + b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
+ \begin{vmatrix} b_1 t & b_2 t & b_3 t \\ a_1 t + b_1 & a_2 t + b_2 & a_3 t + b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} \]
\[ = \underbrace{\begin{vmatrix} a_1 & a_2 & a_3 \\ a_1 t & a_2 t & a_3 t \\ c_1 & c_2 & c_3 \end{vmatrix}}_{=0}
+ \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
+ \begin{vmatrix} b_1 t & b_2 t & b_3 t \\ a_1 t & a_2 t & a_3 t \\ c_1 & c_2 & c_3 \end{vmatrix}
+ \underbrace{\begin{vmatrix} b_1 t & b_2 t & b_3 t \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}}_{=0} \]
\[ = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
+ t^2 \begin{vmatrix} b_1 & b_2 & b_3 \\ a_1 & a_2 & a_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
= \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
- t^2 \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
= (1 - t^2) \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}. \]
2.4 Cofactor expansion. Cramer’s rule
Definition 2.4.1 If A is a square matrix, then the minor of entry aij is denoted by Mij and is defined to
be the determinant of the submatrix that remains after the ith row and the jth column are deleted from A.
The number (−1)i+jMij is denoted by Cij and is called the cofactor of entry aij .
Example 2.4.2
\[ \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32} \]
\[ = a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{23}a_{31}) + a_{13}(a_{21}a_{32} - a_{22}a_{31}) \]
\[ = (-1)^{1+1} a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} + (-1)^{1+2} a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + (-1)^{1+3} a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix} \]
\[ = a_{11}C_{11} + a_{12}C_{12} + a_{13}C_{13}. \]
Similarly,
\[ \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32} \]
\[ = -a_{12}(a_{21}a_{33} - a_{23}a_{31}) + a_{22}(a_{11}a_{33} - a_{13}a_{31}) - a_{32}(a_{11}a_{23} - a_{13}a_{21}) \]
\[ = (-1)^{1+2} a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + (-1)^{2+2} a_{22} \begin{vmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{vmatrix} + (-1)^{3+2} a_{32} \begin{vmatrix} a_{11} & a_{13} \\ a_{21} & a_{23} \end{vmatrix} \]
\[ = a_{12}C_{12} + a_{22}C_{22} + a_{32}C_{32}. \]
Theorem 2.4.3 If A = [a_{ij}] is a square n × n matrix, then
\[ \det(A) = \underbrace{a_{i1}C_{i1} + a_{i2}C_{i2} + \cdots + a_{in}C_{in}}_{\text{the cofactor expansion of } \det(A) \text{ along the } i\text{th row}}
= \underbrace{a_{1j}C_{1j} + a_{2j}C_{2j} + \cdots + a_{nj}C_{nj}}_{\text{the cofactor expansion of } \det(A) \text{ along the } j\text{th column}}. \]
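Theorem 2.4.3 yields a recursive algorithm: expand along the first row, with the cofactor signs (−1)^{1+j} alternating across the row. A pure-Python sketch (names are ours; the cost is again on the order of n!, so this is for illustration, not for large matrices):

```python
def minor(A, i, j):
    """Submatrix of A with row i and column j deleted (0-indexed)."""
    return [row[:j] + row[j+1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    # (-1)**j is the cofactor sign (-1)**(1 + (j+1)) in 1-indexed notation
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j))
               for j in range(len(A)))

print(det([[1, -1, 2], [0, 2, 1], [-2, 2, 3]]))  # 14
```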
Example 2.4.4 Show that:
\[ \begin{vmatrix} 1 & 1 & 1 & 1 \\ a_1 & a_2 & a_3 & a_4 \\ a_1^2 & a_2^2 & a_3^2 & a_4^2 \\ a_1^3 & a_2^3 & a_3^3 & a_4^3 \end{vmatrix} = (a_1 - a_2)(a_1 - a_3)(a_1 - a_4)(a_2 - a_3)(a_2 - a_4)(a_3 - a_4). \]
If A = [a_{ij}] is a square n × n matrix and C_{ij} is the cofactor of the entry a_{ij}, then the matrix
\[ \begin{bmatrix} C_{11} & C_{12} & \cdots & C_{1n} \\ C_{21} & C_{22} & \cdots & C_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ C_{n1} & C_{n2} & \cdots & C_{nn} \end{bmatrix} \]
is called the matrix of cofactors from A. The transpose of this matrix is called the adjoint of A and is
denoted by adj(A). If A is an invertible matrix, then A^{-1} = \frac{1}{\det(A)}\,\mathrm{adj}(A). The inverse of an invertible
lower triangular matrix is lower triangular and the inverse of an invertible upper triangular matrix is upper
triangular.
Example. Find necessary and sufficient conditions on a, b, c ∈ R such that the matrix
\[ A = \begin{bmatrix} 1 & 1 & 1 \\ a & b & c \\ a^2 & b^2 & c^2 \end{bmatrix} \]
is invertible. In those cases find A^{-1}.
Solution: According to Theorem 2.3.4, A is invertible if and only if det(A) ≠ 0. But since
\[ \det(A) = V(a, b, c) = \begin{vmatrix} 1 & 1 & 1 \\ a & b & c \\ a^2 & b^2 & c^2 \end{vmatrix} = (c - a)(c - b)(b - a), \]
it follows that A is invertible if and only if a ≠ b, b ≠ c and c ≠ a. In this case
\[ A^{-1} = \frac{1}{\det(A)}\,\mathrm{adj}(A) = \frac{1}{\det(A)} \begin{bmatrix} C_{11} & C_{12} & C_{13} \\ C_{21} & C_{22} & C_{23} \\ C_{31} & C_{32} & C_{33} \end{bmatrix}^{T} \]
\[ = \frac{1}{(c-a)(c-b)(b-a)} \begin{bmatrix}
\begin{vmatrix} b & c \\ b^2 & c^2 \end{vmatrix} & -\begin{vmatrix} a & c \\ a^2 & c^2 \end{vmatrix} & \begin{vmatrix} a & b \\ a^2 & b^2 \end{vmatrix} \\[1ex]
-\begin{vmatrix} 1 & 1 \\ b^2 & c^2 \end{vmatrix} & \begin{vmatrix} 1 & 1 \\ a^2 & c^2 \end{vmatrix} & -\begin{vmatrix} 1 & 1 \\ a^2 & b^2 \end{vmatrix} \\[1ex]
\begin{vmatrix} 1 & 1 \\ b & c \end{vmatrix} & -\begin{vmatrix} 1 & 1 \\ a & c \end{vmatrix} & \begin{vmatrix} 1 & 1 \\ a & b \end{vmatrix}
\end{bmatrix}^{T} \]
\[ = \frac{1}{(c-a)(c-b)(b-a)} \begin{bmatrix} bc(c-b) & -ac(c-a) & ab(b-a) \\ -(c^2-b^2) & c^2-a^2 & -(b^2-a^2) \\ c-b & -(c-a) & b-a \end{bmatrix}^{T}
= \frac{1}{(c-a)(c-b)(b-a)} \begin{bmatrix} bc(c-b) & b^2-c^2 & c-b \\ ac(a-c) & c^2-a^2 & a-c \\ ab(b-a) & a^2-b^2 & b-a \end{bmatrix} \]
\[ = \begin{bmatrix}
\dfrac{bc}{(c-a)(b-a)} & \dfrac{b+c}{(a-c)(b-a)} & \dfrac{1}{(c-a)(b-a)} \\[1.5ex]
\dfrac{ac}{(c-b)(a-b)} & \dfrac{c+a}{(c-b)(b-a)} & \dfrac{1}{(c-b)(a-b)} \\[1.5ex]
\dfrac{ab}{(c-a)(c-b)} & \dfrac{a+b}{(c-a)(b-c)} & \dfrac{1}{(c-a)(c-b)}
\end{bmatrix}. \]
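The adjoint formula A^{-1} = adj(A)/det(A) translates directly into code (function names are ours). As an illustration we take the Vandermonde matrix of the example with the concrete, pairwise-distinct values a = 1, b = 2, c = 3, and verify A · A^{-1} = I:

```python
from fractions import Fraction

def minor(A, i, j):
    return [row[:j] + row[j+1:] for k, row in enumerate(A) if k != i]

def det(A):
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def inverse_by_adjugate(A):
    """A^{-1} = adj(A)/det(A); adj(A) is the transpose of the cofactor matrix."""
    n, d = len(A), Fraction(det(A))
    cof = [[(-1) ** (i + j) * det(minor(A, i, j)) for j in range(n)]
           for i in range(n)]
    return [[cof[j][i] / d for j in range(n)] for i in range(n)]

A = [[1, 1, 1], [1, 2, 3], [1, 4, 9]]   # a=1, b=2, c=3; det = (3-1)(3-2)(2-1) = 2
A_inv = inverse_by_adjugate(A)
I = [[sum(A[i][k] * A_inv[k][j] for k in range(3)) for j in range(3)]
     for i in range(3)]
print(I == [[1, 0, 0], [0, 1, 0], [0, 0, 1]])  # True
```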
(Cramer’s Rule) If AX = B is a system of n linear equations in n unknowns such that det(A) ≠ 0,
then the system has a unique solution. This solution is
\[ x_1 = \frac{\det(A_1)}{\det(A)}, \quad x_2 = \frac{\det(A_2)}{\det(A)}, \quad \ldots, \quad x_n = \frac{\det(A_n)}{\det(A)}, \]
where A_j is the matrix obtained from A by replacing its jth column with the entries of the column
matrix B.
Example. Provide necessary and sufficient conditions on a, b, c, d ∈ R such that the linear system
\[ \begin{cases} x + y + z + w = 1 \\ ax + by + cz + dw = \alpha \\ a^2 x + b^2 y + c^2 z + d^2 w = \alpha^2 \\ a^3 x + b^3 y + c^3 z + d^3 w = \alpha^3 \end{cases} \tag{2.1} \]
has exactly one solution for each α ∈ R. In this case solve the system by using Cramer’s rule.
Solution: The given linear system has a unique solution for each α ∈ R if and only if its coefficient matrix
\[ A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ a & b & c & d \\ a^2 & b^2 & c^2 & d^2 \\ a^3 & b^3 & c^3 & d^3 \end{bmatrix} \]
is invertible, or, equivalently, det(A) ≠ 0. But since
\[ \det(A) = V(a, b, c, d) = (d-a)(d-b)(d-c)(c-a)(c-b)(b-a), \]
it follows that the given linear system has a unique solution for each α ∈ R if and only if the scalars
a, b, c, d are pairwise distinct.
By Cramer’s rule, it follows that the unique solution of the system is
\[ x = \frac{V(\alpha, b, c, d)}{V(a, b, c, d)} = \frac{(d-\alpha)(c-\alpha)(b-\alpha)}{(d-a)(c-a)(b-a)}; \quad
y = \frac{V(a, \alpha, c, d)}{V(a, b, c, d)} = \frac{(\alpha-a)(c-\alpha)(d-\alpha)}{(b-a)(c-b)(d-b)}; \]
\[ z = \frac{V(a, b, \alpha, d)}{V(a, b, c, d)} = \frac{(\alpha-a)(\alpha-b)(d-\alpha)}{(c-a)(c-b)(d-c)}; \quad
w = \frac{V(a, b, c, \alpha)}{V(a, b, c, d)} = \frac{(\alpha-a)(\alpha-b)(\alpha-c)}{(d-a)(d-b)(d-c)}. \]
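Cramer’s rule translates directly into code (function names are ours): replace the jth column of A by the right-hand side and divide determinants. As a concrete illustration we re-solve the system from Example 1.7.12 (a + b + c = 5, 4b − 3c = 2, 2a + c = 4):

```python
from fractions import Fraction
from itertools import permutations

def det(A):
    """Determinant via the permutation-sum definition, with exact rationals."""
    n = len(A)
    def sgn(p):
        m = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
        return -1 if m % 2 else 1
    total = Fraction(0)
    for p in permutations(range(n)):
        term = Fraction(sgn(p))
        for i in range(n):
            term *= A[i][p[i]]
        total += term
    return total

def cramer(A, b):
    """Solve A x = b by Cramer's rule: x_j = det(A_j)/det(A),
    where A_j is A with its j-th COLUMN replaced by b. Needs det(A) != 0."""
    d = det(A)
    n = len(A)
    sols = []
    for j in range(n):
        Aj = [row[:j] + [b[i]] + row[j+1:] for i, row in enumerate(A)]
        sols.append(det(Aj) / d)
    return sols

A = [[1, 1, 1], [0, 4, -3], [2, 0, 1]]
b = [5, 2, 4]
print(cramer(A, b))  # [Fraction(1, 1), Fraction(2, 1), Fraction(2, 1)]
```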
Chapter 3
Euclidean vector spaces
3.1 Euclidean n-space
Definition 3.1.1 If n is a positive integer, then an ordered n-tuple is a sequence of n numbers
(a_1, a_2, . . . , a_n). The set R^n of all ordered n-tuples is called n-space.
A 2-tuple is also called an ordered pair and a 3-tuple an ordered triple, and both of them have a
geometric interpretation. Thus an n-tuple can be viewed either as a ’generalized point’ or as a
’generalized vector’.
Definition 3.1.2 Two vectors u = (u_1, u_2, . . . , u_n) and v = (v_1, v_2, . . . , v_n) in R^n are called equal
if u_1 = v_1, u_2 = v_2, . . . , u_n = v_n. Their sum u + v is defined by u + v = (u_1 + v_1, u_2 + v_2, . . . , u_n + v_n)
and, for a scalar k, the scalar multiple ku is defined by ku = (ku_1, ku_2, . . . , ku_n).
The zero vector in R^n is denoted by 0 and is defined to be the vector 0 = (0, 0, . . . , 0). If u =
(u_1, u_2, . . . , u_n) ∈ R^n is any vector, the negative or additive inverse of u is denoted by −u and
is defined by −u = (−u_1, −u_2, . . . , −u_n). The difference u − v of the vectors u = (u_1, u_2, . . . , u_n), v =
(v_1, v_2, . . . , v_n) ∈ R^n is defined by u − v := u + (−v). In terms of components we have u − v =
(u_1 − v_1, u_2 − v_2, . . . , u_n − v_n).
Theorem 3.1.3 If u = (u1 , u2 , . . . , un), v = (v1 , v2 , . . . , vn), w = (w1 , w2 , . . . , wn) ∈ Rn and k, l are
scalars, then
1. u + v = v + u.
2. u + (v + w) = (u + v) + w.
3. u + 0 = 0 + u = u.
4. u + (−u) = 0, namely u− u = 0.
5. k(lu) = (kl)u.
6. k(u + v) = ku + kv.
7. (k + l)u = ku + lu.
8. 1u = u.
Definition 3.1.4 If u = (u_1, u_2, . . . , u_n), v = (v_1, v_2, . . . , v_n) ∈ R^n are any vectors, then the
Euclidean inner product u · v is defined by u · v := u_1 v_1 + u_2 v_2 + · · · + u_n v_n.
Theorem 3.1.5 If u = (u1 , u2 , . . . , un), v = (v1 , v2 , . . . , vn), w = (w1 , w2 , . . . , wn) ∈ Rn and k, l are
scalars, then
1. u · v = v · u.
2. k(u · v) = (ku) · v = u · (kv).
3. (u + v) · w = u · w + v · w.
4. u · u ≥ 0, and u · u = 0 if and only if u = 0.
Definition 3.1.6 The Euclidean norm or Euclidean length ||u|| of a vector u =
(u_1, u_2, . . . , u_n) ∈ R^n is defined by
\[ \|u\| := \sqrt{u \cdot u} = \sqrt{u_1^2 + u_2^2 + \cdots + u_n^2}, \]
and the distance d(u, v) between the points u = (u_1, u_2, . . . , u_n), v = (v_1, v_2, . . . , v_n) ∈ R^n is defined by
\[ d(u, v) = \|u - v\| = \sqrt{(u_1 - v_1)^2 + (u_2 - v_2)^2 + \cdots + (u_n - v_n)^2}. \]
Theorem 3.1.7 (Cauchy-Schwarz Inequality in R^n) If u = (u_1, u_2, . . . , u_n), v = (v_1, v_2, . . . , v_n) ∈ R^n, then |u · v| ≤ ||u|| · ||v||. In terms of components, the inequality becomes
\[ |u_1 v_1 + u_2 v_2 + \cdots + u_n v_n| \le \sqrt{u_1^2 + u_2^2 + \cdots + u_n^2} \cdot \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}. \]
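Definitions 3.1.4–3.1.6 and the Cauchy-Schwarz inequality are straightforward to compute in coordinates. A pure-Python sketch (function names are ours):

```python
import math

def dot(u, v):
    """Euclidean inner product u . v = u1*v1 + ... + un*vn."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Euclidean norm ||u|| = sqrt(u . u)."""
    return math.sqrt(dot(u, u))

def dist(u, v):
    """Distance d(u, v) = ||u - v||."""
    return norm([a - b for a, b in zip(u, v)])

u, v = [1, 2, -2], [3, 0, 4]
print(dot(u, v))                            # -5
print(norm(u), norm(v))                     # 3.0 5.0
print(abs(dot(u, v)) <= norm(u) * norm(v))  # True  (Cauchy-Schwarz)
```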
Theorem 3.1.8 If u, v ∈ Rn and k is any scalar, then
1. ||u|| ≥ 0.
2. ||u|| = 0 if and only if u = 0.
3. ||ku|| = |k| ||u||.
4. ||u + v|| ≤ ||u||+ ||v|| (Triangle inequality).
Theorem 3.1.9 If u, v, w ∈ Rn and k is any scalar, then
1. d(u, v) ≥ 0.
2. d(u, v) = 0 if and only if u = v.
3. d(u, v) = d(v, u).
4. d(u, v) ≤ d(u,w) + d(w, v) (Triangle inequality).
Theorem 3.1.10 If u, v ∈ R^n, then u · v = \frac14 \|u + v\|^2 - \frac14 \|u - v\|^2.
Proof. Indeed, by subtracting the identities
\[ \|u + v\|^2 = (u + v) \cdot (u + v) = \|u\|^2 + 2u \cdot v + \|v\|^2, \]
\[ \|u - v\|^2 = (u - v) \cdot (u - v) = \|u\|^2 - 2u \cdot v + \|v\|^2, \]
we immediately get the required identity. □
Definition 3.1.11 Two vectors u, v ∈ Rn are said to be orthogonal if u · v = 0.
Theorem 3.1.12 (Theorem of Pythagoras for R^n) If u, v ∈ R^n are orthogonal vectors, then the
equality \|u + v\|^2 = \|u\|^2 + \|v\|^2 holds.
Proof. Indeed, we have successively:
\[ \|u + v\|^2 = (u + v) \cdot (u + v) = \|u\|^2 + 2u \cdot v + \|v\|^2 = \|u\|^2 + \|v\|^2. \quad \Box \]
A vector u = (u_1, . . . , u_n) ∈ R^n can be naturally identified with a column matrix or with a row matrix,
namely with
\[ u = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} \quad\text{or}\quad u = [\,u_1 \ u_2 \ \cdots \ u_n\,]. \]
Consequently, the set R^n of n-tuples can be naturally identified either with the space of column
matrices or with the space of row matrices. The sum u + v of two vectors u = (u_1, . . . , u_n), v =
(v_1, . . . , v_n) will consequently be identified with the sum
\[ u + v = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} + \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} = \begin{bmatrix} u_1 + v_1 \\ u_2 + v_2 \\ \vdots \\ u_n + v_n \end{bmatrix} \]
of the column matrices u and v, or with the sum
\[ u + v = [\,u_1 \ u_2 \ \cdots \ u_n\,] + [\,v_1 \ v_2 \ \cdots \ v_n\,] = [\,u_1 + v_1 \ \ u_2 + v_2 \ \cdots \ u_n + v_n\,] \]
of the row matrices u and v.
Similarly, the scalar multiple ku of the real number k with the ordered n-tuple u = (u_1, u_2, . . . , u_n) can
be identified either with the scalar multiple
\[ ku = k \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} = \begin{bmatrix} ku_1 \\ ku_2 \\ \vdots \\ ku_n \end{bmatrix}, \]
or with the scalar multiple
\[ ku = k[\,u_1 \ u_2 \ \cdots \ u_n\,] = [\,ku_1 \ ku_2 \ \cdots \ ku_n\,]. \]
3.2 Linear transformations from Rn to Rm
Definition 3.2.1 A mapping T_A : R^n → R^m, T_A(x) = Ax, where A is an m × n matrix, is called a
linear transformation; when m = n it is also called a linear operator. The linear transformation T_A is
equally called the multiplication by A. If A = [a_{ij}]_{m \times n}, then
\[ T\left( \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \right)
= \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
= \begin{bmatrix} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\ \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n \end{bmatrix}. \]
Observe that
\[ T\left( \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \right) = \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix}, \quad
T\left( \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix} \right) = \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix}, \quad \ldots, \quad
T\left( \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} \right) = \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix}. \]
The matrix A is called the standard matrix of the linear transformation T; it is usually denoted by [T],
and the multiplication-by-A mapping is also denoted by T_A. Therefore
[T] = [\,T(e_1) \mid T(e_2) \mid \cdots \mid T(e_n)\,], where
\[ e_1 = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad e_2 = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}, \quad \ldots, \quad e_n = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}. \]
1. The orthogonal projection of R^2 on the x-axis, p_x : R^2 → R^2, p_x(x_1, x_2) = (x_1, 0), is a linear
mapping since
\[ p_x\left( \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \right) = \begin{bmatrix} x_1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \]
and [p_x] = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}. Its equations are: w_1 = x_1, w_2 = 0.
2. The orthogonal projection of R^2 on the y-axis, p_y : R^2 → R^2, p_y(y_1, y_2) = (0, y_2), is a linear
mapping since
\[ p_y\left( \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} \right) = \begin{bmatrix} 0 \\ y_2 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} \]
and [p_y] = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}. Its equations are: w_1 = 0, w_2 = y_2.
3. The reflection of R^3 about the x_1x_2-plane, S_{x_1x_2} : R^3 → R^3, S_{x_1x_2}(x, y, z) = (x, y, −z), is a
linear mapping since
\[ S_{x_1x_2}\left( \begin{bmatrix} x \\ y \\ z \end{bmatrix} \right) = \begin{bmatrix} x \\ y \\ -z \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} \]
and [S_{x_1x_2}] = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix}. Its equations are: w_1 = x, w_2 = y, w_3 = −z.
4. The reflection of R^3 about the x_1x_3-plane, S_{x_1x_3} : R^3 → R^3, S_{x_1x_3}(x, y, z) = (x, −y, z), is a
linear mapping since
\[ S_{x_1x_3}\left( \begin{bmatrix} x \\ y \\ z \end{bmatrix} \right) = \begin{bmatrix} x \\ -y \\ z \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} \]
and [S_{x_1x_3}] = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix}. Its equations are: w_1 = x, w_2 = −y, w_3 = z.
5. The reflection of R^3 about the x_2x_3-plane, S_{x_2x_3} : R^3 → R^3, S_{x_2x_3}(x, y, z) = (−x, y, z), is a
linear mapping since
\[ S_{x_2x_3}\left( \begin{bmatrix} x \\ y \\ z \end{bmatrix} \right) = \begin{bmatrix} -x \\ y \\ z \end{bmatrix} = \begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} \]
and [S_{x_2x_3}] = \begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}. Its equations are: w_1 = −x, w_2 = y, w_3 = z.
6. The orthogonal projection of R^3 on the x_1x_2-plane, P_{x_1x_2} : R^3 → R^3, P_{x_1x_2}(x, y, z) = (x, y, 0),
is a linear mapping since
\[ P_{x_1x_2}\left( \begin{bmatrix} x \\ y \\ z \end{bmatrix} \right) = \begin{bmatrix} x \\ y \\ 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} \]
and [P_{x_1x_2}] = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}. Its equations are: w_1 = x, w_2 = y, w_3 = 0.
7. The orthogonal projection of R^3 on the x_1x_3-plane, P_{x_1x_3} : R^3 → R^3, P_{x_1x_3}(x, y, z) = (x, 0, z),
is a linear mapping since
\[ P_{x_1x_3}\left( \begin{bmatrix} x \\ y \\ z \end{bmatrix} \right) = \begin{bmatrix} x \\ 0 \\ z \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} \]
and [P_{x_1x_3}] = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}. Its equations are: w_1 = x, w_2 = 0, w_3 = z.
8. The orthogonal projection of R^3 on the x_2x_3-plane, P_{x_2x_3} : R^3 → R^3, P_{x_2x_3}(x, y, z) = (0, y, z),
is a linear mapping since
\[ P_{x_2x_3}\left( \begin{bmatrix} x \\ y \\ z \end{bmatrix} \right) = \begin{bmatrix} 0 \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} \]
and [P_{x_2x_3}] = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}. Its equations are: w_1 = 0, w_2 = y, w_3 = z.
9. The rotation operator of R^2 through a fixed angle θ,
\[ R_\theta : R^2 \to R^2, \quad R_\theta(x, y) = (x\cos\theta - y\sin\theta,\ x\sin\theta + y\cos\theta), \]
is a linear mapping since
\[ R_\theta\left( \begin{bmatrix} x \\ y \end{bmatrix} \right) = \begin{bmatrix} x\cos\theta - y\sin\theta \\ x\sin\theta + y\cos\theta \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \]
and [R_θ] = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}. Its equations are: w_1 = x cos θ − y sin θ, w_2 = x sin θ + y cos θ.
Indeed, the rotation operator R_θ rotates the point (x, y) = (r cos φ, r sin φ) counterclockwise
through the angle θ if θ > 0 and clockwise if θ < 0, the coordinates of the rotated point being
(w_1, w_2) = (r cos(θ + φ), r sin(θ + φ)), so that by the addition formulas one gets
w_1 = x cos θ − y sin θ and w_2 = x sin θ + y cos θ.
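The rotation matrix and its action on a vector can be sketched in a few lines (function names are ours):

```python
import math

def rotation_matrix(theta):
    """Standard matrix of the rotation of R^2 through angle theta."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta),  math.cos(theta)]]

def apply(M, x):
    """Matrix-vector product M x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

# Rotating (1, 0) counterclockwise through 90 degrees sends it to (0, 1):
w = apply(rotation_matrix(math.pi / 2), [1, 0])
print([round(c, 10) for c in w])  # [0.0, 1.0]
```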
10. The rotation operator of R^3 through a fixed angle θ about an oriented axis rotates each point
of R^3 about the axis of rotation in such a way that its associated vector sweeps out some
portion of the cone determined by the vector itself and by a vector which gives the direction
and the orientation of the considered oriented axis. The angle of the rotation is measured
at the base of the cone, clockwise or counterclockwise with respect to a viewpoint along the
axis looking toward the origin. As in R^2, positive angles generate counterclockwise rotations
and negative angles generate clockwise rotations. The counterclockwise sense of rotation can
be determined by the right-hand rule: if the thumb of the right hand points in the direction
of the oriented axis, then the cupped fingers point in the counterclockwise direction. The
rotation operators in R^3 are linear. For example:
(a) The counterclockwise rotation about the positive x-axis through an angle θ has the
equations w_1 = x, w_2 = y cos θ − z sin θ, w_3 = y sin θ + z cos θ; its standard matrix is
\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{bmatrix}. \]
(b) The counterclockwise rotation about the positive y-axis through an angle θ has the
equations w_1 = x cos θ + z sin θ, w_2 = y, w_3 = −x sin θ + z cos θ; its standard matrix is
\[ \begin{bmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{bmatrix}. \]
(c) The counterclockwise rotation about the positive z-axis through an angle θ has the
equations w_1 = x cos θ − y sin θ, w_2 = x sin θ + y cos θ, w_3 = z; its standard matrix is
\[ \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}. \]
(d) The homothety of ratio k ∈ R is the linear operator H_k : R^n → R^n, H_k(x) = kx. Its
standard matrix is
\[ \begin{bmatrix} k & 0 & \cdots & 0 \\ 0 & k & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & k \end{bmatrix} \]
and its equations are w_1 = kx_1, w_2 = kx_2, . . . , w_n = kx_n.
If 0 ≤ k ≤ 1, then H_k is called a contraction, and for k ≥ 1, H_k is called a dilatation.
• Let A, B be matrices of sizes k × n and m × k respectively, and let T_A : R^n → R^k, T_B : R^k → R^m
be the linear transformations of standard matrices A and B respectively, namely
T_A(x) = Ax, T_B(y) = By.
Then the composed mapping T_B ∘ T_A is also linear and T_B ∘ T_A = T_{BA}. Indeed,
\[ (T_B \circ T_A)(x) = T_B(T_A(x)) = T_B(Ax) = B(Ax) = (BA)x = T_{BA}(x). \]
We have just proved that [T_2 ∘ T_1] = [T_2][T_1] for any two linear mappings whose composition
T_2 ∘ T_1 is defined. Similarly, if the composed map T_3 ∘ T_2 ∘ T_1 of the linear mappings T_3, T_2, T_1
is defined, then [T_3 ∘ T_2 ∘ T_1] = [T_3][T_2][T_1].
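The identity T_B ∘ T_A = T_{BA} can be verified numerically on any compatible pair of matrices (the helper names and the sample matrices below are ours):

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def apply(M, x):
    """Matrix-vector product M x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

A = [[1, 0], [1, 1], [0, 2]]   # T_A : R^2 -> R^3  (A is 3x2)
B = [[1, 2, 3]]                # T_B : R^3 -> R^1  (B is 1x3)

x = [4, 5]
composed = apply(B, apply(A, x))  # (T_B o T_A)(x)
direct = apply(matmul(B, A), x)   # T_{BA}(x)
print(composed == direct)  # True
print(composed)            # [52]
```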
• Let T_1 : R^2 → R^2 be the reflection operator about the line y = x and let T_2 : R^2 → R^2 be
the orthogonal projection on the y-axis. Find T_1 ∘ T_2, T_2 ∘ T_1 and their standard matrices.
• Show that R_{θ_1} ∘ R_{θ_2} = R_{θ_2} ∘ R_{θ_1} and that it is the rotation through the angle θ_1 + θ_2.
3.3 Properties of linear transformations from Rn to Rm
Definition 3.3.1 A linear transformation T : R^n → R^m is said to be one-to-one if u_1, u_2 ∈ R^n, u_1 ≠ u_2 ⇒ T(u_1) ≠ T(u_2).
It follows that for each vector w in the range {T(x) : x ∈ R^n} of a one-to-one linear transformation
T : R^n → R^m there is exactly one vector x such that T(x) = w.
Theorem 3.3.2 If A is an n × n matrix and TA : Rn → Rn is the multiplication by A, then the
following statements are equivalent:
1. A is invertible.
2. The range of TA is Rn.
3. TA is one-to-one.
If T_A : R^n → R^n is a one-to-one linear operator, then the matrix A is invertible and T_{A^{-1}} : R^n → R^n
is also linear. Moreover,
\[ T_A(T_{A^{-1}}(x)) = AA^{-1}x = x = T_I(x) = \mathrm{id}_{R^n}(x), \quad \forall x \in R^n, \]
\[ T_{A^{-1}}(T_A(x)) = A^{-1}Ax = x = T_I(x) = \mathrm{id}_{R^n}(x), \quad \forall x \in R^n. \]
Consequently T_A is also invertible and its inverse is T_A^{-1} = T_{A^{-1}}. Therefore, for the standard matrix
of the inverse of a one-to-one linear operator T : R^n → R^n, we have [T^{-1}] = [T]^{-1}.
Examples 3.3.3 The rotation operator of R^2 through a fixed angle θ,
\[ R_\theta : R^2 \to R^2, \quad R_\theta(x, y) = (x\cos\theta - y\sin\theta,\ x\sin\theta + y\cos\theta), \]
is a linear mapping with standard matrix [R_θ] = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}, which is invertible
with
\[ [R_\theta]^{-1} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} = \begin{bmatrix} \cos(-\theta) & -\sin(-\theta) \\ \sin(-\theta) & \cos(-\theta) \end{bmatrix} = [R_{-\theta}]. \]
Consequently R_θ is invertible and R_θ^{-1} = R_{−θ}.
Theorem 3.3.4 A transformation T : R^n → R^m is linear if and only if
1. T(u + v) = T(u) + T(v) for all u, v ∈ R^n;
2. T(cu) = cT(u) for all u ∈ R^n and all scalars c.
Proof. Indeed, if T = T_A, then both properties follow from the rules of matrix arithmetic:
T_A(u + v) = A(u + v) = Au + Av = T_A(u) + T_A(v) and T_A(cu) = A(cu) = cAu = cT_A(u).
Conversely, if the given conditions are satisfied, then one can easily show that for any k vectors
u_1, . . . , u_k ∈ R^n we have T(u_1 + · · · + u_k) = T(u_1) + · · · + T(u_k). One can now show that T = T_A,
where A = [\,T(e_1) \mid \cdots \mid T(e_n)\,] and
\[ e_1 = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad e_2 = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}, \quad \ldots, \quad e_n = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}. \]
Indeed, for
\[ x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
= x_1 \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} + x_2 \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix} + \cdots + x_n \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}
= x_1 e_1 + x_2 e_2 + \cdots + x_n e_n \]
we have
\[ T(x) = T(x_1 e_1 + x_2 e_2 + \cdots + x_n e_n) = x_1 T(e_1) + x_2 T(e_2) + \cdots + x_n T(e_n) = Ax. \quad \Box \]