MATH 106: LINEAR ALGEBRA
LECTURE NOTES, FALL 2010-2011

(These lecture notes are not in final form; they are still subject to improvement.)
Contents

1 Systems of linear equations and matrices
  1.1 Introduction to systems of linear equations
  1.2 Gaussian elimination
  1.3 Matrices and matrix operations
  1.4 Inverses. Rules of matrix arithmetic
  1.5 Elementary matrices and a method for finding A−1
  1.6 Further results on systems of equations and invertibility
  1.7 Diagonal, Triangular and Symmetric Matrices

2 Determinants
  2.1 The determinant function
  2.2 Evaluating determinants by row reduction
  2.3 Properties of determinant function
  2.4 Cofactor expansion. Cramer's rule

3 Euclidean vector spaces
  3.1 Euclidean n-space
  3.2 Linear transformations from Rn to Rm
  3.3 Properties of linear transformations from Rn to Rm
Chapter 1
Systems of linear equations and
matrices
1.1 Introduction to systems of linear equations
Definition 1.1.1 A linear equation in the n variables x1 , . . . , xn , equally called unknowns, is an
equation of the form a1x1 + . . . + anxn = b, where a1 , . . . , an , b are constants. A solution of a
linear equation a1x1 + . . . + anxn = b is a sequence t1 , . . . , tn of n real numbers such that the
equation is satisfied when we replace x1 = t1 , . . . , xn = tn . The set of all solutions of the equation
is called its solution set or the general solution.
Definition 1.1.2 A finite set of linear equations in the variables x1 , . . . , xn is called a system of
linear equations or a linear system. A solution of a linear system

    a11x1 + a12x2 + . . . + a1nxn = b1
    a21x1 + a22x2 + . . . + a2nxn = b2
    ..........................................
    am1x1 + am2x2 + . . . + amnxn = bm        (1.1)

with m linear equations and n unknowns, is a sequence t1 , . . . , tn of n real numbers such that each
equation of the system is satisfied when we replace x1 = t1 , . . . , xn = tn . The set of all solutions of
the linear system is called its solution set or the general solution. A system of equations is said to
be consistent if it has at least one solution; otherwise it is called inconsistent. The augmented matrix
of the linear system (1.1) is

    [ a11  a12  . . .  a1n | b1 ]
    [ a21  a22  . . .  a2n | b2 ]
    [  .    .           .  |  . ]
    [ am1  am2  . . .  amn | bm ]
Remark 1.1.3 The solution set of a linear system is unchanged if we perform on the system one of
the following operations:
1. Multiply an equation with a nonzero constant;
2. Interchange two equations;
3. Add a multiple of one equation to another.
The corresponding operations at the level of augmented matrix are:
1. Multiply a row with a nonzero constant;
2. Interchange two rows;
3. Add a multiple of one row to another.
Example 1.1.4 Solve the following linear system:

    x + 2y − 3z +   aw =  4
    3x − y + 5z + 10aw = −2
    4x + y + 2z + 11aw =  2
Solution: The augmented matrix of the system is

    [ 1   2  −3    a |  4 ]
    [ 3  −1   5  10a | −2 ]
    [ 4   1   2  11a |  2 ]

and we reduce it step by step, performing the row operations on the matrix and the corresponding
operations on the equations in parallel:

    −3r1 + r2 → r2 , −4r1 + r3 → r3 :

    [ 1   2  −3    a |   4 ]          x + 2y − 3z +  aw =   4
    [ 0  −7  14   7a | −14 ]            − 7y + 14z + 7aw = −14
    [ 0  −7  14   7a | −14 ]            − 7y + 14z + 7aw = −14

    −r2 + r3 → r3 :

    [ 1   2  −3    a |   4 ]          x + 2y − 3z +  aw =   4
    [ 0  −7  14   7a | −14 ]            − 7y + 14z + 7aw = −14
    [ 0   0   0    0 |   0 ]                             0 = 0

    −(1/7)r2 :

    [ 1   2  −3    a |   4 ]          x + 2y − 3z + aw = 4
    [ 0   1  −2   −a |   2 ]              y − 2z −  aw = 2
    [ 0   0   0    0 |   0 ]                         0 = 0

    −2r2 + r1 → r1 :

    [ 1   0   1   3a |   0 ]          x     +  z + 3aw = 0
    [ 0   1  −2   −a |   2 ]              y − 2z −  aw = 2
    [ 0   0   0    0 |   0 ]                         0 = 0

Thus, the equivalent system is

    x + z + 3aw = 0
    y − 2z − aw = 2

and its solutions are x = −t − 3as, y = 2 + 2t + as, z = t, w = s, for s, t ∈ R.
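The general solution above can be spot-checked numerically: for any choice of the parameters a, s, t, substituting the claimed solution back into the three equations should give zero residuals. This is a minimal verification sketch (the helper names `solution` and `residual` are ours):

```python
import numpy as np

# Check the general solution of Example 1.1.4 for a few sample parameter values.
def solution(a, s, t):
    # x = -t - 3as, y = 2 + 2t + as, z = t, w = s
    return np.array([-t - 3*a*s, 2 + 2*t + a*s, t, s])

def residual(a, v):
    x, y, z, w = v
    eqs = np.array([
        x + 2*y - 3*z + a*w - 4,
        3*x - y + 5*z + 10*a*w + 2,
        4*x + y + 2*z + 11*a*w - 2,
    ])
    return np.abs(eqs).max()

for a in (0.0, 1.0, -2.5):
    for s, t in ((0.0, 0.0), (1.0, -1.0), (2.0, 3.0)):
        assert residual(a, solution(a, s, t)) < 1e-12
print("general solution verified for sample parameters")
```

Such a check does not prove the formula, but it catches sign and coefficient mistakes immediately.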
1.2 Gaussian elimination
Definition 1.2.1 A matrix is said to be in reduced row-echelon form if it has the following prop-
erties:
1. If a row does not consist entirely of zeros, then the first nonzero number in the row is a 1,
called a leading 1.
2. If there are any rows that consist entirely of zeros, then they are grouped together at the
bottom of the matrix.
3. In any two successive rows that do not consist entirely of zeros, the leading 1 in the lower
row occurs farther to the right than the leading 1 in the higher row.
4. Each column that contains a leading 1 has zero everywhere else.
A matrix satisfying just the first three properties, namely 1, 2 and 3, is said to be in row-echelon form.
We are now going to provide a number of steps in order to reduce a matrix to a (reduced) row-
echelon form. These are:
1. Locate the leftmost column that does not consist entirely of zeros;
2. Interchange the top row with another row, if necessary, to bring a nonzero entry to the top
of the column found in Step 1;
3. If the entry that is now at the top of the column found in Step 1 is a, multiply the first row
by 1/a in order to introduce a leading 1;
4. Add suitable multiples of the top row to the rows below so that all entries below the leading
1 become zeros;
5. Cover the top row in the matrix and begin again with the Step 1 applied to the submatrix
that remains. Continue in this way until the entire matrix is in row-echelon form.
6. Once you got the matrix in row-echelon form, beginning with the last nonzero row and working
upward, add suitable multiples of each row to the rows above to introduce zeros above the
leading 1’s.
The variables corresponding to the leading 1’s in row echelon form are called leading variables
and the others are called free variables.
The above procedure for reducing a matrix to reduced row-echelon form is called Gauss-Jordan
elimination. If we use only the first five steps, the procedure produces a row-echelon form and is
called Gaussian elimination.
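The six steps above can be sketched as a small NumPy routine; `rref` is our own name (NumPy does not ship a reduced-row-echelon function), and this is a teaching sketch rather than a numerically robust implementation:

```python
import numpy as np

# A minimal Gauss-Jordan elimination following steps 1-6 described above.
def rref(M, tol=1e-12):
    A = M.astype(float).copy()
    rows, cols = A.shape
    pivot_row = 0
    for col in range(cols):
        # Steps 1-2: find a row with a nonzero entry in this column, swap it up.
        candidates = np.where(np.abs(A[pivot_row:, col]) > tol)[0]
        if candidates.size == 0:
            continue
        swap = pivot_row + candidates[0]
        A[[pivot_row, swap]] = A[[swap, pivot_row]]
        # Step 3: scale to obtain a leading 1.
        A[pivot_row] /= A[pivot_row, col]
        # Steps 4 and 6: clear the column below and above the leading 1.
        for r in range(rows):
            if r != pivot_row:
                A[r] -= A[r, col] * A[pivot_row]
        pivot_row += 1
        if pivot_row == rows:
            break
    return A

# Augmented matrix of the system solved in Examples 1.2.2(1) below.
M = np.array([[1., -1, 2, -1, -1],
              [2, 1, -2, -2, -2],
              [-1, 2, -4, 1, 1],
              [3, 0, 0, -3, -3]])
R = rref(M)
print(R)
```

Because the loop clears entries both below and above each pivot, the routine performs Gauss-Jordan elimination; stopping after clearing only below the pivots would give plain Gaussian elimination.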
Examples 1.2.2 1. Solve the linear system
x − y + 2z − w = −1
2x + y − 2z − 2w = −2
−x + 2y − 4z + w = 1
3x − 3w = −3
(1.2)
Solution: The augmented matrix is

    [  1  −1   2  −1 | −1 ]
    [  2   1  −2  −2 | −2 ]
    [ −1   2  −4   1 |  1 ]
    [  3   0   0  −3 | −3 ]

and it can be brought to reduced row-echelon form as follows:

    −2r1 + r2 → r2 , r1 + r3 → r3 , −3r1 + r4 → r4 :

    [ 1  −1   2  −1 | −1 ]
    [ 0   3  −6   0 |  0 ]
    [ 0   1  −2   0 |  0 ]
    [ 0   3  −6   0 |  0 ]

    (1/3)r2 :

    [ 1  −1   2  −1 | −1 ]
    [ 0   1  −2   0 |  0 ]
    [ 0   1  −2   0 |  0 ]
    [ 0   3  −6   0 |  0 ]

    −r2 + r3 → r3 , −3r2 + r4 → r4 :

    [ 1  −1   2  −1 | −1 ]
    [ 0   1  −2   0 |  0 ]
    [ 0   0   0   0 |  0 ]
    [ 0   0   0   0 |  0 ]

    r2 + r1 → r1 :

    [ 1   0   0  −1 | −1 ]
    [ 0   1  −2   0 |  0 ]
    [ 0   0   0   0 |  0 ]
    [ 0   0   0   0 |  0 ]

such that the corresponding linear system, equivalent with the initial one, is

    x − w = −1
    y − 2z = 0

and its solutions are

    x = t − 1, y = 2s, z = s, w = t,    s, t ∈ R,
being infinitely many in this case. It is sometimes preferable to solve a linear system by using
the Gaussian elimination procedure to bring the augmented matrix just to a row-echelon form, and
to use the so-called back-substitution method afterwards. For the linear system (1.2),
a row-echelon form of the augmented matrix is
    [ 1  −1   2  −1 | −1 ]
    [ 0   1  −2   0 |  0 ]
    [ 0   0   0   0 |  0 ]
    [ 0   0   0   0 |  0 ]

and its corresponding linear system is

    x − y + 2z − w = −1
            y − 2z = 0
The back-substitution method consists in the following three steps:
(a) Solve the equations for the leading variables in terms of the free variables;
(b) Beginning with the bottom equation and working upward, successively substitute each
equation into all the equations above it.
(c) Assign arbitrary values to the free variables, if any.
In our particular case x, y are the leading variables and z, w are the free variables. Therefore
step (a) gives

    x = y − 2z + w − 1
    y = 2z

Step (b) gives

    x = 2z − 2z + w − 1
    y = 2z

or equivalently

    x = w − 1
    y = 2z

Finally, step (c) consists in assigning to the free variables z, w the arbitrary values z = s
and w = t, such that the solutions of the initial linear system are:

    x = t − 1, y = 2s, z = s, w = t,    s, t ∈ R.
2. Solve the linear system
−x + 7y − 2z = 1
3x − y + z = 4
2x + 6y − z = 5
Solution: The augmented matrix is

    [ −1   7  −2 | 1 ]
    [  3  −1   1 | 4 ]
    [  2   6  −1 | 5 ]

and it can be brought to reduced row-echelon form as follows:

    −r1 :

    [ 1  −7   2 | −1 ]
    [ 3  −1   1 |  4 ]
    [ 2   6  −1 |  5 ]

    −3r1 + r2 → r2 , −2r1 + r3 → r3 :

    [ 1  −7   2 | −1 ]
    [ 0  20  −5 |  7 ]
    [ 0  20  −5 |  7 ]

    −r2 + r3 → r3 :

    [ 1  −7   2 | −1 ]
    [ 0  20  −5 |  7 ]
    [ 0   0   0 |  0 ]

    (1/20)r2 :

    [ 1  −7    2   |  −1   ]
    [ 0   1  −1/4  | 7/20  ]
    [ 0   0    0   |   0   ]

    7r2 + r1 → r1 :

    [ 1   0   1/4  | 29/20 ]
    [ 0   1  −1/4  |  7/20 ]
    [ 0   0    0   |   0   ]

such that the corresponding linear system, equivalent with the initial one, is

    x + (1/4)z = 29/20
    y − (1/4)z = 7/20

and its solutions are

    x = (29 − 5t)/20, y = (7 + 5t)/20, z = t,    t ∈ R.
3. Solve the linear system
x1 − 2x2 + x3 − 4x4 = 1
x1 + 3x2 + 7x3 + 2x4 = 2
x1 − 12x2 − 11x3 − 16x4 = 5
Solution: The augmented matrix is

    [ 1   −2    1   −4 | 1 ]
    [ 1    3    7    2 | 2 ]
    [ 1  −12  −11  −16 | 5 ]

and we reduce it as follows:

    −r1 + r2 → r2 , −r1 + r3 → r3 :

    [ 1   −2    1    −4 | 1 ]
    [ 0    5    6     6 | 1 ]
    [ 0  −10  −12   −12 | 4 ]

    2r2 + r3 → r3 :

    [ 1  −2   1  −4 | 1 ]
    [ 0   5   6   6 | 1 ]
    [ 0   0   0   0 | 6 ]

Thus the system is inconsistent, that is, it has no solutions at all, since the last row reads 0 = 6.
4. For which values of a ∈ R does the following linear system have no solution? Exactly one
solution? Infinitely many solutions? When the system is consistent, find its solution set.

    x − y − 2z = 0
    2x − 4y − 8z = −6
    3x − 5y + (a² − 14)z = a − 4
Solution: The augmented matrix is

    [ 1  −1     −2    |   0   ]
    [ 2  −4     −8    |  −6   ]
    [ 3  −5  a² − 14  | a − 4 ]

and it will be reduced by performing successively the following row operations:

    −2r1 + r2 → r2 , −3r1 + r3 → r3 :

    [ 1  −1    −2   |   0   ]
    [ 0  −2    −4   |  −6   ]
    [ 0  −2  a² − 8 | a − 4 ]

    −(1/2)r2 :

    [ 1  −1    −2   |   0   ]
    [ 0   1     2   |   3   ]
    [ 0  −2  a² − 8 | a − 4 ]

    2r2 + r3 → r3 :

    [ 1  −1    −2   |   0   ]
    [ 0   1     2   |   3   ]          (*)
    [ 0   0  a² − 4 | a + 2 ]

If a ∉ {±2}, we may continue with (1/(a² − 4))r3 , using (a + 2)/(a² − 4) = 1/(a − 2):

    [ 1  −1  −2 |     0     ]
    [ 0   1   2 |     3     ]
    [ 0   0   1 | 1/(a − 2) ]

then −2r3 + r2 → r2 and 2r3 + r1 → r1 :

    [ 1  −1  0 |    2/(a − 2)     ]
    [ 0   1  0 | (3a − 8)/(a − 2) ]
    [ 0   0  1 |    1/(a − 2)     ]

and finally r2 + r1 → r1 :

    [ 1  0  0 |        3         ]
    [ 0  1  0 | (3a − 8)/(a − 2) ]
    [ 0  0  1 |    1/(a − 2)     ]

Therefore for a ∉ {±2} the given linear system has the unique solution x = 3, y = (3a − 8)/(a − 2),
z = 1/(a − 2).

If a = −2, the (*) reduced matrix becomes

    [ 1  −1  −2 | 0 ]
    [ 0   1   2 | 3 ]
    [ 0   0   0 | 0 ]

and r2 + r1 → r1 gives

    [ 1  0  0 | 3 ]
    [ 0  1  2 | 3 ]
    [ 0  0  0 | 0 ]

so in this case the given linear system has infinitely many solutions:

    x = 3, y = 3 − 2s, z = s,    s ∈ R.

Finally, when a = 2, the (*) reduced matrix becomes

    [ 1  −1  −2 | 0 ]
    [ 0   1   2 | 3 ]
    [ 0   0   0 | 4 ]

and the linear system has no solutions at all in this case.
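The three cases can be checked numerically by comparing ranks for sample values of a. This sketch takes the second equation as 2x − 4y − 8z = −6, consistent with the reduced rows above; the helper `classify` is our name:

```python
import numpy as np

# Classify the parametric system by comparing coefficient and augmented ranks.
def classify(a):
    A = np.array([[1., -1, -2],
                  [2, -4, -8],
                  [3, -5, a*a - 14]])
    b = np.array([0., -6, a - 4])
    rA = np.linalg.matrix_rank(A)
    rAb = np.linalg.matrix_rank(np.column_stack([A, b]))
    if rA < rAb:
        return "no solution"
    return "unique" if rA == 3 else "infinitely many"

print(classify(3))    # a outside {-2, 2}
print(classify(-2))
print(classify(2))
```

The outputs agree with the elimination above: a unique solution away from ±2, infinitely many at a = −2, and no solution at a = 2.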
1.3 Matrices and matrix operations
Definition 1.3.1 A matrix

    A = [ a11  a12  · · ·  a1n ]   ← row 1
        [ a21  a22  · · ·  a2n ]   ← row 2
        [  .    .           .  ]
        [ am1  am2  · · ·  amn ]   ← row m
           ↑    ↑           ↑
         col 1 col 2 ...  col n
equally written [aij ]m×n (or simply [aij ]), is a rectangular array of numbers. The numbers in the
array are called entries. The entry in row i and column j of a matrix A is commonly denoted by
(A)ij , in our particular case (A)ij = aij . The size of a matrix is described in terms of the number
of rows and the number of columns it has. The above matrix A has size m × n. A matrix

    x = [ x1 ]
        [ x2 ]
        [ .. ]
        [ xm ]

with only one column is called a column matrix (or a column vector), and a matrix

    y = [ y1  y2  . . .  yn ]

with only one row is called a row matrix (or a row vector). A matrix

    A = [ a11  a12  . . .  a1n ]
        [ a21  a22  . . .  a2n ]
        [  .    .           .  ]
        [ an1  an2  . . .  ann ]

with n rows and n columns, that is, of size n × n, is called a square matrix of order n, and the entries
a11 , a22 , . . . , ann are said to be on the main diagonal of A.
Definition 1.3.2 Two matrices are said to be equal if they have the same size and their corre-
sponding entries are equal.
Definition 1.3.3 If

    A = [ a11  a12  . . .  a1n ]        B = [ b11  b12  . . .  b1n ]
        [ a21  a22  . . .  a2n ]            [ b21  b22  . . .  b2n ]
        [  .    .           .  ]            [  .    .           .  ]
        [ am1  am2  . . .  amn ]            [ bm1  bm2  . . .  bmn ]

are matrices of the same size, then their sum is the matrix

    A + B = [ a11 + b11  a12 + b12  . . .  a1n + b1n ]
            [ a21 + b21  a22 + b22  . . .  a2n + b2n ]
            [     .          .                 .     ]
            [ am1 + bm1  am2 + bm2  . . .  amn + bmn ]

and their difference is the matrix

    A − B = [ a11 − b11  a12 − b12  . . .  a1n − b1n ]
            [ a21 − b21  a22 − b22  . . .  a2n − b2n ]
            [     .          .                 .     ]
            [ am1 − bm1  am2 − bm2  . . .  amn − bmn ]

One can shortly define the entries of the sum and the difference matrices in the following way:
(A + B)ij = (A)ij + (B)ij and (A − B)ij = (A)ij − (B)ij .
If the matrices A and B have different sizes, then their sum and difference are not defined. If c is
any scalar (number), then the product cA is the matrix

    cA = [ ca11  ca12  . . .  ca1n ]
         [ ca21  ca22  . . .  ca2n ]
         [   .     .            .  ]
         [ cam1  cam2  . . .  camn ]

obtained by multiplying each entry of A by c; its entries can be shortly written as (cA)ij = c(A)ij .
If A1 , A2 , . . . , An are matrices of the same size and c1 , c2 , . . . , cn are scalars, then the matrix
c1A1 + c2A2 + · · · + cnAn is defined and it is called a linear combination of A1 , A2 , . . . , An with
coefficients c1 , c2 , . . . , cn .
Examples 1.3.4 Consider the matrices

    A = [ 1  −3  0 ]    B = [  1  2  −2 ]    C = [  1  2  −3 ]    D = [ −1  6   2 ]
        [ 2   0  2 ]        [ −2  4   3 ]        [ −1  0   4 ]        [  2  1   3 ]
                                                 [  2  5  −1 ]        [  1  2  −4 ]

1. Compute, when possible, A − B, C + D, B − C, A + C.

Solution:

    A − B = [ 1  −3  0 ] − [  1  2  −2 ] = [ 0  −5   2 ]
            [ 2   0  2 ]   [ −2  4   3 ]   [ 4  −4  −1 ]

    C + D = [  1  2  −3 ]   [ −1  6   2 ]   [ 0  8  −1 ]
            [ −1  0   4 ] + [  2  1   3 ] = [ 1  1   7 ]
            [  2  5  −1 ]   [  1  2  −4 ]   [ 3  7  −5 ]

B − C and A + C are not defined, since the sizes differ.

2. Compute, when possible, (1/3)A, (−1)B, 2A − B, 3C + D, 5B + (1/2006)D.

Solution:

    (1/3)A = [ 1/3  −1   0  ]    (−1)B = [ −1  −2   2 ]
             [ 2/3   0  2/3 ]            [  2  −4  −3 ]

    2A − B = [ 2  −6  0 ] − [  1  2  −2 ] = [ 1  −8  2 ]
             [ 4   0  4 ]   [ −2  4   3 ]   [ 6  −4  1 ]

    3C + D = [  3   6  −9 ]   [ −1  6   2 ]   [  2  12  −7 ]
             [ −3   0  12 ] + [  2  1   3 ] = [ −1   1  15 ]
             [  6  15  −3 ]   [  1  2  −4 ]   [  7  17  −7 ]

5B + (1/2006)D is not defined, since B and D have different sizes.
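The computations of Examples 1.3.4 can be redone with NumPy, where `+`, `-` and scalar `*` act entrywise exactly as in Definition 1.3.3 (a sketch; NumPy raises a `ValueError` for mismatched sizes rather than leaving the operation "undefined"):

```python
import numpy as np

A = np.array([[1, -3, 0], [2, 0, 2]])
B = np.array([[1, 2, -2], [-2, 4, 3]])
C = np.array([[1, 2, -3], [-1, 0, 4], [2, 5, -1]])
D = np.array([[-1, 6, 2], [2, 1, 3], [1, 2, -4]])

print(A - B)      # = [[0, -5, 2], [4, -4, -1]]
print(C + D)      # = [[0, 8, -1], [1, 1, 7], [3, 7, -5]]
print(2*A - B)    # = [[1, -8, 2], [6, -4, 1]]
print(3*C + D)    # = [[2, 12, -7], [-1, 1, 15], [7, 17, -7]]
# A + C would raise ValueError: the shapes (2,3) and (3,3) do not match.
```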
14
Definition 1.3.5 Let A be an m × n matrix and B be an n × p matrix. Then the product AB is
the m × p matrix C whose entry cij in row i and column j is obtained as follows: sum the products
formed by multiplying, in order, each entry in row i of the matrix A with the "corresponding" entry
in column j of the matrix B. More precisely,

    cij = ai1 b1j + ai2 b2j + · · · + ain bnj = Σ aik bkj    (k = 1, . . . , n).

For the sizes:

      A        B      C = AB
    m × n    n × p     m × p
        └────┘
     must be the same;
     the outer numbers give the size of the product.

Thus the (i, j) entry of AB is row i of A "times" column j of B, and every entry of AB is such a
sum over k of the products aik bkj .
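The entry formula cij = Σ aik bkj can be written as an explicit triple loop and checked against NumPy's built-in matrix product `@`. This is a sketch for clarity, not an efficient implementation (`matmul` is our name):

```python
import numpy as np

# Explicit implementation of the entry formula c_ij = sum_k a_ik * b_kj.
def matmul(A, B):
    m, n = A.shape
    n2, p = B.shape
    assert n == n2, "inner sizes must agree"
    C = np.zeros((m, p))
    for i in range(m):
        for j in range(p):
            for k in range(n):
                C[i, j] += A[i, k] * B[k, j]
    return C

A = np.array([[1., 0, -1], [0, 2, 3]])                        # 2 x 3
B = np.array([[2., 1, 0, 1], [3, -1, -1, 0], [4, 3, 2, 1]])   # 3 x 4
C = matmul(A, B)                                              # 2 x 4
assert np.allclose(C, A @ B)
print(C)
```

These are the same matrices used in Example 1.3.7 below, so the result can be compared entry by entry with the hand computation there.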
Definition 1.3.6 The transpose of an m × n matrix A, denoted by AT , is the n × m matrix whose
i-th row is the i-th column of A. Thus (AT )ij = (A)ji . Observe that for any matrix A we have
(AT )T = A.
Example 1.3.7 If

    A = [ 1  0  −1 ]    and    B = [ 2   1   0  1 ]
        [ 0  2   3 ]               [ 3  −1  −1  0 ]
                                   [ 4   3   2  1 ]

find, when possible, the matrices AB, BA, (AB)T , (BA)T , AT BT , BT AT .

Solution: We have

    AB = [ 1  0  −1 ] [ 2   1   0  1 ]   [ −2  −2  −2  0 ]
         [ 0  2   3 ] [ 3  −1  −1  0 ] = [ 18   7   4  3 ]
           (2×3)      [ 4   3   2  1 ]        (2×4)
                           (3×4)

since, for instance, (AB)11 = 1 · 2 + 0 · 3 + (−1) · 4 = −2 and (AB)21 = 0 · 2 + 2 · 3 + 3 · 4 = 18.
Consequently

    (AB)T = [ −2  18 ]
            [ −2   7 ]
            [ −2   4 ]
            [  0   3 ]   (4×2).

Observe that the products BA, (BA)T and AT BT are not defined. However,

    BT = [ 2   3  4 ]          AT = [  1  0 ]
         [ 1  −1  3 ]               [  0  2 ]
         [ 0  −1  2 ]               [ −1  3 ]   (3×2),
         [ 1   0  1 ]   (4×3)

such that the product BT AT is also defined and

    BT AT = [ −2  18 ]
            [ −2   7 ]
            [ −2   4 ]
            [  0   3 ]  = (AB)T .
Definition 1.3.8 If A is a square matrix, then the trace of A, denoted by tr(A), is defined to be
the sum of the entries on the main diagonal, namely tr(A) := (A)11 + (A)22 + · · ·+ (A)nn , where n
is the order of A. The trace of A is not defined if A is not a square matrix.
Example 1.3.9 Find, when possible, tr(AAT ) and tr(AB), where

    A = [ 1  0  −1 ]    and    B = [ 2   1   0  1 ]
        [ 0  2   3 ]               [ 3  −1  −1  0 ]
                                   [ 4   3   2  1 ]

We first observe that

    AAT = [ 1  0  −1 ] [  1  0 ]   [  2  −3 ]
          [ 0  2   3 ] [  0  2 ] = [ −3  13 ]   (2×2),
                       [ −1  3 ]

such that tr(AAT ) = 2 + 13 = 15. Since

    AB = [ −2  −2  −2  0 ]
         [ 18   7   4  3 ]   (2×4)

is not a square matrix, its trace is not defined.
If Aij , 1 ≤ i ≤ m, 1 ≤ j ≤ n, are matrices of suitable sizes, then we can form a new matrix

    A = [ A11  A12  . . .  A1n ]
        [ A21  A22  . . .  A2n ]
        [  .    .           .  ]
        [ Am1  Am2  . . .  Amn ]

and we call such a subdivision a partition of the matrix A. For example

    [ −1  0 | 2   3 | −2  0 ]
    [  1  0 | 0  −1 |  3  5 ]    = [ A11  A12  A13 ]
    [ −1  1 | 0   1 | −1  0 ]      [ A21  A22  A23 ]
    [  0  5 | 2  −7 |  6  0 ]

where

    A11 = [ −1  0 ]   A12 = [ 2   3 ]   A13 = [ −2  0 ]
          [  1  0 ]         [ 0  −1 ]         [  3  5 ]

    A21 = [ −1  1 ]   A22 = [ 0   1 ]   A23 = [ −1  0 ]
          [  0  5 ]         [ 2  −7 ]         [  6  0 ]
An application of the matrix multiplication operation consists in transforming a system of linear
equations into a matrix equation. Indeed, for a linear system we have successively:

    a11x1 + a12x2 + . . . + a1nxn = b1
    a21x1 + a22x2 + . . . + a2nxn = b2
    ..........................................
    am1x1 + am2x2 + . . . + amnxn = bm

    ⇔   [ a11x1 + a12x2 + . . . + a1nxn ]   [ b1 ]
        [ a21x1 + a22x2 + . . . + a2nxn ]   [ b2 ]
        [               ...             ] = [ .. ]
        [ am1x1 + am2x2 + . . . + amnxn ]   [ bm ]

    ⇔   [ a11  a12  . . .  a1n ] [ x1 ]   [ b1 ]
        [ a21  a22  . . .  a2n ] [ x2 ]   [ b2 ]
        [  .    .           .  ] [ .. ] = [ .. ]
        [ am1  am2  . . .  amn ] [ xn ]   [ bm ]
                  A                X        B

    ⇔   AX = B.

The matrix

    A = [ a11  a12  . . .  a1n ]
        [ a21  a22  . . .  a2n ]
        [  .    .           .  ]
        [ am1  am2  . . .  amn ]

is called the coefficient matrix of the linear system. Observe that the augmented matrix of the
system is [ A | B ].
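Once a system is in the form AX = B, a square system with invertible A becomes a single call to a linear solver. A minimal sketch on a 2 × 2 system of our own choosing (`np.linalg.solve` requires A to be square and invertible):

```python
import numpy as np

# The system  2x + y = 3,  x + 3y = 5  in matrix form AX = B.
A = np.array([[2., 1],
              [1, 3]])
B = np.array([3., 5])

X = np.linalg.solve(A, B)
print(X)
assert np.allclose(A @ X, B)   # X satisfies AX = B
```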
1.4 Inverses. Rules of matrix arithmetic
Theorem 1.4.1 Assuming that the sizes of matrices are such that the indicated operations can be
performed, the following rules of matrix arithmetic are valid:
1. A + B = B + A;
2. (A + B) + C = A + (B + C);
3. (AB)C = A(BC);
4. A(B + C) = AB + AC;
5. (A + B)C = AC + BC;
6. A(B − C) = AB −AC;
7. (A−B)C = AC −BC;
8. a(B + C) = aB + aC;
9. a(B − C) = aB − aC;
10. (a + b)C = aC + bC;
11. (a− b)C = aC − bC;
12. a(bC) = (ab)C;
13. a(BC) = (aB)C = B(aC).
Remark 1.4.2 Matrix multiplication is not commutative. Indeed, for example, if

    A = [ −1  0 ]    B = [ 1  2 ]
        [  2  3 ]        [ 3  0 ]

then

    AB = [ −1  −2 ]   ≠   [  3  6 ]  = BA.
         [ 11   4 ]       [ −3  0 ]
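The counterexample of Remark 1.4.2 can be checked in one line each way (note that in NumPy `@` is matrix multiplication, while `*` would be the entrywise product):

```python
import numpy as np

A = np.array([[-1, 0], [2, 3]])
B = np.array([[1, 2], [3, 0]])

print(A @ B)   # = [[-1, -2], [11, 4]]
print(B @ A)   # = [[3, 6], [-3, 0]]
assert not np.array_equal(A @ B, B @ A)
```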
Define the m × n zero matrix to be the matrix

    Om×n := [ 0  0  · · ·  0 ]
            [ 0  0  · · ·  0 ]
            [ .  .         . ]
            [ 0  0  · · ·  0 ]   (m × n),

equally denoted by O when the size is understood from context.
Theorem 1.4.3 Assuming that the sizes of the matrices are such that the indicated operations can
be performed, the following rules of matrix arithmetic are valid
1. A + O = O + A = A;
2. A−A = O;
3. O −A = −A;
4. AO = O.
The identity matrix of order n is defined to be the square n × n matrix

    In := [ 1  0  · · ·  0 ]
          [ 0  1  · · ·  0 ]
          [ .  .         . ]
          [ 0  0  · · ·  1 ]   (n × n),

and it will be equally denoted by I when the order is understood from context.
Remark 1.4.4 If A is an m× n matrix, then AIn = ImA = A.
Theorem 1.4.5 If R is the reduced row-echelon form of an n × n matrix A, then either R has a
row of zeros, or R is the identity matrix.
Definition 1.4.6 A square matrix A is said to be invertible if there is another square matrix B,
of the same size, such that AB = BA = I.
Remark 1.4.7 If B, C are both inverses of A, then B = C. Indeed, we have successively:
B = BI = B(AC) = (BA)C = IC = C. Therefore, the inverse of an invertible matrix A is unique;
it is denoted by A−1 and characterized by the equalities AA−1 = A−1A = I.
Example 1.4.8 If a, b, c, d ∈ R are such that ad − bc ≠ 0, then the matrix

    A = [ a  b ]
        [ c  d ]

is invertible and

    A−1 = 1/(ad − bc) [  d  −b ]  =  [  d/(ad − bc)  −b/(ad − bc) ]
                      [ −c   a ]     [ −c/(ad − bc)   a/(ad − bc) ]

Indeed, we have

    [ a  b ] [  d/(ad − bc)  −b/(ad − bc) ]   [ (ad − bc)/(ad − bc)   (−ab + ba)/(ad − bc) ]   [ 1  0 ]
    [ c  d ] [ −c/(ad − bc)   a/(ad − bc) ] = [ (cd − dc)/(ad − bc)   (−cb + da)/(ad − bc) ] = [ 0  1 ]

and similarly

    [  d/(ad − bc)  −b/(ad − bc) ] [ a  b ]   [ (da − bc)/(ad − bc)   (db − bd)/(ad − bc) ]   [ 1  0 ]
    [ −c/(ad − bc)   a/(ad − bc) ] [ c  d ] = [ (−ca + ac)/(ad − bc)  (−cb + ad)/(ad − bc) ] = [ 0  1 ]
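The 2 × 2 inverse formula of Example 1.4.8 can be coded directly and verified against the defining equalities AA−1 = A−1A = I (`inv2` is our name for this helper):

```python
import numpy as np

# The 2x2 inverse formula: A^{-1} = (1/(ad-bc)) [[d, -b], [-c, a]].
def inv2(a, b, c, d):
    det = a*d - b*c
    assert det != 0, "matrix is not invertible"
    return np.array([[d, -b], [-c, a]]) / det

A = np.array([[3., 5], [1, 2]])   # det = 3*2 - 5*1 = 1
Ainv = inv2(3, 5, 1, 2)
assert np.allclose(A @ Ainv, np.eye(2))
assert np.allclose(Ainv @ A, np.eye(2))
print(Ainv)
```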
If A is a square matrix, then its nonnegative powers are defined by A0 = I and An = AA · · · A
(n factors) for n > 0. Moreover, if A is invertible, then A−n = A−1A−1 · · · A−1 (n factors), also
for n > 0.
Theorem 1.4.9 If A is an invertible matrix, then
1. A−1 is also invertible and (A−1)−1 = A;
2. An is invertible and (An)−1 = (A−1)n for n ∈ {0, 1, 2, . . .};
3. AT is invertible and (AT )−1 = (A−1)T ;
4. For any non-zero scalar k, the matrix kA is invertible and (kA)−1 = 1kA−1;
5. AmAn = Am+n for any integers m,n;
6. (Am)n = Amn for any integers m,n.
Theorem 1.4.10 If the sizes of the involved matrices are such that the stated operations can be
performed, then
1. (AT )T = A;
2. (A + B)T = AT + BT and (A−B)T = AT −BT ;
3. (kA)T = kAT , where k is any scalar;
4. (AB)T = BT AT .
Example 1.4.11 1. Find the matrix A knowing that

    (I2 + 2A)−1 = [ −1  2 ]
                  [  4  5 ]

Solution: Taking inverses on both sides and using the 2 × 2 inverse formula, we have successively

    I2 + 2A = [ −1  2 ]−1  =  1/((−1) · 5 − 2 · 4) [  5  −2 ]  =  −(1/13) [  5  −2 ]  =  [ −5/13  2/13 ]
              [  4  5 ]                            [ −4  −1 ]            [ −4  −1 ]     [  4/13  1/13 ]

    ⇔  2A = [ −5/13  2/13 ] − [ 1  0 ] = [ −18/13    2/13 ]
            [  4/13  1/13 ]   [ 0  1 ]   [   4/13  −12/13 ]

    ⇔  A = (1/2) [ −18/13    2/13 ] = [ −9/13   1/13 ]
                 [   4/13  −12/13 ]   [  2/13  −6/13 ]
2. If

    A = [ cos θ  − sin θ ]
        [ sin θ    cos θ ]

show that

    A2 = [ cos 2θ  − sin 2θ ]    and    A3 = [ cos 3θ  − sin 3θ ]
         [ sin 2θ    cos 2θ ]                [ sin 3θ    cos 3θ ]

Solution: Indeed, using the double-angle formulas,

    A2 = [ cos θ  − sin θ ] [ cos θ  − sin θ ]   [ cos²θ − sin²θ   −2 sin θ cos θ ]   [ cos 2θ  − sin 2θ ]
         [ sin θ    cos θ ] [ sin θ    cos θ ] = [ 2 sin θ cos θ   cos²θ − sin²θ  ] = [ sin 2θ    cos 2θ ]

and, using the addition formulas,

    A3 = A A2 = [ cos θ  − sin θ ] [ cos 2θ  − sin 2θ ]
                [ sin θ    cos θ ] [ sin 2θ    cos 2θ ]

              = [ cos θ cos 2θ − sin θ sin 2θ   −(cos θ sin 2θ + sin θ cos 2θ) ]
                [ sin θ cos 2θ + cos θ sin 2θ     cos θ cos 2θ − sin θ sin 2θ  ]

              = [ cos(θ + 2θ)  − sin(θ + 2θ) ]  =  [ cos 3θ  − sin 3θ ]
                [ sin(θ + 2θ)    cos(θ + 2θ) ]     [ sin 3θ    cos 3θ ]
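Numerically, the same pattern holds: the n-th power of the rotation matrix is the rotation by nθ. A quick check for n = 2 and n = 3 at an arbitrary angle (`rot` is our helper):

```python
import numpy as np

# Rotation matrix by angle theta, as in Example 1.4.11(2).
def rot(theta):
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

theta = 0.7
A = rot(theta)
assert np.allclose(A @ A, rot(2*theta))          # A^2 rotates by 2*theta
assert np.allclose(A @ A @ A, rot(3*theta))      # A^3 rotates by 3*theta
print("A^2 and A^3 are rotations by 2*theta and 3*theta")
```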
1.5 Elementary matrices and a method for finding A−1
Definition 1.5.1 An n × n matrix is called an elementary matrix if it can be obtained from the n × n
identity matrix In by performing a single elementary row operation.
Example 1.5.2 Performing the row operation 3r1 + r3 → r3 on I3 produces an elementary matrix:

    I3 = [ 1  0  0 ]   3r1 + r3 → r3    [ 1  0  0 ]
         [ 0  1  0 ]   ───────────→     [ 0  1  0 ] = E.
         [ 0  0  1 ]                    [ 3  0  1 ]

If

    A = [ −1  2   1   3 ]
        [  2  1   6  −3 ]
        [  4  2  −1   0 ]

then

    EA = [ −1  2   1   3 ]
         [  2  1   6  −3 ]
         [  1  8   2   9 ]

which is exactly the matrix obtained from A by performing the same row operation 3r1 + r3 → r3.
Theorem 1.5.3 If the elementary matrix E results from performing a certain row operation on
Im and A is an m × n matrix, then the product EA is the matrix that results when the same row
operation is performed on A.
If an elementary row operation is performed on the identity matrix I to obtain an elementary matrix
E, then there is another operation, called the corresponding inverse operation, which, when applied
to E, gives back I:

    Direct operation                              Corresponding inverse operation
    Multiply row i by c ≠ 0  (c·ri)               Multiply row i by 1/c  ((1/c)·ri)
    Interchange rows i and j  (ri ↔ rj)           Interchange rows i and j  (ri ↔ rj)
    Add c times row i to row j  (cri + rj → rj)   Add −c times row i to row j  (−cri + rj → rj)
Theorem 1.5.4 Every elementary matrix is invertible, and the inverse is also an elementary ma-
trix.
Proof. If I → E by a direct operation, then the corresponding inverse operation carries E back
to I. If E0 denotes the elementary matrix obtained by applying that inverse operation to I, then
E0E = I and EE0 = I, namely E is invertible. □
Theorem 1.5.5 If A is an n× n matrix, then the following statements are equivalent:
1. A is invertible;
2. The homogeneous linear system AX = O has only the trivial solution;
3. The reduced row-echelon form of A is In;
4. A is expressible as a product of elementary matrices.
The last statement, (4), of Theorem 1.5.5 is of particular importance, since it provides us with a
method of finding the reduced row-echelon form of a matrix and a method of finding the inverse of
an invertible matrix.
Let A be a matrix and R be its reduced row-echelon form, obtained through a sequence of row
operations

    A →(op1) A1 →(op2) A2 →(op3) · · · →(opn) R.

Then R = En · . . . · E1 A, where I →(op1) E1 , I →(op2) E2 , . . . , I →(opn) En . Therefore

    A = (En · . . . · E1)−1 R = E1−1 · . . . · En−1 R.

If, in particular, A is invertible, then R = I, such that A = E1−1 · . . . · En−1 and implicitly

    A−1 = En · . . . · E1 .

Thus, in order to find the inverse of an invertible matrix A, we first observe the general property
A[B | C] = [AB | AC], and then compute successively:

    [A | I] →(op1) [E1A | E1] →(op2) [E2E1A | E2E1] →(op3) · · · →(opn) [En · · · E1A | En · · · E1] = [I | A−1].
Examples 1.5.6 1. Find the inverse of the matrix

    A = [ −2  −1   4 ]
        [  3   1  −7 ]
        [  2   0  −5 ]

and write A−1 as a product of elementary matrices;

2. Express the matrix

    A = [  0   1  7   8 ]
        [  1   3  3   8 ]
        [ −2  −5  1  −8 ]

in the form A = EFGR, where E, F, G are elementary matrices and R is in row-echelon form.
(1) We reduce [A | I3] to [I3 | A−1]:

    [ −2  −1   4 | 1  0  0 ]   r2 + r1 → r1    [ 1  0  −3 | 1  1  0 ]
    [  3   1  −7 | 0  1  0 ]   ───────────→    [ 3  1  −7 | 0  1  0 ]
    [  2   0  −5 | 0  0  1 ]                   [ 2  0  −5 | 0  0  1 ]

    −3r1 + r2 → r2 , −2r1 + r3 → r3 :

    [ 1  0  −3 |  1   1  0 ]
    [ 0  1   2 | −3  −2  0 ]
    [ 0  0   1 | −2  −2  1 ]

    −2r3 + r2 → r2 , 3r3 + r1 → r1 :

    [ 1  0  0 | −5  −5   3 ]
    [ 0  1  0 |  1   2  −2 ]
    [ 0  0  1 | −2  −2   1 ]

so that

    A−1 = [ −5  −5   3 ]
          [  1   2  −2 ]
          [ −2  −2   1 ]

Therefore E5E4E3E2E1A = I3 , that is A−1 = E5E4E3E2E1 , where each Ei is obtained by applying
the corresponding operation to I3 :

    E1 = [ 1  1  0 ]  (r2 + r1 → r1)        E2 = [  1  0  0 ]  (−3r1 + r2 → r2)
         [ 0  1  0 ]                             [ −3  1  0 ]
         [ 0  0  1 ]                             [  0  0  1 ]

    E3 = [  1  0  0 ]  (−2r1 + r3 → r3)     E4 = [ 1  0   0 ]  (−2r3 + r2 → r2)
         [  0  1  0 ]                            [ 0  1  −2 ]
         [ −2  0  1 ]                            [ 0  0   1 ]

    E5 = [ 1  0  3 ]  (3r3 + r1 → r1)
         [ 0  1  0 ]
         [ 0  0  1 ]
(2) We reduce A to row-echelon form:

    A = [  0   1  7   8 ]   r1 ↔ r2    [  1   3  3   8 ]   2r1 + r3 → r3    [ 1  3  3  8 ]
        [  1   3  3   8 ]   ───────→   [  0   1  7   8 ]   ───────────→     [ 0  1  7  8 ]
        [ −2  −5  1  −8 ]              [ −2  −5  1  −8 ]                    [ 0  1  7  8 ]

    −r2 + r3 → r3 :

    [ 1  3  3  8 ]
    [ 0  1  7  8 ] = R, a matrix in row-echelon form.
    [ 0  0  0  0 ]

Consequently R = E3E2E1A, or equivalently A = E1−1E2−1E3−1R, where

    E1 = [ 0  1  0 ]  (r1 ↔ r2),           E1−1 = E1 ,
         [ 1  0  0 ]
         [ 0  0  1 ]

    E2 = [ 1  0  0 ]  (2r1 + r3 → r3),     E2−1 = [  1  0  0 ]
         [ 0  1  0 ]                              [  0  1  0 ]
         [ 2  0  1 ]                              [ −2  0  1 ]

    E3 = [ 1   0  0 ]  (−r2 + r3 → r3),    E3−1 = [ 1  0  0 ]
         [ 0   1  0 ]                             [ 0  1  0 ]
         [ 0  −1  1 ]                             [ 0  1  1 ]

Now it is enough to take E = E1−1 , F = E2−1 , G = E3−1 .
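The inverse computed by hand in Examples 1.5.6(1) can be checked against NumPy's general-purpose inverse, which internally uses an LU factorization rather than the [A | I] reduction shown above:

```python
import numpy as np

A = np.array([[-2., -1, 4],
              [3, 1, -7],
              [2, 0, -5]])
Ainv = np.array([[-5., -5, 3],
                 [1, 2, -2],
                 [-2, -2, 1]])

assert np.allclose(A @ Ainv, np.eye(3))       # A A^{-1} = I
assert np.allclose(Ainv @ A, np.eye(3))       # A^{-1} A = I
assert np.allclose(Ainv, np.linalg.inv(A))    # agrees with NumPy
print("A^{-1} verified")
```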
1.6 Further results on systems of equations and invertibility
Theorem 1.6.1 Every system of linear equations has either no solution, exactly one solution or
infinitely many solutions.
Theorem 1.6.2 If A is an invertible n × n matrix, then for each n × 1 matrix B, the system of
linear equations AX = B has exactly one solution, namely X = A−1B.
Example 1.6.3 Find the solution set of the following linear system:

    −2x − y + 4z = 1
    3x + y − 7z = −1
    2x − 5z = 0
Solution: We first observe that the coefficient matrix of the given linear system is

    A = [ −2  −1   4 ]
        [  3   1  −7 ]
        [  2   0  −5 ]

which is invertible, with inverse (computed in Examples 1.5.6)

    A−1 = [ −5  −5   3 ]
          [  1   2  −2 ]
          [ −2  −2   1 ]

The given linear system can be written in the form AX = B, where

    X = [ x ]        B = [  1 ]
        [ y ]            [ −1 ]
        [ z ]            [  0 ]

Consequently, the unique solution of the given linear system is

    X = A−1B = [ −5  −5   3 ] [  1 ]   [  0 ]
               [  1   2  −2 ] [ −1 ] = [ −1 ]
               [ −2  −2   1 ] [  0 ]   [  0 ]
Frequently, one has to solve a sequence of linear systems

    AX = B1 , AX = B2 , . . . , AX = Bk

each of which has the same square coefficient matrix A. If A is invertible, then the solutions are

    X1 = A−1B1 , X2 = A−1B2 , . . . , Xk = A−1Bk .

Otherwise, a method of solving those systems, which works whether A is invertible or not, consists
in forming the matrix

    [ A | B1 | B2 | · · · | Bk ]

and reducing it to reduced row-echelon form.
Example 1.6.4 Solve the following linear systems:

    −2x − y + 4z = 1          −2x − y + 4z = 2
    3x + y − 7z = −1          3x + y − 7z = −2
    2x − 5z = 0               2x − 5z = 3
Theorem 1.6.5 Let A be a square matrix.
1. If B is a square matrix satisfying BA = I, then B = A−1;
2. If B is a square matrix satisfying AB = I, then B = A−1.
Theorem 1.6.7 If A is an n × n matrix, then the following statements are equivalent:
1. A is invertible (*);
2. The homogeneous linear system AX = O has only the trivial solution;
3. The reduced row-echelon form of A is In;
4. A is expressible as a product of elementary matrices.
5. AX = B is consistent for every n× 1 matrix B;(*)
6. AX = B has exactly one solution for every n× 1 matrix B(*).
Theorem 1.6.8 Let A and B be square matrices of the same size. If AB is invertible, then A and
B are both invertible.
Fundamental Problem 1.6.9 Let A be a fixed m × n matrix. Find all m × 1 matrices B such
that the system of equations AX = B is consistent.
Example 1.6.10 What conditions must b1 , b2 , b3 satisfy in order for the linear system

    x + y − 2z = b1
    x − 2y + z = b2
    −2x + y + z = b3

to be consistent?
Solution: For the augmented matrix of the given linear system we have successively:

    [  1   1  −2 | b1 ]   −r1 + r2 → r2     [ 1   1  −2 | b1       ]
    [  1  −2   1 | b2 ]   2r1 + r3 → r3     [ 0  −3   3 | b2 − b1  ]
    [ −2   1   1 | b3 ]   ───────────→      [ 0   3  −3 | b3 + 2b1 ]

    r2 + r3 → r3 :

    [ 1   1  −2 | b1           ]
    [ 0  −3   3 | b2 − b1      ]
    [ 0   0   0 | b1 + b2 + b3 ]

Thus, the given linear system is consistent if and only if b1 + b2 + b3 = 0. □
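The criterion b1 + b2 + b3 = 0 can be spot-checked numerically by comparing the rank of the coefficient matrix with the rank of the augmented matrix for sample right-hand sides (`is_consistent` is our helper; a heuristic check, not a proof):

```python
import numpy as np

A = np.array([[1., 1, -2],
              [1, -2, 1],
              [-2, 1, 1]])

def is_consistent(b):
    aug = np.column_stack([A, b])
    return np.linalg.matrix_rank(aug) == np.linalg.matrix_rank(A)

assert is_consistent(np.array([1., 2, -3]))       # 1 + 2 - 3 = 0: consistent
assert not is_consistent(np.array([1., 2, 3]))    # 1 + 2 + 3 = 6: inconsistent
print("consistency criterion confirmed on two samples")
```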
1.7 Diagonal, Triangular and Symmetric Matrices
Definition 1.7.1 A square matrix in which all the entries off the main diagonal are zero is called a
diagonal matrix. The general form of a diagonal matrix is

    D = [ d1  0   · · ·  0  ]
        [ 0   d2  · · ·  0  ]
        [ .   .          .  ]
        [ 0   0   · · ·  dn ]
Remarks 1.7.2 1. The diagonal matrix

    D = [ d1  0   · · ·  0  ]
        [ 0   d2  · · ·  0  ]
        [ .   .          .  ]
        [ 0   0   · · ·  dn ]

is invertible iff all of its diagonal entries are nonzero. In this case

    D−1 = [ 1/d1   0    · · ·   0   ]
          [  0    1/d2  · · ·   0   ]
          [  .     .            .   ]
          [  0     0    · · ·  1/dn ]

2. Powers of diagonal matrices are easy to compute. More precisely, with D as above,

    Dk = [ d1^k   0    · · ·   0   ]
         [  0    d2^k  · · ·   0   ]
         [  .     .            .   ]
         [  0     0    · · ·  dn^k ]

3. If A = [aij ] is an m × n matrix, then multiplying A on the left by a diagonal matrix scales its
rows, and multiplying on the right scales its columns:

    [ δ1  · · ·  0  ] [ a11  . . .  a1n ]   [ δ1a11  . . .  δ1a1n ]
    [ .          .  ] [  .           .  ] = [   .             .   ]
    [ 0   · · ·  δm ] [ am1  . . .  amn ]   [ δmam1  . . .  δmamn ]

and

    [ a11  . . .  a1n ] [ d1  · · ·  0  ]   [ a11d1  . . .  a1ndn ]
    [  .           .  ] [ .          .  ] = [   .             .   ]
    [ am1  . . .  amn ] [ 0   · · ·  dn ]   [ am1d1  . . .  amndn ]
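These remarks mean that powers and inverses of diagonal matrices reduce to entrywise operations on the diagonal vector, which a quick NumPy sketch confirms:

```python
import numpy as np

d = np.array([2., 3, 5])
D = np.diag(d)

# D^k has diagonal entries d_i^k, and D^{-1} has diagonal entries 1/d_i.
assert np.allclose(np.linalg.matrix_power(D, 4), np.diag(d**4))
assert np.allclose(np.linalg.inv(D), np.diag(1/d))
print(np.diag(d**4))
```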
Examples 1.7.3 Find a diagonal matrix A satisfying

    A−2 = [ 9  0  0 ]
          [ 0  4  0 ]
          [ 0  0  1 ]

Solution: We have successively:

    A2 = (A−2)−1 = [ 9  0  0 ]−1   [ 1/9   0   0 ]   [ (1/3)²    0     0  ]
                   [ 0  4  0 ]   = [  0   1/4  0 ] = [    0   (1/2)²   0  ]
                   [ 0  0  1 ]     [  0    0   1 ]   [    0      0     1² ]

Thus, such a matrix is

    A = [ 1/3   0   0 ]
        [  0   1/2  0 ]
        [  0    0   1 ]
Definition 1.7.4 A square matrix in which all the entries above the main diagonal are zero is
called lower triangular, and a square matrix in which all the entries below the main diagonal are
zero is called upper triangular. A matrix which is either lower triangular or upper triangular is
called triangular. The general form of a lower triangular matrix is

    [ a11   0   · · ·   0  ]
    [ a21  a22  · · ·   0  ]
    [  .    .           .  ]
    [ an1  an2  · · ·  ann ]

and the general form of an upper triangular matrix is

    [ a11  a12  · · ·  a1n ]
    [  0   a22  · · ·  a2n ]
    [  .    .           .  ]
    [  0    0   · · ·  ann ]
Remark 1.7.5 1. A square matrix A = [aij ] is upper triangular iff aij = 0 for i > j;
2. A square matrix A = [aij ] is lower triangular iff aij = 0 for i < j.
Theorem 1.7.6 1. The transpose of a lower triangular matrix is upper triangular, and the
transpose of an upper triangular matrix is lower triangular;
2. The product of lower triangular matrices is a lower triangular matrix;
3. The product of upper triangular matrices is an upper triangular matrix;
4. A triangular matrix is invertible if and only if its diagonal entries are all nonzero;
5. The inverse of a lower triangular matrix is lower triangular and the inverse of an upper
triangular matrix is upper triangular.
Example 1.7.7 If
\[ A = \begin{bmatrix} 1 & -1 & 2 \\ 0 & 2 & 3 \\ 0 & 0 & 1 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} 2 & 3 & 1 \\ 0 & -1 & 2 \\ 0 & 0 & 5 \end{bmatrix}, \]
show that AB is also upper triangular and find (AB)^{-1} and B^{-1}A^{-1}.
Solution:
\[ \begin{bmatrix} 1 & -1 & 2 \\ 0 & 2 & 3 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 2 & 3 & 1 \\ 0 & -1 & 2 \\ 0 & 0 & 5 \end{bmatrix}
= \begin{bmatrix} 2 & 4 & 9 \\ 0 & -2 & 19 \\ 0 & 0 & 5 \end{bmatrix}. \]
\[ \left[\begin{array}{ccc|ccc} 2 & 4 & 9 & 1 & 0 & 0 \\ 0 & -2 & 19 & 0 & 1 & 0 \\ 0 & 0 & 5 & 0 & 0 & 1 \end{array}\right]
\xrightarrow{\frac12 R_1,\ -\frac12 R_2,\ \frac15 R_3}
\left[\begin{array}{ccc|ccc} 1 & 2 & \frac92 & \frac12 & 0 & 0 \\ 0 & 1 & -\frac{19}{2} & 0 & -\frac12 & 0 \\ 0 & 0 & 1 & 0 & 0 & \frac15 \end{array}\right] \]
\[ \xrightarrow[\frac{19}{2}R_3 + R_2 \to R_2]{-2R_2 + R_1 \to R_1}
\left[\begin{array}{ccc|ccc} 1 & 0 & \frac{47}{2} & \frac12 & 1 & 0 \\ 0 & 1 & 0 & 0 & -\frac12 & \frac{19}{10} \\ 0 & 0 & 1 & 0 & 0 & \frac15 \end{array}\right]
\xrightarrow{-\frac{47}{2}R_3 + R_1 \to R_1}
\left[\begin{array}{ccc|ccc} 1 & 0 & 0 & \frac12 & 1 & -\frac{47}{10} \\ 0 & 1 & 0 & 0 & -\frac12 & \frac{19}{10} \\ 0 & 0 & 1 & 0 & 0 & \frac15 \end{array}\right]. \]
Thus
\[ (AB)^{-1} = B^{-1}A^{-1} = \begin{bmatrix} \frac12 & 1 & -\frac{47}{10} \\ 0 & -\frac12 & \frac{19}{10} \\ 0 & 0 & \frac15 \end{bmatrix}. \]
Definition 1.7.8 A square matrix A is called symmetric if A^T = A, or equivalently (A)_{ij} = (A)_{ji} for all i, j.
Theorem 1.7.9 If A,B are symmetric matrices of the same size and k is any scalar, then:
1. AT is symmetric;
2. A + B and A−B are symmetric;
3. kA is symmetric.
Theorem 1.7.10 1. The product of two symmetric matrices is symmetric if and only if the
matrices commute;
2. If A is an invertible symmetric matrix, then A−1 is symmetric.
Theorem 1.7.11 1. If A is an arbitrary matrix, then AAT and AT A are symmetric matrices;
2. If A is an invertible matrix, then AAT and AT A are invertible matrices.
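Part 1 of Theorem 1.7.11 is easy to illustrate numerically: for any matrix A, both AA^T and A^T A equal their own transposes. A pure-Python sketch (helper names are ours):

```python
def transpose(M):
    """Transpose of a list-of-lists matrix."""
    return [list(row) for row in zip(*M)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2, 0],
     [3, -1, 4]]              # an arbitrary 2x3 matrix
S = matmul(A, transpose(A))   # 2x2 product A A^T
T = matmul(transpose(A), A)   # 3x3 product A^T A
print(S == transpose(S))  # True
print(T == transpose(T))  # True
```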
Example 1.7.12 Find the values of a, b, c ∈ R for which the matrix A is symmetric, where
\[ A = \begin{bmatrix} 2 & a+b+c & 4b-3c \\ 5 & 1 & 2a+c \\ 2 & 4 & 6 \end{bmatrix}. \]
Solution: The required values are the solutions of the linear system
\[ \begin{cases} a + b + c = 5 \\ 4b - 3c = 2 \\ 2a + c = 4 \end{cases} \]
that can be solved by performing row-reduction operations on the corresponding augmented matrix, namely:
\[ \left[\begin{array}{ccc|c} 1 & 1 & 1 & 5 \\ 0 & 4 & -3 & 2 \\ 2 & 0 & 1 & 4 \end{array}\right]
\xrightarrow{-2r_1 + r_3 \to r_3}
\left[\begin{array}{ccc|c} 1 & 1 & 1 & 5 \\ 0 & 4 & -3 & 2 \\ 0 & -2 & -1 & -6 \end{array}\right]
\xrightarrow{\frac14 r_2}
\left[\begin{array}{ccc|c} 1 & 1 & 1 & 5 \\ 0 & 1 & -\frac34 & \frac12 \\ 0 & -2 & -1 & -6 \end{array}\right] \]
\[ \xrightarrow{2r_2 + r_3 \to r_3}
\left[\begin{array}{ccc|c} 1 & 1 & 1 & 5 \\ 0 & 1 & -\frac34 & \frac12 \\ 0 & 0 & -\frac52 & -5 \end{array}\right]
\xrightarrow{-\frac25 r_3,\ -r_2 + r_1 \to r_1}
\left[\begin{array}{ccc|c} 1 & 0 & \frac74 & \frac92 \\ 0 & 1 & -\frac34 & \frac12 \\ 0 & 0 & 1 & 2 \end{array}\right]. \]
The linear system corresponding to the last matrix in the above row-reduction process, which is equivalent
to the initial one, has the unique solution a = 1, b = c = 2.
Chapter 2
Determinants
2.1 The determinant function
Definition 2.1.1 A permutation of the set {1, 2, . . . , n} is a bijective function σ : {1, 2, . . . , n} → {1, 2, . . . , n}, namely an arrangement of the integers 1, 2, . . . , n without omissions or repetitions. It
is also written
\[ \sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ \sigma(1) & \sigma(2) & \cdots & \sigma(n) \end{pmatrix}. \]
An inversion of the permutation σ is a pair (i, j) such that i < j and σ(i) > σ(j). Denote by m(σ)
the number of inversions of σ and by ε(σ) = (−1)^{m(σ)} the signature of σ. A permutation σ is called
even/odd if m(σ) is even/odd, or equivalently ε(σ) = +1/ε(σ) = −1.
Definition 2.1.2 The determinant of a square n × n matrix A = [a_{ij}] is defined to be
\[ \det(A) = \sum_{\sigma} \varepsilon(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)}, \]
where the sum ranges over all permutations σ of {1, 2, . . . , n}. It is denoted either by det(A) or by
\[ \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}. \]
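Definition 2.1.2 can be implemented verbatim: sum over all permutations, each term carrying the sign (−1)^{m(σ)}, where m(σ) counts inversions. A pure-Python sketch (function names are ours; this sum has n! terms, so practical code would use row reduction instead):

```python
from itertools import permutations

def sign(perm):
    """Signature (-1)**m, where m is the number of inversions of perm."""
    m = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
            if perm[i] > perm[j])
    return -1 if m % 2 else 1

def prod_along(A, p):
    """Product a_{1,p(1)} * a_{2,p(2)} * ... * a_{n,p(n)} (0-indexed)."""
    r = 1
    for i, j in enumerate(p):
        r *= A[i][j]
    return r

def det(A):
    """Determinant as the signed sum over all permutations."""
    n = len(A)
    return sum(sign(p) * prod_along(A, p) for p in permutations(range(n)))

print(det([[1, 2], [-1, 3]]))                     # 5
print(det([[1, -1, 2], [0, 2, 1], [-2, 2, 3]]))   # 14
```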
Remarks 2.1.3 1. If A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} is a 2 × 2 square matrix, then det(A) = a_{11}a_{22} − a_{12}a_{21}.
Indeed, the only permutations of {1, 2} are e = \begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix} and σ = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}, and ε(e) = 1,
ε(σ) = −1 since m(e) = 0 and m(σ) = 1. Thus
\[ \det\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = \varepsilon(e) a_{1e(1)} a_{2e(2)} + \varepsilon(\sigma) a_{1\sigma(1)} a_{2\sigma(2)} = a_{11}a_{22} - a_{12}a_{21}. \]
For example
\[ \begin{vmatrix} 1 & 2 \\ -1 & 3 \end{vmatrix} = 1 \cdot 3 - 2 \cdot (-1) = 3 + 2 = 5. \]
2. A 2 × 2 matrix A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} is invertible if and only if det(A) ≠ 0. In this case
\[ A^{-1} = \frac{1}{\det(A)} \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix}. \]
3. If A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} is a 3 × 3 square matrix, then
\[ \det(A) = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32}. \]
(This is the rule of Sarrus: copy the first two columns to the right of the matrix; the three products along the descending diagonals are taken with the sign +, and the three products along the ascending diagonals with the sign −.)
Indeed, the only permutations of {1, 2, 3} are
\[ e = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}, \quad \sigma_1 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}, \]
\[ \sigma_3 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}, \quad \sigma_4 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}, \quad \sigma_5 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}, \]
and ε(e) = ε(σ_4) = ε(σ_5) = 1 while ε(σ_1) = ε(σ_2) = ε(σ_3) = −1, since m(e) = 0, m(σ_4) =
m(σ_5) = 2 and m(σ_1) = 1, m(σ_2) = 3, m(σ_3) = 1.
For example
\[ \begin{vmatrix} 1 & -1 & 2 \\ 0 & 2 & 1 \\ -2 & 2 & 3 \end{vmatrix} = 1 \cdot 2 \cdot 3 + (-1) \cdot 1 \cdot (-2) + 2 \cdot 0 \cdot 2 - 2 \cdot 2 \cdot (-2) - (-1) \cdot 0 \cdot 3 - 1 \cdot 1 \cdot 2 = 6 + 2 + 0 + 8 - 0 - 2 = 14. \]
2.2 Evaluating determinants by row reduction
Theorem 2.2.1 Let A be a square matrix.
1. If A has a row of zeros, then det(A) = 0.
2. det(A) = det(AT ).
Theorem 2.2.2 If A is an n × n triangular matrix, then det(A) is the product of the entries on
the main diagonal, namely det(A) = a_{11} a_{22} · · · a_{nn}.
Theorem 2.2.3 Let A be an n× n square matrix.
1. If B is the matrix that results when a single row or a single column of A is multiplied by a
scalar k, then det(B) = k det(A); consequently, det(kA) = k^n det(A);
2. If B is the matrix that results when two rows or two columns of A are interchanged, then
det(B) = −det(A);
3. If B is the matrix that results when a multiple of one row of A is added to another row or
when a multiple of one column of A is added to another column, then det(B) = det(A).
Corollary 2.2.4 If A is a square matrix with two proportional rows or two proportional columns,
then det(A) = 0.
Example 2.2.5 Find the determinant of A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & -1 \end{bmatrix}.
Solution:
\[ \det(A) = \begin{vmatrix} 1 & 2 & 3 & 4 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & -1 \end{vmatrix}
\stackrel{-r_1 + r_2 \to r_2}{=}
\begin{vmatrix} 1 & 2 & 3 & 4 \\ 0 & -2 & -2 & -4 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & -1 \end{vmatrix}
= (-2) \begin{vmatrix} 1 & 2 & 3 & 4 \\ 0 & 1 & 1 & 2 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & -1 \end{vmatrix}
= (-2) \begin{vmatrix} 1 & 1 & 2 \\ 1 & 0 & 1 \\ 0 & 1 & -1 \end{vmatrix}
= -2(2 - 1 + 1) = -4. \]
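The row-reduction strategy of this section can be sketched as code: eliminate below each pivot (which leaves the determinant unchanged by Theorem 2.2.3.3), flip the sign on each row interchange, and finally multiply the diagonal entries of the resulting triangular matrix (Theorem 2.2.2). Exact rational arithmetic avoids round-off; the function name is ours:

```python
from fractions import Fraction

def det_by_row_reduction(A):
    """Determinant via Gaussian elimination with exact rationals."""
    M = [[Fraction(x) for x in row] for row in A]
    n, sign = len(M), 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)          # zero column => determinant is 0
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            sign = -sign                # a row interchange changes the sign
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            # adding a multiple of one row to another keeps det unchanged
            M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    result = Fraction(sign)
    for i in range(n):
        result *= M[i][i]               # product of the diagonal entries
    return result

A = [[1, 2, 3, 4], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, -1]]
print(det_by_row_reduction(A))  # -4, as in Example 2.2.5
```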
35
2.3 Properties of determinant function
Theorem 2.3.1 Let A,B and C be n×n matrices that differ only in a single row, say the rth, and assume
that the rth row of C can be obtained by adding corresponding entries in the rth row of A and B. Then
det(C) = det(A) + det(B).
In other words∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
a11 a12 · · · a1n
a21 a22 · · · a2n
......
. . ....
a11 + a′11
a12 + a′12
· · · a1n + a′1n
......
. . ....
am1 am2 · · · amn
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
=
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
a11 a12 · · · a1n
a21 a22 · · · a2n
......
. . ....
a11 a12 · · · a1n
......
. . ....
am1 a
m2 · · · amn
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
+
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
a11 a12 · · · a1n
a21 a22 · · · a2n
......
. . ....
a′11
a′12
· · · a′1n
......
. . ....
am1 a
m2 · · · amn
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
Theorem 2.3.2 If A and B are square matrices of the same size, then det(AB) = det(A) det(B).
Theorem 2.3.3 If A is an invertible matrix, then det(A^{-1}) = \frac{1}{\det(A)}.
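Both theorems can be spot-checked on 2 × 2 matrices, where the determinant and the inverse have closed forms (helper names and sample matrices are ours):

```python
from fractions import Fraction

def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 2], [-1, 3]]
B = [[2, 0], [1, 4]]
# det(AB) = det(A) det(B):
print(det2(matmul(A, B)) == det2(A) * det2(B))  # True

# det(A^{-1}) = 1/det(A), using the closed form (1/det) * [[d, -b], [-c, a]]:
d = Fraction(det2(A))
A_inv = [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]
print(det2(A_inv) == 1 / d)  # True
```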
Theorem 2.3.4 If A is an n × n matrix, then the following statements are equivalent:
1. A is invertible;
2. The homogeneous linear system AX = O has only the trivial solution;
3. The reduced row-echelon form of A is I_n;
4. A is expressible as a product of elementary matrices;
5. AX = B is consistent for every n × 1 matrix B;
6. AX = B has exactly one solution for every n × 1 matrix B;
7. det(A) ≠ 0.
Example 2.3.5 1. Consider the matrix A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ x & y & z & w \\ w & x & y & z \\ y+z & z+w & w+x & x+y \end{bmatrix}. Show that det(A) = 0
for all x, y, z, w ∈ R.
Solution: For the determinant of A we have successively:
\[ \begin{vmatrix} 1 & 1 & 1 & 1 \\ x & y & z & w \\ w & x & y & z \\ y+z & z+w & w+x & x+y \end{vmatrix}
\stackrel{r_2 + r_4 \to r_4}{=}
\begin{vmatrix} 1 & 1 & 1 & 1 \\ x & y & z & w \\ w & x & y & z \\ x+y+z & y+z+w & z+w+x & w+x+y \end{vmatrix}
\stackrel{r_3 + r_4 \to r_4}{=}
\begin{vmatrix} 1 & 1 & 1 & 1 \\ x & y & z & w \\ w & x & y & z \\ x+y+z+w & x+y+z+w & x+y+z+w & x+y+z+w \end{vmatrix} \]
\[ = (x+y+z+w) \begin{vmatrix} 1 & 1 & 1 & 1 \\ x & y & z & w \\ w & x & y & z \\ 1 & 1 & 1 & 1 \end{vmatrix} = 0 \]
for all x, y, z, w ∈ R, since the last determinant has two equal rows.
2. Let
\[ A = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}. \]
Assuming that det(A) = −7, find det(3A), det(A^{-1}), det(2A^{-1}) and det \begin{bmatrix} a & g & d \\ b & h & e \\ c & i & f \end{bmatrix}.
Solution: • det(3A) = 3^3 det(A) = 27 · (−7) = −189.
• det(A^{-1}) = \frac{1}{\det(A)} = \frac{1}{-7} = -\frac17.
• det(2A^{-1}) = 2^3 det(A^{-1}) = 8 · \left(-\frac17\right) = -\frac87.
• \[ \det \begin{bmatrix} a & g & d \\ b & h & e \\ c & i & f \end{bmatrix} = \det \begin{bmatrix} a & g & d \\ b & h & e \\ c & i & f \end{bmatrix}^{T} = \begin{vmatrix} a & b & c \\ g & h & i \\ d & e & f \end{vmatrix} = - \begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = -(-7) = 7. \]
3. Show that
\[ \begin{vmatrix} a_1 + b_1 t & a_2 + b_2 t & a_3 + b_3 t \\ a_1 t + b_1 & a_2 t + b_2 & a_3 t + b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = (1 - t^2) \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}. \]
Solution: Indeed, splitting the first row and then the second row by Theorem 2.3.1, we have successively:
\[ \begin{vmatrix} a_1 + b_1 t & a_2 + b_2 t & a_3 + b_3 t \\ a_1 t + b_1 & a_2 t + b_2 & a_3 t + b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
= \begin{vmatrix} a_1 & a_2 & a_3 \\ a_1 t + b_1 & a_2 t + b_2 & a_3 t + b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
+ \begin{vmatrix} b_1 t & b_2 t & b_3 t \\ a_1 t + b_1 & a_2 t + b_2 & a_3 t + b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} \]
\[ = \underbrace{\begin{vmatrix} a_1 & a_2 & a_3 \\ a_1 t & a_2 t & a_3 t \\ c_1 & c_2 & c_3 \end{vmatrix}}_{=0}
+ \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
+ \begin{vmatrix} b_1 t & b_2 t & b_3 t \\ a_1 t & a_2 t & a_3 t \\ c_1 & c_2 & c_3 \end{vmatrix}
+ \underbrace{\begin{vmatrix} b_1 t & b_2 t & b_3 t \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}}_{=0} \]
\[ = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
+ t^2 \begin{vmatrix} b_1 & b_2 & b_3 \\ a_1 & a_2 & a_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
= \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
- t^2 \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
= (1 - t^2) \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}. \]
2.4 Cofactor expansion. Cramer’s rule
Definition 2.4.1 If A is a square matrix, then the minor of entry aij is denoted by Mij and is defined to
be the determinant of the submatrix that remains after the ith row and the jth column are deleted from A.
The number (−1)i+jMij is denoted by Cij and is called the cofactor of entry aij .
Example 2.4.2
\[ \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32} \]
\[ = a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{23}a_{31}) + a_{13}(a_{21}a_{32} - a_{22}a_{31}) \]
\[ = (-1)^{1+1} a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} + (-1)^{1+2} a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + (-1)^{1+3} a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix} \]
\[ = a_{11}C_{11} + a_{12}C_{12} + a_{13}C_{13}. \]
Similarly,
\[ \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32} \]
\[ = -a_{12}(a_{21}a_{33} - a_{23}a_{31}) + a_{22}(a_{11}a_{33} - a_{13}a_{31}) - a_{32}(a_{11}a_{23} - a_{13}a_{21}) \]
\[ = (-1)^{1+2} a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + (-1)^{2+2} a_{22} \begin{vmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{vmatrix} + (-1)^{3+2} a_{32} \begin{vmatrix} a_{11} & a_{13} \\ a_{21} & a_{23} \end{vmatrix} \]
\[ = a_{12}C_{12} + a_{22}C_{22} + a_{32}C_{32}. \]
Theorem 2.4.3 If A = [a_{ij}] is a square n × n matrix, then
\[ \det(A) = \underbrace{a_{i1}C_{i1} + a_{i2}C_{i2} + \cdots + a_{in}C_{in}}_{\text{the cofactor expansion of } \det(A) \text{ along the } i\text{th row}}
= \underbrace{a_{1j}C_{1j} + a_{2j}C_{2j} + \cdots + a_{nj}C_{nj}}_{\text{the cofactor expansion of } \det(A) \text{ along the } j\text{th column}}. \]
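Theorem 2.4.3 yields a recursive algorithm: expand along the first row, with the cofactor signs (−1)^{1+j} alternating across the row. A pure-Python sketch (names are ours; the cost is again on the order of n!, so this is for illustration, not for large matrices):

```python
def minor(A, i, j):
    """Submatrix of A with row i and column j deleted (0-indexed)."""
    return [row[:j] + row[j+1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    # (-1)**j is the cofactor sign (-1)**(1 + (j+1)) in 1-indexed notation
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j))
               for j in range(len(A)))

print(det([[1, -1, 2], [0, 2, 1], [-2, 2, 3]]))  # 14
```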
Example 2.4.4 Show that:
\[ \begin{vmatrix} 1 & 1 & 1 & 1 \\ a_1 & a_2 & a_3 & a_4 \\ a_1^2 & a_2^2 & a_3^2 & a_4^2 \\ a_1^3 & a_2^3 & a_3^3 & a_4^3 \end{vmatrix} = (a_1 - a_2)(a_1 - a_3)(a_1 - a_4)(a_2 - a_3)(a_2 - a_4)(a_3 - a_4). \]
If A = [a_{ij}] is a square n × n matrix and C_{ij} is the cofactor of the entry a_{ij}, then the matrix
\[ \begin{bmatrix} C_{11} & C_{12} & \cdots & C_{1n} \\ C_{21} & C_{22} & \cdots & C_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ C_{n1} & C_{n2} & \cdots & C_{nn} \end{bmatrix} \]
is called the matrix of cofactors from A. The transpose of this matrix is called the adjoint of A and is
denoted by adj(A). If A is an invertible matrix, then A^{-1} = \frac{1}{\det(A)}\,\mathrm{adj}(A). The inverse of an invertible
lower triangular matrix is lower triangular and the inverse of an invertible upper triangular matrix is upper
triangular.
Example. Find necessary and sufficient conditions on a, b, c ∈ R such that the matrix
\[ A = \begin{bmatrix} 1 & 1 & 1 \\ a & b & c \\ a^2 & b^2 & c^2 \end{bmatrix} \]
is invertible. In those cases find A^{-1}.
Solution: According to Theorem 2.3.4, A is invertible if and only if det(A) ≠ 0. But since
\[ \det(A) = V(a, b, c) = \begin{vmatrix} 1 & 1 & 1 \\ a & b & c \\ a^2 & b^2 & c^2 \end{vmatrix} = (c - a)(c - b)(b - a), \]
it follows that A is invertible if and only if a ≠ b, b ≠ c and c ≠ a. In this case
\[ A^{-1} = \frac{1}{\det(A)}\,\mathrm{adj}(A) = \frac{1}{\det(A)} \begin{bmatrix} C_{11} & C_{12} & C_{13} \\ C_{21} & C_{22} & C_{23} \\ C_{31} & C_{32} & C_{33} \end{bmatrix}^{T} \]
\[ = \frac{1}{(c-a)(c-b)(b-a)} \begin{bmatrix}
\begin{vmatrix} b & c \\ b^2 & c^2 \end{vmatrix} & -\begin{vmatrix} a & c \\ a^2 & c^2 \end{vmatrix} & \begin{vmatrix} a & b \\ a^2 & b^2 \end{vmatrix} \\[1ex]
-\begin{vmatrix} 1 & 1 \\ b^2 & c^2 \end{vmatrix} & \begin{vmatrix} 1 & 1 \\ a^2 & c^2 \end{vmatrix} & -\begin{vmatrix} 1 & 1 \\ a^2 & b^2 \end{vmatrix} \\[1ex]
\begin{vmatrix} 1 & 1 \\ b & c \end{vmatrix} & -\begin{vmatrix} 1 & 1 \\ a & c \end{vmatrix} & \begin{vmatrix} 1 & 1 \\ a & b \end{vmatrix}
\end{bmatrix}^{T} \]
\[ = \frac{1}{(c-a)(c-b)(b-a)} \begin{bmatrix} bc(c-b) & -ac(c-a) & ab(b-a) \\ -(c^2-b^2) & c^2-a^2 & -(b^2-a^2) \\ c-b & -(c-a) & b-a \end{bmatrix}^{T}
= \frac{1}{(c-a)(c-b)(b-a)} \begin{bmatrix} bc(c-b) & b^2-c^2 & c-b \\ ac(a-c) & c^2-a^2 & a-c \\ ab(b-a) & a^2-b^2 & b-a \end{bmatrix} \]
\[ = \begin{bmatrix}
\dfrac{bc}{(c-a)(b-a)} & \dfrac{b+c}{(a-c)(b-a)} & \dfrac{1}{(c-a)(b-a)} \\[1.5ex]
\dfrac{ac}{(c-b)(a-b)} & \dfrac{c+a}{(c-b)(b-a)} & \dfrac{1}{(c-b)(a-b)} \\[1.5ex]
\dfrac{ab}{(c-a)(c-b)} & \dfrac{a+b}{(c-a)(b-c)} & \dfrac{1}{(c-a)(c-b)}
\end{bmatrix}. \]
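The adjoint formula A^{-1} = adj(A)/det(A) translates directly into code (function names are ours). As an illustration we take the Vandermonde matrix of the example with the concrete, pairwise-distinct values a = 1, b = 2, c = 3, and verify A · A^{-1} = I:

```python
from fractions import Fraction

def minor(A, i, j):
    return [row[:j] + row[j+1:] for k, row in enumerate(A) if k != i]

def det(A):
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def inverse_by_adjugate(A):
    """A^{-1} = adj(A)/det(A); adj(A) is the transpose of the cofactor matrix."""
    n, d = len(A), Fraction(det(A))
    cof = [[(-1) ** (i + j) * det(minor(A, i, j)) for j in range(n)]
           for i in range(n)]
    return [[cof[j][i] / d for j in range(n)] for i in range(n)]

A = [[1, 1, 1], [1, 2, 3], [1, 4, 9]]   # a=1, b=2, c=3; det = (3-1)(3-2)(2-1) = 2
A_inv = inverse_by_adjugate(A)
I = [[sum(A[i][k] * A_inv[k][j] for k in range(3)) for j in range(3)]
     for i in range(3)]
print(I == [[1, 0, 0], [0, 1, 0], [0, 0, 1]])  # True
```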
(Cramer’s Rule) If AX = B is a system of n linear equations in n unknowns such that det(A) ≠ 0,
then the system has a unique solution. This solution is
\[ x_1 = \frac{\det(A_1)}{\det(A)}, \quad x_2 = \frac{\det(A_2)}{\det(A)}, \quad \ldots, \quad x_n = \frac{\det(A_n)}{\det(A)}, \]
where A_j is the matrix obtained from A by replacing its jth column with the entries of the column
matrix B.
Example. Provide necessary and sufficient conditions on a, b, c, d ∈ R such that the linear system
\[ \begin{cases} x + y + z + w = 1 \\ ax + by + cz + dw = \alpha \\ a^2 x + b^2 y + c^2 z + d^2 w = \alpha^2 \\ a^3 x + b^3 y + c^3 z + d^3 w = \alpha^3 \end{cases} \tag{2.1} \]
has exactly one solution for each α ∈ R. In this case solve the system by using Cramer’s rule.
Solution: The given linear system has a unique solution for each α ∈ R if and only if its coefficient matrix
\[ A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ a & b & c & d \\ a^2 & b^2 & c^2 & d^2 \\ a^3 & b^3 & c^3 & d^3 \end{bmatrix} \]
is invertible, or, equivalently, det(A) ≠ 0. But since
\[ \det(A) = V(a, b, c, d) = (d-a)(d-b)(d-c)(c-a)(c-b)(b-a), \]
it follows that the given linear system has a unique solution for each α ∈ R if and only if the scalars
a, b, c, d are pairwise distinct.
By Cramer’s rule, it follows that the unique solution of the system is
\[ x = \frac{V(\alpha, b, c, d)}{V(a, b, c, d)} = \frac{(d-\alpha)(c-\alpha)(b-\alpha)}{(d-a)(c-a)(b-a)}; \quad
y = \frac{V(a, \alpha, c, d)}{V(a, b, c, d)} = \frac{(\alpha-a)(c-\alpha)(d-\alpha)}{(b-a)(c-b)(d-b)}; \]
\[ z = \frac{V(a, b, \alpha, d)}{V(a, b, c, d)} = \frac{(\alpha-a)(\alpha-b)(d-\alpha)}{(c-a)(c-b)(d-c)}; \quad
w = \frac{V(a, b, c, \alpha)}{V(a, b, c, d)} = \frac{(\alpha-a)(\alpha-b)(\alpha-c)}{(d-a)(d-b)(d-c)}. \]
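Cramer’s rule translates directly into code (function names are ours): replace the jth column of A by the right-hand side and divide determinants. As a concrete illustration we re-solve the system from Example 1.7.12 (a + b + c = 5, 4b − 3c = 2, 2a + c = 4):

```python
from fractions import Fraction
from itertools import permutations

def det(A):
    """Determinant via the permutation-sum definition, with exact rationals."""
    n = len(A)
    def sgn(p):
        m = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
        return -1 if m % 2 else 1
    total = Fraction(0)
    for p in permutations(range(n)):
        term = Fraction(sgn(p))
        for i in range(n):
            term *= A[i][p[i]]
        total += term
    return total

def cramer(A, b):
    """Solve A x = b by Cramer's rule: x_j = det(A_j)/det(A),
    where A_j is A with its j-th COLUMN replaced by b. Needs det(A) != 0."""
    d = det(A)
    n = len(A)
    sols = []
    for j in range(n):
        Aj = [row[:j] + [b[i]] + row[j+1:] for i, row in enumerate(A)]
        sols.append(det(Aj) / d)
    return sols

A = [[1, 1, 1], [0, 4, -3], [2, 0, 1]]
b = [5, 2, 4]
print(cramer(A, b))  # [Fraction(1, 1), Fraction(2, 1), Fraction(2, 1)]
```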
Chapter 3
Euclidean vector spaces
3.1 Euclidean n-space
Definition 3.1.1 If n is a positive integer, then an ordered n-tuple is a sequence of n numbers
(a_1, a_2, . . . , a_n). The set R^n of all ordered n-tuples is called n-space.
A 2-tuple is also called an ordered pair and a 3-tuple an ordered triple, and both of them have a
geometric interpretation. Thus an n-tuple can be viewed either as a ’generalized point’ or as a
’generalized vector’.
Definition 3.1.2 Two vectors u = (u_1, u_2, . . . , u_n) and v = (v_1, v_2, . . . , v_n) in R^n are called equal
if u_1 = v_1, u_2 = v_2, . . . , u_n = v_n. Their sum u + v is defined by u + v = (u_1 + v_1, u_2 + v_2, . . . , u_n + v_n)
and, for a scalar k, the scalar multiple ku is defined by ku = (ku_1, ku_2, . . . , ku_n).
The zero vector in R^n is denoted by 0 and is defined to be the vector 0 = (0, 0, . . . , 0). If u =
(u_1, u_2, . . . , u_n) ∈ R^n is any vector, the negative or additive inverse of u is denoted by −u and
is defined by −u = (−u_1, −u_2, . . . , −u_n). The difference u − v of the vectors u = (u_1, u_2, . . . , u_n), v =
(v_1, v_2, . . . , v_n) ∈ R^n is defined by u − v := u + (−v). In terms of components we have u − v =
(u_1 − v_1, u_2 − v_2, . . . , u_n − v_n).
Theorem 3.1.3 If u = (u1 , u2 , . . . , un), v = (v1 , v2 , . . . , vn), w = (w1 , w2 , . . . , wn) ∈ Rn and k, l are
scalars, then
1. u + v = v + u.
2. u + (v + w) = (u + v) + w.
3. u + 0 = 0 + u = u.
4. u + (−u) = 0, namely u− u = 0.
5. k(lu) = (kl)u.
6. k(u + v) = ku + kv.
7. (k + l)u = ku + lu.
8. 1u = u.
Definition 3.1.4 If u = (u_1, u_2, . . . , u_n), v = (v_1, v_2, . . . , v_n) ∈ R^n are any vectors, then the
Euclidean inner product u · v is defined by u · v := u_1 v_1 + u_2 v_2 + · · · + u_n v_n.
Theorem 3.1.5 If u = (u1 , u2 , . . . , un), v = (v1 , v2 , . . . , vn), w = (w1 , w2 , . . . , wn) ∈ Rn and k, l are
scalars, then
1. u · v = v · u.
2. k(u · v) = (ku) · v = u · (kv).
3. (u + v) · w = u · w + v · w.
4. u · u ≥ 0, and u · u = 0 if and only if u = 0.
Definition 3.1.6 The Euclidean norm or Euclidean length ||u|| of a vector u =
(u_1, u_2, . . . , u_n) ∈ R^n is defined by
\[ \|u\| := \sqrt{u \cdot u} = \sqrt{u_1^2 + u_2^2 + \cdots + u_n^2}, \]
and the distance d(u, v) between the points u = (u_1, u_2, . . . , u_n), v = (v_1, v_2, . . . , v_n) ∈ R^n is defined by
\[ d(u, v) = \|u - v\| = \sqrt{(u_1 - v_1)^2 + (u_2 - v_2)^2 + \cdots + (u_n - v_n)^2}. \]
Theorem 3.1.7 (Cauchy-Schwarz Inequality in R^n) If u = (u_1, u_2, . . . , u_n), v = (v_1, v_2, . . . , v_n) ∈ R^n, then |u · v| ≤ ||u|| · ||v||. In terms of components, the inequality becomes
\[ |u_1 v_1 + u_2 v_2 + \cdots + u_n v_n| \le \sqrt{u_1^2 + u_2^2 + \cdots + u_n^2} \cdot \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}. \]
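Definitions 3.1.4–3.1.6 and the Cauchy-Schwarz inequality are straightforward to compute in coordinates. A pure-Python sketch (function names are ours):

```python
import math

def dot(u, v):
    """Euclidean inner product u . v = u1*v1 + ... + un*vn."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Euclidean norm ||u|| = sqrt(u . u)."""
    return math.sqrt(dot(u, u))

def dist(u, v):
    """Distance d(u, v) = ||u - v||."""
    return norm([a - b for a, b in zip(u, v)])

u, v = [1, 2, -2], [3, 0, 4]
print(dot(u, v))                            # -5
print(norm(u), norm(v))                     # 3.0 5.0
print(abs(dot(u, v)) <= norm(u) * norm(v))  # True  (Cauchy-Schwarz)
```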
Theorem 3.1.8 If u, v ∈ Rn and k is any scalar, then
1. ||u|| ≥ 0.
2. ||u|| = 0 if and only if u = 0.
3. ||ku|| = |k| ||u||.
4. ||u + v|| ≤ ||u||+ ||v|| (Triangle inequality).
Theorem 3.1.9 If u, v, w ∈ Rn and k is any scalar, then
1. d(u, v) ≥ 0.
2. d(u, v) = 0 if and only if u = v.
3. d(u, v) = d(v, u).
4. d(u, v) ≤ d(u,w) + d(w, v) (Triangle inequality).
Theorem 3.1.10 If u, v ∈ R^n, then u · v = \frac14 \|u + v\|^2 - \frac14 \|u - v\|^2.
Proof. Indeed, by subtracting the identities
\[ \|u + v\|^2 = (u + v) \cdot (u + v) = \|u\|^2 + 2u \cdot v + \|v\|^2, \]
\[ \|u - v\|^2 = (u - v) \cdot (u - v) = \|u\|^2 - 2u \cdot v + \|v\|^2, \]
we immediately get the required identity. □
Definition 3.1.11 Two vectors u, v ∈ Rn are said to be orthogonal if u · v = 0.
Theorem 3.1.12 (Theorem of Pythagoras for R^n) If u, v ∈ R^n are orthogonal vectors, then the
equality \|u + v\|^2 = \|u\|^2 + \|v\|^2 holds.
Proof. Indeed, we have successively:
\[ \|u + v\|^2 = (u + v) \cdot (u + v) = \|u\|^2 + 2u \cdot v + \|v\|^2 = \|u\|^2 + \|v\|^2. \quad \Box \]
A vector u = (u_1, . . . , u_n) ∈ R^n can be naturally identified with a column matrix or with a row matrix,
namely with
\[ u = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} \quad\text{or}\quad u = [\,u_1 \ u_2 \ \cdots \ u_n\,]. \]
Consequently, the set R^n of n-tuples can be naturally identified either with the space of column
matrices or with the space of row matrices. The sum u + v of two vectors u = (u_1, . . . , u_n), v =
(v_1, . . . , v_n) will consequently be identified with the sum
\[ u + v = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} + \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} = \begin{bmatrix} u_1 + v_1 \\ u_2 + v_2 \\ \vdots \\ u_n + v_n \end{bmatrix} \]
of the column matrices u and v, or with the sum
\[ u + v = [\,u_1 \ u_2 \ \cdots \ u_n\,] + [\,v_1 \ v_2 \ \cdots \ v_n\,] = [\,u_1 + v_1 \ \ u_2 + v_2 \ \cdots \ u_n + v_n\,] \]
of the row matrices u and v.
Similarly, the scalar multiple ku of the real number k with the ordered n-tuple u = (u_1, u_2, . . . , u_n) can
be identified either with the scalar multiple
\[ ku = k \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} = \begin{bmatrix} ku_1 \\ ku_2 \\ \vdots \\ ku_n \end{bmatrix}, \]
or with the scalar multiple
\[ ku = k[\,u_1 \ u_2 \ \cdots \ u_n\,] = [\,ku_1 \ ku_2 \ \cdots \ ku_n\,]. \]
3.2 Linear transformations from Rn to Rm
Definition 3.2.1 A mapping T_A : R^n → R^m, T_A(x) = Ax, where A is an m × n matrix, is called a
linear transformation; when m = n it is also called a linear operator. The linear transformation T_A is
equally called the multiplication by A. If A = [a_{ij}]_{m \times n}, then
\[ T\left( \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \right)
= \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
= \begin{bmatrix} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\ \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n \end{bmatrix}. \]
Observe that
\[ T\left( \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \right) = \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix}, \quad
T\left( \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix} \right) = \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix}, \quad \ldots, \quad
T\left( \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} \right) = \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix}. \]
The matrix A is called the standard matrix of the linear transformation T; it is usually denoted by [T],
and the multiplication-by-A mapping is also denoted by T_A. Therefore
[T] = [\,T(e_1) \mid T(e_2) \mid \cdots \mid T(e_n)\,], where
\[ e_1 = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad e_2 = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}, \quad \ldots, \quad e_n = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}. \]
1. The orthogonal projection of R^2 on the x-axis, p_x : R^2 → R^2, p_x(x_1, x_2) = (x_1, 0), is a linear
mapping since
\[ p_x\left( \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \right) = \begin{bmatrix} x_1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \]
and [p_x] = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}. Its equations are: w_1 = x_1, w_2 = 0.
2. The orthogonal projection of R^2 on the y-axis, p_y : R^2 → R^2, p_y(y_1, y_2) = (0, y_2), is a linear
mapping since
\[ p_y\left( \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} \right) = \begin{bmatrix} 0 \\ y_2 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} \]
and [p_y] = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}. Its equations are: w_1 = 0, w_2 = y_2.
3. The reflection of R^3 about the x_1x_2-plane, S_{x_1x_2} : R^3 → R^3, S_{x_1x_2}(x, y, z) = (x, y, −z), is a
linear mapping since
\[ S_{x_1x_2}\left( \begin{bmatrix} x \\ y \\ z \end{bmatrix} \right) = \begin{bmatrix} x \\ y \\ -z \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} \]
and [S_{x_1x_2}] = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix}. Its equations are: w_1 = x, w_2 = y, w_3 = −z.
4. The reflection of R^3 about the x_1x_3-plane, S_{x_1x_3} : R^3 → R^3, S_{x_1x_3}(x, y, z) = (x, −y, z), is a
linear mapping since
\[ S_{x_1x_3}\left( \begin{bmatrix} x \\ y \\ z \end{bmatrix} \right) = \begin{bmatrix} x \\ -y \\ z \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} \]
and [S_{x_1x_3}] = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix}. Its equations are: w_1 = x, w_2 = −y, w_3 = z.
5. The reflection of R^3 about the x_2x_3-plane, S_{x_2x_3} : R^3 → R^3, S_{x_2x_3}(x, y, z) = (−x, y, z), is a
linear mapping since
\[ S_{x_2x_3}\left( \begin{bmatrix} x \\ y \\ z \end{bmatrix} \right) = \begin{bmatrix} -x \\ y \\ z \end{bmatrix} = \begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} \]
and [S_{x_2x_3}] = \begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}. Its equations are: w_1 = −x, w_2 = y, w_3 = z.
6. The orthogonal projection of R^3 on the x_1x_2-plane, P_{x_1x_2} : R^3 → R^3, P_{x_1x_2}(x, y, z) = (x, y, 0),
is a linear mapping since
\[ P_{x_1x_2}\left( \begin{bmatrix} x \\ y \\ z \end{bmatrix} \right) = \begin{bmatrix} x \\ y \\ 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} \]
and [P_{x_1x_2}] = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}. Its equations are: w_1 = x, w_2 = y, w_3 = 0.
7. The orthogonal projection of R^3 on the x_1x_3-plane, P_{x_1x_3} : R^3 → R^3, P_{x_1x_3}(x, y, z) = (x, 0, z),
is a linear mapping since
\[ P_{x_1x_3}\left( \begin{bmatrix} x \\ y \\ z \end{bmatrix} \right) = \begin{bmatrix} x \\ 0 \\ z \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} \]
and [P_{x_1x_3}] = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}. Its equations are: w_1 = x, w_2 = 0, w_3 = z.
8. The orthogonal projection of R^3 on the x_2x_3-plane, P_{x_2x_3} : R^3 → R^3, P_{x_2x_3}(x, y, z) = (0, y, z),
is a linear mapping since
\[ P_{x_2x_3}\left( \begin{bmatrix} x \\ y \\ z \end{bmatrix} \right) = \begin{bmatrix} 0 \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} \]
and [P_{x_2x_3}] = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}. Its equations are: w_1 = 0, w_2 = y, w_3 = z.
9. The rotation operator of R^2 through a fixed angle θ,
\[ R_\theta : R^2 \to R^2, \quad R_\theta(x, y) = (x\cos\theta - y\sin\theta,\ x\sin\theta + y\cos\theta), \]
is a linear mapping since
\[ R_\theta\left( \begin{bmatrix} x \\ y \end{bmatrix} \right) = \begin{bmatrix} x\cos\theta - y\sin\theta \\ x\sin\theta + y\cos\theta \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \]
and [R_θ] = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}. Its equations are: w_1 = x cos θ − y sin θ, w_2 = x sin θ + y cos θ.
Indeed, the rotation operator R_θ rotates the point (x, y) = (r cos φ, r sin φ) counterclockwise
through the angle θ if θ > 0 and clockwise if θ < 0, the coordinates of the rotated point being
(w_1, w_2) = (r cos(θ + φ), r sin(θ + φ)), so that by the addition formulas one gets
w_1 = x cos θ − y sin θ and w_2 = x sin θ + y cos θ.
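The rotation matrix and its action on a vector can be sketched in a few lines (function names are ours):

```python
import math

def rotation_matrix(theta):
    """Standard matrix of the rotation of R^2 through angle theta."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta),  math.cos(theta)]]

def apply(M, x):
    """Matrix-vector product M x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

# Rotating (1, 0) counterclockwise through 90 degrees sends it to (0, 1):
w = apply(rotation_matrix(math.pi / 2), [1, 0])
print([round(c, 10) for c in w])  # [0.0, 1.0]
```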
10. The rotation operator of R^3 through a fixed angle θ about an oriented axis rotates each point
of R^3 about the axis of rotation in such a way that its associated vector sweeps out some
portion of the cone determined by the vector itself and by a vector which gives the direction
and the orientation of the considered oriented axis. The angle of the rotation is measured
at the base of the cone, clockwise or counterclockwise with respect to a viewpoint along the
axis looking toward the origin. As in R^2, positive angles generate counterclockwise rotations
and negative angles generate clockwise rotations. The counterclockwise sense of rotation can
be determined by the right-hand rule: if the thumb of the right hand points in the direction
of the oriented axis, then the cupped fingers point in the counterclockwise direction. The
rotation operators in R^3 are linear. For example:
(a) The counterclockwise rotation about the positive x-axis through an angle θ has the
equations w_1 = x, w_2 = y cos θ − z sin θ, w_3 = y sin θ + z cos θ; its standard matrix is
\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{bmatrix}. \]
(b) The counterclockwise rotation about the positive y-axis through an angle θ has the
equations w_1 = x cos θ + z sin θ, w_2 = y, w_3 = −x sin θ + z cos θ; its standard matrix is
\[ \begin{bmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{bmatrix}. \]
(c) The counterclockwise rotation about the positive z-axis through an angle θ has the
equations w_1 = x cos θ − y sin θ, w_2 = x sin θ + y cos θ, w_3 = z; its standard matrix is
\[ \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}. \]
(d) The homothety of ratio k ∈ R is the linear operator H_k : R^n → R^n, H_k(x) = kx. Its
standard matrix is
\[ \begin{bmatrix} k & 0 & \cdots & 0 \\ 0 & k & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & k \end{bmatrix} \]
and its equations are w_1 = kx_1, w_2 = kx_2, . . . , w_n = kx_n.
If 0 ≤ k ≤ 1, then H_k is called a contraction, and for k ≥ 1, H_k is called a dilatation.
• Let A, B be matrices of sizes k × n and m × k respectively, and let T_A : R^n → R^k, T_B : R^k → R^m
be the linear transformations of standard matrices A and B respectively, namely
T_A(x) = Ax, T_B(y) = By.
Then the composed mapping T_B ∘ T_A is also linear and T_B ∘ T_A = T_{BA}. Indeed,
\[ (T_B \circ T_A)(x) = T_B(T_A(x)) = T_B(Ax) = B(Ax) = (BA)x = T_{BA}(x). \]
We have just proved that [T_2 ∘ T_1] = [T_2][T_1] for any two linear mappings whose composition
T_2 ∘ T_1 is defined. Similarly, if the composed map T_3 ∘ T_2 ∘ T_1 of the linear mappings T_3, T_2, T_1
is defined, then [T_3 ∘ T_2 ∘ T_1] = [T_3][T_2][T_1].
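The identity T_B ∘ T_A = T_{BA} can be verified numerically on any compatible pair of matrices (the helper names and the sample matrices below are ours):

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def apply(M, x):
    """Matrix-vector product M x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

A = [[1, 0], [1, 1], [0, 2]]   # T_A : R^2 -> R^3  (A is 3x2)
B = [[1, 2, 3]]                # T_B : R^3 -> R^1  (B is 1x3)

x = [4, 5]
composed = apply(B, apply(A, x))  # (T_B o T_A)(x)
direct = apply(matmul(B, A), x)   # T_{BA}(x)
print(composed == direct)  # True
print(composed)            # [52]
```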
• Let T_1 : R^2 → R^2 be the reflection operator about the line y = x and let T_2 : R^2 → R^2 be
the orthogonal projection on the y-axis. Find T_1 ∘ T_2, T_2 ∘ T_1 and their standard matrices.
• Show that R_{θ_1} ∘ R_{θ_2} = R_{θ_2} ∘ R_{θ_1} and that it is the rotation through the angle θ_1 + θ_2.
3.3 Properties of linear transformations from Rn to Rm
Definition 3.3.1 A linear transformation T : R^n → R^m is said to be one-to-one if u_1, u_2 ∈ R^n, u_1 ≠ u_2 ⇒ T(u_1) ≠ T(u_2).
It follows that for each vector w in the range {T(x) : x ∈ R^n} of a one-to-one linear transformation
T : R^n → R^m there is exactly one vector x such that T(x) = w.
Theorem 3.3.2 If A is an n × n matrix and TA : Rn → Rn is the multiplication by A, then the
following statements are equivalent:
1. A is invertible.
2. The range of TA is Rn.
3. TA is one-to-one.
If T_A : R^n → R^n is a one-to-one linear operator, then the matrix A is invertible and T_{A^{-1}} : R^n → R^n
is also linear. Moreover,
\[ T_A(T_{A^{-1}}(x)) = AA^{-1}x = x = T_I(x) = \mathrm{id}_{R^n}(x), \quad \forall x \in R^n, \]
\[ T_{A^{-1}}(T_A(x)) = A^{-1}Ax = x = T_I(x) = \mathrm{id}_{R^n}(x), \quad \forall x \in R^n. \]
Consequently T_A is also invertible and its inverse is T_A^{-1} = T_{A^{-1}}. Therefore, for the standard matrix
of the inverse of a one-to-one linear operator T : R^n → R^n, we have [T^{-1}] = [T]^{-1}.
Examples 3.3.3 The rotation operator of R^2 through a fixed angle θ,
\[ R_\theta : R^2 \to R^2, \quad R_\theta(x, y) = (x\cos\theta - y\sin\theta,\ x\sin\theta + y\cos\theta), \]
is a linear mapping with standard matrix [R_θ] = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}, which is invertible
with
\[ [R_\theta]^{-1} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} = \begin{bmatrix} \cos(-\theta) & -\sin(-\theta) \\ \sin(-\theta) & \cos(-\theta) \end{bmatrix} = [R_{-\theta}]. \]
Consequently R_θ is invertible and R_θ^{-1} = R_{−θ}.
Theorem 3.3.4 A transformation T : R^n → R^m is linear if and only if
1. T(u + v) = T(u) + T(v) for all u, v ∈ R^n;
2. T(cu) = cT(u) for all u ∈ R^n and all scalars c.
Proof. Indeed, if T = T_A, then both properties follow from the rules of matrix arithmetic:
T_A(u + v) = A(u + v) = Au + Av = T_A(u) + T_A(v) and T_A(cu) = A(cu) = cAu = cT_A(u).
Conversely, if the given conditions are satisfied, then one can easily show that for any k vectors
u_1, . . . , u_k ∈ R^n we have T(u_1 + · · · + u_k) = T(u_1) + · · · + T(u_k). One can now show that T = T_A,
where A = [\,T(e_1) \mid \cdots \mid T(e_n)\,] and
\[ e_1 = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad e_2 = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}, \quad \ldots, \quad e_n = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}. \]
Indeed, for
\[ x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
= x_1 \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} + x_2 \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix} + \cdots + x_n \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}
= x_1 e_1 + x_2 e_2 + \cdots + x_n e_n \]
we have
\[ T(x) = T(x_1 e_1 + x_2 e_2 + \cdots + x_n e_n) = x_1 T(e_1) + x_2 T(e_2) + \cdots + x_n T(e_n) = Ax. \quad \Box \]