matrices - ma1s1 - maths.tcd.ietristan/ma1s1/slides/matrices-3-print.pdf · basic matrix notation...

Matrices

MA1S1

Tristan McLoughlin

November 9, 2014

Anton & Rorres: Ch 1.3-1.8

Basic matrix notation

We have studied matrices as a tool for solving systems of linear equationsbut now we want to study them in their own right. Let us start by recallingsome definitions:

A matrix is a rectangular array or table of numbers. We called theindividual numbers entries of the matrix and refered to them by their rowand column numbers. The rows are numbered 1, 2, . . . from the top and thecolumns are numbered 1, 2, . . . from left to right.So we use what you might think of as a

(row, column)

coordinate system for the entries of a matrix.In the example 1 1 2 5

1 11 13 −22 1 3 4

13 is the (2, 3) entry, the entry in row 2 and column 3.The matrix above is called a 3× 4 matrix because it has 3 rows and 4columns.

Basic matrix notation

We can talk about matrices of all different sizes such as[4 57 11

]2× 2

[47

]2× 1

[4 7

]1× 2

4 57 1113 13

3× 2

and in general we can have m× n matrices for any m ≥ 1 and n ≥ 1.

Matrices with just one row are called row matrices, e.g.

[x1, x2, . . . , xn]

While we use the term column matrix for a matrix with just one column.Here is an n× 1 (column) matrix

x1

x2

...xn

We consider two matrices to be the ‘same’ matrix only if they areabsolutely identical. They have to have the same shape (same number ofrows and same number of columns) and they have to have the samenumbers in the same positions.

Thus, all the following are different matrices[1 23 4

]6=[

2 13 4

]6=[

2 1 03 4 0

]6=

2 13 40 0

Double subscripts

When we want to discuss a matrix without listing the numbers in it, that iswhen we want to discuss a matrix that is not yet specified or an unknownmatrix we use a notation with double subscripts[

x11 x12

x21 x22

]This is a 2× 2 matrix where the (1, 1) entry is x11, the (1, 2) entry is x12

and so on.

Carrying this idea further, when we want to discuss an m× n matrix X andrefer to its entries we write

X =

x11 x12 · · · x1n

x21 x22 · · · x2n

.... . .

...xm1 xm2 · · · xmn

So the (i, j) entry of X is called xij .

Sometimes we want to write something like this but we don’t want to takeup space for the whole picture and we write an abbreviated version like

X = [xij ]1≤i≤m,1≤j≤n

To repeat what we said about when matrices are equal using this kind ofnotation, suppose we have two m× n matrices

X = [xij ]1≤i≤m,1≤j≤n and Y = [yij ]1≤i≤m,1≤j≤n

Then X = Y means the mn scalar equations xij = yij must all hold (foreach (i, j) with 1 ≤ i ≤ m, 1 ≤ j ≤ n). And if an m× n matrix equals anr × s matrix, we have to have m = r (same number or rows), n = s (samenumber of columns) and all the entries equal.

Arithmetic with matrices

In much the same way as we did with n-tuples we now define addition ofmatrices. We only allow addition of matrices that are of the same size. Twomatrices of different sizes cannot be added.

To add two matrices we simply add each corresponding component. Forexample 2 1

3 −40 7

+

6 −215 12−9 21

=

2 + 6 1 + (−2)3 + 15 −4 + 12

0 + (−9) 7 + 21

=

8 −118 8−9 28

More generally, if we take two m× n matrices

X = [xij ]1≤i≤m,1≤j≤n and Y = [yij ]1≤i≤m,1≤j≤n

then we defineX + Y = [xij + yij ]1≤i≤m,1≤j≤n

(the m× n matrix with the (1, 1) entry being the sum of the (1, 1) entries ofX and Y , the (1, 2) entry the sum of the (1, 2) entries of X and Y , and soon).

To define the scalar multiple kX, for a number k and a matrix X we justmultiply every entry of X by k.

For example

8

2 13 −40 7

=

8(2) 8(1)8(3) 8(−4)8(0) 8(7)

=

16 824 −320 56

The general definition is:

X = [xij ]1≤i≤m,1≤j≤n

is any m× n matrix and k is any real number then kX is another m× nmatrix. Specifically

kX = [kxij ]1≤i≤m,1≤j≤n

We see that if we multiply by k = 0 we get a matrix where all the entriesare 0.

The zero matrix

The m× n matrix where every entry is 0 is called the m× n zero matrix.Thus we have zero matrices of every possible size.

If X is a matrix then we can say

X + 0 = X

if 0 means the zero matrix of the same size as X. If we wanted to make thenotation less ambiguous, we could write something like 0m,n for the m× nzero matrix. Then things we can note are that if X is any m× n matrixthen

X + 0m,n = X, 0X = 0m,n

We will not usually go to the lengths of indicating the size of the zeromatrix we mean in this way. We will write the zero matrix as 0 and try tomake it clear what size matrices we are dealing with from the context.

Matrix transformations

Let us temporarily return to our study of linear equations. Consider theequations:

y1 = x1 + 2x2

y2 = + 3x2

We can think of these equations as a matrix equation[y1y2

]=

[x1 + 2x2

3x2

]This can be viewed as a rule for transforming the vector

x =

[x1

x2

]into the vector y =

[y1y2

].

The transformation is described by the matrix

M =

[1 20 3

]

Matrix transformations

If we write this as y = Mx this tells us how to multiply matrices:[y1y2

]=

[1 20 3

] [x1

x2

]=

[x1 + 2x2

3x2

]

Now we will generalise this to multiplying arbitrary matrices.

Matrix multiplication

This is a rather new thing, compared to the ideas we have discussed up tonow. As we saw certain matrices can be multiplied and their product isanother matrix.

If X is an m× n matrix and Y is an n× p matrix then the product XY willmake sense and it will be an m× p matrix.

For example, then [1 2 34 5 6

] 1 0 1 −22 −1 3 14 2 6 4

is going to make sense. It is the product of

2× 3 by 3× 4

and the result is going to be 2× 4. (We have to have the same number ofcolumns in the left matrix as rows in the right matrix. The outer numbers,the ones left after ‘cancelling’ the same number that occurs in the middle,give the size of the product matrix.)

As an example of a product that is not defined and will not make sense[1 2 34 5 6

] [7 89 10

]2× 3 by 2× 2

Back to the example that will make sense, what we have explained so far isthe shape of the product[

1 2 34 5 6

] 1 0 1 −22 −1 3 14 2 6 4

=

[z11 z12 z13 z14z21 z22 z23 z24

]

and we still have to explain how to calculate the zij , the entries in theproduct.

We’ll concentrate on one example to try and show the idea. Say we look atthe entry z23, the (2, 3) entry in the product. What we do is take row 2 ofthe left matrix ‘times’ column 3 of the right matrix[

1 2 34 5 6

] 1 0 1 −22 −1 3 14 2 6 4

=

[z11 z12 z13 z14z21 z22 z23 z24

]

The way we multiply the row[

4 5 6]

times the column 136

is a very much reminiscent of a dot product

(4)(1) + (5)(3) + (6)(6) = z23

In other words z23 = 55[1 2 34 5 6

] 1 0 1 −22 −1 3 14 2 6 4

=

[z11 z12 z13 z14z21 z22 55 z24

]

If we calculate all the other entries in the same sort of way (row i on theleft times column j on the right gives zij) we get

[1 2 34 5 6

] 1 0 1 −22 −1 3 14 2 6 4

=

[17 4 25 1238 7 55 21

]

As we discussed the only way to get used to the way to multiply matrices isto do some practice. Consider the two matrices:

A =

[4 1 120 1 4

], B =

5 41 1−2 3

As these are 2× 3 and a 3× 2 matrix they can be multiplied to give 2× 2.

So we have:

AB =

[4(5) + 1(1) + 12(−2) 4(4) + 1(1) + 12(3)0(5) + 1(1) + 4(−2) 0(4) + 1(1) + 4(3)

]=

[−3 53−7 13

]

It is possible to explain in a succinct formula what the rule is forcalculating the entries of the product matrix.

In the equationx11 x12 · · · x1n

x21 x22 · · · x2n

.... . .

...xm1 xm2 · · · xmn

y11 y12 · · · y1py21 y22 · · · y2p...

. . ....

yn1 yn2 · · · ynp

=

z11 z12 · · · z1pz21 z22 · · · z2p...

. . ....

zm1 zm2 · · · zmp

the (i, k) entry zik of the product is got by taking the dot product of the ith

row [xi1 xi2 . . . xin] of the first matrix times the kth column

y1ky2k...

ynk

of the

second.

In shortzik = xi1y1k + xi2y2k + · · ·+ xinynk

or with the Sigma notation for sums, we can rewrite this as

n∑j=1

xijyjk = zik (for 1 ≤ i ≤ m, 1 ≤ k ≤ p).

Hence for X = [xij ]1≤i≤m,1≤j≤n and Y = [yij ]1≤i≤n,1≤j≤p we have

XY = Z

where

Z =

[n∑

j=1

xijyjk

]1≤i≤m,1≤k≤p

.

In previous weeks we considered the system of m linear equations in nunknowns:

a11x1 + a12x2 + · · ·+ a1nxn = b1

a21x1 + a22x2 + · · ·+ a2nxn = b2...

......

......

am1x1 + am2x2 + · · ·+ amnxn = bm

If we introduce the m× n coefficient matrix

A =

a11 a12 . . . a1n

a21 a22 . . . a2n

......

......

am1 am2 . . . amn

and the column vectors (or n× 1 and m× 1 matrices)

x =

x1

x2

...xn

, b =

b1b2...bm

Then we can write the equations as

Ax = b .

This in fact gives us another way to think about matrix multiplication aswe can write

Ax =

a11x1 + a12x2 + · · ·+ a1nxn

a21x1 + a22x2 + · · ·+ a2nxn

......

......

am1x1 + am2x2 + · · ·+ amnxn

= x1

a11

a21

...am1

+ x2

a12

a22

...am2

+ · · ·+ xn

a1n

a2n

...amn

Which can be stated as:We can express the product Ax as a linear combination of the columnvectors of A with coefficients given by the entries of x.

In fact the same picture can be used when multiplying general matrices.

We can phrase the earlier construction in an equivalent fashion by sayingthe in the product of matrices A and B, the first column vector of AB isthe linear combination of column vectors of A with coefficients from thefirst row of B. The second column vector is given by the column vectors ofA with coefficients from the second row of B. And so on.

For example:[1 2 42 6 0

] 4 1 4 30 −1 3 12 7 5 2

=

[12 27 30 138 −4 26 12

]

Where [128

]= 4

[12

]+ 0

[26

]+ 2

[40

]and [

27−4

]=

[12

]−[

26

]+ 7

[40

]and so on.


Recall how we multiply two matrices together: Given two matrices A andB,

A =

[2 2 85 4 3

], B =

3 51 −12 1

we can multiply them to get a 2× 2 matrix as follows:

AB =

[2 2 85 4 3

] 3 51 −12 1

=

[2(3) + 2(1) + 8(2) 2(5) + 2(−1) + 8(1)5(3) + 4(1) + 3(2) 5(5) + 4(−1) + 3(1)

]


Recall how we multiply two matrices together: Given two matrices A andB,

A =

[2 2 85 4 3

], B =

3 51 −12 1

we can multiply them to get a 2× 2 matrix as follows:

AB =

[2 2 85 4 3

] 3 51 −12 1

=

[2(3) + 2(1) + 8(2) 2(5) + 2(−1) + 8(1)5(3) + 4(1) + 3(2) 5(5) + 4(−1) + 3(1)

]=

[24 1625 24

]

Properties of matrix multiplication

Matrix multiplication has the properties that you would expect of anymultiplication and the standard rules of algebra work out as long as youkeep the order of the products intact:

(i) If A and B are both m× n matrices and C is n× p, then

(A + B)C = AC + BC

(This is the right distributive law for multiplication.)

(ii) If A is an m× n matrices and B and C are both n× p then

A(B + C) = AB + AC .

(This is the left distributive law for multiplication.)

(iii) If k is a scalar, and A is an m× n matrices while C is n× p then

(kA)C = k(AC) = A(kC) .

(iv) If A is an m× n matrices and B is n× p and C is p× q, then the twoways of calculating ABC work out the same:

(AB)C = A(BC)

(This is known as the associative law for multiplication.)

While most of these properties are fairly obvious let us try to prove the firstone: From the definition of matrix multiplication and addition the entriesof (A + B)C are given by

[(A + B)C]ij =

n∑k=1

(aik + bik)ckj

but because of the distributivity of normal multiplication

[(A + B)C]ij =n∑

k=1

(aik + bik)ckj

=n∑

k=1

(aikckj + bikckj)

=

n∑k=1

aikckj +

n∑k=1

bikckj

= [AC]ij + [BC]ij .

Where on the right we have the entries of the matrix AC + BC.

We won’t prove the property of associativity instead we will give anexample:

We saw before that AB 6= BA in general for matrices.

In a bit more detail the situation is as follows:

(a) BA does not have to make sense if AB makes sense.

For example: As we saw, if A is a 3× 4 matrix and B is 3× 3, thenAB does not make sense. But BA is a product of a 3× 3 times a 3× 4— so it makes sense.

(b) It can be that AB and BA both make sense but they are different sizes.

For example: If A is a 2× 3 matrix and B is a 3× 2 matrix, then ABis 2× 2 while BA is 3× 3. As they are different sizes AB and BA arecertainly not equal.

(c) The more tricky case is the case where the matrices A and B are squarematrices of the same size.

A square matrix is an n× n matrix for some n. Notice that the productof two n× n matrices is another n× n matrix. Still, it is usually notthe case that AB = BA when A and B are n× n.



(a) BA does not have to make sense if AB makes sense.For example: As we saw, if A is a 3× 4 matrix and B is 3× 3, thenAB does not make sense. But BA is a product of a 3× 3 times a 3× 4— so it makes sense.









(b) It can be that AB and BA both make sense but they are different sizes.For example: If A is a 2× 3 matrix and B is a 3× 2 matrix, then ABis 2× 2 while BA is 3× 3. As they are different sizes AB and BA arecertainly not equal.









(c) The more tricky case is the case where the matrices A and B are squarematrices of the same size.A square matrix is an n× n matrix for some n. Notice that the productof two n× n matrices is another n× n matrix. Still, it is usually notthe case that AB = BA when A and B are n× n.

There are some perhaps non-intuitive consequences of the properties ofmatrix multiplication. For example for real numbers:

• If ab = ac and a 6= 0 then b = c.

• If ab = 0 then at least one of a or b is 0.

Neither of these is true for matrices.

Consider

Identity Matrices

There are some special square matrices which deserve a special name.We’ve already seen the zero matrix (which makes sense for any size — canbe m× n and need not be square).

Another special matrix is the n× n identity matrix which we denote by In.So

I2 =

[1 00 1

], I3 =

1 0 00 1 00 0 1

, I4 =

1 0 0 00 1 0 00 0 1 00 0 0 1

and in general In is the n× n matrix with 1 in all the ‘diagonal’ entries andzeroes off the diagonal.

By the diagonal entries of an n× n matrix we mean the (i, i) entries fori = 1, 2, . . . , n. We do not talk of the diagonal for rectangular matrices.

Identity Matrices

The reason for the name is that the identity matrix is a multiplicativeidentity.

That is ImA = A and A = AIn for any m× n matrix A. For example[1 00 1

] [2 53 4

]=

[2 53 4

]and [

2 53 4

] [1 00 1

]=

[2 53 4

].

Identity matrices show up naturally in the study of the reduced row echelonform of square (i.e. n× n) matrices.

If R is the reduced row echelon form of an n× n matrix then either R hasone or more rows of zeros or R is the n× n identity matrix In.

Inverse matrices — basic ideas

Definition: If A is an n× n matrix, then another n× n matrix C is calledthe inverse matrix for A if it satisfies

AC = In and CA = In.

We write A−1 for the inverse matrix C (if there is one).

The idea for this definition is that the identity matrix is analogous to thenumber 1, in the sense that 1k = k1 = k for every real number k whileAIn = InA = A for every n× n matrix A.

Then the key thing about the reciprocal of a nonzero number k is that theproduct (

1

k

)k = 1

We insist that the inverse should work on both sides but we will see atheorem that says that if A and C are n× n matrices and AC = In, thenautomatically CA = In must also hold.

Inverse matrices — basic ideas

When does there exist an inverse for a given matrix A?It is not enough that A should be nonzero. One way to see this is to look ata system of n linear equations in n unknowns written in matrix form.

If the n× n matrix A has an inverse matrix C then we can multiply bothsides of the equation Ax = b by C from the left to get

Ax = b

⇒ C(Ax) = Cb

⇒ (CA)x = Cb

⇒ Inx = Cb

⇒ x = Cb

So we find that the system of n equation in n unknowns given by Ax = b,for any right hand side b, will just have the one solution x = Cb.

Invertible Matrices

For a system of n equation in n unknowns given by

Ax = b

where A is an invertible matrix with inverse C and for any right hand sideb, will just have the one solution x = Cb. We have seen for many systemsof linear equations there are infinite families of solutions or sometimes nosolutions.

This amounts to a significant restriction on A and hence not all matriceshave inverses. We are led to the definition:

Definition. An n× n matrix A is called invertible if there is an n× ninverse matrix for A.

We now consider how to find the inverse of a given matrix A. The methodwill work quite efficiently for large matrices as well as for small ones.

Invertible Matrices - an example

To make things more concrete, we’ll thing about a specific example

A =

[2 32 5

]How can we find

C =

[c11 c12c21 c22

]so that AC = I2.

Writing out that equation we want C to satisfy we get

AC =

[2 32 5

] [c11 c12c21 c22

]=

[1 00 1

]= I2

If you think of how matrix multiplication works, this amounts to twodifferent equations for the columns of C[

2 32 5

] [c11c21

]=

[10

]and

[2 32 5

] [c12c22

]=

[01

]According to the reasoning we used above to get to equation Ax = b, eachof these represents a system of 2 linear equations in 2 unknowns:

2c11 + 3c21 = 1

2c11 + 5c21 = 0

and

2c12 + 3c22 = 0

2c12 + 5c22 = 1

We know how to solve these!

To solve them we can use Gauss-Jordan elimination (or Gaussianelimination) twice, once for the augmented matrix for the first system ofequations, [

2 3 : 12 5 : 0

]and again for the second system[

2 3 : 02 5 : 1

]

If we were to write out the steps for the Gauss-Jordan eliminations, we’dfind that we were repeating the exact same steps the second time as thefirst time.

There is a trick to solve at once two systems of linear equations, where thecoefficients of the unknowns are the same in both, but the right hand sidesare different.

The trick is to write both columns after the dotted line, like this[2 3 : 1 02 5 : 0 1

]We row reduce this matrix[

1 32

: 12

02 5 : 0 1

]Old Row 1× 1

2[1 3

2: 1

20

0 2 : −1 1

]Old Row 2− 2×Old Row 1[

1 32

: 12

00 1 : − 1

212

]Old Row 2× 1

2

this is now in row echelon form now. We now perform the final steps ofGauss-Jordan[

1 0 : 54− 3

4

0 1 : − 12

12

]Old Row 1− 3

2×Old Row 2

This is in reduced row echelon form so Gauss-Jordan is finished.

The first column after the dotted line gives the solution to the first system,the one for the first column of C. The second column after the dotted linerelates to the second system, the one for the second column of C.

That means we have[c11c21

]=

[54

− 12

]and

[c12c22

]=

[− 3

412

]So we find that the matrix C has to be

C =

[54− 3

4

− 12

12

]

We can multiply out and check that it is indeed true that AC = I2 (whichhas to be the case unless we made a mistake) and that CA = I2 (a factwhich which has to be true automatically as we will see).

AC =

[2 3

2 5

][54− 3

4

− 12

12

]=

[2(54

)+ 3

(− 1

2

)2(− 3

4

)+ 3

(12

)2(54

)+ 5

(− 1

2

)2(− 3

4

)+ 5

(12

)]

=

[1 00 1

]

CA =

[54− 3

4

− 12

12

][2 3

2 5

]=

[(54

)(2) +

(− 3

4

)(2)

(54

)(3) +

(− 3

4

)(5)(

− 12

)(2) +

(12

)(2)

(− 1

2

)(3) +

(12

)(5)

]

=

[1 00 1

]Which is exactly the required property of the inverse matrix!

Invertible 2 Matrices

Let’s consider the most general 2× 2 matrix:

A =

[a bc d

]and use the above method. In this case the augmented matrix is[

1 ba

: 1a

0

c d : 0 1

]Old Row 1× 1

a

⇒

[1 b

a: 1

a0

0 d− c ba

: −c 1a

1

]Old Row 2− c×Old Row 1

⇒

[1 b

a: 1

a0

0 1 : −cad−cb

aad−cb

]1

d− c ba

×Old Row 2

This ends Gaussian elimination. Now we do the remaining steps ofGauss-Jordan.

Removing the entry above the leading one in row two:[1 0 : 1

a

(1− bc

ad−cb

)−b

ad−cb

0 1 : −cad−cb

aad−cb

]Old Row 1− b

a×Old Row 2

=

[1 0 : d

ad−cb−b

ad−cb

0 1 : −cad−cb

aad−cb

]Thus we see that the inverse of an arbitrary 2× 2 matrix, A, is

A−1 =1

ad− bc

[d −b−c a

]Here we can see the determinant of the 2× 2 matrix appear:

det(A) = ad− bc

(note the down-right, plus sign, down-left minus sign rule works).

This general approach works for larger matrices too. If we start with ann× n matrix

A =

a11 a12 · · · a1n

a21 a22 · · · a2n

.... . .

...an1 an2 · · · ann

and we look for an n× n matrix

C =

c11 c12 · · · c1nc21 c22 · · · c2n...

. . ....

cn1 cn2 · · · cnn

where AC = In, we want

AC =

a11 a12 · · · a1n

a21 a22 · · · a2n

.... . .

...an1 an2 · · · ann

c11 c12 · · · c1nc21 c22 · · · c2n...

. . ....

cn1 cn2 · · · cnn

=

1 0 · · · 00 1 · · · 0...

. . ....

0 0 · · · 1

= In

This means that the columns of C have to satisfy systems of n linearequations in n unknowns of the form

A(jth column of C) = jth column of In

We can solve all of these n systems of equations together because they havethe same matrix A of coefficients for the unknowns.

We do this by writing an augmented matrix where there are n columnsafter the dotted line. The columns to the right of the dotted line, the righthand sides of the various systems we want to solve to find the columns ofC, are going to be the columns of the n× n identity matrix.

Method for finding the inverse A−1 of an n× n matrix A.Use Gauss-Jordan elimination to row reduce the augmented matrix

[A | In] =

a11 a12 · · · a1n : 1 0 · · · 0a21 a22 · · · a2n : 0 1 · · · 0...

. . .... :

.... . .

...an1 an2 · · · ann : 0 0 · · · 1

We should end up with a reduced row echelon form that looks like

1 0 · · · 0 : c11 c12 · · · c1n0 1 · · · 0 : c21 c22 · · · c2n...

. . .... :

.... . .

...0 0 · · · 1 : cn1 cn2 · · · cnn

or in summary [In |A−1].

If we don’t end up with a matrix of the form [In |C] it means that there isno inverse for A.

Elementary matrices

We now make a link between elementary row operations and matrixmultiplication.

Recall now the 3 types of elementary row operations:

(i) multiply all the numbers is some row by a nonzero factor (and leaveevery other row unchanged)

(ii) replace any chosen row by the difference between it and a multiple ofsome other row.

(iii) Exchange the positions of some pair of rows in the matrix.

Definition: An n× n elementary matrix E is the result of applying asingle elementary row operation to the n× n identity matrix In.

Examples. We use n = 3 in these examples. Recall

I3 =

1 0 00 1 00 0 1

(i) Row operation: Multiply row 2 by −5. Corresponding elementary

matrix

E =

1 0 00 −5 00 0 1

(ii) Row operation: Add 4 times row 1 to row 3 (same as subtracting (−4)

times row 1 from row 3). Corresponding elementary matrix

E =

1 0 00 1 04 0 1

(iii) Row operation: swap rows 2 and 3.

E =

1 0 00 0 10 1 0

Row operations and Elementary Matrices

The idea is that if A is an m× n matrix, then doing one single rowoperation on A is equivalent to multiplying A on the left by an elementarymatrix E (to get EA), and E should be the m×m elementary matrix forthat same row operation.

Row operations and Elementary Matrices

Examples. We use the following A to illustrate this idea,

A =

1 2 3 45 6 7 89 10 11 12

(1) Row operation: Add (−5) times row 1 to row 2.

The corresponding E (let’s call it E1):

E1 =

1 0 0−5 1 00 0 1

and so E1A is

E1A =

1 0 0−5 1 00 0 1

1 2 3 45 6 7 89 10 11 12

=

1 2 3 40 −4 −8 −129 10 11 12

(Same as doing the row operation to A.)

(2) Row operation: Suppose in addition we also want to add (−9) times row1 to row 3. In the context of multiplying by elementary matrices, weneed a different elementary matrix for the second step, let’s call it E2:

E2 =

1 0 00 1 0−9 0 1

What we want in order to do first one and then the next row operationis

E2E1A =

1 0 00 1 0−9 0 1

E1A =

1 0 00 1 0−9 0 1

1 2 3 40 −4 −8 −129 10 11 12

=

1 2 3 40 −4 −8 −120 −8 −16 −24

where E1 is the elementary matrix we used first.

So the first row operation changes A to E1A, and then the second changesthat to E2E1A.

If we do a whole sequence of several row operations (as we would do if wefollowed the Gaussian elimination recipe further) we can say that the endresult after k row operations is that we get

EkEk−1 . . . E3E2E1A

where Ei is the elementary matrix for the ith row operation we did.

Elementary matrices are invertible

We heard before all elementary row operations are reversible by anotherelementary row operation. It follows that every elementary matrix E has aninverse that is another elementary matrix.

Elementary matrices are invertible

For example, take E to be the 3× 3 elementary matrix corresponding thethe row operation “add (−5) times row 1 to row 2”. So

E =

1 0 0−5 1 00 0 1

Then the reverse row operation is “add 5 times row 1 to row 2”, and theelementary matrix for that is

E =

1 0 05 1 00 0 1

Thinking in terms of row operations, or just by multiplying out thematrices we see that the result of applying second row operation to E:

EE =

1 0 00 1 00 0 1

= I3

and EE = I3 also.

Theory about invertible matrices

Proposition

If A is an invertible n×n matrix, then its inverse A−1 is also invertible and

(A−1)−1 = A

Proof.What we know about the inverse A−1 (from its definition) is that

AA−1 = In and A−1A = In

In words, the inverse of A is a matrix with the property that A times theinverse and the inverse times A are both the identity matrix In.But looking at the two equations again and focussing on A−1 rather thanon A, we see that there is a matrix which when multiplied by A−1 on theright or the left gives In. And that matrix is A. So A−1 has an inverse andthe inverse is A.

This is an example of a mathematical proof of a theorem. Theorems are themathematical analogue of the laws of science (the second law ofthermodynamics, Boyle’s law, Newtons Laws and so on), but there is adifference.

In Science, a law is a way of summarising observations made in experimentsor a conjecture based on some principle. The law should then be checkedwith further experiments and if it checks out, it becomes accepted as a fact.

Such laws have to have a precise statement for them to work. Roughly theysay that given a certain situation, some particular effect or result willhappen. Sometimes the “certain situation” may be somewhat idealised. Soone may interpret the law as saying that the effect or result should be veryclose to the observed effect or result if the situation is almost exactly valid.

Sometimes we don’t even know what approximations or idealisations we aremaking in stating our physical laws and when we do new experiments indifferent arenas we discover our laws are only approximations and don’t infact hold as stated in general.

In mathematics we expect our results to be exactly true as long as ourassumptions hold, furthermore our laws, once proven, will always be true asstated.

TheoremProducts of invertible matrices are invertible, and the inverse of the productis the product of the inverses taken in the reverse order.In more mathematical language, if A and B are two invertible n× nmatrices, then AB is invertible and (AB)−1 = B−1A−1.

Proof.Start with any two invertible n× n matrices A and B, and look at

(AB)(B−1A−1) = A(BB−1)A−1 = AInA−1 = AA−1 = In

And look also at

(B−1A−1)(AB) = B−1(B−1B)A = B−1InB = B−1B = In

This shows that B−1A−1 is the inverse of AB (because multiplying AB byB−1A−1 on the left or the right gives In). So it shows that (AB)−1 exists,or in other words that AB is invertible, as well as showing the formula for(AB)−1.

Theorem (equivalent ways to see that a matrix is invertible)

Let A be an n× n (square) matrix.The following are equivalent statements about A, meaning that is any one ofthem is true, then the others have to be true as well. (And if one is nottrue, the others must all be not true.)

(a) A is invertible (has an inverse)

(b) the equation Ax = 0 (where x is an unknown n× 1 column matrix, 0 isthe n× 1 zero column) has only the solution x = 0

(c) the reduced row echelon for for A is In

(d) A can be written as a product of elementary matrices

In principle there are a lot of things to prove in the theorem.

Staring with any one of the 4 items, assuming that that statement is validfor a given n× n matrix A, we should provide a line of logical reasoningwhy all the other items have to be also true about that same A. We don’tdo this by picking examples of matrices A, but by arguing about a matrixwhere we don’t specifically know any of the entries.

But we then have 4 times 3 little proofs to give, 12 proofs in all. So itwould be long even if each individual proof is very easy.

There is a trick to reduce the number of proofs from 12 to only 4. We provea cyclical number of steps

(a) ⇒ (b)⇑ ⇓

(d) ⇐ (c)

The idea then is to prove 4 things only

(a) ⇒ (b) In this step we assume only that statement (a) is true aboutA, and then we show that (b) must also be true.

(b) ⇒ (c) In this step we assume only that statement (b) is true aboutA, and then we show that (c) must also be true.

(c) ⇒ (d) Similarly we assume (c) and show (d) must follow.

(d) ⇒ (a) In the last step we assume (d) and show that (a) mustfollow.

When we have done this we will be able to deduce all the statements fromany one of the 4.

Starting with (say) the knowledge that (c) is a true statement the thirdstep above shows that (d) must be true. Then the next step tells us (a)must be true and the first step then says (b) must be true. In other words,starting at any point around the ring

(a) ⇒ (b)⇑ ⇓

(d) ⇐ (c)

(or at any corner of the square) we can work around to all the others.

Link 1


(b) the equation Ax = 0 (where x is an unknown n× 1 column matrix, 0 isthe n× 1 zero column) has only the solution x = 0

Proof: (a) ⇒ (b).

Assume A is invertible and A−1 is its inverse.Consider the equation Ax = 0 where x is some n× 1 matrix and 0 is then× 1 zero matrix. Multiply both sides by A−1 on the left to get

A−1Ax = A−10

Inx = 0

x = 0

Therefore x = 0 is the only possible solution of Ax = 0.

Link 2

(b) the equation Ax = 0 (where x is an unknown n× 1 column matrix, 0is the n× 1 zero column) has only the solution x = 0

(c) the reduced row echelon for for A is In

Proof: (b) ⇒ (c).

Assume now that x = 0 is the only possible solution of Ax = 0.That means that when we solve Ax = 0 by using Gauss-Jordan eliminationon the augmented matrixA |

00...0

= [A | 0]

we can’t end with free variables.We know that we must end up with a reduced row echelon form that has asmany leading ones as there are unknowns. Since we are dealing with nequations in n unknowns, that means A row reduces to In.

Link 3

(c) the reduced row echelon for for A is In(d) A can be written as a product of elementary matrices

Proof: (c) ⇒ (d).

Suppose now that A row reduces to In.Write down an elementary matrix for each row operation we need torow-reduce A to In. Say they are E1, E2, . . . , Ek. Recall that allelementary matrices have inverses. So we must have

EkEk−1 . . . E2E1A = In

E−1k EkEk−1 . . . E2E1A = E−1

k In

InEk−1 . . . E2E1A = E−1k

Ek−1 . . . E2E1A = E−1k

E−1k−1Ek−1 . . . E2E1A = E−1

k−1E−1k

Ek−2 . . . E2E1A = E−1k−1E

−1k

So, when we keep going in this way, we end up with

A = E−11 E−1

2 . . . E−1k−1E

−1k

So we have (d) because inverses of elementary matrices are againelementary matrices.

Link 4

(d) A can be written as a product of elementary matrices


Proof: (d) ⇒ (a).

If A is a product of elementary matrices,

A = EkEk−1 . . . E2E1

then

E−11 E−1

2 . . . E−1k−1E

−1k A = E−1

1 E−12 . . . E−1

k−1E−1k EkEk−1 . . . E2E1

= E−11 E−1

2 . . . E−1k−1InEk−1 . . . E2E1

= E−11 E−1

2 . . . E−1k−2InEk−2 . . . E2E1

= In

we can use the earlier facts that the inverse of the product is the product ofthe inverses in the reverse order to show that A is invertible and it’s inverseis

A−1 = E−11 E−1

2 . . . E−1k−1E

−1k

So we get (a).

Summary

Hence we have shown every link in our proof

(a) ⇒ (b)⇑ ⇓

(d) ⇐ (c)

as required by the theorem.

TheoremIf A and B are two n× n matrices and if AB = In, then BA = In.

Proof.The idea is to apply the previous theorem to the matrix B rather than to A.Consider the equation Bx = 0 (where x and 0 are n× 1). Multiply thatequation by A on the left to get

ABx = B0

Inx = 0

x = 0

So x = 0 is the only possible solution of Bx = 0.That means B satisfies condition (b) of the previous Theorem. Thus by thetheorem, B−1 exists. Multiply the equation AB = In by B−1 on the rightto get

ABB−1 = InB−1

AIn = B−1

A = B−1

So, we getBA = BB−1 = In.

Special matrices

There are matrices that have a special form that makes calculations withthem much easier than the same calculations are as a rule.

Diagonal matricesFor square matrices (that is n× n for some n) A = (aij)

ni,j=1 we say that A

is a diagonal matrix if aij = 0 whenever i 6= j. Thus in the first few casesn = 2, 3, 4 diagonal matrices look like[

a11 00 a22

]a11 0 0

0 a22 00 0 a33

a11 0 0 00 a22 0 00 0 a33 00 0 0 a44

Examples with numbers4 0 00 −2 00 0 13

,

−1 0 00 0 00 0 4

.

(These are 3× 3 examples.)Diagonal matrices are easy to multiply4 0 0

0 5 00 0 6

−1 0 00 12 00 0 4

=

−4 0 00 60 00 0 24

a11 0 0

0 a22 00 0 a33

b11 0 00 b22 00 0 b33

=

a11b11 0 00 a22b22 00 0 a33b33

The idea is that all that needs to be done is to multiply the correspondingdiagonal entries to get the diagonal entries of the product (which is againdiagonal).

Based on this we can rather easily figure out how to get the inverse of adiagonal matrix. For example if

A =

4 0 00 5 00 0 6

then

A−1 =

14

0 00 1

50

0 0 16

because if we multiply these two diagonal matrices we get the identity.We could also figure out A−1 the usual way, by row-reducing [A | I3]. Thecalculation is actually quite easy. Starting with

[A | I3] =

4 0 0 : 1 0 00 5 0 : 0 1 00 0 6 : 0 0 1

we just need to divide each row by something to get to

[A | I3] =

1 0 0 : 14

0 00 1 0 : 0 1

50

0 0 1 : 0 0 16

Special Matrices

Upper triangular matricesThis is the name given to square matrices where all the non-zero entries areon or above the diagonal.

A 4× 4 example is

A =

4 −3 5 60 3 7 −90 0 0 60 0 0 −11

Another way to express it is that all the entries that are definitely belowthe diagonal have to be 0. Some of those on or above the diagonal can bezero also. They can all be zero and then we would have the zero matrix,which would be technically upper triangular. All diagonal matrices are alsocounted as upper triangular.

The precise statement then is that an n× n matrix

A =

a11 a12 · · · a1n

a21 a22 · · · a2n

.... . .

...an1 an2 · · · ann

is upper triangular when

aij = 0 whenever i > j.

It is fairly easy to see that if A and B are two n× n upper triangularmatrices, then

the sum A + B and the product AB

are both upper triangular.

Example

Let us consider the upper triangular matrices:

A =

3 4 5 60 7 8 90 0 1 20 0 0 3

, B =

1 2 −2 50 4 −1 50 0 8 20 0 0 1

Then the sum is

A + B =

4 6 3 110 11 7 140 0 9 40 0 0 4

while the product is

AB = ?

Example

Let us consider the upper triangular matrices:

A =

3 4 5 60 7 8 90 0 1 20 0 0 3

, B =

1 2 −2 50 4 −1 50 0 8 20 0 0 1

Then the sum is

A + B =

4 6 3 110 11 7 140 0 9 40 0 0 4

while the product is

AB =

3 22 30 510 28 57 600 0 8 40 0 0 3

.

Also inverting upper triangular matrices is relatively painless because theGaussian elimination parts of the process are almost automatic.

As an example, we look at the (upper triangular)

A =

3 4 5 60 7 8 90 0 1 20 0 0 3

We should row reduce

3 4 5 6 : 1 0 0 00 7 8 9 : 0 1 0 00 0 1 2 : 0 0 1 00 0 0 3 : 0 0 0 1

and the first few steps are to divide row 1 by 3, row 2 by 7 and row 4 by 3,to get

1 43

53

2 : 13

0 0 00 1 8

797

: 0 17

0 00 0 1 2 : 0 0 1 00 0 0 1 : 0 0 0 1

3

This is then already in row echelon form and to get the inverse we need toget to reduced row echelon form (starting by clearing out above the lastleading 1, then working back up). The end result should be

1 0 0 0 : 13− 4

21− 1

70

0 1 0 0 : 0 17

− 87

13

0 0 1 0 : 0 0 1 − 23

0 0 0 1 : 0 0 0 13

It is quite easy to see that an upper triangular matrix is invertible exactlywhen the diagonal entries are all nonzero. Another way to express thissame thing is that the product of the diagonal entries should be nonzero.

It is also easy enough to see from the way the above calculation of theinverse worked out that the inverse of an upper triangular matrix will beagain upper triangular.

Strictly upper triangular matrices

These are matrices which are upper triangular and also have all zeros onthe diagonal. This can also be expressed by saying that there should bezeros on and below the diagonal.The precise statement then is that an n× n matrix

A =

a11 a12 · · · a1n

a21 a22 · · · a2n

.... . .

...an1 an2 · · · ann

is strictly upper triangular when

aij = 0 whenever i ≥ j.

Example

An example is

A =

0 1 20 0 30 0 0

This matrix is certainly not invertible. To be invertible we need eachdiagonal entry to be nonzero while his matrix is at the other extreme — alldiagonal entries are 0.

For this matrix

A2 = AA =

0 1 20 0 30 0 0

0 1 20 0 30 0 0

=

0 0 30 0 00 0 0

and

A3 = AA2 =

0 1 20 0 30 0 0

0 0 30 0 00 0 0

=

0 0 00 0 00 0 0

= 0

In fact this is not specific to the example. Every strictly upper triangularmatrix

A =

0 a12 a13

0 0 a23

0 0 0

has

A2 =

0 0 a12a23

0 0 00 0 0

and A3 = 0.

In general an n× n strictly upper triangular matrix A has An = 0. This isan example of nilpotent matrix.

DefinitionA square matrix A is called nilpotent if some power of A is the zero matrix.

This shows again a significant difference between ordinary multiplication ofnumbers and matrix multiplication. It is not true that AB = 0 means thatA or B has to be 0.

The question of which matrices have an inverse is also more complicatedthan it is for numbers. Every nonzero number has a reciprocal, but thereare many nonzero matrices that fail to have an inverse.

Given we’ve discussed upper triangular matrices (and strictly uppertriangular)we might also discuss lower triangular matrices. In fact we couldrepeat most of the same arguments for them, with small modifications, butthere is an operation to flip one to the other

Transposes

The transpose of a matrix is what you get by writing the rows as columns.More precisely, we can take the transpose of any m× n matrix A,

A =

a11 a12 · · · a1n

a21 a22 · · · a2n

.... . .

am1 am2 · · · amn

by writing the entries of the first row a11, a12, . . . , a1n down the first columnof the transpose, the entries a21, a22, . . . , a2n of the second row down thesecond column, etc. We get a new n×m matrix, which we denote At

At =

a11 a21 · · · am1

a12 a22 · · · am2

.... . .

a1n a2n · · · anm

Another way to describe it is that the (i, j) entry of the transpose in aji =the (j, i) entry of the original matrix.Examples are

A =

[a11 a12 a13

a21 a22 a23

], At =

a11 a21

a12 a22

a13 a32

A =

4 5 67 8 910 11 12

, At =

4 7 105 8 116 9 12

Yet another way to describe it is that it is the matrix got by reflecting theoriginal matrix in the “diagonal” line, or the line were i = j (row number =column number).

So we see that if we start with an upper triangular

A =

a11 a12 a13

0 a22 a23

0 0 a33

then the transpose

At =

a11 0 0a12 a22 0a13 a23 a33

is lower triangular (has all nonzero entries on or below the diagonal).

Facts about transposes

(i) Att = A (transpose twice gives back the original matrix)

(ii) (A + B)t = At + Bt (if A and B are matrices of the same size).

(iii) (kA)t = kAt for A a matrix and k a scalar.

(iv) (AB)t = BtAt (the transpose of a product is the product of thetransposes taken in the reverse order — provided the product ABmakes sense).So if A is m× n and B is n× p, then (AB)t = BtAt. Note that Bt isp× n and At is n×m so that BtAt makes sense and is a p×mmatrix, the same size as (AB)t.

The proof requires a bit of notation and organisation so we won’t do it indetail. Here is what we would need to do just for the 2× 2 case.Take any two 2× 2 matrices, which we write out as

A =

[a11 a12

a21 a22

], B =

[b11 b12b21 b22

]Then

At =

[a11 a21

a12 a22

], Bt =

[b11 b21b12 b22

]and we can find

AB =

[a11b11 + a12b21 a11b12 + a12b22a21b11 + a22b21 a21b12 + a22b22

],

(AB)t =

[a11b11 + a12b21 a21b11 + a22b21a11b12 + a12b22 a21b12 + a22b22

]while

BtAt =

[b11a11 + b21a12 b11a21 + b21a22

b12a11 + b22a12 b12a21 + b22a22

]= (AB)t

A final property of transposes is:

(v) If A is an invertible square matrix then At is also invertible and(At)−1 = (A−1)t (the inverse of the transpose is the same as thetranspose of the inverse.

This is easy to see: Let A be an invertible n× n matrix. We know fromthe definition of A−1 that

AA−1 = In and A−1A = In

Take transposes of both equations to get

(A−1)tAt = Itn = In and At(A−1)t = Itn = In

Therefore we see that At has an inverse and that the inverse matrix is(A−1)t.

Lower triangular matrices

We can use the transpose to transfer what we know about upper triangularmatrices to lower triangular ones. Let us take 3× 3 matrices as an example,though what we say will work similarly for n× n.If

A =

a11 0 0a21 a22 0a31 a32 a33

is lower triangular, then its transpose

At =

a11 a21 a31

0 a22 a32

0 0 a33

is upper triangular. So we know that At has an inverse exactly when theproduct of its diagonal entries

a11a22a33 6= 0

But that is the same as the product of the diagonal entries of A. So lowertriangular matrices have an inverse exactly when the product of thediagonal entries is nonzero.

1 Another thing we know is that (At)−1 is again upper triangular. So((At)−1)t = (A−1)tt = A−1 is lower triangular. Thus the inverse of alower triangular matrix is again lower triangular (if it exists).

2 Using (AB)t = BtAt we can show that the product of lower triangularmatrices is again lower triangular. As BtAt = is a product of uppertriangulars it is upper triangular and then AB = ((AB)t)t = (BtAt)t =is the transpose of an upper triangular and so AB is lower triangular.

3 Finally, we could use transposes to show that strictly lower triangularmatrices have to be nilpotent (some power of them is the zero matrix).

Or we could figure these out by working them out in more or less the sameway as we did for the strictly upper triangular case.

Symmetric matrices

A matrix A is called symmetric if At = A.

Symmetric matrices must be square as the transpose of an m× n matrix isn×m. So if m 6= n, then At and A are not even the same size — and sothey could not be equal.

One way to say what ‘symmetric’ means for a square matrix is to say thatthe numbers in positions symmetrical around the diagonal are equal.Examples are

−3 1 −11 14 33−1 33 12

,

5 1 2 −31 11 25 02 25 −41 6−3 0 6 −15

Trace of a matrix

The trace of a matrix is a number that is quite easy to compute and whichpartly characterises the matrix. It is the sum of the diagonal entries.

So

A =

[1 23 4

]has

trace(A) = 1 + 4 = 5

and

A =

1 2 34 5 6−7 −8 −6

has

trace(A) = 1 + 5 + (−6) = 0

For 2× 2

A =

[a11 a12

a21 a22

]⇒ trace(A) = a11 + a22

and for 3× 3,

A =

a11 a12 a13

a21 a22 a23

a31 a32 a33

⇒ trace(A) = a11 + a22 + a33

Properties of the trace

(i) trace(A + B) = trace(A) + trace(B) (if A and B are both n× n)

(ii) trace(kA) = k trace(A) (if k is a scalar and A is a square matrix)

(iii) trace(At) = trace(A) (if A is any square matrix)

(iv) trace(AB) = trace(BA) for A and B square matrices of the same size(or even for A m× n and B an n×m matrix).

The last property is the only one that is at all hard to check out. Theothers are pretty easy to see.

To check the last one, we should write out the entries of A and B, work outthe diagonal entries of AB and the sum of them. Then work out the sum ofthe diagonal entries of BA and their sum. Rearranging we should see weget the same answer.

Let us look at the simple example of 2× 2 matrices.

In the 2× 2 we would take

A =

[a11 a12

a21 a22

], B =

[b11 b12b21 b22

](without saying what the entries are specifically) and look at

AB =

[a11b11 + a12b21 ∗

∗ a21b12 + a22b22

]⇒ trace(AB) = a11b11 + a12b21 + a21b12 + a22b22

(where the asterisks mean something goes there but we don’t have to figureout what goes in those places).

BA =

[b11a11 + b12a21 ∗

∗ b21a12 + b22a22

]⇒ trace(BA) = b11a11 + b12a21 + b21a12 + b22a22

Hence we see thattrace(AB) = trace(BA) .

The idea is that now we know this is always going to be true for 2× 2matrices A and B. To check it for all the sizes is not really that much moredifficult but it requires a bit of notation to be able to keep track of what isgoing on.

Application of matrices to graphs

Graph theory is a subject that is somehow abstract, but at the same timerather close to applications.

Fairly simple minded examples would be an intercity rail network (nodeswould be stations and the edges would correspond to the existence of adirect line from one station to another), or an airline route network, or anancestral tree graph, or a telecommunications network.

Mathematically a graph is something that has vertices (also known asnodes) and edges (also called paths) joining some of the nodes to someothers.

For our situation, we consider something more like an airline network(joining different airports by direct flights), but we will take account of thefact that some airports might not be connected by flights that go direct inboth directions.

Here is a route network for a start-up airline that has two routes it flies.One goes Dublin → London and London → Dublin, while another routemakes a round trip Dublin → Galway → Shannon → Dublin.

The arrows on the edges mean that this is an example of a directed graph.Here we allow one-way edges and bi-drectional edges between nodes (orvertices) of the graph, which we draw by indicating arrows.

To get the vertex matrix for a graph like this, we first number or order thevertices, for instance

Dublin 1

London 2

Galway 3

Shannon 4

and then we make a matrix, a 4× 4 matrix in this case since there are 4vertices, according to the following rules. The entries of the matrix areeither 0 or 1. All diagonal entries are 0. The (i, j) entry is 1 if there is adirect edge from vertex i to vertex j (in that direction).

That is we form the matrix M with entries:

mij =

{1, Pi → Pj

0, otherwise(1)

So in our example graph, the vertex matrix is

M =

0 1 1 01 0 0 00 0 0 11 0 0 0

For instance the first row in 0 1 1 0 because there is a direct link 1→ 2 and1→ 3, but no direct link 1→ 4 (Dublin to Galway is not directly linked).

In more precise terms: A directed graph is a finite set of elements,{P1, P2, . . . , Pn} together with a collection of ordered pairs (Pi, Pj) ofdistinct elements of this set with no ordered pair being repeated.

The elements of this set are called vertices, and the ordered pairs are calleddirected edges. We also use the notation Pi → Pj to indicate a directededge in a graph.

In general a graph can have separate components.

In science very complicated graphs and networks commonly appear: e.g.metabolic pathways

Image from http://www2.ufp.pt/∼pedros/bq/integration.htm

In science very complicated graphs and networks commonly appear: e.g.metabolic pathways

Image from http://potatometabolicpathways.webs.com

Having a matrix description of graphs is often very useful. Here is oneresult that makes a connection to matrix multiplication.

TheoremIf M is the vertex matrix of a directed graph, then the entries of M2 givethe numbers of 2-step (or 2-hop) connections.More precisely, the (i, j) entry of M2 gives the number of ways to go fromvertex i to vertex j with exactly 2 steps (or exactly one intermediate vertex).Similarly M3 gives the number of 3-step connections, and so on for higherpowers of M .

Consider again the graph:

with the vertex matrix

M =

0 1 1 01 0 0 00 0 0 11 0 0 0

In our example

M2 =

0 1 1 01 0 0 00 0 0 11 0 0 0

0 1 1 01 0 0 00 0 0 11 0 0 0

=

1 0 0 10 1 1 01 0 0 00 1 1 0

The diagonal 1’s in the matrix correspond to the fact that there is a roundtrip Dublin → London → Dublin (or 1→ 2→ 1) and also London →Dublin → London. The 1 in the top right corresponds to the connectionDublin → Galway → Shannon.

If we add M + M2 we get nonzero entries in every place where there is aconenction in 1 or 2 steps

M + M2 =

1 1 1 11 1 1 01 0 0 11 1 1 0

and the zeros off the diagonal there correspond to the connections thatcan’t be made in either a direct connection of a 2-hop connection (which isGaway → London and London → Shannon in our example).Although we don’t see it here the numbers in the matrix M2 can be biggerthan 1 if there are two routes available using 2-hops.

Clique

Definition A subset of a directed graph is called a clique if it satisfies thefollowing three conditions:

1 The subset contains at least three vertices.

2 For each pair of vertices Pi and Pj in the subset, both Pi → Pj andPj → Pi are true.

3 The subset is as large as possible; that is, it is not possible to addanother vertex to the subset and still satisfy condition (ii).

A clique is in some sense the largest possible subgroup of vertices that aremaximally connected to each other.

For simple graphs we can find cliques by inspection but for morecomplicated diagrams it is useful to have a tool.

For a graph we can form the matrix S with the entries:

sij =

{1, Pi ↔ Pj

0, otherwise(2)

This is similar to the vertex matrix but only has non-zero entries whereboth vertices are connected. That is sij = 1 if mij = mji = 1 but sij = 0 ifmij = 0 and/or mji = 0.

We note that this matrix is automatically symmetric as

S = St .

We can then identify cliques by using the fact that if s(3)ij are the entries of

the matrix S3, then a vertex Pi belongs to some clique if and only ifs(3)ii 6= 0.

Consider the vertex matrix

M =

0 1 1 1 01 0 0 1 00 1 0 1 10 1 1 0 00 1 1 1 0

(3)

we can find the graph corresponding to the with a bit of work or by usingsage ...

21/11/2013 19:23Graphs and Matrices -- Sage

Page 1 of 1http://www.sagenb.org/home/Tristan_McLoughlin/9/print

Graphs and MatricesM = Matrix([[0, 1, 1, 1, 0],[1, 0, 0, 1, 0],[0, 1, 0, 1, 1],[0, 1, 1, 0, 0],[0, 1, 1, 1, 0]]); M

[0 1 1 1 0][1 0 0 1 0][0 1 0 1 1][0 1 1 0 0][0 1 1 1 0]

G=DiGraph(M)P = G.plot()P.show( figsize=5)

Does this matrix have any cliques? The S matrix is

S =

0 1 0 0 01 0 0 1 00 0 0 1 10 1 1 0 00 0 1 0 0

(4)

and so (can calculate using for example SAGE)

S3 =

0 2 1 0 02 0 0 3 11 0 0 3 20 3 3 0 00 1 2 0 0

(5)

and as there are no non-zero diagonal elements there are no cliques.

matrices - ma1s1 - maths.tcd.ietristan/ma1s1/slides/matrices-3-print.pdf · basic matrix notation...

Documents