m1gla notes
TRANSCRIPT
Contents

1 Introduction
2 Geometry in two dimensions
  2.1 Vector operations
  2.2 Lines
  2.3 Triangles
  2.4 Distances
  2.5 Dot product
  2.6 Equation of a line
  2.7 Perpendicular distance
  2.8 Two famous inequalities
3 Linear inequalities
4 Conics
  4.1 Transforming conics
    4.1.1 Translation
    4.1.2 Rotation
  4.2 The Theory
  4.3 Geometrical definition of conics
5 Matrices and linear equations
  5.1 Method – Gaussian elimination
6 Some applications of matrix algebra
  6.1 Linear equations
  6.2 Population Distribution
  6.3 Matrices and Geometry
    6.3.1 Rotation
    6.3.2 Reflection
    6.3.3 Combining Transformations
7 Inverses
  7.1 Relation to linear equations
  7.2 Finding inverses
  7.3 Finding inverses in general
8 Determinants
  8.1 Properties
  8.2 Effects of row operations on determinant
  8.3 n × n determinants
9 Eigenvalues and eigenvectors
  9.1 Eigenvectors
  9.2 How to find eigenvectors and eigenvalues
    9.2.1 Back to Fibonacci
  9.3 Diagonalization
    9.3.1 Repeated eigenvalues
10 Conics (again) and Quadrics
  10.1 Reduction of conics
    10.1.1 Orthogonal matrices
  10.2 Quadric surfaces
  10.3 Linear Geometry in 3 dimensions
  10.4 Geometry of planes
Chapter 2
Geometry in two dimensions
This means geometry in the plane, based on the real numbers. Think of real numbers
as points on the real line.
Definition 2.1. R2 is the set of all ordered pairs (x1, x2) with x1, x2 real numbers.
Note: (1, 3) ≠ (3, 1)
Call (x1, x2) a vector. Geometrically, the vector (x1, x2) will be used to represent several things:
• the point P in the plane with coordinates (x1, x2)
• the position vector −→OP: the line segment from O to P, in the direction of −→OP, with length OP
• any vector −→AB with the same direction as −→OP
Usually we write x = (x1, x2).
The origin or zero vector is O = (0, 0).
2.1 Vector operations
Definition 2.2.
1) Addition – define the sum of two vectors by
(x1, x2) + (y1, y2) = (x1 + y1, x2 + y2)
2) Scalar multiplication – a scalar is a real number. If x = (x1, x2) is a vector and λ ∈ R is a scalar,
λx = λ(x1, x2) = (λx1, λx2)
3) Subtraction –
x − y = x + (−y) = (x1 − y1, x2 − y2)
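These componentwise operations translate directly into code. A small Python sketch (not part of the notes; the helper names `add`, `scale`, `sub` are made up for illustration):

```python
# Vector operations on R^2, following Definition 2.2.
# Vectors are plain tuples; add/scale/sub are illustrative names.

def add(x, y):
    """Componentwise sum (x1 + y1, x2 + y2)."""
    return (x[0] + y[0], x[1] + y[1])

def scale(lam, x):
    """Scalar multiple (lam*x1, lam*x2)."""
    return (lam * x[0], lam * x[1])

def sub(x, y):
    """x - y = x + (-y)."""
    return add(x, scale(-1, y))

print(add((1, 2), (3, 4)))   # (4, 6)
print(scale(2, (1, 3)))      # (2, 6)
print(sub((1, 2), (3, 4)))   # (-2, -2)
```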
Note. If λ, ν ∈ R and x, y ∈ R2
i) λ(x + y) = λx + λy
ii) (λ+ ν)x = λx + νx
Proof.
(i)
λ(x + y) = λ(x1 + y1, x2 + y2)
= (λ(x1 + y1), λ(x2 + y2))
= (λx1 + λy1, λx2 + λy2)
= (λx1, λx2) + (λy1, λy2)
= λx + λy
(ii) exercise
∎
2.2 Lines
In Euclid, points and lines are undefined. But we need to define points and lines. We
have already defined points.
Definition 2.3. Let a, u be vectors in R2 with u ≠ O. The set of vectors {a + λu | λ ∈ R} is called a line.
Definition 2.4. Two lines {a + λu | λ ∈ R} and {b + λv | λ ∈ R} are parallel iff v = αu for some α ∈ R.
Next we will prove the three famous propositions from Euclid.
Proposition 2.1. If L, L′ are parallel lines which have a point in common, then L = L′.
Proposition 2.2. Through any two points, there is exactly one line.
Proposition 2.3. Two non-parallel lines meet in exactly one point.
Proof. Proposition 2.1.
Let
L = {a + λu | λ ∈ R}
L′ = {b + λv | λ ∈ R}
As L, L′ are parallel, v = αu for some scalar α ∈ R, so
L′ = {b + λαu | λ ∈ R} = {b + νu | ν ∈ R}
Given that L and L′ have x in common, we have
x = a + λ1u = b + λ2u,  λ1, λ2 ∈ R
so
b = a + λ1u − λ2u = a + (λ1 − λ2)u
b = a + λ3u  (λ3 = λ1 − λ2, λ3 ∈ R)
Finally, we get
L′ = {b + λu | λ ∈ R}
= {a + λ3u + λu | λ ∈ R}
= {a + (λ3 + λ)u | λ ∈ R}
= {a + λu | λ ∈ R}
= L
∎
Proof. Proposition 2.2.
Let u, v ∈ R2 be our two points. Take the line
L = {u + λ(v − u) | λ ∈ R}
Suppose now that L′ is some line through u and v,
L′ = {a + λw | λ ∈ R}
Since the points u and v lie on L′, we get
u = a + λ1w
v = a + λ2w
v − u = (λ2 − λ1)w
Therefore lines L and L′ are parallel. As they also have a common point, they are (by Proposition 2.1) identical.
∎
Example 2.1. Find the common point of
L1 = {(0, 1) + λ(1, 1) | λ ∈ R}
L2 = {(4, 0) + λ(2, 1) | λ ∈ R}
By Proposition 2.3, there exists a unique common point; call it x.
x = (0, 1) + λ1(1, 1)
x = (4, 0) + λ2(2, 1)
Writing equations for the individual coordinates, we get
0 + λ1 = 4 + 2λ2
1 + λ1 = 0 + λ2
Subtracting the second equation from the first,
−1 = 4 + λ2
λ2 = −5
λ1 = 4 + 2(−5) = −6
Therefore
x = (0, 1) − 6(1, 1) = (−6, −5)
x = (4, 0) − 5(2, 1) = (−6, −5)
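The same intersection can be checked by machine: the two coordinate equations form a 2 × 2 linear system, solvable by Cramer's rule. A quick Python sketch of Example 2.1 (not from the lecture; variable names are illustrative):

```python
# Check of Example 2.1: intersect L1 = a + l1*u and L2 = b + l2*v.
# The condition a + l1*u = b + l2*v gives the 2x2 system l1*u - l2*v = b - a.
a, u = (0, 1), (1, 1)
b, v = (4, 0), (2, 1)
r = (b[0] - a[0], b[1] - a[1])
# Cramer's rule on the matrix [[u1, -v1], [u2, -v2]]:
det = u[0] * (-v[1]) - (-v[0]) * u[1]
l1 = (r[0] * (-v[1]) - (-v[0]) * r[1]) / det
x = (a[0] + l1 * u[0], a[1] + l1 * u[1])
print(x)  # (-6.0, -5.0), agreeing with the worked example
```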
2.3 Triangles
Definition 2.5. A triangle is a set of 3 non-collinear points {a, b, c}, with a, b, c ∈ R2. The edges of a triangle are the line segments ab, ac, bc, where
ab = {a + λ(b − a) | 0 ≤ λ ≤ 1}
The midpoint of ab is the point
a + (1/2)(b − a) = (1/2)(a + b)
A median of a triangle is a line joining one of a, b, c with the midpoint of the opposite side (line segment).
Proposition 2.4. The 3 medians of a triangle meet in a common point.
Proof. Let the three medians be Ma, Mb and Mc:
Ma = {a + λ((1/2)(b + c) − a) | λ ∈ R}
Mb = {b + λ((1/2)(a + c) − b) | λ ∈ R}
Mc = {c + λ((1/2)(a + b) − c) | λ ∈ R}
We just show that for λ = 2/3 the medians meet at the same point:
a + (1/3)b + (1/3)c − (2/3)a = (1/3)(a + b + c)
b + (1/3)a + (1/3)c − (2/3)b = (1/3)(a + b + c)
c + (1/3)a + (1/3)b − (2/3)c = (1/3)(a + b + c)
Therefore all the medians contain the point (sometimes called the centroid) (1/3)(a + b + c).
∎
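The λ = 2/3 computation is easy to check numerically for a concrete triangle; a small Python sketch (not from the lecture, with an illustrative helper name):

```python
# Check Proposition 2.4 on a concrete triangle: at lambda = 2/3,
# every median passes through the centroid (a + b + c)/3.
a, b, c = (0.0, 0.0), (6.0, 0.0), (0.0, 9.0)

def point_on_median(p, q, r, lam):
    """The point p + lam*(midpoint(q, r) - p)."""
    mid = ((q[0] + r[0]) / 2, (q[1] + r[1]) / 2)
    return (p[0] + lam * (mid[0] - p[0]), p[1] + lam * (mid[1] - p[1]))

lam = 2 / 3
pa = point_on_median(a, b, c, lam)
pb = point_on_median(b, a, c, lam)
pc = point_on_median(c, a, b, lam)
print(pa, pb, pc)  # all three are (up to rounding) the centroid (2, 3)
```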
Note. Other interesting properties of triangles are:
• the 3 altitudes meet at a point (sometimes called the orthocentre)
• the 3 perpendicular bisectors meet at a point (sometimes called the circumcentre)
• the 3 centres (orthocentre, circumcentre, centroid) are collinear (they lie on the Euler line)
2.4 Distances
Definition 2.6. The length of a vector x = (x1, x2) (denoted ‖x‖) is the real number
‖x‖ = √(x1² + x2²)
The distance between x and y (x, y ∈ R2) is the length of (x − y), called dist(x, y):
dist(x, y) = √((x1 − y1)² + (x2 − y2)²)
2.5 Dot product
Definition 2.7. For x, y ∈ R2, dot product (scalar product) of x and y is a real number
x.y = x1y1 + x2y2
Example 2.2. x = (1, 2), y = (−1, 0)
‖x‖ = √(1² + 2²) = √5
dist(x, y) = √(2² + 2²) = 2√2
x.y = −1 + 0 = −1
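These three quantities are one-liners in code; a Python sketch of Example 2.2 (not part of the notes; function names are illustrative):

```python
# Length, distance and dot product in R^2 (Definitions 2.6 and 2.7).
import math

def norm(x):
    return math.sqrt(x[0] ** 2 + x[1] ** 2)

def dist(x, y):
    return norm((x[0] - y[0], x[1] - y[1]))

def dot(x, y):
    return x[0] * y[0] + x[1] * y[1]

x, y = (1, 2), (-1, 0)
print(norm(x))     # sqrt(5)
print(dist(x, y))  # 2*sqrt(2)
print(dot(x, y))   # -1
```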
Proposition 2.5. (i) ‖x‖2 + ‖y‖2 − ‖x − y‖2 = 2x.y
(ii) If θ is the angle between x and y then x.y = ‖x‖‖y‖ cos θ
(iii) The position vectors of x and y are at right angles iff x.y = 0
Proof.
(i)
LHS = x1² + x2² + y1² + y2² − (y1 − x1)² − (y2 − x2)²
= −(−2x1y1) − (−2x2y2)
= 2x.y
(ii) From the cosine rule,
(‖x‖² + ‖y‖² − ‖y − x‖²) / (2‖x‖‖y‖) = cos θ
Using part (i), we get
2x.y = 2‖x‖‖y‖ cos θ
x.y = ‖x‖‖y‖ cos θ
∎
2.6 Equation of a line
Consider a line L = {a + λu | λ ∈ R}, u = (u1, u2), and define n = (−u2, u1), so that n.u = 0.
For any x ∈ L, x = a + λu
x.n = (a + λu).n = a.n + λu.n = a.n
x.n = a.n
Proposition 2.6. Let L = {a + λu | λ ∈ R}, n = (−u2, u1). Then
(i) Every x in L satisfies x.n = a.n
(ii) Every solution of x.n = a.n lies on L
Proof.
(ii) Suppose x.n = a.n (we need to show that x = a + λu, i.e. x − a = λu). Then (x − a).n = 0. Let y = x − a, and say y = (y1, y2). As y.n = 0,
−y1u2 + y2u1 = 0
y2u1 = y1u2
So y = (y1/u1)(u1, u2) if u1 ≠ 0, or y = (y2/u2)(u1, u2) if u2 ≠ 0. Therefore
y = λu = x − a
x = a + λu ∈ L
∎
Definition 2.8. For a line L = {a + λu | λ ∈ R}, the vector n = (−u2, u1) (or any scalar multiple of it) is called a normal to L.
The equation x.n = a.n is called the equation of L.
Proposition 2.7. The linear equation px1 + qx2 + r = 0 (1) is an equation of a line
with normal (p, q) and with direction (−q, p).
Proof.
Suppose q ≠ 0. Then the solution of (1) satisfies
(p/q)x1 + x2 + r/q = 0
Let x1 = λ, x2 = −r/q − pλ/q, so
(x1, x2) = (λ, −r/q − pλ/q) = (0, −r/q) + λ(1, −p/q)
which is a line with direction vector (1, −p/q) = (1/(−q))(−q, p). The normal is clearly (p, q).
When q = 0, equation (1) is px1 + r = 0, with solution
(x1, x2) = (−r/p, 0) + λ(0, 1)
which is clearly a line with direction (0, 1) = (1/p)(0, p) = (1/p)(−q, p).
∎
Definition 2.9. Lines L1 = {a + λu | λ ∈ R} and L2 = {b + λv | λ ∈ R} are perpendicular iff u.v = 0.
Proposition 2.8. Let lines L1, L2 have equations
L1 : p1x1 + q1x2 + r1 = 0
L2 : p2x1 + q2x2 + r2 = 0
Then
(i) L1, L2 are parallel iff (p1, q1) = α(p2, q2) for some scalar α
(ii) L1, L2 are perpendicular iff (p1, q1) · (p2, q2) = 0
Proof.
(i) By Proposition 2.7
L1 = {a + λ(−q1, p1) | λ ∈ R}
L2 = {b + λ(−q2, p2) | λ ∈ R}
These are parallel iff ∃α ∈ R such that (−q2, p2) = α(−q1, p1), i.e. α(p1, q1) = (p2, q2).
(ii) exercise
∎
2.7 Perpendicular distance
Definition 2.10. For a line L and point p, p /∈ L, dist(p, L) is the length of the
perpendicular from p to L.
Proposition 2.9. If a is a point on the line and n is a normal to the line, then
dist(p, L) = |(p − a).n| / ‖n‖
Note. A unit vector is a vector of length 1. If w is any vector, then w/‖w‖ is a unit vector. Therefore, taking the unit normal n̂ = n/‖n‖, we can write
dist(p, L) = |(p − a).n̂|
Proof.
dist(p, L) = | ‖p − a‖ cos θ |
= | ‖p − a‖ ‖n‖ cos θ | / ‖n‖
= |(p − a).n| / ‖n‖
∎
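Proposition 2.9 gives a direct formula to implement; a Python sketch (not from the lecture; `perp_dist` is an illustrative name):

```python
# Perpendicular distance from p to L = {a + t*u}, via Proposition 2.9:
# dist(p, L) = |(p - a).n| / ||n||, with normal n = (-u2, u1).
import math

def perp_dist(p, a, u):
    n = (-u[1], u[0])  # a normal to the direction u
    num = abs((p[0] - a[0]) * n[0] + (p[1] - a[1]) * n[1])
    return num / math.hypot(n[0], n[1])

# Distance from (0, 5) to the line through the origin with direction (1, 1):
d = perp_dist((0, 5), (0, 0), (1, 1))
print(d)  # 5/sqrt(2)
```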
2.8 Two famous inequalities
Proposition 2.10. (Cauchy–Schwarz inequality)
Let x, y ∈ R2.
(i) |x.y| ≤ ‖x‖‖y‖
(ii) If |x.y| = ‖x‖‖y‖ then one of x, y is a scalar multiple of the other
Proof.
(i)
|x.y| ≤ ‖x‖‖y‖
⇔ (x1y1 + x2y2)² ≤ (x1² + x2²)(y1² + y2²)
⇔ x1²y1² + 2x1y1x2y2 + x2²y2² ≤ x1²y1² + x1²y2² + x2²y1² + x2²y2²
⇔ 0 ≤ x1²y2² + x2²y1² − 2x1y1x2y2
⇔ (x1y2 − x2y1)² ≥ 0
(ii) If |x.y| = ‖x‖‖y‖, the above proof shows
0 = x1y2 − x2y1, i.e. x1/x2 = y1/y2
Therefore one of x and y is a scalar multiple of the other.
Note that x.x = x1² + x2² = ‖x‖².
∎
Proposition 2.11. (Triangle inequality) If x, y ∈ R2, then
‖x + y‖ ≤ ‖x‖+ ‖y‖
Proof.
‖x + y‖² = (x + y).(x + y) = x.x + y.y + 2x.y
= ‖x‖² + ‖y‖² + 2x.y
≤ ‖x‖² + ‖y‖² + 2‖x‖‖y‖  (by Cauchy–Schwarz)
= (‖x‖ + ‖y‖)²
Therefore ‖x + y‖ ≤ ‖x‖ + ‖y‖. ∎
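Both inequalities are easy to sanity-check numerically over a grid of integer vectors; a small Python sketch (an illustration, not part of the notes):

```python
# Numerical check of Cauchy-Schwarz and the triangle inequality
# for all vectors in {-2,...,2}^2.
import math
import itertools

def norm(x):
    return math.hypot(x[0], x[1])

vals = [-2, -1, 0, 1, 2]
ok = True
for x in itertools.product(vals, vals):
    for y in itertools.product(vals, vals):
        dot = x[0] * y[0] + x[1] * y[1]
        ok &= abs(dot) <= norm(x) * norm(y) + 1e-12        # Cauchy-Schwarz
        s = (x[0] + y[0], x[1] + y[1])
        ok &= norm(s) <= norm(x) + norm(y) + 1e-12         # triangle inequality
print(ok)  # True
```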
Chapter 4
Conics
In R2, a linear equation px1 + qx2 + r = 0 defines a line. Now we define a curve in R2
by a quadratic equation.
Example 4.1. Circle of centre c, radius r. It contains all the points x such that
‖x − c‖ = r
‖x − c‖² = r²
(x1 − c1)² + (x2 − c2)² = r²
x1² + x2² − 2c1x1 − 2c2x2 + c1² + c2² − r² = 0
Definition 4.1. A conic section in R2 is the set of points x ∈ R2, x = (x1, x2), satisfying a quadratic equation
ax1² + bx2² + cx1x2 + dx1 + ex2 + f = 0 (4.1)
where not all of a, b and c are 0.
Here are some basic examples.
(1) x1²/a² + x2²/b² = 1 is an ellipse
(2) x1²/a² − x2²/b² = 1 is a hyperbola
(3) x2 = ax1² + b (a ≠ 0) is a parabola
(4) x1²/a² − x2²/b² = 0 is a pair of lines
(5) x1²/a² = 1 is a pair of lines (x1 = ±a)
(6) x1²/a² = 0 is a double line
(7) x1²/a² + x2²/b² = 0 is a point
(8) x1²/a² + x2²/b² = −1 is the empty set (empty conic)
We call the conics (6), (7) and (8) degenerate conics. We’ll see how to reduce an
arbitrary conic to one of these basic examples.
Note. Why conic sections?
4.1 Transforming conics
Aim – start with an arbitrary conic in the form (4.1) and make some geometrical changes to the coordinates so that the equation becomes one of our standard equations (1) – (8). Two kinds of changes are allowed.
4.1.1 Translation
To change coordinates from (x1, x2) to (y1, y2) with new origin (α, β):
y1 = x1 − α
y2 = x2 − β
This is called a translation.
Example 4.2. What type is the following conic?
x1² + 2x2² − 2x1 + 4x2 − 10 = 0
We complete the square:
(x1 − 1)² + 2(x2 + 1)² − 13 = 0
So the new coordinates are
y1 = x1 − 1
y2 = x2 + 1
and the equation becomes y1² + 2y2² = 13. Now we can easily see that our conic is an ellipse.
Remark – If (4.1) has no x1x2 term (c = 0), then we can find a translation which
reduces the equation to one of the standard ones (examples (1) – (6)).
4.1.2 Rotation
Rotate the axes anticlockwise through θ. What happens to the coordinates of a general point P?
Say P has old coordinates (x1, x2) and new coordinates (y1, y2).
[sketch: P at distance r from the origin, at angle α to the new first axis, which is itself at angle θ to the old first axis]
Now
y1 = r cos α
y2 = r sin α
and
x1 = r cos(α + θ)
x2 = r sin(α + θ)
Hence
x1 = r cos α cos θ − r sin α sin θ = y1 cos θ − y2 sin θ
and
x2 = r sin α cos θ + r cos α sin θ = y2 cos θ + y1 sin θ
Summarizing – the change of coordinates when we do a rotation through angle θ is
x1 = y1 cos θ − y2 sin θ
x2 = y1 sin θ + y2 cos θ
Example 4.3. Rotation through π/4:
x1 = (1/√2)(y1 − y2)
x2 = (1/√2)(y1 + y2)
Example 4.4. What is the following conic? (Find the rotation and translation to change coordinates and get a standard equation.)
x1² + x2² + 4x1x2 = 1
(From the hat method) let's rotate through π/4. By magic (see later), our equation then becomes
(1/2)(y1 − y2)² + (1/2)(y1 + y2)² + 4 · (1/2)(y1 − y2)(y1 + y2) = 1
i.e.
3y1² − y2² = 1
This is a hyperbola.
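The substitution above can be verified numerically: for any (y1, y2), the rotated coordinates should satisfy the original equation exactly when 3y1² − y2² = 1. A Python sketch of that identity check (not part of the notes):

```python
# Check Example 4.4: substituting the rotation through pi/4 into
# x1^2 + x2^2 + 4*x1*x2 gives 3*y1^2 - y2^2 identically.
import math

c = s = 1 / math.sqrt(2)  # cos(pi/4) = sin(pi/4)
ok = True
for y1 in [-1.5, 0.0, 0.7, 2.0]:
    for y2 in [-2.0, 0.0, 1.3]:
        x1 = y1 * c - y2 * s
        x2 = y1 * s + y2 * c
        lhs = x1 ** 2 + x2 ** 2 + 4 * x1 * x2
        rhs = 3 * y1 ** 2 - y2 ** 2
        ok &= abs(lhs - rhs) < 1e-9
print(ok)  # True
```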
4.2 The Theory
Start with general conic
ax21 + bx22 + cx1x2 + dx1 + ex2 + f = 0
Aim
(i) find the rotation which gets rid of x1x2
(ii) complete the square to find the translation which changes the equation to one
of our six standard equations
Part(i).
If c = 0, we don't need to rotate. So assume that c ≠ 0. When we do a general rotation through θ, we change coordinates to y1, y2:
x1 = y1 cos θ − y2 sin θ
x2 = y1 sin θ + y2 cos θ
We aim to find θ so that the new equation has no y1y2 term.
ax1² = a(y1 cos θ − y2 sin θ)²
bx2² = b(y1 sin θ + y2 cos θ)²
cx1x2 = c(y1 sin θ + y2 cos θ)(y1 cos θ − y2 sin θ)
So the coefficient of y1y2 when we change coordinates will be
−2a sin θ cos θ + 2b sin θ cos θ + c(cos² θ − sin² θ) = (b − a) sin 2θ + c cos 2θ
So we want to choose θ to make this expression zero:
(a − b) sin 2θ = c cos 2θ
If a ≠ b then
tan 2θ = c/(a − b)
If a = b, we want cos 2θ = 0, so take θ = π/4.
Summary
• Step 1 – Rotation. If c ≠ 0 in (4.1), we rotate through θ, where θ = π/4 when a = b, or tan 2θ = c/(a − b) when a ≠ b.
• Step 2 – Translation. After Step 1, the equation becomes
a′y1² + b′y2² + d′y1 + e′y2 + f′ = 0
Now we complete the square to find a translation which changes the equation to one of the standard ones.
We’ve proved:
Theorem 4.1. Every conic in the form (4.1) can be changed by rotation and translation
of the axes to one of the standard equations (1) – (8). Thus every conic is either an
ellipse, hyperbola, parabola, or one of the degenerate conics.
Example 4.5. Reduce the conic
2x1² + 2√3 x1x2 + (3√3 − 1)x1 + (3 + √3)x2 = 0
to standard form by rotation and translation.
• Step 1 – Rotation – Here a = 2, b = 0, c = 2√3. So rotate through θ, where
tan 2θ = √3
Therefore 2θ = π/3, so θ = π/6. So our new coordinates are
x1 = y1 cos θ − y2 sin θ = (√3/2)y1 − (1/2)y2
x2 = y1 sin θ + y2 cos θ = (1/2)y1 + (√3/2)y2
So the equation becomes
2 · (1/4)(√3y1 − y2)² + 2√3 · (1/4)(√3y1 − y2)(y1 + √3y2) + (3√3 − 1) · (1/2)(√3y1 − y2) + (3 + √3) · (1/2)(y1 + √3y2) = 0
which is
3y1² − y2² + 6y1 + 2y2 = 0
• Step 2 – Translation – complete the square:
3(y1 + 1)² − (y2 − 1)² = 2
So put z1 = y1 + 1, z2 = y2 − 1, and the equation is now the standard equation
3z1² − z2² = 2
This is a hyperbola. Sketch – the new origin has y1, y2 coordinates (−1, 1), so it has x1, x2 coordinates (−(1/2)(√3 + 1), (1/2)(√3 − 1)).
Note. Usually tan 2θ = c/(a − b) is not so convenient (we can't write down what 2θ is). In general, use the formula
tan 2θ = 2 tan θ / (1 − tan² θ)
to work out tan θ, and hence sin θ and cos θ.
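In code the angle can be found in one step with atan2, which also covers the a = b case (giving θ = π/4). A Python sketch (not from the lecture; `rotation_angle` is an illustrative name):

```python
# Choose the rotation angle removing the x1*x2 term from
# a*x1^2 + b*x2^2 + c*x1*x2 + ...: we need (b - a)*sin(2t) + c*cos(2t) = 0.
import math

def rotation_angle(a, b, c):
    """Angle t with tan(2t) = c/(a - b); atan2 also handles a == b."""
    return 0.5 * math.atan2(c, a - b)

# Example 4.5: a = 2, b = 0, c = 2*sqrt(3) should give t = pi/6.
t = rotation_angle(2, 0, 2 * math.sqrt(3))
print(t, math.pi / 6)
# The y1*y2 coefficient vanishes at this angle:
resid = (0 - 2) * math.sin(2 * t) + 2 * math.sqrt(3) * math.cos(2 * t)
print(abs(resid) < 1e-12)  # True
```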
Note. The standard parabola is x2 = ax1² + b. What about the parabola x1 = ax2² + b? We can rotate it through π/2 to get the standard equation.
4.3 Geometrical definition of conics
Ingredients of this definition
• a line L
• a point p not on L
• a real number e > 0
Definition 4.2. The curve C is the set of all points x ∈ R2 such that
‖x − p‖ = e · dist(x, L)
i.e.
C = {x ∈ R2 | ‖x − p‖ = e · dist(x, L)}
Example 4.6. Let e = 1 (sketch). Let e = 1/2 (sketch).
Theorem 4.2. Curve C (from previous definition) is a conic. It is
• a parabola if e = 1
• an ellipse if e < 1
• a hyperbola if e > 1
Proof. Do a rotation and translation to make p the origin and L the vertical line x1 = s. Then the equation defining C is
‖x − p‖ = e · dist(x, L)
i.e.
‖x‖ = e · |x1 − s|
i.e.
‖x‖² = e²(x1 − s)²
i.e.
x1² + x2² = e²(x1² − 2sx1 + s²)
i.e.
x1²(1 − e²) + x2² + 2se²x1 − s²e² = 0
If e = 1, the x1² term vanishes – this is a parabola.
Now suppose e ≠ 1. Complete the square:
(1 − e²)(x1 + se²/(1 − e²))² + x2² − s²e² − s²e⁴/(1 − e²) = 0
So put y1 = x1 + se²/(1 − e²) and y2 = x2. Then we get the standard equation
(1 − e²)y1² + y2² = s²e²(1 + e²/(1 − e²))
i.e.
y1² + y2²/(1 − e²) = s²e²/(1 − e²)²  (4.2)
If e < 1, this is an ellipse. If e > 1, this is a hyperbola. ∎
Definition 4.3. The conic has focus p, directrix L, and eccentricity e, from the previous proof.
Example 4.7. Find e, p and L for the ellipse
x1²/2 + x2² = 1
This is a standard equation. We compare it with (4.2):
y1² + y2²/(1 − e²) = s²e²/(1 − e²)²
From the y-coordinate picture, the focus is (se²/(1 − e²), 0) and the directrix is x1 = s + se²/(1 − e²).
Compare the equations:
y1² + y2²/(1 − e²) = s²e²/(1 − e²)²
x1² + 2x2² = 2
So
1/(1 − e²) = 2
s²e²/(1 − e²)² = 2
So e = 1/√2, s = 1. So the eccentricity is 1/√2, the focus is (1, 0) and the directrix is x1 = 2.
Note. An ellipse in fact has two foci ±p, and two directrices L, L′.
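The answer can be verified against Definition 4.2 directly: every point of the ellipse should be 1/√2 times as far from the focus (1, 0) as from the directrix x1 = 2. A Python sketch of that check (not part of the notes):

```python
# Check of Example 4.7: points on x1^2/2 + x2^2 = 1 satisfy
# ||x - p|| = e * dist(x, L) with p = (1, 0), L: x1 = 2, e = 1/sqrt(2).
import math

e = 1 / math.sqrt(2)
ok = True
for k in range(12):
    t = 2 * math.pi * k / 12
    x = (math.sqrt(2) * math.cos(t), math.sin(t))  # parametrizes the ellipse
    to_focus = math.hypot(x[0] - 1, x[1])
    to_line = abs(x[0] - 2)
    ok &= abs(to_focus - e * to_line) < 1e-9
print(ok)  # True
```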
Chapter 5
Matrices and linear equations
Definition 5.1. The space Rn – define
R1 = R
R2 = set of ordered pairs
R3 = set of all triples
R4 = set of all quadruples
In general, Rn is the set of all n-tuples (x1, x2, . . . , xn) with each xi ∈ R. Call these n-tuples vectors.
Note that Rn ⊄ Rn+1.
The space Rn has interesting structure
• geometric – points, lines, curves
• algebraic
Definition 5.2. Addition of vectors
(x1, x2, . . . , xn) + (y1, y2, . . . , yn) = (x1 + y1, . . . , xn + yn)
Scalar multiplication
λ(x1, . . . , xn) = (λx1, . . . , λxn)
Definition 5.3. A linear equation in x1, x2, . . . xn is
a1x1 + a2x2 + · · ·+ anxn = b
where coefficients ai , b ∈ R.
Definition 5.4. A solution to a linear equation is a vector (k1, . . . , kn) such that the
equation is satisfied when we put xi = ki .
Example 5.1. (1,−2, 3) is a solution to the linear equation x1 + x2 + x3 = 2.
Definition 5.5. A system of linear equations in x1, . . . xn is a collection of one or more
linear equations in these variables.
Example 5.2.
1)
3x1 + 4x2 = 5
8x1 − x2 = −2
is a system of 2 linear equations in x1, x2.
2)
x1 + x2 + x3 + x4 = 0
2x1 − x2 + 5x4 = −2
x1 + x2 − x4 = 3
is a system of 3 equations in x1, x2, x3, x4.
Definition 5.6. General system:
a11x1 + a12x2 + · · · + a1nxn = b1
a21x1 + a22x2 + · · · + a2nxn = b2
· · ·
am1x1 + am2x2 + · · · + amnxn = bm
where x1, . . . , xn are unknowns and the aij, bi are constants. A solution to this system is a vector (k1, . . . , kn) which satisfies all the equations.
The aim is to find a method which finds all solutions of any system of the form in the previous definition.
Example 5.3. System
x1 + x2 = 1
−2x1 + 3x2 = 5
Eliminate x1 – take twice the first equation, add to the second:
5x2 = 7
One solution: (x1, x2) = (−2/5, 7/5).
Example 5.4.
2x1 + x2 = 2
−6x1 − 3x2 = 1
These are just two parallel (nonidentical) lines – the system has no solution.
Example 5.5. System
x1 + x2 = 0
x2 − x3 = 0
x1 − x2 + 2x3 = 0
Notice (cleverly) that the third equation equals the first minus two times the second. So any solution of the system of the first and second equations will automatically satisfy the third.
So the general solution is
(x1, x2, x3) = (−a, a, a)
for any a ∈ R. This system has infinitely many solutions. Soon we will see that every system has either no solution, one solution, or infinitely many solutions.
5.1 Method – Gaussian elimination
Example 5.6. System
x1 + 2x2 + 3x3 = 9
4x1 + 5x2 + 6x3 = 24
3x1 + x2 − 2x3 = 4
Step 1 – Eliminate x1 from the second and third equations, using the first. We get the equivalent system
x1 + 2x2 + 3x3 = 9
−3x2 − 6x3 = −12   (2) − 4(1)
−5x2 − 11x3 = −23   (3) − 3(1)
This system has the same solutions as the original one, as the new equations are combinations of the original ones, and vice versa.
Step 2 – Eliminate x2 from the third equation using only the second equation:
x1 + 2x2 + 3x3 = 9
−3x2 − 6x3 = −12
−x3 = −3   (3) − (5/3)(2)
Step 3 – Solve!
By the third equation, x3 = 3. By the second, 3x2 = 12 − 18, so x2 = −2. By the first, x1 = 4. So the system has one solution: (4, −2, 3).
For bigger systems, we need better notation. This is provided by matrices.
Definition 5.7. A matrix is a rectangular array of numbers. E.g.
(1 2 3
 4 5 6)
is a 2 × 3 matrix, and
(1
 5
 −π)
is a 3 × 1 matrix. Call a matrix m × n if it has m rows and n columns. We use matrices to encapsulate systems of linear equations. The system from the previous example has the coefficient matrix
(1 2 3
 4 5 6
 3 1 −2)
and augmented matrix
(1 2 3 9
 4 5 6 24
 3 1 −2 4)
For the general system,
coefficient matrix =
(a11 a12 . . . a1n
 . . .
 am1 am2 . . . amn)
augmented matrix =
(a11 a12 . . . a1n b1
 . . .
 am1 am2 . . . amn bm)
A row of the augmented matrix corresponds to an equation in the system.
The number in the i-th row and j-th column of a matrix is called the ij-entry.
Operations on the equations in a system are operations on the rows of the augmented matrix.
(1 2 3 9        (1 2 3 9          (1 2 3 9
 4 5 6 24   →    0 −3 −6 −12   →   0 −3 −6 −12
 3 1 −2 4)       0 −5 −11 −23)     0 0 −1 −3)
Definition 5.8. Elementary row operations are the following operations on the rows of an (augmented) matrix:
1) Add a scalar multiple of one row to another:
ri → ri + λrj, λ ∈ R
2) Swap two rows:
ri ↔ rj
3) Multiply any row by a nonzero scalar:
ri → λri, λ ≠ 0
Doing these elementary row operations to an augmented matrix does not change the solutions of the system.
Idea of Gaussian elimination
We start with a “hard” system. Then we do a couple of row operations and get an
“easy” system. What makes it easy are zeros under the main diagonal.
Definition 5.9. An m × n matrix is in echelon form if
(i) The first non-zero number in each row occurs to the right of the first non-zero
number in any higher row
(ii) All rows (0, 0, . . . , 0) appear at the bottom.
Example 5.7. The following matrices
(1 2 3 9
 0 −3 −6 −12
 0 0 −3 −9)
(0 0 1
 0 0 0)
are in echelon form. This one is not:
(0 1 3
 1 0 0
 0 0 0)
The point: if a system has its augmented matrix in echelon form, the system is easy to solve.
Example 5.8. Solve the system of equations with augmented matrix
(2 −1 3 0
 0 1 1 1
 0 0 0 0)
This system is
2x1 − x2 + 3x3 = 0
x2 + x3 = 1
0 = 0
Solve from the bottom up. Equation 3 tells us nothing. Let x3 = a (any a ∈ R). Then equation 2 implies that x2 = 1 − a. Equation 1 implies 2x1 = x2 − 3x3 = 1 − 4a, therefore x1 = (1 − 4a)/2.
Therefore the solutions are (x1, x2, x3) = ((1/2)(1 − 4a), 1 − a, a). E.g. when a = 0, we get the solution (1/2, 1, 0).
Example 5.9. Solve the system with augmented matrix
(2 −1 3 0
 0 1 1 0
 0 0 0 2)
The system thus is
2x1 − x2 + 3x3 = 0
x2 + x3 = 0
0 = 2
The third equation instantly implies there is no solution at all.
Example 5.10.
(1 1 1 3 2 4 0
 0 0 0 2 −1 0 3
 0 0 0 0 1 1 0)
The system is
x1 + x2 + x3 + 3x4 + 2x5 + 4x6 = 0
2x4 − x5 = 3
x5 + x6 = 0
Equation 3 sets x6 = a (any a). Then x5 = −a. From equation 2 we get x4 = (1/2)(x5 + 3) = (1/2)(3 − a). From equation 1 we get x3 = b and x2 = c, free. Then x1 = −c − b − (3/2)(3 − a) + 2a − 4a = −9/2 − (1/2)a − b − c. The general solution is
(x1, x2, x3, x4, x5, x6) = (−9/2 − (1/2)a − b − c, c, b, (1/2)(3 − a), −a, a)
for any a, b, c ∈ R.
In general, if the augmented matrix is in echelon form, solve the system by solving the last equation, then the next-to-last, and so on.
This method is called back substitution.
The variables we can set free, equal to a, b, c etc., are the free variables. E.g. in the previous example, the free variables are x6, x3, x2.
Theorem 5.1. (Gaussian elimination theorem)
Any matrix can be reduced by elementary row operations to a matrix which is in echelon form.
Method 5.1. (Gaussian elimination method)
Given a system of linear equations (as an augmented matrix), we put the augmented matrix into echelon form using elementary row operations, then solve the new system by back substitution.
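The whole method can be sketched in a few lines of code for a square system with a unique solution; a minimal Python illustration (not from the lecture – it only swaps in a nonzero pivot and does no full pivoting), applied to Example 5.6:

```python
# Minimal Gaussian elimination with back substitution for an n x n system,
# given as an augmented matrix (rows [a_i1, ..., a_in, b_i]).
# Assumes a unique solution exists.

def solve(aug):
    n = len(aug)
    m = [row[:] for row in aug]        # work on a copy
    for col in range(n):
        # find a row with a nonzero pivot and swap it up
        piv = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[piv] = m[piv], m[col]
        # clear the entries below the pivot (elementary row operations)
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    # back substitution, from the last equation up
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

sol = solve([[1, 2, 3, 9], [4, 5, 6, 24], [3, 1, -2, 4]])
print(sol)  # approximately [4, -2, 3], as in Example 5.6
```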
Example 5.11. Solve the system with augmented matrix
A =
(1 0 3 2 5 1
 1 0 3 5 8 1
 1 1 4 5 7 0
 1 −1 2 −1 3 2)
Answer.
• Step 1 – clear the first column using the top left hand entry (i.e. from equations 2, 3, 4 take away equation 1):
A →
(1 0 3 2 5 1
 0 0 0 3 3 0
 0 1 1 3 2 −1
 0 −1 −1 −3 −2 1)
• Step 2 – swap rows 2 and 4:
→
(1 0 3 2 5 1
 0 −1 −1 −3 −2 1
 0 1 1 3 2 −1
 0 0 0 3 3 0)
• Step 3 – clear the second column using row 2:
→
(1 0 3 2 5 1
 0 −1 −1 −3 −2 1
 0 0 0 0 0 0
 0 0 0 3 3 0)
• Step 4 – swap rows 3 and 4:
→
(1 0 3 2 5 1
 0 −1 −1 −3 −2 1
 0 0 0 3 3 0
 0 0 0 0 0 0)
• Step 5 – solve by back substitution. Let x5 = a. Then x4 = −a, x3 = b, x2 = −1 − b + a, x1 = 1 − 3b − 3a.
Matrix Algebra
Definition 5.10. Matrix multiplication.
Dot product in Rn: for x = (x1, . . . , xn) and y = (y1, . . . , yn) (as columns),
x.y = x1y1 + · · · + xnyn
It is convenient, given a row vector z = (z1, . . . , zn), to also define z.y = z1y1 + · · · + znyn. Then one can define matrix multiplication: if A is m × n with rows a1, . . . , am and B is n × p with columns b1, . . . , bp, then
AB = (cij), the m × p matrix with cij = ai.bj
Note. If A is 1 × n, A = (a1 . . . an) = a, and B is n × 1 with entries b1, . . . , bn, B = b, then AB is the 1 × 1 matrix (a.b).
Key properties of matrix multiplication
1) One can multiply an m × n matrix A with an r × s matrix B iff r = n. Then AB will be an m × s matrix.
2) Commutativity fails, i.e. in general AB ≠ BA. In order for both AB and BA to be defined, they must be square matrices of the same size, i.e. n × n for some n.
Example 5.12.
A = (1 2
     3 4), B = (0 1
                −1 2)
Then
AB = (−2 5
      −4 11)
BA = (3 4
      5 6)
3) Matrix multiplication, as for × on R, is associative.
Theorem 5.2. Given any matrices A m × n, B n × p, C p × q
(AB)C = A(BC)
Example 5.13.
A = (1 2
     3 4), B = (0 1
                −1 2), C = (1 2
                            3 0)
Then
AB = (−2 5
      −4 11), (AB)C = (13 −4
                       29 −8)
BC = (3 0
      5 −2), A(BC) = (13 −4
                      29 −8)
Proof. First we do the special case when C is a p × 1 matrix, i.e. a column vector. Let A be m × n, B be n × p, and x be p × 1. We show that
(AB)x = A(Bx)
Let y = Bx and z = Ay, so z is the right hand side. The claim is z = (AB)x.
Here y = Bx, with y of size n × 1, B of size n × p, x of size p × 1, so
yj = bj1x1 + · · · + bjpxp   (1 ≤ j ≤ n)
and z = Ay, with z of size m × 1, A of size m × n, so
zi = ai1y1 + · · · + ainyn   (1 ≤ i ≤ m)
Then
zi = ai1(b11x1 + · · · + b1pxp) + ai2(b21x1 + · · · + b2pxp) + · · · + ain(bn1x1 + · · · + bnpxp)
= (ai1b11 + ai2b21 + · · · + ainbn1)x1 + · · · + (ai1b1p + ai2b2p + · · · + ainbnp)xp
Thus the coefficient of xj in zi is
ai1b1j + ai2b2j + · · · + ainbnj
which is ai.bj, where ai is the i-th row of A and bj is the j-th column of B – which by definition is the (i, j) entry of AB! So z = (AB)x.
Now the general claim:
(AB)C = A(BC)
Write C = (c1 . . . cq), with columns c1, . . . , cq. Then
BC = (Bc1 . . . Bcq)
Therefore, by the special case applied to each column,
A(BC) = (A(Bc1) . . . A(Bcq)) = ((AB)c1 . . . (AB)cq) = (AB)C
∎
4) Again, as for + and × on R, distributivity holds
Proposition 5.3. Given matrices A m × n, B, C, both n × p
A(B + C) = AB + AC
Powers of matrices
Definition 5.11. A square matrix is one which is n × n for some n. If A is n × n, define
A2 = AA
A3 = (AA)A = A(AA)
A4 = A3A
. . .
An = An−1A
Example 5.14. With
A = (1 2
     0 1)
we get
A2 = AA = (1 4
           0 1)
Definition 5.12. The identity matrix is
2 × 2: I2 = (1 0
             0 1)
3 × 3: I3 = (1 0 0
             0 1 0
             0 0 1)
n × n: In = the n × n matrix with 1s on the diagonal and 0s elsewhere.
Proposition 5.4. If A is m × n and B is n × p then
AIn = A
InB = B
Link between addition and multiplication
Proposition 5.5.
(i) If A is m × n and B, C are n × p, then
A(B + C) = AB + AC
(ii) If D, E are m × n and F is n × p, then
(D + E)F = DF + EF
Proof.
(i) Let A = (aij), B = (bij), C = (cij). The ij-entry of AB is
(ai1 . . . ain).(b1j, . . . , bnj) = ai1b1j + · · · + ainbnj
So the ij-entry of A(B + C) is
ai1(b1j + c1j) + · · · + ain(bnj + cnj)
= ai1b1j + ai1c1j + · · · + ainbnj + aincnj
= (ai1b1j + · · · + ainbnj) + (ai1c1j + · · · + aincnj)
This is the ij-entry of AB + AC.
(ii) Left to the enthusiastic reader. ∎
Chapter 6
Some applications of matrix algebra
6.1 Linear equations
We know that a general system of linear equations
a11x1 + · · · + a1nxn = b1
...
am1x1 + · · · + amnxn = bm
can be expressed as a matrix product
Ax = b
where A = (aij), x = (x1, . . . , xn) and b = (b1, . . . , bm) (as columns).
Proposition 6.1. The number of solutions of a system Ax = b is 0, 1 or ∞.
Proof. Suppose the number of solutions is not 0 or 1, i.e. there are at least two solutions. We prove that this implies that there are infinitely many solutions.
Let p and q be two different solutions (p, q ∈ Rn), so
Ap = b
Aq = b
and p ≠ q. Then
A(p − q) = Ap − Aq = 0
For any scalar λ ∈ R,
A(λ(p − q)) = λA(p − q) = 0
So
A(p + λ(p − q)) = Ap + A(λ(p − q)) = Ap + 0 = b
So p + λ(p − q) is a solution of the original system. Since p ≠ q, p − q ≠ 0, and so each different scalar λ gives a different solution. So there are ∞ solutions. ∎
Structure of solutions
Proposition 6.2.
(i) System Ax = 0 either has one solution (which must be x = O) or it has an
infinite number of solutions.
(ii) Suppose p is a solution of a system Ax = b (i.e. p ∈ Rn, Ap = b). Then all solutions of Ax = b take the form
p + h
where h is a solution of the system Ax = 0.
Example 6.1. System
(1 0 3
 0 1 1) x = (2
             1)
The general solution is
x = (2 − 3a, 1 − a, a)
A particular solution is p = (2, 1, 0). So the general solution is
(2 − 3a, 1 − a, a) = (2, 1, 0) + (−3a, −a, a) = p + h
where h = (−3a, −a, a) is the general solution of
(1 0 3
 0 1 1) x = (0
             0)
Proof.
(i) Clear from the previous proposition.
(ii) Let q be any solution of Ax = b. So Ap = b, Aq = b. So A(q − p) = 0. Put q − p = h, a solution to Ax = 0. Then q = p + h.
∎
6.2 Population Distribution
Example 6.2. A population is classified in 3 income states:
(1) Poor (P)
(2) Middle income (M)
(3) Rich (R)
Over one generation (a 20 year period),
P → 20% M, 5% R, the rest stay P
M → 25% P, 10% R
R → 5% P, 30% M
Summarize this information in a matrix. It's a 3 × 3 matrix T, T = (tij), where
tij = proportion moving from state j → state i
So
T = (0.75 0.25 0.05
     0.20 0.65 0.30
     0.05 0.10 0.65)
This is called the transition matrix. All entries are ≥ 0, and the column sums are 1.
Say we start with proportions p1, p2, p3 in states 1, 2, 3. Then
p(0) = (p1, p2, p3) (as a column)
is the initial population vector.
After 1 generation,
proportion in state 1 = t11p1 + t12p2 + t13p3
proportion in state 2 = t21p1 + t22p2 + t23p3
proportion in state 3 = t31p1 + t32p2 + t33p3
So the population vector after 1 generation is
p(1) = Tp(0)
Similarly, after 2 generations,
p(2) = Tp(1) = T²p(0)
Continuing, we see that after n generations the population vector is
p(n) = Tⁿp(0)
This is an example of a Markov chain – a population is divided into states 1, . . . , n, and
we're given the proportion t_ij moving from state j to state i over a generation. The
transition matrix T = (t_ij) is n × n with the properties
1) t_ij ≥ 0
2) all column sums are 1
A Markov chain is regular if some power of T has no zero entries.
Example 6.3. The above example is regular.
T = (0 1; 1 0) is not regular: every power of T is either I or T itself, and both have zero entries.
Basic Fact
In a regular Markov chain, as n grows, the vector T^n p(0) gets closer and closer to a
steady state vector s = (s1; s2; . . . ; sn), where
1) s1 + · · · + sn = 1
2) Ts = s
This is true whatever the initial population vector p(0). The proof is not hard.
Example 6.4. On a desert island, a veg-prone community dines according to the rules
(1) no-one eats meat 2 days in a row
(2) if someone doesn’t eat meat one day, they toss a coin: heads eat meat next day,
tails don’t
What proportion can be expected to eat meat on a given day?
Answer – Markov chain.
• State 1: meat
• State 2: no meat
• “generation”: 1 day
Transition matrix
T = (0 1/2; 1 1/2)
Notice T² = (1/2 1/4; 1/2 3/4) has no zero entries, so the chain is regular. By our Basic Fact, we have a steady state vector s = (s1; s2), where s1 + s2 = 1 and Ts = s.
So
(1/2)s2 = s1
s1 + (1/2)s2 = s2
s1 + s2 = 1
Solving these gives s1 = 1/3, s2 = 2/3. So in the long run, 1/3 of the population will be eating meat on a given day.
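The steady state found above can be checked exactly with rational arithmetic (a Python sketch of mine, not part of the notes):

```python
from fractions import Fraction as F

# Steady state of T = (0 1/2; 1 1/2): verify Ts = s with s1 + s2 = 1.
T = [[F(0), F(1, 2)],
     [F(1), F(1, 2)]]
# From (1/2)s2 = s1 and s1 + s2 = 1: s1 = 1/3, s2 = 2/3.
s = [F(1, 3), F(2, 3)]
Ts = [T[0][0] * s[0] + T[0][1] * s[1],
      T[1][0] * s[0] + T[1][1] * s[1]]
assert Ts == s and sum(s) == 1
```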
6.3 Matrices and Geometry
6.3.1 Rotation
Consider a rotation about the origin through angle θ, sending x = (x1; x2) to y = (y1; y2). Writing x1 = r cos α, x2 = r sin α,
y1 = r cos(θ + α) = r(cos θ cos α − sin θ sin α) = x1 cos θ − x2 sin θ
y2 = r sin(θ + α) = r(sin θ cos α + cos θ sin α) = x1 sin θ + x2 cos θ
So
(y1; y2) = (cos θ −sin θ; sin θ cos θ)(x1; x2)
Call Rθ = (cos θ −sin θ; sin θ cos θ), a rotation matrix.
6.3.2 Reflection
Let s be the reflection in the x1-axis, sending
x = (x1; x2) → (x1; −x2) = (1 0; 0 −1)(x1; x2)
So the matrix S = (1 0; 0 −1) represents the reflection s.
6.3.3 Combining Transformations
Two rotations, say through θ then γ:
x → rγ(rθ(x)), i.e. x → Rγ(Rθx) = (RγRθ)x
RγRθ = (cos γ −sin γ; sin γ cos γ)(cos θ −sin θ; sin θ cos θ) = Rγ+θ
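The identity RγRθ = Rγ+θ is easy to confirm numerically (a Python sketch of mine; the angles are arbitrary choices):

```python
import math

def R(t):
    # rotation matrix R_t = (cos t  -sin t; sin t  cos t)
    return [[math.cos(t), -math.sin(t)],
            [math.sin(t),  math.cos(t)]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

theta, gamma = 0.4, 1.1   # arbitrary angles (my choice)
left = matmul(R(gamma), R(theta))
right = R(gamma + theta)
assert all(abs(left[i][j] - right[i][j]) < 1e-12
           for i in range(2) for j in range(2))
```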
Reflection + Rotation
Now consider srθ, sending x → s(rθ(x)). This sends
x → S(Rθx) = (SRθ)x = (1 0; 0 −1)(cos θ −sin θ; sin θ cos θ)x = (cos θ −sin θ; −sin θ −cos θ)x
Chapter 7
Inverses
Definition 7.1. Let A be a square (n × n) matrix. Say another n × n matrix B is an inverse of A iff AB = BA = In.
The inverse of A is denoted A⁻¹. If A has an inverse, A is invertible.
Example 7.1. A = (1 1; 0 1). Then
(1 1; 0 1)(1 −1; 0 1) = (1 0; 0 1)
(1 −1; 0 1)(1 1; 0 1) = (1 0; 0 1)
so A⁻¹ = (1 −1; 0 1).
Example 7.2.
(1 2; 0 0)(a b; c d) = (a + 2c  b + 2d; 0 0)
which cannot equal I. So (1 2; 0 0) has no inverse.
Proposition 7.1. If A is invertible then its inverse is unique.
Proof. Suppose B, C are inverses of A. Then
AB = BA = I
AC = CA = I
B = BI = B(AC) = (BA)C = IC = C
□
7.1 Relation to linear equations
Proposition 7.2. Suppose A is invertible. Then any system Ax = b has a unique
solution
x = A−1b
Proof.
Ax = b ⇔ A⁻¹(Ax) = A⁻¹b ⇔ (A⁻¹A)x = A⁻¹b ⇔ x = A⁻¹b
□
Example 7.3. The system
x1 + x2 = 2
x2 = 3
is (1 1; 0 1)x = (2; 3). By the previous proposition,
x = (1 −1; 0 1)(2; 3) = (−1; 3)
7.2 Finding inverses
2 × 2 matrices
Let A = (a b; c d). Observe
(a b; c d)(d −b; −c a) = (ad − bc  0; 0  ad − bc) = (ad − bc)I
Using this, we can prove the following
Proposition 7.3.
1) If ad − bc ≠ 0 then A is invertible and
A⁻¹ = (1/(ad − bc)) (d −b; −c a)
2) If ad − bc = 0 then A is not invertible.
Proof. 1) follows immediately from the observation above (and the same computation for the product in the other order).
2) Suppose ad − bc = 0. Then AB = 0, where B = (d −b; −c a). Assume A is invertible. Then
A⁻¹(AB) = A⁻¹0 = 0 and A⁻¹(AB) = (A⁻¹A)B = IB = B
Therefore B = 0, so a = b = c = d = 0, hence A = 0. But the zero matrix is not invertible – contradiction. □
7.3 Finding inverses in general
Let A = (a_ij) be an n × n matrix. We want an n × n matrix X = (x_ij) such that AX = I. To solve this, we write one giant augmented matrix
(A | I) = (a_11 . . . a_1n | 1 . . . 0; . . . ; a_n1 . . . a_nn | 0 . . . 1)
Now use Gaussian elimination to reduce it to Gaussian form. There are two possibilities:
1) There is a row with zeros on the “left side” and a non-zero element on the “right side”. There is no inverse in this case.
2) We don’t get into the first situation. Therefore we can put the left matrix into
Echelon form. Then, we can get rid of non-zero elements above the main diagonal
of the left matrix and get a matrix
(I|E)
So AE = I.
Starting with (E|I) and reversing all the row operations, we end up with (I|A). So solving EX = I gives X = A, completing the proof that E = A⁻¹.
Proposition 7.4.
1) If we can reduce (A|I) to (I|E) using elementary row operations, then E = A−1.
2) If we can reduce (A|I) to a matrix with zeros in the left side of the last line and a non-zero element in the right side of the last line, then A is not invertible.
Example 7.4. Find the inverse of
A = (1 3 −2; 2 5 −3; −3 2 −4)
The augmented matrix is
(1 3 −2 | 1 0 0; 2 5 −3 | 0 1 0; −3 2 −4 | 0 0 1)
→ (1 3 −2 | 1 0 0; 0 −1 1 | −2 1 0; 0 11 −10 | 3 0 1)
→ (1 3 −2 | 1 0 0; 0 −1 1 | −2 1 0; 0 0 1 | −19 11 1)
→ (1 3 −2 | 1 0 0; 0 1 −1 | 2 −1 0; 0 0 1 | −19 11 1)
→ (1 3 0 | −37 22 2; 0 1 0 | −17 10 1; 0 0 1 | −19 11 1)
→ (1 0 0 | 14 −8 −1; 0 1 0 | −17 10 1; 0 0 1 | −19 11 1)
Thus the inverse is
A⁻¹ = (14 −8 −1; −17 10 1; −19 11 1)
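The whole (A|I) → (I|A⁻¹) procedure is mechanical, so it can be coded directly. A Python sketch of mine (not part of the notes), using exact rational arithmetic, reproduces the inverse of Example 7.4:

```python
from fractions import Fraction as F

def inverse(A):
    """Gauss-Jordan: reduce (A | I) to (I | A^{-1}); return None if A is singular."""
    n = len(A)
    # Build the giant augmented matrix (A | I) with exact arithmetic.
    M = [[F(A[i][j]) for j in range(n)] +
         [F(1) if j == i else F(0) for j in range(n)]
         for i in range(n)]
    for col in range(n):
        piv = next((r for r in range(col, n) if M[r][col] != 0), None)
        if piv is None:
            return None                              # no pivot: not invertible
        M[col], M[piv] = M[piv], M[col]              # move pivot row up
        M[col] = [x / M[col][col] for x in M[col]]   # scale pivot to 1
        for r in range(n):                           # clear the column elsewhere
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [M[r][j] - factor * M[col][j] for j in range(2 * n)]
    return [row[n:] for row in M]

# Example 7.4
A = [[1, 3, -2], [2, 5, -3], [-3, 2, -4]]
assert inverse(A) == [[F(14), F(-8), F(-1)],
                      [F(-17), F(10), F(1)],
                      [F(-19), F(11), F(1)]]
```

The `None` branch corresponds to case 1) of Proposition 7.4: a column with no pivot means the left side cannot be reduced to I.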
A result linking inverses, linear equations and echelon forms
Proposition 7.5. Let A be a square n × n matrix. The following four statements are equivalent:
(1) A is invertible
(2) Any system Ax = b has a unique solution
(3) The system Ax = 0 has the unique solution x = 0
(4) A can be reduced to the identity In using ERO.
Proof. We prove (1) ⇒ (2), (2) ⇒ (3), (3) ⇒ (4), (4) ⇒ (1).
(1) ⇒ (2): done above, x = A⁻¹b.
(2) ⇒ (3): obvious.
(3) ⇒ (4): needs to be proved.
(4) ⇒ (1): proved earlier. □
Chapter 8
Determinants
Recall that the matrix (a b; c d) is invertible iff ad − bc ≠ 0.
Definition 8.1. The determinant of a 2 × 2 matrix is
det (a b; c d) = ad − bc
Also written as |a b; c d| or |A|.
Example 8.1.
|1 2; 3 4| = −2
|Rθ| = |cos θ −sin θ; sin θ cos θ| = cos²θ + sin²θ = 1
Recall
Proposition 8.1. For a 2 × 2 matrix A,
|A| ≠ 0 ⇔ A is invertible ⇔ Ax = b has a unique solution.
The aim is to define det(A) for 3 × 3 and larger matrices in such a way that this result is still true.
Definition 8.2. For a 3× 3 matrix
A = (a11 a12 a13; a21 a22 a23; a31 a32 a33)
define the determinant of A to be
det(A) = a11 |a22 a23; a32 a33| − a12 |a21 a23; a31 a33| + a13 |a21 a22; a31 a32|
= a11(a22a33 − a23a32) − a12(a21a33 − a23a31) + a13(a21a32 − a22a31)
Example 8.2.
|1 2 3; 4 5 6; 1 1 2| = 1·|5 6; 1 2| − 2·|4 6; 1 2| + 3·|4 5; 1 1| = 4 − 4 − 3 = −3
Example 8.3.
|a x y; 0 b z; 0 0 c| = abc
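The expansion in Definition 8.2 translates directly into code. A Python sketch of mine (not part of the notes), checked against Examples 8.2 and 8.3:

```python
def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def det3(m):
    """Expansion of |A| by the first row (Definition 8.2)."""
    s = 0
    for j in range(3):
        # 1j-minor: delete row 0 and column j
        minor = [[m[r][c] for c in range(3) if c != j] for r in (1, 2)]
        s += (-1) ** j * m[0][j] * det2(minor)
    return s

assert det3([[1, 2, 3], [4, 5, 6], [1, 1, 2]]) == -3   # Example 8.2
assert det3([[7, 9, 4], [0, 5, 8], [0, 0, 3]]) == 105  # triangular: abc
```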
8.1 Properties
Definition 8.3. If A is a 3 × 3 matrix, the ij-minor of A is the 2 × 2 matrix A_ij obtained by deleting the ith row and jth column of A.
Then
|A| = a11|A11| − a12|A12| + a13|A13|
This is called the expansion of |A| by the first row.
Example 8.4.
A = (1 2 3; 4 5 6; 1 1 2),  A11 = (5 6; 1 2)
We don’t have to expand by the first row.
Proposition 8.2. Expansion by the second row
|A| = −a21|A21|+ a22|A22| − a23|A23|
third row
|A| = a31|A31| − a32|A32|+ a33|A33|
Proof. For the second row,
RHS = −a21(a12a33 − a13a32) + a22(a11a33 − a13a31) − a23(a11a32 − a12a31) = |A|
Check the third row similarly. □
Proposition 8.3. If A (3× 3) has two equal rows, then |A| = 0.
Proof. WLOG¹ say that
A = (a1 a2 a3; a1 a2 a3; c1 c2 c3)
Expand by the third row to get |A| = 0. □
1Without Loss Of Generality
8.2 Effects of row operations on determinant
Proposition 8.4. Let A be 3 × 3 with rows r1, r2, r3.
1) The row operation ri → ri + λrj (i ≠ j) does not change |A|.
2) Swapping two rows changes |A| to −|A|.
3) The row operation ri → λri changes |A| to λ|A|.
Proof.
1) Say i = 2, j = 1, so the row op. sends
A = (a1 a2 a3; b1 b2 b3; c1 c2 c3) → A′ = (a1 a2 a3; b1+λa1 b2+λa2 b3+λa3; c1 c2 c3)
Expand by the second row:
|A′| = −(b1+λa1)|a2 a3; c2 c3| + (b2+λa2)|a1 a3; c1 c3| − (b3+λa3)|a1 a2; c1 c2|
= |A| + λ|a1 a2 a3; a1 a2 a3; c1 c2 c3| = |A|
using 8.3 for the last step.
2) Say we swap rows 1 and 2. Expanding by the second row,
|b1 b2 b3; a1 a2 a3; c1 c2 c3| = −a1|b2 b3; c2 c3| + a2|b1 b3; c1 c3| − a3|b1 b2; c1 c2| = −|A|
3) This is clear when we expand |A′| by the multiplied row.
□
Example 8.5.
|1 2 −1; −2 3 4; 5 9 −4| = |1 2 −1; 0 7 2; 0 −1 1| = |7 2; −1 1| + 0 + 0 = 9
Example 8.6.
|1 x x²; 1 y y²; 1 z z²| = |1 x x²; 0 y−x y²−x²; 0 z−x z²−x²|
= (y − x)(z − x) |1 x x²; 0 1 y+x; 0 1 z+x|
= (y − x)(z − x) |1 x x²; 0 1 y+x; 0 0 z−y|
= (y − x)(z − x)(z − y)
This is the 3 × 3 Vandermonde determinant.
Proposition 8.5. Let A be a 3 × 3 matrix, and let A′ be obtained from A by elementary row operations. Then
|A| = 0 ⇔ |A′| = 0
Proof. Doing a row op. changes |A| to |A|, −|A| or λ|A| (where λ ≠ 0). □
Main result
Theorem 8.6. Let A be a 3 × 3 matrix. Then
|A| ≠ 0 ⇔ A is invertible
(equivalently, the system Ax = 0 has a unique solution, or A → I3 by row ops).
Proof.
⇒ Suppose |A| ≠ 0. Reduce A to echelon form A′ by row operations. By 8.5, |A′| ≠ 0. If A′ had a zero row, we would have |A′| = 0. Hence A′ = (1 · ·; 0 1 ·; 0 0 1), so A′ can be reduced to I. So A is invertible.
⇐ Suppose A is invertible. Then A can be reduced to I by row ops. Since |I| = 1, by 8.5, |A| ≠ 0. □
Example 8.7. For which values of a is A invertible?
A = (1 2 5; 1 3 7; 1 4 a)
|A| = |1 2 5; 0 1 2; 0 2 a−5| = a − 9
So by 8.6, A is invertible iff a ≠ 9.
Important consequence
Corollary 8.7. Let A be a 3 × 3 matrix. Suppose the system Ax = 0 has a non-zero solution x ≠ 0. Then |A| = 0.
Example 8.8. Let A = (1 2 5; 1 3 7; 1 4 a). The system Ax = 0 has augmented matrix
(1 2 5 | 0; 1 3 7 | 0; 1 4 a | 0) → (1 2 5 | 0; 0 1 2 | 0; 0 2 a−5 | 0) → (1 2 5 | 0; 0 1 2 | 0; 0 0 a−9 | 0)
So if a ≠ 9, the only solution is x = 0. If a = 9, the system has infinitely many solutions. We saw that |A| = 0 ⇔ a = 9.
8.3 n × n determinants
It is possible to define det(A) for any n × n matrix A, and to prove that all the results of this section are true for n × n matrices. The proofs wait until next year.
Definition 8.4. (4 × 4 determinant.) If A = (a_ij) is 4 × 4,
|A| = a11|A11| − a12|A12| + a13|A13| − a14|A14|
Similarly, define the n × n determinant in terms of (n−1) × (n−1) determinants (a recursive definition).
Example 8.9.
|1 2 3 4; −1 0 1 5; 2 1 0 2; 0 −1 1 3| = |0 1 5; 1 0 2; −1 1 3| − 2·|−1 1 5; 2 0 2; 0 1 3| + 3·(. . .) − . . .
Chapter 9
Eigenvalues and eigenvectors
Fibonacci’s Rabbits
Rabbits are born in pairs. A newly born pair produces no offspring in the first month, but then one pair in each following month. No deaths. Start with one pair. How many pairs after n months?
Answer. Let Fn = number of pairs. Then
Fn = Fn−1 + Fn−2
So F0, F1, F2, . . . = 0, 1, 1, 2, 3, 5, 8, 13, . . . is the Fibonacci sequence. In matrix form
(Fn; Fn−1) = (1 1; 1 0)(Fn−1; Fn−2)
Writing A = (1 1; 1 0): now (F1; F0) = (1; 0), so (F2; F1) = A(F1; F0), and so on. Hence
(Fn+1; Fn) = A^n (F1; F0)
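The matrix form can be run directly: multiplying out A^n (1; 0) really does produce the Fibonacci numbers. A Python sketch of mine (not part of the notes):

```python
# (F(n+1); F(n)) = A^n (1; 0) with A = (1 1; 1 0).
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def fib(n):
    A = [[1, 1], [1, 0]]
    P = [[1, 0], [0, 1]]     # identity = A^0
    for _ in range(n):
        P = matmul(P, A)     # after the loop, P = A^n
    # A^n (1; 0) = (F(n+1); F(n)), so F(n) is the bottom-left entry of A^n
    return P[1][0]

assert [fib(n) for n in range(9)] == [0, 1, 1, 2, 3, 5, 8, 13, 21]
```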
9.1 Eigenvectors
Definition 9.1. Let A be an n × n matrix. A vector v = (v1; . . . ; vn) ∈ R^n is an eigenvector of A if
(1) v ≠ 0
(2) Av = λv for some λ ∈ R
The scalar λ is an eigenvalue of A.
Example 9.1.
(3 2; 2 0)(1; −2) = (−1; 2) = −1·(1; −2)
So (1; −2) is an eigenvector of A = (3 2; 2 0), and −1 is its eigenvalue.
Example 9.2.
(3 2; 2 0)(1; 1) = (5; 2) ≠ λ(1; 1)
So (1; 1) is not an eigenvector of (3 2; 2 0).
9.2 How to find eigenvectors and eigenvalues
Let A be an n × n matrix. An eigenvector x is a non-zero solution of the system
Ax = λx
Example 9.3.
(3 2; 2 0)(x1; x2) = λ(x1; x2)
We can get
(3 − λ)x1 + 2x2 = 0
2x1 − λx2 = 0
So the equation is
|3−λ 2; 2 −λ| = 0
λ² − 3λ − 4 = 0
The eigenvalues are −1 and 4.
For λ = −1, the eigenvectors are the non-zero solutions of
(A + I)x = 0, i.e. (4 2; 2 1)x = 0
So the eigenvectors are (a; −2a) (a ∈ R, a ≠ 0).
For λ = 4:
(−1 2; 2 −4)x = 0
So the eigenvectors are (2b; b) (b ∈ R, b ≠ 0).
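A quick check of the eigenpairs in Example 9.3 (a Python sketch of mine, not part of the notes): Av = λv holds exactly for both pairs.

```python
# Check the eigenpairs of A = (3 2; 2 0): λ = -1 with (1; -2), λ = 4 with (2; 1).
A = [[3, 2], [2, 0]]

def matvec(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

for lam, v in [(-1, [1, -2]), (4, [2, 1])]:
    assert matvec(A, v) == [lam * v[0], lam * v[1]]
```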
When does the following equation have a non-zero solution?
Ax − λx = 0
(A − λI)x = 0
Precisely when |A − λI| = 0 (by 8.6).
Proposition 9.1.
(1) If A is a 3×3 or 2×2, then the eigenvalues of A are the solutions λ of |A−λI| = 0.
(2) If λ is an eigenvalue, then the corresponding eigenvectors are the non-zero solu-
tions of
(A− λI)x = 0
Definition 9.2. The equation |A − λI| = 0 is the characteristic equation of A, and |A − λI| is the characteristic polynomial of A.
9.2.1 Back to Fibonacci
We had
(Fn+1; Fn) = A^n (1; 0)
where A = (1 1; 1 0).
Strategy
(1) Find the eigenvalues λ1, λ2 and eigenvectors v1 and v2 of A. Observe
Av1 = λ1v1
Av2 = λ2v2
Then
A²v1 = A(Av1) = A(λ1v1) = λ1Av1 = λ1²v1
Similarly
A^n v1 = λ1^n v1
A^n v2 = λ2^n v2
(2) Express (1; 0) as a combination of v1 and v2:
(1; 0) = αv1 + βv2 (α, β ∈ R)
Then
A^n (1; 0) = A^n (αv1 + βv2) = αA^n v1 + βA^n v2 = αλ1^n v1 + βλ2^n v2 (9.1)
Calculations
(1) The characteristic equation is |A − λI| = 0, i.e.
|1−λ 1; 1 −λ| = 0
i.e.
λ² − λ − 1 = 0
λ1 = (1 + √5)/2
λ2 = (1 − √5)/2
Eigenvectors for λ1:
(1−λ1 1; 1 −λ1)x = 0
So v1 = (λ1; 1) is an eigenvector. Similarly v2 = (λ2; 1).
(2) We now find α and β such that
(1; 0) = α(λ1; 1) + β(λ2; 1)
α = 1/(λ1 − λ2),  β = 1/(λ2 − λ1)
Putting all this into (9.1), and noting λ1 − λ2 = √5,
(Fn+1; Fn) = A^n (1; 0) = αλ1^n v1 + βλ2^n v2
= (1/√5)((1 + √5)/2)^n (λ1; 1) − (1/√5)((1 − √5)/2)^n (λ2; 1)
Taking the second component gives the formula
Fn = (1/√5)[((1 + √5)/2)^n − ((1 − √5)/2)^n]
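The closed formula can be checked against the recurrence numerically (a Python sketch of mine; rounding compensates for floating-point error in the powers of √5):

```python
import math

# Closed form derived above vs. the recurrence Fn = Fn-1 + Fn-2.
def fib_closed(n):
    r5 = math.sqrt(5)
    return round((((1 + r5) / 2) ** n - ((1 - r5) / 2) ** n) / r5)

def fib_rec(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

assert all(fib_closed(n) == fib_rec(n) for n in range(30))
```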
9.3 Diagonalization
We want to investigate functions of matrices, e.g. A^n, A^(1/n), f(A).
Definition 9.3. An n × n matrix D is a diagonal matrix if all entries off the main diagonal are 0, so D has some α1, . . . , αn down the diagonal and zeros elsewhere.
Example 9.4. (1 0; 0 −2)
It’s easy to find powers of diagonal matrices.
Proposition 9.2. Let D and E be the diagonal matrices with diagonal entries α1, . . . , αn and β1, . . . , βn respectively. Then DE is the diagonal matrix with entries α1β1, . . . , αnβn, and D^k is the diagonal matrix with entries α1^k, . . . , αn^k.
Proof. DE follows from the definition of matrix multiplication. Take E = D to get D², and repeat to get D^k. □
Aim – To relate an arbitrary square matrix A to a diagonal matrix, and exploit this to find A^n, etc.
2 × 2
Let A be 2 × 2 and suppose A has eigenvalues α1, α2 with eigenvectors v1, v2. Assume α1 ≠ α2. We get
Av1 = α1v1
Av2 = α2v2
Cleverly define the 2 × 2 matrix P with v1, v2 as column vectors:
P = (v1 v2)
Then
AP = A(v1 v2) = (Av1 Av2) = (α1v1 α2v2) = (v1 v2)(α1 0; 0 α2)
So if we write D = (α1 0; 0 α2), we've shown
AP = PD (9.2)
We claim that P is invertible. For if not, then |P| = 0, which means v1 = λv2 for some scalar λ. But this is false, since then
Av1 = α1v1 = α1λv2
A(λv2) = λα2v2
and these are not equal as α1 ≠ α2. So from (9.2),
P⁻¹AP = P⁻¹PD = D
Summary
Proposition 9.3. Let A be 2 × 2 with distinct eigenvalues α1, α2 and eigenvectors v1, v2. Let
P = (v1 v2)
Then P is invertible and
P⁻¹AP = D = (α1 0; 0 α2)
Note. Also true for 3× 3, . . . matrices.
Example 9.5. Let A = (0 1; −2 3).
1) Find P such that P⁻¹AP is diagonal.
2) Find a formula for A^n.
3) Find a 5th root of A, i.e. find B such that
B⁵ = A
Answer.
1) The characteristic equation of A is
|A − λI| = |−λ 1; −2 3−λ| = 0
i.e.
λ² − 3λ + 2 = 0
(λ − 1)(λ − 2) = 0
The eigenvectors are:
for λ = 1: a(1; 1), (a ≠ 0);
for λ = 2: b(1; 2), (b ≠ 0).
Let P = (1 1; 1 2). Then by Proposition 9.3,
P⁻¹AP = D = (1 0; 0 2)
Note that many other P's work, e.g. (2 −1; 2 −2). Or if Q = (1 1; 2 1), then
Q⁻¹AQ = (2 0; 0 1)
2) Find A^n. We know P⁻¹AP = D. By 9.2,
D^n = (1 0; 0 2^n)
So
(1 0; 0 2^n) = D^n = (P⁻¹AP)^n = P⁻¹AP P⁻¹AP . . . P⁻¹AP = P⁻¹A^n P
So
P⁻¹A^n P = D^n
PP⁻¹A^n PP⁻¹ = PD^n P⁻¹
A^n = PD^n P⁻¹
So
A^n = (1 1; 1 2)(1 0; 0 2^n)(2 −1; −1 1) = (1 2^n; 1 2^(n+1))(2 −1; −1 1) = (2 − 2^n  2^n − 1; 2 − 2^(n+1)  2^(n+1) − 1)
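The identity A^n = PD^nP⁻¹ can be verified against brute-force powers of A (a Python sketch of mine, not part of the notes):

```python
# Verify A^n = P D^n P^{-1} for A = (0 1; -2 3) against repeated multiplication.
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[0, 1], [-2, 3]]
P = [[1, 1], [1, 2]]
Pinv = [[2, -1], [-1, 1]]   # inverse of P (det P = 1)

An = [[1, 0], [0, 1]]
for n in range(1, 8):
    An = matmul(An, A)                          # A^n by brute force
    Dn = [[1, 0], [0, 2 ** n]]
    assert matmul(matmul(P, Dn), Pinv) == An    # = P D^n P^{-1}
```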
3) Find B such that B⁵ = A. Well, if
C = (1 0; 0 2^(1/5))
then C⁵ = D. So
(PCP⁻¹)⁵ = PCP⁻¹ . . . PCP⁻¹ = PC⁵P⁻¹ = PDP⁻¹ = A
So take
B = PCP⁻¹ = (1 1; 1 2)(1 0; 0 2^(1/5))(2 −1; −1 1) = (2 − 2^(1/5)  2^(1/5) − 1; 2 − 2^(6/5)  2^(6/5) − 1)
Note. Usually a matrix has many square roots, fifth roots, etc. Eg, I has infinitely
many.
Note. Similarly we can calculate polynomial functions
p(A) = a_n A^n + a_(n−1) A^(n−1) + · · · + a_1 A + a_0 I
Summary
If a square matrix A has distinct eigenvalues, then it can be diagonalized, i. e. there
exists an invertible P such that P−1AP is diagonal.
9.3.1 Repeated eigenvalues
If the characteristic polynomial of A has a repeated root λ, we call λ a repeated
eigenvalue of A.
Some A's with a repeated eigenvalue can be diagonalized and some can't.
Example 9.6. Let A = (1 1; 0 1). Then the characteristic polynomial is
|1−λ 1; 0 1−λ| = (1 − λ)²
so 1 is a repeated eigenvalue. Claim A cannot be diagonalized (i.e. no invertible P
exists such that P−1AP is diagonal).
Proof. Assume there exists an invertible P such that
P⁻¹AP = D = (α1 0; 0 α2)
Then
AP = PD
Writing P = (v1 v2),
A(v1 v2) = (v1 v2)D = (α1v1 α2v2)
Hence Av1 = α1v1, Av2 = α2v2. So v1, v2 are eigenvectors of A. But the only eigenvalue of A is 1, and (A − I)x = 0 is
(0 1; 0 0)x = 0
whose non-zero solutions are a(1; 0), a ≠ 0. Hence
P = (a b; 0 0)
But this is not invertible. Contradiction. □
Point: P must have eigenvectors of A as columns, but A does not have enough “independent” eigenvectors to make an invertible P.
Example 9.7. Let A = (1 0 0; −1 2 0; 1 −1 1). Then the characteristic polynomial is
|1−λ 0 0; −1 2−λ 0; 1 −1 1−λ| = (1 − λ)²(2 − λ)
So 1 is a repeated eigenvalue.
Can A be diagonalized?
Let’s try.
Eigenvectors for λ = 2: augmented matrix
(−1 0 0 | 0; −1 0 0 | 0; 1 −1 −1 | 0)
gives eigenvectors a(0; 1; −1), a ≠ 0.
For λ = 1:
(0 0 0 | 0; −1 1 0 | 0; 1 −1 0 | 0)
gives eigenvectors (b; b; c) (b, c not both 0).
So for the columns of P choose the eigenvectors
(0; 1; −1), (1; 1; 0), (0; 0; 1)
So take
P = (0 1 0; 1 1 0; −1 0 1)
Then |P| = −1, so P is invertible, and, as in (9.2),
P⁻¹AP = (2 0 0; 0 1 0; 0 0 1)
Summary. If A has enough independent eigenvectors for the repeated eigenvalue, A can be diagonalized. If not, it can't.
Chapter 10
Conics (again) and Quadrics
Recall the equation of a conic:
ax1² + bx1x2 + cx2² + dx1 + ex2 + f = 0
We'll write this in matrix form:
(x1 x2)(a b/2; b/2 c)(x1; x2) = (ax1 + (b/2)x2  (b/2)x1 + cx2)(x1; x2) = ax1² + bx1x2 + cx2²
So in matrix form, the equation of the conic is
x^T A x + (d e)x + f = 0
where x = (x1; x2), x^T = (x1 x2), A = (a b/2; b/2 c)
Digression – Transposes
Definition 10.1. If A = (a_ij) is m × n, the transpose of A is the n × m matrix A^T = (a_ji).
Example 10.1. A = (1 2; 3 4; 5 6), A^T = (1 3 5; 2 4 6)
x = (x1; x2), x^T = (x1 x2)
Note.
(A^T)^T = A
x.y = (x1 x2)(y1; y2) = x^T y
Definition 10.2. A square matrix A is symmetric if A = AT .
Example 10.2. (a b/2; b/2 c) is symmetric.
Proposition 10.1. Let A be m × n, B be n × p. Then
(AB)T = BTAT
Proof. The ij-th entry of (AB)^T is the ji-th entry of AB, which is the j-th row of A multiplied by the i-th column of B. And the ij-th entry of B^T A^T is the i-th row of B^T multiplied by the j-th column of A^T, i.e. the i-th column of B multiplied by the j-th row of A. These are equal. □
Back to conics
We formulated the equation of a conic as
x^T A x + (d e)x + f = 0 (10.1)
where x = (x1; x2), A = (a b/2; b/2 c).
Notice
x^T (λ1 0; 0 λ2)x = λ1x1² + λ2x2²
So the aim is to find a rotation matrix P such that the change of coordinates to y = (y1; y2), where x = Py, changes equation (10.1) to
(Py)^T A(Py) + (d e)Py + f = 0
y^T P^T A P y + (d′ e′)y + f = 0
with P^T A P diagonal.
We can do this.
Theorem 10.2. Let A = (a b/2; b/2 c) be symmetric, with b ≠ 0. Then
1) A has two real eigenvalues λ1, λ2, with λ1 ≠ λ2.
2) If v1, v2 are eigenvectors of A corresponding to λ1 and λ2, then
v1.v2 = 0
3) We can choose unit eigenvectors v1, v2 such that
P = (v1 v2)
has determinant 1 and P is a rotation matrix. Moreover, P⁻¹ = P^T, and P^T A P is diagonal.
Proof.
1) The characteristic polynomial of A is
|a−λ b/2; b/2 c−λ| = λ² − (a + c)λ + ac − b²/4
The roots are
(1/2)(a + c ± √((a + c)² − 4(ac − b²/4))) = (1/2)(a + c ± √((a − c)² + b²))
Since b ≠ 0, (a − c)² + b² > 0, so the roots are real and distinct. Call them λ1 and λ2.
2) Let
Av1 = λ1v1
Av2 = λ2v2
Consider v1^T A v2.
(a) This is v1^T (λ2v2) = λ2 v1^T v2.
(b) It is also (because A is symmetric, i.e. A = A^T)
(A^T v1)^T v2 = (Av1)^T v2 = λ1 v1^T v2
So
λ2 v1^T v2 = λ1 v1^T v2
As λ1 ≠ λ2, this forces v1^T v2 = 0, i.e. v1.v2 = 0.
3) Choose unit eigenvectors v1, v2 (see picture). Then
v1 = (cos θ; sin θ), v2 = (−sin θ; cos θ)
So
P = (v1 v2) = (cos θ −sin θ; sin θ cos θ) = Rθ
Finally, by 9.3,
P^T A P = P⁻¹AP = (λ1 0; 0 λ2)
□
10.1 Reduction of conics
Start with equation (10.1).
1) Find the eigenvalues and eigenvectors of A.
2) Find unit eigenvectors v1, v2 such that P = (v1 v2) has determinant 1, so P is a rotation matrix.
3) Change coordinates to y = (y1; y2), where x = Py, so equation (10.1) becomes
y^T (P^T A P)y + (d′ e′)y + f = 0
i.e.
y^T (λ1 0; 0 λ2)y + (d′ e′)y + f = 0
i.e.
λ1y1² + λ2y2² + d′y1 + e′y2 + f = 0
This is an ellipse if λ1λ2 > 0, a hyperbola if λ1λ2 < 0, a parabola if λ1λ2 = 0 (or possibly degenerate cases).
Example 10.3. Reduce the conic
5x1² + 4x1x2 + 2x2² = 1
to standard form. This is
x^T (5 2; 2 2)x = 1
1) The characteristic polynomial is
|5−λ 2; 2 2−λ| = λ² − 7λ + 6 = (λ − 1)(λ − 6)
For λ = 1, the eigenvectors are a(1; −2).
For λ = 6, the eigenvectors are b(2; 1).
2) Unit eigenvectors are ±(1/√5)(1; −2) and ±(1/√5)(2; 1). Take
P = (1/√5)(1 2; −2 1)
Then |P| = 1, and P = Rθ = (cos θ −sin θ; sin θ cos θ), where cos θ = 1/√5, sin θ = −2/√5.
3) Let x = Py (i.e. x1 = (1/√5)(y1 + 2y2), x2 = (1/√5)(−2y1 + y2)). Then the equation becomes
y1² + 6y2² = 1
which is an ellipse.
10.1.1 Orthogonal matrices
Recall that a rotation matrix satisfies
Rθ^T = Rθ⁻¹
So does a reflection matrix
(cos θ sin θ; sin θ −cos θ)
Definition 10.3. A square matrix P is an orthogonal matrix if P^T = P⁻¹, i.e. PP^T = I.
The key property:
Proposition 10.3. Orthogonal matrices preserve lengths, i. e.
‖Pv‖ = ‖v‖
Proof.
‖Pv‖² = (Pv).(Pv) = (Pv)^T (Pv) = v^T P^T P v = v^T I v = ‖v‖²
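The length-preserving property is easy to see numerically for rotation matrices (a Python sketch of mine; the vector and angles are arbitrary choices):

```python
import math

# Rotation matrices are orthogonal, so ||R v|| = ||v||.
def R(t):
    return [[math.cos(t), -math.sin(t)],
            [math.sin(t),  math.cos(t)]]

def norm(v):
    return math.hypot(v[0], v[1])

v = [3.0, -4.0]   # arbitrary vector (my choice), ||v|| = 5
for t in [0.0, 0.7, 2.5]:
    M = R(t)
    Rv = [M[0][0] * v[0] + M[0][1] * v[1],
          M[1][0] * v[0] + M[1][1] * v[1]]
    assert abs(norm(Rv) - norm(v)) < 1e-12
```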
□
10.2 Quadric surfaces
Definition 10.4. A quadric surface is a surface in R³ defined by a quadratic equation
ax1² + bx2² + cx3² + dx1x2 + ex1x3 + fx2x3 + gx1 + hx2 + jx3 + k = 0 (10.2)
Standard examples
1) Ellipsoid: x1²/a² + x2²/b² + x3²/c² = 1
2) Hyperboloid of 1 sheet: x1²/a² + x2²/b² − x3²/c² = 1
3) Hyperboloid of 2 sheets: x1²/a² + x2²/b² − x3²/c² = −1
4) Elliptic cone: x1²/a² + x2²/b² − x3²/c² = 0
5) Elliptic paraboloid: x3 = x1²/a² + x2²/b²
6) Hyperbolic paraboloid: x3 = x1²/a² − x2²/b²
Some degenerate cases
7) Elliptic cylinder: x1²/a² + x2²/b² = 1
8) Parabolic cylinder: x1² − ax2 = 0
The aim is to find a rotation and translation which reduce (10.2) to one of the standard examples. Here is the procedure.
1. Write (10.2) in matrix form
x^T A x + (g h j)x + k = 0
where
x = (x1; x2; x3), A = (a d/2 e/2; d/2 b f/2; e/2 f/2 c)
Notice that A is symmetric.
2. Find the eigenvalues λ1, λ2, λ3 of A (theory – these are real). Find corresponding unit eigenvectors v1, v2, v3 which are perpendicular to each other (theory – this can be done). Choose directions for the vi so that the matrix
P = (v1 v2 v3)
has determinant 1. Then P is a rotation matrix, and P⁻¹ = P^T. Then
P⁻¹AP = P^T A P = (λ1 0 0; 0 λ2 0; 0 0 λ3)
3. Change coordinates to y = (y1; y2; y3), where x = Py. Then (10.2) reduces to
y^T (P^T A P)y + (g h j)Py + k = 0
i.e.
λ1y1² + λ2y2² + λ3y3² + g′y1 + h′y2 + j′y3 + k = 0
Finally, complete the square to find a translation reducing to a standard equation.
Note. All the assumed bits of theory will be covered in Algebra II next year.
Example 10.4. Reduce the quadric
2x1x2 + 2x1x3 − x2 − 1 = 0
to standard form.
Answer.
1. The equation is
x^T (0 1 1; 1 0 0; 1 0 0)x = x2 + 1
2. The characteristic polynomial is
|−λ 1 1; 1 −λ 0; 1 0 −λ| = −λ(λ² − 2)
We get λ equal to 0, √2, −√2.
Eigenvectors.
For λ = 0: a(0; 1; −1), unit eigenvectors ±(1/√2)(0; 1; −1)
For λ = √2: b(√2; 1; 1), unit eigenvectors ±(1/√2; 1/2; 1/2)
For λ = −√2: b(−√2; 1; 1), unit eigenvectors ±(−1/√2; 1/2; 1/2)
Let
P = (0 1/√2 −1/√2; −1/√2 1/2 1/2; 1/√2 1/2 1/2)
Then |P| = 1 and P is a rotation. The change of coordinates x = Py changes the equation to
√2 y2² − √2 y3² = −(1/√2)y1 + (1/2)y2 + (1/2)y3 + 1
Finally, complete the square:
√2(y2 − α)² − √2(y3 − β)² = −(1/√2)(y1 − γ)
This is a hyperbolic paraboloid.
10.3 Linear Geometry in 3 dimensions
Definition 10.5. For two vectors x = (x1, x2, x3), y = (y1, y2, y3) in R³, define
• x + y = (x1 + y1, x2 + y2, x3 + y3)
• λx = (λx1, λx2, λx3), λ ∈ R
• length of x, denoted ‖x‖ = √(x1² + x2² + x3²)
• dot product x.y = x1y1 + x2y2 + x3y3 (note that x.x = ‖x‖²)
Proposition 10.4. For x, y ∈ R³, with θ the angle between them,
x.y = ‖x‖‖y‖ cos θ
Proof. By the cosine rule,
‖y − x‖² = ‖x‖² + ‖y‖² − 2‖x‖‖y‖ cos θ
So
2‖x‖‖y‖ cos θ = x.x + y.y − (y − x).(y − x) = 2x.y □
10.4 Geometry of planes
Fact – there is a unique plane through any 3 points in R³ provided they aren't collinear.
We want to describe planes in terms of vectors and equations.
A plane Π in R³ is specified by
1) a point A on the plane (position vector a)
2) a vector n normal to the plane
Then x ∈ Π ⇔ (x − a).n = 0 ⇔ x.n = a.n.
Proposition 10.5. If A, B, C ∈ R³ are points that are not collinear, there exists a unique plane Π through A, B, C.
Proof. Because a − b and c − a both lie in the plane, a normal n must satisfy
n.(a − b) = 0
n.(c − a) = 0
i.e.
r1n1 + r2n2 + r3n3 = 0
s1n1 + s2n2 + s3n3 = 0
where (r1, r2, r3) = a − b and (s1, s2, s3) = c − a.
We want to show that the set of solutions is exactly {λn | λ ∈ R}. This holds because A, B, C are not collinear, so r and s are not parallel. □
How do we find dist(P, Π)? Find the foot of the perpendicular, Q, and an arbitrary point A in the plane, and let θ be the angle between a − p and n. Then
dist(P, Q) = ‖a − p‖ cos θ
But
(a − p).n = ‖a − p‖‖n‖ cos θ
so
dist(P, Π) = (a − p).n / ‖n‖