
ALGEBRA I

JAREK KĘDRA AND ALEX GONZALEZ

Contents

1. Basic logic
2. Examples of proofs
3. Coordinates and maps
4. Solving equations I: polynomial equations
5. Complex numbers
6. Complex numbers II
7. Trigonometric identities
8. Complex functions
9. Solving equations II: linear equations
10. Algebra of matrices
11. Square matrices

1. Basic logic

1.1. Mathematical statements.

Definition 1.2. A mathematical statement is a sentence that is either true or false but not both. An axiom is a statement that is true by definition. The truth value of a statement is T or F depending on whether it is true (T) or false (F).

Example 1.3. Some examples of statements.

(1) Let x be a natural number. This is not a statement.

(2) Two is equal to three. This is a false statement.

(3) If x is an even number then $x^2$ is an even number. This is a true statement.

(4) This sentence is false. This is not a statement.

♣

Example 1.4. Some examples of axioms.

(1) 2 + 2 = 4.

(2) Two parallel lines never intersect.

(3) The product of any number with zero is zero.

♣

We will use letters P , Q, etc., to denote statements.

Definition 1.5. A proof of the statement P is a sequence of statements whose truth value is known in advance and which together determine the truth value of P.

Examples of proofs will appear later on in this chapter.

1.6. Basic operations on statements. Let P and Q be statements. The following is the so-called truth table for the basic logical operations: negation, conjunction, disjunction, implication and equivalence.

P   Q   ¬P    P ∧ Q   P ∨ Q   P ⇒ Q         P ⇔ Q
T   T   F     T       T       T             T
T   F   F     F       T       F             F
F   T   T     F       T       T             F
F   F   T     F       F       T             T
        NOT   AND     OR      IF ... THEN   IF AND ONLY IF

Example 1.7. Consider the following two statements:

P : Two divides five.

Q: Two divides six.

The first one is false and the second is true. Moreover,

P ∧Q is false;

P ∨Q is true;

P ⇒ Q is true;

P ⇔ Q is false. ♣

Example 1.8. The following are rules for simple negations:

(1) ¬(¬P )⇔ P ;

(2) ¬(P ∧Q)⇔ ¬P ∨ ¬Q

(3) ¬(P ∨Q)⇔ ¬P ∧ ¬Q

(4) ¬(P ⇒ Q)⇔ P ∧ ¬Q

♣
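The truth table and the negation rules above can be checked mechanically, since there are only four ways to assign truth values to P and Q. The following is a minimal sketch in Python; the helper names `implies`, `iff` and `negation_rules_hold` are our own and are not part of the notes.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """P ⇒ Q is false only when P is true and Q is false."""
    return (not p) or q


def iff(p: bool, q: bool) -> bool:
    """P ⇔ Q is true exactly when P and Q have the same truth value."""
    return p == q


def negation_rules_hold() -> bool:
    """Check rules (1)-(4) of Example 1.8 for every assignment of P and Q."""
    for p, q in product([True, False], repeat=2):
        rule1 = iff(not (not p), p)
        rule2 = iff(not (p and q), (not p) or (not q))
        rule3 = iff(not (p or q), (not p) and (not q))
        rule4 = iff(not implies(p, q), p and (not q))
        if not (rule1 and rule2 and rule3 and rule4):
            return False
    return True
```

Checking all four rows is a complete proof here, because a rule about P and Q can only depend on their truth values.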


1.9. Quantifiers. The symbol ∃ means there exists and the symbol ∀ means for all. For example, the sentence

there exists a natural number bigger than three

can be written briefly as ∃n∈N n > 3.

The sentence

every natural number is divisible by three

can be written as ∀n∈N 3|n.

Example 1.10. Negating a sentence with a quantifier:

¬(∃x∈X P )⇔ ∀x∈X ¬P

¬(∀x∈X P )⇔ ∃x∈X ¬P ♣
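On a finite set the quantifiers can be evaluated directly, which gives a quick way to experiment with the negation rules. The sketch below uses Python's built-in `any` and `all`; the finite range `X` is only a stand-in for N (an assumption made so that the loops terminate), and the function names are ours.

```python
def exists(domain, predicate) -> bool:
    """∃x∈domain predicate(x)"""
    return any(predicate(x) for x in domain)


def for_all(domain, predicate) -> bool:
    """∀x∈domain predicate(x)"""
    return all(predicate(x) for x in domain)


def bigger_than_three(n: int) -> bool:
    return n > 3


def divisible_by_three(n: int) -> bool:
    return n % 3 == 0


# A finite stand-in for the natural numbers.
X = range(0, 20)


def negation_rules_hold(domain, predicate) -> bool:
    """Check ¬(∃x P) ⇔ ∀x ¬P and ¬(∀x P) ⇔ ∃x ¬P on a finite domain."""
    def negated(x):
        return not predicate(x)

    first = (not exists(domain, predicate)) == for_all(domain, negated)
    second = (not for_all(domain, predicate)) == exists(domain, negated)
    return first and second
```

A finite check like this illustrates the rules but does not prove statements about all of N.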

1.11. More examples.

(1) Let P be the (true) statement 3 > 0. Then ¬P means 3 ≤ 0, which is a false statement.

(2) Let x ∈ R. The statement

If x > 0 then x > −1, or in symbols x > 0 ⇒ x > −1,

is true. However, the converse

x > −1⇒ x > 0

is not true.

(3) $x = 5 \Rightarrow x^2 = 25$ is a true statement. However, $x^2 = 25 \Rightarrow x = 5$ is not true because $(-5)^2 = 25$.

(4) Let n ∈ N. The number n is even if and only if the number $n^2$ is even. This is a true statement (we will prove it later). We can write it as follows:

$\forall n \in \mathbb{N} \quad 2 \mid n \Leftrightarrow 2 \mid n^2.$

(5) “There exist $p, q \in \mathbb{N}^+$ such that $\frac{p}{q} = \sqrt{2}$” is a false statement. Thus its negation

$\forall p, q \in \mathbb{N}^+ \quad \frac{p}{q} \neq \sqrt{2}$

is true. In simpler words, it means that the square root of two is not a rational number.


2. Examples of proofs

Computing the truth value of a statement P is usually called proving the statement P. There are several basic models of proof, which are described below, but in most cases a proof will consist of a combination of the models presented here.

2.1. A direct proof.

Theorem 2.2. For all n ∈ N, if the number n is even then $n^2$ is even.

Proof. Suppose that n is even. It means that there exists a number k ∈ N such that n = 2k. Then $n^2 = (2k)^2 = 4k^2 = 2(2k^2)$, which means that $n^2$ is even. □

Theorem 2.3. It is not possible for the rook to go from the top left corner of the standard chessboard to the bottom right corner passing through each square exactly once.

Proof. Break the route into simple moves from one square to an adjacent one. At step one we are in the top left corner (white, a8), at step two we are either at b8 or a7 (both black). Each simple move changes the colour of the square, so at each even step the rook is on a black square. Since there are 64 squares, the last square of such a route is reached at step 64, which is even, so it must be black. But the bottom right corner (h1) is white, so we cannot finish there. □

Theorem 2.4 (Pythagoras' Theorem). In any right triangle, the area of the square whose side is the hypotenuse (the side opposite the right angle) is equal to the sum of the areas of the squares whose sides are the two legs (the two sides that meet at a right angle).

Proof. Let a and b be the lengths of the sides that meet at the right angle, and let c be the length of the hypotenuse, as in the picture below.

[Figure: two squares of side a + b; the one on the right is divided into a square of side c and four right triangles congruent to the original one.]

We can now prove Pythagoras' theorem using either of the two squares in the picture above. Choosing one square or the other provides different but similar proofs. Here we prove the theorem using the square on the right, and you can try on your own to prove the theorem using the square on the left.

The area of the square of side a + b is $(a+b)^2$, which is equal to the area of the square of side c plus the area of 4 right triangles congruent to the original one. The equation looks like

$(a+b)^2 = c^2 + 4 \cdot \frac{1}{2}ab.$

The left side of the above equation gives $(a+b)^2 = a^2 + 2ab + b^2$, and the right side of the equation can be simplified to $c^2 + 2ab$. Thus,

$a^2 + b^2 = c^2.$

□

2.5. Proof by contradiction. This is used to prove implications. The idea is to negate the statement and derive something which is clearly false.

Theorem 2.6. Let n ∈ N. If $n^2$ is even then n is even.

Proof. Suppose that n is odd. It means that n = 2k + 1 for some number k. Thus $n^2 = (2k+1)^2 = 4k^2 + 4k + 1 = 2(2k^2 + 2k) + 1$, which means that $n^2$ is odd, which contradicts the hypothesis. This proves the theorem. □

Theorem 2.7. The square root of two is an irrational number.

Proof. Let’s assume that $\sqrt{2}$ is rational. That is, $\sqrt{2} = \frac{p}{q}$, where p and q are positive relatively prime natural numbers. Taking the square of the above equality we get that

$2 = \frac{p^2}{q^2}.$

Multiplying both sides by $q^2$ we obtain

$2q^2 = p^2,$

which means that $p^2$ is even. It follows that p is even due to the previous theorem. That is, p = 2k for some k. Hence we get that

$2q^2 = p^2 = (2k)^2 = 4k^2.$

Dividing both sides by two we obtain

$q^2 = 2k^2,$

which implies that q is even. Finally, we get that both p and q are even and this contradicts the assumption that they are relatively prime. This proves the theorem. □

Let’s now try something a bit more complicated.


Theorem 2.8. Fix some natural number m > 0 and let n be any other natural number. If m divides $n^2$ then m divides n.

Proof. Since m divides $n^2$, there is a natural number $k_0$ such that $n^2 = mk_0$. Suppose now that m does not divide n. Then, there exist numbers $x_0 > 0$ and $r_0$ such that $0 < r_0 < n$ and $n = mx_0 + r_0$. Thus,

$mk_0 = n^2 = (mx_0 + r_0)^2 = m^2x_0^2 + 2mx_0r_0 + r_0^2.$

We see from the equation above that m divides $r_0^2$ and $0 < r_0 < n$. Since m does not divide n, it does not divide $r_0$ either.

We do the same again, replacing n by $r_0$: there exist numbers $k_1$, $x_1 > 0$ and $r_1$ such that $0 < r_1 < r_0$ and such that $r_0 = mx_1 + r_1$ and

$mk_1 = r_0^2 = (mx_1 + r_1)^2 = m^2x_1^2 + 2mx_1r_1 + r_1^2.$

Again, we deduce that m divides $r_1^2$, m does not divide $r_1$, and $0 < r_1 < r_0$.

Repeating this process, we create a sequence

$0 < \ldots < r_s < r_{s-1} < \ldots < r_1 < r_0 < n.$

After enough iterations, either $r_s = r_{s-1}$ or $r_s = 0$. But neither of these situations can happen: if $r_s = r_{s-1}$ then $0 < x_s = 0$, and if $r_s = 0$ then m divides $r_{s-1}$ (and hence it divides n). Thus, we have a contradiction with the hypothesis that m does not divide n. □

Theorem 2.9. The square root of a natural number m is an irrational number if and only if m is not a square.

Proof. Consider the following statements:

P: $\sqrt{m}$ is an irrational number.

Q: m is not a square number.

Then we have to show that both of the following statements are true:

• P ⇒ Q: if $\sqrt{m}$ is an irrational number, then m is not a square number.

• Q ⇒ P: if m is not a square number, then $\sqrt{m}$ is not a rational number.

First of all, using the truth table in 1.6, we can see that the statement P ⇒ Q is equivalent to the statement

• ¬Q ⇒ ¬P: if m is a square number, then $\sqrt{m}$ is a rational number.

This is obviously true: if m is a square (that is, if $m = q^2$ for some natural number q), then $\sqrt{m} = \sqrt{q^2} = q$, which is a natural number. Thus P ⇒ Q is true.

Now let’s prove that Q ⇒ P is true. Assume that m is not a square and that $\sqrt{m}$ is rational:

$\sqrt{m} = \frac{p}{q},$

where p and q are positive relatively prime natural numbers. Taking the square of the above we get

$m = p^2/q^2,$

or equivalently $mq^2 = p^2$, which means that m divides $p^2$. By the previous theorem, this implies that m divides p: there is some number k such that p = mk, and we get that

$mq^2 = p^2 = (mk)^2 = m^2k^2.$

Dividing both sides by m, we have

$q^2 = mk^2,$

which means that m divides $q^2$ and hence, by the previous theorem again, m divides q. Thus m divides both p and q. Since m > 1 (because m is not a square), this contradicts the assumption that p and q are relatively prime, and it follows that $\sqrt{m}$ is irrational. □

2.10. Proof by induction. Let P(n), where n ∈ N, be a sequence of mathematical statements. Suppose that P(0) is true and the implication P(n) ⇒ P(n+1) is true for all n. Then P(n) is true for all n ∈ N. This is the principle of mathematical induction. Let’s see it in action.

Example 2.11. Let P(n) be the following statement:

The number $6^n - 1$ is divisible by 5.

The statement P(0) means that $6^0 - 1 = 0$ is divisible by 5, which is clearly true. Suppose that P(n) is true. This means that $6^n - 1 = 5k$ for some number k ∈ N. We then have that

$6^{n+1} - 1 = 6^{n+1} - 6 + 6 - 1 = 6 \cdot (6^n - 1) + 5 = 6 \cdot 5k + 5 = 5(6k + 1),$

which means that P(n+1) is true provided P(n) is true. Then it follows from the induction principle that P(n) is true for all natural numbers n. ♣
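The induction step above is constructive: it turns a witness k for n into the witness 6k + 1 for n + 1. The following Python sketch follows that recipe literally; the function name `witness` is our own choice.

```python
def witness(n: int) -> int:
    """Return k with 6**n - 1 == 5 * k, built by repeating the induction step."""
    k = 0  # base case: 6**0 - 1 = 0 = 5 * 0
    for _ in range(n):
        # induction step: if 6**m - 1 = 5k then 6**(m+1) - 1 = 5(6k + 1)
        k = 6 * k + 1
    return k
```

Running the loop n times mirrors applying the implication P(m) ⇒ P(m+1) for m = 0, 1, ..., n − 1.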

Example 2.12. Let’s prove the following theorem.

$\forall n \in \mathbb{N} \quad \sum_{i=0}^{n} i = \frac{n(n+1)}{2}.$

It is obviously true for n = 0. Suppose it is true for n. We have that

$\sum_{i=0}^{n+1} i = \sum_{i=0}^{n} i + (n+1) = \frac{n(n+1)}{2} + (n+1) = \frac{(n+1)(n+2)}{2}.$

This proves the theorem.

This theorem can also be proved by the method of direct proof. Apparently, the great mathematician Carl Friedrich Gauss (read about him!) came up with this solution when he was punished by his teacher for misbehaving. He was given the task of computing the sum of all the natural numbers from 0 to 100. This is what he came up with. Write the numbers from 0 to n in a row:

0   1   . . .   n

You can also write the numbers from 0 to n in the reverse order. This way, we have a table which has 2 rows and n + 1 columns:

0   1       . . .   n
n   n − 1   . . .   0

Notice that the sum of each column is n:

0 + n = 1 + (n − 1) = . . . = n + 0 = n.

Hence, twice the sum of all numbers from 0 to n is

$S_n = n(n+1).$

In other words,

$\sum_{i=0}^{n} i = \frac{1}{2}S_n = \frac{n(n+1)}{2}.$

♣
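The two-row table in the direct proof can be built explicitly. The sketch below is one possible way to do so in Python; the function name `gauss_sum` is ours, and the internal `assert` simply records the observation that every column adds up to n.

```python
def gauss_sum(n: int) -> int:
    """Sum 0 + 1 + ... + n using the two-row table from the direct proof."""
    top = list(range(n + 1))   # 0, 1, ..., n
    bottom = top[::-1]         # n, n-1, ..., 0
    column_sums = [a + b for a, b in zip(top, bottom)]
    # Every one of the n + 1 columns adds up to n.
    assert all(s == n for s in column_sums)
    # The table contains each number twice, so we halve the total.
    return sum(column_sums) // 2
```

For Gauss's task one would call `gauss_sum(100)`.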

Example 2.13. Let’s prove the following theorem:

$\forall n \in \mathbb{N} \quad \sum_{i=1}^{n} (2i-1) = n^2.$

The base case of the induction is clear. As for the induction step we have

$\sum_{i=1}^{n+1} (2i-1) = \sum_{i=1}^{n} (2i-1) + (2(n+1)-1) = n^2 + 2(n+1) - 1 = n^2 + 2n + 1 = (n+1)^2,$

which finishes the proof. ♣

Let n, k ∈ N be natural numbers. Recall that the number $\binom{n}{k}$ (read n choose k) is defined by

$\binom{n}{k} := \frac{n!}{k!(n-k)!},$

where $m! = 1 \cdot 2 \cdot \ldots \cdot m$ and $0! := 1$, by definition.

Theorem 2.14. For all k, n ∈ N such that k ≤ n, $\binom{n}{k} \in \mathbb{N}$.

Proof. First of all, let’s prove Pascal’s formula $\binom{n+1}{k} = \binom{n}{k-1} + \binom{n}{k}$, valid for 1 ≤ k ≤ n:

$\binom{n}{k-1} + \binom{n}{k} = \frac{n!}{(k-1)!(n-k+1)!} + \frac{n!}{k!(n-k)!}$

$= \frac{n!}{(k-1)!(n-k+1)(n-k)!} + \frac{n!}{k(k-1)!(n-k)!}$

$= \frac{k \cdot n!}{k!(n-k+1)!} + \frac{(n-k+1)\,n!}{k!(n-k+1)!}$

$= \frac{(n+1)\,n!}{k!(n-k+1)!} = \binom{n+1}{k}.$

We can now apply induction on n, where $P_n$ is the statement that $\binom{n}{k}$ is a natural number for all 0 ≤ k ≤ n. Clearly, $P_0$ is true: $\binom{0}{0} = 1$ is a natural number. Let’s now prove that $P_n$ implies $P_{n+1}$ for all n. Notice that $\binom{n+1}{0} = \binom{n+1}{n+1} = 1$, so we can assume that 0 < k < n + 1. Pascal’s formula together with the induction hypothesis then implies that $\binom{n+1}{k}$ is a natural number. □
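The induction in this proof can be read as an algorithm: each row of binomial coefficients is obtained from the previous one using additions only, so no division (and hence no fraction) ever appears. Here is a minimal Python sketch; the names `pascal_row` and `binomial_by_factorials` are ours.

```python
from math import factorial


def pascal_row(n: int) -> list[int]:
    """Return [C(n,0), C(n,1), ..., C(n,n)] using only Pascal's formula."""
    row = [1]  # P_0: C(0,0) = 1
    for _ in range(n):
        # C(m+1,0) = C(m+1,m+1) = 1; the inner entries come from Pascal's formula
        inner = [row[k - 1] + row[k] for k in range(1, len(row))]
        row = [1] + inner + [1]
    return row


def binomial_by_factorials(n: int, k: int) -> int:
    """The definition n! / (k! (n-k)!), for comparison."""
    return factorial(n) // (factorial(k) * factorial(n - k))
```

Comparing the two functions for small n is a useful sanity check of Pascal's formula.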

The numbers $\binom{n}{k}$ are important combinatorial numbers. Here is an example.

Theorem 2.15. Let a, b ∈ R and n ∈ N. Then

$(a+b)^n = \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k}.$

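Proof. The statement is clearly true for n = 0. Suppose it is true for n and let’s prove it for n + 1.

$(a+b)^{n+1} = (a+b)^n(a+b) = (a+b)^n a + (a+b)^n b$

$= \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k} a + \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k} b$

$= \sum_{k=0}^{n} \binom{n}{k} a^{k+1} b^{n-k} + \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k+1}$

$= \sum_{k=1}^{n+1} \binom{n}{k-1} a^k b^{n-(k-1)} + \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-(k-1)}$

$= a^{n+1} + \sum_{k=1}^{n} \left[ \binom{n}{k-1} + \binom{n}{k} \right] a^k b^{n+1-k} + b^{n+1}$

$= \sum_{k=0}^{n+1} \binom{n+1}{k} a^k b^{n+1-k}.$

□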

Remark 2.16. You should already be familiar with the above formula, at least when n = 0, 1 and 2. For instance, for n = 2, the above formula is simply

$(a+b)^2 = a^2 + 2ab + b^2.$

Notice that we can use the formula $(a+b)^n = \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k}$ to give a formula for $(a-b)^n$:

$(a-b)^n = (a+(-b))^n = \sum_{k=0}^{n} \binom{n}{k} a^k (-b)^{n-k} = \sum_{k=0}^{n} (-1)^{n-k} \binom{n}{k} a^k b^{n-k}.$
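Both formulas are easy to test numerically. The sketch below evaluates the right-hand sides with Python's `math.comb` (available from Python 3.8); we use integers so that the comparison is exact, and the function names are our own.

```python
from math import comb


def binomial_expansion(a: int, b: int, n: int) -> int:
    """Right-hand side of Theorem 2.15: sum of C(n,k) a^k b^(n-k)."""
    return sum(comb(n, k) * a**k * b**(n - k) for k in range(n + 1))


def difference_expansion(a: int, b: int, n: int) -> int:
    """Right-hand side of the formula for (a - b)^n in Remark 2.16."""
    return sum((-1)**(n - k) * comb(n, k) * a**k * b**(n - k)
               for k in range(n + 1))
```

With real (floating point) inputs the two sides would agree only up to rounding errors, which is why the sketch sticks to integers.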

2.17. Strong induction. Let P(n) be a sequence of statements for n ∈ N. Suppose that the following implication is true.

If P(m) is true for all m < n then P(n) is true.

Then P(n) is true for all natural numbers n ∈ N. This is the principle of strong induction. Let us see it in action.

Theorem 2.18. Every natural number n > 1 can be expressed as the product of prime numbers.

Proof. Let n be a natural number bigger than one. Suppose that every integer m with 1 < m < n is a product of prime numbers.

If n is prime then n is a product of one prime (itself) and we are finished. If it is not prime then it is divisible by an integer k with 1 < k < n. Thus n = k · l for some l ∈ N with 1 < l < n. According to the induction hypothesis both k and l are products of primes and hence n is also a product of primes. □
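A proof by strong induction often translates directly into a recursive procedure: the induction hypothesis becomes a recursive call on smaller numbers. The following Python sketch follows the proof of Theorem 2.18; the function name `prime_factors` is ours, and stopping the search once k·k > n is a small optimisation that is not part of the proof.

```python
from math import prod


def prime_factors(n: int) -> list[int]:
    """Return a list of primes whose product is n, for n > 1."""
    if n <= 1:
        raise ValueError("n must be bigger than one")
    for k in range(2, n):
        if k * k > n:
            break  # no divisor up to sqrt(n), so n is prime
        if n % k == 0:
            # n = k * l with 1 < k, l < n: apply the induction hypothesis to both
            return prime_factors(k) + prime_factors(n // k)
    return [n]  # n is prime: a product of one prime (itself)
```

The recursion terminates because both factors are strictly smaller than n, which is exactly what makes the strong induction work.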

Theorem 2.19. Every natural number n ≥ 1 can be expressed as a sum of powers of 2.

Proof. For each n ≥ 1, let P(n) be the statement: the number n is a sum of powers of 2. Clearly, P(1) is true since $1 = 2^0$. Let now n > 1. There exists some k ∈ N such that $2^k \le n < 2^{k+1}$. If $n = 2^k$, then P(n) is true. Suppose then that $2^k < n < 2^{k+1}$. Thus,

$0 < n - 2^k < 2^{k+1} - 2^k = 2^k \le n - 1.$

By the strong induction hypothesis (P(m) is true for all m < n), the number $n - 2^k$ is a sum of powers of 2, and hence

$n = 2^k + \text{a sum of powers of 2}.$

□
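The proof above also describes a procedure: take the largest power of 2 not exceeding n, subtract it, and repeat with the smaller remainder. A minimal Python sketch of this recursion follows; the function name is ours, and `int.bit_length` is used only as a convenient way to find the exponent k.

```python
def powers_of_two(n: int) -> list[int]:
    """Write n >= 1 as a sum of powers of 2, following the strong induction."""
    if n < 1:
        raise ValueError("n must be at least 1")
    k = n.bit_length() - 1      # the k with 2**k <= n < 2**(k+1)
    if n == 2**k:
        return [2**k]
    # 0 < n - 2**k < n, so the induction hypothesis applies to n - 2**k
    return [2**k] + powers_of_two(n - 2**k)
```

Since n − 2^k < 2^k, the powers produced this way are all different, so the result is the binary expansion of n.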

Theorem 2.20. If n ∈ N, then 12 divides $n^4 - n^2$.

Proof. For each n, let P(n) be the statement: 12 divides $n^4 - n^2$. The cases n = 0, 1, 2, . . . , 6 are checked one by one. Let now n ≥ 6, and let k = n − 6. Then,

$n^4 - n^2 = (k+6)^4 - (k+6)^2$

$= (k^4 + 24k^3 + 216k^2 + 864k + 1296) - (k^2 + 12k + 36)$

$= (k^4 - k^2) + 24k^3 + 216k^2 + 852k + 1260$

$= (k^4 - k^2) + 12(2k^3 + 18k^2 + 71k + 105).$

Since k < n, by the strong induction hypothesis we know that $k^4 - k^2$ is divisible by 12, and hence $n^4 - n^2$ is divisible by 12.

We can prove the theorem by direct proof too. Write

$n^4 - n^2 = n^2(n^2 - 1) = n \cdot n(n-1)(n+1).$

Since n − 1, n, n + 1 are three consecutive integers, it is clear that 3 divides one of them, and hence it divides $n^4 - n^2$. Now, if n is even, then 4 divides $n^2$, and hence 4 divides $n^4 - n^2$. On the other hand, if n is odd, then 2 divides both n − 1 and n + 1, so again 4 divides $n^4 - n^2$. Since 3 and 4 have no common factor, 12 divides $n^4 - n^2$. □
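Expansions like the one in the first proof are easy to get wrong by hand, so it is worth checking them by machine. The Python sketch below tests the statement itself and the key fact used in the induction step, namely that passing from k to k + 6 changes the value by a multiple of 12. The function names are ours.

```python
def divisible_by_12(n: int) -> bool:
    """The statement P(n): 12 divides n^4 - n^2."""
    return (n**4 - n**2) % 12 == 0


def induction_step_gap(k: int) -> int:
    """The difference (n^4 - n^2) - (k^4 - k^2) for n = k + 6."""
    n = k + 6
    return (n**4 - n**2) - (k**4 - k**2)
```

A finite check is of course not a proof, but it catches arithmetic slips in the expansion very quickly.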


3. Coordinates and maps

3.1. The space $\mathbb{R}^n$. We imagine the set of real numbers R as an infinite straight line. After fixing the positions of zero and one, every number corresponds to a point on the line and every point on the line corresponds to a number. In a similar way, we imagine the set $\mathbb{R}^2$ of pairs (x, y) of real numbers as a flat plane. Further, we imagine the set $\mathbb{R}^3$ of triples of real numbers as the infinite space surrounding us. This allows us to express geometric figures in terms of numbers, which is very useful.

Well, we would like to imagine the set $\mathbb{R}^4$ of quadruples of real numbers as the four dimensional space surrounding us, but most of us find it very hard to imagine.

Example 3.2 (A travel in four dimensions). Let p be a point inside a full bottle of beer placed in the fridge. Let $(x, y, z, t) \in \mathbb{R}^4$ be four real numbers such that x, y and z express the position of the point p and t = 10°C is the temperature of p (let’s pretend that p is a small drop).

Suppose you would like to drink the beer. You take the bottle out of the fridge and place it on the table. Of course, the coordinates x, y, z change continuously. Observe that the temperature t also rises slightly after taking the bottle out of the fridge, say now t = 13°C. You drink the beer (again the position of p changes) and after some time the temperature t rises to above 30°C.

In this process we observe that all four coordinates of p have been changing in time. We think of this as p travelling in the four dimensional space. ♣

In the spirit of the above example we can imagine modelling processes with five or more dimensional spaces, e.g. three numbers for the position, then the temperature, pressure, radiation etc. Thus for any natural number n ∈ N we denote by $\mathbb{R}^n$ the set of n-tuples $(x_1, x_2, \ldots, x_n)$ of real numbers and for the time being we call it the n-dimensional real space.

3.3. Functions and what they do. A function f : R → R defined on the set of real numbers with real values is in many practical situations defined by a formula. For example,

(1) $f(x) = 2x + 1$;

(2) $f(x) = x^2$;

(3) $f(x) = x^5 + 2x^3 - x^2$;

(4) $f(x) = \frac{1}{x^2+1}$.

In a similar way we can define functions between higher-dimensional spaces. For example,

(1) $f : \mathbb{R}^2 \to \mathbb{R}$, $f(x, y) = x + y$; this is a map defined on the plane $\mathbb{R}^2$ with values in the real numbers.

(2) $f : \mathbb{R}^2 \to \mathbb{R}^2$, $f(x, y) = (x + y, x)$;

(3) $f : \mathbb{R} \to \mathbb{R}^2$, $f(x) = (\cos(x), \sin(x))$;

(4) $f : \mathbb{R}^2 \to \mathbb{R}^3$, $f(x, y) = (x + y, x, 2y)$.

This shows that it is easy to define functions. However, we would like to understand them. Understanding means that we can answer many reasonable questions about a function. Let’s consider an easy example.

Example 3.4. Let f, g : R → R be two functions defined by $f(x) = 2x$ and $g(x) = x^2$. To understand any of the following statements it is enough to take a quick look at the graph of the function (but to prove it we need to compute something).

[Figure: graphs of $f(x) = 2x$ and $g(x) = x^2$.]

(1) For every y ∈ R there exists an x ∈ R such that f(x) = y. Indeed, given any y ∈ R, to find the x it is enough to solve the equation

$y = 2x,$

which gives $x = \frac{y}{2}$.

(2) Observe that a similar statement for g is not true. Indeed, if we take y = −1 then there is no x ∈ R such that $g(x) = x^2 = -1$.

(3) For every x, y ∈ R we have that

f(x + y) = f(x) + f(y).

This is easy to prove:

f(x + y) = 2(x + y) = 2x + 2y = f(x) + f(y).

(4) Again, a similar statement is not true for g. For example,

$g(1 + 1) = g(2) = 2^2 = 4;$

on the other hand we have that

$g(1) + g(1) = 1^2 + 1^2 = 1 + 1 = 2.$


(5) For every y ∈ R there exists at most one x ∈ R such that f(x) = y. This is equivalent to saying that for given y ∈ R the equation 2x = y has at most one solution, which is clearly true. In fact it has exactly one solution.

(6) Again, the analogous statement is not true for g. For example g(1) = g(−1) = 1, so there are two different arguments attaining the same value. ♣

Definition 3.5. Given two sets X and Y, a function f : X → Y is an assignment of exactly one element y ∈ Y to each element x ∈ X. Usually we write y = f(x), the image of x by the function f. The domain of f is the set X. The graph of the function is the set of pairs

{(x, f(x)) | x ∈ X}.

(1) The function f is called injective if for every two elements $x_1, x_2 \in X$, if $x_1 \neq x_2$ then $f(x_1) \neq f(x_2)$; equivalently, $f(x_1) = f(x_2)$ implies that $x_1 = x_2$.

(2) The function f is called surjective if for every y ∈ Y there exists an x ∈ X such that y = f(x).

(3) The function f is called bijective if it is both injective and surjective.

(4) Let A ⊆ X. The set

f(A) := {y ∈ Y | ∃a ∈ A such that f(a) = y}

is called the image of the set A with respect to the function f. The image f(X) of the whole domain is simply called the image of f.

(5) Let B ⊆ Y. The set

$f^{-1}(B) := \{x \in X \mid f(x) \in B\}$

is called the preimage of the set B with respect to the function f.
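When X and Y are finite sets, all the notions in this definition can be computed by brute force, which is a good way to get used to them. The sketch below is one way to do this in Python; every function name is our own, and the finite set `X` is just a small sample of integers, not the whole real line.

```python
def image(f, A) -> set:
    """f(A) = {f(a) | a in A}."""
    return {f(a) for a in A}


def preimage(f, X, B) -> set:
    """f^{-1}(B) = {x in X | f(x) in B}."""
    return {x for x in X if f(x) in B}


def is_injective(f, X) -> bool:
    """Different arguments give different values."""
    return len(image(f, X)) == len(set(X))


def is_surjective(f, X, Y) -> bool:
    """Every element of Y is a value of f."""
    return set(Y) <= image(f, X)


def double(x):
    return 2 * x


def square(x):
    return x * x


# A small finite domain to experiment with.
X = range(-3, 4)
```

For instance, `preimage(square, X, {1, 4})` collects the integers in X whose square is 1 or 4.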

Example 3.6.

(1) The function f : R → R defined by f(x) = 2x is injective. Indeed, if $x_1 \neq x_2$ then $f(x_1) = 2x_1 \neq 2x_2 = f(x_2)$.

(2) The same function f is also surjective. Indeed, let y ∈ R; then $y = 2 \cdot \frac{y}{2} = f\left(\frac{y}{2}\right)$, and thus y = f(x), where $x = \frac{y}{2}$.

(3) The function g : R → R defined by $g(x) = x^2$ is not injective. To see this observe that g(−1) = 1 = g(1). That is, two different arguments attain the same value.

(4) The same function g is not surjective: there does not exist any number x ∈ R such that g(x) = −1. ♣

Example 3.7. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be defined by f(x, y) = x + y. This function is not injective. Indeed, for example we have that f(0, 0) = 0 = f(−1, 1); two different arguments attain the same value.

The function f is surjective because for any z ∈ R we have that z = f(0, z). That is, every element of the range is a value of the function.

Let’s compute the preimage $f^{-1}(\{1\})$ of the set consisting of one element 1:

$f^{-1}(\{1\}) = \{(x, y) \in \mathbb{R}^2 \mid f(x, y) = 1\}$

$= \{(x, y) \in \mathbb{R}^2 \mid x + y = 1\}.$

We can just say that this preimage consists of points (x, y) satisfying the equation x + y = 1. However, we can also say that this set is a straight line on the plane passing through the points (1, 0) and (0, 1) and we can draw it. ♣

Example 3.8. Let f : R → R be defined by $f(x) = x^2$. Determining the image f([1, 2]) is the same as answering the question: what are the values of the function f for the arguments in the interval [1, 2]?

$f([1, 2]) = \{y \in \mathbb{R} \mid y = f(a) \text{ for some } a \in [1, 2]\}$

$= \{y \in \mathbb{R} \mid y = a^2,\ 1 \le a \le 2\}$

$= \{y \in \mathbb{R} \mid 1 \le y \le 4\}$

$= [1, 4].$

The second to last equality says that if 1 ≤ a ≤ 2 then the square $a^2$ is in the interval [1, 4], and that every number in [1, 4] is such a square.

Let us now find the preimage of the set [1, 4] with respect to the function f.

$f^{-1}([1, 4]) = \{x \in \mathbb{R} \mid f(x) \in [1, 4]\}$

$= \{x \in \mathbb{R} \mid 1 \le f(x) \le 4\}$

$= \{x \in \mathbb{R} \mid 1 \le x^2 \le 4\}$

$= [-2, -1] \cup [1, 2].$

To obtain the last equality we solve the following inequalities

$1 \le x^2$ and $x^2 \le 4.$

♣

Example 3.9. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be defined by $f(x, y) = x^2 + y^2$. Observe that the following claims are true:

(1) The image of f is equal to the set of all nonnegative real numbers;

(2) $f^{-1}(\{0\}) = \{(0, 0)\}$;

(3) $f^{-1}(\{1\}) = \{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 = 1\}$; observe that this set is equal to the circle of radius one centered at the origin.

(4) $f^{-1}((0, 1]) = \{(x, y) \in \mathbb{R}^2 \mid 0 < x^2 + y^2 \le 1\}$; observe that this set is a disc of radius one with its center removed.

(5) $f^{-1}((1, 4]) = \{(x, y) \in \mathbb{R}^2 \mid 1 < x^2 + y^2 \le 4\}$; this is an annulus.

Draw the above subsets of the plane. ♣

Example 3.10. Let $f : \mathbb{R} \to \mathbb{R}^2$ be defined by f(t) = (cos(t), sin(t)). Observe that the image of f is the circle of radius one centered at the origin. To see this use the identity $\cos(t)^2 + \sin(t)^2 = 1$. ♣

In the above examples we have seen that various geometric figures, like a line, a circle, or an annulus, can be described as the image or the preimage of a set with respect to some function. Observe moreover that the preimage of a one element set {y} ⊂ Y with respect to a function f : X → Y is nothing else than the set of solutions of the equation

f(x) = y.

Solving equations is also related to injectivity and surjectivity:

• f is surjective if and only if $f^{-1}(\{y\}) \neq \emptyset$ for all y ∈ Y;

• f is injective if and only if $f^{-1}(\{y\})$ contains exactly one point for all y ∈ f(X).

Example 3.11. Let us present more examples of geometric figures appearing as preimages (or solutions of equations). Details of this sort of example will be discussed in further lectures.

(1) Let $f : \mathbb{R}^2 \to \mathbb{R}$ be defined by f(x, y) = 3x − 5y + 1. Observe that the preimage $f^{-1}(\{0\})$ is nothing else but the set of points (x, y) on the plane satisfying the equation 3x − 5y + 1 = 0. This set is the straight line passing through the points $(0, \frac{1}{5})$ and $(-\frac{1}{3}, 0)$.

(2) Take $f : \mathbb{R}^3 \to \mathbb{R}$, where f(x, y, z) = 2x + 3y + z. Then the preimage $f^{-1}(\{0\})$ is a plane in the three dimensional space containing the points (0, 0, 0), (0, 1, −3) and (1, 0, −2). Notice that three points not contained in a line uniquely determine a plane in three dimensional space. In other words, we can say that this plane is defined as the set of solutions of the equation 2x + 3y + z = 0.

(3) Let $f : \mathbb{R}^3 \to \mathbb{R}$ be defined by $f(x, y, z) = x^2 + y^2 + z^2$. The preimage $f^{-1}(\{r^2\})$, where r > 0, is the sphere of radius r centered at the origin.

(4) Let $f : \mathbb{R}^n \to \mathbb{R}$ be defined by $f(x_1, x_2, \ldots, x_n) = x_1^2 + x_2^2 + \cdots + x_n^2$. The preimage $f^{-1}(\{r^2\})$, where r > 0, is called the (n − 1)-dimensional sphere of radius r centered at the origin. Observe that for n = 2 we get the circle, for n = 3 the usual sphere. ♣

4. Solving equations I: polynomial equations

In the previous chapter we saw how important it is to solve equations. However, the problem of solving any equation, without any kind of restriction, is rather complicated. For instance, what about solving

$\log(x^3 - 4y^{87} + 34xy) + x^{56}\cos(2xy) - \frac{23y - 3x}{x^2 + 1} = 18945.$

Before we can make any serious attempt at solving the above equation we need to learn how to deal with simpler equations.

The easiest functions to think about are the polynomial functions, and you have probably met them before. For instance,

$f(x) = 4x^2 - 3x - 12.$

The example above has only one variable and degree 2. In general, polynomial functions can have more than one variable and any (natural) degree.

This course is then divided into two main parts. The first part is about polynomial functions of one variable and any degree. These are functions like the example above, and will lead us to discover and study complex numbers. The second part of the course is about linear functions, and this will take us to learn about vector spaces and matrices (don’t worry too much about this, we will come back to it in a few weeks). For now, let’s focus on polynomials.

Definition 4.1. A real polynomial function of degree n is a function P : R → R defined by

$P(x) := a_nx^n + a_{n-1}x^{n-1} + \cdots + a_1x + a_0,$

where $a_i \in \mathbb{R}$ and $a_n \neq 0$.

A polynomial equation of degree n is an equation of the form

P(x) = 0,

where P is a polynomial function of degree n.

A solution s of a polynomial equation P(x) = 0 is also called a root of the polynomial P(x).

Observe that the set of solutions of such an equation is just the preimage of the set {0} with respect to P.

Example 4.2 (Linear equations). A linear equation is a polynomial equation of degree one:

ax + b = 0,

where a ≠ 0. It is very easy to solve and has the unique solution $x = -\frac{b}{a}$. ♣


Example 4.3 (Quadratic equation). A polynomial equation of degree two is called a quadratic equation,

$ax^2 + bx + c = 0,$

where a ≠ 0. It is not difficult to find solutions of the quadratic equation. They are of the form

$s = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a},$

provided that $b^2 - 4ac \ge 0$. That is, this condition ensures that a solution s is a real number. Next week we will learn what complex numbers are and this will simplify solving quadratic equations. ♣
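The quadratic formula translates into a few lines of code. The following Python sketch returns only the real solutions, so it gives an empty list when $b^2 - 4ac < 0$; the function name `real_roots` is our own, and the results are floating point numbers, so they are subject to rounding.

```python
import math


def real_roots(a: float, b: float, c: float) -> list[float]:
    """Real solutions of a*x^2 + b*x + c = 0 (with a != 0), in increasing order."""
    if a == 0:
        raise ValueError("not a quadratic equation: a must be nonzero")
    discriminant = b * b - 4 * a * c
    if discriminant < 0:
        return []                      # no real solutions
    if discriminant == 0:
        return [-b / (2 * a)]          # one (double) solution
    root = math.sqrt(discriminant)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])
```

For example, `real_roots(1, 0, 1)` corresponds to the equation $x^2 + 1 = 0$, which has no real solutions.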

Observe that in the above examples the solutions of a polynomial equation are expressed in terms of the coefficients of the polynomial and basic operations: addition, subtraction, multiplication, division and taking roots (or radicals). In algebra, the problem of solving a polynomial equation is to find a general formula for the solutions in terms of the coefficients, using only the above basic operations.

As we have seen above, such formulae can be found for equations of degree one and two. The problem can also be solved in degree three and four but the formulae are more complicated. Read on the internet about solving cubic and quartic equations.

Solving polynomial equations becomes a serious problem when the degree of the equation is five or higher: in this case there does not exist any general formula that expresses the solutions in terms of the coefficients using only the above basic operations! This is a beautiful theorem you can learn in the Level 4 course called Galois Theory.

4.4. Long division of polynomials. The long division algorithm can also be applied to polynomials. It proves the following fact.

Proposition 4.5. Let P(x) and D(x) be polynomials with degrees deg D(x) ≤ deg P(x). Then there exist polynomials Q(x) and R(x) such that

P(x) = Q(x)D(x) + R(x),

and deg R(x) < deg D(x).

In the above expression D(x) is called the divisor, Q(x) the quotient and R(x) the remainder.

Definition 4.6. We say that a polynomial D(x) divides a polynomial P(x) (or that P(x) is divisible by D(x)) if the remainder in the above expression is equal to zero, R(x) = 0.

Theorem 4.7. A number s is a solution of a polynomial equation P(x) = 0 if and only if the polynomial x − s divides the polynomial P(x).

Proof. If the polynomial x − s divides P(x) then there exists a polynomial Q(x) such that

P(x) = (x − s)Q(x).

Thus we get that P(s) = (s − s)Q(s) = 0 · Q(s) = 0, that is, s is a root of P(x).

Conversely, suppose that s is a root of the polynomial P(x). Write

P(x) = Q(x)(x − s) + R(x),

where deg R(x) < 1, that is, R(x) is a constant. Since s is a root we have

0 = P(s) = Q(s)(s − s) + R(s) = R(s).

Thus R(x) = 0 and hence x − s divides P(x). □
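The long division of Proposition 4.5 can be sketched in code as follows. We represent a polynomial by the list of its coefficients, highest degree first (this representation and the name `poly_divmod` are our own choices), and we use Python's `Fraction` so that the arithmetic is exact.

```python
from fractions import Fraction


def poly_divmod(p, d):
    """Divide p by d; return (quotient, remainder) as coefficient lists.

    Polynomials are lists of coefficients, highest degree first, and the
    leading coefficient of d must be nonzero. The remainder list has
    len(d) - 1 entries when len(p) >= len(d) and may start with zeros.
    """
    if not d or d[0] == 0:
        raise ValueError("the divisor must have a nonzero leading coefficient")
    remainder = [Fraction(c) for c in p]
    quotient = []
    while len(remainder) >= len(d):
        # Choose the factor that cancels the current leading term.
        factor = remainder[0] / Fraction(d[0])
        quotient.append(factor)
        for i, c in enumerate(d):
            remainder[i] -= factor * c
        remainder.pop(0)  # the leading coefficient is now zero
    return quotient, remainder
```

In the spirit of Theorem 4.7, dividing by x − s (the list `[1, -s]`) leaves a remainder of zero exactly when s is a root; for example, `poly_divmod([1, -1, 1, -1], [1, -1])` divides $x^3 - x^2 + x - 1$ by x − 1.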

The above theorem tells us that if $s_1$ is a root of a polynomial P(x) then we can simplify it

$P(x) = (x - s_1)Q_1(x),$

where $Q_1(x)$ is a polynomial of degree one lower than P(x). If $s_2$ is a root of $Q_1(x)$ (and hence also a root of P(x)) then we can simplify it further

$P(x) = (x - s_1)(x - s_2)Q_2(x).$

If we can continue this process then we end up with the simplest possible form

$P(x) = a_n(x - s_1)(x - s_2) \ldots (x - s_n),$

where $a_n$ is the leading coefficient of P(x). This is however not always the case if we are dealing with real roots only. It will turn out that such a factorization is always possible over the complex numbers (we will encounter them soon).

Definition 4.8. A polynomial P(x) is irreducible if deg(P(x)) ≥ 1 and it cannot be expressed as the product of two polynomials of smaller (positive) degree. That is, there do not exist Q(x), R(x) such that

P(x) = Q(x)R(x)

and 1 ≤ deg(Q(x)), deg(R(x)) < deg(P(x)).

Example 4.9. Let’s write $P(x) = x^3 - 3x^2 + 3x - 1$ as a product of irreducible polynomials. It’s not difficult to see that 1 is a root of this polynomial. By Theorem 4.7, the polynomial x − 1 divides P(x), and long division gives

$P(x) = (x - 1)(x^2 - 2x + 1).$

It is clear now that

$P(x) = (x - 1)(x - 1)(x - 1) = (x - 1)^3.$

♣


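Example 4.10. Express the polynomial $P(x) = x^4 - 16x^2 - 4$ as a product of irreducible polynomials. Solving P(x) = 0 as a quadratic equation in $x^2$, we have

$x^2 = \frac{16 \pm \sqrt{256 + 16}}{2} = 8 \pm 2\sqrt{17}.$

Notice that $8 < 2\sqrt{17}$, so only $x^2 = 8 + 2\sqrt{17}$ has real solutions, and hence we deduce that

$x = \sqrt{8 + 2\sqrt{17}}$ and $x = -\sqrt{8 + 2\sqrt{17}}$

are roots of P(x). By Theorem 4.7,

$P(x) = \left(x - \sqrt{8 + 2\sqrt{17}}\right)\left(x + \sqrt{8 + 2\sqrt{17}}\right)\left(x^2 - 8 + 2\sqrt{17}\right).$

The polynomial $Q(x) = x^2 - 8 + 2\sqrt{17}$ is irreducible since $8 < 2\sqrt{17}$: the equation $x^2 = 8 - 2\sqrt{17}$ has no real solutions. ♣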

We have to be careful when dealing with irreducible polynomials. For instance, the polynomial $P(x) = x^2 - 2$ can be factored as

$P(x) = (x + \sqrt{2})(x - \sqrt{2}).$

Thus, P(x) is not irreducible over R, but it is irreducible over Z or Q! What are the irreducible polynomials for a given set of coefficients? This is also a difficult question related to Galois Theory. For the time being, the only general tool we have for finding roots of polynomials is the following theorem, which restricts the possible rational roots.

Theorem 4.11. Let $P(x) = a_nx^n + a_{n-1}x^{n-1} + \cdots + a_1x + a_0$ be a polynomial function of degree n with integer coefficients, i.e. $a_i \in \mathbb{Z}$. If the equation P(x) = 0 has a rational solution $x = \frac{p}{q}$, where gcd(p, q) = 1, then p divides $a_0$ and q divides $a_n$.

Proof. Since $\frac{p}{q}$ is a solution we have that

$a_n\frac{p^n}{q^n} + a_{n-1}\frac{p^{n-1}}{q^{n-1}} + \cdots + a_1\frac{p}{q} + a_0 = 0.$

Multiplying the equation by $q^n$ we obtain

$a_np^n + a_{n-1}p^{n-1}q + \cdots + a_1pq^{n-1} + a_0q^n = 0,$

$p(a_np^{n-1} + a_{n-1}p^{n-2}q + \cdots + a_1q^{n-1}) = -a_0q^n,$

which means that p divides $a_0q^n$. Since the greatest common divisor of p and q is one, this implies that p divides $a_0$ as claimed.

Similarly we have that

$a_np^n + a_{n-1}p^{n-1}q + \cdots + a_1pq^{n-1} + a_0q^n = 0,$

$q(a_{n-1}p^{n-1} + \cdots + a_1pq^{n-2} + a_0q^{n-1}) = -a_np^n,$

which means that q divides $a_np^n$ and hence q divides $a_n$ as claimed. □

Example 4.12. Consider the equation

$2x^5 + 3x^4 - 7x^3 + x^2 - 10x + 6 = 0.$

Theorem 4.11 tells us that the only possible rational solutions are ±1, ±1/2, ±2, ±3, ±3/2 or ±6. We can check by inspection that only 3/2 is a rational solution. Dividing the above polynomial by $x - \frac{3}{2}$ we obtain that

$2x^5 + 3x^4 - 7x^3 + x^2 - 10x + 6 = (2x - 3)(x^4 + 3x^3 + x^2 + 2x - 2).$

By Theorem 4.11 again, the only possible rational roots of the quotient $x^4 + 3x^3 + x^2 + 2x - 2$ are ±1 and ±2, and none of them is a root, so the quotient has no rational roots. ♣
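Checking candidates "by inspection" is a finite search, so it can be automated. The sketch below lists the candidates p/q allowed by Theorem 4.11 and keeps those that are roots. The function names are ours, coefficients are given highest degree first, and we assume that both the leading coefficient and the constant term are nonzero integers.

```python
from fractions import Fraction


def divisors(n: int) -> list[int]:
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]


def evaluate(coeffs, x):
    """Evaluate the polynomial at x, multiplying out from the leading term."""
    result = Fraction(0)
    for c in coeffs:
        result = result * x + c
    return result


def rational_roots(coeffs) -> list[Fraction]:
    """All rational roots of a polynomial with integer coefficients.

    Assumes coeffs[0] (leading coefficient) and coeffs[-1] (constant term)
    are both nonzero.
    """
    a_n, a_0 = coeffs[0], coeffs[-1]
    candidates = {Fraction(sign * p, q)
                  for p in divisors(a_0)
                  for q in divisors(a_n)
                  for sign in (1, -1)}
    return sorted(x for x in candidates if evaluate(coeffs, x) == 0)
```

Calling `rational_roots([2, 3, -7, 1, -10, 6])` runs the search for the polynomial of this example.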

Example 4.13. Solve the following equation

\[ x^3 - x^2 + x - 1 = 0. \]

To do this without knowing a general formula we can apply Theorem 4.11. It tells us that the only possible rational roots are ±1. We check that 1 is indeed a root, divide the above polynomial by x − 1, and obtain

\[ x^3 - x^2 + x - 1 = (x - 1)(x^2 + 1). \]

And now we know that the quadratic equation $x^2 + 1 = 0$ has no real solutions. ♣

Example 4.14. Solve the equation

\[ x^5 - 2x^4 - 3x^3 + 6x^2 + 2x - 4 = 0. \]

This is even worse because we know that there is no general formula for solving equations of degree five. However, Theorem 4.11 says that ±1, ±2, ±4 are the only possible rational roots. Indeed, we can check that ±1 and 2 are actual solutions. We do the long division and we obtain

\[ x^5 - 2x^4 - 3x^3 + 6x^2 + 2x - 4 = (x - 1)(x + 1)(x - 2)(x^2 - 2). \]

We can now easily solve the equation $x^2 - 2 = 0$ and we finally get that

\[ x^5 - 2x^4 - 3x^3 + 6x^2 + 2x - 4 = (x - 1)(x + 1)(x - 2)(x - \sqrt{2})(x + \sqrt{2}). \]

That is, the solutions of our equation are ±1, 2, ±√2. ♣
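The search for rational roots in Examples 4.12 to 4.14 is mechanical, so it can be sketched in a few lines of Python. This is only an illustration of Theorem 4.11, not part of the original notes: the function names `divisors`, `evaluate` and `rational_roots` are made up for this sketch, and the code assumes that both the leading coefficient and the constant term are nonzero integers.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def evaluate(coeffs, x):
    """Horner evaluation; coeffs are listed from a_n down to a_0."""
    value = Fraction(0)
    for c in coeffs:
        value = value * x + c
    return value

def rational_roots(coeffs):
    """Rational roots of an integer polynomial (assumes a_n != 0 and a_0 != 0)."""
    a_n, a_0 = coeffs[0], coeffs[-1]
    # Theorem 4.11: a rational root p/q in lowest terms has p | a_0 and q | a_n.
    candidates = {Fraction(s * p, q)
                  for p in divisors(a_0) for q in divisors(a_n) for s in (1, -1)}
    return sorted(x for x in candidates if evaluate(coeffs, x) == 0)

print(rational_roots([2, 3, -7, 1, -10, 6]))   # Example 4.12
print(rational_roots([1, -2, -3, 6, 2, -4]))   # Example 4.14
```

Exact arithmetic with `Fraction` is used so that the test `== 0` is reliable; with floating point numbers a root such as 3/2 could be missed because of rounding.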


5. Complex numbers

The equation $x^2 + 1 = 0$ has no real solutions. A solution of this equation would be a “square root of −1”. At the moment we just want to have a number which is the square root of −1, so we artificially add it to our number system and see what happens. We denote this number by $\sqrt{-1}$ or by $i$ (for imaginary number). It has the property that $i^2 = -1$.

We want to multiply i by real numbers. So, if b ∈ R is a real number then what we get is bi, and we observe that $(bi)^2 = b^2 i^2 = b^2(-1) = -b^2$, because we want the commutativity of the multiplication to hold in our new setting. Observe that bi is not a real number (unless b = 0) because its square is negative.

Definition 5.1. A pair of real numbers (a, b) is called a complex number and it is denoted by z = a + bi. The set of complex numbers is denoted by C. The real number a is called the real part of z, and is denoted by Re(z). Similarly, the real number b is called the imaginary part of z and is denoted by Im(z).

We can define addition and multiplication on C as follows. Let z = a + bi and w = c + di be complex numbers. Then, addition is defined by

\[ z + w = (a + bi) + (c + di) = (a + c) + (b + d)i \]

and multiplication is defined by

\[
\begin{aligned}
zw = (a + bi)(c + di) &= ac + adi + bic + bidi \\
&= ac + bd\,i^2 + (ad + bc)i \\
&= (ac - bd) + (ad + bc)i.
\end{aligned}
\]

In what follows we will discuss a sequence of miraculous facts and properties about complex numbers. Let’s start with an easy one.

Lemma 5.2. Every complex number has a square root.

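Proof. Let a + bi ∈ C. We want to find x + yi ∈ C such that $(x + yi)^2 = a + bi$:

\[ (x + yi)^2 = x^2 + 2xyi + (yi)^2 = (x^2 - y^2) + 2xyi, \]

and we have to solve the system of equations

\[
\begin{cases}
x^2 - y^2 = a \\
2xy = b
\end{cases}
\]

Keep in mind that we want to find real numbers x and y satisfying the above system!

If x = 0 = y, then a = 0 = b and there is nothing to prove. We can assume then that either x or y (or both) is not 0. For simplicity, suppose x ≠ 0. Then, from the second equation we obtain

\[ y = \frac{b}{2x}. \]

We can substitute this in the first equation:

\[ x^2 - \left(\frac{b}{2x}\right)^2 = a. \]

Rearranging everything, we get a quartic equation

\[ x^4 - ax^2 - \frac{b^2}{4} = 0. \]

Treating this equation as a quadratic equation in $x^2$, we get

\[ x^2 = \frac{a \pm \sqrt{a^2 + b^2}}{2}. \]

Since x has to be a real number, $x^2$ cannot be negative, so we take the sign + and get

\[ x = \pm\frac{\sqrt{a + \sqrt{a^2 + b^2}}}{\sqrt{2}}. \]

Also, from $y^2 = x^2 - a$ we deduce that

\[ y = \pm\frac{\sqrt{\sqrt{a^2 + b^2} - a}}{\sqrt{2}}. \]

To finish the proof we have to say which combinations x + yi satisfy $(x + yi)^2 = a + bi$. Since 2xy = b, the product xy must have the same sign as b, so there are two possibilities:

\[ z = \frac{\sqrt{a + \sqrt{a^2 + b^2}}}{\sqrt{2}} + \operatorname{sign}(b)\left(\frac{\sqrt{\sqrt{a^2 + b^2} - a}}{\sqrt{2}}\right) i \]

and w = −z. Here we use the convention sign(0) = 1, so that the formula also covers the case b = 0. □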
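The formula from the proof of Lemma 5.2 can be tried out numerically. The following Python sketch is an illustration only; the function name `complex_sqrt` is hypothetical, and the convention sign(0) = 1 is an assumption made so that real numbers (b = 0) are handled as well.

```python
import math

def complex_sqrt(a, b):
    """One square root x + yi of a + bi, following the formula of Lemma 5.2."""
    r = math.hypot(a, b)                       # sqrt(a^2 + b^2)
    x = math.sqrt(max(r + a, 0.0) / 2)         # max(...) guards against rounding
    y = math.sqrt(max(r - a, 0.0) / 2)
    sign = 1.0 if b >= 0 else -1.0             # assumed convention: sign(0) = 1
    return x, sign * y

x, y = complex_sqrt(0.0, 3.0)                  # a square root of 3i
print(x, y)                                    # both close to sqrt(6)/2
print(complex(x, y) ** 2)                      # close to 3i
```

The second square root is obtained by changing the sign of both x and y.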

The following (obvious) result shows that the complex numbers simplify dealing with quadratic equations.

Theorem 5.3. Every quadratic equation

\[ ax^2 + bx + c = 0 \]

with complex coefficients and a ≠ 0 has a solution s ∈ C given by the formula

\[ s = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. \]

□


5.4. The Fundamental Theorem of Algebra. In fact, more is true. The following result is called The Fundamental Theorem of Algebra:

Theorem 5.5. Every nonconstant polynomial with complex coefficients has at least one root in the field of complex numbers.

There are many beautiful proofs of this theorem, so there is no point in repeating them in these notes. At this point probably the best proof is the one in Chapter 19 of Proofs from the Book by Aigner and Ziegler. It uses only very basic facts from calculus. Notice that the theorem does not tell us how to find a solution; it only proves that one exists.

By applying Proposition 4.7 we immediately obtain the following result.

Corollary 5.6. Every polynomial P(x) of degree n ≥ 1 with complex coefficients is equal to a product of n polynomials of degree one

\[ P(x) = c\,(x - a_1)(x - a_2)\cdots(x - a_n), \]

where $a_i \in \mathbb{C}$ and $c \neq 0$ is the leading coefficient of P(x). □

The above result can be stated in terms of irreducible polynomials.

Corollary 5.7. A polynomial P(x) with complex coefficients is irreducible (in C) if and only if deg(P(x)) = 1.

5.8. Properties of complex numbers. Let’s continue studying the set of complex numbers.

Definition 5.9. Let z = a + bi ∈ C. The conjugate of z is

\[ \bar{z} = a - bi. \]

The modulus of z is

\[ |z| = \sqrt{a^2 + b^2} \in \mathbb{R}. \]

As we will see soon enough, these are important notions related to z. The following result is an example.

Proposition 5.10. Let z = a + bi be a complex number. Then

(1) $|\bar{z}| = |z|$,

(2) $z\bar{z} = |z|^2$,

(3) |z| = 0 if and only if z = 0.

Proof. Each of the properties can be proved by a simple computation.

(1) Since $\bar{z} = a - bi$, we have $|\bar{z}| = \sqrt{a^2 + (-b)^2} = \sqrt{a^2 + b^2} = |z|$.

(2) $z\bar{z} = (a + bi)(a - bi) = a^2 + b^2 = |z|^2$.

(3) The modulus is equal to zero if and only if $a^2 + b^2 = 0$, and this is true if and only if both a and b are zero, which means z = 0.

□

Let’s see some more properties of the modulus and the conjugate of a complex number. The following properties are easy to prove (in (3) and (5) we assume z ≠ 0):

(1) $\overline{z + w} = \bar{z} + \bar{w}$,

(2) $\overline{zw} = \bar{z}\,\bar{w}$,

(3) $\overline{z^{-1}} = (\bar{z})^{-1}$,

(4) $|zw| = |z| \cdot |w|$,

(5) $|z^{-1}| = |z|^{-1}$.

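Proposition 5.11. Let z, w ∈ C. Then the following triangle inequality holds true

\[ |z + w| \le |z| + |w|. \]

Proof. Let us first observe that for every complex number z = a + bi we have that $z + \bar{z} = a + bi + a - bi = 2a = 2\operatorname{Re}(z)$, and that $\operatorname{Re}(z) = a \le \sqrt{a^2 + b^2} = |z|$. This will be used in the following computation.

\[
\begin{aligned}
|z + w|^2 &= (z + w)\overline{(z + w)} \\
&= (z + w)(\bar{z} + \bar{w}) \\
&= z\bar{z} + w\bar{w} + w\bar{z} + z\bar{w} \\
&= |z|^2 + |w|^2 + (w\bar{z} + \overline{w\bar{z}}) \\
&= |z|^2 + |w|^2 + 2\operatorname{Re}(w\bar{z}) \\
&\le |z|^2 + |w|^2 + 2|w\bar{z}| \\
&= |z|^2 + |w|^2 + 2|w||z| \\
&= (|z| + |w|)^2.
\end{aligned}
\]

Since both |z + w| and |z| + |w| are non-negative, this implies the triangle inequality. □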

Proposition 5.12. The addition (+) and multiplication (·) in C satisfy the following properties.

(1) The addition of complex numbers is commutative:

∀ x, y ∈ C x+ y = y + x.

(2) The addition of complex numbers is associative:

∀ x, y, z ∈ C x+ (y + z) = (x+ y) + z.


(3) 0 is the identity for addition

∀ x ∈ C x+ 0 = x = 0 + x.

(4) Every complex number has its opposite:

∀ x ∈ C ∃ −x ∈ C x+ (−x) = 0.

(5) The multiplication of complex numbers is commutative:

∀ x, y ∈ C xy = yx.

(6) The multiplication of complex numbers is associative:

∀ x, y, z ∈ C x(yz) = (xy)z.

(7) 1 is the identity for multiplication

∀ x ∈ C x1 = x = 1x.

(8) Every non-trivial complex number has its inverse:

∀ x ∈ C such that x ≠ 0 ∃ x⁻¹ ∈ C xx⁻¹ = 1.

(9) The multiplication is distributive over the addition:

∀ x, y, z ∈ C x(y + z) = xy + xz.

Proof. You can prove (1) - (7) and (9) as an exercise. Here we will only prove (8). Let z ∈ C, z ≠ 0. By Proposition 5.10,

\[ z\bar{z} = |z|^2 \neq 0 \]

and we can define $z^{-1} = \bar{z}/|z|^2$. Then,

\[ zz^{-1} = z\,\frac{\bar{z}}{|z|^2} = \frac{z\bar{z}}{|z|^2} = \frac{|z|^2}{|z|^2} = 1. \]

□

Remark 5.13. Let X be a set of numbers with operations + (addition) and · (multiplication). Depending on which of the above properties are satisfied, the set X has different names. For instance, if X only satisfies (1), (2) and (3), then it is called a monoid, and if it also satisfies (4) then it is a group. If X satisfies all the properties except (8), then X is a ring, and if it satisfies all the properties then it is called a field.

Actually, you already know some examples of these notions. So far we know the following sequence of inclusions

N ⊂ Z ⊂ Q ⊂ R ⊂ C.

The set of natural numbers N is just a monoid, the set of integers Z is a ring, and the sets of rational, real and complex numbers Q, R and C are fields. Not all fields are like these three; some of them are finite. The algebra of finite fields has powerful applications in cryptography. All your electronic devices use this abstract algebra.

The above properties seem to be obvious. In the following example we show a certain multiplication which is not commutative.

Example 5.14. Let (a, b) be a pair of real numbers such that a ≠ 0. Let us define the multiplication “∗” of such pairs by

(a, b) ∗ (c, d) = (ac, ad + b).

It is easy to check that this multiplication is associative. It also has the identity (1, 0). This means that for all (a, b) as above we have that

(a, b) ∗ (1, 0) = (a, b) = (1, 0) ∗ (a, b).

So multiplication by (1, 0) is like multiplication by one. Observe also that every element has an inverse

(a, b)⁻¹ = (1/a, −b/a).

Indeed, (a, b) ∗ (1/a, −b/a) = (1, 0). However, this multiplication is not commutative. For example

(1, 1) ∗ (2, 1) = (2, 1 + 1) = (2, 2)

and, on the other hand,

(2, 1) ∗ (1, 1) = (2, 2 + 1) = (2, 3).

Observe that this multiplication is not artificial. It corresponds to the composition of functions f : R → R of the form f(x) = ax + b, with a ≠ 0. More precisely, if f(x) = ax + b and g(x) = cx + d then we have that

(f ◦ g)(x) = a(cx + d) + b = acx + (ad + b).

♣
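The correspondence between the multiplication ∗ and the composition of functions can be checked on the pairs used in Example 5.14. The following Python sketch is only an illustration; the helper names `star` and `affine` are made up here.

```python
def star(p, q):
    """The product (a, b) * (c, d) = (ac, ad + b) from Example 5.14."""
    a, b = p
    c, d = q
    return (a * c, a * d + b)

def affine(p):
    """The function x -> a*x + b attached to the pair p = (a, b)."""
    a, b = p
    return lambda x: a * x + b

p, q = (1, 1), (2, 1)
print(star(p, q), star(q, p))      # (2, 2) (2, 3): the product is not commutative

# star(p, q) corresponds to the composition f o g, with f from p and g from q
f, g, fg = affine(p), affine(q), affine(star(p, q))
print(all(f(g(x)) == fg(x) for x in range(-5, 6)))   # True
```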

5.15. Complex numbers as points in a plane and polar coordinates. We defined complex numbers originally as pairs (a, b) where a, b ∈ R. We can then imagine the complex number z = a + bi as the point in a plane with horizontal coordinate a and vertical coordinate b. The following figure shows a few examples.

[Figure: the complex numbers 1, i, −2 − i and a + bi drawn as points in the plane.]

If we imagine complex numbers this way, then the modulus of z is simply the distance of z to the point (0, 0), and the conjugate z̄ is the point that we obtain when we fold the plane along the horizontal axis.

Many other properties of the modulus and the conjugate have an easy interpretation this way. For instance, given z, w ∈ C, we know from Proposition 5.11 that

|z + w| ≤ |z| + |w|.

In the next figure we show that the addition of complex numbers has a simple geometric interpretation and that the triangle inequality is intuitively obvious.

[Figure: the points z, w and z + w in the plane.]

The advantages of this representation of complex numbers do not stop here, but first we need some trigonometry. For an angle θ (measured in radians), sin(θ) and cos(θ) are sides of the right angle triangle depicted below.

Figure 5.15

Since cos(θ) and sin(θ) are the base and height of a right angle triangle, we deduce immediately the well-known formula

\[ \cos^2(\theta) + \sin^2(\theta) = 1, \]

but we can also deduce many other relations and formulas.

Example 5.16. The following equalities are easy to check using the above diagram

sin(π/2 + θ) = cos(θ),    sin(π/2 − θ) = cos(θ),
cos(π/2 + θ) = − sin(θ),    cos(π/2 − θ) = sin(θ),

sin(π + θ) = − sin(θ),    sin(π − θ) = sin(θ),
cos(π + θ) = − cos(θ),    cos(π − θ) = − cos(θ),

sin(3π/2 + θ) = − cos(θ),    sin(3π/2 − θ) = − cos(θ),
cos(3π/2 + θ) = sin(θ),    cos(3π/2 − θ) = − sin(θ),

sin(2π − θ) = − sin(θ),
cos(2π − θ) = cos(θ).

♣

Let z ∈ C be a nonzero complex number, and let θ be the angle from the positive real axis to the segment joining the origin with z. The length of this segment is, of course, equal to the modulus of z.

[Figure: a complex number z at angle θ from the real axis, with cos θ and sin θ marked on the circle of radius 1.]

Lemma 5.17. For each z ∈ C there is an equality

z = |z|(cos θ + i sin θ).

The above expression of z is known as the polar form of z. The angle θ is called the argument of z and it is denoted by arg(z). It is well defined up to an integer multiple of 2π. The argument chosen to be in the interval [0, 2π) is called the principal argument of z and it is denoted by Arg(z).

Remark 5.18. Sometimes the principal argument is defined to be in the interval (−π, π]. It is a matter of convention.

Example 5.19.

(1) $1 + i = \sqrt{2}\,(\cos(\pi/4) + i\sin(\pi/4))$

(2) $-1 + i = \sqrt{2}\,(\cos(3\pi/4) + i\sin(3\pi/4))$

(3) $\sqrt{3} + i = 2\,(\cos(\pi/6) + i\sin(\pi/6))$

(4) $1 + i\sqrt{3} = 2\,(\cos(\pi/3) + i\sin(\pi/3))$ ♣
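The polar forms in Example 5.19 can be checked with Python’s standard `cmath` module. This is a sketch for illustration only: `polar_form` is a hypothetical helper, and it follows the convention of these notes that the principal argument lies in [0, 2π), whereas `cmath.polar` itself returns an angle in [−π, π] (compare Remark 5.18).

```python
import cmath
import math

def polar_form(z):
    """Return (|z|, Arg(z)) with the principal argument taken in [0, 2*pi)."""
    modulus, angle = cmath.polar(z)      # cmath returns an angle in [-pi, pi]
    return modulus, angle % (2 * math.pi)

for z in (1 + 1j, -1 + 1j, math.sqrt(3) + 1j, 1 + math.sqrt(3) * 1j):
    r, theta = polar_form(z)
    # print the modulus and the argument as a multiple of pi
    print(z, round(r, 6), round(theta / math.pi, 6))
```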

The following theorem, known as De Moivre’s theorem, is the main result of this section. It provides an intuition for the multiplication of complex numbers.

Theorem 5.20. Let z = |z|(cos α + i sin α) and w = |w|(cos β + i sin β). Then

zw = |z||w|(cos(α + β) + i sin(α + β)).

In other words, in order to multiply two complex numbers we multiply their moduli and add their arguments.

Corollary 5.21. If z = |z|(cos θ + i sin θ) is a nonzero complex number then for every integer n ∈ Z we have that

\[ z^n = |z|^n(\cos(n\theta) + i\sin(n\theta)). \]

□

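Example 5.22. Let’s compute the square roots of the complex number z = 3i. By Lemma 5.2, these are

\[ z_1 = \frac{\sqrt{6}}{2} + \left(\frac{\sqrt{6}}{2}\right) i \quad\text{and}\quad z_2 = -\frac{\sqrt{6}}{2} - \left(\frac{\sqrt{6}}{2}\right) i. \]

What if we now compute the square roots of $z_1$? If we follow Lemma 5.2, the computations start getting complicated:

\[ w_1 = \frac{\sqrt{2\sqrt{3} + \sqrt{6}}}{2} + \left(\frac{\sqrt{2\sqrt{3} - \sqrt{6}}}{2}\right) i, \]

\[ w_2 = -\frac{\sqrt{2\sqrt{3} + \sqrt{6}}}{2} - \left(\frac{\sqrt{2\sqrt{3} - \sqrt{6}}}{2}\right) i. \]

Can you predict any pattern here? It is hard to detect anything. What happens if we write all these complex numbers in polar form? In this case, z = 3(cos(π/2) + i sin(π/2)). Its square roots are then

\[ z_1 = \sqrt{3}\left(\cos\left(\frac{\pi}{4}\right) + i\sin\left(\frac{\pi}{4}\right)\right) \quad\text{and}\quad z_2 = \sqrt{3}\left(\cos\left(\frac{5\pi}{4}\right) + i\sin\left(\frac{5\pi}{4}\right)\right) \]

and the square roots of $z_1$ are

\[ w_1 = \sqrt[4]{3}\left(\cos\left(\frac{\pi}{8}\right) + i\sin\left(\frac{\pi}{8}\right)\right) \quad\text{and}\quad w_2 = \sqrt[4]{3}\left(\cos\left(\frac{9\pi}{8}\right) + i\sin\left(\frac{9\pi}{8}\right)\right). \]

Can you see some pattern now? We will see what really happens here in the next chapter. ♣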


6. Complex numbers II

In this section we prove De Moivre’s theorem and apply it to some concrete computations. We start with proving a geometric fact which is interesting in its own right.

6.1. A geometric lemma.

Lemma 6.2. Let F : R² → R² be a rotation by an angle α about the origin. Then, the map F is given by the formula

F(x, y) = (x cos α − y sin α, x sin α + y cos α).

Proof. Notice that F preserves distance. Thus, using Figure 5.15, we see immediately that

F(1, 0) = (cos(α), sin(α)).

Let’s write for short

a = cos(α) and b = sin(α).

Then, using again Figure 5.15, it follows that F(0, 1) = (−b, a). Actually, we can deduce more than this: for all (x, 0), (0, y) ∈ R²,

F(x, 0) = (ax, bx) and F(0, y) = (−by, ay).

[Figure: the points (1, 0), (0, 1), (x, 0), (0, y), (x, y) and their images (a, b), (−b, a), (ax, bx), (−by, ay), F(x, y) under the rotation by the angle α.]

Note that the coloured triangles in the picture above are equal, and the blue one has vertices (0, 0), (−by, ay) and (0, ay), and hence we deduce that

F(x, y) = (ax − by, bx + ay) = (x cos(α) − y sin(α), x sin(α) + y cos(α)).

□

Let Fα, Fβ : R² → R² be two rotations about the origin by the angles α and β respectively. Their composition Fβ ◦ Fα is the rotation by the angle α + β. Let us make use of this observation. First we calculate the formula for the composition.

\[
\begin{aligned}
(F_\beta \circ F_\alpha)(x, y) = \big(&x(\cos\alpha\cos\beta - \sin\alpha\sin\beta) - y(\sin\alpha\cos\beta + \cos\alpha\sin\beta), \\
&x(\cos\alpha\sin\beta + \sin\alpha\cos\beta) + y(\cos\alpha\cos\beta - \sin\alpha\sin\beta)\big).
\end{aligned}
\]

On the other hand we know that

\[ F_{\alpha+\beta}(x, y) = (x\cos(\alpha + \beta) - y\sin(\alpha + \beta),\; x\sin(\alpha + \beta) + y\cos(\alpha + \beta)). \]

As a consequence we get the following identities:

(6.1) sin(α + β) = cos α sin β + sin α cos β

(6.2) cos(α + β) = cos α cos β − sin α sin β

Proof of De Moivre’s theorem. Recall that we need to prove the following formula

zw = |z||w|(cos(α + β) + i sin(α + β)),

where z = |z|(cos α + i sin α) and w = |w|(cos β + i sin β). Multiplying out zw, we observe that the formula is an immediate consequence of the above identities (6.1) and (6.2). □
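The argument above, composing two rotations and comparing with the rotation by the sum of the angles, can be verified numerically. The Python sketch below is an illustration under the assumption that the rotation is given by the formula of Lemma 6.2; the function name `rotate` and the sample angles are arbitrary choices.

```python
import math

def rotate(alpha, point):
    """Rotation of the plane by the angle alpha about the origin (Lemma 6.2)."""
    x, y = point
    return (x * math.cos(alpha) - y * math.sin(alpha),
            x * math.sin(alpha) + y * math.cos(alpha))

alpha, beta, point = 0.7, 1.9, (2.0, -3.0)
composed = rotate(beta, rotate(alpha, point))
direct = rotate(alpha + beta, point)
print(all(math.isclose(u, v, abs_tol=1e-12) for u, v in zip(composed, direct)))

# Applied to (1, 0), the first coordinate gives (6.2) and the second gives (6.1).
cos_sum, sin_sum = rotate(beta, rotate(alpha, (1.0, 0.0)))
print(math.isclose(cos_sum, math.cos(alpha + beta)),
      math.isclose(sin_sum, math.sin(alpha + beta)))
```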

We also obtain the following geometric interpretation of the multiplication by a complex number.

Corollary 6.3. Let w = |w|(cos α + i sin α) be a complex number. Let f_w : C → C be defined to be the multiplication by w. That is,

f_w(z) = wz.

Then f_w is the composition of the rotation by the angle α about the origin followed by the scaling by the modulus |w|.

Proof. Let z = x + yi and let w = |w|(cos α + i sin α). Observe that

zw = (x + yi)|w|(cos α + i sin α) = |w|(x cos α − y sin α) + i|w|(x sin α + y cos α).

Thus in the coordinates (x, y) the above multiplication is given by the formula

F_w(x, y) = (|w|(x cos α − y sin α), |w|(x sin α + y cos α)).

The statement then follows from Lemma 6.2. □


6.4. The exponential notation and roots of complex numbers. Recall from Lemma 5.2 that every complex number z has a square root. In fact, if z = a + bi ≠ 0, then it has two square roots, given by the formulas

\[ w_1 = \frac{\sqrt{\sqrt{a^2 + b^2} + a}}{\sqrt{2}} + \operatorname{sign}(b)\left(\frac{\sqrt{\sqrt{a^2 + b^2} - a}}{\sqrt{2}}\right) i \]

and $w_2 = -w_1$. It is pretty clear that computing roots of complex numbers this way is far from optimal. Furthermore, the above formulas only work to compute the square roots of z, but what if we want to compute the roots of degree 3 of z?

The polar form that we have seen recently, together with De Moivre’s Theorem, simplifies these computations, but we can still make another simplification before we start computing anything. The following is called Euler’s formula:

\[ e^{i\theta} := \cos\theta + i\sin\theta, \]

where e is the base of the natural logarithm (the Euler number). Don’t let the simplicity of this formula fool you: there is a lot of deep mathematics behind it! You can read about this formula on the internet, and especially about the equation

\[ e^{i\pi} = -1, \]

which is often called “the most beautiful theorem in mathematics”.

It is very often more convenient to write $re^{i\theta}$ instead of r(cos θ + i sin θ). What does De Moivre’s Theorem look like in this notation? If we write $z = |z|e^{i\alpha}$ and $w = |w|e^{i\beta}$, then

\[ zw = |z|e^{i\alpha}|w|e^{i\beta} = |z||w|e^{i\alpha}e^{i\beta} = |z||w|e^{i\alpha + i\beta} = |z||w|e^{i(\alpha + \beta)}, \]

which is easy to remember as it is the usual rule for multiplying powers. Let’s see some examples.

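Example 6.5.

(1) $1 + i = \sqrt{2}\,e^{i\pi/4}$

(2) $-1 + i = \sqrt{2}\,e^{i3\pi/4}$

(3) $\sqrt{3} + i = 2e^{i\pi/6}$

(4) $(1 + i)(\sqrt{3} + i) = 2\sqrt{2}\,e^{i\pi/4 + i\pi/6} = 2\sqrt{2}\,e^{i5\pi/12}$

(5) $(1 + i)(-1 + i) = \sqrt{2}\sqrt{2}\,e^{i\pi/4 + i3\pi/4} = 2e^{i\pi} = -2$ ♣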

We can now apply Euler’s formula to compute the roots of degree n of a complex number z ∈ C. In other words, we find all (complex) solutions of the equation

\[ x^n - z = 0, \]

where z ∈ C is a complex number. Notice that, if z = 0 then the only solution is zero. The following result is an immediate consequence of De Moivre’s theorem.

Corollary 6.6. Let n ∈ N be a nonzero natural number. Every nonzero complex number z ∈ C has n different roots of degree n. Moreover, the roots are given by the following formula

\[ \sqrt[n]{|z|}\left(\cos\left(\frac{2\pi k + \theta}{n}\right) + i\sin\left(\frac{2\pi k + \theta}{n}\right)\right) = \sqrt[n]{|z|}\; e^{i\frac{2\pi k + \theta}{n}}, \]

where θ = arg(z) and k = 0, 1, 2, . . . , n − 1.

Proof. It is easy to check, by making a quick computation using De Moivre’s formula, that each of the above numbers is a degree n root of z. On the other hand we know that $x^n - z = 0$ has at most n solutions, and thus the above numbers are all the degree n roots of z. □

In Figure 6.4 we show how to draw all degree n roots of a complex number $z = re^{i\theta}$. We can quickly check that the number $z_1 = \sqrt[n]{r}\,e^{i\theta/n}$ is a root of degree n of the number z. In order to draw all of the roots of z we proceed then as follows (see Figure 6.4).

(1) The number z is given and drawn on the plane;

(2) draw a ray from 0 to z;

(3) draw a circle of radius $\sqrt[n]{r}$ centered at the origin;

(4) on the circle draw the number with argument θ/n: this is $z_1$;

(5) draw a regular n-gon with vertices on the circle and such that $z_1$ is a vertex.

Figure 6.4. Degree n roots of a complex number.
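The formula of Corollary 6.6 translates directly into a short computation. The following Python sketch is for illustration only; `nth_roots` is a hypothetical name, and the argument θ is taken as returned by `cmath.polar` (in [−π, π]), which is fine because any choice of argument gives the same set of roots.

```python
import cmath
import math

def nth_roots(z, n):
    """All n-th roots of a nonzero complex number z, as in Corollary 6.6."""
    r, theta = cmath.polar(z)
    radius = r ** (1.0 / n)
    return [cmath.rect(radius, (theta + 2 * math.pi * k) / n) for k in range(n)]

print(nth_roots(3j, 2))                                      # the two square roots of 3i
print([abs(w ** 5 - 32) < 1e-9 for w in nth_roots(32, 5)])   # each is a fifth root of 32
```

The roots returned for a fixed z all have modulus equal to the n-th root of |z| and consecutive arguments differing by 2π/n, which is the regular n-gon described above.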


6.7. Roots of unity. In the above procedure to draw all the degree n roots of $z = re^{i\theta}$, the roots have arguments

\[ \frac{\theta}{n},\ \frac{\theta + 2\pi}{n},\ \frac{\theta + 4\pi}{n},\ \ldots,\ \frac{\theta + 2(n-1)\pi}{n} \]

(the next value, $\frac{\theta + 2n\pi}{n} = \frac{\theta}{n} + 2\pi$, gives $z_1$ again). So once we fix the root $z_1$, the rest of the roots are obtained by rotating $z_1$ by the angles 2π/n, 4π/n, . . . , 2(n − 1)π/n.

[Figure: the roots Z1, Z2, Z3, Z4, Z5 on a circle, with the angle θ/n marked.]

On the other hand, we know from Corollary 6.3 that a rotation of $z_1$ by the angle 2πk/n is the same as multiplying $z_1$ by the complex number $w = e^{2\pi i k/n} = \cos(2\pi k/n) + i\sin(2\pi k/n)$. Note that $w^n = 1$. In other words, all the other roots are of the form $\varepsilon z_1$, where ε is a root of degree n of 1.

Thus, in order to determine all the degree n roots of a complex number z we only need to determine one, $z_1$, and combine it with the degree n roots of 1, that is, with the solutions of

\[ x^n = 1. \]

Since this is a particular case, it deserves its own name: the solutions of this equation are the degree n roots of unity. These are the following numbers

\[ \varepsilon_k := \cos\left(\frac{2\pi k}{n}\right) + i\sin\left(\frac{2\pi k}{n}\right) = e^{\frac{2\pi i k}{n}}, \]

k = 0, 1, . . . , n − 1. Let us consider several cases for small values of n.

(1) The only degree one root of unity is 1 itself.

(2) 1 and −1 are the square roots of unity.

(3) The numbers 1, $-\frac{1}{2} + i\frac{\sqrt{3}}{2}$, $-\frac{1}{2} - i\frac{\sqrt{3}}{2}$ are the degree three roots of unity.

(4) 1, i, −1 and −i are the degree four roots of unity.

Figure 6.7. Degree seven roots of unity

In general, observe that the degree n roots of unity are the vertices of the regular n-gon inscribed in the unit circle so that 1 is its vertex (see Figure 6.7).

Since we know that one itself is a root of unity of any degree we can divide the polynomial $z^n - 1$ by z − 1. We have

\[ z^n - 1 = (z - 1)(z^{n-1} + z^{n-2} + \cdots + z^2 + z + 1). \]

Thus the degree n roots of unity different from 1 are the solutions of the equation

\[ z^{n-1} + z^{n-2} + \cdots + z^2 + z + 1 = 0. \]

6.8. Two general theorems. We have seen in the previous section that it is relatively easy to find solutions of the equation $z^n - c = 0$ for any n ∈ N and c ∈ C. This is, however, a very simple equation. On the other hand we know that it is impossible to give a formula for solutions of a general polynomial equation of high degree.

Let’s see two general theorems about complex roots of polynomials. The first one applies to any polynomial with complex coefficients, and shows that we can estimate the region where the solutions are. It is known as Cauchy’s Theorem.

Theorem 6.9. Consider a degree n polynomial equation with complex coefficients

\[ x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 = 0. \]

Let $M := \max\{|a_0|, |a_1|, \ldots, |a_{n-1}|\}$. If z ∈ C is a solution of the above equation then |z| ≤ M + 1.

Proof. Let $p(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0$. Suppose that |z| > M + 1. We shall show that such a z cannot be a solution of our equation. Let us evaluate the polynomial p(x) at z and make the following sequence of inequalities.

\[
\begin{aligned}
|p(z)| &\ge |z^n| - |a_{n-1}z^{n-1} + \cdots + a_1 z + a_0| \\
&\ge |z|^n - (|a_{n-1}||z|^{n-1} + \cdots + |a_1||z| + |a_0|) \\
&\ge |z|^n - M(|z|^{n-1} + |z|^{n-2} + \cdots + |z| + 1) \\
&= |z|^n\left(1 - M(|z|^{-1} + |z|^{-2} + \cdots + |z|^{-(n-1)} + |z|^{-n})\right) \\
&\ge |z|^n\left(1 - M(|z|^{-1} + |z|^{-2} + \cdots + |z|^{-(n-1)} + |z|^{-n} + \cdots)\right) \\
&= |z|^n\left(1 - M\,\frac{1/|z|}{1 - 1/|z|}\right) = |z|^n\left(1 - \frac{M}{|z| - 1}\right) = |z|^n\,\frac{|z| - (M + 1)}{|z| - 1} > 0.
\end{aligned}
\]

Here the infinite geometric series converges because |z| > M + 1 ≥ 1. This calculation implies that if |z| > M + 1 then |p(z)| > 0 and hence z is not a solution. Consequently all solutions have to have moduli at most M + 1 as stated. □

Example 6.10. All solutions of the equation

\[ x^5 + x^4 + 2x^3 - x^2 + 2 = 0 \]

are contained in the disc of radius 3. ♣
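The bound of Theorem 6.9 is easy to compute. The Python sketch below is only an illustration: `cauchy_bound` is a made-up name, the polynomial is assumed to be monic, and the second check reuses the polynomial of Example 4.14, whose roots are known explicitly.

```python
import math

def cauchy_bound(coeffs):
    """M + 1 for the monic polynomial x^n + a_{n-1} x^{n-1} + ... + a_0.

    coeffs lists a_{n-1}, ..., a_1, a_0 (the leading coefficient 1 is omitted).
    """
    return max(abs(c) for c in coeffs) + 1

print(cauchy_bound([1, 2, -1, 0, 2]))          # Example 6.10: 3

# Example 4.14: x^5 - 2x^4 - 3x^3 + 6x^2 + 2x - 4 has roots 1, -1, 2, sqrt(2), -sqrt(2)
bound = cauchy_bound([-2, -3, 6, 2, -4])
roots = [1, -1, 2, math.sqrt(2), -math.sqrt(2)]
print(bound, all(abs(r) <= bound for r in roots))   # 7 True
```

As the second example shows, the bound can be far from sharp: all roots there have modulus at most 2.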

The second general theorem works only for polynomials with real coefficients. It is known as the Complex roots Theorem.

Theorem 6.11. Let $P(x) = a_n x^n + a_{n-1}x^{n-1} + \ldots + a_1 x + a_0$ be a polynomial with real coefficients (that is, $a_i \in \mathbb{R}$ for i = 0, . . . , n), and let z ∈ C be a complex root of P(x). Then, the conjugate z̄ of z is also a root of P(x).

Proof. Since $a_i \in \mathbb{R}$ for i = 0, . . . , n, we have $\overline{a_i} = a_i$ for all i. Now, P(z) = 0 because z is a root of P(x), and the following follows easily from the properties that we already know about conjugates of complex numbers.

\[
\begin{aligned}
P(\bar{z}) &= a_n(\bar{z})^n + a_{n-1}(\bar{z})^{n-1} + \ldots + a_1\bar{z} + a_0 \\
&= \overline{a_n}\,\overline{z^n} + \overline{a_{n-1}}\,\overline{z^{n-1}} + \ldots + \overline{a_1}\,\bar{z} + \overline{a_0} \\
&= \overline{a_n z^n} + \overline{a_{n-1}z^{n-1}} + \ldots + \overline{a_1 z} + \overline{a_0} \\
&= \overline{a_n z^n + a_{n-1}z^{n-1} + \ldots + a_1 z + a_0} = \overline{P(z)} = \overline{0} = 0.
\end{aligned}
\]

□

Corollary 6.12. Let $P(x) = a_n x^n + a_{n-1}x^{n-1} + \ldots + a_1 x + a_0$ be a polynomial with real coefficients. If the degree n of P(x) is odd, then P(x) has (at least) one real root.

Proof. Suppose that n is odd. Then, the Fundamental Theorem of Algebra (in the form of Corollary 5.6) says that P(x) has n roots counted with multiplicity, which is an odd number. On the other hand, the previous theorem says that the roots with nontrivial imaginary part come in conjugate pairs, so their number is even. Thus, there must be at least one real root.

□

Example 6.13. Consider the polynomial equation $x^3 + 2x^2 + 2x + 1 = 0$. It is easy to check that x = −1 is a solution, and the other two (complex) solutions are the solutions of the quadratic equation $x^2 + x + 1 = 0$:

\[ x = \frac{-1 \pm \sqrt{1 - 4}}{2} = \frac{-1 \pm i\sqrt{3}}{2}. \]

♣


7. Trigonometric identities

Many applications of De Moivre’s theorem are concerned with computations of exact values of trigonometric functions. Recall Corollary 5.21 which states that

\[ (\cos\alpha + i\sin\alpha)^n = \cos(n\alpha) + i\sin(n\alpha) \]

for every integer n ∈ Z and every angle α ∈ R. By calculating directly the left hand side we obtain useful identities for trigonometric functions.

Example 7.1. Let n = 2 in the above formula. We get that

\[ \cos^2(\alpha) - \sin^2(\alpha) + i(2\sin\alpha\cos\alpha) = \cos(2\alpha) + i\sin(2\alpha). \]

By comparing the real and imaginary parts we get that

\[ \cos(2\alpha) = \cos^2\alpha - \sin^2\alpha = 2\cos^2\alpha - 1, \]

\[ \sin(2\alpha) = 2\sin\alpha\cos\alpha. \]

♣

Example 7.2. For n = 3 we get that

\[ \cos^3\alpha - 3\cos\alpha\sin^2\alpha + i(3\cos^2\alpha\sin\alpha - \sin^3\alpha) = \cos(3\alpha) + i\sin(3\alpha), \]

which implies the following identities.

\[ \sin(3\alpha) = 3\cos^2\alpha\sin\alpha - \sin^3\alpha, \]

\[ \cos(3\alpha) = \cos^3\alpha - 3\cos\alpha\sin^2\alpha. \]

♣

Example 7.3. For n = 4 we obtain that

\[ \cos(4\alpha) = \cos^4\alpha - 6\cos^2\alpha\sin^2\alpha + \sin^4\alpha, \]

\[ \sin(4\alpha) = 4\cos^3\alpha\sin\alpha - 4\cos\alpha\sin^3\alpha. \]

Observe that using the fact that the sum of the squares of sine and cosine is equal to one we get that

\[ \cos(4\alpha) = 8\cos^4\alpha - 8\cos^2\alpha + 1. \]

♣

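Example 7.4. Let us compute the exact value of cos(π/8). We know that cos(π/4) = √2/2 and hence, by the formula for cos(2α) with α = π/8, we have the following computation

\[
\begin{aligned}
\frac{\sqrt{2}}{2} &= 2\cos^2(\pi/8) - 1 \\
\frac{1 + \sqrt{2}/2}{2} &= \cos^2(\pi/8) \\
\frac{2 + \sqrt{2}}{4} &= \cos^2(\pi/8) \\
\pm\frac{\sqrt{2 + \sqrt{2}}}{2} &= \cos(\pi/8).
\end{aligned}
\]

Since the cosine of π/8 is positive we finally obtain that

\[ \cos(\pi/8) = \frac{\sqrt{2 + \sqrt{2}}}{2}. \]

♣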

Example 7.5. Observe that the computation from the previous example can be done inductively. Namely, let us prove the following identity

\[ (7.1)\qquad \cos(\pi/2^n) = \frac{\sqrt{2 + \sqrt{2 + \sqrt{\cdots + \sqrt{2}}}}}{2} \]

for all natural numbers n ≥ 2. The right hand side expression is meant to contain n digits “2” (n − 1 of them under the root signs and one in the denominator). As usual, we use induction. For n = 2 we know that cos(π/4) = √2/2. Suppose the statement is true for n and let’s prove it for n + 1. We have that

\[ \cos(\pi/2^n) = \cos(2\pi/2^{n+1}) = 2\cos^2(\pi/2^{n+1}) - 1 \]

and, since cos(π/2^{n+1}) is positive, it follows that

\[ \cos(\pi/2^{n+1}) = \sqrt{\frac{\cos(\pi/2^n) + 1}{2}}. \]

Substituting the value from (7.1) for cos(π/2^n) we get the statement. Let us list the first several values.

\[ \cos(\pi/4) = \frac{\sqrt{2}}{2} \]

\[ \cos(\pi/8) = \frac{\sqrt{2 + \sqrt{2}}}{2} \]

\[ \cos(\pi/16) = \frac{\sqrt{2 + \sqrt{2 + \sqrt{2}}}}{2} \]

\[ \cos(\pi/32) = \frac{\sqrt{2 + \sqrt{2 + \sqrt{2 + \sqrt{2}}}}}{2} \]

♣
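The nested radicals of Example 7.5 can be compared numerically with the cosine values. The Python sketch below is an illustration only; the name `nested_radical_cos` is made up, and the loop builds the radical from the inside out, with n − 1 digits 2 under the root signs.

```python
import math

def nested_radical_cos(n):
    """Right-hand side of (7.1): n - 1 nested digits 2 under roots, divided by 2."""
    value = 0.0
    for _ in range(n - 1):
        value = math.sqrt(2.0 + value)
    return value / 2.0

for n in range(2, 7):
    print(n, nested_radical_cos(n), math.cos(math.pi / 2 ** n))
```

Each pass through the loop is one step of the induction: it turns the radical for cos(π/2^n) into the radical for cos(π/2^{n+1}).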


Example 7.6. Let us compute the exact value of cos(2π/5). We know that the number w = cos(2π/5) + i sin(2π/5) is a degree five root of unity. According to the observation made in Section 6.7, we know that $w^4 + w^3 + w^2 + w + 1 = 0$, and dividing both sides of this equality by $w^2$ we obtain that

\[ (7.2)\qquad w^2 + w + 1 + w^{-1} + w^{-2} = 0. \]

Observe that $w + w^{-1} = 2\cos(2\pi/5)$ and that $w^2 + w^{-2} = 2\cos(4\pi/5)$. Moreover, we know that

\[ \cos(4\pi/5) = \cos^2(2\pi/5) - \sin^2(2\pi/5) = 2\cos^2(2\pi/5) - 1. \]

Substituting the values of $w + w^{-1}$ and $w^2 + w^{-2}$ into the above identity (7.2) we obtain that $4\cos^2(2\pi/5) + 2\cos(2\pi/5) - 1 = 0$. This means that cos(2π/5) is a solution of the equation

\[ 4x^2 + 2x - 1 = 0. \]

This equation has two solutions

\[ x_\pm = \frac{-1 \pm \sqrt{5}}{4} \]

and since we know that cos(2π/5) is positive we get that

\[ \cos(2\pi/5) = \frac{-1 + \sqrt{5}}{4}. \]

♣

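Example 7.7. Let us now compute the exact value of cos(3π/20). First notice that 3π/20 = 2π/5 − π/4 and hence

\[
\begin{aligned}
\cos(3\pi/20) &= \cos(2\pi/5 + (-\pi/4)) \\
&= \cos(2\pi/5)\cos(-\pi/4) - \sin(2\pi/5)\sin(-\pi/4) \\
&= \cos(2\pi/5)\cos(\pi/4) + \sin(2\pi/5)\sin(\pi/4) \\
&= \frac{(-1 + \sqrt{5})}{4}\cdot\frac{\sqrt{2}}{2} + \frac{\sqrt{10 + 2\sqrt{5}}}{4}\cdot\frac{\sqrt{2}}{2} \\
&= \frac{2\sqrt{5 + \sqrt{5}} + \sqrt{10} - \sqrt{2}}{8} = 0.8910065241883678623597095714136263127705...
\end{aligned}
\]

♣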
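The exact values obtained in Examples 7.6 and 7.7 can be compared with floating point evaluations of the trigonometric functions. The following Python lines are only a numerical sanity check, not a proof; the variable names are arbitrary.

```python
import math

cos_2pi_5 = (-1 + math.sqrt(5)) / 4
sin_2pi_5 = math.sqrt(10 + 2 * math.sqrt(5)) / 4
cos_3pi_20 = (2 * math.sqrt(5 + math.sqrt(5)) + math.sqrt(10) - math.sqrt(2)) / 8

print(math.isclose(cos_2pi_5, math.cos(2 * math.pi / 5)))
print(math.isclose(sin_2pi_5, math.sin(2 * math.pi / 5)))
print(math.isclose(cos_3pi_20, math.cos(3 * math.pi / 20)))
# cos(2*pi/5) is a root of 4x^2 + 2x - 1
print(math.isclose(4 * cos_2pi_5 ** 2 + 2 * cos_2pi_5 - 1, 0.0, abs_tol=1e-12))
```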


8. Complex functions

In this section we investigate the behaviour of some simple functions defined on the complex plane and with complex values. We have already done it in Corollary 6.3 for a very simple function given by the multiplication by a fixed complex number.

Consider now the function $P_n : \mathbb{C} \to \mathbb{C}$ defined by

\[ P_n(z) := z^n, \]

where n ∈ N is a natural number. If n = 0 then we get the constant map sending everything to one. If n = 1 then we get the identity map which we easily understand. Let us try to understand what the quadratic function does to the complex plane. To make it easier let us look at the formula for our function in polar coordinates:

\[ P_2(r(\cos\theta + i\sin\theta)) = r^2(\cos(2\theta) + i\sin(2\theta)). \]

Let us make the following observations on the images of various sets with respect to the quadratic function $P_2$.

(1) The image of a straight line intersecting the real axis at zero at the angle α is the ray based at zero and making the angle 2α with the real axis.

[Figure: the z-plane and the z²-plane.]

In particular, the image of the real axis is the ray consisting of all non-negative real numbers. The image of the imaginary axis is the ray consisting of non-positive real numbers. The image of the line consisting of numbers with equal real and imaginary parts is the ray consisting of pure imaginary numbers with non-negative imaginary part.

(2) The image of the circle of radius r > 0 centered at the origin is the circle of radius r² centered at the origin.

[Figure: the z-plane and the z²-plane.]

(3) The image of a half of the circle of radius r centered at the origin is the whole circle of radius r² centered at the origin.

[Figure: the z-plane and the z²-plane.]

(4) The image of the circular sector

{z ∈ C : |z| ≤ r and α ≤ arg(z) ≤ β}

is equal to the circular sector

{z ∈ C : |z| ≤ r² and 2α ≤ arg(z) ≤ 2β}.

[Figure: the z-plane and the z²-plane.]

(5) The image of the upper half plane {z ∈ C | Im(z) ≥ 0} is the whole plane.
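The observations above all come from the fact that squaring squares the modulus and doubles the argument. The following Python sketch illustrates this and observation (5); the helper names `square` and `upper_half_plane_root` are hypothetical, and the sample points are arbitrary.

```python
import cmath
import math

def square(z):
    """The quadratic function P_2(z) = z^2."""
    return z * z

z = cmath.rect(1.5, 0.4)                       # modulus 1.5, argument 0.4
r, theta = cmath.polar(square(z))
print(math.isclose(r, 1.5 ** 2), math.isclose(theta, 2 * 0.4))

def upper_half_plane_root(w):
    """A square root of w whose argument lies in [0, pi), as in observation (5)."""
    r, theta = cmath.polar(w)
    return cmath.rect(math.sqrt(r), (theta % (2 * math.pi)) / 2)

for w in (1, -1, 1j, -1j, -3 - 4j):
    root = upper_half_plane_root(w)
    print(w, root.imag >= -1e-12, abs(square(root) - w) < 1e-9)
```

Halving an argument taken in [0, 2π) gives an angle in [0, π), so the chosen square root always lies in the closed upper half plane; this is why the image of the upper half plane is the whole plane.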


9. Solving equations II: linear equations

In the previous sections we studied polynomial equations, and we saw how difficult they are to solve in general. In this section we study a different type of equations: simpler equations which combine several unknowns.

Example 9.1. Consider the following system of equations with two unknowns $x_1$ and $x_2$.

\[ \begin{cases} x_1 - x_2 = 1 \\ 2x_1 + 3x_2 = 3 \end{cases} \]

In order to solve it we proceed as follows. First, we multiply the first equation by two:

\[ \begin{cases} 2x_1 - 2x_2 = 2 \\ 2x_1 + 3x_2 = 3 \end{cases} \]

and then we subtract the first equation from the second:

\[ \begin{cases} 2x_1 - 2x_2 = 2 \\ 5x_2 = 1 \end{cases} \]

Dividing the second equation by five we have found one unknown:

\[ \begin{cases} 2x_1 - 2x_2 = 2 \\ x_2 = 1/5 \end{cases} \]

We substitute it into the first equation

\[ \begin{cases} 2x_1 - 2/5 = 2 \\ x_2 = 1/5 \end{cases} \]

rearrange the terms, divide by two, and get

\[ \begin{cases} x_1 = 6/5 \\ x_2 = 1/5 \end{cases} \]

♣

Example 9.2. A similar algorithm works for more unknowns. Consider this system of equations.

\[ \begin{cases} x_1 - x_2 + x_3 = -2 \\ 2x_1 + 3x_2 + x_3 = 7 \\ x_1 - 2x_2 - x_3 = -2 \end{cases} \]

Subtract the first equation from the third:

\[ \begin{cases} x_1 - x_2 + x_3 = -2 \\ 2x_1 + 3x_2 + x_3 = 7 \\ 0 - x_2 - 2x_3 = 0 \end{cases} \]

Now we multiply the first equation by two and subtract the result from the second equation:

\[ \begin{cases} x_1 - x_2 + x_3 = -2 \\ 0 + 5x_2 - x_3 = 11 \\ 0 - x_2 - 2x_3 = 0 \end{cases} \]

Swap the second equation with the third:

\[ \begin{cases} x_1 - x_2 + x_3 = -2 \\ 0 - x_2 - 2x_3 = 0 \\ 0 + 5x_2 - x_3 = 11 \end{cases} \]

Multiply the second equation by five and add the result to the third:

\[ \begin{cases} x_1 - x_2 + x_3 = -2 \\ 0 - x_2 - 2x_3 = 0 \\ 0 + 0 - 11x_3 = 11 \end{cases} \]

The last equation is now solved ($x_3 = -1$) and we substitute the solution into the second and the first equation:

\[ \begin{cases} x_1 - x_2 - 1 = -2 \\ 0 - x_2 + 2 = 0 \\ x_3 = -1 \end{cases} \]

Now the second equation is solved and we substitute the solution into the first equation:

\[ \begin{cases} x_1 - 2 - 1 = -2 \\ x_2 = 2 \\ x_3 = -1 \end{cases} \]

Finally we get the full solution:

\[ \begin{cases} x_1 = 1 \\ x_2 = 2 \\ x_3 = -1 \end{cases} \]

♣

In the above examples we manipulate the system of equations in such a way that the manipulation does not change the solution and that we get zeros in the bottom left corner of the system.
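The procedure used in Examples 9.1 and 9.2 (create zeros below the diagonal, then substitute back from the last equation to the first) can be sketched in Python as follows. This is an illustration and not a general-purpose solver: the name `solve_linear_system` is made up, the system is assumed to be square with exactly one solution, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def solve_linear_system(rows):
    """Solve a square system given as augmented rows [a_1, ..., a_n, b].

    Assumes the system has exactly one solution.
    """
    m = [[Fraction(v) for v in row] for row in rows]
    n = len(m)
    for col in range(n):
        # swap in a row with a nonzero entry in this column
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        # subtract multiples of the pivot row to get zeros below the pivot
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    # back substitution, from the last equation up to the first
    solution = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (m[i][n] - known) / m[i][i]
    return solution

print(solve_linear_system([[1, -1, 1], [2, 3, 3]]))                          # Example 9.1
print(solve_linear_system([[1, -1, 1, -2], [2, 3, 1, 7], [1, -2, -1, -2]]))  # Example 9.2
```

The order of the row operations differs slightly from the one chosen by hand in the examples, but the operations are of the same kind and so the solution is the same.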

9.3. A geometric remark. Let us try to understand what we have done in the last two examples. Let f : R² → R² be a map given by the following formula

\[ f(x_1, x_2) = (x_1 - x_2,\; 2x_1 + 3x_2). \]

In Example 9.1 we just found the argument $(x_1, x_2)$ for which the value of the map f is equal to (1, 3). In other words, we calculated the preimage of the set {(1, 3)} with respect to the map f.

In Example 9.2 we have the following situation. Let f : R³ → R³ be a map defined by

\[ f(x_1, x_2, x_3) = (x_1 - x_2 + x_3,\; 2x_1 + 3x_2 + x_3,\; x_1 - 2x_2 - x_3). \]

By solving the equation we found the argument $(x_1, x_2, x_3)$ for which the value of the map f is equal to (−2, 7, −2). Or we computed the preimage of the set {(−2, 7, −2)} with respect to the map f.

These two examples are very special cases of the following general situation.

Definition 9.4. Let A be either the set of integers, rational numbers, real numbers or complex numbers. A map $f : A^n \to A^m$ is a linear function if, for each j = 1, . . . , m, there exist elements $a_{j1}, \ldots, a_{jn} \in A$ such that the function f is defined by the formula

\[
\begin{aligned}
f(x_1, \ldots, x_n) = (&a_{11}x_1 + a_{12}x_2 + \ldots + a_{1n}x_n, \\
&a_{21}x_1 + a_{22}x_2 + \ldots + a_{2n}x_n, \\
&\ldots \\
&a_{m1}x_1 + a_{m2}x_2 + \ldots + a_{mn}x_n).
\end{aligned}
\]

Solving a system of linear equations is thus equivalent to finding the preimage of an element by a linear function!

9.5. A complex example. Consider the following system of linear equations over the complex numbers.

\[
\begin{aligned}
ix_1 - x_2 + (1 - i)x_3 &= i\\
2x_1 + 3x_2 + ix_3 &= 2\\
x_1 - (2 + i)x_2 - ix_3 &= 1
\end{aligned}
\]

We play the same game as in the previous examples. To start, we multiply the first equation by $i$ and add it to the third equation:

\[
\begin{aligned}
ix_1 - x_2 + (1 - i)x_3 &= i\\
2x_1 + 3x_2 + ix_3 &= 2\\
-(2 + 2i)x_2 + x_3 &= 0
\end{aligned}
\]

Next we multiply the first equation by $2/i = -2i$ and subtract it from the second equation:

\[
\begin{aligned}
ix_1 - x_2 + (1 - i)x_3 &= i\\
(3 - 2i)x_2 + (2 + 3i)x_3 &= 0\\
(-2 - 2i)x_2 + x_3 &= 0
\end{aligned}
\]


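Multiply the second equation by $(2 + 2i)/(3 - 2i)$ and add it to the third equation:

\[
\begin{aligned}
ix_1 - x_2 + (1 - i)x_3 &= i\\
(3 - 2i)x_2 + (2 + 3i)x_3 &= 0\\
\frac{(2 + 2i)(2 + 3i) + (3 - 2i)}{3 - 2i}\,x_3 &= 0
\end{aligned}
\]

The coefficient of $x_3$ in the last equation is not zero (its numerator equals $1 + 8i$), so we obtain $x_3 = 0$ and substitute this into the second and the first equation:

\[
\begin{aligned}
ix_1 - x_2 &= i\\
(3 - 2i)x_2 &= 0\\
x_3 &= 0
\end{aligned}
\]

We get that $x_2 = 0$, substitute it into the first equation, and finally get the full solution:

\[
\begin{aligned}
x_1 &= 1\\
x_2 &= 0\\
x_3 &= 0
\end{aligned}
\]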

This example shows that the same method works for the real numbers and for the complex numbers. In fact, all we need in order to solve such a system of equations is to be able to add, subtract, multiply and divide the coefficients. So, in the words of Remark 5.13, this method works whenever the coefficients are in a field (recall that the rational numbers $\mathbb{Q}$, the real numbers $\mathbb{R}$ and the complex numbers $\mathbb{C}$ are fields, but there are many others).

In the last example, what we calculated is the preimage of the set $\{(i, 2, 1)\} \subset \mathbb{C}^3$ with respect to the map $f : \mathbb{C}^3 \to \mathbb{C}^3$ defined by the formula

\[
f(x_1, x_2, x_3) = (ix_1 - x_2 + (1 - i)x_3,\ 2x_1 + 3x_2 + ix_3,\ x_1 - (2 + i)x_2 - ix_3).
\]

Here $\mathbb{C}^n$ is the $n$-dimensional complex space, that is, the set of all $n$-tuples of complex numbers.
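The solution of the complex system can be checked by substitution. The short Python sketch below is only an illustration (the variable names are chosen here, not taken from the text); it uses Python's built-in complex numbers, where `1j` plays the role of $i$.

```python
# Coefficients and right-hand sides of the system in 9.5 (1j is Python's i).
A = [[1j, -1, 1 - 1j],
     [2, 3, 1j],
     [1, -(2 + 1j), -1j]]
b = [1j, 2, 1]
x = [1, 0, 0]  # the solution found above

# Evaluate the left-hand side of every equation at x.
lhs = [sum(a * xi for a, xi in zip(row, x)) for row in A]
print(lhs == b)  # True

# Numerator of the coefficient of x3 after the last elimination step.
numerator = (2 + 2j) * (2 + 3j) + (3 - 2j)
print(numerator)  # (1+8j)
```

Since the numerator is not zero, the last equation of the reduced system indeed forces $x_3 = 0$.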

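9.6. Matrices. A matrix is a rectangular array of symbols. More precisely, an $(m \times n)$-matrix has the following form

\[
\begin{bmatrix}
a_{11} & a_{12} & \dots & a_{1n}\\
a_{21} & a_{22} & \dots & a_{2n}\\
a_{31} & a_{32} & \dots & a_{3n}\\
\vdots & \vdots & \ddots & \vdots\\
a_{m1} & a_{m2} & \dots & a_{mn}
\end{bmatrix}
\]

It has $m$ rows and $n$ columns. Each $a_{ij}$ is called an entry or a coefficient of the matrix. For example,

\[
\begin{bmatrix}
2 & 3 & 7\\
1 & -1 & 2
\end{bmatrix}
\]

is a $(2 \times 3)$-matrix with integer coefficients. And

\[
\begin{bmatrix}
i & 3\\
1 & -1 + i\\
-2 - 3i & 1 - i
\end{bmatrix}
\]

is a $(3 \times 2)$-matrix with complex entries.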

Matrices are useful in many situations. First, let us apply them to solving linear equations. Let us look again at the system from Example 9.2:

\[
\begin{aligned}
x_1 - x_2 + x_3 &= -2\\
2x_1 + 3x_2 + x_3 &= 7\\
x_1 - 2x_2 - x_3 &= -2
\end{aligned}
\]

All the manipulations we did in order to solve the system were applied to the coefficients. The following $(3 \times 4)$-matrix collects all the coefficients of the system of equations.

\[
\left[\begin{array}{rrr|r}
1 & -1 & 1 & -2\\
2 & 3 & 1 & 7\\
1 & -2 & -1 & -2
\end{array}\right]
\]

Our task is to obtain a matrix with zeros in the bottom left corner and ones on the diagonal by applying the following operations, which do not change the solutions of the system:

• swapping rows;

• multiplying a row by a nonzero number;

• adding a multiple of one row to another.

In this concrete example we proceed as we did in Example 9.2, starting from the matrix

\[
\left[\begin{array}{rrr|r}
1 & -1 & 1 & -2\\
2 & 3 & 1 & 7\\
1 & -2 & -1 & -2
\end{array}\right]
\]

We can now systematically reduce the entries of the matrix of coefficients until we obtain an equivalent system which can be solved easily. The idea is to obtain zeroes in all the positions under the diagonal of the matrix, using only the three rules mentioned above. Let us first get a zero in the position $(2, 1)$ of the matrix. To do so, we subtract two times the first row from the second. This operation is written as $r_2 - 2r_1$:

\[
\left[\begin{array}{rrr|r}
1 & -1 & 1 & -2\\
2 & 3 & 1 & 7\\
1 & -2 & -1 & -2
\end{array}\right]
\overset{r_2 - 2r_1}{\sim}
\left[\begin{array}{rrr|r}
1 & -1 & 1 & -2\\
0 & 5 & -1 & 11\\
1 & -2 & -1 & -2
\end{array}\right]
\]


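Now we want to get a zero in the position $(3, 1)$. To do so, we subtract the first row from the third, which we abbreviate by $r_3 - r_1$:

\[
\left[\begin{array}{rrr|r}
1 & -1 & 1 & -2\\
0 & 5 & -1 & 11\\
1 & -2 & -1 & -2
\end{array}\right]
\overset{r_3 - r_1}{\sim}
\left[\begin{array}{rrr|r}
1 & -1 & 1 & -2\\
0 & 5 & -1 & 11\\
0 & -1 & -2 & 0
\end{array}\right]
\]

Next, we have to get a zero in the position $(3, 2)$. To simplify further calculations, before doing so we can swap the second and the third rows, which we abbreviate by $r_2 \leftrightarrow r_3$:

\[
\left[\begin{array}{rrr|r}
1 & -1 & 1 & -2\\
0 & 5 & -1 & 11\\
0 & -1 & -2 & 0
\end{array}\right]
\overset{r_2 \leftrightarrow r_3}{\sim}
\left[\begin{array}{rrr|r}
1 & -1 & 1 & -2\\
0 & -1 & -2 & 0\\
0 & 5 & -1 & 11
\end{array}\right]
\]

Now we can easily get a zero in the position $(3, 2)$ by adding five times the second row to the third ($r_3 + 5r_2$):

\[
\left[\begin{array}{rrr|r}
1 & -1 & 1 & -2\\
0 & -1 & -2 & 0\\
0 & 5 & -1 & 11
\end{array}\right]
\overset{r_3 + 5r_2}{\sim}
\left[\begin{array}{rrr|r}
1 & -1 & 1 & -2\\
0 & -1 & -2 & 0\\
0 & 0 & -11 & 11
\end{array}\right]
\]

Finally, we can multiply the third row by $-\frac{1}{11}$ to get a 1 in the position $(3, 3)$. We denote this simply by $-\frac{1}{11}r_3$:

\[
\left[\begin{array}{rrr|r}
1 & -1 & 1 & -2\\
0 & -1 & -2 & 0\\
0 & 0 & -11 & 11
\end{array}\right]
\overset{-\frac{1}{11}r_3}{\sim}
\left[\begin{array}{rrr|r}
1 & -1 & 1 & -2\\
0 & -1 & -2 & 0\\
0 & 0 & 1 & -1
\end{array}\right]
\]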

The original system of equations is then equivalent to the system

\[
\begin{aligned}
x_1 - x_2 + x_3 &= -2\\
-x_2 - 2x_3 &= 0\\
x_3 &= -1
\end{aligned}
\]

meaning that both have the same solutions. But the system above is very easy to solve: from the third equation we see that $x_3 = -1$. Substituting this in the second equation, we have $x_2 = 2$, and then substituting $x_2 = 2$ and $x_3 = -1$ in the first equation we deduce $x_1 = 1$.
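The three row operations, and the sequence of steps applied to the matrix of Example 9.2, can be replayed in a few lines of Python. This is a sketch for illustration; the helper names `swap`, `scale` and `add_multiple` are invented here, and rows are numbered from 1 as in the text.

```python
from fractions import Fraction

def swap(m, i, j):
    """r_i <-> r_j."""
    m[i - 1], m[j - 1] = m[j - 1], m[i - 1]

def scale(m, i, c):
    """c * r_i, for a nonzero number c."""
    m[i - 1] = [c * a for a in m[i - 1]]

def add_multiple(m, i, c, j):
    """r_i + c * r_j."""
    m[i - 1] = [a + c * b for a, b in zip(m[i - 1], m[j - 1])]

# Augmented matrix of the system from Example 9.2.
m = [[1, -1, 1, -2],
     [2, 3, 1, 7],
     [1, -2, -1, -2]]

add_multiple(m, 2, -2, 1)        # r2 - 2 r1
add_multiple(m, 3, -1, 1)        # r3 - r1
swap(m, 2, 3)                    # r2 <-> r3
add_multiple(m, 3, 5, 2)         # r3 + 5 r2
scale(m, 3, Fraction(-1, 11))    # -(1/11) r3

print(m == [[1, -1, 1, -2], [0, -1, -2, 0], [0, 0, 1, -1]])  # True
```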

Remark 9.7. The method of solving systems of linear equations presented in the above examples is called Gaussian elimination. It is very efficient and it is used in algebra software.

9.8. Further examples of systems of linear equations. In all the examples above we considered systems of linear equations where the number of equations was equal to the number of unknowns. Here we consider a more general situation.


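Example 9.9. Consider the following system, where we have more equations than unknowns.

\[
\begin{aligned}
x_1 - x_2 &= -2\\
2x_1 + 3x_2 &= 7\\
x_1 - 2x_2 &= -2
\end{aligned}
\]

The matrix of coefficients is

\[
\left[\begin{array}{rr|r}
1 & -1 & -2\\
2 & 3 & 7\\
1 & -2 & -2
\end{array}\right]
\]

Let us apply the Gaussian elimination algorithm. The operation performed at each step is written above the $\sim$ sign. For instance, in the first step we subtract two times the first row from the second, and this corresponds to $r_2 - 2r_1$:

\[
\begin{aligned}
\left[\begin{array}{rr|r}
1 & -1 & -2\\
2 & 3 & 7\\
1 & -2 & -2
\end{array}\right]
&\overset{r_2 - 2r_1}{\sim}
\left[\begin{array}{rr|r}
1 & -1 & -2\\
0 & 5 & 11\\
1 & -2 & -2
\end{array}\right]
\overset{r_3 - r_1}{\sim}
\left[\begin{array}{rr|r}
1 & -1 & -2\\
0 & 5 & 11\\
0 & -1 & 0
\end{array}\right]
\overset{-r_3}{\sim}
\left[\begin{array}{rr|r}
1 & -1 & -2\\
0 & 5 & 11\\
0 & 1 & 0
\end{array}\right]\\
&\overset{r_2 \leftrightarrow r_3}{\sim}
\left[\begin{array}{rr|r}
1 & -1 & -2\\
0 & 1 & 0\\
0 & 5 & 11
\end{array}\right]
\overset{r_3 - 5r_2}{\sim}
\left[\begin{array}{rr|r}
1 & -1 & -2\\
0 & 1 & 0\\
0 & 0 & 11
\end{array}\right]
\end{aligned}
\]

The third row then produces the equation

\[
0x_1 + 0x_2 = 11
\]

which no pair $(x_1, x_2)$ satisfies. This means that the system has no solutions. ♣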
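The criterion used here (a row whose coefficients are all zero but whose right-hand side is not) can be tested automatically. The Python sketch below is an illustration under the assumption that the system is given as an augmented matrix; the names `echelon` and `is_consistent` are invented for this example.

```python
from fractions import Fraction

def echelon(aug):
    """Return a row echelon form of an augmented matrix [A | b]."""
    rows = [[Fraction(x) for x in row] for row in aug]
    n_cols = len(rows[0]) - 1          # number of unknowns
    top = 0                            # index of the next pivot row
    for col in range(n_cols):
        pivot = next((r for r in range(top, len(rows))
                      if rows[r][col] != 0), None)
        if pivot is None:
            continue                   # no pivot in this column
        rows[top], rows[pivot] = rows[pivot], rows[top]
        for r in range(top + 1, len(rows)):
            factor = rows[r][col] / rows[top][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[top])]
        top += 1
    return rows

def is_consistent(aug):
    """False exactly when some reduced row reads 0 = c with c nonzero."""
    return not any(all(a == 0 for a in row[:-1]) and row[-1] != 0
                   for row in echelon(aug))

example_9_9 = [[1, -1, -2], [2, 3, 7], [1, -2, -2]]
example_9_2 = [[1, -1, 1, -2], [2, 3, 1, 7], [1, -2, -1, -2]]
print(is_consistent(example_9_9))  # False
print(is_consistent(example_9_2))  # True
```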

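Example 9.10. Now let us consider a system with two equations and three unknowns, and let us find all its real solutions.

\[
\begin{aligned}
x_1 + x_2 - x_3 &= -2\\
2x_1 + x_2 + 3x_3 &= 7
\end{aligned}
\]

The associated matrix of coefficients is

\[
\left[\begin{array}{rrr|r}
1 & 1 & -1 & -2\\
2 & 1 & 3 & 7
\end{array}\right]
\]

Again, let us apply Gaussian elimination to solve the system.

\[
\left[\begin{array}{rrr|r}
1 & 1 & -1 & -2\\
2 & 1 & 3 & 7
\end{array}\right]
\overset{r_2 - 2r_1}{\sim}
\left[\begin{array}{rrr|r}
1 & 1 & -1 & -2\\
0 & -1 & 5 & 11
\end{array}\right]
\]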


In this case, we cannot reduce the coefficient matrix any further. Notice that we have more variables than equations. We may then consider $x_3$ as a parameter. The second row gives $x_2 = 5x_3 - 11$, and then the first row gives $x_1 = -2 - x_2 + x_3$, so that

\[
x_2 = 5x_3 - 11, \qquad x_1 = -4x_3 + 9.
\]

In other words, the solutions of the system form the (infinite) set

\[
\{(x_1, x_2, x_3) \in \mathbb{R}^3 \mid x_1 = -4t + 9,\ x_2 = 5t - 11,\ x_3 = t \text{ for some } t \in \mathbb{R}\}.
\]

We can also write the solutions of the system as the set

\[
\{(-4t + 9,\ 5t - 11,\ t) \in \mathbb{R}^3 \mid t \in \mathbb{R}\}.
\]

♣
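A one-parameter family of solutions is easy to test by substitution. The following Python sketch is illustrative only (the function names are chosen here): it back-substitutes in the reduced system with $x_3 = t$ and checks both original equations for several values of the parameter.

```python
from fractions import Fraction

def solution(t):
    """Back-substitute in the reduced system, treating x3 = t as a parameter."""
    x3 = Fraction(t)
    x2 = 5 * x3 - 11       # second row:  -x2 + 5 x3 = 11
    x1 = -2 - x2 + x3      # first row:   x1 + x2 - x3 = -2
    return (x1, x2, x3)

def satisfies_system(x1, x2, x3):
    """Check both equations of the original system."""
    return x1 + x2 - x3 == -2 and 2 * x1 + x2 + 3 * x3 == 7

checks = [satisfies_system(*solution(t)) for t in (-3, 0, Fraction(1, 2), 10)]
print(all(checks))  # True
```

A finite number of checks does not prove the general statement, but it catches arithmetic slips in the parametrisation quickly.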

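Example 9.11. Let us see a longer example: consider the system

\[
\begin{aligned}
2x_2 + x_3 + x_4 &= 0\\
x_1 + x_3 + x_5 &= 6\\
x_1 - x_2 - x_4 &= 2\\
2x_1 + x_2 + 2x_3 + x_5 &= 8\\
2x_1 + x_3 - x_4 &= 4
\end{aligned}
\]

The matrix of coefficients is

\[
A =
\left[\begin{array}{rrrrr|r}
0 & 2 & 1 & 1 & 0 & 0\\
1 & 0 & 1 & 0 & 1 & 6\\
1 & -1 & 0 & -1 & 0 & 2\\
2 & 1 & 2 & 0 & 1 & 8\\
2 & 0 & 1 & -1 & 0 & 4
\end{array}\right]
\]

We can now start applying Gaussian elimination:

\[
\begin{aligned}
A &\overset{r_1 \leftrightarrow r_3}{\sim}
\left[\begin{array}{rrrrr|r}
1 & -1 & 0 & -1 & 0 & 2\\
1 & 0 & 1 & 0 & 1 & 6\\
0 & 2 & 1 & 1 & 0 & 0\\
2 & 1 & 2 & 0 & 1 & 8\\
2 & 0 & 1 & -1 & 0 & 4
\end{array}\right]
\overset{r_2 - r_1}{\sim}
\left[\begin{array}{rrrrr|r}
1 & -1 & 0 & -1 & 0 & 2\\
0 & 1 & 1 & 1 & 1 & 4\\
0 & 2 & 1 & 1 & 0 & 0\\
2 & 1 & 2 & 0 & 1 & 8\\
2 & 0 & 1 & -1 & 0 & 4
\end{array}\right]\\
&\overset{r_4 - 2r_1}{\sim}
\left[\begin{array}{rrrrr|r}
1 & -1 & 0 & -1 & 0 & 2\\
0 & 1 & 1 & 1 & 1 & 4\\
0 & 2 & 1 & 1 & 0 & 0\\
0 & 3 & 2 & 2 & 1 & 4\\
2 & 0 & 1 & -1 & 0 & 4
\end{array}\right]
\overset{r_5 - 2r_1}{\sim}
\left[\begin{array}{rrrrr|r}
1 & -1 & 0 & -1 & 0 & 2\\
0 & 1 & 1 & 1 & 1 & 4\\
0 & 2 & 1 & 1 & 0 & 0\\
0 & 3 & 2 & 2 & 1 & 4\\
0 & 2 & 1 & 1 & 0 & 0
\end{array}\right]\\
&\overset{r_3 - 2r_2}{\sim}
\left[\begin{array}{rrrrr|r}
1 & -1 & 0 & -1 & 0 & 2\\
0 & 1 & 1 & 1 & 1 & 4\\
0 & 0 & -1 & -1 & -2 & -8\\
0 & 3 & 2 & 2 & 1 & 4\\
0 & 2 & 1 & 1 & 0 & 0
\end{array}\right]
\overset{r_4 - 3r_2}{\sim}
\left[\begin{array}{rrrrr|r}
1 & -1 & 0 & -1 & 0 & 2\\
0 & 1 & 1 & 1 & 1 & 4\\
0 & 0 & -1 & -1 & -2 & -8\\
0 & 0 & -1 & -1 & -2 & -8\\
0 & 2 & 1 & 1 & 0 & 0
\end{array}\right]\\
&\overset{r_5 - 2r_2}{\sim}
\left[\begin{array}{rrrrr|r}
1 & -1 & 0 & -1 & 0 & 2\\
0 & 1 & 1 & 1 & 1 & 4\\
0 & 0 & -1 & -1 & -2 & -8\\
0 & 0 & -1 & -1 & -2 & -8\\
0 & 0 & -1 & -1 & -2 & -8
\end{array}\right]
\overset{r_4 - r_3}{\sim}
\left[\begin{array}{rrrrr|r}
1 & -1 & 0 & -1 & 0 & 2\\
0 & 1 & 1 & 1 & 1 & 4\\
0 & 0 & -1 & -1 & -2 & -8\\
0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & -1 & -1 & -2 & -8
\end{array}\right]\\
&\overset{r_5 - r_3}{\sim}
\left[\begin{array}{rrrrr|r}
1 & -1 & 0 & -1 & 0 & 2\\
0 & 1 & 1 & 1 & 1 & 4\\
0 & 0 & -1 & -1 & -2 & -8\\
0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0
\end{array}\right]
\end{aligned}
\]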

Let us analyze the last matrix. We can ignore the last two rows, since they do not give any information about the system. Also, there is no row of the form

\[
\left[\begin{array}{rrrrr|r}
0 & 0 & 0 & 0 & 0 & *
\end{array}\right]
\]

with $* \neq 0$, and this means that the system has solutions.

However, once we discard the last two rows, the resulting system has five variables and three equations. This means that the system has infinitely many solutions, which depend on two parameters: two is the difference between the number of variables and the number of equations.

Let $z = x_4$ and $t = x_5$. Then, from the third row we see that

\[
x_3 = 8 - z - 2t.
\]

Substituting this into the second row we get

\[
x_2 = t - 4.
\]

Finally, substituting this into the first row we get

\[
x_1 = z + t - 2.
\]

In other words, the solutions of the system form the set

\[
\{(z + t - 2,\ t - 4,\ 8 - z - 2t,\ z,\ t) \in \mathbb{R}^5 \mid z, t \in \mathbb{R}\}.
\]

♣
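The two-parameter family can be checked against all five original equations. The Python sketch below is an illustration (names such as `point` and `is_solution` are chosen here); it substitutes the parametrised solution for a grid of integer values of $z$ and $t$.

```python
# Each equation of Example 9.11 as (coefficients of x1..x5, right-hand side).
system = [
    ([0, 2, 1, 1, 0], 0),
    ([1, 0, 1, 0, 1], 6),
    ([1, -1, 0, -1, 0], 2),
    ([2, 1, 2, 0, 1], 8),
    ([2, 0, 1, -1, 0], 4),
]

def point(z, t):
    """The solution corresponding to the parameters z = x4 and t = x5."""
    return (z + t - 2, t - 4, 8 - z - 2 * t, z, t)

def is_solution(x):
    """True when x satisfies every equation of the system."""
    return all(sum(c * xi for c, xi in zip(coeffs, x)) == rhs
               for coeffs, rhs in system)

grid_ok = all(is_solution(point(z, t))
              for z in range(-3, 4) for t in range(-3, 4))
print(grid_ok)      # True
print(point(0, 0))  # (-2, -4, 8, 0, 0)
```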


10. Algebra of matrices

The set of matrices with $m$ rows and $n$ columns is denoted by $M_{m,n}(C)$, where $C$ is the set of coefficients of the matrix. For instance, consider the matrices

\[
A =
\begin{bmatrix}
4 & 2 - i & 5\\
3 + 2i & i & 11 - 3i
\end{bmatrix}
\qquad
B =
\begin{bmatrix}
4 & 2 & 5\\
3 & 0 & 11
\end{bmatrix}.
\]

The matrix $A$ is a matrix in $M_{2,3}(\mathbb{C})$, since some of the coefficients are complex, while the matrix $B$ can be considered as a matrix in $M_{2,3}(\mathbb{Z})$, $M_{2,3}(\mathbb{Q})$, $M_{2,3}(\mathbb{R})$ or $M_{2,3}(\mathbb{C})$, since its coefficients can be considered integer, rational, real or even complex.

For simplicity, and since $\mathbb{Z} \subseteq \mathbb{Q} \subseteq \mathbb{R} \subseteq \mathbb{C}$, we will only consider matrices with complex coefficients. The goal of this chapter is to learn how to multiply matrices and to relate the multiplication of matrices to linear functions.

10.1. Multiplication of matrices. Let $A = [a_{ij}] \in M_{k,m}(\mathbb{C})$ and let $B = [b_{ij}] \in M_{m,n}(\mathbb{C})$. We define the product $A \cdot B$ to be the $(k \times n)$-matrix with $ij$-entry given by

\[
a_{i1}b_{1j} + a_{i2}b_{2j} + \dots + a_{im}b_{mj}.
\]

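Example 10.2. In this example we calculate the product of a $(2 \times 3)$-matrix with a $(3 \times 2)$-matrix. The result is a $(2 \times 2)$-matrix.

\[
\begin{bmatrix}
1 & 2 & 3\\
2 & 3 & 4
\end{bmatrix}
\cdot
\begin{bmatrix}
1 & 2\\
2 & 3\\
2 & 2
\end{bmatrix}
=
\begin{bmatrix}
1 + 4 + 6 & 2 + 6 + 6\\
2 + 6 + 8 & 4 + 9 + 8
\end{bmatrix}
=
\begin{bmatrix}
11 & 14\\
16 & 21
\end{bmatrix}
\]

In this case we can also multiply in the other order:

\[
\begin{bmatrix}
1 & 2\\
2 & 3\\
2 & 2
\end{bmatrix}
\cdot
\begin{bmatrix}
1 & 2 & 3\\
2 & 3 & 4
\end{bmatrix}
=
\begin{bmatrix}
1 + 4 & 2 + 6 & 3 + 8\\
2 + 6 & 4 + 9 & 6 + 12\\
2 + 4 & 4 + 6 & 6 + 8
\end{bmatrix}
=
\begin{bmatrix}
5 & 8 & 11\\
8 & 13 & 18\\
6 & 10 & 14
\end{bmatrix}
\]

Thus, depending on the order of the factors, the product of two matrices can give quite different results! ♣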
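The definition of the product translates directly into code. The following Python sketch is an illustration, with matrices stored as lists of rows and a function name, `matmul`, chosen here; it reproduces both products of Example 10.2.

```python
def matmul(a, b):
    """Product of a (k x m)-matrix a and an (m x n)-matrix b (lists of rows)."""
    if len(a[0]) != len(b):
        raise ValueError("columns of a must match rows of b")
    # The ij-entry is a_i1 * b_1j + a_i2 * b_2j + ... + a_im * b_mj.
    return [[sum(a[i][s] * b[s][j] for s in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

A = [[1, 2, 3],
     [2, 3, 4]]
B = [[1, 2],
     [2, 3],
     [2, 2]]

print(matmul(A, B))  # [[11, 14], [16, 21]]
print(matmul(B, A))  # [[5, 8, 11], [8, 13, 18], [6, 10, 14]]
```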

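Example 10.3. A matrix which has only one row or only one column is called a vector. For instance,

\[
v =
\begin{bmatrix}
1\\
2\\
2
\end{bmatrix}
\quad\text{or}\quad
w =
\begin{bmatrix}
2 & 3 & 4
\end{bmatrix}
\]

The product $vw$ is a $(3 \times 3)$-matrix, while

\[
wv =
\begin{bmatrix}
2 & 3 & 4
\end{bmatrix}
\cdot
\begin{bmatrix}
1\\
2\\
2
\end{bmatrix}
=
\begin{bmatrix}
16
\end{bmatrix}
\]

This means that a single number can also be considered as a matrix of dimension $(1 \times 1)$! ♣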


10.4. Matrices and maps. In the previous chapter, matrices were used to solve systems of equations. We also saw that solving a system of linear equations is the same as computing the preimage of an element under a linear function. For instance, in Example 9.2 we calculated the preimage of $\{(-2, 7, -2)\}$ under the function

\[
f(x_1, x_2, x_3) = (x_1 - x_2 + x_3,\ 2x_1 + 3x_2 + x_3,\ x_1 - 2x_2 - x_3).
\]

The matrix associated to the system of equations is then

\[
\left[\begin{array}{rrr|r}
1 & -1 & 1 & -2\\
2 & 3 & 1 & 7\\
1 & -2 & -1 & -2
\end{array}\right]
\]

Notice that this is not a random matrix: the entries are obviously related to the function and the element $(-2, 7, -2)$! It is clear then that there is some relationship between matrices and linear functions.

If we consider the variables $x_1, x_2, x_3$ as a generic vector in $\mathbb{R}^3$, then

\[
\begin{bmatrix}
1 & -1 & 1\\
2 & 3 & 1\\
1 & -2 & -1
\end{bmatrix}
\cdot
\begin{bmatrix}
x_1\\
x_2\\
x_3
\end{bmatrix}
=
\begin{bmatrix}
x_1 - x_2 + x_3\\
2x_1 + 3x_2 + x_3\\
x_1 - 2x_2 - x_3
\end{bmatrix}
\]

This is exactly the formula for the function $f$!

Let’s see what happens in general. Recall that a linear function is a map $f : \mathbb{C}^n \to \mathbb{C}^m$ defined by a formula of the form

\[
\begin{aligned}
f(x_1, \dots, x_n) = (&a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n,\\
&a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n,\\
&\dots,\\
&a_{m1}x_1 + a_{m2}x_2 + \dots + a_{mn}x_n)
\end{aligned}
\]

where the coefficients $a_{11}, \dots, a_{1n}, a_{21}, \dots, a_{2n}, \dots, a_{m1}, \dots, a_{mn} \in \mathbb{C}$ are fixed (they do not depend on the inputs $(x_1, \dots, x_n)$). The associated matrix of coefficients is

\[
A =
\begin{bmatrix}
a_{11} & a_{12} & \dots & a_{1n}\\
a_{21} & a_{22} & \dots & a_{2n}\\
\vdots & \vdots & \ddots & \vdots\\
a_{m1} & a_{m2} & \dots & a_{mn}
\end{bmatrix}
\]

The above says that it is enough to give the matrix of coefficients of $f$ in order to describe the function $f$. Indeed, let

\[
v =
\begin{bmatrix}
x_1\\
\vdots\\
x_n
\end{bmatrix}
\in \mathbb{C}^n
\]


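be a generic vector in $\mathbb{C}^n$. Then, we recover the formula for the function $f$ by multiplying $A \cdot v$:

\[
A \cdot v =
\begin{bmatrix}
a_{11} & a_{12} & \dots & a_{1n}\\
a_{21} & a_{22} & \dots & a_{2n}\\
\vdots & \vdots & \ddots & \vdots\\
a_{m1} & a_{m2} & \dots & a_{mn}
\end{bmatrix}
\cdot
\begin{bmatrix}
x_1\\
x_2\\
\vdots\\
x_n
\end{bmatrix}
=
\begin{bmatrix}
a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n\\
a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n\\
\vdots\\
a_{m1}x_1 + a_{m2}x_2 + \dots + a_{mn}x_n
\end{bmatrix}
\]

Clearly, the converse works as well: given a matrix $A$ as above, the product of $A$ with a generic vector in $\mathbb{C}^n$ defines a linear function:

\[
\begin{bmatrix}
a_{11} & a_{12} & \dots & a_{1n}\\
a_{21} & a_{22} & \dots & a_{2n}\\
\vdots & \vdots & \ddots & \vdots\\
a_{m1} & a_{m2} & \dots & a_{mn}
\end{bmatrix}
\cdot
\begin{bmatrix}
x_1\\
x_2\\
\vdots\\
x_n
\end{bmatrix}
=
\begin{bmatrix}
a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n\\
a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n\\
\vdots\\
a_{m1}x_1 + a_{m2}x_2 + \dots + a_{mn}x_n
\end{bmatrix}
\]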

Let’s see an example.

Example 10.5. The matrix

\[
A =
\begin{bmatrix}
1 & 1\\
0 & 3\\
-1 & 4
\end{bmatrix}
\]

defines the map $f : \mathbb{R}^2 \to \mathbb{R}^3$ such that

\[
f(x_1, x_2) = (x_1 + x_2,\ 3x_2,\ -x_1 + 4x_2).
\]

♣

Thus, giving the formula for the function $f$ and giving its associated matrix of coefficients are equivalent. From now on, we will only use the associated matrix of coefficients whenever we are working with a linear function. This is not only a matter of saving time: working with matrices instead of working with the formula of $f$ has some great advantages.

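Example 10.6. Let $f_A : \mathbb{R}^3 \to \mathbb{R}^2$ and $f_B : \mathbb{R}^2 \to \mathbb{R}^3$ be the maps defined by the matrices

\[
A =
\begin{bmatrix}
1 & 2 & 3\\
2 & 3 & 4
\end{bmatrix}
\quad\text{and}\quad
B =
\begin{bmatrix}
1 & 2\\
2 & 3\\
2 & 2
\end{bmatrix}
\]

respectively. That is,

\[
\begin{aligned}
f_A(x_1, x_2, x_3) &= (x_1 + 2x_2 + 3x_3,\ 2x_1 + 3x_2 + 4x_3)\\
f_B(x_1, x_2) &= (x_1 + 2x_2,\ 2x_1 + 3x_2,\ 2x_1 + 2x_2).
\end{aligned}
\]

Let’s compute the formula of the composition $f_A \circ f_B : \mathbb{R}^2 \to \mathbb{R}^2$:

\[
\begin{aligned}
(f_A \circ f_B)(x_1, x_2) &= f_A(x_1 + 2x_2,\ 2x_1 + 3x_2,\ 2x_1 + 2x_2)\\
&= ((x_1 + 2x_2) + 2(2x_1 + 3x_2) + 3(2x_1 + 2x_2),\\
&\qquad 2(x_1 + 2x_2) + 3(2x_1 + 3x_2) + 4(2x_1 + 2x_2))\\
&= (11x_1 + 14x_2,\ 16x_1 + 21x_2).
\end{aligned}
\]

We see that the composition is defined by the product $AB$ (calculated in Example 10.2). In other words,

\[
f_A \circ f_B = f_{AB}.
\]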

♣

The above observation is true in general. That is, the composition of linear maps corresponds to the multiplication of matrices. Let us summarize this as a theorem (which is straightforward to prove).

Theorem 10.7. Let $A \in M_{k,m}(\mathbb{C})$ and let $B \in M_{m,n}(\mathbb{C})$ be matrices and consider the corresponding functions, $f_A : \mathbb{C}^m \to \mathbb{C}^k$ and $f_B : \mathbb{C}^n \to \mathbb{C}^m$. Then, the composition function $f_A \circ f_B$ is determined by the matrix $A \cdot B$:

\[
f_A \circ f_B = f_{A \cdot B}.
\]
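The statement of the theorem can be tried out numerically on the matrices of Example 10.6. The Python sketch below is illustrative only: `matmul` and `apply` are names chosen here, and checking one input vector is of course a sanity check, not a proof.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][s] * b[s][j] for s in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def apply(a, x):
    """Value of the linear function f_a at the tuple x."""
    return tuple(sum(c * xi for c, xi in zip(row, x)) for row in a)

A = [[1, 2, 3],
     [2, 3, 4]]
B = [[1, 2],
     [2, 3],
     [2, 2]]
x = (3, -5)

composed = apply(A, apply(B, x))    # (f_A o f_B)(x)
product = apply(matmul(A, B), x)    # f_{AB}(x)
print(composed, product)  # (-37, -57) (-37, -57)
```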

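Recall the definition of a linear function, Definition 9.4. The following proposition provides an alternative definition of linear functions: a map $f$ is a linear function if and only if it satisfies the two properties in the following proposition.

Proposition 10.8. Let $A \in M_{m,n}(\mathbb{C})$ be a matrix and let $f_A : \mathbb{C}^n \to \mathbb{C}^m$ be the corresponding map. Then

\[
f_A(\mathbf{x} + \mathbf{y}) = f_A(\mathbf{x}) + f_A(\mathbf{y}) \quad\text{and}\quad f_A(r\mathbf{x}) = r f_A(\mathbf{x})
\]

for every $\mathbf{x} = (x_1, x_2, \dots, x_n)$, $\mathbf{y} = (y_1, y_2, \dots, y_n) \in \mathbb{C}^n$ and $r \in \mathbb{C}$.

Proof. Write the elements $\mathbf{x}$ and $\mathbf{y}$ as vectors in $\mathbb{C}^n$:

\[
v =
\begin{bmatrix}
x_1\\
x_2\\
\vdots\\
x_n
\end{bmatrix}
\qquad
w =
\begin{bmatrix}
y_1\\
y_2\\
\vdots\\
y_n
\end{bmatrix}
\]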


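Then, we have

\[
\begin{aligned}
A \cdot v + A \cdot w
&=
\begin{bmatrix}
a_{11} & a_{12} & \dots & a_{1n}\\
a_{21} & a_{22} & \dots & a_{2n}\\
\vdots & \vdots & \ddots & \vdots\\
a_{m1} & a_{m2} & \dots & a_{mn}
\end{bmatrix}
\begin{bmatrix}
x_1\\
x_2\\
\vdots\\
x_n
\end{bmatrix}
+
\begin{bmatrix}
a_{11} & a_{12} & \dots & a_{1n}\\
a_{21} & a_{22} & \dots & a_{2n}\\
\vdots & \vdots & \ddots & \vdots\\
a_{m1} & a_{m2} & \dots & a_{mn}
\end{bmatrix}
\begin{bmatrix}
y_1\\
y_2\\
\vdots\\
y_n
\end{bmatrix}\\
&=
\begin{bmatrix}
a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n\\
a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n\\
\vdots\\
a_{m1}x_1 + a_{m2}x_2 + \dots + a_{mn}x_n
\end{bmatrix}
+
\begin{bmatrix}
a_{11}y_1 + a_{12}y_2 + \dots + a_{1n}y_n\\
a_{21}y_1 + a_{22}y_2 + \dots + a_{2n}y_n\\
\vdots\\
a_{m1}y_1 + a_{m2}y_2 + \dots + a_{mn}y_n
\end{bmatrix}\\
&=
\begin{bmatrix}
a_{11}(x_1 + y_1) + a_{12}(x_2 + y_2) + \dots + a_{1n}(x_n + y_n)\\
a_{21}(x_1 + y_1) + a_{22}(x_2 + y_2) + \dots + a_{2n}(x_n + y_n)\\
\vdots\\
a_{m1}(x_1 + y_1) + a_{m2}(x_2 + y_2) + \dots + a_{mn}(x_n + y_n)
\end{bmatrix}\\
&=
\begin{bmatrix}
a_{11} & a_{12} & \dots & a_{1n}\\
a_{21} & a_{22} & \dots & a_{2n}\\
\vdots & \vdots & \ddots & \vdots\\
a_{m1} & a_{m2} & \dots & a_{mn}
\end{bmatrix}
\cdot
\begin{bmatrix}
x_1 + y_1\\
x_2 + y_2\\
\vdots\\
x_n + y_n
\end{bmatrix}
= A \cdot (v + w)
\end{aligned}
\]

The second property is proven similarly. □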


11. Square matrices

In this section we work only with square matrices: matrices whose number of rows equals their number of columns. More specifically, in this section we will learn how to compute the determinant of a square matrix. When the determinant is different from zero we will also learn how to find the inverse of a square matrix.

These concepts have great importance in different areas of mathematics. For instance, a matrix $A \in M_{2,2}(\mathbb{C})$ determines a linear function $f : \mathbb{C}^2 \to \mathbb{C}^2$. In this case, the determinant of $A$ is different from zero if and only if, for every element $(b_1, b_2) \in \mathbb{C}^2$, the system

\[
f(x_1, x_2) = (b_1, b_2)
\]

has a unique solution. In other words, the determinant is different from zero if and only if the function $f$ is bijective!

We can also interpret the matrix $A$ above as a geometric operation (a rotation, for instance). From this point of view, the determinant of $A$ is different from zero if and only if we can “undo” such an operation. When this is the case, the inverse matrix, $A^{-1}$, determines the inverse operation.

As you see, computing determinants can save a lot of work, even when it comes to solving systems of equations. Let’s see how to compute determinants.

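Definition 11.1. Let $A \in M_{n,n}(\mathbb{C})$ be an $(n \times n)$-matrix. The matrix $A_{ij}$ is defined to be the $((n - 1) \times (n - 1))$-matrix which arises from $A$ by crossing out the $i$-th row and the $j$-th column. It is called the $(ij)$-th minor matrix of $A$.

Example 11.2. Let $A \in M_{3,3}(\mathbb{C})$ be the matrix

\[
A =
\begin{bmatrix}
1 & 1 & i\\
2 & i & 2i\\
-2 & -1 & 1
\end{bmatrix}
\]

The $(13)$-th minor matrix of $A$ is

\[
A_{13} =
\begin{bmatrix}
2 & i\\
-2 & -1
\end{bmatrix}
\]

while the $(21)$-th minor matrix of $A$ is

\[
A_{21} =
\begin{bmatrix}
1 & i\\
-1 & 1
\end{bmatrix}
\]

♣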

The formula for the determinant of a matrix $A \in M_{n,n}(\mathbb{C})$ depends on the size $(n \times n)$ of the matrix:

(1) If $n = 1$ then $\det[a] = a$;


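(2) If $n = 2$ then

\[
\det
\begin{bmatrix}
a & b\\
c & d
\end{bmatrix}
= ad - bc;
\]

(3) If $n \geq 3$,

\[
\det
\begin{bmatrix}
a_{11} & a_{12} & \dots & a_{1n}\\
a_{21} & a_{22} & \dots & a_{2n}\\
\vdots & \vdots & \ddots & \vdots\\
a_{n1} & a_{n2} & \dots & a_{nn}
\end{bmatrix}
= \sum_{k=1}^{n} (-1)^{1+k} a_{1k} \det A_{1k}.
\]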

Let’s see some examples.

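Example 11.3. Let’s compute the determinant of the matrix $A$ in Example 11.2:

\[
A =
\begin{bmatrix}
1 & 1 & i\\
2 & i & 2i\\
-2 & -1 & 1
\end{bmatrix}
\]

The formula for the determinant is then

\[
\begin{aligned}
\det(A) &= \sum_{k=1}^{3} (-1)^{k+1} a_{1k} \det(A_{1k})\\
&= (-1)^2 \cdot 1 \cdot \det(A_{11}) + (-1)^3 \cdot 1 \cdot \det(A_{12}) + (-1)^4 \cdot i \cdot \det(A_{13})\\
&= 1 \cdot \det
\begin{bmatrix}
i & 2i\\
-1 & 1
\end{bmatrix}
- 1 \cdot \det
\begin{bmatrix}
2 & 2i\\
-2 & 1
\end{bmatrix}
+ i \cdot \det
\begin{bmatrix}
2 & i\\
-2 & -1
\end{bmatrix}
\end{aligned}
\]

Let us calculate each of the determinants above separately. Since the matrices now have dimension 2 we have

\[
\begin{aligned}
\det(A_{11}) &= \det
\begin{bmatrix}
i & 2i\\
-1 & 1
\end{bmatrix}
= (i)(1) - (2i)(-1) = 3i;\\
\det(A_{12}) &= \det
\begin{bmatrix}
2 & 2i\\
-2 & 1
\end{bmatrix}
= (2)(1) - (2i)(-2) = 2 + 4i;\\
\det(A_{13}) &= \det
\begin{bmatrix}
2 & i\\
-2 & -1
\end{bmatrix}
= (2)(-1) - (i)(-2) = -2 + 2i.
\end{aligned}
\]

Thus, going back to the determinant of $A$, we have

\[
\begin{aligned}
\det(A) &= 1 \cdot \det(A_{11}) - 1 \cdot \det(A_{12}) + i \cdot \det(A_{13})\\
&= 1(3i) - 1(2 + 4i) + i(-2 + 2i)\\
&= 3i - 2 - 4i - 2i - 2\\
&= -4 - 3i.
\end{aligned}
\]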

♣
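The recursive formula for the determinant can be written down almost verbatim in Python. The sketch below is an illustration only: the function names `minor` and `det` are chosen here, matrices are lists of rows, rows and columns are counted from 1 as in the text, and `1j` stands for $i$. The expansion along the first row is simple but slow for large matrices.

```python
def minor(a, i, j):
    """The minor matrix A_ij: delete row i and column j (counted from 1)."""
    return [row[:j - 1] + row[j:]
            for k, row in enumerate(a, start=1) if k != i]

def det(a):
    """Determinant by expansion along the first row."""
    n = len(a)
    if n == 1:
        return a[0][0]
    if n == 2:
        return a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return sum((-1) ** (1 + k) * a[0][k - 1] * det(minor(a, 1, k))
               for k in range(1, n + 1))

# The matrix of Example 11.2.
A = [[1, 1, 1j],
     [2, 1j, 2j],
     [-2, -1, 1]]

print(minor(A, 1, 3))  # [[2, 1j], [-2, -1]]
print(det(A))          # (-4-3j)
```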


For each dimension $n$, we can consider the determinant as a function

\[
\det : M_{n,n}(\mathbb{C}) \longrightarrow \mathbb{C}.
\]

The following proposition lists some of the properties of this function. Actually, using the formula for the determinant you can already prove property (2) below.

Proposition 11.4. For each $n$, the determinant function $\det$ satisfies

(1) $\det(A \cdot B) = \det(A) \cdot \det(B)$, for all $A, B \in M_{n,n}(\mathbb{C})$;

(2) $\det(r \cdot A) = r^n \cdot \det(A)$, for all $r \in \mathbb{C}$ and all $A \in M_{n,n}(\mathbb{C})$.
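For $n = 2$, property (2) can be checked directly from the formula for the determinant. The computation below is a worked instance for a generic $(2 \times 2)$-matrix, assuming (as is standard, although not spelled out above) that $r \cdot A$ denotes the matrix obtained by multiplying every entry of $A$ by $r$:

```latex
\[
\det\left( r \cdot
\begin{bmatrix}
a & b\\
c & d
\end{bmatrix}
\right)
= \det
\begin{bmatrix}
ra & rb\\
rc & rd
\end{bmatrix}
= (ra)(rd) - (rb)(rc)
= r^2 (ad - bc)
= r^2 \cdot \det
\begin{bmatrix}
a & b\\
c & d
\end{bmatrix}
\]
```

The exponent 2 appears because each of the two rows contributes one factor $r$; the same idea suggests the exponent $n$ in the general case.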

The identity matrix of dimension $n$, denoted by $I_n \in M_{n,n}(\mathbb{C})$, is the square matrix with 1 in all the positions of the diagonal, and zeroes in all the other positions:

\[
I_n =
\begin{bmatrix}
1 & 0 & 0 & \dots & 0\\
0 & 1 & 0 & \dots & 0\\
0 & 0 & 1 & \dots & 0\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
0 & 0 & 0 & \dots & 1
\end{bmatrix}
\]

This matrix has several special features, and we will have time to see only a few of them. For instance, the linear function associated to $I_n$ is the identity function:

\[
\mathrm{Id}(x_1, \dots, x_n) = (x_1, \dots, x_n).
\]

Clearly, the identity function is bijective for all $n$. Let’s see some other properties of the identity matrix.

Lemma 11.5. For all $n$,

(1) for any $A \in M_{m,n}(\mathbb{C})$, $A \cdot I_n = A$;

(2) for any $B \in M_{n,k}(\mathbb{C})$, $I_n \cdot B = B$; and

(3) $\det(I_n) = 1$.

Definition 11.6. Let $A \in M_{n,n}(\mathbb{C})$. An inverse of $A$ is a matrix $B \in M_{n,n}(\mathbb{C})$ such that

\[
A \cdot B = I_n = B \cdot A.
\]

A note of warning: not every matrix $A \in M_{n,n}(\mathbb{C})$ has an inverse! When the matrix $A$ has an inverse, we say that $A$ is invertible. In this case, the inverse matrix is denoted by $A^{-1}$.

Proposition 11.7. A matrix $A \in M_{n,n}(\mathbb{C})$ has an inverse if and only if $\det(A) \neq 0$.

Corollary 11.8. If $A \in M_{n,n}(\mathbb{C})$ is invertible, then

\[
\det(A^{-1}) = \det(A)^{-1}.
\]


Proof. If $A$ is invertible, then $A \cdot A^{-1} = I_n$. Applying the function $\det$ to this equality, we have

\[
\det(A) \cdot \det(A^{-1}) = \det(A \cdot A^{-1}) = \det(I_n) = 1,
\]

which implies that $\det(A^{-1}) = 1/\det(A) = \det(A)^{-1}$. □

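Example 11.9. Let $A$ be the following matrix

\[
A =
\begin{bmatrix}
1 & 2\\
2 & 4
\end{bmatrix}
\]

Then, $\det(A) = 1 \cdot 4 - 2 \cdot 2 = 0$, and thus we know that $A$ is not invertible. In general (try to prove it!), let $A$ be a matrix with the following property: one of its rows is a multiple of another row. Then $\det(A) = 0$ (you can replace “row” by “column” and this is still true). ♣

Example 11.10. Let now $A$ be

\[
A =
\begin{bmatrix}
1 & 2\\
2 & 3
\end{bmatrix}
\]

Then, $\det(A) = 1 \cdot 3 - 2 \cdot 2 = -1$, and we know that $A$ is invertible. Let’s find the inverse of $A$. Since the dimension of $A$ is small enough, we can do as follows. Write $A^{-1}$ as

\[
A^{-1} =
\begin{bmatrix}
a & b\\
c & d
\end{bmatrix}
\]

Then, we have

\[
\begin{bmatrix}
1 & 0\\
0 & 1
\end{bmatrix}
=
\begin{bmatrix}
1 & 2\\
2 & 3
\end{bmatrix}
\cdot
\begin{bmatrix}
a & b\\
c & d
\end{bmatrix}
=
\begin{bmatrix}
a + 2c & b + 2d\\
2a + 3c & 2b + 3d
\end{bmatrix}
\]

We then have to solve the system

\[
\begin{aligned}
a + 2c &= 1\\
2a + 3c &= 0\\
b + 2d &= 0\\
2b + 3d &= 1
\end{aligned}
\]

Solving the system, we obtain $a = -3$, $b = 2$, $c = 2$ and $d = -1$, and

\[
A^{-1} =
\begin{bmatrix}
-3 & 2\\
2 & -1
\end{bmatrix}
\]

♣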

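The method that we used to find the inverse in the previous example is not very efficient: if $A \in M_{n,n}(\mathbb{C})$, then we have to solve a system with $n^2$ equations! A more efficient method is the following. Starting from the matrix $A \in M_{n,n}(\mathbb{C})$, form a new matrix $B \in M_{n,2n}(\mathbb{C})$ by gluing the identity matrix to $A$ on the right:

\[
B =
\left[\begin{array}{cccc|cccc}
a_{11} & a_{12} & \dots & a_{1n} & 1 & 0 & \dots & 0\\
a_{21} & a_{22} & \dots & a_{2n} & 0 & 1 & \dots & 0\\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots\\
a_{n1} & a_{n2} & \dots & a_{nn} & 0 & 0 & \dots & 1
\end{array}\right]
\]

The goal is to end up with a matrix of the form

\[
C =
\left[\begin{array}{cccc|cccc}
1 & 0 & \dots & 0 & c_{11} & c_{12} & \dots & c_{1n}\\
0 & 1 & \dots & 0 & c_{21} & c_{22} & \dots & c_{2n}\\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots\\
0 & 0 & \dots & 1 & c_{n1} & c_{n2} & \dots & c_{nn}
\end{array}\right]
\]

and the rules are that we can only apply row operations:

• swapping rows;

• multiplying a row by a nonzero number; and

• adding a multiple of one row to another.

The inverse of $A$ is then the matrix

\[
A^{-1} =
\begin{bmatrix}
c_{11} & c_{12} & \dots & c_{1n}\\
c_{21} & c_{22} & \dots & c_{2n}\\
\vdots & \vdots & \ddots & \vdots\\
c_{n1} & c_{n2} & \dots & c_{nn}
\end{bmatrix}
\]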

Example 11.11. Let’s try this new method on a matrix of dimension 3. Let $A$ be the matrix from Example 11.2:

\[
A =
\begin{bmatrix}
1 & 1 & i\\
2 & i & 2i\\
-2 & -1 & 1
\end{bmatrix}
\]

We have computed its determinant in Example 11.3: $\det(A) = -4 - 3i$, and thus we know that $A$ is invertible.

Start by constructing the matrix $B$:

\[
B =
\left[\begin{array}{ccc|ccc}
1 & 1 & i & 1 & 0 & 0\\
2 & i & 2i & 0 & 1 & 0\\
-2 & -1 & 1 & 0 & 0 & 1
\end{array}\right]
\]


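And now we operate by rows to construct the identity matrix on the left:

\[
\begin{aligned}
\left[\begin{array}{ccc|ccc}
1 & 1 & i & 1 & 0 & 0\\
2 & i & 2i & 0 & 1 & 0\\
-2 & -1 & 1 & 0 & 0 & 1
\end{array}\right]
&\overset{r_2 - 2r_1}{\sim}
\left[\begin{array}{ccc|ccc}
1 & 1 & i & 1 & 0 & 0\\
0 & i - 2 & 0 & -2 & 1 & 0\\
-2 & -1 & 1 & 0 & 0 & 1
\end{array}\right]\\
&\overset{r_3 + 2r_1}{\sim}
\left[\begin{array}{ccc|ccc}
1 & 1 & i & 1 & 0 & 0\\
0 & i - 2 & 0 & -2 & 1 & 0\\
0 & 1 & 1 + 2i & 2 & 0 & 1
\end{array}\right]\\
&\overset{\frac{1}{-2+i}r_2}{\sim}
\left[\begin{array}{ccc|ccc}
1 & 1 & i & 1 & 0 & 0\\
0 & 1 & 0 & \frac{4+2i}{5} & -\frac{2+i}{5} & 0\\
0 & 1 & 1 + 2i & 2 & 0 & 1
\end{array}\right]\\
&\overset{r_3 - r_2}{\sim}
\left[\begin{array}{ccc|ccc}
1 & 1 & i & 1 & 0 & 0\\
0 & 1 & 0 & \frac{4+2i}{5} & -\frac{2+i}{5} & 0\\
0 & 0 & 1 + 2i & \frac{6-2i}{5} & \frac{2+i}{5} & 1
\end{array}\right]\\
&\overset{\frac{1}{1+2i}r_3}{\sim}
\left[\begin{array}{ccc|ccc}
1 & 1 & i & 1 & 0 & 0\\
0 & 1 & 0 & \frac{4+2i}{5} & -\frac{2+i}{5} & 0\\
0 & 0 & 1 & \frac{2-14i}{25} & \frac{4-3i}{25} & \frac{1-2i}{5}
\end{array}\right]
\end{aligned}
\]

Notice that, at this step, we already have 1 in each position of the diagonal of the left matrix. We now have to get 0 in each position above the diagonal. Since the entry in the position $(1, 3)$ is $i$, the last step subtracts $i$ times the third row from the first.

\[
\begin{aligned}
\left[\begin{array}{ccc|ccc}
1 & 1 & i & 1 & 0 & 0\\
0 & 1 & 0 & \frac{4+2i}{5} & -\frac{2+i}{5} & 0\\
0 & 0 & 1 & \frac{2-14i}{25} & \frac{4-3i}{25} & \frac{1-2i}{5}
\end{array}\right]
&\overset{r_1 - r_2}{\sim}
\left[\begin{array}{ccc|ccc}
1 & 0 & i & \frac{1-2i}{5} & \frac{2+i}{5} & 0\\
0 & 1 & 0 & \frac{4+2i}{5} & -\frac{2+i}{5} & 0\\
0 & 0 & 1 & \frac{2-14i}{25} & \frac{4-3i}{25} & \frac{1-2i}{5}
\end{array}\right]\\
&\overset{r_1 - i r_3}{\sim}
\left[\begin{array}{ccc|ccc}
1 & 0 & 0 & -\frac{9+12i}{25} & \frac{7+i}{25} & -\frac{2+i}{5}\\
0 & 1 & 0 & \frac{4+2i}{5} & -\frac{2+i}{5} & 0\\
0 & 0 & 1 & \frac{2-14i}{25} & \frac{4-3i}{25} & \frac{1-2i}{5}
\end{array}\right]
\end{aligned}
\]

The inverse of $A$ is the matrix

\[
A^{-1} =
\begin{bmatrix}
-\frac{9+12i}{25} & \frac{7+i}{25} & -\frac{2+i}{5}\\
\frac{4+2i}{5} & -\frac{2+i}{5} & 0\\
\frac{2-14i}{25} & \frac{4-3i}{25} & \frac{1-2i}{5}
\end{bmatrix}
= \frac{1}{25}
\begin{bmatrix}
-(9 + 12i) & 7 + i & -5(2 + i)\\
5(4 + 2i) & -5(2 + i) & 0\\
2 - 14i & 4 - 3i & 5(1 - 2i)
\end{bmatrix}
\]

You can check that this result is correct by checking that indeed $A \cdot A^{-1} = I_3$. ♣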
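The procedure of reducing $[A \mid I_n]$ to $[I_n \mid A^{-1}]$ can be sketched in Python as follows. This is an illustration under stated assumptions, not a robust implementation: the name `inverse` is chosen here, the arithmetic uses floating-point complex numbers (so results are compared up to a small tolerance), and a pivot is treated as zero only if it is exactly zero.

```python
def inverse(a):
    """Invert a square matrix by row-reducing [A | I] to [I | A^-1]."""
    n = len(a)
    # Glue the identity matrix to the right of A.
    m = [[complex(x) for x in row] + [complex(i == j) for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is not invertible")
        m[col], m[pivot] = m[pivot], m[col]            # swap rows
        m[col] = [x / m[col][col] for x in m[col]]     # make the pivot 1
        for r in range(n):
            if r != col:                               # clear the column
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [row[n:] for row in m]

A = [[1, 1, 1j],
     [2, 1j, 2j],
     [-2, -1, 1]]
expected = [[-(9 + 12j) / 25, (7 + 1j) / 25, -(2 + 1j) / 5],
            [(4 + 2j) / 5, -(2 + 1j) / 5, 0],
            [(2 - 14j) / 25, (4 - 3j) / 25, (1 - 2j) / 5]]

close = all(abs(x - y) < 1e-12
            for row, exp_row in zip(inverse(A), expected)
            for x, y in zip(row, exp_row))
print(close)  # True
```

The code clears each column above and below the pivot in one pass, so the order of the row operations differs from the hand computation, but the resulting inverse is the same.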

11.12. Invertible matrices and linear functions. The linear functions induced by invertible matrices are of special interest. The following proposition collects the main properties of these functions.

Proposition 11.13. Let $A \in M_{n,n}(\mathbb{C})$ be an invertible matrix. Then, the induced linear function, $f_A : \mathbb{C}^n \to \mathbb{C}^n$, is bijective.

Proof. Let $A \in M_{n,n}(\mathbb{C})$ be an invertible matrix, and let $A^{-1}$ be its inverse. Let also $f_A : \mathbb{C}^n \to \mathbb{C}^n$ and $f_{A^{-1}} : \mathbb{C}^n \to \mathbb{C}^n$ be the linear functions induced by $A$ and $A^{-1}$ respectively.

If we consider the composition $f_A \circ f_{A^{-1}}$, then we know that

\[
f_A \circ f_{A^{-1}} = f_{A \cdot A^{-1}} = f_{I_n} = \mathrm{Id}.
\]

Since the identity function is surjective, we deduce that $f_A$ is surjective. Similarly, if we consider the composition $f_{A^{-1}} \circ f_A$, then

\[
f_{A^{-1}} \circ f_A = f_{A^{-1} \cdot A} = \mathrm{Id}.
\]

Since the identity function is injective, we deduce that $f_A$ is injective. □

Linear functions induced by invertible matrices are especially nice when it comes to solving systems of equations. Let $A \in M_{n,n}(\mathbb{C})$ be an invertible matrix, and let $A^{-1}$ be the inverse. As usual, let $f_A$ be the linear function induced by $A$. Suppose we want to solve the system

\[
f_A(x_1, \dots, x_n) = (b_1, \dots, b_n)
\]

for some element $(b_1, \dots, b_n) \in \mathbb{C}^n$. In terms of the matrix of coefficients, we can write this as

\[
A \cdot \mathbf{x} = \mathbf{v},
\]

where

\[
\mathbf{x} =
\begin{bmatrix}
x_1\\
x_2\\
\vdots\\
x_n
\end{bmatrix}
\quad\text{and}\quad
\mathbf{v} =
\begin{bmatrix}
b_1\\
b_2\\
\vdots\\
b_n
\end{bmatrix}
\]

Solving the system is equivalent to finding the values of $x_1, \dots, x_n$, and in this situation we have an easy way to do it: just multiply both sides on the left by $A^{-1}$, and we get

\[
\mathbf{x} = I_n \cdot \mathbf{x} = A^{-1} \cdot A \cdot \mathbf{x} = A^{-1} \cdot \mathbf{v}.
\]

This means several things:

(1) the system has a unique solution (we knew this because the function $f_A$ is bijective);

(2) the solution of the system is given by the product $A^{-1} \cdot \mathbf{v}$.

Example 11.14. Let $A$ be the matrix in Example 11.2:

\[
A =
\begin{bmatrix}
1 & 1 & i\\
2 & i & 2i\\
-2 & -1 & 1
\end{bmatrix}.
\]


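We saw in Example 11.3 that this matrix is invertible, and we computed its inverse in Example 11.11:

\[
A^{-1} =
\begin{bmatrix}
-\frac{9+12i}{25} & \frac{7+i}{25} & -\frac{2+i}{5}\\
\frac{4+2i}{5} & -\frac{2+i}{5} & 0\\
\frac{2-14i}{25} & \frac{4-3i}{25} & \frac{1-2i}{5}
\end{bmatrix}
\]

For instance, we can now find the solutions of the system

\[
\begin{aligned}
x_1 + x_2 + ix_3 &= 1\\
2x_1 + ix_2 + 2ix_3 &= i\\
-2x_1 - x_2 + x_3 &= 2i
\end{aligned}
\]

The solution now only requires a simple multiplication of matrices:

\[
\begin{aligned}
\begin{bmatrix}
-\frac{9+12i}{25} & \frac{7+i}{25} & -\frac{2+i}{5}\\
\frac{4+2i}{5} & -\frac{2+i}{5} & 0\\
\frac{2-14i}{25} & \frac{4-3i}{25} & \frac{1-2i}{5}
\end{bmatrix}
\cdot
\begin{bmatrix}
1\\
i\\
2i
\end{bmatrix}
&=
\begin{bmatrix}
-\frac{9+12i}{25} \cdot 1 + \frac{7+i}{25} \cdot i + \left(-\frac{2+i}{5}\right) \cdot 2i\\
\frac{4+2i}{5} \cdot 1 + \left(-\frac{2+i}{5}\right) \cdot i + 0 \cdot 2i\\
\frac{2-14i}{25} \cdot 1 + \frac{4-3i}{25} \cdot i + \frac{1-2i}{5} \cdot 2i
\end{bmatrix}\\
&=
\begin{bmatrix}
\frac{-9 - 12i + 7i + i^2 - 20i - 10i^2}{25}\\
\frac{4 + 2i - 2i - i^2}{5}\\
\frac{2 - 14i + 4i - 3i^2 + 10i - 20i^2}{25}
\end{bmatrix}
=
\begin{bmatrix}
-i\\
1\\
1
\end{bmatrix}
\end{aligned}
\]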

♣
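The same computation can be reproduced in Python. The sketch below is illustrative only (the variable names are chosen here): it multiplies the inverse by the right-hand side using floating-point complex numbers, so the product is compared with $(-i, 1, 1)$ up to a small tolerance, and it then confirms the solution exactly by substituting it into the original system.

```python
A = [[1, 1, 1j],
     [2, 1j, 2j],
     [-2, -1, 1]]
A_inv = [[-(9 + 12j) / 25, (7 + 1j) / 25, -(2 + 1j) / 5],
         [(4 + 2j) / 5, -(2 + 1j) / 5, 0],
         [(2 - 14j) / 25, (4 - 3j) / 25, (1 - 2j) / 5]]
v = [1, 1j, 2j]

def times(m, x):
    """Product of the matrix m with the column vector x."""
    return [sum(c * xi for c, xi in zip(row, x)) for row in m]

x = times(A_inv, v)
solution = [-1j, 1, 1]
print(all(abs(xi - si) < 1e-12 for xi, si in zip(x, solution)))  # True
print(times(A, solution) == v)                                   # True
```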
