ELEMENTARY LINEAR ALGEBRA AND VECTOR CALCULUS

MATHEMATICS FOR UNDERGRADUATES

Antony L. Foster
Department of Mathematics (NAC 6/273)
The City College of New York
New York, New York 10031.


Chapter 1.

LINEAR EQUATIONS AND MATRICES

1.1. SYSTEMS OF LINEAR EQUATIONS

One of the most frequently recurring practical problems in almost all fields of study—such as mathematics, physics, biology, chemistry, economics, all phases of engineering, operations research, the social sciences, and so forth—is that of solving a system of linear equations. The equation

b = a1x1 + a2x2 + · · · + anxn,   (1)

which expresses b in terms of the variables x1, x2, . . . , xn and the constants a1, a2, . . . , an, is called a linear equation. In many applications we are given b and must find numbers x1, x2, . . . , xn satisfying (1).

A solution to a linear equation (1) is a sequence of n numbers s1, s2, . . . , sn such that (1) is satisfied when x1 = s1, x2 = s2, . . . , xn = sn are substituted in (1). Thus x1 = 2, x2 = 3, and x3 = −4 is a solution to the linear equation

6x1 − 3x2 + 4x3 = −13,

because

6(2) − 3(3) + 4(−4) = −13.

More generally, a system of m linear equations in n unknowns, or a linear system, is a set of m linear equations each in n unknowns called x1, x2, . . . , xn.

A linear system can be conveniently written as

a11x1 + a12x2 + · · · + a1nxn = b1
a21x1 + a22x2 + · · · + a2nxn = b2
    ⋮
am1x1 + am2x2 + · · · + amnxn = bm.   (2)

Thus the ith equation in the system is

ai1x1 + ai2x2 + · · · + ainxn = bi.

In (2) the aij are known constants. Given the values b1, b2, . . . , bm, we want to find values of x1, x2, . . . , xn that will satisfy each equation in (2).

A solution to a linear system (2) is a sequence of n numbers s1, s2, . . . , sn with the property that each equation in (2) is satisfied when x1 = s1, x2 = s2, . . . , xn = sn are substituted.

If the linear system (2) has no solution, it is said to be inconsistent; if it has a solution, it is called consistent. If b1 = b2 = · · · = bm = 0, then (2) is called a homogeneous system. The solution x1 = x2 = · · · = xn = 0 to a homogeneous system is called the trivial solution. A solution to a homogeneous system in which not all of x1, x2, . . . , xn are zero is called a nontrivial solution.

Consider another system of r linear equations in n unknowns:

c11x1 + c12x2 + · · · + c1nxn = d1
c21x1 + c22x2 + · · · + c2nxn = d2
    ⋮
cr1x1 + cr2x2 + · · · + crnxn = dr.   (3)

We say that (2) and (3) are equivalent if they both have exactly the same solutions.


Example 1.1.1. The linear system

x1 − 3x2 = −7
2x1 + x2 = 7   (4)

has only the solution x1 = 2 and x2 = 3. The linear system

8x1 − 3x2 = 7
3x1 − 2x2 = 0
10x1 − 2x2 = 14   (5)

also has only the solution x1 = 2 and x2 = 3. Thus (4) and (5) are equivalent.

To find solutions to a linear system we shall use a technique called the method of elimination; that is, we eliminate some variables by adding a multiple of one equation to another equation. Elimination merely amounts to the development of a new linear system which is equivalent to the original system but is much simpler to solve. Readers have probably confined their earlier work in this area to linear systems in which m = n, that is, linear systems having as many equations as unknowns. In this course we shall broaden our outlook by dealing with systems in which we have m = n, m < n, and m > n. Indeed, there are numerous applications in which m ≠ n.

Example 1.1.2. Consider the linear system

x1 − 3x2 = −3
2x1 + x2 = 8.   (6)

To eliminate x1, we subtract twice the first equation from the second, obtaining

7x2 = 14,

an equation having no x1 term. Thus we have eliminated the unknown x1. Then solving for x2, we have

x2 = 2,

and substituting into the first equation of (6), we obtain

x1 = 3.

Then x1 = 3, x2 = 2 is the only solution to the given linear system.

Example 1.1.3. Consider the linear system

x1 − 3x2 = −7
2x1 − 6x2 = 7.   (7)

Again, we decide to eliminate x1. We subtract twice the first equation from the second one, obtaining

0 = 21,

which makes no sense. This means that (7) has no solution; it is inconsistent. We could have come to the same conclusion from observing that in (7) the left side of the second equation is twice the left side of the first equation, but the right side of the second equation is not twice the right side of the first equation.

Example 1.1.4. Consider the linear system

x1 + 2x2 + 3x3 = 6
2x1 − 3x2 + 2x3 = 14
3x1 + x2 − x3 = −2.   (8)

To eliminate x1, we subtract twice the first equation from the second and three times the first equation from the third, obtaining

−7x2 − 4x3 = 2
−5x2 − 10x3 = −20.   (9)


This is a system of two equations in the unknowns x2 and x3. We divide the second equation of (9) by −5, obtaining

−7x2 − 4x3 = 2
x2 + 2x3 = 4,

which we write, by interchanging equations, as

x2 + 2x3 = 4
−7x2 − 4x3 = 2.   (10)

We now eliminate x2 in (10) by adding 7 times the first equation to the second one, to obtain

10x3 = 30,

or

x3 = 3.   (11)

Substituting this value of x3 into the first equation of (10), we find that x2 = −2. Substituting these values of x2 and x3 into the first equation of (8), we find that x1 = 1. We might observe further that our elimination procedure has actually produced the linear system

x1 + 2x2 + 3x3 = 6
x2 + 2x3 = 4
x3 = 3,   (12)

obtained by using the first equations of (8) and (10) as well as (11). The importance of the procedure is that although the linear systems (8) and (12) are equivalent, (12) has the advantage that it is easier to solve.
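The elimination steps of this example can be mimicked numerically. The following sketch uses NumPy (my choice of tool; the notes themselves use no software) to apply the same row operations to system (8) and then back-substitute:

```python
import numpy as np

# Coefficients and right-hand side of system (8).
A = np.array([[1.0,  2.0,  3.0],
              [2.0, -3.0,  2.0],
              [3.0,  1.0, -1.0]])
b = np.array([6.0, 14.0, -2.0])

# Eliminate x1: subtract multiples of equation 1 from equations 2 and 3.
A[1] -= 2 * A[0]; b[1] -= 2 * b[0]   # gives -7x2 - 4x3 = 2
A[2] -= 3 * A[0]; b[2] -= 3 * b[0]   # gives -5x2 - 10x3 = -20

# Scale and interchange as in (10), then eliminate x2.
A[2] /= -5.0; b[2] /= -5.0                   # x2 + 2x3 = 4
A[[1, 2]] = A[[2, 1]]; b[[1, 2]] = b[[2, 1]] # swap equations 2 and 3
A[2] += 7 * A[1]; b[2] += 7 * b[1]           # 10x3 = 30

# Back-substitution on the triangular system (12).
x3 = b[2] / A[2, 2]
x2 = b[1] - A[1, 2] * x3          # A[1, 1] is 1 after scaling
x1 = b[0] - A[0, 1] * x2 - A[0, 2] * x3
print(x1, x2, x3)  # 1.0 -2.0 3.0
```

The point of the sketch is that elimination only manipulates the coefficients, which motivates the matrix notation introduced in Section 1.2.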

Example 1.1.5. Consider the linear system

x1 + 2x2 − 3x3 = −4
2x1 + x2 − 3x3 = 4.   (13)

Eliminating x1, we subtract twice the first equation from the second equation, to obtain

−3x2 + 3x3 = 12.   (14)

We must now solve (14). A solution is

x2 = x3 − 4,

where x3 can be any real number. Then from the first equation of (13),

x1 = −4 − 2x2 + 3x3
   = −4 − 2(x3 − 4) + 3x3
   = x3 + 4.

Thus a solution to the linear system (13) is

x1 = x3 + 4
x2 = x3 − 4
x3 = any real number.

This means that the linear system (13) has infinitely many solutions. Every time we assign a value to x3 we obtain another solution to (13). Thus, if x3 = 1, then

x1 = 5, x2 = −3, and x3 = 1

is a solution, while if x3 = −2, then

x1 = 2, x2 = −6, and x3 = −2

is another solution.

These examples suggest that a linear system may have a unique solution, no solution, or infinitely many solutions.


Consider next a linear system of two equations in the unknowns x1 and x2:

a1x1 + a2x2 = c1
b1x1 + b2x2 = c2.   (15)

The graph of each of these equations is a straight line, which we denote by l1 and l2, respectively. If x1 = s1, x2 = s2 is a solution to the linear system (15), then the point (s1, s2) lies on both lines l1 and l2. Conversely, if the point (s1, s2) lies on both lines l1 and l2, then x1 = s1, x2 = s2 is a solution to the linear system (15). Thus we are led geometrically to the same three possibilities mentioned above (Figure 1.1).

Figure 1.1. The lines l1 and l2 in the x1x2-plane: (a) a unique solution; (b) no solution; (c) infinitely many solutions.

If we examine the method of elimination more closely, we find that it involves three manipulations that can be performed on a linear system to convert it into an equivalent system. These manipulations are as follows:

1. We may interchange the ith and jth equations of the system.

2. We may multiply an equation in the system by a nonzero constant.

3. We may replace the ith equation by c times the jth equation plus the ith equation, i ≠ j.

That is, replace

ai1x1 + ai2x2 + · · · + ainxn = bi

by

(ai1 + caj1)x1 + (ai2 + caj2)x2 + · · · + (ain + cajn)xn = bi + cbj.

It is not difficult to prove that performing these manipulations on a linear system leads to an equivalent system.
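The three manipulations can be sketched as operations on a NumPy array holding one equation per row, with the right-hand side appended as a last column. This is an illustration of my own; the helper names below are not from the notes:

```python
import numpy as np

def interchange(M, i, j):
    """Manipulation 1: interchange the ith and jth equations."""
    M[[i, j]] = M[[j, i]]

def scale(M, i, c):
    """Manipulation 2: multiply equation i by a nonzero constant c."""
    assert c != 0
    M[i] *= c

def replace(M, i, j, c):
    """Manipulation 3: replace equation i by c*(equation j) + (equation i)."""
    assert i != j
    M[i] += c * M[j]

# System (6), x1 - 3x2 = -3 and 2x1 + x2 = 8, written as [A | b].
M = np.array([[1.0, -3.0, -3.0],
              [2.0,  1.0,  8.0]])
replace(M, 1, 0, -2.0)   # eliminate x1: second row becomes 0, 7, 14
scale(M, 1, 1 / 7)       # second row becomes 0, 1, 2, i.e. x2 = 2
print(M)
```

Each helper mutates the array in place, mirroring how the manipulations transform one linear system into an equivalent one.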

Example 1.1.6. Suppose that the ith equation of a linear system such as (2) is multiplied by the nonzero constant c, obtaining the linear system

a11x1 + a12x2 + · · · + a1nxn = b1
a21x1 + a22x2 + · · · + a2nxn = b2
    ⋮
cai1x1 + cai2x2 + · · · + cainxn = cbi
    ⋮
am1x1 + am2x2 + · · · + amnxn = bm.   (16)

If x1 = s1, x2 = s2, . . . , xn = sn is a solution to (2), then it is a solution to all the equations in (16) except possibly for the ith equation. For the ith equation we have


c(ai1s1 + ai2s2 + · · · + ainsn) = cbi

or

cai1s1 + cai2s2 + · · · + cainsn = cbi.

Thus the ith equation of (16) is also satisfied. Hence every solution to (2) is also a solution to (16). Conversely, every solution to (16) also satisfies (2). Hence (2) and (16) are equivalent systems.

In the next Section we develop methods that will enable us to further discuss and solve linear systems of equations.


SUGGESTED EXERCISES FOR SECTION 1.1

In Exercises 1 through 14, solve the given linear system by the method of elimination. (I suggest that you hold off on these exercises until later, when we discuss the method of Gaussian elimination.)

Exercise 1.1.1.
x1 + 2x2 = 8
3x1 − 4x2 = 4.

Exercise 1.1.2.
2x1 − 3x2 + 4x3 = −12
x1 − 2x2 + x3 = −5
3x1 + x2 + 2x3 = 1.

Exercise 1.1.3.
3x1 + 2x2 + x3 = 2
4x1 + 2x2 + 2x3 = 8
x1 − x2 + x3 = 4.

Exercise 1.1.4.
x1 + x2 = 5
3x1 + 3x2 = 10.

Exercise 1.1.5.
2x1 + 4x2 + 6x3 = −12
2x1 − 3x2 − 4x3 = 15
3x1 + 4x2 + 5x3 = −8.

Exercise 1.1.6.
x1 + x2 − 2x3 = 5
2x1 + 3x2 + 4x3 = 2.

Exercise 1.1.7.
x1 + 4x2 + 6x3 = 12
3x1 + 8x2 − 2x3 = 4.

Exercise 1.1.8.
3x1 + 4x2 − x3 = 8
6x1 + 8x2 − 2x3 = 3.

Exercise 1.1.9.
x1 + x2 + 3x3 = 12
2x1 + 2x2 + 6x3 = 6.

Exercise 1.1.10.
x1 + x2 = 1
2x1 − x2 = 5
3x1 + 4x2 = 2.

Exercise 1.1.11.
2x1 + 3x2 = 13
x1 − 2x2 = 3
5x1 + 2x2 = 27.

Exercise 1.1.12.
x1 − 5x2 = 6
3x1 + 2x2 = 1
5x1 + 2x2 = 1.


Exercise 1.1.13.
x1 + 3x2 = −4
2x1 + 5x2 = −8
x1 + 3x2 = −5.

Exercise 1.1.14.
2x1 + 3x2 − x3 = 6
2x1 − x2 + 2x3 = −8
3x1 − x2 + x3 = −7.

Exercise 1.1.15. Show that the linear system obtained by interchanging two equations in (2) is equivalent to (2).

Exercise 1.1.16. Show that the linear system obtained by adding a multiple of an equation in (2) to another equation is equivalent to (2).


1.2. MATRICES; MATRIX OPERATIONS

Before continuing the study of solving linear systems, we now introduce the notion of a matrix, which will greatly simplify our notational problems, and develop tools to solve many important applied problems.

If we examine the method of elimination described in Section 1.1, we make the following observation: only the numbers in front of the unknowns x1, x2, . . . , xn are being changed as we perform the steps in the method of elimination. Thus we might think of looking for a way of writing a linear system without having to carry along the unknowns. Matrices enable us to do this—that is, to write linear systems in a compact form that makes it easier to automate the elimination method on an electronic computer in order to obtain a fast and efficient procedure for finding solutions. Their use is not, however, merely that of a convenient notation. We now develop operations on matrices and will work with matrices according to the rules they obey; this will enable us to solve systems of linear equations and to do other computational problems in a fast and efficient manner. Of course, as any good definition should do, the notion of a matrix provides not only a new way of looking at old problems but also gives rise to a great many new questions, some of which we study in these notes.

WHAT ARE MATRICES?

Definition 1.2.1. A matrix (plural matrices) is a rectangular array of objects (usually numbers) denoted by

$$A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1j} & \cdots & a_{1n}\\
a_{21} & a_{22} & \cdots & a_{2j} & \cdots & a_{2n}\\
\vdots & \vdots & & \vdots & & \vdots\\
a_{i1} & a_{i2} & \cdots & a_{ij} & \cdots & a_{in}\\
\vdots & \vdots & & \vdots & & \vdots\\
a_{m1} & a_{m2} & \cdots & a_{mj} & \cdots & a_{mn}
\end{pmatrix}.$$

Unless stated otherwise, we assume that all our matrices are composed entirely of real numbers. The ith row of A is

$$( a_{i1}\ \ a_{i2}\ \ \cdots\ \ a_{in} ) \qquad (1 \le i \le m),$$

while the jth column of A is

$$\begin{pmatrix} a_{1j}\\ a_{2j}\\ \vdots\\ a_{mj} \end{pmatrix} \qquad (1 \le j \le n).$$

If a matrix A has m rows and n columns, we say that A is an m by n (m × n) matrix. If m = n, we say that A is a square matrix of order n and that the elements a11, a22, . . . , ann are on the main diagonal of A. We refer to aij as the (i, j) entry (the entry in the ith row and jth column) or (i, j)th element, and we often write

A = [aij ]   or   A = [aij ]m×n.

We shall also write Am×n to indicate that A has m rows and n columns. If A is n × n, we merely write An.


Example 1.2.1. The following are matrices:

$$A = \begin{pmatrix} 1 & 2 & 3\\ 4 & 5 & 6\\ 7 & 8 & 9 \end{pmatrix}, \quad
B = ( 1\ \ {-2}\ \ 9 ), \quad
C = \begin{pmatrix} 2\\ -1\\ 3\\ 4 \end{pmatrix}, \quad\text{and}\quad
D = \begin{pmatrix} 0 & 3\\ -1 & -2 \end{pmatrix}.$$

In matrix A, the entry a32 = 8; in matrix C, the entry c41 = 4. Here A is a 3 × 3 square matrix, B is a 1 × 3 matrix, C is a 4 × 1 matrix, and D is a 2 × 2 square matrix. In A, the entries a11 = 1, a22 = 5, and a33 = 9 are on the main diagonal.
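The matrices of this example can be represented as NumPy arrays (a sketch of my own; the notes themselves discuss no software). Note that code indices are 0-based, so the (3, 2) entry a32 is read with indices (2, 1):

```python
import numpy as np

# The matrices of Example 1.2.1 as NumPy arrays.
A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])   # 3 x 3, square
B = np.array([[1, -2, 9]])                        # 1 x 3
C = np.array([[2], [-1], [3], [4]])               # 4 x 1
D = np.array([[0, 3], [-1, -2]])                  # 2 x 2, square

print(A.shape)     # (3, 3)
print(A[2, 1])     # the (3, 2) entry a32 = 8 (0-based indexing)
print(np.diag(A))  # the main diagonal: [1 5 9]
```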

Whenever a new object is introduced in mathematics, one must determine when two such objects are equal. For example, in the set of all rational numbers, the numbers 2/3 and 4/6 are called equal, although they are not represented in the same manner. What we have in mind is the definition that a/b equals c/d when ad = bc. Accordingly, we now have the following definition.

EQUALITY OF MATRICES

Definition 1.2.2. Two m × n matrices A = [aij ] and B = [bij ] are equal if they agree entry by entry, that is, if aij = bij for i = 1, 2, . . . , m and j = 1, 2, . . . , n.

Example 1.2.2. The matrices

$$A = \begin{pmatrix} 1 & 2 & -1\\ 2 & -3 & 4\\ 0 & -4 & 5 \end{pmatrix} \quad\text{and}\quad
B = \begin{pmatrix} 1 & 2 & w\\ 2 & x & 4\\ y & -4 & z \end{pmatrix}$$

are equal if and only if w = −1, x = −3, y = 0, and z = 5.

We next define a number of operations that will produce new matrices out of given matrices; this will enable us to compute with the matrices and not deal with the equations from which they arise. These operations are also useful in the applications of matrices.

ADDITION OF MATRICES

Definition 1.2.3. If A = [aij ] and B = [bij ] are both m × n matrices, then their matrix sum A + B is an m × n matrix C = [cij ] defined by cij = aij + bij , i = 1, 2, . . . , m; j = 1, 2, . . . , n. Thus, to obtain the sum of A and B, we merely add corresponding (i, j) entries.

Example 1.2.3. Let

$$A = \begin{pmatrix} 1 & -2 & 3\\ 2 & -1 & 4 \end{pmatrix} \quad\text{and}\quad
B = \begin{pmatrix} 0 & 2 & 1\\ 1 & 3 & -4 \end{pmatrix}.$$

Then

$$A + B = \begin{pmatrix} 1+0 & -2+2 & 3+1\\ 2+1 & -1+3 & 4-4 \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 4\\ 3 & 2 & 0 \end{pmatrix}.$$

It should be noted that the sum of the matrices A and B is defined only when A and B have the same number of rows and the same number of columns, that is, only when A and B are of the same size. We now establish the convention that when A + B is formed, both A and B are of the same size. The basic properties of matrix addition are considered in the following section and are similar to those satisfied by the real numbers.


SCALAR MULTIPLICATION OF MATRICES

Definition 1.2.4. If A = [aij ] is an m × n matrix and α is a real number, then the scalar multiple of A by α, written αA, is the m × n matrix C = [cij ], where cij = α aij , i = 1, 2, . . . , m, j = 1, 2, . . . , n; that is, the matrix C is obtained by multiplying each entry of A by α.

Example 1.2.4. We have

$$2 \begin{pmatrix} 4 & -2 & -3\\ 7 & -3 & 2 \end{pmatrix}
= \begin{pmatrix} 8 & -4 & -6\\ 14 & -6 & 4 \end{pmatrix}.$$

SUBTRACTION: If A and B are m × n matrices, we write A + (−1)B as A − B and call this the difference between A and B.
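Since addition, scalar multiplication, and subtraction are all entrywise, they map directly onto elementwise array operations. A NumPy sketch (my own illustration) on the matrices of Examples 1.2.3 and 1.2.4:

```python
import numpy as np

# Entrywise operations of Definitions 1.2.3 and 1.2.4.
A = np.array([[1, -2, 3], [2, -1, 4]])
B = np.array([[0, 2, 1], [1, 3, -4]])

S = A + B                                    # matrix sum (Example 1.2.3)
T = 2 * np.array([[4, -2, -3], [7, -3, 2]])  # scalar multiple (Example 1.2.4)
Dff = A - B                                  # the difference A + (-1)B

print(S)   # [[1 0 4] [3 2 0]]
print(T)   # [[8 -4 -6] [14 -6 4]]
print(Dff)
```

Adding arrays of unequal shapes raises an error, matching the convention that A + B is formed only when A and B have the same size.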

We shall sometimes use the summation notation, and we now review this useful and compact notation. By

$$\sum_{k=1}^{n} r_k a_k$$

we mean r1a1 + r2a2 + · · · + rnan. The letter k is called the index of summation; it is a dummy variable that can be replaced by another letter. Hence we can write

$$\sum_{k=1}^{n} r_k a_k = \sum_{j=1}^{n} r_j a_j = \sum_{i=1}^{n} r_i a_i.$$

Thus

$$\sum_{i=1}^{4} r_i a_i = r_1a_1 + r_2a_2 + r_3a_3 + r_4a_4.$$


The summation notation satisfies the following properties:

1. $\sum_{i=1}^{n} (r_i + s_i)a_i = \sum_{i=1}^{n} (r_i a_i + s_i a_i) = \sum_{i=1}^{n} r_i a_i + \sum_{i=1}^{n} s_i a_i$.

2. $\sum_{i=1}^{n} \alpha\,(r_i a_i) = \alpha \sum_{i=1}^{n} r_i a_i$.

3. $\sum_{j=1}^{m} \sum_{i=1}^{n} a_{ij} = \sum_{i=1}^{n} \sum_{j=1}^{m} a_{ij}$.

Property 3 can be interpreted as follows. If we add up the entries in each row of a matrix and then add the resulting numbers, we obtain the same result as when we add up the entries in each column of the matrix and then add the resulting numbers.

MULTIPLICATION OF MATRICES

Definition 1.2.5. If A = [aij ] is an m × n matrix and B = [bij ] is an n × p matrix, then the matrix product of A and B, denoted C = AB = [cij ], is an m × p matrix defined by

$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj},
\qquad i = 1, 2, \ldots, m;\ \ j = 1, 2, \ldots, p.$$

Note that the matrix product AB is defined only when the number of columns of A is the same as the number of rows of B. We also observe that the (i, j) entry of C = AB is obtained by using the ith row of A and the jth column of B. Thus

$$\begin{pmatrix}
a_{11} & \cdots & a_{1n}\\
\vdots & & \vdots\\
a_{i1} & \cdots & a_{in}\\
\vdots & & \vdots\\
a_{m1} & \cdots & a_{mn}
\end{pmatrix}
\begin{pmatrix}
b_{11} & \cdots & b_{1j} & \cdots & b_{1p}\\
b_{21} & \cdots & b_{2j} & \cdots & b_{2p}\\
\vdots & & \vdots & & \vdots\\
b_{n1} & \cdots & b_{nj} & \cdots & b_{np}
\end{pmatrix}
=
\begin{pmatrix}
c_{11} & \cdots & c_{1j} & \cdots & c_{1p}\\
\vdots & & \vdots & & \vdots\\
c_{i1} & \cdots & c_{ij} & \cdots & c_{ip}\\
\vdots & & \vdots & & \vdots\\
c_{m1} & \cdots & c_{mj} & \cdots & c_{mp}
\end{pmatrix}$$


Example 1.2.5. Let

$$A = \begin{pmatrix} 1 & 2 & -1\\ 3 & 1 & 4 \end{pmatrix} \quad\text{and}\quad
B = \begin{pmatrix} -2 & 5\\ 4 & -3\\ 2 & 1 \end{pmatrix}.$$

Then

$$AB = \begin{pmatrix}
(1)(-2) + (2)(4) + (-1)(2) & (1)(5) + (2)(-3) + (-1)(1)\\
(3)(-2) + (1)(4) + (4)(2) & (3)(5) + (1)(-3) + (4)(1)
\end{pmatrix}
= \begin{pmatrix} 4 & -2\\ 6 & 16 \end{pmatrix}.$$
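The defining formula of Definition 1.2.5 can be coded directly as a triple loop; here is a sketch of my own (written for clarity rather than speed) checked against NumPy's built-in `@` product on the matrices of Example 1.2.5:

```python
import numpy as np

def mat_mult(A, B):
    """Matrix product from Definition 1.2.5: c_ij = sum_k a_ik * b_kj."""
    m, n = A.shape
    n2, p = B.shape
    assert n == n2, "columns of A must match rows of B"
    C = np.zeros((m, p))
    for i in range(m):
        for j in range(p):
            # The (i, j) entry uses the ith row of A and the jth column of B.
            C[i, j] = sum(A[i, k] * B[k, j] for k in range(n))
    return C

A = np.array([[1.0, 2.0, -1.0], [3.0, 1.0, 4.0]])
B = np.array([[-2.0, 5.0], [4.0, -3.0], [2.0, 1.0]])
C = mat_mult(A, B)
print(C)                         # the matrix (4 -2; 6 16)
print(np.allclose(C, A @ B))     # True
```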

The basic properties of matrix multiplication are considered in the following section. However, we note here that multiplication of matrices requires much more care than their addition, since the algebraic properties of matrix multiplication differ from those satisfied by the real numbers. Part of the problem is due to the fact that AB is defined only when the number of columns of A is the same as the number of rows of B. Thus, if A is an m × n matrix and B is an n × p matrix, then AB is an m × p matrix. What about BA? Three different situations may occur:

1. BA may not be defined. This will take place if p ≠ m.

2. If BA is defined, BA will be an n × n matrix and AB will be an m × m matrix, and if m ≠ n, AB and BA are of different sizes.

3. If BA and AB are of the same size, they may be unequal.

As in the case of addition, we establish the convention that when AB is written, it is defined.

Example 1.2.6. Let A be a 2 × 3 matrix and let B be a 3 × 4 matrix. Then AB is 2 × 4 and BA is not defined.

Example 1.2.7. Let A be 2 × 3 and let B be 3 × 2. Then AB is 2 × 2 and BA is 3 × 3.

Example 1.2.8. Let

$$A = \begin{pmatrix} 1 & 2\\ -1 & 3 \end{pmatrix} \quad\text{and}\quad
B = \begin{pmatrix} 2 & 1\\ 0 & 1 \end{pmatrix}.$$

Then

$$AB = \begin{pmatrix} 2 & 3\\ -2 & 2 \end{pmatrix}
\quad\text{while}\quad
BA = \begin{pmatrix} 1 & 7\\ -1 & 3 \end{pmatrix}.$$

Thus AB ≠ BA.

One might ask why matrix equality and matrix addition are defined in such a natural way while matrix multiplication appears to be much more complicated. Only a thorough understanding of the composition of functions and the relationship that exists between matrices and what are called linear transformations would show that the definition of multiplication given above is the natural one. These topics will be covered later in these notes.

It is sometimes useful to be able to find a column in the matrix product AB without having to multiply the two matrices. It is not difficult to show that the jth column of the matrix product AB is equal to the matrix product ABj , where Bj is the jth column of B.

Example 1.2.9. Let

$$A = \begin{pmatrix} 1 & 2\\ 3 & 4\\ -1 & 5 \end{pmatrix} \quad\text{and}\quad
B = \begin{pmatrix} -2 & 3 & 4\\ 3 & 2 & 1 \end{pmatrix}.$$

Then the second column of AB is

$$AB_2 = \begin{pmatrix} 1 & 2\\ 3 & 4\\ -1 & 5 \end{pmatrix}
\begin{pmatrix} 3\\ 2 \end{pmatrix}
= \begin{pmatrix} 7\\ 17\\ 7 \end{pmatrix}.$$
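The column fact is convenient in code as well, where slicing pulls out a single column. A NumPy check (my own sketch) on the matrices of Example 1.2.9:

```python
import numpy as np

# The jth column of AB equals A times the jth column of B.
A = np.array([[1, 2], [3, 4], [-1, 5]])
B = np.array([[-2, 3, 4], [3, 2, 1]])

second_col = A @ B[:, 1]   # A times the second column of B (0-based index 1)
print(second_col)          # [ 7 17  7]
print(np.array_equal(second_col, (A @ B)[:, 1]))  # True
```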

We now return to the linear system (2) in Section 1.1 and define the following matrices:

$$A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
a_{21} & a_{22} & \cdots & a_{2n}\\
\vdots & \vdots & & \vdots\\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}, \quad
X = \begin{pmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{pmatrix}, \quad
B = \begin{pmatrix} b_1\\ b_2\\ \vdots\\ b_m \end{pmatrix}.$$


We can then write the linear system (2) as AX = B. The matrix A is called the coefficient matrix of the system, and the matrix

$$(A \mid B) = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} & \big| & b_1\\
a_{21} & a_{22} & \cdots & a_{2n} & \big| & b_2\\
\vdots & \vdots & & \vdots & \big| & \vdots\\
a_{m1} & a_{m2} & \cdots & a_{mn} & \big| & b_m
\end{pmatrix}$$

is called the augmented matrix of the system. The coefficient and augmented matrices of a linear system will play key roles in our methods of solving linear systems.

Example 1.2.10. Consider the following linear system:

2x1 + 3x2 − 4x3 + x4 = 5
−2x1 + x3 = 7
3x1 + 2x2 − 4x4 = 3.

We can write this in matrix form as

$$\begin{pmatrix}
2 & 3 & -4 & 1\\
-2 & 0 & 1 & 0\\
3 & 2 & 0 & -4
\end{pmatrix}
\begin{pmatrix} x_1\\ x_2\\ x_3\\ x_4 \end{pmatrix}
= \begin{pmatrix} 5\\ 7\\ 3 \end{pmatrix}.$$

The coefficient matrix of this system is

$$\begin{pmatrix}
2 & 3 & -4 & 1\\
-2 & 0 & 1 & 0\\
3 & 2 & 0 & -4
\end{pmatrix}$$

and the augmented matrix is

$$\begin{pmatrix}
2 & 3 & -4 & 1 & \big| & 5\\
-2 & 0 & 1 & 0 & \big| & 7\\
3 & 2 & 0 & -4 & \big| & 3
\end{pmatrix}.$$

Example 1.2.11. The matrix

$$\begin{pmatrix} 2 & -1 & 3 & \big| & 4\\ 3 & 0 & 2 & \big| & 5 \end{pmatrix}$$

is the augmented matrix of the linear system

2x1 − x2 + 3x3 = 4
3x1 + 0x2 + 2x3 = 5.

TRANSPOSE OF MATRICES

Definition 1.2.6. If A = [aij ] is an m × n matrix, then the transpose of A, denoted by AT = [aTij ], is the n × m matrix defined by aTij = aji. Thus the transpose of A is obtained from A by interchanging the rows and columns of A.

Example 1.2.12. If

$$A = \begin{pmatrix} 1 & 2 & -1\\ -3 & 2 & 7 \end{pmatrix},
\quad\text{then}\quad
A^T = \begin{pmatrix} 1 & -3\\ 2 & 2\\ -1 & 7 \end{pmatrix}.$$


TRACE OF A MATRIX

Definition 1.2.7. If A = [aij ] is an n × n matrix, then the trace of A, denoted by Tr(A), is defined as the sum of all elements on the main diagonal of A (these are the elements of A of the form aii). Thus

$$\mathrm{Tr}(A) = \sum_{i=1}^{n} a_{ii}.$$

Example 1.2.13. If

$$C = \begin{pmatrix} 1 & 2 & 3\\ 4 & 5 & 6\\ 7 & 8 & 9 \end{pmatrix},
\quad\text{then}\quad
\mathrm{Tr}(C) = \sum_{i=1}^{3} c_{ii} = c_{11} + c_{22} + c_{33} = 1 + 5 + 9 = 15.$$
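Both the transpose and the trace have direct NumPy counterparts; the following sketch (my own) uses the matrices of Examples 1.2.12 and 1.2.13:

```python
import numpy as np

# Transpose (Definition 1.2.6) and trace (Definition 1.2.7) in NumPy.
A = np.array([[1, 2, -1], [-3, 2, 7]])
C = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(A.T)           # interchanges rows and columns: a 3 x 2 matrix
print(A.T.shape)     # (3, 2)
print(np.trace(C))   # 1 + 5 + 9 = 15
```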


SUGGESTED EXERCISES FOR SECTION 1.2

Consider the following matrices for Exercises 1, 2, and 3:

$$A = \begin{pmatrix} 1 & 2 & 3\\ 2 & 1 & 4 \end{pmatrix}, \quad
B = \begin{pmatrix} 1 & 0\\ 2 & 1\\ 3 & 2 \end{pmatrix}, \quad
C = \begin{pmatrix} 3 & -1 & 3\\ 4 & 1 & 5\\ 2 & 1 & 3 \end{pmatrix}, \quad
D = \begin{pmatrix} 3 & -2\\ 2 & 5 \end{pmatrix}, \quad\text{and}\quad
E = \begin{pmatrix} 2 & -4 & 5\\ 0 & 1 & 4\\ 3 & 2 & 1 \end{pmatrix}.$$

Exercise 1.2.1. If possible, compute:

(a) C + E.

(b) AB and BA.

(c) 2C − 3E.

(d) CB + D.

(e) AB + D2, where D2 = DD.

(f) (3)(2A) and 6A.

Exercise 1.2.2. If possible, compute:

(a) A(BD).

(b) (AB)D.

(c) A(C + E).

(d) AC + AE.

(e) 3A + 2A and 5A.

Exercise 1.2.3. If possible, compute:

(a) AT .

(b) (AT )T .

(c) (AB)T .

(d) BT AT .

(e) (C + E)T and CT + ET .

(f) A(2B) and 2(AB).

Exercise 1.2.4.
(a) Let A be an m × n matrix with a row consisting entirely of zeros. Show that if B is an n × p matrix, then the matrix product AB has a row of zeros.

(b) Let A be an m × n matrix with a column consisting entirely of zeros and let B be a p × m matrix. Prove that the matrix product BA has a column of zeros.

Exercise 1.2.5. Let

$$A = \begin{pmatrix} 1 & 2\\ 3 & 2 \end{pmatrix} \quad\text{and}\quad
B = \begin{pmatrix} 2 & -1\\ -3 & 4 \end{pmatrix}.$$

Show that AB ≠ BA.

Exercise 1.2.6. Consider the following linear system:

2x1 + 3x2 − 3x3 + x4 + x5 = 7
3x1 + 2x3 + 3x5 = −2
2x1 + 3x2 − 4x4 = 3.

(a) Find the coefficient matrix.

(b) Write the linear system in matrix form.


(c) Find the augmented matrix.

Exercise 1.2.7. Write the linear system whose augmented matrix is

$$\begin{pmatrix}
-2 & -1 & 0 & 4 & \big| & 5\\
-3 & 2 & 7 & 8 & \big| & 3\\
1 & 0 & 0 & 2 & \big| & 4\\
3 & 0 & 1 & 3 & \big| & 6
\end{pmatrix}.$$

Exercise 1.2.8. If

$$\begin{pmatrix} a + b & c + d\\ c - d & a - b \end{pmatrix}
= \begin{pmatrix} 4 & 6\\ 10 & 2 \end{pmatrix},$$

find a, b, c, and d.

Exercise 1.2.9. Write the following linear system in matrix form.

2x1 + 3x2 = 0
3x2 + x3 = 0
2x1 − x3 = 0.

Exercise 1.2.10. Write the linear system whose augmented matrix is

(a) $\begin{pmatrix} 2 & 1 & 3 & 4 & \big| & 0\\ 3 & -1 & 2 & 0 & \big| & 3\\ -2 & 1 & -4 & 3 & \big| & 2 \end{pmatrix}$

(b) $\begin{pmatrix} 2 & 1 & 3 & 4 & \big| & 0\\ 3 & -1 & 2 & 0 & \big| & 3\\ -2 & 1 & -4 & 3 & \big| & 2\\ 0 & 0 & 0 & 0 & \big| & 0 \end{pmatrix}.$

Exercise 1.2.11. How are the linear systems obtained in Exercise 1.2.10 related?

Exercise 1.2.12. If A and B are n × n matrices and α is a real number, then prove that

(a) Tr(αA) = α Tr(A).

(b) Tr(A + B) = Tr(A) + Tr(B).

(c) Tr(AB) = Tr(BA).

Exercise 1.2.13. Compute the trace (see Definition 1.2.7) of each of the following matrices.

(a) $\begin{pmatrix} 1 & 0\\ 2 & 3 \end{pmatrix}$  (b) $\begin{pmatrix} 2 & 2 & 3\\ 2 & 4 & 4\\ 3 & -2 & -5 \end{pmatrix}$  (c) $\begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix}$.

Exercise 1.2.14. Show that there are no 2 × 2 matrices A and B such that

$$AB - BA = \begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix}.$$

Exercise 1.2.15. Show that the jth column of the matrix product AB is equal to the matrix product ABj , where Bj is the jth column of B.

Exercise 1.2.16. Show that if AX = B has more than one solution, then it has infinitely many solutions. (Hint: If X1 and X2 are solutions, consider X3 = αX1 + βX2, where α + β = 1.)


1.3. ALGEBRAIC PROPERTIES OF MATRIX OPERATIONS

In this section we consider the algebraic properties of the matrix operations just defined. Many of these properties are similar to the familiar properties holding for the real numbers. However, there will be striking differences between the set R of real numbers and the set Mm×n(R) of all m × n matrices in their algebraic behavior under certain operations, for example, under multiplication (as seen in Section 1.2). Most of the properties will be stated as theorems, whose proofs will be left as exercises.

Theorem 1.3.1. If A and B are in Mm×n(R), then A + B = B + A.

Proof. Let

A = [aij ], B = [bij ], A + B = C = [cij ], and B + A = D = [dij ].

We must show that cij = dij for all i, j. Now cij = aij + bij and dij = bij + aij for all i, j. Since aij and bij are real numbers, we have aij + bij = bij + aij , which implies that cij = dij for all i, j. ⊔⊓

Example 1.3.1. We have

$$\begin{pmatrix} 1 & 2 & 3\\ -2 & 1 & 4 \end{pmatrix}
+ \begin{pmatrix} 3 & 2 & -1\\ 3 & -1 & 2 \end{pmatrix}
= \begin{pmatrix} 4 & 4 & 2\\ 1 & 0 & 6 \end{pmatrix}
= \begin{pmatrix} 3 & 2 & -1\\ 3 & -1 & 2 \end{pmatrix}
+ \begin{pmatrix} 1 & 2 & 3\\ -2 & 1 & 4 \end{pmatrix}.$$

Theorem 1.3.2. If A, B, and C are in Mm×n(R), then A + (B + C) = (A + B) + C.

Proof. Left as Exercise 1.3.1. ⊔⊓

Example 1.3.2. We have

$$\begin{pmatrix} 1 & 2\\ 3 & 4 \end{pmatrix}
+ \left( \begin{pmatrix} 2 & 1\\ 3 & -2 \end{pmatrix}
+ \begin{pmatrix} 3 & 1\\ -2 & 1 \end{pmatrix} \right)
= \begin{pmatrix} 6 & 4\\ 4 & 3 \end{pmatrix}
= \left( \begin{pmatrix} 1 & 2\\ 3 & 4 \end{pmatrix}
+ \begin{pmatrix} 2 & 1\\ 3 & -2 \end{pmatrix} \right)
+ \begin{pmatrix} 3 & 1\\ -2 & 1 \end{pmatrix}.$$

Theorem 1.3.3. There exists a unique 0m×n in Mm×n(R) such that A + 0m×n = 0m×n + A = A for any A ∈ Mm×n(R).

Proof. Let U = [uij ]. Then A + U = A if and only if aij + uij = aij , which holds if and only if uij = 0. Thus U is the m × n matrix all of whose entries are zero; U is denoted by 0m×n. ⊔⊓

We call 0m×n the m × n zero matrix. When m = n, we write 0n. When m and n are understood, we shall write0m×n merely as 0.

Theorem 1.3.4. For any A in Mm×n(R), there exists B in Mm×n(R) such that A + B = 0m×n.

Proof. Left as Exercise 1.3.2. ⊔⊓

We can now show that B is unique and that it is (−1)A, which we have already agreed to write as −A, and call itthe negative of A.

Example 1.3.3. If

$$A = \begin{pmatrix} 1 & 3 & -2\\ -2 & 4 & 3 \end{pmatrix},
\quad\text{then}\quad
-A = \begin{pmatrix} -1 & -3 & 2\\ 2 & -4 & -3 \end{pmatrix}.$$


Theorem 1.3.5. If A is in Mm×n(R) and B is in Mn×p(R), and C is in Mp×q(R), then A(BC) = (AB)C.

Proof. We shall prove the result for m = 2, n = 3, p = 4, and q = 3. The general proof is completely analogous.

Let A = [aij ], B = [bij ], C = [cij ], AB = D = [dij ], BC = E = [eij ], (AB)C = F = [fij ], and A(BC) = G = [gij ]. We must show that fij = gij for all i, j. Now

$$f_{ij} = \sum_{k=1}^{p} d_{ik} c_{kj} = \sum_{k=1}^{p} \left( \sum_{r=1}^{n} a_{ir} b_{rk} \right) c_{kj}$$

and

$$g_{ij} = \sum_{r=1}^{n} a_{ir} e_{rj} = \sum_{r=1}^{n} a_{ir} \left( \sum_{k=1}^{p} b_{rk} c_{kj} \right).$$

Then

$$\begin{aligned}
f_{ij} &= \sum_{k=1}^{p} (a_{i1}b_{1k} + a_{i2}b_{2k} + \cdots + a_{in}b_{nk})\, c_{kj}\\
&= a_{i1} \sum_{k=1}^{p} b_{1k}c_{kj} + a_{i2} \sum_{k=1}^{p} b_{2k}c_{kj} + \cdots + a_{in} \sum_{k=1}^{p} b_{nk}c_{kj}\\
&= \sum_{r=1}^{n} a_{ir} \left( \sum_{k=1}^{p} b_{rk} c_{kj} \right) = g_{ij}. \qquad ⊔⊓
\end{aligned}$$

Example 1.3.4. Let

$$A = \begin{pmatrix} 5 & 2 & 3\\ 2 & -3 & 4 \end{pmatrix}, \quad
B = \begin{pmatrix} 2 & -1 & 1 & 0\\ 0 & 2 & 2 & 2\\ 3 & 0 & -1 & 3 \end{pmatrix}, \quad\text{and}\quad
C = \begin{pmatrix} 1 & 0 & 2\\ 2 & -3 & 0\\ 0 & 0 & 3\\ 2 & 1 & 0 \end{pmatrix}.$$

Then

$$A(BC) = \begin{pmatrix} 5 & 2 & 3\\ 2 & -3 & 4 \end{pmatrix}
\begin{pmatrix} 0 & 3 & 7\\ 8 & -4 & 6\\ 9 & 3 & 3 \end{pmatrix}
= \begin{pmatrix} 43 & 16 & 56\\ 12 & 30 & 8 \end{pmatrix}$$

and

$$(AB)C = \begin{pmatrix} 19 & -1 & 6 & 13\\ 16 & -8 & -8 & 6 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 2\\ 2 & -3 & 0\\ 0 & 0 & 3\\ 2 & 1 & 0 \end{pmatrix}
= \begin{pmatrix} 43 & 16 & 56\\ 12 & 30 & 8 \end{pmatrix}.$$

Recall Example 1.2.8 in Section 1.2, which shows that AB need not always equal BA. This is the first significant difference between multiplication of matrices and multiplication of real numbers.

Theorem 1.3.6. Let A and B be in Mm×n(R) and C be in Mn×p(R). Then

(a) (A + B) C = AC + BC.

(b) If C is in Mm×n(R) and A and B are both in Mn×p(R), then C (A + B) = CA + CB.

Proof. An easy exercise! ⊔⊓

Example 1.3.5. Let

$$A = \begin{pmatrix} 2 & 2 & 3\\ 3 & -1 & 2 \end{pmatrix}, \quad
B = \begin{pmatrix} 0 & 0 & 1\\ 2 & 3 & -1 \end{pmatrix}, \quad\text{and}\quad
C = \begin{pmatrix} 1 & 0\\ 2 & 2\\ 3 & -1 \end{pmatrix}.$$

Then

$$(A + B)C = \begin{pmatrix} 2 & 2 & 4\\ 5 & 2 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0\\ 2 & 2\\ 3 & -1 \end{pmatrix}
= \begin{pmatrix} 18 & 0\\ 12 & 3 \end{pmatrix}$$


and

$$AC + BC = \begin{pmatrix} 15 & 1\\ 7 & -4 \end{pmatrix}
+ \begin{pmatrix} 3 & -1\\ 5 & 7 \end{pmatrix}
= \begin{pmatrix} 18 & 0\\ 12 & 3 \end{pmatrix}.$$

Theorem 1.3.7. If α and β are real numbers, A is an m × n matrix, and B is an n × p matrix, then

(a) α(β A) = (αβ) A = β(α A).

(b) A(αB) = α (AB).

Proof. An Easy Exercise. ⊔⊓

Example 1.3.6. Let

$$A = \begin{pmatrix} 4 & 2 & 3\\ 2 & -3 & 4 \end{pmatrix} \quad\text{and}\quad
B = \begin{pmatrix} 3 & -2 & 1\\ 2 & 0 & -1\\ 0 & 1 & 2 \end{pmatrix}.$$

Then

$$2(3A) = 2 \begin{pmatrix} 12 & 6 & 9\\ 6 & -9 & 12 \end{pmatrix}
= \begin{pmatrix} 24 & 12 & 18\\ 12 & -18 & 24 \end{pmatrix} = 6A.$$

We also have

$$A(2B) = \begin{pmatrix} 4 & 2 & 3\\ 2 & -3 & 4 \end{pmatrix}
\begin{pmatrix} 6 & -4 & 2\\ 4 & 0 & -2\\ 0 & 2 & 4 \end{pmatrix}
= \begin{pmatrix} 32 & -10 & 16\\ 0 & 0 & 26 \end{pmatrix} = 2(AB).$$

Theorem 1.3.8. If α and β are real numbers and A is in Mm×n(R), then

(α + β) A = αA + βA.

Proof. An easy exercise. ⊔⊓

Theorem 1.3.9. If A and B are both in Mm×n(R) and γ is any real number, then

γ(A + B) = γA + γB.

Proof. An easy exercise. ⊔⊓

So far we have seen that multiplication and addition of matrices have much in common with multiplication and addition of real numbers. We now look at some properties of the transpose.


Theorem 1.3.10. If A is in Mm×n(R), then (AT )T = A

Proof. An easy exercise. ⊔⊓

Theorem 1.3.11. If A and B are both in Mm×n(R) and γ is any real number, then

(a) (γA)T = γAT .

(b) (A + B)T = AT + BT .

Proof. An easy exercise. ⊔⊓

Example 1.3.7. Let

$$A = \begin{pmatrix} 1 & 2 & 3\\ -2 & 0 & 1 \end{pmatrix} \quad\text{and}\quad
B = \begin{pmatrix} 3 & -1 & 2\\ 3 & 2 & -1 \end{pmatrix}.$$

Then

$$A^T = \begin{pmatrix} 1 & -2\\ 2 & 0\\ 3 & 1 \end{pmatrix}
\quad\text{and}\quad
B^T = \begin{pmatrix} 3 & 3\\ -1 & 2\\ 2 & -1 \end{pmatrix}.$$

Also

$$A + B = \begin{pmatrix} 4 & 1 & 5\\ 1 & 2 & 0 \end{pmatrix}
\quad\text{and}\quad
(A + B)^T = \begin{pmatrix} 4 & 1\\ 1 & 2\\ 5 & 0 \end{pmatrix}.$$

Now

$$A^T + B^T = \begin{pmatrix} 4 & 1\\ 1 & 2\\ 5 & 0 \end{pmatrix} = (A + B)^T.$$

Theorem 1.3.12. If A is in Mm×n(R) and B is in Mn×p(R), then

(AB)T = BT AT .

Proof. Let A = [aij ] and B = [bij ]; let AB = C = [cij ]. We must prove that cTij is the (i, j) entry in BT AT . Recall from matrix multiplication that

$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}, \qquad i = 1, 2, \ldots, m;\ \ j = 1, 2, \ldots, p.$$

Thus

$$c^T_{ij} = c_{ji} = \sum_{k=1}^{n} a_{jk} b_{ki} = \sum_{k=1}^{n} a^T_{kj} b^T_{ik} = \sum_{k=1}^{n} b^T_{ik} a^T_{kj} = \text{the } (i, j) \text{ entry in } B^T A^T. \qquad ⊔⊓$$

Example 1.3.8. Let

$$A = \begin{pmatrix} 1 & 3 & 2\\ 2 & -1 & 3 \end{pmatrix} \quad\text{and}\quad
B = \begin{pmatrix} 0 & 1\\ 2 & 2\\ 3 & -1 \end{pmatrix}.$$

Then

$$AB = \begin{pmatrix} 12 & 5\\ 7 & -3 \end{pmatrix}
\quad\text{and}\quad
(AB)^T = \begin{pmatrix} 12 & 7\\ 5 & -3 \end{pmatrix}.$$

On the other hand,

$$A^T = \begin{pmatrix} 1 & 2\\ 3 & -1\\ 2 & 3 \end{pmatrix}
\quad\text{and}\quad
B^T = \begin{pmatrix} 0 & 2 & 3\\ 1 & 2 & -1 \end{pmatrix},$$

and then

$$B^T A^T = \begin{pmatrix} 12 & 7\\ 5 & -3 \end{pmatrix} = (AB)^T.$$
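Theorem 1.3.12 is easy to confirm numerically on the matrices of Example 1.3.8; a NumPy sketch of my own:

```python
import numpy as np

# Check (AB)^T = B^T A^T (Theorem 1.3.12) on the matrices of Example 1.3.8.
A = np.array([[1, 3, 2], [2, -1, 3]])
B = np.array([[0, 1], [2, 2], [3, -1]])

lhs = (A @ B).T
rhs = B.T @ A.T
print(lhs)                       # [[12  7] [ 5 -3]]
print(np.array_equal(lhs, rhs))  # True
```

Note that A.T @ B.T is not even defined here (a 3 x 2 times a 2 x 3 is defined, but gives a 3 x 3 matrix, not (AB)^T); the order of the factors must be reversed.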

We also note two other peculiarities of matrix multiplication. If α and β are real numbers, then αβ = 0 can hold only if α = 0 or β = 0. However, this is not true for matrices: it is possible for the matrix product AB to be the zero matrix without either A or B being a zero matrix (very strange indeed!).


Example 1.3.9. If

$$A = \begin{pmatrix} 1 & 2\\ 2 & 4 \end{pmatrix} \quad\text{and}\quad
B = \begin{pmatrix} 4 & -6\\ -2 & 3 \end{pmatrix},$$

then neither A nor B is the zero matrix, but the matrix product

$$AB = \begin{pmatrix} 0 & 0\\ 0 & 0 \end{pmatrix}$$

is the zero matrix.

The Cancellation Law for Real Numbers: If α, β, and γ are real numbers for which αβ = αγ and α ≠ 0, it follows that β = γ. That is, we can cancel out the nonzero factor α. However, the cancellation law does not hold for matrices, as the following example shows.

Example 1.3.10. If

$$A = \begin{pmatrix} 1 & 2\\ 2 & 4 \end{pmatrix}, \quad
B = \begin{pmatrix} 2 & 1\\ 3 & 2 \end{pmatrix}, \quad\text{and}\quad
C = \begin{pmatrix} -2 & 7\\ 5 & -1 \end{pmatrix},$$

then

$$AB = AC = \begin{pmatrix} 8 & 5\\ 16 & 10 \end{pmatrix},$$

but B ≠ C.
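Both peculiarities of Examples 1.3.9 and 1.3.10 can be verified numerically; a NumPy sketch of my own:

```python
import numpy as np

# Example 1.3.9: zero divisors -- AB = 0 with A and B both nonzero.
A = np.array([[1, 2], [2, 4]])
B = np.array([[4, -6], [-2, 3]])
print(A @ B)   # prints the 2 x 2 zero matrix

# Example 1.3.10: the cancellation law fails -- AB = AC yet B != C.
B2 = np.array([[2, 1], [3, 2]])
C = np.array([[-2, 7], [5, -1]])
print(np.array_equal(A @ B2, A @ C))  # True
print(np.array_equal(B2, C))          # False
```

Both failures stem from the same cause: the matrix A here has no multiplicative inverse, so the factor A cannot be "divided out".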


ALGEBRAIC NOTION OF A RING

A ring is a set R together with two binary operations ⊕ : R × R → R, called addition, and ⊙ : R × R → R, called multiplication. This means that we are given some rule for how to add and multiply the elements or objects in R. The elements or objects in R, together with these two binary operations, satisfy the following rules or axioms.

Ring Axioms

1) A ⊕ B = B ⊕ A for all A and B in R (commutative law of addition).

2) (A ⊕ B) ⊕ C = A ⊕ (B ⊕ C) for all A, B, and C in R (associative law of addition).

3) There exists an element 0R ∈ R such that A ⊕ 0R = 0R ⊕ A = A for all A in R (existence of an additive identity).

4) For each A in R there corresponds an element −A in R such that A ⊕ (−A) = (−A) ⊕ A = 0R (existence of additive inverses).

5) (A ⊙ B) ⊙ C = A ⊙ (B ⊙ C) for all A, B, and C in R (associative law of multiplication).

6) A ⊙ (B ⊕ C) = (A ⊙ B) ⊕ (A ⊙ C) for all A, B, and C in R (left distributive law).

7) (B ⊕ C) ⊙ A = (B ⊙ A) ⊕ (C ⊙ A) for all A, B, and C in R (right distributive law).

8) There exists an element IR in R (IR ≠ 0R) such that A ⊙ IR = IR ⊙ A = A for all A in R (existence of a multiplicative identity).

Any set obeying the first seven axioms is called a ring. If a ring R happens to obey axiom 8 also, then we say that R is a ring with unity. Think of unity as an element of the ring R which behaves as if it were the real number 1. If R is a ring with unity and A ⊙ B = B ⊙ A for all A, B ∈ R, then we say that R is a commutative ring with unity. Otherwise, if there exist A, B ∈ R for which A ⊙ B ≠ B ⊙ A, then R is a non-commutative ring with unity.

Theorem 1.3.13. The set Mm×n(R) of all m × n matrices equipped with addition and multiplication is a non-commutative ring (without unity whenever m ≠ n).

Proof. See the theorems and examples of this section. ⊔⊓

In this section we have developed a number of properties of matrices and their transposes. If a future problem either asks a question about these ideas or involves these concepts, refer to these properties to help answer the question. These results can be used to develop many more results.


SUGGESTED EXERCISES FOR SECTION 1.3

Exercise 1.3.1. Prove Theorem 1.3.2.

Exercise 1.3.2. Prove Theorem 1.3.4.

Exercise 1.3.3. Verify Theorem 1.3.5 for the following matrices:

    A = ( 1  3 ),        B = ( −1  3  2 ),        and        C = ( 1  0 )
        ( 2 −1 )             (  1 −3  4 )                        ( 3 −1 )
                                                                 ( 1  2 ).

Exercise 1.3.4. Prove Theorem 1.3.6.

Exercise 1.3.5. Verify Theorem 1.3.6(b) for the following matrices:

    A = ( 2 −3  2 ),        B = ( 0  1  2 ),        and        C = (  1 −3 )
        ( 3 −1 −2 )             ( 1  3 −2 )                        ( −3  4 ).

Exercise 1.3.6. Prove Theorem 1.3.7.

Exercise 1.3.7. Verify Theorem 1.3.7(b) for the following matrices:

    A = ( 1  3 ),        B = ( −1  3  2 ),        and        α = −3.
        ( 2 −1 )             (  1 −3  4 )

Exercise 1.3.8. Find a pair of unequal nonzero matrices A and B in M2×2(R), other than those given in Example 1.3.9, such that AB = O2×2.

Exercise 1.3.9. Find two different solutions in M2×2(R) of the matrix equation

    A2 = ( 0  0 ) = O2        (recall A2 = AA).
         ( 0  0 )

Exercise 1.3.10. Prove Theorem 1.3.8.

Exercise 1.3.11. Verify Theorem 1.3.8 for α = 4, β = −2, and

    A = ( 2 −3 )
        ( 4  2 ).

Exercise 1.3.12. Find two different solutions in M2×2(R) of the matrix equation

    A2 = ( 1  0 ) = I2.
         ( 0  1 )

Exercise 1.3.13. Prove Theorem 1.3.9.

Exercise 1.3.14. Verify Theorem 1.3.9 for γ = −3 and

    A = ( 4  2 )        and        B = (  0  2 )
        ( 1 −3 )                       (  4  3 )
        ( 3  2 )                       ( −2  1 ).


Exercise 1.3.15. Find A, B ∈ M2×2(R) such that A ≠ B and

    AB = ( 1  0 ) = I2.
         ( 0  1 )

Exercise 1.3.16. Prove Theorem 1.3.10.

Exercise 1.3.17. Find matrices A, B, and C, all in M3×3(R), such that AB = AC with B ≠ C and A ≠ O3×3.

Exercise 1.3.18. Prove Theorem 1.3.11.

Exercise 1.3.19. Verify Theorem 1.3.11 for

    A = ( 1  3  2 ),        B = (  4  2 −1 ),        and        γ = −4.
        ( 2  1 −3 )             ( −2  1  5 )

Exercise 1.3.20. Verify Theorem 1.3.12 for

    A = ( 1  3  2 )        and        B = ( 3 −1 )
        ( 2  1 −3 )                       ( 2  4 )
                                          ( 1  2 ).

Exercise 1.3.21. Let A be in Mm×n(R) and let γ be a real number. Show that if γA = O, then γ = 0 or A = O.

Exercise 1.3.22. Determine all A in M2×2(R) such that AB = BA for any B in M2×2(R).


1.4. SPECIAL TYPES OF MATRICES

We have already introduced one special type of matrix, the zero m × n matrix, denoted by Om×n, whose entries consist entirely of zeros. We now consider several other types of matrices whose structure is rather specialized and for which it will be convenient to have special names.

DIAGONAL MATRICES

Definition 1.4.1. An n × n matrix A = [aij ] is called a diagonal matrix if aij = 0 for i ≠ j. Thus, for a diagonal matrix, the entries off the main diagonal are all zero.

SCALAR MATRICES

Definition 1.4.2. A scalar matrix is a diagonal matrix whose diagonal entries are equal.

IDENTITY MATRIX

Definition 1.4.3. The scalar matrix In = [aij ], where aii = 1 and aij = 0 for i ≠ j, is called the n × n identity matrix.

Example 1.4.1. Let

    A = ( 1  0  0 ),        B = ( 2  0  0 ),        and        I3 = ( 1  0  0 )
        ( 0  2  0 )             ( 0  2  0 )                         ( 0  1  0 )
        ( 0  0  3 )             ( 0  0  2 )                         ( 0  0  1 ).

Then A, B, and I3 are diagonal matrices; B and I3 are scalar matrices; and I3 is the 3 × 3 identity matrix.

It is easy to show that if A is any m × n matrix, then

AIn = A and ImA = A.

Also, if A is a scalar matrix, then A = α In for some scalar α.

Definition 1.4.4. Suppose that A is a square matrix, that is, A is in Mn×n(R) = Mn(R), the set of all matrices with the same number of rows and columns with real entries. If p is any positive integer, then we define

    Ap = A · A · · · · · A    (p factors of A).

If A is in Mn(R), then we also define A0 = In.

Definition 1.4.5. For nonnegative integers p and q, the familiar laws of exponents for the real numbers can also be proved for matrix multiplication (see Exercise 1.4.25) of a square matrix A:

    Ap Aq = Ap+q        and        (Ap)q = Apq.

It should be noted that the rule

    (AB)p = Ap Bp

does not hold for square matrices unless AB = BA (see Exercise 1.4.26).
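The exponent laws are easy to check by direct computation. A pure-Python sketch; the helper names `matmul` and `matpow` are ours, not the text's:

```python
# Matrix powers via repeated multiplication, checking A^p A^q = A^(p+q)
# and (A^p)^q = A^(pq).
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matpow(A, p):
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]  # A^0 = I_n
    for _ in range(p):
        result = matmul(result, A)
    return result

A = [[1, 1], [0, 1]]
assert matmul(matpow(A, 2), matpow(A, 3)) == matpow(A, 5)  # A^2 A^3 = A^5
assert matpow(matpow(A, 2), 3) == matpow(A, 6)             # (A^2)^3 = A^6
```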


UPPER AND LOWER TRIANGULAR MATRICES

Definition 1.4.6. An n × n matrix A = [aij ] is called upper triangular if aij = 0 for i > j. It is called lower triangular if aij = 0 for i < j.

Example 1.4.2. The matrix

    A = ( 1  3  3 )
        ( 0  3  5 )
        ( 0  0  2 )

is upper triangular, and

    B = ( 1  0  0 )
        ( 2  3  0 )
        ( 3  5  2 )

is lower triangular.

SYMMETRIC MATRICES

Definition 1.4.7. A matrix A is called symmetric if AT = A.

SKEW SYMMETRIC MATRICES

Definition 1.4.8. A matrix A is called skew symmetric if AT = −A.

Example 1.4.3.

    A = ( 1  2  3 )
        ( 2  4  5 )
        ( 3  5  6 )

is a symmetric matrix.

Example 1.4.4.

    B = (  0  2  3 )
        ( −2  0 −4 )
        ( −3  4  0 )

is a skew symmetric matrix.

We can make a few observations about symmetric and skew symmetric matrices; the proofs of most of these statements will be left as exercises. It follows from the definitions above that if A is symmetric or skew symmetric, then A is a square matrix. If A is a symmetric matrix, then the entries of A are symmetric with respect to the main diagonal of A. Also, A is symmetric if and only if aij = aji, and A is skew symmetric if and only if aij = −aji. Moreover, if A is skew symmetric, then the entries on the main diagonal of A are all zero. An important property of symmetric and skew symmetric matrices is the following.


Theorem 1.4.1. If A is an n × n matrix, then A = S + K, where S is symmetric and K is skew symmetric. Moreover, this decomposition is unique.

Proof. Assume that there is such a decomposition A = S + K, where S is symmetric and K is skew symmetric. We shall determine S and K. Now AT = ST + KT = S − K. Thus we have the expressions

    A  = S + K
    AT = S − K.

Adding these two expressions, we obtain A + AT = 2S, so

    S = (1/2)(A + AT ).

Subtracting instead of adding leads to

    K = (1/2)(A − AT ).

It is easy to verify that A = S + K, that S is symmetric, and that K is skew symmetric. Thus we have shown that such a representation is possible and that the expressions for S and K are unique. ⊔⊓

Example 1.4.5. Let

    A = ( 1  3 −2 )
        ( 4  6  2 )
        ( 5  1  3 ).

Then

    S = (1/2)(A + AT ) = (   1  7/2  3/2 )
                         ( 7/2    6  3/2 )
                         ( 3/2  3/2    3 ),

    K = (1/2)(A − AT ) = (   0 −1/2 −7/2 )
                         ( 1/2    0  1/2 )
                         ( 7/2 −1/2    0 ),

and A = S + K.
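The decomposition of Theorem 1.4.1 is a one-liner to compute exactly with rational arithmetic. A pure-Python sketch applied to the matrix of Example 1.4.5 (helper name `transpose` is ours):

```python
from fractions import Fraction

# Symmetric/skew-symmetric decomposition A = S + K (Theorem 1.4.1).
def transpose(A):
    return [list(row) for row in zip(*A)]

A = [[1, 3, -2], [4, 6, 2], [5, 1, 3]]
At = transpose(A)
n = len(A)
S = [[Fraction(A[i][j] + At[i][j], 2) for j in range(n)] for i in range(n)]
K = [[Fraction(A[i][j] - At[i][j], 2) for j in range(n)] for i in range(n)]

assert S == transpose(S)                                  # S is symmetric
assert K == [[-x for x in row] for row in transpose(K)]   # K is skew symmetric
assert [[S[i][j] + K[i][j] for j in range(n)] for i in range(n)] == A
```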

NONSINGULAR (or INVERTIBLE) MATRICES

We now come to a special type of square matrix and formulate the notion corresponding to the reciprocal of a nonzero real number.

Definition 1.4.9. An n × n matrix A is called nonsingular, or invertible, if there exists an n × n matrix B such that AB = BA = In; such a B is called an inverse of A. Otherwise, A is called singular, or noninvertible.

Example 1.4.6. Let

    A = ( 2  3 )        and        B = ( −1  3/2 )
        ( 2  2 )                       (  1   −1 ).

Since AB = BA = I2, we conclude that B is an inverse of A.

Theorem 1.4.2. The inverse of a matrix, if it exists, is unique.

Proof. Let B and C be inverses of A. Then AB = BA = In and AC = CA = In. We then have B = B In = B(AC) = (BA)C = In C = C, which proves that the inverse of a matrix, if it exists, is unique. ⊔⊓

We now write the inverse of a nonsingular matrix A as A−1. Thus

    AA−1 = A−1A = In.



Example 1.4.7. Let

    A = ( 1  2 )
        ( 3  4 ).

To find A−1, if it exists, we let

    A−1 = ( a  b )
          ( c  d ).

Then we must have

    AA−1 = ( 1  2 ) ( a  b ) = I2 = ( 1  0 )
           ( 3  4 ) ( c  d )        ( 0  1 ),

so that

    (  a + 2c   b + 2d ) = ( 1  0 )
    ( 3a + 4c  3b + 4d )   ( 0  1 ).

Equating corresponding entries of these two matrices, we obtain the linear systems

     a + 2c = 1          b + 2d = 0
    3a + 4c = 0   and   3b + 4d = 1.

The solutions are (you should verify this) a = −2, c = 3/2, b = 1, and d = −1/2. Moreover, since the matrix

    ( a  b ) = (  −2    1 )
    ( c  d )   ( 3/2 −1/2 )

also satisfies the property that

    (  −2    1 ) ( 1  2 ) = ( 1  0 )
    ( 3/2 −1/2 ) ( 3  4 )   ( 0  1 ),

we conclude that A is nonsingular and that

    A−1 = (  −2    1 )
          ( 3/2 −1/2 ).
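For 2 × 2 matrices the computation above can be packaged using the quantity ad − bc, whose role is the subject of Exercise 1.4.12. A pure-Python sketch (the function name `inverse_2x2` is ours); it reproduces Example 1.4.7 and detects the singular matrix of Example 1.4.8:

```python
from fractions import Fraction

# 2x2 inverse via the quantity ad - bc (cf. Exercise 1.4.12).
def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        return None  # singular: no inverse exists
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

print(inverse_2x2([[1, 2], [3, 4]]))  # the matrix (-2, 1; 3/2, -1/2) of Example 1.4.7
print(inverse_2x2([[1, 2], [2, 4]]))  # None -- the matrix of Example 1.4.8 is singular
```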

Example 1.4.8. Let

    A = ( 1  2 )
        ( 2  4 ).

To find A−1, if it exists, we let

    A−1 = ( a  b )
          ( c  d ).

Then we must have

    AA−1 = ( 1  2 ) ( a  b ) = I2 = ( 1  0 )
           ( 2  4 ) ( c  d )        ( 0  1 ),

so that

    (  a + 2c   b + 2d ) = ( 1  0 )
    ( 2a + 4c  2b + 4d )   ( 0  1 ).

Equating corresponding entries of these two matrices, we obtain the linear systems

     a + 2c = 1          b + 2d = 0
    2a + 4c = 0   and   2b + 4d = 1.

These linear systems have no solutions, so A has no inverse.

We next establish several properties of inverses of matrices.


Theorem 1.4.3. If A and B are both nonsingular matrices in Mn(R), then the matrix product AB is nonsingular and (AB)−1 = B−1A−1.

Proof. We have (AB)(B−1A−1) = A(BB−1)A−1 = (AIn)A−1 = AA−1 = In. Similarly, (B−1A−1)(AB) = In. Therefore, AB is nonsingular. Since the inverse of a matrix is unique, we conclude that (AB)−1 = B−1A−1. ⊔⊓

Corollary 1.4.1. If A1, A2, . . . , Ar are nonsingular matrices in Mn(R), then the matrix product A1A2 · · · Ar is nonsingular and (A1A2 · · · Ar)−1 = (Ar)−1(Ar−1)−1 · · · (A1)−1.

Proof. An easy exercise. ⊔⊓

Theorem 1.4.4. If A is a nonsingular matrix in Mn(R), then A−1 is nonsingular and (A−1)−1 = A.

Proof. An easy exercise. ⊔⊓

Theorem 1.4.5. If A is a nonsingular matrix in Mn(R), then AT is nonsingular and (AT )−1 = (A−1)T .

Proof. We have AA−1 = In. Taking transposes of both sides, we obtain (A−1)T AT = InT = In. Taking transposes of both sides of the equation A−1A = In, we find, similarly, that AT (A−1)T = In. These equations imply that (A−1)T = (AT )−1. ⊔⊓

Example 1.4.9. If

    A = ( 1  2 )
        ( 3  4 ),

then from Example 1.4.7

    A−1 = (  −2    1 )        and        (A−1)T = ( −2  3/2 )
          ( 3/2 −1/2 )                            (  1 −1/2 ).

Also (you should verify this!)

    AT = ( 1  3 )        and        (AT )−1 = ( −2  3/2 )
         ( 2  4 )                             (  1 −1/2 ).

It follows from Theorem 1.4.5 that if A is a symmetric nonsingular matrix, then A−1 is symmetric (see Exercise 1.4.14).

Suppose that A is nonsingular. Then AB = AC implies that B = C (see Exercise 1.4.8), and AB = On implies that B = On (see Exercise 1.4.11).


LINEAR SYSTEMS AND INVERSES

If A is an n × n matrix, then the linear system A X = B is a system of n equations in n unknowns. Suppose that A is nonsingular. Then A−1 exists and we can multiply both sides of A X = B by A−1, obtaining

    A−1(A X) = A−1B,

or

    In X = X = A−1B.

Moreover, X = A−1B is clearly a solution to the given linear system. Thus, if A is nonsingular, we have a unique solution.

Applications. This observation is useful in industrial problems. Many physical models are described by linear systems. This means that if n values are used as inputs (which can be arranged as the n × 1 matrix X), then m values are obtained as outputs (which can be arranged as the m × 1 matrix B) by the rule A X = B. The matrix A is inherently tied to the process. Thus suppose that a chemical process has a certain matrix A associated with it. Any change in the process may result in a new matrix. In fact, we speak of a black box, meaning that the internal structure of the process does not interest us. The problem frequently encountered in systems analysis is that of determining the input to be used to obtain a desired output. That is, we want to solve the linear system A X = B for X as we vary B. If A is a nonsingular square matrix, an efficient way of handling this is as follows. Compute A−1 once; then whenever we change B, we find the corresponding solution X by forming A−1B.

Example 1.4.10. Consider an industrial process whose matrix is the matrix A of Example 1.4.7. If B is the output matrix

    ( 8 )
    ( 6 ),

then the input matrix X is the solution to the linear system A X = B. Using the result from Example 1.4.7, we have

    X = A−1B = (  −2    1 ) ( 8 ) = ( −10 )
               ( 3/2 −1/2 ) ( 6 )   (   9 ).

On the other hand, if B is the output matrix

    ( 10 )
    ( 20 ),

then

    X = A−1 ( 10 ) = ( 0 )
            ( 20 )   ( 5 ).
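The "compute A−1 once, then reuse it for every new output B" strategy of Example 1.4.10 amounts to one matrix-vector product per query. A pure-Python sketch (helper name `matvec` is ours):

```python
from fractions import Fraction

# Reusing a precomputed inverse to solve A X = B for several outputs B
# (the A^{-1} below is the one found in Example 1.4.7).
def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

A_inv = [[-2, 1], [Fraction(3, 2), Fraction(-1, 2)]]

assert matvec(A_inv, [8, 6]) == [-10, 9]   # first output matrix of Example 1.4.10
assert matvec(A_inv, [10, 20]) == [0, 5]   # second output matrix
```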


SUGGESTED EXERCISES FOR SECTION 1.4

Exercise 1.4.1. (a) Show that if A is in Mm×n(R), then ImA = A and AIn = A.

(b) Show that if A is any scalar matrix in Mn(R), then A = γ In for some real number γ.

Exercise 1.4.2. Prove that the sum, product, and scalar multiple of diagonal, scalar, and upper (lower) triangular matrices is diagonal, scalar, and upper (lower) triangular, respectively.

Exercise 1.4.3.(a) Show that A is symmetric if and only if aij = aji for all i, j.

(b) Show that A is skew symmetric if and only if aij = − aji for all i, j.

(c) Show that if A is skew symmetric, then the elements on the main diagonal of A are all zero.

Exercise 1.4.4. Show that if A is a symmetric matrix, then AT is symmetric.

Exercise 1.4.5. Show that if A is any n × n matrix, then(a) AAT and AT A are symmetric.

(b) A + AT is symmetric.

(c) A − AT is skew symmetric.

Exercise 1.4.6. Let A and B be symmetric matrices.(a) Show that A + B is symmetric.

(b) Show that AB is symmetric if and only if AB = BA.

Exercise 1.4.7. Write the matrix

    A = (  3 −2  1 )
        (  5  2  3 )
        ( −1  6  2 )

as the sum of a symmetric and a skew symmetric matrix.

Exercise 1.4.8. Show that if AB = AC and A is nonsingular, then B = C.

Exercise 1.4.9. Find the inverse of

    A = ( 1  3 )
        ( 5  2 ).

Exercise 1.4.10.

Exercise 1.4.11. Show that if A is nonsingular and AB = On for an n × n matrix B, then B = On.

Exercise 1.4.12. Let

    A = ( a  b )
        ( c  d ).

Show that A is nonsingular if and only if ad − bc ≠ 0.

Exercise 1.4.13. Consider the linear system A X = B, where A is the matrix defined in Exercise 1.4.9.


(a) Find a solution if

    B = ( 3 )
        ( 4 ).

(b) Find a solution if

    B = ( 5 )
        ( 6 ).

Exercise 1.4.14. Prove that if A is symmetric and nonsingular, then A−1 is symmetric.

Exercise 1.4.15. Consider the homogeneous system A X = O, where A is n × n. If A is nonsingular, show thatthe only solution is the trivial one, X = O.

Exercise 1.4.16. Prove that if one row (column) of the n × n matrix A consists entirely of zeros, then A is singular. (Hint: Assume that A is nonsingular, that is, there exists an n × n matrix B such that AB = BA = In. Establish a contradiction.)

Exercise 1.4.17.

Exercise 1.4.18. Prove Corollary 1.4.1.

Exercise 1.4.19. Prove Theorem 1.4.4.

Exercise 1.4.20. Show that the matrix

    A = ( 2  3 )
        ( 4  6 )

is singular.

Exercise 1.4.21. Let

    A = ( 3  2 −1 )        and        B = ( 6 −3  2 )
        ( 0 −4  3 )                       ( 0  2  4 )
        ( 0  0  0 )                       ( 0  0  3 ).

Verify that A + B and AB are upper triangular.

Exercise 1.4.22. Find two 2 × 2 singular matrices whose sum is nonsingular.

Exercise 1.4.23. Find two 2 × 2 nonsingular matrices whose sum is singular.

Exercise 1.4.24. If A is a nonsingular matrix whose inverse is

    ( 2  1 )
    ( 4  1 ),

find A.

Exercise 1.4.25. Let p and q be nonnegative integers and let A be a square matrix. Show that

ApAq = Ap+q and (Ap)q = Apq.

Exercise 1.4.26. If AB = BA, and p is a nonnegative integer, show that (AB)p = ApBp.

Exercise 1.4.27. If p is a nonnegative integer and α is a scalar, show that

(α A)p = αpAp.


Exercise 1.4.28. Consider an industrial process with associated linear system A X = B, where A is n × n. Suppose that

    A−1 = ( 1  2 )
          ( 1  3 ).

Find the input matrix for each of the following output matrices:

    (a) ( 4 )        (b) (  8 )
        ( 6 ),           ( 15 ).

Exercise 1.4.29. If

    D = ( 4  0  0 )
        ( 0 −2  0 )
        ( 0  0  3 ),

find D−1.

Exercise 1.4.30. If

    A−1 = ( 3  2 )        and        B−1 = ( 2  5 )
          ( 1  3 )                         ( 3 −2 ),

find (AB)−1.

Exercise 1.4.31. Describe all skew symmetric scalar matrices.

Exercise 1.4.32. Describe all matrices that are both upper and lower triangular.


1.5. ECHELON FORM OF A MATRIX

In this section we take the elimination method for solving linear systems, learned in high school, and systematize it by introducing the language of matrices. This will result in two methods for solving a system of m linear equations in n unknowns. These methods take the augmented matrix of the linear system, perform certain operations on it, and obtain a new matrix that represents an equivalent linear system (that is, one that has the same solutions as the original linear system). The important point here is that the latter linear system can be solved very easily.

For example, if

    ( 1  2  0 |  3 )
    ( 0  1  1 |  2 )
    ( 0  0  1 | −1 )

represents the augmented matrix of a linear system, then the solution is easily found from the corresponding equations

    x1 + 2x2      =  3
         x2 + x3  =  2
              x3  = −1.

The task of this section is to manipulate the augmented matrix representing a given linear system into a form from which the solution can easily be found.

Definition 1.5.1. An m × n matrix A is said to be in reduced row echelon form (r.r.e.f.) if it satisfies the following properties:

(a) All rows consisting entirely of zeros, if any, are at the bottom of the matrix.

(b) The first nonzero entry (called the leading entry) in each row not consisting entirely of zeros must be a 1, called a leading one.

(c) If row i is any row not consisting entirely of zeros, then the leading one of row i is to the right of the leading ones occurring in any rows k where k < i.

(d) If a column contains a leading entry of some row, then all other entries in that column are zero.

If A satisfies properties (a), (b), and (c), it is said to be in row echelon form (r.e.f.). In Definition 1.5.1, there may be no rows that consist entirely of zeros.

A similar definition can be formulated in the obvious manner for reduced column echelon form (r.c.e.f.) and column echelon form (c.e.f.).

Example 1.5.1. The following are matrices in row echelon form:

    A = ( 1  5  0  2 −2  4 )        B = ( 1  0  0  0 )
        ( 0  1  0  3  4  8 )            ( 0  1  0  0 )
        ( 0  0  0  1  7 −2 )            ( 0  0  1  0 )
        ( 0  0  0  0  0  0 )            ( 0  0  0  1 ),
        ( 0  0  0  0  0  0 ),

and

    C = ( 0  0  1  3  5  7  9 )
        ( 0  0  0  0  1 −2  3 )
        ( 0  0  0  0  0  0  1 )
        ( 0  0  0  0  0  0  0 ).

Example 1.5.2. The following are matrices in reduced row echelon form:

    B = ( 1  0  0  0 )        D = ( 1  0  0  0 −2  4 )
        ( 0  1  0  0 )            ( 0  1  0  0  4  8 )
        ( 0  0  1  0 )            ( 0  0  0  1  7 −2 )
        ( 0  0  0  1 ),           ( 0  0  0  0  0  0 )
                                  ( 0  0  0  0  0  0 ),

and

    E = ( 1  2  0  0  1 )
        ( 0  0  1  2  3 )
        ( 0  0  0  0  0 ).

The following matrices are not in reduced row echelon form. (Why not?)

    F = ( 1  2  0  4 )        G = ( 1  0  3  4 )
        ( 0  0  0  0 )            ( 0  2 −2  5 )
        ( 0  0  1 −3 ),           ( 0  0  1  2 ),

    H = ( 1  0  3  4 )        J = ( 1  2  3  4 )
        ( 0  1 −2  5 )            ( 0  1 −2  5 )
        ( 0  1  2  2 )            ( 0  0  1  2 )
        ( 0  0  0  0 ),           ( 0  0  0  0 ).

We shall now show that every matrix can be put into row (column) echelon form, or into reduced row (column) echelon form, by means of certain row (column) operations.

Definition 1.5.2. An elementary row (column) operation on a matrix A is any one of the following operations:

(a) Interchange rows (columns) i and j of A.

(b) Multiply row (column) i of A by α ≠ 0.

(c) Add α times row (column) i of A to row (column) j of A, i ≠ j.

Observe that when a matrix is viewed as the augmented matrix of a linear system, the elementary row operations are equivalent, respectively, to interchanging two equations, multiplying an equation by a nonzero constant, and adding a multiple of one equation to another equation.

Definition 1.5.3. An m × n matrix A is said to be row (column) equivalent to an m × n matrix B if B can be obtained by applying a finite sequence of elementary row (column) operations to A.

Example 1.5.3. The matrix

    A = ( 1  2  4  3 )
        ( 2  1  3  2 )
        ( 1 −1  2  3 )

is row equivalent to

    D = ( 2  4  8  6 )
        ( 1 −1  2  3 )
        ( 4 −1  7  8 ),

because if we add twice row 3 of A to row 2 of A, we obtain

    B = ( 1  2  4  3 )
        ( 4 −1  7  8 )
        ( 1 −1  2  3 ).

Interchanging rows 2 and 3 of B, we obtain

    C = ( 1  2  4  3 )
        ( 1 −1  2  3 )
        ( 4 −1  7  8 ).

Multiplying row 1 of C by 2, we obtain D.

We can easily show (see Exercise 1.5.1) that:


(a) Every matrix is row equivalent to itself;

(b) if A is row equivalent to B, then B is row equivalent to A; and

(c) if A is row equivalent to B and B is row equivalent to C, then A is row equivalent to C.

In view of (b), both statements "A is row equivalent to B" and "B is row equivalent to A" can be replaced by "A and B are row equivalent." A similar statement holds for column equivalence.

Theorem 1.5.1. Every nonzero m × n matrix A = [aij ] is row (column) equivalent to a matrix in row (column)echelon form.

Proof. We shall prove that A is row equivalent to a matrix in row echelon form; that is, by using only elementary row operations we can transform A into a matrix in row echelon form. A completely analogous proof using elementary column operations establishes the result for column equivalence.

Step 1: We look in matrix A for the first column with a nonzero entry; say this is column j, and say that the first nonzero entry in column j occurs in row i; that is, aij ≠ 0. Now interchange (if necessary) rows 1 and i, thus obtaining a matrix B = [bij ] whose entry b1j ≠ 0.

Step 2: Multiply all entries in row 1 of B by 1/b1j, obtaining C = [cij ] whose entry c1j = 1.

Step 3: Now if chj ≠ 0, where 2 ≤ h ≤ m, then to row h of C we add −chj times row 1; afterward, all entries in column j, rows 2, 3, . . . , m, are zero. Denote the resulting matrix by D. Note that we have used only elementary row operations.

Step 4: Next, consider the (m − 1) × n submatrix A1 of D obtained by mentally deleting the first row of D.

We now repeat the four-step procedure above with submatrix A1 instead of matrix A. Continuing this way, we obtain a matrix H in row echelon form which is row equivalent to A. ⊔⊓
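The four-step procedure can be written out directly as a program. The following pure-Python sketch is our rendering of the proof's algorithm (the name `row_echelon` and the loop structure are ours); exact rational arithmetic avoids rounding:

```python
from fractions import Fraction

# The four-step procedure of Theorem 1.5.1: reduce A to row echelon form
# using only the three elementary row operations.
def row_echelon(A):
    A = [[Fraction(x) for x in row] for row in A]
    m, n = len(A), len(A[0])
    top = 0  # first row of the current submatrix (Step 4 works below this line)
    for j in range(n):
        # Step 1: first nonzero entry in column j at or below row `top`.
        pivot = next((i for i in range(top, m) if A[i][j] != 0), None)
        if pivot is None:
            continue
        A[top], A[pivot] = A[pivot], A[top]          # interchange rows
        A[top] = [x / A[top][j] for x in A[top]]     # Step 2: make a leading one
        for h in range(top + 1, m):                  # Step 3: clear the column below
            factor = A[h][j]
            A[h] = [x - factor * y for x, y in zip(A[h], A[top])]
        top += 1                                     # Step 4: move to the submatrix
    return A

H = row_echelon([[0, 2, 3, -4, 1],
                 [0, 0, 2, 3, 4],
                 [2, 2, -5, 2, 4],
                 [2, 0, -6, 9, 7]])  # the matrix A of Example 1.5.4
```

Run on the matrix of Example 1.5.4 below, this reproduces exactly the matrix H obtained there by hand.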


Example 1.5.4. Let

    A = ( 0  2  3 −4  1 )
        ( 0  0  2  3  4 )
        ( 2  2 −5  2  4 )
        ( 2  0 −6  9  7 ).

Find a matrix H in row echelon form which is row equivalent to A.

Solution. Step 1: Column 1 is the first (counting from left to right) column in A with a nonzero entry. The first (counting from top to bottom) nonzero entry in the first column occurs in the third row, that is, a31 ≠ 0. We interchange the first and third rows of A to produce the matrix

    B = ( 2  2 −5  2  4 )
        ( 0  0  2  3  4 )
        ( 0  2  3 −4  1 )
        ( 2  0 −6  9  7 ).

Step 2: Multiply the first row of B by 1/b11 = 1/2 to produce the matrix

    C = ( 1  1 −5/2  1  2 )
        ( 0  0    2  3  4 )
        ( 0  2    3 −4  1 )
        ( 2  0   −6  9  7 ).

Step 3: Add −2 times the first row of C to the fourth row of C to produce a matrix D in which the only nonzero entry in the first column is d11 = 1:

    D = ( 1  1 −5/2  1  2 )
        ( 0  0    2  3  4 )
        ( 0  2    3 −4  1 )
        ( 0 −2   −1  7  3 ).

Step 4: Identify the submatrix A1 obtained by mentally deleting the first row of D; do not physically erase the first row of D.

Repeat the four steps above using the submatrix A1, which we write below with the first row of D separated from it:

         ( 1  1 −5/2  1  2 )

    A1 = ( 0   0   2   3  4 )
         ( 0   2   3  −4  1 )
         ( 0  −2  −1   7  3 ).

Interchange the first and second rows of A1 to obtain

         ( 1  1 −5/2  1  2 )

    B1 = ( 0   2   3  −4  1 )
         ( 0   0   2   3  4 )
         ( 0  −2  −1   7  3 ).

Multiply the first row of B1 by 1/2 to obtain

         ( 1  1 −5/2  1  2 )

    C1 = ( 0   1  3/2  −2  1/2 )
         ( 0   0    2   3    4 )
         ( 0  −2   −1   7    3 ).

Add 2 times the first row of C1 to the third row to obtain

         ( 1  1 −5/2  1  2 )

    D1 = ( 0  1  3/2  −2  1/2 )
         ( 0  0    2   3    4 )
         ( 0  0    2   3    4 ).


Identify the submatrix A2 obtained by mentally deleting the first row of D1; do not physically erase the first row of D1.

Repeat the four steps above using the submatrix A2, which we write below with the rows already processed separated from it. No rows of A2 need to be interchanged, so we have B2 = A2.

    ( 1  1 −5/2   1    2 )
    ( 0  1  3/2  −2  1/2 )

    A2 = ( 0  0  2  3  4 ) = B2
         ( 0  0  2  3  4 )

Multiply the first row of B2 by 1/2 to obtain

    ( 1  1 −5/2   1    2 )
    ( 0  1  3/2  −2  1/2 )

    C2 = ( 0  0  1  3/2  2 )
         ( 0  0  2    3  4 ).

Finally, add −2 times the first row of C2 to its second row to obtain

    ( 1  1 −5/2   1    2 )
    ( 0  1  3/2  −2  1/2 )

    D2 = ( 0  0  1  3/2  2 )
         ( 0  0  0    0  0 ).

The matrix

    H = ( 1  1 −5/2    1    2 )
        ( 0  1  3/2   −2  1/2 )
        ( 0  0    1  3/2    2 )
        ( 0  0    0    0    0 )

is in row echelon form and is row equivalent to A. ⊔⊓

When doing hand computations, it is sometimes possible to avoid messy fractions by suitably modifying the steps in the procedure.

Theorem 1.5.2. Every nonzero m × n matrix A = [aij ] is row (column) equivalent to a matrix in reduced row(column) echelon form.

Proof. We proceed as in Theorem 1.5.1, obtaining a matrix H in row echelon form which is row equivalent to A. In H, if row i contains a nonzero entry, then its first (counting from left to right) nonzero entry is a leading one; suppose this leading one occurs in column ci. Then c1 < c2 < · · · < ci < · · · < cr, where r (1 ≤ r ≤ m) is the number of nonzero rows in H. Add suitable multiples of row i of H to all the rows of H preceding row i to make all entries in column ci in rows i − 1, i − 2, . . . , 1 of H equal to zero, except the leading ones. The result is a matrix K in reduced row echelon form which has been obtained from H by elementary row operations only and is thus row equivalent to H. Since A is row equivalent to H and H is row equivalent to K, A is row equivalent to K. An analogous proof can be given to show that A is column equivalent to a matrix in reduced column echelon form. ⊔⊓

It can be shown, with some difficulty, that:

For a given nonzero m × n matrix A, there is only one matrix B in reduced row (column) echelon form that is row (column) equivalent to A.

The proof of this statement is omitted.


Example 1.5.5. Suppose that we wish to find a matrix in reduced row echelon form that is row equivalent to the matrix A of Example 1.5.4. Starting with the matrix H obtained there, we add −1 times the second row to the first row, obtaining

    ( 1  0   −4    3  3/2 )
    ( 0  1  3/2   −2  1/2 )
    ( 0  0    1  3/2    2 )
    ( 0  0    0    0    0 ).

In this matrix we add −3/2 times the third row to its second row and 4 times the third row to its first row. This yields

    ( 1  0  0      9  19/2 )
    ( 0  1  0  −17/4  −5/2 )
    ( 0  0  1    3/2     2 )
    ( 0  0  0      0     0 ),

which is in reduced row echelon form and is row equivalent to A.
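The passage from H to reduced row echelon form in the proof of Theorem 1.5.2 is a second sweep that clears entries above each leading one. A pure-Python sketch of that sweep (the function name `reduce_up` is ours), applied to the H of Example 1.5.4:

```python
from fractions import Fraction

# Clear above each leading one: row echelon form -> reduced row echelon form
# (the second stage described in the proof of Theorem 1.5.2).
def reduce_up(H):
    H = [row[:] for row in H]
    for i, row in enumerate(H):
        nz = [j for j, x in enumerate(row) if x != 0]
        if not nz:
            continue
        j = nz[0]  # column of the leading one in row i
        for k in range(i):  # add suitable multiples of row i to the rows above it
            H[k] = [x - H[k][j] * y for x, y in zip(H[k], row)]
    return H

H = [[1, 1, Fraction(-5, 2), 1, 2],
     [0, 1, Fraction(3, 2), -2, Fraction(1, 2)],
     [0, 0, 1, Fraction(3, 2), 2],
     [0, 0, 0, 0, 0]]
K = reduce_up(H)
```

The result K agrees entry for entry with the reduced row echelon matrix displayed in Example 1.5.5.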

We now apply these results to the solution of linear systems.

Theorem 1.5.3. Let A X = B and C X = D be two linear systems, each of m equations in n unknowns. If the augmented matrices (A | B) and (C | D) are row equivalent, then the linear systems are equivalent; that is, they have exactly the same solutions.

Proof. This follows from the definition of row equivalence and from the fact that the three elementary row operations on the augmented matrix are the three manipulations on linear systems, discussed in Section 1.1, which yield equivalent linear systems. We also note that if one system has no solution, then the other system has no solution. ⊔⊓

Corollary 1.5.1. If A and B are row equivalent m × n matrices, then the homogeneous systems A X = O and B X = O are equivalent.

Proof. An easy exercise. ⊔⊓

We now pause to observe that we have developed the essential features of two very straightforward methods for solving linear systems. The idea consists of starting with the linear system A X = B, then obtaining a partitioned matrix (C | D) in either row echelon form or reduced row echelon form that is row equivalent to the augmented matrix (A | B). Now (C | D) represents the linear system C X = D, which is quite simple to solve because of the structure of (C | D), and the set of solutions to this system gives precisely the set of solutions to A X = B. The method where (C | D) is in row echelon form is called Gaussian elimination; the method where (C | D) is in reduced row echelon form is called Gauss-Jordan reduction. These methods are used often, and computer codes implementing them are widely available.


We thus consider the linear system C X = D, where C is m × n, and (C | D) is in row echelon form. Then, for example, (C | D) is of the following form:

    ( 1  c12  c13  · · ·  · · ·     c1n  |  d1   )
    ( 0   0    1   c24    · · ·     c2n  |  d2   )
    ( ...                                |  ...  )
    ( 0  · · ·  0    1   c(k−1)n  · · ·  |  dk−1 )
    ( 0  · · ·  · · ·  0    1     · · ·  |  dk   )
    ( 0  · · ·  · · ·  · · ·  0          |  dk+1 )
    ( ...                                |  ...  )
    ( 0  · · ·  · · ·  · · ·  0          |  dm   )

This augmented matrix represents the linear system

    x1 + c12x2 + c13x3 + · · · + c1nxn = d1
              x3 + c24x4 + · · · + c2nxn = d2
                     ...
                  xn−1 + c(k−1)nxn = dk−1
                                xn = dk
         0x1 + · · · + 0xn = dk+1
                     ...
         0x1 + · · · + 0xn = dm.

First, if dk+1 = 1, then the linear system C X = D has no solution, since at least one equation is not satisfied. If dk+1 = 0, which implies that dk+2 = · · · = dm = 0, we then obtain xn = dk, xn−1 = dk−1 − c(k−1)nxn = dk−1 − c(k−1)ndk, and continue using backward substitution to find the remaining unknowns corresponding to the leading entry in each row. Of course, in the solution, some of the unknowns may be expressed in terms of others that can take on any values whatsoever. This merely indicates that C X = D has infinitely many solutions. On the other hand, every unknown may have a determined value, indicating that the solution is unique.

Example 1.5.6. Consider the linear system C X = D, whose augmented matrix (C | D) is in row echelon form:

    (C | D) = ( 1  2  3  4   5 | 6 )
              ( 0  1  2  3  −1 | 7 )
              ( 0  0  1  2   3 | 7 )
              ( 0  0  0  1   2 | 9 ).

Then

    x4 = 9 − 2x5
    x3 = 7 − 2x4 − 3x5 = 7 − 2(9 − 2x5) − 3x5 = −11 + x5
    x2 = 7 − 2x3 − 3x4 + x5 = 2 + 5x5
    x1 = 6 − 2x2 − 3x3 − 4x4 − 5x5 = −1 − 10x5
    x5 = any real number.

Thus all solutions X are of the form

    x1 = −1 − 10r
    x2 = 2 + 5r
    x3 = −11 + r
    x4 = 9 − 2r
    x5 = r, any real number,

or

        (  −1 )   ( −10 )
        (   2 )   (   5 )
    X = ( −11 ) + (   1 ) r, where r is any real number.
        (   9 )   (  −2 )
        (   0 )   (   1 )

Since r can be assigned any real number, the given linear system has infinitely many solutions.
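The one-parameter family of solutions found in Example 1.5.6 can be verified by substituting it back into C X = D for several values of r. A pure-Python check (the helper name `solution` is ours):

```python
from fractions import Fraction

# Substitute the solution family of Example 1.5.6 back into C X = D.
C = [[1, 2, 3, 4, 5],
     [0, 1, 2, 3, -1],
     [0, 0, 1, 2, 3],
     [0, 0, 0, 1, 2]]
D = [6, 7, 7, 9]

def solution(r):
    return [-1 - 10 * r, 2 + 5 * r, -11 + r, 9 - 2 * r, r]

for r in (0, 1, Fraction(-3, 7)):
    X = solution(r)
    assert [sum(c * x for c, x in zip(row, X)) for row in C] == D
```

Since each equation is linear in r and holds for more values of r than its degree, it in fact holds for every real r.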


Example 1.5.7. If

    (C | D) = ( 1  2  3  4 | 5 )
              ( 0  1  2  3 | 6 )
              ( 0  0  0  0 | 1 )

is the augmented matrix in row echelon form of the linear system A X = B, then the linear system C X = D has no solution, for the last equation is

    0x1 + 0x2 + 0x3 + 0x4 = 1,

which can never be satisfied. Thus A X = B has no solution.

Example 1.5.8. If

    (C | D) = ( 1  2  3  4 | 5 )
              ( 0  1  2  3 | 6 )
              ( 0  0  1  2 | 7 )
              ( 0  0  0  1 | 8 ),

then

    x1 = 0
    x2 = 0
    x3 = −9
    x4 = 8,

or

    X = (  0 )
        (  0 )
        ( −9 )
        (  8 ).

The solution to C X = D is unique.

If (C | D) is in reduced row echelon form, then we can solve C X = D without backward substitution; but, of course, it takes more effort to put a matrix in reduced row echelon form than in row echelon form.

Example 1.5.9. If

    (C | D) = ( 1  0  0  0 | 5 )
              ( 0  1  0  0 | 6 )
              ( 0  0  1  0 | 7 )
              ( 0  0  0  1 | 8 ),

then

    X = ( 5 )
        ( 6 )
        ( 7 )
        ( 8 ).

Example 1.5.10. If

    (C | D) = ( 1  1  2  0  −5/2 | 3/2 )
              ( 0  0  0  1   1/2 | 1/2 )
              ( 0  0  0  0     0 |   0 ),

then

    x4 = 1/2 − (1/2)x5
    x1 = 3/2 − x2 − 2x3 + (5/2)x5,

where x2, x3, and x5 can take on any real values. Thus a solution is of the form

    x1 = 3/2 − r − 2s + (5/2)t
    x2 = r
    x3 = s
    x4 = 1/2 − (1/2)t
    x5 = t,

or in matrix form

        ( 3/2 )   ( −1 )     ( −2 )     (  5/2 )
        (   0 )   (  1 )     (  0 )     (    0 )
    X = (   0 ) + (  0 ) r + (  1 ) s + (    0 ) t,
        ( 1/2 )   (  0 )     (  0 )     ( −1/2 )
        (   0 )   (  0 )     (  0 )     (    1 )

where r, s, and t are any real numbers.

We now solve a linear system both by Gaussian elimination and by Gauss-Jordan reduction.

Example 1.5.11. Find solutions (if any) of the linear system

     x1 + 2x2 + 3x3 =  6
    2x1 − 3x2 + 2x3 = 14
    3x1 +  x2 −  x3 = −2.

Solution. We form the augmented matrix of the system:

    ( 1   2   3 |  6 )
    ( 2  −3   2 | 14 )
    ( 3   1  −1 | −2 ).

Next, we start the Gaussian elimination method by applying the elementary row operations to the above augmented matrix.

Add −2 times the first row to the second row to obtain

    ( 1   2   3 |  6 )
    ( 0  −7  −4 |  2 )
    ( 3   1  −1 | −2 ).

Add −3 times the first row to the third row to obtain

    ( 1   2    3 |   6 )
    ( 0  −7   −4 |   2 )
    ( 0  −5  −10 | −20 ).

Multiply the third row by −1/5 and interchange the second and third rows to obtain

    ( 1   2   3 | 6 )
    ( 0   1   2 | 4 )
    ( 0  −7  −4 | 2 ).

Add 7 times the second row to the third row to obtain

    ( 1  2   3 |  6 )
    ( 0  1   2 |  4 )
    ( 0  0  10 | 30 ).

Multiply the third row by 1/10 to obtain

    ( 1  2  3 | 6 )
    ( 0  1  2 | 4 )
    ( 0  0  1 | 3 ).

Page 44: ELEMENTARY LINEAR ALGEBRA AND VECTOR CALCULUS

The above matrix is in row echelon form. This means that x3 = 3, and from the second row

x2 + 2x3 = 4,

so

x2 = 4 − 2(3) = −2.

From the first row,

x1 + 2x2 + 3x3 = 6,

which implies that

x1 = 6 − 2x2 − 3x3 = 6 − 2(−2) − 3(3) = 1.

Thus x1 = 1, x2 = −2, and x3 = 3 is the solution to the system. This gives the solution by Gaussian elimination. ⊔⊓

If, instead, we wish to use Gauss-Jordan reduction, we would transform the last matrix into reduced row echelon form by the following steps:

Add −2 times the second row to the first row:

1 0 −1 | −2
0 1  2 | 4
0 0  1 | 3

Add −2 times the third row to the second row to obtain

1 0 −1 | −2
0 1  0 | −2
0 0  1 | 3

Add the third row to the first row to obtain

1 0 0 | 1
0 1 0 | −2
0 0 1 | 3.

The solution is x1 = 1, x2 = −2, and x3 = 3, as before.
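The reduction just performed can be sketched in Python. The following minimal Gauss-Jordan routine is our own illustration (not code from the text); it uses exact Fraction arithmetic and reproduces the solution of Example 1.5.11:

```python
from fractions import Fraction

def gauss_jordan(aug):
    """Bring an augmented matrix [A | b] to reduced row echelon form
    using the three elementary row operations."""
    m = [[Fraction(v) for v in row] for row in aug]
    rows, cols = len(m), len(m[0])
    lead = 0
    for col in range(cols - 1):
        if lead == rows:
            break
        pivot = next((r for r in range(lead, rows) if m[r][col] != 0), None)
        if pivot is None:
            continue                              # no pivot in this column
        m[lead], m[pivot] = m[pivot], m[lead]     # interchange rows
        piv = m[lead][col]
        m[lead] = [v / piv for v in m[lead]]      # scale to get a leading 1
        for r in range(rows):                     # clear the rest of the column
            if r != lead and m[r][col] != 0:
                f = m[r][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[lead])]
        lead += 1
    return m

# The system of Example 1.5.11.
rref = gauss_jordan([[1, 2, 3, 6], [2, -3, 2, 14], [3, 1, -1, -2]])
solution = [row[-1] for row in rref]  # unique solution: x1 = 1, x2 = -2, x3 = 3
```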

NOW WE CONSIDER A HOMOGENEOUS SYSTEM A X = O CONSISTING OF m EQUATIONS IN n UNKNOWNS.

(A | O) =

a11 a12 · · · a1n | 0
a21 a22 · · · a2n | 0
 ·   ·         ·  | ·
am1 am2 · · · amn | 0

Example 1.5.12. Consider the homogeneous system whose augmented matrix is

1 0 0 0 2 | 0
0 0 1 0 3 | 0
0 0 0 1 4 | 0
0 0 0 0 0 | 0.

Since the augmented matrix is in reduced row echelon form, the solution is easily seen to be

x1 = −2r
x2 = s
x3 = −3r
x4 = −4r
x5 = r,

or X = r(−2, 0, −3, −4, 1) + s(0, 1, 0, 0, 0), where r and s are any real numbers.
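The general solution of Example 1.5.12 can be verified mechanically. The short check below (our own illustration, not code from the text) confirms that every choice of r and s satisfies A X = O:

```python
# Coefficient matrix of Example 1.5.12 (the zero right-hand sides dropped).
A = [[1, 0, 0, 0, 2],
     [0, 0, 1, 0, 3],
     [0, 0, 0, 1, 4],
     [0, 0, 0, 0, 0]]

def solution(r, s):
    """The general solution X = r(-2, 0, -3, -4, 1) + s(0, 1, 0, 0, 0)."""
    basis_r = [-2, 0, -3, -4, 1]
    basis_s = [0, 1, 0, 0, 0]
    return [r * a + s * b for a, b in zip(basis_r, basis_s)]

def times(M, x):
    """Matrix-vector product M x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in M]

# Every choice of r and s satisfies A X = O, so the system has
# infinitely many solutions, including nontrivial ones.
for r, s in [(1, 0), (0, 1), (3, -7)]:
    assert times(A, solution(r, s)) == [0, 0, 0, 0]
```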


In Example 1.5.12 we solved a homogeneous system of m (= 4) linear equations in n (= 5) unknowns, where m < n and the augmented matrix A was in reduced row echelon form. We can ignore any row of the augmented matrix that consists entirely of zeros. Thus let rows 1, 2, . . . , r of A be the nonzero rows, and let the 1 in row i occur in column ci. We are then solving a homogeneous system of r equations in n unknowns, r < n, and in this special case (A is in reduced row echelon form) we can solve for xc1, xc2, . . . , xcr in terms of the remaining n − r unknowns. Since the latter can take on any real values, there are infinitely many solutions to the system A X = O; in particular, there is a nontrivial solution. We now show that this situation holds whenever we have m < n; A does not have to be in reduced row echelon form.

Theorem 1.5.4. Any homogeneous system A X = O consisting of m linear equations in n unknowns always has a nontrivial solution if m < n, that is, if the number of unknowns exceeds the number of equations.

Proof. Let B be a matrix in reduced row echelon form that is row equivalent to A. Then the homogeneous systems A X = O and B X = O are equivalent. If we let r be the number of nonzero rows of B, then r ≤ m. If m < n, we conclude that r < n. We are then solving r equations in n unknowns and can solve for r unknowns in terms of the remaining n − r unknowns, the latter being free to take on any values we please. Thus B X = O, and hence A X = O, has a nontrivial solution. ⊔⊓

We shall soon use this result in the following equivalent form: If A is m × n and A X = O has only the trivial solution, then m ≥ n.

Example 1.5.13. Consider the homogeneous system

x1 + x2 + x3 + x4 = 0
x1             + x4 = 0
x1 + 2x2 + x3      = 0.

The augmented matrix

A =

1 1 1 1 | 0
1 0 0 1 | 0
1 2 1 0 | 0

is row equivalent to

1 0 0  1 | 0
0 1 0 −1 | 0
0 0 1  1 | 0.

Hence the solution is

x1 = −r
x2 = r
x3 = −r
x4 = r,

or X = r(−1, 1, −1, 1), where r is any real number.

A useful property of matrices in reduced row echelon form (see Exercise 1.5.3) is that if A is an n × n matrix in reduced row echelon form and A ≠ In, then A has a row consisting entirely of zeros.


SUGGESTED EXERCISES FOR SECTION 1.5

Exercise 1.5.1. Prove the following statements:

(a) Every matrix is row equivalent to itself.

(b) If A is row equivalent to B, then B is row equivalent to A.

(c) If A is row equivalent to B and B is row equivalent to C, then A is row equivalent to C.

Exercise 1.5.2. Let

A =

0 0 −1  2 3
0 2  3  4 5
0 1  3 −1 2
0 3  2  4 1.

(a) Find a matrix B in row echelon form that is row equivalent to A.

(b) Find a matrix C in reduced row echelon form that is row equivalent to A.

Exercise 1.5.3. Let A be an n × n matrix in reduced row echelon form. Prove that if A ≠ In, then A has a row consisting entirely of zeros.

Exercise 1.5.4. Let

A =

1 −2  0 2
2 −3 −1 5
1  3  2 5
1  1  0 2.

(a) Find a matrix B in row echelon form that is row equivalent to A.

(b) Find a matrix C in reduced row echelon form that is row equivalent to A.

Exercise 1.5.5. Consider the linear system

x1 + x2 + 2x3 = −1
x1 − 2x2 + x3 = −5
3x1 + x2 + x3 = 3.

(a) Find all solutions, if any exist, by using the Gaussian elimination method.

(b) Find all solutions, if any exist, by using the Gauss-Jordan reduction method.

Page 47: ELEMENTARY LINEAR ALGEBRA AND VECTOR CALCULUS

Exercise 1.5.6. Repeat Exercise 1.5.5 for each of the following linear systems.

(a) x1 + x2 + 2x3 + 3x4 = 13
    x1 − 2x2 + x3 + x4 = 8
    3x1 + x2 + x3 − x4 = 1.

(b) x1 + x2 + x3 = 1
    x1 + x2 − x3 = 3
    2x1 + x2 + x3 = 2.

(c) 2x1 + x2 + x3 − 2x4 = 1
    3x1 − 2x2 + x3 − 6x4 = −2
    x1 + x2 − x3 − x4 = −1
    6x1 + x3 − 9x4 = −2
    5x1 − x2 + 2x3 − 8x4 = 3.

Exercise 1.5.7. In Exercises 1.5.7, 1.5.8, and 1.5.9, solve the linear system, if it is consistent, with the given augmented matrix.

(a) 1 1 1 | 0
    1 1 0 | 3
    0 1 1 | 1

(b) 1 2 3 | 0
    1 1 1 | 0
    1 1 2 | 0

(c) 1 2 3 | 0
    1 1 1 | 0
    5 7 9 | 0

(d) 1 2 3 | 0
    1 2 1 | 0

Exercise 1.5.8.

(a) 1 2 3 1 | 8
    1 3 0 1 | 7
    1 0 2 1 | 3

(b) 1 1 3 −3 | 0
    0 2 1 −3 | 3
    1 0 2 −1 | −1

Exercise 1.5.9.

(a) 1 2 1 | 7
    2 0 1 | 4
    1 0 2 | 5
    1 2 3 | 11
    2 1 4 | 11

(b) 1 2 1 | 0
    2 3 0 | 0
    0 1 2 | 0
    2 1 4 | 0

Exercise 1.5.10. Let

A =

a b
c d

and X =

x1
x2.

Show that the linear system A X = O has only the trivial solution if and only if ad − bc ≠ 0.

Exercise 1.5.11. Show that

A =

a b
c d

is row equivalent to I2 if and only if ad − bc ≠ 0.

Exercise 1.5.12. In the following linear system, determine all values of λ for which the resulting linear system has:

(a) No solution.

(b) A unique solution.

(c) Infinitely many solutions.

x1 + x2 − x3 = 2
x1 + 2x2 + x3 = 3
x1 + x2 + (λ2 − 5)x3 = λ.


Exercise 1.5.13. Repeat Exercise 1.5.12 for the linear system

x1 + x2 + x3 = 2
2x1 + 3x2 + 2x3 = 5
2x1 + 3x2 + (λ2 − 1)x3 = λ + 1.

Exercise 1.5.14. (a) Formulate the definitions of column echelon form and reduced column echelon form of a matrix.

(b) Prove that every m × n nonzero matrix is column equivalent to a matrix in column echelon form.

Exercise 1.5.15. Prove that every m × n nonzero matrix is column equivalent to a matrix in reduced column echelon form.

Exercise 1.5.16. Repeat Exercise 1.5.12 for the linear system

x1 + x2 = 3
x1 + (λ2 − 8)x2 = λ.

Exercise 1.5.17. Let A be the matrix in Exercise 1.5.2.

(a) Find a matrix in column echelon form that is column equivalent to A.

(b) Find a matrix in reduced column echelon form that is column equivalent to A.

Exercise 1.5.18. Repeat Exercise 1.5.12 for the linear system

x1 + x2 + x3 = 2
x1 + 2x2 + x3 = 3
x1 + x2 + (λ2 − 5)x3 = λ.

Exercise 1.5.19. Repeat Exercise 1.5.17 for the matrix

1 2 3  4 5
2 1 3 −1 2
3 1 2  4 1.

Exercise 1.5.20. Show that if the homogeneous system

(a − r)x1 + dx2 = 0
cx1 + (b − r)x2 = 0

has a nontrivial solution, then r satisfies the equation (a − r)(b − r) − cd = 0.

Exercise 1.5.21. Let A X = B, B ≠ O, be a consistent linear system.

(a) Show that if X1 is a solution to the linear system A X = B and Y1 is a solution to the associated homogeneous system A X = O, then X1 + Y1 is a solution to the system A X = B.

(b) Show that every solution X to A X = B can be written as X1 + Y1, where X1 is a particular solution to A X = B and Y1 is a solution to A X = O. [Hint: Let X = X1 + (X − X1).]

Note: It is suggested that you use the methods of this section to solve the exercises in Section 1.1.


1.6. ELEMENTARY MATRICES; FINDING A−1

In this section we develop a method for finding the inverse A−1 of a square matrix A if it exists. This method does not require us to first find out whether A−1 exists: given the square matrix A, we simply start to compute A−1; if in the course of the computation we hit a certain condition, then we know that A−1 does not exist, and otherwise we proceed to the end and obtain A−1. This method requires that the three elementary row operations (a), (b), and (c) defined in the previous section be performed on A. We clarify these notions by starting with the following definition.

Definition 1.6.1. An n × n matrix E is called an elementary matrix if it can be obtained by performing a single elementary row or elementary column operation on In.

Example 1.6.1. The following are elementary matrices:

E1 =

0 0 1
0 1 0
1 0 0,

E2 =

1  0 0
0 −2 0
0  0 1,

E3 =

1 2 0
0 1 0
0 0 1,

and E4 =

1 0 3
0 1 0
0 0 1.

Matrix E1 was obtained by interchanging the first and third rows of I3; E2 was obtained by multiplying the second row of I3 by −2; E3 was obtained by adding 2 times the second row of I3 to its first row; and E4 was obtained by adding 3 times the first column of I3 to its third column.

Theorem 1.6.1. Let A be an m × n matrix and let B be the matrix obtained by performing a single elementary row (column) operation on A. Let E be the elementary matrix obtained from Im (In) by performing the same elementary row (column) operation as was performed on A. Then B = EA (B = AE).

Proof. An easy exercise. ⊔⊓

Theorem 1.6.1 says that an elementary row operation on A can be achieved by pre-multiplying A (multiplying Aon the left) by the corresponding elementary matrix E; an elementary column operation on A can be obtained bypost-multiplying A (multiplying A on the right) by the corresponding elementary matrix.

Example 1.6.2. Let

A =

 1 3 2 1
−1 2 3 4
 3 0 1 2

and let B result from A by adding −2 times the third row of A to its first row. Thus

B =

−5 3 0 −3
−1 2 3  4
 3 0 1  2.

Now let E be the elementary matrix that is obtained from I3 by adding −2 times the third row of I3 to its first row. Thus

E =

1 0 −2
0 1  0
0 0  1.

It is easy to verify that B = EA.
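The claim B = EA for Example 1.6.2 is easy to check numerically (a small illustration, not code from the text):

```python
def matmul(P, Q):
    """Product of two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

A = [[1, 3, 2, 1],
     [-1, 2, 3, 4],
     [3, 0, 1, 2]]

# E: add -2 times the third row of I3 to its first row.
E = [[1, 0, -2],
     [0, 1, 0],
     [0, 0, 1]]

B = matmul(E, A)
# Premultiplying by E performs the row operation on A itself:
assert B == [[-5, 3, 0, -3],
             [-1, 2, 3, 4],
             [3, 0, 1, 2]]
```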

Theorem 1.6.2. If A and B are nonzero m × n matrices, then A is row (column) equivalent to B if and only if B = EkEk−1 · · · E2E1A (B = AE1E2 · · · Ek−1Ek), where E1, E2, . . . , Ek−1, Ek are elementary matrices.

Proof. We prove the theorem only for row equivalence. If A is row equivalent to B, then B results from A by a sequence of elementary row operations. This implies that there exist elementary matrices E1, E2, . . . , Ek such that B = EkEk−1 · · · E2E1A. Conversely, if B = EkEk−1 · · · E2E1A, where the Ei are elementary matrices, then B results from A by a sequence of elementary row operations, which implies that A is row equivalent to B. ⊔⊓


Theorem 1.6.3. An elementary matrix E is nonsingular, and its inverse E−1 is an elementary matrix of the same type.

Proof. An easy exercise. ⊔⊓

Thus an elementary row operation can be “undone” by another elementary row operation of the same type.

We now obtain an algorithm for finding the inverse A−1 (if it exists) of a given square matrix A; first, we prove the following lemma.

Lemma 1.6.1. Let A be an n × n matrix and let the homogeneous system A X = O have only the trivial solutionX = O. Then A is row equivalent to In.

Proof. Let B be a matrix in reduced row echelon form which is row equivalent to A. Then the homogeneous systems A X = O and B X = O are equivalent, and thus B X = O also has only the trivial solution. It is clear that if r is the number of nonzero rows of B, then the homogeneous system B X = O is equivalent to the homogeneous system whose coefficient matrix consists of the nonzero rows of B and is therefore r × n. Since this last homogeneous system has only the trivial solution, we conclude from Theorem 1.5.4 that r ≥ n. Since B is n × n, r ≤ n. Hence r = n, which means that B has no zero rows. Thus B = In. ⊔⊓

Theorem 1.6.4. An n × n matrix A is nonsingular if and only if A is a product of elementary matrices (a finite number).

Proof. If A is a product of finitely many elementary matrices E1, E2, . . . , Ek, then A = E1E2 · · · Ek−1Ek. Now each elementary matrix is nonsingular, and the product of nonsingular matrices is again nonsingular; therefore, A is nonsingular. Conversely, if A is nonsingular, then A X = O implies that A−1(A X) = A−1O = O, so In X = O or X = O. Thus A X = O has only the trivial solution. Lemma 1.6.1 then implies that A is row equivalent to In. This means that there exist elementary matrices E1, E2, . . . , Ek such that

In = EkEk−1 · · · E2E1A.

It then follows that A = (EkEk−1 · · · E2E1)−1 = E1−1E2−1 · · · Ek−1, the product of the inverses of the Ei in reverse order. Since the inverse of an elementary matrix is an elementary matrix, we have established the result. ⊔⊓

Corollary 1.6.1. An n × n matrix A is nonsingular if and only if A is row equivalent to In.

Proof. If A is row equivalent to In, then In = EkEk−1 · · · E2E1A, where the Ei are all elementary matrices. Therefore, it follows that A = E1−1E2−1 · · · Ek−1. Now the inverse of an elementary matrix is an elementary matrix, and so by Theorem 1.6.4 A is nonsingular.

Conversely, if A is nonsingular, then by Theorem 1.6.4 A is a product of elementary matrices, A = EkEk−1 · · · E2E1. Now A = A In = EkEk−1 · · · E2E1 In, which implies that A is row equivalent to In. ⊔⊓

Theorem 1.6.5. The homogeneous system of n linear equations in n unknowns A X = O has a nontrivial solutionif and only if A is singular.

Proof. We can see that Lemma 1.6.1 and Corollary 1.6.1 imply that if the homogeneous system A X = O, where A is an n × n matrix, has only the trivial solution X = O, then A is nonsingular. Conversely, consider A X = O, where A is n × n, and let A be nonsingular. Then A−1 exists and we form A−1(A X) = A−1O = O. Thus X = O, which means that the homogeneous system has only the trivial solution. ⊔⊓


Example 1.6.3. Let

A =

1 2
2 4

be the matrix defined in Example 1.4.8, which is singular; i.e., A−1 does not exist. Consider the homogeneous system A X = O, that is,

x1 + 2x2 = 0
2x1 + 4x2 = 0.

The reduced row echelon form of the augmented matrix is

1 2 | 0
0 0 | 0,

and so a solution is

x1 = −2t
x2 = t,   or X = t(−2, 1),

where t is any real number. Thus the homogeneous system has a nontrivial solution.

At the end of the proof of Theorem 1.6.4, we had

A = E1−1E2−1 · · · Ek−1,

from which it follows that

A−1 = (E1−1E2−1 · · · Ek−1)−1 = EkEk−1 · · · E2E1.

This now provides an algorithm for finding A−1. Thus we perform elementary row operations on A until we get In; the product of the elementary matrices EkEk−1 · · · E2E1 then gives A−1.

An algorithm for finding A−1: A convenient way of organizing the computing process is to write down the augmented matrix (A | In). Then

(EkEk−1 · · · E2E1)(A | In) = (EkEk−1 · · · E2E1A | EkEk−1 · · · E2E1) = (In | A−1).

In other words, for a given n × n matrix A the algorithm for finding A−1 is as follows:

Step 1. Form the augmented n × 2n matrix (A | In), where In is the n × n identity matrix.

Step 2. Bring the matrix (A | In) to reduced row echelon form using only elementary row operations.

Step 3. If the reduced row echelon form has a leading one in each of the first n columns (counting from left to right), then the reduced row echelon form of (A | In) is the matrix (In | A−1), and A−1 is the n × n submatrix which remains after deleting In from (In | A−1).

Otherwise, there is either a row consisting entirely of zeros or else a leading one in a column beyond the nth column (remember, the matrix (A | In) has 2n columns!). In this case A is singular.
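The three steps can be sketched in Python (our own illustration, not code from the text). Exact Fraction arithmetic keeps the entries rational, and a missing pivot signals that A is singular:

```python
from fractions import Fraction

def inverse(A):
    """Invert an n x n matrix by row-reducing (A | I_n); return the right
    half when the left half becomes I_n, or None when A is singular."""
    n = len(A)
    # Step 1: form the augmented n x 2n matrix (A | I_n).
    m = [[Fraction(v) for v in row] +
         [Fraction(1 if i == j else 0) for j in range(n)]
         for i, row in enumerate(A)]
    # Step 2: reduce to reduced row echelon form with row operations.
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return None                      # no leading 1 possible: singular
        m[col], m[pivot] = m[pivot], m[col]  # interchange rows
        piv = m[col][col]
        m[col] = [v / piv for v in m[col]]   # scale to get a leading 1
        for r in range(n):                   # clear the rest of the column
            if r != col and m[r][col] != 0:
                f = m[r][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    # Step 3: the right-hand n x n block is A^{-1}.
    return [row[n:] for row in m]

A = [[1, 1, 1], [0, 2, 3], [5, 5, 1]]       # the matrix of Example 1.6.4
Ainv = inverse(A)                            # first row: 13/8, -1/2, -1/8
assert inverse([[1, 2], [2, 4]]) is None     # the singular matrix of Example 1.6.3
```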

Example 1.6.4. For the 3 × 3 matrix

A =

1 1 1
0 2 3
5 5 1,

find A−1 if it exists.

Solution.

Step 1. Form the 3 × 6 augmented matrix (A | I3)

(A | I3) =

1 1 1 | 1 0 0
0 2 3 | 0 1 0
5 5 1 | 0 0 1.


Step 2. Bring the augmented matrix above to reduced row echelon form. Here are the elementary row operations that were used:

1. −5R1 + R3.

2. (1/2)R2.

3. −R2 + R1.

4. −(1/4)R3.

5. −(3/2)R3 + R2.

6. (1/2)R3 + R1.

The reduced row echelon form of (A | I3) is

(I3 | A−1) =

1 0 0 | 13/8  −1/2  −1/8
0 1 0 | −15/8  1/2   3/8
0 0 1 | 5/4    0    −1/4.

Step 3. Hence

A−1 =

13/8  −1/2  −1/8
−15/8  1/2   3/8
5/4    0    −1/4.

It is easy to verify that AA−1 = A−1A = I3.

The question that arises at this point is: how can the algorithm tell us that A is singular, that is, when does the above algorithm for finding A−1 fail? In Step 3 of the algorithm we said that this happens if the reduced row echelon form of (A | In) has at least one row whose first n entries are all zero, or else there is a column beyond the nth column (counting from left to right) containing a leading one. This really means that A is row equivalent to some matrix B containing a row of zeros. Thus the answer is that A is singular if and only if A is row equivalent to a matrix B having at least one row that consists entirely of zeros. We now prove this result.

Theorem 1.6.6. An n × n matrix A is singular if and only if A is row equivalent to a matrix B that has a row of zeros.

Proof. First, let A be row equivalent to a matrix B that has a row consisting entirely of zeros. From Exercise 1.4.16 in Section 4 it follows that B is singular. Now B = EkEk−1 · · · E2E1A, where E1, E2, . . . , Ek−1, Ek are elementary matrices. If A is nonsingular, then B is nonsingular, a clear contradiction.

Conversely, if A is singular, then A is not row equivalent to In, by Corollary 1.6.1. Thus A is row equivalent to a matrix B ≠ In, which is in reduced row echelon form. From Exercise 1.5.3 of Section 5 it follows that B must have a row of zeros. ⊔⊓

This means that in order to find A−1, we do not have to determine, in advance, whether or not it exists. We merely start to calculate A−1; if at any point in the computation we find a matrix B that is row equivalent to A and has a row of zeros, then A−1 does not exist.

Example 1.6.5. Consider the matrix

A =

1  2 −3
1 −2  1
5 −2 −3.

Determine whether A is singular or not.

Solution. Apply the algorithm for calculating A−1. After applying the three elementary row operations −R1 + R2, −5R1 + R3, −3R2 + R3 to (A | I3) we find that A is row equivalent to the matrix

B =

1  2 −3
0 −4  4
0  0  0.

Since B has a row of zeros, we stop and conclude that A is a singular matrix. ⊔⊓


In Section 4 we defined an n × n matrix B to be the inverse of the n × n matrix A if AB = In and BA = In. We now show that one of these equations follows from the other.

Theorem 1.6.7. If A and B are n × n matrices such that AB = In, then BA = In. Thus B = A−1.

Proof. We first show that if AB = In, then A is nonsingular. Suppose that A is singular. Then A is row equivalent to a matrix C with a row of zeros. Now C = EkEk−1 · · · E2E1A, where all the Ei are elementary matrices. Then CB = EkEk−1 · · · E2E1AB, so AB is row equivalent to CB. Since CB has a row of zeros, we conclude from Theorem 1.6.6 that AB is singular. Then AB = In is impossible, because In is nonsingular. This contradiction shows that A is nonsingular, and so A−1 exists. Multiplying both sides of the equation AB = In by A−1 on the left, we obtain A−1(AB) = (A−1A)B = InB = B = A−1In = A−1. Thus B = A−1, and BA = A−1A = In holds. ⊔⊓

This theorem tells us that when using multiplication to verify that a matrix B is the inverse of A, it is enough to verify only one of the two equations AB = In and BA = In.


SUGGESTED EXERCISES FOR SECTION 1.6

Exercise 1.6.1. Prove Theorem 1.6.1.

Exercise 1.6.2. Let A be a 4 × 3 matrix. Find the elementary matrix E which, as a premultiplier of A (that is, as EA), performs the following elementary row operations on A:

(a) Multiplies the second row of A by -2.

(b) Adds 3 times the third row of A to the fourth row of A.

(c) Interchanges the first and third rows of A.

Exercise 1.6.3. Let A be a 3 × 4 matrix. Find the elementary matrix F which, as a postmultiplier of A (that is, as AF), performs the following elementary column operations on A:

(a) Adds -4 times the first column of A to the second column of A.

(b) Interchanges the second and third columns of A.

(c) Multiplies the third column of A by 4.

Exercise 1.6.4. Prove Theorem 1.6.3. (Hint: Find the inverses of the elementary matrices representing the various elementary row operations.)

Exercise 1.6.5. Find the inverse, if it exists, of

(a) 1 1 1
    1 2 3
    0 1 1

(b) 1  1  1 1
    1  2 −1 2
    1 −1  2 1
    1  3  3 2

(c) 1 1  1 1
    1 3  1 2
    1 2 −1 1
    5 9  1 6

(d) 1 2 1
    1 3 2
    1 0 1

(e) 1 2 2
    1 3 2
    1 1 3

Exercise 1.6.6. Let

A =

a b
c d

be a 2 × 2 matrix and denote by det(A) the real number ad − bc. Then the statement "A is nonsingular if and only if det(A) ≠ 0" holds. Show that

A−1 = (1/det(A)) ×

 d −b
−c  a,

provided that the statement holds.


Exercise 1.6.7. Let

A =

 2 3 −1
 1 0  3
 0 2 −3
−2 1  3.

Find the elementary matrix that as a postmultiplier of A performs the following elementary column operations on A:

(a) Multiplies the third column of A by -3.

(b) Interchanges the second and third columns of A.

(c) Adds -5 times the first column of A to the third column of A.

Exercise 1.6.8. Prove that

A =

1 2 3
0 1 2
1 0 3

is nonsingular and write it as a product of elementary matrices. (Hint: First write the inverse as a product of elementary matrices; then use Theorem 1.6.3.)

Exercise 1.6.9. Which of the following homogeneous systems have a nontrivial solution?

(a) x1 + 2x2 + 3x3 = 0
    2x2 + 2x3 = 0
    x1 + 2x2 + 3x3 = 0.

(b) 2x1 + x2 − x3 = 0
    x1 − 2x2 − 3x3 = 0
    −3x1 − x2 + 2x3 = 0.

(c) 3x1 + x2 + 3x3 = 0
    −2x1 + 2x2 − 4x3 = 0
    2x1 − 3x2 + 5x3 = 0.

Exercise 1.6.10. Find out which of the following matrices are singular. For the nonsingular ones find the inverse.

(a) 1 3
    2 6

(b)  1 3
    −2 6

(c) 1 2 3
    1 1 2
    0 1 2

(d) 1 2 3
    1 1 2
    0 1 1.

Exercise 1.6.11. Find the inverse of

A =

1 3
2 4.

Exercise 1.6.12. Find the inverse of

A =

1 2 3
0 2 3
1 2 4.

Exercise 1.6.13. Show that

A =

1 2
3 4

is nonsingular and write it as a product of elementary matrices. (See the hint in Exercise 1.6.8.)

Exercise 1.6.14. Find the inverse (if possible) of the following matrices:

(a) 1 2 −3  1
   −1 3 −3 −2
    2 0  1  5
    3 1 −2  5

(b) 3 1 2
    2 1 2
    1 2 2

(c) 1 2 3
    1 1 2
    1 1 0

(d) 2 1 3
    0 1 2
    1 0 3.


Exercise 1.6.15. Find the inverse of

A =

1  1 2  1
0 −2 0  0
1  2 1 −2
0  3 2  1.

Exercise 1.6.16. If A is a nonsingular matrix whose inverse is

4 2
1 1,

find A.

Exercise 1.6.17. Prove that two m × n matrices A and B are row equivalent if and only if there exists a nonsingular matrix P such that B = PA. (Hint: Use Theorem 1.6.2 and Theorem 1.6.4.)

Exercise 1.6.18. Let A and B be row equivalent n × n matrices. Prove that A is nonsingular if and only if B is nonsingular.

Exercise 1.6.19. Let A and B be n × n matrices. Show that if AB is nonsingular, then A and B must be nonsingular. (Hint: Use Theorem 1.6.5.)

Exercise 1.6.20. Let A be an m × n matrix. Show that A is row equivalent to Om×n if and only if A = Om×n.

Exercise 1.6.21. Let A and B be m × n matrices. Show that A is row equivalent to B if and only if AT is column equivalent to BT.

Exercise 1.6.22. Show that a matrix which has a row or a column consisting entirely of zeros must be singular.

Exercise 1.6.23. Find value(s) of λ for which the inverse of

A =

1 1 0
1 0 0
1 2 λ

exists. What is A−1?


Exercise 1.6.24. (a) Is (A + B)−1 = A−1 + B−1?

(b) Is (γA)−1 = (1/γ)A−1?

Exercise 1.6.25. For what value(s) of λ does the homogeneous system

(λ − 1)x1 + x2 = 0
2x1 + (λ − 1)x2 = 0

have a nontrivial solution?


1.7. EQUIVALENT MATRICES

– WE WILL NOT DISCUSS IT IN CLASS –


Chapter 2.

REAL VECTOR SPACES

VECTORS IN THE PLANE—A REVIEW

In many applications we deal with measurable quantities, such as pressure, mass, and speed, which can be completely described by giving their magnitude. They are called scalars and will be denoted by lowercase Latin letters. There are many other measurable quantities, such as velocity, force, and acceleration, which require for their description not only magnitude, but also a sense of direction. These are called vectors, and their study comprises this chapter. Vectors will be denoted by bold lowercase letters. You perhaps have already encountered vectors in elementary physics and in the calculus.

We quickly review the definition of a vector in the plane.

We draw a pair of perpendicular lines intersecting at a point O, called the origin. One of the lines, the x-axis, is usually taken in a horizontal position. The other line, the y-axis, is then taken in a vertical position. We now choose a point on the x-axis to the right of O and a point on the y-axis above O to fix the units of length and positive directions on the x- and y-axes. Frequently, but not always, these points are chosen so that they are both equidistant from O, that is, so that the same unit of length is used for both axes. The x- and y-axes together are called coordinate axes and they form a rectangular coordinate system or a Cartesian coordinate system.

With each point P in the plane we associate an ordered pair (x, y) of real numbers, its coordinates. Conversely, we can associate a point in the plane with each ordered pair of real numbers. The point P with coordinates (x, y) is denoted P(x, y), or simply as (x, y). The set of all points in the plane is denoted by R2; it is called 2-space.

Consider the 2 × 1 matrix

α =

[ x ]
[ y ],

where x and y are real numbers (note the use of square brackets instead of the round ones used in the previous chapter). With α we associate the directed line segment with the initial point at the origin and the terminal point at P(x, y).

The directed line segment from O to P is denoted by −→OP; O is called its tail and P its head. We distinguish the tail and the head by placing an arrow at the head. A directed line segment has a direction, indicated by the arrow at its head. The magnitude of a directed line segment is its length. Thus a directed line segment can be used to describe force, velocity, or acceleration. Conversely, with a directed line segment −→OP with tail O(0, 0) and head P(x, y) we can associate the 2 × 1 matrix

[ x ]
[ y ].


A vector in the plane is a 2 × 1 matrix

α =

[ x ]
[ y ],

where x and y are real numbers, which are called the components of α. We refer to a vector in the plane merely as a vector.

Thus with every vector we can associate a directed line segment and, conversely, with every directed line segment we can associate a vector. Frequently, the notions of directed line segment and vector are used interchangeably, and a directed line segment is called a vector. Since a vector is a matrix, the vectors

α1 =

[ x1 ]
[ y1 ]

and α2 =

[ x2 ]
[ y2 ]

are said to be equal if x1 = x2 and y1 = y2. That is, two vectors are equal if their respective components are equal. With each vector α = [x, y] we can associate the unique point P(x, y); conversely, with each point P(x, y) we associate the unique vector [x, y]. Thus we also write the vector α as (x, y). Of course, this association is carried out by means of the directed line segment −→OP, where O is the origin and P is the point with coordinates (x, y). Thus the plane may be viewed both as the set of all points and as the set of all vectors. For this reason, and depending upon the context, we sometimes take R2 as the set of all ordered pairs (x, y) and sometimes as the set of all 2 × 1 matrices (or directed line segments).

Let

α =

[ x1 ]
[ y1 ]

and β =

[ x2 ]
[ y2 ]

be two vectors in the plane. The sum of the vectors α and β is the vector

α + β =

[ x1 + x2 ]
[ y1 + y2 ].

We can interpret vector addition geometrically, as follows. In the figure below, the directed line segment γ is parallel to β, it has the same length as β, and its tail is the head (x1, y1) of α, so its head is (x1 + x2, y1 + y2). Thus the vector with tail at O and head at (x1 + x2, y1 + y2) is α + β. We can also describe α + β as the diagonal of the parallelogram defined by α and β, as shown in the diagram.
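The componentwise sum can be stated as a one-line function (a trivial illustration, not code from the text):

```python
def add(alpha, beta):
    """Sum of two plane vectors given by their components."""
    (x1, y1), (x2, y2) = alpha, beta
    return (x1 + x2, y1 + y2)

alpha = (2, 3)
beta = (-1, 4)
print(add(alpha, beta))  # (1, 7)
```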

–Linear Algebra To be Continued—




Chapter 3.

INTRODUCTION TO INTEGRATION IN VECTOR FIELDS

CURVES, SURFACES, LINE INTEGRALS, VECTOR FIELDS, GREEN'S THEOREM, SURFACE INTEGRALS, STOKES' THEOREM, AND THE DIVERGENCE THEOREM

CURVES IN THE PLANE AND IN SPACE

In this section we discuss two mathematical formulations of the intuitive notion of a curve. The precise relation between them turns out to be quite subtle, so we shall begin by giving some examples of curves of each type and practical ways of passing between them.

3.1 WHAT IS A CURVE?

If asked to give an example of a curve, you might give a straight line, say y − 2x = 1 (even though this is not 'curved'!), or a circle, say x² + y² = 1, or perhaps a parabola, say y − x² = 0.

y − 2x = 1        y − x² = 0        x² + y² = 1

All of these curves are described by means of their Cartesian equation

f(x, y) = c,

where f is a function of two variables x and y, and c is a constant. From this point of view, a curve is a set of points, namely

C = {(x, y) ∈ ℝ² : f(x, y) = c}.   (1)

These examples are all curves in the plane ℝ², but we can also consider curves in ℝ³ (3-D space) --- for example, the x-axis in ℝ³ is the straight line given by

{(x, y, z) ∈ ℝ³ : y = 0 and z = 0},

and more generally a curve in ℝ³ might be defined by a pair of equations

f1(x, y, z) = c1,    f2(x, y, z) = c2.


Curves of this kind are called level curves, the idea being that the curve in Eq. (1), for example, is the set of points (x, y) in the plane at which the quantity f(x, y) reaches the 'level' c.

However, there is another way to think about curves which turns out to be more useful in many situations. For this, a curve is viewed as the path traced out by a moving point. Thus, if γ(t) denotes the position vector of the point at time t, the curve is described by a function γ of a scalar parameter t with vector values (in ℝ² for a plane curve, in ℝ³ for a space curve, i.e., for a curve in 3-D space). We use this idea to give our first formal definition of a curve in ℝⁿ (we shall be interested only in the cases n = 2 and n = 3, but it is convenient to treat both cases simultaneously):

Definition 3.1.1 A parametrized curve in ℝⁿ is a continuous function or map

γ : I → ℝⁿ,    γ(t) = ⟨γ1(t), γ2(t), …, γn(t)⟩  for t in some open interval I ⊆ ℝ,

where the continuous functions (usually called maps) γ1(t), γ2(t), …, γn(t) are called the component functions of γ and t is called the parameter.

The symbol I denotes the parameter interval for γ, and it is usually (but not always) an open interval of the form I = {t ∈ ℝ : α < t < β} for some α, β ∈ ℝ. A parametrized curve whose image γ(I) is contained in a level curve C is called a parametrization of (part of) C.

The following examples illustrate how to pass from level curves to parametrized curves and back again in practice.

Example 3.1.1 Let us find a parametrization γ(t) of the plane curve given as C = {(x, y) ∈ ℝ² : y − x² = 0} (called a parabola). You are perhaps more familiar with it written in functional form as y = f(x) = x², x ∈ ℝ.

Suppose γ(t) = ⟨γ1(t), γ2(t)⟩; then the component functions γ1 and γ2 of γ must satisfy the relation

γ2(t) = γ1(t)²   (2)

for all values of t ∈ I where γ is defined (yet to be decided), and ideally every point (x, y) on the parabola C should be equal to ⟨γ1(t), γ2(t)⟩ for some value of t ∈ I. Of course, there is an obvious solution to Eq. (2): take γ1(t) = t, γ2(t) = t². To get every point on the parabola we must allow t to take every real number value (since the x-coordinate of γ(t) is just t, and the x-coordinate of a point on the parabola can be any real number), so we must take I to be ℝ. Thus, the parametrization we seek is

γ : ℝ → ℝ²,    γ(t) = ⟨t, t²⟩.

You might ask: Is the parametrization given above for the parabola the only one ? The answer is no! Another choice for parametrization of the parabola is 𝛄𝛄 ∶ ℝ ⟶ℝ2, 𝛄𝛄(𝑡𝑡) = ⟨𝑡𝑡3, 𝑡𝑡6⟩. Yet another is 𝛄𝛄 ∶ ℝ ⟶ℝ2, 𝛄𝛄(𝑡𝑡) =⟨2𝑡𝑡, 4𝑡𝑡2⟩, and of course there are (infinitely many) others (see if you can think of some others!). So the parametrization of a given level curve 𝑓𝑓(𝑥𝑥,𝑦𝑦) = 𝑐𝑐 is not unique.
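The claims of this example are easy to check numerically. The following Python sketch (the code and the helper name `on_parabola` are ours, purely illustrative) verifies that all three parametrizations above land on the level curve y − x² = 0:

```python
def on_parabola(point, tol=1e-6):
    """Check that a point satisfies the level-curve equation y - x^2 = 0."""
    x, y = point
    return abs(y - x**2) < tol

parametrizations = [
    lambda t: (t, t**2),        # gamma(t) = <t, t^2>
    lambda t: (t**3, t**6),     # gamma(t) = <t^3, t^6>
    lambda t: (2*t, 4*t**2),    # gamma(t) = <2t, 4t^2>
]

# sample each parametrization at many parameter values
for gamma in parametrizations:
    assert all(on_parabola(gamma(k / 10)) for k in range(-50, 51))
```

Each parametrization traces out the same image, which is exactly the point of the example: parametrizations of a level curve are not unique.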


Example 3.1.2 Now we try to find a parametrization of the plane curve C = {(x, y) ∈ ℝ² : x² + y² = 1} (called the unit circle with center at the origin (0, 0)). It is tempting to take γ₁(t) = t, γ₂(t) = √(1 − t²) as a choice of parametrization (perhaps motivated by the previous example!). But this only parametrizes the upper half of the circle, because √(1 − t²) is always nonnegative. Similarly, had we taken γ₁(t) = t, γ₂(t) = −√(1 − t²), we would only have a parametrization of the lower half of the circle.

If we want a parametrization of the whole circle, we must try again! We need functions γ₁(t) and γ₂(t) such that

γ₁(t)² + γ₂(t)² = 1    (3)

for all t ∈ I, such that every point (x, y) on the circle equals ⟨γ₁(t), γ₂(t)⟩ for some t ∈ I. There is an obvious solution to Eq. (3): take γ₁(t) = cos t and γ₂(t) = sin t (since cos² t + sin² t = 1 for all real t). We could take I = ℝ, although this is overkill: any open interval whose length is greater than 2π will suffice. Thus a desired parametrization of the unit circle is

γ : [0, 2π] → ℝ²,  γ(t) = ⟨cos t, sin t⟩.

Note: In our parametrization of the unit circle we chose an interval I which is not open (rather, it is closed, contrary to Definition 3.1.1). The reason for this choice is purely one of geometric convenience, as should become clear later.
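A quick numeric sketch (ours, not the text's) of the two observations above: the graph parametrization only reaches the upper half of the circle, while ⟨cos t, sin t⟩ stays on the whole circle:

```python
import math

def gamma(t):
    """The trigonometric parametrization of the unit circle."""
    return (math.cos(t), math.sin(t))

# (1) gamma(t) always satisfies x^2 + y^2 = 1
for k in range(200):
    t = 2 * math.pi * k / 200
    x, y = gamma(t)
    assert abs(x**2 + y**2 - 1) < 1e-12

# (2) the graph parametrization <t, sqrt(1 - t^2)> never produces y < 0,
#     so it covers only the upper half of the circle
upper = [math.sqrt(1 - (k / 50)**2) for k in range(-50, 51)]
assert min(upper) >= 0
```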

TRY THIS EXERCISE NOW! Find a parametrization γ of the circle with radius r > 0 centered at a point (h, k) ∈ ℝ² (do not choose all of ℝ as your parameter interval). Choose your parameter interval so that distinct values of the parameter correspond to distinct points on the circle.

The next example shows how to pass from parametrized curves back to level curves.


Example 3.1.3 Consider the parametrized curve (called an astroid) given by

γ(t) = ⟨cos³ t, sin³ t⟩.

Since cos² t + sin² t = 1 for all t, the coordinates x = cos³ t, y = sin³ t of the point γ(t) satisfy

x^{2/3} + y^{2/3} = 1.

This level curve coincides with the image of the map γ.

In this section, we shall be studying curves (and later, surfaces) using the methods of calculus. To differentiate a vector-valued function such as γ(t) (as in Definition 3.1.1), we differentiate component-wise: if

γ(t) = ⟨γ₁(t), γ₂(t), …, γₙ(t)⟩,

then

dγ/dt = ⟨dγ₁/dt, dγ₂/dt, …, dγₙ/dt⟩,

d²γ/dt² = ⟨d²γ₁/dt², d²γ₂/dt², …, d²γₙ/dt²⟩, etc.

We say that a parametrized curve γ is smooth if each of its component functions γ₁, γ₂, …, γₙ is smooth, i.e., if all the derivatives dγᵢ/dt, d²γᵢ/dt², d³γᵢ/dt³, … exist, for i = 1, 2, …, n. Unless stated otherwise, all parametrized curves studied in this section will be assumed to be smooth.


Definition 3.1.2 If γ is a parametrized curve, then for t ∈ I its first derivative dγ/dt is called the tangent vector of γ at the point γ(t).

To see the reason for this terminology, note that the vector

(1/Δt)[γ(t + Δt) − γ(t)]

is parallel to the chord joining the two points γ(t) and γ(t + Δt) of the image C of γ.

[Figure: the chord joining γ(t) and γ(t + Δt).]

We expect that, as Δt → 0 (tends to 0), the chord becomes parallel to the tangent to C at γ(t). Hence, the tangent should be parallel to

lim_{Δt→0} (1/Δt)[γ(t + Δt) − γ(t)] = dγ/dt = γ′(t).
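This limit is easy to watch numerically. The sketch below (our illustration; `chord_quotient` is an invented helper) computes the difference quotient for γ(t) = ⟨t, t²⟩, whose tangent vector is γ′(t) = ⟨1, 2t⟩, and checks that the error shrinks with Δt:

```python
def gamma(t):
    """The parabola parametrization gamma(t) = <t, t^2>."""
    return (t, t**2)

def chord_quotient(t, dt):
    """(1/dt) * [gamma(t + dt) - gamma(t)], the chord direction."""
    p, q = gamma(t), gamma(t + dt)
    return ((q[0] - p[0]) / dt, (q[1] - p[1]) / dt)

t = 1.5                                   # analytic tangent here is <1, 3>
for dt in (1e-2, 1e-4, 1e-6):
    vx, vy = chord_quotient(t, dt)
    # first component is (essentially) 1; second approaches 2t = 3 at rate O(dt)
    assert abs(vx - 1.0) < 1e-9
    assert abs(vy - 3.0) < 2 * dt
```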

The following result is intuitively clear:

Theorem 3.1.1 If the tangent vector of a parametrized curve is constant, then the image of the curve is (part of) a straight line.

proof If dγ/dt = 𝐚 for all t ∈ I, where 𝐚 is a constant vector, we have, integrating component-wise,

γ(t) = ∫ (dγ/dt) dt = ∫ 𝐚 dt = t𝐚 + 𝐛,

where 𝐛 is another constant vector. If 𝐚 ≠ 𝟎, this is the parametric equation of the straight line parallel to 𝐚 and passing through the point with position vector 𝐛 = γ(0):

[Figure: the line γ(t) = t𝐚 + 𝐛, parallel to 𝐚 and through the point with position vector 𝐛.]


If 𝐚 = 𝟎, the image of γ is a single point (namely, the point with position vector 𝐛). Q.E.D

Problems

1. Is the curve γ(t) = ⟨t², t⁴⟩ a parametrization of the parabola y = x²? (Explain.)

2. Find parametrizations of the following level curves:

a. y² − x² = 1;  b. x²/4 + y²/9 = 1.

3. Find the Cartesian equations of the following parametrized curves.

a. γ(t) = ⟨cos² t, sin² t⟩;  b. γ(t) = ⟨e^t, t²⟩.

4. Calculate the tangent vectors of the curves in Problem 3.

5. Calculate the tangent vector at each point of the astroid of Example 3.1.3. At which points is the tangent vector zero?

6. Show that γ(t) = ⟨cos² t − 1/2, sin t cos t, sin t⟩ is a parametrization of the curve of intersection of the circular cylinder of radius 1/2 and axis the z-axis with the sphere of radius 1 and center (−1/2, 0, 0). (This is called Viviani's curve.)


3.2 ARC-LENGTH

If 𝐯 = ⟨v₁, v₂, …, vₙ⟩ is a vector in ℝⁿ, its length or magnitude is

∥𝐯∥ = √(v₁² + v₂² + ⋯ + vₙ²).

If 𝐮 is another vector in ℝⁿ, ∥𝐮 − 𝐯∥ is the length of the straight line segment joining the points in ℝⁿ with position vectors 𝐮 and 𝐯.

To find a formula for the length of a parametrized curve, say γ, note that, if Δt is sufficiently small, the part of the image C of γ between γ(t) and γ(t + Δt) is nearly a straight line, so its length is approximately

∥γ(t + Δt) − γ(t)∥.

Again, since Δt is sufficiently small, (1/Δt)[γ(t + Δt) − γ(t)] ≈ γ′(t), so the length is approximately

∥γ′(t)∥ Δt.    (4)

If we want to calculate the length of a (not necessarily small) part of C, we can divide it into segments, each of which corresponds to a small increment Δt in t, calculate the length of each segment using Eq. (4), and add up the results. Letting Δt → 0 should then give the exact length.

This motivates the following definition:

Definition 3.2.1 The arc-length of a parametrized curve γ starting at the point γ(t₀) is the function s(t) given by

s(t) = ∫_{t₀}^{t} ∥dγ/du∥ du = ∫_{t₀}^{t} ∥γ′(u)∥ du.

Thus, s(t₀) = 0 and s(t) is positive or negative according to whether t is larger or smaller than t₀. If we choose a different starting point, say γ(λ), the resulting arc-length s̃ differs from s by the constant ∫_{t₀}^{λ} ∥γ′(u)∥ du.


Example 3.2.1 For a logarithmic spiral

γ(t) = ⟨e^{0.2t} cos t, e^{0.2t} sin t⟩,

we have

γ′(t) = ⟨e^{0.2t}(0.2 cos t − sin t), e^{0.2t}(cos t + 0.2 sin t)⟩,

∴ ∥γ′∥² = e^{0.4t}(0.2 cos t − sin t)² + e^{0.4t}(0.2 sin t + cos t)² = 1.04 e^{0.4t}.

Hence, the arc-length of γ starting at γ(0) = ⟨1, 0⟩ (which corresponds to the point (1, 0)) is

s = s(t) = ∫₀ᵗ √(1.04 e^{0.4u}) du = (√1.04 / 0.2)(e^{0.2t} − 1).
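A numerical sanity check of this closed form (our own sketch; the helper names are invented): approximate the spiral's length by summing chord lengths and compare with s(t) = (√1.04/0.2)(e^{0.2t} − 1).

```python
import math

def gamma(t):
    """Logarithmic spiral gamma(t) = <e^{0.2t} cos t, e^{0.2t} sin t>."""
    r = math.exp(0.2 * t)
    return (r * math.cos(t), r * math.sin(t))

def chord_length(t_end, n=20000):
    """Polygonal approximation to the arc-length from t = 0 to t = t_end."""
    total, prev = 0.0, gamma(0.0)
    for i in range(1, n + 1):
        cur = gamma(t_end * i / n)
        total += math.hypot(cur[0] - prev[0], cur[1] - prev[1])
        prev = cur
    return total

closed_form = (math.sqrt(1.04) / 0.2) * (math.exp(0.2 * 1.0) - 1)
assert abs(chord_length(1.0) - closed_form) < 1e-5
```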

If s is the arc-length of a curve γ starting at the point γ(t₀), we have

s′(t) = ds/dt = (d/dt) ∫_{t₀}^{t} ∥γ′(u)∥ du = ∥γ′(t)∥.    (5)

Thinking of γ(t) as the position of a moving point at time t, ds/dt is the speed of the point (the rate of change of distance along the curve). For this reason, we make the following definition.

Definition 3.2.2 If γ : I → ℝⁿ is a parametrized curve, its speed at the point γ(t) is ∥γ′(t)∥, and the curve γ is called a unit-speed curve if the tangent vector γ′(t) is a unit vector (i.e., ∥γ′(t)∥ = 1) for all t ∈ I.

We shall see many examples of formulas and results relating to curves that take on a much simpler form when the curve is unit speed. The reason for this simplification is given in the next theorem. Although this admittedly looks uninteresting at first sight, it will be extremely useful for what follows.



Theorem 3.2.1 Let 𝐧(t) be a unit vector that is a smooth function of a parameter t. Then the dot product

𝐧′(t) · 𝐧(t) = 0

for all t, i.e., either 𝐧′(t) = 𝟎 or 𝐧′(t) is perpendicular to 𝐧(t) for all t. In particular, if γ is a unit-speed curve, then d²γ/dt² is either 𝟎 or perpendicular to dγ/dt.

proof We use the product rule for differentiating dot products of vector-valued functions 𝐚(t) and 𝐛(t):

(d/dt)(𝐚 · 𝐛) = (d𝐚/dt) · 𝐛 + 𝐚 · (d𝐛/dt).

Using this to differentiate both sides of the equation 𝐧 · 𝐧 = 1 with respect to t gives

𝐧′ · 𝐧 + 𝐧 · 𝐧′ = 0,

so 2𝐧′ · 𝐧 = 0. The last part follows by taking 𝐧 = γ′. Q.E.D
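Theorem 3.2.1 can be spot-checked numerically. In the sketch below (ours; finite differences stand in for the exact derivative), the unit vector 𝐧(t) = ⟨cos t, sin t⟩ has a derivative that is numerically perpendicular to itself:

```python
import math

def n(t):
    """A unit vector depending smoothly on t."""
    return (math.cos(t), math.sin(t))

def n_prime(t, h=1e-6):
    """Central-difference approximation to n'(t)."""
    a, b = n(t + h), n(t - h)
    return ((a[0] - b[0]) / (2 * h), (a[1] - b[1]) / (2 * h))

for k in range(8):
    t = 0.7 * k
    nv, dv = n(t), n_prime(t)
    # n'(t) . n(t) should vanish, up to discretization error
    assert abs(nv[0] * dv[0] + nv[1] * dv[1]) < 1e-8
```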

Problems

1. Calculate the arc-length of the catenary γ(t) = ⟨t, cosh t⟩ starting at the point (0, 1).

2. Show that the following curves are unit-speed curves:

(i) γ(t) = ⟨(1/3)(1 + t)^{3/2}, (1/3)(1 − t)^{3/2}, t/√2⟩;

(ii) γ(t) = ⟨(4/5) cos t, 1 − sin t, −(3/5) cos t⟩.

3. A cycloid is the plane curve traced out by a point on the circumference of a circle as it rolls without slipping along a straight line. It can be shown that if the straight line is the x-axis and the circle has radius α > 0, the cycloid can be parametrized as

γ : ℝ → ℝ²,  γ(t) = α⟨t − sin t, 1 − cos t⟩.

Calculate the arc-length along the cycloid corresponding to one complete revolution of the circle.

4. The circular helix is the space curve given by γ : I → ℝ³, γ(t) = ⟨cos αt, sin αt, βt⟩, where α and β are constants. Calculate the arc-length function s along the circular helix starting at the point γ(0).


3.3 REPARAMETRIZATION

We saw in Examples 3.1.1 and 3.1.2 of Section 3.1 that a given level curve can have many parametrizations, and it is important to understand the relationship between them.

Definition 3.3.1 A parametrized curve φ : J → ℝⁿ is called a reparametrization of γ : I → ℝⁿ if there is a smooth bijective map (i.e., a one-to-one and onto map) τ : J → I, t = τ(u), such that the inverse map τ⁻¹ : I → J, u = τ⁻¹(t), is also smooth and

φ : J → ℝⁿ,  φ(u) = (γ ∘ τ)(u) = γ(τ(u)).

Note that, since τ has a smooth inverse, γ is a reparametrization of φ:

φ(τ⁻¹(t)) = γ(τ(τ⁻¹(t))) = γ(t) for all t ∈ I.

Two curves that are reparametrizations of each other have the same image, so they should have the same geometric properties.

Example 3.3.1 In Example 3.1.2, we gave the parametrization γ(t) = ⟨cos t, sin t⟩ for the circle x² + y² = 1. Another parametrization is

φ(u) = ⟨sin u, cos u⟩

(since sin² u + cos² u = 1 for all u). To see that φ is a reparametrization of γ, we have to find a reparametrization map τ such that

⟨cos τ(u), sin τ(u)⟩ = ⟨sin u, cos u⟩.

One solution is t = τ(u) = π/2 − u. Observe that this is a smooth bijective map whose inverse, u = τ⁻¹(t) = π/2 − t, is also a smooth bijective map; thus γ is a reparametrization of φ too!
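A small numeric check of this reparametrization (our sketch; the names are ours): φ(u) should coincide with γ(τ(u)) for every u.

```python
import math

gamma = lambda t: (math.cos(t), math.sin(t))   # original parametrization
phi = lambda u: (math.sin(u), math.cos(u))     # candidate reparametrization
tau = lambda u: math.pi / 2 - u                # reparametrization map

for k in range(-10, 11):
    u = 0.3 * k
    p, g = phi(u), gamma(tau(u))
    # phi(u) = gamma(tau(u)) componentwise
    assert abs(p[0] - g[0]) < 1e-12 and abs(p[1] - g[1]) < 1e-12
```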

As we remarked in the previous section, the analysis of a curve is simplified when it is known to be unit-speed. It is therefore important to know exactly which curves have unit-speed reparametrizations.

Definition 3.3.2 A point γ(t) of a parametrized curve γ is called a regular point if the tangent vector at γ(t) does not vanish, i.e., if γ′(t) ≠ 𝟎 (keep in mind that 𝟎 = ⟨0, 0, …, 0⟩ in ℝⁿ, while 0 is a real number); otherwise, γ(t) is a singular point. A curve γ is regular if all of its points are regular points (equivalently, a curve is regular if it has no singular points).

Before we show the relationship between regularity and unit-speed reparametrization, we note two simple properties of regular curves. Although these results are not particularly appealing, they will be very important for what is to follow.


Theorem 3.3.1 If γ is a regular curve, then any reparametrization φ of γ is again regular.

proof We aim to show that the curve φ is regular (i.e., that dφ/du is never zero). Since φ is a reparametrization of γ, we have smooth bijective reparametrization maps t = τ(u) and u = τ⁻¹(t) having the property τ(τ⁻¹(t)) = t. Differentiating both sides with respect to t and using the chain rule yields

(dτ/du)(dτ⁻¹/dt) = 1.

This shows that dτ/du is never zero. Since φ(u) = γ(τ(u)), and γ is a regular curve, another application of the chain rule gives

dφ/du = (dγ/dt)(dτ/du) ≠ 𝟎 (do you see why?). Q.E.D

Theorem 3.3.2 If γ is a regular curve, its arc-length s (see Definition 3.2.1), starting at any point of γ, is a smooth function of t.

proof We have already seen that (whether or not γ is regular) s is a differentiable function of t and that

ds/dt = ∥γ′(t)∥.

To simplify the notation, assume from now on that γ is a plane curve, say

γ(t) = ⟨x(t), y(t)⟩,

where x and y are smooth functions of t. Define f : ℝ² → ℝ by

f(x, y) = √(x² + y²),

so that

ds/dt = f(dx/dt, dy/dt) = √((dx/dt)² + (dy/dt)²).    (6)

The crucial point is that f is smooth on ℝ² without the origin, which means that all the partial derivatives of f of all orders exist and are continuous functions except at the origin (0, 0). For example,

∂f/∂x (x, y) = x / √(x² + y²),  ∂f/∂y (x, y) = y / √(x² + y²)

are well defined and continuous except where x = y = 0, and similarly for higher derivatives. Since γ is regular, dx/dt and dy/dt are never both zero, so the chain rule and Eq. (6) show that ds/dt is smooth. For example,

d²s/dt² = (∂f/∂x)(d²x/dt²) + (∂f/∂y)(d²y/dt²),

where the partial derivatives are evaluated at (dx/dt, dy/dt), and similarly for the higher derivatives of s. Q.E.D


The main result we want is

Theorem 3.3.4 A parametrized curve γ : I → ℝⁿ has a unit-speed reparametrization if and only if it is regular.

proof Suppose that γ : I → ℝⁿ is a parametrized curve with a unit-speed reparametrization φ : J → ℝⁿ, and let τ : J → I, t = τ(u), be the reparametrization map. Then we have

φ(u) = γ(τ(u)) = γ(t),

∴ dφ/du = (dγ/dt)(dt/du),

∴ ∥dφ/du∥ = ∥dγ/dt∥ |dt/du|.

Since φ is unit-speed, ∥dφ/du∥ = 1, so clearly dγ/dt cannot be zero (i.e., γ is a regular curve).

Conversely, suppose that the tangent vector dγ/dt ≠ 𝟎 (i.e., γ is a regular parametrized curve). By Eq. (5), we know that ds/dt > 0 for all t, where s is the arc-length of γ starting at any point of the curve, and by Theorem 3.3.2 we have that s is a smooth function of t. It follows from the inverse function theorem of multivariable calculus that s : I → ℝ is injective (i.e., one-to-one), that its image is an open interval J, and that the inverse map s⁻¹ : J → I is also smooth. (If you are not familiar with the inverse function theorem, accept these statements as true for now, until you become familiar with it.) We take τ = s⁻¹ and let φ be the corresponding reparametrization of γ, so that

φ(s) = γ(t).

Then,

(dφ/ds)(ds/dt) = dγ/dt,

∴ ∥dφ/ds∥ (ds/dt) = ∥dγ/dt∥ = ds/dt (by Eq. (5)),

∴ ∥dφ/ds∥ = 1. Q.E.D

The proof of Theorem 3.3.4 shows that the arc-length is essentially the only unit-speed parameter on a regular curve.


Corollary 3.3.1 Let γ : I → ℝⁿ be a regular curve and let φ : J → ℝⁿ be a unit-speed reparametrization of γ:

φ(u(t)) = γ(t) for all t,

where u is a smooth function of t. Then, if s is the arc-length of γ (starting at any point), we have

u = ±s + c,    (7)

where c is a constant. Conversely, if u is given by Eq. (7) for some value of c and with either sign, then φ is a unit-speed reparametrization of γ.

proof The calculation in the first part of the proof of Theorem 3.3.4 shows that u gives a unit-speed reparametrization of γ if and only if

du/dt = ±∥dγ/dt∥ = ±ds/dt (by Eq. (5)).

Hence, u = ±s + c for some constant c. Q.E.D

Although every regular curve has a unit-speed reparametrization, this reparametrization may be very complicated, or even impossible to write down explicitly, as the following examples show.

Example 3.3.1 Consider the logarithmic spiral

γ(t) = ⟨e^{0.2t} cos t, e^{0.2t} sin t⟩.

We found in Example 3.2.1 that

∥γ′(t)∥² = 1.04 e^{0.4t}.

This is never zero, so γ is regular. The arc-length of γ starting at (1, 0) was found to be s = (√1.04/0.2)(e^{0.2t} − 1). Hence, t = (1/0.2) ln(0.2s/√1.04 + 1), so a unit-speed reparametrization of γ is given by the rather unwieldy formula

φ(s) = ⟨(0.2s/√1.04 + 1) cos((1/0.2) ln(0.2s/√1.04 + 1)), (0.2s/√1.04 + 1) sin((1/0.2) ln(0.2s/√1.04 + 1))⟩.
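Unwieldy or not, the formula is easy to test numerically. The sketch below (ours; the helper names are invented) inverts the arc-length s = (√1.04/0.2)(e^{0.2t} − 1) from Example 3.2.1, differentiates the reparametrized spiral by central differences, and checks that the speed is 1:

```python
import math

def phi(s):
    """Unit-speed reparametrization of the logarithmic spiral."""
    t = (1 / 0.2) * math.log(0.2 * s / math.sqrt(1.04) + 1.0)
    r = math.exp(0.2 * t)
    return (r * math.cos(t), r * math.sin(t))

def speed(s, h=1e-5):
    """Central-difference estimate of ||phi'(s)||."""
    a, b = phi(s + h), phi(s - h)
    return math.hypot(a[0] - b[0], a[1] - b[1]) / (2 * h)

for s in (0.5, 1.0, 2.0, 5.0, 10.0):
    assert abs(speed(s) - 1.0) < 1e-6
```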


Example 3.3.2 The twisted cubic is the space curve given by

γ(t) = ⟨t, t², t³⟩, −∞ < t < ∞.

We have

γ′(t) = ⟨1, 2t, 3t²⟩,

∴ ∥γ′(t)∥ = √(1 + 4t² + 9t⁴).

This is never zero, so γ is regular. The arc-length starting at γ(0) = 𝟎 = ⟨0, 0, 0⟩ (the position vector of the origin (0, 0, 0)) is

s = s(t) = ∫₀ᵗ ∥γ′(u)∥ du = ∫₀ᵗ √(1 + 4u² + 9u⁴) du.

This integral cannot be evaluated in terms of familiar elementary functions (i.e., logarithms, exponentials, trigonometric functions, polynomials, etc.). The integral above is an example of an elliptic integral.
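Even though the integral has no elementary antiderivative, it is straightforward to evaluate numerically, for example with Simpson's rule (a sketch of ours; the helper names are invented):

```python
import math

def speed(u):
    """||gamma'(u)|| for the twisted cubic."""
    return math.sqrt(1 + 4 * u**2 + 9 * u**4)

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

s1 = simpson(speed, 0.0, 1.0)   # arc-length from gamma(0) to gamma(1)
assert 1.85 < s1 < 1.87         # the integral is about 1.86
```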

Our final example shows that a given level curve can have both regular and non-regular parametrizations.


Example 3.3.3 For the parametrization γ(t) = ⟨t, t²⟩ of the level curve (the parabola) x² − y = 0, we have

dγ/dt = γ′(t) = ⟨1, 2t⟩ ≠ 𝟎 for all t,

so γ is a regular curve. But the curve φ given by

φ(t) = γ(u(t)) = γ(t³) = ⟨t³, t⁶⟩

is another parametrization of the parabola (where u(t) = t³ is the map of parameters). This time,

φ′(t) = u′(t) γ′(u(t)) = 3t²⟨1, 2t³⟩ = ⟨3t², 6t⁵⟩,

and this is zero when t = 0, so the parametrization φ is not regular. This shows that a given level curve may have both regular and non-regular parametrizations.

Problems

1. Which of the following curves are regular?

(i) γ(t) = ⟨cos² t, sin² t⟩ for −∞ < t < ∞;

(ii) the same curve as in (i), but with 0 < t < π/2;

(iii) γ(t) = ⟨t, cosh t⟩ for −∞ < t < ∞.

2. The cissoid of Diocles is the plane curve whose equation in terms of polar coordinates (r, θ) (you should know what polar coordinates are; look them up if you don't!) is

r = r(θ) = sin θ tan θ, −π/2 < θ < π/2.

Write down a parametrization of the cissoid using θ as your parameter and show that the curve given by

γ(t) = ⟨t², t³/√(1 − t²)⟩, −1 < t < 1,

is a reparametrization of it (of the cissoid, that is).


3.4 LEVEL CURVES vs. PARAMETRIZED CURVES

We shall now try to clarify the precise relationship between the two types of curves we have considered in the previous sections.

Level curves, in the generality we have defined them, are not always the kind of objects we would want to call curves. For example, the level 'curve' x² + y² = 0 is a single point. The correct conditions to impose on a function f(x, y) in order that f(x, y) = c, where c is a constant, be an acceptable level curve in the plane are contained in the following theorem, which shows that such level curves can be parametrized. Note that we might as well assume that c = 0 (since we could replace f by f − c).

Theorem 3.4.1 Let f(x, y) be a smooth function of two variables x and y. Assume that ∇f(x, y) ≠ 𝟎 at all points (x, y) of the level curve given by

C = {(x, y) ∈ ℝ² : f(x, y) = 0}.

If P(x₀, y₀) is a point of C, there is a regular parametrized curve γ = γ(t), defined on an open interval I containing 0, such that the curve γ passes through P(x₀, y₀) when t = 0 and γ(t) is contained in C for all t.

I will not supply a proof of the theorem, because you may not have the machinery necessary to understand it (knowledge of the inverse function theorem of multivariable calculus and some point-set topology is required!). However, I will provide enough reasons why you should believe it is true, simply because you have studied multivariable calculus of functions of two and three variables. Before I proceed, a few explanations are in order. To say that a function of two variables is a 'smooth function' means that all the partial derivatives of all orders exist and are continuous functions on the domain of the function. The assumption that grad(f) = ∇f(x, y) = ⟨∂f/∂x (x, y), ∂f/∂y (x, y)⟩ ≠ 𝟎 means that ∂f/∂x (x, y) = 0 and ∂f/∂y (x, y) = 0 do not both hold (usually one says 'the gradient of f never vanishes').

To understand the significance of the conditions placed on f(x, y) in the theorem, suppose for a moment that Q(x₀ + Δx, y₀ + Δy) is a point of the level curve near the point P(x₀, y₀), which is on the level curve as well. Then we know that f(x₀ + Δx, y₀ + Δy) = 0. We also know that

0 = f(x₀ + Δx, y₀ + Δy) − f(x₀, y₀) ≈ (∂f/∂x)(x₀, y₀) Δx + (∂f/∂y)(x₀, y₀) Δy (this you should know!).

It follows that for sufficiently small Δx and Δy we get

(∂f/∂x)(x₀, y₀) Δx + (∂f/∂y)(x₀, y₀) Δy ≈ 0,    (8)

which says that the vector ⟨Δx, Δy⟩ is nearly tangent (and exactly tangent in the limit as both Δx and Δy approach 0) to the level curve at P, and that the vector ∇f is perpendicular to the level curve at P (this statement should not surprise you, because you have seen it many times in Calculus III).


[Figure: at a point P of the level curve, ∇f is perpendicular to the curve and ⟨Δx, Δy⟩ is nearly tangent to it.]

The hypothesis of the theorem tells us that ∇f vanishes at no point P of the level curve. Suppose now, for example, that ∂f/∂y ≠ 0 at P. Then ∇f is not parallel to the x-axis at P, so the tangent line to the level curve at P is not parallel to the y-axis.

[Figure: vertical lines x = a between x = a₁ and x = a₂, near x = x₀, each meet the level curve in a unique point inside a rectangle around P(x₀, y₀); the tangent line at P is not vertical.]

This implies that vertical lines of the form x = a near the line x = x₀ all intersect the level curve in a unique point (x, y) near P. In other words, the equation

f(x, y) = 0    (9)

has a unique solution y near y₀ for every x near x₀. Note that this may fail to be the case if the tangent to the level curve at P is parallel to the y-axis:

[Figure: a level curve whose tangent at P is parallel to the y-axis.]


In this example, lines x = a for a < x₀ do not meet the level curve near P, while those vertical lines just to the right of x = x₀ and near P meet the level curve in more than one point.

The statement about f in the last paragraph means that there is a function g(x), defined for x near x₀, such that y = g(x) is the unique solution of Eq. (9) near y₀. We can now define a parametrization γ of the part of the level curve near P by

γ(t) = ⟨t, g(t)⟩.

If we accept that g is smooth (which follows from the inverse function theorem), then γ is certainly regular, since

γ′(t) = ⟨1, g′(t)⟩

is obviously never zero. This 'proves' the theorem. Q.E.D
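The construction can be made concrete for f(x, y) = x² + y² − 1 near P = (0, 1), where ∂f/∂y = 2 ≠ 0 and the local solution of f = 0 is g(x) = √(1 − x²). The sketch below (our illustrative choice of f; the helper names are invented) checks that γ(t) = ⟨t, g(t)⟩ stays on the level curve and that ∇f is perpendicular to its tangent:

```python
import math

def g(x):
    """Local solution of x^2 + y^2 - 1 = 0 for y near y0 = 1."""
    return math.sqrt(1 - x**2)

def grad_f(x, y):
    """Gradient of f(x, y) = x^2 + y^2 - 1."""
    return (2 * x, 2 * y)

for k in range(-5, 6):
    t = k / 10                              # stay near x0 = 0
    x, y = t, g(t)                          # the point gamma(t) = <t, g(t)>
    assert abs(x**2 + y**2 - 1) < 1e-12     # gamma(t) lies on the level curve
    gx, gy = grad_f(x, y)
    tangent = (1.0, -x / math.sqrt(1 - x**2))             # <1, g'(t)>, never zero
    assert abs(gx * tangent[0] + gy * tangent[1]) < 1e-9  # grad f is perpendicular
```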


3.5. LINE INTEGRALS, VECTOR FIELDS, GREEN’S THEOREM, SURFACE INTEGRALS, STOKES’ THEOREM

LINE INTEGRALS OF SCALAR FIELDS IN THE PLANE AND IN SPACE

Definition 3.5.1. Let f(x, y, z) be any continuous scalar function of three variables defined on a domain E (this is usually a connected open region of ℝ³). Suppose that γ is any smooth parametrized space curve given by γ(s) = ⟨x(s), y(s), z(s)⟩, with s the arc-length parameter satisfying the inequality a ≤ s ≤ b.

We define the line integral of f along the path γ with respect to arc-length from a to b to be:

In space:

∫_γ f(x, y, z) ds = lim_{∥P∥→0} Σᵢ₌₁ⁿ f(xᵢ*, yᵢ*, zᵢ*) Δsᵢ (provided this limit exists),

or in the plane:

∫_γ f(x, y) ds = lim_{∥P∥→0} Σᵢ₌₁ⁿ f(xᵢ*, yᵢ*) Δsᵢ (provided this limit exists).

Theorem 3.5.1. [Evaluation Theorem for Line Integrals with respect to Arc-length] If the plane curve γ is given parametrically by x = x(s), y = y(s), a ≤ s ≤ b, or the space curve γ by x = x(s), y = y(s), z = z(s), a ≤ s ≤ b, then

(in the plane)

∫_γ f(x, y) ds = ∫ₐᵇ f(x(s), y(s)) ds

or

(in space)

∫_γ f(x, y, z) ds = ∫ₐᵇ f(x(s), y(s), z(s)) ds.

In practice, the arc-length parameter s is usually difficult to obtain; therefore, the above Evaluation Theorem becomes impractical. A more practical Evaluation Theorem for line integrals with respect to arc-length is given in the next theorem.

Theorem 3.5.2. [Evaluation Theorem for Line Integrals with respect to Arc-length] If the curve γ is given parametrically by

x = x(t), y = y(t), a ≤ t ≤ b, or by x = x(t), y = y(t), z = z(t), a ≤ t ≤ b,

then

∫_γ f(x, y) ds = ∫ₐᵇ f(x(t), y(t)) √([x′(t)]² + [y′(t)]²) dt = ∫ₐᵇ f(x(t), y(t)) ∥γ′(t)∥ dt

or

∫_γ f(x, y, z) ds = ∫ₐᵇ f(x(t), y(t), z(t)) √([x′(t)]² + [y′(t)]² + [z′(t)]²) dt = ∫ₐᵇ f(x(t), y(t), z(t)) ∥γ′(t)∥ dt.


Example 3.5.1. How to find the mass of a helical spring: find the mass of a spring in the shape of the helix defined parametrically by γ(t) = ⟨2 cos t, t, 2 sin t⟩, for 0 ≤ t ≤ 6π, with density ρ(x, y, z) = 2y.

Solution. First we assume that the mass m is approximated as follows: m ≈ Σᵢ₌₁ⁿ ρ(xᵢ, yᵢ, zᵢ) Δsᵢ, so that

m = lim_{∥P∥→0} Σᵢ₌₁ⁿ ρ(xᵢ, yᵢ, zᵢ) Δsᵢ = ∫_γ ρ(x, y, z) ds.

Since the density of the spring at the point (x, y, z) on the spring (the curve γ) is given as ρ(x, y, z) = 2y = 2t, and the arc-length element ds is given by

ds = ∥γ′(t)∥ dt = √((−2 sin t)² + (1)² + (2 cos t)²) dt = √5 dt,

we have

m = Mass = ∫_γ ρ(x, y, z) ds = ∫₀^{6π} 2√5 t dt = 36π²√5 mass-units. ⊔⊓
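A midpoint-rule check of this computation (a sketch of ours; the function name `mass` is invented) against the closed form 36π²√5:

```python
import math

def mass(n=100000):
    """Midpoint-rule approximation of the helix mass integral."""
    total, b = 0.0, 6 * math.pi
    dt = b / n
    for i in range(n):
        t = (i + 0.5) * dt
        rho = 2 * t                        # density 2y = 2t on the helix
        total += rho * math.sqrt(5) * dt   # ||gamma'(t)|| dt = sqrt(5) dt
    return total

exact = 36 * math.pi**2 * math.sqrt(5)
assert abs(mass() - exact) < 1e-5
```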

PROBLEMS YOU SHOULD DO

Exercise 3.5.1. Evaluate (if possible) ∫_γ x²y ds, where the oriented curve γ is determined by the parametric equations x = 3 cos t, y = 3 sin t, 0 ≤ t ≤ π/2. Also show that the parametrization x = √(9 − y²), y = y, 0 ≤ y ≤ 3, gives the same value. Draw a picture of γ too! If the above integral is impossible to evaluate by hand, then don’t worry about it!

Exercise 3.5.2. Evaluate (if possible) ∫_γ (2x² − 3yz) ds, where the parametrized curve γ is given by γ(t) = ⟨cos t, sin t, cos t⟩, for 0 ≤ t ≤ 2π. Do not draw a picture of γ. If the above line integral is impossible to evaluate by hand, then don’t worry about it!


Theorem 3.5.3. Let f(x, y, z) be a continuous function defined on some region D containing the parametrized curve γ. Then, if γ is a piecewise smooth curve, with γ = γ₁ ∪ γ₂ ∪ ⋯ ∪ γₙ, where γ₁, γ₂, …, γₙ are all smooth and where the terminal point of γᵢ is the same as the initial point of γᵢ₊₁, for i = 1, 2, …, n − 1, we have

(i) ∫_{−γ} f(x, y, z) ds = ∫_γ f(x, y, z) ds, and

(ii) ∫_γ f(x, y, z) ds = Σᵢ₌₁ⁿ ∫_{γᵢ} f(x, y, z) ds.

Interpretation of Line Integrals of f(x, y) with respect to Arc-Length

As before, if y = f(x) ≥ 0 for all x ∈ [a, b], then ∫ₐᵇ f(x) dx measures the area of the plane region bounded by the vertical lines x = a and x = b, the interval [a, b], and the graph of y = f(x). In a similar way, if z = f(x, y) ≥ 0 for all (x, y) ∈ D ⊆ ℝ², then ∫_γ f(x, y) ds measures the surface area of the vertical cylinder (usually called a 'curtain') bounded above by the graph of z = f(x, y), on the sides by the vertical lines (in space, parallel to the z-axis) passing through the initial and terminal points of the curve γ (with the arc-length parameter s), and bounded below by the curve γ itself.

Example 3.5.2. Evaluating a line integral over a piecewise smooth curve γ: evaluate the line integral ∫_γ (3x − y) ds, where the plane curve γ is the line segment from (1, 2) to (3, 3), followed by the portion of the circle x² + y² = 18 traversed from (3, 3) clockwise around to (3, −3).

Solution. Since γ is piecewise smooth, we must find smooth parametrizations of the smooth portions of γ, which are γ₁ (the line segment from (1, 2) to (3, 3)) and γ₂ (the part of the circle x² + y² = 18 traversed clockwise from (3, 3) to (3, −3)).

(A) For the line segment portion: use the parametrization formula for any line segment, given as

φ(t) = (1 − t)φ₀ + tφ₁ for 0 ≤ t ≤ 1.

Now treat the initial point of the line segment as the position vector φ₀ = ⟨1, 2⟩ and the terminal point (3, 3) as the position vector φ₁ = ⟨3, 3⟩, so that

γ₁(t) = ⟨1 + 2t, 2 + t⟩ for 0 ≤ t ≤ 1.

Thus ds = ∥γ₁′(t)∥ dt = ∥⟨2, 1⟩∥ dt = √5 dt and f(γ₁(t)) = f(1 + 2t, 2 + t) = 3(1 + 2t) − (2 + t) = 1 + 5t, and so we have

∫_{γ₁} f(x, y) ds = ∫_{γ₁} (3x − y) ds = ∫₀¹ [3(1 + 2t) − (2 + t)] √(2² + 1²) dt = ∫₀¹ (5t + 1)√5 dt = (7/2)√5.

(B) For the curved portion of γ: this is not a line segment, so our approach has to be different from that in part (A). The usual parametrization of a circle with counterclockwise orientation is given as

γ(t) = ⟨R cos t, R sin t⟩ for α ≤ t ≤ β.

We want a parametrization with clockwise orientation of a circle of radius R = 3√2. So we want to start when −t = π/4 and end when −t = −π/4. This means replacing t in the counterclockwise parametrization of a circle with −t gives the clockwise orientation we seek. Thus, using the trigonometric identities cos(−t) = cos t and sin(−t) = −sin t, we arrive at

γ₂(t) = ⟨3√2 cos t, −3√2 sin t⟩ for −π/4 ≤ t ≤ π/4.

Now f(γ₂(t)) = f(3√2 cos t, −3√2 sin t) = 3(3√2 cos t) − (−3√2 sin t) = 3√2 (3 cos t + sin t) and

ds = ∥γ₂′(t)∥ dt = ∥⟨−3√2 sin t, −3√2 cos t⟩∥ dt = √((−3√2 sin t)² + (−3√2 cos t)²) dt = 3√2 dt,

and

∫_{γ₂} f(x, y) ds = ∫_{−π/4}^{π/4} 3√2 (3 cos t + sin t) · 3√2 dt = ∫_{−π/4}^{π/4} (3 cos t + sin t) 18 dt = 54√2.


Finally, combining the results of (A) and (B), we have

∫_γ f(x, y) ds = ∫_{γ₁} f(x, y) ds + ∫_{γ₂} f(x, y) ds = ∫_γ (3x − y) ds = (7/2)√5 + 54√2. ⊔⊓
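A numeric check of both pieces (our sketch; `line_integral` is an invented helper) using Theorem 3.5.2 with the midpoint rule:

```python
import math

def line_integral(gamma, speed, a, b, n=100000):
    """Midpoint-rule approximation of the integral of (3x - y) ds over gamma."""
    dt, total = (b - a) / n, 0.0
    for i in range(n):
        t = a + (i + 0.5) * dt
        x, y = gamma(t)
        total += (3 * x - y) * speed(t) * dt
    return total

# piece (A): the segment, constant speed sqrt(5)
part_a = line_integral(lambda t: (1 + 2 * t, 2 + t),
                       lambda t: math.sqrt(5), 0.0, 1.0)
# piece (B): the clockwise circular arc of radius R = 3*sqrt(2), speed R
R = 3 * math.sqrt(2)
part_b = line_integral(lambda t: (R * math.cos(t), -R * math.sin(t)),
                       lambda t: R, -math.pi / 4, math.pi / 4)

exact = 3.5 * math.sqrt(5) + 54 * math.sqrt(2)
assert abs(part_a + part_b - exact) < 1e-4
```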

Exercise 3.5.3. Evaluate (if possible) ∫_γ (x² + y²) ds, where γ is the piecewise smooth plane curve given by the circle x² + y² = 4 traversed clockwise from (0, 2) to (0, −2), then the line segment from (0, −2) to (−2, −2), then the line segment from (−2, −2) to (−2, 2), and finally the line segment from (−2, 2) to (0, 2). Draw a picture of γ. If the above line integral is impossible to evaluate by hand, then don’t worry about it!

Definition 3.5.2. The line integral of f(x, y, z) with respect to the parameter x along the smooth parametrized space curve γ(t) = ⟨x(t), y(t), z(t)⟩, for α ≤ t ≤ β, is written as

∫_γ f(x, y, z) dx = lim_{∥P∥→0} Σᵢ₌₁ⁿ f(xᵢ, yᵢ, zᵢ) Δxᵢ (provided this limit exists)

and is independent of how we choose the points (xᵢ, yᵢ, zᵢ) on the curve.

Likewise, we define the line integral of f(x, y, z) with respect to the parameter y along γ as

∫_γ f(x, y, z) dy = lim_{∥P∥→0} Σᵢ₌₁ⁿ f(xᵢ, yᵢ, zᵢ) Δyᵢ,

and the line integral of f(x, y, z) with respect to the parameter z along γ as

∫_γ f(x, y, z) dz = lim_{∥P∥→0} Σᵢ₌₁ⁿ f(xᵢ, yᵢ, zᵢ) Δzᵢ.

In each case, the line integral is defined whenever the corresponding limit exists and is independent of how we choose the points (xᵢ, yᵢ, zᵢ) on the curve.

Theorem 3.5.4. [Evaluation Theorem for Line Integrals with respect to the Coordinate Axes] Let P(x, y, z), Q(x, y, z), and R(x, y, z) be continuous functions defined on a path-connected region E ⊆ R³ containing the smooth parametrized space-curve γ(t) = 〈x(t), y(t), z(t)〉 for α ≤ t ≤ β. Then

∫_γ P(x, y, z) dx = ∫_α^β P(x(t), y(t), z(t)) x′(t) dt,

∫_γ Q(x, y, z) dy = ∫_α^β Q(x(t), y(t), z(t)) y′(t) dt,

∫_γ R(x, y, z) dz = ∫_α^β R(x(t), y(t), z(t)) z′(t) dt.
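Theorem 3.5.4 reduces a coordinate line integral to an ordinary definite integral. A small symbolic sketch (assuming sympy, with an illustrative curve and integrand that are not from the text):

```python
import sympy as sp

t = sp.symbols('t')

# Illustrative data: the helix γ(t) = <cos t, sin t, t>, 0 ≤ t ≤ π,
# with integrand P(x, y, z) = z, so P(x(t), y(t), z(t)) = t.
xt, yt, zt = sp.cos(t), sp.sin(t), t
P_on_curve = zt

# Theorem 3.5.4: ∫_γ P dx = ∫_0^π P(γ(t)) x'(t) dt = ∫_0^π t(−sin t) dt
result = sp.integrate(P_on_curve * sp.diff(xt, t), (t, 0, sp.pi))
print(result)  # -pi
```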

Notation:

IN THE PLANE

∫_γ P(x, y) dx + Q(x, y) dy = ∫_γ P(x, y) dx + ∫_γ Q(x, y) dy

IN SPACE

∫_γ P(x, y, z) dx + Q(x, y, z) dy + R(x, y, z) dz = ∫_γ P(x, y, z) dx + ∫_γ Q(x, y, z) dy + ∫_γ R(x, y, z) dz.

Page 84: ELEMENTARY LINEAR ALGEBRA AND VECTOR CALCULUS

PROBLEM 4. Evaluating a Line Integral in Space: You Try These Now!

Evaluate the line integral ∫_γ (4xz + 2y) dx, where the piecewise smooth space-curve γ is made up of (a) the line segment from (2, 1, 0) to (4, 0, 2) and (b) the line segment from (4, 0, 2) to (2, 1, 0).

Calculate ∫_γ 4x dx + 2y dz, where γ is the curve consisting of the line segment from (5, 1, 0) to (0, 1, 1), followed by the line segment from (0, 1, 1) to (3, 5, 1), followed by the line segment from (3, 5, 1) to (0, 0, 0).

Exercise 3.5.4. Evaluate

∫_γ P(x, y) dx + Q(x, y) dy = ∫_γ [y/(x² + y²)] dx + [−x/(x² + y²)] dy,

where γ(t) = 〈cos³ t, sin³ t〉, 0 ≤ t ≤ π/2. [This hint might be helpful for evaluating the definite integral: let u = tan³ t.]

Theorem 3.5.5. If f(x, y, z) is a continuous function defined on some path-connected region E of R³ containing the parametrized curve γ, then:

(i) If γ is a piecewise smooth curve, then

∫_{−γ} f(x, y, z) dx = −∫_γ f(x, y, z) dx.

(ii) If γ = γ1 ∪ γ2 ∪ · · · ∪ γn, where γ1, γ2, . . . , γn are all smooth and the terminal point of γi is the same as the initial point of γi+1 for i = 1, 2, . . . , n − 1, then

∫_γ f(x, y, z) dx = Σ_{i=1}^n ∫_{γi} f(x, y, z) dx.

Theorem 3.5.6. Let x = (x1, x2, . . . , xn) be a point in a path-connected region D of Rⁿ containing a smooth parametrized curve γ(t), α ≤ t ≤ β, and suppose f(x) is continuous on D. Then

∫_γ f(x) ds = ∫_α^β g(t) dt = G(β) − G(α),

where G(t) is an antiderivative of g(t) = f(γ(t)) ‖γ′(t)‖.

Problem 5. Consider the function f(x, y) = x² + y², which is continuous on D = R². Let γ(t) = 〈R cos t, R sin t〉 be a circle of radius R > 0 in D with counterclockwise orientation, i.e., 0 ≤ t ≤ 2π. Provide an interpretation of the line integral ∫_γ f(x, y) ds. Can you provide the answer to this integral without actually evaluating it directly? (I evaluated it in class; but try providing the result without looking it up or evaluating it directly.)

Page 85: ELEMENTARY LINEAR ALGEBRA AND VECTOR CALCULUS

2-D and 3-D VECTOR FIELDS

Consider this practical example: Suppose that as a structural engineer, you were sent by your employer (who wants to build a bridge across the Hudson River in New York City) to take velocity readings of the river along a 1-mile straight stretch of the Hudson. You would have to set up a rectangular coordinate grid similar to R². At special points in your grid you would probably draw arrows (some long and some short, pointing in different directions) to represent the velocity of the river at each point (x, y). In fact, from a mathematical point of view, you have a graph called a vector field. Note that this type of graph is different from the graphs you've seen thus far: to each point (x, y) in your region (the river), a unique arrow (representing the velocity of the river) was assigned. Clearly, we have a vector function F : D → R² whose range consists only of vectors in R² (viewed as a 2-D vector space) and whose domain D is a subset of R². The components of the function F could depend on other factors at the point (x, y) chosen. Thus we make the following definition.

Definition 3.5.3. A vector field in a region D of R² (the plane) is a function F : D → R² (a mapping from the plane to a vector space) defined by

F(x, y) = 〈P(x, y), Q(x, y)〉 for (x, y) ∈ D,

where the component functions P(x, y) and Q(x, y) are scalar functions.

A vector field in a region E of R³ (in space) is a function F : E → R³ (a mapping from space to a 3-D vector space) defined by

F(x, y, z) = 〈P(x, y, z), Q(x, y, z), R(x, y, z)〉 for (x, y, z) ∈ E,

where the component functions P(x, y, z), Q(x, y, z), and R(x, y, z) are scalar functions.

Example 3.5.3.
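Example 3.5.3 would normally be accompanied by a sketch of a vector field. As a stand-in, here is a small plain-Python sketch that samples a hypothetical rotational field F(x, y) = 〈−y, x〉 (an illustration, not taken from the text) on a grid of points:

```python
# Sample the illustrative vector field F(x, y) = <-y, x> on a small grid.
def F(x, y):
    return (-y, x)

grid = [(x, y) for x in (-1, 0, 1) for y in (-1, 0, 1)]
field = {p: F(*p) for p in grid}

# At (1, 0) the arrow points straight up; at (0, 1) it points in the -x
# direction, so the arrows circulate counterclockwise around the origin.
print(field[(1, 0)], field[(0, 1)])  # (0, 1) (-1, 0)
```

Plotting each arrow with its tail at the sample point reproduces the kind of picture the example refers to.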

LINE INTEGRALS OF VECTOR FIELDS IN THE PLANE AND IN SPACE

WHAT IS A WORK INTEGRAL?

Suppose that the vector field F = F(x, y, z) = P(x, y, z) i + Q(x, y, z) j + R(x, y, z) k represents a force throughout a region in space (it might be the force of gravity or an electromagnetic force of some kind) and that

γ(t) = 〈g(t), h(t), k(t)〉, a ≤ t ≤ b,

is a smooth curve in the region. Then the integral of F · T, the scalar component of F in the direction of the curve's unit tangent vector, over the curve is called the work done by F over the curve from a to b.

Definition 3.5.4. [Work Over a Smooth Curve] The work done by a force F = P(x, y, z) i + Q(x, y, z) j + R(x, y, z) k over a smooth curve γ from t = a to t = b is

W = ∫_{t=a}^{t=b} F · T ds, (1)

where the scalar coordinate functions (or component functions) P(x, y, z), Q(x, y, z), and R(x, y, z) are continuous (and possibly have continuous first partial derivatives) on an open and path-connected domain E ⊆ R³.

We wish to find the work done by the vector field F in moving a particle along a piecewise smooth (or smooth) parametrized unit-speed curve (or path) γ with arc-length parametrization contained in E.

To accomplish this, we let γ(s) = x(s) i + y(s) j + z(s) k (or γ(s) = 〈x(s), y(s), z(s)〉) be the position vector for a point P0(x, y, z) on the curve γ. If T = T(s) = x′(s) i + y′(s) j + z′(s) k is the unit tangent vector at P0, then FT = F · T (a scalar function) is the tangential component of the vector field F at the point P0. The work done by the vector field F in moving the particle from point P0 a short distance ∆s along the curve is approximately FT ∆s = F · T ∆s, and consequently the work done in moving the particle from an arbitrary point A to B along γ is defined to be the work integral

W = ∫_γ FT ds = ∫_γ F · T ds.

Page 86: ELEMENTARY LINEAR ALGEBRA AND VECTOR CALCULUS

Different ways of writing the work integral:

Work = ∫_α^β (F · T) ds = ∫_γ FT ds (the definition of the work integral)

= ∫_γ (F · dγ/ds) ds (expanded to include ds; emphasizes the arc-length parameter s and the velocity vector dγ/ds)

= ∫_γ F · dγ (compact differential form)

= ∫_γ (F · dγ/dt) dt (expanded to include dt; emphasizes the parameter t and the velocity vector dγ/dt)

= ∫_γ (P dx/dt + Q dy/dt + R dz/dt) dt (abbreviates the components of γ)

= ∫_γ P dx + Q dy + R dz (the dt's cancel; the most commonly used form),

where T = (dx/ds) i + (dy/ds) j + (dz/ds) k = 〈dx/ds, dy/ds, dz/ds〉 = dγ/ds and ‖T‖ = 1.

HOW TO EVALUATE A WORK INTEGRAL

To evaluate the work integral, take these steps.

Step 1. Find a parametrization of the curve γ with parameter t (if one has not been given).

Step 2. Evaluate F on the curve γ as a function of the parameter t, i.e., calculate g(t) = F(γ(t)).

Step 3. Find γ′(t) = dγ/dt, i.e., take the first derivative with respect to the parameter t of each component of the parametrized curve γ.

Step 4. Dot the two vector functions F(γ(t)) and γ′(t), i.e., calculate F(γ(t)) · γ′(t).

Step 5. Integrate the function of a single variable computed in Step 4 with respect to the parameter t from t = α to t = β (the parameter interval from Step 1); i.e., evaluate the definite integral

∫_{t=α}^{t=β} F(γ(t)) · γ′(t) dt.
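The five steps can be packaged as a short symbolic routine. This is a sketch assuming sympy; the field F = 〈y, x, 1〉 and curve used to exercise it are illustrative assumptions, not from the text:

```python
import sympy as sp

t = sp.symbols('t')
x, y, z = sp.symbols('x y z')

def work_integral(F, gamma, a, b):
    """Steps 2-5: evaluate W = integral of F(gamma(t)) . gamma'(t) dt over [a, b]."""
    on_curve = dict(zip((x, y, z), gamma))
    F_of_t = [sp.sympify(comp).subs(on_curve) for comp in F]   # Step 2
    velocity = [sp.diff(comp, t) for comp in gamma]            # Step 3
    integrand = sum(f * v for f, v in zip(F_of_t, velocity))   # Step 4
    return sp.integrate(integrand, (t, a, b))                  # Step 5

# Illustrative data: the field F = <y, x, 1> along gamma(t) = <t, t^2, t^3>, 0 <= t <= 1
W = work_integral([y, x, 1], [t, t**2, t**3], 0, 1)
print(W)  # 2
```

Because F = ∇(xy + z) here, the answer agrees with ϕ(1, 1, 1) − ϕ(0, 0, 0) = 2 for ϕ = xy + z, foreshadowing the Fundamental Theorem for Line Integrals.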

Page 87: ELEMENTARY LINEAR ALGEBRA AND VECTOR CALCULUS

Here's a quick example.

Example 3.5.4. [Finding Work Done by a Variable Force Over a Space Curve] Find the work done by the variable force F given by

F(x, y, z) = 〈P(x, y, z), Q(x, y, z), R(x, y, z)〉 = 〈y − x², z − y², x − z²〉 on E = R³

in moving a particle along the path (or smooth curve) given by γ(t) = 〈t, t², t³〉, 0 ≤ t ≤ 1, from the point (0, 0, 0) to the point (1, 1, 1). (Note: this is not by any means a straight path.)

Solution. We calculate the work integral ∫_γ F · dγ using the five steps above.

Step 1. A parametrization of the path or curve γ has been given:

γ = γ(t) = 〈t, t², t³〉; α = 0 ≤ t ≤ 1 = β.

Step 2. Calculate the force at each point on the path, i.e., F(γ(t)):

F(γ(t)) = F(t, t², t³) = 〈t² − t², t³ − t⁴, t − t⁶〉 = 〈0, t³ − t⁴, t − t⁶〉, 0 ≤ t ≤ 1.

Step 3. Find γ′(t):

γ′(t) = 〈x′(t), y′(t), z′(t)〉 = 〈1, 2t, 3t²〉, 0 ≤ t ≤ 1.

Step 4. Dot the two vector functions F(γ(t)) and γ′(t):

F(γ(t)) · dγ/dt = 〈0, t³ − t⁴, t − t⁶〉 · 〈1, 2t, 3t²〉 = (t³ − t⁴)(2t) + (t − t⁶)(3t²) = 2t⁴ − 2t⁵ + 3t³ − 3t⁸, 0 ≤ t ≤ 1.

Step 5. Integrate the expression 2t⁴ − 2t⁵ + 3t³ − 3t⁸ over the parameter interval 0 ≤ t ≤ 1:

WORK = ∫_0^1 (2t⁴ − 2t⁵ + 3t³ − 3t⁸) dt = [(2/5)t⁵ − (1/3)t⁶ + (3/4)t⁴ − (1/3)t⁹]_{t=0}^{t=1} = 29/60 (work units).

⊔⊓

Exercise 3.5.5. Find the work done by the gravitational vector field

F = F(x, y, z) = −GmM 〈 x/(x² + y² + z²)^{3/2}, y/(x² + y² + z²)^{3/2}, z/(x² + y² + z²)^{3/2} 〉

in moving a particle along the straight-line curve γ in space from A(0, 3, 0) to B(4, 3, 0). Here F is measured in newtons and length in meters.

Exercise 3.5.6. Write out the details for the work integral along a piecewise smooth curve γ defined in an open and connected region D of R² for some vector field F(x, y), where (x, y) is always in D.

Page 88: ELEMENTARY LINEAR ALGEBRA AND VECTOR CALCULUS

WHAT ARE FLOW AND CIRCULATION INTEGRALS?

Suppose we have a vector field F(x, y, z) = 〈P(x, y, z), Q(x, y, z), R(x, y, z)〉, and suppose that instead of calling F a force field, we now view F as the velocity field of a fluid (liquid, air, molten lava, etc.) flowing through a region in space (a tidal basin or the turbine chamber of a hydroelectric generator, for example). Under these circumstances, the integral of the quantity F · T along a curve γ in the region gives us the fluid's flow along γ.

Definition 3.5.5. If γ(t) is a smooth parametrized curve in the domain of a continuous velocity field F, the flow along the curve from t = α to t = β is

Flow of F along γ = ∫_γ FT ds = ∫_γ F · T ds. (2)

The integral in this case is called a flow integral.

If the curve γ is a simple closed curve, the flow integral is called the circulation of F around the curve γ:

Circulation of F around the closed path γ = ∮_γ FT ds = ∮_γ F · T ds.

(Circulation is just the flow of the fluid around a simple closed curve.)

Note: We do not calculate flow and circulation integrals any differently than we calculate work integrals. Our interpretation of the vector field F (as either velocity or force) completely determines whether we are calculating flow (or circulation) or work, respectively. The 5-step procedure given above is still the same.

Here are two quick examples.

Example 3.5.5. [Finding Flow Along a Helix] A fluid's velocity field is F(x, y, z) = 〈x, z, y〉. Find the flow along the helix

γ(t) = 〈cos(t), sin(t), t〉, 0 ≤ t ≤ π/2.

Solution. We calculate the flow integral ∫_γ F · dγ using the same 5-step procedure outlined above for work integrals.

Step 1. A parametrization of the path or curve γ has been given:

γ = γ(t) = 〈cos(t), sin(t), t〉; α = 0 ≤ t ≤ π/2 = β.

Step 2. Calculate the velocity at each point along the path, i.e., F(γ(t)):

F(γ(t)) = F(cos(t), sin(t), t) = 〈cos(t), t, sin(t)〉, 0 ≤ t ≤ π/2.

Step 3. Find γ′(t):

γ′(t) = 〈x′(t), y′(t), z′(t)〉 = 〈− sin(t), cos(t), 1〉; 0 ≤ t ≤ π/2.

Step 4. Dot the two vector functions F(γ(t)) and γ′(t):

F(γ(t)) · γ′(t) = 〈cos(t), t, sin(t)〉 · 〈− sin(t), cos(t), 1〉 = − sin(t) cos(t) + t cos(t) + sin(t).

Step 5. Integrate the expression − sin(t) cos(t) + t cos(t) + sin(t) over the given parameter interval 0 ≤ t ≤ π/2:

Flow = ∫_0^{π/2} (− sin(t) cos(t) + t cos(t) + sin(t)) dt = [cos²(t)/2 + t sin(t)]_{t=0}^{t=π/2} = π/2 − 1/2.

⊔⊓

Page 89: ELEMENTARY LINEAR ALGEBRA AND VECTOR CALCULUS

Example 3.5.6. [Finding Circulation Around a Circle] Find the circulation of the field F(x, y) = 〈x − y, x〉 around the unit circle

γ(t) = 〈cos t, sin t〉, 0 ≤ t ≤ 2π.

Solution. We calculate the circulation integral ∮_γ F · dγ using the same 5-step procedure for work integrals.

Step 1. A parametrization of the unit circle γ has been given:

γ = γ(t) = 〈cos(t), sin(t)〉; 0 = α ≤ t ≤ β = 2π.

Step 2. Calculate the velocity at each point along the circle:

F(γ(t)) = F(cos(t), sin(t)) = 〈cos(t) − sin(t), cos(t)〉, 0 ≤ t ≤ 2π.

Step 3. Find γ′(t):

γ′(t) = 〈x′(t), y′(t)〉 = 〈− sin(t), cos(t)〉; 0 ≤ t ≤ 2π.

Step 4. Dot the two vector functions F(γ(t)) and γ′(t):

F(γ(t)) · γ′(t) = 〈cos(t) − sin(t), cos(t)〉 · 〈− sin(t), cos(t)〉 = − sin(t) cos(t) + sin²(t) + cos²(t) = 1 − sin(t) cos(t).

Step 5. Integrate the expression 1 − sin(t) cos(t) over the given parameter interval 0 ≤ t ≤ 2π:

Circulation = ∫_0^{2π} (1 − sin(t) cos(t)) dt = [t − sin²(t)/2]_{t=0}^{t=2π} = 2π.

⊔⊓

FLUX ACROSS A CURVE IN THE PLANE

To find the rate at which a fluid is entering or leaving a region enclosed by a smooth curve γ in the xy-plane, we calculate the line integral over γ of F · n, the scalar component of the fluid's velocity field in the direction of the curve's outward-pointing normal vector. The value of this integral is the flux of F across γ. Flux is Latin for flow, but many flux calculations involve no motion at all. If F were an electric field or a magnetic field, for instance, the integral of F · n would still be called the flux of the field across γ.

Definition 3.5.6. [Flux Across a Closed Curve in the Plane] If γ is a smooth closed curve in the domain of a continuous vector field F(x, y) = 〈P(x, y), Q(x, y)〉 in the plane and if n is the outward-pointing unit normal vector on γ, the flux of F across γ is

Flux of F across γ = ∮_γ F · n ds. (3)

Page 90: ELEMENTARY LINEAR ALGEBRA AND VECTOR CALCULUS

PATH INDEPENDENCE, POTENTIAL FUNCTIONS, AND CONSERVATIVE VECTOR FIELDS

In gravitational and electric fields, the amount of work it takes to move a mass or a charge from one point to another depends only on the object's initial and final positions and not on the path taken in between. We now discuss the notion of path independence of work integrals and describe some properties of vector fields in which work integrals are path independent.

PATH INDEPENDENCE AND CONSERVATIVE VECTOR FIELDS

If A and B are two points in an open region E in space, the work ∫ F · dγ done in moving a particle from A to B by a field F defined on E usually depends on the path taken. For some special fields, however, the integral's value is the same for all paths from A to B. If this is true for all points A and B in E, we say that the integral ∫ F · dγ is path independent in E and that the field F is conservative on E.

Definition 3.5.7. [Path Independence and Conservative Field] Let F be a field defined on an open region E in space and suppose that for any two points A and B in E the work ∫_A^B F · dγ done in moving a particle from A to B has the same value over all paths γ from A to B. Then the integral ∫ F · dγ is said to be path independent in E and the vector field F is said to be conservative on E.

Under conditions normally met in practice, a vector field F is conservative if and only if it is the gradient field of a scalar function ϕ; that is, if and only if F = ∇ϕ for some ϕ. The function ϕ is then called a potential function for F.

Definition 3.5.8. [Potential Function] If F is a vector field defined on E and F = ∇ϕ for some scalar functionϕ defined on an open region E in space, then ϕ is called a potential function for F on E .

An electric potential is a scalar function whose gradient field is an electric field. A gravitational potential is a scalar function whose gradient field is a gravitational field, and so on. As we will see, once we have found a potential function ϕ for a vector field F, we can evaluate all the work integrals in the domain of F with the formula

∫_A^B F · dγ = ∫_A^B ∇ϕ · dγ = ϕ(B) − ϕ(A).

If you think of ∇ϕ for functions ϕ of several variables as being something like the derivative f′ for functions of a single variable, then you see that the above equation is the vector calculus analogue of the Fundamental Theorem of Calculus formula

∫_a^b f′(x) dx = f(b) − f(a).

Conservative vector fields have other remarkable properties, as we will see as we go along. For example, saying that a vector field F is conservative on E is equivalent to saying that the integral of F around every closed path (or closed curve) in E is zero. Naturally, we need to impose conditions on the curves, vector fields, and domains to make the above equation and its implications hold.

Assumptions we make from now on: Connectivity

1. All curves we consider are piecewise smooth; recall that this means the curve is made up of finitely many smooth pieces connected end to end, as discussed in the earlier section on curves.

2. The component functions of the vector field F have continuous first partial derivatives on some region E in space. When F = ∇ϕ, this continuity requirement guarantees that the mixed second derivatives of the potential function ϕ are equal (Clairaut's Theorem), a result we will find revealing in studying conservative vector fields.

3. E is an open region in space. This means essentially that every point in E is the center of a sphere that lies entirely in E.

4. E is connected (all in one piece), which for an open region means every point of E can be connected to every other point of E by a smooth curve that lies entirely in E.

Page 91: ELEMENTARY LINEAR ALGEBRA AND VECTOR CALCULUS

Theorem 3.5.7. [Independence of Path Theorem] Let F be a continuous vector field on an open and connected set D (a subset of Rⁿ). Then the line integral

∫_γ F · dγ

is independent of path in D if and only if F = ∇ϕ for some scalar function ϕ : Rⁿ → R (ϕ is called a potential function for F); that is, if and only if F is a conservative vector field on D.

Notice that this theorem is just a generalized restatement of the definition given above.

Theorem 3.5.8. [Equivalent Conditions for Line Integrals] Let F be a continuous vector field on an open connected subset D of Rⁿ. Then the following conditions are equivalent:

A) The vector field F is conservative on D (i.e., F = ∇ϕ for some scalar function ϕ : Rⁿ → R).

B) The line integral ∫_γ F · dγ is independent of path in D.

C) ∮_γ F · dγ = 0 for every closed path γ in D.

Theorem 3.5.9. [Test for Path Independence in Space] Let F(x, y, z) = 〈P(x, y, z), Q(x, y, z), R(x, y, z)〉, where the scalar functions P, Q, and R are continuous together with their first-order partial derivatives on an open and simply connected subset E of R³ (simple connectivity is needed for the "if" direction). Then the vector field F is conservative on E if and only if ∇ × F = 0, that is, if and only if

∂P/∂y = ∂Q/∂x,  ∂P/∂z = ∂R/∂x,  ∂Q/∂z = ∂R/∂y.

In the two-variable case, where F(x, y) = 〈P(x, y), Q(x, y)〉 on an open and simply connected region D, F is conservative on D if and only if

∂P/∂y = ∂Q/∂x.
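The mixed-partial test of Theorem 3.5.9 is mechanical to carry out symbolically. A sketch assuming sympy, with an illustrative field that is not from the text:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')

# Illustrative field (an assumption): F = <2xy, x^2 + z^2, 2yz>
P, Q, R = 2*x*y, x**2 + z**2, 2*y*z

# The three mixed-partial conditions of Theorem 3.5.9; all zero means curl F = 0
conditions = [sp.simplify(sp.diff(P, y) - sp.diff(Q, x)),
              sp.simplify(sp.diff(P, z) - sp.diff(R, x)),
              sp.simplify(sp.diff(Q, z) - sp.diff(R, y))]
print(conditions)  # [0, 0, 0]
```

All three conditions vanish, so this F is conservative; one potential is ϕ = x²y + yz².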

Theorem 3.5.10. [The Fundamental Theorem for Line Integrals] Let γ be a piecewise smooth curve given parametrically by γ = γ(t), a ≤ t ≤ b, which begins at a = γ(a) and ends at b = γ(b). If the scalar function ϕ is continuously differentiable on an open set containing the curve γ, then

∫_γ ∇ϕ(γ) · dγ = ϕ(b) − ϕ(a).
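Theorem 3.5.10 can be illustrated by computing both sides for a hypothetical potential ϕ = xyz along the curve γ(t) = 〈t, t², t³〉 (illustrative assumptions; sympy assumed):

```python
import sympy as sp

t = sp.symbols('t')
x, y, z = sp.symbols('x y z')

phi = x * y * z                 # an assumed potential, for illustration
gamma = (t, t**2, t**3)         # runs from (0,0,0) at t = 0 to (1,1,1) at t = 1
on_curve = dict(zip((x, y, z), gamma))

# Left side: direct evaluation of the line integral of grad(phi) along gamma
grad = [sp.diff(phi, v) for v in (x, y, z)]
integrand = sum(g.subs(on_curve) * sp.diff(c, t) for g, c in zip(grad, gamma))
lhs = sp.integrate(integrand, (t, 0, 1))

# Right side: phi(end point) - phi(start point)
rhs = phi.subs({x: 1, y: 1, z: 1}) - phi.subs({x: 0, y: 0, z: 0})
print(lhs, rhs)  # 1 1
```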

Exercise 3.5.7. For each point (x, y, z) in R³ \ {(0, 0, 0)}, let F(x, y, z) be a vector pointed toward the origin (0, 0, 0) with magnitude inversely proportional to the distance from the origin; that is, let

F = F(x, y, z) = −κγ/‖γ‖² = −κ 〈 x/(x² + y² + z²), y/(x² + y² + z²), z/(x² + y² + z²) 〉,

where γ = 〈x, y, z〉. Show that the vector field F is conservative on its domain E of definition by finding a potential function for F.

Page 92: ELEMENTARY LINEAR ALGEBRA AND VECTOR CALCULUS

3.6. GREEN’S THEOREM IN THE PLANE

In the preceding section, we learned how to evaluate work integrals for conservative fields: we found a potential function for the field, evaluated it at the path endpoints, and calculated the integral as the appropriate difference of those values.

In this section, we see how to evaluate flow and flux integrals across closed plane curves when the vector field is not conservative. The means for doing so is a theorem known as Green's Theorem, which converts line integrals to double integrals.

Green's Theorem is one of the great theorems of calculus. It is deep and surprising, and has far-reaching consequences. In pure mathematics, it ranks in importance with the Fundamental Theorem of Calculus. In applied mathematics, the generalizations of Green's Theorem in three dimensions provide the foundation for theorems about electricity, magnetism, and fluid flow.

We talk in terms of velocity fields of fluid flows because fluid flows are easy to picture. Be aware, however, that Green's Theorem applies to any vector field satisfying certain mathematical conditions; it does not depend for its validity on the field's having a particular physical interpretation.

Flux Density at a Point: Divergence

We need two new ideas for Green's Theorem. The first is the idea of the flux density of a vector field at a point, which in mathematics is called the divergence of the vector field. We obtain it in the following way:

Suppose we are given a vector field F(x, y) = P(x, y) i + Q(x, y) j, which we interpret as the velocity field of a fluid flow in the plane, and that the first partial derivatives of P and Q are continuous at each point of a region D. We let (x, y) be a point in the region and let R denote a small rectangle with one of its vertices at (x, y) that, along with its interior, lies entirely in the region D. The sides of the rectangle, parallel to the coordinate axes, have lengths ∆x and ∆y. The rate at which fluid leaves the rectangle across the bottom edge is approximately

F(x, y) · (−j) ∆x = −Q(x, y) ∆x.

This is the scalar component of the velocity at (x, y) in the direction of the outward normal times the length of the segment. If the velocity is in meters per second, for example, the exit rate will be in meters per second times meters, or square meters per second. The rates at which the fluid crosses the other three sides in the directions of their outward normals can be estimated in a similar way. All told, we have:

Exit Rates:

Top: F(x, y + ∆y) · j ∆x = Q(x, y + ∆y) ∆x
Bottom: F(x, y) · (−j) ∆x = −Q(x, y) ∆x
Right: F(x + ∆x, y) · i ∆y = P(x + ∆x, y) ∆y
Left: F(x, y) · (−i) ∆y = −P(x, y) ∆y.

Combining opposite pairs gives us:

Top and Bottom: (Q(x, y + ∆y) − Q(x, y)) ∆x ≈ (∂Q/∂y ∆y) ∆x
Right and Left: (P(x + ∆x, y) − P(x, y)) ∆y ≈ (∂P/∂x ∆x) ∆y.

Adding these last two equations gives

Flux across rectangle boundary ≈ (∂P/∂x + ∂Q/∂y) ∆x ∆y.

We now divide by ∆x ∆y to estimate the total flux per unit area or flux density for the rectangle:

(Flux across rectangle boundary)/(rectangle area) ≈ ∂P/∂x + ∂Q/∂y.

Page 93: ELEMENTARY LINEAR ALGEBRA AND VECTOR CALCULUS

Finally, we let ∆x and ∆y approach zero to define what we call the flux density of F at the point (x, y).

In mathematics, we call the flux density the divergence of F. The symbol for it is div F, pronounced “divergenceof F” or “div F.”

Definition 3.6.1. [Flux Density or Divergence] The flux density or divergence of a vector field F(x, y) = P(x, y) i + Q(x, y) j at the point (x, y) in the plane is the scalar

div F = ∇ · F = ∂P/∂x + ∂Q/∂y. (1)

If F(x, y, z) = P(x, y, z) i + Q(x, y, z) j + R(x, y, z) k, then at the point (x, y, z) in space the flux density or divergence of F is the scalar

div F = ∇ · F = ∂P/∂x + ∂Q/∂y + ∂R/∂z.

Intuitively, if water were flowing into a region through a small hole at the point (x0, y0), the lines of flow would diverge there (hence the name) and, since water would be flowing out of a small rectangle about (x0, y0), the divergence of F at (x0, y0) would be positive. If water were draining out of the hole instead of flowing in, the divergence would be negative.
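The sign intuition can be checked symbolically. A sketch assuming sympy, using a source-like field and a sink-like field (both illustrative, not from the text):

```python
import sympy as sp

x, y = sp.symbols('x y')

def div2d(P, Q):
    """Divergence dP/dx + dQ/dy of a plane vector field <P, Q>."""
    return sp.diff(P, x) + sp.diff(Q, y)

# Source-like field: fluid flows straight out of the origin
print(div2d(x, y))    # 2  (positive: net outflow everywhere)
# Sink-like field: fluid drains toward the origin
print(div2d(-x, -y))  # -2 (negative: net inflow)
```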

Exercise 3.6.1. Find the divergence of F(x, y) = (x² − y) i + (xy − y²) j.

Circulation Density at a Point: The k-component of Curl

The second of the two ideas we need for Green's Theorem is the idea of the circulation density of a vector field F at a point.

To obtain it, we return to the velocity field

F(x, y) = P (x, y) i + Q(x, y) j

and the rectangle R. The counterclockwise circulation of F around the boundary of R is the sum of flow rates along the sides. For the bottom edge, the flow rate is approximately

F(x, y) · i ∆x = P(x, y) ∆x.

This is the scalar component of the velocity F(x, y) in the direction of the tangent vector i times the length of the segment. The rates of flow along the other sides in the counterclockwise direction are expressed in a similar way. In all, we have:

Top: F(x, y + ∆y) · (−i) ∆x = −P(x, y + ∆y) ∆x
Bottom: F(x, y) · i ∆x = P(x, y) ∆x
Right: F(x + ∆x, y) · j ∆y = Q(x + ∆x, y) ∆y
Left: F(x, y) · (−j) ∆y = −Q(x, y) ∆y.

We add opposite pairs to get:

Top and Bottom: −(P(x, y + ∆y) − P(x, y)) ∆x ≈ −(∂P/∂y ∆y) ∆x
Right and Left: (Q(x + ∆x, y) − Q(x, y)) ∆y ≈ (∂Q/∂x ∆x) ∆y.

Adding these last two equations gives

Circulation around rectangle's boundary ≈ (∂Q/∂x − ∂P/∂y) ∆x ∆y.

We now divide by ∆x ∆y to estimate the circulation per unit area or circulation density for the rectangle:

(Circulation around the rectangle's boundary)/(rectangle area) ≈ ∂Q/∂x − ∂P/∂y.

Page 94: ELEMENTARY LINEAR ALGEBRA AND VECTOR CALCULUS

Finally, we let ∆x and ∆y approach zero to define what we call the circulation density of F at the point (x, y).

The positive orientation of the circulation density for the plane is the counterclockwise rotation around the vertical axis, looking downward on the xy-plane from the tip of the (vertical) unit vector k. The circulation value is actually the k-component of a more general circulation vector we define later on, called the curl of the vector field F. For Green's Theorem, we need only the k-component.

Definition 3.6.2. [k-Component of Circulation Density or Curl] The k-component of the circulation density or curl of a vector field F(x, y) = P(x, y) i + Q(x, y) j at the point (x, y) is the scalar

curl F · k = (∇ × F) · k = ∂Q/∂x − ∂P/∂y. (2)

If water is moving about a region in the xy-plane in a thin layer, then the k-component of the circulation density, or curl, at a point (x0, y0) gives a way to measure how fast and in what direction a small paddle wheel will spin if it is put into the water at (x0, y0) with its axis perpendicular to the plane, parallel to k.
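The paddle-wheel intuition can likewise be checked symbolically. A sketch assuming sympy, with two illustrative fields not taken from the text:

```python
import sympy as sp

x, y = sp.symbols('x y')

def curl_k(P, Q):
    """k-component of curl: dQ/dx - dP/dy for the plane field <P, Q>."""
    return sp.diff(Q, x) - sp.diff(P, y)

# Rigid rotation about the origin: a paddle wheel spins counterclockwise everywhere
print(curl_k(-y, x))  # 2
# A gradient (conservative) field produces no spin
print(curl_k(x, y))   # 0
```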

Exercise 3.6.2. Find the k-component of the curl for the vector field

F(x, y) = (x² − y) i + (xy − y²) j.

GREEN’S THEOREM

Theorem 3.6.1. [Green's Theorem in the Plane] Let R be a domain (i.e., an open and connected subset) of the xy-plane and let γ be a positively oriented (counterclockwise) piecewise smooth simple closed curve in R whose interior is also in R. Let P(x, y) and Q(x, y) be functions defined, continuous, and having continuous first partial derivatives in R. Then

∮_γ P(x, y) dx + Q(x, y) dy = ∫∫_D (∂Q/∂x − ∂P/∂y) dA,

where D is the closed region (i.e., a domain together with its boundary curve) bounded by γ.

Two forms for Green’s Theorem

In one form, Green's Theorem says that under suitable conditions the outward flux of a vector field across a simple closed curve in the plane equals the double integral of the divergence of the field over the region enclosed by the curve.

Theorem 3.6.2. [Outward Flux-Divergence or Normal Form of Green's Theorem] The outward flux of a vector field F(x, y) = P(x, y) i + Q(x, y) j across a positively oriented simple closed curve γ (with arc-length parametrization) in the plane equals the double integral of div F over the region D enclosed by the curve γ:

Outward flux of F across γ = ∮_γ F · n ds = ∮_γ −Q(x, y) dx + P(x, y) dy = ∫∫_D (∂P/∂x + ∂Q/∂y) dA. (3)

Exercise 3.6.3. Calculate the outward flux of the vector field F(x, y) = x i + y² j across the square bounded by the lines x = ±1 and y = ±1 by direct methods and also by using Green's Theorem. Also, draw the positively oriented piecewise smooth parametrized simple closed curve γ enclosing the region, as stated in Green's Theorem.

Page 95: ELEMENTARY LINEAR ALGEBRA AND VECTOR CALCULUS

In another form, Green's Theorem says that the counterclockwise circulation of a vector field around a simple closed curve is the double integral of the k-component of the curl of the field over the region enclosed by the curve.

Theorem 3.6.3. [Counterclockwise Circulation-Curl or Tangential Form of Green's Theorem] The counterclockwise circulation of a vector field F(x, y) = P(x, y) i + Q(x, y) j around a positively oriented simple closed curve γ (with arc-length parametrization) in the plane equals the double integral of the k-component of the curl of the field (i.e., the double integral of curl F · k) over the region D enclosed by the curve γ:

Counterclockwise circulation of F around γ = ∮_γ F · T ds = ∮_γ P(x, y) dx + Q(x, y) dy = ∫∫_D (∂Q/∂x − ∂P/∂y) dA.
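Both sides of the tangential form can be computed symbolically for the field F(x, y) = 〈x − y, x〉 of Example 3.5.6 on the unit disk. A sketch assuming sympy (the polar-coordinate setup is ours, not the text's):

```python
import sympy as sp

t, r, theta, x, y = sp.symbols('t r theta x y')
P, Q = x - y, x                      # the field of Example 3.5.6

# Line-integral side: circulation around the unit circle gamma(t) = <cos t, sin t>
on_circle = {x: sp.cos(t), y: sp.sin(t)}
line_side = sp.integrate(P.subs(on_circle) * sp.diff(sp.cos(t), t)
                         + Q.subs(on_circle) * sp.diff(sp.sin(t), t),
                         (t, 0, 2 * sp.pi))

# Double-integral side over the unit disk, in polar coordinates (dA = r dr dtheta)
curl_k = sp.diff(Q, x) - sp.diff(P, y)                 # = 2, a constant
double_side = sp.integrate(curl_k * r, (r, 0, 1), (theta, 0, 2 * sp.pi))

print(line_side, double_side)  # 2*pi 2*pi
```

Both sides agree with the value 2π found in Example 3.5.6.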

Exercise 3.6.4. Calculate the counterclockwise circulation of the vector field F(x, y) = tan⁻¹(y/x) i + ln(x² + y²) j around the curve γ, where γ is the boundary of the region defined by the polar coordinate inequalities 1 ≤ r ≤ 2, 0 ≤ θ ≤ π, using only Green's Theorem. Also draw a picture of the region together with its positively oriented boundary curve γ. What is the length of γ? Use your arc-length formula to verify your answer.

Page 96: ELEMENTARY LINEAR ALGEBRA AND VECTOR CALCULUS

3.7. SURFACE INTEGRALS, STOKES' THEOREM

SURFACE INTEGRALS OF SCALAR FIELDS

Let G(x, y, z) be a continuous scalar function of three variables (or possibly more) defined on a subset S of R³. Suppose S is a surface defined by the equation z = f(x, y) for all (x, y) in a subset D of R². We partition D into n sub-rectangles Ri; this results in a corresponding partition of the surface S into n surface patches Gi. Choose a sample point (x*i, y*i) ∈ Ri, and let (x*i, y*i, z*i) = (x*i, y*i, f(x*i, y*i)) be the corresponding point on the surface patch Gi. Then we define the surface integral of G over the surface S to be

∫∫_S G(x, y, z) dS = lim_{‖P‖→0} Σ_{i=1}^n G(x*i, y*i, z*i) ∆Si (provided this limit exists),

where ∆Si is the area of the ith surface patch Gi. This definition extends in a natural way to non-rectangular regions in R² (by giving G the value 0 outside D).

Please note the difference between this definition of the surface integral and the definition of the line integral. Pay close attention to ∆si (arc length along the curve) versus ∆Si (area of the surface patch Gi). The surface integral generalizes the line integral.

The defining integral above takes on different meanings in different applications. If G(x, y, z) has the constant value 1, then the integral gives the area of S. If G(x, y, z) gives the mass density of a thin shell of material modeled by S, the integral gives the mass of the shell.

Surface integrals behave like other double integrals: the integral of the sum of two functions is the sum of their integrals, and so on. The domain additivity property takes the form

∫∫_S G(x, y, z) dS = ∫∫_{S1} G(x, y, z) dS1 + ∫∫_{S2} G(x, y, z) dS2 + · · · + ∫∫_{Sn} G(x, y, z) dSn.

The idea is that if S is partitioned by smooth curves into a finite number of non-overlapping patches (i.e., if S is piecewise smooth), then the integral over S is the sum of the integrals over the patches. Thus, the integral of a function over the surface of a cube is the sum of the integrals over the faces of the cube.

Theorem 3.7.1. [Evaluation Theorem(s) for Surface Integrals] Let S be a piecewise smooth surface given by z = f(x, y), where (x, y) is in a region Dxy of the xy-plane. If z = f(x, y) has continuous first-order partial derivatives and G(x, y, z) = G(x, y, f(x, y)) is continuous on Dxy, then

    ∫∫_S G(x, y, z) dS = ∫∫_{Dxy} G(x, y, f(x, y)) √(fx(x, y)² + fy(x, y)² + 1) dA.   (4)

Let S be a piecewise smooth surface given by y = h(x, z), where (x, z) is in a region Dxz of the xz-plane. If y = h(x, z) has continuous first-order partial derivatives and G(x, y, z) = G(x, h(x, z), z) is continuous on Dxz, then

    ∫∫_S G(x, y, z) dS = ∫∫_{Dxz} G(x, h(x, z), z) √(hx(x, z)² + hz(x, z)² + 1) dA.   (5)

Let S be a piecewise smooth surface given by x = u(y, z), where (y, z) is in a region Dyz of the yz-plane. If x = u(y, z) has continuous first-order partial derivatives and G(x, y, z) = G(u(y, z), y, z) is continuous on Dyz, then

    ∫∫_S G(x, y, z) dS = ∫∫_{Dyz} G(u(y, z), y, z) √(uy(y, z)² + uz(y, z)² + 1) dA.   (6)

In general, if we view a given surface S as a level surface H(x, y, z) = c, where H is a continuous function defined at the points of the surface S and D is the shadow region of S, then we can write

    ∫∫_S G(x, y, z) dS = ∫∫_D G(x, y, z) ‖∇H(x, y, z)‖ / |∇H(x, y, z) · p| dA,   (7)

where p is a unit vector normal to the shadow region D and ∇H · p ≠ 0.

In the most popular case, where the surface S is given by the equation z = f(x, y), we let H(x, y, z) = z − f(x, y) = 0 = c (or H(x, y, z) = f(x, y) − z = 0 = c); the unit vector normal to the shadow region D of S is then p = k, and (7) reduces to formula (4) above.

Page 97: ELEMENTARY LINEAR ALGEBRA AND VECTOR CALCULUS

Example 3.7.1. [Integrating over a Surface] Let us use the appropriate formula from among (4), (5), (6), and (7) to evaluate the surface integral

    ∫∫_S y dS,

where the surface S is defined by the equation z = f(x, y) = x + y² on D = {(x, y) : 0 ≤ x ≤ 1, 0 ≤ y ≤ 2}.

Solution. Using formula (4) (which is applicable to our situation z = f(x, y)) we see that we must let

    G(x, y, z) = y,   fx(x, y) = 1,   fy(x, y) = 2y,   so that   √(fx(x, y)² + fy(x, y)² + 1) = √2 √(2y² + 1).

Thus, we have

    ∫∫_S G(x, y, z) dS = ∫∫_D G(x, y, f(x, y)) √(fx(x, y)² + fy(x, y)² + 1) dA

                       = ∫∫_D y √2 √(2y² + 1) dA = √2 ∫∫_D y √(2y² + 1) dA   (all about Calc II & III!)

                       = 2^(−3/2) ∫₀¹ [∫₀² √(2y² + 1) · 4y dy] dx   (substituting u = 2y² + 1, du = 4y dy)

                       = 2^(−3/2) ∫₁⁹ √u du = 13√2/3.
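The value 13√2/3 is easy to confirm with a computer algebra system. The sketch below (using the sympy library, my choice of tool, not something the notes prescribe) evaluates the right-hand side of formula (4) for this example directly:

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x + y**2                         # the surface z = f(x, y) of Example 3.7.1
G = y                                # the integrand G(x, y, z) = y on the surface

# Area element sqrt(fx^2 + fy^2 + 1) from formula (4)
dS = sp.sqrt(sp.diff(f, x)**2 + sp.diff(f, y)**2 + 1)

# Integrate over D = [0, 1] x [0, 2]
val = sp.simplify(sp.integrate(G * dS, (x, 0, 1), (y, 0, 2)))
print(val)   # equals 13*sqrt(2)/3
```

The same u = 2y² + 1 substitution happens internally; sympy just spares us the bookkeeping.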

Using formula (7), we view the given surface S, whose equation is z = f(x, y) = x + y², as a level surface H(x, y, z) = z − f(x, y) = 0, so H(x, y, z) = z − x − y². Now

    ∇H(x, y, z) = ⟨∂H/∂x, ∂H/∂y, ∂H/∂z⟩ = ⟨−1, −2y, 1⟩   and   ‖∇H(x, y, z)‖ = ‖−1 i − 2y j + 1 k‖ = √2 √(2y² + 1),

and since p = k is a unit vector normal to the shadow region D (in the xy-plane) of the level surface H(x, y, z) = 0, we have |∇H(x, y, z) · p| = |∇H(x, y, z) · k| = |1| = 1, and so ‖∇H(x, y, z)‖ / |∇H(x, y, z) · p| = √2 √(2y² + 1). Thus we can evaluate the surface integral as follows:

    ∫∫_S G(x, y, z) dS = ∫∫_D G(x, y, z) ‖∇H(x, y, z)‖ / |∇H(x, y, z) · p| dA = ∫∫_D y √2 √(2y² + 1) dA = 13√2/3,

which agrees with the result obtained using formula (4). ⊔⊓

Exercise 3.7.1. Use formula (7) to evaluate the surface integral

    ∫∫_S xyz dS,

where S is the surface of the cube cut from the first octant by the planes x = 1, y = 1, and z = 1.

Exercise 3.7.2. Use formula (7) in the cases p = k and p = j above to evaluate the surface integral

    ∫∫_S xyz dS,

where S is the portion of the cone z² = x² + y² between the planes z = 1 and z = 4. Include a sketch of S and a sketch of D (the shadow region of S) on separate coordinate axes. Note that this problem asks for two evaluations of the given surface integral!


A TIDBIT ABOUT ORIENTABLE SURFACES!!!

We call a smooth surface S orientable or two-sided if it is possible to define a vector field n = n(x, y, z) of unit normal vectors on S that varies continuously with position. Any patch or sub-portion of an orientable surface is still orientable. Spheres and other smooth closed surfaces in space (smooth surfaces that enclose solids) are orientable. By convention, we choose the unit normal vector field n = n(x, y, z) on a closed surface S to point outward and call n(x, y, z) the outward unit normal at the point (x, y, z) on the surface S.

Once n has been chosen, we say that we have oriented the surface S, and we call the surface together with its normal vector field an oriented surface. The vector n at any point on the surface is called the positive direction at that point.

SURFACE INTEGRALS OF VECTOR FIELDS

Suppose that F(x, y, z) = P(x, y, z) i + Q(x, y, z) j + R(x, y, z) k is a continuous vector field defined over an oriented (two-sided) surface S and that n is the chosen unit normal field on the surface. We call the surface integral of F · n over S the outward flux of F across S. Thus, flux is the integral over the surface S of the scalar component of F in the direction of the outward unit normal n.

Surface Integral for Flux

Definition 3.7.1. [Outward Flux Across a Surface] The flux of a three-dimensional vector field F(x, y, z) = P(x, y, z) i + Q(x, y, z) j + R(x, y, z) k across an oriented surface S in the direction of the chosen outward unit normal n is given as

    Flux of F across S in the direction of n = ∫∫_S F · dS = ∫∫_S (F · n) dS,   where dS = n dS.

If we view the vector field F as the velocity field of a three-dimensional fluid flow (such as air), the flux of F across S is the net rate at which fluid is crossing the surface S in the chosen positive direction.

Note: If S is part of a level surface H(x, y, z) = c, then n may be taken to be one of the two fields

    n = n(x, y, z) = +∇H(x, y, z)/‖∇H(x, y, z)‖   or   n = n(x, y, z) = −∇H(x, y, z)/‖∇H(x, y, z)‖,

depending on which one gives the preferred direction (you can determine the right n by testing at a convenient point (x, y, z) on S). The corresponding outward flux across S is

    Outward flux of F across S = ∫∫_S F · dS

                               = ∫∫_S (F · n) dS   (please note the difference between dS and dS = n dS)

                               = ∫∫_D (F · (±∇H(x, y, z))/‖∇H(x, y, z)‖) ‖∇H(x, y, z)‖/|∇H(x, y, z) · p| dA

                               = ∫∫_D (1/|∇H(x, y, z) · p|) (F · (±∇H(x, y, z))) dA

                               = ± ∫∫_D (1/|∇H(x, y, z) · p|) (F · ∇H(x, y, z)) dA.   (6)

Exercise 3.7.3. Find the outward flux of the vector field F(x, y, z) = yz j + z² k across S, where S is the portion cut from the cylinder y² + z² = 1, z ≥ 0, by the planes x = 0 and x = 1. Include a sketch of S and a sketch of the shadow region D (the projection of the surface S onto the xy-plane).


Mass and Moment formulas for very thin shells

Mass:   M = ∫∫_S ρ dS   (ρ = ρ(x, y, z) = density at (x, y, z), mass per unit area)

First moments about the coordinate planes:

    Myz = ∫∫_S xρ dS,   Mxz = ∫∫_S yρ dS,   Mxy = ∫∫_S zρ dS

Coordinates of the center of mass:

    x̄ = Myz/M,   ȳ = Mxz/M,   z̄ = Mxy/M

Moments of inertia about the coordinate axes:

    Ix = ∫∫_S (y² + z²)ρ dS,   Iy = ∫∫_S (x² + z²)ρ dS,   Iz = ∫∫_S (x² + y²)ρ dS,

and about a line L:

    IL = ∫∫_S r²ρ dS,   where r = r(x, y, z) is the distance from the point (x, y, z) to the line L.

Radius of gyration about a line L:   RL = √(IL/M)

Example 3.7.2. Let's find the coordinates of the center of mass of a thin hemispherical shell of radius R > 0 whose mass per unit area at each point (x, y, z) is the constant κ (i.e., the density ρ = ρ(x, y, z) = κ is constant on the shell).

Solution. We model the shell with the hemisphere

    H(x, y, z) = x² + y² + z² = R²,   z ≥ 0.

The symmetry of the surface about the z-axis tells us that x̄ = ȳ = 0. It remains only to calculate z̄ from the formula z̄ = Mxy/M. The mass M of the shell is given by

    M = ∫∫_S ρ dS = κ ∫∫_S dS = κ A(S) = (1/2)(4πR²)κ = 2πR²κ.

To evaluate the surface integral Mxy we take p to be k, so that

    ‖∇H(x, y, z)‖ = ‖2⟨x, y, z⟩‖ = 2√(x² + y² + z²) = 2R

and |∇H(x, y, z) · p| = |∇H(x, y, z) · k| = 2|z| = 2z. Recall that

    dS = ‖∇H(x, y, z)‖/|∇H(x, y, z) · p| dA = (2R/2z) dA = (R/z) dA.

Then

    Mxy = ∫∫_S zρ dS = ∫∫_D z κ (R/z) dA = Rκ A(D) = Rκ(πR²) = πκR³,

and now we have the z-coordinate of the center of mass as

    z̄ = Mxy/M = πκR³/(2πκR²) = R/2.


Hence the center of mass of the thin hemispherical shell of radius R > 0 with constant density is (0, 0, R/2). ⊔⊓
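The same mass and moment can be recomputed via the spherical parametrization of the hemisphere, which we will meet formally below. Here is a small sympy sketch (the library choice is mine, not the notes'):

```python
import sympy as sp

u, v = sp.symbols('u v', real=True)
R, kappa = sp.symbols('R kappa', positive=True)

# Spherical parametrization of the upper hemisphere of radius R
r = sp.Matrix([R*sp.sin(u)*sp.cos(v), R*sp.sin(u)*sp.sin(v), R*sp.cos(u)])
n = r.diff(u).cross(r.diff(v))

# On 0 <= u <= pi/2 the area element is R^2 sin(u) du dv; verify its square first
assert sp.simplify(n.dot(n) - (R**2*sp.sin(u))**2) == 0
dS = R**2*sp.sin(u)

M   = sp.integrate(kappa*dS,      (u, 0, sp.pi/2), (v, 0, 2*sp.pi))
Mxy = sp.integrate(r[2]*kappa*dS, (u, 0, sp.pi/2), (v, 0, 2*sp.pi))
# M = 2*pi*kappa*R**2, Mxy = pi*kappa*R**3, so zbar = R/2
print(sp.simplify(Mxy/M))
```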

Exercise 3.7.4. In Example 3.7.2 we avoided direct computation of the x̄ and ȳ coordinates of the center of mass of the thin shell with the phrase "the symmetry of the surface about the z-axis tells us that x̄ = 0 and ȳ = 0." Show that we were justified in making such a statement.

PARAMETRIC SURFACES !!!!

For a surface S in three-dimensional space we are accustomed to the explicit forms z = f(x, y), x = g(y, z), and y = u(x, z), and the implicit form F(x, y, z) = c = 0 (also known as a level surface for the function F). There is also a parametric form that gives the position of a point (x, y, z) on the surface as a vector function of two variables.

Consider a function r : D → V3, where V3 is the three-dimensional real vector space ℝ³ with the standard basis, defined as

    r = r(u, v) = f(u, v) i + g(u, v) j + h(u, v) k,   (u, v) ∈ D,   (9)

a continuous vector function defined on a region D in the uv-plane and one-to-one on the interior of D. We call the range of r the surface S defined or traced out by r. Equation (9) together with the domain D constitutes a parametrization of the surface S. The variables u and v are the parameters, and D is the parameter domain. To simplify our discussion, we take D to be the rectangle R defined by inequalities of the form R = {(u, v) : a ≤ u ≤ b, c ≤ v ≤ d}. The requirement that r be one-to-one on the interior of R ensures that S does not cross itself. Notice that Equation (9) is the vector equivalent of three parametric equations:

    x = f(u, v),   y = g(u, v),   z = h(u, v),   where (u, v) ∈ R.

PARAMETRIZATIONS OF SOME POPULAR SURFACES

Example 3.7.3. [Parametrizing the Cone] Find a parametrization of the cone

    z²/c² = x²/a² + y²/b².

Solution. Cylindrical coordinates provide everything we need. A typical point (x, y, z) on the cone has x = ar cos θ, y = br sin θ, and z = cr, with −H/c ≤ r ≤ H/c and 0 ≤ θ ≤ 2π. Taking u = r and v = θ in Equation (9) gives the parametrization

    r(r, θ) = a(r cos θ) i + b(r sin θ) j + cr k,   −H/c ≤ r ≤ H/c,   0 ≤ θ ≤ 2π.

Note: If a = b = c, then we have the familiar RIGHT CIRCULAR CONE. ⊔⊓

Example 3.7.4. [Parametrizing an Ellipsoid] Find a parametrization of the ellipsoid

    x²/a² + y²/b² + z²/c² = 1.

Solution. Spherical coordinates provide what we need. A typical point (x, y, z) on the ellipsoid has x = a sin φ cos θ, y = b sin φ sin θ, and z = c cos φ, with 0 ≤ φ ≤ π and 0 ≤ θ ≤ 2π. Taking u = φ and v = θ in Equation (9) gives the parametrization

    r(φ, θ) = (a sin φ cos θ) i + (b sin φ sin θ) j + (c cos φ) k,   0 ≤ φ ≤ π,   0 ≤ θ ≤ 2π.

Note: If a = b = c, then we have the familiar SPHERE. ⊔⊓


Example 3.7.5. [Parametrizing an Elliptic Cylinder] Find a parametrization of the cylinder

    x²/a² + y²/b² = 1   (its equation may look like an ellipse; but it is not!).

Solution. Cylindrical coordinates provide what we need. A typical point (x, y, z) on the elliptic cylinder has x = a cos θ, y = b sin θ, and z = z, with 0 ≤ θ ≤ 2π and −H ≤ z ≤ H. Taking u = θ and v = z in Equation (9) gives the parametrization

    r(θ, z) = a cos θ i + b sin θ j + z k,   0 ≤ θ ≤ 2π,   −H ≤ z ≤ H,   H > 0.

Note: If a = b, then we have the familiar RIGHT CIRCULAR CYLINDER OF HEIGHT 2H. ⊔⊓

Example 3.7.6. [Parametrizing an Elliptic Paraboloid] Find a parametrization of the elliptic paraboloid

    z/c = x²/a² + y²/b²   (its equation may look like an ellipse; but it is not!).

Solution. Cylindrical coordinates provide what we need. A typical point (x, y, z) on the elliptic paraboloid has x = ar cos θ, y = br sin θ, and z = cr², with 0 ≤ θ ≤ 2π and 0 ≤ r ≤ √(H/c). Taking u = r and v = θ in Equation (9) gives the parametrization

    r(r, θ) = ar cos θ i + br sin θ j + cr² k,   0 ≤ θ ≤ 2π,   0 ≤ r ≤ √(H/c),   H > 0.

Note: If a = b, then we have the familiar CIRCULAR PARABOLOID. ⊔⊓

Example 3.7.7. [Parametrizing a Hyperbolic Paraboloid] Find a parametrization of the hyperbolic paraboloid

    z/c = x²/a² − y²/b².

Solution. Hyperbolic functions provide what we need. A typical point (x, y, z) on the hyperbolic paraboloid has x = ar cosh θ, y = br sinh θ, and z = cr², with 0 ≤ θ ≤ 2π and 0 ≤ r ≤ √(H/c); since cosh² θ − sinh² θ = 1, these points satisfy the equation (this parametrization covers the portion of the surface with x ≥ 0). Taking u = r and v = θ in Equation (9) gives the parametrization

    r(r, θ) = ar cosh θ i + br sinh θ j + cr² k,   0 ≤ θ ≤ 2π,   0 ≤ r ≤ √(H/c),   H > 0.

Note: Unlike the previous examples, taking a = b here does not produce a surface of revolution; a hyperbolic paraboloid is saddle-shaped. ⊔⊓
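Each parametrization above can be verified symbolically by substituting it into the defining equation of its surface. A small check with sympy (the tool choice is mine):

```python
import sympy as sp

r, t = sp.symbols('r t', real=True)
a, b, c = sp.symbols('a b c', positive=True)

# Cone (Example 3.7.3): x = a r cos t, y = b r sin t, z = c r
x, y, z = a*r*sp.cos(t), b*r*sp.sin(t), c*r
assert sp.simplify(z**2/c**2 - x**2/a**2 - y**2/b**2) == 0

# Ellipsoid (Example 3.7.4): here r plays the role of phi, t of theta
x, y, z = a*sp.sin(r)*sp.cos(t), b*sp.sin(r)*sp.sin(t), c*sp.cos(r)
assert sp.simplify(x**2/a**2 + y**2/b**2 + z**2/c**2 - 1) == 0

# Elliptic paraboloid (Example 3.7.6): x = a r cos t, y = b r sin t, z = c r^2
x, y, z = a*r*sp.cos(t), b*r*sp.sin(t), c*r**2
assert sp.simplify(z/c - x**2/a**2 - y**2/b**2) == 0

# Hyperbolic paraboloid (Example 3.7.7): x = a r cosh t, y = b r sinh t, z = c r^2
x, y, z = a*r*sp.cosh(t), b*r*sp.sinh(t), c*r**2
assert sp.simplify(z/c - x**2/a**2 + y**2/b**2) == 0

print("all parametrizations satisfy their surface equations")
```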

Here is what I call the trivial parametrization: Suppose a smooth surface S is given in one of the explicit forms above. Then we always have these three natural parametrizations:

    r(x, y) = x i + y j + f(x, y) k,   for (x, y) ∈ Domain(f) and z = f(x, y);
    r(x, z) = x i + g(x, z) j + z k,   for (x, z) ∈ Domain(g) and y = g(x, z);
    r(y, z) = h(y, z) i + y j + z k,   for (y, z) ∈ Domain(h) and x = h(y, z).


Definition 3.7.2. [Smooth Parametrized Surface] A parametrized surface S given by

r(u, v) = f(u, v) i + g(u, v) j + h(u, v) k

is said to be smooth if the first-order partial derivatives ru and rv are continuous and ru × rv is never zero on theparameter domain. The surface S is said to be piecewise smooth if it is the union of finitely many smooth surfaces.

Area of Parametrized smooth surfaces

Theorem 3.7.2. [Area of a Parametrized Smooth Surface] The area of the smooth parametrized surface

    r(u, v) = f(u, v) i + g(u, v) j + h(u, v) k,   D = {(u, v) | a ≤ u ≤ b, c ≤ v ≤ d},

is given by

    The surface area of S = A(S) = ∫_c^d ∫_a^b ‖ru × rv‖ du dv.   (10)
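As a quick illustration of formula (10), the sketch below computes the lateral area of the right circular cylinder of Example 3.7.5 (with a = b), using sympy for the cross product and the integration (the tooling is my assumption); the familiar answer, circumference times height, falls out:

```python
import sympy as sp

t, z = sp.symbols('t z', real=True)
a, H = sp.symbols('a H', positive=True)

# Right circular cylinder of radius a (Example 3.7.5 with a = b), 0 <= z <= H
r = sp.Matrix([a*sp.cos(t), a*sp.sin(t), z])
n = r.diff(t).cross(r.diff(z))

# ||r_t x r_z||^2 simplifies to a^2, so the area element is just a dt dz
assert sp.simplify(n.dot(n) - a**2) == 0

area = sp.integrate(a, (t, 0, 2*sp.pi), (z, 0, H))
print(area)   # equals 2*pi*a*H
```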

Exercise 3.7.5. Find the area of the closed surface S consisting of the upper-half circular cone z = (H/R)√(x² + y²) and the intersecting plane z = H, where H and R are positive constants denoting the height and radius of the cone, respectively.

Area of a non-parametrized smooth surface

Theorem 3.7.3. [Area of a Non-parametrized Smooth Level Surface] If a smooth surface S is given as a level surface H(x, y, z) = c (non-parametrized), then the area of S is given by

    A(S) = ∫∫_S dS = ∫∫_D ‖∇H(x, y, z)‖ / |∇H(x, y, z) · p| dA,   (11)

where p is a unit vector normal to the shadow region D of the surface S.
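Formula (11) can be exercised on the hemisphere of Example 3.7.2, where ‖∇H‖/|∇H · k| = 2R/2z = R/√(R² − x² − y²) over the shadow disk of radius R. In polar coordinates the computation is short enough to hand to sympy (my choice of tool; the integral is improper at r = R but convergent):

```python
import sympy as sp

r, t = sp.symbols('r t', nonnegative=True)
R = sp.Symbol('R', positive=True)

# On the hemisphere, ||grad H|| / |grad H . k| = R / sqrt(R^2 - x^2 - y^2).
# Over the shadow disk D of radius R, in polar coordinates dA = r dr dt:
integrand = R / sp.sqrt(R**2 - r**2) * r
area = sp.integrate(integrand, (r, 0, R), (t, 0, 2*sp.pi))
print(sp.simplify(area))   # equals 2*pi*R**2, half the area of the full sphere
```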

Exercise 3.7.6. Repeat Exercise 3.7.5 using formula (11). (Some of you may have done this before but do not recognize it! Do it again anyway!)

Theorem 3.7.4. [Evaluation Theorem for Surface Integrals over Parametrized Surfaces]

If S is a smooth surface defined parametrically as

    r(u, v) = f(u, v) i + g(u, v) j + h(u, v) k,   a ≤ u ≤ b,   c ≤ v ≤ d,

and G(x, y, z) is a continuous function defined on S, then the surface integral of G over S is

    ∫∫_S G(x, y, z) dS = ∫∫_R G(r(u, v)) ‖ru × rv‖ du dv
                       = ∫_c^d ∫_a^b G(f(u, v), g(u, v), h(u, v)) ‖ru × rv‖ du dv,   (12)

where R is the uv-parameter domain.

Exercise 3.7.7. Integrate G(x, y, z) = x² over the upper-half circular cone z = √(x² + y²), 0 ≤ z ≤ 1.


Outward Flux for parametrized surfaces

Theorem 3.7.5. [Evaluation Theorem for Outward Flux across a Parametrized Surface] Suppose that

    F(x, y, z) = P(x, y, z) i + Q(x, y, z) j + R(x, y, z) k

is a continuous vector field and that n = n(x, y, z) is the outward unit normal field on a smooth, positively oriented parametrized surface S. Then the outward flux of F across S is given by the surface integral

    Outward flux of F across S = ∫∫_S F · dS = ∫∫_S F · n dS = +∫∫_R F(r(u, v)) · (ru × rv) du dv.   (13)

Example 3.7.8. Find the outward flux of the vector field F(x, y, z) = ⟨x, y, z⟩ through the sphere

    r(φ, θ) = ⟨R sin φ cos θ, R sin φ sin θ, R cos φ⟩,   0 ≤ φ ≤ π,   0 ≤ θ ≤ 2π,

of radius R > 0.

Solution. First, F(r(φ, θ)) = R ⟨sin φ cos θ, sin φ sin θ, cos φ⟩. Next, rφ = R ⟨cos φ cos θ, cos φ sin θ, −sin φ⟩ and rθ = R ⟨−sin φ sin θ, sin φ cos θ, 0⟩, so that rφ × rθ = R² ⟨sin² φ cos θ, sin² φ sin θ, sin φ cos φ⟩. Therefore,

    F(r(φ, θ)) · (rφ × rθ) = R ⟨sin φ cos θ, sin φ sin θ, cos φ⟩ · R² ⟨sin² φ cos θ, sin² φ sin θ, sin φ cos φ⟩
                           = R³ (sin³ φ cos² θ + sin³ φ sin² θ + cos² φ sin φ)
                           = R³ (sin³ φ + cos² φ sin φ)
                           = R³ sin φ (sin² φ + cos² φ)
                           = R³ sin φ.

Now the outward flux is

    ∫∫_D F(r(φ, θ)) · (rφ × rθ) dφ dθ = R³ ∫₀^{2π} ∫₀^π sin φ dφ dθ = 4πR³ = 3 · (volume of the sphere).

⊔⊓
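This flux computation can be re-run symbolically, with sympy doing the cross product and the integration (the library is my choice; the parametrization is the one used in the example):

```python
import sympy as sp

u, v = sp.symbols('u v', real=True)
R = sp.Symbol('R', positive=True)

# Sphere parametrization from Example 3.7.8 (u plays the role of phi, v of theta)
r = sp.Matrix([R*sp.sin(u)*sp.cos(v), R*sp.sin(u)*sp.sin(v), R*sp.cos(u)])
F = r                                  # the field F(x, y, z) = <x, y, z> on the sphere
n = r.diff(u).cross(r.diff(v))         # points outward for this parametrization

integrand = sp.simplify(F.dot(n))      # reduces to R**3*sin(u)
flux = sp.integrate(integrand, (u, 0, sp.pi), (v, 0, 2*sp.pi))
print(sp.simplify(flux))               # equals 4*pi*R**3, three times the enclosed volume
```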


Example 3.7.9. Evaluate the surface integral ∫∫_S F · dS for the vector field F(x, y, z) = ⟨x, −z, y⟩, where S is the part of the sphere x² + y² + z² = 4 in the first octant. In other words, find the flux of F across S (this S is not closed!). For closed surfaces, use the positive (outward) orientation.

In this problem we are given the continuous vector field F = x i − z j + y k and the open smooth surface S, which is part of the closed surface (a sphere) x² + y² + z² = 4 in the first octant, with orientation towards the origin.

Solution.

Step 1. Use the formula

    Flux of F across S = ∫∫_S F · dS = ∫∫_S F · n dS = ± ∫∫_D (1/|∇H(x, y, z) · p|) (F · ∇H(x, y, z)) dA

from Definition 3.7.1.

Step 2. Choosing the required n: We view the sphere x² + y² + z² = 4 as the closed non-parametrized level surface H(x, y, z) = c = 0, where H(x, y, z) = x² + y² + z² − 4, so it is enough to choose n by calculating ±∇H(x, y, z) = ±2(x i + y j + z k) (vectors in the direction of n = n(x, y, z)). At the point (0, 0, 2) on the surface we see that ∇H(0, 0, 2) = 4k (which points away from the origin). Since the problem requires the opposite orientation, we choose n as a unit vector pointing in the direction of −∇H(x, y, z). Thus, we choose n in the direction of

    −∇H(x, y, z) = −(2x i + 2y j + 2z k).

Based on this choice of n, our formula in Step 1 above becomes

    ∫∫_S F · dS = ∫∫_S F · n dS = −∫∫_D (1/|∇H(x, y, z) · p|) (F · ∇H(x, y, z)) dA.

Step 3. Since the portion of the sphere x² + y² + z² = 4 that we are interested in lies in the first octant, we know that z ≥ 0 and the projection of S in the xy-plane is the quarter disk

    D = {(x, y) | 0 ≤ x ≤ 2, 0 ≤ y ≤ √(4 − x²)}.

Thus a unit vector normal to the shadow region D is k (i.e., p = k). Hence |∇H(x, y, z) · p| = |∇H(x, y, z) · k| = 2|z| = 2z (recall that z = √(4 − (x² + y²)) ≥ 0).

Step 4. We calculate

    −(1/|∇H(x, y, z) · p|) (F · ∇H(x, y, z)) dA.

This gives us

    −(1/|∇H(x, y, z) · p|) (F · ∇H(x, y, z)) dA = −(1/2z) ⟨x, −z, y⟩ · ⟨2x, 2y, 2z⟩ dA = −(x²/z) dA = −(x²/√(4 − (x² + y²))) dA.

Step 5. We can now integrate as follows:

    ∫∫_S F · dS = −∫∫_D (1/|∇H(x, y, z) · p|) (F · ∇H(x, y, z)) dA

                = −∫∫_D x²/√(4 − (x² + y²)) dA   (where D = {(x, y) | x ≥ 0, y ≥ 0, 0 ≤ x² + y² ≤ 4})

                = −∫₀^{π/2} ∫₀² (r² cos² θ/√(4 − r²)) r dr dθ   (in polar coordinates)

                = −(∫₀^{π/2} cos² θ dθ)(∫₀² r³(4 − r²)^{−1/2} dr).

Step 6. To get a result, we must now calculate the definite integrals

    ∫₀^{π/2} cos² θ dθ = (1/2)∫₀^{π/2} dθ + (1/4)∫₀^{π/2} cos 2θ · 2 dθ = π/4 + 0 = π/4


and

    ∫₀² (r²/√(4 − r²)) r dr   (we must be careful, because this integral is improper!).

Try letting τ = 4 − r², so that dτ = −2r dr and r² = 4 − τ. The integral becomes

    −(1/2)∫₄⁰ (4 − τ)/√τ dτ = (1/2)∫₀⁴ (4 − τ)/√τ dτ

                             = (1/2) lim_{λ→0⁺} [∫_λ⁴ (4τ^{−1/2} − τ^{1/2}) dτ]

                             = (1/2) lim_{λ→0⁺} (16 − 16/3 − 8√λ + (2/3)λ^{3/2})

                             = 16/3.

Step 7. Finally, we arrive at an answer (you check my calculations for errors!). The flux of F across S in the direction of n (towards the origin) is

    ∫∫_S F · dS = −∫∫_D (1/|∇H(x, y, z) · p|) (F · ∇H(x, y, z)) dA

                = −∫∫_D x²/√(4 − (x² + y²)) dA   (where D = {(x, y) | 0 ≤ x ≤ 2, 0 ≤ y ≤ √(4 − x²)})

                = −∫₀^{π/2} ∫₀² (r² cos² θ/√(4 − r²)) r dr dθ   (in polar coordinates)

                = −(∫₀^{π/2} cos² θ dθ)(∫₀² r³(4 − r²)^{−1/2} dr)

                = −(π/4) · (16/3) = −4π/3.

⊔⊓
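The two one-dimensional integrals from Steps 6, including the improper radial one, can be checked symbolically; a minimal sympy sketch (tool choice mine):

```python
import sympy as sp

r, t = sp.symbols('r t', nonnegative=True)

# The angular and radial factors from Steps 5 and 6 of Example 3.7.9
angular = sp.integrate(sp.cos(t)**2, (t, 0, sp.pi/2))          # pi/4
radial  = sp.integrate(r**3 / sp.sqrt(4 - r**2), (r, 0, 2))    # 16/3, improper but convergent

flux = -angular * radial
print(flux)   # equals -4*pi/3
```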

Example 3.7.10. Consider the same problem as in Example 3.7.9. What if we now parametrize S instead?

Solution. The surface S is part of a sphere, and from Example 3.7.4 we know how to parametrize a sphere. Here is a parametrization of S:

    r(u, v) = 2 sin u cos v i + 2 sin u sin v j + 2 cos u k,   0 ≤ u ≤ π/2,   0 ≤ v ≤ π/2.

An outward unit normal n points in the direction of the vector ru × rv. The problem requires a unit normal in the opposite direction (i.e., in the direction of rv × ru = −(ru × rv)). Thus the flux of F across S is given by

    ∫∫_S F · dS = ∫∫_S F · n dS = −∫₀^{π/2} ∫₀^{π/2} F(r) · (ru × rv) du dv.

Step 1. F(r) = 2 sin u cos v i − 2 cos u j + 2 sin u sin v k

Step 2. ru = 2 cos u cos v i + 2 cos u sin v j − 2 sin u k   and   rv = −2 sin u sin v i + 2 sin u cos v j + 0 k

Step 3. ru × rv = 4 sin² u cos v i + 4 sin² u sin v j + 4 sin u cos u k

Step 4. F(r) · (ru × rv) = 8 sin³ u cos² v

Step 5.

    ∫∫_S F · n dS = −∫₀^{π/2} ∫₀^{π/2} F(r) · (ru × rv) du dv

                  = −8 (∫₀^{π/2} sin³ u du)(∫₀^{π/2} cos² v dv)

                  = −8 · (2/3) · (π/4) = −4π/3.   ⊔⊓
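Steps 1 through 5 can be reproduced end to end in sympy (the library is my choice of tool, not the notes'):

```python
import sympy as sp

u, v = sp.symbols('u v', real=True)

# Parametrization of the first-octant part of x^2 + y^2 + z^2 = 4 (Example 3.7.10)
r = sp.Matrix([2*sp.sin(u)*sp.cos(v), 2*sp.sin(u)*sp.sin(v), 2*sp.cos(u)])
F = sp.Matrix([r[0], -r[2], r[1]])      # F(x, y, z) = <x, -z, y> evaluated on the surface
n = r.diff(u).cross(r.diff(v))          # outward; the problem wants the opposite direction

flux = -sp.integrate(sp.simplify(F.dot(n)), (u, 0, sp.pi/2), (v, 0, sp.pi/2))
print(sp.simplify(flux))                # equals -4*pi/3, agreeing with Example 3.7.9
```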


Example 3.7.11. Find the flux of the vector field F(x, y, z) = ⟨xy, yz, zx⟩ across the surface S, which is the part of the paraboloid z = 4 − x² − y² that lies above the square D = {(x, y) | 0 ≤ x ≤ 1, 0 ≤ y ≤ 1} and has upward orientation.

Solution. We use formula (6) (I will deal with the orientation of S later). First, view S as the level surface H(x, y, z) = x² + y² + z − 4 = 0, so that ∇H(x, y, z) = ⟨2x, 2y, 1⟩ and also that

    F · ∇H(x, y, z) = ⟨xy, yz, zx⟩ · ⟨2x, 2y, 1⟩ = 2x²y + 2y²z + zx

                    = 2x²y + 2y²(4 − x² − y²) + x(4 − x² − y²)

                    = 2x²y + 8y² − 2x²y² − 2y⁴ + 4x − x³ − xy².

Since the surface S is given as a function z = g(x, y) which lies above the unit square D (the shadow region of S), we immediately have that the unit vector p is ⟨0, 0, 1⟩ = k. Thus |∇H(x, y, z) · k| = |1| = 1. Now the double integral is

    ∫∫_S F · n dS = ± ∫∫_D (1/|∇H(x, y, z) · p|) F · ∇H(x, y, z) dA

                  = ± ∫∫_D (2x²y + 8y² − 2x²y² − 2y⁴ + 4x − x³ − xy²) dA

                  = ± ∫₀¹ ∫₀¹ (2x²y + 8y² − 2x²y² − 2y⁴ + 4x − x³ − xy²) dx dy

                  = ± ∫₀¹ ((2/3)y + 8y² − (2/3)y² − 2y⁴ + 7/4 − (1/2)y²) dy = ± 713/180.

Now there is the issue of n: We want the upward orientation (i.e., we want S oriented so that the normal has a positive k-component). To choose the proper n, pick a convenient point on S, say P(0, 0, 4), and calculate ∇H(0, 0, 4) = ⟨0, 0, 1⟩ = k, which points upward, away from S. Thus we want n to point in the same direction as ∇H(x, y, z). This tells us that the required normal is n = +∇H(x, y, z)/‖∇H(x, y, z)‖ = ⟨2x, 2y, 1⟩/√(4x² + 4y² + 1), and so finally we have that the flux of F across S is +713/180. ⊔⊓
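Since the integrand in this example is a polynomial, the value 713/180 is easy to confirm exactly; a sympy sketch (my tooling assumption):

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
z = 4 - x**2 - y**2                     # the paraboloid of Example 3.7.11
F = sp.Matrix([x*y, y*z, z*x])
gradH = sp.Matrix([2*x, 2*y, 1])        # H = x^2 + y^2 + z - 4; upward-pointing gradient

# With p = k we have |grad H . k| = 1, so the flux is the plain double integral
flux = sp.integrate(F.dot(gradH), (x, 0, 1), (y, 0, 1))
print(flux)   # equals 713/180
```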

Exercise 3.7.8. Calculate the outward flux of the vector field F(x, y, z) = −x i − y j + z² k across a smooth parametrized surface S, where S is the portion of the cone z = √(x² + y²) between the planes z = 1 and z = 2. The unit normal points away from the z-axis. Also, set up the integral for the non-parametrized version of this problem (but do not evaluate!).


STOKE’S THEOREM

Stoke’s Theorem says that, under conditions normally met in practice, the circulation of a vector field around theboundary of an oriented surface in space in the direction counterclockwise with respect to the surface’s unit normalvector field n equals the integral of the normal component of the circulation density of the field over the surface.

Theorem 3.7.6. [Stokes' Theorem] Let S be an orientable surface (i.e., S is a two-sided surface) with a continuously varying unit normal vector field n = n(x, y, z). Let the boundary of S (which we denote by ∂S) be a piecewise smooth, simple closed curve, oriented consistently with n (i.e., γ = ∂S is a positively oriented piecewise smooth simple closed curve). Suppose F(x, y, z) = P(x, y, z) i + Q(x, y, z) j + R(x, y, z) k is a vector field with P(x, y, z), Q(x, y, z), and R(x, y, z) having continuous first-order partial derivatives on S and its boundary curve ∂S. If T denotes the unit tangent vector to γ = ∂S, then the circulation of F around the boundary curve γ = ∂S is given by the surface integral

    Circulation of F around γ = ∫∫_S curl F · dS = ∫∫_S curl F · n dS = ∮_γ F · T ds = ∮_γ F · dγ.

Recall that

    curl F = ∇ × F = det [ i  j  k ; ∂/∂x  ∂/∂y  ∂/∂z ; P  Q  R ]
           = (∂R/∂y − ∂Q/∂z) i + (∂P/∂z − ∂R/∂x) j + (∂Q/∂x − ∂P/∂y) k.

Example 3.7.12. Calculate the work done by the "force field" given by

    F(x, y, z) = ⟨x^x + z², y^y + x², z^z + y²⟩

when a particle moves under its influence around the edge of the part of the sphere x² + y² + z² = 3² that lies in the first octant, in a counterclockwise direction as viewed from above.

Solution. Observe that we are being asked to calculate a work integral in space; that is, we must calculate the line integral ∮_{∂S} F · dγ, or if you prefer ∮_γ F · dγ where γ = ∂S. Doing this calculation directly might not be a wise idea (but you can certainly try!). So we will use Stokes' Theorem (equivalently, Green's Theorem in space). We calculate curl F as

    curl F(x, y, z) = ∇ × F(x, y, z) = det [ i  j  k ; ∂/∂x  ∂/∂y  ∂/∂z ; x^x + z²  y^y + x²  z^z + y² ] = ⟨2y, 2z, 2x⟩ = 2⟨y, z, x⟩.

Observe that curl F ≠ 0, so F is a non-conservative force field. Next, we must decide whether to parametrize S or not. I choose to parametrize S as follows:

    r(u, v) = ⟨3 sin u cos v, 3 sin u sin v, 3 cos u⟩,   0 ≤ u ≤ π/2,   0 ≤ v ≤ π/2,

which naturally gives us the outer unit normal and the positively oriented boundary curve γ = ∂S. Now ru = ⟨3 cos u cos v, 3 cos u sin v, −3 sin u⟩ and rv = ⟨−3 sin u sin v, 3 sin u cos v, 0⟩, so

    ru × rv = 9 ⟨sin² u cos v, sin² u sin v, sin u cos u⟩

and curl F(r(u, v)) = 6 ⟨sin u sin v, cos u, sin u cos v⟩. Now we have that

    curl F(r(u, v)) · (ru × rv) = 54 (sin³ u sin v cos v + sin² u cos u sin v + sin² u cos u cos v).


Therefore, the work done by F is

    ∮_γ F · dγ = ∮_γ F · T ds = ∫∫_S curl F · dS

               = ∫∫_D curl F(r(u, v)) · (ru × rv) du dv

               = 54 ∫₀^{π/2} ∫₀^{π/2} (sin³ u sin v cos v + sin² u cos u (sin v + cos v)) du dv

               = 54 (∫₀^{π/2} sin³ u du)(∫₀^{π/2} sin v cos v dv) + 54 (∫₀^{π/2} sin² u cos u du)(∫₀^{π/2} (sin v + cos v) dv)

               = 54 · (1/2) [−cos u + (1/3) cos³ u]_{u=0}^{u=π/2} + 54 · (1/3)(1 + 1)

               = 54 · (1/2)(1 − 1/3) + 54 · (1/3)(2)

               = 18 + 36 = 54 work-energy units.

⊔⊓
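Everything in this example — the curl killing the x^x, y^y, z^z terms, and the final value 54 — can be verified symbolically; a sympy sketch under the same parametrization (the library is my tooling assumption):

```python
import sympy as sp

x, y, z, u, v = sp.symbols('x y z u v', real=True)

# The force field of Example 3.7.12 (the x^x, y^y, z^z parts die under curl)
P, Q, R = x**x + z**2, y**y + x**2, z**z + y**2
curlF = sp.Matrix([sp.diff(R, y) - sp.diff(Q, z),
                   sp.diff(P, z) - sp.diff(R, x),
                   sp.diff(Q, x) - sp.diff(P, y)])
assert curlF == sp.Matrix([2*y, 2*z, 2*x])

# First-octant portion of the sphere of radius 3
r = sp.Matrix([3*sp.sin(u)*sp.cos(v), 3*sp.sin(u)*sp.sin(v), 3*sp.cos(u)])
n = r.diff(u).cross(r.diff(v))
integrand = curlF.subs({x: r[0], y: r[1], z: r[2]}).dot(n)
work = sp.integrate(sp.simplify(integrand), (u, 0, sp.pi/2), (v, 0, sp.pi/2))
print(work)   # equals 54
```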

Exercise 3.7.9. Use Stokes' Theorem to evaluate ∮_γ F · T ds, where F = 2z i + (8x − 3y) j + (3x + y) k and γ is the positively oriented triangular curve with vertices at (0, 0, 2), (0, 1, 0), and (1, 0, 0). Include a sketch of S and its boundary curve in your work.


DIVERGENCE THEOREM AND A UNIFIED THEORY

The divergence form of Green’s Theorem in the plane states that the net outward flux of a vector field across asimple closed curve can be calculated by integrating the divergence of the field over the region enclosed by the curve.The corresponding theorem in three dimensions, called the Divergence Theorem, states that the net outward flux ofa vector field across a closed surface in space can be calculated by integrating the divergence of the field over theregion enclosed by the surface.

Divergence Theorem

The Divergence Theorem says that: Under suitable conditions, the outward flux of a vector field across a closedsurface (oriented outward) equals the triple integral of the divergence of the field over the solid region enclosed bythe surface.

Theorem 3.7.7. [Gauss's Divergence Theorem] Suppose F(x, y, z) = P(x, y, z) i + Q(x, y, z) j + R(x, y, z) k is a vector field with P, Q, and R having continuous first-order partial derivatives on a solid E with boundary surface S (E is the solid region in space enclosed by S). If n denotes the outward unit normal to the surface S = ∂E, then

    Outward flux of F across the closed surface S = ∫∫_S F · dS = ∫∫_S F · n dS = ∫∫∫_E div F dV.

Example 3.7.13. [Verifying the Divergence Theorem] Evaluate both sides of the equation in the Divergence Theorem for the field F = ⟨x, y, z⟩ over the sphere x² + y² + z² = R².

Solution. Calculating the outward flux using the surface integral ∫∫_S F · n dS:

The outward unit normal field is

    n = 2⟨x, y, z⟩/√(4(x² + y² + z²)) = (1/R)⟨x, y, z⟩,

and so

    F · n dS = ((x² + y² + z²)/R) dS = (R²/R) dS = R dS,

since x² + y² + z² = R² on the surface. Therefore,

    ∫∫_S F · n dS = ∫∫_S R dS = R ∫∫_S dS = R(4πR²) = 4πR³ = 3 · (volume of E).

Next, we calculate ∫∫∫_E div F dV. Since F = ⟨x, y, z⟩, the divergence of F is

    div F = ∂(x)/∂x + ∂(y)/∂y + ∂(z)/∂z = 1 + 1 + 1 = 3.

Therefore,

    ∫∫∫_E div F dV = 3 ∫∫∫_E dV = 3 · (volume of E) = 3 · (4/3)πR³ = 4πR³.

⊔⊓ A problem on this theorem will be on the Final Exam, so study it well!
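The divergence-theorem side of Example 3.7.13 is a one-liner in spherical coordinates; here is a sympy check (library choice mine):

```python
import sympy as sp

x, y, z = sp.symbols('x y z', real=True)
R = sp.Symbol('R', positive=True)

F = sp.Matrix([x, y, z])
divF = sp.diff(F[0], x) + sp.diff(F[1], y) + sp.diff(F[2], z)
assert divF == 3

# Triple integral of div F over the ball of radius R, in spherical coordinates
rho, phi, theta = sp.symbols('rho phi theta', nonnegative=True)
vol_integral = sp.integrate(divF * rho**2 * sp.sin(phi),
                            (rho, 0, R), (phi, 0, sp.pi), (theta, 0, 2*sp.pi))
print(sp.simplify(vol_integral))   # equals 4*pi*R**3, matching the surface-integral side
```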

Exercise 3.7.10. Use the Divergence Theorem to evaluate the surface integral

    ∫∫_S (2x + 2y + z²) dS,

where S is the sphere x² + y² + z² = a², a > 0.

MISCELLANEOUS EXERCISES

Problem 1. Let S denote the portion of the parabolic cylinder z = 1 − x², 0 ≤ y ≤ 2, which lies in the first octant. Let γ be the piecewise smooth curve denoting the boundary of S, positively oriented when viewed from above. Consider the continuous vector field F(x, y, z) = ⟨1, 0, y²⟩, defined everywhere. Calculate the circulation of F around γ two ways: (1) directly as a line integral, and (2) as the flux of curl F across S (Stokes' Theorem).

Problem 2. Let E denote the solid enclosed by the circular paraboloid z = x² + y² and the circular disk x² + y² ≤ 4 at z = 4. Let F(x, y, z) = ⟨x, y, 1⟩ be a continuous vector field, and let the surface S be the boundary of E. Verify the Divergence Theorem.

Problem 3. Let D denote the plane region bounded by x² + y² − y = 0, y = 0, and y = −x. Let γ denote the boundary of D. Calculate ∮_γ P(x, y) dx + Q(x, y) dy, where P(x, y) = −y and Q(x, y) = x, by verifying the counterclockwise circulation form of Green's Theorem.

Problem 4. Calculate the area of the portion of the sphere x² + y² + z² = 2² that lies between the planes z = 0 and z = √2.

Problem 5. Consider the continuous vector field F(x, y, z) = ⟨2z, e^z, 2x + ye^z⟩.

(a) Does the vector field have a potential? (Explain by finding it, or say why it does not.)

(b) Evaluate ∫_γ F · dγ, where γ is the space curve given by y = 1 − x², z = 0, with 0 ≤ x ≤ 1, oriented so that x is increasing.

Problem 6. Let E be the solid contained between the surfaces z = 4 − x² − y² and z = 0. Let S denote the boundary of E, positively oriented (i.e., oriented with the outward unit normal field). Consider the continuous vector field F(x, y, z) = ⟨x, y, 1⟩. Evaluate the surface integral ∫∫_S F · dS both as a double and as a triple integral.

Problem 7. Let E be the solid inside the cylinder x² + y² = 1 bounded by the planes z = 0 and z = 2. Let F(x, y, z) = ⟨zy, zx, y² + z²⟩. Set up, but do not evaluate, the surface integral ∫∫_S F · dS both as a double and as a triple integral. Clearly state the limits of integration for both integrals.

Problem 8. Let γ be the curve of intersection of the plane z = 3x − 7 and the right circular cylinder x² + y² = 1, with the clockwise orientation as viewed from above. Let F be the continuous vector field given by F(x, y, z) = ⟨4z − 1, 2x, 5y + 1⟩. Calculate ∮_γ F · dγ (a) directly as a line integral (impossible for your level of understanding thus far) and (b) using Stokes' Theorem. (Of course, I could be wrong about this! At any rate, I will just assume that the above wording is what was given.)

Solution. After many hours of thought about this problem, I finally came to the conclusion that the difficulty with the problem lies entirely in its wording. The inventor of the problem did not intend for us to do both the line integral directly (without the use of Stokes' Theorem) and the surface integral directly (even though the problem is stated that way). This was precisely the point I tried to make in class: it is worth taking a minute or so to analyze some of the questions. You will find that they contain "logical flaws" in the sense that they ask you to do the same thing twice, even though the intent is for you to do two different things altogether (this is usually the tester's fault, not yours). Anyway, the real intent was that you do just one of the two integrals (one of them is easy and the other is rather difficult, but not impossible, as you would have discovered, so take your pick of only one!). The easy integral is the surface integral ∬_S (∇ × F · n) dS (from Stokes' Theorem). The surface S is the plane z = f(x, y) = 3x − 7, or H(x, y, z) = −3x + z + 7 = 0 (here we view S as a level surface of H). So let's calculate this surface integral by the numbers:


curl F(x, y, z) = ∇ × F(x, y, z) =

| i       j       k     |
| ∂/∂x    ∂/∂y    ∂/∂z  |
| 4z − 1  2x      5y + 1 |

= ⟨5, 4, 2⟩, or 5i + 4j + 2k.

Next, since we are traversing γ clockwise as viewed from above, the surface S is to our right as we travel around γ; in other words, we negatively orient the surface. The upward unit normal is n = ∇H(x, y, z)/‖∇H(x, y, z)‖ = (1/√10)⟨−3, 0, 1⟩, but since our orientation is negative we use −n as our unit normal. Thus, by our formula for surface integrals we have

∮_γ F · T ds = ∬_D (curl F(x, y, z) · (−n)) (‖∇H(x, y, z)‖ / |∇H(x, y, z) · k|) dA = −∬_D (−13/√10) √10 dA = 13 · Area(D) = 13π,

where D is the shadow region of S (in our case, the unit disk centered at the origin) and p = k is a unit normal to the shadow region.

Here's the correct wording: Evaluate ∮_γ F · dγ for the vector field F(x, y, z) = ⟨4z − 1, 2x, 5y + 1⟩, where the curve γ is the intersection of the circular cylinder x² + y² = 1 and the plane z = 3x − 7, oriented so that it is traversed clockwise when viewed from high up on the positive z-axis. □
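The 13π above can be sanity-checked numerically: the line integral that is hard by hand is easy for a computer. Here is a short Python sketch (an editorial addition, not part of the exam); the particular clockwise parametrization γ(t) = ⟨cos t, −sin t, 3 cos t − 7⟩ is my own choice, and any clockwise parametrization would do:

```python
import math

# Direct numerical evaluation of the line integral in Problem 8.
# gamma(t) = (cos t, -sin t, 3 cos t - 7) traces the cylinder x^2 + y^2 = 1
# in the plane z = 3x - 7, clockwise as seen from above.
def F(x, y, z):
    return (4*z - 1, 2*x, 5*y + 1)

def gamma(t):
    return (math.cos(t), -math.sin(t), 3*math.cos(t) - 7)

def gamma_prime(t):
    return (-math.sin(t), -math.cos(t), -3*math.sin(t))

def integrand(t):
    return sum(f*g for f, g in zip(F(*gamma(t)), gamma_prime(t)))

# composite midpoint rule over [0, 2*pi]
n = 20000
h = 2*math.pi / n
line_integral = sum(integrand((k + 0.5)*h) for k in range(n)) * h
print(line_integral)   # close to 13*pi = 40.84...
```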

NEATLY WRITE UP THE EXERCISES AND TURN THEM IN FOR A GRADE NO LATER THAN DECEMBER 12th, 2008!!


SOLUTIONS TO THIS December 22, 2008 FINAL EXAM WERE DONE TO BENEFIT MY STUDENTS TAKING MATH 39200 C (Spring 2008). NOT TO BE SOLD FOR MONETARY GAIN, BUT YOU MAY DISTRIBUTE IT FREELY. © Antony Foster 2008

HOW TO SCORE 120 OUT OF 100 (YOU MAKE IT A THING OF PERFECTION!)

Name (Printed)_______________________________________________

Name (Signed)_______________________________________________

MATH 39200 C FINAL EXAMINATION

December 22, 2008

Problem          Points
1                10
2                10
3                10
4                10
5                10
6                10
7                10
8, 9 OR 10       10
9, 10 OR 11      10
10, 11 OR 12     10
TOTAL
POSSIBLE         100 (120)

Instructions: Show all work. Calculators and other electronic devices must be out of sight, turned off, and not used.

Answer 5 questions from part I and 2 questions from part II.

PART I: Answer 5 complete questions from this part. (14 points each)

1. (a) Find the inverse of the matrix

A = [ 1  4  1 ]
    [ 2  7  3 ]
    [ 1  7  5 ].

(b) Use the matrix A⁻¹ that you found in (a) to solve the system

x + 4y + z = 1
2x + 7y + 3z = −2
x + 7y + 5z = −1.



No credit for any other method! Show your work on this page and on the page to the left.

MY ANSWER:

(a) (A | I₃) =

[ 1  4  1 | 1  0  0 ]
[ 2  7  3 | 0  1  0 ]
[ 1  7  5 | 0  0  1 ]

Reduce to reduced row echelon form using just a few elementary row operations (I did not state my operations here!) to get

[ 1  0  0 | −2   13/7  −5/7 ]
[ 0  1  0 |  1   −4/7   1/7 ]
[ 0  0  1 | −1    3/7   1/7 ],

from which we have the desired inverse of A:

A⁻¹ = [ −2   13/7  −5/7 ]
      [  1   −4/7   1/7 ]
      [ −1    3/7   1/7 ].

(b) Since the coefficient matrix A is nonsingular, the solution of the matrix equation Ax = B is

x = [ x ]          [ −2   13/7  −5/7 ] [  1 ]   [ −5 ]
    [ y ] = A⁻¹B = [  1   −4/7   1/7 ] [ −2 ] = [  2 ]
    [ z ]          [ −1    3/7   1/7 ] [ −1 ]   [ −2 ].

Therefore, the unique solution to the given non-homogeneous linear system Ax = B is

x = −5, y = 2, z = −2.
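As a quick exact check of both parts (an editorial addition, not part of the exam), Python's fractions module can confirm that the claimed A⁻¹ really inverts A and reproduces the solution (−5, 2, −2):

```python
from fractions import Fraction as Fr

# Exact check of Problem 1: multiply the claimed inverse by A and
# apply it to the right-hand side B.
A = [[1, 4, 1],
     [2, 7, 3],
     [1, 7, 5]]
A_inv = [[Fr(-2), Fr(13, 7), Fr(-5, 7)],
         [Fr(1),  Fr(-4, 7), Fr(1, 7)],
         [Fr(-1), Fr(3, 7),  Fr(1, 7)]]

def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

identity = matmul(A_inv, A)          # expect the 3x3 identity
B = [[1], [-2], [-1]]
x = matmul(A_inv, B)                 # expect (-5, 2, -2)
print(identity, x)
```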


2. (a) Find all eigenvalues and eigenvectors of the matrix

A = [ 7  −1 ]
    [ 3   3 ].

(b) Use your answer to (a) to solve the system

y₁′(t) = 7y₁(t) − y₂(t)
y₂′(t) = 3y₁(t) + 3y₂(t)

for y₁(t) and y₂(t), subject to the initial conditions y₁(0) = 2 and y₂(0) = 1.

MY ANSWER

(a) We seek real values λ and nonzero vectors x such that Ax = λx holds.

λI − A = [ λ − 7     1   ]
         [  −3    λ − 3 ]

and so

p(λ) = det(λI − A) = (λ − 7)(λ − 3) + 3 = λ² − 10λ + 24 = (λ − 6)(λ − 4) = 0.

Therefore the eigenvalues of the matrix A are λ = 6 and λ = 4. For the eigenvalue λ = 6, we solve the homogeneous linear system (6I − A)x = 0 for x, which tells us that (1, 1)ᵀ is an eigenvector corresponding to the eigenvalue λ = 6. For the eigenvalue λ = 4, we solve the homogeneous linear system (4I − A)x = 0 for x, which tells us that (1, 3)ᵀ is an eigenvector corresponding to the eigenvalue λ = 4.

(b) The given system of first-order linear differential equations has general solution (in matrix form)

Y(t) = (y₁(t), y₂(t))ᵀ = c₁(p₁₁, p₂₁)ᵀ e^{λ₁t} + c₂(p₁₂, p₂₂)ᵀ e^{λ₂t} = c₁(1, 1)ᵀ e^{6t} + c₂(1, 3)ᵀ e^{4t},

where the constants c₁ and c₂ are to be determined. Imposing the initial conditions y₁(0) = 2 and y₂(0) = 1, we get

c₁ + c₂ = 2 and c₁ + 3c₂ = 1 (another small system to solve),

from which it follows that c₁ = 5/2 and c₂ = −1/2, and so the solution of the initial value problem is

Y(t) = (y₁(t), y₂(t))ᵀ = (5/2)(1, 1)ᵀ e^{6t} − (1/2)(1, 3)ᵀ e^{4t},

i.e., y₁(t) = (5/2)e^{6t} − (1/2)e^{4t} and y₂(t) = (5/2)e^{6t} − (3/2)e^{4t}.
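The eigenpairs and the solution of the initial value problem can be verified in a few lines of Python (an editorial addition, not part of the exam); the derivatives below are taken analytically from the claimed y₁, y₂:

```python
import math

# Check of Problem 2: verify A v = lambda v for both eigenpairs, and that
# the claimed solution satisfies the ODE system and the initial conditions.
A = [[7, -1], [3, 3]]
for lam, v in [(6, (1, 1)), (4, (1, 3))]:
    Av = (A[0][0]*v[0] + A[0][1]*v[1], A[1][0]*v[0] + A[1][1]*v[1])
    assert Av == (lam*v[0], lam*v[1])

def y1(t):  return 2.5*math.exp(6*t) - 0.5*math.exp(4*t)
def y2(t):  return 2.5*math.exp(6*t) - 1.5*math.exp(4*t)
def y1p(t): return 15*math.exp(6*t) - 2*math.exp(4*t)   # y1'(t)
def y2p(t): return 15*math.exp(6*t) - 6*math.exp(4*t)   # y2'(t)

ok = all(abs(y1p(t) - (7*y1(t) - y2(t))) < 1e-9 and
         abs(y2p(t) - (3*y1(t) + 3*y2(t))) < 1e-9
         for t in (0.0, 0.1, 0.5))
print(y1(0), y2(0), ok)   # initial conditions 2.0 and 1.0, ok is True
```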


3. (a) Find the rank of the matrix

A = [  1  −4   9  −7 ]
    [ −1   2  −4   1 ]
    [  5  −6  10   7 ].

(b) Solve the linear system

x − 4y + 9z − 7w = 0
−x + 2y − 4z + w = 0
5x − 6y + 10z + 7w = 0.

Indicate clearly all row operations that you use.

(c) Find the determinant of the matrix

[    3     4     0     6  ]
[    0     1     2     1  ]
[    1     2     3     4  ]
[ 1000  2000  3000  4001 ].

MY ANSWER

(a) Bring the matrix to reduced row echelon form and then count the number of rows with leading ones:

[1 −4 9 −7; −1 2 −4 1; 5 −6 10 7]
  → [1 −4 9 −7; 0 −2 5 −6; 5 −6 10 7]        (R₁ + R₂)
  → [1 −4 9 −7; 0 −2 5 −6; 0 14 −35 42]      (−5R₁ + R₃)
  → [1 −4 9 −7; 0 −2 5 −6; 0 0 0 0]          (7R₂ + R₃)
  → [1 0 −1 5; 0 −2 5 −6; 0 0 0 0]           (−2R₂ + R₁)
  → [1 0 −1 5; 0 1 −5/2 3; 0 0 0 0].         (−(1/2)R₂)

From the last matrix (which is in reduced row echelon form) we can see that the rank of the original matrix A is 2: we have two pivots. Furthermore, the row of zeros tells us that the homogeneous linear system has infinitely many solutions. So let's write them down.

(b) The above information is useful in solving the given homogeneous linear system, since the matrix in part (a) is the coefficient matrix of the system. We use the same elementary row operations to obtain the equivalent system

x − z + 5w = 0
y − (5/2)z + 3w = 0
z = z
w = w,

from which we let z = s and w = t and then determine that the system has infinitely many solutions, which we write in matrix form as

(x, y, z, w)ᵀ = s(1, 5/2, 1, 0)ᵀ + t(−5, −3, 0, 1)ᵀ, where s, t ∈ ℝ.


By the way, the solutions of the system Ax = 0, which we express as linear combinations of the two independent vectors v₁ = (1, 5/2, 1, 0)ᵀ and v₂ = (−5, −3, 0, 1)ᵀ, span a fundamental 2-dimensional subspace of ℝ⁴ called the nullspace of the matrix A. An important theorem in linear algebra says that the rank of A plus the nullity of A equals the number of columns of A. The rank of A is the dimension of the row space of A; this is the number of leading ones in the row echelon form of A. In our case, the rank of A is 2. The nullspace of A is the solution set of the homogeneous system Ax = 0, and its dimension is called the nullity of A. In our case, the nullity of A is 2.
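A short Python check (an editorial addition, not part of the exam) that the two vectors really lie in the nullspace, and that exact Gauss-Jordan elimination over the rationals gives rank 2:

```python
from fractions import Fraction as Fr

# Check of Problem 3(a),(b): A v = 0 for both claimed nullspace vectors,
# and exact row reduction gives rank 2 (so nullity = 4 - 2 = 2).
A = [[1, -4, 9, -7],
     [-1, 2, -4, 1],
     [5, -6, 10, 7]]
v1 = [Fr(1), Fr(5, 2), Fr(1), Fr(0)]
v2 = [Fr(-5), Fr(-3), Fr(0), Fr(1)]
for v in (v1, v2):
    assert all(sum(row[j]*v[j] for j in range(4)) == 0 for row in A)

def rank(M):
    # Gauss-Jordan elimination with exact rational arithmetic
    M = [[Fr(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        M[r] = [x / M[r][c] for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                M[i] = [a - M[i][c]*b for a, b in zip(M[i], M[r])]
        r += 1
    return r

print(rank(A))   # 2
```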

(c) Use the row reduction method to bring the matrix to upper-triangular form (not necessarily row echelon form), recalling that only 2 of the 3 elementary row operations have an effect on the determinant:

det[3 4 0 6; 0 1 2 1; 1 2 3 4; 1000 2000 3000 4001]
  = −det[1 2 3 4; 0 1 2 1; 3 4 0 6; 1000 2000 3000 4001]        (R₁ ↔ R₃)
  = −det[1 2 3 4; 0 1 2 1; 0 −2 −9 −6; 1000 2000 3000 4001]     (−3R₁ + R₃)
  = −det[1 2 3 4; 0 1 2 1; 0 −2 −9 −6; 0 0 0 1]                 (−1000R₁ + R₄)
  = −det[1 2 3 4; 0 1 2 1; 0 0 −5 −4; 0 0 0 1]                  (2R₂ + R₃)
  = det[1 2 3 4; 0 1 2 1; 0 0 5 4; 0 0 0 1]                     (factoring −1 out of R₃)
  = (1)(1)(5)(1) = 5.

Thus we get det[3 4 0 6; 0 1 2 1; 1 2 3 4; 1000 2000 3000 4001] = 5.
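The determinant can be confirmed exactly with a small cofactor-expansion routine (an editorial addition, not part of the exam):

```python
from fractions import Fraction as Fr

# Exact check of Problem 3(c): Laplace (cofactor) expansion along the
# first row of the 4x4 matrix.
def det(M):
    if len(M) == 1:
        return M[0][0]
    total = Fr(0)
    for j in range(len(M)):
        minor = [row[:j] + row[j+1:] for row in M[1:]]
        total += (-1)**j * M[0][j] * det(minor)
    return total

M = [[3, 4, 0, 6],
     [0, 1, 2, 1],
     [1, 2, 3, 4],
     [1000, 2000, 3000, 4001]]
print(det(M))   # 5
```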


4. (a) Find the length of the parametrized curve with position vector γ(t) = ⟨cos t, cos t, −√2 sin t⟩; 0 ≤ t ≤ 2π.

(b) Let γ be the intersection curve of the surfaces z = x² + y² and x² + y² + z² = 2, oriented counterclockwise as seen from above. Find ∫_γ y dx − x dy + x²y³z⁵ dz.

MY ANSWER

(a) Let L denote the length of the curve γ. Then we calculate L as follows:

L = ∫₀^{2π} ‖γ′(t)‖ dt = ∫₀^{2π} √2 dt = 2√2 π units.

The arc-length parameter is s = ∫₀^t ‖γ′(τ)‖ dτ = ∫₀^t √2 dτ = √2 t, or t = s/√2. This means we could parametrize the curve by arc length as

γ(s) = ⟨cos(s/√2), cos(s/√2), −√2 sin(s/√2)⟩, 0 ≤ s ≤ 2√2 π.

(b) The curve γ of intersection of the two given surfaces is x² + y² = 1, z = 1, whose usual parametrization is γ(t) = ⟨cos t, sin t, 1⟩; 0 ≤ t ≤ 2π. (This is a simple closed space curve in the plane z = 1.) Define F(x, y, z) = ⟨y, −x, x²y³z⁵⟩ as the vector field (though it is not necessary!). We are being asked to calculate the line integral ∮_γ F · dγ, so as usual

∮_γ F · dγ = ∫₀^{2π} F(γ(t)) · γ′(t) dt = ∫₀^{2π} −(sin² t + cos² t) dt = −2π.
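Both answers in Problem 4 can be confirmed numerically with a midpoint-rule sum (an editorial addition, not part of the exam):

```python
import math

# Check of Problem 4. (a): the speed of gamma(t) = (cos t, cos t, -sqrt(2) sin t)
# is constant sqrt(2), so the length should be 2*sqrt(2)*pi.
# (b): on gamma(t) = (cos t, sin t, 1) we have dz = 0, so the integrand of
# y dx - x dy + x^2 y^3 z^5 dz is (sin t)(-sin t) - (cos t)(cos t) = -1.
n = 4000
h = 2*math.pi / n

def speed(t):  # |gamma'(t)| with gamma'(t) = (-sin t, -sin t, -sqrt(2) cos t)
    return math.sqrt(2*math.sin(t)**2 + 2*math.cos(t)**2)

L = sum(speed((k + 0.5)*h) for k in range(n)) * h

def integrand(t):  # F(gamma(t)) . gamma'(t) for part (b)
    return -math.sin(t)**2 - math.cos(t)**2

I = sum(integrand((k + 0.5)*h) for k in range(n)) * h
print(L, I)   # 2*sqrt(2)*pi and -2*pi
```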


5. Let S be the surface z = √(x² + y²), z ≤ 9, and let P be the point (3, 4, 5) on S.

(a) Find an equation of the tangent plane to the surface S at the point P.

(b) Find the surface area of S.

MY ANSWER

(a) Let H(x, y, z) = √(x² + y²) − z = 0, where (x, y) ∈ D = {(x, y): x² + y² ≤ 81} (a circular disk of radius 9). Now ∇H(x, y, z) = ⟨H_x, H_y, H_z⟩ = ⟨x/√(x² + y²), y/√(x² + y²), −1⟩. The equation of the tangent plane to the given surface S at the point (3, 4, 5) is

∇H(3, 4, 5) · ⟨x − 3, y − 4, z − 5⟩ = (3/5)(x − 3) + (4/5)(y − 4) − (z − 5) = 0, or 3x + 4y − 5z = 0.

(b) Based on the information in part (a), there is perhaps no need to parametrize the surface. Calculate the surface area as follows (from Math 203):

∬_D ‖∇H(x, y, z)‖ dA = √2 ∬_D dA = √2 · A(D) = 81√2 π units².

Just in case you parametrized the circular cone S:

r(u, v) = ⟨u cos v, u sin v, u⟩; 0 ≤ u ≤ 9, 0 ≤ v ≤ 2π.

Then r_u(u, v) = ⟨cos v, sin v, 1⟩ and r_v(u, v) = ⟨−u sin v, u cos v, 0⟩, and r_u × r_v = ⟨−u cos v, −u sin v, u⟩, so ‖r_u × r_v‖ = √2 u. Now the surface area is

∬_S dS = ∬_D ‖r_u × r_v‖ du dv = ∫₀^{2π} ∫₀⁹ √2 u du dv = 81√2 π units².

There is a nice formula for the surface area of a right circular cone of radius R and height H: πR√(R² + H²). This should agree with the surface area of the right circular cone given above, whose radius and height are the same number, 9 = R = H.
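A numerical confirmation of the 81√2 π area, together with the closed-form cone formula (an editorial addition, not part of the exam):

```python
import math

# Check of Problem 5(b) via the cone parametrization
# r(u, v) = (u cos v, u sin v, u), where ||r_u x r_v|| = sqrt(2)*u.
n = 4000
du = 9.0 / n
radial = sum(math.sqrt(2) * (k + 0.5)*du * du for k in range(n))
area = radial * 2*math.pi              # the v-integral contributes 2*pi

# closed-form check: pi*R*sqrt(R^2 + H^2) with R = H = 9
cone_formula = math.pi * 9 * math.sqrt(81 + 81)
print(area, cone_formula)   # both approximately 81*sqrt(2)*pi
```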


6. (a) Let F be the vector field defined by F(x, y, z) = ⟨2xz, 3y, x²⟩. Explain why the value of the line integral ∫_γ F · dγ is the same for all curves γ from P(−1, 1, 0) to Q(1, 2, 2), and find the value of this integral.

(b) Let S be the part of the surface x² + y² + z² = 4 with x ≤ 0 and z ≥ 0. Evaluate the surface integral ∬_S z³ dS.

MY ANSWER

(a) If the line integral ∫_γ F · dγ has the same value for all paths or curves from P(−1, 1, 0) to Q(1, 2, 2), then it must be independent of path, which would mean that the given vector field is the gradient field of a potential function φ(x, y, z), i.e., ∇φ(x, y, z) = F. This in turn means that ∇ × ∇φ(x, y, z) = curl F = 0. But this needs to be checked in order to receive credit, as shown below:

curl F = ∇ × F =

| i      j      k     |
| ∂/∂x   ∂/∂y   ∂/∂z  |
| 2xz    3y     x²    |

= ⟨0, 0, 0⟩ = 0.

Now we find the potential for F as follows:

φ(x, y, z) = ∫ 2xz dx = x²z + h(y, z), and ∂φ(x, y, z)/∂y = ∂h(y, z)/∂y = 3y. This tells us immediately that h(y, z) = (3/2)y² + g(z), and so

φ(x, y, z) = x²z + (3/2)y² + g(z).

Then ∂φ(x, y, z)/∂z = x² + g′(z) = x², which implies that g(z) is a constant. Finally, we have

φ(x, y, z) = x²z + (3/2)y²

as a potential function. Now we use the potential function to calculate the line integral ∫_γ F · dγ as follows (that was the point of part (a): to spare you the laboring task of calculating the line integral directly!):

∫_γ F · dγ = φ(Q(1, 2, 2)) − φ(P(−1, 1, 0)) = 8 − 3/2 = 13/2.
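As a cross-check (an editorial addition, not part of the exam), the path-independent value 13/2 can be recovered by numerically integrating F along any path from P to Q; the straight segment used below is my own choice:

```python
import math

# Check of Problem 6(a): phi(x, y, z) = x^2 z + (3/2) y^2 has gradient
# F = (2xz, 3y, x^2); a numerical line integral along the straight
# segment from P(-1, 1, 0) to Q(1, 2, 2) should equal phi(Q) - phi(P).
def phi(x, y, z):
    return x*x*z + 1.5*y*y

P, Q = (-1, 1, 0), (1, 2, 2)
expected = phi(*Q) - phi(*P)            # 8 - 3/2 = 13/2

def gamma(t):                           # straight path P -> Q, 0 <= t <= 1
    return tuple(p + t*(q - p) for p, q in zip(P, Q))

d = (2, 1, 2)                           # gamma'(t) = Q - P
n = 4000
h = 1.0 / n
I = 0.0
for k in range(n):
    x, y, z = gamma((k + 0.5)*h)
    Fx, Fy, Fz = 2*x*z, 3*y, x*x
    I += (Fx*d[0] + Fy*d[1] + Fz*d[2]) * h

print(expected, I)   # both approximately 6.5
```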

(b) Perhaps it is best to parametrize the given portion of the sphere x² + y² + z² = 4, x ≤ 0, z ≥ 0, as

r(u, v) = ⟨2 sin u cos v, 2 sin u sin v, 2 cos u⟩; 0 ≤ u ≤ π/2, π/2 ≤ v ≤ 3π/2,

so that r_u = ⟨2 cos u cos v, 2 cos u sin v, −2 sin u⟩ and r_v = ⟨−2 sin u sin v, 2 sin u cos v, 0⟩, and r_u × r_v = ⟨4 cos v sin² u, 4 sin v sin² u, 4 sin u cos u⟩ with ‖r_u × r_v‖ = 4 sin u.


Now let G(x, y, z) = z³, so that G(r(u, v)) = G(2 sin u cos v, 2 sin u sin v, 2 cos u) = 8 cos³ u, and then G(r(u, v)) ‖r_u × r_v‖ = 32 cos³ u sin u. Hence we can evaluate the surface integral of the scalar field in the usual way:

∬_S G(x, y, z) dS = ∬_S z³ dS = ∬_R G(r(u, v)) ‖r_u × r_v‖ du dv = 32 ∫_{π/2}^{3π/2} ∫₀^{π/2} cos³ u sin u du dv = 8π.
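The 8π value can be confirmed by a one-dimensional midpoint sum, since the v-integral just contributes a factor of π (an editorial addition, not part of the exam):

```python
import math

# Check of Problem 6(b): integrate 32 cos^3(u) sin(u) over 0 <= u <= pi/2,
# then multiply by pi for the v-range pi/2 <= v <= 3*pi/2.
n = 2000
du = (math.pi/2) / n
inner = sum(32 * math.cos(u)**3 * math.sin(u) * du
            for u in ((k + 0.5)*du for k in range(n)))
surface_integral = inner * math.pi      # the v-integral contributes pi
print(surface_integral)                 # approximately 8*pi
```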


7. Let R be the region in the xy-plane contained between the curves y = x + 2 and y = x² − 2x + 4. Let γ be the boundary curve of R, oriented counterclockwise. Find ∫_γ xy dx − (x + y) dy

(a) directly, as a line integral, AND

(b) as a double integral, by using Green's Theorem.

MY ANSWER

(a) Observe that the boundary curve is a piecewise-smooth simple closed curve made up of two smooth parametrized curves γ₁(t) = ⟨t + 1, t² + 3⟩; 0 ≤ t ≤ 1, and γ₂(t) = ⟨−t + 2, −t + 4⟩; 0 ≤ t ≤ 1.

∫_{γ₁} xy dx − (x + y) dy = ∫₀¹ (−t³ − t² − 5t + 3) dt = −1/12

∫_{γ₂} xy dx − (x + y) dy = ∫₀¹ (−t² + 4t − 2) dt = −1/3

∮_γ xy dx − (x + y) dy = −1/12 − 1/3 = −5/12.

(b) Evaluation using Green's Theorem (circulation or curl form), straight from the notes:

Counterclockwise circulation around γ = ∮_γ P(x, y) dx + Q(x, y) dy = ∬_D (∂Q/∂x − ∂P/∂y) dA
= ∫₁² ∫_{x²−2x+4}^{x+2} (∂/∂x (−x − y) − ∂/∂y (xy)) dy dx = ∫₁² ∫_{x²−2x+4}^{x+2} (−1 − x) dy dx = −5/12.
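Both sides of Green's Theorem here can be approximated numerically (an editorial addition, not part of the exam); each should come out near −5/12:

```python
import math

# Check of Problem 7: the two line-integral pieces and the Green's Theorem
# double integral should all combine to -5/12.
n = 2000
h = 1.0 / n

def midsum(f):
    return sum(f((k + 0.5)*h) for k in range(n)) * h

I1 = midsum(lambda t: -t**3 - t**2 - 5*t + 3)     # along the parabola
I2 = midsum(lambda t: -t**2 + 4*t - 2)            # along the line
line_total = I1 + I2

# Green's side: integrate (dQ/dx - dP/dy) = (-1 - x) over the region
# between y = x^2 - 2x + 4 and y = x + 2, for 1 <= x <= 2.
def cross_section(x):
    return (-1 - x) * ((x + 2) - (x**2 - 2*x + 4))

green_total = sum(cross_section(1 + (k + 0.5)*h) for k in range(n)) * h
print(line_total, green_total)   # both approximately -5/12
```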


8. Let S be the triangle with vertices P(0, 0, 0), Q(1, 2, 3), R(2, 1, 2). Let γ be the boundary curve of S, oriented clockwise as seen from above. Calculate ∫_γ y dx − x dy + z dz

(a) directly as a line integral, AND

(b) as a double integral, by using Stokes' Theorem.

MY ANSWER

(a) We parametrize the three line segments as follows:

γ₁(t) = (1 − t)⟨0, 0, 0⟩ + t⟨1, 2, 3⟩ = ⟨t, 2t, 3t⟩; 0 ≤ t ≤ 1.
γ₂(t) = (1 − t)⟨1, 2, 3⟩ + t⟨2, 1, 2⟩ = ⟨1 + t, 2 − t, 3 − t⟩; 0 ≤ t ≤ 1.
γ₃(t) = (1 − t)⟨2, 1, 2⟩ + t⟨0, 0, 0⟩ = ⟨2 − 2t, 1 − t, 2 − 2t⟩; 0 ≤ t ≤ 1.

∫_γ y dx − x dy + z dz = ∫₀¹ 9t dt + ∫₀¹ t dt + ∫₀¹ (4t − 4) dt = ∫₀¹ (14t − 4) dt = (7t² − 4t)|₀¹ = 3.

(b) ∫_γ y dx − x dy + z dz = ∬_S ∇ × F · n dS = ∬_D (1/|∇H(x, y, z) · k|) (∇ × F · (−∇H(x, y, z))) dA,

where H(x, y, z) = −x − 4y + 3z = 0 and F(x, y, z) = ⟨y, −x, z⟩. Now −∇H(x, y, z) = ⟨1, 4, −3⟩, p = k, and D is the triangle in the xy-plane with vertices (0, 0), (1, 2), and (2, 1), so that |∇H(x, y, z) · k| = 3, ∇ × F = ⟨0, 0, −2⟩, and ∇ × F · (−∇H(x, y, z)) = 6. So our formula above becomes

∬_S ∇ × F · n dS = ∬_D (1/3)(6) dA = 2 ∬_D dA = 2(Area of D) = ‖⟨1, 2, 0⟩ × ⟨2, 1, 0⟩‖ = ‖⟨0, 0, −3⟩‖ = 3.
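A short check of both sides of Stokes' Theorem for this problem (an editorial addition, not part of the exam):

```python
import math

# Check of Problem 8: the three-segment line integral and the Stokes
# double integral should both give 3.
# The sum of the three segment integrands reduces to 14t - 4 on [0, 1].
n = 2000
h = 1.0 / n
line_total = sum(14*(k + 0.5)*h - 4 for k in range(n)) * h

# Stokes side: curl F = (0, 0, -2); the downward normal vector for the
# clockwise orientation is (1, 4, -3); (curl F . (1,4,-3)) / |grad H . k|
# = 6/3 = 2, integrated over the shadow triangle D with vertices
# (0,0), (1,2), (2,1).
area_D = 0.5 * abs(1*1 - 2*2)          # shoelace area of the shadow triangle
stokes_total = 2 * area_D
print(line_total, stokes_total)        # both 3
```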


9. Let T be the solid contained between the surfaces z = 4 − x² − y² and z = x² + y² − 4. Let F be the vector field defined by F(x, y, z) = ⟨x, y, 1⟩. Use the outward-pointing unit normal vector to calculate ∬_S F · n dS

(a) directly as a surface integral, AND

(b) as a triple integral, by using the Divergence Theorem.

MY SOLUTION

(a) Let S₁ denote the surface z = 4 − (x² + y²), or H₁(x, y, z) = x² + y² + z = 4. Then ∇H₁(x, y, z) = ⟨2x, 2y, 1⟩, |∇H₁(x, y, z) · k| = 1, and F(x, y, z) · ∇H₁(x, y, z) = 2x² + 2y² + 1, where D = {(x, y): x² + y² ≤ 4}. So, since the surface S₁ is positively oriented, we use

∬_{S₁} F · n dS₁ = ∬_D (1/|∇H₁(x, y, z) · k|) (F(x, y, z) · ∇H₁(x, y, z)) dA = ∫₀^{2π} ∫₀² (2r² + 1) r dr dθ = 20π.

Let S₂ denote the surface z = x² + y² − 4, or H₂(x, y, z) = −x² − y² + z = −4. Then −∇H₂(x, y, z) = ⟨2x, 2y, −1⟩, |∇H₂(x, y, z) · (−k)| = 1, and F(x, y, z) · (−∇H₂(x, y, z)) = 2x² + 2y² − 1, where D = {(x, y): x² + y² ≤ 4}. Since the outward normal on S₂ points downward, we use

∬_{S₂} F · n dS₂ = −∬_D (1/|∇H₂(x, y, z) · k|) (F(x, y, z) · ∇H₂(x, y, z)) dA = −∫₀^{2π} ∫₀² (1 − 2r²) r dr dθ = 12π.

Finally we get

∬_S F · n dS = ∬_{S₁} F · n dS₁ + ∬_{S₂} F · n dS₂ = 20π + 12π = 32π.

(b) By the Divergence Theorem we have:

∬_S F · n dS = ∭_T div F dV = 2 ∭_T dV = 2 (Volume of T) = 2 ∫₀^{2π} ∫₀² ∫_{r²−4}^{4−r²} dz r dr dθ = 32π.
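Finally, a numerical check (an editorial addition, not part of the exam) that the direct flux computation and the Divergence Theorem agree at 32π:

```python
import math

# Check of Problem 9: flux of F = (x, y, 1) out of the solid between
# z = 4 - r^2 and z = r^2 - 4, both directly and via div F = 2.
n = 1000
dr = 2.0 / n

# (a) flux through the top and bottom paraboloids, reduced to radial
# integrals: top integrand (2r^2 + 1) r, bottom integrand (2r^2 - 1) r.
top = sum((2*r*r + 1) * r * dr for r in ((k + 0.5)*dr for k in range(n)))
bot = sum((2*r*r - 1) * r * dr for r in ((k + 0.5)*dr for k in range(n)))
flux = 2*math.pi * (top + bot)

# (b) divergence theorem: 2 * volume, where the height of the solid at
# radius r is (4 - r^2) - (r^2 - 4) = 8 - 2r^2.
volume = sum((8 - 2*r*r) * r * dr for r in ((k + 0.5)*dr for k in range(n)))
flux_div = 2 * 2*math.pi * volume

print(flux, flux_div)   # both approximately 32*pi
```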


END OF THE EXAMINATION. Make sure that you answered 5 complete questions from Part I and 2 complete questions from Part II.

Post-Solution Commentary: First, I just finished solving this final exam above using the class notes (not your official textbook). There is nothing in this final that you should have difficulty with, unless you completely ignored the class notes (which were designed for the working student who has no time for a long drawn-out discussion of the ideas), or you just don't know how to double and triple integrate, or even successfully carry out a u-substitution, etc. In those cases, you should have learned these things in Math 20300, like your classmate Mr. Wei Pan (a former student from my last-semester Math 20300 course who took Math 39200 with you), whom I must congratulate for scoring the highest in your class on the final exam! Now Mr. Wei Pan is an ordinary student like all of you; he did not come to see me for help all semester long. Besides, I felt that he should not need too much help from me anyway. I believe what made the difference was that he had the unique experience of the integration project, the unconventional methods that I used last semester which forced everyone to have some firsthand knowledge of some of the important ideas of calculus. Furthermore, he was taught how to study for exams! You too can have similar success on important tests if you make a conscious effort to learn your material well (instead of always trying to avoid learning your subject matter).
