

Solutions to Principles of Mathematical Analysis (Walter Rudin)

Jason [email protected]

December 19, 2011

This work was done as an undergraduate student: if you really don’t understand something in one of theseproofs, it is very possible that it doesn’t make sense because it’s wrong. Any questions or corrections can bedirected to [email protected].

Exercise 1.1a

Let r be a nonzero rational number. We're asked to show that x ∉ Q implies that (r + x) ∉ Q. Proof of the contrapositive:

↦ r + x is rational assumed

→ (∃p ∈ Q)(r + x = p) definition of rational

→ (∃p, q ∈ Q)(q + x = p) we're told that r is rational

→ (∃p, q ∈ Q)(x = −q + p) existence of additive inverses in Q

Because p and q are members of the closed additive group of Q, we know that their sum is a member of Q.

→ (∃u ∈ Q)(x = u)

→ x is rational definition of rational

By assuming that r + x is rational, we prove that x must be rational. By contrapositive, then, if x is irrational then r + x is irrational, which is what we were asked to prove.

Exercise 1.1b

Let r be a nonzero rational number. We're asked to show that x ∉ Q implies that rx ∉ Q. Proof of the contrapositive:

↦ rx is rational assumed

→ (∃p ∈ Q)(rx = p) definition of rational

→ (∃p ∈ Q)(x = r^{-1}p) existence of multiplicative inverses in Q

(Note that we can assume that r^{-1} exists only because we are told that r is nonzero.) Because r^{-1} and p are members of the closed multiplicative group of Q, we know that their product is also a member of Q.

→ (∃u ∈ Q)(x = u)

→ x is rational definition of rational

By assuming that rx is rational, we prove that x must be rational. By contrapositive, then, if x is irrational then rx is irrational, which is what we were asked to prove.


Exercise 1.2

Proof by contradiction. If √12 were rational, then we could write it as a reduced-form fraction in the form of p/q where p and q are nonzero integers with no common divisors.

↦ p/q = √12 assumed

→ p²/q² = 12

→ p² = 12q²

It's clear that 3|12q², which means that 3|p². By some theorem I can't remember (possibly the definition of 'prime' itself), if a is a prime number and a|mn, then a|m ∨ a|n. Therefore, since 3|pp and 3 is prime,

→ 3|p → 9|p²

→ (∃m ∈ N)(p² = 9m) definition of divisibility

→ (∃m ∈ N)(12q² = 9m) substitution from p² = 12q²

→ (∃m ∈ N)(4q² = 3m) divide both sides by 3

→ (3|4q²) definition of divisibility

From the same property of prime divisors that we used previously, we know that 3|4 ∨ 3|q²: it clearly doesn't divide 4, so it must be the case that 3|q². But if 3|qq, then 3|q ∨ 3|q. Therefore:

→ (3|q)

And this establishes a contradiction. We began by assuming that p and q had no common divisors, but we have shown that 3|p and 3|q. So our assumption must be wrong: there is no reduced-form rational number such that p/q = √12.
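
As a quick numeric sanity check of the conclusion (not part of the proof, and using nothing beyond the Python standard library), the following sketch confirms that 12q² is never a perfect square for small q, which is exactly what the argument above forbids:

```python
import math

# If sqrt(12) = p/q, then p^2 = 12*q^2, so 12*q^2 would be a perfect square.
# Check that this never happens for small denominators q.
for q in range(1, 10_000):
    p = math.isqrt(12 * q * q)  # integer square root
    assert p * p != 12 * q * q, f"unexpected perfect square at q={q}"
print("no q <= 9999 makes 12*q^2 a perfect square")
```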

Exercise 1.3 a

If x ≠ 0 and xy = xz, then

y = 1y = (x^{-1}x)y = x^{-1}(xy) = x^{-1}(xz) = (x^{-1}x)z = 1z = z

Exercise 1.3 b

If x ≠ 0 and xy = x, then

y = 1y = (x^{-1}x)y = x^{-1}(xy) = x^{-1}x = 1

Exercise 1.3 c

If x ≠ 0 and xy = 1, then

y = 1y = (x^{-1}x)y = x^{-1}(xy) = x^{-1}·1 = x^{-1} = 1/x

Exercise 1.3 d

If x ≠ 0, then the fact that x^{-1}x = 1 means that x is the inverse of x^{-1}: that is, x = (x^{-1})^{-1} = 1/(1/x).

Exercise 1.4

We are told that E is nonempty, so there exists some e ∈ E. By the definition of lower bound, (∀x ∈ E)(α ≤ x): so α ≤ e. By the definition of upper bound, (∀x ∈ E)(x ≤ β): so e ≤ β. Together, these two inequalities tell us that α ≤ e ≤ β. We're told that S is ordered, so by the transitivity of order relations this implies α ≤ β.


Exercise 1.5

We're told that A is bounded below. The field of real numbers has the greatest lower bound property, so we're guaranteed to have a greatest lower bound for A. Let β be this greatest lower bound. To prove that −β is the least upper bound of −A, we must first show that it's an upper bound. Let −x be an arbitrary element in −A:

↦ −x ∈ −A assumed

→ x ∈ A definition of membership in −A

→ β ≤ x β = inf(A)

→ −β ≥ −x consequence of 1.18(a)

We began with an arbitrary choice of −x, so this proves that (∀ −x ∈ −A)(−β ≥ −x), which by definition tells us that −β is an upper bound for −A. To show that −β is the least such upper bound for −A, we choose some arbitrary element less than −β:

↦ α < −β assumed

→ −α > β consequence of 1.18(a)

Remember that β is the greatest lower bound of A. If −α is larger than inf(A), there must be some element of A that is smaller than −α.

→ (∃x ∈ A)(x < −α) (see above)

→ (∃ −x ∈ −A)(−x > α) consequence of 1.18(a)

→ ¬(∀ −x ∈ −A)(−x ≤ α)

→ α is not an upper bound of −A definition of upper bound

This proves that any element less than −β is not an upper bound of −A. Together with the earlier proof that −β is an upper bound of −A, this proves that −β is the least upper bound of −A.

Exercise 1.6a

The difficult part of this proof is deciding which, if any, of the familiar properties of exponents are considered axioms and which properties we need to prove. It seems impossible to make any progress on this proof unless we can assume that (b^m)^n = b^{mn}. On the other hand, it seems clear that we can't simply assume that (b^m)^{1/n} = b^{m/n}: this would make the proof trivial (and is essentially assuming what we're trying to prove). As I understand this problem, we have defined x^n in such a way that it is trivial to prove that (x^a)^b = x^{ab} when a and b are integers. And we've declared in theorem 1.21 that, by definition, the symbol x^{1/n} is the element such that (x^n)^{1/n} = x. But we haven't defined exactly what it might mean to combine an integer power like n and some arbitrary inverse like 1/r. We are asked to prove that these two elements do, in fact, combine in the way we would expect them to: (x^n)^{1/r} = x^{n/r}.

Unless otherwise noted, every step of the following proof is justified by theorem 1.21.


↦ b^m = b^m assumed

→ ((b^m)^{1/n})^n = b^m definition of x^{1/n}

→ ((b^m)^{1/n})^{nq} = b^{mq}

We were told that m/n = p/q which, by the definition of the equality of rational numbers, means that mq = np. Therefore:

→ ((b^m)^{1/n})^{nq} = b^{np}

→ ((b^m)^{1/n})^{qn} = b^{pn} commutativity of multiplication

From theorem 1.21, we can take the nth root of each side to get:

→ ((b^m)^{1/n})^q = b^p

From theorem 1.21, we can take the qth root of each side to get:

→ (b^m)^{1/n} = (b^p)^{1/q}

Exercise 1.6b

As in the last proof, we assume that b^{r+s} = b^r b^s when r and s are integers and try to prove that the operation works in a similar way when r and s are rational numbers. Let r = m/n and let s = p/q where m, n, p, q ∈ Z and n, q ≠ 0.

↦ b^{r+s} = b^{m/n + p/q}

→ b^{r+s} = b^{(mq+pn)/nq} definition of addition for rationals

→ b^{r+s} = (b^{mq+pn})^{1/nq} from part a

→ b^{r+s} = (b^{mq} b^{pn})^{1/nq} legal because mq and pn are integers

→ b^{r+s} = (b^{mq})^{1/nq} (b^{pn})^{1/nq} corollary of 1.21

→ b^{r+s} = (b^{mq/nq})(b^{pn/nq}) from part a

→ b^{r+s} = (b^{m/n})(b^{p/q})

→ b^{r+s} = (b^r)(b^s)
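
A floating-point spot check of the identity just derived; the base 2.7 and the exponents 2/3 and −5/7 are arbitrary choices, and equality is only approximate in floats:

```python
from fractions import Fraction

b = 2.7
r = Fraction(2, 3)
s = Fraction(-5, 7)

lhs = b ** float(r + s)               # b^(r+s)
rhs = b ** float(r) * b ** float(s)   # b^r * b^s
assert abs(lhs - rhs) < 1e-12
print(lhs, rhs)
```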

Exercise 1.6c

We're given that b > 1. Let r be a rational number. Proof by contradiction that b^r is an upper bound of B(r):

↦ b^r is not an upper bound of B(r) hypothesis of contradiction

→ (∃x ∈ B(r))(x > b^r) formalization of the hypothesis

By the definition of membership in B(r), x = b^t where t is rational and t ≤ r.

→ (∃t ∈ Q)(b^t > b^r ∧ t ≤ r)

It can be shown that b^{−t} > 0 (see theorem S1, below) so we can multiply this term against both sides of the inequality.

→ (∃t ∈ Q)(b^t b^{−t} > b^r b^{−t} ∧ t ≤ r) theorem S2

→ (∃t ∈ Q)(b^{t−t} > b^{r−t} ∧ t ≤ r) from part b

→ (∃t ∈ Q)(1 > b^{r−t} ∧ r − t ≥ 0)

→ (∃t ∈ Q)(1^{1/(r−t)} > b ∧ r − t ≥ 0)

→ 1 > b

And this establishes our contradiction, since we were given that b > 1. Our initial assumption must have been incorrect: b^r is, in fact, an upper bound of B(r). We must still prove that it is the least upper bound of B(r), though. To do so, let α represent an arbitrary rational number such that b^α < b^r. From this, we need to


prove that α < r.

↦ b^α < b^r hypothesis of contradiction

→ b^α b^{−r} < b^r b^{−r} theorem S2

→ b^{α−r} < b^{r−r} from part b

→ b^{α−r} < 1 from part b

Since b > 1, b^{α−r} < 1 can only hold when α − r < 0 (see theorem S4, below), so α < r, as required.

Exercise 1.7 a

Proof by induction. Let S = {n : b^n − 1 ≥ n(b − 1)}. We can easily verify that 1 ∈ S. Now, assume that k ∈ S:

↦ k ∈ S hypothesis of induction

→ b^k − 1 ≥ k(b − 1) definition of membership in S

→ b(b^k − 1) ≥ k(b − 1) we're told that b > 1

→ b^{k+1} − b ≥ k(b − 1)

→ b^{k+1} ≥ k(b − 1) + b

→ b^{k+1} − 1 ≥ k(b − 1) + b − 1

→ b^{k+1} − 1 ≥ (k + 1)(b − 1)

→ k + 1 ∈ S definition of membership in S

By induction, this proves that (∀n ∈ N)(b^n − 1 ≥ n(b − 1)).

Alternatively, we could prove this using the same identity that Rudin used in the proof of 1.21. From the distributive property we can verify that b^n − a^n = (b − a)(b^{n−1}a^0 + b^{n−2}a^1 + . . . + b^0 a^{n−1}). So when a = 1, this becomes b^n − 1 = (b − 1)(b^{n−1} + b^{n−2} + . . . + b^0). And since b > 1, each term in the b^{n−k} series is greater than or equal to 1, so b^n − 1 ≥ (b − 1)(1^{n−1} + 1^{n−2} + . . . + 1^0) = (b − 1)n.
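
A quick numeric spot check of the inequality b^n − 1 ≥ n(b − 1) for a few arbitrary bases b > 1 (this is illustration only, not part of the induction):

```python
for b in (1.01, 1.5, 2.0, 3.7):
    for n in range(1, 50):
        # small tolerance for floating-point rounding
        assert b ** n - 1 >= n * (b - 1) - 1e-9
print("b^n - 1 >= n(b - 1) held on all samples")
```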

Exercise 1.7 b

↦ n(b^{1/n} − 1) = n(b^{1/n} − 1)

→ n(b^{1/n} − 1) = (1 + 1 + . . . + 1) (n times) · (b^{1/n} − 1)

→ n(b^{1/n} − 1) = (1^{n−1} + 1^{n−2} + . . . + 1^0)(b^{1/n} − 1)

It can be shown that b^k > 1 when b > 1, k > 0 (see theorem S4). Replacing 1 with b^{1/n} gives us the inequality:

→ n(b^{1/n} − 1) ≤ ((b^{1/n})^{n−1} + (b^{1/n})^{n−2} + . . . + (b^{1/n})^0)(b^{1/n} − 1)

Now we can use the algebraic identity b^n − a^n = (b^{n−1}a^0 + b^{n−2}a^1 + . . . + b^0 a^{n−1})(b − a):

→ n(b^{1/n} − 1) ≤ (b^{1/n})^n − 1

→ n(b^{1/n} − 1) ≤ b − 1

Exercise 1.7 c

↦ n > (b − 1)/(t − 1) assumed

→ n(t − 1) > (b − 1) this holds because n, t, and b are greater than 1

→ n(t − 1) > (b − 1) ≥ n(b^{1/n} − 1) from part b

→ n(t − 1) > n(b^{1/n} − 1) transitivity of order relations

→ (t − 1) > (b^{1/n} − 1) n > 0 → n^{-1} > 0 would be a trivial proof

→ t > b^{1/n}


Exercise 1.7 d

We're told that b^w < y, which means that 1 < yb^{−w}. Using the substitution yb^{−w} = t with part (c), we're led directly to the conclusion that we can select n such that yb^{−w} > b^{1/n}. From this we get y > b^{w + 1/n}, which is what we were asked to prove. As a corollary, the fact that b^{1/n} > 1 means that b^{w + 1/n} > b^w.

Exercise 1.7 e

We're told that b^w > y, which means that b^w y^{−1} > 1. Using the substitution b^w y^{−1} = t with part (c), we're led directly to the conclusion that we can select n such that b^w y^{−1} > b^{1/n}. Multiplying both sides by y gives us b^w > b^{1/n} y. Multiplying this by b^{−1/n} gives us b^{w − 1/n} > y, which is what we were asked to prove. As a corollary, the fact that b^{1/n} > 1 > 0 means that, upon taking reciprocals, we have b^{−1/n} < 1 and therefore b^{w − 1/n} < b^w.

Exercise 1.7 f

We'll prove that b^x = y by showing that the assumptions b^x > y and b^x < y lead to contradictions.

If b^x > y, then from part (e) we can choose n such that b^x > b^{x − 1/n} > y. From this we see that x − 1/n is an upper bound of A that is smaller than x. This is a contradiction, since we've assumed that x = sup(A).

If b^x < y, then from part (d) we can choose n such that y > b^{x + 1/n} > b^x. From this we see that x is not an upper bound of A. This is a contradiction, since we've assumed that x is the least upper bound of A.

Having ruled out these two possibilities, the trichotomy property of ordered fields forces us to conclude that b^x = y.

Exercise 1.7 g

Assume that there are two elements such that b^w = y and b^x = y. Then by the transitivity of equality relations, b^w = b^x, although this seems suspiciously simple.

Exercise 1.8

In any ordered set, all elements of the set must be comparable (the trichotomy rule, definition 1.5). We will show by contradiction that (0, 1) is not comparable to (0, 0) in any potential ordered field containing C. First, we assume that (0, 1) > (0, 0):

↦ (0, 1) > (0, 0) hypothesis of contradiction

→ (0, 1)(0, 1) > (0, 0) definition 1.17(ii) of ordered fields

We assumed here that (0, 0) can take the role of 0 in definition 1.17 of an ordered field. This is a safe assumption because the uniqueness property of the additive identity shows us immediately that (0, 0) + (a, b) = (a, b) → (0, 0) = 0.

→ (−1, 0) > (0, 0) definition of complex multiplication

→ (−1, 0)(0, 1) > (0, 0) definition 1.17(ii) of ordered fields, since we initially assumed (0, 1) > 0

→ (0,−1) > (0, 0) definition of complex multiplication

It might seem that we have established our contradiction as soon as we concluded that (−1, 0) > 0 or (0, −1) > 0. However, we're trying to show that the complex field cannot be an ordered field under any order relation, even a bizarre one in which −1 > −i > 0. But we have shown that (0, 1) and (0, −1) are both greater than zero. Therefore:

→ (0,−1) + (0, 1) > (0, 0) + (0, 1) definition 1.17(i) of ordered fields

→ (0, 0) > (0, 1) definition of complex addition

This conclusion is in contradiction of trichotomy, since we initially assumed that (0, 0) < (0, 1). Next, we assume that (0, 1) < (0, 0):


↦ (0, 0) > (0, 1) hypothesis of contradiction

→ (0, 0) + (0,−1) > (0, 1) + (0,−1) definition 1.17(i) of ordered fields

→ (0,−1) > (0, 0) definition of complex addition

→ (0,−1)(0,−1) > (0, 0) definition 1.17(ii) of ordered fields

→ (−1, 0) > (0, 0) definition of complex multiplication

→ (−1, 0)(0, −1) > (0, 0) definition 1.17(ii) of ordered fields, since we've established (0, −1) > (0, 0)

→ (0, 1) > (0, 0) definition of complex multiplication

Once again trichotomy has been violated.

Proof by contradiction that (0, 1) ≠ (0, 0): if we assume that (0, 1) = (0, 0) we're led to the conclusion that (a, b) = (0, 0) for every complex number, since (a, b) = a(0, 1)^4 + b(0, 1) = a(0, 0) + b(0, 0) = (0, 0). By the transitivity of equivalence relations, this would mean that every element is equal to every other. And this is in contradiction of definition 1.12 of a field, where we're told that there are at least two distinct elements: the additive identity ('0') and the multiplicative identity ('1').

Exercise 1.9a

To prove that this relation turns C into an ordered set, we need to show that it satisfies the two requirements in definition 1.5. Proof of transitivity:

↦ (a, b) < (c, d) ∧ (c, d) < (e, f) assumption

→ [a < c ∨ (a = c ∧ b < d)] ∧ [c < e ∨ (c = e ∧ d < f)] definition of this order relation

→ (a < c ∧ c < e) ∨ (a < c ∧ c = e ∧ d < f)

∨(a = c ∧ b < d ∧ c < e) ∨ (a = c ∧ b < d ∧ c = e ∧ d < f) distributivity of logical operators

→ (a < e) ∨ (a < e ∧ d < f) ∨ (a < e ∧ b < d) ∨ (a = e ∧ b < f) transitivity of order relation on R

Although we're falling back on the transitivity of an order relation, we are not assuming what we're trying to prove. We're trying to prove the transitivity of the dictionary order relation on C, and this relation is defined in terms of the standard order relation on R. This last step is using the transitivity of this standard order relation on R and is not assuming that transitivity holds for the dictionary order relation.

→ (a < e) ∨ (a < e) ∨ (a < e) ∨ (a = e ∧ b < f) p ∧ q → p

→ a < e ∨ (a = e ∧ b < f) p ∨ p→ p

→ (a, b) < (e, f) definition of this order relation

To prove that the trichotomy property holds for the dictionary relation on C, we rely on the trichotomy property of the underlying standard order relation on R. Let (a, b) and (c, d) be two elements in C. From the standard order relation, we know that

↦ (a, b) ∈ C ∧ (c, d) ∈ C assumed

→ a, b, c, d ∈ R definition of a complex number

→ (a < c) ∨ (a > c) ∨ (a = c) trichotomy of the order relation on R

→ (a < c) ∨ (a > c) ∨ (a = c ∧ (b < d ∨ b > d ∨ b = d)) trichotomy of the order relation on R

→ (a < c) ∨ (a > c) ∨ (a = c ∧ b < d) ∨ (a = c ∧ b > d) ∨ (a = c ∧ b = d) distributivity of the logical operators

→ (a, b) < (c, d) ∨ (a, b) > (c, d) ∨ (a, b) < (c, d) ∨ (a, b) > (c, d)

∨(a, b) = (c, d) definition of the dictionary order relation

→ (a, b) < (c, d) ∨ (a, b) > (c, d) ∨ (a, b) = (c, d)

And this is the definition of the trichotomy law, so we have proven that the dictionary order turns the


complex numbers into an ordered set.

Exercise 1.9b

C does not have the least upper bound property under the dictionary order. Let E = {(0, a) : a ∈ R}. This subset is just the imaginary axis in the complex plane. This subset clearly has an upper bound, since (x, 0) > (0, a) for any x > 0. But it does not have a least upper bound: for any proposed upper bound (x, y) with x > 0, we see that

(x, y) > (x/2, y) > (0, a)

So that (x/2, y) is an upper bound less than our proposed least upper bound, which is a contradiction.
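
The dictionary order is easy to experiment with. A minimal sketch, assuming the representation of a + bi as the tuple (a, b) and using helper names of my own choosing, shows that any proposed upper bound (x, y) with x > 0 can be beaten by (x/2, y):

```python
def dict_less(p, q):
    """Dictionary (lexicographic) order on C, with z represented as (Re z, Im z)."""
    (a, b), (c, d) = p, q
    return a < c or (a == c and b < d)

E = [(0.0, a) for a in (-10.0, -1.0, 0.0, 1.0, 10.0)]  # sample points on the imaginary axis
ub = (3.0, -2.0)               # a proposed upper bound with positive real part
smaller = (ub[0] / 2, ub[1])   # (x/2, y)

assert all(dict_less(e, ub) for e in E)
assert all(dict_less(e, smaller) for e in E) and dict_less(smaller, ub)
print("found a smaller upper bound:", smaller)
```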

Exercise 1.10

This is just straightforward algebra, and is too tedious to write out.

Exercise 1.11

If we choose w = z/|z| and choose r = |z| (assuming z ≠ 0, so that we may divide by |z|), then we can easily verify that |w| = 1 and that rw = |z| · z/|z| = z.

Exercise 1.12

Set a_i = √z_i and b_i = √z_i and use the Cauchy-Schwarz inequality (theorem 1.35). This gives us

|Σ_{j=1}^n √z_j √z_j|² ≤ (Σ_{j=1}^n |√z_j|²)(Σ_{j=1}^n |√z_j|²)

which is equivalent to

|z_1 + z_2 + . . . + z_n|² ≤ (|z_1| + |z_2| + . . . + |z_n|)²

Taking the square root of each side shows that

|z_1 + z_2 + . . . + z_n| ≤ |z_1| + |z_2| + . . . + |z_n|

which is what we were asked to prove.
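
A quick numeric sanity check of the resulting inequality on random complex samples (random data and floating point, so this only spot-checks the claim):

```python
import random

random.seed(0)
for _ in range(1000):
    zs = [complex(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(6)]
    assert abs(sum(zs)) <= sum(abs(z) for z in zs) + 1e-9
print("|z1 + ... + zn| <= |z1| + ... + |zn| held on all samples")
```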

Exercise 1.13

|x − y|²

= (x − y) · conj(x − y)
= (x − y)(x̄ − ȳ) Theorem 1.31(a)
= xx̄ − xȳ − yx̄ + yȳ
= xx̄ − (xȳ + yx̄) + yȳ
= |x|² − 2Re(xȳ) + |y|² Theorem 1.31(c), definition 1.32
≥ |x|² − 2|Re(xȳ)| + |y|² x ≤ |x|, so −x ≥ −|x|
≥ |x|² − 2|xȳ| + |y|² Theorem 1.33(d)
= |x|² − 2|x||ȳ| + |y|² Theorem 1.33(c)
= |x|² − 2|x||y| + |y|² Theorem 1.33(b)
= (|x| − |y|)(|x| − |y|)
= (|x| − |y|)(|x̄| − |ȳ|) Theorem 1.33(b)
= (|x| − |y|) · conj(|x| − |y|) Theorem 1.31(a)
= ||x| − |y||²

This chain of inequalities shows us that ||x| − |y||² ≤ |x − y|². Taking the square root of both sides results in the claim we wanted to prove.
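
A corresponding spot check of ||x| − |y|| ≤ |x − y| on random complex pairs (again only a sanity check with arbitrary sample ranges):

```python
import random

random.seed(1)
for _ in range(1000):
    x = complex(random.uniform(-5, 5), random.uniform(-5, 5))
    y = complex(random.uniform(-5, 5), random.uniform(-5, 5))
    assert abs(abs(x) - abs(y)) <= abs(x - y) + 1e-12
print("||x| - |y|| <= |x - y| held on all samples")
```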


Exercise 1.14

|1 + z|² + |1 − z|²

= (1 + z) · conj(1 + z) + (1 − z) · conj(1 − z)
= (1 + z)(conj(1) + z̄) + (1 − z)(conj(1) − z̄) Theorem 1.31(a)
= (1 + z)(1 + z̄) + (1 − z)(1 − z̄) The conjugate of 1 = 1 + 0i is just 1 − 0i = 1.
= (1 + z̄ + z + zz̄) + (1 − z̄ − z + zz̄)
= 2 + 2zz̄
= 2 + 2 We are told that zz̄ = 1
= 4
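
A quick check of this identity at sampled points of the unit circle (the sampled angles are arbitrary):

```python
import cmath
import math

for k in range(360):
    z = cmath.exp(1j * math.radians(k))   # |z| = 1, so z * conj(z) = 1
    total = abs(1 + z) ** 2 + abs(1 - z) ** 2
    assert math.isclose(total, 4.0, abs_tol=1e-9)
print("|1 + z|^2 + |1 - z|^2 = 4 on the unit circle")
```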

Exercise 1.15

Using the logic and the notation from Rudin's proof of theorem 1.35, we see that equality holds in the Schwarz inequality when AB = |C|². This occurs when we have B(AB − |C|²) = 0, and from the given chain of equalities we see that this occurs when Σ|Ba_j − Cb_j|² = 0. For this to occur we must have Ba_j = Cb_j for all j, which occurs only when B = 0 or when

a_j = (C/B) b_j for all j

That is, each a_j must be a constant multiple of b_j.

Exercise 1.16

We know that |z − x|² = |z − y|² = r². Expanding these terms out, we have

|z − x|² = (z − x) · (z − x) = |z|² − 2z · x + |x|²

|z − y|² = (z − y) · (z − y) = |z|² − 2z · y + |y|²

For these to be equal, we must have

−2z · x + |x|² = −2z · y + |y|²

which happens when

z · (x − y) = (1/2)[|x|² − |y|²]   (1)

We also want |z − x| = r, which occurs when z = x + ru where |u| = 1. Using this representation of z, the requirement that r² = |z − y|² becomes

r² = |z − y|² = |x + ru − y|² = |(x − y) + ru|² = |x − y|² + 2ru · (x − y) + |ru|² = d² + 2ru · (x − y) + r²

Rearranging some terms, this becomes

u · (y − x) = d²/(2r) = (d/(2r))|y − x|   (2)

A quick, convincing, and informal proof would be to appeal to the relationship a · b = |a||b| cos(θ) where θ is the angle between the two vectors; the previous equation then becomes

|u||y − x| cos(θ) = (d/(2r))|y − x|

Dividing by |y − x| and remembering that |u| = 1, this becomes

cos(θ) = d/(2r)

where θ is the angle between the fixed vector (y − x) and the variable vector u. It's easy to see that this equation will hold for exactly one u when d = 2r; it will hold for no u when d > 2r; it will hold for two values of u when d < 2r and k = 2; and it will hold for infinitely many values of u when d < 2r and k > 2. Each value of u corresponds with a unique value of z. More formal proofs follow.


part (a)

When d < 2r, equation (2) is satisfied for all u for which

u · (y − x) = (d/(2r))|y − x| < |y − x|

By the definition of the dot product, this is equivalent to

u_1(y_1 − x_1) + u_2(y_2 − x_2) + . . . + u_k(y_k − x_k) = (d/(2r))|y − x|   (3)

The only other requirement for the values of u_i is that

√(u_1² + u_2² + . . . + u_k²) = 1   (4)

This gives us a system of two equations with k variables. As long as k ≥ 3 we have more variables than equations and therefore the system will have infinitely many solutions.

part (b)

Evaluating d², we have:

d² = |x − y|²
= |(x − z) + (z − y)|²
= [(x − z) + (z − y)] · [(x − z) + (z − y)] definition of inner product ·
= (x − z) · (x − z) + 2(x − z) · (z − y) + (z − y) · (z − y) inner products are distributive
= |x − z|² + 2(x − z) · (z − y) + |z − y|² definition of inner product ·

Evaluating (2r)², we have:

(2r)² = (r + r)² = (|z − x| + |z − y|)²
= |z − x|² + 2|z − x||z − y| + |z − y|² commutativity of multiplication

If 2r = d then d² = (2r)² and therefore, by the above evaluations, we have

|x − z|² + 2(x − z) · (z − y) + |z − y|² = |z − x|² + 2|z − x||z − y| + |z − y|²

which occurs if and only if

2(x − z) · (z − y) = 2|x − z||z − y|

From exercise 14 we saw that this equality held only if (x − z) = c(z − y) for some constant c; we know that |x − z| = |z − y| so c = ±1; we know x ≠ y so c = 1. Therefore we have x − z = z − y, from which we have

z = (x + y)/2

and there is clearly only one such z that satisfies this equation.

part (c)

If 2r < d then we have

|x − y| > |x − z| + |z − y|

which is equivalent to

|(x − z) + (z − y)| > |x − z| + |z − y|

which violates the triangle inequality (1.37e) and is therefore false for all z.


Exercise 1.17

First, we need to prove that a · (b + c) = a · b + a · c and that (a + b) · c = a · c + b · c: that is, we need to prove that the distributive property holds between the inner product operation and addition.

a · (b + c) = Σ a_i(b_i + c_i) definition 1.36 of inner product
= Σ (a_i b_i + a_i c_i) distributive property of R
= Σ a_i b_i + Σ a_i c_i associative property of R
= a · b + a · c definition 1.36 of inner product

(a + b) · c = Σ (a_i + b_i)c_i definition 1.36 of inner product
= Σ (a_i c_i + b_i c_i) distributive property of R
= Σ a_i c_i + Σ b_i c_i associative property of R
= a · c + b · c definition 1.36 of inner product

The rest of the proof follows directly:

|x + y|² + |x − y|² = (x + y) · (x + y) + (x − y) · (x − y)
= (x + y) · x + (x + y) · y + (x − y) · x − (x − y) · y
= x · x + y · x + x · y + y · y + x · x − y · x − x · y + y · y
= x · x + y · y + x · x + y · y
= 2|x|² + 2|y|²
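
A numeric spot check of the resulting parallelogram law in R^k (random vectors; the dimension k = 5 is an arbitrary choice):

```python
import random

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

random.seed(2)
k = 5
for _ in range(500):
    x = [random.uniform(-3, 3) for _ in range(k)]
    y = [random.uniform(-3, 3) for _ in range(k)]
    xp = [xi + yi for xi, yi in zip(x, y)]   # x + y
    xm = [xi - yi for xi, yi in zip(x, y)]   # x - y
    assert abs(dot(xp, xp) + dot(xm, xm) - (2 * dot(x, x) + 2 * dot(y, y))) < 1e-9
print("|x + y|^2 + |x - y|^2 = 2|x|^2 + 2|y|^2 on all samples")
```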

Exercise 1.18

If x = 0, then x · y = 0 for any y ∈ R^k. If x ≠ 0, then at least one of the elements x_1, . . . , x_k must be nonzero: let this element be represented by x_a. Let x_b represent any other element of x. Choose y such that:

y_i = x_b/x_a if i = a, −1 if i = b, and 0 otherwise.

We can now see that x · y = x_a(x_b/x_a) + x_b(−1) = x_b − x_b = 0. We began with an arbitrary vector x and demonstrated a method of construction for y such that x · y = 0: therefore, we can always find a nonzero y such that x · y = 0. This is not true in R^1 because the nonzero elements of R are closed with respect to multiplication.
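
The construction above is easy to code up. A minimal sketch, where the helper name make_orthogonal is mine rather than anything from the text:

```python
def make_orthogonal(x):
    """Given a nonzero x in R^k (k >= 2), return a nonzero y with x . y = 0,
    using the construction from the solution: y_a = x_b/x_a, y_b = -1, else 0."""
    a = next(i for i, xi in enumerate(x) if xi != 0)   # index of a nonzero entry
    b = next(i for i in range(len(x)) if i != a)       # any other index
    y = [0.0] * len(x)
    y[a] = x[b] / x[a]
    y[b] = -1.0
    return y

x = [0.0, 3.0, -2.0, 5.0]
y = make_orthogonal(x)
assert sum(xi * yi for xi, yi in zip(x, y)) == 0
print(y)
```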

Exercise 1.19

We need to determine the circumstances under which |x − a| = 2|x − b| and |x − c| = r. To do this, we need to manipulate these equalities until they have a common term that we can use to compare them.

|x − a| = 2|x − b|
|x − a|² = 4|x − b|²
|x|² − 2x · a + |a|² = 4|x|² − 8x · b + 4|b|²
3|x|² = |a|² − 2x · a + 8x · b − 4|b|²

|x − c| = r
|x − c|² = r²
|x|² − 2x · c + |c|² = r²
|x|² = r² + 2x · c − |c|²
3|x|² = 3r² + 6x · c − 3|c|²

Combining these last two equalities together, we have


↦ |a|² − 2x · a + 8x · b − 4|b|² = 3r² + 6x · c − 3|c|²

→ |a|² − 2x · a + 8x · b − 4|b|² − 3r² − 6x · c + 3|c|² = 0

→ |a|² − 4|b|² + 3|c|² − 2x · (a − 4b + 3c) − 3r² = 0

We can zero out the dot product in this equation by letting 3c = 4b − a. Of course, this also determines a specific value of c. This substitution gives us the new equality:

→ |a|² − 4|b|² + 3|4b/3 − a/3|² − 3r² = 0

→ |a|² − 4|b|² + (3·16/9)|b|² − (3·8/9) a · b + (3/9)|a|² − 3r² = 0

→ 3|a|² − 12|b|² + 16|b|² − 8a · b + |a|² − 9r² = 0

→ 4|a|² + 4|b|² − 8a · b − 9r² = 0

→ 4|a − b|² = 9r²

→ 2|a − b| = 3r

By choosing 3c = 4b − a and 3r = 2|a − b|, we guarantee that |x − a| = 2|x − b| iff |x − c| = r.
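
A numeric spot check of the conclusion in R³: with c and r chosen as above, sampled points on the sphere |x − c| = r do satisfy |x − a| = 2|x − b| (the vectors a, b and the sampled directions are arbitrary, and only this direction of the equivalence is checked):

```python
import math
import random

def norm(v):
    return math.sqrt(sum(vi * vi for vi in v))

a = [1.0, -2.0, 0.5]
b = [3.0, 0.0, -1.0]
c = [(4 * bi - ai) / 3 for ai, bi in zip(a, b)]          # 3c = 4b - a
r = (2 / 3) * norm([ai - bi for ai, bi in zip(a, b)])    # 3r = 2|a - b|

random.seed(3)
for _ in range(200):
    u = [random.gauss(0, 1) for _ in range(3)]
    u = [ui / norm(u) for ui in u]                       # random unit vector
    x = [ci + r * ui for ci, ui in zip(c, u)]            # point with |x - c| = r
    da = norm([xi - ai for xi, ai in zip(x, a)])
    db = norm([xi - bi for xi, bi in zip(x, b)])
    assert math.isclose(da, 2 * db, rel_tol=1e-9)
print("every sampled point with |x - c| = r satisfied |x - a| = 2|x - b|")
```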

Exercise 1.20

We're trying to show that R has the least upper bound property. The elements of R are certain subsets of Q, and < is defined to be "is a proper subset of". To say that an element α ∈ R has a least upper bound, then, is to say that the subset α has some "smallest" superset β such that α ⊂ β. We're asked to omit property (III), which told us that each α ∈ R has no largest element. With this restriction lifted, we have a new definition of "cut" that includes cuts such as (−∞, 1] and (−∞, √2].

To prove that each subset of R with an upper bound must have a least upper bound, we will follow Step 3in the book almost exactly. We will define two subsets A ⊂ R and γ ⊂ R, show that the subset γ is also anelement of R, and then show that the subset/element γ is the least upper bound of A.

Let A be a nonempty subset of R with an upper bound of β (Note: A is a subset of R, not a subset of R.The elements of A are cuts, which are subsets of Q). Define γ to be the union of all cuts in A. This means thatγ contains every element from every cut in A: γ consists of elements from Q.

proof that γ is a cut:

The proof of criterion (I) has two parts. (i) γ is the union of elements of A, and we were told that A is nonempty.Therefore γ is nonempty. (ii) We are told that β is an upper bound for A, so x ∈ A → x ⊂ β (remember that< is defined as ⊂). But γ is just the union of cut elements in A, so γ is the union of proper subsets of β. Thismeans that γ itself is a proper subset of β. This shows that γ ⊂ β ⊆ Q, so γ 6= Q. So criterion (I) in thedefinition of “cut” has been met.

To prove part (II), pick some arbitrary rational p ∈ γ. γ is the set of cuts in A, so p ∈ α for some cut α ∈ A.Choose a second rational q such that q < p: by the definition of cut, the fact that p ∈ α and q < p means thatq ∈ α and therefore q ∈ γ. And we’re being asked to disregard part (III), so this is sufficient to prove that γ isa cut.

proof that γ is the least upper bound of A

(i) Choose any arbitrary cut α ∈ A. We’ve defined γ as the union of all cuts in A, so it’s clear that every rationalnumber in α is also contained in γ: that is, α ⊆ γ. And by the definition of < for this set, this tells us thatα ≤ γ for every α ∈ A.

(ii) Suppose that δ < γ. By the definition of < for this set, this means that δ ⊂ γ. This is a proper subset,so there must be some rational s such that s 6∈ δ but s ∈ γ. In order for s ∈ γ, it must be the case that s ∈ αfor some cut α ∈ A. We can now show that δ ⊂ α. For every r ∈ δ,it’s also true that s 6∈ δ. By (II), this means


that r < s (using standard rational order). And since s ∈ α, (II) also shows that r ∈ α. This shows that δ ⊂ α,which means δ < α ∈ A (using the subset order on R), and therefore δ is not an upper bound of A.

Together, these two facts show that γ is the least upper bound of A.

definition of addition

Following the example in the book, we define the addition of two cuts α+β to be the set of all sums r+ s wherer ∈ α and s ∈ β. The book defined 0∗ to be the set of rational numbers < 0, but by omitting requirement (III)we are forced to redefine our zero. Let 0# be the set of all rational numbers ≤ 0.

The original definition had to omit 0 from 0∗ in order to keep R, the set of cuts, closed under addition(otherwise we’d have 0∗ + 0∗ = (−∞, 0] which is not a cut because of criterion (III)). Our new zero 0# mustinclude 0 as an element because our newly defined cuts can have a greatest element. The set 0∗ can no longerfunction as our zero since (−∞, x] + 0∗ = (−∞, x); these two cuts are not equal ((−∞, x) < (−∞, x]), so 0∗ hasnot functioned as the additive identity.

field axioms A1,A2, and A3

The proofs from the book for closure, commutativity, and associativity are directly applicable to our newdefinition of cut.

field axiom A4: existence of an additive identity

Let α be a cut in R. Let r and s be rationals such that r ∈ α and s ∈ 0#. Then r + s ≤ r, which means thatr+ s ∈ α. This shows us that α+ 0# ⊆ α. Now let p and r be rationals such that p ∈ α, r ∈ α, and p ≤ r. Thismeans that p− r ≤ 0, so p− r ∈ 0#. Therefore r+ (p− r) ∈ α+ 0#, which means that p ∈ α+ 0#. This showsus that α ⊆ α+ 0#. Having shown that α+ 0# ⊆ α and that α ⊆ α+ 0#, we conclude that α = α+ 0# for anyα ∈ R.

field axiom A5: existence of additive inverses

The book constructed the definition of inverse so that the inverse of (−∞, x) would be (−∞, −x). This works under the original definition of "cut", since (−∞, x) + (−∞, −x) = (−∞, 0) = 0∗. It is tempting to just define the inverse for our new definition of cut so that the inverse of (−∞, x] is (−∞, −x]. This overlooks the fact that both (−∞, x] and (−∞, x) are cuts under our new definition: both of these elements must have additive inverses.

To show that A5 is not satisfied, we need to demonstrate that at least one element of R has no additive inverse. Let α = (−∞, r) for some rational r. Assume that there was some cut β such that α + β = 0# = (−∞, 0]: we will show that this assumption leads to a contradiction.

(i) Assume that β does not contain any elements greater than −r. Let p and q be arbitrary rationals suchthat p ∈ α and q ∈ β. From our definitions of α and β, we know that p < r and q ≤ −r. Combining these twoinequalities tells us that p+ q < −r + r, or p+ q < 0. But p and q were arbitrary members of α and β, so thisshows that 0 6∈ α+ β. So it cannot be the case that α+ β = 0#.

(ii) Assume that β contains at least one element q that is greater than −r. Let s represent the differenceq − (−r): then q = −r + s (note that s must be a positive rational). Let p be an arbitrary rational such thatp ∈ α. By the definition of α, we know that p < r. And from property (II) of cuts, we know that r− s < p < r.Adding q = −r+ s to each element in this equality gives us 0 < p+ q < s. And p+ q is an element of α+ β, sowe see that α+ β contains some positive rational: it cannot be the case that α+ β = 0#.

Whether or not β contains an element greater than −r, we find ourselves in contradiction with the initialassumption that β might be the inverse of α. Therefore there can be no inverse for α. In conclusion, we seethat omitting (III) forces us to redefine the additive identity, and this new definition results in the existence ofelements in R that have no inverse.


Exercise 2.1

This is immediately justified by noting that the definition of subset, x ∈ ∅ → x ∈ A, is satisfied for any set A because of the false antecedent. A more formal proof:

↦ ¬(∃x ∈ ∅) definition of an empty set

→ (∀x)(x ∉ ∅) negation of the ∃ quantifier

→ (∀x)(x ∉ A → x ∉ ∅) Hilbert's PL1

This previous step is justified by the argument p → (q → p): if something is true (like x ∉ ∅), then everything implies it.

→ (∀x)(x ∈ ∅ → x ∈ A) contrapositive

→ ∅ ⊆ A definition of subset

Exercise 2.2

The set of integers is countable, so by theorem 2.13 the set of all (k+1)-tuples (a_0, a_1, . . . , a_k) with a_0 ≠ 0 is also countable. Let this set be represented by Z_k. For each a ∈ Z_k consider the polynomial a_0 z^k + a_1 z^{k−1} + . . . + a_k = 0. From the fundamental theorem of algebra, we know that there are exactly k complex roots for this polynomial.

We now have a series of nested sets that encompass every possible root for every possible polynomial with integer coefficients. More specifically, we have a countable number of Z_k's, each containing a countable number of (k + 1)-tuples, each of which corresponds with k roots of a k-degree polynomial. So our set of complex roots (call it R) is a countable union of countable unions of finite sets. This only tells us that R is at most countable: it is either countable or finite.

To show that R is not finite, consider the roots for 2-tuples in Z_1. Each 2-tuple of the form (−1, n) corresponds with the polynomial −z + n = 0 whose solution is z = n. There is clearly a unique solution for each n ∈ Z, so R is an infinite set. Because R is also at most countable, this proves that R is countable.

I did not use the hint provided in the text, which either means that this proof is invalid or that there is an alternate (simpler?) proof.

Exercise 2.3

The set of real numbers is uncountable, so if every real number were algebraic then the set of real algebraic numbers would be uncountable. However, exercise 2.2 showed that the algebraic complex numbers were countable. The real numbers are a subset of the complex numbers, so the set of algebraic real numbers is at most countable. Since R itself is uncountable, there must exist real numbers that are not algebraic.

Exercise 2.4

The rational real numbers are countable. If the irrational real numbers were countable, then the union of rational and irrational reals would be countable. But this union is just R, which is not countable. So the irrational reals are not countable.

Exercise 2.x: The Cantor Set

The idea behind the Cantor set is that each term in the series E_1 ⊃ E_2 ⊃ . . . ⊃ E_n removes the middle third of each remaining line segment, so that you get a series of increasingly smaller segments.


We can see that E_0 has one segment of length 1, E_1 has two segments of length 1/3, and E_m has 2^m segments of length 1/3^m. Notice that for any possible segment (α, β), we can choose m to be large enough so that the maximum segment length is less than the length of (α, β) and therefore (α, β) ⊄ E_m. And if the segment is not in E_m for some m, it's not in the intersection P. This is the reasoning behind Rudin's statement that "P contains no segment".

To justify the claim that no point in P is an isolated point, we need to show that for every point p_1 ∈ P and any arbitrarily small radius r, we can find some second point p_2 ∈ P such that d(p_1, p_2) < r. P is defined to be ⋂E_i, so if p_1 ∈ P then it is a member of every E_i. Choose some element E_m where m is so large that the segments in E_m are all shorter than r. We are assured that this is possible by the Archimedean property of the rationals, since we need only find m such that 3^{−m} < r.

So we've found an E_m with a line segment containing p_1. Let p_2 represent one of the endpoints of this line. The length of this segment is less than r, so d(p_1, p_2) < r. In each subsequent term in the series {E_i}, the endpoint p_2 will never be "cut off": it will always be the endpoint of a segment from which the middle third is lost. So p_2 ∈ P. And since this was true for an arbitrary point p_1 ∈ P and an arbitrarily small r, this shows that every point of P is a limit point of P, not an isolated point.
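
A short sketch that builds the segments of E_m by repeatedly removing middle thirds and confirms the counts and lengths quoted above (exact Fraction arithmetic; the helper name next_level is mine):

```python
from fractions import Fraction

def next_level(intervals):
    """Remove the open middle third of each closed interval [a, b]."""
    out = []
    for a, b in intervals:
        third = (b - a) / 3
        out.append((a, a + third))
        out.append((b - third, b))
    return out

E = [(Fraction(0), Fraction(1))]          # E_0
for m in range(1, 6):
    E = next_level(E)
    assert len(E) == 2 ** m
    assert all(b - a == Fraction(1, 3 ** m) for a, b in E)
print("E_5 has", len(E), "segments of length 1/243")
```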

Exercise 2.5

Let E be a subset of (0, 1] consisting of all rational numbers of the form 1/m, m ∈ N. No point in E is a limit point, but the limit points of E need not be members of E: we will demonstrate that the point 0 is a limit point of E.

(i) Proof that no point in E is a limit point: let p_m be an arbitrary member of E. Then p_m = 1/m for some m ∈ N. The closest point to p_m is therefore p_{m+1} = 1/(m+1). So we just choose r such that r = (1/2) d(p_m, p_{m+1}) and we see that there are no other points in this neighborhood of p_m, which shows that p_m is an isolated point.

(ii) Proof that 0 is a limit point: choose an arbitrarily small radius r. From the Archimedean property, we can find some m ∈ N such that 1/m < r. This p_m = 1/m is a member of E, and d(0, p_m) = 1/m < r. This shows that 0 is a limit point.

(iii) Proof that no other points are limit points: let x be some point that is neither zero nor a member of E. This point must be either less than zero, greater than 1, or it must lie between two sequential points p_m, p_{m+1} ∈ E. Let d_x represent the smallest of the distances {d(x, 0), d(x, 1), d(x, p_m), d(x, p_{m+1})}. Choose r such that r = (1/2) d_x. There can be no points of E in the neighborhood Nr(x). If there were, then this would indicate the existence of a point of E less than zero, greater than 1, or between 1/m and 1/(m+1). None of these can exist in E as we defined it, so no points other than 0 are limit points.

Now let F be the subset of (2, 3] containing rational numbers of the form 2 + 1/m and let G be the subset of (4, 5] consisting of rational numbers of the form 4 + 1/m. Just as E was shown to have a limit point of only zero, these sets can be shown to have limit points of only 2 and 4. Therefore the union E ∪ F ∪ G is bounded and has three limit points {0, 2, 4}.

Exercise 2.6

(i) We’re asked to prove that E′ is closed. This is equivalent to proving that every limit point of E′ is a pointof E. And by the definition of E′, this is equivalent to proving that every limit point of E′ is a limit point of E.

To prove this, let x represent an arbitrary limit point of E′. Choose any arbitrarily small r and let s = r/2, t = r/4 (t was chosen to be less than r to guarantee x ≠ z). Since x is a limit point of E′, we can find a point y ∈ E′ in the neighborhood Ns(x). And y, by virtue of being in E′, is a limit point of E: so we can find a point z ∈ E in the neighborhood Nt(y). The distance d(x, y) is less than s and d(y, z) is less than t, so definition 2.15 of metric spaces assures us that d(x, z) ≤ d(x, y) + d(y, z) < s + t < r, so d(x, z) < r and therefore z is in the neighborhood Nr(x) for any


arbitrarily small r. So the point x is a limit point for E. But x was an arbitrarily chosen limit point of E′, sowe have proven that every limit point of E′ is a limit point of E, which is what we were asked to prove. Thefollowing image is helpful for imagining these points in R2.

(ii) We can use a similar technique to show that every limit point of E is a limit point E. Let x representan arbitrary limit point of E (which we won’t assume to be a member of E). Choose any arbitrarily small rand let s = r

2 , t = r4 . Since x is a limit point of E, theorem 2.20 tells us that the neighborhood Ns(x) contains

infinitely many points of E. Because E is defined to be E′ ∪E, each of these infinitely many points in Ns(x) iseither a member of E′ or a member of E.

(ii.a) First, assume that there exists at least one point y in Ns(x) such that y ∈ E′. By definition, thismeans that y is a limit point of E, so there is at least one point z ∈ E in the neighborhood Nt(y). Thedistance d(x, y) is less than s and d(y, z) is less than t, so definition 2.15 of metric spaces assures us thatd(x, z) ≤ d(x, y) + d(y, z) < s+ t < r, so d(x, z) < r and therefore we can find some z ∈ E in the neighborhoodNr(x) for any arbitrarily small r. So x is a limit point for E.

(ii.b) Next, assume that none of the points in Ns(x) are members of E. This means that all of the in-finitely many points of E′ ∪ E in Ns(x) are members of E. From this, we see that every neighborhood of xcontains elements of E. For any neighborhood Nt(x) with t < s, the fact that x is a limit point of E andthat Nt(x) ⊂ Ns(x) means that Nt(x) contains infinitely many points in E. But this means that x is a limitpoint of E. For any neighborhood Nt(x) with t > s, the fact that Ns(x) ⊂ Nt(x) means that Nt(x) containsinfinitely many points in E. And since every neighborhood of x contains an element of E, x is a limit point for E.

The second part of the proof is to show that every limit point of E is a limit point of E and is relatively trivial.Every element of E is also an element of E ∪ E′ = E. If x is a limit point of E, then every neighborhood of xcontains an element of E. Therefore every neighborhood of x contains an element of E. So x is a limit point of E.

We’ve shown that every limit point of E is a limit point of E and vice-versa, which is what we were askedto prove.

(iii) We’re asked if E and E′ always have the same limit points. The answer is “no”, and a counterexamplecan be found in the previous question. The limit points of E in exercise 2.5 were E′ = {0, 2, 4}. And E′ clearlyhas no limit points whatsoever.

Exercise 2.7a

We’ll prove the equality of these sets by showing that each is a subset of the other.

(A) Assume that x ∈ Bn. Then, x ∈ Bn or x is a limit point of Bn.

(A1) If x ∈ Bn, then x ∈ Ak for some Ak ∈⋃Ai. And if x ∈ Ak, then x ∈ Ak. And this means that x ∈

⋃Ai.

(A2) If x is a limit point of Bn, then it must be a limit point at least one specific Ak ∈⋃Ai. Proof by

contradiction: assume that x is not a limit point for any Ak ∈⋃Ai. Then for each i, there is some neighborhood

Nri(x) that contains no elements of Ai. Let s =min(ri) (which exists because i is finite). Then Ns(x) contains


no elements from any Ai since Ns(x) ⊆ Nri(x) for each i. And if this neighborhood contains no points forany Ai, then it contains no points of Bn =

⋃Ai. And this is a contradiction, since x is a limit point of Bn.

Therefore, by contradiction, x must be a limit point of at least one specific Ak ∈⋃Ai. So x ∈ Ak, which means

that x ∈⋃Ai.

In either of these two cases, x ∈⋃Ai. This proves that x ∈ Bn → x ∈

⋃Ai.

(B) Assume that x ∈⋃Ai. Then x ∈ Ak for some k. And this means that either x ∈ Ak or x is a limit

point of Ak.

(B1) If x ∈ Ak, then x ∈⋃Ai, which means that x ∈ Bn and therefore x ∈ BN .

(B2) If x is a limit point of Ak, then every neighborhood of x contain an element of Ak. But every elementof Ak is an element of

⋃Ai = Bn, so this means that every neighborhood of x contains an element of Bn. By

definition, this means that x is a limit point of Bn, so that x ∈ Bn.

In either of these two cases, x ∈ Bn. This proves that x ∈⋃Ai → x ∈ Bn. And we’ve already shown that

x ∈ Bn → x ∈⋃Ai, so we have proven that Bn =

⋃Ai.

Exercise 2.7b

Nothing in part B of the above proof required that i be finite, so it's still the case that ⋃_{i=1}^∞ Āᵢ ⊂ B̄. But part A of the proof assumed that we could choose a least element from the set of size i, which we can't do for an infinite set. So we can't assume that B̄ ⊆ ⋃ Āᵢ. And it's a good thing that we can't assume this, because it's false.

Consider the set from exercise 2.5, the set E consisting of all rational numbers of the form 1/m, m ∈ N. We'll construct this set by defining Aᵢ = {1/i} and defining B = ⋃_{i=1}^∞ Aᵢ. We saw in exercise 2.5 that 0 ∈ B̄. But 0 ∉ Āᵢ for any i, so 0 ∉ ⋃_{i=1}^∞ Āᵢ. This shows us that B̄ ≠ ⋃_{i=1}^∞ Āᵢ for these sets. And we've already shown that ⋃ Āᵢ ⊆ B̄, so for this particular choice of sets we see that ⋃ Āᵢ ⊂ B̄.

Exercise 2.8a

Every point in an open set E in R2 is a limit point of E. Proof: let p be an arbitrary point in E. We cansay two things about the neighborhoods of p. First, since we’re dealing with R2, every neighborhood of p con-tains infinitely many points. Second, p is an interior point (since E is open) and so there is some r such thatNr(p) ⊂ E. Together, these two facts tell us that Nr(p) contains infinitely many points of E.

We now show that this guarantees that every neighborhood of p contains a point of E. Choose any s suchthat r < s. Because the neighborhood Nr(p) contains some q ∈ E and Nr(p) ⊂ Ns(p), we know that q ∈ Ns(p).Now choose s such that s < r. Ns(p) contains infinitely many points, and Ns(p) ⊂ Nr(p) ⊂ E, so Ns(p) containsinfinitely many points of E. We’ve shown that whether r < s or r > s for any arbitrary r, Ns(p) contains apoint of E. Thus p is a limit point of E.

Exercise 2.8b

Consider a closed set consisting of a single point, such as E = {(0, 0)}. This point is clearly not a limit point ofE.

Exercise 2.9a

To prove that E◦ is open, we will show that every point in E◦ must be an interior point of E◦.

Let x be an arbitrary point in E◦. By the definition of E◦, we know that x is an interior point of E. Thismeans that we can choose some r such that Nr(x) ⊂ E: i.e., every point of Nr(x) is a point of E. To show thatx is an interior point of E◦, though, we will need to show that every point in the neighborhood Nr(x) is also a


point of E◦.

Let y be an arbitrary point in Nr(x). Clearly, 0 < d(x, y) < r. Choose a radius s such that d(x, y) + s = rand let z be an arbitrary point in Ns(y). Every point in Ns(y) is a point in E, since d(x, z) ≤ d(x, y) +d(y, z) <d(x, y) + s = r implies d(x, z) < r, and we’ve established that every point in Nr(x) is a member of E. And sinceevery point in Ns(y) is a point in E, by definition we know that y is an interior point of E. But y was an arbitrarypoint in Nr(x), so we know that every point in Nr(x) is an interior point of E. And this means that x is itselfan interior point of the set of interior points of E: that is, x is an interior point of E◦. And since x was an ar-bitrary point in E◦, this means that every point in E◦ is an interior point of E◦. And this proves that E◦ is open.

Exercise 2.9b

From exercise 2.9(a), we know that E◦ is always open: so E is open if E = E◦. And if E is open, then everypoint of E is an interior point and therefore E◦ = E: so E = E◦ if E is open. Together, these two statementsshow that E is open if and only if E = E◦.

Exercise 2.9c

Let p be an arbitrary element of G. Because G is open, p is an interior point of G. This means that for everypoint p ∈ G there is some neighborhood Nr(p) such that Nr(p) ⊂ G ⊂ E. This chain of subsets tells us thatevery point in G is an interior point not only of G, but also of E. And if every point in G is an interior pointof E, then this shows us that G ⊂ E◦.

Exercise 2.9d

Proof that E◦ ⊂ E: let x be a member of E◦, the complement of E◦. From the definition of E◦ we know that xis not an interior point of E. From the definition of “interior point”, this means that every neighborhood of xcontains some element y in the complement of E. For each neighborhood, It must be true that either x = y orx 6= y. If x = y for one or more of these neighborhoods, then x is in E (since y is in E and x = y). If x 6= y for

every neighborhood, then by definition we know that x is a limit point of E. So we conclude that either x ∈ Eor x is a limit point for E: that is, x is a member of E, the closure of E.

Proof that E ⊂ E◦: this proof is only a very slight modification of the previous one. Let x be a member of

E, the closure of the complement of E. From the definition of closure, either x is a limit point for E or x itselfis a member of E. In either case, every neighborhood of x contains some point of E. By definition of “interiorpoint”, this means that x is not an interior point of E and therefore x ∈ E◦, the complement of E◦.

Together, these two proofs show us that E◦ = E.


Exercise 2.9e

No, E and E do not always have the same interiors. Consider the set E = (−1, 0) ∪ (0, 1) in R1. The point 0 isnot a member of the interior of E, but it is a limit point of E. So the point 0 is an interior point of E = (−1, 1).

Exercise 2.9f

No, E and E◦ do not always have the same closures. Let E be a subset of (0, 1] consisting of all rational numbers of the form 1/m, m ∈ N. As we saw in exercise 2.5, the closure of E is E ∪ {0}. But every neighborhood of every point in E contains a point not in E, so E has no interior points. So E◦ = ∅. And the empty set is already closed, so the closure of E◦ is ∅. Clearly E ∪ {0} ≠ ∅, so E and E◦ do not always have the same closures.

Exercise 2.10

Which sets are closed? Let E be an arbitrary subset of X and let p be an arbitrary point in E. The distancebetween any two distinct points is always 1, so by choosing r = .5 we guarantee that the neighborhood Nr(p)contains only p itself. Because this is true for any point in E, this shows us that E contains no limit points.It’s then vacuously true that all of the (nonexistant) limit points of E are points of E: we conclude that E isclosed. But our choice of E was arbitrary, so this shows that every subset of X is closed.

Which sets are open? Let E be an arbitrary set in X. We’ve shown that every set is closed, so it must bethe case that E is closed: and from theorem 2.23, this means that E is open. But our choice of E was arbitrary,so this shows that every subset of X is open.

Which sets are compact? Under this metric, subsets of X are compact if and only if they are finite.Proof: let E be an arbitrary subset of X. This set is either finite or infinite.

If E is finite with cardinality k, then we can always generate a finite subcover for any open cover. For eachei ∈ E, we select some gi in our (arbitrary) open cover {Gα} such that ei ∈ gi (note that E is finite, so we don’t

need to use the axiom of choice). This gives us a collection of sets⋃ki=1 gi such that E ⊆

⋃ki=1 gi ⊆ {Gα}. By

definition, then,⋃ki=1 gi is a finite subcover of E. Our choices for E, k, and {Gα} were arbitrary, so this shows

that any finite subset of X is compact.

If E is infinite, then we can always generate an open cover that has no finite subcover. As we saw previously,every subset of X is open: this includes subsets that consist of a single element. So we can create an infiniteopen cover of E by letting each set in {Gα} be a single element of E. Any subcover of E will need to contain oneelement of {Gα} for each element of E: and since E is infinite, this means that any subcover must be infinite.Therefore there is no finite subcover for this particular open cover, and E is not compact.

Exercise 2.11

d1 is not a metric: let x = −1, y = 0, z = 1. Then d1(x, y) + d1(y, z) = 2 while d1(x, z) = 4. And this tells us that d1(x, z) > d1(x, y) + d1(y, z), which violates definition 2.15(c) of metric spaces.

d2 is a metric. It's trivial to verify that d2 has the properties 2.15(a) and 2.15(b). To show that it has the property 2.15(c):

↦ |x − z| ≤ |x − y| + |y − z| true by the triangle inequality, theorem 1.37(f)

→ |x − z| ≤ |x − y| + |y − z| + 2√(|x − y||y − z|) this additional term is ≥ 0

→ (√|x − z|)² ≤ (√|x − y| + √|y − z|)²

→ √|x − z| ≤ √|x − y| + √|y − z| legal because all terms are ≥ 0

→ d2(x, z) ≤ d2(x, y) + d2(y, z)


d3 is not a metric. d3(1,−1) = 0, which violates definition 2.15(a) of metric spaces.

d4 is not a metric. d4(2, 1) = 0, which violates definition 2.15(a) of metric spaces.

d5 is a metric. It's trivial to verify that d5 has the properties 2.15(a) and 2.15(b). To show that it has the property 2.15(c):

↦ |x − z| ≤ |x − y| + |y − z|

This first step is always true because of the triangle inequality (theorem 1.37(f)).

→ |x − z| ≤ |x − y| + 2|x − y||y − z| + |y − z| + |x − z||x − y||y − z|

Adding positive terms to the right side does not change the direction of the inequality.

→ |x − z| + |x − z||x − y| + |x − z||y − z| + |x − z||x − y||y − z| ≤ |x − y| + 2|x − y||y − z| + |y − z| + |x − z||x − y| + 2|x − z||x − y||y − z| + |x − z||y − z|

It's hard to verify from this ugly mess, but we've just added the same terms to each side.

→ (|x − z|)(1 + |x − y| + |y − z| + |x − y||y − z|) ≤ (1 + |x − z|)(|x − y| + 2|x − y||y − z| + |y − z|)

It's still hard to verify, but we've just factored out terms from each side.

→ |x − z| / (1 + |x − z|) ≤ (|x − y| + 2|x − y||y − z| + |y − z|) / (1 + |x − y| + |y − z| + |x − y||y − z|)

And now we've divided each side by two of these factored terms.

→ |x − z| / (1 + |x − z|) ≤ |x − y| / (1 + |x − y|) + |y − z| / (1 + |y − z|)

→ d5(x, z) ≤ d5(x, y) + d5(y, z)

The logic behind this seemingly arbitrary series of algebraic steps becomes clear if one starts at the end andworks back to the beginning (which is, of course, how I initially derived the proof).
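
A numeric companion to the two arguments above: the triple (−1, 0, 1) breaks the triangle inequality for d1(x, y) = (x − y)², while d5(x, y) = |x − y|/(1 + |x − y|) passes it on random samples (sanity check only):

```python
import random

def d1(x, y):
    return (x - y) ** 2

def d5(x, y):
    return abs(x - y) / (1 + abs(x - y))

# d1 fails: d1(-1, 1) = 4 > 2 = d1(-1, 0) + d1(0, 1)
assert d1(-1, 1) > d1(-1, 0) + d1(0, 1)

random.seed(4)
for _ in range(10_000):
    x, y, z = (random.uniform(-100, 100) for _ in range(3))
    assert d5(x, z) <= d5(x, y) + d5(y, z) + 1e-12
print("d1 violates the triangle inequality; d5 satisfied it on all samples")
```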

Exercise 2.12

Let {Gα} be any open cover of K. At least one open set in {Gα} must contain the point 0: let G0 be this set, open in R, that contains 0 as an interior point. As an interior point of G0, there is some neighborhood Nr(0) ⊂ G0. This neighborhood contains every 1/n ∈ K such that 1/n < r: equivalently, this neighborhood contains one element for every natural number n such that n > 1/r. Not only does this mean that G0 contains an infinite number of elements of K, it means that it contains all but a finite number (at most 1/r, to be exact) of elements of K.

Let Gi represent an element of {Gα} containing 1/i. We can now see that K ⊂ G0 ∪ ⋃_{i=1}^{1/r} Gi, which is a finite subcover of {Gα}.

Exercise 2.13

For i ≥ 2, let Ai represent the set of all points of the form 1/i + 1/n that are contained in the open interval (1/i, 1/(i−1)). Each Ai can be shown to have a single limit point of 1/i (see exercise 2.5).


Now consider the union of sets S = ⋃_{i=2}^∞ Ai. The same reasoning used in exercise 2.5 shows that the set of limit points for S is just the collection of limit points from the Ai: that is, the rationals of the form 1/i for i ≥ 2. This shows us that S has one limit point for each natural number greater than 1, which is a countable number.

To show that S is compact, we return to the reasoning found in exercise 2.12. Let {Gα} be any open cover of S. For each i ≥ 2, let Gi represent the element of {Gα} that contains 1/i. As we saw in exercise 2.12, Gi contains all but a finite number of elements from the interval (1/i, 1/(i−1)). If we let Gij represent an element of {Gα} containing 1/i + 1/j, we see that all of the points of S in the interval (1/i, 1/(i−1)) can be covered by the finite union of sets Gi ∪ ⋃_{j=1}^{ri} Gij.

Now, let G0 represent the element of {Gα} that contains 0 as an interior point. As an interior point, there is some neighborhood Nr(0) ⊂ G0. This neighborhood contains every limit point of the form 1/n ∈ R such that 1/n < r. As in 2.12, this means that G0 contains all but a finite number of the limit points of S. More than that, though, it means that G0 contains all but a finite number of the intervals of the form (1/i, 1/(i−1)).

And from these sets we can construct our finite subcover. Our subcover contains G0. For each of the finitely many intervals not covered by G0, we include the finite union of sets Gi ∪ ⋃_{j=1}^{ri} Gij. This is a finite collection of a finite number of elements from {Gα}, and each element of S is included in one of these elements, so we have constructed a finite subcover for an arbitrary cover {Gα}. This proves that S is compact.

Exercise 2.14

Let Gn represent the interval (1/n, 1). The union {Gα} = ⋃_{n=1}^∞ Gn is a cover for the interval (0, 1) (Proof: for any x ∈ (0, 1), there is some n > 1/x and therefore some 1/n < x. So x ∈ G_{n+1}). But there is no finite subcover for (0, 1). Let H be a finite subcover of {Gα}, and consider the set {1/(i+1) : Gi ∈ H}. This set is finite, so there is a least element. And this least element is in the interval (0, 1) but is not a member of any Gi ∈ H. So our assumption that a finite subcover exists must be false.

is a least element. And this least element is in the interval (0, 1) but is not a member of any Gi ∈ H. So ourassumption that a finite subcover exists must be false.

Exercise 2.15

For each i ∈ N, define Ai to be:

Ai = {p ∈ Q : √2 − 1/i ≤ p ≤ √2 + 1/i}

Because the endpoints are irrational, each of these intervals Ai is both bounded and closed (see exercise 2.16 for the proofs). From theorem 1.20, we also know that each interval is nonempty.

The intersection of any finite collection of these intervals will be nonempty, since the intersection ⋂_{i∈∇} Ai is equal to Ak, where k is the largest index in ∇. But the infinite intersection ⋂_{i=1}^∞ Ai is equal to {p ∈ Q : √2 ≤ p ≤ √2}, which is obviously empty.

This proves that the collection {Ai : i ∈ N} is a counterexample to theorem 2.36 if the word "compact" is replaced by either "closed" or "bounded". This same collection works as a counterexample to the corollary of 2.36, since A_{n+1} ⊂ An.

Exercise 2.16

lemma 1: For any real number x 6∈ Q, the intervals [x,∞) and (x,∞) are open in Q. Proof: let x be a realnumber such that x 6∈ Q. Let p be a rational number in the interval [x,∞) or (x,∞). It cannot be the casethat p = x (since p is rational while x is not), so p is in the interval (x,∞) (i.e, [x,∞) is identical to (x,∞)in Q). Because x < p, we can rewrite this interval as (p − d(p, x),∞). Choose r = d(x, p). Then we see that(p − r, p + r) ⊆ (p − d(p, x),∞) = (x,∞). This means that Nr(p) is an interior point of (x,∞): but p was anarbitrary point chosen from [x,∞), so every point of this interval is an interior point. This proves that [x,∞)and (x,∞) are open in Q.

21

Page 22: Rudin Solutions Ch 1 - 7

lemma 2: For any real number x 6∈ Q, the intervals (−∞, x] and (−∞, x) are open in Q. Proof: the proofis nearly identical to that of lemma 1.

lemma 3: All of these open intervals ([x,∞),(x,∞),(−∞, x], and (−∞, x)) are also closed. Proof: Chooseany of these four open sets. Note that its complement is also one of these four open sets. Since its complementis open, the set must be closed.

lemma 4: Every interval of the form (x, y) with x, y 6∈ Q is both open and closed in Q. Proof: choose an

arbitrary interval E = (x, y). It’s complement is E = (−∞, x]∪ [y,∞). From lemma 3, we see that E is a finite

union of closed sets, so E is closed (theorem 2.24) and E is open (theorem 2.23). But E is also a finite union

of open sets (lemmas 1 and 2), so E is open (theorem 2.24) and E is closed (theorem 2.23). This proves that(x, y) is both open and closed. The proofs for [x, y], (x, y], and [x, y) are identical to this one.

E is bounded: If |p| ≥ ±√

3, then p2 > 3 and p 6∈ E. So E is bounded by the interval (−√

3,√

3).

E is open and closed in Q: E is the set of all rational numbers p such that 2 < p2 < 3, which means thatE is the set of rational numbers in (−

√3,−√

2) ∪ (√

2,√

3). We know that ±√

2 and ±√

3 are not rational, soby lemma 4 we know that E is both a union of closed sets and also a union of open sets. So by theorem 2.24 weknow that E is both open and closed.

E is not compact: Drawing from the example in exercise 2.14, define the interval Ai to be:

Ai =

{(−√

3,−√

2) if i = 1(√2 + 1

i ,√

3)

if i > 1

Because√

2 + 1i is irrational for every i ∈ N, we know by lemma 4 that each Ai is an open set. Define an open

cover of E to beG = {Ai : i ∈ N}

This has no finite subcover for the same reason that the set in exercise 2.14 has no infinite subcover: given anyfinite collection of elements of G, we can always find an element sufficiently close to

√2 that is not contained in

any of those elements of G.

Exercise 2.17

E is not countable: This is proven directly by theorem 2.14.

E is not dense: Let x = .660... and let r = .01. Then Nr(x) = (.65, .67), and every real number inthis interval contains a number other than 4 or 7. Therefore Nr(x) contains no points of E. This shows that xis an element of [0, 1] that is neither a point in E nor a limit point of E, which proves that E is not dense in [0, 1].

E is compact: By theorem 2.41, we can prove that E is compact by showing that it is bounded and closed.E ⊂ [0, 1], so E is bounded. To show that E is closed, we need to show that every limit point of E is a memberof E.

Proof by contrapositive that every limit point of E is a member of E: let x be an element of [0, 1] such thatx 6∈ E. We will show that x is not a limit point. By the definition of membership in E, the fact that x 6∈ Emeans that

x =

∞∑i=1

ai10i

where some ai 6∈ {4, 7}. Let k represent some index such that ak 6∈ {4, 7}. Choose r = 10−(k+1). Then theneighborhood Nr(x) does not contain any points in E:

i) If ak+1 = 0, then the k + 1st decimal place will be either 9, 0, or 1 for every element of (x− r, x+ r) and sono element of Nr(x) is a member of E.

22

Page 23: Rudin Solutions Ch 1 - 7

ii) If ak+1 = 9, then the k + 1st decimal place will be either 8, 9, or 0 for every element of (x− r, x+ r) and sono element of Nr(x) is a member of E.

iii) If ak+1 is neither 0 nor 9, then the kth decimal place will be be ak for every element of (x − r, x + r) andso no element of Nr(x) is a member of E.

This proves that there is some neighborhood Nr(x) that contains no points in E. This proves that x is nota limit point of E. But x was an arbitrary point such that x 6∈ E, so this proves that (x 6∈ E → x is not alimit point of E). By contrapositive, this proves that every limit point of E is a member of E. And this, bydefinition, means that E is closed.

E is perfect: We have already shown that E is closed in the course of proving that E is compact. To provethat E is perfect, we need to show that every point in E is a limit point of E.

Let x be an arbitrary point in E. By the definition of membership in E, we know that

x =

∞∑i=1

ai10i

where each ai is either 4 or 7. Define a second number, xk, such that

xk =

∞∑i=1

bi10i

, where

bi = ai if i 6= kbk = 4 if ak = 7bk = 7 if ak = 4

From this definition, we see that xk differs from x only in the kth decimal place, and d(x, xk) = 3× 10−k.We can use this information to show that x is a limit point of E. Let Nr(x) be any arbitrary neighborhood

of x. From the archimedian principle, we can find some integer k such that k > log1 0 3r . And this is algebraically

equivalent to finding some integer k such that 3 × 10−k < r. This means that we can find some xk ∈ E (asdefined above) in Nr(x). And this was an arbitrary neighborhood of an arbitrary point in E, so we have proventhat every neighborhood of every point in E contains a second point in E: by definition, this means that everypoint in E is a limit point.

Exercise 2.18

Section 2.44 describes the Cantor set. The Cantor set is a nonempty perfect subset of R1. Each point in theCantor set is an endpoint of some segment of the form

[n3k, n+1

3k

]with n, k ∈ N, so each point in the Cantor set

is rational.

Let E be the set {x+√

2 : x is in the Cantor set} (that is, we’re shifting every element of the Cantor set√

2units to the right). Each element in E is irrational (exercise 1.1). The Cantor set was bounded by [0, 1] so Eis clearly bounded by [

√2, 1 +

√2]. The proof that E is perfect is identical to the proof that the Cantor set is

perfect (given in the book in section 2.44 and also in this document after exercise 2.4).

Exercise 2.19 a

We’re told that A and B are disjoint, so A∩B = ∅. And if A and B are closed, then by theorem 2.27 we knowthat A = A and B = B. So we conclude that A ∩ B = A∩ B = A ∩ B = ∅. By definition, this means that Aand B are separated.

Exercise 2.19 b

Let A and B be disjoint open sets. Assume that A ∩B is not empty. This assumption leads to a contradiction:

23

Page 24: Rudin Solutions Ch 1 - 7

7→ A ∩B is not empty hypothesis to be contradicted

→ (∃x)(x ∈ A ∩B) definition of non-emptiness

→ (∃x)(x ∈ A ∩ (B ∪B′)) definition of closure

→ (∃x)(x ∈ (A ∩B) ∪ (A ∩B′)) distributivity (section 2.11)

→ (∃x)(x ∈ ∅ ∪ (A ∩B′)) A and B are disjoint

→ (∃x)(x ∈ A ∩B′)→ (∃x)(x ∈ A ∧ x ∈ B′) definition of set intersection

→ (∃x)(x is an interior point of A ∧ x ∈ B′) A is an open set

→ (∃x)(x is an interior point of A ∧ x is a limit point of B) definition of B′

→ (∃x) [(∃r ∈ R)(Nr(x) ⊆ A) ∧ (x is a limit point of B)] definition of interior point

→ (∃x) [(∃r ∈ R)(Nr(x) ⊆ A) ∧ (∀s ∈ R)(∃y ∈ B)(y ∈ Ns(x))] definition of limit point

If something is true for all s ∈ R, it’s true for any particular R. So choose s = r.

→ (∃x) [(∃r ∈ R)( (Nr(x) ⊆ A) ∧ (∃y ∈ B)(y ∈ Nr(x)) )] substitution of s = r

→ (∃x) [(∃r ∈ R)(∃y ∈ B)(y ∈ Nr(x) ∧Nr(x) ⊆ A)] rearrangement of terms for clarity

→ (∃x) [(∃r ∈ R)(∃y ∈ B)(y ∈ A)] definition of subset

This last step establishes a contradiction. The sets A and B are disjoint, so there cannot be any possiblechoice of variables such that y ∈ B and y ∈ A. Our assumption must have been incorrect: A ∩ B is, in fact,empty. If we swap the roles of A and B, this same proof also shows us that A∩B is empty. By definition, then,A and B are separated.

Exercise 2.19 c

A is open in X: The set A is, by definition 2.18(a), a neighborhood of p. By theorem 2.19, then, A is an opensubset of X.

B is open in X: Let x be an arbitrary point in B. Let r = d(p, x) − δ. Let y be an arbitrary element inNr(x). Proof that y ∈ B:

7→ d(p, y) + d(y, x) ≥ d(p, x) Property 2.15(c) of metric spaces

→ d(p, y) + d(y, x)− d(p, x) ≥ 0

→ d(p, y) + d(y, x)− d(p, x) ≥ 0 ∧ r > d(x, y) we know d(x, y) < r because y ∈ Nr(x)

→ d(p, y) + d(y, x)− d(p, x) ≥ 0 ∧ d(p, x)− δ > d(x, y) definition of r

→ d(p, y) + d(y, x)− d(p, x) ≥ 0 ∧ d(p, x)− δ − d(x, y) > 0

→ (d(p, y) + d(y, x)− d(p, x)) + (d(p, x)− δ − d(x, y)) > 0 the sum of positive numbers is positive

→ d(p, y)− δ > 0 cancellation of like terms

→ d(p, y) > δ

→ y ∈ B definition of membership in B

Our choice for y was arbitrary, so every point in this neighborhood of x is a member of B: by definition,then, x is an interior point of B. But our choice of x ∈ B was also arbitrary, so this proves that every x ∈ B isan interior point of B. By definition, this shows that B is open in X.

A and B are disjoint: If there were some x ∈ A ∩ B, then d(x, p) > δ and d(x, p) < δ, which violates thetrichotomy rule for order relations.

A and B are separate: We’ve shown that A and B are disjoint open sets, so by exercise 2.19(b) we knowthat A and B are separate.

Exercise 2.19 d

If we are given any metric space X with a countable or finite number of elements, we can always find somedistance δ such that there are no elements x, y ∈ X with d(x, y) = δ (proof follows). This allows us to choose

24

Page 25: Rudin Solutions Ch 1 - 7

some arbitrary p ∈ X and then use the results from part (c) to completely partition X into separated setsA = {x ∈ X : d(x, p) > δ} and B = {x ∈ X : d(x, p) < δ}. This proves that (X is at most countable → X is notconnected). By contrapositive, this proves that (X is connected → X is uncountable), which is what we wereasked to prove.

Proof that an at most countable metric space X has some distance δ such that, for all x, y ∈ X, d(x, y) 6= δ:Let X be an at most countable metric space with elements a1, a2, . . . an. Then from theorem 2.13, we knowthat the set of all order pairs {(ai, aj) : ai, aj ∈ X} is at most countable. And there is a clear one-to-onecorrespondence between this set and the set {(d(ai, aj) : ai, aj ∈ X}: so the set of all distances between allcombinations of points in X is at most countable.

Distances in metric spaces are always real numbers (definition 2.15), which are of course uncountable. Be-cause there are an at-most countable number of distances in the set {(d(ai, aj) : ai, aj ∈ X}, we can choose areal number δ that is not in this set (otherwise we would have an at most countable set with R as a proper subset).

But this isn’t quite enough: if δ is so large that there are no elements in the set {x ∈ X : d(x, y) > δ}, thenone of our partitions will be empty. We can avoid this problem (as long as X has at least two elements) by pickingarbitrary x, y ∈ X and then choosing delta from the interval (0, d(x, y)). This interval is still uncountable, sowe are still able to choose a δ that is not in this set. And we know that d(x, y) > δ and d(x, x) < δ, so ourpartitions A and B will both be nonempty.

Exercise 2.20 a

Closures of connected sets are always connected. If A and B are connected, then there is either some p in A∩Bor some q in A∩B. Clearly, then, we have either either p ∈ (A′ ∪ A) ∩B or q ∈ A∩ (B′ ∪B). Therefore A∩Bis nonempty, so these two closed sets are connected.

Exercise 2.20 b

Consider the segment (0, 1) of the real line. Although this segment is open in R1, it contains no interiorpoints in R2 since every neighborhood Nr(x, 0) contains the point (x, r2 ). So let r = 1

4 and let E be the setNr(0, 0) ∪ {(x, 0) : 0 < x < 1} ∪Nr(1, 0).

Since the line segment {(x, 0) : 0 < x < 1} contains no interior points while every point in a neighborhood isan interior point (theorem 2.19), the interior of E is just Nr(0, 0) ∪Nr(1, 1), and it is trivial to show that thisis the union of two nonempty separated sets.

Exercise 2.21 a

The function p(t) can be thought of as the parameterization of a straight line connecting the points a and b.For instance, consider the following sets in R2:

25

Page 26: Rudin Solutions Ch 1 - 7

The set A0 is the set of all t such that p(t) ∈ A, and the set B0 is the set of all t such that p(t) ∈ B. We’retold that A and B are separated and are asked to prove that A0 and B0 are separated. Proof by contrapositive:

• There is some element tA that is an element of A0 and a limit point of B0.Assume that A0 and B0 are not separated. By definition 2.45, either A0∩B0 or A0 ∩B0 is nonempty. Assumethat A0 ∩ B0 is nonempty (the sets are interchangeable, so the proof for A0∩ B0 6= ∅ is almost identical). LettA be one element of A0 ∩ B0: then either tA ∈ A0 ∩ B0 or tA ∈ B′0. It can’t be the case that tA ∈ A0 ∩ B0,because this would imply p(tA) ∈ A∩B which is impossible because A and B are separated sets. So it must bethe case that tA ∈ A0 ∩B′0.

• There is a proportional relationship between d(t, t+ r) and d(p(t), p(t+ r)).The distance between p(t) and p(t+ r) is the vector norm of p(t+ r)− p(t):

d(p(t), p(t+r)) = |p(t+r)−p(t)| = |[(1−(t+r))a+(t+r)b]−[(1−t)a+tb]| = |r(b−a)| = |r||(b−a)| = d(t, t+r)d(a, b)

• p(tA) is an element of A and a limit point of B.We’ve shown that tA is a limit point of B0. So for any arbitrarily small r, there is some tB ∈ B0 suchthat d(tA, tB) < r. And this means that for any arbitrarily small r, there is some p(tB) ∈ B such thatd(p(tA), p(tB)) < |r(b− a)|. And since p(tA) is in A and each p(tB) is in B, this means that there is an elementof A that is a limit point of B: So A ∩B is nonempty, which means that A and B are not separated.

This shows that if A0 and B0 are not separated, then A and B are not separated. By contrapositive, then,if A and B are separated then A0 and B0 are separated. And this is what we were asked to prove.

Exercise 2.21 b

We are asked to find α ∈ (0, 1) such that p(α) 6∈ A ∪B: from the definition of the function p, this is equivalentto finding α ∈ (0, 1) such that α 6∈ A0 ∪B0. Proof that such a α exists:

Let E be defined as E = {d(0, x) : x ∈ B0}. That is, E is the set of all distances between the point 0 (whichis in A0, since p(0) = a) and elements of B0. The set R has the greatest lower bound property and E is a subsetof R with a lower bound of 0, so the set E has a greatest lower bound. Let α represent this greatest lower bound.

26

Page 27: Rudin Solutions Ch 1 - 7

• 0 < α < 1:We know that α > 0 because α is a distance. We know that α 6= 0, because this would mean that d(0, 0) ∈ Ewhich means that 0 ∈ B0, which is false. And we know that α < 1, because 1 ∈ B0 and B0 is an open set: so 1is an interior point of B0, which means that there is some small r such that 1− r ∈ B0.

We now need to show that we can use α to construct elements that are not in A0 ∪B0. We know that eitherα ∈ B0,α ∈ A0, or α 6∈ A0 ∪B0:

• if α ∈ B0:We know that α is not a limit point of A0 (otherwise, xα ∈ B0∩A0 which contradicts the fact from part (a) thatB0 and A0 are separated). So there is some neighborhood Nr(α) that contains no points in A0. Now considerthe points p in the range (α− r, α). Each p is in the neighborhood Nr(α), so they aren’t members of A0. Andfor each p we see that d(0, p) < α, so they aren’t members of B0 (otherwise α wouldn’t have been a lower boundof E). So we see that for every p ∈ (α− r, α), we have p 6∈ A0 ∪B0.

• if α ∈ A0:This assumption leads to a contradiction. We’ve shown in part (a) that A0 and B0 are separated, so it can’tbe the case that α is a limit point of B0 (otherwise A0 ∩ B0 would not be empty). So there is some neighbor-hood Nr(α) that contains no points of B0. But if the interval (0, α) contains no points of B0 and the interval(α− r, α+ r) contains no points of B0, then their union (0, α+ r) contains no points of B0. But this contradictsour definition of α as the greatest lower bound of E. So our initial assumption must have been incorrect: itcannot have been the case that α ∈ A0.

• if α 6∈ A0 and α 6∈ B0:Under this assumption, we clearly have α 6∈ A0 ∪B0.

Whatever assumption we make about the set containing α, we see that there will always be at least oneelement α such that 0 < α < 1 and α 6∈ A0 ∪ B0. And, from the definition of the function p, this means thatp(α) 6∈ A ∪B. And this is what we were asked to demonstrate.

Exercise 2.21 c

Proof by contrapositive. Assume that E is not connected: then E could be described as the union of twoseparated sets (i.e, E = A ∪ B). From part (b), we could then choose a, b, and t such that (1 − t)a + (t)b 6∈A ∪B = E. And this would mean that E is not convex. By contrapositive, if E is convex then E is connected.

Exercise 2.22

The metric space Rk clearly contains Qk as a subset. We know that Qk is countable from theorem 2.13. Toprove that Qk is dense in Rk, we need to show that every point in Rk is a limit point of Qk:

Let a = (a1,a2, . . . ,ak) be an arbitrary point in Rk and let Nr(a) be an arbitrary neighborhood of a. Letb = (b1,b2, . . . ,bk) where bi is chosen to be a rational number such that ai < bi < ai + r√

k(possible via

theorem 1.20(b)). The point b is clearly in Qk, and

d(a, b) =√

(a1 − bi)2 + . . .+ (ak − bk)2 <

√r2

k+ . . .+

r2

k=

√kr2

k= r

This shows that every point in Rk is a limit point of Qk, which completes the proof that Rk is separable.

Exercise 2.23

Let X be a separable metric space. From the definition of separable in the previous exercise, we know thatX contains a countable dense subset E. For each ei ∈ E, let Nq(ei) be a neighborhood with rational radiusq around point ei. Let {Vα} = {Nq(ei) : q ∈ Q, i ∈ N} be the collection of all neighborhoods with rationalradius centered around members of E. This is a countable collection of countable sets, and therefore {Vα} has

27

Page 28: Rudin Solutions Ch 1 - 7

countably many elements.

Let x be an arbitrary point in X, and let G be an arbitrary open set in X such that x ∈ G. Because G isopen, we know that x is an interior point of G. So there is some neighborhood Nr(x) such that Nr(x) ⊆ G. Butwe can choose a rational q such that 0 < q < r

2 , so that x ∈ Nq(x) ⊆ Nr(x) ⊆ G.

Because E is dense in x, every neighborhood of x contains some e ∈ E. So e ∈ Nq(x), which means thatd(e, x) < q. But, on the other hand, this also means that d(x, e) < q so that x ∈ Nq(e). And Nq(e) ∈ {Vα}, bydefinition.

Having shown that x ∈ Nq(e), we need to prove that Nq(e) ⊆ G. Let y be any point in Nq(e). We knowthat d(x, e) < q and d(e, y) < q. So d(x, y) < d(x, e) + d(e, y) = 2q by definition 2.15(c) of metric spaces. Butwe defined q so that 0 < q < r

2 , so we know that d(x, y) < r. This means that every y ∈ Nq(e)→ y ∈ Nr(x), orNq(e) ⊆ Nr(x). And we chose r so that Nr(x) ⊆ G, so by transitivity we know that Nq(e) ⊆ G.

We started by choosing an arbitrary element x in an arbitrary open set G ⊆ X, and proved that there was anelement Nq(e) ∈ {Vα} such that x ∈ Nq(e) ⊂ G. We’ve shown that {Vα} has a countable number of elements,and each element is an open neighborhood. This proves that {Vα} is a base for X.

Exercise 2.24

We’re told that X is a metric space in which every infinite subset has a limit point. Choose some arbitrary δand construct a set {xi} as described in the exercise.

• The set {xi} must be finite: if this set were infinite, then it would have a limit point. And if it has alimit point, then there would be multiple points of {xi} in some neighborhood of radius δ/2, which contradictsour assumption that d(xi, xj) > δ for each pair of points in {xi}.

• The neighborhoods of {xi} form a cover of X: Every x ∈ X must be contained in Nδ(xi) for somexi ∈ {xi}. If it weren’t, this would imply that d(x, xi) > δ for each xi ∈ {xi}, so that we could have chosen xto be an additional element in {xi}.

Let Ei represent the set of {xi} constructed in the above way when δ = 1i , and consider the set E =

⋃∞i=1Ei.

We will show that E is a countable dense subset of X. This union E is a countable union of nonempty finitesets, so E is countable. Each element of E was chosen from X, so clearly E ⊂ X. Proof that E is dense in X:

Choose an arbitrary x ∈ X. Choose an arbitrarily small radius r. From the archimedian principle, we canfind an integer n such that 1

n < r. We’ve shown that the neighborhoods of En cover X, so there is some e ∈ Ensuch that x ∈ N1/n(e). But if d(e, x) < 1

n , then d(x, e) < 1n and so e ∈ N1/n(x). And since 1

n < r, this impliesthat e ∈ Nr(x).

So we’ve chosen an arbitrary x and an arbitrary radius r, and shown that Nr(x) will always contain somee ∈ En ⊂ E. This proves that every x is a limit point of E, which by definition means that E is dense in X.

28

Page 29: Rudin Solutions Ch 1 - 7

This proves that X contains a countable dense subset, so by the definition in exercise 22 we have proven thatX is separable.

Exercise 2.25 a

Theorem 2.41 tells us that the set K, by virtue of being a compact space, has the property that every infinitesubset of K has a limit point. By exercise 24, this means that K is separable. And exercise 23 proves that everyseparable metric spaces has a countable base.

I’m not sure that this an appropriate proof, though. The question wants us to prove that K has a countablebase and that therefore K is separable, whereas this is a proof that K is separable and therefore has a countablebase. An alternate proof follows.

Exercise 2.25 b

We’re told that K is compact. Consider the set En = {N1/n(k) : k ∈ K}, the set of neighborhoods of radius1/n around every element in K. This is clearly an open cover of K, since x ∈ K → x ∈ N(1/n)(x) → x ∈ En.And K is compact, so the open cover En must have some finite subcover: i.e., K must be covered by some finitenumber of neighborhoods from En. Let {Vn} represent a finite subcover of the open cover En.

Now consider the union V =⋃∞n=1 Vn, a countable collection of finite covers of K. We will prove that V is

a base for K.

Choose an arbitrary x ∈ K, and an arbitrary open set G such that x ∈ G ⊂ K. Because G is open andx ∈ G, x must be an interior point of G. And since x is an interior point, there is some neighborhood Nr(x) suchthat Nr(x) ⊂ G. Now choose an integer m such that 1

m < r2 . Because Vm is an open cover of K, there is some

k ∈ K and neighborhood N1/m(k) in the open cover Vm such that x ∈ N1/m(k). Proof that this neighborhoodN1/m(k) is a subset of G:

Assume that y is an element of N1/m(k). From the fact that x ∈ N1/m(k), we know that d(x, k) < 1/m.And from the fact that y ∈ N1/m(k), we know that d(k, y) < 1/m. And from the definition of metric spaces, we

know that d(x, y) ≤ d(x, k) + d(k, y), which means that d(x, y) ≤ 2/m ≤ r (since we chose 1m < r

2 ). This provesthat every element of N1/m(k) is in the neighborhood Nr(x), so N1/m(k) ⊆ Nr(x) ⊆ G. And this shows thatN1/m(k) ⊆ G.

We’ve chosen an arbitrary element x and an arbitrary element G, and have shown that there is someN1/m(k) ∈ V such that x ∈ N1/m(k) ⊂ G. By the definition in exercise 23, this proves that K has a base. Andthis base is a countable collection of finite sets, so it’s a countable base.

This countable base V is identical to the set E constructed in exercise 24. We saw there that this base wasa dense subset, so by the definition of “separable” in exercise 23 we know that K is separable.

Exercise 2.26

Let {Vn} be the countable base we constructed in exercises 24 and 25.{Vn} acts as a countable subcover to any open cover of X. Proof: let G be an arbitrary open set from an

arbitrary open cover of X. Choose any x ∈ G. We can find an element N1/m(k) ∈ V such that x ∈ N1/m(k) ⊆ Gby the same method used in the second half of exercise 2.25(b).

We must now prove that a finite subset of {Vn} can always be chosen to act as a subcover for any open cover.We’ll do this with a proof by contradiction.

Let {Wα} be an open cover of X and assume that there is no finite subset of {Vn} that acts as a subcoverof {Wα}. Construct Fn and E as described in the exercise. Because E is an infinite subset, it has a limit point:let x be this limit point. Every neighborhood Nr(x) contains infinitely many points of E (theorem 2.20), andwe can prove that this arbitrary neighborhood Nr(x) must contain every point of E:

29

Page 30: Rudin Solutions Ch 1 - 7

We have constructed the Fn sets in such a way that Fn+1 ⊆ Fn for each n. So if Nr(x) doesn’t contain anypoints from Fj , then it doesn’t contain any points from Fj+1, Fj+2,etc because y ∈ Fj+1 → y ∈ Fj . So if Nr(x)doesn’t contain any points from Fj , then it has no more than j − 1 points of E: i.e, Nr(x) ∩E would be finite.But x is a limit point of E, so Nr(x) must contain infinitely many points of E. So Nr(x)∩E contains one pointfrom each Fn.

But this means that E = {x}, a set with one element. If it contained a second point, we could find aneighborhood Nr(x) that failed to contain this second point, which would contradict our finding that everyneighborhood of x contains E. And this means that we chose the same element from each Fn: i.e., x was ineach Fn. So x ∈

⋂Fn, which contradicts our assumption that

⋂Fn was empty.

Exercise 2.27

To prove that P is perfect, we must show that every limit point of P is a point of P , and vice-versa.Proof that every limit point of P is a point of P : assume that x is a limit point of P . Choose some arbitrary

r and let s = r2 . By the definition of limit point, every neighborhood Ns(x) contains some y ∈ P . And from the

fact that y ∈ P , we know that Ns(y) contains uncountably many points of P . But for each p ∈ Ns(y), we seethat d(x, p) ≤ d(x, y) + d(y, p) so that d(x, p) ≤ 2s = r. So the neighborhood Nr(x) contains uncountably manypoints of P . But Nr(x) was an arbitrary neighborhood of x, so this proves that x ∈ P .

Proof that every point of P is a limit point of P : assume that x ∈ P . By the definition of P , this means thatevery neighborhood of x contains uncountably many points of P . So clearly every neighborhood of x containsat least one point of P , which means that x is a limit point of P .

This proves that P is perfect. To prove that P c ∩E is countable, we let {Vn} be a countable base for X andlet W be the union of all Vn for which E ∩ Vn is countable. We will show that W c = P :

Proof that P ⊆ W c: Assume that x ∈ P . By the definition of membership in P , we know that x is acondensation point of E. By the definition of condensation point, then, we know that every neighborhoodNr(x) contains uncountably many points of E. And this means that every open set Vn containing x containsuncountably many points of E. So x is not a member of any countable Vn ∩ E, which means that x ∈W c.

Proof that W c ⊆ P : Assume that x ∈ W c. Choose any arbitrary neighborhood Nr(x): by the definitionof the base, there is some Vn ⊂ Nr(x) such that x ∈ Vn. By definition of W c, the fact that x ∈ Vn meansthat E ∩ Vn is uncountable, which of course means that Vn has uncountably many elements of E. And we’veestablished that Vn ⊆ Nr(x), which means that Nr(x) has uncountably many elements of E. But Nr(x) was anarbitrary neighborhood of x, so this proves that every neighborhood of x has uncountably many elements of E:by definition, then, x is a condensation point and therefore x ∈ P .

So we have proven that W c = P . Taking the complement of each of these, we see that W = P c. And the setW is the union of countably many sets of the form E ∩ Vn, each of which contains countably many elements: soW is countable. And W = P c, so P c is countable: this proves that countably many points are P c. And fromthis, it’s trivial to show that there are countably many points in E ∩ P c, which is what we were asked to prove.

Note that this proof assumed only that X had a countable base, so it is valid for any separable metric space.

Exercise 2.28

Let E be a closed set in a separable metric space X. Let P be the set of all condensation points for E.Proof that E ∩P is perfect: assume that x ∈ E ∩P . Clearly x is a point of P , which means that x is a limit

point of E (from the definition of P ) and a limit point of P (because P was shown to be perfect in exercise 27).So x is a limit point of E ∩ P . Now, assume that x is a limit point of E ∩ P . From this, we know that x is apoint of E (because E is closed) and a point of P (because P is perfect). So we have shown that every point ofE ∩ P is a limit point of E ∩ P and vice-versa: this proves that E ∩ P is perfect.

30

Page 31: Rudin Solutions Ch 1 - 7

Proof that E ∩ P c is at most countable: this was proven in exercise 27.

And E = (E ∩ P ) ∪ (E ∩ P c): that is, E is the union of a perfect set and a countable set. And this is whatwe were asked to prove.

Exercise 2.29

Let E be an arbitrary open set in R1 and let {Gi} be an arbitrary collection of disjoint segments such that⋃Gi = E. Each Gi is open and nonempty, so each Gi contains at least one rational point. Therefore there

can’t be more Gi elements than rational numbers, which means that there are at most a countable number ofelements in {Gi}. But {Gi} was an arbitrary set of disjoint segements whose union is E, therefore E cannot bethe union of an uncountable number of disjoint segments.

Exercise 2.30

Proof by contradiction: Let {Fn} be a collection of sets such that⋃Fn = R and suppose that each Fn has an

empty interior.

7→ (∀n)(F ◦n = ∅) hypothesis of contradiction

→⋃F ◦n = ∅

→ (⋃F ◦n)

c= R taking the complement of both sides

→⋂

(F ◦n)c

= R theorem 2.22

→⋂F cn = R exercise 2.9d

→ (∀n)(F cn = R)

This last step tells us that F cn is dense in R for every n. And Fn was closed, so F cn is open. Appealing to theproof of Baire’s theorem in exercise 3.22 (which uses only terms and concepts introduced in chapter 2), we seethat

⋂F cn is nonempty. Therefore:

→⋂F cn 6= ∅

→ (⋃Fn)

c 6= ∅ theorem 2.22

→⋃Fn 6= R taking the complement of both sides

And this contradicts our original claim that⋃Fn = R so one of our initial assumptions must be wrong. And

our only assumption was that each Fn has an empty interior, so by contradiction we have proven that at leastone Fn must have a nonempty interior.

Exercise 3.1

The exercise does not explicitly say that we’re operating in the metric space Rk, but there are two reasons toassume that we are. First, we are supposed to understand the meaning of absolute value in this metric space,and Rk is the only metric space we’ve encountered so far for which absolute value has been defined. Second,Rudin appears to use sn to represent series in Rk and pn to represent series in arbitrary metric spaces. So we’llassume that we’re operating in Rk for this exercise.

We’re told that the sequence {sn} converges to some value s: that is, for any arbitrarily small ε there is someinteger N such that n > N implies d(s, sn) < ε. But it’s always the case that d(|s|, |sn|) ≤ d(s, sn) (exercise1.13), so for this same ε and N we see that d(|s|, |sn|) ≤ d(s, sn) < ε. By transitivity, this means that for anychoice of ε there is some integer N such that n > N implies d(|s|, |sn|) < ε. By definition of convergence, thismeans that the sequence {|sn|} converges to |s|.

The converse is not true. If we let sk = (−1)k, the sequence {|sn|} clearly converges while the sequence {sn}clearly does not.

31

Page 32: Rudin Solutions Ch 1 - 7

Exercise 3.2

The exercise asks us to calculate the limit rather than prove it rigorously, so we might be able to manipulatethe expression algebraically:

√n2 + n− n = (

√n2 + n− n)

(√n2 + n+ n√n2 + n+ n

)=

n√n2 + n+ n

=n

n√

1 + 1/n+ n=

1√1 + 1/n+ 1

and then use basic calc 2 techniques to show that this limit is 12 . If we need to prove it more rigorously, we can

use theorem 3.14. We will first need to prove that this sequence has a limit, which theorem 3.14 tells us canbe done by proving that the sequence is bounded and monotonically increasing. We can then find the limit byfinding the least upper bound of the sequence.

The sequence is boundedThe quantitity (

√n2 + n − n) is bounded below: from the fact that

√n2 + n >

√n2 = n, we know that√

n2 + n− n > 0. And it’s bounded above by 12 :

7→√n2 + n− n ≥ 1

2 hypothesis to be contradicted

→√n2 + n ≥ 1

2 + n

→ n2 + n ≥ 14 + n+ n2 both sides are positive, so the sign doesn’t change

→ 0 ≥ 14 subtract n+ n2 from both sides

This last step is clearly false, so our initial assumption must be false: this shows that (√n2 + n − n) is

bounded above by 12 .

The sequence is monotonically increasing7→ sn+1 > sn

↔√

(n+ 1)2 + (n+ 1)− (n+ 1) >√n2 + n− n definition of sn

↔√

(n+ 1)2 + (n+ 1)−√n2 + n > (n+ 1)− n rearrange terms

↔√n2 + 3n+ 2−

√n2 + n > 1 some algebra

↔ (n2 + 3n+ 2) + (n2 + n)− 2√n2 + 3n+ 2

√n2 + n > 1 square both sides

↔ (2n2 + 4n+ 2)− 2√n4 + 4n3 + 5n2 + 2n > 1 simplify

↔ −2√n4 + 4n3 + 5n2 + 2n > 1− (2n2 + 4n+ 2) rearrange terms

↔ 2√n4 + 4n3 + 5n2 + 2n < 2n2 + 4n+ 1 multiply both sides by −1 and simplify

↔ 4n4 + 16n3 + 20n2 + 8n < 4n4 + 16n3 + 20n2 + 8n+ 1 square both sides

↔ 0 < 1 subtract 4n4 + 16n3 + 20n2 + 8n from each side

The sequence has a least upper bound of 12

We’ve already shown that 12 is an upper bound. To show that it is the least such upper bound, choose any

x ∈ [0, 12 ) and assume that it is an upper bound for {sn}.7→ (∀n ∈ N)(sn ≤ x) hypothesis to be contradicted

→ (∀n ∈ N)(√n2 + n− n ≤ x) definition of sn

→ (∀n ∈ N)(√n2 + n ≤ x+ n)

→ (∀n ∈ N)(n2 + n ≤ x2 + 2xn+ n2) square both sides

→ (∀n ∈ N)(0 ≤ x2 + (2x− 1)n) subtract n2 + n from both sides

→ (∀n ∈ N)(−x2 ≤ (2x− 1)n) subtract x2 from both sides

→ (∀n ∈ N)( −x2

2x−1 ≤ n) 2x− 1 < 0 when x < 12

This last step must be false: by the Archimedian principle, we can always find some n ∈ N greater than anyspecific quantity. So, by contradiction, we can find sn > x for every x ∈ [0, 12 ): this means that the upper boundcannot be less than 1

2 . But 12 is an upper bound, so we see that 1

2 is the least such upper bound.

32

Page 33: Rudin Solutions Ch 1 - 7

We have shown that {sn} is a monotonically increasing bounded sequence with a least upper bound of 12 :

by theorem 3.14, this proves that the limit of {sn} is 12 .

Exercise 3.3

Note that we are not asked to find the limit of this sequence. We are only asked to show that it converges andthat 2 is an upper bound. By theorem 3.14, we can prove convergence by proving that the sequence is boundedand monotonically increasing.

The sequence is bounded above by√

2 +√

2Although we’re asked to show that the sequence has an upper bound of 2, we can prove the stronger result that

it has an upper bound of√

2 +√

2. We can prove this by induction. We see that 0 < s1 =√

2 < 2. Now assumethat 0 < sn < 2:

7→ 0 < sn < 2 hypothesis of induction

→ 0 <√sn <

√2

→ 0 < 2 +√sn < 2 +

√2

→√

2 +√sn <

√2 +√

2

→ sn+1 <√

2 +√

2

By induction, then, sn <√

2 +√

2 for all n.

The sequence is monotonically increasing

Proof by induction. We can immediately see that s1 =√

2 <√

2 +√

2 = s2. Now, assume that we haveproven si−1 < si for i = 1 . . . n.

7→ sn =√

2 +√sn−1 definition of sn

→ sn >√

2 +√sn−2 our hypothesis of induction tells us sn−1 > sn−2

→ sn > sn−1 definition of sn−1

We’ve shown that sn is a monotonically increasing function that is bounded above. By theorem 3.14, this issufficient to prove that it converges.

Exercise 3.4

From the recursive definition we are given, we can see that s2 = 0 and s2m = 14 + 1

2s2m−2. So we can usetheorem 3.26 to find the limit of s2m.

limm→∞

s2m =1

4+

1

8+

1

16+ . . . =

( ∞∑n=0

1

2n

)− 1− 1

2=

1

1− 12

− 1− 1

2=

1

2

From the same recursive definition, we see that s1 = 0 and s2m+1 = 12 + 1

2s2m−1. So the limit of s2m+1 is givenby

limm→∞

s2m+1 =1

2+

1

4+

1

8+ . . . =

( ∞∑n=0

1

2n

)− 1 =

1

1− 12

− 1 = 1

This shows that as n increases, the terms of {sn} are either arbitrarily close to 12 or arbitrarily close to 1, so

these are our upper and lower bounds for the sequence.

Exercise 3.5

Let a∗ = limn→∞ sup an, let b∗ = limn→∞ sup bn, and let s∗ = limn→∞ sup(an + bn).

If a∗ and b∗ are both finite:From theorem 3.17(a) we know that there are some subsequences {aj}, {bk} such that limj→∞ aj +limk→∞ bk =

33

Page 34: Rudin Solutions Ch 1 - 7

s∗ (that is, the supremum of E as defined above is also member of E: E is a closed set). The limit of aj mustbe less than or equal to the supremum a∗ and the limit of bj must be less than or equal to the supremum b∗, so

s∗ = limj→∞

aj + limk→∞

bk ≤ limn→∞

sup an + limn→∞

sup bn = a∗ + b∗

By transitivity, this shows that s∗ ≤ a∗ + b∗: and, by the definitions of these terms, this proves that

limn→∞

sup(an + bn) ≤ limn→∞

sup an + limn→∞

sup bn

If either a∗ or b∗ is infinite:We’re asked to discount the possibility that a∗ =∞ and b∗ = −∞. In all other cases when one or both of thesevalues is infinite, the inequality can be easily shown to resolve to ∞ =∞ or −∞ = −∞.

Exercise 3.6a

an =√n+ 1−

√n =

(√n+ 1−

√n)(√n+ 1 +

√n√

n+ 1 +√n

)=

1√n+ 1 +

√n

We can compare this last term to a known series:

an =1√

n+ 1 +√n>

1

2√n+ 1

>1

2

(1

n

)1/2

We know that the series∑(

1n

)1/2diverges by theorem 3.28, so the terms of {an} are larger than the terms of

a divergent series. By the comparison theorem 3.25, this tells us that∑an is divergent.

Exercise 3.6b

an =

√n+ 1−

√n

n=

√n+ 1−

√n

n

(√n+ 1 +

√n√

n+ 1 +√n

)=

1√n3 + n2 +

√n2

We can compare this last term to a known series:

1√n3 + n2 +

√n3

<1√

n3 +√n3

=1

2

(1

n

)3/2

We know that the series∑(

1n

)3/2converges by theorem 3.28, so the terms of {an} are smaller than the terms

of a convergent series. By the comparison theorem 3.25, this tells us that∑an is convergent.

Exercise 3.6c

For n ≥ 1, we can show that 0 ≤ ( n√n− 1) < 1:

7→ (∀n ∈ N)(1 ≤ n < 2n) the < 2n can easily be proven via induction

→ (∀n ∈ N)(1 ≤ n√n < 2) take the nth root of each term

→ (∀n ∈ N)(0 ≤ n√n− 1 < 1) subtract 1 from each term

We know that the series∑xn converges when 0 ≤ x < 1, so by the comparison theorem 3.25 we know that∑

an is convergent.

Exercise 3.6d

This probably is more difficult than it looks at first, mainly because for complex z it’s not true that 1 + zn > zn

– in fact, without absolute value signs the inequality z1 > z2 has no meaning whatsoever (see exercise 1.8). It’salso not true that |1 + zn| > |zn|. The best we can do is appeal to the triangle inequality.

34

Page 35: Rudin Solutions Ch 1 - 7

If |z| < 1 then by the triangle inequality we have

limn→∞

∣∣∣∣ 1

1 + zn

∣∣∣∣ ≥ limn→∞

1

|1|+ |zn|= 1

and therefore by theorem 3.23 the series doesn’t converge. If |z| = 1 then we similarly have

limn→∞

∣∣∣∣ 1

1 + zn

∣∣∣∣ ≥ limn→∞

1

|1|+ |zn|=

1

2

and by theorem 3.23 the series again fails to converge. If |z| > 1 then we can use the ratio test:

limn→∞

∣∣∣∣ 1 + zn

1 + zn+1

∣∣∣∣ = limn→∞

∣∣∣∣z−nz−n1 + zn

1 + zn+1

∣∣∣∣ = limn→∞

∣∣∣∣z−n + 1

z−n + z

∣∣∣∣ =

∣∣∣∣1z∣∣∣∣ < 1

and by theorem 3.34 the series converges. I’m not totally satisfied with this proof because I can’t totally justifythe claim that limn→∞ z−n = 0 without appealing to the polar representation z−n = r−ne−inθ and the fact that|eixθ| = 1 for all x.

Exercise 3.7

Define the partial sum tn to be

tn =

n∑k=1

√akk

We can show that the series∑√

an/n converges by showing that the sequence {tn} converges; we can do thisby showing that {tn} is bounded and monotonically increasing (theorem 3.14).

the sequence is monotonically increasing:From the definition of partial sums, we know that tn =

√an/n+ tn−1 for all n. And we’re told an > 0 for every

n, so tn > tn−1 for every n.

the sequence is bounded:From the Cauchy-Schwartz inequality, we know that

∑(ab) ≤

√∑a2∑b2. Applying this to the given series,

we see that ∑(√an

1

n

)≤√∑

an∑ 1

n2

We are told that∑an converges, and from theorem 3.28 we know that that

∑1/n2 converges. Therefore (by

theorem 3.50) we know that their product converges to some α, and so the right-hand side of the above inequalityconverges to

√α. This shows us that the series

∑√an/n is bounded by

√α, and therefore the sequence {tn} is

bounded by√α.

By assuming that∑an is convergent and that an > 0, we’ve shown that tn is bounded and monotonically

increasing. By theorem 3.14 this is sufficient to show that tn converges. And this is what we were asked toprove.

Exercise 3.8

We’re told that {bn} is bounded. Let α be the upper bound of {|bn|}. We’re also told that∑an converges: so

for any arbitrarily small ε, we can find an integer N such that∣∣∣∣∣n∑

k=m

ak

∣∣∣∣∣ ≤ ε

αfor all n,m such that n ≥ m ≥ N

which is algebraically equivalent to∣∣∣∣∣n∑

k=m

ak α

∣∣∣∣∣ ≤ ε for all n,m such that n ≥ m ≥ N

35

Page 36: Rudin Solutions Ch 1 - 7

which, since |bk| ≤ α for every k, means that∣∣∣∣∣n∑

k=m

akbk

∣∣∣∣∣ ≤∣∣∣∣∣n∑

k=m

ak α

∣∣∣∣∣ ≤ ε for all n,m such that n ≥ m ≥ N

By theorem 3.22, this is sufficient to prove that∑anbn converges.

Exercise 3.9a

α = lim sup n√|n3| = lim sup |n3/n| = |n0| = 1

So the radius of convergence is R = 1/α = 1.

Exercise 3.9b

Example 3.40(b) mentions that “the ratio test is easier to apply than the root test”, although we’re neverexplicitly told what the ratio test is, how it might relate to the ratio test for series convergence, or how to applythe ratio test to example (b). According to some course handouts I found online 2, we can’t simply apply theratio test in the same way we use the root test. That is, we can’t use the fact that

β = lim sup

∣∣∣∣ 2n+1

(n+ 1)!

n!

2n

∣∣∣∣ = lim sup

∣∣∣∣ 2

n+ 1

∣∣∣∣ = 0

to assume that the radius of convergence is 1/β = ∞. Instead, we look at theorem 3.37, which tells us thatα ≤ β, so that

R =1

α≥ 1

β=∞

Exercise 3.9c

α = lim sup n

√∣∣∣∣2nn2∣∣∣∣ = lim sup

2∣∣∣ n√n2∣∣∣ =

2

(lim sup√n)2

The last step of this equality is justified by theorem 3.3c. From theorem 3.20c, we know the limit of this lastterm:

2

(lim sup√n)2

=2

1

So α = 2, which gives us a radius of convergence of 1/2.

Exercise 3.9d

α = lim sup n√|n33n| = lim sup

n√n3

3=

(lim sup n√n)3

3The last step of this equality is justified by theorem 3.3c. From theorem 3.20c, we know the limit of this lastterm:

(lim sup n√n)3

3=

1

3

So α = 1/3, which gives us a radius of convergence of 3.

Exercise 3.10

We’re told that infinitely many of the coefficients of∑anz

n are positive integers. This means that for every N ,there is some n > N such that an ≥ 1. Therefore lim supn→∞|an| ≥ 1

→ lim supn→∞n√|an| ≥ 1 from the fact thatk > 1→ k1/n > 1

→ 1R ≥ 1 theorem 3.39

→ 1 ≥ RThis final step indicates that the radius of convergence is at most 1.

2http://math.berkeley.edu/∼gbergman/

36

Page 37: Rudin Solutions Ch 1 - 7

Exercise 3.11a

Proof by contrapositive: we will assume that∑an/(1 + an) converges and show that this implies that

∑an

converges.

To simplify the form of∑an/(1 + an), we can multiply the numerator and denominator of each term of the

series by 1/an to get ∑ 11an

+ 1

For this to converge, it is necessary that

limn→∞

11an

+ 1= 0

which can only happen if the limit of the denominator is ∞; and this can only happen if the limit of 1/an is ∞.That is, it must be the case that

limn→∞

an = 0

This alone is not sufficient to prove that∑an converges. It does, however, allow us to make the terms of an as

arbitrarily small as we like. For this proof, we it’s helpful to recognize that there is some integer N1 such thatan < 1 for all n > N1.

From the assumption that∑an/(1 + an) converges, we know that for every ε there is some Nε such that

n∑k=m

ak1 + ak

< ε for every n ≥ m ≥ Nε

Let N be the larger of {N1, Nε}. For any ε and any n ≥ m ≥ N , we can produce two inequalities:

n∑k=m

ak1 + 1

<

n∑k=m

ak1 + ak

n∑k=m

ak1 + ak

< ε for every n ≥ m ≥ N

The first is true because each k is larger than N1 (so that ak < 1); the second is true because each k is largerthan Nε. Together, these inequalities tell us that

n∑k=m

ak2< ε for every n ≥ m ≥ N

or, equivalently, thatn∑

k=m

ak < 2ε for every n ≥ m ≥ N

And our choice of ε was arbitrary, so we have shown that for every ε there is some N that makes the laststatement true: and this is the definition of convergence for the series

∑an.

We’ve shown that the convergence of∑ an

1+animplies the convergence of

∑an. By contrapositive, the fact

that∑an diverges is proof that

∑ an1+an

diverges.

Exercise 3.11b

From the definition of sn, we can see that

sn+1 = a1 + . . .+ an+1 = sn + an+1

37

Page 38: Rudin Solutions Ch 1 - 7

and we’re told that every an > 0, so we know that sn+1 > sn. By induction, we also know that sn ≥ smwhenever n ≥ m. And from this, we know that 1

sm≥ 1

snwhenever n ≥ m. Therefore:

aN+1

sN+1+ . . .+

aN+k

sN+k≥ aN+1

sN+k+ . . .+

aN+k

sN+k=aN+1 + . . .+ aN+k

sN+k

A bit of algebraic manipulation shows us that aN+1+. . .+aN+k = sN+k−sN , so this last inequality is equivalentto

aN+1

sN+1+ . . .+

aN+k

sN+k≥ sN+k − sN

sN+k= 1− sN

sN+k

which is what we were asked to prove.

To show that∑ an

snis divergent, we once again look at the sequence {sn}. We’ve already determined that

{sn} is an increasing sequence, and we know that’s it’s not convergent (otherwise, by the definition 3.21 of“convergent series”,

∑an would be convergent). Therefore, we know that {sn} is not bounded (from theorem

3.14, which says that a bounded monotonic series is convergent).

Now let sN be an arbitrary element of {sn} and let ε be an arbitrarily small real. From the fact that {sn}is not bounded, we can make sN+k arbitrarily large by choosing a sufficiently large k. This means that, for anyN and any arbitrarily small ε, we can make sN

sN+k< ε by choosing a sufficiently large k. And we’ve established

thataN+1

sN+1+ . . .+

aN+k

sN+k≥ 1− sN

sN+k

which means that, by choosing sufficiently large k, we get the inequality

N+k∑k=N+1

aksk≥ 1− sN

sN+k≥ 1− ε

We’ve now established everything we need to show that∑ an

snis divergent. Choose any ε such that 0 < ε < 1

2 .From theorem 3.22, in order for

∑ ansn

to be convergent we must be able to find some integer N such that

n∑k=m

aksk

< ε for all n ≥ m ≥ N

But there can be no such N , because for every N we’ve shown that there is some k such that

N+k∑k=N+1

aksk

> 1− ε > ε

(note that 1 - ε > ε because 0 < ε < 12 ). And this is sufficient to show that the series does not converge.

Exercise 3.11c

To prove the inequality:

sn ≥ sn−1 {sn} is an increasing sequence (see part b)

→ 1sn≤ 1

sn−1

→ ans2n≤ an

snsn−1multiply both sides by an/sn

→ ans2n≤ sn−sn−1

snsn−1sn − sn−1 = an (see part b)

→ ans2n≤ 1

sn−1− 1

snalgebra

Now consider the summation

n∑k=2

1

sk−1− 1

sk=

(1

s1− 1

s2

)+

(1

s2− 1

s3

)+ . . .+

(1

sn−1− 1

sn

)

38

Page 39: Rudin Solutions Ch 1 - 7

Most of the terms in this summation cancel one another out: the summation “telescopes down” and simplifiesto

n∑k=2

1

sk−1− 1

sk=

1

s1− 1

sk

As we saw in part (b), the terms of {sn} increase without bound as n→∞, so that

∞∑k=2

1

sk−1− 1

sk=

1

s1

Now, by the inequality we proved above, we know that

∞∑k=2

ans2n≤∞∑k=2

1

sk−1− 1

sk=

1

s1

We can add one term to each side so that our summation starts at 1 instead of at 2 to get

∞∑k=1

ans2n≤ a1s21

+1

s1

We can now show that the series converges. Define the partial sum of this series to be

{tn} =

n∑k=1

1

sk−1− 1

sk

We’ve shown that {tn} is bounded above by a1/s21 + 1/s1, and we know that {tn} is monotonically increasing

because each an is positive. Therefore, by theorem 3.14 we know that {tn} converges; and so by definition 3.21we know that its associated series converges. And this is what we were asked to prove.

Exercise 3.11d

The series∑an/(1 + n2an) always converges. From the fact that an > 0, we can establish the following chain

of inequalities:an

1 + n2an=

1/an1/an

an1 + n2an

=1

1an

+ n2<

1

n2

From this, we see that∞∑n=0

an1 + n2an

<

∞∑n=0

1

n2

We know that∑

1/n2 converges (theorem 3.28), and therefore∑an/(1 + n2an) converges by the comparison

test of theorem 3.25.

The series∑an/(1 + nan) may or may not converge. If an = 1/n, for instance, the summation becomes∑ an

1 + nan=∑ 1

n

2=

1

2

∑ 1

n

which is divergent by theorem 3.28.

To construct a convergent series, let an be defined as

an =

{1 if n = 2m − 1 (m ∈ Z)0 otherwise

The series∑an is divergent, since there are infinitely many integers of the form 2m − 1. But the series∑

an/(1 + nan) is convergent:∞∑n=0

an1 + nan

=

∞∑m=0

1

2m=∑(

1

2

)mThis series is convergent to 2 by theorem 3.26.

39

Page 40: Rudin Solutions Ch 1 - 7

Exercise 3.12a

establishing the inequality

From the definition of rn, we see that

rk =

∞∑m=k

am = ak +

∞∑m=k+1

am = ak + rk+1

so that ak = rk − rk+1. And because each ak is positive, we see that rk > rk+1 which (from transitivity) meansthat rm > rn when n > m (that is, {rn} is a decreasing sequence). With these equalities, we can form our proof.

Take m ≤ n and consider the sumn∑

k=m

akrk

=amrm

+ . . .+anrn

From the fact that rm ≥ rn, we know that ak/rm ≤ ak/rn for all k, so that

am + . . .+ anrm

≤ amrm

+ . . .+anrn≤ am + . . .+ an

rn

Now, from the fact that ak = rk − rk+1, we see that this inequality is equivalent to

(rm − rm+1) + (rm+1 − rm+2) + . . .+ (rn − rn+1)

rm≤ amrm

+. . .+anrn≤ (rm − rm+1) + (rm+1 − rm+2) + . . .+ (rn − rn+1)

rn

Notice that most of the terms of these numerators cancel one another out, leaving us with

rm − rn+1

rm≤ amrm

+ . . .+anrn≤ rm − rn+1

rn

Taking the two leftmost terms of this inequality and performing some simple algebra then gives us

1− rn+1

rm≤ amrm

+ . . .+anrn

This is close to what we want to prove, but our index is off by one. But each an is positive and each rn isthe sum of positive terms, so an+1/rn+1 is strictly positive. By adding this term to the right-hand side of theinequality, we accomplish two things: we correct our index and we make this a strict inequality (<) instead ofa non-strict inequality (≤).

1− rn+1

rm<amrm

+ . . .+anrn

+an+1

rn+1

This last statement is true whenever n ≥ m, which means that it’s true whenever n + 1 > m. By a simplereplacement of variables, then, we know that

1− rn′

rm<amrm

+ . . .+a′nr′n

whenever n′ > m, which is what we were asked to prove.

proof of divergence

We’re told that∑an converges, so

∑an = α for some α ∈ R. From the definition of rn, we see that there is a

relationship between {rn} and α:

rn =

∞∑k=n

ak =

∞∑k=1

ak −n−1∑k=1

ak = α−n−1∑k=1

ak

As n→∞, this last term approaches α− α = 0, and so we see that limn→∞ rn = 0.

40

Page 41: Rudin Solutions Ch 1 - 7

Now choose any arbitrarly small ε from the interval (0, 12 ) and choose any arbitrary integer N . From theinequality we verified in part (a1), we see that

aNrN

+ . . .+aN+n

rN+n> 1− rN+n

rN

We can choose n to be arbitrarily large; in doing so, rN+n approaches zero while rN remains fixed. So, bychoosing a sufficiently large n, we can guarantee that rN+n/rN < ε. This choice of n gives us the inequality

aNrN

+ . . .+aN+n

rN+n> 1− rN+n

rN> 1− ε > ε

where the last step of this chain of inequalities is justified by our choice of 0 < ε < 12 .

We’ve shown that there is some ε such that, for every N , we can find

n∑k=N

anrn

> ε

From theorem 3.22, this is sufficient to show that∑an/rn diverges (since we’ve proven the negation of the “for

every ε > 0 there is some N . . . ” statement of the theorem).

Exercise 3.12b

rn > rn+1 > 0 established in part (a)

→ √rn >√rn+1 take square root of each side

→ 2√rn >

√rn +

√rn+1 add

√rn to each side

→ 2 >√rn+√rn+1√

rndivide by

√rn

→ 2(√rn −

√rn+1) >

(√rn+√rn+1)(

√rn−√rn+1)√

rnmultiply each side by a positive term

→ 2(√rn −

√rn+1) > rn−rn+1√

rnsimply the right-hand side

→ 2(√rn −

√rn+1) > an√

rnwe established an = rn − rn+1 in part (a)

Having established this inequality, we see that

n∑k=1

ak√rk

<

n∑k=1

2(√rk −

√rk+1)

and many of the terms in the right-hand summation cancel one another out, so that the series “telescopes down”and simplifies to

n∑k=1

ak√rk

< 2(√r1 −

√rk+1

)so that

limn→∞

n∑k=1

ak√rk

< limn→∞

2(√r1 −

√rk+1

)= 2√r1

This shows that this sum of nonnegative terms is bounded, which is sufficient to show that it converges bytheorem 3.24.

Exercise 3.13

Let∑an be a series that converges absolutely to α and let

∑bn be a series that converges absolutely to β. We

can prove that the Cauchy product of these two series (definition 3.48) is bounded.

41

Page 42: Rudin Solutions Ch 1 - 7

7→∑nk=0 |ck|

=∑nk=0

∣∣∣∑kj=0 ajbk−j

∣∣∣ definition 3.48 of ck

≤∑nk=0

∑kj=0 |ajbk−j | triangle inequality

=∑nk=0

∑kj=0 |aj ||bk−j | 1.33(c)

We can expand this summation out to get:

= |a0||b0|+ (|a0||b1|+ |a1||b0|) + . . .+ (|a0||bn|+ |a1||bn−1|+ . . .+ |an||b0|)

Define Bn to be∑nk=0 |bk| and An to be

∑nk=0 |ak|. We can rearrange these terms to get:

= |a0|Bn + |a1|Bn−1 + |a2|Bn−2 + . . .+ |an|B0

Now we can add several nonnegative terms to get

≤ |a0|Bn + |a1|(Bn−1 + |bn|) + |a2|(Bn−2 + |bn−1|+ |bn|) + . . .+ |an|(B0 + |b1|+ . . .+ |bn|)

= |a0|Bn + |a1|Bn + |a2|Bn + . . .+ |an|Bn= AnBn

= (∑nk=0 |ak|) (

∑nk=0 |bk|)

= αβ

This shows that each partial sum of∑|ck| is bounded above by αβ and below by 0 (since it’s a series of

nonnegative terms). By theorem 3.24, this is sufficient to prove that∑|ck| converges; and since

∑ck is the

Cauchy product, we have proved that the Cauchy product∑ck converges absolutely.

Exercise 3.14a

Choose any arbitrarily small ε > 0. We’re told that lim{sn} = s, which means that there is some N such thatd(s, sn) < ε whenever n > N . So we’ll rearrange the terms of σn a bit:

σn =

∑nk=0 skn+ 1

=

∑Nk=0 sk +

∑nk=N+1 sk

n+ 1

Whenever n > N we know that d(s, sn) < ε, which means that −ε < sn− s < ε, or that s− ε < sn < s+ ε. Thisgives us the inequality ∑N

k=0 sk +∑nk=N+1(s− ε)

n+ 1< σn <

∑Nk=0 sk +

∑nk=N+1(s+ ε)

n+ 1

The terms in some of these summations don’t depend on k, so we can further rewrite this as∑Nk=0 sk + (n− (N + 1))(s− ε)

n+ 1< σn <

∑Nk=0 sk + (n− (N + 1))(s+ ε)

n+ 1∑Nk=0 skn+ 1

+n(s− ε)n+ 1

− (N + 1)(s− ε)n+ 1

< σn <

∑Nk=0 skn+ 1

+n(s+ ε)

n+ 1− (N + 1)(s+ ε)

n+ 1

For many of these terms, the numerators are constant with respect to n. Therefore many of these terms willapproach zero as n→∞.

limn→∞

n(s− ε)n+ 1

< limn→∞

σn < limn→∞

n(s+ ε)

n+ 1

s− ε < limn→∞

σn < s+ ε

And this is just another way of saying d(s, limσn) < ε. And ε was arbitrarily small, so we have shown thatlimn→∞ σn = s.

42

Page 43: Rudin Solutions Ch 1 - 7

Exercise 3.14b

Let sn = (−1)n. Then∑sn is either equal to 0 or 1, depending on n, and

0

n+ 1≤ σn ≤

1

n+ 1

Taking the limit as n→∞ gives us0 ≤ lim

n→∞σn ≤ 0

which can only be true iflimn→∞

σn = 0

Exercise 3.14c

Define sn to be

sn =

{ (12

)n+ k if n = k3 (k ∈ Z)(

12

)notherwise

There are no more than 3√n perfect cubes within the first n integers, so {sn} will contain no more than b 3

√nc

terms of the form (1/2)n + k. This gives us the inequality

n∑k=0

sn =

n∑k=0

(1

2

)n+

b 3√nc∑

k=0

k ≤ 2 +3√n( 3√n+ 1)

2

where the last step in this chain of inequalities is justified by the common summation∑nk=1 k = k(k + 1)/2.

Continuing, we see that

n∑k=0

sn ≤ 2 +3√n( 3√n+ 1)

2≤ 2 +

3√n( 3√n+ 3√n)

2= 2 +

3√n2

We can now analyze the value of σn.

σn =

∑nk=0 snn+ 1

≤ 2 +3√n2

n+ 1=

23√n2

+ 1

3√n+ 1

3√n2

Taking the limit of each side as n→∞ gives us

limn→∞

σn ≤ limn→∞

23√n2

+ 1

3√n+ 1

3√n2

= limn→∞

13√n

= 0

Every term of {sn} was greater than zero, so we know that the arithmetic average σn is greater than zero.Therefore 0 ≤ limσn ≤ 0, and therefore limσn = 0.

Exercise 3.14

proving the equality

n∑k=1

kak = (s1 − s0) + 2(s2 − s1) + . . .+ n(sn − sn−1)

= −s0 + (1− 2)s1 + (2− 3)s2 + . . .+ ((n− 1)− n)sn−1 + sn(n)

= −s0 − s1 − . . .− sn + (n+ 1)sn

Therefore, if we divide this by n+ 1, we get

1

n+ 1

n∑k=1

kak =−s0 − s1 − . . .− sn

n+ 1+ sn = −σn + sn

43

Page 44: Rudin Solutions Ch 1 - 7

establishing convergence

We’re told that limn→∞ nan = 0, so by part a we know that

limn→∞

∑nk=1 kakn+ 1

= 0

We’re also told that {σn} converges to some value σ, so by theorem 3.3(a) we know that

limn→∞

(∑nk=1 kakn+ 1

+ σn

)= (0 + σ) = σ

And since sn =∑n

k=1 kakn+1 + σn, this is sufficient to prove that limn→∞ sn = σ.

Exercise 3.15

If you think I’m going to go through this tedious exercise, you can lick me where I shit.

Exercise 3.16a

{xn} is monotonically decreasing

From the fact that 0 <√α < xn we know that α < x2n, and therefore

xn+1 =1

2

(xn +

α

xn

)<

1

2

(xn +

x2nxn

)= xn

This shows that xn > xn+1 for all n, which proves that {xn} is monotonically decreasing.

The limit of $\{x_n\}$ exists and is not less than $\sqrt{\alpha}$

First, we show that $\{x_n\}$ is bounded below by $\sqrt{\alpha}$. We know that $x_1 > \sqrt{\alpha}$ because we chose it to be. And if $x_n > \sqrt{\alpha}$ for any $n$, we have

$\mapsto\ x_n > \sqrt{\alpha} > 0$  (assumed)
$\to\ (x_n - \sqrt{\alpha})^2 \ge 0$  (1.18(d))
$\to\ x_n^2 - 2x_n\sqrt{\alpha} + \alpha \ge 0$
$\to\ x_n^2 + \alpha \ge 2x_n\sqrt{\alpha}$
$\to\ \dfrac{x_n^2 + \alpha}{2x_n} \ge \sqrt{\alpha}$
$\to\ \dfrac{1}{2}\left(x_n + \dfrac{\alpha}{x_n}\right) \ge \sqrt{\alpha}$
$\to\ x_{n+1} \ge \sqrt{\alpha}$  (definition of $x_{n+1}$)

So by induction, we know that $x_n \ge \sqrt{\alpha}$ for all $n$. We've now demonstrated that $\{x_n\}$ is monotonically decreasing and bounded below by $\sqrt{\alpha}$, so we're guaranteed that the limit of $\{x_n\}$ exists and that $\lim x_n \ge \sqrt{\alpha}$.

The limit of $\{x_n\}$ is not greater than $\sqrt{\alpha}$

Now we show that $\lim x_n \le \sqrt{\alpha}$. Assume that $\lim x_n = \sqrt{b}$ for some $\sqrt{b} > \sqrt{\alpha}$. By the definition of "limit", we can find $N$ such that $n > N$ implies $d(x_n, \sqrt{b}) < \varepsilon$ for any arbitrarily small $\varepsilon$. We'll choose $\varepsilon = \sqrt{b - \alpha}$ (which is positive, since $\sqrt{b} > \sqrt{\alpha} \ge 0$ implies $b > \alpha$).

$\mapsto\ d(x_n, \sqrt{b}) < \varepsilon$
$\to\ d(x_n, \sqrt{b}) < \sqrt{b-\alpha}$  (chosen value of $\varepsilon$)
$\to\ x_n - \sqrt{b} < \sqrt{b-\alpha}$  (metric on $\mathbb{R}^1$)
$\to\ x_n < \sqrt{b-\alpha} + \sqrt{b}$

We can then calculate $x_{n+1}$ in terms of $x_n$ to get a chain of inequalities (using the fact that $t \mapsto (t^2+\alpha)/(2t)$ is increasing for $t \ge \sqrt{\alpha}$):
$$x_{n+1} = \frac{x_n^2 + \alpha}{2x_n} < \frac{(\sqrt{b-\alpha}+\sqrt{b})^2 + \alpha}{2(\sqrt{b-\alpha}+\sqrt{b})} = \frac{(b-\alpha) + 2\sqrt{b}\sqrt{b-\alpha} + b + \alpha}{2(\sqrt{b-\alpha}+\sqrt{b})} = \frac{2b + 2\sqrt{b}\sqrt{b-\alpha}}{2(\sqrt{b-\alpha}+\sqrt{b})} = \frac{\sqrt{b}(\sqrt{b}+\sqrt{b-\alpha})}{\sqrt{b}+\sqrt{b-\alpha}} = \sqrt{b}$$
This shows that once the distance between $x_n$ and $\sqrt{b}$ becomes less than $\sqrt{b-\alpha}$ (which it is guaranteed to do eventually, because $\lim x_n = \sqrt{b}$), the next term of the sequence satisfies $x_{n+1} < \sqrt{b}$. But a monotonically decreasing sequence never drops below its limit, so we must also have $x_{n+1} \ge \lim x_n = \sqrt{b}$: a contradiction. Therefore the limit is not $\sqrt{b}$; and $\sqrt{b}$ was an arbitrary value greater than $\sqrt{\alpha}$, so we've shown that the limit cannot be any value greater than $\sqrt{\alpha}$.

Having shown that $\lim x_n$ exists, that the limit is not less than $\sqrt{\alpha}$, and that the limit is not greater than $\sqrt{\alpha}$, we have therefore proven that $\lim x_n = \sqrt{\alpha}$.

Exercise 3.16b

From the definitions of $x_{n+1}$ and $\varepsilon_n = x_n - \sqrt{\alpha}$, we have
$$\varepsilon_{n+1} = x_{n+1} - \sqrt{\alpha} = \frac{x_n^2 + \alpha}{2x_n} - \sqrt{\alpha} = \frac{x_n^2 - 2x_n\sqrt{\alpha} + \alpha}{2x_n} = \frac{(x_n - \sqrt{\alpha})^2}{2x_n} = \frac{\varepsilon_n^2}{2x_n} < \frac{\varepsilon_n^2}{2\sqrt{\alpha}}$$
where the last step is justified by the fact that $x_n > \sqrt{\alpha}$ (see part (a)).

The claimed bound $\varepsilon_{n+1} < \beta\left(\varepsilon_1/\beta\right)^{2^n}$, with $\beta = 2\sqrt{\alpha}$, can be proven by induction. Setting $n = 1$ in the above inequality, we have
$$\varepsilon_2 < \frac{\varepsilon_1^2}{2\sqrt{\alpha}} = \frac{\varepsilon_1^2}{\beta} = \beta\left(\frac{\varepsilon_1}{\beta}\right)^{2^1}$$
And if the statement is true for $n$, that is, if $\varepsilon_{n+1} < \beta(\varepsilon_1/\beta)^{2^n}$, then
$$\varepsilon_{n+2} < \frac{\varepsilon_{n+1}^2}{2\sqrt{\alpha}} = \frac{1}{\beta}\,\varepsilon_{n+1}^2 < \frac{1}{\beta}\left[\beta\left(\frac{\varepsilon_1}{\beta}\right)^{2^n}\right]^2 = \frac{1}{\beta}\left[\beta^2\left(\frac{\varepsilon_1}{\beta}\right)^{2^{n+1}}\right] = \beta\left(\frac{\varepsilon_1}{\beta}\right)^{2^{n+1}}$$
which shows that the statement is true for $n+1$. By induction, the bound holds for all $n \in \mathbb{N}$.

Exercise 3.16c

When $\alpha = 3$ and $x_1 = 2$, we have
$$\frac{\varepsilon_1}{\beta} = \frac{x_1 - \sqrt{\alpha}}{2\sqrt{\alpha}} = \frac{2 - \sqrt{3}}{2\sqrt{3}}$$
which can be shown to be less than $1/10$ through simple algebra (it is approximately $0.077$). Then, by part (b), we have
$$\varepsilon_{n+1} < \beta\left(\frac{\varepsilon_1}{\beta}\right)^{2^n} < 2\sqrt{3}\left(\frac{1}{10}\right)^{2^n} < 4\left(\frac{1}{10}\right)^{2^n}$$
In particular, $\varepsilon_5 < 4 \cdot 10^{-16}$ and $\varepsilon_6 < 4 \cdot 10^{-32}$.
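The following sketch is an illustration of mine (using Python's decimal module for extra precision): it runs the recursion of exercise 3.16 with $\alpha = 3$ and $x_1 = 2$ and compares the actual error $\varepsilon_{n+1} = x_{n+1} - \sqrt{\alpha}$ against the bound $\beta(\varepsilon_1/\beta)^{2^n}$ from part (b).

    # Recursion of exercise 3.16 with alpha = 3, x_1 = 2, checking the
    # error bound eps_{n+1} < beta * (eps_1/beta)^(2^n) from part (b).

    from decimal import Decimal, getcontext

    getcontext().prec = 80                      # enough digits to see the bound hold
    alpha = Decimal(3)
    root = alpha.sqrt()
    beta = 2 * root
    x = Decimal(2)                              # x_1 > sqrt(alpha)
    eps1 = x - root
    for n in range(1, 6):
        x = (x + alpha / x) / 2                 # x_{n+1} = (x_n + alpha/x_n)/2
        eps = x - root                          # this is eps_{n+1}
        bound = beta * (eps1 / beta) ** (2 ** n)
        print(n + 1, eps, bound, eps < bound)   # the bound holds at every step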


Exercise 3.17

Lemma 1: $x_n > \sqrt{a} \to x_{n+1} < \sqrt{a}$, and $x_n < \sqrt{a} \to x_{n+1} > \sqrt{a}$

$\mapsto\ x_n > \sqrt{a}$
$\leftrightarrow\ x_n(\sqrt{a}-1) > \sqrt{a}(\sqrt{a}-1)$  (we may multiply by $\sqrt{a}-1$, which is positive since $a > 1$)
$\leftrightarrow\ x_n\sqrt{a} - x_n > a - \sqrt{a}$
$\leftrightarrow\ x_n\sqrt{a} + \sqrt{a} > a + x_n$
$\leftrightarrow\ \sqrt{a}(x_n+1) > a + x_n$
$\leftrightarrow\ \sqrt{a} > \dfrac{a+x_n}{1+x_n}$
$\leftrightarrow\ \sqrt{a} > x_{n+1}$  (definition of $x_{n+1}$)

This shows that $x_n > \sqrt{a} \to x_{n+1} < \sqrt{a}$. If ">" is replaced with "<" in each of the above steps, we will have also constructed a proof that $x_n < \sqrt{a} \to x_{n+1} > \sqrt{a}$.

Lemma 2: $x_n > \sqrt{a} \to x_{n+2} < x_n$ and $x_n < \sqrt{a} \to x_{n+2} > x_n$

$\mapsto\ x_n > \sqrt{a}$
$\leftrightarrow\ x_n - \sqrt{a} > 0$
$\leftrightarrow\ 2(x_n-\sqrt{a})(x_n+\sqrt{a}) > 0$  (multiplied by positive terms, so it's still $> 0$)
$\leftrightarrow\ 2(x_n^2 - a) > 0$

At this point, the algebraic steps become bizarre and seemingly nonsensical. But bear with me.

$\leftrightarrow\ 2x_n^2 + 0\,x_n - 2a > 0$
$\leftrightarrow\ 2x_n^2 + (1 + a - 1 - a)x_n - 2a > 0$
$\leftrightarrow\ x_n + x_n^2 + a x_n + x_n^2 > a + a x_n + a + x_n$
$\leftrightarrow\ x_n(1+x_n) + x_n(a+x_n) > a(1+x_n) + a + x_n$
$\leftrightarrow\ \dfrac{x_n(1+x_n) + x_n(a+x_n)}{1+x_n} > \dfrac{a(1+x_n) + a + x_n}{1+x_n}$  (division by a positive term)
$\leftrightarrow\ x_n\left(1 + \dfrac{a+x_n}{1+x_n}\right) > a + \dfrac{a+x_n}{1+x_n}$
$\leftrightarrow\ x_n(1 + x_{n+1}) > a + x_{n+1}$
$\leftrightarrow\ x_n > \dfrac{a+x_{n+1}}{1+x_{n+1}}$
$\leftrightarrow\ x_n > x_{n+2}$  (definition of $x_{n+2}$)

This shows that $x_n > \sqrt{a} \to x_{n+2} < x_n$. If ">" is replaced with "<" in each of the above steps, we will have also constructed a proof that $x_n < \sqrt{a} \to x_{n+2} > x_n$.

Exercise 3.17a

We're told to choose $x_1$ such that $x_1 > \sqrt{a}$. Lemma 1 (applied twice) keeps every odd-indexed term above $\sqrt{a}$, so we can use induction with lemma 2 to show that $\{x_{2k+1}\}$ (the subsequence of $\{x_n\}$ consisting of elements with odd indices) is a monotonically decreasing sequence.

Exercise 3.17b

Because we chose $x_1$ such that $x_1 > \sqrt{a}$, lemma 1 tells us that $x_2 < \sqrt{a}$, and lemma 1 (applied twice) keeps every even-indexed term below $\sqrt{a}$. We can then use induction with lemma 2 to show that $\{x_{2k}\}$ (the subsequence of $\{x_n\}$ consisting of elements with even indices) is a monotonically increasing sequence.

Exercise 3.17c

We can show that $\{x_n\}$ converges to $\sqrt{a}$ by showing that we can make $d(x_n, \sqrt{a})$ arbitrarily small. We do this by demonstrating the relationship between $d(x_n, \sqrt{a})$ and $d(x_1, \sqrt{a})$.


Write $e_n = x_n - \sqrt{a}$. Then

$\mapsto\ e_{n+2} = x_{n+2} - \sqrt{a}$
$\to\ e_{n+2} = \dfrac{a + x_{n+1}}{1 + x_{n+1}} - \sqrt{a}$
$\to\ e_{n+2} = \dfrac{a + x_{n+1} - \sqrt{a} - x_{n+1}\sqrt{a}}{1 + x_{n+1}}$
$\to\ e_{n+2}(1 + x_{n+1}) = a + x_{n+1} - \sqrt{a} - x_{n+1}\sqrt{a}$
$\to\ e_{n+2}(1 + x_{n+1}) = x_{n+1}(1 - \sqrt{a}) - \sqrt{a}(1 - \sqrt{a})$
$\to\ e_{n+2}(1 + x_{n+1}) = (x_{n+1} - \sqrt{a})(1 - \sqrt{a})$
$\to\ e_{n+2}(1 + x_{n+1}) = e_{n+1}(1 - \sqrt{a})$
$\to\ e_{n+2} = e_{n+1}\left(\dfrac{1 - \sqrt{a}}{1 + x_{n+1}}\right)$

This tells us how to express $e_{n+2}$ in terms of $e_{n+1}$. We can use this same equality to express $e_{n+1}$ in terms of $e_n$, giving us
$$e_{n+2} = e_n\left(\frac{1-\sqrt{a}}{1+x_n}\right)\left(\frac{1-\sqrt{a}}{1+x_{n+1}}\right) = e_n\left(\frac{1-\sqrt{a}}{1+x_n}\right)\left(\frac{1-\sqrt{a}}{1+\frac{a+x_n}{1+x_n}}\right) = e_n\left(\frac{1-\sqrt{a}}{1+x_n}\right)\left(\frac{(1+x_n)(1-\sqrt{a})}{1+2x_n+a}\right) = e_n\,\frac{(1-\sqrt{a})^2}{1+2x_n+a}$$
Since $1 + 2x_n + a > a - 1 > 0$ (recall $a > 1$ and $x_n > 0$), taking absolute values gives
$$|e_{n+2}| < |e_n|\,\frac{(\sqrt{a}-1)^2}{a-1} = |e_n|\,\frac{\sqrt{a}-1}{\sqrt{a}+1} = c\,|e_n|, \qquad 0 < c < 1$$
This tells us how to express $|e_{n+2}|$ in terms of $|e_n|$. We can apply this same inequality repeatedly to express $|e_{2n+2}|$ in terms of $|e_2|$ and $|e_{2n+1}|$ in terms of $|e_1|$:
$$|e_{2n+2}| < c|e_{2n}| < c(c|e_{2n-2}|) < \dots < c^n|e_2|$$
$$|e_{2n+1}| < c|e_{2n-1}| < c(c|e_{2n-3}|) < \dots < c^n|e_1|$$
And since $0 < c < 1$, we can make $c^n$ arbitrarily small by taking sufficiently large $n$; and $|e_n| \le c^{\lfloor (n-1)/2 \rfloor}\max\{|e_1|, |e_2|\}$, so we can make $|e_n|$ arbitrarily small by taking sufficiently large $n$; this shows that $\lim_{n\to\infty}|e_n| = 0$. Finally, $|e_n|$ is exactly $d(x_n, \sqrt{a})$, so $\lim_{n\to\infty} d(x_n, \sqrt{a}) = 0$, which is sufficient to prove that $\lim_{n\to\infty} x_n = \sqrt{a}$.

Exercise 3.17d

As shown above, each pair of iterations reduces the error term by roughly the fixed factor $(\sqrt{a}-1)/(\sqrt{a}+1)$ (linear convergence). For the algorithm in exercise 3.16, each iteration roughly squares the error term, so that in the setting of part (c) the error after $n$ steps is below $\beta\,10^{-2^{n-1}}$ (quadratic convergence). The process of exercise 3.16 is therefore the preferable one for computing square roots.
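To make the comparison concrete, here is a small side-by-side sketch (my own illustration; the choices $a = 3$ and $x_1 = 2$ are arbitrary) of the two recursions: the one from exercise 3.16 roughly squares its error at each step, while the one from exercise 3.17 only multiplies it by a fixed factor near $(\sqrt{a}-1)/(\sqrt{a}+1)$.

    # Exercise 3.16 vs exercise 3.17: error of each recursion after n steps.

    import math

    a = 3.0
    root = math.sqrt(a)
    x_newton, x_frac = 2.0, 2.0
    for n in range(1, 8):
        x_newton = 0.5 * (x_newton + a / x_newton)   # exercise 3.16
        x_frac = (a + x_frac) / (1 + x_frac)         # exercise 3.17
        print(n, abs(x_newton - root), abs(x_frac - root))
    # the first column of errors collapses almost immediately; the second
    # shrinks by roughly (sqrt(3)-1)/(sqrt(3)+1) ~ 0.27 per step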

Exercise 3.18

Lemma 1: $\{x_n\}$ is decreasing.

We're asked to choose $x_1 > \sqrt[p]{a}$. And if $x_n > \sqrt[p]{a}$, we have

$\mapsto\ x_n > \sqrt[p]{a}$
$\to\ x_n^p > a$
$\to\ 0 > a - x_n^p$
$\to\ p x_n^p > p x_n^p + a - x_n^p$
$\to\ p x_n^p > (p-1)x_n^p + a$
$\to\ \dfrac{p x_n^p}{p x_n^{p-1}} > \dfrac{(p-1)x_n^p + a}{p x_n^{p-1}}$
$\to\ x_n > \dfrac{p-1}{p}x_n + \dfrac{a}{p}x_n^{-p+1}$
$\to\ x_n > x_{n+1}$

Lemma 2: $0 < k < 1 \to p(1-k) > 1 - k^p$

Let $k$ be a positive number less than 1 and let $p$ be an integer with $p \ge 2$.

$\mapsto\ p = 1^0 + 1^1 + 1^2 + \dots + 1^{p-1}$
$\to\ p > 1 + k + k^2 + \dots + k^{p-1}$  (since $0 < k < 1$)
$\to\ p > \dfrac{(k-1)(1 + k + k^2 + \dots + k^{p-1})}{k-1}$
$\to\ p > \dfrac{k^p - 1}{k-1} = \dfrac{1 - k^p}{1-k}$
$\to\ p(1-k) > 1 - k^p$  (multiply both sides by $1-k > 0$)

Lemma 3: $x_n > \sqrt[p]{a}$

We know that $x_1 > \sqrt[p]{a}$ because we chose it. And if $x_n > \sqrt[p]{a}$, we have:

$\mapsto\ x_n > \sqrt[p]{a} > 0$
$\to\ 0 < \dfrac{\sqrt[p]{a}}{x_n} < 1$
$\to\ p\left(1 - \dfrac{\sqrt[p]{a}}{x_n}\right) > 1 - \left(\dfrac{\sqrt[p]{a}}{x_n}\right)^p$  (from lemma 2)
$\to\ p - 1 > p\,\dfrac{\sqrt[p]{a}}{x_n} - \dfrac{a}{x_n^p}$  (expand and rearrange the terms)
$\to\ (p-1)x_n^p > p\,\sqrt[p]{a}\,x_n^{p-1} - a$  (multiply both sides by $x_n^p$)
$\to\ (p-1)x_n^p + a > p\,\sqrt[p]{a}\,x_n^{p-1}$  (rearrange the terms)
$\to\ \dfrac{(p-1)x_n^p + a}{p x_n^{p-1}} > \sqrt[p]{a}$  (divide both sides by $p x_n^{p-1}$)
$\to\ x_{n+1} > \sqrt[p]{a}$  (definition of $x_{n+1}$)

Finally: $\lim_{n\to\infty} x_n = \sqrt[p]{a}$

We've shown that $\{x_n\}$ is decreasing (lemma 1) and bounded below (lemma 3), which is sufficient to show that it has some limit $x^*$ with $x^* \ge \sqrt[p]{a}$. From the definition of limit, for every $\varepsilon > 0$ we can find $N$ such that $n > N \to d(x_n, x^*) < \frac{\varepsilon}{2p}$. For all such $n$ we have:

$\mapsto\ d(x_n, x^*) < \dfrac{\varepsilon}{2p} \ \wedge\ d(x_{n+1}, x^*) < \dfrac{\varepsilon}{2p}$
$\to\ d(x_n, x_{n+1}) \le d(x_n, x^*) + d(x^*, x_{n+1}) < \dfrac{\varepsilon}{p}$  (triangle inequality)
$\to\ x_n - x_{n+1} < \dfrac{\varepsilon}{p}$  (lemma 1: $x_n > x_{n+1}$)
$\to\ x_n - \left[\dfrac{p-1}{p}x_n + \dfrac{a}{p}x_n^{-p+1}\right] < \dfrac{\varepsilon}{p}$  (definition of $x_{n+1}$)
$\to\ \dfrac{x_n}{p} - \dfrac{a}{p x_n^{p-1}} < \dfrac{\varepsilon}{p}$
$\to\ \dfrac{x_n^p - a}{p x_n^{p-1}} < \dfrac{\varepsilon}{p}$
$\to\ x_n - \dfrac{a}{x_n^{p-1}} < \varepsilon$

Since $x_n \ge \sqrt[p]{a}$ implies $x_n - a/x_n^{p-1} \ge 0$, and $\varepsilon$ was arbitrary, this tells us that $\lim_{n\to\infty}\left(x_n - \frac{a}{x_n^{p-1}}\right) = 0$, or that

$\mapsto\ \lim_{n\to\infty}\left(x_n - \dfrac{a}{x_n^{p-1}}\right) = 0$  (just shown)
$\to\ \lim_{n\to\infty} x_n\left(1 - \dfrac{a}{x_n^p}\right) = 0$  (theorem 3.3c)
$\to\ x^*\left(1 - \left(\dfrac{\sqrt[p]{a}}{x^*}\right)^p\right) = 0$  (theorem 3.3c)
$\to\ \sqrt[p]{a}/x^* = 1$  (since $x^* \ge \sqrt[p]{a} > 0$)
$\to\ \sqrt[p]{a} = x^*$

From the definition of $x^*$ as the limit of $\{x_n\}$, this tells us that $\lim_{n\to\infty} x_n = \sqrt[p]{a}$, which is what we wanted to prove.
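A short sketch of the recursion just analyzed (my own illustration; the values $p = 3$, $a = 5$, $x_1 = 2$ are arbitrary choices with $x_1 > a^{1/p}$):

    # Recursion of exercise 3.18: x_{n+1} = ((p-1)/p) x_n + (a/p) x_n^{1-p},
    # which decreases monotonically to the p-th root of a.

    p, a = 3, 5.0
    x = 2.0                                   # x_1 > a**(1/p)
    for n in range(8):
        x = (p - 1) / p * x + a / p * x ** (1 - p)
        print(n + 1, x)
    print(x ** p)                             # approximately a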

Exercise 3.19

The idea behind this proof is to consider the elements of the segment $[0,1]$ in terms of their base-3 expansions. When we do this, we notice that the $m$th step of the Cantor construction eliminates every number that must have a 1 as the $m$th digit of its ternary expansion.

From the construction of the Cantor set in the book, a real number $r \in [0,1]$ is not in the Cantor set iff it lies in some interval of the form
$$\left(\frac{3k+1}{3^m},\ \frac{3k+2}{3^m}\right), \qquad m, k \in \mathbb{N}$$

Let $\{\alpha_n\}$ represent an arbitrary ternary sequence (that is, each digit is 0, 1, or 2), and let $x(\alpha) = \sum_{n=1}^{\infty} \alpha_n 3^{-n}$. We can determine necessary and sufficient conditions for $x(\alpha)$ to fall into such an interval:

$x(\alpha) \notin \text{Cantor}$ iff $(\exists m, k)\ x(\alpha) \in \left(\dfrac{3k+1}{3^m},\ \dfrac{3k+2}{3^m}\right)$
iff $(\exists m, k)\ \sum_{n=1}^{\infty} \dfrac{\alpha_n}{3^n} \in \left(\dfrac{3k+1}{3^m},\ \dfrac{3k+2}{3^m}\right)$
iff $(\exists m, k)\ \sum_{n=1}^{\infty} \dfrac{\alpha_n}{3^{n-m}} \in (3k+1,\ 3k+2)$

We then split up the summation into three distinct parts:

iff $(\exists m, k)\ \sum_{n=1}^{m-1} \dfrac{\alpha_n}{3^{n-m}} + \alpha_m + \sum_{n=m+1}^{\infty} \dfrac{\alpha_n}{3^{n-m}} \in (3k+1,\ 3k+2)$
iff $(\exists m, k)\ \sum_{n=1}^{m-1} \alpha_n 3^{m-n} + \alpha_m + \sum_{n=m+1}^{\infty} \dfrac{\alpha_n}{3^{n-m}} \in (3k+1,\ 3k+2)$

Every term in the leftmost sum is divisible by three, so the leftmost sum is itself divisible by three. This gives us, for some $j \in \mathbb{N}$,

iff $(\exists m, k)\ 3j + \alpha_m + \sum_{n=m+1}^{\infty} \dfrac{\alpha_n}{3^{n-m}} \in (3k+1,\ 3k+2)$


Without specifying a particular sequence $\{\alpha_n\}$ we can't evaluate the rightmost summation, but we can establish bounds for it. Each $\alpha_n$ is 0, 1, or 2, so the summation is largest when $\alpha_n = 2$ for all $n$ and smallest when $\alpha_n = 0$ for all $n$. This gives us the bound
$$0 = \sum_{n=m+1}^{\infty} \frac{0}{3^{n-m}} \le \sum_{n=m+1}^{\infty} \frac{\alpha_n}{3^{n-m}} \le \sum_{n=m+1}^{\infty} \frac{2}{3^{n-m}} = 2\sum_{n=1}^{\infty} \frac{1}{3^n} = 1$$
Note that the upper bound is 1, not 3, since the index starts at $n = m+1$. Writing $\delta$ for this tail sum and continuing with our chain of "iff" statements, we conclude that $x(\alpha)$ is not a member of the Cantor set iff
$$(\exists m, k)\quad 3j + \alpha_m + \delta \in (3k+1,\ 3k+2), \qquad j \in \mathbb{N},\ 0 \le \delta \le 1$$
From the bounds on $\alpha_m$ and $\delta$ we know that $0 \le \alpha_m + \delta \le 3$, so membership in the open interval forces $3j = 3k$ and

iff $(\exists m)\ \alpha_m + \delta \in (1, 2)$

Since $\alpha_m$ is an integer and $0 \le \delta \le 1$, this can happen only when $\alpha_m = 1$ and $0 < \delta < 1$; that is,

iff $(\exists m)\ \alpha_m = 1$ and the tail $\{\alpha_n\}_{n>m}$ is not identically 0 and not identically 2
iff $x(\alpha) \notin x(a)$

where $x(a)$ is as defined in the exercise. (If the digit 1 occurs only with a tail that is identically 0 or identically 2, then $x(\alpha)$ is an endpoint of a removed interval; it has a second ternary expansion that avoids the digit 1, so it belongs both to the Cantor set and to the set $x(a)$.) This shows that a real number is a member of the Cantor set if and only if it is a member of $x(a)$, which is sufficient to prove that the Cantor set is equal to $x(a)$.
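The correspondence between the Cantor set and ternary digits can be poked at numerically. The helper below is an illustration of mine; it only inspects finitely many digits, so it is an approximation to true membership for rational inputs.

    # Exercise 3.19, informally: x is in the Cantor set iff some ternary
    # expansion of x uses only the digits 0 and 2.

    from fractions import Fraction

    def in_cantor(x, depth=40):
        """Approximate membership test: scan the first `depth` ternary digits."""
        x = Fraction(x)
        for _ in range(depth):
            x *= 3
            digit = int(x)
            # a digit 1 is only fatal if the expansion doesn't terminate here
            # (a terminating ...1 can be rewritten as ...0222...)
            if digit == 1 and x != 1:
                return False
            x -= digit
            if x == 0:
                return True
        return True

    print(in_cantor(Fraction(1, 4)))   # True:  1/4 = 0.020202..._3
    print(in_cantor(Fraction(1, 2)))   # False: 1/2 = 0.111..._3
    print(in_cantor(Fraction(1, 3)))   # True:  1/3 = 0.1_3 = 0.0222..._3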

Exercise 3.20

Choose some arbitrarily small $\varepsilon > 0$. From the definition of a Cauchy sequence, there is some $N$ such that $i, n > N$ implies $d(p_i, p_n) < \varepsilon/2$. From the convergence of the subsequence to $p^*$, there is some $M$ such that $i > M$ (with $p_i$ a term of the convergent subsequence) implies $d(p^*, p_i) < \varepsilon/2$. So, choosing such an $i$ and any $n > \max(M, N)$, we have
$$d(p^*, p_n) \le d(p^*, p_i) + d(p_i, p_n) < \varepsilon$$
which, because $\varepsilon$ is arbitrarily small, is sufficient to prove that $\{p_n\}$ converges to $p^*$.

Exercise 3.21

We know that each $E_i$ in the nested sequence is nonempty (see note 1), so we can construct at least one sequence whose $i$th element is an arbitrary element of $E_i$. Let $\{s_n\}$ be a sequence constructed in this way. We can immediately say three things about this sequence.

1) The sequence is Cauchy. This comes from the fact that $\lim \operatorname{diam} E_n = 0$ and that $s_i, s_j \in E_N$ whenever $i, j \ge N$: see the text below definition 3.9.

2) The sequence is convergent. This comes from the fact that it's a Cauchy sequence in a complete metric space $X$: see definition 3.12.

3) The sequence converges to some $s^* \in \bigcap E_n$. Fix any $N$; every term $s_n$ with $n \ge N$ lies in $E_N$ (since the sets are nested), so $s^*$ is either a point of $E_N$ or a limit point of $E_N$. The set $E_N$ is closed, so in either case $s^* \in E_N$; and since $N$ was arbitrary, $s^* \in \bigcap E_n$.

This shows that $\bigcap E_n$ is nonempty: it contains at least one element $s^*$. We can then follow the proof of theorem 3.10b to conclude that $\bigcap E_n$ contains exactly one point.


note 1

We're told that $E_i \supset E_{i+1}$ for all $i$. If some $E_i$ were empty, then $E_k$ would be empty for all $k > i$. This would mean that
$$\lim \operatorname{diam} E_n = \operatorname{diam} \emptyset = \sup\,\{d(p,q) : p, q \in \emptyset\} = \sup \emptyset$$
To find the supremum of the empty set in $\mathbb{R}$, we need to rely on the definition of supremum:
$$\sup \emptyset = \text{least upper bound of } \emptyset \text{ in } \mathbb{R} = \min\,\{x \in \mathbb{R} : a \in \emptyset \to a \le x\}$$
The set for which we're seeking a minimum contains every $x \in \mathbb{R}$ (because of the false antecedent $a \in \emptyset$), so the supremum of the empty set would have to be the minimum of $\mathbb{R}$ itself. Regardless of how we define this minimum (or whether it's defined at all), it certainly isn't equal to zero. But we're told that $\lim \operatorname{diam} E_n = 0$, so our initial assumption must have been wrong: no $E_i$ is empty.

Exercise 3.22

The set $G_1$ is a nonempty open set (it is dense in $X$), so we can choose $p_1 \in G_1$ and find some open neighborhood $N_{r_1}(p_1) \subseteq G_1$. Choose a smaller radius $s_1 < r_1$. Not only do we have $N_{s_1} \subset N_{r_1} \subseteq G_1$; we can take the closure of $N_{s_1}$ and still have $\overline{N_{s_1}} \subset N_{r_1} \subseteq G_1$ (the closure only adds points at distance exactly $s_1 < r_1$ from $p_1$). Define $E_1$ to be the neighborhood $N_{s_1}$ (not its closure). At this point, we have
$$E_1 \subset \overline{E_1} \subseteq G_1$$
The set $G_2$ is dense in $X$, so $p_1$ is either an element of $G_2$ or a limit point of $G_2$. In either case, every neighborhood of $p_1$ contains some point $p_2 \in G_2$. More specifically, the neighborhood $E_1$ contains some point $p_2 \in G_2$. Since $p_2$ is in both $E_1$ and $G_2$, both of which are open sets, there is some neighborhood $N_{r_2}(p_2)$ that's a subset of both $E_1$ and $G_2$. We again choose a smaller radius $s_2 < r_2$, so that $\overline{N_{s_2}} \subset N_{r_2}$, and define $E_2$ to be the neighborhood $N_{s_2}$ (not its closure). At this point, we have
$$E_2 \subset \overline{E_2} \subset G_2, \qquad E_2 \subset \overline{E_2} \subset E_1 \subset \overline{E_1}$$
We can continue in this same way, constructing a sequence of nested sets
$$\overline{E_1} \supset \overline{E_2} \supset \dots \supset \overline{E_n} \supset \dots$$
This is a sequence of closed, bounded, nonempty, nested sets (and by also choosing $s_n < 1/n$ at each step we guarantee $\operatorname{diam} \overline{E_n} \to 0$, which exercise 3.21 requires). So from exercise 3.21 we know that $\bigcap \overline{E_n}$ contains a single point $x$. This point $x$ is in every $\overline{E_n}$, and $\overline{E_n} \subset G_n$ for every $n$, so $x$ is in every $G_n$. Therefore $\bigcap G_n$ is nonempty.

Exercise 3.23

We aren't given enough information about the metric space $X$ to assume that the Cauchy sequences converge to elements of $X$. All we can say is that, for any arbitrarily small $\varepsilon > 0$, there exist some $M, N$ such that
$$n, m > M \to d(p_n, p_m) \le \varepsilon/2, \qquad n, m > N \to d(q_n, q_m) \le \varepsilon/2$$
From multiple applications of the triangle inequality, we have
$$d(p_n, q_n) \le d(p_n, p_m) + d(p_m, q_m) + d(q_m, q_n)$$
which, for $n, m > \max\{N, M\}$, becomes
$$d(p_n, q_n) \le \varepsilon + d(p_m, q_m), \qquad\text{or}\qquad d(p_n, q_n) - d(p_m, q_m) \le \varepsilon$$
Exchanging the roles of $n$ and $m$ gives the same bound with the opposite sign, so $|d(p_n, q_n) - d(p_m, q_m)| \le \varepsilon$ whenever $n, m > \max\{N, M\}$. This shows that $\{d(p_n, q_n)\}$ is a Cauchy sequence of real numbers; since $\mathbb{R}$ is complete, it converges.


Exercise 3.24a

Throughout, write $\{p_n\} \sim \{q_n\}$ for the relation $\lim_{n\to\infty} d(p_n, q_n) = 0$.

reflexivity

For every sequence $\{p_n\}$ we have $\lim_{n\to\infty} d(p_n, p_n) = \lim_{n\to\infty} 0 = 0$, so $\{p_n\} \sim \{p_n\}$.

symmetry

If $\{p_n\} \sim \{q_n\}$, then since $d(p_n, q_n) = d(q_n, p_n)$ we have $\lim_{n\to\infty} d(q_n, p_n) = \lim_{n\to\infty} d(p_n, q_n) = 0$, so $\{q_n\} \sim \{p_n\}$.

transitivity

If $\{p_n\} \sim \{q_n\}$ and $\{q_n\} \sim \{r_n\}$, the triangle inequality gives us
$$0 \le \lim_{n\to\infty} d(p_n, r_n) \le \lim_{n\to\infty}\big[d(p_n, q_n) + d(q_n, r_n)\big] = 0 + 0 = 0$$
This tells us that $\lim_{n\to\infty} d(p_n, r_n) = 0$, so $\{p_n\} \sim \{r_n\}$.

Exercise 3.24b

Let $\{a_n\} \sim \{p_n\}$ and let $\{b_n\} \sim \{q_n\}$ (that is, $\{a_n\}$ and $\{p_n\}$ are representatives of the same class $P$, and $\{b_n\}$, $\{q_n\}$ are representatives of the same class $Q$). From the triangle inequality, we have
$$\lim_{n\to\infty} d(p_n, q_n) \le \lim_{n\to\infty}\big[d(p_n, a_n) + d(a_n, b_n) + d(b_n, q_n)\big]$$
which, from the equivalence established in part (a), gives us
$$\lim_{n\to\infty} d(p_n, q_n) \le \lim_{n\to\infty} d(a_n, b_n) \qquad (5)$$
A similar application of the triangle inequality gives us
$$\lim_{n\to\infty} d(a_n, b_n) \le \lim_{n\to\infty}\big[d(a_n, p_n) + d(p_n, q_n) + d(q_n, b_n)\big] = \lim_{n\to\infty} d(p_n, q_n) \qquad (6)$$
Combining (5) and (6), we have
$$\lim_{n\to\infty} d(p_n, q_n) = \lim_{n\to\infty} d(a_n, b_n)$$
so the value $\Delta(P, Q)$ does not depend on the choice of representatives.

Exercise 3.24c

Define $X^*$ to be the set of equivalence classes from part (b). Let $\{P_n\}$ be a Cauchy sequence in $(X^*, \Delta)$. This is an unusual metric space, so to be clear: the sequence $\{P_n\}$ is a sequence $\{P_0, P_1, P_2, \dots\}$ of equivalence classes in $X^*$ that get "closer" to one another with respect to the distance function $\Delta$. Each $P_i$ is an equivalence class of Cauchy sequences of the form $\{p_{i0}, p_{i1}, p_{i2}, \dots\}$ containing elements of $X$ that get "closer together" with respect to the distance function $d$.

Each $P_i$ is a set of equivalent Cauchy sequences in $X$. From each of these, choose some sequence $\{p_{in}\}_n \in P_i$. For each $i$ we have
$$(\exists N_i)\left(m, n > N_i \to d(p_{im}, p_{in}) < \frac{\varepsilon}{i}\right)$$
We'll define a new sequence $\{q_n\}$ by letting $q_i = p_{im}$ for some $m > N_i$. Let $Q$ be the equivalence class containing $\{q_n\}$. We can show that $\lim_{n\to\infty} P_n = Q$:
$$0 \le \lim_{n\to\infty} \Delta(P_n, Q) = \lim_{n\to\infty}\left[\lim_{k\to\infty} d(p_{nk}, q_k)\right] \le \lim_{n\to\infty} \frac{\varepsilon}{n}$$
So, by the squeeze theorem, we have
$$\lim_{n\to\infty} \Delta(P_n, Q) = 0 \qquad (7)$$
which, by the definition of the metric on $X^*$, means that $\lim_{n\to\infty} P_n = Q$.

Proof that $Q \in X^*$:

Choose any arbitrarily small values $\varepsilon_1, \varepsilon_2, \varepsilon_3$. From (7) we have
$$\exists X_0 : n, k > X_0 \to d(p_{nk}, q_n) < \varepsilon_1$$
And $\{P_n\}$ is Cauchy, so we have
$$\exists Y : n > Y \to \Delta(P_n, P_{n+m}) < \varepsilon_2$$
which, after choosing appropriately large $n$, gives us
$$\exists Z : k > Z \to d(p_{nk}, p_{(n+m)k}) < \varepsilon_3$$
Therefore, by the triangle inequality,
$$d(q_n, q_{n+m}) \le d(q_n, p_{nk}) + d(p_{nk}, p_{(n+m)k}) + d(p_{(n+m)k}, q_{n+m})$$
Taking limits of each side gives us
$$\limsup_{n\to\infty} d(q_n, q_{n+m}) \le \varepsilon_1 + \varepsilon_3 + \varepsilon_1$$
But these epsilon values were arbitrarily small and $m$ was an arbitrary integer. This shows that for every $\varepsilon$ there exists some integer $N$ such that $n, n+m > N$ implies $d(q_n, q_{n+m}) < \varepsilon$. And this is the definition of $\{q_n\}$ being a Cauchy sequence. Therefore $Q \in X^*$.

Exercise 3.24d

Let $\{p_n\}$ represent the constant sequence whose terms are all $p$, and let $\{q_n\}$ represent the constant sequence whose terms are all $q$. Then
$$\Delta(P_p, P_q) = \lim_{n\to\infty} d(p_n, q_n) = \lim_{n\to\infty} d(p, q) = d(p, q)$$

Exercise 3.24e : proof of density

Let $\{a_n\}$ be an arbitrary Cauchy sequence in $X$ and let $A \in X^*$ be the equivalence class containing $\{a_n\}$. If the sequence $\{a_n\}$ converges to some element $x \in X$, then $A = P_x = \varphi(x)$ and therefore $A \in \varphi(X)$ (here $P_x$ denotes the class of the constant sequence at $x$, as in part (d)).

If the sequence $\{a_n\}$ does not converge to any element of $X$, then choose an arbitrarily small $\varepsilon$. The sequence $\{a_n\}$ is Cauchy, so we're guaranteed the existence of $K$ such that
$$j, k > K \to d(a_j, a_k) < \varepsilon$$
Fix some $k > K$ and consider the constant sequence whose terms are all $a_k$. This sequence is a member of the equivalence class $P_{a_k} = \varphi(a_k)$, and
$$\Delta(A, P_{a_k}) = \lim_{n\to\infty} d(a_n, a_k) \le \varepsilon$$
This shows that we can find some element of $\varphi(X)$ arbitrarily close to $A$, which means that $A$ is a limit point of $\varphi(X)$.

We have shown that an arbitrary element $A \in X^*$ is either an element of $\varphi(X)$ or a limit point of $\varphi(X)$. By definition, this means that $\varphi(X)$ is dense in $X^*$.


Exercise 3.24e : if X is complete

If $X$ is complete, then each class $A \in X^*$ consists of Cauchy sequences in $X$ that converge to some point $a \in X$, so that $A = P_a = \varphi(a) \in \varphi(X)$. This shows that $X^* \subseteq \varphi(X)$. And for every $\varphi(b) \in \varphi(X)$, the constant sequence at $b$ is Cauchy, so that $\varphi(b) = P_b \in X^*$. This shows that $\varphi(X) \subseteq X^*$. Together, $X^* \subseteq \varphi(X) \subseteq X^*$, or $\varphi(X) = X^*$.

Exercise 3.25

By the construction of exercise 3.24, the completion of the set of rational numbers (with the usual metric) is isometric to $\mathbb{R}$: every real number is the limit of a Cauchy sequence of rationals, two Cauchy sequences of rationals are equivalent iff they have the same real limit, and the map sending each equivalence class to that common limit is a distance-preserving bijection onto $\mathbb{R}$.

Exercise 4.1

Consider the function
$$f(x) = \begin{cases} 1, & x = 0 \\ 0, & x \ne 0 \end{cases}$$
This function satisfies the condition that
$$\lim_{h\to 0}\big[f(x+h) - f(x-h)\big] = 0$$
for all $x$ (for $x \ne 0$ both terms are eventually 0, and for $x = 0$ the two terms are equal and cancel), but the function is not continuous at $x = 0$: choose $\varepsilon < 1$; every neighborhood $N_\delta(0)$ contains a point $p \ne 0$ for which
$$d(f(p), f(0)) = 1 > \varepsilon$$
Therefore we can't pick $\delta$ such that $d(p, 0) < \delta \to d(f(p), f(0)) < \varepsilon$, which means that $f$ is not continuous at 0 by definition 4.5.
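A tiny numerical illustration of this counterexample (mine, not the author's): the symmetric difference $f(x+h) - f(x-h)$ vanishes for every $h$, yet values of $f$ arbitrarily close to 0 stay a full unit away from $f(0)$.

    # Exercise 4.1 counterexample: symmetric limits exist everywhere,
    # but f is not continuous at 0.

    def f(x):
        return 1.0 if x == 0 else 0.0

    for h in (0.1, 0.01, 0.001):
        print(h, f(0 + h) - f(0 - h))    # always 0.0
    print(f(1e-9), f(0.0))               # 0.0 versus 1.0: a jump at x = 0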

Exercise 4.2

Let $X$ be a metric space, let $E$ be an arbitrary subset of $X$, and let $\overline{E}$ represent the closure of $E$. We want to prove that $f(\overline{E}) \subseteq \overline{f(E)}$. To do this, assume $y \in f(\overline{E})$. This means that $y = f(e)$ for some $e \in (E \cup E')$.

case 1: $e \in E$

If $e \in E$, then $y \in f(E)$ and therefore $y \in \overline{f(E)}$.

case 2: $e \in E'$

If $e \in E'$, then every neighborhood of $e$ contains points of $E$. Choose an arbitrarily small neighborhood $N_\varepsilon(y)$. We're told that $f$ is continuous so, by the definition of continuity (4.5), we're guaranteed the existence of $\delta$ such that $f(x) \in N_\varepsilon(y)$ whenever $x \in N_\delta(e)$. But $N_\delta(e)$ contains points of $E$, so $N_\varepsilon(y)$ contains points of $f(E)$. Since $\varepsilon$ was arbitrary, $y$ is either a point of $f(E)$ or a limit point of $f(E)$.

We've shown that every element $y \in f(\overline{E})$ satisfies $y \in (f(E) \cup f(E)') = \overline{f(E)}$. This proves that $f(\overline{E}) \subseteq \overline{f(E)}$.

A function $f$ for which $f(\overline{E})$ is a proper subset of $\overline{f(E)}$

Let $X$ be the metric space consisting of the interval $(0,1)$ with the standard distance metric, let $Y$ be $\mathbb{R}^1$, and define $f : X \to Y$ by $f(x) = x$. Take $E = X$. The set $E$ is closed in $X$ (so $\overline{E} = E$) but not closed in $Y$, so we have
$$f(\overline{E}) = f(E) = (0,1) \subsetneq [0,1] = \overline{f(E)}$$

Exercise 4.3

The set $\{0\}$ is a closed subset of $\mathbb{R}^1$ (it is a finite set). By the corollary of theorem 4.8, $Z(f) = f^{-1}(\{0\})$ must also be a closed set.


Exercise 4.4: f(E) is dense in f(X)

To show that $f(E)$ is dense in $f(X)$ we must show that every element of $f(X)$ is either an element of $f(E)$ or a limit point of $f(E)$.

Assume $y \in f(X)$. Then $y = f(p)$ for some $p \in X$. We're told that $E$ is dense in $X$, so either $p \in E$ or $p \in E'$.

case 1: $p \in E$

If $p \in E$, then $y = f(p) \in f(E)$.

case 2: $p \in E'$

If $p$ is a limit point of $E$, then there is a sequence $\{e_n\}$ of elements of $E$ such that $e_n \ne p$ and $\lim_{n\to\infty} e_n = p$. We're told that $f$ is continuous, so by theorem 4.2 we know that $\lim_{n\to\infty} f(e_n) = f(p) = y$. The terms $f(e_n)$ all lie in $f(E)$, so every neighborhood of $y$ contains points of $f(E)$; therefore $y$ is either a point of $f(E)$ or a limit point of $f(E)$.

We've shown that every element $y \in f(X)$ is either an element of $f(E)$ or a limit point of $f(E)$. By definition, this means that $f(E)$ is dense in $f(X)$.

Exercise 4.4b

Choose an arbitrary $p \in X$. We're told $E$ is dense in $X$, so $p$ is either an element of $E$ or a limit point of $E$.

Case 1: $p \in E$

If $p \in E$, then we're told that $f(p) = g(p)$.

Case 2: $p \in E'$

If $p$ is a limit point of $E$, then there is a sequence $\{e_n\}$ of elements of $E$ such that $e_n \ne p$ and $\lim_{n\to\infty} e_n = p$. We're told that $f$ and $g$ are continuous, so by theorem 4.2 we know that $\lim_{n\to\infty} f(e_n) = f(p)$ and $\lim_{n\to\infty} g(e_n) = g(p)$. But each $e_n$ is an element of $E$, so we have $f(e_n) = g(e_n)$ for all $n$. This tells us that
$$g(p) = \lim_{n\to\infty} g(e_n) = \lim_{n\to\infty} f(e_n) = f(p)$$
We see that $f(p) = g(p)$ in either case. This proves that $f(p) = g(p)$ for all $p \in X$.

Exercise 4.5

The set $E^C$ is a union of an at most countable collection of disjoint open intervals

We're told that $E$ is closed, so $E^C$ is open. Exercise 2.29 tells us that $E^C$ is the union of an at most countable collection of disjoint segments; each of these segments must be open (if one of them contained a non-interior point, that would be a non-interior point of the open set $E^C$). Let $\{(a_i, b_i)\}$ be this at most countable collection of disjoint open segments.

constructing the function

We must separately consider the cases $x \in E$ and $x \notin E$; and if $x \notin E$ we must consider the possibility that $E^C$ contains an interval of the form $(-\infty, b)$ or $(a, \infty)$. We'll define the function to be
$$g(x) = \begin{cases} f(x), & x \in E \\ f(b_i), & x \in (a_i, b_i) \text{ and } a_i = -\infty \\ f(a_i), & x \in (a_i, b_i) \text{ and } b_i = \infty \\ f(a_i) + \dfrac{f(b_i) - f(a_i)}{b_i - a_i}(x - a_i), & x \in (a_i, b_i) \text{ and } -\infty < a_i < b_i < \infty \end{cases} \qquad (8)$$
This function is the one mentioned in the hint: the graph of $g$ is a straight line on the closure $[a_i, b_i]$ of each bounded complementary segment. This function can easily (albeit tediously) be shown to be continuous on $\mathbb{R}^1$.


failure if “closed” is omitted

Consider the following function defined on the open set $(-\infty, 0) \cup (0, \infty)$:
$$f(x) = \begin{cases} 1, & x > 0 \\ -1, & x < 0 \end{cases}$$
This function is continuous on its (open) domain, but it will be discontinuous at $x = 0$ no matter how we define an extension at that point.

Extending this to vector-valued functions

Let $E$ be a closed subset of $\mathbb{R}$ and let $f : E \to \mathbb{R}^k$ be a continuous vector-valued function defined by
$$f(x) = (f_1(x), f_2(x), \dots, f_k(x))$$
For each $f_i$ we can define a function $g_i$ as in equation (8) to create a new vector-valued function $g : \mathbb{R} \to \mathbb{R}^k$ defined by
$$g(x) = (g_1(x), g_2(x), \dots, g_k(x))$$
We've shown that each of these $g_i$ functions is continuous; therefore, by theorem 4.10, the function $g$ is continuous.

Exercise 4.6

Let $G = \{(x, f(x)) : x \in E\}$ represent the graph of $f$, and let $g : E \to G$ be defined by $g(x) = (x, f(x))$. We can make $G$ into a metric space by giving it the metric it inherits as a subset of $\mathbb{R}^2$:
$$d(g(x), g(y)) = d\big((x, f(x)), (y, f(y))\big) = \sqrt{(x-y)^2 + (f(x)-f(y))^2}$$

If $f$ is continuous

Both $x \mapsto x$ and $x \mapsto f(x)$ are continuous mappings, so $g(x) = (x, f(x))$ is a continuous mapping by theorem 4.10. The domain $E$ of $g$ is a compact metric space, so by theorem 4.14 the image $G = g(E)$ is compact.

If the graph is compact

Define $g$ as above. Its inverse is $g^{-1}(x, f(x)) = x$, which is clearly a one-to-one function from $G$ onto $E$; it is also continuous, since it is the restriction to $G$ of the projection $(x, y) \mapsto x$ and $|x - x'| \le d\big((x, y), (x', y')\big)$. Since $G$ is compact, theorem 4.17 tells us that the inverse of $g^{-1}$ (that is, $g$ itself) is continuous. We can then appeal to theorem 4.10 once again to conclude that the component $f$ of $g$ is continuous.

Exercise 4.7

$f$ is bounded

If $x = 0$, then $f(x,y) = 0$ for any value of $y$. If $x > 0$:

$(x - y^2)^2 \ge 0$  (squares are nonnegative)
$\to\ x^2 - 2xy^2 + y^4 \ge 0$  (expanding the squared term)
$\to\ x^2 - xy^2 + y^4 \ge 0$  (the LHS stays nonnegative after adding the nonnegative term $xy^2$)
$\to\ x^2 + y^4 \ge xy^2$
$\to\ 1 \ge \dfrac{xy^2}{x^2 + y^4} = f(x,y) \ge 0$  (divide by the positive term $x^2 + y^4$)

If $x < 0$:

$(x + y^2)^2 \ge 0$  (squares are nonnegative)
$\to\ x^2 + 2xy^2 + y^4 \ge 0$  (expanding the squared term)
$\to\ x^2 + y^4 \ge -2xy^2 \ge -xy^2$  (since $-xy^2 \ge 0$ when $x < 0$)
$\to\ -1 \le \dfrac{xy^2}{x^2 + y^4} = f(x,y) \le 0$  (divide by the positive term $x^2 + y^4$)

In every case $|f(x,y)| \le 1$, so $f$ is bounded.


g is unbounded near (0,0)

If we let $x = n^\alpha$ and $y = n^\beta$, we have
$$g(n^\alpha, n^\beta) = \frac{n^{\alpha+2\beta}}{n^{2\alpha} + n^{6\beta}}$$
We can divide the numerator and denominator by $n^{\alpha+2\beta}$ to get
$$g(n^\alpha, n^\beta) = \frac{1}{n^{\alpha-2\beta} + n^{4\beta-\alpha}}$$
If we let $\alpha = -3$, $\beta = -1$ this becomes
$$g\left(\frac{1}{n^3}, \frac{1}{n}\right) = \frac{1}{n^{-1} + n^{-1}} = \frac{n}{2}$$
The points $(1/n^3, 1/n)$ approach $(0,0)$ as $n \to \infty$, while $g(1/n^3, 1/n) = n/2 \to +\infty$. Therefore $g$ is unbounded in every neighborhood of $(0,0)$.

f is not continuous at (0,0)

Fix $0 < \varepsilon < 1/2$. For $f$ to be continuous at $(0,0)$ we must be able to choose some $\delta > 0$ such that
$$d\big((0,0),(x,y)\big) < \delta \to d\big(f(0,0), f(x,y)\big) < \varepsilon$$
But there can be no such $\delta$. To see this, choose an arbitrary $\delta > 0$, choose $x > 0$ such that $x^2 + x < \delta^2$, and then choose $y = \sqrt{x}$. This gives us
$$d\big((0,0),(x,y)\big) = \sqrt{x^2 + y^2} = \sqrt{x^2 + x} < \delta$$
but
$$d\big(f(0,0), f(x,y)\big) = \frac{xy^2}{x^2 + y^4} = \frac{x^2}{2x^2} = \frac{1}{2} > \varepsilon$$
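The computation above can be seen numerically (an illustration of mine): along the curve $y = \sqrt{x}$ the function is identically $1/2$, no matter how close we come to the origin.

    # Exercise 4.7: f(x, y) = x*y^2 / (x^2 + y^4) equals 1/2 on the curve
    # y = sqrt(x), so it cannot tend to f(0, 0) = 0 at the origin.

    import math

    def f(x, y):
        if (x, y) == (0.0, 0.0):
            return 0.0
        return x * y * y / (x * x + y ** 4)

    for x in (1e-2, 1e-4, 1e-8):
        print(x, f(x, math.sqrt(x)))     # stays at 0.5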

The restriction of f to a straight line is continuous

Any straight line that doesn't pass through $(0,0)$ avoids the only problem point, the origin; it's trivial but tedious to show that the restriction of $f$ to such a line is continuous. So we need only consider lines through the origin, that is, lines of the form $y = cx$ or $x = 0$ for some constant $c$.

For the line $y = cx$ (with $c \ne 0$; the case $c = 0$ gives $f(x, 0) = 0$, which is constant): let $\varepsilon > 0$ be given and let $\delta = \varepsilon/c^2$. Choose $x$ such that $0 < |x| < \delta$. Then:
$$d\big(f(0,0), f(x, cx)\big) = \left|\frac{x(cx)^2}{x^2 + (cx)^4}\right| = \left|\frac{c^2 x^3}{x^2 + c^4 x^4}\right| = \left|\frac{c^2 x}{1 + c^4 x^2}\right| \le |c^2 x| < |c^2 \delta| = \varepsilon$$
And $\varepsilon$ was arbitrary, so this proves that the restriction of $f$ to this line is continuous at the origin.

For the line $x = 0$: let $\varepsilon > 0$ be given and choose any $y \ne 0$. Regardless of our choice of $\delta$, we have
$$d\big(f(0,0), f(0, y)\big) = \left|\frac{0}{y^4}\right| = 0 < \varepsilon$$
And $\varepsilon$ was arbitrary, so this proves that the restriction of $f$ to this line is continuous.

The restriction of g to a straight line is continuous

The proof for the continuity of the restriction of g is almost identical to that of the continuity of the restrictionof f .


Exercise 4.8

To show that $f$ is bounded on $E$, we need to find some $M$ such that $|f(p)| < M$ for every $p \in E$. We're told that $E$ is a bounded subset of $\mathbb{R}^1$, so $E \subseteq [\alpha, \beta]$ for some $\alpha, \beta$. Choose any $\varepsilon > 0$. From the definition of uniform continuity there exists some $\delta > 0$ such that, for all $p, q \in E$,
$$d(p, q) < \delta \to d(f(p), f(q)) < \varepsilon$$
Let $n$ be the smallest integer such that $n\delta > \beta - \alpha$ (which exists by the Archimedean property of the reals) and divide $[\alpha, \beta]$ into $n$ consecutive intervals $I_1, \dots, I_n$, each of length $(\beta-\alpha)/n < \delta$. From each interval $I_k$ that meets $E$, choose a point $q_k \in I_k \cap E$. Any $p \in E$ lies in some such $I_k$, and then $d(p, q_k) < \delta$, so
$$|f(p)| \le |f(q_k)| + d(f(p), f(q_k)) < \max_k |f(q_k)| + \varepsilon$$
The maximum runs over finitely many points, so the right-hand side is a finite bound $M$, and by definition 4.13 $f$ is bounded on $E$.

if $E$ is not bounded

If $E$ is not bounded, then it cannot be enclosed in a finite interval $[\alpha, \beta]$ and the finite subdivision above is impossible. For an example of a real uniformly continuous function on an unbounded set that is not bounded, we can simply take $f(x) = x$ on $\mathbb{R}^1$.

Exercise 4.9

Definition 4.18 says that $f$ is uniformly continuous on $X$ if for every $\varepsilon > 0$ there exists $\delta > 0$ such that
$$d_X(p, q) < \delta \to d_Y(f(p), f(q)) < \varepsilon \qquad (9)$$
We want to show that this condition holds iff for every $\varepsilon > 0$ there exists $\delta > 0$ such that, for every nonempty $E \subset X$,
$$\operatorname{diam} E < \delta \to \operatorname{diam} f(E) < \varepsilon \qquad (10)$$

(10) implies (9)

Let (10) hold, take the corresponding $\delta$ for a given $\varepsilon$, and suppose $d(p, q) < \delta$.

$d(p,q) < \delta$  (assumed)
$\to\ \operatorname{diam}\{p, q\} < \delta$  (let $E$ be the set containing only $p$ and $q$)
$\to\ \operatorname{diam}\{f(p), f(q)\} < \varepsilon$  (from (10))
$\to\ d(f(p), f(q)) < \varepsilon$  (definition of diameter)

Therefore (9) holds.

(9) implies (10)

Let (9) hold, and for a given $\varepsilon$ take the $\delta$ that (9) provides for $\varepsilon/2$. Let $E$ be a nonempty subset of $X$ and suppose $\operatorname{diam} E < \delta$.

$\operatorname{diam} E < \delta$  (assumed)
$\to\ (\forall p, q \in E)(d(p,q) < \delta)$  (definition of diameter)
$\to\ (\forall p, q \in E)(d(f(p), f(q)) < \varepsilon/2)$  (from (9))
$\to\ (\forall f(p), f(q) \in f(E))(d(f(p), f(q)) < \varepsilon/2)$  ($p \in E \to f(p) \in f(E)$)
$\to\ \operatorname{diam} f(E) \le \varepsilon/2 < \varepsilon$  (definition of diameter)

Therefore (10) holds.


Exercise 4.10

We want to prove the contrapositive of theorem 4.19: if $f$ is not uniformly continuous on $X$, then either $X$ is not a compact metric space or $f$ is not a continuous function.

If $f$ is not uniformly continuous, then negating definition 4.18 tells us that there exists some $\varepsilon$ such that, for every $\delta > 0$, we can find $p, q \in X$ with $d(p,q) < \delta$ but $d(f(p), f(q)) \ge \varepsilon$. We'll be contradicting this claim, so we restate it formally:
$$(\exists \varepsilon > 0)(\forall \delta > 0)(\exists p, q \in X)\ \big(d(p,q) < \delta \ \wedge\ d(f(p), f(q)) \ge \varepsilon\big) \qquad (11)$$
Since we can find such $p, q$ for every value of $\delta$, we can set $\delta_n = \frac{1}{n}$ and construct two sequences $\{p_n\}$ and $\{q_n\}$ where $d(p_n, q_n) < \frac{1}{n}$ but $d(f(p_n), f(q_n)) \ge \varepsilon$.

If the set $X$ is compact (supposition 1), the sequence $\{p_n\}$ must have a subsequence $\{p_{n_k}\}$ converging to some $p \in X$ (theorem 3.6a or 2.37). And since $d(p_{n_k}, q_{n_k}) < \frac{1}{n_k}$, the subsequence $\{q_{n_k}\}$ also converges to $p$, because
$$d(q_{n_k}, p) \le d(q_{n_k}, p_{n_k}) + d(p_{n_k}, p) < \frac{1}{n_k} + d(p_{n_k}, p) \to 0$$
If the function $f$ is continuous (supposition 2), then by theorem 4.2 both $\{f(p_{n_k})\}$ and $\{f(q_{n_k})\}$ converge to $f(p)$, so for sufficiently large $k$,
$$d(f(p_{n_k}), f(q_{n_k})) \le d(f(p_{n_k}), f(p)) + d(f(p), f(q_{n_k})) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$$
which contradicts (11).

We've established a contradiction, so (given that $f$ is not uniformly continuous) one of our suppositions must fail: either $X$ is not compact or $f$ is not continuous. By contrapositive, if $X$ is compact and $f$ is continuous then $f$ is uniformly continuous, which is what we were asked to prove.

Exercise 4.11

Choose an arbitrary $\varepsilon > 0$. We're told that $f$ is uniformly continuous, so
$$(\exists \delta > 0) : d(x_m, x_n) < \delta \to d(f(x_m), f(x_n)) < \varepsilon$$
And we're told that $\{x_n\}$ is Cauchy, so
$$(\exists N) : n, m > N \to d(x_m, x_n) < \delta$$
Together, these two conditional statements tell us that, for any $\varepsilon$,
$$(\exists N) : n, m > N \to d(f(x_m), f(x_n)) < \varepsilon$$
which, by definition, shows that $\{f(x_n)\}$ is Cauchy. See exercise 4.13 for the proof that uses this result.

Exercise 4.12

We're asked to prove that the composition of uniformly continuous functions is uniformly continuous. Let $f : X \to Y$ and $g : Y \to Z$ be uniformly continuous functions, and choose an arbitrary $\varepsilon > 0$. Because $g$ is uniformly continuous, we have
$$(\exists \alpha > 0) : d(y_1, y_2) < \alpha \to d(g(y_1), g(y_2)) < \varepsilon$$
Because $f$ is uniformly continuous, we have
$$(\exists \delta > 0) : d(x_1, x_2) < \delta \to d(f(x_1), f(x_2)) < \alpha$$
Since $f(x_1), f(x_2)$ are elements of $Y$, these two conditional statements together tell us that, for arbitrary $\varepsilon > 0$,
$$(\exists \delta > 0) : d(x_1, x_2) < \delta \to d(f(x_1), f(x_2)) < \alpha \to d(g(f(x_1)), g(f(x_2))) < \varepsilon$$
which, by definition, means that the composite function $(g \circ f) : X \to Z$ is uniformly continuous.


Exercise 4.13 (Proof using exercise 4.11)

For every $p \in X$ it is either the case that $p \in E$ or $p \notin E$. If $p \notin E$, then $p$ is a limit point of $E$ (because $E$ is dense), and therefore we can construct a sequence $\{e_n\}$ of points of $E$ that converges to $p$. We now define the function $g : X \to Y$ as
$$g(p) = \begin{cases} f(p), & \text{if } p \in E \\ \lim_{n\to\infty} f(e_n), & \text{if } p \notin E, \text{ where } \{e_n\} \subset E \text{ and } e_n \to p \end{cases}$$
We need to prove that this is actually a well-defined function and that it's continuous.

The function $g$ is well-defined

It's clear that every element of $X$ is mapped to at least one element of $Y$ (granting, for the moment, that the limit in the second case exists), but it's not immediately clear that each element of $X$ is mapped to only one element of $Y$. We need to show that if $\{p_n\}$ and $\{q_n\}$ are sequences in $E$ that converge to $p$, then $\{f(p_n)\}$ and $\{f(q_n)\}$ both converge, and to the same element.

Let $\{p_n\}$ and $\{q_n\}$ be arbitrary sequences in $E$ that converge to $p$. From the definition of convergence, we know that
$$(\forall \delta > 0)(\exists N) : n > N \to d(p_n, p) < \frac{\delta}{2}$$
$$(\forall \delta > 0)(\exists M) : m > M \to d(q_m, p) < \frac{\delta}{2}$$
We can then use the triangle inequality to show that
$$(\forall \delta > 0)(\exists M, N) : n > \max\{M, N\} \to d(p_n, q_n) \le d(p_n, p) + d(p, q_n) < \delta \qquad (12)$$
We know that the function $f$ is uniformly continuous on $E$, so for any $\varepsilon > 0$ there exists some $\delta > 0$ such that
$$d(p_n, q_n) < \delta \to d(f(p_n), f(q_n)) < \varepsilon \qquad (13)$$
Combining (12) and (13), we see that for any $\varepsilon > 0$ we can find some integer $N$ such that
$$n > N \to d(p_n, q_n) < \delta \to d(f(p_n), f(q_n)) < \varepsilon$$
By exercise 4.11, each of $\{f(p_n)\}$ and $\{f(q_n)\}$ is a Cauchy sequence, and the estimate above tells us that they can't converge to different points. But without knowing more about $Y$, it's entirely possible that they are merely Cauchy without actually converging to any point of $Y$ at all. The exercise asks us to take $Y = \mathbb{R}^1$, but we'll try for a slightly more general solution and assume only that $Y$ is a complete metric space (see definition 3.12). This allows us to conclude that $\{f(p_n)\}$ and $\{f(q_n)\}$ both converge, and to the same point of $Y$; we call this point $g(p)$. Therefore the function $g$ has a unique value for every $p \in X$.


The function g is continuous

$\mapsto\ d(p, q) < \delta/2$  (assumed)

Let $\{p_n\}$ be a sequence in $E$ that converges to $p$ and let $\{q_n\}$ be a sequence in $E$ that converges to $q$ (if $p \in E$ we may take $p_n = p$ for all $n$, and similarly for $q$).

$\to\ d(p_n, q_n) \le d(p_n, p) + d(p, q) + d(q, q_n)$
$\to\ d(p_n, q_n) < d(p_n, p) + \delta/2 + d(q, q_n)$  (from our initial assumption)

This holds for every $n$, in particular for $n$ sufficiently large that $d(q, q_n) < \delta/4$ and $d(p, p_n) < \delta/4$, in which case we have

$\to\ d(p_n, q_n) < \delta$

We're told that $f$ is uniformly continuous on $E$, so we can make $d(f(p_n), f(q_n))$ arbitrarily small by making $d(p_n, q_n)$ small. And we can make $\delta$ arbitrarily small, so for any $\varepsilon$ we can choose $\delta$ (and then $n$ large) such that

$\to\ d(f(p_n), f(q_n)) < \varepsilon/2$
$\to\ d(g(p), g(q)) \le d(g(p), f(p_n)) + d(f(p_n), f(q_n)) + d(f(q_n), g(q))$  (triangle inequality; note $g(p_n) = f(p_n)$ for $p_n \in E$)
$\to\ d(g(p), g(q)) < d(g(p), f(p_n)) + \varepsilon/2 + d(f(q_n), g(q))$  (established bound for $d(f(p_n), f(q_n))$)

This holds for every such $n$, in particular for $n$ sufficiently large that $d(g(p), f(p_n)) < \varepsilon/4$ and $d(g(q), f(q_n)) < \varepsilon/4$ (we know that $\{f(p_n)\}$ converges to $g(p)$ and that $\{f(q_n)\}$ converges to $g(q)$ because we defined $g(p)$ and $g(q)$ specifically so that this would be true). For such values of $n$ we have
$$d(g(p), g(q)) < \varepsilon$$
This shows that we can constrain the range of $g$ by constraining its domain, and therefore $g$ is continuous.

Could we replace the range with Rk or any compact metric space? By any complete metric space?

Yes. By theorem 3.11 (see the discussion around definition 3.12), all compact metric spaces and all Euclidean spaces $\mathbb{R}^k$ are complete metric spaces. Our proof didn't depend on the range of $g$ being $\mathbb{R}^1$: we assumed only that the range was a complete metric space.

Could we replace the range with any metric space?

No. Consider the function $f(x) = \frac{1}{x}$ with domain $E = \left\{\frac{1}{n} : n \in \mathbb{N}\right\}$. The set $E$ is dense in $X = E \cup \{0\}$. This function is continuous at every $x \in E$ (we can find a neighborhood $N_\delta(x)$ that contains only $x$, which guarantees that $d(f(y), f(x)) = 0$ for every $y \in N_\delta(x) \cap E$). However, there is no way to define $f(0)$ that makes the extension continuous. This can be seen intuitively by recognizing that the range of $f(x) = 1/x$ is unbounded on every neighborhood of $x = 0$. If you're satisfied with an "intuitive proof", we're done. Otherwise, buckle up and get ready for a few more paragraphs of inequalities and Greek letters.

To prove that no extension is continuous at $x = 0$, choose any neighborhood around 0 with radius $\delta$. We can find an integer $n$ such that $n > \frac{1}{\delta}$ (Archimedean property of the reals), so that $d(\frac{1}{n}, 0) < \delta$; for any positive integer $k$ we also have $d(\frac{1}{n+k}, 0) < \delta$. Now, by the triangle inequality, we have
$$d\left[f\left(\tfrac{1}{n}\right), f\left(\tfrac{1}{n+k}\right)\right] \le d\left[f\left(\tfrac{1}{n}\right), f(0)\right] + d\left[f(0), f\left(\tfrac{1}{n+k}\right)\right]$$
We haven't specified a value for $f(0)$ yet, but we do know that $|f(1/n) - f(1/(n+k))| = |n - (n+k)| = k$. The inequality above is therefore equivalent to
$$k \le d\left[f\left(\tfrac{1}{n}\right), f(0)\right] + d\left[f(0), f\left(\tfrac{1}{n+k}\right)\right]$$
Every term of this inequality is nonnegative, so one of the two terms on the right-hand side must be $\ge k/2$. And this is true for arbitrarily large $k$ and arbitrarily small $\delta$.

We can now show that $f$ is not continuous at $x = 0$. Choose any $\varepsilon > 0$. For every possible choice of $\delta > 0$ we can find integers $n > \frac{1}{\delta}$ and $k > 2\varepsilon$, so that by the previous argument either $d(f(\frac{1}{n}), f(0)) \ge k/2 > \varepsilon$ or $d(f(\frac{1}{n+k}), f(0)) \ge k/2 > \varepsilon$, even though both $\frac{1}{n}$ and $\frac{1}{n+k}$ lie within $\delta$ of 0. By definition 4.5 of "continuous", this proves that $f$ is not continuous at 0. And we never specified the value of $f(0)$, so we've proven that $f$ will be discontinuous however we define $f(0)$. This means that there is no continuous extension of $f$ from $E$ to $X$.

Exercise 4.14 proof 1

If $f(0) = 0$ or $f(1) = 1$, we're done. Otherwise, we know that $f(0) > 0$ and $f(1) < 1$.

We'll now inductively define a sequence of intervals $\{[a_i, b_i]\}$ as follows³: Let $[a_0, b_0] = [0, 1]$. Now assume that we have defined $[a_i, b_i]$ for $i = 0, 1, \dots, n$, with $f(a_n) \ge a_n$ and $f(b_n) \le b_n$. Let $m = (a_n + b_n)/2$. We define
$$[a_{n+1}, b_{n+1}] = \begin{cases} [a_n, m], & \text{if } f(m) \le m \\ [m, b_n], & \text{if } f(m) \ge m \end{cases}$$
(This procedure is underdetermined when $f(m) = m$, but in that case we've already found the point we're looking for.) The interval $[a_{n+1}, b_{n+1}]$ has diameter $2^{-(n+1)}$, with $f(a_{n+1}) \ge a_{n+1}$ and $f(b_{n+1}) \le b_{n+1}$. We therefore have a nested sequence of compact sets $\{[a_n, b_n]\}$ whose diameters converge to zero, so by exercise 3.21 we know that $\bigcap [a_n, b_n]$ consists of exactly one point $i \in I$. Our method of constructing the intervals, together with the continuity of $f$, guarantees that $f(i) \le i$ and $f(i) \ge i$; therefore $f(i) = i$, and this is the point we were asked to find.
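Proof 1 is effectively the bisection method, and it can be run directly. The sketch below is my own illustration; the particular map $f(x) = (\cos x + x^2)/2$, which carries $[0,1]$ into $[0,1]$, is an arbitrary choice.

    # Bisection scheme from proof 1 of exercise 4.14: keep halving [a, b]
    # while preserving f(a) >= a and f(b) <= b; the intervals shrink to a
    # fixed point of f.

    import math

    def fixed_point(f, tol=1e-12):
        a, b = 0.0, 1.0
        if f(a) == a:
            return a
        if f(b) == b:
            return b
        while b - a > tol:
            m = 0.5 * (a + b)
            if f(m) <= m:
                b = m        # keep [a, m]: the new right end satisfies f(b) <= b
            else:
                a = m        # keep [m, b]: the new left end satisfies f(a) >= a
        return 0.5 * (a + b)

    f = lambda x: 0.5 * (math.cos(x) + x * x)    # a continuous map of [0,1] into [0,1]
    x = fixed_point(f)
    print(x, f(x))                               # f(x) is approximately x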

Exercise 4.14 proof 2

This proof, which is much clearer and more concise than mine, was taken from the homework of Men-Gen Tsai([email protected]).

Define a new function g(x) = f(x) − x. Both f(x) and x are continuous, so g(x) is continuous by theorem4.9. If g(0) = 0 or g(1) = 0, then we’re done. Otherwise, we have g(0) > 0 and g(1) < 0. Therefore by theorem4.23 there must be some intermediate point in the interval [0, 1] for which g(x) = 0: at this point, f(x) = x.

Exercise 4.15

Although the details of this proof might be ugly, the general idea is simple. If the function $f$ is continuous but not monotonic, then we can find some open interval on which $f$ attains a local maximum or minimum. That extreme value is not an interior point of the image of the interval, so the image is not an open set.

Let f : R1 → R1 be a continuous mapping and assume that f is not monotonically increasing or decreasing.Then we can find x1 < x2 < x3 such that f(x2) > f(x1) and f(x2) > f(x3), or such that f(x2) < f(x1) andf(x2) < f(x3). The proofs for either case are analogous, so we’ll focus only on the case that f(x2) > f(x1) andf(x2) > f(x3).

We can't assume that the supremum of $f((x_1, x_3))$ occurs at $f(x_2)$: there could be some $x_{2.5}$ for which $f(x_{2.5}) > f(x_2)$. We want to construct a closed subinterval $[x_a, x_b] \subset (x_1, x_3)$ containing the $x$ at which $f$ attains its local maximum. To do this, we define $\varepsilon$ to be
$$\varepsilon = \min\{\,d(f(x_2), f(x_1)),\ d(f(x_2), f(x_3))\,\}$$
Because $f$ is continuous, we know that there exist some $\delta_a$ and $\delta_b$ such that
$$d(x_1, x) < \delta_a \to d(f(x_1), f(x)) < \varepsilon$$
$$d(x_3, x) < \delta_b \to d(f(x_3), f(x)) < \varepsilon$$
(We may also take $\delta_a < x_2 - x_1$ and $\delta_b < x_3 - x_2$, so that $x_2$ will lie in the subinterval constructed below.)

³Our method of defining this sequence is similar to Newton's method or the bisection method of root finding.


We can now let $x_a = x_1 + \frac{1}{2}\delta_a$ and $x_b = x_3 - \frac{1}{2}\delta_b$. We know that $f$ doesn't attain its local maximum at any $x$ in the intervals $(x_1, x_a)$ or $(x_b, x_3)$, because every such $x$ is sufficiently close (within $\delta_a$ or $\delta_b$) to $x_1$ or $x_3$ that $f(x)$ lies within $\varepsilon$ of $f(x_1)$ or $f(x_3)$, and hence below $f(x_2)$. And $[x_a, x_b]$ is a closed and bounded subset of $\mathbb{R}^1$ and is therefore compact, so $f([x_a, x_b])$ is a closed and bounded subset of $\mathbb{R}^1$ (theorem 4.15); therefore $f([x_a, x_b])$ contains its supremum, which is attained at some point of $[x_a, x_b]$ (it may or may not occur at $x_2$; we only care that it is attained). This is all getting a bit muddled, so to recap our results so far:

1) $[x_a, x_b] \subset (x_1, x_3)$
2) $f([x_a, x_b]) \subset f((x_1, x_3))$
3) $x \in (x_1, x_a) \cup (x_b, x_3) \to f(x) < f(x_2)$
4) $f$ attains its maximum over $[x_a, x_b]$ at some $x \in [x_a, x_b]$, and (since $x_2 \in [x_a, x_b]$) this maximum is at least $f(x_2)$

Together these four facts tell us that $f((x_1, x_3))$ contains its own supremum, attained at some $x \in [x_a, x_b]$. But the supremum of a set of reals can't be an interior point of the set, so $f((x_1, x_3))$ is not open even though $(x_1, x_3)$ is open, and $f$ is not an open mapping. We've shown that "$f$ is not monotonic" implies "$f$ is not a continuous open mapping from $\mathbb{R}^1$ to $\mathbb{R}^1$". By contrapositive, then, if $f$ is a continuous open mapping from $\mathbb{R}^1$ to $\mathbb{R}^1$ then $f$ is monotonic. And this is what we were asked to prove.

Exercise 4.16

The functions $[x]$ and $(x)$ are both discontinuous at every integer $x$. If $x$ is an integer then, for every $0 < \delta < \frac{1}{2}$, we have
$$d([x-\delta], [x]) = |(x-1) - x| = 1$$
$$d((x-\delta), (x)) = |(1-\delta) - 0| = 1 - \delta > \tfrac{1}{2}$$
This shows that we can't constrain the range of these functions by constraining the domain to a small neighborhood of an integer $x$, which by definition means that both functions are discontinuous at integer values of $x$. (At every non-integer $x$ both functions are continuous, since $[x]$ is constant on a neighborhood of $x$ and $(x) = x - [x]$ there.)

Note that the sum of these two discontinuous functions, $[x] + (x)$, is the continuous function $f(x) = x$.

Exercise 4.17

Define E as in the hint. To the three criteria listed in the exercise, we add a fourth:

(d) a < q < x < r < b

Lemma 1: For each x ∈ E we can find p, q, r such that the four criteria are met

We're assuming that $f(x-) < f(x+)$, so by the density of $\mathbb{Q}$ in $\mathbb{R}$ (theorem 1.20b) we can find some rational $p$ such that $f(x-) < p < f(x+)$. From our definition of the set $E$ we know that $f(x-)$ exists for all $x \in E$, so we can find some neighborhood $N_\delta(x)$ containing a rational $q$ that fulfills criterion (b). More formally,
$$(\exists q \in N_\delta(x) \cap (a, x))\ \big(a < q < x \ \wedge\ [\,q < t < x \to f(t) < p\,]\big)$$
Similarly, the existence of $f(x+)$ tells us that there is some neighborhood $N_{\delta_2}(x)$ containing a rational $r$ such that
$$(\exists r \in N_{\delta_2}(x) \cap (x, b))\ \big(x < r < b \ \wedge\ [\,x < t < r \to f(t) > p\,]\big)$$
Therefore each $x \in E$ can be associated with at least one rational triple $(p, q, r)$ with $q < x < r$.

Lemma 2: If a given (p, q, r) triple fulfills the criteria for x and y, then x = y

Suppose x, y are two elements of E that both meet the four criteria. Assume, without loss of generality, thatx < y. The density of the reals guarantees that there exists some rational w such that x < w < y. But thismeans that f(w) < p (by criteria (b) since q < w < y) and that f(w) > p (by criteria (c) since x < w < r).Reversing the roles of x and y establishes the same results for the case of x > y. This is clearly a contradiction,so one of our assumptions must be wrong: for any given triple (p, q, r) it’s not possible to find two distinctelements of E that fulfill all four criteria. Note that we haven’t proven that each triple can be associated witha unique x ∈ E: we’ve only shown that each triple can be associated with at most one x ∈ E.


Lemma 3: E is at most countable

The set of rational triples is countable (theorem 2.13). Lemma 2 tells us that we can create a function fromthe set of triples onto E, so the cardinality of E is not more than the cardinality of the set of rational triples.Therefore the cardinality of E is at most countable.

Lemma 4: F is at most countable

If we define F to be the set of x for which f(x+) < f(x−) we can prove that F is at most countable with onlytrivial modifications to the previous lemmas.

Lemma 5: $G$ is at most countable

Define $G$ to be the set of $x$ for which $f(x+) = f(x-) = \alpha_x$ but $f(x) \ne \alpha_x$, and split $G$ into $G_1 = \{x \in G : f(x) < \alpha_x\}$ and $G_2 = \{x \in G : f(x) > \alpha_x\}$. For each $x \in G_1$, choose a rational $p$ with $f(x) < p < \alpha_x$; since both one-sided limits equal $\alpha_x > p$, we can also choose rationals $q, r$ with $a < q < x < r < b$ such that $f(t) > p$ whenever $t \in (q, x) \cup (x, r)$. If two distinct points $x < y$ of $G_1$ were both associated with the same triple $(p, q, r)$, then $x \in (q, y)$ and $x \ne y$, so the property chosen for $y$ gives $f(x) > p$; but we chose $p$ for $x$ so that $f(x) < p$, a contradiction. So each rational triple is associated with at most one element of $G_1$, and (as in lemma 3) $G_1$ is at most countable. The same argument with the inequalities reversed shows that $G_2$ is at most countable, so $G = G_1 \cup G_2$ is at most countable.

The sets $E$, $F$, and $G$ exhaust all the types of simple discontinuity, and all of these sets are at most countable. Therefore their union is at most countable (theorem 2.12). And this is what we were asked to prove.

Exercise 4.18

f is continuous at every irrational point

Let $x$ be irrational. Choose any $\varepsilon > 0$ and let $n$ be the smallest integer such that $n > \frac{1}{\varepsilon}$. If we want to constrain the values of $f$ on a neighborhood of $x$ to $f(x) \pm \varepsilon$, we need to find a neighborhood of $x$ that contains no rational numbers from the set
$$\left\{\frac{p}{q} : p, q \in \mathbb{Z},\ 1 \le q \le n\right\}$$
For each $i \le n$ the product $xi$ is irrational, so there exists some integer $m_i$ such that
$$m_i < xi < m_i + 1, \qquad\text{which means that}\qquad \frac{m_i}{i} < x < \frac{m_i + 1}{i}$$
So we can define a neighborhood of $x$ with radius $\delta_i$, where
$$\delta_i = \min\left\{\,d\!\left(x, \frac{m_i}{i}\right),\ d\!\left(x, \frac{m_i+1}{i}\right)\right\}$$
The neighborhood $(x - \delta_i, x + \delta_i)$ contains no rationals of the form $\frac{k}{i}$. So if we construct the radii $\delta_1, \delta_2, \dots, \delta_n$ and let $\delta = \min\{\delta_i\}$ (this minimum exists and is positive because $n$ is finite), we will have constructed a neighborhood $(x - \delta, x + \delta)$ that contains no rational numbers with denominator $\le n$. Therefore, for every rational $y$ in this neighborhood, written in lowest terms as $p/q$ with $q > n$, we have
$$d(x, y) < \delta \to d(f(x), f(y)) = \frac{1}{q} < \frac{1}{n} < \varepsilon$$
and for irrational $y$ we have
$$d(x, y) < \delta \to d(f(x), f(y)) = 0 < \varepsilon$$
which, by definition, means that $f$ is continuous at $x$.


f has a simple discontinuity at every rational point

Let x be a rational point. If we follow the previous method of constructing δ we immediately see that f(x+) =f(x−) = 0. When x is rational, though, we no longer have f(x) = 0 and therefore f has a simple discontinuityat x.
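The behaviour proved here can be observed numerically. The sketch below is my own illustration: it represents rationals exactly with Python's Fraction type and checks that every rational within $10^{-3}$ of the irrational point $\sqrt{2}$ has a fairly large denominator, so $f$ is uniformly small on that whole neighbourhood; the window $10^{-3}$ and the denominator cutoff 200 are arbitrary choices.

    # Exercise 4.18 ("ruler" function): f(m/n) = 1/n in lowest terms, f = 0
    # at irrationals.  Near sqrt(2), nearby rationals need big denominators.

    from fractions import Fraction
    import math

    def f(x):
        if isinstance(x, Fraction):
            return Fraction(1, x.denominator)    # Fraction is stored in lowest terms
        return 0                                 # treat non-Fraction input as irrational

    target = math.sqrt(2)
    nearby = [Fraction(p, q) for q in range(1, 200) for p in range(1, 2 * q)
              if abs(p / q - target) < 1e-3]
    print(max(f(r) for r in nearby))             # 1/29 for these parameters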

Exercise 4.19

Suppose that $f$ is discontinuous at some point $p$. By definition 4.5 of continuity, there exists some $\varepsilon > 0$ such that, for every $\delta > 0$, we can find some $x$ with $d(x, p) < \delta$ but $d(f(x), f(p)) \ge \varepsilon$. Therefore we can construct a sequence $\{x_n\}$ whose terms get closer to $p$ even though every term of $\{f(x_n)\}$ differs from $f(p)$ by at least $\varepsilon$: choose each $x_n$ so that $d(p, x_n) < \frac{1}{n}$ but $d(f(p), f(x_n)) \ge \varepsilon$. By construction, $\lim x_n = p$ while $d(f(x_n), f(p)) \ge \varepsilon$ for all $n$.

Assume, without loss of generality, that $f(x_n) < f(p)$ for every term of $\{x_n\}$ (see note 1 for justification). Then in fact $f(x_n) \le f(p) - \varepsilon$ for all $n$, so by the density of the rationals we can find a rational number $r$ with $f(p) - \varepsilon < r < f(p)$, and hence $f(x_n) < r < f(p)$ for every $n$. We're told that $f$ has the intermediate value property, so for each $x_n$ there is some $y_n$ between $x_n$ and $p$ such that $f(y_n) = r$. The sequence $\{y_n\}$ converges to $p$ (each $y_n$ is squeezed between $x_n$ and $p$), and every $y_n$ lies in the set $\{x : f(x) = r\}$, which we're told is closed. Therefore $p$, being the limit of points of this closed set, is an element of it, which means that $f(p) = r$. And this is a contradiction, since we specifically chose $r$ so that $r \ne f(p)$.

We've established a contradiction, so our initial supposition must be false: $f$ is not discontinuous at any point $p$. Therefore $f$ is continuous.

We’ve established a contradiction, so our initial supposition must be false: f is not discontinous at any pointp. Therefore f is continuous.

Note 1: justification for WLOG claim

For each term of the sequence {f(xn)} either f(p) < f(xn) or f(xn) < f(p). Therefore {f(xn)} either containsan infinite subsequence where f(xn) < f(p) for all n or f(xn) > f(p) for all n (or both). If it contains an infinitesubsequence where f(xn) < f(p) for all n then we use this subsequence in place of {xn} in the proof. If it doesnot contain such a subsequence, then it contains an infinite subsequence where f(xn) > f(p) for all n: the prooffor this case requires only the trivial twiddling of a few inequality signs. Therefore we can assume, without lossof generality, that f(xn) < f(p) for each term of {xn}.

Exercise 4.20a

If $x \notin \overline{E}$ then $x$ is an element of the open set $(\overline{E})^C$, so there is some radius $\varepsilon$ such that $d(x, y) < \varepsilon \to y \notin E$, and therefore $\rho_E(x) = \inf_{z \in E} d(x, z) \ge \varepsilon > 0$.

If $x \in \overline{E}$ then for every $\varepsilon > 0$ we can find some $z \in E$ such that $d(x, z) < \varepsilon$, and therefore $\rho_E(x) = \inf_{z \in E} d(x, z) = 0$.

Exercise 4.20b

Case 1: $x, y \in E$

If $x$ and $y$ are both elements of $E$ then from part (a) we have $\rho_E(x) = \rho_E(y) = 0$ and therefore
$$|\rho_E(x) - \rho_E(y)| = 0 \le d(x, y)$$

Case 2: exactly one of $x, y$ is in $E$ (say $y \in E$ and $x \notin E$; the other orientation is symmetric)

Then $\rho_E(y) = 0$ and $d(x, y) \in \{d(x, z) : z \in E\}$. Therefore
$$|\rho_E(x) - \rho_E(y)| = \rho_E(x) = \inf_{z \in E} d(x, z) \le d(x, y)$$

Case 3: $x, y \notin E$

For arbitrary $z \in E$ we have
$$\rho_E(x) \le d(x, z) \le d(x, y) + d(y, z)$$
This must hold for every choice of $z$. Given any $\varepsilon > 0$ we can choose $z \in E$ with $d(y, z) < \rho_E(y) + \varepsilon$, which gives
$$\rho_E(x) \le d(x, y) + \rho_E(y) + \varepsilon$$
Since $\varepsilon$ can be made arbitrarily small, this is equivalent to
$$\rho_E(x) \le d(x, y) + \rho_E(y), \qquad\text{that is,}\qquad \rho_E(x) - \rho_E(y) \le d(x, y)$$
By exchanging the roles of $x$ and $y$ we can similarly show that $\rho_E(y) - \rho_E(x) \le d(x, y)$. Together, these last two inequalities show that
$$|\rho_E(x) - \rho_E(y)| \le d(x, y)$$
These three cases exhaust the possibilities for $x$ and $y$. This shows that for every $\varepsilon > 0$ we have $|\rho_E(x) - \rho_E(y)| < \varepsilon$ whenever $d(x, y) < \varepsilon$, which is just definition 4.18 of uniform continuity with $\delta = \varepsilon$.

Exercise 4.21

$K$ is compact and $\rho_F$ is continuous on $K$ (exercise 4.20b), so $\rho_F(K)$ is compact, and therefore both closed and bounded (theorem 2.41). The result of exercise 4.20a tells us that $0$ is not an element of $\rho_F(K)$ (since $K \cap F = \emptyset$ and $F$ is closed, no point of $K$ lies in $\overline{F}$), and $0$ is not a limit point of $\rho_F(K)$ either (since $\rho_F(K)$ is closed). Therefore we can find some $\delta > 0$ such that no element of $\rho_F(K)$ lies in the interval $(-\delta, \delta)$.

Choose $p \in K$ and $q \in F$. Then $d(p, q) \ge \rho_F(p)$, and $\rho_F(p)$ is an element of $\rho_F(K)$, so $\rho_F(p) \ge \delta$. Therefore $d(p, q) \ge \delta > 0$ for all $p \in K$, $q \in F$.

The conclusion fails if neither set is compact

Consider the sets⁴
$$K = \left\{n + \frac{1}{n} : n \in \mathbb{N}\right\}, \qquad F = \{n : n \in \mathbb{N}\} \setminus \{2\}$$
Both sets are closed, neither is compact, and they are disjoint; yet for every $n \ge 3$ we can choose $p = n + \frac{1}{n} \in K$ and $q = n \in F$ with $d(p, q) = \frac{1}{n}$, so the infimum of the distances is 0.
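The vanishing distance can be checked directly (a small illustration of mine): the pair $(n + 1/n,\ n)$ has one point in each set, at distance $1/n$, so the infimum of the distances is 0 even though the sets are disjoint and closed.

    # Exercise 4.21 counterexample: K = {n + 1/n} and F = N \ {2} are
    # disjoint closed non-compact sets whose distance is 0.

    def gap(n):
        """Distance from n + 1/n (a point of K) to n (a point of F, n >= 3)."""
        return (n + 1.0 / n) - n

    print([gap(n) for n in (3, 10, 100, 10000)])   # 1/n shrinks toward 0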

Exercise 4.22

Exercise 4.20a shows us that $\rho_A(p) = 0$ iff $p \in \overline{A} = A$ and $\rho_B(p) = 0$ iff $p \in \overline{B} = B$ (both sets are closed). Since $A$ and $B$ are disjoint, $\rho_A(p)$ and $\rho_B(p)$ are never simultaneously zero, so the denominator $\rho_A(p) + \rho_B(p)$ never vanishes and $f = \rho_A/(\rho_A + \rho_B)$ is continuous by theorem 4.9. It is clear from the formula that $0 \le f(p) \le 1$, that $f(p) = 0$ iff $\rho_A(p) = 0$ iff $p \in A$, and that $f(p) = 1$ iff $\rho_B(p) = 0$ iff $p \in B$.

The intervals $[0, \frac{1}{2})$ and $(\frac{1}{2}, 1]$ are open in $[0, 1]$; therefore $V = f^{-1}([0, \frac{1}{2}))$ and $W = f^{-1}((\frac{1}{2}, 1])$ are open by theorem 4.8. The sets $V$ and $W$ are clearly disjoint, since $f(p)$ has only a single value for any given $p$. From the characterization of $A$ and $B$ above we have
$$A = f^{-1}(0) \subseteq f^{-1}\left(\left[0, \tfrac{1}{2}\right)\right) = V, \qquad B = f^{-1}(1) \subseteq f^{-1}\left(\left(\tfrac{1}{2}, 1\right]\right) = W$$

⁴This counterexample was taken from the homework of Men-Gen Tsai, [email protected]


Exercise 4.23a: proof of continuity

Choose $x \in (a, b)$. Assume without loss of generality that $f(x) \ne f(a)$ and $f(x) \ne f(b)$ (see below for justification of the WLOG). Choose some $\varepsilon > 0$, and choose $\varepsilon'$ such that
$$0 < \varepsilon' < \min\{\varepsilon,\ |f(b) - f(x)|,\ |f(a) - f(x)|\}$$
Choose $\delta$ such that
$$0 < \delta < \min\left\{\frac{\varepsilon'(b-x)}{|f(b)-f(x)|},\ \frac{\varepsilon'(x-a)}{|f(a)-f(x)|}\right\}$$
Define $\lambda_a, \lambda_b$ as
$$\lambda_a = 1 - \frac{\delta}{x-a}, \qquad \lambda_b = 1 - \frac{\delta}{b-x}$$
The way we selected $\varepsilon'$ and $\delta$ ensures that both of these $\lambda$ values are in the interval $(0,1)$. Notice that we can express $\delta$ in terms of $\lambda_a$ or $\lambda_b$ as
$$\delta = (1-\lambda_a)(x-a) = (1-\lambda_b)(b-x)$$
If we restrict the domain of $f$ to $(x-\delta, x+\delta)$ we have

$|f(x+\delta) - f(x)|$
$= |f(x + (1-\lambda_b)(b-x)) - f(x)|$  (from our definition of $\delta$)
$= |f(\lambda_b x + (1-\lambda_b)b) - f(x)|$  (algebraic rearrangement)
$\le |\lambda_b f(x) + (1-\lambda_b)f(b) - f(x)|$  (from the convexity of $f$)
$= |(1-\lambda_b)(f(b) - f(x))|$  (algebraic rearrangement)
$= \left|\dfrac{\delta}{b-x}(f(b) - f(x))\right|$  (from our definition of $\lambda_b$)
$< \dfrac{\varepsilon'(b-x)}{(b-x)\,|f(b)-f(x)|}\,|f(b) - f(x)|$  (from our definition of $\delta$)
$= \varepsilon'$  (algebra)
$< \varepsilon$  (from our definition of $\varepsilon'$)

and also

$|f(x-\delta) - f(x)|$
$= |f(x - (1-\lambda_a)(x-a)) - f(x)|$  (from our definition of $\delta$)
$= |f(\lambda_a x + (1-\lambda_a)a) - f(x)|$  (algebraic rearrangement)
$\le |\lambda_a f(x) + (1-\lambda_a)f(a) - f(x)|$  (from the convexity of $f$)
$= |(1-\lambda_a)(f(a) - f(x))|$  (algebraic rearrangement)
$= \left|\dfrac{\delta}{x-a}(f(a) - f(x))\right|$  (from our definition of $\lambda_a$)
$< \dfrac{\varepsilon'(x-a)}{(x-a)\,|f(a)-f(x)|}\,|f(a) - f(x)|$  (from our definition of $\delta$)
$= \varepsilon'$  (algebra)
$< \varepsilon$  (from our definition of $\varepsilon'$)

We chose an arbitrarily small $\varepsilon$ and showed how to find a nonzero upper bound for $\delta$ such that $d(x, y) < \delta \to d(f(x), f(y)) < \varepsilon$. By definition, this means that $f$ is continuous.

Justifying “without loss of generality”: special case of f(a) = f(x) or f(b) = f(x)

Let x be an arbitrary point in the interval (a, b). Choose α ∈ (a, x) such that f(α) 6= f(x): if this is not possible,then choose α = x. Choose β ∈ (x, b) such that f(β) 6= f(x): if this is not possible, then choose β = x.


If $\alpha < x < \beta$, we can use the previous method to prove that $f$ is continuous on an interval $(\alpha, \beta)$ containing $x$ such that $(\alpha, \beta) \subseteq (a, b)$. Since $x$ was an arbitrary element of $(a, b)$, this is sufficient to prove that $f$ is continuous on $(a, b)$.

If $\alpha = x < \beta$, we can use the previous method to prove that $f$ is continuous on an interval $(x, \beta)$ such that $(x, \beta) \subseteq (a, b)$. We only have $\alpha = x$ if there was no $\alpha \in (a, x)$ with $f(\alpha) \ne f(x)$: this can only happen if $f$ is constant (and therefore continuous) on the interval $(a, x]$. Therefore $f$ is continuous on the interval $(a, \beta)$ containing our arbitrary $x$; this is sufficient to prove that $f$ is continuous on $(a, b)$.

If $\alpha < x = \beta$, we can use the previous method to prove that $f$ is continuous on an interval $(\alpha, x)$ such that $(\alpha, x) \subseteq (a, b)$. We only have $\beta = x$ if there was no $\beta \in (x, b)$ with $f(\beta) \ne f(x)$: this can only happen if $f$ is constant (and therefore continuous) on the interval $[x, b)$. Therefore $f$ is continuous on the interval $(\alpha, b)$ containing our arbitrary $x$; this is sufficient to prove that $f$ is continuous on $(a, b)$.

If $\alpha = x = \beta$ then $f$ is a constant function on the entire interval $(a, b)$ and is therefore trivially continuous.

Exercise 4.23b: increasing convex functions of convex functions are convex

Let $g$ be an increasing convex function, let $f$ be a convex function, and define their composite $h = g \circ f$. For $x, y$ in the domain and $\lambda \in (0, 1)$:

$f(\lambda x + (1-\lambda)y) \le \lambda f(x) + (1-\lambda)f(y)$  (convexity of $f$)
$\to\ g(f(\lambda x + (1-\lambda)y)) \le g(\lambda f(x) + (1-\lambda)f(y))$  ($g$ is an increasing function)
$\to\ g(\lambda f(x) + (1-\lambda)f(y)) \le \lambda g(f(x)) + (1-\lambda)g(f(y))$  (convexity of $g$)
$\to\ g(f(\lambda x + (1-\lambda)y)) \le \lambda g(f(x)) + (1-\lambda)g(f(y))$  (transitivity)
$\to\ h(\lambda x + (1-\lambda)y) \le \lambda h(x) + (1-\lambda)h(y)$  (definition of $h$)

$x$ and $y$ were arbitrary, so this shows that $h$ is a convex function.

Exercise 4.23c: the ugly-looking inequality

We’re given that s < t < u, so we know that 0 < t− s < u− s. Define λ to be

λ =t− su− s

From this definition we immediately see that 0 < λ < 1. We can also see that

(1− λ) = 1− t− su− s

=u− tu− s

From this, we can establish an inequality:

λf(u) + (1− λ)f(s) ≥ f(λu+ (1− λ)s) from convexity of f

→ λf(u) + (1− λ)f(s) ≥ f(t−su−s u+ u−t

u−s s)

definition of λ, (1− λ)

→ λf(u) + (1− λ)f(s) ≥ f(ut−us+su−st

u−s

)algebra

→ λf(u) + (1− λ)f(s) ≥ f(t(u−s)u−s

)algebra

→ λf(u) + (1− λ)f(s) ≥ f(t) algebra

From this last inequality, we can derive two results.

λf(u) + (1− λ)f(s) ≥ f(t) previously derived

→ λf(u)− λf(s) ≥ f(t)− f(s) subtract f(s) from each side

→ t−su−s (f(u)− λf(s)) ≥ f(t)− f(s) definition of λ

68

Page 69: Rudin Solutions Ch 1 - 7

Dividing both sides of this last inequality by t− s gives us

f(u)− f(s)

u− s≥ f(t)− f(s)

t− s(14)

Similarly, we have:

λf(u) + (1− λ)f(s) ≥ f(t) previously derived

→ −λf(u)− (1− λ)f(s) ≤ −f(t) multiply both sides by −1

→ (1− λ)f(u)− (1− λ)f(s) ≤ f(u)− f(t) add f(u) to both sides

→ (1− λ)(f(u)− f(s)) ≤ f(u)− f(t) add f(u) to both sides

→ u−tu−s (f(u)− f(s)) ≤ f(u)− f(t) definition of λ

Dividing both sides of this last equation by u− t gives us

f(u)− f(s)

u− s≤ f(u)− f(t)

u− t(15)

Combining (14) and (15) gives us

f(t)− f(s)

t− s≤ f(u)− f(s)

u− s≤ f(u)− f(t)

u− t

which is what we were asked to prove.

Exercise 4.24

To prove that f is convex on (a, b) we must show that

(∀x, y ∈ (a, b), λ ∈ [0, 1]) : f(λx+ (1− λ)y) ≤ λf(x) + (1− λ)f(y) (16)

To do this, choose an arbitrary x, y ∈ (a, b) (this choice of x and y will remain fixed for the majority of thisproof). Let Λ represent the set of values for λ for which (16) holds. It’s immediately clear that 0 ∈ Λ and 1 ∈ Λ.We’re also told that f has the property

f

(x+ y

2

)≤ f(x) + f(y)

2for all x, y ∈ (a, b) (17)

which is just (16) with λ = 12 ; therefore 1

2 ∈ Λ. To prove that f is convex we must show that [0, 1] ⊂ Λ.

Lemma 1: If j, k ∈ Λ then (j + k)/2 ∈ Λ

Assume that j, k ∈ Λ and let m = j+k2 . Proof that m ∈ Λ:

f(mx+ (1−m)y) = f(j+k2 x+ 2−j−k

2 y)

definition of m

= f(jx+kx+2y−jy−ky

2

)algebra

= f(jx−jy+y

2 + kx−ky+y2

)algebra

= f(jx+(1−j)y

2 + kx+(1−k)y2

)algebra

j is in Λ, so j ∈ [0, 1], so (jx+ (1− j)y) ∈ (a, b). The same is true for k. So we can apply (17).

≤ f(jx+(1−j)y)+f(kx+(1−k)y)2

≤ jf(x)+(1−j)f(y)+kf(x)+(1−k)f(y)2 We can apply (16) because j, k ∈ Λ

= j+k2 f(x) + 2−j−k

2 f(y) algebra

= mf(x) + (1−m)f(y) definition of m

This shows that f(mx+ (1−m)y) ≤ mf(x) + (1−m)f(y), therefore m ∈ Λ.

69

Page 70: Rudin Solutions Ch 1 - 7

Lemma 2: all rationals of the form m/2n with 0 ≤ m ≤ 2n are members of Λ

This can be proven by induction. Let E be the set of all n for which the lemma is true. We know that{0, 12 , 1} ⊂ Λ so we have 0 ∈ E, 1 ∈ E. Now assume that E contains 1, 2, . . . , k. Choose an arbitrary m suchthat 0 ≤ m ≤ 2k+1.

case 1: m is evenIf m is even then m = 2α for some α ∈ Z, 0 ≤ α ≤ 2k and therefore m/2k+1 = α/2k. By the hypothesis ofinduction α/2k ∈ Λ therefore m/2k+1 ∈ Λ.case 2: m is oddIf m is odd then (m− 1)/2 = α and (m+ 1)/2 = β for some α, β ∈ Z with α, β ∈ [0, 2k]. From this we have

m

2k+1=m+ 1

2k+2+m− 1

2k+2=

1

2

2k+

β

2k

)By the hypothesis of induction α/2k ∈ Λ and β/2k ∈ Λ. Therefore, by lemma 1, m/2k+1 ∈ Λ.

This covers all possible cases for m, so k + 1 ∈ E. This completes the inductive step, therefore E = N.

Lemma 3: every element of [0, 1] is a member of Λ

Choose any p ∈ [0, 1]. From lemma 2, we see that Λ is dense in [0, 1] so we can construct a sequence {λn} in Λsuch that limn→∞ λn = p. Each λn is an element of Λ, so for each n we have

f(λnx+ (1− λn)y) ≤ λnf(x) + (1− λn)f(y)

The function f is continuous, so by theorem 4.2 we can take the limit of both sides as n→∞ to get

f(px+ (1− p)y) ≤ pf(x) + (1− p)f(y)

which means that p ∈ Λ.

By lemma 3 we know that

(∀λ ∈ [0, 1])f(λx+ (1− λ)y) ≤ λf(x) + (1− λ)f(y)

But we chose x and y arbitrarily from the interval (a, b), so we have proven that

(∀x, y ∈ (a, b), λ ∈ [0, 1])f(λx+ (1− λ)y) ≤ λf(x) + (1− λ)f(y)

which is (16), the definition of convexity. This shows that f is convex, which is what we were asked to prove.

Exercise 4.25a

The “hint” given for this problem is actually a proof. The set F = z − C is clearly closed because z is a fixedelement and C is closed. The sets K and F are disjoint:

k ∈ K ∩ F assumed

→ k ∈ K ∧ k ∈ F def. of intersection

→ k ∈ K ∧ (∃c ∈ C)(k = z − c) def. of F

→ k ∈ K ∧ (∃c ∈ C)(k + c = z)

→ (∃k ∈ K, c ∈ C)k + c = z rearrangement of quantifiers

→ z ∈ K + C def. of K + C

From the definition of the set F , we see that (∀c ∈ C)(z − c ∈ F ). And we’ve established that F is closedand that and K is compact, so by exercise 21 we know that there exists some δ > 0 such that

70

Page 71: Rudin Solutions Ch 1 - 7

(∀c ∈ C, k ∈ K)(d(z − c, k) > δ) by exercise 21

→ (∀c ∈ C, k ∈ K)(|z − c− k| > δ) def. of distance function in Rk

→ (∀c ∈ C, k ∈ K)(|z − (c+ k)| > δ) algebra

→ (∀c ∈ C, k ∈ K)(d(z, c+ k) > δ) def. of distance function in Rk

The set of c+ k for all c ∈ C, k ∈ K is simply the set C +K: so we have

→ (∀y ∈ C +K)(d(z, y) > δ) def. of distance function in Rk

This shows us that z is an interior point of C +K. But z was an arbitrary point of C +K, so we have shownthat C +K is open. By theorem 2.23 the complement of an open set is closed, so C +K is closed. And this iswhat we were asked to prove.

Exercise 4.25b

Let α be an irrational number and let C1 + C2 be defined as the set

C1 + C2 = {m+ nα : m,n ∈ N} (18)

We’ll first prove that C1 +C2 is dense in [0, 1) and then extend this proof out to prove that C1 +C2 is dense inR1. Define the set ∆ to be the set of radii δ such that, for every x ∈ [0, 1), every neighborhood of radius > δcontains a point of C1 + C2. More formally, we define it to be

∆ = {δ : (∀x ∈ [0, 1))(Nδ(x) ∩ C1 + C2 6= ∅)

We want to prove that C1 + C2 is dense in [0, 1): we can do this by proving that ∆ has a greatest lower boundof 0.

Lemma 1: Each element of C1 + C2 has a unique representation of the form m+ nα

Assume that m+ nα and p+ qα are two ways of describing the same element of C1 + C2.

m+ nα = p+ qα assumption of equality

→ (m− p) + (n− q)α = 0 algebra

→ (n− q)α = (p−m) algebra

If both sides of this equation are zero, then n = q and p = m so that our two representations arenot unique. If both sides are nonzero, we can divide by p−m:

→ α = p−mn−q algebra

But m,n, p, q are integers so this last statement implies that α is a rational number. By contradiction, eachelement of C1 + C2 has a unique representation of the form m+ nα.

Lemma 2: For each n ∈ N, there is exactly one m ∈ N such that m+ nα ∈ [0, 1).

Assume that p+ nα and q + nα are both in the interval [0, 1) Then |(p+ nα)− (q + nα| = |p− q| ∈ [0, 1). Andp and q are both integers, so it must be the case that p = q.

Lemma 3: If d(x, y) = δ for any x, y ∈ [0, 1) ∩ C1 + C2 then δ ∈ ∆

Assume d(x, y) = δ for some 0 ≤ x < y < 1 with x, y ∈ C1 + C2. Then we have

d(x, y) = y − x = p+mα− q + nα = (p− q) + (m− n)α = δ

So that δ is itself an element of C1 +C2. It’s also clear that any integer multiple of δ is an element of C1 +C2.Now, choose an arbitrary p ∈ [0, 1) that is not a multiple of δ. Every real number lies between two integers, sothere exists some a such that

a <p

δ< a+ 1

71

Page 72: Rudin Solutions Ch 1 - 7

which impliesaδ < p < (a+ 1)δ

which shows that p is in a neighborhood of radius δ of some element of C1 +C2. But p was an arbitrary elementof [0, 1), so every element of [0, 1) lies in such a neighborhood of radius δ, and therefore δ ∈ ∆.

Lemma 4: if δ ∈ ∆, then 12δ ∈ ∆

Proof by contradiction. Assume that δ ∈ ∆ but 12δ 6∈ ∆. By lemma 3 we know that d(x, y) > δ for all

x, y ∈ [0, 1) ∩ C1 + C2. This gives us a maximum size for the set [0, 1) ∩ C1 + C2:

|[0, 1) ∩ C1 + C2| ≤1

δ

But this contradicts lemmas 1 and 2, which tell us that there is one unique element of [0, 1) ∩ C1 + C2 for eachn ∈ N. By contradiction, then, we must have 1

2δ ∈ ∆ whenever δ ∈ ∆.

The set C1 + C2 is dense in [0, 1)

We can now use induction to show that inf ∆ = 0. We know that 1 ∈ ∆ because a neighborhood of radius δ = 1around any x ∈ [0, 1) will contain 0 ∈ C1 + C2. Therefore 1 ∈ ∆. Using induction with lemma 4 tells us that(∀n ∈ N)(2−n ∈ ∆). This is sufficient to prove that inf ∆ ≤ 0. Each element of ∆ is a distance, so inf ∆ ≥ 0.Therefore inf ∆ = 0.

By the definition of ∆, this means that every neighborhood of every element of [0, 1) has an element ofC1 + C2. This allows us to conclude that every element of [0, 1) is a limit point of C1 + C2, which means thatC1 + C2 is dense in [0, 1).

The set C1 + C2 is dense in R1

Choose an arbitrary p ∈ R1. Let m be the integer such that m ≤ p < m+1. This means that 0 ≤ p−m < 1, andwe know that C1+C2 is dense in [0, 1). Therefore we can construct a sequence {cn} of elements of [0, 1)∩C1+C2

that converges to p −m. The definition of C1 + C2 guarantees that {cn + m} is also a sequence in C1 + C2,and theorem 3.3 tells us that {cn + m} converges to p. Therefore p is a limit point of C1 + C2. But p was anarbitrary element of R1, so every element of R1 is a limit point of C1 + C2. And this proves that C1 + C2 isdense in R1.

The sets C1 and C2 are closed, but C1 + C2 is not closed.

The sets C1 and C2 are closed because they have no limit points, so it’s trivially true that they contain all oftheir limit points. The set C1 + C2 doesn’t contain any non-integer rational numbers, but every real number isa limit point of C1 + C2. Therefore C1 + C2 doesn’t contain all of its limit points which means it is not closed.

Exercise 4.26

We’re told that Y is compact and that g is continuous and one-to-one. We conclude that g(Y ) is a compactsubset of Z (theorem 4.14) and therefore g is uniformly continuous on Y (theorem 4.19). The fact that g isone-to-one tells us that g is one-to-one and onto g(Y ), so by theorem 4.17 we conclude that g−1(g(Y )) is acontinuous mapping from compact space g(Y ) to compact space X. So by theorem 4.19 we see that g−1 isuniformly continuous on g(Y ).

In exercise 12 we proved that the composition of uniformly continuous functions is uniformly continuous,therefore f(x) = g−1(h(x)) is uniformly continuous if h is uniformly continuous. Theorem 4.7 tells us that thecomposition of continuous functions is continuous, therefore f(x) = g−1(h(x)) is continuous if h is continuous.

To construct the counter example, define X = Z = [0, 1] and Y = [0, 12 ) ∪ [1, 2]. Define the functionsf : X → Y and g : Y → Z as

f(x) =

{x if x < 1

22x if x ≥ 1

2

72

Page 73: Rudin Solutions Ch 1 - 7

g(y) =

{y if y < 1

2y2 if y ≥ 1

2

We can easily demonstrate that f fails to be continuous at x = 12 and that g is continous at every point. The

composite function h : X → Z is just h(x) = x, which is clearly continuous.

Exercise 5.1

Choose arbitrary elements x, y ∈ R1. We’re told that f is continuous and that

|f(x)− f(y)| ≤ (x− y)2 = |x− y|2

Dividing the leftmost and rightmost terms by |x− y| we have∣∣∣∣f(x)− f(y)

x− y

∣∣∣∣ ≤ |x− y|Taking the limit of each side as y → x gives us

|f ′(x)| ≤ 0

But our choice of x was arbitrary, so the derivative of f is zero at every point. By theorem 5.11b, this provesthat f is a constant function.

Exercise 5.2

Lemma 1: g(f(x)) = x

We’re told that f ′(x) > 0 for all x, so f is strictly increasing in (a, b) (this result is a trivial extension of theproofs for theorem 5.11). By definition, this means that x < y ↔ f(x) < f(y) and x > y ↔ f(x) > f(y): fromthis we conclude

x 6= y ↔ f(x) 6= f(y)

By contrapositive this is equivalent tox = y iff f(x) = f(y)

which, by definition, means that f is one-to-one. Therefore g(f(x)) = x (theorem 4.17 ).

g′(x) exists for all x ∈ f( (a, b) )

Using lemma 1 we see that f ′(x) is inversely proportional to g′(x):

f ′(x) = limt→x

f(x)− f(t)

x− t=

f(x)− f(t)

g(f(x))− g(f(t))= lim t→ x

1g(f(x))−g(f(t))f(x)−f(t)

=1

g′(x)

We’re told that the derivative f ′(x) is defined for all x ∈ (a, b), so 1/g′(x) must also be defined.

bonus proofs: a big wad of properties for f and g

We can show that both f and g are uniformly continous, injective, differentiable, strictly increasing functionswhose domains and ranges are compact.

From lemma 1, we know that f is a one-to-one function that is strictly increasing on (a, b). We’re toldthat f is differentiable on (a, b), therefore f is continuous on [a, b] (theorem 5.2). Because the domain of f isa compact metric space we can conclude that f is uniformly continuous (theorem 4.19). From the continuityand injectiveness of f we can conclude that g is continuous on f( [a, b] ) (theorem 4.17). The domain of g isthe range of f , so the domain of g is a compact space (thoerem 4.14) and therefore g is uniformly continuous(theorem 4.19). From lemma 1 we know that g(f(x)) = x for all x, therefore the inequalities in lemma 1 areequivalent to

g(f(x)) > g(f(y))↔ f(x) > f(y)

g(f(x)) < g(f(y))↔ f(x) < f(y)

g(f(x)) = g(f(y))↔ f(x) = f(y)

Which means that g is a one-to-one function and is strictly increasing on f( (a, b) ).

73

Page 74: Rudin Solutions Ch 1 - 7

Exercise 5.3

Choose ε such that |ε| < 1M . The function f is the sum of two differentiable functions, so by theorem 5.3 f is

itself differentiable:

f ′(x) = limt→x

x− tx− t

+ limt→x

εg(x)− εg(t)

x− t= 1 + εg′(x)

From our bounds on ε and g′(x) this gives us

1−(

1

MM

)< f ′(x) < 1 +

(1

MM

)These are strict inequalities, so we conclude that f ′(x) is always positive for this choice of ε. Therefore f is anincreasing function and is one-to-one (see lemma 1 of the previous exercise).

Exercise 5.4

Consider the function

f(x) = C0x+C1x

2

2+ · · ·+ Cnx

n+1

n+ 1

When x = 1 this evaluates to the function given to us in the exercise, so f(1) = 0. When x = 0 every termevaluates to zero, so f(0) = 0. From example 5.4 we know that f(x) is differentiable and its derivative is givenby

f ′(x) = C0 + C1x+ C2x2 + · · ·+ Cnx

n

By theorem 5.10, the fact that f(0) = f(1) = 0 means that f ′(x) = 0 for some x ∈ (0, 1). Therefore∑Cnx

n

has a real root in (0, 1), and this is what we were asked to prove.

Exercise 5.5

Choose an arbitrary ε > 0. We’re told that f ′(t)→ 0 as t→∞, so there exists some N such that

t > N → |f ′(t)| < ε

With some algebraic manipulation and the use of the the mean value theorem (theorem 5.10) we can express gas

g(x) = f(x+ 1)− f(x) =f(x+ 1)− f(x)

(x+ 1)− (x)= f ′(t) for some t ∈ (x, x+ 1)

This must be true for all possible values of x, so choose x > N . We now have t > x > N , so the f ′(t) term inthe previous equation is now less than ε.

|g(x)| = |f ′(t)| ≤ ε

which means that |g(x)| < ε for all x > N . And ε was an arbitrary positive real, which by definition means thatg(x)→ 0 as x→∞. And this is what we were asked to prove.

Exercise 5.6

The function g is differentiable (theorem 5.3c) and its derivative is

g′(x) =xf ′(x)− f(x)

x2

We want to prove that g is monotonically increasing. This is true iff g′(x) > 0 for all x, which is true iff

xf ′(x)− f(x)

x2> 0

which is true iffxf ′(x) > f(x)

which is true iff

f ′(x) >f(x)

x(19)

74

Page 75: Rudin Solutions Ch 1 - 7

To show that (19) holds for all x, choose an arbitrary x ∈ R. From the fact that f(0) = 0 we know fromtheorem 5.10 that

f(x)

x=f(x)− f(0)

x− 0= f ′(c) for some c ∈ (0, x)

We’re told that f ′ is monotonically increasing, so f ′(x) > f ′(c). Therefore:

f ′(x) > f ′(c) =f(x)

x

Therefore (19) holds, and we’ve shown that this occurs iff g′(x) > 0, which means that g is monotonicallyincreasing. And this is what we were asked to prove.

Exercise 5.7

From the fact that f(x) = g(x) = 0 we see that

limt→x

f(t)

g(t)= limt→x

f(t)− f(x)

g(t)− g(x)= limt→x

f(t)− f(x)

t− xt− x

g(t)− g(x)=f ′(x)

g′(x)

Exercise 5.8

We’re told that f ′ is a continuous function on the compact space [a, b] therefore f ′ is uniformly continuous(theorem 4.19). Choose any ε > 0: by the definition of uniform continuity there exists some δ such that0 < |t − x| < δ → |f ′(t) − f ′(x)| < ε. Choose t, x ∈ [a, b]: by the mean value theorem there exists c ∈ (t, x)such that

f ′(c) =f(t)− f(x)

t− xFrom the fact that c ∈ (t, x) we know that |c− x| < |t− x|δ. Therefore |f ′(c)− f ′(x)| < ε. Therefore, from ourdefinition of c, we have

|f ′(c)− f ′(x)| =∣∣∣∣f(t)− f(x)

t− x− f ′(x)

∣∣∣∣ < ε

Our initial choice of t, x, and ε was arbitrary, some δ > 0 must exist so that this previous inequality is true forall t, x, and ε. And this is what we were asked to prove.

Does this hold for vector-valued functions?

Yes. Choose an arbitrary ε > 0 and define the vector-valued function f to be

f(x) = (f1(x), f2(x), . . . , fn(x))

Assume that f is differentiable on [a, b]. Then each fi is differentiable on [a, b] (remark 5.16) and [a, b] is compactso by the preceeding proof we know that for each fi there exists some δi > 0 such that

|t− x| < δi →∣∣∣∣fi(t)− fi(x)

t− x− f ′i(x)

∣∣∣∣ < ε

n

Define δ = min{δ1, δ2, . . . , δn}. For |t− x| < δ we now have∣∣∣∣f(t)− f(x)

t− x− f ′(x)

∣∣∣∣ =

∣∣∣∣ (f1(t) + f2(t) + · · ·+ fn(t))− (f1(x) + f2(x) + · · ·+ fn(x))

t− x− (f ′1(x) + f ′2(x) + · · ·+ f ′n(x))

∣∣∣∣=

∣∣∣∣(f1(t)− f1(x)

t− x− f ′1(x)

)+

(f2(t)− f2(x)

t− x− f ′2(x)

)+ · · ·+

(fn(t)− fn(x)

t− x− f ′n(x)

)∣∣∣∣≤

∣∣∣∣(f1(t)− f1(x)

t− x− f ′1(x)

)∣∣∣∣+

∣∣∣∣(f2(t)− f2(x)

t− x− f ′2(x)

)∣∣∣∣+ · · ·+∣∣∣∣(fn(t)− fn(x)

t− x− f ′n(x)

)∣∣∣∣<

ε

n+ε

n+ · · ·+ ε

n= ε

75

Page 76: Rudin Solutions Ch 1 - 7

Exercise 5.9

We’re asked to show that f ′(0) exists. From the definition of the derivative, we know that

f ′(0) = limx→0

f(x)− f(0)

x− 0

The function f is continuous, so limx→0 f(x) − f(0) = 0 and limx→0 x = 0. Therefore we can use L’Hopital’srule.

f ′(0) = limx→0

f(x)− f(0)

x= limx→0

f ′(x)− 01− 0 = limx→0

f ′(x)

We’re told that the right-hand limit exists and is equal to 3, therefore the leftmost term (f ′(0)) exists and isequal to 3. And this is what we were asked to prove.

Exercise 5.9 : Alternate proof

If limx→0 f′(0) = 3 and f ′(0) 6= 3, then f ′ would have a simple discontinuity at x = 0. Therefore f ′(0) = 3 as

an immediate consequence of the corollary to theorem 5.12.

Exercise 5.10

Let f1, g1 represent the real parts of the functions f, g and let f2, g2 represent their imaginary parts: that is,f(x) = f1(x) + if2(x) and g(x) = g1(x) + ig2(x). We’re told that f and g are differentiable, therefore each ofthese dependent functions is differentiable (see Rudin’s remark 5.16). Applying the hint given in the exercise,we have

limx→0

f(x)

g(x)= limx→0

{f1(x) + if2(x)

x−A

}· x

g1(x) + ig2(x)+A · x

g1(x) + ig2(x)

Each of these functions is differentiable and the denominators all tend to 0, so we can apply L’Hopital’s rule.

limx→0

f(x)

g(x)= lim

x→0

{f ′1(x) + if ′2(x)

1−A

}· 1

g′1(x) + ig′2(x)+A · 1

g′1(x) + ig′2(x)

= limx→0

{f ′(x)

1−A

}· 1

g′(x)+A · 1

g′(x)

= {A−A} · 1

B+A · 1

B=A

B

Exercise 5.11

The denominator of the given ratio tends to 0 as h → 0, so we can use L’Hopital’s rule (differentiating withrespect to h):

limh→0

f(x+ h) + f(x− h)− 2f(x)

h2= limh→0

f ′(x+ h)− f ′(x− h)

2h

= limh→0

1

2

(f ′(x+ h)− f ′(x)

h+f ′(x)− f ′(x− h)

h

)We’re told that f ′′(x) exists, so this limit exists and is equal to

= limh→0

1

2(f ′′(x) + f ′′(x)) = f ′′(x)

A function for which this limit exists although f ′′(x) does not

For a counterexample, we need only find a differentiable function for which f ′′(x) = 1 when x > 0 and f ′′(x) = −1when x < 0. These criteria are met by

f(x) =

x2, if x > 0−(x2), if x < 0

0, if x = 0

This function is continuous and differentiable, but f ′′(x) does not exist at x = 0.

76

Page 77: Rudin Solutions Ch 1 - 7

Exercise 5.12

From the definition of f ′(x), we have

f ′(x) = limh→0

|x+ h|3 − |x|3

h(20)

If x > 0 then the terms in the numerator are positive and (20) resolves to

f ′(x) = limh→0

(x+ h)3 − (x)3

h= limh→0

3x2 + 3xh+ h2 = 3x2

If x < 0 then the terms in the numerator are negative and (20) resolves to

f ′(x) = limh→0

−(|x|+ h)3 + (|x|)3

h= limh→0−(3|x|2 + 3|x|h+ h2) = −(3|x2|)

It’s clear from the above results that f ′(x)→ 0 as x→ 0, and this agrees with f ′(0):

f ′(0) = limh→0

|h3|h

= 0

So f ′(x) = 3x|x| for all x.

From the definition of f ′′(x), we have

f ′′(x) = limh→0

|3(x+ h)2| − |3x2|h

(21)

If x > 0 then the terms in the numerator are positive and (21) resolves to

f ′′(x) = limh→0

3(x+ h)2 − 3x2

h= limh→0

6x+ 3h = 6x

If x < 0 then the terms in the numerator are negative and (21) resolves to

f ′′(x) = limh→0

−3(|x|+ h)2 − 3|x2|h

= limh→0

6|x|+ 3h = 6|x|

It’s clear from the above results that f ′′(x)→ 0 as x→ 0, and this agrees with f ′′(0):

f ′′(0) = limh→0

|3h2|h

= 0

So f ′′(x) = |6x| for all x.

From the definition of f ′′′(x), we have

f ′′′(x) = limh→0

|6(x+ h)| − |6x|h

(22)

If x = 0, then when h > 0 we have

limh→0

|6h|h

= 6

and when h < 0 we have

limh→0

|6h|h

= −6

So the limit in (22) (which is f (3)(0)) doesn’t exist.

Exercise 5.13a

f is continuous when x 6= 0

The proof that xa is continuous when x 6= 0 is trivial. The sin function hasn’t been well-defined yet, but we canassume that it’s a continuous function5. Therefore their product xa sin(x−c) is continuous wherever it’s defined(theorem 4.9), which is everywhere but x = 0.

5Example 5.6 says that we should assume without proof that sin is differentiable, so we can also assume that it’s continuous(theorem 5.2).

77

Page 78: Rudin Solutions Ch 1 - 7

f is continous at x = 0 if a > 0

We have f(0) = 0 by definition. For f to be continuous at x = 0 it must be the case that limx→0 f(x) = 0. Therange of the sin function is [−1, 1], so if a > 0 we have

xa(−1) ≤ xa sin(x−c) ≤ xa(1)

Taking the limit of each of these terms as x→ 0 gives us

0 ≤ limx→0

xa sin(x−c) ≤ 0

which shows that limx→0 f(x) = 0, and therefore f is continuous at x = 0.

f is discontinous at x = 0 if a = 0

To show that f is not continuous at x = 0, it’s sufficient to construct a sequence {xn} such that limxn = 0 butlim f(xn) 6= f(0) (theorem 4.2). Define the terms of {xn} to be

xn =

(1

2nπ + π/2

) 1c

This sequence clearly has a limit of 0, but

f(xn) = x0n sin(x−cn ) = sin(2nπ + π/2) = 1

so that lim{f(xn)} = 1. Note that we’re making lots of unjustified assumptions about the sin function and theproperties of the as-of-yet undefined symbol π.

f is discontinous at x = 0 if a < 0

Define the terms of {xn} to be

xn =

(1

2nπ + π/2

) 1c

This sequence clearly has a limit of 0, but

f(xn) = xan sin(x−cn ) =

(1

2nπ + π/2

) ac

sin(2nπ + π/2) =

(1

2nπ + π/2

) ac

We’re told that c > 0, therefore a/c < 0 and we have

f(xn) =

(1

2nπ + π/2

) ac

= (2nπ + π/2)−ac

where −a/c > 0. By theorem 3.20a we see that lim f(xn) =∞, which means that limxn = 0 but lim f(xn) 6= 0and therefore f is not continuous at x = 0.

These cases show that f is continous iff a > 0.

Exercise 5.13b

From the definition of limit we have

f ′(0) = limh→0

f(0 + h)− f(0)

h= limh→0

ha sin(h−c)− 0

h= limh→0

ha−1 sin(h−c) (23)

We can evaluate the rightmost term by noting that sin is bounded by [−1, 1] so that

|ha−1|(−1) ≤ ha−1 sin(h−c) ≤ |ha−1|(1) (24)

78

Page 79: Rudin Solutions Ch 1 - 7

Lemma 1: f is differentiable when x 6= 0

The proof that xa and x−c are differentiable when x 6= 0 is trivial. The sin function hasn’t been well-defined yet,but Rudin asks us in example 5.6 to assume that it’s differentiable. Therefore sin(x−c) is differentiable whenx 6= 0 (theorem 5.5, the chain rule) and therefore xa sin(x−c) is differentiable when x = 0 (theorem 5.3b, theproduct rule).

case 1: f ′(0) exists when a > 1

When a > 1 we have a− 1 > 0 and therefore taking the limits of (24) as h→ 0 gives us

0 ≤ limh→0|ha−1 sin(h−c)| ≤ 0

which means that (23) becomesf ′(0) = lim

h→0ha−1 sin(h−c) = 0

which shows that f ′(0) is defined.

case 2: f ′(0) does not exist when a = 1

Define the sequences {hn} and {jn} such that

hn =

(1

2nπ + π2

)1/c

jn =

(1

(2n+ 1)π + π2

)1/c

When a = 1 we have a− 1 = 0 and therefore equation (23) gives us

limh→0

ha−1 sin(h−c) = limn→∞

sin(h−cn ) = 1

limj→0

ja−1 sin(j−c) = limn→∞

sin(j−cn ) = −1

We know the sequences {f ′(hn)} and {j′(hn)} are well-defined because of lemma 1, therefore we have conflictingdefinitions of f ′(0). This means that the limit in (23) (and therefore f ′(0) itself) does not exist.

case 3: f ′(0) does not exist when a < 1

Define the sequences {hn} and {jn} such that

hn =

(1

2nπ + π2

)1/c

jn =

(1

(2n+ 1)π + π2

)1/c

When a < 1 we have a− 1 < 0 and therefore equation (23) gives us

limh→0

ha−1 sin(h−c) = limn→∞

ha−1n =∞

limj→0

ja−1 sin(j−c) = limn→∞

−ja−1n = −∞

We know the sequences {f ′(hn)} and {j′(hn)} are well-defined because of lemma 1, therefore we have conflictingdefinitions of f ′(0). This means that the limit in (23) (and therefore f ′(0) itself) does not exist.

These cases show that f ′(0) exists iff a > 1.

79

Page 80: Rudin Solutions Ch 1 - 7

Exercise 5.13c

Note that we’ve only defined f on the domain [−1, 1], so we only need to show that f is bounded on this domain.

case 1: f ′ is unbounded when a < 1

We saw in case (3) of part (b) that f ′ is unbounded near 0 when a < 1.

case 2: f ′ is unbounded when 1 ≤ a < c+ 1

By the lemma of part (b) we know that f ′(x) is defined for all x ∈ [1, 1] except for possibly x = 0. By the chainrule and product rule we know that the derivative of f when x 6= 0 is

f ′(x) = axa−1 sin(x−c)− cxa−(c+1) cos(x−c) (25)

Define the sequence {hn} such thathn = (2nπ)−1/c

Evaluating the derivative in (25) at x = hn gives us

f ′(hn) = (2nπ)1−ac sin(2nπ)− c(2nπ)

(1+c)−ac cos(2nπ) = −c(2nπ)

(1+c)−ac

We’re assuming in this case that a < c+ 1, so taking the limits of this last equation as n→∞ gives us

limn→∞

f ′(hn) = limn→∞

−c(2nπ)(1+c)−a

c = −∞

This doesn’t prove anything about f ′(0) itself, but it does show that f ′(x) is unbounded near 0.

case 3: f ′ is bounded when a ≥ c+ 1

If a ≥ c + 1 then clearly a > 1, so by the lemma of part (b) we know that f ′(x) is defined for all x ∈ [1, 1]including x = 0. By the chain rule and product rule we know that the derivative of f is

f ′(x) = axa−1 sin(x−c)− cxa−(c+1) cos(x−c) (26)

Since f ′(x) is defined for x = 0, we can take the limit of (26) as x→ 0:

f ′(0) = limx→0

f ′(x) = limx→0

axa−1 sin(x−c)− cxa−(c+1) cos(x−c)

Because x is bounded by [−1, 1] we know that xa−1, xa−(c+1), sin, and cos are all bounded by [−1, 1]. Thereforethe rightmost limit of the previous equation is bounded by

−(a+ c) ≤ limx→0

axa−1 sin(x−c)− cxa−(c+1) cos(x−c) ≤ a+ c

Which, of course, means that f ′(x) is also bounded by [−(a + c), a + c] for x ∈ [−1, 1]. We could find stricterbounds for f ′(x), but it’s not necessary.

These three cases show that f ′ is bounded iff a ≥ c+ 1.

Exercise 5.13d

lemma: f ′ is continuous when x 6= 0

From lemma 1 of part (b) we know that f ′ exists for all x 6= 0 and its derivative is given by

f ′(x) = axa−1 sin(x−c)− cxa−(c+1) cos(x−c) (27)

Rudin asks us to assume that sin and cos are continuous functions and it’s trivial to show that x±α is continuouswhen x 6= 0 for any α, so we can use the chain rule (theorem 5.5) and product rule (theorem 5.3b) to show thatf ′ is continuous when x 6= 0.

80

Page 81: Rudin Solutions Ch 1 - 7

case 1: f ′ is continuous at x = 0 when a > 1 + c

We’ve shown that f ′(0) = 0 when a > 1 (case 1 of part (b)). For f ′ to be continuous at x = 0 it must be thecase that limx→0 f

′(x) = 0. We can algebraically rearrange (27) to obtain

f ′(x) = xa−(c+1)axc sin(x−c)− c cos(x−c)

The range of the cosine and sine functions are [−1, 1], so we can establish a bound on this function.

|f ′(x)| = |xa−(c+1)axc sin(x−c)− c cos(x−c)| ≤ |xa−(c+1)| · |axc + c| ≤ |xa−(c+1)| · (|axc|+ |c|)

Because a > c+ 1, we have |xa−(c+1)| → 0 and |axc| → 0 as x→ 0. Taking the limits of this last inequality asx→ 0 therefore gives us

limx→0|f ′(x)| ≤ 0 · (0 + c) = 0

This shows that limx→0 f′(x) = f ′(0), therefore f ′ is continuous at x = 0.

case 2: f ′ is not continuous at x = 0 when a = 1 + c

To show that f ′ is not continuous at x = 0 it’s sufficient to construct a sequence {xn} such that limxn = 0 butlim f ′(xn) 6= f ′(0) = 0 (theorem 4.2). Define the terms of xn to be

xn =

(1

2nπ

) 1c

This sequence clearly has a limit of 0, but sin(xn) = 0 and cos(xn) = 1 so that the terms of {f ′(xn)} are

f ′(xn) = axn(0)− cx0(1) = −c

so that lim{f ′(xn)} = −c 6= f ′(0).

case 3: f ′ is not continuous at x = 0 when a < 1 + c

If a < 1+c we know that f ′ is not bounded on [−1, 1] (part (c)) therefore f ′ is not continuous on [−1, 1] (theorem4.15). From lemma 1, we know that the discontinuity must occur at the point x = 0.

For an alternative proof we could use the sequence {xn} established in case 2 and show that f ′(xn)→∞ 6=f ′(0) as xn → 0 when a < 1 + c.

Exercise 5.13e

Lemma 1: f ′ is differentiable when x 6= 0

We established in part (b) that f ′ exists for x 6= 0 and is given by

f ′(x) = axa−1 sin(x−c)− cxa−(c+1) cos(x−c) (28)

We know that all of the exponential powers of x are differentiable when x 6= 0 and Rudin asks us to assume thatsin and cos are differentiable, so we can use theorem 5.5 (the chain rule) and theorem 5.3(the product rule) toshow that f ′ is differentiable when x 6= 0.

case 1: f ′′(0) exists when a > 2 + c

From the definition of limit we know that

f ′′(0) = limh→0

f ′(0 + h)− f ′(0)

h= limh→0

aha−1 sin(h−c)− cha−(c+1) cos(h−c)

h

= limh→0

[(aha−2) sin(h−c)− (cha−(c+2)) cos(h−c)

](29)

81

Page 82: Rudin Solutions Ch 1 - 7

The range of the sin and cos functions is [−1, 1] so we can establish bounds for the limited term.

−(|aha−2|+ |cha−(c+2)|) ≤ aha−2 sin(h−c)− cha−(c+2) cos(h−c) ≤ |aha−2|+ |cha−(c+2)|

When a > (2 + c) > 2, the powers of h tend tend to zero as h→ 0. Taking the limits of the previous inequalityas h→ 0, we have

0 ≤ limh→0

aha−2 sin(h)− cha−(c+2) cos(h−c) ≤ 0

This shows that the limit in (29) (and therefore f ′′(0)) exists.

case 2: f ′′(0) does not exist when a = 2 + c

Define the sequences {hn} and {jn} such that

hn =

(1

2nπ

)1/c

jn =

(1

(2n+ 1)π

)1/c

When a = 2 + c we have a− (c+ 2) = 0 and therefore equation (28) gives us

limh→0

f ′′(h) = limn→∞

f ′′(hn) =[0− (ch0n)(1)

]= −c

limj→0

f ′′(j) = limn→∞

f ′′(jn) =[0− (cj0n)(−1)

]= c

We know the sequences {f ′′(hn)} and {j′′(hn)} are well-defined because of lemma 1, therefore we have conflictingdefinitions of f ′′(0). This means that the limit in (28) (and therefore f ′′(0) itself) does not exist.

case 2: f ′′(0) does not exist when a < 2 + c

Define the sequences {hn} and {jn} such that

hn =

(1

2nπ

)1/c

jn =

(1

(2n+ 1)π

)1/c

When a < 2 + c we have a− (c+ 2) < 0 and therefore ha−(c+2)n →∞. This means that equation (28) gives us

limh→0

f ′′(h) = limn→∞

f ′′(hn) =[0− (cha−(c+2)

n )(1)]

= −∞

limj→0

f ′′(j) = limn→∞

f ′′(jn) =[0− (cja−(c+2)

n )(−1)]

=∞

We know the sequences {f ′′(hn)} and {j′′(hn)} are well-defined because of lemma 1, therefore we have conflictingdefinitions of f ′′(0). This means that the limit in (28) (and therefore f ′′(0) itself) does not exist.

These three cases show that f ′′(0) exists iff a > 2 + c.

Exercise 5.13f

Lemma 1: to hell with this

By the lemma of part (e) we know that f ′′(x) is defined for all x ∈ [1, 1] except for possibly x = 0. By the chainrule and product rule we know that the derivative of f when x 6= 0 is

f ′′(x) =[(a2 − a)xa−2 − c2xa−(2+2c)

]sin(x−c) +

[(c2 + c− ca)xa−(2+c) − caxa−(1+c)

]cos(x−c) (30)

And I’m not going to screw around with limits and absolute values of something so annoying to type out, so I’llconclude with “the proof is similar to that of part(c)”.

82

Page 83: Rudin Solutions Ch 1 - 7

Exercise 5.13g

The proof is similar to that of part (d), and the sentiment is similar to that of lemma (1) of part (f).

Exercise 5.14

Lemma 1: If f is not convex then the convexity condition fails for all λ for some s, t ∈ (a, b)

This could also be stated as “if f is not convex on (a, b) then it is strictly concave on some interval (s, t) ⊂ (a, b)”.Assume that f is continuous on the interval (a, b) and that f is not convex. By definition, this means that wecan choose c, d ∈ (a, b) and λ ∈ (0, 1) such that

f(λc+ (1− λ)d) > λf(c) + (1− λ)f(d) (31)

Having fixed our choice of c, d, and λ so that the previous equation is true, we define the function g(x) to be

g(x) = f((x)c+ (1− x)d)− (x)f(c) + (1− x)f(d)

We know that g is continuous on [0, 1] because f is continuous on [c, d] ⊂ (a, b) (theorem 4.9). And we knowthat there is at least one p ∈ [0, 1] such that g(p) > 0 because we can choose p = λ which causes the previousequation to simplify to

g(p) = f(λc+ (1− λ)d)− [λf(c) + (1− λ)f(d)]

which is > 0 from (31). Let Z1 be the set of all p ∈ [0, λ) for which g(p) = 0 and let Z2 be the set of allp ∈ (λ, 1] for which g(p) = 0 It’s immediately clear that these sets are nonempty since g(0) = g(1) = 0. Thesesets are closed (exercise 4.3) and therefore contain their supremums and infimums. Let α = sup{Z1} and letβ = inf{Z2}. We can now claim that equation (31) holds for all λ ∈ (α, β). And this is the same as saying that

f(λs+ (1− λ)t) > λf(s) + (1− λ)f(t)

for all λ ∈ (0, 1) fors = αc+ (1− α)d, t = βc+ (1− β)d

which, by definition, means that the convexity conditions fails on the inteval (s, t) ∈ (a, b) for all λ ∈ (0, 1).

The “if” case: f ′ is monotonically increasing if f is convex

Let f be a function that is convex on the interval (a, b) (see exercise 4.23) and choose x, y ∈ (a, b) with y ≥ x.By definition this means that for all x, y ∈ (a, b) and for all λ ∈ (0, 1) it must be the case that

f(λx+ (1− λ)y) ≤ λf(x) + (1− λ)f(y) (32)

We want to express the left-hand side of this inequality as f(x+ h): we can do this by defining h such that

h = (λ− 1)(x− y)

Rearranging this algebraically allows us to express λ and 1− λ as

λ = 1− h

y − x, (1− λ) =

h

x− y

Substituting these values of λ and λ− 1 into (32) results in

f(x+ h) ≤ f(x)− h f(x)

y − x+h f(y)

y − x

which is algebraically equivalent tof(x+ h)− f(x)

h≤ f(y)− f(x)

y − x

83

Page 84: Rudin Solutions Ch 1 - 7

Equation (32) had to be true for any value of λ ∈ (0, 1). As λ→ 1 we see that h→ 0. Taking the limit of bothsides of the previous equation as h→ 0 gives us

f ′(x) ≤ f(y)− f(x)

y − x(33)

Having established (33), we now want to express the left-hand side of (32) as f(y − h): we can do this byredefining h such that

h = −(λ)(x− y)

Rearranging this algebraically allows us to express λ and 1− λ as

λ =h

y − x, (1− λ) = 1− h

x− y

Substituting these values of λ and 1− λ into (32) results in

f(y − h) ≤ h f(x)

y − x+ f(y)− h f(y)

y − x

which is algebraically equivalent tof(y)− f(y − h)

h≥ f(y)− f(x)

y − xEquation (32) had to be true for any value of λ ∈ (0, 1). As λ→ 0 we see that h→ 0. Taking the limit of bothsides of the previous equation as h→ 0 gives us

f ′(y) ≥ f(y)− f(x)

y − x(34)

Combining equations (33) and (34), we have have shown that

f ′(y) ≥ f ′(x)

We assumed only that f was convex and that y > x; we concluded that f ′(y) ≥ f ′(x). By definition, this meansthat f ′ is monotonically increasing if f is convex.

The “only if” case: f ′ is monotonically increasing only if f is convex

Assume that f is not convex. By lemma 1, we can find some subinterval (s, t) ∈ (a, b) such that

f(λs+ (1− λ)t) > λf(s) + (1− λ)f(t) (35)

is true for all λ ∈ (0, 1). We can now follow the logic of the “if” case and define

h = (λ− 1)(s− t)

Rearranging this algebraically allows us to express λ and 1− λ as

λ = 1− h

t− s, (1− λ) =

h

s− t

Substituting these values of λ and λ− 1 into (35) results in

f(s+ h) > f(s)− h f(s)

t− s+h f(t)

t− s

which is algebraically equivalent tof(s+ h)− f(s)

h>f(t)− f(t)

t− sEquation (35) had to be true for any value of λ ∈ (0, 1). As λ→ 1 we see that h→ 0. Taking the limit of bothsides of the previous equation as h→ 0 gives us

f ′(s) >f(t)− f(s)

t− s(36)

84

Page 85: Rudin Solutions Ch 1 - 7

Having established (36), we now redefine h such that

h = −(λ)(s− t)

Rearranging this algebraically allows us to express λ and 1− λ as

λ =h

t− s, (1− λ) = 1− h

s− t

Substituting these values of λ and 1− λ into (35) results in

f(t− h) >hf(s)

t− s+ f(t)− h f(t)

t− s

which is algebraically equivalent tof(t)− f(t− h)

h<f(t)− f(s)

t− sEquation (35) had to be true for any value of λ ∈ (0, 1). As λ→ 0 we see that h→ 0. Taking the limit of bothsides of the previous equation as h→ 0 gives us

f ′(t) <f(t)− f(s)

t− s(37)

Combining equations (36) and (37), we have have shown that

f ′(t) < f ′(s)

for some t > s. We assumed only that f was not convex; we concluded that f ′ was not monotonically increasing.By contrapositive, this means that f ′ is monotonically increasing only if f is convex.

f is convex iff f ′′(x) ≥ 0 for all x ∈ (a, b)

We’ve shown that f is convex iff f ′ is monotonically inreasing, and theorem 5.11 tells us that f ′ is monotonicallyincreasing iff f ′′(x) ≥ 0 for all x ∈ (a, b).

Exercise 5.15

Note on the bounds of f and its derivatives

When Rudin asks us to assume that |f | and its derivatives have upper bounds of M0,M1, and M2 it appearsthat we must assume that these bounds are finite. Otherwise the function f(x) = x would be a counterexampleto the claim that M2

1 ≤ 4M0M2.

Proof that M21 ≤ 4M0M2 for real-valued functions

Choose any h > 0. Using Taylor’s theorem (theorem 5.15) we can express f(x+ 2h) as

f(x+ 2h) = f(x) + 2hf ′(x) +4h2f ′′(ξ)

2, ξ > a (38)

which can be algebraically arranged to give us

f ′(x) =f(x+ 2h)

2h− f(x)

2h− hf ′′(ξ)

|f ′(x)| = |f(x+ 2h)

2h− f(x)

2h− hf ′′(ξ)| ≤ |f(x+ 2h)

2h|+ |f(x)

2h|+ |hf ′′(ξ)|

We’re given upper bounds for |f(x)| and |f ′′(x)|; these bounds give us the inequality

|f ′(x)| ≤ 2M0

h+ hM2

85

Page 86: Rudin Solutions Ch 1 - 7

This inequality must hold for all x, even when |f(x)| approaches its upper bound, so we have

M1 ≤2M0

h+ hM2

We can multiply both sides by h and then algebraically rearrange this into a quadratic equation in h.

h2M2 − hM1 +M0 ≥ 0 (39)

The quadratic solution to this equation is

h =M1 ±

√M2

1 − 4M0M2

2M2(40)

We want to make sure that there are not two solutions to (40): we want (39) to hold for all values of h, and if(40) had two solutions then we would have h2M2 − hM1 + M0 < 0 on some interval of h. To make sure thatthere is at most a single real solution we need to make sure that the discriminant of (40) is either zero (onesolution) or negative (zero solutions). This occurs exactly when

M21 ≤ 4M0M2

Does this apply to vector-valued functions?

Yes. Let f be a vector-valued function that is continuous on (a,∞) and is defined by

f(x) = ( f1(x), f2(x), . . . , fn(x) )

Assume that |f |, |f ′|, and |f ′′| have finite upper bounds of (respectively) M0,M1, and M2. If we evaluate f atx+ 2h we have

f(x+ 2h) = ( f1(x+ 2h), f2(x+ 2h), . . . , fn(x+ 2h) )

Taking the Taylor expansion of each of these terms, we have

f(x+ 2h) =(

[f1(x) + 2hf ′1(x) + 2h2f ′′1 (ξ1)], [f2(x) + 2hf ′2(x) + 2h2f ′′2 (ξ1)], . . . , [fn(x) + 2hf ′n(x) + 2h2f ′′n (ξ1)])

= ( f1(x), f2(x), . . . , fn(x) ) + ( 2hf ′1(x), 2hf ′2(x), . . . , 2hf ′n(x) ) +(

2h2f ′′1 (ξ), 2h2f ′′2 (ξ), . . . , 2h2f ′′n (ξ))

= f(x) + 2hf ′(x) + 2h2f ′′(ξ)

This tells us that equation (38) holds for vector-valued functions. But nothing in the proof following equation(38) requires us to assume that f is real-valued instead of vector-valued. So the proof following (38) still sufficesto prove that

M21 ≤ 4M0M2

Exercise 5.16 proof 1

In exercise 5.15 we established the inequality

|f ′(x)| ≤ |f(x+ 2h)

2h|+ |f(x)

2h|+ |hf ′′(ξ)|

We are given an upper bound of M2 for |f ′′(ξ)|, so this inequality can be expressed as

|f ′(x)| ≤ |f(x+ 2h)

2h|+ |f(x)

2h|+ hM2

We’re told that f is twice-differentiable, so we know that both f and f ′ are continuous (theorem 5.2), so we cantake the limit of both sides of this equation as x→ 0 (theorem 4.2).

limx→∞

|f ′(x)| ≤ limx→∞

(|f(x+ 2h)

2h|+ |f(x)

2h|+ hM2

)

86

Page 87: Rudin Solutions Ch 1 - 7

We’re told that f(x)→ 0 as x→ 0 so this becomes

limx→∞

|f ′(x)| ≤ limx→∞

hM2

This must be true for all h, so we can take the limit of both sides of this as h→ 0 (we can do this because both|f ′(x)| and hM2 are continuous functions with respect to the variable h).

limh→0

limx→∞

|f ′(x)| ≤ limh→0

limx→∞

hM2

The left-hand side of the previous inequality is independent of h and therefore doesn’t change; the right-handside becomes 0.

limx→∞

|f ′(x)| = limh→0

limx→∞

|f ′(x)| ≤ limh→0

limx→∞

hM2 = 0

limx→∞

|f ′(x)| ≤ 0

This show that f ′(x)→ 0 as x→∞, which is what we were asked to prove.

Exercise 5.16 proof 2

In exercise 5.15 we established the inequality

M21 ≤ 4M0M2

where each of the M terms represented a supremum on the interval (a,∞). This inequality was proven to holdfor all a such that f was twice-differentiable on the interval (a,∞). To show more explicitly that the value ofthese M terms depends on a, we might express the previous inequality as

supx>a|f ′(x)|2 ≤ sup

x>a4|f(x)||f ′′(x)|

Each of these terms is continuous with respect to a (the proof of this claim is trivial but tedious) so we can takethe limit of both sides as a→∞.

lima→∞

supx>a|f ′(x)|2 ≤ lim

a→∞supx>a

4|f(x)||f ′′(x)|

We’re told that |f ′′(x)| has an upper bound of M2. We’re also told that f(x)→ 0 as x→∞. The last inequalitytherefore allows us to conclude that

lima→∞

supx>a|f ′(x)|2 ≤ lim

a→∞supx>a

4|0||M2| = 0

It’s clear that |f ′(x)| must be less than or equal to the supremum of a set containing |f ′(x)|, so we have

limx→∞

|f ′(x)| ≤ lima→∞

supx>a|f ′(x)|2 ≤ 0

limx→∞

|f ′(x)| ≤ 0

This shows that f ′(x)→ 0 as x→∞, which is what we were asked to prove.

Exercise 5.17

We’re told that f is three-times differentiable on the interval (−1, 1) so we can take the Taylor expansions off(−1) and f(1) around x = 0:

f(−1) = f(0) + f ′(0)(−1) +f ′′(0)(−1)2

2+f ′′′(ξ1)(−1)3

6, ξ1 ∈ (−1, 0)

f(1) = f(0) + f ′(0)(1) +f ′′(0)(1)2

2+f ′′′(ξ2)(1)3

6, ξ2 ∈ (0, 1)

87

Page 88: Rudin Solutions Ch 1 - 7

When we evaluate f(1)− f(−1), many of these terms cancel out and we’re left with

f(1)− f(−1) = 2f ′(0) +f ′′′(ξ1) + f ′′′(ξ2)

6, ξ1 ∈ (−1, 0)ξ2 ∈ (0, 1)

We’re given the values of f(1), f(−1), and f ′(0) so this last equation becomes

1 =f ′′′(ξ1) + f ′′′(ξ2)

6, ξ1 ∈ (−1, 0)ξ2 ∈ (0, 1)

which is algebraically equivalent to

f ′′′(ξ2) = 6− f ′′′(ξ1), ξ1 ∈ (−1, 0)ξ2 ∈ (0, 1)

If f ′′′(ξ1) ≤ 3 then f ′′′(ξ2) ≥ 3, and vice-versa: so f ′′′(x) ≥ 3 for either ξ1 or ξ2 ∈ (−1, 1). And this is what wewere asked to prove.

Exercise 5.18

The nth derivative of f(t)

The exercise tells gives us the following formula for f(t):

f(t) = f(β)− (β − t)Q(t)

Since β is a fixed constant, the first two derivatives of f with respect to t are

f ′(t) = Q(t)− (β − t)Q′′(t)

f ′′(t) = 2Q′(t)− (β − t)Q′′′(t)

And, in general, we havef (n)(t) = nQ(n−1)(t)− (β − t)Q(n)(t)

which, after multiplying by (β − α)n/n! and setting t = α, becomes

f (n)(t)

n!(β − α)n =

Q(n−1)(t)

(n− 1)!(β − α)n − Q(n)(t)

n!(β − α)n+1 (41)

Modifying the Taylor formula

The formula for the Taylor expansion of f(β) around the point f(α) (theorem 5.15) includes a P (β) term definedas

P (β) =

n−1∑k=0

[f (k)(α)

k!(β − α)k

]If we isolate the k = 0 case this becomes

P (β) = f(α) +

n−1∑k=1

[f (k)(α)

k!(β − α)k

]We can now use equation (41) to express the terms of the summation as functions of Q.

P (β) = f(α) +

n−1∑k=1

[Q(k−1)(α)

(k − 1)!(β − α)k

]−n−1∑k=1

[Q(k)(α)

k!(β − α)k+1

]If we extract the k = 1 term from the leftmost summation and the k = n−1 term from the rightmost summation,we have

P (β) = f(α) +Q(α)(β − α) +

n−1∑k=2

[Q(k−1)(α)

(k − 1)!(β − α)k

]−n−2∑k=1

[Q(k)(α)

k!(β − α)k+1

]− Q(n−1)(α)

(n− 1)!(β − α)n

88

Page 89: Rudin Solutions Ch 1 - 7

We can then re-index the leftmost summation to obtain

P (β) = f(α) +Q(α)(β − α) +

n−2∑k=1

[Q(k)(α)

(k)!(β − α)k+1

]−n−2∑k=1

[Q(k)(α)

k!(β − α)k+1

]− Q(n−1)(α)

(n− 1)!(β − α)n

The two summations cancel one another, leaving us with

P (β) = f(α) +Q(α)(β − α)− Q(n−1)(α)

(n− 1)!(β − α)n

If we replace Q(α) with the definition of Q given in the exercise, we see that this previous equation evaluates to

P (β) = f(α) +f(α)− f(β)

α− β(β − α)− Q(n−1)(α)

(n− 1)!(β − α)n

This simplifies to

P (β) = f(α) + (f(β)− f(α))− Q(n−1)(α)

(n− 1)!(β − α)n

A simple algebraic rearrangement of these terms gives us

f(β) = P (β)− Q(n−1)(α)

(n− 1)!(β − α)n

which is the equation we were asked to derive.

Exercise 5.19a

The given expression for Dn is algebraically equivalent to

Dn =

[f(βn)− f(0)

βn − 0

βnβn − αn

]+

[f(0)− f(αn)

0− αn−αn

βn − αn

]We really want to be able to evaluate this by taking the limit of each side of this previous equation in thefollowing manner:

limn→∞

Dn = limn→∞

[f(βn)− f(0)

βn − 0

]· limn→∞

[βn

βn − αn

]+ limn→∞

[f(0)− f(αn)

0− αn

]· limn→∞

[−αn

βn − αn

](42)

There two conditions that must be met in order for this last step to be justified:

condition 1: theorem 3.3 tells us that each of the limits must actually exist (and must not be ±∞)

condition 2: and theorem 4.2 tells us that we must have lim f(αn) = lim f(βn) = f(0).

The fact that αn < 0 < βn guarantees that 0 < βn/(βn − αn) < 1 and 0 < −αn/(βn − αn) < 1, which tells usthat at 2 of the 4 limits in (42) exist. The other two limits exist because they’re equal to f ′(0), and we’re toldthat f ′(0) exists. Therefore condition 1 is met. The fact that f ′(0) exists tells us that f is continuous at x = 0(theorem 5.2) and therefore condition 2 is met (theorem 4.2). Therefore we’re justified in taking the limits in(42), giving us

limn→∞

Dn = f ′(0) · limn→∞

[βn

βn − αn

]+ f ′(0) · lim

n→∞

[−αn

βn − αn

]= limn→∞

f ′(0)βn − αnβn − αn

= f ′(0)

89

Page 90: Rudin Solutions Ch 1 - 7

Exercise 5.19b

The given expression for Dn is algebraically equivalent to

Dn =

[f(βn)− f(0)

βn − 0

βnβn − αn

]+

[f(αn)− f(0)

αn − 0

(1− βn

βn − αn

)]We want to evaluate this by taking the limits of the individual terms as we did in part (a):

limn→∞

Dn = limn→∞

[f(βn)− f(0)

βn − 0

]· limn→∞

[βn

βn − αn

]+ limn→∞

[f(αn)− f(0)

αn − 0

]limn→∞

[(1− βn

βn − αn

)](43)

In order for this to be justified we must once again meet the two conditions mentioned in part (a). We’re toldthat 0 < βn/(βn − αn) < M for some M , which tells us that at 2 of the 4 limits in (43) exist. The othertwo limits exist because they’re equal to f ′(0), and we’re told that f ′(0) exists. Therefore condition 1 is met.The fact that f ′(0) exists tells us that f is continuous at x = 0 (theorem 5.2) and therefore condition 2 is met(theorem 4.2). Therefore we’re justified in taking the limits in (43), giving us

limn→∞

Dn = f ′(0) · limn→∞

[βn

βn − αn

]+ f ′(0) · lim

n→∞

[1− βn

βn − αn

]= f ′(0) · lim

n→∞

[βn

βn − αn+ 1− βn

βn − αn

]= f ′(0)

Exercise 5.19c proof 1

Define the sequence {hn} where hn = βn − αn. We can now express Dn as

Dn =f(αn + hn)− f(αn)

hn

We know that αn → 0 and βn → 0 as n→∞, so clearly hn → 0 as n→∞. We’re also told that f ′ is continuouson (−1, 1), so we know that f ′ is defined on this interval. Therefore we have

limn→∞

Dn = limn→∞

limh→0

f(αn + hn)− f(αn)

hn= lim

n→∞f ′(αn)

We’re told that f ′ is continuous on the interval (−1, 1) and that limαn = 0 so by theorem 4.2 we have

limn→∞

Dn = limn→∞

f ′(αn) = f ′(0)

Exercise 5.19c proof 2

The mean value theorem (theorem 5.10) allows us to construct a sequence {γn} as follows: for each n, choosesome γn ∈ (αn, βn) such that

Dn =f(βn)− f(αn)

βn − αn= f ′(γn) (44)

Each γn is in the interval (αn, βn). We’re told that limαn = limβn = 0. Therefore we can use the squeezetheorem to determine lim γn.

0 = limn→∞

αn ≤ limn→∞

γn ≤ limn→∞

βn = 0

which means that lim γn → 0. We’re told that f ′ is continuous so by theorem 5.2 we can take the limit ofequation (44).

limn→∞

Dn = limn→∞

f ′(γn) = f ′(0)

90

Page 91: Rudin Solutions Ch 1 - 7

Exercise 5.20

Exercise 5.21

We saw in exercise 2.29 that any open set in R1 can be represented as a countable number of disjoint opensegments. Let {(an, bn)} represent such a countable collection of disjoint sets such that

⋃(ai, bi) = EC . Define

f : R1 → R1 as

f(x) =

0, x ∈ E(x− bi)2, x ∈ (ai, bi) ∧ ai = −∞(x− ai)2, x ∈ (ai, bi) ∧ bi =∞(x− ai)2(x− bi)2, x ∈ (ai, bi) ∧ −∞ < ai < bi <∞

It’s easy but tedious to verify that this a continuous function which is differentiable and that f(x) = 0 iffx ∈ E. Note that the derivative also has the property that f ′(x) = 0 iff x ∈ E. For the second part of theexercise we can define a function f to be

f(x) =

0, x ∈ E(x− bi)n+1, x ∈ (ai, bi) ∧ ai = −∞(x− ai)n+1, x ∈ (ai, bi) ∧ bi =∞(x− ai)n+1(x− bi)n+1, x ∈ (ai, bi) ∧ −∞ < ai < bi <∞

It’s similarly easy but tedious to verify that this is a continuous function that is n times differentiable andthat f(x) = 0 iff x ∈ E. It can also be seen that when k ≤ n we have f (k)(x) = 0 iff x ∈ E. Finally6, we canpretend that we’ve defined the exponential function and define f to be

f(x) =

0, x ∈ Eexp

(−1

(x−bi)2

), x ∈ (ai, bi) ∧ ai = −∞

exp(

−1(x−ai)2

), x ∈ (ai, bi) ∧ bi =∞

exp(

−1(x−ai)2(x−bi)2

), x ∈ (ai, bi) ∧ −∞ < ai < bi <∞

It’s fairly easy to confirm that this function is continuous and that f(x) = 0 iff x ∈ E. Looking at thederivative with respect to x we have

f ′(x) =

0, x ∈ E−2

(x−bi)3 exp(−1

(x−bi)2

), x ∈ (ai, bi) ∧ ai = −∞

−2(x−ai)3 exp

(−1

(x−ai)2

), x ∈ (ai, bi) ∧ bi =∞

−2(bi−ai)(x−ai)2(x−bi)2 exp

(−1

(x−ai)2(x−bi)2

), x ∈ (ai, bi) ∧ −∞ < ai < bi <∞

To calculate the limit of f ′(x) in the first case, we use L’Hopital’s rule.

limx→a

f ′(x) = limx→a

exp(−1

(x−bi)2

)(x− bi)3

The numerator and denominator of this last term both tend to ±∞, so L’Hopital’s rule is applicable.Repeated applications of L’Hopital’s rule will eventually give us a constant term in the numerator and a termthat tends to ±∞ in the denominator, so we see that lim f ′(x) = 0. Similar results hold for the limits in the othertwo cases. The general idea is that, for any polynomial term p(n), the exponential limit limn→∞ exp(−1/n)will tend towards zero faster than limn→∞ p(n) tends towards ∞. Therefore p(n)exp(−1/n) will tend to zeroas n → ∞. Every term of every derivative of f(x) will consist only of polynomial multiples of the exponentialterm, so it will hold that for all k:

limx→ai

f (k)(x) = limx→bi

f (k)(x) = 0

It seems clear in a vague, poorly-defined Calc II-ish way that f is infinitely differentiable and that, for all n,f (n)(x) = 0 iff x ∈ E. I have no idea how to prove this.

6this example was provided by Boris Shekhtman, University of South Florida

91

Page 92: Rudin Solutions Ch 1 - 7

Exercise 5.22a

Assume that f(x1) = x1 and f(x2) = x2 with x1 6= x2. Then

f(x2)− f(x1)

x2 − x1=x2 − x1x2 − x1

= 1

By theorem 5.10 this means that f ′(x) = 1 for some x between x1 and x2. This contradicts the presumptionthat f ′(x) 6= 1, so our initial assumption must be wrong: there cannot be two fixed points.

Exercise 5.22b

For f(t) to have a fixed point it would be necessary that f(t) = t, in which case we would have

t = t+ (1− et)−1 −→ 1

1− et= 0

This statement is not true for any t.

Exercise 5.22c

Choose an arbitrary value for x0 and let {xn} be the sequence recursively defined by xn+1 = f(xn). We’re toldthat |f ′(t)| ≤ A for all real t. So for any n we have∣∣∣∣f(xn−1)− f(xn−2)

xn−1 − xn−2

∣∣∣∣ ≤ A −→ |f(xn−1)− f(xn−2)| ≤ A |xn−1 − xn−2|

which, since f(xn) = xn+1, gives us|xn − xn−1| ≤ A|xn−1 − xn−2|

But this holds for all n. So:

|xn − xn−1| ≤ A|xn−1 − xn−2| ≤ A2|xn−2 − xn−3| ≤ A3|xn−3 − xn−4| ≤ · · · ≤ An−2|x2 − x1|

We can use this general formula to determine the difference between xn+k and xn.

|xn+k − xn| = |(xn+k − xn+k−1) + (xn+k−1 − xn+k−2) + · · ·+ (xn+2 − xn+1) + (xn+1 − xn)|≤ |xn+k − xn+k−1|+ |xn+k−1 − xn+k−2|+ · · ·+ |xn+2 − xn+1|+ |xn+1 − xn)|≤ An+k−2|x2 − x1|+An+k−3|x2 − x1|+ · · ·+An|x2 − x1|+An−1|x2 − x1)|≤ An|x2 − x1|

This shows us that|xn+k − xn| ≤ An|x2 − x1|

We’re told that A < 1, so taking the limit of eachs ide of this inequality as n→∞ gives us

limn→∞

|xn+k − xn| ≤ limn→∞

An|x2 − x1| = 0

And this is just the Cauchy criterion for convergence for the sequence {xn}. And we’re converging in R1 so bytheorem 3.11 we know that the sequence converges to some element x ∈ R1.

∃x ∈ R : limn→∞

xn = x

But the elements of {xn} are just the elements of {f(xn)}, so we can also conclude that

∃x ∈ R : limn→∞

f(xn) = x

And we’re told that f is continuous, so by theorem 4.6 we see that

∃x ∈ R : f(x) = x

92

Page 93: Rudin Solutions Ch 1 - 7

Exercise 5.23a

If x < α, then we can express x as x = α− δ for some δ > 0. This gives usf(x) = f(α− δ) definition of x

= (α−δ)3−13 definition of f

= α3−3α2δ+3αδ2−δ3+13 algebra

= α3+13 + −3α2δ+3αδ2

3 − δ3

3 algebra

= f(α)− αδ(α− δ)− 13δ

3 algebra

= f(α)− αδx− 13δ

3 definition of x

= α− αδx− 13δ

3 α is a fixed point

We know that x < α < −1, so αx > 1 and therefore αδx > δ. From this we have an inequality:

< α− δ − 13δ

3

< α− δ= x definition of x

This establishes that x < α→ f(x) < x. Therefore if x1 < α then the sequence {xn} will be a nonincreasingsequence. We know that this sequence doesn’t converge because we’re told that f has no fixed points less thanα.

Exercise 5.23b

The first derivative of f is f ′(x) = x2. This is nonnegative, so we know that f is monotonically increasing(theorem 5.11). The second derivative is f ′(x) = 2x. This is negative for x < 0 and positive for x > 0: thereforef is strictly convex on (0, γ) and is strictly concave on (α, 0). Now let xk be chosen from the interval (α, γ).

Case 1: xk ∈ (α, 0)

Let xk ∈ (α, 0) be chosen. We can express this x as xk = λα+(1−λ)0 for some λ ∈ (0, 1). The second derivativeof f ′′(x) = 2x is negative on the interval (α, 0) and so the function is concave on this interval (corollary of exercise5.14). Therefore we have

xk+1 = f(xk) definition of xk+1

= f(λα+ (1− λ)0) definition of xk

> λf(α) + (1− λ)f(0) f is concave on (α, 0)

= λα+ (1− λ)(1/3) α is a fixed point, f(0) = 1/3

= xk + (1− λ)(1/3) definition of xk

> xk

We see that xk < xk+1; from our choice of xk we have α < xk; from the monotonic nature of f we havexk < β → xk+1 = f(xk) < f(β) = β. Combining these inequalities yields

α < xk < xk+1 < β

Therefore our initial choice of xk gives us an increasing sequence {xn} with an upper bound of β. Therefore itconverges to a some fixed point in the interval (x1, beta], and this point must be β.

Case 2: x ∈ (β, γ)

Let xk ∈ (β, γ) be chosen. We can express this as xk = λβ+ (1−λ)γ for some λ ∈ (0, 1). The second derivativeof f ′′(x) = 2x is positive on the interval (β, γ) and so the function is convex on this interval (exercise 5.14). There-

93

Page 94: Rudin Solutions Ch 1 - 7

fore we have xk+1 = f(xk) definition of xk+1

= f(λβ + (1− λ)γ) definition of xk

< λf(β) + (1− λ)f(γ) f is convex on (β, γ)

= λβ + (1− λ)γ β and γ are fixed points

= xk definition of xk

We see that xk+1 < xk; from our choice of xk we have xk < γ; from the monotonic nature of f we haveβ < xk → β = f(β) < f(xk) = xk+1. Combining these inequalities yields

β < xk+1 < xk < γ

Therefore our initial choice of xk gives us an decreasing sequence {xn} with an lower bound of β. Therefore itconverges to a some fixed point in the interval [β, x1), and this point must be β.

Case 3: x ∈ (0, β)

Let xk ∈ (0, β) be chosen. We can express this as xk = λ0 + (1− λ)β for some λ ∈ (0, 1). The second derivativeof f ′′(x) = 2x is positive on the interval (0, β) and so the function is convex on this interval (exercise 5.14).Therefore we have

xk+1 = f(xk) definition of xk+1

= f(λ(0) + (1− λ)β) definition of xk

< λf(0) + (1− λ)f(β) definition of convexity

= λ(1/3) + (1− λ)β β is a fixed point, f(0) = 1/3

= λ(1/3) + x definition of xk

> xk

We see that xk < xk+1; from our choice of xk we have α < xk; from the monotonic nature of f we havexk < β → xk+1 = f(xk) < f(β) = β. Combining these inequalities yields

α < xk < xk+1 < β

Therefore our initial choice of xk gives us an increasing sequence {xn} with an upper bound of β. Therefore itconverges to a some fixed point in the interval (x1, beta], and this point must be β.

Case 4: xk = β or xk = 0

If xk = β then every term of {xn} is β, so this sequence clearly converges to β. If xk = 0 then xk+1 = f(0) = (1/3)and the remainder of the sequence {xn} converges to β by one of the previous cases.

Exercise 5.23c

If x > γ then we can express x as x = γ + δ for some δ > 0. This gives us

f(x) = f(γ + δ)                               definition of x
     = ((γ + δ)³ + 1)/3                       definition of f
     = (γ³ + 3γ²δ + 3γδ² + δ³ + 1)/3          algebra
     = (γ³ + 1)/3 + (3γ²δ + 3γδ²)/3 + δ³/3    algebra
     = f(γ) + γδ(γ + δ) + (1/3)δ³             algebra
     = f(γ) + γδx + (1/3)δ³                   definition of x
     = γ + γδx + (1/3)δ³                      γ is a fixed point

We know that γ > 1 and x > 1, so γδx > δ. From this we have an inequality:

f(x) > γ + δ + (1/3)δ³ > γ + δ = x            definition of x


This establishes that x > γ → f(x) > x. Therefore if x1 > γ then the sequence {xn} will be a nondecreasing sequence. We know that this sequence doesn't converge, because we're told that f has no fixed points greater than γ.

Exercise 5.24

The function f(x) has a derivative of zero at its fixed point, so when xk and xk+1 are both close to √a the mean value theorem guarantees us that f(xk) and f(xk+1) will be very near one another:

|f(xk) − f(xk+1)| = |(xk − xk+1)f′(ξ)| ≈ 0

The lefthand term of this equation converges very rapidly to 0 because both |xk − xk+1| and f′(ξ) are converging toward zero. The function g(x) does not have a derivative of zero at its fixed point and therefore does not have this property (although, as we saw in exercise 3.17, its iteration still converges, albeit more slowly).
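A hedged numerical illustration of the difference in speed, assuming the two iterations in question are the ones from exercises 3.16 and 3.17, namely f(x) = (x + a/x)/2 and g(x) = (a + x)/(1 + x), both with fixed point √a:

```python
# Illustration only, assuming the maps from exercises 3.16 and 3.17:
#   f(x) = (x + a/x)/2  has f'(sqrt(a)) = 0, so the error roughly squares each step;
#   g(x) = (a + x)/(1 + x) has |g'(sqrt(a))| > 0, so the error shrinks by a constant factor.
import math

a = 2.0
root = math.sqrt(a)
f = lambda x: (x + a / x) / 2
g = lambda x: (a + x) / (1 + x)

xf = xg = 1.5
for n in range(1, 7):
    xf, xg = f(xf), g(xg)
    print(f"n={n}  |f iterate - sqrt(2)| = {abs(xf - root):.2e}   "
          f"|g iterate - sqrt(2)| = {abs(xg - root):.2e}")
```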

Exercise 5.25a

Each xn+1 is chosen to be the point where the line tangent to the graph of f at (xn, f(xn)) crosses the x-axis.

Exercise 5.25b

Lemma 1: xn+1 < xn if xn > ξ

We're told that f(b) > 0 and that f(x) = 0 only at x = ξ. We know that f is continuous since it is differentiable (theorem 5.2), so by the intermediate value theorem (theorem 4.23) we know that f(x) > 0 when x > ξ (otherwise we'd have f(x) = 0 for some second x ∈ (ξ, b)). We're also told that f′(x) > δ > 0 for all x ∈ (a, b) (without the δ it might be the case that f′(x) ≠ 0 but lim f′(x) = 0). Therefore the ratio f(xn)/f′(xn) is defined at each xn and is positive when xn > ξ. Therefore we have

xn − f(xn)/f′(xn) < xn   whenever xn > ξ

This of course means that xn+1 < xn when xn > ξ.

Lemma 2: xn+1 > ξ if xn > ξ

We're told that f′′(x) ≥ 0, which means that f′(x) is monotonically increasing, which means that c < xn implies f′(c) ≤ f′(xn), which means that

(f(xn) − f(ξ))/(xn − ξ) ≤ f′(xn)

because the LHS of this inequality is equal to f′(c) for some c ∈ (ξ, xn) by the mean value theorem. Using the fact that f(ξ) = 0 and a bit of algebraic manipulation, this inequality is equivalent to

xn − f(xn)/f′(xn) ≥ ξ

which of course means that xn+1 ≥ ξ.

Lemma 3: if {xn} → κ then f(κ) = 0

Suppose that limn→∞ xn+1 = κ. Then by the Cauchy criterion we have limn→∞ (xn − xn+1) = 0, which is equivalent to

limn→∞ [xn − (xn − f(xn)/f′(xn))] = 0

or

limn→∞ f(xn)/f′(xn) = 0

For this to hold it must either be the case that f(xn) → 0 or f′(xn) → ±∞: but we know that f′(xn) is bounded from the fact that f′′ is bounded (mean value theorem: if f′(xn) were unbounded, then (f′(xn) − f′(x1))/(xn − x1) = f′′(c) would be unbounded). Therefore it must be the case that f(xn) → 0. So we have limn→∞ xn = κ and limn→∞ f(xn) = 0. Therefore by theorem 4.2 we have

f(κ) = limx→κ f(x) = 0

The actual proof

Choose x1 > ξ. By induction, using lemma 2, we know that every element of {xn} will be > ξ. Therefore, by lemma 1, xn+1 < xn for all n. This means that {xn} is a decreasing sequence with a lower bound of ξ. Therefore, by theorem 3.14, the sequence converges to some point κ. By lemma 3 we have f(κ) = 0. But we're told that f has only one zero, so it must be the case that κ = ξ. This means that {xn} converges to ξ.

Exercise 5.25c

Using Taylor’s theorem to expand f(ξ) around f(xn), we have

f(ξ) = f(xn) + f′(xn)(ξ − xn) + f′′(tn)(ξ − xn)²/2

Subtracting f(xn) from both sides then dividing by f ′(xn) (we’re told f ′(x) ≥ δ > 0) gives us

(f(ξ) − f(xn))/f′(xn) = (ξ − xn) + f′′(tn)(ξ − xn)²/(2f′(xn))

Rearranging some terms and recognizing that f(ξ) = 0, we have

xn − f(xn)/f′(xn) − ξ = f′′(tn)(xn − ξ)²/(2f′(xn))

Which, by the definition of xn+1, is equivalent to

xn+1 − ξ = f′′(tn)(xn − ξ)²/(2f′(xn))

Exercise 5.25d

We’re told that f ′(x) ≥ δ and f ′′(x) ≤M , so the inequality in part (c) guarantees us that

xn+1 − ξ ≤ M(xn − ξ)²/(2δ) = A(xn − ξ)²

This allows us to recursively construct a chain of inequalities.

xn+1 − ξ ≤ A(xn − ξ)² ≤ A·A²(xn−1 − ξ)⁴ ≤ A·A²·A⁴(xn−2 − ξ)⁸ ≤ · · · ≤ A^1·A^2·A^4 · · · A^(2^(n−1)) · (x1 − ξ)^(2^n)

Collapsing the exponents of the rightmost term, we have

xn+1 − ξ ≤ A^(2^n − 1)·(x1 − ξ)^(2^n) = (1/A)·[A(x1 − ξ)]^(2^n)
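As a sanity check of the bound in part (d) (not part of the written proof), the sketch below runs Newton's method on the sample function f(x) = x³ − 2, whose zero is ξ = 2^(1/3); the choice of function, interval, δ and M here is an illustrative assumption, not something taken from the exercise.

```python
# Sketch: Newton's method on the sample function f(x) = x^3 - 2 (zero at xi = 2^(1/3)),
# comparing the actual error with the bound A*(previous error)^2, A = M/(2*delta),
# where delta = min f' and M = max f'' on the illustrative interval [1, 2].
f = lambda x: x**3 - 2
fp = lambda x: 3 * x**2      # f'
fpp = lambda x: 6 * x        # f''

xi = 2 ** (1 / 3)
delta = fp(1.0)              # f' is increasing on [1, 2], so its minimum is at 1
M = fpp(2.0)                 # f'' is increasing on [1, 2], so its maximum is at 2
A = M / (2 * delta)

x = 2.0                      # start with x1 > xi, as in part (b)
prev_err = x - xi
for n in range(1, 6):
    x = x - f(x) / fp(x)     # Newton step
    err = x - xi
    print(f"n={n}  error = {err:.3e}   bound A*(previous error)^2 = {A * prev_err**2:.3e}")
    prev_err = err
```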

Exercise 5.25e

How does g behave near ξ?

We're told f and f′ are differentiable, and clearly x is differentiable, therefore g is differentiable (theorem 5.3). Using the quotient rule (theorem 5.3 again), the derivative of g is given by

g′(x) = 1 − (f′(x)² − f(x)f′′(x))/f′(x)² = f(x)f′′(x)/f′(x)²


Taking the absolute value of each side, we have

|g′(x)| = |f(x)f′′(x)/f′(x)²|

Using the inequality we established in part (d) we have

|g′(x)| ≤ A|f(x)|

When we take the limit of each side of this inequality as x → ξ, the RHS tends to A|f(ξ)| = 0 because f is continuous and f(ξ) = 0:

limx→ξ |g′(x)| ≤ limx→ξ A|f(x)| = 0

which immediately implies

limx→ξ g′(x) = 0

and this describes the behavior of g′ as x→ ξ, which is what we were asked to describe.

Show that Newton’s method involves finding a fixed point of g

This is a slightly modified version of lemma 3 from part (b). Suppose that we have found a fixed point κ such that g(κ) = κ. From the definition of g this would mean that

g(κ) − κ = −f(κ)/f′(κ) = 0

For this to hold it must either be the case that f(κ) = 0 or f′(κ) = ±∞: but we know that f′ is bounded from the fact that f′′ is bounded (mean value theorem: if f′ were unbounded, then (f′(x) − f′(y))/(x − y) = f′′(c) would be unbounded for some x, y). Therefore it must be the case that f(κ) = 0, and therefore κ must be the unique point for which f(κ) = 0.

Exercise 5.25f

We'll consider the more general case in which f(x) = x^m. In this case the single real zero occurs at x = 0. Newton's formula gives us the step function

xn+1 = xn − f(xn)/f′(xn) = xn − xn^m/(m·xn^(m−1)) = xn − xn/m = ((m − 1)/m)·xn

This gives us a recursive definition of xn.

xn+1 = ((m − 1)/m)·xn = ((m − 1)/m)²·xn−1 = ((m − 1)/m)³·xn−2 = · · · = ((m − 1)/m)^n·x1

Taking the limit of each side as n→∞, we have

limn→∞ xn+1 = limn→∞ ((m − 1)/m)^n·x1    (45)

In order to have {xn} converge to 0 we must either have x1 = 0 or |(m − 1)/m| < 1. The latter case occurs when

|m − 1| < |m|

This inequality holds iff m > 1/2. If m is larger than 1/2 the limit in equation (45) exists and is equal to zero, and so {xn} → 0; if m is smaller than 1/2 then the limit in equation (45) is unbounded and so {xn} → ±∞; if m = 1/2 then equation (45) becomes

limn→∞ xn+1 = limn→∞ (−1)^n·x1

This limit clearly doesn't exist when x1 ≠ 0.

In the specific case given in the exercise we have m = 1/3 and therefore the sequence {xn} fails to converge.
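A short numerical illustration of the m = 1/3 case (a sketch, not part of the written solution): for f(x) = x^(1/3) the Newton step collapses to xn+1 = −2xn, so the iterates alternate in sign and double in magnitude instead of approaching the root at 0.

```python
# Sketch (not from the text): Newton's method applied to f(x) = x^(1/3).
# The step works out to x_{n+1} = -2*x_n, so the iterates never approach the root at 0.
def cbrt(x):
    return abs(x) ** (1 / 3) * (1 if x >= 0 else -1)   # real cube root

def newton_step(x):
    fx = cbrt(x)                 # f(x)  = x^(1/3)
    fpx = cbrt(x) / (3 * x)      # f'(x) = (1/3)*x^(-2/3) = f(x)/(3x)
    return x - fx / fpx

x = 0.1
for n in range(8):
    print(f"x_{n} = {x:+.4f}")
    x = newton_step(x)
```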


Exercise 5.26

Following the hint given in the exercise, let M0 = sup |f(x)| and let M1 = sup |f′(x)| for x ∈ [a, x0]. We know that

|f(x)| ≤ M1(x0 − a)    (46)

because otherwise we'd have

|f(x)/(x0 − a)| = |(f(x) − f(a))/(x0 − a)| = |f′(c)| for some c ∈ (a, x0), and this would be > M1 = sup |f′(x)|

which is clearly contradictory. Additionally, we know that

M1 = sup |f ′(x)| ≤ sup |Af(x)| = A sup |f(x)| = AM0 (47)

Combining (46) and (47) gives us

|f(x)| ≤M1(x0 − a) ≤ AM0(x0 − a) , x ∈ [a, x0] (48)

From the fact that f(a) = 0 we know that M1(x0 − a) is the maximum possible value that |f| could attain at x0 (otherwise, if |f(x0)| > M1(x0 − a), then the mean value theorem would give us some |f′(c)| > M1). In comparison, M0 is the maximum value that |f| actually does attain at some x ∈ [a, x0]. Therefore we have

M0 ≤M1(x0 − a) (49)

Suppose 0 < A(x0 − a) < δ < 1. Then (48) would give us

|f(x)| ≤M1(x0 − a) ≤ AM0(x0 − a) ≤ δM0

which contradicts (49) unless M1 = M0 = 0. And since we can force 0 < A(x0 − a) < 1 by choosing an appropriate x0, it must be the case that M1 = M0 = 0. This turns (48) into

|f(x)| ≤ 0 , x ∈ [a, x0]

which shows that f(x) = 0 on the interval [a, x0]. We can now repeat these steps using the interval [x0, b]. We need only perform this a finite number of times (specifically, a maximum of [(b − a)/(x0 − a)] + 1 times) before we have covered the entire interval. Therefore f(x) = 0 on the entire interval [a, b].

Exercise 5.27

The given hint is pretty much the entire solution. Let y1 and y2 be two solutions to the initial-value problem and define the function f(x) = y2(x) − y1(x). The function f meets all of the prerequisites spelled out in exercise 5.26: it's differentiable, f(a) = 0, and we're told that there is a real number A such that

|f ′(x)| = |y′2 − y′1| = |φ(x, y2)− φ(x, y1)| ≤ A|y2 − y1| = A|f(x)|

Therefore, by exercise 5.26, we have y2(x) − y1(x) = f(x) = 0 for all x; therefore y1 = y2. But y1 and y2 were arbitrary solutions of the initial-value problem, so we have proven that the solution is unique.

Exercise 5.28

Let ya and yb be two solution vectors to the initial-value problem and define the vector-valued function f(x) = yb(x) − ya(x). As in the previous problem, we see that f(x) is differentiable and that f(a) = 0. Therefore, if there is a real number A such that

|f′(x)| = |y′b − y′a| = |φ(x, yb) − φ(x, ya)| ≤ A|yb − ya| = A|f(x)|

then, by exercises 5.26 and 5.27 the initial-value problem has a unique solution.


Exercise 5.29

Note: there’s probably a more subtle answer here, probably involving Taylor’s theorem.

Let w and v be two solution vectors to the initial-value problem and define the vector-valued function f(x) = w(x) − v(x):

f(x) = [w1(x)− v1(x), w2(x)− v2(x), . . . , wk−1(x)− vk−1(x), wk(x)− vk(x)] (50)

The derivative of this is given by

f′(x) = [w′1(x) − v′1(x), w′2(x) − v′2(x), . . . , w′k−1(x) − v′k−1(x), w′k(x) − v′k(x)]

which, by the equivalences given in the exercise, becomes

f′(x) = [w2(x) − v2(x), w3(x) − v3(x), . . . , wk(x) − vk(x), Σ_{j=1}^{k} gj·(wj − vj)]    (51)

From exercises 5.26–5.28 we know that we want an inequality of the form |f′(x)| ≤ A|f(x)|. Using (50) and (51) we see that this inequality will hold if there exists some A such that

|(w2(x) − v2(x), . . . , wk(x) − vk(x), Σ_{j=1}^{k} gj·(wj − vj))| ≤ A·|(w1(x) − v1(x), w2(x) − v2(x), . . . , wk(x) − vk(x))|

The order of the components doesn't matter when we're taking the norm, and the last component on the left is a dot product, so this is equivalent to

|(w2(x) − v2(x), . . . , wk(x) − vk(x), g(x)·(w(x) − v(x)))| ≤ A·|(w2(x) − v2(x), . . . , wk(x) − vk(x), w1(x) − v1(x))|

Exercise 6.1 (Proof 1)

Outline of the proof

We'll see that L(P, f) = 0 for every partition; therefore sup L(P, f) = 0. By choosing an appropriate partition we can make U(P, f) arbitrarily small; therefore inf U(P, f) = 0. We conclude that ∫ f dα = 0.

sup L(P,f) = 0

Although it's easy to construct a partition such that L(P, f) = 0, we have to show that 0 is the supremum of the set of all L(P, f). To do this, let P be an arbitrary partition of [a, b] and let [xi−1, xi] be an arbitrary interval of this arbitrary partition, with mi = inf f on that interval. If the interval contains any point other than x0 then mi = 0 on this interval, so mi∆αi = 0. If the interval contains only the point x0 then ∆αi = α(x0) − α(x0) = 0 and therefore mi∆αi = 0. Therefore mi∆αi = 0 for all i, which means L(P, f) = 0. But P was an arbitrary partition, so L(P, f) = 0 for all P. Therefore

0 = sup{L(P, f) : P is a partition of [a, b]}, i.e. the lower integral of f dα over [a, b] is 0.

inf U(P,f)=0

We need to show that for any ε > 0 we can find some partition P such that 0 ≤ U(P, f) < ε. So let ε > 0 be given.

We’re told that α is continuous at x0, so by the definition of continuity we can find some δ > 0 such that

d(x, x0) < δ → d(α(x), α(x0)) < ε/2    (52)


Let 0 < µ < δ/2 and let P be the partition {a, x0 − µ, x0 + µ, b}. We now calculate each Mi∆αi:

M1∆α1 = M1[α(x0 − µ) − α(a)] = 0·[α(x0 − µ) − α(a)] = 0
M2∆α2 = M2[α(x0 + µ) − α(x0 − µ)] = 1·[α(x0 + µ) − α(x0 − µ)]
M3∆α3 = M3[α(b) − α(x0 + µ)] = 0·[α(b) − α(x0 + µ)] = 0

To determine a bound for M2∆α2 we apply (52), which allows us to conclude that α(x0 + µ) − α(x0 − µ) < ε from the fact that d(x0 − µ, x0 + µ) = 2µ < δ.

M2∆α2 = α(x0 + µ)− α(x0 − µ) ≤ ε

From this we have

U(P, f) = Σ_{i=1}^{3} Mi∆αi ≤ 0 + ε + 0 = ε

But this epsilon is arbitrarily small, so

0 = inf{U(P, f) : P is a partition of [a, b]}, i.e. the upper integral of f dα over [a, b] is 0, and therefore ∫_a^b f dα = 0.

Exercise 6.1 (Proof 2)

Thanks to Helen Barclay ([email protected]) for this proof. The function f is discontinuous at only one point and α is continuous at that point, so f ∈ R(α) by theorem 6.10. Every interval of every partition of [a, b] will contain a point other than x0, so mi = 0 for every interval; therefore L(P, f) = 0 for every P, and since f ∈ R(α) this means

∫_a^b f dα = sup L(P, f) = 0

Exercise 6.2

Outline of the proof

If we assume that f(x) ≠ 0 for some x ∈ [a, b], then we can construct a partition P for which L(P, f) > 0. From this we conclude that sup L(P, f) > 0 and therefore ∫ f dx ≠ 0.

The proof

Suppose that f(x0) = κ for some x0 ∈ [a, b], where κ is some nonzero number (and in fact κ > 0, since we're told f ≥ 0). We're told that f is continuous on a closed set, therefore f is uniformly continuous (theorem 4.19). Therefore there exists some δ > 0 such that |x0 − x| < δ → |f(x0) − f(x)| < κ/2. Now let 0 < µ < δ and let P be the four-element partition {a, x0 − µ, x0 + µ, b}. For clarity, let X = (x0 − µ, x0 + µ).

Since |x0 − x| ≤ µ < δ for all x ∈ X, we have |f(x0) − f(x)| = |κ − f(x)| < κ/2 for all x ∈ X. For this inequality to hold it must be the case that inf f(X) ≥ κ/2 > 0. This means that

L(P, f) = Σ mi∆xi ≥ m2∆x2 ≥ (κ/2)·[(x0 + µ) − (x0 − µ)] = (κ/2)(2µ) > 0

(the remaining terms of the sum are nonnegative because f ≥ 0)

Since L(P, f) > 0 for this particular P, it must be the case that sup L(P, f) > 0 and therefore ∫ f dx ≠ 0. We assumed that f(x) ≠ 0 for some x ∈ [a, b] and concluded that ∫ f dx ≠ 0: by contrapositive, if ∫ f dx = 0 then f(x) = 0 for all x ∈ [a, b].

Exercise 6.3

Outline of the proof

We'll see that U(P, f) − L(P, f) is equivalent to Mi − mi on a single arbitrarily small interval around 0. To make this difference arbitrarily small we will need to make sup f(x) − inf f(x) arbitrarily small by restricting x to a sufficiently small neighborhood of 0: this is possible iff f is continuous at 0.


Lemma 1: For each of the β functions, the upper and lower Riemann sums depend entirely on the intervals containing 0

Let P be an arbitrary partition of [−1, 1]. By theorem 6.4 we can assume, without loss of generality, that 0 is an element of this partition. We define α to be the partition element immediately preceding 0 and define ω to be the partition element immediately following 0 (so our partition has the form P = {x0 < x1 < . . . < α < 0 < ω < . . . < xn < xn+1}). Note that α and ω must both exist since P is a finite set, although it's possible that α = −1 or ω = 1. For every xi−1 < xi < 0 we have β(xi−1) = β(xi) = 0, therefore ∆β = 0, therefore Mi∆β = mi∆β = 0. Similarly, for every 0 < xi−1 < xi we have β(xi−1) = β(xi) = 1, therefore ∆β = 0, therefore Mi∆β = mi∆β = 0.

This shows that Mi∆β = mi∆β = 0 for every interval that doesn't contain 0. Therefore the values of U(P, f) and L(P, f) depend entirely on the intervals [α, 0] and [0, ω]. This holds for arbitrary P.

The general form of Mi∆β and mi∆β on [α, 0] and [0, ω]

Let P be an arbitrary partition. Let Mα denote the supremum of f(x) on the interval [α, 0]; similarly definemα,Mω,mω, etc. For all three of the β functions we have

Mα∆βα = Mα[β(0)− β(α)] = Mαβ(0) (53)

Mω∆βω = Mω[β(ω)− β(0)] = Mω[1− β(0)] (54)

mα∆βα = mα[β(0)− β(α)] = mαβ(0) (55)

mω∆βω = mω[β(ω)− β(0)] = mω[1− β(0)] (56)

(57)

Case 1: β(0) = 0

When β(0) = 0 we have

ΣMi∆β = Mαβ(0) + Mω[1 − β(0)] = Mω
Σmi∆β = mαβ(0) + mω[1 − β(0)] = mω

Proof that continuity of f implies integrability: let ε > 0 be given. We're told that f(0+) = f(0), so by definition of continuity at this point there exists some δ > 0 such that

d(0, x) < δ → d(f(0), f(x)) < ε/2

Construct a partition P = {−1, 0, δ, 1}. The interval [0, δ] is compact, therefore by theorem 4.16 the points f⁻¹(Mω) and f⁻¹(mω) both exist in [0, δ] and therefore:

d(f⁻¹(Mω), 0) < δ → d(Mω, f(0)) < ε/2
d(f⁻¹(mω), 0) < δ → d(mω, f(0)) < ε/2

and therefore, by the triangle inequality,

d(Mω,mω) ≤ d(Mω, f(0)) + d(f(0),mω) < ε

and therefore

U(P, f) − L(P, f) = Mω − mω < ε

And since ε was arbitrary, this is sufficient to show that ∫ f dβ exists.

Proof by contrapositive that integrability of f implies right-hand continuity: Assume that f(0+) ≠ f(0). From the negation of the definition of continuity we therefore know that there exists some ε > 0 such that for all δ we can find some 0 < x < δ such that |f(x) − f(0)| ≥ ε. So for any partition P let a be a point at which f(a) = Mω and let b be a point for which |f(b) − f(0)| ≥ ε. This gives us f(b) ≤ f(a) = Mω and mω ≤ f(0), therefore

U(P, f)− L(P, f) = Mω −mω = f(a)−mω ≥ f(a)− f(0) ≥ f(b)− f(0) ≥ ε


The partition P was arbitrary so this inequality must be true for all possible partitions, so we can never find P such that

U(P, f) − L(P, f) < ε

and therefore ∫ f dβ does not exist.

Proof that this integral, if it exists, is equal to f(0): from the fact that 0 is an element of [0, δ] we know that Mω ≥ f(0). We also know that L(P, f) ≤ ∫ f dβ (by theorem 6.4 and/or the definition of ∫ f dβ). This gives us the inequalities

Mω = U(P, f) ≥ f(0)
L(P, f) ≤ ∫ f dβ

from which we have

U(P, f) − L(P, f) ≥ f(0) − ∫ f dβ    (58)

But we also know that mω ≤ f(0) and that U(P, f) ≥ ∫ f dβ. This gives us

U(P, f) ≥ ∫ f dβ
mω = L(P, f) ≤ f(0)

from which we have

U(P, f) − L(P, f) ≥ ∫ f dβ − f(0)    (59)

Combining (58) and (59) gives us

|∫ f dβ − f(0)| ≤ U(P, f) − L(P, f) < ε

If this is to be true for all ε > 0 we must have ∫ f dβ = f(0).

Case 2: β(0) = 1

When β(0) = 1 we have

ΣMi∆β = Mαβ(0) + Mω[1 − β(0)] = Mα
Σmi∆β = mαβ(0) + mω[1 − β(0)] = mα

From here the proof that ∫ f dβ = f(0) iff f(0) = f(0−) is almost identical to the previous proof, with α in the place of ω.

Case 3: β(0) = 1/2

When β(0) = 1/2 we have

ΣMi∆β = Mαβ(0) + Mω[1 − β(0)] = (1/2)[Mα + Mω]
Σmi∆β = mαβ(0) + mω[1 − β(0)] = (1/2)[mα + mω]

Therefore

U(P, f) − L(P, f) = (1/2)[Mα − mα] + (1/2)[Mω − mω]

The function f is integrable iff this term can be made arbitrarily small; we saw in case (1) that [Mω − mω] can be made arbitrarily small iff f(0) = f(0+), and we saw in case (2) that [Mα − mα] can be made arbitrarily small iff f(0) = f(0−). Therefore we can minimize U(P, f) − L(P, f) in this case iff f(0+) = f(0−) = f(0); that is, iff f is continuous at 0.


Proof that this integral, if it exists, is equal to f(0): from the fact that 0 is an element of [0, δ] we know that Mω ≥ f(0) and Mα ≥ f(0). We also know that L(P, f) ≤ ∫ f dβ (by theorem 6.4 and/or the definition of ∫ f dβ). This gives us the inequalities

(1/2)[Mα + Mω] = U(P, f) ≥ (1/2)[f(0) + f(0)] = f(0)
L(P, f) ≤ ∫ f dβ

from which we have

U(P, f) − L(P, f) ≥ f(0) − ∫ f dβ    (60)

But we also know that mω ≤ f(0) and that mα ≤ f(0) and that U(P, f) ≥ ∫ f dβ. This gives us

U(P, f) ≥ ∫ f dβ
(1/2)[mα + mω] = L(P, f) ≤ (1/2)[f(0) + f(0)] = f(0)

from which we have

U(P, f) − L(P, f) ≥ ∫ f dβ − f(0)    (61)

Combining (60) and (61) gives us

|∫ f dβ − f(0)| ≤ U(P, f) − L(P, f) < ε

If this is to be true for all ε > 0 we must have ∫ f dβ = f(0).

Part (d) of the exercise

If f is continuous at 0 then we have f(0) = f(0+) = f(0−), so by case (1) we have ∫ f dβ1 = f(0), by case (2) we have ∫ f dβ2 = f(0), and by case (3) we have ∫ f dβ3 = f(0).

Exercise 6.4

Let P be an arbitrary partition of [a, b]. Let n = |P| − 1, so that n represents the number of intervals in the partition P. Every interval will contain at least one rational number (theorem 1.20b) and at least one irrational number (since |(xi−1, xi)| = |R| > |Q|, the interval cannot consist of rationals alone). Therefore we have Mi = 1, mi = 0 for all i. This gives us

ΣMi∆xi = Σ_{i=1}^{n} 1·(xi − xi−1) = xn − x0 = b − a
Σmi∆xi = Σ_{i=1}^{n} 0·(xi − xi−1) = 0

But P was an arbitrary partition, so this holds for all partitions, and therefore

sup L(P, f) = 0,    inf U(P, f) = b − a > 0

and therefore f ∉ R.

Exercise 6.5

Is f ∈ R if f2 ∈ R?

No. Consider the function

f(x) = −1 if x ∈ Q,   f(x) = 1 if x ∉ Q

The same argument as in exercise 6.4 shows that f ∉ R, but clearly f²(x) = 1 ∈ R.


Is f ∈ R if f³ ∈ R?

Yes. The function φ(x) = x^(1/3) is a continuous function, so we let h(x) = φ(f³(x)) = f(x) and appeal to theorem 6.11 to claim that

f³ ∈ R → h ∈ R → f ∈ R

Note that the claim that h(x) = φ(f³(x)) = f(x) relied on the fact that x → x³ is a one-to-one mapping, so that (x³)^(1/3) = x. In contrast, the mapping x → x² is not one-to-one, as can be seen by the fact that ((−1)²)^(1/2) ≠ −1.

Exercise 6.6

Outline of the proof

We can define a partition for which f is discontinuous on at most 2^n intervals, each of which has length about 3^(−n), so the Riemann sum across these intervals is proportional to (2/3)^n. This sum approaches 0 as n → ∞.

The proof

Let En be defined as in sec. 2.44. Note that E0 consists of one interval of length 1; E1 consists of two intervals of length 1/3; E2 consists of 4 intervals of length 1/9; and, in general, En will consist of 2^n intervals of length 3^(−n).

With each En we can associate a set Fn that contains the endpoints of the intervals contained in En. That is, we define

Fn = {a1 < b1 < a2 < b2 < . . . < a_{2^n} < b_{2^n}}

where each [ai, bi] is an interval in En. Since every point of the Cantor set – and therefore every point at which f might be discontinuous – is in an interval of the form [ai, bi], we'll choose a partition that lets us isolate these intervals.

Let m and M represent the lower and upper bounds of f on [0, 1]. Choose an arbitrary ε > 0. Choose n large enough that

3^(n−1) > 2^(n+1)(M − m)/ε

and choose δ such that

0 < δ < 1/3^(n+1)

These choices will be justified later. Define Pn to be

Pn = {0 = a1 < (b1 + δ) < (a2 − δ) < (b2 + δ) < (a3 − δ) < (b3 + δ) < . . . < (a_{2^n} − δ) < b_{2^n} = 1}

This partition contains 2^(n+1) points and therefore contains 2^(n+1) − 1 segments. We must now show that we can make U(Pn, f) − L(Pn, f) arbitrarily small.

U(Pn, f) − L(Pn, f) = Σ_{i=1}^{2^(n+1)−1} (Mi − mi)∆xi

We'll separate this sum into the intervals on which f is continuous (i = 2, 4, 6, . . .) and the intervals on which f might contain discontinuities (i = 1, 3, 5, . . .):

= Σ_{i=1}^{2^n} (M_{2i} − m_{2i})∆x_{2i} + Σ_{i=0}^{2^n} (M_{2i+1} − m_{2i+1})∆x_{2i+1}

The function f is continuous on every interval of the form [bi + δ, ai+1 − δ], and these intervals are represented by the lefthand summation. We can therefore refine Pn such that

≤ ε/2 + Σ_{i=1}^{2^n} (M_{2i+1} − m_{2i+1})∆x_{2i+1}


We know the exact value of the ∆x terms. For Pn, we have d(ai, bi) = 3^(−n) and therefore ∆x_{2i+1} = 3^(−n) + 2δ. Similarly, we know that Mi ≤ M and mi ≥ m, so we have

≤ ε/2 + Σ_{i=1}^{2^n} (M − m)·(1/3^n + 2δ)

From our choice of δ < 1/3^(n+1) this becomes

≤ ε/2 + Σ_{i=1}^{2^n} (M − m)·(1/3^(n−1))

This summation is constant with respect to i so it becomes simply

ε/2 + 2^n(M − m)·(1/3^(n−1))

From our choice of n we have 3^(n−1) > 2^(n+1)(M − m)/ε and therefore 3^(−(n−1)) < ε/(2^(n+1)(M − m)), so this is

≤ ε/2 + 2^n(M − m)·ε/(2^(n+1)(M − m)) = ε/2 + ε/2 = ε

Exercise 6.7a

If f ∈ R, then from theorem 6.12 we have

∫_0^1 f dx = ∫_0^c f dx + ∫_c^1 f dx    (62)

Now consider the partition P of [0, c] defined by P = {0, c}. This partition has only a single interval, so we have

inf_{0<x<c} f(x)·c = L(P, f) ≤ ∫_0^c f dx ≤ U(P, f) = sup_{0<x<c} f(x)·c

From (62), adding ∫_c^1 f dx to each term of this inequality gives us

∫_c^1 f dx + inf_{0<x<c} f(x)·c ≤ ∫_0^1 f dx ≤ ∫_c^1 f dx + sup_{0<x<c} f(x)·c

Taking the limit of each term of this inequality as c → 0, we see that f(x)·c → 0 (since f is bounded) while the center term remains constant. The resulting inequality is

lim_{c→0} ∫_c^1 f dx ≤ ∫_0^1 f dx ≤ lim_{c→0} ∫_c^1 f dx

which of course implies that

lim_{c→0} ∫_c^1 f dx = ∫_0^1 f dx

which is what we wanted to prove.

Exercise 6.7b

Outline of the proof

We construct a function f for which ∫ f = Σ_{i=1}^{∞} (−1)^i/i: this is the alternating harmonic series, which converges (see Rudin's remark 3.46). We then see that ∫ |f| = Σ_{i=1}^{∞} 1/i: this is the harmonic series, which diverges (theorem 3.28).


The proof

Choose any c ∈ (0, 1) and consider the following function defined on [c, 1]:

f(x) = (−1)^n·(n + 1)  if 1/(n+1) < x < 1/n for some n ∈ N,   and f(x) = 0 otherwise

Claim 1 : f is integrable on any interval [c, 1]

Choose an arbitrarily small ε > 0. Let 1/N be the smallest number of the form 1/n that is greater than c. We want to choose δ such that each such point in [c, 1] is contained in a distinct interval of radius δ: so choose δ such that 0 < δ < (1/2)[1/N − c]. We must also make sure that δ < ε/(2(N+1)(N+2)) for reasons that will become clear later. Let Hc represent the partition containing the elements {c} ∪ ({1/n ± δ : n ∈ N, n < 1/c} ∩ [0, 1]).

• The partition Hc contains one interval of the form [c, 1N − δ].

M1∆x1 = sup f(x)∆x1 = (−1)N (N + 1)∆x1 = (−1)N (N + 1)

(1

N− δ − 1

N + 1

)

= (−1)N (N + 1)

(1

N(N + 1)− δ)

= (−1)N(

1

N− (N + 1)δ

)The function is constant over this interval, so we have

m1∆x1 = inf f(x)∆x1 = sup f(x)∆x1 = M1

therefore|M1 −m1|∆x1 = 0 (63)

• The partition Hc contains one interval of the form [1− δ, 1] for which

Mn∆xn = sup f(x)∆xn = 0∆x = 0

mn∆xn = inf f(x)∆xn = −2∆x = −2δ

therefore|Mn −mn|∆xn = 2δ (64)

• The partition contains N − 1 intervals of the form[1i − δ,

1i + δ

]for which

Mi∆xi = sup f(x)∆xi ≤ sup |f(x)|2δ ≤ |i+ 1|2δ

mi∆xi = inf f(x)∆xi ≥ inf −|f(x)|2δ ≥ −|i+ 1|2δ

therefore|Mi −mi|∆xi ≤ |Mi|+ |mi| ≤ 4δ|i+ 1| (65)

• The partition contains N − 2 intervals of the form[

1j+1 + δ, 1j − δ

]for which

Mj∆xj = sup f(x)∆xj = (−1)j(j+1)

(1

j− δ − 1

j + 1− δ)

= (−1)j(j+1)

(1

j(j + 1)− 2δ

)= (−1)j

(1

j− 2δ(j + 1)

)The function is constant over this interval, so we have

mj∆xj = inf f(x)∆xj = sup f(x)∆x = M = (−1)j(

1

j− 2δ(j + 1)

)therefore

(Mj −mj)∆xj = 0 (66)


Summing equations (63)-(66), we have

|U(P, f)− L(P, f)| = |(M1 −m1)∆x1 + (Mn −mn)∆xn +

N∑i=2

(Mi −mi)∆xi +

N−1∑j=2

(Mj −mj)∆xj |

≤ |(M1 −m1)|∆x1 + |(Mn −mn)|∆xn +

N∑i=2

|Mi −mi|∆xi +

N−1∑j=2

|Mj −Mj |∆xj

≤ 0 + 2δ +

N∑i=2

4δ|i+ 1|+N−1∑j=2

0

= 2δ + 4δ(N + 1)(N + 2)− 1

2= 2δ(N + 1)(N + 2)

Earlier in the proof we chose δ such that δ < ε2(N+1)(N+2) , so we obtain the inequality

|U(P, f)− L(P, f)| ≤ ε

And ε was arbitrary, so this proves that f ∈ R.

Claim 2 : The value of limc→0

∫ 1

cf is defined

Adding the values for the M terms, we have

U(P, f) = M1 +Mn +

N∑i=2

Mi +

N−1∑i=2

Mj

≤ (−1)N(

1

N− (N + 1)δ

)+ 0 +

N∑i=2

|i+ 1|2δ +

N−1∑j=2

(−1)j(

1

j− 2δ(j + 1)

)

= (−1)N(

1

N− (N + 1)δ

)δ[(N + 1)(N + 2)− 6] +

N−1∑j=2

(−1)j(

1

j− 2δ(j + 1)

)

Remember that we first chose a value for c, which forced us to use a particular value for N ; our choice of δ cameafterward, so we’re still free to select δ small enough to make this last inequality become:

U(Pc, f) ≤ (−1)N

N+

N−1∑j=2

(−1)j1

j+ ε

We defined 1N to be the smallest harmonic number greater than c, so 1

N → 0 and N →∞ as c→ 0. Thereforetaking the limit of both sides as c→ 0 gives us

limc→0

U(Pc, f) =

∞∑j=2

(−1)j1

j+ ε = 1− ln(2) + ε

which proves that

limc→0

∫ 1

c

f dx ≥ 1− ln(2)

The values for the m terms are almost identical to the M terms, and adding them gives us the inequality

limc→0

∫ 1

c

f dx ≤ 1− ln(2)

Together, of course, these two inequalities prove that

limc→0

∫ 1

c

f dx = 1− ln(2)


Claim 3 : The value of limc→0

∫ 1

c|f | is not defined

If we follow the logic from claims 1 and 2, we find that adding the values for the M terms gives us

U(P, f) = M1 +Mn +

N∑i=2

Mi +

N−1∑i=2

Mj

≤(

1

N− (N + 1)δ

)+

1

2+

N∑i=2

|i+ 1|2δ +

N−1∑j=2

(1

j− 2δ(j + 1)

)

=

(1

N− (N + 1)δ

)+

1

2δ[(N + 1)(N + 2)− 6] +

N−1∑j=2

(1

j− 2δ(j + 1)

)

Again following the logic of part 2, we select δ small enough to gives us

limc→0

∫ 1

c

f dx =

∞∑j=2

1

j+ ε

But we know from chapter 3 that this series doesn’t converge, so the lefthand limit doesn’t exist and therefore

limc→0

∫ 1

cf dx does not exist.

Exercise 6.8

Choose an arbitrary real number c > 1. Let n be the greatest integer such that n < c. Define the partition Pn of [1, n] to be Pn = {x0 = 1, 2, 3, 4, . . . , n = xn−1}.

First, assume that Σf(n) diverges. Because f(x) > 0 this means that Σf(n) is unbounded (theorem 3.24).

The fact that f is monotonically decreasing allows us to derive the following chain of inequalities:

∫_1^c f(x) dx ≥ ∫_1^n f(x) dx ≥ L(Pn, f) = Σ_{i=1}^{n−2} mi∆xi = Σ_{i=1}^{n−2} f(i + 1)

(Note: the index is going to n − 2 instead of n − 1 because of the awkward numbering of the partition: note that the partition only goes to xn−1.) Taking the limit as c → ∞, we have

∫_1^∞ f(x) dx = lim_{c→∞} ∫_1^c f(x) dx ≥ lim_{n→∞} Σ_{i=1}^{n−2} f(i + 1)

The integral on the left-hand side of this chain of inequalities is greater than the unbounded sum on the right-hand side, so we conclude that the integral does not converge.

Next assume that Σf(n) converges to some κ ∈ R. Choose c, n, and Pn+1 as defined above. The fact that f is monotonically decreasing allows us to derive the following chain of inequalities:

∫_1^c f(x) dx ≤ ∫_1^{n+1} f(x) dx ≤ U(Pn+1, f) = Σ_{i=1}^{n} Mi∆xi = Σ_{i=1}^{n} f(i)


Taking the limit as c → ∞, we have

∫_1^∞ f(x) dx = lim_{c→∞} ∫_1^c f(x) dx ≤ lim_{n→∞} Σ_{i=1}^{n} f(i) = κ

We're told that f(x) > 0, so we know that ∫_1^c f(x) dx is a monotonically increasing function of c. The previous inequality tells us that ∫_1^c f dx is bounded above. Therefore by theorem 3.14 we know that lim_{c→∞} ∫_1^c f dx is defined.

Exercise 6.9

By the definition given in exercise 6.8:

∫_0^∞ f(x)g′(x) dx = lim_{c→∞} ∫_0^c f(x)g′(x) dx

The integral from 0 to c is finite, so we can use integration by parts:

∫_0^∞ f(x)g′(x) dx = lim_{c→∞} [ f(c)g(c) − f(0)g(0) − ∫_0^c f′(x)g(x) dx ]

Applying this to the given function with f(x) = (1 + x)^(−1) and g(x) = sin x, we have f′(x) = −(1 + x)^(−2) and g′(x) = cos x. Integration by parts therefore yields

∫_0^∞ (cos x)/(1 + x) dx = lim_{c→∞} [ (sin c)/(1 + c) − (sin 0)/1 + ∫_0^c (sin x)/(1 + x)² dx ]

Since −1 ≤ sin c ≤ 1, the non-integral terms both tend to zero as c → ∞. This leaves us with

∫_0^∞ (cos x)/(1 + x) dx = lim_{c→∞} ∫_0^c (sin x)/(1 + x)² dx

By the definition in exercise 6.8 this is equivalent to

∫_0^∞ (cos x)/(1 + x) dx = ∫_0^∞ (sin x)/(1 + x)² dx
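A numerical spot-check of the integration-by-parts identity used above, carried out on a finite interval [0, T] (a sketch, not part of the written solution; the cutoff T and the grid size below are arbitrary choices):

```python
# Spot-check of the identity on a finite interval [0, T]:
#   integral_0^T cos(x)/(1+x) dx  =  sin(T)/(1+T) + integral_0^T sin(x)/(1+x)^2 dx
import numpy as np

def trapezoid(y, dx):
    # simple trapezoid rule on an evenly spaced grid
    return (y.sum() - 0.5 * (y[0] + y[-1])) * dx

T = 50.0
x = np.linspace(0.0, T, 500_001)
dx = x[1] - x[0]
lhs = trapezoid(np.cos(x) / (1 + x), dx)
rhs = np.sin(T) / (1 + T) + trapezoid(np.sin(x) / (1 + x) ** 2, dx)
print(f"lhs = {lhs:.6f}   rhs = {rhs:.6f}")
```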

Exercise 6.10 a: method 1

This proof was due to Helen Barclay ([email protected]). We rewrite uv as the exponential of a natural log:

uv = (u^p)^(1/p)·(v^q)^(1/q) = exp(ln((u^p)^(1/p)·(v^q)^(1/q))) = exp((1/p)·ln u^p + (1/q)·ln v^q)

The exponential function f(x) = e^x has a strictly positive second derivative; therefore its first derivative is monotonically increasing (theorem 5.11); therefore f(x) is a convex function (proof in exercise 5.14, definition of convex in exercise 4.23). By definition of convexity with λ = 1/p, 1 − λ = 1/q we have

exp((1/p)·ln u^p + (1/q)·ln v^q) ≤ (1/p)·exp(ln u^p) + (1/q)·exp(ln v^q) = u^p/p + v^q/q

Combining these last two inequalities, we have

uv ≤ u^p/p + v^q/q

which is what we were asked to prove.


Exercise 6.10 a: method 2

Using the variable substitutions s = u^p, t = v^q, a = 1/p, (1 − a) = 1/q we can rewrite

uv ≤ u^p/p + v^q/q    (67)

as s^a·t^(1−a) ≤ as + (1 − a)t. Multiplying both sides by s^(−a)·t^(a−1) gives us

1 ≤ a·s^(1−a)·t^(a−1) + (1 − a)·s^(−a)·t^a

We can rewrite this previous inequality in two equivalent ways:

1 ≤ a·(t/s)^(a−1) + (1 − a)·(t/s)^a    (68)
1 ≤ a·(s/t)^(1−a) + (1 − a)·(s/t)^(−a)    (69)

If s ≤ t then t/s ≥ 1 and therefore

a·(t/s)^(a−1) + (1 − a)·(t/s)^a ≥ a·(1)^(a−1) + (1 − a)·(1)^a = 1

so the inequality in (68) holds. If s ≥ t then s/t ≥ 1, and therefore

a·(s/t)^(1−a) + (1 − a)·(s/t)^(−a) ≥ a·(1)^(1−a) + (1 − a)·(1)^(−a) = 1

so the inequality in (69) holds. But both of these inequalities are just (67) with a variable change, so we see that (67) holds when s ≥ t or when t ≥ s – that is, it always holds. And this is what we were asked to prove.

Exercise 6.10 a: method 3

This proof was due to Boris Shektman (don't email him). Define the function f(u) to be

f(u) = u^p/p + v^q/q − uv

This function has the derivative

f′(u) = u^(p−1) − v

Note that f′(u) = 0 has only one positive real solution, which occurs at u = v^(1/(p−1)). We can show algebraically that 1/(p − 1) = q/p. For u < v^(q/p) we have f′(u) < 0 and for u > v^(q/p) we have f′(u) > 0, so the point u = v^(q/p) is the unique global minimum of f(u). Evaluating the function at this value of u gives us

f(v^(q/p)) = v^q/p + v^q/q − v^(q/p)·v = (1/p + 1/q)·v^q − v^((p+q)/p)

We're given that 1/p + 1/q = 1 and, from this, we also know that p + q = pq. Making these substitutions gives us

f(v^(q/p)) = v^q − v^(pq/p) = v^q − v^q = 0

Therefore the unique minimum of f is f(v^(q/p)) = 0, and f(u) ≥ 0 for all other u.
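A quick numerical check of the inequality uv ≤ u^p/p + v^q/q, independent of the three proofs above (the sampling ranges below are arbitrary choices):

```python
# Random check of uv <= u^p/p + v^q/q for conjugate exponents 1/p + 1/q = 1.
import random

random.seed(0)
for _ in range(100_000):
    u = random.uniform(0.0, 10.0)
    v = random.uniform(0.0, 10.0)
    p = random.uniform(1.01, 10.0)
    q = p / (p - 1)                          # conjugate exponent
    assert u * v <= u**p / p + v**q / q + 1e-9, (u, v, p)
print("uv <= u^p/p + v^q/q held in every sampled case")
```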


Exercise 6.10b

From part (a), we have

∫_a^b fg dα ≤ ∫_a^b (f^p/p + g^q/q) dα = (1/p)·∫_a^b f^p dα + (1/q)·∫_a^b g^q dα = 1/p + 1/q = 1

Exercise 6.10c

Define κ and λ to be

κ = ∫_a^b |f|^p dα,   λ = ∫_a^b |g|^q dα

Define the normalized functions f̄ and ḡ to be

f̄(x) = f(x)/κ^(1/p),   ḡ(x) = g(x)/λ^(1/q)

These two functions are "normalized" in the sense that

∫_a^b |f̄|^p dα = ∫_a^b (|f|^p/κ) dα = (1/κ)·∫_a^b |f|^p dα = κ/κ = 1
∫_a^b |ḡ|^q dα = ∫_a^b (|g|^q/λ) dα = (1/λ)·∫_a^b |g|^q dα = λ/λ = 1

We know that f̄ and ḡ are Riemann integrable by theorem 6.11, and therefore by theorem 6.13 and part (b) of this exercise we know that |f̄| and |ḡ| are Riemann integrable and that

|∫_a^b f̄ḡ dα| ≤ ∫_a^b |f̄||ḡ| dα ≤ 1

By the definition of f̄ and ḡ this becomes

|∫_a^b (f/κ^(1/p))·(g/λ^(1/q)) dα| ≤ ∫_a^b (|f|/κ^(1/p))·(|g|/λ^(1/q)) dα ≤ 1

Multiplying through by the constant κ^(1/p)·λ^(1/q) gives us

|∫_a^b fg dα| ≤ ∫_a^b |fg| dα ≤ κ^(1/p)·λ^(1/q)

which, by the definition of κ and λ, is equivalent to

|∫_a^b fg dα| ≤ ∫_a^b |fg| dα ≤ {∫_a^b |f|^p dα}^(1/p)·{∫_a^b |g|^q dα}^(1/q)

which is what we wanted to prove.
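A numerical spot-check of Hölder's inequality in the integral form proved above, taking α(x) = x and approximating the integrals by Riemann sums (the functions f and g below are arbitrary test choices, not from the exercise):

```python
# Riemann-sum spot-check of |∫fg| <= (∫|f|^p)^(1/p) * (∫|g|^q)^(1/q) with alpha(x) = x.
import numpy as np

a, b = 0.0, 1.0
p, q = 3.0, 1.5                      # 1/3 + 1/1.5 = 1
x = np.linspace(a, b, 100_001)
dx = x[1] - x[0]
f = np.sin(5 * x) + 0.3              # arbitrary continuous test functions
g = np.exp(-x) * np.cos(7 * x)

lhs = abs(np.sum(f * g) * dx)
rhs = (np.sum(np.abs(f) ** p) * dx) ** (1 / p) * (np.sum(np.abs(g) ** q) * dx) ** (1 / q)
print(f"|∫fg| ≈ {lhs:.4f}   (∫|f|^p)^(1/p)·(∫|g|^q)^(1/q) ≈ {rhs:.4f}")
```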

Exercise 6.10d

Proof by contrapositive. If the inequality were false for the improper integrals, we would be able to find functions f and g such that

lim_{c→∞} |∫_a^c fg dα| > lim_{c→∞} {∫_a^c |f|^p dα}^(1/p)·{∫_a^c |g|^q dα}^(1/q)

or

lim_{c→x0} |∫_c^b fg dα| > lim_{c→x0} {∫_c^b |f|^p dα}^(1/p)·{∫_c^b |g|^q dα}^(1/q)

But these are strict inequalities, so in either case we would be able to find some finite value of c (near ∞ or near x0) for which the corresponding inequality between proper integrals still held. And this would give us a proper integral for which Hölder's inequality doesn't hold. By contrapositive, the fact that Hölder's inequality is valid for proper integrals shows that it's true for improper integrals.


Exercise 6.11

If we can prove that ||f + g|| ≤ ||f|| + ||g||, then we can prove the given inequality by letting f = f − g and g = g − h.

||f + g||² = ∫_a^b |f + g|² dα                                                      definition of ||·||
           ≤ ∫_a^b (|f| + |g|)² dα                                                  the triangle inequality is established for |·|
           = ∫_a^b (|f|² + 2|fg| + |g|²) dα
           = ∫_a^b |f|² dα + 2∫_a^b |fg| dα + ∫_a^b |g|² dα                         properties of integrals: theorem 6.12
           ≤ ∫_a^b |f|² dα + 2{∫_a^b |f|² dα}^(1/2)·{∫_a^b |g|² dα}^(1/2) + ∫_a^b |g|² dα    Hölder's inequality (exercise 6.10)
           = ||f||² + 2||f|| ||g|| + ||g||²                                         definition of ||·||
           = (||f|| + ||g||)²

Taking the square root of the first and last terms of this chain of inequalities, we have

||f + g|| ≤ ||f|| + ||g||

This inequality must hold for any f, g ∈ R. Letting f = f − g and g = g − h this becomes

||f − h|| ≤ ||f − g|| + ||g − h||

which is what we were asked to prove.

Exercise 6.12

We're told that f is continuous, so it has upper and lower bounds m ≤ f(x) ≤ M. Choose any ε > 0. Since f ∈ R(α), we can find some partition P such that

U(P, f, α) − L(P, f, α) < ε²/(M − m)

Having now determined a particular partition P we can define g as suggested in the hint. This function is simply a series of straight lines connecting f(xi−1) to f(xi). This becomes clearer if we rewrite g in an algebraically equivalent form:

g(t) = f(xi−1) + [(f(xi) − f(xi−1))/(xi − xi−1)]·(t − xi−1),   t ∈ [xi−1, xi]

We see that on any interval [xi−1, xi] the function g(t) is bounded between f(xi−1) and f(xi). That is, for t ∈ [xi−1, xi],

mi ≤ min{f(xi−1), f(xi)} ≤ g(t) ≤ max{f(xi−1), f(xi)} ≤ Mi

And of course we have similar bounds on f:

mi ≤ f(t) ≤ Mi

and therefore, since f(t), g(t) ≤ Mi and −f(t), −g(t) ≤ −mi, we have

|f(t) − g(t)| ≤ |Mi − mi| = Mi − mi    (70)

Similarly, since Mi ≤ M and −mi ≤ −m we have

Mi − mi ≤ M − m    (71)

We’ve now established all of the inequalities we need to complete our proof.


||f − g||² = ∫_a^b |f(x) − g(x)|² dα                     definition of ||·||
           = Σ_{i=1}^{n} ∫_{xi−1}^{xi} |f(x) − g(x)|² dα   integral property 6.12c
           ≤ Σ_{i=1}^{n} ∫_{xi−1}^{xi} |Mi − mi|² dα       from (70)
           = Σ_{i=1}^{n} |Mi − mi|² ∆αi                    integral property 6.12a
           ≤ Σ_{i=1}^{n} (M − m)(Mi − mi) ∆αi              from (71)
           = (M − m)·Σ_{i=1}^{n} (Mi − mi) ∆αi             integral property 6.12a
           = (M − m)·[U(P, f, α) − L(P, f, α)]             definition of U and L
           < (M − m)·ε²/(M − m)                            we chose P so that this would hold
           = ε²

Taking the square root of the first and last terms gives us

||f − g|| ≤ ε

which is what we were asked to prove.
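A numerical illustration of the approximation used above, taking α(x) = x (a sketch, not the proof): g is the piecewise-linear interpolant of f on a uniform partition, and ||f − g|| shrinks as the partition is refined. The test function f below is an arbitrary choice.

```python
# g is the piecewise-linear interpolant of f on a uniform partition of [a, b];
# ||f - g|| (computed as a Riemann sum) shrinks as the partition is refined.
import numpy as np

f = lambda x: np.sin(3 * x) + x**2       # arbitrary continuous test function
a, b = 0.0, 2.0
fine = np.linspace(a, b, 200_001)
dx = fine[1] - fine[0]

for n in [4, 8, 16, 32, 64]:
    nodes = np.linspace(a, b, n + 1)
    g = np.interp(fine, nodes, f(nodes))             # piecewise-linear interpolant
    err = np.sqrt(np.sum((f(fine) - g) ** 2) * dx)   # ||f - g|| as defined in exercise 6.11
    print(f"n = {n:3d} intervals   ||f - g|| ≈ {err:.5f}")
```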

Exercise 6.13a

Letting u = t² we have du/dt = 2t and therefore

dt = du/(2t) = du/(2√u)

Using the change of variables theorem (6.19) we see that the given integral is equivalent to

f(x) = ∫_{x²}^{(x+1)²} (sin u)/(2√u) du    (72)

Integration by parts (theorem 6.22) with F = 1/(2√u) and g = sin u du gives us

f(x) = [−(cos u)/(2√u)]_{x²}^{(x+1)²} − ∫_{x²}^{(x+1)²} (cos u)/(4u^(3/2)) du

which expands to

f(x) = −cos((x+1)²)/(2(x+1)) + cos(x²)/(2x) − ∫_{x²}^{(x+1)²} (cos u)/(4u^(3/2)) du

By the triangle inequality this gives us

|f(x)| =

∣∣∣∣∣− cos(x+ 1)2

2(x+ 1)+

cosx2

2x−∫ (x+1)2

x2

cosu

4u3/2du

∣∣∣∣∣≤

∣∣∣∣− cos(x+ 1)2

2(x+ 1)

∣∣∣∣+

∣∣∣∣cosx2

2x

∣∣∣∣+

∣∣∣∣∣∫ (x+1)2

x2

cosu

4u3/2du

∣∣∣∣∣<

∣∣∣∣ 1

2(x+ 1)

∣∣∣∣+

∣∣∣∣ 1

2x

∣∣∣∣+

∣∣∣∣∣∫ (x+1)2

x2

1

4u3/2du

∣∣∣∣∣=

∣∣∣∣ 1

2(x+ 1)

∣∣∣∣+

∣∣∣∣ 1

2x

∣∣∣∣+

∣∣∣∣12[

1

x+ 1− 1

x

]∣∣∣∣=

1

2(x+ 1)+

1

2x+

1

2x(x+ 1)

=x+ (x+ 1) + 1

2x(x+ 1)

=2(x+ 1)

2x(x+ 1)=

1

x


Note that the strict inequality is justified by the fact that cos u ≤ 1 for all u ∈ [x², (x+1)²] but cos is not a constant function, so cos u < 1 for some u ∈ [x², (x+1)²].
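A numerical check of the bound |f(x)| < 1/x from part (a), where f(x) = ∫_x^{x+1} sin(t²) dt (a sketch, not part of the written proof):

```python
# Check |f(x)| < 1/x for f(x) = ∫_x^{x+1} sin(t^2) dt, via a trapezoid-rule approximation.
import numpy as np

def f(x, npts=200_001):
    t = np.linspace(x, x + 1, npts)
    y = np.sin(t**2)
    dt = t[1] - t[0]
    return (y.sum() - 0.5 * (y[0] + y[-1])) * dt     # trapezoid rule

for x in [1.0, 2.0, 5.0, 10.0, 20.0]:
    print(f"x = {x:5.1f}   |f(x)| ≈ {abs(f(x)):.5f}   1/x = {1 / x:.5f}")
```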

Exercise 6.13b

In the previous problem we used integration by parts to determine that

f(x) = −cos((x+1)²)/(2(x+1)) + cos(x²)/(2x) − ∫_{x²}^{(x+1)²} (cos u)/(4u^(3/2)) du

Multiplying by 2x gives us

2x·f(x) = −x·cos((x+1)²)/(x+1) + cos(x²) − 2x·∫_{x²}^{(x+1)²} (cos u)/(4u^(3/2)) du

which is algebraically equivalent to

2x·f(x) = cos(x²) − cos((x+1)²) + cos((x+1)²)/(x+1) − 2x·∫_{x²}^{(x+1)²} (cos u)/(4u^(3/2)) du

Letting r(x) be defined as

r(x) = cos((x+1)²)/(x+1) − 2x·∫_{x²}^{(x+1)²} (cos u)/(4u^(3/2)) du

we have, by the triangle inequality,

|r(x)| ≤ |cos((x+1)²)/(x+1)| + |2x·∫_{x²}^{(x+1)²} (cos u)/(4u^(3/2)) du|
       ≤ |1/(x+1)| + |2x·∫_{x²}^{(x+1)²} 1/(4u^(3/2)) du|
       = |1/(x+1)| + |2x·(1/2)·[1/(x+1) − 1/x]|
       = 1/(x+1) + x·1/(x(x+1))
       = 1/(x+1) + 1/(x+1)
       = 2/(x+1)
       < 2/x

So we see that

2x·f(x) = cos(x²) − cos((x+1)²) + r(x)

where |r(x)| < 2/x.

Exercise 6.13c

We're asked to find the lim sup and lim inf of the function

x·f(x) = (1/2)·[cos(x²) − cos((x+1)²) + r(x)]

We established in part (b) that r(x) → 0 as x → ∞. It's also clear that this function is bounded above by 1 and bounded below by −1, but it's not immediately clear that these bounds are the lim sup and the lim inf.


Remark: The supremum and infimum of cos(x²) − cos((x+1)²) are never attained

The supremum would be attained if we could find x such that

cos(x²) − cos((x+1)²) = 2    (73)

which would occur precisely when

x² = 2kπ,   (x+1)² = (2j + 1)π,   j, k ∈ N    (74)

After some tedious algebra, we can see that these two equalities hold when

((π[2(j − k) + 1] − 1)/(2√(2π)))² = k    (75)

But this equation can’t hold. If it did, we could rearrange it algebraically to give us

π²[2(j − k) + 1]² − 2π[2j + 2k + 1] + 1 = 0

This would allow us to use the quadratic formula to give an algebraic expression for π; but this is impossible asπ is a transcendental number. The infimum of cos(x2)− cos([x+ 1]2) is not obtained for the same reason.

Proof: The lim sup of xf(x) is 1

Let 1 > ε > 0 and N ∈ N be given. Let δ be chosen so that

0 < δ <cos−1(1− ε)

Define the function p(m) to be

p(m) =

(π[2m+ 1]− 1

2√

)2

Its derivative with respect to m isp′(m) =

√2π(π[2m+ 1]− 1)

which is strictly increasing. Therefore we can make the derivative as large as we want by choosing a sufficientlylarge m. Specifically, we are able to choose M ∈ N such that p′(M)δ > 1 and M > N . By the mean valuetheorem we have

p(M + δ)− p(M) = p′(ξ)δ, ξ ∈ (M,M + δ)

From the strictly increasing nature of p′ and our choice of M we have

p(M + δ)− p(M) = p′(ξ)δ > p′(M)δ > 1

Therefore p(x) must take an integer value for at least one x ∈ [M,M + δ]. Let κ represent one such x. Thetwo important properties of κ are that p(κ) is an integer and that κ −M < δ (so that κ itself is “almost” aninteger). We now have

p(κ) = p(M + δ) =

(π[2(M + δ) + 1]− 1

2√

)2

= k ∈ N

If we define j to be j = k +M this becomes(π[2(j − k + δ) + 1]− 1

2√

)2

= k ∈ N

Reversing the algebraic steps that led from (74) to (75) tells us that there exists some x ∈ R such that

x2 = 2kπ, [x+ 1]2 = (2(j + δ) + 1)π, j, k ∈ N

Using this value of x in our original function, we have

xf(x) =1

2

[cos(x2)− cos([x+ 1]2) + r(x)

]=

1

2[cos(2kπ)− cos(2(j + 1)π + 2δπ) + r(x)]


Using trig identities, this becomes

xf(x) =1

2[1− {cos(2(j + 1)π) cos(2επ)− sin(2(j + 1)π) sin(2επ)}+ r(x)]

=1

2[1− (−1) cos(2επ)− 0 + r(x)]

Finally, from our original choice of δ as an inverse cosine, this becomes

xf(x) =1

2[1 + (1− ε) + r(x)]

= 1 +1

2[r(x)− ε]

From part (b), we know that r(x)→ 0 as x→∞:

limx→∞

xf(x) = 1− ε

2

And ε was arbitrary, so the supremum islimx→∞

supxf(x) = 1

The proof that lim inf xf(x) = −1 is similar.

Exercise 6.13d

∫_0^∞ sin(t²) dt = Σ_{n=0}^{∞} ∫_{√(nπ)}^{√((n+1)π)} sin(t²) dt    (76)
                 = Σ_{n=0}^{∞} ∫_{√(nπ)}^{√((n+1)π)} (−1)^n·|sin(t²)| dt
                 = Σ_{n=0}^{∞} (−1)^n·∫_{√(nπ)}^{√((n+1)π)} |sin(t²)| dt
                 ≤ Σ_{n=0}^{∞} (−1)^n·∫_{√(nπ)}^{√((n+1)π)} 1 dt    (and is similarly bounded below)
                 = √π·Σ_{n=0}^{∞} (−1)^n·(√(n+1) − √n)    (77)

We saw in exercise 3.6(a) that the sequence {√(n+1) − √n} is decreasing. Therefore, by the alternating series theorem (3.43) the series in (77) converges. Therefore, by the comparison test (3.25) the series in (76) converges, and therefore the integral in (76) converges.

Exercise 6.14a

Letting u = e^t we have du/dt = e^t = u and therefore

dt = du/e^t = du/u

Using the change of variables theorem (6.19) we see that the given integral is equivalent to

f(x) = ∫_{e^x}^{e^(x+1)} (sin u)/u du    (79)


We use integration by parts as we did in 6.13(a):

f(x) = [−(cos u)/u]_{e^x}^{e^(x+1)} − ∫_{e^x}^{e^(x+1)} (cos u)/u² du

which expands to

f(x) = cos(e^x)/e^x − cos(e^(x+1))/e^(x+1) − ∫_{e^x}^{e^(x+1)} (cos u)/u² du

By the triangle inequality this gives us

|f(x)| = |cos(e^x)/e^x − cos(e^(x+1))/e^(x+1) − ∫_{e^x}^{e^(x+1)} (cos u)/u² du|
       ≤ |cos(e^x)/e^x| + |cos(e^(x+1))/e^(x+1)| + |∫_{e^x}^{e^(x+1)} (cos u)/u² du|
       < |1/e^x| + |1/e^(x+1)| + |∫_{e^x}^{e^(x+1)} 1/u² du|
       = |1/e^x| + |1/e^(x+1)| + |1/e^x − 1/e^(x+1)|
       = 1/e^x + 1/e^(x+1) + (e − 1)/e^(x+1)
       = 2e/e^(x+1)
       = 2/e^x

Therefore e^x·|f(x)| < 2. Note that the strictness of the inequality is justified by the fact that cos u ≤ 1 for all u ∈ [e^x, e^(x+1)] but cos is not a constant function, so cos u < 1 for some u ∈ [e^x, e^(x+1)].

Exercise 6.14b

In the previous exercise we used integration by parts to determine that

f(x) = cos(e^x)/e^x − cos(e^(x+1))/e^(x+1) − ∫_{e^x}^{e^(x+1)} (cos u)/u² du

Multiplying by e^x gives us

e^x·f(x) = cos(e^x) − e^(−1)·cos(e^(x+1)) − e^x·∫_{e^x}^{e^(x+1)} (cos u)/u² du

Letting r(x) be defined as

r(x) = −e^x·∫_{e^x}^{e^(x+1)} (cos u)/u² du

we have, by the triangle inequality,

|r(x)| ≤ |e^x·∫_{e^x}^{e^(x+1)} (cos u)/u² du|
       ≤ |e^x|·|∫_{e^x}^{e^(x+1)} 1/u² du|
       = |e^x|·|1/e^x − 1/e^(x+1)|
       = (e − 1)/e
       < 2/e

So we see that

e^x·f(x) = cos(e^x) − e^(−1)·cos(e^(x+1)) + r(x)

where |r(x)| < 2e^(−1).

Exercise 6.15a

Using integration by parts with F = f²(x) and g = dx gives us

1 = ∫_a^b f²(x) dx = [x·f²(x)]_a^b − ∫_a^b 2f(x)f′(x)·x dx

which evaluates to

1 = b·f²(b) − a·f²(a) − ∫_a^b 2f(x)f′(x)·x dx

which, since f(a) = f(b) = 0, becomes

1 = −∫_a^b 2f(x)f′(x)·x dx

Dividing both sides by −2 gives us

−1/2 = ∫_a^b f(x)f′(x)·x dx

which is the desired equality.

Exercise 6.15b

Applying Hölder's inequality to the last equation in part (a) gives us

|−1/2| = |∫_a^b f(x)f′(x)·x dx| ≤ {∫_a^b |x²f²(x)| dx}^(1/2)·{∫_a^b |f′(x)|² dx}^(1/2)

Since f(x)f′(x) must be negative at some point (by the mean value theorem, since f(a) = f(b) = 0) while x²f² and [f′(x)]² are nonnegative, this inequality must be strict:

|−1/2| = |∫_a^b f(x)f′(x)·x dx| < {∫_a^b |x²f²(x)| dx}^(1/2)·{∫_a^b |f′(x)|² dx}^(1/2)

Squaring the first and last term in this chain of inequalities, we have

1/4 < ∫_a^b |x²f²(x)| dx · ∫_a^b |f′(x)|² dx


All of the absolute values are applied to squares, so they are redundant:

1/4 < ∫_a^b x²f²(x) dx · ∫_a^b [f′(x)]² dx

Exercise 6.16a

We can integrate the given function separately over infinitely many intervals of length 1:

s·∫_1^∞ [x]/x^(s+1) dx = Σ_{n=1}^{∞} s·∫_n^{n+1} [x]/x^(s+1) dx

We have [x] = n on the interval [n, n+1), so this becomes

= Σ_{n=1}^{∞} s·∫_n^{n+1} n/x^(s+1) dx = Σ_{n=1}^{∞} s·n·[−1/(s·x^s)]_{x=n}^{n+1} = Σ_{n=1}^{∞} n·[1/n^s − 1/(n+1)^s]

We then split up the summation into three parts:

= Σ_{n=1}^{1} n·(1/n^s) + Σ_{n=2}^{∞} n·(1/n^s) − Σ_{n=1}^{∞} n·(1/(n+1)^s)

Evaluating the first summation and changing the index of the third gives us

= 1 + Σ_{n=2}^{∞} n·(1/n^s) − Σ_{n=2}^{∞} (n − 1)·(1/n^s)

= 1 + Σ_{n=2}^{∞} (n − (n − 1))/n^s

= 1 + Σ_{n=2}^{∞} 1/n^s

And since 1 is clearly equal to 1/n^s when n = 1, this is equivalent to

= Σ_{n=1}^{∞} 1/n^s
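A numerical check of the identity derived above (a sketch, not part of the proof): for s > 1, the truncated quantity s·∫_1^N [x]/x^(s+1) dx should approach ζ(s) as N grows. Each interval [n, n+1) is integrated in closed form, exactly as in the computation above.

```python
# For s > 1, s * ∫_1^N [x]/x^(s+1) dx = sum_{n=1}^{N-1} n*(n^-s - (n+1)^-s), which
# should approach zeta(s) as N grows; compare against zeta(2) = pi^2/6.
import math

def truncated_integral(s, N):
    return sum(n * (n**-s - (n + 1) ** -s) for n in range(1, N))

s = 2.0
for N in [10, 100, 10_000]:
    print(f"N = {N:6d}   s*∫_1^N [x]/x^(s+1) dx ≈ {truncated_integral(s, N):.6f}")
print(f"zeta(2) = pi^2/6 ≈ {math.pi**2 / 6:.6f}")
```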

Exercise 6.16b

We're asked to evaluate the expression

s/(s − 1) − s·∫_1^∞ (x − [x])/x^(s+1) dx

Having determined that [x]/x^(s+1) is integrable in part (a), we can split up the integral as follows:

= s/(s − 1) − s·∫_1^∞ 1/x^s dx + s·∫_1^∞ [x]/x^(s+1) dx

Elementary calculus allows us to calculate the left integral:

= s/(s − 1) − s/(s − 1) + s·∫_1^∞ [x]/x^(s+1) dx

= s·∫_1^∞ [x]/x^(s+1) dx

This, as we saw in part (a), is equivalent to ζ(s).


Exercise 6.17

I'm going to change the notation of this problem a bit to make it clearer (to me, at least) by letting f = G and f′ = g. We're told that α is a monotonically increasing function on [a, b] and that f is continuous. We're asked to prove that

∫_a^b α(x)f′(x) dx = f(b)α(b) − f(a)α(a) − ∫_a^b f(x) dα

Most of the work is done for us by theorem 6.22 (integration by parts), which tells us that

∫_a^b α(x)f′(x) dx = f(b)α(b) − f(a)α(a) − ∫_a^b f(x)α′(x) dx

So our proof is reduced to simply proving that

∫_a^b f(x) dα = ∫_a^b f(x)α′(x) dx    (80)

It’s tempting to appeal to elementary calculus to show that dα = α′(x) dx, but we can prove this more formally.

Proving this more formally

Let ε > 0 be given. We're told that f is continuous and that α is monotonically increasing; therefore f ∈ R(α). Let P be an arbitrary partition of [a, b] such that U(P, f, α) − L(P, f, α) < ε/2.

On any interval [xi−1, xi] the mean value theorem tells us that there is some ti ∈ [xi−1, xi] such that

α(xi−1)− α(xi) = α′(ti)[xi−1 − xi]

or, to express the same equation with different notation,

∆αi = α′(ti)∆xi (81)

By theorem 6.7(c) we have

|Σ_{i=1}^{n} f(ti)∆αi − ∫_a^b f(x) dα| < ε/2    (82)

and also

|Σ_{i=1}^{n} f(ti)α′(ti)∆xi − ∫_a^b f(x)α′(x) dx| < ε/2    (83)

By (81) we have

Σ_{i=1}^{n} f(ti)∆αi = Σ_{i=1}^{n} f(ti)α′(ti)∆xi

Therefore, by (82) and (83) and the triangle inequality, we have

|∫_a^b f(x) dα − ∫_a^b f(x)α′(x) dx| < ε

which, in order to be true for any ε > 0, requires that

∫_a^b f(x) dα = ∫_a^b f(x)α′(x) dx

This proves (80), and this completes the proof.


Exercise 6.18

The derivative γ′1(t) = ie^(it) is continuous, therefore γ1(t) is rectifiable (theorem 6.27) and its length is given by

∫_0^{2π} |ie^(it)| dt = ∫_0^{2π} 1 dt = 2π

The derivative γ′2(t) = 2ie^(2it) is continuous, therefore γ2(t) is rectifiable and its length is given by

∫_0^{2π} |2ie^(2it)| dt = ∫_0^{2π} 2 dt = 4π

The arc γ3 is not rectifiable

Proof by contradiction. If γ3 were rectifiable then its length would be given by the integral

∫_0^{2π} |γ′3(t)| dt = ∫_0^{2π} |[2πi·sin(1/t) − (2πi/t)·cos(1/t)]·e^(2πi·t·sin(1/t))| dt = 2π·∫_0^{2π} |sin(1/t) − (1/t)·cos(1/t)| dt    (84)

But this integral isn't defined, since Riemann integrals are defined only for bounded functions and this integrand is unbounded:

lim_{k→∞} |γ′3(1/(2kπ))| = lim_{k→∞} 2π·|sin(2kπ) − 2kπ·cos(2kπ)| = lim_{k→∞} 2π·2kπ = lim_{k→∞} 4π²k = ∞

Therefore the integral in (84) doesn't exist, which means the arc length of γ3 is not defined, which means that γ3 is not rectifiable.

Exercise 6.19

Lemma 1: If f : A → B and g : B → C are both one-to-one, then g ◦ f : A → C is one-to-one

By definition of “one-to-one”, we have

f(x) = f(y)→ x = y, g(s) = g(t)→ s = t

Thereforeg(f(x)) = g(f(y))→ f(x) = f(y)→ x = y

and so g ◦ f is one-to-one.

Lemma 2: If g ◦ f : A → C is one-to-one and f : A → B is one-to-one and onto, then g : B → C isone-to-one.

Proof by contrapositive. Suppose that g is not one-to-one. Then we could find x, y ∈ B such that g(x) = g(y)but x 6= y. But f is one-to-one and onto, so there exist unique s, t ∈ A such that f(s) = x, f(t) = y. So we haveg(f(s)) = g(f(t)) but f(s) 6= f(t) and therefore s 6= t. So g ◦ f is not one-to-one. By contrapositive, the lemmais proven.

Lemma 3: If f : [a, b] → [c, d] is a continuous, one-to-one, real function then f is either strictlydecreasing or strictly increasing

Proof by contrapositive. Let f be a continuous real function. If f were not strictly increasing and not strictly decreasing then we could find some x < y < z such that either f(y) ≤ f(x) and f(y) ≤ f(z), or such that f(y) ≥ f(x) and f(y) ≥ f(z). From the intermediate value property of continuous functions we know that we must then be able to find x′, z′ such that

x ≤ x′ < y < z′ < z, f(x′) = f(z′)

so that f is not one-to-one. By contrapositive, the lemma is proven.


Lemma 4: If P = {xi} is a partition of γ2, then P ′ = {φ(xi)} is a partition of γ1

From lemma 3 we know that φ is strictly increasing or decreasing, and since φ(c) = a it must be strictlyincreasing. And φ is onto, so φ(d) = b. Let P be an arbitrary partition of [c, d]:

P = {xi} = {c = x0, x1, . . . , xn, xn+1 = d}

From the properties of φ we have

a = φ(c) = φ(x0) < φ(x1) < φ(x2) < . . . < φ(xn) < φ(xn+1) = φ(d) = b

and therefore{φ(xi)} = {a = φ(x0), φ(x1), φ(x2), . . . , φ(xn), φ(xn+1) = b}

which is a partition of [a, b].

Lemma 5: If P = {xi} is a partition of γ1, then P ′ = φ−1(xi) is a partition of γ2

We’re told that φ is a continuous one-to-one function from [a, b] to [c, d], so its inverse exists and is a continuousmapping from [a, b] onto [c, d] (theorem 4.17). By lemma 3, this means that φ−1 is either strictly increasing orstrictly decreasing, and since φ−1(a) = c it must be strictly increasing. And φ−1 is onto, so φ−1(b) = d. Let Pbe an arbitrary partion of [a, b]:

P = {xi} = {a = x0, x1, . . . , xn, xn+1 = b}

From the properties of φ−1 we have

c = φ−1(a) = φ−1(x0) < φ−1(x1) < φ−1(x2) < . . . < φ−1(xn) < φ−1(xn+1) = φ−1(b) = d

and therefore{φ−1(xi)} = {c = φ−1(x0), φ−1(x1), φ−1(x2), . . . , φ−1(xn), φ−1(xn+1) = d}

which is a partition of [c, d].

Proof 1: γ1 is an arc iff γ2 is an arc

If γ1 is an arc then γ1 is a one-to-one function. We're told that φ is one-to-one, therefore by lemma 1 γ1 ◦ φ is one-to-one. And γ2 = γ1 ◦ φ, so γ2 is one-to-one. Therefore γ2 is an arc.

If γ2 is an arc then γ2 = γ1 ◦ φ is one-to-one. We’re told that φ is a one-to-one and onto function, thereforeby lemma 2 we know that γ1 is one-to-one. Therefore γ1 is an arc.

Proof 2: γ1 is a closed curve iff γ2 is a closed curve

Assume that γ1 is a closed curve so that γ1(a) = γ1(b). We saw in lemma 4 that φ(c) = a and φ(d) = b, so wehave

γ2(c) ≡ γ1(φ(c)) = γ1(a) = γ1(b) = γ1(φ(d)) ≡ γ2(d)

Therefore γ2(c) = γ2(d), which means that γ2 is a closed curve.

Now assume that γ2 is a closed curve so that γ2(c) = γ2(d). We saw in lemma 5 that φ−1(a) = c andφ−1(b) = d so we have

γ1(a) = γ1(φ(φ−1(a))) ≡ γ2(φ−1(a)) = γ2(c) = γ2(d) = γ2(φ−1(b)) ≡ γ1(φ(φ−1(b))) = γ1(b)

Therefore γ1(a) = γ1(b), which means that γ1 is a closed curve.


Proof 3: γ1 and γ2 have the same length

Let P be an arbitrary partitioning of [c, d]. Using the notation from definition 6.26, we have

Λ(P, γ2) = Σ_{i=1}^{n} |γ2(xi) − γ2(xi−1)|

From the definition of γ2, this becomes

Λ(P, γ2) = Σ_{i=1}^{n} |γ1(φ(xi)) − γ1(φ(xi−1))|

From lemma 4 we see that {φ(xi)} describes a partitioning of [a, b], so

Λ(P, γ2) = Λ(φ(P), γ1)

Similarly, if we let P′ be an arbitrary partition of [a, b] we can use lemma 5 to show that

Λ(P′, γ1) = Λ(φ⁻¹(P′), γ2)

This means that the set {Λ(P, γ2)} is identical to the set {Λ(P′, γ1)}, so clearly these sets have the same supremum, which means that Λ(γ1) = Λ(γ2).

Proof 4: γ1 is rectifiable iff γ2 is rectifiable

Having shown previously that Λ(γ1) = Λ(γ2), it’s clear that Λ(γ1) is finite iff Λ(γ2) is finite and so γ1 is rectifiableiff γ2 is rectifiable.

Exercise 7.1

Let {fn} be a sequence of functions that converges uniformly to f . Let Mn = sup |fn| and let M = sup |f |. Letε > 0 be given. Because of the uniform convergence of {fn} → f we can find some N such that

n > N → |fn(x)| = |fn(x)− f(x) + f(x)| ≤ |fn(x)− f(x)|+ |f(x)| ≤ ε+M

This tells us that for all n > N the function fn is bounded above by M + ε. There are only finitely many n ≤ N ,each of which is bounded by some Mn. So by choosing

max{M + ε,M1,M2, . . . ,MN}

we have found a bound for all fn.

Exercise 7.2a

Let ε > 0 be given. We’re told that {fn} and {gn} converge uniformly on E so we can find N,M such that

n,m > N → |fn(x)− fm(x)| < ε

2, n,m > M → |gn(x)− gm(x)| < ε

2

By choosing N∗ > max{N,M} we have

n, m > N∗ → |(fn(x) + gn(x)) − (fm(x) + gm(x))| ≤ |fn(x) − fm(x)| + |gn(x) − gm(x)| < ε

which shows that fn + gn converges uniformly on E.


Exercise 7.2b

If {fn} and {gn} are bounded functions then by exercise 7.1 they are uniformly bounded, say by F and G. Let ε > 0 be given and choose δ such that

δ < min{ε, √(ε/3), ε/(3F), ε/(3G)}

We're told that {fn} and {gn} converge uniformly on E so we can find N, M such that

n, m > N → |fn(x) − fm(x)| < δ,    n, m > M → |gn(x) − gm(x)| < δ

Then for n, m > max{N, M} we have

|fngn − fmgm| = |(fn − fm)(gn − gm) + fm(gn − gm) + gm(fn − fm)|
             ≤ |fn − fm||gn − gm| + |fm||gn − gm| + |gm||fn − fm|
             ≤ δ² + Fδ + Gδ
             ≤ (√(ε/3))² + F·(ε/(3F)) + G·(ε/(3G)) = ε

Exercise 7.3

Let {fn} be a sequence such that fn(x) = x. This function obviously converges uniformly to the function f(x) = x on the set R. Let {gn} be a sequence of constant functions such that gn(x) = 1/n. This sequence obviously converges uniformly to the function g(x) = 0. Their product is the sequence {fngn} where fn(x)gn(x) = x/n. It's clear that this sequence converges pointwise to 0.

To show that {fngn} is not uniformly convergent, let ε > 0 be given. Choose an arbitrarily large n ∈ Z and choose t ∈ R such that t > εn(n + 1). We now have

|fn(t)gn(t) − fn+1(t)gn+1(t)| = |t/n − t/(n + 1)| = |t/(n(n + 1))| > |εn(n + 1)/(n(n + 1))| = ε

This shows that the necessary requirements for uniform convergence given in theorem 7.8 do not hold.

Exercise 7.4

Holy shit this exercise is a mess.

For what values of x does the series converge absolutely? (Incomplete)

For values of x > 0 we have

Σ |1/(1 + n²x)| = |1/x|·Σ |1/(1/x + n²)| ≤ |1/x|·Σ |1/n²|

By the comparison test this shows that f(x) converges absolutely. If x = 0 then we have

Σ |1/(1 + n²x)| = Σ |1| = ∞

This series clearly doesn't converge, absolutely or otherwise. If x < 0 things get more complicated: If x = −1/n² for any n ∈ N then the nth term of the series is undefined and therefore f(x) is undefined. If x ≠ −1/n² for any n ∈ N then we can use the fact that

For what intervals does the function converge uniformly?

If E is any interval of the form [a, b] with a > 0 then we have

sup |fn(x)| = fn(a) = 1/(1 + n²a)

And therefore we have

Σ sup |fn(x)| = Σ 1/(1 + n²a) ≤ (1/a)·Σ 1/n²

This shows that Σ sup |fn(x)| converges by the comparison test, and so by theorem 7.10 we see that the series Σfn(x) converges uniformly on E.

If E is any interval of the form [a, b] with b < 0 that does not contain any elements of the form −1/n², n ∈ N, then d = inf_n dist(−1/n², [a, b]) is positive (the points −1/n² accumulate only at 0, which lies at distance at least |b| from E). For x ∈ [a, b] we then have |1 + n²x| = n²|x + 1/n²| ≥ n²d, so sup |fn(x)| ≤ 1/(n²d), and the comparison test together with theorem 7.10 again gives uniform convergence on E.

For what intervals does the function fail to converge uniformly?

Define the set X = {xn} with xn = −1/n². The series will fail to converge uniformly on any interval E that contains an element of X ∪ {0} or has an element of X ∪ {0} as a limit point.

Proof: Let E be an arbitrary interval. If E contains some xn ∈ X then f(xn) is undefined and so f fails to converge uniformly on E. If E contains 0 then f(0) = ∑ 1 = ∞: we will never find a finite N such that |∑_{n=1}^{N} 1 − f(0)| < ε, so it’s clear that f fails to converge uniformly on E.

Now suppose that some xn ∈ X is a limit point of E. The nth term of the series is unbounded near xn, so f is unbounded near xn, and therefore lim_{t→xn} f(t) = ∞. From this we have

lim_{n→∞} lim_{t→xn} f(t) = lim_{n→∞} ∞ = ∞

On the other hand, if we first fix a value of t and let n → ∞, the nth term satisfies

lim_{n→∞} 1/(1 + n²t) = 0

and therefore

lim_{t→xn} lim_{n→∞} 1/(1 + n²t) = lim_{t→xn} 0 = 0

The two iterated limits disagree, which is incompatible with uniform convergence on E (theorem 7.11), so the series fails to converge uniformly on E.
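Here is a short numeric illustration (not a proof) of the dichotomy above: the Cauchy blocks of ∑ 1/(1 + n²x) are uniformly small on an interval like [1, 2], but near 0 they stay large no matter how far out in the series we go.

def block(x, N):
    # a Cauchy block of the series: terms n = N+1, ..., 2N
    return sum(1.0 / (1.0 + n * n * x) for n in range(N + 1, 2 * N + 1))

for N in (10, 100, 1000):
    grid_away_from_zero = [1 + i / 100 for i in range(101)]           # points of [1, 2]
    grid_near_zero = [(i + 1) / (100 * N * N) for i in range(100)]    # points just above 0
    print(N,
          max(block(x, N) for x in grid_away_from_zero),   # shrinks as N grows
          max(block(x, N) for x in grid_near_zero))         # stays large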

Exercise 7.5

If x ≤ 0 or x > 1 then fn(x) = 0 for all n, and therefore in these cases lim_{n→∞} fn(x) = 0. For any other x, choose an integer N large enough so that N > 1/x. For all n > N we now have x > 1/n and therefore fn(x) = 0, so in this case we also have lim_{n→∞} fn(x) = 0. This exhausts all possible values of x, so {fn} converges pointwise to the continuous function f(x) = 0.

To show that this sequence doesn’t converge uniformly to f(x) = 0, let 1 > ε > 0 be given and let n be an arbitrarily large integer. We can easily verify that

fn(1/(n+ 1/2)) = sin²(nπ + π/2) = 1

and therefore the definition of uniform convergence is not satisfied.

The last part of the question asks us to use the series ∑ fn to show that absolute convergence for all x does not imply uniform convergence. The proof of this is simple: for each x at most one term of the series is nonzero, so ∑ |fn(x)| converges (to |sin²(π/x)| when 0 < x ≤ 1 and to 0 otherwise), i.e. the series converges absolutely for every x. But we’ve already shown that sup |fn(x)| = 1 for every n, so the terms do not tend to 0 uniformly and the series cannot converge uniformly (its consecutive partial sums violate the Cauchy criterion of theorem 7.8).
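A quick numeric check (not a proof), assuming the usual definition of fn from the exercise (sin²(π/x) on [1/(n+1), 1/n] and 0 elsewhere): the value fn(1/(n + 1/2)) really is 1 for every n, so sup |fn − 0| never shrinks.

import math

def fn(n, x):
    # the fn of the exercise: sin^2(pi/x) on [1/(n+1), 1/n], zero elsewhere
    if 1.0 / (n + 1) <= x <= 1.0 / n:
        return math.sin(math.pi / x) ** 2
    return 0.0

for n in (1, 5, 50, 500):
    print(n, fn(n, 1.0 / (n + 0.5)))     # equals 1 (up to rounding) for every n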

Exercise 7.6

To show that the series doesn’t converge absolutely for any x:

∑ |(−1)^n (x² + n)/n²| = ∑ |x²/n² + 1/n| ≥ ∑ |1/n|

The rightmost sum is the harmonic series, which is known to diverge. By the comparison test the leftmost series must also diverge.


To show that the series converges uniformly in every bounded interval [a, b], let ε > 0 be given. Define X to be sup{|a|, |b|}. Define the partial sum fm to be

fm(x) = ∑_{n=1}^{m} (−1)^n (x² + n)/n²

We can rearrange this algebraically to form

fm(x) = ∑_{n=1}^{m} (−1)^n (1/n) + x² ∑_{n=1}^{m} (−1)^n (1/n²)

We know that the alternating harmonic series converges (theorem 3.44, or example 3.40(d)) and that ∑ 1/n² converges (theorem 3.28). Therefore by the Cauchy criterion for convergence we can find N such that

p > q > N → |∑_{n=q}^{p} (−1)^n (1/n)| < ε/2

and we can also find M such that

p > q > M → |∑_{n=q}^{p} (−1)^n (1/n²)| < ε/(2X²)

So, by choosing p > q > max{N,M} we have (for all x ∈ [a, b]):

|fp(x)− fq(x)| = |∑_{n=q+1}^{p} (−1)^n (1/n) + x² ∑_{n=q+1}^{p} (−1)^n (1/n²)|

≤ |∑_{n=q+1}^{p} (−1)^n (1/n)| + x² |∑_{n=q+1}^{p} (−1)^n (1/n²)|

≤ ε/2 + x² · ε/(2X²)

≤ ε/2 + ε/2 = ε

so the partial sums satisfy the Cauchy criterion of theorem 7.8 and the series converges uniformly on [a, b].
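A numeric sanity check (not a proof) of the uniform convergence just established: on the interval [−2, 2] the sup of |fp − fq| over a grid shrinks as q grows.

def partial(m, x):
    return sum((-1) ** n * (x * x + n) / (n * n) for n in range(1, m + 1))

xs = [-2 + i / 50 for i in range(201)]                               # grid on [-2, 2]
for q in (10, 100, 1000):
    p = 2 * q
    print(q, max(abs(partial(p, x) - partial(q, x)) for x in xs))    # roughly of size 1/q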

Exercise 7.7

We can establish a global maximum for |fn(x)|

Here fn(x) = x/(1 + nx²), so the derivative of fn is

f′n(x) = (1 + nx² − 2nx²)/(1 + nx²)² = (1 − nx²)/(1 + nx²)²

This derivative is zero only when nx² = 1, which occurs only when x = ±1/√n, at which point we have

fn(±1/√n) = ±1/(2√n)

These extrema must be the global extrema for the function, since fn has no asymptotes and fn(x) → 0 as x → ±∞. Therefore |fn(x)| ≤ 1/(2√n).

fn converges uniformly to f(x) = 0

Clearly fn converges pointwise to f(x) = 0, since for any x we have

lim_{n→∞} fn(x) = lim_{n→∞} x/(1 + nx²) = 0


Now let ε > 0 be given and choose n sufficiently large so that 1/(2√n) < ε: this can be done by choosing n > 1/(4ε²). From the previously established bounds, we now have

|fn(x)− f(x)| = |fn(x)− 0| = |fn(x)| ≤ 1/(2√n) < ε

By theorem 7.9 this is sufficient to prove that fn converges uniformly to f(x) = 0.

When does f ′(x) = lim f ′n(x)?

We’ve established that f(x) = 0, so clearly f ′(x) = 0 for all x. The limit of f ′n is given by:

lim_{n→∞} f′n(x) = lim_{n→∞} (1 − nx²)/(n²x⁴ + 2nx² + 1) = { 0 if x ≠ 0;  1 if x = 0 }

This shows that it’s not necessarily true that [lim fn]′ = lim[f ′n] even if fn converges uniformly.
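A small numeric check (not a proof) of the two computations above: the observed maximum of |x/(1 + nx²)| matches 1/(2√n), and f′n(0) = 1 for every n even though f′(0) = 0.

import math

def fn(n, x):
    return x / (1.0 + n * x * x)

def fn_prime(n, x):
    return (1.0 - n * x * x) / (1.0 + n * x * x) ** 2

xs = [i / 1000 for i in range(-5000, 5001)]                # grid on [-5, 5]
for n in (1, 10, 100, 10000):
    print(n,
          max(abs(fn(n, x)) for x in xs),                  # observed maximum of |fn|
          1.0 / (2.0 * math.sqrt(n)),                      # the claimed value 1/(2 sqrt(n))
          fn_prime(n, 0.0))                                # always 1, although f'(0) = 0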

Exercise 7.8

Proof of uniform convergence

Let ε > 0 be given. Let fm represent the partial sum

fm(x) = ∑_{n=1}^{m} cn I(x − xn)

We’re told that ∑ |cn| converges, so ∑ |cn| satisfies the Cauchy convergence criterion, so we can find N such that p > q > N implies

∑_{n=q}^{p} |cn| < ε

For the same values of p > q > N we also have, by the triangle inequality and the fact that I(x− xn) ≤ 1,

|fp − fq| = |∑_{n=q+1}^{p} cn I(x − xn)| ≤ ∑_{n=q+1}^{p} |cn I(x − xn)| ≤ ∑_{n=q+1}^{p} |cn| < ε

so the series ∑ cn I(x − xn) converges uniformly by theorem 7.8.

Proof of continuity

Let t be an arbitrary point with t ≠ xn for every n (t is not necessarily in the interval (a, b)). This t either is or isn’t a limit point of the sequence {xn}.

If t is not a limit point of {xn} then (since t is also not equal to any xn) we can find some neighborhood around t that contains no points of {xn}; the function I(x − xn) is constant on this neighborhood for each n and therefore f(x) = ∑ cn I(x − xn) is constant on this neighborhood (see below for a more thorough justification of this claim). And if f is constant on an interval around t then it is clearly continuous at t.

Now suppose that t is a limit point of {xn}. By the Cauchy convergence of ∑ |cn| we can find N such that ∑_{n=N}^{∞} |cn| < ε. Choose a neighborhood around t small enough that it does not contain any of the first N terms of {xn} (possible, since t is not equal to any of them); let δ represent the radius of this neighborhood. If we choose s such that |s − t| < δ, then I(s − xn) = I(t − xn) for n = 1, 2, . . . , N. From this we have

|s − t| < δ → |f(s)− f(t)| ≤ |∑_{n=N+1}^{∞} (cn I(s − xn)− cn I(t − xn))| ≤ ∑_{n=N+1}^{∞} |cn [I(s − xn)− I(t − xn)]|


Each I(x− xn) is either 0 or 1 so |I(s− xn)− I(t− xn)| ≤ 1, so we have

|s − t| < δ → |f(s)− f(t)| ≤ ∑_{n=N+1}^{∞} |cn [I(s − xn)− I(t − xn)]| ≤ ∑_{n=N+1}^{∞} |cn| < ε

That is, we’ve found δ such that

|s − t| < δ → |f(s)− f(t)| < ε

which, by definition, means that f is continuous at t.

We have therefore shown that f is continuous at t in both cases. But t was an arbitrary point different from every xn (and not necessarily confined to (a, b)), and therefore f is continuous at every point other than the xn.

Justification of I(x− xn) being constant on some interval

As mentioned above, if t is neither a limit point of {xn} nor equal to any xn then we can find some neighborhood Nδ(t) of radius δ around t that contains no points of {xn}. Let A be the set of elements of {xn} that are greater than t, and let B be the set of elements of {xn} that are smaller than t. If |s − t| < δ then there are no elements of {xn} between s and t, so every element of A is greater than both t and s, and every element of B is smaller than both t and s. From this we see that I(s − xn) = I(t − xn) = 0 if xn ∈ A and I(s − xn) = I(t − xn) = 1 if xn ∈ B. This means that each I(x − xn) has the same value for every x ∈ Nδ(t); so ∑ cn I(x − xn) has the same value for every x ∈ Nδ(t); and therefore f(x) has the same value for every x ∈ Nδ(t).
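A small numeric sketch (not a proof). The choices cn = 2⁻ⁿ and xn = 1/(n+1) are my own hypothetical example satisfying the hypotheses; the script checks that the distance between partial sums is bounded by the tail ∑_{n>m} |cn|, uniformly in x, which is exactly the estimate used in the uniform-convergence proof above.

def I(t):                                       # the unit step: I(t) = 1 for t > 0, 0 for t <= 0
    return 1.0 if t > 0 else 0.0

cs = [2.0 ** (-n) for n in range(1, 41)]        # hypothetical c_1 .. c_40
jump_points = [1.0 / (n + 1) for n in range(1, 41)]   # hypothetical x_1 .. x_40 in (0, 1)

def partial(m, x):
    return sum(cs[n] * I(x - jump_points[n]) for n in range(m))

grid = [i / 997 for i in range(998)]            # grid on [0, 1]
for m in (5, 10, 20):
    tail = sum(cs[m:])                          # sum of |c_n| for n > m
    gap = max(abs(partial(40, x) - partial(m, x)) for x in grid)
    print(m, gap, "<=", tail)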

Exercise 7.9

We’re told that {fn} → f uniformly on E, so we can find N such that

n > N → |fn(t)− f(t)| < ε/2 for all t ∈ E

This holds for all t ∈ E, so it must also hold for the elements of {xn}, which means that

n > N → |fn(xn)− f(xn)| < ε/2    (85)

We’re also told that {fn} is a uniformly convergent sequence of continuous functions, so f is continuous; and since {xn} → x we can find M such that

n > M → |f(xn)− f(x)| < ε/2    (86)

Using the triangle inequality and equations (85) and (86), we have, for all n > max{N, M},

|fn(xn)− f(x)| ≤ |fn(xn)− f(xn)|+ |f(xn)− f(x)| < ε/2 + ε/2 = ε

which shows that lim_{n→∞} fn(xn) = f(x).

Converse statement 1

Suppose the converse is “Let {fn} be a sequence of continuous functions that converges uniformly to f. Is it true that if lim fn(xn) = f(x) then {xn} → x?”. The answer to this question is “no”. Consider the functions fn(x) = x/n and the sequence {xn} = {1} (an infinite sequence of 1s). The sequence {fn} converges uniformly to f(x) = 0 on the set [0, 1], so we have

lim_{n→∞} fn(xn) = lim_{n→∞} fn(1) = 0 = f(0)

It’s clear that {xn} does not converge to 0, so the converse does not hold.


Converse statement 2

Suppose the converse is “Let {fn} be a sequence such that lim fn(xn) = f(x) for some function f and for all sequences of points xn ∈ E such that {xn} → x ∈ E. Is it true that {fn} converges uniformly to f on E?”

The answer to this question is “no”. Let fn(x) = 1/(nx). Let E be the harmonic set {1/n}. Then fn converges pointwise on E to f(x) = 0. Although 0 is a limit point of E, 0 is not an element of E, and this converse statement is only concerned with sequences {xn} that converge to some x ∈ E. So the only sequences of points xn ∈ E such that {xn} → x ∈ E are sequences whose terms are eventually just x itself (that is, sequences for which there is some N such that n > N → xn = x). So clearly for every such sequence we must have lim fn(xn) = lim fn(x) = f(x). But, despite the fact that lim fn(xn) = f(x) whenever {xn} → x ∈ E, it’s clear that fn does not converge uniformly to f(x) = 0 on E (to prove this, choose any ε and any n and then choose x ∈ E with x < 1/(nε)).
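A numeric illustration (not a proof) of this counterexample: at each fixed point of E the values fn(x) = 1/(nx) go to 0, yet the supremum of fn over E is unbounded, so the convergence is not uniform.

def fn(n, x):
    return 1.0 / (n * x)

E = [1.0 / m for m in range(1, 10001)]           # a finite piece of the harmonic set
for n in (1, 10, 100):
    print(n,
          fn(n, 1.0),                            # value at the fixed point x = 1 -> 0
          max(fn(n, x) for x in E))              # equals 10000/n and grows with more points of E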

Exercise 7.10a

f is continuous at every irrational number

We know that ∑ 1/n² converges, so by the Cauchy criterion we can choose N such that

b > a > N → ∑_{n=a}^{b} 1/n² < ε/4

The fractional part (nx) will never be zero because x is irrational, so we can choose δ such that

0 < δ < (1/(2N)) min{(nx), 1 − (nx) : n ≤ N}

This guarantees that 0 < (n[x − δ]) < (nx) and (nx) < (n[x + δ]) < 1 for all n ≤ N. More importantly, this choice of δ guarantees that (n[x ± δ]) = (nx) ± nδ for n ≤ N. We now derive the following chain of inequalities:

|f(x)− f(x± δ)| = |∑_{n=1}^{N} [(nx)/n² − (n[x± δ])/n²] + ∑_{n=N+1}^{∞} [(nx)/n² − (n[x± δ])/n²]|

≤ |∑_{n=1}^{N} [(nx)/n² − (n[x± δ])/n²]| + |∑_{n=N+1}^{∞} (nx)/n²| + |∑_{n=N+1}^{∞} (n[x± δ])/n²|

< |∑_{n=1}^{N} [(nx)/n² − (n[x± δ])/n²]| + ε/4 + ε/4

= |∑_{n=1}^{N} [(nx)/n² − (nx)/n² ∓ nδ/n²]| + ε/2

= |∑_{n=1}^{N} [∓ nδ/n²]| + ε/2

= δ ∑_{n=1}^{N} 1/n + ε/2

We chose our value of δ after we fixed a particular value of N, so (shrinking δ further if necessary) we can make δ ∑_{n=1}^{N} 1/n less than ε/2. In doing so, we have

|f(x)− f(x± δ)| < ε/2 + ε/2 = ε

And this would hold not just for x± δ, but for all y such that |x− y| < δ. That is:

|x− y| < δ → |f(x)− f(y)| < ε

which means that f is continuous at x.


Lemma 1: ∑ (nx)/n² converges uniformly to f

This is an immediate consequence of the fact that (nx) < 1 for all n, x and that ∑ 1/n² converges. Let ε > 0 be given. By the Cauchy criterion we can choose N such that

b > a > N → ∑_{n=a}^{b} 1/n² < ε

and therefore we have, for every x,

|f(x)− ∑_{n=1}^{N} (nx)/n²| = |∑_{n=1}^{∞} (nx)/n² − ∑_{n=1}^{N} (nx)/n²| = |∑_{n=N+1}^{∞} (nx)/n²| ≤ ∑_{n=N+1}^{∞} |(nx)/n²| < ∑_{n=N+1}^{∞} |1/n²| ≤ ε

so the partial sums converge to f uniformly.

f is discontinuous at every rational number

When nx is not an integer the following limits are identical:

lim_{t→x+} (nt)/n² = (nx)/n² = lim_{t→x−} (nt)/n²

When nx is an integer then this equality no longer holds. Instead, we have

lim_{t→x+} (nt)/n² = 0/n²,    lim_{t→x−} (nt)/n² = 1/n²

We know by lemma 1 that ∑ (nx)/n² converges uniformly to f, so we also have (by theorem 7.11):

∑_{n=1}^{∞} fn(x+) = f(x+),    ∑_{n=1}^{∞} fn(x−) = f(x−)

We know that ∑ 1/n² converges, so by the Cauchy criterion we can choose N such that

b > a > N → ∑_{n=a}^{b} 1/n² < ε/4

Write our rational point as x = p/q in lowest terms (q > 0). If f were continuous at x, then we would have |f(x+) − f(x−)| = 0. But when we actually calculate this difference we find:

|f(x+)− f(x−)| = |∑_{n=1}^{∞} fn(x+)− ∑_{n=1}^{∞} fn(x−)|

We determined that fn(x+) = fn(x−) unless nx is an integer; this occurs exactly when n is a multiple of q, at which point nx = [mq]x = mq[p/q] = mp. Most of the terms of the summations cancel out, leaving us with

|f(x+)− f(x−)| = |∑_{m=1}^{∞} f_{mq}(x+)− ∑_{m=1}^{∞} f_{mq}(x−)| = |∑_{m=1}^{∞} (lim_{t→x+} (mqt)/[mq]² − lim_{t→x−} (mqt)/[mq]²)|

= |∑_{m=1}^{∞} (0/[mq]² − 1/[mq]²)| = ∑_{m=1}^{∞} 1/[mq]²

This is clearly not equal to zero, and it can’t be made arbitrarily small because we have no freedom to select particular values for m or q. Therefore f(x+) ≠ f(x−) and therefore f is not continuous at x. But x was an arbitrary rational number, and therefore f is not continuous at any rational number.

Exercise 7.10b

We’ve shown that the discontinuities of f are the rational numbers, and these are clearly countable and clearly dense in R.


Exercise 7.10c

To prove that f ∈ R we need only show that (nx)/n² ∈ R for any fixed value of n. We can then use the fact that ∑ (nx)/n² converges uniformly to f (lemma 1) and theorem 7.16 to show that ∑ ∫ (nx)/n² = ∫ ∑ (nx)/n² = ∫ f.

Let n ∈ N be given. Let [a, b] be an arbitrary interval; assume without loss of generality that a and b are integers. The function (nx)/n² is bounded and has only finitely many discontinuities on [a, b], so it is Riemann-integrable there, and we can compute its integral:

∫_a^b (nx)/n² dx = ∑_{k=a}^{b−1} ∫_k^{k+1} (nx)/n² dx

For 0 ≤ δ < 1 we have (n(k + δ)) = (nk + nδ) = (nδ), so we can integrate over (0, 1) instead of (k, k + 1) without changing the value of the integral:

∫_a^b (nx)/n² dx = ∑_{k=a}^{b−1} ∫_0^1 (nx)/n² dx

As x ranges from 0 to 1, nx ranges from 0 to n. So we split up the interval (0, 1) into n intervals of length 1/n:

∫_a^b (nx)/n² dx = ∑_{k=a}^{b−1} ∑_{j=0}^{n−1} ∫_{j/n}^{[j+1]/n} (nx)/n² dx

To make this integral a bit more manageable we make the variable substitution u = nx. When x = j/n we have u = j; when x = [j + 1]/n we have u = j + 1; and we have dx = (1/n) du. Using this variable substitution in the previous integral:

∫_a^b (nx)/n² dx = ∑_{k=a}^{b−1} ∑_{j=0}^{n−1} ∫_j^{j+1} (u)/n³ du

Again, there is no difference between the value of (u) on the interval (j, j + 1) and the value of (u) on the interval (0, 1):

∫_a^b (nx)/n² dx = ∑_{k=a}^{b−1} ∑_{j=0}^{n−1} ∫_0^1 (u)/n³ du

On the interval (0, 1) we have (u) = u, so we can finally start integrating this thing. There are b − a values of k and n values of j, and each inner integral equals 1/(2n³), so

∫_a^b (nx)/n² dx = (b − a) · n · 1/(2n³) = (b − a)/(2n²)

And therefore

∫_a^b f(x) dx = ∫_a^b ∑_{n=1}^{∞} (nx)/n² dx = ∑_{n=1}^{∞} ∫_a^b (nx)/n² dx = ∑_{n=1}^{∞} (b − a)/(2n²)

We know this rightmost sum converges (specifically, it converges to π²(b − a)/12), therefore ∫_a^b f(x) dx exists and therefore f ∈ R.
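Since this kind of bookkeeping is easy to get wrong, here is a short numeric sanity check (not part of the solution): a Riemann sum for ∫_0^1 (nx)/n² dx is close to 1/(2n²), and a Riemann sum for ∫_0^1 f is close to π²/12.

import math

def frac(t):                          # (t): the fractional part of t
    return t - math.floor(t)

def riemann_01(g, m=20000):           # midpoint Riemann sum over [0, 1]
    return sum(g((k + 0.5) / m) for k in range(m)) / m

# per-term check: integral over [0, 1] of (nx)/n^2 should be 1/(2 n^2)
for n in (1, 2, 5):
    print(n, riemann_01(lambda x: frac(n * x) / (n * n)), 1.0 / (2 * n * n))

# whole-series check (series truncated at 400 terms): should be close to pi^2 / 12
approx = sum(riemann_01(lambda x, n=n: frac(n * x) / (n * n), m=2000) for n in range(1, 400))
print(approx, math.pi ** 2 / 12)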

Exercise 7.11

Let ε > 0 be given. We’re told that ∑ fn has uniformly bounded partial sums, so let M be a uniform bound: writing sn(x) = ∑_{k=1}^{n} fk(x), we have |sn(x)| ≤ M for all n and all x. We’re told that the gn decrease monotonically to 0 uniformly (so in particular gn(x) ≥ 0), which means we can find N such that

n > N → 0 ≤ gn(x) < ε/(2M) for all x

Let An represent the partial sum ∑_{k=1}^{n} fk gk. Our result is proven if we can show that {An} satisfies the Cauchy criterion uniformly. To do this, we choose q > p > N. Following the logic of theorem 3.42 (summation by parts), we have

|Aq − A_{p−1}| = |∑_{k=p}^{q} fk gk| = |∑_{k=p}^{q−1} sk (gk − gk+1) + sq gq − s_{p−1} gp| ≤ M (∑_{k=p}^{q−1} (gk − gk+1) + gq + gp)

where we used |sk| ≤ M together with gk − gk+1 ≥ 0 and gq, gp ≥ 0.


The (gk − gk+1) terms telescope down to gp − gq, leaving us with

|Aq − A_{p−1}| ≤ M (gp − gq + gq + gp) = 2M gp < 2M · ε/(2M) = ε

so the partial sums of ∑ fn gn satisfy the Cauchy criterion uniformly on E, and the series converges uniformly.
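A numeric illustration (not a proof). The choices fn(x) = (−1)ⁿ (so the partial sums are bounded by M = 1) and gn(x) = 1/(n + x²) (which decrease to 0 uniformly) are my own hypothetical example of the hypotheses; the script checks that the Cauchy blocks of ∑ fngn are bounded by 2Mgp, exactly as in the estimate above.

def g(n, x):
    return 1.0 / (n + x * x)

def block(p, q, x):               # sum_{k=p}^{q} (-1)^k * g(k, x)
    return sum((-1) ** k * g(k, x) for k in range(p, q + 1))

xs = [i / 10 for i in range(-50, 51)]          # grid on [-5, 5]
M = 1.0
for p in (10, 100, 1000):
    q = 2 * p
    worst = max(abs(block(p, q, x)) for x in xs)
    bound = max(2 * M * g(p, x) for x in xs)   # 2*M*g_p, largest at x = 0
    print(p, worst, "<=", bound)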

Exercise 7.12

Let ε > 0 be given. We’re told that 0 ≤ fn, f ≤ g and that ∫_0^∞ g < ∞. Let M = ∫_0^∞ g. Define G(t) to be

G(t) = ∫_0^t g(x) dx

Since G(t) is continuous (theorem 6.30), is nondecreasing (because g ≥ 0), and is bounded above by M, we know that

lim_{t→∞} G(t) = M

so we can find some c such that M − G(c) < ε/2

which is equivalent to saying that

∫_0^∞ g − ∫_0^c g = ∫_c^∞ g < ε/2    (87)

We’re given that fn → f uniformly (on the compact interval [0, c] in particular), so for this value of c we can find some N such that

n > N → |fn(x)− f(x)| < ε/(2c) for all x ∈ [0, c]    (88)

from which we can conclude that

n > N → |∫_0^c (fn − f)| ≤ ∫_0^c |fn − f| ≤ c · ε/(2c) = ε/2    (89)

So, for the given value of ε > 0 we choose c large enough that (87) holds and then, based on this choice of c, we choose n large enough that (89) holds. By the triangle inequality (and using |fn − f| ≤ g, which follows from 0 ≤ fn, f ≤ g) we then have

|∫_0^∞ (fn − f)| ≤ |∫_0^c (fn − f)| + |∫_c^∞ (fn − f)|

≤ ∫_0^c |fn − f| + ∫_c^∞ |fn − f|

≤ ε/2 + ∫_c^∞ g ≤ ε/2 + ε/2 = ε

We can make this hold for any ε > 0 by taking n sufficiently large, which is simply saying that

lim_{n→∞} |∫_0^∞ (fn − f)| = 0

from which we conclude

lim_{n→∞} ∫_0^∞ fn = ∫_0^∞ f
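A numeric illustration (not a proof). The choices g(x) = e⁻ˣ and fn(x) = e^{−x(1+1/n)} are my own hypothetical example satisfying 0 ≤ fn ≤ g with fn → g; the script checks that the improper integrals of the fn (which equal n/(n+1) exactly) approach the integral of the limit function (which equals 1).

import math

def integral_0_to_inf(h, cutoff=40.0, m=100000):
    # crude midpoint rule; the tail past 40 contributes only about e^(-40)
    step = cutoff / m
    return sum(h((k + 0.5) * step) for k in range(m)) * step

g = lambda x: math.exp(-x)
for n in (1, 10, 100):
    fn = lambda x, n=n: math.exp(-x * (1.0 + 1.0 / n))
    print(n, integral_0_to_inf(fn), n / (n + 1))
print("limit function:", integral_0_to_inf(g))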


Exercise 7.13a

Let D be the set consisting of all rational numbers together with all points of discontinuity of the fn; D is countable (theorems 4.30, 2.12, and 2.13), so enumerate it as {x1, x2, x3, . . .}. The sequence {fn(x1)} is a bounded sequence of real numbers in the compact interval [0, 1], so there is a subsequence {f^(1)_n} of {fn} for which {f^(1)_n(x1)} converges (theorem 3.6a). The sequence {f^(1)_n(x2)} is still a bounded sequence of real numbers in [0, 1], so we can find a further subsequence {f^(2)_n} for which both {f^(2)_n(x1)} and {f^(2)_n(x2)} converge. Repeating this countably many times and taking the diagonal sequence Fn = f^(n)_n, we obtain a subsequence {Fn} of {fn} for which {Fn(x)} converges at every point of D.

Now let t be a rational number or a point of discontinuity of some fn (that is, t ∈ D). We have constructed {Fn} so that {Fn(t)} converges, so we simply define f(t) for such t to be

f(t) = lim_{n→∞} Fn(t),    if t ∈ D

Now let t be a point not in D, i.e. an irrational number at which every fn is continuous. The rational numbers are dense in R so we can construct a sequence {rn} of rational numbers such that {rn} → t. Each Fn is continuous at t, so lim_{k→∞} Fn(rk) = Fn(t) for every n. We then define the value of f(t) to be

f(t) = lim_{n→∞} Fn(t),    if t ∉ D

Exercise 7.13: example

An example where a sequence {fn} of monotonically increasing functions converges pointwise to f and f is notcontinuous:

fn(x) = { 0 if x ≤ 0;    1 − 1/(1 + nx) if 0 < x < 1;    1 − 1/(1 + n) if x ≥ 1 }

lim_{n→∞} fn(x) = { 0 if x ≤ 0;    1 if x > 0 }

Exercise 7.13b

Let α = inf f(x) and let β = sup f(x) (these values are finite, since 0 ≤ f ≤ 1). Let ε > 0 be given. The function f must come arbitrarily close to its supremum and infimum on R. Let a be a point at which |f(a) − α| < ε and let b be a point at which |f(b) − β| < ε. From the monotonicity of f we have

x < a → |f(x)− α| < ε,    x > b → |f(x)− β| < ε    (90)

Proving uniform convergence is now a matter of proving uniform convergence on [a, b].

We’re given that f is continuous on [a, b]; therefore it’s uniformly continuous on [a, b] (theorem 4.19); therefore we can find some δ such that

|x − y| < δ → |f(x)− f(y)| < ε    (91)

We can cover [a, b] with finitely many intervals of length δ/2 ([a, b] is bounded, so we don’t even need to rely on compactness for this). From each interval we choose a rational number qi. We know that Fn converges pointwise to f at each rational number, and the set of qi’s is a finite set, so we can find some integer N such that

n > N → |Fn(qi)− f(qi)| < ε for each qi    (92)

Now choose any x ∈ [a, b]. If x = qi for some i then by (92) we have n > N → |Fn(x)− f(x)| < ε. If x ≠ qi then we can find qi, qi+1 such that qi < x < qi+1 and |qi+1 − qi| < δ. By the triangle inequality:

|Fn(x)− f(x)| ≤ |Fn(x)− Fn(qi+1)|+ |Fn(qi+1)− f(qi+1)|+ |f(qi+1)− f(x)|

Each Fn is monotonic, so we have |Fn(x)− Fn(qi+1)| ≤ |Fn(qi)− Fn(qi+1)|, so this last inequality becomes

|Fn(x)− f(x)| ≤ |Fn(qi)− Fn(qi+1)|+ |Fn(qi+1)− f(qi+1)|+ |f(qi+1)− f(x)|


The term |Fn(qi+1)− f(qi+1)| is < ε by (92). The term |f(qi+1)− f(x)| is < ε by (91). This leaves us with

|Fn(x)− f(x)| ≤ |Fn(qi)− Fn(qi+1)|+ 2ε

Additional applications of the triangle inequality give us

|Fn(x)− f(x)| ≤ |Fn(qi)− f(qi)|+ |f(qi)− f(qi+1)|+ |f(qi+1)− Fn(qi+1)|+ 2ε

The terms |Fn(qi)− f(qi)| and |f(qi+1)− Fn(qi+1)| are both < ε by (92). The term |f(qi)− f(qi+1)| is < ε by (91). This leaves us with

|Fn(x)− f(x)| ≤ 5ε

for every x ∈ [a, b] and every n > N. Since ε was arbitrary, this shows that Fn → f uniformly on [a, b].

Exercise 7.14

Exercise 7.15

Let ε > 0 be given. The sequence {fn} is equicontinuous on [0, 1], which means in particular that there exists some δ such that, for all n,

|0 − y| < δ → |fn(0)− fn(y)| < ε

Since fn(y) = f(ny), this means that, for all y ∈ [0, 1) and all n ∈ N,

|y| < δ → |f(0)− f(ny)| < ε

By taking n sufficiently large and choosing y < δ appropriately we can cause ny to take on any value in (0, ∞). So |f(0)− f(z)| < ε for every z ≥ 0 and every ε > 0, and we conclude that f is a constant function on [0, ∞).

Exercise 7.16

See part (b) of exercise 7.13

Exercise 7.17

Exercise 7.18

Lemma 1: {Fn} is uniformly bounded

We’re told that {fn} is a uniformly bounded sequence: let M be an upper bound for {|fn|}. For each n and each x ∈ [a, b], we have

|Fn(x)| = |∫_a^x fn(t) dt| ≤ ∫_a^x |fn(t)| dt ≤ ∫_a^b |fn(t)| dt ≤ (b − a)M

and therefore {Fn} is uniformly bounded.

Lemma 2: {Fn} is equicontinuous

Let ε > 0 be given. Let δ = ε/M .

|y − x| < δ → |Fn(y)− Fn(x)| = |∫_x^y fn(t) dt| ≤ M |y − x| < Mδ = ε

so {Fn} is equicontinuous.

Therefore, by theorem 7.25, we know that {Fn} contains a uniformly convergent subsequence.
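A numeric sketch (not a proof). The choice fn(t) = sin(nt), so that M = 1 and a = 0, is my own hypothetical example of a uniformly bounded sequence; the script checks the equicontinuity estimate |Fn(y) − Fn(x)| ≤ M|y − x| from lemma 2 on a few sample pairs.

import math

def increment(n, x, y, m=2000):
    # midpoint approximation of Fn(y) - Fn(x) = integral from x to y of sin(n t) dt
    step = (y - x) / m
    return sum(math.sin(n * (x + (k + 0.5) * step)) for k in range(m)) * step

M = 1.0
for n in (1, 7, 40):
    for (x, y) in [(0.3, 0.31), (1.0, 1.05), (5.0, 5.02)]:
        assert abs(increment(n, x, y)) <= M * abs(y - x) + 1e-9
print("the bound |Fn(y) - Fn(x)| <= M * |y - x| held on every sampled pair")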

Exercise 7.19

This was solved during class

Exercise 7.21

We know that A separates points because f(e^{iθ}) = e^{iθ} ∈ A and this function takes different values at different points of K. For this same function, |f(e^{iθ})| = |e^{iθ}| = 1, so A vanishes at no point of K. But the function f(e^{iθ}) = e^{−iθ} is a continuous function on K that is not in the closure of A: for every g ∈ A we have ∫_0^{2π} g(e^{iθ}) e^{iθ} dθ = 0, while ∫_0^{2π} e^{−iθ} e^{iθ} dθ = 2π, and this integral condition survives uniform limits, so no sequence of members of A can converge uniformly to e^{−iθ}.
