
Christopher Heil

Introduction to Real Analysis

Chapter 0

Online Expanded Chapter on

Notation and Preliminaries

Last Updated: April 7, 2019

©2019 by Christopher Heil

Chapter 0

Notation and Preliminaries: Expanded Version

This online chapter is an expanded version of the unnumbered Preliminaries chapter in the text “An Introduction to Real Analysis” by C. Heil. In this Chapter 0 we will review in detail the notation and background information that will be assumed throughout Chapters 1–9 of the main text (though we do assume that the reader has a basic familiarity with logic, sets, real numbers, and functions). For more details and for any omitted proofs of the facts reviewed in Sections 0.1–0.13 of this Chapter 0 we refer to calculus texts (such as [HHW18]) and undergraduate analysis texts (such as [Rud76]). For additional details on the results discussed in Sections 0.14–0.15 of Chapter 0 we refer to linear algebra texts (such as Axler, “Linear Algebra Done Right”).

We use the symbol □ to denote the end of a proof, and the symbol ♦ to denote the end of a definition, remark, example, or exercise. We also use ♦ to indicate the end of the statement of a theorem whose proof will be omitted. Some of the more challenging Problems in the text are marked with an asterisk *. A detailed index of symbols employed in the text can be found after Chapter 9 in the main text.

0.1 Numbers

The set of natural numbers is denoted by N = {1, 2, 3, . . .}. The set of integers is Z = {. . . , −1, 0, 1, . . .}, Q denotes the set of rational numbers, R is the set of real numbers, and C is the set of complex numbers.

The real part of a complex number z = a + ib (where a, b ∈ R) is Re(z) = a, and its imaginary part is Im(z) = b. We say that z is rational if both its real and imaginary parts are rational numbers. The complex conjugate of z is $\bar{z} = a - ib$. The modulus, or absolute value, of z is

$$|z| = \sqrt{z\bar{z}} = \sqrt{a^2 + b^2}.$$


We have |Re(z)| ≤ |z| and |Im(z)| ≤ |z| for every complex number z. If z ≠ 0 then its polar form is $z = re^{i\theta}$ where r = |z| > 0 and θ ∈ [0, 2π). In this case the argument of z is arg(z) = θ. Given any z ∈ C, there is a complex number α such that |α| = 1 and zα = |z|. If z ≠ 0 then α is uniquely given by $\alpha = e^{-i\theta} = \bar{z}/|z|$, while if z = 0 then α can be any complex number that has unit modulus. If z is a real number, then α is simply ±1.

Some useful identities are

$$z + \bar{z} = 2\,\mathrm{Re}(z), \qquad z - \bar{z} = 2i\,\mathrm{Im}(z),$$

and

$$|z + w|^2 = (z + w)\,\overline{(z + w)} = |z|^2 + 2\,\mathrm{Re}(z\bar{w}) + |w|^2.$$
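The identities in this section can be spot-checked numerically. The following is a minimal sketch using Python's built-in complex type; the helper name `unimodular_factor` is an illustrative assumption rather than notation from the text, and for z = 0 it simply returns 1 as one of the many valid choices of α.

```python
import cmath
import math


def unimodular_factor(z: complex) -> complex:
    """Return a number alpha with |alpha| = 1 and z * alpha = |z|."""
    if z == 0:
        # Any unit-modulus number works when z = 0; 1 is an arbitrary choice.
        return 1 + 0j
    return z.conjugate() / abs(z)


def modulus_squared_expansion(z: complex, w: complex) -> float:
    """Right-hand side of |z + w|^2 = |z|^2 + 2 Re(z * conj(w)) + |w|^2."""
    return abs(z) ** 2 + 2 * (z * w.conjugate()).real + abs(w) ** 2


if __name__ == "__main__":
    z, w = 3 - 4j, 1 + 2j
    print(abs(z), z * unimodular_factor(z))
    print(abs(z + w) ** 2, modulus_squared_expansion(z, w))
```

Floating-point arithmetic is inexact, so comparisons of such quantities are best made with a tolerance (for example `math.isclose` or `cmath.isclose`) rather than with `==`.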

The Extended Real Line

We append ∞ and −∞ to the real line to form the set of extended real numbers [−∞,∞]:

[−∞,∞] = R ∪ {−∞} ∪ {∞}.

Although ±∞ are part of the extended real line, they are not numbers and should be treated with care. Some texts, such as [Fol99], denote the extended real line by

$$\overline{\mathbb{R}} = [-\infty, \infty].$$

We extend many of the normal arithmetic operations to [−∞,∞]. For example, if −∞ < a ≤ ∞ then we set a + ∞ = ∞. However, ∞ − ∞ and −∞ + ∞ are undefined, and are referred to as indeterminate forms.

Given a strictly positive extended real number 0 < a ≤ ∞ we define

a · ∞ = ∞, (−a) · ∞ = −∞, a · (−∞) = −∞, (−a) · (−∞) = ∞.

Further, we adopt the following conventions:

$$0 \cdot (\pm\infty) = 0, \qquad \frac{1}{\pm\infty} = 0.$$
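These conventions differ from IEEE floating-point arithmetic in one respect: in Python, `0.0 * math.inf` evaluates to NaN rather than 0. The sketch below shows one way to model the conventions stated above; the function names `ext_add` and `ext_mul` are hypothetical helpers introduced only for illustration.

```python
import math

INF = math.inf


def ext_add(a: float, b: float) -> float:
    """Add two extended real numbers; inf + (-inf) is left undefined."""
    if math.isinf(a) and math.isinf(b) and a != b:
        raise ValueError("indeterminate form: infinity minus infinity")
    return a + b


def ext_mul(a: float, b: float) -> float:
    """Multiply two extended reals with the convention 0 * (+-inf) = 0."""
    if a == 0 or b == 0:
        return 0.0
    # For nonzero factors, float multiplication already follows the sign rules.
    return a * b


if __name__ == "__main__":
    print(ext_add(5.0, INF), ext_mul(-2.0, INF), ext_mul(0.0, INF))
    print(0.0 * INF)  # plain float arithmetic gives nan here, unlike ext_mul
```

The convention 1/(±∞) = 0 needs no special handling, since floating-point division already behaves that way.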

If p is an extended real number in the range 1 ≤ p ≤ ∞, then its dual index is the unique extended real number p′ that satisfies

$$\frac{1}{p} + \frac{1}{p'} = 1.$$

We have 1 ≤ p′ ≤ ∞, and (p′)′ = p. If 1 < p < ∞, then we can write p′ explicitly as

$$p' = \frac{p}{p - 1}.$$

Some examples are 1′ = ∞, (3/2)′ = 3, 2′ = 2, 3′ = 3/2, and ∞′ = 1.

More notation related to the extended real line will be defined below in Sections 0.3 and 0.7.
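As a small illustration, the dual index can be computed as follows. This is a sketch under the assumption that extended reals are modeled by Python floats with `math.inf` standing in for ∞; the function name `dual_index` is a hypothetical helper.

```python
import math


def dual_index(p: float) -> float:
    """Return p' with 1/p + 1/p' = 1, for 1 <= p <= infinity."""
    if not (p >= 1):
        raise ValueError("p must satisfy 1 <= p <= infinity")
    if p == 1:
        return math.inf  # 1/1 + 1/inf = 1 under the convention 1/inf = 0
    if math.isinf(p):
        return 1.0
    return p / (p - 1)


if __name__ == "__main__":
    for p in (1, 1.5, 2, 3, math.inf):
        print(p, dual_index(p))
```

The two endpoint cases are handled separately because the formula p/(p − 1) is only valid for 1 < p < ∞.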

0.2 Sets

If X is a set then we often use lowercase letters such as x, y, z to denote elements of X. Below is some terminology and notation for sets.

• If every element of a set A is also an element of a set B, then A is a subset of B, and in this case we write A ⊆ B (note that this includes the possibility that A could equal B).

• A proper subset of a set B is a set A ⊆ B such that A ≠ B. We indicate this by writing A ⊊ B.

• The empty set is denoted by ∅. The empty set is a subset of every set.

• The notation X = {x : x has property P} means that X is the set of all x that satisfy property P. For example, the union of a collection of sets $\{X_j\}_{j \in J}$ is

$$\bigcup_{j \in J} X_j = \{x : x \in X_j \text{ for some } j \in J\},$$

and their intersection is

$$\bigcap_{j \in J} X_j = \{x : x \in X_j \text{ for every } j \in J\}.$$

• If S is a subset of a set X, then the complement of S is

$$X \setminus S = \{x \in X : x \notin S\}.$$

We sometimes abbreviate $X \setminus S$ as $S^{\mathrm{C}}$ if the set X is understood.

• If A and B are subsets of a set X, then the relative complement of A in B is

$$B \setminus A = B \cap A^{\mathrm{C}} = \{x \in B : x \notin A\},$$

and the symmetric difference of A and B is

$$A \triangle B = (A \setminus B) \cup (B \setminus A).$$

• De Morgan’s Laws state that

$$X \setminus \bigcup_{i \in I} X_i = \bigcap_{i \in I} (X \setminus X_i) \quad\text{and}\quad X \setminus \bigcap_{i \in I} X_i = \bigcup_{i \in I} (X \setminus X_i).$$

• The Cartesian product of sets X and Y is the set of all ordered pairs of elements of X and Y, i.e.,

X × Y = {(x, y) : x ∈ X, y ∈ Y}.

• The power set of a set X is the set of all subsets of X. We denote the power set by P(X), i.e.,

P(X) = {S : S ⊆ X}.

• A collection of sets $\{X_i\}_{i \in I}$ is disjoint if $X_i \cap X_j = \emptyset$ whenever i ≠ j. In particular, two sets A and B are disjoint if A ∩ B = ∅.

• A collection of sets $\{X_i\}_{i \in I}$ is a partition of a set X if it is both disjoint and covers X, i.e., if

$$\{X_i\}_{i \in I} \text{ is disjoint and } \bigcup_{i \in I} X_i = X.$$
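For finite sets, the operations listed above correspond directly to Python's built-in `set` type. The following sketch checks De Morgan's Laws and the partition property on small examples; the function names `de_morgan_holds` and `is_partition` are hypothetical helpers, and a finite check of this kind illustrates the statements but does not prove them.

```python
from itertools import combinations


def de_morgan_holds(X, family):
    """Check both De Morgan laws for a nonempty finite family of subsets of X."""
    X = set(X)
    family = [set(S) for S in family]
    union = set().union(*family)
    intersection = set(X).intersection(*family)
    complements = [X - S for S in family]
    first = (X - union) == set(X).intersection(*complements)
    second = (X - intersection) == set().union(*complements)
    return first and second


def is_partition(blocks, X):
    """Return True if the blocks are pairwise disjoint and their union is X."""
    blocks = [set(b) for b in blocks]
    disjoint = all(a.isdisjoint(b) for a, b in combinations(blocks, 2))
    covers = set().union(*blocks) == set(X)
    return disjoint and covers


if __name__ == "__main__":
    X = set(range(10))
    A, B = {0, 1, 2, 3}, {2, 3, 4, 5}
    print(A ^ B == (A - B) | (B - A))  # symmetric difference
    print(de_morgan_holds(X, [A, B]))
    print(is_partition([{0, 2, 4, 6, 8}, {1, 3, 5, 7, 9}], X))
```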

0.3 Intervals and Extended Intervals

We identify some special subsets of the real line and the extended real line.

• An open interval in the real line R is any one of the following sets:

(a, b) = {x ∈ R : a < x < b}, −∞ < a < b < ∞,

(a,∞) = {x ∈ R : x > a}, a ∈ R,

(−∞, b) = {x ∈ R : x < b}, b ∈ R,

(−∞,∞) = R.

• A closed interval in R is any one of the following sets:

[a, b] = {x ∈ R : a ≤ x ≤ b}, −∞ < a < b < ∞,

[a,∞) = {x ∈ R : x ≥ a}, a ∈ R,

(−∞, b] = {x ∈ R : x ≤ b}, b ∈ R,

(−∞,∞) = R.

• We refer to [a, b] as a bounded closed interval, a finite closed interval, or a compact interval.


• An interval in R is a set that is either an open interval, a closed interval, or one of the following sets (which are sometimes referred to as “half-open intervals,” even though they are neither open nor closed):

(a, b] = {x ∈ R : a < x ≤ b}, −∞ < a < b < ∞,

[a, b) = {x ∈ R : a ≤ x < b}, −∞ < a < b < ∞.

• We also deal with subsets of the extended real line. An extended interval is any one of the following subsets of [−∞,∞]:

(a,∞] = (a,∞) ∪ {∞}, a ∈ R,

[a,∞] = [a,∞) ∪ {∞}, a ∈ R,

[−∞, b) = {−∞} ∪ (−∞, b), b ∈ R,

[−∞, b] = {−∞} ∪ (−∞, b], b ∈ R,

[−∞,∞] = R ∪ {−∞} ∪ {∞}.

An extended interval is not an interval; whenever we refer to an “interval” without qualification we implicitly exclude the extended intervals.

• The empty set ∅ and a singleton {a} are not intervals, but even so we adopt the notational conventions

[a, a] = {a} and (a, a) = [a, a) = (a, a] = ∅.

0.4 Equivalence Relations

Informally, we say that ∼ is a relation on a set X if for each choice of x, y ∈ X we have only one of the following two possibilities:

x ∼ y (x is related to y) or x ≁ y (x is not related to y).

Formally, a relation ∼ is simply a set of ordered pairs of elements of X, i.e., it is a subset of the Cartesian product X × X. If (x, y) belongs to the set ∼ then we write x ∼ y, and if (x, y) does not belong to ∼ then we write x ≁ y.

For example, we can define a relation ∼ on the set of integers Z by declaring that two numbers are related if and only if their difference is even, i.e.,

m ∼ n ⇐⇒ m − n is divisible by 2. (0.1)

This is a relation because every pair of integers is either related or not related.

Definition 0.4.1. An equivalence relation on a set X is a relation ∼ that satisfies the following conditions for all x, y, z ∈ X.


• Reflexivity: x ∼ x.

• Symmetry: If x ∼ y then y ∼ x.

• Transitivity: If x ∼ y and y ∼ z then x ∼ z.

If ∼ is an equivalence relation on X, then the equivalence class of x ∈ X is the set [x] that contains all elements that are related to x:

[x] = {y ∈ X : x ∼ y}. ♦

The reader should prove that any two equivalence classes are either identical or disjoint. That is, if x and y are two points in X, then either [x] = [y] or [x] ∩ [y] = ∅. The union of all of the equivalence classes [x] is X. Consequently, the set of distinct equivalence classes forms a partition of X.

Example 0.4.2. (a) The relation ∼ on Z defined in equation (0.1) is an equivalence relation (see Problem 0.4.3). The equivalence class of an integer m ∈ Z is

[m] = {n ∈ Z : m − n is even} = {m + 2k : k ∈ Z}.

There are only two possibilities: If m is even then the equivalence class of m is [m] = E, the set of all even integers, while if m is odd then [m] = O, the set of odd integers. No matter what we choose for m and n, either [m] = [n] or [m] ∩ [n] = ∅. There are only two distinct equivalence classes, the sets E and O, and these two sets form a partition of Z.

(b) Given an integer N > 1, we can define an analogous equivalence relation on Z by declaring that m ∼ n if and only if m − n is divisible by N. There are N distinct equivalence classes in this case, which we can list as [0], [1], . . . , [N − 1]. Although we will not need to make use of this fact, we remark that the group known as $\mathbb{Z}_N$ is the set of these equivalence classes, $\mathbb{Z}_N = \{[0], [1], \ldots, [N-1]\}$, together with an appropriate definition of addition of equivalence classes (specifically, [m] + [n] = [m + n]).

(c) We can define a relation on R by declaring that x ∼ y if and only if x − y is rational, i.e.,

x ∼ y ⇐⇒ x − y ∈ Q.

This is an equivalence relation on R, and the equivalence class of x ∈ R is the set of all numbers that differ from x by a rational amount:

[x] = {y ∈ R : x − y ∈ Q} = {x + r : r ∈ Q}. (0.2)

Any two equivalence classes are either identical or disjoint. For example, [0] = Q and [√2] = {√2 + r : r ∈ Q} have no elements in common. This equivalence relation will be useful to us in Section 2.4. ♦


Since the equivalence class [x] defined in equation (0.2) is obtained by translating each element of the set of rationals by x, we often denote it by x + Q or Q + x. That is, we let

x + Q = Q + x = {x + r : r ∈ Q}.
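On a finite set, the equivalence classes of a relation can be computed directly, which makes the partition property of Example 0.4.2(a) and (b) easy to observe. The sketch below restricts the relation “m − n is divisible by N” to a finite window of integers; the function name `equivalence_classes` is a hypothetical helper, and it assumes (without checking) that the supplied relation really is an equivalence relation.

```python
def equivalence_classes(elements, related):
    """Group elements into classes, assuming `related` is an equivalence relation."""
    classes = []
    for x in elements:
        for cls in classes:
            # By symmetry and transitivity, comparing with one representative suffices.
            if related(x, cls[0]):
                cls.append(x)
                break
        else:
            classes.append([x])
    return classes


if __name__ == "__main__":
    N = 3
    window = range(-6, 7)
    for cls in equivalence_classes(window, lambda m, n: (m - n) % N == 0):
        print(cls)
```

With N = 3 this produces three classes, matching the list [0], [1], [2] from Example 0.4.2(b) restricted to the chosen window.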

Problems

0.4.3. Prove that the relations defined in Example 0.4.2 are each equivalence relations.

0.4.4. (a) Assume that ∼ is an equivalence relation on a set X. Prove that the set of distinct equivalence classes of ∼ forms a partition of X.

(b) Suppose that $\{X_i\}_{i \in I}$ is a partition of a set X. Given x, y ∈ X, define x ∼ y if and only if there exists some i ∈ I such that x and y both belong to $X_i$. Prove that ∼ is an equivalence relation on X.

0.5 Functions

Let X and Y be sets. We write f : X → Y to mean that f is a function whose domain is X and codomain (or target) is Y. We usually write f(t) to denote the image of t under f, but sometimes we describe the rule for f by writing t ↦ f(t). For example, if f : R → R is defined by f(t) = 2t, then we could alternatively say that f is given by the rule t ↦ 2t for t ∈ R.

Here is some terminology that we use to describe various properties of a function f : X → Y.

• The direct image of a set A ⊆ X under f is

f(A) = {f(t) : t ∈ A}.

• The inverse image of a set B ⊆ Y under f is

$$f^{-1}(B) = \{t \in X : f(t) \in B\}. \tag{0.3}$$

• The range of f is the direct image of its domain X, i.e.,

range(f) = f(X) = {f(t) : t ∈ X}.

• f is surjective, or onto, if range(f) = Y.


• f is injective, or one-to-one, if f(a) = f(b) implies a = b.

• f is a bijection if it is both injective and surjective.

• A bijection f : X → Y has an inverse function $f^{-1} : Y \to X$, defined by the rule $f^{-1}(y) = x$ if f(x) = y. The inverse function $f^{-1}$ is also a bijection. Despite the similar notation, an inverse function should not be confused with the inverse image defined in equation (0.3). Only a bijection has an inverse function, yet the inverse image $f^{-1}(B)$ is well-defined for every function f and set B ⊆ Y.

• If Y = R, then we say that f is real-valued. If Y = [−∞,∞], then f is extended real-valued. If Y = C, then f is complex-valued.

• Given S ⊆ X, the restriction of a function f : X → Y to the domain S is the function $f|_S : S \to Y$ defined by $(f|_S)(x) = f(x)$ for x ∈ S.

• The zero function on X is the function 0 : X → R defined by 0(x) = 0 for every x ∈ X. We use the same symbol 0 to denote the zero function and the number zero.

• The characteristic function of a set A ⊆ X is the function $\chi_A : X \to \mathbb{R}$ given by

$$\chi_A(x) = \begin{cases} 1, & \text{if } x \in A, \\ 0, & \text{if } x \notin A. \end{cases}$$

• If the domain of a function f is $\mathbb{R}^d$, then the translation of f by a vector $a \in \mathbb{R}^d$ is the function $T_a f$ defined by $T_a f(x) = f(x - a)$ for $x \in \mathbb{R}^d$.
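When the domain is a finite set, the direct image, inverse image, and characteristic function defined above can be computed by brute force. The following is a minimal sketch; all function names are hypothetical helpers introduced for illustration. Note that `inverse_image` makes sense for any function, whereas an inverse function exists only for a bijection.

```python
def direct_image(f, A):
    """f(A) = {f(t) : t in A}."""
    return {f(t) for t in A}


def inverse_image(f, X, B):
    """f^{-1}(B) = {t in X : f(t) in B}, where X is the (finite) domain."""
    return {t for t in X if f(t) in B}


def is_injective(f, X):
    """On a finite domain, f is injective exactly when no two points collide."""
    return len(direct_image(f, X)) == len(set(X))


def is_surjective(f, X, Y):
    """f is surjective onto Y when range(f) = Y."""
    return direct_image(f, X) == set(Y)


def characteristic_function(A):
    """Return chi_A, which is 1 on A and 0 elsewhere."""
    return lambda x: 1 if x in A else 0


if __name__ == "__main__":
    X = {-2, -1, 0, 1, 2}
    square = lambda t: t * t
    print(direct_image(square, X))
    print(inverse_image(square, X, {1, 4}))
    print(is_injective(square, X))
```

For the squaring map on this X, the inclusions f(f⁻¹(B)) ⊆ B and A ⊆ f⁻¹(f(A)) from Problems 0.5.2 and 0.5.3 can both be strict, since the map is neither surjective onto a larger codomain nor injective.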

Problems

0.5.1. Let f : X → Y be a function.

(a) Prove that f is surjective if and only if for every y ∈ Y there exists some x ∈ X such that f(x) = y.

(b) Prove that f is a bijection if and only if for every y ∈ Y there exists a unique x ∈ X such that f(x) = y.

0.5.2. Let f : X → Y be a function, let B be a subset of Y, and let $\{B_i\}_{i \in I}$ be a family of subsets of Y. Prove that

$$f^{-1}\Bigl(\bigcup_{i \in I} B_i\Bigr) = \bigcup_{i \in I} f^{-1}(B_i), \qquad f^{-1}\Bigl(\bigcap_{i \in I} B_i\Bigr) = \bigcap_{i \in I} f^{-1}(B_i),$$

and $f^{-1}(B^{\mathrm{C}}) = (f^{-1}(B))^{\mathrm{C}}$. Also prove that $f(f^{-1}(B)) \subseteq B$, and if f is surjective then equality holds. Show by example that equality need not hold if f is not surjective.

0.5.3. Let f : X → Y be a function, let A be a subset of X, and let $\{A_i\}_{i \in I}$ be a family of subsets of X. Prove that

$$f\Bigl(\bigcup_{i \in I} A_i\Bigr) = \bigcup_{i \in I} f(A_i).$$

Also prove that

$$f\Bigl(\bigcap_{i \in I} A_i\Bigr) \subseteq \bigcap_{i \in I} f(A_i), \qquad f(X) \setminus f(A) \subseteq f(A^{\mathrm{C}}), \qquad A \subseteq f^{-1}(f(A)),$$

and if f is injective then equality holds in each of these inclusions. Show by example that equality need not hold if f is not injective.

0.6 Cardinality

We say that two sets A and B have the same cardinality if there exists a bijection f that maps A onto B, i.e., if there is a function f : A → B that is both injective and surjective. Such a function f pairs each element of A with a unique element of B and vice versa, and therefore it is sometimes called a one-to-one correspondence.

Example 0.6.1. (a) The function f : [0, 2] → [0, 1] defined by f(x) = x/2 for 0 ≤ x ≤ 2 is a bijection, so the intervals [0, 2] and [0, 1] have the same cardinality. This shows that a proper subset of a set can have the same cardinality as the set itself (although this is impossible for finite sets).

(b) The function f : N → {2, 3, 4, . . .} defined by f(n) = n + 1 for n ∈ N is a bijection, so the set of natural numbers N = {1, 2, 3, . . .} has the same cardinality as its proper subset {2, 3, 4, . . .}.

(c) The function f : N → Z defined by

$$f(n) = \begin{cases} \dfrac{n}{2}, & \text{if } n \text{ is even}, \\[1ex] -\dfrac{n-1}{2}, & \text{if } n \text{ is odd}, \end{cases}$$

is a bijection of N onto Z, so the set of integers Z has the same cardinality as the set of natural numbers N.

(d) If n is a finite positive integer, then there is no way to define a function f : {1, . . . , n} → N that is a bijection. Hence {1, . . . , n} and N do not have the same cardinality. Likewise, if m ≠ n are distinct positive integers, then {1, . . . , m} and {1, . . . , n} do not have the same cardinality. ♦
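The bijection of N onto Z in Example 0.6.1(c) can be written out together with an explicit inverse. The sketch below uses the hypothetical names `nat_to_int` and `int_to_nat`; the inverse formula is not stated in the text and is supplied here as an assumption that the round-trip checks confirm on a finite range.

```python
def nat_to_int(n: int) -> int:
    """The map f : N -> Z with f(n) = n/2 for even n and -(n-1)/2 for odd n."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return n // 2 if n % 2 == 0 else -((n - 1) // 2)


def int_to_nat(z: int) -> int:
    """Candidate inverse: positive z goes to 2z, while z <= 0 goes to 1 - 2z."""
    return 2 * z if z > 0 else 1 - 2 * z


if __name__ == "__main__":
    print([nat_to_int(n) for n in range(1, 8)])
```

Checking that the two maps undo each other on a finite range is evidence, not a proof, that f is a bijection; the proof is a short case analysis on the parity of n.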

We use cardinality to define finite sets and infinite sets, as follows.

Definition 0.6.2 (Finite and Infinite Sets). Let X be a set.

(a) We say that X is finite if either X is empty or there exists an integer n > 0 such that X has the same cardinality as the set {1, . . . , n}. That is, a nonempty X is finite if for some n ∈ N we can find a bijection

f : {1, . . . , n} → X.

In this case we say that X has n elements.

(b) We say that X is infinite if it is not finite. ♦

We use the following terminology to further distinguish among sets based on cardinality.

Definition 0.6.3 (Countable and Uncountable Sets). We say that a set X is:

(a) denumerable or countably infinite if it has the same cardinality as the natural numbers, i.e., if there exists a bijection f : N → X,

(b) countable if X is either finite or countably infinite,

(c) uncountable if X is not countable. ♦

Every finite set is countable by definition, and parts (b) and (c) of Example 0.6.1 show that the sets N, Z, and {2, 3, 4, . . .} are countable. Here is another countable set.

Example 0.6.4. Consider N² = N × N = {(j, k) : j, k ∈ N}, the set of all ordered pairs of positive integers. We depict N² in table format in Figure 0.1.

(1, 1) (1, 2) (1, 3) (1, 4) (1, 5) (1, 6) · · ·

(2, 1) (2, 2) (2, 3) (2, 4) (2, 5) · · ·

(3, 1) (3, 2) (3, 3) (3, 4) · · ·

(4, 1) (4, 2) (4, 3) · · ·

(5, 1) (5, 2) · · ·

(6, 1) · · ·

· · ·

Fig. 0.1 The Cartesian product N² = N × N as a table of ordered pairs.


Every ordered pair (j, k) of positive integers j and k appears somewhere in Figure 0.1. In particular, the first row of the table includes all those ordered pairs of positive integers (j, k) whose first component is j = 1, the second row lists those pairs whose first component is j = 2, and so forth.

(1, 1) → (1, 2)    (1, 3) → (1, 4)    (1, 5) → (1, 6)

       ↙         ↗         ↙         ↗         ↙

(2, 1)    (2, 2)    (2, 3)    (2, 4)    (2, 5)

  ↓    ↗         ↙         ↗         ↙

(3, 1)    (3, 2)    (3, 3)    (3, 4)

       ↙         ↗         ↙

(4, 1)    (4, 2)    (4, 3)

  ↓    ↗         ↙

(5, 1)    (5, 2)

       ↙         ↗

(6, 1)    (6, 2)

  ↓    ↗

(7, 1)

Fig. 0.2 Arrows show the pattern for defining a bijection of N onto N².

Next, as shown in Figure 0.2, we insert arrows into the table in a certain pattern, and we define a bijection f : N → N² by following these arrows. Specifically, we set

f(1) = (1, 1),

f(2) = (1, 2),

f(3) = (2, 1),

f(4) = (3, 1),

f(5) = (2, 2),

f(6) = (1, 3),

...

In other words, once f(n) has been defined to be a particular ordered pair (j, k), then we let f(n + 1) be the ordered pair that (j, k) points to next. In this way the outputs f(1), f(2), f(3), . . . give us a list of every ordered pair in N². Thus N and N² have the same cardinality, so N² is denumerable and hence countable. ♦

As Example 0.6.4 illustrates, if X is a nonempty countable set then we can create a list of the elements of X. There are two possibilities. First, a countable set X might be finite, in which case there exists a bijection f : {1, 2, . . . , n} → X for some positive integer n. Since f is surjective, we therefore have

X = range(f) = {f(1), f(2), . . . , f(n)}.

Thus the function f gives us a way to list the n elements of X. On the other hand, if X is countably infinite then there is a bijection f : N → X, and hence

X = range(f) = {f(1), f(2), f(3), . . .}.

Thus the elements of X have again been listed in some order. For example, Example 0.6.4 shows that we can list the elements of N² in the following order:

N² = {(1, 1), (1, 2), (2, 1), (3, 1), (2, 2), (1, 3), (1, 4), (2, 3), . . .}.

Although it may seem more natural to depict N² as a “two-dimensional” table (as shown in Figure 0.2), because N² is countable it is also true that we can make a “one-dimensional” list of all of the elements of N².
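The arrow pattern of Example 0.6.4 can be turned into a generator that produces this one-dimensional list. The sketch below rests on one observation that is an interpretation of the figure rather than a statement from the text: the arrows sweep the anti-diagonals j + k = s in turn, moving down the diagonal (j increasing) when s is odd and up the diagonal (j decreasing) when s is even. The name `zigzag_pairs` is a hypothetical helper.

```python
from itertools import count, islice


def zigzag_pairs():
    """Yield every pair (j, k) of positive integers, one anti-diagonal at a time."""
    for s in count(2):  # s = j + k takes the values 2, 3, 4, ...
        if s % 2 == 1:
            js = range(1, s)          # odd s: j runs 1, 2, ..., s - 1
        else:
            js = range(s - 1, 0, -1)  # even s: j runs s - 1, ..., 2, 1
        for j in js:
            yield (j, s - j)


if __name__ == "__main__":
    print(list(islice(zigzag_pairs(), 8)))
```

Each anti-diagonal is finite, so every pair (j, k) is reached after finitely many steps, which is exactly what makes the enumeration a surjection of N onto N².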

Now we show that there exist infinite sets that are not countable.

Example 0.6.5. Let S be the open interval (0, 1), which is the set of all real numbers that lie strictly between zero and one:

S = (0, 1) = {x ∈ R : 0 < x < 1}.

We will use an argument by contradiction to prove that S is not countable. First we recall that every real number can be written in decimal form. In particular, if 0 < x < 1 then we can write

$$x = 0.d_1 d_2 d_3 \ldots = \sum_{k=1}^{\infty} \frac{d_k}{10^k},$$

where each digit $d_k$ is an integer between 0 and 9. Some numbers have two decimal representations, for example

$$\frac{1}{2} = 0.5000\ldots = \frac{5}{10} + \sum_{k=2}^{\infty} \frac{0}{10^k},$$

but also

$$\frac{1}{2} = 0.4999\ldots = \frac{4}{10} + \sum_{k=2}^{\infty} \frac{9}{10^k}. \tag{0.4}$$

Any number whose decimal representation ends in infinitely many zeros also has a decimal representation that ends in infinitely many nines, but all other real numbers have a unique decimal representation.

Suppose that S were countable. In this case there would exist a bijection f : N → S, and therefore we could make a list of all the elements of S. If we set $x_n = f(n)$, then this implies that

$$S = \mathrm{range}(f) = \{f(1), f(2), f(3), \ldots\} = \{x_1, x_2, x_3, \ldots\}$$

is a list of every real number between 0 and 1. Each number $x_n$ can be represented in decimal form, say,

$$x_n = 0.d^n_1 d^n_2 d^n_3 \ldots,$$

where each digit $d^n_k$ is an integer between 0 and 9.

Now we will create another sequence of digits between 0 and 9. In fact, in order to avoid difficulties arising from the fact that some numbers have two decimal representations, we will always choose digits that are between 1 and 8. To start, let $e_1$ be any integer between 1 and 8 that does not equal $d^1_1$ (the first digit of the first number $x_1$). For example, if the decimal representation of $x_1$ happened to be $x_1 = 0.72839172\ldots$, then we let $e_1$ be any digit other than 0, 7, or 9 (so we might take $e_1 = 5$ in this case). Then we let $e_2$ be any integer between 1 and 8 that does not equal $d^2_2$ (the second digit of the second number $x_2$), and so forth. This gives us digits $e_1, e_2, \ldots$, and we let x be the real number whose decimal expansion has exactly those digits:

$$x = 0.e_1 e_2 e_3 \ldots = \sum_{k=1}^{\infty} \frac{e_k}{10^k}.$$

Then x is a real number between 0 and 1, so x is one of the real numbers in the set S. Yet $x \ne x_1$, because the first digit of x (which is $e_1$) is not equal to the first digit of $x_1$ (why not? what if $x_1$ has two decimal representations?). Similarly $x \ne x_2$, because their second digits are different, and so forth. Hence x does not equal any element of S, which is a contradiction. Therefore S cannot be a countable set. ♦
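The digit-selection step of the diagonal argument can be illustrated on a finite list of digit sequences. The sketch below is only a finite analogue of Example 0.6.5: it takes the first few digits of the first few listed numbers and produces digits that differ from the diagonal. The name `diagonal_digits` and the specific rule (use 5, or 4 if the diagonal digit is already 5) are assumptions chosen for illustration; any rule that picks a different digit between 1 and 8 would serve.

```python
def diagonal_digits(rows):
    """Return digits e with e[n] in {1, ..., 8} and e[n] != rows[n][n].

    rows[n] holds the leading decimal digits of the (n+1)-st listed number.
    """
    digits = []
    for n, row in enumerate(rows):
        diagonal = row[n]
        # Choose a digit between 1 and 8 that differs from the diagonal digit.
        digits.append(5 if diagonal != 5 else 4)
    return digits


if __name__ == "__main__":
    listed = [
        [7, 2, 8, 3],
        [1, 4, 1, 5],
        [5, 5, 5, 5],
        [0, 9, 0, 9],
    ]
    print(diagonal_digits(listed))
```

Because the chosen digits avoid 0 and 9, the resulting number has a unique decimal representation, which is the point of the restriction to digits between 1 and 8.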

Here are some properties of countable and uncountable sets (the proof is assigned as Problem 0.6.11).

Lemma 0.6.6. Let X and Y be sets.

(a) If X is countable and Y ⊆ X, then Y is countable.

(b) If X is uncountable and Y ⊇ X, then Y is uncountable.

(c) If X is countable and there exists an injection f : Y → X, then Y is countable.

(d) If X is uncountable and there exists an injection f : X → Y, then Y is uncountable. ♦

Example 0.6.7. (a) Let Q⁺ = {r ∈ Q : r > 0} be the set of all positive rational numbers. Given r ∈ Q⁺, there is a unique way to write r as a fraction in lowest terms. That is, r = m/n for a unique choice of positive integers m and n that have no common factors. Therefore, by setting f(r) = (m, n) we can define an injective map of Q⁺ into N². Since N² is countable and f is injective, we apply Lemma 0.6.6(c) and conclude that Q⁺ is countable.

A similar argument shows that Q⁻, the set of negative rational numbers, is countable. Problem 0.6.12 tells us that a union of finitely many (or even countably many) countable sets is countable, so it follows that Q = Q⁺ ∪ Q⁻ ∪ {0} is countable.

(b) We saw in Example 0.6.5 that S = (0, 1) is uncountable. Since R ⊇ S, Lemma 0.6.6(b) implies that R is uncountable. Also, since every real number is a complex number we have R ⊆ C, and therefore C is uncountable as well.

(c) Let I = R \ Q be the set of irrational real numbers. Since R = I ∪ Q, if I were countable then R would be the union of two countable sets, which is countable by Problem 0.6.12. This is a contradiction, so the set of irrationals must be uncountable.

Thus Q is countable while I is uncountable. This may seem counterintuitive since between any two rational numbers there is an irrational, and between any two irrational numbers there is a rational number! ♦
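The injection of Q⁺ into N² used in Example 0.6.7(a) can be sketched with Python's `fractions.Fraction`, which always stores a rational number in lowest terms. The function name `lowest_terms_pair` is a hypothetical helper, and the finite check below only illustrates injectivity on a sample.

```python
from fractions import Fraction


def lowest_terms_pair(r: Fraction):
    """Map a positive rational r = m/n (in lowest terms) to the pair (m, n)."""
    if r <= 0:
        raise ValueError("r must be a positive rational number")
    return (r.numerator, r.denominator)


if __name__ == "__main__":
    print(lowest_terms_pair(Fraction(6, 4)))
    sample = {Fraction(m, n) for m in range(1, 21) for n in range(1, 21)}
    pairs = {lowest_terms_pair(r) for r in sample}
    print(len(sample) == len(pairs))  # distinct rationals give distinct pairs
```

The map is not surjective (for example, no rational is sent to (2, 4)), which is why the argument relies on Lemma 0.6.6(c) rather than on exhibiting a bijection.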

Problems

0.6.8. Prove equation (0.4) (for the precise definition of an infinite series, see Section 0.12).

0.6.9. Given sets A, B, and C, prove the following statements.

(a) A has the same cardinality as A.

(b) If A has the same cardinality as B, then B has the same cardinality as A.

(c) If A has the same cardinality as B and B has the same cardinality as C, then A has the same cardinality as C.

0.6.10. Prove that the closed interval [0, 1] and the open interval (0, 1) have the same cardinality by exhibiting a bijection f : [0, 1] → (0, 1).

Hint: Do not try to create a continuous function f.

0.6.11. Prove Lemma 0.6.6.

0.6.12. (a) Show that if X and Y are countable sets, then their Cartesian product X × Y and their union X ∪ Y are countable.

(b) Prove that the union of finitely many countable sets $X_1, \ldots, X_n$ is countable.

(c) Suppose that $X_1, X_2, \ldots$ are countably many sets, each of which is countable. Prove that $\bigcup_{k=1}^{\infty} X_k = X_1 \cup X_2 \cup \cdots$ is countable. Thus the union of countably many countable sets is countable.

Hint: Consider Figure 0.1.


0.6.13. Let F be the set of all functions f : R → R, i.e., F is the set of all functions that map real numbers to real numbers. Prove that F is uncountable, and F does not have the same cardinality as the real line R.

Hint: Suppose there were a bijection a : R → F. To simplify the notation, for each x ∈ R write $a_x$ instead of a(x). Then $a_x$ is a function that maps real numbers to real numbers. The assumption that a is a bijection means that EVERY function f : R → R is one and only one of these functions $a_x$. Consider the function defined by $f(x) = a_x(x) + 1$ for x ∈ R.

0.7 Extended Real-Valued Functions

A function that maps a set X into the real line R is called a real-valued function, and a function that maps X into the extended real line [−∞,∞] is an extended real-valued function. Every real-valued function is an extended real-valued function, but an extended real-valued function need not be real-valued. For example, if we set

$$f(x) = \begin{cases} 1/x, & x > 0, \\ \infty, & x = 0, \end{cases} \qquad\text{and}\qquad g(x) = \begin{cases} 1/x, & x > 0, \\ 0, & x = 0, \end{cases}$$

then f is extended real-valued but not real-valued, while g is both real-valued and extended real-valued.

An extended real-valued function f is nonnegative if f(x) ≥ 0 for every x in its domain, where we use the convention that 0 ≤ ∞ (indeed, a < ∞ for every real number a).

If f : X → [−∞,∞], then to avoid multiplicities of parentheses, brackets, and braces, we often write $f^{-1}(a, b) = f^{-1}((a, b))$, $f^{-1}[a, \infty) = f^{-1}([a, \infty))$, and so forth. We also use shorthands such as

$$\begin{aligned}
\{f > a\} &= \{x \in X : f(x) > a\} = f^{-1}(a, \infty], \\
\{f \ge a\} &= \{x \in X : f(x) \ge a\} = f^{-1}[a, \infty], \\
\{f = a\} &= \{x \in X : f(x) = a\} = f^{-1}\{a\}, \\
\{a < f < b\} &= \{x \in X : a < f(x) < b\} = f^{-1}(a, b), \\
\{f \ge g\} &= \{x \in X : f(x) \ge g(x)\}, \\
\{f = g\} &= \{x \in X : f(x) = g(x)\},
\end{aligned}$$

and so forth.


Positive and Negative Parts

Let f : X → [−∞,∞] be an extended real-valued function. We associate to f the two extended real-valued functions $f^+$ and $f^-$ defined by

$$f^+(x) = \max\{f(x), 0\} \quad\text{and}\quad f^-(x) = \max\{-f(x), 0\}.$$

We call $f^+$ the positive part and $f^-$ the negative part of f (see the illustration in Figure 0.3). They are each nonnegative, and we have the equalities

$$f(x) = f^+(x) - f^-(x) \quad\text{and}\quad |f(x)| = f^+(x) + f^-(x).$$

Although $f^+$ and $f^-$ can take the value ∞, the expression $f^+(x) - f^-(x)$ is never an indeterminate form because these two functions cannot both take the value ∞ at any single point x.

Fig. 0.3 A function f (top), its positive part $f^+$ (middle), and its negative part $f^-$ (bottom).
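The positive and negative parts are straightforward to compute pointwise. The sketch below models extended real values with Python floats (using `math.inf` for ∞); the function names `positive_part` and `negative_part` are hypothetical helpers that return new functions.

```python
import math


def positive_part(f):
    """Return the function x -> max{f(x), 0}."""
    return lambda x: max(f(x), 0.0)


def negative_part(f):
    """Return the function x -> max{-f(x), 0}."""
    return lambda x: max(-f(x), 0.0)


if __name__ == "__main__":
    g = lambda x: x * x - 4
    gp, gn = positive_part(g), negative_part(g)
    for x in (-3, -1, 0, 2, 3):
        print(x, g(x), gp(x), gn(x))

    # An extended real-valued example: h(0) = infinity.
    h = lambda x: math.inf if x == 0 else 1 / x
    print(positive_part(h)(0), negative_part(h)(0))
```

At a point where f(x) = ∞ the negative part is 0, so the difference of the two parts is ∞ − 0 and no indeterminate form arises.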


Monotone Functions

If f : S → [−∞,∞] is an extended real-valued function on a set S ⊆ R, then we say that f is monotone increasing on S if for all x, y ∈ S we have

x ≤ y =⇒ f(x) ≤ f(y).

We say that f is strictly increasing on S if for all x, y ∈ S,

x < y =⇒ f(x) < f(y).

Monotone decreasing and strictly decreasing functions are defined similarly.

Often the domain S of a monotone function is some type of interval. If [a, b] is a finite closed interval and f : [a, b] → R is a monotone increasing, real-valued function on [a, b], then f is bounded since f(a) and f(b) must be finite real numbers and f(a) ≤ f(x) ≤ f(b) for every x ∈ [a, b]. However, a monotone increasing function whose domain is any other type of interval can be unbounded, even if f never takes the values ±∞. For example, f(x) = −1/x is an unbounded, strictly increasing function on (0, 1], even though f(x) is finite for every x ∈ (0, 1].

Notation for Extended Real-Valued and Complex-Valued Functions

Most of the functions that we will encounter in this text will either be real-valued, extended real-valued, or complex-valued. A function of the form f : X → R is said to be real-valued, a function of the form f : X → [−∞,∞] is extended real-valued, and a function of the form f : X → C is complex-valued. We have the inclusions R ⊆ [−∞,∞] and R ⊆ C, so every real-valued function is both an extended real-valued and a complex-valued function. However, neither [−∞,∞] nor C is a subset of the other, so an extended real-valued function need not be a complex-valued function, and a complex-valued function need not be an extended real-valued function. Hence there are usually two cases to consider:

• extended real-valued functions of the form f : X → [−∞,∞], and

• complex-valued functions of the form f : X → C.

To avoid excessive duplication, we introduce a notation that will allow us to consider both cases together.

Notation 0.7.1 (Scalars and the Symbol F). We let the symbol F denote a choice of either the extended real line [−∞,∞] or the complex plane C. Associated with this choice, we make the following declarations.


• If F = [−∞,∞], then the word scalar means a finite real number c ∈ R.

• If F = C, then the word scalar means a complex number c ∈ C.

Note that a scalar cannot be ±∞; instead, a scalar is always a real or complex number. ♦

Thus, for example, when we write f : X → F, we mean that f is either an extended real-valued or a complex-valued function on X. Both possibilities include real-valued functions as a special case.

Remark 0.7.2. The letter “F” here is related to the name “field.” In many circumstances in analysis, we want to be able to use either the real line R or the complex plane C as our field. In these cases, it is not uncommon to use the symbol F to denote a choice of R or C. For example, this notation is used in both [Heil11] and [Heil18].

However, in this text the choice is between the extended real line $\overline{\mathbb{R}} = [-\infty, \infty]$ and the complex plane C. The extended real line is not a field, but it is related to the field R. Hence fields are still the issue, and this is the reason for the choice of the letter “F” in this context. Thus, in this text, F denotes a choice of $\overline{\mathbb{R}} = [-\infty, \infty]$ or C, but the reader should be aware that other notations are used, such as the use of F to denote a choice of R or C. Additionally, some texts focus solely on one field, or move interchangeably between fields as convenient without adopting a notation to denote a choice of fields (or their extended versions).

The reader should also be aware that there is another notion, useful in topological contexts, of the one-point compactification of R or of C. This is a distinct concept that will not be used in this text. In particular, for our purposes in analysis, it is not useful to try to define a “complex infinity” in any way. ♦

0.8 Sequences

Let J be a fixed set. Given a set X and points $x_j \in X$ for j ∈ J, we write $\{x_j\}_{j \in J}$ to denote the sequence of elements $x_j$ indexed by the set J. We call J an index set in this context, and we refer to $x_j$ as the jth component of the sequence $\{x_j\}_{j \in J}$. If we know that the $x_j$ are real or complex numbers, then we often write a sequence as $(x_j)_{j \in J}$ instead of $\{x_j\}_{j \in J}$. If the index set J is understood then we may write $\{x_j\}$, $\{x_j\}_j$, $(x_j)$, or $(x_j)_j$, as appropriate.

Technically, a sequence $\{x_j\}_{j \in J}$ is shorthand for the function x : J → X whose rule is

$$x(j) = x_j, \quad \text{for } j \in J.$$

Consequently the components $x_j$ of a sequence need not be distinct, i.e., it is possible that we might have $x_i = x_j$ for some i ≠ j.


Often the index set J is countable. The two most common situations are the finite index set J = {1, . . . , d} and the countably infinite index set J = N. If J = {1, . . . , d} then we often write a sequence in list form as

$$\{x_n\}_{n=1}^{d} = \{x_1, \ldots, x_d\} \quad\text{or}\quad (x_n)_{n=1}^{d} = (x_1, \ldots, x_d).$$

Similarly, if J = N = {1, 2, . . .} then we often write

$$\{x_n\}_{n \in \mathbb{N}} = \{x_1, x_2, \ldots\} \quad\text{or}\quad (x_n)_{n \in \mathbb{N}} = (x_1, x_2, \ldots).$$

A subsequence of a countable sequence $\{x_n\}_{n \in \mathbb{N}} = \{x_1, x_2, \ldots\}$ is a sequence of the form

$$\{x_{n_k}\}_{k \in \mathbb{N}} = \{x_{n_1}, x_{n_2}, \ldots\} \quad\text{where } n_1 < n_2 < \cdots.$$

For example, $\{x_2, x_3, x_5, x_7, x_{11}, \ldots\}$ is a subsequence of $\{x_1, x_2, \ldots\}$.

We say that a countable sequence of real numbers $(x_n)_{n \in \mathbb{N}}$ is monotone increasing if $x_n \le x_{n+1}$ for every n, and strictly increasing if $x_n < x_{n+1}$ for every n. We define monotone decreasing and strictly decreasing sequences similarly.

The Kronecker Delta and the Standard Basis Vectors

Given i, j in an index set J (typically $J = \mathbb{N}$), the Kronecker delta of i and j is the number $\delta_{ij}$ defined by the rule

\[ \delta_{ij} = \begin{cases} 1, & \text{if } i = j, \\ 0, & \text{if } i \ne j. \end{cases} \]

If $n \in \mathbb{N}$ is a positive integer, then we let $\delta_n$ denote the sequence

\[ \delta_n = (\delta_{nk})_{k \in \mathbb{N}} = (0, \dots, 0, 1, 0, 0, \dots). \]

That is, the nth component of the sequence $\delta_n$ is 1, while all other components are zero. We call $\delta_n$ the nth standard basis vector, and we refer to the family $\{\delta_n\}_{n \in \mathbb{N}}$ as the sequence of standard basis vectors, or simply the standard basis.
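As a small illustration, the rule for the Kronecker delta can be sketched in Python. The function names `kronecker_delta` and `standard_basis_prefix` are hypothetical names chosen for this sketch, and since a computer cannot store an infinite sequence, only the first few components of the nth standard basis vector are listed.

```python
def kronecker_delta(i: int, j: int) -> int:
    """Return 1 if i == j and 0 otherwise."""
    return 1 if i == j else 0


def standard_basis_prefix(n: int, length: int) -> list[int]:
    """First `length` components (k = 1, ..., length) of the nth standard basis vector."""
    return [kronecker_delta(n, k) for k in range(1, length + 1)]


print(standard_basis_prefix(3, 6))  # [0, 0, 1, 0, 0, 0]
```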


Problems

0.8.1. Prove that each of the following sets A, B, C, and D is uncountable.

(a) A is the set of all sequences $x = (x_1, x_2, \dots)$ where each $x_k$ is an integer between 0 and 9.

(b) B is the set of all sequences $x = (x_1, x_2, \dots)$ where each $x_k$ is an integer between 0 and 4.

(c) C is the set of all sequences $x = (x_1, x_2, \dots)$ where each $x_k$ is either 0 or 1.

(d) D is the set of all sequences $x = (x_1, x_2, \dots)$ where each $x_k$ is either 0 or 2.

0.9 Suprema and Infima

Let S be a set of real numbers.

• S is bounded above if there exists a real number M such that x ≤ M for every x ∈ S. Any such number M is called an upper bound for S.

• S is bounded below if there exists a real number m such that m ≤ x for every x ∈ S. Any such number m is called a lower bound for S.

• S is bounded if it is bounded both above and below. Equivalently, S is bounded if and only if there is a real number M ≥ 0 such that |x| ≤ M for all x ∈ S.

• x is a maximum element of S if x ∈ S and s ≤ x for every s ∈ S.

• x is a minimum element of S if x ∈ S and s ≥ x for every s ∈ S.

Not every set of real numbers S has a maximum or minimum element, even if it is bounded. For example, the open interval I = (0, 1) has no maximum or minimum element. Often, a more useful notion than a maximum or minimum element is the supremum or infimum of a set. We consider these next.

Definition 0.9.1 (Supremum). Let S be a nonempty set of real numbers. We say that x ∈ R is the supremum, or least upper bound, of S if the following two statements hold.

(a) x is an upper bound for S, i.e., s ≤ x for every s ∈ S.

(b) If y is any upper bound for S, then x ≤ y. That is, if y ∈ R and s ≤ y for every s ∈ S, then x ≤ y.

We denote the supremum of S, if one exists, by x = sup(S). ♦


For example, the supremum of the open interval S = (0, 1) is sup(S) = 1. Note that the supremum of a set need not belong to the set.

It is not obvious whether every set that is bounded above has a supremum. We take the existence of suprema as an axiom, as follows.

Axiom 0.9.2 (Supremum Property of the Real Line). Let S be a nonempty subset of R. If S is bounded above, then there exists a real number x = sup(S) that is the supremum of S. ♦

Here is an immediate but useful fact about the supremum of a set.

Lemma 0.9.3. Let S be a nonempty subset of R that is bounded above. Then for each ε > 0 there exists some x ∈ S such that

\[ \sup(S) - \varepsilon < x \le \sup(S). \]

Proof. Since the set S is bounded above, its supremum u = sup(S) is a finite real number. Since u is the least upper bound for S, the number u − ε is not an upper bound for S. Therefore there must exist some x ∈ S such that u − ε < x. On the other hand, u is an upper bound for S, so we must have x ≤ u. Therefore u − ε < x ≤ u. ⊓⊔

We extend the definition of a supremum to sets that are not bounded above by declaring that sup(S) = ∞ if S is not bounded above. We also declare that sup(∅) = −∞. Using these conventions, every set S ⊆ R has a supremum (although it might be ±∞), and the following statements hold.

• If S is empty, then sup(S) = −∞.

• If S is nonempty and bounded above, then −∞ < sup(S) < ∞.

• If S is not bounded above, then sup(S) = ∞.

If $S = (x_n)_{n \in \mathbb{N}}$ is countable, then we often write $\sup_n x_n$ or $\sup x_n$ to denote the supremum instead of sup(S).

The infimum, or greatest lower bound, of S is defined in an entirely analogous manner, and is denoted by inf(S). Statements analogous to the ones made for suprema hold for infima.

To illustrate the use of suprema, we prove the following result. Further results about suprema and infima (including facts about unbounded sets of numbers) are listed in the problems below.

Lemma 0.9.4. If $(x_n)_{n \in \mathbb{N}}$ and $(y_n)_{n \in \mathbb{N}}$ are two bounded sequences of real numbers, then

\[ \sup_{n \in \mathbb{N}} (x_n + y_n) \le \sup_{n \in \mathbb{N}} x_n + \sup_{n \in \mathbb{N}} y_n. \]

Proof. For simplicity of notation, set $u = \sup x_n$ and $v = \sup y_n$. Since $(x_n)_{n \in \mathbb{N}}$ and $(y_n)_{n \in \mathbb{N}}$ are bounded, we know that u and v are finite real numbers. Since u is an upper bound for the $x_n$, we have $x_n \le u$ for every n. Similarly, $y_n \le v$ for every n. Therefore $x_n + y_n \le u + v$ for every n. Hence u + v is an upper bound for the sequence $(x_n + y_n)_{n \in \mathbb{N}}$, so this sequence is bounded above and therefore has a finite supremum, say $w = \sup (x_n + y_n)$. By definition, w is the least upper bound for $(x_n + y_n)_{n \in \mathbb{N}}$. Since we have shown that u + v is an upper bound for $(x_n + y_n)_{n \in \mathbb{N}}$, we must have w ≤ u + v, which is exactly what we wanted to prove. ⊓⊔
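The inequality in Lemma 0.9.4 can be strict. The following Python sketch uses the sample sequences $x_n = (-1)^n$ and $y_n = (-1)^{n+1}$, which are our own choice for illustration. Both sequences take only the values ±1, so the maximum over the first ten terms already equals the supremum of the whole sequence.

```python
# Finite truncations of x_n = (-1)^n and y_n = (-1)^(n+1), n = 1, ..., 10.
xs = [(-1) ** n for n in range(1, 11)]
ys = [(-1) ** (n + 1) for n in range(1, 11)]

sup_of_sum = max(x + y for x, y in zip(xs, ys))  # every term x_n + y_n is 0
sum_of_sups = max(xs) + max(ys)                  # 1 + 1

print(sup_of_sum, sum_of_sups)  # 0 2
assert sup_of_sum <= sum_of_sups
```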

Problems

0.9.5. Given a nonempty set S ⊆ R, prove the following statements about suprema. Also formulate and prove analogous results for infima.

Hint: We are not assuming that S is bounded, so first prove these results assuming S is bounded, and then separately consider the case of an unbounded set S.

(a) If S has a maximum element x, then x = sup(S).

(b) If t ∈ R, then sup(S + t) = sup(S) + t, where S + t = {x + t : x ∈ S}.

(c) If c ≥ 0, then sup(cS) = c sup(S), where cS = {cx : x ∈ S}.

(d) If $a_n \le b_n$ for every n, then $\sup a_n \le \sup b_n$.

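0.9.6. Let $(x_n)_{n \in \mathbb{N}}$ and $(y_n)_{n \in \mathbb{N}}$ be two sequences of real numbers (not necessarily bounded).

(a) Show that if c > 0, then

\[ \sup_{n \in \mathbb{N}} c x_n = c \sup_{n \in \mathbb{N}} x_n \quad\text{and}\quad \sup_{n \in \mathbb{N}} (-c x_n) = -c \inf_{n \in \mathbb{N}} x_n, \]

where we declare that c · (±∞) = ±∞ and −c · (±∞) = ∓∞ for every positive real number c.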

(b) Prove that

infn∈N

xn + infn∈N

yn ≤ infn∈N



xn + supn∈N

yn.

Show by example that any of the inequalities on the preceding line can be strict.

0.9.7. Given nonempty sets A, B ⊆ R, prove that

sup(A + B) = sup(A) + sup(B),

where A + B = {a + b : a ∈ A, b ∈ B}. Why does this not contradict Problem 0.9.6(b)?


0.9.8. Let S be a bounded, nonempty set of real numbers. Given a real number u, prove that u = sup(S) if and only if both of the following two statements hold:

(a) there does not exist any s ∈ S such that u < s, and

(b) if v < u, then there exists some s ∈ S such that v < s.

0.10 Convergent Sequences of Numbers

Convergence of sequences will be discussed in the more general setting of metric spaces in Section 1.1.1. Here we will focus on sequences $(x_n)_{n \in \mathbb{N}}$ of real or complex numbers.

Definition 0.10.1. Let $(x_n)_{n \in \mathbb{N}}$ be a sequence of real or complex numbers.

(a) We say that $(x_n)_{n \in \mathbb{N}}$ converges if there exists some real or complex number x such that for every ε > 0 there is an N > 0 such that

\[ n \ge N \implies |x - x_n| < \varepsilon. \]

In this case we say that $x_n$ converges to x as n → ∞ and write

\[ x_n \to x \quad\text{or}\quad \lim_{n \to \infty} x_n = x \quad\text{or}\quad \lim x_n = x. \]

(b) We say that $(x_n)_{n \in \mathbb{N}}$ is a Cauchy sequence if for every ε > 0 there exists an integer N > 0 such that

\[ m, n \ge N \implies |x_m - x_n| < \varepsilon. \] ♦
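To see how N depends on ε in part (a), consider the sample sequence $x_n = 1/n$, which converges to 0. The Python sketch below is only an illustration: the helper name `n_threshold` is hypothetical, and the final check inspects finitely many indices, so it is a spot-check and not a proof.

```python
from fractions import Fraction
import math


def n_threshold(eps: Fraction) -> int:
    """Return an N with 1/n < eps for all n >= N (any integer N > 1/eps works)."""
    return math.floor(1 / eps) + 1


eps = Fraction(1, 100)
N = n_threshold(eps)
print(N)  # 101

# Spot-check the defining inequality |0 - x_n| < eps on a finite range of n >= N.
assert all(abs(0 - Fraction(1, n)) < eps for n in range(N, N + 1000))
```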

It follows immediately from the definition that every convergent sequence of real or complex numbers is a Cauchy sequence (this is Problem 0.10.9). According to the following result, which is a consequence of the Supremum Property of the real line, the converse holds as well (for one proof, see [Rud76, Thm. 3.11]).

Theorem 0.10.2 (Cauchy Sequences of Scalars are Convergent). If $(x_n)_{n \in \mathbb{N}}$ is a sequence of real or complex numbers, then

\[ (x_n)_{n \in \mathbb{N}} \text{ is convergent} \iff (x_n)_{n \in \mathbb{N}} \text{ is Cauchy}. \] ♦

Here are some basic properties of convergent sequences.

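Lemma 0.10.3. Let $(x_n)_{n \in \mathbb{N}}$ and $(y_n)_{n \in \mathbb{N}}$ be sequences of real or complex numbers.

(a) If $(x_n)_{n \in \mathbb{N}}$ converges then it is bounded, i.e., $\sup |x_n| < \infty$.

(b) If $(x_n)_{n \in \mathbb{N}}$ and $(y_n)_{n \in \mathbb{N}}$ both converge, then so does $(x_n + y_n)_{n \in \mathbb{N}}$, and

\[ \lim_{n \to \infty} (x_n + y_n) = \lim_{n \to \infty} x_n + \lim_{n \to \infty} y_n. \]

(c) If $(x_n)_{n \in \mathbb{N}}$ converges and c is a real or complex number, then $(c x_n)_{n \in \mathbb{N}}$ converges, and

\[ \lim_{n \to \infty} c x_n = c \lim_{n \to \infty} x_n. \]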

Proof. (a) Suppose that $x_n \to x$. Considering ε = 1, there must exist some N > 0 such that $|x - x_n| < 1$ for all n ≥ N. Therefore, for n ≥ N we have

\[ |x_n| = |x_n - x + x| \le |x_n - x| + |x| < 1 + |x|. \]

Hence, for an arbitrary n ∈ N,

\[ |x_n| \le \max\bigl\{ |x_1|, \dots, |x_{N-1}|, |x| + 1 \bigr\}. \]

Since the right-hand side of the line above is a constant that is independent of n, we see that the sequence $(x_n)_{n \in \mathbb{N}}$ is bounded.

We assign the proofs of parts (b) and (c) as Problem 0.10.11. ⊓⊔

Divergence to Infinity

We introduce some terminology for a sequence of real numbers that increases without bound. We do not say that such a sequence converges but instead say that it diverges to infinity.

Definition 0.10.4 (Divergence to Infinity). If $(x_n)_{n \in \mathbb{N}}$ is a sequence of real numbers, then we say that $(x_n)_{n \in \mathbb{N}}$ diverges to ∞ if for each real number R > 0 there is an integer N > 0 such that $x_n > R$ for all n ≥ N. In this case we write

\[ x_n \to \infty, \quad \lim_{n \to \infty} x_n = \infty, \quad\text{or}\quad \lim x_n = \infty. \]

We define divergence to −∞ similarly. ♦

Convergence in the Extended Real Sense

Let $(x_n)_{n \in \mathbb{N}}$ be a sequence of real numbers. We say that

\[ \lim_{n \to \infty} x_n \text{ exists} \]

or that

\[ (x_n)_{n \in \mathbb{N}} \text{ converges in the extended real sense} \]

if either

• $x_n$ converges to a real number x as n → ∞, or

• $x_n$ diverges to ∞ as n → ∞, or

• $x_n$ diverges to −∞ as n → ∞.

Remark 0.10.5. In some circumstances in mathematics it is appropriate to introduce an analogue of ∞ for the complex plane. For example, this is done when we consider the topological “one-point compactification of the complex plane.” Those notions are not appropriate for the purposes of this text, and hence we will not consider any analogue of “convergence in the extended real sense” for sequences of complex numbers. Consequently, if $(x_n)_{n \in \mathbb{N}}$ is a sequence of complex numbers, then its limit exists if and only if $x_n$ converges to some complex scalar (not ±∞ or any notion of a “complex infinity”) as n → ∞. ♦

Conventions

A sequence $(x_n)_{n \in \mathbb{N}}$ of real numbers can converge, converge in the extended real sense, or not converge at all. A sequence $(x_n)_{n \in \mathbb{N}}$ of complex numbers can converge or not converge. The terminology in these two situations is similar but slightly different, yet it is usually clear from context what we mean when we say that a given sequence of scalars $(x_n)_{n \in \mathbb{N}}$ converges or that a limit exists. However, to be completely precise, we list the technical details here.

Notation 0.10.6 (Existence of a Limit of Scalars). Let $(x_n)_{n \in \mathbb{N}}$ be a sequence of (real or complex) scalars.

• When we say that a generic sequence of scalars $(x_n)_{n \in \mathbb{N}}$ converges, we mean that it converges to a scalar value. This applies to sequences of real numbers and to sequences of complex numbers.

• When we say that the limit of a sequence of real numbers exists, this means that it exists in the extended real sense. When we say that a sequence of real numbers converges, we mean that it converges to a finite real number.

• When we say that the limit of a sequence of complex numbers exists or that the sequence converges, this means that it converges to a complex number.

• We do not use a concept of “complex infinity” in this text, and hence there is no notion of “divergence to infinity” for a sequence of complex numbers. Therefore, if $(x_n)_{n \in \mathbb{N}}$ is a sequence of complex numbers, then this sequence converges if and only if the limit of the sequence exists and is a complex number. ♦


Example 0.10.7. Every monotone increasing sequence of real numbers $(x_n)_{n \in \mathbb{N}}$ converges in the extended real sense, and in this case $\lim x_n = \sup x_n$. Similarly, a monotone decreasing sequence of real numbers converges in the extended real sense and its limit equals its infimum. ♦

Remark 0.10.8. Sometimes we need to consider sequences of extended real numbers, instead of just sequences of real numbers. The concepts that we have introduced extend to this setting. For example, if we allow each $x_n$ to be an extended real number, then it is still true that a monotone increasing sequence of extended real numbers $(x_n)_{n \in \mathbb{N}}$ converges in the extended real sense. ♦

Pointwise Convergence of Functions

If X is a set and $\{f_n\}_{n \in \mathbb{N}}$ is a sequence of extended real-valued or complex-valued functions whose domain is X, then we say that $f_n$ converges pointwise to a function f if

\[ f(x) = \lim_{n \to \infty} f_n(x), \quad \text{for all } x \in X. \]

In this case we write $f_n(x) \to f(x)$ for every x ∈ X or $f_n \to f$ pointwise.

If $\{f_n\}_{n \in \mathbb{N}}$ is a sequence of extended real-valued functions whose domain is a set X, then we say that $\{f_n\}_{n \in \mathbb{N}}$ is a monotone increasing sequence of functions if the sequence $\{f_n(x)\}_{n \in \mathbb{N}}$ is monotone increasing for each x ∈ X, i.e., if

\[ f_1(x) \le f_2(x) \le \cdots \quad \text{for every } x \in X. \]

In this case $f(x) = \lim_{n \to \infty} f_n(x)$ exists for each x ∈ X in the extended real sense, and we say that $f_n$ increases pointwise to f. We denote this by writing $f_n \nearrow f$ on X.

Problems

0.10.9. Prove the “easy” direction of Theorem 0.10.2, i.e., show that every convergent sequence of scalars is Cauchy.

0.10.10.* Use Axiom 0.9.2 to prove the “hard” direction of Theorem 0.10.2.

0.10.11. Prove parts (b) and (c) of Lemma 0.10.3, and use induction to extend part (b) to finite sums of limits.

0.10.12. Assume that $(x_n)_{n \in \mathbb{N}}$ is a monotone increasing sequence of real numbers, i.e., each $x_n$ is real and $x_1 \le x_2 \le \cdots$. Prove that $(x_n)_{n \in \mathbb{N}}$ converges if and only if $(x_n)_{n \in \mathbb{N}}$ is bounded, and in this case we have

\[ \lim_{n \to \infty} x_n = \sup_{n} x_n. \]

Formulate and prove an analogous result for monotone decreasing sequences.

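0.10.13. Assume that $(x_n)_{n \in \mathbb{N}}$ and $(y_n)_{n \in \mathbb{N}}$ are convergent sequences of real or complex numbers. Prove that

\[ \lim_{n \to \infty} x_n y_n = \Bigl( \lim_{n \to \infty} x_n \Bigr) \Bigl( \lim_{n \to \infty} y_n \Bigr), \]

and if $\lim y_n \ne 0$ then

\[ \lim_{n \to \infty} \frac{x_n}{y_n} = \frac{\lim_{n \to \infty} x_n}{\lim_{n \to \infty} y_n}. \]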

0.11 Limsup and Liminf

Not every sequence of real numbers converges. Consequently, instead of trying to use limits it is sometimes more useful to deal with the following weaker notions.

Definition 0.11.1 (Limsup and Liminf). The limit superior, or limsup, of a sequence of real numbers $(x_n)_{n \in \mathbb{N}}$ is

\[ \limsup_{n \to \infty} x_n = \inf_{n \in \mathbb{N}} \Bigl( \sup_{m \ge n} x_m \Bigr). \]

Likewise, the limit inferior, or liminf, of $(x_n)_{n \in \mathbb{N}}$ is

\[ \liminf_{n \to \infty} x_n = \sup_{n \in \mathbb{N}} \Bigl( \inf_{m \ge n} x_m \Bigr). \] ♦

We sometimes use the abbreviated notations $\limsup x_n$ or $\liminf x_n$ to denote a limsup or a liminf.

The liminf and limsup of every sequence of real numbers exist in the extended real sense. That is, if $(x_n)_{n \in \mathbb{N}}$ is any sequence of real numbers then its liminf and limsup are extended real numbers in the range

\[ -\infty \le \liminf_{n \to \infty} x_n \le \limsup_{n \to \infty} x_n \le \infty. \]

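Example 0.11.2. Let $x_n = (-1)^n$ for n ∈ N. The sequence $\bigl( (-1)^n \bigr)_{n \in \mathbb{N}}$ does not converge. The limsup of this sequence is

\[ \limsup_{n \to \infty} (-1)^n = \inf_{n \in \mathbb{N}} \Bigl( \sup_{m \ge n} (-1)^m \Bigr) = \inf_{n \in \mathbb{N}} 1 = 1, \]

and its liminf is

\[ \liminf_{n \to \infty} (-1)^n = \sup_{n \in \mathbb{N}} \Bigl( \inf_{m \ge n} (-1)^m \Bigr) = \sup_{n \in \mathbb{N}} (-1) = -1. \] ♦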
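The computation in Example 0.11.2 can be mimicked numerically. The Python sketch below works with a finite window of the sequence, so in general it only approximates the tail suprema and infima. Here the values agree with the true limsup and liminf because the sequence is periodic and every tail we inspect contains both values 1 and −1.

```python
# First 20 terms of x_n = (-1)^n, stored so that xs[n - 1] = x_n.
xs = [(-1) ** n for n in range(1, 21)]

# Tail suprema and infima over the stored window, for n = 1, ..., 10.
tail_sups = [max(xs[n - 1:]) for n in range(1, 11)]
tail_infs = [min(xs[n - 1:]) for n in range(1, 11)]

limsup_estimate = min(tail_sups)  # inf over n of the tail suprema
liminf_estimate = max(tail_infs)  # sup over n of the tail infima

print(limsup_estimate, liminf_estimate)  # 1 -1
```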

We have the following characterization of convergent sequences.

Theorem 0.11.3. Let $(x_n)_{n \in \mathbb{N}}$ be a bounded sequence of real numbers. Then

\[ (x_n)_{n \in \mathbb{N}} \text{ converges} \iff \liminf_{n \to \infty} x_n = \limsup_{n \to \infty} x_n. \]

Furthermore, in this case we have

\[ \lim_{n \to \infty} x_n = \liminf_{n \to \infty} x_n = \limsup_{n \to \infty} x_n. \]

Proof. ⇒. Assume that $w = \lim x_n$ exists and is a finite real number. Since $(x_n)_{n \in \mathbb{N}}$ is a bounded sequence, both

\[ u = \liminf_{n \to \infty} x_n \quad\text{and}\quad v = \limsup_{n \to \infty} x_n \]

are finite, and Problem 0.11.4(a) shows that u ≤ v. For each n, set

\[ y_n = \inf_{m \ge n} x_m \quad\text{and}\quad z_n = \sup_{m \ge n} x_m. \]

If we fix ε > 0, then there exists some N > 0 such that $w - \varepsilon \le x_m \le w + \varepsilon$ for all m ≥ N. Hence for all n ≥ N we have

\[ w - \varepsilon \le \inf_{m \ge n} x_m = y_n \le z_n = \sup_{m \ge n} x_m \le w + \varepsilon. \]

Consequently, since the $y_n$ are increasing,

\[ u = \sup_{n \ge 1} y_n = \sup_{n \ge N} y_n \ge w - \varepsilon. \]

Likewise, since the $z_n$ are decreasing,

\[ v = \inf_{n \ge 1} z_n = \inf_{n \ge N} z_n \le w + \varepsilon. \]

This is true for every ε > 0, so u ≥ w and v ≤ w. Hence w ≤ u ≤ v ≤ w, and therefore w = u = v.

⇐. We assign the proof of this direction as Problem 0.11.6. ⊓⊔

Thus, although the limit of a bounded sequence need not exist, its liminf and limsup will always exist and the sequence converges if and only if the limsup and liminf are equal (and Problem 0.11.6 extends this fact to sequences that need not be bounded). More properties of the limsup and liminf of sequences are given in the problems below. In particular, Problem 0.11.10 gives several equivalent reformulations of the definition of a limsup.


On occasion we deal with real-parameter versions of liminf and limsup. Given a real-valued function f whose domain includes an interval centered at a point x ∈ R, we define

\[ \limsup_{t \to x} f(t) = \inf_{\delta > 0} \sup_{|t - x| < \delta} f(t) = \lim_{\delta \to 0} \sup_{|t - x| < \delta} f(t), \]

and $\liminf_{t \to x} f(t)$ is defined analogously. The properties of these real-parameter versions of liminf and limsup are similar to those of the sequence versions.

Problems

0.11.4. (a) Show that if $(a_n)_{n \in \mathbb{N}}$ is any sequence of real numbers, then $\liminf a_n \le \limsup a_n$.

(b) Show that if $a_n \le b_n$ for every n, then $\limsup a_n \le \limsup b_n$.

0.11.5. Let $(x_n)_{n \in \mathbb{N}}$ be a sequence of strictly positive real numbers. Suppose that we have either

\[ \limsup_{n \to \infty} \frac{x_{n+1}}{x_n} < 1 \quad\text{or}\quad \limsup_{n \to \infty} x_n^{1/n} < 1. \]

Prove that there exist constants 0 < r < 1 and C > 0 such that $x_n \le C r^n$ for every n. Conclude that $\lim x_n = 0$.

0.11.6. (a) Finish the proof of Theorem 0.11.3.

(b) Given any sequence $(x_n)_{n \in \mathbb{N}}$ of real numbers, prove that

\[ (x_n)_{n \in \mathbb{N}} \text{ diverges to } \infty \iff \liminf_{n \to \infty} x_n = \limsup_{n \to \infty} x_n = \infty. \]

0.11.7. Let $(x_n)_{n \in \mathbb{N}}$ and $(y_n)_{n \in \mathbb{N}}$ be any two sequences of real numbers. As long as none of the following sums takes the indeterminate forms ∞ − ∞ or −∞ + ∞, prove that

\[ \begin{aligned}
\liminf_{n \to \infty} x_n + \liminf_{n \to \infty} y_n &\le \liminf_{n \to \infty} (x_n + y_n) \\
&\le \limsup_{n \to \infty} x_n + \liminf_{n \to \infty} y_n \\
&\le \limsup_{n \to \infty} (x_n + y_n) \\
&\le \limsup_{n \to \infty} x_n + \limsup_{n \to \infty} y_n.
\end{aligned} \]

Show by example that strict inequality can hold on any line above. (Consequently, in contrast to limits, neither limsup nor liminf is linear in general.)

Prove further that if the sequence $(x_n)_{n \in \mathbb{N}}$ converges, then

\[ \liminf_{n \to \infty} (x_n + y_n) = \lim_{n \to \infty} x_n + \liminf_{n \to \infty} y_n \]

and a similar equality holds with liminf replaced by limsup.

0.11.8. Given a sequence of real numbers $(x_n)_{n \in \mathbb{N}}$, prove that

\[ \limsup_{n \to \infty} (-x_n) = - \liminf_{n \to \infty} x_n. \]

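0.11.9. Given any sequence of real numbers $(x_n)_{n \in \mathbb{N}}$, prove that there exist subsequences $(x_{n_k})_{k \in \mathbb{N}}$ and $(x_{m_j})_{j \in \mathbb{N}}$ such that

\[ \lim_{k \to \infty} x_{n_k} = \limsup_{n \to \infty} x_n \quad\text{and}\quad \lim_{j \to \infty} x_{m_j} = \liminf_{n \to \infty} x_n. \]

Remark: In fact, the next problem shows that if $(x_n)_{n \in \mathbb{N}}$ is bounded then $\limsup x_n$ is the largest possible limit of any subsequence $(x_{n_k})_{k \in \mathbb{N}}$, and $\liminf x_n$ is the smallest limit of any subsequence.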

0.11.10. Let $(x_n)_{n \in \mathbb{N}}$ be a bounded sequence of real numbers and let x be a real number. Prove that the following five statements are equivalent.

(a) $x = \limsup_{n \to \infty} x_n$.

(b) $x = \lim_{n \to \infty} \bigl( \sup_{m \ge n} x_m \bigr)$.

(c) If ε > 0, then there are infinitely many $x_n$ with $x_n > x - \varepsilon$, but only finitely many $x_n$ such that $x_n > x + \varepsilon$.

(d) $x = \inf \bigl\{ y \in \mathbb{R} : \text{there are only finitely many } x_n > y \bigr\}$.

(e) $x = \sup \bigl\{ y \in \mathbb{R} : \text{there exists a subsequence } x_{n_k} \to y \bigr\}$.

Formulate and prove an analogous result for the liminf of a sequence.

0.12 Infinite Series of Numbers

Infinite series in the setting of normed spaces will be considered in detail in Section 1.2.3. Here we will review issues related to the convergence of series of real or complex numbers.

We say that a series $\sum_{n=1}^{\infty} c_n$ of real or complex numbers converges if there is a real or complex number s such that the partial sums

\[ s_N = \sum_{n=1}^{N} c_n \]

converge to s as N → ∞. In this case $\sum_{n=1}^{\infty} c_n$ is defined to be s, i.e.,

\[ \sum_{n=1}^{\infty} c_n = \lim_{N \to \infty} s_N = \lim_{N \to \infty} \sum_{n=1}^{N} c_n = s. \]

If the series $\sum_{n=1}^{\infty} c_n$ does not converge, then we say that it diverges.

We sometimes use the shorthands $\sum c_n$ or $\sum_n c_n$ to denote a series.

Here are two particular examples of infinite series.

Lemma 0.12.1. (a) If z is a real or complex number with |z| < 1, then $\sum_{k=0}^{\infty} z^k$ converges and has the value

\[ \sum_{k=0}^{\infty} z^k = \frac{1}{1 - z}. \]

Conversely, if |z| ≥ 1, then $\sum_{k=0}^{\infty} z^k$ does not converge.

(b) If z is any real or complex number, then $\sum_{k=0}^{\infty} z^k / k!$ converges and has the value

\[ \sum_{k=0}^{\infty} \frac{z^k}{k!} = e^z. \] ♦
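Both parts of Lemma 0.12.1 can be checked numerically at a sample point. The Python sketch below compares partial sums with the stated closed forms. The point z and the numbers of terms are arbitrary choices made for illustration, and the printed differences are floating-point errors, so they show only that the partial sums are close to the limits.

```python
import cmath
import math


def geometric_partial_sum(z: complex, terms: int) -> complex:
    """Sum of z**k for k = 0, ..., terms - 1."""
    return sum(z ** k for k in range(terms))


def exponential_partial_sum(z: complex, terms: int) -> complex:
    """Sum of z**k / k! for k = 0, ..., terms - 1."""
    return sum(z ** k / math.factorial(k) for k in range(terms))


z = 0.5 + 0.25j  # sample point with |z| < 1 (an arbitrary choice)
print(abs(geometric_partial_sum(z, 200) - 1 / (1 - z)))
print(abs(exponential_partial_sum(z, 30) - cmath.exp(z)))
```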

An important property of convergent series is given in the next lemma (the proof is assigned as Problem 0.12.8).

Lemma 0.12.2 (The nth Term Test). If $\sum_{n=1}^{\infty} c_n$ is a convergent series of real or complex numbers, then

\[ \lim_{n \to \infty} c_n = 0. \] ♦

Example 0.12.3. The converse of Lemma 0.12.2 is false in general. For example, consider $c_n = 1/n$. Although the scalars 1/n converge to zero as n → ∞, it is nonetheless the case that

\[ \text{the series } \sum_{n=1}^{\infty} \frac{1}{n} \text{ does not converge!} \] ♦
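One standard way to see why the harmonic series diverges is to group its terms into blocks of lengths 1, 2, 4, ..., each of which contributes at least 1/2, which gives $s_{2^k} \ge 1 + k/2$. The Python sketch below verifies this bound exactly for a few values of k using rational arithmetic. The helper name `harmonic_partial_sum` is hypothetical, and a finite check of this kind illustrates the bound without proving it.

```python
from fractions import Fraction


def harmonic_partial_sum(N: int) -> Fraction:
    """Exact partial sum s_N = 1 + 1/2 + ... + 1/N."""
    return sum((Fraction(1, n) for n in range(1, N + 1)), Fraction(0))


# The grouping argument gives the bound s_{2^k} >= 1 + k/2.
for k in range(0, 11):
    assert harmonic_partial_sum(2 ** k) >= 1 + Fraction(k, 2)

print(float(harmonic_partial_sum(1024)))
```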

Convergence of Series of Nonnegative Numbers

We often deal with series where every $c_n$ is a nonnegative real number. In this case there are only the following two possibilities (see Problem 0.12.10):

• If $c_n \ge 0$ for every n and the sequence of partial sums $\{s_N\}_{N \in \mathbb{N}}$ is bounded above, then the series $\sum c_n$ converges to a nonnegative real number. In this case we write

\[ \sum_{n=1}^{\infty} c_n < \infty. \]

• If $c_n \ge 0$ for every n and the sequence of partial sums $\{s_N\}_{N \in \mathbb{N}}$ is not bounded above, then $s_N$ diverges to infinity. In this case, the series $\sum c_n$ diverges, and we say that $\sum c_n$ diverges to infinity and write

\[ \sum_{n=1}^{\infty} c_n = \infty. \]

Notation 0.12.4 (Existence of a Series of Nonnegative Scalars). Assume that $c_n \ge 0$ for every n. Then either the series $\sum c_n$ converges to a real number or it diverges to infinity. We therefore say that a series $\sum c_n$ with all nonnegative terms exists or that it converges in the extended real sense. ♦

Note that saying that a series converges in the extended real sense does not mean that the series converges. Instead, $\sum c_n$ converges if the partial sums converge to a finite real scalar.

The following lemma regarding the “tails of a nonnegative series” is often useful (the proof is Problem 0.12.8).

Lemma 0.12.5 (Tails of Convergent Series). If $c_n \ge 0$ for every n and $\sum c_n < \infty$, then

\[ \lim_{N \to \infty} \Bigl( \sum_{n=N}^{\infty} c_n \Bigr) = 0. \] ♦

Absolutely Convergent Series of Scalars

We say that an infinite series $\sum c_n$ of real or complex numbers $c_n$ converges absolutely if

\[ \sum_{n=1}^{\infty} |c_n| < \infty. \]

Recall that every Cauchy sequence of real numbers converges (see Theorem 0.10.2). It follows from this that every absolutely convergent series of scalars converges. That is, if $c_n$ is a real or complex number for each n ∈ N, then

\[ \sum_{n=1}^{\infty} |c_n| < \infty \implies \sum_{n=1}^{\infty} c_n \text{ converges}. \]

However, the converse implication fails in general. For example, the alternating harmonic series

\[ \sum_{n=1}^{\infty} \frac{(-1)^n}{n} \]

converges, but it does not converge absolutely (this is Problem 0.12.6).

Problems

0.12.6. Prove that the alternating harmonic series $\sum (-1)^n \frac{1}{n}$ converges, but the harmonic series $\sum \frac{1}{n}$ diverges to infinity.

0.12.7. (a) Let $a_n$ and $b_n$ be real or complex numbers. Show that if $\sum a_n$ and $\sum b_n$ each converge, then $\sum (a_n + b_n)$ converges and

\[ \sum_{n=1}^{\infty} (a_n + b_n) = \sum_{n=1}^{\infty} a_n + \sum_{n=1}^{\infty} b_n. \]

(b) Exhibit real numbers $a_n$ and $b_n$ such that $\sum (a_n + b_n)$ converges but $\sum a_n$ and $\sum b_n$ do not converge.

0.12.8. Prove Lemmas 0.12.2 and 0.12.5.

0.12.9. Suppose that J is an uncountable index set, and $x_i > 0$ for each i ∈ J. Prove that

\[ \sup \Bigl\{ \sum_{i \in F} x_i : F \subseteq J, \ F \text{ is finite} \Bigr\} = \infty. \]

Conclude that the only logical definition of an uncountable series of strictly positive numbers is $\sum_{i \in J} x_i = \infty$.

0.12.10. Suppose that $a_n \ge 0$ for each n ∈ N. Prove that either $\sum a_n$ converges or it diverges to infinity.

0.12.11. Prove Fatou’s Lemma for series: If $a_{kn} \ge 0$ for all k, n, then

\[ \sum_{k=1}^{\infty} \Bigl( \liminf_{n \to \infty} a_{kn} \Bigr) \le \liminf_{n \to \infty} \Bigl( \sum_{k=1}^{\infty} a_{kn} \Bigr). \]

Show by example that strict inequality can hold.

0.12.12. Prove Tonelli’s Theorem for iterated series: If $c_{mn} \ge 0$ for all m, n ∈ N, then

\[ \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} c_{mn} = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} c_{mn}, \]

in the sense that either both sides converge and are equal, or both sides are infinite.


0.13 Differentiation and The Riemann Integral

In this section we will briefly review some facts and terminology connected with continuity, differentiation, and integration.

Continuity

There are several equivalent ways to define continuity. For functions on an interval, we will take the following as our definition of continuity. More generally, continuity for functions on metric spaces will be explored in detail in Section 1.1.4.

Definition 0.13.1 (Continuity). Let I be an interval in the real line, and let f be a real-valued or complex-valued function on I (that is, f has the form f : I → R or f : I → C).

(a) We say that f is continuous on I if for each x ∈ I we have

\[ \lim_{y \to x,\ y \in I} f(y) = f(x). \]

Stated explicitly, this means that for each point x ∈ I and for each ε > 0, there must exist a number δ > 0 such that

\[ y \in I, \ |x - y| < \delta \implies |f(x) - f(y)| < \varepsilon. \tag{0.5} \]

(b) We say that f is uniformly continuous on I if for each ε > 0, there exists a δ > 0 such that

\[ x, y \in I, \ |x - y| < \delta \implies |f(x) - f(y)| < \varepsilon. \tag{0.6} \] ♦

Note that the value of δ in equation (0.5) implicitly depends on the choice of x. That is, if we choose a different x then we may need a different value for δ in order to make equation (0.5) hold. In contrast, the value of δ in equation (0.6) must be independent of the choice of x. That is, in order for a function to be uniformly continuous there must be a single δ such that equation (0.6) holds.

If I = [a, b] is a finite closed interval, then every continuous function on I is both uniformly continuous and bounded on [a, b] (this is proved in Theorem 1.1.17). However, if I is any other type of interval in R, then there are continuous functions on I that are not uniformly continuous. For example, $f(x) = x^2$ is not uniformly continuous on I = R, and f(x) = 1/x is not uniformly continuous on I = (0, 1].
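To make the failure of uniform continuity for $f(x) = x^2$ on R concrete, take ε = 1. For any δ > 0 the points x = 1/δ and y = x + δ/2 satisfy |x − y| < δ, yet $|x^2 - y^2| = 1 + \delta^2/4 > 1$. The Python sketch below checks this identity exactly for a few values of δ. The choice of witness points and the helper name `witness_pair` are ours, made for illustration.

```python
from fractions import Fraction


def witness_pair(delta: Fraction) -> tuple[Fraction, Fraction]:
    """Return points x, y with |x - y| < delta but |x**2 - y**2| > 1."""
    x = 1 / delta
    y = x + delta / 2
    return x, y


for delta in [Fraction(1), Fraction(1, 10), Fraction(1, 1000)]:
    x, y = witness_pair(delta)
    assert abs(x - y) < delta
    # y**2 - x**2 = (y - x)(y + x) = (delta/2)(2/delta + delta/2) = 1 + delta**2/4
    assert abs(x ** 2 - y ** 2) == 1 + delta ** 2 / 4
```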


Remark 0.13.2. Consider the function f on [0,∞) defined by f(x) = 1/x for x > 0 and f(0) = ∞. Is this function continuous? We have only considered continuity for scalar-valued functions, so the definitions that we have introduced to this point cannot be applied to this extended real-valued function f. It is possible to extend the notion of continuity to extended real-valued functions, by defining an appropriate topology on the extended real line [−∞,∞]. If we do this in a way that appropriately extends the topology of the real line, then it turns out that the function f given above is continuous in this extended real sense. However, we will not need to consider this extended notion of continuity in this text. Instead, when we are given a function f : X → F, we will only apply terminology related to continuity if the function is real-valued or complex-valued. This should usually be clear from context, and in most cases the scalar-valued condition will be explicit. ♦

Derivatives and Everywhere Differentiability

Let f be a real-valued or complex-valued function on a domain D ⊆ R. If t ∈ D and there is some open interval I such that t ∈ I ⊆ D, then we say that f is differentiable at t if the limit

\[ f'(t) = \lim_{h \to 0} \frac{f(t + h) - f(t)}{h} \]

exists and is a scalar (in particular, f is not differentiable at t if this limit takes the form ±∞). In this case we call f′(t) the derivative of f at t.

Let [a, b] be a closed interval in the real line. We say that a function of the form f : [a, b] → R or f : [a, b] → C is differentiable everywhere on [a, b] if it is differentiable at each point x in the interior (a, b) and if the appropriate one-sided derivatives exist at the endpoints a and b. In other words, f is differentiable everywhere on [a, b] if

\[ f'(x) = \lim_{y \to x,\ y \in [a,b]} \frac{f(y) - f(x)}{y - x} \]

exists and is a scalar for each x ∈ [a, b]. We use similar terminology if f is defined on other types of intervals in R. For example, $x^{3/2}$ is differentiable everywhere on [0, 1] and $x^{1/2}$ is differentiable everywhere on (0, 1], but $x^{1/2}$ is not differentiable everywhere on [0, 1].

The Mean-Value Theorem is one of the most important results of differential calculus. A proof can be found in undergraduate calculus texts, such as [HHW18]. Note that this result only holds for real-valued functions (see Problem 0.13.6).


Theorem 0.13.3 (Mean-Value Theorem). Assume that f : [a, b] → R is continuous, and f is differentiable at each point of (a, b). Then there exists some point c ∈ (a, b) such that

\[ f'(c) = \frac{f(b) - f(a)}{b - a}. \] ♦

The Riemann Integral

For proofs of the statements made here regarding the Riemann integral, we refer to calculus texts such as [HHW18].

Let f : [a, b] → R be a bounded, real-valued function on a finite, closed interval [a, b]. A partition of [a, b] is a choice of finitely many points $x_k$ in [a, b] such that $a = x_0 < x_1 < \cdots < x_n = b$. If we wish to give this partition a name then we write:

Let $\Gamma = \{ a = x_0 < \cdots < x_n = b \}$ be a partition of [a, b].

The mesh size of Γ is $|\Gamma| = \max \{ x_j - x_{j-1} : j = 1, \dots, n \}$.

Given a partition $\Gamma = \{ a = x_0 < \cdots < x_n = b \}$, for each j = 1, . . . , n let $m_j$ and $M_j$ denote the infimum and supremum of f on the interval $[x_{j-1}, x_j]$:

\[ m_j = \inf_{x \in [x_{j-1}, x_j]} f(x) \quad\text{and}\quad M_j = \sup_{x \in [x_{j-1}, x_j]} f(x). \]

The numbers

\[ L_\Gamma = \sum_{j=1}^{n} m_j \, (x_j - x_{j-1}) \quad\text{and}\quad U_\Gamma = \sum_{j=1}^{n} M_j \, (x_j - x_{j-1}) \]

are called lower and upper Riemann sums for f, respectively. We say that f is Riemann integrable on [a, b] if there is a real number I such that

\[ \sup_{\Gamma} L_\Gamma = \inf_{\Gamma} U_\Gamma = I, \]

where the supremum and infimum are taken over all partitions Γ of [a, b]. In this case, the number I is the Riemann integral of f over [a, b].

Here is an equivalent definition of the Riemann integral. Given a partition $\Gamma = \{ a = x_0 < \cdots < x_n = b \}$, choose any points $\xi_j \in [x_{j-1}, x_j]$. We call

\[ R_\Gamma = \sum_{j=1}^{n} f(\xi_j) \, (x_j - x_{j-1}) \]

a Riemann sum for f (note that $R_\Gamma$ implicitly depends on the choice of points $\xi_j$ as well as the partition Γ). Then f is Riemann integrable if and only if there is a real number I such that

\[ I = \lim_{|\Gamma| \to 0} R_\Gamma, \]

where this means that for every ε > 0, there is a δ > 0 such that for any partition Γ with |Γ| < δ and any choice of points $\xi_j \in [x_{j-1}, x_j]$ we have $|I - R_\Gamma| < \varepsilon$. In this case, I is the Riemann integral of f over [a, b].
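As a concrete instance of lower and upper Riemann sums, take $f(x) = x^2$ on [0, 1] with the uniform partition into n pieces. Both the function and the partition are sample choices made for this sketch. Since this f is increasing on [0, 1], its infimum and supremum on each subinterval are attained at the left and right endpoints, so the Python code below can compute both sums exactly with rational arithmetic. The integral of $x^2$ over [0, 1] is 1/3, and the two sums squeeze it with gap exactly 1/n.

```python
from fractions import Fraction


def lower_upper_sums(n: int) -> tuple[Fraction, Fraction]:
    """Lower and upper Riemann sums of f(x) = x**2 on [0, 1], uniform partition with n pieces.

    Since f is increasing on [0, 1], its infimum and supremum on each
    subinterval are attained at the left and right endpoints.
    """
    width = Fraction(1, n)
    lower = sum((Fraction(j - 1, n) ** 2) * width for j in range(1, n + 1))
    upper = sum((Fraction(j, n) ** 2) * width for j in range(1, n + 1))
    return lower, upper


for n in (1, 10, 100):
    lower, upper = lower_upper_sums(n)
    assert lower <= Fraction(1, 3) <= upper
    assert upper - lower == Fraction(1, n)  # the difference of the two sums telescopes
```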

We declare that a complex-valued function f on [a, b] is Riemann integrable if its real and imaginary parts are both Riemann integrable.

Every continuous function f : [a, b] → C is Riemann integrable, as is every piecewise continuous function on [a, b]. There exist functions that are not piecewise continuous but are Riemann integrable. We will characterize the Riemann integrable functions on [a, b] in Section 4.5.5. The characteristic function of the rationals, $\chi_{\mathbb{Q}}$, is an example of a function that is not Riemann integrable on any interval [a, b].

Indefinite Integrals

Suppose that g : [a, b] → C is continuous on [a, b]. In this case g is Riemann integrable on every interval [a, x] with a ≤ x ≤ b. The indefinite integral of g is the function

\[ G(x) = \int_a^x g(t) \, dt, \quad x \in [a, b]. \]

The Fundamental Theorem of Calculus tells us that G is differentiable and its derivative is g.

Theorem 0.13.4 (Fundamental Theorem of Calculus). If g : [a, b] → C is continuous, then its indefinite integral $G(x) = \int_a^x g(t) \, dt$ is differentiable at each point x in [a, b], and G′(x) = g(x) for each x ∈ [a, b]. ♦

Problems

0.13.5. Suppose that f and g are continuous functions whose domain is an interval I. Prove the following statements.

(a) cf is continuous for every real or complex number c.

(b) f + g is continuous on I.

(c) fg is continuous on I.

(d) If g(x) ≠ 0 for every x, then f/g is continuous on I.


0.13.6. This problem will show that the conclusion of the Mean-Value Theorem can fail for complex-valued functions. Set $f(t) = e^{it}$ for t ∈ [0, 2π]. This function is continuous on [0, 2π] and is differentiable at every point of (0, 2π). Prove that there is no point c ∈ (0, 2π) such that

\[ f'(c) = \frac{f(2\pi) - f(0)}{2\pi - 0}. \]

0.14 Vector Spaces

Euclidean Space

We will give the definition of a general vector space below. First, however, we discuss the most familiar vector space, $R^d$, which is the set of all ordered d-tuples of real numbers. The complex analogue of $R^d$ is also very important. This is the space $C^d$, which is the set of all ordered d-tuples of complex numbers. We refer to either $R^d$ or $C^d$ as a Euclidean space.

If x is an element of $R^d$ or $C^d$, then x is a d-tuple of real or complex numbers. We usually write x as

\[ x = (x_1, \dots, x_d), \]

and refer to $x_k$ as the kth component of x. However, on occasion it is more convenient to write the components of x in the form

\[ x = \bigl( x(1), \dots, x(d) \bigr). \]

Here are some important notions that apply to vectors in the Euclidean spaces $R^d$ and $C^d$.

• The sum of $x = (x_1, \dots, x_d)$ and $y = (y_1, \dots, y_d)$ is the vector $x + y = (x_1 + y_1, \dots, x_d + y_d)$.

• The product (or scalar product) of a scalar c with a vector $x = (x_1, \dots, x_d)$ is $cx = (cx_1, \dots, cx_d)$.

• The dot product of two vectors $x = (x_1, \dots, x_d)$ and $y = (y_1, \dots, y_d)$ is the scalar

\[ x \cdot y = x_1 \overline{y_1} + \cdots + x_d \overline{y_d}. \tag{0.7} \]

If x and y are vectors in $R^d$, then the complex conjugate in equation (0.7) is superfluous. That is,

\[ x, y \in R^d \implies x \cdot y = x_1 y_1 + \cdots + x_d y_d. \]

• The Euclidean norm of a vector $x = (x_1, \dots, x_d)$ is

\[ \|x\|_2 = (x \cdot x)^{1/2} = \bigl( |x_1|^2 + \cdots + |x_d|^2 \bigr)^{1/2}. \]

If we are dealing with $C^d$ rather than $R^d$, then it is very important to include the complex conjugate in the definition of the dot product, because $|z|^2 = z\overline{z}$ can be different from $z^2$ when z is complex. Including the complex conjugate, we have that

\[ x \cdot x = x_1 \overline{x_1} + \cdots + x_d \overline{x_d} = |x_1|^2 + \cdots + |x_d|^2. \]
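The role of the conjugate can be seen in a small computation. The sketch below is illustrative only: the function names dot and dot_without_conjugate and the sample vector are our own choices, and the conjugate is placed on the second factor as in equation (0.7).

```python
def dot(x, y):
    # Dot product with the complex conjugate on the second factor, as in (0.7).
    return sum(xk * complex(yk).conjugate() for xk, yk in zip(x, y))

def dot_without_conjugate(x, y):
    # The same sum with the conjugate omitted (correct only for real vectors).
    return sum(xk * yk for xk, yk in zip(x, y))

x = (1 + 1j, 2 - 1j)

# |1+i|^2 + |2-i|^2 = 2 + 5, computed directly from the absolute values.
norm_squared = sum(abs(xk) ** 2 for xk in x)

print(dot(x, x))                    # (7+0j)
print(dot_without_conjugate(x, x))  # (3-2j)
print(norm_squared)
```

With the conjugate, x · x is real and nonnegative and agrees with the sum of the squared moduli (up to rounding); without it, the result is not even a real number for this x.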

Scalars

The definition of a vector space involves two sets and two operations that tell us how to combine elements of those sets. One of the two sets is the vector space itself (whose elements we call “vectors”), but we must also have a second set, called the associated field of scalars or simply the scalar field. There exist many different sets that are fields, but in this volume the only two scalar fields that we will ever consider are the real line R and the complex plane C.

Vector Spaces

A vector space is a set V that is associated with a scalar field (always either R or C in this volume), and two operations that allow us to add vectors together and to multiply a vector by a scalar. Here is the precise definition (where we refer to an element of V as a “vector,” and an element of the scalar field as a “scalar”).

Definition 0.14.1 (Vector Space). A vector space is a set V, together with a scalar field, that satisfies the following conditions.

Closure Axioms

(1) Vector addition: For each pair of vectors x, y ∈ V, there is a unique vector x + y in V, which we call the sum of x and y.

(2) Scalar multiplication: For each vector x ∈ V and each scalar c, there exists a unique vector cx in V, which we call the product of c and x.

Addition Axioms

(3) Commutativity: x + y = y + x for all x, y ∈ V.

(4) Associativity: (x + y) + z = x + (y + z) for all x, y, z ∈ V.

(5) Additive Identity: There exists an element 0 ∈ V that satisfies x + 0 = x for all x ∈ V. We call this element 0 the zero vector of V.

(6) Additive Inverses: For each vector x ∈ V, there exists a vector (−x) ∈ V that satisfies x + (−x) = 0. We call −x the additive inverse of x, and we declare that x − y = x + (−y).

Multiplication Axioms

(7) Associativity: (ab)x = a(bx) for all scalars a, b and all vectors x ∈ V.

(8) Multiplicative Identity: Scalar multiplication by the number 1 satisfies 1x = x for every x ∈ V.

Distributive Axioms

(9) c(x + y) = cx + cy for all vectors x, y ∈ V and all scalars c.

(10) (a + b)x = ax + bx for all vectors x ∈ V and scalars a, b. ♦

Another name for a vector space is linear space. We call the elements of a vector space vectors (regardless of whether they are numbers, sequences, functions, operators, tensors, or other types of objects), and we call the elements of the scalar field scalars. The trivial vector space is V = {0}. If V contains more than just the zero vector, then it is a nontrivial vector space.

If S is a subset of a vector space V and S is itself a vector space (using the same operations of vector addition and scalar multiplication as V), then we call S a subspace of V. A proper subspace of V is a subspace S that satisfies S ≠ V. A nontrivial subspace of V is a subspace S such that S ≠ {0}. Thus, a proper nontrivial subspace is a subspace S that satisfies {0} ⊊ S ⊊ V.

Once we know that a given set V is a vector space, we can easily check whether a subset Y is a vector space by applying the following lemma (the proof is Problem 0.14.7). In the statement of this lemma, we implicitly assume that the vector space operations on Y are the same operations that are used in V.

Lemma 0.14.2. Let Y be a nonempty subset of a vector space V. If:

(a) Y is closed under vector addition, i.e.,

x, y ∈ Y =⇒ x + y ∈ Y,

(b) Y is closed under scalar multiplication, i.e.,

x ∈ Y, c is a scalar =⇒ cx ∈ Y,

then Y is itself a vector space with respect to the operations of vector addition and scalar multiplication that are defined on V. ♦

A subset Y of V that satisfies the conditions of Lemma 0.14.2 is called a subspace of V.


Examples

We will give some examples of vector spaces whose elements are functions.

Example 0.14.3. Let X be a nonempty set, and let F(X) denote the set of all scalar-valued functions whose domain is X. That is, if our field of scalars is R then F(X) is the set of all real-valued functions f : X → R, while if our field of scalars is C then F(X) is the set of complex-valued functions f : X → C. Every real-valued function is complex-valued, so, for example, if X = R then the function f whose rule is f(t) = sin t is a vector in F(R) regardless of whether the scalar field is R or C. Similarly, if $g(t) = e^t$ and $h(t) = t^2$, then g and h are examples of vectors in F(R).

If f and g are two elements of F(X), then f + g is the function defined by

(f + g)(t) = f(t) + g(t), t ∈ X.

If f ∈ F(X) and c is a scalar, then the product cf is the function whose rule is

(cf)(t) = cf(t), t ∈ X.

The set F(X) is a nontrivial vector space with regard to the two operations of addition of functions and multiplication of a function by a scalar (this is Problem 0.14.8).

The zero element of the vector space F(X) is the function that maps every element of X to zero. We denote this function by the symbol 0, which is the same symbol that we use to represent the number zero. That is, the zero function is the function 0 whose rule is 0(t) = 0 for every t ∈ X. It will usually be clear from context whether the symbol 0 is to be interpreted as the zero function or the number zero. ♦
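The pointwise operations of Example 0.14.3 can be mirrored directly in code, treating functions as the "vectors." This is only a sketch for X = R with real scalars; the helper names add, scale, and zero are our own and do not appear in the text.

```python
import math

def add(f, g):
    # Pointwise sum: (f + g)(t) = f(t) + g(t).
    return lambda t: f(t) + g(t)

def scale(c, f):
    # Pointwise scalar multiple: (cf)(t) = c f(t).
    return lambda t: c * f(t)

def zero(t):
    # The zero function, which is the zero vector of F(X).
    return 0

f = math.sin
g = math.exp
h = add(scale(2, f), g)  # the vector 2f + g in F(R)

print(h(0.0))  # 2*sin(0) + exp(0) = 1.0
```

Note that add and scale return new functions rather than numbers, which reflects the fact that f + g and cf are again elements of F(X).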

Here are some examples of subspaces.

Example 0.14.4. Let I be an interval in the real line, and let C(I) be the set of all continuous scalar-valued functions whose domain is I, i.e.,

\[ C(I) = \bigl\{ f \in F(I) : f \text{ is continuous} \bigr\}. \]

The zero function is continuous, so we have 0 ∈ C(I). If f and g are continuous then so are f + g and cf, so f + g ∈ C(I) and cf ∈ C(I). Therefore C(I) is nonempty and is closed under both addition and multiplication by scalars, so Lemma 0.14.2 tells us that C(I) is a subspace of F(I).

Now let P be the set of all polynomial functions on I. That is, P consists of all functions p that have the form

\[ p(t) = \sum_{k=0}^N c_k t^k = c_0 + c_1 t + \cdots + c_N t^N, \qquad t \in I, \]

where N ≥ 0 and $c_0, c_1, \dots, c_N$ are scalars. Since P is a nonempty subset of C(I) and P is closed under both addition and multiplication by scalars, we conclude that P is a subspace of C(I), and therefore is a subspace of F(I) as well. In fact, we have the inclusions {0} ⊊ P ⊊ C(I) ⊊ F(I). ♦

To avoid multiplicities of brackets and parentheses, if I = (a, b) then we usually write C(a, b) instead of C((a, b)), if I = [a, b) then we usually write C[a, b) instead of C([a, b)), and so forth.

If we restrict our attention to open intervals, then we can create further subspaces consisting of differentiable functions.

Example 0.14.5. Let I be an open interval in the real line, and let $C^1(I)$ be the set of all scalar-valued functions f that are differentiable on I and whose derivative f′ is continuous on I:

\[ C^1(I) = \bigl\{ f \in C(I) : f \text{ is differentiable and } f' \text{ is continuous on } I \bigr\}. \]

If the scalar field is R then functions in $C^1(I)$ are real-valued, while if the scalar field is C then $C^1(I)$ includes both real-valued and complex-valued functions. Every differentiable function is continuous, so $C^1(I)$ is a subset of C(I), but the examples below will show that it is a proper subset, i.e., $C^1(I) \subsetneq C(I)$. ♦

To illustrate, take I = R and let f, g, h, k be the functions defined by

\[ f(t) = |t|, \qquad g(t) = t^2, \qquad h(t) = e^{-|t|}, \qquad k(t) = e^{-t^2}. \tag{0.8} \]

Each of these functions is real-valued, and hence is also complex-valued since every real number is a complex number. These functions have the following properties.

• f is continuous but not differentiable on R, so f belongs to C(R) but does not belong to $C^1(R)$.

• g is differentiable and g′(t) = 2t is continuous on R (in fact, g′ is differentiable on R), so $g \in C^1(R)$.

• h is continuous but not differentiable on R, so $h \in C(R) \setminus C^1(R)$.

• k is differentiable and $k'(t) = -2te^{-t^2}$ is continuous on R (in fact, k′ is differentiable on R), so $k \in C^1(R)$.
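The four properties listed above can be explored numerically by comparing left and right difference quotients at t = 0. The sketch below is only suggestive (a finite step size cannot prove differentiability), and the helper name one_sided_quotients and the step size are our own assumptions.

```python
import math

def one_sided_quotients(func, t0=0.0, step=1e-6):
    # Left and right difference quotients of func at t0.
    right = (func(t0 + step) - func(t0)) / step
    left = (func(t0) - func(t0 - step)) / step
    return left, right

f = abs
g = lambda t: t ** 2
h = lambda t: math.exp(-abs(t))
k = lambda t: math.exp(-t ** 2)

for name, func in [("f", f), ("g", g), ("h", h), ("k", k)]:
    left, right = one_sided_quotients(func)
    print(name, round(left, 3), round(right, 3))
```

For f and h the two quotients stay near −1 and +1 (in opposite orders), reflecting the corner at the origin, while for g and k both quotients are close to 0.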

The reader should use Lemma 0.14.2 to prove that $C^1(I)$ is a subspace of C(I). Since $C^1(I)$ is contained in but not equal to C(I), $C^1(I)$ is a proper subspace of C(I), which is itself a proper subspace of F(I).

The two functions g and k defined in equation (0.8) are actually infinitely differentiable on R, i.e., g′, g′′, g′′′, . . . and k′, k′′, k′′′, . . . all exist and are differentiable at every point. Are there any functions that are differentiable on R but are not infinitely differentiable?


Example 0.14.6. Let I = R and define

\[ w(t) = \begin{cases} t^2 \sin\frac{1}{t}, & t \ne 0, \\ 0, & t = 0. \end{cases} \tag{0.9} \]

The reader should check (this is Problem 0.14.10) that:

• w is continuous on R, so w ∈ C(R),

• w is differentiable at every point of R, i.e., w′(t) exists and is a scalar for each t ∈ R, but

• w′ is not continuous on R (it fails to be continuous at t = 0).

Therefore, although w is differentiable on R, its derivative is not continuous. We say that w is once differentiable because w′(t) exists for every t. However, w is not twice differentiable because w′′(t) does not exist for every t. Because w is continuous it is an element of C(R), but $w \notin C^1(R)$ because although w′ exists, it is not continuous. ♦
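The behavior of w′ near the origin can be observed numerically. The sketch below assumes the formula w′(t) = 2t sin(1/t) − cos(1/t) for t ≠ 0 (from the product and chain rules) and w′(0) = 0 (from the definition of the derivative); verifying both is the content of Problem 0.14.10. The sample points are our own choice.

```python
import math

def w_prime(t):
    # Assumed derivative of w(t) = t^2 sin(1/t), with w'(0) = 0.
    if t == 0:
        return 0.0
    return 2 * t * math.sin(1 / t) - math.cos(1 / t)

for n in (10, 100, 1000):
    t_even = 1 / (2 * n * math.pi)       # here cos(1/t) is 1 up to rounding
    t_odd = 1 / ((2 * n + 1) * math.pi)  # here cos(1/t) is -1 up to rounding
    print(n, w_prime(t_even), w_prime(t_odd))
```

Along the first sequence of points the values stay near −1 and along the second near +1, even though both sequences tend to 0, so w′(t) cannot converge to w′(0) = 0 as t → 0.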

So far we have considered domains I that are open intervals. If I is an interval in R that is not open, then we define $C^1(I)$ by considering one-sided differentiability at any endpoint of I, just as we did in the discussion in Section 0.13. So, for example, $C^1[0, 1]$ consists of all functions f that are differentiable everywhere on [0, 1] and whose derivative f′ is continuous on [0, 1].

Thus $C^1(I)$ can be defined for any type of interval I. We can keep going and define $C^2(I)$ to be the space of all functions f such that both f and f′ exist and are differentiable on I and f′′ exists and is continuous on I. This is a proper subspace of $C^1(I)$. Then we continue further and define $C^3(I)$ and so forth, obtaining a nested decreasing sequence of spaces. The space $C^\infty(I)$ that consists of all infinitely differentiable functions is itself a proper subspace of each of these. Moreover, the set of all polynomial functions,

\[ P = \Bigl\{ \sum_{k=0}^N c_k t^k : N \ge 0, \ c_k \text{ are scalars} \Bigr\}, \]

is a proper subspace of $C^\infty(I)$. Thus we have the infinitely many distinct vector spaces

\[ F(I) \supsetneq C(I) \supsetneq C^1(I) \supsetneq C^2(I) \supsetneq \cdots \supsetneq C^\infty(I) \supsetneq P. \]

Problems

0.14.7. Prove Lemma 0.14.2.


0.14.8. Let X be any set, and let F(X) be the set of all scalar-valued functions on the domain X. That is, if our scalar field is R then F(X) consists of all functions f : X → R, while if the scalar field is C then F(X) is the set of all functions f : X → C.

(a) Prove that F(X) is a vector space.

(b) For this part we take X = {1, . . . , d}. Assuming the scalar field is R, explain why the Euclidean vector space $R^d$ and the vector space F({1, . . . , d}) are really the “same space,” in the sense that each vector in $R^d$ naturally corresponds to a function in F({1, . . . , d}), with the operations in $R^d$ being the “same” as the operations in F({1, . . . , d}). Then, assuming the scalar field is C, formulate an analogous statement for $C^d$.

(c) Now let X = N = {1, 2, 3, . . .}, and let S be the set of all infinite sequences $x = (x_1, x_2, \dots)$. In what sense are F(N) and S the “same” vector space?

0.14.9. Let I be an interval in the real line. A scalar-valued function f on I is bounded if $\sup_{t \in I} |f(t)| < \infty$.

(a) Let $F_b(I)$ be the set of all bounded functions on I. Prove that $F_b(I)$ is a proper subspace of F(I).

(b) Let $C_b(I)$ be the set of all bounded continuous functions on I. Prove that $C_b(I)$ is a subspace of C(I). Show that if I is any type of interval other than a bounded closed interval [a, b], then $C_b(I) \ne C(I)$.

Remark: If I = [a, b] is a bounded closed interval, then every continuous function on I is bounded. Therefore $C_b[a, b] = C[a, b]$.

0.14.10. Let w be the function defined in equation (0.9).

(a) Use the product rule to prove that w is differentiable at every point t ≠ 0, and use the definition of the derivative to prove that w is differentiable at t = 0.

(b) Show that even though w′(t) exists for every t, the derivative w′ is not continuous at the origin.

0.15 Span and Independence

Spans

A finite linear combination (or simply a linear combination, for short) of vectors $x_1, \dots, x_N$ in a vector space V is any vector that has the form

\[ x = \sum_{k=1}^N c_k x_k = c_1 x_1 + \cdots + c_N x_N, \]

where $c_1, \dots, c_N$ are scalars. We collect all of the linear combinations together to form the following set.

Definition 0.15.1 (Span). If A is a nonempty subset of a vector space V, then the finite linear span of A, denoted by span(A), is the set of all finite linear combinations of elements of A:

\[ \operatorname{span}(A) = \Bigl\{ \sum_{n=1}^N c_n x_n : N > 0, \ x_n \in A, \ c_n \text{ are scalars} \Bigr\}. \tag{0.10} \]

We say that A spans V if span(A) = V.

We declare that the span of the empty set is span(∅) = {0}. ♦

We also refer to span(A) as the finite span, the linear span, or simply the span of A.

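If $A = \{x_1, \dots, x_n\}$ is a finite set, then we usually write $\operatorname{span}\{x_1, \dots, x_n\}$ instead of $\operatorname{span}\bigl(\{x_1, \dots, x_n\}\bigr)$. In this case, equation (0.10) simplifies to

\[ \operatorname{span}\{x_1, \dots, x_n\} = \bigl\{ c_1 x_1 + \cdots + c_n x_n : c_1, \dots, c_n \text{ are scalars} \bigr\}. \]

Similarly, if $A = \{x_n\}_{n \in N}$ is a sequence, then we usually write $\operatorname{span}\{x_n\}_{n \in N}$ instead of $\operatorname{span}\bigl(\{x_n\}_{n \in N}\bigr)$, and in this case equation (0.10) simplifies to

\[ \operatorname{span}\{x_n\}_{n \in N} = \Bigl\{ \sum_{n=1}^N c_n x_n : N > 0, \ c_n \text{ are scalars} \Bigr\}. \]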

Example 0.15.2. In this example we implicitly assume that the domain of our functions is some fixed interval I in the real line. For each integer k ≥ 0 let $p_k$ be the function defined by the rule

\[ p_k(t) = t^k, \qquad t \in I. \]

A finite linear combination of the $p_k$ is any function of the form

\[ p = \sum_{k=0}^n c_k p_k = c_0 p_0 + c_1 p_1 + \cdots + c_n p_n, \]

where n ≥ 0 and $c_0, c_1, \dots, c_n$ are scalars. Such a function p is given by the rule

\[ p(t) = \sum_{k=0}^n c_k p_k(t) = \sum_{k=0}^n c_k t^k, \qquad t \in I. \]

That is, p is a polynomial. Earlier we declared that P denotes the set of all polynomials, so if we let $M = \{p_k\}_{k \ge 0}$ then we have shown that

\[ \operatorname{span}(M) = P. \]

That is, the subset M spans P. ♦
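Example 0.15.2 can be mirrored in code by building a polynomial literally as a finite linear combination of the monomial functions $p_k$. The helper names monomial and linear_combination are our own, and the coefficient list is an arbitrary illustration.

```python
def monomial(k):
    # The function p_k with rule p_k(t) = t^k.
    return lambda t: t ** k

def linear_combination(coeffs):
    # Returns the function p = c_0 p_0 + c_1 p_1 + ... + c_n p_n.
    return lambda t: sum(c * monomial(k)(t) for k, c in enumerate(coeffs))

p = linear_combination([1, 0, 3])  # p(t) = 1 + 3 t^2
print(p(2))  # 1 + 3*4 = 13
```

Here monomial(k) is a function (a vector in P), while monomial(k)(t) is a number, which is the distinction between $p_k$ and $p_k(t)$.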

We were excessively careful in Example 0.15.2 to distinguish between the function, $p_k$, and its rule, $p_k(t) = t^k$ for t ∈ I. Technically, $p_k$ denotes a function while $p_k(t)$ denotes the evaluation of $p_k$ at the point t and hence is a number. Therefore it is not literally correct to write “the function $p_k(t)$” or to say that “$t^k$ is a vector in P.” However, since the meaning is clear, we typically abuse notation and simply write “the function $t^k$” or “the vector $t^k$” instead of “the function $p_k$ given by $p_k(t) = t^k$ for t ∈ I.” We will abuse notation in this way many times below, beginning with the following remark.

Remark 0.15.3. A monomial is a polynomial that has only one nonzero term, i.e., it is a polynomial of the form $ct^k$ where c ≠ 0 and k ≥ 0. Thus the collection M that is discussed in Example 0.15.2 is the set of all monomials $t^k$ with k ≥ 0. We often summarize Example 0.15.2 by saying that the monomials $t^k$ span P. ♦

Linear Independence

By definition, if x ∈ span(A), then x is some finite linear combination of elements of A. In general there could be many different linear combinations that equal the vector x. Often we wish to ensure that x is a unique linear combination of elements of A. This issue is related to the following notion.

Definition 0.15.4 (Linear Independence). A nonempty subset A of a vector space V is finitely linearly independent if given any integer N > 0, any choice of finitely many distinct vectors $x_1, \dots, x_N \in A$, and any scalars $c_1, \dots, c_N$, we have

\[ \sum_{n=1}^N c_n x_n = 0 \iff c_1 = \cdots = c_N = 0. \]

We declare that the empty set ∅ is a linearly independent set. ♦

Instead of saying that a set A is finitely linearly independent, we sometimes abbreviate this to A is linearly independent, or even just A is independent.

Often the set A is a finite set or a countably infinite sequence. Rewriting the definition for these cases, we see that a finite set $\{x_1, \dots, x_n\}$ is independent if and only if $\sum_{k=1}^n c_k x_k = 0$ only for $c_1 = \cdots = c_n = 0$. A countable sequence $\{x_n\}_{n \in N}$ is independent if and only if for every integer N ∈ N we have $\sum_{n=1}^N c_n x_n = 0$ only when $c_1 = \cdots = c_N = 0$.
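For a finite set of vectors with rational entries, independence can be tested mechanically. The sketch below relies on a standard linear algebra fact that is not proved in this chapter: $x_1, \dots, x_n$ are independent exactly when the matrix with these vectors as rows has rank n. The function names rank and is_independent are our own, and exact arithmetic with Fraction is used to avoid rounding issues.

```python
from fractions import Fraction

def rank(vectors):
    # Row-reduce a list of vectors with rational entries (Gauss-Jordan elimination).
    rows = [[Fraction(entry) for entry in v] for v in vectors]
    rank_count = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        # Look for a pivot in this column among the rows not yet used.
        pivot = next((r for r in range(rank_count, len(rows))
                      if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank_count], rows[pivot] = rows[pivot], rows[rank_count]
        for r in range(len(rows)):
            if r != rank_count and rows[r][col] != 0:
                factor = rows[r][col] / rows[rank_count][col]
                rows[r] = [a - factor * b
                           for a, b in zip(rows[r], rows[rank_count])]
        rank_count += 1
    return rank_count

def is_independent(vectors):
    # Assumed criterion: independent exactly when the rank equals the count.
    return rank(vectors) == len(vectors)

print(is_independent([(1, 0, 0), (1, 1, 0), (1, 1, 1)]))  # True
print(is_independent([(1, 2, 3), (2, 4, 6)]))             # False
```

In the second case 2(1, 2, 3) − (2, 4, 6) = 0 is a linear combination with nonzero coefficients that equals the zero vector, so the set is dependent.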

Example 0.15.5. Let I be an interval in the real line. As in Example 0.15.2, let P be the set of all polynomials on I, and let $M = \{t^k\}_{k=0}^\infty$. We will show that M is a finitely linearly independent subset of P. To do this, choose any integer N ≥ 0, and suppose that $c_0, c_1, \dots, c_N$ are scalars such that

\[ \sum_{k=0}^N c_k t^k = 0. \tag{0.11} \]

Note that the vector on the left-hand side of equation (0.11) is a function, the polynomial $p(t) = \sum_{k=0}^N c_k t^k$. The vector 0 on the right-hand side of equation (0.11) is the zero element of the vector space P, which is the zero function. Hence what we are assuming in equation (0.11) is that the polynomial p equals the zero function, i.e., p(t) = 0 for every t ∈ I (not just for some t).

We wish to show that each scalar $c_k$ in equation (0.11) must be zero. There are many ways to do this. We ask for a direct proof in Problem 0.15.12 and spell out an indirect proof here. The Fundamental Theorem of Algebra tells us that a polynomial of degree N can have at most N roots, i.e., there can be at most N values of t for which p(t) = 0. If the scalar $c_N$ is nonzero, then $p(t) = \sum_{k=0}^N c_k t^k$ has degree N, and therefore can have at most N roots. But we know that p(t) = 0 for every t ∈ I, so p has infinitely many roots. This is a contradiction, so we must have $c_N = 0$.

Since $c_N = 0$, if $c_{N-1}$ is nonzero then p has degree N − 1, which again leads to a contradiction since p(t) = 0 for every t. Therefore we must have $c_{N-1} = 0$.

Continuing in this way, we see that $c_{N-2} = 0$, and so forth, down to $c_1 = 0$. So we are left with $p(t) = c_0$, i.e., p is a constant polynomial. The only way a constant polynomial can equal the zero function is if $c_0 = 0$. Therefore every $c_k$ is zero, so we have shown that M is independent. ♦

Hamel Bases

A set that is both linearly independent and spans a vector space V is usually called a basis for V, but to avoid confusion with other types of bases that arise in analysis (such as the Schauder bases discussed in Problem 7.4.11), we will use the following terminology for a set that both spans and is independent.

Definition 0.15.6 (Hamel Basis). Let V be a nontrivial vector space. A set of vectors B is a Hamel basis, vector space basis, or simply a basis for V if B is linearly independent and span(B) = V. ♦

For example, since Example 0.15.2 shows that the set of monomials $M = \{t^k\}_{k=0}^\infty$ spans the set of polynomials P, and since Example 0.15.5 shows that M is independent, it follows that M is a Hamel basis for P.

It can be shown that any two Hamel bases for a given vector space V have the same cardinality. Thus, if V has a Hamel basis that consists of finitely many vectors, say $B = \{x_1, \dots, x_d\}$, then any other Hamel basis for V must also contain exactly d vectors. We call this number d the dimension of V and set

dim(V) = d.

On the other hand, if V has a Hamel basis that consists of infinitely many vectors, then any other Hamel basis must also be infinite. In this case we say that V is infinite-dimensional and we set

dim(V ) = ∞.

Remark 0.15.7. Instead of simply setting dim(V) = ∞ for all infinite-dimensional vector spaces, we could let dim(V) denote the actual cardinality of a basis for V. This would allow us to distinguish between vector spaces whose dimension is countably infinite and those whose dimension is uncountable, or even to distinguish further among the uncountable-dimensional spaces according to cardinality. However, that will not be necessary in this text. A more important way to distinguish between different types of “large” spaces will be the notions of separability and nonseparability introduced in Definition 1.1.5 and studied in detail in Chapters 7 and 8. ♦

For finite-dimensional vector spaces, we have the following characterization of Hamel bases. The proof is assigned as Problem 0.15.14, and a similar characterization for general vector spaces is given in Problem 0.15.15.

Theorem 0.15.8. A set of vectors $B = \{x_1, \dots, x_d\}$ is a Hamel basis for a nontrivial finite-dimensional vector space V if and only if each vector x ∈ V can be written as

\[ x = \sum_{n=1}^d c_n(x)\, x_n \]

for a unique choice of scalars $c_1(x), \dots, c_d(x)$. ♦

The trivial vector space {0} is a bit of an anomaly in this discussion, since it does not contain any nonempty linearly independent subsets and therefore does not contain a nonempty Hamel basis. To handle this case we declare that the empty set ∅ is a Hamel basis for the trivial vector space {0}, and we set

dim({0}) = 0.
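To illustrate the unique-coefficient characterization in Theorem 0.15.8, the following sketch computes the coordinates of a vector in $R^2$ with respect to a two-element basis. The function name coordinates, the use of Cramer's rule, and the sample basis {(1, 1), (1, −1)} are our own illustrative choices; the sketch handles only d = 2 and rational entries.

```python
from fractions import Fraction

def coordinates(x, b1, b2):
    # Solve c1*b1 + c2*b2 = x in R^2 by Cramer's rule, in exact arithmetic.
    det = Fraction(b1[0] * b2[1] - b2[0] * b1[1])
    if det == 0:
        raise ValueError("b1 and b2 are not linearly independent")
    c1 = Fraction(x[0] * b2[1] - b2[0] * x[1]) / det
    c2 = Fraction(b1[0] * x[1] - x[0] * b1[1]) / det
    return c1, c2

c1, c2 = coordinates((3, 1), (1, 1), (1, -1))
print(c1, c2)  # 2 1
```

Indeed 2(1, 1) + 1(1, −1) = (3, 1), and the nonzero determinant is what guarantees that no other pair of scalars works.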

Problems

0.15.9. Let $e_k = (0, \dots, 0, 1, 0, \dots, 0)$ be the vector that has a 1 in the kth component and zeros elsewhere. Prove that $\{e_1, \dots, e_d\}$ is a Hamel basis for $R^d$ (if the scalar field is R) or $C^d$ (if the scalar field is C). This is called the standard basis for $R^d$ or $C^d$.


0.15.10. Let V be a vector space.

(a) Show that if A ⊆ B ⊆ V, then span(A) ⊆ span(B).

(b) Show by example that span(A) = span(B) is possible even if A ≠ B. Can you find an example where A and B are both infinite?

0.15.11. Given a linearly independent set A in a vector space V, prove the following statements.

(a) If B ⊆ A, then B is linearly independent.

(b) If x ∈ V but x /∈ span(A), then A ∪ {x} is linearly independent.

(c) If there is no vector x ∈ V such that A ∪ {x} is linearly independent, then A is a basis for V.

0.15.12. Let I be an interval in the real line. Without invoking the Fundamental Theorem of Algebra, give a direct proof that the set of monomials $\{t^k\}_{k=0}^\infty$ is a linearly independent set of functions in C(I).

0.15.13. For each k ∈ N, define $e_k(t) = e^{kt}$ for t ∈ R.

(a) Prove that $\{e_k\}_{k \in N}$ is a linearly independent set in C(R).

(b) Let 1 be the constant function, i.e., the function that takes the value 1 at every t. Prove that 1 does not belong to $\operatorname{span}\{e_k\}_{k \in N}$.

(c) Find a function f that does not belong to $\operatorname{span}\{e_k\}_{k \ge 0}$, where $e_0 = 1$.

0.15.14. Prove Theorem 0.15.8.

0.15.15. Let $B = \{x_j\}_{j \in J}$ be a subset of a vector space V. Prove that B is a Hamel basis for V if and only if every nonzero vector x ∈ V can be written as

\[ x = \sum_{k=1}^N c_k(x)\, x_{j_k} \]

for a unique

• integer N ∈ N,

• indices $j_1, \dots, j_N \in J$, and

• nonzero scalars $c_1(x), \dots, c_N(x)$.

0.15.16. For each r > 0, let $g_r$ be the function $g_r(x) = e^{rx}$. Let A be the set of all such functions, i.e., $A = \{g_r : r > 0\}$.

(a) Prove that A is an uncountable subset of C(R) that is linearly independent.

(b) Let 1 be the constant function, i.e., 1(x) = 1 for all x. Prove that 1 does not belong to span(A), and therefore A is not a basis for C(R).
