
REAL ANALYSIS

FALL 2001

Gabriel Nagy

Kansas State University

© Gabriel Nagy



Chapter I

Topology Preliminaries



Lecture 1

1. Review of basic topology concepts

In this lecture we review some basic notions from topology, the main goal being to set up the language. Except for one result (Urysohn's Lemma), there will be no proofs.

Definitions. A topology on a (non-empty) set $X$ is a family $\mathcal{T}$ of subsets of $X$, which are called open sets, with the following properties:

(top1): both the empty set $\emptyset$ and the total set $X$ are open;
(top2): an arbitrary union of open sets is open;
(top3): a finite intersection of open sets is open.

In this case the system $(X, \mathcal{T})$ is called a topological space.

If $(X, \mathcal{T})$ is a topological space and $x \in X$, a subset $N \subset X$ is called a neighborhood of $x$ if there exists some open set $D$ such that $x \in D \subset N$.

A collection $\mathcal{N}$ of neighborhoods of $x$ is called a basic system of neighborhoods of $x$ if for any neighborhood $M$ of $x$ there exists some neighborhood $N$ in $\mathcal{N}$ such that $x \in N \subset M$.

A collection $\mathcal{V}$ of neighborhoods of $x$ is called a fundamental system of neighborhoods of $x$ if for any neighborhood $M$ of $x$ there exists a finite sequence $V_1, V_2, \dots, V_n$ of neighborhoods in $\mathcal{V}$ such that $x \in V_1 \cap V_2 \cap \dots \cap V_n \subset M$.

A topology is said to have the Hausdorff property if:

(h) for any $x, y \in X$ with $x \neq y$, there exist open sets $U \ni x$ and $V \ni y$ such that $U \cap V = \emptyset$.
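On a finite set the axioms (top1)-(top3) and property (h) can be checked by brute force. The following Python sketch is only an illustration and not part of the original notes; the helper names `is_topology` and `is_hausdorff` and the sample families are assumptions made for the example.

```python
from itertools import combinations

def is_topology(X, T):
    """Check (top1)-(top3) for a family T of subsets of a finite set X."""
    T = {frozenset(D) for D in T}
    if frozenset() not in T or frozenset(X) not in T:
        return False
    # On a finite set, closure under pairwise unions and intersections
    # already gives closure under arbitrary unions and finite intersections.
    return all(D | E in T and D & E in T for D, E in combinations(T, 2))

def is_hausdorff(X, T):
    """Check (h): any two distinct points have disjoint open neighborhoods."""
    T = [frozenset(D) for D in T]
    return all(
        any(x in U and y in V and not (U & V) for U in T for V in T)
        for x, y in combinations(X, 2)
    )

X = {0, 1, 2}
chain = [set(), {0}, {0, 1}, X]                      # a non-Hausdorff topology
discrete = [set(s) for r in range(4) for s in combinations(X, r)]
not_a_topology = [set(), {0}, {1}, X]                # the union {0, 1} is missing
```

In this example `chain` is a topology in which every open set containing 1 also contains 0, so (h) fails, while the discrete topology satisfies both checks.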

If $(X, \mathcal{T})$ is a topological space, a subset $F \subset X$ will be called closed if its complement $X \setminus F$ is open. The following properties are easily derived from the definition:

(c1) both the empty set $\emptyset$ and the total set $X$ are closed;
(c2) an arbitrary intersection of closed sets is closed;
(c3) a finite union of closed sets is closed.

Using the above properties of open/closed sets, one can perform the following constructions. Let $(X, \mathcal{T})$ be a topological space and $A \subset X$ be an arbitrary subset. Consider the set $\operatorname{Int}(A)$ to be the union of all open sets $D$ with $D \subset A$, and consider the set $\overline{A}$ to be the intersection of all closed sets $F$ with $F \supset A$. The set $\operatorname{Int}(A)$ (sometimes denoted simply by $\mathring{A}$) is called the interior of $A$, while the set $\overline{A}$ is called the closure of $A$. The properties of these constructions are summarized in the following:

Proposition 1.1. Let $(X, \mathcal{T})$ be a topological space, and let $A$ be an arbitrary subset of $X$.

A. (Properties of the interior)

(i) The set $\operatorname{Int}(A)$ is open and $\operatorname{Int}(A) \subset A$.
(ii) If $D$ is an open set such that $D \subset A$, then $D \subset \operatorname{Int}(A)$.
(iii) A point $x$ belongs to $\operatorname{Int}(A)$ if and only if $A$ is a neighborhood of $x$.
(iv) $A$ is open if and only if $A = \operatorname{Int}(A)$.

B. (Properties of the closure)

(i) The set $\overline{A}$ is closed and $\overline{A} \supset A$.
(ii) If $F$ is a closed set with $F \supset A$, then $F \supset \overline{A}$.
(iii) A point $x$ belongs to $\overline{A}$ if and only if $A \cap N \neq \emptyset$ for any neighborhood $N$ of $x$.
(iv) $A$ is closed if and only if $A = \overline{A}$.

C. (Relationship between interior and closure) $\operatorname{Int}(X \setminus A) = X \setminus \overline{A}$ and $\overline{X \setminus A} = X \setminus \operatorname{Int}(A)$.
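For a finite topological space the interior and the closure can be computed directly from their definitions, which gives a quick sanity check of part C. The sketch below is an illustration only; the function names and the sample topology are assumptions, not taken from the notes.

```python
def interior(A, T):
    """Union of all open sets D with D contained in A."""
    A = frozenset(A)
    return frozenset().union(*(D for D in T if D <= A))

def closure(A, X, T):
    """Intersection of all closed sets F (complements of open sets) containing A."""
    A, X = frozenset(A), frozenset(X)
    closed = [X - D for D in T]
    return X.intersection(*(F for F in closed if A <= F))

X = frozenset({0, 1, 2})
T = [frozenset(), frozenset({0}), frozenset({0, 1}), X]   # a topology on X
A = frozenset({1, 2})
```

Here `interior(A, T)` is empty and `closure(A, X, T)` equals `A`, and both identities of part C hold for this `A`.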

Definition. Suppose $(X, \mathcal{T})$ is a topological space. Assume $A \subset X$ is a subset of $X$. On $A$ we can introduce a natural topology, sometimes denoted by $\mathcal{T}|_A$, which consists of all subsets of $A$ of the form $A \cap U$ with $U$ an open set in $X$. This topology is called the relative (or induced) topology.

Remark 1.1. If $A$ is already open in the topology $\mathcal{T}$, then a subset $V \subset A$ is open in the induced topology if and only if $V$ is open in the topology $\mathcal{T}$ (this follows from the fact that the intersection of any two open sets in $\mathcal{T}$ is again an open set in $\mathcal{T}$).

Definition. Suppose $(X, \mathcal{T})$ and $(Y, \mathcal{S})$ are topological spaces and $x$ is an element in $X$. A map $f : X \to Y$ is said to be continuous at $x$ if for any neighborhood $N$ of $f(x)$ in the topology $\mathcal{S}$ (on $Y$), the set

$$f^{-1}(N) = \{ x \in X \mid f(x) \in N \}$$

is a neighborhood of $x$ in the topology $\mathcal{T}$ (on $X$).

If $f$ is continuous at every point in $X$, then $f$ is said to be continuous.

Continuity is “well behaved” with respect to compositions:

Proposition 1.2. Suppose $(X, \mathcal{T})$, $(Y, \mathcal{S})$, and $(Z, \mathcal{Z})$ are topological spaces, and $X \xrightarrow{f} Y \xrightarrow{g} Z$ are two functions.

(i) If $f$ is continuous at a point $x \in X$, and if $g$ is continuous at $f(x)$, then $g \circ f$ is continuous at $x$.
(ii) If $f$ and $g$ are (globally) continuous, then so is $g \circ f$.

The identity map on a topological space is always continuous.

In terms of open/closed sets, the characterization of continuity is given by the following.

Proposition 1.3. If $(X, \mathcal{T})$, $(Y, \mathcal{S})$ are topological spaces and $f : (X, \mathcal{T}) \to (Y, \mathcal{S})$ is a map, then the following are equivalent:

(i) $f$ is continuous.
(ii) Whenever $U \subset Y$ is an open set, it follows that $f^{-1}(U)$ is also an open set (in $X$).
(iii) Whenever $F \subset Y$ is a closed set, it follows that $f^{-1}(F)$ is also a closed set (in $X$).
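Condition (ii) of Proposition 1.3 is easy to test mechanically when the spaces are finite. The following sketch is illustrative only: maps are represented as dictionaries, and the names `preimage` and `is_continuous` are hypothetical helpers.

```python
def preimage(f, X, V):
    """f^{-1}(V) = {x in X : f(x) in V} for a map f given as a dict."""
    return frozenset(x for x in X if f[x] in V)

def is_continuous(f, X, T, S):
    """Condition (ii) of Proposition 1.3: preimages of open sets are open."""
    T = {frozenset(D) for D in T}
    return all(preimage(f, X, U) in T for U in S)

X = {0, 1, 2}
T = [set(), {0}, {0, 1}, X]          # a topology on X
Y = {"a", "b"}
S = [set(), {"a"}, Y]                # a topology on Y
f = {0: "a", 1: "a", 2: "b"}         # preimage of {"a"} is {0, 1}, which is open
g = {0: "b", 1: "a", 2: "a"}         # preimage of {"a"} is {1, 2}, which is not open
```

With these choices `f` passes the check and `g` fails it.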

We conclude this section with a useful technical result.



Theorem 1.1 (Urysohn's Lemma). Let $(X, \mathcal{T})$ be a topological Hausdorff space with the following property:

(n) For any two disjoint closed sets $A, B \subset X$, there exist two disjoint open sets $U, V \subset X$ such that $U \supset A$ and $V \supset B$.

Then for any two disjoint closed sets $A, B \subset X$, there exists a continuous function $f : X \to [0, 1]$ such that $f|_A = 0$ and $f|_B = 1$.

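Proof. We begin with a refinement of property (n):

(n′) For any disjoint closed sets $A, B \subset X$, there exist two open sets $U, W \subset X$ such that $A \subset U$, $\overline{U} \subset W$, and $\overline{W} \cap B = \emptyset$.

To prove (n′), we first apply (n) to find two disjoint open sets $W, Z \subset X$ such that

$$W \supset A \quad\text{and}\quad Z \supset B. \tag{1}$$

Next we apply (n) again, this time to the pair of disjoint closed sets $A$ and $X \setminus W$, and find two disjoint open sets $U, V \subset X$ such that

$$U \supset A \quad\text{and}\quad V \supset X \setminus W. \tag{2}$$

On the one hand, using the fact that $U \cap V = \emptyset$ and the fact that $V$ is open, we get the inclusion $\overline{U} \subset X \setminus V$. Using (2) this gives

$$\overline{U} \subset X \setminus V \subset W.$$

On the other hand, using the fact that $W \cap Z = \emptyset$ and the fact that $Z$ is open, we get $\overline{W} \subset X \setminus Z$. But using (1) this will give

$$\overline{W} \subset X \setminus Z \subset X \setminus B,$$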

and we are done.

To prove the Theorem, start with two disjoint closed sets $A, B \subset X$. For every integer $n \geq 0$ we define the set $D_n = \{ k/2^n : k \in \mathbb{Z},\ 0 \leq k \leq 2^n \}$, and we consider

$$D = \bigcup_{n=0}^{\infty} D_n.$$

(Notice that $D_n \subset D_{n+1}$, for all $n \geq 0$.)

We are going to construct a family $(V_t)_{t \in D}$ of open sets in $X$ with the following properties:

(i) $V_0 \supset A$ and $\overline{V_1} \cap B = \emptyset$;
(ii) $\overline{V_t} \subset V_s$, for all $t, s \in D$ with $t < s$.

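Let us start by constructing $V_0$ and $V_1$. We use property (n′) to find open sets $U, W \subset X$, with

$$A \subset U \subset \overline{U} \subset W \quad\text{and}\quad \overline{W} \cap B = \emptyset,$$

and we simply take $V_0 = U$ and $V_1 = W$.

The construction of the family $(V_t)_{t \in D}$ is carried on recursively. Assume, for some integer $n \geq 0$, we have constructed the sets $(V_t)_{t \in D_n}$ with properties (i) and (ii) (satisfied for $t, s \in D_n$), and let us construct the next block of sets $(V_t)_{t \in D_{n+1} \setminus D_n}$. We start off by observing that for every $t \in D_{n+1} \setminus D_n$, the numbers

$$t^{\pm} = t \pm \frac{1}{2^{n+1}}$$

belong to $D_n$. Apply (n′) to the pair of disjoint closed sets $\overline{V_{t^-}}$ and $X \setminus V_{t^+}$ to find two open sets $U, W \subset X$ such that

$$\overline{V_{t^-}} \subset U \subset \overline{U} \subset W \quad\text{and}\quad \overline{W} \cap (X \setminus V_{t^+}) = \emptyset.$$

Notice that the equality $\overline{W} \cap (X \setminus V_{t^+}) = \emptyset$, coupled with the inclusion $\overline{U} \subset W$, gives $\overline{U} \cap (X \setminus V_{t^+}) = \emptyset$, so we get $\overline{U} \subset V_{t^+}$. We can then define $V_t = U$, and we will obviously have the inclusions

$$\overline{V_{t^-}} \subset V_t \subset \overline{V_t} \subset V_{t^+}. \tag{3}$$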
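The index bookkeeping in this recursive step can be checked with exact rational arithmetic. The sketch below is only an illustration; the function name `D` and the choice $n = 2$ are assumptions for the example. It lists the new dyadic points of $D_{n+1} \setminus D_n$ together with their neighbours $t^{\pm}$.

```python
from fractions import Fraction

def D(n):
    """D_n = {k / 2^n : 0 <= k <= 2^n}, stored as exact fractions."""
    return {Fraction(k, 2**n) for k in range(2**n + 1)}

n = 2
step = Fraction(1, 2**(n + 1))
new_points = sorted(D(n + 1) - D(n))
# For each new point t, the pair (t - step, t + step), i.e. t^- and t^+.
neighbours = {t: (t - step, t + step) for t in new_points}
```

For $n = 2$ the new points are $1/8, 3/8, 5/8, 7/8$, and each of their neighbours lies in $D_2$, as the argument requires.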

Now the extended family $(V_t)_{t \in D_{n+1}}$ will also satisfy property (ii), since for $t, s \in D_{n+1}$ with $t < s$, one of the following will hold:

• either $t, s \in D_n$, or
• $t \in D_n$, $s \in D_{n+1} \setminus D_n$, and $t \leq s^-$, or
• $t \in D_{n+1} \setminus D_n$, $s \in D_n$, and $t^+ \leq s$, or
• $t, s \in D_{n+1} \setminus D_n$, and $t^+ \leq s^-$.

(In each case, one uses (3) combined with the inductive hypothesis.)

Having constructed the family $(V_t)_{t \in D}$, with properties (i) and (ii), we define the function $f : X \to [0, 1]$ by

$$f(x) = \begin{cases} \inf\{ t \in D : x \in V_t \}, & \text{if } x \in V_1, \\ 1, & \text{if } x \notin V_1. \end{cases}$$

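Claim 1: The function $f$ is equivalently defined by

$$f(x) = \begin{cases} 0, & \text{if } x \in \overline{V_0}, \\ \sup\{ t \in D : x \notin \overline{V_t} \}, & \text{if } x \notin \overline{V_0}. \end{cases} \tag{4}$$

Let us denote by $g : X \to [0, 1]$ the function defined by formula (4). Fix some point $x \in X$. We break the proof into several cases.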

Case I: $x \in \overline{V_0}$.

In particular, using (ii) we get $x \in V_t$, for all $t \in D$ with $t > 0$, and since $x \in V_1$, we have

$$f(x) = \inf\{ t \in D : x \in V_t \} = \inf\{ t \in D : t > 0 \} = 0 = g(x).$$

Case II: $x \notin V_1$.

Using (ii) we have $x \notin \overline{V_t}$, for all $t \in D$ with $t < 1$, and since $x \notin \overline{V_0}$, we have

$$g(x) = \sup\{ t \in D : x \notin \overline{V_t} \} = \sup\{ t \in D : t < 1 \} = 1 = f(x).$$

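Case III: $x \in V_1 \setminus \overline{V_0}$.

By the definition of $f(x)$ we know:

$$x \notin V_t, \ \forall\, t \in D \text{ with } t < f(x); \tag{5}$$

$$\forall\, \varepsilon > 0, \ \exists\, s_\varepsilon \in D \text{ with } f(x) \leq s_\varepsilon < f(x) + \varepsilon, \text{ such that } x \in V_{s_\varepsilon}. \tag{6}$$

By the definition of $g(x)$ we know:

$$x \in \overline{V_t}, \ \forall\, t \in D \text{ with } t > g(x); \tag{7}$$

$$\forall\, \varepsilon > 0, \ \exists\, r_\varepsilon \in D \text{ with } g(x) \geq r_\varepsilon > g(x) - \varepsilon, \text{ such that } x \notin \overline{V_{r_\varepsilon}}. \tag{8}$$

Using (6) and (8) we see that we must have

$$s_\varepsilon \geq r_\varepsilon, \ \forall\, \varepsilon > 0. \tag{9}$$

Indeed, if there exists some $\varepsilon > 0$ for which we have $s_\varepsilon < r_\varepsilon$, then using (6) and (ii) we would have

$$x \in V_{s_\varepsilon} \subset \overline{V_{s_\varepsilon}} \subset V_{r_\varepsilon} \subset \overline{V_{r_\varepsilon}},$$

which contradicts (8).

Now the inequality (9), combined with (6) and (8), gives

$$f(x) + \varepsilon > g(x) - \varepsilon, \ \forall\, \varepsilon > 0,$$

so we have in fact the inequality

$$f(x) \geq g(x).$$

Suppose now this inequality is strict. Using (5) and (7) we will get

$$x \notin V_t \text{ and } x \in \overline{V_t}, \text{ for all } t \in D \text{ with } f(x) > t > g(x). \tag{10}$$

Using the fact that $D$ is dense in $[0, 1]$, we could then find at least two elements $t_1, t_2 \in D$ such that

$$f(x) > t_1 > t_2 > g(x).$$

In this case (10) immediately creates a contradiction, since by (ii)

$$x \in \overline{V_{t_2}} \subset V_{t_1}.$$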

Claim 2: The function $f$ is continuous.

Since any open set in $\mathbb{R}$ is a union of open intervals, it suffices to prove the following two properties:¹

(usc): $f^{-1}\big((-\infty, t)\big)$ is open for all $t \in \mathbb{R}$;
(lsc): $f^{-1}\big((t, \infty)\big)$ is open for all $t \in \mathbb{R}$.

In order to prove property (usc) it suffices to prove, for $t \leq 1$, the equality

$$f^{-1}\big((-\infty, t)\big) = \bigcup_{s \in D,\ s < t} V_s. \tag{11}$$

(For $t > 1$ the set $f^{-1}\big((-\infty, t)\big)$ is all of $X$, which is open.) Start with a point $x \in f^{-1}\big((-\infty, t)\big)$, which means that $f(x) < t$. Using (6), there exists some $s \in D$ with $f(x) \leq s < t$ such that $x \in V_s$, so $x$ indeed belongs to the right hand side of (11). Conversely, if $x$ belongs to the right hand side of (11), there exists some $s < t$ such that $x \in V_s$. By the definition of $f(x)$, it follows that $f(x) \leq s < t$, so $x \in f^{-1}\big((-\infty, t)\big)$.

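In order to prove property (lsc) it suffices to prove, for $t \geq 0$, the equality

$$f^{-1}\big((t, \infty)\big) = \bigcup_{r \in D,\ r > t} (X \setminus \overline{V_r}). \tag{12}$$

(For $t < 0$ the set $f^{-1}\big((t, \infty)\big)$ is all of $X$, which is open.) Start with a point $x \in f^{-1}\big((t, \infty)\big)$, which means that $f(x) > t$. Using (8), there exists some $r \in D$ with $f(x) \geq r > t$ such that $x \notin \overline{V_r}$, that is, $x \in X \setminus \overline{V_r}$, so $x$ indeed belongs to the right hand side of (12). Conversely, if $x$ belongs to the right hand side of (12), there exists some $r > t$ such that $x \in X \setminus \overline{V_r}$, i.e. $x \notin \overline{V_r}$. By the equivalent definition of $f(x)$ given by Claim 1, it follows that $f(x) \geq r > t$, so $x \in f^{-1}\big((t, \infty)\big)$.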

¹ The condition (usc) means that $f$ is upper semi-continuous, while the condition (lsc) means that $f$ is lower semi-continuous.



Having proven that $f$ is continuous, let us finish the proof. Since $A \subset V_0$, by the definition of $f$ we get $f|_A = 0$. Since $B \subset X \setminus V_1$, again by the definition of $f$, we get $f|_B = 1$.

Definition. A Hausdorff space $(X, \mathcal{T})$ with property (n) is called normal.



Lecture 2

2. Ultrafilters

In this lecture we discuss a set theoretical concept, which turns out to be technically useful in topology.

Definition. Suppose $X$ is a fixed (non-empty) set. A filter in $X$ is a (non-empty) family $\mathcal{F}$ of non-empty subsets of $X$ which has the property²:

(f) Whenever $F$ and $G$ belong to $\mathcal{F}$, it follows that $F \cap G$ also belongs to $\mathcal{F}$.

What is important here is that all the sets in the filter are assumed to be non-empty. The set of all filters in $X$ can be ordered by inclusion. A simple application of Zorn's Lemma yields:

• For each filter $\mathcal{F}$ there exists at least one maximal filter $\mathcal{U}$ with $\mathcal{U} \supset \mathcal{F}$.

Maximal filters will be called ultrafilters.

An interesting feature of ultrafilters is given by the following:

Lemma 2.1. Let $X$ be a non-empty set, and let $\mathcal{U}$ be a filter on $X$. The following are equivalent:

(i) $\mathcal{U}$ is an ultrafilter.
(ii) For any subset $A \subset X$, it follows that either $A$ or $X \setminus A$ belongs to $\mathcal{U}$, but not both!

Proof. (i) ⇒ (ii). Assume $\mathcal{U}$ is an ultrafilter. First remark that $X$ always belongs to $\mathcal{U}$. (Otherwise, if $X$ does not belong to $\mathcal{U}$, the family $\mathcal{U} \cup \{X\}$ will obviously be a new filter, which will contradict the maximality of $\mathcal{U}$.)

Let us assume that $A$ is non-empty and does not belong to $\mathcal{U}$. This means that the family

$$\mathcal{M} = \mathcal{U} \cup \{ A \cap U \mid U \in \mathcal{U} \}$$

is no longer a filter (otherwise, the maximality of $\mathcal{U}$ will be contradicted). Note that if $F$ and $G$ belong to $\mathcal{M}$, then automatically $F \cap G$ belongs to $\mathcal{M}$. This means that the only thing that can prevent $\mathcal{M}$ from being a filter must be the fact that one of the sets in $\mathcal{M}$ is empty. That is, there is some set $V \in \mathcal{U}$ such that $A \cap V = \emptyset$. In other words, $V \subset X \setminus A$. But then it follows that for any $U \in \mathcal{U}$ we have $U \cap (X \setminus A) \supset U \cap V \neq \emptyset$, and then the set

$$\mathcal{N} = \mathcal{U} \cup \{ U \cap (X \setminus A) \mid U \in \mathcal{U} \}$$

will be a filter. By maximality, it follows that $\mathcal{N} = \mathcal{U}$; in particular, $X \setminus A$ belongs to $\mathcal{U}$. It is obvious that $A$ and $X \setminus A$ cannot simultaneously belong to $\mathcal{U}$, because this would force $\emptyset = A \cap (X \setminus A)$ to belong to $\mathcal{U}$.

2 Some textbooks may use a slightly different definition.

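(ii) ⇒ (i). Assume property (ii) holds, but $\mathcal{U}$ is not maximal, which means that there exists some ultrafilter $\mathcal{V}$ with $\mathcal{V} \supsetneq \mathcal{U}$. Pick then some set $A \in \mathcal{V} \setminus \mathcal{U}$. Since $A \notin \mathcal{U}$, by (ii) we must have $X \setminus A \in \mathcal{U}$. This would force both $A$ and $X \setminus A$ to belong to $\mathcal{V}$, which is impossible.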
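On a small finite set Lemma 2.1 can be verified exhaustively. The following sketch, which is an illustration and not part of the notes, uses the definition of filter given above (a non-empty family of non-empty sets closed under intersection), enumerates all filters on a three-element set, and picks out the maximal ones. All names in the code are hypothetical.

```python
from itertools import combinations

X = frozenset({0, 1, 2})
subsets = [frozenset(s) for r in range(len(X) + 1) for s in combinations(X, r)]
nonempty = [A for A in subsets if A]

def is_filter(F):
    """Non-empty family of non-empty sets, closed under pairwise intersection."""
    return bool(F) and all(A and (A & B) in F for A in F for B in F)

# All 2^7 - 1 = 127 non-empty families of non-empty subsets, kept if they are filters.
families = [frozenset(c) for r in range(1, len(nonempty) + 1)
            for c in combinations(nonempty, r)]
filters = [F for F in families if is_filter(F)]

# Ultrafilters are the filters that are maximal with respect to inclusion.
ultrafilters = [U for U in filters if not any(U < G for G in filters)]

def has_property_ii(U):
    """For every subset A of X, exactly one of A and X - A belongs to U."""
    return all((A in U) != ((X - A) in U) for A in subsets)
```

In this example there are exactly three ultrafilters, one for each point of $X$, and a filter satisfies condition (ii) precisely when it is one of them.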

Exercise 1. Let $\mathcal{U}$ be an ultrafilter on $X$, and let $A \in \mathcal{U}$. Prove that the collection

$$\mathcal{U}|_A = \{ U \cap A : U \in \mathcal{U} \}$$

is an ultrafilter on $A$.

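Remark 2.1. If $\mathcal{U}$ is an ultrafilter on $X$, and $A \in \mathcal{U}$, then $\mathcal{U}$ contains all sets $B$ with $A \subset B \subset X$. Indeed, if we start with such a $B$, then by the above result, either $B \in \mathcal{U}$ or $X \setminus B \in \mathcal{U}$. Notice however that in the case $X \setminus B \in \mathcal{U}$ we would get

$$\mathcal{U} \ni (X \setminus B) \cap A = \emptyset,$$

which is impossible. Therefore $B$ must belong to $\mathcal{U}$.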

We are now in a position to define the notion of convergence for ultrafilters, by means of the following.

Proposition 2.1. Let $(X, \mathcal{T})$ be a topological space, let $\mathcal{U}$ be an ultrafilter in $X$, and let $x$ be a point in $X$. The following are equivalent:

(i) Every neighborhood of $x$ belongs to $\mathcal{U}$.
(ii) There exists a basic system $\mathcal{N}$ of neighborhoods of $x$, with $\mathcal{N} \subset \mathcal{U}$.
(iii) There exists a fundamental system $\mathcal{V}$ of neighborhoods of $x$, with $\mathcal{V} \subset \mathcal{U}$.

If the ultrafilter $\mathcal{U}$ satisfies one of the equivalent conditions above, we say that $\mathcal{U}$ is convergent to $x$, and we write $\mathcal{U} \to x$.

Proof. The implications (i) ⇒ (ii) ⇒ (iii) are obvious.

(iii) ⇒ (i). Let $\mathcal{V}$ be a fundamental system of neighborhoods of $x$, with $\mathcal{V} \subset \mathcal{U}$. Start with an arbitrary neighborhood $M$ of $x$. By the properties of $\mathcal{V}$, there exists a finite sequence $V_1, \dots, V_n \in \mathcal{V}$, with

$$x \in V_1 \cap \dots \cap V_n \subset M.$$

Since $\mathcal{V} \subset \mathcal{U}$, and $\mathcal{U}$ is a filter, it follows that the intersection $W = V_1 \cap \dots \cap V_n$ belongs to $\mathcal{U}$. By Remark 2.1 it follows that $M$ itself belongs to $\mathcal{U}$. Since $M$ was arbitrary, it follows that $\mathcal{U}$ indeed satisfies condition (i).

The Hausdorff property has a nice ultrafilter characterization:

Proposition 2.2. For a topological space $(X, \mathcal{T})$, the following are equivalent:

(i) The topology $\mathcal{T}$ is Hausdorff.
(ii) Every convergent ultrafilter in $X$ has a unique limit.

Proof. (i) ⇒ (ii). Assume the topology is Hausdorff. Let $\mathcal{U}$ be an ultrafilter in $X$ which is convergent to both $x$ and $y$. If $x \neq y$, then by the Hausdorff property there exist two open sets $U, V \subset X$, with $x \in U$, $y \in V$, and $U \cap V = \emptyset$. Since $U$ is a neighborhood of $x$, we must have $U \in \mathcal{U}$. Likewise, we must have $V \in \mathcal{U}$. But this is impossible, since it would force $\mathcal{U} \ni U \cap V = \emptyset$.

(ii) ⇒ (i). Assume $X$ satisfies condition (ii), but the topology is not Hausdorff. This means that there exist two points $x, y \in X$, with $x \neq y$, such that

(∗) for any open sets $U, V \subset X$, with $U \ni x$ and $V \ni y$, we have $U \cap V \neq \emptyset$.

Let $\mathcal{N}_x$ denote the collection of all neighborhoods of $x$, and $\mathcal{N}_y$ denote the collection of all neighborhoods of $y$. By condition (∗) we have

$$M \cap N \neq \emptyset, \ \forall\, M \in \mathcal{N}_x,\ N \in \mathcal{N}_y.$$

This proves that the collection

$$\mathcal{F} = \{ M \cap N : M \in \mathcal{N}_x,\ N \in \mathcal{N}_y \}$$

is a filter in $X$. Notice that, since $X$ is a neighborhood for both $x$ and $y$, we have the inclusion $\mathcal{F} \supset \mathcal{N}_x \cup \mathcal{N}_y$. So if we take $\mathcal{U}$ to be an ultrafilter with $\mathcal{U} \supset \mathcal{F}$, it follows that $\mathcal{U} \supset \mathcal{N}_x$, hence $\mathcal{U}$ converges to $x$, but also $\mathcal{U} \supset \mathcal{N}_y$, hence $\mathcal{U}$ is also convergent to $y$. By condition (ii) this is impossible.

Examples 2.1. A. Let $x$ be a point in $X$. We can consider the collection $\mathcal{U}_x = \{ U \subset X \mid U \ni x \}$. Clearly $\mathcal{U}_x$ is an ultrafilter in $X$. This is called a constant ultrafilter at $x$. If $(X, \mathcal{T})$ is a topological space, then it is obvious that $\mathcal{U}_x$ is convergent to $x$.

B. (Example of a convergent non-constant ultrafilter.) Suppose $(X, \mathcal{T})$ is a topological space and $x$ is a point in $X$ such that for any neighborhood $N$ of $x$ we have $N \setminus \{x\} \neq \emptyset$. Consider the collection

$$\mathcal{F} = \{ N \setminus \{x\} \mid N \text{ neighborhood of } x \}.$$

Then $\mathcal{F}$ is a filter. If we take $\mathcal{U}$ to be any ultrafilter which contains $\mathcal{F}$, we get a non-constant (sometimes called free) ultrafilter. It is again clear that $\mathcal{U}$ is convergent to $x$.

C. (Example of a non-convergent ultrafilter.) Let $\mathbb{N}$ be the set of non-negative integers. Equip $\mathbb{N}$ with the discrete topology (in which every subset is open). Consider the collection $\mathcal{F}$ consisting of all subsets $F \subset \mathbb{N}$ which have finite complement $\mathbb{N} \setminus F$. It is easy to check that $\mathcal{F}$ is a filter. Pick then $\mathcal{U}$ to be any ultrafilter with $\mathcal{U} \supset \mathcal{F}$. Since on $\mathbb{N}$ we use the discrete topology, it follows that the only convergent ultrafilters are the constant ones. Note however that if $n \in \mathbb{N}$, then the set $\mathbb{N} \setminus \{n\}$ belongs to $\mathcal{F}$, hence to $\mathcal{U}$. This means that the singleton set $\{n\}$ cannot belong to $\mathcal{U}$. Therefore $\mathcal{U}$ cannot be constant.

Remark 2.2. Maps between sets can be put to act on ultrafilters. More explicitly, one has the following construction. Suppose $f : X \to Y$ is a map and $\mathcal{U}$ is an ultrafilter in $X$. Consider the collection

$$f_*(\mathcal{U}) = \{ V \subset Y \mid f^{-1}(V) \in \mathcal{U} \}.$$

Then $f_*(\mathcal{U})$ is an ultrafilter on $Y$. Indeed, it is easy to show that $f_*(\mathcal{U})$ is a filter. To prove that it is maximal, let us take $\mathcal{F}$ a filter on $Y$ with $\mathcal{F} \supset f_*(\mathcal{U})$ and let us consider an arbitrary set $F$ which belongs to $\mathcal{F}$. Since $\mathcal{U}$ is an ultrafilter on $X$, it follows that either $f^{-1}(F)$ or $X \setminus f^{-1}(F)$ belongs to $\mathcal{U}$. If $X \setminus f^{-1}(F)$ belongs to $\mathcal{U}$, using the equality $X \setminus f^{-1}(F) = f^{-1}(Y \setminus F)$ it follows that $Y \setminus F$ belongs to $f_*(\mathcal{U})$, hence to $\mathcal{F}$. But this is impossible, since $F$ also belongs to $\mathcal{F}$, and this would force the empty set $F \cap (Y \setminus F)$ to belong to the filter $\mathcal{F}$. This contradiction shows that the set $f^{-1}(F)$ belongs to $\mathcal{U}$, which means precisely that $F$ belongs to $f_*(\mathcal{U})$. This argument proves the inclusion $\mathcal{F} \subset f_*(\mathcal{U})$, so $f_*(\mathcal{U})$ is indeed a maximal filter.
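For constant ultrafilters on finite sets the construction of Remark 2.2 can be carried out explicitly: the push-forward of $\mathcal{U}_x$ is the constant ultrafilter at $f(x)$. The sketch below illustrates this; it is not part of the notes, maps are given as dictionaries, and the helper names are assumptions.

```python
from itertools import combinations

def powerset(Z):
    """All subsets of the finite set Z, as frozensets."""
    Z = list(Z)
    return [frozenset(s) for r in range(len(Z) + 1) for s in combinations(Z, r)]

def principal(x, Z):
    """The constant ultrafilter at x: all subsets of Z containing x."""
    return {U for U in powerset(Z) if x in U}

def push_forward(f, X, Y, U):
    """f_*(U): all subsets V of Y whose preimage f^{-1}(V) belongs to U."""
    return {V for V in powerset(Y)
            if frozenset(x for x in X if f[x] in V) in U}

X, Y = {0, 1, 2}, {"a", "b"}
f = {0: "a", 1: "a", 2: "b"}
```

For every `x` in `X`, `push_forward(f, X, Y, principal(x, X))` coincides with `principal(f[x], Y)`.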

Remark 2.3. With the above notations, one has

$$f(U) \in f_*(\mathcal{U}), \ \forall\, U \in \mathcal{U}.$$

One can prove this property by contradiction. Assume $f(U)$ does not belong to $f_*(\mathcal{U})$, for some $U \in \mathcal{U}$. Then $Y \setminus f(U)$ belongs to $f_*(\mathcal{U})$, which means that the set

$$M = f^{-1}\big(Y \setminus f(U)\big) = X \setminus f^{-1}\big(f(U)\big)$$

belongs to $\mathcal{U}$. But using the obvious inclusion $U \subset f^{-1}\big(f(U)\big)$, this gives $M \cap U = \emptyset$, which is impossible.

Continuity can be nicely characterized using ultrafilters:

Proposition 2.3. Let $(X, \mathcal{T})$ and $(Y, \mathcal{S})$ be topological spaces, and let $x$ be an element in $X$. For a function $f : X \to Y$, the following are equivalent:

(i) $f$ is continuous at $x$.
(ii) Whenever $\mathcal{U}$ is an ultrafilter on $X$ convergent to $x$, it follows that the ultrafilter $f_*(\mathcal{U})$ in $Y$ is convergent to $f(x)$.

Proof. (i) ⇒ (ii). Assume that $f$ is continuous at $x$. Start with an ultrafilter $\mathcal{U}$ on $X$, with $\mathcal{U} \to x$. Let $N$ be an arbitrary neighborhood of $f(x)$. Since $f$ is continuous at $x$, it follows that $f^{-1}(N)$ is a neighborhood of $x$. In particular we get $f^{-1}(N) \in \mathcal{U}$, which proves that $N \in f_*(\mathcal{U})$. Since the ultrafilter $f_*(\mathcal{U})$ contains all neighborhoods of $f(x)$, it means that indeed $f_*(\mathcal{U})$ is convergent to $f(x)$.

(ii) ⇒ (i). Assume $f$ satisfies condition (ii), but $f$ is not continuous at $x$. This means that there exists some neighborhood $V$ of $f(x)$ such that $f^{-1}(V)$ is not a neighborhood of $x$. Consider the collection

$$\mathcal{F} = \{ N \setminus f^{-1}(V) : N \text{ neighborhood of } x \}.$$

Our assumption on $V$ shows that all the sets in $\mathcal{F}$ are non-empty. (Otherwise $f^{-1}(V)$ would contain some neighborhood of $x$, which would force $f^{-1}(V)$ itself to be a neighborhood of $x$.) It is also clear that $\mathcal{F}$ is a filter. Let $\mathcal{U}$ be an ultrafilter with $\mathcal{U} \supset \mathcal{F}$.

Claim: The ultrafilter $\mathcal{U}$ is convergent to $x$.

To prove this, start with some arbitrary neighborhood $N$ of $x$. If $N$ does not belong to $\mathcal{U}$, then $X \setminus N$ belongs to $\mathcal{U}$. But then $(X \setminus N) \cap (N \setminus f^{-1}(V)) = \emptyset$ belongs to $\mathcal{U}$, which is impossible. So $\mathcal{U}$ contains all neighborhoods of $x$, which means that indeed $\mathcal{U}$ is convergent to $x$.

Using our assumption on $V$, plus condition (ii), it follows that $V \in f_*(\mathcal{U})$, which means that $f^{-1}(V) \in \mathcal{U}$. But this leads to a contradiction, since $X \setminus f^{-1}(V)$ clearly belongs to $\mathcal{F} \subset \mathcal{U}$.



Lecture 3

3. Constructing topologies

In this section we discuss several methods for constructing topologies on a given set.

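Definition. If $\mathcal{T}$ and $\mathcal{T}'$ are two topologies on the same space $X$, such that $\mathcal{T} \subset \mathcal{T}'$ (as sets), then $\mathcal{T}'$ is said to be stronger than $\mathcal{T}$. Equivalently, we will say that $\mathcal{T}$ is weaker than $\mathcal{T}'$.

Remark that this condition is equivalent to the continuity of the map

$$\mathrm{Id} : (X, \mathcal{T}') \to (X, \mathcal{T}).$$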

Comment. Given a (non-empty) set $X$, and a collection $\mathcal{S}$ of subsets of $X$, one can ask the following:

Question 1: Is there a topology on $X$ with respect to which all the sets in $\mathcal{S}$ are open?

Of course, this question has an affirmative answer, since we can take as the topology the collection of all subsets of $X$. Therefore the above question is more meaningful if stated as:

Question 2: Is there a weakest topology on $X$ with respect to which all the sets in $\mathcal{S}$ are open?

The answer to this question is again affirmative, and it is based on the following:

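Remark 3.1. If $X$ is a non-empty set, and $(\mathcal{T}_i)_{i \in I}$ is a family of topologies on $X$, then the intersection

$$\bigcap_{i \in I} \mathcal{T}_i$$

is again a topology on $X$.

In particular, if one starts with an arbitrary family $\mathcal{S}$ of subsets of $X$, and if we take

$$\Theta(\mathcal{S}) = \{ \mathcal{T} : \mathcal{T} \text{ topology on } X \text{ with } \mathcal{T} \supset \mathcal{S} \},$$

then the intersection

$$\operatorname{top}(\mathcal{S}) = \bigcap_{\mathcal{T} \in \Theta(\mathcal{S})} \mathcal{T}$$

is the weakest (i.e. smallest) among all topologies with respect to which all sets in $\mathcal{S}$ are open.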

The topology $\operatorname{top}(\mathcal{S})$ defined above can also be described constructively as follows.

Proposition 3.1. Let $\mathcal{S}$ be a collection of subsets of $X$. Then the sets in $\operatorname{top}(\mathcal{S})$ which are proper subsets of $X$ are those which can be written as (arbitrary) unions of finite intersections of sets in $\mathcal{S}$.

Proof. It is useful to introduce the following notations. First we define $V(\mathcal{S})$ to be the collection of all sets which are finite intersections of sets in $\mathcal{S}$. In other words,

$$B \in V(\mathcal{S}) \iff \exists\, D_1, \dots, D_n \in \mathcal{S} \text{ such that } D_1 \cap \dots \cap D_n = B.$$

With the above notation, what we need to prove is that for a set $A \subsetneq X$, we have

$$A \in \operatorname{top}(\mathcal{S}) \iff \exists\, V_A \subset V(\mathcal{S}) \text{ such that } A = \bigcup_{B \in V_A} B.$$

The implication "⇐" is pretty obvious. Since $\operatorname{top}(\mathcal{S})$ is a topology, and every set in $\mathcal{S}$ is open with respect to $\operatorname{top}(\mathcal{S})$, it follows that every finite intersection of sets in $\mathcal{S}$ is again in $\operatorname{top}(\mathcal{S})$, which means that every set in $V(\mathcal{S})$ is again open with respect to $\operatorname{top}(\mathcal{S})$. But then arbitrary unions of sets in $V(\mathcal{S})$ are again open with respect to $\operatorname{top}(\mathcal{S})$.

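To prove the implication "⇒" we define

$$\mathcal{T}_0 = \Big\{ A \subset X : \exists\, V_A \subset V(\mathcal{S}) \text{ such that } A = \bigcup_{B \in V_A} B \Big\},$$

and we will show that

$$\operatorname{top}(\mathcal{S}) \subset \{X\} \cup \mathcal{T}_0. \tag{1}$$

By the definition of $\operatorname{top}(\mathcal{S})$ it suffices to prove the following

Claim: The collection $\mathcal{T}_1 = \{X\} \cup \mathcal{T}_0$ is a topology on $X$, which contains all the sets in $\mathcal{S}$.

The fact that $\mathcal{T}_1 \supset \mathcal{S}$ is trivial.

The fact that $\emptyset, X \in \mathcal{T}_1$ is also clear.

The fact that arbitrary unions of sets in $\mathcal{T}_1$ again belong to $\mathcal{T}_1$ is again clear, by construction.

Finally, we need to show that if $A_1, A_2 \in \mathcal{T}_1$, then $A_1 \cap A_2 \in \mathcal{T}_1$. If either $A_1 = X$ or $A_2 = X$, there is nothing to prove. Assume that both $A_1$ and $A_2$ are proper subsets of $X$, so there are subsets $V_1, V_2 \subset V(\mathcal{S})$ such that

$$A_1 = \bigcup_{B \in V_1} B \quad\text{and}\quad A_2 = \bigcup_{E \in V_2} E.$$

Then it is clear that

$$A_1 \cap A_2 = \bigcup_{B \in V_1,\ E \in V_2} (B \cap E),$$

with all the sets $B \cap E$ in $V(\mathcal{S})$, so $A_1 \cap A_2$ indeed belongs to $\mathcal{T}_1$.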
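The constructive description in Proposition 3.1 translates directly into a brute-force procedure on finite sets: form all finite intersections, then all unions, and add $X$. The sketch below is an illustration under that assumption (it is exponential in the number of sets and meant only for tiny examples); the function name `top` mirrors the notation of the notes, while the sample data is made up.

```python
from itertools import combinations

def top(S, X):
    """Topology generated by S on a finite set X, following Proposition 3.1."""
    X = frozenset(X)
    S = [frozenset(D) for D in S]
    # V(S): intersections of finitely many (at least one) sets from S.
    V = list({frozenset.intersection(*c)
              for r in range(1, len(S) + 1) for c in combinations(S, r)})
    # Unions of subfamilies of V(S); the empty subfamily gives the empty set.
    T0 = {frozenset().union(*c)
          for r in range(len(V) + 1) for c in combinations(V, r)}
    return T0 | {X}

X = {0, 1, 2, 3}
S = [{0, 1}, {1, 2}]
T = top(S, X)
```

For this sub-base the resulting topology consists of $\emptyset$, $\{1\}$, $\{0,1\}$, $\{1,2\}$, $\{0,1,2\}$ and $X$.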

Definition. Let $X$ be a (non-empty) set, and let $\mathcal{T}$ be a topology on $X$. A collection $\mathcal{S}$ of subsets of $X$, with the property that

$$\mathcal{T} = \operatorname{top}(\mathcal{S}),$$

is called a sub-base for $\mathcal{T}$. According to Proposition 3.1, the above condition is equivalent to the fact that every open set $D \subsetneq X$ can be written as a union of finite intersections of sets in $\mathcal{S}$.

Convergence of ultrafilters is characterized using sub-bases as follows:




Proposition 3.2. Let $(X, \mathcal{T})$ be a topological space, let $\mathcal{S}$ be a sub-base for $\mathcal{T}$, and let $x$ be some point in $X$. For an ultrafilter $\mathcal{U}$ on $X$, the following are equivalent:

(i) $\mathcal{U}$ is convergent to $x$;
(ii) $\mathcal{U}$ contains all the sets $S \in \mathcal{S}$ with $S \ni x$.

Proof. The implication (i) ⇒ (ii) is trivial.

To prove the implication (ii) ⇒ (i), we assume $\mathcal{U}$ has property (ii), we consider some neighborhood $N$ of $x$, and let us prove that $N$ belongs to $\mathcal{U}$. Since $N$ is a neighborhood of $x$, there exists some open set $D$ with $x \in D \subset N$. Furthermore, by Proposition 3.1, either

(a) $D = X$, or
(b) there exist sets $S_1, S_2, \dots, S_n \in \mathcal{S}$ with

$$x \in S_1 \cap S_2 \cap \dots \cap S_n \subset D \subset N.$$

In case (a) we immediately have $N = X$, and we obviously get $N \in \mathcal{U}$. In case (b) it follows that $S_1, \dots, S_n \in \mathcal{U}$, so the intersection $S_1 \cap S_2 \cap \dots \cap S_n$ also belongs to $\mathcal{U}$. By Remark 2.1 it then follows that $N$ itself belongs to $\mathcal{U}$.

There are instances when sub-bases have a particular feature, which enables one to describe all open sets in an easier fashion.

Proposition 3.3. Let $(X, \mathcal{T})$ be a topological space. Suppose $\mathcal{V}$ is a collection of subsets of $X$. The following are equivalent:

(i) $\mathcal{V}$ is a sub-base for $\mathcal{T}$, and

$$\forall\, U, V \in \mathcal{V} \text{ and } x \in U \cap V, \ \exists\, W \in \mathcal{V} \text{ with } x \in W \subset U \cap V. \tag{2}$$

(ii) Every open set $A \subsetneq X$ is a union of sets in $\mathcal{V}$.

Proof. (i) ⇒ (ii). From property (i), it follows that every finite intersection of sets in $\mathcal{V}$ is a union of sets in $\mathcal{V}$. Then the desired implication is immediate from the previous result.

(ii) ⇒ (i). Assume (ii) and start with two sets $U, V \in \mathcal{V}$, and an element $x \in U \cap V$. Since $U \cap V$ is open, by (ii) either we have $U \cap V = X$, in which case we get $U = V = X$ and we take $W = X$, or $U \cap V \subsetneq X$, in which case $U \cap V$ is a union of sets in $\mathcal{V}$, so in particular there exists $W \in \mathcal{V}$ with $x \in W \subset U \cap V$.

Definition. If $(X, \mathcal{T})$ is a topological space, a collection $\mathcal{V}$ which satisfies the above equivalent conditions is called a base for $\mathcal{T}$.

The following is a useful technical result.

Lemma 3.1. Let $(Y, \mathcal{T})$ be a topological space, let $X$ be some (non-empty) set, and let $f : X \to Y$ be a function. Then the collection

$$f^*(\mathcal{T}) = \{ f^{-1}(D) : D \in \mathcal{T} \}$$

is a topology on $X$. Moreover, $f^*(\mathcal{T})$ is the weakest topology on $X$ with respect to which the map $f$ is continuous.

Proof. Clearly $\emptyset = f^{-1}(\emptyset)$ and $X = f^{-1}(Y)$ both belong to $f^*(\mathcal{T})$. If $(A_i)_{i \in I}$ is a family of sets in $f^*(\mathcal{T})$, say $A_i = f^{-1}(D_i)$, for some $D_i \in \mathcal{T}$, for all $i \in I$, then the equality

$$\bigcup_{i \in I} A_i = \bigcup_{i \in I} f^{-1}(D_i) = f^{-1}\Big( \bigcup_{i \in I} D_i \Big)$$

clearly shows that $\bigcup_{i \in I} A_i$ again belongs to $f^*(\mathcal{T})$. Likewise, if $A_1, A_2 \in f^*(\mathcal{T})$, say $A_1 = f^{-1}(D_1)$ and $A_2 = f^{-1}(D_2)$ for some $D_1, D_2 \in \mathcal{T}$, then the equality

$$A_1 \cap A_2 = f^{-1}(D_1) \cap f^{-1}(D_2) = f^{-1}(D_1 \cap D_2)$$

proves that $A_1 \cap A_2$ again belongs to $f^*(\mathcal{T})$.

Having proven that $f^*(\mathcal{T})$ is a topology on $X$, let us prove now the second statement. The fact that $f$ is continuous with respect to $f^*(\mathcal{T})$ is clear by construction. If $\mathcal{T}'$ is another topology on $X$ which still makes $f$ continuous, then this will force all the sets of the form $f^{-1}(D)$, $D \in \mathcal{T}$, to belong to $\mathcal{T}'$, which means that $f^*(\mathcal{T}) \subset \mathcal{T}'$.
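A finite instance of Lemma 3.1 may help fix ideas. In the sketch below, which is illustrative only, the map is a dictionary and `pull_back` is a hypothetical helper computing $f^*(\mathcal{T})$ as the family of preimages of open sets.

```python
def pull_back(f, X, T):
    """f^*(T): the preimages f^{-1}(D) of all open sets D in T."""
    return {frozenset(x for x in X if f[x] in D) for D in T}

X = {0, 1, 2, 3}
f = {0: "a", 1: "a", 2: "b", 3: "c"}
T = [set(), {"a"}, {"a", "b"}, {"a", "b", "c"}]   # a topology on Y = {a, b, c}
fT = pull_back(f, X, T)
```

Here `fT` consists of $\emptyset$, $\{0,1\}$, $\{0,1,2\}$ and $X$. The points 0 and 1, which have the same image, cannot be separated by any open set of $f^*(\mathcal{T})$.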

Remark 3.2. Using the above notations, if $\mathcal{V}$ is a (sub)base for $\mathcal{T}$, then

$$f^*(\mathcal{V}) = \{ f^{-1}(V) : V \in \mathcal{V} \}$$

is a (sub)base for $f^*(\mathcal{T})$. This is pretty obvious, since the correspondence

$$\{\text{subsets of } Y\} \ni D \longmapsto f^{-1}(D)$$

is compatible with the operations of intersection and union (of arbitrary families).

Remark 3.3. As a consequence of the above remark, we see that (sub)bases can be useful for verifying continuity. More specifically, if $(X, \mathcal{T})$ and $(X', \mathcal{T}')$ are topological spaces, and $\mathcal{V}$ is a sub-base for $\mathcal{T}'$, then a function $f : X \to X'$ is continuous if and only if $f^{-1}(V)$ is open, for all $V \in \mathcal{V}$.

The construction outlined in Lemma 3.1 can be generalized as follows.

Proposition 3.4. Let $X$ be a set, and let $\Phi = (f_i, Y_i)_{i \in I}$ be a family consisting of maps $f_i : X \to Y_i$, where $Y_i$ is a topological space, for all $i \in I$. Then there is a unique topology $\mathcal{T}^{\Phi}$ on $X$ with the following properties:

(i) Each of the maps $f_i : X \to Y_i$, $i \in I$, is continuous with respect to $\mathcal{T}^{\Phi}$.
(ii) Given a topological space $(Z, \mathcal{S})$, and a map $g : Z \to X$, such that the composition $f_i \circ g : Z \to Y_i$ is continuous, for every $i \in I$, it follows that $g$ is continuous as a map $(Z, \mathcal{S}) \to (X, \mathcal{T}^{\Phi})$.

Proof. For every $i \in I$ we define

$$\mathcal{D}_i = \{ f_i^{-1}(D) : D \text{ open subset of } Y_i \},$$

and we form the collection

$$\mathcal{D} = \bigcup_{i \in I} \mathcal{D}_i.$$

Take $\mathcal{T}^{\Phi} = \operatorname{top}(\mathcal{D})$. Property (i) follows from the simple observation that, by construction, every set in $\mathcal{D}$ is open.

To prove property (ii), start with a topological space $(Z, \mathcal{S})$, and a map $g : Z \to X$, such that the composition $f_i \circ g : Z \to Y_i$ is continuous, for every $i \in I$, and let us prove that $g$ is continuous. By Remark 3.3 it suffices to prove that $g^{-1}(A)$ is open (in $Z$) for every $A \in \mathcal{D}$. By the definition of $\mathcal{D}$ this is equivalent to proving the fact that, for each $i \in I$, and each open set $D \subset Y_i$, the set $g^{-1}\big(f_i^{-1}(D)\big)$ is open. But this is obvious, since we have

$$g^{-1}\big(f_i^{-1}(D)\big) = (f_i \circ g)^{-1}(D),$$

and $f_i \circ g : Z \to Y_i$ is continuous.

To prove the uniqueness, let $\mathcal{T}$ be another topology on $X$ with properties (i) and (ii). Consider the map $h = \mathrm{Id} : (X, \mathcal{T}) \to (X, \mathcal{T}^{\Phi})$. Using property (i) for $\mathcal{T}$, combined with property (ii) for $\mathcal{T}^{\Phi}$, it follows that $h$ is continuous, which means that $\mathcal{T}^{\Phi} \subset \mathcal{T}$. Reversing the roles, and arguing exactly the same way, we also get the other inclusion $\mathcal{T} \subset \mathcal{T}^{\Phi}$.

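Remark 3.4. Using the above setting, assume that for each $i \in I$ a sub-base $\mathcal{S}_i$ for the topology of $Y_i$ is given. Consider the sets $f_i^* \mathcal{S}_i = \{ f_i^{-1}(S) : S \in \mathcal{S}_i \}$. Then the collection

$$\mathcal{S} = \bigcup_{i \in I} f_i^* \mathcal{S}_i$$

is a sub-base for the topology $\mathcal{T}^{\Phi}$.

To prove this, we take $\mathcal{T} = \operatorname{top}(\mathcal{S})$, so that we obviously have the inclusion $\mathcal{T} \subset \mathcal{T}^{\Phi}$. In order to prove the equality $\mathcal{T} = \mathcal{T}^{\Phi}$, all we have to prove are (using the notations from the proof of the above Proposition) the inclusions

$$\mathcal{D}_i \subset \mathcal{T}, \ \forall\, i \in I.$$

By construction however, we have $\mathcal{D}_i = f_i^* \mathcal{T}_i$, where $\mathcal{T}_i$ denotes the topology of $Y_i$, and since $\mathcal{S}_i$ is a sub-base for $\mathcal{T}_i$, it follows that $f_i^* \mathcal{S}_i$ is a sub-base for $\mathcal{D}_i$, which means that we have

$$\mathcal{D}_i = \operatorname{top}(f_i^* \mathcal{S}_i) \subset \operatorname{top}(\mathcal{S}) = \mathcal{T}.$$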

Comment. Using the notations above, it is immediate that the topology $\mathcal{T}^{\Phi}$ can also be described as the weakest topology on $X$ with respect to which all the maps $f_i : X \to Y_i$, $i \in I$, are continuous. In the light of this remark, we will call the topology $\mathcal{T}^{\Phi}$ the weak topology defined by $\Phi$.

Convergence for ultrafilters can be nicely characterized:

Proposition 3.5. Let $X$ be a set, and let $\Phi = (f_i, Y_i)_{i \in I}$ be a family consisting of maps $f_i : X \to Y_i$, where $Y_i$ is a topological space, for all $i \in I$. For an ultrafilter $\mathcal{U}$ on $X$ and a point $x \in X$, the following are equivalent:

(i) $\mathcal{U}$ is convergent to $x$, with respect to the topology $\mathcal{T}^{\Phi}$;
(ii) for every $i \in I$, the ultrafilter $f_{i*}(\mathcal{U})$ is convergent to $f_i(x)$.

Proof. (i) ⇒ (ii). This implication is clear, since all maps $f_i : (X, \mathcal{T}^{\Phi}) \to (Y_i, \mathcal{T}_i)$, $i \in I$, are continuous.

(ii) ⇒ (i). Suppose $\mathcal{U}$ satisfies (ii). Then for every $i \in I$, the ultrafilter $f_{i*}(\mathcal{U})$ contains all the open sets $D \subset Y_i$ with $D \ni f_i(x)$. This means that $f_i^{-1}(D) \in \mathcal{U}$. But by construction, the topology $\mathcal{T}^{\Phi}$ has

$$\mathcal{S} = \bigcup_{i \in I} \{ f_i^{-1}(D) : D \subset Y_i \text{ open} \}$$

as a sub-base, so if we define $\mathcal{S}_x = \{ S \in \mathcal{S} : S \ni x \}$, we clearly have $\mathcal{U} \supset \mathcal{S}_x$. Then the fact that $\mathcal{U}$ converges to $x$ follows from Proposition 3.2.

Example 3.1. (The product topology) Supoose we have a family (X i, T i), i ∈ I

of topological spaces. Consider the Cartesian product X =i∈I

X i. For each j ∈ I



18 LECTURE 3

we consider the projection πj : X → X j . The weakest topology on X , defined bythe family Φ = πjj∈I , is called the product topology .

A sub-base for the product topology can be defined as follows. For each i ∈ I, we choose a sub-base S_i for T_i (for instance we can take S_i = T_i), and we take

\[ S = \bigcup_{i \in I} \pi_i^* S_i = \bigcup_{i \in I} \{ \pi_i^{-1}(D) : D \in S_i \}. \]

Then S is a sub-base for the product topology.

For a point x = (x_i)_{i∈I} ∈ X, and an ultrafilter U on X, the condition U → x is equivalent to the fact that π_{i*}(U) → x_i, ∀ i ∈ I.
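For finite spaces, the sub-base description of the product topology can be checked by brute force. The following is a minimal sketch and not part of the original notes: the helper names `topology_from_subbase` and `preimage`, and the two small example spaces (the Sierpinski space and a two-point discrete space), are assumptions chosen only for illustration.

```python
from itertools import combinations, product

def topology_from_subbase(points, subbase):
    """Topology on a finite set generated by a sub-base (brute force)."""
    points = frozenset(points)
    subbase = list({frozenset(s) for s in subbase})
    # Base: all finite intersections of sub-base sets; the empty
    # intersection is taken to be the whole space.
    base = {points}
    for r in range(1, len(subbase) + 1):
        for combo in combinations(subbase, r):
            base.add(frozenset.intersection(*combo))
    # Topology: all unions of base sets; the empty union is the empty set.
    base = list(base)
    opens = {frozenset()}
    for r in range(1, len(base) + 1):
        for combo in combinations(base, r):
            opens.add(frozenset().union(*combo))
    return opens

# Hypothetical example spaces (not taken from the notes).
X1, T1 = [0, 1], [set(), {1}, {0, 1}]                    # Sierpinski space
X2, T2 = ["a", "b"], [set(), {"a"}, {"b"}, {"a", "b"}]   # discrete space
X = list(product(X1, X2))

def preimage(index, D):
    """Preimage of D under the coordinate projection pi_index."""
    return frozenset(p for p in X if p[index] in D)

# Sub-base: preimages of open sets under the two projections.
subbase = [preimage(0, D) for D in T1] + [preimage(1, D) for D in T2]
product_topology = topology_from_subbase(X, subbase)
```

By construction every preimage of an open set under a projection lies in `product_topology`, so both projections are continuous. The enumeration is exponential in the number of points, so it is only meant for very small examples.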

Another method of constructing topologies is based on the following “dual” version of Lemma 3.1.

Lemma 3.2. Let (Y, T) be a topological space, let X be some (non-empty) set, and let f : Y → X be a function. Then the collection

\[ f_*(T) = \{ D \subset X : f^{-1}(D) \in T \} \]

is a topology on X. Moreover, f_*(T) is the strongest topology on X with respect to which the map f is continuous.
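The collection f_*(T) can be computed directly when Y and X are finite. The sketch below is only an illustration under that assumption; the names `powerset`, `pushforward_topology` and `is_topology`, as well as the example data, are hypothetical and do not come from the notes.

```python
from itertools import combinations

def powerset(points):
    """All subsets of a finite collection of points, as frozensets."""
    pts = list(points)
    return [frozenset(c) for r in range(len(pts) + 1)
            for c in combinations(pts, r)]

def pushforward_topology(f, Y, T, X):
    """Return f_*(T) = {D subset of X : f^{-1}(D) in T} for finite sets.

    f is a dict mapping each point of Y to a point of X.
    """
    T = {frozenset(s) for s in T}
    return {D for D in powerset(X)
            if frozenset(y for y in Y if f[y] in D) in T}

def is_topology(X, opens):
    """Check (top1)-(top3); pairwise checks suffice on a finite set."""
    X = frozenset(X)
    if frozenset() not in opens or X not in opens:
        return False
    return all(A | B in opens and A & B in opens
               for A in opens for B in opens)

# Hypothetical example: a three-point space mapped onto two points.
Y = [0, 1, 2]
T = [set(), {0}, {0, 1}, {0, 1, 2}]
X = ["p", "q"]
f = {0: "p", 1: "q", 2: "q"}
pushed = pushforward_topology(f, Y, T, X)
```

In this example the set {q} is not in `pushed`, because its preimage {1, 2} is not open in Y, while {p} is, because its preimage {0} is open.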

Proof. Since f^{-1}(∅) = ∅ and f^{-1}(X) = Y, it follows that ∅ and X both belong to f_*(T). If (A_i)_{i∈I} is a family of sets in f_*(T), then the sets f^{-1}(A_i), i ∈ I, belong to T. In particular the set

\[ f^{-1}\Big( \bigcup_{i \in I} A_i \Big) = \bigcup_{i \in I} f^{-1}(A_i) \]

will again belong to T, which means that ⋃_{i∈I} A_i belongs to f_*(T). Likewise, if A_1, A_2 ∈ f_*(T), then the sets f^{-1}(A_1) and f^{-1}(A_2) both belong to T. The intersection

\[ f^{-1}(A_1 \cap A_2) = f^{-1}(A_1) \cap f^{-1}(A_2) \]

will then belong to T, which means that A_1 ∩ A_2 again belongs to f_*(T).

Having proven that f_*(T) is a topology on X, let us now prove the second statement. The fact that f is continuous with respect to f_*(T) is clear by construction. If T′ is another topology on X which still makes f continuous, then this will force all the sets of the form f^{-1}(A), A ∈ T′, to belong to T, which means that every A ∈ T′ will in fact belong to f_*(T). In other words, we have the inclusion T′ ⊂ f_*(T).

A generalization of the above construction is given in the following.

Proposition 3.6. Let X be a set, and let Φ = (f_i, Y_i)_{i∈I} be a family consisting of maps f_i : Y_i → X, where Y_i is a topological space, for all i ∈ I. Then there is a unique topology T_Φ on X with the following properties:

(i) Each of the maps f_i : Y_i → X, i ∈ I, is continuous with respect to T_Φ.

(ii) Given a topological space (Z, S), and a map g : X → Z, such that the composition g ∘ f_i : Y_i → Z is continuous, for every i ∈ I, it follows that g is continuous as a map (X, T_Φ) → (Z, S).

Proof. For each i ∈ I, let T_i denote the topology on Y_i. We define

\[ T_\Phi = \bigcap_{i \in I} f_{i*}(T_i). \]

By Lemma 3.2 each f_{i*}(T_i) is a topology on X, and an intersection of topologies is again a topology. Property (i) is obvious by construction.

To prove property (ii), start with some topological space (Z, S) and a map g : X → Z such that g ∘ f_i : Y_i → Z is continuous, for all i ∈ I. Start with some open set D ⊂ Z, and let us prove that the set A = g^{-1}(D) is open in X, i.e. A ∈ T_Φ. Notice that, for each i ∈ I, one has

\[ f_i^{-1}(A) = f_i^{-1}\big( g^{-1}(D) \big) = (g \circ f_i)^{-1}(D), \]

so using the continuity of g ∘ f_i we get the fact that f_i^{-1}(A) is open in Y_i, which means that A ∈ f_{i*}(T_i). Since this is true for all i ∈ I, we then get A ∈ T_Φ.

To prove uniqueness, let T be another topology on X with properties (i) and (ii). Consider the map h = Id : (X, T) → (X, T_Φ). Using property (i) for T_Φ, combined with property (ii) for T, it follows that h is continuous, which means that T_Φ ⊂ T. Reversing the roles, and arguing exactly the same way, we also get the other inclusion T ⊂ T_Φ.

Comment. Using the notations above, it is immediate that the topology T_Φ can also be described as the strongest topology on X with respect to which all the maps f_i : Y_i → X, i ∈ I, are continuous. In the light of this remark, we will call the topology T_Φ the strong topology defined by Φ.
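For a finite family of finite spaces, the strong topology can be computed as the intersection of the topologies pushed forward along each map. The following sketch is an illustration only; the function names `pushforward` and `strong_topology` and the example family are assumptions, not notation from the notes.

```python
from itertools import combinations

def powerset(points):
    """All subsets of a finite collection of points, as frozensets."""
    pts = list(points)
    return [frozenset(c) for r in range(len(pts) + 1)
            for c in combinations(pts, r)]

def pushforward(f, Y, T, X):
    """{D subset of X : f^{-1}(D) in T}, with f given as a dict Y -> X."""
    T = {frozenset(s) for s in T}
    return {D for D in powerset(X)
            if frozenset(y for y in Y if f[y] in D) in T}

def strong_topology(X, family):
    """Intersection of the pushed-forward topologies of all (f, Y, T)."""
    opens = set(powerset(X))
    for f, Y, T in family:
        opens &= pushforward(f, Y, T, X)
    return opens

# Hypothetical family: two inclusion maps into X = {1, 2, 3}.
X = [1, 2, 3]
family = [
    ({1: 1, 2: 2}, [1, 2], [set(), {1}, {1, 2}]),  # Sierpinski-type piece
    ({3: 3}, [3], [set(), {3}]),                   # one-point piece
]
T_phi = strong_topology(X, family)
```

Since both maps are inclusions, a set D lands in `T_phi` exactly when its traces on {1, 2} and on {3} are open in the respective pieces.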

Example 3.2. (The disjoint union topology) Suppose we have a family (X_i, T_i), i ∈ I, of topological spaces. Consider the disjoint union³ X = ⨆_{i∈I} X_i. For each i ∈ I we consider the inclusion ι_i : X_i → X. The strong topology on X, defined by the family Φ = (ι_i)_{i∈I}, is called the disjoint union topology.

If we think of each X_i as a subset of X, then X_i is open in X, for all i ∈ I. Moreover, a set D ⊂ X is open if and only if D ∩ X_i is open (in X_i), for all i ∈ I. For a point x ∈ X, there exists a unique i(x) ∈ I with x ∈ X_{i(x)}. With this notation, an ultrafilter U on X is convergent to x if and only if X_{i(x)} ∈ U, and the collection

\[ U|_{X_{i(x)}} = \{ A \cap X_{i(x)} : A \in U \} \]

is an ultrafilter on X_{i(x)}, which converges to x.

³ Formally one uses the sets Z = ⋃_{i∈I} X_i and Y = I × Z, and one realizes the disjoint union as X = ⋃_{i∈I} {i} × X_i.



Lecture 4

4. Compactness

Definition. Let X be a topological space. A subset K ⊂ X is said to be a compact set in X if it has the finite open cover property:

(f.o.c) Whenever {D_i}_{i∈I} is a collection of open sets such that K ⊂ ⋃_{i∈I} D_i, there exists a finite sub-collection D_{i_1}, …, D_{i_n} such that

\[ K \subset D_{i_1} \cup \dots \cup D_{i_n}. \]

An equivalent description is the finite intersection property:

(f.i.p.) If {F_i}_{i∈I} is a collection of closed sets such that for any finite sub-collection F_{i_1}, …, F_{i_n} we have K ∩ F_{i_1} ∩ ⋯ ∩ F_{i_n} ≠ ∅, it follows that

\[ K \cap \bigcap_{i \in I} F_i \neq \emptyset. \]

A topological space (X, T ) is called compact if X itself is a compact set.

Remark 4.1. Suppose (X, T) is a topological space, and K is a subset of X. Equip K with the induced topology T|_K. Then it is straightforward from the definition that the following are equivalent:

• K is compact, as a subset in (X, T);
• (K, T|_K) is a compact space, that is, K is compact as a subset in (K, T|_K).

The following three results give methods of constructing compact sets.

Proposition 4.1. A finite union of compact sets is compact.

Proof. Immediate from the definition.

Proposition 4.2. Suppose (X, T ) is a topological space and K ⊂ X is a compact set. Then for every closed set F ⊂ X , the intersection F ∩ K is again compact.

Proof. Immediate, using the finite intersection property.

Proposition 4.3. Suppose (X, T) and (Y, S) are topological spaces, f : X → Y is a continuous map, and K ⊂ X is a compact set. Then f(K) is compact.

Proof. Immediate from the definition.

Besides the two equivalent conditions (f.o.c) and (f.i.p.), there are some other useful characterizations of compactness, listed in the following.

Theorem 4.1. Let (X, T ) be a topological space. The following are equivalent:

(i) X is compact.


(ii) (Alexander sub-base Theorem) There exists a sub-base S with the finite open cover property:

(s) For any collection {S_i | i ∈ I} ⊂ S with X = ⋃_{i∈I} S_i, there exists a finite sub-collection S_{i_1}, S_{i_2}, …, S_{i_n} (for some finite sequence of indices i_1, i_2, …, i_n ∈ I) such that X = S_{i_1} ∪ S_{i_2} ∪ ⋯ ∪ S_{i_n}.

(iii) Every ultrafilter in X is convergent.

Proof. (i) ⇒ (ii). This is obvious. (In fact, when X is compact, any sub-base has the finite open cover property.)

(ii) ⇒ (iii). Let U be an ultrafilter on X. Assume U is not convergent to any point x ∈ X. By Proposition 3.2 it follows that, for each x ∈ X, one can find a set S_x ∈ S with S_x ∋ x, but such that S_x ∉ U. Using property (s), one can find a finite collection of points x_1, …, x_n ∈ X such that

\[ S_{x_1} \cup \dots \cup S_{x_n} = X. \tag{1} \]

Since S_{x_p} ∉ U, it means that X \ S_{x_p} belongs to U, for every p = 1, …, n. Then, using (1), we get

\[ U \ni (X \setminus S_{x_1}) \cap \dots \cap (X \setminus S_{x_n}) = \emptyset, \]

which is impossible.

(iii) ⇒ (i). Assuming property (iii), we will show that X has the finite intersection property. Start with a family of closed sets {F_i}_{i∈I}, with the property that

\[ \bigcap_{i \in J} F_i \neq \emptyset, \quad \text{for every finite subset } J \subset I. \tag{2} \]

We want to prove that ⋂_{i∈I} F_i ≠ ∅. For every finite subset J ⊂ I we define the non-empty closed set F_J = ⋂_{i∈J} F_i. It is clear that

\[ F = \{ F_J : J \text{ finite subset of } I \} \]

is a filter. Let then U be an ultrafilter with U ⊃ F. By (iii) there exists some x ∈ X such that U → x, which means that U contains all neighborhoods of x. Start now with some arbitrary index i ∈ I. Since we clearly have F_i ∈ F ⊂ U, it follows that X \ F_i cannot belong to U, which means that X \ F_i is not a neighborhood of x. Since X \ F_i is already open, this forces x ∉ X \ F_i, which means that x ∈ F_i. Since this is true for all i ∈ I, it proves that the intersection ⋂_{i∈I} F_i contains x, so it is non-empty.

An interesting application of the above result is the following:

Theorem 4.2 (Tihonov). Suppose one has a family (X_i, T_i)_{i∈I} of compact topological spaces. Then the product space ∏_{i∈I} X_i is compact in the product topology.

Proof. We are going to use the ultrafilter characterization (iii) from the preceding Theorem. Let U be an ultrafilter on X = ∏_{i∈I} X_i. Denote by π_i : X → X_i, i ∈ I, the coordinate maps. Since each X_i is compact, it follows that, for every i ∈ I, the ultrafilter π_{i*}(U) (in X_i) is convergent to some point x_i ∈ X_i. If we form the element x = (x_i)_{i∈I} ∈ X, this means that π_{i*}(U) is convergent to π_i(x), for every i ∈ I. Then, by the ultrafilter characterization of the product topology (see section 3) it follows that U is convergent to x.




Comment. Another interesting application of Theorem 4.1 is the following construction. Suppose (X, T) is a compact Hausdorff space, and (x_i)_{i∈I} ⊂ X is an arbitrary family of elements. (Here I is an arbitrary set.) Suppose U is an ultrafilter on I. If we regard the family (x_i)_{i∈I} simply as a function f : I → X, then we can construct the ultrafilter f_*(U) on X. More explicitly

\[ f_*(U) = \{ W \subset X : \text{the set } \{ i \in I : x_i \in W \} \text{ belongs to } U \}. \]

Since X is compact Hausdorff, the ultrafilter f_*(U) is convergent to some unique point x ∈ X. This point is denoted by lim_U x_i.

We conclude this section with some results on compactness in Hausdorff spaces.

Proposition 4.4. Suppose (X, T) is a topological Hausdorff space.

(i) Any compact set K ⊂ X is closed.
(ii) If K is a compact set, then a subset F ⊂ K is compact if and only if F is closed (in X).

Proof. (i) The key step is contained in the following

Claim: For every x ∈ X \ K, there exists some open set D_x with x ∈ D_x ⊂ X \ K.

Fix x ∈ X \ K. For every y ∈ K, using the Hausdorff property, we can find two open sets U_y and V_y with U_y ∋ x, V_y ∋ y, and U_y ∩ V_y = ∅. Since we obviously have K ⊂ ⋃_{y∈K} V_y, by compactness, there exist points y_1, …, y_n ∈ K such that K ⊂ V_{y_1} ∪ ⋯ ∪ V_{y_n}. The claim immediately follows if we then define D_x = U_{y_1} ∩ ⋯ ∩ U_{y_n}.

Using the Claim we now see that we can write the complement of K as a union of open sets:

\[ X \setminus K = \bigcup_{x \in X \setminus K} D_x, \]

so X \ K is open, which means that K is indeed closed.

(ii) If F is closed, then F is compact by Proposition 4.2. Conversely, if F is compact, then by (i) F is closed.

Proposition 4.5. Every compact Hausdorff space is normal.

Proof. Let X be a compact Hausdorff space. Let A, B ⊂ X be two closed sets with A ∩ B = ∅. We need to find two open sets U, V ⊂ X, with A ⊂ U, B ⊂ V, and U ∩ V = ∅. We start with the following

Particular case: Assume B is a singleton, B = {b}.

The proof follows line by line the first part of the proof of part (i) from Proposition 4.4. For every a ∈ A we find open sets U_a and V_a such that U_a ∋ a, V_a ∋ b, and U_a ∩ V_a = ∅. Using Proposition 4.4 we know that A is compact, and since we clearly have A ⊂ ⋃_{a∈A} U_a, there exist a_1, …, a_n ∈ A such that U_{a_1} ∪ ⋯ ∪ U_{a_n} ⊃ A. Then we are done by taking U = U_{a_1} ∪ ⋯ ∪ U_{a_n} and V = V_{a_1} ∩ ⋯ ∩ V_{a_n}.

Having proven the above particular case, we proceed now with the general case. For every b ∈ B, we use the particular case to find two open sets U_b and V_b, with U_b ⊃ A, V_b ∋ b, and U_b ∩ V_b = ∅. Arguing as above, the set B is compact, and we have B ⊂ ⋃_{b∈B} V_b, so there exist b_1, …, b_n ∈ B such that V_{b_1} ∪ ⋯ ∪ V_{b_n} ⊃ B. Then we are done by taking U = U_{b_1} ∩ ⋯ ∩ U_{b_n} and V = V_{b_1} ∪ ⋯ ∪ V_{b_n}.



Lecture 5

5. Topology preliminaries V: Locally compact spaces

Definition. A locally compact space is a Hausdorff topological space with the property

(lc) Every point has a compact neighborhood.

One key feature of locally compact spaces is contained in the following:

Lemma 5.1. Let X be a locally compact space, let K be a compact set in X, and let D be an open subset, with K ⊂ D. Then there exists an open set E with:

(i) \overline{E} compact;
(ii) K ⊂ E ⊂ \overline{E} ⊂ D.

Proof. Let us start with the following

Particular case: Assume K is a singleton, K = {x}.

Start off by choosing a compact neighborhood N of x. Using the results from section 4, when equipped with the induced topology, the set N is normal. In particular, if we consider the closed sets A = {x} and B = N \ D (which are also closed in the induced topology), it follows that there exist sets U, V ⊂ N such that

• U ⊃ {x}, V ⊃ B, U ∩ V = ∅;
• U and V are open in the induced topology on N.

The second property means that there exist open sets U_0, V_0 ⊂ X such that U = N ∩ U_0 and V = N ∩ V_0. Let E = Int(U). By construction E is open, and E ∋ x. Also, since E ⊂ U ⊂ N, it follows that

\[ \overline{E} \subset \overline{N} = N. \tag{1} \]

In particular this gives the compactness of \overline{E}. Finally, since we obviously have

\[ E \cap V_0 \subset U \cap V_0 = N \cap U_0 \cap V_0 = U \cap V = \emptyset, \]

we get E ⊂ X \ V_0, so using the fact that X \ V_0 is closed, we also get the inclusion \overline{E} ⊂ X \ V_0. Finally, combining this with (1) and with the inclusion N \ D ⊂ V ⊂ V_0, we will get

\[ \overline{E} \subset N \cap (X \setminus V_0) \subset N \setminus (N \setminus D) \subset D, \]

and we are done.

Having proven the particular case, we proceed now with the general case. For every x ∈ K we use the particular case to find an open set E(x), with \overline{E(x)} compact, and such that x ∈ E(x) ⊂ \overline{E(x)} ⊂ D. Since we clearly have K ⊂ ⋃_{x∈K} E(x), by compactness, there exist x_1, …, x_n ∈ K such that K ⊂ E(x_1) ∪ ⋯ ∪ E(x_n). Notice that if we take E = E(x_1) ∪ ⋯ ∪ E(x_n), then we clearly have

\[ K \subset E \subset \overline{E} \subset \overline{E(x_1)} \cup \dots \cup \overline{E(x_n)} \subset D, \]

and we are done.

One of the most useful results in the analysis on locally compact spaces is the following.

Theorem 5.1 (Urysohn’s Lemma for locally compact spaces). Let X be a locally compact space, and let K, F ⊂ X be two disjoint sets, with K compact and F closed. Then there exists a continuous function f : X → [0, 1] such that f|_K = 1 and f|_F = 0.

Proof. Apply Lemma 5.1 for the pair K ⊂ X \ F and find an open set E, with \overline{E} compact, such that K ⊂ E ⊂ \overline{E} ⊂ X \ F. Apply again Lemma 5.1 for the pair K ⊂ E and find another open set G, with \overline{G} compact, such that K ⊂ G ⊂ \overline{G} ⊂ E.

Let us work for the moment in the space \overline{E} (equipped with the induced topology). This is a compact Hausdorff space, hence it is normal. In particular, using Urysohn Lemma (see section 1) there exists a continuous function g : \overline{E} → [0, 1] such that g|_K = 1 and g|_{\overline{E} \ G} = 0. Let us now define the function f : X → [0, 1] by

\[ f(x) = \begin{cases} g(x) & \text{if } x \in \overline{E} \\ 0 & \text{if } x \in X \setminus \overline{E} \end{cases} \]

Notice that f|_E = g|_E, so f|_E is continuous. If we take the open set A = X \ \overline{G}, then it is also clear that f|_A = 0. So now we have two open sets E and A, with A ∪ E = X, and f|_A and f|_E both continuous. Then it is clear that f is continuous.

The other two properties f|_K = 1 and f|_F = 0 are obvious.
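On the real line a function with these properties can be written down explicitly. The sketch below is an illustration only, not the construction used in the proof: it assumes X = ℝ, K = [−1, 1] and F = { x : |x| ≥ 2 }, and the name `urysohn_bump` is hypothetical.

```python
def urysohn_bump(x):
    """Continuous f : R -> [0, 1] with f = 1 on [-1, 1] and f = 0 for |x| >= 2.

    Between the two sets the function interpolates linearly.
    """
    return min(1.0, max(0.0, 2.0 - abs(x)))

# Sample values on K, on F, and in the gap between them.
on_K = [urysohn_bump(x) for x in (-1.0, 0.0, 0.5, 1.0)]
on_F = [urysohn_bump(x) for x in (-5.0, -2.0, 2.0, 3.5)]
in_gap = urysohn_bump(1.5)
```

The formula works here because the distance to K is available; in a general locally compact space no such formula exists, which is why the proof goes through Lemma 5.1 and normality.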

We now discuss an important notion which makes the link between locally compact spaces and compact spaces.

Definition. Let X be a locally compact space. By a compactification of X one means a pair (θ, T) consisting of a compact Hausdorff space T, and of a continuous map θ : X → T, with the following properties:

(i) θ(X) is a dense open subset of T;
(ii) when we equip θ(X) with the induced topology, the map θ : X → θ(X) is a homeomorphism.

Notice that, when X is already compact, any compactification (θ, T) of X is necessarily made up of a compact space T, and a homeomorphism θ : X → T.

Example 5.1 (Alexandrov compactification). Suppose X is a locally compact space, which is not compact. We form a disjoint union with a singleton, X^α = X ⊔ {∞}, and we equip the space X^α with the topology in which a subset D ⊂ X^α is declared to be open if either D is an open subset of X, or there exists some compact subset K ⊂ X such that D = (X \ K) ⊔ {∞}. Define the inclusion map ι : X → X^α. Then (ι, X^α) is a compactification of X, which is called the Alexandrov compactification. The fact that ι(X) is open in X^α, and ι : X → ι(X) is a homeomorphism, is clear. The density of ι(X) in X^α is also clear, since every open set D ⊂ X^α, with D ∋ ∞, is of the form (X \ K) ⊔ {∞}, for some compact set K ⊂ X, and then we have D ∩ ι(X) = ι(X \ K), which is non-empty, because X is not compact.

Remark that, if X is already compact, we can still define the topological space X^α = X ⊔ {∞}, but this time the singleton set {∞} will also be open. Although ι(X) will still be open in X^α, it will not be dense in X^α.




One should regard the Alexandrov compactification as a minimal one. It turns out that there exists another compactification, described below, which can be regarded as the largest.

Theorem 5.2 (Stone-Cech). Let X be a locally compact space. Consider the set

\[ F = \{ f : X \to [0,1] : f \text{ continuous} \}, \]

and consider the product space

\[ T = \prod_{f \in F} [0,1], \]

equipped with the product topology, and define the map θ : X → T by

\[ \theta(x) = \big( f(x) \big)_{f \in F}, \quad \forall\, x \in X. \]

Equip the closure \overline{θ(X)} with the topology induced from T. Then the pair (θ, \overline{θ(X)}) is a compactification of X.

Proof. For every f ∈ F, let us denote by π_f : T → [0, 1] the coordinate map. Remark that θ : X → T is continuous. This is immediate from the definition of the product topology, since the continuity of θ is equivalent to the continuity of all compositions π_f ∘ θ, f ∈ F. The fact that these compositions are continuous is however trivial, since we have π_f ∘ θ = f, ∀ f ∈ F.

Denote for simplicity \overline{θ(X)} by B. By Tihonov’s Theorem, the space T is compact (and obviously Hausdorff), so the set B is compact as well, being a closed subset of T. By construction, θ(X) is dense in B, and θ is continuous.

At this point, it is interesting to point out the following property.

Claim 1: For every f ∈ F, there exists a unique continuous map \tilde{f} : B → [0, 1] such that \tilde{f} ∘ θ = f.

The uniqueness is trivial, since θ(X) is dense in B. The existence is also trivial, because we can take \tilde{f} = π_f|_B.

We can show now that θ is injective. If x, y ∈ X are such that x ≠ y, then using Urysohn Lemma we can find f ∈ F such that f(x) ≠ f(y). The function \tilde{f} given by Claim 1 clearly satisfies

\[ \tilde{f}\big( \theta(x) \big) = f(x) \neq f(y) = \tilde{f}\big( \theta(y) \big), \]

which forces θ(x) ≠ θ(y).

In order to show that θ(X) is open in B, we need some preparations. For every compact subset K ⊂ X, we define

\[ F_K = \{ f : X \to [0,1] : f \text{ continuous},\ f|_{X \setminus K} = 0 \}. \]

One key observation is the following.

Claim 2: If K ⊂ X is compact, and if f ∈ F_K, then the continuous function \tilde{f} : B → [0, 1], given by Claim 1, has the property \tilde{f}|_{B \ θ(K)} = 0.

We start with some α ∈ B \ θ(K), and we use Urysohn Lemma to find some continuous function φ : B → [0, 1] such that φ(α) = 1 and φ|_{θ(K)} = 0. Consider the function ψ = φ · \tilde{f}. Notice that (φ ∘ θ)|_K = 0, which combined with the fact that f|_{X \ K} = 0, gives

\[ \psi \circ \theta = (\phi \circ \theta) \cdot (\tilde{f} \circ \theta) = (\phi \circ \theta) \cdot f = 0, \]

so using Claim 1 (the uniqueness part), we have ψ = 0. In particular, since φ(α) = 1, this forces \tilde{f}(α) = 0, thus proving the Claim.

We define now the collection

\[ F_c = \bigcup_{K \subset X,\ K \text{ compact}} F_K. \]

Define the set

\[ S = \bigcap_{f \in F_c} \pi_f^{-1}(\{0\}). \]

By the definition of the product topology, it follows that S is closed in T. The fact that θ(X) is open in B is then a consequence of the following fact.

Claim 3: One has the equality θ(X) = B \ S.

Start first with some point x ∈ X, and let us show that θ(x) ∉ S. Choose some open set D ⊂ X, with \overline{D} compact, such that D ∋ x, and apply Urysohn Lemma to find some continuous map f : X → [0, 1] such that f(x) = 1 and f|_{X \ D} = 0. It is clear that f ∈ F_{\overline{D}} ⊂ F_c, but π_f(θ(x)) = f(x) = 1 ≠ 0, which means that θ(x) ∉ π_f^{-1}({0}), hence θ(x) ∉ S. Conversely, let us start with some point α = (α_f)_{f∈F} ∈ B \ S, and let us prove that α ∈ θ(X). Since α ∉ S, there exists some f ∈ F_c such that π_f(α) > 0. Since f ∈ F_c, there exists some compact subset K ⊂ X such that f|_{X \ K} = 0. Using Claim 2, we know that \tilde{f}|_{B \ θ(K)} = 0. Since \tilde{f}(α) = π_f(α) ≠ 0, this forces α ∈ θ(K) ⊂ θ(X).

To finish the proof of the Theorem, all we need to prove now is the fact that θ : X → θ(X) is a homeomorphism, which amounts to proving that, whenever D ⊂ X is open, it follows that θ(D) is open in B. Fix an open subset D ⊂ X. In order to show that θ(D) is open in B, we need to show that θ(D) is a neighborhood for each of its points. Fix some point α ∈ θ(D), i.e. α = θ(x), for some x ∈ D. Choose some compact subset K ⊂ D such that x ∈ Int(K), and apply Urysohn Lemma to find a function f ∈ F_K with f(x) = 1. Consider the continuous function \tilde{f} : B → [0, 1] given by Claim 1, and apply Claim 2 to conclude that \tilde{f}|_{B \ θ(K)} = 0. In particular the open set

\[ N = \tilde{f}^{-1}\big( (1/2, \infty) \big) \subset B \]

is contained in θ(K) ⊂ θ(D). Since \tilde{f}(α) = f(x) = 1, we clearly have α ∈ N.

Definition. The compactification (θ, \overline{θ(X)}), constructed in the above Theorem, is called the Stone-Cech compactification of X. The space \overline{θ(X)} will be denoted by X^β. Using the map θ, we shall identify from now on X with a dense open subset of X^β. Remark that if X is compact, then X^β = X.

Comment. The Stone-Cech compactification is an inherent “Zorn Lemma type” construction. For example, if X is a non-compact locally compact space, and if U is an ultrafilter on X, then either U is convergent to a point in X (this happens when U contains at least one compact subset of X), or U produces a point in X^β \ X. If θ : X → X^β denotes the inclusion map, then one considers the ultrafilter θ_*U on X^β, and by compactness this ultrafilter converges to some (unique) point in X^β. This way one gets a correspondence

\[ \lim\nolimits_X : \{ U \subset P(X) : U \text{ ultrafilter on } X \} \to X^\beta. \]

This correspondence is surjective. The injectivity obstruction is characterized as follows. For two ultrafilters U_1, U_2, the condition lim_X(U_1) ≠ lim_X(U_2) is equivalent to the existence of two disjoint open sets D_1 ∈ U_1 and D_2 ∈ U_2.

Exercise 1. Suppose a set X is equipped with the discrete topology. Prove that the correspondence lim_X is bijective.

The Stone-Cech compactification is functorial, in the following sense.

Proposition 5.1. If X and Y are locally compact spaces, and if Φ : X → Y is a continuous map, then there exists a unique continuous map Φ^β : X^β → Y^β such that Φ^β|_X = Φ.

Proof. We use the notations from Theorem 5.2. Define

\[ F = \{ f : X \to [0,1] : f \text{ continuous} \} \quad \text{and} \quad G = \{ g : Y \to [0,1] : g \text{ continuous} \}, \]

the product spaces

\[ T_X = \prod_{f \in F} [0,1] \quad \text{and} \quad T_Y = \prod_{g \in G} [0,1], \]

as well as the maps θ_X : X → T_X and θ_Y : Y → T_Y, defined by

\[ \theta_X(x) = \big( f(x) \big)_{f \in F}, \ \forall\, x \in X; \qquad \theta_Y(y) = \big( g(y) \big)_{g \in G}, \ \forall\, y \in Y. \]

With these notations, we have X^β = \overline{θ_X(X)} ⊂ T_X and Y^β = \overline{θ_Y(Y)} ⊂ T_Y. Using the fact that we have a correspondence G ∋ g ⟼ g ∘ Φ ∈ F, we define the map

\[ \Psi : T_X \ni (\alpha_f)_{f \in F} \longmapsto (\alpha_{g \circ \Phi})_{g \in G} \in T_Y. \]

Remark that Ψ is continuous. This fact is pretty obvious, because when we compose with the coordinate projections π_g : T_Y → [0, 1], g ∈ G, we have π_g ∘ Ψ = π_{g∘Φ}, where π_{g∘Φ} : T_X → [0, 1] is the coordinate projection, which is automatically continuous. Remark that if we start with some point x ∈ X, then

\[ \Psi\big( \theta_X(x) \big) = \big( (g \circ \Phi)(x) \big)_{g \in G} = \theta_Y\big( \Phi(x) \big), \tag{2} \]

which means that we have the equality Ψ ∘ θ_X = θ_Y ∘ Φ. Remark first that, since Y^β is closed, it follows that Ψ^{-1}(Y^β) is closed in T_X. Second, using (2), we clearly have the inclusion θ_X(X) ⊂ Ψ^{-1}(θ_Y(Y)) ⊂ Ψ^{-1}(Y^β), so using the fact that Ψ^{-1}(Y^β) is closed, we get the inclusion

\[ X^\beta = \overline{\theta_X(X)} \subset \Psi^{-1}(Y^\beta). \]

In other words, we get now a continuous map Φ^β = Ψ|_{X^β} : X^β → Y^β, which clearly satisfies Φ^β ∘ θ_X = θ_Y ∘ Φ, which using our conventions means that Φ^β|_X = Φ. The uniqueness is obvious, by the density of X in X^β.

Exercise 2. The Alexandrov compactification is not functorial. In other words, given locally compact spaces X and Y, and a continuous map f : X → Y, in general there does not exist a continuous map f^α : X^α → Y^α with f^α|_X = f. Give an example of such a situation.

Hint: Consider X = Y = N, equipped with the discrete topology, and define f : N → N by

\[ f(n) = \begin{cases} 1 & \text{if } n \text{ is odd} \\ 2 & \text{if } n \text{ is even} \end{cases} \]

It turns out that one can define a certain type of continuous maps, with respect to which the Alexandrov compactification is functorial.




Definition. Let X, Y be locally compact spaces, and let Φ : X → Y be a continuous map. We say that Φ is proper if it satisfies the condition

\[ K \subset Y \text{ compact} \ \Rightarrow\ \Phi^{-1}(K) \text{ compact in } X. \]

The following interesting property of proper maps will be exploited later.

Proposition 5.2. Let X, Y be locally compact spaces, let Φ : X → Y be a proper continuous map, and let T ⊂ X be a closed subset. Then the set Φ(T) is closed in Y.

Proof. Start with some point y ∈ \overline{Φ(T)}. This means that

\[ D \cap \Phi(T) \neq \emptyset, \quad \text{for every open set } D \subset Y \text{ with } D \ni y. \tag{3} \]

Denote by \mathcal{V} the collection of all compact neighborhoods of y. In other words, V ∈ \mathcal{V} if and only if V ⊂ Y is compact and y ∈ Int(V). For each V ∈ \mathcal{V} we define the set \tilde{V} = Φ^{-1}(V) ∩ T. Since Φ is proper and T is closed, all sets \tilde{V}, V ∈ \mathcal{V}, are compact. Notice also that, for every finite number of sets V_1, …, V_n ∈ \mathcal{V}, if we form the intersection V = V_1 ∩ ⋯ ∩ V_n, then V ∈ \mathcal{V}, and \tilde{V} ⊂ \tilde{V}_j, ∀ j = 1, …, n. Remark now that, by (3), we have \tilde{V} ≠ ∅, ∀ V ∈ \mathcal{V}. Indeed, if we start with some V ∈ \mathcal{V}, then (3) applied to D = Int(V) lets us choose some point x ∈ T such that Φ(x) ∈ V, and then x ∈ \tilde{V}. Use now the finite intersection property to get the fact that ⋂_{V∈\mathcal{V}} \tilde{V} ≠ ∅. Pick now a point x ∈ ⋂_{V∈\mathcal{V}} \tilde{V}. This means that x ∈ T, and

\[ \Phi(x) \in V, \quad \forall\, V \in \mathcal{V}. \tag{4} \]

But now we are done, because this forces Φ(x) = y. Indeed, if Φ(x) ≠ y, using the Hausdorff property, one could find some V ∈ \mathcal{V} with Φ(x) ∉ V, thus contradicting (4).

Exercise 3. Let X be a locally compact space, which is non-compact, let Y be another locally compact space, and let Φ : X → Y be a proper continuous map.

(i) If Y is non-compact, prove that there exists a unique continuous map Φ^α : X^α → Y^α with Φ^α|_X = Φ.
(ii) If Y is compact, prove that there exists a unique continuous map Ψ : X^α → Y with Ψ|_X = Φ.

Hint: In case (i) define Φ^α(∞) = ∞. In case (ii) consider the collection

\[ \mathcal{W} = \{ \Phi(T) : T \subset X \text{ closed, with } X \setminus T \text{ compact} \}. \]

Use the above result, combined with the finite intersection property, to pick a point y ∈ ⋂_{W∈\mathcal{W}} W. Define Ψ(∞) = y.



Lecture 6

6. Metric spaces

In this section we review the basic facts about metric spaces.

Definitions. A metric on a non-empty set X is a map

\[ d : X \times X \to [0, \infty) \]

with the following properties:

(i) for x, y ∈ X, one has d(x, y) = 0 if and only if x = y;
(ii) d(x, y) = d(y, x), for all x, y ∈ X;
(iii) d(x, y) ≤ d(x, z) + d(y, z), for all x, y, z ∈ X.

A metric space is a pair (X, d), where X is a set, and d is a metric on X.
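The axioms can be tested mechanically on a finite sample of points. The following sketch is only a sanity check under stated assumptions: the function name `is_metric`, the tolerance `tol` and the two candidate distance functions are hypothetical, and the check treats property (i) in the standard two-sided form (d(x, y) = 0 exactly when x = y). Passing on a sample does not prove that a function is a metric on the whole space.

```python
from itertools import product
import math

def is_metric(points, d, tol=1e-12):
    """Check the metric axioms for d on a finite sample of points."""
    for x, y in product(points, repeat=2):
        if d(x, y) < 0:
            return False                      # values must lie in [0, inf)
        if (d(x, y) == 0) != (x == y):
            return False                      # property (i)
        if abs(d(x, y) - d(y, x)) > tol:
            return False                      # property (ii), symmetry
    for x, y, z in product(points, repeat=3):
        if d(x, y) > d(x, z) + d(y, z) + tol:
            return False                      # property (iii), triangle inequality
    return True

# Euclidean distance in the plane satisfies the axioms on this sample.
plane_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 3.0)]
euclidean_ok = is_metric(plane_points, math.dist)

# The squared gap on the reals fails the triangle inequality:
# d(0, 2) = 4, while d(0, 1) + d(2, 1) = 2.
squared_gap_ok = is_metric([0, 1, 2], lambda x, y: (x - y) ** 2)
```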

Notations. If (X, d) is a metric space, then for any point x ∈ X and any r > 0, we define the open and closed balls:

\[ B_r(x) = \{ y \in X : d(x,y) < r \}, \qquad \overline{B}_r(x) = \{ y \in X : d(x,y) \le r \}. \]

Definition. Suppose (X, d) is a metric space. Then X carries a natural topology constructed as follows. We say that a set D ⊂ X is open if it has the property:

• for every x ∈ D, there exists some r_x > 0 such that B_{r_x}(x) ⊂ D.

One can prove that the collection

\[ T_d = \{ D \subset X : D \text{ open} \} \]

is indeed a topology, i.e. we have

• ∅ and X are open;
• if (D_i)_{i∈I} is a family of open sets, then ⋃_{i∈I} D_i is again open;
• if D_1 and D_2 are open, then D_1 ∩ D_2 is again open.

The topology thus constructed is called the metric topology.

Remark 6.1. Let (X, d) be a metric space. Then for every p ∈ X, and for every r > 0, the set B_r(p) is open, and the set \overline{B}_r(p) is closed.

If we start with some x ∈ B_r(p), and if we define r_x = r − d(x, p), then for every y ∈ B_{r_x}(x) we will have

\[ d(y,p) \le d(y,x) + d(x,p) < r_x + d(x,p) = r, \]

so y belongs to B_r(p). This means that B_{r_x}(x) ⊂ B_r(p). Since this is true for all x ∈ B_r(p), it follows that B_r(p) is indeed open.

To prove that \overline{B}_r(p) is closed, we need to show that its complement

\[ X \setminus \overline{B}_r(p) = \{ x \in X : d(x,p) > r \} \]

is open. If we start with some x ∈ X \ \overline{B}_r(p), and if we define ρ_x = d(p, x) − r, then for every y ∈ B_{ρ_x}(x) we will have

\[ d(y,p) \ge d(p,x) - d(y,x) > d(p,x) - \rho_x = r, \]

so y belongs to X \ \overline{B}_r(p). This means that B_{ρ_x}(x) ⊂ X \ \overline{B}_r(p). Since this is true for all x ∈ X \ \overline{B}_r(p), it follows that X \ \overline{B}_r(p) is indeed open.
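The choice of radius r_x = r − d(x, p) in the first half of the argument can be spot-checked numerically in the Euclidean plane. This is a rough sketch under assumptions that are not in the notes: the function name `check_ball_inclusion`, the sampling scheme and the safety factor 0.999 (which keeps sampled points away from the boundary, where floating-point rounding could matter) are all illustrative choices.

```python
import math
import random

def check_ball_inclusion(p, r, x, samples=1000, seed=0):
    """Sample points y in B_{r_x}(x), r_x = r - d(x, p), and test y in B_r(p)."""
    rng = random.Random(seed)
    r_x = r - math.dist(x, p)
    if r_x <= 0:
        raise ValueError("x must lie in the open ball B_r(p)")
    for _ in range(samples):
        # Random point of the plane at distance less than r_x from x.
        radius = 0.999 * r_x * rng.random()
        angle = 2.0 * math.pi * rng.random()
        y = (x[0] + radius * math.cos(angle), x[1] + radius * math.sin(angle))
        if not math.dist(y, p) < r:
            return False
    return True

inclusion_holds = check_ball_inclusion(p=(0.0, 0.0), r=1.0, x=(0.5, 0.5))
```

A sampled check of this kind can only fail to find a counterexample; the triangle inequality is what actually proves the inclusion.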

Remark 6.2. The metric topology on a metric space (X, d) is Hausdorff. Indeed, if we start with two points x, y ∈ X, with x ≠ y, then if we choose r to be a real number with

\[ 0 < r < \frac{d(x,y)}{2}, \]

then we have B_r(x) ∩ B_r(y) = ∅. (Otherwise, if we have a point z ∈ B_r(x) ∩ B_r(y), we would have 2r < d(x, y) ≤ d(x, z) + d(y, z) < 2r, which is impossible.)

Remark 6.3. Let (X, d) be a metric space, and let M be a subset of X. Then d|_{M×M} is a metric on M, and the metric topology on M defined by this metric is precisely the induced topology from X. This means that a set A ⊂ M is open in M if and only if there exists some open set D ⊂ X with A = M ∩ D.

The metric space framework is particularly convenient because one can use convergence.

Definition. Let (X, d) be a metric space. For a point x ∈ X, we say that a sequence (x_n)_{n≥1} ⊂ X is convergent to x if lim_{n→∞} d(x_n, x) = 0.

Remark 6.4. If (X, d) is a metric space, and if the sequence (x_n)_{n≥1} ⊂ X is convergent to some point x ∈ X, then

\[ \lim_{n \to \infty} d(x_n, y) = d(x, y), \quad \forall\, y \in X. \tag{1} \]

This is an immediate consequence of the inequalities

\[ d(x,y) - d(x_n,x) \le d(x_n,y) \le d(x,y) + d(x_n,x). \]

Among other things, the equality (1) gives the fact that (x_n)_{n≥1} cannot be convergent to any other point y ≠ x. Therefore, if (x_n)_{n≥1} is convergent to some x, then x is uniquely determined, and will be denoted by lim_{n→∞} x_n.

Convergence is useful for characterizing closure.

Proposition 6.1. Let (X, d) be a metric space, and let A ⊂ X be a non-empty subset. For a point x ∈ X, the following are equivalent:

(i) x belongs to the closure \overline{A} of A;
(ii) there exists some sequence (x_n)_{n≥1} ⊂ A, with lim_{n→∞} x_n = x.

Proof. (i) ⇒ (ii). Assume x ∈ \overline{A}. This means that

(∗) For every open set D ⊂ X with D ∋ x, the intersection D ∩ A is non-empty.

We use this property for the open sets B_{1/n}(x), n = 1, 2, …. So, for every integer n ≥ 1, we can find a point x_n ∈ B_{1/n}(x) ∩ A. This way we have built a sequence (x_n)_{n≥1} ⊂ A such that

\[ d(x_n, x) < \frac{1}{n}, \quad \forall\, n \ge 1. \]

It is clear that this gives x = lim_{n→∞} x_n.

(ii) ⇒ (i). Assume x satisfies property (ii). Fix (x_n)_{n≥1} ⊂ A to be a sequence with lim_{n→∞} x_n = x. We need to prove property (∗). Start with some arbitrary open set D ⊂ X, with x ∈ D. Let ε > 0 be chosen such that B_ε(x) ⊂ D. Since lim_{n→∞} d(x_n, x) = 0, there exists some n_ε such that d(x_{n_ε}, x) < ε. It is now clear that

\[ x_{n_\varepsilon} \in B_\varepsilon(x) \cap A \subset D \cap A, \]

so the intersection D ∩ A is indeed non-empty.
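A concrete instance of the construction in (i) ⇒ (ii) may help. The sketch below assumes X = ℝ with the usual distance, A = (0, 1) and the closure point x = 1, which does not belong to A; the explicit choice x_n = 1 − 1/(2n) and the function name `approximating_sequence` are illustrative assumptions.

```python
def approximating_sequence(n_terms):
    """Points x_n = 1 - 1/(2n) of A = (0, 1), chosen inside B_{1/n}(1)."""
    return [1.0 - 1.0 / (2 * n) for n in range(1, n_terms + 1)]

xs = approximating_sequence(50)

# Each x_n lies in A, and its distance to the limit point 1 is below 1/n.
all_in_A = all(0.0 < x < 1.0 for x in xs)
all_close = all(abs(x - 1.0) < 1.0 / n for n, x in enumerate(xs, start=1))
```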

Continuity can be characterized using convergence, as follows.

Proposition 6.2. Let X and Y be metric spaces, and let f : X → Y be a function. For a point p ∈ X, the following are equivalent:

(i) f is continuous at p;
(ii) for every ε > 0, there exists some δ_ε > 0 such that

\[ d\big( f(x), f(p) \big) < \varepsilon, \quad \text{for all } x \in X \text{ with } d(x,p) < \delta_\varepsilon; \]

(iii) if (x_n)_{n≥1} ⊂ X is a sequence with lim_{n→∞} x_n = p, then lim_{n→∞} f(x_n) = f(p).

Proof. (i) ⇒ (ii). The condition that f is continuous at p means

(∗) for every open set D ⊂ Y, with D ∋ f(p), there exists some open set E ⊂ X, with p ∈ E ⊂ f^{-1}(D).

Assume f is continuous at p. For every ε > 0, we consider the open ball B^Y_ε(f(p)). Using (∗), there exists some open set E ⊂ X, with E ∋ p, and f(E) ⊂ B^Y_ε(f(p)). In particular, there exists δ > 0 such that B^X_δ(p) ⊂ E, so now we have

\[ f\big( B^X_\delta(p) \big) \subset B^Y_\varepsilon\big( f(p) \big), \]

which clearly gives (ii).

(ii) ⇒ (iii). Assume f satisfies (ii), and start with some sequence (x_n)_{n≥1} ⊂ X which converges to p. For every ε > 0, we choose δ_ε > 0 as in (ii), and using the fact that lim_{n→∞} x_n = p, we can also choose some N_ε such that

\[ d(x_n, p) < \delta_\varepsilon, \quad \forall\, n \ge N_\varepsilon. \]

Using (ii) this will give

\[ d\big( f(x_n), f(p) \big) < \varepsilon, \quad \forall\, n \ge N_\varepsilon. \]

In other words, we get the fact that

\[ \lim_{n \to \infty} d\big( f(x_n), f(p) \big) = 0, \]

which means that we indeed have lim_{n→∞} f(x_n) = f(p).

(iii) ⇒ (i). Assume f satisfies (iii), but f is not continuous at p. By (∗) this means that there exists some open set D_0 ⊂ Y with D_0 ∋ f(p), such that

(∗∗) for every open set E ⊂ X with E ∋ p, we have f(E) ⊄ D_0.

It is clear that any other open set D, with f(p) ∈ D ⊂ D_0, will again satisfy property (∗∗). Fix then some r > 0 such that B^Y_r(f(p)) ⊂ D_0. Using condition (∗∗) it follows that for every integer n ≥ 1, we have

\[ f\big( B^X_{1/n}(p) \big) \not\subset B^Y_r\big( f(p) \big). \]

This means that, for every integer n ≥ 1, we can find a point x_n ∈ X such that

\[ d(x_n, p) < \frac{1}{n} \quad \text{and} \quad d\big( f(x_n), f(p) \big) \ge r. \]

It is then clear that the sequence (x_n)_{n≥1} ⊂ X is convergent to p, but the sequence (f(x_n))_{n≥1} ⊂ Y is not convergent to f(p). This will contradict (iii).

Convergence can also be used for characterizing compactness.

Theorem 6.1. Let (X, d) be a metric space. The following are equivalent:

(i) $X$ is compact in the metric topology;

(ii) every sequence has a convergent subsequence.

Proof. (i) ⇒ (ii). Assume $X$ is compact. Start with an arbitrary sequence $(x_n)_{n\ge 1} \subset X$. For every $n \ge 1$, we define the closed set

\[ T_n = \overline{\{x_k : k > n\}}. \]

It is obvious that the family of closed sets $(T_n)_{n\ge 1}$ has the finite intersection property, i.e. for every finite set $F$ of indices, we have

\[ \bigcap_{n\in F} T_n \neq \varnothing. \]

(This follows from the fact that the $T_n$'s form a decreasing sequence of sets.) By compactness, it follows that

\[ \bigcap_{n\ge 1} T_n \neq \varnothing. \]

Take a point $x \in \bigcap_{n\ge 1} T_n$. The key feature of $x$ is given by the following:

Claim 1: For every $\varepsilon > 0$ and every integer $\ell \ge 1$, there exists some integer $N(\varepsilon, \ell) > \ell$ such that $d(x_{N(\varepsilon,\ell)}, x) < \varepsilon$.

This is a consequence of the fact that, for every $\ell \ge 1$, the point $x$ belongs to the closure $\overline{\{x_N : N > \ell\}}$, so for every $\varepsilon > 0$ we have

\[ B_\varepsilon(x) \cap \{x_N : N > \ell\} \neq \varnothing. \]

Using Claim 1, we define a sequence $(k_n)_{n\ge 0}$ of integers, recursively by

\[ k_n = N\big(\tfrac{1}{n}, k_{n-1}\big), \ \forall\, n \ge 1. \]

(The initial term $k_0$ is chosen arbitrarily.) We have, by construction, $k_0 < k_1 < k_2 < \dots$, and

\[ d(x_{k_n}, x) < \frac{1}{n}, \ \forall\, n \ge 1, \]

so $(x_{k_n})_{n\ge 1}$ is indeed a subsequence of $(x_k)_{k\ge 1}$, which is convergent (to $x$).

(ii) ⇒ (i). Assume (ii). Before we start proving that $X$ is compact, we shall need some preparations.

Claim 2: For every $r > 0$ there exists a finite set $F \subset X$, such that

\[ X = \bigcup_{x\in F} B_r(x). \]

We prove this by contradiction. Assume there exists some $r > 0$, such that

\[ \bigcup_{x\in F} B_r(x) \subsetneq X, \]

for every finite set $F \subset X$. In particular, there exists a sequence $(x_n)_{n\ge 1}$ such that

\[ x_{n+1} \in X \setminus \big[B_r(x_1) \cup \dots \cup B_r(x_n)\big], \ \forall\, n \ge 1. \]

This will force

\[ d(x_m, x_n) \ge r, \ \forall\, m > n \ge 1. \]

Notice that every subsequence $(x_{k_n})_{n\ge 1}$ will satisfy the same property

\[ d(x_{k_m}, x_{k_n}) \ge r, \ \forall\, m > n \ge 1. \]

This proves that no subsequence of $(x_n)_{n\ge 1}$ is Cauchy, so no subsequence of $(x_n)_{n\ge 1}$ can be convergent, thus contradicting (ii).

Having proven Claim 2, we choose, for every integer $n \ge 1$, a finite set $F_n$ such that

\[ X = \bigcup_{x\in F_n} B_{1/n}(x). \]

Claim 3: The collection $\mathcal{W} = \big\{B_{1/n}(x) : n \in \mathbb{N},\ x \in F_n\big\}$ is a base for the metric topology.

What we need to show is that every open set is a union of sets in $\mathcal{W}$. Fix an open set $D$ and a point $p \in D$. Choose $r > 0$, such that $B_r(p) \subset D$. Choose then some integer $n \ge 1$, such that $\frac{1}{n} < \frac{r}{2}$, and choose some point $x \in F_n$, such that $p \in B_{1/n}(x)$. Notice that, for every $y \in B_{1/n}(x)$, we have

\[ d(y, p) \le d(y, x) + d(x, p) < \frac{1}{n} + \frac{1}{n} \le r, \]

which proves that $y \in B_r(p)$. Therefore we have

\[ p \in B_{1/n}(x) \subset B_r(p) \subset D. \]

Since $p \in D$ is arbitrary, this proves that $D$ is a union of sets in $\mathcal{W}$.

We now begin proving that $X$ is compact. Start with a collection $(D_i)_{i\in I}$ of open sets, with $\bigcup_{i\in I} D_i = X$. We need to find a finite set of indices $I_0 \subset I$, such that $\bigcup_{i\in I_0} D_i = X$. First we show that:

Claim 4: There exists a countable set of indices $I_1 \subset I$, such that

\[ \bigcup_{i\in I_1} D_i = X. \]

The key fact is that the base $\mathcal{W}$ is countable. Let us enumerate the base $\mathcal{W}$ as a sequence

\[ \mathcal{W} = \{W_m : m \in \mathbb{N}\}. \]

For each $i \in I$, we define the set

\[ M_i = \{m \ge 1 : W_m \subset D_i\}. \]

By Claim 3, we know that for every $x \in D_i$ there exists some $m \in M_i$ such that $x \in W_m \subset D_i$. In particular this proves the equality

\[ D_i = \bigcup_{m\in M_i} W_m, \ \forall\, i \in I. \]

Consider then the union $M = \bigcup_{i\in I} M_i$, which is countable, being a subset of the integers. We clearly have

\[ \bigcup_{m\in M} W_m = \bigcup_{i\in I} \Big(\bigcup_{m\in M_i} W_m\Big) = \bigcup_{i\in I} D_i = X. \]

For every $m \in M$ we choose an $i_m \in I$, such that $m \in M_{i_m}$. If we take

\[ I_1 = \{i_m : m \in M\}, \]

then $I_1$ is obviously countable, and since we clearly have $W_m \subset D_{i_m}$, we get

\[ X = \bigcup_{m\in M} W_m \subset \bigcup_{m\in M} D_{i_m} = \bigcup_{i\in I_1} D_i, \]

so the Claim is proven.

Let us list the countable set $I_1$ as

\[ I_1 = \{i_k : k \ge 1\}. \]

(Of course, if $I_1$ is already finite, there is nothing to prove. So we will assume that $I_1$ is infinite.) In order to finish the proof, we must find some $k$, such that $D_{i_1} \cup D_{i_2} \cup \dots \cup D_{i_k} = X$. Assume no such $k$ can be found, which means that

\[ D_{i_1} \cup D_{i_2} \cup \dots \cup D_{i_k} \subsetneq X, \ \forall\, k \ge 1. \]

In other words, if we define for each $k \ge 1$ the closed set

\[ A_k = X \setminus (D_{i_1} \cup D_{i_2} \cup \dots \cup D_{i_k}), \]

we have $A_k \neq \varnothing$, $\forall\, k \ge 1$.

For each $k \ge 1$ we choose a point $x_k \in A_k$. This way we have constructed a sequence $(x_k)_{k\ge 1} \subset X$, so using property (ii) we can find a convergent subsequence. This means that we have a sequence of integers

\[ 1 \le k_1 < k_2 < \dots \]

and a point $x \in X$, such that $\lim_{n\to\infty} x_{k_n} = x$. Notice that, since

\[ k_n \ge n, \ \forall\, n \ge 1, \]

and since the sequence $(A_k)_{k\ge 1}$ is decreasing, we get the fact that, for each $m \ge 1$, we have

\[ x_{k_n} \in A_m, \ \forall\, n \ge m. \]

Since $A_m$ is closed, this forces $x \in A_m$, for all $m \ge 1$. But this is clearly impossible, since

\[ \bigcap_{m\ge 1} A_m = X \setminus \bigcup_{m\ge 1} (D_{i_1} \cup \dots \cup D_{i_m}) = X \setminus \bigcup_{i\in I_1} D_i = \varnothing. \]

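Corollary 6.1 (of the proof). Every compact metric space is second countable, which means that there exists a sequence $(W_m)_{m\ge 1}$ of open sets, with the property

(b) for every open set $D$, there exists a subset $M \subset \mathbb{N}$ such that
\[ D = \bigcup_{m\in M} W_m. \]

Proof. By the implication (i) ⇒ (ii) of the Theorem, every sequence in a compact metric space has a convergent subsequence. Now follow the steps in the proof of (ii) ⇒ (i), up to the proof of Claim 3.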

Corollary 6.2. Let $(X, d)$ be a metric space. For a subset $K \subset X$ the following are equivalent:

(i) every sequence in K has a subsequence which is convergent to some point in K ;

(ii) K is compact in X .




Proof. (i) ⇒ (ii). By the above Theorem, we know that when we equip $K$ with the metric $d|_{K\times K}$, then $K$ is compact. This means that $K$ is compact in the induced topology, which means exactly that $K$ is compact in $X$.

(ii) ⇒ (i). Argue as above. If $K$ is compact in $X$, then $K$ is compact when equipped with the induced topology, which means that $(K, d|_{K\times K})$ is compact, so (i) follows from the above Theorem.

Corollary 6.3. Let X and Y be metric spaces, and let f : X → Y be a continuous map. If X is compact, then f is uniformly continuous, that is,

• for every $\varepsilon > 0$, there exists some $\delta_\varepsilon > 0$, such that
\[ d\big(f(x), f(x')\big) < \varepsilon, \text{ for all } x, x' \in X \text{ with } d(x, x') < \delta_\varepsilon. \]

Proof. Suppose $f$ is not uniformly continuous, so there exists some $\varepsilon_0 > 0$, with the property that for any $\delta > 0$ there exist $x, x' \in X$, with $d(x, x') < \delta$, but $d\big(f(x), f(x')\big) \ge \varepsilon_0$. In particular, one can construct two sequences $(x_n)_{n\ge 1}$ and $(x'_n)_{n\ge 1}$ with

\[ d(x_n, x'_n) < \frac{1}{n} \quad\text{and}\quad d\big(f(x_n), f(x'_n)\big) \ge \varepsilon_0, \ \forall\, n \ge 1. \tag{2} \]

Using compactness, we can find a subsequence $(x_{n_k})_{k\ge 1}$ of $(x_n)_{n\ge 1}$ which converges to some point $p$. On the one hand, we have

\[ d(p, x'_{n_k}) \le d(p, x_{n_k}) + d(x_{n_k}, x'_{n_k}) < d(p, x_{n_k}) + \frac{1}{n_k}, \ \forall\, k \ge 1, \]

which proves that

\[ \lim_{k\to\infty} x'_{n_k} = p. \tag{3} \]

On the other hand, using (2) we also have

\[ \varepsilon_0 \le d\big(f(x_{n_k}), f(x'_{n_k})\big) \le d\big(f(p), f(x_{n_k})\big) + d\big(f(p), f(x'_{n_k})\big), \]

which leads to a contradiction, because the equalities

\[ \lim_{k\to\infty} x_{n_k} = \lim_{k\to\infty} x'_{n_k} = p, \]

together with the continuity of $f$, will force

\[ \lim_{k\to\infty} d\big(f(p), f(x_{n_k})\big) = \lim_{k\to\infty} d\big(f(p), f(x'_{n_k})\big) = 0. \]

Remark 6.5. Let $X$ be a metric space. Then any compact subset $K \subset X$ is closed (this is a consequence of the fact that $X$ is Hausdorff) and bounded, in the sense that for every $p \in X$ we have

\[ \sup_{x\in K} d(x, p) < \infty. \]

This is a consequence of the continuity (see ??) of the map

\[ K \ni x \longmapsto d(x, p) \in [0, \infty). \]

In general however the converse is not true, i.e. there are metric spaces in which closed bounded sets may fail to be compact.




Exercise 1. Equip $\mathbb{R}$ with the metric

\[ d(x, y) = \frac{|x - y|}{1 + |x - y|}, \ \forall\, x, y \in \mathbb{R}. \]

Prove that $d$ is indeed a metric on $\mathbb{R}$, and the metric topology on $\mathbb{R}$ defined by $d$ is the usual topology. Prove that $\mathbb{R}$ is bounded with respect to this metric.
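The metric in Exercise 1 can be explored numerically before attempting the proof. The following is only a sanity check on a finite random sample, not a proof; the function names `bounded_metric` and `triangle_holds` are illustrative and do not come from the notes.

```python
import itertools
import random


def bounded_metric(x: float, y: float) -> float:
    """The metric d(x, y) = |x - y| / (1 + |x - y|) from Exercise 1."""
    t = abs(x - y)
    return t / (1.0 + t)


def triangle_holds(points, tol: float = 1e-12) -> bool:
    """Check d(x, z) <= d(x, y) + d(y, z) on every triple of sample points."""
    return all(
        bounded_metric(x, z) <= bounded_metric(x, y) + bounded_metric(y, z) + tol
        for x, y, z in itertools.product(points, repeat=3)
    )


random.seed(0)  # fixed seed so the run is reproducible
sample = [random.uniform(-1e6, 1e6) for _ in range(30)]

triangle_ok = triangle_holds(sample)
# Even for points that are far apart in the usual sense, d stays below 1.
largest = max(bounded_metric(x, y) for x in sample for y in sample)
print(triangle_ok, largest < 1.0)
```

The small tolerance `tol` only absorbs floating-point rounding; the exact inequality is what the exercise asks to prove.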

Exercise 2. Start with a metric space $X$, and let $(x_n)_{n\ge 1} \subset X$ be a sequence which is convergent to some point $x$. Prove that the set

\[ K = \{x\} \cup \{x_n : n \ge 1\} \]

is compact in $X$.

Definition. Let $(X, d)$ be a metric space. For a point $x \in X$ and a non-empty subset $A \subset X$, one defines the distance from $x$ to $A$ as the number

\[ d(x, A) = \inf\{d(x, a) : a \in A\}. \]

Exercise 3. Let $(X, d)$ be a metric space, and let $A$ be a non-empty subset of $X$.

(i) For a point $x \in X$, prove that the equality $d(x, A) = 0$ is equivalent to the fact that $x \in \overline{A}$.

(ii) Prove the inequality
\[ \big|d(x, A) - d(y, A)\big| \le d(x, y), \ \forall\, x, y \in X. \]

Using (ii) conclude that the map

\[ X \ni x \longmapsto d(x, A) \in [0, \infty) \]

is continuous.
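The inequality in part (ii) of Exercise 3 can be checked on a small example. This sketch assumes $X = \mathbb{R}$ with the usual distance and a finite set $A$ of integers (so the infimum is a minimum and all arithmetic is exact); the name `dist_to_set` is illustrative.

```python
def dist_to_set(x, A):
    """d(x, A) = inf { |x - a| : a in A }; a minimum because A is finite here."""
    return min(abs(x - a) for a in A)


A = [0, 7, 20]        # a finite (hence closed) subset of the real line
xs = range(-30, 31)   # integer sample points, so every comparison is exact

# Part (ii): |d(x, A) - d(y, A)| <= d(x, y) for all sampled x, y.
lipschitz_ok = all(
    abs(dist_to_set(x, A) - dist_to_set(y, A)) <= abs(x - y)
    for x in xs
    for y in xs
)
print(lipschitz_ok)
```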

Proposition 6.3. Let $(X, d)$ be a metric space. When equipped with the metric topology, $X$ is normal.

Proof. Let $A$ and $B$ be closed subsets of $X$ with $A \cap B = \varnothing$. We need to find open sets $U, V \subset X$, with $U \supset A$, $V \supset B$, and $U \cap V = \varnothing$. We are going to use a converse of the Urysohn Lemma. More explicitly, let us define the function $f : X \to [0, 1]$ by

\[ f(x) = \frac{d(x, A)}{d(x, A) + d(x, B)}, \ x \in X. \]

Notice that by Exercise 3, both the numerator and denominator are continuous, and the denominator never vanishes. So $f$ is indeed continuous. It is obvious that $f|_A = 0$ and $f|_B = 1$, so if we take the open sets $U = f^{-1}\big((-\infty, \tfrac{1}{2})\big)$ and $V = f^{-1}\big((\tfrac{1}{2}, \infty)\big)$, we clearly get the desired result.
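The separating function from the proof of Proposition 6.3 is easy to evaluate in a concrete case. The sketch below assumes $X = \mathbb{R}$ with the usual distance and two finite disjoint sets $A$ and $B$ (finite sets are closed); the names `dist_to_set` and `separating_function` are illustrative.

```python
def dist_to_set(x, S):
    """d(x, S) for a finite set S of real numbers."""
    return min(abs(x - s) for s in S)


def separating_function(x, A, B):
    """f(x) = d(x, A) / (d(x, A) + d(x, B)); A and B must be disjoint closed sets."""
    d_a = dist_to_set(x, A)
    d_b = dist_to_set(x, B)
    return d_a / (d_a + d_b)  # the denominator is positive since A and B are disjoint


A = [0.0, 1.0]
B = [3.0, 4.0]

for x in (0.0, 1.5, 2.5, 3.0):
    value = separating_function(x, A, B)
    side = "U" if value < 0.5 else "V" if value > 0.5 else "neither"
    print(x, value, side)
```

Points with $f(x) < \tfrac12$ land in $U$ and points with $f(x) > \tfrac12$ land in $V$, which is how the proof separates $A$ from $B$.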

We continue now with a discussion on completeness.

Definitions. Let $(X, d)$ be a metric space. A sequence $(x_n)_{n\ge 1} \subset X$ is said to be a Cauchy sequence, if it has the following property.

(C) For every $\varepsilon > 0$, there exists some integer $N_\varepsilon \ge 1$ such that
\[ d(x_m, x_n) < \varepsilon, \ \forall\, m, n \ge N_\varepsilon. \]

The metric space $(X, d)$ is said to be complete, if every Cauchy sequence is convergent.

The following result summarizes some equivalent characterizations of completeness.




Proposition 6.4. Let (X, d) be a metric space. The following are equivalent.

(i) (X, d) is complete.

(ii) Every sequence $(x_n)_{n\ge 1} \subset X$, with
\[ \sum_{n=1}^{\infty} d(x_{n+1}, x_n) < \infty, \tag{4} \]
is convergent.

(iii) Every Cauchy sequence has a convergent subsequence.

Proof. (i) ⇒ (ii). Assume $X$ is complete. Let $(x_n)_{n\ge 1} \subset X$ be a sequence with property (4). To prove (ii) it suffices to show that $(x_n)_{n\ge 1}$ is Cauchy. For every $N \ge 1$ we define

\[ R_N = \sum_{n=N}^{\infty} d(x_{n+1}, x_n). \]

Using (4) we get $\lim_{N\to\infty} R_N = 0$, so for every $\varepsilon > 0$ there exists some $N(\varepsilon)$ with $R_{N(\varepsilon)} < \varepsilon$. Notice also that the sequence $(R_N)_{N\ge 1}$ is decreasing. If $m > n \ge N(\varepsilon)$, then

\[ d(x_m, x_n) \le \sum_{k=n}^{m-1} d(x_{k+1}, x_k) \le \sum_{k=n}^{\infty} d(x_{k+1}, x_k) = R_n \le R_{N(\varepsilon)} < \varepsilon, \]

so $(x_n)_{n\ge 1}$ is indeed Cauchy.

(ii) ⇒ (iii). Start with some Cauchy sequence $(y_k)_{k\ge 1}$. For every $n \ge 1$ choose an integer $N(n) \ge 1$ such that

\[ d(y_k, y_\ell) < \frac{1}{2^n}, \ \forall\, k, \ell \ge N(n). \tag{5} \]

Start with some arbitrary $k_1 \ge N(1)$ and define recursively an entire sequence $(k_n)_{n\ge 1}$ of integers, by

\[ k_{n+1} = \max\{k_n + 1, N(n + 1)\}, \ n \ge 1. \]

Clearly we have $k_1 < k_2 < \dots$, and since we have

\[ k_{n+1} > k_n \ge N(n), \ \forall\, n \ge 1, \]

using (5), we get

\[ d(y_{k_{n+1}}, y_{k_n}) < \frac{1}{2^n}, \ \forall\, n \ge 1. \]

So if we define the subsequence $x_n = y_{k_n}$, $n \ge 1$, we will have

\[ \sum_{n=1}^{\infty} d(x_{n+1}, x_n) \le \sum_{n=1}^{\infty} \frac{1}{2^n} = 1, \]

so the subsequence $(x_n)_{n\ge 1}$ satisfies condition (4). By (ii) the subsequence $(x_n)_{n\ge 1}$ is convergent.

(iii) ⇒ (i). Assume condition (iii) holds. Start with some Cauchy sequence

$(x_n)_{n\ge 1}$. For every integer $n \ge 1$ we put

\[ S_n = \sup_{\ell, m \ge n} d(x_\ell, x_m). \]

Since $(x_n)_{n\ge 1}$ is Cauchy, we have

\[ \lim_{n\to\infty} S_n = 0. \tag{6} \]

Using the assumption, we can find a subsequence $(x_{k_n})_{n\ge 1}$ (defined by an increasing sequence of integers $1 \le k_1 < k_2 < \dots$) which is convergent to some point $x$. We are going to prove that the entire sequence $(x_n)_{n\ge 1}$ is convergent to $x$. Fix for the moment $n \ge 1$. For every $m \ge n$, we have $k_m \ge m \ge n$, so we have

\[ S_n \ge d(x_n, x_{k_m}), \ \forall\, m \ge n. \tag{7} \]

By Remark 3.4, we also know that

\[ \lim_{m\to\infty} d(x_n, x_{k_m}) = d(x_n, x), \]

so if we take $\lim_{m\to\infty}$ in (7) we will get

\[ d(x_n, x) \le S_n. \]

Since this estimate holds for arbitrary $n \ge 1$, using (6) we immediately get the fact that $(x_n)_{n\ge 1}$ is indeed convergent to $x$.
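The recursive choice of indices in the step (ii) ⇒ (iii) of the proof above can be traced on a concrete Cauchy sequence. The sketch below assumes the sequence $y_k = 1/k$ in $\mathbb{R}$, for which $N(n) = 2^n + 1$ is one admissible choice in (5); both the sequence and this choice of $N(n)$ are illustrative assumptions, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction


def y(k: int) -> Fraction:
    """Illustrative Cauchy sequence y_k = 1/k in the real line."""
    return Fraction(1, k)


def N(n: int) -> int:
    """One valid threshold for (5): |y_k - y_l| < 1/2^n whenever k, l >= 2^n + 1."""
    return 2**n + 1


def subsequence_indices(count: int) -> list:
    """Return [k_1, ..., k_count] with k_{n+1} = max(k_n + 1, N(n + 1))."""
    ks = [N(1)]  # any k_1 >= N(1) would do
    for n in range(1, count):
        ks.append(max(ks[-1] + 1, N(n + 1)))
    return ks


ks = subsequence_indices(10)
# gaps[n - 1] is d(y_{k_{n+1}}, y_{k_n}), which the proof bounds by 1/2^n.
gaps = [abs(y(ks[n]) - y(ks[n - 1])) for n in range(1, len(ks))]
print(ks)
print(sum(gaps) < 1)
```

The partial sums of the gaps stay below 1, which is the summability condition (4) for the extracted subsequence.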

Proposition 6.5. Suppose (X, d) is a complete metric space, and Y is a subset of X . The following are equivalent:

(i) $Y$ is complete, when equipped with the metric from $X$;

(ii) $Y$ is closed in $X$, in the metric topology.

Proof. (i) ⇒ (ii). Assume $Y$ is complete, and let us prove that $Y$ is closed. Start with a point $x \in \overline{Y}$. Then there exists a sequence $(y_n)_{n\ge 1} \subset Y$ with $\lim_{n\to\infty} y_n = x$. Notice that $(y_n)_{n\ge 1}$ is Cauchy in $Y$, so by assumption, $(y_n)_{n\ge 1}$ is convergent to some point in $Y$. This will then clearly force $x \in Y$.

(ii) ⇒ (i). Assume $Y$ is closed, and let us prove that $Y$ is complete. Start with a Cauchy sequence $(y_n)_{n\ge 1} \subset Y$. Since $X$ is complete, the sequence $(y_n)_{n\ge 1}$ is convergent to some point $x \in X$. Since $Y$ is closed, this forces $x \in Y$.

Remark 6.6. Using Theorem 6.1, we immediately see that a metric space which is compact in the metric topology is automatically complete.

The next result identifies those complete metric spaces that are compact. In order to formulate it, we need the following:

Definition. Let $(X, d)$ be a metric space, and let $\varepsilon > 0$. A subset $A \subset X$ is said to be $\varepsilon$-rare, if

\[ d(a, b) \ge \varepsilon, \text{ for all } a, b \in A \text{ with } a \neq b. \]

Proposition 6.6. Let $(X, d)$ be a complete metric space. The following are equivalent:

(i) $X$ is compact in the metric topology;

(ii) for each $\varepsilon > 0$, all $\varepsilon$-rare subsets of $X$ are finite;

(iii) for any $\varepsilon > 0$, there exist finitely many points $p_1, p_2, \dots, p_n \in X$, such that
\[ X = B_\varepsilon(p_1) \cup B_\varepsilon(p_2) \cup \dots \cup B_\varepsilon(p_n). \]

Proof. (i) ⇒ (ii). Assume $X$ is compact. We prove (ii) by contradiction. Assume there exists some $\varepsilon > 0$ and an infinite $\varepsilon$-rare set $A \subset X$. It then follows that there exists a sequence $(a_n)_{n\ge 1} \subset A$, such that

\[ d(a_m, a_n) \ge \varepsilon, \ \forall\, m > n \ge 1. \]

It is clear that no subsequence of $(a_n)_{n\ge 1}$ is Cauchy, which means that $(a_n)_{n\ge 1}$ does not have any convergent subsequence, thus contradicting the fact that $X$ is compact.

(ii) ⇒ (iii). Assume property (ii) and let us prove (iii) by contradiction. Assume there exists some $\varepsilon > 0$, such that, for every finite set $F \subset X$, one has a strict inclusion

\[ \bigcup_{x\in F} B_\varepsilon(x) \subsetneq X. \]

Start with some arbitrary point $a_1 \in X$, and construct recursively a sequence $(a_n)_{n\ge 1} \subset X$, by choosing

\[ a_{n+1} \in X \setminus \big[B_\varepsilon(a_1) \cup \dots \cup B_\varepsilon(a_n)\big], \ \forall\, n \ge 1. \]

This will then force

\[ d(a_m, a_n) \ge \varepsilon, \ \forall\, m > n \ge 1, \]

so $A = \{a_n : n \in \mathbb{N}\}$ will be an infinite $\varepsilon$-rare set, thus contradicting (ii).

(iii) ⇒ (i). Assume property (iii), and let us prove that $X$ is compact. We are going to use Theorem 6.1. Start with an arbitrary sequence $(x_n)_{n\ge 1} \subset X$, and let us construct a convergent subsequence.

Claim: There exists a sequence $(p_n)_{n\ge 1} \subset X$, such that for every integer $k \ge 1$, the set
\[ M_k = \Big\{ n \in \mathbb{N} : x_n \in \bigcap_{\ell=1}^{k} B_{1/\ell}(p_\ell) \Big\} \]
is infinite.

The sequence $(p_n)_{n\ge 1}$ is constructed recursively. To start, we use (iii) to find a finite set $F_1 \subset X$, such that

\[ X = \bigcup_{p\in F_1} B_1(p). \]

If we define, for each $p \in F_1$, the set

\[ S_1(p) = \{n \in \mathbb{N} : x_n \in B_1(p)\}, \]

then we clearly have

\[ \bigcup_{p\in F_1} S_1(p) = \mathbb{N}, \]

so in particular one of the sets $S_1(p)$, $p \in F_1$, is infinite. We choose $p_1 \in F_1$ to be one point for which $S_1(p_1) = M_1$ is infinite.

Suppose now we have constructed points $p_1, p_2, \dots, p_{m-1}$, such that, for every $k \in \{1, \dots, m - 1\}$, the set

\[ M_k = \Big\{ n \in \mathbb{N} : x_n \in \bigcap_{\ell=1}^{k} B_{1/\ell}(p_\ell) \Big\} \]

is infinite, and let us indicate how the next term $p_m$ is to be constructed. Start with a finite set $F_m \subset X$, such that

\[ X = \bigcup_{p\in F_m} B_{1/m}(p), \]

and define, for each $p \in F_m$, the set

\[ S_m(p) = \big\{ n \in M_{m-1} : x_n \in B_{1/m}(p) \big\}. \]



It is clear that

\[ M_{m-1} = \bigcup_{p\in F_m} S_m(p), \]

and since $M_{m-1}$ is infinite, it follows that one of the sets $S_m(p)$, $p \in F_m$, is infinite. We then choose $p_m \in F_m$ to be one point for which $S_m(p_m)$ is infinite. Since $S_m(p_m) = M_m$, this completes the recursive step.

Having proven the Claim, let us construct a sequence of integers $1 \le n_1 < n_2 < \dots$ as follows. Start with some arbitrary $n_1 \in M_1$. Once $n_1 < n_2 < \dots < n_k$ have been constructed, we choose the integer $n_{k+1} \in M_{k+1}$, such that $n_{k+1} > n_k$. (It is here that we use the fact that $M_{k+1}$ is infinite.) By construction, we have $n_k \in M_k$, $\forall\, k \ge 1$.

Suppose $k \ge \ell \ge 1$. Then by construction we have $n_k \in M_k \subset M_\ell$ and $n_\ell \in M_\ell$. In particular we get

\[ d(x_{n_k}, x_{n_\ell}) \le d(x_{n_k}, p_\ell) + d(x_{n_\ell}, p_\ell) < \frac{2}{\ell}. \]

The above estimate clearly proves that the subsequence $(x_{n_k})_{k\ge 1}$ is Cauchy. Since $X$ is complete, it follows that $(x_{n_k})_{k\ge 1}$ is convergent.
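The recursive construction used in the step (ii) ⇒ (iii) of the proof above (keep adding a point that lies outside all the $\varepsilon$-balls chosen so far) can be run on a finite point set. The sketch below assumes points in the plane with the Euclidean distance; the function name `greedy_rare_net` is illustrative. On a finite set the procedure must stop, and the centers it returns are simultaneously $\varepsilon$-rare and the centers of finitely many $\varepsilon$-balls covering the set.

```python
import math


def greedy_rare_net(points, eps):
    """Greedily pick centers that are pairwise at distance >= eps.

    A point is added exactly when it lies outside every open eps-ball
    around the centers chosen so far, mirroring the choice of a_{n+1}.
    """
    centers = []
    for p in points:
        if all(math.dist(p, c) >= eps for c in centers):
            centers.append(p)
    return centers


# A finite grid in the unit square stands in for the space X.
grid = [(i / 10, j / 10) for i in range(11) for j in range(11)]
eps = 0.35
centers = greedy_rare_net(grid, eps)

# The centers form an eps-rare set ...
rare = all(
    math.dist(a, b) >= eps
    for i, a in enumerate(centers)
    for b in centers[i + 1:]
)
# ... and the open eps-balls around them cover the whole grid.
covers = all(any(math.dist(p, c) < eps for c in centers) for p in grid)
print(len(centers), rare, covers)
```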

Corollary 6.4. Let $(X, d)$ be a complete metric space, and let $A$ be a subset of $X$. The following are equivalent:

(i) the closure $\overline{A}$ is compact in $X$;

(ii) for each $\varepsilon > 0$, all $\varepsilon$-rare subsets of $A$ are finite.

Proof. (i) ⇒ (ii). This is trivial from the above result.

(ii) ⇒ (i). Assume (ii), and let us prove that $\overline{A}$ is compact. Since $\overline{A}$ is complete, it suffices to prove that, for each $\varepsilon > 0$, all $\varepsilon$-rare subsets of $\overline{A}$ are finite. Fix $\varepsilon > 0$, and let $B$ be an $\varepsilon$-rare subset of $\overline{A}$. For each $x \in B$, let us choose a point $a_x \in A$, such that $x \in B_{\varepsilon/3}(a_x)$. Suppose $x, y \in B$ are such that $x \neq y$. Then

\[ d(a_x, a_y) \ge d(x, y) - d(a_x, x) - d(a_y, y) > \varepsilon - \frac{\varepsilon}{3} - \frac{\varepsilon}{3} = \frac{\varepsilon}{3}. \]

In particular, this shows that the map

\[ f : B \ni x \longmapsto a_x \in A \]

is injective, and the set $f(B)$ is an $(\varepsilon/3)$-rare subset of $A$. By condition (ii) this forces $B$ to be finite.

We continue with an important construction.

Definitions. Let $(X, d)$ be a metric space. We define

\[ \mathrm{cs}(X, d) = \big\{ \mathbf{x} = (x_n)_{n\ge 1} : \mathbf{x} \text{ Cauchy sequence in } X \big\}. \]

We say that two Cauchy sequences $\mathbf{x} = (x_n)_{n\ge 1}$ and $\mathbf{y} = (y_n)_{n\ge 1}$ in $X$ are equivalent, if

\[ \lim_{n\to\infty} d(x_n, y_n) = 0. \]

In this case we write $\mathbf{x} \sim \mathbf{y}$. (It is fairly obvious that $\sim$ is indeed an equivalence relation.) We define the quotient space

\[ \tilde{X} = \mathrm{cs}(X, d)/\sim. \]

For an element $\mathbf{x} \in \mathrm{cs}(X, d)$, we denote its equivalence class by $\tilde{\mathbf{x}}$.

Finally, for a point $x \in X$, we define $\langle x \rangle \in \tilde{X}$ to be the equivalence class of the constant sequence $x$ (which is obviously Cauchy).




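Remark 6.7. Let $(X, d)$ be a metric space. If $\mathbf{x} = (x_n)_{n\ge 1}$ and $\mathbf{y} = (y_n)_{n\ge 1}$ are Cauchy sequences in $X$, then the sequence of real numbers $\big(d(x_n, y_n)\big)_{n\ge 1}$ is convergent. Indeed, for any $m, n$ we have

\[ \big|d(x_m, y_m) - d(x_n, y_n)\big| \le \big|d(x_m, y_m) - d(x_n, y_m)\big| + \big|d(x_n, y_m) - d(x_n, y_n)\big| \le d(x_m, x_n) + d(y_m, y_n), \]

so $\big(d(x_n, y_n)\big)_{n\ge 1}$ is a Cauchy sequence of real numbers. We can then define

\[ \delta(\mathbf{x}, \mathbf{y}) = \lim_{n\to\infty} d(x_n, y_n). \]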

Proposition 6.7. Let (X, d) be a metric space.

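A. The map $\delta : \mathrm{cs}(X, d) \times \mathrm{cs}(X, d) \to [0, \infty)$ has the following properties:

(i) $\delta(\mathbf{x}, \mathbf{y}) = \delta(\mathbf{y}, \mathbf{x})$, $\forall\, \mathbf{x}, \mathbf{y} \in \mathrm{cs}(X, d)$;

(ii) $\delta(\mathbf{x}, \mathbf{y}) \le \delta(\mathbf{x}, \mathbf{z}) + \delta(\mathbf{z}, \mathbf{y})$, $\forall\, \mathbf{x}, \mathbf{y}, \mathbf{z} \in \mathrm{cs}(X, d)$;

(iii) $\delta(\mathbf{x}, \mathbf{y}) = 0 \Rightarrow \mathbf{x} \sim \mathbf{y}$;

(iv) If $\mathbf{x}, \mathbf{x}', \mathbf{y}, \mathbf{y}' \in \mathrm{cs}(X, d)$ are such that $\mathbf{x} \sim \mathbf{x}'$ and $\mathbf{y} \sim \mathbf{y}'$, then $\delta(\mathbf{x}, \mathbf{y}) = \delta(\mathbf{x}', \mathbf{y}')$.

B. The map $\tilde{d} : \tilde{X} \times \tilde{X} \to [0, \infty)$, correctly defined by
\[ \tilde{d}(\tilde{\mathbf{x}}, \tilde{\mathbf{y}}) = \delta(\mathbf{x}, \mathbf{y}), \ \forall\, \mathbf{x}, \mathbf{y} \in \mathrm{cs}(X, d), \]
is a metric on $\tilde{X}$.

C. The map $X \ni x \longmapsto \langle x \rangle \in \tilde{X}$ is isometric, in the sense that
\[ \tilde{d}(\langle x \rangle, \langle y \rangle) = d(x, y), \ \forall\, x, y \in X. \]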

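Proof. A. Properties (i), (ii) and (iii) are obvious. To prove property (iv) let $\mathbf{x} = (x_n)_{n\ge 1}$, $\mathbf{x}' = (x'_n)_{n\ge 1}$, $\mathbf{y} = (y_n)_{n\ge 1}$, and $\mathbf{y}' = (y'_n)_{n\ge 1}$. The inequality

\[ d(x'_n, y'_n) \le d(x'_n, x_n) + d(x_n, y_n) + d(y_n, y'_n), \]

combined with $\lim_{n\to\infty} d(x'_n, x_n) = \lim_{n\to\infty} d(y_n, y'_n) = 0$ immediately gives

\[ \delta(\mathbf{x}', \mathbf{y}') = \lim_{n\to\infty} d(x'_n, y'_n) \le \lim_{n\to\infty} d(x_n, y_n) = \delta(\mathbf{x}, \mathbf{y}). \]

By symmetry we also have $\delta(\mathbf{x}, \mathbf{y}) \le \delta(\mathbf{x}', \mathbf{y}')$, and we are done.

B. This is immediate from A.

C. Obvious, from the definition.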

Proposition 6.8. Let (X, d) be a metric space.

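(i) For any Cauchy sequence $\mathbf{x} = (x_n)_{n\ge 1}$ in $X$, one has
\[ \lim_{n\to\infty} \langle x_n \rangle = \tilde{\mathbf{x}}, \text{ in } \tilde{X}. \]

(ii) The metric space $(\tilde{X}, \tilde{d})$ is complete.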

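Proof. (i). For every $n \ge 1$, we have

\[ \tilde{d}\big(\langle x_n \rangle, \tilde{\mathbf{x}}\big) = \lim_{m\to\infty} d(x_n, x_m). \tag{8} \]

Now if we start with some $\varepsilon > 0$, and we choose $N_\varepsilon$ such that

\[ d(x_n, x_m) < \varepsilon, \ \forall\, m, n \ge N_\varepsilon, \]

then (8) shows that

\[ \tilde{d}\big(\langle x_n \rangle, \tilde{\mathbf{x}}\big) \le \varepsilon, \ \forall\, n \ge N_\varepsilon, \]

so we indeed have

\[ \lim_{n\to\infty} \tilde{d}\big(\langle x_n \rangle, \tilde{\mathbf{x}}\big) = 0. \]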

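(ii). Let $(p_k)_{k\ge 1}$ be a Cauchy sequence in $\tilde{X}$. Using (i), we can choose, for each $k \ge 1$, an element $x_k \in X$, such that

\[ \tilde{d}\big(\langle x_k \rangle, p_k\big) \le \frac{1}{2^k}. \]

Claim 1: The sequence $\mathbf{x} = (x_k)_{k\ge 1}$ is Cauchy in $X$.

Indeed, for $k \ge \ell \ge 1$ we have

\[ d(x_k, x_\ell) = \tilde{d}\big(\langle x_k \rangle, \langle x_\ell \rangle\big) \le \tilde{d}\big(\langle x_k \rangle, p_k\big) + \tilde{d}(p_k, p_\ell) + \tilde{d}\big(p_\ell, \langle x_\ell \rangle\big) \le \tilde{d}(p_k, p_\ell) + \frac{1}{2^{\ell-1}}. \]

This clearly gives

\[ \lim_{N\to\infty} \sup_{k,\ell \ge N} d(x_k, x_\ell) \le \lim_{N\to\infty} \sup_{k,\ell \ge N} \tilde{d}(p_k, p_\ell) = 0, \]

so $\mathbf{x} = (x_k)_{k\ge 1}$ is indeed Cauchy.

The proof of (ii) will then be finished, once we prove:

Claim 2: We have $\lim_{k\to\infty} p_k = \tilde{\mathbf{x}}$ in $\tilde{X}$.

To see this, we observe that, for $\ell \ge k \ge 1$ we have the inequality

\[ \tilde{d}\big(p_k, \langle x_\ell \rangle\big) \le \tilde{d}\big(p_k, \langle x_k \rangle\big) + \tilde{d}\big(\langle x_k \rangle, \langle x_\ell \rangle\big) \le \frac{1}{2^k} + d(x_k, x_\ell). \tag{9} \]

If we now start with some $\varepsilon > 0$, and we choose $N_\varepsilon$ such that

\[ d(x_k, x_\ell) < \varepsilon, \ \forall\, k, \ell \ge N_\varepsilon, \]

then (9) gives

\[ \tilde{d}\big(p_k, \langle x_\ell \rangle\big) \le \frac{1}{2^k} + \varepsilon, \ \forall\, \ell \ge k \ge N_\varepsilon. \]

If we keep $k \ge N_\varepsilon$ fixed and take $\lim_{\ell\to\infty}$, using (i) we get

\[ \tilde{d}(p_k, \tilde{\mathbf{x}}) = \lim_{\ell\to\infty} \tilde{d}\big(p_k, \langle x_\ell \rangle\big) \le \frac{1}{2^k} + \varepsilon, \ \forall\, k \ge N_\varepsilon. \]

The above estimate clearly proves that

\[ \lim_{k\to\infty} \tilde{d}(p_k, \tilde{\mathbf{x}}) = 0, \]

so the sequence $(p_k)_{k\ge 1}$ is convergent (to $\tilde{\mathbf{x}}$).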

Definition. The metric space $(\tilde{X}, \tilde{d})$ is called the completion of $(X, d)$.

The completion has a certain universality property. In order to formulate this property we need the following

Definition. Let $(X, d)$ and $(Y, \rho)$ be metric spaces. A map $f : X \to Y$ is said to be a Lipschitz function, if there exists some constant $C \ge 0$, such that

\[ \rho\big(f(x), f(x')\big) \le C \cdot d(x, x'), \ \forall\, x, x' \in X. \]

Such a constant C is then called a Lipschitz constant for f .




Proposition 6.9. Let $(X, d)$ be a metric space, and let $(\tilde{X}, \tilde{d})$ be its completion. If $(Y, \rho)$ is a complete metric space, and $f : X \to Y$ is a Lipschitz function with Lipschitz constant $C \ge 0$, then there exists a unique continuous function $\tilde{f} : \tilde{X} \to Y$, such that

\[ \tilde{f}(\langle x \rangle) = f(x), \ \forall\, x \in X. \]

Moreover, $\tilde{f}$ is Lipschitz, with Lipschitz constant $C$.

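Proof. Start with some Cauchy sequence $\mathbf{x} = (x_n)_{n\ge 1}$ in $X$. Using the inequality

\[ \rho\big(f(x_m), f(x_n)\big) \le C \cdot d(x_m, x_n), \ \forall\, m, n \ge 1, \]

it is obvious that $\big(f(x_n)\big)_{n\ge 1}$ is a Cauchy sequence in $Y$. Since $Y$ is complete, this sequence is convergent. Define

\[ \varphi(\mathbf{x}) = \lim_{n\to\infty} f(x_n). \]

This way we have constructed a map $\varphi : \mathrm{cs}(X, d) \to Y$.

Claim: If $\mathbf{x} \sim \mathbf{x}'$, then $\varphi(\mathbf{x}) = \varphi(\mathbf{x}')$.

Indeed, if $\mathbf{x} = (x_n)_{n\ge 1}$ and $\mathbf{x}' = (x'_n)_{n\ge 1}$, then the Lipschitz property will give

\[ \rho\big(f(x_n), f(x'_n)\big) \le C \cdot d(x_n, x'_n), \ \forall\, n \ge 1, \]

and using the fact that $\lim_{n\to\infty} d(x_n, x'_n) = 0$, we get $\lim_{n\to\infty} \rho\big(f(x_n), f(x'_n)\big) = 0$. This clearly forces

\[ \lim_{n\to\infty} f(x_n) = \lim_{n\to\infty} f(x'_n). \]

Having proven the claim, we now see that we have a correctly defined map $\tilde{f} : \tilde{X} \to Y$, with the property that

\[ \tilde{f}(\tilde{\mathbf{x}}) = \varphi(\mathbf{x}), \ \forall\, \mathbf{x} \in \mathrm{cs}(X, d). \]

The equality $\tilde{f}(\langle x \rangle) = f(x)$, $\forall\, x \in X$, is trivially satisfied.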

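Let us check now that $\tilde{f}$ is Lipschitz, with Lipschitz constant $C$. Start with two points $p, p' \in \tilde{X}$, represented as $p = \tilde{\mathbf{x}}$ and $p' = \tilde{\mathbf{x}}'$, for two Cauchy sequences $\mathbf{x} = (x_n)_{n\ge 1}$ and $\mathbf{x}' = (x'_n)_{n\ge 1}$ in $X$. Using the definition, we have

\[ \tilde{f}(p) = \lim_{n\to\infty} f(x_n) \quad\text{and}\quad \tilde{f}(p') = \lim_{n\to\infty} f(x'_n). \]

This will give

\[ \rho\big(\tilde{f}(p), \tilde{f}(p')\big) = \lim_{n\to\infty} \rho\big(f(x_n), f(x'_n)\big). \]

Notice however that

\[ \rho\big(f(x_n), f(x'_n)\big) \le C \cdot d(x_n, x'_n), \ \forall\, n \ge 1, \]

so taking the limit yields

\[ \rho\big(\tilde{f}(p), \tilde{f}(p')\big) = \lim_{n\to\infty} \rho\big(f(x_n), f(x'_n)\big) \le C \cdot \lim_{n\to\infty} d(x_n, x'_n) = C \cdot \tilde{d}(p, p'). \]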

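Finally, let us show that $\tilde{f}$ is unique. Let $F : \tilde{X} \to Y$ be another continuous function with $F(\langle x \rangle) = f(x)$, for all $x \in X$. Start with an arbitrary point $p \in \tilde{X}$, represented as $p = \tilde{\mathbf{x}}$, for some Cauchy sequence $\mathbf{x} = (x_n)_{n\ge 1}$ in $X$. Since $\lim_{n\to\infty} \langle x_n \rangle = p$ in $\tilde{X}$, by continuity we have

\[ F(p) = \lim_{n\to\infty} F(\langle x_n \rangle) = \lim_{n\to\infty} f(x_n) = \varphi(\mathbf{x}) = \tilde{f}(p). \]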

Corollary 6.5. Let $(X, d)$ be a metric space, let $(Y, \rho)$ be a complete metric space, and let $f : X \to Y$ be an isometric map, that is

\[ \rho\big(f(x), f(x')\big) = d(x, x'), \ \forall\, x, x' \in X. \]

Then the map $\tilde{f} : \tilde{X} \to Y$, given by the above result, is isometric and $\tilde{f}(\tilde{X}) = \overline{f(X)}$, the closure of $f(X)$ in $Y$.

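Proof. The inclusion $\tilde{f}(\tilde{X}) \subset \overline{f(X)}$ is clear, since every value of $\tilde{f}$ is a limit of points in $f(X)$. To show that $\tilde{f}(\tilde{X}) \supset \overline{f(X)}$, start with some arbitrary point $y \in \overline{f(X)}$. Then there exists a sequence $(x_n)_{n\ge 1} \subset X$, with $\lim_{n\to\infty} f(x_n) = y$. Since $\big(f(x_n)\big)_{n\ge 1}$ is Cauchy in $Y$, and

\[ d(x_m, x_n) = \rho\big(f(x_m), f(x_n)\big), \ \forall\, m, n \ge 1, \]

it follows that the sequence $\mathbf{x} = (x_n)_{n\ge 1}$ is Cauchy in $X$. We then have

\[ y = \lim_{n\to\infty} f(x_n) = \tilde{f}(\tilde{\mathbf{x}}). \]

Finally, we show that $\tilde{f}$ is isometric. Start with two points $p, q \in \tilde{X}$, represented as $p = \tilde{\mathbf{x}}$ and $q = \tilde{\mathbf{z}}$, for some Cauchy sequences $\mathbf{x} = (x_n)_{n\ge 1}$ and $\mathbf{z} = (z_n)_{n\ge 1}$ in $X$. Then by construction we have

\[ \rho\big(\tilde{f}(p), \tilde{f}(q)\big) = \rho\Big(\lim_{n\to\infty} f(x_n), \lim_{n\to\infty} f(z_n)\Big) = \lim_{n\to\infty} \rho\big(f(x_n), f(z_n)\big) = \lim_{n\to\infty} d(x_n, z_n) = \tilde{d}(\tilde{\mathbf{x}}, \tilde{\mathbf{z}}) = \tilde{d}(p, q). \]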

Corollary 6.6. If $(X, d)$ is a complete metric space, and $\tilde{X}$ is its completion, then the map $\iota : X \ni x \longmapsto \langle x \rangle \in \tilde{X}$ is bijective.

Proof. Apply the previous result to the map $\mathrm{Id} : X \to X$, to get a bijective (isometric) map $\widetilde{\mathrm{Id}} : \tilde{X} \to X$. Since the map $\widetilde{\mathrm{Id}}$ is obviously a left inverse for $\iota$, it follows that $\iota$ itself is bijective.

In the remainder of this section we will address the following question: Given a topological Hausdorff space $X$, when does there exist a metric $d$ on $X$, such that the given topology coincides with the metric topology defined by $d$? A topological Hausdorff space with the above property is said to be metrizable. It is difficult to give non-trivial necessary and sufficient conditions for metrizability. One instance in which this is possible is the compact case (see the Urysohn Metrizability Theorem later in these notes). Here is a useful result, which is an example of a sufficient condition for metrizability.

Proposition 6.10 (Metrizability of Countable Products). Let $(X_i, d_i)_{i\in I}$ be a countable family of metric spaces. Then the product space $X = \prod_{i\in I} X_i$, equipped with the product topology, is metrizable.




Proof. Denote by $\mathcal{T}$ the product topology on $X$. What we need is a metric $d$ on $X$, such that the maps

\[ \mathrm{Id} : (X, d) \to (X, \mathcal{T}) \quad\text{and}\quad \mathrm{Id} : (X, \mathcal{T}) \to (X, d) \]

are continuous. (Here the notation $(X, d)$ signifies that $X$ is equipped with the metric topology defined by $d$.) For each $i \in I$, let $\pi_i : X \to X_i$ denote the projection onto the $i$-th coordinate.

Case I: Assume $I$ is finite. In this case we define the metric $d$ on $X$ as follows. If $\mathbf{x} = (x_i)_{i\in I}$ and $\mathbf{y} = (y_i)_{i\in I}$ are elements in $X$, we put

\[ d(\mathbf{x}, \mathbf{y}) = \max_{i\in I} d_i(x_i, y_i). \]

The continuity of the map $\mathrm{Id} : (X, d) \to (X, \mathcal{T})$ is equivalent to the fact that all maps

\[ \pi_i : (X, d) \to (X_i, d_i), \ i \in I, \]

are continuous. This is obvious, because by construction we have

\[ d_i\big(\pi_i(\mathbf{x}), \pi_i(\mathbf{y})\big) \le d(\mathbf{x}, \mathbf{y}), \ \forall\, \mathbf{x}, \mathbf{y} \in X. \]

Conversely, to prove the continuity of $\mathrm{Id} : (X, \mathcal{T}) \to (X, d)$, we are going to prove that every $d$-open set is open in the product topology. It suffices to prove this only for open balls. Fix then $\mathbf{x} = (x_i)_{i\in I} \in \prod_{i\in I} X_i$ and $r > 0$, and consider the open ball $B_r(\mathbf{x})$. If we define, for each $i \in I$, the open ball $B^{X_i}_r(x_i)$, then it is obvious that

\[ B_r(\mathbf{x}) = \bigcap_{i\in I} \pi_i^{-1}\big(B^{X_i}_r(x_i)\big), \]

and since the $\pi_i$ are all continuous, this proves that $B_r(\mathbf{x})$ is indeed open in the product topology.

Case II: Assume $I$ is infinite. In this case we identify $I = \mathbb{N}$. For every $n \in \mathbb{N}$ we define a new metric $\delta_n$ on $X_n$, as follows. If

\[ \sup_{p,q\in X_n} d_n(p, q) \le 1, \]

we put $\delta_n = d_n$. Otherwise, we define

\[ \delta_n(p, q) = \frac{d_n(p, q)}{1 + d_n(p, q)}, \ \forall\, p, q \in X_n. \]

It is not hard to see that the metric topology defined by $\delta_n$ coincides with the one defined by $d_n$. The advantage is that $\delta_n$ takes values in $[0, 1]$. We define the metric $d : X \times X \to [0, \infty)$, as follows. If $\mathbf{x} = (x_n)_{n\in\mathbb{N}}$ and $\mathbf{y} = (y_n)_{n\in\mathbb{N}}$ are elements in $\prod_{n\in\mathbb{N}} X_n$, we define

\[ d(\mathbf{x}, \mathbf{y}) = \sum_{n=1}^{\infty} \frac{\delta_n(x_n, y_n)}{2^n}. \]

Due to the fact that $\delta_n$ takes values in $[0, 1]$, the above series is convergent, and it obviously defines a metric on $X$.

As above, the continuity of the map $\mathrm{Id} : (X, d) \to (X, \mathcal{T})$ is equivalent to the continuity of all the maps $\pi_n : (X, d) \to (X_n, d_n)$, or equivalently of $\pi_n : (X, d) \to (X_n, \delta_n)$, $n \in \mathbb{N}$. But this is an immediate consequence of the (obvious) inequalities

\[ \delta_n\big(\pi_n(\mathbf{x}), \pi_n(\mathbf{y})\big) \le 2^n \cdot d(\mathbf{x}, \mathbf{y}), \ \forall\, \mathbf{x}, \mathbf{y} \in X. \]




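As before, in order to prove the continuity of the other map $\mathrm{Id} : (X, \mathcal{T}) \to (X, d)$, we start with some $d$-open set $D$, and we show that $D$ is open in the product topology. Since $D$ is a union of open balls, we need to prove that for any $\mathbf{x} \in X$ and any $r > 0$, the open ball $B_r(\mathbf{x})$, in $(X, d)$, is a neighborhood of $\mathbf{x}$ in the product topology. Fix $\mathbf{x} = (x_n)_{n\in\mathbb{N}} \in \prod_{n\in\mathbb{N}} X_n$, as well as $r > 0$. Choose some integer $N \ge 1$, such that

\[ \sum_{n=N+1}^{\infty} \frac{1}{2^n} < \frac{r}{2}, \]

and define, for each $k \in \{1, 2, \dots, N\}$, the set

\[ D_k = \Big\{ \mathbf{y} = (y_n)_{n\in\mathbb{N}} \in \prod_{n\in\mathbb{N}} X_n : \delta_k(x_k, y_k) < \frac{r}{2} \Big\}. \]

It is clear that $D_k$ is open in the product topology, for each $k = 1, 2, \dots, N$. (This is a consequence of the fact that $D_k = \pi_k^{-1}\big(B_{r/2}(x_k)\big)$, where $B_{r/2}(x_k)$ is the $\delta_k$-open ball in $X_k$ of radius $r/2$, centered at $x_k$.) Then the set $D = D_1 \cap D_2 \cap \dots \cap D_N$ is also open in the product topology. Obviously we have $\mathbf{x} \in D$. We now prove that $D \subset B_r(\mathbf{x})$. Start with some arbitrary $\mathbf{y} \in D$, say $\mathbf{y} = (y_n)_{n\in\mathbb{N}}$. On the one hand, we have

\[ \delta_k(x_k, y_k) < \frac{r}{2}, \ \forall\, k \in \{1, 2, \dots, N\}, \]

so we get

\[ \sum_{n=1}^{N} \frac{1}{2^n} \delta_n(x_n, y_n) < \frac{r}{2} \sum_{n=1}^{N} \frac{1}{2^n} < \frac{r}{2}. \]

On the other hand, since $\delta_n$ takes values in $[0, 1]$, we also have

\[ \sum_{n=N+1}^{\infty} \frac{1}{2^n} \delta_n(x_n, y_n) \le \sum_{n=N+1}^{\infty} \frac{1}{2^n} < \frac{r}{2}, \]

so we get

\[ d(\mathbf{x}, \mathbf{y}) = \sum_{n=1}^{\infty} \frac{1}{2^n} \delta_n(x_n, y_n) < r, \]

thus proving that $\mathbf{y}$ indeed belongs to $B_r(\mathbf{x})$.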
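The metric built in Case II of the proof above can be approximated numerically. The sketch below assumes every factor is $\mathbb{R}$ with the usual distance, always uses the rescaling $\delta_n = d_n/(1 + d_n)$ (one of the two branches in the definition), and truncates the series after `terms` summands, which changes the value by at most $2^{-\texttt{terms}}$. The names `delta` and `product_metric` are illustrative.

```python
def delta(p: float, q: float) -> float:
    """Rescaled coordinate metric d_n / (1 + d_n), with values in [0, 1)."""
    t = abs(p - q)
    return t / (1.0 + t)


def product_metric(x, y, terms: int = 50) -> float:
    """Truncated sum of delta(x(n), y(n)) / 2^n for n = 1, ..., terms.

    Points of the product are modelled as functions n -> x_n.
    """
    return sum(delta(x(n), y(n)) / 2**n for n in range(1, terms + 1))


def zero(n):
    return 0.0


def far_in_every_coordinate(n):
    return float(n)


def far_in_tenth_coordinate(n):
    return 1e9 if n == 10 else 0.0


print(product_metric(zero, far_in_every_coordinate))
# A huge change in a single late coordinate moves the point very little:
print(product_metric(zero, far_in_tenth_coordinate))
```

This reflects the idea of the proof: only the first $N$ coordinates need to be controlled, because the tail of the series is uniformly small.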



Lecture 7

7. Baire theorem(s)

In this section we discuss a topological phenomenon that occurs in certain topological spaces. This deals with interiors of closed sets.

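Exercise 1. Let $X$ be a topological space, and let $A$ and $B$ be closed sets with the property that $\operatorname{Int}(A \cup B) \neq \varnothing$. Prove that either $\operatorname{Int}(A) \neq \varnothing$, or $\operatorname{Int}(B) \neq \varnothing$.

Exercise 2. Give an example of a topological space $X$ and of two (non-closed) sets $A$ and $B$ such that $\operatorname{Int}(A \cup B) \neq \varnothing$, but $\operatorname{Int}(A) = \operatorname{Int}(B) = \varnothing$.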

Theorem 7.1 (Baire's Theorem). Let $(X, \mathcal{T})$ be a topological Hausdorff space, which satisfies one (or both) of the following properties:

(a) There exists a metric $d$ on $X$, which makes $(X, d)$ a complete metric space, and $\mathcal{T}$ is the metric topology.

(b) $X$ is locally compact.

Suppose one has a sequence $(F_n)_{n\ge 1}$ of closed subsets of $X$, such that $X = \bigcup_{n=1}^{\infty} F_n$. Then there exists some integer $n \ge 1$, such that $\operatorname{Int}(F_n) \neq \varnothing$.

Proof. For every $n \ge 1$ we define the closed set $G_n = \bigcup_{k=1}^{n} F_k$, so that we still have $X = \bigcup_{n=1}^{\infty} G_n$, but we also have $G_1 \subset G_2 \subset \dots$. According to Exercise 1 (use an inductive argument) it suffices to show that there exists some $n \ge 1$, with $\operatorname{Int}(G_n) \neq \varnothing$. We are going to prove this property by contradiction.

(∗) Assume $\operatorname{Int}(G_n) = \varnothing$, for all $n \ge 1$.

Claim: Under the assumption (∗) there exists a sequence $(D_n)_{n\ge 1}$ of non-empty open sets, such that for all $n \ge 1$ we have:

(i) $D_n \cap G_n = \varnothing$;

(ii) $\overline{D_{n+1}} \subset D_n$;

(iii) in case (a) we have $\operatorname{diam}(D_n) \le 2^{-n}$; in case (b) $\overline{D_n}$ is compact.

The sequence is constructed recursively. To construct $D_1$ we use the fact that $\operatorname{Int}(G_1) = \varnothing$ forces $X \setminus G_1 \neq \varnothing$. We then choose a point $x \in X \setminus G_1$. In case (a) we know that there exists $r > 0$ such that $B_r(x) \subset X \setminus G_1$. We put $\rho = \min\{r, \tfrac{1}{4}\}$ and we set $D_1 = B_\rho(x)$. In case (b) we apply Lemma 5.1 to find $D_1$ open with $\overline{D_1}$ compact, such that $x \in D_1 \subset \overline{D_1} \subset X \setminus G_1$.

Let us assume now that we have constructed $D_1, D_2, \dots, D_k$, such that (i) and (iii) hold for all $n \in \{1, \dots, k\}$, and such that (ii) holds for all $n \in \{1, \dots, k - 1\}$, and let us indicate how the next set $D_{k+1}$ is constructed. Using the assumption that $\operatorname{Int}(G_{k+1}) = \varnothing$, it follows that the open set $D_k \setminus G_{k+1}$ is non-empty. Choose then a point $x \in D_k \setminus G_{k+1}$. In case (a) there exists some $r > 0$ such that $B_r(x) \subset D_k \setminus G_{k+1}$. We then put $\rho = \min\{\tfrac{r}{2}, \tfrac{1}{2^{k+2}}\}$, and we define $D_{k+1} = B_\rho(x)$. In case (b) we apply Lemma 5.1 and find an open set $D_{k+1}$ with $\overline{D_{k+1}}$ compact, and $x \in D_{k+1} \subset \overline{D_{k+1}} \subset D_k \setminus G_{k+1}$. All properties (i)-(iii) are easily verified.

Having proven the Claim, let us see now that the assumption (∗) produces a contradiction.

Case (a): In this case we choose, for each $n \ge 1$, a point $x_n \in D_n$. Notice that, for every $m \ge n \ge 1$ we have

\[ x_m, x_n \in D_n \quad\text{and}\quad d(x_m, x_n) \le \operatorname{diam}(D_n) \le \frac{1}{2^n}. \]

In particular, this proves that the sequence $(x_n)_{n\ge 1}$ is Cauchy, hence convergent to some point $x$. Since $x_m \in D_n$, $\forall\, m \ge n \ge 1$, we see that $x \in \overline{D_n}$, for all $n \ge 1$. In other words we get

\[ \bigcap_{n=1}^{\infty} \overline{D_n} \neq \varnothing. \tag{1} \]

Case (b): In this case we also get (1), this time as a consequence of the compactness of the sets $\overline{D_n}$ (and the finite intersection property).

Let us notice now that (1) combined with (ii) will also give $\bigcap_{n=1}^{\infty} D_n \neq \varnothing$. But this is impossible, since by (i) we have

\[ \bigcap_{n=1}^{\infty} D_n \subset \bigcap_{n=1}^{\infty} (X \setminus G_n) = X \setminus \bigcup_{n=1}^{\infty} G_n = \varnothing. \]


Chapter II

Elements of Functional Analysis



Lecture 8

1. Hahn-Banach Theorems

The result we are going to discuss is one of the most fundamental theorems inthe whole field of Functional Analysis. Its statement is simple but quite technical.

Definitions. Let $\mathbb{K}$ be either of the fields $\mathbb{R}$ or $\mathbb{C}$. Suppose $X$ is a $\mathbb{K}$-vector space.

A. A map $q : X \to \mathbb{R}$ is said to be a quasi-seminorm, if
(i) $q(x + y) \le q(x) + q(y)$, for all $x, y \in X$;
(ii) $q(tx) = t\,q(x)$, for all $x \in X$ and all $t \in \mathbb{R}$ with $t \ge 0$.

B. A map $q : X \to \mathbb{R}$ is said to be a seminorm if, in addition to the above two properties, it satisfies:
(ii') $q(\lambda x) = |\lambda|\, q(x)$, for all $x \in X$ and all $\lambda \in \mathbb{K}$.

Remark that if $q : X \to \mathbb{R}$ is a seminorm, then $q(x) \ge 0$, for all $x \in X$. (Use $2q(x) = q(x) + q(-x) \ge q(0) = 0$.)

There are several versions of the Hahn-Banach Theorem.

Theorem 1.1 (Hahn-Banach, $\mathbb{R}$-version). Let $X$ be an $\mathbb{R}$-vector space. Suppose $q : X \to \mathbb{R}$ is a quasi-seminorm. Suppose also we are given a linear subspace $Y \subset X$ and a linear map $\phi : Y \to \mathbb{R}$, such that

$$\phi(y) \le q(y), \quad\text{for all } y \in Y.$$

Then there exists a linear map $\psi : X \to \mathbb{R}$ such that
(i) $\psi|_Y = \phi$;
(ii) $\psi(x) \le q(x)$ for all $x \in X$.

Proof. We first prove the Theorem in the following:

Particular Case: Assume $\dim X/Y = 1$.

This means there exists some vector $x_0 \in X$ such that

$$X = \{y + s x_0 : y \in Y,\ s \in \mathbb{R}\}.$$

What we need is to prescribe the value $\psi(x_0)$. In other words, we need a number $\alpha \in \mathbb{R}$ such that, if we define $\psi : X \to \mathbb{R}$ by

$$\psi(y + s x_0) = \phi(y) + s\alpha, \quad \forall\, y \in Y,\ s \in \mathbb{R},$$

then this map satisfies condition (ii). For $s > 0$, condition (ii) reads:

$$\phi(y) + s\alpha \le q(y + s x_0), \quad \forall\, y \in Y,\ s > 0,$$

and, upon dividing by $s$ (set $z = s^{-1}y$), is equivalent to:

$$\alpha \le q(z + x_0) - \phi(z), \quad \forall\, z \in Y. \tag{1}$$

For $s < 0$, condition (ii) reads (use $t = -s$):

$$\phi(y) - t\alpha \le q(y - t x_0), \quad \forall\, y \in Y,\ t > 0,$$


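and, upon dividing by $t$ (set $w = t^{-1}y$), is equivalent to:

$$\alpha \ge \phi(w) - q(w - x_0), \quad \forall\, w \in Y. \tag{2}$$

Consider the sets

$$Z = \{q(z + x_0) - \phi(z) : z \in Y\} \subset \mathbb{R},$$
$$W = \{\phi(w) - q(w - x_0) : w \in Y\} \subset \mathbb{R}.$$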

The conditions (1) and (2) are equivalent to the inequalities

(3) sup W ≤ α ≤ inf Z.

This means that, in order to find a real number $\alpha$ with the desired property, it suffices to prove that $\sup W \le \inf Z$, which in turn is equivalent to

$$\phi(w) - q(w - x_0) \le q(z + x_0) - \phi(z), \quad \forall\, z, w \in Y. \tag{4}$$

But the condition (4) is equivalent to

$$\phi(z + w) \le q(z + x_0) + q(w - x_0),$$

which is obviously satisfied because

$$\phi(z + w) \le q(z + w) = q\big((z + x_0) + (w - x_0)\big) \le q(z + x_0) + q(w - x_0).$$
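The one-dimensional extension step can be checked on a toy example. The sketch below is only an illustration under assumed data that do not come from the notes: $X = \mathbb{R}^2$, $Y = \mathbb{R} \times \{0\}$, $q(x) = |x_1| + |x_2|$, $\phi(y_1, 0) = y_1$ and $x_0 = (0, 1)$. It evaluates the sets $Z$ and $W$ on an integer sample of $Y$ and tests one admissible value of $\alpha$.

```python
# Hypothetical toy data (not from the notes): X = R^2, Y = R x {0},
# q(x) = |x1| + |x2|, phi(y1, 0) = y1, x0 = (0, 1).

def q(x):
    return abs(x[0]) + abs(x[1])

def phi(y):
    return y[0]

x0 = (0, 1)
# Integer sample of Y, so all arithmetic below is exact.
Y_sample = [(t, 0) for t in range(-50, 51)]

# The sets Z and W from the proof, restricted to the sample.
Z = [q((z[0] + x0[0], z[1] + x0[1])) - phi(z) for z in Y_sample]
W = [phi(w) - q((w[0] - x0[0], w[1] - x0[1])) for w in Y_sample]
sup_W, inf_Z = max(W), min(Z)

def psi(x, alpha):
    # Write x = y + s*x0 with y = (x[0], 0) and s = x[1].
    return phi((x[0], 0)) + x[1] * alpha

alpha = 0  # one choice with sup_W <= alpha <= inf_Z
dominated = all(psi((t, s), alpha) <= q((t, s))
                for t in range(-20, 21) for s in range(-20, 21))
print(sup_W, inf_Z, dominated)
```

On this sample the extreme values are attained at $z = w = 0$, so the sampled interval $[\sup W, \inf Z]$ agrees with the true one, and every $\alpha$ in it gives an extension dominated by $q$.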

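Having proved the Theorem in this particular case, let us proceed now with the general case. Let us consider the set $\Xi$ of all pairs $(Z, \nu)$ with
• $Z$ is a subspace of $X$ such that $Z \supset Y$;
• $\nu : Z \to \mathbb{R}$ is a linear functional such that
(i) $\nu|_Y = \phi$;
(ii) $\nu(z) \le q(z)$, for all $z \in Z$.

Put an order relation on $\Xi$ as follows:

$$(Z_1, \nu_1) \succeq (Z_2, \nu_2) \iff Z_1 \supset Z_2 \ \text{ and } \ \nu_1|_{Z_2} = \nu_2.$$

The set $\Xi$ is non-empty, since it contains $(Y, \phi)$, and every totally ordered subset of $\Xi$ has an upper bound (take the union of the subspaces, with the functional that restricts to the given one on each of them). Using Zorn's Lemma, $\Xi$ possesses a maximal element $(Z, \psi)$. The proof of the Theorem is finished once we prove that $Z = X$. Assume $Z \subsetneq X$ and choose a vector $x_0 \in X \setminus Z$. Form the subspace $V = \{z + t x_0 : z \in Z,\ t \in \mathbb{R}\}$ and apply the particular case of the Theorem for the inclusion $Z \subset V$, for $\psi : Z \to \mathbb{R}$ and for the quasi-seminorm $q|_V : V \to \mathbb{R}$. It follows that there exists some linear functional $\eta : V \to \mathbb{R}$ such that
(i) $\eta|_Z = \psi$ (in particular we will also have $\eta|_Y = \phi$);
(ii) $\eta(v) \le q(v)$, for all $v \in V$.

But then the element $(V, \eta) \in \Xi$ will contradict the maximality of $(Z, \psi)$.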

Theorem 1.2 (Hahn-Banach, $\mathbb{C}$-version). Let $X$ be a $\mathbb{C}$-vector space. Suppose $q : X \to \mathbb{R}$ is a quasi-seminorm. Suppose also we are given a linear subspace $Y \subset X$ and a linear map $\phi : Y \to \mathbb{C}$, such that

$$\mathrm{Re}\, \phi(y) \le q(y), \quad\text{for all } y \in Y.$$

Then there exists a linear map $\psi : X \to \mathbb{C}$ such that
(i) $\psi|_Y = \phi$;
(ii) $\mathrm{Re}\, \psi(x) \le q(x)$ for all $x \in X$.




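Proof. Regard for the moment both $X$ and $Y$ as $\mathbb{R}$-vector spaces. Define the $\mathbb{R}$-linear map $\phi_1 : Y \to \mathbb{R}$ by $\phi_1(y) = \mathrm{Re}\, \phi(y)$, for all $y \in Y$, so that we have

$$\phi_1(y) \le q(y), \quad \forall\, y \in Y.$$

Use Theorem 1.1 to find an $\mathbb{R}$-linear map $\psi_1 : X \to \mathbb{R}$ such that
(i) $\psi_1|_Y = \phi_1$;
(ii) $\psi_1(x) \le q(x)$, for all $x \in X$.

Define the map $\psi : X \to \mathbb{C}$ by

$$\psi(x) = \psi_1(x) - i\psi_1(ix), \quad\text{for all } x \in X.$$

Claim 1: $\psi$ is $\mathbb{C}$-linear.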

It is obvious that $\psi$ is $\mathbb{R}$-linear, so the only thing to prove is that $\psi(ix) = i\psi(x)$, for all $x \in X$. But this is quite obvious:

$$\psi(ix) = \psi_1(ix) - i\psi_1(i^2 x) = \psi_1(ix) - i\psi_1(-x) = -i^2\psi_1(ix) + i\psi_1(x) = i\big(\psi_1(x) - i\psi_1(ix)\big) = i\psi(x), \quad \forall\, x \in X.$$

Because of the way $\psi$ is defined, and because $\psi_1$ is real-valued, condition (ii) in the Theorem follows immediately:

$$\mathrm{Re}\, \psi(x) = \psi_1(x) \le q(x), \quad \forall\, x \in X,$$

so in order to finish the proof, we need to prove condition (i) in the Theorem (i.e. $\psi|_Y = \phi$). This follows from the fact that $\phi_1 = \psi_1|_Y$, and from:

Claim 2: For every $y \in Y$, we have $\phi(y) = \phi_1(y) - i\phi_1(iy)$.

But this is quite obvious, because

$$\mathrm{Im}\, \phi(y) = -\mathrm{Re}\,\big(i\phi(y)\big) = -\mathrm{Re}\, \phi(iy) = -\phi_1(iy), \quad \forall\, y \in Y.$$
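The passage from a real-linear functional $\psi_1$ to the complex-linear functional $\psi(x) = \psi_1(x) - i\psi_1(ix)$ can be sanity-checked numerically. The following is a minimal sketch on $X = \mathbb{C}^2$; the coefficient vectors `a`, `b` and the test vector are arbitrary illustrative choices, not taken from the notes.

```python
# Hypothetical example: an R-linear functional on C^2 that is not C-linear.
def psi1(x, a=(2, -1), b=(3, 5)):
    return sum(ak * xk.real + bk * xk.imag for ak, bk, xk in zip(a, b, x))

def psi(x):
    # The complexification used in the proof: psi(x) = psi1(x) - i*psi1(i*x).
    ix = [1j * xk for xk in x]
    return complex(psi1(x), -psi1(ix))

def scale(lam, x):
    return [lam * xk for xk in x]

x = [complex(1, 2), complex(-3, 4)]
lam = complex(2, -3)

print(psi(scale(1j, x)) == 1j * psi(x))    # Claim 1: psi(ix) = i psi(x)
print(psi(scale(lam, x)) == lam * psi(x))  # C-homogeneity for a sample scalar
print(psi(x).real == psi1(x))              # Re psi = psi1
```

All inputs are small integers, so the floating-point comparisons are exact here; with general data a tolerance would be needed.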

Theorem 1.3 (Hahn-Banach, for seminorms). Let $X$ be a $\mathbb{K}$-vector space ($\mathbb{K}$ is either $\mathbb{R}$ or $\mathbb{C}$). Suppose $q$ is a seminorm on $X$. Suppose also we are given a linear subspace $Y \subset X$ and a linear map $\phi : Y \to \mathbb{K}$, such that

$$|\phi(y)| \le q(y), \quad\text{for all } y \in Y.$$

Then there exists a linear map $\psi : X \to \mathbb{K}$ such that
(i) $\psi|_Y = \phi$;
(ii) $|\psi(x)| \le q(x)$ for all $x \in X$.

Proof. We are going to apply Theorems 1.1 and 1.2, using the fact that $q$ is also a quasi-seminorm.

The case $\mathbb{K} = \mathbb{R}$. Remark that

$$\phi(y) \le |\phi(y)| \le q(y), \quad \forall\, y \in Y.$$

So we can apply Theorem 1.1 and find $\psi : X \to \mathbb{R}$ with
(i) $\psi|_Y = \phi$;
(ii) $\psi(x) \le q(x)$, for all $x \in X$.




Using condition (ii) we also get

$$-\psi(x) = \psi(-x) \le q(-x) = q(x), \quad\text{for all } x \in X.$$

In other words we get

$$\pm\psi(x) \le q(x), \quad\text{for all } x \in X,$$

which of course gives the desired property (ii) in the Theorem.

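The case $\mathbb{K} = \mathbb{C}$. Remark that

$$\mathrm{Re}\, \phi(y) \le |\phi(y)| \le q(y), \quad \forall\, y \in Y.$$

So we can apply Theorem 1.2 and find $\psi : X \to \mathbb{C}$ with
(i) $\psi|_Y = \phi$;
(ii) $\mathrm{Re}\, \psi(x) \le q(x)$, for all $x \in X$.

Using condition (ii) we also get

$$\mathrm{Re}\,\big(\lambda\psi(x)\big) = \mathrm{Re}\, \psi(\lambda x) \le q(\lambda x) = q(x), \quad\text{for all } x \in X \text{ and all } \lambda \in \mathbb{T}. \tag{5}$$

(Here $\mathbb{T} = \{\lambda \in \mathbb{C} : |\lambda| = 1\}$.) Fix for the moment $x \in X$. There exists some $\lambda \in \mathbb{T}$ such that $|\psi(x)| = \lambda\psi(x)$. For this particular $\lambda$ we will have $\mathrm{Re}\,\big(\lambda\psi(x)\big) = |\psi(x)|$, so the inequality (5) will give

$$|\psi(x)| \le q(x).$$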

In the remainder of this section we will discuss the geometric form of the Hahn-Banach theorems. We begin by describing a method of constructing quasi-seminorms.

Proposition 1.1. Let $X$ be a real vector space. Suppose $C \subset X$ is a convex subset, which contains $0$, and has the property

$$\bigcup_{t > 0} tC = X. \tag{6}$$

For every $x \in X$ we define

$$Q_C(x) = \inf\{t > 0 : x \in tC\}.$$

(By (6) the set in the right hand side is non-empty.) Then the map $Q_C : X \to \mathbb{R}$ is a quasi-seminorm.

Proof. For every $x \in X$, let us define the set

$$T_C(x) = \{t > 0 : x \in tC\}.$$

It is pretty clear that, since $0 \in C$, we have

$$T_C(0) = (0, \infty),$$

so we get

$$Q_C(0) = \inf T_C(0) = 0.$$

Claim 1: For every $x \in X$ and every $\lambda > 0$, one has the equality

$$T_C(\lambda x) = \lambda T_C(x).$$




Indeed, if $t \in T_C(\lambda x)$, we have $\lambda x \in tC$, which means that $x \in \lambda^{-1}tC$, i.e. $\lambda^{-1}t \in T_C(x)$. Consequently we have

$$t = \lambda(\lambda^{-1}t) \in \lambda T_C(x),$$

which proves the inclusion

$$T_C(\lambda x) \subset \lambda T_C(x).$$

To prove the other inclusion, we start with some $s \in \lambda T_C(x)$, which means that there exists some $t \in T_C(x)$ with $\lambda t = s$. The fact that $t = \lambda^{-1}s$ belongs to $T_C(x)$ means that $x \in \lambda^{-1}sC$, so we get $\lambda x \in sC$, so $s$ indeed belongs to $T_C(\lambda x)$.

Claim 2: For every $x, y \in X$, one has the inclusion (see footnote 4)

$$T_C(x + y) \supset T_C(x) + T_C(y).$$

Start with some $t \in T_C(x)$ and some $s \in T_C(y)$. Define the elements $u = t^{-1}x$ and $v = s^{-1}y$. Since $u, v \in C$, and $C$ is convex, it follows that $C$ contains the element

$$\frac{t}{t+s}\, u + \frac{s}{t+s}\, v = \frac{1}{t+s}\,(x + y),$$

which means that $x + y \in (t + s)C$, so $t + s$ indeed belongs to $T_C(x + y)$.

We can now conclude the proof. If $x \in X$ and $\lambda > 0$, then the equality

$$Q_C(\lambda x) = \lambda Q_C(x)$$

is an immediate consequence of Claim 1 (the case $\lambda = 0$ is the equality $Q_C(0) = 0$ proven above). If $x, y \in X$, then the inequality

$$Q_C(x + y) \le Q_C(x) + Q_C(y)$$

is an immediate consequence of Claim 2.

Definition. Under the hypothesis of the above proposition, the quasi-seminorm $Q_C$ is called the Minkowski functional associated with the set $C$.
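To make the definition concrete, here is a rough numerical sketch that approximates $Q_C(x) = \inf\{t > 0 : x \in tC\}$ by bisection, using only a membership test for $C$. The helper name `minkowski`, the choice of $C$ (an open ellipse in $\mathbb{R}^2$) and the sample vectors are assumptions made for illustration; the bisection relies on the fact that, for convex $C$ containing $0$, the set $T_C(x)$ is an interval unbounded to the right.

```python
import math

def minkowski(x, contains, iters=60):
    """Approximate Q_C(x) = inf{t > 0 : x in tC} by bisection.

    `contains` is a membership test for a convex set C containing 0
    that satisfies property (6).
    """
    hi = 1.0
    while not contains([xi / hi for xi in x]):  # find some t with x in tC
        hi *= 2.0
    lo = 0.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if contains([xi / mid for xi in x]):
            hi = mid
        else:
            lo = mid
    return hi

# Hypothetical convex set: the open ellipse (x1/2)^2 + x2^2 < 1.
def in_ellipse(x):
    return (x[0] / 2.0) ** 2 + x[1] ** 2 < 1.0

def exact(x):
    # Closed form of Q_C for this particular ellipse.
    return math.sqrt((x[0] / 2.0) ** 2 + x[1] ** 2)

x, y = [3.0, 4.0], [-1.0, 2.5]
qx, qy = minkowski(x, in_ellipse), minkowski(y, in_ellipse)
qxy = minkowski([x[0] + y[0], x[1] + y[1]], in_ellipse)
print(abs(qx - exact(x)) < 1e-9)  # agrees with the closed form
print(qxy <= qx + qy + 1e-9)      # subadditivity, up to rounding
```

For a symmetric set such as this ellipse, $Q_C$ is in fact a norm; a non-symmetric $C$ would give a genuine quasi-seminorm with $Q_C(-x) \ne Q_C(x)$ in general.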

Remark 1.1. Let $X$ be a real vector space. Suppose $C \subset X$ is a convex subset, which contains $0$, and has the property (6). Then one has the inclusions

$$\{x \in X : Q_C(x) < 1\} \subset C \subset \{x \in X : Q_C(x) \le 1\}.$$

The second inclusion is pretty obvious, since if we start with some $x \in C$, using the notations from the proof of Proposition 1.1, we have $1 \in T_C(x)$, so

$$Q_C(x) = \inf T_C(x) \le 1.$$

To prove the first inclusion, start with some $x \in X$ with $Q_C(x) < 1$. In particular this means that there exists some $t \in (0, 1)$ such that $x \in tC$. Define the vector $y = t^{-1}x \in C$ and notice now that, since $C$ is convex, it will contain the convex combination $ty + (1 - t)0 = x$.

Exercise 1. Let $X$ be a real vector space, and let $q : X \to \mathbb{R}$ be a quasi-seminorm. Define the sets

$$C_0 = \{x \in X : q(x) < 1\},$$
$$C_1 = \{x \in X : q(x) \le 1\}.$$

(i) Prove that $C_0$ and $C_1$ are both convex, they contain $0$, and they both have property (6).

4 For subsets $T, S \subset \mathbb{R}$ we define $T + S = \{t + s : t \in T,\ s \in S\}$.




(ii) Let C be any convex set with

C0 ⊂ C ⊂ C1.

Analyze the relationship between QC and q.

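Definition. A topological vector space is a vector space $X$ over $\mathbb{K}$ (which is either $\mathbb{R}$ or $\mathbb{C}$), which is also a topological space, such that the maps

$$X \times X \ni (x, y) \longmapsto x + y \in X$$
$$\mathbb{K} \times X \ni (\lambda, x) \longmapsto \lambda x \in X$$

are continuous.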

Remark 1.2. Let $X$ be a real topological vector space. Suppose $C \subset X$ is a convex open subset, which contains $0$. Then $C$ has the property (6). Moreover (compare with Remark 1.1), one has the equality

$$\{x \in X : Q_C(x) < 1\} = C. \tag{7}$$

To prove this remark, we define for each $x \in X$ the function

$$F_x : \mathbb{R} \ni t \longmapsto tx \in X.$$

Since $X$ is a topological vector space, the maps $F_x$, $x \in X$, are continuous. To prove the property (6) we start with an arbitrary $x \in X$, and we use the continuity of the map $F_x$ at $0$. Since $C$ is a neighborhood of $0 = F_x(0)$, there exists some $\rho > 0$ such that

$$F_x(t) \in C, \quad \forall\, t \in [-\rho, \rho].$$

In particular we get $\rho x \in C$, which means that $x \in \rho^{-1}C$.

To prove the equality (7) we only need to prove the inclusion "$\supset$" (since the inclusion "$\subset$" holds in general, by Remark 1.1). Start with some element $x \in C$. Using the continuity of the map $F_x$ at $1$, plus the fact that $F_x(1) = x \in C$ and $C$ is open, there exists some $\varepsilon > 0$ such that

$$F_x(t) \in C, \quad \forall\, t \in [1 - \varepsilon, 1 + \varepsilon].$$

In particular, we have $F_x(1 + \varepsilon) \in C$, which means precisely that

$$x \in (1 + \varepsilon)^{-1}C.$$

This gives the inequality

$$Q_C(x) \le (1 + \varepsilon)^{-1},$$

so we indeed get $Q_C(x) < 1$.

The first geometric version of the Hahn-Banach Theorem is:

Lemma 1.1. Let X be a real topological vector space, and let C ⊂ X be a convex open set which contains 0. If x0 ∈ X is some point which does not belong to C, then there exists a linear continuous map φ : X → R, such that

•φ(x0) = 1;

• φ(v) < 1, ∀ v ∈ C.

Proof. Consider the linear subspace

$$Y = \mathbb{R}x_0 = \{t x_0 : t \in \mathbb{R}\},$$

and define $\psi : Y \to \mathbb{R}$ by

$$\psi(t x_0) = t, \quad \forall\, t \in \mathbb{R}.$$

It is obvious that $\psi$ is linear, and $\psi(x_0) = 1$.




Claim: One has the inequality

ψ(y) ≤ QC(y), ∀ y ∈ Y.

Let $y$ be represented as $y = t x_0$ for some $t \in \mathbb{R}$. If $t \le 0$, the inequality is clear, because $\psi(y) = t \le 0$ and the right hand side $Q_C(y)$ is always non-negative. Assume $t > 0$. Since $Q_C$ is a quasi-seminorm, we have

$$Q_C(y) = Q_C(t x_0) = t\, Q_C(x_0), \tag{8}$$

and the fact that $x_0 \notin C$ will give (by Remark 1.2) the inequality $Q_C(x_0) \ge 1$. Since $t > 0$, the computation (8) can be continued with

$$Q_C(y) = t\, Q_C(x_0) \ge t = \psi(y),$$

so the Claim follows also in this case.

Use now the Hahn-Banach Theorem (Theorem 1.1, with the quasi-seminorm $Q_C$), to find a linear map $\phi : X \to \mathbb{R}$ such that

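(i) $\phi|_Y = \psi$;
(ii) $\phi(x) \le Q_C(x)$, $\forall\, x \in X$.

It is obvious that (i) gives $\phi(x_0) = \psi(x_0) = 1$. If $v \in C$, then by Remark 1.2 we have $Q_C(v) < 1$, so by (ii) we also get $\phi(v) < 1$. This means that the only thing that remains to be proven is the continuity of $\phi$. Since $\phi$ is linear, we only need to prove that $\phi$ is continuous at $0$. Start with some $\varepsilon > 0$. We must find some open set $U_\varepsilon \subset X$, with $U_\varepsilon \ni 0$, such that

$$|\phi(u)| < \varepsilon, \quad \forall\, u \in U_\varepsilon.$$

We take $U_\varepsilon = (\varepsilon C) \cap (-\varepsilon C)$. Notice that, for every $u \in U_\varepsilon$, we have $\pm u \in \varepsilon C$, which gives $\varepsilon^{-1}(\pm u) \in C$. By Remark 1.2 this gives $Q_C\big(\varepsilon^{-1}(\pm u)\big) < 1$, which gives

$$Q_C(\pm u) < \varepsilon.$$

Then using property (ii) we immediately get

$$\phi(\pm u) < \varepsilon,$$

that is, $|\phi(u)| < \varepsilon$, and we are done.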

It turns out that the above result is a particular case of a more general result:

Theorem 1.4 (Hahn-Banach Separation Theorem - real case). Let X be a real topological vector space, let A,B ⊂ X be non-empty convex sets with A open, and A ∩ B = ∅. Then there exists a linear continuous map φ : X → R, and a real number α, such that

φ(a) < α ≤ φ(b), ∀ a ∈ A, b ∈ B.

Proof. Fix some points $a_0 \in A$, $b_0 \in B$, and define the set

$$C = A - B + b_0 - a_0 = \{a - b + b_0 - a_0 : a \in A,\ b \in B\}.$$

It is straightforward that $C$ is convex and contains $0$. The equality

$$C = \bigcup_{b \in B} (A - b + b_0 - a_0)$$

shows that $C$ is also open. Define the vector $x_0 = b_0 - a_0$. Since $A \cap B = \varnothing$, it is clear that $x_0 \notin C$.

Use Lemma 1.1 to produce a linear continuous map $\phi : X \to \mathbb{R}$ such that

(i) $\phi(x_0) = 1$;




(ii) φ(v) < 1, ∀ v ∈ C.

By the definition of x0 and C, we have φ(b0) = φ(a0) + 1, and

φ(a) < φ(b) + φ(a0) − φ(b0) + 1, ∀ a ∈ A, b ∈ B,

which gives

(9) φ(a) < φ(b), ∀ a ∈ A, b ∈ B.

Put

$$\alpha = \inf_{b \in B} \phi(b).$$

The inequalities (9) give

(10) φ(a) ≤ α ≤ φ(b), ∀ a ∈ A, b ∈ B.

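The proof will be complete once we prove the following:

Claim: $\phi(a) < \alpha$, $\forall\, a \in A$.

Suppose the contrary, i.e. there exists some $a_1 \in A$ with $\phi(a_1) = \alpha$. Using the continuity of the map

$$\mathbb{R} \ni t \longmapsto a_1 + t x_0 \in X,$$

and the fact that $A$ is open, there exists some $\varepsilon > 0$ such that

$$a_1 + t x_0 \in A, \quad \forall\, t \in [-\varepsilon, \varepsilon].$$

In particular, by (10) one has

$$\phi(a_1 + \varepsilon x_0) \le \alpha,$$

which means (recall that $\phi(x_0) = 1$) that

$$\alpha + \varepsilon \le \alpha,$$

which is clearly impossible.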

Theorem 1.5 (Hahn-Banach Separation Theorem - complex case). Let $X$ be a complex topological vector space, let $A, B \subset X$ be non-empty convex sets with $A$ open, and $A \cap B = \varnothing$. Then there exists a linear continuous map $\phi : X \to \mathbb{C}$, and a real number $\alpha$, such that

$$\mathrm{Re}\, \phi(a) < \alpha \le \mathrm{Re}\, \phi(b), \quad \forall\, a \in A,\ b \in B.$$

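Proof. Regard $X$ as a real topological vector space, and apply the real version to produce an $\mathbb{R}$-linear continuous map $\phi_1 : X \to \mathbb{R}$, and a real number $\alpha$, such that

$$\phi_1(a) < \alpha \le \phi_1(b), \quad \forall\, a \in A,\ b \in B.$$

Then the function $\phi : X \to \mathbb{C}$ defined by

$$\phi(x) = \phi_1(x) - i\phi_1(ix), \quad x \in X,$$

is continuous, it is $\mathbb{C}$-linear (exactly as in the proof of Theorem 1.2), and it satisfies $\mathrm{Re}\, \phi = \phi_1$, so it will clearly satisfy the desired properties.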

There is another version of the Hahn-Banach Separation Theorem, which holds for a special type of topological vector spaces. Before we discuss these, we shall need a technical result.




Lemma 1.2. Let $X$ be a topological vector space, let $C \subset X$ be a compact set, and let $D \subset X$ be a closed set. Then the set

$$C + D = \{x + y : x \in C,\ y \in D\}$$

is closed.

Proof. Start with some point $p \in \overline{C + D}$, and let us prove that $p \in C + D$. For every neighborhood $U$ of $0$, the set $p + U$ is a neighborhood of $p$, so by assumption, we have

$$(p + U) \cap (C + D) \ne \varnothing. \tag{11}$$

Define, for each neighborhood $U$ of $0$, the set

$$A_U = (p + U - D) \cap C.$$

Using (11), it is clear that $A_U$ is non-empty. It is also clear that, if $U_1 \subset U_2$, then $A_{U_1} \subset A_{U_2}$. Using the compactness of $C$, it follows that

$$\bigcap_{U \text{ neighborhood of } 0} \overline{A_U} \ne \varnothing.$$

Choose then a point $q$ in the above intersection. It follows that

$$(q + V) \cap A_U \ne \varnothing,$$

for any two neighborhoods $U$ and $V$ of $0$. In other words, for any two such neighborhoods of $0$, we have

$$(q + V - U) \cap (p - D) \ne \varnothing. \tag{12}$$

Fix now an arbitrary neighborhood $W$ of $0$. Using the continuity of the map

$$X \times X \ni (x_1, x_2) \longmapsto x_1 - x_2 \in X,$$

there exist neighborhoods $U$ and $V$ of $0$, such that $U - V \subset W$. Then $q + V - U \subset q - W$, so (12) gives

$$(q - W) \cap (p - D) \ne \varnothing,$$

which yields

$$(p - q + W) \cap D \ne \varnothing.$$

Since this is true for all neighborhoods $W$ of $0$, we get $p - q \in \overline{D}$, and since $D$ is closed, we finally get $p - q \in D$. Since, by construction, we have $q \in C$, it follows that the point $p = q + (p - q)$ indeed belongs to $C + D$.

Definition. A topological vector space $X$ is said to be locally convex, if every point has a fundamental system of convex open neighborhoods. This means that for every $x \in X$ and every neighborhood $N$ of $x$, there exists a convex open set $D$, with $x \in D \subset N$.

Theorem 1.6 (Hahn-Banach Separation Theorem for Locally Convex Spaces). Let K be one of the fields R or C, and let X be a locally convex K-vector space. Suppose C, D ⊂ X are convex sets, with C compact, D closed, and C ∩ D = ∅. Then there exists a linear continuous map φ : X → K, and two numbers α, β ∈ R, such that

Re φ(x) ≤ α < β ≤ Re φ(y), ∀ x ∈ C, y ∈ D.




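Proof. Consider the convex set $B = D - C$. By Lemma ??, $B$ is closed. Since $C \cap D = \varnothing$, we have $0 \notin B$. Since $B$ is closed, its complement $X \setminus B$ will then be a neighborhood of $0$. Since $X$ is locally convex, there exists a convex open set $A$, with $0 \in A \subset X \setminus B$. In particular we have $A \cap B = \varnothing$. Applying the suitable version of the Hahn-Banach Separation Theorem (real or complex case), we find a linear continuous map $\phi : X \to \mathbb{K}$, and a real number $\rho$, such that

$$\mathrm{Re}\, \phi(a) < \rho \le \mathrm{Re}\, \phi(b), \quad \forall\, a \in A,\ b \in B.$$

Notice that, since $A \ni 0$, we get $\rho > \mathrm{Re}\, \phi(0) = 0$. Then the inequality

$$\rho \le \mathrm{Re}\, \phi(b), \quad b \in B,$$

gives

$$\mathrm{Re}\, \phi(y) - \mathrm{Re}\, \phi(x) \ge \rho > 0, \quad \forall\, x \in C,\ y \in D.$$

Then if we define

$$\beta = \inf_{y \in D} \mathrm{Re}\, \phi(y) \quad\text{and}\quad \alpha = \sup_{x \in C} \mathrm{Re}\, \phi(x),$$

we get $\beta \ge \alpha + \rho$, and we are done.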



Lectures 9-11

2. Normed vector spaces

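Definition. Let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let $X$ be a $\mathbb{K}$-vector space. A norm on $X$ is a map

$$X \ni x \longmapsto \|x\| \in [0, \infty)$$

with the following properties:
(i) $\|x + y\| \le \|x\| + \|y\|$, $\forall\, x, y \in X$;
(ii) $\|\lambda x\| = |\lambda| \cdot \|x\|$, $\forall\, x \in X$, $\lambda \in \mathbb{K}$;
(iii) $\|x\| = 0 \Longrightarrow x = 0$.

(Note that conditions (i) and (ii) state that $\|\cdot\|$ is a seminorm.)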

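Example 2.1. Let $\mathbb{K}$ be either $\mathbb{R}$ or $\mathbb{C}$. Fix some non-empty set $I$, and define

$$c_0^{\mathbb{K}}(I) = \Big\{\alpha : I \to \mathbb{K} \ :\ \inf_{F \subset I \text{ finite}}\ \sup_{i \in I \setminus F} |\alpha(i)| = 0\Big\}.$$

Remark that for a function $\alpha : I \to \mathbb{K}$, the fact that $\alpha$ belongs to $c_0^{\mathbb{K}}(I)$ is equivalent to the following condition:
• For every $\varepsilon > 0$, there exists some finite set $F \subset I$, such that $|\alpha(i)| < \varepsilon$, $\forall\, i \in I \setminus F$.

We equip the space $c_0^{\mathbb{K}}(I)$ with the $\mathbb{K}$-vector space structure defined by point-wise addition and point-wise scalar multiplication. We also define the norm $\|\cdot\|_\infty$ by

$$\|\alpha\|_\infty = \sup_{i \in I} |\alpha(i)|, \quad \alpha \in c_0^{\mathbb{K}}(I).$$

When $\mathbb{K} = \mathbb{C}$, the space $c_0^{\mathbb{C}}(I)$ is simply denoted by $c_0(I)$. When $I = \mathbb{N}$ (the set of natural numbers) the space $c_0^{\mathbb{K}}(\mathbb{N})$ can be equivalently described as

$$c_0^{\mathbb{K}}(\mathbb{N}) = \Big\{\alpha = (\alpha_n)_{n \ge 1} \subset \mathbb{K} \ :\ \lim_{n \to \infty} \alpha_n = 0\Big\}.$$

In this case instead of $c_0^{\mathbb{R}}(\mathbb{N})$ we simply write $c_0^{\mathbb{R}}$, and instead of $c_0(\mathbb{N})$ we simply write $c_0$.

Exercise 1. Prove that $\|\cdot\|_\infty$ is indeed a norm on $c_0^{\mathbb{K}}(I)$.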

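Example 2.2. Let $\mathbb{K}$ be either $\mathbb{R}$ or $\mathbb{C}$, and let $I$ be a non-empty set. We define the space

$$\mathrm{fin}_{\mathbb{K}}(I) = \big\{\alpha : I \to \mathbb{K} \ :\ \text{the set } \{i \in I : \alpha(i) \ne 0\} \text{ is finite}\big\}.$$

Then $\mathrm{fin}_{\mathbb{K}}(I)$ is a linear subspace in $c_0^{\mathbb{K}}(I)$.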


Definition. Suppose $X$ is a normed vector space, with norm $\|\cdot\|$. Then there is a natural metric $d$ on $X$, defined by

$$d(x, y) = \|x - y\|, \quad x, y \in X.$$

The topology on $X$, defined by this metric, is called the norm topology.

Exercise 2. Let $X$ be a normed vector space, over $\mathbb{K}$ ($= \mathbb{R}, \mathbb{C}$). Prove that, when equipped with the norm topology, $X$ becomes a topological vector space. That is, the maps

$$X \times X \ni (x, y) \longmapsto x + y \in X$$
$$\mathbb{K} \times X \ni (\lambda, x) \longmapsto \lambda x \in X$$

are continuous.

Exercise 3. Let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let $I$ be a non-empty set. Prove that $\mathrm{fin}_{\mathbb{K}}(I)$ is dense in $c_0^{\mathbb{K}}(I)$ in the norm topology.

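Example 2.3. Let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let $I$ be a non-empty set. Define

$$\ell^\infty_{\mathbb{K}}(I) = \Big\{\alpha : I \to \mathbb{K} \ :\ \sup_{i \in I} |\alpha(i)| < \infty\Big\}.$$

We equip the space $\ell^\infty_{\mathbb{K}}(I)$ with the $\mathbb{K}$-vector space structure defined by point-wise addition and point-wise scalar multiplication. We also define the norm $\|\cdot\|_\infty$ by

$$\|\alpha\|_\infty = \sup_{i \in I} |\alpha(i)|, \quad \alpha \in \ell^\infty_{\mathbb{K}}(I).$$

When $\mathbb{K} = \mathbb{C}$, the space $\ell^\infty_{\mathbb{C}}(I)$ is simply denoted by $\ell^\infty(I)$. When $I = \mathbb{N}$ (the set of natural numbers) instead of $\ell^\infty_{\mathbb{R}}(\mathbb{N})$ we simply write $\ell^\infty_{\mathbb{R}}$, and instead of $\ell^\infty(\mathbb{N})$ we simply write $\ell^\infty$.

Exercise 4. Prove that $\|\cdot\|_\infty$ is indeed a norm on $\ell^\infty_{\mathbb{K}}(I)$.

Exercise 5. Let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let $I$ be a non-empty set. Prove that $c_0^{\mathbb{K}}(I)$ is a linear subspace in $\ell^\infty_{\mathbb{K}}(I)$, which is closed in the norm topology.

In preparation for the next class of examples, we introduce the following: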

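Definition. A map $\alpha : I \to \mathbb{K}$ is said to be summable, if there exists some number $s \in \mathbb{K}$ such that

(s) for every $\varepsilon > 0$ there exists some finite set $F_\varepsilon \subset I$ such that
$$\Big| s - \sum_{i \in F} \alpha(i) \Big| < \varepsilon, \quad\text{for all finite sets } F \text{ with } F_\varepsilon \subset F \subset I.$$

If such an $s$ exists, then it is unique, and it is denoted by $\sum_{i \in I} \alpha(i)$. In the case when $I$ is finite, every map $\alpha : I \to \mathbb{K}$ is summable, and the above notation agrees with the usual notation for the sum.

Exercise 6. Assume $\alpha : I \to \mathbb{K}$ is summable. Prove that, for every $\lambda \in \mathbb{K}$, the map $\lambda\alpha : I \to \mathbb{K}$ is summable, and

$$\sum_{i \in I} \lambda\alpha(i) = \lambda \sum_{i \in I} \alpha(i).$$

If $\beta : I \to \mathbb{K}$ is another summable map, prove that $\alpha + \beta : I \to \mathbb{K}$ is summable, and

$$\sum_{i \in I} [\alpha(i) + \beta(i)] = \Big[\sum_{i \in I} \alpha(i)\Big] + \Big[\sum_{i \in I} \beta(i)\Big].$$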

The following result characterizes summability for maps with non-negative terms.




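Lemma 2.1. Let $I$ be a non-empty set, and let $\alpha : I \to [0, \infty)$. The following are equivalent:
(i) $\alpha$ is summable;
(ii) $\sup\Big\{\sum_{i \in F} \alpha(i) : F \subset I, \text{ finite}\Big\} < \infty$.

Moreover, in this case we have

$$\sup\Big\{\sum_{i \in F} \alpha(i) : F \subset I, \text{ finite}\Big\} = \sum_{i \in I} \alpha(i).$$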

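Proof. We denote the quantity $\sup\big\{\sum_{i \in F} \alpha(i) : F \subset I, \text{ finite}\big\}$ simply by $t$.

(i) ⇒ (ii). Assume $\alpha$ is summable, and denote $\sum_{i \in I} \alpha(i)$ simply by $s$. Choose, for each $\varepsilon > 0$, a finite set $F_\varepsilon \subset I$ such that

$$\Big| s - \sum_{i \in F} \alpha(i) \Big| < \varepsilon, \quad\text{for all finite subsets } F \subset I \text{ with } F \supset F_\varepsilon.$$

Claim: For any finite set $G \subset I$, and any $\varepsilon > 0$, one has the inequality

$$\sum_{i \in G} \alpha(i) < s + \varepsilon.$$

Indeed, if we take the finite set $G \cup F_\varepsilon$, then using the fact that all $\alpha$'s are non-negative, we get

$$\sum_{i \in G} \alpha(i) \le \sum_{i \in G \cup F_\varepsilon} \alpha(i) < s + \varepsilon.$$

Using the Claim, which holds for any $\varepsilon > 0$, we immediately get

$$\sum_{i \in G} \alpha(i) \le s, \quad\text{for all finite subsets } G \subset I,$$

so taking supremum yields $t \le s$, in particular $t < \infty$.

(ii) ⇒ (i). Assume condition (ii) is true. We are going to show that $\alpha$ is summable, by proving that the number $t$ satisfies the definition of summability. Consider the set

$$S = \Big\{\sum_{i \in F} \alpha(i) : F \text{ finite subset of } I\Big\},$$

so that $\sup S = t < \infty$. Start with some $\varepsilon > 0$. Since $t - \varepsilon$ is no longer an upper bound for $S$, there exists some finite set $F_\varepsilon \subset I$, such that

$$\sum_{i \in F_\varepsilon} \alpha(i) > t - \varepsilon.$$

Notice that, for any finite set $F \subset I$ with $F \supset F_\varepsilon$, we have

$$t - \varepsilon < \sum_{i \in F_\varepsilon} \alpha(i) \le \sum_{i \in F} \alpha(i) \le t,$$

so we immediately get

$$\Big| t - \sum_{i \in F} \alpha(i) \Big| < \varepsilon.$$

This shows that $\alpha$ is summable with sum $t$, which also gives the equality $t = \sum_{i \in I} \alpha(i)$.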




Exercise 7. Let $\alpha : I \to [0, \infty)$ be summable. Prove that every map $\beta : I \to [0, \infty)$ with

$$\beta(j) \le \alpha(j), \quad \forall\, j \in I,$$

is summable, and

$$\sum_{j \in I} \beta(j) \le \sum_{j \in I} \alpha(j).$$

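Remark 2.1. It is obvious that the above result has a version for non-positive maps as well. More explicitly, for a map $\alpha : I \to (-\infty, 0]$ the following are equivalent:
(i) $\alpha$ is summable;
(ii) $\inf\Big\{\sum_{i \in F} \alpha(i) : F \subset I, \text{ finite}\Big\} > -\infty$.

Moreover, in this case we have

$$\inf\Big\{\sum_{i \in F} \alpha(i) : F \subset I, \text{ finite}\Big\} = \sum_{i \in I} \alpha(i).$$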

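Lemma 2.2. Let $I$ be a non-empty set. For a function $\alpha : I \to \mathbb{C}$, the following are equivalent:
(i) $\alpha$ is summable;
(ii) both functions $\mathrm{Re}\, \alpha, \mathrm{Im}\, \alpha : I \to \mathbb{R}$ are summable.

Moreover, in this case we have the equality

$$\sum_{j \in I} \alpha(j) = \sum_{j \in I} \mathrm{Re}\, \alpha(j) + i \sum_{j \in I} \mathrm{Im}\, \alpha(j).$$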

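Proof. (i) ⇒ (ii). Assume $\alpha$ is summable. Denote the sum $\sum_{j \in I} \alpha(j)$ simply by $s$. For every $\varepsilon > 0$ choose a finite set $F_\varepsilon \subset I$ such that

$$\Big| s - \sum_{j \in F} \alpha(j) \Big| < \varepsilon, \quad\text{for all finite sets } F \subset I \text{ with } F \supset F_\varepsilon.$$

Using the inequality

$$\max\{|\mathrm{Re}\, z|, |\mathrm{Im}\, z|\} \le |z|, \quad \forall\, z \in \mathbb{C},$$

we immediately get the inequalities

$$\Big| \mathrm{Re}\, s - \sum_{j \in F} \mathrm{Re}\, \alpha(j) \Big| = \Big| \mathrm{Re}\Big[ s - \sum_{j \in F} \alpha(j) \Big] \Big| \le \Big| s - \sum_{j \in F} \alpha(j) \Big| < \varepsilon,$$
$$\Big| \mathrm{Im}\, s - \sum_{j \in F} \mathrm{Im}\, \alpha(j) \Big| = \Big| \mathrm{Im}\Big[ s - \sum_{j \in F} \alpha(j) \Big] \Big| \le \Big| s - \sum_{j \in F} \alpha(j) \Big| < \varepsilon,$$

for all finite sets $F \subset I$ with $F \supset F_\varepsilon$, so $\mathrm{Re}\, \alpha$ and $\mathrm{Im}\, \alpha$ are indeed summable and moreover, we have

$$\sum_{j \in I} \mathrm{Re}\, \alpha(j) = \mathrm{Re}\, s \quad\text{and}\quad \sum_{j \in I} \mathrm{Im}\, \alpha(j) = \mathrm{Im}\, s.$$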

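(ii) ⇒ (i). Assume $\mathrm{Re}\, \alpha$ and $\mathrm{Im}\, \alpha$ are both summable. Denote $\sum_{j \in I} \mathrm{Re}\, \alpha(j)$ by $u$ and denote $\sum_{j \in I} \mathrm{Im}\, \alpha(j)$ by $v$. Fix some $\varepsilon > 0$. Choose finite sets $E_\varepsilon, G_\varepsilon \subset I$ such that

$$\Big| u - \sum_{j \in E} \mathrm{Re}\, \alpha(j) \Big| < \frac{\varepsilon}{2}, \quad\text{for all finite sets } E \subset I \text{ with } E \supset E_\varepsilon,$$
$$\Big| v - \sum_{j \in G} \mathrm{Im}\, \alpha(j) \Big| < \frac{\varepsilon}{2}, \quad\text{for all finite sets } G \subset I \text{ with } G \supset G_\varepsilon.$$

Put $F_\varepsilon = E_\varepsilon \cup G_\varepsilon$. Suppose $F \subset I$ is a finite set with $F \supset F_\varepsilon$. Using the inclusions $F \supset E_\varepsilon$ and $F \supset G_\varepsilon$, we then get

$$\Big| u - \sum_{j \in F} \mathrm{Re}\, \alpha(j) \Big| < \frac{\varepsilon}{2} \quad\text{and}\quad \Big| v - \sum_{j \in F} \mathrm{Im}\, \alpha(j) \Big| < \frac{\varepsilon}{2},$$

so we get

$$\Big| [u + iv] - \sum_{j \in F} \alpha(j) \Big| = \Big| \Big[ u - \sum_{j \in F} \mathrm{Re}\, \alpha(j) \Big] + i \Big[ v - \sum_{j \in F} \mathrm{Im}\, \alpha(j) \Big] \Big| \le \Big| u - \sum_{j \in F} \mathrm{Re}\, \alpha(j) \Big| + \Big| v - \sum_{j \in F} \mathrm{Im}\, \alpha(j) \Big| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$

This proves that $\alpha$ is indeed summable, and $\sum_{j \in I} \alpha(j) = u + iv$.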

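Exercise 8. Let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let $I$ be a non-empty set. Suppose one has two non-empty sets $I_1, I_2$ with $I = I_1 \cup I_2$ and $I_1 \cap I_2 = \varnothing$. Suppose $\alpha : I \to \mathbb{K}$ has the property that both $\alpha|_{I_1} : I_1 \to \mathbb{K}$ and $\alpha|_{I_2} : I_2 \to \mathbb{K}$ are summable. Prove that $\alpha$ is summable, and

$$\sum_{j \in I} \alpha(j) = \sum_{j \in I_1} \alpha(j) + \sum_{j \in I_2} \alpha(j).$$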

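Proposition 2.1. Let $I$ be a non-empty set, let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$. For a map $\alpha : I \to \mathbb{K}$, the following are equivalent:
(i) $\alpha$ is summable;
(ii) $|\alpha|$ is summable.

Moreover, in this case one has the inequality

$$\Big| \sum_{j \in I} \alpha(j) \Big| \le \sum_{j \in I} |\alpha(j)|. \tag{1}$$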

Proof. (i) ⇒ (ii). Assume $\alpha$ is summable. We divide the proof in two cases:

Case $\mathbb{K} = \mathbb{R}$. Define the sets

$$I^+ = \{j \in I : \alpha(j) > 0\},$$
$$I^- = \{j \in I : \alpha(j) < 0\},$$
$$I^0 = \{j \in I : \alpha(j) = 0\}.$$

More generally, for any subset $F \subset I$ we define $F^\pm = F \cap I^\pm$ and $F^0 = F \cap I^0$.

Claim: Both maps $\alpha|_{I^+} : I^+ \to \mathbb{R}$ and $\alpha|_{I^-} : I^- \to \mathbb{R}$ are summable. Moreover, one has the equality

$$\sum_{j \in I} \alpha(j) = \sum_{j \in I^+} \alpha(j) + \sum_{j \in I^-} \alpha(j). \tag{2}$$




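Denote the sum $\sum_{j \in I} \alpha(j)$ simply by $s$. Start by choosing some finite set $F \subset I$ such that

$$\Big| s - \sum_{j \in G} \alpha(j) \Big| < 1, \quad\text{for all finite sets } G \subset I \text{ with } G \supset F.$$

Let $E \subset I^+$ be a finite subset. Then the set $\tilde{E} = E \cup F$ will be a finite subset of $I$ with $\tilde{E} \supset F$, so we will have

$$\Big| s - \sum_{j \in \tilde{E}} \alpha(j) \Big| < 1,$$

so we get

$$\sum_{j \in E} \alpha(j) \le \sum_{j \in E \cup F^+} \alpha(j) = \Big[ \sum_{j \in E \cup F^+} \alpha(j) + \sum_{j \in F^0 \cup F^-} \alpha(j) \Big] - \sum_{j \in F^0 \cup F^-} \alpha(j) = \sum_{j \in \tilde{E}} \alpha(j) - \sum_{j \in F^-} \alpha(j) < s + 1 - \sum_{j \in F^-} \alpha(j).$$

In particular this gives

$$\sup\Big\{\sum_{j \in E} \alpha(j) : E \subset I^+, \text{ finite}\Big\} \le s + 1 - \sum_{j \in F^-} \alpha(j),$$

so by Lemma ??, the map $\alpha|_{I^+} : I^+ \to [0, \infty)$ is indeed summable. The fact that the map $\alpha|_{I^-} : I^- \to (-\infty, 0]$ is summable is proven the exact same way. The equality (2) follows from Exercise ??.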

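Having proven the Claim, we notice now that the map $-\alpha|_{I^-} : I^- \to [0, \infty)$ is also summable. Using Exercise ??, it is clear then that the map $|\alpha| : I \to [0, \infty)$ is summable, simply because the three maps $|\alpha|\big|_{I^+} = \alpha|_{I^+}$, $|\alpha|\big|_{I^-} = -\alpha|_{I^-}$, and $|\alpha|\big|_{I^0} = 0$ are all summable.

Case $\mathbb{K} = \mathbb{C}$. By Lemma ?? we know that the maps $\mathrm{Re}\, \alpha, \mathrm{Im}\, \alpha : I \to \mathbb{R}$ are summable. In particular, using the real case, we get the fact that the maps $|\mathrm{Re}\, \alpha|, |\mathrm{Im}\, \alpha| : I \to [0, \infty)$ are summable. Using the obvious inequality

$$|z| \le |\mathrm{Re}\, z| + |\mathrm{Im}\, z|, \quad \forall\, z \in \mathbb{C},$$

we get

$$\sum_{j \in F} |\alpha(j)| \le \sum_{j \in F} |\mathrm{Re}\, \alpha(j)| + \sum_{j \in F} |\mathrm{Im}\, \alpha(j)| \le \sum_{j \in I} |\mathrm{Re}\, \alpha(j)| + \sum_{j \in I} |\mathrm{Im}\, \alpha(j)|,$$

for every finite subset $F \subset I$. Then we get

$$\sup\Big\{\sum_{j \in F} |\alpha(j)| : F \subset I, \text{ finite}\Big\} \le \sum_{j \in I} |\mathrm{Re}\, \alpha(j)| + \sum_{j \in I} |\mathrm{Im}\, \alpha(j)| < \infty,$$

so $|\alpha| : I \to [0, \infty)$ is indeed summable.

Having proven the implication (i) ⇒ (ii), let us prove the inequality (1). If $s$ denotes the sum $\sum_{j \in I} \alpha(j)$, then for every $\varepsilon > 0$ there exists $F_\varepsilon \subset I$ finite such that

$$\Big| s - \sum_{j \in F} \alpha(j) \Big| < \varepsilon, \quad\text{for all finite sets } F \subset I \text{ with } F \supset F_\varepsilon.$$

In particular, we get

$$|s| \le \varepsilon + \Big| \sum_{j \in F_\varepsilon} \alpha(j) \Big| \le \varepsilon + \sum_{j \in F_\varepsilon} |\alpha(j)| \le \varepsilon + \sum_{j \in I} |\alpha(j)|.$$

Since this inequality holds for all $\varepsilon > 0$, we then get

$$|s| \le \sum_{j \in I} |\alpha(j)|.$$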

(ii) ⇒ (i). Assume now $|\alpha| : I \to [0, \infty)$ is summable.

Case $\mathbb{K} = \mathbb{R}$. It is obvious that $|\alpha|\big|_J : J \to [0, \infty)$ is summable, for any subset $J \subset I$. In particular, using the notations from the proof of (i) ⇒ (ii), it follows that $\alpha|_{I^+} = |\alpha|\big|_{I^+}$, $\alpha|_{I^-} = -|\alpha|\big|_{I^-}$, and $\alpha|_{I^0} = 0$ are all summable. Then the summability of $\alpha$ follows from Exercise ??.

Case $\mathbb{K} = \mathbb{C}$. Using the inequality

$$\max\{|\mathrm{Re}\, z|, |\mathrm{Im}\, z|\} \le |z|, \quad \forall\, z \in \mathbb{C},$$

combined with Exercise ??, it follows that both maps $|\mathrm{Re}\, \alpha|, |\mathrm{Im}\, \alpha| : I \to [0, \infty)$ are summable. Using the real case it then follows that both maps $\mathrm{Re}\, \alpha, \mathrm{Im}\, \alpha : I \to \mathbb{R}$ are summable. Then the summability of $\alpha$ follows from Lemma ??.

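The following result shows that summability is essentially the same as the summability of series.

Proposition 2.2. Suppose $\alpha : I \to \mathbb{K}$ is summable. Then the support set

$$[[\alpha]] = \{j \in I : \alpha(j) \ne 0\}$$

is at most countable.

Proof. For every integer $n \ge 1$, we define the set $J_n = \{j \in I : |\alpha(j)| \ge \tfrac{1}{n}\}$. Since $|\alpha|$ is summable, the sets $J_n$, $n \ge 1$, are all finite (a finite subset of $J_n$ with $m$ elements contributes at least $m/n$ to the finite sums of $|\alpha|$, which are bounded). The desired result then follows from the obvious equality $[[\alpha]] = \bigcup_{n=1}^{\infty} J_n$.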

We are now ready to discuss our next class of examples.

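Example 2.4. Let $\mathbb{K}$ be either $\mathbb{R}$ or $\mathbb{C}$, let $I$ be a non-empty set, and let $p \in [1, \infty)$ be a real number. We define

$$\ell^p_{\mathbb{K}}(I) = \big\{\alpha : I \to \mathbb{K} \ :\ |\alpha|^p : I \to [0, \infty) \text{ summable}\big\}.$$

For $\alpha \in \ell^p_{\mathbb{K}}(I)$ we define

$$\|\alpha\|_p = \Big[ \sum_{j \in I} |\alpha(j)|^p \Big]^{1/p}.$$

When $\mathbb{K} = \mathbb{C}$, the space $\ell^p_{\mathbb{C}}(I)$ is simply denoted by $\ell^p(I)$. When $I = \mathbb{N}$ (the set of natural numbers) instead of $\ell^p_{\mathbb{R}}(\mathbb{N})$ we simply write $\ell^p_{\mathbb{R}}$, and instead of $\ell^p(\mathbb{N})$ we simply write $\ell^p$.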

In order to show that the $\ell^p$ spaces ($1 \le p < \infty$) are normed vector spaces, we will need several preliminary results. The first result we are going to need is the (classical) Hölder inequality.




Exercise 9. Let $q > 1$ and let $u, v \ge 0$. Define the function $f : [0, 1] \to \mathbb{R}$ by

$$f(t) = ut + v(1 - t^q)^{1/q}, \quad t \in [0, 1].$$

Prove that

$$\max_{t \in [0,1]} f(t) = (u^p + v^p)^{1/p},$$

where $p = \dfrac{q}{q - 1}$. Prove that, unless $u = v = 0$, there exists a unique $s \in [0, 1]$ such that

$$f(s) = \max_{t \in [0,1]} f(t).$$

Hint: Analyze the derivative: $f'(t) = u - v\Big(\dfrac{t^q}{1 - t^q}\Big)^{1/p}$, $t \in (0, 1)$.

Lemma 2.3 (Hölder's inequality). Let $a_1, a_2, \dots, a_n, b_1, b_2, \dots, b_n$ be non-negative numbers. Let $p, q > 1$ be real numbers with the property $\frac{1}{p} + \frac{1}{q} = 1$. Then:

$$\sum_{j=1}^{n} a_j b_j \le \Big[ \sum_{j=1}^{n} a_j^p \Big]^{1/p} \cdot \Big[ \sum_{j=1}^{n} b_j^q \Big]^{1/q}. \tag{3}$$

Moreover, one has equality only when the sequences $(a_1^p, \dots, a_n^p)$ and $(b_1^q, \dots, b_n^q)$ are proportional.
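As a quick numerical illustration of inequality (3) and of its equality case, the sketch below evaluates both sides for sample data. The helper name `holder_sides` and the sample sequences are illustrative assumptions; the second pair is chosen with $b_j = a_j^{p-1}$, so that $(b_j^q)$ and $(a_j^p)$ coincide and equality is expected up to rounding.

```python
def holder_sides(a, b, p):
    """Return (lhs, rhs) of Hoelder's inequality (3) for exponent p > 1."""
    q = p / (p - 1.0)  # the conjugate exponent: 1/p + 1/q = 1
    lhs = sum(x * y for x, y in zip(a, b))
    rhs = (sum(x ** p for x in a) ** (1.0 / p)
           * sum(y ** q for y in b) ** (1.0 / q))
    return lhs, rhs

p = 3.0
a = [1.0, 2.0, 3.0]

# A generic pair of sequences: strict inequality is expected.
lhs, rhs = holder_sides(a, [4.0, 0.5, 2.0], p)
print(lhs <= rhs)

# Proportional case: b_j = a_j^(p-1), hence b_j^q = a_j^p.
b_eq = [x ** (p - 1.0) for x in a]
lhs_eq, rhs_eq = holder_sides(a, b_eq, p)
print(abs(lhs_eq - rhs_eq) < 1e-9)
```

Floating-point powers are inexact, which is why the equality case is tested with a tolerance rather than with `==`.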

Proof. The proof will be carried on by induction on $n$. The case $n = 1$ is trivial.

Case $n = 2$. Assume $(b_1, b_2) \ne (0, 0)$. (Otherwise everything is trivial.) Define the number

$$r = \frac{b_1}{(b_1^q + b_2^q)^{1/q}}.$$

Notice that $r \in [0, 1]$, and we have

$$\frac{b_2}{(b_1^q + b_2^q)^{1/q}} = (1 - r^q)^{1/q}.$$

Notice also that, upon dividing by $(b_1^q + b_2^q)^{1/q}$, the desired inequality

$$a_1 b_1 + a_2 b_2 \le (a_1^p + a_2^p)^{1/p}\, (b_1^q + b_2^q)^{1/q} \tag{4}$$

reads

$$a_1 r + a_2 (1 - r^q)^{1/q} \le (a_1^p + a_2^p)^{1/p},$$

and it follows immediately from the exercise, applied to the function

$$f(t) = a_1 t + a_2 (1 - t^q)^{1/q}, \quad t \in [0, 1].$$

Let us examine when equality holds. If $a_1 = a_2 = 0$, the equality obviously holds, and in this case $(a_1, a_2)$ is clearly proportional to $(b_1, b_2)$. Assume $(a_1, a_2) \ne (0, 0)$. Put

$$s = \frac{a_1^{p/q}}{(a_1^p + a_2^p)^{1/q}},$$

and notice that

$$(1 - s^q)^{1/q} = \Big[ 1 - \frac{a_1^p}{a_1^p + a_2^p} \Big]^{1/q} = \frac{a_2^{p/q}}{(a_1^p + a_2^p)^{1/q}},$$



so we have

$$f(s) = \frac{a_1^{1 + p/q} + a_2^{1 + p/q}}{(a_1^p + a_2^p)^{1/q}} = \frac{a_1^p + a_2^p}{(a_1^p + a_2^p)^{1/q}} = (a_1^p + a_2^p)^{1 - 1/q} = (a_1^p + a_2^p)^{1/p} = \max_{t \in [0,1]} f(t).$$

By the exercise, it follows that we have equality in (4) precisely when $r = s$, i.e.

$$\frac{b_1}{(b_1^q + b_2^q)^{1/q}} = \frac{a_1^{p/q}}{(a_1^p + a_2^p)^{1/q}},$$

or equivalently

$$\frac{b_1^q}{b_1^q + b_2^q} = \frac{a_1^p}{a_1^p + a_2^p}.$$

Obviously this forces

$$\frac{b_2^q}{b_1^q + b_2^q} = \frac{a_2^p}{a_1^p + a_2^p},$$

so indeed $(a_1^p, a_2^p)$ and $(b_1^q, b_2^q)$ are proportional.

Having proven the case $n = 2$, we now proceed with the proof of:

The implication: Case $n = k$ ⇒ Case $n = k + 1$.

Start with two sequences $(a_1, a_2, \dots, a_k, a_{k+1})$ and $(b_1, b_2, \dots, b_k, b_{k+1})$. Define the numbers

$$a = \Big[ \sum_{j=1}^{k} a_j^p \Big]^{1/p} \quad\text{and}\quad b = \Big[ \sum_{j=1}^{k} b_j^q \Big]^{1/q}.$$

Using the assumption that the case $n = k$ holds, we have

$$\sum_{j=1}^{k+1} a_j b_j \le \Big[ \sum_{j=1}^{k} a_j^p \Big]^{1/p} \cdot \Big[ \sum_{j=1}^{k} b_j^q \Big]^{1/q} + a_{k+1} b_{k+1} = ab + a_{k+1} b_{k+1}. \tag{5}$$

Using the case $n = 2$ we also have

$$ab + a_{k+1} b_{k+1} \le (a^p + a_{k+1}^p)^{1/p} \cdot (b^q + b_{k+1}^q)^{1/q} = \Big[ \sum_{j=1}^{k+1} a_j^p \Big]^{1/p} \cdot \Big[ \sum_{j=1}^{k+1} b_j^q \Big]^{1/q}, \tag{6}$$

so combining with (5) we see that the desired inequality (3) holds for $n = k + 1$.

Assume now we have equality. Then we must have equality in both (5) and in (6). On the one hand, the equality in (5) forces $(a_1^p, a_2^p, \dots, a_k^p)$ and $(b_1^q, b_2^q, \dots, b_k^q)$ to be proportional (since we assume the case $n = k$). On the other hand, the equality in (6) forces $(a^p, a_{k+1}^p)$ and $(b^q, b_{k+1}^q)$ to be proportional (by the case $n = 2$). Since

$$a^p = \sum_{j=1}^{k} a_j^p \quad\text{and}\quad b^q = \sum_{j=1}^{k} b_j^q,$$

it is clear that $(a_1^p, a_2^p, \dots, a_k^p, a_{k+1}^p)$ and $(b_1^q, b_2^q, \dots, b_k^q, b_{k+1}^q)$ are proportional.

Definition. Two numbers $p, q \in [1, \infty]$ are said to be Hölder conjugate, if $\frac{1}{p} + \frac{1}{q} = 1$. Here we use the convention $\frac{1}{\infty} = 0$.

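Proposition 2.3. Let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, let $I$ be a non-empty set, and let $p, q \in [1, \infty]$ be two Hölder conjugate numbers. If $\alpha \in \ell^p_{\mathbb{K}}(I)$ and $\beta \in \ell^q_{\mathbb{K}}(I)$, then $\alpha\beta \in \ell^1_{\mathbb{K}}(I)$, and

$$\|\alpha\beta\|_1 \le \|\alpha\|_p \cdot \|\beta\|_q.$$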




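Proof. Using Lemma ??, it suffices to prove the inequality
\[ \sum_{j\in F} |\alpha(j)\beta(j)| \le \|\alpha\|_p \cdot \|\beta\|_q, \tag{7} \]
for every finite set $F \subset I$.

Fix for the moment a finite subset $F \subset I$. Assume first that $p, q \in (1, \infty)$. Using Hölder's inequality we have
\[ \sum_{j\in F} |\alpha(j)\beta(j)| = \sum_{j\in F} |\alpha(j)| \cdot |\beta(j)| \le \Big[\sum_{j\in F} |\alpha(j)|^p\Big]^{1/p} \cdot \Big[\sum_{j\in F} |\beta(j)|^q\Big]^{1/q}. \tag{8} \]
Notice however that
\[ \sum_{j\in F} |\alpha(j)|^p \le \sum_{j\in I} |\alpha(j)|^p = \|\alpha\|_p^p, \qquad \sum_{j\in F} |\beta(j)|^q \le \sum_{j\in I} |\beta(j)|^q = \|\beta\|_q^q, \]
so we get
\[ \Big[\sum_{j\in F} |\alpha(j)|^p\Big]^{1/p} \le \|\alpha\|_p \quad\text{and}\quad \Big[\sum_{j\in F} |\beta(j)|^q\Big]^{1/q} \le \|\beta\|_q, \]
so when we go back to (8) we immediately get the desired inequality (7).

In the case when p = 1, we immediately have
\[ \sum_{j\in F} |\alpha(j)\beta(j)| \le \Big[\sum_{j\in F} |\alpha(j)|\Big] \cdot \Big[\max_{j\in F} |\beta(j)|\Big] \le \Big[\sum_{j\in I} |\alpha(j)|\Big] \cdot \Big[\sup_{j\in I} |\beta(j)|\Big] = \|\alpha\|_1 \cdot \|\beta\|_\infty. \]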

The case p = ∞ is proven in the exact same way.
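For a finite index set the inequality of Proposition 2.3 can be checked numerically. The following is only a sketch: the helper names `p_norm`, `conjugate`, `holder_gap` and the sample vectors are illustrative choices, not part of the notes.

```python
def p_norm(x, p):
    """p-norm of a finite sequence; p = float('inf') gives the sup norm."""
    if p == float("inf"):
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def conjugate(p):
    """Hoelder conjugate q of p, using the convention 1/inf = 0."""
    if p == 1:
        return float("inf")
    if p == float("inf"):
        return 1.0
    return p / (p - 1.0)


def holder_gap(alpha, beta, p):
    """||alpha||_p * ||beta||_q - ||alpha*beta||_1, which should be >= 0."""
    q = conjugate(p)
    lhs = sum(abs(a * b) for a, b in zip(alpha, beta))
    return p_norm(alpha, p) * p_norm(beta, q) - lhs


# Hypothetical sample data on the index set I = {0, 1, 2, 3}.
alpha = [3.0, -1.0, 2.0, 0.5]
beta = [1.0, 4.0, -2.0, 0.25]
gaps = {p: holder_gap(alpha, beta, p) for p in (1, 1.5, 2, 3, float("inf"))}
```

Every entry of `gaps` is nonnegative up to rounding. Taking `alpha == beta` with p = 2 gives a gap of zero up to rounding, which matches the proportionality condition for equality.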

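Remark 2.2. Suppose $p, q \in [1, \infty]$ are Hölder conjugate numbers. For any $\alpha \in \ell^p_{\mathbb{K}}(I)$ and $\beta \in \ell^q_{\mathbb{K}}(I)$, the map $\alpha\beta$ is summable (by Proposition ??). In particular, one can define the number
\[ \langle \alpha, \beta \rangle = \sum_{j\in I} \alpha(j)\beta(j) \in \mathbb{K}. \]
As a consequence we get the inequality
\[ |\langle \alpha, \beta \rangle| \le \|\alpha\|_p \cdot \|\beta\|_q, \quad \forall\, \alpha \in \ell^p_{\mathbb{K}}(I),\ \beta \in \ell^q_{\mathbb{K}}(I). \]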

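Notations. Let $\mathbb{K}$ be either $\mathbb{R}$ or $\mathbb{C}$, let I be a non-empty set, and let $q \in [1, \infty]$. We define
\[ B^q_{\mathbb{K}}(I) = \big\{ \alpha \in \operatorname{fin}_{\mathbb{K}}(I) : \|\alpha\|_q \le 1 \big\}. \]
(Remark that $\operatorname{fin}_{\mathbb{K}}(I) \subset \ell^q_{\mathbb{K}}(I)$, for all $q \in [1, \infty]$.)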

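Theorem 2.1 (Dual definition of $\ell^p$ spaces). Let $p, q \in (1, \infty)$ be Hölder conjugate numbers, let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let I be a non-empty set. For a function $\alpha : I \to \mathbb{K}$, the following are equivalent:

(i) $\alpha \in \ell^p_{\mathbb{K}}(I)$;

(ii) $\sup_{\beta \in B^q_{\mathbb{K}}(I)} |\langle \alpha, \beta \rangle| < \infty$.

Moreover, one has the equality
\[ \sup_{\beta \in B^q_{\mathbb{K}}(I)} |\langle \alpha, \beta \rangle| = \|\alpha\|_p, \quad \forall\, \alpha \in \ell^p_{\mathbb{K}}(I). \tag{9} \]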

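Proof. It will be convenient to introduce several notations. Given a function $\alpha : I \to \mathbb{K}$, and a finite set $F \subset I$, we define the function $\beta^F_\alpha : I \to \mathbb{K}$, as follows:
\[ \beta^F_\alpha(i) = \begin{cases} \dfrac{|\alpha(i)|^{1+\frac{p}{q}}}{\alpha(i) \cdot \big[\sum_{j\in F} |\alpha(j)|^p\big]^{1/q}} & \text{if } i \in F \text{ and } \alpha(i) \ne 0, \\[2ex] 0 & \text{if } i \notin F \text{ or } \alpha(i) = 0. \end{cases} \]
Notice that $[[\beta^F_\alpha]] \subset F$, and unless $\beta^F_\alpha$ is identically zero, we have
\[ \sum_{i \in [[\beta^F_\alpha]]} |\beta^F_\alpha(i)|^q = 1. \]
So in any case we have $\beta^F_\alpha \in B^q_{\mathbb{K}}(I)$. Notice also that, unless $\beta^F_\alpha$ is identically zero, we have
\[ \langle \alpha, \beta^F_\alpha \rangle = \sum_{i\in F} \alpha(i)\beta^F_\alpha(i) = \frac{\sum_{i\in F} |\alpha(i)|^{1+\frac{p}{q}}}{\big[\sum_{j\in F} |\alpha(j)|^p\big]^{1/q}} = \frac{\sum_{i\in F} |\alpha(i)|^{p}}{\big[\sum_{j\in F} |\alpha(j)|^p\big]^{1/q}} = \Big[\sum_{i\in F} |\alpha(i)|^p\Big]^{1-\frac1q} = \Big[\sum_{i\in F} |\alpha(i)|^p\Big]^{1/p}. \tag{10} \]
It is clear that the equality (10) actually holds even when $\beta^F_\alpha$ is identically zero. To make the exposition a bit clearer, we denote the quantity $\sup_{\beta \in B^q_{\mathbb{K}}(I)} |\langle \alpha, \beta \rangle|$ simply by $|||\alpha|||$.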

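We now proceed with the proof of the Theorem.

(i) ⇒ (ii). Assume $\alpha \in \ell^p_{\mathbb{K}}(I)$. In order to prove (ii) it suffices to prove the inequality
\[ |||\alpha||| \le \|\alpha\|_p. \tag{11} \]
Start with some arbitrary $\beta \in B^q_{\mathbb{K}}(I)$. Using Hölder's inequality we have
\[ |\langle \alpha, \beta \rangle| = \Big| \sum_{j\in[[\beta]]} \alpha(j)\beta(j) \Big| \le \sum_{j\in[[\beta]]} |\alpha(j)| \cdot |\beta(j)| \le \Big[\sum_{j\in[[\beta]]} |\alpha(j)|^p\Big]^{1/p} \cdot \Big[\sum_{j\in[[\beta]]} |\beta(j)|^q\Big]^{1/q} \le \sup_{F \subset I \text{ finite}} \Big[\sum_{i\in F} |\alpha(i)|^p\Big]^{1/p} = \|\alpha\|_p. \]
Since this inequality holds for all $\beta \in B^q_{\mathbb{K}}(I)$, the inequality (11) follows.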

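(ii) ⇒ (i). Assume now $|||\alpha||| < \infty$. In order to prove condition (i) it suffices to prove that
\[ \sum_{i\in F} |\alpha(i)|^p \le |||\alpha|||^p, \quad \text{for every finite subset } F \subset I. \tag{12} \]
By (10) we know that for every finite subset $F \subset I$ we have
\[ \sum_{i\in F} |\alpha(i)|^p = \langle \alpha, \beta^F_\alpha \rangle^p. \tag{13} \]
In particular we get the fact that $\langle \alpha, \beta^F_\alpha \rangle = |\langle \alpha, \beta^F_\alpha \rangle|$, and the fact that $\beta^F_\alpha$ belongs to $B^q_{\mathbb{K}}(I)$, combined with (13), will give
\[ \sum_{i\in F} |\alpha(i)|^p = |\langle \alpha, \beta^F_\alpha \rangle|^p \le \sup_{\beta \in B^q_{\mathbb{K}}(I)} |\langle \alpha, \beta \rangle|^p = |||\alpha|||^p. \]

Having proven the equivalence (i) ⇔ (ii), let us now observe that (9) is an immediate consequence of (11) and (12).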
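The extremal function $\beta^F_\alpha$ from the proof can be computed explicitly when I is finite. The sketch below takes F = I and real entries; the function names and the sample vector are hypothetical and serve only to illustrate that $\|\beta^F_\alpha\|_q = 1$ and $\langle \alpha, \beta^F_\alpha \rangle = \|\alpha\|_p$.

```python
def p_norm(x, p):
    """p-norm of a finite sequence, for finite p >= 1."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def extremal_beta(alpha, p):
    """beta_alpha^F from the proof of Theorem 2.1, with F = all indices, 1 < p < inf."""
    q = p / (p - 1.0)
    total = sum(abs(a) ** p for a in alpha)
    if total == 0:
        return [0.0 for _ in alpha]
    denom = total ** (1.0 / q)
    return [abs(a) ** (1.0 + p / q) / (a * denom) if a != 0 else 0.0
            for a in alpha]


def pairing(alpha, beta):
    """<alpha, beta> = sum of alpha(j) * beta(j) (bilinear, no conjugation)."""
    return sum(a * b for a, b in zip(alpha, beta))


# Hypothetical sample data.
p = 3.0
q = p / (p - 1.0)
alpha = [3.0, -4.0, 0.0, 1.0]
beta = extremal_beta(alpha, p)

beta_q_norm = p_norm(beta, q)         # expected to be 1 up to rounding
attained = pairing(alpha, beta)       # expected to equal ||alpha||_p
```

Comparing `attained` with `p_norm(alpha, p)` shows the supremum in (9) being attained on a finitely supported element of the unit ball.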

Exercise 10. Prove that Theorem 9.1 holds also in the cases (p, q) = (1, ∞) and (p, q) = (∞, 1).

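Corollary 2.1. Let $\mathbb{K}$ be either $\mathbb{R}$ or $\mathbb{C}$, let I be a non-empty set, and let $p \ge 1$.

(i) When equipped with point-wise addition and scalar multiplication, the set $\ell^p_{\mathbb{K}}(I)$ is a $\mathbb{K}$-vector space.

(ii) The map
\[ \ell^p_{\mathbb{K}}(I) \ni \alpha \longmapsto \|\alpha\|_p \in [0, \infty) \]
is a norm.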

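Proof. Let q be the Hölder conjugate of p. If $\alpha \in \ell^p_{\mathbb{K}}(I)$, and $\lambda \in \mathbb{K}$, then
\[ \langle \lambda\alpha, \beta \rangle = \lambda \langle \alpha, \beta \rangle, \quad \forall\, \beta \in \operatorname{fin}_{\mathbb{K}}(I), \]
so we get
\[ \sup_{\beta \in B^q_{\mathbb{K}}(I)} |\langle \lambda\alpha, \beta \rangle| = |\lambda| \cdot \sup_{\beta \in B^q_{\mathbb{K}}(I)} |\langle \alpha, \beta \rangle|, \]
which gives the fact that $\lambda\alpha \in \ell^p_{\mathbb{K}}(I)$, as well as the equality $\|\lambda\alpha\|_p = |\lambda| \cdot \|\alpha\|_p$.

If $\alpha_1, \alpha_2 \in \ell^p_{\mathbb{K}}(I)$, then
\[ \langle \alpha_1 + \alpha_2, \beta \rangle = \langle \alpha_1, \beta \rangle + \langle \alpha_2, \beta \rangle, \quad \forall\, \beta \in \operatorname{fin}_{\mathbb{K}}(I), \]
so we get
\[ \sup_{\beta \in B^q_{\mathbb{K}}(I)} |\langle \alpha_1 + \alpha_2, \beta \rangle| = \sup_{\beta \in B^q_{\mathbb{K}}(I)} \big| \langle \alpha_1, \beta \rangle + \langle \alpha_2, \beta \rangle \big| \le \sup_{\beta \in B^q_{\mathbb{K}}(I)} \big( |\langle \alpha_1, \beta \rangle| + |\langle \alpha_2, \beta \rangle| \big) \le \sup_{\beta \in B^q_{\mathbb{K}}(I)} |\langle \alpha_1, \beta \rangle| + \sup_{\beta \in B^q_{\mathbb{K}}(I)} |\langle \alpha_2, \beta \rangle|, \]
which gives the fact that $\alpha_1 + \alpha_2 \in \ell^p_{\mathbb{K}}(I)$, as well as the inequality
\[ \|\alpha_1 + \alpha_2\|_p \le \|\alpha_1\|_p + \|\alpha_2\|_p. \]
The implication $\|\alpha\|_p = 0 \Rightarrow \alpha = 0$ is obvious.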

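Exercise 11. Let $p \ge 1$ be a real number, let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let I be a non-empty set. Prove that $\operatorname{fin}_{\mathbb{K}}(I)$ is a dense linear subspace in $\ell^p_{\mathbb{K}}(I)$.

Remark 2.3. Let $p, q \in [1, \infty]$ be Hölder conjugate. Then the map
\[ \ell^p_{\mathbb{K}}(I) \times \ell^q_{\mathbb{K}}(I) \ni (\alpha, \beta) \longmapsto \langle \alpha, \beta \rangle \in \mathbb{K} \]
is bilinear, in the sense that for any $\gamma \in \ell^p_{\mathbb{K}}(I)$ and any $\eta \in \ell^q_{\mathbb{K}}(I)$, the maps
\[ \ell^p_{\mathbb{K}}(I) \ni \alpha \longmapsto \langle \alpha, \eta \rangle \in \mathbb{K}, \qquad \ell^q_{\mathbb{K}}(I) \ni \beta \longmapsto \langle \gamma, \beta \rangle \in \mathbb{K} \]
are linear. These facts follow immediately from Exercise ??.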

We now examine linear continuous maps between normed spaces.




Proposition 2.4. Let $\mathbb{K}$ be either $\mathbb{R}$ or $\mathbb{C}$, let X and Y be normed $\mathbb{K}$-vector spaces, and let $T : X \to Y$ be a $\mathbb{K}$-linear map. The following are equivalent:

(i) T is continuous;

(ii) $\sup\{ \|Tx\| : x \in X,\ \|x\| \le 1 \} < \infty$;

(iii) $\sup\{ \|Tx\| : x \in X,\ \|x\| = 1 \} < \infty$;

(iv) T is continuous at 0.

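Proof. (i) ⇒ (ii). Assume T is continuous, but
\[ \sup\{ \|Tx\| : x \in X,\ \|x\| \le 1 \} = \infty, \]
which means there exists some sequence $(x_n)_{n\ge1} \subset X$ such that

(a) $\|x_n\| \le 1$, ∀ n ≥ 1;
(b) $\lim_{n\to\infty} \|Tx_n\| = \infty$.

Put
\[ z_n = \|Tx_n\|^{-1} x_n, \quad \forall\, n \ge 1. \]
On the one hand, we have
\[ \|z_n\| = \frac{\|x_n\|}{\|Tx_n\|} \le \frac{1}{\|Tx_n\|}, \quad \forall\, n \ge 1, \]
which gives $\lim_{n\to\infty} \|z_n\| = 0$, i.e. $\lim_{n\to\infty} z_n = 0$. Since T is assumed to be continuous, we will get
\[ \lim_{n\to\infty} Tz_n = T0 = 0. \tag{14} \]
On the other hand, since T is linear, we have $Tz_n = \|Tx_n\|^{-1} Tx_n$, so in particular we get
\[ \|Tz_n\| = 1, \quad \forall\, n \ge 1, \]
which clearly contradicts (14).

(ii) ⇒ (iii). This is obvious, since the supremum in (iii) is taken over a subset of the set used in (ii).

(iii) ⇒ (iv). Let $(x_n)_{n\ge1} \subset X$ be a sequence with $\lim_{n\to\infty} x_n = 0$. For each n ≥ 1, define
\[ u_n = \begin{cases} \|x_n\|^{-1} x_n, & \text{if } x_n \ne 0, \\ \text{any vector of norm 1}, & \text{if } x_n = 0, \end{cases} \]
so that we have
\[ \|u_n\| = 1 \quad\text{and}\quad x_n = \|x_n\| u_n, \quad \forall\, n \ge 1. \]
Since T is linear, we have
\[ Tx_n = \|x_n\| Tu_n, \quad \forall\, n \ge 1. \tag{15} \]
If we define $M = \sup\{ \|Tx\| : x \in X,\ \|x\| = 1 \}$, then $\|Tu_n\| \le M$, ∀ n ≥ 1, so (15) will give
\[ \|Tx_n\| \le M \cdot \|x_n\|, \quad \forall\, n \ge 1, \]
and the condition $\lim_{n\to\infty} x_n = 0$ will force $\lim_{n\to\infty} Tx_n = 0$.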

(iv) ⇒ (i). Assume T is continuous at 0, and let us prove that T is continuous at any point. Start with some arbitrary $x \in X$ and an arbitrary sequence $(x_n)_{n\ge1} \subset X$ with $\lim_{n\to\infty} x_n = x$. Put $z_n = x_n - x$, so that $\lim_{n\to\infty} z_n = 0$. Then we will have $\lim_{n\to\infty} Tz_n = 0$, which (use the linearity of T) means that
\[ 0 = \lim_{n\to\infty} Tz_n = \lim_{n\to\infty} (Tx_n - Tx), \]
thus proving that $\lim_{n\to\infty} Tx_n = Tx$.

Remark 2.4. Using the notations above, the quantities in (ii) and (iii) are in fact equal. Indeed, if we define
\[ M_1 = \sup\{ \|Tx\| : x \in X,\ \|x\| \le 1 \}, \qquad M_2 = \sup\{ \|Tx\| : x \in X,\ \|x\| = 1 \}, \]
then as observed during the proof, we have $M_2 \le M_1$. Conversely, if we start with some arbitrary $x \in X$ with $\|x\| \le 1$, then we can always write $x = \|x\| u$, for some $u \in X$ with $\|u\| = 1$. In particular we will get
\[ \|Tx\| = \|x\| \cdot \|Tu\| \le \|x\| \cdot M_2 \le M_2. \]
Taking supremum in the above inequality, over all $x \in X$ with $\|x\| \le 1$, will then give the inequality $M_1 \le M_2$.

Notations. Let $\mathbb{K}$ be either $\mathbb{R}$ or $\mathbb{C}$, and let X and Y be normed $\mathbb{K}$-vector spaces. We define
\[ L(X, Y) = \{ T : X \to Y : T \text{ is } \mathbb{K}\text{-linear and continuous} \}. \]
For $T \in L(X, Y)$ we define (see the above remark)
\[ \|T\| = \sup\{ \|Tx\| : x \in X,\ \|x\| \le 1 \} = \sup\{ \|Tx\| : x \in X,\ \|x\| = 1 \}. \]
When $Y = \mathbb{K}$ (equipped with the absolute value as the norm), the space $L(X, \mathbb{K})$ will be denoted simply by $X^*$, and will be called the topological dual of X.

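Proposition 2.5. Let $\mathbb{K}$ be either $\mathbb{R}$ or $\mathbb{C}$, and let X and Y be normed $\mathbb{K}$-vector spaces.

(i) The space $L(X, Y)$ is a $\mathbb{K}$-vector space.

(ii) For $T \in L(X, Y)$ we have
\[ \|T\| = \min\{ C \ge 0 : \|Tx\| \le C\|x\|,\ \forall\, x \in X \}. \tag{16} \]
In particular one has
\[ \|Tx\| \le \|T\| \cdot \|x\|, \quad \forall\, x \in X. \tag{17} \]

(iii) The map $L(X, Y) \ni T \longmapsto \|T\| \in [0, \infty)$ is a norm.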

Proof. The fact that $L(X, Y)$ is a vector space is clear.

(ii). Assume $T \in L(X, Y)$. We begin by proving (17). Start with some arbitrary $x \in X$, and write it as $x = \|x\| u$, for some $u \in X$ with $\|u\| = 1$. Then by definition we have $\|Tu\| \le \|T\|$, and by linearity we have
\[ \|Tx\| = \|x\| \cdot \|Tu\| \le \|x\| \cdot \|T\|. \]
To prove the equality (16) let us define the set
\[ C_T = \{ C \ge 0 : \|Tx\| \le C\|x\|,\ \forall\, x \in X \}. \]
On the one hand, by (17) we know that $\|T\| \in C_T$. On the other hand, if we take an arbitrary $C \in C_T$, then for every $u \in X$ with $\|u\| = 1$, we will have
\[ \|Tu\| \le C\|u\| = C, \]
so taking supremum, over all u with $\|u\| = 1$, will immediately give $\|T\| \le C$. Since we now have
\[ \|T\| \le C, \quad \forall\, C \in C_T, \]
we clearly get $\|T\| = \min C_T$.




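(iii). Let $T, S \in L(X, Y)$. Using (17), we have
\[ \|(T + S)x\| = \|Tx + Sx\| \le \|Tx\| + \|Sx\| \le (\|T\| + \|S\|) \cdot \|x\|, \quad \forall\, x \in X. \]
Then using (16) we get
\[ \|T + S\| \le \|T\| + \|S\|. \]
If $T \in L(X, Y)$ and $\lambda \in \mathbb{K}$, then the equality
\[ \|(\lambda T)x\| = |\lambda| \cdot \|Tx\|, \quad x \in X, \]
will immediately give $\|\lambda T\| = |\lambda| \cdot \|T\|$. Finally, if $T \in L(X, Y)$ has $\|T\| = 0$, then using (17) one immediately gets T = 0.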
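As a concrete illustration of the operator norm and of inequality (17), consider a matrix acting between finite dimensional spaces equipped with the sup norm. The sketch below relies on the standard fact (not proven in these notes) that in this setting the operator norm equals the largest absolute row sum; the matrix and the helper names are hypothetical.

```python
from itertools import product


def sup_norm(x):
    """Sup norm of a finite vector."""
    return max(abs(t) for t in x)


def apply(A, x):
    """Matrix-vector product, with A given as a list of rows."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]


def op_norm_sup(A):
    """Operator norm for the sup norms: the largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)


# Hypothetical operator T : R^3 -> R^2.
A = [[1.0, -2.0, 0.5],
     [0.0, 3.0, 1.0]]

# For the sup norm, sup{ ||Tx|| : ||x|| <= 1 } is attained at a sign vector,
# so a brute-force search over {-1, 1}^3 recovers the operator norm.
brute = max(sup_norm(apply(A, s)) for s in product((-1.0, 1.0), repeat=3))

# Inequality (17), ||Tx|| <= ||T|| * ||x||, on a sample vector.
x = [0.3, -0.7, 0.2]
lhs = sup_norm(apply(A, x))
rhs = op_norm_sup(A) * sup_norm(x)
```

Here `brute` and `op_norm_sup(A)` agree, and `lhs <= rhs` as predicted by (17).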

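Notation. Let I be a non-empty set, let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let $p \in [1, \infty]$. Let q be the Hölder conjugate of p. For every element $\alpha \in \ell^p_{\mathbb{K}}(I)$ we define the map $\theta_\alpha : \ell^q_{\mathbb{K}}(I) \to \mathbb{K}$ by
\[ \theta_\alpha(\beta) = \langle \alpha, \beta \rangle = \sum_{i\in I} \alpha(i)\beta(i), \quad \beta \in \ell^q_{\mathbb{K}}(I). \]
We know that $\theta_\alpha$ is linear, and by Remark 9.2, we have
\[ |\theta_\alpha(\beta)| \le \|\alpha\|_p \cdot \|\beta\|_q, \quad \forall\, \beta \in \ell^q_{\mathbb{K}}(I), \]
so $\theta_\alpha$ is continuous, and we have the inequality
\[ \|\theta_\alpha\| \le \|\alpha\|_p. \tag{18} \]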

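Proposition 2.6. Using the above notations, but assuming $p \in (1, \infty]$, the map
\[ \Theta : \ell^p_{\mathbb{K}}(I) \ni \alpha \longmapsto \theta_\alpha \in \ell^q_{\mathbb{K}}(I)^* \]
is a linear isomorphism of $\mathbb{K}$-vector spaces. Moreover, Θ is isometric, in the sense that
\[ \|\Theta\alpha\| = \|\alpha\|_p, \quad \forall\, \alpha \in \ell^p_{\mathbb{K}}(I). \tag{19} \]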

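Proof. We begin by proving (19). Since we have the inclusion
\[ \{ \beta \in \ell^q_{\mathbb{K}}(I) : \|\beta\|_q \le 1 \} \supset B^q_{\mathbb{K}}(I), \]
it follows that
\[ \|\theta_\alpha\| = \sup\{ |\theta_\alpha(\beta)| : \beta \in \ell^q_{\mathbb{K}}(I),\ \|\beta\|_q \le 1 \} \ge \sup\{ |\theta_\alpha(\beta)| : \beta \in B^q_{\mathbb{K}}(I) \}. \tag{20} \]
We know however (see Theorem 9 and Exercise 7) that
\[ \sup\{ |\theta_\alpha(\beta)| : \beta \in B^q_{\mathbb{K}}(I) \} = \|\alpha\|_p, \]
so using (20) we get
\[ \|\theta_\alpha\| \ge \|\alpha\|_p. \]
Combining this with (18) yields the desired equality.

The fact that Θ is linear is pretty obvious. Notice now that since Θ is isometric, it is clear that Θ is injective, so the only thing we need to prove is the fact that Θ is surjective. Start with an arbitrary linear continuous map $\varphi : \ell^q_{\mathbb{K}}(I) \to \mathbb{K}$. For every $i \in I$ we define the function $\delta_i : I \to \mathbb{K}$ by
\[ \delta_i(j) = \begin{cases} 1 & \text{if } j = i, \\ 0 & \text{if } j \ne i. \end{cases} \]
It is clear that $\delta_i \in \ell^q_{\mathbb{K}}(I)$, for all $i \in I$. (In fact $\delta_i \in \operatorname{fin}_{\mathbb{K}}(I)$.) We define $\alpha : I \to \mathbb{K}$ by
\[ \alpha(i) = \varphi(\delta_i), \quad \forall\, i \in I. \]
Notice that, for every $\beta \in \operatorname{fin}_{\mathbb{K}}(I)$, we have
\[ \sum_{i\in I} \alpha(i)\beta(i) = \sum_{i\in I} \beta(i)\varphi(\delta_i) = \varphi\Big( \sum_{i\in[[\beta]]} \beta(i)\delta_i \Big) = \varphi(\beta), \tag{21} \]
where $[[\beta]] = \{ i \in I : \beta(i) \ne 0 \}$. (Since $\beta \in \operatorname{fin}_{\mathbb{K}}(I)$, the set $[[\beta]]$ is finite.) Using Hölder's inequality, the above computation shows that
\[ |\langle \alpha, \beta \rangle| \le \|\varphi\| \cdot \|\beta\|_q, \quad \forall\, \beta \in \operatorname{fin}_{\mathbb{K}}(I). \]
By Theorem 9.1 and Exercise 7, this proves that $\alpha \in \ell^p_{\mathbb{K}}(I)$. Going back to (21) we now have
\[ \theta_\alpha(\beta) = \varphi(\beta), \quad \forall\, \beta \in \operatorname{fin}_{\mathbb{K}}(I). \]
Since both $\theta_\alpha$ and φ are continuous, and $\operatorname{fin}_{\mathbb{K}}(I)$ is dense in $\ell^q_{\mathbb{K}}(I)$ (by Exercise 10), it follows that $\varphi = \theta_\alpha$.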
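The surjectivity argument recovers α from φ by evaluating on the functions $\delta_i$. For a finite index set this can be traced step by step; in the sketch below the functional `phi` and the index set {0, 1, 2, 3} are hypothetical examples chosen for illustration.

```python
def delta(i, n):
    """The function delta_i on the index set {0, ..., n-1}."""
    return [1.0 if j == i else 0.0 for j in range(n)]


def phi(beta):
    """A hypothetical linear functional on K^4 (any linear map would do)."""
    return 2.0 * beta[0] - beta[1] + 0.5 * beta[3]


def theta(alpha, beta):
    """theta_alpha(beta) = <alpha, beta>."""
    return sum(a * b for a, b in zip(alpha, beta))


n = 4
# alpha(i) = phi(delta_i), exactly as in the proof.
alpha = [phi(delta(i, n)) for i in range(n)]

# Equality (21): theta_alpha and phi agree on an arbitrary test vector.
beta = [1.0, 2.0, -3.0, 4.0]
same = theta(alpha, beta) == phi(beta)
```

With these choices `alpha` is `[2.0, -1.0, 0.0, 0.5]`, and `theta(alpha, beta)` agrees with `phi(beta)`.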

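Remark 2.5. In the case p = 1, the map
\[ \Theta : \ell^1_{\mathbb{K}}(I) \ni \alpha \longmapsto \theta_\alpha \in \ell^\infty_{\mathbb{K}}(I)^* \]
is still isometric, but it is no longer surjective, unless I is finite. The explanation is the fact that when I is infinite, the subspace $\operatorname{fin}_{\mathbb{K}}(I)$ is not dense in $\ell^\infty_{\mathbb{K}}(I)$. For example, if we take $\mathbf{1} \in \ell^\infty_{\mathbb{K}}(I)$ to be the constant function 1, then it is pretty obvious that
\[ \|\mathbf{1} - \beta\|_\infty \ge 1, \quad \forall\, \beta \in \operatorname{fin}_{\mathbb{K}}(I). \]
The above inequality can be immediately extended to
\[ \|\lambda\mathbf{1} + \beta\|_\infty \ge |\lambda|, \quad \forall\, \lambda \in \mathbb{K},\ \beta \in \operatorname{fin}_{\mathbb{K}}(I). \tag{22} \]
If we then consider the subspace
\[ \widetilde{\operatorname{fin}}_{\mathbb{K}}(I) = \{ \lambda\mathbf{1} + \beta : \beta \in \operatorname{fin}_{\mathbb{K}}(I),\ \lambda \in \mathbb{K} \}, \]
we see that the map
\[ \varphi_0 : \widetilde{\operatorname{fin}}_{\mathbb{K}}(I) \ni \lambda\mathbf{1} + \beta \longmapsto \lambda \in \mathbb{K} \]
is linear, continuous, and has the properties
\[ \varphi_0\big|_{\operatorname{fin}_{\mathbb{K}}(I)} = 0, \quad \varphi_0(\mathbf{1}) = 1, \tag{23} \]
\[ |\varphi_0(\gamma)| \le \|\gamma\|_\infty, \quad \forall\, \gamma \in \widetilde{\operatorname{fin}}_{\mathbb{K}}(I). \tag{24} \]
Using the Hahn-Banach Theorem, we can then extend $\varphi_0$ to a linear map $\varphi : \ell^\infty_{\mathbb{K}}(I) \to \mathbb{K}$ which will still satisfy (23) and (24); in particular we have $\varphi \in \ell^\infty_{\mathbb{K}}(I)^*$.

Notice however that if we had $\varphi = \theta_\alpha$, for some $\alpha \in \ell^1_{\mathbb{K}}(I)$, then we must have $\alpha(i) = \varphi(\delta_i) = 0$, for all $i \in I$, so this would force φ = 0, which is impossible, since $\varphi(\mathbf{1}) = 1$.

Exercise 12. Use the notations above. For every $\alpha \in \ell^1_{\mathbb{K}}(I)$, define
\[ \sigma_\alpha = \theta_\alpha\big|_{c_0^{\mathbb{K}}(I)} : c_0^{\mathbb{K}}(I) \to \mathbb{K}. \]
Prove that $\sigma_\alpha$ is linear and continuous. Prove that the map
\[ \Sigma : \ell^1_{\mathbb{K}}(I) \ni \alpha \longmapsto \sigma_\alpha \in c_0^{\mathbb{K}}(I)^* \]
is an isometric linear isomorphism of $\mathbb{K}$-vector spaces.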



Lecture 12

3. Banach spaces

Definition. Let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$. A Banach space over $\mathbb{K}$ is a normed $\mathbb{K}$-vector space $(X, \|\cdot\|)$, which is complete with respect to the metric
\[ d(x, y) = \|x - y\|, \quad x, y \in X. \]

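Example 3.1. The field $\mathbb{K}$, equipped with the absolute value norm, is a Banach space. More generally, the vector space $\mathbb{K}^n$, equipped with any of the norms
\[ \|(\lambda_1, \dots, \lambda_n)\|_\infty = \max\{ |\lambda_1|, \dots, |\lambda_n| \}, \]
\[ \|(\lambda_1, \dots, \lambda_n)\|_p = \big[ |\lambda_1|^p + \cdots + |\lambda_n|^p \big]^{1/p}, \quad p \ge 1, \]
is a Banach space.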

Remark 3.1. Using the facts from the general theory of metric spaces, we know that for a normed vector space $(X, \|\cdot\|)$, the following are equivalent:

(i) X is a Banach space;

(ii) given any sequence $(x_n)_{n\ge1} \subset X$ with $\sum_{n=1}^\infty \|x_n\| < \infty$, the sequence $(y_n)_{n\ge1}$ of partial sums, defined by $y_n = \sum_{k=1}^n x_k$, is convergent;

(iii) every Cauchy sequence in X has a convergent subsequence.

This is pretty obvious, since the sequence of partial sums has the property that
\[ d(y_{n+1}, y_n) = \|y_{n+1} - y_n\| = \|x_{n+1}\|, \quad \forall\, n \ge 1. \]
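The estimate behind condition (ii) is $\|y_m - y_n\| \le \sum_{k=n+1}^{m} \|x_k\|$ for n < m, which makes the partial sums a Cauchy sequence. The following sketch checks it numerically in $\mathbb{R}^2$ with the sup norm; the particular sequence `x(k)` is a hypothetical example.

```python
def norm(v):
    """Sup norm on R^2."""
    return max(abs(t) for t in v)


def x(k):
    """A hypothetical absolutely summable sequence in R^2."""
    return ((-1.0) ** k / 2 ** k, 1.0 / 3 ** k)


def partial_sum(n):
    """y_n = x_1 + ... + x_n, computed coordinate by coordinate."""
    return tuple(sum(x(k)[c] for k in range(1, n + 1)) for c in range(2))


def tail_bound(n, m):
    """Sum of ||x_k|| for k = n+1, ..., m."""
    return sum(norm(x(k)) for k in range(n + 1, m + 1))


n, m = 5, 20
diff = norm(tuple(a - b for a, b in zip(partial_sum(m), partial_sum(n))))
bound = tail_bound(n, m)
```

Here `diff` stays below `bound`, and `bound` itself is smaller than $2^{-5}$, the full tail of the geometric series.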

Exercise 1*. Let X be a finite dimensional normed vector space. Prove that X is a Banach space.

Hints: Use induction on dim X. The case dim X = 1 is trivial. Assume the statement is true for all normed vector spaces of dimension d, and let us prove it for a normed vector space of dimension d + 1. Fix such an X, and a linear basis $e_1, e_2, \dots, e_d, e_{d+1}$ for X. Start with a Cauchy sequence $(x_n)_{n\ge1} \subset X$. Write each term as
\[ x_n = \sum_{k=1}^{d+1} \alpha_n(k) e_k. \]
Prove first that $\big(\alpha_n(d+1)\big)_{n\ge1} \subset \mathbb{K}$ is bounded. Then extract a subsequence $(x_{n_p})_{p\ge1}$ such that $\big(\alpha_{n_p}(d+1)\big)_{p\ge1}$ is convergent. If we take $\alpha(d+1) = \lim_{p\to\infty} \alpha_{n_p}(d+1)$, then prove that the sequence $\big(x_{n_p} - \alpha_{n_p}(d+1) e_{d+1}\big)_{p\ge1}$ is Cauchy in the space $\operatorname{Span}\{e_1, \dots, e_d\}$. Using the inductive hypothesis, conclude that $(x_{n_p})_{p\ge1}$ is convergent in X. Thus, every Cauchy sequence in X has a convergent subsequence, hence X is Banach.

Exercise 2*. Let n ≥ 1 be an integer, and let $\|\cdot\|$ be a norm on $\mathbb{K}^n$. Prove that there exist constants C, D > 0, such that
\[ C\|x\|_\infty \le \|x\| \le D\|x\|_\infty, \quad \forall\, x \in \mathbb{K}^n. \]

Hint: Let $e_1, \dots, e_n$ be the standard basis vectors for $\mathbb{K}^n$, so that
\[ \alpha_1 e_1 + \cdots + \alpha_n e_n = (\alpha_1, \dots, \alpha_n), \quad \forall\, (\alpha_1, \dots, \alpha_n) \in \mathbb{K}^n. \]
Define $D = \|e_1\| + \cdots + \|e_n\|$. The existence of C is equivalent to the existence of some $C' > 0$ such that
\[ \|x\|_\infty \le C'\|x\|, \quad \forall\, x \in \mathbb{K}^n. \]
(If such a $C'$ exists, then we take $C = 1/C'$.) To prove the existence of $C'$ as above, we consider the set $T = \{ x \in \mathbb{K}^n : \|x\| \le 1 \}$, and we need to prove that
\[ \sup_{x\in T} \|x\|_\infty < \infty. \]
Argue by contradiction (see also the hint from the preceding exercise).
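For a concrete norm the two constants can be written down directly. The sketch below takes the Euclidean norm on $\mathbb{R}^3$ as the norm $\|\cdot\|$, for which $D = \|e_1\| + \|e_2\| + \|e_3\| = 3$ and C = 1 work; the sample vectors and helper names are hypothetical, and the check illustrates the statement rather than proving it.

```python
def p_norm(x, p):
    """p-norm of a finite vector, for finite p >= 1."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def sup_norm(x):
    """Sup norm of a finite vector."""
    return max(abs(t) for t in x)


n, p = 3, 2.0
basis = [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]

# D = ||e_1|| + ... + ||e_n||, as in the hint.
D = sum(p_norm(e, p) for e in basis)
C = 1.0  # valid here because ||x||_inf <= ||x||_2 for every x

# Hypothetical sample vectors.
samples = [[1.0, -2.0, 3.0], [0.0, 0.5, -0.5], [4.0, 4.0, 4.0], [0.0, 0.0, 0.0]]
ok = all(
    C * sup_norm(x) <= p_norm(x, p) + 1e-12
    and p_norm(x, p) <= D * sup_norm(x) + 1e-12
    for x in samples
)
```

For a general norm the constant D is obtained the same way, while the existence of C requires the compactness argument indicated in the hint.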

Exercise 3. Let X and Y be normed vector spaces. Consider the product X × Y, equipped with the natural vector space structure.

(i) Prove that $\|(x, y)\| = \|x\| + \|y\|$, $(x, y) \in X \times Y$, defines a norm on X × Y.

(ii) Prove that, when equipped with the above norm, X × Y is a Banach space, if and only if both X and Y are Banach spaces.

There are two key constructions which enable one to construct new Banach spaces out of old ones.

Proposition 3.1. Let X be a normed vector space, and let Y be a Banach space. Then L(X,Y) is a Banach space, when equipped with the operator norm.

Proof. Start with a Cauchy sequence $(T_n)_{n\ge1} \subset L(X, Y)$. This means that for every ε > 0, there exists some $N_\varepsilon$ such that
\[ \|T_m - T_n\| < \varepsilon, \quad \forall\, m, n \ge N_\varepsilon. \tag{1} \]
Notice that, if one takes for example ε = 1, and we define
\[ C = 1 + \max\{ \|T_1\|, \|T_2\|, \dots, \|T_{N_1}\| \}, \]
then we clearly have
\[ \|T_n\| \le C, \quad \forall\, n \ge 1. \tag{2} \]
Notice that, using (1), we have
\[ \|T_m x - T_n x\| \le \varepsilon\|x\|, \quad \forall\, m, n \ge N_\varepsilon,\ x \in X, \tag{3} \]
which proves that

• for every $x \in X$, the sequence $(T_n x)_{n\ge1} \subset Y$ is Cauchy.

Since Y is a Banach space, for each $x \in X$, the sequence $(T_n x)_{n\ge1}$ will be convergent. We define the map $T : X \to Y$ by
\[ Tx = \lim_{n\to\infty} T_n x, \quad x \in X. \]
Using (2) we immediately get
\[ \|Tx\| \le C\|x\|, \quad \forall\, x \in X. \]
Since T is obviously linear, this proves that T is continuous. Finally, if we fix $n \ge N_\varepsilon$ and we take $\lim_{m\to\infty}$ in (3), we get
\[ \|T_n x - Tx\| \le \varepsilon\|x\|, \quad \forall\, n \ge N_\varepsilon,\ x \in X, \]
which proves precisely that we have the inequality
\[ \|T_n - T\| \le \varepsilon, \quad \forall\, n \ge N_\varepsilon, \]
hence $(T_n)_{n\ge1}$ is convergent to T in the norm topology.




Corollary 3.1. If X is a normed vector space, then its topological dual $X^* = L(X, \mathbb{K})$ is a Banach space.

Proof. Immediate from the fact that $\mathbb{K}$ is a Banach space.

As a direct application of the above result we get

Corollary 3.2. If I is a non-empty set, and $p \in [1, \infty]$, then $\ell^p_{\mathbb{K}}(I)$ is a Banach space.

Proof. For p = 1 we know that $\ell^1 \simeq (c_0)^*$. For $p \in (1, \infty]$, we know that $\ell^p \simeq (\ell^q)^*$, where q is Hölder conjugate to p.

Proposition 3.2. Let X be a Banach space, and let Z ⊂ X be a linear subspace. The following are equivalent:

(i) Z is a Banach space, when equipped with the norm from X;

(ii) Z is closed in X, in the norm topology.

Proof. This is a particular case of a general result from the theory of complete metric spaces.

Corollary 3.3. Let I be a non-empty set, and let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$. Then $c_0^{\mathbb{K}}(I)$ is a Banach space.

Proof. Use the fact that $c_0^{\mathbb{K}}(I)$ is closed in $\ell^\infty_{\mathbb{K}}(I)$.

Exercise 4*. Let X be an infinite dimensional Banach space, and let B be a linear basis for X. Prove that B is uncountable.

Hint: If B is countable, say $B = \{ b_n : n \in \mathbb{N} \}$, then
\[ X = \bigcup_{n=1}^\infty F_n, \]
where $F_n = \operatorname{Span}\{b_1, b_2, \dots, b_n\}$. Since the $F_n$'s are finite dimensional linear subspaces, they will be closed. Use Baire's Theorem to get a contradiction.

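Comments. A third method of constructing Banach spaces is the completion. If we start with a normed $\mathbb{K}$-vector space X, when we regard X as a metric space, its completion $\tilde{X}$ is constructed as follows. One defines
\[ \operatorname{cs}(X) = \{ \mathbf{x} = (x_n)_{n\ge1} : (x_n)_{n\ge1} \text{ Cauchy sequence in } X \}. \]
Two Cauchy sequences $\mathbf{x} = (x_n)_{n\ge1}$ and $\mathbf{x}' = (x'_n)_{n\ge1}$ are said to be equivalent, if $\lim_{n\to\infty} \|x_n - x'_n\| = 0$. In this case one writes $\mathbf{x} \sim \mathbf{x}'$. The completion $\tilde{X}$ is then defined as the space
\[ \tilde{X} = \operatorname{cs}(X)/\!\sim \]
of equivalence classes. For $\mathbf{x} \in \operatorname{cs}(X)$, one denotes by $\tilde{\mathbf{x}}$ its equivalence class in $\tilde{X}$. Finally, for an element $x \in X$ one denotes by $\langle x \rangle \in \tilde{X}$ the equivalence class of the constant sequence x.

We know from general theory that $\tilde{X}$ is a complete metric space, with the distance $\tilde{d}$ (correctly) defined by
\[ \tilde{d}(\tilde{\mathbf{x}}, \tilde{\mathbf{x}}') = \lim_{n\to\infty} \|x_n - x'_n\|, \]
for any two Cauchy sequences $\mathbf{x} = (x_n)_{n\ge1}$ and $\mathbf{x}' = (x'_n)_{n\ge1}$.

It turns out that, in our situation, the space $\operatorname{cs}(X)$ carries a natural vector space structure, defined by pointwise addition and scalar multiplication. Moreover, the space $\tilde{X}$ is identified as a quotient vector space
\[ \tilde{X} = \operatorname{cs}(X)/\operatorname{ns}(X), \]
where
\[ \operatorname{ns}(X) = \{ \mathbf{x} = (x_n)_{n\ge1} : (x_n)_{n\ge1} \text{ sequence in } X \text{ with } \lim_{n\to\infty} x_n = 0 \} \]
is the linear subspace of null sequences. It then follows that $\tilde{X}$ carries a natural vector space structure. More explicitly, if we start with a scalar $\lambda \in \mathbb{K}$, and with two elements $p, q \in \tilde{X}$, which are represented as $p = \tilde{\mathbf{x}}$ and $q = \tilde{\mathbf{y}}$, for two Cauchy sequences $\mathbf{x} = (x_n)_{n\ge1}$ and $\mathbf{y} = (y_n)_{n\ge1}$ in X, then the sequence
\[ \mathbf{w} = (\lambda x_n + y_n)_{n\ge1} \]
is Cauchy in X, and the element $\lambda p + q \in \tilde{X}$ is then defined as $\lambda p + q = \tilde{\mathbf{w}}$.

Finally, there is a natural norm on $\tilde{X}$, (correctly) defined by
\[ \|\tilde{\mathbf{x}}\| = \tilde{d}(\tilde{\mathbf{x}}, \langle 0 \rangle) = \lim_{n\to\infty} \|x_n\|, \]
for all Cauchy sequences $\mathbf{x} = (x_n)_{n\ge1}$. These considerations then prove that $\tilde{X}$ is a Banach space, and the map
\[ X \ni x \longmapsto \langle x \rangle \in \tilde{X} \]
is linear and isometric, in the sense that
\[ \|\langle x \rangle\| = \|x\|, \quad \forall\, x \in X. \]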

In the context of normed vector spaces, the universality property of the completion is stated as follows:

Proposition 3.3. Let X be a normed vector space, let $\tilde{X}$ denote its completion, and let Y be a Banach space. For every linear continuous map $T : X \to Y$, there exists a unique linear continuous map $\tilde{T} : \tilde{X} \to Y$, such that
\[ \tilde{T}\langle x \rangle = Tx, \quad \forall\, x \in X. \]
Moreover the map
\[ L(X, Y) \ni T \longmapsto \tilde{T} \in L(\tilde{X}, Y) \]
is an isometric linear isomorphism.

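Proof. If $T : X \to Y$ is linear and continuous, then T is a Lipschitz map with Lipschitz constant $\|T\|$, because
\[ \|Tx - Tx'\| \le \|T\| \cdot \|x - x'\|, \quad \forall\, x, x' \in X. \]
We know, from the theory of metric spaces, that there exists a unique continuous map $\tilde{T} : \tilde{X} \to Y$, such that
\[ \tilde{T}\langle x \rangle = Tx, \quad \forall\, x \in X. \]
We also know that $\tilde{T}$ is Lipschitz, with Lipschitz constant $\|T\|$. The only thing we need to prove is the fact that $\tilde{T}$ is linear. Start with two points $p, q \in \tilde{X}$, represented as $p = \tilde{\mathbf{x}}$ and $q = \tilde{\mathbf{z}}$, for some Cauchy sequences $\mathbf{x} = (x_n)_{n\ge1}$ and $\mathbf{z} = (z_n)_{n\ge1}$ in X. If $\lambda \in \mathbb{K}$, then $\lambda p + q = \tilde{\mathbf{w}}$, where $\mathbf{w} = (\lambda x_n + z_n)_{n\ge1}$. We then have
\[ \tilde{T}(\lambda p + q) = \lim_{n\to\infty} T(\lambda x_n + z_n) = \lambda \cdot \Big[ \lim_{n\to\infty} Tx_n \Big] + \Big[ \lim_{n\to\infty} Tz_n \Big] = \lambda\tilde{T}p + \tilde{T}q. \]

Let us prove now that $\|\tilde{T}\| = \|T\|$. Since $\tilde{T}$ is Lipschitz, with Lipschitz constant $\|T\|$, we will have $\|\tilde{T}\| \le \|T\|$. To prove the other inequality, let us consider the sets
\[ B_0 = \{ p \in \tilde{X} : \|p\| \le 1 \}, \qquad B_1 = \{ \langle x \rangle : x \in X,\ \|x\| \le 1 \}. \]
By definition, we have
\[ \|\tilde{T}\| = \sup_{p\in B_0} \|\tilde{T}p\|. \]
Since we clearly have $B_0 \supset B_1$, we get
\[ \|\tilde{T}\| = \sup_{p\in B_0} \|\tilde{T}p\| \ge \sup\{ \|\tilde{T}\langle x \rangle\| : x \in X,\ \|x\| \le 1 \} = \sup\{ \|Tx\| : x \in X,\ \|x\| \le 1 \} = \|T\|. \]
The fact that the map $L(X, Y) \ni T \longmapsto \tilde{T} \in L(\tilde{X}, Y)$ is linear is obvious.

To prove the surjectivity, start with some $S \in L(\tilde{X}, Y)$. Consider the map
\[ \iota : X \ni x \longmapsto \langle x \rangle \in \tilde{X}. \]
Since ι is linear and isometric, in particular it is continuous, so the composition $T = S \circ \iota$ is linear and continuous. Notice that
\[ S\langle x \rangle = S\big(\iota(x)\big) = (S \circ \iota)x = Tx, \quad \forall\, x \in X, \]
so by uniqueness we have $S = \tilde{T}$.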

Corollary 3.4. Let X be a normed space, let Y be a Banach space, and let $T : X \to Y$ be an isometric linear map.

(i) Let $\tilde{T} : \tilde{X} \to Y$ be the linear continuous map defined in the previous result. Then $\tilde{T}$ is linear, isometric, and $\tilde{T}(\tilde{X}) = \overline{T(X)}$.

(ii) X is complete, if and only if T(X) is closed in Y.

Proof. (i). The fact that $\tilde{T}$ is isometric, and has the range equal to $\overline{T(X)}$, is true in general (i.e. for X metric space, and Y complete metric space). The linearity follows from the previous result.

(ii). This is obvious.

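Example 3.2. Let X be a normed vector space. For every $x \in X$ define the map $\epsilon_x : X^* \to \mathbb{K}$ by
\[ \epsilon_x(\varphi) = \varphi(x), \quad \forall\, \varphi \in X^*. \]
Then $\epsilon_x$ is linear and continuous. This is an immediate consequence of the inequality
\[ |\epsilon_x(\varphi)| = |\varphi(x)| \le \|x\| \cdot \|\varphi\|, \quad \forall\, \varphi \in X^*. \]
Notice that this also proves
\[ \|\epsilon_x\| \le \|x\|, \quad \forall\, x \in X. \]
Interestingly enough, we actually have
\[ \|\epsilon_x\| = \|x\|, \quad \forall\, x \in X. \tag{4} \]
To prove this fact, we start with an arbitrary $x \in X$, and we consider the linear subspace
\[ Y = \mathbb{K}x = \{ \lambda x : \lambda \in \mathbb{K} \}. \]
If we define $\varphi_0 : Y \to \mathbb{K}$, by
\[ \varphi_0(\lambda x) = \lambda\|x\|, \quad \forall\, \lambda \in \mathbb{K}, \]
then it is clear that $\varphi_0(x) = \|x\|$, and
\[ |\varphi_0(y)| \le \|y\|, \quad \forall\, y \in Y. \]
Use then the Hahn-Banach Theorem to find $\varphi : X \to \mathbb{K}$ such that $\varphi\big|_Y = \varphi_0$, and
\[ |\varphi(z)| \le \|z\|, \quad \forall\, z \in X. \]
This will clearly imply $\|\varphi\| \le 1$, while the first condition will give $\varphi(x) = \varphi_0(x) = \|x\|$. In particular, we will have
\[ \|x\| = |\varphi(x)| = |\epsilon_x(\varphi)| \le \|\epsilon_x\| \cdot \|\varphi\| \le \|\epsilon_x\|. \]
Having proven (4), we now have a linear isometric map
\[ E : X \ni x \longmapsto \epsilon_x \in X^{**}. \]
Since $X^{**}$ is a Banach space, we now see that $\tilde{E} : \tilde{X} \to \overline{E(X)}$ is an isometric linear isomorphism. In particular, X is Banach, if and only if E(X) is closed in $X^{**}$.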

We conclude with a series of results, which are often regarded as the “principles of Banach space theory.” These results are consequences of Baire's Theorem.

Theorem 3.1 (Uniform Boundedness Principle). Let X be a Banach space, let Y be a normed vector space, and let $\mathcal{M} \subset L(X, Y)$. The following are equivalent:

(i) $\sup\{ \|T\| : T \in \mathcal{M} \} < \infty$;

(ii) $\sup\{ \|Tx\| : T \in \mathcal{M} \} < \infty$, ∀ x ∈ X.

Proof. The implication (i) ⇒ (ii) is trivial, because if we define
\[ M = \sup\{ \|T\| : T \in \mathcal{M} \}, \]
then by the definition of the norm, we clearly have
\[ \sup\{ \|Tx\| : T \in \mathcal{M} \} \le M\|x\|, \quad \forall\, x \in X. \]

(ii) ⇒ (i). Assume $\mathcal{M}$ satisfies condition (ii). For each integer n ≥ 1, let us define the set
\[ F_n = \{ x \in X : \|Tx\| \le n,\ \forall\, T \in \mathcal{M} \}. \]
It is obvious that $F_n$ is a closed subset of X, for each n ≥ 1. Moreover, by (ii) we clearly have $\bigcup_{n=1}^\infty F_n = X$. Using Baire's Theorem, there exists some n ≥ 1, such that $\operatorname{Int}(F_n) \ne \emptyset$. This means that there exists some $x_0 \in X$ and some r > 0, such that
\[ F_n \supset B_r(x_0) = \{ y \in X : \|y - x_0\| \le r \}. \]
Put $M_0 = \sup\{ \|Tx_0\| : T \in \mathcal{M} \}$. Fix for the moment some arbitrary $x \in X$, with $\|x\| \le 1$, and some arbitrary element $T \in \mathcal{M}$. The vector $y = x_0 + rx$ clearly belongs to $B_r(x_0)$, so we have $\|Ty\| \le n$. We then get
\[ \|Tx\| = \Big\| T\Big( \tfrac1r (y - x_0) \Big) \Big\| = \tfrac1r \|Ty - Tx_0\| \le \tfrac1r \big( \|Ty\| + \|Tx_0\| \big) \le \tfrac1r (n + M_0). \]
Keep T fixed, and use the above estimate, which gives
\[ \sup\{ \|Tx\| : x \in X,\ \|x\| \le 1 \} \le \frac{n + M_0}{r}, \]
to conclude that $\|T\| \le \frac{n + M_0}{r}$. Since $T \in \mathcal{M}$ is arbitrary, we finally get
\[ \sup\{ \|T\| : T \in \mathcal{M} \} \le \frac{n + M_0}{r} < \infty. \]




Theorem 3.2 (Inverse Mapping Theorem). Let X and Y be Banach spaces, and let $T : X \to Y$ be a bijective linear continuous map. Then the linear map $T^{-1} : Y \to X$ is also continuous.

Proof. Let us denote by A the open unit ball in X centered at the origin, i.e.
\[ A = \{ x \in X : \|x\| < 1 \}. \]
The first step in the proof is contained in the following.

Claim 1: The closure $\overline{T(A)}$ is a neighborhood of 0 in Y.

Consider the sequence of closed sets $\big( k\overline{T(A)} \big)_{k=1}^\infty$. (Here we use the notation $kM = \{ kv : v \in M \}$.) Since the map $v \longmapsto kv$ is a homeomorphism, one has the equalities
\[ k\overline{T(A)} = \overline{kT(A)} = \overline{T(kA)}, \quad \forall\, k \ge 1. \]
In particular, we have
\[ \bigcup_{k=1}^\infty k\overline{T(A)} = \bigcup_{k=1}^\infty \overline{T(kA)} \supset \bigcup_{k=1}^\infty T(kA) = T\Big( \bigcup_{k=1}^\infty kA \Big). \]
Since we obviously have $\bigcup_{k=1}^\infty kA = X$, and T is surjective, the above relation shows that $\bigcup_{k=1}^\infty k\overline{T(A)} = Y$. Using Baire's Theorem, there exists some k ≥ 1, such that $\operatorname{Int}\big( k\overline{T(A)} \big) \ne \emptyset$. Again using the fact that $v \longmapsto kv$ is a homeomorphism, this gives $\operatorname{Int}\big( \overline{T(A)} \big) \ne \emptyset$. Fix now some point $y \in \operatorname{Int}\big( \overline{T(A)} \big)$, and some r > 0, such that $\overline{T(A)}$ contains the open ball
\[ B_r(y) = \{ z \in Y : \|z - y\| < r \}. \tag{5} \]
The proof of the Claim is then finished, once we prove the inclusion
\[ \overline{T(A)} \supset B_{r/2}(0). \]
To prove this inclusion, start with some arbitrary $v \in B_{r/2}(0)$, i.e. $v \in Y$ and $\|v\| < \frac r2$. Since $\|(2v + y) - y\| = 2\|v\| < r$, using (5) it follows that $2v + y \in \overline{T(A)}$, i.e. there exists a sequence $(x_n)_{n=1}^\infty \subset X$ with $\|x_n\| < 1$, ∀ n ≥ 1, and $2v + y = \lim_{n\to\infty} Tx_n$. Since y itself belongs to $\overline{T(A)}$, there also exists some sequence $(z_n)_{n=1}^\infty \subset X$, with $\|z_n\| < 1$, ∀ n ≥ 1, and $y = \lim_{n\to\infty} Tz_n$. On the one hand, if we consider the sequence $(u_n)_{n=1}^\infty \subset X$ given by $u_n = \frac12 (x_n - z_n)$, then it is clear that
\[ \|u_n\| \le \tfrac12 \big( \|x_n\| + \|z_n\| \big) < 1, \quad \forall\, n \ge 1, \]
i.e. $(u_n)_{n=1}^\infty \subset A$. On the other hand, we have
\[ \lim_{n\to\infty} Tu_n = \lim_{n\to\infty} \tfrac12 \big( Tx_n - Tz_n \big) = \tfrac12 (2v + y - y) = v, \]
so v indeed belongs to $\overline{T(A)}$.

The next step is a slight (but crucial) improvement of Claim 1.

Claim 2: T(A) is a neighborhood of 0.

Start off by choosing ε > 0, such that
\[ \overline{T(A)} \supset B_\varepsilon(0). \tag{6} \]
The Claim will follow, once we prove the inclusion
\[ T(A) \supset B_{\varepsilon/2}(0). \tag{7} \]
To prove this inclusion, we start with some arbitrary $y \in B_{\varepsilon/2}(0)$. We want to construct a sequence of vectors $(x_n)_{n=1}^\infty \subset A$, such that, for every n ≥ 1, we have the inequality
\[ \Big\| y - \sum_{k=1}^n T\big( \tfrac{1}{2^k} x_k \big) \Big\| \le \frac{\varepsilon}{2^{n+1}}. \tag{8} \]
This sequence is constructed inductively as follows. We start by using (6), and we pick $x_1 \in A$ such that $\|2y - Tx_1\| < \frac\varepsilon2$. Once $x_1, \dots, x_p$ are constructed, such that (8) holds with n = p, we consider the vector
\[ z = 2^{p+1} \Big[ y - \sum_{k=1}^p T\big( \tfrac{1}{2^k} x_k \big) \Big] \in \overline{B_\varepsilon(0)}, \]
and we use again (6) to find $x_{p+1} \in A$, such that $\|z - Tx_{p+1}\| \le \frac\varepsilon2$. We then clearly have
\[ \Big\| y - \sum_{k=1}^{p+1} T\big( \tfrac{1}{2^k} x_k \big) \Big\| = \frac{\|z - Tx_{p+1}\|}{2^{p+1}} \le \frac{\varepsilon}{2^{p+2}}. \]
Consider now the series $\sum_{k=1}^\infty \frac{1}{2^k} x_k$. Since $\|x_k\| < 1$, ∀ k ≥ 1, and X is a Banach space, by Remark 3.1, the sequence $(w_n)_{n=1}^\infty \subset X$ of partial sums
\[ w_n = \sum_{k=1}^n \tfrac{1}{2^k} x_k, \quad n \ge 1, \]
is convergent to some point $x \in X$. Moreover, since we have
\[ \|w_n\| \le \sum_{k=1}^n \frac{\|x_k\|}{2^k} \le \sum_{k=1}^\infty \frac{\|x_k\|}{2^k}, \quad \forall\, n \ge 1, \]
we get the inequality
\[ \|x\| \le \sum_{k=1}^\infty \frac{\|x_k\|}{2^k} < 1, \]
which means that $x \in A$. Note also that using these partial sums, the inequality (8) reads
\[ \|y - Tw_n\| \le \frac{\varepsilon}{2^{n+1}}, \quad \forall\, n \ge 1, \]
so by the continuity of T, we have $y = Tx \in T(A)$.

Let us show now that $T^{-1}$ is continuous. Use Claim 2, to find some r > 0 such that
\[ T(A) \supset B_r(0), \tag{9} \]
and let $y \in Y$ be an arbitrary vector with $\|y\| \le 1$. Consider the vector $v = \frac r2 y$, which has $\|v\| \le \frac r2 < r$. By (9), there exists $x \in A$, such that $Tx = v$, which means that $T^{-1}y = \frac2r x$. This forces $\|T^{-1}y\| \le \frac2r$. This argument shows that
\[ \sup\{ \|T^{-1}y\| : y \in Y,\ \|y\| \le 1 \} \le \frac2r < \infty, \]
and the continuity of $T^{-1}$ follows from Proposition 2.4.

The following two exercises deal with two more “principles of Banach space theory.”




Exercise 5 ♦. (Closed Graph Theorem). Let X and Y be Banach spaces, and let $T : X \to Y$ be a linear map. Prove that the following are equivalent:

(i) T is continuous.

(ii) The graph of T
\[ G_T = \{ (x, Tx) : x \in X \} \]
is a closed subset of X × Y, in the product topology.

Hint: For the implication (ii) ⇒ (i), use Exercise 3, to get the fact that $G_T$ is a Banach space. Then T is exactly the inverse of $\pi_X\big|_{G_T}$, where $\pi_X : X \times Y \to X$ is the projection onto the first coordinate. Use Theorem 3.2.

Exercise 6 ♦. (Open Mapping Theorem). Let X and Y be Banach spaces, and let $T : X \to Y$ be a surjective linear continuous map. Prove that T is an open map, in the sense that

• whenever D ⊂ X is open, it follows that T(D) is open in Y.

Hint: Consider the linear map
\[ S : X \times Y \ni (x, y) \longmapsto (x, Tx + y) \in X \times Y. \]
Prove that S is linear, continuous, bijective, hence by Theorem 3.2, it is a homeomorphism. Use this fact to prove that for every open set D ⊂ X, there exists some open set E ⊂ X × Y, such that $T(D) = \pi_Y(E)$, where $\pi_Y : X \times Y \to Y$ is the projection onto the second coordinate. This reduces the problem to proving the fact that $\pi_Y$ is an open map.



Lecture 13

4. The weak dual topology

In this section we examine the topological duals of normed vector spaces. Besides the norm topology, there is another natural topology which is constructed as follows.

Definition. Let X be a normed vector space over $\mathbb{K}$ (= $\mathbb{R}$, $\mathbb{C}$). For every $x \in X$, let $\epsilon_x : X^* \to \mathbb{K}$ be the linear map defined by
\[ \epsilon_x(\varphi) = \varphi(x), \quad \forall\, \varphi \in X^*. \]
We equip the vector space $X^*$ with the weak topology defined by the family $\Xi = (\epsilon_x)_{x\in X}$. This topology is called the weak dual topology, which is denoted by $w^*$. Recall (see Section 3) that this topology is characterized by the following property:

($w^*$) Given a topological space T, a map $f : T \to X^*$ is continuous with respect to the $w^*$ topology, if and only if $\epsilon_x \circ f : T \to \mathbb{K}$ is continuous, for each $x \in X$.

Remark that all the maps $\epsilon_x : X^* \to \mathbb{K}$, $x \in X$, are already continuous with respect to the norm topology. This gives the fact that

• the $w^*$ topology on $X^*$ is weaker than the norm topology.

Remark 4.1. The $w^*$ topology is Hausdorff. Indeed, if $\varphi, \psi \in X^*$ are such that $\varphi \ne \psi$, then there exists some $x \in X$ such that
\[ \epsilon_x(\varphi) = \varphi(x) \ne \psi(x) = \epsilon_x(\psi). \]

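Proposition 4.1. Let X be a normed vector space over $\mathbb{K}$. For every ε > 0, $\varphi \in X^*$, and $x \in X$, define the set
\[ W(\varphi; \varepsilon, x) = \{ \psi \in X^* : |\psi(x) - \varphi(x)| < \varepsilon \}. \]
Then the collection
\[ \mathcal{W} = \{ W(\varphi; \varepsilon, x) : \varepsilon > 0,\ \varphi \in X^*,\ x \in X \} \]
is a subbase for the $w^*$ topology. More precisely, given $\varphi \in X^*$, a set $N \subset X^*$ is a neighborhood of φ with respect to the $w^*$ topology, if and only if, there exist ε > 0 and $x_1, \dots, x_n \in X$, such that
\[ N \supset W(\varphi; \varepsilon, x_1) \cap \cdots \cap W(\varphi; \varepsilon, x_n). \]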

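Proof. It is clearly sufficient to prove the second assertion, because it would imply the fact that any $w^*$ open set is a union of finite intersections of sets in $\mathcal{W}$.

If we define the collection
\[ \mathcal{S} = \{ \epsilon_x^{-1}(D) : x \in X,\ D \subset \mathbb{K} \text{ open} \}, \]
then we know that $\mathcal{S}$ is a subbase for the $w^*$ topology.

Fix $\varphi \in X^*$. Start with some $w^*$ neighborhood N of φ, so there exists some $w^*$ open set E with $\varphi \in E \subset N$. Using the fact that $\mathcal{S}$ is a subbase for the $w^*$ topology, there exist open sets $D_1, \dots, D_n \subset \mathbb{K}$, and points $x_1, \dots, x_n \in X$, such that
\[ \varphi \in \bigcap_{k=1}^n \epsilon_{x_k}^{-1}(D_k) \subset E. \]
Fix for the moment $k \in \{1, \dots, n\}$. The fact that $\varphi \in \epsilon_{x_k}^{-1}(D_k)$ means that $\varphi(x_k) \in D_k$. Since $D_k$ is open in $\mathbb{K}$, there exists some $\varepsilon_k > 0$, such that
\[ D_k \supset B_{\varepsilon_k}\big( \varphi(x_k) \big). \]
Then if we have an arbitrary $\psi \in W(\varphi; \varepsilon_k, x_k)$, we will have
\[ |\psi(x_k) - \varphi(x_k)| < \varepsilon_k, \]
which gives $\psi \in \epsilon_{x_k}^{-1}(D_k)$. This proves that
\[ W(\varphi; \varepsilon_k, x_k) \subset \epsilon_{x_k}^{-1}(D_k). \]
Notice that, if one takes $\varepsilon = \min\{ \varepsilon_1, \dots, \varepsilon_n \}$, then we clearly have the inclusions
\[ W(\varphi; \varepsilon, x_k) \subset W(\varphi; \varepsilon_k, x_k) \subset \epsilon_{x_k}^{-1}(D_k). \]
We then immediately get
\[ \bigcap_{k=1}^n W(\varphi; \varepsilon, x_k) \subset \bigcap_{k=1}^n \epsilon_{x_k}^{-1}(D_k) \subset E \subset N, \]
and we are done.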

Corollary 4.1. Let X be a normed vector space. Then the $w^*$ topology on $X^*$ is locally convex, i.e.

• for every $\varphi \in X^*$ and every $w^*$-neighborhood N of φ, there exists a convex $w^*$-open set D such that $\varphi \in D \subset N$.

Proof. Apply the second part of the proposition, together with the obvious fact that each of the sets $W(\varphi; \varepsilon, x)$ is convex and $w^*$-open.

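Proposition 4.2. Let X be a normed vector space. When equipped with the $w^*$ topology, the space $X^*$ is a topological vector space. This means that the maps
\[ X^* \times X^* \ni (\varphi, \psi) \longmapsto \varphi + \psi \in X^*, \]
\[ \mathbb{K} \times X^* \ni (\lambda, \varphi) \longmapsto \lambda\varphi \in X^* \]
are continuous with respect to the $w^*$ topology on the target space, and the $w^*$ product topology on the domain.

Proof. According to the definition of the $w^*$ topology, it suffices to prove that, for every $x \in X$, the maps
\[ \sigma_x : X^* \times X^* \ni (\varphi, \psi) \longmapsto \epsilon_x(\varphi + \psi) \in \mathbb{K}, \]
\[ \gamma_x : \mathbb{K} \times X^* \ni (\lambda, \varphi) \longmapsto \epsilon_x(\lambda\varphi) \in \mathbb{K} \]
are continuous. But the continuity of $\sigma_x$ and $\gamma_x$ is obvious, since we have
\[ \sigma_x(\varphi, \psi) = \varphi(x) + \psi(x) = \epsilon_x(\varphi) + \epsilon_x(\psi), \quad \forall\, (\varphi, \psi) \in X^* \times X^*; \]
\[ \gamma_x(\lambda, \varphi) = \lambda\varphi(x) = \lambda\epsilon_x(\varphi), \quad \forall\, (\lambda, \varphi) \in \mathbb{K} \times X^*. \]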




Our next goal will be to describe the linear maps $X^* \to \mathbb{K}$ which are continuous in the $w^*$ topology.

Proposition 4.3. Let $X$ be a normed vector space over $\mathbb{K}$. For a linear map $\omega : X^* \to \mathbb{K}$, the following are equivalent:

(i) $\omega$ is continuous with respect to the $w^*$ topology;
(ii) there exists some $x \in X$, such that

\[ \omega(\varphi) = \varphi(x), \quad \forall\, \varphi \in X^*. \]

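Proof. The implication (ii) ⇒ (i) is trivial, since condition (ii) says that $\omega$ is the evaluation map $\epsilon_x : \varphi \mapsto \varphi(x)$, which is $w^*$-continuous by the definition of the $w^*$ topology.

(i) ⇒ (ii). Suppose $\omega$ is continuous. In particular, $\omega$ is continuous at $0$, so if we take the set

\[ D = \{\lambda \in \mathbb{K} : |\lambda| < 1\}, \]

the set

\[ \omega^{-1}(D) = \{\varphi \in X^* : |\omega(\varphi)| < 1\} \]

is an open neighborhood of $0$ in the $w^*$ topology. By Proposition ?? there exist $x_1, \dots, x_n \in X$, and $\varepsilon > 0$, such that

\[ (1) \qquad W(0; \varepsilon, x_1) \cap \dots \cap W(0; \varepsilon, x_n) \subset \omega^{-1}(D). \]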

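Claim 1: One has the inequality

\[ |\omega(\varphi)| \le \varepsilon^{-1} \cdot \max\{|\varphi(x_1)|, \dots, |\varphi(x_n)|\}, \quad \forall\, \varphi \in X^*. \]

Fix an arbitrary $\varphi \in X^*$, and put $M = \max\{|\varphi(x_1)|, \dots, |\varphi(x_n)|\}$. For every integer $k \ge 1$, define

\[ \varphi_k = \varepsilon \Bigl(M + \frac{1}{k}\Bigr)^{-1} \varphi, \]

so that

\[ |\varphi_k(x_j)| = \varepsilon \Bigl(M + \frac{1}{k}\Bigr)^{-1} |\varphi(x_j)| \le \varepsilon M \Bigl(M + \frac{1}{k}\Bigr)^{-1} < \varepsilon, \quad \forall\, k \ge 1,\ j \in \{1, \dots, n\}. \]

This proves that $\varphi_k \in W(0; \varepsilon, x_j)$, for all $k \ge 1$, and all $j \in \{1, \dots, n\}$. By (1) this will give

\[ |\omega(\varphi_k)| < 1, \quad \forall\, k \ge 1, \]

which reads

\[ \varepsilon \Bigl(M + \frac{1}{k}\Bigr)^{-1} |\omega(\varphi)| < 1, \quad \forall\, k \ge 1. \]

This gives

\[ |\omega(\varphi)| \le \varepsilon^{-1} \Bigl(M + \frac{1}{k}\Bigr), \quad \forall\, k \ge 1, \]

and letting $k \to \infty$ it will obviously force

\[ |\omega(\varphi)| \le \varepsilon^{-1} M. \]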

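Having proven the Claim, we now define the linear map $T : X^* \to \mathbb{K}^n$, by

\[ T\varphi = \bigl(\varphi(x_1), \dots, \varphi(x_n)\bigr), \quad \forall\, \varphi \in X^*. \]

Claim 2: There exists a linear map $\sigma : \mathbb{K}^n \to \mathbb{K}$, such that $\omega = \sigma \circ T$.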

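First we show that we have the inclusion

\[ \operatorname{Ker} \omega \supset \operatorname{Ker} T. \]

If we start with $\varphi \in \operatorname{Ker} T$, then $\varphi(x_1) = \dots = \varphi(x_n) = 0$, and then by Claim 1 we immediately get $\omega(\varphi) = 0$, so $\varphi$ indeed belongs to $\operatorname{Ker} \omega$. We now use a bit of linear algebra. On the one hand, since $\omega\big|_{\operatorname{Ker} T} = 0$, there exists a linear map $\hat{\omega} : X^*/\operatorname{Ker} T \to \mathbb{K}$, such that $\omega = \hat{\omega} \circ \pi$, where $\pi : X^* \to X^*/\operatorname{Ker} T$ denotes the quotient map. On the other hand, by the Isomorphism Theorem for linear maps, there exists a linear isomorphism $\hat{T} : X^*/\operatorname{Ker} T \xrightarrow{\ \sim\ } \operatorname{Ran} T$, such that $\hat{T} \circ \pi = T$.

We then define

\[ \sigma_0 = \hat{\omega} \circ \hat{T}^{-1} : \operatorname{Ran} T \to \mathbb{K}, \]

and we will have

\[ \sigma_0 \circ T = (\hat{\omega} \circ \hat{T}^{-1}) \circ (\hat{T} \circ \pi) = \hat{\omega} \circ \pi = \omega. \]

We finally extend⁵ $\sigma_0 : \operatorname{Ran} T \to \mathbb{K}$ to a linear map $\sigma : \mathbb{K}^n \to \mathbb{K}$.

Having proven Claim 2, we choose scalars $\alpha_1, \dots, \alpha_n \in \mathbb{K}$, such that

\[ \sigma(\lambda_1, \dots, \lambda_n) = \alpha_1\lambda_1 + \dots + \alpha_n\lambda_n, \quad \forall\, (\lambda_1, \dots, \lambda_n) \in \mathbb{K}^n. \]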

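We now have

\[ \omega(\varphi) = \sigma(T\varphi) = \sigma\bigl(\varphi(x_1), \dots, \varphi(x_n)\bigr) = \alpha_1\varphi(x_1) + \dots + \alpha_n\varphi(x_n), \quad \forall\, \varphi \in X^*, \]

so if we define $x = \alpha_1 x_1 + \dots + \alpha_n x_n$, we clearly have

\[ \omega(\varphi) = \varphi(x), \quad \forall\, \varphi \in X^*. \]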

(ii) ⇒ (i). This implication is trivial.

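Corollary 4.2. Let $X$ be a normed vector space, let $C \subset X^*$ be a convex set, and let $\varphi \in X^* \smallsetminus \overline{C}^{w^*}$. (Here $\overline{C}^{w^*}$ denotes the $w^*$-closure of $C$.) Then there exists an element $x \in X$, and a real number $\alpha$, such that

\[ \operatorname{Re} \varphi(x) < \alpha \le \operatorname{Re} \psi(x), \quad \forall\, \psi \in C. \]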

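Proof. Since the $w^*$ topology on $X^*$ is locally convex, there exists a convex $w^*$-open set $A \subset X^*$, such that $\varphi \in A \subset X^* \smallsetminus \overline{C}^{w^*}$. In particular, we have $A \cap C = \emptyset$. Apply the Hahn-Banach separation theorem to find a linear map $\omega : X^* \to \mathbb{K}$, which is $w^*$-continuous, and a real number $\alpha$, such that

\[ \operatorname{Re} \omega(\rho) < \alpha \le \operatorname{Re} \omega(\psi), \quad \forall\, \rho \in A,\ \psi \in C. \]

We then apply the above Proposition to write $\omega(\rho) = \rho(x)$ for some $x \in X$, and take $\rho = \varphi$.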

Comments. The definition of the $w^*$ topology can be used in a more general setting, when $X$ is just a topological vector space. The above results are still valid in this general setting.

In general the unit ball

\[ (X^*)_1 = \{\varphi \in X^* : \|\varphi\| \le 1\}, \]

although bounded and closed, is not compact in the norm topology. However, when the $w^*$ topology is used, we have

Theorem 4.1 (Alaoglu). If X is a normed vector space, then the unit ball (X∗)1, in the topological dual space, is compact in the w∗ topology.

Proof. Let us consider the unit ball in $\mathbb{K}$:

\[ B = \{\lambda \in \mathbb{K} : |\lambda| \le 1\}. \]

Let us also consider the unit ball in $X$:

\[ (X)_1 = \{x \in X : \|x\| \le 1\}. \]

⁵ One can invoke the Hahn-Banach Theorem here. In fact this is not necessary, since $\operatorname{Ran} T \subset \mathbb{K}^n$ are finite dimensional vector spaces.




Define the product space

\[ P = \prod_{x \in (X)_1} B, \]

identified equivalently as the space of maps $(X)_1 \to B$. By Tihonov's Theorem, when we equip $P$ with the product topology, it will become a compact topological space. We denote by $\pi_x : P \to B$, $x \in (X)_1$, the projection onto the factor with label $x$. By definition of the product topology $\pi_x$ is continuous.

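For any $x, y \in (X)_1$ define the map $\Delta_{x,y} : P \to \mathbb{K}$ by

\[ \Delta_{x,y}(f) = \frac{f(x) + f(y)}{2} - f\Bigl(\frac{x + y}{2}\Bigr), \quad \forall\, f \in P. \]

Note that

\[ \Delta_{x,y} = \tfrac{1}{2}(\pi_x + \pi_y) - \pi_{(x+y)/2}, \]

so $\Delta_{x,y} : P \to \mathbb{K}$ is obviously continuous. In particular, the set

\[ A_{x,y} = \Delta_{x,y}^{-1}(\{0\}) = \Bigl\{ f \in P : \frac{f(x) + f(y)}{2} = f\Bigl(\frac{x + y}{2}\Bigr) \Bigr\} \]

is closed in $P$, for every $x, y \in (X)_1$.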

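Similarly, for every $x \in (X)_1$ and every $\lambda \in B$, we define the map $\Sigma_{\lambda,x} : P \to \mathbb{K}$ by

\[ \Sigma_{\lambda,x}(f) = f(\lambda x) - \lambda f(x), \quad \forall\, f \in P. \]

Then $\Sigma_{\lambda,x}$ is continuous, so the set

\[ B_{\lambda,x} = \Sigma_{\lambda,x}^{-1}(\{0\}) = \bigl\{ f \in P : f(\lambda x) = \lambda f(x) \bigr\} \]

is closed in $P$, for every $\lambda \in B$, $x \in (X)_1$.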

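Define the set

\[ L = \Bigl( \bigcap_{x,y \in (X)_1} A_{x,y} \Bigr) \cap \Bigl( \bigcap_{\lambda \in B,\ x \in (X)_1} B_{\lambda,x} \Bigr). \]

Since $L$ is an intersection of closed sets, it follows that $L$ itself is closed in the compact space $P$. In particular, $L$ is compact. By construction, we have

\[ L = \Bigl\{ f : (X)_1 \to B \ :\ \tfrac{1}{2}[f(x) + f(y)] = f\bigl(\tfrac{1}{2}[x + y]\bigr) \text{ and } f(\lambda x) = \lambda f(x),\ \forall\, x, y \in (X)_1,\ \lambda \in B \Bigr\}. \]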

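For any $f \in L$, we define the map $\psi_f : X \to \mathbb{K}$ by

\[ \psi_f(x) = \begin{cases} 0 & \text{if } x = 0 \\ \|x\| \cdot f\Bigl(\dfrac{x}{\|x\|}\Bigr) & \text{if } x \ne 0 \end{cases} \]

Claim 1: For any $f \in L$, the map $\psi_f : X \to \mathbb{K}$ is linear, and satisfies $\psi_f\big|_{(X)_1} = f$.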

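Fix $f \in L$. Start with some $x \in X$ and some $\lambda \in \mathbb{K}$. We have $\|\lambda x\| = |\lambda| \cdot \|x\|$, so we get

\[ \psi_f(\lambda x) = \begin{cases} 0 & \text{if either } x = 0 \text{ or } \lambda = 0 \\ |\lambda| \cdot \|x\| \cdot f\Bigl(\dfrac{\lambda}{|\lambda|} \cdot \dfrac{x}{\|x\|}\Bigr) & \text{if } \lambda \ne 0 \text{ and } x \ne 0 \end{cases} \]

If $\lambda \ne 0$ and $x \ne 0$, we put

\[ \mu = \frac{\lambda}{|\lambda|} \quad \text{and} \quad y = \frac{x}{\|x\|}, \]

and the fact that $\mu \in B$, $y \in (X)_1$, and $f \in B_{\mu,y}$, will give

\[ f\Bigl(\frac{\lambda}{|\lambda|} \cdot \frac{x}{\|x\|}\Bigr) = f(\mu y) = \mu f(y) = \frac{\lambda}{|\lambda|} \cdot f\Bigl(\frac{x}{\|x\|}\Bigr) = \frac{\lambda}{|\lambda| \cdot \|x\|}\, \psi_f(x), \]

so in this case we get

\[ \psi_f(\lambda x) = |\lambda| \cdot \|x\| \cdot f\Bigl(\frac{\lambda}{|\lambda|} \cdot \frac{x}{\|x\|}\Bigr) = |\lambda| \cdot \|x\| \cdot \frac{\lambda}{|\lambda| \cdot \|x\|}\, \psi_f(x) = \lambda \psi_f(x). \]

In the case when either $\lambda = 0$ or $x = 0$, we also get the equality

\[ \psi_f(\lambda x) = 0 = \lambda \psi_f(x). \]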

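This way we have proven the homogeneity of $\psi_f$:

\[ (2) \qquad \psi_f(\lambda x) = \lambda \psi_f(x), \quad \forall\, \lambda \in \mathbb{K},\ x \in X. \]

Let us prove now that $\psi_f\big|_{(X)_1} = f$. Fix $x \in (X)_1$. If $x = 0$, then using the property

\[ (3) \qquad f(\mu y) = \mu f(y), \quad \forall\, \mu \in B,\ y \in (X)_1 \]

with $\mu = 0$ and $y = 0$, we immediately get $f(x) = 0 = \psi_f(x)$. If $x \ne 0$, we use (3) with $\mu = \|x\|$ and $y = \dfrac{x}{\|x\|}$ and we again get

\[ f(x) = f(\|x\| \cdot y) = \|x\| \cdot f(y) = \|x\| \cdot f\Bigl(\frac{x}{\|x\|}\Bigr) = \psi_f(x). \]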

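We now prove that $\psi_f$ is additive. Start with two elements $x, y \in X$. Define

\[ v = \frac{x}{\|x\| + \|y\| + 1} \quad \text{and} \quad w = \frac{y}{\|x\| + \|y\| + 1}, \]

so that we obviously have $v, w \in (X)_1$ and

\[ x = \bigl(\|x\| + \|y\| + 1\bigr) \cdot v \quad \text{and} \quad y = \bigl(\|x\| + \|y\| + 1\bigr) \cdot w. \]

By homogeneity, and since $\tfrac{1}{2}[v + w] \in (X)_1$, we have

\[ \psi_f(x + y) = \psi_f\Bigl( 2\bigl(\|x\| + \|y\| + 1\bigr) \cdot \tfrac{1}{2}[v + w] \Bigr) = 2\bigl(\|x\| + \|y\| + 1\bigr) \cdot f\bigl(\tfrac{1}{2}[v + w]\bigr). \]

Using the fact that $f \in A_{v,w}$ the above computation can be continued to give:

\[ \psi_f(x + y) = 2\bigl(\|x\| + \|y\| + 1\bigr) \cdot \tfrac{1}{2}[f(v) + f(w)] = \bigl(\|x\| + \|y\| + 1\bigr) \cdot f(v) + \bigl(\|x\| + \|y\| + 1\bigr) \cdot f(w). \]

Using the fact that $\psi_f\big|_{(X)_1} = f$, the above equality gives

\[ \psi_f(x + y) = \bigl(\|x\| + \|y\| + 1\bigr) \cdot \psi_f(v) + \bigl(\|x\| + \|y\| + 1\bigr) \cdot \psi_f(w). \]

Finally, using the homogeneity property (2) we get

\[ \psi_f(x + y) = \psi_f\Bigl( \bigl(\|x\| + \|y\| + 1\bigr) \cdot v \Bigr) + \psi_f\Bigl( \bigl(\|x\| + \|y\| + 1\bigr) \cdot w \Bigr) = \psi_f(x) + \psi_f(y). \]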
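As a numerical illustration of Claim 1 (a sketch only, not part of the argument), one can take the hypothetical finite-dimensional case X = R² with the Euclidean norm, start from a map f on the unit ball that is the restriction of a linear functional, and check that the formula for ψ_f recovers a homogeneous, additive map on all of X. The vector `A` and the names `f`, `psi_f` are assumptions made for this example.

```python
import math

# Hypothetical vector of Euclidean norm 1, so f maps the unit ball of R^2 into [-1, 1].
A = (0.6, 0.8)

def f(x):
    """A map on the unit ball (X)_1 satisfying the midpoint and homogeneity conditions."""
    return A[0] * x[0] + A[1] * x[1]

def psi_f(x):
    """The extension psi_f(x) = ||x|| * f(x / ||x||), with psi_f(0) = 0."""
    norm = math.hypot(x[0], x[1])
    if norm == 0.0:
        return 0.0
    return norm * f((x[0] / norm, x[1] / norm))

x, y = (3.0, -4.0), (-1.5, 2.0)

# Additivity: psi_f(x + y) should agree with psi_f(x) + psi_f(y) up to rounding.
additivity_gap = abs(psi_f((x[0] + y[0], x[1] + y[1])) - (psi_f(x) + psi_f(y)))

# Homogeneity: psi_f(lambda * x) should agree with lambda * psi_f(x), here with lambda = -2.
homogeneity_gap = abs(psi_f((-2.0 * x[0], -2.0 * x[1])) - (-2.0) * psi_f(x))
```

Both gaps are zero up to floating-point rounding in this example, which is what Claim 1 predicts for any f in L.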

Having proven the Claim, let us now observe that, for $f \in L$, the fact that

\[ \psi_f(x) = f(x) \in B, \quad \forall\, x \in (X)_1, \]

shows that $\psi_f$ is continuous, and $\|\psi_f\| \le 1$. Therefore we have a correctly defined map

\[ \Psi : L \ni f \longmapsto \psi_f \in (X^*)_1. \]

Claim 2: When $(X^*)_1$ is equipped with the $w^*$ topology, the map $\Psi$ is continuous.

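By the definition of the $w^*$ topology, we need to prove that $\epsilon_x \circ \Psi : L \to \mathbb{K}$ is continuous, for every $x \in X$, where $\epsilon_x(\varphi) = \varphi(x)$ is the evaluation map. If $x = 0$, the composition $\epsilon_x \circ \Psi$ is the constant map $0$, so there is nothing to prove. If $x \ne 0$, we define

\[ y = \frac{x}{\|x\|} \in (X)_1, \]

and using Claim 1, we have

\[ (\epsilon_x \circ \Psi)(f) = \epsilon_x(\psi_f) = \psi_f(x) = \|x\| \cdot \psi_f\Bigl(\frac{x}{\|x\|}\Bigr) = \|x\| \cdot \psi_f(y) = \|x\| \cdot f(y), \quad \forall\, f \in L. \]

This proves that

\[ \epsilon_x \circ \Psi = \|x\| \cdot \pi_y\big|_L, \]

and since $\pi_y : P \to B$ is continuous, the continuity of $\epsilon_x \circ \Psi$ follows.

Since $L$ is compact and $\Psi$ is continuous, in order to finish the proof of the Theorem, it then suffices to prove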

Claim 3: The map $\Psi : L \to (X^*)_1$ is surjective.

Start with an arbitrary $\varphi \in (X^*)_1$, which means that $\varphi : X \to \mathbb{K}$ is linear, continuous, and

\[ |\varphi(x)| \le 1, \quad \forall\, x \in (X)_1. \]

In particular, if we define $f = \varphi\big|_{(X)_1}$, then

\[ f(x) \in B, \quad \forall\, x \in (X)_1, \]

which means that $f \in P$. Using the fact that $\varphi$ is linear, it is obvious that $f \in L$. Using Claim 1, we have

\[ \psi_f(x) = f(x) = \varphi(x), \quad \forall\, x \in (X)_1. \]

Now, since $\psi_f\big|_{(X)_1} = \varphi\big|_{(X)_1}$, and both $\psi_f$ and $\varphi$ are linear, we immediately get $\psi_f = \varphi$.

Remarks 4.2. Using the notations from the above proof, the continuous map $\Psi : L \to (X^*)_1$ is in fact bijective. The only thing we need to prove is the injectivity. Suppose $\psi_f = \psi_g$, for some $f, g \in L$. Then

\[ f = \psi_f\big|_{(X)_1} = \psi_g\big|_{(X)_1} = g. \]

Since $\Psi : L \to (X^*)_1$ is bijective, continuous, and the spaces $L$ and $(X^*)_1$ are compact Hausdorff, it follows that $\Psi$ is in fact a homeomorphism. The inverse map $\Psi^{-1} : (X^*)_1 \to L$ is simply defined by

\[ \Psi^{-1}(\varphi) = \varphi\big|_{(X)_1}, \quad \forall\, \varphi \in (X^*)_1. \]

Proposition 4.4. Suppose X is a normed vector space, which is separable in the norm topology. When equipped with the w∗ topology, the compact space (X∗)1

is metrizable.




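Proof. Fix a countable dense subset $M \subset X$, and define $(M)_1 = (X)_1 \cap M$. Notice that $(M)_1$ is dense in $(X)_1$. Indeed, if we start with some $x \in (X)_1$, and some $\varepsilon > 0$, then we set $x_\varepsilon = \bigl(1 - \tfrac{\varepsilon}{2}\bigr)x$, and we choose $y \in M$ such that

\[ \|x_\varepsilon - y\| < \frac{\varepsilon}{2}. \]

On the one hand, we have

\[ \|y\| \le \|x_\varepsilon - y\| + \|x_\varepsilon\| < \frac{\varepsilon}{2} + \Bigl(1 - \frac{\varepsilon}{2}\Bigr) \cdot \|x\| \le \frac{\varepsilon}{2} + 1 - \frac{\varepsilon}{2} = 1, \]

so $y \in (M)_1$. On the other hand, we have

\[ \|y - x\| \le \|y - x_\varepsilon\| + \|x - x_\varepsilon\| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2}\|x\| = \frac{\varepsilon}{2} \cdot \bigl(1 + \|x\|\bigr) \le \varepsilon. \]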

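Let us use the notations from the proof of Theorem 4.1. Let us then define the product space

\[ \prod_{x \in (M)_1} B, \]

equipped with the product topology. Define also the map

\[ \Upsilon : \prod_{x \in (X)_1} B \ni f \longmapsto f\big|_{(M)_1} \in \prod_{x \in (M)_1} B. \]

It is obvious that $\Upsilon$ is continuous. Let

\[ \kappa : (X^*)_1 \ni \varphi \longmapsto \varphi\big|_{(X)_1} \in \prod_{x \in (X)_1} B. \]

We know that $\kappa$ is continuous and injective (being the inverse of $\Psi : L \to (X^*)_1$).

Claim: The composition $\Upsilon \circ \kappa : (X^*)_1 \to \prod_{x \in (M)_1} B$ is injective.

Indeed, if $\varphi, \psi \in (X^*)_1$ satisfy $(\Upsilon \circ \kappa)(\varphi) = (\Upsilon \circ \kappa)(\psi)$, then we get $\varphi\big|_{(M)_1} = \psi\big|_{(M)_1}$. Since $(M)_1$ is dense in $(X)_1$, this will force $\varphi\big|_{(X)_1} = \psi\big|_{(X)_1}$, which finally forces $\varphi = \psi$.

Using the above Claim, we see that if we define $Q = (\Upsilon \circ \kappa)\bigl((X^*)_1\bigr)$, then $Q \subset \prod_{x \in (M)_1} B$ is compact, and $\Upsilon \circ \kappa : (X^*)_1 \to Q$ is a homeomorphism, being a continuous bijection between compact Hausdorff spaces. Notice that $\prod_{x \in (M)_1} B$ is a countable product of metric spaces, so it is metrizable. Therefore $Q$ is also metrizable, and so will be $(X^*)_1$.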

Remark 4.3. Assume $X$ is separable, and $M \subset X$ is a countable dense subset. If we enumerate the countable set $(M)_1$ as

\[ (M)_1 = \{y_n : n \ge 1\}, \]

then a metric $d$ that defines the $w^*$ topology on $(X^*)_1$ can be constructed as

\[ d(\varphi, \psi) = \sum_{n=1}^{\infty} \frac{|\varphi(y_n) - \psi(y_n)|}{2^n}, \quad \forall\, \varphi, \psi \in (X^*)_1. \]
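To make the formula for d concrete, here is a minimal sketch that evaluates a truncation of the series over finitely many points. The setting (X = R² with the Euclidean norm, functionals given by vectors of norm at most 1, and the four sample points standing in for the beginning of an enumeration of (M)_1) is a hypothetical choice made only for illustration.

```python
def weak_star_metric(phi, psi, points):
    """Truncated sum of |phi(y_n) - psi(y_n)| / 2**n over the given points y_1, ..., y_N."""
    return sum(abs(phi(y) - psi(y)) / 2 ** n for n, y in enumerate(points, start=1))

def functional(a):
    """The linear functional y -> a . y on R^2 (hypothetical example of an element of X*)."""
    return lambda y: a[0] * y[0] + a[1] * y[1]

# First few points of an assumed enumeration of (M)_1; all lie in the unit ball.
points = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5), (-0.5, 0.25)]

phi = functional((0.6, 0.8))
psi = functional((0.6, -0.8))

d = weak_star_metric(phi, psi, points)
```

Since every term of the full series is bounded by 2 / 2^n on the unit ball of X*, truncating after N terms changes the value of d by at most 2^(1-N).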

Comments. Let $X$ be a normed vector space. One can extend the map $\kappa$ to a map

\[ \kappa : X^* \ni \varphi \longmapsto \varphi\big|_{(X)_1} \in \prod_{x \in (X)_1} \mathbb{K}. \]

This map will still be injective and continuous, and one can show that

\[ \kappa : X^* \to \kappa(X^*) \]

is a homeomorphism, when $\kappa(X^*)$ is equipped with the induced topology from the product space $\prod_{x \in (X)_1} \mathbb{K}$. In general however, the set $\kappa(X^*)$ is not closed in the product space $\prod_{x \in (X)_1} \mathbb{K}$.

If $X$ is separable, and if one takes a countable dense set $M \subset X$, then as before, one also still has a continuous map

\[ \Upsilon : \prod_{x \in (X)_1} \mathbb{K} \ni f \longmapsto f\big|_{(M)_1} \in \prod_{x \in (M)_1} \mathbb{K}, \]

and the composition

\[ \Upsilon \circ \kappa : X^* \to \prod_{x \in (M)_1} \mathbb{K} \]

will still be continuous and injective. In general however, it turns out that the map

\[ \Upsilon \circ \kappa : X^* \to (\Upsilon \circ \kappa)(X^*) \]

is not a homeomorphism. The exercise below explains exactly when this is the case.

Exercise 1*. Let $X$ be a normed vector space, which is of uncountable dimension (for example, an infinite-dimensional Banach space). Prove that the topological space $(X^*, w^*)$ is not metrizable.

Hint: Assume $(X^*, w^*)$ is metrizable. Let $d$ be a metric which gives the $w^*$-topology. Then $0 \in X^*$ will have a countable basic system of neighborhoods. In particular, there exist sequences $(x_n)_{n \ge 1} \subset X$, and $(\varepsilon_n)_{n \ge 1} \subset (0, \infty)$, such that the sets

\[ B_n = \bigcap_{k=1}^{n} W(0; \varepsilon_n, x_k) \]

satisfy $B_n \subset \mathbf{B}_{1/n}(0)$, $\forall\, n \ge 1$, where $\mathbf{B}_{1/n}(0)$ denotes the $d$-open ball of center $0$ and radius $1/n$. Consider the set $M = \{x_n : n \in \mathbb{N}\}$. We know that $\operatorname{Span} M \subsetneq X$. Choose some vector $y \in X \smallsetminus \operatorname{Span} M$. For every $n \ge 1$, choose a linear map $\psi_n : \operatorname{Span}\{y, x_1, \dots, x_n\} \to \mathbb{K}$, such that $\psi_n(y) = 1$, and $\psi_n(x_k) = 0$, $\forall\, k \in \{1, \dots, n\}$. Extend (use Hahn-Banach) $\psi_n$ to a linear continuous map $\varphi_n : X \to \mathbb{K}$. Notice now that $\varphi_n \in B_n$, for all $n \ge 1$, which would then force $d\text{-}\lim_{n \to \infty} \varphi_n = 0$. In particular, this would force $\lim_{n \to \infty} \varphi_n(x) = 0$, $\forall\, x \in X$. But this is impossible, since $\varphi_n(y) = 1$, $\forall\, n \ge 1$.

Comment. If $X$ is a normed vector space of countable dimension, then $(X^*, w^*)$ is metrizable. Indeed, if we take a linear basis $\{b_n : n \in \mathbb{N}\}$ for $X$, then the $w^*$ topology on $X^*$ is clearly defined by the metric

\[ d(\varphi, \psi) = \sum_{n=1}^{\infty} \frac{1}{2^n} \cdot \frac{|\varphi(b_n) - \psi(b_n)|}{1 + |\varphi(b_n) - \psi(b_n)|}, \quad \varphi, \psi \in X^*. \]



Lectures 14-15

5. Banach spaces of continuous functions

In this section we discuss examples of Banach spaces coming from topology.

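Notation. Let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let $\Omega$ be a topological space. We define

\[ C_b^{\mathbb{K}}(\Omega) = \bigl\{ f : \Omega \to \mathbb{K} : f \text{ bounded and continuous} \bigr\}. \]

In the case when $\mathbb{K} = \mathbb{C}$ we use the notation $C_b(\Omega)$.

Proposition 5.1. With the notations above, if we define

\[ \|f\| = \sup_{p \in \Omega} |f(p)|, \quad \forall\, f \in C_b^{\mathbb{K}}(\Omega), \]

then $C_b^{\mathbb{K}}(\Omega)$ is a Banach space.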

Proof. It is obvious that $C_b^{\mathbb{K}}(\Omega)$ is a linear subspace of $\ell^\infty_{\mathbb{K}}(\Omega)$, and the norm is precisely the one coming from $\ell^\infty_{\mathbb{K}}(\Omega)$. Therefore, it suffices to prove that $C_b^{\mathbb{K}}(\Omega)$ is closed in $\ell^\infty_{\mathbb{K}}(\Omega)$.

Start with some sequence $(f_n)_{n \ge 1} \subset C_b^{\mathbb{K}}(\Omega)$, which converges in norm to some $f \in \ell^\infty_{\mathbb{K}}(\Omega)$, and let us prove that $f : \Omega \to \mathbb{K}$ is continuous (the fact that $f$ is bounded is automatic).

Fix some point $p_0 \in \Omega$, and some $\varepsilon > 0$. We need to find some neighborhood $V$ of $p_0$, such that

\[ |f(p) - f(p_0)| < \varepsilon, \quad \forall\, p \in V. \]

Start by choosing $n$ such that $\|f_n - f\| < \frac{\varepsilon}{3}$. Use the fact that $f_n$ is continuous, to find a neighborhood $V$ of $p_0$, such that

\[ |f_n(p) - f_n(p_0)| < \frac{\varepsilon}{3}, \quad \forall\, p \in V. \]

Suppose now $p \in V$. We have

\[ |f(p) - f(p_0)| \le |f_n(p) - f(p)| + |f_n(p) - f_n(p_0)| + |f_n(p_0) - f(p_0)| \le |f_n(p) - f_n(p_0)| + 2 \sup_{q \in \Omega} |f_n(q) - f(q)| < \frac{\varepsilon}{3} + \frac{2\varepsilon}{3} = \varepsilon. \]

A first application of Banach space techniques is the following:

Lemma 5.1 (Urysohn type density). Let $\Omega$ be a topological space, let $\mathcal{C} \subset C_b^{\mathbb{R}}(\Omega)$ be a linear subspace, which contains the constant function $1$. Assume

(u) for any two closed sets $A, B \subset \Omega$, with $A \cap B = \emptyset$, there exists a function $h \in \mathcal{C}$, such that $h\big|_A = 0$, $h\big|_B = 1$, and $h(p) \in [0, 1]$, for all $p \in \Omega$.

Then $\mathcal{C}$ is dense in $C_b^{\mathbb{R}}(\Omega)$, in the norm topology.


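Proof. The key step in the proof will be the following:

Claim: For any $f \in C_b^{\mathbb{R}}(\Omega)$, there exists $g \in \mathcal{C}$, such that

\[ \|g - f\| \le \tfrac{2}{3}\|f\|. \]

To prove this claim we define

\[ \alpha = \inf_{p \in \Omega} f(p) \quad \text{and} \quad \beta = \sup_{p \in \Omega} f(p), \]

so that $f(\Omega) \subset [\alpha, \beta]$, and $\|f\| = \max\{|\alpha|, |\beta|\}$. If $\alpha = \beta$, then $f$ is constant and we can take $g = f \in \mathcal{C}$, so we may assume $\alpha < \beta$. Define the sets

\[ A = f^{-1}\Bigl( \Bigl[\alpha, \frac{2\alpha + \beta}{3}\Bigr] \Bigr) \quad \text{and} \quad B = f^{-1}\Bigl( \Bigl[\frac{\alpha + 2\beta}{3}, \beta\Bigr] \Bigr), \]

so that both $A$ and $B$ are closed, and $A \cap B = \emptyset$. Use the hypothesis, to find a function $h \in \mathcal{C}$, such that $h\big|_A = 0$, $h\big|_B = 1$, and $h(p) \in [0, 1]$, for all $p \in \Omega$. Define the function $g \in \mathcal{C}$ by

\[ g = \tfrac{1}{3}\bigl[ \alpha 1 + (\beta - \alpha) h \bigr]. \]

Let us examine the difference $g - f$. Start with some arbitrary point $p \in \Omega$. There are three cases to examine: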

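Case I: $p \in A$. In this case we have $h(p) = 0$, so we get $g(p) = \frac{\alpha}{3}$. By the construction of $A$ we also have $\alpha \le f(p) \le \frac{2\alpha + \beta}{3}$, so we get

\[ \frac{2\alpha}{3} \le f(p) - g(p) \le \frac{\alpha + \beta}{3}. \]

Case II: $p \in B$. In this case we have $h(p) = 1$, so we get $g(p) = \frac{\beta}{3}$. We also have $\frac{2\beta + \alpha}{3} \le f(p) \le \beta$, so we get

\[ \frac{\alpha + \beta}{3} \le f(p) - g(p) \le \frac{2\beta}{3}. \]

Case III: $p \in \Omega \smallsetminus (A \cup B)$. In this case we have $0 \le h(p) \le 1$, so we get $\frac{\alpha}{3} \le g(p) \le \frac{\beta}{3}$, and $\frac{2\alpha + \beta}{3} < f(p) < \frac{\alpha + 2\beta}{3}$. In particular we get

\[ f(p) - g(p) > \frac{2\alpha + \beta}{3} - \frac{\beta}{3} = \frac{2\alpha}{3}; \]

\[ f(p) - g(p) < \frac{\alpha + 2\beta}{3} - \frac{\alpha}{3} = \frac{2\beta}{3}. \]

Since $\frac{2\alpha}{3} \le \frac{\alpha + \beta}{3} \le \frac{2\beta}{3}$, we see that in all three cases we have

\[ \frac{2\alpha}{3} \le f(p) - g(p) \le \frac{2\beta}{3}, \]

so we get

\[ \frac{2\alpha}{3} \le \inf_{p \in \Omega} \bigl[ f(p) - g(p) \bigr] \le \sup_{p \in \Omega} \bigl[ f(p) - g(p) \bigr] \le \frac{2\beta}{3}, \]

and since $\|f\| = \max\{|\alpha|, |\beta|\}$, we indeed get the desired inequality

\[ \|g - f\| \le \tfrac{2}{3}\|f\|. \]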



CHAPTER I I: ELEMENTS O F FUNCTIONAL ANALYSIS 10 1

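Having proven the Claim, we now prove the density of $\mathcal{C}$ in $C_b^{\mathbb{R}}(\Omega)$. Start with some $f \in C_b^{\mathbb{R}}(\Omega)$, and we construct recursively two sequences $(g_n)_{n \ge 1} \subset \mathcal{C}$ and $(f_n)_{n \ge 1} \subset C_b^{\mathbb{R}}(\Omega)$, as follows. Set $f_1 = f$. Apply the Claim to find $g_1 \in \mathcal{C}$ such that

\[ \|g_1 - f_1\| \le \tfrac{2}{3}\|f_1\|. \]

Once $f_1, f_2, \dots, f_n$ and $g_1, g_2, \dots, g_n$ have been constructed, we set

\[ f_{n+1} = f_n - g_n, \]

and we choose $g_{n+1} \in \mathcal{C}$ such that

\[ \|g_{n+1} - f_{n+1}\| \le \tfrac{2}{3}\|f_{n+1}\|. \]

It is clear, by construction, that

\[ \|f_n\| \le \bigl(\tfrac{2}{3}\bigr)^{n-1} \|f\|, \quad \forall\, n \ge 1. \]

Consider the sequence $(s_n)_{n \ge 1} \subset \mathcal{C}$ of partial sums, defined by

\[ s_n = g_1 + g_2 + \dots + g_n, \quad \forall\, n \ge 1. \]

Using the equalities

\[ g_n = f_n - f_{n+1}, \quad \forall\, n \ge 1, \]

we get

\[ s_n - f = g_1 + g_2 + \dots + g_n - f_1 = -f_{n+1}, \]

so we have

\[ \|s_n - f\| \le \bigl(\tfrac{2}{3}\bigr)^{n} \|f\|, \quad \forall\, n \ge 1, \]

which clearly gives $f = \lim_{n \to \infty} s_n$, so $f$ indeed belongs to the closure $\overline{\mathcal{C}}$.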
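The recursion in this proof can be tried out numerically. The sketch below works on a finite grid in [0, 1] and takes for the subspace all functions on the grid; the separating function h is built directly from f by a piecewise-linear formula, which is an assumption of this example (the lemma only asserts that some h with the stated properties exists). The names `claim_step`, `sup_norm` and the sample function are hypothetical.

```python
import math

def sup_norm(values):
    """Sup norm of a function given by its values on the grid."""
    return max(abs(v) for v in values)

def claim_step(f):
    """One application of the Claim: return g with sup|g - f| <= (2/3) sup|f|."""
    alpha, beta = min(f), max(f)
    if alpha == beta:
        return list(f)  # constant function: take g = f
    lo = (2 * alpha + beta) / 3
    width = (beta - alpha) / 3
    # h = 0 where f <= (2a+b)/3, h = 1 where f >= (a+2b)/3, linear in between.
    h = [min(1.0, max(0.0, (v - lo) / width)) for v in f]
    return [(alpha + (beta - alpha) * hv) / 3 for hv in h]

grid = [i / 200 for i in range(201)]
f = [math.sin(2 * math.pi * p) + 0.5 * p for p in grid]

residual = list(f)              # plays the role of f_n
partial_sum = [0.0] * len(f)    # plays the role of s_n
norms = [sup_norm(residual)]
for _ in range(10):
    g = claim_step(residual)
    partial_sum = [s + gv for s, gv in zip(partial_sum, g)]
    residual = [r - gv for r, gv in zip(residual, g)]   # f_{n+1} = f_n - g_n
    norms.append(sup_norm(residual))
```

Each entry of `norms` is at most two thirds of the previous one, so after ten steps the partial sum approximates f to within (2/3)^10 times the sup norm of f.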

We are now in position to prove the following

Theorem 5.1 (Tietze Extension Theorem). Let $\Omega$ be a normal topological space, let $T \subset \Omega$ be a closed subset. Let $f : T \to [0, 1]$ be a continuous function. (Here $T$ is equipped with the induced topology.) Then there exists a continuous function $g : \Omega \to [0, 1]$ such that $g\big|_T = f$.

Proof. Let us introduce the Banach space setting that will make the proof clearer. We consider the Banach spaces $C_b^{\mathbb{R}}(\Omega)$ and $C_b^{\mathbb{R}}(T)$. To avoid any confusion, the norms on these Banach spaces will be denoted by $\|\cdot\|_\Omega$ and $\|\cdot\|_T$. If we define the restriction map

\[ R : C_b^{\mathbb{R}}(\Omega) \ni g \longmapsto g\big|_T \in C_b^{\mathbb{R}}(T), \]

then $R$ is obviously linear and continuous. We define the subspace $\mathcal{C} = R\bigl(C_b^{\mathbb{R}}(\Omega)\bigr) \subset C_b^{\mathbb{R}}(T)$.

Claim: For every $f \in \mathcal{C}$, there exists some $g \in C_b^{\mathbb{R}}(\Omega)$ such that $f = Rg$, and

\[ \inf_{q \in T} f(q) \le g(p) \le \sup_{q \in T} f(q), \quad \forall\, p \in \Omega. \]




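To prove this fact, we start first with some arbitrary $g_0 \in C_b^{\mathbb{R}}(\Omega)$, such that $f = Rg_0 = g_0\big|_T$. Put

\[ \alpha = \inf_{q \in T} f(q) \quad \text{and} \quad \beta = \sup_{q \in T} f(q), \]

so that $\|f\|_T = \max\{|\alpha|, |\beta|\}$. Define the function $\theta : \mathbb{R} \to [\alpha, \beta]$ by

\[ \theta(t) = \begin{cases} \alpha & \text{if } t < \alpha \\ t & \text{if } \alpha \le t \le \beta \\ \beta & \text{if } t > \beta \end{cases} \]

Then obviously $\theta$ is continuous, and the composition $g = \theta \circ g_0 : \Omega \to [\alpha, \beta]$ will still satisfy $g\big|_T = f$, and we will clearly have

\[ \alpha \le g(p) \le \beta, \quad \forall\, p \in \Omega. \]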
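The truncation used in this step is the usual clamping of a real number to an interval. A minimal sketch, with hypothetical sample values standing in for an extension g₀ evaluated at a few points of Ω:

```python
def theta(t, alpha, beta):
    """Clamp t to the interval [alpha, beta], as in the definition of theta."""
    return min(max(t, alpha), beta)

# Assumed bounds alpha = inf f and beta = sup f of the function f on T.
alpha, beta = 0.0, 1.0

# Hypothetical values of g0; values already in [alpha, beta] are left unchanged,
# which is why the composition with theta still restricts to f on T.
g0_values = [-0.3, 0.0, 0.4, 1.0, 1.7]
g_values = [theta(v, alpha, beta) for v in g0_values]
```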

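Having proven the Claim, we are going to prove that $\mathcal{C}$ is closed. We do this by showing that $\mathcal{C}$ is a Banach space, in the norm $\|\cdot\|_T$. To get this, we use Remark ??. Start with some sequence $(f_n)_{n \ge 1} \subset \mathcal{C}$, with $\sum_{n=1}^{\infty} \|f_n\|_T < \infty$. Apply the Claim, to construct a sequence $(g_n)_{n \ge 1} \subset C_b^{\mathbb{R}}(\Omega)$, such that $Rg_n = f_n$, and

\[ \inf_{q \in T} f_n(q) \le g_n(p) \le \sup_{q \in T} f_n(q), \quad \forall\, p \in \Omega, \]

for each $n \ge 1$. Notice that this forces

\[ \|g_n\|_\Omega \le \|f_n\|_T, \quad \forall\, n \ge 1. \]

Define the sequences of partial sums $(h_n)_{n \ge 1} \subset \mathcal{C}$ and $(s_n)_{n \ge 1} \subset C_b^{\mathbb{R}}(\Omega)$, by

\[ h_n = f_1 + \dots + f_n \quad \text{and} \quad s_n = g_1 + \dots + g_n, \quad \forall\, n \ge 1. \]

Since

\[ \sum_{n=1}^{\infty} \|g_n\|_\Omega \le \sum_{n=1}^{\infty} \|f_n\|_T < \infty, \]

and $C_b^{\mathbb{R}}(\Omega)$ is a Banach space, it follows that the sequence $(s_n)_{n \ge 1}$ is convergent to some point $g \in C_b^{\mathbb{R}}(\Omega)$. Since $R : C_b^{\mathbb{R}}(\Omega) \to C_b^{\mathbb{R}}(T)$ is linear and continuous, we will have

\[ Rg = \lim_{n \to \infty} [Rg_1 + \dots + Rg_n] = \lim_{n \to \infty} [f_1 + \dots + f_n] = \lim_{n \to \infty} h_n, \]

which proves that the sequence of partial sums $(h_n)_{n \ge 1} \subset \mathcal{C}$ is indeed convergent to $Rg \in \mathcal{C}$.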

Let us remark now that obviously $\mathcal{C}$ contains the constant function $1 = R1$. Using Urysohn Lemma (applied to $T$) it is clear that $\mathcal{C}$ satisfies the condition (u) in the above lemma. Using the Lemma ??, it follows that $\mathcal{C} = C_b^{\mathbb{R}}(T)$, i.e. $R$ is surjective.

To finish the proof, start with some arbitrary continuous function $f : T \to [0, 1]$. Use surjectivity of $R$, combined with the Claim, to find $g \in C_b^{\mathbb{R}}(\Omega)$, such that

\[ Rg = f, \quad \text{and} \quad \inf_{q \in T} f(q) \le g(p) \le \sup_{q \in T} f(q), \quad \forall\, p \in \Omega. \]

This clearly forces $g$ to take values in $[0, 1]$.

Next we concentrate on the case when $\Omega$ is a compact Hausdorff space. In this case, every continuous function $F : \Omega \to \mathbb{K}$ is automatically bounded, and the Banach space $C_b^{\mathbb{K}}(\Omega)$ will be denoted simply by $C^{\mathbb{K}}(\Omega)$. (When $\mathbb{K} = \mathbb{C}$ this space will be denoted simply by $C(\Omega)$.)




Theorem 5.2 (Dini). Let $K$ be a compact Hausdorff space, let $(f_n)_{n \ge 1} \subset C^{\mathbb{R}}(K)$ be a monotone sequence. Assume there is some $f \in C^{\mathbb{R}}(K)$, such that

\[ \lim_{n \to \infty} f_n(p) = f(p), \quad \forall\, p \in K. \]

Then $\lim_{n \to \infty} f_n = f$, in the norm topology.

Proof. Replacing $f_n$ with $f_n - f$, we can assume that $\lim_{n \to \infty} f_n(p) = 0$, $\forall\, p \in K$. Replacing (if necessary) $f_n$ with $-f_n$, we can also assume that the sequence $(f_n)_{n \ge 1}$ is decreasing. In particular, each $f_n$ is non-negative.

We need to prove that $\lim_{n \to \infty} \|f_n\| = 0$. Assume this is not true, so there exists some $\varepsilon > 0$, such that the set

\[ M = \{m \in \mathbb{N} : \|f_m\| \ge \varepsilon\} \]

is infinite. For each integer $n \ge 1$, let us define the set

\[ F_n = \{p \in K : f_n(p) \ge \varepsilon\}. \]

Then by the definition of $M$, we have

\[ F_m \ne \emptyset, \quad \forall\, m \in M. \]

Claim: One has the inclusion $F_n \supset F_{n+1}$, $\forall\, n \ge 1$.

Indeed, if $p \in F_{n+1}$, then

\[ \varepsilon \le f_{n+1}(p) \le f_n(p), \]

which proves that $p \in F_n$.

Using the claim, plus the fact that the set $M$ is infinite, it follows that $F_n \ne \emptyset$, $\forall\, n \ge 1$. (Indeed, if we start with some arbitrary $n$, then since $M$ is infinite, we can find $m \in M$, with $m \ge n$, and then using the Claim we have $\emptyset \ne F_m \subset F_n$.)

Since $K$ is compact, and the sets $F_1 \supset F_2 \supset \dots$ are closed and non-empty, by the finite intersection property, it follows that

\[ \bigcap_{n=1}^{\infty} F_n \ne \emptyset. \]

But this leads to a contradiction, because if we pick an element $p \in \bigcap_{n=1}^{\infty} F_n$, then we will have $f_n(p) \ge \varepsilon$, $\forall\, n \ge 1$, and then the equality $\lim_{n \to \infty} f_n(p) = 0$ is impossible.

Exercise 1. Define the sequence $(P_n)_{n \ge 1}$ of polynomials, by $P_1(t) = 0$, and

\[ P_{n+1}(t) = \tfrac{1}{2}\bigl[ t - P_n(t)^2 \bigr] + P_n(t), \quad \forall\, n \ge 1. \]

Prove that

\[ \lim_{n \to \infty} \Bigl[ \max_{t \in [0,1]} \bigl| P_n(t) - \sqrt{t} \bigr| \Bigr] = 0. \]

Hint: Define the functions $f_n, f : [0, 1] \to \mathbb{R}$ by $f_n(t) = P_n(t)$ and $f(t) = \sqrt{t}$. Prove that, for every $t \in [0, 1]$, the sequence $\bigl(f_n(t)\bigr)_{n \ge 1}$ is increasing, bounded, and $\lim_{n \to \infty} f_n(t) = f(t)$. Then apply Dini's Theorem.
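The recursion in this exercise is easy to experiment with numerically. The following sketch evaluates P_n(t) by iterating the recurrence at each point of a grid in [0, 1] and records the largest deviation from the square root; the grid size and the chosen values of n are arbitrary assumptions, and a numerical check of this kind is of course no substitute for the proof asked for in the exercise.

```python
import math

def sqrt_polynomial(t, n):
    """Evaluate P_n(t), where P_1 = 0 and P_{k+1}(t) = (t - P_k(t)**2) / 2 + P_k(t)."""
    p = 0.0
    for _ in range(n - 1):
        p = 0.5 * (t - p * p) + p
    return p

grid = [i / 200 for i in range(201)]

def max_error(n):
    """Largest value of |P_n(t) - sqrt(t)| over the grid points in [0, 1]."""
    return max(abs(sqrt_polynomial(t, n) - math.sqrt(t)) for t in grid)

errors = [max_error(n) for n in (1, 10, 100, 1000)]
```

The recorded errors decrease as n grows, consistent with uniform convergence on [0, 1], although the decay is slow (roughly like 1/n).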

Theorem 5.3 (Stone-Weierstrass). Let $K$ be a compact Hausdorff space. Let $\mathcal{A} \subset C^{\mathbb{R}}(K)$ be a unital subalgebra, i.e.

• $\mathcal{A} \ni 1$, the constant function $1$;
• $\mathcal{A}$ is a linear subspace;
• if $f, g \in \mathcal{A}$, then $fg \in \mathcal{A}$.

Assume $\mathcal{A}$ separates the points of $K$, i.e. for any $p, q \in K$, with $p \ne q$, there exists $f \in \mathcal{A}$ such that $f(p) \ne f(q)$.

Then $\mathcal{A}$ is dense in $C^{\mathbb{R}}(K)$, in the norm topology.

Proof. Let $\mathcal{C}$ denote the closure of $\mathcal{A}$. Remark that $\mathcal{C}$ is again a unital subalgebra and it still separates the points.

The proof will eventually use the Urysohn density Lemma. Before we get to that point, we need several preparations.

Step 1: If $f \in \mathcal{C}$, then $|f| \in \mathcal{C}$.

We may assume $f \ne 0$. To prove this fact, we define $g = f^2 \in \mathcal{C}$, and we set $h = \|g\|^{-1} g$, so that $h \in \mathcal{C}$, and $h(p) \in [0, 1]$, for all $p \in K$. Let $P_n(t)$, $n \ge 1$ be the polynomials defined in the above exercise. The functions $h_n = P_n \circ h$, $n \ge 1$ are clearly all in $\mathcal{C}$. By the above Exercise, we clearly get

\[ \lim_{n \to \infty} \Bigl[ \max_{p \in K} \bigl| h_n(p) - \sqrt{h(p)} \bigr| \Bigr] = 0, \]

which means that $\lim_{n \to \infty} h_n = \sqrt{h}$, in the norm topology. In particular, $\sqrt{h}$ belongs to $\mathcal{C}$. Obviously we have

\[ \sqrt{h} = \|f\|^{-1} \cdot |f|, \]

so $|f|$ indeed belongs to $\mathcal{C}$.

Step 2: Given two functions $f, g \in \mathcal{C}$, the continuous functions $\max\{f, g\}$ and $\min\{f, g\}$ both belong to $\mathcal{C}$.

This follows immediately from Step 1, and the equalities

\[ \max\{f, g\} = \tfrac{1}{2}\bigl[ f + g + |f - g| \bigr] \quad \text{and} \quad \min\{f, g\} = \tfrac{1}{2}\bigl[ f + g - |f - g| \bigr]. \]
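The two identities used in Step 2 hold pointwise, so they can be checked on sample values. A minimal sketch, where functions are represented by lists of values at a few hypothetical points:

```python
def pointwise_max(f, g):
    """max{f, g} computed as (f + g + |f - g|) / 2 at each point."""
    return [0.5 * (a + b + abs(a - b)) for a, b in zip(f, g)]

def pointwise_min(f, g):
    """min{f, g} computed as (f + g - |f - g|) / 2 at each point."""
    return [0.5 * (a + b - abs(a - b)) for a, b in zip(f, g)]

# Hypothetical sample values of two functions at three points.
f_values = [1.0, -2.0, 0.5]
g_values = [0.0, 3.0, 0.5]

max_values = pointwise_max(f_values, g_values)
min_values = pointwise_min(f_values, g_values)
```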

Step 3: For any two points $p, q \in K$, $p \ne q$, there exists $h \in \mathcal{C}$, such that $h(p) = 0$, $h(q) = 1$, and $h(s) \in [0, 1]$, $\forall\, s \in K$.

Use the assumption on $\mathcal{A}$, to find first a function $f \in \mathcal{A}$, such that $f(p) \ne f(q)$. Put $\alpha = f(p)$ and $\beta = f(q)$, and define

\[ g = \frac{1}{\beta - \alpha} \bigl[ f - \alpha 1 \bigr]. \]

The function $g$ still belongs to $\mathcal{A}$, but now we have $g(p) = 0$ and $g(q) = 1$. Define the function $h = \min\{g^2, 1\}$. By Step 2, $h \in \mathcal{C}$, and it clearly satisfies the required properties.

Step 4: Given a closed subset $A \subset K$, and a point $p \in K \smallsetminus A$, there exists a function $h \in \mathcal{C}$, such that $h(p) = 0$, $h\big|_A = 1$, and $h(q) \in [0, 1]$, $\forall\, q \in K$.

For every $q \in A$, we use Step 3 to find a function $h_q \in \mathcal{C}$, such that $h_q(p) = 0$, $h_q(q) = 1$, and $h_q(s) \in [0, 1]$, $\forall\, s \in K$, and we define the open set

\[ D_q = \{s \in K : h_q(s) > 0\}. \]

Using the compactness of $A$, we find points $q_1, \dots, q_n \in A$, such that

\[ A \subset D_{q_1} \cup \dots \cup D_{q_n}. \]

Define the function $f = h_{q_1} + \dots + h_{q_n} \in \mathcal{C}$, so that $f(p) = 0$, $f(q) > 0$, for all $q \in A$, and $f(s) \ge 0$, $\forall\, s \in K$. If we define

\[ m = \min_{q \in A} f(q), \]

then the function $g = m^{-1} f$ again belongs to $\mathcal{C}$, and it satisfies $g(p) = 0$, $g(q) \ge 1$, $\forall\, q \in A$, and $g(s) \ge 0$, $\forall\, s \in K$. Finally, the function

\[ h = \min\{g, 1\} \]

will satisfy the required properties.

Step 5: Given closed sets $A, B \subset K$ with $A \cap B = \emptyset$, there exists $h \in \mathcal{C}$, such that $h\big|_A = 1$, $h\big|_B = 0$, and $h(q) \in [0, 1]$, $\forall\, q \in K$.

Use Step 4, to find for every $p \in A$, a function $h_p \in \mathcal{C}$, such that $h_p\big|_B = 1$, $h_p(p) = 0$, and $h_p(s) \in [0, 1]$, $\forall\, s \in K$. Put $g_p = 1 - h_p$, so that $g_p(p) = 1$, $g_p\big|_B = 0$, and $g_p(s) \in [0, 1]$, $\forall\, s \in K$. We then proceed as above. For each $p \in A$ we define the open set

\[ D_p = \{s \in K : g_p(s) > 0\}. \]

Using the compactness of $A$, we find points $p_1, \dots, p_n \in A$, such that

\[ A \subset D_{p_1} \cup \dots \cup D_{p_n}. \]

Define the function $f = g_{p_1} + \dots + g_{p_n} \in \mathcal{C}$, so that $f\big|_B = 0$, $f(q) > 0$, for all $q \in A$, and $f(s) \ge 0$, $\forall\, s \in K$. If we define

\[ m = \min_{q \in A} f(q), \]

then the function $g = m^{-1} f$ again belongs to $\mathcal{C}$, and it satisfies $g\big|_B = 0$, $g(q) \ge 1$, $\forall\, q \in A$, and $g(s) \ge 0$, $\forall\, s \in K$. Finally, the function

\[ h = \min\{g, 1\} \]

will satisfy the required properties.

We now apply the Urysohn density Lemma, to conclude that $\mathcal{C}$ is dense in $C^{\mathbb{R}}(K)$. Since $\mathcal{C}$ is already closed, this forces $\mathcal{C} = C^{\mathbb{R}}(K)$, i.e. $\mathcal{A}$ is dense in $C^{\mathbb{R}}(K)$.

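Corollary 5.1 (Complex version of Stone-Weierstrass Theorem). Let $K$ be a compact Hausdorff space. Let $\mathcal{A} \subset C(K)$ be a unital subalgebra, which satisfies:

• if $f \in \mathcal{A}$, then $\bar{f} \in \mathcal{A}$.

Assume $\mathcal{A}$ separates the points of $K$. Then $\mathcal{A}$ is dense in $C(K)$, in the norm topology.

Proof. Consider the sub-algebra

\[ \mathcal{A}_{\mathbb{R}} = \{f \in \mathcal{A} : \bar{f} = f\}. \]

It is clear that

\[ \mathcal{A} = \mathcal{A}_{\mathbb{R}} + i \mathcal{A}_{\mathbb{R}}, \]

and $\mathcal{A}_{\mathbb{R}}$ is a unital sub-algebra of $C^{\mathbb{R}}(K)$, which separates the points of $K$. Using the real version, we know that $\mathcal{A}_{\mathbb{R}}$ is dense in $C^{\mathbb{R}}(K)$. Then $\mathcal{A}$ is clearly dense in $C(K)$.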

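Example 5.1. Consider the unit disk

\[ \mathbb{D} = \{\lambda \in \mathbb{C} : |\lambda| < 1\}, \]

and let $\overline{\mathbb{D}}$ denote its closure. Consider the algebra $\mathcal{A} \subset C(\overline{\mathbb{D}})$ consisting of all polynomial functions. Notice that, although $\mathcal{A}$ is unital and separates the points of $\overline{\mathbb{D}}$, it does not have the property

\[ f \in \mathcal{A} \Rightarrow \bar{f} \in \mathcal{A}. \]

In fact, one way to see that this property fails is by inspecting the closure of $\mathcal{A}$ in $C(\overline{\mathbb{D}})$. This closure is denoted by $A(\mathbb{D})$ and is called the disk algebra. The main feature of $A(\mathbb{D})$ is the following:

Exercise 2*. Prove that

\[ A(\mathbb{D}) = \bigl\{ f : \overline{\mathbb{D}} \to \mathbb{C} : f \text{ continuous, and } f\big|_{\mathbb{D}} \text{ holomorphic} \bigr\}. \]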

We now examine the topological dual of C (K ).

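Notations. Let $K$ be a compact Hausdorff space, and let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$. We define the space

\[ M^{\mathbb{K}}(K) = C^{\mathbb{K}}(K)^* = \bigl\{ \varphi : C^{\mathbb{K}}(K) \to \mathbb{K} : \varphi \ \mathbb{K}\text{-linear continuous} \bigr\}. \]

The unit ball will be denoted by $M^{\mathbb{K}}(K)_1$. When $\mathbb{K} = \mathbb{C}$, the superscript $\mathbb{C}$ will be omitted from the notation.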

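Remarks 5.1. Let $K$ be a compact Hausdorff space. The space $M(K) = C(K)^*$ carries a natural involution, defined as follows. For $\varphi \in M(K)$, we define the map $\varphi^\star : C(K) \to \mathbb{C}$ by

\[ \varphi^\star(f) = \overline{\varphi(\bar{f})}, \quad \forall\, f \in C(K). \]

For every $\varphi \in M(K)$, the map $\varphi^\star : C(K) \to \mathbb{C}$ is again linear, continuous, and has $\|\varphi^\star\| = \|\varphi\|$.

The map $\varphi^\star$ will be called the adjoint of $\varphi$. We used the term involution, because the map

\[ M(K) \ni \varphi \longmapsto \varphi^\star \in M(K) \]

has the following properties:

• $(\varphi^\star)^\star = \varphi$, $\forall\, \varphi \in M(K)$;
• $(\varphi + \psi)^\star = \varphi^\star + \psi^\star$, $\forall\, \varphi, \psi \in M(K)$;
• $(\lambda\varphi)^\star = \bar{\lambda}\varphi^\star$, $\forall\, \varphi \in M(K)$, $\lambda \in \mathbb{C}$.

If we define the space of self-adjoint maps

\[ M_{sa}(K) = \{\varphi \in M(K) : \varphi^\star = \varphi\}, \]

then it is clear that, for any $\varphi \in M_{sa}(K)$, the restriction $\varphi\big|_{C^{\mathbb{R}}(K)}$ is real-valued. In fact, for $\varphi \in M(K)$, one has

\[ \varphi^\star = \varphi \iff \varphi\big|_{C^{\mathbb{R}}(K)} \text{ is real-valued.} \]

Moreover, one has a map

\[ (1) \qquad M_{sa}(K) \ni \varphi \longmapsto \varphi\big|_{C^{\mathbb{R}}(K)} \in M^{\mathbb{R}}(K), \]

which is an isomorphism of $\mathbb{R}$-vector spaces. The inverse of this map is defined as follows. Start with some $\varphi \in M^{\mathbb{R}}(K)$, i.e. $\varphi : C^{\mathbb{R}}(K) \to \mathbb{R}$ is $\mathbb{R}$-linear and continuous, and we define $\hat{\varphi} : C(K) \to \mathbb{C}$ by

\[ \hat{\varphi}(f) = \varphi(\operatorname{Re} f) + i\varphi(\operatorname{Im} f), \quad \forall\, f \in C(K). \]

It turns out that $\hat{\varphi}$ is again linear, continuous, and self-adjoint. Moreover, the correspondence

\[ M^{\mathbb{R}}(K) \ni \varphi \longmapsto \hat{\varphi} \in M_{sa}(K) \]

is the inverse of (1).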




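Proposition 5.2. Let $K$ be a compact Hausdorff space. Then the map

\[ M_{sa}(K) \ni \varphi \longmapsto \varphi\big|_{C^{\mathbb{R}}(K)} \in M^{\mathbb{R}}(K) \]

is isometric. Moreover, when the two spaces are equipped with the $w^*$ topology, this map is a homeomorphism.

Proof. To prove the first statement, fix $\varphi \in M_{sa}(K)$. It is obvious that $\bigl\|\varphi\big|_{C^{\mathbb{R}}(K)}\bigr\| \le \|\varphi\|$. To prove the other inequality, fix for the moment $\varepsilon > 0$, and choose $f \in C(K)$ such that $\|f\| \le 1$, and

\[ |\varphi(f)| \ge \|\varphi\| - \varepsilon. \]

Choose a complex number $\lambda$ with $|\lambda| = 1$, such that

\[ |\varphi(f)| = \lambda\varphi(f) = \varphi(\lambda f). \]

If we write $\lambda f = g + ih$, with $g, h \in C^{\mathbb{R}}(K)$, then using the fact that $\varphi$ is self-adjoint (so that $\varphi(g)$ and $\varphi(h)$ are real), we will have

\[ |\varphi(f)| = \varphi(g). \]

Since $\|g\| \le \|\lambda f\| = \|f\| \le 1$, we will get

\[ |\varphi(f)| \le \bigl\|\varphi\big|_{C^{\mathbb{R}}(K)}\bigr\|, \]

so our choice of $f$ will give

\[ \|\varphi\| - \varepsilon \le \bigl\|\varphi\big|_{C^{\mathbb{R}}(K)}\bigr\|. \]

Since this holds for all $\varepsilon > 0$, we get

\[ \|\varphi\| \le \bigl\|\varphi\big|_{C^{\mathbb{R}}(K)}\bigr\|. \]

The $w^*$ continuity (both ways) is obvious.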

Convention. From now on, we will identify the space MR(K ) with Msa(K ).

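Proposition 5.3. Let $K$ be a compact Hausdorff space. For every $p \in K$, let $\gamma_p : C(K) \to \mathbb{C}$ be the map

\[ \gamma_p : C(K) \ni f \longmapsto f(p) \in \mathbb{C}. \]

(i) For every $p \in K$, the maps $\gamma_p$ and $\gamma_p^{\mathbb{R}} = \gamma_p\big|_{C^{\mathbb{R}}(K)} : C^{\mathbb{R}}(K) \to \mathbb{R}$ are linear and continuous.
(ii) For every $p \in K$, one has $\|\gamma_p\| = \|\gamma_p^{\mathbb{R}}\| = 1$.
(iii) The maps

\[ \Gamma_K : K \ni p \longmapsto \gamma_p \in M(K)_1 \]

\[ \Gamma_K^{\mathbb{R}} : K \ni p \longmapsto \gamma_p^{\mathbb{R}} \in M^{\mathbb{R}}(K)_1 \]

are injective and continuous, when the target spaces $M(K)_1$ and $M^{\mathbb{R}}(K)_1$ are equipped with the $w^*$ topology.

Proof. (i)-(ii). The fact that $\gamma_p$ is $\mathbb{C}$-linear is obvious. This will also give the $\mathbb{R}$-linearity of $\gamma_p^{\mathbb{R}}$. The continuity follows from the obvious inequality

\[ |\gamma_p(f)| = |f(p)| \le \max_{q \in K} |f(q)| = \|f\|, \quad \forall\, f \in C(K). \]

Among other things, the above inequality also proves

\[ \|\gamma_p\| \le 1 \quad \text{and} \quad \|\gamma_p^{\mathbb{R}}\| \le 1. \]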



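The fact that we have in fact equalities follows from $\gamma_p(1) = 1$.

(iii) Let us first prove the injectivity. Assume we have two points $p, q \in K$, with $p \ne q$. Use Urysohn Lemma to find $f : K \to [0, 1]$ continuous, such that $f(p) = 0$ and $f(q) = 1$. Then $f \in C^{\mathbb{R}}(K)$ and $\gamma_p^{\mathbb{R}}(f) = f(p) = 0$, and $\gamma_q^{\mathbb{R}}(f) = f(q) = 1$, so we indeed have $\gamma_p^{\mathbb{R}} \ne \gamma_q^{\mathbb{R}}$. (This will also imply $\gamma_p \ne \gamma_q$.)

To prove the continuity of the maps $\Gamma_K : K \to M(K)_1$ and $\Gamma_K^{\mathbb{R}} : K \to M^{\mathbb{R}}(K)_1$, we need to prove the continuity of the maps $\epsilon_f \circ \Gamma_K : K \to \mathbb{C}$, $f \in C(K)$, and of the maps $\epsilon_f \circ \Gamma_K^{\mathbb{R}} : K \to \mathbb{R}$, $f \in C^{\mathbb{R}}(K)$. (Recall that $\epsilon_f(\varphi) = \varphi(f)$, $\forall\, \varphi \in C^{\mathbb{K}}(K)^*$.) Notice however that we have in fact equalities

\[ \epsilon_f \circ \Gamma_K = f, \quad \forall\, f \in C(K), \]

\[ \epsilon_f \circ \Gamma_K^{\mathbb{R}} = f, \quad \forall\, f \in C^{\mathbb{R}}(K), \]

so the desired continuity is automatic.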

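Corollary 5.2. With the above notations, the spaces

\[ \Gamma(K) = \{\gamma_p : p \in K\} \subset M(K)_1 \quad \text{and} \quad \Gamma^{\mathbb{R}}(K) = \{\gamma_p^{\mathbb{R}} : p \in K\} \subset M^{\mathbb{R}}(K)_1 \]

are $w^*$ compact, and the maps

\[ \Gamma_K : K \to \Gamma(K) \quad \text{and} \quad \Gamma_K^{\mathbb{R}} : K \to \Gamma^{\mathbb{R}}(K) \]

are homeomorphisms.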

Here is an interesting application of the above result to topology.

Theorem 5.4 (Urysohn Metrizability Theorem). Let $K$ be a compact Hausdorff space. The following are equivalent:

(i) $K$ is metrizable;
(ii) $K$ is second countable, i.e. the topology has a countable base;
(iiiR) the Banach space $C^{\mathbb{R}}(K)$ is separable;
(iiiC) the Banach space $C(K)$ is separable.

Proof. (i) ⇒ (ii). We already know this fact. (See the section on metric spaces.)

(ii) ⇒ (iiiR). Assume $K$ is second countable. Fix a countable base $\{D_n : n \in \mathbb{N}\}$ for the topology. Consider the countable set

\[ \Delta = \{(m, n) \in \mathbb{N}^2 : \overline{D_m} \cap \overline{D_n} = \emptyset\}. \]

Claim: For any two points $p, q \in K$, with $p \ne q$, there exists a pair $(m, n) \in \Delta$ with $p \in D_m$ and $q \in D_n$.

Indeed, since $K$ is Hausdorff, there exist open sets $U_0, V_0 \subset K$ with $p \in U_0$, $q \in V_0$, and $U_0 \cap V_0 = \emptyset$. Since $K$ is (locally) compact, there exist open sets $U, V \subset K$, such that $p \in U \subset \overline{U} \subset U_0$ and $q \in V \subset \overline{V} \subset V_0$. Finally, since $\{D_n : n \in \mathbb{N}\}$ is a basis for the topology, there exist $m, n \in \mathbb{N}$ such that $p \in D_m \subset U$ and $q \in D_n \subset V$. Then clearly we have $\overline{D_m} \subset \overline{U} \subset U_0$, and $\overline{D_n} \subset \overline{V} \subset V_0$, which forces $\overline{D_m} \cap \overline{D_n} = \emptyset$.

Having proven the Claim, for every pair $(m, n) \in \Delta$ we choose (use Urysohn Lemma) a continuous function $h_{mn} : K \to [0, 1]$ such that $h_{mn}\big|_{\overline{D_m}} = 0$ and $h_{mn}\big|_{\overline{D_n}} = 1$, and we define the countable family

\[ \mathcal{F} = \{h_{mn} : (m, n) \in \Delta\}. \]

Using the Claim, we know that $\mathcal{F}$ separates the points of $K$. We set

\[ \mathcal{P} = \{h \in C^{\mathbb{R}}(K) : h \text{ is a finite product of functions in } \mathcal{F}\}. \]



Notice that P is still countable, it also separates the points of K , but also has theproperty:

f, g∈P

⇒f g

∈P.

If we define A = Span(1 ∪ P),

then A ⊂ C R(K ) satisfies the hypothesis of the Stone-Weierstrass Theorem, hence A is dense in C R(K ). Notice that if we define

AQ = SpanQ(1 ∪ P),

i.e. the set of linear combinations of elements in 1 ∪ P with rational coefficients,then clearly AQ is dense in A, and so AQ is dense in C R(K ). But now we are done,since AQ is obviously countable.

(iiiR) ⇒ (iiiC). Assume $C^{\mathbb{R}}(K)$ is separable. Let $S \subset C^{\mathbb{R}}(K)$ be a countable dense set. Then the set

$S + iS = \{f + ig : f, g \in S\}$

is clearly countable, and dense in $C(K)$.

(iiiC) ⇒ (i). Assume $C(K)$ is separable. By the results from the previous section, it follows that, when equipped with the w∗ topology, the compact space $\mathcal{M}(K)_1$ is metrizable. Then the compact subset $\Gamma(K) \subset \mathcal{M}(K)_1$ is also metrizable. Since K is homeomorphic to $\Gamma(K)$, it follows that K itself is metrizable.

Definition. Let K be a compact Hausdorff space, and let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$. A $\mathbb{K}$-linear map $\phi : C^{\mathbb{K}}(K) \to \mathbb{K}$ is said to be positive, if it has the property

$f \in C^{\mathbb{R}}(K),\ f \geq 0 \Longrightarrow \phi(f) \geq 0.$

Proposition 5.4 (Automatic continuity for positive linear maps). Let K be a compact Hausdorff space, and let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$. Any positive $\mathbb{K}$-linear map $\phi : C^{\mathbb{K}}(K) \to \mathbb{K}$ is continuous. Moreover, one has the equality $\|\phi\| = \phi(1)$.

Proof. In the case when $\mathbb{K} = \mathbb{C}$, it suffices to prove that $\phi\big|_{C^{\mathbb{R}}(K)}$ is continuous. Therefore, it suffices to prove the statement for $\mathbb{K} = \mathbb{R}$. Start with some arbitrary $f \in C^{\mathbb{R}}(K)$, and define the functions $f^{\pm} \in C^{\mathbb{R}}(K)$ by

$f^{+} = \max\{f, 0\}$ and $f^{-} = \max\{-f, 0\},$

so that $f^{\pm} \geq 0$, $f = f^{+} - f^{-}$, and $\|f\| = \max\{\|f^{+}\|, \|f^{-}\|\}$. On the one hand, by positivity, we have the inequalities $\phi(f^{\pm}) \geq 0$, so we get

$-\phi(f^{-}) \leq \phi(f^{+}) - \phi(f^{-}) \leq \phi(f^{+}),$

which give

(2) $|\phi(f)| = |\phi(f^{+}) - \phi(f^{-})| \leq \max\{\phi(f^{+}), \phi(f^{-})\}.$

On the other hand, we have

$\|f^{\pm}\| \cdot 1 - f^{\pm} \geq 0,$

so by positivity we get

$\|f^{\pm}\| \cdot \phi(1) \geq \phi(f^{\pm}).$

Using this in (2) gives

$|\phi(f)| \leq \phi(1) \cdot \max\{\|f^{+}\|, \|f^{-}\|\} = \phi(1) \cdot \|f\|.$




Since this holds for all $f \in C^{\mathbb{R}}(K)$, the continuity of $\phi$ follows, together with the estimate

$\|\phi\| \leq \phi(1).$

Since $\phi(1) \leq \|\phi\| \cdot \|1\| = \|\phi\|$, the desired norm equality follows.
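As an illustration of the estimate $|\phi(f)| \leq \phi(1) \cdot \|f\|$, here is a minimal numerical sketch. It assumes the simplest possible setting, which is not taken from the text: a finite space $K = \{0, \dots, n-1\}$, so that $C^{\mathbb{R}}(K)$ is $\mathbb{R}^n$ with the sup norm, and a positive functional given by hypothetical non-negative weights.

```python
import random

def phi(weights, f):
    """Positive functional on C_R(K), K = {0, ..., n-1}: phi(f) = sum_i w_i f(i)."""
    return sum(w * x for w, x in zip(weights, f))

def sup_norm(f):
    """Sup norm of a function on the finite set K."""
    return max(abs(x) for x in f)

random.seed(0)
n = 5
weights = [random.random() for _ in range(n)]  # non-negative weights => phi is positive
one = [1.0] * n                                # the constant function 1

# Check |phi(f)| <= phi(1) * ||f|| on random real-valued functions f.
for _ in range(1000):
    f = [random.uniform(-3, 3) for _ in range(n)]
    assert abs(phi(weights, f)) <= phi(weights, one) * sup_norm(f) + 1e-12

# The bound is attained at f = 1, which is why ||phi|| = phi(1).
norm_attained = abs(phi(weights, one)) / sup_norm(one)
```

In this finite model the norm of the functional is the sum of the weights, which is exactly its value on the constant function 1.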

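Notations. Let K be a compact Hausdorff space. We define

$\mathcal{M}^{\mathbb{K}}_{+}(K) = \{\phi : C^{\mathbb{K}}(K) \to \mathbb{K} : \phi\ \mathbb{K}\text{-linear, positive}\};$

$\mathcal{M}^{\mathbb{K}}_{+}(K)_1 = \{\phi \in \mathcal{M}^{\mathbb{K}}_{+}(K) : \|\phi\| \leq 1\} = \mathcal{M}^{\mathbb{K}}_{+}(K) \cap \mathcal{M}^{\mathbb{K}}(K)_1.$

When $\mathbb{K} = \mathbb{C}$, the superscript $\mathbb{C}$ will be omitted.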

Remarks 5.2. Let K be a compact Hausdorff space. We have the inclusion $\mathcal{M}_{+}(K) \subset \mathcal{M}_{sa}(K)$. Indeed, if we start with $\phi \in \mathcal{M}_{+}(K)$, then using the fact that every real-valued continuous function $f \in C(K)$ is a difference of non-negative continuous functions, $f = f^{+} - f^{-}$, it follows that $\phi(f) = \phi(f^{+}) - \phi(f^{-})$ is a difference of two non-negative (hence real) numbers, so $\phi(f) \in \mathbb{R}$. This implies that $\phi$ is self-adjoint.

The set $\mathcal{M}^{\mathbb{R}}_{+}(K)$ is w∗-closed in $\mathcal{M}^{\mathbb{R}}(K)$, and the set $\mathcal{M}_{+}(K)$ is w∗-closed in $\mathcal{M}(K)$. This follows from the fact that, for each $f \in C^{\mathbb{R}}(K)$, the set

$A^{\mathbb{K}}_f = \{\phi \in \mathcal{M}^{\mathbb{K}}(K) : \phi(f) \geq 0\}$

is w∗-closed, being the preimage of the closed set $[0, \infty)$ under the w∗-continuous map $\phi \longmapsto \phi(f)$. Then everything is a consequence of the equality

$\mathcal{M}^{\mathbb{K}}_{+}(K) = \bigcap_{f \in C^{\mathbb{R}}(K),\ f \geq 0} A^{\mathbb{K}}_f.$

In particular, the sets $\mathcal{M}^{\mathbb{R}}_{+}(K)_1$ and $\mathcal{M}_{+}(K)_1$ are w∗-compact.

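The sets $\mathcal{M}^{\mathbb{R}}_{+}(K)_1$ and $\mathcal{M}_{+}(K)_1$ are convex.

Using the identification $\mathcal{M}^{\mathbb{R}}(K) \simeq \mathcal{M}_{sa}(K)$, we have the following hierarchies:

$$\begin{array}{ccc} \mathcal{M}^{\mathbb{R}}_{+}(K) & \simeq & \mathcal{M}_{+}(K) \\ \cap & & \cap \\ \mathcal{M}^{\mathbb{R}}(K) & \simeq & \mathcal{M}_{sa}(K) \\ & & \cap \\ & & \mathcal{M}(K) \end{array} \qquad \begin{array}{ccc} \mathcal{M}^{\mathbb{R}}_{+}(K)_1 & \simeq & \mathcal{M}_{+}(K)_1 \\ \cap & & \cap \\ \mathcal{M}^{\mathbb{R}}(K)_1 & \simeq & \mathcal{M}_{sa}(K)_1 \\ & & \cap \\ & & \mathcal{M}(K)_1 \end{array}$$

with each identification $\simeq$ isometric and a w∗-homeomorphism.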

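Proposition 5.5. Let K be a compact Hausdorff space. Then one has the equality

$\mathcal{M}_{sa}(K)_1 = \mathrm{conv}\big(\mathcal{M}_{+}(K)_1 \cup -\mathcal{M}_{+}(K)_1\big).$

(Here conv denotes the convex cover.)

Proof. Denote the set $\mathrm{conv}\big(\mathcal{M}_{+}(K)_1 \cup -\mathcal{M}_{+}(K)_1\big)$ simply by $\mathcal{C}$.

Claim: One has the equality:

(3) $\mathcal{C} = \{t\phi - (1 - t)\psi : \phi, \psi \in \mathcal{M}_{+}(K)_1,\ t \in [0, 1]\}.$

In particular, the set $\mathcal{C}$ is w∗-compact.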




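Denote the set on the right hand side of (3) simply by $\mathcal{D}$. The inclusion $\mathcal{C} \supset \mathcal{D}$ is clear. To prove the inclusion $\mathcal{C} \subset \mathcal{D}$, we only need to prove that $\mathcal{D}$ is convex and that it contains $\mathcal{M}_{+}(K)_1 \cup -\mathcal{M}_{+}(K)_1$. The second property is clear. The convexity of $\mathcal{D}$ is also clear, being a consequence of the convexity of $\pm\mathcal{M}_{+}(K)_1$.

The w∗-compactness of $\mathcal{C}$ is then a consequence of the compactness of the product space

$\mathcal{M}_{+}(K)_1 \times \mathcal{M}_{+}(K)_1 \times [0, 1],$

and of the fact that $\mathcal{C}$ is the range of the continuous map

$\mathcal{M}_{+}(K)_1 \times \mathcal{M}_{+}(K)_1 \times [0, 1] \ni (\phi, \psi, t) \longmapsto t\phi - (1 - t)\psi \in \mathcal{M}_{sa}(K).$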

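Having proven the Claim, we now proceed with the equality

$\mathcal{M}_{sa}(K)_1 = \mathcal{C}.$

The inclusion ⊃ is clear, since $\mathcal{M}_{sa}(K)_1$ is convex, and it contains both $\mathcal{M}_{+}(K)_1$ and $-\mathcal{M}_{+}(K)_1$.

We prove the other inclusion by contradiction. Assume there is some $\phi \in \mathcal{M}_{sa}(K)_1 \smallsetminus \mathcal{C}$. Apply Corollary II.4.2 to find some $f \in C(K)$ and a real number $\alpha$, such that

$\mathrm{Re}\, \phi(f) < \alpha \leq \mathrm{Re}\, \sigma(f),\ \forall\, \sigma \in \mathcal{C}.$

If we take $g = \mathrm{Re}\, f$, then this gives

$\phi(g) < \alpha \leq \sigma(g),\ \forall\, \sigma \in \mathcal{C}.$

Notice that $0 \in \mathcal{C}$, so we get $\alpha \leq 0$. If we define $\beta = -\alpha\ (\geq 0)$, and $h = -g$, the above inequality gives

$\phi(h) > \beta \geq \sigma(h),\ \forall\, \sigma \in \mathcal{C}.$

Using the obvious inclusions $\pm\Gamma(K) \subset \mathcal{C}$, we get

$\beta \geq \pm\gamma_p(h) = \pm h(p),\ \forall\, p \in K.$

Since h is real-valued, this will force $\|h\| \leq \beta$. But then we get a contradiction, because we also have

$\beta < \phi(h) \leq \|\phi\| \cdot \|h\| \leq \|h\|.$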

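Corollary 5.3. Let K be a compact Hausdorff space, and let $\phi \in \mathcal{M}_{sa}(K)$. Then there exist $\phi_1, \phi_2 \in \mathcal{M}_{+}(K)$, such that $\phi = \phi_1 - \phi_2$, and $\|\phi\| = \|\phi_1\| + \|\phi_2\|$.

Proof. If $\phi \in \mathcal{M}_{+}(K) \cup -\mathcal{M}_{+}(K)$, there is nothing to prove. Assume $\phi \notin \mathcal{M}_{+}(K) \cup -\mathcal{M}_{+}(K)$, in particular $\phi \neq 0$. We define $\psi = \phi / \|\phi\|$, so that $\psi \in \mathcal{M}_{sa}(K)_1$.

Find $\psi_1, \psi_2 \in \mathcal{M}_{+}(K)_1$ and $t \in [0, 1]$, such that

$\psi = t\psi_1 - (1 - t)\psi_2.$

Since $\psi \notin \mathcal{M}_{+}(K) \cup -\mathcal{M}_{+}(K)$, it follows that $0 < t < 1$. Notice that

$1 = \|\psi\| = \|t\psi_1 - (1 - t)\psi_2\| \leq t\|\psi_1\| + (1 - t)\|\psi_2\|.$

If $\|\psi_1\| < 1$, or $\|\psi_2\| < 1$, then this would imply $t\|\psi_1\| + (1 - t)\|\psi_2\| < 1$, which is impossible by the above estimate. This argument proves that we must have $\|\psi_1\| = \|\psi_2\| = 1$. If we define

$\phi_1 = t\|\phi\|\psi_1$ and $\phi_2 = (1 - t)\|\phi\|\psi_2,$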




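then $\|\phi_1\| = t\|\phi\|$ and $\|\phi_2\| = (1 - t)\|\phi\|$, so we indeed have $\|\phi_1\| + \|\phi_2\| = \|\phi\|$. Obviously $\phi_1$ and $\phi_2$ are positive, and

$\phi_1 - \phi_2 = \|\phi\| \cdot \big[t\psi_1 - (1 - t)\psi_2\big] = \|\phi\| \cdot \psi = \phi.$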
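The decomposition can be made concrete in a toy model. The sketch below is an assumption-laden illustration, not the construction used in the proof: it takes a finite space $K = \{0, \dots, n-1\}$, represents a real functional by hypothetical signed weights, and splits it into the functionals given by the positive and negative parts of the weights.

```python
def functional(weights):
    """phi(f) = sum_i w_i f(i) on C_R(K), K = {0, ..., n-1}."""
    return lambda f: sum(w * x for w, x in zip(weights, f))

def op_norm(weights):
    """Norm of phi for the sup norm on C_R(K); it is attained at f = sign(w)."""
    return sum(abs(w) for w in weights)

w = [1.5, -2.0, 0.0, 0.25, -0.75]     # hypothetical signed weights
w_plus = [max(x, 0.0) for x in w]      # weights of phi_1 (positive functional)
w_minus = [max(-x, 0.0) for x in w]    # weights of phi_2 (positive functional)

phi, phi1, phi2 = functional(w), functional(w_plus), functional(w_minus)
f = [0.3, -1.0, 2.0, 0.5, 1.0]         # an arbitrary test function

difference_gap = abs(phi(f) - (phi1(f) - phi2(f)))
norm_gap = abs(op_norm(w) - (op_norm(w_plus) + op_norm(w_minus)))
```

Both gaps vanish up to rounding, matching $\phi = \phi_1 - \phi_2$ and $\|\phi\| = \|\phi_1\| + \|\phi_2\|$ in this finite setting.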

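Proposition 5.6. Let K be a compact Hausdorff space. The set

$\mathrm{conv}\big(\Gamma(K) \cup \{0\}\big)$

is w∗-dense in $\mathcal{M}_{+}(K)_1$.

Proof. Let $\mathcal{C}$ be the w∗-closure of $\mathrm{conv}\big(\Gamma(K) \cup \{0\}\big)$. It is obvious that $\mathcal{C} \subset \mathcal{M}_{+}(K)_1$, so we only need to prove the inclusion $\mathcal{M}_{+}(K)_1 \subset \mathcal{C}$. We do this by contradiction. Assume there exists some $\phi \in \mathcal{M}_{+}(K)_1 \smallsetminus \mathcal{C}$. Since $\mathcal{C}$ is w∗-closed and convex, there exists some $f \in C(K)$ and a real number $\alpha$, such that

$\mathrm{Re}\, \phi(f) < \alpha \leq \mathrm{Re}\, \sigma(f),\ \forall\, \sigma \in \mathcal{C}.$

In particular, if we take $h = -\mathrm{Re}\, f$, and $\beta = -\alpha$, we get

(4) $\phi(h) > \beta \geq \sigma(h),\ \forall\, \sigma \in \mathcal{C}.$

Since $0 \in \mathcal{C}$, we have $\beta \geq 0$. Since $\Gamma(K) \subset \mathcal{C}$, we also get

$\beta \geq \gamma_p(h) = h(p),\ \forall\, p \in K,$

which means that $\beta 1 - h \geq 0$. Since $\phi$ is positive, this will force $\phi(\beta 1 - h) \geq 0$, which gives

$\phi(h) \leq \phi(\beta 1) = \beta\phi(1) = \beta\|\phi\|.$

Finally, since $\|\phi\| \leq 1$, this gives

$\phi(h) \leq \beta,$

thus contradicting (4).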

The results for the Banach spaces of the form $C(K)$, with K a compact Hausdorff space, can be generalized, with suitable modifications, to the situation when K is replaced with a locally compact space. The following result in fact reduces the analysis to the compact case.

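Theorem 5.5. Let Ω be a locally compact space, and let $\Omega^{\beta}$ be the Stone-Čech compactification of Ω. Then the restriction map

$R : C^{\mathbb{K}}(\Omega^{\beta}) \ni f \longmapsto f\big|_{\Omega} \in C^{\mathbb{K}}_b(\Omega)$

is an isometric linear isomorphism.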


Proof. The linearity is obvious. We show that R is bijective, by exhibiting an inverse for it. For every $h \in C^{\mathbb{K}}_b(\Omega)$, we consider the compact set

$K_h = \{z \in \mathbb{K} : |z| \leq \|h\|\},$

so that we can regard h as a continuous map $\Omega \to K_h$. We know from the functoriality of the Stone-Čech compactification that there exists a unique continuous map $h^{\beta} : \Omega^{\beta} \to K_h^{\beta}$, with $h^{\beta}\big|_{\Omega} = h$. Since $K_h$ is compact, we have $K_h^{\beta} = K_h$. In particular, this gives the inequality

(5) $|h^{\beta}(x)| \leq \|h\|,\ \forall\, x \in \Omega^{\beta}.$




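Define the map $T : C^{\mathbb{K}}_b(\Omega) \ni h \longmapsto h^{\beta} \in C^{\mathbb{K}}(\Omega^{\beta})$, and let us show that T is an inverse for R. The equality $R \circ T = \mathrm{Id}$ is trivial, by construction. To prove the equality $T \circ R = \mathrm{Id}$, we start with some $f \in C^{\mathbb{K}}(\Omega^{\beta})$, and we consider $h = Rf$. Then $Th = h^{\beta}$, and since $h^{\beta}\big|_{\Omega} = h = f\big|_{\Omega}$, the density of Ω in $\Omega^{\beta}$ clearly forces

$f = h^{\beta} = Th = T(Rf).$

The fact that R is isometric is now clear, because on the one hand we clearly have $\|Rf\| \leq \|f\|$, $\forall\, f \in C^{\mathbb{K}}(\Omega^{\beta})$, and on the other hand, by (5), we also have $\|Th\| \leq \|h\|$, $\forall\, h \in C^{\mathbb{K}}_b(\Omega)$.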

If Ω is a locally compact space, the above result suggests that the space $C^{\mathbb{K}}_b(\Omega)$ is quite “large.” It is then natural to look at smaller spaces.

Definitions. Let Ω be a locally compact space. If $\mathbb{K}$ is one of the fields $\mathbb{R}$ or $\mathbb{C}$, and $f : \Omega \to \mathbb{K}$ is a continuous function, we define the support of f by

$\mathrm{supp}\, f = \overline{\{\omega \in \Omega : f(\omega) \neq 0\}}.$

We define the space

$C^{\mathbb{K}}_c(\Omega) = \{f : \Omega \to \mathbb{K} : f \text{ continuous, with compact support}\}.$

When $\mathbb{K} = \mathbb{C}$, this space will be denoted simply by $C_c(\Omega)$. Remark that, when equipped with pointwise addition and multiplication, the space $C^{\mathbb{K}}_c(\Omega)$ becomes a $\mathbb{K}$-algebra. One has obviously the inclusion $C^{\mathbb{K}}_c(\Omega) \subset C^{\mathbb{K}}_b(\Omega)$.

We define $C^{\mathbb{K}}_0(\Omega) = \overline{C^{\mathbb{K}}_c(\Omega)}$, the closure of $C^{\mathbb{K}}_c(\Omega)$ in $C^{\mathbb{K}}_b(\Omega)$. (When $\mathbb{K} = \mathbb{C}$, we will denote this space simply by $C_0(\Omega)$.) The Banach space $C^{\mathbb{K}}_0(\Omega)$ can be regarded as the completion of $C^{\mathbb{K}}_c(\Omega)$. Of course, when Ω is compact, we have the equality $C^{\mathbb{K}}_0(\Omega) = C^{\mathbb{K}}(\Omega)$.

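The following result characterizes the Banach space $C^{\mathbb{K}}_0(\Omega)$.

Proposition 5.7. Let Ω be a locally compact space. For a function $f \in C^{\mathbb{K}}_b(\Omega)$, the following are equivalent:

(i) $f \in C^{\mathbb{K}}_0(\Omega)$;
(ii) for every $\varepsilon > 0$, there exists some compact subset $K_{\varepsilon} \subset \Omega$, such that

$\sup_{\omega \in \Omega \smallsetminus K_{\varepsilon}} |f(\omega)| \leq \varepsilon.$

Proof. (i) ⇒ (ii). Suppose $f \in C^{\mathbb{K}}_0(\Omega)$, which means that there exists some sequence $(f_n)_{n=1}^{\infty} \subset C^{\mathbb{K}}_c(\Omega)$, such that $\lim_{n \to \infty} f_n = f$, in the norm topology in $C^{\mathbb{K}}_b(\Omega)$. Fix some $\varepsilon > 0$, and choose $k \geq 1$, such that $\|f - f_k\| \leq \varepsilon$. If we define $K_{\varepsilon} = \mathrm{supp}\, f_k$, then, for every $\omega \in \Omega \smallsetminus K_{\varepsilon}$, we have $f_k(\omega) = 0$, so the inequality $\|f - f_k\| \leq \varepsilon$ forces $|f(\omega)| \leq \varepsilon$.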

(ii) ⇒ (i). Suppose f satisfies property (ii). Fix for the moment an integer $n \geq 1$. Use condition (ii) to find a compact subset $K_n \subset \Omega$, such that

$|f(\omega)| \leq \frac{1}{n},\ \forall\, \omega \in \Omega \smallsetminus K_n.$

Use Urysohn Lemma to choose some continuous function $h_n : \Omega \to [0, 1]$, with compact support, such that $h_n\big|_{K_n} = 1$. Define the function $f_n = h_n f$, so that $f_n \in C^{\mathbb{K}}_c(\Omega)$. If $\omega \in \Omega \smallsetminus K_n$, then, using the inequality $0 \leq h_n \leq 1$, and the choice of $K_n$, we have

$|f(\omega) - f_n(\omega)| = |f(\omega)| \cdot [1 - h_n(\omega)] \leq |f(\omega)| \leq \frac{1}{n}.$




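Using the fact that $f_n\big|_{K_n} = f\big|_{K_n}$, the above inequality proves that $\|f - f_n\| \leq \frac{1}{n}$. This way we have constructed a sequence $(f_n)_{n=1}^{\infty} \subset C^{\mathbb{K}}_c(\Omega)$, such that $\lim_{n \to \infty} f_n = f$, in $C^{\mathbb{K}}_b(\Omega)$, so by the definition it follows that $f \in C^{\mathbb{K}}_0(\Omega)$.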
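The construction in (ii) ⇒ (i) can be traced numerically. The following is a minimal sketch under assumptions that are not in the text: Ω is the real line, $f(x) = 1/(1 + x^2)$, the compact sets are $K_n = [-\sqrt{n}, \sqrt{n}]$, and the Urysohn function $h_n$ is replaced by an explicit piecewise linear cut-off. The supremum is only sampled on a finite grid.

```python
import math

def f(x):
    """A function in C_0(R): f(x) = 1 / (1 + x^2)."""
    return 1.0 / (1.0 + x * x)

def cutoff(n, x):
    """Continuous h_n: R -> [0, 1], equal to 1 on K_n = [-sqrt(n), sqrt(n)]
    and vanishing outside [-sqrt(n) - 1, sqrt(n) + 1] (so h_n has compact support)."""
    r = math.sqrt(n)
    return max(0.0, min(1.0, r + 1.0 - abs(x)))

def sup_error(n, grid):
    """Maximum of |f - h_n f| over the sample points in grid."""
    return max(abs(f(x) - cutoff(n, x) * f(x)) for x in grid)

# Sample points in [-50, 50]; outside K_n we have |f| < 1/n, so each error is at most 1/n.
grid = [k / 100.0 for k in range(-5000, 5001)]
errors = {n: sup_error(n, grid) for n in (1, 2, 5, 10, 50)}
```

Each sampled error stays below $1/n$, in line with the estimate $\|f - f_n\| \leq \frac{1}{n}$ from the proof.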

The following establishes an interesting connection with the Alexandrov compactification.

Proposition 5.8. Let Ω be a locally compact space, which is non-compact, and let $\Omega^{\alpha} = \Omega \sqcup \{\infty\}$ denote the Alexandrov compactification.

(i) For every function $f \in C^{\mathbb{K}}_0(\Omega)$, the function $f^{\alpha} : \Omega^{\alpha} \to \mathbb{K}$, defined by $f^{\alpha}\big|_{\Omega} = f$, and $f^{\alpha}(\infty) = 0$, is continuous.

(ii) The correspondence $U : C^{\mathbb{K}}_0(\Omega) \ni f \longmapsto f^{\alpha} \in C^{\mathbb{K}}(\Omega^{\alpha})$ is an isometric linear map.

(iii) One has the equality

(6) $\mathrm{Ran}\, U = \{g \in C^{\mathbb{K}}(\Omega^{\alpha}) : g(\infty) = 0\}.$

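Proof. (i). We know that Ω is open in $\Omega^{\alpha}$, which immediately gives the fact that $f^{\alpha}$ is continuous at every point $\omega \in \Omega$. So all we need to show is the continuity of $f^{\alpha}$ at ∞. This amounts to showing that for every neighborhood N of $f^{\alpha}(\infty) = 0$ in $\mathbb{K}$, there exists a neighborhood V of ∞ in $\Omega^{\alpha}$, such that $f^{\alpha}(V) \subset N$. Start with a neighborhood N of 0, and choose $\varepsilon > 0$, such that the set $B_{\varepsilon} = \{z \in \mathbb{K} : |z| \leq \varepsilon\}$ is contained in N. Choose some compact set $K_{\varepsilon} \subset \Omega$, such that

$\sup_{\omega \in \Omega \smallsetminus K_{\varepsilon}} |f(\omega)| \leq \varepsilon.$

Define the set $D = (\Omega \smallsetminus K_{\varepsilon}) \cup \{\infty\}$. By the definition of the topology on $\Omega^{\alpha}$, the set D is an open neighborhood of ∞. We are now done, because we clearly have

$|f^{\alpha}(x)| \leq \varepsilon,\ \forall\, x \in D,$

which gives the inclusion $f^{\alpha}(D) \subset B_{\varepsilon} \subset N$.

(ii). This part is trivial.

(iii). Denote the right hand side of (6) by $\mathcal{A}$. The inclusion $\mathrm{Ran}\, U \subset \mathcal{A}$ is trivial, by definition. Conversely, let us start with some $g \in \mathcal{A}$, and let us consider the function $f = g\big|_{\Omega}$. Let us show that $f \in C^{\mathbb{K}}_0(\Omega)$, using Proposition 5.7. Start with some $\varepsilon > 0$, and choose some open neighborhood $D_{\varepsilon}$ of ∞, in $\Omega^{\alpha}$, such that

$|g(x)| \leq \varepsilon,\ \forall\, x \in D_{\varepsilon}.$

By definition, there exists a compact subset $K_{\varepsilon} \subset \Omega$, such that $D_{\varepsilon} = \Omega^{\alpha} \smallsetminus K_{\varepsilon}$, so it is immediate that f satisfies condition (ii) from Proposition 5.7. Notice now that, by construction, we have $f^{\alpha}\big|_{\Omega} = g\big|_{\Omega}$, and $f^{\alpha}(\infty) = 0 = g(\infty)$, so we indeed get $g = Uf$.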

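Remark 5.3. Let Ω be a locally compact space, which is non-compact. Use the map U defined above to identify $C^{\mathbb{K}}_0(\Omega)$ with the subspace $\mathrm{Ran}\, U \subset C^{\mathbb{K}}(\Omega^{\alpha})$. With this identification, we have the equality

$C^{\mathbb{K}}(\Omega^{\alpha}) = \mathbb{K}1 + C^{\mathbb{K}}_0(\Omega) = \{\lambda 1 + f : \lambda \in \mathbb{K},\ f \in C^{\mathbb{K}}_0(\Omega)\}.$

Indeed, if we start with some function $g \in C^{\mathbb{K}}(\Omega^{\alpha})$ and we take $\lambda = g(\infty)$ and $f = g - \lambda 1$, then $f(\infty) = 0$. Note that this argument proves that in fact every $g \in C^{\mathbb{K}}(\Omega^{\alpha})$ can be uniquely represented as $g = \lambda 1 + f$, with $\lambda \in \mathbb{K}$, and $f \in C^{\mathbb{K}}_0(\Omega)$.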




We conclude with a couple of generalizations of the various results in this section. The first two are proven, the rest are stated as exercises. The following result is a generalization of Proposition 5.4.

Proposition 5.9. Let Ω be a locally compact space, and let $\phi : C^{\mathbb{R}}_0(\Omega) \to \mathbb{R}$ be a positive linear map. Then $\phi$ is continuous, and one has the equality

(7) $\|\phi\| = \sup\{\phi(f) : f \in C^{\mathbb{R}}_0(\Omega),\ 0 \leq f \leq 1\}.$

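Proof. Let us denote the right hand side of (7) by M. First we show that $M < \infty$. If $M = \infty$, there exists a sequence $(f_n)_{n=1}^{\infty} \subset C^{\mathbb{R}}_0(\Omega)$, such that

$0 \leq f_n \leq 1$ and $\phi(f_n) \geq 4^n,\ \forall\, n \geq 1.$

Consider then the function $f = \sum_{n=1}^{\infty} \frac{1}{2^n} f_n$. Since

$\sum_{n=1}^{\infty} \left\|\frac{1}{2^n} f_n\right\| \leq \sum_{n=1}^{\infty} \frac{1}{2^n} = 1,$

it follows that $f \in C^{\mathbb{R}}_0(\Omega)$. Notice however that, since we obviously have $\frac{1}{2^n} f_n \leq f$, by the positivity of $\phi$, we get

$\phi(f) \geq \phi\left(\frac{1}{2^n} f_n\right) = \frac{1}{2^n} \phi(f_n) \geq 2^n,\ \forall\, n \geq 1,$

which is clearly impossible. Let us show now that $\phi$ is continuous, by proving the inequality

(8) $|\phi(f)| \leq M,\ \forall\, f \in C^{\mathbb{R}}_0(\Omega)$, with $\|f\| \leq 1.$

Start with some arbitrary function $f \in C^{\mathbb{R}}_0(\Omega)$ with $\|f\| \leq 1$. The functions $g^{\pm} = |f| \pm f \in C^{\mathbb{R}}_0(\Omega)$ clearly satisfy $g^{\pm} \geq 0$, so we get $\phi(|f| \pm f) \geq 0$, that is, $\phi(|f|) \geq \pm\phi(f)$. This gives $|\phi(f)| \leq \phi(|f|)$, and since $0 \leq |f| \leq 1$, we immediately get (8).

The inequality (8) proves the inequality $\|\phi\| \leq M$. Since we obviously have $M \leq \|\phi\|$, we get in fact the equality (7).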

Corollary 5.4. Let Ω be a locally compact space, which is non-compact, and let $\Omega^{\alpha}$ be the Alexandrov compactification of Ω. Using the inclusion $C^{\mathbb{R}}_0(\Omega) \subset C^{\mathbb{R}}(\Omega^{\alpha})$, given by Proposition 5.8, every positive linear map $\phi : C^{\mathbb{R}}_0(\Omega) \to \mathbb{R}$ can be uniquely extended to a positive linear map $\psi : C^{\mathbb{R}}(\Omega^{\alpha}) \to \mathbb{R}$, such that $\|\psi\| = \|\phi\|$.

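Proof. For every $g \in C^{\mathbb{R}}(\Omega^{\alpha})$, we know that there exist a unique $\lambda \in \mathbb{R}$ and a unique $f \in C^{\mathbb{R}}_0(\Omega)$, such that $g = \lambda 1 + f$ (namely $\lambda = g(\infty)$ and $f = g - \lambda 1$). We then define $\psi(g) = \lambda\|\phi\| + \phi(f)$. Notice that $\psi(1) = \|\phi\|$. It is obvious that $\psi : C^{\mathbb{R}}(\Omega^{\alpha}) \to \mathbb{R}$ is linear, and $\psi\big|_{C^{\mathbb{R}}_0(\Omega)} = \phi$. Let us show that $\psi$ is positive.

Start with some $g \in C^{\mathbb{R}}(\Omega^{\alpha})$ with $g \geq 0$, and let us prove that $\psi(g) \geq 0$. Write $g = \lambda 1 + f$ with $\lambda \in \mathbb{R}$ and $f \in C^{\mathbb{R}}_0(\Omega)$. We know that $\lambda = g(\infty) \geq 0$. If $\lambda = 0$, there is nothing to prove. If $\lambda > 0$, we define the function $h = \lambda^{-1} f \in C^{\mathbb{R}}_0(\Omega)$, so that $g = \lambda(1 + h)$. The positivity of g forces $1 + h \geq 0$, which means that, if we consider the function $h^{-} = \max\{-h, 0\} \in C^{\mathbb{R}}_0(\Omega)$, then we have $0 \leq h^{-} \leq 1$, as well as $h^{-} + h \geq 0$. Using the above result, this will then give

$\|\phi\| + \phi(h) \geq \phi(h^{-}) + \phi(h) = \phi(h^{-} + h) \geq 0,$

which means that $\psi(1 + h) \geq 0$. Consequently we also get

$\psi(g) = \psi(\lambda(1 + h)) = \lambda\psi(1 + h) \geq 0.$

Having shown the positivity of $\psi$, we know that

$\|\psi\| = \psi(1) = \|\phi\|.$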




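To prove uniqueness, start with another positive linear map $\xi : C^{\mathbb{R}}(\Omega^{\alpha}) \to \mathbb{R}$, such that $\|\xi\| = \|\phi\|$, with $\xi\big|_{C^{\mathbb{R}}_0(\Omega)} = \phi$. Since $\xi$ is positive, this forces $\xi(1) = \|\xi\| = \|\phi\| = \psi(1)$. But then we have

$\xi(\lambda 1 + f) = \lambda\|\phi\| + \phi(f) = \psi(\lambda 1 + f),\ \forall\, \lambda \in \mathbb{R},\ f \in C^{\mathbb{R}}_0(\Omega),$

which proves that $\xi = \psi$.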

Remark 5.4. Let Ω be a locally compact space, which is not compact, and let $\phi : C^{\mathbb{R}}_c(\Omega) \to \mathbb{R}$ be a positive linear map. Then the following are equivalent:

(i) $\phi$ is continuous;
(ii) $\sup\{\phi(f) : f \in C^{\mathbb{R}}_c(\Omega),\ 0 \leq f \leq 1\} < \infty.$

The implication (i) ⇒ (ii) is trivial. To prove the implication (ii) ⇒ (i) we follow the exact same steps as in the proof of the equality (7) in Proposition 5.9. Denote the quantity in (ii) by M, and using the inequality $|\phi(f)| \leq \phi(|f|)$, we immediately get $|\phi(f)| \leq M$, $\forall\, f \in C^{\mathbb{R}}_c(\Omega)$, with $\|f\| \leq 1$.

Remark also that if $\phi$ is as above, then we have in fact the equality

$\|\phi\| = \sup\{\phi(f) : f \in C^{\mathbb{R}}_c(\Omega),\ 0 \leq f \leq 1\}.$

The following is a generalization of Corollary 5.3.

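Proposition 5.10. Let Ω be a locally compact space, and let $\phi : C^{\mathbb{R}}_0(\Omega) \to \mathbb{R}$ be a linear continuous map. Then there exist positive linear maps $\phi_1, \phi_2 : C^{\mathbb{R}}_0(\Omega) \to \mathbb{R}$, such that $\phi = \phi_1 - \phi_2$, and $\|\phi\| = \|\phi_1\| + \|\phi_2\|$.

Proof. If Ω is compact there is nothing to prove (this is Corollary 5.3). Assume Ω is non-compact. Use the Hahn-Banach Theorem to find a linear continuous map $\psi : C^{\mathbb{R}}(\Omega^{\alpha}) \to \mathbb{R}$, with $\|\psi\| = \|\phi\|$ and $\psi\big|_{C^{\mathbb{R}}_0(\Omega)} = \phi$. Apply Corollary 5.3 to find two positive linear maps $\psi_1, \psi_2 : C^{\mathbb{R}}(\Omega^{\alpha}) \to \mathbb{R}$ such that $\psi = \psi_1 - \psi_2$ and $\|\psi\| = \|\psi_1\| + \|\psi_2\|$. Define the positive linear maps $\phi_k = \psi_k\big|_{C^{\mathbb{R}}_0(\Omega)}$, $k = 1, 2$. We clearly have $\phi = \phi_1 - \phi_2$, and

$\|\phi_1\| + \|\phi_2\| \leq \|\psi_1\| + \|\psi_2\| = \|\psi\| = \|\phi\| = \|\phi_1 - \phi_2\| \leq \|\phi_1\| + \|\phi_2\|,$

which forces $\|\phi\| = \|\phi_1\| + \|\phi_2\|$.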

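Exercise 3. (Dini's Theorem for locally compact spaces) Let Ω be a locally compact space, and let $(f_n)_{n \geq 1} \subset C^{\mathbb{R}}_0(\Omega)$ be a monotone sequence. Assume there is some $f \in C^{\mathbb{R}}_0(\Omega)$, such that

$\lim_{n \to \infty} f_n(\omega) = f(\omega),\ \forall\, \omega \in \Omega.$

Then $\lim_{n \to \infty} f_n = f$, in the norm topology.

Exercise 4. (Stone-Weierstrass Theorems) Let Ω be a locally compact space, which is non-compact, and let $\mathcal{A} \subset C^{\mathbb{K}}_0(\Omega)$ be a subalgebra, with the following separation properties:

• For any two points $\omega_1, \omega_2 \in \Omega$, with $\omega_1 \neq \omega_2$, there exists $f \in \mathcal{A}$ such that $f(\omega_1) \neq f(\omega_2)$.

• For any $\omega \in \Omega$, there exists $f \in \mathcal{A}$ with $f(\omega) \neq 0$.

A. Prove that, if $\mathbb{K} = \mathbb{R}$, then $\mathcal{A}$ is dense in $C^{\mathbb{R}}_0(\Omega)$, in the norm topology.
B. Prove that, if $\mathbb{K} = \mathbb{C}$, and if $\mathcal{A}$ has the property $f \in \mathcal{A} \Rightarrow \overline{f} \in \mathcal{A}$, then $\mathcal{A}$ is dense in $C_0(\Omega)$.

Hint: Work in $\Omega^{\alpha}$ (use Remark 5.3), and prove that $\mathbb{K}1 + \mathcal{A}$ is dense in $C^{\mathbb{K}}(\Omega^{\alpha})$.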



Lectures 16-17

6. Hilbert spaces

In this section we examine a special type of Banach spaces.

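Definition. Let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let X be a vector space over $\mathbb{K}$. An inner product on X is a map

$X \times X \ni (\xi, \eta) \longmapsto (\xi \mid \eta) \in \mathbb{K},$

with the following properties:

• $(\xi \mid \xi) \geq 0$, $\forall\, \xi \in X$;

• if $\xi \in X$ satisfies $(\xi \mid \xi) = 0$, then $\xi = 0$;

• for any $\xi \in X$, the map $X \ni \eta \longmapsto (\xi \mid \eta) \in \mathbb{K}$ is $\mathbb{K}$-linear;

• $(\eta \mid \xi) = \overline{(\xi \mid \eta)}$, $\forall\, \xi, \eta \in X$.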

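Comments. Combining the last two properties, one gets

$(\xi \mid \lambda\eta_1 + \eta_2) = \lambda(\xi \mid \eta_1) + (\xi \mid \eta_2),\ \forall\, \xi, \eta_1, \eta_2 \in X,\ \lambda \in \mathbb{K};$

$(\lambda\xi_1 + \xi_2 \mid \eta) = \overline{\lambda}(\xi_1 \mid \eta) + (\xi_2 \mid \eta),\ \forall\, \xi_1, \xi_2, \eta \in X,\ \lambda \in \mathbb{K}.$

In particular, one has

(1) $(\lambda\xi \mid \lambda\xi) = \overline{\lambda}\lambda(\xi \mid \xi) = |\lambda|^2 \cdot (\xi \mid \xi),\ \forall\, \xi \in X,\ \lambda \in \mathbb{K}.$

Proposition 6.1 (Cauchy-Bunyakowski-Schwartz Inequality). Let $(\,\cdot \mid \cdot\,)$ be an inner product on the $\mathbb{K}$-vector space X. Then

(2) $|(\xi \mid \eta)|^2 \leq (\xi \mid \xi) \cdot (\eta \mid \eta),\ \forall\, \xi, \eta \in X.$

Moreover, if equality holds then $\xi$ and $\eta$ are proportional, in the sense that either $\xi = 0$, or $\eta = 0$, or $\xi = \lambda\eta$ for some $\lambda \in \mathbb{K}$.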

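Proof. Fix $\xi, \eta \in X$. Assume $\eta \neq 0$. (In the case when $\eta = 0$, both statements are trivial.) Choose a number $\lambda \in \mathbb{K}$, with $|\lambda| = 1$, such that

$|(\xi \mid \eta)| = \lambda(\xi \mid \eta) = (\xi \mid \lambda\eta).$

Define the map $F : \mathbb{K} \to \mathbb{K}$ by

$F(z) = (z\lambda\eta + \xi \mid z\lambda\eta + \xi),\ \forall\, z \in \mathbb{K}.$

A simple computation gives

$F(z) = \overline{z\lambda}\, z\lambda\, (\eta \mid \eta) + z\lambda(\xi \mid \eta) + \overline{z\lambda}(\eta \mid \xi) + (\xi \mid \xi)$
$\phantom{F(z)} = |z|^2 |\lambda|^2 (\eta \mid \eta) + z\lambda(\xi \mid \eta) + \overline{z\lambda(\xi \mid \eta)} + (\xi \mid \xi)$
$\phantom{F(z)} = |z|^2 (\eta \mid \eta) + z|(\xi \mid \eta)| + \overline{z}|(\xi \mid \eta)| + (\xi \mid \xi),\ \forall\, z \in \mathbb{K}.$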


In particular, when we restrict F to $\mathbb{R}$, it becomes a quadratic function:

$F(t) = at^2 + bt + c,\ \forall\, t \in \mathbb{R},$

where $a = (\eta \mid \eta) > 0$, $b = 2|(\xi \mid \eta)|$, $c = (\xi \mid \xi)$. Notice that we have

$F(t) \geq 0,\ \forall\, t \in \mathbb{R}.$

This forces $b^2 - 4ac \leq 0$. This last inequality gives

$4|(\xi \mid \eta)|^2 - 4(\xi \mid \xi) \cdot (\eta \mid \eta) \leq 0,$

so we get

$|(\xi \mid \eta)|^2 \leq (\xi \mid \xi) \cdot (\eta \mid \eta),$

and the inequality (2) is proven. Let us examine now when we have equality. The equality in (2) gives $b^2 - 4ac = 0$, which in terms of quadratic equations says that the equation

$F(t) = at^2 + bt + c = 0$

has a unique solution $t_0$. This will give

$(t_0\lambda\eta + \xi \mid t_0\lambda\eta + \xi) = F(t_0) = 0,$

which forces $t_0\lambda\eta + \xi = 0$, i.e. $\xi = (-t_0\lambda)\eta$.
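The inequality (2) and its equality case are easy to check numerically. The sketch below assumes the standard inner product on $\mathbb{C}^n$, written to match the convention of the text (conjugate-linear in the first variable, linear in the second); the helper names and the sample vectors are illustrative only.

```python
import random

def inner(xi, eta):
    """Inner product on C^n: conjugate-linear in the first variable, linear in the second."""
    return sum(x.conjugate() * y for x, y in zip(xi, eta))

def rand_vec(n, rng):
    """A random vector in C^n with coordinates in the square [-1, 1] x [-1, 1]."""
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]

rng = random.Random(1)
n = 4
gaps = []
for _ in range(1000):
    xi, eta = rand_vec(n, rng), rand_vec(n, rng)
    lhs = abs(inner(xi, eta)) ** 2
    rhs = inner(xi, xi).real * inner(eta, eta).real
    gaps.append(rhs - lhs)  # non-negative by inequality (2)

# Equality case: xi = lam * eta is proportional to eta.
eta = rand_vec(n, rng)
lam = complex(0.5, -2.0)
xi = [lam * y for y in eta]
equality_gap = inner(xi, xi).real * inner(eta, eta).real - abs(inner(xi, eta)) ** 2
```

For generic random vectors the gap is strictly positive, while for proportional vectors it vanishes up to rounding.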

Corollary 6.1. Let $(\,\cdot \mid \cdot\,)$ be an inner product on the $\mathbb{K}$-vector space X. Then the map

$X \ni \xi \longmapsto \sqrt{(\xi \mid \xi)} \in [0, \infty)$

is a norm on X.

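Proof. Denote $\sqrt{(\xi \mid \xi)}$ simply by $\|\xi\|$. The fact that $\|\xi\|$ is non-negative is clear. The implication $\|\xi\| = 0 \Rightarrow \xi = 0$ is also clear. Using (1) we have

$\|\lambda\xi\| = \sqrt{(\lambda\xi \mid \lambda\xi)} = \sqrt{|\lambda|^2 (\xi \mid \xi)} = |\lambda| \cdot \sqrt{(\xi \mid \xi)} = |\lambda| \cdot \|\xi\|,\ \forall\, \xi \in X,\ \lambda \in \mathbb{K}.$

Finally, for $\xi, \eta \in X$, we have

$\|\xi + \eta\|^2 = (\xi + \eta \mid \xi + \eta) = (\xi \mid \xi) + (\eta \mid \eta) + (\xi \mid \eta) + (\eta \mid \xi)$
$\phantom{\|\xi + \eta\|^2} = \|\xi\|^2 + \|\eta\|^2 + (\xi \mid \eta) + \overline{(\xi \mid \eta)} = \|\xi\|^2 + \|\eta\|^2 + 2\mathrm{Re}\,(\xi \mid \eta).$

We now use the C-B-S inequality, which reads

(3) $|(\xi \mid \eta)| \leq \|\xi\| \cdot \|\eta\|,$

so the above computation gives

$\|\xi + \eta\|^2 = \|\xi\|^2 + \|\eta\|^2 + 2\mathrm{Re}\,(\xi \mid \eta) \leq \|\xi\|^2 + \|\eta\|^2 + 2|(\xi \mid \eta)| \leq \|\xi\|^2 + \|\eta\|^2 + 2\|\xi\| \cdot \|\eta\| = \big(\|\xi\| + \|\eta\|\big)^2,$

so we immediately get $\|\xi + \eta\| \leq \|\xi\| + \|\eta\|$.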

Definition. The norm constructed in the above result is called the norm defined by the inner product $(\,\cdot \mid \cdot\,)$.

Exercise 1. Use the above notations, and assume we have two vectors $\xi, \eta \neq 0$, such that $\|\xi + \eta\| = \|\xi\| + \|\eta\|$. Prove that there exists some $\lambda > 0$ such that $\xi = \lambda\eta$.

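Lemma 6.1. Let X be a $\mathbb{K}$-vector space, equipped with an inner product.

(i) [Parallelogram Law] $\|\xi + \eta\|^2 + \|\xi - \eta\|^2 = 2\big(\|\xi\|^2 + \|\eta\|^2\big),\ \forall\, \xi, \eta \in X.$

(ii) [Polarization Identities]
(a) If $\mathbb{K} = \mathbb{R}$, then

$(\xi \mid \eta) = \frac{1}{4}\big(\|\xi + \eta\|^2 - \|\xi - \eta\|^2\big),\ \forall\, \xi, \eta \in X.$

(b) If $\mathbb{K} = \mathbb{C}$, then

$(\xi \mid \eta) = \frac{1}{4} \sum_{k=0}^{3} i^{-k} \|\xi + i^k\eta\|^2,\ \forall\, \xi, \eta \in X.$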

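Proof. (i). This is obvious, since (see the computations from the proof of Corollary 6.1)

$\|\xi \pm \eta\|^2 = \|\xi\|^2 + \|\eta\|^2 \pm 2\mathrm{Re}\,(\xi \mid \eta).$

(ii).(a). In the real case, the above identity gives

$\|\xi \pm \eta\|^2 = \|\xi\|^2 + \|\eta\|^2 \pm 2(\xi \mid \eta),$

so we immediately get

$\|\xi + \eta\|^2 - \|\xi - \eta\|^2 = 4(\xi \mid \eta).$

(b). For every $k \in \{0, 1, 2, 3\}$, we have

$\|\xi + i^k\eta\|^2 = \|\xi\|^2 + \|\eta\|^2 + 2\mathrm{Re}\,(\xi \mid i^k\eta) = \|\xi\|^2 + \|\eta\|^2 + i^k(\xi \mid \eta) + i^{-k}(\eta \mid \xi).$

Then, when we sum up, we have

$\sum_{k=0}^{3} i^{-k} \|\xi + i^k\eta\|^2 = \big(\|\xi\|^2 + \|\eta\|^2\big) \sum_{k=0}^{3} i^{-k} + 4(\xi \mid \eta) + (\eta \mid \xi) \sum_{k=0}^{3} i^{-2k}.$

Since

$\sum_{k=0}^{3} i^{-k} = \sum_{k=0}^{3} i^{-2k} = 0,$

the above computation proves that we indeed have

$\sum_{k=0}^{3} i^{-k} \|\xi + i^k\eta\|^2 = 4(\xi \mid \eta).$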
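Both identities can be verified numerically. This is a minimal sketch assuming the standard inner product on $\mathbb{C}^3$ (linear in the second variable, as in the text); the vectors are arbitrary illustrative choices.

```python
def inner(xi, eta):
    """Inner product on C^n: conjugate-linear in the first variable, linear in the second."""
    return sum(x.conjugate() * y for x, y in zip(xi, eta))

def norm_sq(xi):
    """Squared norm defined by the inner product."""
    return inner(xi, xi).real

def add(xi, eta, c=1):
    """Return the vector xi + c * eta."""
    return [x + c * y for x, y in zip(xi, eta)]

def polarization(xi, eta):
    """(1/4) * sum_{k=0}^{3} i^{-k} * ||xi + i^k eta||^2, the complex polarization formula."""
    return sum((1j) ** (-k) * norm_sq(add(xi, eta, (1j) ** k)) for k in range(4)) / 4

xi = [1 + 2j, -0.5j, 3.0 + 0j]
eta = [2 - 1j, 1 + 1j, -1.5j]

polarization_gap = abs(polarization(xi, eta) - inner(xi, eta))
parallelogram_gap = abs(
    norm_sq(add(xi, eta)) + norm_sq(add(xi, eta, -1)) - 2 * (norm_sq(xi) + norm_sq(eta))
)
```

Note that with an inner product that is linear in the first variable instead, the factor $i^{-k}$ in the polarization formula would have to be replaced by $i^{k}$.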

Corollary 6.2. Let X be a $\mathbb{K}$-vector space equipped with an inner product $(\,\cdot \mid \cdot\,)$. Then the map

$X \times X \ni (\xi, \eta) \longmapsto (\xi \mid \eta) \in \mathbb{K}$

is continuous, with respect to the product topologies.

Proof. Immediate from the polarization identities.

Corollary 6.3. Let X and Y be two $\mathbb{K}$-vector spaces equipped with inner products $(\,\cdot \mid \cdot\,)_X$ and $(\,\cdot \mid \cdot\,)_Y$. If $T : X \to Y$ is an isometric linear map, then

$(T\xi \mid T\eta)_Y = (\xi \mid \eta)_X,\ \forall\, \xi, \eta \in X.$

Proof. Immediate from the polarization identities.




Exercise 2. Let X be a normed $\mathbb{K}$-vector space. Assume the norm satisfies the Parallelogram Law. Prove that there exists an inner product $(\,\cdot \mid \cdot\,)$ on X, such that

$\|\xi\| = \sqrt{(\xi \mid \xi)},\ \forall\, \xi \in X.$

Hint: Define the inner product by the Polarization Identity, and then prove that it is indeed an inner product.

Proposition 6.2. Let X be a $\mathbb{K}$-vector space, equipped with an inner product $(\,\cdot \mid \cdot\,)_X$. Let Z be the completion of X with respect to the norm defined by the inner product. Then Z carries a unique inner product $(\,\cdot \mid \cdot\,)_Z$, so that the norm on Z is defined by $(\,\cdot \mid \cdot\,)_Z$. Moreover, this inner product extends $(\,\cdot \mid \cdot\,)_X$, in the sense that

$(\xi \mid \eta)_Z = (\xi \mid \eta)_X,\ \forall\, \xi, \eta \in X.$

Proof. It is obvious that the norm on Z satisfies the Parallelogram Law. We then apply Exercise 2.

Definitions. Let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$. A Hilbert space over $\mathbb{K}$ is a $\mathbb{K}$-vector space, equipped with an inner product, which is complete with respect to the norm defined by the inner product. Some textbooks use the term Euclidean for real Hilbert spaces, and reserve the term Hilbert only for the complex case.

Examples 6.1. For I a non-empty set, the space $\ell^2_{\mathbb{K}}(I)$ is a Hilbert space. We know that this is a Banach space. The inner product defining the norm is

$(\alpha \mid \beta) = \sum_{j \in I} \overline{\alpha(j)}\,\beta(j),\ \forall\, \alpha, \beta \in \ell^2_{\mathbb{K}}(I).$

The fact that the function $\overline{\alpha}\beta : I \to \mathbb{K}$ is summable is a consequence of Hölder's inequality.

More generally, a Banach space whose norm satisfies the Parallelogram Law is a Hilbert space.

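Definitions. Let X be a $\mathbb{K}$-vector space, equipped with an inner product $(\,\cdot \mid \cdot\,)$. Two vectors $\xi, \eta \in X$ are said to be orthogonal, if $(\xi \mid \eta) = 0$. In this case we write $\xi \perp \eta$. Given a set $\mathcal{M} \subset X$, and a vector $\xi \in X$, we write $\xi \perp \mathcal{M}$, if

$\xi \perp \eta,\ \forall\, \eta \in \mathcal{M}.$

Finally, two subsets $\mathcal{M}, \mathcal{N} \subset X$ are said to be orthogonal, and we write $\mathcal{M} \perp \mathcal{N}$, if

$\xi \perp \eta,\ \forall\, \xi \in \mathcal{M},\ \eta \in \mathcal{N}.$

Notation. Let X be a vector space equipped with an inner product. For a subset $\mathcal{M} \subset X$, we define the set

$\mathcal{M}^{\perp} = \{\xi \in X : \xi \perp \mathcal{M}\}.$

Remarks 6.1. Let X be a $\mathbb{K}$-vector space equipped with an inner product.

A. The relation ⊥ is symmetric.

B. If $\xi, \eta \in X$ satisfy $\xi \perp \eta$, then one has the Pythagorean Theorem:

$\|\xi + \eta\|^2 = \|\xi\|^2 + \|\eta\|^2.$

This is a consequence of the equality $\|\xi + \eta\|^2 = \|\xi\|^2 + \|\eta\|^2 + 2\mathrm{Re}\,(\xi \mid \eta)$.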




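C. If $\mathcal{M} \subset X$ is an arbitrary subset, then $\mathcal{M}^{\perp}$ is a closed linear subspace of X. This follows from the linearity of the inner product in the second variable, and from the continuity.

D. For sets $\mathcal{M} \subset \mathcal{N} \subset X$, one has

$\mathcal{M}^{\perp} \supset \mathcal{N}^{\perp}.$

E. For any set $\mathcal{M} \subset X$, one has

$\mathcal{M}^{\perp} = \big(\overline{\mathrm{Span}}\,\mathcal{M}\big)^{\perp},$

where $\overline{\mathrm{Span}}\,\mathcal{M}$ denotes the norm closure of the linear span of $\mathcal{M}$. The inclusion

$\mathcal{M}^{\perp} \supset \big(\overline{\mathrm{Span}}\,\mathcal{M}\big)^{\perp}$

is trivial, since we have $\mathcal{M} \subset \overline{\mathrm{Span}}\,\mathcal{M}$. Conversely, if $\xi \in \mathcal{M}^{\perp}$, then $\mathcal{M} \subset \{\xi\}^{\perp}$. But since $\{\xi\}^{\perp}$ is a closed linear subspace, this gives

$\overline{\mathrm{Span}}\,\mathcal{M} \subset \{\xi\}^{\perp},$

i.e. $\xi \in \big(\overline{\mathrm{Span}}\,\mathcal{M}\big)^{\perp}$.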

The following result gives a very interesting property of Hilbert spaces.

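Proposition 6.3. Let H be a Hilbert space, let $\mathcal{C} \subset H$ be a non-empty closed convex set. For every $\xi \in H$, there exists a unique vector $\xi' \in \mathcal{C}$, such that

$\|\xi - \xi'\| = \mathrm{dist}(\xi, \mathcal{C}).$

Proof. Denote $\mathrm{dist}(\xi, \mathcal{C})$ simply by $\delta$. By definition, we have

$\delta = \inf_{\eta \in \mathcal{C}} \|\xi - \eta\|.$

Choose a sequence $(\eta_n)_{n \geq 1} \subset \mathcal{C}$, such that $\lim_{n \to \infty} \|\xi - \eta_n\| = \delta$.

Claim: One has the inequality

$\|\eta_m - \eta_n\|^2 \leq 2\|\xi - \eta_m\|^2 + 2\|\xi - \eta_n\|^2 - 4\delta^2,\ \forall\, m, n \geq 1.$

Use the Parallelogram Law

(4) $2\|\xi - \eta_m\|^2 + 2\|\xi - \eta_n\|^2 = \|2\xi - \eta_m - \eta_n\|^2 + \|\eta_m - \eta_n\|^2.$

We notice that, since $\frac{1}{2}(\eta_m + \eta_n) \in \mathcal{C}$, we have

$\|\xi - \tfrac{1}{2}(\eta_m + \eta_n)\| \geq \delta,$

so we get

$\|2\xi - \eta_m - \eta_n\|^2 = 4\|\xi - \tfrac{1}{2}(\eta_m + \eta_n)\|^2 \geq 4\delta^2,$

so if we go back to (4) we get

$2\|\xi - \eta_m\|^2 + 2\|\xi - \eta_n\|^2 = \|2\xi - \eta_m - \eta_n\|^2 + \|\eta_m - \eta_n\|^2 \geq 4\delta^2 + \|\eta_m - \eta_n\|^2,$

and the Claim follows.

Having proven the Claim, we now notice that, since $\lim_{n \to \infty} \|\xi - \eta_n\| = \delta$, we immediately get the fact that the sequence $(\eta_n)_{n \geq 1}$ is Cauchy. Since H is complete, it follows that the sequence is convergent to some point $\xi'$. Since $\mathcal{C}$ is closed, it follows that $\xi' \in \mathcal{C}$. So far we have

$\|\xi - \xi'\| = \lim_{n \to \infty} \|\xi - \eta_n\| = \delta = \mathrm{dist}(\xi, \mathcal{C}),$

thus proving the existence.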




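Let us prove now the uniqueness. Assume $\xi'' \in \mathcal{C}$ is another point such that $\|\xi - \xi''\| = \delta$. Using the Parallelogram Law, we have

$4\delta^2 = 2\|\xi - \xi'\|^2 + 2\|\xi - \xi''\|^2 = \|2\xi - \xi' - \xi''\|^2 + \|\xi' - \xi''\|^2.$

If $\xi' \neq \xi''$, then we will have

$4\delta^2 > \|2\xi - \xi' - \xi''\|^2 = 4\|\xi - \tfrac{1}{2}(\xi' + \xi'')\|^2,$

so we have a new vector $\eta = \frac{1}{2}(\xi' + \xi'') \in \mathcal{C}$, such that

$\|\xi - \eta\| < \delta,$

thus contradicting the definition of $\delta$.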
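For a concrete closed convex set the nearest point can be written down explicitly. The sketch below assumes the closed unit ball of $\mathbb{R}^3$ as the convex set (an illustrative choice, not one made in the text), where the nearest point to $\xi$ is $\xi$ itself if $\|\xi\| \leq 1$ and $\xi / \|\xi\|$ otherwise, and compares it against randomly sampled points of the ball.

```python
import math
import random

def norm(v):
    """Euclidean norm on R^n."""
    return math.sqrt(sum(x * x for x in v))

def dist(u, v):
    """Euclidean distance between u and v."""
    return norm([a - b for a, b in zip(u, v)])

def nearest_in_unit_ball(xi):
    """Nearest point to xi in the closed unit ball C = {eta : ||eta|| <= 1} of R^n."""
    r = norm(xi)
    return list(xi) if r <= 1.0 else [x / r for x in xi]

rng = random.Random(2)
xi = [2.0, -1.0, 2.0]            # ||xi|| = 3, so dist(xi, C) = ||xi|| - 1 = 2
best = nearest_in_unit_ball(xi)

# Compare against random points of C: none should be closer to xi than `best`.
samples = []
while len(samples) < 2000:
    eta = [rng.uniform(-1, 1) for _ in range(3)]
    if norm(eta) <= 1.0:
        samples.append(eta)
min_sample_dist = min(dist(xi, eta) for eta in samples)
```

Random sampling cannot prove minimality, of course; it only illustrates that no sampled point of the ball beats the candidate.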

Definition. Let H be a Hilbert space, and let $X \subset H$ be a closed linear subspace. For every $\xi \in H$, using the above result, we let $P_X\xi \in X$ denote the unique vector in X with the property

$\|\xi - P_X\xi\| = \mathrm{dist}(\xi, X).$

This way we have constructed a map $P_X : H \to H$, which is called the orthogonal projection onto X.

The properties of the orthogonal projection are summarized in the following result.

Proposition 6.4. Let H be a Hilbert space, and let $X \subset H$ be a closed linear subspace.

(i) For vectors $\xi \in H$ and $\zeta \in X$ one has the equivalence

$\zeta = P_X\xi \iff (\xi - \zeta) \perp X.$

(ii) $P_X\big|_X = \mathrm{Id}_X$.
(iii) The map $P_X : H \to X$ is linear, continuous. If $X \neq \{0\}$, then $\|P_X\| = 1$.
(iv) $\mathrm{Ran}\, P_X = X$ and $\mathrm{Ker}\, P_X = X^{\perp}$.

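Proof. (i). “⇒.” Assume $\zeta = P_X\xi$. Fix an arbitrary vector $\eta \in X \smallsetminus \{0\}$, and choose a number $\lambda \in \mathbb{K}$, with $|\lambda| = 1$, such that

$\lambda(\xi - \zeta \mid \eta) = |(\xi - \zeta \mid \eta)|.$

In particular, we have $|(\xi - \zeta \mid \eta)| = \mathrm{Re}\,(\xi - \zeta \mid \lambda\eta)$.

Define the map $F : \mathbb{R} \to \mathbb{R}$ by

$F(t) = \|\xi - \zeta - t\lambda\eta\|^2 - \|\xi - \zeta\|^2.$

By the definition of $\zeta = P_X\xi$, we have

(5) $F(t) > 0,\ \forall\, t \in \mathbb{R} \smallsetminus \{0\}.$

Notice that $F(t) = at^2 + bt$, $\forall\, t \in \mathbb{R}$, where $a = (\lambda\eta \mid \lambda\eta) = \|\eta\|^2$, and $b = -2\mathrm{Re}\,(\xi - \zeta \mid \lambda\eta) = -2|(\xi - \zeta \mid \eta)|$. Of course, the property

$at^2 + bt > 0,\ \forall\, t \in \mathbb{R} \smallsetminus \{0\}$

forces $b = 0$, so we indeed get $(\xi - \zeta \mid \eta) = 0$.

“⇐.” Assume $(\xi - \zeta) \perp X$. For any $\eta \in X$, we have $(\xi - \zeta) \perp (\zeta - \eta)$, so using the Pythagorean Theorem, we get

$\|\xi - \eta\|^2 = \|\xi - \zeta\|^2 + \|\zeta - \eta\|^2,$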




which forces

$\|\xi - \eta\| \geq \|\xi - \zeta\|,\ \forall\, \eta \in X.$

This proves that $\|\xi - \zeta\| = \mathrm{dist}(\xi, X)$, i.e. $\zeta = P_X\xi$.

(ii). This property is pretty clear. If $\xi \in X$, then $0 = \xi - \xi$ is orthogonal to X, so by (i) we get $\xi = P_X\xi$.

(iii). We prove the linearity of $P_X$. Start with vectors $\xi_1, \xi_2 \in H$ and a scalar $\lambda \in \mathbb{K}$. Take $\zeta_1 = P_X\xi_1$ and $\zeta_2 = P_X\xi_2$. Consider the vector $\zeta = \lambda\zeta_1 + \zeta_2$. By (i) we have $(\xi_1 - \zeta_1) \perp X$ and $(\xi_2 - \zeta_2) \perp X$, so for any $\eta \in X$, we have

$(\lambda\xi_1 + \xi_2 - \zeta \mid \eta) = \big((\lambda\xi_1 - \lambda\zeta_1) + (\xi_2 - \zeta_2) \mid \eta\big) = (\lambda\xi_1 - \lambda\zeta_1 \mid \eta) + (\xi_2 - \zeta_2 \mid \eta) = \overline{\lambda}(\xi_1 - \zeta_1 \mid \eta) + (\xi_2 - \zeta_2 \mid \eta) = 0.$

This computation proves that

$(\lambda\xi_1 + \xi_2 - \zeta) \perp X,$

so using (i) we get

$P_X(\lambda\xi_1 + \xi_2) = \zeta = \lambda\zeta_1 + \zeta_2 = \lambda P_X\xi_1 + P_X\xi_2,$

so $P_X$ is indeed linear.

To prove the continuity, we start with an arbitrary vector $\xi \in H$ and we use the fact that $(\xi - P_X\xi) \perp P_X\xi$. By the Pythagorean Theorem we then have

$\|\xi\|^2 = \|(\xi - P_X\xi) + P_X\xi\|^2 = \|\xi - P_X\xi\|^2 + \|P_X\xi\|^2 \geq \|P_X\xi\|^2.$

In other words, we have

$\|P_X\xi\| \leq \|\xi\|,\ \forall\, \xi \in H,$

so $P_X$ is indeed continuous, and we have $\|P_X\| \leq 1$. Using (ii) we immediately get that, when $X \neq \{0\}$, we have $\|P_X\| = 1$.

(iv). The equality $\mathrm{Ran}\, P_X = X$ is trivial by the construction of $P_X$ and by (ii). If $\xi \in \mathrm{Ker}\, P_X$, then by (i), we have $\xi \in X^{\perp}$. Conversely, if $\xi \perp X$, then $\zeta = 0$ satisfies the condition in (i), i.e. $P_X\xi = 0$.
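The characterization in (i) gives an explicit formula in the simplest case. The following sketch assumes a one-dimensional subspace $X = \mathbb{K}\eta$ of $\mathbb{C}^3$ (an illustrative special case, with hypothetical sample vectors): the projection is the multiple of $\eta$ whose residual is orthogonal to $\eta$.

```python
def inner(xi, eta):
    """Inner product on C^n: conjugate-linear in the first variable, linear in the second."""
    return sum(x.conjugate() * y for x, y in zip(xi, eta))

def norm(xi):
    """Norm defined by the inner product."""
    return inner(xi, xi).real ** 0.5

def project_onto_line(eta, xi):
    """P_X xi for X = K*eta (eta != 0): the multiple c*eta with (xi - c*eta) orthogonal to eta."""
    coeff = inner(eta, xi) / inner(eta, eta)
    return [coeff * y for y in eta]

eta = [1 + 1j, 2 + 0j, -1j]
xi = [3 + 0j, 1 - 2j, 0.5j]

p = project_onto_line(eta, xi)
residual = [x - y for x, y in zip(xi, p)]   # xi - P_X xi, an element of the orthogonal complement
p_again = project_onto_line(eta, p)          # P_X restricted to X is the identity
```

The residual is orthogonal to $\eta$, applying the projection twice changes nothing, and $\|P_X\xi\| \leq \|\xi\|$, matching parts (i), (ii) and (iii).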

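Corollary 6.4. If H is a Hilbert space, and $X \subset H$ is a closed linear subspace, then

$X + X^{\perp} = H$ and $X \cap X^{\perp} = \{0\}.$

In other words the map

(6) $X \times X^{\perp} \ni (\eta, \zeta) \longmapsto \eta + \zeta \in H$

is a linear isomorphism.

Proof. If $\xi \in H$ then $P_X\xi \in X$, and $\xi - P_X\xi \in X^{\perp}$, and then the equality

$\xi = P_X\xi + (\xi - P_X\xi)$

proves that $\xi \in X + X^{\perp}$. The equality $X \cap X^{\perp} = \{0\}$ is trivial, since for $\zeta \in X \cap X^{\perp}$, we must have $\zeta \perp \zeta$, which forces $\zeta = 0$.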

Exercise 3. Let H be a Hilbert space.

(i) Prove that, for any closed subspace $X \subset H$, one has the equality

$P_{X^{\perp}} = I - P_X.$

(ii) Prove that, for two closed subspaces $X, Y \subset H$, the following are equivalent:
– $X \perp Y$;
– $P_X P_Y = 0$;
– $P_Y P_X = 0$.

(iii) Prove that, for two closed subspaces $X, Y \subset H$, the following are equivalent:
– $X \subset Y$;
– $P_X P_Y = P_X$;
– $P_Y P_X = P_X$.

(iv) Prove that, if $X, Y \subset H$ are closed subspaces such that $X \perp Y$, then
– $X + Y$ is a closed linear subspace of H;
– $P_{X+Y} = P_X + P_Y$.

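Corollary 6.5. Let H be a Hilbert space, and let $X \subset H$ be a linear (not necessarily closed) subspace. Then one has the equality

$\overline{X} = \big(X^{\perp}\big)^{\perp}.$

Proof. Denote the closed subspace $\big(X^{\perp}\big)^{\perp}$ by Z. Since $X^{\perp} = \overline{X}^{\perp}$, by the previous exercise we have

$P_Z = I - P_{X^{\perp}} = I - P_{\overline{X}^{\perp}} = I - (I - P_{\overline{X}}) = P_{\overline{X}},$

which forces

$Z = \mathrm{Ran}\, P_Z = \mathrm{Ran}\, P_{\overline{X}} = \overline{X}.$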

Theorem 6.1 (Riesz' Representation Theorem). Let H be a Hilbert space over $\mathbb{K}$, and let $\phi : H \to \mathbb{K}$ be a linear continuous map. Then there exists a unique vector $\xi \in H$, such that

$\phi(\eta) = (\xi \mid \eta),\ \forall\, \eta \in H.$

Moreover one has $\|\xi\| = \|\phi\|$.

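Proof. First we show the existence. If $\phi = 0$, we simply take $\xi = 0$. Assume $\phi \neq 0$. Define the subspace $X = \mathrm{Ker}\, \phi$. Notice that X is closed. Using the linear isomorphism (6) we see that the composition

$X^{\perp} \hookrightarrow H \xrightarrow{\ \text{quotient map}\ } H/X$

is a linear isomorphism. Since

$H/X = H/\mathrm{Ker}\, \phi \simeq \mathrm{Ran}\, \phi = \mathbb{K},$

it follows that $\dim(X^{\perp}) = 1$. In other words, there exists $\xi_0 \in X^{\perp}$, $\xi_0 \neq 0$, such that

$X^{\perp} = \mathbb{K}\xi_0.$

Start now with some arbitrary vector $\eta \in H$. On the one hand, using the equality $\mathbb{K}\xi_0 + X = H$, there exist $\lambda \in \mathbb{K}$ and $\zeta \in X$, such that

$\eta = \lambda\xi_0 + \zeta,$

and since $\zeta \in X = \mathrm{Ker}\, \phi$, we get

$\phi(\eta) = \phi(\lambda\xi_0) = \lambda\phi(\xi_0).$

On the other hand, we have

$(\xi_0 \mid \eta) = (\xi_0 \mid \lambda\xi_0) + (\xi_0 \mid \zeta) = \lambda\|\xi_0\|^2,$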



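so if we define $\xi = \overline{\phi(\xi_0)}\,\|\xi_0\|^{-2}\xi_0$ we will have

$(\xi \mid \eta) = \big(\overline{\phi(\xi_0)}\,\|\xi_0\|^{-2}\xi_0 \mid \eta\big) = \phi(\xi_0)\|\xi_0\|^{-2}(\xi_0 \mid \eta) = \lambda\phi(\xi_0) = \phi(\eta).$

To prove uniqueness, assume $\xi' \in H$ is another vector with

$\phi(\eta) = (\xi' \mid \eta),\ \forall\, \eta \in H.$

In particular, we have

$\|\xi - \xi'\|^2 = (\xi - \xi' \mid \xi - \xi') = (\xi \mid \xi - \xi') - (\xi' \mid \xi - \xi') = \phi(\xi - \xi') - \phi(\xi - \xi') = 0,$

which forces $\xi = \xi'$.

Finally, to prove the norm equality, we first observe that when $\xi = 0$, the equality is trivial. If $\xi \neq 0$, then on the one hand, using the C-B-S inequality we have

$|\phi(\eta)| = |(\xi \mid \eta)| \leq \|\xi\| \cdot \|\eta\|,\ \forall\, \eta \in H,$

so we immediately get $\|\phi\| \leq \|\xi\|$. If we take the vector $\zeta = \|\xi\|^{-1}\xi$, then $\|\zeta\| = 1$, and

$\phi(\zeta) = (\xi \mid \|\xi\|^{-1}\xi) = \|\xi\|,$

so we also have $\|\phi\| \geq \|\xi\|$.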
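In finite dimensions the representing vector can be read off directly. The sketch below assumes $H = \mathbb{C}^3$ with its standard inner product (linear in the second variable) and a functional given by hypothetical coefficients $c_j$; the representing vector then has coordinates $\overline{c_j}$.

```python
def inner(xi, eta):
    """Inner product on C^n: conjugate-linear in the first variable, linear in the second."""
    return sum(x.conjugate() * y for x, y in zip(xi, eta))

def norm(xi):
    """Norm defined by the inner product."""
    return inner(xi, xi).real ** 0.5

c = [2 - 1j, 0.5j, -3.0 + 0j]      # hypothetical coefficients of phi

def phi(eta):
    """A linear functional on C^3: phi(eta) = sum_j c_j * eta_j."""
    return sum(cj * y for cj, y in zip(c, eta))

# Representing vector: phi(eta) = (xi | eta) for all eta forces xi_j = conj(c_j).
xi = [cj.conjugate() for cj in c]

eta = [1 + 2j, -1j, 4.0 + 0j]       # an arbitrary test vector
zeta = [x / norm(xi) for x in xi]   # unit vector at which phi takes the value ||xi||

representation_gap = abs(phi(eta) - inner(xi, eta))
norm_gap = abs(phi(zeta) - norm(xi))
```

The conjugation in the representing vector mirrors the factor $\overline{\phi(\xi_0)}$ in the existence part of the proof.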

In the remainder of this section we discuss a Hilbert space notion of linear independence. This should be thought of as a “rigid” linear independence.

Definition. Let X be a $\mathbb{K}$-vector space, equipped with an inner product. A set $\mathcal{F} \subset X$ is said to be orthogonal, if $0 \notin \mathcal{F}$, and

$\xi \perp \eta,\ \forall\, \xi, \eta \in \mathcal{F}$, with $\xi \neq \eta.$

A set $\mathcal{F} \subset X$ is said to be orthonormal, if it is orthogonal, but it also satisfies:

$\|\xi\| = 1,\ \forall\, \xi \in \mathcal{F}.$

Remark that, if one starts with an orthogonal set $\mathcal{F} \subset X$, then the set

$\mathcal{F}^{(1)} = \{\|\xi\|^{-1}\xi : \xi \in \mathcal{F}\}$

is orthonormal.

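Proposition 6.5. Let $\mathcal{X}$ be a $\mathbb{K}$-vector space equipped with an inner product. Any orthogonal set $\mathcal{F} \subset \mathcal{X}$ is linearly independent.

Proof. Indeed, if one starts with a vanishing linear combination

\[ \lambda_1\xi_1 + \cdots + \lambda_n\xi_n = 0, \]

with $\lambda_1, \dots, \lambda_n \in \mathbb{K}$, $\xi_1, \dots, \xi_n \in \mathcal{F}$, such that $\xi_k \neq \xi_\ell$ for all $k, \ell \in \{1, \dots, n\}$ with $k \neq \ell$, then for each $k \in \{1, \dots, n\}$ we clearly have

\[ \lambda_k\|\xi_k\|^2 = (\xi_k \mid \lambda_1\xi_1 + \cdots + \lambda_n\xi_n) = 0, \]

and since $\xi_k \neq 0$, we get $\lambda_k = 0$.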

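Lemma 6.2. Let $\mathcal{X}$ be a $\mathbb{K}$-vector space equipped with an inner product, and let $\mathcal{F} \subset \mathcal{X}$ be an orthogonal set. Then there exists a maximal (with respect to inclusion) orthogonal set $\mathcal{G} \subset \mathcal{X}$ with $\mathcal{F} \subset \mathcal{G}$.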




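Proof. Consider the sets

\[ \mathcal{A} = \{ \mathcal{G} : \mathcal{G} \text{ orthogonal subset of } \mathcal{X} \}, \]

\[ \mathcal{A}_{\mathcal{F}} = \{ \mathcal{G} \in \mathcal{A} : \mathcal{G} \supset \mathcal{F} \}, \]

ordered with the inclusion. We are going to apply Zorn’s Lemma to $\mathcal{A}_{\mathcal{F}}$. Let $\mathcal{T} \subset \mathcal{A}_{\mathcal{F}}$ be a (non-empty) subcollection, which is totally ordered, i.e. for any $\mathcal{G}_1, \mathcal{G}_2 \in \mathcal{T}$ one has $\mathcal{G}_1 \subset \mathcal{G}_2$ or $\mathcal{G}_1 \supset \mathcal{G}_2$. Define the set

\[ \mathcal{M} = \bigcup_{\mathcal{G} \in \mathcal{T}} \mathcal{G}. \]

Since $\mathcal{G} \subset \mathcal{X} \setminus \{0\}$, for all $\mathcal{G} \in \mathcal{T}$, it is clear that $\mathcal{M} \subset \mathcal{X} \setminus \{0\}$. If $\xi_1, \xi_2 \in \mathcal{M}$ are vectors with $\xi_1 \neq \xi_2$, then we can find $\mathcal{G}_1, \mathcal{G}_2 \in \mathcal{T}$ with $\xi_1 \in \mathcal{G}_1$ and $\xi_2 \in \mathcal{G}_2$. Using the fact that $\mathcal{T}$ is totally ordered, it follows that there is $k \in \{1, 2\}$ such that $\xi_1, \xi_2 \in \mathcal{G}_k$, so we indeed get $\xi_1 \perp \xi_2$. It is now clear that $\mathcal{M} \in \mathcal{A}_{\mathcal{F}}$, and $\mathcal{M} \supset \mathcal{G}$, for all $\mathcal{G} \in \mathcal{T}$. In other words, we have shown that every totally ordered subset of $\mathcal{A}_{\mathcal{F}}$ has an upper bound in $\mathcal{A}_{\mathcal{F}}$. By Zorn’s Lemma, $\mathcal{A}_{\mathcal{F}}$ has a maximal element. Finally, it is clear that any maximal element for $\mathcal{A}_{\mathcal{F}}$ is also a maximal element in $\mathcal{A}$.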

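Remark 6.2. Using the notations from the proof above, given an orthonormal set $\mathcal{M} \subset \mathcal{X}$, the following are equivalent:

(i) $\mathcal{M}$ is maximal in $\mathcal{A}$;
(ii) $\mathcal{M}$ is maximal in $\mathcal{A}^{(1)} = \{ \mathcal{G} : \mathcal{G} \text{ orthonormal subset of } \mathcal{X} \}$.

The implication (i) ⇒ (ii) is trivial. Conversely, if $\mathcal{M}$ is maximal in $\mathcal{A}^{(1)}$, we use the Lemma to find a maximal $\mathcal{N} \in \mathcal{A}$ with $\mathcal{N} \supset \mathcal{M}$. But then $\mathcal{N}^{(1)}$ is orthonormal, and $\mathcal{N}^{(1)} \supset \mathcal{M}$, which by the maximality of $\mathcal{M}$ in $\mathcal{A}^{(1)}$ will force $\mathcal{N}^{(1)} = \mathcal{M}$. Since $\mathcal{N}$ is linearly independent, it cannot contain two distinct proportional vectors, so the relations

\[ \mathcal{N}^{(1)} = \mathcal{M} \subset \mathcal{N} \]

will force $\mathcal{N} = \mathcal{N}^{(1)} = \mathcal{M}$.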

Comment. In linear algebra we know that a linearly independent set is maximal, if and only if it spans the whole space. In the case of orthogonal sets, this statement has a version described by the following result.

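Theorem 6.2. Let $\mathcal{H}$ be a Hilbert space, and let $\mathcal{F}$ be an orthogonal set in $\mathcal{H}$. The following are equivalent:

(i) $\mathcal{F}$ is maximal among all orthogonal subsets of $\mathcal{H}$;
(ii) $\operatorname{Span} \mathcal{F}$ is dense in $\mathcal{H}$ in the norm topology.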

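Proof. (i) ⇒ (ii). Assume $\mathcal{F}$ is maximal. We are going to show that $\operatorname{Span} \mathcal{F}$ is dense in $\mathcal{H}$, by contradiction. Denote the closure $\overline{\operatorname{Span} \mathcal{F}}$ simply by $\mathcal{X}$, and assume $\mathcal{X} \subsetneq \mathcal{H}$. Since

\[ \mathcal{X} = (\mathcal{X}^\perp)^\perp, \]

we see that the strict inclusion $\mathcal{X} \subsetneq \mathcal{H}$ forces $\mathcal{X}^\perp \neq \{0\}$. But now if we take a non-zero vector $\xi \in \mathcal{X}^\perp$, we immediately see that the set $\mathcal{F} \cup \{\xi\}$ is still orthogonal, thus contradicting the maximality of $\mathcal{F}$.

(ii) ⇒ (i). Assume $\operatorname{Span} \mathcal{F}$ is dense in $\mathcal{H}$, and let us prove that $\mathcal{F}$ is maximal. We do this by contradiction. If $\mathcal{F}$ is not maximal, then there exists $\xi \in \mathcal{H} \setminus \mathcal{F}$, such that $\mathcal{F} \cup \{\xi\}$ is still orthogonal. This would force $\xi \perp \mathcal{F}$, so we will also have

\[ \xi \perp \operatorname{Span} \mathcal{F}. \]

But since $\operatorname{Span} \mathcal{F}$ is dense in $\mathcal{H}$, this will give $\xi \perp \mathcal{H}$. In particular we have $\xi \perp \xi$, which would force $\xi = 0$, thus contradicting the fact that $\mathcal{F} \cup \{\xi\}$ is orthogonal. (Recall that all elements of an orthogonal set are non-zero.)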

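Definition. Let $\mathcal{H}$ be a Hilbert space. An orthonormal set $\mathcal{B} \subset \mathcal{H}$, which is maximal among all orthogonal (or orthonormal) subsets of $\mathcal{H}$, is called an orthonormal basis for $\mathcal{H}$.

By Lemma ??, we know that given any orthonormal set $\mathcal{F} \subset \mathcal{H}$, there exists an orthonormal basis $\mathcal{B} \supset \mathcal{F}$.

By the above result, an orthonormal set $\mathcal{B} \subset \mathcal{H}$ is an orthonormal basis for $\mathcal{H}$, if and only if $\operatorname{Span} \mathcal{B}$ is dense in $\mathcal{H}$.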

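Example 6.2. Let $I$ be a non-empty set. Consider the Hilbert space $\ell^2_{\mathbb{K}}(I)$. Consider (see section II.2) the set

\[ \mathcal{B} = \{ \delta_i : i \in I \}. \]

Then

\[ \operatorname{Span} \mathcal{B} = \operatorname{fin}_{\mathbb{K}}(I), \]

which is dense in $\ell^2_{\mathbb{K}}(I)$. The above result then says that $\mathcal{B}$ is an orthonormal basis for $\ell^2_{\mathbb{K}}(I)$.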

The following exercise will be useful in the discussion of another interesting example.

Exercise 4. Equip the space $C([0, 1])$ with the inner product

\[ (f \mid g) = \int_0^1 \overline{f(t)}\, g(t)\, dt, \quad f, g \in C([0, 1]). \]

The norm defined by this inner product is

\[ \|f\|_2 = \Big( \int_0^1 |f(t)|^2\, dt \Big)^{1/2}, \quad f \in C([0, 1]). \]

Define the maps $e_n : [0, 1] \ni t \longmapsto \exp(2n\pi i t) \in \mathbb{T}$, $n \in \mathbb{Z}$. (Here $\mathbb{T}$ denotes the unit circle in $\mathbb{C}$.) Prove that the set

\[ \mathcal{B} = \{ e_n : n \in \mathbb{Z} \} \]

is orthonormal in $C([0, 1])$, and $\operatorname{Span} \mathcal{B}$ is dense in $C([0, 1])$ in the topology defined by the norm $\|\cdot\|_2$.

Hints: Define the space

\[ \mathcal{P} = \{ f \in C([0, 1]) : f(0) = f(1) \}. \]

Prove that $\mathcal{P}$ is dense in $C([0, 1])$ in the topology defined by the norm $\|\cdot\|_2$. Prove that the map

\[ \Phi : C(\mathbb{T}) \ni F \longmapsto F \circ e_1 \in \mathcal{P} \]

is a linear isomorphism, which is isometric with respect to the uniform norms.

In order to prove that $\operatorname{Span} \mathcal{B}$ is dense in $C([0, 1])$ with respect to $\|\cdot\|_2$, it suffices to show that $\operatorname{Span} \mathcal{B}$ is dense in $\mathcal{P}$ in the uniform norm. Equivalently, it suffices to show that $\Phi^{-1}(\operatorname{Span} \mathcal{B})$ is dense in $C(\mathbb{T})$, with respect to the uniform norm. To get this density use the Stone-Weierstrass Theorem, plus the fact that the functions $\zeta_n = \Phi^{-1}(e_n) \in C(\mathbb{T})$ are defined by

\[ \zeta_n(z) = z^n, \quad \forall\, z \in \mathbb{T},\ n \in \mathbb{Z}. \]
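The orthonormality part of the exercise can be sanity-checked numerically. The following is a sketch, not a proof: the integral is replaced by a Riemann sum on a uniform grid (the grid size `m` and the range of indices are arbitrary choices), and the inner product is conjugate-linear in the first variable, as above.

```python
import cmath

def e(n):
    """The function e_n(t) = exp(2 n pi i t) on [0, 1]."""
    return lambda t: cmath.exp(2j * cmath.pi * n * t)

def inner(f, g, m=256):
    """Riemann-sum approximation of the integral over [0, 1] of conj(f) * g."""
    return sum(f(k / m).conjugate() * g(k / m) for k in range(m)) / m

# Gram matrix of e_{-3}, ..., e_3; it should be close to the identity matrix.
indices = range(-3, 4)
gram = {(p, q): inner(e(p), e(q)) for p in indices for q in indices}
max_defect = max(abs(gram[p, q] - (1 if p == q else 0)) for p, q in gram)
```

For these exponentials the uniform Riemann sum is exact up to rounding, so `max_defect` is at the level of floating point error.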

Example 6.3. We define $L^2([0, 1])$ to be the completion of $C([0, 1])$ with respect to the norm $\|\cdot\|_2$. Regard $C([0, 1])$ as a dense linear subspace in $L^2([0, 1])$, so we also regard

\[ \mathcal{B} = \{ e_n : n \in \mathbb{Z} \} \]

as a subset in $L^2([0, 1])$. Then $\operatorname{Span} \mathcal{B}$ is dense in $L^2([0, 1])$, so $\mathcal{B}$ is an orthonormal basis for $L^2([0, 1])$.

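Lemma 6.3. Let $\mathcal{B}$ be an orthonormal basis for the Hilbert space $\mathcal{H}$, and let $\mathcal{F} \subsetneq \mathcal{B}$ be an arbitrary non-empty subset.

(i) $\mathcal{F}$ is an orthonormal basis for the Hilbert space $\overline{\operatorname{Span} \mathcal{F}}$.

(ii) $\big( \overline{\operatorname{Span} \mathcal{F}} \big)^\perp = \overline{\operatorname{Span}(\mathcal{B} \setminus \mathcal{F})}$.

Proof. (i). This is clear, since $\mathcal{F}$ is orthonormal and has dense span.

(ii). Denote for simplicity $\overline{\operatorname{Span} \mathcal{F}} = \mathcal{X}$ and $\overline{\operatorname{Span}(\mathcal{B} \setminus \mathcal{F})} = \mathcal{Y}$. Since

\[ \xi \perp \eta, \quad \forall\, \xi \in \mathcal{F},\ \eta \in \mathcal{B} \setminus \mathcal{F}, \]

it follows that $\mathcal{X} \perp \mathcal{Y}$. Since $\mathcal{X} + \mathcal{Y}$ clearly contains $\operatorname{Span} \mathcal{B}$, it follows that $\mathcal{X} + \mathcal{Y}$ is dense in $\mathcal{H}$. We know however that $\mathcal{X} + \mathcal{Y}$ is closed, so we have in fact the equality

\[ \mathcal{X} + \mathcal{Y} = \mathcal{H}. \]

This will then give

\[ I = P_{\mathcal{H}} = P_{\mathcal{X}} + P_{\mathcal{Y}}, \]

so we get

\[ P_{\mathcal{Y}} = I - P_{\mathcal{X}} = P_{\mathcal{X}^\perp}, \]

so $\mathcal{X}^\perp = \operatorname{Ran} P_{\mathcal{X}^\perp} = \operatorname{Ran} P_{\mathcal{Y}} = \mathcal{Y}$.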

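Theorem 6.3. Let $\mathcal{H}$ be a Hilbert space, and let $\mathcal{B}$ be an orthonormal basis for $\mathcal{H}$, labelled$^{6}$ as $\mathcal{B} = \{ \xi_j : j \in I \}$. For every vector $\eta \in \mathcal{H}$, let $\alpha_\eta : I \to \mathbb{K}$ be the map defined by

\[ \alpha_\eta(j) = (\xi_j \mid \eta), \quad \forall\, j \in I. \]

(i) For every $\eta \in \mathcal{H}$, the map $\alpha_\eta$ belongs to $\ell^2_{\mathbb{K}}(I)$.

(ii) The map

\[ T : \mathcal{H} \ni \eta \longmapsto \alpha_\eta \in \ell^2_{\mathbb{K}}(I) \]

is an isometric linear isomorphism.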

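Proof. (i). Fix for the moment $\eta \in \mathcal{H}$. We must show that

\[ \sup \Big\{ \sum_{j \in F} |\alpha_\eta(j)|^2 : F \subset I,\ \text{finite} \Big\} < \infty. \]

For any non-empty finite subset $F \subset I$, we define the subspace

\[ \mathcal{H}_F = \operatorname{Span}\{ \xi_j : j \in F \}, \]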

6 This notation implicitly assumes that $\xi_j \neq \xi_k$, for all $j, k \in I$ with $j \neq k$.




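and define the vector

\[ \eta_F = \sum_{j \in F} (\xi_j \mid \eta) \cdot \xi_j. \]

Claim: For every finite set $F \subset I$, one has the equality

\[ \eta_F = P_{\mathcal{H}_F}\eta. \]

Since $\eta_F \in \mathcal{H}_F$, it suffices to prove that $(\eta - \eta_F) \perp \mathcal{H}_F$. But this is obvious, since if we start with some $k \in F$, then using the fact that $(\xi_k \mid \xi_j) = 0$, for all $j \in F \setminus \{k\}$, together with the equality $\|\xi_k\| = 1$, we get

\[ (\xi_k \mid \eta - \eta_F) = (\xi_k \mid \eta) - \sum_{j \in F} (\xi_j \mid \eta) \cdot (\xi_k \mid \xi_j) = (\xi_k \mid \eta) - (\xi_k \mid \eta) \cdot (\xi_k \mid \xi_k) = 0. \]

Having proven the Claim, let us observe that, since the terms in the sum that defines $\eta_F$ are all orthogonal, we get

\[ \|\eta_F\|^2 = \sum_{j \in F} \|(\xi_j \mid \eta) \cdot \xi_j\|^2 = \sum_{j \in F} |(\xi_j \mid \eta)|^2 \cdot \|\xi_j\|^2 = \sum_{j \in F} |\alpha_\eta(j)|^2. \]

Combining this computation with the Claim, we now have

\[ \sum_{j \in F} |\alpha_\eta(j)|^2 = \|\eta_F\|^2 = \|P_{\mathcal{H}_F}\eta\|^2 \leq \|\eta\|^2, \]

which proves that

\[ \sup \Big\{ \sum_{j \in F} |\alpha_\eta(j)|^2 : F \subset I,\ \text{finite} \Big\} \leq \|\eta\|^2 < \infty. \]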

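(ii). The linearity of $T$ is obvious. The above inequality actually proves that

\[ \|T\eta\| \leq \|\eta\|, \quad \forall\, \eta \in \mathcal{H}. \]

We now prove that in fact $T$ is isometric. Since $T$ is linear and continuous, and $\operatorname{Span} \mathcal{B}$ is dense in $\mathcal{H}$, it suffices to prove that $T\big|_{\operatorname{Span} \mathcal{B}}$ is isometric. Start with some vector $\eta \in \operatorname{Span} \mathcal{B}$, which means that there exist some finite set $F \subset I$, and scalars $(\lambda_k)_{k \in F} \subset \mathbb{K}$, such that $\eta = \sum_{k \in F} \lambda_k\xi_k$. Remark that

\[ (\xi_j \mid \eta) = \sum_{k \in F} \lambda_k (\xi_j \mid \xi_k) = \begin{cases} \lambda_j & \text{if } j \in F \\ 0 & \text{if } j \notin F \end{cases} \]

so the element $\alpha_\eta = T\eta \in \ell^2_{\mathbb{K}}(I)$ is defined by

\[ \alpha_\eta(k) = \begin{cases} \lambda_k & \text{if } k \in F \\ 0 & \text{if } k \notin F \end{cases} \]

This gives

\[ \|\eta\|^2 = \sum_{j,k \in F} \overline{\lambda_j}\lambda_k (\xi_j \mid \xi_k) = \sum_{k \in F} |\lambda_k|^2 = \sum_{k \in F} |\alpha_\eta(k)|^2 = \|\alpha_\eta\|^2, \]

so we indeed get

\[ \|\eta\| = \|T\eta\|, \quad \forall\, \eta \in \operatorname{Span} \mathcal{B}. \]

Let us prove that $T$ is surjective. Notice that the above computation, applied to singleton sets $F = \{k\}$, $k \in I$, proves that

\[ T\xi_k = \delta_k, \quad \forall\, k \in I. \]

In particular, we have

\[ \operatorname{Ran} T \supset T(\operatorname{Span} \mathcal{B}) = \operatorname{Span} T(\mathcal{B}) = \operatorname{Span}\{ T\xi_k : k \in I \} = \operatorname{Span}\{ \delta_k : k \in I \} = \operatorname{fin}_{\mathbb{K}}(I), \]

which proves that $\operatorname{Ran} T$ is dense in $\ell^2_{\mathbb{K}}(I)$. We know however that $T$ is isometric, so $\operatorname{Ran} T \subset \ell^2_{\mathbb{K}}(I)$ is closed. This forces $\operatorname{Ran} T = \ell^2_{\mathbb{K}}(I)$.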

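Corollary 6.6 (Parseval Identity). Let $\mathcal{H}$ be a Hilbert space, and let $\mathcal{B} = \{ \xi_j : j \in I \}$ be an orthonormal basis for $\mathcal{H}$. One has:

\[ (\zeta \mid \eta) = \sum_{j \in I} (\zeta \mid \xi_j) \cdot (\xi_j \mid \eta), \quad \forall\, \zeta, \eta \in \mathcal{H}. \]

Proof. If we define $\alpha(j) = (\xi_j \mid \zeta)$ and $\beta(j) = (\xi_j \mid \eta)$, $\forall\, j \in I$, then by construction we have $\alpha = T\zeta$ and $\beta = T\eta$. Using the fact that $T$ is isometric, the right hand side of the above equality is then equal to

\[ \sum_{j \in I} \overline{\alpha(j)}\,\beta(j) = (\alpha \mid \beta) = (T\zeta \mid T\eta) = (\zeta \mid \eta). \]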
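A finite dimensional instance makes the identity concrete. The sketch below works in $\mathbb{C}^2$ with one particular orthonormal basis; the basis and the two test vectors are hypothetical choices made only for illustration.

```python
import math

def inner(x, y):
    """Inner product on C^n, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

s = 1 / math.sqrt(2)
basis = [[s, s * 1j], [s, -s * 1j]]    # an orthonormal basis of C^2

zeta = [1 + 1j, 2 - 3j]
eta = [0.5j, -4.0]

# Coefficient maps alpha = T(zeta), beta = T(eta), as in the theorem above.
alpha = [inner(xi, zeta) for xi in basis]
beta = [inner(xi, eta) for xi in basis]

lhs = inner(zeta, eta)
rhs = sum(inner(zeta, xi) * inner(xi, eta) for xi in basis)
norm_gap = abs(inner(alpha, alpha) - inner(zeta, zeta))   # T is isometric
```

Here `rhs` is the right hand side of the Parseval Identity, and `inner(alpha, beta)` gives the same number, since $(\zeta \mid \xi_j) = \overline{\alpha(j)}$.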

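Notation. Let $\mathcal{H}$ be a Hilbert space, let $\mathcal{B} = \{ \xi_j : j \in I \}$ be an orthonormal basis for $\mathcal{H}$, and let $T : \mathcal{H} \to \ell^2_{\mathbb{K}}(I)$ be the isometric linear isomorphism defined in the previous theorem. Given an element $\alpha \in \ell^2_{\mathbb{K}}(I)$, we denote the vector $T^{-1}\alpha \in \mathcal{H}$ by

\[ \sum_{j \in I} \alpha(j)\xi_j. \]

The summation notation is justified by the following fact.

Proposition 6.6. With the above notations, for every $\varepsilon > 0$, there exists some finite subset $F_\varepsilon \subset I$, such that

\[ \Big\| \sum_{j \in I} \alpha(j)\xi_j - \sum_{k \in F} \alpha(k)\xi_k \Big\|^2 < \varepsilon, \quad \text{for all finite sets } F \subset I \text{ with } F \supset F_\varepsilon. \]

Proof. Define the vector $\eta = \sum_{j \in I} \alpha(j)\xi_j$. By construction we have $T\eta = \alpha$. Likewise, if we define, for each finite set $F \subset I$, the element $\alpha_F \in \ell^2_{\mathbb{K}}(I)$ by

\[ \alpha_F(k) = \begin{cases} \alpha(k) & \text{if } k \in F \\ 0 & \text{if } k \in I \setminus F \end{cases} \]

then $T^{-1}\alpha_F = \sum_{k \in F} \alpha(k)\xi_k$. Using the fact that $T$ is an isometry, we have

\[ \|\eta - T^{-1}\alpha_F\| = \|T\eta - \alpha_F\| = \|\alpha - \alpha_F\|, \]

and the desired property follows from the well-known properties of $\ell^2_{\mathbb{K}}(I)$.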

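Exercise 5. Let $\mathcal{H}$ be a Hilbert space, let $\mathcal{F} = \{ \xi_j : j \in J \}$ be an orthonormal set. Define the closed linear subspace $\mathcal{H}_{\mathcal{F}} = \overline{\operatorname{Span} \mathcal{F}}$. Prove that the orthogonal projection $P_{\mathcal{H}_{\mathcal{F}}}$ is defined by

\[ P_{\mathcal{H}_{\mathcal{F}}}\eta = \sum_{j \in J} (\xi_j \mid \eta)\,\xi_j, \quad \forall\, \eta \in \mathcal{H}. \]

Hints: Extend $\mathcal{F}$ to an orthonormal basis $\mathcal{B}$. Let $\mathcal{B}$ be labelled as $\{ \xi_i : i \in I \}$ for some set $I \supset J$. First prove that for any $\eta \in \mathcal{H}$, the map $\beta_\eta = T\eta\big|_J$ belongs to $\ell^2_{\mathbb{K}}(J)$. In particular, the sum

\[ \eta_{\mathcal{F}} = \sum_{j \in J} (\xi_j \mid \eta)\,\xi_j \]

is “legitimate” and defines an element in $\mathcal{H}_{\mathcal{F}}$ (use the fact that $\mathcal{F}$ is an orthonormal basis for $\mathcal{H}_{\mathcal{F}}$). Finally, prove that $(\eta - \eta_{\mathcal{F}}) \perp \mathcal{F}$, using the Parseval Identity.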

Example 6.4. Let us analyze the space $L^2([0, 1])$. Use the orthonormal basis $\{ e_n : n \in \mathbb{Z} \}$ defined by

\[ e_n(t) = \exp(2n\pi i t), \quad \forall\, t \in [0, 1],\ n \in \mathbb{Z}. \]

For any $f \in C([0, 1])$ we define

\[ \hat{f}(n) = \int_0^1 \exp(-2n\pi i t)\, f(t)\, dt = (e_n \mid f). \]

We then know that

\[ f = \sum_{n \in \mathbb{Z}} \hat{f}(n)\, e_n. \]

One can think of the right hand side as a series, but the reader should be aware of the fact that this series is convergent only in the norm $\|\cdot\|_2$. One can define for example, for any $N \geq 1$, a partial sum $f_N : [0, 1] \to \mathbb{C}$ by

\[ f_N(t) = \sum_{n=-N}^{N} \hat{f}(n) \exp(2n\pi i t), \quad t \in [0, 1]. \]

We will have

\[ \lim_{N \to \infty} \|f - f_N\|_2 = 0, \]

but in general there are (many) values of $t \in [0, 1]$ for which the limit $\lim_{N \to \infty} f_N(t)$ does not exist. One can consider a formal infinite series

\[ \sum_{n=-\infty}^{\infty} \hat{f}(n) \exp(2n\pi i t). \tag{7} \]

Although this series is not convergent (pointwise) for all $t \in [0, 1]$, it plays an important role in analysis. The series (7) is called the complex Fourier series of $f$.

Note that Parseval’s Identity gives

\[ \int_0^1 \overline{f(t)}\, g(t)\, dt = \sum_{n=-\infty}^{\infty} \overline{\hat{f}(n)}\, \hat{g}(n). \]
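The difference between norm convergence and pointwise behaviour can be seen numerically. The sketch below uses the sample function $f(t) = t$ (a hypothetical choice, as are the grid size and the values of $N$) and approximates all integrals by the midpoint rule, so the numbers are approximations only.

```python
import cmath
import math

M = 2000                                    # number of midpoint nodes (arbitrary)
nodes = [(k + 0.5) / M for k in range(M)]

def f(t):
    return t                                # sample function; note f(0) != f(1)

def fhat(n):
    """Midpoint-rule approximation of the n-th Fourier coefficient of f."""
    return sum(cmath.exp(-2j * math.pi * n * t) * f(t) for t in nodes) / M

def partial_sum(coeffs, t):
    return sum(c * cmath.exp(2j * math.pi * n * t) for n, c in coeffs.items())

def l2_error(N):
    """Approximate norm of f - f_N in the 2-norm."""
    coeffs = {n: fhat(n) for n in range(-N, N + 1)}
    total = sum(abs(f(t) - partial_sum(coeffs, t)) ** 2 for t in nodes)
    return math.sqrt(total / M)

errors = [l2_error(N) for N in (1, 4, 16)]  # decreasing, consistent with norm convergence
coeffs16 = {n: fhat(n) for n in range(-16, 17)}
value_at_zero = partial_sum(coeffs16, 0.0)  # stays near 1/2, while f(0) = 0
```

For this $f$ one has $\hat{f}(n) = i/(2n\pi)$ for $n \neq 0$, so the partial sums at $t = 0$ all equal $\hat{f}(0) = 1/2$: the series converges to $f$ in $\|\cdot\|_2$ without converging to $f(0)$ at that point.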

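One can construct another orthonormal basis for $L^2([0, 1])$, by taking real and imaginary parts of $e_n$. More explicitly, we define the sequences of functions $(g_n)_{n=0}^\infty$ and $(h_n)_{n=1}^\infty$ by

\[ g_0(t) = 1, \quad \forall\, t \in [0, 1]; \]

\[ g_n(t) = \sqrt{2} \cos(2n\pi t), \quad \forall\, t \in [0, 1],\ n \geq 1; \]

\[ h_n(t) = \sqrt{2} \sin(2n\pi t), \quad \forall\, t \in [0, 1],\ n \geq 1. \]

Then $\mathcal{B}' = \{ g_n : n \geq 0 \} \cup \{ h_n : n \geq 1 \}$ is again an orthonormal basis for $L^2([0, 1])$. (It is clear that $\mathcal{B}'$ is orthonormal, and $\operatorname{Span} \mathcal{B}' \ni e_n$, $\forall\, n \in \mathbb{Z}$, so $\operatorname{Span} \mathcal{B}'$ is dense in $L^2([0, 1])$.) For $f \in C([0, 1])$ one can then define its real Fourier series

\[ \hat{f}(0) + \sum_{n=1}^{\infty} \big[ a_n \cos(2n\pi t) + b_n \sin(2n\pi t) \big], \]

where

\[ a_n = \sqrt{2} \int_0^1 f(t) \cos(2n\pi t)\, dt \quad \text{and} \quad b_n = \sqrt{2} \int_0^1 f(t) \sin(2n\pi t)\, dt, \quad \forall\, n \geq 1. \]

Note that

\[ a_n = \frac{\sqrt{2}}{2} \big[ \hat{f}(-n) + \hat{f}(n) \big] \quad \text{and} \quad b_n = \frac{\sqrt{2}}{2i} \big[ \hat{f}(-n) - \hat{f}(n) \big], \quad \forall\, n \geq 1. \]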

The next result discusses the appropriate notion of dimension for Hilbert spaces.

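Theorem 6.4. Let $\mathcal{H}$ be a Hilbert space. Then any two orthonormal bases of $\mathcal{H}$ have the same cardinality.

Proof. Fix two orthonormal bases $\mathcal{B}$ and $\mathcal{B}'$. There are two possible cases.

Case I: One of the sets $\mathcal{B}$ or $\mathcal{B}'$ is finite.

In this case $\mathcal{H}$ is finite dimensional, since the linear span of a finite set is automatically closed. Since both $\mathcal{B}$ and $\mathcal{B}'$ are linearly independent, it follows that both $\mathcal{B}$ and $\mathcal{B}'$ are finite, hence their linear spans are both closed. It follows that

\[ \operatorname{Span} \mathcal{B} = \operatorname{Span} \mathcal{B}' = \mathcal{H}, \]

so $\mathcal{B}$ and $\mathcal{B}'$ are in fact linear bases for $\mathcal{H}$, and then we get

\[ \operatorname{Card} \mathcal{B} = \operatorname{Card} \mathcal{B}' = \dim \mathcal{H}. \]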

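Case II: Both $\mathcal{B}$ and $\mathcal{B}'$ are infinite.

The key step we need in this case is the following.

Claim 1: There exists a dense subset $\mathcal{Z} \subset \mathcal{H}$, with

\[ \operatorname{Card} \mathcal{Z} = \operatorname{Card} \mathcal{B}'. \]

To prove this fact, we define the set

\[ \mathcal{X} = \operatorname{Span}_{\mathbb{Q}} \mathcal{B}'. \]

It is clear that

\[ \operatorname{Card} \mathcal{X} = \operatorname{Card} \mathcal{B}'. \]

Notice that $\mathcal{X}$ is dense in $\operatorname{Span}_{\mathbb{R}} \mathcal{B}'$. If we work over $\mathbb{K} = \mathbb{R}$, then we are done, by taking $\mathcal{Z} = \mathcal{X}$. If we work over $\mathbb{K} = \mathbb{C}$, we define

\[ \mathcal{Z} = \mathcal{X} + i\mathcal{X}, \]

and we will still have

\[ \operatorname{Card} \mathcal{Z} = \operatorname{Card} \mathcal{X} = \operatorname{Card} \mathcal{B}'. \]

Now we are done, since clearly $\mathcal{Z}$ is dense in $\operatorname{Span}_{\mathbb{C}} \mathcal{B}'$, which in turn is dense in $\mathcal{H}$.

Choose $\mathcal{Z}$ as in Claim 1. For every $\xi \in \mathcal{B}$ we choose a vector $\zeta_\xi \in \mathcal{Z}$, such that

\[ \|\xi - \zeta_\xi\| \leq \frac{\sqrt{2} - 1}{2}. \]

Claim 2: The map $\mathcal{B} \ni \xi \longmapsto \zeta_\xi \in \mathcal{Z}$ is injective.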




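Start with two vectors $\xi_1, \xi_2 \in \mathcal{B}$, such that $\xi_1 \neq \xi_2$. In particular, $\xi_1 \perp \xi_2$, so we also have $\xi_1 \perp (-\xi_2)$, and using the Pythagorean Theorem we get

\[ \|\xi_1 - \xi_2\|^2 = \|\xi_1\|^2 + \|-\xi_2\|^2 = 2, \]

which gives

\[ \|\xi_1 - \xi_2\| = \sqrt{2}. \]

Using the triangle inequality, we now have

\[ \sqrt{2} = \|\xi_1 - \xi_2\| \leq \|\xi_1 - \zeta_{\xi_1}\| + \|\xi_2 - \zeta_{\xi_2}\| + \|\zeta_{\xi_1} - \zeta_{\xi_2}\| \leq \sqrt{2} - 1 + \|\zeta_{\xi_1} - \zeta_{\xi_2}\|. \]

This gives

\[ \|\zeta_{\xi_1} - \zeta_{\xi_2}\| \geq 1, \]

which forces $\zeta_{\xi_1} \neq \zeta_{\xi_2}$.

Using Claim 2, we have constructed an injective map $\mathcal{B} \to \mathcal{Z}$. In particular, using Claim 1 and the cardinal arithmetic rules, we get

\[ \operatorname{Card} \mathcal{B} \leq \operatorname{Card} \mathcal{Z} = \operatorname{Card} \mathcal{B}'. \]

By symmetry we also have

\[ \operatorname{Card} \mathcal{B}' \leq \operatorname{Card} \mathcal{B}, \]

and then using the Cantor-Bernstein Theorem, we finally get

\[ \operatorname{Card} \mathcal{B} = \operatorname{Card} \mathcal{B}'. \]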

Corollary 6.7 (of the proof). A Hilbert space is separable, in the norm topology, if and only if it has an orthonormal basis which is at most countable.

Proof. Use Claims 1 and 2 from the proof of the Theorem.

Definition. Let $\mathcal{H}$ be a Hilbert space, and let $\mathcal{B}$ be an orthonormal basis for $\mathcal{H}$. By the above theorem, the cardinal number $\operatorname{Card} \mathcal{B}$ does not depend on the choice of $\mathcal{B}$. This number is called the hilbertian (or orthogonal) dimension of $\mathcal{H}$, and is denoted by $\text{h-dim}\, \mathcal{H}$.

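Corollary 6.8. For two Hilbert spaces $\mathcal{H}$ and $\mathcal{H}'$, the following are equivalent:

(i) $\text{h-dim}\, \mathcal{H} = \text{h-dim}\, \mathcal{H}'$;
(ii) There exists an isometric linear isomorphism $U : \mathcal{H} \to \mathcal{H}'$.

Proof. (i) ⇒ (ii). Choose a set $I$ with $\text{h-dim}\, \mathcal{H} = \text{h-dim}\, \mathcal{H}' = \operatorname{Card} I$. Apply Theorem ?? to produce isometric linear isomorphisms $T : \mathcal{H} \to \ell^2_{\mathbb{K}}(I)$ and $T' : \mathcal{H}' \to \ell^2_{\mathbb{K}}(I)$. Then define $U = T'^{-1} \circ T$.

(ii) ⇒ (i). Assume one has an isometric linear isomorphism $U : \mathcal{H} \to \mathcal{H}'$. Choose an orthonormal basis $\mathcal{B}$ for $\mathcal{H}$. Then $U(\mathcal{B})$ is clearly an orthonormal basis for $\mathcal{H}'$, and since $U : \mathcal{B} \to U(\mathcal{B})$ is bijective, we get

\[ \text{h-dim}\, \mathcal{H} = \operatorname{Card} \mathcal{B} = \operatorname{Card} U(\mathcal{B}) = \text{h-dim}\, \mathcal{H}'. \]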



Chapter III

Measure Theory



138 LECTURE 18

This is known as the Inclusion-Exclusion Principle.

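Definition. Let $X$ be a non-empty set, and let $\mathbb{K}$ be one of the fields$^{7}$ $\mathbb{Q}$, $\mathbb{R}$ or $\mathbb{C}$. A function $\phi : X \to \mathbb{K}$ is said to be elementary, if its range $\phi(X)$ is finite. Remark that this gives

\[ \phi = \sum_{\lambda \in \phi(X)} \lambda \cdot \kappa_{\phi^{-1}(\{\lambda\})} = \sum_{\lambda \in \phi(X) \setminus \{0\}} \lambda \cdot \kappa_{\phi^{-1}(\{\lambda\})}. \]

We define

\[ \operatorname{Elem}_{\mathbb{K}}(X) = \{ \phi : X \to \mathbb{K} : \phi \text{ elementary} \}. \]

Given a collection $\mathcal{M} \subset \mathcal{P}(X)$, a function $\phi : X \to \mathbb{K}$ is said to be $\mathcal{M}$-elementary, if $\phi$ is elementary, and moreover,

\[ \phi^{-1}(\{\lambda\}) \in \mathcal{M}, \quad \forall\, \lambda \in \mathbb{K} \setminus \{0\}. \]

We define

\[ \mathcal{M}\text{-}\operatorname{Elem}_{\mathbb{K}}(X) = \{ \phi : X \to \mathbb{K} : \phi\ \mathcal{M}\text{-elementary} \}. \]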

Exercise 2 . With the above notations, prove that ElemK(X ) is a unital K-algebra.

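Proposition 1.1. Given a non-empty set $X$, the collection $\mathcal{P}(X)$ is a unital ring, with the operations

\[ A + B = A \triangle B \quad \text{and} \quad A \cdot B = A \cap B, \quad A, B \in \mathcal{P}(X). \]

Proof. First of all, it is clear that $\triangle$ is commutative. To prove the associativity of $\triangle$, we use the identity $\kappa_{A \triangle B} = \kappa_A + \kappa_B - 2\kappa_A\kappa_B$, and we simply observe that

\[ \kappa_{(A \triangle B) \triangle C} = \kappa_{A \triangle B} + \kappa_C - 2\kappa_{A \triangle B}\kappa_C = \]

\[ = \kappa_A + \kappa_B - 2\kappa_A\kappa_B + \kappa_C - 2(\kappa_A + \kappa_B - 2\kappa_A\kappa_B) \cdot \kappa_C = \]

\[ = \kappa_A + \kappa_B + \kappa_C - 2(\kappa_A\kappa_B + \kappa_A\kappa_C + \kappa_B\kappa_C) + 4\kappa_A\kappa_B\kappa_C. \]

Since the final result is symmetric in $A, B, C$, we see that we get

\[ \kappa_{A \triangle (B \triangle C)} = \kappa_{(A \triangle B) \triangle C}, \]

so we indeed get

\[ (A \triangle B) \triangle C = A \triangle (B \triangle C). \]

The neutral element for $\triangle$ is the empty set $\emptyset$. Since we obviously have $A \triangle A = \emptyset$, it follows that $\big( \mathcal{P}(X), \triangle \big)$ is indeed an abelian group.

The operation $\cap$ is clearly commutative, associative, and has the total set $X$ as the unit.

To check distributivity, we again use characteristic functions:

\[ \kappa_{(A \cap C) \triangle (B \cap C)} = \kappa_{A \cap C} + \kappa_{B \cap C} - 2\kappa_{A \cap C}\kappa_{B \cap C} = \]

\[ = \kappa_A\kappa_C + \kappa_B\kappa_C - 2\kappa_A\kappa_B\kappa_C = (\kappa_A + \kappa_B - 2\kappa_A\kappa_B)\kappa_C = \kappa_{A \triangle B}\kappa_C = \kappa_{(A \triangle B) \cap C}, \]

so we indeed have the equality

\[ (A \cap C) \triangle (B \cap C) = (A \triangle B) \cap C. \]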
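For a small set the ring axioms can be verified exhaustively. The sketch below does this for a three-element set (a hypothetical choice), using Python's built-in `^` (symmetric difference) and `&` (intersection) on `frozenset`.

```python
from itertools import combinations, product

X = frozenset({1, 2, 3})
subsets = [frozenset(c) for r in range(len(X) + 1) for c in combinations(sorted(X), r)]

def add(a, b):
    """Ring addition: symmetric difference."""
    return a ^ b

def mul(a, b):
    """Ring multiplication: intersection."""
    return a & b

ring_ok = all(
    add(add(a, b), c) == add(a, add(b, c))                 # associativity of +
    and mul(add(a, b), c) == add(mul(a, c), mul(b, c))     # distributivity
    and add(a, frozenset()) == a                           # neutral element
    and add(a, a) == frozenset()                           # every element is its own inverse
    and mul(a, X) == a                                     # unit
    for a, b, c in product(subsets, repeat=3)
)
```

This checks all $8^3$ triples of subsets; it illustrates the Proposition but of course does not replace the proof with characteristic functions.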

7 K can be any field.




Definitions. Let $X$ be a non-empty set. A ring on $X$ is a non-empty sub-ring $\mathcal{R} \subset \mathcal{P}(X)$. We do not require the unit $X$ to belong to $\mathcal{R}$, but we do require $\emptyset \in \mathcal{R}$.

An algebra on X is a ring A which contains the unit X .

Rings and algebras of sets are characterized as follows.

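Proposition 1.2. Let $X$ be a non-empty set.

A. For a non-empty collection $\mathcal{R} \subset \mathcal{P}(X)$, the following are equivalent:

(i) $\mathcal{R}$ is a ring on $X$;
(ii) For any $A, B \in \mathcal{R}$, we have $A \setminus B \in \mathcal{R}$ and $A \cup B \in \mathcal{R}$.

B. For a non-empty collection $\mathcal{A} \subset \mathcal{P}(X)$, the following are equivalent:

(i) $\mathcal{A}$ is an algebra on $X$;
(ii) For any $A \in \mathcal{A}$, we have $X \setminus A \in \mathcal{A}$, and for any $A, B \in \mathcal{A}$, we have $A \cup B \in \mathcal{A}$.

Proof. A. (i) ⇒ (ii). Assume $\mathcal{R}$ is a ring on $X$, and let $A, B \in \mathcal{R}$. Then $A \cap B$ belongs to $\mathcal{R}$, so

\[ A \setminus B = A \triangle (A \cap B) \]

also belongs to $\mathcal{R}$. It then follows that

\[ A \cup B = (A \triangle B) \triangle (A \cap B) \]

again belongs to $\mathcal{R}$.

(ii) ⇒ (i). Assume $\mathcal{R}$ satisfies property (ii). Start with $A, B \in \mathcal{R}$. Then $A \setminus B$ belongs to $\mathcal{R}$, and

\[ A \cap B = A \setminus (A \setminus B) \]

again belongs to $\mathcal{R}$. Since $A \cup B$ also belongs to $\mathcal{R}$, it follows that the set

\[ A \triangle B = (A \cup B) \setminus (A \cap B) \]

again belongs to $\mathcal{R}$.

B. (i) ⇒ (ii). This is clear from the implication A.(i) ⇒ (ii).

(ii) ⇒ (i). Assume $\mathcal{A}$ satisfies property (ii). Start with two sets $A, B \in \mathcal{A}$. Then the complements $X \setminus A$ and $X \setminus B$ both belong to $\mathcal{A}$, hence their union

\[ (X \setminus A) \cup (X \setminus B) = X \setminus (A \cap B) \]

belongs to $\mathcal{A}$, and the complement of this union

\[ X \setminus \big[ X \setminus (A \cap B) \big] = A \cap B \]

will also belong to $\mathcal{A}$.

If $A, B \in \mathcal{A}$, then since $X \setminus B$ belongs to $\mathcal{A}$, by the above considerations, it follows that the intersection

\[ A \cap (X \setminus B) = A \setminus B \]

also belongs to $\mathcal{A}$. Likewise, the difference $B \setminus A$ also belongs to $\mathcal{A}$, hence the union

\[ (A \setminus B) \cup (B \setminus A) = A \triangle B \]

also belongs to $\mathcal{A}$. By part A, it follows that $\mathcal{A}$ is a ring.

Finally, since $\mathcal{A}$ is non-empty, if we choose some $A \in \mathcal{A}$, then $A \triangle A = \emptyset$ belongs to $\mathcal{A}$, so its complement $X \setminus \emptyset = X$ also belongs to $\mathcal{A}$.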

It will be useful to introduce the following terminology.

Definition. A system of sets $(A_i)_{i \in I}$ is said to be pair-wise disjoint, if $A_i \cap A_j = \emptyset$, for all $i, j \in I$ with $i \neq j$.




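Lemma 1.1. Let $X$ be a non-empty set, let $\mathbb{K}$ be one of the fields $\mathbb{Q}$, $\mathbb{R}$ or $\mathbb{C}$, and let $\mathcal{R}$ be a ring on $X$. For a function $\phi : X \to \mathbb{K}$, the following are equivalent:

(i) $\phi$ is $\mathcal{R}$-elementary;
(ii) there exist an integer $n \geq 1$, sets $A_1, \dots, A_n \in \mathcal{R}$, and numbers $\lambda_1, \dots, \lambda_n \in \mathbb{K}$, such that

\[ \phi = \lambda_1\kappa_{A_1} + \cdots + \lambda_n\kappa_{A_n}; \]

(iii) there exist an integer $m \geq 1$, a finite pair-wise disjoint system $(B_j)_{j=1}^m \subset \mathcal{R}$, and numbers $\mu_1, \dots, \mu_m \in \mathbb{K}$, such that

\[ \phi = \mu_1\kappa_{B_1} + \cdots + \mu_m\kappa_{B_m}. \]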

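Proof. (i) ⇒ (ii). Assume $\phi$ is $\mathcal{R}$-elementary. If $\phi = 0$, there is nothing to prove, because we have $\phi = \kappa_\emptyset$. If $\phi$ is not identically zero, then we can obviously write

\[ \phi = \sum_{\lambda \in \phi(X) \setminus \{0\}} \lambda\, \kappa_{\phi^{-1}(\{\lambda\})}, \]

with all sets $\phi^{-1}(\{\lambda\})$ in $\mathcal{R}$.

(ii) ⇒ (iii). Define

\[ \mathcal{E} = \{ \psi : X \to \mathbb{K} : \psi \text{ satisfies property (iii)} \}. \]

Assume $\phi$ satisfies (ii), i.e.

\[ \phi = \lambda_1\kappa_{A_1} + \cdots + \lambda_n\kappa_{A_n}, \]

with $A_1, \dots, A_n \in \mathcal{R}$ and $\lambda_1, \dots, \lambda_n \in \mathbb{K}$. We are going to prove that $\phi \in \mathcal{E}$, by induction on $n$. The case $n = 1$ is trivial (either $\phi = 0$, so $\phi = \kappa_\emptyset \in \mathcal{E}$, or $\phi = \lambda\kappa_A$ for some $A \in \mathcal{R}$ and $\lambda \neq 0$, in which case we also have $\phi \in \mathcal{E}$).

Assume

\[ \alpha_1\kappa_{D_1} + \cdots + \alpha_k\kappa_{D_k} \in \mathcal{E}, \]

for all $D_1, \dots, D_k \in \mathcal{R}$, $\alpha_1, \dots, \alpha_k \in \mathbb{K}$. Start with a function

\[ \phi = \lambda_1\kappa_{A_1} + \cdots + \lambda_k\kappa_{A_k} + \lambda_{k+1}\kappa_{A_{k+1}}, \]

with $A_1, \dots, A_{k+1} \in \mathcal{R}$ and $\lambda_1, \dots, \lambda_{k+1} \in \mathbb{K}$, and based on the above inductive hypothesis, let us show that $\phi \in \mathcal{E}$. Using the inductive hypothesis, the function

\[ \psi = \lambda_2\kappa_{A_2} + \cdots + \lambda_k\kappa_{A_k} + \lambda_{k+1}\kappa_{A_{k+1}} \]

belongs to $\mathcal{E}$, so there exist an integer $p \geq 1$, scalars $\eta_1, \dots, \eta_p \in \mathbb{K}$, and a pair-wise disjoint system $(C_j)_{j=1}^p \subset \mathcal{R}$, such that

\[ \psi = \eta_1\kappa_{C_1} + \cdots + \eta_p\kappa_{C_p}. \]

With this notation, we have

\[ \phi = \lambda_1\kappa_{A_1} + \eta_1\kappa_{C_1} + \cdots + \eta_p\kappa_{C_p}. \]

Put then

\[ B_{2j-1} = A_1 \cap C_j \quad \text{and} \quad B_{2j} = C_j \setminus A_1, \quad \text{for all } j \in \{1, \dots, p\}; \]

\[ B_{2p+1} = A_1 \setminus (C_1 \cup \cdots \cup C_p). \]

It is clear that $(B_k)_{k=1}^{2p+1} \subset \mathcal{R}$ is pair-wise disjoint. Notice now that the equalities

\[ C_j = B_{2j-1} \cup B_{2j}, \quad \forall\, j \in \{1, \dots, p\}, \]

\[ A_1 = B_1 \cup B_3 \cup \cdots \cup B_{2p+1}, \]

combined with the fact that the $B$’s are pair-wise disjoint, give

\[ \kappa_{C_j} = \kappa_{B_{2j-1}} + \kappa_{B_{2j}}, \quad \forall\, j \in \{1, \dots, p\}, \]

\[ \kappa_{A_1} = \kappa_{B_1} + \kappa_{B_3} + \cdots + \kappa_{B_{2p+1}}, \]

which give

\[ \phi = \sum_{j=1}^{p} \eta_j\kappa_{B_{2j}} + \sum_{j=1}^{p} (\eta_j + \lambda_1)\kappa_{B_{2j-1}} + \lambda_1\kappa_{B_{2p+1}}, \]

which proves that $\phi$ indeed belongs to $\mathcal{E}$.

(iii) ⇒ (i). Assume there exist a finite pair-wise disjoint system $(B_j)_{j=1}^m \subset \mathcal{R}$, and numbers $\mu_1, \dots, \mu_m \in \mathbb{K}$, such that

\[ \phi = \mu_1\kappa_{B_1} + \cdots + \mu_m\kappa_{B_m}, \]

and let us prove that $\phi$ is $\mathcal{R}$-elementary.

If all the $\mu$’s are zero, there is nothing to prove, since $\phi = 0$. Assume the $\mu$’s are not all equal to zero. Since the terms with $\mu_j = 0$ or $B_j = \emptyset$ do not have any contribution, we can in fact assume that all the $\mu$’s are non-zero and all the $B$’s are non-empty. Notice that

\[ \phi(X) \setminus \{0\} = \{ \mu_j : 1 \leq j \leq m \}. \]

In particular $\phi$ is elementary.

If we start with an arbitrary $\lambda \in \mathbb{K} \setminus \{0\}$, then either $\lambda \notin \phi(X)$, or $\lambda \in \phi(X) \setminus \{0\}$. In the first case we clearly have $\phi^{-1}(\{\lambda\}) = \emptyset \in \mathcal{R}$. In the second case, we have the equality

\[ \phi^{-1}(\{\lambda\}) = \bigcup_{j \in M_\lambda} B_j, \]

where

\[ M_\lambda = \{ j : 1 \leq j \leq m \text{ and } \mu_j = \lambda \}. \]

Since all $B$’s belong to $\mathcal{R}$, it follows that $\phi^{-1}(\{\lambda\})$ again belongs to $\mathcal{R}$. Having shown that $\phi$ is elementary, and $\phi^{-1}(\{\lambda\}) \in \mathcal{R}$, for all $\lambda \in \mathbb{K} \setminus \{0\}$, it follows that $\phi$ is indeed $\mathcal{R}$-elementary.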
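The passage from an arbitrary combination of characteristic functions to a pair-wise disjoint one can be illustrated on a finite set. The sketch below does not follow the inductive construction of the proof; it takes the shortcut of grouping points by the non-zero value of the function (the level sets), which yields a pair-wise disjoint representation directly. The sets and coefficients are hypothetical.

```python
def evaluate(terms, x):
    """Value at x of the sum of lam * (characteristic function of A) over (lam, A) in terms."""
    return sum(lam for lam, A in terms if x in A)

def disjoint_form(terms, X):
    """Return pairs (value, level set) for the non-zero values of the function on X."""
    levels = {}
    for x in X:
        v = evaluate(terms, x)
        if v != 0:
            levels.setdefault(v, set()).add(x)
    return [(v, frozenset(B)) for v, B in levels.items()]

X = range(1, 7)
terms = [(2, {1, 2, 3}), (5, {2, 3, 4}), (-2, {3, 5})]   # overlapping sets
disjoint = disjoint_form(terms, X)

same_function = all(evaluate(terms, x) == evaluate(disjoint, x) for x in X)
```

Both representations define the same function on `X`, and the sets appearing in `disjoint` do not overlap.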

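Proposition 1.3. Let $X$ be a non-empty set, and let $\mathbb{K}$ be one of the fields $\mathbb{Q}$, $\mathbb{R}$, or $\mathbb{C}$.

A. For a non-empty collection $\mathcal{R} \subset \mathcal{P}(X)$, the following are equivalent:

(i) $\mathcal{R}$ is a ring on $X$;
(ii) $\mathcal{R}\text{-}\operatorname{Elem}_{\mathbb{K}}(X)$ is a $\mathbb{K}$-subalgebra of $\operatorname{Elem}_{\mathbb{K}}(X)$.

B. For a non-empty collection $\mathcal{A} \subset \mathcal{P}(X)$, the following are equivalent:

(i) $\mathcal{A}$ is an algebra on $X$;
(ii) $\mathcal{A}\text{-}\operatorname{Elem}_{\mathbb{K}}(X)$ is a $\mathbb{K}$-subalgebra of $\operatorname{Elem}_{\mathbb{K}}(X)$, which contains the constant function 1.

Proof. A. (i) ⇒ (ii). Assume $\mathcal{R}$ is a ring on $X$. Using Lemma 1.1 we see that we have the equality:

\[ \mathcal{R}\text{-}\operatorname{Elem}_{\mathbb{K}}(X) = \operatorname{Span}\{ \kappa_A : A \in \mathcal{R} \}. \]

In particular, this shows that $\mathcal{R}\text{-}\operatorname{Elem}_{\mathbb{K}}(X)$ is a $\mathbb{K}$-linear subspace of $\operatorname{Elem}_{\mathbb{K}}(X)$. Moreover, in order to prove that $\mathcal{R}\text{-}\operatorname{Elem}_{\mathbb{K}}(X)$ is a $\mathbb{K}$-subalgebra, it suffices to prove the implication

\[ A, B \in \mathcal{R} \Longrightarrow \kappa_A \cdot \kappa_B \in \mathcal{R}\text{-}\operatorname{Elem}_{\mathbb{K}}(X). \]

But this implication is trivial, since $\kappa_A \cdot \kappa_B = \kappa_{A \cap B}$, and $A \cap B$ belongs to $\mathcal{R}$.

(ii) ⇒ (i). Assume $\mathcal{R}\text{-}\operatorname{Elem}_{\mathbb{K}}(X)$ is a $\mathbb{K}$-subalgebra of $\operatorname{Elem}_{\mathbb{K}}(X)$. First of all, since $\kappa_\emptyset = 0 \in \mathcal{R}\text{-}\operatorname{Elem}_{\mathbb{K}}(X)$, it follows that $\emptyset \in \mathcal{R}$.

Start now with two sets $A, B \in \mathcal{R}$. Then $\kappa_A$ and $\kappa_B$ belong to $\mathcal{R}\text{-}\operatorname{Elem}_{\mathbb{K}}(X)$. Since $\mathcal{R}\text{-}\operatorname{Elem}_{\mathbb{K}}(X)$ is an algebra, the function

\[ \kappa_{A \cap B} = \kappa_A \cdot \kappa_B \]

belongs to $\mathcal{R}\text{-}\operatorname{Elem}_{\mathbb{K}}(X)$, so we immediately see that $A \cap B \in \mathcal{R}$. Likewise, the function

\[ \kappa_{A \triangle B} = \kappa_A + \kappa_B - 2\kappa_A\kappa_B \]

belongs to $\mathcal{R}\text{-}\operatorname{Elem}_{\mathbb{K}}(X)$, so we also get $A \triangle B \in \mathcal{R}$.

B. This equivalence is clear from part A, plus the identity $\kappa_X = 1$.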

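Algebras of elementary functions give in fact a complete description for rings or algebras of sets, as indicated in the result below.

Proposition 1.4. Let $X$ be a non-empty set, and let $\mathbb{K}$ be one of the fields $\mathbb{Q}$, $\mathbb{R}$, or $\mathbb{C}$.

A. The map

\[ \mathcal{R} \longmapsto \mathcal{R}\text{-}\operatorname{Elem}_{\mathbb{K}}(X) \]

is a bijective correspondence between the collection of all rings on $X$ and the collection of all $\mathbb{K}$-subalgebras of $\operatorname{Elem}_{\mathbb{K}}(X)$.

B. The map

\[ \mathcal{A} \longmapsto \mathcal{A}\text{-}\operatorname{Elem}_{\mathbb{K}}(X) \]

is a bijective correspondence between the collection of all algebras on $X$ and the collection of all $\mathbb{K}$-subalgebras of $\operatorname{Elem}_{\mathbb{K}}(X)$ that contain 1.

Proof. A. We start by proving surjectivity. Let $\mathcal{E} \subset \operatorname{Elem}_{\mathbb{K}}(X)$ be an arbitrary $\mathbb{K}$-subalgebra. Define the collection

\[ \mathcal{R} = \{ A \subset X : \kappa_A \in \mathcal{E} \}. \]

If $A, B \in \mathcal{R}$, then the equalities

\[ \kappa_{A \cap B} = \kappa_A\kappa_B \quad \text{and} \quad \kappa_{A \triangle B} = \kappa_A + \kappa_B - 2\kappa_A\kappa_B, \]

combined with the fact that $\mathcal{E}$ is a subalgebra, prove that $\kappa_{A \cap B}$ and $\kappa_{A \triangle B}$ both belong to $\mathcal{E}$, hence $A \cap B$ and $A \triangle B$ both belong to $\mathcal{R}$. This shows that $\mathcal{R}$ is a ring.

It is clear (see Lemma 1.1) that $\mathcal{R}\text{-}\operatorname{Elem}_{\mathbb{K}}(X) \subset \mathcal{E}$. To prove the other inclusion, start with some arbitrary function $\phi \in \mathcal{E}$, and let us prove that $\phi \in \mathcal{R}\text{-}\operatorname{Elem}_{\mathbb{K}}(X)$. If $\phi = 0$, there is nothing to prove. Assume $\phi$ is not identically zero. We write $\phi(X) \setminus \{0\}$ as $\{\lambda_1, \dots, \lambda_n\}$, with $\lambda_i \neq \lambda_j$ for all $i, j \in \{1, \dots, n\}$ with $i \neq j$. For each $i \in \{1, \dots, n\}$, we set $A_i = \phi^{-1}(\{\lambda_i\})$, so that

\[ \phi = \sum_{i=1}^{n} \lambda_i \cdot \kappa_{A_i}. \]

Since all $\lambda$’s are different and non-zero, the matrix

\[ T = \begin{pmatrix} \lambda_1 & \lambda_2 & \dots & \lambda_n \\ \lambda_1^2 & \lambda_2^2 & \dots & \lambda_n^2 \\ \vdots & \vdots & \ddots & \vdots \\ \lambda_1^n & \lambda_2^n & \dots & \lambda_n^n \end{pmatrix} \]

is invertible. Take $\big[ \alpha_{ij} \big]_{i,j=1}^n$ to be the inverse of $T$. The obvious equalities

\[ \phi^k = \sum_{j=1}^{n} \lambda_j^k \kappa_{A_j}, \quad \forall\, k = 1, \dots, n \]

can be written in matrix form as

\[ \begin{pmatrix} \phi \\ \phi^2 \\ \vdots \\ \phi^n \end{pmatrix} = T \cdot \begin{pmatrix} \kappa_{A_1} \\ \kappa_{A_2} \\ \vdots \\ \kappa_{A_n} \end{pmatrix}, \]

so multiplying by $T^{-1}$ yields

\[ \kappa_{A_j} = \sum_{k=1}^{n} \alpha_{jk}\phi^k, \quad \forall\, j = 1, \dots, n, \]

which proves that $\kappa_{A_1}, \dots, \kappa_{A_n} \in \mathcal{E}$, so $A_1, \dots, A_n \in \mathcal{R}$. This then shows that $\phi \in \mathcal{R}\text{-}\operatorname{Elem}_{\mathbb{K}}(X)$.

We now prove injectivity. Suppose that $\mathcal{R}$ and $\mathcal{S}$ are rings such that $\mathcal{R}\text{-}\operatorname{Elem}_{\mathbb{K}}(X) = \mathcal{S}\text{-}\operatorname{Elem}_{\mathbb{K}}(X)$, and let us prove that $\mathcal{R} = \mathcal{S}$. For every $A \in \mathcal{R}$, the function $\kappa_A \in \mathcal{R}\text{-}\operatorname{Elem}_{\mathbb{K}}(X)$ is also $\mathcal{S}$-elementary, which means that $A \in \mathcal{S}$. This proves the inclusion $\mathcal{R} \subset \mathcal{S}$. By symmetry we also have the inclusion $\mathcal{S} \subset \mathcal{R}$, so indeed $\mathcal{R} = \mathcal{S}$.

B. This part is obvious from A.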
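The key step of the surjectivity argument is that each characteristic function of a level set is a polynomial in $\phi$ without constant term. The sketch below illustrates this with an explicit Lagrange-type polynomial instead of the matrix inversion used in the proof (an equivalent route, swapped in here for brevity); the function `phi` on a five-point set is hypothetical, and exact rational arithmetic is used.

```python
from fractions import Fraction

def indicator_polynomial(values, j):
    """Return p with p(0) = 0, p(values[j]) = 1 and p(values[i]) = 0 for i != j.

    p(t) = (t / lam_j) * prod over i != j of (t - lam_i) / (lam_j - lam_i),
    a polynomial of degree len(values) with zero constant term.
    """
    lam_j = Fraction(values[j])
    def p(t):
        result = Fraction(t) / lam_j
        for i, lam_i in enumerate(values):
            if i != j:
                result *= (Fraction(t) - lam_i) / (lam_j - lam_i)
        return result
    return p

phi = {"a": 2, "b": 2, "c": -1, "d": 0, "e": 5}            # elementary function on a 5-point set
values = sorted({v for v in phi.values() if v != 0})       # distinct non-zero values
recovered = [
    {x for x in phi if indicator_polynomial(values, j)(phi[x]) == 1}
    for j in range(len(values))
]
```

Each `recovered[j]` is the level set of `phi` at `values[j]`, obtained by applying a polynomial to `phi` pointwise.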

Definitions. Let $X$ be a (non-empty) set. A collection $\mathcal{U} \subset \mathcal{P}(X)$ is called a σ-ring, if it is a ring, and it has the property:

(σ) Whenever $(A_n)_{n=1}^\infty$ is a sequence in $\mathcal{U}$, it follows that $\bigcup_{n=1}^\infty A_n$ also belongs to $\mathcal{U}$.

A collection $\mathcal{S} \subset \mathcal{P}(X)$ is called a σ-algebra, if it is an algebra, and it has property (σ).

Clearly, every σ-algebra is a σ-ring.

Remarks 1.2. A. For σ-rings and σ-algebras, one of the properties in the definition of rings and algebras is redundant. More explicitly:

(i) A collection $\mathcal{U} \subset \mathcal{P}(X)$ is a σ-ring, if and only if it has the property (σ) and the property: $A, B \in \mathcal{U} \Longrightarrow A \setminus B \in \mathcal{U}$.

(ii) A collection $\mathcal{S} \subset \mathcal{P}(X)$ is a σ-algebra, if and only if it has the property (σ) and the property: $A \in \mathcal{S} \Longrightarrow X \setminus A \in \mathcal{S}$.

B. If $\mathcal{U}$ is a σ-ring, then it also has the property

(δ) $(A_n)_{n=1}^\infty \subset \mathcal{U} \Longrightarrow \bigcap_{n=1}^\infty A_n \in \mathcal{U}$.

Since σ-algebras are σ-rings, they will also have property (δ).

Definitions. Let $X$ be a non-empty set. A sequence $(A_n)_{n \geq 1}$ of subsets of $X$ is said to be monotone, if it satisfies one of the following conditions:

(↑) $A_n \subset A_{n+1}$, $\forall\, n \geq 1$,
(↓) $A_n \supset A_{n+1}$, $\forall\, n \geq 1$.

In the case (↑) the sequence is said to be increasing, and we define

\[ \lim_{n \to \infty} A_n = \bigcup_{n=1}^\infty A_n. \]

In the case (↓) the sequence is said to be decreasing, and we define

\[ \lim_{n \to \infty} A_n = \bigcap_{n=1}^\infty A_n. \]

A collection $\mathcal{M} \subset \mathcal{P}(X)$ is said to be a monotone class on $X$, if it satisfies the condition:

(m) whenever $(A_n)_{n \geq 1}$ is a monotone sequence in $\mathcal{M}$, it follows that its limit $\lim_{n \to \infty} A_n$ also belongs to $\mathcal{M}$.

Proposition 1.5. Let $\mathcal{R}$ be a ring on $X$. Then the following are equivalent:

(i) $\mathcal{R}$ is a σ-ring;
(ii) $\mathcal{R}$ is a monotone class.

Proof. (i) ⇒ (ii). This is immediate from the definition and Remark 1.2.B.

(ii) ⇒ (i). Assume $\mathcal{R}$ is a monotone class, and let us prove that it is a σ-ring. By Remark 1.2.A, we only need to prove that $\mathcal{R}$ has property (σ). Start with an arbitrary sequence $(A_n)_{n \geq 1}$ in $\mathcal{R}$, and let us prove that $\bigcup_{n=1}^\infty A_n$ again belongs to $\mathcal{R}$. For every integer $n \geq 1$, we define $B_n = \bigcup_{k=1}^n A_k$. Since $\mathcal{R}$ is a ring, it follows that $B_n \in \mathcal{R}$, $\forall\, n \geq 1$. Moreover, the sequence $(B_n)_{n \geq 1}$ is increasing, so by assumption, the set $\bigcup_{n=1}^\infty A_n = \lim_{n \to \infty} B_n$ indeed belongs to $\mathcal{R}$.



Lecture 19

2. Constructing (σ-)rings and (σ-)algebras

In this section we outline three methods of constructing (σ-)rings and (σ-)algebras. It turns out that one can devise some general procedures, which work for all the types of set collections considered, so it will be natural to begin with some very general considerations.

Definition. Suppose one has a type Θ of set collections. In other words, for any set $X$, one defines what it means for a collection $\mathcal{C} \subset \mathcal{P}(X)$ to be of type Θ. The type Θ is said to be consistent, if for every set $X$, one has the following conditions:

• the collection $\mathcal{P}(X)$, of all subsets of $X$, is of type Θ;
• if $\mathcal{C}_i$, $i \in I$ are collections of type Θ, then the intersection $\bigcap_{i \in I} \mathcal{C}_i$ is again of type Θ.

Examples 2.1. The following types are consistent:

• The type R of rings;
• The type A of algebras;
• The type S of σ-rings;
• The type Σ of σ-algebras;
• The type M of monotone classes.

The reason for the consistency is simply the fact that each of these types is defined by means of set operations.

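Definition. Let Θ be a consistent type, let $X$ be a set, and let $\mathcal{E} \subset \mathcal{P}(X)$ be an arbitrary collection of sets. Define

\[ F_\Theta(\mathcal{E}, X) = \{ \mathcal{C} \subset \mathcal{P}(X) : \mathcal{C} \supset \mathcal{E}, \text{ and } \mathcal{C} \text{ is of type } \Theta \text{ on } X \}. \]

Notice that the family $F_\Theta(\mathcal{E}, X)$ is non-empty, since it contains at least the collection $\mathcal{P}(X)$. The collection

\[ \Theta_X(\mathcal{E}) = \bigcap_{\mathcal{C} \in F_\Theta(\mathcal{E}, X)} \mathcal{C} \]

is of type Θ on $X$, and is called the type Θ class generated by $\mathcal{E}$. When there is no danger of confusion, the ambient set $X$ will be omitted.

Comment. In the above setting, the class $\Theta(\mathcal{E})$ is the smallest collection of type Θ on $X$ which contains $\mathcal{E}$. In other words, if $\mathcal{C}$ is a collection of type Θ on $X$, with $\mathcal{C} \supset \mathcal{E}$, then $\mathcal{C} \supset \Theta(\mathcal{E})$. This follows immediately from the fact that $\mathcal{C}$ belongs to $F_\Theta(\mathcal{E}, X)$.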
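On a very small set the definition can be carried out literally. The sketch below takes Θ to be the type of rings, enumerates all $2^8$ collections of subsets of a three-element set, intersects those that are rings containing $\mathcal{E}$, and compares the result with a direct closure under difference and union (the characterization of rings from Proposition 1.2). The set `X` and the collection `E` are hypothetical; the brute-force search is feasible only because `X` is tiny.

```python
from itertools import combinations

X = frozenset({1, 2, 3})
subsets = [frozenset(c) for r in range(4) for c in combinations(sorted(X), r)]

def is_ring(coll):
    """Non-empty and closed under difference and union."""
    return bool(coll) and all((a - b) in coll and (a | b) in coll for a in coll for b in coll)

def generated_by_intersection(E):
    """Intersect all rings on X that contain E (the definition, taken literally)."""
    result = set(subsets)                       # P(X) is a ring containing E
    for mask in range(1 << len(subsets)):
        coll = {s for i, s in enumerate(subsets) if (mask >> i) & 1}
        if E <= coll and is_ring(coll):
            result &= coll
    return result

def generated_by_closure(E):
    """Smallest collection containing E that is closed under difference and union."""
    coll = set(E)
    while True:
        new = {a - b for a in coll for b in coll} | {a | b for a in coll for b in coll}
        if new <= coll:
            return coll
        coll |= new

E = {frozenset({1}), frozenset({1, 2})}
from_definition = generated_by_intersection(E)
from_closure = generated_by_closure(E)
```

Both computations give the same four sets, namely the empty set, $\{1\}$, $\{2\}$ and $\{1, 2\}$.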

Examples 2.2. Let $X$ be a (non-empty) set, and let $\mathcal{E}$ be an arbitrary collection of subsets of $X$. According to the previous list of consistent types R, A, S, Σ, and M, one can construct the following collections.

(i) $\mathcal{R}(\mathcal{E})$, the ring generated by $\mathcal{E}$; this is the smallest ring that contains $\mathcal{E}$.

(ii) $\mathcal{A}(\mathcal{E})$, the algebra generated by $\mathcal{E}$; this is the smallest algebra that contains $\mathcal{E}$.

(iii) $\mathcal{S}(\mathcal{E})$, the σ-ring generated by $\mathcal{E}$; this is the smallest σ-ring that contains $\mathcal{E}$.

(iv) $\Sigma(\mathcal{E})$, the σ-algebra generated by $\mathcal{E}$; this is the smallest σ-algebra that contains $\mathcal{E}$.

(v) $\mathcal{M}(\mathcal{E})$, the monotone class generated by $\mathcal{E}$; this is the smallest monotone class that contains $\mathcal{E}$.

Comment. Assume Θ is a consistent type. Suppose $\mathcal{E}$ is an arbitrary collection of subsets of some fixed non-empty set $X$. There are instances when we would like to decide whether a class $\mathcal{C} \supset \mathcal{E}$ coincides with $\Theta(\mathcal{E})$. The following is a useful test:

(i) check that $\mathcal{C}$ is of type Θ;
(ii) check the inclusion $\mathcal{C} \subset \Theta(\mathcal{E})$.

By (i) we must have $\mathcal{C} \supset \Theta(\mathcal{E})$, so by (ii) we will indeed have equality.

A simple illustration of the above technique allows one to describe the ring and the algebra generated by a collection of sets.

Proposition 2.1. Let $X$ be a non-empty set, and let $\mathcal{E}$ be an arbitrary collection of subsets of $X$.

A. For a set $A \subset X$, the following are equivalent:

(i) $A \in \mathcal{R}(\mathcal{E})$;
(ii) There exist sets $A_1, A_2, \dots, A_n$ such that $A = A_1 \triangle A_2 \triangle \dots \triangle A_n$, and each $A_k$, $k = 1, \dots, n$ is a finite intersection of sets in $\mathcal{E}$.

B. The algebra generated by $\mathcal{E}$ is

\[ \mathcal{A}(\mathcal{E}) = \mathcal{R}(\mathcal{E}) \cup \{ X \setminus A : A \in \mathcal{R}(\mathcal{E}) \} = \mathcal{R}\big( \mathcal{E} \cup \{X\} \big). \]

Proof. A. Define R to be the class of all subsets A ⊂ X , which satisfy property

(ii), so that what we have to prove is the equalityR = R(E).

It is clear that E ⊂ R. Since every finite intersection of sets in E belongs to R(E),and the latter is a ring, it follows that R ⊂ R(E). So in order to prove the desiredequality, all we have to do is to prove that R is a ring. But this is pretty clear, if we think as the sum operation, and ∩ as the product operation. More explicitly,let us take Π(E) to be the collection of all finite intersections of sets in E, so that

(1) A ∩ B ∈ Π(E), ∀ A, B ∈ Π(E).

Now if we start with two sets A, B ∈ R, written as A = A1 . . . Am and B =B1 . . . Bn, with A1, . . . , Am, B1, . . . , Bn ∈ Π(E), then the equality

A

∩B = (A1

∩B1)

. . .

(Am

∩B1)(A1

∩B2)

. . .

(Am

∩B2) . . .

. . . (A1 ∩ Bn) . . . (Am ∩ Bn),combined with (1) proves that A ∩ B ∈ R, while the equality

AB = A1 . . . AmB1 . . . Bn

proves that AB also belongs to R.B. Define

A = R(E) ∪ X A : A ∈ R(E)

.




Since we clearly have E ⊂ A ⊂ A(E), all we need to prove is the fact that A is an algebra. It is clear that, whenever A ∈ A, it follows that X ∖ A ∈ A. Therefore (see Section III.1), we only need to show that

A, B ∈ A ⇒ A ∪ B ∈ A.

There are four cases to examine: (i) A, B ∈ R(E); (ii) A ∈ R(E) and X ∖ B ∈ R(E); (iii) X ∖ A ∈ R(E) and B ∈ R(E); (iv) X ∖ A ∈ R(E) and X ∖ B ∈ R(E).

Case (i) is clear, since it will force A ∪ B ∈ R(E).

In case (ii), we use

X ∖ (A ∪ B) = (X ∖ A) ∩ (X ∖ B) = (X ∖ B) ∖ A,

which proves that X ∖ (A ∪ B) ∈ R(E).

Case (iii) is proven exactly as case (ii).

In case (iv) we use

X ∖ (A ∪ B) = (X ∖ A) ∩ (X ∖ B),

which proves that X ∖ (A ∪ B) ∈ R(E).

The equality A(E) = R(E ∪ {X}) is trivial.
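On a finite set X, the description in Proposition 2.1 can be checked by brute force. The Python sketch below is only an illustration under our own assumptions: the helper names (`finite_intersections`, `sym_diff_span`, `generated_ring`) and the sample collection are hypothetical and not part of the text. It compares the ring obtained by closing E under unions and differences with the family of symmetric differences of finite intersections (part A), and then checks the formula of part B.

```python
from functools import reduce
from itertools import combinations


def finite_intersections(E):
    """Pi(E): intersections of all non-empty finite subfamilies of E."""
    E = [frozenset(s) for s in E]
    return {
        reduce(frozenset.intersection, fam)
        for r in range(1, len(E) + 1)
        for fam in combinations(E, r)
    }


def sym_diff_span(P):
    """All sets A1 ^ A2 ^ ... ^ An (n >= 1) with every Ak in P."""
    span = set(P)
    changed = True
    while changed:
        changed = False
        for a in list(span):
            for p in P:
                if (a ^ p) not in span:
                    span.add(a ^ p)
                    changed = True
    return span


def generated_ring(E):
    """Brute-force closure of E under union and difference."""
    ring = {frozenset(s) for s in E}
    changed = True
    while changed:
        changed = False
        for a in list(ring):
            for b in list(ring):
                for c in (a | b, a - b):
                    if c not in ring:
                        ring.add(c)
                        changed = True
    return ring


# Hypothetical sample data, chosen only for illustration.
X = frozenset({1, 2, 3, 4})
E = [{1, 2}, {2, 3}]

ring = generated_ring(E)                          # R(E)
part_a = sym_diff_span(finite_intersections(E))   # sets described in A.(ii)
algebra = ring | {X - a for a in ring}            # R(E) together with complements
part_b = generated_ring(E + [X])                  # R(E ∪ {X})
```

For this sample, `ring == part_a` and `algebra == part_b` both hold, which is what parts A and B predict.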

Comment. Unfortunately, for σ-rings and σ-algebras, no easy constructive description is available. There is an analogue of Proposition 2.1, which uses transfinite induction. In order to formulate such a statement, we introduce the following notations. For every collection C of subsets of X, we define

C∗ = { ⋃_{n=1}^∞ (A_n ∖ B_n) : A_n, B_n ∈ C ∪ {∅}, ∀ n ≥ 1 }.

Notice that

(2) C ∪ {∅} ⊂ C∗ ⊂ S(C).

Theorem 2.1. Let X be a non-empty set, and let E be an arbitrary collection of subsets of X. For every ordinal number η define the set

P_η = {α : α ordinal number with α < η}.

Let Ω denote the smallest uncountable ordinal number, and define the classes E_α, α ∈ P_Ω, recursively by E_0 = E, and

E_α = ( ⋃_{β∈P_α} E_β )∗, ∀ α ∈ P_Ω ∖ {0}.

Then the σ-ring generated by E is

S(E) = ⋃_{α∈P_Ω} E_α.

Proof. Denote the union ⋃_{α∈P_Ω} E_α simply by U. It is obvious that E ⊂ U.

Let us prove that U ⊂ S(E). We do this by showing that E_α ⊂ S(E), ∀ α ∈ P_Ω. We use transfinite induction. The case α = 0 is clear. Assume α ∈ P_Ω has the property that E_β ⊂ S(E), for all β ∈ P_α, and let us show that we also have the inclusion E_α ⊂ S(E). On the one hand, if we take the class

C = ⋃_{β∈P_α} E_β,

then E_α = C∗. On the other hand, by the inductive hypothesis, we have C ⊂ S(E), which clearly forces S(C) ⊂ S(E). Then the desired inclusion follows from (2).

In order to finish the proof, we only need to prove that U is a σ-ring. It suffices to prove the equality U∗ = U, which in turn is equivalent to the inclusion U∗ ⊂ U. Start with some U ∈ U∗, written as

U = ⋃_{n=1}^∞ (A_n ∖ B_n),

for two sequences (A_n)_{n=1}^∞ and (B_n)_{n=1}^∞ in U. For each n ≥ 1 choose α_n, β_n ∈ P_Ω, such that A_n ∈ E_{α_n} and B_n ∈ E_{β_n}. Form then the countable set

Z = {α_n : n ∈ N} ∪ {β_n : n ∈ N} ⊂ P_Ω.

Then we clearly have

U ∈ ( ⋃_{ν∈Z} E_ν )∗.

Since Z is countable, there is a strict upper bound for Z in P_Ω, i.e. there exists γ ∈ P_Ω, such that α_n < γ and β_n < γ, ∀ n ≥ 1. In other words we have Z ⊂ P_γ, so

U ∈ ( ⋃_{ν∈P_γ} E_ν )∗ = E_γ,

so U indeed belongs to U.
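On a finite set, countable unions reduce to finite unions and the recursion of Theorem 2.1 stabilizes after finitely many steps, so no transfinite stage is needed. The following Python sketch, which is our own illustration (the function names `star` and `levels` and the sample collection are hypothetical), iterates the operation C ↦ C∗ and records the successive classes.

```python
from itertools import combinations


def star(C):
    """C*: unions of differences A - B with A, B in C ∪ {∅} (finite case)."""
    C0 = {frozenset(s) for s in C} | {frozenset()}
    diffs = {a - b for a in C0 for b in C0}
    out = set(diffs)
    changed = True
    while changed:  # close the differences under finite unions
        changed = False
        for u in list(out):
            for d in diffs:
                if (u | d) not in out:
                    out.add(u | d)
                    changed = True
    return out


def levels(E):
    """E_0 = E, E_{k+1} = (E_0 ∪ ... ∪ E_k)*, until nothing new appears."""
    result = [{frozenset(s) for s in E}]
    union = set(result[0])
    while True:
        nxt = star(union)
        if nxt <= union:
            return result
        result.append(nxt)
        union |= nxt


E = [{1, 2}, {2, 3}]  # hypothetical sample collection
chain = levels(E)
all_subsets = {
    frozenset(c) for r in range(4) for c in combinations((1, 2, 3), r)
}
```

In this sample, the first application of ∗ does not yet produce the set {2}; it appears only at the second step, after which the class equals the collection of all subsets of {1, 2, 3}, i.e. the (σ-)ring generated by E.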

Corollary 2.1. Given a non-empty set X, and an arbitrary collection E of subsets of X, with card E ≥ 2, one has the inequality

card S(E) ≤ (card E)^{ℵ_0}.

Proof. Using the notations from the proof of the above theorem, we will first prove, by transfinite induction, that

(3) card E_α ≤ (card E)^{ℵ_0}, ∀ α ∈ P_Ω.

The case α = 0 is clear. Assume now we have α ∈ P_Ω ∖ {0}, such that

card E_β ≤ (card E)^{ℵ_0}, ∀ β ∈ P_α,

and let us prove that we also have the inequality card E_α ≤ (card E)^{ℵ_0}. If we take C = ⋃_{β∈P_α} E_β, we know that C is a countable union of sets, each having cardinality ≤ (card E)^{ℵ_0}, so we immediately get

card C ≤ ℵ_0 · (card E)^{ℵ_0} = (card E)^{ℵ_0}.

Then the collection

D(C) = {A ∖ B : A, B ∈ C}

has cardinality at most (card C)^2, so we also have

card D(C) ≤ (card E)^{ℵ_0}.

Finally, the collection E_α = C∗ has cardinality at most (card D(C))^{ℵ_0}, so we get

card E_α ≤ ((card E)^{ℵ_0})^{ℵ_0} = (card E)^{ℵ_0}.

Having proven (3), we now have

card S(E) = card ⋃_{α∈P_Ω} E_α ≤ card P_Ω · (card E)^{ℵ_0} = ℵ_1 · (card E)^{ℵ_0}.

Since ℵ_1 ≤ c = 2^{ℵ_0} ≤ (card E)^{ℵ_0}, the above estimate gives

card S(E) ≤ ((card E)^{ℵ_0})^2 = (card E)^{ℵ_0}.
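As a worked instance of the estimate (a routine cardinal computation, spelled out here only for convenience), suppose the collection E is at most countable, with card E ≥ 2. Then the bound of Corollary 2.1 collapses to the cardinality of the continuum:

```latex
\[
2^{\aleph_0} \;\le\; (\operatorname{card}\mathcal{E})^{\aleph_0}
\;\le\; \aleph_0^{\aleph_0}
\;\le\; \bigl(2^{\aleph_0}\bigr)^{\aleph_0}
\;=\; 2^{\aleph_0\cdot\aleph_0}
\;=\; 2^{\aleph_0} \;=\; \mathfrak{c},
\qquad\text{hence}\qquad
\operatorname{card}\mathcal{S}(\mathcal{E}) \;\le\; \mathfrak{c}.
\]
```

So a σ-ring generated by countably many sets has at most continuum many elements.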

Comment. Suppose Θ is a consistent type. There is a very useful technique for proving results on classes of the form Θ(E). More explicitly, suppose E is an arbitrary collection of subsets of X, and (p) is a certain property which refers to subsets of X. Suppose now we want to prove a statement like:

(∗) Every set A ∈ Θ(E) has property (p).

In order to prove such a statement, one defines

U = {A ∈ Θ(E) : A has property (p)},

and it suffices to prove that:

(i) U is of type Θ;
(ii) U ⊃ E, i.e. every set A ∈ E has property (p).

Indeed, if we prove the above two facts, that would force U ⊃ Θ(E), and since by construction we have U ⊂ Θ(E), we will in fact get U = Θ(E), thus proving (∗).

As a first illustration of the above technique, we prove the following.

Proposition 2.2. Let X be a non-empty set, and let R be a ring on X . Then the σ-ring generated by R is the same as the monotone class generated by R, that is, one has the equality

S(R) = M(R).

Proof. Since S(R) is a monotone class, and contains R, we have the inclusion

S(R) ⊃ M(R).

To prove the other inclusion, using the fact that M(R) contains R, it suffices to show that M(R) is a σ-ring. Since M(R) is already a monotone class, we only need to show that it is a ring. In other words, we need to show that whenever A, B ∈ M(R), it follows that both A ∖ B and A ∪ B belong to M(R). Define then, for every A ∈ M(R), the set

M_A = {B ∈ M(R) : A ∪ B, A ∖ B, B ∖ A ∈ M(R)},

so that what we need to prove is:

(∗) M_A = M(R), ∀ A ∈ M(R).

Before we proceed with the proof of (∗), let us first remark that, for A, B ∈ M(R), one has

(4) B ∈ M_A ⇐⇒ A ∈ M_B.

Secondly, we have the following

Claim 1: For every A ∈ M(R), the collection M_A is a monotone class.

To prove this, we start with a monotone sequence (B_n)_{n=1}^∞ in M_A, and we prove that the limit B = lim_{n→∞} B_n again belongs to M_A. First of all, clearly B belongs to M(R). Second, since the sequences (A ∪ B_n)_{n=1}^∞, (A ∖ B_n)_{n=1}^∞, and (B_n ∖ A)_{n=1}^∞ are all monotone sequences in M(R), and since M(R) is a monotone class, it follows that the limits A ∪ B = lim_{n→∞}(A ∪ B_n), A ∖ B = lim_{n→∞}(A ∖ B_n), and B ∖ A = lim_{n→∞}(B_n ∖ A) all belong to M(R), so B indeed belongs to M_A.

Having proven Claim 1, we now prove (∗) in a particular case:

Claim 2: M_A = M(R), ∀ A ∈ R.

Fix A ∈ R. We know that M_A ⊂ M(R) is a monotone class, so it suffices to prove that M_A ⊃ R. But this is obvious, since R is a ring.

We now proceed with the proof of (∗) in the general case. If we define

U = {A ∈ M(R) : M_A = M(R)},

all we need to prove is the equality U = M(R). By Claim 2, we know that U ⊃ R, so it suffices to prove that U is a monotone class. Start then with a monotone sequence (A_n)_{n=1}^∞ in U, and let us show that the limit A = lim_{n→∞} A_n again belongs to U. First of all, A belongs to M(R). What we then have to prove is that M_A = M(R). Start with some arbitrary B ∈ M(R). We know that B ∈ M_{A_n}, ∀ n ≥ 1. Using (4) we have A_n ∈ M_B, ∀ n ≥ 1, and using the fact that M_B is a monotone class (see Claim 1), it follows that A = lim_{n→∞} A_n belongs to M_B. Using (4) again, this gives B ∈ M_A. This way we have proven that any B ∈ M(R) also belongs to M_A, so we indeed have the equality M_A = M(R).

Corollary 2.2. Let X be a non-empty set, and let E be an arbitrary family of subsets of X. Then the σ-ring and the σ-algebra generated by E are given as the monotone classes generated by the ring and by the algebra generated by E, respectively. That is, one has the equalities:

(i) S(E) = M(R(E));
(ii) Σ(E) = M(A(E)).

Proof. (i). By the above result, since R(E) is a ring, we have

(5) M(R(E)) = S(R(E)).

Since S(R(E)) is a σ-ring, and contains E, it follows that

S(R(E)) ⊃ S(E).

Conversely, since S(E) is a ring, and contains E, we get the inclusion

S(E) ⊃ R(E),

and since S(E) is a σ-ring, we will now get

S(E) ⊃ S(R(E)),

so we get

S(R(E)) = S(E).

Using (5), the desired equality follows.

(ii). This follows from Proposition 2.1 and part (i) applied to E ∪ {X}, combined with the obvious equality Σ(E) = S(E ∪ {X}).

The σ-ring and the σ-algebra, generated by an arbitrary collection of sets, are related by means of the following result.

Proposition 2.3. Let X be a non-empty set, and let E be an arbitrary collection of subsets of X. Define the collection

P^E_σ(X) = {A ⊂ X : there exists (E_n)_{n=1}^∞ ⊂ E, with A ⊂ ⋃_{n=1}^∞ E_n}.

(i) P^E_σ(X) is a σ-ring on X;
(ii) the σ-ring S(E) and the σ-algebra Σ(E), generated by E, satisfy the equality

S(E) = Σ(E) ∩ P^E_σ(X).

Proof. Part (i) is trivial.

To prove part (ii), we first observe that the intersection Σ(E) ∩ P^E_σ(X) is a σ-ring, which obviously contains E, so we immediately get the inclusion

S(E) ⊂ Σ(E) ∩ P^E_σ(X).

The key ingredient in proving the inclusion "⊃" is contained in the following.

Claim: Given a set E ∈ E, the collection

A_E(X) = {A ⊂ X : A ∩ E ∈ S(E)}

is a σ-algebra on X.

To prove this we need to check:

(a) if A belongs to A_E(X), then X ∖ A also belongs to A_E(X);
(b) whenever (A_n)_{n=1}^∞ is a sequence of sets in A_E(X), it follows that the union ⋃_{n=1}^∞ A_n also belongs to A_E(X).

To check (a) we simply remark that, since both E and A ∩ E belong to S(E), it follows immediately that (X ∖ A) ∩ E = E ∖ (A ∩ E) also belongs to S(E), which means that X ∖ A indeed belongs to A_E(X).

Property (b) is clear, since the fact that A_n ∩ E belongs to S(E), for all n, immediately gives the fact that (⋃_{n=1}^∞ A_n) ∩ E = ⋃_{n=1}^∞ (A_n ∩ E) belongs to S(E), which means precisely that ⋃_{n=1}^∞ A_n belongs to A_E(X).

Having proven the Claim, we now proceed with the proof of the inclusion S(E) ⊃ Σ(E) ∩ P^E_σ(X). Start with some set A ∈ Σ(E) ∩ P^E_σ(X), and let us show that A belongs to S(E). First of all, there exists a sequence (E_n)_{n=1}^∞ ⊂ E, such that

(6) A ⊂ ⋃_{n=1}^∞ E_n.

Using the Claim, we know that for each n ∈ N, the collection A_{E_n}(X) is a σ-algebra. This σ-algebra clearly contains E, so we have

Σ(E) ⊂ A_{E_n}(X), ∀ n ∈ N.

In particular, we get the fact that A ∈ A_{E_n}(X), which means that A ∩ E_n belongs to S(E), for all n ∈ N. But then the inclusion (6) forces the equality

A = ⋃_{n=1}^∞ (A ∩ E_n),

which then gives the fact that A indeed belongs to S(E).
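The equality of Proposition 2.3 can be tested on a small finite example, where σ-rings are just rings and a set lies in P^E_σ(X) exactly when it is contained in the union of all sets of E. The sketch below is our own illustration; the helper `generated_ring` and the sample data are hypothetical, and Σ(E) is computed through the equality Σ(E) = S(E ∪ {X}) quoted earlier in the text.

```python
def generated_ring(E):
    """Brute-force closure of E under union and difference."""
    ring = {frozenset(s) for s in E}
    changed = True
    while changed:
        changed = False
        for a in list(ring):
            for b in list(ring):
                for c in (a | b, a - b):
                    if c not in ring:
                        ring.add(c)
                        changed = True
    return ring


# Hypothetical sample data: E does not cover X, so S(E) and Σ(E) differ.
X = frozenset(range(1, 6))
E = [frozenset({1, 2}), frozenset({2, 3})]

S = generated_ring(E)               # S(E); finite case, so a ring is a σ-ring
Sigma = generated_ring(E + [X])     # Σ(E) = S(E ∪ {X})
cover = frozenset().union(*E)       # union of all sets in E
rhs = {a for a in Sigma if a <= cover}   # Σ(E) ∩ P^E_σ(X)
```

Here `S == rhs`, while `X` itself belongs to `Sigma` but not to `S`, in agreement with the fact that this E is not σ-total in X.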

The above result motivates the following.



Definition. A collection E of subsets of X is said to be σ-total in X, if X ∈ P^E_σ(X), i.e. there exists some sequence (E_n)_{n=1}^∞ ⊂ E with ⋃_{n=1}^∞ E_n = X. By the above result, this is equivalent to the fact that X belongs to the σ-ring S(E) generated by E, which in turn is equivalent to the equality Σ(E) = S(E).

We discuss now two more methods of constructing (σ-)rings, (σ-)algebras, or monotone classes.

Notations. Let f : X → Y be a function, and let E ⊂ P(X) and G ⊂ P(Y) be two arbitrary collections of sets. We define

f_*E = {A ∈ P(Y) : f^{-1}(A) ∈ E} ⊂ P(Y);
f^*G = {f^{-1}(G) : G ∈ G} ⊂ P(X).

Definitions. Let Θ be a type of set collections. We say that Θ is natural, if for any map f : X → Y, one has the implications

(i) C of type Θ on X ⟹ f_*C of type Θ on Y;
(ii) D of type Θ on Y ⟹ f^*D of type Θ on X.

Examples 2.3. The types R, A, S, Σ, and M are natural.

The term “natural” is justified by the following.

Exercise 1. Let f : X → Y and g : Y → Z be maps.

(i) Prove that, for any collection C ⊂ P(X), one has the equality g_*(f_*C) = (g ∘ f)_*C.

(ii) Prove that, for any collection D ⊂ P(Z), one has the equality f^*(g^*D) = (g ∘ f)^*D.

Theorem 2.2 (Generating Theorem). Suppose Θ is a consistent class type, which is natural. Let X and Y be non-empty sets, and let f : X → Y be a map. For any collection G ⊂ P(Y), one has the equality

f^*Θ(G) = Θ(f^*G).

Proof. On the one hand, by naturality, we know that f^*Θ(G) is of type Θ. On the other hand, it is pretty clear that, since Θ(G) ⊃ G, we also have the inclusion f^*Θ(G) ⊃ f^*G. Since Θ is consistent, it then follows that we have the inclusion

f^*Θ(G) ⊃ Θ(f^*G).

To prove the other inclusion, we consider the class

C = f_*Θ(f^*G) ⊂ P(Y).

By naturality, it follows that C is of type Θ on Y. For any G ∈ G, the obvious relation

f^{-1}(G) ∈ f^*G ⊂ Θ(f^*G)

proves that G ∈ C. This means that we have the inclusion C ⊃ G, and since C is of type Θ, it follows that we have the inclusion

Θ(G) ⊂ C.

This means that, for every A ∈ Θ(G), we have f^{-1}(A) ∈ Θ(f^*G), which means precisely that we have the desired inclusion

f^*Θ(G) ⊂ Θ(f^*G).
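For the type Θ = A (algebras) on finite sets, the Generating Theorem can be verified directly. The following Python sketch is an illustration under our own assumptions: `generated_algebra` and `pullback` are hypothetical helper names, and the map and collection are sample data. It compares the pull-back of the generated algebra with the algebra generated by the pull-back.

```python
def generated_algebra(E, X):
    """Brute-force closure of E ∪ {X} under union and complement in X."""
    X = frozenset(X)
    alg = {frozenset(s) for s in E} | {X}
    changed = True
    while changed:
        changed = False
        for a in list(alg):
            for c in [X - a] + [a | b for b in list(alg)]:
                if c not in alg:
                    alg.add(c)
                    changed = True
    return alg


def pullback(f, G, X):
    """The collection of preimages f^{-1}(G), for G in the family G."""
    return {frozenset(x for x in X if f(x) in g) for g in G}


def f(x):
    """Hypothetical sample map from X = {0,...,5} onto Y = {0,1,2}."""
    return x % 3


X, Y = range(6), range(3)
G = [{0}, {0, 1}]

lhs = pullback(f, generated_algebra(G, Y), X)   # pull-back of A(G)
rhs = generated_algebra(pullback(f, G, X), X)   # algebra generated by the pull-back of G
```

Both sides consist of the eight unions of the fibers {0, 3}, {1, 4}, {2, 5}, so `lhs == rhs` as the theorem asserts.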




Example 2.4. Let Θ be a consistent class type, which is both covariant and contravariant. Let Y be some set, and let C be a collection of type Θ on Y. Given a subset X ⊂ Y, we consider the inclusion map ι : X → Y. The collection ι^*C is then of type Θ on X. It will be denoted by C|_X. Since ι^{-1}(A) = A ∩ X, ∀ A ∈ P(Y), we have

C|_X = {A ∩ X : A ∈ C}.

If E ⊂ P(Y) is a collection with C = Θ(E), then by the Generating Theorem we have the equality

(7) Θ(E)|_X = Θ({E ∩ X : E ∈ E}).

Comment. The exercise below shows that a "forward" version of the Generating Theorem does not hold in general. In other words, an equality of the type f_*Θ(G) = Θ(f_*G) may fail. The reason is the fact that the collection f_*G may be relatively "small."

Exercise 2. Consider the sets X = {1, 2, 3}, Y = {1, 2}, the function f : X → Y, defined by f(1) = f(2) = 1, f(3) = 2, and the collection C = {{1}, {2}, ∅}. Describe the collection f_*C, the algebra A(C) generated by C (on X), and the algebra A(f_*C) generated by f_*C (on Y). Prove that one has a strict inclusion A(f_*C) ⊊ f_*A(C).

Exercise 3. Let Θ be a consistent natural type, let f : X → Y be a surjective map, and let G be a collection of subsets of X. Assume one has the inclusion

(8) G ⊂ f^*Θ(f_*G).

Prove that one has the equality

f_*Θ(G) = Θ(f_*G).

(One instance when (8) holds is for example when f^{-1}(f(G)) = G, ∀ G ∈ G.)

Exercise 4*. Let Θ be one of the types A, R, S, Σ, or M. Let f : X → Y be an injective map, and let G ⊂ P(X) be some arbitrary collection. Prove the equality

f_*Θ(G) = Θ(f_*G).

Natural consistent types are useful, because it is possible to construct product structures.

Definition. Let Θ be a consistent type which is natural. Let (X_i)_{i∈I} be a collection of non-empty sets. Assume that, for each i ∈ I, a collection E_i ⊂ P(X_i) of type Θ is given. Consider the cartesian product X = ∏_{i∈I} X_i, together with the projection maps π_i : X → X_i, i ∈ I. The collection

Θ-⨉_{i∈I} E_i = Θ( ⋃_{i∈I} π_i^* E_i )

is a collection of type Θ on X, which is called the Θ-product. When there is no danger of confusion, we use the notation ⨉_{i∈I} E_i.

Remark 2.1. Use the notations from the above definition. Assume that, for each i ∈ I, a collection G_i ⊂ P(X_i) is given. Then one has the equality

⨉_{i∈I} Θ(G_i) = Θ( ⋃_{i∈I} π_i^* G_i ).

Indeed, if we define E_i = Θ(G_i), the inclusion ⊃ follows from the obvious inclusions

⨉_{i∈I} E_i ⊃ π_i^* E_i ⊃ π_i^* G_i.

The inclusion ⊂ follows from the inclusions

π_i^* G_i ⊂ Θ( ⋃_{i∈I} π_i^* G_i ),

which, combined with the fact that the right hand side is of type Θ, and the Generating Theorem, give the inclusions

π_i^* E_i = π_i^* Θ(G_i) = Θ(π_i^* G_i) ⊂ Θ( ⋃_{i∈I} π_i^* G_i ).
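Remark 2.1 says that the product structure can be computed from generators of the factors. For Θ = A and two small finite factors this can be confirmed numerically; the Python sketch below is only an illustration, with hypothetical helper names (`generated_algebra`, `pullback`) and sample generators chosen by us.

```python
from itertools import product


def generated_algebra(E, X):
    """Brute-force closure of E ∪ {X} under union and complement in X."""
    X = frozenset(X)
    alg = {frozenset(s) for s in E} | {X}
    changed = True
    while changed:
        changed = False
        for a in list(alg):
            for c in [X - a] + [a | b for b in list(alg)]:
                if c not in alg:
                    alg.add(c)
                    changed = True
    return alg


def pullback(f, G, X):
    """The collection of preimages f^{-1}(G), for G in the family G."""
    return {frozenset(x for x in X if f(x) in g) for g in G}


def pi1(p):
    """Projection of the product onto the first factor."""
    return p[0]


def pi2(p):
    """Projection of the product onto the second factor."""
    return p[1]


# Hypothetical sample factors and generators.
X1, X2 = range(2), range(3)
X = list(product(X1, X2))
G1, G2 = [{0}], [{0}, {1}]

# Product of the generated algebras: A(pi1^* A(G1) ∪ pi2^* A(G2)).
lhs = generated_algebra(
    pullback(pi1, generated_algebra(G1, X1), X)
    | pullback(pi2, generated_algebra(G2, X2), X),
    X,
)
# Algebra generated by the pulled-back generators: A(pi1^* G1 ∪ pi2^* G2).
rhs = generated_algebra(pullback(pi1, G1, X) | pullback(pi2, G2, X), X)
```

In this sample both sides are the full power set of the six-point product, so `lhs == rhs`.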

Natural consistent types also allow one to define disjoint union structures.

Definitions. Let (X_i)_{i∈I} be a collection of non-empty sets. Assume that, for each i ∈ I, a collection C_i ⊂ P(X_i) is given. On the disjoint union X = ⊔_{i∈I} X_i one defines the collection

⋁_{i∈I} C_i = {C ⊂ X : C ∩ X_i ∈ C_i, ∀ i ∈ I}.

Assume now Θ is a natural consistent type, and C_i is of type Θ on X_i, for each i ∈ I. If we consider the inclusion maps ε_i : X_i → X, i ∈ I, then one clearly has the equality

⋁_{i∈I} C_i = ⋂_{i∈I} (ε_i)_* C_i,

which means that ⋁_{i∈I} C_i is a collection of type Θ on X.

Exercise 5. Let I be countable, and let (X_i)_{i∈I} be a collection of non-empty sets. Assume that, for each i ∈ I, a collection C_i ⊂ P(X_i) is given, such that ∅ ∈ C_i. Prove the equalities

⋁_{i∈I} S(C_i) = S( ⋁_{i∈I} C_i ) and ⋁_{i∈I} Σ(C_i) = Σ( ⋁_{i∈I} C_i ).

We conclude with a discussion on certain constructions related to topology.

Definitions. Let X be a topological Hausdorff space. We consider the collection T of all open sets in X. The σ-algebra Σ(T) on X, generated by T, is denoted by Bor(X). The sets in Bor(X) are called Borel sets.

Remark that singleton sets are Borel, since they are closed. Moreover

• every countable set B ⊂ X is Borel.

One also defines the σ-algebra Bor_c(X) = Σ(C_X) generated by the class C_X of all compact subsets of X.

Another class of sets will also be of interest. Its construction uses the following terminology.

A subset A ⊂ X is said to be σ-compact, if there exists a sequence (K_n)_{n=1}^∞ of compact subsets of X, such that A = ⋃_{n=1}^∞ K_n. A set B ⊂ X is said to be relatively σ-compact, if there exists a σ-compact set A with B ⊂ A. We set

P_σc(X) = {B ∈ P(X) : B relatively σ-compact},

and we define

Bor_σc(X) = Bor(X) ∩ P_σc(X).




Proposition 2.4. Let X be a topological Hausdorff space.

(i) P_σc(X) is a σ-ring on X;

(ii) the σ-ring Bor_σc(X) coincides with the σ-ring S(C_X) generated by the collection C_X of all compact subsets of X.

Proof. Using the notations from Proposition 2.3, we have P_σc(X) = P^{C_X}_σ(X), so part (i) is a consequence of Proposition 2.3.(i). By Proposition 2.3.(ii) we also know that

S(C_X) = Σ(C_X) ∩ P_σc(X) = Bor_c(X) ∩ P_σc(X),

and since Bor_c(X) ⊂ Bor(X), we have the inclusion

S(C_X) ⊂ Bor_σc(X).

To prove the other inclusion, all we need to show is the inclusion

Bor_σc(X) ⊂ Bor_c(X).

Start with some arbitrary set B ∈ Bor_σc(X), and let us prove that B ∈ Bor_c(X). Since B is relatively σ-compact, there exists a sequence (K_n)_{n=1}^∞ of compact sets, such that B ⊂ ⋃_{n=1}^∞ K_n. Define, for each integer n ≥ 1, the set B_n = B ∩ K_n. Since B = ⋃_{n=1}^∞ B_n, it suffices to show that

(9) B_n ∈ Bor_c(X), ∀ n ∈ N.

Fix n, and let us analyze the inclusion ι_n : K_n → X. Denote by T the collection of all open sets in X, and denote by T|_{K_n} the collection of all sets D ⊂ K_n, which are open in the induced topology, that is,

T|_{K_n} = {D ∩ K_n : D ∈ T}.

By the Generating Theorem (Example 2.4), we know that

Bor(X)|_{K_n} = Σ(T)|_{K_n} = Σ_{K_n}({D ∩ K_n : D ∈ T}) = Σ_{K_n}(T|_{K_n}) = Bor(K_n).

(Here the notation Σ_{K_n} indicates that the σ-algebra is taken on K_n.) In particular, we get

(10) B_n = B ∩ K_n ∈ Bor(X)|_{K_n} = Bor(K_n), ∀ n ∈ N.

Since K_n is compact, the σ-ring S(C_{K_n}), generated by all compact subsets of K_n, is a σ-algebra on K_n (simply because it contains K_n). Notice that every set D ∈ T|_{K_n} is of the form K_n ∖ F, with F ⊂ K_n compact (in X), therefore D belongs to S(C_{K_n}). Since S(C_{K_n}) is a σ-algebra on K_n, which contains T|_{K_n}, we have

Bor(K_n) = Σ_{K_n}(T|_{K_n}) ⊂ S(C_{K_n}) ⊂ S(C_X) ⊂ Bor_c(X), ∀ n ∈ N.

Now (9) immediately follows from the above inclusions, combined with (10).

Remark 2.2. For a topological Hausdorff space, we always have the inclusions

Bor_σc(X) ⊂ Bor_c(X) ⊂ Bor(X).

The following are equivalent:

(i) Bor_σc(X) = Bor(X);
(ii) X is σ-compact.

The following result explains when a minimal set of generators can be chosen for the Borel sets.



Proposition 2.5. Let X be a topological space which is second countable, i.e. there is a countable base for the topology. If S is any sub-base for the topology (countable or not), then

Bor(X) = Σ(S).

Proof. Denote by T the collection of all open sets in X. Denote by V the collection of all subsets of X, which can be written as finite intersections of sets in S. It is obvious that S ⊂ V ⊂ Σ(S), so we have the equality Σ(S) = Σ(V). This means that it suffices to prove the equality

(11) Σ(T) = Σ(V).

Notice that V is a base for the topology, which means that every open subset D ⊊ X can be written as a union of sets in V. What we want to prove is

Claim: Every open set D ⊊ X is a countable union of sets in V.

To prove this fact, we fix an open set D ⊊ X, as well as a countable base B = {B_n}_{n=1}^∞ for the topology. For every x ∈ D we define the set

M_x = {n ∈ N : there exists V ∈ V, such that x ∈ B_n ⊂ V ⊂ D}.

It is pretty clear that M_x ≠ ∅, ∀ x ∈ D. (First use the fact that V is a base, to find V ∈ V such that x ∈ V ⊂ D, and then use the fact that B is a base to find n such that x ∈ B_n ⊂ V.) If we put M = ⋃_{x∈D} M_x, then it is pretty obvious that ⋃_{n∈M} B_n = D. For every n ∈ M we choose some V_n ∈ V with B_n ⊂ V_n ⊂ D (use the fact that n must belong to some M_x). It is then clear that D = ⋃_{n∈M} V_n, and the claim follows.

As a consequence of the Claim, we see that any open set D ⊊ X automatically belongs to Σ(V), and then we have the inclusion T ⊂ Σ(V) ⊂ Σ(T). This clearly forces the equality (11).

Corollary 2.3. Let I be a set which is at most countable, and let (X_i)_{i∈I} be a collection of second countable topological spaces. Then one has the equality

(12) Bor( ∏_{i∈I} X_i ) = Σ-⨉_{i∈I} Bor(X_i),

where the product space ∏_{i∈I} X_i is equipped with the product topology.

Proof. By the definition of the product σ-algebra, we know that

Σ-⨉_{i∈I} Bor(X_i) = Σ( ⋃_{j∈I} π_j^* Bor(X_j) ),

where π_j : ∏_{i∈I} X_i → X_j, j ∈ I, denote the projection maps. Choose, for each j ∈ I, a countable sub-base S_j for X_j, so that we have the equalities

Bor(X_j) = Σ(S_j), ∀ j ∈ I.

By Remark 2.1 we have the equality

Σ-⨉_{i∈I} Bor(X_i) = Σ( ⋃_{j∈I} π_j^* S_j ).

Since the collection ⋃_{i∈I} π_i^* S_i is a countable sub-base for the product topology, the above equality, combined with Proposition 2.5, immediately gives (12).




Exercise 6. A. Prove that, if X is second countable, and S is a sub-base for its topology (countable or not), with ⋃_{S∈S} S = X, then we have in fact the equality

Bor(X) = S(S).

B. Prove that, if X is Hausdorff, second countable, with card X ≥ 2, then for any sub-base S (countable or not), we have the equality

Bor(X) = S(S).

Hints: Follow the proof above. Remark that every open set D ⊂ X, which is a countable union of sets in V, belongs in fact to the σ-ring S(V) = S(S). So in either case, we only have to show that X is a countable union of sets in V.

In case A, we trace the proof of the Claim, and we notice that the only property that we used was the fact that, for every x ∈ D, there exists V ∈ V with x ∈ V ⊂ D, i.e. D is a (possibly uncountable) union of sets in V. Since X itself satisfies this property, it follows that X is also a countable union of sets in V.

In case B, we use the Hausdorff property to write X = D_1 ∪ D_2, with D_1, D_2 ⊊ X open.

Corollary 2.4. If X is a topological Hausdorff space, which is second countable, and X is infinite (as a set), then card Bor(X) = c.

Proof. First of all, since X is infinite, one can choose an infinite countable subset A ⊂ X. Then A, and all its subsets, are Borel, i.e. we have the inclusion P(A) ⊂ Bor(X), thus proving the inequality

card Bor(X) ≥ card P(A) = 2^{ℵ_0} = c.

Secondly, one can choose a base V for the topology, which is countable. We now have Bor(X) = S(V ∪ {X}), so by Corollary 2.1 we get

card Bor(X) ≤ (card(V ∪ {X}))^{ℵ_0} ≤ ℵ_0^{ℵ_0} = c,

and the desired equality follows.

Examples 2.5. A. Consider the extended real line [−∞, ∞] = R ∪ {−∞, ∞}, thought of as a compact space, homeomorphic to the interval [−π/2, π/2], via the map f : [−π/2, π/2] → [−∞, ∞], defined by

f(t) = −∞ if t = −π/2,
f(t) = tan t if −π/2 < t < π/2,
f(t) = ∞ if t = π/2.

Notice that, when restricted to R = (−∞, ∞), this topology agrees with the usual topology. In particular, this gives the equality Bor([−∞, ∞])|_R = Bor(R).

Let A ⊂ R be a dense subset. Consider the collections

E_1 = {(a, ∞] : a ∈ A}; E_2 = {[a, ∞] : a ∈ A};
E_3 = {[−∞, a) : a ∈ A}; E_4 = {[−∞, a] : a ∈ A}.

With these notations we have the equalities

Bor([−∞, ∞]) = Σ(E_1) = Σ(E_2) = Σ(E_3) = Σ(E_4).

First of all, we notice that each set in E_1 ∪ E_2 ∪ E_3 ∪ E_4 is either open or closed, which means that

E_1 ∪ E_2 ∪ E_3 ∪ E_4 ⊂ Bor([−∞, ∞]),

thus giving the inclusions

Σ(E_k) ⊂ Bor([−∞, ∞]), ∀ k ∈ {1, 2, 3, 4}.



Second, we observe that E_1 ∪ E_3 is a sub-base for the topology, and since [−∞, ∞] is obviously second countable, we will have the equality

Bor([−∞, ∞]) = Σ(E_1 ∪ E_3).

So, in order to finish the proof we only need to show the inclusions

(13) E_1 ∪ E_3 ⊂ Σ(E_k), ∀ k ∈ {1, 2, 3, 4}.

Since every set in E_2 has its complement in E_3, and vice versa, we have the inclusions

E_2 ⊂ Σ(E_3) and E_3 ⊂ Σ(E_2),

which prove the equality

(14) Σ(E_2) = Σ(E_3).

Likewise, we have the equality

(15) Σ(E_1) = Σ(E_4).

This means that we only have to prove (13) for k = 2 and k = 4. The case k = 2 amounts to proving that E_1 ⊂ Σ(E_2). Fix some a ∈ A. For every integer n ≥ 1 we choose a_n ∈ (a, a + 1/n) ∩ A. Then the equality

(a, ∞] = ⋃_{n=1}^∞ [a_n, ∞]

clearly shows that (a, ∞] ∈ Σ(E_2).

The case k = 4 amounts to proving that E_3 ⊂ Σ(E_4). Fix some a ∈ A. For every integer n ≥ 1 we choose a_n ∈ (a − 1/n, a) ∩ A. Then the equality

[−∞, a) = ⋃_{n=1}^∞ [−∞, a_n]

clearly shows that [−∞, a) ∈ Σ(E_4).

B. If we work on R, and we consider the collections

E^0_k = {E ∩ R : E ∈ E_k}, k = 1, 2, 3, 4,

then by the Generating Theorem (Example 2.4) we have the equalities

Bor(R) = Σ(E^0_k) = S(E^0_k), k = 1, 2, 3, 4.

(The fact that the σ-algebra Σ(E^0_k) and the σ-ring S(E^0_k) coincide is a consequence of the fact that E^0_k is σ-total in R.)

C. Let X be a separable metric space. Let A ⊂ X be a dense set, and let R ⊂ (0, ∞) be a subset with inf R = 0. Then the collection

S_{A,R} = {B_r(a) : r ∈ R, a ∈ A}

is clearly a base for the metric topology. Since X is separable, one can choose both A and R to be countable, which proves that X is automatically second countable. Then for any choice of A and R, we will have the equality

(16) Bor(X) = Σ({B_r(a) : r ∈ R, a ∈ A}) = S({B_r(a) : r ∈ R, a ∈ A}).

(The equality between the generated σ-algebra and σ-ring follows from Exercise 6.A.) As particular cases when the equality (16) holds, one has the metric spaces which are σ-compact.




Exercise 7*. Let I be an uncountable set, and let (X_i)_{i∈I} be a collection of topological spaces. Assume that for each i ∈ I, there exists at least one non-empty closed subset F_i ⊊ X_i. (This is the case for example when X_i is Hausdorff, and card X_i ≥ 2.) Prove that one has a strict inclusion

Bor( ∏_{i∈I} X_i ) ⊋ Σ-⨉_{i∈I} Bor(X_i).

Hint: For every subset J ⊂ I, define the projection map π_J : ∏_{i∈I} X_i → ∏_{i∈J} X_i. Consider the collection

A = {A ⊂ ∏_{i∈I} X_i : there exists J ⊂ I countable, such that A = π_J^{-1}(π_J(A))}.

Prove that A ∪ {∅} is a σ-algebra, which contains ⋃_{i∈I} π_i^* Bor(X_i). Prove that one has a strict inclusion Bor( ∏_{i∈I} X_i ) ⊋ A ∪ {∅}, by constructing a non-empty closed set F ⊂ ∏_{i∈I} X_i, which does not belong to A.



Lecture 20

3. Measurable spaces and measurable maps

In this section we discuss a certain type of maps related to σ-algebras.

Definitions. A measurable space is a pair (X, A) consisting of a (non-empty) set X and a σ-algebra A on X.

Given two measurable spaces (X, A) and (Y, B), a measurable map T : (X, A) → (Y, B) is simply a map T : X → Y, with the property

(1) T^{-1}(B) ∈ A, ∀ B ∈ B.

Remark 3.1. In terms of the constructions outlined in Section 2, measurability for maps can be characterized as follows. Given measurable spaces (X, A) and (Y, B), and a map T : X → Y, the following are equivalent:

(i) T : (X, A) → (Y, B) is measurable;
(ii) T^*B ⊂ A;
(iii) T_*A ⊃ B.

Recall that

T^*B = {T^{-1}(B) : B ∈ B};
T_*A = {B ⊂ Y : T^{-1}(B) ∈ A}.

With these equalities, everything is immediate.

The following summarizes some useful properties of measurable maps.

Proposition 3.1. Let (X, A) be a measurable space.

(i) If A′ is any σ-algebra on X, with A′ ⊂ A, then the identity map Id_X : (X, A) → (X, A′) is measurable.

(ii) For any subset M ⊂ X, the inclusion map ι : (M, A|_M) → (X, A) is measurable.

(iii) If (Y, B) and (Z, C) are measurable spaces, and if T : (X, A) → (Y, B) and S : (Y, B) → (Z, C) are measurable maps, then the composition S ∘ T : (X, A) → (Z, C) is again a measurable map.

Proof. (i). This is trivial, since (Id_X)^*A′ = A′ ⊂ A.

(ii). This is again trivial, since ι^*A = A|_M.

(iii). Start with some set C ∈ C, and let us prove that (S ∘ T)^{-1}(C) ∈ A. We know that (S ∘ T)^{-1}(C) = T^{-1}(S^{-1}(C)). Since S is measurable, we have S^{-1}(C) ∈ B, and since T is measurable, we have T^{-1}(S^{-1}(C)) ∈ A.

Often, one would like to check the measurability condition (1) on a small collection of B's. Such a criterion is the following.

Lemma 3.1. Let (X, A) and (Y, B) be measurable spaces. Assume B = Σ(E), for some collection of sets E ⊂ P(Y). For a map T : X → Y, the following are equivalent:

(i) T : (X, A) → (Y, B) is measurable;
(ii) T^{-1}(E) ∈ A, ∀ E ∈ E.

Proof. The implication (i) ⇒ (ii) is trivial.

To prove the implication (ii) ⇒ (i), assume (ii) holds. We first observe that condition (ii) reads T^*E ⊂ A. Since A is a σ-algebra, we get the inclusion

Σ(T^*E) ⊂ A.

Using the Generating Theorem 2.2, we have

T^*B = T^*Σ(E) = Σ(T^*E) ⊂ A,

and, by the preceding remark, we are done.
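Lemma 3.1 says that measurability only has to be tested on generators. On finite sets (where σ-algebras are just algebras) this is easy to see in a small experiment. The Python sketch below is our own illustration; the names `generated_algebra`, `preimage`, `measurable_on`, `T_good`, `T_bad` and the sample data are all hypothetical.

```python
def generated_algebra(E, X):
    """Brute-force closure of E ∪ {X} under union and complement in X."""
    X = frozenset(X)
    alg = {frozenset(s) for s in E} | {X}
    changed = True
    while changed:
        changed = False
        for a in list(alg):
            for c in [X - a] + [a | b for b in list(alg)]:
                if c not in alg:
                    alg.add(c)
                    changed = True
    return alg


def preimage(T, B, X):
    """T^{-1}(B) as a subset of X."""
    return frozenset(x for x in X if T(x) in B)


def measurable_on(T, family, A, X):
    """True if T^{-1}(B) lies in A for every B in the given family."""
    return all(preimage(T, B, X) in A for B in family)


def T_good(x):
    """Sample map that is constant on the atoms {0,1}, {2,3}, {4,5}."""
    return x // 2


def T_bad(x):
    """Sample map that mixes the atoms."""
    return x % 3


X, Y = range(6), range(3)
A = generated_algebra([{0, 1}, {2, 3}], X)   # σ-algebra on X
E = [{0}, {1}]                               # generators on Y
B = generated_algebra(E, Y)                  # Σ(E) on Y
```

For both sample maps, testing on the two generators in `E` gives the same answer as testing on all of `B`: `T_good` passes both checks and `T_bad` fails both.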

Corollary 3.1. Let (X, A) be a measurable space, let Y be a topological Hausdorff space which is second countable, and let S be a sub-base for the topology of Y. For a map T : X → Y, the following are equivalent:

(i) T : (X, A) → (Y, Bor(Y)) is a measurable map;

(ii) T^{-1}(S) ∈ A, ∀ S ∈ S.

Proof. Immediate from the above Lemma, and Proposition 2.5, which states that Bor(Y) = Σ(S).

We know (see Section 2) that the type Σ is consistent and natural. In particular, measurability behaves nicely with respect to products and disjoint unions. More explicitly one has the following.

Proposition 3.2. Let (X_i, A_i)_{i∈I} be a collection of measurable spaces. Consider the sets X = ∏_{i∈I} X_i and Y = ⊔_{i∈I} X_i, and the σ-algebras

A = Σ-⨉_{i∈I} A_i and B = ⋁_{i∈I} A_i.

Let (Z, G) be a measurable space.

(i) If we denote by π_i : X → X_i, i ∈ I, the projection maps, then a map f : (Z, G) → (X, A) is measurable, if and only if, all the maps π_i ∘ f : (Z, G) → (X_i, A_i), i ∈ I, are measurable.

(ii) If we denote by ε_i : X_i → Y, i ∈ I, the inclusion maps, then a map g : (Y, B) → (Z, G) is measurable, if and only if, all the maps g ∘ ε_i : (X_i, A_i) → (Z, G), i ∈ I, are measurable.

Proof. (i). By the definition of the product σ-algebra, we know that

(2) A = Σ( ⋃_{i∈I} π_i^* A_i ).

If we fix some index i ∈ I, then the obvious inclusion π_i^* A_i ⊂ A immediately shows that π_i : (X, A) → (X_i, A_i) is measurable. Therefore, if f : (Z, G) → (X, A) is measurable, then by Proposition 3.1 it follows that all compositions π_i ∘ f : (Z, G) → (X_i, A_i), i ∈ I, are measurable.




Conversely, assume all the compositions π_i ∘ f are measurable, and let us show that f : (Z, G) → (X, A) is measurable. By Lemma 3.1 and (2), all we need to prove is the fact that

f^*( ⋃_{i∈I} π_i^* A_i ) ⊂ G,

which is equivalent to

f^*(π_i^* A_i) ⊂ G, ∀ i ∈ I.

But this is obvious, because f^*(π_i^* A_i) = (π_i ∘ f)^* A_i, and π_i ∘ f is measurable, for all i ∈ I.

(ii). By the definition of the σ-algebra sum, we know that

(3) B = ⋂_{i∈I} (ε_i)_* A_i.

If we fix some index i ∈ I, then the obvious inclusion (ε_i)_* A_i ⊃ B immediately shows that ε_i : (X_i, A_i) → (Y, B) is measurable. Therefore, if g : (Y, B) → (Z, G) is measurable, then by Proposition 3.1 it follows that all compositions g ∘ ε_i : (X_i, A_i) → (Z, G), i ∈ I, are measurable.

Conversely, assume all the compositions g ∘ ε_i are measurable, and let us show that g : (Y, B) → (Z, G) is measurable. This is equivalent to the inclusion g_*B ⊃ G. By (3) we immediately have

(4) g_*B = g_*( ⋂_{i∈I} (ε_i)_* A_i ) = ⋂_{i∈I} g_*((ε_i)_* A_i).

We know however that, since the maps g ∘ ε_i are all measurable, we have

g_*((ε_i)_* A_i) = (g ∘ ε_i)_* A_i ⊃ G, ∀ i ∈ I,

so the desired inclusion is an immediate consequence of (4).

Conventions. Let (X, A) be a measurable space. An extended real-valued function f : (X, A) → [−∞, ∞] is said to be a measurable function, if it is measurable in the above sense as a map f : (X, A) → ([−∞, ∞], Bor([−∞, ∞])). If f has values in R, this is equivalent to the fact that f is measurable as a map f : (X, A) → (R, Bor(R)). Likewise, a complex valued function f : (X, A) → C is measurable, if it is measurable as a map f : (X, A) → (C, Bor(C)). If K is one of the fields R or C, we define the set

B_K(X, A) = {f : (X, A) → K : f measurable function}.

Remark 3.2. Let (X, A) be a measurable space. If A ⊂ R is a dense subset, then the results from Section 2, combined with Lemma 3.1, show that the measurability of a function f : (X, A) → [−∞, ∞] is equivalent to any of the following conditions:

• f^{-1}((a, ∞]) ∈ A, ∀ a ∈ A;
• f^{-1}([a, ∞]) ∈ A, ∀ a ∈ A;
• f^{-1}([−∞, a)) ∈ A, ∀ a ∈ A;
• f^{-1}([−∞, a]) ∈ A, ∀ a ∈ A.

Definition. If X and Y are topological Hausdorff spaces, a map T : X → Y is said to be Borel measurable, if T is measurable as a map

T :

X,Bor(X ) →

Y,Bor(Y )

.



164 LECTURE 20

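In the cases when $Y = \mathbb{R}, \mathbb{C}, [-\infty,\infty]$, a Borel measurable map will be simply called a Borel measurable function.

For $\mathbb{K} = \mathbb{R}, \mathbb{C}$, we define

\[ B_{\mathbb{K}}(X) = \big\{ f : X \to \mathbb{K} \,:\, f \text{ Borel measurable function} \big\}. \]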

Remark 3.3. If $X$ and $Y$ are topological Hausdorff spaces, then any continuous map $T : X \to Y$ is Borel measurable. This follows from Lemma 3.1, from the fact that

\[ \mathrm{Bor}(Y) = \Sigma\big(\{ D \subset Y : D \text{ open} \}\big), \]

and the fact that $T^{-1}(D)$ is open, hence in $\mathrm{Bor}(X)$, for every open set $D \subset Y$.

Measurable maps behave nicely with respect to “measurable countable operations,” as suggested by the following result.

Proposition 3.3. Let $(X,\mathcal{A})$ and $(Z,\mathcal{B})$ be measurable spaces, let $I$ be a set which is at most countable, and let $(Y_i)_{i\in I}$ be a family of topological Hausdorff spaces, each of which is second countable. Suppose a measurable map $T_i : (X,\mathcal{A}) \to \big(Y_i, \mathrm{Bor}(Y_i)\big)$ is given, for each $i \in I$. Define the map $T : X \to \prod_{i\in I} Y_i$ by

\[ T(x) = \big(T_i(x)\big)_{i\in I}, \quad \forall\, x \in X. \]

Equip the product space $Y = \prod_{i\in I} Y_i$ with the product topology.

For any measurable map $g : \big(Y, \mathrm{Bor}(Y)\big) \to (Z,\mathcal{B})$, the composition $g \circ T : (X,\mathcal{A}) \to (Z,\mathcal{B})$ is measurable.

Proof. We know (see Corollary 2.3) that we have the equality

\[ \mathrm{Bor}(Y) = \Sigma\text{-}\prod_{i\in I} \mathrm{Bor}(Y_i). \]

By Proposition 3.2, the map $T : (X,\mathcal{A}) \to \big(Y, \mathrm{Bor}(Y)\big)$ is measurable, so by Proposition 3.1, the composition $g \circ T : (X,\mathcal{A}) \to (Z,\mathcal{B})$ is also measurable.

The above result has many useful applications.

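Corollary 3.2. Suppose $(X,\mathcal{A})$ is a measurable space, and $\mathbb{K}$ is either $\mathbb{R}$ or $\mathbb{C}$. Then, when equipped with point-wise addition and multiplication, the set $B_{\mathbb{K}}(X,\mathcal{A})$ is a unital $\mathbb{K}$-algebra.

Proof. Clearly the constant function 1 is measurable.

Also, if $f \in B_{\mathbb{K}}(X,\mathcal{A})$ and $\lambda \in \mathbb{K}$, then the function $\lambda f$ is again measurable, since it can be written as the composition $M_\lambda \circ f$, where $M_\lambda : \mathbb{K} \ni \alpha \longmapsto \lambda\alpha \in \mathbb{K}$ is obviously continuous.

Finally, let us show that if $f_1, f_2 \in B_{\mathbb{K}}(X,\mathcal{A})$, then $f_1 + f_2$ and $f_1 \cdot f_2$ again belong to $B_{\mathbb{K}}(X,\mathcal{A})$. This is however immediate from Proposition 3.3, applied to the index set $I = \{1, 2\}$, the spaces $Y_1 = Y_2 = \mathbb{K}$, and the continuous maps

\[ g_1 : \mathbb{K}^2 \ni (\lambda_1, \lambda_2) \longmapsto \lambda_1 + \lambda_2 \in \mathbb{K}, \]

\[ g_2 : \mathbb{K}^2 \ni (\lambda_1, \lambda_2) \longmapsto \lambda_1 \cdot \lambda_2 \in \mathbb{K}. \]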

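Corollary 3.3. If $(X,\mathcal{A})$ is a measurable space, then a complex valued function $f : X \to \mathbb{C}$ is measurable, if and only if the real valued functions $\mathrm{Re}\, f, \mathrm{Im}\, f : X \to \mathbb{R}$ are measurable.

Proof. If $f$ is measurable, then composing $f$ with the continuous maps

\[ \rho : \mathbb{C} \ni z \longmapsto \mathrm{Re}\, z \in \mathbb{R} \quad\text{and}\quad \gamma : \mathbb{C} \ni z \longmapsto \mathrm{Im}\, z \in \mathbb{R}, \]

immediately gives the measurability of $\mathrm{Re}\, f = \rho \circ f$ and $\mathrm{Im}\, f = \gamma \circ f$.

Conversely, if both $\mathrm{Re}\, f, \mathrm{Im}\, f : X \to \mathbb{R}$ are measurable, then the measurability of $f$ follows from Proposition 3.3, applied to $Y_1 = Y_2 = \mathbb{R}$, the functions $f_1 = \mathrm{Re}\, f$ and $f_2 = \mathrm{Im}\, f$, and to the continuous function

\[ g : \mathbb{R}^2 \ni (a, b) \longmapsto a + bi \in \mathbb{C}. \]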

Corollary 3.4. Let $(X,\mathcal{A})$ be a measurable space, let $I$ be a set which is at most countable, and let $f_i : (X,\mathcal{A}) \to [-\infty,\infty]$, $i \in I$, be a collection of measurable functions. Then the functions $g, h : X \to [-\infty,\infty]$, defined by

\[ g(x) = \inf\{ f_i(x) : i \in I \} \quad\text{and}\quad h(x) = \sup\{ f_i(x) : i \in I \}, \quad \forall\, x \in X, \]

are both measurable.

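Proof. Define the maps $m, M : \prod_{i\in I} [-\infty,\infty] \to [-\infty,\infty]$ by

\[ m(x) = \inf\{ x_i : i \in I \} \quad\text{and}\quad M(x) = \sup\{ x_i : i \in I \}, \quad \forall\, x = (x_i)_{i\in I} \in \prod_{i\in I} [-\infty,\infty]. \]

By Proposition 3.3, it suffices to prove the (Borel) measurability of the maps $m$ and $M$.

To prove the measurability of $m$, we are going to show that

\[ m^{-1}\big([-\infty,a)\big) \in \mathrm{Bor}\Big(\prod_{i\in I} [-\infty,\infty]\Big), \quad \forall\, a \in \mathbb{R}. \]

But this is quite obvious, since a point $x = (x_i)_{i\in I}$ belongs to $m^{-1}\big([-\infty,a)\big)$, if and only if there exists some $j \in I$ with $x_j < a$. In other words, if we define the projections $\pi_j : \prod_{i\in I} [-\infty,\infty] \to [-\infty,\infty]$, then we have

\[ m^{-1}\big([-\infty,a)\big) = \bigcup_{j\in I} \pi_j^{-1}\big([-\infty,a)\big). \]

This shows that in fact $m^{-1}\big([-\infty,a)\big)$ is open, hence clearly Borel.

To prove the measurability of $M$, we are going to show that

\[ M^{-1}\big((a,\infty]\big) \in \mathrm{Bor}\Big(\prod_{i\in I} [-\infty,\infty]\Big), \quad \forall\, a \in \mathbb{R}. \]

But this is again clear, since, as before, we have the equality

\[ M^{-1}\big((a,\infty]\big) = \bigcup_{j\in I} \pi_j^{-1}\big((a,\infty]\big), \]

which shows that in fact $M^{-1}\big((a,\infty]\big)$ is open, hence Borel.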
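The two set identities used in the proof can be checked on a small finite model. The following is only a sketch: the three functions and the sample grid are hypothetical choices made for illustration, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction

# Hypothetical finite family (f_i) and sample grid; not taken from the notes.
fs = [lambda x: x * x, lambda x: 1 - x, lambda x: Fraction(1, 2)]
grid = [Fraction(k, 10) for k in range(-10, 21)]

def inf_sublevel(a):
    """Grid points x with inf_i f_i(x) < a."""
    return {x for x in grid if min(f(x) for f in fs) < a}

def union_of_sublevels(a):
    """Union over i of the grid points x with f_i(x) < a."""
    out = set()
    for f in fs:
        out |= {x for x in grid if f(x) < a}
    return out

def sup_superlevel(a):
    """Grid points x with sup_i f_i(x) > a."""
    return {x for x in grid if max(f(x) for f in fs) > a}

def union_of_superlevels(a):
    """Union over i of the grid points x with f_i(x) > a."""
    out = set()
    for f in fs:
        out |= {x for x in grid if f(x) > a}
    return out

levels = [Fraction(-1), Fraction(1, 4), Fraction(1, 2), Fraction(3)]
print(all(inf_sublevel(a) == union_of_sublevels(a) for a in levels))
print(all(sup_superlevel(a) == union_of_superlevels(a) for a in levels))
```

The identities hold for any index set; countability of $I$ is needed only to conclude that the countable union of measurable sets is again measurable.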

Corollary 3.5. Let $(X,\mathcal{A})$ be a measurable space, and let $f_n : (X,\mathcal{A}) \to [-\infty,\infty]$, $n \in \mathbb{N}$, be a sequence of measurable functions. Then the functions $g, h : X \to [-\infty,\infty]$, defined by

\[ g(x) = \liminf_{n\to\infty} f_n(x) \quad\text{and}\quad h(x) = \limsup_{n\to\infty} f_n(x), \quad \forall\, x \in X, \]

are both measurable.




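Proof. For every $n \in \mathbb{N}$, define the functions $g_n, h_n : X \to [-\infty,\infty]$ by

\[ g_n(x) = \inf\{ f_k(x) : k \ge n \} \quad\text{and}\quad h_n(x) = \sup\{ f_k(x) : k \ge n \}, \quad \forall\, x \in X. \]

By Corollary 3.4, we know that $g_n$ and $h_n$ are measurable for all $n \in \mathbb{N}$. Since

\[ g(x) = \sup\{ g_n(x) : n \in \mathbb{N} \} \quad\text{and}\quad h(x) = \inf\{ h_n(x) : n \in \mathbb{N} \}, \quad \forall\, x \in X, \]

the fact that both $g$ and $h$ are measurable follows again from Corollary 3.4.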

Corollary 3.6. Let $(X,\mathcal{A})$ be a measurable space, and let

\[ f_n : (X,\mathcal{A}) \to [-\infty,\infty], \quad n \in \mathbb{N}, \]

be a sequence of measurable functions, with the property that, for each $x \in X$, the sequence $\big(f_n(x)\big)_{n=1}^\infty \subset [-\infty,\infty]$ has a limit. Then the function $f : X \to [-\infty,\infty]$, defined by

\[ f(x) = \lim_{n\to\infty} f_n(x), \quad \forall\, x \in X, \]

is again measurable.

Proof. Immediate from the above result.

Exercise 1. If $f_n : \mathbb{R} \to \mathbb{R}$, $n \in \mathbb{N}$, are continuous functions, and if $f(x) = \lim_{n\to\infty} f_n(x)$ exists, for every $x \in \mathbb{R}$, then by the above Corollary we know that $f : \mathbb{R} \to [-\infty,\infty]$ is Borel measurable. Prove that the converse is not true. More explicitly, prove that there is no sequence $(f_n)_{n=1}^\infty$ of continuous functions, with

\[ \lim_{n\to\infty} f_n(x) = \kappa_{\mathbb{Q}}(x), \quad \forall\, x \in \mathbb{R}. \]

Hint: Use Baire’s Theorem.

Exercise 2. Prove that a function $f : \mathbb{R} \to \mathbb{R}$, which is continuous everywhere, except for a countable set of points, is Borel measurable. As an application, prove that any monotone function is Borel measurable.

Corollary 3.6 can be generalized, as follows.

Theorem 3.1. Let $(X,\mathcal{A})$ be a measurable space, let $Y$ be a separable metric space, and let

\[ T_n : (X,\mathcal{A}) \to \big(Y, \mathrm{Bor}(Y)\big), \quad n \in \mathbb{N}, \]

be a sequence of measurable maps. Assume that, for every $x \in X$, the sequence $\big(T_n(x)\big)_{n=1}^\infty \subset Y$ is convergent. Define the map $T : X \to Y$ by

\[ T(x) = \lim_{n\to\infty} T_n(x), \quad \forall\, x \in X. \]

Then $T : (X,\mathcal{A}) \to \big(Y, \mathrm{Bor}(Y)\big)$ is a measurable map.

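Proof. Denote by $d$ the metric on $Y$. The collection

\[ \mathcal{V} = \big\{ B_r(y) : y \in Y,\ r > 0 \big\} \]

is a base for the topology of $Y$. Since $Y$ is second countable, it suffices then to show that

\[ T^{-1}\big(B_r(y)\big) \in \mathcal{A}, \quad \forall\, y \in Y,\ r > 0. \tag{5} \]

Claim: For every $y \in Y$ and $r > 0$ one has the equality

\[ T^{-1}\big(B_r(y)\big) = \bigcup_{m,n=1}^\infty \ \bigcap_{k=m}^\infty T_k^{-1}\big(B_{r-\frac{1}{n}}(y)\big). \tag{6} \]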




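Denote the set in the right hand side simply by $A$. Start first with some $x \in A$. There exist some $m, n \in \mathbb{N}$ such that

\[ x \in \bigcap_{k=m}^\infty T_k^{-1}\big(B_{r-\frac{1}{n}}(y)\big), \]

which means that

\[ T_k(x) \in B_{r-\frac{1}{n}}(y), \quad \forall\, k \ge m, \]

that is,

\[ d\big(T_k(x), y\big) < r - \tfrac{1}{n}, \quad \forall\, k \ge m. \]

Passing to the limit ($k \to \infty$) then yields

\[ d\big(T(x), y\big) \le r - \tfrac{1}{n} < r, \]

which means that $T(x) \in B_r(y)$, i.e. $x \in T^{-1}\big(B_r(y)\big)$, thus proving the inclusion $A \subset T^{-1}\big(B_r(y)\big)$.

Conversely, if $x \in T^{-1}\big(B_r(y)\big)$, we get $T(x) \in B_r(y)$, i.e. $d\big(T(x), y\big) < r$.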

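Choose an integer $n$ such that

\[ d\big(T(x), y\big) < r - \tfrac{2}{n}. \tag{7} \]

Since $\lim_{k\to\infty} T_k(x) = T(x)$, there exists some $m \in \mathbb{N}$ such that

\[ d\big(T_k(x), T(x)\big) < \tfrac{1}{n}, \quad \forall\, k \ge m. \]

Combining this with (7) then gives

\[ d\big(T_k(x), y\big) \le d\big(T(x), y\big) + d\big(T_k(x), T(x)\big) < r - \tfrac{2}{n} + \tfrac{1}{n} = r - \tfrac{1}{n}, \quad \forall\, k \ge m, \]

which means that

\[ x \in \bigcap_{k=m}^\infty T_k^{-1}\big(B_{r-\frac{1}{n}}(y)\big), \]

hence $x$ indeed belongs to $A$.

Having proven (6) we now observe that, since the $T_k$'s are measurable, it follows that

\[ T_k^{-1}\big(B_{r-\frac{1}{n}}(y)\big) \in \mathcal{A}, \quad \forall\, k, n \in \mathbb{N},\ r > 0. \]

Using the fact that $\mathcal{A}$ is closed under countable intersections, it follows that

\[ \bigcap_{k=m}^\infty T_k^{-1}\big(B_{r-\frac{1}{n}}(y)\big) \in \mathcal{A}, \quad \forall\, m, n \in \mathbb{N},\ r > 0. \]

Finally, using the fact that $\mathcal{A}$ is closed under countable unions, the desired property (5) follows.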

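Exercise 3. Let $(X,\mathcal{A})$ be a measurable space, and let $(X_n)_{n=1}^\infty$ be a sequence of sets in $\mathcal{A}$, with $X = \bigcup_{n=1}^\infty X_n$. Suppose $(Y,\mathcal{B})$ is a measurable space, and $F : X \to Y$ is a map, such that

\[ F\big|_{X_n} : \big(X_n, \mathcal{A}\big|_{X_n}\big) \to (Y,\mathcal{B}) \]

is measurable, for all $n \in \mathbb{N}$. Prove that $F : (X,\mathcal{A}) \to (Y,\mathcal{B})$ is measurable.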




Exercise 4*. Let $\Omega_1 \subset \mathbb{R}^n$ be an open set, and let $f_1, \dots, f_n : \Omega_1 \to \mathbb{R}$ be $C^1$ functions, with the property that the matrix

\[ A(p) = \Big[ \frac{\partial f_j}{\partial x_k}(p) \Big]_{j,k=1}^n \]

is invertible, for every point $p \in \Omega_1$. Define the map

\[ F : \Omega_1 \ni p \longmapsto \big( f_1(p), \dots, f_n(p) \big) \in \mathbb{R}^n. \]

(i) Prove that the set $\Omega_2 = F(\Omega_1)$ is open in $\mathbb{R}^n$.

(ii) Although $F : \Omega_1 \to \Omega_2$ may fail to be injective, prove that there exists a Borel measurable map $\phi : \Omega_2 \to \Omega_1$, with $F \circ \phi = \mathrm{Id}_{\Omega_2}$.

Hint: Use the Inverse Function Theorem, combined with Exercises 2 and 3.

Exercise 5*. Let $P(z)$ be a non-constant polynomial with complex coefficients. Prove that there exists a Borel measurable function $f : \mathbb{C} \to \mathbb{C}$, such that

\[ (P \circ f)(z) = z, \quad \forall\, z \in \mathbb{C}. \]

Hint: Use the preceding exercise, applied to the set $\Omega_1 = \{ z \in \mathbb{C} : P'(z) \ne 0 \}$.

The preceding exercise can be generalized:

Exercise 6*. Let $\Omega_1 \subset \mathbb{C}$ be a connected open set, and let $f : \Omega_1 \to \mathbb{C}$ be a non-constant holomorphic function. By the Open Mapping Theorem we know that the set $\Omega_2 = f(\Omega_1)$ is open. Prove that there exists a Borel measurable function $\phi : \Omega_2 \to \Omega_1$, such that $f \circ \phi = \mathrm{Id}_{\Omega_2}$.

Hint: Use Exercise 4, applied to the set $\Omega_0 = \{ z \in \Omega_1 : f'(z) \ne 0 \}$. Since $f$ is non-constant, the set $\Omega_1 \smallsetminus \Omega_0$ is countable.

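We continue with a discussion on the role of elementary functions.

Proposition 3.4. Let $(X,\mathcal{A})$ be a measurable space, and let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$. For an elementary function $f \in \mathrm{Elem}_{\mathbb{K}}(X)$, the following are equivalent:

(i) $f \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{K}}(X)$;

(ii) $f : (X,\mathcal{A}) \to \mathbb{K}$ is measurable.

Proof. (i) ⇒ (ii). We know that $\mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{K}}(X) = \mathrm{Span}_{\mathbb{K}}\{ \kappa_A : A \in \mathcal{A} \}$. Since $B_{\mathbb{K}}(X,\mathcal{A})$ is a vector space, it suffices to show only that $\kappa_A : (X,\mathcal{A}) \to \mathbb{K}$ is measurable, for all $A \in \mathcal{A}$. But this is trivial, since for every Borel set $B \subset \mathbb{K}$ one has either $\kappa_A^{-1}(B) = \varnothing$, or $\kappa_A^{-1}(B) = A$, or $\kappa_A^{-1}(B) = X \smallsetminus A$, or $\kappa_A^{-1}(B) = X$, depending on which of the values 0 and 1 belong to $B$.

(ii) ⇒ (i). Assume now $f$ is measurable. List the range of $f$ as

\[ f(X) = \{ \lambda_1, \dots, \lambda_n \}, \]

with $\lambda_j \ne \lambda_k$, for all $j, k \in \{1, \dots, n\}$ with $j \ne k$. Since $f$ is measurable, and the singleton sets $\{\lambda_1\}, \dots, \{\lambda_n\}$ are in $\mathrm{Bor}(\mathbb{K})$, it follows that the sets $A_j = f^{-1}\big(\{\lambda_j\}\big)$, $j = 1, \dots, n$, are all in $\mathcal{A}$. Since we clearly have

\[ f = \lambda_1 \kappa_{A_1} + \cdots + \lambda_n \kappa_{A_n}, \]

it follows that $f$ indeed belongs to $\mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{K}}(X)$.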




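Remarks 3.4. A. If $(X,\mathcal{A})$ and $(Y,\mathcal{B})$ are measurable spaces, if $T : (X,\mathcal{A}) \to (Y,\mathcal{B})$ is a measurable map, and if $f \in \mathcal{B}\text{-}\mathrm{Elem}_{\mathbb{K}}(Y)$, then $f \circ T \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{K}}(X)$. This follows from the fact that the composition $f \circ T : (X,\mathcal{A}) \to \mathbb{K}$ is measurable, and elementary.

B. If $(X,\mathcal{A})$ is a measurable space, if $f \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{K}}(X)$, and if $g : f(X) \to \mathbb{K}$ is an arbitrary function, then $g \circ f \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{K}}(X)$. This follows from the fact that, if one considers the finite set $Y = f(X)$, and the σ-algebra $\mathcal{P}(Y)$ on it, then the maps

\[ (X,\mathcal{A}) \xrightarrow{\ f\ } \big(Y, \mathcal{P}(Y)\big) \xrightarrow{\ g\ } \mathbb{K} \]

are measurable. So $g \circ f$ is also measurable, and obviously elementary.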

The following is an interesting converse of Corollary 3.6.

Theorem 3.2. Let $(X,\mathcal{A})$ be a measurable space, and let $f : (X,\mathcal{A}) \to [-\infty,\infty]$ be a measurable function. Then there exists a sequence $(f_n)_{n=1}^\infty \subset \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, such that

• $\inf\{ f(y) : y \in X \} \le f_n(x) \le \sup\{ f(z) : z \in X \}$, $\forall\, x \in X$, $n \ge 1$;

• $\lim_{n\to\infty} f_n(x) = f(x)$, $\forall\, x \in X$.

Moreover,

(i) if $\inf\{ f(x) : x \in X \} > -\infty$, then the sequence $(f_n)_{n=1}^\infty$ can be chosen to be non-decreasing, i.e. $f_n \le f_{n+1}$, $\forall\, n \in \mathbb{N}$;

(ii) if $\sup\{ f(x) : x \in X \} < \infty$, then the sequence $(f_n)_{n=1}^\infty$ can be chosen to be non-increasing, i.e. $f_n \ge f_{n+1}$, $\forall\, n \in \mathbb{N}$;

(iii) if $\inf\{ f(x) : x \in X \} > -\infty$ and $\sup\{ f(x) : x \in X \} < \infty$, then the sequence $(f_n)_{n=1}^\infty$ can be chosen either non-decreasing, or non-increasing, and such that it converges uniformly to $f$, i.e.

\[ \lim_{n\to\infty} \Big[ \sup_{x\in X} \big| f_n(x) - f(x) \big| \Big] = 0. \]

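Proof. We begin with a special case of (iii). Assume $X = [0,1]$, $\mathcal{A} = \mathrm{Bor}([0,1])$, and consider the inclusion $F : [0,1] \hookrightarrow [-\infty,\infty]$. For each $n \in \mathbb{N}$, define the intervals $I^n_k, J^n_k$, $0 \le k \le 2^n - 1$, by

\[ I^n_k = \big[ k/2^n, (k+1)/2^n \big), \text{ if } 0 \le k \le 2^n - 2; \qquad I^n_{2^n-1} = \big[ (2^n-1)/2^n, 1 \big], \]

\[ J^n_k = \big( k/2^n, (k+1)/2^n \big], \text{ if } 1 \le k \le 2^n - 1; \qquad J^n_0 = \big[ 0, 1/2^n \big]. \]

We then define, for each $n \in \mathbb{N}$, the functions $g_n, h_n : [0,1] \to \mathbb{R}$ by

\[ g_n = 2^{-n} \sum_{k=0}^{2^n-1} k\, \kappa_{I^n_k} \quad\text{and}\quad h_n = 2^{-n} \sum_{k=0}^{2^n-1} (k+1)\, \kappa_{J^n_k}. \]

Remark that

\[ 0 \le g_n(s) < 1 \quad\text{and}\quad 0 < h_n(s) \le 1, \quad \forall\, s \in [0,1]. \tag{8} \]

Note that, for every $n \in \mathbb{N}$, we have

\[ g_n(0) = 0; \qquad g_n(1) = (2^n - 1)/2^n; \tag{9} \]

\[ h_n(0) = 1/2^n; \qquad h_n(1) = 1. \tag{10} \]

Claim 1: The sequence $(g_n)_{n=1}^\infty$ is non-decreasing, and the sequence $(h_n)_{n=1}^\infty$ is non-increasing.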




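Using (9) and (10), we only need to examine the restrictions to the open interval $(0,1)$. Fix some point $s \in (0,1)$. For every integer $n \ge 1$, define

\[ p^s_n = \max\Big\{ k \in \mathbb{Z} : 0 \le \frac{k}{2^n} < s \Big\}. \]

We clearly have $p^s_n < 2^n$ and

\[ \frac{p^s_n}{2^n} < s \le \frac{p^s_n + 1}{2^n}. \tag{11} \]

We then have

\[ g_n(s) = \begin{cases} p^s_n/2^n & \text{if } s \ne (p^s_n + 1)/2^n \\ (p^s_n + 1)/2^n & \text{if } s = (p^s_n + 1)/2^n \end{cases} \quad\text{and}\quad h_n(s) = \frac{p^s_n + 1}{2^n}. \tag{12} \]

We now estimate $g_{n+1}(s)$ and $h_{n+1}(s)$. First of all, using (11), we have

\[ \frac{2p^s_n}{2^{n+1}} < s \le \frac{2p^s_n + 2}{2^{n+1}}, \]

which means that either $p^s_{n+1} = 2p^s_n$, or $p^s_{n+1} = 2p^s_n + 1$. This immediately gives

\[ h_{n+1}(s) = \frac{p^s_{n+1} + 1}{2^{n+1}} \le \frac{2p^s_n + 2}{2^{n+1}} = \frac{p^s_n + 1}{2^n} = h_n(s). \]

Note that, if $s = (p^s_n + 1)/2^n$, we will have $p^s_{n+1} = 2p^s_n + 1$ and $s = (p^s_{n+1} + 1)/2^{n+1}$, so we get

\[ g_{n+1}(s) = (p^s_{n+1} + 1)/2^{n+1} = (2p^s_n + 2)/2^{n+1} = (p^s_n + 1)/2^n = g_n(s). \]

If $s \ne (p^s_n + 1)/2^n$, then

\[ g_n(s) = \frac{p^s_n}{2^n} = \frac{2p^s_n}{2^{n+1}} \le \frac{p^s_{n+1}}{2^{n+1}} \le g_{n+1}(s). \]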

Claim 2: One has

\[ \lim_{n\to\infty} \Big[ \sup_{s\in[0,1]} \big| g_n(s) - s \big| \Big] = \lim_{n\to\infty} \Big[ \sup_{s\in[0,1]} \big| h_n(s) - s \big| \Big] = 0. \]

To prove this fact we are going to estimate the differences $|g_n(s) - s|$ and $|h_n(s) - s|$. If $s = 0$ or $s = 1$, then the equalities (9) and (10) immediately show that

\[ |g_n(s) - s| \le \frac{1}{2^n} \quad\text{and}\quad |h_n(s) - s| \le \frac{1}{2^n}, \quad \forall\, n \in \mathbb{N}. \tag{13} \]

If $s \in (0,1)$, then the definitions of $g_n(s)$ and $h_n(s)$ clearly show that

\[ s,\ g_n(s),\ h_n(s) \in \big[ p^s_n/2^n,\ (p^s_n + 1)/2^n \big], \]

and then we see that we again have the inequalities (13). Since (13) now holds for all $s \in [0,1]$, the Claim immediately follows.
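The dyadic step functions $g_n$ and $h_n$ are easy to evaluate, which gives a quick sanity check of Claims 1 and 2. The sketch below assumes the half-open intervals $I^n_k = [k/2^n, (k+1)/2^n)$ and $J^n_k = (k/2^n, (k+1)/2^n]$ (closed at the end points 1 and 0 respectively); the helper names `g` and `h` are ours, and exact rationals are used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction
from math import floor, ceil

def g(n, s):
    """g_n(s): left end point of the dyadic interval I^n_k that contains s."""
    s = Fraction(s)
    k = min(floor(s * 2**n), 2**n - 1)   # s = 1 lies in the last interval
    return Fraction(k, 2**n)

def h(n, s):
    """h_n(s): right end point of the dyadic interval J^n_k that contains s."""
    s = Fraction(s)
    k = max(ceil(s * 2**n) - 1, 0)       # s = 0 lies in J^n_0
    return Fraction(k + 1, 2**n)

s = Fraction(1, 3)
for n in range(1, 6):
    # g_n(s) increases to s, h_n(s) decreases to s, both within 1/2^n
    print(n, g(n, s), h(n, s))
```

For instance, with $s = 1/3$ and $n = 3$ one gets $g_3(s) = 2/8$ and $h_3(s) = 3/8$, the end points of the dyadic interval of length $1/8$ around $s$.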

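We proceed now with the proof of the theorem. Define

\[ \alpha = \inf\{ f(x) : x \in X \} \quad\text{and}\quad \beta = \sup\{ f(x) : x \in X \}. \]

If $\alpha = \beta$, there is nothing to prove. Assume $\alpha < \beta$. Depending on the finiteness of $\alpha$ and $\beta$, we define a homeomorphism $\Phi : [\alpha, \beta] \to [0,1]$, as follows.

(a) If $\alpha > -\infty$ and $\beta < \infty$, we define

\[ \Phi(s) = \frac{s - \alpha}{\beta - \alpha}, \quad \forall\, s \in [\alpha, \beta]. \]

(b) If $\alpha > -\infty$ and $\beta = \infty$, we define

\[ \Phi(s) = \begin{cases} \frac{2}{\pi} \arctan(s - \alpha) & \text{if } s \ne \beta \\ 1 & \text{if } s = \beta \end{cases} \]

(c) If $\alpha = -\infty$ and $\beta < \infty$, we define

\[ \Phi(s) = \begin{cases} 1 + \frac{2}{\pi} \arctan(s - \beta) & \text{if } s \ne \alpha \\ 0 & \text{if } s = \alpha \end{cases} \]

(d) If $\alpha = -\infty$ and $\beta = \infty$, we define

\[ \Phi(s) = \begin{cases} 0 & \text{if } s = \alpha \\ \frac{1}{2} + \frac{1}{\pi} \arctan s & \text{if } \alpha < s < \beta \\ 1 & \text{if } s = \beta \end{cases} \]

Notice that $\Phi(\alpha) = 0$, $\Phi(\beta) = 1$, and

\[ \alpha \le s < t \le \beta \ \Rightarrow\ \Phi(s) < \Phi(t). \]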

After these preparations, we proceed with the proof. We begin with the special cases (i), (ii) and (iii).

If $\alpha > -\infty$, we define the functions $f_n = \Phi^{-1} \circ g_n \circ \Phi \circ f$. Since $\Phi$ and $\Phi^{-1}$ are increasing, and $(g_n)_{n=1}^\infty$ is non-decreasing, it follows that $(f_n)_{n=1}^\infty$ is non-decreasing. Since $0 \le g_n(s) < 1$, $\forall\, s \in [0,1]$, we see that $\alpha \le f_n(x) < \beta$, $\forall\, x \in X$. In particular, we have $-\infty < f_n(x) < \infty$, for all $n$ and $x$. It is obvious that $f_n$ is elementary and measurable, and since $\lim_{n\to\infty} g_n(s) = s$, $\forall\, s \in [0,1]$ (by Claim 2), we immediately get $\lim_{n\to\infty} f_n(x) = f(x)$, $\forall\, x \in X$.

If $\beta < \infty$, we define the functions $f_n = \Phi^{-1} \circ h_n \circ \Phi \circ f$. Since $\Phi$ and $\Phi^{-1}$ are increasing, and $(h_n)_{n=1}^\infty$ is non-increasing, it follows that $(f_n)_{n=1}^\infty$ is non-increasing. Since $0 < h_n(s) \le 1$, $\forall\, s \in [0,1]$, we see that $\alpha < f_n(x) \le \beta$, $\forall\, x \in X$. In particular, we have $-\infty < f_n(x) < \infty$, for all $n$ and $x$. It is obvious that $f_n$ is elementary and measurable, and since $\lim_{n\to\infty} h_n(s) = s$, $\forall\, s \in [0,1]$ (by Claim 2), we immediately get $\lim_{n\to\infty} f_n(x) = f(x)$, $\forall\, x \in X$.

If $\alpha > -\infty$ and $\beta < \infty$, then we can take $f_n = \Phi^{-1} \circ g_n \circ \Phi \circ f$, $\forall\, n$, or we can take $f_n = \Phi^{-1} \circ h_n \circ \Phi \circ f$, $\forall\, n$. The inequalities (13), combined with the definition (a) of $\Phi$, show that

\[ \big| f_n(x) - f(x) \big| \le \frac{\beta - \alpha}{2^n}, \quad \forall\, x \in X,\ n \in \mathbb{N}, \]

with any of the above choices for $(f_n)_{n=1}^\infty$.
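In the bounded case the approximants $f_n = \Phi^{-1} \circ g_n \circ \Phi \circ f$ can be computed exactly. The following sketch uses the affine $\Phi$ of case (a); the sample function `f` on $X = [-1,1]$ and the helper name `approximate` are hypothetical choices made for illustration only.

```python
from fractions import Fraction
from math import floor

def g(n, s):
    """Dyadic step function g_n on [0, 1] (left end point of the dyadic interval)."""
    k = min(floor(Fraction(s) * 2**n), 2**n - 1)
    return Fraction(k, 2**n)

def approximate(f, alpha, beta, n):
    """Return f_n = Phi^{-1} o g_n o Phi o f, with Phi(s) = (s - alpha)/(beta - alpha)."""
    alpha, beta = Fraction(alpha), Fraction(beta)
    phi = lambda s: (s - alpha) / (beta - alpha)
    phi_inv = lambda t: alpha + t * (beta - alpha)
    return lambda x: phi_inv(g(n, phi(f(x))))

# Hypothetical example: f(x) = 3x^2 - 2 on [-1, 1], so alpha = -2 and beta = 1.
f = lambda x: 3 * x * x - 2
x = Fraction(1, 3)
for n in range(1, 6):
    fn = approximate(f, -2, 1, n)
    print(n, fn(x), f(x) - fn(x))   # the error stays in [0, 3/2^n]
```

Each $f_n$ takes at most $2^n$ values, and the error bound $(\beta - \alpha)/2^n$ does not depend on the point $x$, which is the uniform convergence asserted in (iii).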

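Having proven the cases (i), (ii) and (iii), we now examine the general situation, when $\alpha = -\infty$ and $\beta = \infty$. Consider the functions $f^+, f^- : X \to [-\infty,\infty]$ defined by

\[ f^+(x) = \max\{ f(x), 0 \} \quad\text{and}\quad f^-(x) = \min\{ f(x), 0 \}, \quad \forall\, x \in X. \]

By Corollary 3.4, both $f^+$ and $f^-$ are measurable. Since $\inf_{x\in X} f^+(x) \ge 0$, by part (i), there exists a sequence $(f^+_n)_{n=1}^\infty \subset \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, such that $\lim_{n\to\infty} f^+_n(x) = f^+(x)$, $\forall\, x \in X$. Since $\sup_{x\in X} f^-(x) \le 0$, by part (ii), there exists a sequence $(f^-_n)_{n=1}^\infty \subset \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, such that $\lim_{n\to\infty} f^-_n(x) = f^-(x)$, $\forall\, x \in X$. Define the elementary functions $f_n = f^+_n + f^-_n$, $n \in \mathbb{N}$. Clearly the $f_n$'s are all in $\mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$.

We now check that

\[ \lim_{n\to\infty} f_n(x) = f(x), \quad \forall\, x \in X. \tag{14} \]

There are two cases to examine: (a) $f(x) \ge 0$; (b) $f(x) \le 0$.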




In case (a), we have $f^+(x) = f(x)$ and $f^-(x) = 0$, so $\lim_{n\to\infty} f^+_n(x) = f(x)$ and $\lim_{n\to\infty} f^-_n(x) = 0$.

In case (b), we have $f^+(x) = 0$ and $f^-(x) = f(x)$, so $\lim_{n\to\infty} f^+_n(x) = 0$ and $\lim_{n\to\infty} f^-_n(x) = f(x)$.

In either case, the equality (14) follows.

We conclude this section with a discussion on an interesting measurable space, that appears often in connection with probability theory.

Example 3.1. Consider the space $T = \{0,1\}^{\aleph_0}$, i.e.

\[ T = \big\{ a = (\alpha_n)_{n=1}^\infty : \alpha_n \in \{0,1\},\ \forall\, n \in \mathbb{N} \big\}. \]

We call $T$ the space of infinite coin flippings, having in mind that an element of $T$ is the same as the outcome of an infinite sequence of coin flips (think of 0 as corresponding to tails, and 1 as corresponding to heads). Equip $T$ with the product topology. By Tihonov’s Theorem, $T$ is compact. The product topology on $T$ is in fact given by a metric $d$ defined by

\[ d(a, b) = \sum_{n=1}^\infty \frac{|\alpha_n - \beta_n|}{2^n}, \quad \forall\, a = (\alpha_n)_{n=1}^\infty,\ b = (\beta_n)_{n=1}^\infty \in T. \]

For every number $r \ge 2$ we define a map $\phi_r : T \to [0,1]$ by

\[ \phi_r(a) = (r - 1) \sum_{n=1}^\infty \frac{\alpha_n}{r^n}, \quad \forall\, a = (\alpha_n)_{n=1}^\infty \in T. \]

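It is pretty clear that

\[ \big| \phi_r(a) - \phi_r(b) \big| \le (r - 1)\, d(a, b), \quad \forall\, a, b \in T, \]

so the maps $\phi_r : T \to [0,1]$, $r \ge 2$, are continuous. In particular, the set $K_r = \phi_r(T)$ is a compact subset of $[0,1]$.

Define

\[ T_0 = \big\{ a = (\alpha_n)_{n\in\mathbb{N}} \in T : \text{the set } \{ n \in \mathbb{N} : \alpha_n = 0 \} \text{ is infinite} \big\}. \]

The set $T \smallsetminus T_0$ can be described as:

\[ T \smallsetminus T_0 = \big\{ (\alpha_n)_{n\in\mathbb{N}} \in T : \text{there exists } N \in \mathbb{N}, \text{ such that } \alpha_n = 1,\ \forall\, n \ge N \big\}. \]

The following are well known (see Appendix B, the proof of Proposition B.2).

Facts:

1. The set $T \smallsetminus T_0$ is countable.

2. For any $r \ge 2$, and elements $a = (\alpha_n)_{n=1}^\infty$, $b = (\beta_n)_{n=1}^\infty \in T_0$, the following are equivalent:

• there exists $N \in \mathbb{N}$ such that $\alpha_N = 1$, $\beta_N = 0$, and $\alpha_n = \beta_n$, for all $n \in \mathbb{N}$ with $n < N$;

• $\phi_r(a) > \phi_r(b)$.

In particular, the map $\phi_r\big|_{T_0} : T_0 \to [0,1]$ is injective.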
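The restriction to $T_0$ matters for injectivity, as a small computation suggests. The sketch below evaluates $\phi_r$ exactly on sequences that are eventually constant, represented by a finite prefix and a constant tail bit; this representation and the helper name `phi` are assumptions made for the illustration.

```python
from fractions import Fraction

def phi(r, prefix, tail):
    """phi_r of the sequence prefix[0], ..., prefix[N-1], tail, tail, tail, ..."""
    r = Fraction(r)
    N = len(prefix)
    head = sum(Fraction(bit) / r**(n + 1) for n, bit in enumerate(prefix))
    # geometric tail: (r - 1) * sum_{n > N} tail / r^n = tail / r^N
    return (r - 1) * head + Fraction(tail) / r**N

# (0, 1, 1, 1, ...) is not in T_0, while (1, 0, 0, 0, ...) is.
print(phi(2, (0,), 1), phi(2, (1,), 0))   # both equal 1/2: phi_2 is not injective on T
print(phi(3, (0,), 1), phi(3, (1,), 0))   # 1/3 and 2/3: distinct values for r = 3
```

For $r = 2$ this is the familiar ambiguity of binary expansions ($0.0111\ldots = 0.1000\ldots$), which involves only the countable set $T \smallsetminus T_0$.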

The above constructions have a remarkable feature.

Theorem 3.3. Use the notations above. For a number $r \ge 2$ and a subset $A \subset T$, the following are equivalent:

(i) $A \in \mathrm{Bor}(T)$;

(ii) $\phi_r(A) \in \mathrm{Bor}(K_r)$.




Proof. Throughout the proof the number $r$ will be fixed. The map $\phi_r$ will be denoted by $\phi$, and the compact set $K_r$ will be denoted by $K$.

Since $\phi : T \to K$ is continuous, it is measurable, i.e. we have the implication

\[ B \in \mathrm{Bor}(K) \ \Rightarrow\ \phi^{-1}(B) \in \mathrm{Bor}(T). \tag{15} \]

Before we proceed with the actual proof, we need some preparations. Remark that, since $\phi : T \to K$ is surjective, we have the equality

\[ \phi\big(\phi^{-1}(C)\big) = C, \quad \forall\, C \subset K. \tag{16} \]

Claim 1: A subset $C \subset K$ is at most countable, if and only if the set $\phi^{-1}(C) \subset T$ is at most countable.

Suppose $C$ is at most countable. If we take $A_0 = \phi^{-1}(C) \cap T_0$, and $A_1 = \phi^{-1}(C) \smallsetminus T_0$, then obviously $\phi^{-1}(C) = A_0 \cup A_1$. Since $A_1 \subset T \smallsetminus T_0$, and $T \smallsetminus T_0$ is countable, it follows that $A_1$ is at most countable, so we only need to prove that $A_0$ is at most countable. But since $\phi\big|_{T_0}$ is injective, and $A_0 \subset T_0$, it follows that $\phi\big|_{A_0} : A_0 \to C$ is injective, and then the fact that $C$ is at most countable forces $A_0$ to be at most countable.

Conversely, if $\phi^{-1}(C)$ is at most countable, then so is $\phi\big(\phi^{-1}(C)\big)$. By (16) we are done.

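For each subset $A \subset T$, we define

\[ \widetilde{A} = \phi^{-1}\big(\phi(A)\big). \]

Remark that $A \subset \widetilde{A}$, $\forall\, A \subset T$. Note also that, for any family $(A_i)_{i\in I}$ of subsets of $T$, one has the equality

\[ \Big(\bigcup_{i\in I} A_i\Big)^{\sim} = \phi^{-1}\Big(\phi\Big(\bigcup_{i\in I} A_i\Big)\Big) = \phi^{-1}\Big(\bigcup_{i\in I} \phi(A_i)\Big) = \bigcup_{i\in I} \phi^{-1}\big(\phi(A_i)\big) = \bigcup_{i\in I} \widetilde{A_i}. \tag{17} \]

As an application of Claim 1, to the set $C = \phi(T \smallsetminus T_0)$, we see that

(∗) the set $(T \smallsetminus T_0)^{\sim}$ is at most countable.

Claim 2: For any subset $A \subset T_0$, one has the inclusion

\[ \widetilde{A} \smallsetminus A \subset T \smallsetminus T_0. \]

In particular, the difference $\widetilde{A} \smallsetminus A$ is at most countable.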

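Start with an arbitrary element $x \in \widetilde{A} \smallsetminus A$. This means that $x \notin A$, but $\phi(x) \in \phi(A)$, which means that there exists some $a \in A$, with $\phi(x) = \phi(a)$. Assume now $x \notin T \smallsetminus T_0$, which means that $x \in T_0$. But then, the fact that $x, a \in T_0$, combined with the injectivity of $\phi\big|_{T_0}$, will force $x = a$, which is impossible since $a \in A$ and $x \notin A$.

Claim 3: For any set $A \subset T$, the difference $\widetilde{A} \smallsetminus A$ is at most countable.

Take $A_0 = A \cap T_0$ and $A_1 = A \smallsetminus A_0$. Notice that, since $A_1 \subset T \smallsetminus T_0$, we have

\[ \widetilde{A_1} = \phi^{-1}\big(\phi(A_1)\big) \subset \phi^{-1}\big(\phi(T \smallsetminus T_0)\big) = (T \smallsetminus T_0)^{\sim}, \]

so, by (∗), it follows that $\widetilde{A_1}$ is at most countable. We obviously have $A = A_0 \cup A_1$, so by (17)

\[ \widetilde{A} = \widetilde{A_0} \cup \widetilde{A_1}. \]

But now we are done, since

\[ \widetilde{A} \smallsetminus A = \big(\widetilde{A_0} \cup \widetilde{A_1}\big) \smallsetminus \big(A_0 \cup A_1\big) \subset \big(\widetilde{A_0} \smallsetminus A_0\big) \cup \widetilde{A_1}, \]

and both $\widetilde{A_0} \smallsetminus A_0$ (by Claim 2) and $\widetilde{A_1}$ are at most countable.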




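Claim 4: For any subset $A \subset T$, one has the inclusion

\[ \phi(T \smallsetminus A) \supset K \smallsetminus \phi(A), \tag{18} \]

and the difference $\phi(T \smallsetminus A) \smallsetminus \big(K \smallsetminus \phi(A)\big)$ is at most countable.

The inclusion (18) is pretty obvious, from the surjectivity of $\phi$. In order to prove that the difference

\[ C = \phi(T \smallsetminus A) \smallsetminus \big(K \smallsetminus \phi(A)\big) = \phi(T \smallsetminus A) \cap \phi(A) \]

is countable, by Claim 1, it suffices to prove that $\phi^{-1}(C)$ is countable. We have

\[ \phi^{-1}(C) = \phi^{-1}\big(\phi(T \smallsetminus A) \cap \phi(A)\big) = \phi^{-1}\big(\phi(T \smallsetminus A)\big) \cap \phi^{-1}\big(\phi(A)\big) = (T \smallsetminus A)^{\sim} \cap \widetilde{A}. \]

We can write $\phi^{-1}(C) = A_1 \cup A_2$, where

\[ A_1 = (T \smallsetminus A) \cap \widetilde{A} \quad\text{and}\quad A_2 = \big[ (T \smallsetminus A)^{\sim} \smallsetminus (T \smallsetminus A) \big] \cap \widetilde{A}, \]

so it suffices to prove that both $A_1$ and $A_2$ are at most countable. But these facts are immediate from Claim 3, since $A_1 = \widetilde{A} \smallsetminus A$, and $A_2 \subset (T \smallsetminus A)^{\sim} \smallsetminus (T \smallsetminus A)$.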

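We can now proceed with the proof of the theorem. Define

\[ \mathcal{A} = \big\{ A \subset T : \phi(A) \in \mathrm{Bor}(K) \big\}, \]

so that what we need to prove is the equality $\mathcal{A} = \mathrm{Bor}(T)$.

First, remark that, if $A \in \mathcal{A}$, then $\phi(A) \in \mathrm{Bor}(K)$, and the fact that $\phi$ is Borel measurable will force $\widetilde{A} = \phi^{-1}\big(\phi(A)\big)$ to be a Borel set in $T$. But since $\widetilde{A} \smallsetminus A$ is countable, hence Borel, it follows that

\[ A = \widetilde{A} \smallsetminus \big(\widetilde{A} \smallsetminus A\big) \]

is again Borel. Therefore, we have the inclusion $\mathcal{A} \subset \mathrm{Bor}(T)$.

Second, remark that if $F \subset T$ is a compact subset, then the continuity of $\phi$ gives the fact that $\phi(F)$ is compact, hence Borel. This then forces $F \in \mathcal{A}$. Therefore $\mathcal{A}$ contains the collection $\mathcal{C}_T$ of all compact subsets of $T$.

Now we have

\[ \mathcal{C}_T \subset \mathcal{A} \subset \mathrm{Bor}(T) = \Sigma(\mathcal{C}_T), \]

so all we need to prove is the fact that $\mathcal{A}$ is a σ-algebra, i.e. we have the properties

(a) $A \in \mathcal{A} \Rightarrow T \smallsetminus A \in \mathcal{A}$;

(b) for any sequence $(A_n)_{n=1}^\infty \subset \mathcal{A}$, the union $\bigcup_{n=1}^\infty A_n$ also belongs to $\mathcal{A}$.

To check (a) start with some set $A \in \mathcal{A}$. We know that $\phi(A) \in \mathrm{Bor}(K)$, and we want to show that $\phi(T \smallsetminus A)$ is again Borel. By Claim 4, we know we can write

\[ \phi(T \smallsetminus A) = \big(K \smallsetminus \phi(A)\big) \cup C, \]

for some set $C \subset K$ which is at most countable. Since $C$ and $K \smallsetminus \phi(A)$ are Borel, this shows that $\phi(T \smallsetminus A)$ is also Borel.

Property (b) is obvious, since $\phi(A_n)$, $n \ge 1$, are all Borel, and

\[ \phi\Big(\bigcup_{n=1}^\infty A_n\Big) = \bigcup_{n=1}^\infty \phi(A_n). \]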

Corollary 3.7. Use the above notations. For a number $r \ge 2$ and a subset $B \subset K_r$, the following are equivalent:

(i) $B \in \mathrm{Bor}(K_r)$;

(ii) $\phi_r^{-1}(B) \in \mathrm{Bor}(T)$.




Proof. The implication (i) ⇒ (ii) is trivial, since $\phi_r$ is continuous, hence measurable.

Conversely, if the set $A = \phi_r^{-1}(B)$ is Borel, then by the Theorem, $\phi_r(A)$ is Borel. But since $\phi_r$ is surjective, we have $B = \phi_r(A)$.

Comments. From the above results, we see that $\phi_r : T \to K_r$ “almost preserves Borel structures.” More explicitly, if one considers the maps

\[ \Phi_r : \mathcal{P}(T) \ni A \longmapsto \phi_r(A) \in \mathcal{P}(K_r), \]

\[ \Psi_r : \mathcal{P}(K_r) \ni B \longmapsto \phi_r^{-1}(B) \in \mathcal{P}(T), \]

then

• $(\Phi_r \circ \Psi_r)(B) = B$, for all $B \subset K_r$;

• $(\Psi_r \circ \Phi_r)(A) \supset A$, and $(\Psi_r \circ \Phi_r)(A) \smallsetminus A$ is at most countable, for all $A \subset T$;

• $B \in \mathrm{Bor}(K_r) \Leftrightarrow \Psi_r(B) \in \mathrm{Bor}(T)$;

• $A \in \mathrm{Bor}(T) \Leftrightarrow \Phi_r(A) \in \mathrm{Bor}(K_r)$.

In the particular case $r = 2$, we know that $K_2 = [0,1]$, so we can think of the measurable space $\big([0,1], \mathrm{Bor}([0,1])\big)$ as “approximately the same” as the measurable space $\big(T, \mathrm{Bor}(T)\big)$.

The case $r = 3$ will be an interesting one, especially for constructing various counter-examples. The compact set $K_3 \subset [0,1]$ is called the ternary Cantor set.

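It turns out that there exists another useful description of the ternary Cantor set $K_3$, which yields some interesting properties.

Notations. We keep the notations above. An element $a = (\alpha_n)_{n=1}^\infty \in T$ will be called finite, if there exists some $N \in \mathbb{N}$, such that $\alpha_n = 0$, $\forall\, n \ge N$. We define

\[ T_{\mathrm{fin}} = \big\{ a \in T : a \text{ finite} \big\}. \]

Remark that $T_{\mathrm{fin}} \subset T_0$. In particular the map $\phi_3\big|_{T_{\mathrm{fin}}} : T_{\mathrm{fin}} \to K_3$ is injective. For $a \in T_{\mathrm{fin}}$ we define its length as

\[ \ell(a) = \min\big\{ N \in \mathbb{N} : \alpha_n = 0,\ \forall\, n \ge N \big\} - 1. \]

With this definition, for every $a = (\alpha_n)_{n=1}^\infty \in T_{\mathrm{fin}}$ with $\ell(a) \ge 1$, we have

\[ \alpha_{\ell(a)} = 1 \quad\text{and}\quad \alpha_n = 0,\ \forall\, n > \ell(a). \tag{19} \]

We define

\[ \Lambda = \big\{ (k, a) \in \mathbb{Z} \times T_{\mathrm{fin}} : k \ge \ell(a) \big\}. \]

Finally, for every pair $\lambda = (k, a) \in \Lambda$, we define the open interval

\[ I_\lambda = \Big( \phi_3(a) + \frac{1}{3^{k+1}},\ \phi_3(a) + \frac{2}{3^{k+1}} \Big). \]

Remark that, using (19) we have

\[ \phi_3(a) \le \sum_{n=1}^{\ell(a)} \frac{2}{3^n} = 1 - \frac{1}{3^{\ell(a)}}, \]

with the convention that the sum is 0, if $\ell(a) = 0$. We then get

\[ \phi_3(a) + \frac{2}{3^{k+1}} \le 1 - \frac{1}{3^{\ell(a)}} + \frac{2}{3^{k+1}} < 1 - \frac{1}{3^{\ell(a)}} + \frac{1}{3^k} \le 1, \]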



which gives the inclusion $I_\lambda \subset (0,1)$.

The following result describes an alternative construction of $K_3$.

Theorem 3.4. Use the notations above.

(i) The set $T_{\mathrm{fin}}$ is dense in $T$.

(ii) The system $(I_\lambda)_{\lambda\in\Lambda}$ is pair-wise disjoint.

(iii) $\bigcup_{\lambda\in\Lambda} I_\lambda = [0,1] \smallsetminus K_3$.

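Proof. The map $\phi_3$ will be simply denoted by $\phi$, and the Cantor set $K_3$ will be denoted simply by $K$.

(i). Fix some element $a = (\alpha_n)_{n=1}^\infty \in T$. For every integer $k \ge 1$ define the element $a^k = (\alpha^k_n)_{n=1}^\infty \in T$, by

\[ \alpha^k_n = \begin{cases} \alpha_n & \text{if } n \le k \\ 0 & \text{if } n > k \end{cases} \]

It is obvious that $a^k \in T_{\mathrm{fin}}$, $\forall\, k \in \mathbb{N}$. The inequality

\[ d(a, a^k) = \sum_{n=k+1}^\infty \frac{\alpha_n}{2^n} \le \sum_{n=k+1}^\infty \frac{1}{2^n} = \frac{1}{2^k}, \quad \forall\, k \in \mathbb{N} \]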

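then immediately shows that $\lim_{k\to\infty} a^k = a$.

(ii). Assume $\lambda, \mu \in \Lambda$ are such that $\lambda \ne \mu$, and let us prove that $I_\lambda \cap I_\mu = \varnothing$. Let $\lambda = (j, a)$ and $\mu = (k, b)$, where $a = (\alpha_n)_{n=1}^\infty$ and $b = (\beta_n)_{n=1}^\infty$ are elements in $T_{\mathrm{fin}}$ with $\ell(a) \le j$ and $\ell(b) \le k$. Since $\lambda \ne \mu$, we have one (or both) of the following cases: (a) $a \ne b$, or (b) $j \ne k$.

In case (a) we take

\[ m = \min\{ n \in \mathbb{N} : \alpha_n \ne \beta_n \}. \]

Without any loss of generality, we can assume that $\alpha_m = 0$ and $\beta_m = 1$. Note that $k \ge \ell(b) \ge m \ge 1$. We are going to prove that $I_\lambda \cap I_\mu = \varnothing$, by showing that the right end-point of $I_\lambda$ is not greater than the left end-point of $I_\mu$, that is,

\[ \phi(a) + \frac{2}{3^{j+1}} \le \phi(b) + \frac{1}{3^{k+1}}. \tag{20} \]

Define the number

\[ M = \sum_{n=1}^{m-1} \frac{\alpha_n}{3^n} = \sum_{n=1}^{m-1} \frac{\beta_n}{3^n}, \]

with the convention that $M = 0$, if $m = 1$. We have:

\[ \phi(a) = 2M + 2\sum_{n=m+1}^{\ell(a)} \frac{\alpha_n}{3^n} \le 2M + 2\sum_{n=m+1}^{\ell(a)} \frac{1}{3^n} = 2M + \frac{1}{3^m} - \frac{1}{3^{\ell(a)}}; \]

\[ \phi(b) = 2M + \frac{2}{3^m} + 2\sum_{n=m+1}^{\ell(b)} \frac{\beta_n}{3^n} \ge 2M + \frac{2}{3^m}. \]

The inequality (20) then follows immediately from:

\[ \phi(a) + \frac{2}{3^{j+1}} \le 2M + \frac{1}{3^m} - \frac{1}{3^{\ell(a)}} + \frac{2}{3^{j+1}} < 2M + \frac{1}{3^m} - \frac{1}{3^{\ell(a)}} + \frac{1}{3^j} \le 2M + \frac{1}{3^m} < 2M + \frac{2}{3^m} \le \phi(b) < \phi(b) + \frac{1}{3^{k+1}}. \]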



In case (b), based on the fact that we have proven case (a), we can assume, without any loss of generality, that $a = b$ and $j < k$. In this case we have

\[ \phi(b) + \frac{2}{3^{k+1}} = \phi(a) + \frac{2}{3^{k+1}} < \phi(a) + \frac{1}{3^k} \le \phi(a) + \frac{1}{3^{j+1}}, \]

which means that the right end-point of $I_\mu$ is not greater than the left end-point of $I_\lambda$, so again we get $I_\lambda \cap I_\mu = \varnothing$.

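For the proof of (iii) we are going to use the space

\[ P = \{0,1,2\}^{\aleph_0} = \big\{ (\alpha_n)_{n=1}^\infty : \alpha_n \in \{0,1,2\},\ \forall\, n \in \mathbb{N} \big\}. \]

Exactly as is the case with $T$, the product space $P$ is compact with respect to the product topology, which is given by the metric

\[ d(a, b) = \sum_{n=1}^\infty \frac{|\alpha_n - \beta_n|}{2^n}, \quad \forall\, a = (\alpha_n)_{n=1}^\infty,\ b = (\beta_n)_{n=1}^\infty \in P. \]

The map $\psi : P \to [0,1]$, defined by

\[ \psi(a) = \sum_{n=1}^\infty \frac{\alpha_n}{3^n}, \quad \forall\, a = (\alpha_n)_{n=1}^\infty \in P, \]

satisfies

\[ \big| \psi(a) - \psi(b) \big| \le d(a, b), \quad \forall\, a, b \in P, \]

hence it is continuous. Note also that $\psi$ is surjective. We can write $\phi = \psi \circ \rho$, where

\[ \rho : \{0,1\}^{\aleph_0} \ni (\alpha_n)_{n=1}^\infty \longmapsto (2\alpha_n)_{n=1}^\infty \in \{0,1,2\}^{\aleph_0}. \]

Note also that $\rho : T \to P$ is continuous, since we clearly have

\[ d\big(\rho(a), \rho(b)\big) \le 2\, d(a, b), \quad \forall\, a, b \in T. \]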

We now proceed with the proof of (iii). Denote the open set $\bigcup_{\lambda\in\Lambda} I_\lambda$ simply by $D$. Since $T_{\mathrm{fin}}$ is dense in $T$, it follows that $\phi(T_{\mathrm{fin}})$ is dense in $K = \phi(T)$. Therefore, in order to prove the inclusion $K \subset [0,1] \smallsetminus D$, it suffices to prove the inclusion

\[ \phi(T_{\mathrm{fin}}) \subset [0,1] \smallsetminus D. \]

Using the map $\psi : P \to [0,1]$, the above inclusion is equivalent to

\[ P \smallsetminus \rho(T_{\mathrm{fin}}) \supset \psi^{-1}(D). \tag{21} \]

In order to prove the inclusion $[0,1] \smallsetminus D \subset K$, again using the surjectivity of $\psi$, it suffices to prove the inclusion

\[ \psi^{-1}(D) \supset P \smallsetminus \psi^{-1}(K). \tag{22} \]

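To prove (21) start with some element $a = (\alpha_n)_{n=1}^\infty \in \psi^{-1}(D)$, which means that there exists some $b = (\beta_n)_{n=1}^\infty \in T_{\mathrm{fin}}$, and an integer $k \ge \ell(b)$, such that $\psi(a) \in I_{(k,b)}$, i.e.

\[ \frac{2\beta_1}{3} + \cdots + \frac{2\beta_k}{3^k} + \frac{1}{3^{k+1}} < \sum_{n=1}^\infty \frac{\alpha_n}{3^n} < \frac{2\beta_1}{3} + \cdots + \frac{2\beta_k}{3^k} + \frac{2}{3^{k+1}}. \tag{23} \]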




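We prove that $a \notin \rho(T_{\mathrm{fin}})$ by contradiction. Assume $a \in \rho(T_{\mathrm{fin}})$, which means that there exists $c = (\gamma_n)_{n=1}^\infty \in T_{\mathrm{fin}}$, such that $\alpha_n = 2\gamma_n$, $\forall\, n \in \mathbb{N}$. Define the element $\tilde{b} = (\tilde{\beta}_n)_{n=1}^\infty \in T_{\mathrm{fin}}$ by

\[ \tilde{\beta}_n = \begin{cases} \beta_n & \text{if } n \le k \\ 1 & \text{if } n = k + 1 \\ 0 & \text{if } n > k + 1 \end{cases} \]

With this definition, the inequalities (23) give

\[ \phi(b) < \phi(b) + \frac{1}{3^{k+1}} < \phi(c) < \phi(\tilde{b}). \tag{24} \]

By Fact 2 above, there exist $N, N' \in \mathbb{N}$ such that

• $\gamma_N = 1$, $\beta_N = 0$, and $\gamma_n = \beta_n$, for all $n \in \mathbb{N}$ with $n < N$;

• $\gamma_{N'} = 0$, $\tilde{\beta}_{N'} = 1$, and $\gamma_n = \tilde{\beta}_n$, for all $n \in \mathbb{N}$ with $n < N'$.

We will examine three cases: (a) $N < N'$, (b) $N = N'$, or (c) $N > N'$.

Case (b) is clearly impossible. In case (a), the inequality $N < N'$ forces $\beta_N = 0$, $\gamma_N = 1$ and $\tilde{\beta}_N = \gamma_N$, which means that $\tilde{\beta}_N = 1 \ne \beta_N = 0$. This clearly forces $N = k + 1 > \ell(b)$, which in particular gives $\tilde{\beta}_n = \beta_n = 0$, $\forall\, n > N$, so we clearly have $\gamma_n \ge \tilde{\beta}_n$, $\forall\, n \in \mathbb{N}$, so we get $\phi(c) \ge \phi(\tilde{b})$, thus contradicting (24). In case (c), we have $\gamma_{N'} = 0$, $\tilde{\beta}_{N'} = 1$, and since $N' < N$, we also have $\beta_{N'} = \gamma_{N'} = 0$. As before this would force $N' = k + 1$. We then have

\[ \phi(c) = 2\sum_{n=1}^\infty \frac{\gamma_n}{3^n} = 2\sum_{n=1}^{N'-1} \frac{\gamma_n}{3^n} + \frac{2\gamma_{N'}}{3^{N'}} + 2\sum_{n=N'+1}^\infty \frac{\gamma_n}{3^n} = 2\sum_{n=1}^{k} \frac{\beta_n}{3^n} + 0 + 2\sum_{n=k+2}^\infty \frac{\gamma_n}{3^n} = \]

\[ = \phi(b) + 2\sum_{n=k+2}^\infty \frac{\gamma_n}{3^n} \le \phi(b) + 2\sum_{n=k+2}^\infty \frac{1}{3^n} = \phi(b) + \frac{1}{3^{k+1}}, \]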
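again contradicting (24).

To prove (22), we start with some element $a \in P \smallsetminus \psi^{-1}(K)$, and we show that $\psi(a) \in D$. The fact that $a \notin \psi^{-1}(K)$ forces the fact that $a \notin \rho(T)$. In particular, this gives the fact that $a = (\alpha_n)_{n=1}^\infty \in \{0,1,2\}^{\aleph_0}$ and there exists some $n \in \mathbb{N}$ such that $\alpha_n = 1$. Put

\[ N = \min\{ n \in \mathbb{N} : \alpha_n = 1 \}. \]

Define the element $b = (\beta_n)_{n=1}^\infty \in \{0,1\}^{\aleph_0}$, by

\[ \beta_n = \begin{cases} \alpha_n/2 & \text{if } n < N \\ 0 & \text{if } n \ge N \end{cases} \]

Notice that $b \in T_{\mathrm{fin}}$, and $\ell(b) \le N - 1$. Notice also that $2\beta_n = \alpha_n$, for all $n \in \mathbb{N}$ with $n \le N - 1$. In particular, using the equality $\alpha_N = 1$, this gives

\[ \phi(b) + \frac{1}{3^N} = 2\sum_{n=1}^{N-1} \frac{\beta_n}{3^n} + \frac{\alpha_N}{3^N} = \sum_{n=1}^{N} \frac{\alpha_n}{3^n} \le \sum_{n=1}^\infty \frac{\alpha_n}{3^n} = \psi(a); \tag{25} \]

\[ \phi(b) + \frac{2}{3^N} = 2\sum_{n=1}^{N-1} \frac{\beta_n}{3^n} + \frac{\alpha_N}{3^N} + \sum_{n=N+1}^\infty \frac{2}{3^n} = \sum_{n=1}^{N} \frac{\alpha_n}{3^n} + \sum_{n=N+1}^\infty \frac{2}{3^n} \ge \sum_{n=1}^\infty \frac{\alpha_n}{3^n} = \psi(a). \tag{26} \]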




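Consider the pair $\lambda = (N - 1, b) \in \Lambda$. We are going to show that $\psi(a) \in I_\lambda$, i.e. we have the inequalities

\[ \phi(b) + \frac{1}{3^N} < \psi(a) < \phi(b) + \frac{2}{3^N}. \tag{27} \]

By (25) and (26) it suffices to prove only that

\[ \psi(a) \ne \phi(b) + \frac{1}{3^N} \quad\text{and}\quad \psi(a) \ne \phi(b) + \frac{2}{3^N}. \]

If $\psi(a) = \phi(b) + \frac{1}{3^N}$, then by the inequalities (25), we are forced to have

\[ \alpha_n = 0, \quad \forall\, n > N. \tag{28} \]

If $\psi(a) = \phi(b) + \frac{2}{3^N}$, then by the inequalities (26), we are forced to have

\[ \alpha_n = 2, \quad \forall\, n > N. \tag{29} \]

If (28) holds, we define $c = (\gamma_n)_{n=1}^\infty \in T$, by

\[ \gamma_n = \begin{cases} \alpha_n/2 & \text{if } n < N \\ 0 & \text{if } n = N \\ 1 & \text{if } n > N \end{cases} \]

and we will have

\[ \phi(c) = 2\sum_{n=1}^\infty \frac{\gamma_n}{3^n} = \sum_{n=1}^{N-1} \frac{2\gamma_n}{3^n} + 2\sum_{n=N+1}^\infty \frac{1}{3^n} = \sum_{n=1}^{N-1} \frac{\alpha_n}{3^n} + \frac{1}{3^N} = \psi(a), \]

thus forcing $\psi(a) \in K$, which is impossible.

If (29) holds, we define $c = (\gamma_n)_{n=1}^\infty \in T$, by

\[ \gamma_n = \begin{cases} \alpha_n/2 & \text{if } n < N \\ 1 & \text{if } n = N \\ 0 & \text{if } n > N \end{cases} \]

and we will have

\[ \phi(c) = 2\sum_{n=1}^{N-1} \frac{\gamma_n}{3^n} + \frac{2}{3^N} = \sum_{n=1}^{N-1} \frac{2\gamma_n}{3^n} + \frac{1}{3^N} + \sum_{n=N+1}^\infty \frac{2}{3^n} = \sum_{n=1}^\infty \frac{\alpha_n}{3^n} = \psi(a), \]

thus forcing again $\psi(a) \in K$, which is impossible.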

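Exercise 7. Using the notations above, prove that the set

\[ [0,1] \smallsetminus K_3 = \bigcup_{\lambda\in\Lambda} I_\lambda \]

is dense in $[0,1]$.

Hints: Define the set

\[ P_0 = \big\{ (\alpha_n)_{n=1}^\infty \in \{0,1,2\}^{\aleph_0} : \text{the set } \{ n \in \mathbb{N} : \alpha_n = 1 \} \text{ is infinite} \big\}. \]

Prove that $P_0$ is dense in $P$, and prove that $\psi(P_0) \subset [0,1] \smallsetminus K$. (Use the arguments employed in the proof of part (iii).)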

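Remarks 3.5. If we set $\Lambda_n = \Lambda \cap \big(\{n\} \times T_{\mathrm{fin}}\big)$, then we can write the complement of the ternary Cantor set as

\[ [0,1] \smallsetminus K_3 = \bigcup_{n=0}^\infty D_n, \]

where

\[ D_n = \bigcup_{\lambda\in\Lambda_n} I_\lambda. \]

Then the system of open sets $(D_n)_{n\ge 0}$ is pair-wise disjoint. Moreover, each $D_n$ is a union of $2^n$ disjoint intervals of length $1/3^{n+1}$.

Since $\mathrm{card}\, T_0 = \mathfrak{c}$, and the map $\phi_3\big|_{T_0} : T_0 \to K_3$ is injective, we get $\mathrm{card}\, K_3 \ge \mathfrak{c}$. Since we also have $\mathrm{card}\, K_3 \le \mathrm{card}\, \mathbb{R} = \mathfrak{c}$, we get in fact the equality

\[ \mathrm{card}\, K_3 = \mathfrak{c}. \]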
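The sets $D_n$ are the familiar "removed middle thirds", and the first few can be listed explicitly. In the sketch below an element $a \in T_{\mathrm{fin}}$ of length at most $n$ is represented by its first $n$ digits; the helper names `phi3` and `D` are ours, and exact rationals are used throughout.

```python
from fractions import Fraction
from itertools import product

def phi3(bits):
    """phi_3 of the finite sequence bits[0], bits[1], ..., followed by zeros."""
    return 2 * sum(Fraction(b, 3**(n + 1)) for n, b in enumerate(bits))

def D(n):
    """Sorted end points of the intervals I_(n, a), for a in T_fin of length <= n."""
    step = Fraction(1, 3**(n + 1))
    return sorted((phi3(a) + step, phi3(a) + 2 * step)
                  for a in product((0, 1), repeat=n))

for n in range(3):
    print(n, [(str(lo), str(hi)) for lo, hi in D(n)])

# Total length of D_0, ..., D_4, to be compared with 1 - (2/3)^5.
total = sum(hi - lo for n in range(5) for lo, hi in D(n))
print(total, 1 - Fraction(2, 3)**5)
```

Here $D_0$ is the single interval $(1/3, 2/3)$ and $D_1$ consists of $(1/9, 2/9)$ and $(7/9, 8/9)$. Since $D_n$ has total length $2^n/3^{n+1}$, the lengths of $D_0, D_1, \dots$ sum to 1.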



Lecture 21

4. The concept of measure

Definition. Let X be a non-empty set, and let E be an arbitrary collectionof subsets of X . Assume ∅ ∈ E. A measure on E is a map µ : E → [0, 1] with thefollowing properties

(0) µ(∅) = 0.

(addσ) Whenever (E n)∞n=1 ⊂ E is a pair-wise disjoint sequence, with ∞n=1 E n ∈

E, it follows that we have the equality

µ ∞n=1

E n

=

∞n=1

µ(E n).

Property (addσ) is called σ-additivity .

Convention. For a sequence (α_n)_{n=1}^∞ ⊂ [0, ∞] we define

\[ \sum_{n=1}^{\infty} \alpha_n = \begin{cases} \sum_{n=1}^{\infty} \alpha_n & \text{if } \alpha_n \in [0, \infty),\ \forall\, n \in \mathbb{N} \\ \infty & \text{if there exists } n \in \mathbb{N} \text{ with } \alpha_n = \infty. \end{cases} \]

(Of course, in the first case, it is still possible to have Σ_{n=1}^∞ α_n = ∞.)

Remark 4.1. If µ is a measure on E, then µ is additive, i.e.

(add) Whenever (E_n)_{n=1}^N ⊂ E is a finite pair-wise disjoint system, such that E_1 ∪ ··· ∪ E_N ∈ E, it follows that we have the equality

\[ \mu( E_1 \cup \dots \cup E_N ) = \mu(E_1) + \dots + \mu(E_N). \]

This follows from (addσ) and (0), after completing the sequence E_1, ..., E_N to an infinite sequence by taking E_n = ∅, ∀ n > N.

Comment. The most natural setting for measures is the one when E is a σ-ring. In this case, the stipulation that ⋃_{n=1}^∞ E_n ∈ E, which appears in the definition, is superfluous.

The purpose of this section is to study measures on more rudimentary collections.

Examples 4.1. Let X be a non-empty set.

A. If we take E = {∅, X} and we define µ(∅) = 0 and µ(X) to be any element in [0, ∞], then µ is obviously a measure on {∅, X}.

B. If we take E = P(X) and we define

\[ \mu(E) = \begin{cases} 0 & \text{if } E = \varnothing \\ \infty & \text{if } E \neq \varnothing \end{cases} \]

then µ is a measure on P(X).

C. If we take E = P(X) and we define

\[ \mu(E) = \begin{cases} \operatorname{card} E & \text{if } E \text{ is finite} \\ \infty & \text{if } E \text{ is infinite} \end{cases} \]

then µ is a measure on P(X). This is called the counting measure.
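As a quick sanity check of Example C, the following sketch verifies finite additivity of the counting measure on all disjoint pairs of subsets of a small set, using the summation Convention above. The helper names `counting_measure` and `extended_sum` are illustrative only, and only finite sets are handled.

```python
import math
from itertools import combinations

def counting_measure(E):
    """Counting measure of a finite set E."""
    return len(E)

def extended_sum(terms):
    """Sum in [0, inf], following the Convention: inf as soon as one term is inf."""
    terms = list(terms)
    return math.inf if any(t == math.inf for t in terms) else sum(terms)

X = {1, 2, 3, 4}
subsets = [frozenset(c) for r in range(len(X) + 1)
           for c in combinations(sorted(X), r)]

# (add) for N = 2: mu(A ∪ B) = mu(A) + mu(B) whenever A and B are disjoint
additive = all(
    counting_measure(A | B)
    == extended_sum([counting_measure(A), counting_measure(B)])
    for A in subsets for B in subsets if not (A & B)
)
print(additive)
```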

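Exercise 1. Let X_1, X_2 be non-empty spaces, let f : X_1 → X_2 be a map, and let E_k ⊂ P(X_k) be arbitrary collections with ∅ ∈ E_k, k = 1, 2. Let µ_1 be a measure on E_1 and µ_2 be a measure on E_2. Consider the collections

\[ f_*\mathcal{E}_1 = \{ A \subset X_2 : f^{-1}(A) \in \mathcal{E}_1 \} \subset \mathcal{P}(X_2); \]

\[ f^*\mathcal{E}_2 = \{ f^{-1}(A) : A \in \mathcal{E}_2 \} \subset \mathcal{P}(X_1). \]

A. Prove that the map f_*µ_1 : f_*E_1 → [0, ∞], defined by

\[ (f_*\mu_1)(A) = \mu_1\bigl( f^{-1}(A) \bigr), \quad \forall\, A \in f_*\mathcal{E}_1, \]

is a measure on f_*E_1.

B. If f is surjective, prove that the map f^*µ_2 : f^*E_2 → [0, ∞], defined by

\[ (f^*\mu_2)(B) = \mu_2\bigl( f(B) \bigr), \quad \forall\, B \in f^*\mathcal{E}_2, \]

is a measure on f^*E_2.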

We now concentrate on the most rudimentary types of collections E on which measures can be defined fairly easily. What we have in mind is a set of easy conditions on a map µ : E → [0, ∞] which would guarantee that µ is a measure.

Definition. Let X be a non-empty set. A collection J ⊂ P(X) is called a semiring, if it satisfies the following properties:

• ∅ ∈ J;
• if A, B ∈ J, then A ∩ B ∈ J;
• if A, B ∈ J and A ⊂ B, then there exists an integer n ≥ 1, and sets D_0, D_1, ..., D_n ∈ J, such that A = D_0 ⊂ D_1 ⊂ ··· ⊂ D_n = B, and D_k ∖ D_{k−1} ∈ J, ∀ k ∈ {1, ..., n}.

Remark that every ring is a semiring.
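For finite collections, the three axioms can be tested by brute force. The sketch below is only an illustration under that finiteness assumption (the function `is_semiring` is not part of the notes): it searches for the chain D_0 ⊂ D_1 ⊂ ··· ⊂ D_n required by the third axiom with a depth-first search.

```python
def is_semiring(J):
    """Brute-force check of the semiring axioms for a finite collection of frozensets."""
    J = set(J)
    if frozenset() not in J:                      # first axiom
        return False
    if any((A & B) not in J for A in J for B in J):   # second axiom
        return False

    def has_chain(A, B):
        # search for A = D0 ⊂ D1 ⊂ ... ⊂ Dn = B with every Dk \ Dk-1 in J
        stack, seen = [A], {A}
        while stack:
            D = stack.pop()
            if D == B:
                return True
            for E in J:
                if D < E <= B and (E - D) in J and E not in seen:
                    seen.add(E)
                    stack.append(E)
        return False

    return all(has_chain(A, B) for A in J for B in J if A <= B)   # third axiom

# half-open integer "intervals" {a, ..., b-1} inside {0, 1, 2, 3}
intervals = {frozenset()} | {frozenset(range(a, b))
                             for a in range(4) for b in range(a + 1, 5)}
# a nested chain that does not contain its differences
nested = {frozenset(), frozenset({0}), frozenset({0, 1})}

print(is_semiring(intervals), is_semiring(nested))
```

The second collection fails because {0, 1} ∖ {0} = {1} is missing and no longer chain is available.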

Exercise 2. Prove that the semiring type is not consistent. Give an example of two semirings J_1, J_2 ⊂ P(X), such that J_1 ∩ J_2 is not a semiring.

Hint: Use the set X = {1, 2, 3}.

Exercise 3. Let X_1, ..., X_n be non-empty sets, and let J_k ⊂ P(X_k), k = 1, ..., n, be semirings. Prove that

\[ \mathcal{J} = \{ A_1 \times \dots \times A_n : A_1 \in \mathcal{J}_1, \dots, A_n \in \mathcal{J}_n \} \subset \mathcal{P}(X_1 \times \dots \times X_n) \]

is a semiring.

Hint: First prove the case n = 2, and then use induction.

Example 4.2. Take X = R. The collection

\[ \mathcal{J} = \{\varnothing\} \cup \{ [a, b) : a, b \in \mathbb{R},\ a < b \} \subset \mathcal{P}(\mathbb{R}) \]

is a semiring.

Indeed, the first two axioms are pretty clear. To prove the third axiom, we start with two intervals A = [a, b) and B = [c, d) with A ⊂ B. This means that a ≥ c and b ≤ d. If a = c or b = d, we set D_0 = A and D_1 = B. If a > c and b < d, we set D_0 = A, D_1 = [a, d) and D_2 = B.
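The case analysis in Example 4.2 translates directly into code. In this sketch a half-open interval [a, b) is represented by the pair (a, b), and `None` stands for ∅; both conventions, as well as the function names, are assumptions made for the illustration.

```python
def chain(A, B):
    """Chain A = D0 ⊂ ... ⊂ Dn = B of half-open intervals, following Example 4.2."""
    (a, b), (c, d) = A, B
    assert c <= a < b <= d, "need A = [a, b) contained in B = [c, d)"
    if a == c or b == d:
        return [A, B]
    return [A, (a, d), B]

def difference(D_big, D_small):
    """D_big \\ D_small for nested half-open intervals sharing an endpoint (None = ∅)."""
    (a, b), (c, d) = D_small, D_big
    if (a, b) == (c, d):
        return None
    return (b, d) if a == c else (c, a)

Ds = chain((1, 2), (0, 3))
steps = [difference(Ds[k], Ds[k - 1]) for k in range(1, len(Ds))]
print(Ds, steps)
```

Each consecutive difference is again a single half-open interval (or empty), which is exactly what the third semiring axiom requires.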




More generally, by Exercise 3, the collection of “half-open boxes”

\[ \mathcal{J}_n = \{\varnothing\} \cup \Bigl\{ \prod_{j=1}^{n} [a_j, b_j) : a_1 < b_1, \dots, a_n < b_n \Bigr\} \subset \mathcal{P}(\mathbb{R}^n) \]

is a semiring.

Exercise 4. Let J_n ⊂ P(R^n) be the semiring defined above. Prove that the σ-ring S(J_n) generated by J_n coincides with Bor(R^n).

The ring generated by a semiring has a particularly nice description (compare to Proposition 2.1):

Proposition 4.1. Let J be a semiring on X. For a subset A ⊂ X, the following are equivalent:

(i) A belongs to R(J), the ring generated by J;
(ii) There exists an integer n ≥ 1, and a pair-wise disjoint system (A_j)_{j=1}^n ⊂ J, such that A = A_1 ∪ ··· ∪ A_n.

Proof. Denote by R the collection of all subsets A ⊂ X that satisfy condition (ii). It is obvious that

J ⊂ R ⊂ R(J),

so (see Section III.2) we only need to prove that R is a ring.

Let us first remark that we obviously have the property:

(i) if A, B ∈ R, and A ∩ B = ∅, then A ∪ B ∈ R.

Secondly, we remark that we have the implication:

(ii) A, B ∈ J ⇒ A ∖ B ∈ R.

Indeed, since A ∩ B ∈ J, by the definition of a semiring, there exist D_0, D_1, ..., D_n ∈ J with A ∩ B = D_0 ⊂ D_1 ⊂ ··· ⊂ D_n = A, and D_k ∖ D_{k−1} ∈ J, ∀ k ∈ {1, ..., n}. Then the equality

\[ A \setminus B = \bigcup_{k=1}^{n} (D_k \setminus D_{k-1}) \]

shows that A ∖ B indeed belongs to R.

Thirdly, we prove the implication:

(iii) A, B ∈ R ⇒ A ∩ B ∈ R.

Write A = A_1 ∪ ··· ∪ A_m and B = B_1 ∪ ··· ∪ B_n, with (A_i)_{i=1}^m, (B_k)_{k=1}^n ⊂ J pair-wise disjoint systems. If we define the sets D_{ik} = A_i ∩ B_k ∈ J, (i, k) ∈ {1, ..., m} × {1, ..., n}, then it is obvious that

\[ A \cap B = \bigcup_{i=1}^{m} \bigcup_{k=1}^{n} D_{ik}, \]

and (D_{ik})_{1≤i≤m, 1≤k≤n} ⊂ J is a pair-wise disjoint system, therefore A ∩ B indeed belongs to R.

Finally, we show the implication:

(iv) if A, B ∈ R and A ⊃ B, then A ∖ B ∈ R.

Write A = A_1 ∪ ··· ∪ A_m, with (A_i)_{i=1}^m ⊂ J a pair-wise disjoint system. Notice that

\[ A \setminus B = \bigcup_{i=1}^{m} (A_i \setminus B), \]

with (A_i ∖ B)_{i=1}^m a pair-wise disjoint system, so by (i) it suffices to show that A_i ∖ B ∈ R, ∀ i ∈ {1, ..., m}. To prove this, we fix i and we write B = B_1 ∪ ··· ∪ B_n, with (B_k)_{k=1}^n ⊂ J a pair-wise disjoint system. Then

\[ A_i \setminus B = (A_i \setminus B_1) \cap \dots \cap (A_i \setminus B_n), \]

and the fact that A_i ∖ B belongs to R follows from (ii) and (iii).

Having proven (i)-(iv), we now prove that R is a ring. By (iii), we only need to prove the implication

(∗) A, B ∈ R ⇒ A △ B ∈ R.

Using (iv), it follows that the sets A ∖ B = A ∖ (A ∩ B) and B ∖ A = B ∖ (A ∩ B) both belong to R. Since A △ B = (A ∖ B) ∪ (B ∖ A), and (A ∖ B) ∩ (B ∖ A) = ∅, by (i) it follows that A △ B indeed belongs to R.
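For the semiring of half-open intervals of Example 4.2, condition (ii) of Proposition 4.1 can be made concrete: any finite union of intervals can be rewritten as a pair-wise disjoint union. The sketch below assumes intervals are encoded as pairs (a, b) with a < b; the function names are hypothetical.

```python
def subtract(A, B):
    """A \\ B for half-open intervals A = [a, b), B = [c, d): a list of disjoint intervals."""
    (a, b), (c, d) = A, B
    pieces = [(a, min(b, c)), (max(a, d), b)]   # part left of B, part right of B
    return [(lo, hi) for lo, hi in pieces if lo < hi]

def disjoint_union(intervals):
    """Rewrite a finite union of half-open intervals as a pair-wise disjoint union."""
    result = []
    for I in intervals:
        parts = [I]
        for J in result:   # remove everything already covered
            parts = [piece for p in parts for piece in subtract(p, J)]
        result.extend(parts)
    return result

print(disjoint_union([(0, 2), (1, 3), (5, 6)]))
```

This mirrors steps (ii) and (iv) of the proof: differences of members of J are finite disjoint unions of members of J.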

Theorem 4.1 (Semiring-to-ring extension). Let J be a semiring on X, and let µ : J → [0, ∞] be an additive map with µ(∅) = 0.

(i) There exists a unique additive map µ̄ : R(J) → [0, ∞], such that µ̄|_J = µ.
(ii) If µ is σ-additive, then so is µ̄.

Proof. The key step is contained in the following

Claim: If (A_i)_{i=1}^m ⊂ J and (B_j)_{j=1}^n ⊂ J are pair-wise disjoint systems, with

A_1 ∪ ··· ∪ A_m = B_1 ∪ ··· ∪ B_n,

then µ(A_1) + ··· + µ(A_m) = µ(B_1) + ··· + µ(B_n).

To prove this fact, we define the pair-wise disjoint system (D_{ij})_{1≤i≤m, 1≤j≤n} by D_{ij} = A_i ∩ B_j, ∀ (i, j) ∈ {1, ..., m} × {1, ..., n}. Since

\[ \bigcup_{j=1}^{n} D_{ij} = A_i,\ \forall\, i \in \{1, \dots, m\}, \qquad \bigcup_{i=1}^{m} D_{ij} = B_j,\ \forall\, j \in \{1, \dots, n\}, \]

using additivity, we have the equalities

\[ \sum_{j=1}^{n} \mu(D_{ij}) = \mu(A_i),\ \forall\, i \in \{1, \dots, m\}, \qquad \sum_{i=1}^{m} \mu(D_{ij}) = \mu(B_j),\ \forall\, j \in \{1, \dots, n\}, \]

and then we get

\[ \sum_{i=1}^{m} \mu(A_i) = \sum_{i=1}^{m} \sum_{j=1}^{n} \mu(D_{ij}) = \sum_{j=1}^{n} \sum_{i=1}^{m} \mu(D_{ij}) = \sum_{j=1}^{n} \mu(B_j). \]
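The double-counting argument of the Claim can be traced on a concrete instance. The sketch below takes, as an assumption for illustration, the additive map µ([a, b)) = b − a on half-open intervals and two different decompositions of [0, 3); the common refinement D_ij = A_i ∩ B_j is computed explicitly.

```python
from fractions import Fraction

def length(I):
    """mu([a, b)) = b - a; None stands for the empty set."""
    return 0 if I is None else I[1] - I[0]

def intersect(I, J):
    """Intersection of two half-open intervals (None if empty)."""
    lo, hi = max(I[0], J[0]), min(I[1], J[1])
    return (lo, hi) if lo < hi else None

A = [(0, 1), (1, 3)]                                       # one decomposition of [0, 3)
B = [(0, 2), (2, Fraction(5, 2)), (Fraction(5, 2), 3)]     # another one
D = [[intersect(Ai, Bj) for Bj in B] for Ai in A]          # common refinement D_ij

row_sums = [sum(length(Dij) for Dij in row) for row in D]              # should give mu(A_i)
col_sums = [sum(length(D[i][j]) for i in range(len(A)))                # should give mu(B_j)
            for j in range(len(B))]
print(row_sums, col_sums)
```

Summing the rows and summing the columns of the same table of values µ(D_ij) gives the two sides of the Claim.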

To prove (i), for any set A ∈ R(J) we choose (use Proposition 4.1) a finite pair-wise disjoint system (A_i)_{i=1}^n ⊂ J, with A = A_1 ∪ ··· ∪ A_n, and we define

(1) µ̄(A) = µ(A_1) + ··· + µ(A_n).

By the above Claim, the number µ̄(A) is independent of the particular choice of the pair-wise disjoint system (A_i)_{i=1}^n. Also, it is clear that µ̄|_J = µ, and µ̄ is additive. The uniqueness is also clear, because the equality µ̄|_J = µ and additivity of µ̄ force (1).

(ii). Assume now that µ is σ-additive, and let us prove that µ̄ is again σ-additive. Start with a pair-wise disjoint sequence (A_n)_{n=1}^∞ ⊂ R(J), with ⋃_{n=1}^∞ A_n ∈ R(J), and let us prove the equality

\[ \bar\mu\Bigl( \bigcup_{n=1}^{\infty} A_n \Bigr) = \sum_{n=1}^{\infty} \bar\mu(A_n). \tag{2} \]

Since ⋃_{n=1}^∞ A_n ∈ R(J), there exists a finite pair-wise disjoint system (B_i)_{i=1}^p ⊂ J, such that ⋃_{n=1}^∞ A_n = B_1 ∪ ··· ∪ B_p. With this choice we have

\[ \bar\mu\Bigl( \bigcup_{n=1}^{\infty} A_n \Bigr) = \sum_{i=1}^{p} \mu(B_i). \tag{3} \]

For each i ∈ {1, ..., p}, we have B_i = ⋃_{n=1}^∞ (B_i ∩ A_n). Fix for the moment a pair (n, i) ∈ N × {1, ..., p}. Since B_i ∩ A_n ∈ R(J), it follows that there exist an integer N_{ni} ≥ 1 and a finite pair-wise disjoint system (C^{ni}_k)_{k=1}^{N_{ni}} ⊂ J, such that B_i ∩ A_n = ⋃_{k=1}^{N_{ni}} C^{ni}_k.

Since, for each i ∈ {1, ..., p}, the countable system (C^{ni}_k)_{n∈N, 1≤k≤N_{ni}} ⊂ J is pair-wise disjoint, and we have the equality

\[ \bigcup_{n=1}^{\infty} \bigcup_{k=1}^{N_{ni}} C^{ni}_k = \bigcup_{n=1}^{\infty} (B_i \cap A_n) = B_i \in \mathcal{J}, \]

by the σ-additivity of µ, we have

\[ \mu(B_i) = \sum_{n=1}^{\infty} \sum_{k=1}^{N_{ni}} \mu(C^{ni}_k), \quad \forall\, i \in \{1, \dots, p\}. \tag{4} \]

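Since, for each n ∈ N, the finite system (C^{ni}_k)_{1≤i≤p, 1≤k≤N_{ni}} ⊂ J is pair-wise disjoint, and we have the equality

\[ \bigcup_{i=1}^{p} \bigcup_{k=1}^{N_{ni}} C^{ni}_k = \bigcup_{i=1}^{p} (B_i \cap A_n) = A_n \in \mathcal{R}(\mathcal{J}), \]

by the definition of µ̄, we have

\[ \bar\mu(A_n) = \sum_{i=1}^{p} \sum_{k=1}^{N_{ni}} \mu(C^{ni}_k), \quad \forall\, n \in \mathbb{N}. \]

Combining this with (4) yields

\[ \sum_{n=1}^{\infty} \bar\mu(A_n) = \sum_{n=1}^{\infty} \sum_{i=1}^{p} \sum_{k=1}^{N_{ni}} \mu(C^{ni}_k) = \sum_{i=1}^{p} \mu(B_i), \]

and the equality (2) follows from (3).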

Definition. Let X be a non-empty set, and let E ⊂ P(X) be a collection of sets. We say that a map µ : E → [0, ∞] is sub-additive, if

(add−) whenever A ∈ E, and (A_k)_{k=1}^n is a finite sequence in E with A ⊂ ⋃_{k=1}^n A_k, it follows that µ(A) ≤ Σ_{k=1}^n µ(A_k).

Note that we do not require the A_k’s to be pair-wise disjoint. With this terminology, Theorem 4.1 has the following consequence.

Corollary 4.1. Let X be a non-empty set, and let J ⊂ P(X) be a semiring. Then any additive map µ : J → [0, ∞] is sub-additive.

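Proof. Let µ̄ : R(J) → [0, ∞] be the additive extension of µ to the ring generated by J. It suffices to prove that µ̄ is sub-additive. Start with sets A, A_1, ..., A_n ∈ R(J) such that A ⊂ A_1 ∪ ··· ∪ A_n. Define the sets B_1 = A_1, and

B_k = A_k ∖ (A_1 ∪ ··· ∪ A_{k−1}), for all k ∈ {1, ..., n}, k ≥ 2.

Since we work in a ring, the sets B_k, B_k ∩ A, B_k ∖ A, and A_k ∖ B_k, k ∈ {1, ..., n}, all belong to R(J). Moreover, the sequence (B_k)_{k=1}^n is pair-wise disjoint and it satisfies

• B_k ⊂ A_k, ∀ k ∈ {1, ..., n},
• ⋃_{k=1}^n B_k = ⋃_{k=1}^n A_k ⊃ A,

so by the additivity of µ̄, we get

\[ \begin{aligned} \sum_{k=1}^{n} \bar\mu(A_k) &= \sum_{k=1}^{n} \bar\mu\bigl( (A_k \setminus B_k) \cup B_k \bigr) = \sum_{k=1}^{n} \bigl[ \bar\mu(A_k \setminus B_k) + \bar\mu(B_k) \bigr] \geq \\ &\geq \sum_{k=1}^{n} \bar\mu(B_k) = \sum_{k=1}^{n} \bar\mu\bigl( (B_k \setminus A) \cup (B_k \cap A) \bigr) = \sum_{k=1}^{n} \bigl[ \bar\mu(B_k \setminus A) + \bar\mu(B_k \cap A) \bigr] \geq \\ &\geq \sum_{k=1}^{n} \bar\mu(B_k \cap A) = \bar\mu\Bigl( \bigcup_{k=1}^{n} [B_k \cap A] \Bigr) = \bar\mu(A). \end{aligned} \]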
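The disjointification B_k = A_k ∖ (A_1 ∪ ··· ∪ A_{k−1}) used in this proof is easy to experiment with. The sketch below uses finite Python sets and the counting measure as a stand-in for µ̄; both choices, and the name `disjoint_pieces`, are assumptions made for the illustration.

```python
def disjoint_pieces(sets):
    """B_1 = A_1 and B_k = A_k \\ (A_1 ∪ ... ∪ A_{k-1}) for k >= 2."""
    seen, pieces = set(), []
    for A_k in sets:
        pieces.append(set(A_k) - seen)   # keep only the points not covered so far
        seen |= set(A_k)
    return pieces

A_list = [{1, 2, 3}, {3, 4}, {1, 4, 5}]
B_list = disjoint_pieces(A_list)
A = {2, 4, 5}                            # a set covered by A_1 ∪ A_2 ∪ A_3

print(B_list)
print(len(A), "<=", sum(len(A_k) for A_k in A_list))
```

The pieces are pair-wise disjoint, have the same union as the A_k's, and satisfy B_k ⊂ A_k, which is what drives the chain of inequalities above.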

Exercise 5*. Let X_1, X_2 be non-empty sets, let J_k ⊂ P(X_k), k = 1, 2, be semirings, and let µ_k : J_k → [0, ∞] be additive maps. Consider the semiring (see Exercise 3)

\[ \mathcal{J} = \{ A_1 \times A_2 : A_1 \in \mathcal{J}_1,\ A_2 \in \mathcal{J}_2 \} \subset \mathcal{P}(X_1 \times X_2). \]

Then the map µ : J → [0, ∞] defined by

\[ \mu(A_1 \times A_2) = \mu_1(A_1) \cdot \mu_2(A_2) \]

is additive. Here we use the convention 0 · ∞ = ∞ · 0 = 0.

Hints: One wants to show that, whenever A_1 × A_2 ∈ J is written as a union

\[ A_1 \times A_2 = \bigcup_{k=1}^{n} (A_1^k \times A_2^k), \]

with (A_1^k × A_2^k)_{k=1}^n ⊂ J pair-wise disjoint, it follows that

\[ \mu_1(A_1) \cdot \mu_2(A_2) = \sum_{k=1}^{n} \mu_1(A_1^k) \cdot \mu_2(A_2^k). \]

Analyze first the case of “strips,” that is, when A_1^1 = ··· = A_1^n = A_1 or A_2^1 = ··· = A_2^n = A_2. In the general case, use induction, by picking some k such that A_1^k ⊊ A_1 and splitting A_1 × A_2 into “strips” of the form B_j × A_2, where B_1, ..., B_m ∈ J_1 are pairwise disjoint, with B_1 = A_1^k and B_1 ∪ ··· ∪ B_m = A_1.

Comment. In connection with the above exercise, one can ask the following

Question: With the notations above, is it true that, if both µ_1 and µ_2 are measures, then µ is also a measure?

As we shall see a bit later in the course, the answer is “yes.”




Definition. Let X be a non-empty set, and let E ⊂ P(X) be a collection of sets. We say that a map µ : E → [0, ∞] is σ-sub-additive, if

(add−σ) whenever A ∈ E, and (A_n)_{n=1}^∞ is a sequence in E with A ⊂ ⋃_{n=1}^∞ A_n, it follows that µ(A) ≤ Σ_{n=1}^∞ µ(A_n).

Note that we do not require the A_n’s to be pair-wise disjoint.

Proposition 4.2 (characterization of semiring measures). Let X be a non-empty set, let J ⊂ P(X) be a semiring, and let µ : J → [0, ∞] be a map with µ(∅) = 0. The following are equivalent:

(i) µ is a measure on J;
(ii) µ is additive, and σ-sub-additive.

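Proof. (i) ⇒ (ii). Assume µ is a measure on J. It is clear that µ is additive, so we only need to prove σ-sub-additivity. Use Theorem 4.1 to find a measure µ̄ on the ring R(J) generated by J, such that

µ̄(A) = µ(A), ∀ A ∈ J.

Then it suffices to show that µ̄ is σ-sub-additive. Start with a set A ∈ R(J), and a sequence (A_n)_{n=1}^∞ ⊂ R(J), such that A ⊂ ⋃_{n=1}^∞ A_n. Define the sets B_1 = A_1, and

B_n = A_n ∖ (A_1 ∪ ··· ∪ A_{n−1}), for all n ≥ 2.

Since we work in a ring, the sets B_n, B_n ∩ A, B_n ∖ A, and A_n ∖ B_n, n ∈ N, all belong to R(J). Moreover, the sequence (B_n)_{n=1}^∞ is pair-wise disjoint and it satisfies

• B_n ⊂ A_n, ∀ n ∈ N,
• ⋃_{n=1}^∞ B_n = ⋃_{n=1}^∞ A_n ⊃ A,

so by σ-additivity of µ̄, we get

\[ \begin{aligned} \sum_{n=1}^{\infty} \bar\mu(A_n) &= \sum_{n=1}^{\infty} \bar\mu\bigl( (A_n \setminus B_n) \cup B_n \bigr) = \sum_{n=1}^{\infty} \bigl[ \bar\mu(A_n \setminus B_n) + \bar\mu(B_n) \bigr] \geq \\ &\geq \sum_{n=1}^{\infty} \bar\mu(B_n) = \sum_{n=1}^{\infty} \bar\mu\bigl( (B_n \setminus A) \cup (B_n \cap A) \bigr) = \sum_{n=1}^{\infty} \bigl[ \bar\mu(B_n \setminus A) + \bar\mu(B_n \cap A) \bigr] \geq \\ &\geq \sum_{n=1}^{\infty} \bar\mu(B_n \cap A) = \bar\mu\Bigl( \bigcup_{n=1}^{\infty} [B_n \cap A] \Bigr) = \bar\mu(A). \end{aligned} \]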

(ii) ⇒ (i). Assume µ : J → [0, ∞] is additive and σ-sub-additive, and let us show that µ is σ-additive. We again use Theorem 4.1, to find an additive map µ̄ : R(J) → [0, ∞], such that µ̄|_J = µ. Start with a pair-wise disjoint sequence (A_n)_{n=1}^∞ ⊂ J, such that the union A = ⋃_{n=1}^∞ A_n belongs to J. On the one hand, by σ-sub-additivity, we have the inequality

\[ \mu(A) \leq \sum_{n=1}^{\infty} \mu(A_n). \tag{5} \]

On the other hand, for any integer N ≥ 1, we have

\[ \mu(A) = \bar\mu(A) = \bar\mu\Bigl( \bigcup_{n=1}^{N} A_n \cup \Bigl[ A \setminus \bigcup_{n=1}^{N} A_n \Bigr] \Bigr) \geq \bar\mu\Bigl( \bigcup_{n=1}^{N} A_n \Bigr) = \sum_{n=1}^{N} \bar\mu(A_n) = \sum_{n=1}^{N} \mu(A_n), \]

which then gives

\[ \mu(A) \geq \sup_{N \in \mathbb{N}} \sum_{n=1}^{N} \mu(A_n) = \sum_{n=1}^{\infty} \mu(A_n), \]

so using (5) we immediately get µ(A) = Σ_{n=1}^∞ µ(A_n).

The following technical result will be often employed in subsequent sections.

Lemma 4.1 (Continuity). Let J be a semiring, and let µ be a measure on J.

(i) If (A_n)_{n=1}^∞ ⊂ J is a sequence of sets, with A_1 ⊂ A_2 ⊂ ..., and ⋃_{n=1}^∞ A_n ∈ J, then

\[ \mu\Bigl( \bigcup_{n=1}^{\infty} A_n \Bigr) = \lim_{n\to\infty} \mu(A_n). \]

(ii) If (B_n)_{n=1}^∞ ⊂ J is a sequence of sets, with B_1 ⊃ B_2 ⊃ ..., and ⋂_{n=1}^∞ B_n ∈ J, and µ(B_1) < ∞, then

\[ \mu\Bigl( \bigcap_{n=1}^{\infty} B_n \Bigr) = \lim_{n\to\infty} \mu(B_n). \]

Proof. Using Theorem 4.1, we can assume that J is already a ring. (Otherwise we replace J by R(J), and µ by its extension µ̄.)

(i). Consider the sets D_1 = A_1, and D_k = A_k ∖ A_{k−1}, ∀ k ≥ 2. It is clear that (D_k)_{k=1}^∞ is a pairwise disjoint sequence in J, and we have the equality

\[ \bigcup_{k=1}^{n} D_k = A_n, \quad \forall\, n \geq 1. \tag{6} \]

This gives of course the equality

\[ \bigcup_{k=1}^{\infty} D_k = \bigcup_{n=1}^{\infty} A_n \in \mathcal{J}. \]

Using this equality, combined with the (σ-)additivity of µ, and with (6), we get

\[ \mu\Bigl( \bigcup_{n=1}^{\infty} A_n \Bigr) = \sum_{k=1}^{\infty} \mu(D_k) = \lim_{n\to\infty} \sum_{k=1}^{n} \mu(D_k) = \lim_{n\to\infty} \mu\Bigl( \bigcup_{k=1}^{n} D_k \Bigr) = \lim_{n\to\infty} \mu(A_n). \]

(ii). Consider the sets B = ⋂_{n=1}^∞ B_n, and A_n = B_1 ∖ B_n, ∀ n ≥ 1. It is clear that (A_n)_{n=1}^∞ ⊂ J, and we have A_1 ⊂ A_2 ⊂ .... Moreover, we have ⋃_{n=1}^∞ A_n = B_1 ∖ B, so by part (i), we get

\[ \mu(B_1 \setminus B) = \lim_{n\to\infty} \mu(B_1 \setminus B_n). \tag{7} \]

Using the fact that µ(B_1) < ∞, it follows that

µ(B) ≤ µ(B_n) ≤ µ(B_1) < ∞, ∀ n ≥ 1.

This gives then the equalities

µ(B_1 ∖ B) = µ(B_1) − µ(B) and µ(B_1 ∖ B_n) = µ(B_1) − µ(B_n), ∀ n ≥ 1,

so the equality (7) immediately gives µ(B) = lim_{n→∞} µ(B_n).
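A small worked example, using the counting measure µ of Examples 4.1.C on P(N) (a ring, hence a semiring), indicates why the hypothesis µ(B_1) < ∞ in part (ii) cannot simply be dropped:

```latex
\[
B_n = \{ n, n+1, n+2, \dots \} \subset \mathbb{N}, \qquad
B_1 \supset B_2 \supset \dots, \qquad
\bigcap_{n=1}^{\infty} B_n = \varnothing ,
\]
\[
\mu(B_n) = \infty \ \ \forall\, n \in \mathbb{N}
\quad\Longrightarrow\quad
\lim_{n\to\infty} \mu(B_n) = \infty \neq 0 = \mu\Bigl( \bigcap_{n=1}^{\infty} B_n \Bigr).
\]
```

Here µ(B_1) = ∞, so the subtraction µ(B_1 ∖ B_n) = µ(B_1) − µ(B_n) used in the proof is not available.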

The above result has a (minor) generalization, which we record for future use. To formulate it we introduce the following.

Notation. Let R be a ring, and let µ be a measure on R. For two sets A, B ∈ R, we write A ⊂_µ B, if µ(A ∖ B) = 0.

Using this notation, we have the following generalization of Lemma 4.1.

Proposition 4.3. Let R be a ring, and let µ be a measure on R.

(i) If (A_n)_{n=1}^∞ ⊂ R is a sequence of sets, with A_1 ⊂_µ A_2 ⊂_µ ..., and ⋃_{n=1}^∞ A_n ∈ R, then

\[ \mu\Bigl( \bigcup_{n=1}^{\infty} A_n \Bigr) = \lim_{n\to\infty} \mu(A_n). \]

(ii) If (B_n)_{n=1}^∞ ⊂ R is a sequence of sets, with B_1 ⊃_µ B_2 ⊃_µ ..., and ⋂_{n=1}^∞ B_n ∈ R, and µ(B_1) < ∞, then

\[ \mu\Bigl( \bigcap_{n=1}^{\infty} B_n \Bigr) = \lim_{n\to\infty} \mu(B_n). \]

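Proof. (i). Define the sequence of sets (E_n)_{n=1}^∞ ⊂ R, by E_n = ⋃_{k=1}^n A_k, ∀ n ≥ 1. Notice that A_1 = E_1, and for each n ≥ 2, we have A_n ⊂ E_n, as well as the equality

\[ E_n \setminus A_n = \bigcup_{k=1}^{n-1} [A_k \setminus A_n]. \]

Using sub-additivity, it follows that

\[ \mu(E_n \setminus A_n) \leq \sum_{k=1}^{n-1} \mu(A_k \setminus A_n). \]

Since A_k ∖ A_n ⊂ ⋃_{j=k}^{n−1} (A_j ∖ A_{j+1}) and µ(A_j ∖ A_{j+1}) = 0 for all j, each term on the right-hand side vanishes, which forces µ(E_n ∖ A_n) = 0. This gives

(8) µ(E_n) = µ(A_n) + µ(E_n ∖ A_n) = µ(A_n), ∀ n ≥ 1.

Since ⋃_{n=1}^∞ E_n = ⋃_{n=1}^∞ A_n, and we have the inclusions E_1 ⊂ E_2 ⊂ ..., by Lemma 4.1, combined with (8), we get

\[ \mu\Bigl( \bigcup_{n=1}^{\infty} A_n \Bigr) = \mu\Bigl( \bigcup_{n=1}^{\infty} E_n \Bigr) = \lim_{n\to\infty} \mu(E_n) = \lim_{n\to\infty} \mu(A_n). \]

Part (ii) is proven exactly as part (ii) from Lemma 4.1.

Exercise 6. Let µ be a measure on a ring R. Prove that, for A, B ∈ R, one has the implication

A ⊂_µ B ⇒ µ(A) ≤ µ(B).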



Example 4.3. Fix some integer n ≥ 1. Consider the semiring of “half-open boxes” in R^n

\[ \mathcal{J}_n = \{\varnothing\} \cup \Bigl\{ \prod_{j=1}^{n} [a_j, b_j) : a_1 < b_1, \dots, a_n < b_n \Bigr\} \subset \mathcal{P}(\mathbb{R}^n). \]

For a non-empty box A = [a_1, b_1) × ··· × [a_n, b_n) ∈ J_n, we define

\[ \mathrm{vol}_n(A) = \prod_{k=1}^{n} (b_k - a_k). \]

We also define vol_n(∅) = 0.

Theorem 4.2. With the above notations, the map vol_n : J_n → [0, ∞] is a measure on J_n.

Proof. First we prove additivity. Using Exercise ?? (and induction on n) it suffices to analyze only the case n = 1, i.e. the case of half-open intervals in R. We need to show the implication

\[ [a, b) = \bigcup_{k=1}^{p} [a_k, b_k),\ \ \bigl( [a_k, b_k) \bigr)_{k=1}^{p} \text{ pair-wise disjoint} \ \Longrightarrow\ b - a = \sum_{k=1}^{p} (b_k - a_k). \tag{9} \]

We can prove this using induction on p. The case p = 1 is trivial. Assuming that the above fact holds for p = N, let us prove it for p = N + 1. Pick k_1 ∈ {1, ..., N + 1} such that a_{k_1} = a. Then we clearly have

\[ \bigcup_{\substack{1 \leq k \leq N+1 \\ k \neq k_1}} [a_k, b_k) = [b_{k_1}, b), \]

so by the inductive hypothesis we get

\[ b - b_{k_1} = \sum_{\substack{1 \leq k \leq N+1 \\ k \neq k_1}} (b_k - a_k), \]

so we get

\[ \sum_{k=1}^{N+1} (b_k - a_k) = (b_{k_1} - a_{k_1}) + (b - b_{k_1}) = b - a_{k_1} = b - a, \]

and we are done.

We now prove that vol_n is σ-sub-additive. Suppose we have A ∈ J_n and a sequence (A_k)_{k=1}^∞ ⊂ J_n, such that A ⊂ ⋃_{k=1}^∞ A_k, and let us prove the inequality

\[ \mathrm{vol}_n(A) \leq \sum_{k=1}^{\infty} \mathrm{vol}_n(A_k). \tag{10} \]

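It will be helpful to introduce the following notations. For every half-open box

B = [x_1, y_1) × ··· × [x_n, y_n),

and every δ > 0, we define the boxes

\[ B^\delta = [x_1 - \delta, y_1) \times \dots \times [x_n - \delta, y_n) \quad\text{and}\quad B_\delta = [x_1, y_1 - \delta) \times \dots \times [x_n, y_n - \delta). \]

It is clear that, for any box B ∈ J_n we have

\[ \overline{B_\delta} \subset B \subset \mathrm{Int}(B^\delta), \tag{11} \]

\[ \mathrm{vol}_n(B) = \lim_{\delta \to 0^+} \mathrm{vol}_n(B^\delta) = \lim_{\delta \to 0^+} \mathrm{vol}_n(B_\delta). \tag{12} \]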

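To prove (10), we fix some ε > 0, and we choose positive numbers δ and (δ_k)_{k=1}^∞, such that

\[ \mathrm{vol}_n(A_\delta) > \mathrm{vol}_n(A) - \varepsilon, \quad\text{and}\quad \mathrm{vol}_n\bigl( (A_k)^{\delta_k} \bigr) < \frac{\varepsilon}{2^k} + \mathrm{vol}_n(A_k),\ \forall\, k \in \mathbb{N}. \tag{13} \]

Notice now that, using (11), we have the inclusions

\[ \overline{A_\delta} \subset A \subset \bigcup_{k=1}^{\infty} A_k \subset \bigcup_{k=1}^{\infty} \mathrm{Int}\bigl( (A_k)^{\delta_k} \bigr), \]

and using the compactness of \overline{A_δ}, there exists some N ≥ 1, such that

\[ \overline{A_\delta} \subset \bigcup_{k=1}^{N} \mathrm{Int}\bigl( (A_k)^{\delta_k} \bigr). \]

This immediately gives the inclusion

\[ A_\delta \subset \bigcup_{k=1}^{N} (A_k)^{\delta_k}. \]

Using sub-additivity (see Corollary 4.1) we now get

\[ \mathrm{vol}_n(A_\delta) \leq \sum_{k=1}^{N} \mathrm{vol}_n\bigl( (A_k)^{\delta_k} \bigr), \]

and using (13) we have

\[ \mathrm{vol}_n(A) - \varepsilon \leq \sum_{k=1}^{N} \Bigl[ \frac{\varepsilon}{2^k} + \mathrm{vol}_n(A_k) \Bigr] \leq \varepsilon + \sum_{k=1}^{N} \mathrm{vol}_n(A_k) \leq \varepsilon + \sum_{k=1}^{\infty} \mathrm{vol}_n(A_k). \]

This gives

\[ \mathrm{vol}_n(A) - 2\varepsilon \leq \sum_{k=1}^{\infty} \mathrm{vol}_n(A_k). \]

But since this inequality holds for all ε > 0, the inequality (10) immediately follows.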
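The additivity part of Theorem 4.2 can be illustrated in the simplest situation, where a box is cut into two half-open boxes by a hyperplane. The sketch below encodes a box as a list of pairs (a_j, b_j); this encoding and the names `vol` and `split` are assumptions made for the illustration.

```python
from fractions import Fraction
from math import prod

def vol(box):
    """vol_n of a half-open box given as a list of (a_j, b_j) pairs with a_j < b_j."""
    return prod(b - a for a, b in box)

def split(box, axis, t):
    """Cut the box by the hyperplane x_axis = t (with a_axis < t < b_axis)."""
    a, b = box[axis]
    assert a < t < b
    left = box[:axis] + [(a, t)] + box[axis + 1:]
    right = box[:axis] + [(t, b)] + box[axis + 1:]
    return left, right

A = [(Fraction(0), Fraction(2)), (Fraction(1), Fraction(4))]   # [0, 2) x [1, 4)
left, right = split(A, 0, Fraction(1, 2))
print(vol(A), vol(left), vol(right))
```

Because the intervals are half-open, the two pieces are disjoint and their union is exactly the original box, so no boundary has to be accounted for twice.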



194 LECTURE 22

Proof. It is obvious that µ*(∅) = 0. It is also clear that µ* is monotone. To prove that µ* is σ-sub-additive, start with A ∈ P(X) and a sequence (A_n)_{n=1}^∞ ⊂ P(X), such that A ⊂ ⋃_{n=1}^∞ A_n, and let us prove the inequality µ*(A) ≤ Σ_{n=1}^∞ µ*(A_n). If there exists some n with A_n ∉ P^J_σ(X), there is nothing to prove. Assume A_n ∈ P^J_σ(X), for all n. Then it is clear that A ∈ P^J_σ(X). Fix for the moment some ε > 0. For every n ∈ N choose a sequence (B^n_k)_{k=1}^∞ ⊂ J, such that A_n ⊂ ⋃_{k=1}^∞ B^n_k and

\[ \sum_{k=1}^{\infty} \mu(B^n_k) < \frac{\varepsilon}{2^n} + \mu^*(A_n). \]

It is clear that, if we list the countable family (B^n_k)_{n,k=1}^∞ as a sequence (D_m)_{m=1}^∞, then A ⊂ ⋃_{m=1}^∞ D_m, and

\[ \mu^*(A) \leq \sum_{m=1}^{\infty} \mu(D_m) = \sum_{n=1}^{\infty} \sum_{k=1}^{\infty} \mu(B^n_k) \leq \sum_{n=1}^{\infty} \Bigl[ \frac{\varepsilon}{2^n} + \mu^*(A_n) \Bigr] = \varepsilon + \sum_{n=1}^{\infty} \mu^*(A_n). \]

Since the above inequality holds for all ε > 0, we conclude that

\[ \mu^*(A) \leq \sum_{n=1}^{\infty} \mu^*(A_n), \]

so µ* is indeed σ-sub-additive.

Finally, we must show that µ*|_J = µ. Start with some A ∈ J. On the one hand, since µ is a measure on J, we know that µ is σ-sub-additive (see Proposition 4.2). This means that, for any sequence (B_n)_{n=1}^∞ ⊂ J with A ⊂ ⋃_{n=1}^∞ B_n, we have Σ_{n=1}^∞ µ(B_n) ≥ µ(A). Since A obviously belongs to P^J_σ(X), this will force

µ*(A) ≥ µ(A).

On the other hand, if we consider the sequence B_1 = A, B_2 = B_3 = ··· = ∅, then we clearly have Σ_{n=1}^∞ µ(B_n) = µ(A), which gives µ*(A) ≤ µ(A), so in fact we must have the equality µ*(A) = µ(A).

Definition. The outer measure µ*, defined in the above result, is called the maximal outer extension of µ. This terminology is justified by the following.

Exercise 1. Let J be a semiring on X, and let µ be a measure on J. Prove that any outer measure ν on X with ν|_J = µ satisfies ν ≤ µ*, in the sense that

ν(A) ≤ µ*(A), ∀ A ⊂ X.

Exercise 2. Let J_1 and J_2 be semirings on X with J_1 ⊂ J_2, and let µ_1, µ_2 be respectively measures on J_1, J_2, such that µ_2|_{J_1} ≤ µ_1. Let µ_1*, µ_2* respectively be the maximal outer extensions of µ_1, µ_2. Prove the inequality µ_2* ≤ µ_1*.

Given a measure µ on a semiring J on X, one can ask whether there exists a unique outer measure on X, which extends µ. The answer is no, even in the most trivial cases.

Example 5.1. Work on the set X = {1, 2}. Take the semiring J = {∅, X} and define a measure µ on J by µ(∅) = 0 and µ(X) = 1. Choose now any number a ∈ (0, 1) and define ν_a : P(X) → [0, 1] by ν_a(A) = a κ_A(1) + (1 − a) κ_A(2). Then ν_a is an outer measure on X - in fact ν_a is a measure on P(X) - and ν_a|_J = µ. It is obvious that µ*({1}) = 1 ≠ a = ν_a({1}) and µ*({2}) = 1 ≠ 1 − a = ν_a({2}).
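The values in Example 5.1 can be reproduced by brute force. The sketch below assumes the usual covering formula for the maximal outer extension, µ*(A) = inf Σ µ(B_n) over covers of A by members of J (with value ∞ when no cover exists); since J is finite here, it is enough to minimize over finite sub-collections. The function names and the choice a = 1/3 are illustrative.

```python
import math
from fractions import Fraction
from itertools import combinations

def outer_extension(J_mu, A):
    """min of sum(mu(B)) over finite covers of A by members of J (inf if none)."""
    members = list(J_mu)
    best = math.inf
    for r in range(1, len(members) + 1):
        for cover in combinations(members, r):
            if A <= frozenset().union(*cover):
                best = min(best, sum(J_mu[B] for B in cover))
    return best

X = frozenset({1, 2})
J_mu = {frozenset(): Fraction(0), X: Fraction(1)}   # mu on J = {∅, X}
a = Fraction(1, 3)                                   # any a in (0, 1) would do

def nu_a(A):
    """nu_a(A) = a * kappa_A(1) + (1 - a) * kappa_A(2)."""
    return a * (1 in A) + (1 - a) * (2 in A)

for A in [frozenset(), frozenset({1}), frozenset({2}), X]:
    print(sorted(A), outer_extension(J_mu, A), nu_a(A))
```

Both maps agree with µ on J, yet they differ on the singletons, and ν_a never exceeds µ*, in line with Exercise 1.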

We introduce now another concept, which is very important in our analysis.




Definition. Let ν be an outer measure on a non-empty set X. A subset A ⊂ X is said to be ν-measurable, if it satisfies the condition

(m) ν(S) = ν(S ∩ A) + ν(S ∖ A), ∀ S ⊂ X.

For a given S, it is useful to think of the equality ν(S) = ν(S ∩ A) + ν(S ∖ A) in unorthodox terms as “A sharply cuts S,” so that saying that A is ν-measurable means that “A sharply cuts every set S ⊂ X.”

Remarks 5.1. Let ν be an outer measure on X.

A. Since ν is (finitely) sub-additive, for any two sets A, S ⊂ X, one always has the inequality ν(S) ≤ ν(S ∩ A) + ν(S ∖ A). Therefore, a set A ⊂ X is ν-measurable, if and only if

ν(S) ≥ ν(S ∩ A) + ν(S ∖ A), ∀ S ⊂ X.

B. Any subset N ⊂ X, with ν(N) = 0, is ν-measurable. Indeed, from the monotonicity of ν, we see that for every S ⊂ X, we have

ν(S ∩ N) + ν(S ∖ N) ≤ ν(N) + ν(S) = ν(S),

so by the preceding remark, N is indeed ν-measurable. Such a set N is called ν-negligible.
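On a finite set, condition (m) can be tested for every pair (A, S). The sketch below does this on X = {1, 2} for two outer measures: the one taking the value 1 on every non-empty set (the µ* of Example 5.1) and the counting measure. The helper names are illustrative.

```python
from itertools import combinations

X = frozenset({1, 2})
subsets = [frozenset(c) for r in range(len(X) + 1)
           for c in combinations(sorted(X), r)]

def measurable_sets(nu):
    """All A ⊂ X with nu(S) == nu(S ∩ A) + nu(S \\ A) for every S ⊂ X."""
    return [A for A in subsets
            if all(nu(S) == nu(S & A) + nu(S - A) for S in subsets)]

def mu_star(A):
    # outer measure equal to 1 on every non-empty subset of X
    return 1 if A else 0

def counting(A):
    return len(A)

print(measurable_sets(mu_star))
print(measurable_sets(counting))
```

For the first outer measure a singleton fails to cut S = X sharply (1 ≠ 1 + 1), so only ∅ and X are measurable; for the counting measure every subset is.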

The first key result in this section is the following.

Theorem 5.1. Let ν be an outer measure on a non-empty set X. Then the collection

\[ \mathfrak{m}_\nu(X) = \{ A \subset X : A \text{ is } \nu\text{-measurable} \} \]

is a σ-algebra on X. Moreover, the restriction

ν|_{m_ν(X)} : m_ν(X) → [0, ∞]

is a measure on m_ν(X).

Proof. The proof will be carried out in several steps.

Step 1: If A ∈ m_ν(X), then X ∖ A ∈ m_ν(X).

This is trivial, since for every S ⊂ X, one has the equalities

S ∩ (X ∖ A) = S ∖ A and S ∖ (X ∖ A) = S ∩ A.

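Step 2: If A, B ∈ m_ν(X), then A ∩ B ∈ m_ν(X).

Start with some arbitrary S ⊂ X. Since B is ν-measurable, it “sharply cuts the set S ∖ (A ∩ B),” which means that

\[ \nu\bigl( S \setminus (A \cap B) \bigr) = \nu\bigl( [S \setminus (A \cap B)] \cap B \bigr) + \nu\bigl( [S \setminus (A \cap B)] \setminus B \bigr). \]

Since we clearly have [S ∖ (A ∩ B)] ∩ B = (S ∩ B) ∖ A, and [S ∖ (A ∩ B)] ∖ B = S ∖ B, the above equality gives

\[ \nu\bigl( S \setminus (A \cap B) \bigr) = \nu\bigl( (S \cap B) \setminus A \bigr) + \nu(S \setminus B). \]

Adding ν((S ∩ B) ∩ A), and using the fact that A “sharply cuts S ∩ B,” we now get

\[ \nu\bigl( S \cap (A \cap B) \bigr) + \nu\bigl( S \setminus (A \cap B) \bigr) = \nu\bigl( (S \cap B) \cap A \bigr) + \nu\bigl( (S \cap B) \setminus A \bigr) + \nu(S \setminus B) = \nu(S \cap B) + \nu(S \setminus B). \]

Finally, using the fact that B “sharply cuts S,” we get

\[ \nu\bigl( S \cap (A \cap B) \bigr) + \nu\bigl( S \setminus (A \cap B) \bigr) = \nu(S \cap B) + \nu(S \setminus B) = \nu(S), \]

so A ∩ B is indeed ν-measurable.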



So far, Steps 1 and 2 prove that m_ν(X) is an algebra on X.

Step 3: For any pair-wise disjoint finite sequence (A_n)_{n=1}^N ⊂ m_ν(X), one has the equality

\[ \nu\bigl( S \cap [A_1 \cup \dots \cup A_N] \bigr) = \sum_{n=1}^{N} \nu(S \cap A_n), \quad \forall\, S \subset X. \]

Since m_ν(X) is an algebra, it suffices to prove the above equality only for N = 2. (The case of arbitrary N follows immediately by induction.) To prove that ν(S ∩ (A_1 ∪ A_2)) = ν(S ∩ A_1) + ν(S ∩ A_2), we simply use the fact that A_1 “sharply cuts S ∩ (A_1 ∪ A_2),” which gives

\[ \nu\bigl( S \cap (A_1 \cup A_2) \bigr) = \nu\bigl( [S \cap (A_1 \cup A_2)] \cap A_1 \bigr) + \nu\bigl( [S \cap (A_1 \cup A_2)] \setminus A_1 \bigr). \]

The desired equality then immediately follows from the obvious equalities

[S ∩ (A_1 ∪ A_2)] ∩ A_1 = S ∩ A_1 and [S ∩ (A_1 ∪ A_2)] ∖ A_1 = S ∩ A_2.

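The preceding step can in fact be extended to infinite sequences.

Step 4: For any pair-wise disjoint sequence (A_n)_{n=1}^∞ ⊂ m_ν(X), one has the equality

\[ \nu\Bigl( S \cap \Bigl[ \bigcup_{n=1}^{\infty} A_n \Bigr] \Bigr) = \sum_{n=1}^{\infty} \nu(S \cap A_n), \quad \forall\, S \subset X. \]

To prove this fact, we fix a sequence (A_n)_{n=1}^∞ as above, as well as S ⊂ X. By σ-sub-additivity, we already know that

\[ \nu\Bigl( S \cap \Bigl[ \bigcup_{n=1}^{\infty} A_n \Bigr] \Bigr) = \nu\Bigl( \bigcup_{n=1}^{\infty} [S \cap A_n] \Bigr) \leq \sum_{n=1}^{\infty} \nu(S \cap A_n), \]

so the only thing we have to show is the inequality

\[ \sum_{n=1}^{N} \nu(S \cap A_n) \leq \nu\Bigl( S \cap \Bigl[ \bigcup_{n=1}^{\infty} A_n \Bigr] \Bigr), \quad \forall\, N \in \mathbb{N}. \]

This follows immediately from Step 3 and the monotonicity:

\[ \sum_{n=1}^{N} \nu(S \cap A_n) = \nu\Bigl( S \cap \Bigl[ \bigcup_{n=1}^{N} A_n \Bigr] \Bigr) \leq \nu\Bigl( S \cap \Bigl[ \bigcup_{n=1}^{\infty} A_n \Bigr] \Bigr). \]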

Step 5: m_ν(X) is a monotone class.

We need to prove the properties:

(i) whenever (A_n)_{n=1}^∞ ⊂ m_ν(X) is a sequence with A_n ⊂ A_{n+1}, ∀ n ∈ N, it follows that ⋃_{n=1}^∞ A_n belongs to m_ν(X);
(ii) whenever (A_n)_{n=1}^∞ ⊂ m_ν(X) is a sequence with A_n ⊃ A_{n+1}, ∀ n ∈ N, it follows that ⋂_{n=1}^∞ A_n belongs to m_ν(X).

Since m_ν(X) is an algebra, it suffices only to prove (i). Start with an arbitrary subset S ⊂ X, and a sequence (A_n)_{n=1}^∞ ⊂ m_ν(X) with A_n ⊂ A_{n+1}, ∀ n ∈ N, and denote the union ⋃_{n=1}^∞ A_n simply by A. Define the sets B_1 = A_1 and B_n = A_n ∖ A_{n−1}, ∀ n ≥ 2. It is obvious that (B_n)_{n=1}^∞ is a pair-wise disjoint sequence. Since m_ν(X) is an algebra, all the B_n’s belong to m_ν(X). We have ⋃_{n=1}^∞ B_n = ⋃_{n=1}^∞ A_n = A, which, using Step 4, gives

\[ \nu(S \cap A) = \nu\Bigl( S \cap \Bigl[ \bigcup_{n=1}^{\infty} A_n \Bigr] \Bigr) = \nu\Bigl( S \cap \Bigl[ \bigcup_{n=1}^{\infty} B_n \Bigr] \Bigr) = \sum_{n=1}^{\infty} \nu(S \cap B_n). \tag{1} \]

Using Step 3, combined with the equality ⋃_{n=1}^N B_n = A_N, we also have

\[ \sum_{n=1}^{N} \nu(S \cap B_n) = \nu\Bigl( S \cap \Bigl[ \bigcup_{n=1}^{N} B_n \Bigr] \Bigr) = \nu(S \cap A_N), \quad \forall\, N \in \mathbb{N}, \]

so by (1) we have

\[ \nu(S \cap A) = \sum_{n=1}^{\infty} \nu(S \cap B_n) = \lim_{N\to\infty} \nu(S \cap A_N). \tag{2} \]

Notice now that, using the fact that A_N “sharply cuts S,” combined with the monotonicity of ν and the obvious inclusion S ∖ A ⊂ S ∖ A_N, we have

ν(S ∩ A_N) + ν(S ∖ A) ≤ ν(S ∩ A_N) + ν(S ∖ A_N) = ν(S), ∀ N ∈ N,

so using (2), we immediately get

ν(S ∩ A) + ν(S ∖ A) ≤ ν(S).

Since the above inequality holds for all S ⊂ X, by Remark 5.1.A it follows that A indeed belongs to m_ν(X).

By the results from Section 1, we know that the fact that m_ν(X) is simultaneously an algebra, and a monotone class, implies the fact that m_ν(X) is a σ-algebra.

We now show that ν|_{m_ν(X)} is a measure. If we start with a pair-wise disjoint sequence (A_n)_{n=1}^∞ ⊂ m_ν(X), then the equality

\[ \nu\Bigl( \bigcup_{n=1}^{\infty} A_n \Bigr) = \sum_{n=1}^{\infty} \nu(A_n) \]

is an immediate consequence of Step 4, applied to the set S = ⋃_{n=1}^∞ A_n, which clearly satisfies S ∩ A_n = A_n, ∀ n ∈ N.

We are now in a position to answer Question 1.

Theorem 5.2. Let X be a non-empty set, let J be a semiring on X, let µ be a measure on J, and let µ* be the maximal outer extension of µ. Then J ⊂ m_{µ*}(X). In particular, m_{µ*}(X) contains the σ-algebra Σ(J) on X, generated by J, and µ*|_{Σ(J)} is a measure on Σ(J).

Proof. What we need to prove is the fact that every set A ∈ J is µ*-measurable. Start with an arbitrary set S ⊂ X. As noticed before (Remark 5.1.A), we only need to prove the inequality

\[ \mu^*(S \cap A) + \mu^*(S \setminus A) \leq \mu^*(S). \tag{3} \]

If µ*(S) = ∞, there is nothing to prove, so we can assume that µ*(S) < ∞. In particular this means that S ∈ P^J_σ(X). Fix for the moment ε > 0. By the definition of µ*(S), there exists a sequence (B_n)_{n=1}^∞ ⊂ J, such that S ⊂ ⋃_{n=1}^∞ B_n, and

\[ \sum_{n=1}^{\infty} \mu(B_n) \leq \mu^*(S) + \varepsilon. \tag{4} \]

Since J is a semiring, for each n ∈ N, we can find some integer p_n ≥ 1, and a sequence (D^n_j)_{j=0}^{p_n} ⊂ J, such that

• B_n ∩ A = D^n_0 ⊂ D^n_1 ⊂ ··· ⊂ D^n_{p_n} = B_n,
• D^n_j ∖ D^n_{j−1} ∈ J, ∀ j ∈ {1, ..., p_n}.

Define the numbers k_0 = 0, and k_n = Σ_{j=1}^n p_j, ∀ n ∈ N, and the sequence (C_m)_{m=1}^∞ ⊂ J, by

\[ C_m = D^n_{m - k_{n-1}} \setminus D^n_{m - 1 - k_{n-1}}, \quad\text{if } k_{n-1} < m \leq k_n,\ n \in \mathbb{N}. \]

By construction, for each n ∈ N, we have

\[ \bigcup_{m=k_{n-1}+1}^{k_n} C_m = \bigcup_{j=1}^{p_n} (D^n_j \setminus D^n_{j-1}) = B_n \setminus A. \]

Moreover, for each n ∈ N the system

(D^n_0, C_{k_{n−1}+1}, C_{k_{n−1}+2}, ..., C_{k_n}) = (D^n_0, D^n_1 ∖ D^n_0, D^n_2 ∖ D^n_1, ..., D^n_{p_n} ∖ D^n_{p_n−1})

in J is pair-wise disjoint, and has

\[ D^n_0 \cup \bigcup_{m=k_{n-1}+1}^{k_n} C_m = B_n, \]

so we get the equality

\[ \mu(D^n_0) + \sum_{m=k_{n-1}+1}^{k_n} \mu(C_m) = \mu(B_n). \]

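Using (4) we now get

\[ \begin{aligned} \sum_{n=1}^{\infty} \mu(D^n_0) + \sum_{m=1}^{\infty} \mu(C_m) &= \sum_{n=1}^{\infty} \mu(D^n_0) + \sum_{n=1}^{\infty} \sum_{m=k_{n-1}+1}^{k_n} \mu(C_m) = \\ &= \sum_{n=1}^{\infty} \Bigl[ \mu(D^n_0) + \sum_{m=k_{n-1}+1}^{k_n} \mu(C_m) \Bigr] = \sum_{n=1}^{\infty} \mu(B_n) \leq \mu^*(S) + \varepsilon. \end{aligned} \tag{5} \]

On the one hand, we clearly have

\[ \begin{aligned} \bigcup_{m=1}^{\infty} C_m &= \bigcup_{n=1}^{\infty} \bigcup_{m=k_{n-1}+1}^{k_n} C_m = \bigcup_{n=1}^{\infty} \bigcup_{j=1}^{p_n} (D^n_j \setminus D^n_{j-1}) = \\ &= \bigcup_{n=1}^{\infty} (D^n_{p_n} \setminus D^n_0) = \bigcup_{n=1}^{\infty} (B_n \setminus A) = \Bigl( \bigcup_{n=1}^{\infty} B_n \Bigr) \setminus A \supset S \setminus A, \end{aligned} \]

which gives the inequality

\[ \sum_{m=1}^{\infty} \mu(C_m) \geq \mu^*(S \setminus A). \tag{6} \]

On the other hand, we also have

\[ \bigcup_{n=1}^{\infty} D^n_0 = \bigcup_{n=1}^{\infty} (B_n \cap A) = \Bigl( \bigcup_{n=1}^{\infty} B_n \Bigr) \cap A \supset S \cap A, \]

which gives the inequality

\[ \sum_{n=1}^{\infty} \mu(D^n_0) \geq \mu^*(S \cap A). \tag{7} \]

Combining (6) and (7) with (5) gives µ*(S ∩ A) + µ*(S ∖ A) ≤ µ*(S) + ε. Since ε > 0 was arbitrary, the desired inequality (3) follows.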

The construction

\[ \underbrace{\mu}_{\text{measure on } \mathcal{J}} \ \xrightarrow{\ \text{maximal outer extension}\ }\ \underbrace{\mu^*}_{\text{outer measure on } X} \ \xrightarrow{\ \text{restriction}\ }\ \underbrace{\mu^*\big|_{\Sigma(\mathcal{J})}}_{\text{measure on } \Sigma(\mathcal{J})} \]

is referred to as the Caratheodory construction.

Definitions. Let J be a semiring on X, and let µ be a measure on J. The Caratheodory construction provides us with two measures. The first measure - µ*|_{S(J)} - is a measure on the σ-ring S(J) generated by J, and is called the maximal σ-ring extension of µ. The second measure - µ*|_{Σ(J)} - is a measure on the σ-algebra Σ(J) generated by J, and is called the maximal σ-algebra extension of µ.

The above terminology is justified by the following result.

Proposition 5.2. Let J be a semiring on X, and let µ be a measure on J.

(i) If ν is a measure on the σ-ring S(J) generated by J, with ν|_J = µ, then ν ≤ µ*|_{S(J)}.

(ii) If ν is a measure on the σ-algebra Σ(J) generated by J, with ν|_J = µ, then ν ≤ µ*|_{Σ(J)}.

Proof. We prove both statements simultaneously. Let J_1 denote either the σ-ring, or the σ-algebra generated by J. In particular J_1 is a semiring, and J ⊂ J_1. Since ν is a measure on J_1 with ν|_J = µ, if we denote by ν* its maximal outer extension, then by Exercise 2 we know that ν* ≤ µ*. In particular, by Proposition 5.1 and Theorem 5.2, we get ν = ν*|_{J_1} ≤ µ*|_{J_1}.

We now discuss the uniqueness of extensions of a semiring measure. In order to clarify this matter, we have to introduce a technical condition, which turns out to be very helpful not only here, but in many other situations.

Definitions. Let J be a semiring on X, and let µ be a measure on J.

A. We say that a subset A ⊂ X is J-µ-σ-finite, if there exists a sequence (B_n)_{n=1}^∞ ⊂ J, such that A ⊂ ⋃_{n=1}^∞ B_n, and µ(B_n) < ∞, ∀ n ∈ N. (When there is no danger of confusion, we will use the terms “µ-σ-finite,” or simply “σ-finite.”)

B. We say that the measure µ is σ-finite, if every A ∈ J is σ-finite.

C. We say that the measure µ is finite, if µ(A) < ∞, ∀ A ∈ J.

Clearly every finite measure on J is σ-finite.

Remark 5.2. Let J be a semiring on X, let µ be a measure on J, and let A be a set which belongs to the σ-algebra Σ(J) generated by J. If A is J-µ-σ-finite, then A in fact belongs to the σ-ring S(J) generated by J. The only thing that is actually needed here is the existence of a sequence (B_n)_{n=1}^∞ ⊂ J with A ⊂ ⋃_{n=1}^∞ B_n. This gives the fact that A belongs to P^J_σ(X), so by Proposition 2.3, the set A belongs to the intersection Σ(J) ∩ P^J_σ(X) = S(J).

Using the above terminology, we have the following uniqueness result.

Theorem 5.3. Let J be a semiring on X, let µ be a measure on J, let µ* be the maximal outer extension of µ, and let ν be a measure on the σ-ring S(J) generated by J, with ν|_J = µ. Then one has ν(A) = µ*(A), for all J-µ-σ-finite sets A ∈ S(J).

Proof. Fix a J-µ-σ-finite set A ∈ S(J).

Claim: There exists a pair-wise disjoint sequence (D_n)_{n=1}^∞ ⊂ S(J) such that A ⊂ ⋃_{n=1}^∞ D_n, and ν(D_n) = µ*(D_n) < ∞, ∀ n ∈ N.

To prove the above statement, start with a sequence (B_n)_{n=1}^∞ ⊂ J with A ⊂ ⋃_{n=1}^∞ B_n and µ(B_n) < ∞, ∀ n ∈ N. Define the sets D_n, n ∈ N by D_1 = B_1, and D_n = B_n ∖ (B_1 ∪ ··· ∪ B_{n−1}), ∀ n ≥ 2. It is clear that the sequence (D_n)_{n=1}^∞ is pair-wise disjoint, and

\[ A \subset \bigcup_{n=1}^{\infty} B_n = \bigcup_{n=1}^{\infty} D_n. \]

Moreover, all the D_n’s belong to the ring R(J) generated by J. The inclusions D_n ⊂ B_n then prove that

µ*(D_n) ≤ µ*(B_n) = µ(B_n) < ∞, ∀ n ∈ N.

Finally, since both µ*|_{R(J)} and ν|_{R(J)} are measures on R(J), which have the same values on J, using the Semiring-to-Ring Extension Theorem 4.1, it follows that

(8) µ*|_{R(J)} = ν|_{R(J)}.

In particular we have the equalities

ν(D_n) = µ*(D_n), ∀ n ∈ N.

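Having proven the Claim, we now show that $\nu(A) = \mu^*(A)$. We choose a sequence $(D_n)_{n=1}^\infty \subset \mathcal{S}(\mathcal{J})$ as in the Claim. Since the $D_n$'s are pair-wise disjoint and cover $A$, and both $\nu$ and $\mu^*|_{\mathcal{S}(\mathcal{J})}$ are measures on the σ-ring $\mathcal{S}(\mathcal{J})$, one has the equalities

$$\nu(A) = \sum_{n=1}^\infty \nu(A \cap D_n) \quad\text{and}\quad \mu^*(A) = \sum_{n=1}^\infty \mu^*(A \cap D_n).$$

So, in order to prove the equality $\nu(A) = \mu^*(A)$, it suffices to prove that

$$\nu(A \cap D_n) = \mu^*(A \cap D_n), \quad \forall\, n \in \mathbb{N}. \tag{9}$$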

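Fix $n \in \mathbb{N}$. On the one hand, by Proposition 5.2(i), we have the inequalities

$$\nu(A \cap D_n) \leq \mu^*(A \cap D_n) < \infty \quad\text{and}\quad \nu(D_n \setminus A) \leq \mu^*(D_n \setminus A) < \infty. \tag{10}$$

On the other hand, by the Claim and the additivity of $\nu$ and $\mu^*|_{\mathcal{S}(\mathcal{J})}$, we have

$$\nu(A \cap D_n) + \nu(D_n \setminus A) = \nu(D_n) = \mu^*(D_n) = \mu^*(A \cap D_n) + \mu^*(D_n \setminus A).$$

Now if we go back to (10), we see that neither of the two inequalities can be strict, because in that case we would get $\nu(D_n) < \mu^*(D_n)$. (The assumption that $\mu^*(D_n) < \infty$ is essential here.) So we must have (9), and we are done.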
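To illustrate the disjointification step used in the Claim, here is a small worked instance. The three intervals are an illustrative choice that does not appear in the text, and $\mathcal{J}$ is assumed to be the semiring of half-open intervals in $\mathbb{R}$.

```latex
% Illustrative choice (an assumption): three overlapping half-open intervals
B_1 = [0,2), \qquad B_2 = [1,3), \qquad B_3 = [2,4).
% Disjointification: D_1 = B_1, D_n = B_n \setminus (B_1 \cup \dots \cup B_{n-1})
D_1 = [0,2), \qquad
D_2 = [1,3) \setminus [0,2) = [2,3), \qquad
D_3 = [2,4) \setminus [0,3) = [3,4).
% The D_n are pairwise disjoint, D_n \subset B_n, and the unions agree:
D_1 \cup D_2 \cup D_3 = [0,4) = B_1 \cup B_2 \cup B_3 .
```

Each $D_n$ is a finite combination of elements of $\mathcal{J}$, which is why the proof can work in the ring $\mathcal{R}(\mathcal{J})$ at this stage.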




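Corollary 5.1. If $\mu$ is a σ-finite measure on a semiring $\mathcal{J}$, then there exists a unique measure $\nu$ on the σ-ring $\mathcal{S}(\mathcal{J})$ generated by $\mathcal{J}$, such that $\nu|_{\mathcal{J}} = \mu$. Moreover, $\nu$ is σ-finite.

Proof. The existence is given by the Caratheodory construction. The uniqueness follows from Theorem 5.3.

To prove σ-finiteness, start with some $A \in \mathcal{S}(\mathcal{J})$, and let us find a sequence $(B_n)_{n=1}^\infty \subset \mathcal{S}(\mathcal{J})$ with $A \subset \bigcup_{n=1}^\infty B_n$ and $\nu(B_n) < \infty$, $\forall\, n \in \mathbb{N}$. First of all, since $\mathcal{P}^{\mathcal{J}}_\sigma(X)$ is a σ-ring which contains $\mathcal{J}$, it follows that $\mathcal{S}(\mathcal{J}) \subset \mathcal{P}^{\mathcal{J}}_\sigma(X)$. In particular, there exists $(D_n)_{n=1}^\infty \subset \mathcal{J}$ such that $A \subset \bigcup_{n=1}^\infty D_n$. Using the fact that $\mu$ is σ-finite, we see that for each $n$ we can find a sequence $(D^n_k)_{k=1}^\infty \subset \mathcal{J}$, with $D_n \subset \bigcup_{k=1}^\infty D^n_k$ and $\mu(D^n_k) < \infty$, $\forall\, k \in \mathbb{N}$. If we list all the sets $D^n_k$, $k, n \in \mathbb{N}$, as a sequence $(B_m)_{m=1}^\infty$, then we are done.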

In the absence of the σ-finiteness condition, the uniqueness of the σ-ring extension fails, as illustrated by the following.

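Example 5.2. Consider the set $X = \mathbb{Q}$, and the semiring of rational half-open intervals

$$\mathcal{J}_1 = \{\varnothing\} \cup \big\{ [a,b) \cap \mathbb{Q} : a, b \in \mathbb{R},\ a < b \big\}.$$

We equip $\mathcal{J}_1$ with the measure $\mu$ defined by

$$\mu(A) = \begin{cases} 0 & \text{if } A = \varnothing \\ \infty & \text{if } A \neq \varnothing \end{cases}$$

Notice that, if we look at the inclusion $\iota : \mathbb{Q} \hookrightarrow \mathbb{R}$, then $\mathcal{J}_1 = \mathcal{J}|_{\mathbb{Q}}$, where $\mathcal{J}$ is the semiring of half-open intervals in $\mathbb{R}$. By the Generating Theorem we then have

$$\mathcal{S}(\mathcal{J}_1) = \mathcal{S}(\mathcal{J}|_{\mathbb{Q}}) = \mathcal{S}(\mathcal{J})|_{\mathbb{Q}} = \mathrm{Bor}(\mathbb{R})|_{\mathbb{Q}} = \mathcal{P}(\mathbb{Q}).$$

Define now the measures $\nu_1, \nu_2 : \mathcal{S}(\mathcal{J}_1) \to [0, \infty]$ by

$$\nu_1(A) = \begin{cases} \operatorname{card} A & \text{if } A \text{ is finite} \\ \infty & \text{if } A \text{ is infinite} \end{cases}
\qquad
\nu_2(A) = \begin{cases} 2 \cdot \operatorname{card} A & \text{if } A \text{ is finite} \\ \infty & \text{if } A \text{ is infinite} \end{cases}$$

It is obvious that both $\nu_1$ and $\nu_2$ satisfy $\nu_1|_{\mathcal{J}_1} = \nu_2|_{\mathcal{J}_1} = \mu$, but $\nu_1$ and $\nu_2$ are not equal.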
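The two claims at the end of Example 5.2 can be checked directly. The following lines are a sketch of that check; the singleton $\{0\}$ is an illustrative choice of test set, not one made in the text.

```latex
% A non-empty element of J_1 is [a,b) \cap \mathbb{Q} with a < b, which is an infinite set, so
\nu_1\big([a,b)\cap\mathbb{Q}\big) = \infty = \nu_2\big([a,b)\cap\mathbb{Q}\big) = \mu\big([a,b)\cap\mathbb{Q}\big),
\qquad
\nu_1(\varnothing) = 0 = \nu_2(\varnothing) = \mu(\varnothing).
% On a finite non-empty set the two measures differ, e.g. on the singleton {0}:
\nu_1(\{0\}) = 1 \neq 2 = \nu_2(\{0\}).
```

This does not contradict Theorem 5.3: every non-empty element of $\mathcal{J}_1$ has infinite $\mu$-measure, so the only $\mathcal{J}_1$-$\mu$-σ-finite set is $\varnothing$.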

Comment. In connection with the Caratheodory construction, it is legitimate to ask the following.

Question 2: What happens if we do the Caratheodory construction twice?

This problem has in fact two aspects.

Question 2A: Suppose $\omega$ is an outer measure on $X$. Take $\mathcal{I} = m_\omega(X)$ and $\nu = \omega|_{\mathcal{I}}$, so that $\mathcal{I}$ is a semiring (in fact it is a σ-algebra) on $X$, and $\nu$ is a measure on $\mathcal{I}$. Let $\nu^*$ be the maximal outer extension of $\nu$. Is it true that $\nu^* = \omega$?

By Exercise 2, we always have $\omega \leq \nu^*$. In general the answer to Question 2A is negative, as shown in Exercise ??? below. One can ask however the following.

Question 2B: Same question as 2A, but suppose $\omega = \mu^*$, the maximal outer extension of a measure $\mu$ on a semiring $\mathcal{J}$.

The following result shows that Question 2B always has an affirmative answer.




Proposition 5.3. Let $X$ be a non-empty set, let $\mathcal{J}$ be a semiring on $X$, and let $\mu$ be a measure on $\mathcal{J}$. Let $\mu^*$ be the maximal outer extension of $\mu$. Let $\mathcal{I}$ be a semiring, with $\mathcal{I} \supset \mathcal{J}$. Consider the measure $\nu = \mu^*|_{\mathcal{I}}$, and let $\nu^*$ be the maximal outer extension of $\nu$. Then $\nu^* = \mu^*$.

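Proof. First of all, since $\nu|_{\mathcal{J}} = \mu^*|_{\mathcal{J}} = \mu$, by Exercise 2, we have the inequality $\nu^* \leq \mu^*$.

To prove the other inequality, we start with an arbitrary set $A \subset X$, and we prove that $\mu^*(A) \leq \nu^*(A)$. If $\nu^*(A) = \infty$, there is nothing to prove, so we may assume $\nu^*(A) < \infty$. In particular, $A \in \mathcal{P}^{\mathcal{I}}_\sigma(X)$, i.e. there exists at least one sequence $(B_n)_{n=1}^\infty \subset \mathcal{I}$, with $A \subset \bigcup_{n=1}^\infty B_n$, and we have

$$\nu^*(A) = \inf\Big\{ \sum_{n=1}^\infty \nu(B_n) : (B_n)_{n=1}^\infty \subset \mathcal{I},\ A \subset \bigcup_{n=1}^\infty B_n \Big\}.$$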

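Fix for the moment some $\varepsilon > 0$, and choose a sequence $(B^\varepsilon_n)_{n=1}^\infty \subset \mathcal{I}$, such that

$$A \subset \bigcup_{n=1}^\infty B^\varepsilon_n \quad\text{and}\quad \sum_{n=1}^\infty \nu(B^\varepsilon_n) \leq \nu^*(A) + \varepsilon. \tag{11}$$

By σ-subadditivity of $\mu^*$, we have

$$\mu^*(A) \leq \sum_{n=1}^\infty \mu^*(B^\varepsilon_n).$$

Using the fact that $\nu = \mu^*|_{\mathcal{I}}$, the above inequality, combined with (11), yields

$$\mu^*(A) \leq \nu^*(A) + \varepsilon.$$

Since this inequality holds for all $\varepsilon > 0$, it forces the inequality $\mu^*(A) \leq \nu^*(A)$.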

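Exercise 3. Let $X$ be an uncountable set, and define $\omega : \mathcal{P}(X) \to [0, \infty]$ by

$$\omega(A) = \begin{cases} 0 & \text{if } A = \varnothing \\ 1 & \text{if } 0 < \operatorname{card} A \leq \aleph_0 \\ 2 & \text{if } A \text{ is uncountable} \end{cases}$$

(i) Prove that $\omega$ is an outer measure on $X$.

(ii) Take $\mathcal{I} = m_\omega(X)$. Prove that $\mathcal{I} = \{\varnothing, X\}$.

(iii) Consider the measure $\nu = \omega|_{\mathcal{I}}$, and let $\nu^*$ be the maximal outer extension of $\nu$. Prove that there are sets $A \subset X$, with $\omega(A) < \nu^*(A)$.

Hints: For (ii) start with some $A$ with $\varnothing \subsetneq A \subsetneq X$. Prove that $A$ is not $\omega$-measurable, by showing that $A$ does not “sharply cut” sets of the form $\{a, b\}$ with $a \in A$ and $b \in X \setminus A$.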
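The computation behind the hint for (ii) is short. The following is a sketch of it, with the test set $S = \{a, b\}$ chosen as the hint suggests.

```latex
% Take a \in A, b \in X \setminus A, and test the set S = \{a, b\}.
\omega(S) = 1,
\qquad
\omega(S \cap A) + \omega(S \setminus A) = \omega(\{a\}) + \omega(\{b\}) = 1 + 1 = 2 .
% Measurability of A would require \omega(S) = \omega(S \cap A) + \omega(S \setminus A),
% which fails here since 1 < 2.
```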

Comment. Suppose $\mathcal{J}$ is a semiring on $X$, and $\mu$ is a measure on $\mathcal{J}$. We have used the maximal outer extension $\mu^*$ as a tool in defining measures on the σ-ring $\mathcal{S}(\mathcal{J})$ and the σ-algebra $\Sigma(\mathcal{J})$ generated by $\mathcal{J}$, by employing the Caratheodory construction, which uses the σ-algebra $m_{\mu^*}(X)$ of $\mu^*$-measurable sets. A legitimate question is then

Question 3: Is the inclusion $\Sigma(\mathcal{J}) \subset m_{\mu^*}(X)$ strict?

In most cases this inclusion is indeed strict (see Examples ?? below, or the discussion in the next section). This can be seen by looking at $\mu^*$-negligible sets $N \subset X$, which are automatically $\mu^*$-measurable. The following result gives some useful information.




Proposition 5.4. Suppose $\mathcal{J}$ is a semiring on $X$, and $\mu$ is a measure on $\mathcal{J}$. Let $\mu^*$ be the maximal outer extension of $\mu$. For any set $A \in \mathcal{P}^{\mathcal{J}}_\sigma(X)$, there exists some set $B$ in the σ-ring $\mathcal{S}(\mathcal{J})$ generated by $\mathcal{J}$, such that $A \subset B$, and $\mu^*(A) = \mu^*(B)$.

In particular, a subset $N \subset X$ is $\mu^*$-negligible, i.e. $\mu^*(N) = 0$, if and only if there exists a $\mu^*$-negligible set $B \in \mathcal{S}(\mathcal{J})$, such that $N \subset B$.

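Proof. Since $A \in \mathcal{P}^{\mathcal{J}}_\sigma(X)$, there exists a sequence $(D_n)_{n=1}^\infty \subset \mathcal{J}$ with $A \subset \bigcup_{n=1}^\infty D_n$. Moreover, we have

$$\mu^*(A) = \inf\Big\{ \sum_{n=1}^\infty \mu(D_n) : (D_n)_{n=1}^\infty \subset \mathcal{J},\ A \subset \bigcup_{n=1}^\infty D_n \Big\}.$$

For each integer $k \geq 1$, we can then choose a sequence $(B^k_n)_{n=1}^\infty \subset \mathcal{J}$ with $A \subset \bigcup_{n=1}^\infty B^k_n$ and $\sum_{n=1}^\infty \mu(B^k_n) \leq \mu^*(A) + 1/k$. For each integer $k \geq 1$, we define the set $B_k = \bigcup_{n=1}^\infty B^k_n$. It is clear that $B_k \in \mathcal{S}(\mathcal{J})$, and $B_k \supset A$, for all $k \in \mathbb{N}$. Moreover, by σ-subadditivity of $\mu^*$, and the equality $\mu^*|_{\mathcal{J}} = \mu$, we have the inequalities

$$\mu^*(B_k) \leq \sum_{n=1}^\infty \mu^*(B^k_n) = \sum_{n=1}^\infty \mu(B^k_n) \leq \mu^*(A) + \frac{1}{k}, \quad \forall\, k \in \mathbb{N}.$$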

If we then form $B = \bigcap_{k=1}^\infty B_k$, then $B$ still belongs to $\mathcal{S}(\mathcal{J})$, and we have $A \subset B \subset B_k$, which gives

$$\mu^*(A) \leq \mu^*(B) \leq \mu^*(B_k) \leq \mu^*(A) + \frac{1}{k}, \quad \forall\, k \in \mathbb{N},$$

thus forcing $\mu^*(B) = \mu^*(A)$.

To prove the second assertion, we see that the “only if” part is a particular case of the first part. The “if” part is trivial, since the inclusion $N \subset B$ forces the inequality $\mu^*(N) \leq \mu^*(B)$.

In connection to Question 3, it is useful to introduce the following terminology.

Definition. Let X be a non-empty set, and let J be a semiring on X. A measure µ on J is said to be complete, if it satisfies the condition

(c) whenever N ∈ J has µ(N) = 0, it follows that J contains all the subsets of N.

Remarks 5.3. A. Given an outer measure $\nu$ on a set $X$, the measure $\nu|_{m_\nu(X)} : m_\nu(X) \to [0, \infty]$ is always complete, as a consequence of monotonicity, and of Remark 5.1.B.

B. Given a semiring $\mathcal{J}$ on $X$, and a measure $\mu$ on $\mathcal{J}$, we now see that a sufficient condition for having a strict inclusion $\Sigma(\mathcal{J}) \subsetneq m_{\mu^*}(X)$ is the lack of completeness for the measure $\mu^*|_{\Sigma(\mathcal{J})}$. Later on (see Corollary 5.2) we shall see that in the case of σ-finite measures, defined on σ-total semirings, this condition is also necessary.

The lack of completeness of a σ-ring measure can be compensated by the following result.

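Theorem 5.4. Let $X$ be a non-empty set, let $\mathcal{S}$ be a σ-ring on $X$, and let $\nu$ be a measure on $\mathcal{S}$.

(i) The collection

$$\mathcal{N}(\mathcal{S}, \nu) = \big\{ N \subset X : \text{there exists } D \in \mathcal{S} \text{ with } N \subset D \text{ and } \nu(D) = 0 \big\}$$

is a σ-ring on $X$. Moreover, if $N \in \mathcal{N}(\mathcal{S}, \nu)$, then $\mathcal{N}(\mathcal{S}, \nu)$ contains all subsets of $N$.

(ii) For a subset $A \subset X$, the following are equivalent:
(a) there exist $B \in \mathcal{S}$ and $N \in \mathcal{N}(\mathcal{S}, \nu)$, such that $A = B \setminus N$;
(b) there exist $F \in \mathcal{S}$ and $M \in \mathcal{N}(\mathcal{S}, \nu)$, such that $A = F \cup M$.

(iii) The collection $\overline{\mathcal{S}}$ of all subsets $A \subset X$, satisfying the equivalent conditions in (ii), is a σ-ring. It coincides with the σ-ring generated by $\mathcal{N}(\mathcal{S}, \nu) \cup \mathcal{S}$, that is, we have the equality

$$\overline{\mathcal{S}} = \mathcal{S}\big( \mathcal{N}(\mathcal{S}, \nu) \cup \mathcal{S} \big).$$

(iv) There exists a unique measure $\overline{\nu}$ on $\overline{\mathcal{S}}$, such that $\overline{\nu}|_{\mathcal{N}(\mathcal{S},\nu)} = 0$ and $\overline{\nu}|_{\mathcal{S}} = \nu$. The measure $\overline{\nu}$ is complete.

(v) If $\mathcal{E}$ is a σ-ring with $\mathcal{E} \supset \mathcal{S}$, and if $\lambda$ is a complete measure on $\mathcal{E}$ with $\lambda|_{\mathcal{S}} = \nu$, then $\mathcal{E} \supset \overline{\mathcal{S}}$ and $\lambda|_{\overline{\mathcal{S}}} = \overline{\nu}$.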

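Proof. (i). This is pretty clear. In fact, if one takes $\mathcal{E} = \{ B \in \mathcal{S} : \nu(B) = 0 \}$, then one has the equality $\mathcal{N}(\mathcal{S}, \nu) = \mathcal{P}^{\mathcal{E}}_\sigma(X)$.

(ii). (a) ⇒ (b). Assume $A = B \setminus N$ with $B \in \mathcal{S}$ and $N \in \mathcal{N}(\mathcal{S}, \nu)$. Choose $D \in \mathcal{S}$ with $\nu(D) = 0$ and $N \subset D$. We now have

$$B \setminus D \subset B \setminus N = A,$$

so if we put $F = B \setminus D$, we have the equality $A = F \cup M$, where

$$M = A \setminus F = (B \setminus N) \setminus (B \setminus D) \subset D.$$

Notice that $F \in \mathcal{S}$, while the inclusion $M \subset D$ shows that $M \in \mathcal{N}(\mathcal{S}, \nu)$.

(b) ⇒ (a). Assume $A = F \cup M$ with $F \in \mathcal{S}$ and $M \in \mathcal{N}(\mathcal{S}, \nu)$. Choose $D \in \mathcal{S}$ with $M \subset D$ and $\nu(D) = 0$. Define $B = F \cup D$. It is clear that $B \in \mathcal{S}$, and $A \subset B$. Define $N = B \setminus A$, so we clearly have $A = B \setminus N$. We have

$$N = (F \cup D) \setminus (F \cup M) \subset D \setminus M \subset D,$$

so $N$ clearly belongs to $\mathcal{N}(\mathcal{S}, \nu)$.

(iii). We need to prove the following properties: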

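(∗) whenever $A_1, A_2$ are sets in $\overline{\mathcal{S}}$, it follows that the difference $A_1 \setminus A_2$ also belongs to $\overline{\mathcal{S}}$;

(∗∗) whenever $(A_n)_{n=1}^\infty$ is a sequence of sets in $\overline{\mathcal{S}}$, it follows that the union $\bigcup_{n=1}^\infty A_n$ also belongs to $\overline{\mathcal{S}}$.

To prove (∗), we write $A_1 = B \setminus N$ and $A_2 = F \cup M$, with $B, F \in \mathcal{S}$ and $M, N \in \mathcal{N}(\mathcal{S}, \nu)$. Then we have

$$A_1 \setminus A_2 = (B \setminus N) \setminus (F \cup M) = B \setminus (F \cup M \cup N) = (B \setminus F) \setminus (M \cup N).$$

The difference $B \setminus F$ belongs to $\mathcal{S}$, and, using (i), the union $N \cup M$ belongs to $\mathcal{N}(\mathcal{S}, \nu)$. By (ii) it follows that $A_1 \setminus A_2$ belongs to $\overline{\mathcal{S}}$.

To prove (∗∗), we write, for each $n \in \mathbb{N}$, the set $A_n$ as $A_n = F_n \cup M_n$ with $F_n \in \mathcal{S}$ and $M_n \in \mathcal{N}(\mathcal{S}, \nu)$. Then

$$\bigcup_{n=1}^\infty A_n = \Big( \bigcup_{n=1}^\infty F_n \Big) \cup \Big( \bigcup_{n=1}^\infty M_n \Big).$$

The union $\bigcup_{n=1}^\infty F_n$ belongs to $\mathcal{S}$, and, using (i), the union $\bigcup_{n=1}^\infty M_n$ belongs to $\mathcal{N}(\mathcal{S}, \nu)$. By (ii), the union $\bigcup_{n=1}^\infty A_n$ belongs to $\overline{\mathcal{S}}$.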




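Since $\overline{\mathcal{S}}$ is a σ-ring, which clearly contains both $\mathcal{N}(\mathcal{S}, \nu)$ and $\mathcal{S}$, it follows that $\overline{\mathcal{S}} \supset \mathcal{S}\big( \mathcal{N}(\mathcal{S}, \nu) \cup \mathcal{S} \big)$. The other inclusion $\overline{\mathcal{S}} \subset \mathcal{S}\big( \mathcal{N}(\mathcal{S}, \nu) \cup \mathcal{S} \big)$ is trivial, by the definition of $\overline{\mathcal{S}}$.

(iv). To prove the existence, we consider the maximal outer extension $\nu^*$. When restricted to the σ-algebra $m_{\nu^*}(X)$ of all $\nu^*$-measurable sets, it gives a measure. Notice that $\nu^*(N) = 0$, $\forall\, N \in \mathcal{N}(\mathcal{S}, \nu)$, which gives the inclusion $\mathcal{N}(\mathcal{S}, \nu) \subset m_{\nu^*}(X)$. In particular, since $m_{\nu^*}(X)$ is a σ-algebra, which contains both $\mathcal{N}(\mathcal{S}, \nu)$ and $\mathcal{S}$, it follows that

$$m_{\nu^*}(X) \supset \mathcal{S}\big( \mathcal{N}(\mathcal{S}, \nu) \cup \mathcal{S} \big) = \overline{\mathcal{S}}.$$

In particular, $\overline{\nu} = \nu^*|_{\overline{\mathcal{S}}}$ is a measure on $\overline{\mathcal{S}}$, which clearly satisfies the required properties.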

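To prove uniqueness, let $\mu$ be another measure on $\overline{\mathcal{S}}$, such that $\mu|_{\mathcal{N}(\mathcal{S},\nu)} = 0$ and $\mu|_{\mathcal{S}} = \nu$. If we start with an arbitrary set $A \in \overline{\mathcal{S}}$, and we write it as $A = F \cup M$, with $F \in \mathcal{S}$ and $M \in \mathcal{N}(\mathcal{S}, \nu)$, then using the fact that $A \setminus F \subset M$, we see that $A \setminus F$ belongs to $\mathcal{N}(\mathcal{S}, \nu)$, so we have

$$\mu(A) = \mu(F) + \mu(A \setminus F) = \mu(F) = \nu(F) = \overline{\nu}(F) = \overline{\nu}(F) + \overline{\nu}(A \setminus F) = \overline{\nu}(A).$$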

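Finally, we prove that the measure $\overline{\nu}$ is complete. Let $A \in \overline{\mathcal{S}}$ be a set with $\overline{\nu}(A) = 0$, and let $U$ be an arbitrary subset of $A$. Using (ii) we write $A = F \cup M$, with $F \in \mathcal{S}$ and $M \in \mathcal{N}(\mathcal{S}, \nu)$. Notice that we have

$$0 \leq \nu(F) = \overline{\nu}(F) \leq \overline{\nu}(F \cup M) = \overline{\nu}(A) = 0,$$

which forces $F \in \mathcal{N}(\mathcal{S}, \nu)$, so using (i), we see that $A$ itself belongs to $\mathcal{N}(\mathcal{S}, \nu)$. By (i), it follows that $U \in \mathcal{N}(\mathcal{S}, \nu) \subset \overline{\mathcal{S}}$.

(v). Let $\mathcal{E}$ and $\lambda$ be as indicated. In order to prove the inclusion $\mathcal{E} \supset \overline{\mathcal{S}}$, it suffices to prove the inclusion $\mathcal{N}(\mathcal{S}, \nu) \subset \mathcal{E}$. But this inclusion is pretty obvious. If we start with some $N \in \mathcal{N}(\mathcal{S}, \nu)$, then there exists $A \in \mathcal{S}$ with $N \subset A$ and $\nu(A) = 0$. In particular, we have $A \in \mathcal{E}$ and $\lambda(A) = 0$, and then the completeness of $\lambda$ forces $N \in \mathcal{E}$. Notice that this also forces $\lambda(N) = \overline{\nu}(N) = 0$. Using (iv) it then follows that $\lambda|_{\overline{\mathcal{S}}} = \overline{\nu}$.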

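Definition. Using the notations above, the σ-ring $\overline{\mathcal{S}}$ is called the completion of $\mathcal{S}$ with respect to $\nu$. The correspondence $(\mathcal{S}, \nu) \longmapsto (\overline{\mathcal{S}}, \overline{\nu})$ is referred to as the measure completion. Remark that, if $\nu$ is already complete, then $\overline{\mathcal{S}} = \mathcal{S}$ and $\overline{\nu} = \nu$.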

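Exercise 4. Using the notations from Theorem 5.4, prove that for a set $A \subset X$, the condition $A \in \overline{\mathcal{S}}$ is equivalent to any of the following:

(a) there exist $B \in \mathcal{S}$ and $N \in \mathcal{N}(\mathcal{S}, \nu)$, with $A = B \setminus N$, and $N \subset B$;
(b) there exist $F \in \mathcal{S}$ and $M \in \mathcal{N}(\mathcal{S}, \nu)$, such that $A = F \cup M$ and $F \cap M = \varnothing$;
(c) there exist $E \in \mathcal{S}$ and $Z \in \mathcal{N}(\mathcal{S}, \nu)$, such that $A = E \,\triangle\, Z$;
(d) there exist $B, F \in \mathcal{S}$ such that $F \subset A \subset B$, and $\nu(B \setminus F) = 0$.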

The $\mu^*$-measurable sets of a special type can be completely characterized using $\mu^*$-negligible ones.

Theorem 5.5. Suppose $\mathcal{J}$ is a semiring on $X$, and $\mu$ is a measure on $\mathcal{J}$. Let $\mu^*$ be the maximal outer extension of $\mu$. For a $\mathcal{J}$-$\mu$-σ-finite subset $A \subset X$, the following are equivalent:

(i) $A$ is $\mu^*$-measurable;

(ii) there exists $B$ in the σ-ring $\mathcal{S}(\mathcal{J})$ generated by $\mathcal{J}$, and a $\mu^*$-negligible set $N \subset X$, such that $A = B \setminus N$.

Proof. (i) ⇒ (ii). Start by choosing a sequence $(D_n)_{n=1}^\infty \subset \mathcal{J}$ with $A \subset \bigcup_{n=1}^\infty D_n$ and $\mu(D_n) < \infty$, $\forall\, n \in \mathbb{N}$. Since $m_{\mu^*}(X)$ is an algebra, which contains $\mathcal{J}$, it follows that all the intersections $A_n = A \cap D_n$, $n \in \mathbb{N}$, belong to $m_{\mu^*}(X)$. For each $n \in \mathbb{N}$, we use Proposition 5.4 to find some set $B_n \in \mathcal{S}(\mathcal{J})$ such that $A_n \subset B_n$, and $\mu^*(B_n) = \mu^*(A_n)$. On the one hand, if we put $V_n = B_n \setminus A_n$, then $V_n \in m_{\mu^*}(X)$, so we will have

$$\mu^*(B_n) = \mu^*(A_n) + \mu^*(V_n).$$

On the other hand, we know that $\mu^*(B_n) = \mu^*(A_n) \leq \mu^*(D_n) < \infty$, so the above equality forces $\mu^*(V_n) = 0$.

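Since we have $B_n = A_n \cup V_n$, $\forall\, n \in \mathbb{N}$, we will get

$$\bigcup_{n=1}^\infty B_n = \Big( \bigcup_{n=1}^\infty A_n \Big) \cup \Big( \bigcup_{n=1}^\infty V_n \Big) = A \cup \bigcup_{n=1}^\infty V_n,$$

so if we define $B = \bigcup_{n=1}^\infty B_n$ and $V = \bigcup_{n=1}^\infty V_n$, then $B$ belongs to $\mathcal{S}(\mathcal{J})$, we have the equality $B = A \cup V$, and $V$ is $\mu^*$-negligible, because of the inequality

$$\mu^*(V) \leq \sum_{n=1}^\infty \mu^*(V_n) = 0.$$

The set $N = B \setminus A \subset V$ is clearly $\mu^*$-negligible, because $\mu^*(N) \leq \mu^*(V)$. Now we are done because $B \setminus N = A$.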

(ii) ⇒ (i). This part is trivial, since m µ∗(X ) is an algebra.

Remarks 5.4. A. The implication (ii) ⇒ (i) holds without the assumption that $A$ is $\mathcal{J}$-$\mu$-σ-finite. In fact, for any $A \subset X$, one has the implications (ii) ⇒ (ii′) ⇒ (i), where

(ii′) there exists $B$ in the σ-algebra $\Sigma(\mathcal{J})$ generated by $\mathcal{J}$, and a $\mu^*$-negligible set $N \subset X$, such that $A = B \setminus N$.

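B. Consider the measure $\mu^*|_{\mathcal{S}(\mathcal{J})}$ on the σ-ring $\mathcal{S}(\mathcal{J})$. Using the notations from Theorem 5.4, by Proposition 5.4, we clearly have the equality

$$\big\{ N \subset X : N \ \mu^*\text{-negligible} \big\} = \mathcal{N}\big( \mathcal{S}(\mathcal{J}), \mu^*|_{\mathcal{S}(\mathcal{J})} \big).$$

So, if we denote by $\overline{\mathcal{S}}(\mathcal{J})$ the completion of $\mathcal{S}(\mathcal{J})$ with respect to $\mu^*|_{\mathcal{S}(\mathcal{J})}$, condition (ii) from Theorem 5.5 reads: $A \in \overline{\mathcal{S}}(\mathcal{J})$. Similarly, if we denote by $\overline{\Sigma}(\mathcal{J})$ the completion of $\Sigma(\mathcal{J})$ with respect to the measure $\mu^*|_{\Sigma(\mathcal{J})}$, the condition from part A above, stated with $B \in \Sigma(\mathcal{J})$, reads: $A \in \overline{\Sigma}(\mathcal{J})$.

With these notations, we have the inclusions

$$\overline{\mathcal{S}}(\mathcal{J}) \subset \overline{\Sigma}(\mathcal{J}) \subset m_{\mu^*}(X), \tag{12}$$

and Theorem 5.5 states that

$$\overline{\mathcal{S}}(\mathcal{J}) \cap \big\{ A \subset X : A \ \mathcal{J}\text{-}\mu\text{-}\sigma\text{-finite} \big\} = m_{\mu^*}(X) \cap \big\{ A \subset X : A \ \mathcal{J}\text{-}\mu\text{-}\sigma\text{-finite} \big\}. \tag{13}$$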

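Theorem 5.5, written in the form (13), has the following consequence.

Corollary 5.2. If the semiring $\mathcal{J}$ is σ-total in $X$, and $\mu$ is a σ-finite measure on $\mathcal{J}$, then one has the equalities

$$\overline{\mathcal{S}}(\mathcal{J}) = \overline{\Sigma}(\mathcal{J}) = m_{\mu^*}(X). \tag{14}$$

Proof. Indeed, under the given assumptions on $\mathcal{J}$ and $\mu$, it follows that every set $A \subset X$ is $\mathcal{J}$-$\mu$-σ-finite.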

Examples 5.3. A. The implication (i) ⇒ (ii) from Theorem 5.5 may fail, if $A$ is not σ-finite. Start with an arbitrary set $X$, consider the semiring $\mathcal{J} = \{\varnothing, X\}$ and the measure $\mu$ on $\mathcal{J}$ defined by $\mu(\varnothing) = 0$ and $\mu(X) = \infty$. Notice that $\mathcal{J}$ is a σ-algebra, so it is trivial that $\mathcal{J}$ is σ-total in $X$. The maximal outer extension $\mu^*$ of $\mu$ is defined by

$$\mu^*(A) = \begin{cases} 0 & \text{if } A = \varnothing \\ \infty & \text{if } A \neq \varnothing \end{cases}$$

It is clear that, since $\mu^*$ is a measure on $\mathcal{P}(X)$, we have the equality $m_{\mu^*}(X) = \mathcal{P}(X)$, but the only $\mu^*$-negligible set is the empty set $\varnothing$. This means that the sets satisfying condition (ii) in Theorem 5.5 are only the sets $\varnothing$ and $X$, so, if $\varnothing \neq A \subsetneq X$, the implication (i) ⇒ (ii) fails, although $\mathcal{J}$ is σ-total in $X$. What occurs here is the total lack of non-empty $\mathcal{J}$-$\mu$-σ-finite sets.

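B. Let $X$ be an uncountable set, and let $\mathcal{J}$ be the semiring of all finite subsets of $X$. We have

$$\mathcal{S}(\mathcal{J}) = \big\{ A \subset X : \operatorname{card} A \leq \aleph_0 \big\},$$

$$\Sigma(\mathcal{J}) = \big\{ A \subset X : \text{either } \operatorname{card} A \leq \aleph_0, \text{ or } \operatorname{card}(X \setminus A) \leq \aleph_0 \big\}.$$

Equip $\mathcal{J}$ with the trivial measure $\mu(A) = 0$, $\forall\, A \in \mathcal{J}$. The maximal outer extension $\mu^*$ is then defined by

$$\mu^*(A) = \begin{cases} 0 & \text{if } \operatorname{card} A \leq \aleph_0 \\ \infty & \text{if } A \text{ is uncountable} \end{cases}$$

It is clear that $\mu^*$ is a measure on $\mathcal{P}(X)$, so we have $m_{\mu^*}(X) = \mathcal{P}(X) \supsetneq \mathcal{J}$. Notice that both measures $\mu^*|_{\mathcal{S}(\mathcal{J})}$ and $\mu^*|_{\Sigma(\mathcal{J})}$ are complete, so using the notations from Remark 5.4.B, we have the equalities

$$\overline{\mathcal{S}}(\mathcal{J}) = \mathcal{S}(\mathcal{J}) \quad\text{and}\quad \overline{\Sigma}(\mathcal{J}) = \Sigma(\mathcal{J}).$$

It is clear however that both inclusions in (12) are strict, although $\mu$ is finite. What happens here is the fact that $\mathcal{J}$ is not σ-total in $X$.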

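C. In the same setting as in Example B, if we take $\mathcal{I} = \Sigma(\mathcal{J})$, and $\nu = \mu^*|_{\mathcal{I}}$, then $\mathcal{I}$ is σ-total in $X$, simply because $\mathcal{I}$ is a σ-algebra. In this case, by Proposition 5.3, the maximal outer extension $\nu^*$ of $\nu$ coincides with $\mu^*$. We have

$$\mathcal{I} = \mathcal{S}(\mathcal{I}) = \overline{\mathcal{S}}(\mathcal{I}) = \Sigma(\mathcal{I}) = \overline{\Sigma}(\mathcal{I}) \subsetneq m_{\nu^*}(X),$$

the reason for the strict inclusion being this time the fact that $\nu$ is not σ-finite.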

Comment. In the remainder of this section we take another look at Question 3, trying to generalize the answer given by Corollary 5.2. To simplify matters a little bit, we start with a σ-algebra $\mathcal{B}$ on $X$ (which is clearly σ-total in $X$), and a measure $\mu$ on $\mathcal{B}$. If we take $\mu^*$ to be the maximal outer extension of $\mu$, and consider the completion $\overline{\mathcal{B}}$, we have the inclusion

$$\overline{\mathcal{B}} \subset m_{\mu^*}(X), \tag{15}$$

so we can ask whether this inclusion is strict. Of course, if $\mu$ is σ-finite, then by Corollary 5.2 the inclusion (15) is not strict. As Example 5.3.C suggests, in the absence of the σ-finiteness assumption, the inclusion (15) may indeed be strict. As it turns out, the fact that the inclusion (15) is strict in Example 5.3.C is a consequence of the fact that there are “new” measurable sets which are not necessarily of the form $B \setminus N$ with $B \in \mathcal{B}$ and $N$ negligible. The existence of such sets is suggested by the following.

Remark 5.5. Suppose $\nu$ is an outer measure on $X$. For a set $A \subset X$, the following are equivalent:

(i) $A$ is $\nu$-measurable;

(ii) $\nu(S) \geq \nu(S \cap A) + \nu(S \setminus A)$, for all $S \subset X$ with $\nu(S) < \infty$.

The implication (i) ⇒ (ii) is trivial. To prove the converse, by Remark 5.1.A, we need to show that

$$\nu(S) \geq \nu(S \cap A) + \nu(S \setminus A), \quad \forall\, S \subset X.$$

But this is trivial, when $\nu(S) = \infty$. If $\nu(S) < \infty$, then this is exactly condition (ii).

The “new” sets, that were mentioned above, are of a type covered by the following.

Definition. Let $\nu$ be an outer measure on $X$. A subset $N \subset X$ is said to be locally $\nu$-negligible, if

$$\nu(N \cap A) = 0, \quad \text{for all } A \subset X \text{ with } \nu(A) < \infty.$$

It is clear that every subset of $N$ is also locally $\nu$-negligible.

The above observation shows that every locally $\nu$-negligible set is $\nu$-measurable.

The term “local” will be used in connection with properties that hold when the subject set is cut down by sets of finite measure. For example, one can formulate the following.

Definitions. Let $\mathcal{B}$ be a σ-algebra on $X$, and $\mu$ be a measure on $\mathcal{B}$. We say that a set $N \in \mathcal{B}$ is locally $\mu$-null, if

$$\mu(F \cap N) = 0, \quad \text{for all } F \in \mathcal{B} \text{ with } \mu(F) < \infty. \tag{16}$$

Remark that locally $\mu$-null sets do not necessarily have zero measure (see Example 5.3.C).

We say that $\mu$ is locally complete, if it satisfies the condition

(lc) whenever $N \in \mathcal{B}$ is a locally $\mu$-null set, it follows that $\mathcal{B}$ contains all subsets of $N$.
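The reference to Example 5.3.C can be unpacked as follows. This is a sketch of the verification, using the countable/co-countable σ-algebra $\mathcal{I}$ and the measure $\nu$ of that example, with $X$ uncountable.

```latex
% Setting of Example 5.3.C: \mathcal{I} = countable or co-countable subsets of X,
% \nu(F) = 0 if F is countable, \nu(F) = \infty if F is co-countable.
% A set F \in \mathcal{I} with \nu(F) < \infty is countable, so for every N \in \mathcal{I}:
\nu(F \cap N) = 0 \quad \text{whenever } F \in \mathcal{I},\ \nu(F) < \infty .
% Hence every N \in \mathcal{I} is locally \nu-null, in particular N = X, although
\nu(X) = \infty .
```

Since $\mathcal{I}$ does not contain all subsets of $X$, the same computation shows that $\nu$ fails condition (lc).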

Remarks 5.6. Use the notations above.

A. If the measure $\mu$ is σ-finite, the local completeness of $\mu$ is equivalent to completeness. The reason is the fact that, in the σ-finite case, condition (16) is equivalent to $\mu(N) = 0$.

B. Given an outer measure $\nu$ on $X$, the measure $\nu|_{m_\nu(X)}$ is locally complete.

Comment. If we look at Example 5.3.C, we now see that although the measure $\nu$ on $\mathcal{I}$ is complete, it is not locally complete, thus giving another explanation for the strict inclusion $\mathcal{I} \subsetneq m_{\nu^*}(X)$.

We are now in position to analyze Question 3, in the simplified given setting. The following fact will be helpful.

Lemma 5.1. Let $\mathcal{B}$ be a σ-algebra on $X$, let $\mu$ be a measure on $\mathcal{B}$, and let $\mu^*$ be the maximal outer extension of $\mu$. Then, for every subset $S \subset X$, one has the equality

$$\mu^*(S) = \inf\big\{ \mu(B) : B \in \mathcal{B},\ B \supset S \big\}. \tag{17}$$




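Proof. Since $\mathcal{B}$ is σ-total in $X$, by definition we have

$$\mu^*(S) = \inf\Big\{ \sum_{n=1}^\infty \mu(B_n) : (B_n)_{n=1}^\infty \subset \mathcal{B},\ S \subset \bigcup_{n=1}^\infty B_n \Big\}. \tag{18}$$

If we denote the right hand side of (??) by $\nu(S)$, then using (18) we clearly have $\mu^*(S) \leq \nu(S)$. Conversely, if we start with any sequence $(B_n)_{n=1}^\infty \subset \mathcal{B}$ with $S \subset \bigcup_{n=1}^\infty B_n$, then we clearly have

$$\sum_{n=1}^\infty \mu(B_n) \geq \mu\Big( \bigcup_{n=1}^\infty B_n \Big) \geq \nu(S),$$

so taking the infimum yields $\mu^*(S) \geq \nu(S)$.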

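Proposition 5.5. Let $\mathcal{B}$ be a σ-algebra on $X$, and let $\mu$ be a measure on $\mathcal{B}$. Define the collection

$$\mathcal{B}_{\mathrm{fin}} = \big\{ F \in \mathcal{B} : \mu(F) < \infty \big\}.$$

For every $F \in \mathcal{B}_{\mathrm{fin}}$, denote by $\mathcal{M}_F$ the completion of the σ-algebra $\mathcal{B}|_F$ (on $F$) with respect to the measure[8] $\mu|_F$. Denote by $\mu^*$ the maximal outer extension of $\mu$.

A. For a subset $A \subset X$, the following are equivalent:
(i) $A$ is $\mu^*$-measurable;
(ii) $A \cap F \in \mathcal{M}_F$, for each $F \in \mathcal{B}_{\mathrm{fin}}$;
(iii) $A \cap F$ is $\mu^*$-measurable, for each $F \in \mathcal{B}_{\mathrm{fin}}$.

B. For a subset $N \subset X$, the following are equivalent:
(i) $N$ is locally $\mu^*$-negligible;
(ii) $\mu^*(N \cap F) = 0$, for all $F \in \mathcal{B}_{\mathrm{fin}}$.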

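Proof. Let us fix some useful notations. By construction, for every $F \in \mathcal{B}_{\mathrm{fin}}$, we have

$$\mathcal{B}|_F = \{ A \cap F : A \in \mathcal{B} \} = \{ B \in \mathcal{B} : B \subset F \}.$$

For each $F \in \mathcal{B}_{\mathrm{fin}}$, we denote the σ-ring $\mathcal{N}\big( \mathcal{B}|_F, \mu|_F \big)$ simply by $\mathcal{N}_F$. With the above identification we have

$$\mathcal{N}_F = \big\{ N \subset F : \text{there exists } D \in \mathcal{B} \text{ with } N \subset D \subset F \text{ and } \mu(D) = 0 \big\},$$

so (see Theorem 5.4) the σ-algebra $\mathcal{M}_F$ is given as

$$\mathcal{M}_F = \big\{ B \setminus N : B \in \mathcal{B},\ B \subset F,\ N \in \mathcal{N}_F \big\}.$$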

A. To prove the implication (i) ⇒ (ii), start with a $\mu^*$-measurable set $A$, and with some $F \in \mathcal{B}_{\mathrm{fin}}$. Since $F$ is $\mu^*$-measurable, the intersection $A \cap F$ is $\mu^*$-measurable. Since $\mu^*(A \cap F) \leq \mu^*(F) = \mu(F) < \infty$, by Theorem 5.5 there exist $B_0 \in \mathcal{B}$ and $N_0 \subset X$ with $\mu^*(N_0) = 0$ and $A \cap F = B_0 \setminus N_0$. If we then define $B = B_0 \cap F$ and $N = N_0 \cap F$, then we clearly have $B \in \mathcal{B}|_F$, $N \in \mathcal{N}_F$, and $A \cap F = B \setminus N$, so $A \cap F$ indeed belongs to $\mathcal{M}_F$.

The implication (ii) ⇒ (iii) is trivial, since every set in $\mathcal{M}_F$ is clearly $\mu^*$-measurable.

[8] Here $\mu|_F$ denotes the restriction of $\mu$ to the σ-algebra $\mathcal{B}|_F$.

To prove the implication (iii) ⇒ (i), assume $A$ has property (iii), and let us show that $A$ is $\mu^*$-measurable. We are going to use Remark 5.5, which means that it suffices to prove the inequality

$$\mu^*(S) \geq \mu^*(S \cap A) + \mu^*(S \setminus A), \tag{19}$$

only for those subsets $S \subset X$ with $\mu^*(S) < \infty$. Fix such a subset $S$. Since $\mu^*(S) < \infty$, Lemma 5.1 gives

$$\mu^*(S) = \inf\big\{ \mu(F) : F \in \mathcal{B}_{\mathrm{fin}},\ F \supset S \big\}. \tag{20}$$

Start with some arbitrary $\varepsilon > 0$, and choose some $F \in \mathcal{B}_{\mathrm{fin}}$ with $F \supset S$ and $\mu(F) \leq \mu^*(S) + \varepsilon$. By (iii) the set $A \cap F$ is $\mu^*$-measurable, so we have

$$\mu^*(F) = \mu^*\big( F \cap [A \cap F] \big) + \mu^*\big( F \setminus [A \cap F] \big) = \mu^*(F \cap A) + \mu^*(F \setminus A).$$

Since $F \cap A \supset S \cap A$, and $F \setminus A \supset S \setminus A$, we have the inequalities $\mu^*(F \cap A) \geq \mu^*(S \cap A)$ and $\mu^*(F \setminus A) \geq \mu^*(S \setminus A)$, so the above equality gives

$$\mu^*(F) \geq \mu^*(S \cap A) + \mu^*(S \setminus A).$$

By the choice of $F$, and the equality $\mu^*(F) = \mu(F)$, this gives

$$\mu^*(S) + \varepsilon \geq \mu^*(S \cap A) + \mu^*(S \setminus A).$$

Since this inequality holds for all $\varepsilon > 0$, we immediately get the desired inequality (19).

B. The condition (i) says that

$$\mu^*(N \cap S) = 0, \quad \text{for all } S \subset X \text{ with } \mu^*(S) < \infty. \tag{21}$$

It is obvious that we have the implication (i) ⇒ (ii). Conversely, suppose $N$ satisfies (ii), and let us prove (21). Start with some arbitrary subset $S \subset X$ with $\mu^*(S) < \infty$. Using (20), there exists some $F \in \mathcal{B}_{\mathrm{fin}}$ with $S \subset F$. By (ii), and the monotonicity of $\mu^*$, we have $0 = \mu^*(N \cap F) \geq \mu^*(N \cap S)$, which clearly forces $\mu^*(N \cap S) = 0$.

The above result suggests that the σ-algebra $m_{\mu^*}(X)$ can be regarded as some sort of “local” completion of $\mathcal{B}$. To simplify the exposition a little bit, we introduce the following.

Notation. Let $\mathcal{B}$ be a σ-algebra on $X$, let $\mu$ be a measure on $\mathcal{B}$, and let $\mu^*$ be the maximal outer extension of $\mu$. The σ-algebra $m_{\mu^*}(X)$, of all $\mu^*$-measurable subsets of $X$, will be denoted by $\mathcal{M}_\mu(\mathcal{B})$ (or just $\mathcal{M}_\mu$, when there is no danger of confusion). The measure $\mu^*|_{\mathcal{M}_\mu}$ will be denoted by $\mu$. The pair $(\mathcal{M}_\mu, \mu)$ will be called the quasi-completion of $\mathcal{B}$ with respect to $\mu$.

Unfortunately, analogues of Theorem 5.4 are not available, unless some (otherwise natural) restrictions are imposed. The type of restrictions we have in mind are also aimed at making the test conditions A.(iii) and B.(ii) easier to check. We would like to check them on a “small” sub-collection of Bfin. This naturally suggests the following.

Definition. Let $\mathcal{B}$ be a σ-algebra on $X$, and let $\mu$ be a measure on $\mathcal{B}$. A sufficient $\mu$-finite $\mathcal{B}$-partition of $X$ is a collection $\mathcal{F}$ of non-empty subsets of $X$, with the following properties:

(i) $\mathcal{F}$ is pairwise disjoint, and $\bigcup_{F \in \mathcal{F}} F = X$;

(ii) $\mathcal{F} \subset \mathcal{B}$, and $\mu(F) < \infty$, for all $F \in \mathcal{F}$;

(iii) for every set $B \in \mathcal{B}$, with $\mu(B) < \infty$, one has the equality

$$\mu(B) = \sum_{F \in \mathcal{F}} \mu(B \cap F).$$

Condition (iii) uses the summation convention from II.2. (The sum is defined as the supremum of all finite partial sums.)




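Remarks 5.7. A. Suppose $\mathcal{F}$ is a sufficient $\mu$-finite $\mathcal{B}$-partition of $X$. For every set $A \in \mathcal{B}$, we define the collection

$$S^\mu_{\mathcal{F}}(A) = \{ F \in \mathcal{F} : \mu(A \cap F) > 0 \}.$$

If $\mu(A) < \infty$, then

(a) $S^\mu_{\mathcal{F}}(A)$ is at most countable, and

(b) $\mu\Big( A \setminus \bigcup_{F \in S^\mu_{\mathcal{F}}(A)} (A \cap F) \Big) = 0$.

By condition (iii) in the definition, it follows that the family $\big( \mu(A \cap F) \big)_{F \in S^\mu_{\mathcal{F}}(A)}$ is summable, and

$$\sum_{F \in S^\mu_{\mathcal{F}}(A)} \mu(A \cap F) = \mu(A). \tag{22}$$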

Since $\mu(A \cap F) > 0$, $\forall\, F \in S^\mu_{\mathcal{F}}(A)$, property (a) follows from Proposition II.2.2. If we denote the union $\bigcup_{F \in S^\mu_{\mathcal{F}}(A)} (A \cap F)$ by $A_0$, then by the σ-additivity of $\mu$ (it is here where we use (a) in an essential way) the equality (22) gives

$$\mu(A_0) = \sum_{F \in S^\mu_{\mathcal{F}}(A)} \mu(A \cap F) = \mu(A),$$

which combined with $\mu(A) < \infty$ forces $\mu(A \setminus A_0) = 0$.

B. The existence of a sufficient $\mu$-finite $\mathcal{B}$-partition of $X$ is a generalization of σ-finiteness. In fact the following are equivalent ($\mathcal{B}$ is a σ-algebra on $X$):

• $\mu$ is σ-finite;
• there exists a countable sufficient $\mu$-finite $\mathcal{B}$-partition of $X$.
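One direction of this equivalence can be sketched as follows. The sets $B_n$ and $F_n$ are names introduced here for illustration; the argument is the same disjointification used earlier in this section and is offered as a sketch, not as the author's proof.

```latex
% Assume \mu is \sigma-finite on the \sigma-algebra \mathcal{B}. Since X \in \mathcal{B}, there is a sequence
(B_n)_{n=1}^\infty \subset \mathcal{B}, \qquad X = \bigcup_{n=1}^\infty B_n, \qquad \mu(B_n) < \infty .
% Disjointify:
F_1 = B_1, \qquad F_n = B_n \setminus (B_1 \cup \dots \cup B_{n-1}) \quad (n \geq 2).
% Let \mathcal{F} be the collection of the non-empty F_n. Then \mathcal{F} \subset \mathcal{B} is countable,
% pairwise disjoint, covers X, and \mu(F_n) \leq \mu(B_n) < \infty.
% Condition (iii) follows from \sigma-additivity: for B \in \mathcal{B},
\mu(B) = \mu\Big( \bigcup_{n=1}^\infty (B \cap F_n) \Big) = \sum_{n=1}^\infty \mu(B \cap F_n).
```

For the converse, a countable sufficient partition is itself a sequence of sets of finite measure that covers $X$.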

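In the presence of a sufficient $\mu$-finite $\mathcal{B}$-partition, the properties that appear in Proposition 5.5 are simplified.

Proposition 5.6. Let $\mathcal{B}$ be a σ-algebra on $X$, let $\mu$ be a measure on $\mathcal{B}$. Assume $\mathcal{F}$ is a sufficient $\mu$-finite $\mathcal{B}$-partition of $X$. Denote by $\mu^*$ the maximal outer extension of $\mu$.

A. For a subset $A \subset X$, the following are equivalent:
(i) $A$ is $\mu^*$-measurable;
(ii) $A \cap F$ is $\mu^*$-measurable, for each $F \in \mathcal{F}$.

B. For a subset $N \subset X$, the following are equivalent:
(i) $N$ is locally $\mu^*$-negligible;
(ii) $\mu^*(N \cap F) = 0$, for all $F \in \mathcal{F}$.

C. If $A \subset X$ is a subset with $\mu^*(A) < \infty$, then

$$\mu^*(A) = \sum_{F \in \mathcal{F}} \mu^*(A \cap F). \tag{23}$$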

Proof. It will be useful to introduce the following notations (use also the notations from Proposition 5.5). For every $B \in \mathcal{B}_{\mathrm{fin}}$, we define

$$\widetilde{B} = \bigcup_{F \in S^\mu_{\mathcal{F}}(B)} (B \cap F).$$

By Remark 5.7 we know that $\mu(B \setminus \widetilde{B}) = 0$.

A. The implication (i) ⇒ (ii) is trivial. To prove the implication (ii) ⇒ (i), we start with a set $A \subset X$ satisfying condition (ii), and we show that $A$ satisfies condition (iii) from Proposition 5.5.A. Start with some arbitrary set $B \in \mathcal{B}_{\mathrm{fin}}$, and let us show that $A \cap B$ is $\mu^*$-measurable. Using the above notation, and the monotonicity of $\mu^*$, we have

$$\mu^*\big( A \cap [B \setminus \widetilde{B}] \big) \leq \mu^*(B \setminus \widetilde{B}) = \mu(B \setminus \widetilde{B}) = 0,$$

which in particular shows that $A \cap [B \setminus \widetilde{B}]$ is $\mu^*$-measurable. Since we have $A \cap B = (A \cap \widetilde{B}) \cup (A \cap [B \setminus \widetilde{B}])$, it then suffices to show that $A \cap \widetilde{B}$ is $\mu^*$-measurable. Notice that

$$A \cap \widetilde{B} = \bigcup_{F \in S^\mu_{\mathcal{F}}(B)} (A \cap F \cap B),$$

and since the indexing set $S^\mu_{\mathcal{F}}(B)$ is at most countable, it then suffices to show that $A \cap F \cap B$ is $\mu^*$-measurable, for each $F$. But this is obvious, since $A \cap F$ is $\mu^*$-measurable, by condition (ii), and $B \in \mathcal{B}$.

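C. Let $A \subset X$ be a subset with $\mu^*(A) < \infty$. Using Lemma 5.1, we can find, for every $\varepsilon > 0$, some set $B_\varepsilon \in \mathcal{B}_{\mathrm{fin}}$, such that $B_\varepsilon \supset A$, and $\mu(B_\varepsilon) \leq \mu^*(A) + \varepsilon$. Fix for the moment $\varepsilon$. Since the family $\big( \mu(B_\varepsilon \cap F) \big)_{F \in \mathcal{F}}$ is summable, and $\mu^*(A \cap F) \leq \mu(B_\varepsilon \cap F)$, $\forall\, F \in \mathcal{F}$, it follows that the family $\big( \mu^*(A \cap F) \big)_{F \in \mathcal{F}}$ is summable, and moreover one has the inequality

$$\sum_{F \in \mathcal{F}} \mu^*(A \cap F) \leq \sum_{F \in \mathcal{F}} \mu(B_\varepsilon \cap F) = \mu(B_\varepsilon) \leq \mu^*(A) + \varepsilon.$$

Since we have $\sum_{F \in \mathcal{F}} \mu^*(A \cap F) \leq \mu^*(A) + \varepsilon$, for all $\varepsilon > 0$, it follows that we have in fact the inequality

$$\sum_{F \in \mathcal{F}} \mu^*(A \cap F) \leq \mu^*(A).$$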

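To prove the reverse inequality, we fix $\varepsilon = 1$ and we define the set

$$G = \bigcup_{F \in S^\mu_{\mathcal{F}}(B_1)} F.$$

Since $S^\mu_{\mathcal{F}}(B_1)$ is at most countable, the set $G$ belongs to $\mathcal{B}$. Writing $\widetilde{B}_1 = \bigcup_{F \in S^\mu_{\mathcal{F}}(B_1)} (B_1 \cap F)$, as at the beginning of the proof, we have the equality $\widetilde{B}_1 = B_1 \cap G$, and by Remark 5.7.A, we have $\mu(B_1 \setminus G) = \mu(B_1 \setminus \widetilde{B}_1) = 0$. Since $A \setminus G \subset B_1 \setminus G$, it follows that $\mu^*(A \setminus G) = 0$. Since $G$ is $\mu^*$-measurable, we get

$$\mu^*(A) = \mu^*(A \cap G) + \mu^*(A \setminus G) = \mu^*(A \cap G).$$

Since $G$ is a countable union of $F$'s, by the σ-subadditivity of $\mu^*$, we have

$$\mu^*(A) = \mu^*(A \cap G) = \mu^*\Big( \bigcup_{F \in S^\mu_{\mathcal{F}}(B_1)} [A \cap F] \Big) \leq \sum_{F \in S^\mu_{\mathcal{F}}(B_1)} \mu^*(A \cap F) \leq \sum_{F \in \mathcal{F}} \mu^*(A \cap F).$$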

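B. The implication (i) ⇒ (ii) is trivial. To prove the implication (ii) ⇒ (i), we must show that condition (ii) implies

$$\mu^*(N \cap B) = 0, \quad \forall\, B \in \mathcal{B}_{\mathrm{fin}}.$$

But if we fix some $B \in \mathcal{B}_{\mathrm{fin}}$, then of course we have $\mu^*(N \cap B) \leq \mu^*(B) = \mu(B) < \infty$, so using part C, we have

$$\mu^*(N \cap B) = \sum_{F \in \mathcal{F}} \mu^*(N \cap B \cap F) \leq \sum_{F \in \mathcal{F}} \mu^*(N \cap F) = 0,$$

and we are done.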




Comments. Let $\mathcal{B}$ be a σ-algebra on $X$, let $\mu$ be a measure on $\mathcal{B}$. Assume $\mathcal{F}$ is a sufficient $\mu$-finite $\mathcal{B}$-partition of $X$.

By Proposition 5.6.C, it follows that $\mathcal{F}$ is also a sufficient $\mu$-finite $\mathcal{M}_\mu$-partition of $X$.

We see now that $\mathcal{M}_\mu$ may contain more “new” sets, apart from the “natural candidates,” which are of the form $B \setminus N$, with $B \in \mathcal{B}$ and $N$ locally $\mu^*$-negligible. Such “new” sets are those which belong (see Section 2) to the σ-algebra $\bigvee_{F \in \mathcal{F}} \mathcal{B}|_F$. More precisely, we have the following.

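Corollary 5.3. Let $\mathcal{B}$ be a σ-algebra on $X$, let $\mu$ be a measure on $\mathcal{B}$. Assume $\mathcal{F}$ is a sufficient $\mu$-finite $\mathcal{B}$-partition of $X$.

A. One has the equality

$$\mathcal{M}_\mu = \bigvee_{F \in \mathcal{F}} \mathcal{M}_\mu|_F.$$

B. For a subset $A \subset X$, the following are equivalent:
(i) $A \in \mathcal{M}_\mu$;
(ii) there exist a set $B \in \bigvee_{F \in \mathcal{F}} \mathcal{B}|_F$, and a locally $\mu^*$-negligible set $N \subset X$, such that $A = B \setminus N$.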

Proof. A. This is exactly property A from Proposition 5.6.

B. (i) ⇒ (ii). Assume $A \in \mathcal{M}_\mu$, i.e. $A$ is $\mu^*$-measurable. For every $F \in \mathcal{F}$, the set $A \cap F$ is $\mu^*$-measurable. Since $\mu^*(A \cap F) < \infty$, by Theorem 5.5, it follows that $A \cap F = B_F \setminus N_F$, with $B_F \in \mathcal{B}$ and $\mu^*(N_F) = 0$. Replacing $B_F$ with $B_F \cap F$, and $N_F$ with $N_F \cap F$, we can assume that $B_F, N_F \subset F$. Form then the sets $B = \bigcup_{F \in \mathcal{F}} B_F$ and $N = \bigcup_{F \in \mathcal{F}} N_F$. On the one hand, we have $B \cap F = B_F \in \mathcal{B}$, $\forall\, F \in \mathcal{F}$, which means precisely that $B \in \bigvee_{F \in \mathcal{F}} \mathcal{B}|_F$. On the other hand, we also have $N \cap F = N_F$, so we get $\mu^*(N \cap F) = 0$, $\forall\, F \in \mathcal{F}$. By Proposition 5.6.B, it follows that $N$ is locally $\mu^*$-negligible. We clearly have $A = B \setminus N$.

The implication (ii) ⇒ (i) is obvious.

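There is yet another nicer consequence of Proposition 5.6, for which we are going to use the following terminology.

Definition. Let $\mathcal{A}$ be a σ-algebra on $X$, and let $\mu$ be a measure on $\mathcal{A}$. A family $\mathcal{F}$ is called a $\mu$-finite decomposition for $\mathcal{A}$, if

(i) $\mathcal{F}$ is a sufficient $\mu$-finite $\mathcal{A}$-partition of $X$, and

(ii) one has the equality

$$\bigvee_{F \in \mathcal{F}} \mathcal{A}|_F = \mathcal{A}.$$

(Given a collection $\mathcal{F} \subset \mathcal{A}$, one always has the inclusion $\bigvee_{F \in \mathcal{F}} \mathcal{A}|_F \supset \mathcal{A}$.)

A measure $\mu$ on $\mathcal{A}$ is said to be decomposable, if there exists at least one $\mu$-finite decomposition for $\mathcal{A}$.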

Remark 5.8. Decomposability is a generalization of σ-finiteness. This follows from Remark 5.7.B, combined with the fact that whenever $\mathcal{F} \subset \mathcal{A}$ is a countable partition of $X$, one always has the equality $\bigvee_{F \in \mathcal{F}} \mathcal{A}|_F = \mathcal{A}$.

With this terminology, Corollary 5.3 states that if $\mathcal{F}$ is a sufficient $\mu$-finite $\mathcal{B}$-partition of $X$, then $\mathcal{F}$ is a $\mu$-finite decomposition for $\mathcal{M}_\mu$.

With the above terminology, Corollary 5.2 has the following generalization.

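Theorem 5.6. Let $\mu$ be a decomposable measure on the σ-algebra $\mathcal{B}$, and let $\mu^*$ be the maximal outer extension of $\mu$.

A. For a subset $A \subset X$, the following are equivalent:
(i) $A$ is $\mu^*$-measurable;
(ii) there exist $B \in \mathcal{B}$, and some locally $\mu^*$-negligible set $N$, such that $A = B \setminus N$.

B. For a subset $N \subset X$, the following are equivalent:
(i) $N$ is locally $\mu^*$-negligible;
(ii) there exists a locally $\mu$-null set $D \in \mathcal{B}$ with $N \subset D$.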

Proof. A. This is clear, by Corollary 5.3.B. The implication (ii) ⇒ (i) is trivial, because any locally µ-null set D is

locally µ∗-neglijeable, and so is every subset of D.To prove the implication (i) ⇒ (ii) start with a locally µ∗-neglijeable set N ,

and we fix F a µ-finite decomposition of B. We know that µ∗(N ∩ F ) = 0, ∀ F ∈ F .In particular, using Remark 5.4.B, for each F ∈ F , there exists some set E F ∈ B,with N ∩ F ⊂ E F , and µ(E F ) = 0. Consider now the set D =

F ∈F (E F ∩ F ).

By construction, we have D ∩ F = E F ∩ F ∈ B, ∀ F ∈ F , which means thatD ∈

F ∈F

B

F

. It is here where we use condition (ii) in the definition of µ-finite

decompositions, to conclude that $D$ belongs to $\mathcal{B}$. Of course, we have

\[ \mu(D \cap F) = \mu(E_F \cap F) \le \mu(E_F) = 0, \quad \forall\, F \in \mathcal{F}, \]

which by Proposition 5.6 means that $D$ is locally $\mu^*$-negligible. This means that

\[ \mu(D \cap B) = \mu^*(D \cap B) = 0, \quad \forall\, B \in \mathcal{B}_{\mathrm{fin}}, \]

so $D$ is locally $\mu$-null. Since $N \cap F \subset E_F \cap F \subset D$, $\forall\, F \in \mathcal{F}$, and $\mathcal{F}$ is a partition of $X$, we get $N \subset D$.



Lectures 23-25

6. The Lebesgue measure

In this section we apply various results from the previous sections to a very basic example: the Lebesgue measure on $\mathbb{R}^n$.

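Notations. We fix an integer $n \ge 1$. In Section 21 we introduced the semiring of "half-open boxes" in $\mathbb{R}^n$:

\[ \mathcal{J}_n = \{\varnothing\} \cup \Big\{ \prod_{j=1}^n [a_j, b_j) : a_1 < b_1, \dots, a_n < b_n \Big\} \subset \mathcal{P}(\mathbb{R}^n). \]

For a non-empty box $A = [a_1, b_1) \times \cdots \times [a_n, b_n) \in \mathcal{J}_n$, we defined its $n$-dimensional volume by

\[ \mathrm{vol}_n(A) = \prod_{k=1}^n (b_k - a_k). \]

We also defined $\mathrm{vol}_n(\varnothing) = 0$. By Theorem 4.2, we know that $\mathrm{vol}_n$ is a finite measure on $\mathcal{J}_n$.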

Definitions. The maximal outer extension of $\mathrm{vol}_n$ is called the $n$-dimensional outer Lebesgue measure, and is denoted by $\lambda_n^*$. The $\lambda_n^*$-measurable sets in $\mathbb{R}^n$ will be called $n$-Lebesgue measurable. The σ-algebra $\mathfrak{m}_{\lambda_n^*}(\mathbb{R}^n)$ will be denoted simply by $\mathfrak{m}(\mathbb{R}^n)$. The measure $\lambda_n^*\big|_{\mathfrak{m}(\mathbb{R}^n)}$ is simply denoted by $\lambda_n$, and is called the $n$-dimensional Lebesgue measure. Although this notation may appear to be confusing, it turns out (see Proposition 5.3) that $\lambda_n^*$ is indeed the maximal outer extension of $\lambda_n$. In the case when $n = 1$, the subscript will be omitted.

We know (see Section 21) that

S(Jn) = Σ(Jn) = Bor(Rn).

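Using the fact that the semiring $\mathcal{J}_n$ is σ-total in $\mathbb{R}^n$, by the definition of the outer Lebesgue measure, we have

\[ \lambda_n^*(A) = \inf\Big\{ \sum_{k=1}^\infty \mathrm{vol}_n(B_k) : (B_k)_{k=1}^\infty \subset \mathcal{J}_n,\ \bigcup_{k=1}^\infty B_k \supset A \Big\}, \quad \forall\, A \subset \mathbb{R}^n. \tag{1} \]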

Using Corollary 5.2, we have the equality

\[ \mathfrak{m}(\mathbb{R}^n) = \overline{\mathrm{Bor}(\mathbb{R}^n)}, \]

where $\overline{\mathrm{Bor}(\mathbb{R}^n)}$ is the completion of $\mathrm{Bor}(\mathbb{R}^n)$ with respect to the measure $\lambda_n\big|_{\mathrm{Bor}(\mathbb{R}^n)}$.

This means that a subset $A \subset \mathbb{R}^n$ is Lebesgue measurable if and only if there exist a Borel set $B$ and a negligible set $N$ such that $A = B \cup N$. (The fact that $N$ is negligible means that $\lambda_n^*(N) = 0$, and is equivalent to the existence of a Borel set $C \supset N$ with $\lambda_n(C) = 0$.)

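Exercise 1. Let $A = [a_1, b_1) \times \cdots \times [a_n, b_n)$ be a half-open box in $\mathbb{R}^n$. Assume $A \ne \varnothing$ (which means that $a_1 < b_1, \dots, a_n < b_n$). Consider the open box $\mathrm{Int}(A)$ and the closed box $\overline{A}$, which are given by

\[ \mathrm{Int}(A) = (a_1, b_1) \times \cdots \times (a_n, b_n) \quad\text{and}\quad \overline{A} = [a_1, b_1] \times \cdots \times [a_n, b_n]. \]

Prove the equalities

\[ \lambda_n\big(\mathrm{Int}(A)\big) = \lambda_n\big(\overline{A}\big) = \mathrm{vol}_n(A). \]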

Remarks 6.1. If $D \subset \mathbb{R}^n$ is a non-empty open set, then $\lambda_n(D) > 0$. This is a consequence of the above exercise, combined with the fact that $D$ contains at least one non-empty open box.

The Lebesgue measure of a countable subset $C \subset \mathbb{R}^n$ is zero. Using σ-additivity, it suffices to prove this only in the case of singletons $C = \{x\}$. If we write $x$ in coordinates $x = (x_1, \dots, x_n)$, and if we consider half-open boxes of the form

\[ J_\varepsilon = [x_1, x_1 + \varepsilon) \times \cdots \times [x_n, x_n + \varepsilon), \]

then the obvious inclusion $\{x\} \subset J_\varepsilon$ will force

\[ 0 \le \lambda_n(\{x\}) \le \lambda_n(J_\varepsilon) = \varepsilon^n, \]

so taking the limit as $\varepsilon \to 0$, we indeed get $\lambda_n(\{x\}) = 0$.
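The squeeze used for singletons can be checked numerically. The sketch below is only an illustration and not part of the notes: the helper names `box_volume` and `epsilon_box` are hypothetical, and exact rational arithmetic is assumed so that the identity vol_n(J_ε) = ε^n is reproduced without rounding.

```python
from fractions import Fraction
from math import prod


def box_volume(lower, upper):
    """Volume of the half-open box [lower_1, upper_1) x ... x [lower_n, upper_n)."""
    if any(a >= b for a, b in zip(lower, upper)):
        return Fraction(0)  # an empty box has volume 0
    return prod((Fraction(b) - Fraction(a) for a, b in zip(lower, upper)),
                start=Fraction(1))


def epsilon_box(x, eps):
    """Corners of the box J_eps = [x_1, x_1 + eps) x ... x [x_n, x_n + eps)."""
    lower = [Fraction(t) for t in x]
    upper = [t + Fraction(eps) for t in lower]
    return lower, upper


if __name__ == "__main__":
    x = (1, 2, 3)  # a point of R^3
    for eps in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
        # the volume equals eps**3, which shrinks to 0 with eps
        print(eps, box_volume(*epsilon_box(x, eps)))
```

Every box in the loop contains the point x, so the printed volumes are upper bounds for the measure of the singleton.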

The (outer) Lebesgue measure is completely determined by its values on open sets. More explicitly, one has the following result.

Proposition 6.1. Let $n \ge 1$ be an integer. For every subset $A \subset \mathbb{R}^n$ one has:

\[ \lambda_n^*(A) = \inf\big\{ \lambda_n(D) : D \text{ open subset of } \mathbb{R}^n, \text{ with } D \supset A \big\}. \tag{2} \]

Proof. Throughout the proof the set $A$ will be fixed. Let us denote, for simplicity, the right hand side of (2) by $\nu(A)$. First of all, since every open set is Lebesgue measurable (being Borel), we have $\lambda_n(D) = \lambda_n^*(D)$ for all open sets $D$, so by the monotonicity of $\lambda_n^*$ we get the inequality

\[ \lambda_n^*(A) \le \nu(A). \]

We now prove the inequality $\lambda_n^*(A) \ge \nu(A)$. Fix for the moment some $\varepsilon > 0$, and use (1) to get the existence of a sequence $(B_k)_{k=1}^\infty \subset \mathcal{J}_n$, such that $\bigcup_{k=1}^\infty B_k \supset A$, and

\[ \sum_{k=1}^\infty \mathrm{vol}_n(B_k) < \lambda_n^*(A) + \varepsilon. \]

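For every $k \ge 1$, we write

\[ B_k = [a_1^{(k)}, b_1^{(k)}) \times \cdots \times [a_n^{(k)}, b_n^{(k)}), \]

so that $\mathrm{vol}_n(B_k) = \prod_{j=1}^n \big(b_j^{(k)} - a_j^{(k)}\big)$. Using the obvious continuity of the map

\[ \mathbb{R} \ni t \longmapsto \prod_{j=1}^n \big(b_j^{(k)} - a_j^{(k)} - t\big) \in \mathbb{R}, \]

we can find, for each $k \ge 1$, some numbers $c_1^{(k)} < a_1^{(k)}, \dots, c_n^{(k)} < a_n^{(k)}$, with

\[ \prod_{j=1}^n \big(b_j^{(k)} - c_j^{(k)}\big) < \frac{\varepsilon}{2^k} + \prod_{j=1}^n \big(b_j^{(k)} - a_j^{(k)}\big). \tag{3} \]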




Notice that, if we define the half-open boxes

\[ E_k = [c_1^{(k)}, b_1^{(k)}) \times \cdots \times [c_n^{(k)}, b_n^{(k)}), \]

then for every $k \ge 1$ we clearly have $B_k \subset \mathrm{Int}(E_k)$, and by Exercise 1, combined with (3), we also have the inequality

\[ \lambda_n\big(\mathrm{Int}(E_k)\big) = \mathrm{vol}_n(E_k) < \frac{\varepsilon}{2^k} + \mathrm{vol}_n(B_k). \]

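Summing up we then get

\[ \sum_{k=1}^\infty \lambda_n\big(\mathrm{Int}(E_k)\big) < \sum_{k=1}^\infty \Big[ \frac{\varepsilon}{2^k} + \mathrm{vol}_n(B_k) \Big] = \varepsilon + \sum_{k=1}^\infty \mathrm{vol}_n(B_k) < 2\varepsilon + \lambda_n^*(A). \tag{4} \]

Now we observe that by σ-sub-additivity we have

\[ \lambda_n\Big( \bigcup_{k=1}^\infty \mathrm{Int}(E_k) \Big) \le \sum_{k=1}^\infty \lambda_n\big(\mathrm{Int}(E_k)\big), \]

so if we define the open set $D = \bigcup_{k=1}^\infty \mathrm{Int}(E_k)$, then using (4) we get

\[ \lambda_n(D) < 2\varepsilon + \lambda_n^*(A). \tag{5} \]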

It is clear that we have the inclusions

\[ A \subset \bigcup_{k=1}^\infty B_k \subset \bigcup_{k=1}^\infty \mathrm{Int}(E_k) = D, \]

so by the definition of $\nu(A)$, combined with (5), we finally get

\[ \nu(A) \le \lambda_n(D) < 2\varepsilon + \lambda_n^*(A). \]

Up to this moment $\varepsilon > 0$ was fixed. Since the inequality $\nu(A) < 2\varepsilon + \lambda_n^*(A)$ holds for any $\varepsilon > 0$ however, we finally get the desired inequality $\nu(A) \le \lambda_n^*(A)$.
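The only step of the proof that needs a choice is the enlargement of each box B_k allowed by (3). A minimal sketch of one way to make that choice is given below, assuming rational corners and a uniform shift of all lower corners; the names `volume` and `enlarge` are hypothetical, and the halving search is an illustration of the continuity argument, not the construction used in the notes.

```python
from fractions import Fraction
from math import prod


def volume(lower, upper):
    """Volume of the half-open box with the given corner coordinates."""
    return prod((u - l for l, u in zip(lower, upper)), start=Fraction(1))


def enlarge(lower, upper, delta):
    """Return c with c_j < lower_j and vol([c, upper)) < delta + vol([lower, upper)).

    The shift s is halved until the strict inequality holds; this terminates
    because the enlarged volume depends continuously on s and tends to the
    original volume as s -> 0.
    """
    target = volume(lower, upper) + delta
    s = Fraction(1)
    while volume([l - s for l in lower], upper) >= target:
        s /= 2
    return [l - s for l in lower]


if __name__ == "__main__":
    a = [Fraction(0), Fraction(0)]
    b = [Fraction(1), Fraction(2)]
    eps, k = Fraction(1, 10), 3
    c = enlarge(a, b, eps / 2**k)
    print(c, volume(c, b) - volume(a, b))  # the excess stays below eps / 2**k
```

Since every c_j is strictly smaller than a_j, the original box is contained in the interior of the enlarged one, which is what the proof uses to pass from half-open boxes to open sets.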

The Lebesgue measure can also be recovered from its values on compact sets.

Proposition 6.2. Let $n \ge 1$ be an integer. For every Lebesgue measurable subset $A \subset \mathbb{R}^n$ one has:

\[ \lambda_n(A) = \sup\big\{ \lambda_n(K) : K \text{ compact subset of } \mathbb{R}^n, \text{ with } K \subset A \big\}. \tag{6} \]

Proof. Let us denote, for simplicity, the right hand side of (6) by $\mu(A)$. First of all, by monotonicity we clearly have the inequality

\[ \lambda_n(A) \ge \mu(A). \]

To prove the inequality $\lambda_n(A) \le \mu(A)$, we shall first use a reduction to the bounded case. For each integer $k \ge 1$, we define the compact box

\[ B_k = [-k, k] \times \cdots \times [-k, k]. \]

Notice that we have $B_1 \subset B_2 \subset \dots$, with $\bigcup_{k=1}^\infty B_k = \mathbb{R}^n$. We then have

\[ B_1 \cap A \subset B_2 \cap A \subset \dots, \]

with $\bigcup_{k=1}^\infty (B_k \cap A) = A$, so using the Continuity Lemma 4.1, we have

\[ \lambda_n(A) = \lim_{k \to \infty} \lambda_n(B_k \cap A) = \sup\big\{ \lambda_n(B_k \cap A) : k \ge 1 \big\}. \tag{7} \]




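Fix for the moment some $\varepsilon > 0$, and use (7) to find some $k \ge 1$ such that $\lambda_n(A) \le \lambda_n(B_k \cap A) + \varepsilon$. Apply Proposition 6.1 to the set $B_k \setminus A$, to find an open set $D$, with $D \supset B_k \setminus A$, and $\lambda_n(B_k \setminus A) \ge \lambda_n(D) - \varepsilon$. On the one hand, we have

\[ \lambda_n(B_k) = \lambda_n(B_k \cap A) + \lambda_n(B_k \setminus A) \ge \lambda_n(B_k \cap A) + \lambda_n(D) - \varepsilon \ge \lambda_n(B_k \cap A) + \lambda_n(B_k \cap D) - \varepsilon. \tag{8} \]

On the other hand, we have

\[ \lambda_n(B_k) = \lambda_n(B_k \setminus D) + \lambda_n(B_k \cap D), \]

so using (8) we get the inequality

\[ \lambda_n(B_k \setminus D) + \lambda_n(B_k \cap D) \ge \lambda_n(B_k \cap A) + \lambda_n(B_k \cap D) - \varepsilon, \]

and since all numbers involved in the above inequality are finite, we conclude that

\[ \lambda_n(B_k \setminus D) \ge \lambda_n(B_k \cap A) - \varepsilon \ge \lambda_n(A) - 2\varepsilon. \]

Obviously the set $K = B_k \setminus D$ is compact, with $K \subset B_k \cap A \subset A$, so we have $\mu(A) \ge \lambda_n(K)$, hence we get the inequality

\[ \mu(A) \ge \lambda_n(A) - 2\varepsilon. \]

Since this is true for all $\varepsilon > 0$, the desired inequality $\mu(A) \ge \lambda_n(A)$ follows.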

Corollary 6.1. For a set $A \subset \mathbb{R}^n$, the following are equivalent:

(i) $A$ is Lebesgue measurable;
(ii) there exist a negligible set $N$ and a sequence $(K_j)_{j=1}^\infty$ of compact subsets of $\mathbb{R}^n$, such that

\[ A = N \cup \bigcup_{j=1}^\infty K_j. \]

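Proof. (i) ⇒ (ii). Start by using the boxes

\[ B_k = [-k, k] \times \cdots \times [-k, k], \]

which have the property that $\bigcup_{k=1}^\infty B_k = \mathbb{R}^n$, so we get $A = \bigcup_{k=1}^\infty (B_k \cap A)$. Fix for the moment $k$. Apply Proposition 6.2 to find a sequence $(C_r^k)_{r=1}^\infty$ of compact subsets of $B_k \cap A$, such that $\lim_{r \to \infty} \lambda_n(C_r^k) = \lambda_n(B_k \cap A)$. Consider the countable family $(C_r^k)_{k,r=1}^\infty$ of compact sets, and enumerate it as a sequence $(K_j)_{j=1}^\infty$, so that we have

\[ \bigcup_{j=1}^\infty K_j = \bigcup_{k=1}^\infty \bigcup_{r=1}^\infty C_r^k. \]

If we define, for each $k \ge 1$, the sets $E_k = \bigcup_{r=1}^\infty C_r^k \subset B_k \cap A$ and $N_k = (B_k \cap A) \setminus E_k$, then, because of the inclusions $C_r^k \subset E_k \subset B_k \cap A$, we have the inequalities

\[ 0 \le \lambda_n(N_k) = \lambda_n(B_k \cap A) - \lambda_n(E_k) \le \lambda_n(B_k \cap A) - \lambda_n(C_r^k), \quad \forall\, r \ge 1. \tag{9} \]

Using the fact that

\[ \lim_{r \to \infty} \lambda_n(C_r^k) = \lambda_n(B_k \cap A) \le \lambda_n(B_k) < \infty, \]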



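the inequalities (9) force $\lambda_n(N_k) = 0$, $\forall\, k \ge 1$. Now if we define the set $N = A \setminus \bigcup_{j=1}^\infty K_j$, we have

\[ N = \Big[ \bigcup_{k=1}^\infty (B_k \cap A) \Big] \setminus \Big[ \bigcup_{j=1}^\infty K_j \Big] = \bigcup_{k=1}^\infty \Big[ (B_k \cap A) \setminus \bigcup_{p=1}^\infty E_p \Big] \subset \bigcup_{k=1}^\infty \big[ (B_k \cap A) \setminus E_k \big] = \bigcup_{k=1}^\infty N_k, \]

which proves that $\lambda_n(N) = 0$.

The implication (ii) ⇒ (i) is trivial.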

Proposition 6.2 does not hold if $A \subset \mathbb{R}^n$ is non-measurable. In fact the equality (6), with $\lambda_n$ replaced by $\lambda_n^*$, essentially forces $A$ to be measurable, as shown by the following.

Exercise 2. Let $A \subset \mathbb{R}^n$ be an arbitrary subset, with $\lambda_n^*(A) < \infty$. Prove that the following are equivalent:

(i) $A$ is Lebesgue measurable;

(ii) $\lambda_n^*(A) = \sup\big\{ \lambda_n(K) : K \text{ compact subset of } \mathbb{R}^n, \text{ with } K \subset A \big\}$.

Propositions 6.1 and 6.2 are regularity properties. The following terminology is useful:

Definitions. Suppose $\mathcal{A}$ is a σ-algebra on $X$, and $\mu$ is a measure on $\mathcal{A}$. Suppose we have a sub-collection $\mathcal{F} \subset \mathcal{A}$.

(i) We say that $\mu$ is regular from below, with respect to $\mathcal{F}$, if for every $A \in \mathcal{A}$ one has

\[ \mu(A) = \sup\big\{ \mu(F) : F \subset A,\ F \in \mathcal{F} \big\}. \]

(ii) We say that $\mu$ is regular from above, with respect to $\mathcal{F}$, if for every $A \in \mathcal{A}$ one has

\[ \mu(A) = \inf\big\{ \mu(F) : F \supset A,\ F \in \mathcal{F} \big\}. \]

With this terminology, Proposition 6.1 gives the fact that the Lebesgue measure is regular from above with respect to open sets, while Proposition 6.2 gives the fact that the Lebesgue measure is regular from below with respect to compact sets.

Exercise 3. For a subset $A \subset \mathbb{R}^n$, prove that the following are equivalent:

(i) $A$ is Lebesgue measurable;
(ii) there exist a sequence of compact sets $(K_j)_{j=1}^\infty$, and a sequence of open sets $(D_j)_{j=1}^\infty$, such that $\bigcup_{j=1}^\infty K_j \subset A \subset \bigcap_{j=1}^\infty D_j$, and the difference $\big( \bigcap_{j=1}^\infty D_j \big) \setminus \big( \bigcup_{j=1}^\infty K_j \big)$ is negligible.

Hint: For the implication (i) ⇒ (ii) analyze first the case when $\lambda^*(A) < \infty$. Then write $A$ as a countable union of sets of finite outer measure.

In the one-dimensional case $n = 1$, the Lebesgue measure of open sets can be computed with the aid of the following result.

Proposition 6.3. For every open set $D \subset \mathbb{R}$, there exists a countable (or finite) pair-wise disjoint collection $\{J_i\}_{i \in I}$ of open intervals with $D = \bigcup_{i \in I} J_i$.

Proof. For every point $x \in D$, we define

\[ a_x = \inf\{ a < x : (a, x) \subset D \} \quad\text{and}\quad b_x = \sup\{ b > x : (x, b) \subset D \}. \]

(The fact that $D$ is open guarantees that both sets above are non-empty.) It is clear that, for every $x \in D$, the open interval $J_x = (a_x, b_x)$ is contained in $D$, so we have the equality $D = \bigcup_{x \in D} J_x$. The problem at this point is the fact that the collection $\{J_x\}_{x \in D}$ is not pair-wise disjoint. What we need to find is a countable (or finite) subset $X \subset D$, such that the sub-collection $\{J_x\}_{x \in X}$ is pair-wise disjoint, and we still have $D = \bigcup_{x \in X} J_x$. One way to do this is based on the following

Claim: For two points $x, y \in D$, the following are equivalent:
(i) $x \in J_y$;
(ii) $J_x \supset J_y$;
(iii) $J_x \cap J_y \ne \varnothing$;
(iv) $J_x = J_y$.

To prove the implication (i) ⇒ (ii) we observe that if $x \in J_y$, then $a_y < x < b_y$, so we have $(a_y, x) \subset D$ and $(x, b_y) \subset D$, which means that $a_x \le a_y$ and $b_x \ge b_y$, therefore we have the inclusion $J_x = (a_x, b_x) \supset (a_y, b_y) = J_y$. The implication (ii) ⇒ (iii) is trivial. To prove (iii) ⇒ (iv), assume $J_x \cap J_y \ne \varnothing$, and pick a point $z \in J_x \cap J_y$. Using the implication (i) ⇒ (ii) we have the inclusions $J_z \supset J_x$ and $J_z \supset J_y$. In particular we have $x \in J_z$, so again using the implication (i) ⇒ (ii) we get $J_x \supset J_z$, which means that we have in fact the equality $J_x = J_z$. Likewise we have the equality $J_y = J_z$, so (iv) follows. The implication (iv) ⇒ (i) is trivial.

Going back to the proof of the Proposition, we now see that, using the fact that any open interval contains a rational number, if we put $X_0 = D \cap \mathbb{Q}$, then for any $y \in D$ there exists $x \in X_0$ such that $J_x = J_y$. This gives the equality $D = \bigcup_{x \in X_0} J_x$, this time with the indexing set $X_0$ countable. Finally, we equip the set $X_0$ with the equivalence relation

\[ x \sim y \iff J_x = J_y, \]

and we choose $X \subset X_0$ to be a set containing exactly one element from each equivalence class. This means that, for every $y \in X_0$, there exists a unique $x \in X$ with $J_x = J_y$. It is clear now that we still have $D = \bigcup_{x \in X} J_x$, but now if $x, x' \in X$ are such that $x \ne x'$, then $x \not\sim x'$, so we have $J_x \ne J_{x'}$, which by the Claim gives $J_x \cap J_{x'} = \varnothing$.
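For an open set given as a finite union of open intervals, the disjoint intervals of Proposition 6.3 can be computed directly. The sketch below is an illustration under that finiteness assumption; it replaces the rational-point argument of the proof by sorting, and the names `components` and `total_length` are hypothetical.

```python
def components(intervals):
    """Pair-wise disjoint open intervals whose union equals the union of the input.

    `intervals` is a finite list of pairs (a, b) with a < b, each standing for
    the open interval (a, b).  Two open intervals are merged only when they
    actually overlap: (0, 1) and (1, 2) stay separate, because the point 1
    does not belong to their union.
    """
    result = []
    for a, b in sorted(intervals):
        if result and a < result[-1][1]:
            # (a, b) meets the last component, so extend that component
            last_a, last_b = result[-1]
            result[-1] = (last_a, max(last_b, b))
        else:
            result.append((a, b))
    return result


def total_length(intervals):
    """Lebesgue measure of the union, computed from its disjoint components."""
    return sum(b - a for a, b in components(intervals))


if __name__ == "__main__":
    pieces = [(0, 1), (1, 2), (0.5, 0.75), (3, 5), (4, 6)]
    print(components(pieces))    # [(0, 1), (1, 2), (3, 6)]
    print(total_length(pieces))  # 5, while the individual lengths add up to 6.25
```

The gap between 5 and 6.25 in the usage example is an instance of the sub-additivity estimate that applies when the covering intervals are not disjoint.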

Comments. When we want to compute the Lebesgue measure of an open set $D \subset \mathbb{R}$, we should first try to write $D = \bigcup_{i \in I} J_i$ with $(J_i)_{i \in I}$ a countable (or finite) pair-wise disjoint collection of open intervals. If we succeed, then we would have

\[ \lambda(D) = \sum_{i \in I} \lambda(J_i). \]

For intervals (open or not) the Lebesgue measure is the same as the length.

There are instances when we can manage only to write a given open set $D$ as a union $D = \bigcup_{k=1}^\infty J_k$, with the $J$'s not necessarily disjoint. In that case we can only get the estimate

\[ \lambda(D) \le \sum_{k=1}^\infty \lambda(J_k). \]

Example 6.1. Consider the ternary Cantor set $K_3 \subset [0, 1]$, discussed in III.3. We know (see Remarks 3.5) that one can find a pair-wise disjoint sequence $(D_n)_{n=0}^\infty$ of open subsets of $(0, 1)$ such that $K_3 = [0, 1] \setminus \bigcup_{n=0}^\infty D_n$, and such that, for each $n \ge 0$, the open set $D_n$ is a disjoint union of $2^n$ intervals of length $1/3^{n+1}$. In particular, this means that $\lambda(D_n) = 2^n/3^{n+1}$, so

\[ \lambda(K_3) = \lambda\big([0, 1]\big) - \lambda\Big( \bigcup_{n=0}^\infty D_n \Big) = 1 - \sum_{n=0}^\infty \lambda(D_n) = 1 - \sum_{n=0}^\infty \frac{2^n}{3^{n+1}} = 0. \]
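The geometric series in this example can be checked with exact fractions. The following sketch only illustrates the computation; the function name `removed_measure` is hypothetical.

```python
from fractions import Fraction


def removed_measure(stages):
    """Sum of lambda(D_n) = 2**n / 3**(n + 1) for n = 0, ..., stages - 1."""
    return sum(Fraction(2**n, 3**(n + 1)) for n in range(stages))


if __name__ == "__main__":
    for stages in (1, 2, 5, 20):
        left = 1 - removed_measure(stages)
        # what is left after `stages` steps has measure (2/3)**stages
        assert left == Fraction(2, 3) ** stages
        print(stages, float(left))
```

The remaining measure (2/3)^N tends to 0, in agreement with the value of the infinite sum.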




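Then, using the obvious inclusion $A + x \subset \bigcup_{k=1}^\infty (B_k + x)$, by the remark made at the beginning of the proof, combined with the monotonicity and the σ-sub-additivity of the outer Lebesgue measure, we have

\[ \lambda_n^*(A + x) \le \lambda_n^*\Big( \bigcup_{k=1}^\infty (B_k + x) \Big) \le \sum_{k=1}^\infty \lambda_n^*(B_k + x) = \sum_{k=1}^\infty \mathrm{vol}_n(B_k + x) = \sum_{k=1}^\infty \mathrm{vol}_n(B_k) \le \lambda_n^*(A) + \varepsilon. \]

Since the inequality $\lambda_n^*(A + x) \le \lambda_n^*(A) + \varepsilon$ holds for all $\varepsilon > 0$, we get

\[ \lambda_n^*(A + x) \le \lambda_n^*(A). \]

The other inequality follows from the above one applied to the set $A + x$ and the translation by $-x$.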

Corollary 6.2. For a subset $A \subset \mathbb{R}^n$, one has the equivalence

\[ A \in \mathfrak{m}(\mathbb{R}^n) \iff A + x \in \mathfrak{m}(\mathbb{R}^n). \]

Proof. Write $A = B \cup N$, with $B$ Borel, and $N$ negligible. Then we have $A + x = (B + x) \cup (N + x)$. The set $B + x$ is Borel. By the above result we have $\lambda_n^*(N + x) = \lambda_n^*(N) = 0$, i.e. $N + x$ is negligible. Therefore $A + x$ is Lebesgue measurable.

As we have seen, the fact that there exist Lebesgue measurable sets that are not Borel is explained by the difference in cardinalities. Since $\mathrm{card}\, \mathfrak{m}(\mathbb{R}^n) = 2^{\mathfrak{c}} = \mathrm{card}\, \mathcal{P}(\mathbb{R}^n)$, it is legitimate to ask whether the inclusion $\mathfrak{m}(\mathbb{R}^n) \subset \mathcal{P}(\mathbb{R}^n)$ is strict. In other words, do there exist sets that are not Lebesgue measurable? The answer is affirmative, as discussed in the following.

Example 6.2. Equip $\mathbb{R}$ with the equivalence relation

\[ x \sim y \iff x - y \in \mathbb{Q}. \]

Denote by $\mathbb{R}/\mathbb{Q}$ the quotient space (this is in fact the quotient group of $(\mathbb{R}, +)$ with respect to the subgroup $\mathbb{Q}$), and denote by $\pi : \mathbb{R} \to \mathbb{R}/\mathbb{Q}$ the quotient map. Since for every $x \in \mathbb{R}$ one can find some $y \sim x$ with $y \in [0, 1)$, it follows that the map $\pi\big|_{[0,1)} : [0, 1) \to \mathbb{R}/\mathbb{Q}$ is surjective. Choose then a map $\phi : \mathbb{R}/\mathbb{Q} \to [0, 1)$ such that $\pi \circ \phi = \mathrm{Id}$, and put $E = \phi(\mathbb{R}/\mathbb{Q})$. The set $E$ is a complete set of representatives for the equivalence relation $\sim$. In other words, $E \subset [0, 1)$ has the property that, for every $x \in \mathbb{R}$, there exists exactly one element $y \in E$ with $x \sim y$. In particular, the collection of sets $(E + q)_{q \in \mathbb{Q}}$ is pair-wise disjoint, and satisfies $\bigcup_{q \in \mathbb{Q}} (E + q) = \mathbb{R}$.

Using σ-sub-additivity, we get

\[ \infty = \lambda(\mathbb{R}) \le \sum_{q \in \mathbb{Q}} \lambda^*(E + q). \]

Since (by Proposition 6.5) we have $\lambda^*(E + q) = \lambda^*(E)$, the above inequality forces $\lambda^*(E) > 0$.

Claim: The set $E$ is not Lebesgue measurable.

Assume $E$ is Lebesgue measurable. If we define the set $X = \mathbb{Q} \cap [0, 1)$, then the sets $E + q$, $q \in X$, are pair-wise disjoint. On the one hand, the measurability of $E$, combined with Corollary 6.2, would imply the measurability of the set $S = \bigcup_{q \in X} (E + q)$. On the other hand, the equalities $\lambda(E + q) = \lambda(E) > 0$ will force $\lambda(S) = \infty$. But this is impossible, since we obviously have $S \subset [0, 2)$, which forces $\lambda(S) \le 2$.

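Exercise 5. Let $E \in \mathfrak{m}(\mathbb{R}^n)$. Prove that the map

\[ \mathbb{R}^n \ni x \longmapsto \lambda\big( E \cup (E + x) \big) \in [0, \infty] \]

is continuous.

Hint: Analyze first the case when $E$ is compact. In this particular case, show that for every $x_0 \in \mathbb{R}^n$ and every open set $D \supset E \cup (E + x_0)$, there exists some neighborhood $V$ of $x_0$, such that

\[ D \supset E \cup (E + x), \quad \forall\, x \in V. \]

Use then regularity from above, combined with the inequality (see footnote 9)

\[ |\lambda(A) - \lambda(B)| \le \lambda(A \triangle B), \quad \text{for all } A, B \in \mathfrak{m}(\mathbb{R}^n) \text{ with } \lambda(A), \lambda(B) < \infty. \]

In the general case, use regularity from below. (The case $\lambda(E) = \infty$ is trivial.)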

Exercise 6. Let $E \in \mathfrak{m}(\mathbb{R}^n)$ be such that $\lambda_n(E) > 0$. Prove that the set

\[ E - E = \{ x - y : x, y \in E \} \]

is a neighborhood of 0.

Hint: Assume the contrary, which means that there exists a sequence $(x_p)_{p=1}^\infty \subset \mathbb{R}^n \setminus (E - E)$, with $\lim_{p \to \infty} x_p = 0$. This will force $E \cap (E + x_p) = \varnothing$, $\forall\, p \ge 1$. Use the preceding Exercise to get a contradiction.

We are now in position to construct a Lebesgue measurable set which is not Borel.

Example 6.3. In Section 3 we discussed the compact space $T = \{0, 1\}^{\aleph_0}$ and the maps

\[ \phi_r : T \ni (\alpha_n)_{n=1}^\infty \longmapsto (r - 1) \sum_{n=1}^\infty \frac{\alpha_n}{r^n} \in [0, 1]. \]

For each $r \ge 2$ the map $\phi_r : T \to [0, 1]$ is continuous, so the set $K_r = \phi_r(T)$ is compact. We have $K_2 = [0, 1]$, and $K_3$ is the ternary Cantor set. We also know (see Theorem 3.5) that, for a set $A \subset T$, one has the equivalence

\[ A \in \mathrm{Bor}(T) \iff \phi_r(A) \in \mathrm{Bor}(K_r). \tag{11} \]

Choose now a set $E \subset [0, 1]$ which is not Lebesgue measurable. In particular, $E$ is not Borel, so $E \notin \mathrm{Bor}([0, 1])$. Since $\phi_2 : T \to [0, 1]$ is surjective, by (11) it follows that the set $A = \phi_2^{-1}(E)$ is not in $\mathrm{Bor}(T)$. Again, by (11) it follows that the set $S = \phi_3(A)$ is not in $\mathrm{Bor}(K_3)$. Since

\[ \mathrm{Bor}(K_3) = \mathrm{Bor}(\mathbb{R})\big|_{K_3}, \]

this gives $S \notin \mathrm{Bor}(\mathbb{R})$. Notice however that, since $S \subset K_3$ and $\lambda(K_3) = 0$ (see Example 6.1), the set $S$ is negligible, hence Lebesgue measurable.
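To get a feeling for the maps used in this example, one can evaluate them on sequences that are eventually zero. The sketch below is an illustration only, under that truncation assumption; the function name `phi` is a hypothetical stand-in for the map written φ_r in the text.

```python
from fractions import Fraction


def phi(r, alpha):
    """Value (r - 1) * sum(alpha_n / r**n) for a finite 0/1 sequence alpha.

    `alpha` lists alpha_1, alpha_2, ...; all later terms are taken to be 0,
    so this evaluates phi_r only on sequences that are eventually zero.
    """
    return (r - 1) * sum(Fraction(a, r**n) for n, a in enumerate(alpha, start=1))


if __name__ == "__main__":
    alpha = (1, 0, 1)
    print(phi(2, alpha))  # binary expansion 0.101, that is 5/8
    print(phi(3, alpha))  # ternary expansion 0.202, that is 20/27
```

The same 0/1 sequence is read once as a binary expansion and once as a ternary expansion with digits 0 and 2, which is how a subset of [0, 1] is transported into the Cantor set in the example.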

Comment. When one wants to prove that a Lebesgue measurable set $M \subset \mathbb{R}$ has positive measure, a sufficient condition for this property is that $\mathrm{Int}(M) \ne \varnothing$ (see Remarks 6.1). It turns out however that this condition is not always necessary, as seen from the following:

9 This inequality holds for any additive map defined on a ring.




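Exercise 7. Start with the interval $[0, 1]$, and list all rational numbers in $[0, 1]$ as a sequence $\mathbb{Q} \cap [0, 1] = \{x_n\}_{n=1}^\infty$. Fix some $\varepsilon > 0$, and consider the open set

\[ D = \bigcup_{n=1}^\infty \Big( x_n - \frac{\varepsilon}{2^{n+1}},\ x_n + \frac{\varepsilon}{2^{n+1}} \Big). \]

Consider the compact set $K = [0, 1] \setminus D$.

(i) Prove that $\lambda(D) \le \varepsilon$.
(ii) Prove that $\lambda(K) \ge 1 - \varepsilon$.
(iii) Prove that $\mathrm{Int}(K) = \varnothing$.

Hint: For (iii) use the fact that $K \cap \mathbb{Q} = \varnothing$.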
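The estimate in part (i) can be explored numerically on finite pieces of the cover. The sketch below is an illustration, not a solution: it assumes one particular enumeration of the rationals (by increasing denominator), keeps only the first `count` intervals, and uses hypothetical helper names.

```python
from fractions import Fraction


def rationals_in_unit_interval(count):
    """First `count` terms of one enumeration of Q ∩ [0, 1] (by increasing denominator)."""
    seen, out, q = set(), [], 1
    while len(out) < count:
        for p in range(q + 1):
            x = Fraction(p, q)
            if x not in seen and len(out) < count:
                seen.add(x)
                out.append(x)
        q += 1
    return out


def union_length(intervals):
    """Length of a finite union of intervals, by sorting and sweeping left to right."""
    total, right = Fraction(0), None
    for a, b in sorted(intervals):
        if right is None or a > right:
            total += b - a
            right = b
        elif b > right:
            total += b - right
            right = b
    return total


def cover(eps, count):
    """The first `count` intervals (x_n - eps/2**(n+1), x_n + eps/2**(n+1))."""
    xs = rationals_in_unit_interval(count)
    return [(x - eps / 2**(n + 1), x + eps / 2**(n + 1))
            for n, x in enumerate(xs, start=1)]


if __name__ == "__main__":
    eps = Fraction(1, 4)
    pieces = cover(eps, 200)
    print(float(sum(b - a for a, b in pieces)))  # sum of lengths, below eps
    print(float(union_length(pieces)))           # measure of the union, smaller still
```

The sum of the first N lengths is ε(1 − 2^{−N}), so no finite piece of the cover can exceed ε; the overlaps only make the union smaller.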

Exercise 8*. Prove that, for every non-empty open set $D \subset \mathbb{R}$, and any two positive numbers $\alpha, \beta$ with $\alpha + \beta < \lambda(D)$, there exist compact sets $A, B \subset D$, with $\lambda(A) > \alpha$, $\lambda(B) > \beta$, such that $A \cap B = \varnothing$ and $(A \cup B) \cap \mathbb{Q} = \varnothing$.

Hint: Write $D$ as a union of a pair-wise disjoint sequence $(J_n)_{n=1}^\infty$ of open intervals, so that $\lambda(D) = \sum_{n=1}^\infty \lambda(J_n)$. Find then two sequences $(\alpha_n)_{n=1}^\infty$ and $(\beta_n)_{n=1}^\infty$ of positive numbers, such that $\sum_{n=1}^\infty \alpha_n > \alpha$, $\sum_{n=1}^\infty \beta_n > \beta$, and $\alpha_n + \beta_n < \lambda(J_n)$, for all $n \ge 1$. This reduces essentially the problem to the case when $D$ is an open interval, for which one can use the construction outlined in Exercise 7.

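Exercise 9*. Construct a Borel set $A \subset \mathbb{R}$ such that, for every open interval $I \subset \mathbb{R}$, one has $\lambda(I \cap A) > 0$ and $\lambda(I \setminus A) > 0$.

Hints: List all open intervals with rational endpoints as a sequence $(I_n)_{n=1}^\infty$. Start off (use Exercise 8) by choosing two compact sets $A_1, B_1 \subset I_1$, with $A_1 \cap B_1 = \varnothing$, $(A_1 \cup B_1) \cap \mathbb{Q} = \varnothing$, and $\lambda(A_1), \lambda(B_1) > 0$. Use Exercise 5 to construct two sequences $(A_n)_{n=1}^\infty$ and $(B_n)_{n=1}^\infty$ of compact sets, such that, for all $n \ge 1$ we have: (i) $A_n \cap B_n = \varnothing$; (ii) $(A_n \cup B_n) \cap \mathbb{Q} = \varnothing$; (iii) $\lambda(A_n), \lambda(B_n) > 0$; (iv) $A_{n+1} \cup B_{n+1} \subset I_{n+1} \setminus \bigcup_{k=1}^n (A_k \cup B_k)$. Put $A = \bigcup_{n=1}^\infty A_n$ and $B = \bigcup_{n=1}^\infty B_n$. Notice that $A \cap B = \varnothing$, $\lambda(A), \lambda(B) > 0$, and $\lambda(A \cap I_n), \lambda(B \cap I_n) > 0$, $\forall\, n \ge 1$.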

In the remainder of this section we discuss some applications of the Lebesgue measure to the theory of Riemann integration. The following technical result will be very useful.

Lemma 6.1. Let $f : [a, b] \to \mathbb{R}$ be a non-negative Riemann integrable function, and let $A, B \subset [a, b]$ be two disjoint sets, with $A \cup B = [a, b]$. Then one has the estimates

\[ \lambda^*(A) \cdot \inf_{z \in A} f(z) \le \int_a^b f(t)\, dt \le (b - a) \cdot \sup_{x \in A} f(x) + \lambda^*(B) \cdot \sup_{y \in B} f(y). \]

Proof. Define the numbers

\[ \alpha = \sup_{x \in A} f(x), \quad \beta = \sup_{y \in B} f(y), \quad\text{and}\quad \gamma = \inf_{z \in A} f(z). \]

Recall first that, if for each partition $\Delta = (a = x_0 < x_1 < \dots < x_n = b)$ of $[a, b]$ we define the lower and the upper Darboux sums of $f$ with respect to $\Delta$:

\[ L(\Delta, f) = \sum_{k=1}^n (x_k - x_{k-1}) \cdot \inf_{t \in [x_{k-1}, x_k]} f(t), \]

\[ U(\Delta, f) = \sum_{k=1}^n (x_k - x_{k-1}) \cdot \sup_{t \in [x_{k-1}, x_k]} f(t), \]

then one has the equalities

\[ \int_a^b f(t)\, dt = \sup\big\{ L(\Delta, f) : \Delta \text{ partition of } [a, b] \big\} = \inf\big\{ U(\Delta, f) : \Delta \text{ partition of } [a, b] \big\}. \tag{12} \]

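Fix now a partition $\Delta = (a = x_0 < x_1 < \dots < x_n = b)$ of $[a, b]$, and define the set

\[ S = \big\{ k \in \{1, \dots, n\} : [x_{k-1}, x_k] \cap A \ne \varnothing \big\}. \]

It is clear that

\[ \inf_{x \in [x_{k-1}, x_k]} f(x) \le \alpha, \quad \sup_{x \in [x_{k-1}, x_k]} f(x) \ge \gamma, \quad \forall\, k \in S, \]

\[ \inf_{y \in [x_{k-1}, x_k]} f(y) \le \beta, \quad \sup_{y \in [x_{k-1}, x_k]} f(y) \ge 0, \quad \forall\, k \in \{1, \dots, n\} \setminus S, \]

so we get

\[ L(\Delta, f) \le \sum_{k \in S} (x_k - x_{k-1}) \cdot \alpha + \sum_{k \notin S} (x_k - x_{k-1}) \cdot \beta, \tag{13} \]

\[ U(\Delta, f) \ge \sum_{k \in S} (x_k - x_{k-1}) \cdot \gamma. \tag{14} \]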

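Consider now the sets

\[ M = \bigcup_{k \in S} [x_{k-1}, x_k] \quad\text{and}\quad N = \bigcup_{k \notin S} [x_{k-1}, x_k]. \]

Since the intervals involved in both $M$ and $N$ have at most singleton overlaps, it follows that we have the equalities

\[ \sum_{k \in S} (x_k - x_{k-1}) = \lambda(M) \quad\text{and}\quad \sum_{k \notin S} (x_k - x_{k-1}) = \lambda(N), \]

so the estimates (13) and (14) read

\[ L(\Delta, f) \le \lambda(M) \cdot \alpha + \lambda(N) \cdot \beta, \tag{15} \]

\[ U(\Delta, f) \ge \lambda(M) \cdot \gamma. \tag{16} \]

Since we clearly have $A \subset M \subset [a, b]$ and $N \subset B$, we have the inequalities

\[ \lambda^*(A) \le \lambda(M) \le b - a \quad\text{and}\quad \lambda(N) \le \lambda^*(B), \]

so the inequalities (15) and (16) give

\[ L(\Delta, f) \le (b - a) \cdot \alpha + \lambda^*(B) \cdot \beta \quad\text{and}\quad U(\Delta, f) \ge \lambda^*(A) \cdot \gamma. \]

Since $\Delta$ is arbitrary, the desired inequality then follows from (12).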

One application of the above result is the following.

Proposition 6.5. If $f : [a, b] \to \mathbb{R}$ is Riemann integrable, and the set

\[ N = \{ x \in [a, b] : f(x) \ne 0 \} \]

is negligible, then

\[ \int_a^b f(x)\, dx = 0. \tag{17} \]




Proof. Since $f$ is bounded, there exists some constant $C > 0$, such that the Riemann integrable functions $C + f$ and $C - f$ are both non-negative. Apply Lemma 6.1 to these two functions with $A = [a, b] \setminus N$ and $B = N$. Since $f\big|_{[a,b] \setminus N} = 0$, we get $(C \pm f)\big|_{[a,b] \setminus N} = C$, and since $\lambda^*(N) = 0$ we get

\[ \int_a^b [C \pm f(x)]\, dx \le (b - a) \cdot C, \]

which yields

\[ \pm \int_a^b f(x)\, dx = \int_a^b \big( [C \pm f(x)] - C \big)\, dx = \int_a^b [C \pm f(x)]\, dx - (b - a) \cdot C \le 0, \]

from which (17) immediately follows.

In order to make the exposition a bit easier to follow, it will be helpful to introduce the following

Convention. Given two functions $f_1, f_2 : [a, b] \to \mathbb{R}$, and a relation $r$ on $\mathbb{R}$ (in our case $r$ will be either "=," or "≥," or "≤"), we write

\[ f_1 \ r\ f_2, \ \text{a.e.} \]

if the set

\[ A = \big\{ x \in [a, b] : f_1(x) \ r\ f_2(x) \big\} \]

has negligible complement in $[a, b]$, i.e. $\lambda^*\big( [a, b] \setminus A \big) = 0$. The abbreviation "a.e." stands for "almost everywhere."

For example, using this convention, Proposition 6.5 reads: if $f : [a, b] \to \mathbb{R}$ is Riemann integrable, and $f = 0$, a.e., then $\int_a^b f(x)\, dx = 0$.

Exercise 10. A. Prove that "= a.e." is an equivalence relation, and "≥ a.e." and "≤ a.e." are transitive relations on the collection of all functions $[a, b] \to \mathbb{R}$.

B. Prove that $f_1 \ge f_2$, a.e. and $f_1 \le f_2$, a.e. imply $f_1 = f_2$, a.e.

C. Prove that these relations are compatible with the arithmetic operations, in the exact way as their "honest" versions. For example, if $r$ is one of "=," or "≥," or "≤", and if $f_1\ r\ f_2$, a.e. and $g_1\ r\ g_2$, a.e., then $(f_1 + g_1)\ r\ (f_2 + g_2)$, a.e.

Exercise 11. Let $f, g : [a, b] \to \mathbb{R}$ be continuous functions, such that $f \ge g$, a.e. Prove that $f \ge g$.

Exercise 12. Let $f : [a, b] \to \mathbb{R}$ be a non-negative Riemann integrable function, with $\int_a^b f(x)\, dx = 0$. Prove that $f = 0$, a.e.

Comment. Riemann integrability is quite a rigid condition. For example the characteristic function $\kappa_{\mathbb{Q} \cap [a,b]}$ of the set of rational numbers in $[a, b]$ is not Riemann integrable. By the above result however, we can introduce a slightly weaker notion, which will make such functions integrable, in a weaker sense. This will be a first "improvement" of the Riemann integration theory. Eventually (see Chapter IV), a more sophisticated theory - the Lebesgue integral - will emerge.

Definition. We say that a function $f : [a, b] \to \mathbb{R}$ is almost Riemann integrable, if there exists a Riemann integrable function $g : [a, b] \to \mathbb{R}$, with $f = g$, a.e. Of course, such a $g$ is not unique. Notice however that, if $h : [a, b] \to \mathbb{R}$ is another Riemann integrable function, with $f = h$, a.e., then $g = h$, a.e., so by Proposition 6.5, applied to $g - h$, we immediately get the equality

\[ \int_a^b g(x)\, dx = \int_a^b h(x)\, dx. \]

This observation shows that we can unambiguously define

\[ {\approx}\!\!\int_a^b f(x)\, dx = \int_a^b g(x)\, dx. \]

Example 6.4. Consider the function $f = \kappa_{\mathbb{Q} \cap [a,b]}$. Since $\mathbb{Q} \cap [a, b]$ is negligible, we have $f = 0$, a.e. So $f$ is almost Riemann integrable (although it is not Riemann integrable), and we have

\[ {\approx}\!\!\int_a^b f(x)\, dx = 0. \]

We now focus our attention on (honest) Riemann integrability, with an eye on the role played by continuity. For a function $f : [a, b] \to \mathbb{R}$ we define the set

\[ D_f = \big\{ x \in [a, b] : f \text{ not continuous at } x \big\}. \]

It is well-known that continuous functions are Riemann integrable. There are discontinuous functions which are still Riemann integrable; for instance we know that

\[ D_f \text{ finite} \implies f \text{ Riemann integrable}. \tag{18} \]

Notations. Let $f : [a, b] \to \mathbb{R}$ be a bounded function. Suppose $\Delta = (a = x_0 < x_1 < \dots < x_n = b)$ is a partition. For each $k \in \{1, \dots, n\}$ we consider the numbers

\[ M_k = \sup_{t \in [x_{k-1}, x_k]} f(t) \quad\text{and}\quad m_k = \inf_{t \in [x_{k-1}, x_k]} f(t), \]

and we define the functions

\[ f_\Delta = m_1 \cdot \kappa_{[x_0, x_1]} + m_2 \cdot \kappa_{(x_1, x_2]} + \dots + m_n \cdot \kappa_{(x_{n-1}, x_n]}, \]

\[ f^\Delta = M_1 \cdot \kappa_{[x_0, x_1]} + M_2 \cdot \kappa_{(x_1, x_2]} + \dots + M_n \cdot \kappa_{(x_{n-1}, x_n]}. \]

Clearly the functions $f_\Delta$ and $f^\Delta$ have only finitely many points of discontinuity, so they are Riemann integrable.

With these notations we have the following

Proposition 6.6. For a bounded function $f : [a, b] \to \mathbb{R}$, the following are equivalent:

(i) $f$ is Riemann integrable;

(ii) $\inf\big\{ \int_a^b [f^\Delta(x) - f_\Delta(x)]\, dx : \Delta \text{ partition of } [a, b] \big\} = 0$;

(iii) there exists a sequence $(\Delta_p)_{p=1}^\infty$ of partitions of $[a, b]$, with $\Delta_1 \subset \Delta_2 \subset \dots$, and $\lim_{p \to \infty} \int_a^b \big[ f^{\Delta_p}(x) - f_{\Delta_p}(x) \big]\, dx = 0$.

Proof. From the definition of Riemann integrability, we know that (i) is equivalent to any of the following two conditions:

(ii') $\inf\big\{ U(\Delta, f) - L(\Delta, f) : \Delta \text{ partition of } [a, b] \big\} = 0$;

(iii') there exists a sequence $(\Delta_p)_{p=1}^\infty$ of partitions of $[a, b]$, with $\Delta_1 \subset \Delta_2 \subset \dots$, and $\lim_{p \to \infty} \big[ U(\Delta_p, f) - L(\Delta_p, f) \big] = 0$.

Then the Proposition follows immediately from the fact that, for every partition $\Delta$ one has the equalities

\[ \int_a^b f_\Delta(x)\, dx = L(\Delta, f) \quad\text{and}\quad \int_a^b f^\Delta(x)\, dx = U(\Delta, f). \]

The following result gives a complete description of the relationship between Riemann integrability and continuity.

Theorem 6.1 (Lebesgue's criterion for Riemann integrability). Let $f : [a, b] \to \mathbb{R}$ be a bounded function. The following are equivalent:

(i) $f$ is Riemann integrable;
(ii) the discontinuity set $D_f$ is negligible.

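Proof. (i) ⇒ (ii). Assume $f$ is Riemann integrable. Using Proposition 6.6, there exists a sequence $(\Delta_p)_{p=1}^\infty$ of partitions of $[a, b]$, such that $\Delta_1 \subset \Delta_2 \subset \dots$ and

\[ \lim_{p \to \infty} \int_a^b \big[ f^{\Delta_p}(x) - f_{\Delta_p}(x) \big]\, dx = 0. \]

Notice that

\[ f^{\Delta_1} \ge f^{\Delta_2} \ge f^{\Delta_3} \ge \dots \ge f \ge \dots \ge f_{\Delta_3} \ge f_{\Delta_2} \ge f_{\Delta_1}. \tag{19} \]

Define the Riemann integrable functions $h_p = f^{\Delta_p} - f_{\Delta_p}$, $p \in \mathbb{N}$. We then clearly have

(α) $h_p \ge h_{p+1} \ge 0$, $\forall\, p \in \mathbb{N}$;

(β) $\lim_{p \to \infty} \int_a^b h_p(x)\, dx = 0$.

Using (α) we can define the function $h : [a, b] \to \mathbb{R}$ by

\[ h(x) = \lim_{p \to \infty} h_p(x), \quad \forall\, x \in [a, b]. \]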

Claim 1: The set $N = \{ x \in [a, b] : h(x) \ne 0 \}$ is negligible.

First of all, the functions $h_p$ are all Lebesgue measurable. Secondly, since $h$ is a point-wise limit of a sequence of Lebesgue measurable functions, it follows (see Theorem 3.2) that $h$ itself is Lebesgue measurable. In particular $N$ is Lebesgue measurable. For every integer $j \ge 1$, define

\[ N_j = \Big\{ x \in [a, b] : h(x) > \frac{1}{j} \Big\}, \]

so that the sets $N_j$, $j \ge 1$, are again Lebesgue measurable, and $N = \bigcup_{j=1}^\infty N_j$. In order to prove that $N$ is negligible, it then suffices to prove that $\lambda(N_j) = 0$, for all $j \ge 1$. Fix for the moment $j \ge 1$. Since $h_p \ge h \ge 0$, it follows that

\[ \inf_{x \in N_j} h_p(x) \ge \frac{1}{j}, \quad \forall\, p \ge 1, \]

so by Lemma 6.1 we get the inequality

\[ \frac{\lambda(N_j)}{j} \le \int_a^b h_p(x)\, dx, \quad \forall\, p \ge 1, \]

so by (β) we indeed get $\lambda(N_j) = 0$.

Define the set $S = \bigcup_{p=1}^\infty \Delta_p$.

Claim 2: If $y \in [a, b] \setminus (N \cup S)$, then $f$ is continuous at $y$.

Fix $y \in [a, b] \setminus (N \cup S)$. In order to prove that $f$ is continuous at $y$, we must find, for every $\varepsilon > 0$, some open interval $J_\varepsilon \ni y$, such that

\[ |f(z) - f(y)| < \varepsilon, \quad \forall\, z \in J_\varepsilon \cap [a, b]. \tag{20} \]

Since $y \notin N$, we have $\lim_{p \to \infty} h_p(y) = 0$. Fix $\varepsilon$ and choose $p \ge 1$, such that $0 \le h_p(y) < \varepsilon$. Write the partition $\Delta_p$ as

\[ \Delta_p = (a = x_0 < x_1 < \dots < x_n = b). \]

Using the fact that $y \notin \Delta_p$, if we define $k = \min\{ j \in \{1, \dots, n\} : y < x_j \}$, we have $y \in (x_{k-1}, x_k)$. In particular, we get

\[ f^{\Delta_p}(y) = \sup_{t \in [x_{k-1}, x_k]} f(t) \quad\text{and}\quad f_{\Delta_p}(y) = \inf_{s \in [x_{k-1}, x_k]} f(s), \]

so the inequality $0 \le h_p(y) < \varepsilon$ gives

\[ \sup_{t \in [x_{k-1}, x_k]} f(t) - \inf_{s \in [x_{k-1}, x_k]} f(s) < \varepsilon, \]

so if we choose $J_\varepsilon = (x_{k-1}, x_k)$, we clearly have (20).

Now we are done, because using the fact that $S$ is countable, it follows that $S$ is negligible, so $N \cup S$ is also negligible. Since by Claim 2 we have $D_f \subset N \cup S$, it follows that $D_f$ itself is negligible.

(ii) ⇒ (i). Assume now the discontinuity set $D_f$ is negligible, and let us prove that $f$ is Riemann integrable. Fix a sequence $(\Delta_p)_{p=1}^\infty$ of partitions of $[a, b]$, with $\Delta_1 \subset \Delta_2 \subset \dots$, and $\lim_{p \to \infty} |\Delta_p| = 0$ (see footnote 10). As before, we define the set $S = \bigcup_{p=1}^\infty \Delta_p$.

Claim 3: For any point $y \in [a, b] \setminus (D_f \cup S)$, one has the equalities

\[ \lim_{p \to \infty} f^{\Delta_p}(y) = \lim_{p \to \infty} f_{\Delta_p}(y) = f(y). \]

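Fix for the moment $\varepsilon > 0$. Since $f$ is continuous at $y$, there exists some $\delta_\varepsilon > 0$, such that

\[ |f(z) - f(y)| < \varepsilon, \quad \forall\, z \in (y - \delta_\varepsilon, y + \delta_\varepsilon) \cap [a, b]. \tag{21} \]

Choose now $q \ge 1$, such that $|\Delta_q| < \delta_\varepsilon$. Write $\Delta_q = (a = x_0 < x_1 < \dots < x_n = b)$. Using the fact that $y \notin \Delta_q$, we can find $k \in \{1, \dots, n\}$ such that $y \in (x_{k-1}, x_k)$. Since $x_k - x_{k-1} < \delta_\varepsilon$, we have the inclusion $[x_{k-1}, x_k] \subset (y - \delta_\varepsilon, y + \delta_\varepsilon)$, so by (21) we immediately get

\[ f(y) \le f^{\Delta_q}(y) = \sup_{z \in [x_{k-1}, x_k]} f(z) \le f(y) + \varepsilon; \]

\[ f(y) \ge f_{\Delta_q}(y) = \inf_{z \in [x_{k-1}, x_k]} f(z) \ge f(y) - \varepsilon. \]

Since the sequence $\big( f^{\Delta_p}(y) \big)_{p=1}^\infty$ is non-increasing, and the sequence $\big( f_{\Delta_p}(y) \big)_{p=1}^\infty$ is non-decreasing, the above inequalities give

\[ |f^{\Delta_p}(y) - f(y)| \le \varepsilon \quad\text{and}\quad |f_{\Delta_p}(y) - f(y)| \le \varepsilon, \quad \text{for all } p \ge q, \]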

and the Claim follows.

Going back to the proof of the Theorem, we will now prove that $f$ satisfies condition (iii) in Proposition 6.6. Fix $\varepsilon > 0$. Since $D_f \cup S$ is also negligible, using regularity from above with respect to open sets, we can find an open set $E \subset \mathbb{R}$ such that $E \supset D_f \cup S$, and $\lambda(E) < \varepsilon$. Define the compact set $A = [a, b] \setminus E$, and put $B = [a, b] \cap E$. We clearly have

\[ \lambda(B) \le \lambda(E) < \varepsilon. \tag{22} \]

10 Recall that, for a partition $\Delta = (a = x_0 < \dots < x_n = b)$, the number $|\Delta|$ is defined as $|\Delta| = \max\{ x_k - x_{k-1} : 1 \le k \le n \}$.

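Define the sequence $(h_p)_{p=1}^\infty$ by $h_p = f^{\Delta_p} - f_{\Delta_p}$. Since $A \cap \Delta_p = \varnothing$, it follows that $h_p\big|_A$ is continuous, for each $p \ge 1$. Since $A \cap (D_f \cup S) = \varnothing$, by Claim 3 we know that $\lim_{p \to \infty} h_p(y) = 0$, $\forall\, y \in A$. Since $(h_p)_{p=1}^\infty$ is monotone, by Dini's Theorem (see ??) it follows that

\[ \lim_{p \to \infty} \Big[ \max_{y \in A} h_p(y) \Big] = 0. \]

In particular, there exists $p_\varepsilon \ge 1$, such that

\[ h_{p_\varepsilon}(y) \le \varepsilon, \quad \forall\, y \in A. \tag{23} \]

Let

\[ M = \sup_{x \in [a, b]} f(x) \quad\text{and}\quad m = \inf_{x \in [a, b]} f(x). \]

Using Lemma 6.1 for $h_{p_\varepsilon}$ and the sets $A$ and $B$, combined with (22) and (23), we have

\[ \int_a^b h_{p_\varepsilon}(x)\, dx \le (b - a) \cdot \sup_{y \in A} h_{p_\varepsilon}(y) + \lambda^*(B) \cdot \sup_{z \in B} h_{p_\varepsilon}(z) \le \varepsilon (b - a) + \lambda^*(B)(M - m) \le \varepsilon (b - a + M - m). \]

Since $h_{p_\varepsilon} \ge h_p \ge 0$, for all $p \ge p_\varepsilon$, we get the inequalities

\[ 0 \le \int_a^b h_p(x)\, dx \le \varepsilon (b - a + M - m), \quad \forall\, p \ge p_\varepsilon. \]

The above argument proves that $\lim_{p \to \infty} \int_a^b h_p(x)\, dx = 0$, i.e.

\[ \lim_{p \to \infty} \int_a^b \big[ f^{\Delta_p}(x) - f_{\Delta_p}(x) \big]\, dx = 0. \]

By Proposition 6.6, it follows that $f$ is Riemann integrable.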
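The simplest instance of the criterion is a function with a single jump, whose discontinuity set is one point. The sketch below is an illustration for the indicator function of [c, b] with a < c ≤ b and equal subdivisions; the name `step_gap` is hypothetical.

```python
from fractions import Fraction


def step_gap(c, a, b, n):
    """U(Δ, f) - L(Δ, f) for f = indicator of [c, b] and n equal pieces of [a, b].

    On a cell [x_{k-1}, x_k] the supremum of f is 1 exactly when x_k >= c and
    the infimum is 1 exactly when x_{k-1} >= c, so only the cell that sees the
    jump at c contributes to the gap.
    """
    xs = [a + (b - a) * Fraction(k, n) for k in range(n + 1)]
    gap = Fraction(0)
    for k in range(1, n + 1):
        sup_f = 1 if xs[k] >= c else 0
        inf_f = 1 if xs[k - 1] >= c else 0
        gap += (xs[k] - xs[k - 1]) * (sup_f - inf_f)
    return gap


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, step_gap(Fraction(1, 3), Fraction(0), Fraction(1), n))
```

Exactly one cell straddles the jump, so the gap equals the mesh (b − a)/n and tends to 0, in agreement with the criterion for a one-point discontinuity set.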

Exercise 13. Prove that a Riemann integrable function $f : [a, b] \to \mathbb{R}$ is Lebesgue measurable.

Hint: Use a sequence of partitions $(\Delta_p)_{p=1}^\infty$, with $\Delta_1 \subset \Delta_2 \subset \dots$, and $\lim_{p \to \infty} |\Delta_p| = 0$. Use the arguments given in the proof of the implication (ii) ⇒ (i), to find a negligible set $N \subset [a, b]$, such that

\[ \lim_{p \to \infty} f_{\Delta_p}(x) = f(x), \quad \forall\, x \in [a, b] \setminus N. \]

The sequence $(f_{\Delta_p})_{p=1}^\infty$ is non-decreasing, so it has a point-wise limit, say $g$, which is Lebesgue measurable. Use the fact that

\[ f(x) = g(x), \quad \forall\, x \in [a, b] \setminus N, \]

to show that $f$ itself is Lebesgue measurable.

Exercise 14. Let $K \subset [0, 1]$ be a compact set with $K \cap \mathbb{Q} = \varnothing$, and $\lambda(K) > 0$ (see Exercise 7 for the existence of such sets). Prove that the characteristic function $f = \kappa_K : [0, 1] \to \mathbb{R}$ is not Riemann integrable. In fact, $f$ cannot be almost Riemann integrable either.

Hint: Examine the discontinuity set $D_f$, and prove that $K \subset D_f$.




Exercise 15. Let f_n : [a, b] → R, n ≥ 1, be a sequence of Riemann integrable functions. Consider the product space P = ∏_{n=1}^∞ Ran f_n, equipped with the product topology (the sets Ran f_n, n ≥ 1, are equipped with the topology induced from R), and the function F : [a, b] → P, defined by F(x) = (f_n(x))_{n=1}^∞. Prove that, for every bounded continuous function g : P → R, the composition g ∘ F : [a, b] → R is Riemann integrable. In other words, the result of a bounded continuous operation, involving a sequence of Riemann integrable functions, is again a Riemann integrable function.

Hint: Examine the relationship between the discontinuity set D_{g∘F} and the discontinuity sets D_{f_n}, n ≥ 1.

Exercise 16. Let M be an arbitrary subset of [a, b], and let f : [a, b] → R be a Riemann integrable function, such that f ≤ κ_M. Prove the inequality

\[ \int_a^b f(x)\,dx \le \lambda^*(M). \]

Hint: Consider the function g : [a, b] → R defined by g(x) = max{f(x), 1}. Then f ≥ g ≥ κ_M, and g is still Riemann integrable. Apply Lemma 6.1 (the first inequality) to the function 1 − g.

Exercise 17*. Let f : [a, b] → R be a bounded function. Prove that the following are equivalent:

(i) f is Riemann integrable;
(ii) for every ε > 0, there exist continuous functions g, h : [a, b] → R with g ≥ f ≥ h, and ∫_a^b [g(x) − h(x)] dx < ε;
(iii) for every ε > 0, there exist Riemann integrable functions g, h : [a, b] → R with g ≥ f ≥ h, and ∫_a^b [g(x) − h(x)] dx < ε.

Hints: For the implication (i) ⇒ (ii) analyze first the particular case when f = κ_J, with J a sub-interval of [a, b]. Then analyze the functions of the type f^∆ and f_∆. For the implication (iii) ⇒ (i), analyze the relationship among lower/upper Darboux sums of f, g and h.

Comment. The statement of Theorem 6.1 shows that, apart from trivial cases, the problem of checking that a function f : [a, b] → R is Riemann integrable is a rather difficult one. The main difficulty arises from the fact that, if N ⊂ [a, b] is a negligible set, and the restriction f|_{[a,b]\N} is continuous, then f need not be continuous at all points in [a, b] \ N. For instance, if we consider the characteristic function f = κ_{Q∩[a,b]} of the rationals in [a, b], and N = Q ∩ [a, b], then clearly N is negligible, f|_{[a,b]\N} is continuous (because it is constant zero), but D_f = [a, b].

As suggested earlier, in the hope that such an anomaly can be eliminated, it is reasonable to consider the slightly weaker notion of almost Riemann integrability. In the remainder of this section, we take a closer look at this notion, and we will eventually show (see Theorem 6.2) that this indeed removes the above anomaly.
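The anomaly can be made concrete with a short computation. The sketch below is an illustration only; the symbols U(∆, f) and L(∆, f) for the upper and lower Darboux sums are notation assumed here and need not match the one used earlier in these notes. It also relies on the meaning of "almost Riemann integrable" used in the proof of Theorem 6.2, namely equality a.e. with a Riemann integrable function.

```latex
% Worked example: f = \kappa_{\mathbb{Q}\cap[a,b]} with a < b.
% Every sub-interval [x_{k-1}, x_k] of a partition
% \Delta = (a = x_0 < x_1 < \dots < x_n = b)
% contains both rational and irrational points, so
\[
  \sup_{[x_{k-1},x_k]} f = 1, \qquad \inf_{[x_{k-1},x_k]} f = 0 .
\]
% Hence, for every partition \Delta,
\[
  U(\Delta,f) = \sum_{k=1}^{n} 1\cdot(x_k - x_{k-1}) = b-a,
  \qquad
  L(\Delta,f) = \sum_{k=1}^{n} 0\cdot(x_k - x_{k-1}) = 0 ,
\]
% so f is not Riemann integrable. On the other hand,
\[
  f(x) = 0 \quad \forall\, x \in [a,b]\setminus(\mathbb{Q}\cap[a,b]),
\]
% and \mathbb{Q}\cap[a,b] is countable, hence negligible. Thus f = g a.e.,
% with g \equiv 0 Riemann integrable: f is almost Riemann integrable.
```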

We begin with an “almost” version of Exercise 17.

Lemma 6.2. For a function f : [a, b] → R, the following are equivalent:

(i) f is almost Riemann integrable;
(ii) for every ε > 0, there exist continuous functions g, h : [a, b] → R with g ≥ f ≥ h a.e., and ∫_a^b [g(x) − h(x)] dx < ε;
(iii) for every ε > 0, there exist Riemann integrable functions g, h : [a, b] → R with g ≥ f ≥ h a.e., and ∫_a^b [g(x) − h(x)] dx < ε.




Proof. The implication (i) ⇒ (iii) is trivial.

The implication (iii) ⇒ (ii) follows from Exercise 17.

We now prove (ii) ⇒ (i). Assume f has property (ii). For each integer n ≥ 1, choose continuous functions g_n, h_n : [a, b] → R, such that g_n ≥ f ≥ h_n, a.e., and ∫_a^b [g_n(x) − h_n(x)] dx ≤ 1/n. Define the functions G_n, H_n : [a, b] → R, n ≥ 1, by

\[ G_n(x) = \min\{g_1(x),\dots,g_n(x)\}, \qquad H_n(x) = \max\{h_1(x),\dots,h_n(x)\}. \]

It is clear that

(α) G_m ≥ f ≥ H_n, a.e., ∀ m, n ≥ 1;
(β) G_1 ≥ G_2 ≥ . . . and H_1 ≤ H_2 ≤ . . . ;
(γ) ∫_a^b [G_n(x) − H_n(x)] dx ≤ ∫_a^b [g_n(x) − h_n(x)] dx ≤ 1/n, ∀ n ≥ 1.

Notice that, since the G_m’s and the H_n’s are continuous, by Exercise ??, we also have

(α′) G_m ≥ H_n (everywhere!), ∀ m, n ≥ 1.

Use (β) to define the functions G, H : [a, b] → R, by

\[ G(x) = \lim_{n\to\infty} G_n(x) \quad\text{and}\quad H(x) = \lim_{n\to\infty} H_n(x), \quad \forall\, x \in [a,b], \]

so by (α′) we clearly have G_n ≥ G ≥ H ≥ H_n, ∀ n ≥ 1. Using then (γ), by Exercise 17 it follows that both G and H are Riemann integrable. Moreover, we have G − H ≥ 0 and

\[ 0 \le \int_a^b [G(x) - H(x)]\,dx \le \int_a^b [G_n(x) - H_n(x)]\,dx \le 1/n, \quad \forall\, n \ge 1, \]

which forces ∫_a^b [G(x) − H(x)] dx = 0, so by Exercise ??, we get G = H, a.e. By (α) it follows that f = G, a.e., so f is indeed almost Riemann integrable.

We are now in position to prove the “almost” version of Theorem 6.1.

Theorem 6.2. Let f : [a, b] → R be a bounded function. The following are equivalent:

(i) f is almost Riemann integrable;
(ii) there exists a negligible set N ⊂ [a, b] such that the restriction f|_{[a,b]\N} is continuous.

Proof. (i) ⇒ (ii). Assume f is almost Riemann integrable, so there exists a Riemann integrable function g : [a, b] → R, such that f = g, a.e. By Theorem 6.1, the discontinuity set D_g is negligible. Take

\[ M = \{ x \in [a,b] : f(x) \ne g(x) \}. \]

Since f = g, a.e., the set M is negligible, and so is the set N = M ∪ D_g. On the one hand, since D_g ⊂ N, the restriction g|_{[a,b]\N} is continuous. On the other hand, since M ⊂ N, we have f|_{[a,b]\N} = g|_{[a,b]\N}, so (ii) follows.

(ii) ⇒ (i). We are going to imitate the proof of Theorem 6.1, with some minor modifications. Fix N ⊂ [a, b] negligible, such that f|_{[a,b]\N} is continuous. Fix also a sequence (∆_p)_{p=1}^∞ of partitions, with ∆_1 ⊂ ∆_2 ⊂ . . . , and lim_{p→∞} |∆_p| = 0. Put S = ⋃_{p=1}^∞ ∆_p. Since S is countable, the set N ∪ S is still negligible. We put T = [a, b] \ (N ∪ S), and we define the analogues of the functions f_{∆_p} and f^{∆_p} as




follows. Write each partition as ∆_p = (a = x^p_0 < x^p_1 < · · · < x^p_{n_p} = b), and define, for each k ∈ {1, . . . , n_p}, the numbers

\[ M^p_k = \sup\{ f(t) : t \in [x^p_{k-1}, x^p_k] \cap T \} \quad\text{and}\quad m^p_k = \inf\{ f(t) : t \in [x^p_{k-1}, x^p_k] \cap T \}. \]

We then define, for each p ≥ 1, the functions

\[ g_p = m^p_1 \cdot \kappa_{[x^p_0, x^p_1]} + m^p_2 \cdot \kappa_{(x^p_1, x^p_2]} + \dots + m^p_{n_p} \cdot \kappa_{(x^p_{n_p-1}, x^p_{n_p}]}, \]

\[ g^p = M^p_1 \cdot \kappa_{[x^p_0, x^p_1]} + M^p_2 \cdot \kappa_{(x^p_1, x^p_2]} + \dots + M^p_{n_p} \cdot \kappa_{(x^p_{n_p-1}, x^p_{n_p}]}. \]

Note that we have the inequalities g^p(x) ≥ f(x) ≥ g_p(x), ∀ x ∈ T, which give

(24) g^p ≥ f ≥ g_p, a.e., ∀ p ≥ 1.

It is obvious that g^p and g_p, p ≥ 1, are all Riemann integrable. We are now going to estimate the integrals ∫_a^b [g^p(x) − g_p(x)] dx. Put h_p = g^p − g_p, p ≥ 1. First we observe that, since f|_T is continuous, and T ∩ ∆_p = ∅, ∀ p ≥ 1, we clearly have the equalities lim_{p→∞} g^p(x) = lim_{p→∞} g_p(x) = f(x), ∀ x ∈ T, which give

(25) lim_{p→∞} h_p(x) = 0, ∀ x ∈ T.

Fix some ε > 0, and use regularity from above, to find an open set D with D ⊃ N ∪ S and λ(D) < ε. Take the compact set A = [a, b] \ D. Note that f|_A is continuous, since A ⊂ [a, b] \ N. Note also that, since A ⊂ [a, b] \ S, the functions g^p|_A and g_p|_A are also continuous, and so will be h_p|_A, for every p ≥ 1. Since (g^p(x))_{p=1}^∞ is non-increasing, and (g_p(x))_{p=1}^∞ is non-decreasing, for all x, it follows that the sequence (h_p)_{p=1}^∞ is monotone, so by Dini’s Theorem, (25) gives

\[ \lim_{p\to\infty} \Big[ \max_{x\in A} h_p(x) \Big] = 0. \]

In particular, there exists some p_ε ≥ 1, such that

(26) h_p(x) ≤ ε, ∀ p ≥ p_ε, x ∈ A.

Put B = [a, b] \ A, and take M = sup_{x∈[a,b]} f(x) and m = inf_{x∈[a,b]} f(x). Using the inclusion B ⊂ D, we get λ*(B) ≤ λ(D) ≤ ε, so by Lemma 6.1 (the functions h_p, p ≥ 1, are clearly non-negative), combined with (26), we get

\[ \int_a^b h_p(x)\,dx \le (b-a)\cdot\sup_{x\in A} h_p(x) + \lambda^*(B)\cdot\sup_{x\in B} h_p(x) \le (b-a)\varepsilon + \lambda^*(B)(M-m) \le \varepsilon(b-a+M-m), \quad \forall\, p \ge p_\varepsilon. \]

This estimate then proves that lim_{p→∞} ∫_a^b h_p(x) dx = 0, i.e.

\[ \lim_{p\to\infty} \int_a^b \big[ g^p(x) - g_p(x) \big]\,dx = 0. \]

Combining this with (24), and applying Lemma 6.2, yields the fact that f is almost Riemann integrable.

Comment. The hypothesis that f is bounded can be replaced with a slightly weaker one, which assumes that f is almost bounded, meaning that there exists a negligible set U ⊂ [a, b], such that f|_{[a,b]\U} is bounded.

Exercise 18. Let f_n : [a, b] → R, n ≥ 1, be almost Riemann integrable functions, such that

(i) f_n ≥ f_{n+1} ≥ 0, a.e., ∀ n ≥ 1;
(ii) lim_{n→∞} f_n(x) = 0, for “almost all” x ∈ [a, b], i.e. there exists a negligible set N ⊂ [a, b], such that lim_{n→∞} f_n(x) = 0, ∀ x ∈ [a, b] \ N.

Prove that

limn→∞

≈ ba

f n(x) dx = 0.



Lectures 26-29

7. Measure theory on locally compact spaces

Earlier in this chapter we discussed the construction of (outer) measures, starting with more primitive objects: semiring measures. The main application was the construction of the (outer) Lebesgue measure on R^n. In this section we describe an alternative construction, which has as its starting point another primitive object: a regular content. The idea is again to start with the measure defined on a “small” class of sets, extend it to an outer measure, and then use the Caratheodory construction. Among other applications, we will get an alternative construction of the (outer) Lebesgue measure on R^n.

Definition. Let X be a locally compact space. Denote by C_X the collection of all compact subsets of X. A content on X is a map ω : C_X → [0, ∞), with the following properties:

(i) ω(∅) = 0;
(ii) if K, L ∈ C_X are such that K ⊂ L, then ω(K) ≤ ω(L);
(iii) ω(K ∪ L) ≤ ω(K) + ω(L), for all K, L ∈ C_X;
(iv) ω(K ∪ L) = ω(K) + ω(L), for all K, L ∈ C_X, with K ∩ L = ∅.

Comments. Note that ω takes finite values. The collection C_X does not have any nice set-arithmetic properties, except for the following: (i) the union of any finite collection of sets in C_X is again in C_X; (ii) an arbitrary intersection of sets in C_X is again in C_X.

Examples 7.1. A. If µ is a measure on Bor(X), then the restriction µ|_{C_X} is a content.

B. Take X = R, and for a compact subset K ⊂ R, define

\[ \omega(K) = \begin{cases} 1 & \text{if } 0 \in \operatorname{Int}(K) \\ 0 & \text{if } 0 \notin \operatorname{Int}(K) \end{cases} \]

It is obvious that ω is a content on R. Notice however that if we consider the compact sets K_n = [−1/n, 1/n], then ω(⋂_{n=1}^∞ K_n) = 0, but ω(K_n) = 1, ∀ n ≥ 1. This shows that, in general, a content cannot be extended to a measure on Bor(X).
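The reason why Example B rules out an extension can be spelled out in one line. The sketch below assumes a hypothetical measure µ on Bor(R) that agrees with ω on compact sets, and uses only the standard continuity from above of measures on decreasing sequences of sets of finite measure.

```latex
% Suppose \mu is a measure on Bor(R) with \mu(K) = \omega(K) for all compact K.
% The sets K_n = [-1/n, 1/n] decrease to \{0\}, and \mu(K_1) = \omega(K_1) = 1 < \infty,
% so continuity from above gives
\[
  \mu(\{0\}) \;=\; \mu\Big(\bigcap_{n=1}^{\infty} K_n\Big)
  \;=\; \lim_{n\to\infty} \mu(K_n)
  \;=\; \lim_{n\to\infty} \omega(K_n) \;=\; 1 .
\]
% But \{0\} is compact with empty interior, so
\[
  \mu(\{0\}) = \omega(\{0\}) = 0 ,
\]
% a contradiction. Hence no such \mu exists.
```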

One useful property, which will be invoked several times in this section, is contained in the following:

Exercise 1. Let X be a locally compact space, let K ⊂ X be compact, and let D_1, D_2 ⊂ X be open subsets, with K ⊂ D_1 ∪ D_2. Show there exist compact sets K_1 and K_2, such that K_1 ⊂ D_1, K_2 ⊂ D_2, and K = K_1 ∪ K_2.

As Example 7.1.B suggests, one obstruction for the extendability of a content on X, to a measure on Bor(X), is its behaviour with respect to interiors. The following notion isolates an important property, which will be shown to be sufficient for the extendability property.





Definition. Let X be a locally compact space. A content ω on X is said to be regular, if for any K ∈ C_X, one has the equality

\[ \omega(K) = \inf\{ \omega(L) : L \in \mathcal{C}_X,\ \operatorname{Int}(L) \supset K \}. \]

The following exercise shows how the lack of regularity can always be repaired.

Exercise 2. Let X be a locally compact space, and let ω be a content on X. Define ω̃ : C_X → [0, ∞), by

\[ \tilde\omega(K) = \inf\{ \omega(L) : L \in \mathcal{C}_X,\ \operatorname{Int}(L) \supset K \}, \quad \forall\, K \in \mathcal{C}_X. \]

Prove that:

(i) ω̃ is a regular content on X;
(ii) ω̃(K) ≥ ω(K), ∀ K ∈ C_X;
(iii) if η is a regular content on X, with η(K) ≥ ω(K), ∀ K ∈ C_X, then η(K) ≥ ω̃(K), ∀ K ∈ C_X;
(iv) ω is regular, if and only if ω = ω̃.

Definition. With the notations from Exercise 2, the regular content ω̃ is called the regularization of ω.
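As an illustration of the definition, the regularization of the map ω from Example 7.1.B can be computed by hand. In the sketch below the regularization is written \tilde\omega, and the auxiliary set L_δ is introduced only for this example.

```latex
% Take X = R and \omega(K) = 1 if 0 \in Int(K), \omega(K) = 0 otherwise.
% Case 0 \in K: every compact L with Int(L) \supset K has 0 \in Int(L), so
\[
  \tilde\omega(K) = \inf\{\omega(L) : \operatorname{Int}(L)\supset K\} = 1 .
\]
% Case 0 \notin K: K is compact, so \delta = \operatorname{dist}(0,K) > 0
% (if K = \emptyset, take L = \emptyset). The compact set
\[
  L_\delta = \{x \in \mathbb{R} : \operatorname{dist}(x,K) \le \delta/2\}
\]
% satisfies Int(L_\delta) \supset K and 0 \notin L_\delta, hence
\[
  \tilde\omega(K) \le \omega(L_\delta) = 0 .
\]
% Conclusion: \tilde\omega(K) = 1 if 0 \in K, and \tilde\omega(K) = 0 if 0 \notin K,
% i.e. \tilde\omega is the restriction to compact sets of the Dirac measure at 0.
```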

Theorem 7.1. Let X be a locally compact space, and let ω be a content on X. Denote by T_X the collection of all open subsets of X. Define the map ω̂ : T_X → [0, ∞] by

\[ \hat\omega(D) = \sup\{ \omega(K) : K \in \mathcal{C}_X,\ K \subset D \}, \quad \forall\, D \in \mathcal{T}_X, \]

and define the map ω* : P(X) → [0, ∞], by

\[ \omega^*(A) = \inf\{ \hat\omega(D) : D \in \mathcal{T}_X,\ D \supset A \}, \quad \forall\, A \subset X. \]

Then ω* is an outer measure on X.

Proof. We begin by collecting the useful properties of the map ω̂.

Claim: The map ω̂ has the following properties:

(i) ω̂(∅) = 0;
(ii) ω̂ is monotone, i.e. whenever D, E ∈ T_X satisfy D ⊂ E, it follows that ω̂(D) ≤ ω̂(E);
(iii) ω̂ is σ-sub-additive, i.e., for any sequence (D_n)_{n=1}^∞ ⊂ T_X, one has the inequality ω̂(⋃_{n=1}^∞ D_n) ≤ Σ_{n=1}^∞ ω̂(D_n).

Properties (i) and (ii) are trivial.

To prove property (iii), let us start with some sequence (D_n)_{n=1}^∞ of open sets, and let us denote for simplicity the union ⋃_{n=1}^∞ D_n by D. Start with some arbitrary compact set K ⊂ D. Using compactness, there exists some index p ≥ 1, such that K ⊂ D_1 ∪ D_2 ∪ · · · ∪ D_p. Use Exercise 1 (and induction) to find compact sets K_1 ⊂ D_1, K_2 ⊂ D_2, . . . , K_p ⊂ D_p, such that K = K_1 ∪ K_2 ∪ · · · ∪ K_p. We then clearly have the inequalities

\[ \omega(K) \le \sum_{n=1}^{p} \omega(K_n) \le \sum_{n=1}^{p} \hat\omega(D_n) \le \sum_{n=1}^{\infty} \hat\omega(D_n). \]

Since we have

\[ \omega(K) \le \sum_{n=1}^{\infty} \hat\omega(D_n), \quad \text{for all } K \in \mathcal{C}_X \text{ with } K \subset D, \]

by the definition of ω̂, we immediately get

\[ \hat\omega(D) \le \sum_{n=1}^{\infty} \hat\omega(D_n). \]

Having proven the Claim, we now check the conditions in the definition of an outer measure. It is clear that ω*(∅) = 0. It is also clear, from the definition, and property (ii) from the Claim, that

\[ A \subset B \implies \omega^*(A) \le \omega^*(B). \]

Finally, we need to show σ-sub-additivity, i.e.

\[ \text{(1)} \qquad \omega^*\Big( \bigcup_{n=1}^{\infty} A_n \Big) \le \sum_{n=1}^{\infty} \omega^*(A_n). \]

Start with some sequence (A_n)_{n=1}^∞ of subsets of X. Of course, if one of the terms in the right hand side of (1) is infinite, there is nothing to prove. Assume that ω*(A_n) < ∞, ∀ n ≥ 1. Fix some ε > 0, and choose, for each n ≥ 1, an open set D_n ⊃ A_n, such that ω̂(D_n) ≤ ω*(A_n) + ε/2^n. Put D = ⋃_{n=1}^∞ D_n. Using part (iii) of the Claim, we have

\[ \hat\omega(D) \le \sum_{n=1}^{\infty} \hat\omega(D_n) \le \sum_{n=1}^{\infty} \Big[ \omega^*(A_n) + \frac{\varepsilon}{2^n} \Big] = \varepsilon + \sum_{n=1}^{\infty} \omega^*(A_n). \]

Since we obviously have the inclusion D ⊃ ⋃_{n=1}^∞ A_n, the above inequality gives

\[ \omega^*\Big( \bigcup_{n=1}^{\infty} A_n \Big) \le \hat\omega(D) \le \varepsilon + \sum_{n=1}^{\infty} \omega^*(A_n). \]

Since this holds for all ε > 0, the inequality (1) follows.

Definition. Let X be a locally compact space, and let ω be a content on X. The outer measure ω* on X, defined in Theorem 7.1, is called the outer measure induced by ω.

Remarks 7.1. Let X be a locally compact space, let ω be a content on X, and let ω* be the outer measure induced by ω.

A. The map ω̂ : T_X → [0, ∞], defined in the statement of Theorem 7.1, is given by ω̂ = ω*|_{T_X}. To see that this is the case, start with some open set D. On the one hand, by the definition of ω*, we know that

\[ \omega^*(D) = \inf\{ \hat\omega(E) : E \in \mathcal{T}_X,\ E \supset D \}, \]

which (using E = D) immediately gives the inequality ω*(D) ≤ ω̂(D). On the other hand, using property (ii) from the Claim stated in the proof, we also know that

\[ \hat\omega(E) \ge \hat\omega(D), \quad \text{for all } E \in \mathcal{T}_X \text{ with } E \supset D, \]

which gives the reverse inequality, ω*(D) ≥ ω̂(D).

B. As a consequence of the above remark, we get the fact that ω* is regular from above, with respect to the collection T_X of all open sets in X, i.e.

\[ \omega^*(A) = \inf\{ \omega^*(D) : D \in \mathcal{T}_X,\ D \supset A \}, \quad \forall\, A \subset X. \]




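C. If one denotes by ω̃ the regularization of ω (see Exercise 2), then the outer measure induced by ω̃ coincides with ω*. In fact, if we apply the construction from Theorem 7.1 to both ω and ω̃, we have the equality

\[ \hat\omega = \hat{\tilde\omega}. \]

Indeed, on the one hand, since we have the inequality

\[ \tilde\omega(K) \ge \omega(K), \quad \forall\, K \in \mathcal{C}_X, \]

it follows immediately by the definitions, that

\[ \hat{\tilde\omega}(D) \ge \hat\omega(D), \quad \forall\, D \in \mathcal{T}_X. \]

To prove the other inequality, fix some open set D ⊂ X. Suppose K ⊂ D is some compact subset. Using the well-known properties of locally compact spaces, there exists some compact set L, with

\[ K \subset \operatorname{Int}(L) \subset L \subset D, \]

so by the definitions of ω̂ and ω̃, we get

\[ \hat\omega(D) \ge \omega(L) \ge \tilde\omega(K). \]

Since we have the inequality

\[ \hat\omega(D) \ge \tilde\omega(K), \quad \text{for all } K \in \mathcal{C}_X \text{ with } K \subset D, \]

taking supremum in the right hand side yields

\[ \hat\omega(D) \ge \sup\{ \tilde\omega(K) : K \in \mathcal{C}_X,\ K \subset D \} = \hat{\tilde\omega}(D). \]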

Proposition 7.1. Let X be a locally compact space, let ω be a content on X, and let ω* be the outer measure induced by ω. If we denote by ω̃ the regularization of ω, then one has the equality

\[ \omega^*\big|_{\mathcal{C}_X} = \tilde\omega. \]

Proof. Using Remark 7.1.C, we can assume that ω is regular, and in this case we need to prove that ω*|_{C_X} = ω. Start with some compact set K ⊂ X. By the definition of ω*, using the notations from Theorem 7.1, we know that

\[ \text{(2)} \qquad \omega^*(K) = \inf\{ \hat\omega(D) : D \in \mathcal{T}_X,\ D \supset K \}. \]

It is clear that, for every open set D ⊃ K, we have the inequality

\[ \hat\omega(D) \ge \omega(K), \]

so taking infimum in the left hand side, and using (2), immediately gives the inequality

\[ \omega^*(K) \ge \omega(K). \]

To prove the reverse inequality, we start by fixing ε > 0, and we use regularity to find some compact set L with K ⊂ Int(L), and ω(L) ≤ ω(K) + ε. Consider the open set D = Int(L). For every compact set F ⊂ D, we have the obvious inclusion F ⊂ L, which gives ω(F) ≤ ω(L). Taking supremum over all compact sets F ⊂ D then gives ω̂(D) ≤ ω(L). By the choice of L, by the definition of ω*, and using the inclusion D ⊃ K, we then get

\[ \omega^*(K) \le \hat\omega(D) \le \omega(L) \le \omega(K) + \varepsilon. \]

Since the inequality

\[ \omega^*(K) \le \omega(K) + \varepsilon \]

holds for all ε > 0, we then must have ω*(K) ≤ ω(K).
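To see Proposition 7.1 at work, one can compute the induced outer measure for the map ω of Example 7.1.B. The sketch below is only an illustration; it writes \hat\omega for the map on open sets from Theorem 7.1 and \tilde\omega for the regularization of ω.

```latex
% X = R, \omega(K) = 1 if 0 \in Int(K), and \omega(K) = 0 otherwise.
% Step 1 (open sets). If D is open and 0 \in D, pick r > 0 with [-r,r] \subset D;
% then \omega([-r,r]) = 1. If 0 \notin D, no compact K \subset D has 0 \in Int(K). So
\[
  \hat\omega(D) = \sup\{\omega(K) : K \subset D \text{ compact}\}
  = \begin{cases} 1 & \text{if } 0 \in D \\ 0 & \text{if } 0 \notin D. \end{cases}
\]
% Step 2 (arbitrary sets). If 0 \in A, every open D \supset A contains 0.
% If 0 \notin A, the open set D = \mathbb{R}\setminus\{0\} contains A. So
\[
  \omega^*(A) = \inf\{\hat\omega(D) : D \supset A \text{ open}\}
  = \begin{cases} 1 & \text{if } 0 \in A \\ 0 & \text{if } 0 \notin A. \end{cases}
\]
% Step 3 (compact sets). In particular \omega^*(\{0\}) = 1 \ne 0 = \omega(\{0\}),
% so \omega^* restricted to compact sets is not \omega; it is the map
% K \mapsto 1 if 0 \in K, 0 otherwise, which is the regularization \tilde\omega.
```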




The above result gives a nice characterization for the regularity of a content, in terms of the induced outer measure.

Corollary 7.1. Let X be a locally compact space. A content ω is regular, if and only if ω*|_{C_X} = ω.

Proof. Immediate from Proposition 7.1 and Exercise 2.

Theorem 7.2. Let X be a locally compact space, let ω be a content on X, and let ω* be the outer measure induced by ω. Then every open set D ⊂ X is ω*-measurable.

Proof. Fix an open set D ⊂ X. We need to prove (see Section 5) that D “sharply cuts” every subset of X, which is equivalent to the fact that, for every A ⊂ X, one has the inequality:

\[ \text{(3)} \qquad \omega^*(A) \ge \omega^*(A \cap D) + \omega^*(A \setminus D). \]

This will be shown in several steps.

Claim 1: For any open set E ⊂ X, and any compact set K ⊂ E, one has the inequality

\[ \omega^*(E) \ge \omega(K) + \omega^*(E \setminus K). \]

To prove this inequality, we first note that, since both E and E \ K are open, by Remark 7.1.A, we have the equalities ω*(E) = ω̂(E) and ω*(E \ K) = ω̂(E \ K), where ω̂ : T_X → [0, ∞] is the map defined in the statement of Theorem 7.1. If L ⊂ E \ K is an arbitrary compact set, then we obviously have K ∩ L = ∅, so using the inclusion K ∪ L ⊂ E, we get

\[ \omega(K) + \omega(L) = \omega(K \cup L) \le \hat\omega(E) = \omega^*(E), \]

which then gives

\[ \omega^*(E) - \omega(K) \ge \omega(L), \quad \text{for all } L \in \mathcal{C}_X \text{ with } L \subset E \setminus K. \]

Taking supremum in the right hand side then gives

\[ \omega^*(E) - \omega(K) \ge \sup\{ \omega(L) : L \in \mathcal{C}_X,\ L \subset E \setminus K \} = \hat\omega(E \setminus K) = \omega^*(E \setminus K), \]

and the Claim follows.

Claim 2: The inequality (3) holds for all open subsets A ⊂ X.

Assume A is open. If the left hand side of (3) is infinite, there is nothing to prove. Assume ω*(A) < ∞, so both ω*(A ∩ D) and ω*(A \ D) are also finite. Since A ∩ D is open, we have

\[ \text{(4)} \qquad \omega^*(A \cap D) = \hat\omega(A \cap D) = \sup\{ \omega(K) : K \in \mathcal{C}_X,\ K \subset A \cap D \}. \]

Fix for the moment a compact subset K ⊂ A ∩ D. Using Claim 1 we have the inequality

\[ \omega^*(A) \ge \omega(K) + \omega^*(A \setminus K). \]

Since we obviously have the inclusion A \ K ⊃ A \ D, the above inequality gives ω*(A) ≥ ω(K) + ω*(A \ D), which can be re-written as

\[ \omega^*(A) - \omega^*(A \setminus D) \ge \omega(K), \quad \text{for all } K \in \mathcal{C}_X \text{ with } K \subset A \cap D. \]

Taking supremum in the right hand side, and using (4), we immediately get the desired inequality (3).




We now proceed with the proof of (3) for arbitrary A’s. Fix A, and consider an arbitrary open set E ⊃ A. By Claim 2, we have

\[ \omega^*(E) \ge \omega^*(E \cap D) + \omega^*(E \setminus D). \]

Using the obvious inclusions E ∩ D ⊃ A ∩ D and E \ D ⊃ A \ D, we then get

\[ \omega^*(E) \ge \omega^*(A \cap D) + \omega^*(A \setminus D). \]

The desired inequality (3) follows now by taking infimum in the left hand side, and using Remark 7.1.B.

The most important consequence of Theorem 7.2 is the following.

Corollary 7.2. Let X be a locally compact space, and let ω be a regular content on X. Then ω can be extended uniquely to a measure µ_ω on Bor(X), with the following properties.

(i) µ_ω is regular from above, with respect to the collection T_X of all open sets, that is

\[ \mu_\omega(B) = \inf\{ \mu_\omega(D) : D \in \mathcal{T}_X,\ D \supset B \}, \quad \forall\, B \in \operatorname{Bor}(X); \]

(ii) for every open set D ⊂ X, one has the equality

\[ \mu_\omega(D) = \sup\{ \mu_\omega(K) : K \in \mathcal{C}_X,\ K \subset D \}. \]

Conversely, if µ is a measure on Bor(X) with properties (i) and (ii), and such that µ(K) < ∞, ∀ K ∈ C_X, then µ|_{C_X} is a regular content.

Proof. If we denote by m_{ω*}(X) the σ-algebra of ω*-measurable sets, then Theorem 7.2 gives the inclusion Bor(X) ⊂ m_{ω*}(X), so the existence follows by taking µ_ω = ω*|_{Bor(X)}. The fact that µ_ω has properties (i) and (ii) is trivial, by construction and by Remarks 7.1 and Proposition 7.1.

The uniqueness is trivial, since property (ii) uniquely defines µ_ω on open sets, and (i) then uniquely defines µ_ω on all Borel sets.

To prove the second assertion, assume µ is a measure on Bor(X) with properties (i) and (ii), finite on compact sets, and let us show that ω = µ|_{C_X} is a regular content. The fact that ω is a content is trivial, so the only thing we must show is regularity. Fix some compact set K ⊂ X. It is clear that

\[ \omega(K) \le \inf\{ \omega(L) : L \in \mathcal{C}_X,\ K \subset \operatorname{Int}(L) \}. \]

To prove the reverse inequality, we use property (i), to find, for each ε > 0, an open set D_ε ⊃ K, such that µ(D_ε) ≤ µ(K) + ε. If we choose, for each ε > 0, a compact set L_ε, such that

\[ K \subset \operatorname{Int}(L_\varepsilon) \subset L_\varepsilon \subset D_\varepsilon, \]

then we obviously have

\[ \mu(K) \le \mu(L_\varepsilon) \le \mu(D_\varepsilon) \le \mu(K) + \varepsilon, \]

so we get the inequality

\[ \inf\{ \omega(L) : L \in \mathcal{C}_X,\ K \subset \operatorname{Int}(L) \} \le \omega(K) + \varepsilon. \]

Since this holds for all ε > 0, we get in fact the inequality

\[ \inf\{ \omega(L) : L \in \mathcal{C}_X,\ K \subset \operatorname{Int}(L) \} \le \omega(K), \]

and we are done.




Definition. Let X be a locally compact space. A Radon measure on X is a measure µ on Bor(X) with the following properties:

(i) µ(K) < ∞, for all compact sets K ⊂ X;
(ii) for every open set D one has

\[ \mu(D) = \sup\{ \mu(K) : K \subset D,\ K \text{ compact} \}; \]

(iii) for every Borel set B one has

\[ \mu(B) = \inf\{ \mu(D) : D \supset B,\ D \text{ open} \}. \]

By Corollary 7.2, the map ω ⟼ µ_ω establishes a bijective correspondence between the set of all regular contents on X, and the set of all Radon measures on X. For a regular content ω, the measure µ_ω is called the Radon measure extension of ω.
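Two quick test cases may help in reading conditions (i), (ii), (iii). They are standard examples, added here only as a sketch; the symbols δ_0 (Dirac measure at 0) and c (counting measure) are notation assumed for this illustration.

```latex
% Take X = R with the usual topology.
% Example 1: the Dirac measure \delta_0(B) = 1 if 0 \in B, and 0 otherwise.
%   (i)   \delta_0(K) \le 1 < \infty for every compact K.
%   (ii)  If D is open and 0 \in D, the compact set K = \{0\} \subset D gives
\[
  \delta_0(D) = 1 = \delta_0(\{0\}) ;
\]
%         if 0 \notin D, both sides of (ii) are 0.
%   (iii) If B is Borel and 0 \notin B, the open set D = \mathbb{R}\setminus\{0\} \supset B gives
\[
  \delta_0(B) = 0 = \delta_0(\mathbb{R}\setminus\{0\}) ;
\]
%         if 0 \in B, every open D \supset B has \delta_0(D) = 1 = \delta_0(B).
% So \delta_0 is a Radon measure.
%
% Example 2: the counting measure c on Bor(R) is not a Radon measure, since
\[
  c([0,1]) = \infty ,
\]
% which violates condition (i).
```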

Proposition 7.2. Let X be a locally compact space.

(a) If µ is a Radon measure on X , and t ∈ [0, ∞), then tµ is also a Radon measure on X .

(b) If µ1 and µ2 are Radon measures on X , then µ1 + µ2 is also a Radon measure on X .

Proof. Property (a) is trivial.

To prove property (b) let us denote µ_1 + µ_2 simply by µ. We first observe that µ is indeed a measure on Bor(X), and we clearly have

\[ \mu(K) = \mu_1(K) + \mu_2(K) < \infty, \quad \forall\, K \in \mathcal{C}_X. \]

Let us show that µ satisfies condition (ii). Fix some open set D ⊂ X, and let us prove that

\[ \text{(5)} \qquad \mu(D) = \sup\{ \mu(K) : K \in \mathcal{C}_X,\ K \subset D \}. \]

If µ(D) = ∞, then either µ_1(D) = ∞ or µ_2(D) = ∞, so we get

\[ \sup\{ \max\{\mu_1(K), \mu_2(K)\} : K \in \mathcal{C}_X,\ K \subset D \} = \infty, \]

and since µ(K) ≥ max{µ_1(K), µ_2(K)}, ∀ K ∈ C_X, the equality (5) immediately follows. Suppose now µ(D) < ∞, which is equivalent to the fact that µ_1(D), µ_2(D) < ∞. Denote the right hand side of (5) by ν(D). For every ε > 0, using the fact that µ_1 and µ_2 are Radon measures, we can find two compact sets K^ε_1, K^ε_2 ⊂ D, such that µ_1(K^ε_1) ≥ µ_1(D) − ε/2 and µ_2(K^ε_2) ≥ µ_2(D) − ε/2. Of course, the compact set K^ε = K^ε_1 ∪ K^ε_2 is still a subset of D, and satisfies

\[ \mu_1(K^\varepsilon) \ge \mu_1(K^\varepsilon_1) \ge \mu_1(D) - \frac{\varepsilon}{2}, \qquad \mu_2(K^\varepsilon) \ge \mu_2(K^\varepsilon_2) \ge \mu_2(D) - \frac{\varepsilon}{2}, \]

so we get µ(K^ε) = µ_1(K^ε) + µ_2(K^ε) ≥ µ_1(D) + µ_2(D) − ε = µ(D) − ε. This proves that ν(D) ≥ µ(D) − ε, and since this inequality is true for all ε > 0, we get ν(D) ≥ µ(D). The inequality ν(D) ≤ µ(D) is trivial.

We now show that µ satisfies condition (iii). Fix some set A ∈ Bor(X), and let us prove that

\[ \text{(6)} \qquad \mu(A) = \inf\{ \mu(D) : D \in \mathcal{T}_X,\ D \supset A \}. \]

If µ(A) = ∞, there is nothing to prove. Suppose now µ(A) < ∞, which is equivalent to the fact that µ_1(A), µ_2(A) < ∞. Denote the right hand side of (6) by λ(A). For every ε > 0, using the fact that µ_1 and µ_2 are Radon measures, we can find two open sets D^ε_1, D^ε_2 ⊃ A, such that µ_1(D^ε_1) ≤ µ_1(A) + ε/2 and µ_2(D^ε_2) ≤ µ_2(A) + ε/2. The open set D^ε = D^ε_1 ∩ D^ε_2 still contains A, and satisfies

\[ \mu_1(D^\varepsilon) \le \mu_1(D^\varepsilon_1) \le \mu_1(A) + \frac{\varepsilon}{2}, \qquad \mu_2(D^\varepsilon) \le \mu_2(D^\varepsilon_2) \le \mu_2(A) + \frac{\varepsilon}{2}, \]

so we get µ(D^ε) = µ_1(D^ε) + µ_2(D^ε) ≤ µ_1(A) + µ_2(A) + ε = µ(A) + ε. This proves that λ(A) ≤ µ(A) + ε, and since this inequality is true for all ε > 0, we get λ(A) ≤ µ(A). The inequality λ(A) ≥ µ(A) is trivial.

Radon measures are also functorial with respect to proper maps, in the following sense.

Proposition 7.3. Let X and Y be locally compact spaces, let Φ : X → Y be a proper continuous map, and let µ be a Radon measure on X. Then the map ν : Bor(Y) → [0, ∞], defined by

\[ \nu(B) = \mu\big( \Phi^{-1}(B) \big), \quad \forall\, B \in \operatorname{Bor}(Y), \]

is a Radon measure on Y.

Proof. First of all, remark that since Φ is continuous, it is Borel measurable, which means that

\[ \Phi^{-1}(B) \in \operatorname{Bor}(X), \quad \forall\, B \in \operatorname{Bor}(Y). \]

Secondly, by the well known properties of measures, the map ν is a measure.

We now check that ν is a Radon measure. First of all, if K ⊂ Y is compact, then since Φ is proper, Φ^{-1}(K) is compact in X, so we clearly get

\[ \nu(K) = \mu\big( \Phi^{-1}(K) \big) < \infty. \]

To prove that ν satisfies condition (ii), start with some open set D ⊂ Y, and let us find a sequence (L_n)_{n=1}^∞ of compact subsets of D, such that ν(D) = lim_{n→∞} ν(L_n). The set Φ^{-1}(D) is open, so there exists a sequence (K_n)_{n=1}^∞ of compact subsets of Φ^{-1}(D), with

\[ \text{(7)} \qquad \nu(D) = \mu\big( \Phi^{-1}(D) \big) = \lim_{n\to\infty} \mu(K_n). \]

If we define the subsets L_n = Φ(K_n), then (L_n)_{n≥1} is a sequence of compact subsets of D, and the inclusion K_n ⊂ Φ^{-1}(L_n) immediately gives ν(D) ≥ ν(L_n) ≥ µ(K_n), so by (7) we also get ν(D) = lim_{n→∞} ν(L_n).

To prove condition (iii) start with some arbitrary subset B ∈ Bor(Y), and let us find a sequence (E_n)_{n=1}^∞ of open subsets of Y, such that ν(B) = lim_{n→∞} ν(E_n), and E_n ⊃ B, ∀ n ≥ 1. Use the fact that µ is a Radon measure, to find a sequence (D_n)_{n=1}^∞ of open subsets of X, such that

\[ \text{(8)} \qquad \nu(B) = \mu\big( \Phi^{-1}(B) \big) = \lim_{n\to\infty} \mu(D_n), \]

and D_n ⊃ Φ^{-1}(B), ∀ n ≥ 1. Put T_n = X \ D_n, so that T_n is closed, for each n ≥ 1. By Proposition I.5.2, the sets Φ(T_n) are closed in Y, hence their complements E_n = Y \ Φ(T_n), n ≥ 1, are open. Remark that we have the inclusions B ⊂ E_n, ∀ n ≥ 1. Otherwise, we would have B ∩ Φ(T_n) ≠ ∅, forcing T_n ∩ Φ^{-1}(B) ≠ ∅, which is impossible, since Φ^{-1}(B) ⊂ D_n = X \ T_n. Moreover, we also have the inclusions

\[ \Phi^{-1}(B) \subset \Phi^{-1}(E_n) \subset D_n, \quad \forall\, n \ge 1, \]

which then force

\[ \nu(B) \le \nu(E_n) \le \mu(D_n), \quad \forall\, n \ge 1. \]

By (8), this gives the equality lim_{n→∞} ν(E_n) = ν(B).
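The role of properness in Proposition 7.3 can be seen on a small example. The sketch below is an illustration only; the one-point space Y = {y_0} and the constant map are chosen here for the example, and it uses the standard fact that the Lebesgue measure λ on R is a Radon measure.

```latex
% Take X = R with the Lebesgue measure \mu = \lambda, take Y = \{y_0\} (a one-point
% space, compact and hence locally compact), and let \Phi : X \to Y be the constant map.
% \Phi is continuous, but not proper:
\[
  \Phi^{-1}(\{y_0\}) = \mathbb{R} \quad \text{is not compact.}
\]
% The push-forward \nu(B) = \mu(\Phi^{-1}(B)) is still a measure on Bor(Y), but
\[
  \nu(\{y_0\}) = \lambda(\mathbb{R}) = \infty ,
\]
% and \{y_0\} is compact, so \nu violates condition (i) in the definition
% of a Radon measure.
```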

Of course, if X is a compact Hausdorff space, then every Radon measure µ on X is finite. The following gives an interesting converse of this property, which also shows that sometimes functoriality can be present beyond the proper case described above.

Proposition 7.4. Let X be a locally compact space, let µ be a Radon measure on X, and let (θ, T) be a compactification of X. The following are equivalent:

(i) µ(X) < ∞;
(ii) the map ν : Bor(T) → [0, ∞), defined by

\[ \nu(B) = \mu\big( \theta^{-1}(B) \big), \quad \forall\, B \in \operatorname{Bor}(T), \]

is a Radon measure on T.

Proof. Recall that the fact that (θ, T) is a compactification of X means that

• T is a compact Hausdorff space;
• θ : X → T is continuous;
• θ(X) is open and dense in T;
• θ : X → θ(X) is a homeomorphism.

Without any loss of generality, we can assume that X is a dense open subset of T, and θ is the inclusion map. With this convention, the map ν is defined by

(9) ν(B) = µ(B ∩ X), ∀ B ∈ Bor(T).

(i) ⇒ (ii). Assume µ(X) < ∞. It is clear that ν is a finite measure on Bor(T), and in fact we have ν(T \ X) = 0.

The fact that ν(K) < ∞, for every compact subset K ⊂ T, is of course trivial.

We now check the second condition in the definition. Fix some open subset D ⊂ T, and let us show that

\[ \nu(D) = \sup\{ \nu(K) : K \text{ compact},\ K \subset D \}. \]

All we need is a sequence (K_n)_{n=1}^∞ of compact subsets of D, with lim_{n→∞} ν(K_n) = ν(D). To get this sequence we simply use the fact that D ∩ X is open (in X), so we can find a sequence (K_n)_{n=1}^∞ of compact subsets of D ∩ X, with lim_{n→∞} µ(K_n) = µ(D ∩ X) = ν(D). Now we are done, because the fact that K_n ⊂ X gives µ(K_n) = ν(K_n), ∀ n ≥ 1.

We now check the third condition in the definition. Fix some set B ∈ Bor(T), and let us show that

\[ \nu(B) = \inf\{ \nu(D) : D \subset T \text{ open},\ D \supset B \}. \]

All we need is a sequence (D_n)_{n=1}^∞ of open subsets of T, with D_n ⊃ B, ∀ n ≥ 1, and lim_{n→∞} ν(D_n) = ν(B). Start off by choosing a sequence (K_n)_{n=1}^∞ of compact subsets of X, such that lim_{n→∞} µ(K_n) = µ(X). We then get lim_{n→∞} µ(X \ K_n) = 0 (the condition that µ(X) < ∞ is essential here). If we then define the open sets A_n = T \ K_n, we will have ν(A_n) = µ(A_n ∩ X) = µ(X \ K_n), ∀ n ≥ 1, so we have

(10) lim_{n→∞} ν(A_n) = 0.

Notice also that

(11) A_n ⊃ T \ X, ∀ n ≥ 1.

Use now the fact that µ is a Radon measure on X, and the fact that B ∩ X ∈ Bor(X), to find a sequence (E_n)_{n=1}^∞ of open subsets of X, with E_n ⊃ B ∩ X, ∀ n ≥ 1, and

(12) lim_{n→∞} ν(E_n) = µ(B ∩ X).

Since X is open in T, it follows that all the E_n’s are open in T. If we define D_n = E_n ∪ A_n, then using (11) we have the inclusions

\[ D_n = E_n \cup A_n \supset (B \cap X) \cup (T \setminus X) \supset B, \quad \forall\, n \ge 1, \]

as well as the inequalities

\[ \mu(B \cap X) = \nu(B) \le \nu(D_n) \le \nu(E_n) + \nu(A_n) = \mu(E_n \cap X) + \nu(A_n) = \mu(E_n) + \nu(A_n), \quad \forall\, n \ge 1, \]

which, combined with (10) and (12), clearly give lim_{n→∞} ν(D_n) = µ(B ∩ X) = ν(B).

(ii) ⇒ (i). This implication is trivial, because the fact that ν is a Radon measure forces µ(X) = ν(X) ≤ ν(T) < ∞.

Comment. Assume µ is a Radon measure on a locally compact space X. Although the measure µ is regular from above with respect to open sets by (iii), in general, one cannot conclude that it is regular from below with respect to compact sets. The following example illustrates such an anomaly.

Exercise 3*. Equip the space X = R^2 with the disjoint union topology defined by the decomposition X = ⋃_{y∈R} (R × {y}). More explicitly, if we define, for each A ⊂ X, and each y ∈ R, the set

\[ A_y = \{ x \in \mathbb{R} : (x, y) \in A \}, \]

then a set D ⊂ X is declared to be open, if and only if all subsets D_y ⊂ R, y ∈ R, are open (in the usual topology on R). For each subset A ⊂ X, define its support

\[ S_A = \{ y \in \mathbb{R} : A_y \ne \emptyset \}. \]

Prove the following.

(i) A set K ⊂ X is compact, if and only if its support S_K is finite and, for each y ∈ S_K, the set K_y ⊂ R is compact (in the usual topology on R).
(ii) X is a locally compact space.
(iii) If we define, for every compact subset K ⊂ X, the number

\[ \omega(K) = \sum_{y \in S_K} \lambda(K_y), \]

where λ is the Lebesgue measure on R, then ω is a regular content on X.
(iv) Let µ denote the Radon measure extension of ω. Then for every open set D ⊂ X, one has the equality

\[ \mu(D) = \sum_{y \in \mathbb{R}} \lambda(D_y), \]

where one uses the summation conventions discussed in II.2. (The sum in the right hand side is defined as the supremum of all finite sums.)
(v) If B ∈ Bor(X) has uncountable support S_B, then µ(B) = ∞.
(vi) Consider the y-axis F = {0} × R ⊂ X. Show that F is closed in X (hence Borel), it has infinite measure µ(F) = ∞, but µ(K) = 0, for all compact subsets K ⊂ F.

Hints: Using regularity from above, it suffices to prove (v) only when B is open. In this case use the fact that if a map α : R → R is summable, then the set {t ∈ R : α(t) ≠ 0} is countable. For (vi), the equality µ(F) = ∞ is a consequence of (v). To get the fact that all compact subsets of F have measure zero, use part (i).

Remark 7.2. Let X be a locally compact space, and let µ be a Radon measure on X. We define the maximal outer extension of µ (see Section 5) by

\[ \mu^*(A) = \inf\{ \mu(B) : B \in \operatorname{Bor}(X),\ B \supset A \}, \quad \forall\, A \subset X. \]

By the regularity from above, one has the equality

\[ \text{(13)} \qquad \mu^*(A) = \inf\{ \mu(D) : D \in \mathcal{T}_X,\ D \supset A \}, \quad \forall\, A \subset X. \]

If one considers the regular content ω = µ|_{C_X}, then µ* = ω*, the outer measure induced by ω. We also know that if we consider the σ-algebra m_{µ*}(X) of all µ*-measurable subsets of X, we have the inclusion Bor(X) ⊂ m_{µ*}(X), and µ*|_{Bor(X)} = µ.

Exercise 4. Consider the collection D of all subsets of Rn, of the form

D = (a1, b1) × · · · × (an, bn), a1 < b1, . . . , an < bn.

For every such D we define

voln =nj=1

(bj − aj).

We define, for every bounded subset B ⊂ Rn, the number

(14) v(B) = inf N p=1

voln(D p) : (D p)N p=1 ⊂ D, B ⊂N p=1

D p.

(i) If we define B = {B ⊂ R^n : B bounded}, then B is a ring, the map v : B → [0, ∞) is sub-additive, but not σ-sub-additive. In particular, v does not extend to an outer measure on R^n.

(ii) If we consider the unit square S = [0, 1]^n, then the collection N_v(S) = {N ⊂ S : v(N) = 0} is a ring, but not a σ-ring.

(iii) When restricted to the collection C_{R^n}, of all compact subsets of R^n, the map ω = v|_{C_{R^n}} defines a regular content on R^n.

(iv) The outer measure ω∗, defined by ω, is precisely the outer Lebesgue measure λ∗_n.

The above construction somehow belongs to the “prehistory” of measure theory. The map v : B → [0, ∞) is called the Jordan content. Bounded sets N ⊂ R^n, with v(N) = 0 are called Jordan negligible. The theory of Riemann integration (especially for functions of several variables) relies heavily on the use of Jordan negligible sets. Part (ii) shows that, when restricted to Bor(S), the map v fails to be a measure. Parts (iii) and (iv) explain how the construction can be “fixed.” The regular content ω = v|_{C_{R^n}} is called the Lebesgue content. The correspondence ω ⟼ ω∗ gives an alternative construction of the outer Lebesgue measure, which starts with its definition on compact sets as the Jordan content.
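To make the failure of σ-sub-additivity in part (i) concrete, here is a short worked example for n = 1. It is only a sketch, and the enumeration (q_k) of the rationals in [0, 1] is notation introduced here, not taken from the text above.

```latex
% Worked example for n = 1 (an illustration, not part of the original notes).
% Enumerate B = \mathbb{Q} \cap [0,1] as \{q_1, q_2, \dots\}.
\begin{align*}
  v(\{q_k\}) &\le \operatorname{vol}_1\bigl((q_k - \tfrac{\varepsilon}{2},\, q_k + \tfrac{\varepsilon}{2})\bigr) = \varepsilon
    \quad \text{for every } \varepsilon > 0, \text{ so } v(\{q_k\}) = 0; \\
  B \subset \bigcup_{p=1}^{N} D_p
    &\;\Longrightarrow\; [0,1] = \overline{B} \subset \bigcup_{p=1}^{N} \overline{D_p}
    \;\Longrightarrow\; \sum_{p=1}^{N} \operatorname{vol}_1(D_p) \ge 1, \text{ so } v(B) \ge 1; \\
  v(B) &\ge 1 > 0 = \sum_{k=1}^{\infty} v(\{q_k\}).
\end{align*}
```

So the countable set Q ∩ [0, 1] has Jordan content at least 1, although it is a countable union of singletons of Jordan content zero.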




For Radon measures, the lack of regularity from below, with respect to compact sets, is somehow compensated by the following result (compare with Exercise 2 from Section 6).

Lemma 7.1. Let X be a locally compact space, let µ be a Radon measure on X, and let µ∗ be the maximal outer extension of µ. For a subset A ⊂ X, with µ∗(A) < ∞, the following are equivalent:

(i) A is µ∗-measurable;

(ii) µ∗(A) = sup{µ(K) : K ∈ C_X, K ⊂ A};

(iii) there exists a sequence (K_n)_{n=1}^∞ of compact subsets of A, such that

µ∗(A ∖ ⋃_{n=1}^∞ K_n) = 0.

Proof. (i) ⇒ (ii). Suppose A is µ∗-measurable, and let us prove the equality (ii). Denote the right hand side of (ii) simply by ν(A). It is obvious, by the monotonicity of µ∗, and the fact that µ∗|_{Bor(X)} = µ, that we have the inequality µ∗(A) ≥ ν(A). To prove the other inequality we fix for the moment some ε > 0. Using (13), there exists an open set D ⊃ A, such that µ(D) ≤ µ∗(A) + ε. Use property (ii) in the definition of Radon measures, to find some compact set L ⊂ D such that

µ(D) ≤ µ(L) + ε.

Since µ(D) = µ(D ∖ L) + µ(L), and µ(L) ≤ µ(D) < ∞, this inequality gives

µ(D ∖ L) ≤ ε,

which, combined with the obvious inclusion A ∖ L ⊂ D ∖ L, yields

(15) µ∗(A ∖ L) ≤ µ∗(D ∖ L) = µ(D ∖ L) ≤ ε.

Using (13) we can also find an open set E ⊃ L ∖ A, such that

(16) µ(E) ≤ µ∗(L ∖ A) + ε.

Since L ∖ A is µ∗-measurable, we have µ(E) = µ∗(E) = µ∗(E ∖ (L ∖ A)) + µ∗(L ∖ A). Since µ∗(L ∖ A) ≤ µ∗(E) = µ(E) < ∞, the inequality (16) gives

(17) µ∗(E ∖ (L ∖ A)) ≤ ε.

Consider the set K = L ∖ E. It is obvious that K is compact, and we have the inclusion

K ⊂ L ∖ (L ∖ A) = L ∩ A ⊂ A.

Moreover, we have

(L ∩ A) ∖ K ⊂ E ∖ (L ∖ A).

Using the inequality (17), we then get

µ∗((L ∩ A) ∖ K) ≤ ε.

Finally, the above inequality, combined with (15), gives

µ∗(A ∖ K) ≤ µ∗((L ∩ A) ∖ K) + µ∗((A ∖ L) ∖ K) ≤ ε + µ∗(A ∖ L) ≤ 2ε.

Since K ⊂ A, we get

µ∗(A) ≤ µ∗(A ∖ K) + µ∗(K) ≤ 2ε + µ(K) ≤ 2ε + ν(A).

Since the inequality µ∗(A) ≤ 2ε + ν(A) holds for all ε > 0, we get µ∗(A) ≤ ν(A), so (ii) follows.




(ii) ⇒ (iii). Assume A satisfies (ii), and let us show that A has property (iii). For every integer n ≥ 1, we use (ii) to find a compact set K_n ⊂ A, such that

(18) µ∗(A) ≤ µ(K_n) + 1/n.

On the one hand, we have the inclusions A ∖ ⋃_{n=1}^∞ K_n ⊂ A ∖ K_p, which give

(19) µ∗(A ∖ ⋃_{n=1}^∞ K_n) ≤ µ∗(A ∖ K_p), ∀ p ≥ 1.

On the other hand, since K_p is measurable, we have the equality

µ∗(A) = µ∗(A ∖ K_p) + µ∗(K_p) = µ∗(A ∖ K_p) + µ(K_p),

and then the fact that µ∗(A) < ∞, combined with (18), will force

µ∗(A ∖ K_p) ≤ 1/p, ∀ p ≥ 1.

Using (19), this forces µ∗(A ∖ ⋃_{n=1}^∞ K_n) = 0.

(iii) ⇒ (i). This is pretty obvious. We define the sets B = ⋃_{n=1}^∞ K_n ⊂ A, and N = A ∖ B. Then µ∗(N) = 0, so in particular, N is µ∗-measurable. Since B is Borel, it is also µ∗-measurable, so A = B ∪ N is indeed µ∗-measurable.
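As an illustration of condition (iii) in Lemma 7.1, one can take (as an assumption made only for this example) X = R with the Lebesgue measure λ, and A the set of irrational numbers in [0, 1]. The sets K_n below are one possible choice, built from a hypothetical enumeration (q_k) of Q ∩ [0, 1].

```latex
% Illustration with X = \mathbb{R}, \mu = \lambda (Lebesgue measure), A = [0,1] \setminus \mathbb{Q}.
% (q_k)_{k \ge 1} is an enumeration of \mathbb{Q} \cap [0,1]; the sets K_n are an illustrative choice.
\begin{align*}
  K_n &= [0,1] \setminus \bigcup_{k=1}^{\infty}
         \Bigl(q_k - \tfrac{1}{n\,2^{k+1}},\; q_k + \tfrac{1}{n\,2^{k+1}}\Bigr)
         \quad \text{(compact, and } K_n \subset A\text{)}, \\
  \lambda(A \setminus K_n) &\le \sum_{k=1}^{\infty} \frac{1}{n\,2^{k}} = \frac{1}{n}, \\
  \lambda^{*}\Bigl(A \setminus \bigcup_{n=1}^{\infty} K_n\Bigr)
      &\le \inf_{n \ge 1} \lambda(A \setminus K_n) = 0 .
\end{align*}
```

Each K_n is a closed subset of [0, 1] that misses every rational, and the second line is exactly the estimate (18) for this choice.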

The following result generalizes Lemma 7.1 to the σ-finite case.

Theorem 7.3. Let X be a locally compact space, let µ be a Radon measure on X, and let µ∗ be the maximal outer extension of µ. For a set A ⊂ X, the following are equivalent:

(i) A is µ∗-measurable, and µ∗-σ-finite.

(ii) There exist sequences (K_n)_{n=1}^∞ ⊂ C_X and (D_n)_{n=1}^∞ ⊂ T_X, such that

⋃_{n=1}^∞ K_n ⊂ A ⊂ ⋂_{n=1}^∞ D_n and µ((⋂_{n=1}^∞ D_n) ∖ (⋃_{n=1}^∞ K_n)) = 0.

(The condition that A is µ∗-σ-finite means that there exists a sequence (A_n)_{n=1}^∞ of subsets of X, with A = ⋃_{n=1}^∞ A_n, and µ∗(A_n) < ∞, for all n ≥ 1.)

Proof. (i) ⇒ (ii). Assume A is µ∗-measurable and µ∗-σ-finite.

Claim 1: There exists a sequence (A_n)_{n=1}^∞ of µ∗-measurable sets, such that A = ⋃_{n=1}^∞ A_n, and µ∗(A_n) < ∞, ∀ n ≥ 1.

A priori, we only know that there exists a sequence (A^0_n)_{n=1}^∞ of subsets of X (not assumed to be µ∗-measurable), with A = ⋃_{n=1}^∞ A^0_n, and µ∗(A^0_n) < ∞, ∀ n ≥ 1. Using (13), we can choose however, for each n ≥ 1, an open set E_n, with A^0_n ⊂ E_n, and µ(E_n) < ∞. In particular, E_n is µ∗-measurable, and so will be A_n = A ∩ E_n. We clearly have A = ⋃_{n=1}^∞ A_n, and µ∗(A_n) ≤ µ∗(E_n) < ∞, ∀ n ≥ 1.

Using Claim 1, we start off by writing A = ⋃_{n=1}^∞ A_n, with the A_n's µ∗-measurable, and µ∗(A_n) < ∞. For each n ≥ 1, we use Lemma 7.1 to find a sequence (L^p_n)_{p=1}^∞ of compact subsets of A_n, such that

µ∗(A_n ∖ ⋃_{p=1}^∞ L^p_n) = 0.




Let us list the countable collection {L^p_n : p, n ≥ 1} as a sequence (K_n)_{n=1}^∞, so that we have

⋃_{n=1}^∞ K_n = ⋃_{n=1}^∞ ⋃_{p=1}^∞ L^p_n ⊂ ⋃_{n=1}^∞ A_n = A.

Claim 2: The set M = A ∖ ⋃_{n=1}^∞ K_n is µ∗-negligible, i.e. µ∗(M) = 0.

Indeed, if we define, for each k ≥ 1 the set M_k = A_k ∖ ⋃_{n=1}^∞ K_n, then we have the obvious equality M = ⋃_{k=1}^∞ M_k, and the inclusions

M_k = A_k ∖ ⋃_{p=1}^∞ ⋃_{n=1}^∞ L^p_n ⊂ A_k ∖ ⋃_{p=1}^∞ L^p_k, ∀ k ≥ 1,

which, by the choice of the L's, prove that µ∗(M_k) = 0, ∀ k ≥ 1.

We proceed now with the construction of the D's. For each pair of integers (p, n), we use (13) to find an open set E^p_n ⊃ A_n, such that µ(E^p_n) ≤ µ∗(A_n) + 1/2^{p+n}. Since the A_n's are µ∗-measurable, we have

µ(E^p_n) = µ∗(E^p_n) = µ∗(A_n) + µ∗(E^p_n ∖ A_n).

Since µ∗(A_n) < ∞, by the choice of the E's, we will get

(20) µ∗(E^p_n ∖ A_n) ≤ 1/2^{p+n}, ∀ p, n ≥ 1.

We then define, for each p ≥ 1, the open set D_p = ⋃_{n=1}^∞ E^p_n. Notice that, for each p ≥ 1, we have the inclusion A = ⋃_{n=1}^∞ A_n ⊂ ⋃_{n=1}^∞ E^p_n = D_p, and

D_p ∖ A = ⋃_{n=1}^∞ [E^p_n ∖ A] ⊂ ⋃_{n=1}^∞ [E^p_n ∖ A_n].

Using (20), we then get

(21) µ∗(D_p ∖ A) ≤ ∑_{n=1}^∞ µ∗(E^p_n ∖ A_n) ≤ ∑_{n=1}^∞ 1/2^{p+n} = 1/2^p, ∀ p ≥ 1.

Since A ⊂ D_p, ∀ p ≥ 1, we get A ⊂ ⋂_{p=1}^∞ D_p. Moreover, if we define the set N = (⋂_{p=1}^∞ D_p) ∖ A, we obviously have the inclusions N ⊂ D_p ∖ A, ∀ p ≥ 1, and then (21) clearly forces µ∗(N) = 0.

Now we have ⋃_{n=1}^∞ K_n ⊂ A ⊂ ⋂_{p=1}^∞ D_p, and

(⋂_{p=1}^∞ D_p) ∖ (⋃_{n=1}^∞ K_n) = N ∪ M,

with µ∗(M) = µ∗(N) = 0, so we indeed have (ii).

The implication (ii) ⇒ (i) is pretty obvious. If there exist sequences (K_n)_{n=1}^∞ and (D_n)_{n=1}^∞ as in (ii), then the sets B = ⋃_{n=1}^∞ K_n and G = ⋂_{n=1}^∞ D_n are Borel. Moreover, the inclusions B ⊂ A ⊂ G give A ∖ B ⊂ G ∖ B, so we have µ∗(A ∖ B) ≤ µ∗(G ∖ B). By the second feature in (ii) we know that µ∗(G ∖ B) = 0, therefore the set P = A ∖ B is µ∗-negligible, hence µ∗-measurable. Since A = B ∪ P, it follows that A is indeed µ∗-measurable.
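The two estimates that conclude the construction of the D's reduce to a geometric series. The following is a sketch of the computation behind (21), and of why it forces µ∗(N) = 0, written out with the notations of the proof.

```latex
% Worked computation for (21); D_p, A and N are as in the proof of Theorem 7.3.
\begin{align*}
  \sum_{n=1}^{\infty} \frac{1}{2^{p+n}}
    &= \frac{1}{2^{p}} \sum_{n=1}^{\infty} \frac{1}{2^{n}} = \frac{1}{2^{p}}, \\
  \mu^{*}(N) &\le \mu^{*}(D_p \setminus A) \le \frac{1}{2^{p}}
    \quad \text{for all } p \ge 1,
    \quad \text{hence } \mu^{*}(N) \le \inf_{p \ge 1} \frac{1}{2^{p}} = 0 .
\end{align*}
```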

Comment. The implication (ii) ⇒ (i) in Theorem 7.3 holds without the µ∗-σ-finiteness assumption on A. In fact, condition (ii) actually forces A to be µ∗-σ-finite.

Corollary 7.3. If µ is a Radon measure on X, and the set A is µ∗-measurable, and µ∗-σ-finite, then one has the equality

µ∗(A) = sup{µ(K) : K ∈ C_X, K ⊂ A}.




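Proof. Follow the first part of the proof of (i) ⇒ (ii) in Theorem 7.3 to find a sequence (K_n)_{n=1}^∞ of compact subsets of A, such that

µ∗(A ∖ ⋃_{n=1}^∞ K_n) = 0.

Since ⋃_{n=1}^∞ K_n is µ∗-measurable, this forces the equality

µ∗(A) = µ∗(⋃_{n=1}^∞ K_n) = lim_{n→∞} µ∗(K_1 ∪ · · · ∪ K_n).

Since each of the sets K_1 ∪ · · · ∪ K_n is a compact subset of A, the desired equality follows.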

Exercise 5*. Let X be a locally compact space, and let µ be a Radon measure on X. Suppose ν : Bor(X) → [0, ∞] is a measure satisfying the following conditions:

(a) ν(B) ≤ µ(B), ∀ B ∈ Bor(X);

(b) for every B ∈ Bor(X), one has the implication ν(B) < ∞ ⇒ µ(B) < ∞.

Prove that ν is a Radon measure on X. (Notice that, in the case when µ is finite, the condition (b) is superfluous.)

Hints: To prove condition (ii) in the definition of Radon measures, start with some open set D ⊂ X, and choose a sequence K_1 ⊂ K_2 ⊂ · · · ⊂ D of compact subsets, such that

lim_{n→∞} µ(K_n) = µ(D),

and define the Borel set B = ⋃_{n=1}^∞ K_n ⊂ D. Notice that we have the equalities µ(B) = lim_{n→∞} µ(K_n) and ν(B) = lim_{n→∞} ν(K_n). Argue that, when ν(D) = ∞, we must have ν(B) = ∞. When ν(D) < ∞, show that µ(D ∖ B) = 0. In either case we get ν(B) = ν(D).

The next result explains somehow the anomaly illustrated by Exercise 3.

Proposition 7.5. Let µ be a Radon measure on X, and let µ∗ denote its maximal outer extension. For a subset N ⊂ X, the following are equivalent:

(i) N is µ∗-measurable, and for every compact subset K ⊂ N, one has the equality µ(K) = 0;

(ii) µ∗(D ∩ N) = 0, for all open subsets D ⊂ X with µ(D) < ∞;

(iii) N is locally µ∗-negligible, i.e. µ∗(A ∩ N) = 0, for all subsets A ⊂ X with µ∗(A) < ∞.

Proof. (i) ⇒ (ii). Assume N satisfies condition (i). Fix some open set D ⊂ X, with µ(D) < ∞. Then the set D ∩ N is µ∗-measurable, and µ∗(D ∩ N) ≤ µ(D) < ∞. The equality µ∗(D ∩ N) = 0 then follows from (i), combined with Corollary 7.3.

(ii) ⇒ (iii). Assume N satisfies condition (ii). Fix some arbitrary subset A ⊂ X, with µ∗(A) < ∞. Using (13), there exists some open set D ⊃ A with µ(D) < ∞. Then we have the inequality µ∗(A ∩ N) ≤ µ∗(D ∩ N), so condition (ii) will force µ∗(A ∩ N) = 0.

(iii) ⇒ (i). Let N be locally µ∗-negligible. We know that local µ∗-negligibility implies µ∗-measurability (see Section 5). The fact that µ(K) = 0, for all compact subsets K ⊂ N, is also trivial.

Notation. Let µ be a Radon measure on the locally compact space X, and let µ∗ be the maximal outer extension of µ. We denote the σ-algebra m_{µ∗}(X), of all µ∗-measurable subsets of X, simply by M_µ(X), and we define the measure µ̄ = µ∗|_{m_{µ∗}(X)}. Using the terminology introduced in Section 5, the pair (M_µ(X), µ̄) is the quasi-completion of Bor(X) with respect to µ.




Our next goal is to examine the inclusion Bor(X) ⊂ M_µ(X) along the same lines used in the final part of Section 5. In preparation for the results that follow, it is helpful to introduce the following terminology.

Definition. Let µ be a Radon measure on the locally compact space X. A non-empty compact subset K ⊂ X is said to be µ-tight, if it has the property

• there is no compact non-empty proper subset L ⊊ K, with µ(K) = µ(L).

Remark 7.3. Singleton sets are always µ-tight. If K is µ-tight, and µ(K) = 0, then K must be a singleton.

For a non-empty compact set K with µ(K) > 0, the µ-tightness is equivalent to the following condition¹¹:

(22) D ⊂ X open, D ∩ K ≠ ∅ ⟹ µ(D ∩ K) > 0.

Indeed, if K is µ-tight, and D ⊂ X is an open set, such that D ∩ K ≠ ∅, then the compact set L = K ∖ D is either empty, or a proper subset of K. In either case, we get µ(L) < µ(K), and then the equality D ∩ K = K ∖ L gives µ(D ∩ K) = µ(K) − µ(L) > 0. Conversely, if K satisfies (22) and if L is a non-empty proper compact subset of K, then the set D = X ∖ L is open, and satisfies D ∩ K ≠ ∅. By (22) this forces µ(D ∩ K) > 0, and since we have L = K ∖ (D ∩ K), we get µ(L) = µ(K) − µ(D ∩ K) < µ(K).

A µ-tight compact set K , with µ(K ) > 0, will be called non-degenerate.
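A quick way to get a feeling for condition (22) is to test it on the real line. The sketch below assumes, only for the sake of illustration, that X = R and µ = λ is the Lebesgue measure; the sets K and K' are hypothetical examples.

```latex
% Illustration only: X = \mathbb{R}, \mu = \lambda (Lebesgue measure).
\begin{align*}
  K = [0,1]:\quad & D \text{ open},\ x \in D \cap K
      \;\Longrightarrow\; (x-\delta, x+\delta) \subset D \text{ for some } \delta \in (0,1) \\
    & \;\Longrightarrow\; \lambda(D \cap K) \ge \lambda\bigl((x-\delta, x+\delta) \cap [0,1]\bigr) \ge \delta > 0, \\
    & \text{so } K \text{ satisfies (22), and it is } \lambda\text{-tight and non-degenerate;} \\
  K' = [0,1] \cup \{2\}:\quad & D = (\tfrac{3}{2}, \tfrac{5}{2}) \text{ gives } D \cap K' = \{2\} \neq \varnothing,
      \quad \lambda(D \cap K') = 0, \\
    & \text{so } K' \text{ fails (22); indeed } L = [0,1] \subsetneq K' \text{ has } \lambda(L) = \lambda(K') = 1 .
\end{align*}
```

In the second case the isolated point carries no mass, and removing it produces a tight compact subset with the same measure.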

Lemma 7.2. Let X be a locally compact space, let µ be a Radon measure on X. Every non-empty compact set K ⊂ X has a µ-tight compact subset K_0 ⊂ K, with µ(K_0) = µ(K).

Proof. If K is already tight, there is nothing to prove. Also, if µ(K) = 0, then we can pick K_0 to be of the form {x}, with x any point in K.

For the remainder of the proof, we are going to assume that K is not µ-tight, and µ(K) > 0. Consider the collection

L = {L ∈ C_X : ∅ ≠ L ⊊ K and µ(L) = µ(K)}.

Since K is not µ-tight, the collection L is non-empty. One key property of the collection L is the following.

Claim 1: If L_1, . . . , L_n ∈ L, then L_1 ∩ · · · ∩ L_n ∈ L.

Indeed, if we define the sets A_j = K ∖ L_j, j = 1, . . . , n, then µ(A_1) = · · · = µ(A_n) = 0, and then the equality

K ∖ [L_1 ∩ · · · ∩ L_n] = A_1 ∪ · · · ∪ A_n

will force µ(K ∖ [L_1 ∩ · · · ∩ L_n]) = 0, thus giving µ(L_1 ∩ · · · ∩ L_n) = µ(K) > 0. (The last inequality forces of course L_1 ∩ · · · ∩ L_n ≠ ∅.)

Using the finite intersection property, it follows that the intersection K_0 = ⋂_{L∈L} L is non-empty.

Claim 2: K_0 ∈ L.

Obviously K_0 is a compact non-empty proper subset of K, so the only thing we need to prove is the equality µ(K_0) = µ(K). Consider the Borel subset

B = K ∖ K_0 ⊂ K.

11 Notice that using D = X, condition (22) actually forces µ(K ) > 0.




Since B ⊂ K, it follows that µ(B) < ∞. By Corollary 7.3 we have

(23) µ(B) = sup{µ(P) : P compact, P ⊂ B}.

Notice however that if P ⊂ B is compact, then we have, by the definition of B, the equality

⋂_{L∈L} (L ∩ P) = (⋂_{L∈L} L) ∩ P = (K ∖ B) ∩ P = ∅,

so again by the finite intersection property, combined with Claim 1, it follows that there exists L ∈ L, such that P ∩ L = ∅. Then we have µ(K ∖ L) = 0, so the inclusion P ⊂ K ∖ L will force µ(P) = 0. Using (23), this forces µ(B) = 0, hence µ(K_0) = µ(K) − µ(B) = µ(K).

We now show that K_0 is µ-tight. Indeed, if K_0 were not tight, we could find some non-empty compact proper subset L ⊊ K_0, with µ(L) = µ(K_0) = µ(K). This will of course force L to belong to L, and therefore it will force the inclusion K_0 ⊂ L, which is impossible.

Lemma 7.3. Let X be a locally compact space, let µ be a Radon measure on X, and let G be a pairwise disjoint collection of non-degenerate µ-tight compact sets. For any set A ⊂ X, with µ∗(A) < ∞, the collection

S_G(A) = {G ∈ G : G ∩ A ≠ ∅}

is at most countable.

Proof. Since µ∗(A) < ∞, by (13), there exists some open set D ⊃ A with µ(D) < ∞. It is obvious that S_G(A) ⊂ S_G(D), so it suffices to prove that S_G(D) is at most countable.

On the one hand, we notice that, for every finite subset F ⊂ S_G(D), one has

∑_{G∈F} µ(G ∩ D) = µ(⋃_{G∈F} [G ∩ D]) ≤ µ(D) < ∞.

This means that the family (µ(G ∩ D))_{G∈S_G(D)} is summable, and we have

∑_{G∈S_G(D)} µ(G ∩ D) ≤ µ(D) < ∞.

On the other hand, by Remark 7.3, we know that all the terms µ(G ∩ D), G ∈ S_G(D), are strictly positive. Using Proposition II.2.2, this forces S_G(D) to be countable.
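The last step rests on the standard fact that a summable family of strictly positive numbers has a countable index set. A sketch of the usual argument is given below; it is presumably what lies behind Proposition II.2.2, but the sets S_k are notation introduced only for this illustration.

```latex
% Sketch; the sets S_k are notation introduced for this illustration.
\begin{align*}
  S_k &= \bigl\{ G \in S_{\mathcal{G}}(D) : \mu(G \cap D) > \tfrac{1}{k} \bigr\}, \qquad k \ge 1, \\
  F \subset S_k \text{ finite}
    &\;\Longrightarrow\; \frac{|F|}{k} \le \sum_{G \in F} \mu(G \cap D) \le \mu(D)
    \;\Longrightarrow\; |F| \le k\,\mu(D), \\
  S_{\mathcal{G}}(D) &= \bigcup_{k=1}^{\infty} S_k
    \quad \text{(a countable union of finite sets).}
\end{align*}
```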

The main application of the above result is the following.

Theorem 7.4. Let X be a locally compact space, and let µ be a Radon measure on X. Then there exists a partition F of X into µ-tight compact sets, with the property that the set

N_F = ⋃_{F∈F, µ(F)=0} F

is locally µ∗-negligible.

Proof. Define the set

Ω = {F : F pairwise disjoint collection of non-degenerate µ-tight compact sets}.

We agree to consider the empty collection as an element of Ω, so that Ω is non-empty. Equip the set Ω with the order relation ⊂ given by inclusion.




Claim 1: The ordered set (Ω, ⊂) contains a maximal element.

This is a straightforward application of Zorn’s Lemma. Start with some subset Λ of Ω, which is totally ordered with respect to ⊂, and let us show that there is an upper bound for Λ (in Ω). If we write Λ = {G_i : i ∈ I}, we define the collection G = ⋃_{i∈I} G_i. It is clear that every element in G is a non-degenerate µ-tight compact set. If K, L ∈ G are different elements, then there exist i, j ∈ I with K ∈ G_i and L ∈ G_j. Since Λ is totally ordered, we either have G_i ⊂ G_j, or G_j ⊂ G_i. In either case, we conclude that there exists some k ∈ I, such that K, L ∈ G_k, and then K ∩ L = ∅. This shows that G is pairwise disjoint, hence G belongs to Ω. It is obvious that G is an upper bound for Λ.

Having proven Claim 1, we fix a maximal collection G ∈ Ω, and we define the set T = ⋃_{G∈G} G. It is quite possible that G = ∅. In that case we define T = ∅.

Claim 2: For every compact subset K ⊂ X ∖ T, one has µ(K) = 0.

We prove this by contradiction. Assume µ(K) > 0. By Lemma 7.2 there exists a µ-tight compact subset K_0 ⊂ K, with µ(K_0) = µ(K) > 0 (in particular K_0 is non-degenerate). But then the collection G ∪ {K_0} would obviously contradict the maximality of G.

Claim 3: Whenever D ⊂ X is an open set with µ(D) < ∞, it follows that the set D ∖ T is Borel, and µ(D ∖ T) = 0.

By Lemma 7.3, the collection

S_G(D) = {G ∈ G : G ∩ D ≠ ∅}

is at most countable. Now we have

D ∩ T = ⋃_{G∈S_G(D)} (D ∩ G),

so D ∩ T is a countable union of Borel sets, hence D ∩ T itself is Borel, and so will be D ∖ T = D ∖ (D ∩ T). Since µ(D ∖ T) ≤ µ(D) < ∞, by Corollary 7.3, we have

µ(D ∖ T) = sup{µ(K) : K compact, K ⊂ D ∖ T}.

By Claim 2, this forces µ(D ∖ T) = 0.

Going back to the proof of the theorem, we notice that, by Claim 2, none of the singletons {x}, x ∈ X ∖ T, has positive measure. We can then define the collection

F = G ∪ {{x} : x ∈ X ∖ T},

which is obviously a partition of X into µ-tight compact sets. For this partition, we obviously have the equality N_F = X ∖ T. By Claim 3, we have

µ∗(N_F ∩ D) = 0, for all open sets D ⊂ X with µ(D) < ∞.

By Proposition 7.5, it follows that N_F is indeed locally µ∗-negligible.

Definition. Let X be a locally compact space, and let µ be a Radon measure on X. A partition F of X into µ-tight compact sets, with the property stated in Theorem 7.4, will be called non-degenerate.

The existence of such partitions is significant, as indicated below.

Theorem 7.5. Let X be a locally compact space, let µ be a Radon measure on X, and let F be a non-degenerate partition of X into µ-tight compact sets. Then¹² F is a sufficient µ-finite Bor(X)-partition of X.

12 See Section 5 for the terminology.




Proof. What we need to prove are the following properties:

(i) F is pairwise disjoint, and ⋃_{F∈F} F = X;

(ii) F ⊂ Bor(X), and µ(F) < ∞, for all F ∈ F;

(iii) for every set B ∈ Bor(X), with µ(B) < ∞, one has the equality¹³

(24) µ(B) = ∑_{F∈F} µ(B ∩ F).

Conditions (i) and (ii) are obvious.

To prove condition (iii) we define the sub-collection G = {F ∈ F : µ(F) > 0}, so that G consists of non-degenerate µ-tight compact sets, and the set

N_F = X ∖ ⋃_{F∈G} F

is locally µ∗-negligible. Assume now B ∈ Bor(X) has µ(B) < ∞. By Lemma 7.3, the collection

S_G(B) = {F ∈ G : B ∩ F ≠ ∅}

is at most countable. In particular, the set

B_0 = ⋃_{F∈S_G(B)} (B ∩ F) = B ∖ N_F

is Borel, and so will be B ∖ B_0 = B ∩ N_F. On the one hand, since B ∖ B_0 is a subset of N_F, it follows that B ∖ B_0 is locally µ∗-negligible. On the other hand, since B ∖ B_0 is a subset of B, it follows that µ(B ∖ B_0) < ∞. This clearly forces µ(B ∖ B_0) = 0, so we have the equality

(25) µ(B) = µ(B_0) = ∑_{F∈S_G(B)} µ(B ∩ F).

Notice that, if F ∈ F ∖ S_G(B), then either F ∉ G, in which case we have µ(F) = 0, or F ∈ G ∖ S_G(B), in which case we have µ(B ∩ F) = 0. This shows that

µ(B ∩ F) = 0, ∀ F ∈ F ∖ S_G(B),

so the equality (25) immediately gives (24).

Corollary 7.4. Under the hypothesis above, the collection F is a µ-finite decomposition for M_µ(X).

Proof. Immediate from Corollary 5.3.

In the remainder of this section we discuss two basic examples of methods for constructing (regular) contents.

To introduce the first construction, let us recall some notations and terminology introduced in II.5. For a locally compact space X, and K one of the fields R or C, we denote by C_c^K(X) the space of all continuous functions f : X → K, with compact support. An R-linear map φ : C_c^R(X) → R is said to be positive, if it has the property:

f ∈ C_c^R(X), f ≥ 0 ⇒ φ(f) ≥ 0.

With these notations, we have the following result.

13 Here we use the summation convention from II.2




Proposition 7.6. Let X be a locally compact space, and let φ : C_c^R(X) → R be a positive R-linear map. For every compact subset K ⊂ X, define the number

ω_φ(K) = inf{φ(f) : f ∈ C_c^R(X), f ≥ κ_K}.

Then the map C_X ∋ K ⟼ ω_φ(K) ∈ [0, ∞) is a regular content on X.

Proof. The inequality f ≥ κ_K forces f ≥ 0, so we indeed have ω_φ(K) ≥ 0, ∀ K ∈ C_X. We now check conditions (i)-(iv) in the definition of a content.

The constant function 0 satisfies 0 ≥ κ_∅, which immediately gives the equality ω_φ(∅) = 0, so condition (i) is satisfied.

By the definition of ω_φ, it is clear that one has the implication

K, L ∈ C_X, K ⊂ L ⟹ ω_φ(K) ≤ ω_φ(L),

thus giving condition (ii).

To check condition (iii), suppose K, L ∈ C_X, and let us prove the inequality

(26) ω_φ(K ∪ L) ≤ ω_φ(K) + ω_φ(L).

Start with some ε > 0, and choose functions f, g ∈ C_c^R(X), such that f ≥ κ_K, g ≥ κ_L, φ(f) ≤ ω_φ(K) + ε, and φ(g) ≤ ω_φ(L) + ε. If we consider the function h = f + g ∈ C_c^R(X), then we clearly have h ≥ κ_{K∪L}, so we will have

ω_φ(K ∪ L) ≤ φ(h) = φ(f + g) = φ(f) + φ(g) ≤ ω_φ(K) + ω_φ(L) + 2ε.

Since the inequality ω_φ(K ∪ L) ≤ ω_φ(K) + ω_φ(L) + 2ε holds for arbitrary ε > 0, it will clearly force (26).

Finally, to check condition (iv) we start with two disjoint sets K, L ∈ C_X, and we prove the equality

(27) ω_φ(K ∪ L) = ω_φ(K) + ω_φ(L).

By (26) it suffices to show the inequality

(28) ω_φ(K ∪ L) ≥ ω_φ(K) + ω_φ(L).

Start with some arbitrary ε > 0, and choose a function f ∈ C_c^R(X), with f ≥ κ_{K∪L} and φ(f) ≤ ω_φ(K ∪ L) + ε. Use Urysohn Lemma for locally compact spaces (Theorem I.5.1) to find a continuous map θ : X → [0, 1], such that θ|_K = 1 and θ|_L = 0. The functions g = fθ and h = f(1 − θ) are obviously continuous, and have compact supports. Moreover, one has the inequalities g ≥ κ_K and h ≥ κ_L. Since g + h = f, we get

ω_φ(K ∪ L) + ε ≥ φ(f) = φ(g + h) = φ(g) + φ(h) ≥ ω_φ(K) + ω_φ(L).

Since the inequality ω_φ(K ∪ L) + ε ≥ ω_φ(K) + ω_φ(L) holds for all ε > 0, it will clearly force the inequality (28).

So far, we have shown that ω_φ is a content. We now prove that ω_φ is regular, which means that, for every K ∈ C_X, one has the equality

ω_φ(K) = inf{ω_φ(L) : L ∈ C_X, K ⊂ Int(L)}.

By property (ii) we always have the inequality

ω_φ(K) ≤ inf{ω_φ(L) : L ∈ C_X, K ⊂ Int(L)},

so all we need to prove is the inequality

(29) ω_φ(K) ≥ inf{ω_φ(L) : L ∈ C_X, K ⊂ Int(L)}.




Start with some arbitrary ε > 0, and choose a function f ∈ C_c^R(X) with f ≥ κ_K, and φ(f) ≤ ω_φ(K) + ε. Consider the function g = (1 + ε)f, and the set

D = {x ∈ X : g(x) > 1}.

Obviously D is an open set, and since f(x) ≥ 1, ∀ x ∈ K, we get g(x) ≥ 1 + ε > 1, ∀ x ∈ K. In particular, this gives the inclusion K ⊂ D. Apply then Lemma I.5.1 to find some compact set L ⊂ D, with K ⊂ Int(L). Since g(x) > 1, ∀ x ∈ L, we clearly have

ω_φ(L) ≤ φ(g) = (1 + ε)φ(f) ≤ (1 + ε)(ω_φ(K) + ε).

This argument shows that, if we denote the right hand side of (29) by ν(K), then we have the inequality

ν(K) ≤ (1 + ε)(ω_φ(K) + ε).

Since this inequality holds for all ε > 0, it will force the inequality ν(K) ≤ ω_φ(K), thus proving (29).

Definition. Let X be a locally compact space, and let φ : C_c^R(X) → R be a positive R-linear map. We apply Corollary 7.2 to the regular content ω_φ, and we will denote the Radon measure extension of ω_φ simply by µ_φ. The measure µ_φ on Bor(X) is called the Riesz measure associated with φ.
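The simplest instance of this construction is evaluation at a point. The computation below is only a sketch; it assumes that X is Hausdorff (so that Urysohn Lemma is available), and the point x_0 is a hypothetical fixed element of X.

```latex
% Illustration under the assumption that X is Hausdorff; x_0 \in X is a fixed point.
\begin{align*}
  \varphi(f) &= f(x_0), \qquad f \in C_c^{\mathbb{R}}(X), \\
  x_0 \in K &\;\Longrightarrow\; \varphi(f) = f(x_0) \ge 1 \text{ whenever } f \ge \kappa_K,
      \text{ with equality for some } f, \text{ so } \omega_\varphi(K) = 1; \\
  x_0 \notin K &\;\Longrightarrow\; \text{there is } f \ge \kappa_K \text{ with } \operatorname{supp} f \subset X \setminus \{x_0\},
      \text{ so } \omega_\varphi(K) = 0 .
\end{align*}
```

Thus ω_φ agrees on compact sets with the Dirac measure concentrated at x_0, which suggests that the Riesz measure of this φ is that Dirac measure.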

An interesting property, which will later be generalized, is the following.

Lemma 7.4 (Mean Value Property). Let X be a locally compact space, let φ : C_c^R(X) → R be a positive R-linear map, and let µ_φ be the Riesz measure associated with φ. For any function f ∈ C_c^R(X), and any compact subset K ⊂ X, with K ⊃ supp f, one has the inequality

(30) [min_{x∈K} f(x)] · µ_φ(K) ≤ φ(f) ≤ [max_{x∈K} f(x)] · µ_φ(K).

Proof. Since min_{x∈K} f(x) = − max_{x∈K} (−f)(x), it suffices to prove only the inequality

(31) φ(f) ≤ [max_{x∈K} f(x)] · µ_φ(K).

Fix f ∈ C_c^R(X), as well as the compact set K ⊃ supp f. Denote the number max_{x∈K} f(x) simply by M.

If M < 0 the inequality is pretty clear, because the function g = (1/M)f satisfies g ≥ κ_K, which gives φ(g) ≥ ω_φ(K) = µ_φ(K), and then multiplying by M immediately gives (31).

The case M = 0 is also trivial, since this forces f ≤ 0, so we get φ(f) ≤ 0.

Assume M > 0. Fix for the moment some ε > 0, and choose some function h ∈ C_c^R(X), with h ≥ κ_K, and φ(h) ≤ µ_φ(K) + ε.

Let us observe that Mh − f ≥ 0. Indeed, if we start with some arbitrary point x ∈ X, then either x ∈ K, in which case we have Mh(x) ≥ M ≥ f(x), or we have x ∈ X ∖ K, in which case Mh(x) ≥ 0 = f(x).

Using the positivity of φ we then get φ(Mh − f) ≥ 0, which by the choice of h gives

φ(f) ≤ φ(Mh) = Mφ(h) ≤ M(µ_φ(K) + ε).

Since the inequality φ(f) ≤ M(µ_φ(K) + ε) holds for arbitrary ε > 0, it will clearly force φ(f) ≤ Mµ_φ(K).
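As a sanity check of (30), and of the role played by the hypothesis K ⊃ supp f, one can test the inequality on the evaluation functional. The sketch below assumes (for illustration only) that φ(f) = f(x_0) for a fixed point x_0 ∈ X, and that the corresponding Riesz measure satisfies µ_φ(K) = 1 if x_0 ∈ K and µ_φ(K) = 0 otherwise.

```latex
% Illustration only: \varphi(f) = f(x_0), with \mu_\varphi(K) = 1 if x_0 \in K and 0 otherwise.
\begin{align*}
  x_0 \in K:\quad & \min_{x \in K} f(x) \cdot 1 \;\le\; f(x_0) \;\le\; \max_{x \in K} f(x) \cdot 1; \\
  x_0 \notin K,\ K \supset \operatorname{supp} f:\quad & f(x_0) = 0,
      \text{ so (30) reads } 0 \le 0 \le 0; \\
  x_0 \notin K,\ f(x_0) = 1:\quad & \varphi(f) = 1 > 0 = \max_{x \in K} f(x) \cdot \mu_\varphi(K),
      \text{ so (30) fails without } K \supset \operatorname{supp} f .
\end{align*}
```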

The Riesz measure can be implicitly characterized by the following result.




Proposition 7.7. With the notations above, the Riesz measure µ_φ is the unique Radon measure which has the interpolation property:

(i_φ) whenever F ⊂ X is compact, D ⊂ X is open, and f ∈ C_c^R(X) satisfies κ_F ≤ f ≤ κ_D, it follows that one has the inequality

µ_φ(F) ≤ φ(f) ≤ µ_φ(D).

Proof. Let us first show that µ_φ has property (i_φ). Start with F, D and f as in (i_φ). Since µ_φ(F) = ω_φ(F), by the definition of ω_φ, we immediately get the inequality µ_φ(F) ≤ φ(f).

To prove the inequality φ(f) ≤ µ_φ(D), we need some preparations. For every integer n ≥ 1 we define the sets

A_n = {x ∈ X : f(x) > 1/n} and B_n = {x ∈ X : f(x) ≥ 1/n}.

Define also the set E = {x ∈ X : f(x) > 0}, so that Ē = supp f. (Here we use the obvious fact that f ≥ 0.) The sets A_n, n ≥ 1, are open. The sets B_n, n ≥ 1, are closed, and contained in E ⊂ Ē, hence they are compact. Notice also that we have the inclusions

A_1 ⊂ B_1 ⊂ A_2 ⊂ B_2 ⊂ · · · ⊂ E ⊂ D.

For every n ≥ 1, we use Urysohn Lemma to find a continuous function h_n : X → [0, 1], with h_n|_{B_n} = 1 and h_n|_{X∖A_{n+1}} = 0. On the one hand, we notice that the function f(1 − h_n) has the support contained in the compact set Ē ∖ A_n ⊂ X ∖ A_n. Moreover, since we clearly have f(x) ≤ 1/n, ∀ x ∈ X ∖ A_n, by Lemma 7.4 we get the inequality

φ(f) = φ(fh_n) + φ(f(1 − h_n)) ≤ φ(fh_n) + µ_φ(Ē ∖ A_n)/n ≤ φ(fh_n) + µ_φ(Ē)/n, ∀ n ≥ 1,

which shows that

(32) φ(f) ≤ lim sup_{n→∞} φ(fh_n).

On the other hand, for each n ≥ 1, the function fh_n has support contained in B_{n+1}, and (fh_n)(x) ≤ 1, ∀ x ∈ B_{n+1}, so again by Lemma 7.4 combined with the inclusion B_{n+1} ⊂ D, we get

φ(fh_n) ≤ µ_φ(B_{n+1}) ≤ µ_φ(D).

Using (32) we immediately get φ(f) ≤ µ_φ(D).

We now prove the uniqueness. Let µ be a Radon measure with property (i_φ).

Claim 1: For any compact set K ⊂ X and any open set D ⊂ X, with K ⊂ D, one has the inequality

µ_φ(K) ≤ µ(D).

Choose a compact set L ⊂ X, with K ⊂ Int(L) ⊂ L ⊂ D, and use Urysohn Lemma to find a continuous function f : X → [0, 1] such that f|_K = 1 and f|_{X∖Int(L)} = 0. In particular, f has compact support, and satisfies κ_K ≤ f ≤ κ_D. Using (i_φ) for µ_φ and for µ, we then get µ_φ(K) ≤ φ(f) ≤ µ(D), and we are done.

Claim 2 : for every compact set K ⊂ X , one has the equality µφ(K ) = µ(K ).




On the one hand, by the definition of the Radon measure, we have

µ(K) = inf{µ(D) : D ⊂ X open, with D ⊃ K}.

By Claim 1, this immediately gives the inequality µ_φ(K) ≤ µ(K). On the other hand, if we choose, for every ε > 0, a function f_ε ∈ C_c^R(X) with f_ε ≥ κ_K and φ(f_ε) ≤ µ_φ(K) + ε, then the function g_ε = min{f_ε, 1} will also satisfy g_ε ≥ κ_K, and φ(g_ε) ≤ φ(f_ε) ≤ µ_φ(K) + ε. Applying (i_φ) for µ, with F = K and D = X, will then force µ(K) ≤ φ(g_ε) ≤ µ_φ(K) + ε. Since the inequality µ(K) ≤ µ_φ(K) + ε holds for all ε > 0, it will force µ(K) ≤ µ_φ(K).

Having proven Claim 2, we now see that, using condition (ii) in the definition of Radon measures (approximation of open sets by compact subsets), we get the equality µ(D) = µ_φ(D), for all open sets D ⊂ X. Using condition (iii) from the definition (regularity from above), it then follows that µ(B) = µ_φ(B), ∀ B ∈ Bor(X).

Comment. The Riesz correspondence

{positive R-linear maps C_c^R(X) → R} ∋ φ ⟼ µ_φ ∈ {Radon measures on X}

will be studied in Chapter IV, where we will eventually prove the fact that it is bijective. At this point we simply regard it as a method of constructing Radon measures.

Proposition 7.8. Let X be a locally compact space. Then the Riesz correspondence is “linear” in the following sense.

(i) If φ : C_c^R(X) → R is a positive R-linear map, and t ∈ [0, ∞), then tφ is also a positive R-linear map, and one has the equality µ_{tφ} = tµ_φ.

(ii) If φ_1, φ_2 : C_c^R(X) → R are positive R-linear maps, then φ_1 + φ_2 is also a positive R-linear map, and one has the equality µ_{φ_1+φ_2} = µ_{φ_1} + µ_{φ_2}.

Proof. (i). Assume φ is positive and t ∈ [0, ∞). The fact that tφ is positive is trivial. We know, by Proposition 7.2, that tµ_φ is a Radon measure. Then the equality µ_{tφ} = tµ_φ follows from the uniqueness part of Proposition 7.7, combined with the obvious fact that tµ_φ has the interpolation property (i_{tφ}).

(ii). If φ_1 and φ_2 are positive, then so is φ_1 + φ_2. Define ψ = φ_1 + φ_2, and ν = µ_{φ_1} + µ_{φ_2}. By Proposition 7.2, we again know that ν is a Radon measure. The equality µ_ψ = ν follows from the uniqueness part of Proposition 7.7, combined with the obvious fact that ν has the interpolation property (i_ψ).
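The "obvious facts" used in the proof amount to multiplying, respectively adding, the interpolation inequalities. A sketch of this verification, with F, D and f as in the interpolation property:

```latex
% Verification sketch: F compact, D open, \kappa_F \le f \le \kappa_D, t \ge 0.
\begin{align*}
  \mu_\varphi(F) \le \varphi(f) \le \mu_\varphi(D)
    &\;\Longrightarrow\; (t\mu_\varphi)(F) \le (t\varphi)(f) \le (t\mu_\varphi)(D), \\
  \mu_{\varphi_j}(F) \le \varphi_j(f) \le \mu_{\varphi_j}(D),\ j = 1, 2
    &\;\Longrightarrow\; (\mu_{\varphi_1} + \mu_{\varphi_2})(F) \le (\varphi_1 + \varphi_2)(f)
        \le (\mu_{\varphi_1} + \mu_{\varphi_2})(D) .
\end{align*}
```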

The Riesz correspondence is also functorial, with respect to proper maps, in the following sense.

Proposition 7.9. Let X and Y be locally compact spaces, let Φ : X → Y be a proper continuous map, and let φ : C_c^R(X) → R be a positive linear map.

(i) Whenever f : Y → R is a continuous function with compact support, it follows that the composition f ∘ Φ : X → R is also a continuous function with compact support.

(ii) The map

ψ : C_c^R(Y) ∋ f ⟼ φ(f ∘ Φ) ∈ R

is R-linear and positive.




(iii) If µ_φ is the Riesz measure on X defined by φ, and if µ_ψ is the Riesz measure on Y defined by ψ, then one has the equality

µ_ψ(B) = µ_φ(Φ^{-1}(B)), ∀ B ∈ Bor(Y).

Proof. (i). This statement is trivial, since Φ is proper.

(ii). The linearity of ψ is a consequence of the linearity of the map

T : C_c^R(Y) ∋ f ⟼ f ∘ Φ ∈ C_c^R(X),

and of the obvious equality ψ = φ ∘ T. The positivity is clear, since f ≥ 0 forces f ∘ Φ ≥ 0.

(iii). Use Proposition 7.3, which states that the map ν : Bor(Y) → [0, ∞], defined by

ν(B) = µ_φ(Φ^{-1}(B)), ∀ B ∈ Bor(Y),

is a Radon measure. In order to prove statement (iii), which reads µ_ψ = ν, we observe that, using the uniqueness part of Proposition 7.7, it suffices to prove that ν has the interpolation property (i_ψ). Fix then a compact set K and an open set D ⊂ Y, as well as a function f ∈ C_c^R(Y), such that κ_K ≤ f ≤ κ_D, and let us prove the inequalities

(33) ν(K) ≤ ψ(f) ≤ ν(D).

Define the compact set L = Φ^{-1}(K) (here we use the fact that Φ is proper), and define the open set E = Φ^{-1}(D) ⊂ X, so that ν(K) = µ_φ(L) and ν(D) = µ_φ(E). If we define, using (i), the function g = f ∘ Φ ∈ C_c^R(X), then we have ψ(f) = φ(g), and the inequalities (33) are the same as the inequalities

µ_φ(L) ≤ φ(g) ≤ µ_φ(E).

But these inequalities follow immediately from the interpolation property of µ_φ, combined with the obvious inequalities κ_L ≤ g ≤ κ_E.
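A simple test case for statement (iii) is a dilation of the real line. The sketch below assumes, for illustration only, that X = Y = R, that φ is the Riemann integral on compactly supported continuous functions, and that its Riesz measure is the Lebesgue measure λ (as asserted in Exercise 6); the map Φ(x) = 2x is a hypothetical choice.

```latex
% Illustration only: X = Y = \mathbb{R}, \varphi(f) = \int f(x)\,dx, \mu_\varphi = \lambda, \Phi(x) = 2x.
\begin{align*}
  \psi(f) &= \varphi(f \circ \Phi) = \int_{-\infty}^{\infty} f(2x)\,dx
           = \tfrac{1}{2} \int_{-\infty}^{\infty} f(y)\,dy = \tfrac{1}{2}\,\varphi(f), \\
  \mu_\psi &= \mu_{\frac{1}{2}\varphi} = \tfrac{1}{2}\,\lambda
      \quad \text{(by Proposition 7.8 (i))}, \\
  \mu_\varphi\bigl(\Phi^{-1}(B)\bigr) &= \lambda\bigl(\tfrac{1}{2} B\bigr) = \tfrac{1}{2}\,\lambda(B),
      \qquad B \in \operatorname{Bor}(\mathbb{R}).
\end{align*}
```

Both sides of (iii) are therefore equal to λ(B)/2 in this example.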

Remarks 7.4. Let X be a locally compact space, let φ : C_c^R(X) → R be a positive R-linear map, and let µ_φ be the Riesz measure defined by φ.

A. One has the equality

(34) µ_φ(X) = sup{φ(f) : f ∈ C_c^R(X), 0 ≤ f ≤ 1}.

Indeed, if we denote the right hand side of (34) by M, then the inequality µ_φ(X) ≥ M is immediate from the interpolation property. Conversely, if for each compact set K ⊂ X, we choose (use Urysohn Lemma) some continuous function f_K : X → [0, 1], with compact support, such that f_K|_K = 1, then by the interpolation property we get M ≥ φ(f_K) ≥ µ_φ(K), so we have

M ≥ sup{µ_φ(K) : K ∈ C_X} = µ_φ(X).

B. As a consequence of the equality (34), and of Remark II.5.4, we get the equivalence

φ continuous ⇔ µ_φ(X) < ∞.

Moreover, in this case one has the equality ‖φ‖ = µ_φ(X).

C. Assume X is non-compact, and φ is continuous. Then φ can be extended to a positive linear function φ̃ on the completion C_0^R(X) of C_c^R(X). In this case the Riesz correspondence has a nice connection with the Alexandrov compactification X^α = X ∪ {∞} (see I.5 and II.5). Recall that C_0^R(X) is identified with the space of all continuous functions f : X^α → R with f(∞) = 0. Moreover, φ̃ has a unique extension to a positive linear map ψ : C^R(X^α) → R, with ‖ψ‖ = ‖φ̃‖ = ‖φ‖.




We can then consider two Riesz measures µφ on X , and µψ on X α. One has theequality

(35) µψ(B) = µφ(B ∩ X ), ∀ B ∈ Bor(X α

).First of all, remark that

(36) µψ(K ) = µφ(K ), ∀ K ∈ CX .

This is a consequence of the fact that for every g ∈ C R(X α) with g ≥ κ K , thereexists some f ∈ C Rc (X ), with g ≥ f ≥ κ K (Simply take f = gh, for some continuousfunction h : X → [0, 1] with compact support, with h

K

= 1.) Using (36), weimmediately get the equality

(37) µψ(B ∩ X ) = µφ(B ∩ X ), ∀ B ∈ Bor(X α).

Using this with B = X , we get

µψ(X ) = µφ(X ) = φ = ψ = µψ(X α),

which forces µψ(∞) = 0, and then (35) is immediate from (37)Exercise 6 . Consider the case when X = Rn. For every continuous function

f : R^n → R, with compact support, we define

φ(f) = ∫_{a1}^{b1} ∫_{a2}^{b2} ··· ∫_{an}^{bn} f(x1, x2, ..., xn) dx1 dx2 ··· dxn,

where the numbers a1 < b1, ..., an < bn are chosen (arbitrarily) such that

supp f ⊂ [a1, b1] × [a2, b2] × ··· × [an, bn].

(One can show that the multiple integral is independent of the choice of the a's and the b's.) It is obvious that this way we have constructed a positive R-linear map φ : C_c^R(R^n) → R. The Riesz measure µφ, defined by φ, is precisely the Lebesgue measure λn.

Hint: Compute the values of µφ on compact boxes.
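The independence of φ(f) from the chosen box can be checked numerically in dimension n = 1. The following is only a sketch under stated assumptions: the "tent" function and the midpoint Riemann sum (the helper names `tent` and `phi` are illustrative, not part of the notes) stand in for a general compactly supported continuous function and for the Riemann integral.

```python
def tent(x):
    """Continuous function on R with compact support [-1, 1]."""
    return max(0.0, 1.0 - abs(x))

def phi(f, a, b, n):
    """Midpoint Riemann sum of f over the box [a, b], using n cells."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# Two different boxes, both containing supp(tent) = [-1, 1].
tight = phi(tent, -1.0, 1.0, 1000)
loose = phi(tent, -3.0, 5.0, 4000)
print(tight, loose)
```

Both boxes give the same value (the area under the tent), because the function vanishes outside its support; enlarging the box only adds cells on which the integrand is zero.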

We conclude this section with an important result from harmonic analysis. The main object of study is explained in the following.

Definition. A topological group is a group G, which comes also equipped with a topology, which is compatible with the group structure in the sense that the map

G × G ∋ (g, h) ⟼ gh⁻¹ ∈ G

is continuous. Remark that this is equivalent to the fact that both maps G × G ∋ (g, h) ⟼ gh ∈ G and G ∋ g ⟼ g⁻¹ ∈ G are continuous. To avoid any complications, all topological groups are assumed to be Hausdorff.

Examples 7.2. A. Any group becomes a topological group, when equipped with the discrete topology. (This is the topology in which every subset is open.)

B. The group (R^n, +) is a topological group, when equipped with the norm topology.

C. The unit circle T = { z ∈ C : |z| = 1 } is a topological group, when equipped with the usual multiplication, and the topology induced from C. More generally, for an integer n ≥ 1, the n-dimensional torus T^n, equipped with coordinate-wise multiplication, and the product topology, is a topological group.

D. Given an integer n ≥ 1, the group GLn(R), of all invertible n × n matrices (with matrix multiplication as the group operation), is a topological group, when equipped with the topology coming from the identification of GLn(R) as an open subset in R^{n²}.

Notations. Let G be a group. For a subset A ⊂ G and an element g ∈ G, we define the left and right translations of A by g, as the sets

gA = { gh : h ∈ A } and Ag = { hg : h ∈ A }.

For two subsets A, B ⊂ G, we define

A · B = { hk : h ∈ A, k ∈ B }.

Finally, for a subset A ⊂ G, we define A⁻¹ = { h⁻¹ : h ∈ A }.
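To make the set operations above concrete, here is a small sketch in the finite (discrete) group of permutations of three symbols. The encoding of permutations as tuples and the helper names (`mul`, `inv`, `left_translate`, and so on) are assumptions made for this illustration only.

```python
from itertools import permutations

def mul(p, q):
    """Composition of permutations stored as tuples: (p*q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)

def inv(p):
    """Inverse permutation."""
    r = [0] * len(p)
    for i, pi in enumerate(p):
        r[pi] = i
    return tuple(r)

def left_translate(g, A):    # gA = { gh : h in A }
    return {mul(g, h) for h in A}

def right_translate(A, g):   # Ag = { hg : h in A }
    return {mul(h, g) for h in A}

def product(A, B):           # A·B = { hk : h in A, k in B }
    return {mul(h, k) for h in A for k in B}

def inverse_set(A):          # A^{-1} = { h^{-1} : h in A }
    return {inv(h) for h in A}

G = set(permutations(range(3)))
e = (0, 1, 2)
g = (1, 0, 2)                # transposition exchanging 0 and 1
A = {e, (0, 2, 1)}           # subgroup generated by the transposition of 1 and 2

gA = left_translate(g, A)
Ag = right_translate(A, g)
print(sorted(gA), sorted(Ag))
```

In a non-commutative group the two translates gA and Ag generally differ, which is why left and right invariance of a measure have to be distinguished later on.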

Remark 7.5. There is some similarity between topological groups and metric spaces. The subsets that play the role of open balls are the open neighborhoods of the identity. More explicitly, if G is a topological group, with identity element e, then for every g ∈ G one has the equalities

{ N ⊂ G : N open neighborhood of g } = { gV ⊂ G : V open neighborhood of e } = { Wg ⊂ G : W open neighborhood of e }.

For example, given a metric space (X, d), a map f : G → X is continuous at some point g ∈ G, if and only if, for every ε > 0, there exists some neighborhood V_ε of e, such that

d( f(gh), f(g) ) < ε, ∀ h ∈ V_ε.

The following two results will be used several times.

Lemma 7.5. Suppose G is a topological group, with identity element e. For any open neighborhood U of e, there exists an open neighborhood V of e, such that V = V⁻¹ and V · V ⊂ U.

Proof. Fix the open neighborhood U. Use the continuity of the map G × G ∋ (g, h) ⟼ gh ∈ G, at (e, e), to find an open neighborhood D of (e, e) in G × G, such that

gh ∈ U, ∀ (g, h) ∈ D.

Since D is open in the product topology, there exist open neighborhoods U1 and U2, of e, such that U1 × U2 ⊂ D. Then we obviously have

U1 · U2 ⊂ U.

Consider the open neighborhood W = U1 ∩ U2. We still have W · W ⊂ U. Finally, using the continuity of the map G ∋ g ⟼ g⁻¹ ∈ G, it follows that W⁻¹ is also an open neighborhood of e. Then we are done, if we take V = W ∩ W⁻¹.

Proposition 7.10. Let G be a topological group, and let K, L ⊂ G be two compact disjoint sets. Then there exists an open neighborhood V of the identity element e, such that V = V⁻¹ and (K · V) ∩ (L · V) = (V · K) ∩ (V · L) = ∅.

Proof. Consider the continuous maps φ, φ′ : G × G → G, defined by φ(g, h) = gh⁻¹ and φ′(g, h) = g⁻¹h, and the compact set C = (K × L) ∪ (L × K) ⊂ G × G. Since φ and φ′ are continuous, it follows that E = φ(C) ∪ φ′(C) is a compact subset of G. The condition K ∩ L = ∅ obviously gives the fact that e ∉ E. Since E is closed, there exists some open neighborhood U of e, such that E ∩ U = ∅. Use Lemma 7.5 to find some open neighborhood V of e, such that V = V⁻¹ and V · V ⊂ U.

We now show that (K · V) ∩ (L · V) = ∅. Suppose the contrary, i.e. there exist g ∈ K, h ∈ L, and v, w ∈ V, such that gv = hw. Then we get h⁻¹g = wv⁻¹ ∈ V · V⁻¹ = V · V ⊂ U, which is impossible, since h⁻¹g = φ′(h, g) also belongs to E.

Finally, let us show that we also have (V · K) ∩ (V · L) = ∅. Suppose the contrary, i.e. there exist g ∈ K, h ∈ L, and v, w ∈ V, such that vg = wh. Then we get hg⁻¹ = w⁻¹v ∈ V⁻¹ · V = V · V ⊂ U, which is impossible, since hg⁻¹ = φ(h, g) also belongs to E.

In what follows we are going to restrict our attention to those topological groups which are locally compact in their respective topology. The topological groups listed in Examples 7.2.A-D are all locally compact.

Definition. Let G be a locally compact group. A Radon measure µ on G is called a Haar measure on G, if µ(G) > 0, and µ has the left invariance property:

µ(gA) = µ(A), ∀ g ∈ G, A ∈ Bor(G).

Remark that, for every g ∈ G, the map ℓ_g : G ∋ h ⟼ gh ∈ G is a homeomorphism, so for a subset A ⊂ G, one has the equivalence A ∈ Bor(G) ⇔ gA = ℓ_g(A) ∈ Bor(G). Likewise, the map r_g : G ∋ h ⟼ hg ∈ G is a homeomorphism, so A ∈ Bor(G) ⇔ Ag ∈ Bor(G).

Remark 7.6. Let G be a locally compact group. For any element g ∈ G, and any function F ∈ C_c^R(G), we define the continuous functions L_gF, R_gF : G → R by L_gF = F ∘ ℓ_{g⁻¹} and R_gF = F ∘ r_g, where ℓ_g and r_g denote the multiplication maps h ⟼ gh and h ⟼ hg. In other words,

(L_gF)(h) = F(g⁻¹h) and (R_gF)(h) = F(hg), ∀ h ∈ G.

It is fairly obvious that L_gF and R_gF both have compact support. Moreover, for a fixed g ∈ G, the maps L_g, R_g : C_c^R(G) → C_c^R(G) are linear, and continuous in the norm defined in Exercise 5. One has the equalities

L_{gh} = L_g ∘ L_h and R_{gh} = R_g ∘ R_h, ∀ g, h ∈ G,

as well as L_e = R_e = Id, where e denotes the identity element in G.

The following result gives a sufficient condition for a Riesz measure to be a Haar measure.

Proposition 7.11. Let G be a locally compact group, and let φ : C_c^R(G) → R be a positive R-linear map, which is not identically zero, and has the left invariance property:

φ ∘ L_g = φ, ∀ g ∈ G.

Then the Riesz measure µφ is a Haar measure on G.

Proof. The key property we need is contained in the following

Claim 1: For any g ∈ G, and any compact subset K ⊂ G, one has the equality µφ(gK) = µφ(K).

Fix for the moment g ∈ G, as well as the compact set K ⊂ G. The set gK is compact, so we have

(38) µφ(gK) = inf { φ(F) : F ∈ C_c^R(G), F ≥ κ_{gK} }.

Notice that if F ∈ C_c^R(G) satisfies F ≥ κ_{gK}, this means that F(gh) ≥ κ_{gK}(gh), ∀ h ∈ G. Notice also that, for any h ∈ G, one has the equivalences

κ_{gK}(gh) = 1 ⇔ gh ∈ gK ⇔ h ∈ K,

which means that

κ_{gK}(gh) = κ_K(h), ∀ h ∈ G.

The inequality F ≥ κ_{gK} then gives

F(gh) ≥ κ_K(h), ∀ h ∈ G,

which reads

L_{g⁻¹}F ≥ κ_K.

Using the invariance property, we get

µφ(K) ≤ φ( L_{g⁻¹}(F) ) = (φ ∘ L_{g⁻¹})(F) = φ(F).

In other words, we have

φ(F) ≥ µφ(K), for all F ∈ C_c^R(G) with F ≥ κ_{gK}.

Using (38) this immediately gives

µφ(K) ≤ µφ(gK).

Applying the same inequality with g replaced by g⁻¹ and K replaced by gK, yields

µφ(gK) ≤ µφ( g⁻¹(gK) ) = µφ(K),

so the Claim follows.

Claim 2: For any g ∈ G, and any open subset D ⊂ G, one has the equality µφ(gD) = µφ(D).

For a compact subset L ⊂ G, one clearly has the equivalence L ⊂ gD ⇔ g⁻¹L ⊂ D. So, using Claim 1, for every compact subset L ⊂ gD, one has

µφ(L) = µφ(g⁻¹L) ≤ µφ(D),

and using property (iii) for Radon measures, we immediately get the inequality

µφ(gD) = sup { µφ(L) : L compact, L ⊂ gD } ≤ µφ(D).

The inequality µφ(D) ≤ µφ(gD) is proven by replacing g with g⁻¹ and D with gD, in the above inequality.

We now prove that µφ is a Haar measure. Start with some Borel set A ⊂ G. For every open set D ⊃ gA, one has the inclusion g⁻¹D ⊃ A, which using Claim 2, gives µφ(D) = µφ(g⁻¹D) ≥ µφ(A). Using property (ii) in the definition of Radon measures, we then have

µφ(gA) = inf { µφ(D) : D open, D ⊃ gA } ≥ µφ(A).

The inequality µφ(A) ≥ µφ(gA) is proven by replacing g with g⁻¹ and A with gA, in the above inequality.

Comment. Later on, in Chapter IV, we are going to prove that the left invariance property of φ is also a necessary condition for µφ to be a Haar measure.

Examples 7.3. Let us examine the examples 7.2.A-D and let us construct Haar measures on these groups.

A. On a discrete group G, one has the counting measure µ(A) = Card A, ∀ A ⊂ G, which is obviously a Haar measure.

B. On (Rn, +), the Lebesgue measure is a Haar measure.




C. On the n-dimensional torus T^n, we consider the Riesz measure µΛ, associated with the positive R-linear map Λ : C^R(T^n) → R, defined by

Λ(F) = ∫_0^1 ··· ∫_0^1 F(e^{2πiθ1}, ..., e^{2πiθn}) dθ1 ... dθn.

It is not hard to see that Λ ∘ L_g = Λ, ∀ g ∈ T^n. One easy way is to check directly the equality (Λ ∘ L_g)(P) = Λ(P), for functions of the form P(z1, ..., zn) = z1^{m1} ··· zn^{mn}, with m1, ..., mn ∈ Z, and then use continuity and the Stone-Weierstrass Theorem which gives the fact that the linear span of all these P's is dense in C^R(T^n). Using Proposition 7.11 it follows that µΛ is a Haar measure on T^n.
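The invariance Λ ∘ L_g = Λ can be observed numerically on the circle (n = 1). The sketch below is an illustration under assumptions: the integral is replaced by an equally spaced Riemann sum (which happens to be exact for trigonometric polynomials of low degree), and the names `lam`, `translate` and the sample function `F` are hypothetical.

```python
import cmath
import math

def lam(F, n_points=64):
    """Riemann sum for the integral of F(e^{2 pi i theta}) over [0, 1]."""
    total = sum(F(cmath.exp(2j * math.pi * k / n_points)) for k in range(n_points))
    return total / n_points

def translate(g, F):
    """Left translate (L_g F)(z) = F(g^{-1} z), for g on the unit circle."""
    return lambda z: F(z / g)

def F(z):
    """A real-valued trigonometric polynomial of degree 2 on the circle."""
    return 1.0 + 3.0 * z.real + (z * z).imag

g = cmath.exp(2j * math.pi * 0.3141)   # an arbitrary rotation
value = lam(F)
translated_value = lam(translate(g, F))
print(value, translated_value)
```

Rotating the argument only permutes the frequencies' phases, so the constant term, which is what Λ extracts, is unchanged.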

D. The construction of a Haar measure on GLn(R) is outlined in the following.

Exercise 7*. Identify GLn(R) as an open subset in R^{n²}. For every continuous function F : GLn(R) → R, with compact support, we define the function F̆ : R^{n²} → R by

F̆(x) = F(x) · |det x|^{−n} if x ∈ GLn(R), and F̆(x) = 0 if x ∈ R^{n²} ∖ GLn(R),

and we define

ψ(F) = ∫_{a1}^{b1} ∫_{a2}^{b2} ··· ∫_{a_{n²}}^{b_{n²}} F̆(x1, x2, ..., x_{n²}) dx1 dx2 ··· dx_{n²},

where the numbers a1 < b1, ..., a_{n²} < b_{n²} are chosen (arbitrarily) such that

supp F̆ ⊂ [a1, b1] × [a2, b2] × ··· × [a_{n²}, b_{n²}].

(One has the equality supp F̆ = supp F, and the multiple integral is independent of the choice of the a's and the b's.) Prove that ψ ∘ L_s = ψ, ∀ s ∈ GLn(R). Conclude that the Riesz measure µψ associated with ψ is a Haar measure on GLn(R).

Hints: Fix s ∈ GLn(R). The left multiplication map ℓ_{s⁻¹} : GLn(R) → GLn(R) has an obvious linear extension Φs : R^{n²} → R^{n²}, defined by

Φs(x) = s⁻¹x, ∀ x ∈ R^{n²},

where the vector space R^{n²} is identified with Mat_{n×n}(R). Fix now F ∈ C_c^R(GLn(R)) and consider the function H = F ∘ ℓ_{s⁻¹}, so that (ψ ∘ L_s)(F) = ψ(H). Prove the equality

H̆(x) = F̆(Φs(x)) · |det s|^{−n}, ∀ x ∈ R^{n²}.

Prove that the Jacobian of Φs satisfies |det[(DΦs)(x)]| = |det s|^{−n}, ∀ x ∈ R^{n²}. Use this equality, combined with the above formula for H̆, to get the equality ψ(H) = ψ(F), as a result of the change of variable theorem. (Use the fact that in the definition of ψ, instead of integrating over rectangles one can integrate over arbitrary compact sets Ω ⊂ GLn(R), with Jordan negligible boundary, and Int(Ω) ⊃ supp F.)

Comments. The Haar measures defined in Examples 7.3.A-D are peculiar in the sense that they also have the right invariance property:

µ(Ag) = µ(A), ∀ g ∈ G, A ∈ Bor(G).

In general such a property does not hold. At this point, we can only speculate on this matter, by examining the following example.




Exercise 8*. Consider the group G of all orientation preserving affine transformations of R, i.e. the collection

G = { T_{ab} : a, b ∈ R, a > 0 },

where T_{ab} : R ∋ x ⟼ ax + b ∈ R. (Some people call this the “ax + b” group.) It is not hard to see that compositions and inverses of such transformations are again of this form. In fact one can identify G as the subgroup of GL2(R) given by

G = { [ a b ; 0 1 ] : a, b ∈ R, a > 0 },

where [ a b ; 0 1 ] denotes the 2 × 2 matrix with rows (a, b) and (0, 1). The topology on G is the one induced from this inclusion. Equivalently, G can be identified with the right half-plane (0, ∞) × R. We use this identification to define a positive R-linear map Λ : C_c^R(G) → R as follows. For every F ∈ C_c^R(G), we choose 0 < c1 < d1 and c2 < d2, such that supp F ⊂ [c1, d1] × [c2, d2], and we define

Λ(F) = ∫_{c1}^{d1} ∫_{c2}^{d2} ( F(a, b) / a² ) db da.

The integral does not depend on the particular choice of the rectangle. Prove that Λ ∘ L_g = Λ, ∀ g ∈ G, so that the Riesz measure µΛ is a Haar measure. In general the equality Λ ∘ R_g = Λ fails. As indicated in the comment that followed Proposition 7.11, the fact that Λ ∘ R_g ≠ Λ would prevent the Riesz measure µΛ from having the right invariance property.

Hints: Use similar arguments to the ones in Exercise 7. If g = T_{ab} ∈ G, then the map ℓ_{g⁻¹} : G → G extends to a linear map Φg : R² → R², defined by

Φg(x, y) = (ax + by, y), ∀ (x, y) ∈ R².

Argue as in Exercise 7, and use the change of variable theorem.
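The behaviour described in this exercise can be observed numerically. The sketch below is only an illustration under assumptions: G is identified with the half-plane (0, ∞) × R with product (a0, b0)·(a, b) = (a0·a, a0·b + b0), the measure is taken to have density 1/a² with respect to da db, and the measure of a translated rectangle is computed by the change of variable formula with a finite-difference Jacobian. All function names are hypothetical.

```python
def density(a, b):
    """Density 1/a^2 of the candidate Haar measure on the half-plane."""
    return 1.0 / (a * a)

def left(g, x):
    """Left translation x -> g*x in the ax+b group."""
    (a0, b0), (a, b) = g, x
    return (a0 * a, a0 * b + b0)

def right(g, x):
    """Right translation x -> x*g in the ax+b group."""
    (a0, b0), (a, b) = g, x
    return (a * a0, a * b0 + b)

def jacobian(f, x, eps=1e-6):
    """|det Df(x)| by central differences (exact up to rounding for affine f)."""
    a, b = x
    fa_p, fa_m = f((a + eps, b)), f((a - eps, b))
    fb_p, fb_m = f((a, b + eps)), f((a, b - eps))
    d11 = (fa_p[0] - fa_m[0]) / (2 * eps)
    d21 = (fa_p[1] - fa_m[1]) / (2 * eps)
    d12 = (fb_p[0] - fb_m[0]) / (2 * eps)
    d22 = (fb_p[1] - fb_m[1]) / (2 * eps)
    return abs(d11 * d22 - d12 * d21)

def measure_of_image(f, rect, n=200):
    """Measure of f(rect): integral over rect of density(f(x)) * |det Df(x)|."""
    (c1, d1), (c2, d2) = rect
    ha, hb = (d1 - c1) / n, (d2 - c2) / n
    total = 0.0
    for i in range(n):
        a = c1 + (i + 0.5) * ha
        for j in range(n):
            x = (a, c2 + (j + 0.5) * hb)
            total += density(*f(x)) * jacobian(f, x) * ha * hb
    return total

rect = ((1.0, 2.0), (0.0, 1.0))   # the rectangle [1, 2] x [0, 1]
g = (3.0, -1.0)                   # the transformation x -> 3x - 1

base = measure_of_image(lambda x: x, rect)
left_m = measure_of_image(lambda x: left(g, x), rect)
right_m = measure_of_image(lambda x: right(g, x), rect)
print(base, left_m, right_m)
```

In this sketch the left translate has the same measure as the rectangle, while the right translate has its measure divided by the dilation factor a0 = 3, which matches the failure of right invariance claimed above.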

Exercise 9. As indicated above, in general, Haar measures need not have the right invariance property. Prove that when µ is a Haar measure on G, then the map ν : Bor(G) → [0, ∞] defined by

ν(B) = µ(B⁻¹), ∀ B ∈ Bor(G),

is a Radon measure, which has the right invariance property.

Hint: The map G ∋ g ⟼ g⁻¹ ∈ G is a homeomorphism.

The main result we are interested in is the existence of a Haar measure. The following result reduces the problem to the existence of a left invariant content.

Lemma 7.6. Let G be a locally compact group, and let ω be a content on G, with the left invariance property:

ω(gK) = ω(K), ∀ g ∈ G, K ∈ C_G.

If ω is not identically zero, then the outer measure ω∗, induced by ω, also has the left invariance property:

ω∗(gA) = ω∗(A), ∀ g ∈ G, A ⊂ G.

The measure µ = ω∗|_{Bor(G)} is a Haar measure on G.

Proof. We trace the construction outlined in Theorem 7.1. Denote by T_G the collection of all open subsets of G, and define the map ω̃ : T_G → [0, ∞] by

ω̃(D) = sup { ω(K) : K ∈ C_G, K ⊂ D }, ∀ D ∈ T_G.

The outer measure ω∗ is then defined by

ω∗(A) = inf { ω̃(D) : D ∈ T_G, D ⊃ A }, ∀ A ⊂ G.

Claim: The map ω̃ : T_G → [0, ∞] has the left invariance property:

ω̃(gD) = ω̃(D), ∀ g ∈ G, D ∈ T_G.

Start with some arbitrary compact subset K ⊂ gD. Then g⁻¹K is a compact subset of D, so by the left invariance property of ω, we get

ω(K) = ω(g⁻¹K) ≤ ω̃(D).

This means that we have ω(K) ≤ ω̃(D), for all compact subsets K ⊂ gD, so by the definition of ω̃ we get

ω̃(gD) = sup { ω(K) : K ∈ C_G, K ⊂ gD } ≤ ω̃(D).

The other inequality ω̃(D) ≤ ω̃(gD), follows from the one above if we replace g with g⁻¹ and D with gD.

We are now in position to prove that ω∗ has the left invariance property. Fix for the moment A ⊂ G and g ∈ G. For every open set D ⊃ gA, one has g⁻¹D ⊃ A, so by the Claim we get

ω̃(D) = ω̃(g⁻¹D) ≥ ω∗(A).

Since we have ω̃(D) ≥ ω∗(A), for all open sets D ⊃ gA, by the definition of ω∗, we get

ω∗(gA) = inf { ω̃(D) : D ∈ T_G, D ⊃ gA } ≥ ω∗(A).

The other inequality ω∗(A) ≥ ω∗(gA), follows from the one above if we replace g with g⁻¹ and A with gA.

In order to prove that µ is a Haar measure, all we need to prove is the fact that µ(G) > 0. Start with some compact subset K ⊂ G, with ω(K) > 0. Every open set D ⊃ K satisfies ω̃(D) ≥ ω(K), so we have

µ(G) ≥ µ(K) = ω∗(K) ≥ ω(K) > 0,

and we are done.

Before we prove the existence of Haar measures, we need more preparations.

Notations. Let G be a group. For two non-empty subsets A, B ⊂ G, we write A ≼ B, if there exist elements g1, ..., gn ∈ G, such that A ⊂ g1B ∪ ··· ∪ gnB. In this case we define the number

[A : B] = min { n ∈ N : there exist g1, ..., gn ∈ G with A ⊂ g1B ∪ ··· ∪ gnB }.

The following result will be useful.

Lemma 7.7. Let G be a group.

(i) If A, B ⊂ G are non-empty sets with A ⊂ B, then A ≼ B, and [A : B] = 1.

(ii) The relation ≼ is transitive, i.e. whenever A, B, C ⊂ G are non-empty subsets satisfying A ≼ B and B ≼ C, it follows that A ≼ C. Moreover, in this case one has the inequality [A : C] ≤ [A : B] · [B : C].

(iii) The relation ≼ is compatible with left translations. This means that for any two elements g, h ∈ G, and any two non-empty subsets A, B ⊂ G, one has the equivalence A ≼ B ⇔ gA ≼ hB. Moreover, in this case one has

[gA : hB] = [A : B].

(iv) If A, B, C ⊂ G are non-empty subsets such that A ≼ C and B ≼ C, then A ∪ B ≼ C. Moreover, in this case one has the inequality

[A ∪ B : C] ≤ [A : C] + [B : C].

(v) If A, B, C ⊂ G are non-empty sets, such that A ≼ C, B ≼ C, and (A · C⁻¹) ∩ (B · C⁻¹) = ∅, then one has the equality

[A ∪ B : C] = [A : C] + [B : C].

Proof. (i) This part is trivial.

(ii) Put m = [A : B] and n = [B : C]. Choose g1, ..., gm, h1, ..., hn ∈ G, such that A ⊂ g1B ∪ ··· ∪ gmB, and B ⊂ h1C ∪ ··· ∪ hnC. We then obviously have the inclusion

A ⊂ ⋃_{i=1}^m ⋃_{j=1}^n (g_i h_j)C,

which proves that A ≼ C, but also shows that [A : C] ≤ mn.

(iii) This follows immediately from (ii) plus the obvious relations A ≼ gA ≼ A, B ≼ hB ≼ B, and the equalities

[A : gA] = [gA : A] = [B : hB] = [hB : B] = 1.

(iv) Let m = [A : C] and n = [B : C]. Choose g1, ..., gm, g_{m+1}, ..., g_{m+n} ∈ G such that A ⊂ g1C ∪ ··· ∪ gmC and B ⊂ g_{m+1}C ∪ ··· ∪ g_{m+n}C. This clearly shows that A ∪ B ≼ C and [A ∪ B : C] ≤ m + n.

(v) Let p = [A ∪ B : C], and choose g1, ..., gp ∈ G, such that A ∪ B ⊂ g1C ∪ ··· ∪ gpC. Define the sets

M = { j ∈ {1, ..., p} : A ∩ g_jC ≠ ∅ } and N = { k ∈ {1, ..., p} : B ∩ g_kC ≠ ∅ }.

Notice that M ∩ N = ∅. Indeed, if there exists j ∈ M ∩ N, this means that on the one hand, we have A ∩ g_jC ≠ ∅, which gives g_j ∈ A · C⁻¹, and on the other hand, we have B ∩ g_jC ≠ ∅ which gives g_j ∈ B · C⁻¹. But this clearly contradicts the assumption that (A · C⁻¹) ∩ (B · C⁻¹) = ∅.

By the definition of M and N, we clearly have the inclusions

A ⊂ ⋃_{j∈M} g_jC and B ⊂ ⋃_{k∈N} g_kC.

These immediately give the inequalities [A : C] ≤ card M and [B : C] ≤ card N. Since M and N are disjoint, and M ∪ N ⊂ {1, ..., p}, these inequalities give

[A : C] + [B : C] ≤ card M + card N = card(M ∪ N) ≤ p = [A ∪ B : C].

Using part (iv), we see that in fact we have equality [A : C] + [B : C] = [A ∪ B : C].
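The covering numbers [A : B] can be computed by brute force in a small finite group, which gives a quick sanity check of parts (ii), (iv) and (v). The sketch below assumes the cyclic group Z/12 written additively (so that A · C⁻¹ becomes the difference set A − C); the helper names `translate`, `index` and `difference` are illustrative only.

```python
from itertools import combinations

N = 12  # the cyclic group Z/12, written additively

def translate(g, B):
    """The translate g + B."""
    return frozenset((g + b) % N for b in B)

def index(A, B):
    """[A : B]: least number of translates of B needed to cover A."""
    A = frozenset(A)
    translates = {translate(g, B) for g in range(N)}
    for n in range(1, N + 1):
        for family in combinations(translates, n):
            if A <= frozenset().union(*family):
                return n
    raise ValueError("no cover found")  # not reached when B is non-empty

def difference(A, C):
    """Additive version of A·C^{-1}: the set { a - c : a in A, c in C }."""
    return {(a - c) % N for a in A for c in C}

C = {0, 1}
A, B = {0, 1, 2}, {6, 7}          # far apart: (A - C) and (B - C) are disjoint
A2, B2 = {0}, {1}                 # close together: the difference sets overlap
D = {0, 1, 2, 3}

far_apart = (index(A, C), index(B, C), index(A | B, C))
close = (index(A2, C), index(B2, C), index(A2 | B2, C))
print(far_apart, close)
```

When the separation hypothesis of part (v) holds, the covering number is additive; when it fails, only the inequality of part (iv) survives, as the second example shows.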

Remark 7.7. If G is a topological group with identity element e, and if V is a neighborhood of e, then K ≼ V, for every non-empty compact subset K ⊂ G. Indeed, if we choose some open set D with e ∈ D ⊂ V, then using the compactness of K, and the obvious inclusion K ⊂ ⋃_{g∈K} gD, it follows that there exist g1, ..., gn ∈ K, such that K ⊂ g1D ∪ ··· ∪ gnD ⊂ g1V ∪ ··· ∪ gnV.

With these preparations we are in position to prove the following fundamental result.




Theorem 7.6. Let G be a locally compact group, and let A be a compact neighborhood of the identity element. Then there exists a Haar measure µ on G, such that µ(A) = 1.

Proof. Denote the identity element of G by e. Throughout the proof the compact neighborhood A of e will be fixed. For every non-empty compact set K ⊂ G, we define m(K) = [K : A]. We also put m(∅) = 0.

Let us define V to be the collection of all neighborhoods of e. For every V ∈ V, we denote by Ω(V) the set of all maps ω : C_G → [0, ∞) with the following properties:

(i) 0 ≤ ω(K) ≤ m(K), ∀ K ∈ C_G;
(ii) ω(A) = 1;
(iii) K, L ∈ C_G, K ⊂ L ⇒ ω(K) ≤ ω(L);
(iv) ω(K ∪ L) ≤ ω(K) + ω(L), ∀ K, L ∈ C_G;
(v) ω(gK) = ω(K), ∀ g ∈ G, K ∈ C_G;
(vi) K, L ∈ C_G, (K · V) ∩ (L · V) = ∅ ⇒ ω(K ∪ L) = ω(K) + ω(L).

Claim 1: For every V ∈ V, the set Ω(V ) is non-empty.

Fix V. We shall prove this Claim by an explicit construction of an element ω ∈ Ω(V). Define ω(∅) = 0, and define

ω(K) = [K : V⁻¹] / [A : V⁻¹],

for all non-empty compact subsets K ⊂ G. (These numbers are well defined by Remark 7.7, since V⁻¹ is again a neighborhood of e.) The fact that ω has properties (i)-(vi) is immediate from Lemma 7.7.

Let us regard the sets Ω(V), V ∈ V as subsets of the product space

P = ∏_{K ∈ C_G} [0, m(K)].

Notice that, when we equip P with the product topology, it becomes a compact space, by Tihonov's Theorem.

Claim 2: For every V ∈ V, the set Ω(V) is closed in P.

Define, for any K ∈ C_G, the map

π_K : P ∋ ω ⟼ ω(K) ∈ R.

By the definition of the topology of P, all maps π_K : P → R are continuous. For any two sets K, L ∈ C_G, consider the functions F_{KL}, T_{KL} : P → R, defined by

F_{KL}(ω) = ω(K) − ω(L) and T_{KL}(ω) = ω(K ∪ L) − ω(K) − ω(L), ∀ ω ∈ P.

Since we have F_{KL} = π_K − π_L and T_{KL} = π_{K∪L} − π_K − π_L, it follows that the maps F_{KL}, T_{KL} : P → R, K, L ∈ C_G, are all continuous. As a consequence of the continuity of these maps, it follows that, for any two sets K, L ∈ C_G, the sets

Γ(K, L) = { ω ∈ P : ω(K) ≤ ω(L) } = F_{KL}⁻¹( (−∞, 0] ),
Θ−(K, L) = { ω ∈ P : ω(K ∪ L) ≤ ω(K) + ω(L) } = T_{KL}⁻¹( (−∞, 0] ),
Θ+(K, L) = { ω ∈ P : ω(K ∪ L) ≥ ω(K) + ω(L) } = T_{KL}⁻¹( [0, ∞) )


are closed subsets of P. It then follows that the sets

Ω1 = { ω ∈ P : ω(A) = 1 } = π_A⁻¹({1}),
Ω2 = ⋂ { Γ(K, L) : (K, L) ∈ C_G × C_G, K ⊂ L },
Ω3 = ⋂ { Θ−(K, L) : (K, L) ∈ C_G × C_G },
Ω4 = ⋂_{K ∈ C_G} ⋂_{g ∈ G} [ Γ(K, gK) ∩ Γ(gK, K) ],

are all closed, so the intersection

Ω5 = Ω1 ∩ Ω2 ∩ Ω3 ∩ Ω4

is again closed. Notice that

Ω5 = { ω ∈ P : ω has properties (i)-(v) }.

Finally, if we define, for every V ∈ V, the set

Ω6_V = ⋂ { Θ+(K, L) : (K, L) ∈ C_G × C_G, (K · V) ∩ (L · V) = ∅ },

then Ω6_V is also closed, and so will then be the intersection Ω5 ∩ Ω6_V = Ω(V).

Claim 3: The intersection ⋂_{V ∈ V} Ω(V) is non-empty.

Remark that, if V1, V2 ∈ V are such that V1 ⊂ V2, then we have the inclusion Ω(V1) ⊂ Ω(V2). Indeed, if ω belongs to Ω(V1), then properties (i)-(v) are clear. To check property (vi) for V2 we need to show that whenever K, L ⊂ G are compact sets, with (K · V2) ∩ (L · V2) = ∅, it follows that ω(K ∪ L) = ω(K) + ω(L). This is however trivial, since the inclusion V1 ⊂ V2 forces (K · V1) ∩ (L · V1) = ∅, and then the desired equality follows from the property (vi) for V1. We now see that, for any finite number of sets V1, ..., Vn ∈ V, we have the inclusion

Ω(V1 ∩ ··· ∩ Vn) ⊂ Ω(V1) ∩ ··· ∩ Ω(Vn),

which by Claim 1, proves that Ω(V1) ∩ ··· ∩ Ω(Vn) ≠ ∅. Using Claim 2, and the compactness of P, the Claim immediately follows.

Pick now an element ω ∈ ⋂_{V ∈ V} Ω(V).

Claim 4: The map ω : C_G → [0, ∞) is a content on G with the left invariance property

ω(gK) = ω(K), ∀ g ∈ G, K ∈ C_G.

Moreover, one has the equality ω(A) = 1.

The fact that ω(A) = 1 is clear, from condition (ii) in the definition of Ω(V). The left invariance property follows from condition (v). In order to prove that ω is a content, we need to prove

(a) ω(∅) = 0;
(b) K, L ∈ C_G, K ⊂ L ⇒ ω(K) ≤ ω(L);
(c) ω(K ∪ L) ≤ ω(K) + ω(L), ∀ K, L ∈ C_G;
(d) K, L ∈ C_G, K ∩ L = ∅ ⇒ ω(K ∪ L) = ω(K) + ω(L).




Properties (a), (b), and (c) are clear, because every element in Ω(V), V ∈ V satisfies them. (Property (a) is a consequence of condition (i), property (b) is a consequence of (iii), and property (c) is a consequence of (iv).) To prove property (d), we start with two disjoint compact sets K and L, and we use Proposition 7.10 to find some V ∈ V such that (K · V) ∩ (L · V) = ∅. Then we use the fact that ω belongs to Ω(V), and by condition (vi) we indeed get ω(K ∪ L) = ω(K) + ω(L).

Having proven Claim 4, we now define the measure µ0 = ω∗|_{Bor(G)}. By Lemma 7.6, µ0 is a Haar measure on G. Notice that µ0(A) = ω∗(A) ≥ ω(A) = 1, so if we define µ : Bor(G) → [0, ∞] by (use the convention ∞/µ0(A) = ∞)

µ(B) = µ0(B) / µ0(A), ∀ B ∈ Bor(G),

then µ is a Haar measure on G, and satisfies µ(A) = 1.

Comment. Eventually (see Chapter IV) we are going to improve on the above result by proving the uniqueness of µ.

In concrete examples, it is possible to prove uniqueness.

Exercise 10*. Let S = [0, 1]^n be the unit cube in R^n, and let µ be a Haar measure on (R^n, +), with µ(S) = 1. Prove that µ coincides with the n-dimensional Lebesgue measure λn.

Hint: Consider first the half open box S0 = [0, 1)^n, and its measure β = µ(S0). Prove that for a half open box of the form

B = [a1, b1) × ··· × [an, bn)

with a1, ..., an, b1, ..., bn ∈ Q, one has µ(B) = βλn(B). Conclude that if a subset A ⊂ R^n is contained in a hyperplane of the form

Πk(a) = { (x1, ..., xn) ∈ R^n : xk = a },

then µ(A) = 0. Use this to get β = 1, so

µ(B) = λn(B),

for every “rational” half-open box. Prove that this equality holds for all half-open boxes. Use Corollary 5.1 to conclude that µ = λn.

The following two exercises show how a Haar measure can be used to get some topological information.

Exercise 11. Let G be a locally compact group, and let µ be a Haar measure on G. Prove that µ(D) > 0, for every non-empty open subset D ⊂ G.

Hint: Use the inequality µ(K) ≤ [K : D] · µ(D), for all compact K ⊂ G.

Exercise 12*. Let G be a locally compact group, and let µ be a Haar measure on G. Prove that the following are equivalent:

(i) G is compact;
(ii) µ(G) < ∞.

Hint: For the implication (ii) ⇒ (i), start with some compact neighborhood V of the identity, and choose a maximal subset A ⊂ G, such that the sets gV, g ∈ A are disjoint. Prove that A is finite. Conclude that G = ⋃_{g∈A} (gV · V⁻¹), so G is a finite union of compact sets.



Lectures 30-31

8. Signed measures and complex measures

In this section we discuss a generalization of the notion of a measure, to the case where the values are allowed to be outside [0, ∞]. The first notion is described by the following.

Definition. Suppose A is a σ-algebra on a non-empty set X. A function µ : A → [−∞, ∞] is called a signed measure on A, if it has the properties below.

(i) Either one of the following is true:
• µ(A) < ∞, ∀ A ∈ A;
• µ(A) > −∞, ∀ A ∈ A.
(ii) µ(∅) = 0.
(iii) For any pairwise disjoint sequence (An)_{n=1}^∞ ⊂ A, one has the equality

(1) µ( ⋃_{n=1}^∞ An ) = Σ_{n=1}^∞ µ(An).

Here we adopt the convention that if one term in the right hand side of (1) is equal to ±∞, then the entire sum is equal to ±∞. It is important to use condition (i), which avoids situations when one term is ∞ and another term is −∞.

Examples 8.1. Let us agree, in this section only, to use the term “honest” measure, for a measure in the usual sense.

A. Any “honest” measure is of course a signed measure.

B. If µ is a signed measure, then −µ is again a signed measure.

C. If µ1 and µ2 are “honest” measures, one of which is finite, then µ1 − µ2 is a signed measure. Eventually (see Theorem 8.2) we are going to show that any signed measure can be written in this form.

One key technical result about signed measures is the following.

Theorem 8.1. Let A be a σ-algebra on a non-empty set X, and let µ be a signed measure on A. Then there exist sets L, M ∈ A, such that

(2) µ(L) = inf { µ(A) : A ∈ A };
(3) µ(M) = sup { µ(A) : A ∈ A }.

Proof. Since −µ is also a signed measure, it suffices to prove only the existence of M satisfying (3). Denote the right hand side of (3) by α, and choose a sequence (αn)_{n≥1} ⊂ R, such that lim_{n→∞} αn = α, and αn < α, ∀ n ≥ 1. The key construction we need is contained in the following.

Claim 1: There exists a family of sets { B^n_k : k, n ∈ N, 1 ≤ k ≤ n } ⊂ A, with the following properties:

(i) for every n ≥ 1, one has the horizontal inclusions

B^n_1 ⊂ B^n_2 ⊂ ... ⊂ B^n_n and B^{n+1}_1 ⊂ B^{n+1}_2 ⊂ ... ⊂ B^{n+1}_n ⊂ B^{n+1}_{n+1},

as well as the vertical inclusions B^n_k ⊃ B^{n+1}_k, ∀ k = 1, ..., n;

(ii) for every k ≥ 1 one has the inequalities

µ(B^n_k ∖ B^{n+1}_k) ≤ 0, ∀ n ≥ k;

(iii) µ(B^n_n) ≥ αn, ∀ n ≥ 1.

We construct this family inductively, one row at a time (the rows are indexed by the upper index n). Choose B^1_1 ∈ A to be any set with µ(B^1_1) ≥ α1. Suppose we have constructed the first m rows, i.e. we have defined the sets B^n_k, 1 ≤ k ≤ n ≤ m, so that property (i) holds for all n = 1, ..., m − 1, property (ii) holds in the form

αk ≤ µ(B^k_k) ≤ µ(B^{k+1}_k) ≤ ··· ≤ µ(B^m_k), ∀ k = 1, ..., m,

and property (iii) holds for all n = 1, ..., m. Let us explain now how the next row B^{m+1}_1 ⊂ B^{m+1}_2 ⊂ ... ⊂ B^{m+1}_m ⊂ B^{m+1}_{m+1} is constructed. Define the sets E1, E2, ..., Em ∈ A by

E1 = B^m_1, and Ek = B^m_k ∖ B^m_{k−1}, ∀ k = 2, ..., m.

The sets Ek, k = 1, ..., m are pairwise disjoint, and we have

B^m_k = ⋃_{j=1}^k Ej, ∀ k = 1, ..., m.

Choose now an arbitrary set D ∈ A, with µ(D) ≥ α_{m+1}, and define, for each j ∈ {1, ..., m}, the set

Gj = Ej if µ(Ej ∖ D) > 0, and Gj = Ej ∩ D if µ(Ej ∖ D) ≤ 0.

Notice that we have Ej ⊃ Gj, and using the equality µ(Ej) = µ(Ej ∩ D) + µ(Ej ∖ D), we also have

(4) µ(Ej ∖ Gj) ≤ 0 and µ(Gj) ≥ µ(Ej ∩ D), ∀ j = 1, ..., m.

Define also the set G_{m+1} = D ∖ B^m_m. It is clear that the sets G1, G2, ..., G_{m+1} are pairwise disjoint. Construct now the (m + 1)-th row by taking

B^{m+1}_k = ⋃_{j=1}^k Gj, ∀ k = 1, 2, ..., m + 1.

It is obvious that one has the inclusions

B^{m+1}_1 ⊂ B^{m+1}_2 ⊂ ··· ⊂ B^{m+1}_{m+1}.

Since Ek ⊃ Gk, ∀ k = 1, ..., m, it is also clear that we have the vertical inclusions B^m_k ⊃ B^{m+1}_k, ∀ k = 1, ..., m. Using (4), for each k = 1, ..., m, we have

µ(B^m_k ∖ B^{m+1}_k) = µ( ⋃_{j=1}^k [Ej ∖ Gj] ) = Σ_{j=1}^k µ(Ej ∖ Gj) ≤ 0.




Finally, again by (4), we have

µ(Bm+1

m+1 ) = µm+1

k=1

Gk =

m+1

k=1

µ(Gk) ≥ µ(Gm+1) +

m

k=1

µ(E k ∩ D) =

= µ(Gm+1) + µ mk=1

[E k ∩ D]

= µ(D Bmm) + µ(Bmm ∩ D) = µ(D) ≥ αm+1.

Claim 2: There exists a sequence (Ak)_{k=1}^∞ ⊂ A, such that

(i) A1 ⊂ A2 ⊂ A3 ⊂ ...;
(ii) µ(Ak) ≥ αk, ∀ k ≥ 1.

We fix a family { B^n_k : k, n ∈ N, 1 ≤ k ≤ n } satisfying the properties in Claim 1. For every k ≥ 1, we define Ak = ⋂_{n=k}^∞ B^n_k. Notice that, using property (i) from Claim 1 (the vertical inclusions), we have

B^k_k = Ak ∪ ⋃_{n=k}^∞ [B^n_k ∖ B^{n+1}_k],

and the sets Ak, B^n_k ∖ B^{n+1}_k, n ≥ k, are pairwise disjoint, so using property (ii) from Claim 1, we have

µ(B^k_k) = µ(Ak) + Σ_{n=k}^∞ µ(B^n_k ∖ B^{n+1}_k) ≤ µ(Ak).

Using property (iii) from Claim 1, we then get µ(Ak) ≥ αk. The fact that we have the inclusions A1 ⊂ A2 ⊂ ... is clear, from property (i) in Claim 1 (the horizontal inclusions).

Fix now the sequence (Ak)_{k=1}^∞ ⊂ A as in Claim 2, and let us consider the set M = ⋃_{k=1}^∞ Ak. If we define the sets

M1 = A1 and Mk = Ak ∖ A_{k−1}, ∀ k ≥ 2,

then we have M = ⋃_{k=1}^∞ Mk, and the sets M1, M2, M3, ... are pairwise disjoint. In particular, this gives

µ(M) = Σ_{k=1}^∞ µ(Mk) = lim_{k→∞} Σ_{j=1}^k µ(Mj) = lim_{k→∞} µ( ⋃_{j=1}^k Mj ).

Since we obviously have ⋃_{j=1}^k Mj = Ak, ∀ k ≥ 1, the above equality proves that

(5) µ(M) = lim_{k→∞} µ(Ak).

Since we have αk ≤ µ(Ak) ≤ α, ∀ k ≥ 1, as well as lim_{k→∞} αk = α, the equality (5) forces µ(M) = α.

Remark 8.1. One interesting application of the above result is the fact that, whenever µ is a signed measure on A, such that

(6) −∞ < µ(A) < ∞, ∀ A ∈ A,

then

−∞ < inf { µ(A) : A ∈ A } ≤ sup { µ(A) : A ∈ A } < ∞.

A signed measure with property (6) is called finite.

We are now in position to prove the statement made in Example 8.1.C.




Theorem 8.2. Let X be a non-empty set, let A be a σ-algebra on X, and let µ be a signed measure on A. Then there exist subsets X⁺, X⁻ ∈ A, with the following properties:

(i) X⁺ ∩ X⁻ = ∅, and X⁺ ∪ X⁻ = X;
(ii) the maps µ± : A → [−∞, ∞], defined by

µ±(A) = ±µ(A ∩ X±), ∀ A ∈ A,

are “honest” measures on A;
(iii) one of the measures µ± is finite, and one has the equality µ = µ⁺ − µ⁻.

Proof. Without any loss of generality, we can assume that

µ(A) < ∞, ∀ A ∈ A.

(Otherwise, we replace µ with −µ, and the conclusion does not essentially change.) Put α = sup { µ(A) : A ∈ A }. By Theorem 8.1, it follows that 0 ≤ α < ∞, and there exists a set X⁺ ∈ A, such that µ(X⁺) = α. Define X⁻ = X ∖ X⁺.

Claim: The sets X± have the following properties:

(a) 0 ≤ µ(A) ≤ α, for all A ∈ A, with A ⊂ X⁺;
(b) 0 ≥ µ(B), for all B ∈ A, with B ⊂ X⁻.

To prove (a) start with some arbitrary set A ∈ A with A ⊂ X⁺. First of all, by the definition of α, it is clear that µ(A) ≤ α. Second, using the equality µ(X⁺) = µ(A) + µ(X⁺ ∖ A), it is clear that µ(A), µ(X⁺ ∖ A) > −∞, so we have

α ≥ µ(X⁺ ∖ A) = µ(X⁺) − µ(A) = α − µ(A),

which clearly forces µ(A) ≥ 0. To prove (b), we start with some set B ∈ A with B ⊂ X⁻. Using the fact that X⁺ ∩ B = ∅, we have

α ≥ µ(X⁺ ∪ B) = µ(X⁺) + µ(B) = α + µ(B),

which clearly forces µ(B) ≤ 0.

Having proven the Claim, we define the maps µ± : A → [−∞, ∞] as in the statement of the Theorem. By the Claim, we get µ±(A) ≥ 0, ∀ A ∈ A. It is also pretty clear that both µ⁺ and µ⁻ are σ-additive, so they define “honest” measures. Also, by the Claim, we have µ⁺(A) ≤ α, ∀ A ∈ A, so µ⁺ is a finite measure. Finally, if we start with some arbitrary A ∈ A, and we write it as A = (A ∩ X⁺) ∪ (A ∩ X⁻), then using the fact that (A ∩ X⁺) ∩ (A ∩ X⁻) = ∅, we get

µ(A) = µ(A ∩ X⁺) + µ(A ∩ X⁻) = µ⁺(A) − µ⁻(A).
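On a finite set the decomposition of Theorem 8.2 can be computed directly. The following sketch assumes a signed measure given by point weights on a five-element set, with the σ-algebra of all subsets; the weights and the helper names (`mu`, `mu_plus`, `mu_minus`, `subsets`) are illustrative assumptions, not part of the notes.

```python
from itertools import chain, combinations

# A signed measure on the subsets of X, given by (hypothetical) point weights.
weights = {"a": 2.0, "b": -1.5, "c": 0.0, "d": 3.0, "e": -4.0}
X = set(weights)

def mu(A):
    return sum(weights[x] for x in A)

def subsets(S):
    S = list(S)
    return chain.from_iterable(combinations(S, r) for r in range(len(S) + 1))

# alpha = sup of mu over all measurable sets; here it is attained (Theorem 8.1).
alpha = max(mu(A) for A in subsets(X))

# One admissible choice of X+ : the points of positive weight.
X_plus = {x for x in X if weights[x] > 0}
X_minus = X - X_plus

def mu_plus(A):
    return mu(set(A) & X_plus)

def mu_minus(A):
    return -mu(set(A) & X_minus)

print(alpha, sorted(X_plus), sorted(X_minus))
```

The point of weight zero could equally well be placed in X⁺, which illustrates that the sets X± need not be unique even though µ⁺ and µ⁻ are.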

It will be helpful not only here, but also in some future discussions, to isolatea certain feature identified by the above result.

Definition. Given a σ-algebra A on a non-empty set X, and two “honest” measures µ and ν on A, we say that µ and ν are mutually singular, if there exist sets M, N ∈ A, with M ∪ N = X and M ∩ N = ∅, such that µ(N) = ν(M) = 0. Notice that this implies the equalities

µ(A) = µ(A ∩ M ) and ν (A) = ν (A ∩ N ), ∀ A ∈ A.

If this situation occurs, we write µ ⊥ ν.

With this terminology, Theorem 8.2 states that any signed measure µ can be written as µ = µ+ − µ−, with µ+ and µ− “honest” mutually singular measures, and one of them finite.




Although the sets X± may not be uniquely determined, the decomposition µ = µ+ − µ− is unique, as indicated by the following result.

Theorem 8.3 (Minimality). Let X be a non-empty set, let A be a σ-algebra on X, and let µ be a signed measure on A. Suppose µ+ and µ− are mutually singular “honest” measures on A, one of them being finite, such that µ = µ+ − µ−. Suppose ν and η are two “honest” measures on A, one of them being finite, such that µ = ν − η. Then one has the inequalities µ+ ≤ ν and µ− ≤ η.

Proof. Fix sets X+, X− ∈ A, such that X+ ∪ X− = X, X+ ∩ X− = ∅, and µ+(X−) = µ−(X+) = 0.

Start with some arbitrary set A ∈ A. On the one hand, since A = (A ∩ X+) ∪ (A ∩ X−), with (A ∩ X+) ∩ (A ∩ X−) = ∅, we see that if λ is either one of the measures µ, µ+, or µ−, we have the equality

(7) λ(A) = λ(A ∩ X +) + λ(A ∩ X −), ∀ A ∈ A.

On the other hand, since µ+ is an “honest” measure, and µ+(X −) = 0, the inclusion

A ∩ X − ⊂ X − will force

µ+(A ∩ X −) = 0, ∀ A ∈ A.

Likewise, we have the equality

µ−(A ∩ X +) = 0, ∀ A ∈ A.

These equalities, combined with µ = µ+ − µ−, and with (7), give the equalities

µ+(A) = µ+(A ∩ X +) = µ+(A ∩ X +) − µ−(A ∩ X +) = µ(A ∩ X +),(8)

µ−(A) = µ−(A ∩ X −) = −µ+(A ∩ X −) + µ−(A ∩ X −) = −µ(A ∩ X −),(9)

for all A ∈ A. Fix now some set A ∈ A. Since ν is an “honest” measure, and η(A ∩ X+) ≥ 0, using (8) and the equality µ = ν − η, we get

ν(A) ≥ ν(A ∩ X+) ≥ ν(A ∩ X+) − η(A ∩ X+) = µ(A ∩ X+) = µ+(A).

Likewise, using (9), we have

η(A) ≥ η(A ∩ X −) ≥ η(A ∩ X −) − ν (A ∩ X −) = −µ(A ∩ X −) = µ−(A).
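The minimality property can be checked numerically in the simplest setting. This is only a sketch under the assumption that X is finite and every subset is measurable, so that a measure is determined by its point weights and an inequality between measures can be tested point by point; the data and the helper names are hypothetical.

```python
def jordan_parts(weights):
    """Point weights of mu+ and mu- for a signed measure on a finite set."""
    mu_plus = {x: max(w, 0) for x, w in weights.items()}
    mu_minus = {x: max(-w, 0) for x, w in weights.items()}
    return mu_plus, mu_minus

def dominated(small, big):
    """small <= big; on a finite set it is enough to compare point weights."""
    return all(small[x] <= big[x] for x in small)

mu = {"a": 3, "b": -2, "c": 0}          # hypothetical signed measure
nu = {"a": 5, "b": 1, "c": 4}           # an arbitrary "honest" measure with nu >= mu
eta = {x: nu[x] - mu[x] for x in mu}    # then mu = nu - eta, with eta >= 0

mu_plus, mu_minus = jordan_parts(mu)
minimal = dominated(mu_plus, nu) and dominated(mu_minus, eta)
```

Here `minimal` evaluates to `True`: the decomposition µ = ν − η carries the extra mass ν − µ+ = η − µ− on both sides, which the Hahn-Jordan decomposition avoids.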

Corollary 8.1. Let A be a σ-algebra on X, let µ be a signed measure on A, and let µ+, µ−, ν+ and ν− be “honest” measures on A with

• µ+ ⊥ µ−, and one of the measures µ+ and µ− is finite;
• ν+ ⊥ ν−, and one of the measures ν+ and ν− is finite;
• µ = µ+ − µ− = ν+ − ν−.

Then one has the equalities µ+ = ν+ and µ− = ν−.

Proof. Apply Theorem 8.3 “both ways” to get µ+ ≤ ν+ and µ− ≤ ν−, as well as ν+ ≤ µ+ and ν− ≤ µ−.

Definition. Given a signed measure µ, the decomposition µ = µ+ − µ−, whose existence is shown in Theorem 8.2, and whose uniqueness is shown above, is called the Hahn-Jordan decomposition of µ. A pair of sets (X+, X−), with X± ∈ A, X+ ∪ X− = X, X+ ∩ X− = ∅, and µ+(X−) = µ−(X+) = 0, is called a Hahn-Jordan set decomposition of X relative to µ.

Exercise 1. Let µ be a signed measure, and let µ = µ+ − µ− be the Hahn-Jordan decomposition. Prove that the following are equivalent:

(i) µ is finite, i.e. −∞ < µ(A) < ∞, ∀ A ∈ A;
(ii) both “honest” measures µ+ and µ− are finite.

The following result characterizes mutual singularity in an approximate fashion.

Lemma 8.1. Let A be a σ-algebra on X, and let µ and ν be “honest” measures on A. The following are equivalent:

(i) µ ⊥ ν;
(ii) for every ε > 0, there exist sets D, E ∈ A, such that µ(D) < ε, ν(E) < ε, and D ∪ E = X.

Proof. The implication (i) ⇒ (ii) is trivial.

To prove the implication (ii) ⇒ (i) construct, for each ε > 0, two sequences (D^ε_n)_{n=1}^∞ and (E^ε_n)_{n=1}^∞ of sets in A, such that µ(D^ε_n) < ε/2^n, ν(E^ε_n) < ε/2^n, and D^ε_n ∪ E^ε_n = X. Put A_ε = ⋂_{n=1}^∞ D^ε_n and B_ε = ⋃_{n=1}^∞ E^ε_n. Fix for the moment ε > 0.

On the one hand, using the inclusion A_ε ⊂ D^ε_n, ∀ n ≥ 1, we get µ(A_ε) ≤ ε/2^n, ∀ n ≥ 1, which clearly forces

(10) µ(A_ε) = 0.

On the other hand, using σ-subadditivity, we have

(11) ν(B_ε) = ν(⋃_{n=1}^∞ E^ε_n) ≤ Σ_{n=1}^∞ ν(E^ε_n) < Σ_{n=1}^∞ ε/2^n = ε.

Finally, since we have X ∖ D^ε_n ⊂ E^ε_n, ∀ n ≥ 1, we get

X ∖ A_ε = ⋃_{n=1}^∞ (X ∖ D^ε_n) ⊂ ⋃_{n=1}^∞ E^ε_n = B_ε,

which gives

(12) A_ε ⊃ X ∖ B_ε.

Define now the sets N = ⋃_{n=1}^∞ A_{1/n} and M = X ∖ N. On the one hand, using σ-subadditivity, combined with (10), we get µ(N) = 0. On the other hand, using (12), we have

M = X ∖ N = X ∖ ⋃_{n=1}^∞ A_{1/n} = ⋂_{n=1}^∞ (X ∖ A_{1/n}) ⊂ ⋂_{n=1}^∞ B_{1/n} ⊂ B_{1/k}, ∀ k ≥ 1.

By (11) this gives ν(M) ≤ ν(B_{1/k}) < 1/k, ∀ k ≥ 1, which forces ν(M) = 0.

Although the next technical result seems a bit out of context at this point, we prove it here, and record it for future use.

Lemma 8.2. Let A be a σ-algebra on some non-empty set X, and let µ, η be signed measures on A. Assume there is an “honest” finite measure ν on A, with µ + ν = η.

(i) If µ = µ+ − µ− and η = η+ − η− are the Hahn-Jordan decompositions of µ and η respectively, then one has the inequalities

(13) µ+ ≤ η+ ≤ µ+ + ν,

(14) η− ≤ µ− ≤ η− + ν.

(ii) If (X+, X−) is a Hahn-Jordan set decomposition of X relative to µ, and if (Y+, Y−) is a Hahn-Jordan set decomposition of X relative to η, then one has the relations X+ ⊂_ν Y+ and Y− ⊂_ν X−.

Proof. (i). On the one hand, the signed measure η has a decomposition

η = µ + ν = (µ+ + ν) − µ−,

with µ+ + ν and µ− “honest” measures (one of them finite). Using the minimality Theorem 8.3, we get the inequalities

(15) η+ ≤ µ+ + ν and η− ≤ µ−.

On the other hand, we can also consider the signed measure µ = η − ν, which has a decomposition

µ = η+ − (η− + ν),

with η+ and η− + ν “honest” measures (one of them finite). Using again the minimality Theorem 8.3, we get the inequalities

(16) µ+ ≤ η+ and µ− ≤ η− + ν.

Clearly the inequalities (15) and (16) cover the desired inequalities (13) and (14).

(ii). Recall (see Section 4) that the relation A ⊂_ν B means that ν(A ∖ B) = 0. In our case, we have to look at the set

N = X+ ∖ Y+ = Y− ∖ X−,

for which we have to show that ν(N) = 0. On the one hand, since N ⊂ Y−, we get η+(N) = 0. Using (13) this forces µ+(N) = 0. On the other hand, since N ⊂ X+, we get µ−(N) = 0, and using (14) we also get η−(N) = 0. In other words, we get the equalities

µ(N) = µ+(N) − µ−(N) = 0,

η(N) = η+(N) − η−(N) = 0,

and then the equality η = µ + ν clearly forces ν(N) = 0.

The Hahn-Jordan decomposition has the following interesting application to the properties of the natural order relation on “honest” measures. The result below gives the existence of an “infimum” and a “supremum” for a pair of “honest” measures, one of which is finite.

Proposition 8.1 (Lattice Property). Let A be a σ-algebra on a non-empty set X, and let µ and ν be “honest” measures on A, with one of them finite.

(i) There exists a unique measure µ ∨ ν with:
(a) µ ∨ ν ≥ µ and µ ∨ ν ≥ ν;
(b) whenever ω is an “honest” measure on A, with µ ≤ ω and ν ≤ ω, it follows that one has the inequality µ ∨ ν ≤ ω.
(ii) There exists a unique measure µ ∧ ν with:
(a) µ ≥ µ ∧ ν and ν ≥ µ ∧ ν;
(b) whenever λ is an “honest” measure on A, with µ ≥ λ and ν ≥ λ, it follows that one has the inequality µ ∧ ν ≥ λ.




Proof. Since the statement of the Proposition is “symmetric,” without any loss of generality we can assume that µ is finite.

Consider the signed measure η = µ − ν, and its Hahn-Jordan decomposition η = η+ − η−. Let (X+, X−) be a Hahn-Jordan set decomposition of X relative to η. This means that, for every A ∈ A, one has

0 ≤ η+(A) = η(A ∩ X +) = µ(A ∩ X +) − ν (A ∩ X +);(17)

0 ≤ η−(A) = −η(A ∩ X −) = ν (A ∩ X −) − µ(A ∩ X −).(18)

In particular we get

(19) µ(A ∩ X +) ≥ ν (A ∩ X +) and µ(A ∩ X −) ≤ ν (A ∩ X −), ∀ A ∈ A.

(i). Define the measure µ ∨ ν = µ + η−. Using (18) we have

(20) (µ ∨ ν )(A) = µ(A ∩ X +) + ν (A ∩ X −), ∀ A ∈ A.

Notice that, using (19), it follows that, for every A ∈ A, one has the inequalities

(µ ∨ ν)(A ∩ X+) = µ(A ∩ X+) ≥ ν(A ∩ X+),

(µ ∨ ν)(A ∩ X−) = ν(A ∩ X−) ≥ µ(A ∩ X−).

In particular, this gives

(µ ∨ ν)(A) = (µ ∨ ν)(A ∩ X+) + (µ ∨ ν)(A ∩ X−) ≥ µ(A ∩ X+) + µ(A ∩ X−) = µ(A),

(µ ∨ ν)(A) = (µ ∨ ν)(A ∩ X+) + (µ ∨ ν)(A ∩ X−) ≥ ν(A ∩ X+) + ν(A ∩ X−) = ν(A),

for every A ∈ A, so µ ∨ ν indeed has property (a).

To prove property (b), start with some “honest” measure ω on A, with µ, ν ≤ ω, and let us show that µ ∨ ν ≤ ω. This is quite clear, since for any A ∈ A, using (20) we have

ω(A) = ω(A ∩ X +) + ω(A ∩ X −) ≥ µ(A ∩ X +) + ν (A ∩ X −) = (µ ∨ ν )(A).

The uniqueness of µ ∨ ν is now clear from (a) and (b).

(ii). Remark that, using the Minimality Theorem 8.3, for the measure η = µ − ν, it follows that η+ ≤ µ. In particular, η+ is a finite “honest” measure, and so is the difference µ − η+. Put µ ∧ ν = µ − η+. Using (17) we have

(21) (µ ∧ ν )(A) = µ(A ∩ X −) + ν (A ∩ X +), ∀ A ∈ A.

Notice that, using (19), it follows that, for every A ∈ A, one has the inequalities

(µ ∧ ν)(A ∩ X+) = ν(A ∩ X+) ≤ µ(A ∩ X+),

(µ ∧ ν)(A ∩ X−) = µ(A ∩ X−) ≤ ν(A ∩ X−).

In particular, this gives

(µ ∧ ν)(A) = (µ ∧ ν)(A ∩ X+) + (µ ∧ ν)(A ∩ X−) ≤ µ(A ∩ X+) + µ(A ∩ X−) = µ(A),

(µ ∧ ν)(A) = (µ ∧ ν)(A ∩ X+) + (µ ∧ ν)(A ∩ X−) ≤ ν(A ∩ X+) + ν(A ∩ X−) = ν(A),

for every A ∈ A, so µ ∧ ν indeed has property (a).

To prove property (b), start with some “honest” measure λ on A, with λ ≤ µ and λ ≤ ν, and let us show that µ ∧ ν ≥ λ. This is quite clear, since for any A ∈ A, using (21) we have

λ(A) = λ(A ∩ X+) + λ(A ∩ X−) ≤ ν(A ∩ X+) + µ(A ∩ X−) = (µ ∧ ν)(A).

The uniqueness of µ ∧ ν is now clear from (a) and (b).
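Formulas (20) and (21) become very concrete on a finite set. The sketch below assumes X is finite, all subsets are measurable, and both measures are given by point weights; it builds µ ∨ ν and µ ∧ ν from a Hahn-Jordan set decomposition for η = µ − ν, exactly as in the proof. The function name `join_and_meet` and the data are hypothetical.

```python
def join_and_meet(mu, nu):
    """Point weights of mu ∨ nu and mu ∧ nu, built as in formulas (20) and (21)."""
    eta = {x: mu[x] - nu[x] for x in mu}          # the signed measure eta = mu - nu
    x_plus = {x for x in eta if eta[x] >= 0}      # a Hahn-Jordan set decomposition for eta
    # (20): use mu on X+ and nu on X-;  (21): use nu on X+ and mu on X-
    join = {x: (mu[x] if x in x_plus else nu[x]) for x in mu}
    meet = {x: (nu[x] if x in x_plus else mu[x]) for x in mu}
    return join, meet

mu = {"a": 1, "b": 4, "c": 2}   # hypothetical finite measures
nu = {"a": 3, "b": 1, "c": 2}
join, meet = join_and_meet(mu, nu)
```

In this setting the result is just the pointwise maximum and minimum of the weights, and one can read off the identity (µ ∨ ν) + (µ ∧ ν) = µ + ν, which also follows from (20) and (21) in general.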




We conclude with a series of results that make a connection with the theory of Radon measures discussed in Section 7.

Definition. Suppose X is a locally compact space, and µ is a signed measure on Bor(X). We call µ a signed Radon measure on X, if there exist “honest” Radon measures ν and η on X, one of which is finite, such that µ = ν − η.

Exercise 2*. Let X be a locally compact space, and let µ be a signed measure on Bor(X). Prove that the following are equivalent:

(i) µ is a signed Radon measure on X;
(ii) if µ = µ+ − µ− denotes the Hahn-Jordan decomposition of µ, then both µ+ and µ− are Radon measures on X.

Hint: To prove the implication (i) ⇒ (ii) use the fact that µ+ ≤ ν and µ− ≤ η. Moreover, show that, for any B ∈ Bor(X), one has the implications µ+(B) < ∞ ⇒ ν(B) < ∞ and µ−(B) < ∞ ⇒ η(B) < ∞. Then use Exercise 5 from Section 7.

Remark 8.2. Suppose X is a locally compact space. In Section 7 we discussed the Riesz correspondence, which associates to each positive linear map φ : C^ℝ_c(X) → ℝ, a Radon measure µφ on X. As already suggested, this correspondence is in fact a bijection, although the proof of this fact will come later in Chapter IV. At this point we would like to analyze the Riesz correspondence in a simpler situation, namely the case when X is compact. In this case it is interesting to point out that the Riesz correspondence can be extended beyond the positive case. The key fact (see Corollary II.5.3) is that every linear continuous map φ : C^ℝ(X) → ℝ can be written as a difference φ = φ1 − φ2, with φ1, φ2 : C^ℝ(X) → ℝ positive linear maps. (In fact φ1 and φ2 can be chosen such that ‖φ‖ = ‖φ1‖ + ‖φ2‖. This fact will be heavily exploited a little later.) We would like then to define a finite signed Radon measure µφ by the formula µφ = µφ1 − µφ2. There is a minor problem here: What if we find another pair of continuous positive linear maps ψ1, ψ2 : C^ℝ(X) → ℝ, such that φ = ψ1 − ψ2? Is it true that µψ1 − µψ2 = µφ1 − µφ2? The answer is affirmative, and this is an easy consequence of Proposition 7.6, which, applied to the equality φ1 + ψ2 = ψ1 + φ2, gives the equalities

µ_{φ1} + µ_{ψ2} = µ_{φ1+ψ2} = µ_{ψ1+φ2} = µ_{ψ1} + µ_{φ2}.

Notations. Suppose X is a compact Hausdorff space. We define

M^ℝ(X) = { φ : C^ℝ(X) → ℝ : φ ℝ-linear continuous },

R^ℝ(X) = { µ : µ signed Radon measure on X }.

The correspondence

(22) M^ℝ(X) ∋ φ ⟼ µφ ∈ R^ℝ(X)

defined above, will still be referred to as the extended Riesz correspondence.

Remark 8.3. If X is a compact Hausdorff space, then the extended Riesz correspondence (22) is a linear map. This is a consequence of Proposition 7.6.

Given φ ∈ M^ℝ(X), the existence of a decomposition of φ, of the particular type described in Corollary II.5.3, is extremely significant, as suggested by the following result.

Theorem 8.4. Let X be a compact Hausdorff space, let φ1, φ2 : C^ℝ(X) → ℝ be positive linear maps, and let µφ1 and µφ2 be the corresponding Riesz measures. Consider the linear continuous map φ = φ1 − φ2, and the finite signed measure

(23) µφ = µφ1 − µφ2 .




If ‖φ‖ = ‖φ1‖ + ‖φ2‖, then µφ1 ⊥ µφ2, so (23) represents the Hahn-Jordan decomposition of µφ.

Proof. We are going to show that the decomposition (23) satisfies condition (ii) in Lemma 8.1. The key step in proving this fact is contained in the following.

Claim: For every ε > 0, there exist functions f1, f2 ∈ C^ℝ(X), with f1, f2 ≥ 0, f1 + f2 ≥ 1, and such that φ1(f2) < ε and φ2(f1) < ε.

To prove this we fix ε > 0, and we use the definition of the norm, to find some function g ∈ C^ℝ(X), with ‖g‖ ≤ 1, and |φ(g)| ≥ ‖φ‖ − ε. Replacing g with −g, if necessary, we can assume that

(24) φ(g) ≥ ‖φ‖ − ε.

Consider the functions g+ = max{g, 0} and g− = max{−g, 0}, so that g = g+ − g−, and we clearly have 0 ≤ g± ≤ 1. On the one hand, since ‖φk‖ = φk(1) (see Proposition II.5.4), we have φk(g±) ≤ ‖φk‖, k = 1, 2. On the other hand, by (24), and the positivity of φ1 and φ2, we know that

‖φ‖ − ε ≤ φ(g) = φ1(g) − φ2(g) = φ1(g+) + φ2(g−) − φ1(g−) − φ2(g+) ≤ φ1(g+) + φ2(g−) ≤ ‖φ1‖ + ‖φ2‖ = ‖φ‖,

so we get

ε ≥ ‖φ‖ − φ1(g+) − φ2(g−) = ‖φ1‖ + ‖φ2‖ − φ1(g+) − φ2(g−) = φ1(1) + φ2(1) − φ1(g+) − φ2(g−) = φ1(1 − g+) + φ2(1 − g−).

If we define f1 = 1 − g− and f2 = 1 − g+, then it is clear that f1, f2 ≥ 0. Using the fact that g+ + g− = |g| ≤ 1, we get f1 + f2 = 2 − |g| ≥ 1. Finally, the above estimate gives φ1(f2) + φ2(f1) ≤ ε, and so the Claim immediately follows.

Having proven the Claim, we are now in position to prove that the two measures µφ1 and µφ2 satisfy condition (ii) in Lemma 8.1. Start with some arbitrary ε > 0, and use the Claim to find two functions f1, f2 ∈ C^ℝ(X) with f1, f2 ≥ 0, f1 + f2 ≥ 1, such that φ1(f2) ≤ ε/2 and φ2(f1) ≤ ε/2. Consider the compact subsets

K1 = { x ∈ X : f1(x) ≥ 1/2 } and K2 = { x ∈ X : f2(x) ≥ 1/2 }.

Since f1 + f2 ≥ 1, it follows immediately that we have K1 ∪ K2 = X. By construction, we have 2f1 ≥ κ_{K1} and 2f2 ≥ κ_{K2}, so using the interpolation property (see Proposition 7.5), we get

µφ1(K2) ≤ φ1(2f2) = 2φ1(f2) ≤ ε;

µφ2(K1) ≤ φ2(2f1) = 2φ2(f1) ≤ ε.

Since ε > 0 was arbitrary, condition (ii) in Lemma 8.1 is satisfied (with D = K2 and E = K1), so µφ1 ⊥ µφ2.

The above result has several interesting consequences.

Corollary 8.2. Suppose X is a compact Hausdorff space. Then the extended Riesz correspondence (22) is injective.

Proof. Since the correspondence (22) is linear, it suffices to prove the implication µφ = 0 ⇒ φ = 0. Start with some linear continuous map φ : C^ℝ(X) → ℝ, such that µφ = 0. Use Corollary II.5.3 to find two positive linear maps φ1, φ2 : C^ℝ(X) → ℝ, such that φ = φ1 − φ2, and ‖φ‖ = ‖φ1‖ + ‖φ2‖. By Theorem 8.4 the difference µφ1 − µφ2 = µφ = 0 is the Hahn-Jordan decomposition of the zero measure. By the uniqueness (see Corollary 8.1) it follows that µφ1 = µφ2 = 0. By the interpolation property, we know that ‖φk‖ = φk(1) = µφk(X) = 0, k = 1, 2, so we get φ1 = φ2 = 0, thus forcing φ = 0.

The injectivity of the extended Riesz correspondence has as a consequence the uniqueness of the decomposition of linear continuous maps as differences of positive ones, of the type described in Corollary II.5.3.

Corollary 8.3. Let X be a compact Hausdorff space, and let φ : C^ℝ(X) → ℝ be a linear continuous map. Assume one has positive linear maps φ1, φ2, ψ1, ψ2 : C^ℝ(X) → ℝ, such that

• φ = φ1 − φ2 = ψ1 − ψ2;
• ‖φ1‖ + ‖φ2‖ = ‖ψ1‖ + ‖ψ2‖ = ‖φ‖.

Then one has the equalities φ1 = ψ1 and φ2 = ψ2.

Proof. Consider the signed measure µφ. By Theorem 8.4, the decompositions

µφ = µφ1 − µφ2 = µψ1 − µψ2

both represent the Hahn-Jordan decomposition of µφ. By the uniqueness (Corollary 8.1) we have µφ1 = µψ1 and µφ2 = µψ2. By Corollary 8.2 this forces φ1 = ψ1 and φ2 = ψ2.

Comment. If X is a compact Hausdorff space, and φ : C^ℝ(X) → ℝ is a linear continuous map, then by the above result, combined with Corollary II.5.3, we know that there exist unique positive linear maps φ± : C^ℝ(X) → ℝ such that ‖φ‖ = ‖φ+‖ + ‖φ−‖, and

(25) φ = φ+ − φ−.

The decomposition (25) will be referred to as the Hahn-Jordan decomposition of φ. This notation and terminology are used for the following reason. If we take µφ the measure given by the extended Riesz correspondence, then

µφ = µφ+ − µφ−

is precisely the Hahn-Jordan decomposition of µφ.

Remarks 8.4. There is a version of the extended Riesz correspondence which works for general locally compact spaces. Start with a locally compact space X, and define the spaces

M^ℝ_0(X) = { φ : C^ℝ_0(X) → ℝ : φ linear continuous },

R^ℝ_0(X) = { µ : µ finite signed Radon measure on X }.

Since C^ℝ_0(X) is the completion of C^ℝ_c(X), the correspondence

M^ℝ_0(X) ∋ φ ⟼ φ|_{C^ℝ_c(X)}

establishes an isometric linear isomorphism between M^ℝ_0(X) and the space of all continuous linear maps C^ℝ_c(X) → ℝ. For every positive φ ∈ M^ℝ_0(X), we denote by µφ the Riesz measure associated with the restriction φ_c = φ|_{C^ℝ_c(X)}. Since ‖φ_c‖ = ‖φ‖, we have the equality µφ(X) = ‖φ‖.

We know (see Proposition II.5.10) that for every linear continuous map φ : C^ℝ_0(X) → ℝ, there exist linear positive continuous maps φ1, φ2 : C^ℝ_0(X) → ℝ, with φ = φ1 − φ2. (In fact φ1 and φ2 can be chosen such that ‖φ1‖ + ‖φ2‖ = ‖φ‖.) We use this fact to define the finite signed Radon measure µφ = µφ1 − µφ2. Exactly as




in Remark 8.2, this definition is independent of the particular choice of φ1 and φ2. This way we have constructed a map

(26) M^ℝ_0(X) ∋ φ ⟼ µφ ∈ R^ℝ_0(X)

which we will call the extended finite Riesz correspondence. Of course, if X is already compact, we have C^ℝ_0(X) = C^ℝ(X), M^ℝ_0(X) = M^ℝ(X), and R^ℝ_0(X) = R^ℝ(X), so (26) is the extended Riesz correspondence previously defined.

The following result generalizes the statements of Remark 8.3, Theorem 8.4, and Corollaries 8.2 and 8.3.

Theorem 8.5. Let X be a locally compact space.

A. The extended finite Riesz correspondence (26) is an injective linear map.
B. For every φ ∈ M^ℝ_0(X), there exist unique positive maps φ+, φ− ∈ M^ℝ_0(X), such that φ = φ+ − φ−, and ‖φ‖ = ‖φ+‖ + ‖φ−‖. Moreover, in this case

µφ = µφ+ − µφ−

is precisely the Hahn-Jordan decomposition of µφ.

Proof. First of all, the correspondence (26) is clearly linear, again as a consequence of Proposition 7.6.

Second, we remark that the existence part in B is already known, from Proposition II.5.10. We are going to use the following version of Theorem 8.4.

Claim: Suppose φ ∈ M^ℝ_0(X) is written as a difference φ = φ1 − φ2, with φ1, φ2 ∈ M^ℝ_0(X) positive, and ‖φ‖ = ‖φ1‖ + ‖φ2‖. Then

µφ = µφ1 − µφ2

is the Hahn-Jordan decomposition of µφ.

One way to prove this is by employing the Alexandrov compactification X^α = X ⊔ {∞}. We use the identification

C^ℝ_0(X) = { f ∈ C^ℝ(X^α) : f(∞) = 0 }.

We know that there exist positive linear maps ψ1, ψ2 : C^ℝ(X^α) → ℝ, such that ψk|_{C^ℝ_0(X)} = φk, and ‖ψk‖ = ‖φk‖, k = 1, 2. If we define ψ : C^ℝ(X^α) → ℝ by ψ = ψ1 − ψ2, it is not hard to see that ‖ψ‖ = ‖ψ1‖ + ‖ψ2‖, so if we consider the Radon measures µψ, µψ1 and µψ2 on the compact space X^α, then using Theorem 8.4, we get the fact that

µψ = µψ1 − µψ2

is precisely the Hahn-Jordan decomposition of µψ. This means that there are sets B1, B2 ∈ Bor(X^α), with B1 ∪ B2 = X^α, B1 ∩ B2 = ∅, and µψ1(B2) = µψ2(B1) = 0. We know (see Remarks 7.4) that

µψk(B) = µφk(B ∩ X), ∀ B ∈ Bor(X^α), k = 1, 2,

so if we define Ak = Bk ∩ X, we immediately get A1 ∪ A2 = X, A1 ∩ A2 = ∅, and µφ1(A2) = µφ2(A1) = 0, thus proving that µφ1 ⊥ µφ2.

Having proven the above Claim, the proof follows line by line the proofs of Corollaries 8.2 and 8.3.

The notion of a finite signed measure can be generalized to the complex case.

Definition. Suppose A is a σ-algebra on a non-empty set X. A function µ : A → ℂ is called a complex measure on A, if it is σ-additive in the sense that

(addσ) for any pairwise disjoint sequence (A_n)_{n=1}^∞ ⊂ A, one has the equality

(27) µ(⋃_{n=1}^∞ A_n) = Σ_{n=1}^∞ µ(A_n).

Remark that the condition µ(∅) = 0 is automatic in this case. Note also that a map µ : A → ℂ is a complex measure, if and only if the maps Re µ and Im µ are finite signed measures.

The following result describes an important construction.

Theorem 8.6. Let A be a σ-algebra, and let µ be either a signed measure, or a complex measure on A. For every A ∈ A, we define

(28) ν(A) = sup { Σ_{k=1}^∞ |µ(A_k)| : (A_k)_{k=1}^∞ ⊂ A pairwise disjoint, ⋃_{k=1}^∞ A_k = A }.

The map ν : A → [0, ∞] is an “honest” measure on A.

Proof. The first step in the proof is contained in the following.

Claim 1: For any pairwise disjoint sequence (A_n)_{n=1}^∞ ⊂ A, one has the inequality

(29) ν(⋃_{n=1}^∞ A_n) ≤ Σ_{n=1}^∞ ν(A_n).

Denote the right hand side of (29) by S, and denote the union ⋃_{n=1}^∞ A_n simply by A. Start now with some pairwise disjoint sequence (D_k)_{k=1}^∞ ⊂ A, with ⋃_{k=1}^∞ D_k = A. For every k ≥ 1, we have D_k = ⋃_{n=1}^∞ (D_k ∩ A_n), with (D_k ∩ A_n)_{n=1}^∞ ⊂ A pairwise disjoint, so we have

|µ(D_k)| = | Σ_{n=1}^∞ µ(D_k ∩ A_n) | ≤ Σ_{n=1}^∞ |µ(D_k ∩ A_n)|, ∀ k ≥ 1.

Summing up then yields

(30) Σ_{k=1}^∞ |µ(D_k)| ≤ Σ_{k=1}^∞ [ Σ_{n=1}^∞ |µ(D_k ∩ A_n)| ] = Σ_{n=1}^∞ [ Σ_{k=1}^∞ |µ(D_k ∩ A_n)| ].

Since for each n ≥ 1, the sequence (D_k ∩ A_n)_{k=1}^∞ ⊂ A is pairwise disjoint, and satisfies ⋃_{k=1}^∞ (D_k ∩ A_n) = A_n, by the definition of ν, we get

Σ_{k=1}^∞ |µ(D_k ∩ A_n)| ≤ ν(A_n), ∀ n ≥ 1.

Using these estimates in (30), we then get

Σ_{k=1}^∞ |µ(D_k)| ≤ Σ_{n=1}^∞ ν(A_n) = S.

Since the inequality Σ_{k=1}^∞ |µ(D_k)| ≤ S holds for all pairwise disjoint sequences (D_k)_{k=1}^∞ ⊂ A, with ⋃_{k=1}^∞ D_k = A, by the definition of ν we get ν(A) ≤ S, and the Claim is proven.

Claim 2: For any finite pairwise disjoint collection (A_n)_{n=1}^N ⊂ A, one has the inequality

ν (A1 ∪ · · · ∪ AN ) ≥ ν (A1) + · · · + ν (AN ).




We use induction on N, and we see immediately that it suffices only to prove the case N = 2. Fix for the moment a pairwise disjoint sequence (D_k)_{k=1}^∞ ⊂ A, with ⋃_{k=1}^∞ D_k = A1, and denote the sum Σ_{k=1}^∞ |µ(D_k)| by R. Suppose we have a pairwise disjoint sequence (E_j)_{j=1}^∞ ⊂ A, with ⋃_{j=1}^∞ E_j = A2. If we combine it with the D_k's, i.e. we define

F_p = D_{p/2} if p is even, and F_p = E_{(p+1)/2} if p is odd,

then we get a new pairwise disjoint sequence (F_p)_{p=1}^∞ ⊂ A, with ⋃_{p=1}^∞ F_p = A1 ∪ A2. By the definition of ν we will then get

ν(A1 ∪ A2) ≥ Σ_{p=1}^∞ |µ(F_p)| = Σ_{k=1}^∞ |µ(D_k)| + Σ_{j=1}^∞ |µ(E_j)| = R + Σ_{j=1}^∞ |µ(E_j)|.

Taking supremum over all pairwise disjoint sequences (E_j)_{j=1}^∞ ⊂ A, with ⋃_{j=1}^∞ E_j = A2, the above inequality yields ν(A1 ∪ A2) ≥ R + ν(A2), so now we have

ν(A1 ∪ A2) ≥ ν(A2) + Σ_{k=1}^∞ |µ(D_k)|.

Taking supremum over all pairwise disjoint sequences (D_k)_{k=1}^∞ ⊂ A, with ⋃_{k=1}^∞ D_k = A1, the above inequality finally gives ν(A1 ∪ A2) ≥ ν(A2) + ν(A1), and the Claim is proven.

We are now in position to prove that ν is a measure on A. The equality ν(∅) = 0 is trivial. To prove σ-additivity, we start with some pairwise disjoint sequence (A_n)_{n=1}^∞ ⊂ A, and we must prove the equality

ν(⋃_{n=1}^∞ A_n) = Σ_{n=1}^∞ ν(A_n).

On the one hand, using Claim 1, we know that we have the inequality ν(⋃_{n=1}^∞ A_n) ≤ Σ_{n=1}^∞ ν(A_n). On the other hand, if we denote the union ⋃_{n=1}^∞ A_n simply by A, then using Claim 2, we see that

ν(A) ≥ ν(A ∖ [A_1 ∪ · · · ∪ A_N]) + ν(A_1) + · · · + ν(A_N) ≥ ν(A_1) + · · · + ν(A_N), ∀ N ≥ 1,

which immediately gives the other inequality ν(A) ≥ Σ_{n=1}^∞ ν(A_n).

Definition. With the notations above, and under the hypothesis of Theorem 8.6, the “honest” measure ν, defined by (28), is called the variation measure of µ, and will be denoted by |µ|. By construction, we have the inequality

|µ(A)| ≤ |µ|(A), ∀ A ∈ A.
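To get a feeling for formula (28), one can evaluate the supremum by brute force in a toy case. The sketch below assumes X is a small finite set with all subsets measurable, so that only finitely many partitions into non-empty blocks matter (empty sets contribute nothing to the sum); the helper names `partitions`, `variation`, `pointwise_variation` and the data are hypothetical.

```python
def partitions(items):
    """Yield every partition of a list into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # put `first` into one of the existing blocks ...
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        # ... or let it form a new block of its own
        yield [[first]] + part

def variation(weights, subset):
    """Brute-force version of (28): maximize the sum of |mu(block)| over partitions."""
    best = 0
    for part in partitions(list(subset)):
        total = sum(abs(sum(weights[x] for x in block)) for block in part)
        best = max(best, total)
    return best

def pointwise_variation(weights, subset):
    """The value one expects on a finite set: the sum of |mu({x})| over x in the subset."""
    return sum(abs(weights[x]) for x in subset)

signed = {"a": 3, "b": -2, "c": -5, "d": 4}        # hypothetical signed measure
complex_mu = {"a": 3 + 4j, "b": -2, "c": -1j}      # hypothetical complex measure

signed_total = variation(signed, signed)           # |mu|(X) in the signed example
complex_total = variation(complex_mu, complex_mu)  # |mu|(X) in the complex example
```

In the signed example µ(X) = 0 while |µ|(X) = 14, so the inequality |µ(A)| ≤ |µ|(A) can be very far from an equality; the supremum is reached by the partition into single points.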

Remark 8.5. Let µ be either a signed measure, or a complex measure on the σ-algebra A. Exactly as with numbers (or functions), the measure |µ| has a minimality property, which can be stated as follows. Whenever ν is an “honest” measure on A with

|µ(A)| ≤ ν (A), ∀ A ∈ A,

it follows that we have

|µ|(A) ≤ ν (A), ∀ A ∈ A.




This is quite clear, because for any pairwise disjoint sequence (A_n)_{n=1}^∞ ⊂ A, with ⋃_{n=1}^∞ A_n = A, one has the inequality

Σ_{n=1}^∞ |µ(A_n)| ≤ Σ_{n=1}^∞ ν(A_n) = ν(A),

and then the desired inequality follows by taking the supremum in the left hand side.

In the case of signed measures, the variation measure is also given by the following.

Proposition 8.2. Let µ be a signed measure on the σ-algebra A. Then one has the equality

|µ| = µ+ + µ−,

where µ = µ+ − µ− is the Hahn-Jordan decomposition of µ.

Proof. Denote the measure µ+ + µ− simply by ν. Remark that we obviously have

−ν(A) = −µ+(A) − µ−(A) ≤ µ+(A) − µ−(A) ≤ µ+(A) + µ−(A) = ν(A), ∀ A ∈ A,

which gives

|µ(A)| ≤ ν(A), ∀ A ∈ A.

By Remark 8.5, this forces the inequality |µ| ≤ ν.

To prove the other inequality, we start by fixing sets X+, X− ∈ A as in Theorem 8.2. We decompose each set A ∈ A as A = A+ ∪ A−, where A± = A ∩ X±, so that we have

ν(A) = ν(A+) + ν(A−) = µ+(A+) + µ+(A−) + µ−(A+) + µ−(A−) = µ+(A+) + µ−(A−).

Notice now that µ(A+) = µ+(A+) ≥ 0, and −µ(A−) = µ−(A−) ≥ 0, which means that we have the equalities µ+(A+) = |µ(A+)| and µ−(A−) = |µ(A−)|, so the above equality reads

ν(A) = |µ(A+)| + |µ(A−)|,

and by the definition of |µ| we then immediately get ν(A) ≤ |µ|(A).

An interesting consequence is the following.

Corollary 8.4. Let µ be either a finite signed measure, or a complex measure on the σ-algebra A. Then the variation measure |µ| is finite.

Proof. The signed measure case is clear from the above result, combined with Exercise 1.

In the complex case, we write µ = ν + iη, with ν and η finite signed measures on A. We apply the signed case, to get the fact that both |ν| and |η| are finite. Notice that we have

|µ(A)| = |ν(A) + iη(A)| ≤ |ν(A)| + |η(A)| ≤ |ν|(A) + |η|(A), ∀ A ∈ A,

so by Remark 8.5 we get |µ| ≤ |ν| + |η|, and then the finiteness of |µ| is a consequence of the finiteness of |ν| and |η|.

Exercise 3. Let A be a σ-algebra, and let K be one of the fields ℝ or ℂ. For the purpose of this exercise, let us agree to use the term K-measure for designating either a finite signed measure (when K = ℝ), or a complex measure (when K = ℂ). Prove the following.

(i) The collection of all K-measures on A is a vector space.




(ii) For any two K-measures µ and ν, one has the inequality

|µ + ν| ≤ |µ| + |ν|.

(iii) For any K-measure µ and any α ∈ K, one has the equality

|αµ| = |α| · |µ|.

Proposition 8.2 has another interesting consequence, which is relevant for the study of the extended finite Riesz correspondence.

Corollary 8.5. Let X be a locally compact space. Then the extended finite Riesz correspondence (26) has the property

(31) |µφ|(X) = ‖φ‖, ∀ φ ∈ M^ℝ_0(X).

Proof. From Proposition 8.2 and Theorem 8.5, we know that |µφ| = µφ+ + µφ−. Using Remarks 8.4, and Theorem 8.5 again, we have

|µφ|(X) = µφ+(X) + µφ−(X) = ‖φ+‖ + ‖φ−‖ = ‖φ‖.

Comments. Given a locally compact space X, we can define a complex Radon measure on X as being a complex measure on Bor(X), whose real and imaginary parts are both (finite) signed Radon measures. The extended finite Riesz correspondence can be then defined also over the complex numbers, as a map

M_0(X) ∋ φ ⟼ µφ ∈ R_0(X),

where

M_0(X) = { φ : C_0(X) → ℂ : φ linear continuous },

R_0(X) = { µ : µ complex Radon measure on X }.

This correspondence is again linear. One will still have the equality (31), but the proof of this fact will appear later in Chapter IV.



Chapter IV

Integration Theory



Lectures 32-33

1. Construction of the integral

In this section we construct the abstract integral. As a matter of terminology, we define a measure space as being a triple (X, A, µ), where X is some (non-empty) set, A is a σ-algebra on X, and µ is a measure on A. The measure space (X, A, µ) is said to be finite, if µ(X) < ∞.

Definition. Let (X, A, µ) be a measure space, and let K be one of the fields ℝ or ℂ. A K-valued elementary µ-integrable function on (X, A, µ) is a function f : X → K, with the following properties:

• the range f(X) of f is a finite set;
• f^{-1}({α}) ∈ A, and µ(f^{-1}({α})) < ∞, for all α ∈ f(X) ∖ {0}.

We denote by L^1_{K,elem}(X, A, µ) the collection of all such functions.

Remarks 1.1. Let (X, A, µ) be a measure space.

A. Every K-valued elementary µ-integrable function f on (X, A, µ) is measurable, as a map f : (X, A) → (K, Bor(K)). In fact, any such f can be written as

f = α_1 κ_{A_1} + · · · + α_n κ_{A_n},

with α_k ∈ K, A_k ∈ A and µ(A_k) < ∞, ∀ k = 1, . . . , n. Using the notations from III.1, we have the inclusion

L^1_{K,elem}(X, A, µ) ⊂ A-Elem_K(X).

B. If we consider the collection R = { A ∈ A : µ(A) < ∞ }, then R is a ring, and we have the equality

L^1_{K,elem}(X, A, µ) = R-Elem_K(X).

In particular, it follows that L^1_{K,elem}(X, A, µ) is a K-vector space.

The following result is the first step in the construction of the integral.

Theorem 1.1. Let (X, A, µ) be a measure space, and let K be one of the fields ℝ or ℂ. Then there exists a unique K-linear map I^µ_elem : L^1_{K,elem}(X, A, µ) → K, such that

(1) I^µ_elem(κ_A) = µ(A),

for all A ∈ A, with µ(A) < ∞.

Proof. For every f ∈ L^1_{K,elem}(X, A, µ), we define

I^µ_elem(f) = Σ_{α ∈ f(X)∖{0}} α · µ(f^{-1}({α})),


with the convention that, when f(X) = {0} (which is the same as f = 0), we define I^µ_elem(f) = 0. It is obvious that I^µ_elem satisfies the equality (1) for all A ∈ A with µ(A) < ∞.

One key feature we are going to use is the following.

Claim 1: Whenever we have a finite pairwise disjoint sequence (A_k)_{k=1}^n ⊂ A, with µ(A_k) < ∞, ∀ k = 1, . . . , n, one has the equality

I^µ_elem(α_1 κ_{A_1} + · · · + α_n κ_{A_n}) = α_1 µ(A_1) + · · · + α_n µ(A_n), ∀ α_1, . . . , α_n ∈ K.

It is obvious that we can assume α_j ≠ 0, ∀ j = 1, . . . , n. To prove the above equality, we consider the elementary µ-integrable function f = α_1 κ_{A_1} + · · · + α_n κ_{A_n}, and we observe that f(X) ∖ {0} = {α_1} ∪ · · · ∪ {α_n}. It may be the case that some of the α's are equal. We list f(X) ∖ {0} = {β_1, . . . , β_p}, with β_j ≠ β_k, for all j, k ∈ {1, . . . , p} with j ≠ k. For each k ∈ {1, . . . , p}, we define the set

J_k = { j ∈ {1, . . . , n} : α_j = β_k }.

It is obvious that the sets (J_k)_{k=1}^p are pairwise disjoint, and we have J_1 ∪ · · · ∪ J_p = {1, . . . , n}. Moreover, for each k ∈ {1, . . . , p}, one has the equality

f^{-1}({β_k}) = ⋃_{j ∈ J_k} A_j,

so we get

β_k µ(f^{-1}({β_k})) = β_k Σ_{j ∈ J_k} µ(A_j) = Σ_{j ∈ J_k} α_j µ(A_j), ∀ k ∈ {1, . . . , p}.

By the definition of I^µ_elem we then get

I^µ_elem(f) = Σ_{k=1}^p β_k µ(f^{-1}({β_k})) = Σ_{k=1}^p Σ_{j ∈ J_k} α_j µ(A_j) = Σ_{j=1}^n α_j µ(A_j).

Claim 2: For every f ∈ L^1_{K,elem}(X, A, µ), and every A ∈ A with µ(A) < ∞, one has the equality

(2) I^µ_elem(f + ακ_A) = I^µ_elem(f) + αµ(A), ∀ α ∈ K.

Write f = α_1 κ_{A_1} + · · · + α_n κ_{A_n}, with (A_j)_{j=1}^n ⊂ A pairwise disjoint, and µ(A_j) < ∞, ∀ j = 1, . . . , n. In order to prove (2), we are going to write the function f + ακ_A in a similar way, and we are going to apply Claim 1. Consider the sets B_1, B_2, . . . , B_{2n}, B_{2n+1} ∈ A defined by B_{2n+1} = A ∖ (A_1 ∪ · · · ∪ A_n), and B_{2k−1} = A_k ∩ A, B_{2k} = A_k ∖ A, ∀ k = 1, . . . , n. It is obvious that the sets (B_p)_{p=1}^{2n+1} are pairwise disjoint. Moreover, one has the equalities

(3) B_{2k−1} ∪ B_{2k} = A_k, ∀ k ∈ {1, . . . , n},

as well as the equality

(4) A = ⋃_{k=1}^{n+1} B_{2k−1}.

Using these equalities, now we have f + ακ_A = Σ_{p=1}^{2n+1} β_p κ_{B_p}, where β_{2n+1} = α, and β_{2k} = α_k and β_{2k−1} = α_k + α, ∀ k ∈ {1, . . . , n}. Using these equalities,




combined with Claim 1, and (3) and (4), we now get

I^µ_elem(f + ακ_A) = Σ_{p=1}^{2n+1} β_p µ(B_p)
= αµ(B_{2n+1}) + Σ_{k=1}^n [ (α_k + α)µ(B_{2k−1}) + α_k µ(B_{2k}) ]
= α [ Σ_{k=1}^{n+1} µ(B_{2k−1}) ] + Σ_{k=1}^n α_k [ µ(B_{2k−1}) + µ(B_{2k}) ]
= αµ(⋃_{k=1}^{n+1} B_{2k−1}) + Σ_{k=1}^n α_k µ(B_{2k−1} ∪ B_{2k})
= αµ(A) + Σ_{k=1}^n α_k µ(A_k) = αµ(A) + I^µ_elem(f),

and the Claim is proven.

We now prove that I^µ_elem is linear. The equality

I^µ_elem(f + g) = I^µ_elem(f) + I^µ_elem(g), ∀ f, g ∈ L^1_{K,elem}(X, A, µ)

follows from Claim 2, using an obvious inductive argument. The equality

I^µ_elem(αf) = αI^µ_elem(f), ∀ α ∈ K, f ∈ L^1_{K,elem}(X, A, µ)

is also pretty obvious, from the definition.

The uniqueness is also clear.

Definition. With the notations above, the linear map

I µelem : L1K,elem(X, A, µ) → K

is called the elementary µ-integral .
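The two descriptions of the elementary integral (the definition through the range of f, and the formula for a representation f = Σ α_k κ_{A_k}) can be compared on a toy example. The sketch below assumes X is a finite set with all subsets measurable and µ given by point weights; the function names and the data are hypothetical.

```python
def simple_function(terms):
    """f = sum of alpha_k * (indicator of A_k), given as a list of (alpha_k, A_k) pairs."""
    return lambda x: sum(alpha for alpha, A in terms if x in A)

def integral_from_range(f, mu):
    """The definition used in the proof: sum of alpha * mu(f^{-1}({alpha})) over alpha != 0."""
    total = 0
    for alpha in {f(x) for x in mu} - {0}:
        preimage = [x for x in mu if f(x) == alpha]
        total += alpha * sum(mu[x] for x in preimage)
    return total

def integral_from_representation(terms, mu):
    """The formula given by Claim 1 and linearity: sum of alpha_k * mu(A_k)."""
    return sum(alpha * sum(mu[x] for x in A) for alpha, A in terms)

mu = {0: 1, 1: 2, 2: 1, 3: 3, 4: 2, 5: 1}               # hypothetical point weights
disjoint = [(2, {0, 1}), (2, {2}), (-1, {3, 4})]         # pairwise disjoint sets, repeated coefficient
overlapping = [(1, {0, 1, 2}), (3, {1, 2, 3})]           # sets that are not disjoint

disjoint_values = (integral_from_range(simple_function(disjoint), mu),
                   integral_from_representation(disjoint, mu))
overlapping_values = (integral_from_range(simple_function(overlapping), mu),
                      integral_from_representation(overlapping, mu))
```

The first pair illustrates Claim 1, where equal coefficients on different sets get merged into one level set; the second pair illustrates that, by linearity, the formula remains valid when the sets overlap.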

In what follows we are going to encounter also situations when certain relations among measurable functions hold “almost everywhere.” We are going to use the following.

Convention. Let T be one of the spaces [−∞, ∞] or ℂ, and let r be some relation on T (in our case r will be either “=,” or “≥,” or “≤,” on [−∞, ∞]). Given a measure space (X, A, µ), and two measurable functions f1, f2 : X → T, we write

f1 r f2, µ-a.e.

if the set

A = { x ∈ X : f1(x) r f2(x) }

belongs to A, and it has µ-null complement in X, i.e. µ(X ∖ A) = 0. (If r is one of the relations listed above, the set A automatically belongs to A.) The abbreviation “µ-a.e.” stands for “µ-almost everywhere.”

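Remark 1.2. Let $(X, \mathcal{A}, \mu)$ be a measure space, and let $f \in \mathcal{A}\text{-}\mathrm{Elem}_K(X)$ be such that

\[ f = 0, \quad \mu\text{-a.e.} \]

Then $f \in L^1_{K,\mathrm{elem}}(X, \mathcal{A}, \mu)$, and $I^\mu_{\mathrm{elem}}(f) = 0$. Indeed, if we define the set

\[ N = \{ x \in X : f(x) \neq 0 \}, \]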



292 LECTURES 32-33

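then $N \in \mathcal{A}$ and $\mu(N) = 0$. Since $f^{-1}(\{\alpha\}) \subset N$, $\forall\, \alpha \in f(X) \setminus \{0\}$, it follows that $\mu\bigl(f^{-1}(\{\alpha\})\bigr) = 0$, $\forall\, \alpha \in f(X) \setminus \{0\}$, and then by the definition of the elementary $\mu$-integral, we get $I^\mu_{\mathrm{elem}}(f) = 0$.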

One useful property of elementary integrable functions is the following.

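Proposition 1.1. Let $(X, \mathcal{A}, \mu)$ be a measure space, let $f, g \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$, and let $h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$ be such that

\[ f \le h \le g, \quad \mu\text{-a.e.} \]

Then $h \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$, and

\[ I^\mu_{\mathrm{elem}}(f) \le I^\mu_{\mathrm{elem}}(h) \le I^\mu_{\mathrm{elem}}(g). \tag{5} \]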

Proof. Consider the sets

\[ A = \{ x \in X : f(x) > h(x) \} \quad \text{and} \quad B = \{ x \in X : h(x) > g(x) \}, \]

which both belong to $\mathcal{A}$, and have $\mu(A) = \mu(B) = 0$. The set $M = A \cup B$ also belongs to $\mathcal{A}$ and has $\mu(M) = 0$. Define the functions $f_0 = f(1 - \kappa_M)$, $g_0 = g(1 - \kappa_M)$, and $h_0 = h(1 - \kappa_M)$. It is clear that $f_0$, $g_0$, and $h_0$ are all in $\mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$. Moreover, we have the equalities $f_0 = f$, $\mu$-a.e., $g_0 = g$, $\mu$-a.e., and $h_0 = h$, $\mu$-a.e., so by Remark 1.2, combined with Theorem 1.1, the functions $f_0 = f + (f_0 - f)$ and $g_0 = (g_0 - g) + g$ both belong to $L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$, and we have the equalities

\[ I^\mu_{\mathrm{elem}}(f_0) = I^\mu_{\mathrm{elem}}(f) \quad \text{and} \quad I^\mu_{\mathrm{elem}}(g_0) = I^\mu_{\mathrm{elem}}(g). \tag{6} \]

Notice now that we have the (absolute) inequality

f 0 ≤ h0 ≤ g0.

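Let us show that $h_0$ is elementary integrable. Start with some $\alpha \in h_0(X) \setminus \{0\}$. If $\alpha > 0$, then, using the inequality $h_0 \le g_0$, we get

\[ h_0^{-1}(\{\alpha\}) \subset g_0^{-1}\bigl((0, \infty)\bigr) \subset \bigcup_{\lambda \in g_0(X) \setminus \{0\}} g_0^{-1}(\{\lambda\}), \]

which proves that $\mu\bigl(h_0^{-1}(\{\alpha\})\bigr) < \infty$. Likewise, if $\alpha < 0$, then, using the inequality $h_0 \ge f_0$, we get

\[ h_0^{-1}(\{\alpha\}) \subset f_0^{-1}\bigl((-\infty, 0)\bigr) \subset \bigcup_{\lambda \in f_0(X) \setminus \{0\}} f_0^{-1}(\{\lambda\}), \]

which proves again that $\mu\bigl(h_0^{-1}(\{\alpha\})\bigr) < \infty$.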

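Having shown that $h_0$ is elementary integrable, we now compare the numbers $I^\mu_{\mathrm{elem}}(f)$, $I^\mu_{\mathrm{elem}}(h_0)$, and $I^\mu_{\mathrm{elem}}(g)$. Define the functions $f_1 = h_0 - f_0$, and $g_1 = g_0 - h_0$. By Theorem 1.1, we know that $f_1, g_1 \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$. Since $f_1, g_1 \ge 0$, we have $f_1(X), g_1(X) \subset [0, \infty)$, so it follows immediately that $I^\mu_{\mathrm{elem}}(f_1) \ge 0$ and $I^\mu_{\mathrm{elem}}(g_1) \ge 0$. Now, again using Theorem 1.1, and (6), we get

\begin{align*}
I^\mu_{\mathrm{elem}}(h_0) &= I^\mu_{\mathrm{elem}}(f_0 + f_1) = I^\mu_{\mathrm{elem}}(f_0) + I^\mu_{\mathrm{elem}}(f_1) \ge I^\mu_{\mathrm{elem}}(f_0) = I^\mu_{\mathrm{elem}}(f); \\
I^\mu_{\mathrm{elem}}(h_0) &= I^\mu_{\mathrm{elem}}(g_0 - g_1) = I^\mu_{\mathrm{elem}}(g_0) - I^\mu_{\mathrm{elem}}(g_1) \le I^\mu_{\mathrm{elem}}(g_0) = I^\mu_{\mathrm{elem}}(g).
\end{align*}

Since $h = h_0$, $\mu$-a.e., by Remark 1.2 (applied to $h - h_0$) and Theorem 1.1, it follows that $h \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$, and $I^\mu_{\mathrm{elem}}(h) = I^\mu_{\mathrm{elem}}(h_0)$, so the desired inequality (5) follows immediately.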

We now define another type of integral.




Definition. Let $(X, \mathcal{A}, \mu)$ be a measure space. A measurable function $f : X \to [0, \infty]$ is said to be µ-integrable, if

(a) every $h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, with $0 \le h \le f$, is elementary µ-integrable;

(b) $\sup\bigl\{ I^\mu_{\mathrm{elem}}(h) : h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X),\ 0 \le h \le f \bigr\} < \infty$.

If this is the case, the above supremum is denoted by $I^\mu_+(f)$. The space of all such functions is denoted by $L^1_+(X, \mathcal{A}, \mu)$. The map

\[ I^\mu_+ : L^1_+(X, \mathcal{A}, \mu) \to [0, \infty) \]

is called the positive µ-integral.

The first (legitimate) question is whether there is an overlap between the two definitions. This is answered by the following.

Proposition 1.2. Let $(X, \mathcal{A}, \mu)$ be a measure space, and let $f \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$ be a function with $f \ge 0$. The following are equivalent:

(i) $f \in L^1_+(X, \mathcal{A}, \mu)$;

(ii) $f \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$.

Moreover, if $f$ is as above, then $I^\mu_{\mathrm{elem}}(f) = I^\mu_+(f)$.

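Proof. The implication (i) ⇒ (ii) is trivial (take $h = f$ in condition (a) of the definition).

To prove the implication (ii) ⇒ (i) we start with an arbitrary elementary $h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, with $0 \le h \le f$. Using Proposition 1.1, we clearly get

(a) $h \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$;

(b) $I^\mu_{\mathrm{elem}}(h) \le I^\mu_{\mathrm{elem}}(f)$.

Using these two facts, it follows that $f \in L^1_+(X, \mathcal{A}, \mu)$. Since $h = f$ is itself admissible, we also get the equality

\[ \sup\bigl\{ I^\mu_{\mathrm{elem}}(h) : h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X),\ 0 \le h \le f \bigr\} = I^\mu_{\mathrm{elem}}(f), \]

which gives $I^\mu_+(f) = I^\mu_{\mathrm{elem}}(f)$.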

We now examine properties of the positive integral, which are similar to those of the elementary integral. The following is an analogue of Proposition 1.1.

Proposition 1.3. Let $(X, \mathcal{A}, \mu)$ be a measure space, let $f \in L^1_+(X, \mathcal{A}, \mu)$, and let $g : X \to [0, \infty]$ be a measurable function, such that $g \le f$, $\mu$-a.e. Then $g \in L^1_+(X, \mathcal{A}, \mu)$, and $I^\mu_+(g) \le I^\mu_+(f)$.

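Proof. Start with some elementary function $h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, with $0 \le h \le g$. Consider the sets

\[ M = \{ x \in X : h(x) > f(x) \} \quad \text{and} \quad N = \{ x \in X : g(x) > f(x) \}, \]

which obviously belong to $\mathcal{A}$. Since $M \subset N$, and $\mu(N) = 0$, we have $\mu(M) = 0$. If we define the elementary function $h_0 = h(1 - \kappa_M)$, then we have $h = h_0$, $\mu$-a.e., and $0 \le h_0 \le f$, so it follows that $h_0 \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$, and $I^\mu_{\mathrm{elem}}(h_0) \le I^\mu_+(f)$.

Since $h = h_0$, $\mu$-a.e., by Proposition 1.1, it follows that $h \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$, and $I^\mu_{\mathrm{elem}}(h) = I^\mu_{\mathrm{elem}}(h_0) \le I^\mu_+(f)$. By definition, this gives $g \in L^1_+(X, \mathcal{A}, \mu)$ and $I^\mu_+(g) \le I^\mu_+(f)$.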

Remark 1.3. Let $(X, \mathcal{A}, \mu)$ be a measure space, and let $f \in L^1_+(X, \mathcal{A}, \mu)$. Although $f$ is allowed to take the value $\infty$, it turns out that this is inessential. More precisely one has

\[ \mu\bigl(f^{-1}(\{\infty\})\bigr) = 0. \]




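This is in fact a consequence of the equality

\[ \lim_{t \to \infty} \mu\bigl(f^{-1}([t, \infty])\bigr) = 0. \tag{7} \]

Indeed, if we define, for each $t \in (0, \infty)$, the set $A_t = f^{-1}([t, \infty]) \in \mathcal{A}$, then we have $0 \le t\kappa_{A_t} \le f$. This forces the functions $t\kappa_{A_t}$, $t \in (0, \infty)$, to be elementary integrable, with $t\mu(A_t) = I^\mu_{\mathrm{elem}}(t\kappa_{A_t}) \le I^\mu_+(f)$, that is,

\[ \mu(A_t) \le \frac{I^\mu_+(f)}{t}, \quad \forall\, t \in (0, \infty). \]

This forces $\lim_{t \to \infty} \mu(A_t) = 0$. Since $f^{-1}(\{\infty\}) \subset A_t$ for every $t$, the equality $\mu\bigl(f^{-1}(\{\infty\})\bigr) = 0$ follows.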
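The estimate $\mu(A_t) \le I^\mu_+(f)/t$ can be tried out numerically. The sketch below assumes a hypothetical finite model with point weights, in which the supremum defining the positive integral reduces to a weighted sum over the atoms of positive measure; the names `WEIGHTS`, `positive_integral` and `superlevel_measure` are illustrative only.

```python
import math
from fractions import Fraction

# Hypothetical finite model: the measure is given by point weights.
WEIGHTS = {"a": Fraction(1), "b": Fraction(1, 2), "c": Fraction(0), "d": Fraction(4)}

# A function with values in [0, inf]; it is infinite only on the null atom "c".
f = {"a": Fraction(3), "b": Fraction(10), "c": math.inf, "d": Fraction(1, 4)}


def positive_integral(func):
    """Positive integral in this finite model (assumption: weighted sum over
    atoms of positive measure; an infinite value there makes the result inf)."""
    total = Fraction(0)
    for x, w in WEIGHTS.items():
        if w == 0:
            continue  # null atoms do not contribute
        if func[x] == math.inf:
            return math.inf
        total += func[x] * w
    return total


def superlevel_measure(func, t):
    """mu(A_t), where A_t = f^{-1}([t, inf])."""
    return sum((w for x, w in WEIGHTS.items() if func[x] >= t), Fraction(0))


integral_f = positive_integral(f)
thresholds = [Fraction(1, 4), Fraction(1), Fraction(2), Fraction(5), Fraction(20)]
bounds_hold = all(superlevel_measure(f, t) <= integral_f / t for t in thresholds)
```

As $t$ grows, `superlevel_measure(f, t)` eventually equals the measure of the set where $f = \infty$, which is zero here, in line with the Remark.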

The next result explains the fact that positive integrability is a “decomposable” property.

Proposition 1.4. Let $(X, \mathcal{A}, \mu)$ be a measure space. Suppose $(A_k)_{k=1}^n \subset \mathcal{A}$ is a pairwise disjoint finite sequence, with $A_1 \cup \cdots \cup A_n = X$. For a measurable function $f : X \to [0, \infty]$, the following are equivalent.

(i) $f \in L^1_+(X, \mathcal{A}, \mu)$;

(ii) $f\kappa_{A_k} \in L^1_+(X, \mathcal{A}, \mu)$, $\forall\, k = 1, \dots, n$.

Moreover, if $f$ satisfies these equivalent conditions, one has

\[ I^\mu_+(f) = \sum_{k=1}^{n} I^\mu_+(f\kappa_{A_k}). \]

Proof. The implication (i) ⇒ (ii) is trivial, since we have $0 \le f\kappa_{A_k} \le f$, so we can apply Proposition 1.3.

To prove the implication (ii) ⇒ (i), start by assuming that $f$ satisfies condition (ii). We first observe that every elementary function $h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, with $0 \le h \le f$, has the properties:

(a) $h \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$;

(b) $I^\mu_{\mathrm{elem}}(h) \le \sum_{k=1}^{n} I^\mu_+(f\kappa_{A_k})$.

This is immediate from the fact that we have the equality $h = \sum_{k=1}^{n} h\kappa_{A_k}$, and all functions $h\kappa_{A_k}$ are elementary, and satisfy $0 \le h\kappa_{A_k} \le f\kappa_{A_k}$, and then everything follows from Theorem 1.1 and the definition of the positive integral, which gives $I^\mu_{\mathrm{elem}}(h\kappa_{A_k}) \le I^\mu_+(f\kappa_{A_k})$.

Of course, the properties (a) and (b) above prove that $f \in L^1_+(X, \mathcal{A}, \mu)$, as well as the inequality

\[ I^\mu_+(f) \le \sum_{k=1}^{n} I^\mu_+(f\kappa_{A_k}). \]

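To prove that we have in fact equality, we start with some $\varepsilon > 0$, and we choose, for each $k \in \{1, \dots, n\}$, a function $h_k \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$, such that $0 \le h_k \le f\kappa_{A_k}$, and $I^\mu_{\mathrm{elem}}(h_k) \ge I^\mu_+(f\kappa_{A_k}) - \frac{\varepsilon}{n}$. By Theorem 1.1, the function $h = h_1 + \cdots + h_n$ belongs to $L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$, and has

\[ I^\mu_{\mathrm{elem}}(h) = \sum_{k=1}^{n} I^\mu_{\mathrm{elem}}(h_k) \ge \Bigl[ \sum_{k=1}^{n} I^\mu_+(f\kappa_{A_k}) \Bigr] - \varepsilon. \tag{8} \]

We obviously have

\[ h = \sum_{k=1}^{n} h_k \le \sum_{k=1}^{n} f\kappa_{A_k} = f, \]

so we get $I^\mu_{\mathrm{elem}}(h) \le I^\mu_+(f)$, thus the inequality (8) gives

\[ I^\mu_+(f) \ge \Bigl[ \sum_{k=1}^{n} I^\mu_+(f\kappa_{A_k}) \Bigr] - \varepsilon. \]

Since this inequality holds for all $\varepsilon > 0$, we get $I^\mu_+(f) \ge \sum_{k=1}^{n} I^\mu_+(f\kappa_{A_k})$, and we are done.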

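Remark 1.4. Let $(X, \mathcal{A}, \mu)$ be a measure space, and let $S \in \mathcal{A}$. We can define the collection

\[ \mathcal{A}|_S = \{ A \cap S : A \in \mathcal{A} \} = \{ A \in \mathcal{A} : A \subset S \}, \]

so that $\mathcal{A}|_S \subset \mathcal{A}$ is a σ-algebra on $S$. The restriction of $\mu$ to $\mathcal{A}|_S$ will be denoted by $\mu|_S$. With these notations, $(S, \mathcal{A}|_S, \mu|_S)$ is a measure space. It is not hard to see that for a measurable function $f : X \to [0, \infty]$, the conditions

• $f\kappa_S \in L^1_+(X, \mathcal{A}, \mu)$,

• $f|_S \in L^1_+(S, \mathcal{A}|_S, \mu|_S)$

are equivalent. Moreover, in this case one has the equality

\[ I^\mu_+(f\kappa_S) = I^{\mu|_S}_+(f|_S). \]

This is a consequence of the fact that these two conditions are equivalent if $f$ is elementary, combined with the fact that the restriction map $h \longmapsto h|_S$ establishes a bijection between the sets

\[ \bigl\{ h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X) : 0 \le h \le f\kappa_S \bigr\} \quad \text{and} \quad \bigl\{ k \in \mathcal{A}|_S\text{-}\mathrm{Elem}_{\mathbb{R}}(S) : 0 \le k \le f|_S \bigr\}. \]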

The next result gives an alternative definition of the positive integral, for functions that are dominated by elementary integrable ones.

Proposition 1.5. Let $(X, \mathcal{A}, \mu)$ be a measure space, and let $f : X \to [0, \infty]$ be a measurable function. Assume there exists $h_0 \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$, with $h_0 \ge f$. Then $f \in L^1_+(X, \mathcal{A}, \mu)$, and one has the equality

\[ I^\mu_+(f) = \inf\bigl\{ I^\mu_{\mathrm{elem}}(h) : h \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu),\ h \ge f \bigr\}. \tag{9} \]

Proof. Since $h_0 \ge 0$, by Proposition 1.2, we know that $h_0 \in L^1_+(X, \mathcal{A}, \mu)$. The fact that $f \in L^1_+(X, \mathcal{A}, \mu)$ then follows from Proposition 1.3, combined with the inequality $h_0 \ge f$. More generally, again by Propositions 1.2 and 1.3, we know that for any $h \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$, with $h \ge f$, we have $h \in L^1_+(X, \mathcal{A}, \mu)$, as well as the inequality

\[ I^\mu_+(f) \le I^\mu_+(h) = I^\mu_{\mathrm{elem}}(h). \]

So, if we denote the right hand side of (9) by $J(f)$, we have $I^\mu_+(f) \le J(f) \le I^\mu_{\mathrm{elem}}(h_0)$.

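We now prove the other inequality $I^\mu_+(f) \ge J(f)$. If $h_0 = 0$, there is nothing to prove. Assume $h_0$ is not identically zero. Without any loss of generality, we can assume that $h_0 = \beta\kappa_B$, for some $\beta \in (0, \infty)$ and $B \in \mathcal{A}$ with $\mu(B) < \infty$. (If we define $B = h_0^{-1}\bigl((0, \infty)\bigr) = \bigcup_{\alpha \in h_0(X) \setminus \{0\}} h_0^{-1}(\{\alpha\})$, and if we set $\beta = \max h_0(X)$, then we clearly have $\mu(B) < \infty$, and $h_0 \le \beta\kappa_B$.)

For every integer $n \ge 1$, we define the sets $A^n_1, \dots, A^n_n \in \mathcal{A}$ by

\[ A^n_k = f^{-1}\Bigl( \bigl( \tfrac{(k-1)\beta}{n}, \tfrac{k\beta}{n} \bigr] \Bigr), \quad \forall\, k = 1, \dots, n, \]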




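and we define the elementary functions

\[ g_n = \sum_{k=1}^{n} \frac{(k-1)\beta}{n} \kappa_{A^n_k} \quad \text{and} \quad h_n = \sum_{k=1}^{n} \frac{k\beta}{n} \kappa_{A^n_k}. \]

The main features of these constructions are collected in the following.

Claim: For every $n \ge 1$, the functions $g_n$ and $h_n$ are elementary integrable, and satisfy the inequalities $0 \le g_n \le f \le h_n \le h_0$, as well as

\[ I^\mu_{\mathrm{elem}}(h_n) \le I^\mu_{\mathrm{elem}}(g_n) + \frac{\beta\mu(B)}{n}. \]

To prove this fact, we fix $n \ge 1$, and we first remark that the sets $(A^n_k)_{k=1}^n$ are pairwise disjoint. Since $0 \le f \le h_0 = \beta\kappa_B$, we have

\[ A^n_1 \cup \cdots \cup A^n_n = f^{-1}\bigl((0, \beta]\bigr) \subset B. \]

In particular, if we define $A^n = A^n_1 \cup \cdots \cup A^n_n \subset B$, we have

\[ h_n = \sum_{k=1}^{n} \frac{k\beta}{n} \kappa_{A^n_k} \le \beta \sum_{k=1}^{n} \kappa_{A^n_k} = \beta\kappa_{A^n} \le \beta\kappa_B. \]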

Let us prove the inequalities $g_n \le f \le h_n$. Start with some arbitrary point $x \in X$, and let us show that $g_n(x) \le f(x) \le h_n(x)$. If $f(x) = 0$, there is nothing to prove, because this forces $\kappa_{A^n_k}(x) = 0$, $\forall\, k = 1, \dots, n$. Assume now $f(x) > 0$. Since $f \le \beta\kappa_B$, we know that $f(x) \in (0, \beta]$, so there exists a unique $k \in \{1, \dots, n\}$, such that $\frac{(k-1)\beta}{n} < f(x) \le \frac{k\beta}{n}$, i.e. $x \in A^n_k$. We then obviously have

\[ g_n(x) = \frac{(k-1)\beta}{n} \kappa_{A^n_k}(x) = \frac{(k-1)\beta}{n} < f(x) \le \frac{k\beta}{n} = \frac{k\beta}{n} \kappa_{A^n_k}(x) = h_n(x), \]

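and we are done. Finally, let us observe that since $0 \le g_n \le h_n \le h_0$, it follows that $g_n$ and $h_n$ are in $L^1_+(X, \mathcal{A}, \mu)$, so $g_n$ and $h_n$ are elementary integrable. Notice that

\[ h_n - g_n = \frac{\beta}{n} \sum_{k=1}^{n} \kappa_{A^n_k} = \frac{\beta}{n} \kappa_{A^n} \le \frac{\beta}{n} \kappa_B, \]

so we have $I^\mu_{\mathrm{elem}}(h_n - g_n) \le I^\mu_{\mathrm{elem}}\bigl(\frac{\beta}{n}\kappa_B\bigr) = \frac{\beta\mu(B)}{n}$, so using Theorem 1.1, we get

\[ I^\mu_{\mathrm{elem}}(h_n) = I^\mu_{\mathrm{elem}}(g_n) + I^\mu_{\mathrm{elem}}(h_n - g_n) \le I^\mu_{\mathrm{elem}}(g_n) + \frac{\beta\mu(B)}{n}. \]

Having proven the Claim, we immediately see that by the definition of the positive integral, we have

\[ J(f) \le I^\mu_{\mathrm{elem}}(h_n) \le I^\mu_{\mathrm{elem}}(g_n) + \frac{\beta\mu(B)}{n} \le I^\mu_+(f) + \frac{\beta\mu(B)}{n}. \]

Since the inequality $J(f) \le I^\mu_+(f) + \frac{\beta\mu(B)}{n}$ holds for all $n \ge 1$, it will clearly force $J(f) \le I^\mu_+(f)$.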
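The staircase functions $g_n$ and $h_n$ from the proof can be computed explicitly in a small example. This is only a sketch under stated assumptions: a hypothetical finite model with point weights, a function $f$ bounded by $\beta\kappa_B$ with $\beta = 1$, and illustrative names (`WEIGHTS`, `staircase`, `integral`) that are not part of the notes.

```python
import math
from fractions import Fraction

# Hypothetical finite model with point weights.
WEIGHTS = {0: Fraction(1), 1: Fraction(2), 2: Fraction(1, 2), 3: Fraction(3)}

# f satisfies 0 <= f <= BETA * kappa_B, with B = {1, 2, 3}.
f = {0: Fraction(0), 1: Fraction(1, 3), 2: Fraction(7, 10), 3: Fraction(1)}
BETA = Fraction(1)
B = {1, 2, 3}
MU_B = sum((WEIGHTS[x] for x in B), Fraction(0))


def integral(h):
    """Integral of a finite-valued function in this finite model."""
    return sum((h[x] * WEIGHTS[x] for x in WEIGHTS), Fraction(0))


def staircase(n):
    """Return (g_n, h_n): lower and upper step functions with step BETA / n."""
    g, h = {}, {}
    for x, value in f.items():
        if value == 0:
            g[x] = h[x] = Fraction(0)
            continue
        # Unique k in {1, ..., n} with (k - 1) * BETA / n < value <= k * BETA / n.
        k = math.ceil(value * n / BETA)
        g[x] = (k - 1) * BETA / n
        h[x] = k * BETA / n
    return g, h


def claim_holds(n):
    """Check g_n <= f <= h_n and I(h_n) <= I(g_n) + BETA * mu(B) / n."""
    g, h = staircase(n)
    sandwich = all(g[x] <= f[x] <= h[x] for x in f)
    gap = integral(h) - integral(g) <= BETA * MU_B / n
    return sandwich and gap


all_ok = all(claim_holds(n) for n in (1, 2, 5, 50))
```

Both `integral(g)` and `integral(h)` close in on `integral(f)` as `n` grows, which is the mechanism behind the equality (9).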

Our next goal is to prove an analogue of Theorem 1.1, for the positive integral (Theorem 1.2 below). We discuss first a weaker version.

Lemma 1.1. Let $(X, \mathcal{A}, \mu)$ be a measure space.

(i) If $f \in L^1_+(X, \mathcal{A}, \mu)$ and $g \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$ are such that $g + f \ge 0$, then $g + f \in L^1_+(X, \mathcal{A}, \mu)$, and $I^\mu_+(g + f) = I^\mu_{\mathrm{elem}}(g) + I^\mu_+(f)$.

(ii) If $f \in L^1_+(X, \mathcal{A}, \mu)$ and $g \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$ are such that $g - f \ge 0$, then $g - f \in L^1_+(X, \mathcal{A}, \mu)$, and $I^\mu_+(g - f) = I^\mu_{\mathrm{elem}}(g) - I^\mu_+(f)$.




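Proof. (i). We start with a weaker version.

Claim: If $f \in L^1_+(X, \mathcal{A}, \mu)$ and $g \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$ are such that $g + f \ge 0$, then $g + f \in L^1_+(X, \mathcal{A}, \mu)$, and $I^\mu_+(g + f) \le I^\mu_{\mathrm{elem}}(g) + I^\mu_+(f)$.

What we need to prove is the fact that, for every $h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, with $0 \le h \le g + f$, we have:

(a) $h \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$;

(b) $I^\mu_{\mathrm{elem}}(h) \le I^\mu_{\mathrm{elem}}(g) + I^\mu_+(f)$.

Consider the elementary function $h_1 = \max\{h - g, 0\}$. It is obvious that $0 \le h_1 \le f$, so by Proposition 1.3, it follows that $h_1 \in L^1_+(X, \mathcal{A}, \mu)$, and $I^\mu_+(h_1) \le I^\mu_+(f)$. By Proposition 1.2, this gives $h_1 \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$, and

\[ I^\mu_{\mathrm{elem}}(h_1) = I^\mu_+(h_1) \le I^\mu_+(f). \tag{10} \]

Using the obvious inequality $-g \le h - g \le h_1$, by Proposition 1.1, it follows that $h - g \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$, and

\[ I^\mu_{\mathrm{elem}}(h - g) \le I^\mu_{\mathrm{elem}}(h_1). \tag{11} \]

Of course, by Theorem 1.1, this gives the fact that $h = (h - g) + g$ is elementary µ-integrable, as well as the equality

\[ I^\mu_{\mathrm{elem}}(h) = I^\mu_{\mathrm{elem}}(h - g) + I^\mu_{\mathrm{elem}}(g). \]

Combining this with (11) and (10) immediately gives

\[ I^\mu_{\mathrm{elem}}(h) \le I^\mu_{\mathrm{elem}}(h_1) + I^\mu_{\mathrm{elem}}(g) \le I^\mu_+(f) + I^\mu_{\mathrm{elem}}(g), \]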
and the Claim is proven.

Having proven the above Claim, we now proceed with the proof of (i). If $f \in L^1_+(X, \mathcal{A}, \mu)$ and $g \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$ are such that $g + f \ge 0$, then by the Claim, we already know that $g + f \in L^1_+(X, \mathcal{A}, \mu)$, and $I^\mu_+(g + f) \le I^\mu_{\mathrm{elem}}(g) + I^\mu_+(f)$. We apply now again the Claim to the functions $f_1 = g + f$ and $g_1 = -g$, to get

\[ I^\mu_+(f) = I^\mu_+(g_1 + f_1) \le I^\mu_{\mathrm{elem}}(g_1) + I^\mu_+(f_1) = -I^\mu_{\mathrm{elem}}(g) + I^\mu_+(g + f), \]

which gives the other inequality $I^\mu_{\mathrm{elem}}(g) + I^\mu_+(f) \le I^\mu_+(g + f)$.

(ii). Start with $f \in L^1_+(X, \mathcal{A}, \mu)$ and $g \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$, with $g - f \ge 0$. First of all, since $0 \le g - f \le g$, by Proposition 1.5, it follows that $g - f \in L^1_+(X, \mathcal{A}, \mu)$, and

\[ I^\mu_+(g - f) = \inf\bigl\{ I^\mu_{\mathrm{elem}}(k) : k \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu),\ k \ge g - f \bigr\}. \tag{12} \]

Second, remark that, whenever $k \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$ is such that $g - f \le k$, it follows that $k + f \ge g$, so using part (i) combined with Proposition 1.3, we see that $k + f \in L^1_+(X, \mathcal{A}, \mu)$, and

\[ I^\mu_{\mathrm{elem}}(g) = I^\mu_+(g) \le I^\mu_+(k + f) = I^\mu_{\mathrm{elem}}(k) + I^\mu_+(f). \]

This means that we have

\[ I^\mu_{\mathrm{elem}}(k) \ge I^\mu_{\mathrm{elem}}(g) - I^\mu_+(f), \]

for all $k \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$, with $k \ge g - f$, and then by (12), we immediately get

\[ I^\mu_+(g - f) \ge I^\mu_{\mathrm{elem}}(g) - I^\mu_+(f). \]




To prove the other inequality, we use the definition of the positive integral, which gives

\[ I^\mu_+(g - f) = \sup\bigl\{ I^\mu_{\mathrm{elem}}(h) : h \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu),\ 0 \le h \le g - f \bigr\}. \tag{13} \]

Remark that, whenever $h \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$ is such that $0 \le h \le g - f$, it follows that $0 \le h + f \le g$, so using part (i) combined with Proposition 1.3, we see that $h + f \in L^1_+(X, \mathcal{A}, \mu)$, and

\[ I^\mu_{\mathrm{elem}}(g) = I^\mu_+(g) \ge I^\mu_+(h + f) = I^\mu_{\mathrm{elem}}(h) + I^\mu_+(f). \]

This means that we have

\[ I^\mu_{\mathrm{elem}}(h) \le I^\mu_{\mathrm{elem}}(g) - I^\mu_+(f), \]

for all $h \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$, with $0 \le h \le g - f$, and then by (13), we immediately get $I^\mu_+(g - f) \le I^\mu_{\mathrm{elem}}(g) - I^\mu_+(f)$.

We are now in position to prove the following result (compare with Theorem 1.1).

Theorem 1.2. Let $(X, \mathcal{A}, \mu)$ be a measure space.

(i) If $f_1, f_2 \in L^1_+(X, \mathcal{A}, \mu)$, then $f_1 + f_2 \in L^1_+(X, \mathcal{A}, \mu)$, and one has the equality $I^\mu_+(f_1 + f_2) = I^\mu_+(f_1) + I^\mu_+(f_2)$.

(ii) If $f \in L^1_+(X, \mathcal{A}, \mu)$, and $\alpha \in [0, \infty)$, then $\alpha f \in L^1_+(X, \mathcal{A}, \mu)$ (see footnote 14), and one has the equality $I^\mu_+(\alpha f) = \alpha I^\mu_+(f)$.

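Proof. (i). Fix $f_1, f_2 \in L^1_+(X, \mathcal{A}, \mu)$.

Claim 1: Whenever $h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$ satisfies $0 \le h \le f_1 + f_2$, it follows that

(a) $h \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$,

(b) $I^\mu_{\mathrm{elem}}(h) \le I^\mu_+(f_1) + I^\mu_+(f_2)$.

Fix an elementary function $h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, with $0 \le h \le f_1 + f_2$, and let us first show that $h$ is elementary integrable. Fix some $\alpha \in h(X) \setminus \{0\}$, and let us prove that $\mu\bigl(h^{-1}(\{\alpha\})\bigr) < \infty$. If we define the sets $A_j = f_j^{-1}\bigl([\alpha/2, \infty]\bigr) \in \mathcal{A}$, $j = 1, 2$, then the elementary functions $h_j = \frac{\alpha}{2}\kappa_{A_j}$ satisfy $0 \le h_j \le f_j$, $j = 1, 2$. In particular, it follows that $h_1, h_2 \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$, which forces $\mu(A_1) < \infty$ and $\mu(A_2) < \infty$. Notice however that, for every $x \in h^{-1}(\{\alpha\})$, we have $f_1(x) + f_2(x) \ge h(x) = \alpha$, which forces either $f_1(x) \ge \frac{\alpha}{2}$ or $f_2(x) \ge \frac{\alpha}{2}$. This argument shows that we have the inclusion $h^{-1}(\{\alpha\}) \subset A_1 \cup A_2$, so it follows that we indeed have $\mu\bigl(h^{-1}(\{\alpha\})\bigr) < \infty$.

Having shown property (a), let us prove property (b). Define the sets

\[ B = \{ x \in X : h(x) \ge f_1(x) \} \quad \text{and} \quad D = X \setminus B. \]

It is obvious that $B, D \in \mathcal{A}$ are disjoint, and $B \cup D = X$. Define the elementary functions $h' = h\kappa_B$, and $h'' = h - h' = h\kappa_D$. On the one hand, we have

\[ f_1\kappa_B \le h' \le f_1\kappa_B + f_2\kappa_B, \]

which gives

\[ 0 \le h' - f_1\kappa_B \le f_2\kappa_B. \]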

Footnote 14: Here we use the convention that when $\alpha = 0$, we take $\alpha f = 0$.




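By Lemma 1.1(ii), combined with Proposition 1.4, it follows that $h' - f_1\kappa_B \in L^1_+(X, \mathcal{A}, \mu)$ and $I^\mu_{\mathrm{elem}}(h') - I^\mu_+(f_1\kappa_B) = I^\mu_+(h' - f_1\kappa_B) \le I^\mu_+(f_2\kappa_B)$, so we get

\[ I^\mu_{\mathrm{elem}}(h') \le I^\mu_+(f_1\kappa_B) + I^\mu_+(f_2\kappa_B). \tag{14} \]

On the other hand, we have

\[ h'' = h\kappa_D \le f_1\kappa_D, \]

which gives

\[ I^\mu_{\mathrm{elem}}(h'') \le I^\mu_+(f_1\kappa_D) \le I^\mu_+(f_1\kappa_D) + I^\mu_+(f_2\kappa_D). \tag{15} \]

Since $h = h' + h''$, with $h'$ and $h''$ elementary integrable, using Theorem 1.1 combined with Proposition 1.4, by adding the inequalities (14) and (15) we get

\begin{align*}
I^\mu_{\mathrm{elem}}(h) &= I^\mu_{\mathrm{elem}}(h') + I^\mu_{\mathrm{elem}}(h'') \\
&\le I^\mu_+(f_1\kappa_B) + I^\mu_+(f_2\kappa_B) + I^\mu_+(f_1\kappa_D) + I^\mu_+(f_2\kappa_D) = I^\mu_+(f_1) + I^\mu_+(f_2),
\end{align*}

and the Claim is proven.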

Claim 1 obviously implies the fact that $f_1 + f_2 \in L^1_+(X, \mathcal{A}, \mu)$, as well as the inequality

\[ I^\mu_+(f_1 + f_2) \le I^\mu_+(f_1) + I^\mu_+(f_2). \]

To prove the other inequality, we use the following.

Claim 2: For every $h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, with $0 \le h \le f_1$, one has the inequality

\[ I^\mu_{\mathrm{elem}}(h) \le I^\mu_+(f_1 + f_2) - I^\mu_+(f_2). \]

Indeed, if $h$ is as above, then $h$ is in $L^1_+(X, \mathcal{A}, \mu)$, hence elementary integrable, and we obviously have $0 \le h + f_2 \le f_1 + f_2$. Then by Lemma 1.1(i), combined with Proposition 1.3, we get

\[ I^\mu_{\mathrm{elem}}(h) + I^\mu_+(f_2) = I^\mu_+(h + f_2) \le I^\mu_+(f_1 + f_2), \]

and the Claim follows.

Using Claim 2, and the definition of the positive integral, we get

\[ I^\mu_+(f_1) = \sup\bigl\{ I^\mu_{\mathrm{elem}}(h) : h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X),\ 0 \le h \le f_1 \bigr\} \le I^\mu_+(f_1 + f_2) - I^\mu_+(f_2), \]

which then gives

\[ I^\mu_+(f_1) + I^\mu_+(f_2) \le I^\mu_+(f_1 + f_2). \]

(ii). This part is obvious.

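Definitions. Let $(X, \mathcal{A}, \mu)$ be a measure space. Denote the extended real line $[-\infty, \infty]$ by $\bar{\mathbb{R}}$. A measurable function $f : X \to \bar{\mathbb{R}}$ is said to be µ-integrable, if there exist functions $f_1, f_2 \in L^1_+(X, \mathcal{A}, \mu)$, such that

\[ f(x) = f_1(x) - f_2(x), \quad \forall\, x \in X \setminus \bigl[ f_1^{-1}(\{\infty\}) \cup f_2^{-1}(\{\infty\}) \bigr]. \tag{16} \]

By Remark 1.3, we know that the sets $f_k^{-1}(\{\infty\})$, $k = 1, 2$, have measure zero. The equality (16) gives then the fact $f = f_1 - f_2$, $\mu$-a.e. We define

\[ L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu) = \bigl\{ f : X \to \bar{\mathbb{R}} : f \ \mu\text{-integrable} \bigr\}. \]

We also define the space of “honest” real-valued µ-integrable functions, as

\[ L^1_{\mathbb{R}}(X, \mathcal{A}, \mu) = \bigl\{ f \in L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu) : -\infty < f(x) < \infty,\ \forall\, x \in X \bigr\}. \]

Finally, we define the space of complex-valued µ-integrable functions as

\[ L^1_{\mathbb{C}}(X, \mathcal{A}, \mu) = \bigl\{ f : X \to \mathbb{C} : \operatorname{Re} f, \operatorname{Im} f \in L^1_{\mathbb{R}}(X, \mathcal{A}, \mu) \bigr\}. \]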




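The next result collects the basic properties of $L^1_{\bar{\mathbb{R}}}$. Among other things, it states that it is an “almost” vector space.

Theorem 1.3. Let $(X, \mathcal{A}, \mu)$ be a measure space.

(i) For a measurable function $f : X \to \bar{\mathbb{R}}$, the following are equivalent:

(a) $f \in L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu)$;

(b) $|f| \in L^1_+(X, \mathcal{A}, \mu)$.

(ii) If $f, g \in L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu)$, and if $h : X \to \bar{\mathbb{R}}$ is a measurable function, such that

\[ h(x) = f(x) + g(x), \quad \forall\, x \in X \setminus \bigl[ f^{-1}(\{-\infty, \infty\}) \cup g^{-1}(\{-\infty, \infty\}) \bigr], \]

then $h \in L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu)$.

(iii) If $f \in L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu)$, and $\alpha \in \mathbb{R}$, and if $g : X \to \bar{\mathbb{R}}$ is a measurable function, such that

\[ g(x) = \alpha f(x), \quad \forall\, x \in X \setminus f^{-1}(\{-\infty, \infty\}), \]

then $g \in L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu)$.

(iv) One has the inclusion

\[ L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu) \cup L^1_+(X, \mathcal{A}, \mu) \subset L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu). \]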

Proof. (i). Consider the measurable functions $f^\pm : X \to [0, \infty]$ defined as

\[ f^+ = \max\{f, 0\} \quad \text{and} \quad f^- = \max\{-f, 0\}. \]

To prove the implication (a) ⇒ (b), assume $f \in L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu)$, which means there exist $f_1, f_2 \in L^1_+(X, \mathcal{A}, \mu)$, such that

\[ f(x) = f_1(x) - f_2(x), \quad \forall\, x \in X \setminus \bigl[ f_1^{-1}(\{\infty\}) \cup f_2^{-1}(\{\infty\}) \bigr]. \]

Notice that we have the inequalities

\[ f^+ \le f_1, \quad \mu\text{-a.e.}, \tag{17} \]

\[ f^- \le f_2, \quad \mu\text{-a.e.} \tag{18} \]

Indeed, if we put $N = f_1^{-1}(\{\infty\}) \cup f_2^{-1}(\{\infty\})$, then $\mu(N) = 0$, and if we start with some $x \in X \setminus N$, we either have $f_1(x) \ge f_2(x) \ge 0$, in which case we get

\begin{align*}
f^+(x) &= f(x) = f_1(x) - f_2(x) \le f_1(x), \\
f^-(x) &= 0 \le f_2(x),
\end{align*}

or we have $f_1(x) \le f_2(x)$, in which case we get

\begin{align*}
f^+(x) &= 0 \le f_1(x), \\
f^-(x) &= -f(x) = f_2(x) - f_1(x) \le f_2(x).
\end{align*}

In either case, we have

\[ f^+(x) \le f_1(x) \quad \text{and} \quad f^-(x) \le f_2(x), \quad \forall\, x \in X \setminus N, \]

so we indeed get (17) and (18). Using these inequalities, and Proposition 1.3, it follows that $f^\pm \in L^1_+(X, \mathcal{A}, \mu)$, so by Theorem 1.2, it follows that $f^+ + f^- = |f|$ also belongs to $L^1_+(X, \mathcal{A}, \mu)$.




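To prove the implication (b) ⇒ (a), start by assuming that $|f| \in L^1_+(X, \mathcal{A}, \mu)$. Then, since we obviously have the inequalities $0 \le f^\pm \le |f|$, again by Proposition 1.3, it follows that $f^\pm \in L^1_+(X, \mathcal{A}, \mu)$. Since we obviously have

\[ f(x) = f^+(x) - f^-(x), \quad \forall\, x \in X \setminus f^{-1}(\{-\infty, \infty\}), \]

it follows that $f$ indeed belongs to $L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu)$.

(ii). Assume $f$, $g$, and $h$ are as in (ii). By (i), both functions $|f|$ and $|g|$ are in $L^1_+(X, \mathcal{A}, \mu)$. By Theorem 1.2, it follows that the function $k = |f| + |g|$ also belongs to $L^1_+(X, \mathcal{A}, \mu)$. Notice that we have the equality

\[ f^{-1}(\{-\infty, \infty\}) \cup g^{-1}(\{-\infty, \infty\}) = k^{-1}(\{\infty\}), \]

so the hypothesis on $h$ reads

\[ h(x) = f(x) + g(x), \quad \forall\, x \in X \setminus k^{-1}(\{\infty\}), \]

which then gives

\[ |h(x)| = |f(x) + g(x)| \le |f(x)| + |g(x)|, \quad \forall\, x \in X \setminus k^{-1}(\{\infty\}). \]

Of course, since $\mu\bigl(k^{-1}(\{\infty\})\bigr) = 0$, this gives

\[ |h| \le k, \quad \mu\text{-a.e.}, \]

and using Proposition 1.3 and (i) it follows that $h$ indeed belongs to $L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu)$.

(iii). Assume $f$, $\alpha$, and $g$ are as in (iii). Exactly as above, we have $|g| = |\alpha| \cdot |f|$, $\mu$-a.e., and then by Theorem 1.2 it follows that $|g| \in L^1_+(X, \mathcal{A}, \mu)$, so by (i) we get $g \in L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu)$.

(iv). The inclusion $L^1_+(X, \mathcal{A}, \mu) \subset L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu)$ is trivial. To prove the inclusion $L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu) \subset L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu)$, we use parts (ii) and (iii) to reduce this to the fact that $\kappa_A \in L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu)$, for all $A \in \mathcal{A}$, with $\mu(A) < \infty$. But this fact is now obvious, because any such function belongs to $L^1_+(X, \mathcal{A}, \mu) \subset L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu)$.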

Corollary 1.1. Let $(X, \mathcal{A}, \mu)$ be a measure space, and let $K$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$.

(i) For a $K$-valued measurable function $f : X \to K$, the following are equivalent:

(a) $f \in L^1_K(X, \mathcal{A}, \mu)$;

(b) $|f| \in L^1_+(X, \mathcal{A}, \mu)$.

(ii) When equipped with the pointwise addition and scalar multiplication, the space $L^1_K(X, \mathcal{A}, \mu)$ becomes a $K$-vector space.

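Proof. (i). The case $K = \mathbb{R}$ is immediate from Theorem 1.3.

In the case when $K = \mathbb{C}$, we use the obvious inequalities

\[ \max\bigl\{ |\operatorname{Re} f|, |\operatorname{Im} f| \bigr\} \le |f| \le |\operatorname{Re} f| + |\operatorname{Im} f|. \tag{19} \]

If $f \in L^1_{\mathbb{C}}(X, \mathcal{A}, \mu)$, then both $\operatorname{Re} f$ and $\operatorname{Im} f$ belong to $L^1_{\mathbb{R}}(X, \mathcal{A}, \mu)$, so by Theorem 1.3, both $|\operatorname{Re} f|$ and $|\operatorname{Im} f|$ belong to $L^1_+(X, \mathcal{A}, \mu)$. By Theorem 1.2, the function $g = |\operatorname{Re} f| + |\operatorname{Im} f|$ belongs to $L^1_+(X, \mathcal{A}, \mu)$, and then using the second inequality in (19), it follows that $|f|$ belongs to $L^1_+(X, \mathcal{A}, \mu)$.

Conversely, if $|f|$ belongs to $L^1_+(X, \mathcal{A}, \mu)$, then using the first inequality in (19), it follows that both $|\operatorname{Re} f|$ and $|\operatorname{Im} f|$ belong to $L^1_+(X, \mathcal{A}, \mu)$, so by Theorem 1.3, both $\operatorname{Re} f$ and $\operatorname{Im} f$ belong to $L^1_{\mathbb{R}}(X, \mathcal{A}, \mu)$, i.e. $f$ belongs to $L^1_{\mathbb{C}}(X, \mathcal{A}, \mu)$.

(ii). This part is pretty clear. If $f, g \in L^1_K(X, \mathcal{A}, \mu)$, then by (i) both $|f|$ and $|g|$ belong to $L^1_+(X, \mathcal{A}, \mu)$, and by Theorem 1.2, the function $|f| + |g|$ will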




also belong to $L^1_+(X, \mathcal{A}, \mu)$. Since $|f + g| \le |f| + |g|$, it follows that $|f + g|$ itself belongs to $L^1_+(X, \mathcal{A}, \mu)$, so using (i) again, it follows that $f + g$ indeed belongs to $L^1_K(X, \mathcal{A}, \mu)$. If $f \in L^1_K(X, \mathcal{A}, \mu)$ and $\alpha \in K$, then $|f|$ belongs to $L^1_+(X, \mathcal{A}, \mu)$, so $|\alpha f| = |\alpha| \cdot |f|$ again belongs to $L^1_+(X, \mathcal{A}, \mu)$, which by (i) gives the fact that $\alpha f$ belongs to $L^1_K(X, \mathcal{A}, \mu)$.

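Remark 1.5. Let $(X, \mathcal{A}, \mu)$ be a measure space. Then one has the equalities

\[ L^1_+(X, \mathcal{A}, \mu) = \bigl\{ f \in L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu) : f(X) \subset [0, \infty] \bigr\}; \tag{20} \]

\[ L^1_{K,\mathrm{elem}}(X, \mathcal{A}, \mu) = L^1_K(X, \mathcal{A}, \mu) \cap \mathcal{A}\text{-}\mathrm{Elem}_K(X). \tag{21} \]

Indeed, by Theorem 1.3 we have the inclusion

\[ L^1_+(X, \mathcal{A}, \mu) \subset \bigl\{ f \in L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu) : f(X) \subset [0, \infty] \bigr\}. \]

The inclusion in the other direction follows again from Theorem 1.3, since any function that belongs to the right hand side of (20) satisfies $f = |f|$. The inclusion

\[ L^1_{K,\mathrm{elem}}(X, \mathcal{A}, \mu) \subset L^1_K(X, \mathcal{A}, \mu) \cap \mathcal{A}\text{-}\mathrm{Elem}_K(X) \]

is again contained in Theorem 1.3. To prove the inclusion in the other direction, it suffices to consider the case $K = \mathbb{R}$. Start with $h \in L^1_{\mathbb{R}}(X, \mathcal{A}, \mu) \cap \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, which gives $|h| \in L^1_+(X, \mathcal{A}, \mu)$. The function $|h|$ is obviously in $\mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, so by Proposition 1.2 we get $|h| \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$. Since $L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$ is a vector space, it will also contain the function $-|h|$. The fact that $h$ itself belongs to $L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$ then follows from Proposition 1.1, combined with the obvious inequalities

\[ -|h| \le h \le |h|. \]

The following result deals with the construction of the integral.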

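Theorem 1.4. Let $(X, \mathcal{A}, \mu)$ be a measure space. There exists a unique map $I^\mu_{\bar{\mathbb{R}}} : L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu) \to \mathbb{R}$, with the following properties:

(i) Whenever $f, g, h \in L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu)$ are such that

\[ h(x) = f(x) + g(x), \quad \forall\, x \in X \setminus \bigl[ f^{-1}(\{-\infty, \infty\}) \cup g^{-1}(\{-\infty, \infty\}) \bigr], \]

it follows that $I^\mu_{\bar{\mathbb{R}}}(h) = I^\mu_{\bar{\mathbb{R}}}(f) + I^\mu_{\bar{\mathbb{R}}}(g)$.

(ii) Whenever $f, g \in L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu)$ and $\alpha \in \mathbb{R}$ are such that

\[ g(x) = \alpha f(x), \quad \forall\, x \in X \setminus f^{-1}(\{-\infty, \infty\}), \]

it follows that $I^\mu_{\bar{\mathbb{R}}}(g) = \alpha I^\mu_{\bar{\mathbb{R}}}(f)$.

(iii) $I^\mu_{\bar{\mathbb{R}}}(f) = I^\mu_+(f)$, $\forall\, f \in L^1_+(X, \mathcal{A}, \mu)$.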

Proof. Let us first show the existence. Start with some $f \in L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu)$, and define the functions $f^\pm : X \to [0, \infty]$ by $f^+ = \max\{f, 0\}$ and $f^- = \max\{-f, 0\}$, so that $f = f^+ - f^-$, and $f^+, f^- \in L^1_+(X, \mathcal{A}, \mu)$. We then define

\[ I^\mu_{\bar{\mathbb{R}}}(f) = I^\mu_+(f^+) - I^\mu_+(f^-). \]

It is obvious that $I^\mu_{\bar{\mathbb{R}}}$ satisfies condition (iii).

The key fact that we need is contained in the following.




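Claim: Whenever $f \in L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu)$, and $f_1, f_2 \in L^1_+(X, \mathcal{A}, \mu)$ are such that

\[ f(x) = f_1(x) - f_2(x), \quad \forall\, x \in X \setminus \bigl[ f_1^{-1}(\{\infty\}) \cup f_2^{-1}(\{\infty\}) \bigr], \]

it follows that we have the equality

\[ I^\mu_{\bar{\mathbb{R}}}(f) = I^\mu_+(f_1) - I^\mu_+(f_2). \]

Indeed, since we have $f = f^+ - f^-$, it follows immediately that we have the equality

\[ f_2(x) + f^+(x) = f_1(x) + f^-(x), \quad \forall\, x \in X \setminus \bigl[ f_1^{-1}(\{\infty\}) \cup f_2^{-1}(\{\infty\}) \bigr], \]

which gives

\[ f_2 + f^+ = f_1 + f^-, \quad \mu\text{-a.e.} \]

By Theorem 1.2 (and Proposition 1.3, applied in both directions to this a.e. equality), this immediately gives

\[ I^\mu_+(f_2) + I^\mu_+(f^+) = I^\mu_+(f_1) + I^\mu_+(f^-), \]

which then gives

\[ I^\mu_+(f_1) - I^\mu_+(f_2) = I^\mu_+(f^+) - I^\mu_+(f^-) = I^\mu_{\bar{\mathbb{R}}}(f). \]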

Having proved the above Claim, let us show now that $I^\mu_{\bar{\mathbb{R}}}$ has properties (i) and (ii). Assume $f$, $g$ and $h$ are as in (i). Notice that if we define $h_1 = f^+ + g^+$ and $h_2 = f^- + g^-$, then we clearly have $0 \le h_1 \le |f| + |g|$ and $0 \le h_2 \le |f| + |g|$, so $h_1$ and $h_2$ both belong to $L^1_+(X, \mathcal{A}, \mu)$. By Theorem 1.2, we then have

\[ I^\mu_+(h_1) = I^\mu_+(f^+) + I^\mu_+(g^+) \quad \text{and} \quad I^\mu_+(h_2) = I^\mu_+(f^-) + I^\mu_+(g^-). \tag{22} \]

Notice also that, because of the equalities

\[ h_1^{-1}(\{\infty\}) = f^{-1}(\{\infty\}) \cup g^{-1}(\{\infty\}) \quad \text{and} \quad h_2^{-1}(\{\infty\}) = f^{-1}(\{-\infty\}) \cup g^{-1}(\{-\infty\}), \]

we have

\[ h(x) = h_1(x) - h_2(x), \quad \forall\, x \in X \setminus \bigl[ h_1^{-1}(\{\infty\}) \cup h_2^{-1}(\{\infty\}) \bigr], \]

so by the above Claim, combined with (22), we get

\[ I^\mu_{\bar{\mathbb{R}}}(h) = I^\mu_+(h_1) - I^\mu_+(h_2) = I^\mu_+(f^+) + I^\mu_+(g^+) - I^\mu_+(f^-) - I^\mu_+(g^-) = I^\mu_{\bar{\mathbb{R}}}(f) + I^\mu_{\bar{\mathbb{R}}}(g). \]

Property (ii) is pretty obvious.

The uniqueness is also obvious. If we start with a map $J : L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu) \to \mathbb{R}$ with properties (i)-(iii), then for every $f \in L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu)$, we must have

\[ J(f) = J(f^+) - J(f^-) = I^\mu_+(f^+) - I^\mu_+(f^-). \]

(For the second equality we use condition (iii), combined with the fact that both $f^+$ and $f^-$ belong to $L^1_+(X, \mathcal{A}, \mu)$.)
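The Claim in the proof says that the value $I^\mu_+(f_1) - I^\mu_+(f_2)$ does not depend on the chosen decomposition $f = f_1 - f_2$. A minimal sketch of this, assuming a hypothetical finite model with point weights and a finite-valued $f$ (all names below are illustrative, not taken from the notes):

```python
from fractions import Fraction

# Hypothetical finite model with point weights.
WEIGHTS = {"a": Fraction(1, 2), "b": Fraction(2), "c": Fraction(3)}


def positive_integral(func):
    """Integral of a finite, non-negative function in this finite model."""
    assert all(v >= 0 for v in func.values())
    return sum((func[x] * w for x, w in WEIGHTS.items()), Fraction(0))


# A real-valued function with mixed signs.
f = {"a": Fraction(4), "b": Fraction(-1), "c": Fraction(-2, 3)}

# Canonical decomposition f = f_plus - f_minus.
f_plus = {x: max(v, Fraction(0)) for x, v in f.items()}
f_minus = {x: max(-v, Fraction(0)) for x, v in f.items()}
canonical = positive_integral(f_plus) - positive_integral(f_minus)

# Another decomposition f = f1 - f2, obtained by adding the same
# non-negative function to both parts.
shift = {"a": Fraction(5), "b": Fraction(0), "c": Fraction(1, 7)}
f1 = {x: f_plus[x] + shift[x] for x in f}
f2 = {x: f_minus[x] + shift[x] for x in f}
alternative = positive_integral(f1) - positive_integral(f2)
```

Both `canonical` and `alternative` come out the same, because the contribution of `shift` cancels by the additivity of the positive integral (Theorem 1.2).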

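Corollary 1.2. Let $(X, \mathcal{A}, \mu)$ be a measure space, and let $K$ be either $\mathbb{R}$ or $\mathbb{C}$. There exists a unique linear map $I^\mu_K : L^1_K(X, \mathcal{A}, \mu) \to K$, such that

\[ I^\mu_K(f) = I^\mu_+(f), \quad \forall\, f \in L^1_+(X, \mathcal{A}, \mu) \cap L^1_K(X, \mathcal{A}, \mu). \]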

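Proof. Let us start with the case $K = \mathbb{R}$. In this case, we have the inclusion

\[ L^1_{\mathbb{R}}(X, \mathcal{A}, \mu) \subset L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu), \]

so we can define $I^\mu_{\mathbb{R}}$ as the restriction of $I^\mu_{\bar{\mathbb{R}}}$ to $L^1_{\mathbb{R}}(X, \mathcal{A}, \mu)$. The uniqueness is again clear, because of the equalities

\[ I^\mu_{\mathbb{R}}(f) = I^\mu_{\mathbb{R}}(f^+) - I^\mu_{\mathbb{R}}(f^-) = I^\mu_+(f^+) - I^\mu_+(f^-). \]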




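In the case $K = \mathbb{C}$, we define

\[ I^\mu_{\mathbb{C}}(f) = I^\mu_{\mathbb{R}}(\operatorname{Re} f) + i I^\mu_{\mathbb{R}}(\operatorname{Im} f). \]

The linearity is obvious. The uniqueness is also clear, because the restriction of $I^\mu_{\mathbb{C}}$ to $L^1_{\mathbb{R}}(X, \mathcal{A}, \mu)$ must agree with $I^\mu_{\mathbb{R}}$.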

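Definition. Let $(X, \mathcal{A}, \mu)$ be a measure space, and let $K$ be one of the symbols $\bar{\mathbb{R}}$, $\mathbb{R}$, or $\mathbb{C}$. For any $f \in L^1_K(X, \mathcal{A}, \mu)$, the number $I^\mu_K(f)$ (which is real, if $K = \bar{\mathbb{R}}$ or $\mathbb{R}$, and is complex if $K = \mathbb{C}$) will be denoted by

\[ \int_X f \, d\mu, \]

and is called the µ-integral of $f$. This notation is unambiguous, because if $f \in L^1_{\mathbb{R}}(X, \mathcal{A}, \mu)$, then we have $I^\mu_{\bar{\mathbb{R}}}(f) = I^\mu_{\mathbb{C}}(f) = I^\mu_{\mathbb{R}}(f)$.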

Remark 1.6. If $(X, \mathcal{A}, \mu)$ is a measure space, then for every $A \in \mathcal{A}$, with $\mu(A) < \infty$, using the above Corollary, we get

\[ \int_X \kappa_A \, d\mu = I^\mu_+(\kappa_A) = \mu(A). \]

By linearity, if $K = \mathbb{R}, \mathbb{C}$, one has then the equality

\[ \int_X h \, d\mu = I^\mu_{\mathrm{elem}}(h), \quad \forall\, h \in L^1_{K,\mathrm{elem}}(X, \mathcal{A}, \mu). \]

To make the exposition a bit easier, we will adopt the following.

Convention. If $(X, \mathcal{A}, \mu)$ is a measure space, and if $f : X \to [0, \infty]$ is a measurable function, which does not belong to $L^1_+(X, \mathcal{A}, \mu)$, then we define

\[ \int_X f \, d\mu = \infty. \]

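Remarks 1.7. Let $(X, \mathcal{A}, \mu)$ be a measure space.

A. Using the above convention, when $h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$ is a function with $h(X) \subset [0, \infty)$, the condition $\int_X h \, d\mu = \infty$ is equivalent to the existence of some $\alpha \in h(X) \setminus \{0\}$, with $\mu\bigl(h^{-1}(\{\alpha\})\bigr) = \infty$.

B. Using the above convention, for every measurable function $f : X \to [0, \infty]$, one has the equality

\[ \int_X f \, d\mu = \sup\Bigl\{ \int_X h \, d\mu : h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X),\ 0 \le h \le f \Bigr\}. \]

C. If $f, g : X \to [0, \infty]$ are measurable, then one has the equalities

\begin{align*}
\int_X (f + g) \, d\mu &= \int_X f \, d\mu + \int_X g \, d\mu, \\
\int_X (\alpha f) \, d\mu &= \alpha \int_X f \, d\mu, \quad \forall\, \alpha \in [0, \infty),
\end{align*}

even in the case when some term is infinite. (We use the convention $\infty + t = \infty$, $\forall\, t \in [0, \infty]$, as well as $\alpha \cdot \infty = \infty$, $\forall\, \alpha \in (0, \infty)$, and $0 \cdot \infty = 0$.)

D. If $f, g : X \to [0, \infty]$ are measurable, and $f \le g$, $\mu$-a.e., then (using B) one has the inequality

\[ \int_X f \, d\mu \le \int_X g \, d\mu, \]

even if one side (or both) is infinite.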




E. Let $K$ be one of the symbols $\bar{\mathbb{R}}$, $\mathbb{R}$, or $\mathbb{C}$, and let $f : X \to K$ be a measurable function. Then the function $|f| : X \to [0, \infty]$ is measurable. Using the above convention, the condition that $f$ belongs to $L^1_K(X, \mathcal{A}, \mu)$ is equivalent to the inequality $\int_X |f| \, d\mu < \infty$.
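The arithmetic conventions used in Remark C matter in practice, because floating-point arithmetic does not follow them: in Python, `0 * math.inf` evaluates to `nan`, not `0`. The sketch below assumes a hypothetical model with finitely many atoms, in which the integral of a function with values in $[0, \infty]$ is the extended sum of value times weight; the helper names `ext_add`, `ext_mul` and `integral` are illustrative only.

```python
import math

INF = math.inf


def ext_add(s, t):
    """Addition on [0, inf], with inf + t = inf."""
    return INF if INF in (s, t) else s + t


def ext_mul(s, t):
    """Multiplication on [0, inf], with the convention 0 * inf = 0."""
    if s == 0 or t == 0:
        return 0
    return INF if INF in (s, t) else s * t


def integral(func, weights):
    """Integral of a [0, inf]-valued function in a model with finitely many atoms."""
    total = 0
    for x, w in weights.items():
        total = ext_add(total, ext_mul(func[x], w))
    return total


weights = {"a": 2, "b": 0, "c": 1}
f = {"a": 1, "b": INF, "c": 3}      # infinite only on the null atom "b"
g = {"a": INF, "b": 4, "c": 0}      # infinite on an atom of positive measure

f_plus_g = {x: ext_add(f[x], g[x]) for x in weights}
zero_times_g = {x: ext_mul(0, g[x]) for x in weights}

# Additivity and homogeneity (with alpha = 0), including infinite terms.
additive = integral(f_plus_g, weights) == ext_add(integral(f, weights),
                                                  integral(g, weights))
homogeneous = integral(zero_times_g, weights) == ext_mul(0, integral(g, weights))

# Plain floating-point multiplication does not follow the convention.
naive_product_is_nan = math.isnan(0 * INF)
```

Here `integral(f, weights)` is finite although `f` takes the value `INF`, since that value sits on a null atom, while `integral(g, weights)` is `INF`.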

In the remainder of this section we discuss several properties of integration that are analogous to those of the positive/elementary integration.

We begin with a useful estimate.

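Proposition 1.6. Let $(X, \mathcal{A}, \mu)$ be a measure space, and let $K$ be one of the symbols $\bar{\mathbb{R}}$, $\mathbb{R}$, or $\mathbb{C}$. For every function $f \in L^1_K(X, \mathcal{A}, \mu)$, one has the inequality

\[ \Bigl| \int_X f \, d\mu \Bigr| \le \int_X |f| \, d\mu. \]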

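Proof. Let us first examine the case when $K = \bar{\mathbb{R}}, \mathbb{R}$. In this case we define $f^+ = \max\{f, 0\}$ and $f^- = \max\{-f, 0\}$, so we have $f = f^+ - f^-$, as well as $|f| = f^+ + f^-$. Using the inequalities $I^\mu_+(f^\pm) \ge 0$, we have

\begin{align*}
\int_X f \, d\mu &= I^\mu_+(f^+) - I^\mu_+(f^-) \le I^\mu_+(f^+) + I^\mu_+(f^-) = \int_X |f| \, d\mu; \\
-\int_X f \, d\mu &= -I^\mu_+(f^+) + I^\mu_+(f^-) \le I^\mu_+(f^+) + I^\mu_+(f^-) = \int_X |f| \, d\mu.
\end{align*}

In other words, we have

\[ \pm \int_X f \, d\mu \le \int_X |f| \, d\mu, \]

and the desired inequality immediately follows.

Let us consider now the case $K = \mathbb{C}$. Consider the number $\lambda = \int_X f \, d\mu$, and let us choose some complex number $\alpha \in \mathbb{C}$, with $|\alpha| = 1$, and $\alpha\lambda = |\lambda|$. (If $\lambda \neq 0$, we take $\alpha = \lambda^{-1}|\lambda|$; otherwise we take $\alpha = 1$.) Consider the measurable function $g = \alpha f$. Notice now that

\[ \Bigl[ \int_X \operatorname{Re} g \, d\mu \Bigr] + i \Bigl[ \int_X \operatorname{Im} g \, d\mu \Bigr] = \int_X g \, d\mu = \alpha \int_X f \, d\mu = \alpha\lambda = |\lambda| \ge 0, \]

so in particular we get

\[ |\lambda| = \int_X \operatorname{Re} g \, d\mu. \]

If we apply the real case, we then get

\[ |\lambda| \le \int_X |\operatorname{Re} g| \, d\mu. \tag{23} \]

Notice now that we have the inequality $|\operatorname{Re} g| \le |g| = |f|$, which gives

\[ \int_X |\operatorname{Re} g| \, d\mu = I^\mu_+(|\operatorname{Re} g|) \le I^\mu_+(|f|) = \int_X |f| \, d\mu, \]

so the inequality (23) immediately gives

\[ \Bigl| \int_X f \, d\mu \Bigr| = |\lambda| \le \int_X |f| \, d\mu. \]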
Corollary 1.3. Let $(X, \mathcal{A}, \mu)$ be a measure space, and let $K$ be one of the symbols $\bar{\mathbb{R}}$, $\mathbb{R}$, or $\mathbb{C}$. If a measurable function $f : X \to K$ satisfies $f = 0$, $\mu$-a.e., then $f \in L^1_K(X, \mathcal{A}, \mu)$, and $\int_X f \, d\mu = 0$.




Proof. Consider the measurable function $|f| : X \to [0, \infty]$, which satisfies $|f| = 0$, $\mu$-a.e. By Proposition 1.3, it follows that $|f| \in L^1_+(X, \mathcal{A}, \mu)$, hence $f \in L^1_K(X, \mathcal{A}, \mu)$, and $\int_X |f| \, d\mu = 0$. Of course, by Proposition 1.6, the last equality forces $\int_X f \, d\mu = 0$.

Corollary 1.4. Let $K$ be either $\mathbb{R}$ or $\mathbb{C}$. If $(X, \mathcal{A}, \mu)$ is a finite measure space, then every bounded measurable function $f : X \to K$ belongs to $L^1_K(X, \mathcal{A}, \mu)$, and satisfies

\[ \Bigl| \int_X f \, d\mu \Bigr| \le \mu(X) \cdot \sup_{x \in X} |f(x)|. \]

Proof. If we put $\beta = \sup_{x \in X} |f(x)|$, then we clearly have $|f| \le \beta\kappa_X$, which shows that $|f| \in L^1_+(X, \mathcal{A}, \mu)$, and also gives $\int_X |f| \, d\mu \le \int_X \beta\kappa_X \, d\mu = \mu(X) \cdot \beta$. Then everything follows from Proposition 1.6.

Comment. The introduction of the space $L^1_{\overline{\mathbb{R}}}(X, \mathcal{A}, \mu)$, of extended real-valued $\mu$-integrable functions, is useful mostly for technical reasons. In effect, everything can be reduced to the case when only “honest” real-valued functions are involved. The following result clarifies this matter.

Lemma 1.2. Let $(X, \mathcal{A}, \mu)$ be a measure space, and let $f : X \to \overline{\mathbb{R}}$ be a measurable function. The following are equivalent:

(i) $f \in L^1_{\overline{\mathbb{R}}}(X, \mathcal{A}, \mu)$;

(ii) there exists $g \in L^1_{\mathbb{R}}(X, \mathcal{A}, \mu)$, such that $g = f$, $\mu$-a.e.

Moreover, if $f$ satisfies these equivalent conditions, then any function $g$, satisfying (ii), also has the property

$$\int_X f \, d\mu = \int_X g \, d\mu.$$

Proof. Consider the set $F = \{x \in X : -\infty < f(x) < \infty\}$, which belongs to $\mathcal{A}$. We obviously have the equality $X \setminus F = |f|^{-1}(\{\infty\})$.

(i) ⇒ (ii). Assume $f \in L^1_{\overline{\mathbb{R}}}(X, \mathcal{A}, \mu)$, which means that $|f| \in L^1_+(X, \mathcal{A}, \mu)$. In particular, we get $\mu(X \setminus F) = 0$. Define the measurable function $g = f\kappa_F$. On the one hand, it is clear, by construction, that we have $-\infty < g(x) < \infty$, $\forall\, x \in X$. On the other hand, it is clear that $g|_F = f|_F$, so using $\mu(X \setminus F) = 0$, we get the fact that $f = g$, $\mu$-a.e. Finally, the inequality $0 \le |g| \le |f|$, combined with Proposition 1.3, gives $|g| \in L^1_+(X, \mathcal{A}, \mu)$, so $g$ indeed belongs to $L^1_{\mathbb{R}}(X, \mathcal{A}, \mu)$.

(ii) ⇒ (i). Suppose there exists $g \in L^1_{\mathbb{R}}(X, \mathcal{A}, \mu)$, with $f = g$, $\mu$-a.e., and let us prove that

(a) $f \in L^1_{\overline{\mathbb{R}}}(X, \mathcal{A}, \mu)$;

(b) $\int_X f \, d\mu = \int_X g \, d\mu$.

The first assertion is clear, because by Proposition 1.3, the equality $|f| = |g|$, $\mu$-a.e., combined with $|g| \in L^1_+(X, \mathcal{A}, \mu)$, forces $|f| \in L^1_+(X, \mathcal{A}, \mu)$, i.e. $f \in L^1_{\overline{\mathbb{R}}}(X, \mathcal{A}, \mu)$. To prove (b), we consider the difference $h = f - g$, which is a measurable function $h : X \to \overline{\mathbb{R}}$, and satisfies $h = 0$, $\mu$-a.e. By Corollary 1.3, we know that $h \in L^1_{\overline{\mathbb{R}}}(X, \mathcal{A}, \mu)$, and $\int_X h \, d\mu = 0$. By Theorem 1.3, we get

$$\int_X f \, d\mu = \int_X g \, d\mu + \int_X h \, d\mu = \int_X g \, d\mu.$$

The following result is an analogue of Proposition 1.1 (see also Proposition 1.3).




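Proposition 1.7. Let $(X, \mathcal{A}, \mu)$ be a measure space, and let $f_1, f_2 \in L^1_{\overline{\mathbb{R}}}(X, \mathcal{A}, \mu)$. Suppose $f : X \to \overline{\mathbb{R}}$ is a measurable function, such that $f_1 \le f \le f_2$, $\mu$-a.e. Then $f \in L^1_{\overline{\mathbb{R}}}(X, \mathcal{A}, \mu)$, and one has the inequality

$$\int_X f_1 \, d\mu \le \int_X f \, d\mu \le \int_X f_2 \, d\mu.$$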

Proof. First of all, since $f_1$ and $f_2$ belong to $L^1_{\overline{\mathbb{R}}}(X, \mathcal{A}, \mu)$, it follows that $|f_1|$ and $|f_2|$, hence also $|f_1| + |f_2|$, belong to $L^1_+(X, \mathcal{A}, \mu)$. Second, since we have $f_2 \le |f_2| \le |f_1| + |f_2|$ and $f_1 \ge -|f_1| \ge -|f_1| - |f_2|$ (everywhere!), the inequalities $f_1 \le f \le f_2$, $\mu$-a.e., give

$$-|f_1| - |f_2| \le f \le |f_1| + |f_2|, \quad \mu\text{-a.e.},$$

which reads

$$|f| \le |f_1| + |f_2|, \quad \mu\text{-a.e.}$$

Since $|f_1| + |f_2| \in L^1_+(X, \mathcal{A}, \mu)$, by Proposition 1.3, we get $|f| \in L^1_+(X, \mathcal{A}, \mu)$, so $f$ indeed belongs to $L^1_{\overline{\mathbb{R}}}(X, \mathcal{A}, \mu)$.

To prove the inequality for integrals, we use Lemma 1.2, to find functions $g_1, g_2, g \in L^1_{\mathbb{R}}(X, \mathcal{A}, \mu)$, such that $f_1 = g_1$, $\mu$-a.e., $f_2 = g_2$, $\mu$-a.e., and $f = g$, $\mu$-a.e. Lemma 1.2 also gives the equalities $\int_X f_1 \, d\mu = \int_X g_1 \, d\mu$, $\int_X f_2 \, d\mu = \int_X g_2 \, d\mu$, and $\int_X f \, d\mu = \int_X g \, d\mu$, so what we need to prove are the inequalities

$$\int_X g_1 \, d\mu \le \int_X g \, d\mu \le \int_X g_2 \, d\mu. \tag{24}$$

Of course, we have $g_1 \le g \le g_2$, $\mu$-a.e.

To prove the first inequality in (24), we consider the function $h = g - g_1 \in L^1_{\mathbb{R}}(X, \mathcal{A}, \mu)$, and we prove that $\int_X h \, d\mu \ge 0$. But this is quite clear, because we have $h \ge 0$, $\mu$-a.e., which means that $h = |h|$, $\mu$-a.e., so by Lemma 1.2, we get

$$\int_X h \, d\mu = \int_X |h| \, d\mu = I^\mu_+(|h|) \ge 0.$$

The second inequality in (24) is proved in exactly the same way.

The next result is an analogue of Proposition 1.4.

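Proposition 1.8. Let $(X, \mathcal{A}, \mu)$ be a measure space, and let $\mathbb{K}$ be one of the symbols $\mathbb{R}$, $\overline{\mathbb{R}}$, or $\mathbb{C}$. Suppose $(A_k)_{k=1}^n \subset \mathcal{A}$ is a pairwise disjoint finite sequence, with $A_1 \cup \cdots \cup A_n = X$. For a measurable function $f : X \to \mathbb{K}$, the following are equivalent.

(i) $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$;

(ii) $f\kappa_{A_k} \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, $\forall\, k = 1, \dots, n$.

Moreover, if $f$ satisfies these equivalent conditions, one has

$$\int_X f \, d\mu = \sum_{k=1}^n \int_X f\kappa_{A_k} \, d\mu. \tag{25}$$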

Proof. It is fairly obvious that $|f\kappa_{A_k}| = |f|\kappa_{A_k}$. Then the equivalence (i) ⇔ (ii) follows from Proposition 1.4 applied to the function $|f| : X \to [0, \infty]$. In the cases when $\mathbb{K} = \mathbb{R}, \mathbb{C}$, the equality (25) follows immediately from linearity, and the obvious equality $f = \sum_{k=1}^n f\kappa_{A_k}$. In the case when $\mathbb{K} = \overline{\mathbb{R}}$, we take $g \in L^1_{\mathbb{R}}(X, \mathcal{A}, \mu)$, such that $f = g$, $\mu$-a.e. Then we obviously have $f\kappa_{A_k} = g\kappa_{A_k}$, $\mu$-a.e., for all $k = 1, \dots, n$, and the equality (25) follows from the corresponding equality that holds for $g$.

Remark 1.8. The equality (25) also holds for arbitrary measurable functions $f : X \to [0, \infty]$, if we use the convention that preceded Remarks 1.7. This is an immediate consequence of Proposition 1.4, because the left hand side is infinite if and only if one of the terms in the right hand side is infinite.

The following is an obvious extension of Remark 1.4.

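Remark 1.9. Let $\mathbb{K}$ be one of the symbols $\mathbb{R}$, $\overline{\mathbb{R}}$, or $\mathbb{C}$, and let $(X, \mathcal{A}, \mu)$ be a measure space. For a set $S \in \mathcal{A}$, and a measurable function $f : X \to \mathbb{K}$, one has the equivalence

$$f\kappa_S \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu) \iff f|_S \in L^1_{\mathbb{K}}\big(S, \mathcal{A}|_S, \mu|_S\big).$$

If this is the case, one has the equality

$$\int_X f\kappa_S \, d\mu = \int_S f|_S \, d\mu|_S. \tag{26}$$

The above equality also holds for arbitrary measurable functions $f : X \to [0, \infty]$, again using the convention that preceded Remarks 1.7.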

Notation. The above remark states that, whenever the quantities in (26) are defined, they are equal. (This only requires the fact that $f|_S$ is measurable, and either $f|_S \in L^1_{\mathbb{K}}\big(S, \mathcal{A}|_S, \mu|_S\big)$, or $f(S) \subset [0, \infty]$.) In this case, the equal quantities in (26) will be simply denoted by $\int_S f \, d\mu$.

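Exercise 1. Let $I$ be some non-empty set. Consider the $\sigma$-algebra $\mathcal{P}(I)$, of all subsets of $I$, equipped with the counting measure

$$\mu(A) = \begin{cases} \operatorname{Card} A & \text{if } A \text{ is finite} \\ \infty & \text{if } A \text{ is infinite} \end{cases}$$

Prove that $L^1_{\overline{\mathbb{R}}}(I, \mathcal{P}(I), \mu) = L^1_{\mathbb{R}}(I, \mathcal{P}(I), \mu)$. Prove that, if $\mathbb{K}$ is either $\mathbb{R}$ or $\mathbb{C}$, then

$$L^1_{\mathbb{K}}(I, \mathcal{P}(I), \mu) = \ell^1_{\mathbb{K}}(I),$$

the Banach space discussed in II.2 and II.3.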

Exercise 2. There is an instance when the entire theory developed here is essentially vacuous. Let $X$ be a non-empty set, and let $\mathcal{A}$ be a $\sigma$-algebra on $X$. For a measure $\mu$ on $\mathcal{A}$, prove that the following are equivalent:

(i) $L^1_+(X, \mathcal{A}, \mu) = \{f : X \to [0, \infty] : f \text{ measurable, and } f = 0,\ \mu\text{-a.e.}\}$;

(ii) for every $A \in \mathcal{A}$, one has $\mu(A) \in \{0, \infty\}$.

A measure space $(X, \mathcal{A}, \mu)$, with property (ii), is said to be degenerate.

Exercise 3 ♦. Let $(X, \mathcal{A}, \mu)$ be a measure space, and let $f : X \to [0, \infty]$ be a measurable function, with $\int_X f \, d\mu = 0$. Prove that $f = 0$, $\mu$-a.e.

Hint: Define the measurable sets $A_n = \{x \in X : f(x) \ge \tfrac{1}{n}\}$, and analyze the relationship between $f$ and $\kappa_{A_n}$.



Lecture 34

2. Convergence theorems

In this section we analyze the dynamics of integrability in the case when sequences of measurable functions are considered. Roughly speaking, a “convergence theorem” states that integrability is preserved under taking limits. In other words, if one has a sequence $(f_n)_{n=1}^\infty$ of integrable functions, and if $f$ is some kind of a limit of the $f_n$'s, then we would like to conclude that $f$ itself is integrable, as well as the equality

$$\int f = \lim_{n\to\infty} \int f_n.$$

Such results are often employed in two instances:

A. When we want to prove that some function $f$ is integrable. In this case we would look for a sequence $(f_n)_{n=1}^\infty$ of integrable approximants for $f$.

B. When we want to construct an integrable function. In this case, we will produce first the approximants, and then we will examine the existence of the limit.

The first convergence result, which is somewhat primitive, but very useful, is the following.

Lemma 2.1. Let $(X, \mathcal{A}, \mu)$ be a finite measure space, let $a \in (0, \infty)$ and let $f_n : X \to [0, a]$, $n \ge 1$, be a sequence of measurable functions satisfying

(a) $f_1 \ge f_2 \ge \cdots \ge 0$;

(b) $\lim_{n\to\infty} f_n(x) = 0$, $\forall\, x \in X$.

Then one has the equality

$$\lim_{n\to\infty} \int_X f_n \, d\mu = 0. \tag{1}$$

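Proof. Let us define, for each $\varepsilon > 0$, and each integer $n \ge 1$, the set

$$A^\varepsilon_n = \{x \in X : f_n(x) \ge \varepsilon\}.$$

Obviously, we have $A^\varepsilon_n \in \mathcal{A}$, $\forall\, \varepsilon > 0$, $n \ge 1$. One key fact we are going to use is the following.

Claim 1: For every $\varepsilon > 0$, one has the equality

$$\lim_{n\to\infty} \mu(A^\varepsilon_n) = 0.$$

Fix $\varepsilon > 0$. Let us first observe that, using (a), we have the inclusions

$$A^\varepsilon_1 \supset A^\varepsilon_2 \supset \cdots \tag{2}$$

Second, using (b), we clearly have the equality $\bigcap_{k=1}^\infty A^\varepsilon_k = \emptyset$. Since $\mu$ is finite, using the Continuity Property (Lemma III.4.1), we have

$$\lim_{n\to\infty} \mu(A^\varepsilon_n) = \mu\Big(\bigcap_{n=1}^\infty A^\varepsilon_n\Big) = \mu(\emptyset) = 0.$$

Claim 2: For every $\varepsilon > 0$ and every integer $n \ge 1$, one has the inequality

$$0 \le \int_X f_n \, d\mu \le a\,\mu(A^\varepsilon_n) + \varepsilon\,\mu(X).$$

Fix $\varepsilon$ and $n$, and let us consider the elementary function

$$h^\varepsilon_n = a\,\kappa_{A^\varepsilon_n} + \varepsilon\,\kappa_{B^\varepsilon_n},$$

where $B^\varepsilon_n = X \setminus A^\varepsilon_n$. Obviously, since $\mu(X) < \infty$, the function $h^\varepsilon_n$ is elementary integrable. By construction, we clearly have $0 \le f_n \le h^\varepsilon_n$, so using the properties of integration, we get

$$0 \le \int_X f_n \, d\mu \le \int_X h^\varepsilon_n \, d\mu = a\,\mu(A^\varepsilon_n) + \varepsilon\,\mu(B^\varepsilon_n) \le a\,\mu(A^\varepsilon_n) + \varepsilon\,\mu(X).$$

Using Claims 1 and 2, it follows immediately that

$$0 \le \liminf_{n\to\infty} \int_X f_n \, d\mu \le \limsup_{n\to\infty} \int_X f_n \, d\mu \le \varepsilon\,\mu(X).$$

Since the last inequality holds for arbitrary $\varepsilon > 0$, the desired equality (1) immediately follows.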
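To see the estimate of Claim 2 at work, here is a small numerical sketch for one concrete choice that is not taken from the text: $X = [0, 1)$ with Lebesgue measure, $a = 1$, and $f_n(x) = x^n$. For this assumed example everything is available in closed form: $\int_X f_n \, d\mu = 1/(n+1)$ and $A^\varepsilon_n = [\varepsilon^{1/n}, 1)$.

```python
def integral_fn(n):
    """Exact Lebesgue integral of f_n(x) = x**n over [0, 1)."""
    return 1.0 / (n + 1)


def claim2_bound(n, eps, a=1.0, total_mass=1.0):
    """a*mu(A_n^eps) + eps*mu(X), where A_n^eps = {x : x**n >= eps} = [eps**(1/n), 1)."""
    mass_A = 1.0 - eps ** (1.0 / n)
    return a * mass_A + eps * total_mass


eps = 0.01
rows = [(n, integral_fn(n), claim2_bound(n, eps)) for n in (1, 10, 100, 1000, 10000)]
for n, integral, bound in rows:
    # The integral stays below the bound, and the bound approaches eps * mu(X).
    print(n, integral, bound)
```

As $n$ grows, the term $a\,\mu(A^\varepsilon_n)$ dies out and only $\varepsilon\,\mu(X)$ survives, which is the mechanism behind the final $\limsup$ estimate in the proof.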

We now turn our attention to a weaker notion of limit, for sequences of measurable functions.

Definition. Let $(X, \mathcal{A}, \mu)$ be a measure space, and let $\mathbb{K}$ be one of the symbols $\mathbb{R}$, $\overline{\mathbb{R}}$, or $\mathbb{C}$. Suppose $f_n : X \to \mathbb{K}$, $n \ge 1$, are measurable functions. Given a measurable function $f : X \to \mathbb{K}$, we say that the sequence $(f_n)_{n=1}^\infty$ converges $\mu$-almost everywhere to $f$, if there exists some set $N \in \mathcal{A}$, with $\mu(N) = 0$, such that

$$\lim_{n\to\infty} f_n(x) = f(x), \quad \forall\, x \in X \setminus N.$$

In this case we write

$$f = \mu\text{-a.e.-}\lim_{n\to\infty} f_n.$$

Remark 2.1. This notion of convergence has, among other things, a certain uniqueness feature. One way to describe this is to say that the limit of a $\mu$-a.e. convergent sequence is $\mu$-almost unique, in the sense that if $f$ and $g$ are measurable functions which satisfy the equalities $f = \mu\text{-a.e.-}\lim_{n\to\infty} f_n$ and $g = \mu\text{-a.e.-}\lim_{n\to\infty} f_n$, then $f = g$, $\mu$-a.e. This is quite obvious: if $M, N \in \mathcal{A}$ are sets with $\mu(M) = \mu(N) = 0$, such that

$$\lim_{n\to\infty} f_n(x) = f(x), \quad \forall\, x \in X \setminus M,$$

$$\lim_{n\to\infty} f_n(x) = g(x), \quad \forall\, x \in X \setminus N,$$

then $\mu(M \cup N) = 0$, and

$$f(x) = g(x), \quad \forall\, x \in X \setminus [M \cup N].$$




Comment. The above definition makes sense if $\mathbb{K}$ is an arbitrary metric space. Any of the spaces $\mathbb{R}$, $\overline{\mathbb{R}}$, and $\mathbb{C}$ is in fact a complete metric space. There are instances where the requirement that $f$ is measurable is in fact redundant. This is somewhat clarified by the next two exercises.

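Exercise 1*. Let $(X, \mathcal{A}, \mu)$ be a measure space, let $K$ be a complete separable metric space, and let $f_n : X \to K$, $n \ge 1$, be measurable functions.

(i) Prove that the set

$$L = \big\{x \in X : \big(f_n(x)\big)_{n=1}^\infty \subset K \text{ is convergent}\big\}$$

belongs to $\mathcal{A}$.

(ii) If we fix some point $\alpha \in K$, and we define $\ell : X \to K$ by

$$\ell(x) = \begin{cases} \lim_{n\to\infty} f_n(x) & \text{if } x \in L \\ \alpha & \text{if } x \in X \setminus L \end{cases}$$

then $\ell$ is measurable.

In particular, if $\mu(X \setminus L) = 0$, then $\ell = \mu\text{-a.e.-}\lim_{n\to\infty} f_n$.

Hints: If $d$ denotes the metric on $K$, then prove first that, for every $\varepsilon > 0$ and every $m, n \ge 1$, the set

$$D^\varepsilon_{mn} = \big\{x \in X : d\big(f_m(x), f_n(x)\big) < \varepsilon\big\}$$

belongs to $\mathcal{A}$ (use the results from III.3). Based on this fact, prove that, for every $p, k \ge 1$, the set

$$E^p_k = \big\{x \in X : d\big(f_m(x), f_n(x)\big) < \tfrac{1}{p},\ \forall\, m, n \ge k\big\}$$

belongs to $\mathcal{A}$. Finally, use completeness to prove that

$$L = \bigcap_{p=1}^\infty \bigcup_{k=1}^\infty E^p_k.$$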

Exercise 2. Let $(X, \mathcal{A}, \mu)$, $K$, and $(f_n)_{n=1}^\infty$ be as in Exercise 1. Assume $f : X \to K$ is an arbitrary function, for which there exists some set $N \in \mathcal{A}$ with $\mu(N) = 0$, and

$$\lim_{n\to\infty} f_n(x) = f(x), \quad \forall\, x \in X \setminus N.$$

Prove that, when $\mu$ is a complete measure on $\mathcal{A}$ (see III.5), the function $f$ is automatically measurable.

Hint: Use the results from Exercise 1, and let $\ell$ denote the function defined in part (ii) of that exercise. We have $X \setminus N \subset L$, and $f(x) = \ell(x)$, $\forall\, x \in X \setminus N$. Prove that, for a Borel set $B \subset K$, one has the equality $f^{-1}(B) = \ell^{-1}(B) \,\triangle\, M$, for some $M \subset N$. By completeness, we have $M \in \mathcal{A}$, so $f^{-1}(B) \in \mathcal{A}$.

The following fundamental result is a generalization of Lemma 2.1.

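Theorem 2.1 (Lebesgue Monotone Convergence Theorem). Let $(X, \mathcal{A}, \mu)$ be a measure space, and let $(f_n)_{n=1}^\infty \subset L^1_+(X, \mathcal{A}, \mu)$ be a sequence with:

• $f_n \le f_{n+1}$, $\mu$-a.e., $\forall\, n \ge 1$;

• $\sup\big\{\int_X f_n \, d\mu : n \ge 1\big\} < \infty$.

Assume $f : X \to [0, \infty]$ is a measurable function, with $f = \mu\text{-a.e.-}\lim_{n\to\infty} f_n$. Then $f \in L^1_+(X, \mathcal{A}, \mu)$, and

$$\int_X f \, d\mu = \lim_{n\to\infty} \int_X f_n \, d\mu.$$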

Proof. Define $\alpha_n = \int_X f_n \, d\mu$, $n \ge 1$. First of all, we clearly have

$$0 \le \alpha_1 \le \alpha_2 \le \cdots,$$

so the sequence $(\alpha_n)_{n=1}^\infty$ has a limit $\alpha = \lim_{n\to\infty} \alpha_n$, and we have in fact the equality

$$\alpha = \sup\{\alpha_n : n \ge 1\} < \infty.$$

With these notations, all we need to prove is the fact that $f \in L^1_+(X, \mathcal{A}, \mu)$, and that we have

$$\int_X f \, d\mu = \alpha. \tag{3}$$

Fix a set $M \in \mathcal{A}$, with $\mu(M) = 0$, and such that $\lim_{n\to\infty} f_n(x) = f(x)$, $\forall\, x \in X \setminus M$. For each $n$, we define the set

$$M_n = \{x \in X : f_n(x) > f_{n+1}(x)\}.$$

Obviously $M_n \in \mathcal{A}$, and by assumption, we have $\mu(M_n) = 0$, $\forall\, n \ge 1$. Define the set $N = M \cup \big(\bigcup_{n=1}^\infty M_n\big)$. It is clear that $\mu(N) = 0$, and

• $0 \le f_1(x) \le f_2(x) \le \cdots \le f(x)$, $\forall\, x \in X \setminus N$;

• $f(x) = \lim_{n\to\infty} f_n(x)$, $\forall\, x \in X \setminus N$.

So if we put $A = X \setminus N$, and if we define the measurable functions $g_n = f_n\kappa_A$, $n \ge 1$, and $g = f\kappa_A$, then we have

(a) $0 \le g_1 \le g_2 \le \cdots \le g$ (everywhere!);

(b) $\lim_{n\to\infty} g_n(x) = g(x)$, $\forall\, x \in X$;

(c) $g_n = f_n$, $\mu$-a.e., $\forall\, n \ge 1$;

(d) $g = f$, $\mu$-a.e.

Notice that property (c) gives $g_n \in L^1_+(X, \mathcal{A}, \mu)$ and $\int_X g_n \, d\mu = \alpha_n$, $\forall\, n \ge 1$. By property (d), we see that we have the equivalence

$$f \in L^1_+(X, \mathcal{A}, \mu) \iff g \in L^1_+(X, \mathcal{A}, \mu).$$

Moreover, if $g \in L^1_+(X, \mathcal{A}, \mu)$, then we will have $\int_X g \, d\mu = \int_X f \, d\mu$. These observations show that it suffices to prove the theorem with $g$'s in place of the $f$'s. The advantage is now the fact that we have the slightly stronger properties (a) and (b) above. The first step in the proof is the following.

Claim 1: For every $t \in (0, \infty)$, one has the inequality

$$\mu\big(g^{-1}((t, \infty])\big) \le \frac{\alpha}{t}.$$

Denote the set $g^{-1}((t, \infty])$ simply by $A_t$. For each $n \ge 1$, we also define the set $A^n_t = g_n^{-1}((t, \infty])$. Using property (a) above, it is clear that we have the inclusions

$$A^1_t \subset A^2_t \subset \cdots \subset A_t. \tag{4}$$

Using property (b) above, we also have the equality $A_t = \bigcup_{n=1}^\infty A^n_t$. Using the Continuity Property (Lemma III.4.1), we then have

$$\mu(A_t) = \lim_{n\to\infty} \mu(A^n_t),$$

so, since $\alpha_n \le \alpha$, in order to prove the Claim, it suffices to prove the inequalities

$$\mu(A^n_t) \le \frac{\alpha_n}{t}, \quad \forall\, n \ge 1. \tag{5}$$

But the above inequality is pretty obvious, since we clearly have $0 \le t\kappa_{A^n_t} \le g_n$, which gives

$$t\,\mu(A^n_t) = \int_X t\kappa_{A^n_t} \, d\mu \le \int_X g_n \, d\mu = \alpha_n.$$

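Claim 2: For any elementary function $h \in \mathcal{A}\text{-Elem}_{\mathbb{R}}(X)$, with $0 \le h \le g$, one has

(i) $h \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$;

(ii) $\int_X h \, d\mu \le \alpha$.

Start with some elementary function $h$, with $0 \le h \le g$. Assume $h$ is not identically zero, so we can write it as

$$h = \beta_1\kappa_{B_1} + \cdots + \beta_p\kappa_{B_p},$$

with $(B_j)_{j=1}^p \subset \mathcal{A}$ pairwise disjoint, and $0 < \beta_1 < \cdots < \beta_p$. Define the set $B = B_1 \cup \cdots \cup B_p$. It is obvious that, if we put $t = \beta_1/2$, we have the inclusion $B \subset g^{-1}((t, \infty])$, so by Claim 1, we get $\mu(B) < \infty$. This gives, of course, $\mu(B_j) < \infty$, $\forall\, j = 1, \dots, p$, so $h$ is indeed elementary integrable. To prove the estimate (ii), we define the measurable functions $h_n : X \to [0, \infty]$ by $h_n = \min\{g_n, h\}$, $\forall\, n \ge 1$. Since $0 \le h_n \le g_n$, $\forall\, n \ge 1$, it follows that $h_n \in L^1_+(X, \mathcal{A}, \mu)$, $\forall\, n \ge 1$, and we have the inequalities

$$\int_X h_n \, d\mu \le \int_X g_n \, d\mu = \alpha_n, \quad \forall\, n \ge 1. \tag{6}$$

It is obvious that we have

(∗) $0 \le h_1 \le h_2 \le \cdots \le h \le \beta_p\kappa_B$ (everywhere);

(∗∗) $h(x) = \lim_{n\to\infty} h_n(x)$, $\forall\, x \in X$.

Let us restrict everything to $B$. We consider the $\sigma$-algebra $\mathcal{B} = \mathcal{A}|_B$, and the measure $\nu = \mu|_B$. Consider the elementary function $\psi = h|_B \in \mathcal{B}\text{-Elem}_{\mathbb{R}}(B)$, as well as the measurable functions $\psi_n = h_n|_B : B \to [0, \infty]$, $n \ge 1$. It is clear that $\psi \in L^1_{\mathbb{R},\mathrm{elem}}(B, \mathcal{B}, \nu)$, and we have the equality

$$\int_B \psi \, d\nu = \int_X h \, d\mu. \tag{7}$$

Likewise, using (∗), which clearly forces $h_n|_{X \setminus B} = 0$, it follows that, for each $n \ge 1$, the function $\psi_n$ belongs to $L^1_+(B, \mathcal{B}, \nu)$, and by (6), we have

$$\int_B \psi_n \, d\nu = \int_X h_n \, d\mu, \quad \forall\, n \ge 1. \tag{8}$$

Let us analyze the differences $\varphi_n = \psi - \psi_n$. On the one hand, using (∗), we have $\varphi_n(x) \in [0, \beta_p]$, $\forall\, x \in B$, $n \ge 1$. On the other hand, again by (∗), we have $\varphi_1 \ge \varphi_2 \ge \cdots$. Finally, by (∗∗) we have $\lim_{n\to\infty} \varphi_n(x) = 0$, $\forall\, x \in B$. Since $\nu(B) = \mu(B) < \infty$, we can apply Lemma 2.1, and we will get $\lim_{n\to\infty} \int_B \varphi_n \, d\nu = 0$. This clearly gives

$$\int_B \psi \, d\nu = \lim_{n\to\infty} \int_B \psi_n \, d\nu,$$

and then using (7) and (8), we get the equality

$$\int_X h \, d\mu = \lim_{n\to\infty} \int_X h_n \, d\mu.$$

Combining this with (6) immediately gives the desired estimate $\int_X h \, d\mu \le \alpha$.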

Having proven Claim 2, let us observe now that, using the definition of the positive integral, it follows immediately that $g \in L^1_+(X, \mathcal{A}, \mu)$, and we have the inequality

$$\int_X g \, d\mu \le \alpha.$$

The other inequality is pretty obvious, because the inequality $g \ge g_n$ forces

$$\int_X g \, d\mu \ge \int_X g_n \, d\mu = \alpha_n, \quad \forall\, n \ge 1,$$

so we immediately get

$$\int_X g \, d\mu \ge \sup\{\alpha_n : n \ge 1\} = \alpha.$$

Comment. In the previous section we introduced the convention which defines $\int_X f \, d\mu = \infty$, if $f : X \to [0, \infty]$ is measurable, but $f \notin L^1_+(X, \mathcal{A}, \mu)$. Using this convention, the Lebesgue Monotone Convergence Theorem has the following general version.

Theorem 2.2 (General Lebesgue Monotone Convergence Theorem). Let $(X, \mathcal{A}, \mu)$ be a measure space, and let $f, f_n : X \to [0, \infty]$, $n \ge 1$, be measurable functions, such that

• $f_n \le f_{n+1}$, $\mu$-a.e., $\forall\, n \ge 1$;

• $f = \mu\text{-a.e.-}\lim_{n\to\infty} f_n$.

Then

$$\int_X f \, d\mu = \lim_{n\to\infty} \int_X f_n \, d\mu. \tag{9}$$

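Proof. As before, the sequence $(\alpha_n)_{n=1}^\infty \subset [0, \infty]$, defined by $\alpha_n = \int_X f_n \, d\mu$, $\forall\, n \ge 1$, is non-decreasing, and it has a limit

$$\alpha = \lim_{n\to\infty} \alpha_n = \sup\Big\{\int_X f_n \, d\mu : n \ge 1\Big\} \in [0, \infty].$$

There are two cases to analyze.

Case I: $\alpha = \infty$.

In this case the inequalities $f \ge f_n \ge 0$, $\mu$-a.e., will force

$$\int_X f \, d\mu \ge \int_X f_n \, d\mu = \alpha_n, \quad \forall\, n \ge 1,$$

which will force $\int_X f \, d\mu \ge \alpha$, so we indeed get $\int_X f \, d\mu = \infty = \alpha$.

Case II: $\alpha < \infty$.

In this case we apply directly Theorem 2.1.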
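The following sketch illustrates the theorem in the simplest setting we could think of, which is our own choice and not part of the text: the counting measure $\nu$ on $\mathbb{N} = \{1, 2, \dots\}$, the function $f(k) = 2^{-k}$, and the truncations $f_n = f\kappa_{\{1,\dots,n\}}$. The integrals $\alpha_n = \int f_n \, d\nu$ are finite sums, computed here exactly with rational arithmetic.

```python
from fractions import Fraction


def f(k):
    """Limit function on N = {1, 2, ...}: f(k) = 2**(-k)."""
    return Fraction(1, 2 ** k)


def integral_fn(n):
    """Integral of f_n = f * (indicator of {1, ..., n}) against counting measure."""
    return sum(f(k) for k in range(1, n + 1))


# alpha_n = integral of f_n; the sequence is non-decreasing with supremum 1.
alphas = [integral_fn(n) for n in range(1, 21)]
for n, alpha_n in enumerate(alphas, start=1):
    print(n, alpha_n, float(alpha_n))
```

Here $\alpha_n = 1 - 2^{-n}$ increases to $1$, which agrees with $\int f \, d\nu = \sum_{k \ge 1} 2^{-k} = 1$, as the theorem predicts.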

The following result provides an equivalent definition of integrability for non-negative functions (compare to the construction in Section 1).

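Corollary 2.1. Let $(X, \mathcal{A}, \mu)$ be a measure space, and let $f : X \to [0, \infty]$ be a measurable function. The following are equivalent:

(i) $f \in L^1_+(X, \mathcal{A}, \mu)$;

(ii) there exists a sequence $(h_n)_{n=1}^\infty \subset L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$, with

• $0 \le h_1 \le h_2 \le \cdots$;

• $\lim_{n\to\infty} h_n(x) = f(x)$, $\forall\, x \in X$;

• $\sup\big\{\int_X h_n \, d\mu : n \ge 1\big\} < \infty$.

Moreover, if $(h_n)_{n=1}^\infty$ is as in (ii), then one has the equality

$$\int_X f \, d\mu = \lim_{n\to\infty} \int_X h_n \, d\mu. \tag{10}$$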

Proof. (i) ⇒ (ii). Assume $f \in L^1_+(X, \mathcal{A}, \mu)$. Using Theorem III.3.2, we know there exists a sequence $(h_n)_{n=1}^\infty \subset \mathcal{A}\text{-Elem}_{\mathbb{R}}(X)$, with

(a) $0 \le h_1 \le h_2 \le \cdots \le f$;

(b) $\lim_{n\to\infty} h_n(x) = f(x)$, $\forall\, x \in X$.

Note that (a) forces $h_n \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$, as well as the inequalities $\int_X h_n \, d\mu \le \int_X f \, d\mu < \infty$, $\forall\, n \ge 1$, so the sequence $(h_n)_{n=1}^\infty$ clearly satisfies condition (ii).

The implication (ii) ⇒ (i), and the equality (10), immediately follow from the General Lebesgue Monotone Convergence Theorem.

Corollary 2.2 (Fatou Lemma). Let $(X, \mathcal{A}, \mu)$ be a measure space, and let $f_n : X \to [0, \infty]$, $n \ge 1$, be a sequence of measurable functions. Define the function $f : X \to [0, \infty]$ by

$$f(x) = \liminf_{n\to\infty} f_n(x), \quad \forall\, x \in X.$$

Then $f$ is measurable, and one has the inequality

$$\int_X f \, d\mu \le \liminf_{n\to\infty} \int_X f_n \, d\mu.$$

Proof. The fact that $f$ is measurable is already known (see Corollary III.3.5). Define the sequence $(\alpha_n)_{n=1}^\infty \subset [0, \infty]$ by $\alpha_n = \int_X f_n \, d\mu$, $\forall\, n \ge 1$. Define, for each integer $n \ge 1$, the function $g_n : X \to [0, \infty]$ by

$$g_n(x) = \inf\{f_k(x) : k \ge n\}, \quad \forall\, x \in X.$$

By Corollary III.3.4, we know that $g_n$, $n \ge 1$, are all measurable. Moreover, it is clear that

• $0 \le g_1 \le g_2 \le \cdots$;

• $f(x) = \lim_{n\to\infty} g_n(x)$, $\forall\, x \in X$.

By the General Lebesgue Monotone Convergence Theorem 2.2, it follows that

$$\int_X f \, d\mu = \lim_{n\to\infty} \int_X g_n \, d\mu. \tag{11}$$

Notice that, if we define the sequence $(\beta_n)_{n=1}^\infty \subset [0, \infty]$, by $\beta_n = \int_X g_n \, d\mu$, $\forall\, n \ge 1$, then the obvious inequalities $0 \le g_n \le f_n$ give $\int_X g_n \, d\mu \le \int_X f_n \, d\mu$, so we get

$$\beta_n \le \alpha_n, \quad \forall\, n \ge 1.$$

Using (11), we then get

$$\int_X f \, d\mu = \lim_{n\to\infty} \beta_n = \liminf_{n\to\infty} \beta_n \le \liminf_{n\to\infty} \alpha_n.$$
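The inequality in the Fatou Lemma can be strict. The sketch below checks this on a toy example of our own choosing: a two-point space $X = \{0, 1\}$ with counting measure, and $f_n$ the characteristic function of the single point $n \bmod 2$. The helper `g` mirrors the functions $g_n = \inf\{f_k : k \ge n\}$ used in the proof; because the sequence has period 2, a window of two terms already gives the exact infimum.

```python
X = (0, 1)  # two-point space with counting measure


def f(n, x):
    """f_n = characteristic function of the single point {n mod 2}."""
    return 1 if x == n % 2 else 0


def g(n, x, window=2):
    """g_n(x) = inf{f_k(x) : k >= n}; the sequence has period 2, so two terms suffice."""
    return min(f(k, x) for k in range(n, n + window))


def integral(h):
    """Integral against counting measure on X."""
    return sum(h(x) for x in X)


integrals_f = [integral(lambda x, n=n: f(n, x)) for n in range(1, 11)]
integrals_g = [integral(lambda x, n=n: g(n, x)) for n in range(1, 11)]
print(integrals_f)  # the alpha_n of the proof
print(integrals_g)  # the beta_n of the proof
```

In this example every $g_n$ vanishes identically, so $\liminf_n f_n = 0$ has integral $0$, while each $\int_X f_n \, d\mu$ equals $1$.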

The following is an important application of Theorem 2.1, that deals with Riemann integration.

Corollary 2.3. Let $a < b$ be real numbers. Denote by $\lambda$ the Lebesgue measure, and consider the Lebesgue space $\big([a, b], \mathfrak{M}_\lambda([a, b]), \lambda\big)$, where $\mathfrak{M}_\lambda([a, b])$ denotes the $\sigma$-algebra of all Lebesgue measurable subsets of $[a, b]$. Then every Riemann integrable function $f : [a, b] \to \mathbb{R}$ belongs to $L^1_{\mathbb{R}}\big([a, b], \mathfrak{M}_\lambda([a, b]), \lambda\big)$, and one has the equality

$$\int_{[a,b]} f \, d\lambda = \int_a^b f(x) \, dx. \tag{12}$$

Proof. We are going to use the results from III.6. First of all, the fact that $f$ is Lebesgue integrable, i.e. $f$ belongs to $L^1_{\mathbb{R}}\big([a, b], \mathfrak{M}_\lambda([a, b]), \lambda\big)$, is clear since $f$ is Lebesgue measurable, and bounded. (Here we use the fact that the measure space $\big([a, b], \mathfrak{M}_\lambda([a, b]), \lambda\big)$ is finite.)

Next we prove the equality between the Riemann integral and the Lebesgue integral. Adding a constant, if necessary, we can assume that $f \ge 0$. For every partition $\Delta = (a = x_0 < x_1 < \cdots < x_n = b)$ of $[a, b]$, we define the numbers

$$m_k = \inf_{t \in [x_{k-1}, x_k]} f(t), \quad \forall\, k = 1, \dots, n,$$

and we define the function

$$f_\Delta = m_1\kappa_{[x_0, x_1]} + m_2\kappa_{(x_1, x_2]} + \cdots + m_n\kappa_{(x_{n-1}, x_n]}.$$

Fix a sequence of partitions $(\Delta_p)_{p=1}^\infty$, with $\Delta_1 \subset \Delta_2 \subset \cdots$, and $\lim_{p\to\infty} |\Delta_p| = 0$. We know (see III.6) that we have

$$f = \lambda\text{-a.e.-}\lim_{p\to\infty} f_{\Delta_p}.$$

Clearly we have $0 \le f_{\Delta_1} \le f_{\Delta_2} \le \cdots \le f$, so by Theorem 2.1, we get

$$\int_{[a,b]} f \, d\lambda = \lim_{p\to\infty} \int_{[a,b]} f_{\Delta_p} \, d\lambda. \tag{13}$$

Notice however that

$$\int_{[a,b]} f_{\Delta_p} \, d\lambda = L(f, \Delta_p), \quad \forall\, p \ge 1,$$

where $L(f, \Delta_p)$ denotes the lower Darboux sum. Combining this with (13), and with the well known properties of Riemann integration, we immediately get (12).
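The monotone approximation by step functions used in this proof can be watched numerically. The sketch below takes, purely as an assumed example, $f(x) = x^2$ on $[0, 1]$ and the dyadic partitions $\Delta_p$ with $2^p$ cells, which are nested and have mesh tending to zero; it computes the lower Darboux sums $L(f, \Delta_p) = \int_{[0,1]} f_{\Delta_p} \, d\lambda$ exactly.

```python
from fractions import Fraction


def lower_darboux(p):
    """L(f, Delta_p) for f(x) = x**2 on [0, 1] and the dyadic partition with 2**p cells.

    f is increasing on [0, 1], so the infimum m_k over [x_{k-1}, x_k] is f(x_{k-1}).
    """
    n = 2 ** p
    width = Fraction(1, n)
    return sum((Fraction(k - 1, n) ** 2) * width for k in range(1, n + 1))


sums = [lower_darboux(p) for p in range(1, 11)]
for p, s in enumerate(sums, start=1):
    print(p, float(s))
```

The sums increase with $p$ and approach $\int_0^1 x^2 \, dx = 1/3$ from below, in line with (13).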

The following is another important convergence theorem.

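Theorem 2.3 (Lebesgue Dominated Convergence Theorem). Let $(X, \mathcal{A}, \mu)$ be a measure space, let $\mathbb{K}$ be one of the symbols $\mathbb{R}$, $\overline{\mathbb{R}}$, or $\mathbb{C}$, and let $(f_n)_{n=1}^\infty \subset L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$. Assume $f : X \to \mathbb{K}$ is a measurable function, such that

(i) $f = \mu\text{-a.e.-}\lim_{n\to\infty} f_n$;

(ii) there exists some function $g \in L^1_+(X, \mathcal{A}, \mu)$, such that

$$|f_n| \le g, \quad \mu\text{-a.e.}, \ \forall\, n \ge 1.$$

Then $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, and one has the equality

$$\int_X f \, d\mu = \lim_{n\to\infty} \int_X f_n \, d\mu. \tag{14}$$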

Proof. The fact that $f$ is integrable follows from the following

Claim: $|f| \le g$, $\mu$-a.e.

To prove this fact, we define, for each $n \ge 1$, the set

$$M_n = \{x \in X : |f_n(x)| > g(x)\}.$$

It is clear that $M_n \in \mathcal{A}$, and $\mu(M_n) = 0$, $\forall\, n \ge 1$. If we choose $M \in \mathcal{A}$ such that $\mu(M) = 0$, and

$$f(x) = \lim_{n\to\infty} f_n(x), \quad \forall\, x \in X \setminus M,$$

then the set $N = M \cup \big(\bigcup_{n=1}^\infty M_n\big) \in \mathcal{A}$ will satisfy

• $\mu(N) = 0$;

• $|f_n(x)| \le g(x)$, $\forall\, x \in X \setminus N$, $n \ge 1$;

• $f(x) = \lim_{n\to\infty} f_n(x)$, $\forall\, x \in X \setminus N$.

We then clearly get

$$|f(x)| \le g(x), \quad \forall\, x \in X \setminus N,$$

and the Claim follows.

Having proven that $f$ is integrable, we now concentrate on the equality (14).

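Case I: $\mathbb{K} = \overline{\mathbb{R}}$.

First of all, without any loss of generality, we can assume that $0 \le g(x) < \infty$, $\forall\, x \in X$. (See Lemma 1.2.) Let us define the functions $g_n = \min\{f_n, g\}$ and $h_n = \max\{f_n, -g\}$, $n \ge 1$. Since we have $-g \le f_n \le g$, $\mu$-a.e., we immediately get

$$g_n = h_n = f_n, \quad \mu\text{-a.e.}, \ \forall\, n \ge 1, \tag{15}$$

thus giving the fact that $g_n, h_n \in L^1_{\overline{\mathbb{R}}}(X, \mathcal{A}, \mu)$, $\forall\, n \ge 1$, as well as the equalities

$$\int_X g_n \, d\mu = \int_X h_n \, d\mu = \int_X f_n \, d\mu, \quad \forall\, n \ge 1. \tag{16}$$

Define the measurable functions $\varphi, \psi : X \to \overline{\mathbb{R}}$ by

$$\varphi(x) = \limsup_{n\to\infty} g_n(x) \quad \text{and} \quad \psi(x) = \liminf_{n\to\infty} h_n(x), \quad \forall\, x \in X.$$

Using (15), we clearly have $f = \varphi = \psi$, $\mu$-a.e., so we get

$$\int_X f \, d\mu = \int_X \varphi \, d\mu = \int_X \psi \, d\mu. \tag{17}$$

Remark also that we have the equalities

$$g(x) - \varphi(x) = \liminf_{n\to\infty} [g(x) - g_n(x)] \quad \text{and} \quad g(x) + \psi(x) = \liminf_{n\to\infty} [g(x) + h_n(x)], \quad \forall\, x \in X. \tag{18}$$

Since we clearly have

$$g - g_n \ge 0 \quad \text{and} \quad g + h_n \ge 0, \quad \forall\, n \ge 1,$$

using (18) and the Fatou Lemma (Corollary 2.2), we get the inequalities

$$\int_X (g - \varphi) \, d\mu \le \liminf_{n\to\infty} \int_X (g - g_n) \, d\mu,$$

$$\int_X (g + \psi) \, d\mu \le \liminf_{n\to\infty} \int_X (g + h_n) \, d\mu.$$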




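In other words, we get

$$\int_X g \, d\mu - \int_X \varphi \, d\mu \le \liminf_{n\to\infty} \Big[\int_X g \, d\mu - \int_X g_n \, d\mu\Big] = \int_X g \, d\mu - \limsup_{n\to\infty} \int_X g_n \, d\mu,$$

$$\int_X g \, d\mu + \int_X \psi \, d\mu \le \liminf_{n\to\infty} \Big[\int_X g \, d\mu + \int_X h_n \, d\mu\Big] = \int_X g \, d\mu + \liminf_{n\to\infty} \int_X h_n \, d\mu.$$

Using the equalities (16) and (17), the above inequalities give

$$\int_X f \, d\mu = \int_X \varphi \, d\mu \ge \limsup_{n\to\infty} \int_X g_n \, d\mu = \limsup_{n\to\infty} \int_X f_n \, d\mu,$$

$$\int_X f \, d\mu = \int_X \psi \, d\mu \le \liminf_{n\to\infty} \int_X h_n \, d\mu = \liminf_{n\to\infty} \int_X f_n \, d\mu.$$

In other words, we have

$$\int_X f \, d\mu \le \liminf_{n\to\infty} \int_X f_n \, d\mu \le \limsup_{n\to\infty} \int_X f_n \, d\mu \le \int_X f \, d\mu,$$

thus giving the equality (14).

The case $\mathbb{K} = \mathbb{R}$ is trivial (it is in fact contained in the case $\mathbb{K} = \overline{\mathbb{R}}$).

The case $\mathbb{K} = \mathbb{C}$ is also pretty clear, using real and imaginary parts, since for each $n \ge 1$, we clearly have

$$|\operatorname{Re} f_n| \le g, \quad \mu\text{-a.e.},$$

$$|\operatorname{Im} f_n| \le g, \quad \mu\text{-a.e.}$$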

Exercise 3. Give an example of a sequence of continuous functions $f_n : [0, 1] \to [0, \infty)$, such that

(a) $\lim_{n\to\infty} f_n(x) = 0$, $\forall\, x \in [0, 1]$;

(b) $\int_{[0,1]} f_n \, d\lambda = 1$, $\forall\, n \ge 1$.

(Here $\lambda$ denotes the Lebesgue measure.) This shows that the Lebesgue Dominated Convergence Theorem fails without the dominance condition (ii).

Hint: Consider the functions $f_n$ defined by

$$f_n(x) = \begin{cases} n^2 x & \text{if } 0 \le x \le 1/n \\ n(2 - nx) & \text{if } 1/n \le x \le 2/n \\ 0 & \text{if } 2/n \le x \le 1 \end{cases}$$
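The tent functions from the hint can be checked with exact arithmetic. The sketch below is only a numerical companion to the exercise, not a solution; it assumes $n \ge 2$, so that the breakpoint $2/n$ lies in $[0, 1]$, and it integrates by the trapezoid rule on the breakpoints, which is exact for piecewise linear functions.

```python
from fractions import Fraction


def tent(n, x):
    """The function f_n from the hint, for 0 <= x <= 1 (take n >= 2 so that 2/n <= 1)."""
    if x <= Fraction(1, n):
        return n * n * x
    if x <= Fraction(2, n):
        return n * (2 - n * x)
    return Fraction(0)


def integral(n):
    """Exact integral over [0, 1]: trapezoid rule on the breakpoints 0, 1/n, 2/n, 1."""
    pts = [Fraction(0), Fraction(1, n), Fraction(2, n), Fraction(1)]
    return sum((b - a) * (tent(n, a) + tent(n, b)) / 2 for a, b in zip(pts, pts[1:]))


x0 = Fraction(1, 100)
for n in (2, 10, 100, 1000):
    # The integral column stays constant, the value at x0 eventually vanishes,
    # and the peak value f_n(1/n) = n grows without bound.
    print(n, integral(n), tent(n, x0), tent(n, Fraction(1, n)))
```

The unbounded peaks $f_n(1/n) = n$ are what makes the dominance condition (ii) a natural thing to examine here.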

The Lebesgue Convergence Theorems 2.2 and 2.3 have many applications. They are among the most important results in Measure Theory. In many instances, these theorems are employed during proofs, at key steps. The next two results are good illustrations.

Proposition 2.1. Let $(X, \mathcal{A}, \mu)$ be a measure space, and let $f : X \to [0, \infty]$ be a measurable function. Then the map

$$\nu : \mathcal{A} \ni A \longmapsto \int_A f \, d\mu \in [0, \infty]$$

defines a measure on $\mathcal{A}$.

Proof. It is clear that $\nu(\emptyset) = 0$. To prove $\sigma$-additivity, start with a pairwise disjoint sequence $(A_n)_{n=1}^\infty \subset \mathcal{A}$, and put $A = \bigcup_{n=1}^\infty A_n$. For each integer $n \ge 1$, define the set $B_n = \bigcup_{k=1}^n A_k$, and the measurable function $g_n = f\kappa_{B_n}$. Define also the function $g = f\kappa_A$. It is obvious that

• $0 \le g_1 \le g_2 \le \cdots \le g$ (everywhere),

• $\lim_{n\to\infty} g_n(x) = g(x)$, $\forall\, x \in X$.

Using the General Lebesgue Monotone Convergence Theorem, it follows that

$$\nu(A) = \int_X f\kappa_A \, d\mu = \int_X g \, d\mu = \lim_{n\to\infty} \int_X g_n \, d\mu. \tag{19}$$

Notice now that, for each $n \ge 1$, one has the equality

$$g_n = f\kappa_{A_1} + \cdots + f\kappa_{A_n},$$

so using Remark 1.7.C, we get

$$\int_X g_n \, d\mu = \sum_{k=1}^n \int_X f\kappa_{A_k} \, d\mu = \sum_{k=1}^n \nu(A_k),$$

so the equality (19) immediately gives $\nu(A) = \sum_{n=1}^\infty \nu(A_n)$.

The next result is a version of the previous one for K-valued functions.

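Proposition 2.2. Let $(X, \mathcal{A}, \mu)$ be a measure space, let $\mathbb{K}$ be one of the symbols $\mathbb{R}$, $\overline{\mathbb{R}}$, or $\mathbb{C}$, and let $(A_n)_{n=1}^\infty \subset \mathcal{A}$ be a pairwise disjoint sequence with $\bigcup_{n=1}^\infty A_n = X$. For a function $f : X \to \mathbb{K}$, the following are equivalent.

(i) $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$;

(ii) $f|_{A_n} \in L^1_{\mathbb{K}}\big(A_n, \mathcal{A}|_{A_n}, \mu|_{A_n}\big)$, $\forall\, n \ge 1$, and

$$\sum_{n=1}^\infty \int_{A_n} |f| \, d\mu < \infty.$$

Moreover, if $f$ satisfies these equivalent conditions, then

$$\int_X f \, d\mu = \sum_{n=1}^\infty \int_{A_n} f \, d\mu.$$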

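Proof. (i) ⇒ (ii). Assume $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$. Applying Proposition 2.1 to $|f|$, we immediately get

$$\sum_{n=1}^\infty \int_{A_n} |f| \, d\mu = \int_X |f| \, d\mu < \infty,$$

which clearly proves (ii).

(ii) ⇒ (i). Assume $f$ satisfies condition (ii). Define, for each $n \ge 1$, the set $B_n = A_1 \cup \cdots \cup A_n$. First of all, since $(A_n)_{n=1}^\infty \subset \mathcal{A}$, and $\bigcup_{n=1}^\infty A_n = X$, it follows that $f$ is measurable. Consider the functions $f_n = f\kappa_{B_n}$ and $g_n = f\kappa_{A_n}$, $n \ge 1$. Notice that, since $f|_{A_n} \in L^1_{\mathbb{K}}\big(A_n, \mathcal{A}|_{A_n}, \mu|_{A_n}\big)$, it follows that $g_n \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, $\forall\, n \ge 1$, and we also have

$$\int_X g_n \, d\mu = \int_{A_n} f \, d\mu, \quad \forall\, n \ge 1.$$

In fact we also have

$$\int_X |g_n| \, d\mu = \int_{A_n} |f| \, d\mu, \quad \forall\, n \ge 1.$$

Notice that we obviously have $f_n = g_1 + \cdots + g_n$, and $|f_n| = |g_1| + \cdots + |g_n|$, so if we define

$$S = \sum_{n=1}^\infty \int_{A_n} |f| \, d\mu,$$

we get

$$\int_X |f_n| \, d\mu = \sum_{k=1}^n \int_{A_k} |f| \, d\mu \le S < \infty, \quad \forall\, n \ge 1.$$

Notice however that we have $0 \le |f_1| \le |f_2| \le \cdots \le |f|$, as well as the equality $\lim_{n\to\infty} f_n(x) = f(x)$, $\forall\, x \in X$. On the one hand, using the General Lebesgue Monotone Convergence Theorem, we will get

$$\int_X |f| \, d\mu = \lim_{n\to\infty} \int_X |f_n| \, d\mu = \lim_{n\to\infty} \sum_{k=1}^n \int_{A_k} |f| \, d\mu = \sum_{n=1}^\infty \int_{A_n} |f| \, d\mu = S < \infty,$$

which proves that $|f| \in L^1_+(X, \mathcal{A}, \mu)$, so in particular $f$ belongs to $L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$. On the other hand, since we have $|f_n| \le |f|$, by the Lebesgue Dominated Convergence Theorem, we get

$$\int_X f \, d\mu = \lim_{n\to\infty} \int_X f_n \, d\mu = \lim_{n\to\infty} \sum_{k=1}^n \int_{A_k} f \, d\mu = \sum_{n=1}^\infty \int_{A_n} f \, d\mu.$$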

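Corollary 2.4. Let $(X, \mathcal{A}, \mu)$ be a measure space, let $\mathbb{K}$ be one of the symbols $\mathbb{R}$, $\overline{\mathbb{R}}$, or $\mathbb{C}$, and let $(X_n)_{n=1}^\infty \subset \mathcal{A}$ be a sequence with $\bigcup_{n=1}^\infty X_n = X$, and $X_1 \subset X_2 \subset \cdots$. For a measurable function $f : X \to \mathbb{K}$, the following are equivalent.

(i) $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$;

(ii) $f|_{X_n} \in L^1_{\mathbb{K}}\big(X_n, \mathcal{A}|_{X_n}, \mu|_{X_n}\big)$, $\forall\, n \ge 1$, and

$$\sup\Big\{\int_{X_n} |f| \, d\mu : n \ge 1\Big\} < \infty.$$

Moreover, if $f$ satisfies these equivalent conditions, then

$$\int_X f \, d\mu = \lim_{n\to\infty} \int_{X_n} f \, d\mu.$$

Proof. Apply the above result to the sequence $(A_n)_{n=1}^\infty$ given by $A_1 = X_1$ and $A_n = X_n \setminus X_{n-1}$, $\forall\, n \ge 2$.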

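Remark 2.2. Suppose $(X, \mathcal{A}, \mu)$ is a measure space, $\mathbb{K}$ is one of the fields $\mathbb{R}$ or $\mathbb{C}$, and $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$. By Proposition 2.2, we get the fact that the map

$$\nu : \mathcal{A} \ni A \longmapsto \int_A f \, d\mu \in \mathbb{K}$$

is a $\mathbb{K}$-valued measure on $\mathcal{A}$. By Proposition 2.1, we also know that

$$\omega : \mathcal{A} \ni A \longmapsto \int_A |f| \, d\mu \in [0, \infty)$$

is a finite “honest” measure on $\mathcal{A}$. Using Proposition 1.6, we clearly have

$$|\nu(A)| = \left| \int_A f \, d\mu \right| \le \int_A |f| \, d\mu = \omega(A), \quad \forall\, A \in \mathcal{A},$$

which by the results from III.8 gives the inequality $|\nu| \le \omega$. (Here $|\nu|$ denotes the variation measure of $\nu$.) Later on (see Section 4) we are going to see that in fact we have the equality $|\nu| = \omega$.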




Comment. It is important to understand the “sequential” nature of the convergence theorems discussed here. If we examine for instance the Monotone Convergence Theorem, we could easily formulate a “series” version, which states the equality

$$\int_X \Big(\sum_{n=1}^\infty f_n\Big) \, d\mu = \sum_{n=1}^\infty \int_X f_n \, d\mu,$$

for any sequence of measurable functions $f_n : X \to [0, \infty]$.

Suppose now we have an arbitrary family $f_j : X \to [0, \infty]$, $j \in J$, of measurable functions, and we define

$$f(x) = \sum_{j \in J} f_j(x), \quad \forall\, x \in X.$$

(Here we use the summability convention which defines the sum as the supremum of all finite sums.) In general, $f$ is not always measurable. But if it is, one still cannot conclude that

$$\int_X f \, d\mu = \sum_{j \in J} \int_X f_j \, d\mu.$$

The following example illustrates this anomaly.

The following example illustrates this anomaly.

Example 2.1. Take the measure space $\big([0, 1], \mathfrak{M}_\lambda([0, 1]), \lambda\big)$, and fix an arbitrary set $J \subset [0, 1]$. For each $j \in J$ we consider the characteristic function $f_j = \kappa_{\{j\}}$. It is obvious that the function $f : [0, 1] \to [0, \infty]$, defined by

$$f(x) = \sum_{j \in J} f_j(x), \quad \forall\, x \in [0, 1],$$

is equal to $\kappa_J$. If $J$ is non-measurable, this already gives an example when $f = \sum_{j \in J} f_j$ is non-measurable. But even if $J$ is measurable, the equality

$$\int_{[0,1]} f \, d\lambda = \sum_{j \in J} \int_{[0,1]} f_j \, d\lambda$$

fails whenever $\lambda(J) > 0$, simply because the right hand side is zero, while the left hand side is equal to $\lambda(J)$.

The next two exercises illustrate straightforward (but nevertheless interesting) applications of the convergence theorems to quite simple situations.

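Exercise 4. Let $\mathcal{A}$ be a $\sigma$-algebra on a (non-empty) set $X$, and let $(\mu_n)_{n=1}^\infty$ be a sequence of signed measures on $\mathcal{A}$. Assume that, for each $A \in \mathcal{A}$, the sequence $\big(\mu_n(A)\big)_{n=1}^\infty$ has a limit denoted $\mu(A) \in [-\infty, \infty]$. Prove that the map $\mu : \mathcal{A} \to [0, \infty]$ defines a measure on $\mathcal{A}$, if the sequence $(\mu_n)_{n=1}^\infty$ satisfies one of the following hypotheses:

A. $0 \le \mu_1(A) \le \mu_2(A) \le \cdots$, $\forall\, A \in \mathcal{A}$;

B. there exists a finite measure $\omega$ on $\mathcal{A}$, such that $|\mu_n(A)| \le \omega(A)$, $\forall\, n \ge 1$, $A \in \mathcal{A}$.

Hint: To prove $\sigma$-additivity, fix a pairwise disjoint sequence $(A_k)_{k=1}^\infty \subset \mathcal{A}$, and put $A = \bigcup_{k=1}^\infty A_k$. Treat the problem of proving the equality $\mu(A) = \sum_{k=1}^\infty \mu(A_k)$ as a convergence problem on the measure space $(\mathbb{N}, \mathcal{P}(\mathbb{N}), \nu)$, with $\nu$ the counting measure, for the sequence of functions $f_n : \mathbb{N} \to [0, \infty]$ defined by $f_n(k) = \mu_n(A_k)$, $\forall\, k \in \mathbb{N}$.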

Exercise 5*. Let $\mathcal{A}$ be a $\sigma$-algebra on a (non-empty) set $X$, and let $(\mu_j)_{j \in J}$ be a family of signed measures on $\mathcal{A}$. Assume either of the following is true:

A. $\mu_j(A) \ge 0$, $\forall\, j \in J$, $A \in \mathcal{A}$.

B. There exists a finite measure $\omega$ on $\mathcal{A}$, such that $\sum_{j \in J} |\mu_j(A)| \le \omega(A)$, $\forall\, A \in \mathcal{A}$.

Define the map $\mu : \mathcal{A} \to [0, \infty]$ by $\mu(A) = \sum_{j \in J} \mu_j(A)$, $\forall\, A \in \mathcal{A}$. (In Case A, the sum is defined as the supremum over finite sums. In Case B, it follows that the family $\big(\mu_j(A)\big)_{j \in J}$ is summable.) Prove that $\mu$ is a measure on $\mathcal{A}$.

Hint: To prove $\sigma$-additivity, fix a pairwise disjoint sequence $(A_k)_{k=1}^\infty \subset \mathcal{A}$, and put $A = \bigcup_{k=1}^\infty A_k$. To prove the equality $\mu(A) = \sum_{k=1}^\infty \mu(A_k)$, analyze the following cases: (i) there is some $k \ge 1$, such that $\mu(A_k) = \infty$; (ii) $\mu(A_k) < \infty$, $\forall\, k \ge 1$. The first case is quite trivial. In the second case reduce the problem to the previous exercise, by observing that, for each $k \ge 1$, the set $J(A_k) = \{j \in J : \mu_j(A_k) > 0\}$ must be countable. Then the set $J(A) = \{j \in J : \mu_j(A) > 0\}$ is also countable.

Comment. One of the major drawbacks of the theory of Riemann integrationis illustrated by the approach to improper integration. Recall that for a functionh : [a, b)

→R the improper Riemann integral is defined as b−

a

h(t) dt = limx→b−

xa

f (t) dt,

provided that

(a) h

[a,x]is Riemann integrable, ∀ x ∈ (a, b), and

(b) the above limit exists.

The problem is that although the improper integral may exist, and the function is actually defined on [a, b], it may fail to be Riemann integrable, for example when it is unbounded.

In contrast to this situation, by Corollary 2.4, we see that if for example h ≥ 0, then the Lebesgue integrability of h on [a, b] is equivalent to the fact that

(i) $h|_{[a,x]}$ is Lebesgue integrable, ∀ x ∈ (a, b), and

(ii) $\lim_{x \to b-} \int_{[a,x]} h\,d\lambda$ exists.

Going back to the discussion on the improper Riemann integral, we can see that a sufficient condition for h : [a, b) → R to be Riemann integrable in the improper sense is the fact that h has property (a) above, and h is Lebesgue integrable on [a, b). In fact, if h ≥ 0, then by Corollary 2.4, this is also necessary.

Notation. Let −∞ ≤ a < b ≤ ∞, and let f be a Lebesgue integrable function, defined on some interval J which is one of (a, b), [a, b), (a, b], or [a, b]. Then the Lebesgue integral $\int_J f\,d\lambda$ will be denoted simply by $\int_a^b f\,d\lambda$.

Exercise 6* . Let (X, A, µ) be a finite measure space. Prove that for every $f \in \mathfrak{L}^1_+(X, \mathcal{A}, \mu)$, one has the equality
\[ \int_X f\,d\mu = \int_0^\infty \mu\bigl(f^{-1}([t, \infty])\bigr)\,dt, \]
where the second term is defined as an improper Riemann integral.

Hint: The function ϕ : [0, ∞) → [0, ∞) defined by $\varphi(t) = \mu\bigl(f^{-1}([t, \infty])\bigr)$, ∀ t ≥ 0, is non-increasing, so it is Riemann integrable on every interval [0, a], a > 0. Prove the inequalities
\[ \int_{X_a} f\,d\mu \le \int_0^a \varphi(t)\,dt \le \int_X f\,d\mu, \quad \forall\, a > 0, \]
where $X_a = f^{-1}([0, a))$, by analyzing lower and upper Darboux sums of $\varphi|_{[0,a]}$. Use Corollary 2.4 to get $\lim_{a \to \infty} \int_{X_a} f\,d\mu = \int_X f\,d\mu$.
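The identity in Exercise 6 can be sanity-checked on a finite measure space, where both sides are finite sums. The sketch below is only an illustration under that assumption and is not part of the exercise's solution: the names `integral` and `layer_cake`, and the sample data, are made up for the example. Since f takes finitely many values, the function t ↦ µ(f⁻¹([t, ∞])) is a step function and its improper Riemann integral is computed exactly with rational arithmetic.

```python
from fractions import Fraction


def integral(f, mu):
    """Integral of f over a finite set X = {0, ..., n-1}; mu[x] is the mass of the point x."""
    return sum(fx * m for fx, m in zip(f, mu))


def layer_cake(f, mu):
    """Integral over [0, inf) of t -> mu({f >= t}) for a function f with finitely many values."""
    total = Fraction(0)
    previous = Fraction(0)
    for level in sorted(set(f)):
        # For t in (previous, level] the set {f >= t} equals {f >= level},
        # because f takes no value strictly between previous and level.
        mass = sum(m for fx, m in zip(f, mu) if fx >= level)
        total += (level - previous) * mass
        previous = level
    return total


# Hypothetical sample data: four points with finite masses and a non-negative f.
mu = [Fraction(1, 2), Fraction(1), Fraction(2), Fraction(1, 4)]
f = [Fraction(0), Fraction(1), Fraction(1), Fraction(3)]
```

Here both `integral(f, mu)` and `layer_cake(f, mu)` evaluate to 15/4, as the exercise predicts.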



Lecture 35

3. Banach spaces of integrable functions I: the Lp spaces

In this section we discuss an important construction, which is extremely useful in virtually all branches of Analysis. In Section 1, we have already introduced the space $L^1$. The first construction deals with a generalization of this space.

Definitions. Let (X, A, µ) be a measure space, and let K be one of the fields R or C.

A. For a number p ∈ (1, ∞), we define the space
\[ \mathfrak{L}^p_{\mathbb{K}}(X, \mathcal{A}, \mu) = \Bigl\{ f : X \to \mathbb{K} \,:\, f \text{ measurable, and } \int_X |f|^p\,d\mu < \infty \Bigr\}. \]
Here we use the convention introduced in Section 1, which defines $\int_X h\,d\mu = \infty$ for those measurable functions h : X → [0, ∞] that are not integrable.

Of course, in this definition we can allow also the value p = 1, and in this case we get the familiar definition of $\mathfrak{L}^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$.

B. For p ∈ [1, ∞), we define the map $Q_p : \mathfrak{L}^p_{\mathbb{K}}(X, \mathcal{A}, \mu) \to [0, \infty)$ by
\[ Q_p(f) = \Bigl( \int_X |f|^p\,d\mu \Bigr)^{1/p}, \quad \forall\, f \in \mathfrak{L}^p_{\mathbb{K}}(X, \mathcal{A}, \mu). \]

Remark 3.1. The space L1K(X, A, µ) was studied earlier (see Section 1). It

has the following features:

(i) L1K(X, A, µ) is a K-vector space.

(ii) The map Q1 : L1K(X, A, µ) → [0, ∞) is a seminorm , i.e.

(a) Q1(f + g) ≤ Q1(f ) + Q1(g), ∀ f, g ∈ L1K(X, A, µ);

(b) Q1(αf ) = |α| · Q1(f ), ∀ f ∈ L1K(X, A, µ), α ∈ K.

(iii) $\bigl| \int_X f\,d\mu \bigr| \le Q_1(f)$, ∀ $f \in \mathfrak{L}^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$.

Property (b) is clear. Property (a) immediately follows from the inequality |f + g| ≤ |f| + |g|, which after integration gives
\[ \int_X |f + g|\,d\mu \le \int_X \bigl( |f| + |g| \bigr)\,d\mu = \int_X |f|\,d\mu + \int_X |g|\,d\mu. \]

In what follows, we aim at proving similar features for the spaces $\mathfrak{L}^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$ and $Q_p$, 1 < p < ∞.

The following will help us prove that L p is a vector space.

Exercise 1 ♦. Let p ∈ (1, ∞). Then one has the inequality
\[ (s + t)^p \le 2^{p-1} (s^p + t^p), \quad \forall\, s, t \in [0, \infty). \]


Hint: The inequality is trivial when s = t = 0. If s + t > 0, reduce the problem to the case t + s = 1, and prove, using elementary calculus techniques, that
\[ \min_{t \in [0,1]} \bigl[ t^p + (1 - t)^p \bigr] = 2^{1-p}. \]
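Before attempting the proof, the inequality of Exercise 1 and the minimum stated in the hint can be checked numerically. The sketch below is a plausibility check on a sample grid, not a proof; the function names, the grid, the chosen exponents and the rounding tolerance are all assumptions made for the illustration.

```python
import itertools


def inequality_holds(s: float, t: float, p: float, rel_tol: float = 1e-12) -> bool:
    """Check (s + t)^p <= 2^(p-1) * (s^p + t^p), allowing for floating-point rounding."""
    lhs = (s + t) ** p
    rhs = 2 ** (p - 1) * (s ** p + t ** p)
    return lhs <= rhs * (1 + rel_tol)


def convex_combination_minimum(p: float, steps: int = 1000) -> float:
    """Approximate the minimum over t in [0, 1] of t^p + (1 - t)^p on a uniform grid."""
    return min((k / steps) ** p + (1 - k / steps) ** p for k in range(steps + 1))


grid = [k / 4 for k in range(21)]   # sample points 0, 0.25, ..., 5
exponents = [1.5, 2.0, 3.0, 7.5]    # arbitrary sample exponents p > 1
all_ok = all(
    inequality_holds(s, t, p)
    for s, t in itertools.product(grid, grid)
    for p in exponents
)
```

On this grid `all_ok` is true, with equality (up to rounding) exactly when s = t, which matches the minimum in the hint being attained at t = 1/2.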

Proposition 3.1. Let (X, A, µ) be a measure space, let K be one of the fields R or C, and let p ∈ (1, ∞). When equipped with pointwise addition and scalar multiplication, $\mathfrak{L}^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$ is a K-vector space.

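Proof. If $f, g \in \mathfrak{L}^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$, then by Exercise 1 we have
\[ \int_X |f + g|^p\,d\mu \le \int_X \bigl( |f| + |g| \bigr)^p\,d\mu \le 2^{p-1} \Bigl( \int_X |f|^p\,d\mu + \int_X |g|^p\,d\mu \Bigr) < \infty, \]
so f + g indeed belongs to $\mathfrak{L}^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$.

If $f \in \mathfrak{L}^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$, and α ∈ K, then the equalities
\[ \int_X |\alpha f|^p\,d\mu = \int_X |\alpha|^p \cdot |f|^p\,d\mu = |\alpha|^p \cdot \int_X |f|^p\,d\mu \]
clearly prove that αf also belongs to $\mathfrak{L}^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$.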

Our next task will be to prove that $Q_p$ is a seminorm, for all p > 1. In this direction, the following is a key result. (The above mentioned convention will be used throughout this entire section.)

Theorem 3.1 (Hölder's Inequality for integrals). Let (X, A, µ) be a measure space, let f, g : X → [0, ∞] be measurable functions, and let p, q ∈ (1, ∞) be such that $\frac{1}{p} + \frac{1}{q} = 1$. Then one has the inequality (see footnote 15)
\[ \int_X f g\,d\mu \le \Bigl( \int_X f^p\,d\mu \Bigr)^{1/p} \cdot \Bigl( \int_X g^q\,d\mu \Bigr)^{1/q}. \tag{1} \]

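Proof. If either $\int_X f^p\,d\mu = \infty$, or $\int_X g^q\,d\mu = \infty$, then the inequality (1) is trivial, because in this case the right hand side is ∞. For the remainder of the proof we will assume that
\[ \int_X f^p\,d\mu < \infty \quad\text{and}\quad \int_X g^q\,d\mu < \infty. \]
Use Corollary 2.1 to find two sequences $(\varphi_n)_{n=1}^\infty, (\psi_n)_{n=1}^\infty \subset \mathfrak{L}^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$, such that

• $0 \le \varphi_1 \le \varphi_2 \le \dots$ and $0 \le \psi_1 \le \psi_2 \le \dots$;
• $\lim_{n \to \infty} \varphi_n(x) = f(x)^p$ and $\lim_{n \to \infty} \psi_n(x) = g(x)^q$, ∀ x ∈ X.

By the Lebesgue Dominated Convergence Theorem, we will also get the equalities
\[ \int_X f^p\,d\mu = \lim_{n \to \infty} \int_X \varphi_n\,d\mu \quad\text{and}\quad \int_X g^q\,d\mu = \lim_{n \to \infty} \int_X \psi_n\,d\mu. \tag{2} \]
Remark that the functions $f_n = \varphi_n^{1/p}$, $g_n = \psi_n^{1/q}$, n ≥ 1, are also elementary (because they obviously have finite range). It is obvious that we have

• $0 \le f_1 \le f_2 \le \dots$, and $0 \le g_1 \le g_2 \le \dots$;
• $\lim_{n \to \infty} f_n(x) = f(x)$, and $\lim_{n \to \infty} g_n(x) = g(x)$, ∀ x ∈ X.

With these notations, the equalities (2) read
\[ \int_X f^p\,d\mu = \lim_{n \to \infty} \int_X (f_n)^p\,d\mu \quad\text{and}\quad \int_X g^q\,d\mu = \lim_{n \to \infty} \int_X (g_n)^q\,d\mu. \tag{3} \]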

Of course, the products f ngn, n ≥ 1 are again elementary, and satisfy

15 Here we use the convention $\infty^{1/p} = \infty^{1/q} = \infty$.




• $0 \le f_1 g_1 \le f_2 g_2 \le \dots$;
• $\lim_{n \to \infty} [f_n(x) g_n(x)] = f(x) g(x)$, ∀ x ∈ X.

Using the General Lebesgue Monotone Convergence Theorem, we then get
\[ \int_X f g\,d\mu = \lim_{n \to \infty} \int_X f_n g_n\,d\mu. \]
Using (3) we now see that, in order to prove (1), it suffices to prove the inequalities
\[ \int_X f_n g_n\,d\mu \le \Bigl( \int_X (f_n)^p\,d\mu \Bigr)^{1/p} \cdot \Bigl( \int_X (g_n)^q\,d\mu \Bigr)^{1/q}, \quad \forall\, n \ge 1. \]

In other words, it suffices to prove (1), under the extra assumption that both f and g are elementary integrable.

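Suppose f and g are elementary integrable. Then (see III.1) there exist pairwise disjoint sets $(D_j)_{j=1}^m \subset \mathcal{A}$, with $\mu(D_j) < \infty$, ∀ j = 1, . . . , m, and numbers $\alpha_1, \beta_1, \dots, \alpha_m, \beta_m \in [0, \infty)$, such that
\[ f = \alpha_1 \kappa_{D_1} + \dots + \alpha_m \kappa_{D_m}, \qquad g = \beta_1 \kappa_{D_1} + \dots + \beta_m \kappa_{D_m}. \]
Notice that we have
\[ f g = \alpha_1 \beta_1 \kappa_{D_1} + \dots + \alpha_m \beta_m \kappa_{D_m}, \]
so the left hand side of (1) is then given by
\[ \int_X f g\,d\mu = \sum_{j=1}^m \alpha_j \beta_j \mu(D_j). \]
Define the numbers $x_j = \alpha_j \mu(D_j)^{1/p}$, $y_j = \beta_j \mu(D_j)^{1/q}$, j = 1, . . . , m. Using these numbers, combined with $\frac{1}{p} + \frac{1}{q} = 1$, we clearly have
\[ \int_X f g\,d\mu = \sum_{j=1}^m x_j y_j. \tag{4} \]
At this point we are going to use the Hölder inequality for finite sequences (Lemma II.2.3), which gives
\[ \sum_{j=1}^m x_j y_j \le \Bigl( \sum_{j=1}^m (x_j)^p \Bigr)^{1/p} \cdot \Bigl( \sum_{j=1}^m (y_j)^q \Bigr)^{1/q}, \]
so the equality (4) continues with
\[ \begin{aligned} \int_X f g\,d\mu &\le \Bigl( \sum_{j=1}^m (x_j)^p \Bigr)^{1/p} \cdot \Bigl( \sum_{j=1}^m (y_j)^q \Bigr)^{1/q} \\ &= \Bigl( \sum_{j=1}^m (\alpha_j)^p \mu(D_j) \Bigr)^{1/p} \cdot \Bigl( \sum_{j=1}^m (\beta_j)^q \mu(D_j) \Bigr)^{1/q} \\ &= \Bigl( \int_X f^p\,d\mu \Bigr)^{1/p} \cdot \Bigl( \int_X g^q\,d\mu \Bigr)^{1/q}. \end{aligned} \]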
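The elementary case treated at the end of the proof is easy to experiment with: on a finite set with point masses, every function is elementary and the integrals in (1) are finite sums. The following sketch evaluates both sides of (1) for one made-up example; the helper names and the data are hypothetical, and the code only illustrates the inequality for these inputs.

```python
def integral(values, weights):
    """Integral of a function on a finite set against the measure with point masses `weights`."""
    return sum(v * w for v, w in zip(values, weights))


def holder_sides(f, g, weights, p):
    """Return (left side, right side) of inequality (1), with q the conjugate exponent of p."""
    q = p / (p - 1)
    lhs = integral([a * b for a, b in zip(f, g)], weights)
    rhs = (integral([a ** p for a in f], weights) ** (1 / p)
           * integral([b ** q for b in g], weights) ** (1 / q))
    return lhs, rhs


weights = [0.5, 2.0, 1.0, 0.25]   # mu({x}) for four points (made-up data)
f = [1.0, 0.0, 3.0, 2.0]          # non-negative sample functions
g = [2.0, 5.0, 0.5, 1.0]
lhs, rhs = holder_sides(f, g, weights, p=3.0)
```

For these inputs `lhs` equals 3.0 and is well below `rhs`. Taking g = f and p = 2 makes the two sides agree up to rounding, consistent with the usual equality case in which $f^p$ and $g^q$ are proportional.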




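Corollary 3.1. Let (X, A, µ) be a measure space, let K be one of the fields R or C, and let p, q ∈ (1, ∞) be such that $\frac{1}{p} + \frac{1}{q} = 1$. For any two functions $f \in \mathfrak{L}^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$ and $g \in \mathfrak{L}^q_{\mathbb{K}}(X, \mathcal{A}, \mu)$, the product f g belongs to $\mathfrak{L}^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ and one has the inequality
\[ \Bigl| \int_X f g\,d\mu \Bigr| \le Q_p(f) \cdot Q_q(g). \]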

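Proof. By Hölder's inequality, applied to |f| and |g|, we get
\[ \int_X |f g|\,d\mu \le Q_p(f) \cdot Q_q(g) < \infty, \]
so |f g| belongs to $\mathfrak{L}^1_+(X, \mathcal{A}, \mu)$, i.e. f g belongs to $\mathfrak{L}^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$. The desired inequality then follows from the inequality
\[ \Bigl| \int_X f g\,d\mu \Bigr| \le \int_X |f g|\,d\mu. \]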

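Notation. Suppose (X, A, µ) is a measure space, K is one of the fields R or C, and p, q ∈ (1, ∞) are such that $\frac{1}{p} + \frac{1}{q} = 1$. For any pair of functions $f \in \mathfrak{L}^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$, $g \in \mathfrak{L}^q_{\mathbb{K}}(X, \mathcal{A}, \mu)$, we shall denote the number $\int_X f g\,d\mu \in \mathbb{K}$ simply by $\langle f, g \rangle$. With this notation, Corollary 3.1 reads:
\[ |\langle f, g \rangle| \le Q_p(f) \cdot Q_q(g), \quad \forall\, f \in \mathfrak{L}^p_{\mathbb{K}}(X, \mathcal{A}, \mu),\ g \in \mathfrak{L}^q_{\mathbb{K}}(X, \mathcal{A}, \mu). \]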

The following result gives an alternative description of the maps Q p, p ∈ (1, ∞).

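Proposition 3.2. Let (X, A, µ) be a measure space, let K be one of the fields R or C, let p, q ∈ (1, ∞) be such that $\frac{1}{p} + \frac{1}{q} = 1$, and let $f \in \mathfrak{L}^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$. Then one has the equality
\[ Q_p(f) = \sup \bigl\{ |\langle f, g \rangle| \,:\, g \in \mathfrak{L}^q_{\mathbb{K}}(X, \mathcal{A}, \mu),\ Q_q(g) \le 1 \bigr\}. \tag{5} \]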

Proof. Let us denote the right hand side of (5) simply by P(f). By Corollary 3.1, we clearly have the inequality
\[ P(f) \le Q_p(f). \]
To prove the other inequality, let us first observe that in the case when $Q_p(f) = 0$, there is nothing to prove, because the above inequality already forces P(f) = 0. Assume then $Q_p(f) > 0$, and define the function h : X → K by
\[ h(x) = \begin{cases} \dfrac{|f(x)|^p}{f(x)} & \text{if } f(x) \ne 0 \\ 0 & \text{if } f(x) = 0 \end{cases} \]
It is obvious that h is measurable. Moreover, one has the equality $|h| = |f|^{p-1}$, which using the equality qp = p + q gives $|h|^q = |f|^{qp-q} = |f|^p$. This proves that $h \in \mathfrak{L}^q_{\mathbb{K}}(X, \mathcal{A}, \mu)$, as well as the equality
\[ Q_q(h) = \Bigl( \int_X |h|^q\,d\mu \Bigr)^{1/q} = \Bigl( \int_X |f|^p\,d\mu \Bigr)^{1/q} = Q_p(f)^{p/q}. \]

If we define the number $\alpha = Q_p(f)^{-p/q}$, then the function g = αh has $Q_q(g) = 1$, so we get
\[ P(f) \ge \Bigl| \int_X f g\,d\mu \Bigr| = \frac{1}{Q_p(f)^{p/q}} \Bigl| \int_X f h\,d\mu \Bigr|. \]
Notice that $f h = |f|^p$, so the above inequality can be continued with
\[ P(f) \ge \frac{1}{Q_p(f)^{p/q}} \int_X |f|^p\,d\mu = \frac{Q_p(f)^p}{Q_p(f)^{p/q}} = Q_p(f). \]
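The extremal function g = αh built in the proof can be written out explicitly on a finite measure space. The sketch below does this for a real-valued sample function and checks numerically that $Q_q(g) = 1$ and $\langle f, g \rangle = Q_p(f)$. It assumes K = R and $Q_p(f) > 0$; the helper names and the data are invented for the illustration.

```python
def integral(values, mu):
    """Integral on a finite set against the measure with point masses `mu`."""
    return sum(v * m for v, m in zip(values, mu))


def seminorm(f, mu, p):
    """Q_p(f) = (integral of |f|^p)^(1/p)."""
    return integral([abs(x) ** p for x in f], mu) ** (1 / p)


def extremal_function(f, mu, p):
    """Return g = alpha * h from the proof, for real-valued f with Q_p(f) > 0."""
    q = p / (p - 1)
    # h = |f|^p / f where f is non-zero, and 0 elsewhere.
    h = [abs(x) ** p / x if x != 0 else 0.0 for x in f]
    alpha = seminorm(f, mu, p) ** (-p / q)
    return [alpha * hx for hx in h]


mu = [0.5, 2.0, 1.0, 0.25]    # point masses (made-up data)
f = [1.0, -4.0, 0.0, 2.5]     # real-valued sample function
p = 3.0
q = p / (p - 1)
g = extremal_function(f, mu, p)
pairing = integral([a * b for a, b in zip(f, g)], mu)
```

Up to rounding, `seminorm(g, mu, q)` is 1 and `pairing` coincides with `seminorm(f, mu, p)`, so for this example the supremum in (5) is attained at g.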

Corollary 3.2. Let (X, A, µ) be a measure space, let K be one of the fields R or C, and let p ∈ (1, ∞). Then $Q_p$ is a seminorm on $\mathfrak{L}^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$, i.e.

(a) Q p(f 1 + f 2) ≤ Q p(f 1) + Q p(f 2), ∀ f 1, f 2 ∈ L pK(X, A, µ);

(b) Q p(αf ) = |α| · Q p(f ), ∀ f ∈ L pK(X, A, µ), α ∈ K.

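Proof. (a). Take $q = \frac{p}{p-1}$, so that $\frac{1}{p} + \frac{1}{q} = 1$. Start with some arbitrary $g \in \mathfrak{L}^q_{\mathbb{K}}(X, \mathcal{A}, \mu)$, with $Q_q(g) \le 1$. Then the functions $f_1 g$ and $f_2 g$ belong to $\mathfrak{L}^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, and so $f_1 g + f_2 g$ also belongs to $\mathfrak{L}^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$. We then get
\[ \begin{aligned} |\langle f_1 + f_2, g \rangle| &= \Bigl| \int_X (f_1 g + f_2 g)\,d\mu \Bigr| = \Bigl| \int_X f_1 g\,d\mu + \int_X f_2 g\,d\mu \Bigr| \\ &\le \Bigl| \int_X f_1 g\,d\mu \Bigr| + \Bigl| \int_X f_2 g\,d\mu \Bigr| = |\langle f_1, g \rangle| + |\langle f_2, g \rangle|. \end{aligned} \]
Using Proposition 3.2, the above inequality gives
\[ |\langle f_1 + f_2, g \rangle| \le Q_p(f_1) + Q_p(f_2). \]
Since the above inequality holds for all $g \in \mathfrak{L}^q_{\mathbb{K}}(X, \mathcal{A}, \mu)$ with $Q_q(g) \le 1$, again by Proposition 3.2, we get
\[ Q_p(f_1 + f_2) \le Q_p(f_1) + Q_p(f_2). \]
Property (b) is obvious.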

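Remarks 3.2. Let (X, A, µ) be a measure space, let K be one of the fields R or C, and let p ∈ [1, ∞).

A. If $f \in \mathfrak{L}^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$ and if g : X → K is a measurable function, with g = f, µ-a.e., then $g \in \mathfrak{L}^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$, and $Q_p(g) = Q_p(f)$.

B. If we define the space
\[ \mathfrak{N}_{\mathbb{K}}(X, \mathcal{A}, \mu) = \bigl\{ f : X \to \mathbb{K} \,:\, f \text{ measurable, } f = 0,\ \mu\text{-a.e.} \bigr\}, \]
then $\mathfrak{N}_{\mathbb{K}}(X, \mathcal{A}, \mu)$ is a linear subspace of $\mathfrak{L}^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$. In fact one has the equality
\[ \mathfrak{N}_{\mathbb{K}}(X, \mathcal{A}, \mu) = \bigl\{ f \in \mathfrak{L}^p_{\mathbb{K}}(X, \mathcal{A}, \mu) \,:\, Q_p(f) = 0 \bigr\}. \]
The inclusion “⊂” is trivial. Conversely, if $f \in \mathfrak{L}^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$ has $Q_p(f) = 0$, then the measurable function g : X → [0, ∞) defined by $g = |f|^p$ will have $\int_X g\,d\mu = 0$. By Exercise 2.3 this forces g = 0, µ-a.e., which clearly gives f = 0, µ-a.e.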

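Definition. Let (X, A, µ) be a measure space, let K be one of the fields R or C, and let p ∈ [1, ∞). We define
\[ L^p_{\mathbb{K}}(X, \mathcal{A}, \mu) = \mathfrak{L}^p_{\mathbb{K}}(X, \mathcal{A}, \mu) / \mathfrak{N}_{\mathbb{K}}(X, \mathcal{A}, \mu). \]
In other words, $L^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$ is the collection of equivalence classes associated with the relation “=, µ-a.e.” For a function $f \in \mathfrak{L}^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$ we denote by [f] its equivalence class in $L^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$. So the equality [f] = [g] is equivalent to f = g, µ-a.e. By the above Remark, there exists a (unique) map $\| \cdot \|_p : L^p_{\mathbb{K}}(X, \mathcal{A}, \mu) \to [0, \infty)$, such that
\[ \| [f] \|_p = Q_p(f), \quad \forall\, f \in \mathfrak{L}^p_{\mathbb{K}}(X, \mathcal{A}, \mu). \]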




By the above Remark, it follows that $\| \cdot \|_p$ is a norm on $L^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$. When K = C the subscript C will be omitted.

Conventions. Let (X, A, µ), K, and p be as above. We are going to abuse the notation a bit, by writing
\[ f \in L^p_{\mathbb{K}}(X, \mathcal{A}, \mu), \]
if f belongs to $\mathfrak{L}^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$. (We will always have in mind the fact that this notation signifies that f is almost uniquely determined.) Likewise, we are going to replace $Q_p(f)$ with $\|f\|_p$.

Given p, q ∈ (1, ∞), with $\frac{1}{p} + \frac{1}{q} = 1$, we use the same notation for the (correctly defined) map $\langle \cdot\,, \cdot \rangle : L^p_{\mathbb{K}}(X, \mathcal{A}, \mu) \times L^q_{\mathbb{K}}(X, \mathcal{A}, \mu) \to \mathbb{K}$.

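Remark 3.3. Let (X, A, µ) be a measure space, let K be either R or C, and let p, q ∈ (1, ∞) be such that $\frac{1}{p} + \frac{1}{q} = 1$. Given $f \in L^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$, we define the map
\[ \Lambda_f : L^q_{\mathbb{K}}(X, \mathcal{A}, \mu) \ni g \longmapsto \langle f, g \rangle \in \mathbb{K}. \]
According to Proposition 3.2, the map $\Lambda_f$ is linear, continuous, and has norm $\|\Lambda_f\| = \|f\|_p$. If we denote by $L^q_{\mathbb{K}}(X, \mathcal{A}, \mu)^*$ the Banach space of all linear continuous maps $L^q_{\mathbb{K}}(X, \mathcal{A}, \mu) \to \mathbb{K}$, then we have a correspondence
\[ L^p_{\mathbb{K}}(X, \mathcal{A}, \mu) \ni f \longmapsto \Lambda_f \in L^q_{\mathbb{K}}(X, \mathcal{A}, \mu)^* \tag{6} \]
which is linear and isometric. This correspondence will be analyzed later in Section 5.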

Notation. Given a sequence $(f_n)_{n=1}^\infty$, and a function f, in $L^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$, we are going to write
\[ f = L^p\text{-}\lim_{n \to \infty} f_n, \]
if $(f_n)_{n=1}^\infty$ converges to f in the norm topology, i.e. $\lim_{n \to \infty} \|f_n - f\|_p = 0$.

The following technical result is very useful in the study of L p spaces.

Theorem 3.2 ($L^p$ Dominated Convergence Theorem). Let (X, A, µ) be a measure space, let K be one of the fields R or C, let p ∈ [1, ∞) and let $(f_n)_{n=1}^\infty$ be a sequence in $L^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$. Assume f : X → K is a measurable function, such that

(i) $f = \mu\text{-a.e.-}\lim_{n \to \infty} f_n$;

(ii) there exists some function $g \in L^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$, such that $|f_n| \le |g|$, µ-a.e., ∀ n ≥ 1.

Then $f \in L^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$, and one has the equality
\[ f = L^p\text{-}\lim_{n \to \infty} f_n. \]

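Proof. Consider the functions $\varphi_n = |f_n|^p$, n ≥ 1, and $\varphi = |f|^p$, and $\psi = |g|^p$. Notice that

• $\varphi = \mu\text{-a.e.-}\lim_{n \to \infty} \varphi_n$;
• $|\varphi_n| \le \psi$, µ-a.e., ∀ n ≥ 1;
• $\psi \in \mathfrak{L}^1_+(X, \mathcal{A}, \mu)$.

We can apply the Lebesgue Dominated Convergence Theorem, so we get the fact that $\varphi \in \mathfrak{L}^1_+(X, \mathcal{A}, \mu)$, which gives the fact that $f \in L^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$. Now if we consider the functions $\eta_n = |f_n - f|^p$, and $\eta = 2^{p-1} \bigl( |g|^p + |f|^p \bigr)$, then we have (use Exercise 1):

• $0 = \mu\text{-a.e.-}\lim_{n \to \infty} \eta_n$;
• $|\eta_n| \le \eta$, µ-a.e., ∀ n ≥ 1;
• $\eta \in \mathfrak{L}^1_+(X, \mathcal{A}, \mu)$.

Again using the Lebesgue Dominated Convergence Theorem, we get
\[ \lim_{n \to \infty} \int_X \eta_n\,d\mu = 0, \]
which means that
\[ \lim_{n \to \infty} \int_X |f_n - f|^p\,d\mu = 0, \]
which reads $\lim_{n \to \infty} \bigl( \|f_n - f\|_p \bigr)^p = 0$, so we clearly have $f = L^p\text{-}\lim_{n \to \infty} f_n$.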

Our main goal is to prove that the $L^p$ spaces are Banach spaces. The key result which gives this, but also has some other interesting consequences, is the following.

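Theorem 3.3. Let (X, A, µ) be a measure space, let K be one of the fields R or C, let p ∈ [1, ∞) and let $(f_k)_{k=1}^\infty$ be a sequence in $L^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$, such that
\[ \sum_{k=1}^\infty \|f_k\|_p < \infty. \]
Consider the sequence $(g_n)_{n=1}^\infty \subset L^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$ of partial sums:
\[ g_n = \sum_{k=1}^n f_k, \quad n \ge 1. \]
Then there exists a function $g \in L^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$, such that

(a) $g = \mu\text{-a.e.-}\lim_{n \to \infty} g_n$;

(b) $g = L^p\text{-}\lim_{n \to \infty} g_n$.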

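Proof. Denote the sum $\sum_{k=1}^\infty \|f_k\|_p$ simply by S. For each integer n ≥ 1, define the function $h_n : X \to [0, \infty]$, by
\[ h_n(x) = \sum_{k=1}^n |f_k(x)|, \quad \forall\, x \in X. \]
It is clear that $h_n \in L^p_{\mathbb{R}}(X, \mathcal{A}, \mu)$, and we also have
\[ \|h_n\|_p \le \sum_{k=1}^n \|f_k\|_p \le S, \quad \forall\, n \ge 1. \tag{7} \]
Notice also that $0 \le h_1 \le h_2 \le \dots$. Define then the function h : X → [0, ∞] by
\[ h(x) = \lim_{n \to \infty} h_n(x), \quad \forall\, x \in X. \]
Claim: $h \in L^p_{\mathbb{R}}(X, \mathcal{A}, \mu)$.

To prove this fact, we define the functions $\varphi = h^p$ and $\varphi_n = (h_n)^p$, n ≥ 1. Notice that we have

• $0 \le \varphi_1 \le \varphi_2 \le \dots$;
• $\varphi_n \in L^1_{\mathbb{R}}(X, \mathcal{A}, \mu)$, ∀ n ≥ 1;
• $\lim_{n \to \infty} \varphi_n(x) = \varphi(x)$, ∀ x ∈ X;
• $\sup \bigl\{ \int_X \varphi_n\,d\mu \,:\, n \ge 1 \bigr\} \le S^p$.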




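Here the last property follows from (7). Using the Lebesgue Monotone Convergence Theorem, it then follows that $h^p = \varphi \in L^1_{\mathbb{R}}(X, \mathcal{A}, \mu)$, so h indeed belongs to $L^p_{\mathbb{R}}(X, \mathcal{A}, \mu)$.

Let us consider now the set $N = \{ x \in X : h(x) = \infty \}$. On the one hand, since we also have
\[ N = \{ x \in X : \varphi(x) = \infty \}, \]
and ϕ is integrable, it follows that N ∈ A, and µ(N) = 0. On the other hand, since
\[ \sum_{k=1}^\infty |f_k(x)| = h(x) < \infty, \quad \forall\, x \in X \setminus N, \]
it follows that, for each $x \in X \setminus N$, the series $\sum_{k=1}^\infty f_k(x)$ is convergent. Let us define then g : X → K by
\[ g(x) = \begin{cases} \sum_{k=1}^\infty f_k(x) & \text{if } x \in X \setminus N \\ 0 & \text{if } x \in N \end{cases} \]
It is obvious that g is measurable, and we have
\[ g = \mu\text{-a.e.-}\lim_{n \to \infty} g_n. \]
Since we have
\[ |g_n| = \Bigl| \sum_{k=1}^n f_k \Bigr| \le \sum_{k=1}^n |f_k| = h_n \le h, \quad \forall\, n \ge 1, \]
using the Claim, and Theorem 3.2, it follows that g indeed belongs to $L^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$ and we also have the equality $g = L^p\text{-}\lim_{n \to \infty} g_n$.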

Corollary 3.3. Let (X, A, µ) be a measure space, and let K be one of the fields R or C. Then L pK(X, A, µ) is a Banach space, for each p ∈ [1, ∞).

Proof. This is immediate from the above result, combined with the completeness criterion given by Remark II.3.1.

Another interesting consequence of Theorem 3.3 is the following.

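Corollary 3.4. Let (X, A, µ) be a measure space, let K be one of the fields R or C, let p ∈ [1, ∞), and let $f \in L^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$. Any sequence $(f_n)_{n=1}^\infty \subset L^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$, with $f = L^p\text{-}\lim_{n \to \infty} f_n$, has a subsequence $(f_{n_k})_{k=1}^\infty$ such that $f = \mu\text{-a.e.-}\lim_{k \to \infty} f_{n_k}$.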

Proof. Without any loss of generality, we can assume that f = 0, so that we have
\[ \lim_{n \to \infty} \|f_n\|_p = 0. \]
Choose then integers $1 \le n_1 < n_2 < \dots$, such that
\[ \|f_{n_k}\|_p \le \frac{1}{2^k}, \quad \forall\, k \ge 1. \]
If we define the functions
\[ g_m = \sum_{k=1}^m f_{n_k}, \]
then by Theorem 3.3, it follows that there exists some $g \in L^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$, such that
\[ g = \mu\text{-a.e.-}\lim_{m \to \infty} g_m. \]
This means that there exists some N ∈ A, with µ(N) = 0, such that
\[ \lim_{m \to \infty} g_m(x) = g(x), \quad \forall\, x \in X \setminus N. \]
In other words, for each $x \in X \setminus N$, the series $\sum_{k=1}^\infty f_{n_k}(x)$ is convergent (to some number g(x) ∈ K). In particular, it follows that
\[ \lim_{k \to \infty} f_{n_k}(x) = 0, \quad \forall\, x \in X \setminus N, \]
so we indeed have $0 = \mu\text{-a.e.-}\lim_{k \to \infty} f_{n_k}$.

The following result collects some properties of $L^p$ spaces in the case when the underlying measure space is finite.

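Proposition 3.3. Suppose (X, A, µ) is a finite measure space, and K is one of the fields R or C.

(i) If f : X → K is a bounded measurable function, then $f \in \mathfrak{L}^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$, ∀ p ∈ [1, ∞).

(ii) For any p, q ∈ [1, ∞), with p < q, one has the inclusion $\mathfrak{L}^q_{\mathbb{K}}(X, \mathcal{A}, \mu) \subset \mathfrak{L}^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$. So taking quotients by $\mathfrak{N}_{\mathbb{K}}(X, \mathcal{A}, \mu)$, one gets an inclusion of vector spaces
\[ L^q_{\mathbb{K}}(X, \mathcal{A}, \mu) \hookrightarrow L^p_{\mathbb{K}}(X, \mathcal{A}, \mu). \tag{8} \]
Moreover the above inclusion is a continuous linear map.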

Proof. The key property that we are going to use here is the fact that the constant function $1 = \kappa_X$ is µ-integrable (being elementary µ-integrable).

(i). This part is pretty clear, because if we start with a bounded measurable function f : X → K and we take $M = \sup_{x \in X} |f(x)|$, then the inequality $|f|^p \le M^p \cdot 1$, combined with the integrability of 1, will force the integrability of $|f|^p$, i.e. $f \in \mathfrak{L}^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$.

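(ii). Fix 1 ≤ p < q < ∞, as well as a function $f \in \mathfrak{L}^q_{\mathbb{K}}(X, \mathcal{A}, \mu)$. Consider the numbers $r = \frac{q}{p} > 1$, and $s = \frac{r}{r-1}$, so that we have $\frac{1}{r} + \frac{1}{s} = 1$. Since $f \in \mathfrak{L}^q_{\mathbb{K}}(X, \mathcal{A}, \mu)$, the function $g = |f|^q$ belongs to $\mathfrak{L}^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$. If we define then the function $h = |f|^p$, then we obviously have $g = h^r$, so we get the fact that h belongs to $\mathfrak{L}^r_{\mathbb{K}}(X, \mathcal{A}, \mu)$. Using part (i), we get the fact that $1 \in \mathfrak{L}^s_{\mathbb{K}}(X, \mathcal{A}, \mu)$, so by Corollary 3.1, it follows that h = 1 · h belongs to $\mathfrak{L}^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, and moreover, one has the inequality
\[ \begin{aligned} \int_X |f|^p\,d\mu = \int_X h\,d\mu &\le \|1\|_s \cdot \|h\|_r = \Bigl( \int_X 1\,d\mu \Bigr)^{1/s} \cdot \Bigl( \int_X h^r\,d\mu \Bigr)^{1/r} \\ &= \mu(X)^{1/s} \cdot \Bigl( \int_X |f|^q\,d\mu \Bigr)^{1/r} = \mu(X)^{1/s} \cdot \bigl( \|f\|_q \bigr)^{q/r}. \end{aligned} \]
On the one hand, this inequality proves that $f \in \mathfrak{L}^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$. On the other hand, this also gives the inequality
\[ \bigl( \|f\|_p \bigr)^p \le \mu(X)^{1/s} \cdot \bigl( \|f\|_q \bigr)^{q/r} = \mu(X)^{1 - \frac{p}{q}} \cdot \bigl( \|f\|_q \bigr)^p, \]
which yields
\[ \|f\|_p \le \mu(X)^{\frac{1}{p} - \frac{1}{q}} \cdot \|f\|_q. \]
This proves that the linear map (8) is continuous (and has norm no greater than $\mu(X)^{\frac{1}{p} - \frac{1}{q}}$).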
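The norm estimate just obtained can be tried out on a finite measure space, where the norms are finite sums. The sketch below compares $\|f\|_p$ with the bound $\mu(X)^{1/p - 1/q} \|f\|_q$ for a made-up example; the helper names `lp_norm` and `inclusion_bound` and the data are assumptions made for the illustration.

```python
def lp_norm(f, mu, p):
    """The p-norm of f on a finite set; mu[x] is the mass of the point x."""
    return sum(abs(x) ** p * m for x, m in zip(f, mu)) ** (1 / p)


def inclusion_bound(f, mu, p, q):
    """Right-hand side mu(X)^(1/p - 1/q) * ||f||_q of the estimate, for p < q."""
    return sum(mu) ** (1 / p - 1 / q) * lp_norm(f, mu, q)


mu = [0.5, 2.0, 1.0, 0.25]    # a finite measure given by point masses (made-up data)
f = [1.0, -4.0, 0.0, 2.5]
pairs = [(1.0, 2.0), (1.5, 4.0), (2.0, 3.0)]
checks = [lp_norm(f, mu, p) <= inclusion_bound(f, mu, p, q) * (1 + 1e-12) for p, q in pairs]
```

All entries of `checks` are true for this data. For a constant function the two sides agree, which suggests that the constant $\mu(X)^{1/p - 1/q}$ cannot be improved in general.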




Exercise 2 . Give an example of a sequence of continuous functions $f_n : [0, 1] \to [0, \infty)$, n ≥ 1, such that $L^p\text{-}\lim_{n \to \infty} f_n = 0$, ∀ p ∈ [1, ∞), but for which it is not true that $0 = \mu\text{-a.e.-}\lim_{n \to \infty} f_n$. (Here we work on the measure space $([0, 1], \mathcal{M}_\lambda([0, 1]), \lambda)$.)

Exercise 3 . Let $\Omega \subset \mathbb{R}^n$ be an open set. Prove that $C_c^{\mathbb{K}}(\Omega)$ is dense in $L^p_{\mathbb{K}}(\Omega, \mathcal{M}_\lambda(\Omega), \lambda)$, for every p ∈ [1, ∞). (Here λ denotes the n-dimensional Lebesgue measure, and $\mathcal{M}_\lambda(\Omega)$ denotes the collection of all Lebesgue measurable subsets of Ω.)

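Notations. Let (X, A, µ) be a measure space, let K be one of the fields R or C. We define the space
\[ \mathfrak{N}_{\mathbb{K},\mathrm{elem}}(X, \mathcal{A}, \mu) = \mathfrak{L}^1_{\mathbb{K},\mathrm{elem}}(X, \mathcal{A}, \mu) \cap \mathfrak{N}_{\mathbb{K}}(X, \mathcal{A}, \mu), \]
and we define the quotient space
\[ L^1_{\mathbb{K},\mathrm{elem}}(X, \mathcal{A}, \mu) = \mathfrak{L}^1_{\mathbb{K},\mathrm{elem}}(X, \mathcal{A}, \mu) / \mathfrak{N}_{\mathbb{K},\mathrm{elem}}(X, \mathcal{A}, \mu). \]
In other words, if one considers the quotient map
\[ \Pi_1 : \mathfrak{L}^1_{\mathbb{K}}(X, \mathcal{A}, \mu) \to L^1_{\mathbb{K}}(X, \mathcal{A}, \mu), \]
then $L^1_{\mathbb{K},\mathrm{elem}}(X, \mathcal{A}, \mu) = \Pi_1\bigl( \mathfrak{L}^1_{\mathbb{K},\mathrm{elem}}(X, \mathcal{A}, \mu) \bigr)$. Notice that we have the obvious inclusion
\[ \mathfrak{L}^1_{\mathbb{K},\mathrm{elem}}(X, \mathcal{A}, \mu) \subset \mathfrak{L}^p_{\mathbb{K}}(X, \mathcal{A}, \mu), \quad \forall\, p \in [1, \infty), \]
so if we consider the quotient map
\[ \Pi_p : \mathfrak{L}^p_{\mathbb{K}}(X, \mathcal{A}, \mu) \to L^p_{\mathbb{K}}(X, \mathcal{A}, \mu), \]
we can also define the subspace
\[ L^p_{\mathbb{K},\mathrm{elem}}(X, \mathcal{A}, \mu) = \Pi_p\bigl( \mathfrak{L}^1_{\mathbb{K},\mathrm{elem}}(X, \mathcal{A}, \mu) \bigr), \quad \forall\, p \in [1, \infty). \]
Remark that, as vector spaces, the spaces $L^p_{\mathbb{K},\mathrm{elem}}(X, \mathcal{A}, \mu)$ are identical, since $\operatorname{Ker} \Pi_p = \mathfrak{N}_{\mathbb{K}}(X, \mathcal{A}, \mu)$, ∀ p ∈ [1, ∞).

With these notations we have the following fact.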

Proposition 3.4. L pK,elem(X, A, µ) is dense in L pK(X, A, µ), for each p ∈[1, ∞).

Proof. Fix p ∈ [1, ∞), and start with some $f \in L^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$. What we need to prove is the existence of a sequence $(f_n)_{n=1}^\infty \subset L^1_{\mathbb{K},\mathrm{elem}}(X, \mathcal{A}, \mu)$, such that $f = L^p\text{-}\lim_{n \to \infty} f_n$. Taking real and imaginary parts (in the case K = C), it suffices to consider the case when f is real valued. Since |f| also belongs to $L^p$, it follows that $f^+ = \max\{f, 0\} = \frac{1}{2}\bigl( |f| + f \bigr)$, and $f^- = \max\{-f, 0\} = \frac{1}{2}\bigl( |f| - f \bigr)$ both belong to $L^p$, so in fact we can assume that f is non-negative. Consider the function $g = f^p \in \mathfrak{L}^1_+(X, \mathcal{A}, \mu)$. Use the definition of the integral to find a sequence $(g_n)_{n=1}^\infty \subset \mathfrak{L}^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$, such that

• $0 \le g_n \le g$, ∀ n ≥ 1;
• $\lim_{n \to \infty} \int_X g_n\,d\mu = \int_X g\,d\mu$.

This gives the fact that $g = L^1\text{-}\lim_{n \to \infty} g_n$. Using Corollary 3.4, after replacing $(g_n)_{n=1}^\infty$ with a subsequence, we can also assume that $g = \mu\text{-a.e.-}\lim_{n \to \infty} g_n$. If we put $f_n = (g_n)^{1/p}$, ∀ n ≥ 1, we now have

• $0 \le f_n \le f$, ∀ n ≥ 1;
• $f = \mu\text{-a.e.-}\lim_{n \to \infty} f_n$.

Obviously, the $f_n$'s are still elementary integrable, and by the $L^p$ Dominated Convergence Theorem, we indeed get $f = L^p\text{-}\lim_{n \to \infty} f_n$.

Comments. A. The above result gives us the fact that $L^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$ is the completion of $L^p_{\mathbb{K},\mathrm{elem}}(X, \mathcal{A}, \mu)$. This allows for the following alternative construction of the $L^p$ spaces.

B. For a measurable function f : X → K, by the (proof of the) above result, it follows that the condition $f \in L^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$ is equivalent to the equality $f = \mu\text{-a.e.-}\lim_{n \to \infty} f_n$, for some sequence $(f_n)_{n=1}^\infty$ of elementary integrable functions, which is Cauchy in the $L^p$ norm, i.e.

(c) for every ε > 0, there exists $N_\varepsilon$, such that
\[ \|f_m - f_n\|_p < \varepsilon, \quad \forall\, m, n \ge N_\varepsilon. \]

One key feature, which will be heavily exploited in the next section, deals with the Banach space $L^2$ (the case p = 2), for which we have the following.

Proposition 3.5. Let (X, A, µ) be a measure space, and let K be one of the fields R or C.

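(i) The map $( \cdot \,|\, \cdot ) : L^2_{\mathbb{K}}(X, \mathcal{A}, \mu) \times L^2_{\mathbb{K}}(X, \mathcal{A}, \mu) \to \mathbb{K}$, given by
\[ ( f \,|\, g ) = \langle \bar{f}, g \rangle = \int_X \bar{f} g\,d\mu, \quad \forall\, f, g \in L^2_{\mathbb{K}}(X, \mathcal{A}, \mu), \]
defines an inner product on $L^2_{\mathbb{K}}(X, \mathcal{A}, \mu)$.

(ii) One has the equality
\[ \|f\|_2 = \sqrt{( f \,|\, f )}, \quad \forall\, f \in L^2_{\mathbb{K}}(X, \mathcal{A}, \mu). \]
Consequently, $L^2_{\mathbb{K}}(X, \mathcal{A}, \mu)$ is a Hilbert space.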

Proof. The properties of the inner product are immediate, from the properties

of integration. The second property is also clear.

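Remark 3.4. The main byproduct of the above feature is the fact that the correspondence (6) is an isometric isomorphism, in the case p = q = 2. This follows from the Riesz Theorem (only the surjectivity is the issue here; the rest has been discussed in Remark 3.3). If $\varphi : L^2_{\mathbb{K}}(X, \mathcal{A}, \mu) \to \mathbb{K}$ is a linear continuous map, then there exists some $h \in L^2_{\mathbb{K}}(X, \mathcal{A}, \mu)$, such that
\[ \varphi(g) = ( h \,|\, g ), \quad \forall\, g \in L^2_{\mathbb{K}}(X, \mathcal{A}, \mu). \]
If we put $f = \bar{h}$, then the above equality gives
\[ \varphi(g) = \langle f, g \rangle, \quad \forall\, g \in L^2_{\mathbb{K}}(X, \mathcal{A}, \mu), \]
i.e. $\varphi = \Lambda_f$.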

Comments. Eventually (see Section 5) we shall prove that the correspondence (6) is surjective also in the general case.

The correspondence (6) also has a version for q = 1. This would require the definition of an $L^p$ space for the case p = ∞. We shall postpone this until we reach Section 5. The next exercise hints towards such a construction.

Exercise 4 ♦. Let (X, A, µ) be a measure space, let K be one of the fields R or C, and let f : X → K be a bounded measurable function. Define $M = \sup_{x \in X} |f(x)|$. Prove the following.




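(i) Whenever $g \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, it follows that the function f g also belongs to $L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, and one has the inequality
\[ \|f g\|_1 \le M \cdot \|g\|_1. \]
(ii) The map
\[ \Lambda_f : L^1_{\mathbb{K}}(X, \mathcal{A}, \mu) \ni g \longmapsto \int_X f g\,d\mu \in \mathbb{K} \]
is linear and continuous. Moreover, one has the inequality $\|\Lambda_f\| \le M$.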

Remark 3.5. If we apply the above Exercise to the constant function f = 1, we get the (already known) fact that the integration map
\[ \Lambda_1 : L^1_{\mathbb{K}}(X, \mathcal{A}, \mu) \ni g \longmapsto \int_X g\,d\mu \in \mathbb{K} \tag{9} \]
is linear and continuous, and has norm $\|\Lambda_1\| \le 1$. The following exercise gives the exact value of the norm.

Exercise 5 . With the notations above, prove that the following are equivalent:

(i) the measure space (X, A, µ) is non-degenerate, i.e. there exists A ∈ A with 0 < µ(A) < ∞;

(ii) $L^1_{\mathbb{K}}(X, \mathcal{A}, \mu) \ne \{0\}$;

(iii) the integration map (9) has norm $\|\Lambda_1\| = 1$.



Lectures 36-37

4. Radon-Nikodym Theorems

In this section we discuss a very important property which has many important applications.

Definition. Let X be a non-empty set, and let A be a σ-algebra on X. Given two measures µ and ν on A, we say that ν has the Radon-Nikodym property relative to µ, if there exists a measurable function f : X → [0, ∞], such that
\[ \nu(A) = \int_A f\,d\mu, \quad \forall\, A \in \mathcal{A}. \tag{1} \]
Here we use the convention which defines the integral in the right hand side by
\[ \int_A f\,d\mu = \begin{cases} \int_X f \kappa_A\,d\mu & \text{if } f \kappa_A \in \mathfrak{L}^1_+(X, \mathcal{A}, \mu) \\ \infty & \text{if } f \kappa_A \notin \mathfrak{L}^1_+(X, \mathcal{A}, \mu) \end{cases} \]
In this case, we say that f is a density for ν relative to µ.

The Radon-Nikodym property has an equivalent useful formulation.

Proposition 4.1 (Change of Variables). Let X be a non-empty set, and let A be a σ-algebra on X , let µ and ν be measures on A, and let f : X → [0, ∞] be a measurable function.

A. The following are equivalent:

(i) ν has the Radon-Nikodym property relative to µ, and f is a density for ν relative to µ;

(ii) for every measurable function h : X → [0, ∞], one has the equality (see footnote 16)
\[ \int_X h\,d\nu = \int_X h f\,d\mu. \tag{2} \]
B. If ν and f are as above, and K is either R or C, then the equality (2) also holds for those measurable functions h : X → K with $h \in \mathfrak{L}^1_{\mathbb{K}}(X, \mathcal{A}, \nu)$ and $h f \in \mathfrak{L}^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$.

Proof. A. (i) ⇒ (ii). Assume property (i) holds, which means that we have (1). Fix a measurable function h : X → [0, ∞], and use Theorem III.3.2 to find a sequence $(h_n)_{n=1}^\infty \subset \mathcal{A}\text{-Elem}_{\mathbb{R}}(X)$, with

(a) $0 \le h_1 \le h_2 \le \dots \le h$;

(b) $\lim_{n \to \infty} h_n(x) = h(x)$, ∀ x ∈ X.

Of course, we also have

(a) $0 \le h_1 f \le h_2 f \le \dots \le h f$;

16 For the product h f we use the conventions 0 · ∞ = ∞ · 0 = 0, and t · ∞ = ∞ · t = ∞, ∀ t ∈ (0, ∞].


(b) $\lim_{n \to \infty} h_n(x) f(x) = h(x) f(x)$, ∀ x ∈ X.

Using the Monotone Convergence Theorem, we then get the equalities
\[ \int_X h\,d\nu = \lim_{n \to \infty} \int_X h_n\,d\nu \quad\text{and}\quad \int_X h f\,d\mu = \lim_{n \to \infty} \int_X h_n f\,d\mu. \tag{3} \]
Notice that, if we fix n and we write $h_n = \sum_{k=1}^p \alpha_k \kappa_{A_k}$, for some $A_1, \dots, A_p \in \mathcal{A}$, and $\alpha_1 > \dots > \alpha_p > 0$, then
\[ \int_X h_n\,d\nu = \sum_{k=1}^p \alpha_k \nu(A_k) = \sum_{k=1}^p \int_X \alpha_k \kappa_{A_k} f\,d\mu = \int_X h_n f\,d\mu, \]
so using (3), we immediately get (2).

The implication (ii) ⇒ (i) is trivial, using functions of the form $h = \kappa_A$, A ∈ A.

B. Suppose ν has the Radon-Nikodym property relative to µ, and f is a density for ν relative to µ, and let h : X → K be a measurable function with $h \in \mathfrak{L}^1_{\mathbb{K}}(X, \mathcal{A}, \nu)$ and $h f \in \mathfrak{L}^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$. In the complex case, using the inequalities |Re h| ≤ |h| and |Im h| ≤ |h|, it is clear that both functions Re h and Im h belong to $\mathfrak{L}^1(X, \mathcal{A}, \nu)$, and also the products (Re h)f and (Im h)f belong to $\mathfrak{L}^1(X, \mathcal{A}, \mu)$. This shows that it suffices to prove (2) under the additional hypothesis that h is real-valued. In this case we consider the functions $h^\pm$, defined by
\[ h^+ = \max\{h, 0\} \quad\text{and}\quad h^- = \max\{-h, 0\}. \]
Since we have $0 \le h^\pm \le |h|$, it follows that $h^\pm \in \mathfrak{L}^1_+(X, \mathcal{A}, \nu)$, as well as $h^\pm f \in \mathfrak{L}^1_+(X, \mathcal{A}, \mu)$. In particular, we get the equalities
\[ \int_X h\,d\nu = \int_X h^+\,d\nu - \int_X h^-\,d\nu \quad\text{and}\quad \int_X h f\,d\mu = \int_X h^+ f\,d\mu - \int_X h^- f\,d\mu. \tag{4} \]
Since $h^\pm \ge 0$, we can use property A.(ii) above, and we have
\[ \int_X h^\pm\,d\nu = \int_X h^\pm f\,d\mu, \]
and then the desired equality (2) immediately follows from (4).
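On a finite set every function is elementary, so the computation used in part A of the proof can be carried out literally: write h through its level sets, integrate against ν via ν(A) = ∫_A f dµ, and compare with ∫ h f dµ. The sketch below does this with exact rational arithmetic; the function names and the sample data are made up for the illustration.

```python
from fractions import Fraction


def nu(A, f, mu):
    """nu(A) = integral of f over A with respect to mu, for a subset A of {0, ..., n-1}."""
    return sum(f[x] * mu[x] for x in A)


def integrate_against_nu(h, f, mu):
    """Integral of h with respect to nu, using h = sum of alpha * (indicator of {h = alpha})."""
    points = range(len(h))
    total = Fraction(0)
    for alpha in set(h):
        level_set = [x for x in points if h[x] == alpha]
        total += alpha * nu(level_set, f, mu)
    return total


def integrate_hf_against_mu(h, f, mu):
    """Integral of h * f with respect to mu."""
    return sum(h[x] * f[x] * mu[x] for x in range(len(h)))


# Hypothetical sample data: point masses, a density f, and a test function h.
mu = [Fraction(1, 2), Fraction(1), Fraction(2), Fraction(1, 4)]
f = [Fraction(2), Fraction(0), Fraction(1), Fraction(4)]
h = [Fraction(3), Fraction(3), Fraction(1), Fraction(5)]
```

Both `integrate_against_nu(h, f, mu)` and `integrate_hf_against_mu(h, f, mu)` evaluate to 10 here, in agreement with (2).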

One important issue is the uniqueness of the density. For this purpose, it will be helpful to introduce the following.

Definition. Let T be one of the spaces [−∞, ∞] or C, and let r be some relation on T (in our case r will be either “=,” or “≥,” or “≤,” on [−∞, ∞]). Given a measure space $(X, \mathcal{A}, \mu)$, and two measurable functions $f_1, f_2 : X \to T$, we write
\[ f_1 \ r\ f_2, \ \mu\text{-l.a.e.} \]
if the set
\[ A = \{ x \in X : f_1(x) \ r\ f_2(x) \} \]
belongs to $\mathcal{A}$, and it has locally µ-null complement in X, i.e. $\mu\bigl( [X \setminus A] \cap F \bigr) = 0$, for every set $F \in \mathcal{A}$ with µ(F) < ∞. (If r is one of the relations listed above, the set A automatically belongs to $\mathcal{A}$, so all intersections $[X \setminus A] \cap F$, $F \in \mathcal{A}$, also belong to $\mathcal{A}$.) The abbreviation “µ-l.a.e.” stands for “µ-locally-almost everywhere.” Remark that one has the implication

f 1 r f 2, µ-a.e. ⇒ f 1 r f 2, µ-l.a.e.

Remark that, when µ is σ-finite, then the other implication also holds:

f 1 r f 2, µ-l.a.e. ⇒ f 1 r f 2, µ-a.e.




With this terminology, one has the following uniqueness result.

Proposition 4.2. Suppose A is a σ-algebra on some non-empty set X, and µ and ν are measures on A, such that ν has the Radon-Nikodym property relative to µ. If f, g : X → [0, ∞] are densities for ν relative to µ, then

f = g, µ-l.a.e.

In particular, if µ is σ-finite, then

f = g, µ-a.e.

Proof. Consider the set $B = \{ x \in X : f(x) \ne g(x) \}$, which belongs to A. We need to prove that B is locally µ-null, i.e. one has µ(B ∩ F) = 0, for all F ∈ A with µ(F) < ∞. Fix F ∈ A with µ(F) < ∞, and let us write B ∩ F = D ∪ E, where
\[ D = \{ x \in B \cap F : f(x) < g(x) \} \quad\text{and}\quad E = \{ x \in B \cap F : f(x) > g(x) \}. \]
If we define, for each integer n ≥ 1, the sets
\[ D_n = \{ x \in B \cap F : f(x) + \tfrac{1}{n} \le g(x) \} \quad\text{and}\quad E_n = \{ x \in B \cap F : f(x) \ge g(x) + \tfrac{1}{n} \}, \]
then it is clear that
\[ B \cap F = D \cup E = \bigcup_{n=1}^\infty (D_n \cup E_n), \]
so in order to prove that µ(B ∩ F) = 0, it suffices to show that $\mu(D_n) = \mu(E_n) = 0$, ∀ n ≥ 1.

Fix n ≥ 1. It is obvious that f(x) < ∞, ∀ $x \in D_n$, so if we define the sequence $(D_n^k)_{k=1}^\infty \subset \mathcal{A}$, by
\[ D_n^k = \{ x \in D_n : f(x) \le k \}, \quad \forall\, k \ge 1, \]
we have the equality $D_n = \bigcup_{k=1}^\infty D_n^k$, so in order to prove that $\mu(D_n) = 0$, it suffices to show that $\mu(D_n^k) = 0$, ∀ k ≥ 1. On the one hand, since f(x) ≤ k, ∀ $x \in D_n^k$, using the inclusion $D_n^k \subset F$, we get
\[ \nu(D_n^k) = \int_{D_n^k} f\,d\mu \le \int_X k \kappa_{D_n^k}\,d\mu = k \mu(D_n^k) \le k \mu(F) < \infty. \]
On the other hand, since $g(x) \ge f(x) + \frac{1}{n}$, ∀ $x \in D_n^k$, we get
\[ \begin{aligned} \nu(D_n^k) = \int_{D_n^k} g\,d\mu &\ge \int_X \bigl( f \kappa_{D_n^k} + \tfrac{1}{n} \kappa_{D_n^k} \bigr)\,d\mu \\ &= \int_X f \kappa_{D_n^k}\,d\mu + \int_X \tfrac{1}{n} \kappa_{D_n^k}\,d\mu = \nu(D_n^k) + \tfrac{1}{n} \mu(D_n^k). \end{aligned} \]
Since $\nu(D_n^k) < \infty$, the above inequality forces $\mu(D_n^k) = 0$.

The fact that µ(E n) = 0, ∀ n ≥ 1, is proven the exact same way.

In general, the uniqueness of the density does not hold µ-a.e., as is seen from the following.

Example 4.1. Take X to be some non-empty set, put $\mathcal{A} = \{ \varnothing, X \}$, and define the measure µ on A by µ(∅) = 0 and µ(X) = ∞. It is clear that µ has the Radon-Nikodym property relative to itself, but as densities one can choose for instance the constant functions f = 1 and g = 2. Clearly, the equality f = g, µ-a.e. is not true.




Remark 4.1. The local almost uniqueness result, given in Proposition 4.2, holds under slightly weaker assumptions. Namely, if (X, A, µ) is a measure space, and if f, g : X → [0, ∞] are measurable functions for which we have the equality
\[ \int_A f\,d\mu = \int_A g\,d\mu, \]
for all A ∈ A with µ(A) < ∞, then we still have the equality f = g, µ-l.a.e. This follows actually from Proposition 4.2, applied to functions of the form $f|_A$ and $g|_A$.

Let us introduce the following.

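Notations. For a measure space (X, A, µ) we define
\[ \begin{aligned} \mathcal{A}^\mu_0 &= \{ N \in \mathcal{A} : \mu(N) = 0 \}; \\ \mathcal{A}^\mu_{\mathrm{fin}} &= \{ F \in \mathcal{A} : \mu(F) < \infty \}; \\ \mathcal{A}^\mu_{0,\mathrm{loc}} &= \{ A \in \mathcal{A} : \mu(A \cap F) = 0,\ \forall\, F \in \mathcal{A}^\mu_{\mathrm{fin}} \}. \end{aligned} \]
With these notations, we have the inclusions $\mathcal{A}^\mu_0 = \mathcal{A}^\mu_{0,\mathrm{loc}} \cap \mathcal{A}^\mu_{\mathrm{fin}} \subset \mathcal{A}^\mu_{0,\mathrm{loc}} \subset \mathcal{A}$, and $\mathcal{A}^\mu_0$ and $\mathcal{A}^\mu_{0,\mathrm{loc}}$ are in fact σ-rings.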

Comment. The “locally-almost everywhere” terminology is actually designed to “hide some pathologies under the rug.” For instance, if (X, A, µ) is a degenerate measure space, i.e. µ(A) ∈ {0, ∞}, ∀ A ∈ A, then “anything happens locally almost-everywhere,” which means that we have the equality $\mathcal{A}^\mu_{0,\mathrm{loc}} = \mathcal{A}$.

At the other end, there is a particular type of measure spaces on which, even in the absence of σ-finiteness, the notions of “locally-almost everywhere” and “almost everywhere” coincide, i.e. we have the equality $\mathcal{A}^\mu_{0,\mathrm{loc}} = \mathcal{A}^\mu_0$. Such spaces are described by the following.

Definition. A measure space (X, A, µ) is said to be nowhere degenerate, or to have the finite subset property, if

(f) for every set A ∈ A with µ(A) > 0, there exists some set F ∈ A, with F ⊂ A, and 0 < µ(F ) < ∞.

With this terminology, one has the following result.

Proposition 4.3. For a measure space (X, A, µ), the following are equivalent:

(i) Aµ0,loc = Aµ0 ;

(ii) (X, A, µ) has the finite subset property.

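Proof. (i) ⇒ (ii). Assume $\mathcal{A}^\mu_{0,\mathrm{loc}} = \mathcal{A}^\mu_0$, and let us prove that $(X, \mathcal{A}, \mu)$ has the finite subset property. We argue by contradiction, so let us assume there exists some set $A \in \mathcal{A}$, with $\mu(A) = \infty$, such that $\mu(B) \in \{0, \infty\}$ for every $B \in \mathcal{A}$ with $B \subset A$. (A set of positive measure that violates (f) must have infinite measure, since otherwise F = A itself would satisfy (f).) In particular, if we start with some arbitrary $F \in \mathcal{A}^\mu_{\mathrm{fin}}$, using the fact that $\mu(A \cap F) \le \mu(F) < \infty$, we see that we must have $\mu(A \cap F) = 0$. This proves precisely that $A \in \mathcal{A}^\mu_{0,\mathrm{loc}}$. By assumption, it follows that $A \in \mathcal{A}^\mu_0$, i.e. $\mu(A) = 0$, which is impossible.

(ii) ⇒ (i). Assume that $(X, \mathcal{A}, \mu)$ has the finite subset property, and let us prove the equality (i). Since one inclusion is always true, all we need to prove is the inclusion $\mathcal{A}^\mu_{0,\mathrm{loc}} \subset \mathcal{A}^\mu_0$, which is equivalent to the inclusion $\mathcal{A}^\mu_{0,\mathrm{loc}} \subset \mathcal{A}^\mu_{\mathrm{fin}}$. Start with some set $A \in \mathcal{A}^\mu_{0,\mathrm{loc}}$, but assume $\mu(A) = \infty$. On the one hand, using the finite subset property, there exists some set $F \in \mathcal{A}$ with $F \subset A$ and $0 < \mu(F) < \infty$. On the other hand, since $A \in \mathcal{A}^\mu_{0,\mathrm{loc}}$ and $F \in \mathcal{A}^\mu_{\mathrm{fin}}$, we have $\mu(F) = \mu(A \cap F) = 0$, which is impossible.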

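Example 4.2. Take X to be an uncountable set, let $\mathcal{A} = \mathcal{P}(X)$, and let µ be the counting measure, i.e.

$$\mu(A) = \begin{cases} \operatorname{Card} A & \text{if } A \text{ is finite} \\ \infty & \text{if } A \text{ is infinite} \end{cases}$$

Then $(X, \mathcal{P}(X), \mu)$ has the finite subset property, but is not σ-finite.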

When we restrict to integrable functions, the two notions µ-l.a.e. and µ-a.e. coincide. More precisely, we have the following.

Proposition 4.4. Let $(X, \mathcal{A}, \mu)$ be a measure space, let K be one of the fields R or C, and let p ∈ [1, ∞). For a function $f \in L^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$, the following are equivalent:

(i) f = 0, µ-l.a.e.

(ii) f = 0, µ-a.e.

Proof. Of course, we only need to prove the implication (i) ⇒ (ii). Assume f = 0, µ-l.a.e. Using the function $g = |f|^p$, we can assume that p = 1 and f(x) ≥ 0, ∀ x ∈ X. Consider then the set $N = \{x \in X : f(x) > 0\}$, and write it as a union $N = \bigcup_{n=1}^\infty N_n$, where

$$N_n = \{x \in X : f(x) \ge \tfrac{1}{n}\}, \quad \forall n \ge 1.$$

Of course, all we need is the fact that $\mu(N_n) = 0$, ∀ n ≥ 1. Fix n ≥ 1. On the one hand, by the assumption on f, it follows that $N_n \in \mathcal{A}^\mu_{0,\mathrm{loc}}$. On the other hand, the inequality $\tfrac{1}{n}\kappa_{N_n} \le f$ forces the elementary function $\tfrac{1}{n}\kappa_{N_n}$ to be µ-integrable, i.e. $\mu(N_n) < \infty$. Consequently we have

$$N_n \in \mathcal{A}^\mu_{0,\mathrm{loc}} \cap \mathcal{A}^\mu_{\mathrm{fin}} = \mathcal{A}^\mu_0.$$

Comment. In what follows we will discuss several results, which all have as conclusion the fact that one measure has the Radon-Nikodym property with respect to another one. All such results will be called “Radon-Nikodym Theorems.”

The first result is in fact quite general, in the sense that it works for finite signed or complex measures.

Theorem 4.1 (“Easy” Radon-Nikodym Theorem). Let $(X, \mathcal{A}, \mu)$ be a finite measure space, let K denote one of the fields R or C, and let C > 0 be some constant. Suppose ν is a K-valued measure on $\mathcal{A}$, such that

$$|\nu(A)| \le C\mu(A), \quad \forall A \in \mathcal{A}.$$

Then there exists some function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, such that

$$\nu(A) = \int_A f\,d\mu, \quad \forall A \in \mathcal{A}. \tag{5}$$

Moreover:

(i) Any function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ satisfying (5) has the property |f| ≤ C, µ-a.e. If ν is an “honest” measure, then one also has the inequality f ≥ 0, µ-a.e.

(ii) A function satisfying (5) is essentially unique, in the sense that, whenever $f_1, f_2 \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ satisfy (5), it follows that $f_1 = f_2$, µ-a.e.
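On a finite set with the power set as σ-algebra, the statement of Theorem 4.1 can be verified by hand: the density is the ratio of point masses wherever µ charges a point. The sketch below assumes hypothetical point masses `mu` and `nu` chosen so that |ν(A)| ≤ Cµ(A) holds with C = 1; it illustrates the statement only and is not the Hilbert-space argument used in the proof.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical point masses on X = {a, b, c}; nu is a signed measure.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(0)}
nu = {"a": Fraction(-1, 2), "b": Fraction(1, 8), "c": Fraction(0)}
C = Fraction(1)

def subsets(points):
    """All subsets of a finite set, i.e. the power-set sigma-algebra."""
    pts = sorted(points)
    return [frozenset(c) for r in range(len(pts) + 1)
            for c in combinations(pts, r)]

def measure(weights, subset):
    """Measure of a subset as the sum of its point masses."""
    return sum((weights[x] for x in subset), Fraction(0))

def density(nu, mu):
    """f(x) = nu({x}) / mu({x}) where mu({x}) > 0; on mu-null points the
    value is irrelevant (mu-a.e.), so 0 is used as an arbitrary choice."""
    return {x: (nu[x] / mu[x] if mu[x] != 0 else Fraction(0)) for x in mu}

def integrate(f, mu, subset):
    """Integral of f over a subset with respect to mu."""
    return sum((f[x] * mu[x] for x in subset), Fraction(0))

f = density(nu, mu)
```

Here `f` takes the values −1, 1/2 and 0, so |f| ≤ C everywhere, in agreement with property (i), and integrating `f` against `mu` over any subset reproduces `nu`.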




Proof. The idea is to somehow make sense of $\int_X h\,d\nu$, for suitable measurable functions h, and to examine the properties of such a number relative to the integral $\int_X h\,d\mu$. The second integral is of course defined, for instance for $h \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, but the first integral is not, because ν is not an “honest” measure. The proof will be carried out in several steps.

Step 1: There exist four “honest” finite measures $\nu_k$, k = 1, 2, 3, 4, and numbers $\alpha_k$, k = 1, 2, 3, 4, such that $\nu = \alpha_1\nu_1 + \alpha_2\nu_2 + \alpha_3\nu_3 + \alpha_4\nu_4$, and

$$\nu_k \le C\mu, \quad \forall k = 1, 2, 3, 4. \tag{6}$$

In the case K = R we use the Hahn-Jordan decomposition $\nu = \nu^+ - \nu^-$. We also know that $\nu^\pm \le |\nu|$, the variation measure of ν. In this case we take $\alpha_1 = 1$, $\nu_1 = \nu^+$, $\alpha_2 = -1$, $\nu_2 = \nu^-$, and we set $\nu_3 = \nu_4 = 0$, $\alpha_3 = \alpha_4 = 0$.

In the case K = C, we write $\nu = \eta + i\lambda$, with η and λ finite signed measures, and we use the Hahn-Jordan decompositions $\eta = \eta^+ - \eta^-$ and $\lambda = \lambda^+ - \lambda^-$. We also know that the variation measures of η and λ satisfy $|\eta| \le |\nu|$ and $|\lambda| \le |\nu|$, so we also have $\eta^\pm \le |\nu|$ and $\lambda^\pm \le |\nu|$. In this case we can then take $\alpha_1 = 1$, $\nu_1 = \eta^+$, $\alpha_2 = -1$, $\nu_2 = \eta^-$, $\alpha_3 = i$, $\nu_3 = \lambda^+$, $\alpha_4 = -i$, $\nu_4 = \lambda^-$.

Notice that in either case we have

$$\nu_k \le |\nu|, \quad \forall k = 1, 2, 3, 4.$$

By Remark III.8.5 it follows that we have $|\nu| \le C\mu$, so we immediately get the inequalities (6).

Step 2: For any measurable function $h : X \to [0, \infty]$, one has the inequality

$$\int_X h\,d\nu_k \le C\int_X h\,d\mu, \quad \forall k = 1, 2, 3, 4. \tag{7}$$

To prove this, we choose a sequence of elementary functions $(h_n)_{n=1}^\infty \subset \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, with

• $0 \le h_1 \le h_2 \le \dots$ (everywhere),
• $\lim_{n\to\infty} h_n(x) = h(x)$, ∀ x ∈ X,

so that by the General Monotone Convergence Theorem, we get the equalities

$$\int_X h\,d\mu = \lim_{n\to\infty}\int_X h_n\,d\mu \quad\text{and}\quad \int_X h\,d\nu_k = \lim_{n\to\infty}\int_X h_n\,d\nu_k, \quad \forall k = 1, 2, 3, 4.$$

This means that, in order to prove (7), it suffices to prove it under the extra assumption that h is elementary. In this case, we have

$$h = \beta_1\kappa_{B_1} + \dots + \beta_p\kappa_{B_p},$$

with $\beta_1, \dots, \beta_p \ge 0$ and $B_1, \dots, B_p \in \mathcal{A}$. The inequality is then immediate from (6), since we have

$$\int_X h\,d\nu_k = \sum_{j=1}^p \beta_j\nu_k(B_j) \le C\sum_{j=1}^p \beta_j\mu(B_j) = C\int_X h\,d\mu.$$

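As a consequence of Step 2, we get the fact that, for every k = 1, 2, 3, 4, one has the inclusions

$$L^1_{\mathbb{K}}(X, \mathcal{A}, \mu) \subset L^1_{\mathbb{K}}(X, \mathcal{A}, \nu_k) \quad\text{and}\quad N_{\mathbb{K}}(X, \mathcal{A}, \mu) \subset N_{\mathbb{K}}(X, \mathcal{A}, \nu_k).$$

Taking quotients, this gives rise to correctly defined linear maps

$$\Phi_k : L^1_{\mathbb{K}}(X, \mathcal{A}, \mu) \ni h \longmapsto h \in L^1_{\mathbb{K}}(X, \mathcal{A}, \nu_k), \quad k = 1, 2, 3, 4. \tag{8}$$

(Here we use the abusive notation that identifies an element of the quotient space $L^1$ with a representing function, which is defined almost uniquely.) Moreover, one has the inequality

$$\int_X |h|\,d\nu_k \le C\int_X |h|\,d\mu, \quad \forall h \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu),\ k = 1, 2, 3, 4,$$

in other words, the linear maps (8) are all continuous. For every k = 1, 2, 3, 4, let $\varphi_k$ denote the integration map

$$\varphi_k : L^1_{\mathbb{K}}(X, \mathcal{A}, \nu_k) \ni h \longmapsto \int_X h\,d\nu_k \in \mathbb{K}.$$

We know (see Remark 3.5) that the $\varphi_k$'s are continuous. In particular, the compositions $\psi_k = \varphi_k \circ \Phi_k : L^1_{\mathbb{K}}(X, \mathcal{A}, \mu) \to \mathbb{K}$, which are defined by

$$\psi_k : L^1_{\mathbb{K}}(X, \mathcal{A}, \mu) \ni h \longmapsto \int_X h\,d\nu_k, \quad k = 1, 2, 3, 4,$$

are linear and continuous.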

We now use Proposition 3.3, which states that one has an inclusion

$$\Theta : L^2_{\mathbb{K}}(X, \mathcal{A}, \mu) \hookrightarrow L^1_{\mathbb{K}}(X, \mathcal{A}, \mu), \tag{9}$$

which is in fact a linear continuous map. So if we consider the compositions $\theta_k = \psi_k \circ \Theta$, which are defined by

$$\theta_k : L^2_{\mathbb{K}}(X, \mathcal{A}, \mu) \ni h \longmapsto \int_X h\,d\nu_k, \quad k = 1, 2, 3, 4,$$

then these compositions are linear and continuous. Apply then the Riesz Theorem (in the form given in Remark 3.4), to find functions $f_1, f_2, f_3, f_4 \in L^2_{\mathbb{K}}(X, \mathcal{A}, \mu)$, such that

$$\theta_k(h) = \langle f_k, h\rangle, \quad \forall h \in L^2_{\mathbb{K}}(X, \mathcal{A}, \mu),\ k = 1, 2, 3, 4.$$

In particular, using functions of the form $h = \kappa_A$, $A \in \mathcal{A}$ (which all belong to $L^2_{\mathbb{K}}(X, \mathcal{A}, \mu)$, due to the finiteness of µ), we get

$$\nu_k(A) = \int_X \kappa_A\,d\nu_k = \int_X f_k\kappa_A\,d\mu, \quad \forall A \in \mathcal{A},\ k = 1, 2, 3, 4.$$

Finally, if we define the function $f = \alpha_1 f_1 + \alpha_2 f_2 + \alpha_3 f_3 + \alpha_4 f_4 \in L^2_{\mathbb{K}}(X, \mathcal{A}, \mu)$, then the above equalities immediately give the equality (5).

At this point we only know that f belongs to $L^2_{\mathbb{K}}(X, \mathcal{A}, \mu)$. Using the inclusion (9), it turns out that f indeed belongs to $L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$.

Let us prove now the additional properties (i) and (ii).

To prove the first assertion in (i), we start off by fixing some function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, which satisfies (5), and we define the set

$$A = \{x \in X : |f(x)| > C\},$$

for which we must prove that µ(A) = 0. Since f is measurable, it follows that A belongs to $\mathcal{A}$. Consider the “rational unit sphere” $S^1_{\mathbb{Q}}$ in K, defined as

$$S^1_{\mathbb{Q}} = \begin{cases} \{-1, 1\} & \text{if } \mathbb{K} = \mathbb{R} \\ \{e^{2\pi i t} : t \in \mathbb{Q}\} & \text{if } \mathbb{K} = \mathbb{C} \end{cases} \tag{10}$$

The point is that $S^1_{\mathbb{Q}}$ is dense in the unit sphere $S^1$ in K:

$$S^1 = \{\alpha \in \mathbb{K} : |\alpha| = 1\},$$




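so we immediately have the equality $A = \bigcup_{\alpha \in S^1_{\mathbb{Q}}} A_\alpha$, where

$$A_\alpha = \{x \in X : \operatorname{Re}[\alpha f(x)] > C\}.$$

Since $S^1_{\mathbb{Q}}$ is countable, in order to prove that µ(A) = 0, it then suffices to show that $\mu(A_\alpha) = 0$, ∀ $\alpha \in S^1_{\mathbb{Q}}$. Fix then $\alpha \in S^1_{\mathbb{Q}}$, and consider the K-valued measure $\eta = \alpha\nu$. It is clear that we still have

$$|\eta(A)| = |\nu(A)| \le C\mu(A), \quad \forall A \in \mathcal{A}, \tag{11}$$

as well as the equality

$$\eta(A) = \int_A \alpha f\,d\mu, \quad \forall A \in \mathcal{A}. \tag{12}$$

For each integer n ≥ 1, let us define the set

$$A^n_\alpha = \{x \in X : \operatorname{Re}[\alpha f(x)] \ge C + \tfrac{1}{n}\},$$

so that we obviously have the equality $A_\alpha = \bigcup_{n=1}^\infty A^n_\alpha$. In particular, in order to prove $\mu(A_\alpha) = 0$, it suffices to prove that $\mu(A^n_\alpha) = 0$, ∀ n ≥ 1. Fix for the moment n ≥ 1. Using (12), it follows that

$$\operatorname{Re}\eta(A^n_\alpha) = \operatorname{Re}\Big[\int_{A^n_\alpha} \alpha f\,d\mu\Big] = \int_{A^n_\alpha} \operatorname{Re}[\alpha f]\,d\mu = \int_X \operatorname{Re}[\alpha f]\,\kappa_{A^n_\alpha}\,d\mu.$$

Since we have $\operatorname{Re}[\alpha f]\,\kappa_{A^n_\alpha} \ge (C + \tfrac{1}{n})\kappa_{A^n_\alpha}$, the above equality can be continued with

$$\operatorname{Re}\eta(A^n_\alpha) \ge \int_X (C + \tfrac{1}{n})\kappa_{A^n_\alpha}\,d\mu = (C + \tfrac{1}{n})\mu(A^n_\alpha).$$

Of course, this will give

$$|\eta(A^n_\alpha)| \ge \operatorname{Re}\eta(A^n_\alpha) \ge (C + \tfrac{1}{n})\mu(A^n_\alpha).$$

Note now that, using (11), this will finally give

$$C\mu(A^n_\alpha) \ge (C + \tfrac{1}{n})\mu(A^n_\alpha),$$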

which clearly forces $\mu(A^n_\alpha) = 0$ (recall that µ is finite).

Having proven that |f| ≤ C, µ-a.e., let us turn our attention now to the uniqueness property (ii). Suppose $f_1, f_2 \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ are such that

$$\nu(A) = \int_A f_1\,d\mu = \int_A f_2\,d\mu, \quad \forall A \in \mathcal{A}.$$

Consider then the difference $f = f_1 - f_2$ and the trivial measure $\nu_0 = 0$. Obviously we have

$$|\nu_0(A)| \le \tfrac{1}{n}\mu(A), \quad \forall A \in \mathcal{A},$$

for every integer n ≥ 1, as well as

$$\nu_0(A) = \int_A f\,d\mu, \quad \forall A \in \mathcal{A}.$$

By the first assertion in (i), it follows that

$$|f_1 - f_2| = |f| \le \tfrac{1}{n}, \quad \mu\text{-a.e.},$$

for every n ≥ 1. So if we take the sets $(N_n)_{n=1}^\infty \subset \mathcal{A}$ defined by

$$N_n = \{x \in X : |f_1(x) - f_2(x)| > \tfrac{1}{n}\},$$

then $\mu(N_n) = 0$, ∀ n ≥ 1. Of course, if we put $N = \bigcup_{n=1}^\infty N_n$, then on the one hand we have µ(N) = 0, and on the other hand, we have

$$f_1(x) - f_2(x) = 0, \quad \forall x \in X \setminus N,$$

which means that we indeed have $f_1 = f_2$, µ-a.e.

Finally, let us prove the second assertion in (i), which starts with the assumption that ν is an “honest” measure. Let $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ satisfy (5). By the uniqueness property (ii), it follows immediately that

$$f = \operatorname{Re} f, \quad \mu\text{-a.e.},$$

so we can assume that f is already real-valued. Consider the “honest” measure $\omega = C\mu - \nu$, and notice that the function $g : X \to \mathbb{R}$ defined by

$$g(x) = C - f(x), \quad \forall x \in X,$$

clearly has the property

$$\omega(A) = \int_A g\,d\mu, \quad \forall A \in \mathcal{A}.$$

Since we obviously have

$$0 \le \omega(A) \le C\mu(A), \quad \forall A \in \mathcal{A},$$

by the first assertion of (i), applied to the measure ω and the function g, it follows that |g| ≤ C, µ-a.e. In other words, we have now a combined inequality:

$$\max\{|f|, |C - f|\} \le C, \quad \mu\text{-a.e.}$$

Of course, since f is real valued, this forces f ≥ 0, µ-a.e.

In what follows we are going to offer various generalizations of Theorem 4.1. There are several directions in which Theorem 4.1 can be generalized. The main direction, which we present here, will aim at weakening the condition |ν| ≤ Cµ. The following result explains that in fact the case of K-valued measures can always be reduced to the case of “honest” finite ones.

Proposition 4.5 (Polar Decomposition). Let $\mathcal{A}$ be a σ-algebra on a non-empty set X, let K be one of the fields R or C, and let ν be a K-valued measure on $\mathcal{A}$. Let |ν| denote the variation measure of ν. There exists some function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, |\nu|)$, such that

$$\nu(A) = \int_A f\,d|\nu|, \quad \forall A \in \mathcal{A}. \tag{13}$$

Moreover:

(i) Any function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, |\nu|)$ satisfying (13) has the property |f| = 1, |ν|-a.e.

(ii) A function satisfying (13) is essentially unique, in the sense that, whenever $f_1, f_2 \in L^1_{\mathbb{K}}(X, \mathcal{A}, |\nu|)$ satisfy (13), it follows that $f_1 = f_2$, |ν|-a.e.
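For a complex measure given by point masses on a finite set, the polar decomposition is simply the polar form of each mass. The sketch below assumes hypothetical complex point masses `nu`; it uses the fact that, on the power set of a finite set, the variation measure has point masses |ν({x})|.

```python
# Hypothetical complex point masses on X = {a, b, c}.
nu = {"a": 3 + 4j, "b": -2j, "c": 0j}

# Variation measure: for point masses, |nu|({x}) = |nu({x})|.
variation = {x: abs(z) for x, z in nu.items()}

# Density of nu relative to |nu|: the unit "phase" of each point mass.
# On |nu|-null points the value is irrelevant; 1 is an arbitrary choice.
f = {x: (nu[x] / variation[x] if variation[x] != 0 else 1 + 0j) for x in nu}

def integrate(f, weights, subset):
    """Integral of f over a subset with respect to the point masses."""
    return sum(f[x] * weights[x] for x in subset)
```

Each value of `f` has modulus 1 (up to floating-point rounding), and integrating `f` against `variation` recovers `nu`, as in (13) and property (i).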

Proof. We know that

$$|\nu(A)| \le |\nu|(A), \quad \forall A \in \mathcal{A}.$$

So if we apply Theorem 4.1 for the finite measure µ = |ν| and C = 1, we immediately get the existence of $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, |\nu|)$, satisfying (13). Again by Theorem 4.1, the uniqueness property (ii) is automatic, and we also have |f| ≤ 1, |ν|-a.e. To prove the fact that we have in fact the equality |f| = 1, |ν|-a.e., we define the set

$$A = \{x \in X : |f(x)| < 1\},$$

which belongs to $\mathcal{A}$, and we prove that |ν|(A) = 0. If we define the sequence of sets $(A_n)_{n=1}^\infty \subset \mathcal{A}$, by

$$A_n = \{x \in X : |f(x)| \le 1 - \tfrac{1}{n}\}, \quad \forall n \ge 1,$$

then we clearly have $A = \bigcup_{n=1}^\infty A_n$, so all we have to show is the fact that $|\nu|(A_n) = 0$, ∀ n ≥ 1. Fix n ≥ 1. For every $B \in \mathcal{A}$, with $B \subset A_n$, we have

$$|f(x)| \le 1 - \tfrac{1}{n}, \quad \forall x \in B,$$

so using (13) we get

$$|\nu(B)| = \Big|\int_B f\,d|\nu|\Big| \le \int_B |f|\,d|\nu| \le \int_B (1 - \tfrac{1}{n})\,d|\nu| = (1 - \tfrac{1}{n})|\nu|(B).$$

Now if we take an arbitrary pairwise disjoint sequence $(B_k)_{k=1}^\infty \subset \mathcal{A}$, with $\bigcup_{k=1}^\infty B_k = A_n$, then the above estimate will give

$$\sum_{k=1}^\infty |\nu(B_k)| \le (1 - \tfrac{1}{n})\sum_{k=1}^\infty |\nu|(B_k) = (1 - \tfrac{1}{n})|\nu|(A_n).$$

Taking the supremum in the left hand side, and using the definition of the variation measure, the above estimate will finally give

$$|\nu|(A_n) \le (1 - \tfrac{1}{n})|\nu|(A_n),$$

which clearly forces $|\nu|(A_n) = 0$.

Remark 4.2. The case K = R can be slightly generalized, to include the case of infinite signed measures. If ν is a signed measure on $\mathcal{A}$ and if we consider the Hahn-Jordan set decomposition $(X^+, X^-)$, then the density f is simply the function

$$f(x) = \begin{cases} 1 & \text{if } x \in X^+ \\ -1 & \text{if } x \in X^- \end{cases}$$

The equality (13) will then hold only for those sets $A \in \mathcal{A}$ with |ν|(A) < ∞. Since |ν| is allowed to be infinite, as explained in Example 4.1, the uniqueness property (ii) will hold only with “|ν|-l.a.e.” in place of “|ν|-a.e.” Likewise, the absolute value property (i) will have to be replaced with “|f| = 1, |ν|-l.a.e.”

Comment. Up to this point, it seems that the hypotheses from Theorem 4.1 are essential, particularly the dominance condition |ν| ≤ Cµ. It is worth discussing this property in a bit more detail, especially having in mind that we plan to weaken it as much as possible.

Notation. Suppose A is a σ-algebra on some non-empty set X, and suppose µ and ν are “honest” (not necessarily finite) measures on A. We shall write

ν µ,

if there exists some constant C > 0, such that

ν (A) ≤ Cµ(A), ∀ A ∈ A.

A few steps in the proof of Theorem 4.1 hold even without the finiteness assumption, as indicated by the following.




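Exercise 1*. Suppose $\mathcal{A}$ is a σ-algebra on some non-empty set X, and suppose µ and ν are “honest” measures on $\mathcal{A}$. Prove the following.

(i) If ν is dominated by µ in the sense of the Notation above (that is, ν ≤ Cµ for some constant C > 0), then one has the inclusions

$$N_{\mathbb{K}}(X, \mathcal{A}, \mu) \subset N_{\mathbb{K}}(X, \mathcal{A}, \nu) \quad\text{and}\quad L^p_{\mathbb{K}}(X, \mathcal{A}, \mu) \subset L^p_{\mathbb{K}}(X, \mathcal{A}, \nu), \quad \forall p \in [1, \infty).$$

Consequently (see the proof of Theorem 4.1) one has linear maps

$$L^p_{\mathbb{K}}(X, \mathcal{A}, \mu) \ni h \longmapsto h \in L^p_{\mathbb{K}}(X, \mathcal{A}, \nu), \quad \forall p \in [1, \infty).$$

Show that these linear maps are continuous.

(ii) Conversely, assuming one has the inclusion

$$L^{p_0}_{\mathbb{K}}(X, \mathcal{A}, \mu) \subset L^{p_0}_{\mathbb{K}}(X, \mathcal{A}, \nu),$$

for some $p_0 \in [1, \infty)$, prove that ν is dominated by µ in the same sense.

Hint: To prove (ii) show first that one has the inclusion $L^1_+(X, \mathcal{A}, \mu) \subset L^1_+(X, \mathcal{A}, \nu)$. Then show that the quantity

$$C = \sup\Big\{\int_X h\,d\nu : h \in L^1_+(X, \mathcal{A}, \mu),\ \int_X h\,d\mu \le 1\Big\}$$

is finite. If C = ∞, there exists some sequence $(h_n)_{n=1}^\infty \subset L^1_+(X, \mathcal{A}, \mu)$, with

$$\int_X h_n\,d\mu \le 1 \quad\text{and}\quad \int_X h_n\,d\nu \ge 4^n, \quad \forall n \ge 1.$$

Consider then the series $\sum_{n=1}^\infty \tfrac{1}{2^n}h_n$, and get a contradiction. Finally prove that ν(A) ≤ Cµ(A), ∀ A ∈ $\mathcal{A}$.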

It is now the moment to introduce the following relation, which is a highly non-trivial weakening of the domination relation defined in the Notation above.

Definition. Let $\mathcal{A}$ be a σ-algebra on some non-empty set X, and suppose µ and ν are “honest” (not necessarily finite) measures on $\mathcal{A}$. We say that ν is absolutely continuous with respect to µ, if for every $A \in \mathcal{A}$, one has the implication

$$\mu(A) = 0 \implies \nu(A) = 0. \tag{14}$$

In this case we are going to use the notation

$$\nu \ll \mu.$$

It is obvious that domination of ν by a constant multiple of µ (the relation defined in the Notation above) always implies $\nu \ll \mu$.

Remarks 4.3. Let $(X, \mathcal{A}, \mu)$ be a measure space.

A. If ν is an “honest” measure on $\mathcal{A}$, which has the Radon-Nikodym property relative to µ, then $\nu \ll \mu$. This is pretty obvious, since if we pick $f : X \to [0, \infty]$ to be a density for ν relative to µ, then for every $A \in \mathcal{A}$ with µ(A) = 0, we have $f\kappa_A = 0$, µ-a.e., so we get

$$\nu(A) = \int_A f\,d\mu = \int_X f\kappa_A\,d\mu = 0.$$

B. For an “honest” measure ν on $\mathcal{A}$, the relation $\nu \ll \mu$ is equivalent to the inclusion

$$N_{\mathbb{K}}(X, \mathcal{A}, \mu) \subset N_{\mathbb{K}}(X, \mathcal{A}, \nu).$$

By Exercise 1, this already suggests that the relation $\ll$ is much weaker than the domination relation defined in the Notation above (see Exercise 2 below).

C. If ν is either a signed or a complex measure on $\mathcal{A}$, then the following are equivalent:

(i) the variation measure |ν| is absolutely continuous with respect to µ;

(ii) for every $A \in \mathcal{A}$, one has the implication (14).

The implication (i) ⇒ (ii) is trivial, since one has

$$|\nu(A)| \le |\nu|(A), \quad \forall A \in \mathcal{A}.$$

The implication (ii) ⇒ (i) is also clear, since if we start with some $A \in \mathcal{A}$ with µ(A) = 0, then we get |ν(B)| = 0, for all $B \in \mathcal{A}$ with $B \subset A$, and then arguing exactly as in the proof of Proposition 4.3, we get |ν|(A) = 0.

Convention. Using Remark 4.2.A, we extend the definition of absolute continuity, and the notation $\nu \ll \mu$, to include the case when ν is either a signed measure, or a complex measure on $\mathcal{A}$. In other words, the notation $\nu \ll \mu$ means that $|\nu| \ll \mu$.

The following technical result is key for the second Radon-Nikodym Theorem.

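Lemma 4.1. Let $(X, \mathcal{A}, \mu)$ be a finite measure space, and let ν be an “honest” measure on $\mathcal{A}$, with $\nu \ll \mu$. Then there exists a sequence $(\nu_n)_{n=1}^\infty$ of “honest” measures on $\mathcal{A}$, such that

(i) for every n ≥ 1, the measure $\nu_n$ is dominated by µ in the sense of the Notation above (that is, $\nu_n \le C_n\mu$ for some constant $C_n > 0$); in particular the measures $\nu_n$, n ≥ 1, are all finite;

(ii) $\nu_1 \le \nu_2 \le \dots$;

(iii) $\lim_{n\to\infty} \nu_n(A) = \nu(A)$, ∀ A ∈ $\mathcal{A}$.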

Proof. Let us define

$$\nu_n = (n\mu) \wedge \nu, \quad \forall n \ge 1.$$

Recall (see III.8, the Lattice Property; it is essential here that one of the measures, namely nµ, is finite) that by construction $\nu_n$ has the following properties:

(a) $\nu_n \le n\mu$ and $\nu_n \le \nu$;

(b) whenever ω is a measure with $\omega \le n\mu$ and $\omega \le \nu$, it follows that $\omega \le \nu_n$.

Property (a) above already gives condition (i). It will be helpful to notice that property (a) also gives the inequality

$$\nu_n \le \nu, \quad \forall n \ge 1. \tag{15}$$

The monotonicity condition is now trivial, since by (b) the inequalities $\nu_{n-1} \le (n-1)\mu \le n\mu$ and $\nu_{n-1} \le \nu$ imply $\nu_{n-1} \le (n\mu) \wedge \nu = \nu_n$.

To derive property (iii), it will be helpful to recall the actual definition of the operation ∧. Fix for the moment n ≥ 1. One first considers the signed measure $\lambda_n = n\mu - \nu$, and its Hahn-Jordan decomposition $\lambda_n = \lambda_n^+ - \lambda_n^-$. In our case, we get $\lambda_n^+ \le n\mu$ and $\lambda_n^- \le \nu$. With these notations the measures $\nu_n$ are defined by $\nu_n = n\mu - \lambda_n^+$, ∀ n ≥ 1. If we fix, for each n ≥ 1, a Hahn-Jordan set decomposition $(X_n^+, X_n^-)$ for X relative to $\lambda_n$, then we have

$$\nu_n(A) = \nu(A \cap X_n^+) + n\mu(A \cap X_n^-), \quad \forall A \in \mathcal{A},\ n \ge 1. \tag{16}$$

Consider then the sets $X_\infty^+ = \bigcup_{n=1}^\infty X_n^+$ and $X_\infty^- = \bigcap_{n=1}^\infty X_n^-$. It is clear that $X_\infty^\pm \in \mathcal{A}$, and $X_\infty^- = X \setminus X_\infty^+$.

Fix now a set $A \in \mathcal{A}$, and let us prove the equality (iii). On the one hand, the obvious inclusions $X_n^- \supset X_\infty^-$, combined with (16), give the inequalities

$$\nu_n(A) \ge \nu(A \cap X_n^+) + n\mu(A \cap X_\infty^-), \quad \forall n \ge 1. \tag{17}$$

On the other hand, since $\lambda_{n+1} = \mu + \lambda_n$, ∀ n ≥ 1, using Lemma III.8.2, we get the relations

$$X_1^+ \subset_\mu X_2^+ \subset_\mu \dots.$$

(Recall that the notation $D \subset_\mu E$ stands for $\mu(D \setminus E) = 0$.) Since $\nu \ll \mu$, we also have the relations

$$A \cap X_1^+ \subset_\nu A \cap X_2^+ \subset_\nu \dots,$$

so using Proposition III.4.3, one gets the equality

$$\nu(A \cap X_\infty^+) = \lim_{n\to\infty} \nu(A \cap X_n^+).$$

Combining this with the inequalities (15) and (17) then yields the inequality

$$\nu(A) \ge \limsup_{n\to\infty} \nu_n(A) \ge \liminf_{n\to\infty} \nu_n(A) \ge \nu(A \cap X_\infty^+) + \lim_{n\to\infty}\big[n\mu(A \cap X_\infty^-)\big]. \tag{18}$$

There are two possibilities here.

Case I: $\mu(A \cap X_\infty^-) > 0$.

In this case, the estimate (18) forces

$$\nu(A) = \limsup_{n\to\infty} \nu_n(A) = \liminf_{n\to\infty} \nu_n(A) = \infty.$$

Case II: $\mu(A \cap X_\infty^-) = 0$.

In this case, using absolute continuity, we get $\nu(A \cap X_\infty^-) = 0$, and the equality $A = (A \cap X_\infty^+) \cup (A \cap X_\infty^-)$ yields

$$\nu(A) = \nu(A \cap X_\infty^+).$$

Then (18) forces

$$\limsup_{n\to\infty} \nu_n(A) = \liminf_{n\to\infty} \nu_n(A) = \nu(A).$$

In either case, the conclusion is the same: $\lim_{n\to\infty} \nu_n(A) = \nu(A)$.
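The truncation used in Lemma 4.1 is easy to watch on a finite set, where (for measures given by point masses on the power set) the lattice operation ∧ reduces to the pointwise minimum of the masses. The sketch below assumes hypothetical point masses, with ν taking the value ∞ at one point and vanishing wherever µ does, so that ν is absolutely continuous with respect to µ.

```python
import math

# Hypothetical point masses on X = {a, b, c}: mu is finite, nu is not,
# and nu vanishes on the mu-null point c, so nu << mu.
mu = {"a": 0.5, "b": 0.25, "c": 0.0}
nu = {"a": 3.0, "b": math.inf, "c": 0.0}

def truncated(n):
    """Point masses of nu_n = (n * mu) ^ nu; for point-mass measures the
    infimum of two measures is the pointwise minimum of the masses."""
    return {x: min(n * mu[x], nu[x]) for x in mu}

# A few terms of the increasing sequence nu_1 <= nu_2 <= ...
sequence = [truncated(n) for n in range(1, 9)]
```

At the point a the masses stabilize at ν({a}) = 3 once n ≥ 6 (Case II of the proof), while at b they grow without bound towards ν({b}) = ∞ (Case I).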

After the above preparation, we are now in position to prove the following.

Theorem 4.2 (Radon-Nikodym Theorem: the finite case). Let $(X, \mathcal{A}, \mu)$ be a finite measure space.

A. If ν is an “honest” measure on $\mathcal{A}$, with $\nu \ll \mu$, then there exists a measurable function $f : X \to [0, \infty]$, such that

$$\nu(A) = \int_A f\,d\mu, \quad \forall A \in \mathcal{A}. \tag{19}$$

Moreover, such a function is essentially unique, in the sense that, whenever $f_1, f_2 : X \to [0, \infty]$ are measurable functions that satisfy (19), it follows that $f_1 = f_2$, µ-a.e.

B. Let K be either R or C. If λ is a K-valued measure on $\mathcal{A}$, with $\lambda \ll \mu$, then there exists a function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, such that

$$\lambda(A) = \int_A f\,d\mu, \quad \forall A \in \mathcal{A}. \tag{20}$$

Moreover:

(i) A function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ satisfying (20) is essentially unique, in the sense that, whenever $f_1, f_2 \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ satisfy (20), it follows that $f_1 = f_2$, µ-a.e.

(ii) If $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ is any function satisfying (20), then the variation measure |λ| of λ is given by

$$|\lambda|(A) = \int_A |f|\,d\mu, \quad \forall A \in \mathcal{A}.$$




Proof. A. Use Lemma 4.1 to find a sequence $(\nu_n)_{n=1}^\infty$ of “honest” measures on $\mathcal{A}$, such that

• for every n ≥ 1, $\nu_n$ is dominated by µ in the sense of the Notation above; in particular the measures $\nu_n$, n ≥ 1, are all finite;
• $\nu_1 \le \nu_2 \le \dots$;
• $\lim_{n\to\infty} \nu_n(A) = \nu(A)$, ∀ A ∈ $\mathcal{A}$.

For each n ≥ 1, we apply the “Easy” Radon-Nikodym Theorem 4.1, to find some measurable function $f_n : X \to \mathbb{R}$, such that

$$\nu_n(A) = \int_A f_n\,d\mu, \quad \forall A \in \mathcal{A}.$$

Claim: The sequence $(f_n)_{n=1}^\infty$ satisfies

$$0 \le f_n \le f_{n+1}, \quad \mu\text{-a.e.}, \quad \forall n \ge 1.$$

Fix n ≥ 1. On the one hand, since the $\nu_n$'s are “honest” finite measures, each dominated by µ, by part (i) of Theorem 4.1, it follows that $f_n \ge 0$, µ-a.e. On the other hand, since $\nu_{n+1} - \nu_n$ is also an “honest” finite measure dominated by µ, and with density $f_{n+1} - f_n$, again by part (i) of Theorem 4.1, it follows that $f_{n+1} - f_n \ge 0$, µ-a.e.

Having proven the above Claim, let us define the function $f : X \to [0, \infty]$, by

$$f(x) = \liminf_{n\to\infty} \max\{f_n(x), 0\}, \quad \forall x \in X.$$

It is obvious that f is measurable. By the Claim, we have in fact the equality

$$f = \mu\text{-a.e.-}\lim_{n\to\infty} f_n.$$

Since we also have

$$f\kappa_A = \mu\text{-a.e.-}\lim_{n\to\infty} f_n\kappa_A, \quad \forall A \in \mathcal{A},$$

using the Claim and the Monotone Convergence Theorem, we get

$$\int_A f\,d\mu = \int_X f\kappa_A\,d\mu = \lim_{n\to\infty}\int_X f_n\kappa_A\,d\mu = \lim_{n\to\infty}\int_A f_n\,d\mu = \lim_{n\to\infty}\nu_n(A) = \nu(A), \quad \forall A \in \mathcal{A}.$$

Having shown that f satisfies (19), let us observe that the uniqueness property stated in part A is a consequence of Proposition 4.2.

B. Let λ be a K-valued measure on $\mathcal{A}$, with $\lambda \ll \mu$. In particular, the variation measure |λ| is finite, so by the Polar Decomposition (Proposition 4.5) there exists some measurable function $h : X \to \mathbb{K}$, such that

$$\lambda(A) = \int_A h\,d|\lambda|, \quad \forall A \in \mathcal{A}, \tag{21}$$

and such that |h| = 1, |λ|-a.e. Replacing h with the measurable function $h' : X \to \mathbb{K}$, defined by

$$h'(x) = \begin{cases} h(x) & \text{if } |h(x)| = 1 \\ 1 & \text{if } |h(x)| \ne 1 \end{cases}$$

we can assume that in fact we have

$$|h(x)| = 1, \quad \forall x \in X.$$

Apply then part A to the measure |λ|, which is again absolutely continuous with respect to µ, to find some measurable function $g : X \to [0, \infty]$, such that

$$|\lambda|(A) = \int_A g\,d\mu, \quad \forall A \in \mathcal{A}.$$

Remark that, since

$$\int_X g\,d\mu = |\lambda|(X) < \infty,$$

it follows that $g \in L^1_+(X, \mathcal{A}, \mu)$. Fix for the moment some set $A \in \mathcal{A}$. On the one hand, since

$$|h\kappa_A| \le 1, \tag{22}$$

and |λ| is finite, it follows that $h\kappa_A \in L^1_{\mathbb{K}}(X, \mathcal{A}, |\lambda|)$. On the other hand, since $g \in L^1_+(X, \mathcal{A}, \mu)$, using (22) we get the fact that $h\kappa_A g \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$. Using the Change of Variable formula (Proposition 4.1) we then get the equality

$$\int_X h\kappa_A\,d|\lambda| = \int_X h\kappa_A g\,d\mu,$$

which by (21) reads:

$$\lambda(A) = \int_A hg\,d\mu.$$

Now the function $f_0 = hg$ (which has $|f_0| = g$) belongs to $L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, and clearly satisfies (20).

To prove the uniqueness property (i), we start with two functions $f_1, f_2 \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ which satisfy

$$\int_A f_1\,d\mu = \int_A f_2\,d\mu = \lambda(A), \quad \forall A \in \mathcal{A}.$$

If we define the function $\varphi = f_1 - f_2 \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, then we clearly have

$$\int_A \varphi\,d\mu = \int_A 0\,d\mu = \omega(A), \quad \forall A \in \mathcal{A},$$

where ω is the zero measure. Since ω ≤ µ, using Theorem 4.1 it follows that φ = 0, µ-a.e.

To prove (ii) we start with some $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ that satisfies (20), and we use the uniqueness property (i) to get the equality $f = f_0$, µ-a.e., where $f_0$ is the function constructed above. In particular, using the construction of $f_0$, the fact that $|f_0| = g$, and the fact that g is a density for |λ| relative to µ, we get

$$\int_A |f|\,d\mu = \int_A |f_0|\,d\mu = \int_A g\,d\mu = |\lambda|(A), \quad \forall A \in \mathcal{A}.$$

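At this point we would like to go further, beyond the finite case. The following generalization of Theorem 4.2 is pretty straightforward.

Corollary 4.1 (Radon-Nikodym Theorem: the σ-finite case). Let $(X, \mathcal{A}, \mu)$ be a σ-finite measure space.

A. If ν is an “honest” measure on $\mathcal{A}$, with $\nu \ll \mu$, then there exists a measurable function $f : X \to [0, \infty]$, such that

$$\nu(A) = \int_A f\,d\mu, \quad \forall A \in \mathcal{A}. \tag{23}$$

Moreover, such a function is essentially unique, in the sense that, whenever $f_1, f_2 : X \to [0, \infty]$ are measurable functions that satisfy (23), it follows that $f_1 = f_2$, µ-a.e.

B. Let K be either R or C. If λ is a K-valued measure on $\mathcal{A}$, with $\lambda \ll \mu$, then there exists a function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, such that

$$\lambda(A) = \int_A f\,d\mu, \quad \forall A \in \mathcal{A}. \tag{24}$$

Moreover:

(i) A function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ satisfying (24) is essentially unique, in the sense that, whenever $f_1, f_2 \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ satisfy (24), it follows that $f_1 = f_2$, µ-a.e.

(ii) If $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ is any function satisfying (24), then the variation measure |λ| of λ is given by

$$|\lambda|(A) = \int_A |f|\,d\mu, \quad \forall A \in \mathcal{A}.$$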

Proof. Since µ is σ-finite, there exists a sequence $(A_n)_{n=1}^\infty \subset \mathcal{A}^\mu_{\mathrm{fin}}$, with $\bigcup_{n=1}^\infty A_n = X$. Put $X_1 = A_1$ and $X_n = A_n \setminus (A_1 \cup \dots \cup A_{n-1})$, ∀ n ≥ 2. Then $(X_n)_{n=1}^\infty \subset \mathcal{A}^\mu_{\mathrm{fin}}$ is pairwise disjoint, and we still have $\bigcup_{n=1}^\infty X_n = X$. The Corollary then follows immediately from Theorem 4.2, applied to the measure spaces $(X_n, \mathcal{A}|_{X_n}, \mu|_{X_n})$ and the measures $\nu|_{X_n}$ and $\lambda|_{X_n}$ respectively. What is used here is the fact that, if K denotes one of the sets [0, ∞], R or C, then for a function $f : X \to K$ the fact that f is measurable is equivalent to the fact that $f|_{X_n}$ is measurable for each n ≥ 1. Moreover, given two functions $f_1, f_2 : X \to K$, the condition $f_1 = f_2$, µ-a.e. is equivalent to the fact that $f_1|_{X_n} = f_2|_{X_n}$, µ-a.e., ∀ n ≥ 1. Finally, for $f : X \to \mathbb{K}$ (= R, C), the condition $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ is equivalent to the fact that $f|_{X_n} \in L^1_{\mathbb{K}}(X_n, \mathcal{A}|_{X_n}, \mu|_{X_n})$, ∀ n ≥ 1, and

$$\sum_{n=1}^\infty \int_{X_n} \big|f|_{X_n}\big|\,d\mu|_{X_n} < \infty.$$

Comment. The σ-finite case of the Radon-Nikodym Theorem, given above, is in fact a particular case of a more general version (Theorem 4.3 below). In order to formulate this, we need a concept which has already appeared earlier in III.5. Recall that a measure space $(X, \mathcal{A}, \mu)$ is said to be decomposable, if there exists a pairwise disjoint subcollection $\mathcal{F} \subset \mathcal{A}^\mu_{\mathrm{fin}}$, such that

(i) $\bigcup_{F \in \mathcal{F}} F = X$;

(ii) for a set $A \subset X$, the condition $A \in \mathcal{A}$ is equivalent to the condition

$$A \cap F \in \mathcal{A}, \quad \forall F \in \mathcal{F};$$

(iii) one has the equality

$$\mu(A) = \sum_{F \in \mathcal{F}} \mu(A \cap F), \quad \forall A \in \mathcal{A}^\mu_{\mathrm{fin}}.$$

Such a collection $\mathcal{F}$ is then called a decomposition of $(X, \mathcal{A}, \mu)$. Condition (ii) is referred to as the patching property, because it characterizes measurability as follows.

(p) Given a measurable space $(Y, \mathcal{B})$, a function $f : (X, \mathcal{A}) \to (Y, \mathcal{B})$ is measurable, if and only if all restrictions $f|_F : (F, \mathcal{A}|_F) \to (Y, \mathcal{B})$, $F \in \mathcal{F}$, are measurable.

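Theorem 4.3 (Radon-Nikodym Theorem: the decomposable case). Let $(X, \mathcal{A}, \mu)$ be a decomposable measure space. Let $\mathcal{A}^\mu_{\sigma\text{-fin}}$ be the collection of all µ-σ-finite sets in $\mathcal{A}$, that is,

$$\mathcal{A}^\mu_{\sigma\text{-fin}} = \Big\{A \in \mathcal{A} : \text{there exists } (A_n)_{n=1}^\infty \subset \mathcal{A}^\mu_{\mathrm{fin}}, \text{ with } A = \bigcup_{n=1}^\infty A_n\Big\}.$$

A. If ν is an “honest” measure on $\mathcal{A}$, with $\nu \ll \mu$, then there exists a measurable function $f : X \to [0, \infty]$, such that

$$\nu(A) = \int_A f\,d\mu, \quad \forall A \in \mathcal{A}^\mu_{\sigma\text{-fin}}. \tag{25}$$

Moreover, such a function is locally essentially unique, in the sense that, whenever $f_1, f_2 : X \to [0, \infty]$ are measurable functions that satisfy (25), it follows that $f_1 = f_2$, µ-l.a.e.

B. Let K be either R or C. If λ is a K-valued measure on $\mathcal{A}$, with $\lambda \ll \mu$, then there exists a function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, such that

$$\lambda(A) = \int_A f\,d\mu, \quad \forall A \in \mathcal{A}^\mu_{\sigma\text{-fin}}. \tag{26}$$

Moreover:

(i) A function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ satisfying (26) is essentially unique, in the sense that, whenever $f_1, f_2 \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ satisfy (26), it follows that $f_1 = f_2$, µ-a.e.

(ii) If $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ is any function satisfying (26), then the variation measure |λ| of λ satisfies

$$|\lambda|(A) = \int_A |f|\,d\mu, \quad \forall A \in \mathcal{A}^\mu_{\sigma\text{-fin}}.$$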

Proof. Fix $\mathcal{F}$ to be a decomposition for $(X, \mathcal{A}, \mu)$.

A. For every $F \in \mathcal{F}$, we apply Theorem 4.2 to the measure space $(F, \mathcal{A}|_F, \mu|_F)$ and the measure $\nu|_F$, to find some measurable function $f_F : F \to [0, \infty]$, such that

$$\nu(A) = \int_A f_F\,d\mu, \quad \forall A \in \mathcal{A}|_F.$$

Using the patching property, there exists a measurable function $f : X \to [0, \infty]$, such that $f|_F = f_F$, ∀ $F \in \mathcal{F}$. The key feature we are going to prove is a particular case of (25).

Claim 1: $\nu(A) = \int_A f\,d\mu$, ∀ $A \in \mathcal{A}^\mu_{\mathrm{fin}}$.

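Fix $A \in \mathcal{A}^\mu_{\mathrm{fin}}$. On the one hand, we know that

$$\mu(A) = \sum_{F \in \mathcal{F}} \mu(A \cap F).$$

Since the sum is finite, it follows that the subcollection

$$\mathcal{F}(A) = \{F \in \mathcal{F} : \mu(A \cap F) > 0\}$$

is at most countable. We then form the set $\tilde{A} = \bigcup_{F \in \mathcal{F}(A)} [A \cap F]$, which is clearly a subset of A. The difference $D = A \setminus \tilde{A}$ has again µ(D) < ∞, so its measure is also given as

$$\mu(D) = \sum_{F \in \mathcal{F}} \mu(D \cap F).$$

Notice however that we have $\mu(D \cap F) = 0$, ∀ $F \in \mathcal{F}$. (If $F \in \mathcal{F}(A)$, we already have $D \cap F = \emptyset$, whereas if $F \in \mathcal{F} \setminus \mathcal{F}(A)$, we have $D \cap F \subset A \cap F$, with $\mu(A \cap F) = 0$.) Using then the above equality, we get µ(D) = 0. By absolute continuity we also get ν(D) = 0. Using the equality $A = \tilde{A} \cup D$, and σ-additivity (it is essential here that $\mathcal{F}(A)$ is countable), it follows that

$$\nu(A) = \nu(\tilde{A}) = \sum_{F \in \mathcal{F}(A)} \nu(A \cap F).$$

Using the defining property of the functions $f_F = f|_F$, we then get

$$\nu(A) = \sum_{F \in \mathcal{F}(A)} \int_{A \cap F} f\,d\mu. \tag{27}$$

Now if we list $\mathcal{F}(A) = \{F_k\}_{k=1}^\infty$, and if we take a partial sum, we have

$$\sum_{k=1}^n \int_{A \cap F_k} f\,d\mu = \int_{G_n} f\,d\mu = \int_X f\kappa_{G_n}\,d\mu,$$

where

$$G_n = \bigcup_{k=1}^n [A \cap F_k], \quad \forall n \ge 1.$$

It is clear that we have

• $f\kappa_{G_1} \le f\kappa_{G_2} \le \dots$,
• $\lim_{n\to\infty} (f\kappa_{G_n})(x) = (f\kappa_{\tilde{A}})(x)$, ∀ x ∈ X,

so using the Monotone Convergence Theorem, it follows that

$$\lim_{n\to\infty} \int_X f\kappa_{G_n}\,d\mu = \int_X f\kappa_{\tilde{A}}\,d\mu = \int_{\tilde{A}} f\,d\mu.$$

Using (27) we then get

$$\nu(A) = \lim_{n\to\infty} \int_X f\kappa_{G_n}\,d\mu = \int_{\tilde{A}} f\,d\mu.$$

On the other hand, since $\mu(A \setminus \tilde{A}) = 0$, it follows that

$$\int_{\tilde{A}} f\,d\mu = \int_A f\,d\mu,$$

so the preceding equality immediately gives the desired equality

$$\nu(A) = \int_A f\,d\mu.$$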

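At this point let us remark that the local almost uniqueness of f already follows from Remark 4.1.

Let us prove now the equality (25). Start with some set $A \in \mathcal{A}^\mu_{\sigma\text{-fin}}$, and choose a sequence $(A_n)_{n=1}^\infty \subset \mathcal{A}^\mu_{\mathrm{fin}}$, such that $A = \bigcup_{n=1}^\infty A_n$. Define the sequence $(B_n)_{n=1}^\infty$ by

$$B_n = A_1 \cup \dots \cup A_n, \quad \forall n \ge 1,$$

so that we still have $B_n \in \mathcal{A}^\mu_{\mathrm{fin}}$, ∀ n ≥ 1, as well as $A = \bigcup_{n=1}^\infty B_n$, but moreover we have $B_1 \subset B_2 \subset \dots$. For each n ≥ 1, using Claim 1, we have the equality

$$\nu(B_n) = \int_{B_n} f\,d\mu.$$

Using these equalities, combined with

• $0 \le f\kappa_{B_1} \le f\kappa_{B_2} \le \dots$,
• $\lim_{n\to\infty} (f\kappa_{B_n})(x) = (f\kappa_A)(x)$, ∀ x ∈ X,

the Monotone Convergence Theorem, combined with the continuity of ν along the increasing sequence $(B_n)_{n=1}^\infty$, yields

$$\int_A f\,d\mu = \int_X f\kappa_A\,d\mu = \lim_{n\to\infty}\int_X f\kappa_{B_n}\,d\mu = \lim_{n\to\infty}\int_{B_n} f\,d\mu = \lim_{n\to\infty}\nu(B_n) = \nu(A).$$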

B. We start off by choosing a measurable function $h : X \to \mathbb{K}$, with |h| = 1, such that

$$\lambda(A) = \int_A h\,d|\lambda|, \quad \forall A \in \mathcal{A}.$$

Using part A, there exists some measurable function $g_0 : X \to [0, \infty]$, such that

$$|\lambda|(A) = \int_A g_0\,d\mu, \quad \forall A \in \mathcal{A}^\mu_{\sigma\text{-fin}}. \tag{28}$$

At this point, $g_0$ may not be integrable, but we have the freedom to perturb it (µ-l.a.e.) to try to make it integrable. This is done as follows. Consider the collection

$$\mathcal{F}_0 = \{F \in \mathcal{F} : |\lambda|(F) > 0\}.$$

Since |λ| is finite, it follows that $\mathcal{F}_0$ is at most countable. Define then the set $X_0 = \bigcup_{F \in \mathcal{F}_0} F \in \mathcal{A}^\mu_{\sigma\text{-fin}}$. Since $X_0$ is µ-σ-finite, every set $A \in \mathcal{A}$ with $A \subset X_0$ is µ-σ-finite, so we have

$$|\lambda|(A) = \int_A g_0\,d\mu, \quad \forall A \in \mathcal{A}|_{X_0}.$$

Applying the σ-finite version of the Radon-Nikodym Theorem to the σ-finite measure space $(X_0, \mathcal{A}|_{X_0}, \mu|_{X_0})$ and the finite measure $\lambda|_{X_0}$, it follows that the density $g_0|_{X_0}$ belongs to $L^1_+(X_0, \mathcal{A}|_{X_0}, \mu|_{X_0})$, which means that the function $g = g_0\kappa_{X_0}$ belongs to $L^1_+(X, \mathcal{A}, \mu)$. With this choice of g, let us prove now that the equality (28) still holds, with g in place of $g_0$. Exactly as in the proof of part A, it suffices to prove only the equality

$$|\lambda|(A) = \int_A g\,d\mu, \quad \forall A \in \mathcal{A}^\mu_{\mathrm{fin}}. \tag{29}$$

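Claim 2: $|\lambda|(A) = |\lambda|(A \cap X_0)$, ∀ $A \in \mathcal{A}^\mu_{\sigma\text{-fin}}$.

Since (use the fact that |λ| is finite) the equality is equivalent to

$$|\lambda|(A \setminus X_0) = 0, \quad \forall A \in \mathcal{A}^\mu_{\sigma\text{-fin}},$$

it suffices to prove it only for $A \in \mathcal{A}^\mu_{\mathrm{fin}}$. If $A \in \mathcal{A}^\mu_{\mathrm{fin}}$, using the properties of the decomposition $\mathcal{F}$, we have

$$|\lambda|(A) = \sum_{F \in \mathcal{F}} |\lambda|(A \cap F) = \sum_{F \in \mathcal{F}_0} |\lambda|(A \cap F) + \sum_{F \in \mathcal{F} \setminus \mathcal{F}_0} |\lambda|(A \cap F) =$$

$$= |\lambda|\Big(\bigcup_{F \in \mathcal{F}_0} [A \cap F]\Big) + \sum_{F \in \mathcal{F} \setminus \mathcal{F}_0} |\lambda|(A \cap F) = |\lambda|(A \cap X_0) + \sum_{F \in \mathcal{F} \setminus \mathcal{F}_0} |\lambda|(A \cap F).$$

Notice now that, for $F \in \mathcal{F} \setminus \mathcal{F}_0$, we have |λ|(F) = 0, which gives $|\lambda|(A \cap F) = 0$, so the Claim follows immediately from the above computation.

Having proven the above Claim, let us prove now (29). Fix $A \in \mathcal{A}^\mu_{\mathrm{fin}}$. The desired equality is now immediate from Claim 2, combined with (28):

$$|\lambda|(A) = |\lambda|(A \cap X_0) = \int_{A \cap X_0} g_0\,d\mu = \int_X g_0\kappa_{A \cap X_0}\,d\mu = \int_X g_0\kappa_{X_0}\kappa_A\,d\mu = \int_X g\kappa_A\,d\mu = \int_A g\,d\mu.$$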

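Define now the function $f_0 = hg$. Since $|f_0| = g \in L^1_+(X, \mathcal{A}, \mu)$, it follows that $f_0 \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$. Let us prove that $f_0$ satisfies the equality (26). Start with some $A \in \mathcal{A}^\mu_{\sigma\text{-fin}}$. On the one hand, using Claim 2, we have

$$|\lambda(A \setminus X_0)| \le |\lambda|(A \setminus X_0) = 0,$$

so we get $\lambda(A) = \lambda(A \cap X_0)$. Using the σ-finite version of the Radon-Nikodym Theorem for $(X_0, \mathcal{A}|_{X_0}, \mu|_{X_0})$ and $\lambda|_{X_0}$, we then have

$$\lambda(A) = \lambda(A \cap X_0) = \int_{A \cap X_0} hg_0\,d\mu = \int_X hg_0\kappa_{A \cap X_0}\,d\mu = \int_X hg_0\kappa_{X_0}\kappa_A\,d\mu = \int_X hg\kappa_A\,d\mu = \int_A hg\,d\mu = \int_A f_0\,d\mu.$$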

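We now prove the uniqueness property (i) of $f$ (µ-a.e.!). Assume $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ is another function, such that

$$\lambda(A) = \int_A f\,d\mu, \quad \forall\, A \in \mathcal{A}^\mu_{\sigma\text{-fin}}.$$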


Claim 3: $f = f_0$, µ-l.a.e.

What we need to show here is the fact that

$$f\kappa_B = f_0\kappa_B, \ \mu\text{-a.e.}, \quad \forall\, B \in \mathcal{A}^\mu_{\text{fin}}.$$

But this follows immediately from the uniqueness from part B of Theorem 4.2, applied to the finite measure space $(B, \mathcal{A}|_B, \mu|_B)$ and the measure $\lambda|_B$, which has both $f|_B$ and $f_0|_B$ as densities.

Using Claim 3, we now have $f - f_0 \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, with $f - f_0 = 0$, µ-l.a.e., so we can apply Proposition 4.4, which forces $f - f_0 = 0$, µ-a.e., so we indeed get $f = f_0$, µ-a.e.

Property (ii) is obvious, since by (i), any function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ that satisfies (26) automatically satisfies $|f| = |f_0| = g$, µ-a.e.




Comment. One should be aware of the (severe) limitations of Theorem 4.3, notably the fact that the equalities (25) and (26) hold only for $A \in \mathcal{A}^\mu_{\sigma\text{-fin}}$. For example, consider the measure space $(X, \mathcal{P}(X), \mu)$, with $X$ uncountable, and $\mu$ defined by

$$\mu(A) = \begin{cases} \infty & \text{if } A \text{ is uncountable} \\ 0 & \text{if } A \text{ is countable} \end{cases}$$

This measure space is decomposable, with a decomposition consisting of singletons: $\mathcal{F} = \{\{x\} : x \in X\}$. For a measure $\nu$ on $\mathcal{P}(X)$, the condition $\nu \ll \mu$ means precisely that $\nu(A) = 0$ for all countable subsets $A \subset X$. In this case the equality (25) says practically nothing, since it is restricted solely to countable sets $A \subset X$, when both sides are zero.

In this example, it is also instructive to analyze the case when $\nu$ is finite (see part B in Theorem 4.3). If we follow the proof of the Theorem, we see that at some point we have constructed a certain set $X_0 = \bigcup_{F \in \mathcal{F}_0} F$, where $\mathcal{F}_0 = \{F \in \mathcal{F} : \nu(F) > 0\}$. In our situation however it turns out that $X_0 = \emptyset$. This example brings up a very interesting question, which turns out to sit at the very foundation of set theory.

Question: Does there exist an uncountable set $X$, and a finite measure $\nu$ on $\mathcal{P}(X)$, such that $\nu(X) > 0$, but $\nu(A) = 0$ for every countable subset $A \subset X$?

(The above vanishing condition is of course equivalent to the fact that $\nu(\{x\}) = 0$, $\forall\, x \in X$.) It turns out that not only is the answer to this question unknown, but in fact several mathematicians are seriously thinking of proposing it as an axiom to be added to the current system of axioms used in set theory!

The limitations of Theorem 4.3 also force limitations in the Change of Variables property (see Proposition 4.1), which in this case has the following statement.

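Proposition 4.6 (Local Change of Variables). Let $(X, \mathcal{A}, \mu)$ be a measure space, let $\nu$ be a measure on $\mathcal{A}$, and let $f : X \to [0, \infty]$ be a measurable function.

A. The following are equivalent:

(i) one has

$$\nu(A) = \int_A f\,d\mu, \quad \forall\, A \in \mathcal{A}^\mu_{\sigma\text{-fin}};$$

(ii) for every measurable function $h : X \to [0, \infty]$, with the property that the set $E_h = \{x \in X : h(x) \neq 0\}$ belongs to $\mathcal{A}^\mu_{\sigma\text{-fin}}$, one has the equality

$$(30)\qquad \int_X h\,d\nu = \int_X hf\,d\mu.$$

B. If $\nu$ and $f$ are as above, and $\mathbb{K}$ is either $\mathbb{R}$ or $\mathbb{C}$, then the equality (30) also holds for those measurable functions $h : X \to \mathbb{K}$ with $E_h \in \mathcal{A}^\mu_{\sigma\text{-fin}}$, for which $h \in L^1_{\mathbb{K}}(X, \mathcal{A}, \nu)$ and $hf \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$.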

Proof. A. (i) ⇒ (ii). Assume (i) holds. Start with some measurable function $h : X \to [0, \infty]$, such that the set $E_h = \{x \in X : h(x) \neq 0\}$ belongs to $\mathcal{A}^\mu_{\sigma\text{-fin}}$. The equality (30) is then immediate from Proposition 4.1, applied to the measure space $(E_h, \mathcal{A}|_{E_h}, \mu|_{E_h})$, and the measure $\nu|_{E_h}$, which has density $f|_{E_h}$.




(ii) ⇒ (i). Assume (ii) holds. If we start with some $A \in \mathcal{A}^\mu_{\sigma\text{-fin}}$, then obviously the measurable function $h = \kappa_A$ will have $E_h = A$, so by (ii) we immediately get

$$\nu(A) = \int_X \kappa_A\,d\nu = \int_X \kappa_A f\,d\mu = \int_A f\,d\mu.$$

B. Assume now $\nu$ and $f$ satisfy the equivalent conditions (i) and (ii). Suppose $h : X \to \mathbb{K}$ is measurable, with $E_h \in \mathcal{A}^\mu_{\sigma\text{-fin}}$, such that $h \in L^1_{\mathbb{K}}(X, \mathcal{A}, \nu)$ and $hf \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$. Then the equality (30) follows again from Proposition 4.1, applied to the measure space $(E_h, \mathcal{A}|_{E_h}, \mu|_{E_h})$, and the measure $\nu|_{E_h}$, which has density $f|_{E_h}$.



Appendix A

Zorn Lemma

In this Appendix we review basic set theoretical results, which are consequences of the following postulate:

Axiom of Choice. Given any non-empty collection$^{17}$ $\{X_i : i \in I\}$ of non-empty sets, the cartesian product $\prod_{i \in I} X_i$ is non-empty.

Recall that the cartesian product is defined as

$$\prod_{i \in I} X_i = \Big\{ f : I \to \bigcup_{i \in I} X_i \ :\ f(i) \in X_i,\ \forall\, i \in I \Big\}.$$

In order to formulate several consequences of the Axiom of Choice, we need several concepts.

Definitions. Given a set $X$, by a relation on $X$ one means simply a subset $R \subset X \times X$. The standard notation for relations is:

$$xRy \iff (x, y) \in R.$$

An order relation on $X$ is a relation $\prec$ with the following properties:
• $x \prec x$, $\forall\, x \in X$;
• if $x, y, z \in X$ satisfy $x \prec y$ and $y \prec z$, then $x \prec z$;
• if $x, y \in X$ satisfy $x \prec y$ and $y \prec x$, then $x = y$.

In this case the pair $(X, \prec)$ is called an ordered set. An ordered set $(X, \prec)$ is said to be totally ordered, if

• for any elements $x, y \in X$ one has either $x \prec y$ or $y \prec x$.

More generally, given an (arbitrary) ordered set $(X, \prec)$, by a totally ordered subset of $(X, \prec)$, one means a subset $T \subset X$, which becomes totally ordered with respect to the order relation $\prec|_T$.

Example A.1. Fix a set $M$, and take $X$ to be the collection of all subsets of $M$. Then $X$ carries a natural order relation defined by inclusion:

$$A \prec B \iff A \subset B.$$

A totally ordered subset $\mathcal{C}$ of $(X, \subset)$ is called a chain of subsets of $M$. Two subsets $A, B \subset M$ will be said to be comparable, if either $A \subset B$, or $B \subset A$, i.e. the collection $\{A, B\}$ is a chain of subsets of $M$.

Definition. Let $M$ be a set. A collection $\mathcal{F}$ of subsets of $M$ is said to have the chain property, if

(c) whenever $\mathcal{C} \subset \mathcal{F}$ is a chain, it follows that the union $\bigcup_{C \in \mathcal{C}} C$ also belongs to $\mathcal{F}$.

17 By a “collection of sets” one simply means a set whose elements are sets themselves.

Lemma A.1. Let $M$ be a set, let $\mathcal{F}$ be a collection of subsets of $M$ with the chain property. For every set $A \in \mathcal{F}$, the collection

$$\text{comp}(A; \mathcal{F}) = \{B \in \mathcal{F} : B \text{ comparable to } A\}$$

has the chain property.

Proof. Let $\mathcal{C} \subset \text{comp}(A; \mathcal{F})$ be a chain, and put $T = \bigcup_{C \in \mathcal{C}} C$. Since $\mathcal{F}$ has the chain property, we have $T \in \mathcal{F}$. To show that $T$ is comparable with $A$, we consider the two possibilities:

Case 1: $A \supset C$, for all $C \in \mathcal{C}$. In this case we have $A \supset \bigcup_{C \in \mathcal{C}} C = T$.

Case 2: There exists $C_0 \in \mathcal{C}$, such that $A \subset C_0$. In this case we have $A \subset C_0 \subset T$.

Lemma A.2. Let $M$ be some non-empty set, let $\mathcal{F}$ be a non-empty collection of subsets of $M$, with the chain property. Suppose one has a map

$$\mathcal{F} \ni A \longmapsto x_A \in M,$$

with the property that $A \cup \{x_A\} \in \mathcal{F}$, $\forall\, A \in \mathcal{F}$.

Then there exists $A \in \mathcal{F}$ such that $x_A \in A$.

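Proof. For each $A \in \mathcal{F}$ we define $A^+ = A \cup \{x_A\}$. Call a subset $\mathcal{G} \subset \mathcal{F}$ inductive, if it has the chain property, and

$$(+)\qquad A \in \mathcal{G} \Rightarrow A^+ \in \mathcal{G}.$$

It is quite clear that if $\mathcal{G}_i$, $i \in I$ is a collection of inductive subsets of $\mathcal{F}$, then the intersection $\bigcap_{i \in I} \mathcal{G}_i$ is again an inductive subset of $\mathcal{F}$.

Fix now some subset $A_0 \in \mathcal{F}$, and define

$$\mathcal{G}_0 = \bigcap_{\substack{\mathcal{G} \text{ inductive} \\ A_0 \in \mathcal{G}}} \mathcal{G}.$$

Note that the subset $\mathcal{F}_0 = \{A \in \mathcal{F} : A \supset A_0\}$ is an inductive subset of $\mathcal{F}$, so in particular, $\mathcal{G}_0$ is non-empty, and $\mathcal{G}_0 \subset \mathcal{F}_0$, i.e.

$$(1)\qquad A \supset A_0, \quad \forall\, A \in \mathcal{G}_0.$$

Claim: The set $\mathcal{G}_0$ is a chain.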

What we need to prove is the fact that $\mathcal{G}_0$ is totally ordered by inclusion. Consider the set

$$\mathcal{T} = \{T \in \mathcal{G}_0 : T \text{ is comparable with every } A \in \mathcal{G}_0\} = \bigcap_{A \in \mathcal{G}_0} \text{comp}(A; \mathcal{G}_0),$$

and we try to prove that $\mathcal{T} = \mathcal{G}_0$. By Lemma A.1 it is clear that $\mathcal{T}$ has the chain property. Using (1), it is clear that $A_0 \in \mathcal{T}$. Finally, we need to prove property (+). We prove this indirectly as follows. Fix $T \in \mathcal{T}$, consider the collection

$$\mathcal{V}_T = \text{comp}(T^+; \mathcal{G}_0) = \{A \in \mathcal{G}_0 : A \text{ comparable with } T^+\},$$

and let us prove that $\mathcal{V}_T = \mathcal{G}_0$, by showing that $\mathcal{V}_T$ is an inductive set, and contains $A_0$. First of all, by Lemma A.1, it follows that $\mathcal{V}_T$ has the chain property. Secondly, using (1) we have $A_0 \subset T \subset T^+$, so $A_0 \in \mathcal{V}_T$. Finally, to check property (+), we start with some $V \in \mathcal{V}_T$, and we show that $V^+ \in \mathcal{V}_T$. In the case when $T^+ \subset V$, we are done, because we have $T^+ \subset V \subset V^+$. Assume $T^+ \not\subset V$, so that we have $V \subset T$. Since $T$ is comparable with $V^+$, we either have $V^+ \subset T$, in which case we are done, or we have $T \subset V^+$. In the latter case, we have

$$V \subset T \subset V^+.$$

Since $V^+ = V \cup \{x_V\}$, the above inclusions force either $T = V$, which gives $T^+ = V^+$, or $T = V^+$. Clearly, either case gives $V^+ \in \mathcal{V}_T$. Having shown that $\mathcal{V}_T$ is inductive, the inclusion $\mathcal{V}_T \subset \mathcal{G}_0$ will force the equality $\mathcal{V}_T = \mathcal{G}_0$. In turn, the definition of $\mathcal{V}_T$ proves that $T^+ \in \mathcal{T}$, so $\mathcal{T}$ is indeed inductive. Finally, the inclusion $\mathcal{T} \subset \mathcal{G}_0$ then forces $\mathcal{T} = \mathcal{G}_0$, and by the definition of $\mathcal{T}$, it follows that $\mathcal{G}_0$ is indeed a chain.

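Having proven the Claim, we now take $A = \bigcup_{G \in \mathcal{G}_0} G$. Since $\mathcal{G}_0$ has the chain property, it follows that $A \in \mathcal{G}_0$. By construction we have

$$A \supset G, \quad \forall\, G \in \mathcal{G}_0.$$

In particular we have $A \supset A^+$, which clearly forces $x_A \in A$.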

Definitions. Let $(X, \prec)$ be an ordered set. By a maximal element for $X$ one means an element $x \in X$ with the property:

$$\{y \in X : x \prec y\} = \{x\}.$$

In other words, this means that there is no element $y \in X$, with $x \prec y$ and $y \neq x$.

Given a subset $S \subset X$, an element $x \in X$ is said to be an upper bound for $S$, if

$$s \prec x, \quad \forall\, s \in S.$$

If such an $x$ exists, we say that $S$ has an upper bound. (It is not assumed that $x$ belongs to $S$!)

Lemma A.3 (“Easy” Zorn Lemma). Let $M$ be a set, and let $\mathcal{F}$ be a collection of subsets of $M$. Assume
• the Axiom of Choice is true;
• $\mathcal{F}$ has the chain property;
• $\mathcal{F}$ is hereditary, in the sense that, whenever $A \in \mathcal{F}$, it follows that all subsets of $A$ belong to $\mathcal{F}$.

Then, when equipped with the inclusion relation, $(\mathcal{F}, \subset)$ has at least one maximal element.

Proof. The proof will be carried out by contradiction. Assume no $A \in \mathcal{F}$ is maximal. For each $A \in \mathcal{F}$, define

$$X_A = \{x \in M \setminus A : A \cup \{x\} \in \mathcal{F}\}.$$

Claim: For every $A \in \mathcal{F}$, the set $X_A$ is non-empty.

Indeed, since $A$ is not maximal, there exists some $B \in \mathcal{F}$, with $A \subsetneq B$. In particular, there exists some $x \in B \setminus A$, and since $A \cup \{x\} \subset B$, by the hereditary property, it follows that $x \in X_A$.

Use now the Axiom of Choice, to find a map

$$\mathcal{F} \ni A \longmapsto x_A \in M,$$

such that $x_A \in X_A$, $\forall\, A \in \mathcal{F}$. This means that $A \cup \{x_A\} \in \mathcal{F}$, and $x_A \notin A$, for all $A \in \mathcal{F}$. By Lemma A.2 this is however impossible.




Theorem A.1 (Zorn Lemma). Assume the Axiom of Choice is true. Let $(X, \prec)$ be a non-empty ordered set, with the following property

(z) every totally ordered subset $A \subset X$ has an upper bound.

Then $X$ has at least one maximal element.

Proof. Define the collection

$$\mathcal{F} = \{A \subset X : A \text{ totally ordered subset}\}.$$

Clearly $\mathcal{F}$ is non-empty (it contains, for instance, all singletons). It is quite clear that $\mathcal{F}$ satisfies the hypothesis of Lemma A.3. So $(\mathcal{F}, \subset)$ has a maximal element $A$. Take now $x$ to be an upper bound for $A$, i.e. $a \prec x$, $\forall\, a \in A$.

Now we prove that $x$ is maximal for $(X, \prec)$. Suppose $y \in X$ satisfies $x \prec y$. Then clearly $A \cup \{y\}$ will still be a totally ordered subset of $X$, i.e. $A \cup \{y\} \in \mathcal{F}$. The maximality of $A$ in $(\mathcal{F}, \subset)$ will force $A \cup \{y\} = A$, so we get $y \in A$, hence $y \prec x$. Since we also have $x \prec y$, this forces $y = x$.



Appendix B

Cardinal Arithmetic

In this Appendix we discuss cardinal arithmetic. We assume the Axiom of Choice is true.

Definitions. Two sets $A$ and $B$ are said to have the same cardinality, if there exists a bijective map $A \to B$. It is clear that this defines an equivalence relation on the class$^{18}$ of all sets.

A cardinal number is thought of as an equivalence class of sets. In other words, if we write a cardinal number as $\mathfrak{a}$, it is understood that $\mathfrak{a}$ consists of all sets of a given cardinality. So when we write card $A = \mathfrak{a}$ we understand that $A$ belongs to this class, and for another set $B$ we write card $B = \mathfrak{a}$, exactly when $B$ has the same cardinality as $A$. In this case we write card $B$ = card $A$.

Notations. The cardinality of the empty set $\emptyset$ is zero. More generally the cardinality of a finite set is equal to its number of elements. The cardinality of the set $\mathbb{N}$, of all natural numbers, is denoted by $\aleph_0$.

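Definition. Let $\mathfrak{a}$ and $\mathfrak{b}$ be cardinal numbers. We write $\mathfrak{a} \le \mathfrak{b}$ if there exist sets $A \subset B$ with card $A = \mathfrak{a}$ and card $B = \mathfrak{b}$.

This is equivalent to the fact that, for any sets $A$ and $B$, with card $A = \mathfrak{a}$ and card $B = \mathfrak{b}$, one of the following equivalent conditions holds:
• there exists an injective function $f : A \to B$;
• there exists a surjective function $g : B \to A$.

For two cardinal numbers $\mathfrak{a}$ and $\mathfrak{b}$, we use the notation $\mathfrak{a} < \mathfrak{b}$ to indicate that $\mathfrak{a} \le \mathfrak{b}$ and $\mathfrak{a} \neq \mathfrak{b}$.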

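Theorem B.1 (Cantor-Bernstein). Suppose two cardinal numbers $\mathfrak{a}$ and $\mathfrak{b}$ satisfy $\mathfrak{a} \le \mathfrak{b}$ and $\mathfrak{b} \le \mathfrak{a}$. Then $\mathfrak{a} = \mathfrak{b}$.

Proof. Fix two sets $A$ and $B$ with card $A = \mathfrak{a}$ and card $B = \mathfrak{b}$, so there exist injective functions $f : A \to B$ and $g : B \to A$. We shall construct a bijective function $h : A \to B$. Define the sets

$$A_0 = A \setminus g(B) \quad\text{and}\quad B_0 = B \setminus f(A).$$

Then define recursively the sequences $(A_n)_{n \ge 0}$ and $(B_n)_{n \ge 0}$ by

$$A_n = g(B_{n-1}) \quad\text{and}\quad B_n = f(A_{n-1}), \quad \forall\, n \ge 1.$$

Claim 1: One has $A_m \cap A_n = B_m \cap B_n = \emptyset$, $\forall\, m > n \ge 0$.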

Let us first observe that the case when $n = 0$ is trivial, since we have the inclusions $A_m = g(B_{m-1}) \subset g(B) = A \setminus A_0$ and $B_m = f(A_{m-1}) \subset f(A) = B \setminus B_0$. Next we prove the desired property by induction on $m$. The case $m = 1$ is clear (this forces $n = 0$). Suppose the statement is true for $m = k$, and let us prove it for $m = k + 1$. Start with some $n < k + 1$. If $n = 0$, we are done, by the above discussion. Assume then $n \ge 1$. Since $f$ and $g$ are injective we have

$$A_{k+1} \cap A_n = g(B_k) \cap g(B_{n-1}) = g(B_k \cap B_{n-1}) = \emptyset,$$
$$B_{k+1} \cap B_n = f(A_k) \cap f(A_{n-1}) = f(A_k \cap A_{n-1}) = \emptyset,$$

and we are done.

Put $C = A \setminus \bigcup_{n \ge 0} A_n$ and $D = B \setminus \bigcup_{n \ge 0} B_n$.

18 The term class is used, because there is no such thing as the “set of all sets.”

Claim 2: One has the equality $f(C) = D$.

First we prove the inclusion $f(C) \subset D$. Start with some point $c \in C$, but assume $f(c) \notin D$. This means that there exists some $n \ge 0$ such that $f(c) \in B_n$. Since $f(c) \in f(A) = B \setminus B_0$, we must have $n \ge 1$. But then we get $f(c) \in B_n = f(A_{n-1})$, and the injectivity of $f$ will force $c \in A_{n-1}$, which is impossible.

Second, we prove that $D \subset f(C)$. Start with some $d \in D$. First of all, since $D \subset B \setminus B_0 = f(A)$, there exists some $c \in A$ with $d = f(c)$. If $c \notin C$, then there exists some $n \ge 0$, such that $c \in A_n$, and then we would get $d = f(c) \in f(A_n) = B_{n+1}$, which is impossible.

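We now begin constructing the desired bijection. First we define $\varphi : \bigcup_{n \ge 0} B_n \to B$ by

$$\varphi(b) = \begin{cases} b & \text{if } b \in B_n \text{ and } n \text{ is odd} \\ (f \circ g)(b) & \text{if } b \in B_n \text{ and } n \text{ is even} \end{cases}$$

Claim 3: The map $\varphi$ defines a bijection

$$\varphi : \bigcup_{n \ge 0} B_n \to \bigcup_{n \ge 1} B_n.$$

It is clear that, since $\varphi|_{B_n}$ is injective, the map $\varphi$ is injective. Notice also that, if $n \ge 0$ is even, then $\varphi(B_n) = f(g(B_n)) = f(A_{n+1}) = B_{n+2}$. When $n \ge 0$ is odd we have $\varphi(B_n) = B_n$, so we have indeed the equality

$$\varphi\Big(\bigcup_{n \ge 0} B_n\Big) = \bigcup_{n \ge 1} B_n.$$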

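Now we define $\psi : \bigcup_{n \ge 0} A_n \to B$ by $\psi = \varphi^{-1} \circ f$. Clearly $\psi$ is injective, and

$$\psi\Big(\bigcup_{n \ge 0} A_n\Big) = \varphi^{-1}\Big(\bigcup_{n \ge 0} f(A_n)\Big) = \varphi^{-1}\Big(\bigcup_{n \ge 0} B_{n+1}\Big) = \varphi^{-1}\Big(\bigcup_{n \ge 1} B_n\Big) = \bigcup_{n \ge 0} B_n,$$

so $\psi$ defines a bijection

$$\psi : \bigcup_{n \ge 0} A_n \to \bigcup_{n \ge 0} B_n.$$

We then combine $\psi$ with the bijection $f : C \to D$, i.e. we define the map $h : A \to B$ by

$$h(x) = \begin{cases} \psi(x) & \text{if } x \in \bigcup_{n \ge 0} A_n \\ f(x) & \text{if } x \in A \setminus \bigcup_{n \ge 0} A_n = C. \end{cases}$$

Clearly $h$ is injective, and

$$h(A) = \psi\Big(\bigcup_{n \ge 0} A_n\Big) \cup f(C) = \bigcup_{n \ge 0} B_n \cup D = B,$$

so $h$ is indeed bijective.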
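
The construction above can be traced on a concrete example. The following is a minimal sketch, not part of the original proof. It assumes the injections are supplied together with partial inverses (the names `f_inv` and `g_inv` are hypothetical helpers returning `None` outside the range), and it treats a backward chain longer than `max_steps` as infinite, i.e. as a point of $C$. Unwinding the definitions of $\varphi$ and $\psi$ shows that $h = f$ on $C$ and on $A_n$ with $n$ even, while $h = g^{-1}$ on $A_n$ with $n$ odd; the index $n$ is the number of backward steps $x \mapsto g^{-1}(x) \mapsto f^{-1}(g^{-1}(x)) \mapsto \cdots$ that can be taken before leaving the ranges.

```python
def cantor_bernstein(f, f_inv, g_inv, max_steps=10_000):
    """Build h : A -> B from injections f : A -> B and g : B -> A.

    f_inv and g_inv are partial inverses (hypothetical helpers for this
    sketch): they return None when the argument is outside the range.
    """
    def h(x):
        steps = 0
        current, side = x, "A"
        while steps < max_steps:
            # Step backwards: from A via g^{-1}, from B via f^{-1}.
            previous = g_inv(current) if side == "A" else f_inv(current)
            if previous is None:
                break
            current = previous
            side = "B" if side == "A" else "A"
            steps += 1
        # An odd number of steps means the chain ended in B_0, so x lies
        # in A_n with n odd and h(x) = g^{-1}(x).  Otherwise (n even, or
        # the chain did not stop within max_steps) we use h(x) = f(x).
        if steps < max_steps and steps % 2 == 1:
            return g_inv(x)
        return f(x)
    return h


# Example with A = B = {0, 1, 2, ...} and f(n) = g(n) = n + 1.
def shift(n):
    return n + 1

def unshift(n):
    return n - 1 if n >= 1 else None

h = cantor_bernstein(shift, unshift, unshift)
values = [h(x) for x in range(10)]
```

In this example neither injection is surjective, yet `h` swaps 0 with 1, 2 with 3, and so on, which is a bijection of the natural numbers.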




Theorem B.2 (Total ordering for cardinal numbers). Let a and b be cardinal numbers. Then one has either a ≤ b, or b ≤ a.

Proof. Choose two sets $A$ and $B$ with card $A = \mathfrak{a}$ and card $B = \mathfrak{b}$. In order to prove the theorem, it suffices to construct either an injective function $f : A \to B$, or an injective function $f : B \to A$.

We define the set

$$\mathcal{X} = \{(C, D, g) : C \subset A,\ D \subset B,\ g : C \to D \text{ bijection}\}.$$

We equip $\mathcal{X}$ with the following order relation:

$$(C, D, g) \prec (C', D', g') \iff C \subset C',\ D \subset D',\ \text{and } g = g'|_C.$$

We now check that $(\mathcal{X}, \prec)$ satisfies the hypothesis of Zorn Lemma. Let $\mathcal{A} \subset \mathcal{X}$ be a totally ordered subset, say $\mathcal{A} = \{(C_i, D_i, g_i) : i \in I\}$. Define $C = \bigcup_{i \in I} C_i$, $D = \bigcup_{i \in I} D_i$, and $g : C \to D$ to be the unique function with the property that $g|_{C_i} = g_i$, $\forall\, i \in I$. (We use here the fact that for $i, j \in I$ we either have $C_i \subset C_j$ and $g_j|_{C_i} = g_i$, or $C_j \subset C_i$ and $g_i|_{C_j} = g_j$. In either case, this proves that $g_i|_{C_i \cap C_j} = g_j|_{C_i \cap C_j}$, $\forall\, i, j \in I$, so such a $g$ exists.) It is then pretty clear that $(C, D, g) \in \mathcal{X}$ and $(C_i, D_i, g_i) \prec (C, D, g)$, $\forall\, i \in I$, i.e. $(C, D, g)$ is an upper bound for $\mathcal{A}$. Use now Zorn Lemma, to find a maximal element $(A_0, B_0, f)$ in $\mathcal{X}$.

Claim: Either A0 = A or B0 = B.

We prove this by contradiction. If we have strict inclusions $A_0 \subsetneq A$ and $B_0 \subsetneq B$, then if we choose $a \in A \setminus A_0$ and $b \in B \setminus B_0$, we can define a bijection $g : A_0 \cup \{a\} \to B_0 \cup \{b\}$ by $g(a) = b$ and $g|_{A_0} = f$. This would then produce a new element $(A_0 \cup \{a\}, B_0 \cup \{b\}, g) \in \mathcal{X}$, which would contradict the maximality of $(A_0, B_0, f)$.

The theorem now follows immediately from the Claim. If $A_0 = A$, then $f : A \to B$ is injective, and if $B_0 = B$, then $f^{-1} : B \to A$ is injective.

We now define the operations with cardinal numbers.

Definitions. Let a and b be cardinal numbers.

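• We define $\mathfrak{a} + \mathfrak{b}$ = card $S$, where $S$ is any set which is of the form $S = A \cup B$ with card $A = \mathfrak{a}$, card $B = \mathfrak{b}$, and $A \cap B = \emptyset$.

• We define $\mathfrak{a} \cdot \mathfrak{b}$ = card $P$, where $P$ is any set which is of the form $P = A \times B$ with card $A = \mathfrak{a}$ and card $B = \mathfrak{b}$.

• We define $\mathfrak{a}^{\mathfrak{b}}$ = card $X$, where $X$ is any set which is of the form

$$X = \prod_{i \in I} A_i,$$

with card $I = \mathfrak{b}$ and card $A_i = \mathfrak{a}$, $\forall\, i \in I$. Equivalently, if we take two sets $A$ and $B$ with card $A = \mathfrak{a}$, and card $B = \mathfrak{b}$, and if we define

$$A^B = \prod_B A = \{f : f \text{ function from } B \text{ to } A\},$$

then $\mathfrak{a}^{\mathfrak{b}} = \text{card}(A^B)$.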




It is pretty easy to show that these definitions are correct, in the sense that they do not depend on the particular choices of the sets involved. Moreover, these operations are consistent with the usual operations with natural numbers.
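
For finite sets the consistency with ordinary arithmetic can be checked directly. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, disjointness in the sum is forced by tagging elements, and a function from $B$ to $A$ is represented as a tuple of values indexed by a fixed listing of $B$.

```python
from itertools import product


def card_sum(A, B):
    """Cardinality of a disjoint union; tags make the two copies disjoint."""
    tagged = {(0, x) for x in A} | {(1, y) for y in B}
    return len(tagged)


def card_product(A, B):
    """Cardinality of the cartesian product A x B."""
    return len(set(product(A, B)))


def card_power(A, B):
    """Number of functions from B to A, each encoded as a tuple of values."""
    return len(set(product(A, repeat=len(list(B)))))


A = {"x", "y", "z"}   # card A = 3
B = {1, 2}            # card B = 2
total = card_sum(A, B)
prod = card_product(A, B)
power = card_power(A, B)
```

With card $A$ = 3 and card $B$ = 2 the three counts agree with $3 + 2$, $3 \cdot 2$ and $3^2$.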

Remark B.1. The operations with cardinal numbers, defined above, satisfy:

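• $\mathfrak{a} + \mathfrak{b} = \mathfrak{b} + \mathfrak{a}$,
• $(\mathfrak{a} + \mathfrak{b}) + \mathfrak{d} = \mathfrak{a} + (\mathfrak{b} + \mathfrak{d})$,
• $\mathfrak{a} + 0 = \mathfrak{a}$,
• $\mathfrak{a} \cdot \mathfrak{b} = \mathfrak{b} \cdot \mathfrak{a}$,
• $(\mathfrak{a} \cdot \mathfrak{b}) \cdot \mathfrak{d} = \mathfrak{a} \cdot (\mathfrak{b} \cdot \mathfrak{d})$,
• $\mathfrak{a} \cdot 1 = \mathfrak{a}$,
• $\mathfrak{a} \cdot (\mathfrak{b} + \mathfrak{d}) = (\mathfrak{a} \cdot \mathfrak{b}) + (\mathfrak{a} \cdot \mathfrak{d})$,
• $(\mathfrak{a} \cdot \mathfrak{b})^{\mathfrak{d}} = (\mathfrak{a}^{\mathfrak{d}}) \cdot (\mathfrak{b}^{\mathfrak{d}})$,
• $\mathfrak{a}^{\mathfrak{b} + \mathfrak{d}} = (\mathfrak{a}^{\mathfrak{b}}) \cdot (\mathfrak{a}^{\mathfrak{d}})$,
• $(\mathfrak{a}^{\mathfrak{b}})^{\mathfrak{d}} = \mathfrak{a}^{\mathfrak{b} \cdot \mathfrak{d}}$,

for all cardinal numbers $\mathfrak{a}, \mathfrak{b}, \mathfrak{d} \ge 1$.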

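Remark B.2. The order relation $\le$ is compatible with all the operations, in the sense that, if $\mathfrak{a}_1$, $\mathfrak{a}_2$, $\mathfrak{b}_1$, and $\mathfrak{b}_2$ are cardinal numbers with $\mathfrak{a}_1 \le \mathfrak{a}_2$ and $\mathfrak{b}_1 \le \mathfrak{b}_2$, then

• $\mathfrak{a}_1 + \mathfrak{b}_1 \le \mathfrak{a}_2 + \mathfrak{b}_2$,
• $\mathfrak{a}_1 \cdot \mathfrak{b}_1 \le \mathfrak{a}_2 \cdot \mathfrak{b}_2$,
• $\mathfrak{a}_1^{\mathfrak{b}_1} \le \mathfrak{a}_2^{\mathfrak{b}_2}$.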

Proposition B.1. Let a ≥ 1 be a cardinal number.

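(i) If $A$ is a set with card $A = \mathfrak{a}$, and if we define

$$\mathcal{P}(A) = \{B : B \text{ subset of } A\},$$

then $2^{\mathfrak{a}} = \text{card}\,\mathcal{P}(A)$.

(ii) $\mathfrak{a} < 2^{\mathfrak{a}}$.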

Proof. (i). Put

$$P = \{0, 1\}^A = \{f : f \text{ function from } A \text{ to } \{0, 1\}\},$$

so that $2^{\mathfrak{a}}$ = card $P$. We need to define a bijection $\varphi : P \to \mathcal{P}(A)$. We take

$$\varphi(f) = \{a \in A : f(a) = 1\}, \quad \forall\, f \in P.$$

It is clear that, since a function $f : A \to \{0, 1\}$ is completely determined by the set $\{a \in A : f(a) = 1\}$, the map $\varphi$ is indeed bijective.

(ii). The map $A \ni a \longmapsto \{a\} \in \mathcal{P}(A)$ is clearly injective. This proves the inequality $\mathfrak{a} \le 2^{\mathfrak{a}}$. We now prove that $\mathfrak{a} \neq 2^{\mathfrak{a}}$, by contradiction. Assume there is a bijection $\theta : A \to \mathcal{P}(A)$. Define the set

$$B = \{a \in A : a \notin \theta(a)\},$$

and choose $b \in A$ such that $B = \theta(b)$. If $b \in B$, then by construction we get $b \notin \theta(b) = B$, which is impossible. If $b \notin B$, we have $b \notin \theta(b)$, which forces $b \in B$, again an impossibility.
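
The diagonal argument in part (ii) can be checked exhaustively on a small finite set. The following sketch is an illustration only (all names are hypothetical): it runs over every map $\theta : A \to \mathcal{P}(A)$ for a three-element set $A$ and confirms that the diagonal set $B = \{a \in A : a \notin \theta(a)\}$ is never a value of $\theta$.

```python
from itertools import combinations, product


def powerset(elements):
    """All subsets of `elements`, as frozensets."""
    return [frozenset(c)
            for k in range(len(elements) + 1)
            for c in combinations(elements, k)]


def diagonal_set(A, theta):
    """The set B = {a in A : a not in theta(a)} from the proof."""
    return frozenset(a for a in A if a not in theta[a])


A = [0, 1, 2]
subsets = powerset(A)

# Every map theta : A -> P(A) is a choice of one subset per element of A.
diagonal_always_missed = all(
    diagonal_set(A, dict(zip(A, images))) not in images
    for images in product(subsets, repeat=len(A))
)
```

Since the diagonal set is missed by each of the $8^3 = 512$ maps, none of them is surjective, in agreement with $3 < 2^3$.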

We now discuss the properties of these operations, when infinite cardinal num-bers are used.

Lemma B.1 (Properties of ℵ0).

(i) For any infinite cardinal number a, one has the inequality ℵ0 ≤ a.




(ii) $\aleph_0 + \aleph_0 = \aleph_0$;

(iii) $\aleph_0 \cdot \aleph_0 = \aleph_0$.

Proof. (i). Let $\mathfrak{a}$ be an infinite cardinal number, and let $A$ be an infinite set with card $A = \mathfrak{a}$. Since for every finite subset $F \subset A$, there exists some $x \in A \setminus F$, one can construct a sequence $(x_n)_{n \in \mathbb{N}} \subset A$, with $x_m \neq x_n$, $\forall\, m > n \ge 1$. Then the subset $B = \{x_n : n \in \mathbb{N}\}$ has card $B = \aleph_0$, so the inclusion $B \subset A$ gives the desired inequality.

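(ii). Consider the sets

$$A_0 = \{n \in \mathbb{N} : n \text{ even}\} \quad\text{and}\quad A_1 = \{n \in \mathbb{N} : n \text{ odd}\}.$$

Then clearly card $A_0$ = card $A_1$ = $\aleph_0$, and the equality $A_0 \cup A_1 = \mathbb{N}$ gives

$$\aleph_0 + \aleph_0 = \text{card}\,A_0 + \text{card}\,A_1 = \text{card}(A_0 \cup A_1) = \text{card}\,\mathbb{N} = \aleph_0.$$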

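(iii). Take the set $P = \mathbb{N} \times \mathbb{N}$, so that $\aleph_0 \cdot \aleph_0$ = card $P$. It is obvious that card $P \ge \aleph_0$. To prove the other inequality, we define a surjection $\varphi : \mathbb{N} \to P$ as follows. For each $n \ge 1$ we take $s_n = n(n - 1)/2$, we set

$$B_n = \{m \in \mathbb{N} : s_n < m \le s_{n+1}\},$$

and we define $\varphi_n : B_n \to P$ by

$$\varphi_n(m) = (n + s_n - m + 1,\ m - s_n), \quad \forall\, m \in B_n.$$

(Both coordinates lie between 1 and $n$, since $1 \le m - s_n \le n$ for $m \in B_n$.) Notice that

$$(1)\qquad \varphi_n(B_n) = \{(p, q) \in \mathbb{N} \times \mathbb{N} : p + q = n + 1\}.$$

Notice also that $\bigcup_{n \ge 1} B_n = \mathbb{N}$, and $B_j \cap B_k = \emptyset$, $\forall\, j > k \ge 1$, so there exists a (unique) function $\varphi : \mathbb{N} \to P$, such that $\varphi|_{B_n} = \varphi_n$, for all $n \ge 1$. By (1) it is clear that $\varphi$ is surjective.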
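
The enumeration of $\mathbb{N} \times \mathbb{N}$ along the anti-diagonals $p + q = n + 1$ can be sketched as follows. This is an illustration under the assumption that $\mathbb{N} = \{1, 2, \dots\}$; the first coordinate is taken to be $n + s_n - m + 1$ and the second $m - s_n$, so that both coordinates are at least 1. The function names are hypothetical.

```python
def s(n):
    """The offset s_n = n(n - 1)/2 that precedes the n-th block B_n."""
    return n * (n - 1) // 2


def phi(m):
    """Map m >= 1 to the pair (p, q) with p + q = n + 1, where m lies in B_n."""
    n = 1
    while s(n + 1) < m:          # find the block with s_n < m <= s_{n+1}
        n += 1
    return (n + s(n) - m + 1, m - s(n))


# The first s_6 = 15 values should cover every pair with p + q <= 6 once.
first_values = [phi(m) for m in range(1, s(6) + 1)]
expected = {(p, q) for p in range(1, 6) for q in range(1, 6) if p + q <= 6}
```

Each block $B_n$ has exactly $n$ elements and is mapped onto the $n$ pairs with $p + q = n + 1$, so in this sketch the map is in fact a bijection, not just a surjection.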

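Theorem B.3. Let $\mathfrak{a}$ and $\mathfrak{b}$ be cardinal numbers, with $1 \le \mathfrak{b} \le \mathfrak{a}$, and $\mathfrak{a}$ infinite. Then:

(i) $\mathfrak{a} + \mathfrak{b} = \mathfrak{a}$;

(ii) $\mathfrak{a} \cdot \mathfrak{b} = \mathfrak{a}$.

Proof. It is clear that

$$\mathfrak{a} \le \mathfrak{a} + \mathfrak{b} \le \mathfrak{a} + \mathfrak{a},$$
$$\mathfrak{a} \le \mathfrak{a} \cdot \mathfrak{b} \le \mathfrak{a} \cdot \mathfrak{a},$$

so in order to prove the theorem, we can assume that $\mathfrak{a} = \mathfrak{b}$.

(i). Fix some set $A$ with card $A = \mathfrak{a}$. Use Zorn Lemma, to find a maximal non-empty family $\{A_i : i \in I\}$ of subsets of $A$ with

(a) card $A_i = \aleph_0$, for all $i \in I$;

(b) $A_i \cap A_j = \emptyset$, for all $i, j \in I$ with $i \neq j$.

If we put $B = A \setminus \bigcup_{i \in I} A_i$, then by maximality it follows that $B$ is finite. In particular, if we take $i_0 \in I$ then obviously card$(A_{i_0} \cup B) = \aleph_0$, so if we replace $A_{i_0}$ with $A_{i_0} \cup B$, we will still have the above properties (a) and (b), but also $A = \bigcup_{i \in I} A_i$. This proves that $\mathfrak{a}$ = card $A$ = $\aleph_0 \cdot \mathfrak{d}$, where $\mathfrak{d}$ = card $I$. In other words, we have $\mathfrak{a} = \text{card}(\mathbb{N} \times I)$. Consider then the sets

$$C_0 = \{n \in \mathbb{N} : n \text{ even}\} \quad\text{and}\quad C_1 = \{n \in \mathbb{N} : n \text{ odd}\},$$

so that $(C_0 \times I) \cup (C_1 \times I) = \mathbb{N} \times I$, and $(C_0 \times I) \cap (C_1 \times I) = \emptyset$. In particular, we get

$$\mathfrak{a} = \text{card}(C_0 \times I) + \text{card}(C_1 \times I) = (\text{card}\,C_0) \cdot (\text{card}\,I) + (\text{card}\,C_1) \cdot (\text{card}\,I) = \aleph_0 \cdot \mathfrak{d} + \aleph_0 \cdot \mathfrak{d} = \mathfrak{a} + \mathfrak{a}.$$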

(ii). Fix a set $A$ with card $A = \mathfrak{a}$. We are going to employ Zorn Lemma to find a bijection $A \to A \times A$. Define

$$\mathcal{X} = \{(D, f) : D \subset A,\ f : D \to D \times D \text{ bijective}\}.$$

Equip $\mathcal{X}$ with the following order

$$(D, f) \prec (D', f') \iff D \subset D' \text{ and } f = f'|_D.$$

Notice that $\mathcal{X}$ is non-empty, since we can find at least one set $D \subset A$ with card $D = \aleph_0$. We now check that $\mathcal{X}$ satisfies the hypothesis of Zorn Lemma. Let $\mathcal{T} = \{(D_i, f_i) : i \in I\}$ be a totally ordered subset of $\mathcal{X}$. It is fairly clear that if one takes $D = \bigcup_{i \in I} D_i$ and one defines $f : D \to D \times D$ as the unique function with $f|_{D_i} = f_i$, $\forall\, i \in I$, then $f$ is injective, and

$$f(D) = \bigcup_{i \in I} f(D_i) = \bigcup_{i \in I} f_i(D_i) = \bigcup_{i \in I} (D_i \times D_i) = D \times D,$$

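so the pair $(D, f)$ indeed belongs to $\mathcal{X}$, and is an upper bound for $\mathcal{T}$.

Use Zorn Lemma to produce a maximal element $(D, f) \in \mathcal{X}$. Notice that, if we take $\mathfrak{d}$ = card $D$, then by construction we have

$$(2)\qquad \mathfrak{d} \cdot \mathfrak{d} = \mathfrak{d}.$$

We would like to prove that $D = A$. In general this is not the case (for example, when $A = \mathbb{N}$, every $(D, f) \in \mathcal{X}$, with $\mathbb{N} \setminus D$ finite, is automatically maximal). We notice however that all we need to show is the equality

$$(3)\qquad \mathfrak{d} = \mathfrak{a}.$$

We prove this equality by contradiction. We know that we already have $\mathfrak{d} \le \mathfrak{a}$. Suppose $\mathfrak{d} < \mathfrak{a}$. Put $G = A \setminus D$, and notice that $\mathfrak{d}$ + card $G$ = $\mathfrak{a}$. Since $\mathfrak{d} < \mathfrak{a}$, by (i) we see that we must have the equality card $G = \mathfrak{a}$. Then there exists a subset $E \subset G$ with card $E = \mathfrak{d}$. Consider the set

$$P = (E \times E) \cup (E \times D) \cup (D \times E).$$

Since $E \cap D = \emptyset$, the three sets above are pairwise disjoint, so using (2) combined again with part (i), we get

$$\text{card}\,P = \text{card}(E \times E) + \text{card}(E \times D) + \text{card}(D \times E) = \mathfrak{d} \cdot \mathfrak{d} + \mathfrak{d} \cdot \mathfrak{d} + \mathfrak{d} \cdot \mathfrak{d} = \mathfrak{d} + \mathfrak{d} + \mathfrak{d} = \mathfrak{d} = \text{card}\,E.$$

This means that there exists a bijection $g : E \to P$, which combined with the fact that $E \cap D = P \cap (D \times D) = \emptyset$, will produce a bijection $h : D \cup E \to P \cup (D \times D)$, such that $h|_D = f$ and $h|_E = g$. Since we have $P \cup (D \times D) = (D \cup E) \times (D \cup E)$, the pair $(D \cup E, h) \in \mathcal{X}$ will contradict the maximality of $(D, f)$.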




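Corollary B.1. If $\mathfrak{a}$ is an infinite cardinal number, and if $\mathfrak{b}$ is a cardinal number with $2 \le \mathfrak{b} \le 2^{\mathfrak{a}}$, then

$$\mathfrak{b}^{\mathfrak{a}} = 2^{\mathfrak{a}}.$$

Proof. We have

$$2^{\mathfrak{a}} \le \mathfrak{b}^{\mathfrak{a}} \le (2^{\mathfrak{a}})^{\mathfrak{a}} = 2^{\mathfrak{a} \cdot \mathfrak{a}} = 2^{\mathfrak{a}},$$

and the desired equality follows from the Cantor-Bernstein Theorem.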

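Corollary B.2. Let $\mathfrak{a}$ be an infinite cardinal number, let $A$ be a set with card $A = \mathfrak{a}$, and define

$$\mathcal{P}_{\text{fin}}(A) = \{F \in \mathcal{P}(A) : F \text{ finite}\}.$$

Then card $\mathcal{P}_{\text{fin}}(A) = \mathfrak{a}$.

Proof. First of all, the map $A \ni a \longmapsto \{a\} \in \mathcal{P}_{\text{fin}}(A)$ is injective, so $\mathfrak{a} \le \text{card}\,\mathcal{P}_{\text{fin}}(A)$. We now prove the other inequality. For every integer $n \ge 1$, let $A^n$ denote the $n$-fold cartesian product. We treat the sequence $A^1, A^2, \dots$ as pairwise disjoint. For every $n \ge 1$ we define the map

$$\varphi_n : A^n \to \mathcal{P}_{\text{fin}}(A),$$

by

$$\varphi_n(a_1, \dots, a_n) = \{a_1, \dots, a_n\},$$

and we define the map $\varphi : \bigcup_{n=1}^\infty A^n \to \mathcal{P}_{\text{fin}}(A)$ as the unique map such that $\varphi|_{A^n} = \varphi_n$, $\forall\, n \ge 1$. Notice now that, since

$$\text{card}\,A^n = \mathfrak{a}^n = \mathfrak{a}, \quad \forall\, n \ge 1,$$

it follows that

$$\text{card}\Big(\bigcup_{n=1}^\infty A^n\Big) = \aleph_0 \cdot \mathfrak{a} = \mathfrak{a},$$

which gives

$$\text{card}(\text{Range}\,\varphi) \le \mathfrak{a}.$$

But it is clear that

$$\{\emptyset\} \cup \text{Range}\,\varphi = \mathcal{P}_{\text{fin}}(A),$$

and the fact that $\mathcal{P}_{\text{fin}}(A)$ is infinite, proves that

$$\text{card}\,\mathcal{P}_{\text{fin}}(A) = \text{card}(\text{Range}\,\varphi) \le \mathfrak{a}.$$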

We conclude with a result on the cardinal number c = card R.

Proposition B.2.

(i) For two real numbers $a < b$, one has

$$\text{card}(a, b) = \text{card}[a, b) = \text{card}(a, b] = \text{card}[a, b] = \mathfrak{c}.$$

(ii) $\mathfrak{c} = 2^{\aleph_0}$.




Proof. (i). It is clear that, since $(a, b)$ is infinite, we have

$$\text{card}[a, b] = 2 + \text{card}(a, b) = \text{card}(a, b).$$

The inclusions $(a, b) \subset [a, b) \subset [a, b]$ and $(a, b) \subset (a, b] \subset [a, b]$, combined with the Cantor-Bernstein Theorem, immediately give

$$\text{card}[a, b) = \text{card}(a, b] = \text{card}(a, b).$$

Finally, the bijection

$$(a, b) \ni t \longmapsto \tan\Big(\frac{\pi(2t - a - b)}{2(b - a)}\Big) \in \mathbb{R}$$

shows that card$(a, b) = \mathfrak{c}$.

(ii). The proof of this result uses a certain construction, which is useful for many other purposes. Therefore we choose to work in full generality. Consider the set

$$T = \{0, 1\}^{\aleph_0} = \{a = (\alpha_n)_{n \in \mathbb{N}} : \alpha_n \in \{0, 1\},\ \forall\, n \in \mathbb{N}\},$$

so $2^{\aleph_0}$ = card $T$. For any real number $r \ge 2$, we define the map $\varphi_r : T \to [0, 1]$ by

$$\varphi_r(a) = (r - 1) \sum_{n=1}^\infty \frac{\alpha_n}{r^n}, \quad \forall\, a = (\alpha_n)_{n \in \mathbb{N}} \in T.$$

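The maps $\varphi_r$, $r \ge 2$ are “almost” injective. To clarify this, we define the set

$$T_0 = \{a = (\alpha_n)_{n \in \mathbb{N}} \in T : \text{the set } \{n \in \mathbb{N} : \alpha_n = 0\} \text{ is infinite}\}.$$

Note that

$$T \setminus T_0 = \{(\alpha_n)_{n \in \mathbb{N}} \in T : \text{there exists } N \in \mathbb{N}, \text{ such that } \alpha_n = 1,\ \forall\, n \ge N\}.$$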

Clearly φ is surjective. In fact φ is “almost” bijective.

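Claim 1: Fix $r \ge 2$. For elements $a = (\alpha_n)_{n \in \mathbb{N}}$, $b = (\beta_n)_{n \in \mathbb{N}} \in T_0$, the following are equivalent:

(∗) $\varphi_r(a) > \varphi_r(b)$;

(∗∗) there exists $k \in \mathbb{N}$, such that $\alpha_k > \beta_k$, and $\alpha_j = \beta_j$, for all $j \in \mathbb{N}$ with $j < k$.

We first prove the implication (∗∗) ⇒ (∗). If $a, b \in T_0$ satisfy (∗∗), then

$$(4)\qquad \varphi_r(a) - \varphi_r(b) = \frac{r - 1}{r^k} + (r - 1) \sum_{n=k+1}^\infty \frac{\alpha_n - \beta_n}{r^n} \ge \frac{r - 1}{r^k} - (r - 1) \sum_{n=k+1}^\infty \frac{\beta_n}{r^n}.$$

Notice now that there are infinitely many indices $n \ge k + 1$ such that $\beta_n = 0$. This gives the fact that

$$\sum_{n=k+1}^\infty \frac{\beta_n}{r^n} < \sum_{n=k+1}^\infty \frac{1}{r^n} = \frac{1}{(r - 1) r^k},$$

so if we go back to (4) we get

$$\varphi_r(a) - \varphi_r(b) \ge \frac{r - 1}{r^k} - (r - 1) \sum_{n=k+1}^\infty \frac{\beta_n}{r^n} > \frac{r - 1}{r^k} - \frac{1}{r^k} = \frac{r - 2}{r^k} \ge 0,$$

so in particular we get $\varphi_r(a) > \varphi_r(b)$.

Conversely, if $\varphi_r(a) > \varphi_r(b)$, we choose

$$k = \min\{n \in \mathbb{N} : \alpha_n \neq \beta_n\}.$$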




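Using the implication (∗∗) ⇒ (∗) we see that we cannot have $\beta_k > \alpha_k$, because this would force $\varphi_r(b) > \varphi_r(a)$. Therefore we must have $\alpha_k > \beta_k$, and we are done.

Using Claim 1, we now see that $\varphi_r|_{T_0} : T_0 \to [0, 1]$ is injective.

Claim 2: card$(T \setminus T_0) = \aleph_0$.

This is pretty clear, since we can write

$$T \setminus T_0 = \bigcup_{k=1}^\infty R_k,$$

where

$$R_k = \{a = (\alpha_n)_{n \in \mathbb{N}} \in T : \alpha_n = 1,\ \forall\, n \ge k\}.$$

Since each $R_k$ is finite, the desired result follows.

Using Claim 2, we have

$$2^{\aleph_0} = \text{card}\,T = \text{card}(T \setminus T_0) + \text{card}\,T_0 = \aleph_0 + \text{card}\,T_0.$$

Since $\aleph_0 < 2^{\aleph_0}$, the above equality forces

$$2^{\aleph_0} = \text{card}\,T_0.$$

For every $r \ge 2$, we also have card $\varphi_r(T \setminus T_0) \le \aleph_0$, which then gives card$[\varphi_r(T) \setminus \varphi_r(T_0)] \le \aleph_0$, hence using the injectivity of $\varphi_r|_{T_0}$, we have card $\varphi_r(T_0)$ = card $T_0$ = $2^{\aleph_0}$, so we get

$$2^{\aleph_0} = \text{card}\,\varphi_r(T_0) \le \text{card}\,\varphi_r(T) = \text{card}\,\varphi_r(T_0) + \text{card}[\varphi_r(T) \setminus \varphi_r(T_0)] \le \text{card}\,\varphi_r(T_0) + \aleph_0 = 2^{\aleph_0} + \aleph_0 = 2^{\aleph_0}.$$

By the Cantor-Bernstein Theorem this forces card $\varphi_r(T) = 2^{\aleph_0}$. Now we are done, since for $r = 2$ we clearly have $\varphi_2(T) = [0, 1]$.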
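
The role of the set $T_0$ can be seen numerically. The sketch below is an illustration under stated assumptions: a sequence is represented by a finite prefix followed by a constant tail bit (a hypothetical encoding, enough to cover both eventually-zero and eventually-one sequences), and the tail is summed in closed form using $\sum_{n > N} r^{-n} = 1/((r - 1) r^N)$, so all values are exact rationals.

```python
from fractions import Fraction


def phi(r, prefix, tail):
    """Exact value of phi_r on the sequence prefix + (tail, tail, tail, ...)."""
    r = Fraction(r)
    N = len(prefix)
    head = (r - 1) * sum(Fraction(bit) / r**n
                         for n, bit in enumerate(prefix, start=1))
    # (r - 1) * sum_{n > N} tail / r^n  simplifies to  tail / r^N.
    return head + Fraction(tail) / r**N


# (0, 1, 1, 1, ...) is not in T_0, and for r = 2 it collides with (1, 0, 0, ...).
collision_a = phi(2, [0], 1)
collision_b = phi(2, [1], 0)

# For r = 3 the same two sequences are separated.
separated_a = phi(3, [0], 1)
separated_b = phi(3, [1], 0)

# Two elements of T_0 compared as in Claim 1: they first differ at k = 1.
larger = phi(3, [1, 0], 0)
smaller = phi(3, [0, 1, 1], 0)
```

For $r = 2$ both sequences give the value $1/2$, which is the familiar ambiguity of binary expansions and the reason injectivity is only claimed on $T_0$.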

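Corollary B.3.

(i) $\mathfrak{c}^{\aleph_0} = \mathfrak{c}$.

(ii) If we define the set

$$\mathcal{P}_{\text{count}}(\mathbb{R}) = \{C \subset \mathbb{R} : \text{card}\,C \le \aleph_0\},$$

then card $\mathcal{P}_{\text{count}}(\mathbb{R}) = \mathfrak{c}$.

Proof. (i). This is immediate from the equality $2^{\aleph_0} = \mathfrak{c}$ and from Corollary B.1.

(ii). Using the inclusion $\mathcal{P}_{\text{fin}}(\mathbb{R}) \subset \mathcal{P}_{\text{count}}(\mathbb{R})$, combined with Corollary B.2, we see that we have the inequality

$$\mathfrak{c} \le \text{card}\,\mathcal{P}_{\text{count}}(\mathbb{R}).$$

To prove the other inequality, we define a map $\varphi : \mathbb{R}^{\mathbb{N}} \to \mathcal{P}_{\text{count}}(\mathbb{R})$, as follows. If $a \in \mathbb{R}^{\mathbb{N}}$ is a sequence, say $a = (\alpha_n)_{n \in \mathbb{N}}$, we put

$$\varphi(a) = \{\alpha_n : n \in \mathbb{N}\}.$$

Since $\varphi$ is clearly surjective, using part (i) we get

$$\text{card}\,\mathcal{P}_{\text{count}}(\mathbb{R}) \le \text{card}\,\mathbb{R}^{\mathbb{N}} = \mathfrak{c}^{\aleph_0} = \mathfrak{c}.$$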



Appendix C

Ordinal numbers

In this Appendix we discuss ordinal number arithmetic. The Axiom of Choice is assumed to be true.

Definition. Let $X$ be a non-empty set. A well ordering on $X$ is a total order relation $\prec$ on $X$ with the following property:

(w) every non-empty subset $A \subset X$ has a smallest element, i.e. there exists $a \in A$, such that $a \prec x$, $\forall\, x \in A$.

In this case the pair $(X, \prec)$ is called a well ordered set.

Notations. Let $(W, \prec)$ be a well-ordered set. For any $a \in W$, we define

$$W(a) = \{x \in W : x \prec a \text{ and } x \neq a\}.$$

Remark that $(W(a), \prec)$ is well-ordered.

Lemma C.1. Let $(W, \prec)$ be a well ordered set. For a subset $S \subset W$, the following are equivalent:

(i) for every $s \in S$, one has the inclusion $W(s) \subset S$;

(ii) either $S = W$, or there exists some $a \in W$, such that $S = W(a)$.

Proof. (i) ⇒ (ii). Assume $S \subsetneq W$. Take $a$ to be the smallest element of the set $W \setminus S$. If $s \in S$, then $a \neq s$, and by (i) we cannot have $a \prec s$, since this would force $a \in W(s) \subset S$. Therefore we must have $s \prec a$, i.e. $s \in W(a)$. This proves the inclusion $S \subset W(a)$. Conversely, if $s \in W(a)$, then $s$ must belong to $S$. Otherwise $s \in W \setminus S$ would contradict the minimality of $a$.

(ii) ⇒ (i). This is trivial.

Definition. A subset S , as above, is called a full subset.

The key feature of well-ordered sets is the following.

Lemma C.2 (Transfinite Induction Principle). Let $(W, \prec)$ be a well-ordered set. Let $w_1 \in W$ be the smallest element of $W$. Assume $A \subset W$ is a set with the property

(i) If $w \in W$ has the property that $W(w) \subset A$, then $w \in A$.

Then $A = W$.

Proof. Consider the set

$$S = \{s \in A : W(s) \subset A\}.$$

It is obvious that $S$ is full, and $S \subset A$. By Lemma C.1, either $S = W$, in which case we clearly get $A = W$, or there exists $w \in W$, such that $S = W(w)$. In this case we have $W(w) \subset A$. By (i) this forces $w \in A$, so we get $w \in S$, which is impossible.


Another useful feature is

Lemma C.3 (Recursion Principle). Let $(W, \prec)$ be a well-ordered set, and let $w_1$ be the smallest element in $W$. Let $X$ be a set, and assume one has a family of maps $\Phi_a : \prod_{W(a)} X \to X$, $a \in W \setminus \{w_1\}$. Then for any element $x_1 \in X$, there exists a unique function $f : W \to X$, such that

$$(1)\qquad f(w_1) = x_1 \quad\text{and}\quad f(a) = \Phi_a\big(f|_{W(a)}\big), \quad \forall\, a \in W \setminus \{w_1\}.$$

Proof. For every $a \in W$ let us denote the set $W(a) \cup \{a\}$ simply by $W_a$, and let us define the set

$$\mathcal{F}_a = \big\{g : W_a \to X \ :\ g(w_1) = x_1 \text{ and } g(b) = \Phi_b\big(g|_{W(b)}\big),\ \forall\, b \in W_a \setminus \{w_1\}\big\}.$$

Remark that, for any $a, b \in W$, with $a \prec b$, one has

$$(2)\qquad f|_{W_a} \in \mathcal{F}_a, \quad \forall\, f \in \mathcal{F}_b.$$

Claim: For every $a \in W$, the set $\mathcal{F}_a$ is a singleton.

We prove this statement using transfinite induction. Define

$$A = \{a \in W : \mathcal{F}_a \text{ is a singleton}\}.$$

Suppose $a \in W$ has the property $W(a) \subset A$, which means that $\mathcal{F}_b$ is a singleton, for all $b \in W(a)$. For each $b \in W(a)$, let $f_b : W_b \to X$ be the unique element in $\mathcal{F}_b$. We notice that, for any $b, c \in W(a)$, with $b \prec c$, using (2), we have

$$(3)\qquad f_c|_{W_b} = f_b.$$

This follows immediately from the fact that $f_c|_{W_b}$ belongs to $\mathcal{F}_b$. Using the obvious equality

$$W(a) = \bigcup_{b \in W(a)} W_b,$$

we define $g : W(a) \to X$ as the unique function with the property that $g|_{W_b} = f_b$, $\forall\, b \in W(a)$. Finally, we define $f_a : W_a \to X$ by $f_a|_{W(a)} = g$, and $f_a(a) = \Phi_a(g)$. It is clear that $f_a \in \mathcal{F}_a$, so $\mathcal{F}_a$ has at least one element. If $h \in \mathcal{F}_a$ is another function, then for every $b \in W(a)$ we have $h|_{W_b} \in \mathcal{F}_b$, which forces $h|_{W_b} = f_b$, in particular giving $h|_{W(a)} = g = f_a|_{W(a)}$. Then $h(a) = \Phi_a(g)$, which means that we also have $h(a) = f_a(a)$, so we must have $h = f_a$.

Having proven the Claim, we now have a family of functions $f_a : W_a \to X$, $a \in W$, with $f_b|_{W_a} = f_a$, for all $a, b \in W$ with $a \prec b$. Using the equality

$$W = \bigcup_{a \in W} W_a,$$

we then define $f : W \to X$ to be the unique function such that $f|_{W_a} = f_a$, $\forall\, a \in W$. Notice that, for each $a \in W \setminus \{w_1\}$, we have $f(a) = f_a(a)$, and since $f_a \in \mathcal{F}_a$, we immediately get (1). The uniqueness of $f$ with property (1) is also clear, since any such $f$ will automatically satisfy $f|_{W_a} \in \mathcal{F}_a$, for all $a \in W$.

Comment. The system of maps Φ_a : ∏_{W(a)} X → X, a ∈ W ∖ {w_1}, is to be thought of as a “recurrence relation,” in the sense that it is used to define the value f(a) in terms of all “preceding” values f(w), w ≼ a, w ≠ a.
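For a finite well-ordered set, the recursion described above can be carried out literally. The following is only a sketch under stated assumptions: W is given as a Python list in increasing order, the restriction f|_{W(a)} is modelled by a dictionary, and the names `define_by_recursion` and `fib_rule` are hypothetical, introduced here for illustration.

```python
from typing import Callable, Dict, Hashable, List


def define_by_recursion(
    elements: List[Hashable],
    x1: object,
    phi: Callable[[Hashable, Dict[Hashable, object]], object],
) -> Dict[Hashable, object]:
    """Return f with f(w1) = x1 and f(a) = phi(a, f restricted to W(a)).

    `elements` lists W in increasing order, so elements[0] plays the role
    of w1 and elements[:k] is the initial segment W(elements[k]).
    """
    f: Dict[Hashable, object] = {}
    for k, a in enumerate(elements):
        if k == 0:
            f[a] = x1
        else:
            # At this point f is defined exactly on W(a), so a copy of it
            # is the restriction f|_{W(a)} that the map Phi_a receives.
            f[a] = phi(a, dict(f))
    return f


def fib_rule(a: int, previous: Dict[int, int]) -> int:
    """A sample recurrence: the value at a depends on two preceding values."""
    if a == 1:
        return 1
    return previous[a - 1] + previous[a - 2]


# W = {0, 1, ..., 7} with the usual order, x1 = 0.
fib = define_by_recursion(list(range(8)), 0, fib_rule)
```

Here each Φ_a only looks at two of the preceding values, but the signature allows it to use all of them, which is the point of the Comment.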



ORDINAL NUMBERS 373

Definitions. Given two well-ordered sets (W_1, ≼_1) and (W_2, ≼_2), a map f : (W_1, ≼_1) → (W_2, ≼_2) is called a full embedding, if

• f is injective.
• For any two elements x, y ∈ W_1, one has

x ≼_1 y ⇒ f(x) ≼_2 f(y).

• f(W_1) is a full subset of W_2.

If f is a full embedding, with f(W_1) = W_2, then f is called an order isomorphism.
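The three conditions in the definition can be checked mechanically when both sets are finite. The sketch below assumes each well-ordered set is given as a list of distinct elements in increasing order and the map as a dictionary. It also assumes that a full subset is one that contains W(a) whenever it contains a, which for a sorted list means a prefix. All function names are hypothetical.

```python
from typing import Dict, Hashable, Iterable, List


def is_full_subset(subset: Iterable[Hashable], w: List[Hashable]) -> bool:
    """S is full in W when a in S implies W(a) is contained in S (a prefix of w)."""
    s = set(subset)
    return s == set(w[: len(s)])


def is_full_embedding(f: Dict, w1: List[Hashable], w2: List[Hashable]) -> bool:
    position = {y: k for k, y in enumerate(w2)}
    images = [f[x] for x in w1]
    if any(y not in position for y in images):
        return False
    injective = len(set(images)) == len(images)
    # By transitivity it is enough to compare consecutive elements of w1.
    monotone = all(
        position[images[k]] <= position[images[k + 1]]
        for k in range(len(images) - 1)
    )
    return injective and monotone and is_full_subset(images, w2)


def is_order_isomorphism(f: Dict, w1: List[Hashable], w2: List[Hashable]) -> bool:
    return is_full_embedding(f, w1, w2) and {f[x] for x in w1} == set(w2)


w1, w2 = ["a", "b"], [10, 20, 30]
onto_prefix = {"a": 10, "b": 20}   # image {10, 20} is full in w2
skips_20 = {"a": 10, "b": 30}      # monotone and injective, but image is not full
```

The second map shows why the third condition is not redundant: it is injective and order preserving, yet its image omits 20 while containing 30.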

The properties of these types of maps are contained in the following

Proposition C.1. A. Suppose (W_1, ≼_1) and (W_2, ≼_2) are well-ordered sets.

(i) If f : (W_1, ≼_1) → (W_2, ≼_2) is a full embedding, then

f(W_1(a)) = W_2(f(a)), ∀ a ∈ W_1.

In particular, if w_1 is the smallest element in W_1, and w_2 is the smallest element in W_2, then f(w_1) = w_2.

(ii) If f : (W_1, ≼_1) → (W_2, ≼_2) is an order isomorphism, then f^{-1} : (W_2, ≼_2) → (W_1, ≼_1) is again an order isomorphism.

(iii) There exists at most one full embedding f : (W_1, ≼_1) → (W_2, ≼_2).

B. Suppose (W_1, ≼_1), (W_2, ≼_2), (W_3, ≼_3) are well-ordered sets, and

f : (W_1, ≼_1) → (W_2, ≼_2), g : (W_2, ≼_2) → (W_3, ≼_3)

are full embeddings.

(i) The composition g ∘ f : (W_1, ≼_1) → (W_3, ≼_3) is again a full embedding.

(ii) The composition g ∘ f is an order isomorphism, if and only if both f and g are order isomorphisms.

Proof. A. (i). Start first with some element x ∈ W_1(a). Since x ≼_1 a, we have f(x) ≼_2 f(a). Since f is injective, and x ≠ a, we must have f(x) ≠ f(a), hence f(x) ∈ W_2(f(a)). Conversely, if y ∈ W_2(f(a)), then using the fact that f(W_1) is full in W_2, it follows that y ∈ f(W_1), so there exists some x ∈ W_1, with y = f(x). If a ≼_1 x, then we would get f(a) ≼_2 f(x) = y, which is impossible. Therefore we must have x ≼_1 a and x ≠ a, i.e. x ∈ W_1(a), so y indeed belongs to f(W_1(a)). The second assertion is now clear since we have

W_2(f(w_1)) = f(W_1(w_1)) = f(∅) = ∅,

which clearly forces f(w_1) = w_2.

(ii). This is obvious.

(iii). Suppose f, g : (W_1, ≼_1) → (W_2, ≼_2) are full embeddings, and let us show that we must have f = g. We use transfinite induction. Define the set

A = { w ∈ W_1 : f(w) = g(w) }.

Let w ∈ W_1 be some element such that W_1(w) ⊂ A, and let us prove that w ∈ A, i.e. f(w) = g(w). Denote f(w) by a, and g(w) by b. Using the fact that f|_{W_1(w)} = g|_{W_1(w)}, combined with (i), we have

W_2(a) = W_2(f(w)) = f(W_1(w)) = g(W_1(w)) = W_2(g(w)) = W_2(b).

This clearly forces a = b. Indeed, if a ≠ b, then either a ≼_2 b, in which case a ∈ W_2(b) ∖ W_2(a), or b ≼_2 a, in which case b ∈ W_2(a) ∖ W_2(b). In either case, we will get W_2(a) ≠ W_2(b).



374 APPENDIX C

B. (i). It is clear that g ∘ f is injective, and satisfies the second condition in the definition, so the only thing we need to prove is the fact that (g ∘ f)(W_1) is full. If f(W_1) = W_2, there is nothing to prove, since we would get (g ∘ f)(W_1) = g(W_2), which is full. Assume f(W_1) = W_2(a), for some a ∈ W_2. Then by A.(i) we have

(g ∘ f)(W_1) = g(f(W_1)) = g(W_2(a)) = W_3(g(a)),

so again (g ∘ f)(W_1) is full.

(ii). Assume first that both f and g are order isomorphisms. Then g ∘ f : (W_1, ≼_1) → (W_3, ≼_3) is a full embedding, by (i), and it is clearly surjective, hence g ∘ f is indeed an order isomorphism.

Conversely, assume g ∘ f : (W_1, ≼_1) → (W_3, ≼_3) is an order isomorphism. This clearly forces g to be surjective, hence an order isomorphism. But then g^{-1} is an order isomorphism, and so will be g^{-1} ∘ (g ∘ f) = f.

Corollary C.1. If (W, ≼) is a well-ordered set, and a ∈ W, then there is no full embedding (W, ≼) → (W(a), ≼).

Proof. Suppose there exists a full embedding f : (W, ≼) → (W(a), ≼). Since the inclusion ι : (W(a), ≼) → (W, ≼) is obviously a full embedding, the composition ι ∘ f : (W, ≼) → (W, ≼) is a full embedding. Since we also have Id_W : (W, ≼) → (W, ≼) as a full embedding, this would force ι ∘ f = Id_W, which would force ι to be surjective. But this is obviously impossible.

Definitions. Two well-ordered sets (W_1, ≼_1) and (W_2, ≼_2) are said to have the same order type, if there exists an order isomorphism (W_1, ≼_1) → (W_2, ≼_2). By the above considerations, this defines an equivalence relation on the class of all well-ordered sets.

An ordinal number is thought of as an equivalence class of well-ordered sets. In other words, if we write an ordinal number as α, it is understood that α consists of all well-ordered sets of a given order type. So when we write ord(W, ≼) = α we understand that (W, ≼) belongs to this class, and for another well-ordered set (W′, ≼′) we write ord(W′, ≼′) = α, exactly when (W′, ≼′) has the same order type as (W, ≼). In this case we write ord(W′, ≼′) = ord(W, ≼).

We regard the empty set ∅ as a well-ordered set, with the empty relation. We write ord(∅) = 0.

Comments. If (W_1, ≼_1) and (W_2, ≼_2) are well-ordered sets, then one has the obvious implication

ord(W_1, ≼_1) = ord(W_2, ≼_2) =⇒ card W_1 = card W_2.

Conversely, if the well-ordered sets (W_1, ≼_1) and (W_2, ≼_2) are finite, and card W_1 = card W_2, then ord(W_1, ≼_1) = ord(W_2, ≼_2). Indeed, if we take n = card W_1, then one can define recursively a finite sequence (w_k)_{k=1}^{n} ⊂ W_1, by taking w_1 to be the smallest element of W_1, and defining, for each k ∈ {2, 3, . . . , n}, the element w_k to be the smallest element of the set W_1 ∖ {w_1, w_2, . . . , w_{k−1}}. The obvious bijection

{1, 2, . . . , n} ∋ k ⟼ w_k ∈ W_1

will then define an order isomorphism ({1, . . . , n}, ≤) → (W_1, ≼_1). Likewise (W_2, ≼_2) has the same order type as ({1, . . . , n}, ≤).
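The recursive choice of w_1, w_2, . . . , w_n described above can be written out directly. This is a minimal sketch, assuming the order is supplied as a two-argument predicate `leq` that is a total order on the given finite set. The function name and the sample order (strings compared by length) are illustrative assumptions only.

```python
from typing import Callable, Dict, Hashable, Iterable, List


def enumerate_well_ordered(
    elements: Iterable[Hashable],
    leq: Callable[[Hashable, Hashable], bool],
) -> List[Hashable]:
    """Return [w_1, ..., w_n], where w_k is the smallest element of
    W minus {w_1, ..., w_{k-1}}."""
    remaining = set(elements)
    sequence: List[Hashable] = []
    while remaining:
        # The smallest element is the one below every remaining element.
        smallest = next(
            a for a in remaining if all(leq(a, b) for b in remaining)
        )
        sequence.append(smallest)
        remaining.remove(smallest)
    return sequence


# Three strings of pairwise distinct lengths, ordered by length.
words = {"pear", "fig", "banana"}
sequence = enumerate_well_ordered(words, lambda a, b: len(a) <= len(b))

# The bijection k -> w_k from {1, ..., n} onto W.
isomorphism: Dict[int, str] = {k + 1: w for k, w in enumerate(sequence)}
```

Two finite well-ordered sets with the same number of elements are then matched by composing one such bijection with the inverse of the other.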




Using the above notations, we can then regard all non-negative integers as ordinal numbers, by identifying ord(W, ≼) = card(W), for all finite well-ordered sets (W, ≼).

Notation. If α is an ordinal number, say α = ord(W, ≼), for some well-ordered set (W, ≼), then the cardinal number card W does not depend on the particular choice of (W, ≼). We will denote it by card α. As discussed above, if

card α = card β = finite cardinal,

then α = β. As we shall see later, this implication holds only for finite ordinal numbers.

Definitions. Let α_1 and α_2 be ordinal numbers, say α_1 = ord(W_1, ≼_1) and α_2 = ord(W_2, ≼_2), where (W_1, ≼_1) and (W_2, ≼_2) are two well-ordered sets. We write α_1 ≤ α_2, if there exists a full embedding f : (W_1, ≼_1) → (W_2, ≼_2). By Proposition C.1, this definition is independent of the choices of (W_1, ≼_1) and (W_2, ≼_2).

We write α_1 < α_2 if α_1 ≤ α_2 and α_1 ≠ α_2.

Remark C.1. If α_1 and α_2 are ordinal numbers, with α_1 ≤ α_2, then card α_1 ≤ card α_2.

Proposition C.2. The relation ≤ is an order relation, on any set of ordinal numbers.

Proof. It is obvious that α ≤ α, for any ordinal number α.

Assume α_1 and α_2 are ordinal numbers with α_1 ≤ α_2 and α_2 ≤ α_1, and let us show that this forces α_1 = α_2. Let (W_1, ≼_1) and (W_2, ≼_2) be well-ordered sets with α_1 = ord(W_1, ≼_1) and α_2 = ord(W_2, ≼_2). Since α_1 ≤ α_2, there exists a full embedding f : (W_1, ≼_1) → (W_2, ≼_2). Since α_2 ≤ α_1, there also exists a full embedding g : (W_2, ≼_2) → (W_1, ≼_1). By Proposition C.1.B, the composition g ∘ f : (W_1, ≼_1) → (W_1, ≼_1) is a full embedding. Since we already have a full embedding Id_{W_1} : (W_1, ≼_1) → (W_1, ≼_1), by Proposition C.1.A, we must have g ∘ f = Id_{W_1}. Using Proposition C.1.B this forces f (and g) to be order isomorphisms, so we indeed have α_1 = α_2.

Finally, suppose α_1, α_2 and α_3 are ordinal numbers such that α_1 ≤ α_2 and α_2 ≤ α_3. The fact that α_1 ≤ α_3 follows immediately from Proposition C.1.B.

Theorem C.1 (Ordinal Comparability Theorem). Let α1 and α2 be ordinal numbers. Then either α1 ≤ α2, or α2 ≤ α1.

Proof. Let (W_1, ≼_1) and (W_2, ≼_2) be well-ordered sets with α_1 = ord(W_1, ≼_1) and α_2 = ord(W_2, ≼_2). For every a ∈ W_1 we denote the set W_1(a) ∪ {a} simply by W_1^a. It is clear that (W_1^a, ≼_1) is well-ordered. Consider the set

A = { a ∈ W_1 : there exists a full embedding (W_1^a, ≼_1) → (W_2, ≼_2) }.

By Proposition C.1.A, we know that for any a ∈ A, there exists a unique full embedding (W_1^a, ≼_1) → (W_2, ≼_2). We denote this full embedding by f_a.

Claim 1: The set A is full. Moreover, for any a, b ∈ A, with b ≼_1 a, we have f_b = f_a|_{W_1^b}.

Start with some a ∈ A, and let us prove that W_1(a) ⊂ A. Fix some arbitrary b ∈ W_1(a). Then the inclusion ι : (W_1^b, ≼_1) → (W_1^a, ≼_1) is obviously a full embedding, since we can write

W_1^b = W_1(c),




where c is the smallest element of the set

D_b = { x ∈ W_1 : b ≼_1 x and b ≠ x }.

(The fact that a ∈ D_b shows that D_b ≠ ∅.) Then the composition

f_a ∘ ι : (W_1^b, ≼_1) → (W_2, ≼_2)

is a full embedding, so b indeed belongs to A. Moreover, we will have f_b = f_a ∘ ι = f_a|_{W_1^b}.

Define the map φ : A → W_2 by

φ(a) = f_a(a), ∀ a ∈ A.

Remark that

(4) φ|_{W_1^a} = f_a, ∀ a ∈ A.

Indeed, if we take some b ∈ W_1(a), then by Claim 1, we have φ(b) = f_b(b) = f_a(b), so we get φ|_{W_1(a)} = f_a|_{W_1(a)}.

Claim 2: φ : (A, ≼_1) → (W_2, ≼_2) is a full embedding.

We start by proving the first two conditions. Let a, b ∈ A be such that b ≼_1 a and b ≠ a, and let us show that φ(b) ≼_2 φ(a) and φ(b) ≠ φ(a). We have b ∈ W_1^a and φ|_{W_1^a} = f_a, so using the fact that f_a : (W_1^a, ≼_1) → (W_2, ≼_2) is a full embedding, we indeed get φ(b) = f_a(b) ≼_2 f_a(a) = φ(a), and φ(b) ≠ φ(a).

We now show that φ(A) is full in (W_2, ≼_2). Start with some y ∈ φ(A), and let us show that W_2(y) ⊂ φ(A). On the one hand, since we obviously have

A = ⋃_{a∈A} W_1^a,

we also have

φ(A) = φ(⋃_{a∈A} W_1^a) = ⋃_{a∈A} φ(W_1^a),

so there exists some a ∈ A, such that y ∈ φ(W_1^a) = f_a(W_1^a). On the other hand, since f_a : (W_1^a, ≼_1) → (W_2, ≼_2) is a full embedding, it follows that f_a(W_1^a) is full, so we get W_2(y) ⊂ f_a(W_1^a) = φ(W_1^a) ⊂ φ(A).

We now finish the proof. Since both A and φ(A) are full, there are three cases to examine.

Case 1: A = W_1. In this case φ : (W_1, ≼_1) → (W_2, ≼_2) is a full embedding, so we get α_1 ≤ α_2.

Case 2: φ(A) = W_2. In this case φ : (A, ≼_1) → (W_2, ≼_2) is an order isomorphism, so φ^{-1} : (W_2, ≼_2) → (W_1, ≼_1) is a full embedding, and we get α_2 ≤ α_1.

Case 3: A ⊊ W_1 and φ(A) ⊊ W_2. This means there exist a_1 ∈ W_1 and a_2 ∈ W_2 such that A = W_1(a_1) and φ(A) = W_2(a_2). This case turns out to be impossible. To see this, we define ψ : W_1^{a_1} → W_2 by ψ|_{W_1(a_1)} = φ and ψ(a_1) = a_2. Then ψ : (W_1^{a_1}, ≼_1) → (W_2, ≼_2) will still be a full embedding. Indeed, the first two conditions in the definition are clear, while the equality

ψ(W_1^{a_1}) = W_2^{a_2} = { y ∈ W_2 : y ≼_2 a_2 }

proves that ψ(W_1^{a_1}) is full. The existence of ψ then forces a_1 ∈ A, which contradicts the equality A = W_1(a_1).
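In the finite case the construction of φ is easy to trace. The sketch below is an illustration under simplifying assumptions, not the argument of the proof itself: both sets are lists of distinct elements in increasing order, and φ is built by sending each element to the smallest value of W_2 not yet used, stopping when W_2 is exhausted. The name `compare_finite` and the verdict strings are hypothetical.

```python
from typing import Dict, Hashable, List, Tuple


def compare_finite(
    w1: List[Hashable], w2: List[Hashable]
) -> Tuple[Dict[Hashable, Hashable], str]:
    """Build phi on the largest full subset A of w1 that embeds fully in w2.

    Both lists are assumed sorted increasingly with distinct entries,
    so a full subset is simply a prefix of the list.
    """
    phi: Dict[Hashable, Hashable] = {}
    for a in w1:
        unused = w2[len(phi):]      # W_2 minus the values already taken
        if not unused:              # phi(A) = W_2, as in Case 2
            break
        phi[a] = unused[0]          # smallest element not yet in the image
    if len(phi) == len(w1):         # A = W_1, as in Case 1
        return phi, "alpha1 <= alpha2"
    return phi, "alpha2 <= alpha1"


phi, verdict = compare_finite(["a", "b"], [10, 20, 30])
```

The loop can only stop for one of two reasons, W_1 is exhausted or W_2 is exhausted, which mirrors the fact that Case 3 cannot occur.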




Theorem C.2. Let α be an ordinal number. Then the class P_α of all ordinal numbers β with β < α is a set. More explicitly, if (W, ≼) is a well-ordered set with ord(W, ≼) = α, then the map

φ : W ∋ a ⟼ ord(W(a), ≼) ∈ P_α

is a bijection. Moreover, (P_α, ≤) is well-ordered, and φ : (W, ≼) → (P_α, ≤) is an order isomorphism.

Proof. Let β be an ordinal number with β < α. Then there exists a well-ordered set (W_1, ≼_1), and a full embedding φ : (W_1, ≼_1) → (W, ≼), such that

• β = ord(W_1, ≼_1),
• φ(W_1) = W(a_1),

for some a_1 ∈ W. This fact already proves that P_α is a set.

Claim: The element a_1 ∈ W does not depend on the particular choice of (W_1, ≼_1).

Indeed, if (W_2, ≼_2) is another well-ordered set, and ψ : (W_2, ≼_2) → (W, ≼) is another full embedding with

• β = ord(W_2, ≼_2),
• ψ(W_2) = W(a_2),

for some a_2 ∈ W, then we would get the existence of an order isomorphism γ : (W(a_1), ≼) → (W(a_2), ≼). We can assume (otherwise we replace γ with γ^{-1}) that a_1 ≼ a_2. If a_1 ≠ a_2, we would have a_1 ∈ W(a_2), so if we work with the well-ordered set Z = W(a_2) we would have an order isomorphism (Z, ≼) → (Z(a_1), ≼). By Corollary C.1 this is impossible. Therefore, we must have a_1 = a_2.

Using the Claim, we then define a_β as the unique element in W, such that ord(W(a_β), ≼) = β. Define the map ψ : P_α ∋ β ⟼ a_β ∈ W. It is clear that φ ∘ ψ = Id_{P_α}.

Let us prove now that ψ ∘ φ = Id_W. Start with some arbitrary a ∈ W, and put β = φ(a) = ord(W(a), ≼). Since ord(W(a), ≼) = β, by the Claim, we must have a_β = a, i.e. ψ(β) = a, which means that (ψ ∘ φ)(a) = a.

Finally, we note that, if a, b ∈ W are elements with a ≼ b, then the obvious full embedding (W(a), ≼) → (W(b), ≼) proves that ord(W(a), ≼) ≤ ord(W(b), ≼), i.e. φ(a) ≤ φ(b).

Since φ is bijective, it is clear that, for a, b ∈ W, we have in fact the equivalence

a ≼ b ⇐⇒ φ(a) ≤ φ(b).

This proves that (P_α, ≤) is well-ordered, and φ : (W, ≼) → (P_α, ≤) is an order isomorphism.

Corollary C.2. If S is a set of ordinal numbers, then (S, ≤) is well-ordered.

Proof. By Theorem C.1, (S, ≤) is totally ordered. Fix some non-empty subset A ⊂ S, and let us show that A has a smallest element. Start with some arbitrary α ∈ A. If α ≤ β, ∀ β ∈ A, we are done. Otherwise, the intersection A ∩ P_α is non-empty. We then use the fact that (P_α, ≤) is well-ordered, to choose α_1 to be the smallest element of A ∩ P_α. If we start with some arbitrary β ∈ A, then either α ≤ β, in which case we immediately get α_1 < β, or β < α, in which case β ∈ A ∩ P_α, and we again get α_1 ≤ β. So α_1 is in fact the smallest element of A.




Theorem C.3 (Well ordering Theorem). Every non-empty set has a well ordering.

Proof. Let X be a non-empty set, and let

𝒲 = { (W, ≼) : (W, ≼) well-ordered, and W ⊂ X }.

For two elements (W_1, ≼_1) and (W_2, ≼_2) of 𝒲, we define (W_1, ≼_1) ⊑ (W_2, ≼_2), if and only if W_1 ⊂ W_2, and the inclusion map (W_1, ≼_1) → (W_2, ≼_2) is a full embedding. (This is equivalent to the fact that W_1 is a full subset of (W_2, ≼_2), and ≼_1 = ≼_2|_{W_1}.)

It is obvious that (𝒲, ⊑) is an ordered set. We want to apply Zorn Lemma to this set. We need to check the hypothesis. Start with a totally ordered subset 𝒯 = { (W_i, ≼_i) : i ∈ I } ⊂ 𝒲, and let us show that 𝒯 has an upper bound in 𝒲. Define W = ⋃_{i∈I} W_i. For a, b ∈ W, we define a ≼ b, if and only if there exists i ∈ I, such that a, b ∈ W_i, and a ≼_i b. Let us check that (W, ≼) is a well-ordered set.

First of all, we need to show that ≼ is an order relation on W. It is clear that a ≼ a, ∀ a ∈ W. Suppose a, b ∈ W satisfy a ≼ b and b ≼ a, and let us show that a = b. We know there exist i, j ∈ I such that a, b ∈ W_i and a ≼_i b, and a, b ∈ W_j and b ≼_j a. Now there are two possibilities: either (W_i, ≼_i) ⊑ (W_j, ≼_j), or (W_j, ≼_j) ⊑ (W_i, ≼_i). In the first case we get a ≼_i b and b ≼_i a, so we would get a = b. In the other case, by symmetry, we again get a = b.

Let us show now transitivity. Suppose a, b, c ∈ W satisfy a ≼ b and b ≼ c, and let us show that a ≼ c. We know there exist i, j ∈ I, such that a, b ∈ W_i and a ≼_i b, and b, c ∈ W_j and b ≼_j c. As above, we have two possibilities: either (W_i, ≼_i) ⊑ (W_j, ≼_j), or (W_j, ≼_j) ⊑ (W_i, ≼_i). In the first case we get a, b, c ∈ W_j and a ≼_j b ≼_j c, so we get a ≼_j c. In the second case, we get a, b, c ∈ W_i and a ≼_i b ≼_i c, so we get a ≼_i c. In either case we get a ≼ c.

Next we show that (W, ≼) is totally ordered. Start with arbitrary a, b ∈ W, and let us prove that either a ≼ b or b ≼ a. If we choose i, j ∈ I such that a ∈ W_i and b ∈ W_j, then using the two possibilities (W_i, ≼_i) ⊑ (W_j, ≼_j) or (W_j, ≼_j) ⊑ (W_i, ≼_i) we immediately see that we can find k ∈ I (k is either i or j), such that a, b ∈ W_k. Then using the fact that (W_k, ≼_k) is totally ordered, we either have a ≼_k b, or b ≼_k a. This gives either a ≼ b, or b ≼ a.

In order to prove that (W, ≼) is well-ordered, and (W_i, ≼_i) ⊑ (W, ≼), ∀ i ∈ I, we shall use the following

Claim: For any i ∈ I, one has the implication:

a ∈ W_i =⇒ W(a) ⊂ W_i.

Indeed, if there exists some b ∈ W(a), but b ∉ W_i, this would mean that there exists some j ∈ I, with b ∈ W_j, b ≼_j a, and b ≠ a. This would then force (W_i, ≼_i) ⊑ (W_j, ≼_j), and b ∈ W_j(a). But this is impossible, since the fact that W_i is full in (W_j, ≼_j) would force b ∈ W_j(a) ⊂ W_i.

Let us show now that (W, ≼) is well-ordered. Start with some arbitrary non-empty subset A ⊂ W. Choose i ∈ I, such that A ∩ W_i ≠ ∅, and take a to be the smallest element in A ∩ W_i, in the well-ordered set (W_i, ≼_i), i.e.

(5) a ∈ A ∩ W_i, and a ≼_i x, ∀ x ∈ A ∩ W_i.

Let us prove that a is in fact the smallest element of A, in (W, ≼). Start with some arbitrary element b ∈ A, and let us prove that a ≼ b. Assume the opposite, which, using the fact that (W, ≼) is totally ordered, means that b ≼ a, and b ≠ a,




i.e. b ∈ W(a). By the Claim however, this will force b ∈ W_i, so we would get b ∈ A ∩ W_i, and the choice of a would give a ≼_i b, which would then give a ≼ b, thus contradicting the assumption on b.

We now prove (W_i, ≼_i) ⊑ (W, ≼), ∀ i ∈ I. It is clear that the inclusion map ι : (W_i, ≼_i) → (W, ≼) satisfies the first two conditions in the definition of full embeddings, so the only thing we need is the fact that W_i is full in (W, ≼). But this is precisely the content of the above Claim.

Having shown that every totally ordered subset 𝒯 ⊂ 𝒲 has an upper bound, we now invoke Zorn Lemma, to get the existence of a maximal element (W, ≼) ∈ 𝒲. The proof of the Theorem will be finished once we prove that W = X. We prove this equality by contradiction. Assume W ⊊ X. Pick an element x ∈ X ∖ W, and define the set W_1 = W ∪ {x}. Equip W_1 with the order relation ≼_1 defined by

a ≼_1 b ⇐⇒ either a, b ∈ W and a ≼ b, or b = x.

It is pretty obvious that W = W_1(x) and ≼ = ≼_1|_W, so (W_1, ≼_1) is well-ordered and (W, ≼) ⊑ (W_1, ≼_1). Since W ⊊ W_1, this would contradict the maximality.
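The one-point extension used in the last step can be made concrete for a finite set. This is a minimal sketch, assuming the order on W is given as a predicate `leq`, and the helper name `extend_with_top` is hypothetical. It only illustrates the definition of ≼_1, not the appeal to Zorn Lemma.

```python
from typing import Callable, Hashable, Iterable


def extend_with_top(
    leq: Callable[[Hashable, Hashable], bool],
    w: Iterable[Hashable],
    x: Hashable,
) -> Callable[[Hashable, Hashable], bool]:
    """Return the relation on W ∪ {x} given by:
    a ≼_1 b  iff  (a, b in W and a ≼ b) or b = x."""
    base = set(w)
    if x in base:
        raise ValueError("x must lie outside W")

    def leq1(a: Hashable, b: Hashable) -> bool:
        return b == x or (a in base and b in base and leq(a, b))

    return leq1


w = [1, 2, 3]
leq1 = extend_with_top(lambda a, b: a <= b, w, "top")

# The initial segment W_1(x): everything strictly below the new element.
segment = [a for a in w + ["top"] if leq1(a, "top") and a != "top"]
```

The new element x sits above every element of W, so W is recovered as the initial segment W_1(x) and the old order is the restriction of the new one.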

Comment. An interesting consequence of the Well-Ordering Theorem is the following: For any cardinal number a, there exists an ordinal number α, such that card α = a.

Another interesting application is the following:

Corollary C.3. If C is a set of cardinal numbers, then (C, ≤) is well-ordered.

Proof. For any a ∈ C we choose a well-ordered set (W_a, ≼_a) with card W_a = a. Choose any set X with

a < card X, ∀ a ∈ C.

(For example, we can take Y = ⋃_{a∈C} W_a, so that a ≤ card Y, ∀ a ∈ C, and then we define X = {0, 1}^Y.) Choose a well-ordering ≼ on the set X. Define α = ord(X, ≼) and α_a = ord(W_a, ≼_a), ∀ a ∈ C. Since

card α_a = a < card X = card α, ∀ a ∈ C,

it follows that we have α_a < α, i.e. α_a ∈ P_α, ∀ a ∈ C.

Apply now the fact that the set (P_α, ≤) is well-ordered, to find some a_0 ∈ C, such that

α_{a_0} ≤ α_a, ∀ a ∈ C.

This will clearly imply

a_0 = card α_{a_0} ≤ card α_a = a, ∀ a ∈ C.

Examples C.1. As previously discussed, for every finite cardinal number n ≥ 0, there exists exactly one ordinal number with n as its cardinality.

The next interesting case is the class

A = { α : α ordinal number with card α ≤ ℵ_0 }.

Then A is a set. Indeed if we choose an ordinal number γ_1 with card γ_1 = c, then A
