fa lecture

Upload: frederico-sande-viana

Post on 14-Apr-2018


  • 7/29/2019 FA Lecture

    1/420

    Draft version: comments to [email protected] please

    Functional Analysis Notes
    M. Einsiedler, T. Ward
    Draft July 2, 2012


    Acknowledgements

    We are grateful to several people for their comments on drafts of sections, including Anthony Flatters, Thomas Hille, Alex Maier, Andrea Riva, and Rene Ruhr. Also Emmanuel Kowalski for making available notes on spectral theory and allowing us to raid them.


    Contents

    1 Motivation
        1.1 From Even and Odd Functions to Group Representations
        1.2 (Equi-)distribution of Points and Measures
        1.3 Ordinary Differential Equations
            1.3.1 A Second-Order Linear Initial Value Problem
            1.3.2 The Volterra Equation
            1.3.3 The Sturm–Liouville Equation
        1.4 Partial Differential Equations and the Laplace Operator
            1.4.1 The Heat Equation
            1.4.2 The Wave Equation
        1.5 Distributions as Generalized Functions
        1.6 Highly Connected Networks: Expanders
        1.7 What is spectral theory?
        1.8 Further Topics

    2 Norms, Banach Spaces, and Hilbert Spaces
        2.1 Norms and Semi-Norms
            2.1.1 Normed Vector Spaces
            2.1.2 Semi-Norms and Quotient Norms
            2.1.3 A Comment on Notation
        2.2 Banach Spaces
            2.2.1 Proofs of Completeness
            2.2.2 The Completion of a Normed Vector Space
            2.2.3 Non-Compactness of the Unit Ball
        2.3 The space of continuous functions
            2.3.1 The Arzelà–Ascoli theorem
            2.3.2 The Stone–Weierstrass Theorem
            2.3.3 Continuous Functions in $L^p$ Spaces
        2.4 Bounded Operators and Functionals
            2.4.1 The Volterra Equation
            2.4.2 The Norm of Continuous Functionals on C(X)


            2.4.3 Banach Algebras
        2.5 Hilbert Spaces
            2.5.1 Definitions and Elementary Properties
            2.5.2 Isometries are Affine
            2.5.3 Convex Sets in Uniformly Convex Spaces
            2.5.4 Two Applications to Measure Theory
            2.5.5 Orthonormal Bases and Gram–Schmidt
            2.5.6 The Non-Separable Case
        2.6 Further Topics

    3 From Fourier Series to Dirichlet Boundary Value Problems
        3.1 Fourier Series on Compact Abelian Groups
        3.2 Fourier Series on $\mathbb{T}^d$
            3.2.1 Convolution on the Torus
            3.2.2 Dirichlet and Fejér Kernels
            3.2.3 Differentiability and Fourier Series
        3.3 Spectral Theory for Group Actions on $\mathbb{T}^d$
            3.3.1 Group Actions and Unitary Representations
            3.3.2 Measure-Preserving Actions of Compact Groups
            3.3.3 Unitary Representations of Compact Abelian Groups
            3.3.4 Integrating Hilbert Space-valued Functions
            3.3.5 Proof of the Weight Decomposition
        3.4 Sobolev Spaces and Embedding on the Torus
            3.4.1 $L^2$-Sobolev Spaces on $\mathbb{T}^d$
            3.4.2 The Sobolev Embedding Theorem on $\mathbb{T}^d$
        3.5 Sobolev Spaces and Embedding Theorem on Open Sets
            3.5.1 $L^2$-Sobolev Spaces on Open Subsets
            3.5.2 Examples
            3.5.3 Restriction Operators and Traces
            3.5.4 Sobolev Embedding in the Interior
        3.6 The Dirichlet Boundary Value Problem and Elliptic Regularity
            3.6.1 The Pre-Inner Product
            3.6.2 Elliptic Regularity for the Laplace Operator
            3.6.3 Dirichlet's Boundary Value Problem in two dimensions
        3.7 Further Topics

    4 Compact Self-Adjoint Operators and Laplace Eigenfunctions
        4.1 The Goal
        4.2 Compact Operators
            4.2.1 Definition and Basic Properties
            4.2.2 Integral Operators are often Compact
        4.3 Spectral Theory of Self-Adjoint Compact Operators
            4.3.1 The Adjoint Operator
            4.3.2 The Spectral Theorem
            4.3.3 Proof of the Spectral Theorem


        4.4 Eigenfunctions for the Laplace Operator
            4.4.1 A Compact Right Inverse on the Torus
            4.4.2 A Self-Adjoint Right Inverse on Open Subsets
            4.4.3 Compactness of the Right-Inverse

    5 Uniform Boundedness and Open Mapping Theorem
        5.1 Uniform Boundedness
            5.1.1 Uniform Boundedness and Fourier Series
        5.2 Open Mapping and Closed Graph Theorems
            5.2.1 Baire Category
            5.2.2 Proof of Open Mapping Theorem
            5.2.3 Consequences: Bounded Inverses and Closed Graphs
        5.3 Further Topics

    6 Dual Spaces
        6.1 Hahn–Banach Theorem and its Consequences
            6.1.1 Hahn–Banach Lemma
            6.1.2 The Hahn–Banach Theorem Consequences
            6.1.3 An Application of the Spanning Criterion
            6.1.4 The Bidual
            6.1.5 Banach Limits and Amenable Groups
        6.2 The Duals of $L^p(X)$
            6.2.1 The Dual of $L^1(X)$
            6.2.2 The Dual of $L^p(X)$ for $p > 1$
        6.3 Riesz Representation, The Dual of C(X)
            6.3.1 Uniqueness
            6.3.2 Totally Disconnected Compact Spaces
            6.3.3 Compact Spaces
            6.3.4 Locally Compact $\sigma$-Compact Metric Spaces
            6.3.5 Continuous Linear Functionals on C(X)
        6.4 Further Topics

    7 Locally Convex Vector Spaces
        7.1 Weak Topologies and Tychonoff–Alaoglu
            7.1.1 Weak* Compactness of the Unit Ball
            7.1.2 More Properties of the Weak and Weak* Topologies
            7.1.3 Analytic Functions and the Weak Topology
        7.2 Applications of Weak* Compactness
            7.2.1 Equidistribution
            7.2.2 Elliptic Regularity at the Boundary
        7.3 Topologies on B(X, Y)
        7.4 Locally Convex Vector Spaces
        7.5 Convex Sets
            7.5.1 Applications of the Hahn–Banach Lemma
            7.5.2 Extremal Points and the Krein–Milman Theorem


        7.6 Further Topics

    8 Spectral Theory of Unitary Operators, Fourier Transforms
        8.1 Spectral Theory of Unitary Operators
            8.1.1 Bochner's Theorem for Positive-Definite Sequences
            8.1.2 Cyclic Representations and the Spectral Theorem
            8.1.3 Proof of Bochner's theorem
            8.1.4 Projection-valued Measures
        8.2 Fourier Transform
            8.2.1 Fourier Transform on $L^1(\mathbb{R}^d)$
            8.2.2 Fourier Transform on $L^2(\mathbb{R}^d)$
            8.2.3 Fourier transform and smoothness, Schwartz space

    9 Banach Algebras and Spectrum
        9.1 Spectrum and Spectral Radius
            9.1.1 The Geometric Series and its Consequences
            9.1.2 Using Cauchy Integration
        9.2 C*-algebras
        9.3 Commutative Banach Algebras and their Gelfand duals
            9.3.1 Commutative Unital Banach Algebras
            9.3.2 Commutative Banach Algebras without a Unit
            9.3.3 The Gelfand Transform
            9.3.4 The Gelfand Transform for Commutative C*-algebras
        9.4 Further Topics

    10 Functional Calculus and Spectral Theory
        10.1 Definitions, Basic Lemmas, Main Goals
            10.1.1 Discrete, Continuous, and Residual Spectrum
            10.1.2 Numerical Range
            10.1.3 Main Goals: Spectral Theorem and Functional Calculus
        10.2 Continuous Functional Calculus for Self-Adjoint Operators
            10.2.1 Corollaries to the Continuous Functional Calculus
        10.3 Spectral measures
            10.3.1 The Spectral Theorem for Self-Adjoint Operators
        10.4 Spectral Measures and the Measurable Functional Calculus
            10.4.1 Non-Diagonal Spectral Measures
            10.4.2 The Measurable Functional Calculus
        10.5 Commuting Normal Operators
        10.6 Projection-valued measures
        10.7 The spectral theorem for normal operators
        10.8 Some Facts on the Spectrum of a Tree
            10.8.1 The Correct Upper Bound for the Summing Operator
            10.8.2 Chebyshev Polynomials of the Second Kind


    11 Spectral Theory of Self-Adjoint Unbounded Operators
        11.1 Examples, Definitions, and the Main Theorem
        11.2 Operators of the form $T^*T$
        11.3 Self-Adjoint Operators

    Appendix A: Topology
        A.1 Set Theory and Axiom of Choice
        A.2 Basic Definitions
        A.3 Convergence and Continuity
        A.4 Inducing Topologies
        A.5 Compact Sets and Tychonoff Theorem
        A.6 Normal Spaces

    Appendix B: Measure Theory
        B.1 Basic Definitions and Measurability
            B.1.1 Measure and Integral
        B.2 Properties of the Integral
        B.3 The p-Norm
        B.4 Near-continuity of Measurable Functions

    Hints for Selected Problems

    References
    Author Index
    Notation
    General Index


    Introduction


    1 Motivation

    We start by discussing some seemingly disparate topics that are all intimately linked to notions from functional analysis. Some of the topics have been important motivations for the development of the theory that came to be called functional analysis in the first place, and some topics concern more recent applications of the theory. We hope that the variety of topics helps to clarify the central role of functional analysis in mathematics.

    1.1 From Even and Odd Functions to Group Representations

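    We recall the following elementary but useful notions of symmetry and anti-symmetry for functions. A function $f : \mathbb{R} \to \mathbb{R}$ is said to be even if

    \[ f(-x) = f(x) \]

    for all $x \in \mathbb{R}$, and odd if

    \[ f(-x) = -f(x) \]

    for all $x \in \mathbb{R}$. Every function $f : \mathbb{R} \to \mathbb{R}$ can be split into an even and an odd component, since

    \[ f(x) = \underbrace{\frac{f(x) + f(-x)}{2}}_{\text{the even part}} + \underbrace{\frac{f(x) - f(-x)}{2}}_{\text{the odd part}}. \tag{1.1} \]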
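    The decomposition (1.1) is easy to experiment with numerically. The following Python sketch is an illustration only: the helper names `even_part` and `odd_part`, and the choice of the exponential function as a test case, are assumptions made here and are not part of the notes.

```python
import math
from typing import Callable

Func = Callable[[float], float]

def even_part(f: Func) -> Func:
    """Return the function x -> (f(x) + f(-x)) / 2 from (1.1)."""
    return lambda x: (f(x) + f(-x)) / 2

def odd_part(f: Func) -> Func:
    """Return the function x -> (f(x) - f(-x)) / 2 from (1.1)."""
    return lambda x: (f(x) - f(-x)) / 2

# Illustrative choice: exp splits into cosh (even) plus sinh (odd).
e = even_part(math.exp)
o = odd_part(math.exp)
```

    For this choice the two parts agree with `math.cosh` and `math.sinh` up to rounding, and their sum recovers `math.exp`.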

    Footnote: This chapter is atypical for these notes. The reader may and the lecturer should skip, or return later to, this chapter for convenience. In fact the (sometimes informal) discussions here are not needed for the formal development of the theory, which starts in Chapter 2, but they may help to motivate some of the later developments. We also apologize for the trivial first two pages, but we hope that these help to clarify how natural the discussed decompositions are.


    Exercise 1.1. Is the decomposition of a function into odd and even parts in (1.1) unique? That is, if $f = e + o$ with $e$ an even function and $o$ an odd function, is $e(x) = \frac{f(x) + f(-x)}{2}$?

    Behind the definition of even and odd functions, and the decomposition in (1.1), is the group $\mathbb{Z}/2\mathbb{Z} = \{0, 1\}$ acting on $\mathbb{R}$ via the map $x \mapsto (-1)^n x$ for $n \in \mathbb{Z}/2\mathbb{Z}$.

    In order to generalize this observation, recall that an action of a group $G$ on a set $X$ is a map

    \[ G \times X \to X, \qquad (g, x) \mapsto g.x \]

    with the properties

    \[ g.(h.x) = (gh).x \]

    for all $g, h \in G$ and $x \in X$, and

    \[ e.x = x \]

    for all $x \in X$, where $e \in G$ is the identity element.

    Having associated the decomposition of a function into odd and even parts with the action of the group $\mathbb{Z}/2\mathbb{Z}$ on $\mathbb{R}$, the notion of group action suggests many generalizations of the decomposition.

    We begin by discussing functions on $\mathbb{R}^2$. We could once again consider the action of $\mathbb{Z}/2\mathbb{Z}$ on $\mathbb{R}^2$ via scalar multiplication by $(-1)^n$ for $n \in \mathbb{Z}/2\mathbb{Z}$. Alternatively, we could treat the two components independently, and allow $(\mathbb{Z}/2\mathbb{Z})^2$ to act via the action

    \[ (n_1, n_2).(x_1, x_2) = \bigl((-1)^{n_1} x_1, (-1)^{n_2} x_2\bigr) \]

    for $n_1, n_2 \in \mathbb{Z}/2\mathbb{Z}$ and $x_1, x_2 \in \mathbb{R}$. Notice that the action of this group of order four leads to four different types of functions, namely those that are:

    • even with respect to both variables;
    • even with respect to $x_1$ and odd with respect to $x_2$;
    • odd with respect to $x_1$ and even with respect to $x_2$;
    • odd with respect to both variables.

    Once more every function can be decomposed into a sum of four components, one of each type (see Exercise 1.2).
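    The four components can be computed by averaging over the four sign changes, weighting each term by the sign pattern of the desired type. The Python sketch below is a numerical illustration under assumptions made here (the names `component`, `sample`, and `parts` are hypothetical, and parities are encoded as 0 for even and 1 for odd); it does not replace the argument asked for in Exercise 1.2.

```python
from itertools import product
from typing import Callable

Func2 = Callable[[float, float], float]

def component(f: Func2, p1: int, p2: int) -> Func2:
    """Component of f with parity p1 in x1 and p2 in x2 (0 = even, 1 = odd).

    Averages f over the four sign changes (s1, s2), weighting each term
    by s1**p1 * s2**p2.
    """
    def part(x1: float, x2: float) -> float:
        total = 0.0
        for s1, s2 in product((1, -1), repeat=2):
            total += (s1 ** p1) * (s2 ** p2) * f(s1 * x1, s2 * x2)
        return total / 4
    return part

def sample(x1: float, x2: float) -> float:
    """Illustrative test function, equal to 2 + x2**3 + 2*x1 + x1*x2**3."""
    return (1 + x1) * (2 + x2 ** 3)

# One component for each of the four symmetry types.
parts = {(p1, p2): component(sample, p1, p2) for p1 in (0, 1) for p2 in (0, 1)}
```

    For `sample` the four components are the four monomials in its expansion, and they sum back to `sample` at every point.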

    Exercise 1.2. (a) Show that every function $f : \mathbb{R}^2 \to \mathbb{R}$ can be decomposed into a sum of four functions with the symmetry properties listed above.
    (b) Consider the group action of $\mathbb{Z}/n\mathbb{Z}$ on $\mathbb{R}^2$ by letting $k + n\mathbb{Z}$ act by rotation by the angle $\frac{2\pi k}{n}$ (using (1.2) and the matrix $k(\frac{2\pi k}{n})$ as below). Generalize the above decompositions to this case.

    Footnote: Here we are using $n \in \mathbb{Z}$ as a shorthand for the coset $n + 2\mathbb{Z} \in \mathbb{Z}/2\mathbb{Z}$.


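    However, there are other natural actions on $\mathbb{R}^2$ (as e.g. in the above exercise). Let $\mathbb{T} = \mathbb{R}/\mathbb{Z}$ be the one-dimensional circle group or 1-torus, and define an action of $\mathbb{T}$ on $\mathbb{R}^2$ by the rotation

    \[ \mathbb{T} \times \mathbb{R}^2 \to \mathbb{R}^2, \qquad \varphi.\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = k(2\pi\varphi) \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \tag{1.2} \]

    where

    \[ k(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \]

    is the matrix of anti-clockwise rotation through the angle $\theta$ on $\mathbb{R}^2$ (so that $\varphi \in \mathbb{T}$ acts by rotation through the angle $2\pi\varphi$). In studying any situation with rotational symmetry on $\mathbb{R}^2$ one is naturally led to this action. What is the corresponding decomposition of functions for this action? How many different classes of functions will appear in the corresponding decomposition?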

    Exercise 1.3. Verify that (1.2) defines an action of $\mathbb{T}$ on $\mathbb{R}^2$.
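    Before proving Exercise 1.3 it can be reassuring to spot-check the action axioms numerically. The Python sketch below is only such a sanity check at a few sample points, not a proof; the function names `k` and `act` are chosen here to mirror the notation of (1.2), and the matrix is taken to be the standard anti-clockwise rotation matrix.

```python
import math
from typing import Tuple

Point = Tuple[float, float]

def k(theta: float):
    """Matrix of anti-clockwise rotation through the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))

def act(phi: float, v: Point) -> Point:
    """The action (1.2): phi.v = k(2*pi*phi) v for a representative phi of a coset in R/Z."""
    (a, b), (c, d) = k(2 * math.pi * phi)
    return (a * v[0] + b * v[1], c * v[0] + d * v[1])

def close(p: Point, q: Point, tol: float = 1e-12) -> bool:
    """Componentwise comparison up to rounding error."""
    return abs(p[0] - q[0]) < tol and abs(p[1] - q[1]) < tol
```

    With these helpers one can check, for instance, that `act(0.15, act(0.35, v))` agrees with `act(0.5, v)`, that `act(0.0, v)` returns `v`, and that representatives differing by an integer give the same result, which is what makes the action well defined on $\mathbb{T} = \mathbb{R}/\mathbb{Z}$.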

    Clearly one distinguished class of functions is given by the functions invariant under rotation, that is, functions satisfying

    \[ f\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = f\left( k(2\pi\varphi) \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \right) \]

    for all $\varphi \in \mathbb{T}$. The graph of such a function is the surface obtained by rotating a graph of a function $[0, \infty) \to \mathbb{R}$ or $[0, \infty) \to \mathbb{C}$ about the $z$-axis.

    To guess what the other classes of function should be, notice that all of the symmetries of functions considered above can be phrased naturally in terms of the possible continuous group homomorphisms of the acting group to the group

    \[ S^1 = \{ z \in \mathbb{C} \mid |z| = 1 \}. \]

    It is easy to show (see Exercise 1.4 for the third, non-trivial, statement) that

    (1) any homomorphism $\mathbb{Z}/2\mathbb{Z} \to S^1$ has the form $n \mapsto (\pm 1)^n$, so there are two such homomorphisms;

    (2) any homomorphism $(\mathbb{Z}/2\mathbb{Z})^2 \to S^1$ has the form $(n_1, n_2) \mapsto (\pm 1)^{n_1} (\pm 1)^{n_2}$, so there are four such homomorphisms; and

    Footnote: Once again we will sometimes write $t$ as shorthand for the coset $t + \mathbb{Z} \in \mathbb{R}/\mathbb{Z}$; in particular the interval $[0, 1)$ may be identified with $\mathbb{T}$ using addition modulo 1. Notice that this makes $\mathbb{T}$ into a topological group by declaring elements to be close if they (or rather their representatives) can be chosen to be close in $\mathbb{R}$.


    (3) any continuous homomorphism $\mathbb{T} \to S^1$ has the form

    \[ \chi_n(\varphi) = e^{2\pi i n \varphi}. \]

    For any topological group $G$, we call the continuous homomorphisms from $G$ to $S^1$ the characters of $G$. Notice that the characters in (1) and (2) correspond exactly to the even and odd functions in (1.1), respectively to the four types of functions for the action of $(\mathbb{Z}/2\mathbb{Z})^2$ on $\mathbb{R}^2$ considered above.

    Exercise 1.4. Show that any continuous homomorphism $\mathbb{T} \to S^1$ has the form $\chi_n(\varphi)$ for some $n \in \mathbb{Z}$.

    Turning to (3), we say that a function $f : \mathbb{R}^2 \to \mathbb{C}$ has weight $n$ (is of type $n$) if

    \[ f(k(2\pi\varphi) v) = \chi_n(\varphi) f(v) \]

    for all $\varphi \in \mathbb{T}$ and $v \in \mathbb{R}^2$.

    One might now guess, and we will see in Chapter 3 that this indeed is the case, that any reasonable function $f : \mathbb{R}^2 \to \mathbb{C}$ can be written as a linear combination

    \[ f = \sum_{n \in \mathbb{Z}} f_n \tag{1.3} \]

    where $f_n$ has weight $n$. However, in contrast to (1.1) this is an infinite sum, so we are no longer talking about a purely algebraic phenomenon. The decomposition (1.3), its existence and its properties, lies both in algebra and in analysis. We therefore have to become concerned both with the algebraic structure and with questions of convergence. Depending on the notion of convergence used, the class of reasonable functions turns out to vary. These classes of reasonable functions will provide us with important examples of Banach spaces to be defined in Chapter 2.

    The discussion above on decompositions into sums of functions of different weights will later be part of the treatment of Fourier analysis (see Chapter 3). For this we will initially study the mathematically simpler situation of the action of $\mathbb{T}$ on $\mathbb{T}$ by translation,

    \[ (x, y) \mapsto x + y \]

    for $x, y \in \mathbb{T}$.

    Footnote: Notice that we are now allowing functions to be complex-valued, and that we have simplified the notation for points in $\mathbb{R}^2$. This helps to clarify the underlying structure, and reflects one of the themes of functional analysis: thinking of progressively more complicated objects (numbers, then vectors, then functions, then operators) as points in a larger space allows the real structures to be seen more clearly.

    Adjusting the definitions above appropriately, we say that $f : \mathbb{T} \to \mathbb{C}$


    has weight $n \in \mathbb{Z}$ if and only if $f$ is a multiple of $\chi_n$ itself. We therefore seek, for a reasonable function $f : \mathbb{T} \to \mathbb{C}$, constants $c_n$ for $n \in \mathbb{Z}$ with

    \[ f = \sum_{n \in \mathbb{Z}} c_n \chi_n. \tag{1.4} \]

    The right-hand side of (1.4) is called the Fourier series of $f$. We will see later that it is relatively straightforward (at least in the abstract sense) to find the Fourier coefficients $c_n$ via the identity

    \[ c_n = \int_0^1 f(x) \overline{\chi_n(x)} \, dx \]

    for all $n \in \mathbb{Z}$.

    Exercise 1.5. Use de Moivre's formula $e^{2\pi i n \varphi} = \cos(2\pi n \varphi) + i \sin(2\pi n \varphi)$ to show that the discussion on Fourier series culminating in (1.4) has a purely real analog. Find formulas for the Fourier coefficients of the analogous decomposition into sums of sines and cosines (assuming the formula for the complex version).
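    The coefficient formula for $c_n$ can be tried out on a trigonometric polynomial, where the answer is known in advance. The Python sketch below approximates the integral by an equally spaced Riemann sum; the names `chi`, `fourier_coefficient`, and `sample`, and the number of sample points, are assumptions made for this illustration.

```python
import cmath
import math
from typing import Callable

def chi(n: int, x: float) -> complex:
    """The character chi_n(x) = exp(2*pi*i*n*x)."""
    return cmath.exp(2j * math.pi * n * x)

def fourier_coefficient(f: Callable[[float], complex], n: int, samples: int = 256) -> complex:
    """Riemann-sum approximation of the integral of f(x) * conj(chi_n(x)) over [0, 1)."""
    total = 0j
    for j in range(samples):
        x = j / samples
        total += f(x) * chi(n, x).conjugate()
    return total / samples

def sample(x: float) -> float:
    """Illustrative function 3 + cos(2*pi*x) + 2*sin(4*pi*x)."""
    return 3 + math.cos(2 * math.pi * x) + 2 * math.sin(4 * math.pi * x)
```

    Writing the cosine and sine in terms of $\chi_{\pm 1}$ and $\chi_{\pm 2}$ shows that the coefficients of `sample` should be $c_0 = 3$, $c_{\pm 1} = \frac{1}{2}$, $c_2 = -i$ and $c_{-2} = i$, with all others zero; the Riemann sum reproduces these up to rounding because the sum is exact for trigonometric polynomials of degree below the number of sample points.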

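    Similarly, we will show that for a reasonable function $f : \mathbb{R}^2 \to \mathbb{C}$ the function

    \[ f_n(v) = \int_0^1 f(k(2\pi\varphi) v) \, \overline{\chi_n(\varphi)} \, d\varphi \]

    for $n \in \mathbb{Z}$ has weight $n$, and that

    \[ f = \sum_{n \in \mathbb{Z}} f_n. \tag{1.5} \]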
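    The integral defining $f_n$ and the decomposition (1.5) can be illustrated on a polynomial, for which only finitely many weights occur. The Python sketch below is a numerical illustration under assumptions made here: the names `rotate`, `weight_component`, and `sample` are hypothetical, the integral is replaced by an equally spaced Riemann sum, and a point $(x, y)$ is identified with $z = x + iy$ so that $k(2\pi\varphi)$ acts as multiplication by $e^{2\pi i\varphi}$.

```python
import cmath
import math
from typing import Callable, Tuple

Point = Tuple[float, float]

def rotate(phi: float, v: Point) -> Point:
    """Apply k(2*pi*phi), the anti-clockwise rotation through 2*pi*phi, to v."""
    theta = 2 * math.pi * phi
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def weight_component(f: Callable[[Point], complex], n: int, v: Point, samples: int = 64) -> complex:
    """Riemann-sum approximation of f_n(v), the integral over phi in [0, 1)
    of f(k(2*pi*phi) v) * conj(chi_n(phi))."""
    total = 0j
    for j in range(samples):
        phi = j / samples
        total += f(rotate(phi, v)) * cmath.exp(-2j * math.pi * n * phi)
    return total / samples

def sample(v: Point) -> complex:
    """Illustrative function z**2 + 5*|z|**2 + Re(z) with z = x + i*y."""
    z = complex(v[0], v[1])
    return z ** 2 + 5 * abs(z) ** 2 + v[0]
```

    For `sample` the non-zero components are $f_2 = z^2$, $f_0 = 5|z|^2$, $f_1 = z/2$ and $f_{-1} = \bar{z}/2$, and summing `weight_component(sample, n, v)` over $n = -2, \dots, 2$ recovers `sample(v)` up to rounding.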

    Exercise 1.6. Show that if the function $f_n(v) = \int_0^1 f(k(2\pi\varphi) v) \, \overline{\chi_n(\varphi)} \, d\varphi$ is well-defined (that is, if the integral exists for all, or for almost every, $v \in \mathbb{R}^2$), then it has weight $n$.

    To summarize, we will introduce classes of functions (which will be examples of Banach spaces), and determine whether for functions in these classes the Fourier series (1.4) or the weight decomposition (1.5) converges, and in what sense the convergence does or does not happen.

    For functions $f : \mathbb{R}^3 \to \mathbb{C}$ one can generalize the discussion above in many different ways, by considering the actions of various different groups as follows.

    • $\mathbb{Z}/2\mathbb{Z}$, giving the familiar generalization of even and odd functions.
    • $(\mathbb{Z}/2\mathbb{Z})^3$, giving a decomposition into eight functions defined by their odd- or even-ness with respect to each of the three variables.
    • $\mathbb{T} = \mathrm{SO}(2)$ acting by rotations in the $x, y$-plane about the $z$-axis. This gives a simple generalization of our discussion of functions $\mathbb{R}^2 \to \mathbb{C}$, and we will be able to treat this case in a similar way to the two-dimensional case.
    • $\mathrm{SO}(3)$, the full group of orientation-preserving rotations of $\mathbb{R}^3$.


    The last case in this list is more difficult to analyze than any of the cases discussed above. The additional complications in this case are much deeper than might at first appear. For example, the group $\mathrm{SO}(3)$ is simple, and as a result there are no non-trivial continuous homomorphisms $\mathrm{SO}(3) \to S^1$, so this cannot be used to define classes of functions in the same way. In fact the case of $\mathrm{SO}(3)$ requires the theory of harmonic analysis and unitary representations of compact groups. We will not reach this important topic in these notes, but we lay the ground for it and refer to the excellent treatment in ??.

    Exercise 1.7. Prove that the group $\mathrm{SO}(3)$ of rotations of $\mathbb{R}^3$ forms a compact group in its natural topology when viewed as a set of $3 \times 3$ real matrices. Show that this group is simple, and deduce that a continuous homomorphism $\mathrm{SO}(3) \to S^1$ must be trivial.

    1.2 (Equi-)distribution of Points and Measures

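    A sequence $(x_n)_n$ of elements of a metric space $X$ is dense if for every $x \in X$ there is a subsequence $(x_{n_k})_k$ that converges to $x$. A much finer property is given by equidistribution, which we now define for $X = [0, 1]$.

    A sequence $(x_n)_{n \geq 1}$ of points in $[0, 1]$ is said to be equidistributed or uniformly distributed if any one of the following equivalent conditions is satisfied:

    (1) $\frac{1}{K} \bigl| \{ k \in [1, K] \mid x_k \in [a, b] \} \bigr| \to b - a$ as $K \to \infty$ for any $0 \leq a < b \leq 1$.

    (2) $\frac{1}{K} \sum_{k=1}^{K} f(x_k) \to \int_0^1 f(x) \, dx$ as $K \to \infty$ for any continuous function $f \in C([0, 1])$.

    (3) $\frac{1}{K} \sum_{k=1}^{K} f(x_k) \to \int_0^1 f(x) \, dx$ as $K \to \infty$ for any Riemann-integrable $f \in R([0, 1])$.

    (4) $\frac{1}{K} \sum_{k=1}^{K} \chi_n(x_k) \to \int_0^1 \chi_n(x) \, dx = \begin{cases} 0 & \text{if } n \neq 0; \\ 1 & \text{if } n = 0 \end{cases}$ as $K \to \infty$ for any $n \in \mathbb{Z}$ ($\chi_n$ is defined in Section 1.1).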
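    Conditions (1) and (4) can be explored numerically for a concrete sequence. The Python sketch below is an experiment, not a proof: the helper names are hypothetical, and the test sequence $x_k = \{k\sqrt{2}\}$ (fractional parts of multiples of an irrational number, the kind of sequence treated in Example 1.9) with $K = 20000$ terms is an assumption made here for illustration.

```python
import cmath
import math
from typing import Sequence

def interval_frequency(xs: Sequence[float], a: float, b: float) -> float:
    """Proportion of the points of xs lying in [a, b], as in condition (1)."""
    return sum(1 for x in xs if a <= x <= b) / len(xs)

def character_average(xs: Sequence[float], n: int) -> complex:
    """Average of chi_n(x) = exp(2*pi*i*n*x) over the points of xs, as in condition (4)."""
    return sum(cmath.exp(2j * math.pi * n * x) for x in xs) / len(xs)

# Illustrative sequence: fractional parts of k * sqrt(2) for k = 1, ..., K.
K = 20000
alpha = math.sqrt(2)
xs = [(k * alpha) % 1.0 for k in range(1, K + 1)]
```

    For this sequence `interval_frequency(xs, 0.2, 0.5)` is close to $0.3$, and `character_average(xs, n)` is close to $0$ for small non-zero $n$ and equal to $1$ for $n = 0$, in line with conditions (1) and (4).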

    We will now sketch some of the implications between these equivalent statements (see Exercise 1.8, Kuipers and Niederreiter [23] or [12, Sect. 4.4.1] for a detailed treatment). We will develop all of the theorems needed later in the text, and will return to the topic of equidistribution in Chapter 7 from a more general point of view.

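    Almost a proof of (4) $\Longrightarrow$ (2). Consider the algebra of trigonometric polynomials

    \[ \mathcal{A} = \left\{ \sum_{n=-N}^{N} c_n \chi_n \;\middle|\; c_n \in \mathbb{C}, \; N \in \mathbb{N} \right\}. \]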


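    Using the complex version of the Stone–Weierstrass theorem (Theorem 2.34), it may be seen that $\mathcal{A}$ is dense in $C(\mathbb{T})$ with respect to the uniform metric (see Proposition 3.19). Given $f \in C(\mathbb{T})$ and $\varepsilon > 0$, there is some $g \in \mathcal{A}$ with

    \[ \|f - g\|_\infty = \sup_{x \in \mathbb{T}} |f(x) - g(x)| < \varepsilon, \]

    which implies that $\left| \int f - \int g \right| < \varepsilon$ and

    \[ \left| \frac{1}{K} \sum_{k=1}^{K} f(x_k) - \frac{1}{K} \sum_{k=1}^{K} g(x_k) \right| < \varepsilon \]

    for any $K \geq 1$. If $K$ is sufficiently large then, by assumption,

    \[ \left| \frac{1}{K} \sum_{k=1}^{K} g(x_k) - \int g \right| < \varepsilon. \]

    It follows that

    \[ \left| \frac{1}{K} \sum_{k=1}^{K} f(x_k) - \int f \right| < 3\varepsilon, \]

    which is not quite the claim in (2) since $C(\mathbb{T})$ and $C([0, 1])$ differ slightly. Any function $f : \mathbb{T} \to \mathbb{C}$ gives rise to a function $\tilde{f} : \mathbb{R} \to \mathbb{C}$ via the diagram

    \[ \begin{array}{ccc} \mathbb{R} & \xrightarrow{\;\tilde{f}\;} & \mathbb{C} \\ \downarrow & \nearrow_{f} & \\ \mathbb{T} & & \end{array} \]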

    which we can restrict to $[0, 1]$, defining an element $g \in C([0, 1])$. If $f : \mathbb{T} \to \mathbb{C}$ is continuous then $g$ is also, but $g$ satisfies $g(0) = g(1)$. We will handle this issue below in the proof that (2) implies (1), where we will only assume (2) for all $f \in C(\mathbb{T})$.

    Footnote: We use $\int f$ as shorthand for $\int_0^1 f(x) \, dx$ for convenience.

    Proof of (2) $\Longrightarrow$ (1). Suppose first that $0 < a < b < 1$ and write $\mathbf{1}_{[a,b]}$ for the characteristic function of the interval $[a, b]$. Fix $\varepsilon > 0$ and choose continuous functions $f_-, f_+ : [0, 1] \to \mathbb{R}$ with

    (a) $0 \leq f_-(x) \leq \mathbf{1}_{[a,b]}(x) \leq f_+(x) \leq 1$ for all $x \in [0, 1]$,
    (b) $\int_0^1 (f_+ - f_-) < \varepsilon$, and
    (c) $f_+(0) = f_+(1) = f_-(0) = f_-(1) = 0$.

    For example, the functions $f_+$ and $f_-$ could be chosen to be piecewise linear, as illustrated in Figure 1.1. In this case the shaded region can easily be chosen to have total area bounded above by $\varepsilon$, as required in (b).

    Fig. 1.1. The function $\mathbf{1}_{[a,b]}$ and the approximations $f_-$ (dots) and $f_+$ (dashes).
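    One concrete piecewise linear choice can be written down explicitly. The Python sketch below is an assumption made for illustration (the names `trapezoid` and `make_approximations` and the particular ramp width `delta` are not from the notes): $f_+$ rises linearly from 0 to 1 on $[a - \delta, a]$ and falls on $[b, b + \delta]$, while $f_-$ does the same on $[a, a + \delta]$ and $[b - \delta, b]$, so that $\int (f_+ - f_-) = 2\delta$.

```python
from typing import Callable, Tuple

def trapezoid(x: float, lo_out: float, lo_in: float, hi_in: float, hi_out: float) -> float:
    """Equal to 0 outside (lo_out, hi_out), 1 on [lo_in, hi_in], and linear in between."""
    if x <= lo_out or x >= hi_out:
        return 0.0
    if x < lo_in:
        return (x - lo_out) / (lo_in - lo_out)
    if x > hi_in:
        return (hi_out - x) / (hi_out - hi_in)
    return 1.0

def make_approximations(a: float, b: float, eps: float) -> Tuple[Callable[[float], float], Callable[[float], float], float]:
    """Return (f_minus, f_plus, delta) satisfying (a), (b), (c) for 0 < a < b < 1."""
    # delta is small enough that the ramps stay inside [0, 1] and 2 * delta < eps.
    delta = min(eps / 3, a, 1 - b, (b - a) / 3)
    f_minus = lambda x: trapezoid(x, a, a + delta, b - delta, b)
    f_plus = lambda x: trapezoid(x, a - delta, a, b, b + delta)
    return f_minus, f_plus, delta
```

    With `a = 0.3`, `b = 0.7` and `eps = 0.05`, for example, both functions vanish at 0 and 1, they sandwich the characteristic function of $[0.3, 0.7]$, and the area between them is $2\delta < \varepsilon$.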

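    By (c), the functions $f_-$ and $f_+$ also define continuous functions on $\mathbb{T}$. Since

    \[ \begin{array}{ccccc} \dfrac{1}{K} \displaystyle\sum_{k=1}^{K} f_-(x_k) & \leq & \dfrac{1}{K} \displaystyle\sum_{k=1}^{K} \mathbf{1}_{[a,b]}(x_k) & \leq & \dfrac{1}{K} \displaystyle\sum_{k=1}^{K} f_+(x_k) \\ \downarrow & & & & \downarrow \\ (b - a) - \varepsilon \leq \displaystyle\int f_- & & & & \displaystyle\int f_+ \leq (b - a) + \varepsilon \end{array} \]

    as $K \to \infty$, we obtain

    \[ (b - a) - \varepsilon \leq \liminf_{K \to \infty} \frac{1}{K} \sum_{k=1}^{K} \mathbf{1}_{[a,b]}(x_k) \leq \limsup_{K \to \infty} \frac{1}{K} \sum_{k=1}^{K} \mathbf{1}_{[a,b]}(x_k) \leq (b - a) + \varepsilon, \]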

    which implies the claim in (1) for $0 < a < b < 1$. The formula in (1) holds trivially if $f \equiv 1$, so

    \[ \frac{1}{K} \sum_{k=1}^{K} \bigl( \mathbf{1}_{[0,a)}(x_k) + \mathbf{1}_{(b,1]}(x_k) \bigr) \to 1 - (b - a) \]

    as $K \to \infty$ by taking the difference. Suppose now that $a = 0 < b < 1$. Then, for any sufficiently small $\varepsilon > 0$, we have


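    \[ f_- = \mathbf{1}_{[\varepsilon, b]} \leq \mathbf{1}_{[0,b]} \leq \mathbf{1}_{[0, b+\varepsilon)} + \mathbf{1}_{(1-\varepsilon, 1]} = f_+ \quad \text{and} \quad \int (f_+ - f_-) \leq 3\varepsilon, \]

    and the formula in (1) already holds for $f_-$ and $f_+$. As before, this implies the claim for $\mathbf{1}_{[0,b]}$. The case of $0 < a < b = 1$ is similar.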

    As in many proofs in analysis approximation played a crucial role in theabove argument. In fact, two notions of approximation were used: uniformapproximation and approximation by functions that differ in integral verylittle. We will study these and related notions of approximation throughoutthese notes.

    Exercise 1.8. Prove the remaining implications to show that the four characterizations of equidistribution at the start of Section 1.2 are indeed equivalent.

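    Example 1.9. A simple example of an equidistributed sequence may be obtained as follows. Fix $\alpha\in\mathbb{R}\setminus\mathbb{Q}$ and define $x_k=\{k\alpha\}\in[0,1)$ for $k\in\mathbb{N}$, where $\{t\}$ denotes the fractional part of the real number $t$. To see that this defines an equidistributed sequence, the characterization in (4) is the most convenient to use. For $n=0$, the function $\chi_n$ is identically 1, so

    \[
    \frac1K\sum_{k=1}^{K}\chi_0(x_k)=1
    \]

    for all $K$. If $n\ne0$, then $e^{2\pi i n\alpha}\ne1$ and

    \[
    \frac1K\sum_{k=1}^{K}e^{2\pi i nk\alpha}
    =\frac1K\sum_{k=1}^{K}\bigl(e^{2\pi i n\alpha}\bigr)^{k}
    =\frac1K\,\frac{e^{2\pi i n\alpha(K+1)}-e^{2\pi i n\alpha}}{e^{2\pi i n\alpha}-1}\longrightarrow0
    \]

    as $K\to\infty$.

    An amusing consequence of this example is a special case of Benford's law.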

    Exercise 1.10. Use the equidistribution from Example 1.9 to show the following. Write $\ell_n$ for the leading digit of $2^n$ written in decimal (so the sequence $(\ell_n)$ begins $(2,4,8,1,3,6,1,2,5,\dots)$). Then

    \[
    \frac1K\bigl|\{k\mid1\le k\le K,\ \ell_k=1\}\bigr|\longrightarrow\log_{10}2
    \]

    as $K\to\infty$. Using Exercise 1.12 below, generalize this to a statement about powers of 2 and 3 with the same exponent.
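    The decay of the averages in Example 1.9 and the limiting frequency in Exercise 1.10 can be observed numerically. The following is only a sketch: the choice $\alpha=\sqrt2$, the function names, and the cut-offs are illustrative assumptions, and a finite computation of course proves nothing about the limit.

```python
import cmath
import math


def weyl_average(alpha: float, n: int, K: int) -> complex:
    """Return (1/K) * sum_{k=1}^{K} exp(2*pi*i*n*k*alpha)."""
    total = sum(cmath.exp(2j * math.pi * n * k * alpha) for k in range(1, K + 1))
    return total / K


def leading_digit_one_frequency(K: int) -> float:
    """Fraction of exponents k in 1..K for which 2**k has leading decimal digit 1."""
    hits = sum(1 for k in range(1, K + 1) if str(2 ** k)[0] == "1")
    return hits / K


if __name__ == "__main__":
    alpha = math.sqrt(2)  # illustrative irrational number (an assumption)
    for K in (10, 100, 1000, 10000):
        # The modulus of the average should shrink roughly like 1/K.
        print(K, abs(weyl_average(alpha, 1, K)))
    # Empirical frequency of leading digit 1 next to log10(2).
    print(leading_digit_one_frequency(2000), math.log10(2))
```

    The geometric-sum formula in Example 1.9 bounds the modulus of the average by $1/(K\,|\sin(\pi n\alpha)|)$, which is the $1/K$ decay seen in the output.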


    Clearly there is some notion of convergence of measures to the Lebesgue measure in the discussion above. In order to formulate this precisely, we will need to define an appropriate topology on a space of measures. This topology will be called the weak*-topology (see Chapter 7), and, as we will show, the space of probability measures on a compact metric space is itself a compact metric space in this topology. This result helps to provide a coherent setting for many equidistribution results.

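    Exercise 1.11. Assume that $\alpha_1,\dots,\alpha_d\in\mathbb{R}$ are linearly independent over $\mathbb{Q}$. Show that

    \[
    \frac1T\int_0^T f\bigl(t(\alpha_1,\dots,\alpha_d)\ (\mathrm{mod}\ \mathbb{Z}^d)\bigr)\,dt\longrightarrow\int_{\mathbb{T}^d}f(x)\,dx
    \]

    as $T\to\infty$, for any $f\in C(\mathbb{T}^d)$.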

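    Exercise 1.12. Assume that $1,\alpha_1,\dots,\alpha_d\in\mathbb{R}$ are linearly independent over $\mathbb{Q}$. Show that

    \[
    \frac1N\sum_{n=0}^{N-1}f\bigl(n(\alpha_1,\dots,\alpha_d)\ (\mathrm{mod}\ \mathbb{Z}^d)\bigr)\longrightarrow\int_{\mathbb{T}^d}f(x)\,dx
    \]

    as $N\to\infty$, for any $f\in C(\mathbb{T}^d)$.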

    1.3 Ordinary Differential Equations

    There is no need to motivate the study of differential equations, as they are of central importance across all sciences that deal with measurable quantities that change with respect to other variables of the system studied. Here we want to briefly indicate how even the simplest differential equations can lead directly to the study of integral operators, which may be analyzed using tools from functional analysis.

    (Footnote: A reader familiar with the theorem of Picard and Lindelöf on the existence and uniqueness of solutions to certain initial value problems and its proof will not be surprised by this connection. However there are further connections, which we will begin to expose here.)

    1.3.1 A Second-Order Linear Initial Value Problem

    Consider first the differential equation

    \[ f''(x)+f(x)=g(x) \tag{1.6} \]

    with the initial values

    \[ f(0)=1,\qquad f'(0)=0. \]

    Let us recall briefly the familiar approach to solving such an equation. First one finds all solutions to the homogeneous equation

    \[ f''(x)+f(x)=0, \]

    giving

    \[ f(x)=A\sin x+B\cos x \tag{1.7} \]

    for constants $A$ and $B$. Then one moves on to the problem of finding one particular solution $f_p$ to the equation

    \[ f_p''(x)+f_p(x)=g(x), \tag{1.8} \]

    ignoring the initial values, which may be done by a sophisticated guess if $g$ is sufficiently simple, or by using the method of variation of parameters (that is, treating $A$ and $B$ as functions of $x$ rather than constants). Finally, taking the sum of $f$ from (1.7) and a solution to (1.8), one chooses the constants $A$ and $B$ in the solution to the homogeneous equation to satisfy the initial values.

    Rather than going through this in detail, we claim that the function

    \[ f(x)=\cos(x)+\int_0^x\sin(x-t)\,g(t)\,dt \]

    is a solution to the initial value problem. This is easily checked by a calculation: $f(0)=1$ clearly, and

    \[ f'(x)=-\sin x+\sin(x-x)\,g(x)+\int_0^x\cos(x-t)\,g(t)\,dt, \]

    so $f'(0)=0$. Finally,

    \[ f''(x)=-\cos x+\cos(x-x)\,g(x)-\int_0^x\sin(x-t)\,g(t)\,dt=-f(x)+g(x) \]

    as required.
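    As a sanity check of the solution formula, one can compare it with a closed form in a simple case. The sketch below assumes the illustrative right-hand side $g(t)=t$, for which the integral evaluates to $x-\sin x$ and hence $f(x)=\cos x+x-\sin x$; the helper names and the use of Simpson's rule are choices made here, not part of the notes.

```python
import math


def simpson(func, a: float, b: float, n: int = 200) -> float:
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


def solution(g, x: float) -> float:
    """Evaluate f(x) = cos(x) + integral_0^x sin(x - t) g(t) dt numerically."""
    return math.cos(x) + simpson(lambda t: math.sin(x - t) * g(t), 0.0, x)


def g(t: float) -> float:
    """Illustrative right-hand side g(t) = t (an assumption for this example)."""
    return t


def exact(x: float) -> float:
    """Closed form of the solution for g(t) = t."""
    return math.cos(x) + x - math.sin(x)


if __name__ == "__main__":
    for x in (0.0, 0.5, 1.0, 2.0):
        print(x, solution(g, x), exact(x))
```

    For this $g$ one checks by hand that $f''+f=(-\cos x+\sin x)+(\cos x+x-\sin x)=x$, with $f(0)=1$ and $f'(0)=0$.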

    1.3.2 The Volterra Equation

    If the original differential equation in (1.6) is changed slightly, to take the form

    \[ f''(x)+f(x)=\sigma(x)f(x), \tag{1.9} \]

    with the same initial values $f(0)=1$ and $f'(0)=0$, then the argument used above does not solve the equation. Nonetheless, the ideas are still useful, since they suggest transforming the equation into the integral equation

    \[ f(x)=\cos(x)+\int_0^x\sin(x-t)\,\underbrace{\sigma(t)f(t)}_{\text{``}g(t)\text{''}}\,dt. \tag{1.10} \]

    Now define $k(x,t)=\sin(x-t)\,\sigma(t)$ so that (1.10) takes the form


    \[ f=u+K(f), \tag{1.11} \]

    where $u(x)=\cos x$ and

    \[ K(f)(x)=\int_0^x k(x,t)f(t)\,dt. \]

    Here $K$ is a linear map, defined on some space of nice functions. We will therefore call $K$ an operator, and due to its nature it is an integral operator.

    Solving the perturbed equation (1.9) with initial values will turn out to be very straightforward at the level of abstraction we aim at in functional analysis. We can rewrite the equation (1.11) as a Volterra equation

    \[ (I-K)f=u, \]

    where $I$ is the identity map. The solution $f$ is then given by applying the inverse operator $(I-K)^{-1}$, which we may calculate (in this particular case) using an operator form of the geometric series:

    \[ (I-K)^{-1}=\sum_{n=0}^{\infty}K^n, \]

    and hence

    \[ f=\sum_{n=0}^{\infty}K^nu. \]

    Clearly we will have to study convergence of these infinite series of powers of operators, and also make precise the classes of functions on which these arguments make sense (see Section 2.4.1).
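    The geometric series can be tried out on a grid. The sketch below is an illustration under stated assumptions: the integral is replaced by the trapezoid rule, the coefficient function is taken to be constant equal to 1 (so the equation reads $f''=0$ with $f(0)=1$, $f'(0)=0$, whose solution is $f\equiv1$), and all function names are hypothetical.

```python
import math


def volterra_apply(kernel, f_vals, xs):
    """Approximate (Kf)(x_i) = integral_0^{x_i} kernel(x_i, t) f(t) dt (trapezoid rule)."""
    h = xs[1] - xs[0]
    out = [0.0]  # the integral over [0, 0] vanishes
    for i in range(1, len(xs)):
        vals = [kernel(xs[i], xs[j]) * f_vals[j] for j in range(i + 1)]
        out.append(h * (sum(vals) - 0.5 * (vals[0] + vals[-1])))
    return out


def neumann_solve(kernel, u_vals, xs, terms=20):
    """Return the partial sum u + Ku + ... + K^(terms-1) u on the grid xs."""
    term = list(u_vals)
    total = list(u_vals)
    for _ in range(terms - 1):
        term = volterra_apply(kernel, term, xs)
        total = [s + t for s, t in zip(total, term)]
    return total


def demo():
    """Solve f = cos + K f with k(x, t) = sin(x - t) on [0, 2]; the exact solution is 1."""
    xs = [2.0 * i / 200 for i in range(201)]
    u_vals = [math.cos(x) for x in xs]
    return neumann_solve(lambda x, t: math.sin(x - t), u_vals, xs)


if __name__ == "__main__":
    f_vals = demo()
    print(max(abs(v - 1.0) for v in f_vals))  # small discretization error
```

    On the grid the operator becomes a strictly lower triangular matrix, so the series terminates after finitely many terms; this is a discrete shadow of the rapid convergence one expects for Volterra operators.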

    1.3.3 The Sturm–Liouville Equation

    Finally, we make another small change to the differential equation (1.6). Fix a parameter $\lambda>0$ and consider the Sturm–Liouville equation

    \[ f''+\lambda^2f=g, \tag{1.12} \]

    with the boundary conditions

    \[ f(0)=f(1)=0. \]

    We may proceed just as before. The functions of the form

    \[ f(x)=A\cos\lambda x+B\sin\lambda x \]

    give all solutions to the homogeneous differential equation $f''+\lambda^2f=0$. Next one needs to find a particular solution $f_p$ to


    \[ f_p''+\lambda^2f_p=g \]

    (ignoring the boundary conditions). After this, one would use the solutions to the homogeneous differential equation to satisfy the boundary conditions. Explicitly, given $f_p$ we can calculate the vector

    \[ \begin{pmatrix}f_p(0)\\ f_p(1)\end{pmatrix} \tag{1.13} \]

    and try to express it as a linear combination of the two vectors

    \[
    \begin{pmatrix}\cos0\\ \cos\lambda\cdot1\end{pmatrix}=\begin{pmatrix}1\\ \cos\lambda\end{pmatrix}
    \quad\text{and}\quad
    \begin{pmatrix}\sin0\\ \sin\lambda\cdot1\end{pmatrix}=\begin{pmatrix}0\\ \sin\lambda\end{pmatrix}.
    \]

    If

    \[ \det\begin{pmatrix}1&0\\ \cos\lambda&\sin\lambda\end{pmatrix}=\sin\lambda \]

    is non-zero, then this is always possible and we find a unique solution to the boundary value problem. However, if $\lambda\in\pi\mathbb{Z}$ then $\sin\lambda=0$ and we may be unlucky with the value of the vector (1.13): if $\begin{pmatrix}f_p(0)\\ f_p(1)\end{pmatrix}$ and $\begin{pmatrix}1\\ \cos\lambda\end{pmatrix}$ are linearly independent, then there will not be a solution to the boundary value problem.

    This obstruction to being able to find a solution to the boundary value problem may be phrased in terms of an integral operator. At first sight this connection (and this example) may appear contrived, but in fact it opens a door to the important topic of the spectral theory of operators, which is crucial for many other problems.

    Define the continuous function (the Green function) on $[0,1]^2$ by

    \[
    G(s,t)=\begin{cases}s(t-1)&\text{for }0\le s\le t\le1;\\ t(s-1)&\text{for }0\le t\le s\le1.\end{cases}
    \]

    We claim that the conditions

    \[ \left.\begin{aligned}f(0)=f(1)&=0\\ f''&=h\end{aligned}\right\} \tag{1.14} \]

    are equivalent to $f=Kh$, where $K$ is the operator defined by

    \[ K(h)(s)=\int_0^1G(s,t)h(t)\,dt. \tag{1.15} \]
    In order to justify the claim, assume first that f = Kh. Then


    \[ f(0)=\int_0^1\underbrace{G(0,t)}_{=0}\,h(t)\,dt=0, \]

    and $f(1)=0$ for the same reason. Moreover,

    \[ f(s)=\int_0^s t(s-1)h(t)\,dt+\int_s^1 s(t-1)h(t)\,dt, \]

    \[ f'(s)=s(s-1)h(s)+\int_0^s t\,h(t)\,dt-s(s-1)h(s)+\int_s^1(t-1)h(t)\,dt, \]

    where the two terms $\pm s(s-1)h(s)$ cancel, and

    \[ f''(s)=s\,h(s)-(s-1)h(s)=h(s), \]

    so $f$ is a solution of the boundary value problem (1.14).

    To see the converse, notice that the boundary value problem has a solution (by the argument above). However, our previous discussion of the boundary value problem associated to the Sturm–Liouville equation (1.12) (which needs to be modified for the case $\lambda=0$) shows that in this case the solution is unique. Thus the equivalence of (1.14) and $f=Kh$ is established.

    Exercise 1.13. Modify the argument for the Sturm–Liouville equation for the case $\lambda=0$, and show that the solution is always unique.

    In particular, the fact that $s_n(x)=\sin(\pi nx)$ for any $n\in\mathbb{Z}$ satisfies

    \[ s_n(0)=s_n(1)=0,\qquad s_n''=-(\pi n)^2s_n \]

    implies that

    \[ s_n=-(\pi n)^2K(s_n). \]

    In other words, the values

    \[ \lambda_n=-(\pi n)^{-2} \]

    for $n=1,2,\dots$ are eigenvalues of the linear map (or integral operator) $K$. Thus we can rephrase our earlier observation regarding the equivalent formulations

    \[
    \left.\begin{aligned}f''+\lambda^2f&=g\\ f(0)=f(1)&=0\end{aligned}\right\}
    \iff f=K(-\lambda^2f+g)\iff(I+\lambda^2K)f=K(g)
    \]

    by saying that this differential equation always has a unique solution for any $g$ unless $\lambda=\pi n$ corresponds to one of the eigenvalues $\lambda_n=-(\pi n)^{-2}=-\lambda^{-2}$ of $K$.

    (Footnote: Actually these are all the eigenvalues of $K$; see Exercise 1.14.)
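    The eigenvalue relation $K(s_n)=-(\pi n)^{-2}s_n$ for the Green operator (1.15) can be checked numerically. The sketch below uses a midpoint rule; the function names, grid size, and sample points are illustrative assumptions.

```python
import math


def green(s: float, t: float) -> float:
    """Green function G(s, t) on [0, 1]^2 for f'' = h with f(0) = f(1) = 0."""
    return s * (t - 1.0) if s <= t else t * (s - 1.0)


def apply_K(h, s: float, n: int = 2000) -> float:
    """Approximate (Kh)(s) = integral_0^1 G(s, t) h(t) dt by the midpoint rule."""
    step = 1.0 / n
    return step * sum(
        green(s, (j + 0.5) * step) * h((j + 0.5) * step) for j in range(n)
    )


def eigen_defect(n: int, s: float) -> float:
    """Return |K(s_n)(s) + (pi n)^(-2) s_n(s)| for s_n(x) = sin(pi n x)."""
    s_n = lambda x: math.sin(math.pi * n * x)
    return abs(apply_K(s_n, s) + s_n(s) / (math.pi * n) ** 2)


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, max(eigen_defect(n, s) for s in (0.1, 0.37, 0.5, 0.9)))
```

    As a further check by hand, $h\equiv1$ gives $K(h)(s)=\tfrac12s(s-1)$, which indeed vanishes at $0$ and $1$ and has second derivative $1$.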


    This discussion gives some hope that the notion of eigenvalues and eigenvectors (which might themselves be functions) of operators may make sense, and can be useful in the study of ordinary differential equations. (In fact, these questions also turn out to be useful for the study of partial differential equations; see the discussion of the next section.) However, as we will see later, some care must be taken because the operators arising act on infinite-dimensional spaces: eigenvectors may not exist, and the spectral theory of operators will be found to contain many new possibilities and phenomena involving generalized eigenvalues and eigenvectors when compared with the familiar theory of eigenvectors and eigenvalues of matrices (that is, operators on finite-dimensional vector spaces). We will start this topic in Chapter 4.

    Exercise 1.14. Suppose that $f\in L^1([0,1])$ and that $Kf=\lambda f$ for some $\lambda$ in $\mathbb{R}\setminus\{0\}$, where $K$ is the operator (1.15) discussed in connection with the Sturm–Liouville problem. Show that $f$ must be smooth on $(0,1)$, and deduce that $f$ and $\lambda$ must satisfy the conditions found above.

    (Footnote: This is the first case of the phenomenon called elliptic regularity, which we will return to in Chapter 3.)

    Exercise 1.15. In this exercise we generalize the connection between the Sturm–Liouville boundary value problem and integral operators. Let $a<b$ be real numbers, and assume that $p\in C^1([a,b])$ and $q\in C([a,b])$ are real-valued functions with $p>0$ and $q>0$. We define the second order differential operator

    \[ L(f)=(pf')'+qf. \]

    Also let $\alpha_1,\alpha_2,\beta_1,\beta_2\in\mathbb{R}$ and define the boundary conditions

    \[ B_1(f)=\alpha_1f(a)+\alpha_2f'(a)=0, \]

    \[ B_2(f)=\beta_1f(b)+\beta_2f'(b)=0. \]

    Assume that $f_1$ and $f_2$ are fundamental solutions of the differential equation $L(f)=0$ such that we also have

    \[ B_1(f_1)=B_2(f_2)=0. \]

    (Footnote: That is, the functions $f_1,f_2$ form a basis of the vector space of all solutions.)

    Show that

    \[ p\,(f_1f_2'-f_1'f_2)=c \]

    is a constant. Using this, define an associated Green function

    \[
    G(s,t)=\begin{cases}\frac1c f_1(s)f_2(t)&\text{for }a\le s\le t\le b,\\[2pt] \frac1c f_1(t)f_2(s)&\text{for }a\le t\le s\le b,\end{cases}
    \]

    and show that for $h\in C([a,b])$ the boundary-value problem

    \[ B_1(f)=B_2(f)=0,\qquad L(f)=h \]

    is equivalent to the equation

    \[ f(s)=K(h)(s)=\int_a^bG(s,t)h(t)\,dt. \]

    Calculate $G$ explicitly for the equation given by $L(f)=f''$, $B_1(f)=f(a)$ and $B_2(f)=f'(b)$.

    1.4 Partial Differential Equations and the Laplace Operator

    We would like to discuss two particular partial differential equations. As we will see later, the mathematical background needed for this, most of which comes from functional analysis, is much more interesting than that needed for ordinary differential equations. (Footnote: Here "interesting" is a synonym for "difficult".) One of the objectives for this book is to make the informal discussion in this section more formal and rigorous. We will start this in Chapter 3 and Chapter 4.

    In both of the partial differential equations that we will discuss, we will need to understand and express the difference between the value of a function at a point and its values in a neighborhood of the point. One might try to do this using an average over some nearby values, but we would like to have an infinitesimal version of this difference. This desire brings the Laplace operator

    \[ \Delta f=\frac{\partial^2f}{\partial x_1^2}+\cdots+\frac{\partial^2f}{\partial x_d^2} \]

    for a smooth function $f\colon\mathbb{R}^d\to\mathbb{R}$ into the picture because of the following simple observation. (Footnote: The same operator is sometimes written as $\nabla^2$.)

    Proposition 1.16 (Laplace and neighborhood averages). Let $\Omega\subseteq\mathbb{R}^d$ be an open set, and suppose that $f\colon\Omega\to\mathbb{R}$ is a $C^2$ function. Then

    \[ \lim_{r\to0}\frac{1}{r^2\operatorname{vol}(B_r(x))}\int_{B_r(x)}\bigl(f(y)-f(x)\bigr)\,dy=c\,\Delta f(x) \]

    for any $x\in\Omega$, where $c=\frac{1}{2(d+2)}$.

    Proof. Suppose for simplicity of notation that $x=0$, and apply Taylor approximation to obtain

    \[ f(y)=f(0)+f'(0)\,y+\frac12\sum_{i,j=1}^{d}\frac{\partial^2f}{\partial x_i\partial x_j}(0)\,y_iy_j+o\bigl(\|y\|^2\bigr), \]

    where $f'(0)$ is the total derivative of $f$ at $0$ and we used the notation $o(\cdot)$ from p. 389. Now in the integral over the $r$-ball

    \[ B_r(0)=\{y\in\mathbb{R}^d\mid\|y\|<r\} \]

    the linear terms (and the mixed quadratic terms) cancel out due to the symmetry of the ball. Thus

    \[ \int_{B_r(0)}f(y)\,dy=\operatorname{vol}(B_r(0))\,f(0)+\frac12\sum_{i=1}^{d}\frac{\partial^2f}{\partial x_i^2}(0)\int_{B_r(0)}y_i^2\,dy+\operatorname{vol}(B_r(0))\,o\bigl(r^2\bigr). \tag{1.16} \]

    Next notice that

    \[ \int_{B_r(0)}y_i^2\,dy=\int_{B_r(0)}y_j^2\,dy \]

    for all $1\le i,j\le d$ and

    \[ \int_{B_r(0)}\|y\|^2\,dy=\int_{B_1(0)}r^2\|z\|^2\,r^d\,dz \]

    using the substitution $y=rz$. It follows that

    \[ \int_{B_r(0)}y_i^2\,dy=\frac1d\sum_{j=1}^{d}\int_{B_r(0)}y_j^2\,dy=\frac1d\int_{B_r(0)}\|y\|^2\,dy=\frac{r^{d+2}}{d}\underbrace{\int_{B_1(0)}\|z\|^2\,dz}_{=:C}. \]

    Combining this with (1.16) gives

    \[
    \frac{1}{r^2}\,\frac{1}{\operatorname{vol}(B_r(0))}\int_{B_r(0)}\bigl(f(y)-f(0)\bigr)\,dy
    =\frac{1}{r^2\operatorname{vol}(B_r(0))}\,\frac12\,\Delta f(0)\,\frac{r^{d+2}}{d}\,C+o(1)
    =\underbrace{\frac{C}{2d\operatorname{vol}(B_1(0))}}_{=c}\,\Delta f(0)+o(1).
    \]

    For completeness, we calculate the value of $c$ using $d$-dimensional spherical coordinates. Every point $z\in\mathbb{R}^d$ is of the form $z=rv$ for some $r\ge0$ and

    \[ v\in S^{d-1}=\{w\in\mathbb{R}^d\mid\|w\|=1\}. \]

    Then using this substitution we have

    \[ \operatorname{vol}(B_1(0))=\int_{S^{d-1}}\int_0^1r^{d-1}\,dr\,dv=\frac1d\operatorname{vol}(S^{d-1}), \]

    where the integration with respect to $v$ uses $(d-1)$-dimensional volume measure on the sphere $S^{d-1}$, and so

    \[ C=\int_{B_1(0)}\|z\|^2\,dz=\int_{S^{d-1}}\int_0^1r^{d+1}\,dr\,dv=\frac{1}{d+2}\operatorname{vol}(S^{d-1}). \]

    Thus

    \[ c=\frac{C}{2d\operatorname{vol}(B_1(0))}=\frac{\frac{1}{d+2}\operatorname{vol}(S^{d-1})}{2d\,\frac1d\operatorname{vol}(S^{d-1})}=\frac{1}{2(d+2)}. \]
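    As a quick consistency check of the constant $c=\frac{1}{2(d+2)}$, here are two cases that can be computed by hand. The test functions are illustrative choices made here; for a quadratic function the $o(1)$ term vanishes, so the identity holds for every $r>0$.

```latex
% Case d = 1, f(y) = y^2 at an arbitrary point x, where vol(B_r(x)) = 2r:
\[
\frac{1}{r^{2}\cdot 2r}\int_{x-r}^{x+r}\bigl(y^{2}-x^{2}\bigr)\,dy
=\frac{1}{2r^{3}}\Bigl(\frac{(x+r)^{3}-(x-r)^{3}}{3}-2rx^{2}\Bigr)
=\frac{1}{2r^{3}}\cdot\frac{2r^{3}}{3}
=\frac13
=\frac{1}{2(1+2)}\,\Delta f(x),
\qquad \Delta f = f'' = 2 .
\]
% Case d = 2, f(y) = y_1^2 + y_2^2 at x = 0, in polar coordinates:
\[
\frac{1}{r^{2}\cdot\pi r^{2}}\int_{B_r(0)}\|y\|^{2}\,dy
=\frac{1}{\pi r^{4}}\int_{0}^{2\pi}\!\!\int_{0}^{r}\rho^{3}\,d\rho\,d\theta
=\frac{1}{\pi r^{4}}\cdot\frac{2\pi r^{4}}{4}
=\frac12
=\frac{1}{2(2+2)}\,\Delta f(0),
\qquad \Delta f = 4 .
\]
```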

    1.4.1 The Heat Equation

    The heat equation describes how the temperature in a region $\Omega\subseteq\mathbb{R}^d$ (representing a physical medium) evolves given an initial temperature distribution and some prescribed behavior of the heat at the boundary $\partial\Omega$. Inside the medium we expect the flow of heat to be controlled by the difference between the temperature at each point and the temperature in a neighborhood of the point. If we write $u(x,t)$ for the temperature of the medium at the point $x$ at the time $t$, then this suggests a relationship

    \[ \frac{\partial u}{\partial t}=\underbrace{\text{constant}}_{>0}\;\Delta_xu, \tag{1.17} \]

    where

    \[ \Delta_xu=\Delta u=\frac{\partial^2u}{\partial x_1^2}+\cdots+\frac{\partial^2u}{\partial x_d^2} \]

    is the Laplace operator with respect to the space variables $x_1,\dots,x_d$ only. We call (1.17) the heat equation. If we take the physical interpretation of this equation for granted, then we can use it to give heuristic explanations of some of the mathematical phenomena that arise.

    Suppose first that we prescribe a time-independent temperature distribution at the boundary $\partial\Omega$ of the medium $\Omega$, and then wait until the system has settled into thermal equilibrium. Experience (that is, physical intuition) suggests that in the long run (as time goes to infinity) the temperature distribution inside $\Omega$ will reach a stable (time-independent) configuration. That is, for any prescribed boundary value $b\colon\partial\Omega\to\mathbb{R}$ we expect the heat equation on $\Omega$ to have a time-independent solution. More formally, we expect there to be a function $u\colon\Omega\to\mathbb{R}$ with

    \[ \left.\begin{aligned}\Delta u&=0\\ u|_{\partial\Omega}&=b.\end{aligned}\right\} \tag{1.18} \]


    The boundary value problem (1.18) is the Dirichlet boundary value problem. Proving what the physical intuition suggests, namely that the Dirichlet boundary value problem does indeed have a (smooth) solution, will take us into the theory of Sobolev spaces. The case $d=2$ will be proved in Chapter 3.

    Leaving the Dirichlet problem to one side for now, we continue with the heat equation. Motivated by the experience of ordinary differential equations, we would like to know how we can find other solutions to the partial differential equation (ignoring the boundary values for now). A simple kind of solution to seek would be those with separated variables, that is, solutions of the form

    \[ u(x,t)=F(x)G(t) \]

    with $x\in\mathbb{R}^d$ and $t\in\mathbb{R}$. The heat equation would then imply that

    \[ F(x)G'(t)=\frac{\partial u}{\partial t}=c\,\bigl(\Delta F(x)\bigr)\,G(t) \]

    and so (we may as well choose all physical constants to make $c=1$) the quotient

    \[ \frac{G'(t)}{G(t)}=\frac{\Delta F(x)}{F(x)} \]

    is independent of $x$ and of $t$, and therefore is a constant (as this is not really a proof, we will not worry about the division). In summary, $u(x,t)=F(x)G(t)$ solves the equation

    \[ \frac{\partial u}{\partial t}=\Delta_xu \]

    if

    \[ G(t)=e^{\lambda t}\quad\text{and}\quad\Delta F=\lambda F \]

    for some constant $\lambda$, which one can quickly check (rigorously). Ignoring for the moment the values of $F$ on the boundary $\partial\Omega$, it is easy to find functions with $\Delta F=\lambda F$ for any $\lambda\in\mathbb{R}$ by using suitable exponential and trigonometric functions. However, these turn out not to be particularly useful for the general approach. Only those special functions $F\colon\Omega\to\mathbb{R}$ with

    \[ \Delta F=\lambda F\ \text{inside }\Omega,\qquad F|_{\partial\Omega}=0 \]

    turn out to be useful in the general case. However, it is not clear that such functions exist, nor for which values of $\lambda$ they may exist.

    Suppose now that the following non-trivial result, the existence of a basis of eigenfunctions (which we will be able to prove in many special cases in Chapter 4), is known for the region $\Omega\subseteq\mathbb{R}^d$.


    Claim. Every sufficiently nice function $f\colon\Omega\to\mathbb{R}$ can be decomposed into a sum $f=\sum_nF_n$ of functions $F_n\colon\Omega\to\mathbb{R}$ satisfying

    \[ \Delta F_n=\lambda_nF_n\ \text{for some }\lambda_n<0,\qquad F_n|_{\partial\Omega}=0. \]

    We may then solve the partial differential equation

    \[ \frac{\partial u}{\partial t}=\Delta_xu \]

    with boundary values

    \[ u|_{\partial\Omega\times\{t\}}=0\ \text{for all }t,\qquad u|_{\Omega\times\{0\}}=f \]

    using the principle of superposition to obtain the general solution

    \[ u(x,t)=\sum_nF_n(x)\,e^{\lambda_nt}. \tag{1.19} \]

    Since $\lambda_n<0$ for each $n\ge1$, the series (1.19) converges to 0 as $t\to\infty$ if it is absolutely convergent, in accordance with our physical intuition, since the boundary condition has temperature 0 for all $t>0$. We conclude by mentioning that the claim above will follow from the study of the spectral theory of an operator (much like the discussion in Section 1.3.3), but the operator involved will not have a concrete definition as an integral operator.
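    In the simplest case $\Omega=(0,1)$ the superposition formula (1.19) can be evaluated directly, using the eigenfunctions $\sin(\pi nx)$ with $\lambda_n=-(\pi n)^2$ met in Section 1.3.3. The following sketch assumes a finite list of coefficients $b_n$ (so $F_n=b_n\sin(\pi nx)$); the names, the coefficients, and the finite-difference check are illustrative choices.

```python
import math


def heat_solution(coeffs, x: float, t: float) -> float:
    """u(x, t) = sum_n b_n sin(n pi x) exp(-(n pi)^2 t), with coeffs[n - 1] = b_n."""
    return sum(
        b * math.sin(n * math.pi * x) * math.exp(-((n * math.pi) ** 2) * t)
        for n, b in enumerate(coeffs, start=1)
    )


def heat_residual(coeffs, x: float, t: float, dx: float = 1e-4, dt: float = 1e-5) -> float:
    """Finite-difference estimate of u_t - u_xx, which should be close to zero."""
    u = lambda xx, tt: heat_solution(coeffs, xx, tt)
    u_t = (u(x, t + dt) - u(x, t - dt)) / (2 * dt)
    u_xx = (u(x + dx, t) - 2 * u(x, t) + u(x - dx, t)) / dx ** 2
    return u_t - u_xx


if __name__ == "__main__":
    coeffs = [1.0, 0.5, 0.25]  # illustrative coefficients b_1, b_2, b_3
    print(heat_residual(coeffs, 0.3, 0.05))   # close to 0
    print(heat_solution(coeffs, 0.0, 0.05))   # boundary value 0
    print(heat_solution(coeffs, 0.3, 5.0))    # decays to 0 for large t
```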

    1.4.2 The Wave Equation

    The wave equation describes how an elastic membrane moves. We let $u(x,t)$ be the vertical position of the membrane at time $t$ above the point with coordinate $x$. As the membrane has mass (and hence inertia) our assumption is that the vertical acceleration (a second derivative of position with respect to time $t$) of the membrane at time $t$ above $x$ will be related to the difference between the position of the membrane at that point and at nearby points. Hence we call

    \[ \frac{\partial^2u}{\partial t^2}=c\,\Delta_xu \tag{1.20} \]

    the wave equation. As in the case of the heat equation, we may as well choose physical units to arrange that $c=1$.

    Once more we may argue from physical intuition that the Dirichlet boundary problem for the wave equation always has a solution. Consider a wire loop above the boundary $\partial\Omega$ (notice that even at this vague level we are imposing some smoothness: our physical image of a wire loop may be very distorted but will certainly be piecewise smooth) and imagine a soap film whose edge is


    the wire. Then, after some initial oscillations, we expect the soap film to stabilize, giving a solution to the boundary value problem defined by the shape of the wire.

    In this context, what is the meaning of eigenfunctions of the Laplace operator that vanish on the boundary? To see this, imagine a drum whose skin has the shape $\Omega$ so that the vibrating membrane is fixed along the boundary $\partial\Omega$, which is simply a flat loop. If $F\colon\Omega\to\mathbb{R}$ satisfies

    \[ \Delta F=\lambda F\ \text{in }\Omega,\qquad F|_{\partial\Omega}=0 \]

    for some $\lambda<0$, then we claim that

    \[ u(x,t)=F(x)\cos\bigl(\sqrt{-\lambda}\,t\bigr) \]

    solves the wave equation:

    \[
    \frac{\partial^2}{\partial t^2}u(x,t)=F(x)\underbrace{\bigl(-(\sqrt{-\lambda})^2\bigr)}_{=\lambda}\cos\bigl(\sqrt{-\lambda}\,t\bigr)
    =\Delta F(x)\cos\bigl(\sqrt{-\lambda}\,t\bigr)
    =\Delta_x\bigl(F\cos(\sqrt{-\lambda}\,t)\bigr).
    \]

    In other words, if we start the drum at time $t=0$ with the prescribed shape given by the function $F$, then the drum will produce a pure tone of frequency $\frac{\sqrt{-\lambda}}{2\pi}$.
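    For a concrete instance of the pure-tone computation, take the one-dimensional domain $\Omega=(0,1)$ (a string rather than a drum), with the eigenfunctions $\sin(\pi nx)$ from Section 1.3.3. This worked example is an illustration added here under the normalization $c=1$.

```latex
% String on (0,1) with fixed ends: F_n(x) = sin(pi n x), n = 1, 2, ...
\[
F_n(x)=\sin(\pi n x),\qquad
\Delta F_n = F_n'' = -(\pi n)^2 F_n,\qquad
F_n(0)=F_n(1)=0,\qquad
\lambda=-(\pi n)^2<0 .
\]
% The associated standing wave and its frequency:
\[
u(x,t)=\sin(\pi n x)\cos(\pi n t),\qquad
\frac{\partial^2 u}{\partial t^2}=-(\pi n)^2u=\frac{\partial^2 u}{\partial x^2},\qquad
\text{frequency}=\frac{\sqrt{-\lambda}}{2\pi}=\frac{n}{2}.
\]
```

    The frequencies $\tfrac12,1,\tfrac32,\dots$ are integer multiples of the lowest one, which matches the familiar overtone series of a string.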

    Exercise 1.17. Assume that $\Omega$ satisfies the basis of eigenfunctions claim from Section 1.4.1, and that the Dirichlet boundary value problem always has a solution on $\Omega$.
    (a) Combine our two discussions of the heat equation to produce a non-rigorous general procedure along the lines of this section to solve the boundary value problem (no rigorous proof is expected)

    \[
    \frac{\partial u}{\partial t}=\Delta u\ \text{in }\Omega\times[0,\infty),\qquad
    u|_{\partial\Omega\times\{t\}}=b,\qquad
    u|_{\Omega\times\{0\}}=f.
    \]

    (b) Repeat (a) for the wave equation.

    (Footnote on the wave equation (1.20): In the real world there would also be a friction term, and the model for this is a modified wave equation, but we will ignore these subtleties.)

    (Footnote on the pure tone of a drum: This preferred frequency for certain physical objects is part of the phenomenon of resonance, and the design of large structures like buildings or bridges tries to prevent resonances that may lead to reinforcement of oscillations by wind, for example.)


    Exercise 1.18. For the clamped vibrating string (the wave equation over $\mathbb{T}$), the basis of eigenfunctions claim is precisely the claim that every nice function can be represented by its Fourier series. Assuming that this holds, show the basis of eigenfunctions claim for the domain $\Omega=(0,1)\subseteq\mathbb{R}$. (In fact we have already encountered the eigenfunctions $x\mapsto\sin(\pi nx)$ with $n=1,2,\dots$; no rigorous proof is expected, but explore the connection.)

    1.5 Distributions as Generalized Functions

    Both in applications and within mathematics it is often useful to have a generalized notion of function to allow, for example, a function $F$ on $\mathbb{R}$ with the property that

    \[ \int_{\mathbb{R}}\varphi(x)F(x)\,dx=\varphi(0) \tag{1.21} \]

    for any nice functions $\varphi\colon\mathbb{R}\to\mathbb{R}$. Such an $F$ might represent a point mass (a dimensionless object of mass 1 located at 0), or be a mathematical representation of an impulse in physics. Since $F$ is certainly not a function (see Exercise 1.20), one needs to develop a new theory that includes such objects(1). The theory of distributions allows for such generalized functions, and permits them to be differentiated, multiplied by smooth functions, and so on. Of course if we were only interested in expressions of the form in (1.21) then we could simply study measures, since (1.21) is simply the integral against the Dirac measure $\delta_0$ at the origin. However, within the space of measures it does not normally make sense to take derivatives (and this is the case for $\delta_0$) and we will be able to allow a derivative map in the space of distributions.

    The most direct approach to distributions superficially seems to be a cheat: We declare a distribution to be a linear continuous functional (that is, a linear continuous map to the base field $\mathbb{R}$ or $\mathbb{C}$) on a space of nice test functions $\{\varphi\}$. Here the definition of nice may vary, to give different classes of distributions. For example, one could consider all smooth functions with compact support on $\mathbb{R}^d$ as the nice test functions.

    Requiring continuity of the linear functional is natural but needs a topology to be defined on the space of test functions. In the case of smooth functions of compact support on $\mathbb{R}^d$ the natural topology does not come from a Banach space (that is, a linear space complete with respect to a norm; see Section 2.2), so we will need to study more general classes of topological vector spaces (that is, vector spaces equipped with a topology making all the vector space operations continuous). We will start the discussion of these locally convex vector spaces in Chapter 7.

    This definition of a distribution is a cheat because we have finessed the problem that no function $F$ satisfies (1.21) by simply declaring $F$ to be the distribution (that is, continuous linear functional) which sends the test function $\varphi$ to $\varphi(0)$ without giving a more direct generalization of functions on $\mathbb{R}$. We may write this formally as


    \[ \langle F,\varphi\rangle=\varphi(0), \]

    where we write $\langle F,\varphi\rangle$ for the action of the functional $F$ on the test function $\varphi$. One sometimes also writes $\int_{\mathbb{R}}F\varphi$ for $\langle F,\varphi\rangle$, especially if we continue to think of $F$ as a generalized function, but whenever we want to prove something about $F$ we will go back to the formal definition of $F$ as a functional on the space of allowed test functions. Even though this may look dubious at first sight, the intuition provided by the viewpoint that $F$ is a generalized function is often useful, and will stay consistent with the formal treatment of $F$ as a linear functional. As indicated above however, we can only treat this theory rigorously after more preparatory material has been developed.

    Exercise 1.19. Show that any integrable function gives rise to a distribution. That is, any $f$ in $L^1(\mathbb{R})$ defines a linear functional $F_f$ on the space $C_c^\infty(\mathbb{R})$ of smooth compactly supported functions via

    \[ \langle F_f,\varphi\rangle=\int f(x)\varphi(x)\,dx. \]

    Moreover, show that the resulting map $f\mapsto F_f$ is linear and injective. Actually it is sufficient to assume that $f\in L^1_{\mathrm{loc}}(\mathbb{R})$, the space of locally integrable functions, those that are measurable and have the property that their restriction to any compact set is integrable.

    Exercise 1.20. Show that no measurable and locally integrable function $f\colon\mathbb{R}\to\mathbb{R}$ has the property (1.21) as $\varphi$ ranges over all smooth compactly supported functions.
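    Although no function satisfies (1.21), ordinary functions can approximate it. The sketch below pairs a test function with narrow "hat" functions of total integral 1 and shows the pairing approaching the value at 0. The hat shape, the names, and the test function (a cosine, which is smooth but not compactly supported) are illustrative assumptions.

```python
import math


def hat(x: float, eps: float) -> float:
    """Triangular bump supported on [-eps, eps] with integral 1 and peak 1/eps."""
    return max(0.0, 1.0 - abs(x) / eps) / eps


def pair(phi, eps: float, n: int = 2000) -> float:
    """Approximate the integral of phi(x) * hat(x, eps) over [-eps, eps] (midpoint rule)."""
    step = 2.0 * eps / n
    total = 0.0
    for j in range(n):
        x = -eps + (j + 0.5) * step
        total += phi(x) * hat(x, eps)
    return step * total


if __name__ == "__main__":
    for eps in (1.0, 0.1, 0.01):
        # The pairing tends to phi(0) = 1 while the peak 1/eps of the bump blows up.
        print(eps, pair(math.cos, eps), hat(0.0, eps))
```

    The pairings converge, but the bumps themselves have no limit among functions: their peaks grow like $1/\varepsilon$, which is one way to see why a new kind of object is needed.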

    1.6 Highly Connected Networks: Expanders

    In designing large connected networks (for example, connecting many computers and servers) one is often confronted with two competing constraints:

    (High connectivity) Starting from any vertex, it should be easy to reach any other vertex quickly (that is, in few steps);

    (Sparsity) The network should be economical, meaning that there should not be an unnecessarily large number of edges in the network.

    Clearly it is easy to achieve the first at the expense of the second by using a complete graph (in which every pair of vertices has an edge joining them), and it is easy to achieve the second at the expense of the first (by arranging the edges so that the vertices are strung along a single line, so as to achieve connectivity at the lowest possible cost).

    Exercise 1.21. Analyze the number of edges as a function of the number of vertices in the two extreme constructions of connected networks from above.

    Of course there is another option of creating a center vertex with a direct connection to each of the existing vertices, but the center vertex created in this


    way would be very costly (or even technically impossible) and would defeat the objective of achieving sparsity.

    The notion of expander graphs is an attempt to achieve a balance between the two constraints. In order to describe expanders, we will need some basic notation from graph theory.

    A graph $G=(V,E)$ is a set of vertices $V$ (the nodes of the network) and edges $E\subseteq V\times V$ giving the list of direct connections between nodes. We will always assume that the graph is undirected, so each edge goes both ways and the set $E$ is symmetric. We will also assume that the graph is simple, i.e. that a pair of vertices is connected by at most one edge and that there is never an edge from a vertex to itself.

    The requirement of sparsity is achieved by requiring that the graph $G$ be $k$-regular for a fixed $k$. A graph $G=(V,E)$ is said to be $k$-regular if, for any vertex $v\in V$ there are exactly $k$ edges from $v$ to other vertices in $V$ (possibly including an edge to $v$ itself). We will fix $k$ and look for $k$-regular graphs with a large number of vertices. Notice that this will impose a sparsity condition on the graph, since the number of edges $|E|$ will be a linear function of the number of vertices $|V|$ (in contrast to the case of a complete graph, a simple undirected graph in which every pair of distinct vertices is connected by a unique edge, for which $|E|=\frac12|V|\,(|V|-1)$).

    In order to define the notion of high connectivity, we will need some preparations. A graph $G=(V,E)$ is called connected if for any two $v,w\in V$ there exists a path from $v$ to $w$ in the sense that there is a list $v_0=v,v_1,v_2,\dots,v_n=w$ of vertices in $V$ with $(v_i,v_{i+1})\in E$ for $i=0,\dots,n-1$. Such a path may consist of a singleton, so each vertex is connected to itself by a path of length zero. Notice that there is a natural metric on any connected graph: we may define $d(v,w)$ to be the minimal length of a path from $v$ to $w$ (that is, the minimal number of edges in a path joining $v$ to $w$; see Figure 1.2 and Exercise 1.22). The diameter of a connected graph $G$ is the minimal $N\in\mathbb{N}$ with the property that for any two vertices $v$ and $w$ there is a path of length no more than $N$ connecting $v$ to $w$.

    Exercise 1.22. Verify that the notion of distance on a graph illustrated in Figure 1.2 defines a metric on the set of vertices of a connected graph.

    The smaller the diameter is in comparison with $|V|$, the better the connectivity of the graph is. The worst case, with the vertices strung out on a line (or, if we seek a 2-regular graph, arranged around a circle), has diameter $|V| - 1$ (or $\lfloor |V|/2 \rfloor$). The other extreme case of a complete graph has diameter $1$. In the case of expander graphs we will see that families of graphs may be found with diameter $N \ll \log |V|$.

    Formally the set of edges is viewed as a subset of $(V \times V) \setminus \{(a, a) : a \in V\}$, and in this sense symmetry means that $(a, b) \in E$ if and only if $(b, a) \in E$. We will think of a single edge joining vertex $a$ to vertex $b$ if $(a, b) \in E$; that is, $(a, b)$ and $(b, a)$ will be viewed as a single element of the set of edges $E$. In particular, $|E|$ will be the total number of edges drawn in the graph, each of which is viewed as a two-way connection.

    Fig. 1.2. Two points $v$, $w$ at distance 3 in a connected graph.

    Definition 1.23 (Expanders). A sequence of finite $k$-regular graphs $(G_i = (V_i, E_i))_{i \geq 1}$ is an expander family if there exists a constant $\eta > 0$ (independent of $i$) with
        \[ |\partial S| \geq \eta \min\{|S|, |V_i \setminus S|\} \]
    for any subset $S \subseteq V_i$ for any $i \geq 1$, where
        \[ \partial S = \{v \in S \mid \text{there exists } w \in V_i \setminus S \text{ with } (v, w) \in E_i\} \cup \{v \in V_i \setminus S \mid \text{there exists } w \in S \text{ with } (v, w) \in E_i\} \]
    is the boundary of $S$.
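
    To get a feeling for the boundary condition in Definition 1.23, one can compute the best possible constant for a single small graph by brute force over all subsets. The sketch below assumes the same dictionary representation of a graph (vertex mapped to its list of neighbours); the function names are hypothetical, and the exhaustive search is only feasible for very small graphs.

```python
from itertools import combinations


def boundary(adj, S):
    """Vertices, inside or outside S, that are joined by an edge to the other side."""
    S = set(S)
    inner = {v for v in S if any(w not in S for w in adj[v])}
    outer = {v for v in adj if v not in S and any(w in S for w in adj[v])}
    return inner | outer


def expansion_ratio(adj):
    """Minimum of |boundary(S)| / min(|S|, |V \\ S|) over proper non-empty subsets S."""
    vertices = list(adj)
    n = len(vertices)
    best = None
    for size in range(1, n):
        for S in combinations(vertices, size):
            ratio = len(boundary(adj, S)) / min(size, n - size)
            if best is None or ratio < best:
                best = ratio
    return best


def cycle(n):
    """n >= 3 vertices arranged around a circle (a 2-regular graph)."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
```

    For the circle on $n$ vertices the minimum is attained by an arc of length $\lfloor n/2 \rfloor$, whose boundary has only four elements, so the ratio shrinks as $n$ grows and no single constant works for all circles.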

    A few comments are in order. We may always assume that $\eta \in (0, 1)$. Any finite collection of finite $k$-regular connected graphs (formally, a sequence as in Definition 1.23 that repeats these) is automatically an expander family. As this is not at all interesting (and in particular does not achieve the real benefit of the slower growth rate from the logarithmic bound on the diameter), one usually requires in addition that $|V_i| \to \infty$ as $i \to \infty$. Notice that we must also have $k \geq 3$, because $k = 2$ corresponds to a regular $|V|$-gon, which we quickly see cannot be an expander family. An expander family consists of connected graphs, but much more is true.

    Proposition 1.24 (Small diameter). For an expander family $(G_i)_i$, we have $\operatorname{diam} G_i \ll \log |V_i|$.

    We write $A \ll B$ for functions $A$ and $B$ defined on $\mathbb{N}$ to mean that there is some constant $c$ for which $A(n) \leq c B(n)$ for all $n \geq 1$. In the current case the constant $c$ will depend on $k$ and on $\eta$ (as in Definition 1.23) but is not allowed to depend on the particular graph $G$.

    Proof. Given some vertex $v \in V_i$ we claim that the metric ball
        \[ B_a(v) = \{w \in V_i \mid d(v, w) \leq a\} \]
    has more than $\frac{|V_i|}{2}$ elements if the integer $a$ satisfies
        \[ a \geq D = \frac{\log(|V_i|/2)}{\log\bigl(1 + \eta/(k + 1)\bigr)}. \]

    Assuming the claim, suppose that $v, w \in V_i$ are any pair of vertices. Then, by the claim, each of $|B_a(v)|$ and $|B_a(w)|$ is greater than $\frac{|V_i|}{2}$, so that these two balls must have non-empty intersection. By the triangle inequality, it follows that
        \[ d(v, w) \leq 2(D + 1) \ll_{\eta, k} \log |V_i|, \]
    giving the proposition.

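    To prove the claim, notice that if $S = B_n(v)$ then
        \[ B_{n+1}(v) \setminus B_n(v) = \partial S \setminus S. \]
    Moreover,
        \[ |\partial S \setminus S| \geq \frac{1}{k + 1} |\partial S|, \]
    since every element of $S \cap \partial S$ must connect to one element of $\partial S \setminus S$ and at most $k$ elements of $S \cap \partial S$ can connect to the same element of $\partial S \setminus S$. Together with the defining property of expander graphs, and assuming $\eta \in (0, 1)$ say, we deduce that
        \[ |B_0(v)| = 1, \qquad |B_1(v)| \geq k - 1 \geq 2 > 1 + \frac{\eta}{k + 1}, \]
    and, by induction,
        \[ |B_{n+1}(v)| = |B_n(v)| + |\partial B_n(v) \setminus B_n(v)| \geq \Bigl(1 + \frac{\eta}{k + 1}\Bigr) |B_n(v)| > \Bigl(1 + \frac{\eta}{k + 1}\Bigr)^{n+1} \]
    for all $n$ with $|B_n| \leq \frac{|V_i|}{2}$. Since for $n = a \geq D$ the lower bound is at least $\frac{|V_i|}{2}$, this proves the claim.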

    Thus expander families achieve a balance between the two constraints of high connectivity (with logarithmic growth of the diameter) and sparsity of the graph (with only linear growth of the number of edges and a fixed number of edges for its neighbors). However, several questions remain, the most pressing of which are the following.

    Do expander families exist?
    What is their connection to functional analysis?

    The first examples of expander families were found by Pinsker [38] (translated in [39]) using a non-constructive probabilistic argument. The same year Margulis [29] (translation in [30]) was able to give an explicit construction using Kazhdan's Property (T) of the group $\mathrm{SL}_3(\mathbb{Z})$. Margulis showed that the family of quotients $\mathrm{SL}_3(\mathbb{Z})/\Gamma$ by finite index subgroups $\Gamma$ forms (via a standard graph structure on them) an expander family. To prove this, we will discuss unitary representations of the group $\mathrm{SL}_3(\mathbb{Z})$ (i.e. actions of $\mathrm{SL}_3(\mathbb{Z})$ by unitary transformations on a Hilbert space).

    To prepare some of the ground for the proof by Margulis, we can already exhibit a connection between the expander property and properties of eigenvalues of linear maps associated to the graphs.

    Let $G = (V, E)$ be a finite graph and identify $V$ with the set $\{1, 2, \dots, |V|\}$. The adjacency matrix $A_G$ of the graph $G$ is the matrix with $|V|$ rows and $|V|$ columns and with entries in $\{0, 1\}$ so that $(A_G)_{ij} = 1$ if and only if there is an edge from vertex $i$ to vertex $j$. A simple graph $G$ with adjacency matrix
        \[ A_G = \begin{pmatrix} 0 & 1 & 0 & 1 & 1 \\ 1 & 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 1 & 0 \end{pmatrix} \]
    is shown in Figure 1.3.

    Fig. 1.3. A connected graph $G$ on 5 vertices.

    Several properties of the graph are reflected in the properties of the adjacency matrix. The matrix $A_G$ is symmetric by our standing assumption on the graph $G$. We also define
        \[ M_G = \frac{1}{k} A_G, \]
    which is an averaging operator in the following sense. A vector $x \in \mathbb{R}^{|V|}$ may be thought of as a function on the set of vertices, and applying $M_G$ to $x$ gives a new function which at the vertex $i$ is equal to the mean of the values of the function $x$ at all the neighbors of $i$. By analogy with the material in Section 1.4, one also studies the graph Laplace operator
        \[ \Delta_G = I - M_G. \]
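
    The averaging operator and the graph Laplace operator can be tried out on the 5-vertex example. The sketch below assumes the adjacency matrix of Figure 1.3 read row by row (so that vertex 2 carries a loop and every row sums to 3); the function names are illustrative only, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Adjacency matrix of the 5-vertex example (assumed row-by-row reading).
A = [
    [0, 1, 0, 1, 1],
    [1, 1, 1, 0, 0],
    [0, 1, 0, 1, 1],
    [1, 0, 1, 0, 1],
    [1, 0, 1, 1, 0],
]
k = 3  # every row sums to 3, so the graph is 3-regular


def averaging_operator(A, k):
    """M_G = (1/k) A_G with exact rational entries."""
    return [[Fraction(a, k) for a in row] for row in A]


def apply_operator(M, x):
    """Matrix-vector product M x."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]


def laplace(M, x):
    """(I - M) x, the graph Laplace operator applied to x."""
    return [xi - yi for xi, yi in zip(x, apply_operator(M, x))]


M = averaging_operator(A, k)
ones = [Fraction(1)] * 5
x = [Fraction(v) for v in (3, 0, 6, 9, 12)]
```

    Here apply_operator(M, x) replaces each value by the mean over the three neighbours, and the constant vector is fixed by $M_G$ and annihilated by the graph Laplace operator.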

    Since $M_G$ is symmetric it is diagonalizable and has only real eigenvalues. Moreover,
        \[ \sum_i |(M_G x)_i| = \sum_i \Bigl| \sum_j (M_G)_{ij} x_j \Bigr| \leq \sum_{i,j} (M_G)_{ij} |x_j| = \sum_j |x_j|, \]
    since
        \[ \sum_i (M_G)_{ij} = 1 \tag{1.22} \]
    for all $j$ by construction. Therefore, any eigenvalue $\lambda$ of $M_G$ has $|\lambda| \leq 1$ and by (1.22) we see that $\lambda_1 = 1$ is an eigenvalue (with the constant vectors as eigenvectors). The relationship between the eigenvalues and connectivity is illustrated by the following elementary lemma.

    Lemma 1.25 (Connectivity). A $k$-regular graph is connected if and only if $1$ is a simple eigenvalue of $M_G$.

    Exercise 1.26. Prove Lemma 1.25.

    What we need next is a quantitative version of this relationship, the first step of which is given by the following proposition.

    Proposition 1.27 (Eigenvalues and expanders). Let $(G_i = (V_i, E_i))_{i \geq 1}$ be a sequence of graphs. For each $i$, let $M_i = M_{G_i}$ be the averaging operator associated to $G_i$, and order its eigenvalues as
        \[ \lambda_1(M_i) = 1 \geq \lambda_2(M_i) \geq \cdots \geq \lambda_{|V_i|}(M_i). \]
    Suppose that there exists some $\varepsilon > 0$ with
        \[ \lambda_2(M_i) \leq 1 - \varepsilon \tag{1.23} \]
    for all $i \geq 1$. Then the sequence of graphs is an expander family.

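    The uniform estimate in (1.23) is called a spectral gap for the sequence of graphs. The converse of Proposition 1.27 also holds, but we will not need this direction (we refer to Lubotzky [28] for the proof).

    Proof of Proposition 1.27. Let $\varepsilon > 0$ be as in the statement of the proposition. Let $G = G_i$ and $M = M_i$ for some fixed $i$, so that $\lambda_2(M) \leq 1 - \varepsilon$. Also let $S \subseteq V$ be any subset with $|S| \leq \frac{1}{2}|V|$. We again think of vectors in $\mathbb{R}^{|V|}$ as functions on $V$, and notice that
        \[ M(\mathbf{1}_S) = \mathbf{1}_S + f_S, \]
    where $f_S$ is a vector that vanishes outside of $\partial S$ and has absolute value at most $1$ on the elements of $\partial S$. We will estimate $\|f_S\|^2$ from above and below, and the resulting estimate will prove the claim. In fact, it follows that
        \[ \|M(\mathbf{1}_S) - \mathbf{1}_S\|^2 = \|f_S\|^2 \leq |\partial S|. \]
    On the other hand, $M$ is diagonalizable and so
        \[ \mathbf{1}_S = \sum_j v_j, \qquad M(\mathbf{1}_S) = \sum_j \lambda_j v_j, \]
    where each $v_j$ is either zero or an eigenvector for the eigenvalue $\lambda_j = \lambda_j(M)$, and finally
        \[ M(\mathbf{1}_S) - \mathbf{1}_S = \sum_{j=2}^{|V|} (\lambda_j - 1) v_j. \]
    Furthermore, since $M$ is symmetric we can assume that the vectors $v_j$ are orthogonal to each other, so
        \[ \|M(\mathbf{1}_S) - \mathbf{1}_S\|^2 \geq \min_{2 \leq j \leq |V|} |\lambda_j - 1|^2 \sum_{j=2}^{|V|} \|v_j\|^2 \geq \varepsilon^2 \Bigl\| \sum_{j=2}^{|V|} v_j \Bigr\|^2. \]
    Thus we need to relate the last norm to the size of $S$. To this end, notice that
        \[ \|\mathbf{1}_V\|^2 = |V| \quad \text{and} \quad \langle \mathbf{1}_S, \mathbf{1}_V \rangle = |S|, \]
    so the orthogonal projection of $\mathbf{1}_S$ onto $\mathbb{R}\mathbf{1}_V$ is $\frac{|S|}{|V|} \mathbf{1}_V$. Therefore,
        \[ \Bigl\| \sum_{j=2}^{|V|} v_j \Bigr\|^2 = \Bigl\| \mathbf{1}_S - \frac{|S|}{|V|} \mathbf{1}_V \Bigr\|^2 \geq \Bigl(1 - \frac{|S|}{|V|}\Bigr)^2 \|\mathbf{1}_S\|^2 \quad \text{(by restricting the sum to } S\text{)} \]
        \[ \geq \frac{1}{4} |S| \quad \text{(since } \tfrac{|S|}{|V|} \leq \tfrac{1}{2}\text{)}, \]
    and putting these inequalities together gives
        \[ |\partial S| \geq \|M(\mathbf{1}_S) - \mathbf{1}_S\|^2 \geq \varepsilon^2 \Bigl\| \sum_{j=2}^{|V|} v_j \Bigr\|^2 \geq \frac{\varepsilon^2}{4} |S|. \]
    Thus the sequence $(G_i)_{i \geq 1}$ is an expander family with $\eta = \frac{\varepsilon^2}{4}$.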
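
    As a sanity check on the role of the spectral gap, consider the circles ($k = 2$), which are not an expander family. The sketch below relies on the standard fact that the vectors $x_i = \cos(2\pi j i / n)$ are eigenvectors of the averaging operator of the circle on $n$ vertices with eigenvalue $\cos(2\pi j / n)$; the function names are hypothetical. It shows that the second eigenvalue approaches $1$, so no uniform gap as in (1.23) exists for this family.

```python
import math


def average_on_cycle(x):
    """Apply M = A/2 for the circle graph: the mean of the two neighbours."""
    n = len(x)
    return [(x[(i - 1) % n] + x[(i + 1) % n]) / 2 for i in range(n)]


def cosine_mode(n, j):
    """The vector with entries cos(2*pi*j*i/n) for i = 0, ..., n-1."""
    return [math.cos(2 * math.pi * j * i / n) for i in range(n)]


def is_eigenvector(x, lam, tol=1e-12):
    """Check M x = lam * x entrywise up to a small tolerance."""
    return all(abs(y - lam * xi) <= tol for xi, y in zip(x, average_on_cycle(x)))


def second_eigenvalue_of_cycle(n):
    """The largest eigenvalue below 1 of M for the circle, namely cos(2*pi/n)."""
    return math.cos(2 * math.pi / n)
```

    Since $1 - \cos(2\pi/n)$ tends to zero as $n$ grows, the hypothesis of Proposition 1.27 fails for circles, in agreement with the remark that $k = 2$ cannot give an expander family.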

    1.7 What is spectral theory?

    As we will see later, the topics considered in Section 1.1, Section 1.3, Section 1.4, and Section 1.6 are all connected to spectral theory.

    The goal of spectral theory, at its broadest, might be described as an attempt to classify all linear operators. The restriction to Hilbert space is natural for two reasons. First, it is much easier than the general case of operators on Banach spaces (indeed, the general picture for Banach spaces is barely understood today). Secondly, many of the most important applications belong to this simpler setting of operators on Hilbert spaces. This is more than a happy coincidence, since Hilbert spaces are distinguished among Banach spaces as being the spaces most closely linked to the notions of distance, angle, and orthogonality in Euclidean geometry. Euclidean geometry in turn seems to be a sufficiently accurate mathematical model for the physical universe on many different size scales, so it is not so surprising that some of the most useful infinite-dimensional arguments remain close to this geometric intuition.

    How might one set about classifying linear operators? Finite-dimensional linear algebra suggests that linear maps $T_1, T_2 : H_1 \to H_2$ which are linked by a relation of the form
        \[ T_2 \circ U_1 = U_2 \circ T_1, \tag{1.24} \]
    where $U_i : H_i \to H_i$ for $i = 1, 2$ are invertible linear maps, will share many properties. In the finite-dimensional case, this is because the map $U_i$, being invertible, may be thought of as corresponding to a change of basis in the space $H_i$, which is an operation that does not affect the intrinsic properties of the operators. This interpretation fails in general for infinite-dimensional spaces where no good theory of bases exists, but the definition still has interest, and one may try to describe all operators $H_1 \to H_2$ up to such equivalence.

    If $H_1 = H_2 = H$ is a single Hilbert space, then one can specialize this notion of equivalence, saying that operators $T_1, T_2 : H \to H$ are equivalent if there is an invertible linear map $U : H \to H$ with
        \[ T_2 \circ U = U \circ T_1, \tag{1.25} \]
    or equivalently if $T_2 = U T_1 U^{-1}$. Once again, the interpretation of $U$ as a change of basis is not available in the infinite-dimensional setting, but the notion is natural.

    In linear algebra, the classification problem is successfully solved by the theory of eigenvalues, eigenspaces, minimal and characteristic polynomials, which leads to a canonical normal form for any linear operator $\mathbb{C}^n \to \mathbb{C}^n$ for any $n \geq 1$.

    We won't be able to get such a general theory if $H$ is infinite-dimensional, but it turns out that many operators of greatest interest have properties which, in the finite-dimensional case, ensure an even simpler description. They may belong to any of the special classes of operators defined on a Hilbert space by means of the adjoint operation $T \mapsto T^*$: normal operators, self-adjoint operators, positive operators, or unitary operators. For these classes, if $\dim H = n$, then there is an orthonormal basis $(e_1, \dots, e_n)$ of eigenvectors of $T$ with corresponding eigenvalues $(\lambda_1, \dots, \lambda_n)$, and in this basis, we can write
        \[ T\Bigl( \sum_{i=1}^n \alpha_i e_i \Bigr) = \sum_{i=1}^n \alpha_i \lambda_i e_i, \tag{1.26} \]
    corresponding to a diagonal matrix representation. There is one interpretation of this representation which turns out to be amenable to generalization (in general, we will not be able to use bases in the same way in the infinite-dimensional setting). Consider the linear map $U : H \to \mathbb{C}^n$ defined by linearly extending the map with
        \[ U : e_i \mapsto (0, \dots, 0, 1, 0, \dots, 0), \]
    where there is a $1$ in the $i$th position. This map is a bijective isometry, by definition of an orthonormal basis, if $\mathbb{C}^n$ has the standard inner product. If
        \[ T_1 : \mathbb{C}^n \to \mathbb{C}^n, \qquad (\alpha_i) \mapsto (\alpha_i \lambda_i), \]
    then (1.26) becomes
        \[ T_1 \circ U = U \circ T. \tag{1.27} \]

    This is obvious, but we may interpret this as follows, which gives a slightly different view of the classification problem. For any finite-dimensional Hilbert space $H$, and normal operator $T$, we have found a model space and operator $(\mathbb{C}^n, T_1)$, such that in the sense of (1.27) $(H, T)$ is equivalent to $(\mathbb{C}^n, T_1)$ (in fact, unitarily equivalent, since $U$ is isometric).

    The theory we will describe later will be a generalization of this type of normal form reduction, a point of view emphasized in the work of Reed and Simon [40, Ch. VII]. This is successful because the model spaces and operators are indeed quite simple: they are of the type $L^2(X, \mu)$ for some measure space $(X, \mu)$ (the finite-dimensional case of $\mathbb{C}^n$ corresponding to $X = \{1, \dots, n\}$ with the counting measure), and the operators are multiplication operators
        \[ T_g : f \mapsto g f \]
    for some suitable function $g : X \to \mathbb{C}$.
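
    The finite-dimensional normal form can be made concrete in two dimensions. The sketch below uses a hypothetical symmetric $2 \times 2$ matrix $T$ (chosen only for illustration) with orthonormal eigenvectors $e_1, e_2$ and eigenvalues $3$ and $1$; the map U takes a vector to its coordinates in the eigenbasis, and T1 is multiplication by the function $g$ on $X = \{1, 2\}$.

```python
import math

# T acts on H = R^2 in the standard basis; it is symmetric, hence normal.
T = [[2.0, 1.0],
     [1.0, 2.0]]

# Orthonormal eigenvectors e_1, e_2 with eigenvalues g(1) = 3 and g(2) = 1.
s = 1 / math.sqrt(2)
e = [(s, s), (s, -s)]
g = [3.0, 1.0]


def apply_T(v):
    """The operator T applied to a vector v in H."""
    return tuple(sum(T[i][j] * v[j] for j in range(2)) for i in range(2))


def U(v):
    """Coordinates of v in the eigenbasis, so U(e_i) is the i-th unit vector."""
    return tuple(sum(ei[j] * v[j] for j in range(2)) for ei in e)


def T1(alpha):
    """The model operator: multiply the i-th coordinate by g(i)."""
    return tuple(a * gi for a, gi in zip(alpha, g))


def close(a, b, tol=1e-12):
    """Entrywise comparison up to rounding error."""
    return all(abs(x - y) <= tol for x, y in zip(a, b))
```

    With these definitions close(T1(U(v)), U(apply_T(v))) holds for every test vector $v$, which is the relation $T_1 \circ U = U \circ T$ in this small case.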

    1.8 Further Topics

    We list here a few more topics that we will be able to discuss while developing the theory called functional analysis.

    Inheriting Smoothness: Suppose that $f : \mathbb{R}^2 \to \mathbb{R}$ is continuous and that the partial derivatives
        \[ \frac{\partial^k}{\partial x_1^k} f, \qquad \frac{\partial^k}{\partial x_2^k} f \]
    exist and are continuous for all $k \geq 1$. Then $f$ is smooth; that is, all mixed derivatives also exist and are continuous. How is it that the existence of the directional partial derivatives along the $x_1$ and $x_2$ axes alone can guarantee smoothness? We will provide the necessary background to answer this question in Chapter 3 (see Exercise 3.22).

    Generalized Limits: How can one construct a generalized limit notion that assigns to every bounded sequence a limit, and still has many of the usual expected properties? One such property is translation invariance with respect to the underlying group (for a sequence in the normal sense, this group would be $\mathbb{Z}$). Do all groups have similar notions of generalized limits? We will answer the first question in Section 6.1.5, where we will also start a discussion of the second question, which leads to the topic of amenable groups.

    For the construction of expander graphs we will discuss groups that are in some sense diametrically opposite to amenable groups. These are the groups with property (T) introduced by Kazhdan in 1967.

    2 Norms, Banach Spaces, and Hilbert Spaces

    In this chapter we start the more formal treatment of functional analysis, giving the fundamental definitions and introducing some of the basic examples and their properties.

    2.1 Norms and Semi-Norms

    We will assume familiarity with the following concepts from linear algebra: vector spaces, subspaces, quotient spaces, dimension (which may be infinite), linear maps, image and kernel of linear maps. The notion of basis of a vector space will only be used to distinguish finite-dimensional vector spaces from infinite-dimensional ones. We will not usually try to describe the vector spaces that arise in functional analysis, or the linear maps between them, in terms of bases. An exception will arise in the study of Hilbert spaces (see Section 2.5) and in the study of certain (important but nonetheless special) operators on them (see Chapter 4). Also recall that a subset $K \subseteq V$ of a vector space is said to be convex if for $k_1, k_2 \in K$ and $t \in [0, 1]$ we have
        \[ (1 - t) k_1 + t k_2 \in K. \]

    2.1.1 Normed Vector Spaces

    Throughout these notes we will be working with real or complex vector spaces $(V, +, \cdot)$ (here $+$ is vector addition, and $\cdot$ scalar multiplication). We will call the elements of the field simply scalars if we want to avoid making the distinction between the real and complex case. For instance, in the fundamental definitions to come in this section, we treat the real and complex cases simultaneously.

    Definition 2.1. Let $V$ be a real or complex vector space. A norm is a map
        \[ \|\cdot\| : V \to \mathbb{R} \]
    with the following properties:

    (1) $\|v\| \geq 0$ for any $v \in V$, and $\|v\| = 0$ if and only if $v = 0$ (Strict positivity);
    (2) $\|\alpha v\| = |\alpha| \|v\|$ for all $v \in V$ and scalars $\alpha$ (Homogeneity); and
    (3) $\|v + w\| \leq \|v\| + \|w\|$ for all $v, w \in V$ (Triangle inequality).

    If $\|\cdot\|$ is a norm on $V$, then $(V, \|\cdot\|)$ is called a normed vector space.

    It is easy to give examples of normed vector spaces, and we list a few standard examples here (more will appear throughout these notes).

    Example 2.2. The following are examples of normed real vector spaces, in which we write $v = (v_1, \dots, v_d)^t$ for elements of $\mathbb{R}^d$.

    (1) $V = \mathbb{R}^d$ with $\|v\| = \|v\|_2 = \sqrt{v_1^2 + \cdots + v_d^2}$.
    (2) $V = \mathbb{R}^d$ with $\|v\| = \|v\|_\infty = \max_{1 \leq i \leq d} |v_i|$.
    (3) $V = \mathbb{R}^d$ with $\|v\| = \|v\|_1 = |v_1| + \cdots + |v_d|$.
    (4) $V = \mathbb{R}^d$ with norm defined by
        \[ \|v\|_B = \inf\{\alpha > 0 \mid \tfrac{1}{\alpha} v \in B\}, \]
    where $B$ is an open, centrally symmetric (that is, with $B = -B$), convex, bounded (with respect to the Euclidean norm) subset of $\mathbb{R}^d$.
    (5) Let $X$ be any topological space (for example, a metric space), and let
        \[ V = C_b(X) = \{f : X \to \mathbb{R} \mid f \text{ is continuous and bounded}\} \]
    with the uniform or supremum norm
        \[ \|f\| = \|f\|_\infty = \sup_{x \in X} |f(x)|. \]
    Notice that if $X$ is compact, then $C_b(X)$ coincides with $C(X)$, the space of continuous functions $X \to \mathbb{R}$.
    (6) A special case of (5) makes $C([0, 1])$, and so also the subspace
        \[ C^1([0, 1]) = \{f : [0, 1] \to \mathbb{R} \mid f \text{ has a continuous derivative on } [0, 1]\}, \]
    into a normed vector space. A different norm on $C^1([0, 1])$ may be obtained by setting
        \[ \|f\|_{C^1([0,1])} = \max\{\|f\|_\infty, \|f'\|_\infty\}. \]
    (7) Finally, consider the vector space of real polynomials
        \[ \mathbb{R}[x] = \Bigl\{ f = \sum_{k=0}^N c_f(k) x^k \Bigm| N \in \mathbb{N}, c_f(k) \in \mathbb{R} \Bigr\} \]
    on which we can define any of the following norms (thinking of $f \in \mathbb{R}[x]$ really as the vector of its coefficients):
        a) $\|f\|_1 = \sum_{k=0}^\infty |c_f(k)|$,
        b) $\|f\|_2 = \bigl( \sum_{k=0}^\infty |c_f(k)|^2 \bigr)^{1/2}$, or
        c) $\|f\|_\infty = \max_{k \geq 0} \{|c_f(k)|\}$.
    We could also think of polynomials as defining continuous functions on $[0, 1]$, thus embedding
        \[ \mathbb{R}[x] \subseteq C^1([0, 1]) \subseteq C([0, 1]), \]
    so that the norm $\|\cdot\|_{C^1([0,1])}$ or $\|\cdot\|_\infty$ may also be used.

    The examples in Example 2.2 all generalize in the obvious way to form normed complex vector spaces, with the exception of (4), where additional conditions on the set $B$ are required (see Exercise 2.5).
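
    The first four norms of Example 2.2 are easy to experiment with numerically. The following is a minimal sketch: the three coordinate norms are computed directly, and the norm attached to a set $B$ in (4) is approximated by bisection, assuming $B$ is given through a membership test in_B and that $v / \alpha$ lies in $B$ for the initial upper bound $\alpha$ (both the function name minkowski_norm and its parameters are illustrative choices).

```python
import math


def norm_1(v):
    """Sum of the absolute values of the coordinates."""
    return sum(abs(x) for x in v)


def norm_2(v):
    """Euclidean norm."""
    return math.sqrt(sum(x * x for x in v))


def norm_inf(v):
    """Largest absolute value of a coordinate."""
    return max(abs(x) for x in v)


def minkowski_norm(v, in_B, hi=1e6, steps=60):
    """Approximate inf{a > 0 : v/a in B} by bisection.

    Assumes B is open, convex, centrally symmetric and bounded, and that
    v/hi already lies in B, so that the admissible a form an interval.
    """
    lo = 0.0
    for _ in range(steps):
        mid = (lo + hi) / 2
        if in_B([x / mid for x in v]):
            hi = mid
        else:
            lo = mid
    return hi
```

    Taking $B$ to be the open Euclidean unit ball recovers norm_2 up to the bisection error, and taking $B$ to be the open unit cube recovers norm_inf, which illustrates how (4) contains (1) and (2) as special cases.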

    Exercise 2.3. Verify that Example 2.2(1),(2),(3),(5),(6), and (7) define normed vector spaces over $\mathbb{R}$ or $\mathbb{C}$.

    Exercise 2.4. Show that Example 2.2(4) defines a real normed vector space.

    Exercise 2.5. (a) Show that for a complex normed vector space $(V, \|\cdot\|)$ the open unit ball
        \[ B = B_1^V(0) = \{v \in V \mid \|v\| < 1\} \]
    has the property that $\alpha B = B$ for any $\alpha \in \mathbb{C}$ with $|\alpha| = 1$.

    (b) Show that if $B \subseteq \mathbb{C}^d$ is open, convex, bounded, and satisfies $\alpha B = B$ for any $\alpha \in \mathbb{C}$ with $|\alpha| = 1$, then there exists a norm on $\mathbb{C}^d$ whose open unit ball is $B$.

    Lemma 2.6 (Associated metric). Suppose that $(V, \|\cdot\|)$ is a normed vector space. Then for every $v, w \in V$ we have
        \[ \bigl| \|v\| - \|w\| \bigr| \leq \|v - w\|. \tag{2.1} \]
    Moreover, writing
        \[ d(v, w) = \|v - w\| \]
    for $v, w \in V$ defines a metric $d$ on $V$ such that the norm function $\|\cdot\| : V \to \mathbb{R}$ is continuous with respect to the topology induced by the metric $d$.

    Proof. For any $v, w \in V$,
        \[ \|v\| = \|v - w + w\| \leq \|v - w\| + \|w\| \]
    and
        \[ \|w\| = \|w - v + v\| \leq \|v - w\| + \|v\| \]
    by Definition 2.1(2) and (3), and the two inequalities together give (2.1).

    To see that $d$ is a metric we need to check the following defining properties for a metric.

    (1) Strict positivity: That $d(v, w) \geq 0$ for all $v, w \in V$ and $d(v, w) = 0$ if and only if $v = w$ is clear by Definition 2.1(1).
    (2) Symmetry: $d(v, w) = d(w, v)$ for all $v, w \in V$ by Definition 2.1(2) with $\alpha = -1$.
    (3) Triangle inequality: we have
        \[ d(u, w) = \|u - w\| = \|u - v + v - w\| \leq \|u - v\| + \|v - w\| = d(u, v) + d(v, w). \]

    Finally, the norm is continuous at $v \in V$ if for every $\varepsilon > 0$ there exists some $\delta > 0$ such that
        \[ w \in B_\delta(v) = \{u \in V \mid d(u, v) < \delta\} \]
    implies that $\bigl| \|w\| - \|v\| \bigr| < \varepsilon$. By (2.1), we may choose $\delta = \varepsilon$ to see this.

    Notice that the triangle inequality makes addition continuous. If we write
        \[ B_\varepsilon(v) = \{w \in V \mid \|w - v\| < \varepsilon\} \]
    for the ball of radius $\varepsilon$ around $v \in V$, then we have
        \[ B_{\varepsilon/2}(v_1) + B_{\varepsilon/2}(v_2) \subseteq B_\varepsilon(v_1 + v_2) \]
    for every $\varepsilon > 0$. This means that $(v, w) \mapsto v + w$ is continuous at $(v_1, v_2)$ and, since $v_1, v_2 \in V$ were arbitrary, shows that addition is continuous.

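    Scalar multiplication is also continuous. To see this, notice that
        \[ \beta w - \alpha v = (\beta - \alpha) w - \alpha (v - w), \]
    so if
        \[ |\beta - \alpha| < \frac{\varepsilon}{\|v\| + 1} \]
    and
        \[ \|w - v\| < \varepsilon \]
    for some $\varepsilon \in (0, 1)$, then
        \[ \|w\| < \|v\| + 1 \]
    and so
        \[ \|\beta w - \alpha v\| < \varepsilon (1 + |\alpha|). \]
    This gives continuity of scalar multiplication at $(\alpha, v)$.

    We now turn to the sense in which the topology induced by a norm determines the norm.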

    Lemma 2.7 (Equivalence of norms). Two norms $\|\cdot\|$ and $\|\cdot\|'$ on the same vector space induce the same topology if and only if there exists a (Lipschitz-)constant $c > 0$ such that
        \[ \frac{1}{c} \|v\| \leq \|v\|' \leq c \|v\| \tag{2.2} \]
    for all $v \in V$. In this case we call the norms equivalent.

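    Proof. If (2.2) holds, then the standard neighborhoods of $v \in V$,
        \[ B_\varepsilon^{\|\cdot\|}(v) = \{w \in V \mid \|w - v\| < \varepsilon\} \]
    and
        \[ B_\varepsilon^{\|\cdot\|'}(v) = \{w \in V \mid \|w - v\|' < \varepsilon\} \]
    with respect to the two norms satisfy
        \[ B_{\varepsilon/c}^{\|\cdot\|}(v) \subseteq B_\varepsilon^{\|\cdot\|'}(v) \subseteq B_{c\varepsilon}^{\|\cdot\|}(v). \]
    This implies that the topologies have the same notion of neighborhood, and so are identical.

    Suppose now that the two topologies are the same, so that $B_1^{\|\cdot\|'}(0)$ is a neighborhood of $0$ in this topology. Then there must be some $\varepsilon > 0$ with
        \[ B_\varepsilon^{\|\cdot\|}(0) \subseteq B_1^{\|\cdot\|'}(0). \]
    Equivalently, $\|v\| < \varepsilon$ implies that $\|v\|' < 1$. For any $v \in V \setminus \{0\}$, if $w = \frac{\varepsilon}{2\|v\|} v$ then
        \[ \|w\| = \Bigl\| \frac{\varepsilon}{2\|v\|} v \Bigr\| = \frac{\varepsilon}{2} < \varepsilon, \]
    so
        \[ \|w\|' = \frac{\varepsilon}{2\|v\|} \|v\|' < 1. \]
    This implies that
        \[ \|v\|' \leq \frac{2}{\varepsilon} \|v\| \]
    for all $v \in V$, giving the second inequality in (2.2). Reversing the roles of $\|\cdot\|$ and $\|\cdot\|'$ gives the first inequality also.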
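
    The constant $c$ in (2.2) can be explored numerically for the coefficient norms $\|\cdot\|_1$ and $\|\cdot\|_\infty$. The sketch below, with hypothetical helper names, computes the smallest constant that works for a given finite list of vectors. On $\mathbb{R}^d$ the ratio $\|v\|_1 / \|v\|_\infty$ never exceeds $d$, while for the polynomials $1 + x + \cdots + x^n$ of Example 2.2(7) it equals $n + 1$, so on $\mathbb{R}[x]$ no single constant can serve for all degrees.

```python
def norm_1(c):
    """Sum of the absolute values of the coefficients."""
    return sum(abs(x) for x in c)


def norm_inf(c):
    """Largest absolute value of a coefficient."""
    return max(abs(x) for x in c)


def best_constant(vectors):
    """Smallest c with norm_1(v) <= c * norm_inf(v) for the given non-zero vectors."""
    return max(norm_1(v) / norm_inf(v) for v in vectors)


def all_ones_polynomial(n):
    """Coefficient vector of 1 + x + ... + x^n."""
    return [1.0] * (n + 1)
```

    For example, best_constant applied to the single polynomial all_ones_polynomial(n) returns $n + 1$, which grows without bound as $n$ increases.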

    The phenomenon seen in the proof of Lemma 2.7, where a property on all of $V$ is determined by the local behavior at $0$, is something that will occur frequently. For $\mathbb{R}^d$ the notion of equivalence of norms has the following property.

    Proposition 2.8 (Equivalence in finite dimensions). If $V = \mathbb{R}^d$ then any two norms on $V$ are equivalent.

    As we will see in the proof, this is related to the compactness of the closed unit ball in $\mathbb{R}^d$.

    Proof of Proposition 2.8. Let $\|\cdot\|_1$ be the norm on $\mathbb{R}^d$ from Example 2.2(3), and let $\|\cdot\|$ be an arbitrary norm on $\mathbb{R}^d$. It is enough to show that these two norms are equivalent. Write $e_1, \dots, e_d$ for the standard basis of $\mathbb{R}^d$, and let $M = \max_{1 \leq i \leq d} \|e_i\|$. Then
        \[ \|v\| = \Bigl\| \sum_{i=1}^d v_i e_i \Bigr\| \leq \sum_{i=1}^d |v_i| \|e_i\| \leq M \|v\|_1, \tag{2.3} \]

    where we have used the triangle inequality generalized by induction to