The Physics of Inflation

A Course for Graduate Students in Particle Physics and Cosmology

by

Daniel Baumann

University of Cambridge

Contents

Preface

I THE QUANTUM ORIGIN OF LARGE-SCALE STRUCTURE

1 Classical Dynamics of Inflation
  1.1 Introduction
  1.2 The Horizon Problem
    1.2.1 FRW Spacetimes
    1.2.2 Causal Structure
    1.2.3 Shock in the CMB
    1.2.4 Quantum Gravity Hocus-Pocus?
  1.3 The Shrinking Hubble Sphere
    1.3.1 Solution of the Horizon Problem
    1.3.2 Solution of the Flatness Problem∗
    1.3.3 Conditions for Inflation
  1.4 The Physics of Inflation
    1.4.1 False Vacuum Inflation
    1.4.2 Slow-Roll Inflation
    1.4.3 Hybrid Inflation
    1.4.4 K-Inflation
  1.5 Outlook

2 Quantum Fluctuations during Inflation
  2.1 Motivation
  2.2 Classical Perturbations
    2.2.1 Comoving Gauge
    2.2.2 Constraint Equations
    2.2.3 Quadratic Action
    2.2.4 Mukhanov-Sasaki Equation
    2.2.5 Mode Expansion
  2.3 Quantum Origin of Cosmological Perturbations
    2.3.1 Canonical Quantization
    2.3.2 Non-Uniqueness of the Vacuum
    2.3.3 Choice of the Physical Vacuum
    2.3.4 Zero-Point Fluctuations in De Sitter
  2.4 Curvature Perturbations from Inflation
    2.4.1 Results for Quasi-De Sitter
    2.4.2 Systematic Slow-Roll Expansion∗
  2.5 Gravitational Waves from Inflation
  2.6 The Lyth Bound

3 Contact with Observations
  3.1 Introduction
  3.2 Superhorizon (Non)-Evolution∗
    3.2.1 Weinberg’s Proof
    3.2.2 Separate Universe Approach
  3.3 From Vacuum Fluctuations to CMB Anisotropies
    3.3.1 Statistics of Temperature Anisotropies
    3.3.2 Transfer Function and Projection Effects
    3.3.3 CMBSimple∗
    3.3.4 Coherent Phases and Superhorizon Fluctuations∗
    3.3.5 CMB Polarization
    3.3.6 Non-Gaussianity
  3.4 From Vacuum Fluctuations to Large-Scale Structure
    3.4.1 Dark Matter Transfer Function
    3.4.2 Galaxy Biasing
  3.5 Future Prospects

4 Reheating after Inflation
  4.1 Introduction
  4.2 Elementary Theory of Reheating
  4.3 Parametric Resonance and Preheating
    4.3.1 QFT in a Time-Dependent Background
    4.3.2 Narrow Resonance
    4.3.3 Broad Resonance
    4.3.4 Termination of Preheating
    4.3.5 Gravitational Waves from Preheating
  4.4 Conclusions

5 Primordial Non-Gaussianity
  5.1 Why Non-Gaussianity?
  5.2 Gaussian and Non-Gaussian Statistics
    5.2.1 Statistics of CMB Anisotropies
    5.2.2 Sources of Non-Gaussianity
    5.2.3 Primordial Bispectrum
    5.2.4 Shape, Running and Amplitude
    5.2.5 Shapes of Non-Gaussianity
  5.3 Quantum Non-Gaussianities
    5.3.1 The in-in Formalism
    5.3.2 Single-Field Inflation
  5.4 Classical Non-Gaussianities
    5.4.1 The δN Formalism
    5.4.2 Inhomogeneous Reheating
  5.5 Large-Scale Structure and Non-Gaussianity
  5.6 Future Prospects

II THE PHYSICS OF INFLATION

6 Effective Field Theory
  6.1 Introduction
  6.2 EFT Fundamentals
    6.2.1 Basic Principles
    6.2.2 A Toy Model
    6.2.3 Tree-Level Matching
    6.2.4 RG Running
    6.2.5 One-Loop Matching
    6.2.6 Naturalness
    6.2.7 Summary
  6.3 The Standard Model as an Effective Theory∗
    6.3.1 The Standard Model
    6.3.2 The GIM Mechanism
    6.3.3 Accidental Symmetries
    6.3.4 Neutrino Masses
    6.3.5 Beyond the Standard Model Physics
  6.4 Conclusions

7 Effective Field Theory and Inflation
  7.1 UV Sensitivity
  7.2 The Eta Problem
  7.3 Large-Field Inflation
  7.4 Non-Gaussianity

8 Supersymmetry and Inflation
  8.1 Introduction
  8.2 Facts about SUSY
    8.2.1 SUSY and Naturalness
    8.2.2 Superspace and Superfields
    8.2.3 Supersymmetric Lagrangians
    8.2.4 Miraculous Cancellations
    8.2.5 Non-Renormalization Theorem
    8.2.6 Supersymmetry Breaking
    8.2.7 Supergravity
    8.2.8 Further Reading
  8.3 SUSY Inflation: Generalities
    8.3.1 The Supergravity Eta-Problem
    8.3.2 Goldstone Bosons in Supergravity
  8.4 SUSY Inflation: A Case Study
    8.4.1 Hybrid Inflation and Naturalness
    8.4.2 SUSY Pseudo-Natural Inflation
    8.4.3 UV Sensitivity
  8.5 Signatures of SUSY Inflation
    8.5.1 Hubble-Mass Scalars
    8.5.2 The Squeezed Limit
    8.5.3 Scale-Dependent Halo Bias
  8.6 Conclusions

9 String Theory and Inflation
  9.1 Introduction
  9.2 Elements of String Theory
    9.2.1 Fields and Effective Actions
    9.2.2 String Compactifications
  9.3 Warped D-brane Inflation
  9.4 Axion Monodromy Inflation
  9.5 Conclusions

Outlook

III Supplementary Material

A The Effective Theory of Single-Field Inflation
  A.1 Introduction
  A.2 Spontaneously Broken Symmetries
    A.2.1 Global Symmetries
    A.2.2 Effective Lagrangian
    A.2.3 A Toy UV-Completion
    A.2.4 Energy Scales
    A.2.5 Gauge Symmetries and Decoupling
  A.3 Effective Theory of Inflation
    A.3.1 Goldstone Action
    A.3.2 Energy Scales
  A.4 Non-Gaussianity
  A.5 Conclusions

B Cosmological Perturbation Theory
  B.1 The Perturbed Universe
    B.1.1 Metric Perturbations
    B.1.2 Matter Perturbations
  B.2 Scalars, Vectors and Tensors
    B.2.1 Helicity and SVT-Decomposition in Fourier Space
    B.2.2 SVT-Decomposition in Real Space
  B.3 Scalars
    B.3.3 Einstein Equations
    B.3.4 Popular Gauges
  B.4 Vectors
  B.5 Tensors

C Exercises

Preface

Unless we accept fine-tuning of initial conditions, the standard Big Bang cosmology suffers

from the so-called horizon problem. This refers to the fact that the cosmic microwave background (CMB) at the time of decoupling naively consisted of about $10^4$ causally disconnected

patches. Yet, we observe an almost perfectly uniform CMB temperature field across super-

horizon scales at recombination. A key goal of modern cosmology is to explain this large-scale

uniformity without resorting to fine-tuning.

Cosmological inflation, an early period of accelerated expansion, solves the horizon problem

dynamically and allows our universe to arise from generic initial conditions. At the same time,

quantum fluctuations during inflation produce small inhomogeneities. The primordial seeds that

grew into galaxies, clusters of galaxies and the temperature anisotropies of the CMB were thus

planted during the first moments of the universe’s existence. By measuring the anisotropies in

the microwave background and the large-scale distribution of galaxies in the sky, we can infer

the spectrum of the primordial perturbations laid down during inflation, and thus probe the

underlying physics of this era. Over the next decade, the inflationary era – perhaps $10^{-30}$ seconds

after the Big Bang – will thus join nucleosynthesis (3 minutes) and recombination (380,000

years) as observational windows into the primordial universe. However, while the workings of

recombination and nucleosynthesis depend on well-tested laws of atomic and nuclear physics

respectively, the ‘physics of inflation’ remains speculative. The Standard Model of particle

physics almost certainly does not contain the right type of fields and interactions to source an

inflationary epoch. To describe inflation we therefore have to leave the comfort of the Standard

Model and explore ‘new physics’ possibly far above the TeV scale. Some of the boldest and most

profound ideas in particle physics come into play at these scales.

Overview

The aim of this course is two-fold: In Part I, we give a first-principles introduction to inflation.

We show how inflation classically solves the horizon problem, while quantum mechanically pro-

viding a mechanism to generate the primordial seeds for the large-scale structure of the universe.

However, despite this phenomenological success of inflation, the microphysical cause for the in-

flationary expansion remains mysterious. In Part II, we discuss our current understanding of

the physics of inflation. We will use this as a welcome excuse to learn some fascinating physics

such as effective field theory, supersymmetry and aspects of string theory.

As an orientation for the reader, we give a brief road map of the course:

Part I: The Quantum Origin of Large-Scale Structure

The structure of Part I of the course is summarized in fig. 1. Let me make a few comments on

the individual chapters:

At the heart of the horizon problem lies the fact that in the conventional Big Bang evolution

the comoving Hubble radius $(aH)^{-1}$ is an increasing function of time. Inflation solves the

horizon problem by reversing this behavior, making $(aH)^{-1}$ temporarily a decreasing function

of time (see fig. 1). Fluctuations that naively seem out of causal contact at recombination hence

become causally connected before inflation. In Chapter 1, we describe this solution to the

horizon problem and the classical dynamics of inflation which underlies it.

Figure 1: A road map for Part I of the lectures. [Diagram of comoving scales versus time: zero-point fluctuations on sub-horizon scales, horizon exit, super-horizon evolution through reheating, horizon re-entry, and the transfer function and projection that link recombination to the CMB observed today. The labels point to Chapters 1 to 5.]

The shrinking comoving Hubble radius is also the key feature of inflation that allows quantum

zero-point fluctuations on sub-horizon scales to become classical fluctuations on super-horizon

scales (see fig. 1). Remarkably, the quantum-mechanical treatment of inflation leads to pri-

mordial fluctuations that are in striking agreement with the observed CMB anisotropies. In

Chapter 2, we will learn about the intricacies of quantum field theory in curved spacetimes,

such as the inflationary quasi-de Sitter spaces. This will culminate in a derivation of the pri-

mordial density fluctuations predicted by inflation. We present the calculation in full detail and

try to avoid ‘cheating’ and approximations. We also explain why inflation predicts a stochastic

background of gravitational waves.

To make contact with observations, the primordial inflationary fluctuations need to be evolved

to late times, when they seed the CMB and the formation of large-scale structures (LSS).

In Chapter 3, we provide this important link between the correlation functions computed at

horizon crossing during inflation and late-time observables. We review Weinberg’s proof that the

curvature perturbation ζ freezes on superhorizon scales during single-field inflation. This allows

us to relate our computation of fluctuations at horizon exit (high energies) to horizon re-entry

(low energies), while remaining ignorant about the details of the physics of the time in between.

The subsequent subhorizon evolution of the fluctuations is well-understood and computable in

perturbation theory.

Although reheating has no (or little) effect on the CMB anisotropies, it is an important

component of any complete theory of inflation. In Chapter 4, we digress to review our current

understanding of the reheating era. As we will see, reheating can be understood as a wonderful

application of quantum field theory in a time-dependent background and involves rich physics

such as Bose condensation, parametric resonance and Schrödinger scattering.

One of the most active areas of current research in theoretical cosmology is the possibility of

non-Gaussianity of the primordial fluctuations. In Chapter 5, we give a detailed discussion

of the computation of higher-order correlation functions during inflation. We discuss both

quantum-mechanical sources of non-Gaussianity (computed in the in-in formalism) and classical

sources of non-Gaussianity (computed in the δN formalism).

Part II: The Physics of Inflation

Part II of the course moves to a critical discussion of the mystery of the physical origin of the

inflationary era:

One of the most powerful organizing principles of theoretical physics is the concept of effective

field theories (EFT). In the absence of a UV-complete theory of high-energy physics, EFTs give

us a way to discuss low-energy physics in a model-independent way, while including possible

high-energy corrections systematically. In Chapter 6, we give an extensive discussion of the

basic principles of EFT. Although we will later use inflation as our main example, many of

the techniques that we will develop in this chapter have a much wider range of applicability.

The slogan “if you can’t understand it in effective field theory, then you haven’t understood it”

applies almost universally in theoretical physics. We end our treatment of EFT with a discussion

of technical naturalness. We illustrate the concept of naturalness with the hierarchy problem of

the Standard Model and the analogous eta problem of inflationary cosmology.

In Chapter 7, we give a brief discussion of the challenges of inflation as an effective theory.

We describe the eta problem, large-field inflation and single-field non-Gaussianity, highlighting

the extraordinary Planck-scale sensitivity of inflation. We discuss the role of global symmetries

in considerations of naturalness in inflation.

In Chapter 8, we discuss supersymmetry (SUSY) as an attractive solution to the radiative

instability of scalar fields. If the symmetry is unbroken, it controls dangerous radiative effects

by enforcing exact cancellations between boson and fermion loops. Even if the supersymmetry

is broken at low energies – as it has to be if it is to describe the real world – the appearance

of supersymmetry at high energies can still help to regulate loop effects. We explain that in

inflation, SUSY is often required to achieve technical naturalness of the quantum-corrected

inflaton action. We illustrate these considerations with a detailed case study of supersymmetric

hybrid inflation.

We end the course, in Chapter 9, with a brief survey of the state-of-the-‘art’ of inflation in

string theory. We explain that the Planck-scale sensitivity of inflation provides both the primary

motivation and the central theoretical challenge for realizing inflation in string theory. We

illustrate these issues through two case studies: warped D-brane inflation and axion monodromy

inflation.

Digressions, exercises and computational details are separated from the main text by horizontal lines.

These parts can be omitted without loss of continuity, but they often illuminate the underlying physics.

Appendix C collects a number of instructive exercises.

Recommended Books and Resources

• Books:

- Mukhanov, Physical Foundations of Cosmology.

- Dodelson, Modern Cosmology.

- Weinberg, Cosmology.

- Mukhanov and Winitzki, Introduction to Quantum Effects in Gravity.

- Terning, Modern Supersymmetry.

- Peskin and Schroeder, An Introduction to Quantum Field Theory.

• Videos:

- Many of the lectures at the recent PiTP schools are excellent:

http://video.ias.edu/pitp-2010 (e.g. Seiberg’s lectures on SUSY) and

http://video.ias.edu/pitp-2011 (e.g. Zaldarriaga’s lectures on the CMB).

- TASI 2009 videos can be found here:

http://physicslearning2.colorado.edu/tasi/tasi_2009/tasi_2009.htm

• Reviews:

- My TASI lectures (arXiv:0907.5424) have been superseded by the present notes.

- Skiba, TASI Lectures on Effective Field Theory (arXiv:1006.2142).

- Luty, TASI Lectures on SUSY Breaking (arXiv:hep-th/0509029).

- Chen, Primordial Non-Gaussianity from Inflation Models (arXiv:1002.1416).

• Papers:

Many of the original papers on inflation are well worth reading:

- Guth’s famous paper (http://www.slac.stanford.edu/cgi-wrap/getdoc/slac-pub-2576.pdf)

has a very clear description of the Big Bang puzzles.

- Maldacena’s paper (arXiv:astro-ph/0210603) contains the first rigorous computation of

the three-point function for slow-roll inflation.

- Kofman, Linde, and Starobinsky (arXiv:hep-ph/9704452) worked out everything we now

know about (p)reheating.

Acknowledgements

I am most grateful to my friends and collaborators for explaining many of the concepts described

in these notes to me, especially Liam McAllister, Daniel Green, and Matias Zaldarriaga.

Part I

The Quantum Origin of Large-Scale Structure

1 Classical Dynamics of Inflation

1.1 Introduction

Running the expansion of the universe back in time, the uniformity of the CMB becomes a

mystery. It is a famous fact that in the conventional Big Bang cosmology the CMB at the time

of decoupling consisted of about $10^4$ causally independent patches. Two points on the sky with

an angular separation exceeding 2 degrees should never have been in causal contact, yet they

are observed to have the same temperature to extremely high precision.

This puzzle is the horizon problem.

As we will see, the horizon problem in the form stated above assumes that no new physics

becomes relevant for the dynamics of the universe at early times (and extremely high energies).

In this chapter, I will explain how a specific form of new physics may lead to a negative pressure

component and quasi-exponential expansion. This period of inflation produces the apparently

acausal correlations in the CMB and hence solves the horizon problem.

Remarkably, inflation also explains why the CMB has small inhomogeneities:

Quantum mechanical zero-point fluctuations during inflation are promoted to cosmic significance

as they are stretched outside of the horizon. When the perturbations re-enter the horizon at

later times, they seed the fluctuations in the CMB. Through explicit calculation one finds that

the primordial fluctuations from inflation are just of the right type—Gaussian, scale-invariant

and adiabatic—to explain the observed spectrum of CMB fluctuations. This remarkable story

will be told in the next chapter. This chapter will be setting the stage by explaining how the

classical dynamics during inflation solves the horizon problem. An effort was made to keep this

description as concise as possible. More details may be found in my previous lectures on the

topic,¹ as well as in the standard textbooks.²

1.2 The Horizon Problem

We start with a lightning review of FRW cosmology and the horizon problem³ of the standard

Big Bang scenario.

1.2.1 FRW Spacetimes

Modern cosmology begins with two observational facts: i) the universe is expanding and ii) on

scales larger than 300 million light years the matter distribution is homogeneous and isotropic. In

fact, as we go back in time the approximation of homogeneity and isotropy is expected to become

increasingly accurate. The average spacetime is then described by the Friedmann-Robertson-Walker (FRW) metric⁴

$$ ds^2 = -dt^2 + a^2(t) \left[ \frac{dr^2}{1 - k r^2} + r^2 d\Omega^2 \right] , \tag{1.2.1} $$

where k = 0, k = +1 and k = −1 for flat, positively curved and negatively curved spacelike

3-hypersurfaces, respectively. For ease of notation we will restrict most of our discussion to the

case $k = 0$.⁵ In that case, the Friedmann equations for the evolution of the scale factor $a(t)$ are

$$ H^2 = \frac{1}{3 M_{\rm pl}^2}\, \rho \qquad {\rm and} \qquad \dot{H} + H^2 = -\frac{1}{6 M_{\rm pl}^2}\, (\rho + 3p) \, , \tag{1.2.2} $$

where $H \equiv \partial_t \ln a$ is the Hubble parameter and $\rho$ and $p$ are the density and pressure of the background stress tensor (here assumed to be a perfect fluid). To study the propagation of light (and hence the causal structure of the FRW universe) it is convenient to define conformal time $\tau$ via the relation

$$ d\tau = \frac{dt}{a(t)} \, . \tag{1.2.3} $$

The FRW metric then factorizes into a static Minkowski metric $\eta_{\mu\nu}$ multiplied by a time-dependent conformal factor $a(\tau)$,

$$ ds^2 = a^2(\tau) \left[ -d\tau^2 + dr^2 + r^2 d\Omega^2 \right] \equiv a^2(\tau)\, \eta_{\mu\nu}\, dx^\mu dx^\nu \, . \tag{1.2.4} $$

¹ D. Baumann, TASI Lectures on Inflation (arXiv:0907.5424).

² Mukhanov, Physical Foundations of Cosmology; Dodelson, Modern Cosmology; Weinberg, Cosmology.

³ Inflation is sometimes motivated by listing a host of other problems, such as the flatness problem, the monopole problem, the entropy problem, etc. However, in my opinion, these are all just close cousins of the horizon problem. In other words, any solution to the horizon problem is likely to solve these secondary problems as well.

⁴ Throughout these notes I will set the speed of light equal to unity, $c \equiv 1$.

⁵ A flat universe is in fact favored by present observations (see fig. 1.5). Furthermore, as we will explain, it is a fundamental prediction of 60 e-folds of inflationary expansion.

1.2.2 Causal Structure

The radial propagation of light is characterized by the following two-dimensional line element

$$ ds^2 = a^2(\tau) \left[ -d\tau^2 + dr^2 \right] . \tag{1.2.5} $$

Just like in Minkowski space, the null geodesics of photons ($ds^2 \equiv 0$) are straight lines at $\pm 45^\circ$ angles in the $\tau$-$r$ plane

$$ r(\tau) = \pm \tau + {\rm const.} \tag{1.2.6} $$

The maximal distance a photon (and hence any particle) can travel between an initial time $t_i$ and later time $t > t_i$ is

$$ \Delta r = \Delta\tau \equiv \tau - \tau_i = \int_{t_i}^{t} \frac{dt'}{a(t')} \, , \tag{1.2.7} $$

i.e. the maximal distance travelled is equal to the amount of conformal time elapsed during the interval $\Delta t = t - t_i$. The initial time is often taken to be the ‘origin of the universe’, $t_i \equiv 0$, defined formally by the initial singularity⁶ $a_i \equiv a(t_i = 0) \equiv 0$. We then get

$$ \Delta r_{\rm max}(t) = \int_0^t \frac{dt'}{a(t')} = \tau(t) - \tau(0) \, . \tag{1.2.8} $$

We call this the comoving (particle) horizon.

The integral defining conformal time may be re-written in the following illuminating way

$$ \tau \equiv \int \frac{dt}{a(t)} = \int (aH)^{-1}\, d\ln a \, . \tag{1.2.9} $$

This shows that the elapsed conformal time depends on the evolution of the comoving Hubble radius $(aH)^{-1}$. For example, for a universe dominated by a fluid with equation of state $w \equiv p/\rho$, we find that this evolves as

$$ (aH)^{-1} \propto a^{\frac{1}{2}(1+3w)} \, . \tag{1.2.10} $$

Note the dependence of the exponent on the combination $(1+3w)$. All familiar matter sources satisfy the strong energy condition (SEC), $1 + 3w > 0$, so it was reasonable for post-Hubble physicists to assume that the comoving Hubble radius increases as the universe expands. Performing the integral in (1.2.9) gives

$$ \tau \propto \frac{2}{(1+3w)}\, a^{\frac{1}{2}(1+3w)} \, , \tag{1.2.11} $$

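up to an irrelevant integration constant. For conventional matter sources the initial singularity is therefore at $\tau_i = 0$,⁷

$$ \tau_i \propto a_i^{\frac{1}{2}(1+3w)} = 0 \, , \qquad {\rm for} \quad w > -\tfrac{1}{3} \, , \tag{1.2.12} $$

and the comoving horizon (1.2.8) is finite,

$$ \Delta r_{\rm max}(t) \propto a(t)^{\frac{1}{2}(1+3w)} \, , \qquad {\rm for} \quad w > -\tfrac{1}{3} \, . \tag{1.2.13} $$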
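As a rough numerical check of (1.2.9)-(1.2.13), the following sketch integrates the comoving Hubble radius over ln a for a single fluid with constant w and compares the result with the closed form (1.2.11). The normalization (the comoving Hubble radius is set to 1 at a = 1), the small but non-zero starting value `a_initial`, and all function names are assumptions made for illustration; they are not part of the notes.

```python
import math


def comoving_hubble_radius(a, w):
    """(aH)^{-1} for one fluid with constant w, normalized to 1 at a = 1 (assumed units)."""
    return a ** (0.5 * (1.0 + 3.0 * w))


def conformal_time(a, w, a_initial=1e-12, steps=20000):
    """Integrate (aH)^{-1} d ln a from a_initial to a with the composite Simpson rule."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    lo, hi = math.log(a_initial), math.log(a)
    h = (hi - lo) / steps
    total = comoving_hubble_radius(a_initial, w) + comoving_hubble_radius(a, w)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * comoving_hubble_radius(math.exp(lo + i * h), w)
    return total * h / 3.0


def conformal_time_exact(a, w):
    """Closed form (1.2.11) with tau_i = 0, valid for w > -1/3."""
    return 2.0 / (1.0 + 3.0 * w) * a ** (0.5 * (1.0 + 3.0 * w))


for name, w in [("matter", 0.0), ("radiation", 1.0 / 3.0)]:
    for a in (1e-3, 1.0):
        print(name, a, conformal_time(a, w), conformal_time_exact(a, w))
```

For w > -1/3 the two columns should agree closely even though the numerical integral starts at a tiny but non-zero scale factor. This illustrates the statement that the integral is dominated by its upper limit and receives vanishing contributions from early times.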

⁶ Of course, the concept of a classical spacetime (and hence the FRW metric) has broken down by that time. We will get back to that below.

⁷ Of course, the actual value of $\tau_i$ is a matter of definition. The invariant statement is that for conventional matter sources the integral in (1.2.8) is dominated by the upper limit and receives vanishing contributions from early times.


1.2.3 Shock in the CMB

A moment’s thought will convince the reader that the finiteness of the conformal time elapsed

between $t_i = 0$ and the time of CMB decoupling $t_{\rm rec}$ implies a serious problem: most spots

in the CMB have non-overlapping past light-cones and hence never were in causal contact (see

fig. 1.1). Why aren’t there order-one fluctuations in the CMB temperature?

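Figure 1.1: Conformal diagram for the standard FRW cosmology. [Diagram: conformal time runs from the Big Bang singularity at $\tau_i = 0$, through the last-scattering surface at recombination, $\tau_{\rm rec}$, to today, $\tau_0$. It shows our past light-cone and the particle horizon of points on the last-scattering surface.]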

CMB correlations. Let us compute the angle subtended by the comoving horizon at recombination.

This is defined as the ratio of the comoving particle horizon at recombination and the comoving angular

diameter distance from us (an observer at redshift z = 0) to recombination (z ≃ 1090) (cf. fig. 1.1)

$$ \theta_{\rm hor} = \frac{d_{\rm hor}}{d_A} . \tag{1.2.14} $$

A fundamental quantity is the comoving distance between redshifts z1 and z2

$$ \tau_2 - \tau_1 = \int_{z_1}^{z_2} \frac{dz}{H(z)} \equiv I(z_1, z_2) . \tag{1.2.15} $$

The comoving particle horizon at recombination is

dhor = τrec − τi ≈ I(zrec,∞) . (1.2.16)

In a flat universe, the comoving angular diameter distance from us to recombination is

dA = τ0 − τrec = I(0, zrec) . (1.2.17)

The angular scale of the horizon at recombination therefore is

$$ \theta_{\rm hor} \equiv \frac{d_{\rm hor}}{d_A} = \frac{I(z_{\rm rec},\infty)}{I(0, z_{\rm rec})} . \tag{1.2.18} $$

Using

$$ H(z) = H_0 \sqrt{\Omega_m (1+z)^3 + \Omega_\gamma (1+z)^4 + \Omega_\Lambda} , \tag{1.2.19} $$

where Ωm = 0.3, ΩΛ = 1 − Ωm, Ωγ = Ωm/(1 + zeq) and zeq = 3400, we can numerically evaluate the integrals I(0, zrec) and I(zrec,∞), to find

$$ \theta_{\rm hor} = 1.16^\circ . \tag{1.2.20} $$

Causal theories should have vanishing correlation functions for

$$ \theta > \theta_c \equiv 2\theta_{\rm hor} = 2.3^\circ . \tag{1.2.21} $$

Inflation explains why we observe correlations in the CMB even for θ ≳ 2°.
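The numerical evaluation of the two integrals can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the substitution u = √a (with a = 1/(1+z)) and the hand-written Simpson rule are choices made here, not part of the text, and distances are expressed in units of 1/H0, which cancels in the ratio (1.2.18).

```python
import math

# Parameters quoted in the text (flat universe).
OMEGA_M = 0.3
Z_EQ = 3400.0
Z_REC = 1090.0
OMEGA_GAMMA = OMEGA_M / (1.0 + Z_EQ)
OMEGA_LAMBDA = 1.0 - OMEGA_M


def integrand(u):
    """H0 dz/H(z) rewritten in terms of u = sqrt(a), with a = 1/(1+z).

    The substitution a = u**2 keeps the integrand finite as a -> 0,
    i.e. as z -> infinity.
    """
    return 2.0 * u / math.sqrt(
        OMEGA_M * u**2 + OMEGA_GAMMA + OMEGA_LAMBDA * u**8
    )


def simpson(f, lo, hi, n=20000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0


def comoving_distance(z1, z2):
    """I(z1, z2) of (1.2.15) in units of 1/H0; z2 may be math.inf."""
    u_hi = (1.0 + z1) ** -0.5
    u_lo = 0.0 if math.isinf(z2) else (1.0 + z2) ** -0.5
    return simpson(integrand, u_lo, u_hi)


d_hor = comoving_distance(Z_REC, math.inf)  # particle horizon at recombination
d_A = comoving_distance(0.0, Z_REC)         # distance from us to recombination
theta_hor_deg = math.degrees(d_hor / d_A)
print(f"theta_hor = {theta_hor_deg:.2f} degrees")
```

The result should land close to the value quoted in (1.2.20); small differences can arise from the integration scheme and from rounding of the input parameters.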

1.2.4 Quantum Gravity Hocus-Pocus?

Let me digress briefly to make an important qualifier: when we inferred that the total conformal

time between the singularity and decoupling is finite and small, we included times in the integral

in (1.2.9) that were arbitrarily close to the initial singularity:

$$ \Delta\tau = \underbrace{\int_0^{\delta t} \frac{dt'}{a(t')}}_{\rm QG?} + \underbrace{\int_{\delta t}^{t} \frac{dt'}{a(t')}}_{\rm inflation?} . \tag{1.2.22} $$

However, in the first integral in (1.2.22) we have no reason to trust the classical geometry

(1.2.1). By stating the horizon problem as we did, we were hence implicitly assuming that the

breakdown of General Relativity in the regime close to the singularity does not lead to large

contributions to the conformal time: δτ ≪ ∆τ. This assumption may be incorrect and there

may, in fact, be no horizon problem in a complete theory of quantum gravity.8 In the absence

of an alternative solution to the horizon problem this is a completely reasonable attitude to

take. However, I will now show that inflation provides a very simple and computable solution

to the horizon problem. Effectively, this is achieved by modifying the scale factor evolution in

the second integral in (1.2.22), i.e. in the classical regime. I then leave it to the reader to decide

if this solution or a version of ‘quantum gravity hocus-pocus’ is preferable.

1.3 The Shrinking Hubble Sphere

Our presentation of the horizon problem has highlighted the fundamental role played by the

growing Hubble sphere of the standard Big Bang cosmology. A simple solution to the horizon

problem therefore suggests itself: conjecture a phase of decreasing Hubble radius in the early

history of the universe,

$$ \frac{d}{dt}(aH)^{-1} < 0 . \tag{1.3.23} $$

If this lasts long enough, the horizon problem may be avoided.

1.3.1 Solution of the Horizon Problem

As noted earlier, a decreasing Hubble radius requires a violation of the SEC, 1 + 3w < 0,

cf. (1.2.10). We then notice that the Big Bang singularity is now pushed to negative conformal

time,9

$$ \tau_i \propto \frac{2}{(1+3w)}\, a_i^{\frac{1}{2}(1+3w)} = -\infty , \quad \text{for } w < -\tfrac{1}{3} . \tag{1.3.24} $$

8 I thank Erik Verlinde for an interesting debate on this important issue.

9 The integral in (1.2.8) is now dominated by the lower limit.


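[Figure: conformal diagram. Vertical axis: conformal time, running from the Big Bang singularity (τi = −∞) through inflation to reheating (τ = 0), recombination (τrec, the last-scattering surface) and today (τ0). The past light-cones, the particle horizon and the region of causal contact are marked.]

Figure 1.2: Conformal diagram for inflationary cosmology.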

This implies that there was much more conformal time between the singularity and decoupling

than we had thought! Fig. 1.2 shows the new conformal diagram. The past light cones of widely

separated points in the CMB now had time to intersect before the time τ = 0. In inflationary

cosmology, τ = 0 isn’t the initial singularity, but instead becomes the time of reheating. There

is time both before and after τ = 0.

A decreasing comoving horizon means that large scales entering the present universe were

inside the horizon before inflation (see fig. 1.3). Causal physics before inflation therefore had

time to establish spatial homogeneity. With a period of inflation, the uniformity of the CMB is

not a mystery anymore.

1.3.2 Solution of the Flatness Problem∗

In footnote 3, I advertised that any solution to the horizon problem also solves the other Big

Bang puzzles. Let me therefore demonstrate that a shrinking Hubble sphere indeed solves the

flatness problem.

Consider adding spatial curvature to the Friedmann equation (1.2.2),

$$ H^2 = \frac{\rho}{3M_{\rm pl}^2} - \frac{k}{a^2} . \tag{1.3.25} $$

Dividing both sides by H², we can write this as

$$ 1 - \Omega(a) = \frac{-k}{(aH)^2} , \tag{1.3.26} $$


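[Figure: comoving scales versus time (ln a). Curves: the comoving Hubble radius (aH)^−1 and a comoving scale k^−1. Marked: sub-horizon regime, horizon exit, inflation, reheating, horizon re-entry, CMB recombination, today.]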

Figure 1.3: Solution of the horizon problem. Scales of cosmological interest were larger than the

Hubble radius until a ∼ 10−5 (where today is at a(t0) ≡ 1). However, at very early times, before

inflation operated, all scales of interest were smaller than the Hubble radius and therefore susceptible to

microphysical processing. Similarly, at very late times, the scales of cosmological interest are back within

the Hubble radius.

where

$$ \Omega(a) \equiv \frac{\rho(a)}{\rho_{\rm crit}(a)} , \qquad \rho_{\rm crit} \equiv 3M_{\rm pl}^2 H^2 . \tag{1.3.27} $$

The deviation of the normalized density parameter Ω from unity is a measure of the curvature

of the universe. From observations we know that today |1 − Ω(a0)| ≲ 0.01. However, if there
was some amount of spatial curvature in the early universe, we have to worry that it will grow
with time. Conversely, in order to explain the flatness of the universe today, we have to explain
a much more extreme flatness at early times, e.g. |1 − Ω(aGUT)| ≲ 10^−55. From eq. (1.3.26)

we see that the time evolution of the curvature parameter |1 − Ω(a)| again relates to the time

evolution of the comoving Hubble radius (aH)−1. Whenever (aH)−1 is an increasing function

of time, curvature grows. In contrast, during inflation, when (aH)−1 decreases, the universe is

driven towards flatness. This solves the flatness problem. The solution Ω = 1 is an attractor

during inflation.

Exercise. Show that

$$ \frac{d\Omega}{d\ln a} = (1+3w)\,\Omega(\Omega - 1) . \tag{1.3.28} $$

This makes it apparent that Ω = 1 is an unstable fixed point if the strong energy condition is satisfied,

but becomes an attractor during inflation.
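The fixed-point behaviour of (1.3.28) is easy to see numerically. The sketch below integrates the flow with a hand-written RK4 stepper for a radiation-like fluid (w = 1/3) and for w = −1, and compares with the closed form obtained by noting that (1.3.28) makes 1/Ω − 1 scale as exp[(1+3w)N]. The initial values and the five e-folds of evolution are arbitrary illustrative choices, not numbers from the text.

```python
import math


def omega_flow(omega0, w, n_efolds, steps=5000):
    """Integrate dOmega/dN = (1+3w) Omega (Omega-1) with RK4, where N = ln a."""
    def rhs(omega):
        return (1.0 + 3.0 * w) * omega * (omega - 1.0)

    h = n_efolds / steps
    omega = omega0
    for _ in range(steps):
        k1 = rhs(omega)
        k2 = rhs(omega + 0.5 * h * k1)
        k3 = rhs(omega + 0.5 * h * k2)
        k4 = rhs(omega + h * k3)
        omega += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return omega


def omega_exact(omega0, w, n_efolds):
    """Closed form: (1/Omega - 1) scales as exp[(1+3w) N]."""
    x = (1.0 / omega0 - 1.0) * math.exp((1.0 + 3.0 * w) * n_efolds)
    return 1.0 / (1.0 + x)


# Radiation (w = 1/3): a tiny initial deviation from Omega = 1 grows.
omega_rad = omega_flow(1.0 - 1e-6, 1.0 / 3.0, 5.0)
# Inflation (w = -1): an order-one deviation is diluted away.
omega_inf = omega_flow(0.5, -1.0, 5.0)
print(omega_rad, omega_inf)
```

With the strong energy condition satisfied, the deviation from Ω = 1 is amplified by many orders of magnitude within a few e-folds; with w = −1 it shrinks by the same factor.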

1.3.3 Conditions for Inflation

Decreasing comoving horizon. I like the shrinking Hubble sphere as the fundamental definition

of inflation since it most directly relates to the horizon problem and is key for the inflationary

mechanism of generating fluctuations.

However, before we move to a description of the physics that can lead to a shrinking Hubble

sphere, we show that this definition of inflation is equivalent to other popular ways of describing


inflation:

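$$ \frac{d}{dt}(aH)^{-1} < 0 \quad \Rightarrow \quad \varepsilon \equiv -\frac{\dot H}{H^2} < 1 \quad \Leftrightarrow \quad \frac{d^2a}{dt^2} > 0 \quad \Leftrightarrow \quad \rho + 3p < 0 . $$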

Accelerated expansion. From the relation

$$ \frac{d}{dt}(aH)^{-1} = \frac{d}{dt}(\dot a)^{-1} = -\frac{\ddot a}{(\dot a)^2} , \tag{1.3.29} $$

we see that a shrinking comoving Hubble radius implies accelerated expansion

$$ \frac{d^2a}{dt^2} > 0 . \tag{1.3.30} $$

This explains why inflation is often defined as a period of accelerated expansion.

Slowly-varying Hubble parameter. Alternatively, we may write

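$$ \frac{d}{dt}(aH)^{-1} = -\frac{\dot a H + a\dot H}{(aH)^2} = -\frac{1}{a}(1-\varepsilon) , \quad \text{where} \quad \varepsilon \equiv -\frac{\dot H}{H^2} > 0 . \tag{1.3.31} $$

The shrinking Hubble sphere therefore also corresponds to

$$ \varepsilon = -\frac{\dot H}{H^2} = -\frac{d\ln H}{dN} < 1 . \tag{1.3.32} $$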

Here, we have defined dN ≡ d ln a = Hdt, which measures the number of e-folds N of inflationary

expansion. Eq. (1.4.36) implies that the fractional change of the Hubble parameter per e-fold is

small. Moreover, to solve the cosmological problems we want inflation to last for a sufficiently

long time (usually at least N ∼ 40 to 60 e-folds). To achieve this requires ε to remain small for

a sufficiently large number of Hubble times. This condition is measured by a second parameter

$$ \eta \equiv \frac{\dot\varepsilon}{H\varepsilon} = \frac{d\ln\varepsilon}{dN} . \tag{1.3.33} $$

For |η| < 1 the fractional change of ε per Hubble time is small and inflation persists.

Negative pressure. What forms of stress-energy source accelerated expansion? Assuming a

perfect fluid with pressure p and density ρ, we consult the Friedmann equations in (1.2.2),

$$ \dot H + H^2 = -\frac{1}{6M_{\rm pl}^2}(\rho + 3p) = -\frac{H^2}{2}\left(1 + \frac{3p}{\rho}\right) , \tag{1.3.34} $$

to find that

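$$ \varepsilon = -\frac{\dot H}{H^2} = \frac{3}{2}\left(1 + \frac{p}{\rho}\right) < 1 \quad \Leftrightarrow \quad w \equiv \frac{p}{\rho} < -\frac{1}{3} , \tag{1.3.35} $$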

i.e. inflation requires negative pressure or a violation of the strong energy condition. How this

can arise in a physical theory will be explained in the next section. We will see that there is

nothing sacred about the strong energy condition and that it can easily be violated.
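To tie the equivalent conditions together, here is a minimal sketch that evaluates ε from (1.3.35) and the exponent of the comoving Hubble radius from (1.2.10) for a few constant-w fluids. The function names and the choice of example fluids are illustrative assumptions.

```python
def epsilon_from_w(w):
    """epsilon = (3/2)(1 + w), eq. (1.3.35), for a perfect fluid with constant w."""
    return 1.5 * (1.0 + w)


def hubble_radius_exponent(w):
    """Exponent n in (aH)^-1 proportional to a^n, eq. (1.2.10)."""
    return 0.5 * (1.0 + 3.0 * w)


fluids = {
    "radiation": 1.0 / 3.0,
    "matter": 0.0,
    "cosmological constant": -1.0,
}

for name, w in fluids.items():
    eps = epsilon_from_w(w)
    n = hubble_radius_exponent(w)
    # Inflation requires eps < 1, equivalently a negative exponent n.
    print(f"{name:22s} eps = {eps:.2f}  exponent = {n:+.2f}  inflates: {eps < 1.0}")
```

Only the fluid with w < −1/3 gives ε < 1 and a shrinking comoving Hubble radius, in line with the chain of equivalences above.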


1.4 The Physics of Inflation

We have shown that a given FRW background with time-dependent Hubble parameter H(t)

corresponds to cosmic acceleration if and only if

$$ \varepsilon \equiv -\frac{\dot H}{H^2} < 1 . \tag{1.4.36} $$

Sustaining this condition for a sufficiently long time requires

$$ |\eta| \equiv \frac{|\dot\varepsilon|}{H\varepsilon} \ll 1 , \tag{1.4.37} $$

i.e. the fractional change of ε per Hubble time is small. In this section, we discuss what

microscopic physics can lead to these conditions.

1.4.1 False Vacuum Inflation

The first version of inflation considered a universe dominated by the constant energy density

of a metastable false vacuum. This leads to an exponentially expanding de Sitter space with

H = const., and hence ε = η = 0. However, classically, false vacuum inflation never ends.

Quantum-mechanically, tunnelling from the false vacuum to the true vacuum ends inflation

locally, but the post-inflationary universe looks nothing like our universe. The universe is either

empty or much too inhomogeneous. This is the graceful exit problem of old inflation. Any

successful inflationary mechanism has to include a way of ending inflation and successfully

reheating the universe. We will have to work a bit harder.

1.4.2 Slow-Roll Inflation

Consider a scalar field φ, the inflaton, minimally coupled to Einstein gravity10

$$ S = \int d^4x \sqrt{-g} \left[ \frac{M_{\rm pl}^2}{2} R - \frac{1}{2} g^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi - V(\phi) \right] , \tag{1.4.38} $$

where R is the four-dimensional Ricci scalar derived from the metric gµν and V(φ) is so far an arbitrary function.

10 In principle, we could imagine a non-minimal coupling between the inflaton and the graviton; however, in

practice, non-minimally coupled theories can be transformed to minimally coupled form by a field redefinition.

Similarly, we could entertain the possibility that the Einstein-Hilbert part of the action is modified at high

energies. However, the simplest examples for this UV-modification of gravity, so-called f(R) theories, can again

be transformed into a minimally coupled scalar field with potential V (φ).


The time evolution of the homogeneous mode of the inflaton φ(t) is governed by the Klein-

Gordon equation

$$ \ddot\phi + 3H\dot\phi = -V' , \tag{1.4.39} $$

where the size of the Hubble friction is determined by the Friedmann equation

$$ H^2 = \frac{1}{3M_{\rm pl}^2}\left[ \frac{1}{2}\dot\phi^2 + V \right] . \tag{1.4.40} $$

From (1.4.39) and (1.4.40) we derive the continuity equation

$$ \dot H = -\frac{1}{2}\frac{\dot\phi^2}{M_{\rm pl}^2} . \tag{1.4.41} $$

Substituting this into the definition of ε, we find

$$ \varepsilon = \frac{\frac{1}{2}\dot\phi^2}{M_{\rm pl}^2 H^2} . \tag{1.4.42} $$

Inflation therefore occurs if the potential energy, V, dominates over the kinetic energy, $\frac{1}{2}\dot\phi^2$. For

this condition to persist the acceleration of the scalar field has to be small. To assess this, it is

useful to define the dimensionless acceleration per Hubble time

$$ \delta \equiv -\frac{\ddot\phi}{H\dot\phi} . \tag{1.4.43} $$

Taking the time-derivative of (1.4.42) we find

η = 2(ε− δ) . (1.4.44)

Hence, if ε, |δ| ≪ 1 then both H and ε have small fractional changes per e-fold: ε, |η| ≪ 1.

So far, no approximations have been made. We simply noted that in a regime where ε, |δ| ≪ 1, inflation persists. We now use these conditions to simplify the equations of motion. This is called the slow-roll approximation. The condition $\varepsilon = \frac{1}{2}\dot\phi^2/(M_{\rm pl}^2 H^2) \ll 1$ implies $\frac{1}{2}\dot\phi^2 \ll V$ and hence

leads to the following simplification of the Friedmann equation

$$ H^2 \approx \frac{V}{3M_{\rm pl}^2} . \tag{1.4.45} $$

The condition $|\delta| = |\ddot\phi|/(H|\dot\phi|) \ll 1$ simplifies the Klein-Gordon equation to

$$ 3H\dot\phi \approx -V' . \tag{1.4.46} $$

Substituting (1.4.45) and (1.4.46) into (1.4.42) gives

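$$ \varepsilon = -\frac{\dot H}{H^2} = \frac{\frac{1}{2}\dot\phi^2}{M_{\rm pl}^2 H^2} \approx \frac{M_{\rm pl}^2}{2}\left(\frac{V'}{V}\right)^2 \equiv \varepsilon_v . \tag{1.4.47} $$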

Furthermore, taking the time-derivative of (1.4.46),

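$$ 3\dot H\dot\phi + 3H\ddot\phi = -V''\dot\phi , \tag{1.4.48} $$

leads to

$$ \delta + \varepsilon = -\frac{\ddot\phi}{H\dot\phi} - \frac{\dot H}{H^2} \approx M_{\rm pl}^2 \frac{V''}{V} \equiv \eta_v . \tag{1.4.49} $$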

Hence, a convenient way to assess a potential V (φ) is to compute the potential slow-roll param-

eters11 εv and ηv. When these are small, slow-roll inflation occurs:

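$$ \varepsilon_v \equiv \frac{M_{\rm pl}^2}{2}\left(\frac{V'}{V}\right)^2 \ll 1 \quad \text{and} \quad |\eta_v| \equiv M_{\rm pl}^2 \frac{|V''|}{V} \ll 1 . \tag{1.4.50} $$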

The amount of inflation is measured by the number of e-folds of accelerated expansion

$$ N \equiv \int_{a_i}^{a_f} d\ln a = \int_{t_i}^{t_f} H(t)\, dt , \tag{1.4.51} $$

where ti and tf are defined as the times when ε(ti) = ε(tf ) ≡ 1. In the slow-roll regime we can

use

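$$ H\,dt = \frac{H}{\dot\phi}\,d\phi \approx -\frac{3H}{V'}\cdot H\,d\phi \approx \frac{1}{\sqrt{2\varepsilon_v}}\frac{d\phi}{M_{\rm pl}} \tag{1.4.52} $$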

to write (1.4.51) as an integral in the field space of the inflaton

$$ N = \int_{\phi_i}^{\phi_f} \frac{1}{\sqrt{2\varepsilon_v}}\frac{d\phi}{M_{\rm pl}} , \tag{1.4.53} $$

where φi and φf are defined as the boundaries of the interval where εv < 1. The largest scales

observed in the CMB are produced some 40 to 60 e-folds before the end of inflation

$$ N_{\rm cmb} = \int_{\phi_{\rm cmb}}^{\phi_f} \frac{1}{\sqrt{2\varepsilon_v}}\frac{d\phi}{M_{\rm pl}} \approx 40 - 60 . \tag{1.4.54} $$

A successful solution to the horizon problem requires at least Ncmb e-folds of inflation.

Case study: m2φ2 inflation. As an example, let us give the slow-roll analysis of arguably the simplest

model of inflation: single field inflation driven by a mass term

$$ V(\phi) = \frac{1}{2} m^2\phi^2 . \tag{1.4.55} $$

The slow-roll parameters are

$$ \varepsilon_v(\phi) = \eta_v(\phi) = 2\left(\frac{M_{\rm pl}}{\phi}\right)^2 . \tag{1.4.56} $$

To satisfy the slow-roll conditions εv, |ηv| < 1, we therefore need to consider super-Planckian values for

the inflaton

$$ \phi > \sqrt{2}\, M_{\rm pl} \equiv \phi_f . \tag{1.4.57} $$

The relation between the inflaton field value and the number of e-folds before the end of inflation is

$$ N(\phi) = \frac{\phi^2}{4M_{\rm pl}^2} - \frac{1}{2} . \tag{1.4.58} $$

Fluctuations observed in the CMB are created at

$$ \phi_{\rm cmb} = 2\sqrt{N_{\rm cmb}}\, M_{\rm pl} \sim 15\, M_{\rm pl} . \tag{1.4.59} $$
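The slow-roll bookkeeping of this case study can be checked numerically. The sketch below evaluates the e-fold integral (1.4.53) for a generic potential and compares it with the closed form (1.4.58). The helper names, the value of m (which cancels in εv and N) and the choice Ncmb = 60 are illustrative assumptions; the integral is taken over |dφ| so that the result is positive for a field rolling towards smaller values.

```python
import math

M_PL = 1.0  # work in units of the reduced Planck mass


def eps_v(phi, V, dV):
    """Potential slow-roll parameter of (1.4.47)."""
    return 0.5 * M_PL**2 * (dV(phi) / V(phi)) ** 2


def efolds(phi_start, phi_end, V, dV, n=2000):
    """Slow-roll e-folds between two field values, cf. (1.4.53).

    Integrates over |dphi| with a composite Simpson rule (n even).
    """
    lo, hi = sorted((phi_start, phi_end))

    def f(phi):
        return 1.0 / (M_PL * math.sqrt(2.0 * eps_v(phi, V, dV)))

    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0


m = 1.0e-6  # hypothetical mass in Planck units; it drops out of eps_v and N


def V(phi):
    return 0.5 * m**2 * phi**2


def dV(phi):
    return m**2 * phi


phi_f = math.sqrt(2.0) * M_PL           # end of inflation, (1.4.57)
phi_cmb = 2.0 * math.sqrt(60.0) * M_PL  # (1.4.59) with N_cmb = 60
N_numeric = efolds(phi_cmb, phi_f, V, dV)
N_formula = phi_cmb**2 / (4.0 * M_PL**2) - 0.5  # (1.4.58)
print(N_numeric, N_formula)
```

The same `efolds` helper can be pointed at any other potential by swapping `V` and `dV`, as long as the slow-roll approximation is trusted over the integration range.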

11 In contrast, the parameters ε and η are often called the Hubble slow-roll parameters. During slow-roll the parameters are related as follows: εv ≈ ε and ηv ≈ 2ε − ½η.


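[Figure: phase space of the inflaton, with φ on the horizontal axis and $\dot\phi$ on the vertical axis. Trajectories converge onto the attractor lines at $\dot\phi = \pm\sqrt{2/3}\, m M_{\rm pl}$.]

Figure 1.4: Phase space diagram of m²φ² inflation.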

Finally, let us comment that slow-roll inflation for the m2φ2 potential is an attractor solution. To see

this you should study the phase space diagram using12

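$$ \frac{d\dot\phi}{d\phi} = -\frac{\sqrt{\frac{3}{2}}\,\frac{1}{M_{\rm pl}}\left(\dot\phi^2 + m^2\phi^2\right)^{1/2}\dot\phi + m^2\phi}{\dot\phi} . \tag{1.4.60} $$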

The result is portrayed in fig. 1.4.13
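The attractor behaviour can also be seen by integrating the Klein-Gordon and Friedmann equations directly in time, rather than through the phase-space form (1.4.60). The following is a minimal sketch under stated assumptions: units with Mpl = m = 1, a fixed-step RK4 integrator, and three arbitrarily chosen initial velocities at φ = 15 Mpl. In the slow-roll regime, (1.4.45) and (1.4.46) give $\dot\phi \approx -\sqrt{2/3}\, m M_{\rm pl}$ for φ > 0, which is the value the trajectories are compared against.

```python
import math

M_PL = 1.0  # reduced Planck mass
M = 1.0     # inflaton mass m; time is then measured in units of 1/m


def rhs(phi, phidot):
    """Klein-Gordon equation (1.4.39) for V = m^2 phi^2 / 2,
    with H taken from the Friedmann equation (1.4.40)."""
    hubble = math.sqrt((0.5 * phidot**2 + 0.5 * M**2 * phi**2) / (3.0 * M_PL**2))
    return phidot, -3.0 * hubble * phidot - M**2 * phi


def evolve(phi, phidot, t_end=2.0, dt=1.0e-3):
    """Fixed-step RK4 integration of (phi, phidot) from t = 0 to t_end."""
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(phi, phidot)
        k2 = rhs(phi + 0.5 * dt * k1[0], phidot + 0.5 * dt * k1[1])
        k3 = rhs(phi + 0.5 * dt * k2[0], phidot + 0.5 * dt * k2[1])
        k4 = rhs(phi + dt * k3[0], phidot + dt * k3[1])
        phi += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        phidot += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return phi, phidot


attractor = -math.sqrt(2.0 / 3.0) * M * M_PL
finals = [evolve(15.0, v0) for v0 in (-5.0, 0.0, 5.0)]
for phi, phidot in finals:
    print(f"phi = {phi:6.3f}   phidot = {phidot:+.4f}   attractor = {attractor:+.4f}")
```

Hubble friction erases the memory of the initial velocity within a fraction of a time unit, after which all three trajectories roll at nearly the same speed.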

1.4.3 Hybrid Inflation

In single-field slow-roll inflation, a single field governs both the inflationary dynamics (ε ≪ 1) and the exit from inflation (ε → 1).14 In contrast, in hybrid inflation the shape of the inflaton potential is decoupled from the exit from inflation. This is achieved by coupling the inflaton field φ to a second field ψ. Consider, for example, the following Lagrangian

$$ \mathcal{L} = -\frac{1}{2}(\partial_\mu\phi)^2 - \frac{1}{2}(\partial_\mu\psi)^2 - V(\phi) - \frac{1}{4\lambda}(M^2 - \lambda\psi^2)^2 - \frac{g^2}{2}\phi^2\psi^2 , \tag{1.4.61} $$

where $V(\phi) \ll \frac{M^4}{4\lambda}$, so that the dominant contribution to the inflationary energy density is coming from the false vacuum energy of the symmetry breaking potential $V(\psi) \equiv \frac{1}{4\lambda}(M^2 - \lambda\psi^2)^2$. The coupling between φ and ψ induces an effective mass of the second field that depends on the value of the inflaton

$$ m_\psi^2(\phi) = -M^2 + g^2\phi^2 . \tag{1.4.62} $$

For large values of the inflaton field, φ > φc ≡ M/g, the field ψ is stabilized at its only minimum at ψ = 0. During that phase, ψ is very massive and can be integrated out, so that the theory reduces to that of single-field slow-roll inflation. However, for φ < φc, the waterfall field ψ becomes tachyonic and ends inflation.

12 To arrive at eq. (1.4.60) we substituted $\ddot\phi = \dot\phi\, d\dot\phi/d\phi$ into the Klein-Gordon equation.

13 Figure reproduced from V. Mukhanov, Physical Foundations of Cosmology.

14 This then often requires that the field moves over a super-Planckian field range during the 60 e-folds of inflation.


1.4.4 K-Inflation

Slow-roll inflation assumes that the kinetic term is canonical, i.e. $\mathcal{L}_{\rm s.r.} = X - V(\phi)$, where $X \equiv -\frac{1}{2}(\partial_\mu\phi)^2$. As we have seen, this puts strong constraints on the shape of the potential V(φ) via the potential slow-roll conditions, εv, |ηv| ≪ 1. However, these conditions for inflation are not absolute, but assume the slow-roll approximations. In contrast, the Hubble slow-roll conditions, ε, |η| ≪ 1, don’t make any approximations, and allow for a larger spectrum of inflationary backgrounds. In particular, the constraints on the inflationary potential can potentially be relaxed if higher-derivative corrections to the kinetic term were dynamically relevant during inflation, i.e. $|\dot H| \ll H^2$ not because the theory is potential-dominated, but because it allows non-trivial dynamics.

A useful way to describe these effects is by the following action,

$$ S = \int d^4x\sqrt{-g}\left[\frac{M_{\rm pl}^2}{2}R + P(X,\phi)\right] , \tag{1.4.63} $$

where

$$ P(X,\phi) = \sum_n c_n(\phi)\frac{X^n}{\Lambda^{4n-4}} . \tag{1.4.64} $$

For X ≪ Λ⁴, the dynamics reduces to that of slow-roll inflation, so we are now interested in
the limit X ∼ Λ⁴. Naively, this looks like playing with fire, since X/Λ⁴ controls the derivative
expansion in (1.4.64). In particular, in the limit X → Λ⁴ we have to worry about the appearance

of unstable ‘ghost’ states and the stability under radiative corrections. Specifically, in the absence

of symmetries, there is no way to protect the coefficients cn in (1.4.64) from quantum corrections.

The predictions derived from (1.4.64) then can’t be trusted. However, sometimes the theory is

equipped with a symmetry that forbids large renormalizations of these coefficients. This is

the case, for instance, in Dirac-Born-Infeld (DBI) inflation, where a higher-dimensional boost

symmetry protects the special form of the Lagrangian

$$ P(X,\phi) = -\Lambda^4(\phi)\sqrt{1 + \frac{X}{\Lambda^4(\phi)}} - V(\phi) . \tag{1.4.65} $$

In this case, the boost symmetry forces quantum corrections to involve the two-derivative com-

bination ∇∇φ. We stress that only when they come with protective symmetries are P (X)–

theories really interesting and predictive theories. The stress-energy tensor arising from (1.4.63)

has pressure P and energy density

ρ = 2XP,X − P , (1.4.66)

where P,X denotes a derivative with respect to X. The inflationary parameter (1.4.36) becomes

$$ \varepsilon = -\frac{\dot H}{H^2} = \frac{3XP_{,X}}{2XP_{,X} - P} . \tag{1.4.67} $$

The condition for inflation is still ε ≪ 1, which now is a condition on the functional form of

P (X). However, it should be remembered that unless a protective symmetry is identified, there

is no guarantee that the P (X)–theory is radiatively stable.

The fluctuations in P (X)–theories have a number of interesting features. First, in the limit

X ∼ Λ⁴, they propagate with a non-trivial speed of sound

$$ c_s^2 = \frac{dP}{d\rho} = \frac{P_{,X}}{P_{,X} + 2XP_{,XX}} \ll 1 . \tag{1.4.68} $$


The limit of small sound speed implies enhanced interactions in the cubic and quartic Lagrangian.

This leads to a large amount of non-Gaussianity. In Chapter 5, we will explain how those non-Gaussianities are calculated.

1.5 Outlook

In this chapter, we discussed the classical dynamics of inflation (ℏ = 0) and explained how
it provides a simple solution to the horizon problem. Inflation therefore explains the large-scale homogeneity, isotropy and flatness of the universe. In the next chapter, I will present
the quantum limit of inflation (ℏ ≠ 0) and show that it provides a beautiful mechanism to

explain the observed CMB fluctuations. The evolution of the Hubble sphere in fig. 1.3 will play

a fundamental role in this story. It allows quantum zero-point fluctuations of the inflaton field

to lead to primordial density fluctuations of precisely the right type to account for the observed

CMB anisotropies.

[Figure: CMB angular power spectrum. Axes: multipole moment and angular scale (from 90° down to 0.2°). Annotations: homogeneous, isotropic, flat, scale-invariant, superhorizon, adiabatic, Gaussian.]

Figure 1.5: The observational evidence for inflation.

A compact representation of CMB data is in terms of the angular power spectrum (see fig. 1.5)

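$$ C_\ell \equiv \frac{1}{2\ell+1}\sum_m |a_{\ell m}|^2 , \quad \text{where} \quad \frac{\Delta T(\theta,\phi)}{T} = \sum_{\ell,m} a_{\ell m} Y_{\ell m}(\theta,\phi) . \tag{1.5.69} $$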

All the predictions of inflation are directly (or indirectly) encoded in the CMB power spectrum:

On large scales the universe is

1a) homogeneous: the temperature fluctuations are small: ∆T ≲ 10 µK,

1b) isotropic: little information is lost by the sum over the $a_{\ell m}$’s in (1.5.69),

1c) flat: the first peak of the power spectrum is at ℓ ∼ 200.

Its small-scale fluctuations are

2a) superhorizon: the power doesn’t vanish for θ > 2°,

2b) scale-invariant: the primordial power is nearly independent of scale,

2c) Gaussian: little information is lost by reducing the data to the power spectrum,

2d) adiabatic: the presence of the acoustic peaks constrains isocurvature fluctuations.

2 Quantum Fluctuations during Inflation

2.1 Motivation

In this chapter and the next, we discuss the primordial origin of the temperature variations in

the CMB. The main goal will be to show how quantum fluctuations in quasi-de Sitter space

produce a spectrum of fluctuations that accurately matches the observations.

The reason why inflation inevitably produces fluctuations is simple: as we have seen in the pre-

vious chapter, the inflaton evolution φ(t) governs the energy density of the early universe ρ(t)

and hence controls the end of inflation. Essentially, φ plays the role of a local clock reading

off the amount of inflationary expansion remaining. Because microscopic clocks are quantum-

mechanical objects with necessarily some variance (by the uncertainty principle), the inflaton

will have spatially varying fluctuations δφ(t,x) ≡ φ(t,x) − φ(t). These fluctuations imply that

different regions of space inflate by different amounts. In other words, there will be local differences in the time when inflation ends, δt(x). Moreover, these differences in the local expansion

histories lead to differences in the local densities after inflation. In quantum theory, local fluctua-

tions in δρ(t,x) and hence ultimately in the CMB temperature ∆T (x) are therefore unavoidable.

The main purpose of this chapter is to compute this effect. It is worth remarking that the the-

ory wasn’t engineered to produce the CMB fluctuations, but their origin is instead a natural

consequence of treating inflation quantum mechanically.

[Figure: sketch with the labels inflation, end and reheating.]

Figure 2.1: Quantum fluctuations δφ(t,x) around the classical background evolution φ(t). Regions
acquiring a negative frozen fluctuation δφ remain potential-dominated longer than regions with positive

δφ. Different parts of the universe therefore undergo slightly different evolutions. After inflation, this

induces relative density fluctuations δρ(r,x).


2.2 Classical Perturbations

For concreteness, we will consider single-field slow-roll models of inflation

$$ S = \int d^4x\sqrt{-g}\left[\frac{1}{2} R - \frac{1}{2} g^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi - V(\phi)\right] , \tag{2.2.1} $$

where Mpl ≡ 1. We will study both scalar and tensor fluctuations. For the scalar modes

we have to be careful to identify the true physical degrees of freedom. A priori, we have 5

scalar modes: 4 metric perturbations – δg00, δgii, δg0i ∼ ∂iB and δgij ∼ ∂i∂jH – and 1 scalar

field perturbation δφ. Gauge invariances associated with the invariance of (2.2.1) under scalar

coordinate transformations – t → t + ε0 and xi → xi + ∂iε – remove two modes. The Einstein

constraint equations remove two more modes, so that we are left with 1 physical scalar mode.

Deriving the quadratic action for this mode is the aim of this section.


[Figure: comoving scales versus time. Marked: horizon exit, reheating, horizon re-entry, transfer function, projection.]

Figure 2.2: Curvature perturbations during and after inflation: The comoving horizon (aH)−1 shrinks

during inflation and grows in the subsequent FRW evolution. This implies that comoving scales k−1 exit

the horizon at early times and re-enter the horizon at late times. While the curvature perturbations ζ

are outside of the horizon they don’t evolve, so our computation for the correlation function 〈ζk1ζk2〉 at

horizon exit during the early de Sitter phase can be related directly to CMB observables at late times.

2.2.1 Comoving Gauge

We will work in a fixed gauge throughout. For a number of reasons it will be convenient to work

in comoving gauge, defined by the vanishing of the momentum density, δT0i ≡ 0. For slow-roll

inflation, this becomes1

δφ = 0 . (2.2.2)

In this gauge, perturbations are characterized purely by fluctuations in the metric,

$$ \delta g_{ij} = a^2(1-2\zeta)\delta_{ij} + a^2 h_{ij} . \tag{2.2.3} $$

1 Below we use the Einstein equations to replace the additional (non-dynamical) metric perturbations δg00 and δg0i in terms of ζ. This results in an action purely for ζ which is why, for now, we can afford to be a bit implicit about the remaining metric perturbations.

Here, hij is a transverse (∇i hij = 0), traceless (hii = 0) tensor and ζ is a scalar. One can show that the comoving spatial slices φ = const. have three-curvature $R^{(3)} = \frac{4}{a^2}\nabla^2\zeta$. Hence, ζ is referred to as the (comoving) curvature perturbation.2

The perturbation ζ has the crucial property that (for adiabatic matter fluctuations) it is

time-independent on superhorizon scales (see fig. 2.2):

$$ \lim_{k \ll aH} \dot\zeta_k = 0 . \tag{2.2.4} $$

We will prove this important fact in the next chapter. The constancy of ζ on superhorizon scales

allows us to relate CMB observations directly to the inflationary dynamics (at the time when

a given fluctuation crosses the horizon) while allowing us to be completely ignorant about the

high-energy physics during the intervening history of the universe.

2.2.2 Constraint Equations

Solving the Einstein equations for the non-dynamical metric perturbations δg00 and δg0i in terms

of ζ is a bit tedious. Readers who don’t want to be distracted by these technical details are
advised to skip to the next subsection and proceed to the result for the quadratic action for the

perturbation ζ. Otherwise, the details can be found in the following digression:

Free field action for ζ. The constraint equations are solved most conveniently in the ADM formalism,

where the metric fluctuations become non-dynamical Lagrange multipliers. In ADM, spacetime is sliced

into three-dimensional hypersurfaces

$$ ds^2 = -N^2 dt^2 + g_{ij}(dx^i + N^i dt)(dx^j + N^j dt) . \tag{2.2.5} $$

Here, gij is the three-dimensional metric on slices of constant t. The lapse function N(x) and the shift

function Ni(x) appear as non-dynamical Lagrange multipliers in the action, i.e. their equations of motion

are purely algebraic. For our purposes this is the main advantage of the ADM formalism. The action

(2.2.1) becomes

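$$ S = \frac{1}{2}\int d^4x\sqrt{-g}\,\Big[ N R^{(3)} - 2NV + N^{-1}(E_{ij}E^{ij} - E^2) + N^{-1}(\dot\phi - N^i\partial_i\phi)^2 - N g^{ij}\partial_i\phi\,\partial_j\phi - 2V \Big] , \tag{2.2.6} $$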

where Eij is the extrinsic curvature of the three-dimensional spatial slices

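$$ E_{ij} \equiv \frac{1}{2}(\dot g_{ij} - \nabla_i N_j - \nabla_j N_i) , \qquad E = E^i_{\ i} , \tag{2.2.7} $$

and the perturbed three-metric is

$$ g_{ij} = a^2(1-2\zeta)\delta_{ij} . \tag{2.2.8} $$

The ADM action (2.2.6) implies the following constraint equations for the Lagrange multipliers $N$ and $N^i$

$$ \nabla_i\left[N^{-1}(E^i_{\ j} - \delta^i_{\ j}E)\right] = 0 , \tag{2.2.9} $$

$$ R^{(3)} - 2V - N^{-2}(E_{ij}E^{ij} - E^2) - N^{-2}\dot\phi^2 = 0 . \tag{2.2.10} $$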

To solve the constraints, we split the shift vector $N_i$ into irrotational (scalar) and incompressible (vector) parts

$$ N_i \equiv \psi_{,i} + \tilde N_i , \quad \text{where} \quad \partial_i \tilde N_i = 0 , \tag{2.2.11} $$

2 Sometimes the notation $\mathcal{R}$ is used for the comoving curvature perturbation, to distinguish it from the curvature perturbation on uniform density hypersurfaces, which sometimes is also denoted by ζ. We will continue to use ζ for the comoving curvature perturbation.

and define the lapse perturbation as

$$ N \equiv 1 + \alpha . \tag{2.2.12} $$

The quantities α, ψ and $\tilde N_i$ then admit expansions in powers of ζ,

$$ \alpha = \alpha_1 + \alpha_2 + \ldots , \qquad \psi = \psi_1 + \psi_2 + \ldots , \qquad \tilde N_i = \tilde N_i^{(1)} + \tilde N_i^{(2)} + \ldots , \tag{2.2.13} $$

where, e.g. $\alpha_n = O(\zeta^n)$. The constraint equations may then be solved order-by-order: To get the quadratic action, we only need to solve the constraint equations to first order. Eq. (2.2.10) then implies

$$ \alpha_1 = \frac{\dot\zeta}{H} , \qquad \partial^2 \tilde N_i^{(1)} = 0 , \tag{2.2.14} $$

where $\tilde N_i^{(1)} \equiv 0$ with an appropriate choice of boundary conditions. Furthermore, at first order eq. (2.2.9) gives

$$ \psi_1 = -\frac{\zeta}{H} + \frac{a^2}{H}\varepsilon_v\,\partial^{-2}\dot\zeta , \tag{2.2.15} $$

where ∂−2 is defined via ∂−2(∂2f) = f .

Substituting the first-order solutions for N and Ni back into the action, one finds the following second-

order action

$$ S_2 = \frac{1}{2}\int d^4x\, a^3\frac{\dot\phi^2}{H^2}\left[\dot\zeta^2 - a^{-2}(\partial_i\zeta)^2\right] . \tag{2.2.16} $$

To arrive at this result requires integration by parts and use of the background equations of motion.

2.2.3 Quadratic Action

Substituting δg00(ζ) and δg0i(ζ) into (2.2.1) and expanding in powers of ζ, we find

$$ S = \frac{1}{2}\int dt\, d^3x\, a^3\frac{\dot\phi^2}{H^2}\left[\dot\zeta^2 - \frac{1}{a^2}(\partial_i\zeta)^2\right] + \cdots \tag{2.2.17} $$

The ellipses in (2.2.17) refer to terms that are higher order in ζ. Being interested only in the

quadratic action of ζ we will now drop these terms. We will come back to these terms when we

discuss higher-order correlations and non-Gaussianity in Chapter 5. We define the canonically-

normalized Mukhanov variable

v ≡ zζ , (2.2.18)

where

$$ z^2 \equiv a^2\frac{\dot\phi^2}{H^2} = 2a^2\varepsilon . \tag{2.2.19} $$

Switching to conformal time, we get

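$$ S = \frac{1}{2}\int d\tau\, d^3x\left[(v')^2 - (\partial_i v)^2 + \frac{z''}{z}v^2\right] . \tag{2.2.20} $$

We recognize this as the action of a harmonic oscillator with a time-dependent mass

$$ m_{\rm eff}^2(\tau) \equiv -\frac{z''}{z} = -\frac{H}{a\dot\phi}\frac{\partial^2}{\partial\tau^2}\left(\frac{a\dot\phi}{H}\right) . \tag{2.2.21} $$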

The time-dependence of the effective mass accounts for the interaction of the scalar field ζ with

the gravitational background. Notice that both a(t) and φ(t) contribute to meff(τ).


2.2.4 Mukhanov-Sasaki Equation

Varying the action S, we arrive at the Mukhanov-Sasaki equation3

$$ v_k'' + \underbrace{\left(k^2 - \frac{z''}{z}\right)}_{\equiv\,\omega_k^2(\tau)} v_k = 0 , \tag{2.2.22} $$

where we defined the Fourier modes,

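$$ v_{\mathbf{k}}(\tau) \equiv \int d^3x\, e^{-i\mathbf{k}\cdot\mathbf{x}}\, v(\tau,\mathbf{x}) . \tag{2.2.23} $$

In de Sitter space, $a = -(H\tau)^{-1}$, the effective frequency reduces to

$$ \omega_k^2(\tau) = k^2 - \frac{2}{\tau^2} \qquad \text{(de Sitter)}. \tag{2.2.24} $$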

To guide our intuition we consider special limits of (2.2.22): For modes with wavelengths much smaller than the horizon, $k^2 \gg |z''/z|$, we get

$$ v_k'' + k^2 v_k = 0 \qquad \text{(subhorizon)}. \tag{2.2.25} $$

This leads to oscillating solutions: $v_k \propto e^{\pm ik\tau}$. For modes with wavelengths much larger than the horizon, $k^2 \ll |z''/z|$, we find instead

$$ \frac{v_k''}{v_k} = \frac{z''}{z} \approx \frac{2}{\tau^2} \qquad \text{(superhorizon)}. \tag{2.2.26} $$

This has the growing solution⁴ $v_k \propto z \propto \tau^{-1}$ (and the decaying solution $v_k \propto \tau^2$). This implies that ζ indeed freezes on superhorizon scales: $\zeta_k = z^{-1}v_k \propto$ const.
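The transition from oscillation to freeze-out can be watched numerically by integrating (2.2.22) with the de Sitter frequency (2.2.24). The sketch below is illustrative only: it works in units with k = 1, uses a fixed-step RK4 integrator, and assumes as initial data a plane wave $e^{-ik\tau}/\sqrt{2k}$ deep inside the horizon (a choice made here for definiteness; the text has not yet fixed the vacuum at this point). Since z ∝ a ∝ 1/|τ| in de Sitter space, the combination |τ||v_k| tracks ζ_k up to a constant factor.

```python
import cmath
import math

K = 1.0  # comoving wavenumber; tau is then measured in units of 1/k


def rhs(tau, v, dv):
    """Mukhanov-Sasaki equation (2.2.22) with the de Sitter frequency (2.2.24)."""
    return dv, -(K**2 - 2.0 / tau**2) * v


def evolve(tau_start, tau_end, v, dv, n_steps):
    """Fixed-step RK4 integration of (v, v') in conformal time."""
    h = (tau_end - tau_start) / n_steps
    tau = tau_start
    for _ in range(n_steps):
        k1 = rhs(tau, v, dv)
        k2 = rhs(tau + 0.5 * h, v + 0.5 * h * k1[0], dv + 0.5 * h * k1[1])
        k3 = rhs(tau + 0.5 * h, v + 0.5 * h * k2[0], dv + 0.5 * h * k2[1])
        k4 = rhs(tau + h, v + h * k3[0], dv + h * k3[1])
        v += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        dv += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        tau += h
    return v, dv


# Assumed initial data: a plane wave deep inside the horizon (k|tau| = 50).
tau0 = -50.0
v0 = cmath.exp(-1j * K * tau0) / math.sqrt(2.0 * K)
dv0 = -1j * K * v0

v_a, dv_a = evolve(tau0, -0.10, v0, dv0, 49900)
v_b, dv_b = evolve(-0.10, -0.05, v_a, dv_a, 500)

# zeta_k is proportional to |tau| * |v_k| in de Sitter space.
zeta_a = 0.10 * abs(v_a)
zeta_b = 0.05 * abs(v_b)
print(zeta_a, zeta_b)
```

Between k|τ| = 0.1 and k|τ| = 0.05 the amplitude |v_k| roughly doubles while |τ||v_k| barely changes, which is the freeze-out of ζ_k described above.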

2.2.5 Mode Expansion

Since the frequency ωk(τ) in (2.2.22) depends only on k ≡ |k|, the most general solution of

(2.2.22) can be written as5

$$ v_{\mathbf{k}} \equiv a^-_{\mathbf{k}}\, v_k(\tau) + a^+_{-\mathbf{k}}\, v_k^*(\tau) . \tag{2.2.27} $$

Here, $v_k(\tau)$ and its complex conjugate $v_k^*(\tau)$ are two linearly independent solutions of (2.2.22). As indicated by dropping the vector notation $\mathbf{k}$ on the subscript, the mode functions, $v_k(\tau)$ and $v_k^*(\tau)$, are the same for all Fourier modes with $k \equiv |\mathbf{k}|$. The Wronskian of the mode functions is

$$ W[v_k, v_k^*] \equiv v_k' v_k^* - v_k v_k^{*\prime} = 2i\,{\rm Im}(v_k' v_k^*) . \tag{2.2.28} $$

From the equation of motion (2.2.22) it follows that $W[v_k, v_k^*]$ is time-independent. Furthermore, by rescaling the mode functions as $v_k \to \lambda v_k$ (giving $W[v_k, v_k^*] \to |\lambda|^2 W[v_k, v_k^*]$) we can always normalize $v_k$ such that

$$ W[v_k, v_k^*] = v_k' v_k^* - v_k v_k^{*\prime} \equiv -i . \tag{2.2.29} $$

3 The Mukhanov-Sasaki equation is hard to solve in full generality since the function z(τ) depends on the background dynamics. For a given inflationary background, φ(τ) and a(τ), one may of course solve eq. (2.2.22) numerically. However, to gain a more intuitive understanding of the solutions, we will discuss approximate analytical solutions in the pure de Sitter limit, as well as in the slow-roll expansion of quasi-de Sitter space.

4 Recall that τ runs from large negative values to zero during inflation.

5 The $-\mathbf{k}$ on $a^+_{-\mathbf{k}}$ was chosen for later convenience.


The reason for this particular choice of normalization will become clear momentarily.

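The two time-independent integration constants $a^\pm_{\mathbf{k}}$ in (2.2.27) are

$$ a^-_{\mathbf{k}} = \frac{v_k^{*\prime} v_{\mathbf{k}} - v_k^* v_{\mathbf{k}}'}{v_k^{*\prime} v_k - v_k^* v_k'} = \frac{W[v_k^*, v_{\mathbf{k}}]}{W[v_k^*, v_k]} \quad \text{and} \quad a^+_{\mathbf{k}} = (a^-_{\mathbf{k}})^* , \tag{2.2.30} $$

where the relation between $a^+_{\mathbf{k}}$ and $a^-_{\mathbf{k}}$ follows from the reality of v. Note that the constants $a^\pm_{\mathbf{k}}$ may depend on the direction of the wave vector $\mathbf{k}$.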

Finally, Fourier transforming (2.2.27) gives

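$$ \begin{aligned} v(\tau,\mathbf{x}) &= \int \frac{d^3k}{(2\pi)^{3/2}} \left[ a^-_{\mathbf{k}}\, v_k(\tau) + a^+_{-\mathbf{k}}\, v_k^*(\tau) \right] e^{i\mathbf{k}\cdot\mathbf{x}} && (2.2.31) \\ &= \int \frac{d^3k}{(2\pi)^{3/2}} \left[ a^-_{\mathbf{k}}\, v_k(\tau)\, e^{i\mathbf{k}\cdot\mathbf{x}} + a^+_{\mathbf{k}}\, v_k^*(\tau)\, e^{-i\mathbf{k}\cdot\mathbf{x}} \right] , && (2.2.32) \end{aligned} $$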

where the second line is manifestly real, since a+k = (a−k )∗.

2.3 Quantum Origin of Cosmological Perturbations

Our task now is to quantize the field v. This is not much more complicated than quantizing

the simple harmonic oscillator in quantum mechanics, except for a small subtlety in the vacuum

choice arising from the time-dependence of the oscillator frequencies ωk(τ).6

2.3.1 Canonical Quantization

The canonical quantization procedure proceeds in the standard way: the field v and its canonically

conjugate momentum π ≡ v′ are promoted to quantum operators v̂ and π̂, which satisfy

the standard equal-time commutation relations⁷

$$ [\hat v(\tau,\mathbf{x}), \hat\pi(\tau,\mathbf{y})] = i\,\delta(\mathbf{x}-\mathbf{y}) , \tag{2.3.33} $$

and

$$ [\hat v(\tau,\mathbf{x}), \hat v(\tau,\mathbf{y})] = [\hat\pi(\tau,\mathbf{x}), \hat\pi(\tau,\mathbf{y})] = 0 . \tag{2.3.34} $$

It follows from (2.2.22) that the commutation relation (2.3.33) holds at all times if it holds at

any one time. The Hamiltonian is

$$ \hat H(\tau) = \frac{1}{2} \int d^3x \left[ \hat\pi^2 + (\partial_i \hat v)^2 + m_{\rm eff}^2(\tau)\, \hat v^2 \right] . \tag{2.3.35} $$

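The constants of integration a±k in the mode expansion of v become operators â±k, so that the

field operator v̂ is expanded as

$$ \hat v(\tau,\mathbf{x}) = \int \frac{d^3k}{(2\pi)^{3/2}} \left[ \hat a^-_{\mathbf{k}}\, v_k(\tau)\, e^{i\mathbf{k}\cdot\mathbf{x}} + \hat a^+_{\mathbf{k}}\, v_k^*(\tau)\, e^{-i\mathbf{k}\cdot\mathbf{x}} \right] . \tag{2.3.36} $$

Substituting (2.3.36) into (2.3.33) and (2.3.34) implies

$$ [\hat a^-_{\mathbf{k}}, \hat a^+_{\mathbf{k}'}] = \delta(\mathbf{k}-\mathbf{k}') \quad \text{and} \quad [\hat a^-_{\mathbf{k}}, \hat a^-_{\mathbf{k}'}] = [\hat a^+_{\mathbf{k}}, \hat a^+_{\mathbf{k}'}] = 0 . \tag{2.3.37} $$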

⁶ For a nice treatment of quantum field theory in curved backgrounds I strongly recommend: V. Mukhanov and S. Winitzki, Introduction to Quantum Effects in Gravity.

⁷ Here, we defined ħ ≡ 1.


We realize that our normalization for the mode functions (2.2.29) was wisely chosen to make

(2.3.37) simple. The operators a+k and a−k may then be interpreted as creation and annihilation

operators, respectively. As usual, quantum states in the Hilbert space are constructed by defining

the vacuum state |0〉 via

a−k |0〉 = 0 , (2.3.38)

and by producing excited states by repeated application of creation operators

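$$ |m_{\mathbf{k}_1}, n_{\mathbf{k}_2}, \cdots\rangle = \frac{1}{\sqrt{m!\,n!\cdots}} \left[ (\hat a^+_{\mathbf{k}_1})^m (\hat a^+_{\mathbf{k}_2})^n \cdots \right] |0\rangle . \tag{2.3.39} $$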

2.3.2 Non-Uniqueness of the Vacuum

An unambiguous physical interpretation of the states in (2.3.38) and (2.3.39) arises only after

the mode functions vk(τ) are selected.8 However, the normalization (2.2.29) is not sufficient to

completely fix the solutions vk(τ) to the second-order ODE (2.2.22). An unambiguous definition

of the vacuum still requires additional physical input.

To illustrate this ambiguity explicitly, consider the following functions

uk(τ) = αkvk(τ) + βkv∗k(τ) , (2.3.40)

where αk and βk are complex constants. The functions uk(τ) of course also satisfy the equation

of motion (2.2.22). Moreover, they satisfy the normalization (2.2.29), i.e. W [uk, u∗k] = −i, if the

coefficients αk and βk obey

|αk|2 − |βk|2 = 1 . (2.3.41)

At this point there is therefore nothing that permits us to favor vk(τ) over uk(τ) in our choice

of mode functions. In terms of uk(τ) the expansion of v takes the form

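$$ \hat v(\tau,\mathbf{x}) = \int \frac{d^3k}{(2\pi)^{3/2}} \left[ \hat b^-_{\mathbf{k}}\, u_k(\tau)\, e^{i\mathbf{k}\cdot\mathbf{x}} + \hat b^+_{\mathbf{k}}\, u_k^*(\tau)\, e^{-i\mathbf{k}\cdot\mathbf{x}} \right] , \tag{2.3.42} $$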

where b±k are alternative creation and annihilation operators satisfying (2.3.37). Comparing

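(2.3.42) to (2.3.36) leads to the Bogolyubov transformation between b±k operators and a±k operators:

$$ \hat a^-_{\mathbf{k}} = \alpha_k^*\, \hat b^-_{\mathbf{k}} + \beta_k\, \hat b^+_{-\mathbf{k}} \quad \text{and} \quad \hat a^+_{\mathbf{k}} = \alpha_k\, \hat b^+_{\mathbf{k}} + \beta_k^*\, \hat b^-_{-\mathbf{k}} . \tag{2.3.43} $$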

Both sets of operators can be used to construct a basis of states in the Hilbert space:

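$$ \hat a^-_{\mathbf{k}} |0\rangle_a = 0 , \qquad \hat b^-_{\mathbf{k}} |0\rangle_b = 0 , \tag{2.3.44} $$

and

$$ |m_{\mathbf{k}_1}, n_{\mathbf{k}_2}, \cdots\rangle_a = \frac{1}{\sqrt{m!\,n!\cdots}} \left[ (\hat a^+_{\mathbf{k}_1})^m (\hat a^+_{\mathbf{k}_2})^n \cdots \right] |0\rangle_a , \tag{2.3.45} $$

$$ |m_{\mathbf{k}_1}, n_{\mathbf{k}_2}, \cdots\rangle_b = \frac{1}{\sqrt{m!\,n!\cdots}} \left[ (\hat b^+_{\mathbf{k}_1})^m (\hat b^+_{\mathbf{k}_2})^n \cdots \right] |0\rangle_b . \tag{2.3.46} $$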

8Changing vk(τ) while keeping v fixed, changes a±k [cf. (2.2.30)] and hence changes the vacuum |0〉 and the

excited states |m,n, · · · 〉.


It should be clear that the b-states are in general different from the a-states. In particular, the

b-vacuum contains a-particles:

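$$ \begin{aligned} {}_b\langle 0|\hat N^{(a)}_{\mathbf{k}}|0\rangle_b &= {}_b\langle 0|\hat a^+_{\mathbf{k}} \hat a^-_{\mathbf{k}}|0\rangle_b && (2.3.47) \\ &= {}_b\langle 0|(\alpha_k \hat b^+_{\mathbf{k}} + \beta_k^* \hat b^-_{-\mathbf{k}})(\alpha_k^* \hat b^-_{\mathbf{k}} + \beta_k \hat b^+_{-\mathbf{k}})|0\rangle_b && (2.3.48) \\ &= |\beta_k|^2\, \delta(0) . && (2.3.49) \end{aligned} $$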

The divergent factor δ(0) arises because we are considering an infinite spatial volume, but the

mean density of a-particles in the b-vacuum is finite (and typically not zero):

$$ n \equiv \int d^3k\; n_k = \int d^3k\; |\beta_k|^2 . \tag{2.3.50} $$
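As a quick numerical sanity check of (2.3.40), (2.3.41) and (2.3.49), the sketch below builds a rotated mode function u_k = α_k v_k + β_k v_k^* and confirms that the Wronskian normalization (2.2.29) is preserved when |α_k|² − |β_k|² = 1. The plane-wave form used for v_k, the parameter values and all function names are illustrative assumptions, not part of the text.

```python
import cmath

def minkowski_mode(k, tau):
    """Assumed example mode function v_k = exp(-i k tau)/sqrt(2k) and its tau-derivative."""
    v = cmath.exp(-1j * k * tau) / (2.0 * k) ** 0.5
    return v, -1j * k * v

def wronskian(f, df):
    """W[f, f*] = f' f* - f f*' as in (2.2.28)."""
    return df * f.conjugate() - f * df.conjugate()

def rotated_mode(k, tau, alpha, beta):
    """u_k = alpha v_k + beta v_k^* as in (2.3.40), with its tau-derivative."""
    v, dv = minkowski_mode(k, tau)
    u = alpha * v + beta * v.conjugate()
    du = alpha * dv + beta * dv.conjugate()
    return u, du

# Hypothetical Bogolyubov coefficients obeying |alpha|^2 - |beta|^2 = 1.
theta = 0.7
alpha = cmath.cosh(theta)
beta = cmath.exp(0.4j) * cmath.sinh(theta)

k, tau = 1.3, -2.0
w_v = wronskian(*minkowski_mode(k, tau))
w_u = wronskian(*rotated_mode(k, tau, alpha, beta))

# Coefficient of delta(0) in (2.3.49): a-particles per mode in the b-vacuum.
n_per_mode = abs(beta) ** 2

print("W[v, v*] =", w_v)
print("W[u, u*] =", w_u)
print("|beta|^2 =", n_per_mode)
```

Both Wronskians should come out equal to −i up to rounding, while |β_k|² is non-zero, which is the statement that the two sets of mode functions are equally admissible but define different vacua.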

2.3.3 Choice of the Physical Vacuum

Clearly, we are still missing some essential physical input to define the unique vacuum state.

Vacuum in Minkowski Space

How do we usually do this? In a time-independent spacetime a preferable set of mode functions

and thus an unambiguous physical vacuum can be defined by requiring that the expectation

value of the Hamiltonian in the vacuum state is minimized. To illustrate this let us consider the

Mukhanov-Sasaki equation in Minkowski space (i.e. the z′′/z → 0 limit of (2.2.22)):

$$ v_k'' + k^2 v_k = 0 . \tag{2.3.51} $$

We aim to find the mode functions vk that minimize the expectation value of the Hamiltonian

in the vacuum. We will therefore compute v〈0|H|0〉v for an arbitrary mode function v and then

find the preferred function v that minimizes the result. In terms of our mode expansion, the

Hamiltonian (2.3.35) becomes

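$$ \hat H = \frac{1}{2} \int d^3k \left[ \hat a^-_{\mathbf{k}} \hat a^-_{-\mathbf{k}} F_k^* + \hat a^+_{\mathbf{k}} \hat a^+_{-\mathbf{k}} F_k + \left( 2 \hat a^+_{\mathbf{k}} \hat a^-_{\mathbf{k}} + \delta(0) \right) E_k \right] , \tag{2.3.52} $$

where

$$ E_k \equiv |v_k'|^2 + k^2 |v_k|^2 , \tag{2.3.53} $$

$$ F_k \equiv v_k'^{\,2} + k^2 v_k^2 . \tag{2.3.54} $$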

Since a−k |0〉v = 0, we have

$$ {}_v\langle 0|\hat H|0\rangle_v = \frac{\delta(0)}{2} \int d^3k\; E_k . \tag{2.3.55} $$

Dividing out the uninteresting divergence, δ(0), we infer that the energy density in the vacuum

state is

$$ \varepsilon = \frac{1}{2} \int d^3k\; E_k . \tag{2.3.56} $$

It is clear that this is minimized if each k–mode Ek is minimized separately. We therefore need

to determine the vk and v′k that minimize the expression

Ek = |v′k|2 + k2|vk|2 . (2.3.57)


We mustn’t forget that the mode functions vk satisfy the normalization (2.2.29),

$$ v_k' v_k^* - v_k v_k^{*\prime} = -i . \tag{2.3.58} $$

Using the parameterization $v_k = r_k e^{i\alpha_k}$, for real rk and αk, (2.3.58) becomes

$$ r_k^2\, \alpha_k' = -\frac{1}{2} , \tag{2.3.59} $$

and (2.3.57) gives

$$ \begin{aligned} E_k &= r_k'^{\,2} + r_k^2 \alpha_k'^{\,2} + k^2 r_k^2 && (2.3.60) \\ &= r_k'^{\,2} + \frac{1}{4 r_k^2} + k^2 r_k^2 . && (2.3.61) \end{aligned} $$

It is easily seen that (2.3.61) is minimized if r′k = 0 and rk = 1/√(2k). Integrating (2.3.59) gives

αk = −kτ (up to an irrelevant constant that doesn’t affect any observables; e.g. this constant

phase factor drops out in the computation of the power spectrum) and hence

$$ v_k(\tau) = \frac{1}{\sqrt{2k}}\, e^{-ik\tau} . \tag{2.3.62} $$
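The minimization of (2.3.61) can also be checked numerically. The sketch below sets r′_k = 0 (which can only lower E_k) and minimizes the remaining function of r_k with a golden-section search; the search routine, the bracketing interval and the value of k are assumptions made for illustration.

```python
def mode_energy(r, k):
    """E_k from (2.3.61) with r'_k = 0: E_k = 1/(4 r^2) + k^2 r^2."""
    return 1.0 / (4.0 * r * r) + k * k * r * r

def golden_section_min(f, lo, hi, tol=1e-12):
    """Locate the minimum of a unimodal function on [lo, hi]."""
    inv_phi = (5.0 ** 0.5 - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b = d          # minimum lies in [a, d]
        else:
            a = c          # minimum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return 0.5 * (a + b)

k = 2.5  # hypothetical wavenumber
r_min = golden_section_min(lambda r: mode_energy(r, k), 1e-3, 10.0)
e_min = mode_energy(r_min, k)

print("numerical r_k :", r_min, " expected 1/sqrt(2k):", (2.0 * k) ** -0.5)
print("numerical E_k :", e_min, " expected k          :", k)
```

The minimum sits at r_k = 1/√(2k), where E_k = k, in agreement with the mode function (2.3.62).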

This defines the preferred mode functions for fluctuations in Minkowski space. For these mode

functions we find Ek = k ≡ ωk and Fk = 0, so the Hamiltonian is

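$$ \hat H = \int d^3k\; \omega_k \left[ \hat a^+_{\mathbf{k}} \hat a^-_{\mathbf{k}} + \frac{1}{2} \delta(0) \right] . \tag{2.3.63} $$

Hence, the Hamiltonian is diagonal in the eigenbasis of the occupation number operator $\hat N_{\mathbf{k}} \equiv \hat a^+_{\mathbf{k}} \hat a^-_{\mathbf{k}}$.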

Vacuum in Time-Dependent Spacetimes

The vacuum prescription which we just applied to Minkowski space does not generalize straight-

forwardly to time-dependent spacetimes. In this case the mode equation (2.2.22) involves time-

dependent frequencies ωk(τ) and the ‘minimum-energy vacuum’ depends on the time τ0 at which

it is defined. Repeating the above argument, one can nevertheless determine the vacuum which

instantaneously minimizes the expectation value of the Hamiltonian at some time τ0. One finds

that the initial conditions

$$ v_k(\tau_0) = \frac{1}{\sqrt{2\omega_k(\tau_0)}}\, e^{-i\omega_k(\tau_0)\tau_0} , \qquad v_k'(\tau_0) = -i\,\omega_k(\tau_0)\, v_k(\tau_0) \tag{2.3.64} $$

select the preferred mode functions which determine the vacuum |0〉τ0 . However, since ωk(τ)

changes with time, the mode functions satisfying (2.3.64) at τ = τ0 will typically be different

from the mode functions that satisfy the same conditions at a different time τ1 ≠ τ0. This

implies that |0〉τ1 ≠ |0〉τ0 and the state |0〉τ0 is not the lowest-energy state at a later time τ1.

Bunch-Davies Vacuum

How do we resolve this ambiguity for the inflationary quasi-de Sitter spacetime?


From fig. 2.2 we note that at sufficiently early times (large negative conformal time τ) all

modes of cosmological interest were deep inside the horizon:

$$ \frac{k}{aH} \sim |k\tau| \gg 1 \quad \text{(subhorizon)} . \tag{2.3.65} $$

This means that in the remote past all observable modes had time-independent frequencies;

e.g. in perfect de Sitter space:

$$ \omega_k^2 = k^2 - \frac{2}{\tau^2} \to k^2 . \tag{2.3.66} $$

The corresponding modes are therefore not affected by gravity and behave just like in Minkowski

space:

$$ v_k'' + k^2 v_k = 0 . \tag{2.3.67} $$

The two independent solutions of (2.3.67) are $v_k \propto e^{\pm ik\tau}$. As we have seen above only the

positive frequency mode $v_k \propto e^{-ik\tau}$ is the ‘minimal excitation state’, cf. eq. (2.3.62).

Given that all modes have time-independent frequencies at sufficiently early times, we can

now avoid the ambiguity in defining the initial conditions for the mode functions that afflicts

the treatment in more general time-dependent spacetimes. In practice, this means solving the

Mukhanov-Sasaki equation with the (Minkowski) initial condition

$$ \lim_{\tau\to-\infty} v_k(\tau) = \frac{1}{\sqrt{2k}}\, e^{-ik\tau} . \tag{2.3.68} $$

This defines a preferable set of mode functions and a unique physical vacuum, the Bunch-Davies

vacuum.

2.3.4 Zero-Point Fluctuations in De Sitter

We are now ready to apply the formalism to de Sitter space.

De Sitter Mode Functions

Recall that the Mukhanov-Sasaki equation in de Sitter is

$$ v_k'' + \left( k^2 - \frac{2}{\tau^2} \right) v_k = 0 . \tag{2.3.69} $$

This has the following exact solution

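$$ v_k(\tau) = \alpha\, \frac{e^{-ik\tau}}{\sqrt{2k}} \left( 1 - \frac{i}{k\tau} \right) + \beta\, \frac{e^{ik\tau}}{\sqrt{2k}} \left( 1 + \frac{i}{k\tau} \right) . \tag{2.3.70} $$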

The initial condition (2.3.68) fixes β = 0, α = 1, and, hence, the unique mode function is

$$ v_k(\tau) = \frac{e^{-ik\tau}}{\sqrt{2k}} \left( 1 - \frac{i}{k\tau} \right) . \tag{2.3.71} $$

This determines the future evolution of the mode including its superhorizon dynamics:

$$ \lim_{k\tau\to 0} v_k(\tau) = \frac{1}{i\sqrt{2}} \cdot \frac{1}{k^{3/2}\, \tau} . \tag{2.3.72} $$
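The properties claimed for the Bunch-Davies mode function (2.3.71) are easy to test numerically. The sketch below checks that it solves (2.3.69), that it satisfies the normalization (2.2.29) at all times, and that it approaches the superhorizon limit (2.3.72). The finite-difference step, the sample times and the function names are assumptions chosen for illustration.

```python
import cmath

def bd_mode(k, tau):
    """Bunch-Davies mode function (2.3.71)."""
    prefactor = cmath.exp(-1j * k * tau) / (2.0 * k) ** 0.5
    return prefactor * (1.0 - 1j / (k * tau))

def bd_mode_prime(k, tau):
    """Analytic tau-derivative of (2.3.71)."""
    prefactor = cmath.exp(-1j * k * tau) / (2.0 * k) ** 0.5
    return prefactor * (-1j * k - 1.0 / tau + 1j / (k * tau ** 2))

def ms_residual(k, tau, h=1e-5):
    """Left-hand side of (2.3.69); v'' is a central difference of the analytic v'."""
    v_second = (bd_mode_prime(k, tau + h) - bd_mode_prime(k, tau - h)) / (2.0 * h)
    return v_second + (k ** 2 - 2.0 / tau ** 2) * bd_mode(k, tau)

def wronskian(k, tau):
    """W[v, v*] = v' v* - v v*' as in (2.2.28)."""
    v, dv = bd_mode(k, tau), bd_mode_prime(k, tau)
    return dv * v.conjugate() - v * dv.conjugate()

k = 0.8  # hypothetical wavenumber
sample_times = (-20.0, -5.0, -1.0)
residuals = [abs(ms_residual(k, tau)) for tau in sample_times]
wronskians = [wronskian(k, tau) for tau in sample_times]

# Superhorizon limit: 2 k^3 tau^2 |v_k|^2 -> 1 as k tau -> 0, cf. (2.3.72).
tau_late = -1e-3 / k
ratio = 2.0 * k ** 3 * tau_late ** 2 * abs(bd_mode(k, tau_late)) ** 2

print("ODE residuals:", residuals)
print("Wronskians   :", wronskians)
print("2 k^3 tau^2 |v_k|^2 at k tau = -1e-3:", ratio)
```

The residuals should vanish up to discretization error, each Wronskian should equal −i, and the last ratio should differ from one only at order (kτ)².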


Zero-Point Fluctuations

Knowledge of the mode functions for canonically-normalized fields in de Sitter space allows us

to compute the effect of quantum zero-point fluctuations:

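$$ \begin{aligned} \langle \hat v_{\mathbf{k}} \hat v_{\mathbf{k}'} \rangle &= \langle 0| \hat v_{\mathbf{k}} \hat v_{\mathbf{k}'} |0\rangle \\ &= \langle 0| \left( \hat a^-_{\mathbf{k}} v_k + \hat a^+_{-\mathbf{k}} v_k^* \right) \left( \hat a^-_{\mathbf{k}'} v_{k'} + \hat a^+_{-\mathbf{k}'} v_{k'}^* \right) |0\rangle \\ &= v_k v_{k'}^*\, \langle 0| \hat a^-_{\mathbf{k}} \hat a^+_{-\mathbf{k}'} |0\rangle \\ &= v_k v_{k'}^*\, \langle 0| [\hat a^-_{\mathbf{k}}, \hat a^+_{-\mathbf{k}'}] |0\rangle \\ &= |v_k|^2\, \delta(\mathbf{k}+\mathbf{k}') \\ &\equiv P_v(k)\, \delta(\mathbf{k}+\mathbf{k}') . \end{aligned} $$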

On superhorizon scales this approaches [cf. eq. (2.3.72)]

$$ P_v = \frac{1}{2k^3} \frac{1}{\tau^2} = \frac{1}{2k^3} (aH)^2 . \tag{2.3.73} $$

All power spectra for fields in de Sitter space are simple rescalings of this power spectrum for

the canonically-normalized field. For example, the power spectrum of curvature perturbations

is

$$ P_\zeta = \frac{1}{z^2} P_v . \tag{2.3.74} $$

2.4 Curvature Perturbations from Inflation

Strictly speaking, the curvature fluctuations ζ = z⁻¹v are ill-defined in perfect de Sitter since

z² = 2a²ε vanishes in that limit. This is just a reflection of the fact that for perfect de Sitter

inflation never ends, so ζ is meaningless. In reality, we know that inflation has to end and that

the spacetime during inflation has to deviate from the de Sitter idealization. This deviation is

described by the small but finite slow-roll parameter ε.

2.4.1 Results for Quasi-De Sitter

In quasi-de Sitter space, the curvature perturbation ζ is well-defined, and its power spectrum

follows directly from eq. (2.3.73),

$$ P_\zeta = \frac{1}{z^2} P_v = \frac{1}{4k^3} \frac{H^2}{\varepsilon} = \frac{1}{2k^3} \frac{H^4}{\dot\phi^2} . \tag{2.4.75} $$

Since ζ freezes at horizon crossing, we may evaluate the r.h.s. at k = aH. The power spectrum

then becomes a function purely of k:

$$ P_\zeta(k) = \frac{1}{4k^3} \left. \frac{H^2}{\varepsilon} \right|_{k=aH} , \tag{2.4.76} $$

or in dimensionless form

$$ \Delta_s^2(k) \equiv \frac{k^3}{2\pi^2} P_\zeta(k) = \frac{1}{8\pi^2} \left. \frac{H^2}{\varepsilon} \right|_{k=aH} . \tag{2.4.77} $$

Since H and possibly ε are now functions of time, the power spectrum will deviate slightly

from the scale-invariant form ∆²s ∼ k⁰. The common way to quantify the deviation from

scale-invariance is via the scalar spectral index ns:

$$ n_s - 1 \equiv \frac{d \ln \Delta_s^2}{d \ln k} . \tag{2.4.78} $$


We split the r.h.s. into two factors

$$ \frac{d \ln \Delta_s^2}{d \ln k} = \frac{d \ln \Delta_s^2}{dN} \times \frac{dN}{d \ln k} . \tag{2.4.79} $$

The derivative with respect to e-folds is

$$ \frac{d \ln \Delta_s^2}{dN} = 2\, \frac{d \ln H}{dN} - \frac{d \ln \varepsilon}{dN} . \tag{2.4.80} $$

The first term is just −2ε and the second term is −η (see Chapter 1). The second factor in

eq. (2.4.79) is evaluated by recalling the horizon crossing condition k = aH, or

ln k = N + lnH . (2.4.81)

Hence,

$$ \frac{dN}{d \ln k} = \left[ \frac{d \ln k}{dN} \right]^{-1} = \left[ 1 + \frac{d \ln H}{dN} \right]^{-1} \approx 1 + \varepsilon . \tag{2.4.82} $$

To first order in the Hubble slow-roll parameters we therefore find

ns − 1 = −2ε− η . (2.4.83)
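The chain of steps (2.4.79)–(2.4.83) can be reproduced numerically on a toy background. The sketch below assumes a constant η, so that ε(N) = ε₀ e^{ηN} and ln H follows by integrating d ln H/dN = −ε; the numerical values, the normalization H(0) = 1 and the function names are hypothetical choices made only for this check.

```python
import math

EPS0 = 0.01   # assumed value of epsilon at N = 0
ETA = 0.02    # assumed constant eta = d ln(epsilon) / dN

def epsilon(n):
    """Slow-roll parameter epsilon(N) for constant eta."""
    return EPS0 * math.exp(ETA * n)

def ln_hubble(n):
    """ln H(N) from d ln H / dN = -epsilon, normalized to H(0) = 1."""
    return -(EPS0 / ETA) * (math.exp(ETA * n) - 1.0)

def ln_k(n):
    """Horizon-crossing condition (2.4.81): ln k = N + ln H."""
    return n + ln_hubble(n)

def ln_delta2_s(n):
    """Logarithm of (2.4.77): Delta_s^2 = H^2 / (8 pi^2 epsilon)."""
    return 2.0 * ln_hubble(n) - math.log(8.0 * math.pi ** 2 * epsilon(n))

def spectral_tilt(n, dn=1e-4):
    """n_s - 1 = d ln Delta_s^2 / d ln k via central differences in N."""
    d_ln_p = ln_delta2_s(n + dn) - ln_delta2_s(n - dn)
    d_ln_k = ln_k(n + dn) - ln_k(n - dn)
    return d_ln_p / d_ln_k

tilt_numeric = spectral_tilt(0.0)
tilt_first_order = -2.0 * EPS0 - ETA              # eq. (2.4.83)
tilt_all_orders = tilt_first_order / (1.0 - EPS0)  # keeps dN/dlnk = 1/(1 - eps) exactly

print("numerical n_s - 1   :", tilt_numeric)
print("first-order formula :", tilt_first_order)
print("without truncation  :", tilt_all_orders)
```

The numerical tilt agrees with −2ε − η up to terms of second order in the slow-roll parameters, which is the accuracy claimed for (2.4.83).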

The parameter ns is an interesting probe of the inflationary dynamics. It measures deviations

from the perfect de Sitter limit: H, Ḣ, and Ḧ.

2.4.2 Systematic Slow-Roll Expansion∗

The same results may be derived as a systematic expansion in slow-roll parameters:

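$$ \varepsilon \equiv -\frac{\dot H}{H^2} , \qquad \eta \equiv \frac{\dot\varepsilon}{H\varepsilon} , \qquad \kappa \equiv \frac{\dot\eta}{H\eta} . \tag{2.4.84} $$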

This will involve a slow-roll expansion of the Mukhanov-Sasaki equation (2.2.22):

$$ v_k'' + \left( k^2 - \frac{z''}{z} \right) v_k = 0 . \tag{2.4.85} $$

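Given z² = 2a²ε we find

$$ \frac{z'}{z} = (aH) \left[ 1 + \frac{1}{2}\eta \right] \quad \text{(exact)} , \tag{2.4.86} $$

$$ \frac{z''}{z} = (aH)^2 \left[ 2 - \varepsilon + \frac{3}{2}\eta - \frac{1}{2}\varepsilon\eta + \frac{1}{4}\eta^2 + \frac{1}{2}\eta\kappa \right] \quad \text{(exact)} . \tag{2.4.87} $$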

Despite the appearance of the slow-roll parameters, both expressions above are exact. From the

definition of ε we furthermore get

$$ \frac{d}{d\tau} \left( \frac{1}{aH} \right) = \varepsilon - 1 \quad \text{(exact)} . \tag{2.4.88} $$

Expanding the expressions to first order in the slow-roll parameters, ε, |η|, |κ| ≪ 1, gives

$$ aH = -\frac{1}{\tau} (1 + \varepsilon) \quad \text{(first order in SR)} , \tag{2.4.89} $$


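and

$$ \frac{z''}{z} = \frac{1}{\tau^2} \left[ 2 + 3 \left( \varepsilon + \frac{1}{2}\eta \right) \right] \equiv \frac{\nu^2 - \frac{1}{4}}{\tau^2} \quad \text{(first order in SR)} , \tag{2.4.90} $$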

where

$$ \nu \equiv \frac{3}{2} + \varepsilon + \frac{1}{2}\eta . \tag{2.4.91} $$

For constant ν, the Mukhanov-Sasaki equation,

$$ v_k'' + \left( k^2 - \frac{\nu^2 - \frac{1}{4}}{\tau^2} \right) v_k = 0 , \tag{2.4.92} $$

has an exact solution in terms of Hankel functions of the first and second kind:

$$ v_k(\tau) = \sqrt{-\tau} \left[ \alpha\, H^{(1)}_\nu(-k\tau) + \beta\, H^{(2)}_\nu(-k\tau) \right] . \tag{2.4.93} $$

To impose the Bunch-Davies boundary condition at early times, we consider the limit

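$$ \lim_{k\tau\to-\infty} v_k(\tau) = \sqrt{\frac{2}{\pi}} \left[ \alpha\, \frac{1}{\sqrt{k}}\, e^{-ik\tau} + \beta\, \frac{1}{\sqrt{k}}\, e^{ik\tau} \right] , \tag{2.4.94} $$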

where we used

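$$ \lim_{k\tau\to-\infty} H^{(1,2)}_\nu(-k\tau) = \sqrt{\frac{2}{\pi}}\, \frac{1}{\sqrt{-k\tau}}\, e^{\mp ik\tau}\, e^{\mp i\frac{\pi}{2}\left(\nu+\frac{1}{2}\right)} , \tag{2.4.95} $$

and dropped the unimportant phase factors $e^{\mp i\frac{\pi}{2}(\nu+\frac{1}{2})}$. Comparing (2.4.94) to (2.3.68) we find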

$$ \beta = 0 \quad \text{and} \quad \alpha = \frac{\sqrt{\pi}}{2} . \tag{2.4.96} $$

Hence, the Bunch-Davies mode functions to first order in slow-roll are:

$$ v_k(\tau) = \frac{\sqrt{\pi}}{2}\, (-\tau)^{1/2}\, H^{(1)}_\nu(-k\tau) . \tag{2.4.97} $$

To compute the power spectrum of curvature fluctuations, Pζ = z⁻²Pv, we use $z \sim \tau^{\frac{1}{2}-\nu}$ (first

order in SR)

$$ P_\zeta \sim \frac{\pi}{2}\, (-\tau)^{2\nu}\, |H^{(1)}_\nu(-k\tau)|^2 . \tag{2.4.98} $$

In the superhorizon limit, −kτ ≪ 1, this reduces to

$$ \Delta_s^2 \equiv \frac{k^3}{2\pi^2} P_\zeta \sim k^{3-2\nu} , \tag{2.4.99} $$

where we used

$$ \lim_{k\tau\to 0} H^{(1)}_\nu(-k\tau) = \frac{i}{\pi}\, \Gamma(\nu) \left( \frac{-k\tau}{2} \right)^{-\nu} . \tag{2.4.100} $$

Finally, the scale-dependence of the scalar spectrum is

$$ n_s - 1 \equiv \frac{d \ln \Delta_s^2}{d \ln k} = 3 - 2\nu , \tag{2.4.101} $$

or, in terms of the slow-roll parameters,

ns − 1 = −2ε− η . (2.4.102)

This shows that the spectrum is perfectly scale-invariant in de Sitter space, while slow-roll

corrections to de Sitter lead to percent-level deviations from ns = 1.
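A simple cross-check of (2.4.91) and (2.4.101) is available for a background with constant ε and η = 0 (power-law inflation, used here purely as an assumed test case that is not discussed in the text). Then z ∝ a ∝ (−τ)^{−q} with q = 1/(1 − ε), so z′′/z = q(q + 1)/τ² holds exactly and ν = q + 1/2. The sketch compares this with the first-order index; the function names are illustrative.

```python
def nu_first_order(eps, eta):
    """Index nu of (2.4.91), valid to first order in slow roll."""
    return 1.5 + eps + 0.5 * eta

def tilt_from_nu(nu):
    """Scalar tilt n_s - 1 = 3 - 2 nu, eq. (2.4.101)."""
    return 3.0 - 2.0 * nu

def nu_power_law(eps):
    """Exact index for constant epsilon and eta = 0 (assumed power-law background):
    z''/z = q (q + 1) / tau^2 with q = 1/(1 - eps), hence nu = q + 1/2."""
    return 1.0 / (1.0 - eps) + 0.5

for eps in (0.001, 0.01, 0.05):
    exact = nu_power_law(eps)
    approx = nu_first_order(eps, 0.0)
    print(f"eps = {eps}: nu exact = {exact:.6f}, first order = {approx:.6f}, "
          f"difference = {exact - approx:.2e}")

print("n_s - 1 for eps = 0.01, eta = 0.02:", tilt_from_nu(nu_first_order(0.01, 0.02)))
```

The difference between the two indices is ε²/(1 − ε), i.e. second order in slow roll, as expected for an expansion truncated at first order.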


2.5 Gravitational Waves from Inflation

One of the most robust and model-independent predictions of inflation is a stochastic background

of gravitational waves with an amplitude given simply by the Hubble scale H during inflation.

The simplicity of this prediction means that a measurement of primordial gravitational waves

would give clean information about arguably the most important inflationary parameter, namely

the energy scale of inflation. Most excitingly, inflationary gravitational waves lead to a unique

signature in the polarization of the CMB. A large number of ground-based, balloon and satellite

experiments are currently searching for this signal.

The formalism we introduced for the scalar fluctuations can easily be applied to compute the

quantum generation of tensor perturbations (i.e. transverse and traceless perturbations to the

spatial metric, δgij = a2hij). In fact, in this case, our job is considerably simpler due to the fact

that first-order tensor perturbations are gauge-invariant and don’t backreact on the inflationary

background. Expansion of the Einstein-Hilbert action gives the second-order action for tensor

fluctuations

$$ S = \frac{M_{\rm pl}^2}{8} \int d\tau\, d^3x\; a^2 \left[ (h_{ij}')^2 - (\nabla h_{ij})^2 \right] . \tag{2.5.103} $$

Here, we have reintroduced the explicit factor of M²pl to make hij manifestly dimensionless. Up

to the normalization factor of Mpl/2 this is the same as the action for a massless scalar field in an

FRW universe.

We define the standard Fourier representation for transverse, traceless tensors

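$$ h_{ij}(\tau,\mathbf{x}) = \int \frac{d^3k}{(2\pi)^{3/2}} \sum_{\gamma=+,\times} \epsilon^\gamma_{ij}(k)\, h_{\mathbf{k},\gamma}(\tau)\, e^{i\mathbf{k}\cdot\mathbf{x}} , \tag{2.5.104} $$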

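where $\epsilon^\gamma_{ii} = k^i \epsilon^\gamma_{ij} = 0$ and $\epsilon^\gamma_{ij}\, \epsilon^{\gamma'}_{ij} = 2\delta_{\gamma\gamma'}$. The fields $h_{\mathbf{k},\gamma}$ describe the two polarization modes of

the gravitational waves (+ and ×). Eq. (2.5.103) then becomes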

$$ S = \sum_\gamma \int d\tau\, d^3k\; \frac{a^2}{4} M_{\rm pl}^2 \left[ (h_{\mathbf{k},\gamma}')^2 - k^2 (h_{\mathbf{k},\gamma})^2 \right] . \tag{2.5.105} $$

For the canonically-normalized fields,

$$ v_{\mathbf{k},\gamma} \equiv \frac{a}{2} M_{\rm pl}\, h_{\mathbf{k},\gamma} , \tag{2.5.106} $$

this reads

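$$ S = \sum_\gamma \frac{1}{2} \int d\tau\, d^3k \left[ (v_{\mathbf{k},\gamma}')^2 - \underbrace{\left( k^2 - \frac{a''}{a} \right)}_{\equiv\, \omega_k^2(\tau)} (v_{\mathbf{k},\gamma})^2 \right] . \tag{2.5.107} $$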

For a de Sitter background, we have

$$ \frac{a''}{a} = \frac{2}{\tau^2} . \tag{2.5.108} $$

Eq. (2.5.107) should be recognized as essentially two copies of the action (2.2.20). Hence, we

can jump straight to the result in eq. (2.3.73):

$$ P_v = \frac{1}{2k^3} (aH)^2 . \tag{2.5.109} $$


Defining the tensor power spectrum Pt as the sum of the power spectra for each polarization

mode of hij , we find

$$ P_t = 2 \cdot P_h = 2 \cdot \left( \frac{2}{a M_{\rm pl}} \right)^2 \cdot P_v = \frac{4}{k^3} \frac{H^2}{M_{\rm pl}^2} , \tag{2.5.110} $$

or the dimensionless spectrum

$$ \Delta_t^2(k) = \frac{2}{\pi^2} \left. \frac{H^2}{M_{\rm pl}^2} \right|_{k=aH} . \tag{2.5.111} $$

This completes our treatment of the quantum generation of scalar and tensor fluctuations in

inflation.

2.6 The Lyth Bound

A useful way of normalizing the gravitational wave amplitude is the tensor-to-scalar ratio

$$ r \equiv \frac{\Delta_t^2}{\Delta_s^2} = 16\varepsilon = \frac{8}{M_{\rm pl}^2} \frac{\dot\phi^2}{H^2} . \tag{2.6.112} $$
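The relations (2.4.77), (2.5.111) and (2.6.112) are simple enough to encode directly. In the sketch below the factor of M_pl that was set to one in (2.4.77) is restored, so that both spectra depend on the ratio H/M_pl; the function names and the numerical inputs (including the scalar amplitude of about 2.1 × 10⁻⁹) are assumptions supplied for illustration.

```python
import math

def delta2_s(h_over_mpl, eps):
    """Scalar amplitude (2.4.77) with M_pl restored: H^2 / (8 pi^2 eps M_pl^2)."""
    return h_over_mpl ** 2 / (8.0 * math.pi ** 2 * eps)

def delta2_t(h_over_mpl):
    """Tensor amplitude (2.5.111): 2 H^2 / (pi^2 M_pl^2)."""
    return 2.0 * h_over_mpl ** 2 / math.pi ** 2

def tensor_to_scalar(h_over_mpl, eps):
    """Ratio r = Delta_t^2 / Delta_s^2, which should equal 16 eps."""
    return delta2_t(h_over_mpl) / delta2_s(h_over_mpl, eps)

def hubble_from_observables(delta2_s_obs, r):
    """Invert Delta_t^2 = r Delta_s^2 = 2 H^2 / (pi^2 M_pl^2) for H / M_pl."""
    return math.pi * math.sqrt(r * delta2_s_obs / 2.0)

# Hypothetical inputs.
h_example, eps_example = 1e-5, 0.002
r_example = tensor_to_scalar(h_example, eps_example)
h_recovered = hubble_from_observables(delta2_s(h_example, eps_example), r_example)

print("r =", r_example, " (16 eps =", 16.0 * eps_example, ")")
print("H/M_pl recovered from (Delta_s^2, r):", h_recovered)
print("H/M_pl for Delta_s^2 = 2.1e-9, r = 0.01:", hubble_from_observables(2.1e-9, 0.01))
```

Since the scalar amplitude is fixed by observation, the last function makes explicit why a measurement of r amounts to a measurement of the Hubble scale during inflation.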

It turns out that it is the size of r that determines

whether inflationary gravitational waves are detectable

in future CMB observations. Roughly, tensors will be

observable in the not too distant future, if r > 0.01.

Note that the tensor-to-scalar ratio relates directly to

the evolution of the inflaton as a function of e-folds N

$$ r = \frac{8}{M_{\rm pl}^2} \left( \frac{d\phi}{dN} \right)^2 . \tag{2.6.113} $$

The total field evolution between the time when CMB fluctuations exited the horizon at Ncmb

and the end of inflation at Nend can therefore be written as the following integral

$$ \frac{\Delta\phi}{M_{\rm pl}} = \int_{N_{\rm end}}^{N_{\rm cmb}} dN\, \sqrt{\frac{r}{8}} . \tag{2.6.114} $$

During slow-roll evolution, r(N) doesn’t evolve much and one may obtain the following approx-

imate relation, called the Lyth bound,

$$ \frac{\Delta\phi}{M_{\rm pl}} = \mathcal{O}(1) \times \left( \frac{r}{0.01} \right)^{1/2} , \tag{2.6.115} $$

where r ≡ r(Ncmb) is the tensor-to-scalar ratio on CMB scales. Large values of the tensor-to-

scalar ratio, r > 0.01, therefore correlate with ∆φ > Mpl or large-field inflation. How to make

sense of such super-Planckian field excursions in effective field theory remains a key theoretical

challenge (see Chapter 6).
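To see where the O(1) coefficient in (2.6.115) comes from, the integral (2.6.114) can be evaluated numerically. The sketch below uses the trapezoidal rule for an arbitrary r(N); the choice of 60 e-folds between horizon exit of CMB scales and the end of inflation, the constant r, and the function name are assumptions for illustration.

```python
import math

def field_excursion(r_of_n, n_start, n_stop, steps=1000):
    """Trapezoidal estimate of |Delta phi| / M_pl = integral dN sqrt(r(N) / 8)."""
    dn = (n_stop - n_start) / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.sqrt(r_of_n(n_start + i * dn) / 8.0)
    return abs(total * dn)

# Hypothetical case: r stays at its CMB value for 60 e-folds.
r_cmb = 0.01
delta_phi = field_excursion(lambda n: r_cmb, 0.0, 60.0)

print("Delta phi / M_pl for r = 0.01 over 60 e-folds:", delta_phi)
print("closed form 60 * sqrt(r / 8)                 :", 60.0 * math.sqrt(r_cmb / 8.0))
```

For constant r the result is simply the number of e-folds times √(r/8), which for r = 0.01 and roughly 60 e-folds is a field excursion of about two Planck masses. Passing a decreasing function r(N) instead of a constant reduces the excursion accordingly.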

3 Contact with Observations

3.1 Introduction

So far, we have computed the power spectra of ζ and h at horizon exit. In this chapter, we

show how to relate these results to observations of the cosmic microwave background (CMB)

and large-scale structure (LSS). Making this correspondence explicit is crucial if the data is to

be used to extract information about the early universe.


[Figure 3.1 is a diagram of comoving scales versus time, marking horizon exit, reheating, horizon re-entry, the transfer function and projection.]

Figure 3.1: From vacuum fluctuations to CMB anisotropies.

The challenge is to relate the predictions made at horizon exit (high energies) to the observ-

ables after horizon re-entry (low energies). These times are separated by a time interval in which

the physics is very uncertain. Not even the equations governing perturbations are well-known.

How can we still make predictions? The only reason that we are able to connect late-time ob-

servables to inflationary theories is the fact that the wavelengths of the perturbations of interest

were outside the horizon during the period from well before the end of inflation until the relatively

near present (see fig. 3.1).

In the previous chapter, we showed that the curvature perturbation ζ freezes after horizon

crossing for inflation driven by a single scalar field. However, this is not enough. After inflation,

the universe becomes filled with matter and radiation, and we need to establish under

which conditions ζ remains conserved on superhorizon scales. In §3.2, we review two proofs

of the conservation of ζ outside of the horizon for adiabatic matter perturbations. In §3.3, we

discuss how the subsequent subhorizon evolution processes the primordial perturbations until

the fluctuations get imprinted in the CMB surface of last scattering. This evolution involves

well-understood low-energy physics and is hence computable. We will present a simplified CMB

code to evaluate the CMB transfer function. Finally, in §3.4 we make some brief comments

about the relation between large-scale structure observables and the primordial perturbations.

3.2 Superhorizon (Non)-Evolution∗

Establishing rigorously the conditions under which ζ doesn’t evolve on superhorizon scales is

crucial for making reliable predictions about late time observables such as the CMB. In §3.2.1, we

therefore review Weinberg’s proof of the conservation of superhorizon curvature perturbations.1

Beware, coming from Weinberg, the proof is long, convoluted, but precise. For a more easy-

going (if less rigorous) proof, we sketch the ‘separate universe’ approach2 in §3.2.2. Readers

whose attention span has been sufficiently reduced by the internet3 may skip to the next section

without loss of continuity.

3.2.1 Weinberg’s Proof

We start with the gauge-invariant curvature perturbation written in Newtonian gauge variables

(see Appendix B)

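$$ \zeta \equiv -\Psi + H\delta u . \tag{3.2.1} $$

Here, Ψ is the scalar metric perturbation and δu is the velocity potential for the total

energy-momentum tensor. During slow-roll inflation δu = δφ/φ̇. The rate of change of ζ is⁴

$$ \dot\zeta = X + \mathcal{O}\!\left( \frac{k^2}{a^2 H^2} \right) , \quad \text{where} \quad X \equiv \frac{\dot{\bar\rho}\, \delta p - \dot{\bar p}\, \delta\rho}{3 (\bar\rho + \bar p)^2} . \tag{3.2.2} $$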

Thus ζ is conserved in the limit of small wavenumber if and only if X = 0 in this limit.5 We

then proceed in two steps:

1. Following Weinberg, we prove that whatever the constituents of the universe and the clas-

sical equations governing them may be, these equations always have a physical solution for

which X → 0 and ζ approaches a non-zero constant ζo in the limit k → 0. This is called

the adiabatic mode.

2. We then prove that single-field inflation excites the adiabatic mode.

Together these two facts prove that the curvature perturbations set up during single-field infla-

tion remain conserved after inflation.

¹ Weinberg (arXiv:astro-ph/0302326).

² Wands et al. (arXiv:astro-ph/0003278).

³ Nicholas Carr, The Shallows (What the internet is doing to our brains).

⁴ This follows from the conservation of the stress-tensor.

⁵ We see that X vanishes when the pressure p̄ + δp is a function only of the perturbed energy density ρ̄ + δρ.


Adiabatic Modes in Cosmology

Weinberg’s proof is based on the observation that in the special case of a spatially homogeneous

universe, the field equations and dynamical equation for matter and radiation are invariant

under coordinate transformations that are not symmetries of the unperturbed metric.6

His explicit proof was performed in Newtonian gauge whose properties are reviewed in Ap-

pendix B. For convenience we here collect some of its most relevant equations:

Newtonian gauge. Scalar perturbations to the metric in Newtonian gauge are

$$ ds^2 = -(1 + 2\Phi)\, dt^2 + a^2(t) (1 - 2\Psi)\, d\mathbf{x}^2 \equiv (\bar g_{\mu\nu} + g_{\mu\nu})\, dx^\mu dx^\nu . \tag{3.2.3} $$

The Einstein field equations are

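$$ \nabla^2\Psi - a^2\ddot\Psi - 6a\dot a\dot\Psi - a\dot a\dot\Phi - (4\dot a^2 + 2a\ddot a)\Phi = 4\pi G a^2 \left[ \delta\rho - \delta p - \nabla^2\pi \right] , \tag{3.2.4} $$

$$ \partial_i\partial_j [\Psi - \Phi] = 8\pi G a^2\, \partial_i\partial_j \pi , \tag{3.2.5} $$

$$ \partial_i [a\dot\Psi + \dot a\Phi] = -4\pi G a (\bar\rho + \bar p)\, \partial_i \delta u , \tag{3.2.6} $$

$$ 3a^2\ddot\Psi + 6a\dot a\dot\Psi + \nabla^2\Phi + 3a\dot a\dot\Phi + 6a\ddot a\Phi = 4\pi G a^2 \left[ \delta\rho + 3\delta p + \nabla^2\pi \right] . \tag{3.2.7} $$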

We will need to consider spacetime coordinate transformations of the form

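$$ x^\mu \to x^\mu + \epsilon^\mu(x) . \tag{3.2.8} $$

Metric perturbations transform as

$$ \Delta g_{00} = -2\partial_0 \epsilon_0 , \tag{3.2.9} $$

$$ \Delta g_{i0} = -\partial_0 \epsilon_i - \partial_i \epsilon_0 + 2H\epsilon_i , \tag{3.2.10} $$

$$ \Delta g_{ij} = -\partial_i \epsilon_j - \partial_j \epsilon_i + 2a\dot a\, \delta_{ij}\, \epsilon_0 . \tag{3.2.11} $$

Similarly, the perturbations of the stress-tensor transform as

$$ \Delta\delta p = \dot{\bar p}\, \epsilon_0 , \qquad \Delta\delta\rho = \dot{\bar\rho}\, \epsilon_0 , \qquad \text{and} \qquad \Delta\delta u = -\epsilon_0 . \tag{3.2.12} $$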

In Newtonian gauge, general first-order spatially homogeneous scalar and tensor perturbations

to the metric take the form

$$ g_{00} = -2\Phi(t) , \qquad g_{i0} = 0 , \qquad \text{and} \qquad g_{ij} = -2a^2(t)\Psi(t)\delta_{ij} + a^2(t) h_{ij}(t) , \tag{3.2.13} $$

where hij are transverse and traceless tensor perturbations. We now want to find those gauge

transformations (3.2.8) that preserve both the conditions of the Newtonian gauge and of spatial

homogeneity. The field equations (no matter what they are!) will necessarily be invariant under

those transformations. Eq. (3.2.9) shows that in order for g00 to remain spatially homogeneous,

the transformation parameter ε0 must be of the form

$$ \epsilon_0(t,\mathbf{x}) = \epsilon(t) + \chi(\mathbf{x}) , \tag{3.2.14} $$

so that

$$ \Delta\Phi = \dot\epsilon . \tag{3.2.15} $$

6In this respect, the theorem is analogous to the Goldstone theorem of QFT.


Eq. (3.2.10) then shows that in order for gi0 to remain equal to zero, εi must have the form

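$$ \epsilon_i(t,\mathbf{x}) = a^2(t) f_i(\mathbf{x}) - a^2(t)\, \partial_i \chi(\mathbf{x}) \int \frac{dt}{a^2(t)} . \tag{3.2.16} $$

Eq. (3.2.11) then implies

$$ \Delta g_{ij} = -a^2 (\partial_i f_j + \partial_j f_i) + 2\delta_{ij}\, a\dot a\, [\epsilon + \chi] + 2a^2\, \partial_i\partial_j \chi \int \frac{dt}{a^2} . \tag{3.2.17} $$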

In order not to introduce any spatial dependence in gij , the parameter χ must be a constant, in

which case it can be set to zero simply by absorbing it into the definition of ε(t). We must also

take fi to be of the form fi = ωijxj , where ωij is a constant matrix. Hence, we get

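$$ \Delta g_{ij} = -a^2 [\omega_{ij} + \omega_{ji}] + 2\delta_{ij}\, a\dot a\, \epsilon . \tag{3.2.18} $$

Comparing this to

$$ \Delta g_{ij} = -2a^2 \Delta\Psi\, \delta_{ij} + a^2 \Delta h_{ij} , \tag{3.2.19} $$

we find

$$ \Delta\Psi = \tfrac{1}{3}\omega_{ii} - H\epsilon , \tag{3.2.20} $$

$$ \Delta h_{ij} = -\omega_{ij} - \omega_{ji} + \tfrac{2}{3}\omega_{kk}\delta_{ij} . \tag{3.2.21} $$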

The corresponding gauge transformations of quantities appearing in the stress-tensor are given

in eq. (3.2.12). Since [gµν , Tµν ] and [gµν + ∆gµν , Tµν + ∆Tµν ] are both solutions of the field

equations, their difference must also be a solution. We conclude that there is always a spatially

homogeneous solution of the Newtonian gauge field equations, with scalar perturbations

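$$ \Psi = H\epsilon - \tfrac{1}{3}\omega_{kk} , \quad \Phi = -\dot\epsilon , \quad \delta p = -\dot{\bar p}\, \epsilon , \quad \delta\rho = -\dot{\bar\rho}\, \epsilon , \quad \delta u = \epsilon , \quad \pi = 0 . \tag{3.2.22} $$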

Exercise. Confirm that eq. (3.2.22) satisfies the Einstein equations (3.2.4)–(3.2.7).

Substituting eq. (3.2.22) into eq. (3.2.1), gives ζ the time-independent value

$$ \zeta_o = \tfrac{1}{3}\omega_{kk} . \tag{3.2.23} $$

Similarly, there is a spatially homogeneous solution with a tensor perturbation

$$ h_{ij} \propto \omega_{ij} - \tfrac{1}{3}\omega_{kk}\delta_{ij} , \qquad \pi_{ij} = 0 . \tag{3.2.24} $$

So far, this may seem a bit like an empty statement: ε is an arbitrary function of time, and

ωij is an unrelated arbitrary constant matrix. But for zero wavenumber, these are just gauge

modes! To find physical modes, we have to extend these solutions to non-zero wavenumber.

For the tensor modes, there is no subtlety: in this case, no field equations disappear for zero

wavenumber, so the solution with hij time-independent automatically has an extension to a

physical mode with non-zero wavenumber. For scalars, however, we need to work a bit more.

First, we note that the ‘constraint’7 equation (3.2.5) vanishes for zero wavenumber. To get a

physical mode we must impose on the perturbations the condition

Φ = Ψ . (3.2.25)

7It is important that we don’t have to use an evolution equation. The evolution equations are still completely

arbitrary.


From eq. (3.2.22), we therefore get

$$ \dot\epsilon = -H\epsilon + \zeta_o . \tag{3.2.26} $$

For ζo ≠ 0, this differential equation has the solution

$$ \epsilon(t) = \frac{\zeta_o}{a(t)} \int_T^t a(t')\, dt' , \tag{3.2.27} $$

with the integration limit T arbitrary. This implies the following solutions for the long-wavelength

metric perturbations

$$ \Psi = \Phi = \zeta_o \left[ -1 + \frac{H(t)}{a(t)} \int_T^t a(t')\, dt' \right] , \tag{3.2.28} $$

and matter perturbations

$$ \frac{\delta p}{\dot{\bar p}} = \frac{\delta\rho}{\dot{\bar\rho}} = -\delta u = -\frac{\zeta_o}{a(t)} \int_T^t a(t')\, dt' . \tag{3.2.29} $$

Notice that this solution satisfies X = 0 and hence is called adiabatic.
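The solution (3.2.27) and the resulting metric perturbation (3.2.28) can be checked on a concrete background. The sketch below assumes a matter-dominated universe, a(t) = t^{2/3}, which is an illustrative choice and not part of the argument above; it verifies the differential equation (3.2.26) by finite differences and shows Ψ settling to a constant once the piece that depends on the integration limit T has decayed. All names and numerical values are hypothetical.

```python
ZETA_O = 1.0   # assumed amplitude of the conserved curvature perturbation
T_REF = 1.0    # arbitrary lower integration limit T in (3.2.27)

def scale_factor(t):
    """Assumed matter-dominated background, a(t) = t^(2/3)."""
    return t ** (2.0 / 3.0)

def hubble(t):
    """H = (da/dt)/a = 2/(3t) for the background above."""
    return 2.0 / (3.0 * t)

def integral_of_a(t, steps=2000):
    """Composite Simpson rule for the integral of a(t') from T_REF to t (steps must be even)."""
    h = (t - T_REF) / steps
    total = scale_factor(T_REF) + scale_factor(t)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * scale_factor(T_REF + i * h)
    return total * h / 3.0

def epsilon(t):
    """Gauge parameter epsilon(t) of eq. (3.2.27)."""
    return ZETA_O * integral_of_a(t) / scale_factor(t)

def psi(t):
    """Metric perturbation Psi = Phi of eq. (3.2.28)."""
    return ZETA_O * (-1.0 + hubble(t) * integral_of_a(t) / scale_factor(t))

def ode_residual(t, dt=1e-3):
    """Residual of (3.2.26): d(epsilon)/dt - (-H epsilon + zeta_o)."""
    eps_dot = (epsilon(t + dt) - epsilon(t - dt)) / (2.0 * dt)
    return eps_dot - (-hubble(t) * epsilon(t) + ZETA_O)

print("residual of (3.2.26) at t = 10:", ode_residual(10.0))
for t in (2.0, 10.0, 100.0, 1000.0):
    print(f"t = {t:7.1f}   Psi / zeta_o = {psi(t):.6f}")
```

For this background Ψ tends to −3ζo/5 at late times, while the T-dependent remainder falls off as t^{−5/3}; this is the decaying mode mentioned in the text.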

Finally, we observe that two solutions of the form (3.2.27) with different values of T , but the

same ζo are still solutions of (3.2.26). The difference of these two solutions is also a solution,

but with ζo = 0. This solution is a decaying mode, so it can typically be ignored at late times,

but its existence is important for the counting of the number of adiabatic solutions.

Adiabatic Modes in Single-Field Inflation

Weinberg proved that there are always two independent physical adiabatic solutions of the dif-

ferential equations governing the scalar fluctuations. Hence, if these equations have no more

than two independent solutions, then any perturbations must be adiabatic! This is the case for

single-field inflation. We are done. The perturbations set up by single-field inflation are nec-

essarily adiabatic and will therefore remain constant on superhorizon scales even after inflation

ends.

3.2.2 Separate Universe Approach

A popular alternative to Weinberg’s proof of the conservation of ζ is the separate universe

approach. We here briefly sketch the basic idea.8

Consider different super-horizon sized regions of the universe to be evolving like separate

FRW universes, where density and pressure may take different values, but that are locally

homogeneous. After patching together the different regions, this can be used to follow the

evolution of the curvature perturbation with time. In fig. 3.2 we show two locally homogeneous

regions (a) and (b), separated by a coordinate distance λ on an initial hypersurface (e.g. uniform-

density hypersurface) at time t1. Note that ζ may be interpreted as a local rescaling of the

background scale factor, $a(t,\mathbf{x}) = a(t)\, e^{\zeta(t,\mathbf{x})}$. Moreover, adiabatic fluctuations are completely

determined by ζ alone. For adiabatic fluctuations the two regions (a) and (b) only differ by

a shift in time, $a(t)\, e^{\zeta_a}$ and $a(t)\, e^{\zeta_b}$. In particular, for adiabatic fluctuations, the density and

the pressure perturbations simply correspond to shifts forward and backwards in time along

the background solution. The two regions therefore have identical, but slightly time-shifted,

8More details can be found in Wands et al. (arXiv:astro-ph/0003278).


Figure 3.2: (Figure reproduced from Wands et al.) Illustration of the separate universe approach.

evolutions. Under these conditions, uniform-density hypersurfaces are separated by a uniform

expansion and hence the curvature perturbation, ζ, remains constant.

For non-adiabatic perturbations it is no longer possible to define a simple shift to describe both

the density and pressure perturbation. The existence of a non-zero pressure perturbation on

uniform-density hypersurfaces changes the equation of state in different regions of the universe

and hence leads to perturbations in the expansion along different worldlines between uniform-

density hypersurfaces.

This is the gist of the separate universe argument for the conservation of ζ for adiabatic

perturbations.

3.3 From Vacuum Fluctuations to CMB Anisotropies

Next, we discuss the evolution of perturbations after modes re-enter the horizon and eventually

become the anisotropies we observe in the CMB. A complete discussion of CMB physics clearly

would require a whole course of its own.9 Here, we sketch schematically the different effects

that have to be accounted for in order to relate the observed CMB spectrum to the primordial

fluctuations.

3.3.1 Statistics of Temperature Anisotropies

Inside of the horizon the curvature perturbations ζ lead to density fluctuations δρ in the pri-

mordial plasma. When the universe cools, neutral hydrogen forms and photons decouple. These

photons become the CMB and the primordial density perturbations get imprinted in the CMB

anisotropies. Fig. 3.3 shows a map of the measured CMB temperature fluctuations $\Delta T(\vec n)$ relative

to the background temperature T0 = 2.7 K, where $\vec n$ denotes the direction on the sky. The

9The formation of the CMB and its acoustic oscillations involves some fascinating physics. If you haven’t

studied this before, I strongly recommend reading Scott Dodelson’s brilliant book (Dodelson, Modern Cosmology.)

and/or listening to Matias Zaldarriaga’s recent lectures at the PiTP summer school (http://video.ias.edu/pitp-

2011).


Figure 3.3: Temperature fluctuations in the CMB. Blue spots represent directions on the sky where

the CMB temperature is $\sim 10^{-4}$ below the mean, $T_0 = 2.7$ K. This corresponds to photons losing energy

while climbing out of the gravitational potentials of overdense regions in the early universe. Yellow and

red indicate hot (underdense) regions. The statistical properties of these fluctuations contain important

information about both the background evolution and the initial conditions of the universe.

harmonic expansion of this map is

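$$ \Theta(\vec n) \equiv \frac{\Delta T(\vec n)}{T_0} = \sum_{\ell m} a_{\ell m} Y_{\ell m}(\vec n) , \tag{3.3.30} $$

where

$$ a_{\ell m} = \int d\Omega \, Y^*_{\ell m}(\vec n) \, \Theta(\vec n) . \tag{3.3.31} $$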

Here, $Y_{\ell m}(\vec n)$ are the standard spherical harmonics on a two-sphere with $\ell = 0$, $\ell = 1$ and $\ell = 2$ corresponding to the monopole, dipole and quadrupole, respectively. The magnetic quantum numbers satisfy $m = -\ell, \ldots, +\ell$. The multipole moments $a_{\ell m}$ may be combined into the rotationally-invariant angular power spectrum$^{10}$

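$$ C^{TT}_\ell = \frac{1}{2\ell+1} \sum_m \langle a^*_{\ell m} a_{\ell m} \rangle , \quad \text{or} \quad \langle a^*_{\ell m} a_{\ell' m'} \rangle = C^{TT}_\ell \, \delta_{\ell\ell'} \delta_{mm'} . \tag{3.3.32} $$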

The angular power spectrum is an important tool in the statistical analysis of the CMB. It

describes the cosmological information contained in the millions of pixels of a CMB map in

terms of a much more compact data representation. Fig. 3.4 shows recent measurements of the

CMB angular power spectrum. The figure also shows a fit of the theoretical prediction for the

CMB spectrum to the data. The theoretical curve depends both on the background cosmological

parameters and on the spectrum of initial fluctuations. We hence can use the CMB as a probe

of both.
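As an illustration of eq. (3.3.32), the following minimal Python sketch draws Gaussian multipole moments for a single multipole and recovers the power spectrum from one simulated sky. The function names, the chosen multipole and the input value of the spectrum are illustrative assumptions, and the ensemble average in (3.3.32) is replaced by an average over m, so the estimate scatters around the input by roughly sqrt(2/(2l+1)) (the usual cosmic variance).

```python
import math
import random


def draw_alm(ell, c_ell, rng):
    """Draw a_{ell m} for m = 0..ell for a real Gaussian field with <|a|^2> = c_ell."""
    alm = [complex(rng.gauss(0.0, math.sqrt(c_ell)), 0.0)]  # the m = 0 mode is real
    sigma = math.sqrt(c_ell / 2.0)  # real and imaginary parts share the variance
    for _ in range(1, ell + 1):
        alm.append(complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)))
    return alm


def estimate_c_ell(alm):
    """Average |a_{ell m}|^2 over m = -ell..ell, using |a_{ell,-m}| = |a_{ell m}|."""
    ell = len(alm) - 1
    total = abs(alm[0]) ** 2 + 2.0 * sum(abs(a) ** 2 for a in alm[1:])
    return total / (2 * ell + 1)


rng = random.Random(1)          # fixed seed, purely for reproducibility
ell, c_true = 500, 2.0          # hypothetical multipole and input power
c_hat = estimate_c_ell(draw_alm(ell, c_true, rng))
cosmic_variance = math.sqrt(2.0 / (2 * ell + 1))
print(c_hat / c_true, cosmic_variance)
```

The ratio printed first should differ from one by an amount comparable to the second number.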

3.3.2 Transfer Function and Projection Effects

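The linear evolution which relates $\zeta$ and $\Delta T$ is summarized by the transfer function $\Delta_{T\ell}(k)$ appearing in the following integral over momentum modes

$$ a_{\ell m} = 4\pi (-i)^\ell \int \frac{d^3k}{(2\pi)^3} \, \Delta_{T\ell}(k) \, \zeta_{\vec k} \, Y_{\ell m}(\hat k) . \tag{3.3.33} $$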

10This is the Legendre transform of the real space two-point correlation function 〈∆T (~n)∆T (~n′)〉.


Figure 3.4: Angular power spectrum of CMB temperature fluctuations. (Horizontal axes: multipole moment and angular scale. Labels in the figure: homogeneous, isotropic, flat, scale-invariant, superhorizon, adiabatic, Gaussian.)

The transfer function may be written as the line-of-sight integral over physical source terms $S_T(k, \tau)$ and a geometric projection factor $P_{T\ell}(k[\tau_0 - \tau])$ (combinations of Bessel functions)$^{11}$

$$ \Delta_{T\ell}(k) = \int_0^{\tau_0} d\tau \, \underbrace{S_T(k, \tau)}_{\text{Sources}} \, \underbrace{P_{T\ell}(k[\tau_0 - \tau])}_{\text{Projection}} , \tag{3.3.34} $$

where $\tau_0$ is conformal time today. The transfer functions $\Delta_{T\ell}(k)$ generally have to be computed

numerically using Boltzmann codes such as CMBFast or CAMB.12 In the next section, I will

present a simplified treatment (the so-called two fluid approximation) that captures the most

relevant physics without having to solve the complete hierarchy of Boltzmann equations. You

are then invited to write a simple Mathematica code to calculate the CMB transfer function

yourself!

Substituting (3.3.78) into (3.3.32) and using the identity

$$ \sum_{m=-\ell}^{\ell} Y_{\ell m}(\hat k) \, Y^*_{\ell m}(\hat k') = \frac{2\ell+1}{4\pi} P_\ell(\hat k \cdot \hat k') , \tag{3.3.35} $$

we find

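$$ C^{TT}_\ell = \frac{2}{\pi} \int k^2 dk \, \underbrace{P_\zeta(k)}_{\text{Inflation}} \, \underbrace{\Delta_{T\ell}(k) \Delta_{T\ell}(k)}_{\text{Anisotropies}} . \tag{3.3.36} $$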

This result shows how the primordial power spectrum $P_\zeta(k)$ gets processed into the observed CMB power spectrum $C^{TT}_\ell$. To measure the primordial spectrum, $C^{TT}_\ell$ needs to be deconvolved

by taking into account the appropriate transfer functions and projection effects, i.e. for a given

background cosmology we can compute the evolution and projection effects in eq. (3.3.36) and

therefore extract the inflationary initial conditions Pζ(k). By this deconvolution procedure, the

CMB provides a fascinating probe of the early universe.
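To make eq. (3.3.36) concrete, here is a minimal Python sketch under two loudly stated assumptions that are not part of the text: a scale-invariant spectrum $k^3 P_\zeta(k) = 2\pi^2 A$, and a toy transfer function consisting of pure projection, $\Delta_{T\ell}(k) = j_\ell(k r_\star)$, for the quadrupole only. In that case the integral reduces to $4\pi A \int dx \, j_2^2(x)/x$, and the combination $\ell(\ell+1)C_\ell/2\pi$ should come out equal to $A$. The function names and the integration cutoff are illustrative choices.

```python
import math


def j2(x):
    """Spherical Bessel function j_2(x); a short series avoids cancellation near x = 0."""
    if x < 0.05:
        return x * x / 15.0 * (1.0 - x * x / 14.0)
    return (3.0 / x**3 - 1.0 / x) * math.sin(x) - 3.0 * math.cos(x) / x**2


def plateau_quadrupole(amplitude, x_max=400.0, n=80000):
    """l(l+1) C_l / (2 pi) for l = 2, from (3.3.36) with the toy inputs described above."""
    ell = 2
    h = x_max / n
    total = 0.0
    for i in range(1, n + 1):      # trapezoidal rule; the integrand vanishes at x = 0
        x = i * h
        weight = 0.5 if i == n else 1.0
        total += weight * j2(x) ** 2 / x
    c_ell = 4.0 * math.pi * amplitude * total * h
    return ell * (ell + 1) * c_ell / (2.0 * math.pi)


amplitude = 2.0e-9                 # hypothetical value of A, for illustration only
print(plateau_quadrupole(amplitude) / amplitude)
```

The printed ratio should be very close to one. With realistic source terms the transfer function is no longer a single Bessel function, and the integral has to be done numerically for each multipole.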

11 A derivation of the source terms and the projection factors is beyond the scope of this lecture, but may be found in Dodelson’s book (see also the simplified treatment in the next section).

12 http://camb.info/


3.3.3 CMBSimple∗

High-energy theorists often consider CMBFast as a black-box that takes the primordial pertur-

bations as an input and magically spits out the CMB power spectrum as an output. This need

not be so. In this section, you will be guided to writing your own CMB code, let us call it

CMBSimple.

Our code will implement the two fluid approximation of Seljak.13 This approximation is based

on the simple observation that before recombination photons and baryons (meaning electrons

and protons) are coupled strongly to each other via Thomson scattering. We can therefore treat

the photons and baryons as a single fluid. Since the dark matter and the photon-baryon fluids don’t couple directly to each other, their stress tensors are separately conserved,

$$ \nabla_\mu T^{\mu\nu}_{(i)} = 0 , \quad \text{for each fluid } i . \tag{3.3.37} $$

Figure 3.5: Illustration of the two fluid approximation.

From this we get the fluid equations of motion. The dynamics of scalar perturbations is then governed by the continuity equation for density fluctuations $\delta = \delta\rho/\rho$ and the Euler equation for the divergence of the velocity field, $\theta = \vec\nabla \cdot \vec v$. For a single uncoupled fluid, with equation of state $w = p/\rho$ and sound speed $c_s^2 = \delta p/\delta\rho$, these are

$$ \delta' = -(1+w)(\theta - 3\phi') - 3\mathcal{H}(c_s^2 - w)\delta , \tag{3.3.38} $$

$$ \theta' = -\mathcal{H}(1-3w)\theta - \frac{w'}{1+w}\theta + \frac{c_s^2}{1+w} k^2 \delta + k^2 \phi , \tag{3.3.39} $$

where $\mathcal{H} = a'/a$ and $\phi$ is the gravitational potential, i.e. the metric perturbation in Newtonian gauge$^{14}$

$$ ds^2 = a^2(\tau) \left[ -(1+2\phi) d\tau^2 + (1-2\phi) d\vec x^2 \right] . \tag{3.3.40} $$

We first determine the linear dynamics of a single Fourier mode $\phi_{\vec k}$ and then sum over all modes at the end. For any given perturbation mode with wavenumber $\vec k = k \hat k$, gravity sources

13 Seljak (arXiv:astro-ph/9406050).

14 Note that comoving gauge is not a good gauge inside the horizon.


only velocities in the density perturbations parallel to $\hat k$. We thus take $\vec v$ to be of the form $\vec v = -i v \hat k$, which implies

$$ \theta = k v . \tag{3.3.41} $$

For cold dark matter, $w = c_s^2 = 0$, the fluid equations then become

$$ \delta_c' = -k v_c + 3\phi' , \quad v_c' = -\mathcal{H} v_c + k\phi . \tag{3.3.42} $$

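The photons contribute a pressure $p_\gamma = \frac{1}{3}\rho_\gamma$ to the photon-baryon fluid. The effective equation of state and sound speed of the photon-baryon fluid therefore are

$$ w = \frac{1}{3+4R} , \quad c_s^2 = \frac{1}{3(1+R)} , \tag{3.3.43} $$

where

$$ R = \frac{3}{4} \frac{\rho_b}{\rho_\gamma} . \tag{3.3.44} $$

In the tight coupling approximation we have $v_\gamma = v_b$ and $\delta_\gamma = \frac{4}{3}\delta_b$.$^{15}$ With this we get the photon evolution equations

$$ \delta_\gamma' = -\frac{4}{3} k v_\gamma + 4\phi' , \quad (1+R) v_\gamma' = -R \mathcal{H} v_\gamma + \frac{1}{4} k \delta_\gamma + (1+R) k \phi . \tag{3.3.45} $$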

Exercise. Derive eq. (3.3.45) more formally following the treatment in Ma and Bertschinger:16 i.e. ex-

pand the hierarchy of Boltzmann equations in the mean free path of the photons.

The fluid equations (3.3.42) and (3.3.45) have to be supplemented by the linearized Einstein

equation for the gravitational potential φ,

$$ k \left( \phi' + \mathcal{H}\phi \right) = 4\pi G a^2 \sum_i (\rho_i + p_i) v_i , \tag{3.3.46} $$

where the sum is over both fluids.

Finally, we have the Friedmann equation for the background

$$ \mathcal{H}^2 = \left( \frac{a'}{a} \right)^2 = \frac{8\pi G a^2}{3} \sum_i \rho_i . \tag{3.3.47} $$

Exercise. Show that eq. (3.3.47) has the following analytic solution

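$$ y \equiv \frac{a}{a_{eq}} = (\alpha x)^2 + 2\alpha x , \quad x \equiv \frac{\tau}{\tau_r} , \quad \alpha^2 \equiv \frac{a_{rec}}{a_{eq}} , \tag{3.3.48} $$

where $a_{rec}^{-1} \approx 1100$, $a_{eq}^{-1} \approx 2.4 \times 10^4 \, \Omega_m h^2$, and $\tau_r$ is the would-be conformal time at recombination if the universe had always been matter-dominated after the Big Bang. Show that

$$ \tau_r \equiv \left( \frac{4 a_{rec}}{\Omega_m H_0^2} \right)^{1/2} . \tag{3.3.49} $$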

15 Recall that $\delta n_\gamma = \delta n_b$, while $n_\gamma \propto T^3$ and $\rho_\gamma \propto T^4$.

16 Ma and Bertschinger (arXiv:astro-ph/9506072).


For numerical convenience we rescale time, x ≡ τ/τr, and momentum, κ ≡ kτr. From now on

primes will always refer to derivatives with respect to x. We also define

$$ \eta \equiv \frac{a'}{a} = \frac{1}{a} \frac{da}{dx} = \frac{2\alpha(\alpha x + 1)}{(\alpha x)^2 + 2\alpha x} \equiv \tau_r \mathcal{H} . \tag{3.3.50} $$
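The background functions in eqs. (3.3.48) and (3.3.50) are easy to check numerically. The minimal Python sketch below (function names are illustrative, and $\omega_m = 0.13$ is the value used in the exercise later in this section) verifies three things: that $\eta$ is the logarithmic derivative of $y$, that $(y')^2 = 4\alpha^2(1+y)$, which is the form the Friedmann equation takes for matter plus radiation, and that $y = \alpha^2$ at $x_{rec} = (\sqrt{\alpha^2+1}-1)/\alpha$.

```python
import math

OMEGA_M = 0.13                      # omega_m = Omega_m h^2, as in the exercise below
A_REC = 1.0 / 1100.0
A_EQ = 1.0 / (2.4e4 * OMEGA_M)
ALPHA = math.sqrt(A_REC / A_EQ)


def y_of_x(x):
    """Scale factor in units of a_eq, eq. (3.3.48)."""
    return (ALPHA * x) ** 2 + 2.0 * ALPHA * x


def eta_of_x(x):
    """Rescaled conformal Hubble rate, eq. (3.3.50)."""
    return 2.0 * ALPHA * (ALPHA * x + 1.0) / y_of_x(x)


def dy_dx(x, h=1.0e-6):
    """Central finite difference, used only as an independent check."""
    return (y_of_x(x + h) - y_of_x(x - h)) / (2.0 * h)


x_rec = (math.sqrt(ALPHA**2 + 1.0) - 1.0) / ALPHA
for x in (0.01, 0.1, x_rec):
    print(x, eta_of_x(x), dy_dx(x) / y_of_x(x), dy_dx(x) ** 2 / (1.0 + y_of_x(x)))
```

In each row the second and third columns should agree, and the last column should equal $4\alpha^2$.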

The fluid equations of motion then become

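$$ \delta_c' = -\kappa v_c + 3\phi' , \tag{3.3.51} $$

$$ v_c' = -\eta v_c + \kappa\phi , \tag{3.3.52} $$

$$ \delta_\gamma' = -\frac{4}{3}\kappa v_\gamma + 4\phi' , \tag{3.3.53} $$

$$ v_\gamma' = \left( 1 + \frac{3}{4} y_b \right)^{-1} \left( -\frac{3}{4} y_b \, \eta \, v_\gamma + \frac{1}{4}\kappa\delta_\gamma \right) + \kappa\phi , \tag{3.3.54} $$

with

$$ \phi' = -\eta\phi + \frac{3\eta^2}{2\kappa} \, \frac{v_\gamma \left( \frac{4}{3} + y - y_c \right) + v_c y_c}{1+y} . \tag{3.3.55} $$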

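Here, we defined $y_{b,c} \equiv y \, \Omega_{b,c}/\Omega_m$. The task now is to solve these equations for $\phi$, $\delta_\gamma$, $\delta_c$, $v_\gamma$, $v_c$, subject to the initial conditions at $x = x_i$,

$$ \phi = 1 , \tag{3.3.56} $$

$$ \delta_\gamma = -2\phi , \tag{3.3.57} $$

$$ \delta_c = \frac{3}{4}\delta_\gamma , \tag{3.3.58} $$

$$ v_\gamma = -\frac{1}{4} \frac{\kappa}{\eta} \delta_\gamma , \tag{3.3.59} $$

$$ v_c = v_\gamma . \tag{3.3.60} $$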

By setting the initial gravitational potential φ(xi) ≡ 1 we have extracted the primordial curva-

ture perturbation ζo from the transfer function, cf. eq. (3.3.78).

Exercise. Write a short Mathematica notebook to solve this system of equations from $x_i = 10^{-3}$ to $x_{rec} = (\sqrt{\alpha^2 + 1} - 1)/\alpha$ (use NDSolve). Use $\omega_m \equiv \Omega_m h^2 = 0.13$ and $\omega_b \equiv \Omega_b h^2 = 0.02$. Evaluate the solutions at $x_{rec}$ and tabulate $[\phi + \frac{1}{4}\delta_\gamma](x_{rec})$ and $v_\gamma(x_{rec})$ between $\kappa_{min} = 0.01$ and $\kappa_{max} = 300$.
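The exercise asks for a Mathematica notebook. As an illustrative alternative, here is a minimal pure-Python sketch of the same calculation. It integrates eqs. (3.3.51)-(3.3.55) with the initial conditions (3.3.56)-(3.3.60) using a hand-written fourth-order Runge-Kutta stepper on a logarithmic grid in $x$. The integrator, the step count and all function names are assumptions made for this sketch and are not part of the notes; NDSolve or any adaptive ODE solver would serve equally well.

```python
import math

OMEGA_M = 0.13                      # omega_m = Omega_m h^2
OMEGA_B = 0.02                      # omega_b = Omega_b h^2
A_REC = 1.0 / 1100.0
A_EQ = 1.0 / (2.4e4 * OMEGA_M)
ALPHA = math.sqrt(A_REC / A_EQ)
X_REC = (math.sqrt(ALPHA**2 + 1.0) - 1.0) / ALPHA


def background(x):
    """Return y = a/a_eq and eta = a'/a from eqs. (3.3.48) and (3.3.50)."""
    ax = ALPHA * x
    y = ax * ax + 2.0 * ax
    return y, 2.0 * ALPHA * (ax + 1.0) / y


def rhs(x, state, kappa):
    """Right-hand sides of eqs. (3.3.51)-(3.3.55)."""
    phi, d_g, d_c, v_g, v_c = state
    y, eta = background(x)
    y_b = y * OMEGA_B / OMEGA_M
    y_c = y - y_b                   # cold dark matter is the rest of the matter
    r = 0.75 * y_b
    dphi = -eta * phi + 1.5 * eta**2 / kappa * (
        v_g * (4.0 / 3.0 + y - y_c) + v_c * y_c) / (1.0 + y)
    dd_g = -4.0 / 3.0 * kappa * v_g + 4.0 * dphi
    dd_c = -kappa * v_c + 3.0 * dphi
    dv_g = (-r * eta * v_g + 0.25 * kappa * d_g) / (1.0 + r) + kappa * phi
    dv_c = -eta * v_c + kappa * phi
    return (dphi, dd_g, dd_c, dv_g, dv_c)


def shifted(state, slope, h):
    return tuple(s + h * k for s, k in zip(state, slope))


def solve_mode(kappa, x_i=1.0e-3, n_steps=4000):
    """Integrate one mode to x_rec; returns ([phi + delta_gamma/4], v_gamma, phi)."""
    _, eta_i = background(x_i)
    phi, d_g = 1.0, -2.0
    v_g = -0.25 * kappa / eta_i * d_g
    state = (phi, d_g, 0.75 * d_g, v_g, v_g)
    ratio = (X_REC / x_i) ** (1.0 / n_steps)   # equal steps in ln x
    x = x_i
    for _ in range(n_steps):
        h = x * (ratio - 1.0)
        k1 = rhs(x, state, kappa)
        k2 = rhs(x + 0.5 * h, shifted(state, k1, 0.5 * h), kappa)
        k3 = rhs(x + 0.5 * h, shifted(state, k2, 0.5 * h), kappa)
        k4 = rhs(x + h, shifted(state, k3, h), kappa)
        state = tuple(s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
        x += h
    phi, d_g, _, v_g, _ = state
    return phi + 0.25 * d_g, v_g, phi


for kappa in (0.01, 1.0, 10.0, 100.0, 300.0):
    print(kappa, solve_mode(kappa))
```

A simple sanity check is the mode $\kappa = 0.01$, which is far outside the horizon at recombination: $\phi$ should end somewhat below its initial value of 1, and the combination $\frac{1}{4}\delta_\gamma - \phi$, whose derivative is $-\frac{1}{3}\kappa v_\gamma$ by (3.3.53), should stay very close to its initial value $-\frac{3}{2}$. A table for the exercise is obtained by calling `solve_mode` on a grid of $\kappa$ values.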

Assuming instantaneous recombination, we have the following relation between these pertur-

bations and the observed CMB anisotropies,

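$$ \Theta(k, \vec n) = \left[ \phi + \frac{1}{4}\delta_\gamma \right](\tau_{rec}) + \vec n \cdot \vec v_\gamma(\tau_{rec}) + 2 \int_{\tau_{rec}}^{\tau_0} \phi'(\tau) \, d\tau , \tag{3.3.61} $$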

where the quantities on the r.h.s. are implicitly functions of the wavenumber k.

Explaining this formula in detail would really take us too far off the main track of these

lectures. We therefore content ourselves with a few comments and refer the interested reader

to Dodelson’s excellent book for more details and a derivation of eq. (3.3.61): The first two

terms are evaluated at recombination. This reflects the fact that the CMB is a snapshot of the

last-scattering surface. The first (monopole) term includes a term associated with the intrinsic

fluctuation in the photon density δγ and a gravitational redshift contribution arising from the

Newtonian potential φ. Combined these terms lead to the famous Sachs-Wolfe (SW) effect.

The second (dipole) term describes a Doppler shift. Besides being a source of temperature


Figure 3.6: Projection of a plane wave onto the surface of last scattering.

anisotropies, it is important for the generation of CMB polarization (see below) and the corre-

lation between polarization and temperature anisotropies. The last term leads to the Integrated

Sachs-Wolfe (ISW) effect. It doesn’t receive any contributions from the matter era, when φ′ = 0,

but only from residual radiation at early times and dark energy at late times.

Equipped with the solution for a single Fourier mode, we can now sum over all modes, weighted by the initial conditions $\zeta_{\vec k}$, and then compute the temperature anisotropies as a function of direction on the sky $\vec n$. To do this, we need to relate $\Theta(\vec n)$ in eq. (3.3.30) to the Fourier space solution $\Theta(k)$ in eq. (3.3.61). First, we note that our assumption of instantaneous recombination implies

$$ \Theta(\vec n) = \int dr \, \Theta(\vec x) \, \delta(r - r_\star) , \tag{3.3.62} $$

where the integral is over conformal distance and $r_\star$ is the angular diameter distance between us and the last-scattering surface. Here, $\Theta(\vec x)$ is the real space temperature field, related to our solution $\Theta(k)$ via

$$ \Theta(\vec x) = \int \frac{d^3k}{(2\pi)^3} \, e^{i \vec k \cdot \vec x} \, \zeta_{\vec k} \, \Theta(k) . \tag{3.3.63} $$

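The weighting by the initial condition $\zeta_{\vec k}$ was introduced because our solution $\Theta(k)$ was found for $\zeta_{\vec k} = 1$ (see eq. (3.3.56)). Hence, we get

$$ \Theta(\vec n) = \int \frac{d^3k}{(2\pi)^3} \, e^{i \vec k \cdot r_\star \vec n} \, \zeta_{\vec k} \, \Theta(k) . \tag{3.3.64} $$

Using the identity

$$ e^{i \vec k \cdot r_\star \vec n} = 4\pi \sum_{\ell m} i^\ell j_\ell(k r_\star) \, Y^*_{\ell m}(\hat k) \, Y_{\ell m}(\vec n) , \tag{3.3.65} $$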

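we find that the CMB angular power spectrum takes the form of eq. (3.3.36),

$$ C^{TT}_\ell = \frac{2}{\pi} \int k^2 dk \, P_\zeta(k) \, \Delta^2_{T\ell}(k) , \tag{3.3.66} $$

with

$$ \Delta_{T\ell}(k) = \left( \phi + \frac{1}{4}\delta_\gamma \right) j_\ell(k[\tau_0 - \tau_{rec}]) + v_\gamma \, j_\ell'(k[\tau_0 - \tau_{rec}]) + 2 \int_{\tau_{rec}}^{\tau_0} d\tau \, j_\ell(k[\tau_0 - \tau]) \, \frac{\phi'(\tau)}{\phi'(\tau_0)} . \tag{3.3.67} $$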


Here, the spherical Bessel functions $j_\ell$ come from the projection of the plane waves onto the last-scattering surface. The argument of the Bessel functions is related to the angular diameter distance between us and the last-scattering surface, which in flat space is $r_\star = \tau_0 - \tau_{rec}$.

By assuming perfect tight coupling (mean free path = zero) and instantaneous recombination,

our solution is still missing some important physics. Including the effects of a finite mean free

path for the photons and a finite duration of recombination would lead to the damping of small

scale fluctuations17 (the former effect is sometimes called Silk damping). We could rectify this

by adding viscosity directly in the fluid equations of motion.18 Here, we follow Seljak and take

a simpler, more phenomenological approach. Haters would call it a fudge. We introduce a high

momentum cutoff in the integral (3.3.36), i.e. we define

$$ C^{TT}_\ell \propto \int k^2 dk \, P_\zeta(k) \, D(k) \, \Delta^2_{T\ell}(k) , \tag{3.3.68} $$

where

$$ D(k) = e^{-(k/k_D)^2} . \tag{3.3.69} $$

The damping scale kD can be related to Ωm and Ωb (see the exercise below). See Seljak’s paper

for more details.

Exercise. Ignore the ISW term in eq. (3.3.67). For the Silk damping scale in (3.3.69) use

$$ \kappa_D^{-2} = 2 x_s^2 + \sigma^2 x_{rec}^2 , \tag{3.3.70} $$

with $x_s \equiv 0.6 \, \omega_m^{1/4} \omega_b^{-1/2} a_{rec}^{3/4}$ and $\sigma \approx 0.03$. Then use the tabulated solutions from the previous exercise to evaluate the integral (3.3.68) (use NIntegrate). Ask Mathematica to find an interpolating solution (use Interpolation). Plot $\ell(\ell+1) C^{TT}_\ell$. The result should look like this:

[Plot of $\ell(\ell+1) C^{TT}_\ell$ against $\ell$; the horizontal axis is labelled from 200 to 1400 and the vertical axis from 0.2 to 0.8.]
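The damping scale of eq. (3.3.70) can be evaluated directly from the parameters quoted in the exercises. The following minimal Python sketch does so; the function names are illustrative, and the default arguments simply repeat the values $\omega_m = 0.13$, $\omega_b = 0.02$, $a_{rec}^{-1} = 1100$ and $\sigma = 0.03$ given in the text.

```python
import math


def kappa_damping(omega_m=0.13, omega_b=0.02, a_rec=1.0 / 1100.0, sigma=0.03):
    """Damping wavenumber kappa_D (in units of 1/tau_r) from eq. (3.3.70)."""
    a_eq = 1.0 / (2.4e4 * omega_m)
    alpha = math.sqrt(a_rec / a_eq)
    x_rec = (math.sqrt(alpha**2 + 1.0) - 1.0) / alpha
    x_s = 0.6 * omega_m**0.25 * omega_b**-0.5 * a_rec**0.75
    return (2.0 * x_s**2 + sigma**2 * x_rec**2) ** -0.5


def damping(kappa, kappa_d):
    """Gaussian cutoff D of eq. (3.3.69), written in the rescaled momentum kappa."""
    return math.exp(-((kappa / kappa_d) ** 2))


kappa_d = kappa_damping()
print(kappa_d, damping(10.0, kappa_d), damping(100.0, kappa_d))
```

For these parameters $\kappa_D$ comes out at roughly 40, so modes near the upper end of the tabulated range $\kappa_{max} = 300$ are strongly suppressed.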

17 Essentially, fluctuations on scales smaller than the mean free path are damped; just like you can’t have sound waves with wavelengths smaller than the mean free path in air.

18 See Mukhanov’s book.


3.3.4 Coherent Phases and Superhorizon Fluctuations∗

Let me digress briefly to describe arguably the most dramatic observational evidence that some-

thing like inflation must have occurred in the early universe. The following is a trivialization of

arguments that have been explained beautifully in an article by Dodelson.19

The Peaks of the TT Spectrum

As we have shown in the previous chapter, inflation produces a nearly scale-invariant spectrum

of perturbations, i.e. a particular Fourier mode is drawn from a distribution with zero mean and

variance given by

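$$ \langle \zeta_{\vec k} \zeta_{\vec k'} \rangle = (2\pi)^3 \delta(\vec k + \vec k') P_\zeta(k) , \tag{3.3.71} $$

where $k^3 P_\zeta(k) \propto k^{n_s - 1}$ with $n_s \approx 1$. You might think then that the shape of the power spectrum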

can be measured in observations, and this is what convinces us that inflation is right. Of course,

it is true that we can measure the power spectrum, both of the matter and of the radiation,

and it is true that the observations agree with the theory. But, according to Dodelson, “this

is not what tingles our spines when we look at the data”. Rather, the truly striking aspect of

perturbations generated during inflation is that all Fourier modes have the same phase.

Consider a Fourier mode of $\zeta$ with wavenumber $k$. In §3.2, we proved that $\zeta_{\vec k}$ is conserved outside the horizon, $k < aH$. Since the fluctuation amplitude was constant outside the horizon, $\dot\zeta \approx 0$ at horizon re-entry. If we think of each Fourier mode as a linear combination of a sine and a cosine mode, then inflation excited only the cosine modes (defining horizon re-entry as $t \equiv 0$).

[Figure 3.7: two panels showing $k^{3/2}(\Theta_0 + \Psi)$ as a function of $\tau/\tau_{rec}$.]

Figure 3.7: Evolution of an infinite number of modes all with the same wavelength. Recombination is at

τ = τrec. (Left) Wavelength corresponding to the first peak in the CMB angular power spectrum. (Right)

Wavelength corresponding to the first trough. Although the amplitudes of all these different modes differ

from one another, since they start with the same phase, the ones on the left all reach maximum amplitude

at recombination, the ones on the right all go to zero at recombination. This leads to the acoustic peaks

of the CMB power spectrum.

Once inside the horizon, the curvature perturbation ζ sources density fluctuations δ which

evolve under the influence of gravity and pressure. In the previous section, we described this

process in Newtonian gauge, where density fluctuations are sourced by the Newtonian potential $\phi$. Inside the horizon we can think of $\phi$ and $\zeta$ interchangeably. Combining the two photon

19Dodelson, Coherent Phase Argument for Inflation (arXiv:hep-ph/0309057).


evolution equations in (3.3.45), we find

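$$ \delta_\gamma'' + \frac{R'}{1+R} \delta_\gamma' + c_s^2 k^2 \delta_\gamma = F_g[\phi] , \tag{3.3.72} $$

where $c_s$ is the sound speed of the photon-baryon fluid and $F_g$ is the gravitational source term

$$ F_g = 4 \left[ \phi'' + \frac{R'}{1+R} \phi' - \frac{1}{3} k^2 \phi \right] . \tag{3.3.73} $$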

We see that the photon-baryon fluid can sustain acoustic oscillations, where the inertia is pro-

vided by the baryons, while the pressure is provided by the photons. Imagine that recombination

happens instantaneously (as we saw in the previous section, this is not a terrible approximation).

At last scattering, fluctuations with different wavelengths are then captured at different phases

in their oscillations. Modes of a certain wavelength are captured at maximum or minimum

amplitude, while others would be captured at zero amplitude. If all Fourier modes of a given wavelength have the same phase, they interfere coherently (see fig. 3.7) and the spectrum of all Fourier modes produces a series of peaks and troughs in the CMB power spectrum as seen on the last-scattering surface. This is of course what we see in the data (see fig. 3.4). However, in order for the theory of initial fluctuations to explain this, it needs to involve a mechanism that produces coherent initial phases for all Fourier modes. Inflation does precisely that! Because fluctuations freeze when they exit the horizon, the phases for the Fourier modes were set well before the modes of interest entered the horizon. When we are admiring the peak structure of the CMB power spectrum we are really admiring the ability of the primordial mechanism for generating fluctuations to coordinate the phases of all Fourier modes. Without this coherence, the CMB power spectrum would simply be white noise (see fig. 3.8) with no peaks and troughs (in fact, this is precisely why cosmic strings or topological defects are ruled out as the primary sources for the primordial fluctuations).
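The phase-coherence argument can be illustrated with a toy model. The minimal Python sketch below assumes, purely for illustration, that every mode of a given wavelength oscillates as $A \cos(k c_s \tau + \varphi)$ with a Gaussian random amplitude $A$, and compares the mean square at recombination for coherent phases ($\varphi = 0$, the cosine mode) with that for random phases. The function name, the unit-variance amplitudes and the number of modes are assumptions of this sketch, not part of the notes.

```python
import math
import random


def mean_square_at_recombination(k_rs, coherent, n_modes=20000, seed=0):
    """Mean of [A cos(k_rs + phase)]^2 over many modes with the same k_rs = k c_s tau_rec."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_modes):
        amplitude = rng.gauss(0.0, 1.0)
        phase = 0.0 if coherent else rng.uniform(0.0, 2.0 * math.pi)
        total += (amplitude * math.cos(k_rs + phase)) ** 2
    return total / n_modes


for label, k_rs in (("peak", math.pi), ("trough", math.pi / 2.0)):
    print(label,
          mean_square_at_recombination(k_rs, coherent=True),
          mean_square_at_recombination(k_rs, coherent=False))
```

With coherent phases the power is close to its maximum at $k c_s \tau_{rec} = \pi$ and essentially zero at $\pi/2$. With random phases both wavelengths give about one half, so the peak structure disappears.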

[Figure 3.8: two panels showing $k^{3/2}(\Theta_0 + \Psi)$ as a function of $\tau/\tau_{rec}$.]

Figure 3.8: Modes corresponding to the same two wavelengths as in fig. 3.7, but this time with random

initial phases. The anisotropies at the angular scales corresponding to these wavelengths would have

identical rms’s if the phases were random, i.e. the angular peak structure of the CMB would be washed

away.

The Peaks of the TE Spectrum

The skeptic might not be convinced by the above argument. The peaks and troughs of the CMB

temperature fluctuation spectrum are at $\ell > 200$, corresponding to angular scales $\theta < 1°$. All


[Figure 3.9 plot: $(\ell+1) C^{TE}_\ell / 2\pi$ in $\mu$K$^2$ against multipole moment.]

Figure 3.9: Power spectrum of the cross-correlation between temperature and E-mode polarization anisotropies. The anti-correlation for $\ell = 50 - 200$ (corresponding to angular separations $5° > \theta > 1°$) is a distinctive signature of adiabatic fluctuations on superhorizon scales at the epoch of decoupling, confirming a fundamental prediction of the inflationary paradigm.

of these scales were within the horizon at the time of recombination. Hence, it is in principle

possible (and people have tried in the 90s) to engineer a theory of structure formation which

obeys causality and still manages to produce only the ‘cosine mode’. Such a theory would

explain the CMB peaks without appealing to inflation. This doesn’t sound like the most elegant

thing in the world, but it can’t be excluded as a logical possibility.

However, we now show that even these highly-tuned alternatives to inflation can be ruled

out by considering CMB polarization. Looking at fig. 3.9 we see that the cross-correlation

between CMB temperature fluctuations and the E-mode polarization has a negative peak around

$100 < \ell < 200$. This anti-correlation signal is also the result of phase coherence, but now

the scales involved were not within the horizon at recombination. Hence, there is no causal

mechanism (after τ = 0) that could have produced this signal. One is almost forced to consider

something like inflation with its shrinking comoving horizon leading to horizon exit and re-

entry.20 As Dodelson explains

At recombination, [the phase difference between the monopole (sourcing T) and the dipole (sourcing E) of the density field] causes the product of the two to be negative for $100 < \ell < 200$ and positive on smaller scales until $\ell \sim 400$. But this is precisely what WMAP has observed! We have clear evidence that the monopole and the dipole were out of phase with each other at recombination.

This evidence is exciting for the small scale modes ($\ell > 200$). Just as the acoustic peaks bear testimony to coherent phases, the cross-correlation of polarization and temperatures speaks to the coherence of the dipole as well. It solidifies our picture of the plasma at recombination. The evidence from the larger scale modes ($\ell < 200$) though is positively stupendous. For, these modes were not within the horizon at recombination. So the only way they could have their phases aligned is if some

20It should be mentioned here that there are two ways to get a shrinking comoving Hubble radius, 1/(aH).

During inflation H is nearly constant and the scale factor a grows exponentially. However, in a contracting

spacetime a shrinking horizon can be achieved if H grows with time. This is the mechanism employed by

ekpyrotic/cyclic cosmology. When viewed in terms of the evolution of the comoving Hubble scale inflation and

ekpyrosis appear very similar, but there are important differences, e.g. in ekpyrosis it is a challenge to match the

contracting phase to our conventional Big Bang expansion.


primordial mechanism did the job, when they were in causal contact. Inflation is

just such a mechanism.

3.3.5 CMB Polarization

In addition to anisotropies in the CMB temperature, we expect the CMB to become polarized

via Thomson scattering of the photons from free electrons just before decoupling (see fig. 3.10).

As we explain in this section, this polarization contains crucial information about the primordial

fluctuations and hence about inflation.

[Figure 3.10 labels: Quadrupole Anisotropy, Thomson Scattering, $e^-$, Linear Polarization, HOT, COLD.]

Figure 3.10: Thomson scattering of radiation with a quadrupole anisotropy generates linear polariza-

tion. If a free electron ‘sees’ an incident radiation pattern that is isotropic, then the outgoing radiation

remains unpolarized because orthogonal polarization directions cancel out. However, if the incoming

radiation field has a quadrupole component, a net linear polarization is generated.

The E/B Decomposition

The spin-2 polarization field can be decomposed into spin-0 quantities, the so-called E- and B-modes.

Characterization of the radiation field.

The mathematical characterization of CMB polarization anisotropies is slightly more involved than the description of temperature fluctuations because polarization is not a scalar field, so the standard expansion in terms of spherical harmonics is not applicable.

The anisotropy field is defined in terms of a 2×2 intensity tensor $I_{ij}(\hat n)$, where as before $\hat n$ denotes the direction on the sky. The components of $I_{ij}$ are defined relative to two orthogonal basis vectors $\hat e_1$ and $\hat e_2$ perpendicular to $\hat n$. Linear polarization is then described by the Stokes parameters $Q = \frac{1}{4}(I_{11} - I_{22})$ and $U = \frac{1}{2} I_{12}$, while the temperature anisotropy is $T = \frac{1}{4}(I_{11} + I_{22})$. The polarization magnitude and angle are $P = \sqrt{Q^2 + U^2}$ and $\alpha = \frac{1}{2}\tan^{-1}(U/Q)$. The quantity $T$ is invariant under a rotation in the plane perpendicular to $\hat n$ and hence may be expanded in terms of scalar (spin-0) spherical harmonics (3.3.30). The quantities $Q$ and $U$, however, transform under rotation by an angle $\psi$ as a spin-2 field $(Q \pm iU)(\hat n) \to e^{\mp 2i\psi}(Q \pm iU)(\hat n)$. The harmonic analysis of $Q \pm iU$ therefore requires expansion on the sphere in terms of tensor (spin-2) spherical harmonics

$$ (Q \pm iU)(\hat n) = \sum_{\ell, m} a_{\pm 2, \ell m} \; {}_{\pm 2}Y_{\ell m}(\hat n) . \tag{3.3.74} $$
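The spin-2 transformation law quoted above is easy to check numerically. The minimal Python sketch below rotates a pair of Stokes parameters by an angle $\psi$ using $(Q + iU) \to e^{-2i\psi}(Q + iU)$ and confirms that the magnitude $P$ is unchanged while the polarization angle shifts by $-\psi$. The function names and sample values are illustrative, and `atan2` is used in place of $\tan^{-1}(U/Q)$ so that the quadrant is resolved automatically.

```python
import cmath
import math


def rotate_stokes(q, u, psi):
    """Apply (Q + iU) -> exp(-2 i psi) (Q + iU) and return the new (Q, U)."""
    z = cmath.exp(-2j * psi) * complex(q, u)
    return z.real, z.imag


def polarization(q, u):
    """Polarization magnitude P and angle alpha = (1/2) atan2(U, Q)."""
    return math.hypot(q, u), 0.5 * math.atan2(u, q)


q, u, psi = 0.3, 0.4, 0.2           # hypothetical sample values
p_before, alpha_before = polarization(q, u)
p_after, alpha_after = polarization(*rotate_stokes(q, u, psi))
print(p_before, p_after, alpha_before - alpha_after)
```

The two magnitudes should agree, and the difference of the angles should equal $\psi$. A rotation by $\psi = \pi$ returns $Q$ and $U$ to themselves, which is the defining property of a spin-2 quantity.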

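For a description of the mathematical properties of these tensor spherical harmonics, ${}_{\pm 2}Y_{\ell m}$, the reader should consult the original paper by Kamionkowski et al.$^{21}$ Instead of the moments $a_{\pm 2, \ell m}$ it is convenient to introduce the linear combinations

$$ a_{E, \ell m} \equiv -\frac{1}{2} \left( a_{2, \ell m} + a_{-2, \ell m} \right) , \quad a_{B, \ell m} \equiv -\frac{1}{2i} \left( a_{2, \ell m} - a_{-2, \ell m} \right) . \tag{3.3.75} $$

Then one can define two scalar (spin-0) fields instead of the spin-2 quantities $Q$ and $U$

$$ E(\hat n) = \sum_{\ell, m} a_{E, \ell m} Y_{\ell m}(\hat n) , \quad B(\hat n) = \sum_{\ell, m} a_{B, \ell m} Y_{\ell m}(\hat n) . \tag{3.3.76} $$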
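The linear combinations in eq. (3.3.75) and their inverse can be written in a few lines. In this minimal Python sketch the function names and sample moments are illustrative; the inverse relations $a_{\pm 2} = -(a_E \pm i a_B)$ follow from solving (3.3.75) for the spin-2 moments.

```python
def e_b_moments(a_plus2, a_minus2):
    """E- and B-mode moments from the spin-2 moments, eq. (3.3.75)."""
    a_e = -0.5 * (a_plus2 + a_minus2)
    a_b = -(a_plus2 - a_minus2) / 2j
    return a_e, a_b


def spin2_moments(a_e, a_b):
    """Inverse of eq. (3.3.75): a_{+2} = -(a_E + i a_B), a_{-2} = -(a_E - i a_B)."""
    return -(a_e + 1j * a_b), -(a_e - 1j * a_b)


a_plus2, a_minus2 = complex(0.3, -0.1), complex(-0.2, 0.5)   # hypothetical values
a_e, a_b = e_b_moments(a_plus2, a_minus2)
print(a_e, a_b, spin2_moments(a_e, a_b))
```

If the two spin-2 moments are equal, the B-mode moment vanishes and the pattern is a pure E-mode.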

[Figure 3.11 panels: E < 0, E > 0, B < 0, B > 0.]

Figure 3.11: Examples of E-mode and B-mode patterns of polarization. Note that if reflected across

a line going through the center the E-mode patterns are unchanged, while the positive and negative

B-mode patterns get interchanged.

The scalar quantities E and B completely specify the linear polarization field. E-mode po-

larization is curl-free with polarization vectors that are radial around cold spots and tangential

around hot spots on the sky (see fig. 3.11). In contrast, B-mode polarization is divergence-free

but has a curl: its polarization vectors have vorticity around any given point on the sky.22

The symmetries of temperature and polarization anisotropies allow four types of correlations:

the autocorrelations of temperature fluctuations and of E- and B-modes denoted by TT , EE,

and BB, respectively, as well as the cross-correlation between temperature fluctuations and E-

modes: TE. All other correlations (TB and EB) vanish for symmetry reasons. Only the TE

and EE correlations have been detected so far.

The angular power spectra are defined as before

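$$ C^{XY}_\ell \equiv \frac{1}{2\ell+1} \sum_m \langle a^*_{X, \ell m} a_{Y, \ell m} \rangle , \quad X, Y = T, E, B , \tag{3.3.77} $$

where now both scalars $\zeta$ and tensors $h$ can act as a source

$$ a_{X, \ell m} = 4\pi (-i)^\ell \int \frac{d^3k}{(2\pi)^3} \, \Delta_{X\ell}(k) \, \{ \zeta_{\vec k}, h_{\vec k} \} \, Y_{\ell m}(\hat k) . \tag{3.3.78} $$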

21 Kamionkowski et al. (astro-ph/9611125).

22 Evidently the E and B nomenclature reflects the properties familiar from electrostatics, $\nabla \times E = 0$ and $\nabla \cdot B = 0$.


Note that there will be distinct transfer functions $\Delta_{X\ell}(k)$ for temperature fluctuations and E-

and B-mode polarization. In particular, the transfer functions for polarization will be more

involved and can’t be computed in a simple tight coupling approximation. The full Boltzmann

machinery will be needed.

A Smoking Gun

The cosmological significance of the E/B decomposition of CMB polarization is related to the

following remarkable facts:

i) scalar (density) perturbations create only E-modes and no B-modes.

ii) vector (vorticity) perturbations create mainly B-modes.23

iii) tensor (gravitational wave) perturbations create both E-modes and B-modes.

The angular power spectrum of CMB B-modes is related to the primordial tensor power

spectrum Ph(k) as follows

$$ C^{BB}_\ell = (4\pi)^2 \int k^2 dk \, \underbrace{P_h(k)}_{\text{Inflation}} \, \Delta^2_{B\ell}(k) , \tag{3.3.79} $$

where $\Delta_{B\ell}(k)$ is the transfer function for B-modes.

The fact that scalars do not produce B-modes while tensors do is the basis of the famous state-

ment that a detection of B-modes is a smoking gun of tensor modes, and therefore of inflation.

[Figure 3.12 labels: E-modes, B-modes (r = 0.3 and r = 0.01), and sensitivity curves for WMAP, Planck, EPIC-LC and EPIC-2m.]

Figure 3.12: E- and B-mode power spectra for a tensor-to-scalar ratio saturating current bounds,

r = 0.3, and for r = 0.01. Shown are also the experimental sensitivities for WMAP, Planck and two

different realizations of a future CMB satellite (CMBPol) (EPIC-LC and EPIC-2m).

What precisely would we learn from a B-mode detection?

23 However, vectors decay with the expansion of the universe and are therefore believed to be subdominant at

recombination.


1. “proof” that inflation occurred

No other early universe mechanism produces a stochastic background of tensor fluctuations

that span superhorizon scales at recombination.

2. energy scale of inflation

The tensor-to-scalar ratio is a direct measure of the energy scale of inflation

$$ V^{1/4} \sim \left( \frac{r}{0.01} \right)^{1/4} 10^{16} \ \text{GeV} . \tag{3.3.80} $$

Large values of the tensor-to-scalar ratio, r > 0.01, correspond to inflation occurring at

GUT scale energies.

3. super-Planckian field-variation

We showed in the previous chapter that the tensor-to-scalar ratio relates to the field

evolution during inflation,

$$ \frac{\Delta\phi}{M_{pl}} = O(1) \times \left( \frac{r}{0.01} \right)^{1/2} . \tag{3.3.81} $$

Observable tensors, r > 0.01, therefore imply super-Planckian field vevs. One of the key

challenges for string inflation is to make sense of such large-field models of inflation.
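The scalings in eqs. (3.3.80) and (3.3.81) can be tabulated for a few values of the tensor-to-scalar ratio. In this minimal Python sketch the function names are illustrative, and the unspecified $O(1)$ coefficient in (3.3.81) is set to one by default, which is an assumption made only to produce numbers.

```python
def inflation_energy_scale_gev(r):
    """V^(1/4) in GeV from the scaling of eq. (3.3.80)."""
    return (r / 0.01) ** 0.25 * 1.0e16


def field_excursion_in_mpl(r, prefactor=1.0):
    """Delta phi / M_pl from eq. (3.3.81); the O(1) prefactor is an assumed placeholder."""
    return prefactor * (r / 0.01) ** 0.5


for r in (0.001, 0.01, 0.1, 0.3):
    print(r, inflation_energy_scale_gev(r), field_excursion_in_mpl(r))
```

The energy scale depends only on the fourth root of $r$, so it changes little across the range shown, while the field excursion grows with the square root of $r$.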

Primordial tensor modes have of course not yet been detected, but the hunt is on. A number

of CMB experiments are now going after the elusive gravitational waves predicted by inflation.

A detection could be imminent.

3.3.6 Non-Gaussianity

The primordial fluctuations are Gaussian to a high degree. However, as we describe in detail

in Chapter 5, even a small amount of non-Gaussianity would encode a tremendous amount of

information about the physics of inflation. The primary diagnostic of non-Gaussian statistics

is the three-point function of inflationary fluctuations. In momentum space, the three-point

correlation function (or bispectrum) is

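$$ \langle \zeta_{\vec k_1} \zeta_{\vec k_2} \zeta_{\vec k_3} \rangle = (2\pi)^3 \, \delta(\vec k_1 + \vec k_2 + \vec k_3) \, f_{NL} \, B(k_1, k_2, k_3) . \tag{3.3.82} $$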

Here, fNL is a dimensionless parameter defining the amplitude of non-Gaussianity, while the

function B(k1, k2, k3) captures the momentum dependence. The amplitude and sign of fNL, as

well as the shape and scale dependence of B(k1, k2, k3), depend on the details of the interaction

generating the non-Gaussianity, making the three-point function a powerful discriminating tool

for probing models of the early universe.

Current CMB data imply $|f_{NL}| < O(100)$ (with the precise value depending on the shape of non-Gaussianity that is tested for). To appreciate the degree of Gaussianity that this implies, we should notice that the better measure of non-Gaussianity is

$$ \mathrm{NG} \sim f_{NL} \, \zeta \sim O(0.1\%) . \tag{3.3.83} $$

Constraints on Gaussianity are therefore stronger than our constraints on the flatness of the universe. Nevertheless, many popular inflationary models suggest the possibility of non-Gaussian signals an order of magnitude below the current bounds. This important regime will soon be probed by the Planck satellite. Something else to look forward to in the near future.


3.4 From Vacuum Fluctuations to Large-Scale Structure

Large-scale structure observations are becoming a more and more important complementary

probe of the early universe. While the CMB studies the largest scales (and hence earliest

horizon exits during inflation), large-scale structure probes smaller scales (and later horizon

exits).

Figure 3.13: Distribution of galaxies. The Sloan Digital Sky Survey (SDSS) has measured the positions

and distances (redshifts) of nearly a million galaxies. Galaxies first identified on 2d images, like the one

shown above on the right, have their distances measured to create the 3d map. The left image shows a

slice of such a 3d map.

3.4.1 Dark Matter Transfer Function

We derived the dark matter evolution equations in §3.3.3, cf. eq. (3.3.42). Together with the

Einstein equation for the gravitational potential φ(τ) (with initial conditions set up by inflation),

we can therefore compute the evolution of the dark matter density. This allows us to relate the

dark matter power spectrum to the power spectrum of primordial curvature perturbations,

Pδ(k, z) ⇔ Pζ(k) , (3.4.84)

where we have indicated that the observed dark matter power is a function of redshift z. This

leads to the so-called dark matter transfer function Tδ(k, τ), which is conventionally defined as

follows

$$ P_\delta(k, \tau) = \frac{4}{25} \left( \frac{k}{aH} \right)^4 T_\delta^2(k, \tau) \, P_\zeta(k) . \tag{3.4.85} $$

The numerical factor and the k-scaling that have been factored out from the transfer function

are conventional. They can be rationalized if we recall that subhorizon modes during the matter

era satisfy the following Poisson equation:

$$ \delta_k(\tau) = -\frac{2}{3} \frac{k^2}{(aH)^2} \phi_k = -\frac{2}{5} \frac{k^2}{(aH)^2} \zeta_k . \tag{3.4.86} $$


Figure 3.14: The matter power spectrum. The break in the spectrum around keq reflects the change in

the growth of density perturbations before and after matter-radiation equality, as well as the decay of the

gravitational potential for modes that enter the horizon during radiation domination. For scale-invariant

initial conditions, $k^3 P_\zeta(k) = \text{const.}$, the quantity $k^{-1} P_\delta(k)$ would be a constant for $k < k_{eq}$ (i.e. for modes that didn’t enter the horizon before matter-radiation equality).

Let us sketch the physics of the transfer function: the qualitative shape of the matter power

spectrum (see fig. 3.14) is easily understood if we recall some basic facts about the evolution

of fluctuations after they enter the horizon. Density fluctuations evolve under the competing

influence of pressure and gravity. During radiation domination the large radiation pressure pre-

vents the rapid growth of fluctuations; the density contrast only grows logarithmically, δc ∼ ln a.

Moreover, the gravitational potential φ decays inside the horizon during radiation domination.24

24 If the universe is dominated by a fluid with equation of state w and sound speed $c_s^2 = w$, the evolution equation for the gravitational potential can be written as

$$\phi_k'' + \frac{6(1+w)}{1+3w}\frac{1}{\tau}\,\phi_k' + w k^2\phi_k = 0 . \tag{3.4.87}$$

This has the following exact solution

$$\phi_k(\tau) = y^{-\alpha}\left[C_1(k)J_\alpha(y) + C_2(k)Y_\alpha(y)\right] , \qquad y \equiv \sqrt{w}\,k\tau , \qquad \alpha \equiv \frac{1}{2}\left(\frac{5+3w}{1+3w}\right) , \tag{3.4.88}$$

where $J_\alpha$ and $Y_\alpha$ are Bessel functions of order α. During the matter-dominated era, w = 0, this becomes

$$\phi_k(\tau) = C_1(k) + \frac{C_2(k)}{y^5} , \tag{3.4.89}$$

whereas during the radiation-dominated era, w = 1/3, we find

$$\phi_k(\tau) = \frac{1}{y^2}\left[C_1(k)\left(\frac{\sin y}{y} - \cos y\right) + C_2(k)\left(\frac{\cos y}{y} + \sin y\right)\right] . \tag{3.4.90}$$

In both cases the decaying mode may be dropped by setting C2(k) ≡ 0. For a radiation-dominated background,

the Newtonian potential is time-independent on superhorizon scales, limkτ→0 φk(τ) = C1(k), but decays on


Modes that enter the horizon during the radiation era will therefore be suppressed. This ex-

plains the suppression of the matter power spectrum for k > keq (see fig. 3.14). In contrast,

during matter domination the background pressure is negligible and gravitational collapse op-

erates more effectively, δc ∼ a. Importantly, the gravitational potential is constant on all scales

during the matter era. Under the simplifying assumption that there is no significant growth of

density perturbations between the time of horizon entry and matter domination one therefore

arrives at the following approximate transfer function

$$T_\delta(k) \approx \begin{cases} 1 & k < k_{\rm eq} \\ (k_{\rm eq}/k)^2 & k > k_{\rm eq} \end{cases} \tag{3.4.92}$$

Although eq. (3.4.92) is intuitively appealing for understanding the qualitative shape of the spectrum (i.e. the break in the spectrum at k ≈ keq), it is not accurate enough for most quantitative applications. Exact transfer functions can of course be computed numerically with CMBFast or

CAMB. A famous fitting function for the matter transfer function was given by Bardeen, Bond,

Kaiser and Szalay (BBKS)25

$$T_\delta(q) = \frac{\ln(1+2.34q)}{2.34q}\left[1 + 3.89q + (16.1q)^2 + (5.46q)^3 + (6.71q)^4\right]^{-1/4} , \tag{3.4.93}$$

where $q = k/(\Gamma h\,{\rm Mpc}^{-1})$ and we defined the shape parameter

$$\Gamma \equiv \Omega_m h \exp\left(-\Omega_b - \sqrt{2h}\,\Omega_b/\Omega_m\right) . \tag{3.4.94}$$
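The fitting function (3.4.93) and the shape parameter (3.4.94) are simple to evaluate numerically. The following is a minimal sketch, not a replacement for CMBFast or CAMB. The function names `shape_parameter` and `bbks_transfer`, the choice to pass k in units of Mpc⁻¹, and the cosmological parameter values in the usage lines are illustrative assumptions. The coefficient 16.1 in the quadratic term is the value of the published BBKS fit.

```python
import math


def shape_parameter(omega_m, omega_b, h):
    """Shape parameter Gamma of eq. (3.4.94)."""
    return omega_m * h * math.exp(-omega_b - math.sqrt(2.0 * h) * omega_b / omega_m)


def bbks_transfer(k_mpc, omega_m, omega_b, h):
    """BBKS fit of eq. (3.4.93); k_mpc is the wavenumber in Mpc^-1 (assumed unit)."""
    gamma = shape_parameter(omega_m, omega_b, h)
    q = k_mpc / (gamma * h)  # q = k / (Gamma h Mpc^-1)
    poly = 1.0 + 3.89 * q + (16.1 * q) ** 2 + (5.46 * q) ** 3 + (6.71 * q) ** 4
    # log1p keeps the small-q limit T -> 1 numerically accurate
    return math.log1p(2.34 * q) / (2.34 * q) * poly ** (-0.25)


# Illustrative (assumed) parameter values
params = dict(omega_m=0.3, omega_b=0.05, h=0.7)
for k in (1e-4, 1e-2, 1e-1, 1.0):
    print(f"k = {k:g} Mpc^-1  ->  T = {bbks_transfer(k, **params):.4g}")
```

On large scales the function tends to one, while for large q it falls off as ln q / q², which is the (keq/k)² suppression of eq. (3.4.92) up to a logarithm.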

More accurate transfer functions may be found in a paper by Eisenstein and Hu.26 For our purposes it is only important to note that (given the background cosmological parameters) the dark

matter transfer function can be computed and used to relate the dark matter power spectrum

Pδ(k, z) to the inflationary spectrum Pζ(k).

3.4.2 Galaxy Biasing

With the exception of gravitational lensing, we unfortunately never observe the dark matter

directly. What we observe (e.g. in galaxy surveys like the Sloan Digital Sky Survey (SDSS)) is

luminous or baryonic matter. Let us call the density contrast for galaxies δg. On large scales

the following phenomenological ansatz for relating the galaxy distribution and the dark matter

has proven useful27

$$\delta_g = b\,\delta_c \qquad {\rm or} \qquad P_{\delta_g} = b^2 P_\delta . \tag{3.4.95}$$

Here, b is the so-called (linear) bias parameter. It may be viewed as a parameter describing the ill-understood physics of galaxy formation. The bias parameter b can be obtained by

measuring the galaxy bispectrum Bδg .

subhorizon scales. In contrast, during matter-domination the growing mode linear gravitational potential is

time-independent on all scales, with a spatial profile φk given by the Poisson equation

$$-k^2\phi_k = \frac{3}{2}\mathcal{H}^2\delta_k + 3\mathcal{H}^2\phi_k . \tag{3.4.91}$$

25 Bardeen et al. (ApJ, 304, 15-61, 1986).
26 Eisenstein and Hu (astro-ph/9709112).
27 This may be ‘derived’ via the Press-Schechter formalism.


Modulo these complications the galaxy power spectrum Pδg(k) is an additional probe of infla-

tionary scalar fluctuations Pζ(k). As it probes smaller scales it is complementary to observations

of the CMB. Finally, in the presence of primordial non-Gaussianity, the galaxy bias can develop

an interesting scale-dependence b(k) on large scales. We will discuss this effect and its relevance

for early universe cosmology in Chapter 5.

3.5 Future Prospects

CMB and LSS experiments are starting to reach fantastic levels of precision. For the first time,

we will be taking a serious stab at detecting the elusive B-modes and possibly signatures of

non-Gaussianity.

4 Reheating after Inflation

If inflation is correct, then reheating is an important era in the history of our universe. It is the

time when the known matter was created. It therefore may be surprising that courses on inflation

rarely treat the reheating process as carefully as they should. The main reason for this is probably

that reheating is a very complex phenomenon that almost requires a course on its own. However,

it also contains a lot of interesting physics. In this lecture, I will try to summarize the most

important effects as we now understand them. We will see that reheating can be understood as

a nice application of quantum field theory in a time-dependent classical background. For further

details, I refer to the classic reference by Kofman, Linde and Starobinsky.1

4.1 Introduction


Before drowning ourselves in technical details, it will be useful to give a qualitative overview of our current understanding of reheating. After inflation ends, the inflaton φ starts to oscillate around the global minimum of the potential V(φ). Most of the inflationary energy ρφ becomes kinetic energy $\frac{1}{2}\dot\phi^2$. To avoid an empty universe, this energy must be converted into Standard Model degrees of freedom. Throughout this chapter, we will study a simple toy model that contains all the important physical effects

$$\mathcal{L} = \frac{1}{2}(\partial_\mu\phi)^2 - V(\phi) - g^2\phi^2\chi^2 . \tag{4.1.1}$$

Here, we have introduced a direct coupling between the inflaton and an additional boson2 χ.

The field χ is a proxy for Standard Model fields or any hidden sector fields that subsequently

decay into Standard Model fields. We assume that the effective potential has a minimum at

φ = σ. Finite σ will serve as a toy model for spontaneous symmetry breaking. Of course,

this also includes the case without symmetry breaking if we set σ = 0. Expanding around the

minimum we have

$$V(\phi) = \frac{1}{2}m^2(\phi-\sigma)^2 + \cdots . \tag{4.1.2}$$

1 Kofman, Linde and Starobinsky, Towards The Theory of Reheating After Inflation, (arXiv:hep-ph/9704452).
2 We could also include a Yukawa coupling to a fermion, $h\phi\bar\psi\psi$. However, it turns out that Bose condensation will be important for efficient reheating.


Shifting the field, φ− σ ⇒ φ, we get

$$V(\phi) + g^2\phi^2\chi^2 \;\Rightarrow\; \frac{1}{2}m^2\phi^2 + 2g^2\sigma\phi\chi^2 + g^2\phi^2\chi^2 + \cdots . \tag{4.1.3}$$

We have therefore generated two types of interactions between the inflaton and the additional

scalar: φχ2 and φ2χ2. Both of these interactions can contribute to the decay of the inflationary

energy.

In §4.2, we study reheating due to the elementary decays φ→ χχ and φφ→ χχ, with decay

rates Γφ→χχ and Γφφ→χχ. These decays lead to a transfer of energy from the inflaton oscillations

to the χ-field. In the equation of motion for the inflaton, this effect is captured by an additional

friction term,

$$\ddot\phi + 3H\dot\phi + \Gamma\dot\phi + m^2\phi = 0 . \tag{4.1.4}$$

The expansion rate H decreases with time, and reheating completes when Γ = H.

This perturbative theory of reheating ignores many crucial effects. In particular, it doesn’t include the backreaction of the classical inflaton oscillations on the quantum mechanical production of χ-particles. Specifically, the φ-χ couplings in eq. (4.1.3) imply source terms in the equation of motion for χ. For example, for the φ2χ2 interaction we have

$$\ddot\chi_k + 3H\dot\chi_k + \left(\frac{k^2}{a^2} + g^2\phi^2(t)\right)\chi_k = 0 . \tag{4.1.5}$$

In §4.3, we show that this can lead to resonance phenomena that can enormously enhance the

reheating efficiency. Depending on the size of the coupling g and the oscillation amplitude φ,

the resonance will be either narrow or broad. In the former case, a small range of momenta

($k = k_\star \pm \Delta k$) participate in the resonance,3 while the latter case allows resonance for a wide
range of momenta ($k \le k_\star$). It is now believed that the first stages of reheating most likely occur

in a regime of broad parametric resonance. To distinguish this stage from the subsequent stages

of slow reheating and thermalization, this phenomenon is called preheating. However, reheating

never completes at this stage of parametric resonance. Eventually the resonance becomes narrow

and inefficient, and the final stages of the decay of the inflaton and thermalization of its decay

products can be described by the elementary theory of reheating. Thus, the elementary theory

of reheating proves to be useful even in theories that begin with preheating. However, it should

be applied not to the original coherently oscillating inflaton field, but to its decay products, as

well as to the parts of the inflationary energy that survived preheating.

4.2 Elementary Theory of Reheating

We begin with a description of the inflaton dynamics after inflation, as the field oscillates around

a minimum of the potential. Ignoring the coupling to χ, the inflaton satisfies the Klein-Gordon

equation,

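$$\ddot\phi + 3H\dot\phi + m^2\phi = 0 , \qquad {\rm where} \qquad H^2 = \frac{1}{3M_{\rm pl}^2}\left(\frac{1}{2}\dot\phi^2 + \frac{1}{2}m^2\phi^2\right) . \tag{4.2.6}$$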

This has the following solution

$$\phi(t) \approx \Phi(t)\sin(mt) , \qquad {\rm where} \qquad \Phi(t) \sim \frac{M_{\rm pl}}{mt} . \tag{4.2.7}$$

3As we will see, this regime can be understood as the amplification of the inflaton decay due to Bose conden-

sation of the produced χ-particles.


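Averaged over many oscillations the scale factor grows as $a \sim t^{2/3}$ and the energy density is

$$\rho_\phi = \frac{1}{2}\langle\dot\phi^2\rangle + \frac{1}{2}m^2\langle\phi^2\rangle \simeq \frac{1}{2}m^2\Phi^2 \sim a^{-3} . \tag{4.2.8}$$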

This reproduces the well-known result that a scalar field oscillating in a quadratic potential

behaves as pressureless dust.
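The dust-like behaviour can be checked by integrating eq. (4.2.6) directly. The sketch below is a hedged illustration rather than part of the original derivation: it uses a hand-written fourth-order Runge-Kutta step, units with Mpl = m = 1, and an assumed initial amplitude φ = Mpl with the field at rest. The function name `evolve_inflaton` and all numerical settings are illustrative choices.

```python
import math


def evolve_inflaton(m=1.0, phi0=1.0, t_end=200.0, dt=0.01, samples=(100.0, 200.0)):
    """Integrate phi'' + 3 H phi' + m^2 phi = 0 with H^2 = rho / 3 (M_pl = 1).

    Returns {t: (rho, ln a)} at the requested sample times.
    """
    def rhs(state):
        phi, dphi, _lna = state
        rho = 0.5 * dphi ** 2 + 0.5 * m ** 2 * phi ** 2
        hubble = math.sqrt(rho / 3.0)
        return (dphi, -3.0 * hubble * dphi - m ** 2 * phi, hubble)

    def shifted(state, deriv, step):
        return tuple(s + step * d for s, d in zip(state, deriv))

    sample_steps = {int(round(ts / dt)): ts for ts in samples}
    state = (phi0, 0.0, 0.0)  # (phi, dphi/dt, ln a)
    out = {}
    for step in range(1, int(round(t_end / dt)) + 1):
        k1 = rhs(state)
        k2 = rhs(shifted(state, k1, 0.5 * dt))
        k3 = rhs(shifted(state, k2, 0.5 * dt))
        k4 = rhs(shifted(state, k3, dt))
        state = tuple(
            s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
        if step in sample_steps:
            phi, dphi, lna = state
            out[sample_steps[step]] = (0.5 * dphi ** 2 + 0.5 * m ** 2 * phi ** 2, lna)
    return out


result = evolve_inflaton()
(rho1, lna1), (rho2, lna2) = result[100.0], result[200.0]
# For pressureless dust, rho * a^3 is constant and a grows as t^(2/3)
print("rho a^3 ratio:", rho2 * math.exp(3 * lna2) / (rho1 * math.exp(3 * lna1)))
print("Delta ln a   :", lna2 - lna1, " vs (2/3) ln 2 =", 2.0 / 3.0 * math.log(2.0))
```

The product ρ a³ stays constant up to small oscillatory corrections of order H/m, and the growth of ln a between the two sample times is close to (2/3) ln 2.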

Exercise. Show that a scalar field oscillating in a quartic potential $V(\phi) = \frac{1}{4}\lambda\phi^4$ behaves like radiation: $\rho_\phi \sim a^{-4}$.

A homogeneous scalar field oscillating with frequency ω = m can be considered as a coherent

wave of φ-particles with zero momenta and particle density

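$$n_\phi = \frac{\rho_\phi}{m} = \frac{1}{2m}\left(\langle\dot\phi^2\rangle + m^2\langle\phi^2\rangle\right) \simeq \frac{1}{2}m\Phi^2 . \tag{4.2.9}$$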

So far, this has ignored the coupling to χ and the associated particle production. Let us now

include this effect perturbatively. For concreteness, we consider the coupling 2g2σφχ2. The

decay rate of the process φ→ χχ is computed via standard field theory methods4

$$\Gamma_{\phi\to\chi\chi} = \frac{g^4\sigma^2}{8\pi m} . \tag{4.2.10}$$

Taking into account the expansion of the universe, the time-evolution equations for the number

densities of the φ and χ-particles can be written as

$$\frac{1}{a^3}\frac{d(a^3 n_\phi)}{dt} = -\Gamma n_\phi \qquad {\rm and} \qquad \frac{1}{a^3}\frac{d(a^3 n_\chi)}{dt} = 2\,\Gamma n_\phi , \tag{4.2.11}$$

where the factor of two in the second equation arises because one φ-particle decays into two

χ-particles. Using eq. (4.2.9), this can be written as

$$\ddot\phi + 3H\dot\phi + \Gamma\dot\phi + m^2\phi = 0 . \tag{4.2.12}$$

We see that the particle decay can be described by an additional friction term in the Klein-Gordon equation. Reheating completes when the Hubble expansion rate $H \sim \frac{2}{3t}$ drops below the decay rate Γ. At this time the energy density is

$$\rho(t_r) = 3\Gamma^2 M_{\rm pl}^2 . \tag{4.2.13}$$

Assuming that thermodynamic equilibrium is reached quickly after that, we can relate this to

the reheating temperature

$$\rho(t_r) = 3\Gamma^2 M_{\rm pl}^2 = \frac{\pi^2}{30}\,g_\star T_r^4 , \tag{4.2.14}$$

where $g_\star \sim 100$ is the number of relativistic degrees of freedom at that time. We find

$$T_r \sim 0.1\sqrt{\Gamma M_{\rm pl}} . \tag{4.2.15}$$

Exercise. Repeat the analysis for the coupling $g^2\phi^2\chi^2$. First, show that

$$\Gamma_{\phi\phi\to\chi\chi} = \frac{g^4\Phi^2}{8\pi m} .$$

Notice that $\Gamma \propto \Phi^2 \sim 1/t^2$ decreases faster than $H \sim 1/t$. When/how does reheating end?

4Peskin and Schroeder, Introduction to Quantum Field Theory.


4.3 Parametric Resonance and Preheating

The perturbative analysis above has an important shortcoming: it ignored the coherent nature

of the oscillating inflaton field. In reality, the beginning of reheating is not well-described by

a superposition of free asymptotic single inflaton states. Instead the inflaton is a coherently

oscillating homogeneous field. When we take this into account we are led to the possibility that

the time-dependent classical inflaton background φ induces the quantum mechanical production

of matter particles χ. In this section, we will see that this can completely change our conception

of reheating.

Figure 4.1: Instability bands of the Mathieu equation.

4.3.1 QFT in a Time-Dependent Background

Consider the quantum field χ in the classical background φ(t),

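$$\chi(t,\mathbf{x}) = \int\frac{d^3k}{(2\pi)^{3/2}}\left(a_{\mathbf{k}}\,\chi_k(t)\,e^{-i\mathbf{k}\cdot\mathbf{x}} + a^\dagger_{\mathbf{k}}\,\chi^*_k(t)\,e^{i\mathbf{k}\cdot\mathbf{x}}\right) , \tag{4.3.16}$$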

where a†k and ak are creation and annihilation operators, respectively, and the mode functions

satisfy5

$$\ddot\chi_k + 3H\dot\chi_k + \left(\frac{k^2}{a^2} + g^2\phi^2(t)\right)\chi_k = 0 . \tag{4.3.17}$$

5Here, we have ignored the possibility of an explicit mass for the field χ, i.e. we have set mχ ≡ 0. It would be

straightforward to include finite mχ, but since it doesn’t lead to qualitatively new features we won’t do so.


Ignoring the expansion of the universe, eq. (4.3.17) becomes

$$\ddot\chi_k + \left(k^2 + g^2\Phi^2\sin^2(mt)\right)\chi_k = 0 . \tag{4.3.18}$$

Defining time as z ≡ mt, this becomes the Mathieu equation

χ′′k + (Ak − 2q cos 2z)χk = 0 , (4.3.19)

where

$$A_k \equiv \frac{k^2}{m^2} + 2q \qquad {\rm and} \qquad q \equiv \frac{g^2\Phi^2}{4m^2} . \tag{4.3.20}$$

The properties of the solutions to the Mathieu equation have been classified in so-called sta-

bility/instability charts (see fig. 4.1). The main characteristic of the solutions is exponential

instabilities within certain resonance bands ∆k,

χk ∝ exp (µkz) , (4.3.21)

where µk are called Floquet exponents. This corresponds to the exponential growth of occupation

numbers (i.e. particle production)

nk = |χk|2 ∝ exp (2µkz) . (4.3.22)

We will study the solutions of the Mathieu equation in two important regimes: q ≪ 1 (narrow

resonance) and q > 1 (broad resonance).

4.3.2 Narrow Resonance

We first discuss the solutions to the Mathieu equation in the regime q ≪ 1.

Narrow Resonance in Minkowski Space

The stability/instability chart of the Mathieu equation shows that, for q ≪ 1, resonances occur near $A_k^{(n)} \approx n^2$, where n ∈ ℤ (see fig. 4.1). The width of the n-th resonance band is $\Delta k^{(n)} \sim m q^n$. For q < 1, the first band is the most important:

$$k = m\left(1 \pm \tfrac{1}{2}q\right) . \tag{4.3.23}$$

The instability parameter in this band is

$$\mu_k = \sqrt{\left(\frac{q}{2}\right)^2 - \left(\frac{k}{m} - 1\right)^2} . \tag{4.3.24}$$

It vanishes at the edges of the band and is maximal at its center

$$\mu_{\rm max} = \mu_{k=m} = \frac{q}{2} = \frac{g^2\Phi^2}{8m^2} . \tag{4.3.25}$$
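The estimate µmax ≈ q/2 can be checked numerically with standard Floquet theory: integrate the Mathieu equation (4.3.19) over one period π of the coefficient, build the monodromy matrix from two independent solutions, and read off the growth rate from its trace. The sketch below is an illustration under stated assumptions. The function name `floquet_exponent`, the Runge-Kutta integrator and the step count are illustrative choices, and the equation is parametrized directly by (A, q) rather than by k.

```python
import math


def floquet_exponent(A, q, steps=4000):
    """Floquet exponent mu of x'' + (A - 2 q cos 2z) x = 0, with growth ~ exp(mu z)."""
    period = math.pi
    h = period / steps

    def accel(z, x):
        return -(A - 2.0 * q * math.cos(2.0 * z)) * x

    def propagate(x, v):
        """Advance (x, x') over one period with a classical RK4 scheme."""
        z = 0.0
        for _ in range(steps):
            k1x, k1v = v, accel(z, x)
            k2x, k2v = v + 0.5 * h * k1v, accel(z + 0.5 * h, x + 0.5 * h * k1x)
            k3x, k3v = v + 0.5 * h * k2v, accel(z + 0.5 * h, x + 0.5 * h * k2x)
            k4x, k4v = v + h * k3v, accel(z + h, x + h * k3x)
            x += h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
            v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
            z += h
        return x, v

    x1, _ = propagate(1.0, 0.0)   # first column of the monodromy matrix
    _, v2 = propagate(0.0, 1.0)   # second column
    half_trace = 0.5 * abs(x1 + v2)
    # |trace| > 2 signals instability; the multipliers are then exp(+-mu * period)
    return math.acosh(half_trace) / period if half_trace > 1.0 else 0.0


print(floquet_exponent(1.0, 0.1))  # centre of the first band, close to q/2
print(floquet_exponent(1.5, 0.1))  # between the first and second bands: stable
```

Scanning A at fixed q with this function reproduces the band structure sketched in fig. 4.1.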

This looks similar in spirit to the elementary theory of reheating where two φ-particles of mass m

decay into two χ-particles with momenta, k = m. However, as we will see, narrow resonance is

a completely different effect, relying crucially on Bose condensation of the produced particles.

It dominates over elementary decays when qm > 3H + Γ.


Figure 4.2: (Reproduced from Kofman et al.) Narrow parametric resonance for the field χ in the theory $\frac{1}{2}m^2\phi^2$ in Minkowski space for q ∼ 0.1. Here, time is plotted in units of $[m/2\pi]^{-1}$.

Exercise. Repeat the analysis for the coupling 2g2σφχ2. First, show that the problem can be mapped

to a mathematically equivalent one. Then match the parameters of the answers.

Fig. 4.2 shows a numerical simulation of the regime of narrow parametric resonance. For each

oscillation of the field φ(t) the growing mode of the field χ oscillates one time. The upper figure

shows the growth of the mode χk for the momentum k corresponding to the maximal speed of

growth. The lower figure shows the logarithm of the occupation number of particles nk in this

mode. As we see, the number of particles grows exponentially, and lnnk in the narrow resonance

regime looks like a straight line with a constant slope. This slope divided by 4π gives the value

of the parameter µk. In this particular case µk ∼ 0.05, exactly as it should be in accordance

with the relation $\mu_k \sim \frac{1}{2}q$ for this model.

Narrow Resonance as Bose Condensation∗

The narrow resonance effect can also be understood as the Bose condensation of χ-particles.6

Let us sketch the basic reasoning for the case 2g2σφχ2 : In this case, a single φ decays into

two χ. In the rest frame of the φ-particle, the momenta of the two produced χ-particles have

the same magnitude k, but opposite directions. If the corresponding states in phase space are

already occupied, then the inflaton decay is enhanced by a Bose factor. The rate of the φ→ χχ

process is proportional to

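$$\Gamma_{\phi\to\chi\chi} \propto \left|\langle \mathrm{n}_\phi - 1, \mathrm{n}_k + 1, \mathrm{n}_{-k} + 1|\, a^+_k a^+_{-k} a^-_\phi \,|\mathrm{n}_\phi, \mathrm{n}_k, \mathrm{n}_{-k}\rangle\right|^2 = (\mathrm{n}_k + 1)(\mathrm{n}_{-k} + 1)\,\mathrm{n}_\phi . \tag{4.3.26}$$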

6Mukhanov, Physical Principles of Cosmology.


The inverse decay χχ→ φ can also take place. Its rate is

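$$\Gamma_{\chi\chi\to\phi} \propto \left|\langle \mathrm{n}_\phi + 1, \mathrm{n}_k - 1, \mathrm{n}_{-k} - 1|\, a^+_\phi a^-_k a^-_{-k} \,|\mathrm{n}_\phi, \mathrm{n}_k, \mathrm{n}_{-k}\rangle\right|^2 = \mathrm{n}_k \mathrm{n}_{-k}(\mathrm{n}_\phi + 1) . \tag{4.3.27}$$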

In this section, pay careful attention to the fonts to distinguish occupation numbers (upright $\mathrm{n}$) from number densities (italic $n$). Taking into account that $\mathrm{n}_k = \mathrm{n}_{-k}$ and $\mathrm{n}_\phi \gg 1$, we find that the effective decay rate in (4.2.11) is

$$\Gamma \simeq \Gamma_\chi(1 + 2\mathrm{n}_k) . \tag{4.3.28}$$
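The step from the rates (4.3.26) and (4.3.27) to the effective rate (4.3.28) can be made explicit. The following short worked equation is a sketch under two assumptions: the modes ±k are equally occupied, and the inflaton occupation number is so large that the term quadratic in the χ occupation number can be dropped.

```latex
\begin{align}
\Gamma_{\phi\to\chi\chi} - \Gamma_{\chi\chi\to\phi}
  &\propto (\mathrm{n}_k + 1)^2\,\mathrm{n}_\phi - \mathrm{n}_k^2\,(\mathrm{n}_\phi + 1) \\
  &= (1 + 2\mathrm{n}_k)\,\mathrm{n}_\phi - \mathrm{n}_k^2
  \;\simeq\; (1 + 2\mathrm{n}_k)\,\mathrm{n}_\phi
  \qquad (\mathrm{n}_\phi \gg \mathrm{n}_k^2) .
\end{align}
```

The factor multiplying the inflaton occupation number is the Bose enhancement relative to the vacuum decay rate.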

It remains to determine $\mathrm{n}_k$ in terms of the number density $n_\chi$. In the rest frame of φ, both χ-particles have energy $\frac{1}{2}m$. The three-momentum of the produced χ-particles is therefore

$$k = \left(\left(\frac{m}{2}\right)^2 - 4g^2\sigma\phi(t)\right)^{1/2} , \tag{4.3.29}$$

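where we assume $4g^2\sigma\phi \ll m^2$. The oscillating term, $g^2\sigma\phi \simeq g^2\sigma\Phi\sin(mt)$, leads to a scattering of the momenta in phase space. If $g^2\sigma\Phi \ll \frac{1}{16}m^2$, then all particles are created in a thin shell of width

$$\Delta k \simeq m\cdot\frac{8g^2\sigma\Phi}{m^2} \ll m , \tag{4.3.30}$$

centered around $k_\star \simeq \frac{1}{2}m$. Hence,

$$\mathrm{n}_{k=\frac{1}{2}m} \simeq \frac{n_\chi}{(4\pi k_\star^2\Delta k)/(2\pi)^3} \simeq \frac{\pi^2 n_\chi}{m g^2\sigma\Phi} = \frac{\pi^2\Phi}{2g^2\sigma}\frac{n_\chi}{n_\phi} . \tag{4.3.31}$$

Bose condensation is essential when $\mathrm{n}_k \gg 1$, or

$$n_\chi > \frac{2g^2\sigma}{\pi^2\Phi}\,n_\phi . \tag{4.3.32}$$

For $\Phi \sim M_{\rm pl}$, the occupation number therefore exceeds unity as soon as a fraction $g^2\sigma/M_{\rm pl}$ of the inflaton energy is converted into χ-particles. Since the above derivation required $g^2\sigma/M_{\rm pl} \ll 16\,m^2/M_{\rm pl}^2 \sim 10^{-10}$, only a tiny fraction of the inflaton energy is transferred in the regime of the elementary theory of reheating, $\mathrm{n}_k < 1$. Bose condensation effects take over almost immediately.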

Substituting (4.3.31) in (4.3.28), gives

$$\Gamma \simeq \frac{g^4\sigma^2}{8\pi m}\left(1 + \frac{\pi^2\Phi}{g^2\sigma}\frac{n_\chi}{n_\phi}\right) , \tag{4.3.33}$$

where we used (4.2.10) for Γχ. Substituting this into the second equation in (4.2.11), leads to

$$\frac{1}{a^3}\frac{d(a^3 n_\chi)}{dN} = \frac{g^4\sigma^2}{2m^2}\left(1 + \frac{\pi^2\Phi}{g^2\sigma}\frac{n_\chi}{n_\phi}\right)n_\phi , \tag{4.3.34}$$

where N ≡ mt/2π measures the number of inflaton oscillations. For simplicity, let us ignore the

expansion of the universe and neglect the decrease of the inflaton amplitude due to particle production, i.e. Φ ≈ const. In the regime $\mathrm{n}_k \gg 1$, cf. eq. (4.3.32), the differential equation (4.3.34) can easily be integrated

$$n_\chi \propto \exp\left(\frac{2\pi^2 g^2\sigma\Phi}{m^2}\,N\right) \equiv e^{2\pi\mu N} . \tag{4.3.35}$$

As before, we find exponential growth.

Exercise. Repeat the analysis for the coupling g2φ2χ2.


Expansion and Rescattering

So far, this has ignored the expansion of the universe and the rescattering of the produced

χ-particles. Both effects reduce the efficiency of the resonance. First, we see that expansion

narrows the width of the resonance band, ∆k ∝ Φ(t) ∝ 1/t. Moreover, due to the expansion,

modes redshift out of the resonance. The rescattering also moves modes out of the resonance

band. This shows that narrow parametric resonance is quite delicate and requires detailed

numerical simulations including all relevant effects to decide if it really occurs.

4.3.3 Broad Resonance

Next, we consider the Mathieu equation (4.3.19) in the regime q ≫ 1. Fig. 4.1 shows that instabilities now occur for much broader ranges of k. Moreover, we anticipate that the instability coefficients µk will be larger and reheating will be very efficient. In this section, we discuss broad parametric resonance both numerically and analytically.

Figure 4.3: (Reproduced from Kofman et al.) Broad parametric resonance for the field χ in Minkowski space for $q \sim 2\times 10^2$ in the theory $\frac{1}{2}m^2\phi^2$.

Numerical Simulations

A numerical solution in the broad resonance regime in a Minkowski background is shown in

fig. 4.3. For each oscillation of the field φ(t) the field χk oscillates many times (since, for q ≫ 1, the effective mass for χ is much larger than the inflaton mass, $m_\chi \equiv g\Phi \gg m$). Each peak

in the amplitude of the oscillations of the field χ corresponds to a place where φ(t) = 0. At


this time the occupation number nk is not well-defined, but soon after that time it stabilizes

at a new, higher level, and remains constant until the next jump. A comparison of the two

parts of this figure demonstrates the importance of using proper variables for the description

of preheating. Both χk and the integrated dispersion 〈χ2〉 behave erratically in the process of

parametric resonance. Meanwhile nk is an adiabatic invariant. Therefore, the behavior of nk is relatively simple and predictable everywhere except during the short intervals of time when

φ(t) is very small and the particle production occurs.

Analytic Interpretation

The structure in the plot for lnnk(t) suggests that a simple analytic treatment should be possible.

Away from φ(t) = 0, the frequency of χ changes adiabatically, $|\dot\omega| \ll \omega^2$, and nk is conserved.

Particle production occurs when the adiabatic condition is violated

$$R \equiv \frac{|\dot\omega|}{\omega^2} > 1 , \qquad {\rm where} \qquad \omega(t) = \sqrt{k^2 + g^2\phi^2(t)} . \tag{4.3.36}$$

For small wavenumbers, k → 0, we find

$$R \simeq \frac{\dot\phi}{g\phi^2} . \tag{4.3.37}$$

The key point about this relation is that R diverges whenever φ→ 0, i.e. twice every oscillation.

At these points we expect explosive particle production. For finite k, adiabaticity is violated if

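$$R = \frac{g^2\phi\dot\phi}{(k^2 + g^2\phi^2)^{3/2}} \sim \frac{g^2\phi\, m\Phi}{(k^2 + g^2\phi^2)^{3/2}} > 1 , \tag{4.3.38}$$

where in the second equality we used $\dot\phi \simeq m\Phi\cos(mt) \sim m\Phi$, which is valid near the origin φ = 0. Eq. (4.3.38) implies

$$k^2 \lesssim (g^2\phi\, m\Phi)^{2/3} - g^2\phi^2 \equiv f(\phi) . \tag{4.3.39}$$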

We see that this starts to be satisfied, for small k, when the field φ(t) becomes smaller than $\phi_{\rm max} \equiv \sqrt{m\Phi/g}$, where $f(\phi_{\rm max}) = 0$. The maximum range of wavenumbers satisfies eq. (4.3.39) at $\phi_\star$, where $f'(\phi_\star) = 0$. A rough estimate is $\phi_\star \sim \frac{1}{2}\phi_{\rm max}$. The effective range of k participating in the resonance is roughly,

$$k^2 \lesssim k_\star^2 \equiv g m\Phi . \tag{4.3.40}$$
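The estimates for φmax, φ⋆ and the width of the resonance window can be verified with a few lines of code. The sketch below scans f(φ) of eq. (4.3.39) on a grid. The function names and the parameter values (taken in units of Mpl, with g and m as quoted in the caption of fig. 4.4 and an assumed amplitude Φ = 1) are illustrative choices.

```python
import math


def f(phi, g, m, Phi):
    """Right-hand side of eq. (4.3.39): upper bound on k^2 at field value phi."""
    return (g * g * phi * m * Phi) ** (2.0 / 3.0) - (g * phi) ** 2


def resonance_window(g, m, Phi, samples=100000):
    """Return (phi_max, phi_star, f(phi_star)) from a simple grid scan."""
    phi_max = math.sqrt(m * Phi / g)  # zero of f
    best_phi, best_f = 0.0, 0.0
    for i in range(1, samples):
        phi = phi_max * i / samples
        value = f(phi, g, m, Phi)
        if value > best_f:
            best_phi, best_f = phi, value
    return phi_max, best_phi, best_f


g, m, Phi = 5e-4, 1e-6, 1.0  # illustrative values in Planck units
phi_max, phi_star, f_star = resonance_window(g, m, Phi)
print("phi_star / phi_max   =", phi_star / phi_max)     # analytically 3^(-3/4) ~ 0.44
print("f(phi_star) / (g m Phi) =", f_star / (g * m * Phi))  # analytically 2 / 3^(3/2) ~ 0.38
```

Writing φ = x φmax gives f = g m Φ (x^{2/3} − x²), so the maximum lies at x = 3^{-3/4} and the maximal k² is a fixed order-one fraction of g m Φ, in line with eq. (4.3.40).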

This condition changes in a simple way if we take into

account the expansion of the universe,

$$\frac{k^2}{a^2(t)} \lesssim k_\star^2(t) \equiv g m\Phi(t) . \tag{4.3.41}$$

We see that the expansion makes broad resonance more effective, since more k-modes are red-

shifted into the instability band as time proceeds.


Exercise. Consider chaotic inflation with $V(\phi) = \frac{1}{2}m^2\phi^2$ and the coupling $g^2\phi^2\chi^2$. Broad resonance occurs for $g > 2m/\Phi \sim 10^{-6}$, where we used $\Phi \sim M_{\rm pl}$ and $m \sim 10^{-6}M_{\rm pl}$ (from COBE normalization). Is such a large coupling consistent with naturalness of the inflationary potential? Consider the one-loop correction to the inflaton mass

$$\delta m^2 = \frac{g^2}{16\pi^2}\Lambda_{\rm uv}^2 \sim \frac{g^2}{16\pi^2}M_{\rm pl}^2 .$$

Naturalness requires $\delta m < m \sim 10^{-6}M_{\rm pl}$, or $g < 10^{-5}$. This seems to disallow the regime of broad

parametric resonance for chaotic inflation. The reheating of this model then has to occur predominantly

through the elementary decays of φ and narrow resonances in χ.
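The numbers in this exercise follow from simple arithmetic, which the sketch below spells out. It assumes Planck units, the cutoff Λuv = Mpl used above, and the definition of q in eq. (4.3.20). The function names are illustrative.

```python
import math


def broad_resonance_min_coupling(m, Phi):
    """Smallest g with q = g^2 Phi^2 / (4 m^2) > 1."""
    return 2.0 * m / Phi


def naturalness_max_coupling(m, cutoff):
    """Largest g with delta m = g * cutoff / (4 pi) < m."""
    return 4.0 * math.pi * m / cutoff


def resonance_parameter(g, m, Phi):
    """q of eq. (4.3.20)."""
    return (g * Phi) ** 2 / (4.0 * m * m)


m, Phi, M_pl = 1e-6, 1.0, 1.0  # units of M_pl
g_min = broad_resonance_min_coupling(m, Phi)
g_max = naturalness_max_coupling(m, M_pl)
print("g_min (q > 1)        :", g_min)
print("g_max (naturalness)  :", g_max)
print("q at naturalness bound:", resonance_parameter(g_max, m, Phi))
```

At the naturalness bound q = (2π)², which is independent of m and far below the value q0 ∼ 3×10³ quoted for the simulation in fig. 4.4.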

Broad Resonance in an Expanding Universe

Let us study broad resonance more quantitatively and include the expansion of the universe. It is convenient to remove the Hubble friction from the equation of motion, by defining $X_k(t) \equiv a^{3/2}(t)\chi_k(t)$. The mode equation then becomes

$$\ddot X_k + \omega_k^2 X_k = 0 , \tag{4.3.42}$$

where

$$\omega_k^2 \equiv \frac{k^2}{a^2} + g^2\Phi^2\sin^2(mt) + \Delta , \qquad \Delta \equiv -\frac{3}{4}\left(3H^2 + 2\dot H\right) . \tag{4.3.43}$$

Since $3H^2 = -2\dot H$ during matter domination, we can set Δ ≡ 0. We also define the comoving occupation number of particles

$$n_k = \frac{\omega_k}{2}\left(\frac{|\dot X_k|^2}{\omega_k^2} + |X_k|^2\right) - \frac{1}{2} . \tag{4.3.44}$$

Fig. 4.4 shows a simulation of broad parametric resonance in an expanding universe (plotting

Xk rather than χk illustrates the growth of χ relative to the decay of Φ ∝ 1/t.) Note that the

number of particles nk in this process typically increases, but it may occasionally decrease as

well. This is a distinctive feature of stochastic resonance in an expanding universe. A decrease

in the number of particles is a purely quantum mechanical effect which would be impossible if

these particles were in a state of thermal equilibrium.

The stochastic resonance effect can be understood by inspecting the behavior of the phases

of the functions χk near φ(t) = 0: here, Minkowski space and an expanding universe are quali-

tatively different. In Minkowski space, near φ(t) = 0 the phases of χk are equal (see fig. 4.3). In

an expanding universe, this is not the case (see fig. 4.4): the phases of χk at successive moments

when φ(t) = 0 are practically uncorrelated. The reason is easy to understand. The frequency

of oscillations of χk is proportional to Φ, which in an expanding universe is a function of time.

In the broad resonance regime, this implies that the frequency of oscillations of χk changes a

lot during each oscillation of the inflaton φ. Interestingly, this doesn’t completely destroy the

resonance effect. As we will show analytically in the next section (and as is confirmed in the

simulations), although in an expanding universe with q ≫ 1 the phases of χk are practically

stochastic, in 75% of all events the amplitude of χk grows after passing through φ(t) = 0. Over

a long time period, the occupation number of χ-particles therefore grows (see fig. 4.5). However, the parameter $q = \frac{g^2\Phi^2}{4m^2} \propto t^{-2}$ decreases with time, making the resonance more and more

narrow. Eventually, the resonance ceases to exist and nk stabilizes at a constant value.


Figure 4.4: (Reproduced from Kofman et al.) Early stages of parametric resonance in the theory $\frac{1}{2}m^2\phi^2$ in an expanding universe with scale factor $a \sim t^{2/3}$ for $g = 5\times 10^{-4}$, $m = 10^{-6}M_{\rm pl}$. The initial value of the parameter q in this process is $q_0 \sim 3\times 10^3$.

Figure 4.5: (Reproduced from Kofman et al.) Same process as in fig. 4.4, but now followed over a longer period of time.

Broad Resonance as Schrödinger Scattering∗

We would like to have a more analytic understanding of broad resonance in an expanding

universe. The structures we have seen in the numerical simulations suggest that this is feasible.

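Let us label with $t_j$, j ∈ ℤ, the times when $\phi(t_j) = 0$. Away from those points the modes evolve adiabatically, $e^{\pm i\int\omega\,dt}$. Consider incoming waves

$$X^j_k(t) = \frac{\alpha^j_k}{\sqrt{2\omega}}\,e^{-i\int_0^t\omega\,dt} + \frac{\beta^j_k}{\sqrt{2\omega}}\,e^{+i\int_0^t\omega\,dt} , \tag{4.3.45}$$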


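where the coefficients $\alpha^j_k$ and $\beta^j_k$ are constant for $t_{j-1} < t < t_j$. At $t = t_j$ the waves scatter, and the outgoing waves are

$$X^{j+1}_k(t) = \frac{\alpha^{j+1}_k}{\sqrt{2\omega}}\,e^{-i\int_0^t\omega\,dt} + \frac{\beta^{j+1}_k}{\sqrt{2\omega}}\,e^{+i\int_0^t\omega\,dt} , \tag{4.3.46}$$

where the coefficients $\alpha^{j+1}_k$ and $\beta^{j+1}_k$ are constant for $t_j < t < t_{j+1}$. The outgoing amplitudes $\alpha^{j+1}_k$ and $\beta^{j+1}_k$ can be expressed in terms of the incoming amplitudes $\alpha^j_k$ and $\beta^j_k$ and reflection and transmission amplitudes $R_k$ and $T_k$,

$$\begin{pmatrix} \alpha^{j+1}_k e^{-i\theta^j_k} \\ \beta^{j+1}_k e^{+i\theta^j_k} \end{pmatrix} = \begin{pmatrix} \dfrac{1}{T_k} & \dfrac{R^*_k}{T^*_k} \\[2ex] \dfrac{R_k}{T_k} & \dfrac{1}{T^*_k} \end{pmatrix} \begin{pmatrix} \alpha^j_k e^{-i\theta^j_k} \\ \beta^j_k e^{+i\theta^j_k} \end{pmatrix} . \tag{4.3.47}$$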

Here, $\theta^j_k = \int_0^{t_j} dt\,\omega(t)$ is the phase accumulated by the time $t_j$. To compute $R_k$ and $T_k$, we approximate the coupling in the vicinity of $t_j$ as $g^2\phi^2(t) \approx g^2\Phi^2 m^2(t - t_j)^2 \equiv k_\star^4(t - t_j)^2$. The mode equation then is

$$\frac{d^2X_k}{dt^2} + \left(\frac{k^2}{a^2} + k_\star^4(t - t_j)^2\right)X_k = 0 . \tag{4.3.48}$$

Defining a rescaled momentum $\kappa \equiv k/(a k_\star)$ and new time variable $\tau \equiv k_\star(t - t_j)$, this becomes

$$\frac{d^2X_k}{d\tau^2} + \left(\kappa^2 + \tau^2\right)X_k = 0 . \tag{4.3.49}$$

This equation may be interpreted as a one-dimensional Schrödinger equation for particle scattering through an inverted parabolic potential, $V = -\tau^2$. The problem has therefore been reduced to a standard quantum mechanics problem. The solution is

$$R_k = -\frac{i e^{i\varphi_k}}{\sqrt{1 + e^{\pi\kappa^2}}} \qquad {\rm and} \qquad T_k = -\frac{i e^{i\varphi_k}}{\sqrt{1 + e^{-\pi\kappa^2}}} , \tag{4.3.50}$$

where

$$\varphi_k \equiv \arg\Gamma\left(\frac{1 + i\kappa^2}{2}\right) + \frac{\kappa^2}{2}\left(1 + \ln\frac{2}{\kappa^2}\right) . \tag{4.3.51}$$

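Notice that $R_k = -iT_k e^{-\frac{\pi}{2}\kappa^2}$ and $|R_k|^2 + |T_k|^2 = 1$. Substituting (4.3.50) into (4.3.47), we get

$$\begin{pmatrix} \alpha^{j+1}_k \\ \beta^{j+1}_k \end{pmatrix} = \begin{pmatrix} \sqrt{1 + e^{-\pi\kappa^2}}\,e^{i\varphi_k} & i e^{-\frac{\pi}{2}\kappa^2 + 2i\theta^j_k} \\ -i e^{-\frac{\pi}{2}\kappa^2 - 2i\theta^j_k} & \sqrt{1 + e^{-\pi\kappa^2}}\,e^{-i\varphi_k} \end{pmatrix} \begin{pmatrix} \alpha^j_k \\ \beta^j_k \end{pmatrix} . \tag{4.3.52}$$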

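We can use this to express the number density of the outgoing particles $n^{j+1}_k = |\beta^{j+1}_k|^2$ in terms of the number density of the incoming particles $n^j_k = |\beta^j_k|^2$ and the parameters of the scattering potential,

$$n^{j+1}_k = e^{-\pi\kappa^2} + \left(1 + 2e^{-\pi\kappa^2}\right)n^j_k - 2e^{-\frac{\pi}{2}\kappa^2}\sqrt{1 + e^{-\pi\kappa^2}}\,\sqrt{n^j_k(n^j_k + 1)}\,\sin\theta^j_{\rm tot} , \tag{4.3.53}$$

where $\theta^j_{\rm tot} \equiv 2\theta^j_k - \varphi_k + \arg\beta^j_k - \arg\alpha^j_k$.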
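The recursion (4.3.53) is easy to iterate. The sketch below treats the total phase at each scattering as an independent random number, uniformly distributed in [0, 2π), which is an assumption meant to mimic the stochastic regime in an expanding universe. It also holds κ² fixed, whereas in reality κ changes with the scale factor. The function names and the random seed are illustrative.

```python
import math
import random


def scatter(n, kappa2, theta):
    """One application of eq. (4.3.53): occupation number after a zero crossing."""
    e = math.exp(-math.pi * kappa2)
    interference = 2.0 * math.sqrt(e) * math.sqrt(1.0 + e) * math.sqrt(n * (n + 1.0))
    return e + (1.0 + 2.0 * e) * n - interference * math.sin(theta)


def stochastic_resonance(kappa2=0.0, n_scatterings=200, seed=1):
    """Iterate eq. (4.3.53) from the vacuum with uniformly random phases (assumption)."""
    rng = random.Random(seed)
    n = 0.0
    history = [n]
    decreases = 0
    for _ in range(n_scatterings):
        n_new = scatter(n, kappa2, rng.uniform(0.0, 2.0 * math.pi))
        if n_new < n:
            decreases += 1
        n = n_new
        history.append(n)
    return history, decreases


history, decreases = stochastic_resonance()
print("ln n after 200 scatterings:", math.log(history[-1]))
print("number of scatterings that reduced n:", decreases)
```

The occupation number grows exponentially on average, but a sizeable minority of individual scatterings reduce it, which is the behaviour seen in fig. 4.4.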

Digression. Let us digress briefly to understand part of the answer more intuitively. Consider eq. (4.3.49). This has two solutions, $\chi^{\rm in}_k$ and $\chi^{\rm out}_k$, associated with vacuum states in the far past and the far future. These two sets of modes are related by a Bogoliubov transformation

$$\chi^{\rm in}_k = \alpha_k\chi^{\rm out}_k + \beta_k\chi^{{\rm out}*}_k . \tag{4.3.54}$$

If we start in the state with no particles in the far past, the number density of particles in the far future is

$$n_k = |\beta_k|^2 . \tag{4.3.55}$$

Part of the incoming wave $\chi^{\rm in}_k$ will be transmitted, $T_k\chi^{\rm in}_k$, and part of it will be reflected, $R_k\chi^{{\rm in}*}_k$. The Bogoliubov coefficient in (4.3.54) can therefore be expressed in terms of the transmission and reflection coefficients

$$\beta_k = \frac{R^*_k}{T^*_k} . \tag{4.3.56}$$

We use a trick from quantum mechanics to relate $R_k$ and $T_k$. Moving along the real time axis, the WKB form of the solution $\chi^{\rm in}_k(\tau)$ will be violated at small τ. However, if we take τ to be complex, then we can move from τ = −∞ to τ = +∞ along a complex contour in such a way that the WKB approximation remains valid throughout,

$$\chi^{\rm in}_k(\tau) \sim \frac{1}{\sqrt{2\sqrt{\kappa^2 + \tau^2}}}\,e^{-i\int^\tau\sqrt{\kappa^2 + (\tau')^2}\,d\tau'} . \tag{4.3.57}$$

Here, the integral $\int^\tau d\tau'$ becomes a contour integral along a semi-circle of large radius in the lower complex τ-plane. For large |τ|, we can estimate the phase integral in (4.3.57) by expanding

$$\sqrt{\kappa^2 + \tau^2} \sim \tau + \frac{\kappa^2}{2\tau} . \tag{4.3.58}$$

Going around the semi-circle, this gives

$$\left(e^{-i\pi}\right)^{-i\kappa^2} \sim i e^{-\pi\kappa^2} . \tag{4.3.59}$$

This is exactly the ratio between $R^*_k$ and $T^*_k$, so we find

$$n_k = |\beta_k|^2 = e^{-\pi\kappa^2} , \tag{4.3.60}$$

in agreement with eq. (4.3.53) for $n^j_k = 0$.

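From eq. (4.3.53) we see that particle creation is significant only for $\kappa^2 \lesssim 1$, or

$$\frac{k^2}{a^2} \lesssim k_\star^2 = g m\Phi . \tag{4.3.61}$$

Compare this to our previous estimate (4.3.41). For large occupation numbers, $n^j_k \gg 1$, eq. (4.3.53) reduces to

$$n^{j+1}_k = n^j_k\,e^{2\pi\mu^j_k} , \tag{4.3.62}$$

where

$$\mu^j_k \equiv \frac{1}{2\pi}\ln\left(1 + 2e^{-\pi\kappa^2} - 2e^{-\frac{\pi}{2}\kappa^2}\sin\theta^j_{\rm tot}\,\sqrt{1 + e^{-\pi\kappa^2}}\right) . \tag{4.3.63}$$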

The first two terms correspond to spontaneous particle creations which always increase the

particle number (µk > 0). The last term, however, corresponds to induced particle creation,

which can either increase or decrease the number of particles (depending on the sign of sin θjtot).

We see that the particle creation depends crucially on the interference of the wave functions,

i.e. the phase (anti)correlation between successive scatterings. For the fastest growing mode

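(k = 0), eq. (4.3.63) gives $n^{j+1}_k > n^j_k$ if $\sin\theta^j_{\rm tot} < 1/\sqrt{2}$, i.e. (modulo 2π) if

$$-\frac{5\pi}{4} < \theta^j_{\rm tot} < \frac{\pi}{4} . \tag{4.3.64}$$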

Treating θjtot as a random variable, the range of phases in (4.3.64) implies that the solution

grows about 75% of the time.
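The quoted growth probability can be checked by counting the fraction of phases for which the exponent (4.3.63) is positive. The sketch below samples the total phase uniformly over one full cycle, which is the same assumption of a random phase made in the text. The function names are illustrative.

```python
import math


def floquet_index(kappa2, theta):
    """mu_k^j of eq. (4.3.63) for a given kappa^2 and total phase theta."""
    e = math.exp(-math.pi * kappa2)
    arg = 1.0 + 2.0 * e - 2.0 * math.sqrt(e) * math.sqrt(1.0 + e) * math.sin(theta)
    return math.log(arg) / (2.0 * math.pi)


def growth_fraction(kappa2, samples=100000):
    """Fraction of uniformly distributed phases in [0, 2 pi) with mu > 0."""
    growing = sum(
        1
        for i in range(samples)
        if floquet_index(kappa2, 2.0 * math.pi * (i + 0.5) / samples) > 0.0
    )
    return growing / samples


print(growth_fraction(0.0))  # fastest growing mode, k = 0
print(growth_fraction(1.0))  # a mode near the edge of the resonance window
```

For k = 0 the fraction is 3/4. It decreases towards 1/2 as κ² grows, since the interference term then dominates the sign of the exponent.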


Finally, let us determine the net particle creation after a number of oscillations of the inflaton and many scatterings. We start in the vacuum state $\alpha^0_k = 1$, $\beta^0_k = n^0_k = 0$ and random initial phase $\theta^0_k$. After a number of oscillations, the occupation number of χ-particles is

$$n_k(t) = \frac{1}{2}\,e^{2\pi\sum_j\mu^j_k} \equiv \frac{1}{2}\,e^{2(m\Delta t)\mu_k} , \tag{4.3.65}$$

where ∆t is the total duration of the resonance and we defined the “average” Floquet exponent,

$$\mu_k \equiv \frac{\pi}{m\Delta t}\sum_j\mu^j_k . \tag{4.3.66}$$

This parameter is typically of order unity,7 which shows that broad parametric resonance in an

expanding background can be very efficient and can convert a substantial fraction of the inflaton

energy density into matter in less than a Hubble time. The number density of created particles

is

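$$n_\chi(t) = \frac{1}{a^3}\int\frac{d^3k}{(2\pi)^3}\,n_k(t) = \frac{1}{4\pi^2a^3}\int dk\,k^2\,e^{2(mt)\mu_k} . \tag{4.3.68}$$

This can be estimated by the method of steepest descent. We first note that the function µk has a maximum µmax ≡ µ at some kmax. In practice, $k_{\rm max} \sim \frac{1}{2}k_\star$ and

$$n_\chi(t) \sim \frac{k_\star^3}{64\pi^2a^3\sqrt{\pi\mu(mt)}}\,e^{2(mt)\mu} . \tag{4.3.69}$$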

This result determines how fast the energy is drained from the inflaton field.

4.3.4 Termination of Preheating

So far, we haven’t taken into account the backreaction of the produced χ-particles on the dynamics of preheating. Ultimately, this backreaction terminates preheating.

The backreaction effect on the inflaton mass can be estimated as follows:

$$ \Delta m^2 = g^2\langle\chi^2\rangle , \tag{4.3.70} $$

where $\langle\chi^2\rangle$ is a quantum expectation value,

$$ \langle\chi^2\rangle = \frac{1}{2\pi^2 a^3}\int k^2 dk\, |X_k(t)|^2 . \tag{4.3.71} $$

Inserting the mode expansion of $X_k$, we find

$$ \Delta m^2(t) \approx \frac{g\, n_\chi(t)}{\phi(t)} . \tag{4.3.72} $$

This expression looks ill-defined when φ crosses zero. However, the equation of motion is well-behaved, so we can estimate the size of the backreaction by replacing φ by its amplitude Φ. Backreaction becomes important when $\Delta m^2 > m^2$, or

$$ n_\chi(t) > \frac{m^2\Phi(t)}{g} . \tag{4.3.73} $$

7With eq. (4.3.63), we find $\mu_k \approx \frac{1}{2\pi}\ln 3 - O(\kappa^2)$. (4.3.67)


At this time the broad resonance is destroyed and preheating ends. Using eq. (4.2.7) for Φ(t) and eq. (4.3.69) for $n_\chi(t)$, we find

$$ \Delta t \sim (\mu m)^{-1} . \tag{4.3.74} $$

This time interval is typically short compared to the Hubble time. A more rigorous analysis of

backreaction effects and the end of preheating requires numerical analysis.

4.3.5 Gravitational Waves from Preheating

Although reheating is a fundamental part of any complete theory of inflation, its imprints on

cosmological observables are unfortunately rather limited. With some luck reheating could

produce relics like cosmic strings or other topological defects that then survive in the late

universe. Another interesting possibility is that inhomogeneities in χ source gravitational waves.

The basic idea is that χ-modes (and also φ-modes from inverse scattering) contribute a source

term to the equations of motion for tensor fluctuations,

$$ h_{ij}'' + 2\frac{a'}{a}h_{ij}' - \nabla^2 h_{ij} = 16\pi G\, (T_{ij})^{TT} , \tag{4.3.75} $$

where

$$ T_{ij} = \partial_i\chi\,\partial_j\chi + \cdots . \tag{4.3.76} $$

Details of the computation of the gravitational wave spectrum can be found in a nice paper by

Kofman and collaborators.8 For high-scale inflation the signal peaks at very high frequencies. In

fact, the peak frequencies for GUT scale inflation are much too high to be observable by future

direct detection experiments. However, if inflation is very low scale, then the gravitational signal

could peak around 1 Hz and hence potentially be observable.

4.4 Conclusions

This chapter has given a sketch of some of the non-trivial physical processes that we expect to be

relevant during reheating. Despite our best efforts the treatment is still terribly incomplete: For

example, I didn’t discuss tachyonic preheating, which is relevant for understanding reheating

in hybrid inflation. Moreover, I didn’t describe the problem of thermalization of the decay

products. This involves non-perturbative rescattering of the decay products and turbulence.

However, I hope that I have given you enough background to study these important topics in

your own time.

8Dufaux et al., Theory and Numerics of Gravitational Waves from Preheating after Inflation, (arXiv:0707.0875).

5 Primordial Non-Gaussianity

5.1 Why Non-Gaussianity?

The CMB power spectrum analysis reduces the WMAP data from about $10^6$ pixels to $10^3$ multipole moments. This enormous data compression is only justified if the primordial perturbations were drawn from a Gaussian distribution with random phases. In principle, there can be a wealth of information that is contained in deviations from the perfectly Gaussian distribution. So far experiments haven’t had the sensitivity to extract this information from the data. However, this is about to change. The Planck satellite will provide accurate measurements of higher-order CMB correlations, or non-Gaussianity. This will allow the study of primordial quantum fields to move beyond the free field limit and start to constrain (or measure!) interactions.

Measurements of primordial non-Gaussianity are a powerful way to bring us closer to the

ultimate goal of particle physics, which is to determine the action (i.e. the fields, symmetries

and couplings) as a function of energy scale. At low energies, E < 1 TeV, physics is completely

described by the Standard Model of particle physics.1 Various lines of arguments suggest that

new physics should appear close to the TeV scale. The particle physics community is eagerly

awaiting experimental results from the LHC to elucidate the physics of the TeV scale. However,

it seems unlikely that this new physics will also explain the inflationary era in the early universe

(or be relevant for alternatives to inflation). To probe the physics at energy scales far exceeding

the TeV scale, we are likely to require cosmological data. In these notes I will explain how

non-Gaussianity will help in this quest. In the process, I will give the readers two essential

tools – the in-in formalism and the δN formalism – that will allow them to perform their own

computations of non-Gaussianities in models of the early universe.

The outline of this chapter is as follows: In §5.2 I describe the Gaussian and non-Gaussian statistics of the CMB from a phenomenological perspective. I introduce the bispectrum as the primary diagnostic for non-Gaussianity. I discuss how the momentum dependence of the bispectrum encodes invaluable physical information. In the rest of the notes I present different mechanisms to produce observable CMB bispectra: In §5.3 I describe quantum mechanically generated non-Gaussianities. I introduce the in-in formalism to compute higher-order correlation functions and apply it to a variety of examples. In §5.4 I discuss classical non-Gaussianities generated after horizon-exit. I explain the δN formalism and apply it to models of inflation with additional light fields. I conclude, in §5.6, with a few remarks about future prospects for both the theory and observations of primordial non-Gaussianity.

1An exception to this statement are neutrino masses.


For further reading I highly recommend the two classic papers by Maldacena2 and Weinberg3,

as well as the wonderfully clear reviews by Chen4, Lim5 and Komatsu6.

5.2 Gaussian and Non-Gaussian Statistics

5.2.1 Statistics of CMB Anisotropies

CMB experiments measure temperature fluctuations ∆T as a function of the position n on the

sky. Depending on the resolution of the experiment, the CMB maps have Npix number of pixels,

ni, with ∆Ti ≡ ∆T (ni). The temperature anisotropy is Gaussian when its probability density

function (PDF) is

$$ P_g(\Delta T) = \frac{1}{(2\pi)^{N_{\rm pix}/2}|\xi|^{1/2}} \exp\left[ -\frac{1}{2}\sum_{ij} \Delta T_i\, (\xi^{-1})_{ij}\, \Delta T_j \right] , \tag{5.2.1} $$

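where $\xi_{ij} \equiv \langle\Delta T_i \Delta T_j\rangle$ is the covariance matrix (or two-point correlation function) of the temperature anisotropy and $|\xi|$ is its determinant. It is common practice to expand $\Delta T$ in spherical harmonics, $\Delta T(n) = \sum_{\ell m} a_{\ell m} Y_{\ell m}(n)$. The Gaussian PDF for the $a_{\ell m}$’s is

$$ P_g(a) = \frac{1}{(2\pi)^{N_{\rm harm}/2}|C|^{1/2}} \exp\left[ -\frac{1}{2}\sum_{\ell m}\sum_{\ell' m'} a^*_{\ell m}\, (C^{-1})_{\ell m,\ell' m'}\, a_{\ell' m'} \right] , \tag{5.2.2} $$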

where $C_{\ell m,\ell' m'} \equiv \langle a^*_{\ell m} a_{\ell' m'}\rangle$ and $N_{\rm harm}$ is the number of $\ell$ and $m$ that are summed over. For a Gaussian CMB the covariance matrix, $C_{\ell m,\ell' m'}$, provides a full description of the data. All higher correlations either vanish, $\langle a_{\ell m} a_{\ell' m'} a_{\ell'' m''}\rangle = 0$, or can be expressed in terms of $C_{\ell m,\ell' m'}$. When the CMB is statistically homogeneous and isotropic (i.e. invariant under translations and rotations on the sky), then7

$$ C_{\ell m,\ell' m'} = C_\ell\, \delta_{\ell\ell'}\delta_{mm'} , \tag{5.2.3} $$

and (5.2.2) reduces to

$$ P_g(a) = \prod_{\ell m} \frac{e^{-|a_{\ell m}|^2/(2C_\ell)}}{\sqrt{2\pi C_\ell}} . \tag{5.2.4} $$

How do we describe a non-Gaussian CMB? Naively, we have a problem. There is only one

way for the CMB to be Gaussian but an infinite number of ways of being non-Gaussian. So

which non-Gaussian PDF do we pick? Fortunately, we know from observations that the CMB

is very close to the Gaussian distribution (5.2.2). It therefore makes sense to “Taylor expand”8

the probability distribution around a Gaussian distribution

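$$ P(a) = \left[ 1 - \frac{1}{6}\sum_{{\rm all}\ \ell_i m_j} \langle a_{\ell_1 m_1} a_{\ell_2 m_2} a_{\ell_3 m_3}\rangle\, \frac{\partial}{\partial a_{\ell_1 m_1}} \frac{\partial}{\partial a_{\ell_2 m_2}} \frac{\partial}{\partial a_{\ell_3 m_3}} + \cdots \right] \times P_g(a) . \tag{5.2.5} $$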

2Maldacena, (arXiv:astro-ph/0210603).
3Weinberg, (arXiv:hep-th/0506236).
4Chen, (arXiv:1002.1416).
5Lim, (Part III, Advanced Cosmology).
6Komatsu, (arXiv:1003.6097).
7This is equivalent to $\xi_{ij} = \langle\Delta T_i \Delta T_j\rangle = \xi(|n_i - n_j|)$.
8Strictly speaking this is a “Gram-Charlier expansion”.


Evaluating the derivatives results in

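$$ P(a) = P_g(a) \times \left\{ 1 + \frac{1}{6}\sum_{{\rm all}\ \ell_i m_j} \langle a_{\ell_1 m_1} a_{\ell_2 m_2} a_{\ell_3 m_3}\rangle \left[ (C^{-1}a)_{\ell_1 m_1}(C^{-1}a)_{\ell_2 m_2}(C^{-1}a)_{\ell_3 m_3} - 3\,(C^{-1})_{\ell_1 m_1,\ell_2 m_2}(C^{-1}a)_{\ell_3 m_3} \right] \right\} . \tag{5.2.6} $$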

This formula tells us, as expected, that the leading deviation from the Gaussian PDF is pro-

portional to the angular bispectrum 〈a`1m1a`2m2a`3m3〉. The formula is also used to estimate

the angular bispectrum from data by maximizing this PDF. For the remainder of these notes it

suffices to know that this can be done. We will instead focus our attention on describing possible

sources for a non-zero bispectrum.

5.2.2 Sources of Non-Gaussianity

We distinguish the following sources for a non-Gaussian CMB:

1. Primordial non-Gaussianity

Non-Gaussianity in the primordial curvature perturbation ζ produced in the very early

universe by inflation (or an alternative).

2. Second-order non-Gaussianity

Non-Gaussianity arising from non-linearities in the transfer function relating ζ to the CMB

temperature anisotropy ∆T at recombination.

3. Secondary non-Gaussianity

Non-Gaussianity generated by ‘late’ time effects after recombination (e.g. lensing).

4. Foreground non-Gaussianity

Non-Gaussianity created by Galactic and extra-Galactic sources.

All of these sources contribute to the observed signal and it is important to understand them

both theoretically and empirically. Only if we understand the secondary non-Gaussianity well

enough can we hope to extract the primordial signal reliably. Having said that, for the remainder

of these notes we will be high-energy chauvinists and focus entirely on the microphysical origin

of primordial non-Gaussianity.

5.2.3 Primordial Bispectrum

The leading non-Gaussian signature is the three-point correlation function, or its Fourier-

equivalent, the bispectrum

$$ \langle\zeta_{\mathbf{k}_1}\zeta_{\mathbf{k}_2}\zeta_{\mathbf{k}_3}\rangle = B_\zeta(\mathbf{k}_1,\mathbf{k}_2,\mathbf{k}_3) . \tag{5.2.7} $$

For perturbations around an FRW background, the momentum dependence of the bispectrum simplifies considerably: Because of homogeneity, or translation invariance, the bispectrum is proportional to a delta function of the sum of the momenta, $B_\zeta(\mathbf{k}_1,\mathbf{k}_2,\mathbf{k}_3) \propto \delta(\mathbf{k}_1+\mathbf{k}_2+\mathbf{k}_3)$, i.e. the momentum 3-vectors must form a closed triangle. Because of isotropy, or rotational invariance, the bispectrum depends only on the magnitudes of the momentum vectors, but not on their orientations,

$$ B_\zeta(\mathbf{k}_1,\mathbf{k}_2,\mathbf{k}_3) = (2\pi)^3\delta(\mathbf{k}_1+\mathbf{k}_2+\mathbf{k}_3)\, B_\zeta(k_1,k_2,k_3) . \tag{5.2.8} $$

If the power spectrum is scale-invariant, the shape of the bispectrum only depends on two ratios of $k_i$’s, say $x_2 \equiv k_2/k_1$ and $x_3 \equiv k_3/k_1$,

$$ B_\zeta(k_1,k_2,k_3) = k_1^{-6}\, B_\zeta(1,x_2,x_3) . \tag{5.2.9} $$

Most of our examples will be of the scale-invariant form, but for the moment we will keep an

open mind and not restrict to that case.

5.2.4 Shape, Running and Amplitude

The bispectrum is often written as

$$ B_\zeta(k_1,k_2,k_3) = \frac{S(k_1,k_2,k_3)}{(k_1k_2k_3)^2}\cdot\Delta_\zeta^2(k_\star) , \tag{5.2.10} $$

where $\Delta_\zeta(k_\star) \equiv k_\star^3 P_\zeta(k_\star)$ is the dimensionless power spectrum evaluated at a fiducial momentum scale $k_\star$. The function S is dimensionless and, for scale-invariant bispectra, invariant under rescaling of all momenta. Moreover, it is the function S that appears in the integrals of the CMB bispectra. We distinguish two types of momentum dependence of S:

• The shape of the bispectrum refers to the dependence of S on the momentum ratios $k_2/k_1$ and $k_3/k_1$, while fixing the overall momentum scale $K \equiv \frac{1}{3}(k_1+k_2+k_3)$.

• The running of the bispectrum refers to the dependence of the bispectrum on the overall

momentum K, while keeping the ratios k2/k1 and k3/k1 fixed.

Figure 5.1: Momentum configurations of the bispectrum (axes $x_2$ and $x_3$; labelled configurations: squeezed, equilateral, folded, isosceles, elongated).

Finally, it is conventional to define the amplitude of non-Gaussianity as the size of the bispectrum in the equilateral momentum configuration,9

$$ f_{\rm NL}(K) = \frac{5}{18}\, S(K,K,K) , \tag{5.2.11} $$

9The factor of 5/18 is a historical convention (see §5.2.5).


where we have indicated that fNL can depend on the overall momentum. However, for scale-

invariant bispectra, fNL is a constant and it is convenient to extract it explicitly from the shape

function. In this case we write the bispectrum as

$$ B_\zeta(k_1,k_2,k_3) = \frac{18}{5} f_{\rm NL}\cdot\frac{S(k_1,k_2,k_3)}{(k_1k_2k_3)^2}\cdot\Delta_\zeta^2 , \tag{5.2.12} $$

where now the shape function is normalized as $S(K,K,K) \equiv 1$. We will use this definition of S in the remainder.

5.2.5 Shapes of Non-Gaussianity

Local Non-Gaussianity

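One of the first ways to parameterize non-Gaussianity phenomenologically was via a non-linear correction to a Gaussian perturbation $\zeta_g$,10

$$ \zeta(\mathbf{x}) = \zeta_g(\mathbf{x}) + \frac{3}{5} f_{\rm NL}^{\rm loc.}\left[ \zeta_g(\mathbf{x})^2 - \langle\zeta_g(\mathbf{x})^2\rangle \right] . \tag{5.2.13} $$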

This definition is local in real space and therefore called local non-Gaussianity. The bispectrum

of local non-Gaussianity is

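$$ B_\zeta(k_1,k_2,k_3) = \frac{6}{5} f_{\rm NL}^{\rm loc.} \times \left[ P_\zeta(k_1)P_\zeta(k_2) + P_\zeta(k_2)P_\zeta(k_3) + P_\zeta(k_3)P_\zeta(k_1) \right] \tag{5.2.14} $$

$$ \phantom{B_\zeta(k_1,k_2,k_3)} = \frac{6}{5} f_{\rm NL}^{\rm loc.}\, \frac{\Delta_\zeta^2}{(k_1k_2k_3)^2} \left( \frac{k_1^2}{k_2k_3} + \frac{k_2^2}{k_1k_3} + \frac{k_3^2}{k_1k_2} \right) , \tag{5.2.15} $$

where in the second line we assumed a scale-invariant spectrum, $P_\zeta(k) = \Delta_\zeta k^{-3}$. Comparing with (5.2.12), we can read off the local shape function as

$$ S_{\rm loc.}(k_1,k_2,k_3) = \frac{1}{3}\left( \frac{k_3^2}{k_1k_2} + 2\ {\rm perms.} \right) . \tag{5.2.16} $$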

Without loss of generality, let us order the momenta such that $k_1 \le k_2 \le k_3$. The bispectrum for local non-Gaussianity is then largest when the smallest k (i.e. $k_1$) is very small, $k_1 \ll k_2 \sim k_3$. By momentum conservation, the other two momenta are then nearly equal. In this squeezed limit, the bispectrum for local non-Gaussianity becomes

$$ \lim_{k_1 \ll k_2 \sim k_3} S_{\rm loc.}(k_1,k_2,k_3) = \frac{2}{3}\frac{k_2}{k_1} . \tag{5.2.17} $$

We will later prove a powerful theorem that states that non-Gaussianity with a large squeezed

limit cannot arise in single-field inflation, independent of the details of the inflationary action.
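The normalization and the squeezed limit of the local shape are easy to verify numerically. Below is a minimal sketch; the function names are illustrative and the momenta are given in arbitrary units.

```python
def shape_local(k1, k2, k3):
    """Local shape function of eq. (5.2.16), normalized so that S(K, K, K) = 1."""
    return (k1**2 / (k2 * k3) + k2**2 / (k1 * k3) + k3**2 / (k1 * k2)) / 3.0


def squeezed_limit_local(k1, k2):
    """Leading squeezed-limit behaviour of eq. (5.2.17), valid for k1 << k2 ~ k3."""
    return 2.0 * k2 / (3.0 * k1)


if __name__ == "__main__":
    print(shape_local(1.0, 1.0, 1.0))        # equilateral normalization
    k1, k2 = 1e-3, 1.0                       # squeezed triangle with k3 = k2
    print(shape_local(k1, k2, k2) / squeezed_limit_local(k1, k2))
```

The ratio printed on the last line approaches one as $k_1/k_2 \to 0$, with corrections of relative order $(k_1/k_2)^3$.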

10The factor of 3/5 in eq. (5.2.13) is conventional since non-Gaussianity was first defined in terms of the Newtonian potential, $\Phi(\mathbf{x}) = \Phi_g(\mathbf{x}) + f_{\rm NL}^{\rm loc.}\left[ \Phi_g(\mathbf{x})^2 - \langle\Phi_g(\mathbf{x})^2\rangle \right]$, which during the matter era is related to ζ by a factor of 3/5.

Equilateral Non-Gaussianity

We will see below that higher-derivative corrections during inflation can lead to large non-Gaussianities. A key characteristic of derivative interactions is that they are suppressed when any individual mode is far outside the horizon. This suggests that the bispectrum is maximal when all three modes have wavelengths equal to the horizon size. The bispectrum therefore has a shape that peaks in the equilateral configuration, $k_1 = k_2 = k_3$. The CMB analysis that searches for these signals uses the following template for the shape function

$$ S_{\rm equil.}(k_1,k_2,k_3) = \left( \frac{k_1}{k_2} + 5\ {\rm perms.} \right) - \left( \frac{k_1^2}{k_2k_3} + 2\ {\rm perms.} \right) - 2 . \tag{5.2.18} $$

This template approximates the shape that we will compute for higher-derivative theories (such

as DBI inflation) in §5.3.2.
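To see concretely that the template (5.2.18) is normalized to one in the equilateral configuration and is suppressed in the squeezed limit, one can evaluate it directly. The sketch below also compares it with the equivalent factorized form $(k_2+k_3-k_1)(k_1+k_3-k_2)(k_1+k_2-k_3)/(k_1k_2k_3)$, which is an algebraic rewriting that is not written in the notes; the function names are illustrative.

```python
def shape_equil(k1, k2, k3):
    """Equilateral template of eq. (5.2.18)."""
    ratios = k1 / k2 + k1 / k3 + k2 / k1 + k2 / k3 + k3 / k1 + k3 / k2
    squares = k1**2 / (k2 * k3) + k2**2 / (k1 * k3) + k3**2 / (k1 * k2)
    return ratios - squares - 2.0


def shape_equil_factorized(k1, k2, k3):
    """Same template written as a product; it vanishes for folded triangles."""
    return (k2 + k3 - k1) * (k1 + k3 - k2) * (k1 + k2 - k3) / (k1 * k2 * k3)


if __name__ == "__main__":
    print(shape_equil(1.0, 1.0, 1.0))    # equilateral normalization
    print(shape_equil(1e-3, 1.0, 1.0))   # squeezed: small, of order k1/k2
    print(shape_equil(3.0, 4.0, 5.0), shape_equil_factorized(3.0, 4.0, 5.0))
```

In contrast to the local shape, which grows like $k_2/k_1$ in the squeezed limit, this template falls off linearly in $k_1/k_2$ there.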

Figure 5.2: Shape functions of local and equilateral non-Gaussianity. Panels: $S_{\rm equil}(1,x_2,x_3)$ and $S_{\rm local}(1,x_2,x_3)$, plotted over $x_2$ and $x_3$.

The Cosine

What if the real CMB was truly non-Gaussian, but we searched for the non-Gaussianity with

the wrong template? Can we still detect a signal?

To answer this question, we digress briefly and sketch how non-Gaussianity is measured in

practice. Importantly, the signal is much too small for a mode-by-mode measurement of the

bispectrum. Instead, we have to start with a theoretically motivated momentum dependence

for the bispectrum and fit for the overall amplitude. Concretely, imagine that we measure ζk in

a three-dimensional survey.11 The three-point function is of the form

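$$ \langle\zeta_{\mathbf{k}_1}\zeta_{\mathbf{k}_2}\zeta_{\mathbf{k}_3}\rangle = f_{\rm NL}\cdot(2\pi)^3\delta(\mathbf{k}_1+\mathbf{k}_2+\mathbf{k}_3)\, B(k_1,k_2,k_3) . \tag{5.2.19} $$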

We would like to test for some particular shape function B(k1, k2, k3) and use the data to measure

the overall amplitude fNL. In the limit of small non-Gaussianity, it can be shown that the best

estimator for fNL is

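$$ f_{\rm NL} = \frac{\displaystyle\sum_{\mathbf{k}_i} \zeta_{\mathbf{k}_1}\zeta_{\mathbf{k}_2}\zeta_{\mathbf{k}_3}\, \frac{B(k_1,k_2,k_3)}{P_{k_1}P_{k_2}P_{k_3}}}{\displaystyle\sum_{\mathbf{k}_i} \frac{B(k_1,k_2,k_3)^2}{P_{k_1}P_{k_2}P_{k_3}}} , \tag{5.2.20} $$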

where Pki ≡ Pζ(ki) is the power spectrum and the sums are over all physical triangles in

momentum space. Eq. (5.2.20) naturally leads us to define a ‘scalar product’ of two bispectra

B1 and B2,

$$ B_1\cdot B_2 \equiv \sum_{\mathbf{k}_i} \frac{B_1(k_1,k_2,k_3)\, B_2(k_1,k_2,k_3)}{P_{k_1}P_{k_2}P_{k_3}} . \tag{5.2.21} $$
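The structure of the estimator (5.2.20) is that of an inverse-variance weighted template fit. The following toy sketch illustrates this with plain lists standing in for the sum over triangles: `data` plays the role of the measured products $\zeta_{\mathbf{k}_1}\zeta_{\mathbf{k}_2}\zeta_{\mathbf{k}_3}$, `template` the role of $B(k_1,k_2,k_3)$ and `variance` the role of $P_{k_1}P_{k_2}P_{k_3}$. All names and numbers are hypothetical; this is not a realistic CMB pipeline.

```python
def scalar_product(b1, b2, variance):
    """Discrete analogue of eq. (5.2.21): sum over triangles of B1 * B2 / (P P P)."""
    return sum(x * y / v for x, y, v in zip(b1, b2, variance))


def estimate_amplitude(data, template, variance):
    """Discrete analogue of eq. (5.2.20): weighted projection of the data on the template."""
    return scalar_product(data, template, variance) / scalar_product(template, template, variance)


if __name__ == "__main__":
    template = [1.0, 2.0, 0.5]
    variance = [1.0, 4.0, 0.25]
    data = [3.0 * b for b in template]   # noiseless data with amplitude 3
    print(estimate_amplitude(data, template, variance))
```

For noiseless data proportional to the template the estimator returns the input amplitude; if the data follow a different template, the recovered amplitude is reduced by the overlap of the two templates in this scalar product.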

11Of course, the CMB measurements are two-dimensional, but the principle will be the same.


If two shapes have a small scalar product, the optimal estimator (5.2.20) for one shape will be very bad in detecting non-Gaussianities with the other shape and vice versa. As we explained above, assuming isotropy, the shape function depends only on two momentum ratios, say $x_2 = k_2/k_1$ and $x_3 = k_3/k_1$. The definition of the scalar product (5.2.21) contains a factor of $x_2^3 x_3^3$ from the power spectra in the denominator (assuming scale-invariance). Furthermore, we get a measure factor $x_2 x_3$ when going from a three-dimensional sum over modes to one-dimensional integrals over $x_2$ and $x_3$. This suggests that we define the scalar product of two shapes as

$$ S_1\cdot S_2 \equiv \int_V S_1(x_2,x_3)\, S_2(x_2,x_3)\, dx_2\, dx_3 , \tag{5.2.22} $$

where Si are the shape functions defined in (5.2.12) and the integrals are only over physical

momenta satisfying the triangle inequality: 0 ≤ x2 ≤ 1 and 1 − x2 ≤ x3 ≤ 1. An important

quantity is the normalized scalar product, or cosine, of two shapes,

$$ C(S_1,S_2) = \frac{S_1\cdot S_2}{(S_1\cdot S_1)^{1/2}(S_2\cdot S_2)^{1/2}} . \tag{5.2.23} $$

The cosine provides an observationally motivated answer to the question: “How different are two bispectrum shapes?”
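As an illustration, the cosine (5.2.23) between the local and equilateral templates can be estimated with a simple midpoint rule over the region $0 \le x_2 \le 1$, $1 - x_2 \le x_3 \le 1$. The local shape diverges in the squeezed corners, so the sketch below introduces a cutoff `x_min` on both ratios; this cutoff, the grid size and the function names are assumptions of the sketch, and the numerical value of the cosine depends on the cutoff.

```python
def shape_local(x2, x3):
    """Local template (5.2.16) with k1 = 1, k2 = x2, k3 = x3."""
    return (1.0 / (x2 * x3) + x2**2 / x3 + x3**2 / x2) / 3.0


def shape_equil(x2, x3):
    """Equilateral template (5.2.18) with k1 = 1, k2 = x2, k3 = x3."""
    ratios = 1.0 / x2 + 1.0 / x3 + x2 + x3 + x2 / x3 + x3 / x2
    squares = 1.0 / (x2 * x3) + x2**2 / x3 + x3**2 / x2
    return ratios - squares - 2.0


def scalar_product(s1, s2, x_min=0.05, n=200):
    """Midpoint-rule estimate of eq. (5.2.22), restricted to x2, x3 >= x_min."""
    total = 0.0
    dx2 = (1.0 - x_min) / n
    for i in range(n):
        x2 = x_min + (i + 0.5) * dx2
        lower = max(1.0 - x2, x_min)      # triangle inequality plus cutoff
        dx3 = (1.0 - lower) / n
        for j in range(n):
            x3 = lower + (j + 0.5) * dx3
            total += s1(x2, x3) * s2(x2, x3) * dx2 * dx3
    return total


def cosine(s1, s2, x_min=0.05, n=200):
    """Normalized scalar product of eq. (5.2.23)."""
    norm1 = scalar_product(s1, s1, x_min, n) ** 0.5
    norm2 = scalar_product(s2, s2, x_min, n) ** 0.5
    return scalar_product(s1, s2, x_min, n) / (norm1 * norm2)


if __name__ == "__main__":
    print(cosine(shape_local, shape_equil))
```

A cosine well below one means that an estimator built on one template loses sensitivity to a signal of the other shape, which is why separate templates are fitted to the data.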

Orthogonal Non-Gaussianity

Using the cosine we can define a phenomenological shape that is orthogonal to both the local and equilateral templates, i.e. $S_{\rm ortho.}\cdot S_{\rm loc.} = S_{\rm ortho.}\cdot S_{\rm equil.} \equiv 0$,

$$ S_{\rm ortho.}(k_1,k_2,k_3) = -3.84\left( \frac{k_1^2}{k_2k_3} + 2\ {\rm perms.} \right) + 3.94\left( \frac{k_1}{k_2} + 5\ {\rm perms.} \right) - 11.10 . \tag{5.2.24} $$

Below we will see how this shape arises in inflationary theories with higher-derivative interac-

tions.

5.3 Quantum Non-Gaussianities

Let us start to do some physics. There are two classes of mechanisms that generate non-

Gaussianity during inflation:

i) quantum-mechanical effects can generate non-Gaussianity at or before horizon exit;

ii) classical non-linear evolution can generate non-Gaussianity after horizon exit.

In this section we introduce the Schwinger-Keldysh in-in formalism12 to compute quantum non-

Gaussianities during inflation. In the next section we present the δN formalism to compute the

classical effects.

12After pioneering work by Calzetta and Hu and Jordan the application of the in-in formalism to cosmological

problems was revived by Maldacena and Weinberg.


5.3.1 The in-in Formalism

The problem of computing correlation functions in cosmology differs in important ways from the

corresponding analysis of quantum field theory applied to particle physics. In particle physics

the central object is the S-matrix describing the transition probability for a state |in〉 in the far

past to become some state |out〉 in the far future,

〈out|S|in〉 = 〈out(+∞)|in(−∞)〉 . (5.3.25)

Imposing asymptotic conditions at very early and very late times makes sense in this case, since

in Minkowski space, states are assumed to be non-interacting in the far past and the far future

when the scattering particles are far from the interaction region. The asymptotic states relevant

for particle physics experiments are therefore taken to be vacuum states of the free Hamiltonian

H0. In cosmology, however, we are interested in expectation values of products of operators

at a fixed time. Special care has to be taken to define the time-dependence of operators in

the interacting theory. Boundary conditions are not imposed on the fields at both very early

and very late times, but only at very early times, when their wavelengths are much smaller

than the horizon. In this limit the interaction picture fields should have the same form as in

Minkowski space13. This leads to the definition of the Bunch-Davies vacuum—the free vacuum

in Minkowski space. In this section we describe the in-in formalism to compute cosmological

correlation functions as expectation values of two |in〉 states.

Preview. In presentations of the in-in formalism it is easy not to see the forest for the trees.

Let me therefore give a brief summary of the qualitative ideas behind the formalism, before

jumping into the technical details:

Our goal is to compute n-point functions of cosmological perturbations such as the primordial

curvature perturbation ζ or the gravitational wave polarization modes h× and h+. We collec-

tively denote these fluctuations by the field ψ = ζ, h×, h+ and consider expectation values of

operators such as Q = ψk1ψk2 · · ·ψkn ,

〈Q〉 = 〈in|Q(t) |in〉 . (5.3.26)

Here |in〉 is the vacuum of the interacting theory at some moment $t_i$ in the far past and $t > t_i$ is some later time such as horizon crossing or the end of inflation. To compute the matrix

element in (5.3.26) we evolve Q(t) back to ti, using the perturbed Hamiltonian δH. Computing

this time-evolution is complicated by the interactions inside of δH = H0 + Hint (these lead to

non-linear equations of motion). We therefore introduce the interaction picture in which the

leading time-dependence of the fields is determined by the quadratic Hamiltonian H0 (or linear

equations of motion). Corrections arising from the interactions are then treated as a power

series in Hint.

This leads to the following important result

$$ \langle Q\rangle = \langle 0|\, \bar{T} e^{\, i\int_{-\infty(1-i\epsilon)}^{t} H^I_{\rm int}(t')\,dt'}\ Q^I(t)\ T e^{-i\int_{-\infty(1+i\epsilon)}^{t} H^I_{\rm int}(t'')\,dt''}\, |0\rangle , \tag{5.3.27} $$

where $T$ ($\bar{T}$) is the (anti-)time-ordering symbol. Note that both $Q^I$ and $H^I_{\rm int}$ are evaluated using interaction picture operators. The standard iε prescription has been used to effectively turn off the interaction in the far past and project the interacting |in〉 state onto the free vacuum |0〉.

13This is a consequence of the equivalence principle.


By expanding the exponential we can compute the correlation function perturbatively in Hint.

For example, at leading order (tree-level), we find

$$ \langle Q(t)\rangle = -i\int_{-\infty}^{t} dt'\, \langle 0|\, [Q^I(t), H^I_{\rm int}(t')]\, |0\rangle . \tag{5.3.28} $$

We can use Feynman diagrams to organize the power series, drawing interaction vertices for

every power of Hint. Notice that in the in-in formalism there is no time flow: each vertex

insertion is associated not just with momentum conservation, but also a time integral.

In the rest of this section I will derive the in-in master formula (5.3.27). Readers more interested in applications of the result may jump to §5.3.2.

Time evolution. In the Heisenberg picture, the time-dependence of the operator Q(t) is

determined by the Hamiltonian

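$$ H[\psi(t),p_\psi(t)] = \int d^3x\ \mathcal{H}[\psi(t,\mathbf{x}),p_\psi(t,\mathbf{x})] , \tag{5.3.29} $$

where $\mathcal{H}$ is the Hamiltonian density. As usual, the field ψ and its conjugate momentum $p_\psi$ satisfy the equal-time commutation relation

$$ \left[ \psi(t,\mathbf{x}), p_\psi(t,\mathbf{y}) \right] = i\delta(\mathbf{x}-\mathbf{y}) , \tag{5.3.30} $$

and all other commutators vanish. The Heisenberg equations of motion are

$$ \frac{d}{dt}\psi = i\left[ H,\psi \right] \quad {\rm and} \quad \frac{d}{dt}p_\psi = i\left[ H,p_\psi \right] . \tag{5.3.31} $$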

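We split the fields into a time-dependent background $\bar\psi(t)$, $\bar p_\psi(t)$ and spacetime-dependent fluctuations $\delta\psi(x)$, $\delta p_\psi(x)$. Our goal will be to “quantize” the fluctuations on the “classical” background. The evolution of the background is determined by the classical equations of motion

$$ \dot{\bar\psi} = \frac{\partial\mathcal{H}}{\partial\bar p_\psi} \quad {\rm and} \quad \dot{\bar p}_\psi = -\frac{\partial\mathcal{H}}{\partial\bar\psi} . \tag{5.3.32} $$

Since $\bar\psi$ and $\bar p_\psi$ are complex numbers they commute with everything and the fluctuations satisfy the commutation relation

$$ \left[ \delta\psi(t,\mathbf{x}), \delta p_\psi(t,\mathbf{y}) \right] = i\delta(\mathbf{x}-\mathbf{y}) . \tag{5.3.33} $$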

We expand the Hamiltonian into a background $H[\bar\psi,\bar p_\psi]$ and fluctuations $\delta H[\delta\psi,\delta p_\psi;t]$. The time-dependence of the perturbation Hamiltonian arises both from the time-dependence of the fluctuations $\delta\psi(t)$, $\delta p_\psi(t)$ and the explicit time-dependence of the background fields $a(t)$, $\bar\psi(t)$, and $\bar p_\psi(t)$. We denote this source of time-dependence by the “; t” in the argument of δH.

Terms linear in fluctuations cancel on using background equations of motion. The perturbation

Hamiltonian δH therefore starts quadratic in fluctuations. It determines the time-dependence

of the fluctuations

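$$ \frac{d}{dt}\delta\psi = i\left[ \delta H,\delta\psi \right] \quad {\rm and} \quad \frac{d}{dt}\delta p_\psi = i\left[ \delta H,\delta p_\psi \right] . \tag{5.3.34} $$

We see that the perturbations are evolved using the perturbed Hamiltonian. Like in standard quantum field theory, these equations are solved by

$$ \delta\psi(t,\mathbf{x}) = U^{-1}(t,t_i)\, \delta\psi(t_i,\mathbf{x})\, U(t,t_i) , \tag{5.3.35} $$

where the unitary operator U satisfies14

$$ \frac{d}{dt}U(t,t_i) = -i\,\delta H[\delta\psi(t),\delta p_\psi(t);t]\, U(t,t_i) , \tag{5.3.37} $$

with initial condition $U(t_i,t_i) \equiv 1$.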

Time evolution in the interaction picture. To describe the time evolution of the perturbations

in the presence of interactions, we split the perturbed Hamiltonian into a quadratic part H0 and

an interacting part Hint,

δH = H0 +Hint . (5.3.38)

The free-field piece $H_0$ will then determine the leading time-evolution of the interaction picture fields $\delta\psi^I$, $\delta p^I_\psi$,

$$ \frac{d}{dt}\delta\psi^I = i\left[ H_0[\delta\psi^I(t),\delta p^I_\psi(t);t],\ \delta\psi^I(t) \right] , \tag{5.3.39} $$

and similarly for $\delta p^I_\psi$. The initial conditions of the interaction picture fields are chosen to coincide with those of the “full theory”, $\delta\psi^I(t_i,\mathbf{x}) = \delta\psi(t_i,\mathbf{x})$ (which for inflation corresponds to the Bunch-Davies initial conditions). It follows from (5.3.39) that in evaluating $H_0[\delta\psi^I,\delta p^I_\psi;t]$ we can take the time argument of $\delta\psi^I$ and $\delta p^I_\psi$ to have any value, and in particular we can take it to be $t_i$, so that

$$ H_0[\delta\psi^I(t),\delta p^I_\psi(t);t] \to H_0[\delta\psi(t_i),\delta p_\psi(t_i);t] . \tag{5.3.40} $$

Notice that $H_0$ still depends on time through the time-evolution of the background (; t). The solution of (5.3.39) can again be written as a unitary transformation

$$ \delta\psi^I(t,\mathbf{x}) = U_0^{-1}(t,t_i)\, \delta\psi(t_i,\mathbf{x})\, U_0(t,t_i) , \tag{5.3.41} $$

where $U_0$ satisfies15

$$ \frac{d}{dt}U_0(t,t_i) = -i H_0[\delta\psi(t_i),\delta p_\psi(t_i);t]\, U_0(t,t_i) , \tag{5.3.43} $$

with $U_0(t_i,t_i) \equiv 1$.

Expectation value. We now return to our goal: 〈Q(t)〉 = 〈in|Q(t) |in〉. We first use the

operator U(t, ti) to evolve Q(t) back to Q(ti)

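$$ \langle Q(t)\rangle = \langle{\rm in}|\, Q[\delta\psi(t),\delta p_\psi(t)]\, |{\rm in}\rangle = \langle{\rm in}|\, U^{-1}(t,t_i)\, Q[\delta\psi(t_i),\delta p_\psi(t_i)]\, U(t,t_i)\, |{\rm in}\rangle . \tag{5.3.44} $$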

14We recall from standard quantum field theory that eq. (5.3.37) has the following formal solution

$$ U(t,t_i) = T\exp\left( -i\int_{t_i}^{t} \delta H(t)\,dt \right) , \tag{5.3.36} $$

where T is the time-ordering operator (as required for time-dependent Hamiltonians). However, this form of the solution is not very useful, since it is hard to calculate $\delta\psi(t,\mathbf{x})$ in the presence of interactions (since the equations of motion would be non-linear). In the following we will develop a more useful perturbative scheme to compute U as a power series in $H_{\rm int}$. For this purpose we introduce the interaction picture.

15Eq. (5.3.43) has the following solution

$$ U_0(t,t_i) = T\exp\left( -i\int_{t_i}^{t} H_0(t)\,dt \right) . \tag{5.3.42} $$

This operator is used to evolve the interaction picture fields. All of this is just a highbrow way of saying that the interaction picture fields are calculated using the free-field Hamiltonian. In inflation, this means that the linear solutions given by the Mukhanov-Sasaki equation are the interaction picture fields.


We then insert the identity operator $1 = U_0(t,t_i)\,U_0^{-1}(t,t_i)$ in two places (i.e. we do nothing) and write

$$ \langle Q(t)\rangle = \langle{\rm in}|\, F^{-1}(t,t_i)\, U_0^{-1}(t,t_i)\, Q[\delta\psi(t_i),\delta p_\psi(t_i)]\, U_0(t,t_i)\, F(t,t_i)\, |{\rm in}\rangle , \tag{5.3.45} $$

where we defined the new operator

$$ F(t,t_i) \equiv U_0^{-1}(t,t_i)\, U(t,t_i) . \tag{5.3.46} $$

Since $\delta\psi(t_i) = \delta\psi^I(t_i)$ and $U_0(t,t_i)$ determines the time-evolution of the interaction picture fields, this becomes

$$ \langle{\rm in}|\, Q(t)\, |{\rm in}\rangle = \langle{\rm in}|\, F^{-1}(t,t_i)\, Q[\delta\psi^I(t),\delta p^I_\psi(t)]\, F(t,t_i)\, |{\rm in}\rangle . \tag{5.3.47} $$

Using eqs. (5.3.37) and (5.3.43), it is an easy exercise to show that $F(t,t_i)$ satisfies

$$ \frac{d}{dt}F(t,t_i) = -i H_{\rm int}[\delta\psi^I(t),\delta p^I_\psi(t);t]\, F(t,t_i) , \tag{5.3.48} $$

with $F(t_i,t_i) \equiv 1$. Hence, $F(t,t_i)$ is the unitary evolution operator associated with $H_{\rm int}$, with the interaction Hamiltonian constructed out of interaction picture fields. Eq. (5.3.48) has the following solution

$$ F(t,t_i) = T\exp\left( -i\int_{t_i}^{t} H_{\rm int}(t)\,dt \right) , \tag{5.3.49} $$

where T is the standard time-ordering operator (as required for time-dependent Hamiltonians).

where T is the standard time-ordering operator (as required for time-dependent Hamiltonians).

Projection onto the free vacuum. Letting $t_i \to -\infty^+ \equiv -\infty(1+i\epsilon)$, the in-in expectation value in (5.3.47) becomes

$$ \langle Q\rangle = \langle{\rm in}| \left( T e^{-i\int_{-\infty^+}^{t} H_{\rm int}(t')\,dt'} \right)^{\dagger} Q^I(t) \left( T e^{-i\int_{-\infty^+}^{t} H_{\rm int}(t'')\,dt''} \right) |{\rm in}\rangle . \tag{5.3.50} $$

As in the standard in-out treatment of quantum field theory in Minkowski space, we have used the iε prescription to effectively turn off the interaction $H_{\rm int}$ in the infinite past. We can therefore identify the in vacuum with the free-field vacuum, |in〉 → |0〉. However, in the in-in formalism there is an additional subtlety: the conjugation operator in (5.3.50) acts on the integration limit, leading to an important sign flip

$$ \langle Q\rangle = \langle 0|\, \bar{T} e^{\, i\int_{-\infty^-}^{t} H_{\rm int}(t')\,dt'}\ Q^I(t)\ T e^{-i\int_{-\infty^+}^{t} H_{\rm int}(t'')\,dt''}\, |0\rangle , \tag{5.3.51} $$

where we defined $-\infty^{\pm} = -\infty(1 \pm i\epsilon)$ and $\bar{T}$ denotes anti-time-ordering. The integration contour goes from $-\infty(1-i\epsilon)$ to t (where the correlation function is evaluated) and back to $-\infty(1+i\epsilon)$. Notice that the contour doesn’t close. This will imply that the contraction of operators leads to real-valued Wightman Green’s functions, rather than complex-valued Feynman Green’s functions.

Perturbative expansion. Finally, correlation functions are computed perturbatively in $H_{\rm int}$. For instance, to leading order (5.3.51) can be written as

$$ \langle Q(t)\rangle = -i\int_{-\infty}^{t} dt'\, \langle 0|\, [Q^I(t), H_{\rm int}(t')]\, |0\rangle . \tag{5.3.52} $$

This key equation allows us to compute bispectra at tree-level. Higher orders in $H_{\rm int}$ can be written as nested commutators

$$ i^n \int_{-\infty}^{t} dt_1 \int_{-\infty}^{t_1} dt_2 \cdots \int_{-\infty}^{t_{n-1}} dt_n\, \langle [H_{\rm int}(t_n), [H_{\rm int}(t_{n-1}), \ldots, [H_{\rm int}(t_1), Q^I(t)] \cdots ]] \rangle . \tag{5.3.53} $$
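For orientation, the tree-level formula follows from expanding both exponentials in the master formula to first order in the interaction. The sketch below suppresses the iε deformation of the contour, and the (anti-)time-ordering is trivial at this order:

```latex
\begin{align}
\langle Q(t)\rangle
 &\approx \langle 0|\Big(1 + i\int_{-\infty}^{t} dt'\, H_{\rm int}(t')\Big)\,
          Q^I(t)\,
          \Big(1 - i\int_{-\infty}^{t} dt''\, H_{\rm int}(t'')\Big)|0\rangle \nonumber\\
 &= \langle 0|Q^I(t)|0\rangle
    + i\int_{-\infty}^{t} dt'\, \langle 0|\,[H_{\rm int}(t'), Q^I(t)]\,|0\rangle
    + O(H_{\rm int}^2) \nonumber\\
 &= \langle 0|Q^I(t)|0\rangle
    - i\int_{-\infty}^{t} dt'\, \langle 0|\,[Q^I(t), H_{\rm int}(t')]\,|0\rangle
    + O(H_{\rm int}^2) .
\end{align}
```

For a three-point function the zeroth-order term vanishes, because the free vacuum expectation value of an odd number of interaction picture fields is zero by Wick's theorem, leaving only the commutator term. The second line is also the $n = 1$ case of the nested-commutator expression.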


5.3.2 Single-Field Inflation

Let us now apply the in-in formalism to a few examples in single-field inflation. This means

finding the interaction Hamiltonian for the curvature perturbation in a specific inflationary

model and applying the master equation (5.3.51).

Slow-Roll Inflation

We start with the action of a scalar field with canonical kinetic term $X \equiv -\frac{1}{2}(\partial_\mu\phi)^2$ and potential V(φ), minimally coupled to Einstein gravity,

$$ S = \int d^4x\, \sqrt{-g}\left[ \frac{1}{2}R + X - V(\phi) \right] , \tag{5.3.54} $$

where Mpl ≡ 1. Non-Gaussianity arises both from inflaton self-interactions and gravitational

non-linearities.16 We parameterize metric perturbations in the ADM formalism

$$ ds^2 = -N^2 dt^2 + h_{ij}(dx^i + N^i dt)(dx^j + N^j dt) . \tag{5.3.55} $$

In these variables the action becomes

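$$ S = \int dt\, d^3x\, \sqrt{h}\, N \left( \frac{1}{2}R^{(3)} + X - V + \frac{1}{2N^2}\left( E_{ij}E^{ij} - E^2 \right) \right) , \tag{5.3.56} $$

where $R^{(3)}[h_{ij}]$ is the three-dimensional Ricci scalar, $E_{ij}$ is the extrinsic curvature,

$$ E_{ij} = \frac{1}{2}\left( \dot{h}_{ij} - \nabla_i N_j - \nabla_j N_i \right) , \tag{5.3.57} $$

and $E \equiv E_{ij}h^{ij}$ is its trace. We work in comoving gauge, $\delta\phi \equiv 0$, and define the curvature perturbation as

$$ h_{ij} = a^2 e^{2\zeta}\delta_{ij} . \tag{5.3.58} $$

It is a straightforward, but tedious task to solve the constraint equations for the Lagrange multipliers $N = 1 + \delta N$ and $N_i = \partial_i\psi$,

$$ \delta N = \frac{\dot\zeta}{H} \quad {\rm and} \quad \psi = -\frac{\zeta}{a^2 H} + \chi , \quad {\rm where} \quad \partial^2\chi = \epsilon\,\dot\zeta . \tag{5.3.59} $$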

It is sufficient to solve N and Ni to first order in ζ, since the second-order and third-order

perturbations will multiply the first-order and zeroth-order constraint equation, respectively.

Substituting the Lagrange multipliers into the action, Maldacena found

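$$ L_2 = \epsilon\,(\partial_\mu\zeta)^2 , \tag{5.3.60} $$

and

$$ L_3 = \epsilon^2\zeta\dot\zeta^2 + \epsilon^2\zeta(\partial_i\zeta)^2 - 2\epsilon\,\dot\zeta\,(\partial_i\zeta)(\partial_i\chi) + 2f(\zeta)\frac{\delta\mathcal{L}_2}{\delta\zeta} + O(\epsilon^3) , \tag{5.3.61} $$

where

$$ f(\zeta) \equiv \frac{\eta}{4}\zeta^2 + \cdots \tag{5.3.62} $$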

16Of course, this is a gauge-dependent statement. For example, in unitary gauge (δφ = 0), the inflaton

fluctuations are eaten by the metric and the non-Gaussianity is purely in the gravity sector.


and $\delta\mathcal{L}_2/\delta\zeta$ is the variation of the quadratic action $\mathcal{L}_2 = a^3 L_2$ with respect to $\zeta$. The dots in (5.3.62) represent a large number of terms with derivatives acting on $\zeta$. These terms vanish outside the horizon and hence don't contribute to the bispectrum. Maldacena showed that the term proportional to $f(\zeta)$ can be removed by a field redefinition,

$$\zeta \to \zeta_n + f(\zeta_n) . \tag{5.3.63}$$

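This field redefinition has the following effect on the correlation function

$$\langle\zeta(x_1)\zeta(x_2)\zeta(x_3)\rangle = \langle\zeta_n(x_1)\zeta_n(x_2)\zeta_n(x_3)\rangle + \frac{\eta}{2}\left(\langle\zeta_n(x_1)\zeta_n(x_2)\rangle\langle\zeta_n(x_1)\zeta_n(x_3)\rangle + \text{cyclic}\right) + \cdots \tag{5.3.64}$$

Expanding the in-in master formula (5.3.51) to first order in $H_{\rm int} = -L_3 + \mathcal{O}(\zeta^4)$, we get

$$\langle\zeta_n^3\rangle = -i\int_{-\infty}^{0} dt\, \langle 0|[\zeta_n(k_1)\zeta_n(k_2)\zeta_n(k_3), H_{\rm int}(t)]|0\rangle . \tag{5.3.65}$$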

Before embarking on a lengthy calculation of the bispectrum, it is often advisable to perform an order-of-magnitude estimate of the expected size of the signal, i.e. to find a quick and dirty way to estimate (5.3.65) without explicitly performing the integral. For example, the first term in (5.3.61) can be written as $\int dt\, H_{\rm int}(t) \subset -\int d^3x\, d\tau\, a^2\varepsilon^2\zeta(\zeta')^2$. We only need to keep track of factors of $H$ and $\varepsilon$. Any time- and momentum-dependence will work itself out and only contributes to the shape function. Using $a \propto H^{-1}$ and $\zeta \propto \zeta' \propto \sqrt{\Delta_\zeta} \sim H/\sqrt{\varepsilon}$, we estimate that the contribution from the three-point vertex is $\sim H\sqrt{\varepsilon}$. Combining this with estimates for the size of the three external legs, $\zeta^3 \sim H^3\varepsilon^{-3/2}$, we find$^{17}$

$$\langle\zeta^3\rangle = -i\int dt\, \langle[\zeta^3, H_{\rm int}(t)]\rangle \propto \frac{H^4}{\varepsilon} \propto \mathcal{O}(\varepsilon)\,\Delta_\zeta^2 \sim f_{\rm NL}\,\Delta_\zeta^2 . \tag{5.3.67}$$
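The power counting behind (5.3.67) can be mechanized. The following Python sketch represents each quantity as a monomial $H^p\varepsilon^q$ and simply adds exponents. It assumes, as in the estimate above, $a \sim H^{-1}$, $\zeta \sim \zeta' \sim H/\sqrt{\varepsilon}$ and $\Delta_\zeta \sim H^2/\varepsilon$; the helper names `mul` and `power` are illustrative and not part of the notes.

```python
from fractions import Fraction as F

# A monomial H^p * eps^q is stored as the exponent pair (p, q).
def mul(*factors):
    """Multiply monomials by adding their exponents."""
    return (sum(f[0] for f in factors), sum(f[1] for f in factors))

def power(factor, n):
    """Raise a monomial to the n-th power."""
    return (factor[0] * n, factor[1] * n)

a = (F(-1), F(0))           # scale factor, a ~ 1/H
eps = (F(0), F(1))          # slow-roll parameter
zeta = (F(1), F(-1, 2))     # zeta ~ zeta' ~ H / sqrt(eps)
delta_zeta = power(zeta, 2) # Delta_zeta ~ H^2 / eps

# Vertex a^2 eps^2 zeta (zeta')^2, then the three external legs.
vertex = mul(power(a, 2), power(eps, 2), power(zeta, 3))
legs = power(zeta, 3)
three_point = mul(vertex, legs)

# f_NL ~ <zeta^3> / Delta_zeta^2
f_nl = mul(three_point, power(delta_zeta, -2))

print("vertex      ~ H^%s eps^%s" % vertex)
print("three-point ~ H^%s eps^%s" % three_point)
print("f_NL        ~ H^%s eps^%s" % f_nl)
```

The same bookkeeping applies to the other two interactions in (5.3.61) once their factors of $a$, $\varepsilon$ and $\zeta$ are listed.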

Similar results are obtained for the other two interactions in (5.3.61), $f_{\rm NL} \sim \mathcal{O}(\varepsilon)$. Moreover, it is easy to see that the contribution from the field redefinition in (5.3.64) is $f_{\rm NL} \sim \mathcal{O}(\eta)$. We conclude that the non-Gaussianity in slow-roll inflation is slow-roll suppressed,

$$f_{\rm NL} \sim \mathcal{O}(\varepsilon, \eta) . \tag{5.3.68}$$

This small amount of non-Gaussianity will never be observable in the CMB.

To get the full momentum-dependence of the bispectrum we actually have to do some real

work and compute the integral in (5.3.65) using the free-field mode functions for ζ. Here, we

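just cite the final answer,$^{18}$

$$S_{\rm s.r.} = \frac{11}{2}\,\varepsilon\, S_\varepsilon + \frac{3}{2}\,\eta\, S_\eta , \tag{5.3.69}$$

where we have separated the bispectrum into a contribution proportional to $\varepsilon$ and a contribution proportional to $\eta$,

$$S_\varepsilon = \frac{1}{11}\left[-\left(\frac{k_1^2}{k_2k_3} + \text{2 perms.}\right) + \left(\frac{k_1}{k_2} + \text{5 perms.}\right) + \frac{8}{3K}\left(\frac{k_1k_2}{k_3} + \text{2 perms.}\right)\right] , \tag{5.3.70}$$

$$S_\eta = \frac{1}{3}\left(\frac{k_1^2}{k_2k_3} + \text{2 perms.}\right) , \tag{5.3.71}$$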

$^{17}$A simple way to remember this back-of-the-envelope technique for estimating non-Gaussianity is

$$\frac{L_3}{L_2} \sim f_{\rm NL}\,\zeta . \tag{5.3.66}$$

$^{18}$In §5.3.2 we present a similar calculation in more detail. In that case, the result may actually be observable, so our efforts will be of more direct observational relevance.


where $K \equiv \frac{1}{3}(k_1 + k_2 + k_3)$. This slow-roll shape is well approximated by a superposition of the local and equilateral shapes

$$S_{\rm s.r.} \propto (6\varepsilon - 2\eta)\, S_{\rm loc.} + \frac{5}{3}\,\varepsilon\, S_{\rm equil.} . \tag{5.3.72}$$

Like the local shape, the slow-roll shape therefore peaks in the squeezed limit. However, the smallness of the slow-roll parameters makes the signal unobservable.
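The shape functions (5.3.70) and (5.3.71) are simple to evaluate numerically. The Python sketch below writes out the permutation sums explicitly, with $K \equiv \frac{1}{3}(k_1+k_2+k_3)$. Reading "2 perms." and "5 perms." as the distinct relabelings of $(k_1,k_2,k_3)$ is our assumption, and the function names are illustrative. With this reading both shapes equal one in the equilateral configuration and grow in the squeezed limit.

```python
def s_eps(k1, k2, k3):
    """Shape S_eps of eq. (5.3.70), with K = (k1 + k2 + k3) / 3."""
    K = (k1 + k2 + k3) / 3.0
    squares = k1**2 / (k2 * k3) + k2**2 / (k1 * k3) + k3**2 / (k1 * k2)
    ratios = k1 / k2 + k2 / k1 + k1 / k3 + k3 / k1 + k2 / k3 + k3 / k2
    products = k1 * k2 / k3 + k2 * k3 / k1 + k3 * k1 / k2
    return (-squares + ratios + 8.0 / (3.0 * K) * products) / 11.0

def s_eta(k1, k2, k3):
    """Shape S_eta of eq. (5.3.71)."""
    return (k1**2 / (k2 * k3) + k2**2 / (k1 * k3) + k3**2 / (k1 * k2)) / 3.0

def s_slow_roll(k1, k2, k3, eps, eta):
    """Combination of eq. (5.3.69)."""
    return 5.5 * eps * s_eps(k1, k2, k3) + 1.5 * eta * s_eta(k1, k2, k3)

if __name__ == "__main__":
    for k1 in (1.0, 0.1, 0.01):
        # k2 = k3 = 1 with k1 -> 0 approaches the squeezed limit
        print(k1, s_eps(k1, 1.0, 1.0), s_eta(k1, 1.0, 1.0))
```

Scanning `k1` towards zero at fixed `k2 = k3` shows the growth in the squeezed limit that the text attributes to the local-like part of the shape.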

Small Sound Speed

So far this might seem a bit like Sisyphean work, since we realized quickly that the non-Gaussianity from slow-roll inflation will never be observable. Which mechanisms could produce large non-Gaussianity during inflation? This question will concern us in the remainder of these notes.

Higher Derivatives and Radiative Stability

What kind of high-energy effects could deform slow-roll inflation in such a way as to produce

large non-Gaussianity without disrupting the inflationary background solution? In effective field

theory the effects of high-energy physics are encoded in high-dimension operators for the inflaton lagrangian.$^{19}$ Non-derivative operators such as $\phi^n/\Lambda^{n-4}$ form part of the inflaton potential and are therefore strongly constrained by the background (see previous section). In other words, the existence of a slow-roll phase requires the non-Gaussianity associated with these operators to be small.$^{20}$ This naturally leads us to consider higher-derivative operators of the form $(\partial_\mu\phi)^{2n}/\Lambda^{4n-4}$. These operators don't affect the background, but in principle they could lead to strong interactions. Let us consider the leading correction to the slow-roll lagrangian

$$L = L_{\rm s.r.} + \frac{(\partial_\mu\phi)^4}{8\Lambda^4} . \tag{5.3.73}$$

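We split the inflaton field into background $\bar\phi(t)$ and fluctuations $\varphi(x,t)$. For $\dot{\bar\phi} \ll \Lambda^2$, we can ignore the correction to the quadratic lagrangian for $\varphi$,

$$L_2 \approx -\frac{1}{2}(\partial_\mu\varphi)^2 . \tag{5.3.74}$$

We get the cubic lagrangian for $\varphi$ by evaluating one of the legs of the interaction $(\partial\phi)^4$ on the background $\dot{\bar\phi}$,

$$L_3 = -\frac{\dot{\bar\phi}}{2\Lambda^4}\,\dot\varphi\,(\partial_\mu\varphi)^2 . \tag{5.3.75}$$

As before, we estimate the size of the non-Gaussianity in the quick and dirty way,

$$f_{\rm NL} \sim \frac{1}{\zeta}\frac{L_3}{L_2} \sim \frac{1}{\zeta}\frac{\dot{\bar\phi}\,\dot\varphi}{\Lambda^4} . \tag{5.3.76}$$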

$^{19}$In addition, there could be high-energy modifications to the vacuum state. These effects require a separate discussion.

$^{20}$A possibility to get large non-Gaussianities is to have additional light fields (i.e. fields with mass smaller than $H$) during inflation. These new degrees of freedom are not constrained by the slow-roll requirements and if their fluctuations are somehow converted into curvature perturbations, these can be much less Gaussian than in single-field slow-roll inflation. We discuss this possibility in the next section.


Using $\dot\varphi \sim H\varphi$ and $\zeta = \frac{H}{\dot{\bar\phi}}\varphi$, we get

$$f_{\rm NL} \sim \frac{\dot{\bar\phi}^2}{\Lambda^4} . \tag{5.3.77}$$

We therefore find that we only get significant non-Gaussianity when $\dot{\bar\phi} > \Lambda^2$, in which case we can't trust our derivative expansion. In other words, for $\dot{\bar\phi} > \Lambda^2$ there is no reason to truncate the expansion at finite order as we did in (5.3.73). Instead, operators of arbitrary dimensions become important in this limit,

$$P(X,\phi) = \sum_n c_n(\phi)\frac{X^n}{\Lambda^{4n-4}} , \quad\text{where}\quad X \equiv -\frac{1}{2}(\partial_\mu\phi)^2 . \tag{5.3.78}$$

As an effective field theory, eq. (5.3.78) makes little sense when $X > \Lambda^4$. All of the coefficients $c_n$ are radiatively unstable. Hence, if we want to use a theory like (5.3.78) to generate large non-Gaussianity, we require a UV-completion. Interestingly, an example of such a UV-completion exists in string theory. In Dirac-Born-Infeld inflation,

$$P(X,\phi) = \frac{\Lambda^4}{f(\phi)}\sqrt{1 - f(\phi)\frac{X}{\Lambda^4}} - V(\phi) , \tag{5.3.79}$$

the form of the action is protected by a higher-dimensional boost symmetry. This symmetry

protects eq. (5.3.79) from radiative corrections and allows a predictive inflationary model with

large non-Gaussianity. It would be interesting to explore if there are other examples of P (X)

theories that are radiatively stable. In the next subsection, we will allow ourselves a bit of artistic

freedom and study the phenomenology of general P (X) theories, while ignoring the serious issues

they face in explaining radiative stability. However, I emphasize that the problem of radiative

stability should not be treated lightly. It is an important requirement of any satisfactory theory.

Non-Gaussianity in P (X) Theories

We consider theories whose action can be written in the following form

$$S = \int d^4x \sqrt{-g}\left[\frac{1}{2}R + P(X,\phi)\right] . \tag{5.3.80}$$

We expand both the inflaton and the metric in small fluctuations, i.e. $\phi(x,t) = \bar\phi(t) + \varphi(x,t)$ and $g_{\mu\nu} = \bar g_{\mu\nu} + \delta g_{\mu\nu}$. We could now go ahead and solve the metric fluctuations in the Einstein constraint equations in terms of the matter fluctuations $\varphi$. After some hard work, we would arrive at an action for a single scalar degree of freedom. However, this is unnecessarily hard work in the case we are interested in. Large non-Gaussianities only arise from the higher-derivative inflaton self-interactions, while the effects of mixing with gravity are subdominant. At leading order, we will therefore arrive at the correct result by simply ignoring the mixing with metric fluctuations and computing the action for inflaton perturbations $\varphi$ in an unperturbed spacetime $\bar g_{\mu\nu}$. In this decoupling limit, $g_{\mu\nu} \to \bar g_{\mu\nu}$, we find $X = \bar X + \delta X$, where

$$\bar X = \frac{1}{2}\dot{\bar\phi}^2 \quad\text{and}\quad \delta X = \dot{\bar\phi}\,\dot\varphi - \frac{1}{2}(\partial_\mu\varphi)^2 . \tag{5.3.81}$$

Defining $\varphi \equiv \dot{\bar\phi}\,\pi$ and ignoring (slow-roll suppressed) derivatives of $\dot{\bar\phi}$, we get

$$\delta X = 2\bar X\left[\dot\pi - \frac{1}{2}(\partial_\mu\pi)^2\right] . \tag{5.3.82}$$


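We use this in a Taylor expansion of $P(X)$,

$$P(X) = P(\bar X) + P_{,X}\,\delta X + \frac{1}{2}P_{,XX}(\delta X)^2 + \frac{1}{6}P_{,XXX}(\delta X)^3 + \cdots \tag{5.3.83}$$

In terms of the (rescaled) inflaton fluctuations $\pi$ this becomes

$$P = \bar X P_{,X}(\partial_\mu\pi)^2 + 2\bar X^2 P_{,XX}\left[\dot\pi^2 - \dot\pi(\partial_\mu\pi)^2 + \cdots\right] + \frac{4}{3}\bar X^3 P_{,XXX}\left[\dot\pi^3 + \cdots\right] \tag{5.3.84}$$

The background equations of motion relate the first coefficient to the Hubble expansion parameter, $\bar X P_{,X} = M_{\rm pl}^2\dot H$, while the remaining parameters are free, $M_n^4 \equiv \bar X^n \frac{d^nP}{dX^n}$. Up to cubic order we hence get the following lagrangian

$$L = M_{\rm pl}^2\dot H(\partial_\mu\pi)^2 + 2M_2^4\left[\dot\pi^2 - \dot\pi(\partial_\mu\pi)^2\right] + \frac{4}{3}M_3^4\,\dot\pi^3 . \tag{5.3.85}$$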

A few comments are in order:

1. The $M_2$ operator leads to a correction of the kinetic term $\dot\pi^2$, but not of the gradient term $(\partial_i\pi)^2$. Lorentz invariance is broken, space and time are treated differently and the theory for the fluctuations develops a non-trivial sound speed,

$$L_2 = \frac{M_{\rm pl}^2\dot H}{c_s^2}\left[\dot\pi^2 - c_s^2(\partial_i\pi)^2\right] , \tag{5.3.86}$$

where

$$\frac{1}{c_s^2} = 1 - \frac{2M_2^4}{M_{\rm pl}^2\dot H} . \tag{5.3.87}$$

For $M_2^4 \gg M_{\rm pl}^2|\dot H|$ the sound speed is significantly smaller than the speed of light, $c_s \ll 1$. The free-field mode function derived from (5.3.86) is

$$\pi_k(\tau) = \pi_k^{(o)}(1 + ikc_s\tau)\,e^{-ikc_s\tau} , \tag{5.3.88}$$

where $\pi_k^{(o)}$ is its superhorizon value

$$\pi_k^{(o)} \equiv \frac{i}{2M_{\rm pl}\sqrt{\varepsilon k^3 c_s}} . \tag{5.3.89}$$

Hence, the power spectrum for $\zeta = -H\pi$ is

$$P_\zeta = H^2|\pi_k^{(o)}|^2 = \frac{H^2}{4\varepsilon c_s M_{\rm pl}^2}\frac{1}{k^3} . \tag{5.3.90}$$

2. A non-linearly realized symmetry in (5.3.85) relates a small sound speed (large $M_2$) to large interactions. In this limit we expect large non-Gaussianities. Our standard estimate confirms this,

$$f_{\rm NL} \sim \frac{1}{\zeta}\frac{L_3}{L_2} \sim \frac{1}{H\pi}\frac{\dot\pi(\partial_i\pi)^2}{\dot\pi^2} \sim \frac{1}{c_s^2} \gg 1 . \tag{5.3.91}$$
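As a sanity check on the first comment, the Python sketch below verifies numerically that the mode function (5.3.88) solves $\pi_k'' - \frac{2}{\tau}\pi_k' + c_s^2k^2\pi_k = 0$. This form of the mode equation is our assumption: it is what (5.3.86) gives in a de Sitter background with $a = -1/(H\tau)$ and constant $H$, $\varepsilon$ and $c_s$. The mode function is normalized to its superhorizon value, the function names and step size are illustrative, and `sound_speed` simply evaluates (5.3.87).

```python
import cmath
import math

def mode_function(tau, k, cs):
    """pi_k(tau) / pi_k^(o) from eq. (5.3.88)."""
    x = k * cs * tau
    return (1.0 + 1j * x) * cmath.exp(-1j * x)

def mode_equation_residual(tau, k, cs, h=1e-4):
    """Finite-difference residual of pi'' - (2/tau) pi' + (cs k)^2 pi."""
    up = mode_function(tau + h, k, cs)
    mid = mode_function(tau, k, cs)
    down = mode_function(tau - h, k, cs)
    first = (up - down) / (2.0 * h)
    second = (up - 2.0 * mid + down) / (h * h)
    return second - 2.0 / tau * first + (cs * k) ** 2 * mid

def sound_speed(m2_fourth, mpl2_hdot):
    """c_s from eq. (5.3.87); mpl2_hdot = M_pl^2 * dH/dt is negative."""
    return 1.0 / math.sqrt(1.0 - 2.0 * m2_fourth / mpl2_hdot)

if __name__ == "__main__":
    print(abs(mode_equation_residual(-3.0, 1.0, 0.3)))  # close to zero
    print(mode_function(-1e-8, 1.0, 0.3))                # close to 1
    print(sound_speed(12.0, -1.0))                       # small c_s for large M_2
```

The residual vanishes up to finite-difference error for any $k$ and $c_s$, and the mode function tends to its superhorizon value as $\tau \to 0$.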


Let us use the in-in formalism (with $\psi \to \pi$) to compute the non-Gaussianity in the limit $c_s \ll 1$. At tree level, we use the in-in master formula to compute the bispectrum after horizon-crossing ($\tau \to 0$):

$$\lim_{\tau\to 0}\langle\pi^3\rangle = -i\int_{-\infty^+}^{0} d\tau\, \langle[\pi^3(0), H_{\rm int}(\tau)]\rangle = (2\pi)^3\delta(k_1+k_2+k_3)\,B_\pi(k_1,k_2,k_3) , \tag{5.3.92}$$

where, at leading order in the interactions, $H_{\rm int} = -\int d^3x\, a^4 L_{\rm int}$.

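We first consider the $\dot\pi(\partial_i\pi)^2$ interaction. Performing the standard Wick contractions, we find

$$B_{\dot\pi(\partial_i\pi)^2} = 2M_2^4\cdot\pi^{(o)}_{k_1}\pi^{(o)}_{k_2}\pi^{(o)}_{k_3}\int_{-\infty^+}^{0}\frac{d\tau}{H\tau}\,(\pi^*_{k_1})'\,\pi^*_{k_2}\pi^*_{k_3}\,(\mathbf{k}_2\cdot\mathbf{k}_3) + \text{perms.} + \text{c.c.} , \tag{5.3.93}$$

where momentum conservation gives $\mathbf{k}_2\cdot\mathbf{k}_3 = \frac{1}{2}(k_1^2 - k_2^2 - k_3^2)$. We insert (5.3.88) for the wavefunctions. The resulting integral converges$^{21}$ due to the $i\epsilon$ in the integration limit $-\infty^+ = -\infty(1 - i\epsilon)$. The bispectrum for the curvature fluctuation is obtained by a simple rescaling: $B_{\dot\zeta(\partial_i\zeta)^2} = H^3 B_{\dot\pi(\partial_i\pi)^2}$. Using eq. (5.2.12) to extract the amplitude and the shape of the non-Gaussianity, we find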

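$$f_{\rm NL}^{\dot\zeta(\partial_i\zeta)^2} = \frac{85}{324}\left(1 - \frac{1}{c_s^2}\right) , \tag{5.3.94}$$

and

$$S_{\dot\zeta(\partial_i\zeta)^2} \propto \frac{(k_1^2 - k_2^2 - k_3^2)\,K}{k_1k_2k_3}\left[-1 + \frac{1}{9}\sum_{i>j}\frac{k_ik_j}{K^2} + \frac{1}{27}\frac{k_1k_2k_3}{K^3}\right] + \text{perms.} , \tag{5.3.95}$$

where $K \equiv \frac{1}{3}(k_1 + k_2 + k_3)$.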

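Similarly, the bispectrum associated with the $\dot\pi^3$ interaction is

$$B_{\dot\pi^3}(k_1,k_2,k_3) = 2M_3^4\cdot\pi^{(o)}_{k_1}\pi^{(o)}_{k_2}\pi^{(o)}_{k_3}\int_{-\infty^+}^{0}\frac{d\tau}{H\tau}\,(\pi^*_{k_1})'(\pi^*_{k_2})'(\pi^*_{k_3})' + \text{perms.} + \text{c.c.} \tag{5.3.96}$$

Inserting the wavefunctions and performing the integral, we find the following amplitude

$$f_{\rm NL}^{\dot\zeta^3} = \frac{10}{243}\left(1 - \frac{1}{c_s^2}\right)\left(c_3 + \frac{3}{2}c_s^2\right) , \tag{5.3.97}$$

and the shape

$$S_{\dot\zeta^3} = \frac{k_1k_2k_3}{K^3} . \tag{5.3.98}$$

The parameter $c_3$ in eq. (5.3.97) is defined via

$$M_3^4 \equiv c_3\cdot\frac{M_2^4}{c_s^2} . \tag{5.3.99}$$

In DBI inflation there is a particular relation between $M_2$ and $M_3$, corresponding to

$$c_3^{\rm DBI} = \frac{3}{2}(1 - c_s^2) \approx \frac{3}{2} . \tag{5.3.100}$$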

Both $S_{\dot\zeta(\partial_i\zeta)^2}$ and $S_{\dot\zeta^3}$ are equilateral shapes. However, the shapes are not identical. In fact, for $c_3 \approx -5.4$ the combined shape is nearly orthogonal to the equilateral shape (see §5.2.5).
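To get a feeling for the numbers, the amplitudes (5.3.94) and (5.3.97) can be evaluated for a given sound speed and $c_3$. The Python sketch below uses exact rational arithmetic; the function names are illustrative. Adding the two amplitudes into a single number is a simplifying assumption, since the two shapes are similar but not identical. With that assumption the DBI value (5.3.100) gives a total of $\frac{35}{108}\left(1 - \frac{1}{c_s^2}\right)$, which follows from the two formulas by direct arithmetic.

```python
from fractions import Fraction as F

def f_nl_gradient(cs2):
    """Amplitude of eq. (5.3.94); cs2 is the squared sound speed."""
    return F(85, 324) * (1 - 1 / cs2)

def f_nl_dot_cubed(cs2, c3):
    """Amplitude of eq. (5.3.97)."""
    return F(10, 243) * (1 - 1 / cs2) * (c3 + F(3, 2) * cs2)

def c3_dbi(cs2):
    """DBI relation of eq. (5.3.100)."""
    return F(3, 2) * (1 - cs2)

if __name__ == "__main__":
    cs2 = F(1, 100)  # example value, c_s = 0.1
    total = f_nl_gradient(cs2) + f_nl_dot_cubed(cs2, c3_dbi(cs2))
    print(float(f_nl_gradient(cs2)), float(total))
```

Passing a `Fraction` for `cs2` keeps every step exact, which makes it easy to confirm such relations without rounding noise.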


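[Figure 5.3: CMB Precision Tests. CMB data constrains the parameters $M_2$ and $M_3$ in the effective lagrangian $L = M_{\rm pl}^2\dot H(\partial_\mu\pi)^2 + 2M_2^4\left(\dot\pi^2 - \dot\pi(\partial_\mu\pi)^2\right) + \frac{4}{3}M_3^4\,\dot\pi^3$. Plot labels: slow-roll inflation, DBI inflation, sound speed.]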

CMB Precision Tests

The lagrangian (5.3.85) is the minimal deformation of standard single-field slow-roll inflation.

Smith et al.$^{22}$ used CMB data to constrain its parameters (see fig. 5.3).

A Consistency Relation

We conclude our discussion of non-Gaussianity in single-field inflation with a powerful theorem.

Under the assumption of a single field, but making absolutely no other assumptions about the

inflationary action, Creminelli and Zaldarriaga proved that the following has to hold in the

squeezed limit:

$$\lim_{k_1\to 0}\langle\zeta_{k_1}\zeta_{k_2}\zeta_{k_3}\rangle = (2\pi)^3\delta(k_1+k_2+k_3)\,(1 - n_s)\,P_\zeta(k_1)P_\zeta(k_3) , \tag{5.3.101}$$

i.e. for single-field inflation, the squeezed limit of the three-point function is suppressed by

(1−ns) and vanishes for perfectly scale-invariant perturbations. A detection of non-Gaussianity

in the squeezed limit can therefore rule out all models of single-field inflation! In particular, this

statement is independent of: the form of the potential, the form of the kinetic term (or sound

speed) and the initial vacuum state.

Proof:

The squeezed triangle correlates one long-wavelength mode, $k_L = k_1$, to two short-wavelength modes, $k_S = k_2 \approx k_3$,

$$\langle\zeta_{k_1}\zeta_{k_2}\zeta_{k_3}\rangle \to \langle(\zeta_{k_S})^2\zeta_{k_L}\rangle . \tag{5.3.102}$$

Modes with longer wavelengths freeze earlier. Therefore, kL will be already frozen outside the horizon

when the two smaller modes freeze and acts as a background field for the two short-wavelength modes.

$^{21}$If you Wick rotate, $\tau \to -i\tau$, Mathematica will do the integral.

$^{22}$Smith, Senatore, and Zaldarriaga, (arXiv:0901.2572).


Why should $(\zeta_{k_S})^2$ be correlated with $\zeta_{k_L}$? The theorem says that it isn't correlated if $\zeta_k$ is precisely scale-invariant, but that the short-scale power does get modified by the long-wavelength mode if $n_s \neq 1$. Let's see why. We decompose the evaluation of (5.3.102) into two steps:

i) we calculate the power spectrum of short fluctuations $\langle\zeta_S^2\rangle_{\zeta_L}$ in the presence of a long mode $\zeta_L$;

ii) we then calculate the correlation $\langle(\zeta_{k_S})^2\zeta_{k_L}\rangle$.

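The calculation is simplest in real space: the long-wavelength curvature perturbation $\zeta_{k_L}$ rescales the spatial coordinates within a given Hubble patch (recall that $ds^2 = -dt^2 + a(t)^2e^{2\zeta(x,t)}dx^2$). The two-point function $\langle\zeta_S^2\rangle$ will depend on the value of the background fluctuations $\zeta_L$ already frozen outside the horizon. In position space the variation of the two-point function given by the long-wavelength fluctuations $\zeta_L$ is at linear order

$$\langle\zeta_S^2\rangle_{\zeta_L}(\Delta x) = \langle\zeta_S^2\rangle_0(\Delta x) + \zeta_L\cdot\left.\frac{d}{d\zeta_L}\langle\zeta_S^2\rangle\right|_0 + \cdots , \tag{5.3.103}$$

where the subscript 0 denotes a quantity in the absence of $\zeta_L$ (or in the limit $\zeta_L \to 0$). Using $\frac{d}{d\zeta_L} \to \frac{d}{d\ln\Delta x}$, and noting that a spectrum with tilt $n_s$ has $\langle\zeta_S^2\rangle_0 \propto \Delta x^{1-n_s}$, we find

$$\langle\zeta_S^2\rangle_{\zeta_L} = \langle\zeta_S^2\rangle_0 + \zeta_L\cdot(1 - n_s)\cdot\langle\zeta_S^2\rangle_0 . \tag{5.3.104}$$

To get the three-point function in (5.3.102), we multiply (5.3.104) by $\zeta_L$ and average over it

$$\langle\langle\zeta_S^2\rangle_{\zeta_L}\zeta_L\rangle = (1 - n_s)\langle\zeta_L^2\rangle\langle\zeta_S^2\rangle_0 . \tag{5.3.105}$$

Going to Fourier space gives eq. (5.3.101),

$$B_\zeta(k_S, k_S, k_L \to 0) = (1 - n_s)\,P_\zeta(k_L)P_\zeta(k_S) . \tag{5.3.106}$$

QED.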

From the proof it is clear that we didn't have to assume any details about inflation, except that $\zeta_L$ alone modifies $\langle\zeta_S^2\rangle$. This assumption is satisfied for a single field, but for multiple fields $\langle\zeta_S^2\rangle$ can be modified by other things as well. That is why this is a theorem for all single-field models, and can be used to rule them out!
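The key step of the proof, trading the derivative with respect to $\zeta_L$ for a logarithmic derivative with respect to $\Delta x$, can be illustrated with a toy example. The Python sketch below assumes a pure power-law two-point function $\langle\zeta_S^2\rangle_0 \propto \Delta x^{1-n_s}$ and checks that its logarithmic derivative is $1 - n_s$. The last function converts the squeezed-limit result into an effective $f_{\rm NL}$ under the additional assumption that the local shape is normalized as $B = \frac{6}{5}f_{\rm NL}\left[P(k_1)P(k_2) + \text{2 perms.}\right]$, so that $B \to \frac{12}{5}f_{\rm NL}P(k_L)P(k_S)$. All names and the value $n_s = 0.96$ are illustrative.

```python
import math

def xi_short(dx, ns, amplitude=1.0):
    """Toy two-point function of the short modes, xi ~ dx^(1 - ns)."""
    return amplitude * dx ** (1.0 - ns)

def log_derivative(func, dx, h=1e-5):
    """Central finite-difference estimate of d ln(func) / d ln(dx)."""
    up = math.log(func(dx * math.exp(h)))
    down = math.log(func(dx * math.exp(-h)))
    return (up - down) / (2.0 * h)

def squeezed_f_nl(ns):
    """Effective f_NL implied by eq. (5.3.101), assuming the 6/5 normalization."""
    return 5.0 / 12.0 * (1.0 - ns)

if __name__ == "__main__":
    ns = 0.96
    print(log_derivative(lambda dx: xi_short(dx, ns), 2.0))  # about 1 - ns
    print(squeezed_f_nl(ns))
```

For a nearly scale-invariant spectrum the implied squeezed-limit amplitude is of order a percent, consistent with the slow-roll suppression found earlier.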

5.4 Classical Non-Gaussianities

In the previous section we computed the non-Gaussianity generated at horizon crossing. In

this section we discuss a second source of non-Gaussianity arising from non-linearities after

horizon crossing when all modes have become classical. A convenient way to describe these

non-Gaussianities is the δN formalism.

5.4.1 The δN Formalism

Scalar perturbations to the spatial metric on a fixed time slice $t$ can be written as a local perturbation to the scale factor

$$a(x,t) \equiv a(t)\,e^{\psi(x,t)} . \tag{5.4.107}$$

The local number of e-folds of expansion between two time slices $t_1$ and $t_2$ is

$$\delta N_{12}(x) = \int_{t_1}^{t_2}\frac{d\ln a}{dt}\,dt = \psi(x,t_2) - \psi(x,t_1) . \tag{5.4.108}$$


Define $\delta N(x,t)$ as the number of e-folds from a fixed flat slice$^{23}$ ($\psi = 0$) to a uniform density slice ($\psi = \zeta$) at time $t$. Then,

$$\zeta(x,t) = \delta N(x,t) . \tag{5.4.109}$$

This leads to a simple algorithm to compute the superhorizon evolution of the primordial curva-

ture perturbation ζ: to illustrate the procedure consider a set of scalars φi. A linear combination

of these fields will be the inflaton. The remaining fields are ‘isocurvatons’. We assume that all

fields have become superhorizon at some initial time. At this time we choose a spatially flat

time-slice, on which there are no scalar metric fluctuations, but only fluctuations in the matter

fields, φi + δφi(x). Choose the final time-slice to have uniform density, i.e. the inflaton field is

unperturbed and all fluctuations are in the metric and the isocurvatons. Evolve the unperturbed

fields φi in the initial slice ‘classically’ to the unperturbed final slice, and denote the correspond-

ing number of e-folds N(φi). Next, evolve the perturbed initial field configuration φi + δφi(x)

‘classically’ to the perturbed final slice. (Notice that in single-field inflation the ‘perturbed’ final

slice is equal to the unperturbed slice, while in general they are different.) The corresponding

number of e-folds is $N(\phi_i + \delta\phi_i)$. The difference between those two answers,

$$\delta N = N(\phi_i + \delta\phi_i) - N(\phi_i) , \tag{5.4.110}$$

is the curvature perturbation $\zeta$, cf. eq. (5.4.109). Taylor expanding, we obtain an expression for $\zeta$ in terms of the scalar field fluctuations $\delta\phi_i$ and derivatives of $N$ defined on the initial slice,

$$\zeta = N_i\,\delta\phi_i + \frac{1}{2}N_{ij}\,\delta\phi_i\delta\phi_j + \cdots \tag{5.4.111}$$

where $N_i \equiv \partial_iN$, $N_{ij} \equiv \partial_i\partial_jN$, etc. are derivatives evaluated on the initial slice.

We see that there are two sources of non-Gaussianity for ζ:

1. Intrinsic non-Gaussianity of the fields δφi.

2. Non-Gaussianity resulting from the non-linear relationship between ζ and δφi.

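In this section we are interested in the second effect, so we assume that the inflaton and the isocurvatons are Gaussian on the initial slice. Any non-Gaussianity arises from the subsequent non-linear evolution. Under this assumption, the correlation functions for $\zeta$ can be written as

$$\langle\zeta(x_1)\zeta(x_2)\rangle = N_iN_j\,\langle\delta\phi_i(x_1)\delta\phi_j(x_2)\rangle \tag{5.4.112}$$

$$\langle\zeta(x_1)\zeta(x_2)\zeta(x_3)\rangle = \frac{1}{2}N_{ij}N_kN_l\,\langle\delta\phi_i(x_1)\delta\phi_j(x_1)\delta\phi_k(x_2)\delta\phi_l(x_3)\rangle + \text{2 perms.} \tag{5.4.113}$$

The factor $\frac{1}{2}$ is inherited from the quadratic term in (5.4.111). The four-point function in (5.4.113) can be written as a product of two-point functions for Gaussian fluctuations. Using the power spectrum for nearly massless fields in de Sitter,

$$\langle\delta\phi_i(k_1)\delta\phi_j(k_2)\rangle = \frac{H^2}{2k_1^3}\,(2\pi)^3\delta(k_1 + k_2)\,\delta_{ij} , \tag{5.4.114}$$

we find

$$B_\zeta(k_1,k_2,k_3) = \frac{N_{ij}N_iN_j}{(N_l^2)^2}\cdot\left(P_\zeta(k_1)P_\zeta(k_2) + \text{2 perms.}\right) . \tag{5.4.115}$$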

$^{23}$Previously, we called $\zeta$ the curvature perturbation in 'comoving' gauge. On superhorizon scales this is equal to the curvature perturbation in 'uniform density' gauge, so we don't introduce a separate variable.


where

$$P_\zeta(k) = N_i^2\cdot\frac{H^2}{2k^3} . \tag{5.4.116}$$

Unsurprisingly, we find a bispectrum of the local shape. (The non-Gaussianity is coming from local non-linearities in real space.) The amplitude is

$$f_{\rm NL}^{\rm loc.} = \frac{5}{6}\frac{N_{ij}N_iN_j}{(N_l^2)^2} . \tag{5.4.117}$$
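Formula (5.4.117) only needs the first and second derivatives of $N$ on the initial slice, so it is straightforward to evaluate once $N(\phi_i)$ is known. The Python sketch below estimates $N_i$ and $N_{ij}$ by central finite differences and contracts them as in (5.4.117). The example function $N = \frac{1}{4}\sum_i\phi_i^2$ (quadratic potentials with $M_{\rm pl} = 1$) is an illustrative assumption, as are the function names and the step size.

```python
def gradient(n_func, phi, h=1e-3):
    """Central-difference estimate of N_i = dN/dphi_i."""
    grad = []
    for i in range(len(phi)):
        up, down = list(phi), list(phi)
        up[i] += h
        down[i] -= h
        grad.append((n_func(up) - n_func(down)) / (2.0 * h))
    return grad

def hessian(n_func, phi, h=1e-3):
    """Central-difference estimate of N_ij = d^2N/dphi_i dphi_j."""
    dim = len(phi)
    hess = [[0.0] * dim for _ in range(dim)]
    for i in range(dim):
        for j in range(dim):
            pp, pm, mp, mm = list(phi), list(phi), list(phi), list(phi)
            pp[i] += h; pp[j] += h
            pm[i] += h; pm[j] -= h
            mp[i] -= h; mp[j] += h
            mm[i] -= h; mm[j] -= h
            hess[i][j] = (n_func(pp) - n_func(pm) - n_func(mp) + n_func(mm)) / (4.0 * h * h)
    return hess

def f_nl_local(n_func, phi, h=1e-3):
    """Local f_NL from eq. (5.4.117)."""
    g = gradient(n_func, phi, h)
    hess = hessian(n_func, phi, h)
    dim = len(phi)
    numerator = sum(hess[i][j] * g[i] * g[j] for i in range(dim) for j in range(dim))
    denominator = sum(gi * gi for gi in g) ** 2
    return 5.0 / 6.0 * numerator / denominator

def efolds_quadratic(phi):
    """Assumed example: N = sum(phi_i^2) / 4."""
    return sum(p * p for p in phi) / 4.0

if __name__ == "__main__":
    print(f_nl_local(efolds_quadratic, [3.0, 4.0]))
```

For this example the result is small, as expected when all fields roll slowly; large values require a strongly non-linear dependence of $N$ on some field.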

5.4.2 Inhomogeneous Reheating

In this section, we present two applications of the δN formalism. In the curvaton model in-

homogeneous reheating occurs because the amplitude of an oscillating field after inflation is

modulated by long-wavelength fluctuations set up during inflation. In scenarios of modulated

reheating the same long-wavelength fluctuations lead to a spatially dependent inflaton decay

rate. Both mechanisms lead to a new source for curvature perturbations with potentially large

non-Gaussianity.

Modulated Curvaton Oscillations

Let there be a second light field $\sigma$ during inflation with potential $V(\sigma) = \frac{1}{2}m_\sigma^2\sigma^2$ and mass sufficiently small, $m_\sigma \ll H$, to allow long-wavelength quantum fluctuations $\delta\sigma$ to be unsuppressed. The energy density associated with $\sigma$ is negligible initially. However, its quantum fluctuations $\delta\sigma$ will become the primary source for the primordial curvature perturbation $\zeta$. For this reason, $\sigma$ is given the name 'curvaton'. Let $\sigma_\star = \bar\sigma + \delta\sigma(x)$ be the amplitude of the curvaton at horizon-exit. Assume that the fluctuations are Gaussian at horizon-exit. The curvaton amplitude remains frozen until the Hubble parameter drops below $m_\sigma$. At that time the curvaton starts oscillating around the minimum of its potential and behaves as pressureless matter. Initially, the universe is radiation-dominated after inflation, but the fraction of the energy density associated with the curvaton oscillations quickly grows.$^{24}$ We assume that the curvaton is unstable and decays when its decay rate equals the Hubble expansion rate, $H = \Gamma$ (sudden-decay approximation).

We will use the $\delta N$ formalism to compute the statistics of the resulting curvature perturbations. The initial spatially flat slice $t_0$ has radiation density $\rho_{\gamma,0}$ and curvaton density $\rho_{\sigma,0}$. The scale factor at that time is $a_0$. Both components initially redshift as radiation, until $H = m_\sigma$ at time $t_1$, and the curvaton starts oscillating. The Friedmann equation at $t_1$ is

$$3M_{\rm pl}^2m_\sigma^2 = \left(\frac{a_0}{a_1}\right)^4(\rho_{\gamma,0} + \rho_{\sigma,0}) \approx \left(\frac{a_0}{a_1}\right)^4\rho_{\gamma,0} . \tag{5.4.118}$$

The curvaton now dilutes as matter, until it decays at $t_2$ when $H = \Gamma$. The Friedmann equation at $t_2$ is

$$3M_{\rm pl}^2\Gamma^2 = \left(\frac{a_0}{a_2}\right)^4\rho_{\gamma,0} + \left(\frac{a_0}{a_1}\right)^4\left(\frac{a_1}{a_2}\right)^3\rho_{\sigma,0} . \tag{5.4.119}$$

Since the time-slice $t_2$ is defined by a constant Hubble parameter (and hence constant energy density), we automatically are on the final uniform density slice required by the $\delta N$ formula.$^{25}$

$^{24}$Recall that matter dilutes as $a^{-3}$, while radiation redshifts as $a^{-4}$.

$^{25}$After $t_2$, both components become radiation and the evolution is the same everywhere, so there is no further contribution to $\delta N$.


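To apply the $\delta N$ formula for $\zeta$, we need to determine the number of e-folds of expansion from $t_0$ to $t_2$ as a function of the initial field value of the curvaton $\sigma$. First, we note that (5.4.119) implies

$$e^{-4N} + e^{-3N}\alpha = \text{const.} \tag{5.4.120}$$

where $\alpha \equiv (a_0/a_1)(\rho_{\sigma,0}/\rho_{\gamma,0})$. At leading order, $a_0/a_1$ is independent of $\sigma$, cf. eq. (5.4.118). Since $\rho_{\sigma,0} \propto V(\sigma) \propto \sigma^2$, we infer that $\alpha \propto \sigma^2$. Differentiating (5.4.120) with respect to $\sigma$ and using (5.4.117), we find

$$f_{\rm NL}^{\rm loc.} = \frac{5}{6}\frac{N_{\sigma\sigma}}{N_\sigma^2} = \frac{5}{3x_{\sigma,2}} - \frac{5(4 + 9x_{\sigma,2})}{12(4 + 3x_{\sigma,2})} , \tag{5.4.121}$$

where we defined $x_{\sigma,2} = e^N\alpha = \rho_{\sigma,2}/\rho_{\gamma,2}$. Sometimes the final answer is written as

$$f_{\rm NL}^{\rm loc.} = \frac{5}{4x} - \frac{5}{3} - \frac{5x}{6} , \tag{5.4.122}$$

where

$$x \equiv \frac{3\rho_{\sigma,2}}{4\rho_{\gamma,2} + 3\rho_{\sigma,2}} . \tag{5.4.123}$$

This can lead to large non-Gaussianities if $x \ll 1$.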
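The curvaton result can be cross-checked numerically. The Python sketch below solves the constraint (5.4.120) for $N(\sigma)$ by bisection, with $\alpha \propto \sigma^2$, differentiates by finite differences, and compares $\frac{5}{6}N_{\sigma\sigma}/N_\sigma^2$ with the closed forms (5.4.121) and (5.4.122). The normalizations (`const = 1` and $\alpha = \sigma^2$), the bisection bracket and the function names are illustrative assumptions.

```python
import math

def f_nl_from_density_ratio(r):
    """Eq. (5.4.121), with r = rho_sigma / rho_gamma at the decay time t2."""
    return 5.0 / (3.0 * r) - 5.0 * (4.0 + 9.0 * r) / (12.0 * (4.0 + 3.0 * r))

def f_nl_from_x(x):
    """Eq. (5.4.122)."""
    return 5.0 / (4.0 * x) - 5.0 / 3.0 - 5.0 * x / 6.0

def x_from_density_ratio(r):
    """Eq. (5.4.123)."""
    return 3.0 * r / (4.0 + 3.0 * r)

def efolds(sigma, const=1.0, alpha_per_sigma2=1.0):
    """Solve exp(-4N) + alpha exp(-3N) = const for N, with alpha ~ sigma^2."""
    alpha = alpha_per_sigma2 * sigma ** 2
    def constraint(n):
        return math.exp(-4.0 * n) + alpha * math.exp(-3.0 * n) - const
    low, high = -50.0, 50.0  # constraint(low) > 0 > constraint(high)
    for _ in range(200):
        mid = 0.5 * (low + high)
        if constraint(mid) > 0.0:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)

def f_nl_numeric(sigma, h=1e-3):
    """(5/6) N'' / N'^2 from finite differences of N(sigma)."""
    n_plus, n_zero, n_minus = efolds(sigma + h), efolds(sigma), efolds(sigma - h)
    first = (n_plus - n_minus) / (2.0 * h)
    second = (n_plus - 2.0 * n_zero + n_minus) / (h * h)
    return 5.0 / 6.0 * second / first ** 2

if __name__ == "__main__":
    sigma = 1.0
    r = math.exp(efolds(sigma)) * sigma ** 2  # x_{sigma,2} = e^N alpha
    print(f_nl_numeric(sigma), f_nl_from_density_ratio(r), f_nl_from_x(x_from_density_ratio(r)))
```

Lowering `sigma` makes the curvaton fraction at decay smaller and the printed amplitudes larger, in line with the statement that $x \ll 1$ gives large non-Gaussianity.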

Inhomogeneous Decay Rate

In string theory and supergravity it is natural that coupling constants are functions of additional moduli fields. It is therefore quite natural to imagine that the decay rate of the inflaton is modulated by a second light field, $\Gamma(\sigma)$. Quantum fluctuations of $\sigma$ during inflation will lead to fluctuations in the density after inhomogeneous reheating. We will use the $\delta N$ formalism to compute the resulting non-Gaussianity.

As in the curvaton scenario, the spacetime is practically unperturbed until the inflaton decays at $H = \Gamma$. The $\delta N$ formula requires the number of e-folds from the initial time, when the spacetime is unperturbed, to a final time after decay, when the energy density has some value. Denoting these times by the subscripts $i$ and $f$, and labelling the time of decay by 'dec', we can write

$$e^N = \frac{a_f}{a_i} = \frac{a_f}{a_{\rm dec}}\frac{a_{\rm dec}}{a_i} . \tag{5.4.124}$$

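For purposes of illustration we assume a matter-dominated universe by the time of reheating. We can then take the initial slice to be during matter domination. During matter domination we have $a \propto H^{-2/3}$, while after the decay, during radiation domination, we have $a \propto H^{-1/2}$. Since the decay occurs at $H = \Gamma$, we therefore have

$$\frac{a_{\rm dec}}{a_i} \propto \Gamma^{-2/3} \quad\text{and}\quad \frac{a_f}{a_{\rm dec}} \propto \Gamma^{1/2} . \tag{5.4.125}$$

This gives $e^N \propto \Gamma^{-1/6}$, and

$$\frac{\partial N}{\partial\Gamma} = -\frac{1}{6}\frac{1}{\Gamma} \quad\text{and}\quad \frac{\partial^2N}{\partial\Gamma^2} = \frac{1}{6}\frac{1}{\Gamma^2} , \tag{5.4.126}$$

and hence

$$\delta N = -\frac{1}{6}\left[\frac{\delta\Gamma}{\Gamma} - \frac{1}{2}\left(\frac{\delta\Gamma}{\Gamma}\right)^2\right] . \tag{5.4.127}$$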


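Using $\delta\Gamma = \Gamma'\delta\sigma + \frac{1}{2}\Gamma''(\delta\sigma)^2 + \cdots$ we read off $N_\sigma = -\frac{1}{6}\frac{\Gamma'}{\Gamma}$ and $N_{\sigma\sigma} = -\frac{1}{6}\frac{\Gamma''\Gamma - (\Gamma')^2}{\Gamma^2}$, so that (5.4.117) gives

$$f_{\rm NL}^{\rm loc.} = 5\left[1 - \frac{\Gamma''\Gamma}{(\Gamma')^2}\right] . \tag{5.4.128}$$

This can lead to large non-Gaussianity if the dependence of the decay rate on the modulus $\sigma$ is non-linear, $|\Gamma''\Gamma| \gg (\Gamma')^2$.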
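The modulated-reheating amplitude can be checked by applying (5.4.117) directly to $N(\sigma) = -\frac{1}{6}\ln\Gamma(\sigma) + \text{const}$, which follows from $e^N \propto \Gamma^{-1/6}$. The Python sketch below does this with finite differences for a user-supplied $\Gamma(\sigma)$. Evaluated this way, $\frac{5}{6}N_{\sigma\sigma}/N_\sigma^2$ equals $5\left[1 - \Gamma''\Gamma/(\Gamma')^2\right]$. The example decay rates $\Gamma \propto \sigma^2$ and $\Gamma \propto e^\sigma$ are purely illustrative assumptions, as are the function names.

```python
import math

def efolds_modulated(gamma, sigma):
    """N(sigma) up to an additive constant, from e^N ~ Gamma^(-1/6)."""
    return -math.log(gamma(sigma)) / 6.0

def f_nl_modulated(gamma, sigma, h=1e-3):
    """(5/6) N'' / N'^2 for N = -(1/6) ln Gamma, via central differences."""
    n_plus = efolds_modulated(gamma, sigma + h)
    n_zero = efolds_modulated(gamma, sigma)
    n_minus = efolds_modulated(gamma, sigma - h)
    first = (n_plus - n_minus) / (2.0 * h)
    second = (n_plus - 2.0 * n_zero + n_minus) / (h * h)
    return 5.0 / 6.0 * second / first ** 2

if __name__ == "__main__":
    # Gamma ~ sigma^2 has Gamma'' Gamma / Gamma'^2 = 1/2
    print(f_nl_modulated(lambda s: s ** 2, 1.5))
    # Gamma ~ exp(sigma) has Gamma'' Gamma / Gamma'^2 = 1
    print(f_nl_modulated(math.exp, 0.7))
```

An overall constant in $\Gamma$ drops out of the result, so only the functional dependence on $\sigma$ matters.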

5.5 Large-Scale Structure and Non-Gaussianity

derive scale-dependent bias in peak-background split.

5.6 Future Prospects

The current WMAP constraints on local, equilateral, and orthogonal non-Gaussianity are

$$f_{\rm NL}^{\rm loc.} = 32 \pm 21 , \quad f_{\rm NL}^{\rm equil.} = 26 \pm 140 , \quad f_{\rm NL}^{\rm ortho.} = -202 \pm 104 . \tag{5.6.129}$$

The Planck satellite will improve error bars by at least a factor of 5. The results are expected in less than two years. Stay tuned.

LSS constraints.

Part II

The Physics of Inflation


6 Effective Field Theory

In the absence of a fundamental theory of high energy physics including gravitation, we can still

make progress. A systematic way of parameterizing our ignorance is to construct effective field

theories valid at the energy scale of the experiment. Effective field theory considerations will

give us a valuable perspective on the physics of inflation. Before we discuss this in more detail

in the next chapter, we will take this chapter as an opportunity to introduce the basic principles

of effective field theory.

6.1 Introduction

Few concepts in theoretical physics are more widely applicable than effective field theory (EFT).

Progress in diverse fields, such as condensed matter physics, particle physics, and cosmology,

has relied heavily on the power and universality of EFT techniques. EFT isolates the relevant

low-energy degrees of freedom, while systematically including the effects of high-energy degrees

of freedom as non-renormalizable corrections. The low-energy physics is then described by

an effective action for the light fields that includes all operators that are consistent with the

symmetries of the problem. These notes give a first introduction to this important topic. The text is adapted from the excellent TASI lectures of Witold Skiba$^{1}$ and Markus Luty$^{2}$.

Phenomena involving distinct energy, or length, scales can often be analyzed by considering

one relevant scale at a time. In most branches of physics, this is such an obvious statement

that it does not require any justification. The multipole expansion in electrodynamics is useful

because the short-distance details of charge distribution are not important when observed from

far away. One does not worry about the sizes of planets, or their geography, when studying

orbital motions in the Solar System. Similarly, the hydrogen spectrum can be calculated quite

precisely without knowing that there are quarks and gluons inside the proton.

Taking advantage of scale separation in quantum field theories leads to effective field theories

(EFTs). Fundamentally, there is no difference in how scale separation manifests itself in classical

mechanics, electrodynamics, quantum mechanics, or quantum field theory. The effects of large

energy scales, or short distance scales, are suppressed by powers of the ratio of scales in the

problem. This observation follows from the equations of mechanics, electrodynamics, or quantum

mechanics. Calculations in field theory require extra care to ensure that large energy scales

decouple.

$^{1}$W. Skiba, TASI Lectures on Effective Field Theory, (arXiv:1006.2142).

$^{2}$M. Luty, TASI Lectures on Supersymmetry Breaking, (arXiv:hep-th/0509029).

Decoupling of large energy scales in field theory seems to be complicated by the fact that integration over loop momenta involves all scales. However, this is only a superficial obstacle which

is straightforward to deal with in a convenient regularization scheme, for example dimensional

regularization. The decoupling of large energy scales takes place in renormalizable quantum

field theories whether or not EFT techniques are used. There are many precision calculations

that agree with experiments despite neglecting the effects of heavy particles. For instance, the

original calculation of the anomalous magnetic moment of the electron, by Schwinger, neglected

the one-loop effects arising from weak interactions. Since the weak interactions were not under-

stood at the time, Schwinger’s calculation included only the photon contribution, yet it agreed

with the experiment within a few percent. Without decoupling, the weak gauge boson contribu-

tion would be of the same order as the photon contribution. This would result in a significant

discrepancy between theory and experiment and QED would likely have never been established

as the correct low-energy theory.

The decoupling of heavy states is, of course, the reason for building high-energy accelerators.

If quantum field theories were sensitive to all energy scales, it would be much more useful to

increase the precision of low-energy experiments instead of building large colliders. By now,

the anomalous magnetic moment of the electron is known to more than ten significant digits. Calculations agree with measurements despite the fact that the theory used for these calculations does not incorporate any TeV-scale dynamics, grand unification, or any notions of quantum gravity.

If decoupling of heavy scales is a generic feature of field theory, why would one consider

EFTs? That depends on whether the dynamics at high energy is known and calculable or else

the dynamics is either non-perturbative or unknown. If the full theory is known and perturbative,

EFTs often simplify calculations. Complex computations can be broken into several easier tasks.

If the full theory is not known, EFTs allow one to parameterize the unknown interactions, to

estimate the magnitudes of these interactions, and to classify their relative importance. EFTs are applicable whether the high-energy dynamics is known or unknown, because in an effective description only the relevant degrees of freedom are used. The high-energy physics is encoded indirectly through interactions among the light states.

6.2 EFT Fundamentals

6.2.1 Basic Principles

The first step in constructing effective field theories is identifying the relevant degrees of freedom

for the measurements of interest. For instance, in particle physics, light particles φL are included

in the effective theory while heavy particles φH are integrated out. Here, we distinguish light and

heavy degrees of freedom on the basis of whether the corresponding particles can be produced

on shell at the energies available to the experiment of interest.

Effective Actions

Formally, the heavy fields are integrated out by performing a path integral over the heavy degrees of freedom only. This results in an effective action for the light degrees of freedom,

$$e^{iS_{\rm eff}(\phi_L)} \equiv \int \mathcal{D}\phi_H\, e^{iS(\phi_L,\phi_H)} . \tag{6.2.1}$$

In practice, only lattice gauge theorists actually do path integrals; the rest of us convert path integrals into Feynman rules and Feynman diagrams. We will see examples of this perturbative procedure below. The effective Lagrangian will contain a finite number of renormalizable terms of dimension four or less, and an infinite tower of non-renormalizable$^{3}$ terms of dimension larger than four,

$$L_{\rm eff}(\phi_L) = L_{\Delta\le 4} + \sum_i c_i\frac{O_i(\phi_L)}{\Lambda^{\Delta_i - 4}} , \tag{6.2.2}$$

where $\Delta_i$ are the dimensions of the operators $O_i$. The operators $O_i$ are made out of the light degrees of freedom $\phi_L$ and are local in spacetime (for energies $E$ below $M_H \sim \Lambda$). In weakly interacting theories the dimension of a composite operator $O_i$ is determined simply by adding the dimensions of all fields (and possibly derivatives) making up $O_i$. The field dimensions are determined from the kinetic terms. In strongly interacting theories the dimensions of operators typically differ significantly from the sum of the constituent field dimensions determined in the free theory.

The sum over higher dimensional operators in eq. (6.2.2) is in principle an infinite sum. In

practice, just a few terms are pertinent. Only a finite number of terms needs to be kept because

the theory needs to reproduce experiments to finite accuracy and also because the theory can

be tailored to specific processes of interest. The higher the dimension of an operator, the

smaller its contribution to low-energy observables. Hence, obtaining results to a given accuracy

requires a finite number of terms.$^{4}$ This is the reason why non-renormalizable theories are just

as good as renormalizable theories. An infinite tower of operators is truncated and a finite

number of parameters is needed for making predictions, which is exactly the same situation as

in renormalizable theories.
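The truncation argument can be made quantitative with a rough estimate. The Python sketch below assumes that an operator of dimension $\Delta$ in (6.2.2) contributes to an observable at energy $E$ with relative size $(E/\Lambda)^{\Delta - 4}$, taking all coefficients $c_i$ to be of order one and a single suppression scale $\Lambda$. Both are simplifying assumptions, and the function names are illustrative. It returns the highest operator dimension that still matters at a target accuracy.

```python
def operator_suppression(energy_over_cutoff, dimension):
    """Rough relative size (E / Lambda)^(Delta - 4) of a dimension-Delta operator."""
    return energy_over_cutoff ** (dimension - 4)

def highest_dimension_needed(energy_over_cutoff, accuracy, max_dimension=50):
    """Largest Delta whose estimated contribution is still at least `accuracy`."""
    if not 0.0 < energy_over_cutoff < 1.0:
        raise ValueError("the expansion requires 0 < E / Lambda < 1")
    dimension = 4
    while (dimension < max_dimension
           and operator_suppression(energy_over_cutoff, dimension + 1) >= accuracy):
        dimension += 1
    return dimension

if __name__ == "__main__":
    for ratio in (0.5, 0.1, 0.01):
        print(ratio, highest_dimension_needed(ratio, 5e-4))
```

The closer the energy is to the cutoff, the more operators are needed for a fixed accuracy, and the expansion stops being useful as $E/\Lambda \to 1$.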

It is a simplification to assume that different higher dimensional operators in eq. (6.2.2) are

suppressed by the same scale Λ. Different operators can arise from exchanges of distinct heavy

states that are not part of the effective theory. The scale Λ is often referred to as the cutoff

of the EFT. This is a somewhat misleading term that is not to be confused with a regulator

used in loop calculations, for example a momentum cutoff. Λ is related to the scale where the

effective theory breaks down. However, dimensionless coefficients do matter. One could redefine

Λ by absorbing dimensionless numbers into the definitions of operators. The breakdown scale

of an EFT is a physical scale that does not depend on the convention chosen for Λ. This scale

could be estimated experimentally by measuring the energy dependence of amplitudes at small

momentum. In EFTs, amplitudes grow at high energies and exceed the limits from unitarity

at the breakdown scale. It is clear that the breakdown scale is physical since it corresponds to

on-shell contributions from heavy states.

Finally, let us remark that terms in the L∆≤4 part of eq. (6.2.2) also receive contributions from

the heavy fields. Such contributions may not lead to observable consequences as the coefficients

of interactions in L∆≤4 are determined from low-energy observables. In some cases, the heavy

fields violate symmetries that would have been present in the full Lagrangian L(φL, φH = 0) if

the heavy fields are neglected. Symmetry-violating effects of heavy fields are certainly observable

in L∆≤4.

³ As we will see, effective theories are just as renormalizable as so-called renormalizable theories.

⁴ Not all terms of a given dimension need to be kept. For example, one may be studying 2 → 2 scattering. Some operators may contribute only to other scattering processes, for example 2 → 4, and may not contribute indirectly through loops to the processes of interest at a given loop order.


Relevant, Irrelevant and Marginal

At energies below Λ, the behavior of the different operators in eq. (6.2.2) is determined by their

dimension. We can distinguish three types of operators: relevant (∆ < 4), marginal (∆ = 4),

and irrelevant (∆ > 4).

Irrelevant operators deserve their name because their effects are suppressed by powers of E/Λ

and are thus small at low energies. Of course, this does not mean that they are not important.

In fact, they usually contain the interesting information about the underlying dynamics at

higher scales. The point is that irrelevant operators are weak at low energies. In contrast,

relevant operators become more important at lower energies. In four-dimensional relativistic

field theories, the number of possible relevant operators is rather low: ∆ = 0 (the unit operator),

∆ = 2 (bosonic mass term $\phi^2$), ∆ = 3 (fermionic mass term $\bar\psi\psi$ and cubic scalar interactions $\phi^3$).

Finite mass effects are negligible at very high energies ($E \gg m$); however, they become relevant

when the energy scale is comparable to the mass.

Operators of dimension four are equally important at all energy scales and are called marginal

operators. They lie between relevancy and irrelevancy because quantum effects could modify

their scaling behaviour on either side. Well-known examples of marginal operators are $\phi^4$, the

QED and QCD interactions and the Yukawa $\bar\psi\psi\phi$ interactions.

In any situation where there is a large mass gap between the energy scale being analyzed and

the scale of any heavier states (i.e. $m, E \ll \Lambda$), the effects induced by irrelevant operators are

always suppressed by powers of E/Λ, and can usually be neglected. The resulting EFT, which

only contains relevant and marginal operators, is called renormalizable. Its predictions are valid

up to E/Λ corrections.

These considerations offer a new perspective on the old concept of renormalizability. Take QED as an example. The theory was constructed to be the most general renormalizable (∆ ≤ 4) Lagrangian consistent with the electromagnetic U(1) gauge symmetry. However, there exist other interactions (exchanges of Z bosons) that contribute to $e^+e^- \to e^+e^-$, which at low energies ($E \ll M_Z$) generate additional non-renormalizable local couplings of higher dimensions. The reason why QED is so successful in describing the low-energy scattering of electrons with positrons is not renormalizability, but rather the fortunate fact that $M_Z$ is very heavy and the leading non-renormalizable contributions are suppressed by $E^2/M_Z^2$.
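To get a feel for the size of this suppression, the following Python sketch evaluates $E^2/M_Z^2$ at a few energies. The value $M_Z \approx 91.19$ GeV and the sample energies are inputs chosen here for illustration; couplings and order-one coefficients are ignored, so this is only a rough estimate.

```python
# Rough size of the Z-exchange suppression factor E^2 / M_Z^2.
# Assumption: M_Z = 91.19 GeV; couplings and O(1) factors are ignored.
M_Z_GEV = 91.19


def z_suppression(energy_gev):
    """Return (E / M_Z)^2 for a process at energy E (in GeV)."""
    return (energy_gev / M_Z_GEV) ** 2


for energy in (0.001, 1.0, 10.0):  # illustrative energies in GeV
    print(f"E = {energy:g} GeV: E^2/M_Z^2 = {z_suppression(energy):.2e}")
```

The factor falls quadratically with energy, which is why the non-renormalizable corrections are invisible in low-energy QED processes.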

Uses of EFTs

Effective Lagrangians are typically used in one of two ways:

1. The “full theory” S[φL, φH] is known:

In this case, integrating out the heavy modes gives a simple way of systematically analyzing

the effects of the heavy physics on low-energy observables. Because only the low energy

scale appears explicitly in the Feynman diagrams of the EFT, amplitudes are easier to

calculate and to power count than in the full theory.

Examples: Integrating out the W, Z bosons from the $SU(2)_L \times U(1)_Y$ Electroweak Lagrangian at energies $E \ll m_{W,Z}$ results in the Fermi theory of weak decays plus corrections suppressed by powers of $E^2/m_{W,Z}^2$. It is easier to analyze electromagnetic or QCD corrections to weak decays in the four-Fermi theory than in the full Electroweak Lagrangian. Another example is the use of EFTs to calculate heavy particle threshold corrections to low energy gauge couplings. This has applications, for instance, in Grand Unified Theories and in QCD.

2. The full theory is unknown (or known but strongly coupled):

Whatever the physics at the scale Λ is, by decoupling it must manifest itself at low energies as an effective Lagrangian of the form eq. (6.2.2). If the symmetries (e.g. Poincaré, gauge, global) that survive at low energies are known, then the operators $O_i(x)$ appearing in $\mathcal{L}_{\rm eff}[\phi_L]$ must respect those symmetries. Thus by writing down an effective Lagrangian containing the most general set of operators consistent with the symmetries, we are necessarily accounting for the UV physics in a completely model independent way.

Examples: The QCD chiral Lagrangian below the scale $\Lambda_{\chi SB}$ of $SU(3)_L \times SU(3)_R \to SU(3)_V$ chiral symmetry breaking. Here the full theory, QCD, is known, but because $\Lambda_{\chi SB}$ is of order the scale $\Lambda_{\rm QCD} \sim 1$ GeV, where the QCD coupling is strong, it is impossible to perform the functional integral in eq. (6.2.1) analytically. Another example is general relativity below the scale $M_{\rm pl} \sim 10^{19}$ GeV. This theory can be used to calculate, e.g., graviton-graviton scattering at energies $E \ll M_{\rm pl}$. Above those energies, however, scattering amplitudes calculated in general relativity start violating unitarity bounds, and the effective field theory necessarily breaks down. Thus general relativity is an effective Lagrangian for quantum gravity below the strong coupling scale $M_{\rm pl}$. Finally, it is believed that the Standard Model itself is an effective field theory below scales of order 1 TeV or so. This scale manifests itself indirectly, in the form of $SU(2)_L \times U(1)_Y$ gauge-invariant operators of dimension ∆ > 4 constructed from Standard Model fields, certain linear combinations of which have been constrained experimentally using collider data from the LEP experiments at CERN (cf. precision electroweak constraints using effective Lagrangians). If there is indeed new physics at the TeV scale, it will be seen directly at the LHC.

Power Counting

EFTs are based on several systematic expansions. In addition to the usual loop expansion in

quantum field theory, one expands in the ratios of energy scales. There can be several scales

in the problem: the masses of heavy particles, the energy at which the experiment is done, the

momentum transfer, and so on. In an EFT, one can independently keep track of powers of

the ratio of scales and of the logarithms of scale ratios. This could be useful, especially when

logarithms are large. Ratios of different scales can be kept to different orders depending on the

numerical values, which is something that is nearly impossible to do without using EFTs.

When constructing an EFT one needs to be able to formally predict the magnitudes of different

operators Oi in the effective Lagrangian. This is referred to as power counting the terms in the

Lagrangian and it allows one to predict how different terms scale with energy. In the simple EFTs discussed here, power counting is the same as dimensional analysis using the natural $\hbar = c = 1$ units, in which [mass] = [length]$^{-1}$. From now on, dimensions will be expressed in the units of [mass], so that energy has dimension 1, while length has dimension −1. The Lagrangian density $\mathcal{L}$ has dimension 4, since the action $S = \int \mathcal{L}\, d^4x$ must be dimensionless. The dimensions of fields are determined from their kinetic energies because in weakly-interacting theories these terms always dominate. The kinetic energy term for a scalar field, $\partial_\mu\phi\,\partial^\mu\phi$, implies that φ has dimension 1, while that of a fermion, $i\bar\psi\slashed{\partial}\psi$, implies that ψ has dimension 3/2 in four space-time dimensions.
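The counting rules above are mechanical enough to be written as a few lines of Python. The sketch below is only an illustration: the dictionary keys, function names and the way an operator is encoded (as counts of scalars, fermions and derivatives) are conventions invented here, and it assumes a weakly coupled theory in four dimensions, where operator dimensions are just sums of field dimensions.

```python
from fractions import Fraction

# Mass dimensions in d = 4, read off from the kinetic terms (weak coupling assumed).
FIELD_DIM = {
    "scalar": Fraction(1),
    "fermion": Fraction(3, 2),
    "derivative": Fraction(1),
}


def operator_dimension(content):
    """content maps 'scalar' / 'fermion' / 'derivative' to the number of occurrences."""
    return sum(FIELD_DIM[name] * count for name, count in content.items())


def coefficient_dimension(content):
    """Mass dimension of the coupling in front of the operator, since [L] = 4."""
    return 4 - operator_dimension(content)


def classify(content):
    delta = operator_dimension(content)
    if delta < 4:
        return "relevant"
    if delta == 4:
        return "marginal"
    return "irrelevant"


scalar_mass = {"scalar": 2}                        # phi^2
yukawa = {"fermion": 2, "scalar": 1}               # psi-bar psi phi
four_fermion = {"fermion": 4}                      # (psi-bar psi)^2
four_fermion_d2 = {"fermion": 4, "derivative": 2}  # two-derivative four-fermion term

for name, op in [("phi^2", scalar_mass), ("Yukawa", yukawa),
                 ("four-fermion", four_fermion), ("four-fermion + 2 derivatives", four_fermion_d2)]:
    print(name, operator_dimension(op), coefficient_dimension(op), classify(op))
```

A four-fermion operator comes out with dimension 6, so its coefficient scales as $1/\Lambda^2$; adding two derivatives costs a further $1/\Lambda^2$.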


Matching and Running

In the next section, we will consider an example where the full theory is known and weakly

coupled. In that case, it is, in principle, possible to perform the functional integral in eq. (6.2.1)

and derive the effective theory in terms of the couplings of the full theory. In practice, it is easier

to fix the low-energy parameters through a procedure called matching.

The idea is very simple: calculate some observable, for instance the scattering amplitude of the

light particles, in two ways. First, calculate the amplitude in the full theory, expanding the result

in powers of E/Λ. Then, calculate the same quantity in the effective field theory, adjusting the

EFT parameters in order to reproduce the result of the full theory. In general, the coefficients in the effective Lagrangian are calculable as a series expansion in the parameters of the full theory, such as the couplings $\lambda \ll 1$ between the light and the heavy fields. Matching will be performed to a certain order in λ (e.g. tree level, one loop, etc.). Loop graphs in matching calculations typically contain logarithms $\log(\Lambda/\mu)$. In order to avoid possible large logarithms that could render perturbation theory invalid, one must choose a matching scale µ that is of the same order as the masses of the fields that are being integrated out, i.e. µ ∼ Λ. To calculate observables at a low energy scale $E \ll \Lambda$, it is better to evaluate loop graphs in the EFT at a renormalization point µ ∼ E, indicating that in the EFT one should use the couplings $c_i(\mu \sim E)$. These can be obtained in terms of the coefficients $c_i(\mu \sim \Lambda)$ obtained through matching by renormalization group (RG) evolution between the scales Λ and E within the EFT.

[Figure 6.1 diagram; its labels are "matching" and "RG flow", each appearing twice.]

Figure 6.1: Construction of a low-energy EFT for a theory with scales $\Lambda_1 \gg \Lambda_2 \gg E$.

For a theory with multiple scales, the procedure is similar. A typical example is shown in fig. 6.1, which depicts the construction of an EFT at a low scale $E \ll \Lambda_2 \ll \Lambda_1$ starting from a theory of light fields ϕ coupled to heavy fields $\Phi_1$, $\Phi_2$ (masses of order $\Lambda_1$, $\Lambda_2$). One first constructs an EFT$_1$ for ϕ, $\Phi_2$ (regarded as approximately massless) by integrating out the fields $\Phi_1$. This generates a theory defined by its coupling constants at the renormalization scale $\mu \sim \Lambda_1$. This EFT$_1$ is then used to RG evolve the couplings down to the threshold $\mu \sim \Lambda_2$, at which point $\Phi_2$ is treated as heavy and removed from the theory. This finally generates an EFT$_2$ for the light fields ϕ which can be used to calculate at the scale E. In EFT$_2$, logarithms of $E/\Lambda_2$ can be resummed by RG running.

6.2.2 A Toy Model

We will study a simple toy example that nevertheless illustrates all the important lessons you

need to know about effective field theory.


Let the full theory be a massless fermion ψ and two massive bosons Φ and ϕ,

$$ \mathcal{L} = i\bar\psi\slashed{\partial}\psi + \frac12(\partial_\mu\Phi)^2 - \frac12 M^2\Phi^2 + \frac12(\partial_\mu\varphi)^2 - \frac12 m^2\varphi^2 - \lambda\,\Phi\,\bar\psi\psi - \eta\,\varphi\,\bar\psi\psi , \tag{6.2.3} $$

where the parameters λ and η are dimensionless Yukawa couplings. Let us assume $M \gg m$ (for now we won't worry if this hierarchy is natural). We want to determine the effective theory for the light fields: the fermion ψ and the scalar ϕ. The interactions generated by the exchanges of the heavy field Φ will be mocked up by new interactions involving the light fields.

6.2.3 Tree-Level Matching

We start with tree-level effects. Consider $\psi\psi \to \psi\psi$ scattering to order $(\lambda^2, \eta^0)$ in the coupling constants and keep terms to second order in the external momenta.

As we described above, integrating out the heavy fields is accomplished by comparing, or matching, amplitudes in the full UV-theory and the effective IR-theory. Specifically, the process $\psi\psi \to \psi\psi$ has the following amplitude in the UV-theory,

$$ \mathcal{A}_{\rm UV} = \bar u_3 u_1 \, \bar u_4 u_2 \, (-i\lambda)^2 \frac{i}{(p_3 - p_1)^2 - M^2} - \{3 \leftrightarrow 4\} , \tag{6.2.4} $$

where we defined $u_i \equiv u(p_i)$. The exchange term $3 \leftrightarrow 4$ comes with a minus sign as required by Fermi statistics. We can expand the propagator in inverse powers of the scale M,

$$ (-i\lambda)^2 \frac{i}{(p_3-p_1)^2 - M^2} \approx i\,\frac{\lambda^2}{M^2}\left(1 + \frac{(p_3-p_1)^2}{M^2} + \cdots\right) . \tag{6.2.5} $$

As advertised, we will neglect terms higher than second order in external momenta, i.e. we construct the effective theory to finite order in the expansion parameter $p^2/M^2$. The effective theory is therefore only valid at energies below the mass of the heavy scalar, $p^2 \ll M^2$.
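How quickly the truncated expansion in eq. (6.2.5) converges can be seen in a short numerical sketch. The function names and the sample values of λ, M and $t = (p_3 - p_1)^2$ below are choices made here for illustration; the code only compares the coefficient of $i$ on both sides of eq. (6.2.5).

```python
def full_factor(t, lam, M):
    """Coefficient of i in (-i lam)^2 * i / (t - M^2), with t = (p3 - p1)^2."""
    return lam**2 / (M**2 - t)


def expanded_factor(t, lam, M, order):
    """(lam^2 / M^2) * sum_{n=0}^{order} (t / M^2)^n, the truncated geometric series."""
    return lam**2 / M**2 * sum((t / M**2) ** n for n in range(order + 1))


def truncation_error(t, lam, M, order):
    """Relative deviation of the truncated expansion from the full result."""
    return abs(1.0 - expanded_factor(t, lam, M, order) / full_factor(t, lam, M))


lam, M = 1.0, 10.0
t = -0.01 * M**2  # a spacelike momentum transfer well below M^2 (illustrative)
for order in (0, 1, 2):
    print(order, truncation_error(t, lam, M, order))
```

Because the series is geometric, keeping terms through $(t/M^2)^n$ leaves a relative error of exactly $|t/M^2|^{n+1}$, which makes the role of $p^2/M^2$ as the expansion parameter explicit.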

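Let us find the corresponding effective theory. To zeroth order in external momenta, we can reproduce the $\psi\psi \to \psi\psi$ scattering amplitude by the following four-fermion Lagrangian

$$ \mathcal{L}^{p^0,\lambda^2}_{\rm eff} = i\bar\psi\slashed{\partial}\psi + \frac{c}{2}\,\bar\psi\psi\,\bar\psi\psi . \tag{6.2.6} $$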

The amplitude calculated with this effective Lagrangian is

$$ \mathcal{A}_{\rm IR} = \bar u_3 u_1 \, \bar u_4 u_2 \, (ic) - \{3 \leftrightarrow 4\} . \tag{6.2.7} $$

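Comparing $\mathcal{A}_{\rm IR}$ to $\mathcal{A}_{\rm UV}$, we find $c = \lambda^2/M^2$. At next order in external momenta, we can write the following Lagrangian⁵

$$ \mathcal{L}^{p^2,\lambda^2}_{\rm eff} = i\bar\psi\slashed{\partial}\psi + \frac12\frac{\lambda^2}{M^2}\,\bar\psi\psi\,\bar\psi\psi + d\,\partial_\mu\bar\psi\,\partial^\mu\psi\,\bar\psi\psi . \tag{6.2.8} $$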

We want to compare the terms obtained from this effective Lagrangian with the amplitude derived from the UV-theory, eq. (6.2.5). The effective Lagrangian needs to be valid both on-shell and off-shell. For the matching, we can use any choice of external momenta that is convenient. In particular, we can choose the momenta to be either on-shell or off-shell. The external particles, here ψ, are identical in the full and effective theories. The choice of external momenta has nothing to do with UV dynamics. In other words, for any momenta below the cutoff the full and effective theories must be identical, thus one is allowed to make opportunistic choices of momenta to simplify calculations.

⁵ There is a second independent two-derivative operator, $\partial_\mu\bar\psi\,\psi\,\bar\psi\,\partial^\mu\psi$. It turns out that it isn't required to match the UV-theory, so we won't have to consider it.

In the present example, we choose $p_1^2 = p_2^2 = p_3^2 = p_4^2 = 0$. The amplitudes then can only depend on $p_i \cdot p_j$, with $i \neq j$. The effective theory now needs to reproduce the $-2i\,\frac{\lambda^2}{M^2}\frac{p_1\cdot p_3}{M^2} - \{3 \leftrightarrow 4\}$ part of the amplitude in eq. (6.2.5). The term proportional to d in $\mathcal{L}^{p^2,\lambda^2}_{\rm eff}$ gives⁶

$$ \mathcal{A}_{\rm IR} = i d\,(p_1\cdot p_3 + p_2\cdot p_4)\,\bar u_3 u_1\,\bar u_4 u_2 - \{3 \leftrightarrow 4\} . \tag{6.2.9} $$

Conservation of momentum, $p_1 + p_2 = p_3 + p_4$, implies $p_1\cdot p_3 = p_2\cdot p_4$. Hence, we match UV and IR amplitudes for $d = -\lambda^2/M^4$.

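At tree-level, the effective Lagrangian is⁷

$$ \mathcal{L}^{p^2,\lambda^2}_{\rm eff} = i\bar\psi\slashed{\partial}\psi + \frac{c}{2}\,\bar\psi\psi\,\bar\psi\psi + d\,\partial_\mu\bar\psi\,\partial^\mu\psi\,\bar\psi\psi + \frac12(\partial_\mu\varphi)^2 - \frac12 m^2\varphi^2 - \eta\,\varphi\,\bar\psi\psi , \tag{6.2.10} $$

where

$$ c = \frac{\lambda^2}{M^2} \quad\text{and}\quad d = -\frac{\lambda^2}{M^4} . \tag{6.2.11} $$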
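The tree-level matching can be checked numerically for explicit massless kinematics. The sketch below is an illustration under stated assumptions: the centre-of-mass parametrization of the momenta, the function names and the sample values of λ, M, E and the scattering angle are all chosen here, and only the coefficient of $i\,\bar u_3 u_1\,\bar u_4 u_2$ (the direct channel) is compared.

```python
import math


def mdot(p, q):
    """Minkowski product with signature (+, -, -, -)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]


def kinematics(E, theta):
    """Massless 2 -> 2 momenta in the centre-of-mass frame (p1 + p2 = p3 + p4)."""
    p1 = (E, 0.0, 0.0, E)
    p2 = (E, 0.0, 0.0, -E)
    p3 = (E, E * math.sin(theta), 0.0, E * math.cos(theta))
    p4 = (E, -E * math.sin(theta), 0.0, -E * math.cos(theta))
    return p1, p2, p3, p4


def uv_coefficient(p1, p3, lam, M):
    """Full theory: (-i lam)^2 i / (t - M^2) = i lam^2 / (M^2 - t); return the part multiplying i."""
    diff = tuple(a - b for a, b in zip(p3, p1))
    t = mdot(diff, diff)
    return lam**2 / (M**2 - t)


def ir_coefficient(p1, p2, p3, p4, lam, M):
    """Effective theory: c + d (p1.p3 + p2.p4) with the matched values of c and d."""
    c = lam**2 / M**2
    d = -lam**2 / M**4
    return c + d * (mdot(p1, p3) + mdot(p2, p4))


def relative_mismatch(E, theta, lam=1.0, M=10.0):
    p1, p2, p3, p4 = kinematics(E, theta)
    uv = uv_coefficient(p1, p3, lam, M)
    ir = ir_coefficient(p1, p2, p3, p4, lam, M)
    return abs(ir - uv) / uv


for E in (0.5, 1.0, 2.0):
    print(E, relative_mismatch(E, math.pi / 2))
```

With $c$ and $d$ as in eq. (6.2.11), the mismatch is of order $(t/M^2)^2$, so it should fall by a factor of 16 each time the energy is halved. This is the expected size of the first neglected operator.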

6.2.4 RG Running

We should think of the matching as being performed at the scale associated with the mass of

the heavy particle, i.e. eq. (6.2.11) should be interpreted as

$$ c(\mu = M) = \frac{\lambda^2}{M^2} \quad\text{and}\quad d(\mu = M) = -\frac{\lambda^2}{M^4} . \tag{6.2.12} $$

At tree level, the couplings don’t run and therefore apply unchanged at lower energies. However,

at loop level, the couplings will run as we go to low energy scales, e.g. µ = m. Since this RG

flow only involves energies below M , it can be computed in the effective theory. Consider, for

example, loop contributions to the $\psi\psi \to \psi\psi$ amplitude to lowest order in the momenta and to order $\lambda^2\eta^2$ in the UV coupling constants. Naively, this will lead to a correction to the tree-level amplitude of order $\eta^2/(4\pi)^2$. However, this will not be a good estimate if there are several scales in the problem. For instance, since we have extra light scalars, $m \ll M$, the scattering amplitude could contain large logs, such as $\log(M/m)$.

In fact, in an EFT one separates logarithm-enhanced contributions and contributions independent of large logs. The log-independent contributions arise from matching and the log-dependent ones are accounted for by the RG evolution of parameters. By definition, while matching one compares theories with different field contents. This needs to be done using the same renormalization scale in both theories. This so-called matching scale is usually the mass of the heavy

particle that is being integrated out. No large logarithms can arise in the process since only one

scale is involved. The logs of the matching scale divided by a low-energy scale must be identical

in the two theories since the two theories are designed to be identical at low energies. We will

illustrate loop matching in the next section. It is very useful that one can compute the matching

and running contributions independently. This can be done at different orders in perturbation

theory as dictated by the magnitudes of couplings and ratios of scales.

⁶ Think Wick contractions.

⁷ Note that this effective Lagrangian is not complete to order $\lambda^2$ and $p^2$; it only contains all tree-level terms of this order. For example, the Yukawa coupling $\varphi\bar\psi\psi$ receives corrections proportional to $\eta\lambda^2$ at one loop.


In our effective theory described in eq. (6.2.10) we need to find the RG equation for the

Lagrangian parameters. For concreteness, let us assume we want to know the amplitude at the

scale µ = m. Since we will be interested in the momentum-independent part of the amplitude,

we can neglect the term proportional to d. By dimensional analysis, the two-derivative term will always be suppressed by $m^2/M^2$ compared to the leading term arising from the non-derivative term.⁸

Let us first compute RG running of the coupling c :

First, a fermion self-energy diagram leads to a one-loop correction to the kinetic term

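$$ \begin{aligned}
\text{[fermion self-energy diagram]} &= (-i\eta)^2 \int \frac{d^dk}{(2\pi)^d}\, \frac{i(\slashed{k} + \slashed{p})}{(k+p)^2}\, \frac{i}{k^2 - m^2} \\
&= \eta^2 \int \frac{d^dq}{(2\pi)^d} \int_0^1 dx\, \frac{\slashed{q} + (1-x)\slashed{p}}{[q^2 - \Delta^2]^2} \\
&= \frac{i\eta^2}{(4\pi)^2}\frac{1}{\epsilon} \left( \int_0^1 dx\,(1-x)\,\slashed{p} \right) + \text{finite} \\
&= \frac{i\eta^2\slashed{p}}{2(4\pi)^2}\frac{1}{\epsilon} + \text{finite} ,
\end{aligned} \tag{6.2.13} $$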

where we used Feynman parameters to combine the denominators and shifted the loop momentum $q = k + xp$. We then used the standard result for loop integrals (see eq. (A.44) in Peskin and Schroeder) and expanded $d = 4 - 2\epsilon$. Only the $1/\epsilon$ pole is kept as the finite term does not enter the RG calculation.

Next, we compute the loop corrections due to the four-fermion vertex.

There are six diagrams with a scalar exchange because there are six different pairings of

the external lines. The diagrams are depicted in fig. 6.2 and there are two diagrams in each

of the three topologies. All of these diagrams are logarithmically divergent in the UV, so we

can neglect the external momenta and masses if we are interested in the divergent parts. The

divergent terms must be local and therefore be analytic in the external momenta. Extracting

positive powers of momenta from a diagram reduces its degree of divergence which is apparent

from dimensional analysis. Diagrams (a) in fig. 6.2 are the most straightforward to deal with

and the divergent part is easy to extract

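$$ 2(-i\eta)^2\, ic \int \frac{d^dk}{(2\pi)^d}\, \frac{i\slashed{k}}{k^2}\, \frac{i\slashed{k}}{k^2}\, \frac{i}{k^2} = -2c\eta^2 \int \frac{d^dk}{(2\pi)^d}\, \frac{1}{k^4} = -\frac{2ic\eta^2}{(4\pi)^2}\frac{1}{\epsilon} + \text{finite} . \tag{6.2.14} $$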

We did not mention the cross diagrams here, denoted 3↔ 4 in the previous section, since

they go along for the ride, but they participate in every step. Diagrams (b) in fig. 6.2 require

more care as the loop integral involves two different fermion lines. To keep track of this we

indicate the external spinors and abbreviate u(pi) = ui. The result is

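$$ 2(-i\eta)^2\, ic \int \frac{d^dk}{(2\pi)^d}\, \bar u_3 \frac{i\slashed{k}}{k^2} u_1\; \bar u_4 \frac{-i\slashed{k}}{k^2} u_2\; \frac{i}{k^2} = \frac{ic\eta^2}{2(4\pi)^2}\frac{1}{\epsilon}\, \bar u_3\gamma^\mu u_1\; \bar u_4\gamma_\mu u_2 + \text{finite} . \tag{6.2.15} $$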

⁸ This reasoning only holds if one uses a mass-independent regulator, like dimensional regularization with minimal subtraction. In dimensional regularization, the renormalization scale µ only appears in logs. In less suitable regularization schemes, the two-derivative term could contribute as much as the non-derivative term as the extra power of $1/M^2$ could become $\Lambda^2/M^2$, where Λ is the regularization scale. With the natural choice $\Lambda \approx M$, the two-derivative term is not suppressed at all. Since the same argument holds for terms with more and more derivatives, all terms would contribute at the same order and the momentum expansion would be pointless. This is, for example, how hard momentum cutoff and Pauli-Villars regulators behave. Such regulators do their job, but they needlessly complicate power counting. From now on, we will only be using dimensional regularization.


Figure 6.2 (panels (a), (b), (c)): Diagrams contributing to the renormalization of the four-fermion interaction. The (red) dashed lines represent the light scalar ϕ. The four-fermion vertices are represented by the kinks on the fermion lines. The fermion lines do not touch even though the interaction is point-like. This is not due to limited graphic skills of the author, but rather to illustrate the fermion number flow through the vertices.

Figure 6.3 (panels (a), (b), (c)): The full theory analogs of the Feynman diagrams in fig. 6.2. The (green) thick dashed lines represent the heavy scalar Φ.

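This divergent contribution is canceled by diagrams (c) in fig. 6.2 because one of the momentum lines carries the opposite sign

$$ 2(-i\eta)^2\, ic \int \frac{d^dk}{(2\pi)^d}\, \bar u_3 \frac{i\slashed{k}}{k^2} u_1\; \bar u_4 \frac{i\slashed{k}}{k^2} u_2\; \frac{i}{k^2} . \tag{6.2.16} $$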

If the divergent parts of the diagrams (b) and (c) did not cancel this would lead to operator

mixing which often takes place among operators with the same dimensions. We will illustrate

this shortly.

To calculate the RG equations (RGEs) we can consider just the fermion part of the Lagrangian

in eq. (6.2.10) and neglect the derivative term proportional to d. We can think of the original

Lagrangian as being expressed in terms of the bare fields and bare coupling constants and rescale $\psi_0 = \sqrt{Z_\psi}\,\psi$ and $c_0 = c\,\mu^{2\epsilon} Z_c$. As usual in dimensional regularization, the mass dimensions of the fields depend on the dimension of space-time. In $d = 4 - 2\epsilon$, the fermion dimension is $[\psi] = \frac32 - \epsilon$ and $[\mathcal{L}] = 4 - 2\epsilon$. We explicitly compensate for this change from the usual 4 space-time dimensions by including the factor $\mu^{2\epsilon}$ in the interaction term. This way, the coupling c does not alter its dimension when $d = 4 - 2\epsilon$. The Lagrangian is then

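$$ \begin{aligned}
\mathcal{L}^{p^0,\,\lambda^2\eta^2\log} &= i\bar\psi_0\slashed{\partial}\psi_0 + \frac{c_0}{2}\,\bar\psi_0\psi_0\,\bar\psi_0\psi_0 = iZ_\psi\,\bar\psi\slashed{\partial}\psi + \frac{c}{2}\,Z_c Z_\psi^2\,\mu^{2\epsilon}\,\bar\psi\psi\,\bar\psi\psi \\
&= i\bar\psi\slashed{\partial}\psi + \mu^{2\epsilon}\frac{c}{2}\,\bar\psi\psi\,\bar\psi\psi + i(Z_\psi - 1)\,\bar\psi\slashed{\partial}\psi + \mu^{2\epsilon}\frac{c}{2}\,(Z_c Z_\psi^2 - 1)\,\bar\psi\psi\,\bar\psi\psi ,
\end{aligned} \tag{6.2.17} $$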

where in the last line we separated the counterterms. We can read off the counterterms from eqs. (6.2.13) and (6.2.14) by insisting that the counterterms cancel the divergences we calculated previously:

$$ Z_\psi - 1 = -\frac{\eta^2}{2(4\pi)^2}\frac{1}{\epsilon} \quad\text{and}\quad c\,(Z_c Z_\psi^2 - 1) = \frac{2c\eta^2}{(4\pi)^2}\frac{1}{\epsilon} , \tag{6.2.18} $$

where we used the minimal subtraction (MS) prescription and hence retained only the $1/\epsilon$ poles. Comparing the two equations in (6.2.18), we obtain $Z_c = 1 + \frac{3\eta^2}{(4\pi)^2}\frac{1}{\epsilon}$.

The standard way of computing RGEs is to use the fact that the bare quantities do not depend

on the renormalization scale

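$$ 0 = \mu\frac{d}{d\mu} c_0 = \mu\frac{d}{d\mu}\left( c\,\mu^{2\epsilon} Z_c \right) = \beta_c\,\mu^{2\epsilon} Z_c + 2\epsilon\, c\,\mu^{2\epsilon} Z_c + c\,\mu^{2\epsilon}\,\mu\frac{d}{d\mu} Z_c , \tag{6.2.19} $$

where $\beta_c \equiv \mu\frac{dc}{d\mu}$. We have $\mu\frac{d}{d\mu} Z_c = \frac{3}{(4\pi)^2}\, 2\eta\beta_\eta\, \frac{1}{\epsilon}$. Just like we had to compensate for the dimension of c, the renormalized coupling η needs an extra factor of $\mu^\epsilon$ to remain dimensionless in the space-time where $d = 4 - 2\epsilon$. Repeating the same manipulations we used in eq. (6.2.19), we obtain $\beta_\eta = -\epsilon\eta - \eta\,\frac{d\log Z_\eta}{d\log\mu}$. Keeping the derivative of $Z_\eta$ would give us a term that is of higher order in η as for any Z factor the scale dependence comes from the couplings. Thus, we keep only the first term, $\beta_\eta = -\epsilon\eta$, and get $\mu\frac{d}{d\mu} Z_c = -\frac{6\eta^2}{(4\pi)^2}$. Finally,

$$ \beta_c = \frac{6\eta^2}{(4\pi)^2}\, c . \tag{6.2.20} $$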
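The algebra leading from the counterterms in eq. (6.2.18) to the coefficient in eq. (6.2.20) can be retraced with exact rational arithmetic. In the sketch below, the representation of a Z factor as a pair (constant, coefficient of $a$), with $a \equiv \eta^2/((4\pi)^2\epsilon)$, and the helper names are conventions introduced here; everything is kept only to first order in $a$.

```python
from fractions import Fraction

# Truncated series in a = eta^2 / ((4 pi)^2 eps): the pair (x0, x1) stands for x0 + x1 * a.


def multiply(x, y):
    """Product of two series, dropping terms of order a^2."""
    return (x[0] * y[0], x[0] * y[1] + x[1] * y[0])


def invert(x):
    """1 / (x0 + x1 a) = 1/x0 - (x1 / x0^2) a + O(a^2)."""
    return (1 / x[0], -x[1] / x[0] ** 2)


Z_psi = (Fraction(1), Fraction(-1, 2))    # Z_psi = 1 - a/2, from eq. (6.2.18)
Zc_Zpsi_sq = (Fraction(1), Fraction(2))   # Z_c Z_psi^2 = 1 + 2a, from eq. (6.2.18)

Z_c = multiply(Zc_Zpsi_sq, invert(multiply(Z_psi, Z_psi)))

# With beta_eta = -eps * eta one has mu da/dmu = -2 eps a, hence
# mu dZ_c/dmu = -2 * Z_c[1] * eta^2 / (4 pi)^2.
# Eq. (6.2.19) at eps -> 0 then gives beta_c = 2 * Z_c[1] * eta^2 c / (4 pi)^2.
beta_c_coefficient = 2 * Z_c[1]

print("Z_c = 1 +", Z_c[1], "* a")
print("beta_c =", beta_c_coefficient, "* eta^2 c / (4 pi)^2")
```

The pair arithmetic reproduces $Z_c = 1 + 3a$ and the factor 6 in $\beta_c$, so the two counterterms in eq. (6.2.18) are all that enters the leading-order running of c.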

We can now complete our task and compute the low-energy coupling, and thus the scattering

amplitude, to the leading log order

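$$ c(m) = c(M) - \frac{6\eta^2}{(4\pi)^2}\, c\, \log\left(\frac{M}{m}\right) = \frac{\lambda^2}{M^2}\left[ 1 - \frac{6\eta^2}{(4\pi)^2}\log\left(\frac{M}{m}\right) \right] . \tag{6.2.21} $$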

Of course, at this point it requires little extra work to re-sum the logarithms by solving the RGEs. First, one needs to solve for the running of η. We will not compute it in detail here, but $\beta_\eta = \frac{5\eta^3}{(4\pi)^2}$. Solving this equation gives

$$ \frac{1}{\eta^2(\mu_2)} - \frac{1}{\eta^2(\mu_1)} = \frac{10}{(4\pi)^2}\log\frac{\mu_1}{\mu_2} . \tag{6.2.22} $$

Putting the µ dependence of η from eq. (6.2.22) into eq. (6.2.20) and performing the integral yields

$$ c(m) = c(M) \left( \frac{\eta^2(m)}{\eta^2(M)} \right)^{3/5} , \tag{6.2.23} $$

which agrees with eq. (6.2.21) to the linear order in $\log(M/m)$.
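The relation between the leading-log result (6.2.21) and the resummed result (6.2.23) can be made concrete by integrating the two RGEs numerically. The sketch below uses a hand-written fourth-order Runge-Kutta step in $t = \log\mu$; the function names, the step count and the sample values ($M = 1000$, $m = 1$, $\eta(M) = 1$, $\lambda = 1$) are choices made here, not part of the text.

```python
import math

LOOP = 1.0 / (4.0 * math.pi) ** 2  # the loop factor 1/(4 pi)^2


def beta_eta(eta):
    return 5.0 * eta**3 * LOOP       # beta function for eta quoted in the text


def beta_c(c, eta):
    return 6.0 * eta**2 * c * LOOP   # eq. (6.2.20)


def run_numerically(c_M, eta_M, M, m, steps=2000):
    """Integrate both RGEs in t = log(mu) from mu = M down to mu = m with RK4."""
    h = (math.log(m) - math.log(M)) / steps  # negative step: running down
    c, eta = c_M, eta_M
    for _ in range(steps):
        k1e, k1c = beta_eta(eta), beta_c(c, eta)
        k2e, k2c = beta_eta(eta + 0.5 * h * k1e), beta_c(c + 0.5 * h * k1c, eta + 0.5 * h * k1e)
        k3e, k3c = beta_eta(eta + 0.5 * h * k2e), beta_c(c + 0.5 * h * k2c, eta + 0.5 * h * k2e)
        k4e, k4c = beta_eta(eta + h * k3e), beta_c(c + h * k3c, eta + h * k3e)
        eta += h * (k1e + 2 * k2e + 2 * k3e + k4e) / 6.0
        c += h * (k1c + 2 * k2c + 2 * k3c + k4c) / 6.0
    return c, eta


def eta_sq_closed(eta_M, M, m):
    """eta^2(m) from eq. (6.2.22) with mu_1 = M and mu_2 = m."""
    return 1.0 / (1.0 / eta_M**2 + 10.0 * LOOP * math.log(M / m))


def c_resummed(c_M, eta_M, M, m):
    """Eq. (6.2.23)."""
    return c_M * (eta_sq_closed(eta_M, M, m) / eta_M**2) ** 0.6


def c_leading_log(c_M, eta_M, M, m):
    """Eq. (6.2.21), with eta evaluated at the matching scale."""
    return c_M * (1.0 - 6.0 * eta_M**2 * LOOP * math.log(M / m))


M, m, eta_M, lam = 1000.0, 1.0, 1.0, 1.0  # illustrative values
c_M = lam**2 / M**2
c_num, eta_num = run_numerically(c_M, eta_M, M, m)
print("numerical  :", c_num / c_M)
print("resummed   :", c_resummed(c_M, eta_M, M, m) / c_M)
print("leading log:", c_leading_log(c_M, eta_M, M, m) / c_M)
```

For these inputs the numerical solution should coincide with eq. (6.2.23) to high accuracy, while the leading-log formula differs at the level of the neglected second-order logarithms.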


6.2.5 One-Loop Matching

Construction of effective theories is a systematic process. We saw how RG equations can account

for each ratio of scales, and we now increase the accuracy of matching calculations. To improve

our $\psi\psi \to \psi\psi$ scattering calculation we compute matching coefficients to one-loop order. As an

example, we examine terms proportional to $\lambda^4$. This calculation illustrates several important

points about matching calculations.

Our starting point is again the full theory with two scalars and a fermion. Since we are only

interested in the heavy scalar field, we can neglect the light scalar for the time being and consider

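$$ \mathcal{L} = i\bar\psi\slashed{\partial}\psi - \sigma\bar\psi\psi + \frac12(\partial_\mu\Phi)^2 - \frac{M^2}{2}\Phi^2 - \lambda\,\bar\psi\psi\,\Phi + O(\varphi) . \tag{6.2.24} $$

We added a small mass, σ, for the fermion to avoid possible IR divergences and also to be able to obtain a nonzero answer for terms proportional to $1/M^4$.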

Figure 6.4 (panels (a), (b), (c), (d)): Diagrams in the full theory to order $\lambda^4$. Diagram (d) stands in for two diagrams that differ only by the placement of the loop.

The diagrams that contribute to the scattering at one loop are illustrated in Fig. 6.4. As we

did before, we will focus on the momentum-independent part of the amplitude and we will not

explicitly write the terms related by exchange of external fermions. The first diagram gives

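$$ \begin{aligned}
(a) &= (-i\lambda)^4 \int \frac{d^dk}{(2\pi)^d}\, \bar u_3 \frac{i(\slashed{k} + \sigma)}{k^2 - \sigma^2} u_1\; \bar u_4 \frac{i(-\slashed{k} + \sigma)}{k^2 - \sigma^2} u_2\; \frac{i^2}{(k^2 - M^2)^2} \\
&= \lambda^4 \left[ -\bar u_3\gamma^\alpha u_1\; \bar u_4\gamma^\beta u_2 \int \frac{d^dk}{(2\pi)^d}\, \frac{k_\alpha k_\beta}{(k^2 - \sigma^2)^2 (k^2 - M^2)^2} + \bar u_3 u_1\; \bar u_4 u_2 \int \frac{d^dk}{(2\pi)^d}\, \frac{\sigma^2}{(k^2 - \sigma^2)^2 (k^2 - M^2)^2} \right] .
\end{aligned} \tag{6.2.25} $$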

The loop integrals are straightforward to evaluate using Feynman parameterization

$$ \frac{1}{(k^2 - \sigma^2)^2 (k^2 - M^2)^2} = 6 \int_0^1 dx\, \frac{x(1-x)}{(k^2 - xM^2 - (1-x)\sigma^2)^4} . \tag{6.2.26} $$


The final result for diagram (a) is

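$$ \begin{aligned}
(a)_{\rm UV} &= \frac{i\lambda^4}{(4\pi)^2} \left[ U_V\, \frac12 \int_0^1 dx\, \frac{x(1-x)}{xM^2 + (1-x)\sigma^2} + \sigma^2 U_S \int_0^1 dx\, \frac{x(1-x)}{(xM^2 + (1-x)\sigma^2)^2} \right] \\
&= \frac{i\lambda^4}{(4\pi)^2} \left[ U_V \left( \frac{1}{4M^2} + \frac{\sigma^2}{4M^4} \left( 3 - 2\log\left(\frac{M^2}{\sigma^2}\right) \right) \right) + U_S\, \frac{\sigma^2}{M^4} \left( \log\left(\frac{M^2}{\sigma^2}\right) - 2 \right) \right] + \cdots ,
\end{aligned} \tag{6.2.27} $$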

where we abbreviated $U_S = \bar u_3 u_1\, \bar u_4 u_2$, $U_V = \bar u_3\gamma^\alpha u_1\, \bar u_4\gamma_\alpha u_2$, and in the last line omitted terms of order $1/M^6$ and higher. The subscript $(\ldots)_{\rm UV}$ stands for the full theory. We will denote the corresponding amplitudes in the effective theory with the subscript $(\ldots)_{\rm IR}$. The cross box amplitude (b) is nearly identical, except for the sign of the momentum in one of the fermion propagators

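$$ (b)_{\rm UV} = \frac{i\lambda^4}{(4\pi)^2} \left[ -U_V \left( \frac{1}{4M^2} + \frac{\sigma^2}{4M^4} \left( 3 - 2\log\left(\frac{M^2}{\sigma^2}\right) \right) \right) + U_S\, \frac{\sigma^2}{M^4} \left( \log\left(\frac{M^2}{\sigma^2}\right) - 2 \right) \right] + \cdots . \tag{6.2.28} $$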

Diagrams (c) and (d) are even simpler to evaluate, but they are divergent:

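$$ (c)_{\rm UV} = -4\, \frac{i\lambda^4}{(4\pi)^2}\, \frac{\sigma^2}{M^4}\, U_S \left[ \frac{3}{\bar\epsilon} + 3\log\left(\frac{\mu^2}{\sigma^2}\right) + 1 \right] + \cdots , \tag{6.2.29} $$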

where $\frac{1}{\bar\epsilon} = \frac{1}{\epsilon} - \gamma + \log(4\pi)$. Here, µ is the regularization scale and it enters since the coupling λ carries a factor of $\mu^\epsilon$ in dimensional regularization. The four Yukawa couplings give $\lambda^4\mu^{4\epsilon}$. However, $\mu^{2\epsilon}$ should be factored out of the calculation to give the proper dimension of the four-fermion coupling, while the remaining $\mu^{2\epsilon}$ is expanded for small ε and yields $\log(\mu^2)$. In the following expression a factor of two is included to account for two diagrams

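$$ (d)_{\rm UV} = -2\, \frac{i\lambda^4}{(4\pi)^2 M^2}\, U_S \left[ \frac{1}{\bar\epsilon} + 1 + \log\left(\frac{\mu^2}{M^2}\right) + \frac{\sigma^2}{M^2} \left( 2 - 3\log\left(\frac{M^2}{\sigma^2}\right) \right) \right] + \cdots . \tag{6.2.30} $$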

The sum of all of these contributions is

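$$ (a + \ldots + d)_{\rm UV} = \frac{2i\lambda^4 U_S}{(4\pi)^2 M^2} \left[ -\frac{1}{\bar\epsilon} - 1 - \log\left(\frac{\mu^2}{M^2}\right) + \frac{\sigma^2}{M^2} \left( -\frac{6}{\bar\epsilon} - 6\log\left(\frac{\mu^2}{\sigma^2}\right) - 6 + 4\log\left(\frac{M^2}{\sigma^2}\right) \right) \right] . \tag{6.2.31} $$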

We also need the fermion two-point function in order to calculate the wave function renormalization in the effective theory. The calculation is identical to that in eq. (6.2.13). We need the finite part as well. The amplitude linear in momentum is

$$ \frac{i\slashed{p}\,\lambda^2}{2(4\pi)^2} \left( \frac{1}{\bar\epsilon} + \log\left(\frac{\mu^2}{M^2}\right) + \frac12 + \cdots \right) . \tag{6.2.32} $$

It is time to calculate in the effective theory. The effective theory has a four-fermion interaction

that was induced at tree level. Again, we neglect the light scalar ϕ as it does not play any role

in our calculation. The effective Lagrangian is

$$ \mathcal{L} = i z\, \bar\psi\slashed{\partial}\psi - \sigma\bar\psi\psi + \frac{c}{2}\, \bar\psi\psi\, \bar\psi\psi . \tag{6.2.33} $$

We established that at tree level, $c = \lambda^2/M^2$, but do not yet want to substitute the actual value of c so as not to confuse the calculations in the full and effective theories. To match the amplitudes we also need to compute the one-loop scattering amplitude in the effective theory. The two-point amplitude for the fermion kinetic energy vanishes in the effective theory. The four-point diagrams in the effective theory are depicted in fig. 6.5. Diagrams in an effective theory typically have higher degrees of UV divergence as they contain fewer propagators. For example, diagram $(a)_{\rm IR}$ is quadratically divergent, while $(a)_{\rm UV}$ is finite. This is not an obstacle. We simply regulate each diagram using dimensional regularization.

Figure 6.5 (panels (a), (b), (c), (d)): Diagrams in the effective theory to order $c^2$. Diagram (d) stands in for two diagrams that are related by an upside-down reflection. As in fig. 6.2, the four-fermion vertices are not drawn exactly point-like, so one can follow each fermion line.

Exactly like in the full theory, the fermion propagators in diagrams $(a)_{\rm IR}$ and $(b)_{\rm IR}$ have opposite signs of momentum, thus the terms proportional to $U_V$ cancel. The parts proportional to $U_S$ are the same and the sum of these diagrams is

$$ (a + b)_{\rm IR} = 2\, \frac{i c^2 \sigma^2}{(4\pi)^2}\, U_S \left[ \frac{1}{\bar\epsilon} + \log\left(\frac{\mu^2}{\sigma^2}\right) \right] + \ldots . \tag{6.2.34} $$

If one was careless with drawing these diagrams, one might think that there is a closed fermion

loop and assign an extra minus sign. However, the way of drawing the effective interactions

in fig. 6.5 makes it clear that the fermion line goes around the loop without actually closing.

Diagram $(c)_{\rm IR}$ is identical to its counterpart in the full theory. Since we are after the momentum-independent part of the amplitude, the heavy scalar propagators in $(c)_{\rm UV}$ were simply equal to $-\frac{i}{M^2}$. Therefore,

$$ (c)_{\rm IR} = -4\, \frac{i c^2 \sigma^2}{(4\pi)^2}\, U_S \left[ \frac{3}{\bar\epsilon} + 3\log\left(\frac{\mu^2}{\sigma^2}\right) + 1 \right] + \ldots . \tag{6.2.35} $$

As in the full theory, (d)IR includes a factor of two for two diagrams

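$$ (d)_{\rm IR} = 2\, \frac{i c^2 \sigma^2}{(4\pi)^2}\, U_S \left[ \frac{3}{\bar\epsilon} + 3\log\left(\frac{\mu^2}{\sigma^2}\right) + 1 \right] + \ldots . \tag{6.2.36} $$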

The sum of these diagrams is

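$$ (a + \ldots + d)_{\rm IR} = -2\, \frac{i c^2 \sigma^2}{(4\pi)^2}\, U_S \left[ \frac{2}{\bar\epsilon} + 2\log\left(\frac{\mu^2}{\sigma^2}\right) + 1 \right] . \tag{6.2.37} $$

Of course, we should set $c = \lambda^2/M^2$ at this point.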


Before we compare the results let us make two important observations. There are several logs

in the amplitudes. In the full theory, log( µ2

M2 ), log(µ2

σ2 ) and log(M2

σ2 ) appear, while in the effective

theory only log(µ2

σ2 ) shows up. Interestingly, comparing the full and effective theories diagram by

diagram, the corresponding coefficients in front of log(σ2) are identical. This means that log(σ2)

drops out of the difference between the full and effective theories so log(σ2) never appears in

the matching coefficients. It had to be this way. We already argued that the two theories are

identical in the IR, so non-analytic terms depending on the light fields must be the same. This

would hold for all other quantities in the low-energy theory, for instance for terms that depend

on the external momenta. This correspondence between logs of low-energy quantities does not

have to happen, in general, diagram by diagram, but it has to hold for the entire calculation.

This provides a useful check on matching calculations. When the full and effective theory are

compared, the only log that turns up is $\log(\mu^2/M^2)$. This is good news as it means that there is only one scale in the matching calculation and we can minimize the logs by setting $\mu = M$.

The $\frac{1}{\epsilon}$ poles are different in the full and effective theories as the effective theory diagrams are

more divergent. We simply add appropriate counterterms in the full and the effective theories to

cancel the divergences. The counterterms in the two theories are not related. We compare the

renormalized, or physical, scattering amplitudes and make sure they are equal. We are going to

use the MS prescription and the counterterms will cancel just the $\frac{1}{\epsilon}$ poles. It is clear that since

the counterterms differ on the two sides, the coefficients in the effective theory depend on the

choice of regulator. Of course, physical quantities will not depend on the regulator.

Setting µ = M , the difference between eqs. (6.2.31) and (6.2.37) gives

$$c(\mu = M) = \frac{\lambda^2}{M^2} - \frac{2\lambda^4}{(4\pi)^2M^2} - \frac{10\lambda^4\sigma^2}{(4\pi)^2M^4}. \tag{6.2.38}$$
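To get a feeling for the relative size of the three terms in eq. (6.2.38), one can simply evaluate them. The following Python sketch is only an illustration: the function name `matching_coefficient` and the sample values of λ, M and σ are our own choices and do not come from the calculation above.

```python
import math

def matching_coefficient(lam, M, sigma):
    """Return the three terms of c(mu = M) in eq. (6.2.38) separately."""
    loop = 1.0 / (4.0 * math.pi) ** 2            # one-loop suppression factor
    tree = lam**2 / M**2                          # tree-level matching
    one_loop = -2.0 * lam**4 * loop / M**2        # one-loop, order 1/M^2
    mass_suppressed = -10.0 * lam**4 * sigma**2 * loop / M**4  # one-loop, order sigma^2/M^4
    return tree, one_loop, mass_suppressed

# Hypothetical inputs: lambda = 1, M = 1000, sigma = 10 (same mass units).
tree, one_loop, mass_suppressed = matching_coefficient(lam=1.0, M=1000.0, sigma=10.0)
c_total = tree + one_loop + mass_suppressed
print(f"tree = {tree:.3e}, one-loop = {one_loop:.3e}, sigma^2/M^4 term = {mass_suppressed:.3e}")
print(f"c(mu = M) = {c_total:.6e}")
```

For these inputs the one-loop term is a correction of about one percent to the tree-level value, and the last term is smaller than the one-loop term by a further factor $5\sigma^2/M^2$.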

To reproduce the two-point function in the full theory we set $z = 1 + \frac{\lambda^2}{4(4\pi)^2}$ in the MS prescription since there are no contributions in the effective theory. To obtain the physical scattering amplitude, the fermion field needs to be canonically normalized by rescaling $\sqrt{z}\,\psi \to \psi_{\rm canonical}$. This rescaling gives an additional contribution to the $\frac{\lambda^4}{(4\pi)^2M^2}$ term in the scattering amplitude

from the product of the tree-level contribution and the wave function renormalization factor.

Without further analysis, it is not obvious that it is consistent to keep the last term in the

expression for $c(\mu = M)$. One would have to examine if there are any other terms proportional to $\frac{1}{M^4}$ that were neglected. For example, the momentum-dependent operator proportional to $d$ in Eq. (??) could give a contribution of the same order when the RG running in the effective theory is included. Such a contribution would be proportional to $\frac{\lambda^2\eta^2\sigma^2}{(4\pi)^2M^4}\log\left(\frac{M^2}{m^2}\right)$. There can also

be contributions to the fermion two-point function arising in the full theory from the heavy

scalar exchange. We were originally interested in a theory with massless fermions which means

that $\sigma = 0$. It was a useful detour to do the matching calculation including the $\frac{1}{M^4}$ terms as various logs and UV divergences do not fully show up in this example at the $\frac{1}{M^2}$ order.

We calculated the scattering amplitudes arising from the exchanges of the heavy scalar. In the

calculation of the ψψ → ψψ scattering cross section, both amplitudes coming from the exchanges

of the heavy and light scalars have to be added. These amplitudes depend on different coupling

constants, but they can be difficult to disentangle experimentally since the measurements are

done at low energies. The amplitude associated with the heavy scalar is measurable only if

the mass and the coupling of the light scalar can be inferred. This can be accomplished, for

example, if the light scalar can be produced on-shell in the s channel. Near the resonance


corresponding to the light scalar, the scattering amplitude is dominated by the light scalar and

its mass and coupling can be determined. Once the couplings of the light scalar are established,

one could deduce the amplitude associated with the heavy scalar by subtracting the amplitude

with the light scalar exchange. If the heavy and light states did not have identical spins one

could distinguish their contributions more easily as they would give different angular dependence

of the scattering cross section.

6.2.6 Naturalness

Integrating out a fermion in the Yukawa theory emphasizes several important points. We are

going to study the same “full” Lagrangian again, but this time assume that the fermion is heavy

and the scalar ϕ remains light

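$$\mathcal{L} = i\bar\psi\,\slashed{\partial}\,\psi - M\bar\psi\psi + \frac{1}{2}(\partial_\mu\varphi)^2 - \frac{m^2}{2}\varphi^2 - \eta\,\bar\psi\psi\varphi\,, \tag{6.2.39}$$

where $M \gg m$. We will integrate out $\psi$ and keep $\varphi$ in the effective theory. As we did earlier,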

we have neglected the potential for ϕ assuming that it is zero. There are no tree-level diagrams

involving fermions ψ in the internal lines only. We are going to examine diagrams with two

scalars and four scalars for illustration purposes. The diagrams resemble those of the Coleman-

Weinberg effective potential calculation, but we do not necessarily neglect external momenta.

The momentum dependence could be of interest. The two point function gives

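$$\begin{aligned} &= (-1)(-i\eta\mu^{\epsilon})^2\int\frac{d^dk}{(2\pi)^d}\; i^2\,\frac{{\rm Tr}\left[(\slashed{k} + \slashed{p} + M)(\slashed{k} + M)\right]}{\left[(k+p)^2 - M^2\right](k^2 - M^2)} \\ &= -\frac{4i\eta^2}{(4\pi)^2}\left[\left(\frac{3}{\epsilon} + 1 + 3\log\left(\frac{\mu^2}{M^2}\right)\right)\left(M^2 - \frac{p^2}{6}\right) + \frac{p^2}{2} - \frac{p^4}{20M^2} + \cdots\right], \end{aligned} \tag{6.2.40}$$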

where we truncated the momentum expansion at order $p^4$. The four-point amplitude, to the lowest order in momentum, is

$$= -\frac{8i\eta^4}{(4\pi)^2}\left[3\left(\frac{1}{\epsilon} + \log\left(\frac{\mu^2}{M^2}\right)\right) - 8 + \cdots\right]. \tag{6.2.41}$$

There are no logarithms involving $m^2$ or $p^2$ in eqs. (6.2.40) and (6.2.41). Our effective theory at the tree level contains a free scalar field only, so in that effective theory there are no interactions and no loop diagrams. Thus, logarithms involving $m^2$ or $p^2$ do not appear because they could not be reproduced in the effective theory. Setting $\mu = M$ and choosing the counterterms to cancel the $\frac{1}{\epsilon}$ poles we can read off the matching coefficients in the scalar theory

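$$\mathcal{L} = \left(1 - \frac{4\eta^2}{3(4\pi)^2}\right)\frac{(\partial_\mu\varphi)^2}{2} - \left(m^2 + \frac{4\eta^2M^2}{(4\pi)^2}\right)\frac{\varphi^2}{2} + \frac{\eta^2}{5(4\pi)^2M^2}\,\frac{(\partial^2\varphi)^2}{2} + \frac{64\eta^4}{(4\pi)^2}\,\frac{\varphi^4}{4!} + \cdots \tag{6.2.42}$$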

To obtain physical scattering amplitudes one needs to absorb the $1 - \frac{4\eta^2}{3(4\pi)^2}$ factor in the scalar kinetic energy, so the field is canonically normalized. The scalar effective Lagrangian in Eq. (6.2.42) is by no means a consistent approximation. For example, we did not calculate the tadpole diagram and did not calculate the diagram with three scalar fields. Such diagrams do not vanish

since the Yukawa interaction is not symmetric under ϕ → −ϕ. There are no new features in

those calculations so we skipped them.


The scalar mass term, $m^2 + \frac{4\eta^2M^2}{(4\pi)^2}$, contains a contribution from the heavy fermion. If the sum $m^2 + \frac{4\eta^2M^2}{(4\pi)^2}$ is small compared to $\frac{4\eta^2M^2}{(4\pi)^2}$ one calls the scalar “light” compared to the heavy mass scale $M$. This requires a cancellation between $m^2$ and $\frac{4\eta^2M^2}{(4\pi)^2}$. Cancellation happens when the two terms are of opposite signs and close in magnitude, yet their origins are unrelated. No symmetry of the theory can relate the tree-level and the loop-level terms. If there was a symmetry that ensured the tree-level and loop contributions are equal in magnitude and opposite in sign, then small breaking of such symmetry could make the sum $m^2 + \frac{4\eta^2M^2}{(4\pi)^2}$ small. But

no symmetry is present in our Lagrangian. This is why light scalars require a tuning of different

terms unless there is a mechanism protecting the mass term, for example the shift symmetry or

supersymmetry.
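The amount of tuning implied by the mass term in eq. (6.2.42) is easy to quantify. The short Python sketch below computes the loop-induced shift and the tree-level $m^2$ needed to end up with a given light mass. The function names and the sample numbers (a fermion at $10^{10}$ GeV, a scalar at 100 GeV, η = 1) are hypothetical choices for illustration, and the rescaling of the kinetic term is ignored.

```python
import math

def heavy_fermion_shift(eta, M):
    """Loop-induced shift 4 eta^2 M^2 / (4 pi)^2 of the scalar mass term, eq. (6.2.42)."""
    return 4.0 * eta**2 * M**2 / (4.0 * math.pi) ** 2

def required_cancellation(eta, M, m_light):
    """Return (tree-level m^2, tuning) such that m^2 + shift = m_light^2.

    'tuning' is the ratio m_light^2 / shift. A value much smaller than one
    means that m^2 has to cancel the loop term to that relative accuracy.
    """
    shift = heavy_fermion_shift(eta, M)
    m2_tree = m_light**2 - shift
    return m2_tree, m_light**2 / shift

# Hypothetical numbers in GeV: heavy fermion at 1e10, light scalar at 100, eta = 1.
m2_tree, tuning = required_cancellation(eta=1.0, M=1e10, m_light=100.0)
print(f"tree-level m^2 = {m2_tree:.3e} GeV^2, tuning = {tuning:.1e}")
```

With these inputs the tree-level parameter must be negative and must match the loop contribution to a few parts in $10^{15}$.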

The sensitivity of the scalar mass term to the heavy scales is often referred to as the quadratic

divergence of the scalar mass term. When one uses mass-dependent regulators, the mass terms

for scalar fields receive corrections proportional to $\frac{\Lambda^2}{(4\pi)^2}$. Having light scalars makes fine tuning

necessary to cancel the large regulator contribution. There are no quadratic divergences in

dimensional regularization, but the fine tuning of scalar masses is just the same. In dimensional

regularization, the scalar mass is quadratically sensitive to heavy particle masses. This is a

much more intuitive result compared to the statement about an unphysical regulator. Fine

tuning of scalar masses would not be necessary in dimensional regularization if there were no

heavy particles. For example, if the Standard Model (SM) was a complete theory there would be

no fine tuning associated with the Higgs mass. Perhaps the SM is a complete theory valid even

beyond the grand unification scale, but there is gravity and we expect Planck-scale particles in

any theory of quantum gravity. Another term used for the fine tuning of the Higgs mass in the

SM is the hierarchy problem. Having a large hierarchy between the Higgs mass and other large

scales requires fine tuning, unless the Higgs mass is protected by symmetry.

It is apparent from our calculation that radiative corrections generate all terms allowed by

symmetries. Even if zero at tree level, there is no reason to assume that the potential for the

scalar field vanishes. The potential is generated radiatively. We obtained nonzero potential

in the effective theory when we integrated out a heavy fermion. However, generation of terms

by radiative corrections is not at all particular to effective theory. The RG evolution in the

full theory would do the same. We saw another example of this in §6.2.4, where an operator

absent at one scale was generated radiatively. Therefore, having terms smaller than the sizes

of radiative corrections requires fine tuning. A theory with all coefficients whose magnitudes

are not substantially altered by radiative corrections is called technically natural. Technical

naturalness does not require that all parameters are of the same order, it only implies that none

of the parameters receives radiative corrections that significantly exceed its magnitude. As our

calculation demonstrated, a light scalar that is not protected by symmetry is not technically

natural.

Naturalness is a stronger criterion. Dirac’s naturalness condition is that all dimensionless

coefficients are of order one and the dimensionful parameters are of the same magnitude. A

weaker naturalness criterion, due to ’t Hooft, is that small parameters are natural if setting a

small parameter to zero enhances the symmetry of the theory. Technical naturalness is yet a

weaker requirement. The relative sizes of terms are dictated by the relative sizes of radiative

corrections and not necessarily by symmetries, although symmetries obviously affect the magnitudes of radiative corrections. Technical naturalness has to do with how perturbative field

theory works.


6.2.7 Summary

We have constructed several effective theories so far. It is a good moment to pause and review

the observations we made. To construct an EFT one needs to identify the light fields and their

symmetries, and needs to establish a power counting scheme. If the full theory is known then an

EFT is derived perturbatively as a chain of matching calculations interlaced by RG evolutions.

Each heavy particle is integrated out and new effective theory matched to the previous one,

resulting in a tower of effective field theories. Consecutive ratios of scales are accounted for by

the RG evolution.

This is a systematic procedure which can be carried out to the desired order in the loop

expansion. Matching is done order by order in the loop expansion. When two theories are

compared at a given loop order, the lower order results are included in the matching. For

example, in §6.2.5 we calculated loop diagrams in the effective theory including the effective

interaction we obtained at the tree level. At each order in the loop expansion, the effective theory

valid below a mass threshold is amended to match the results valid just above that threshold.

Matching calculations do not depend on any light scales and if logs appear in the matching

calculations, these have to be logs of the matching scale divided by the renormalization scale.

Such logs can be easily minimized to avoid spoiling perturbative expansion. The two theories

that are matched across a heavy threshold have in general different UV divergences and therefore

different counterterms.

EFTs naturally contain higher-dimensional operators and are therefore non-renormalizable.

In practice, this is of no consequence since the number of operators, and therefore the number

of parameters determined from experiment, is finite. To preserve power counting and maintain

consistent expansion in the inverse of large mass scales one needs to employ a mass-independent

regulator, for instance dimensional regularization. Consequently, the renormalization scale only

appears in dimensionless ratios inside logarithms and so it does not alter power counting. Contributions from the heavy fields do not automatically decouple when using dimensional regularization; thus decoupling should be carried out explicitly by constructing effective theories.

Large logarithms arise from the RG running only as one relates parameters of the theory

at different renormalization scales. The field content of the theory does not change while its

parameters are RG evolved. However, distinct operators of the same dimension can mix with

one another. The RG running and matching are completely independent and can be done

at unrelated orders in perturbation theory. The magnitudes of coupling constants and the

ratios of scales dictate the relative sizes of different contributions and dictate to what orders

in perturbation theory one needs to calculate. A commonly repeated phrase is that two-loop

running requires one-loop matching. This is true when the logarithms are very large, for example

in grand unified theories. The $\log(M_{\rm GUT}/M_{\rm weak})$ is almost as large as $(4\pi)^2$, so the logarithm

compensates the loop suppression factor. This is not the case for smaller ratios of scales.

The contributions of the heavy particles to an effective Lagrangian appear in both renormalizable terms and in higher-dimensional terms. For the renormalizable terms, the contributions from heavy fields are often unobservable as the coefficients of the renormalizable terms are determined from low-energy experiments. The contributions of the heavy fields simply redefine

the coefficients that were determined from experiments instead of being predicted by the theory.

The coefficients of the higher-dimensional operators are suppressed by inverse powers of the

heavy masses. As one increases the masses of the heavy particles, their effects diminish. This

typical situation is referred to as the decoupling of heavy fields.

When the high-energy theory is not known, or it is not perturbative, one still benefits from

constructing an EFT. One can power count the operators and then enumerate the pertinent

operators to the desired order. One cannot calculate the coefficients, but one can estimate them.

In a perturbative theory, explicit examples tell us what magnitudes of coefficients to expect at

any order of the loop expansion.

6.3 The Standard Model as an Effective Theory∗

We are now ready to start building effective field theory models. If we believe in the naturalness

principle articulated in the previous section, then the models should be defined by specifying

the particle content and the symmetries of the theory. Then we should write down all possible

couplings consistent with the symmetries.

6.3.1 The Standard Model

Let us apply these ideas to the standard model. The standard model is defined to be a theory

with gauge group

SU(3)C × SU(2)W × U(1)Y . (6.3.43)

The fermions of the standard model can be written in terms of 2-component Weyl spinor fields

as

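$$Q_i \sim (\mathbf{3},\mathbf{2})_{+\frac{1}{6}}, \tag{6.3.44}$$

$$(u^c)_i \sim (\bar{\mathbf{3}},\mathbf{1})_{-\frac{2}{3}}, \tag{6.3.45}$$

$$(d^c)_i \sim (\bar{\mathbf{3}},\mathbf{1})_{+\frac{1}{3}}, \tag{6.3.46}$$

$$L_i \sim (\mathbf{1},\mathbf{2})_{-\frac{1}{2}}, \tag{6.3.47}$$

$$(e^c)_i \sim (\mathbf{1},\mathbf{1})_{+1}, \tag{6.3.48}$$

where $i = 1, 2, 3$ is a generation index. In addition, the model contains a single scalar multiplet

$$H \sim (\mathbf{1},\mathbf{2})_{+\frac{1}{2}}. \tag{6.3.49}$$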

According to the ideas above, we must now write the most general interactions allowed by the

symmetries. The most important interactions are the marginal and relevant ones. The marginal

interactions include kinetic terms for the Higgs field, the fermion fields, and the gauge fields:

$$\mathcal{L}_{\rm kinetic} = (D^\mu H)^\dagger D_\mu H + Q_i^\dagger\, i\sigma^\mu D_\mu Q_i + \cdots - \frac{1}{4}B^{\mu\nu}B_{\mu\nu} + \cdots \tag{6.3.50}$$

Note that these include the gauge self interactions. Also marginal is the quartic interaction for

the Higgs

$$\Delta\mathcal{L}_{\rm quartic} = -\frac{\lambda}{4}(H^\dagger H)^2 \tag{6.3.51}$$

and Yukawa interactions:

$$\Delta\mathcal{L}_{\rm Yukawa} = (y_u)_{ij}\,Q_iH(u^c)_j + (y_d)_{ij}\,Q_iH^\dagger(d^c)_j + (y_e)_{ij}\,L_iH^\dagger(e^c)_j. \tag{6.3.52}$$

Note that the Yukawa interactions are the only interactions that break an $SU(3)^5$ global symmetry

that would otherwise act on the generation indices of the fermion fields. This means that the


Yukawa interactions can be naturally small without any fine tuning. This is reassuring, since it

means that the small electron Yukawa coupling $y_e \sim 10^{-5}$ is perfectly natural.

Finally, the marginal interactions include ‘vacuum angle’ (v.a.) terms for each of the gauge

groups:

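$$\mathcal{L}_{\rm v.a.} = \frac{g_1^2\Theta_1}{16\pi^2}\,B^{\mu\nu}\tilde{B}_{\mu\nu} + \frac{g_2^2\Theta_2}{8\pi^2}\,{\rm Tr}(W^{\mu\nu}\tilde{W}_{\mu\nu}) + \frac{g_3^2\Theta_3}{8\pi^2}\,{\rm Tr}(G^{\mu\nu}\tilde{G}_{\mu\nu}), \tag{6.3.53}$$

where $\tilde{B}_{\mu\nu} = \frac{1}{4}\epsilon_{\mu\nu\rho\sigma}B^{\rho\sigma}$, etc. These terms break $CP$, and are therefore very interesting. These terms are total derivatives, e.g.

$$B^{\mu\nu}\tilde{B}_{\mu\nu} = \partial_\mu K^\mu, \qquad K^\mu = \frac{1}{2}\epsilon^{\mu\nu\rho\sigma}A_\nu F_{\rho\sigma}. \tag{6.3.54}$$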

This is enough to ensure that they do not give physical effects to all orders in perturbation

theory. They can give non-perturbative effects with parametric dependence $\sim e^{-1/g^2}$, but these

are completely negligible for the SU(2)W × U(1)Y terms, since these gauge couplings are never

strong. The strong vacuum angle gives rise to CP -violating non-perturbative effects in QCD,

most importantly the electric dipole moment of the neutron. Experimental bounds on the

neutron electric dipole moment require $\Theta_3 \lesssim 10^{-10}$. Explaining this small number is the ‘strong

CP problem.’ There are a number of proposals to solve the strong CP problem. For example,

there may be a spontaneously broken Peccei-Quinn symmetry leading to an axion, or there may

be special flavor structure at high scales that ensures that the determinant of the quark masses

is real.

There is one relevant interaction that is allowed, namely a mass term for the Higgs field:

$$\mathcal{L}_{\rm relevant} = -m_H^2\,H^\dagger H. \tag{6.3.55}$$

Note that mass terms for the fermions such as $Le^c$ are not gauge singlets, and are therefore forbidden

by gauge symmetry. The Higgs mass parameter cannot be forbidden by any obvious symmetry,

and therefore must be fine tuned in order to be light compared to heavy thresholds such as the

GUT scale. For example, in GUT models there are massive gauge bosons with masses of order

$M_{\rm GUT}$ that couple to the Higgs with strength $g$, where $g$ is the unified gauge coupling. These will contribute to the effective Higgs mass below the GUT scale

$$\Delta m_H^2 \sim \frac{g^2M_{\rm GUT}^2}{16\pi^2} \sim 10^{30}\,{\rm GeV}^2 \tag{6.3.56}$$

for $M_{\rm GUT} \sim 10^{16}$ GeV. In order to get a Higgs mass of order 100 GeV we must fine tune to one part in $10^{26}$!
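The arithmetic behind eq. (6.3.56) can be checked in a few lines. The following Python sketch assumes a unified coupling $g = 1$, which is our own illustrative choice (the text only says that $g$ is the unified gauge coupling), and uses a hypothetical helper `threshold_correction`.

```python
import math

def threshold_correction(coupling, M_heavy):
    """Estimate Delta m_H^2 ~ coupling^2 M_heavy^2 / (16 pi^2), in GeV^2."""
    return coupling**2 * M_heavy**2 / (16.0 * math.pi**2)

g_unified = 1.0   # assumed O(1) value, not fixed by the text
M_GUT = 1e16      # GeV
m_higgs = 100.0   # GeV

delta_m2 = threshold_correction(g_unified, M_GUT)
tuning = m_higgs**2 / delta_m2   # fraction of the correction that may survive
print(f"Delta m_H^2 ~ {delta_m2:.1e} GeV^2, tuning ~ {tuning:.1e}")
```

For these inputs the correction is a little below $10^{30}\,{\rm GeV}^2$ and the required cancellation is of order one part in $10^{26}$, in line with the estimate above.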

We can turn this around and ask what is the largest mass threshold that is naturally compatible with the existence of a light Higgs boson. The top quark couples to the Higgs with coupling strength $y_t \sim 1$, and top quark loops give a quadratically divergent contribution to the Higgs mass. Assuming that this is cut off by a new threshold at the scale $M$, we find a contribution to the Higgs mass of order

$$\Delta m_H^2 \sim \frac{y_t^2M^2}{16\pi^2}, \tag{6.3.57}$$

which is naturally small for $M \lesssim 1$ TeV. We get a similar estimate for $M$ from loops involving

SU(2) × U(1) gauge bosons. So the standard model is natural as an effective field theory only

if there is new physics at or below a TeV. This is the principal motivation for the Large Hadron

Collider (LHC) at CERN, which will start operation in 2007-2008 with a center of mass energy

of 14TeV. It is expected that the LHC will discover the mechanism of electroweak symmetry

breaking and the new physics that makes it natural.
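The estimate $M \lesssim 1$ TeV follows from inverting eq. (6.3.57). As a rough sketch, the Python snippet below solves $y_t^2M^2/(16\pi^2) = m_H^2/f$ for $M$, where $f$ is the fraction of the correction that is allowed to survive ($f = 1$ means no tuning). The helper name, the parameter `tuning_fraction` and the value $m_H = 100$ GeV are illustrative assumptions.

```python
import math

def max_natural_threshold(m_higgs, y_top=1.0, tuning_fraction=1.0):
    """Largest M (GeV) with y_t^2 M^2 / (16 pi^2) <= m_higgs^2 / tuning_fraction."""
    return 4.0 * math.pi * m_higgs / (y_top * math.sqrt(tuning_fraction))

M_no_tuning = max_natural_threshold(100.0)                               # f = 1
M_percent_tuning = max_natural_threshold(100.0, tuning_fraction=0.01)   # f = 1/100
print(f"no tuning:      M < {M_no_tuning:.0f} GeV")
print(f"1 part in 100:  M < {M_percent_tuning:.0f} GeV")
```

Without tuning the threshold comes out slightly above 1 TeV, and accepting a cancellation at the percent level only raises it by a factor of ten.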

6.3.2 The GIM Mechanism

One very important feature of the standard model is that it violates flavor in just the right

way. The quark mass matrices are proportional to the up-type and down-type Yukawa couplings. Diagonalizing the quark mass matrices requires that we perform independent unitary

transformations on the two components of the quark doublet Qi. This gives rise to the CKM

mixing matrix, which appears in the interactions of the mass eigenstate quarks with the W±

(‘charged currents’). Crucially, the interactions with the photon and the Z (‘neutral currents’)

are automatically diagonal in the mass basis. This naturally explains the phenomenology of

flavor-changing decays observed in nature, including the ‘GIM suppression’ of flavor changing

neutral current processes such as $K^0$–$\bar{K}^0$ mixing.

For our purposes, what is important is that this comes about because the quark Yukawa

couplings are the only source of flavor violation in the standard model. If there were other

couplings that violated quark flavor, these would not naturally be diagonal in the same basis

that diagonalized the quark masses, and would in general lead to additional flavor violation. A

simple example of this is a general model with 2 Higgs doublets, in which there are twice as

many Yukawa coupling matrices.

6.3.3 Accidental Symmetries

It is noteworthy that the standard model was completely defined by its particle content and gauge

symmetries. In particular, we did not have to impose any additional symmetries to suppress

unwanted interactions. If we look back at the terms we wrote down, we see that all of the relevant

and marginal interactions are actually invariant under some additional global symmetries. One

of these is baryon number, a U(1) symmetry with charges

$$B(Q) = \tfrac{1}{3}, \quad B(u^c) = B(d^c) = -\tfrac{1}{3}, \quad B(L) = B(e^c) = B(H) = 0. \tag{6.3.58}$$

Another symmetry is lepton number, another U(1) symmetry with charges

L(Q) = L(uc) = L(dc) = 0, L(L) = +1, L(ec) = −1, L(H) = 0. (6.3.59)

These symmetries can be broken by higher-dimension operators. For example, the lowest-

dimension operators that violate baryon number are dimension-six:

$$\Delta\mathcal{L} \sim \frac{1}{M^2}\,QQQL + \frac{1}{M^2}\,u^cu^cd^ce^c, \tag{6.3.60}$$

where the color indices are contracted using the $SU(3)_C$ invariant antisymmetric tensor. Consistency with the experimental limit on the proton lifetime of $10^{33}$ yr gives a bound $M \gtrsim 10^{22}$ GeV. Although this is larger than the Planck mass, these couplings also violate flavor symmetries, and it seems reasonable that whatever explains the small values of the light Yukawa couplings can suppress these operators.

A very appealing consequence of this is that if the standard model is valid up to a high scale

M , then the proton is automatically long-lived, without having to assume that baryon number

is an exact or approximate symmetry of the fundamental theory. Baryon number emerges as an

‘accidental symmetry’ in the sense that the other symmetries of the model (in this case gauge

symmetries) do not allow any relevant or marginal interactions that violate the symmetry.


6.3.4 Neutrino Masses

Lepton number can be violated by the dimension-five operator

$$\Delta\mathcal{L} \sim \frac{1}{M}(LH)(LH). \tag{6.3.61}$$

When the Higgs gets a VEV, this gives rise to Majorana masses for the neutrinos of order

$$m_\nu \sim \frac{v^2}{M}. \tag{6.3.62}$$

In order to get neutrino masses in the interesting range $m_\nu \sim 10^{-2}$ eV for solar and atmospheric neutrino mixing, we require $M \sim 10^{15}$ GeV, remarkably close to the GUT scale. The interaction

(6.3.61) also has a nontrivial flavor structure, so the actual scale of new physics depends on the

nature of flavor violation in the fundamental theory, like the baryon number violating interactions

considered above.
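The numbers in eq. (6.3.62) can be reproduced with a short Python sketch. It assumes $v \approx 246$ GeV for the Higgs VEV and an operator coefficient of exactly one. Both are our assumptions, so the result should only be trusted at the order-of-magnitude level, and the function names are hypothetical.

```python
def seesaw_scale(m_nu_eV, v_GeV=246.0):
    """Scale M (GeV) implied by m_nu ~ v^2 / M, with an O(1) coefficient set to one."""
    m_nu_GeV = m_nu_eV * 1e-9
    return v_GeV**2 / m_nu_GeV

def neutrino_mass_eV(M_GeV, v_GeV=246.0):
    """Neutrino mass (eV) implied by m_nu ~ v^2 / M."""
    return v_GeV**2 / M_GeV * 1e9

M_new = seesaw_scale(1e-2)        # scale for m_nu = 0.01 eV
m_nu = neutrino_mass_eV(1e15)     # mass for M = 1e15 GeV
print(f"M ~ {M_new:.1e} GeV for m_nu = 1e-2 eV")
print(f"m_nu ~ {m_nu:.2e} eV for M = 1e15 GeV")
```

Either way the scale lands between $10^{15}$ and $10^{16}$ GeV, which is the sense in which it is close to the GUT scale.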

The experimental discovery of neutrino masses has been heralded as the discovery of physics

beyond the standard model, but it can also be viewed as a triumph of the standard model.

The standard model predicts that neutrino masses (if present) are naturally small, since they

can only arise from an irrelevant operator. We can view the discovery of neutrino masses as

evidence for the existence of a new scale in physics. This is analogous to the discovery of weak

β decay, which can be described by an effective 4-fermion interaction with coupling strength

$G_F \sim 1/(100\,{\rm GeV})^2$. (Therefore, Fermi was doing effective quantum field theory in the 1930s!)

6.3.5 Beyond the Standard Model Physics

The steps in constructing an extension of the standard model are the same ones we followed in

constructing the standard model above. The model should be defined by its particle content

and symmetries. We then write down all couplings allowed by these principles. The goal is to

find an extension of the standard model that cures the naturalness problem, but preserves the

successes of the standard model described above.

6.4 Conclusions

Effective field theory is the language in which all of modern theoretical physics is phrased (or

should be phrased). It is well worth becoming proficient in speaking it.

7 Effective Field Theory and Inflation

Let us describe inflation as a low-energy effective theory. First, we identify the relevant light

degrees of freedom at the energy scale of the “experiment” (for inflation, this is the Hubble

scale, E ∼ H). The EFT will contain at least one light scalar, the inflaton φ. Next, we write

down the effective action for the inflaton, cf. eq. (6.2.2). We are obliged to write down all operators consistent with the assumed symmetries of the inflaton,

$$\frac{\mathcal{O}_\delta}{\Lambda^{\delta-4}}, \tag{7.0.1}$$

where δ denotes the mass dimension of the operator. The purpose of this chapter is to explain

that for inflation even Planck-suppressed operators don’t decouple. Instead they make critical

contributions to the dynamics. This indirectly makes inflation a window into quantum gravity.

7.1 UV Sensitivity

What value should we choose for the cutoff Λ? At what scale do we expect new degrees of

freedom to become important? The larger the cutoff, the more suppressed the effects of the

operators $\mathcal{O}_\delta$. However, the largest we can make the cutoff is the Planck scale. The presence of

some form of new physics at the Planck scale is required in order to render graviton-graviton

scattering sensible, just as unitarity of W -W scattering requires new physics at the TeV scale.

Although we know that new degrees of freedom must emerge, we cannot say whether the physics

of the Planck scale is a finite theory of quantum gravity, such as string theory, or is instead simply

an effective theory for some unimagined physics at yet higher scales.

Sensitivity to higher-dimension operators is commonplace in particle physics: as we saw in

the previous section, bounds on flavor-changing processes place limits on physics above the TeV

scale, and lower bounds on the proton lifetime even allow us to constrain GUT-scale operators

that would mediate proton decay. However, particle physics considerations alone do not often

reach beyond operators of dimension $\delta = 6$, nor go beyond $M \sim M_{\rm GUT}$. (Scenarios of gravity-

mediated supersymmetry breaking are one exception.) Equivalently, Planck-scale processes, and

operators of very high dimension, are irrelevant for most of particle physics: they decouple from

low-energy phenomena.

In inflation, however, the flatness of the potential in Planck units introduces sensitivity to

$\delta \le 6$ Planck-suppressed operators, such as

$$\frac{\mathcal{O}_6}{M_{\rm pl}^2}. \tag{7.1.2}$$

As we explain in §7.2, an understanding of such operators is required to address the smallness

of the eta parameter, i.e. to ensure that the theory supports at least 60 e-folds of inflationary

expansion. This sensitivity to dimension-six Planck-suppressed operators is therefore common

to all models of inflation.

For large-field models of inflation the UV sensitivity of the inflaton action is dramatically

enhanced. As we discuss in §7.3, in this important class of inflationary models the potential

becomes sensitive to an infinite series of operators of arbitrary dimension.

7.2 The Eta Problem

The most common field theory mechanisms for inflation involve a scalar field φ with mass mφ

parametrically smaller than the Hubble scale H:

$$\eta = \frac{m_\phi^2}{3H^2} \ll 1. \tag{7.2.3}$$

It is difficult to protect this hierarchy against high-energy corrections. We know that some new degrees of freedom must appear at $\Lambda \lesssim M_{\rm pl}$ to give a UV-completion of GR. In string theory this scale is often found to be significantly below the Planck scale, $\Lambda \lesssim M_s \lesssim M_{\rm pl}$. If $\phi$ has

O(1) couplings to the ‘Planck slop’ fields ψ, then integrating out the fields ψ yields the following

corrections to the low-energy effective action

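$$\Delta\mathcal{L}_\phi = \frac{\mathcal{O}_\Delta(\phi)}{\Lambda^{\Delta-4}}, \tag{7.2.4}$$

with $\mathcal{O}$ the set of all allowed operators in the effective theory of the inflaton field $\phi$. Consider an inflationary lagrangian of slow-roll form

$$\mathcal{L}_\phi = -\frac{1}{2}(\partial_\mu\phi)^2 - V_\phi(\phi). \tag{7.2.5}$$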

The above argument makes us worry that integrating out the massive fields ψ yields corrections

to the potential of the form

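$$\Delta V = c\,V_0(\phi)\,\frac{\phi^2}{\Lambda^2}. \tag{7.2.6}$$

If this term arises, the eta parameter receives the following correction

$$\Delta\eta = \frac{M_{\rm pl}^2}{V_0}(\Delta V)'' \approx 2c\left(\frac{M_{\rm pl}}{\Lambda}\right)^2. \tag{7.2.7}$$

Since $\Lambda \lesssim M_{\rm pl}$ and Wilson suggests $c \sim \mathcal{O}(1)$, we find

$$\Delta\eta \gtrsim 1. \tag{7.2.8}$$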
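The step from eq. (7.2.6) to eq. (7.2.7) can be checked numerically. The Python sketch below compares the closed-form estimate with a finite-difference second derivative of $\Delta V$, treating $V_0$ as constant over the step, which is the approximation behind the "≈" sign. The function names and all numerical inputs (in units with $M_{\rm pl} = 1$) are illustrative assumptions.

```python
import math

def delta_eta(c, Lambda, M_pl=1.0):
    """Leading estimate Delta eta ~ 2 c (M_pl / Lambda)^2 from eq. (7.2.7)."""
    return 2.0 * c * (M_pl / Lambda) ** 2

def delta_eta_numeric(c, Lambda, V0, phi, M_pl=1.0, h=1e-3):
    """Finite-difference version of (M_pl^2 / V0) (Delta V)'' with V0 held constant."""
    def dV(x):
        return c * V0 * x**2 / Lambda**2
    second = (dV(phi + h) - 2.0 * dV(phi) + dV(phi - h)) / h**2
    return M_pl**2 * second / V0

# Hypothetical inputs: c = 1, cutoff at 0.1 M_pl, V0 = 1e-10, phi = 0.01.
estimate = delta_eta(c=1.0, Lambda=0.1)
check = delta_eta_numeric(c=1.0, Lambda=0.1, V0=1e-10, phi=0.01)
print(f"estimate = {estimate:.1f}, finite difference = {check:.1f}")
```

Lowering the cutoff below the Planck scale only makes the problem worse: for $\Lambda = 0.1\,M_{\rm pl}$ the correction to eta is of order a few hundred rather than of order one.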

In supersymmetric theories a minor miracle occurs. Above the scale $H$, the theory is supersymmetric and the contributions from bosons and fermions precisely cancel. However, supersymmetry is spontaneously broken during inflation, leading to an inflaton mass of order Hubble, $m_\phi \sim H$, and an eta parameter of order one, $\Delta\eta \sim 1$.

7.3 Large-Field Inflation

The Planck-scale sensitivity of inflation is dramatically enhanced in models with observable

gravitational waves, r > 0.01. In this case, the inflaton field moves over a super-Planckian range

during the last 60 e-folds of inflation, $\Delta\phi > M_{\rm pl}$. This observation makes an effective field theorist nervous and a string theorist curious. Let us explain why.

It is postulated that the theory below a cutoff Λ is

$$\mathcal{L}_{\rm s.r.} = -\frac{1}{2}(\partial_\mu\phi)^2 - V(\phi). \tag{7.3.9}$$

The problem with field excursions larger than Λ is that the effective potential becomes sensitive

to the couplings to massive degrees of freedom. Let us parameterize these degrees of freedom by

a set of scalar fields $\psi_i$ with nearly equal masses $m_{\psi_i} \sim \Lambda$. As good Wilsonians we write down generic couplings between the inflaton $\phi$ and the massive fields $\psi_i$

$$V(\phi,\psi_i) = V(\phi) + \Lambda^2\psi_i^2 + g_i^2\,\phi^2\psi_i^2 + g_i^4\,\frac{\phi^4\psi_i^2}{\Lambda^2} + \cdots \tag{7.3.10}$$

For simplicity let us also assume that $g_i \sim g$. The potential then takes the form

$$V(\phi,\psi_i) = V(\phi) + \Lambda^2\psi_i^2\,f_i\!\left(\frac{g^2\phi^2}{\Lambda^2}\right). \tag{7.3.11}$$

For $\Delta\phi \gtrsim \Lambda/g$ the masses of the heavy fields change by order one, $\Delta m_\psi \sim \Lambda$. This implies that some of the fields $\psi_i$ will leave the EFT ($m_\psi \gg \Lambda$), while others will join it ($m_\psi \ll \Lambda$). The EFTs at $\langle\phi\rangle = 0$ and $\langle\phi\rangle = \Lambda/g$ will be different; the effective potential for the inflaton will be different. Often the problem is presented as follows: assuming $\Lambda \to M_{\rm pl}$ and $g \sim 1$, the same EFT is valid throughout observable inflation only if

$$\Delta\phi \ll \frac{\Lambda}{g} \sim M_{\rm pl}. \tag{7.3.12}$$

The problem can also be expressed as a constraint on the couplings $g$. Since field excursions smaller than $\Lambda/g$ are still ok, observable gravity waves, $\Delta\phi \gg M_{\rm pl}$, require

$$\frac{\Lambda}{g} \gg M_{\rm pl} \quad {\rm or} \quad g \ll \frac{\Lambda}{M_{\rm pl}}. \tag{7.3.13}$$

This can be a strong constraint on the coupling in the generic situation $\Lambda \ll M_{\rm pl}$. This leads us to a key question: Why does the inflaton couple so weakly (not even gravitationally!) to Planck-scale degrees of freedom? Obviously, this is a question for a Planck-scale theory like string theory.
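To see how restrictive eq. (7.3.13) is in practice, the following Python sketch evaluates the largest field range over which a single EFT stays valid, $\Lambda/g$, and the largest coupling compatible with a given excursion. All quantities are in units of $M_{\rm pl}$. The function names and the sample values ($\Lambda = 0.1$, $\Delta\phi = 10$) are hypothetical.

```python
def max_excursion(Lambda, g):
    """Largest field range Lambda / g before heavy masses shift by order one."""
    return Lambda / g

def max_coupling(Lambda, delta_phi):
    """Coupling at which delta_phi equals Lambda / g; eq. (7.3.13) needs g well below this."""
    return Lambda / delta_phi

# Hypothetical inputs in Planck units: cutoff at 0.1, excursion of 10.
Lambda = 0.1
g_bound = max_coupling(Lambda, delta_phi=10.0)
print(f"g must be well below {g_bound:.2g}")
print(f"with g = 1 the EFT only covers a range {max_excursion(Lambda, 1.0):.2g} M_pl")
```

For these inputs the inflaton must couple to the heavy fields with $g \ll 10^{-2}$, while an order-one coupling would restrict the excursion to a tenth of the Planck mass.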

Let us compare the Planck-scale sensitivity of small-field and large-field inflation:

(i) In small-field inflation we worry that dimension-six Planck-suppressed operators $\mathcal{O}_6/M_{\rm pl}^2$ lead to unacceptably large corrections to the inflaton mass, $\Delta\eta \sim 1$. However, operators of higher dimensions are harmless, e.g. $\mathcal{O}_7/M_{\rm pl}^3$ gives $\Delta\eta \sim \Delta\phi/M_{\rm pl} \ll 1$. The important corrections in small-field inflation are dimension-six and smaller, $\Delta \le 6$.

(ii) In contrast, for large-field inflation the terms $\mathcal{O}_\Delta/M_{\rm pl}^{\Delta-4}$ become larger for larger $\Delta$.

In case (i), we have a finite number of operators $\mathcal{O}_{\Delta\le 6}$ that contribute important corrections to the inflationary dynamics. We can therefore hope to enumerate all $\mathcal{O}_\Delta$ with $\Delta \le 6$ and balance

them against each other (fine-tuning). In case (ii), enumeration is not an option. One needs a

sufficiently powerful symmetry to protect the inflaton from all corrections.
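The contrast between the two cases can be made concrete with a simple scaling estimate. The Python sketch below assumes that an operator of dimension $\Delta$ contributes $\Delta\eta \sim c\,(\Delta\phi/M_{\rm pl})^{\Delta-6}$. This formula is our extrapolation of the two examples quoted above ($\Delta = 6$ giving $\Delta\eta \sim 1$ and $\Delta = 7$ giving $\Delta\eta \sim \Delta\phi/M_{\rm pl}$), and the function name and coefficient $c = 1$ are assumptions.

```python
def delta_eta_scaling(dim, delta_phi, M_pl=1.0, c=1.0):
    """Assumed scaling Delta eta ~ c (delta_phi / M_pl)^(dim - 6) for a dimension-dim operator."""
    return c * (delta_phi / M_pl) ** (dim - 6)

# Small-field (0.1 M_pl) versus large-field (10 M_pl) excursions.
for delta_phi in (0.1, 10.0):
    sizes = {dim: delta_eta_scaling(dim, delta_phi) for dim in (6, 7, 8, 10)}
    print(f"delta_phi = {delta_phi:g} M_pl:", sizes)
```

For the small excursion the contributions fall off with the operator dimension, while for the large excursion each additional power of the field makes the correction ten times larger, so no finite truncation is justified.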

7.4 Non-Gaussianity

It is well known that single-field slow-roll inflation produces Gaussian fluctuations. What kind of high-energy effects could deform slow-roll inflation in such a way as to produce large non-Gaussianity without disrupting the inflationary background solution? In effective field theory the effects of high-energy physics are encoded in high-dimension operators for the inflaton lagrangian.$^1$ Non-derivative operators such as $\phi^n/\Lambda^{n-4}$ form part of the inflaton potential and are therefore strongly constrained by the background. In other words, the existence of a slow-roll phase requires the non-Gaussianity associated with these operators to be small.$^2$ This naturally leads us to consider higher-derivative operators of the form $(\partial_\mu\phi)^{2n}/\Lambda^{4n-4}$. These operators don’t affect the background, but in principle they could lead to strong interactions. Let us consider the leading correction to the slow-roll lagrangian

L = Ls.r. +(∂µφ)4

8Λ4. (7.4.14)

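We split the inflaton field into background φ(t) and fluctuations ϕ(x, t). For $\dot\phi \ll \Lambda^2$, we can ignore the correction to the quadratic lagrangian for ϕ,

$$\mathcal{L}_2 \approx -\frac{1}{2}(\partial_\mu\varphi)^2 . \tag{7.4.15}$$

We get the cubic lagrangian for ϕ by evaluating one of the legs of the interaction $(\partial\phi)^4$ on the background $\dot\phi$,

$$\mathcal{L}_3 = -\frac{\dot\phi}{2\Lambda^4}\,\dot\varphi\,(\partial_\mu\varphi)^2 . \tag{7.4.16}$$

We estimate the size of the non-Gaussianity as follows,

$$f_{\rm NL} \sim \frac{1}{\zeta}\frac{\mathcal{L}_3}{\mathcal{L}_2} \sim \frac{1}{\zeta}\frac{\dot\phi\,\dot\varphi}{\Lambda^4} . \tag{7.4.17}$$

Using $\dot\varphi \sim H\varphi$ and $\zeta = \frac{H}{\dot\phi}\varphi$, we get

$$f_{\rm NL} \sim \frac{\dot\phi^2}{\Lambda^4} . \tag{7.4.18}$$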
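The chain of estimates leading to (7.4.18) can be checked with a few lines of Python. This is only a sketch of the scaling argument: the function names and the sample numbers are illustrative assumptions, and all O(1) factors are dropped exactly as in the text.

```python
import math

def f_nl_from_chain(phi_dot, cutoff, hubble, fluct):
    """Follow eq. (7.4.17) literally: f_NL ~ (1/zeta) * phi_dot * fluct_dot / Lambda^4.

    Assumptions (from the text): fluct_dot ~ H * fluct and zeta = (H / phi_dot) * fluct.
    """
    fluct_dot = hubble * fluct
    zeta = hubble * fluct / phi_dot
    return (1.0 / zeta) * phi_dot * fluct_dot / cutoff**4

def f_nl_estimate(phi_dot, cutoff):
    """Final estimate of eq. (7.4.18): f_NL ~ phi_dot^2 / Lambda^4."""
    return phi_dot**2 / cutoff**4

def derivative_expansion_ok(phi_dot, cutoff):
    """The truncation in eq. (7.4.14) is only trusted for phi_dot < Lambda^2."""
    return abs(phi_dot) < cutoff**2

if __name__ == "__main__":
    # Hypothetical values in units of the cutoff; H and the fluctuation amplitude drop out.
    print(f_nl_from_chain(0.3, 1.0, 1e-5, 1e-6), f_nl_estimate(0.3, 1.0))
```

The Hubble rate and the fluctuation amplitude cancel, so the estimate depends only on the ratio of the background velocity to the cutoff squared.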

We therefore find that we only get significant non-Gaussianity when $\dot\phi > \Lambda^2$, in which case we can’t trust our derivative expansion. In other words, for $\dot\phi > \Lambda^2$ there is no reason to truncate the expansion at finite order as we did in (7.4.14). Instead, operators of arbitrary dimensions become important in this limit,

$$P(X,\phi) = \sum_n c_n(\phi)\frac{X^n}{\Lambda^{4n-4}} , \quad {\rm where} \quad X \equiv -\frac{1}{2}(\partial_\mu\phi)^2 . \tag{7.4.19}$$

As an effective field theory, eq. (7.4.19) makes little sense when $X > \Lambda^4$. All of the coefficients $c_n$ are radiatively unstable. Hence, if we want to use a theory like (7.4.19) to generate large non-Gaussianity, we require a UV-completion. Interestingly, an example of such a UV-completion exists in string theory. In Dirac-Born-Infeld (DBI) inflation,

$$P(X,\phi) = \frac{\Lambda^4}{f(\phi)}\sqrt{1 - f(\phi)\frac{X}{\Lambda^4}} - V(\phi) , \tag{7.4.20}$$

the form of the action is protected by a higher-dimensional boost symmetry. This symmetry protects eq. (A.3.36) from radiative corrections and allows a predictive inflationary model with large non-Gaussianity. It would be interesting to explore if there are other examples of P(X) theories that are radiatively stable.

¹ In addition, there could be high-energy modifications to the vacuum state. These effects require a separate discussion.

² A possibility to get large non-Gaussianities is to have additional light fields (i.e. fields with mass smaller than H) during inflation. These new degrees of freedom are not constrained by the slow-roll requirements and if their fluctuations are somehow converted into curvature perturbations, these can be much less Gaussian than in single-field slow-roll inflation.

8 Supersymmetry and Inflation

8.1 Introduction

Why inflation and supersymmetry?

8.2 Facts about SUSY

Readers who don’t know about SUSY won’t learn it here.1 Instead, this section is just a quick

reminder of some essential facts about SUSY that we will need in our discussion of SUSY

inflation.

8.2.1 SUSY and Naturalness

Scalar fields play a prominent role in many cosmological theories. Since scalar field theories suffer from UV divergences, we have to worry about naturalness. Let us remind ourselves of the

hierarchy problem in particle physics and the role of SUSY in its resolution.2 This will serve as

a useful analogy for the corresponding problems in cosmology.

The standard model (SM) of particle physics is well-known to be unreasonably effective, since

it is in accord with all the experimental data. However, the consistency of the model relies on

the Higgs field having a vacuum expectation value (vev) of 246 GeV even though this is highly

unstable under quantum loop corrections. This instability can be seen by computing the loop

corrections to the Higgs mass term. The fact that these corrections diverge quadratically with

the high-energy cutoff is the signal that this instability is a severe problem. Much of the recent

interest in supersymmetry (SUSY) has been driven by the possibility that SUSY can cure this

instability.

The largest contribution to the Higgs mass correction in the SM of particle physics comes from the top quark loop. The top quark acquires a mass from the vev, 〈H〉 ≡ v, of the real, neutral component of the Higgs field H. Given the coupling of the Higgs to the top quark,

$$\mathcal{L}_{\rm Yukawa} = -\frac{y_t}{\sqrt{2}}\, H\, \bar t_L t_R + {\rm h.c.} , \tag{8.2.1}$$

where $t_L$ and $t_R$ are the left-handed and right-handed components of the top quark and $y_t$ is the top Yukawa coupling, we expand H around its vev, H = v + h, and find

$$m_t = \frac{y_t v}{\sqrt{2}} . \tag{8.2.2}$$

¹ A good starting point for learning about SUSY is: Stephen Martin, A Supersymmetry Primer (arXiv:hep-ph/9709356).

² See Terning, Modern Supersymmetry.

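Given the coupling in eq. (8.2.1), we can easily evaluate the top loop contribution to the Higgs mass

$$-i\delta m_h^2\big|_{\rm top} = (-1)N_c\int\frac{d^4k}{(2\pi)^4}\,{\rm Tr}\left[\frac{-iy_t}{\sqrt{2}}\frac{i}{\not{k}-m_t}\frac{-iy_t^*}{\sqrt{2}}\frac{i}{\not{k}-m_t}\right] = -2N_c|y_t|^2\int\frac{d^4k}{(2\pi)^4}\frac{k^2+m_t^2}{(k^2-m_t^2)^2} . \tag{8.2.3}$$

After a Wick rotation, $k_0 \to ik_4$ and $k^2 \to -k_E^2$, we can perform the angular integration and impose a hard momentum cutoff, $k_E^2 < \Lambda^2$. This yields

$$-i\delta m_h^2\big|_{\rm top} = \frac{iN_c|y_t|^2}{8\pi^2}\int_0^{\Lambda^2}dk_E^2\,\frac{k_E^2(k_E^2-m_t^2)}{(k_E^2+m_t^2)^2} . \tag{8.2.4}$$

Changing variables to $x = k_E^2 + m_t^2$ results in

$$\delta m_h^2\big|_{\rm top} = -\frac{N_c|y_t|^2}{8\pi^2}\int_{m_t^2}^{\Lambda^2+m_t^2}dx\left(1-\frac{3m_t^2}{x}+\frac{2m_t^4}{x^2}\right) = -\frac{N_c|y_t|^2}{8\pi^2}\left[\Lambda^2-3m_t^2\ln\left(\frac{\Lambda^2+m_t^2}{m_t^2}\right)+\cdots\right] , \tag{8.2.5}$$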

where · · · indicates finite terms in the limit Λ → ∞. So we find that there are quadratically

and logarithmically divergent corrections which (in the absence of a severe fine-tuning) push the

natural value of the Higgs mass term (and hence the Higgs VEV) up toward the cutoff. Another

way of saying this is that the SM can only be an effective field theory with a cutoff near 1 TeV,

and some new physics must come into play near the TeV scale which can stabilize the Higgs

vev. SUSY is (was?) the leading candidate for such new physics.

There is a simple way to stabilize the Higgs VEV by cancelling the divergent corrections to

the Higgs mass term. Suppose there are N new scalar particles φL and φR that are lighter than

a TeV with the following interactions:

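$$\mathcal{L}_{\rm scalar} = -\frac{\lambda}{2}h^2\left(|\phi_L|^2+|\phi_R|^2\right) - h\left(\mu_L|\phi_L|^2+\mu_R|\phi_R|^2\right) - m_L^2|\phi_L|^2 - m_R^2|\phi_R|^2 . \tag{8.2.6}$$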

The interactions in eq. (8.2.6) produce two new corrections to the Higgs mass term:

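$$\delta m_h^2\big|_1 = \frac{\lambda N}{16\pi^2}\left[2\Lambda^2 - m_L^2\ln\left(\frac{\Lambda^2+m_L^2}{m_L^2}\right) - m_R^2\ln\left(\frac{\Lambda^2+m_R^2}{m_R^2}\right)\right] , \tag{8.2.7}$$

and

$$\delta m_h^2\big|_2 = \frac{N}{16\pi^2}\left[-\mu_L^2\ln\left(\frac{\Lambda^2+m_L^2}{m_L^2}\right) - \mu_R^2\ln\left(\frac{\Lambda^2+m_R^2}{m_R^2}\right)\right] . \tag{8.2.8}$$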

Notice that if $N = N_c$ and $\lambda = |y_t|^2$ the quadratic divergences in eqs. (8.2.5) and (8.2.7) are canceled. If we also have $m_t = m_L = m_R$ and $\mu_L^2 = \mu_R^2 = 2\lambda m_t^2$, the logarithmic divergences in eqs. (8.2.5), (8.2.7) and (8.2.8) are canceled as well. SUSY is a symmetry between fermions and bosons that guarantees just these conditions. The cancellation of the logarithmic divergence is more than is needed to resolve the hierarchy problem; it is the consequence of powerful non-renormalization theorems.
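The cancellation can be verified numerically. The sketch below is an illustration rather than part of the original derivation: the function names are invented here, units are arbitrary, and the finite terms hidden in the dots of eq. (8.2.5) are kept explicitly using the closed form of the x-integral with upper limit Λ² + m_t².

```python
import math

def top_integral_numeric(cutoff2, mt2, steps=100_000):
    """Midpoint-rule value of the k_E^2 integral in eq. (8.2.4)."""
    h = cutoff2 / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        total += s * (s - mt2) / (s + mt2) ** 2
    return total * h

def top_integral_closed(cutoff2, mt2):
    """Closed form of the x-integral in eq. (8.2.5), including the finite terms."""
    upper = cutoff2 + mt2
    return cutoff2 - 3.0 * mt2 * math.log(upper / mt2) + 2.0 * mt2**2 * (1.0 / mt2 - 1.0 / upper)

def delta_m2_top(cutoff2, mt2, yt2, nc):
    """Top-loop correction to the Higgs mass term, eq. (8.2.5)."""
    return -nc * yt2 / (8.0 * math.pi**2) * top_integral_closed(cutoff2, mt2)

def delta_m2_scalars(cutoff2, ml2, mr2, mul2, mur2, lam, n):
    """Sum of the two scalar-loop corrections, eqs. (8.2.7) and (8.2.8)."""
    log_l = math.log((cutoff2 + ml2) / ml2)
    log_r = math.log((cutoff2 + mr2) / mr2)
    first = lam * n / (16.0 * math.pi**2) * (2.0 * cutoff2 - ml2 * log_l - mr2 * log_r)
    second = n / (16.0 * math.pi**2) * (-mul2 * log_l - mur2 * log_r)
    return first + second

def total_with_susy_relations(cutoff2, mt2, yt2, nc):
    """Impose N = N_c, lambda = |y_t|^2, equal masses and mu^2 = 2 lambda m_t^2."""
    mu2 = 2.0 * yt2 * mt2
    return delta_m2_top(cutoff2, mt2, yt2, nc) + delta_m2_scalars(
        cutoff2, mt2, mt2, mu2, mu2, yt2, nc)

if __name__ == "__main__":
    for cutoff2 in (1e2, 1e4, 1e6):
        print(cutoff2, delta_m2_top(cutoff2, 1.0, 1.0, 3),
              total_with_susy_relations(cutoff2, 1.0, 1.0, 3))
```

With the supersymmetric relations imposed, the sum stays finite as the cutoff is raised, while the top loop alone grows like Λ².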


8.2.2 Superspace and Superfields

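Points in superspace are labeled by the following coordinates:

$$x^\mu , \quad \theta^\alpha , \quad \bar\theta_{\dot\alpha} . \tag{8.2.9}$$

Here, $\theta^\alpha$ and $\bar\theta_{\dot\alpha}$ are constant complex anti-commuting two-component spinors with dimension [mass]$^{-1/2}$. In the superspace formalism, the components of a supermultiplet are united into a single superfield, which is a function of the superspace coordinates. Infinitesimal translations in superspace correspond to global supersymmetry transformations. Any superfield can be expanded in a power series in the anti-commuting variables³

$$S(x,\theta,\bar\theta) = a + \theta\psi + \bar\theta\bar\chi + \theta\theta M + \bar\theta\bar\theta N + \theta\sigma^\mu\bar\theta V_\mu + \theta\theta\bar\theta\bar\lambda + \bar\theta\bar\theta\theta\rho + \theta\theta\bar\theta\bar\theta D . \tag{8.2.10}$$

In these notes, we will only consider chiral superfields. These have the following component expansion

$$\Phi(y,\theta) = \phi(y) + \sqrt{2}\,\theta\psi(y) + \theta^2F(y) , \tag{8.2.11}$$

where

$$y^\mu \equiv x^\mu - i\theta\sigma^\mu\bar\theta . \tag{8.2.12}$$

Taylor expanding eq. (8.2.11) in the Grassmann variables $\theta$ and $\bar\theta$, we find

$$\Phi(x,\theta,\bar\theta) = \phi(x) - i\theta\sigma^\mu\bar\theta\,\partial_\mu\phi(x) - \frac{1}{4}\theta^2\bar\theta^2\partial^2\phi(x) + \sqrt{2}\,\theta\psi(x) + \frac{i}{\sqrt{2}}\theta^2\partial_\mu\psi(x)\sigma^\mu\bar\theta + \theta^2F(x) . \tag{8.2.13}$$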

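Under a supersymmetry transformation the chiral superfield transforms as

$$\delta\Phi = i(\epsilon Q + \bar\epsilon\bar Q)\Phi , \tag{8.2.14}$$

where

$$Q_\alpha \equiv -i\frac{\partial}{\partial\theta^\alpha} - (\sigma^\mu)_{\alpha\dot\beta}\,\bar\theta^{\dot\beta}\frac{\partial}{\partial x^\mu} , \tag{8.2.15}$$

$$\bar Q_{\dot\alpha} \equiv +i\frac{\partial}{\partial\bar\theta^{\dot\alpha}} + \theta^\beta(\sigma^\mu)_{\beta\dot\alpha}\frac{\partial}{\partial x^\mu} . \tag{8.2.16}$$

Exercise. Show that

$$\{Q_\alpha, \bar Q_{\dot\alpha}\} = 2(\sigma^\mu)_{\alpha\dot\alpha}P_\mu , \quad {\rm where} \quad P_\mu \equiv -i\partial_\mu . \tag{8.2.17}$$

Show that eq. (8.2.14) implies

$$\delta_\epsilon\phi = \epsilon\psi , \tag{8.2.18}$$

$$\delta_\epsilon\psi_\alpha = -i(\sigma^\mu\bar\epsilon)_\alpha\partial_\mu\phi + \epsilon_\alpha F , \tag{8.2.19}$$

$$\delta_\epsilon F = -i\bar\epsilon\bar\sigma^\mu\partial_\mu\psi . \tag{8.2.20}$$

We also have anti-chiral superfields

$$\bar\Phi(\bar y,\bar\theta) = \bar\phi(\bar y) + \sqrt{2}\,\bar\theta\bar\psi(\bar y) + \bar\theta^2\bar F(\bar y) . \tag{8.2.21}$$

³ Since there are two components of $\theta^\alpha$ and likewise for $\bar\theta_{\dot\alpha}$, the expansion always terminates, with each term containing at most two $\theta$’s and two $\bar\theta$’s.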


8.2.3 Supersymmetric Lagrangians

A key observation is that the integral of a general superfield over all of superspace is automatically invariant under supersymmetry transformations

$$\delta_\epsilon A = 0 , \quad {\rm for} \quad A = \int d^4x\int d^2\theta\, d^2\bar\theta\ S(x,\theta,\bar\theta) . \tag{8.2.22}$$

This follows immediately from the fact that $Q$ and $\bar Q$ in eqs. (8.2.15) and (8.2.16) are sums of total derivatives with respect to the superspace coordinates $x^\mu$, $\theta$, $\bar\theta$, so that $(\epsilon Q + \bar\epsilon\bar Q)S$ vanishes upon integration.

In the special case of a chiral superfield, the integral over half the superspace is supersymmetric

$$\delta_\epsilon B = 0 , \quad {\rm for} \quad B = \int d^4x\int d^2\theta\ \Phi(x,\theta,\bar\theta) . \tag{8.2.23}$$

To see this, we note that the F-term of a chiral superfield transforms into a total derivative, see eq. (8.2.20).

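Let us construct the supersymmetric Lagrangian for a chiral superfield Φ. Consider the composite superfield

$$\begin{aligned}
\Phi\bar\Phi = {} & \phi\bar\phi + \sqrt{2}\,\theta\psi\bar\phi + \sqrt{2}\,\bar\theta\bar\psi\phi + \theta\theta\,\bar\phi F + \bar\theta\bar\theta\,\phi\bar F + \theta\sigma^\mu\bar\theta\left[i\phi\partial_\mu\bar\phi - i\bar\phi\partial_\mu\phi - \psi\sigma_\mu\bar\psi\right] \\
& + \frac{i}{\sqrt{2}}\theta\theta\,\bar\theta\bar\sigma^\mu\left(\psi\partial_\mu\bar\phi - \partial_\mu\psi\,\bar\phi\right) + \sqrt{2}\,\theta\theta\,\bar\theta\bar\psi F + \frac{i}{\sqrt{2}}\bar\theta\bar\theta\,\theta\sigma^\mu\left(\bar\psi\partial_\mu\phi - \partial_\mu\bar\psi\,\phi\right) + \sqrt{2}\,\bar\theta\bar\theta\,\theta\psi\bar F \\
& + \theta\theta\bar\theta\bar\theta\left[F\bar F - \frac{1}{2}\partial_\mu\phi\,\partial^\mu\bar\phi + \frac{1}{4}\bar\phi\,\partial^2\phi + \frac{1}{4}\phi\,\partial^2\bar\phi + \frac{i}{2}\bar\psi\bar\sigma^\mu\partial_\mu\psi + \frac{i}{2}\psi\sigma^\mu\partial_\mu\bar\psi\right] ,
\end{aligned} \tag{8.2.24}$$

where all fields are evaluated at $x^\mu$. For the special case of a single chiral superfield, consider the integral over superspace

$$\int d^2\theta\, d^2\bar\theta\ \Phi\bar\Phi = -\partial_\mu\phi\,\partial^\mu\bar\phi + i\bar\psi\bar\sigma^\mu\partial_\mu\psi + F\bar F + \partial_\mu(\cdots) . \tag{8.2.25}$$

This is the Lagrangian density for the massless free Wess-Zumino model.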

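To obtain interactions and mass terms, we consider products of chiral superfields, such as $\Phi^2$ and $\Phi^3$. It is easy to see that any holomorphic function W(Φ) of a chiral superfield is also a chiral superfield. From this we can construct a supersymmetric Lagrangian by integrating over half the superspace

$$\int d^2\theta\ W(\Phi) + {\rm h.c.} = \partial_\phi W\,F - \frac{1}{2}\partial_\phi^2W\,\psi\psi - \partial_\mu(\cdots) + {\rm h.c.} , \tag{8.2.26}$$

where $\partial_\phi W \equiv \partial_\Phi W|_{\Phi=\phi}$ and $\partial_\phi^2W \equiv \partial_\Phi^2W|_{\Phi=\phi}$. We added the complex conjugate, $\int d^2\bar\theta\ \bar W(\bar\Phi)$, to make the term real. For this to correspond to a Lagrangian density, W(φ) must have dimension [mass]$^3$. The most general renormalizable superpotential is

$$W(\Phi) = \frac{1}{2}m\Phi^2 + \frac{1}{3}g\Phi^3 . \tag{8.2.27}$$

The total Lagrangian is

$$\mathcal{L} = \int d^4\theta\ \Phi\bar\Phi + \left(\int d^2\theta\ W(\Phi) + {\rm h.c.}\right) \tag{8.2.28}$$

$$\phantom{\mathcal{L}} = -\partial_\mu\phi\,\partial^\mu\bar\phi + i\bar\psi\bar\sigma^\mu\partial_\mu\psi + F\bar F + \left(\partial_\phi W\,F + {\rm h.c.}\right) - \frac{1}{2}\left(\partial_\phi^2W\,\psi\psi + {\rm h.c.}\right) . \tag{8.2.29}$$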


The part of the Lagrangian depending on the auxiliary field F takes the simple form

$$\mathcal{L}_{\rm aux} = F\bar F + \partial_\phi W\,F + \overline{\partial_\phi W}\,\bar F . \tag{8.2.30}$$

Notice that this is quadratic and without derivatives. This means that the field F does not propagate. Hence, we can solve the field equations for F,

$$\bar F = -\partial_\phi W , \tag{8.2.31}$$

and substitute the result back into the Lagrangian

$$\mathcal{L}_{\rm aux} = -|\partial_\phi W|^2 \equiv -V_F(\phi) . \tag{8.2.32}$$

We see that the superpotential W(Φ) leads to an (F-term) potential $V_F(\phi)$ for the scalar field φ. Notice that the scalar potential is positive semi-definite, $V_F \geq 0$.
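As a quick illustration of eq. (8.2.32), the following sketch evaluates the F-term potential for the renormalizable superpotential (8.2.27) and extracts the scalar mass at the origin by finite differences. The function names, the step size and the sample couplings are assumptions made for this example only.

```python
import math

def dW(phi, m, g):
    """dW/dphi for W = m phi^2 / 2 + g phi^3 / 3, eq. (8.2.27)."""
    return m * phi + g * phi**2

def d2W(phi, m, g):
    """Second derivative of W; at phi = 0 this is the fermion mass."""
    return m + 2.0 * g * phi

def f_term_potential(phi, m, g):
    """V_F = |dW/dphi|^2, eq. (8.2.32); phi may be complex."""
    return abs(dW(phi, m, g)) ** 2

def scalar_mass_squared(m, g, step=1e-3):
    """Finite-difference estimate of d^2 V / (d phi d phibar) at phi = 0.

    With phi = x + i y one has d_phi d_phibar = (d_x^2 + d_y^2) / 4.
    """
    v0 = f_term_potential(0.0, m, g)
    vxx = (f_term_potential(step, m, g) + f_term_potential(-step, m, g) - 2.0 * v0) / step**2
    vyy = (f_term_potential(1j * step, m, g) + f_term_potential(-1j * step, m, g) - 2.0 * v0) / step**2
    return 0.25 * (vxx + vyy)

if __name__ == "__main__":
    m, g = 2.0, 0.5  # hypothetical couplings
    print(scalar_mass_squared(m, g), d2W(0.0, m, g) ** 2)
```

For the kinetic term normalized as in eq. (8.2.29), the number returned by `scalar_mass_squared` is the scalar mass squared, which agrees with the square of the fermion mass W''(0) = m.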

Exercise. Using eq. (8.2.27) show that:

1) the mass of the scalar φ equals the mass of the spinor ψ (after all the theory is supersymmetric).

2) the coefficient g of the Yukawa coupling g(φψψ) also determines the scalar self-coupling $g^2|\phi|^4$.

Explain why this is the source of the “miraculous” cancellations in SUSY perturbation theory.

So far, we have only discussed renormalizable supersymmetric Lagrangians. As we have seen

in previous chapters, inflation is an effective theory and we are therefore interested in non-

renormalizable theories.

A non-renormalizable theory involving a set of chiral superfields $\Phi^i$ can be constructed as

$$\mathcal{L} = \int d^4\theta\ K(\Phi^i, \bar\Phi^{\bar j}) + \left(\int d^2\theta\ W(\Phi^i) + {\rm h.c.}\right) \tag{8.2.33}$$

• The superpotential W is an arbitrary holomorphic function of the chiral superfields treated as complex variables. It has dimension [mass]$^3$.

• The Kähler potential K is a function of both chiral and anti-chiral superfields. It is real, and has dimension [mass]$^2$.

The part of the Lagrangian coming from the superpotential is

$$\int d^2\theta\ W(\Phi^i) = W_iF^i - \frac{1}{2}W_{ij}\psi^i\psi^j , \tag{8.2.34}$$

where $W_i \equiv \partial_{\phi^i}W$ and $W_{ij} \equiv \partial_{\phi^i}\partial_{\phi^j}W$. After integrating out the auxiliary fields $F^i$, the part of the scalar potential coming from the superpotential is

$$V_F = K^{i\bar j}\,W_i\,\bar W_{\bar j} , \tag{8.2.35}$$

where $K^{i\bar j}$ is the inverse of the Kähler metric

$$K_{i\bar j} = \partial_{\phi^i}\partial_{\bar\phi^{\bar j}}K . \tag{8.2.36}$$

Exercise. Derive eq. (8.2.35).


8.2.4 Miraculous Cancellations

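Let us show explicitly how SUSY enforces cancellations between fermion and boson loops. As a concrete example, consider the following renormalizable theory, $K = \Phi\bar\Phi$ and $W = \frac{1}{2}m\Phi^2 + \frac{1}{3}g\Phi^3$, with Lagrangian

$$\mathcal{L} = \partial_\mu\bar\phi\,\partial^\mu\phi + i\bar\psi\bar\sigma^\mu\partial_\mu\psi - |m\phi + g\phi^2|^2 - \left(\tfrac{1}{2}m + g\phi\right)\psi\psi - \left(\tfrac{1}{2}m + g\bar\phi\right)\bar\psi\bar\psi . \tag{8.2.37}$$

Defining $\phi \equiv \frac{1}{\sqrt{2}}(A + iB)$ and $\Psi \equiv (\psi, \bar\psi)$, we can write this as

$$\begin{aligned}
\mathcal{L} = {} & \frac{1}{2}(\partial_\mu A)^2 - \frac{1}{2}m^2A^2 + \frac{1}{2}(\partial_\mu B)^2 - \frac{1}{2}m^2B^2 + \frac{1}{2}\bar\Psi(i\not{\partial} - m)\Psi \\
& - \frac{1}{\sqrt{2}}mgA(A^2 + B^2) - \frac{1}{4}g^2(A^4 + B^4 + 2A^2B^2) - \frac{1}{\sqrt{2}}g\bar\Psi(A - iB\gamma_5)\Psi .
\end{aligned} \tag{8.2.38}$$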

We can draw 5 one-loop corrections to the mass of A: 4 boson loops and 1 fermion loop. Using the usual Feynman rules for non-supersymmetric field theory, we find

$$(B1) + (B2) = 4g^2\int\frac{d^4k}{(2\pi)^4}\frac{1}{k^2 - m^2} , \tag{8.2.39}$$

$$(B3) + (B4) = 4g^2m^2\int\frac{d^4k}{(2\pi)^4}\frac{1}{(k^2 - m^2)((k-p)^2 - m^2)} , \tag{8.2.40}$$

and

$$\begin{aligned}
(F1) &= -\left(-\frac{ig}{\sqrt{2}}\right)^2 2\int\frac{d^4k}{(2\pi)^4}\,{\rm Tr}\left[\frac{i(\not{k} + m)}{k^2 - m^2}\frac{i(\not{k} - \not{p} + m)}{(k-p)^2 - m^2}\right] \\
&= -2g^2\left(\int\frac{d^4k}{(2\pi)^4}\frac{1}{k^2 - m^2} + \int\frac{d^4k}{(2\pi)^4}\frac{1}{(k-p)^2 - m^2} + \int\frac{d^4k}{(2\pi)^4}\frac{4m^2 - p^2}{(k^2 - m^2)((k-p)^2 - m^2)}\right) .
\end{aligned} \tag{8.2.41}$$

The total one-loop correction to the mass of A therefore is

$$2g^2\left(\int\frac{d^4k}{(2\pi)^4}\frac{1}{k^2 - m^2} - \int\frac{d^4k}{(2\pi)^4}\frac{1}{(k-p)^2 - m^2} + \int\frac{d^4k}{(2\pi)^4}\frac{p^2 - 2m^2}{(k^2 - m^2)((k-p)^2 - m^2)}\right) .$$

The signs here are crucial, arising from the relative sign between the bosonic and fermionic loops. The UV-divergences of the first two terms cancel, and the last term is only log-divergent

$$\int^\Lambda\frac{d^4k}{(2\pi)^4}\frac{1}{(k^2 - m^2)((k-p)^2 - m^2)} \approx \int_0^\Lambda\frac{2\pi^2k^3dk}{(2\pi)^4}\frac{1}{k^4} \sim \ln\Lambda . \tag{8.2.42}$$

In contrast, non-supersymmetric theories usually produce quadratic divergences

$$\int^\Lambda\frac{d^4k}{(2\pi)^4}\frac{1}{k^2 - m^2} \approx \int_0^\Lambda\frac{2\pi^2k^3dk}{(2\pi)^4}\frac{1}{k^2} \sim \Lambda^2 . \tag{8.2.43}$$

In SUSY theories quadratic divergences cancel because of boson-fermion degeneracy of the spectrum.


8.2.5 Non-Renormalization Theorem

We have just seen an example for the power of SUSY to protect a theory against radiative

corrections. Can this be generalized? How do theories with general K and W behave under

quantum corrections?

The answer is remarkable:

• K gets corrections order-by-order in perturbation theory.

• W is not renormalized in perturbation theory.

In this section, we will follow Seiberg and prove the non-renormalization theorem for the su-

perpotential. This will be a very slick argument, using only symmetries and holomorphy of the

superpotential.

Consider a single chiral superfield Φ, with superpotential $W_{\rm tree}(\Phi)$. As our canonical example, we will again use the Wess-Zumino model, $W_{\rm tree}(\Phi) = \frac{1}{2}m\Phi^2 + \frac{1}{3}g\Phi^3$. For m = g = 0 the theory has a global U(1)×U(1)$_R$ symmetry. The R-symmetry acts as a phase factor on the superspace coordinate θ, i.e. $\theta \to e^{-i\alpha}\theta$. By convention R[θ] = 1. Since Grassmann integrals define $\int d^2\theta\,\theta^2$ as a pure number (= 1), we also have R[d²θ] = −2. Thus U(1)$_R$ is a symmetry if R[W] = 2. Assign R[Φ] = 1, so the free theory (g = 0) preserves an R-symmetry. Treat the coupling g as a background field (spurion field) that has charge R[g] = −1, so that the R-symmetry is preserved by the interaction. An ordinary U(1) symmetry does not act on θ, so the superpotential W is neutral under such a symmetry. The following table summarizes the symmetries of the WZ-model:

        U(1)    U(1)_R
  Φ      1        1
  m     −2        0
  g     −3       −1

To get an effective theory valid below some scale µ, we integrate out modes from Λ down

to µ. In practice, you would start with Wtree at the scale Λ, expand in Fourier modes, and

carry out the path integral over modes above µ (but below Λ). The result can be assembled

into an effective superpotential Weff that depends only on modes below µ. The claim is that

Weff(Φ) = Wtree(Φ).

We will exploit symmetries to prove this result without any work. The superpotential must have R-charge 2 and vanishing U(1) charge. For example, the term $m\Phi^2$ has those charges. Moreover, the combination gΦ/m is neutral under both U(1)’s. The most general form of the effective superpotential therefore is

$$W_{\rm eff} = m\Phi^2\,h\!\left(\frac{g\Phi}{m}\right) , \tag{8.2.44}$$

where h is an unknown holomorphic function. Consider a power series expansion of the effective superpotential

$$W_{\rm eff} = m\Phi^2\,h\!\left(\frac{g\Phi}{m}\right) = \sum_n a_n\,g^n m^{1-n}\,\Phi^{n+2} . \tag{8.2.45}$$

In the weak coupling limit, g → 0, this must just give the mass term since in that case there are no interactions. Hence, we find n ≥ 0. Next, we have to ensure that the massless limit, m → 0, is sensible. This requires n ≤ 1. Hence, we have found that only n = 0 and n = 1 are allowed. Renaming the coefficients, we find

$$W_{\rm eff} = \frac{1}{2}m\Phi^2 + \frac{1}{3}g\Phi^3 = W_{\rm tree} . \tag{8.2.46}$$

The superpotential is not renormalized!

Exercise. Generalize the proof to arbitrary Wtree.

We have therefore reached the important conclusion that holomorphic couplings in the super-

potential are not renormalized. Quantum effects in chiral field theory are completely captured

by the renormalization of the Kahler potential (wave function renormalization).
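The charge bookkeeping behind this argument is easy to automate. The sketch below assumes the charge assignments stated in the text (R[Φ] = 1, R[m] = 0, R[g] = −1, and U(1) charges 1, −2, −3); the dictionary layout and function names are illustrative choices. It checks that every term $g^n m^{1-n}\Phi^{n+2}$ of eq. (8.2.45) carries the charges required of a superpotential, and that regularity in the limits g → 0 and m → 0 leaves only n = 0 and n = 1.

```python
# (U(1) charge, U(1)_R charge) of the field and the spurion couplings.
CHARGES = {"Phi": (1, 1), "m": (-2, 0), "g": (-3, -1)}

def term_charges(n):
    """Total charges of the monomial g^n m^(1-n) Phi^(n+2)."""
    powers = {"g": n, "m": 1 - n, "Phi": n + 2}
    u1 = sum(powers[name] * CHARGES[name][0] for name in powers)
    r = sum(powers[name] * CHARGES[name][1] for name in powers)
    return u1, r

def allowed_powers(candidates):
    """Keep terms that are regular both as g -> 0 and as m -> 0."""
    return [n for n in candidates if n >= 0 and 1 - n >= 0]

if __name__ == "__main__":
    for n in range(-3, 5):
        print(n, term_charges(n))
    print(allowed_powers(range(-3, 5)))
```

Symmetry alone allows every power n; it is the two limits that cut the series down to the tree-level terms.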

8.2.6 Supersymmetry Breaking

SUSY is broken when the vacuum state |0〉 is not invariant under SUSY transformations

$$Q_\alpha|0\rangle \neq 0 \quad {\rm and} \quad \bar Q_{\dot\alpha}|0\rangle \neq 0 . \tag{8.2.47}$$

In global SUSY, the Hamiltonian operator H is related to the SUSY generators through the SUSY algebra

$$H = P^0 = \frac{1}{4}\left(Q_1\bar Q_{\dot 1} + \bar Q_{\dot 1}Q_1 + Q_2\bar Q_{\dot 2} + \bar Q_{\dot 2}Q_2\right) . \tag{8.2.48}$$

If SUSY is unbroken in the vacuum state, it follows that H|0〉 = 0 and the vacuum has zero energy. Conversely, if SUSY is spontaneously broken in the vacuum state, then the vacuum state must have positive energy, since

$$\langle 0|H|0\rangle = \frac{1}{4}\left(\|\bar Q_{\dot 1}|0\rangle\|^2 + \|Q_1|0\rangle\|^2 + \|\bar Q_{\dot 2}|0\rangle\|^2 + \|Q_2|0\rangle\|^2\right) > 0 , \tag{8.2.49}$$

if the Hilbert space is to have positive norm.

F-term Breaking

If spacetime-dependent effects and fermion condensates can be neglected, then 〈0|H|0〉 = 〈0|V|0〉, where

$$V = F_i\bar F_i . \tag{8.2.50}$$

In general the scalar potential will also have a D-term contribution, $V = \frac{1}{2}D^aD^a$. We have ignored this in our enormously simplified discussion of SUSY. From eq. (8.2.50) we see that if the equations $F_i = \partial_{\phi_i}W = 0$ (and $D^a = 0$) can’t be solved simultaneously, then SUSY is spontaneously broken.

O’Raifeartaigh Model

Consider a theory with three chiral superfields $\Phi_{1,2,3}$ and superpotential

$$W = -k\Phi_1 + m\Phi_2\Phi_3 + \frac{1}{2}y\Phi_1\Phi_3^2 , \tag{8.2.51}$$

where without loss of generality we can choose k, m and y to be real and positive (by phase rotations of the fields). Note that W contains a linear term. Such a term is required to achieve F-term breaking at tree level in a renormalizable superpotential, since otherwise setting all $\phi_i = 0$ will always give a supersymmetric global minimum with all $F_i = 0$. The linear term is allowed if the corresponding chiral supermultiplet is a gauge singlet. The scalar potential (8.2.50) becomes

$$V = |F_1|^2 + |F_2|^2 + |F_3|^2 , \tag{8.2.52}$$

where

$$F_1 = k - \frac{1}{2}y\phi_3^2 , \tag{8.2.53}$$

$$F_2 = -m\phi_3 , \tag{8.2.54}$$

$$F_3 = -m\phi_2 - y\phi_1\phi_3 . \tag{8.2.55}$$

Clearly, $F_1 = 0$ and $F_2 = 0$ are not compatible, so SUSY must be broken. If $m^2 > yk$, then the absolute minimum of the classical potential is at $\phi_2 = \phi_3 = 0$ with $\phi_1$ undetermined, so $F_1 = k$ and $V = k^2$ at the minimum. The fact that $\phi_1$ is undetermined at tree level is an example of a “flat direction” in the scalar potential; this is a common feature of supersymmetric models. However, as usual, the flat direction parameterized by $\phi_1$ is an accidental feature of the classical scalar potential and doesn’t survive quantum corrections, i.e. loop corrections will lift the potential.
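The statements about the O’Raifeartaigh potential can be probed numerically. The following is a minimal sketch with invented function names and hypothetical parameter values satisfying m² > yk; it evaluates eqs. (8.2.52) to (8.2.55) for complex field values and samples the potential at random points.

```python
import random

def f_terms(phi1, phi2, phi3, k, m, y):
    """F-terms of the O'Raifeartaigh model, eqs. (8.2.53)-(8.2.55)."""
    f1 = k - 0.5 * y * phi3**2
    f2 = -m * phi3
    f3 = -m * phi2 - y * phi1 * phi3
    return f1, f2, f3

def potential(phi1, phi2, phi3, k, m, y):
    """Tree-level scalar potential V = sum_i |F_i|^2, eq. (8.2.52)."""
    return sum(abs(f) ** 2 for f in f_terms(phi1, phi2, phi3, k, m, y))

def random_complex(rng, scale):
    return complex(rng.uniform(-scale, scale), rng.uniform(-scale, scale))

def minimum_over_samples(k, m, y, samples=20_000, scale=5.0, seed=0):
    """Smallest value of V found among random complex field configurations."""
    rng = random.Random(seed)
    return min(
        potential(random_complex(rng, scale), random_complex(rng, scale),
                  random_complex(rng, scale), k, m, y)
        for _ in range(samples))

if __name__ == "__main__":
    k, m, y = 1.0, 2.0, 1.5  # hypothetical values with m^2 > y k
    print(potential(3.7 + 1.2j, 0.0, 0.0, k, m, y))  # flat direction: V = k^2
    print(minimum_over_samples(k, m, y))             # never drops below k^2
```

Along $\phi_2 = \phi_3 = 0$ the potential equals k² for any value of $\phi_1$, which is the tree-level flat direction; no sampled point lies below that value.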

8.2.7 Supergravity

Gravity exists, so if supersymmetry is realized in Nature, it has to be a local symmetry. Gauging

global SUSY leads to supergravity (SUGRA). This is a huge and complicated topic. Learning

the full machinery of SUGRA would keep us busy for a while. We will save our energy for

applications of SUSY to inflation. The only result from SUGRA that we will need for that

discussion is the F-term potential,

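$$V_F = e^{K/M_{\rm pl}^2}\left[K^{i\bar j}D_iW\,\overline{D_jW} - \frac{3}{M_{\rm pl}^2}|W|^2\right] , \tag{8.2.56}$$

where $D_iW = \partial_iW + \frac{1}{M_{\rm pl}^2}(\partial_iK)W$ is called the Kähler covariant derivative of W.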

Exercise. Show that eq. (8.2.56) reduces to eq. (8.2.35) in the limit Mpl →∞.

8.2.8 Further Reading

8.3 SUSY Inflation: Generalities

8.3.1 The Supergravity Eta-Problem

An important instance of the eta problem arises in locally-supersymmetric theories, i.e. in supergravity.⁴ In N = 1 supergravity, a key term in the scalar potential is the F-term potential,

$$V_F = e^{K/M_{\rm pl}^2}\left[K^{\varphi\bar\varphi}D_\varphi W\,\overline{D_\varphi W} - \frac{3}{M_{\rm pl}^2}|W|^2\right] , \tag{8.3.57}$$

where $K(\varphi,\bar\varphi)$ and $W(\varphi)$ are the Kähler potential and the superpotential, respectively; ϕ is a complex scalar field which is taken to be the inflaton; and we have defined $D_\varphi W \equiv \partial_\varphi W + M_{\rm pl}^{-2}(\partial_\varphi K)W$. For simplicity of presentation, we have assumed that there are no other light degrees of freedom, but generalizing our expressions to include other fields is straightforward.

⁴ This case is relevant for many string theory models of inflation because four-dimensional supergravity is the low-energy effective theory of supersymmetric string compactifications.

The Kähler potential determines the inflaton kinetic term, $-K_{,\varphi\bar\varphi}\,\partial_\mu\varphi\,\partial^\mu\bar\varphi$, while the superpotential determines the interactions. To derive the inflaton mass, we expand K around some chosen origin, which we denote by ϕ ≡ 0 without loss of generality, i.e. $K(\varphi,\bar\varphi) = K_0 + K_{,\varphi\bar\varphi}|_0\,\varphi\bar\varphi + \cdots$. The inflationary Lagrangian then becomes

$$\mathcal{L} \approx -K_{,\varphi\bar\varphi}\,\partial\varphi\,\partial\bar\varphi - V_0\left(1 + K_{,\varphi\bar\varphi}|_0\,\frac{\varphi\bar\varphi}{M_{\rm pl}^2} + \ldots\right) \tag{8.3.58}$$

$$\phantom{\mathcal{L}} \equiv -\partial_\mu\phi\,\partial^\mu\bar\phi - V_0\left(1 + \frac{\phi\bar\phi}{M_{\rm pl}^2}\right) + \ldots , \tag{8.3.59}$$

where we have defined the canonical inflaton field $\phi\bar\phi \approx K_{\varphi\bar\varphi}|_0\,\varphi\bar\varphi$ and $V_0 \equiv V_F|_{\varphi=0}$. We have retained the leading correction to the potential originating in the expansion of $e^{K/M_{\rm pl}^2}$ in eq. (8.3.57), which could plausibly be called a universal correction in F-term scenarios. The omitted terms, some of which can be of the same order as the terms we keep, arise from expanding $\left[K^{\varphi\bar\varphi}D_\varphi W\,\overline{D_\varphi W} - \frac{3}{M_{\rm pl}^2}|W|^2\right]$ in eq. (8.3.57) and clearly depend on the model-dependent structure of the Kähler potential and the superpotential. The universal correction results in a large, model-independent contribution to the eta parameter

$$\Delta\eta = 1 , \tag{8.3.60}$$

as well as a model-dependent contribution which is typically of the same order. It is therefore clear that in an inflationary scenario driven by an F-term potential, eta will generically be of order unity.

Under what circumstances can inflation still occur, in a model based on a supersymmetric

Lagrangian? One obvious possibility is that the model-dependent contributions to eta approxi-

mately cancel the model-independent contribution, so that the smallness of the inflaton mass is

a result of fine-tuning. Clearly, it would be far more satisfying to exhibit a mechanism that removes the eta problem by ensuring that Δη ≪ 1. This requires either that the F-term potential

is negligible, or that the inflaton does not appear in the F-term potential. The first case does not

often arise, because F-term potentials play an important role in presently-understood models

for stabilization of the compact dimensions of string theory. However, in the next section, we

will present a scenario in which the inflaton is a Goldstone boson and does not appear in the

Kahler potential, or in the F-term potential, to any order in perturbation theory. This evades

the particular incarnation of the eta problem that we have described above.
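The size of the universal contribution (8.3.60) can be reproduced with a short numerical check. The sketch assumes the potential of eq. (8.3.59) and writes the canonical complex field as φ = (a + ib)/√2 with a, b real canonically normalized fields; this decomposition, the function names and the sample value of V₀ are assumptions of the example.

```python
import math

def potential(a, b, v0, mpl):
    """V = V0 (1 + |phi|^2 / Mpl^2) with phi = (a + i b) / sqrt(2), cf. eq. (8.3.59)."""
    return v0 * (1.0 + 0.5 * (a * a + b * b) / mpl**2)

def eta_at_origin(v0, mpl, step=1e-2):
    """eta = Mpl^2 V'' / V along the real direction a, by central differences."""
    v_mid = potential(0.0, 0.0, v0, mpl)
    v_pp = (potential(step, 0.0, v0, mpl) + potential(-step, 0.0, v0, mpl) - 2.0 * v_mid) / step**2
    return mpl**2 * v_pp / v_mid

def mass_squared_over_hubble_squared(v0, mpl):
    """m^2 / H^2 with m^2 = V0 / Mpl^2 and H^2 = V0 / (3 Mpl^2)."""
    return (v0 / mpl**2) / (v0 / (3.0 * mpl**2))

if __name__ == "__main__":
    print(eta_at_origin(1e-10, 1.0), mass_squared_over_hubble_squared(1e-10, 1.0))
```

The result is independent of V₀: the induced mass is always of order the Hubble scale, m² = 3H², which is the statement Δη = 1.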

8.3.2 Goldstone Bosons in Supergravity

A promising approach to realize a technically natural small value for η is to make the inflaton a Goldstone boson with small mass protected by a shift symmetry. Consider, for example, a superpotential which spontaneously breaks a global U(1) symmetry

$$W = S(\Phi\bar\Phi - f^2) , \tag{8.3.61}$$

where $\Phi$ and $\bar\Phi$ are two independent chiral superfields whose bottom components are the scalar fields $\phi$ and $\bar\phi$. Let the expectation values of the fields be

$$\Phi = fe^{\theta/f} \quad {\rm and} \quad \bar\Phi = fe^{-\theta/f} , \tag{8.3.62}$$

where θ = ρ + iϕ is a complex scalar field.⁵ The canonical Kähler potential then becomes

$$K = \Phi^\dagger\Phi + \bar\Phi^\dagger\bar\Phi = 2f^2\cosh\left(\frac{\theta + \theta^\dagger}{f}\right) , \tag{8.3.63}$$

and the supergravity potential for θ is

$$V = \exp\left[\frac{2f^2}{M_{\rm pl}^2}\cosh\frac{\theta + \theta^\dagger}{f}\right]\left[\sigma^4 + \ldots\right] . \tag{8.3.64}$$

We notice that only the real part of θ acquires a mass; the shift symmetry of the Goldstone boson is protecting the imaginary component.⁶

8.4 SUSY Inflation: A Case Study

8.4.1 Hybrid Inflation and Naturalness

To illustrate the role of SUSY in inflationary model-building, we consider hybrid inflation as an example. Recall the basic elements of hybrid models: i) an inflaton φ with potential V(φ) that vanishes at the origin, V(φ = 0) = 0; ii) a waterfall field ψ with symmetry breaking potential

$$\kappa(v^2 - \psi^2)^2 = V_0 - m_\psi^2\psi^2 + \cdots , \tag{8.4.65}$$

where $V_0 \equiv \kappa v^4$ and $m_\psi^2 \equiv 2\kappa v^2$; iii) a coupling between the inflaton field φ and the waterfall field ψ that controls the end of inflation and removes the vacuum energy.

For concreteness, consider the following inflaton-waterfall coupling

$$\lambda\phi^2\psi^2 . \tag{8.4.66}$$

This generates a loop correction to the inflaton mass

$$\delta m_\phi^2 \sim \frac{\lambda}{16\pi^2}\Lambda_{\rm uv}^2 , \tag{8.4.67}$$

where $\Lambda_{\rm uv}$ is the cutoff of the loop integral. In order for φ to act as a switch on the waterfall field, we require

$$\lambda\phi_i^2 > m_\psi^2 , \tag{8.4.68}$$

where $\phi_i$ is the initial value of the inflaton. This implies a minimal value for the natural size of the inflaton mass

$$m_\phi^2 > \frac{1}{16\pi^2}\frac{\Lambda_{\rm uv}^2m_\psi^2}{\phi_i^2} . \tag{8.4.69}$$

The hybrid mechanism requires the hierarchy $m_\phi^2 \ll m_\psi^2$. This puts a strong upper limit on the cutoff $\Lambda_{\rm uv}$. Supersymmetry is one of the few ways we know to achieve a low cutoff in a controlled way. Even with SUSY, we may (at best) cut off the loop integral at $\Lambda_{\rm uv}^2 \sim m_\psi^2$. In that case, we get

$$m_\phi^2 > \frac{1}{16\pi^2}\frac{m_\psi^4}{\phi_i^2} . \tag{8.4.70}$$

⁵ In the following we use θ both for the chiral superfield and its bottom component. Which is meant should be clear from the context.

⁶ This looks like a nice solution to the eta problem; however, it assumes that shift symmetry breaking contributions in the UV are small, i.e. we have to assume that there are no non-trivial corrections to (8.3.63). However, generic UV-completions are expected to break continuous global symmetries, so symmetries of the Kähler potential are not believed to persist beyond leading order. For the moment, we will assume that these effects are small, but we will return to this point later.

We have to ensure that this lower limit on the inflaton mass is consistent with the upper limit coming from smallness of the η-parameter

$$m_\phi^2 \ll \eta H^2 \sim \eta\frac{V_0}{M_{\rm pl}^2} . \tag{8.4.71}$$

Using $\phi_i \ll M_{\rm pl}$, we find consistency only if $m_\psi^4 \ll V_0$, i.e. the waterfall field has to be light compared to the scale of the total vacuum energy it controls. A naturally small mass for ψ again requires some symmetry explanation. In contrast with the slow-roll field, SUSY alone can protect the lightness of the waterfall.

We see that naturalness puts very specific requirements on the structures for hybrid models.

In the next section, we present an example in the context of SUSY.

8.4.2 SUSY Pseudo-Natural Inflation

For purposes of illustration, we will consider the specific supergravity model of Arkani-Hamed et al.⁷ (see also Kaplan and Weiner⁸).

The superpotential is

$$W = \lambda_0S(\Phi\bar\Phi - f^2) + \frac{\lambda_1}{2}(\Phi + \bar\Phi)\Psi^2 + \lambda_2X(\Psi^2 - v^2) , \tag{8.4.72}$$

where $\lambda_1^2f^2 > 2\lambda_2^2v^2$ and

$$\Phi \equiv (f + \rho)e^{i\varphi/f} , \tag{8.4.73}$$

$$\bar\Phi \equiv (f - \rho)e^{-i\varphi/f} . \tag{8.4.74}$$

The first term in W preserves a U(1) symmetry which is spontaneously broken. As before, the Goldstone boson ϕ associated with the broken symmetry will be the inflaton. Without loss of generality, we assume that the flat modulus ρ is stabilized at ρ ≡ 0 after supersymmetry breaking. The second term in W breaks the U(1) explicitly and gives the Goldstone mode a potential. The field Ψ is the standard waterfall field of hybrid models of inflation. During inflation it is stabilized at Ψ = 0. Finally, the last term in W includes the field X whose F-term dominates the inflationary potential energy, $V_0 \approx |F_X|^2 = \lambda_2^2v^4$. The Kähler potential takes the same form as in (8.3.63). In particular, it respects the U(1) symmetry. Given this input (and for now assuming no other contributions to W and K), the inflationary potential receives two main contributions:

i) a loop-suppressed supergravity coupling

$$\delta K = \frac{\tilde\lambda_1^2}{16\pi^2}(\Phi^\dagger\bar\Phi + {\rm h.c.}) \quad \Rightarrow \quad V_1 = V_0\left(1 - \frac{\tilde\lambda_1^2}{4\pi^2}\frac{f^2}{M_{\rm pl}^2}\sin^2\frac{\varphi}{f}\right) , \tag{8.4.75}$$

where $\tilde\lambda_1^2 \equiv \lambda_1^2\log(\Lambda/f)$ and we dropped a small constant term, $V_0\left(1 + \frac{\tilde\lambda_1^2}{8\pi^2}\frac{f^2}{M_{\rm pl}^2}\right) \approx V_0$.

⁷ Arkani-Hamed et al., Pseudo-Natural Inflation (arXiv:hep-th/0302034).

⁸ Kaplan and Weiner (arXiv:hep-ph/0302014).


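ii) a one-loop Coleman-Weinberg contribution

$$V_2 = V_0\frac{\lambda_2^2}{4\pi^2}\log\left(\frac{\lambda_1\cos(\varphi/f)}{\mu/f}\right) , \tag{8.4.76}$$

where µ is the renormalization scale.

The complete inflaton potential hence is

$$V = V_0\left(1 - \frac{\tilde\lambda_1^2}{4\pi^2}\frac{f^2}{M_{\rm pl}^2}\sin^2(\varphi/f) + \frac{\lambda_2^2}{4\pi^2}\log\big(\cos(\varphi/f)\big)\right) , \tag{8.4.77}$$

where we have absorbed small constants into $V_0$. Small ε and η can be achieved with $\tilde\lambda_1 \lesssim 1$, $\lambda_2 \ll 1$ and $f \ll M_{\rm pl}$. This is easily seen from (8.4.77) for the regime ϕ ≪ f: in this case we find

$$\eta \simeq -\frac{\tilde\lambda_1^2}{2\pi^2} - \frac{\lambda_2^2}{4\pi^2}\frac{M_{\rm pl}^2}{f^2} \quad {\rm and} \quad \epsilon \simeq \eta^2\frac{\varphi^2}{M_{\rm pl}^2} , \tag{8.4.78}$$

and inflation with $|\eta| \lesssim 10^{-2}$ therefore requires

$$\tilde\lambda_1 \lesssim 1 \quad {\rm and} \quad \lambda_2 \lesssim \frac{f}{M_{\rm pl}} \ll 1 . \tag{8.4.79}$$

Supersymmetry makes the small value of $\lambda_2$ technically natural.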
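The small-field expressions for η and ε can be compared with a direct numerical evaluation of the potential (8.4.77). The sketch below assumes the standard potential slow-roll definitions ε = ½M²pl(V′/V)² and η = M²pl V″/V; with that normalization the small-field limit gives ε ≈ ½η²ϕ²/M²pl, i.e. the scaling quoted in (8.4.78) up to an O(1) factor. Function names and parameter values are hypothetical.

```python
import math

def potential(phi, f, lam1_tilde, lam2, v0=1.0, mpl=1.0):
    """Inflaton potential of eq. (8.4.77); valid for |phi / f| < pi / 2."""
    x = phi / f
    sugra = lam1_tilde**2 / (4.0 * math.pi**2) * (f / mpl) ** 2 * math.sin(x) ** 2
    loop = lam2**2 / (4.0 * math.pi**2) * math.log(math.cos(x))
    return v0 * (1.0 - sugra + loop)

def slow_roll(phi, f, lam1_tilde, lam2, mpl=1.0, step=1e-4):
    """Return (epsilon, eta) from central finite differences of the potential."""
    v_plus = potential(phi + step, f, lam1_tilde, lam2, mpl=mpl)
    v_minus = potential(phi - step, f, lam1_tilde, lam2, mpl=mpl)
    v_mid = potential(phi, f, lam1_tilde, lam2, mpl=mpl)
    d1 = (v_plus - v_minus) / (2.0 * step)
    d2 = (v_plus + v_minus - 2.0 * v_mid) / step**2
    return 0.5 * mpl**2 * (d1 / v_mid) ** 2, mpl**2 * d2 / v_mid

def eta_small_field(f, lam1_tilde, lam2, mpl=1.0):
    """Small-field expression for eta in eq. (8.4.78)."""
    return -lam1_tilde**2 / (2.0 * math.pi**2) - lam2**2 / (4.0 * math.pi**2) * (mpl / f) ** 2

if __name__ == "__main__":
    f, lam1_tilde, lam2, phi = 0.1, 0.3, 0.01, 1e-3  # units of Mpl; illustrative only
    print(slow_roll(phi, f, lam1_tilde, lam2), eta_small_field(f, lam1_tilde, lam2))
```

For these sample values both terms in η are below the 10⁻² level, and raising λ₂ above f/Mpl quickly spoils this, in line with (8.4.79).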

8.4.3 UV Sensitivity

8.5 Signatures of SUSY Inflation

8.5.1 Hubble-Mass Scalars

8.5.2 The Squeezed Limit

8.5.3 Scale-Dependent Halo Bias

8.6 Conclusions

9 String Theory and Inflation

9.1 Introduction

When I find the energy I will add content to this chapter.

9.2 Elements of String Theory

9.2.1 Fields and Effective Actions

9.2.2 String Compactifications

9.3 Warped D-brane Inflation

9.4 Axion Monodromy Inflation

9.5 Conclusions


Outlook


Part III

Supplementary Material


A The Effective Theory of Single-Field Inflation

A.1 Introduction

The standard approach to study inflation is to assume the existence of a fundamental scalar field – the inflaton – and postulate a specific form for its action. Given this action one finds a background that describes an accelerating spacetime, $|\dot H| \ll H^2$. Perturbing around this background gives the action for fluctuations. In this note I want to describe an interesting alternative approach in which the most general effective action for the fluctuations is written down directly¹, without model-dependent assumptions about the microscopic physics sourcing the background. Instead, the inflationary quasi-de Sitter background H(t) is assumed as given and the focus is on small fluctuations around this background. This formalism is particularly powerful in the study of non-Gaussianity.

By definition, $|\dot H| \ll H^2$, inflation implies approximate time-translation invariance of the background. However, inflation has to end, so the symmetry has to be spontaneously broken. As in gauge theory, there is a Goldstone boson associated with the symmetry breaking. In the case of inflation, this field characterizes fluctuations in the ‘clock’ measuring time during inflation

$$\pi \sim \delta t \sim \frac{\delta\phi}{\dot\phi} . \tag{A.1.1}$$

The final equality is for the case of a fundamental scalar field φ, but we emphasize that the description in terms of π is more general than that. The Goldstone boson is related by a simple rescaling to the comoving curvature perturbation ζ (and hence to cosmological observables such as the CMB temperature fluctuations),

$$\zeta \sim \frac{\delta a}{a} \sim H\pi . \tag{A.1.2}$$

Being a Goldstone boson, the action for π is constrained by the symmetries of the background. The construction of the low-energy effective action for this Goldstone degree of freedom is the fundamental objective of these notes.

Before we embark on our journey to the effective theory of inflation, let me summarize what

awaits us at the promised land:

- First and foremost, effective field theory is the central organizing principle of theoretical

physics. Applying it to inflation will allow us to systematically classify all models of

inflation (as long as they are characterized by a single clock). In a systematic way the

effective field theory explores the full range of possibilities.

¹ Cheung et al., The Effective Theory of Inflation.

- The effective theory doesn’t commit to a specific microscopic realization of the physics

of inflation. In particular, the effective theory of inflation allows cosmologists to stop

apologizing for using scalar fields to describe inflation. It shows that the basic predictions

of inflation don’t rely on that assumption. It never mattered what was creating the

background. The fluctuations are scalars because of Goldstone’s theorem.

- Describing inflationary fluctuations in terms of the Goldstone degrees of freedom will

help greatly in identifying the most relevant low-energy degrees of freedom. For instance,

the Goldstone interpretation will allow a systematic decoupling of metric perturbations.

Studying the Goldstone fluctuations in the unperturbed background allows the most direct

way to obtain the leading order results. In alternative formulations of inflation this physical

decoupling property is often much less manifest. All of this is summarized by Weinberg’s

First Law of Theoretical Physics: “You can use any variables you like to analyse a problem,

but if you use the wrong variables you’ll be sorry”. For many applications Goldstone bosons

are simply the right description of the physics.

- The Goldstone picture will make it clear what is dictated by the underlying symmetries of

the de Sitter background (and hence model-insensitive) and what is not (and hence allows

us to distinguish between different models).

- The symmetries of the background are non-linearly realized in the effective theory. This

will lead to important correlations between different orders in the perturbation expansion.

For instance, it shows that a small speed of sound (a term in the quadratic lagrangian) is

related by symmetry to large interactions (a term in the cubic lagrangian).

- Physical energy scales are readily identified in the Goldstone language. This greatly facilitates

understanding the dynamics of the theory in different energy regimes.

- Finally, the effective theory of inflation simply seems to be the most physical way to

describe non-Gaussianities in single-clock inflation.

A.2 Spontaneously Broken Symmetries

Physical systems with spontaneously broken symmetries–i.e. systems for which a symmetry of the

action is not a symmetry of the ground state–are ubiquitous in nature. Some of the key physical

characteristics of such systems are captured by the low-energy effective theory of the Goldstone

bosons associated with the symmetry breaking. Our goal in these notes is to formulate inflation

as an example of spontaneous symmetry breaking. In this case, the symmetry is approximate

time translation invariance of the de Sitter background. Before we develop the effective theory

of inflation we review the standard treatment of symmetry breaking in gauge theory.

A.2.1 Global Symmetries

The physics of spontaneous symmetry breaking is based on two powerful theorems due to Noether

and Goldstone. We assume that the reader is familiar with Noether’s theorem which states that

for every global continuous symmetry of the action there exists a conserved current jµ, with

$$ \partial_\mu j^\mu = 0 . \tag{A.2.3} $$

Spontaneous breaking of a global symmetry naturally leads to Goldstone bosons² whose low-energy

properties are largely governed by the nature of the symmetries which are spontaneously

broken. The Goldstone state |π〉 is obtained by performing a symmetry transformation on the

ground state |0〉, with spacetime-dependent transformation parameter. One can show that this

implies that the following matrix element cannot vanish

$$ \langle\pi|j^0(\mathbf{x}, t)|0\rangle \neq 0 , \tag{A.2.4} $$

where j⁰ is the charge density guaranteed to exist by Noether’s theorem. Eq. (A.2.4) implies the

following two critical properties of Goldstone bosons:

1. gaplessness

The energy of the Goldstone boson must vanish in the limit of vanishing 3-momentum:

$$ \lim_{p\to 0} E(p) = 0 . \tag{A.2.5} $$

To prove this, we note that

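$$ j^0(\mathbf{x}, t) = e^{-iHt}\, j^0(\mathbf{x}, 0)\, e^{iHt} , \tag{A.2.6} $$

$$ j^i(\mathbf{x}, t) = e^{-i\mathbf{P}\cdot\mathbf{x}}\, j^i(\mathbf{0}, t)\, e^{i\mathbf{P}\cdot\mathbf{x}} , \tag{A.2.7} $$

where H|0〉 = P|0〉 = 0, P|π〉 = p|π〉 and H|π〉 = Ep|π〉. Differentiating (A.2.4) with

respect to time, we find

$$ -iE_p\, e^{-iE_p t}\langle\pi|j^0(\mathbf{x}, 0)|0\rangle = \langle\pi|\partial_0 j^0(\mathbf{x}, t)|0\rangle = -\langle\pi|\partial_i j^i(\mathbf{x}, t)|0\rangle = i p_i\,\langle\pi|j^i(\mathbf{x}, t)|0\rangle . \tag{A.2.8} $$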

The r.h.s. vanishes in the limit p→ 0. However, because of (A.2.4), the l.h.s. only vanishes

if (A.2.5) holds. In relativistic theories, where E(p) = √(p² + m²), this implies that the Goldstone

bosons are massless. Note that we have arrived at this conclusion without ever writing

down an action.³ We just followed Noether and Goldstone.

2. low-energy decoupling

The above argument can be extended to more complicated matrix elements. In this way

one finds that the Goldstone bosons must decouple from all interactions in the limit p → 0.⁴

This observation significantly constrains the low-energy action parameterizing the Goldstone interactions.⁵

² For spontaneously broken supersymmetry the Goldstone mode is a fermion.

³ In fact, gaplessness and low-energy decoupling inform us what the effective action has to look like.

⁴ Basically, in the zero-momentum limit, the spacetime-dependent symmetry transformation that generated

the Goldstone boson from the ground state becomes spacetime-independent and the Goldstone mode becomes

indistinguishable from the ground state, i.e. the Goldstone state becomes a symmetry transformation of the

ground state.

⁵ The description of the physics in terms of the low-energy lagrangian of the Goldstone boson is useful even

when the underlying symmetry is not an exact symmetry. In this case, the small breaking of the symmetry

can be treated perturbatively, leading to pseudo-Goldstone bosons with a small mass and small non-derivative

interactions in the effective action, i.e. both gaplessness and low-energy decoupling become approximate.


A.2.2 Effective Lagrangian

Consider a gauge group G that is spontaneously broken to a subgroup H. Let Tᵃ be the

generators of the broken symmetry which live in the coset G/H. The Goldstone modes |π〉 are then obtained from the ground state |0〉 by performing a symmetry transformation with

spacetime-dependent transformation parameter

$$ U = e^{i\pi(x)/f_\pi} , \quad \text{where} \quad \pi \equiv \pi^a T^a . \tag{A.2.9} $$

The effective action for the Goldstone boson π has to be consistent with gaplessness and low-

energy decoupling. At lowest order in a low-energy expansion – O(E2) – the unique invariant

lagrangian for the Goldstone boson is

$$ \mathcal{L}_{\rm eff} = -\frac{f_\pi^2}{2}\,\partial_\mu U\cdot\partial^\mu U^\dagger , \tag{A.2.10} $$

where ∂µU·∂µU † ≡ Tr[∂µU∂µU †]. In terms of π this becomes

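$$ \mathcal{L}_{\rm eff} \to -\frac{1}{2}(\partial_\mu\pi)^2 + \frac{1}{6f_\pi^2}\left[(\pi\cdot\partial_\mu\pi)^2 - \pi^2(\partial_\mu\pi)^2\right] + \cdots \tag{A.2.11} $$

Notice the appearance of an infinite series of non-renormalizable interactions. Moreover, we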

see that there are special relations between the interactions dictated by the broken symmetry

and that the couplings are determined by the single parameter fπ. These interactions are

called universal. At higher-order in the energy expansion we obtain additional, non-universal

interactions. For example, at O(E4), we find the following single-derivative terms

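$$ \mathcal{L}_{\rm eff} = -\frac{f_\pi^2}{2}\,\partial_\mu U\cdot\partial^\mu U^\dagger + c_1\left[\partial_\mu U\cdot\partial^\mu U^\dagger\right]^2 + c_2\,\partial_\mu U\cdot\partial_\nu U^\dagger\;\partial^\mu U\cdot\partial^\nu U^\dagger + \cdots \tag{A.2.12} $$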

where c1 and c2 are model-dependent, dimensionless constants. If fπ sets the natural scale

relative to which the low-energy limit is to be taken, then we expect ci ∼ O(1). In terms of π,

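we again find a series of non-renormalizable interactions,

$$ \mathcal{L}_{\rm eff} \to \cdots + \frac{c_1}{4f_\pi^4}\left((\partial_\mu\pi)^4 - \frac{2}{3f_\pi^2}(\partial_\mu\pi)^2(\pi\cdot\partial_\mu\pi)^2 + \cdots\right) + \cdots . \tag{A.2.13} $$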

Individual interactions are again related by the non-linearly realized symmetry. Going beyond

single-derivative terms we may include terms involving higher derivatives such as terms involving

∂2U . At O(E4) this allows a number of additional terms.
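To see where the leading term of (A.2.11) comes from, it may help to expand U to first order in π. The following is only a sketch; it assumes Hermitian generators normalized as Tr[TᵃTᵇ] = δᵃᵇ, a convention that is not stated in the text:

```latex
\begin{align}
U &= e^{i\pi/f_\pi} = 1 + \frac{i}{f_\pi}\,\pi^a T^a + \mathcal{O}(\pi^2) , \\
\partial_\mu U &= \frac{i}{f_\pi}\,\partial_\mu\pi^a\, T^a + \mathcal{O}(\pi\,\partial\pi) , \\
-\frac{f_\pi^2}{2}\,\mathrm{Tr}\!\left[\partial_\mu U\,\partial^\mu U^\dagger\right]
  &= -\frac{1}{2}\,\partial_\mu\pi^a\,\partial^\mu\pi^b\,\mathrm{Tr}\!\left[T^a T^b\right] + \cdots
   = -\frac{1}{2}(\partial_\mu\pi^a)^2 + \cdots .
\end{align}
```

Because U·U† = 1, no non-derivative terms can be built from U alone, which is how the effective action realizes gaplessness and low-energy decoupling.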

A.2.3 A Toy UV-Completion

We digress briefly to present a simple field theory—the sigma model—that at low energies

reduces to the effective lagrangian of the previous section:

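$$ \mathcal{L} = -\frac{1}{2}\,\partial_\mu\Sigma\cdot\partial^\mu\Sigma^\dagger + \frac{\mu^2}{2}\,\Sigma\cdot\Sigma^\dagger - \frac{\lambda}{4}\left[\Sigma\cdot\Sigma^\dagger\right]^2 , \tag{A.2.14} $$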

with Σ = σ + iπ̃, where π̃ ≡ Tᵃπ̃ᵃ. At high energies, this provides a toy UV-completion of the

effective Goldstone theory. At low energies, the classical minimum of the potential in (A.2.14)

leads to a symmetry breaking vev, 〈σ〉² = v² = µ²/λ. To identify the Goldstone mode we introduce

the following parameterization

$$ \Sigma = (v + \rho(x))\,U(x) , \quad \text{where} \quad U = e^{i\pi(x)/v} , \tag{A.2.15} $$

such that π ≡ Tᵃπᵃ = π̃ + · · · . The lagrangian (A.2.14) becomes

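$$ \mathcal{L} = -\frac{1}{2}\left((\partial_\mu\rho)^2 + 2\mu^2\rho^2\right) - \frac{(v+\rho)^2}{2}\,\partial_\mu U\cdot\partial^\mu U^\dagger - \lambda v\rho^3 - \frac{\lambda}{4}\rho^4 . \tag{A.2.16} $$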

Integrating out the massive field ρ gives precisely the effective lagrangian (A.2.12),

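$$ \mathcal{L}_{\rm eff} = -\frac{f_\pi^2}{2}\,\partial_\mu U\cdot\partial^\mu U^\dagger + c_1\left[\partial_\mu U\cdot\partial^\mu U^\dagger\right]^2 + \cdots , \tag{A.2.17} $$

where fπ ≡ v and c₁ ≡ v²/(8µ²) = 1/(8λ). We see that the non-universal terms in (A.2.12) arise from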

integrating out the heavy fields of the UV-completion.

A.2.4 Energy Scales

In order to understand the dynamics of a theory, it is important to identify the energy scales at

which different phenomena become important.

Symmetry Breaking Scale

At low energies, the symmetry is spontaneously broken and a description of the physics in terms

of weakly-coupled Goldstone bosons is appropriate. At sufficiently high energies, the symmetry

is restored and degrees of freedom other than the Goldstone modes become relevant (such as

the field ρ in the example of the previous section). In this section we give a precise definition of

the symmetry breaking scale Λb that divides these two regimes.

By definition, any theory with a continuous global symmetry has a conserved Noether current

jµ even if the symmetry is spontaneously broken. A signature of spontaneous symmetry breaking

is the fact that the charge Q = ∫d³x j⁰ does not exist. The existence of a well-defined charge

requires that j⁰(x) vanishes at least as fast as x⁻³ in the limit x → ∞. In momentum space this means

that j⁰(p) scales at most like p⁻¹ for p → 0. We will use this criterion to identify the energy

scale below which the symmetry is spontaneously broken.

We start with the current associated with the effective lagrangian (A.2.11),

$$ j^\mu = -f_\pi\,\partial^\mu\pi + \mathcal{O}(f_\pi^0) . \tag{A.2.18} $$

The two-point function of the current is

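$$ \int d^4x\; e^{ipx}\,\langle 0|T j^\mu(x)\, j^\nu(0)|0\rangle = i\left(p^\mu p^\nu - \eta^{\mu\nu}p^2\right)\Pi(p^2) , \tag{A.2.19} $$

where

$$ \Pi(p^2) \equiv \frac{f_\pi^2}{p^2} + \mathcal{O}(1) . \tag{A.2.20} $$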

At low energies, ω < fπ, the first term in (A.2.20) dominates, j0 ∼ p−2 and the charge at infinity

does not exist. As a result, the symmetry is spontaneously broken at energies below Λb = fπ.

At energies above fπ the first term in (A.2.20) is sub-dominant and the symmetry is restored.

The upshot of this slightly formal argument is that the symmetry breaking scale can be read off

from the scale in the Noether current for the canonically-normalized field. We will use the same

argument to determine the symmetry breaking scale for inflation.


Strong Coupling Scale

The regime of validity of an effective theory is not always obvious. Given a microscopic definition

of the theory (i.e. a UV-completion such as the sigma model of the previous section), the regime

of validity is determined by the scales at which additional modes were integrated out. Given

only the effective description, these energy scales may not be transparent in the Lagrangian. A

fairly reliable method to identify the cutoff of the effective theory is to determine the energy

scale at which the theory becomes strongly coupled.

The Goldstone action (A.2.11) is an expansion in π/fπ which contains irrelevant operators

of arbitrarily large dimensions. These interactions become important at high energies. By

dimensional analysis, the effective coupling is ω/fπ, suggesting strong coupling near fπ. More

formally we can define the strong coupling scale as the energy scale at which the loop expansion

breaks down or perturbative unitarity of Goldstone boson scattering is violated. Such analyses

lead to the strong coupling scale Λ⋆ = 4πfπ.

A.2.5 Gauge Symmetries and Decoupling

Since gravity is described by a local gauge symmetry, we will be interested in additional effects

that arise when the symmetry breaking involves a local symmetry. The effective action for the

Goldstone boson then becomes

$$ \mathcal{L} = -\frac{f_\pi^2}{2}\,\nabla_\mu U\cdot\nabla^\mu U^\dagger + \cdots , \tag{A.2.21} $$

where ∇µ ≡ ∂µ + igAµ. Here, Aµ is the gauge field associated with the broken gauge symmetry.

The quadratic Lagrangian for the Goldstone boson and the gauge field becomes

$$ \mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 - \frac{1}{2}(\partial_\mu\pi)^2 - \frac{1}{2}m^2A_\mu^2 + i m\,\partial_\mu\pi\cdot A^\mu , \tag{A.2.22} $$

where m² ≡ f²π g². Of course, we could always go to the gauge where π = 0 (unitary gauge) and

the theory is described purely in terms of a massive vector Aµ. However, describing the physics in

terms of the Goldstone boson has the advantage that it makes the high-energy behavior of the

theory manifest. Specifically, it tells us that at high energies, the scattering of the longitudinal

mode of the gauge field is well-described by the scattering of the Goldstone bosons. (This is an

example of the Goldstone boson equivalence theorem.) This is seen most easily by taking the

decoupling limit where g → 0 and m → 0 for fπ = m/g = const. In this limit, there is now no

mixing between π and Aµ and the Goldstone action reduces to (A.2.10) (the local symmetry has

effectively become a global symmetry). For energies E > m, the Goldstone bosons are the most

convenient way to describe the scattering of the massive vector fields. Restoring finite g and

m gives corrections to the results from pure Goldstone boson scattering that are perturbative

in m/E and g2.

This completes our review of spontaneous symmetry breaking in gauge theory. We are now

ready to apply the same formalism to inflation.

A.3 Effective Theory of Inflation

A useful definition of inflation is as an FRW background with nearly constant expansion rate

H ≈ const., or |Ḣ| ≪ H². However, inflation has to end, so the time-translation invariance

of the quasi-de Sitter background⁶ has to be spontaneously broken, H(t). This suggests that

we can formulate inflation as an example of spontaneous symmetry breaking. We will use the

classic treatment of the previous section as an analogy.⁷

A.3.1 Goldstone Action

To identify the Goldstone boson associated with the symmetry breaking, we perform a spacetime-

dependent time shift

U ≡ t+ π(x) . (A.3.23)

We now construct the effective action for the Goldstone mode π. By definition, the Goldstone

field transforms as π → π − ξ under time reparameterization t → t + ξ, so that U = t + π is

invariant. The Goldstone mode π is related to the primordial curvature perturbation ζ by a

simple rescaling, ζ = −Hπ. Understanding the action for π will therefore allow us to compute

correlation functions for cosmological observables. Following our gauge theory example, the

effective action for the Goldstone boson is a general function of U

$$ \mathcal{L} = F\left(U, (\partial_\mu U)^2, \Box U, \cdots\right) . \tag{A.3.24} $$

The low-energy expansion of this action unifies all known single-field models of inflation⁸ and

allows a systematic classification of interactions.

Slow-Roll Inflation

At O(E2), the lagrangian is

$$ \mathcal{L} = \Lambda^4(U) - f^4(U)\, g^{\mu\nu}\partial_\mu U\,\partial_\nu U , \tag{A.3.25} $$

where Λ(U) and f(U) are a priori free functions of the invariant time U = t + π. However,

demanding tadpole cancellation determines the coefficients

$$ \Lambda^4 \equiv -M_{\rm pl}^2\left(3H^2 + \dot H\right) \quad \text{and} \quad f^4 \equiv -M_{\rm pl}^2\dot H . \tag{A.3.26} $$

Eq. (A.3.26) ensures that the action starts quadratic in π when the equations of motion of the

FRW background are imposed. At leading order, the coefficients of the action are therefore

completely fixed by the de Sitter background H(t)

$$ \mathcal{L} = M_{\rm pl}^2\dot H\, g^{\mu\nu}\partial_\mu U\,\partial_\nu U - M_{\rm pl}^2\left(3H^2 + \dot H\right) . \tag{A.3.27} $$

This is nothing but slow-roll inflation in disguise:

$$ \mathcal{L} = -\frac{1}{2}\, g^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi - V(\phi) , \tag{A.3.28} $$

where φ = φ̇ (t + π) and V(φ) = M²pl(3H² + Ḣ). The theory in (A.3.27) includes couplings

between the Goldstone π and metric fluctuations δgµν . This is analogous to the couplings

⁶ The symmetry group G of de Sitter space includes both temporal and spatial diffeomorphisms. Spatial

diffeomorphisms H will be preserved during inflation, but time diffeomorphisms, living in G/H, will be broken.

⁷ Since inflation breaks a spacetime symmetry and not an internal symmetry, the analogy will not be perfect.

⁸ In fact, the systematic treatment of the effective theory helped to identify single field theories that hadn’t

been previously discussed.


between π and Aµ in the gauge theory example. Just like in the gauge theory we can find a

limit in which π alone controls the dynamics, i.e. we can define a decoupling limit Mpl → ∞ and Ḣ → 0, with M²pl Ḣ fixed. This limit is the same as in gauge theory under the following

identifications: g → Mpl⁻¹, m² → Ḣ and E → H. At energies E² ≫ |Ḣ|, we can therefore ignore

the mixing of π with metric perturbations δgµν , i.e. we can evaluate the action for the Goldstone

mode π in the unperturbed de Sitter background gµν

$$ g^{\mu\nu}\partial_\mu U\,\partial_\nu U \to \bar g^{\mu\nu}\partial_\mu(t+\pi)\,\partial_\nu(t+\pi) = -1 - 2\dot\pi + (\partial_\mu\pi)^2 . \tag{A.3.29} $$

Since we care about correlation functions evaluated at freeze-out, E² ∼ H² ≫ |Ḣ|, the decoupled

π-lagrangian derived from (A.3.29) will give answers that are accurate up to fractional corrections

of order H²/M²pl and Ḣ/H² = −ε. For eq. (A.3.27), the decoupling limit (A.3.29) implies

$$ \mathcal{L}_{\rm s.r.} = M_{\rm pl}^2\dot H\,(\partial_\mu\pi)^2 . \tag{A.3.30} $$

We note that in the decoupling limit the Goldstone mode is precisely massless⁹ and the fluctuations

are perfectly Gaussian. This is consistent with our interpretation of (A.3.27) as the action

for fluctuations around slow-roll inflationary backgrounds. The near-perfect Gaussianity won’t

be maintained when we consider higher orders in the derivative expansion.
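The expansion quoted in (A.3.29) can be made explicit. The short worked version below is a sketch that assumes the unperturbed flat FRW metric, ḡ⁰⁰ = −1 and ḡⁱʲ = a⁻²δⁱʲ, and uses the shorthand (∂µπ)² ≡ ḡ^{µν}∂µπ∂νπ:

```latex
\begin{align}
\bar g^{\mu\nu}\,\partial_\mu(t+\pi)\,\partial_\nu(t+\pi)
  &= \bar g^{00}\,(1+\dot\pi)^2 + \bar g^{ij}\,\partial_i\pi\,\partial_j\pi \\
  &= -1 - 2\dot\pi - \dot\pi^2 + a^{-2}(\partial_i\pi)^2 \\
  &= -1 - 2\dot\pi + (\partial_\mu\pi)^2 .
\end{align}
```

The term linear in π̇ is a tadpole. After integration by parts it cancels against the π-dependence of the time-dependent coefficients once (A.3.26) is imposed, so that the action starts at quadratic order.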

DBI Inflation

At next-to-leading order, we can add the following single-derivative term to the effective lagrangian

$$ \mathcal{L}_{c_s} = \frac{1}{2}M_2^4\left(g^{\mu\nu}\partial_\mu U\,\partial_\nu U + 1\right)^2 , \tag{A.3.31} $$

where ‘-1’ was subtracted from (∂µU)2 to cancel the tadpole, i.e. to ensure that (A.3.31) starts

quadratic in π. In the decoupling limit (A.3.29) this adds the following terms to the Goldstone

action

$$ \mathcal{L}_{c_s} = 2M_2^4\left(\dot\pi^2 - \dot\pi(\partial_\mu\pi)^2\right) . \tag{A.3.32} $$

We note that a non-linearly realized symmetry relates dispersion to interactions. In other words,

the size of the kinetic term π̇² and the strength of the interaction π̇(∂µπ)² are related to the

same coefficient M₂. In principle, M₂(t + π) can depend on time, but in practice we are always

interested in cases where any time-dependence is proportional to the small parameter ε ≪ 1, or

Ṁ₂ ≪ HM₂. Adding (A.3.30) and (A.3.31), we find

$$ \mathcal{L}_{\rm s.r.} + \mathcal{L}_{c_s} = -\frac{M_{\rm pl}^2\dot H}{c_s^2}\left[\left(\dot\pi^2 - c_s^2(\partial_i\pi)^2\right) - \left(1 - c_s^2\right)\dot\pi(\partial_\mu\pi)^2\right] , \tag{A.3.33} $$

where we defined a sound speed

$$ \frac{1}{c_s^2} \equiv 1 - \frac{2M_2^4}{M_{\rm pl}^2\dot H} . \tag{A.3.34} $$

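⁹ Including the mixing with gravity gives the (pseudo-)Goldstone a small mass m²π = −6Ḣ,

$$ \mathcal{L}_{\rm s.r.} = M_{\rm pl}^2\dot H\left((\partial_\mu\pi)^2 + 3\varepsilon H^2\pi^2\right) . $$

This mass for π is precisely what is required so that ζ = −Hπ is massless,

$$ \mathcal{L}_{\rm s.r.} = M_{\rm pl}^2\,\frac{\dot H}{H^2}\,(\partial_\mu\zeta)^2 , $$

and hence freezes outside of the horizon.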


The Goldstone formalism makes it apparent how a small sound speed is related by symmetry to large

interactions and hence large non-Gaussianities.
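As a quick numerical illustration of (A.3.34), the following minimal Python sketch evaluates the sound speed for given values of M₂⁴ and M²plḢ. The function and argument names are hypothetical, and both inputs are assumed to be given in the same (arbitrary) units, with Ḣ < 0 as appropriate for inflation.

```python
import math


def sound_speed(m2_4: float, mpl2_hdot: float) -> float:
    """Return c_s from 1/c_s^2 = 1 - 2 M_2^4 / (M_pl^2 Hdot), eq. (A.3.34).

    m2_4      -- the coefficient M_2^4 (assumed non-negative)
    mpl2_hdot -- the product M_pl^2 * Hdot (negative during inflation)
    """
    if mpl2_hdot >= 0:
        raise ValueError("M_pl^2 * Hdot must be negative during inflation")
    if m2_4 < 0:
        raise ValueError("M_2^4 is assumed to be non-negative in this sketch")
    inv_cs2 = 1.0 - 2.0 * m2_4 / mpl2_hdot  # always >= 1 for the allowed inputs
    return 1.0 / math.sqrt(inv_cs2)
```

With M₂ = 0 the function returns c_s = 1 (slow-roll), while M₂⁴ ≫ M²pl|Ḣ| drives c_s towards zero, which is the regime where the cubic interaction is enhanced.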

Adding higher powers of gµν∂µU∂νU reproduces the so-called P(X)-theories, with X ≡ −½(∂µφ)²,

$$ \mathcal{L}_{P(X)} = \sum_n \frac{1}{n!}\,M_n^4\left(g^{\mu\nu}\partial_\mu U\,\partial_\nu U + 1\right)^n , \quad \text{where} \quad M_n^4 = X^n\,\frac{\partial^n P}{\partial X^n} . \tag{A.3.35} $$

This includes DBI inflation as a special case

$$ P(X,\phi) = f(\phi)^{-1}\sqrt{1 - f(\phi)X} - V(\phi) . \tag{A.3.36} $$

The DBI action (A.3.36) is special in that its form is protected against radiative corrections by

a higher-dimensional boost symmetry.¹⁰

Higher Derivatives

To complete our discussion of the effective Goldstone action we consider contributions with more

than one derivative acting on U. This leads to models of ghost inflation and galileon inflation (if

the operators satisfy the Galilean symmetry π → π + bµxµ + c). A complete description of these

higher-derivative theories is beyond the scope of these notes. We therefore restrict ourselves to

citing all higher-derivative operators that contribute to the action up to cubic order in π,

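$$
\begin{aligned}
\mathcal{L} = {} & M_1^2\,[\Box U]^2 + M_2\,[\Box U]^3 + M_3^3\,[\Box U]\,[(\partial_\mu U)^2] + M_4^3\,[\Box U]\,[(\partial_\mu U)^2]^2 + M_5^2\,[\Box U]^2\,[(\partial_\mu U)^2] \\
& + M_6^2\,[\nabla_\mu\nabla_\alpha U]\,[\nabla^\alpha\nabla^\mu U] + M_7^2\,[(\partial_\mu U)^2]\,[\nabla_\mu\nabla_\alpha U]\,[\nabla^\alpha\nabla^\mu U] \\
& + M_8^2\,[\Box U]\,[\nabla_\mu\nabla_\alpha U]\,[\nabla^\alpha\nabla^\mu U] + M_9\,[\nabla_\mu\nabla_\alpha U]\,[\nabla^\alpha\nabla_\beta U]\,[\nabla^\beta\nabla^\mu U] + \cdots ,
\end{aligned}
$$

where the square brackets were introduced to denote terms with background values subtracted,

such as

$$ [(\partial_\mu U)^2] \equiv (\partial_\mu U)^2 + 1 , \tag{A.3.37} $$

$$ [\nabla_\mu\nabla_\nu U] \equiv \nabla_\mu\nabla_\nu U - H\left(g_{\mu\nu} + \nabla_\mu U\,\nabla_\nu U\right) . \tag{A.3.38} $$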

This ensures that the action doesn’t have tadpoles. In unitary gauge the higher-derivative operators

are related to the extrinsic curvature δKµν . We will ignore them in the remainder and instead

focus on the predictions from the single-derivative action.

A.3.2 Energy Scales

Arguably the most interesting limit of the effective theory of inflation is the limit of small sound

speed, cs ≪ 1. In this limit, interactions are systematically enhanced and the non-Gaussianity

of the fluctuations can be large. The Goldstone action in this limit is

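$$ \mathcal{L} = -\frac{M_{\rm pl}^2\dot H}{c_s^2}\left[\left(\dot\pi^2 - c_s^2(\partial_i\pi)^2\right) - \dot\pi(\partial_i\pi)^2\right] , \tag{A.3.39} $$

where we have dropped the π̇³ interaction since it is suppressed by a factor of c²s relative to the

π̇(∂iπ)² interaction.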

To understand the dynamics of the theory we identify three fundamental energy scales:

¹⁰ General P(X) theories do not have this feature and are hence plagued by radiative instabilities.


- the symmetry breaking scale, Λb, is the energy scale at which time translations are spontaneously

broken and a description in terms of a Goldstone boson first becomes applicable;

- the strong coupling scale, Λ⋆, defines the energy scale at which the effective description

breaks down and perturbative unitarity is lost;

- the Hubble scale, H, is the energy scale associated with the cosmological experiment.

In slow-roll inflation, these three energy scales are easily identified. Time-translation invariance

is broken by the background φ(t) at the scale Λb⁴ = φ̇² = 2M²pl|Ḣ|. At energy scales above Λb,

the symmetry is restored and we should not integrate out the background. Because the theory is

effectively Gaussian, the self-interactions of φ are weak up to very high energies. The theory only

becomes strongly coupled at the Planck scale, so the UV-cutoff is Mpl. Inflationary observables

freeze out at horizon-crossing, or when their frequencies become equal to the expansion rate,

ω ∼ H. Inflation therefore directly probes energies of order Hubble, i.e. the energy scale of the

experiment is H. The freeze-out at the Hubble scale is universal, but the symmetry breaking

scale and the strong coupling scale are model-dependent. We will now determine these scales

for the lagrangian (A.3.39).

Symmetry Breaking

Although the inflationary background spontaneously breaks a gauge symmetry, in the decoupling

limit the gauge symmetry becomes a global symmetry. As long as the decoupled π-lagrangian

is a reliable description, the language of spontaneously broken global symmetries will therefore

be useful.

We determine the symmetry breaking scale for inflation by considering the Noether current

of time translations, i.e. the zero-component of the stress tensor, jµ = T 0µ. For the lagrangian

(A.3.39), we find

$$ j^0 = T^{00} = \frac{2M_{\rm pl}^2\dot H}{c_s^2}\,\dot\pi + \cdots \tag{A.3.40} $$

From our gauge theory discussion we expect 2M²plḢcs⁻² to control the symmetry breaking (i.e. to

control the energy scale at which the charge ceases to exist). However, for cs ≠ 1 Lorentz

invariance is broken so we have to be careful to define an energy scale. In fact, by dimensional

analysis we see that the coefficient in (A.3.40) is an energy density [ω][k3], not an energy4 [ω4].

To convert the symmetry breaking scale to a true energy scale, we use the dispersion relation

ω = csk. We conclude that the symmetry is spontaneously broken at

$$ \Lambda_b^4 \equiv 2M_{\rm pl}^2|\dot H|\,c_s . \tag{A.3.41} $$

Although the current gives a natural definition of the symmetry breaking scale, it is nice to

check that it agrees with our intuition. First of all, in the case of slow-roll inflation (i.e. for

cs = 1), the symmetry breaking scale is given by 2M²pl|Ḣ| = φ̇². This matches the intuition that

the time variation of the background is breaking the symmetry. Moreover, one may rewrite the

(dimensionless) power spectrum of curvature fluctuations in terms of the symmetry breaking

scale

$$ \Delta_\zeta \equiv k^3P_\zeta(k) = \frac{1}{4}\,\frac{H^2}{M_{\rm pl}^2\,\varepsilon\,c_s} = \frac{1}{2}\left(\frac{H}{\Lambda_b}\right)^4 . \tag{A.3.42} $$

Hence, when H ∼ Λb, the size of quantum fluctuations is of the same order as the symmetry

breaking scale. This is the regime of eternal inflation, which is again consistent with the

interpretation of Λb⁴ = φ̇² for slow-roll.
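The two forms of the power spectrum in (A.3.42) can be checked against each other numerically. The sketch below is illustrative only: the function names are hypothetical, all quantities are assumed to be in units where Mpl = 1 unless passed explicitly, and |Ḣ| is written as εH².

```python
import math


def symmetry_breaking_scale(mpl: float, hubble: float, eps: float, cs: float) -> float:
    """Lambda_b from Lambda_b^4 = 2 M_pl^2 |Hdot| c_s, with |Hdot| = eps * H^2 (A.3.41)."""
    return (2.0 * mpl**2 * eps * hubble**2 * cs) ** 0.25


def power_spectrum(mpl: float, hubble: float, eps: float, cs: float) -> float:
    """Delta_zeta = H^2 / (4 M_pl^2 eps c_s), first form of (A.3.42)."""
    return 0.25 * hubble**2 / (mpl**2 * eps * cs)


def power_spectrum_from_scale(hubble: float, lambda_b: float) -> float:
    """Delta_zeta = (1/2) (H / Lambda_b)^4, second form of (A.3.42)."""
    return 0.5 * (hubble / lambda_b) ** 4
```

For any choice of inputs the two functions should agree, which makes explicit that the amplitude of the fluctuations is controlled by the ratio H/Λb.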

Strong Coupling

Next, we determine the strong coupling scale from the action for the Goldstone boson (A.3.39),

$$ \mathcal{L} = \frac{1}{2}\left(\dot\pi_c^2 - c_s^2(\partial_i\pi_c)^2\right) - \frac{1}{M_\star^2}\,\dot\pi_c(\partial_i\pi_c)^2 , \tag{A.3.43} $$

where we defined πc² = 2M²pl|Ḣ|cs⁻²π² and

$$ M_\star^4 \equiv 2M_{\rm pl}^2|\dot H|\,c_s^{-2}\left(1 - c_s^2\right)^{-1} . \tag{A.3.44} $$

We expect that the theory becomes strongly coupled at an energy scale related to the coupling

M⋆. For a relativistic theory the strong coupling scale would be equal to M⋆, but in the non-relativistic

small-cs theory we again have to be careful to identify Λ⋆⁴ with units [ω⁴]. By dimensional

analysis, we find [M⋆⁴] = [k]⁷[ω]⁻³, and hence Λ⋆⁴ = M⋆⁴ × cs⁷, or

$$ \Lambda_\star^4 \equiv 2M_{\rm pl}^2|\dot H|\,c_s^5\left(1 - c_s^2\right)^{-1} . \tag{A.3.45} $$

As in the gauge theory example, the effective coupling is given by ω/Λ⋆. We expect that

strong coupling arises at some order-one value of this coupling. More formally, we can define

the strong coupling scale by the breakdown of perturbative unitarity of the Goldstone boson

scattering. This is found to happen when ω⁴ > 2πΛ⋆⁴. Note the large suppression of Λ⋆⁴ by

factors of cs ≪ 1.

The interactions which become strongly coupled are the same that give rise to measurable

non-Gaussianity. As a result, we should be able to interpret the strong coupling scale in terms

of the size of the non-Gaussianity. A simple estimate for the amplitude of the non-Gaussianity

is

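$$ f_{\rm NL}\,\zeta \equiv \left.\frac{\mathcal{L}_3}{\mathcal{L}_2}\right|_{\omega = H} \sim \frac{M_{\rm pl}^2\dot H\,c_s^{-2}\,\dot\pi(\partial_i\pi)^2}{M_{\rm pl}^2\dot H\,c_s^{-2}\,\dot\pi^2} = c_s^{-2}\,\zeta \sim \left(\frac{\Lambda_b}{\Lambda_\star}\right)^2\zeta . \tag{A.3.46} $$

Using the power spectrum (A.3.42) as an estimate for the size of ζ ∼ (∆ζ)^{1/2}, we find

$$ \frac{\mathcal{L}_3}{\mathcal{L}_2} \sim \left(\frac{H}{\Lambda_\star}\right)^2 . \tag{A.3.47} $$

We see that L3 ∼ L2, or fNL ∼ ζ⁻¹ ∼ 10⁴, when H ∼ Λ⋆. This indicates a breakdown of the

perturbative description as Λ⋆ approaches H.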

New Physics?

Summary. In the previous sections we derived two important energy scales in the effective

theory of inflation: the symmetry breaking scale, Λb⁴ = 2M²pl|Ḣ|cs, and the strong coupling

scale, Λ⋆⁴ = 2M²pl|Ḣ|cs⁵(1 − c²s)⁻¹. In slow-roll inflation, cs → 1, the strong coupling scale is

much larger than the symmetry breaking scale. However, in models with small speed of sound,

cs ≪ 1, this hierarchy of scales is reversed,

$$ \frac{\Lambda_b^4}{\Lambda_\star^4} = \left(1 - c_s^2\right)c_s^{-4} \simeq 16\left(f_{\rm NL}^{\rm equil.}\right)^2 , \tag{A.3.48} $$

where we used (A.3.46) to relate cs to fNL. Therefore, any measurable non-Gaussianity (fNL^equil. ≳ 10)

requires the strong coupling scale to appear parametrically below the scale at which the

background is integrated out.
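The hierarchy in (A.3.48) is easy to tabulate. The following sketch uses hypothetical function names and simply takes (A.3.48) at face value, reading off the implied equilateral non-Gaussianity as fNL ≃ (Λb/Λ⋆)²/4; it is an order-of-magnitude estimate, not a precise prediction.

```python
import math


def scale_ratio4(cs: float) -> float:
    """Return Lambda_b^4 / Lambda_star^4 = (1 - c_s^2) / c_s^4, eq. (A.3.48)."""
    if not 0.0 < cs <= 1.0:
        raise ValueError("the sound speed is assumed to satisfy 0 < c_s <= 1")
    return (1.0 - cs**2) / cs**4


def fnl_equil_estimate(cs: float) -> float:
    """Estimate f_NL^equil from (Lambda_b / Lambda_star)^4 ~ 16 f_NL^2."""
    return math.sqrt(scale_ratio4(cs)) / 4.0
```

For small sound speed this reduces to fNL ≈ 1/(4c²s), so a sound speed of a few percent already corresponds to fNL of order one hundred and to a strong coupling scale well below Λb.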

Implications. The inherently strongly-coupled nature of the above theories was a result of

restricting the particle content of the model. However, just like in particle physics, one should

take this as an indication that new degrees of freedom may become important at energies below

the scale of strong coupling.¹¹ Therefore, there is an energy scale ωnew at which ‘new physics’

becomes important. If we wish to maintain both weak coupling and the effective small-cs

description at Hubble, we require H² < ω²new ≪ √(2π) Λ⋆². Given our previous results, we find

$$ \frac{H^4}{\Lambda_\star^4} = 32\,\Delta_\zeta\left(f_{\rm NL}^{\rm equil.}\right)^2 , \tag{A.3.49} $$

where ∆ζ/(2π²) = 2.4 × 10⁻⁹ and |fNL^equil.| ≲ 300. This implies that the new physics must enter not

far above the Hubble scale:

$$ H^2 < \omega_{\rm new}^2 \ll \sqrt{2\pi}\,\Lambda_\star^2 \approx \mathcal{O}(20)\left(\frac{f_{\rm NL}^{\rm equil.}}{100}\right)^{-1}H^2 . \tag{A.3.50} $$

This range of energies is sufficiently small that the new physics is not obviously decoupled

at the Hubble scale. The use of “≪” in (A.3.50) is a reminder that our loop expansion is

being controlled by the ratio ω²/(√(2π) Λ⋆²). When ω²new → √(2π) Λ⋆² the theory becomes strongly

coupled. However, quantum corrections of any observable become increasingly important as one

approaches this limit. Therefore, a useful perturbative description requires that we expand in

small ω²/(√(2π) Λ⋆²), to ensure that our description is not dominated by strong dynamics. This

ratio reflects the strength of a coupling in any UV-completion of the effective theory.
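The numerical coefficient in (A.3.50) follows directly from (A.3.49). A minimal sketch, assuming the quoted normalization ∆ζ/(2π²) = 2.4 × 10⁻⁹ (the function name and default argument are hypothetical):

```python
import math

# Assumed amplitude of the power spectrum, Delta_zeta = 2 pi^2 * 2.4e-9.
DELTA_ZETA = 2.0 * math.pi**2 * 2.4e-9


def new_physics_window(f_nl: float, delta_zeta: float = DELTA_ZETA) -> float:
    """Return sqrt(2 pi) * Lambda_star^2 / H^2 using H^4 / Lambda_star^4 = 32 Delta_zeta f_NL^2."""
    if f_nl <= 0:
        raise ValueError("f_nl is assumed to be positive in this sketch")
    h4_over_lambda4 = 32.0 * delta_zeta * f_nl**2
    return math.sqrt(2.0 * math.pi) / math.sqrt(h4_over_lambda4)
```

Evaluating the function at fNL = 100 gives a value of about 20, and the result scales as 1/fNL, in line with the O(20)(fNL/100)⁻¹ quoted in (A.3.50).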

A.4 Non-Gaussianity

At the single-derivative level the effective Goldstone action up to cubic order is

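$$ \mathcal{L}_2 = -M_{\rm pl}^2\dot H\left[\dot\pi^2 - (\partial_i\pi)^2\right] + 2M_2^4\,\dot\pi^2 , \tag{A.4.51} $$

$$ \mathcal{L}_3 = -2M_2^4\,\dot\pi(\partial_i\pi)^2 + 2\left(M_2^4 + \frac{2}{3}M_3^4\right)\dot\pi^3 . \tag{A.4.52} $$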

The parameter M₂ corrects the quadratic slow-roll lagrangian by adding the kinetic term π̇²,

while not modifying the gradient term (∂iπ)². (The coefficient of the gradient term is completely

fixed by the background.) For M₂⁴ ≫ M²pl|Ḣ| this leads to a small sound speed. In the same

limit we get enhanced interactions in the cubic lagrangian. Making this symmetry explicit is

one of the main insights of the effective theory of inflation. Moreover, we have identified two

unique interactions in the cubic lagrangian – π̇(∂iπ)² and π̇³ – with amplitudes set by the two

free parameters M2 and M3. These parameters characterize the simplest deviations from vanilla

slow-roll inflation. Measurements of the CMB anisotropies can be used to put constraints on

M2 and M3 (see Fig. A.1). These measurements are the precise analog of electroweak precision

tests of the Standard Model.

¹¹ The ‘new physics’ could also take the form of a change in the physical description of the existing degrees of

freedom.

[Figure A.1; labels in the plot: slow-roll inflation, DBI inflation, sound speed.]

Figure A.1: CMB Precision Tests. CMB data constrains the parameters M₂ and M₃ in the effective

lagrangian: L = M²plḢ(∂µπ)² + 2M₂⁴(π̇² − π̇(∂µπ)²) + (4/3)M₃⁴π̇³.

The two operators π̇(∂iπ)² and π̇³ provide a physically motivated basis to define two unique

shapes of non-Gaussianity. Being both purely derivative interactions, both π̇(∂iπ)² and π̇³

produce bispectra that peak in the equilateral momentum configuration. However, the bispectra

are not identical, so we can find two linear combinations – called the equilateral and orthogonal

shapes – that are mutually orthogonal in a well-defined sense.

Single-field bispectra. The bispectrum associated with the ζ̇(∂iζ)² = −H³π̇(∂iπ)² interaction is

$$ B_{\dot\zeta(\partial_i\zeta)^2} = -\frac{\Delta_\zeta^2}{4}\left(1 - \frac{1}{c_s^2}\right)\cdot\frac{24K_3^6 - 8K_2^2K_3^3K_1 - 8K_2^4K_1^2 + 22K_3^3K_1^3 - 6K_2^2K_1^4 + 2K_1^6}{K_3^9\,K_1^3} , $$

where we defined

K1 = k1 + k2 + k3 ,

K2 = (k1k2 + k2k3 + k3k1)1/2 ,

K3 = (k1k2k3)1/3 .

The bispectrum associated with the ζ3 interaction is

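$$ B_{\dot\zeta^3} = 4\Delta_\zeta^2\left(c_3 + \frac{3}{2}c_s^2\right)\left(1 - \frac{1}{c_s^2}\right)\cdot\frac{1}{K_3^3\,K_1^3} , $$

where the parameter c₃ is defined via

$$ M_3^4 \equiv c_3\cdot\frac{M_2^4}{c_s^2} . $$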
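The momentum dependence of the two bispectra can be explored numerically. The sketch below evaluates only the shape factors, dropping the overall prefactors that depend on ∆ζ, cs and c₃. The function names are hypothetical, and the first term of the numerator is taken to be 24K₃⁶, which is an assumption based on dimensional consistency with the remaining terms.

```python
import math


def k_variables(k1: float, k2: float, k3: float) -> tuple[float, float, float]:
    """Return (K1, K2, K3) as defined in the text for positive momenta k1, k2, k3."""
    big_k1 = k1 + k2 + k3
    big_k2 = math.sqrt(k1 * k2 + k2 * k3 + k3 * k1)
    big_k3 = (k1 * k2 * k3) ** (1.0 / 3.0)
    return big_k1, big_k2, big_k3


def shape_dpi_gradpi2(k1: float, k2: float, k3: float) -> float:
    """Momentum-dependent factor of the bispectrum from the pi-dot (grad pi)^2 operator."""
    a, b, c = k_variables(k1, k2, k3)
    numerator = (24 * c**6 - 8 * b**2 * c**3 * a - 8 * b**4 * a**2
                 + 22 * c**3 * a**3 - 6 * b**2 * a**4 + 2 * a**6)
    return numerator / (c**9 * a**3)


def shape_dpi3(k1: float, k2: float, k3: float) -> float:
    """Momentum-dependent factor of the bispectrum from the pi-dot cubed operator."""
    a, _, c = k_variables(k1, k2, k3)
    return 1.0 / (c**3 * a**3)
```

Both factors scale as k⁻⁶ under a uniform rescaling of the momenta and are symmetric under permutations of (k1, k2, k3), as expected for a scale-invariant bispectrum.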


A.5 Conclusions

The effective theory of inflation is nice because:

• cosmologists can stop apologizing for using scalar fields;

• low-energy limit is constrained by symmetry;

• systematic classification of all single-field models of inflation;

• physical scales are readily identified and interpreted;

• Goldstone language explains decoupling from gravity;

• systematic classification of non-Gaussianities.

B Cosmological Perturbation Theory

In this appendix, we summarize basic facts of cosmological perturbation theory.

B.1 The Perturbed Universe

We consider perturbations to the homogeneous background spacetime and the stress-energy of

the universe.

B.1.1 Metric Perturbations

The most general first-order perturbation to a spatially flat FRW metric is

$$ ds^2 = -(1 + 2\Phi)\, dt^2 + 2 a(t) B_i\, dx^i dt + a^2(t) \left[ (1 - 2\Psi)\delta_{ij} + 2 E_{ij} \right] dx^i dx^j , \tag{B.1.1} $$

where Φ is a 3-scalar called the lapse, Bi is a 3-vector called the shift, Ψ is a 3-scalar called

the spatial curvature perturbation, and Eij is a spatial shear 3-tensor which is symmetric and

traceless, Eii = δijEij = 0. 3-surfaces of constant time t are called slices and curves of constant

spatial coordinates xi but varying time t are called threads.

B.1.2 Matter Perturbations

The stress-energy tensor may be described by a density ρ, a pressure p, a 4-velocity uµ (of the

frame in which the 3-momentum density vanishes), and an anisotropic stress Σµν .

Density and pressure perturbations are defined in the obvious way,

$$ \delta\rho(t, x^i) \equiv \rho(t, x^i) - \bar\rho(t) , \quad \text{and} \quad \delta p(t, x^i) \equiv p(t, x^i) - \bar p(t) . \tag{B.1.2} $$

Here, the background values have been denoted by overbars. The 4-velocity has only three independent components (after the metric is fixed) since it has to satisfy the constraint $g_{\mu\nu} u^\mu u^\nu = -1$. In the perturbed metric (B.1.1) the perturbed 4-velocity is

$$ u_\mu \equiv (-1 - \Phi, v_i) , \quad \text{or} \quad u^\mu \equiv (1 - \Phi, v^i + B^i) . \tag{B.1.3} $$

Here, $u_0$ is chosen so that the constraint $u_\mu u^\mu = -1$ is satisfied to first order in all perturbations. Anisotropic stress vanishes in the unperturbed FRW universe, so $\Sigma^{\mu\nu}$ is a first-order perturbation. Furthermore, $\Sigma^{\mu\nu}$ is constrained by

$$ \Sigma^{\mu\nu} u_\nu = \Sigma^\mu{}_\mu = 0 . \tag{B.1.4} $$


The orthogonality with $u_\mu$ implies $\Sigma^{00} = \Sigma^{0j} = 0$, i.e. only the spatial components $\Sigma^{ij}$ are non-zero. The trace condition then implies $\Sigma^i{}_i = 0$. Anisotropic stress is therefore a traceless symmetric 3-tensor.

Finally, with these definitions the perturbed stress-tensor is

$$ T^0{}_0 = -(\bar\rho + \delta\rho) , \tag{B.1.5} $$

$$ T^0{}_i = (\bar\rho + \bar p)\, v_i , \tag{B.1.6} $$

$$ T^i{}_0 = -(\bar\rho + \bar p)(v^i + B^i) , \tag{B.1.7} $$

$$ T^i{}_j = \delta^i_j (\bar p + \delta p) + \Sigma^i{}_j . \tag{B.1.8} $$

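If there are several contributions to the stress-energy tensor (e.g. photons, baryons, dark matter, etc.), they are added: $T_{\mu\nu} = \sum_I T^I_{\mu\nu}$. This implies

$$ \delta\rho = \sum_I \delta\rho_I , \tag{B.1.9} $$

$$ \delta p = \sum_I \delta p_I , \tag{B.1.10} $$

$$ (\bar\rho + \bar p)\, v^i = \sum_I (\bar\rho_I + \bar p_I)\, v^i_I , \tag{B.1.11} $$

$$ \Sigma^{ij} = \sum_I \Sigma^{ij}_I . \tag{B.1.12} $$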

Density, pressure and anisotropic stress perturbations simply add. However, velocities do not

add, which motivates defining the 3-momentum density

$$ \delta q^i \equiv (\bar\rho + \bar p)\, v^i , \tag{B.1.13} $$

such that

$$ \delta q^i = \sum_I \delta q^i_I . \tag{B.1.14} $$
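The statement that momentum densities add while velocities do not can be made concrete with a small Python sketch. The two-component mixture below (a radiation-like and a matter-like fluid, in arbitrary units) and the function name are purely illustrative assumptions; only one spatial component of the velocity is tracked.

```python
def total_momentum_and_velocity(components):
    """components: list of (rho_bar, p_bar, v) per species, one velocity component each.

    Returns (delta_q, v_total) using delta_q = sum_I (rho_I + p_I) v_I
    and v_total = delta_q / (rho + p), following (B.1.11) and (B.1.13).
    """
    rho = sum(r for r, _, _ in components)
    p = sum(pr for _, pr, _ in components)
    delta_q = sum((r + pr) * v for r, pr, v in components)
    return delta_q, delta_q / (rho + p)

# Hypothetical mixture: radiation (p = rho/3) and pressureless matter.
fluids = [
    (1.0, 1.0 / 3.0, 0.010),   # radiation-like component
    (3.0, 0.0, -0.002),        # matter-like component
]

delta_q, v_total = total_momentum_and_velocity(fluids)
naive_velocity_sum = sum(v for _, _, v in fluids)

print(delta_q, v_total, naive_velocity_sum)
```

The total velocity is the enthalpy-weighted average of the component velocities, which generally differs from their plain sum.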

B.2 Scalars, Vectors and Tensors

The Einstein Equations relate metric perturbations to the stress-energy perturbations. Einstein's Equations are both complicated (coupled second-order partial differential equations) and non-linear. Fortunately, the symmetries of the flat FRW background spacetime allow perturbations to be decomposed into independent scalar, vector and tensor components. This reduces the Einstein Equations to a set of uncoupled ordinary differential equations.

B.2.1 Helicity and SVT-Decomposition in Fourier Space

The decomposition into scalar, vector and tensor perturbations is most elegantly explained in

Fourier space. We define the Fourier components of a general perturbation δQ(t,x) as follows

$$ \delta Q(t, \mathbf{k}) = \int d^3x\, \delta Q(t, \mathbf{x})\, e^{-i \mathbf{k} \cdot \mathbf{x}} . \tag{B.2.15} $$

First note that as a consequence of translation invariance different Fourier modes (different

wavenumbers k) evolve independently.1

1The following proof was related to me by Uros Seljak and Chris Hirata.


Proof:

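Consider the linear evolution of $N$ perturbations $\delta Q_I$, $I = 1, \ldots, N$, from an initial time $t_1$ to a final time $t_2$,

$$ \delta Q_I(t_2, \mathbf{k}) = \sum_{J=1}^{N} \int d^3\tilde k\; T_{IJ}(t_2, t_1, \mathbf{k}, \tilde{\mathbf{k}})\, \delta Q_J(t_1, \tilde{\mathbf{k}}) , \tag{B.2.16} $$

where the transfer matrix $T_{IJ}(t_2, t_1, \mathbf{k}, \tilde{\mathbf{k}})$ follows from the Einstein Equations and we have allowed for the possibility of a mixing of k-modes. We now show that translation invariance in fact forbids such couplings. Consider the coordinate transformation

$$ x^{i\prime} = x^i + \Delta x^i , \quad \text{where} \quad \Delta x^i = \text{const.} \tag{B.2.17} $$

You may convince yourself that the Fourier amplitude gets shifted as follows

$$ \delta Q'_I(t, \mathbf{k}) = e^{-i k_j \Delta x^j}\, \delta Q_I(t, \mathbf{k}) . \tag{B.2.18} $$

Thus the evolution equation in the primed coordinate system becomes

$$ \delta Q'_I(t_2, \mathbf{k}) = \sum_{J=1}^{N} \int d^3\tilde k\; e^{-i k_j \Delta x^j}\, T_{IJ}(t_2, t_1, \mathbf{k}, \tilde{\mathbf{k}})\, e^{i \tilde k_j \Delta x^j}\, \delta Q'_J(t_1, \tilde{\mathbf{k}}) \tag{B.2.19} $$

$$ \equiv \sum_{J=1}^{N} \int d^3\tilde k\; T'_{IJ}(t_2, t_1, \mathbf{k}, \tilde{\mathbf{k}})\, \delta Q'_J(t_1, \tilde{\mathbf{k}}) . \tag{B.2.20} $$

By translation invariance the equations of motion must be the same in both coordinate systems, i.e. the transfer matrices $T_{IJ}$ and $T'_{IJ}$ must be the same

$$ T_{IJ}(t_2, t_1, \mathbf{k}, \tilde{\mathbf{k}}) = e^{i (\tilde k_j - k_j) \Delta x^j}\, T_{IJ}(t_2, t_1, \mathbf{k}, \tilde{\mathbf{k}}) . \tag{B.2.21} $$

This must hold for all $\Delta x^j$. Hence, either $\tilde{\mathbf{k}} = \mathbf{k}$ or $T_{IJ}(t_2, t_1; \mathbf{k}, \tilde{\mathbf{k}}) = 0$, i.e. the perturbation $\delta Q_I(t_2, \mathbf{k})$ of wavevector $\mathbf{k}$ depends only on the initial perturbations of wavevector $\mathbf{k}$. At linear order there is no coupling of different k-modes. QED.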
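The content of this proof can be illustrated with a discrete toy model. The Python sketch below is an assumption-laden analogue, not the cosmological transfer matrix: it takes a linear map on a periodic 1D grid whose entries depend only on the separation of grid points (the discrete version of translation invariance), transforms it to Fourier space with a discrete Fourier transform, and checks that no two different modes are coupled. The kernel values are arbitrary.

```python
import cmath

def dft_matrix(n):
    """Matrix F with F[a][b] = exp(-2*pi*i*a*b/n)."""
    return [[cmath.exp(-2j * cmath.pi * a * b / n) for b in range(n)]
            for a in range(n)]

def matmul(A, B):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

n = 8
# Hypothetical translation-invariant kernel: T[a][b] depends only on (a - b) mod n.
kernel = [0.5, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.1]
T_real = [[kernel[(a - b) % n] for b in range(n)] for a in range(n)]

F = dft_matrix(n)
F_inv = [[F[a][b].conjugate() / n for b in range(n)] for a in range(n)]

# Transfer matrix expressed in the Fourier basis.
T_fourier = matmul(matmul(F, T_real), F_inv)

max_offdiag = max(abs(T_fourier[a][b])
                  for a in range(n) for b in range(n) if a != b)
print(max_offdiag)  # vanishes up to floating-point rounding
```

The off-diagonal entries vanish to rounding error, which is the discrete counterpart of $T_{IJ} = 0$ unless $\tilde{\mathbf{k}} = \mathbf{k}$.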

Now consider rotations around the Fourier vector $\mathbf{k}$ by an angle $\psi$. We classify perturbations according to their helicity $m$: a perturbation of helicity $m$ has its amplitude multiplied by $e^{i m \psi}$ under the above rotation. We define scalar, vector and tensor perturbations as having helicities $0$, $\pm 1$, $\pm 2$, respectively.

Consider a Fourier mode with wavevector k. Without loss of generality we may assume that

k = (0, 0, k) (or use rotational invariance of the background). The spatial dependence of any

perturbation then is

$$ \delta Q \propto e^{i k x^3} . \tag{B.2.22} $$

To study rotations around $\mathbf{k}$ it proves convenient to switch to the helicity basis

$$ e_\pm \equiv \frac{e_1 \pm i e_2}{\sqrt{2}} , \quad e_3 , \tag{B.2.23} $$

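where $e_1, e_2, e_3$ is the Cartesian basis. A rotation around the 3-axis by an angle $\psi$ has the following effect

$$ \begin{pmatrix} x^{1\prime} \\ x^{2\prime} \end{pmatrix} = \begin{pmatrix} \cos\psi & \sin\psi \\ -\sin\psi & \cos\psi \end{pmatrix} \begin{pmatrix} x^1 \\ x^2 \end{pmatrix} , \quad x^{3\prime} = x^3 , \tag{B.2.24} $$

and

$$ e'_\pm = e^{\pm i \psi}\, e_\pm , \quad e'_3 = e_3 . \tag{B.2.25} $$

The contravariant components of any tensor $T_{i_1 i_2 \ldots i_n}$ transform as

$$ T'_{i_1 i_2 \ldots i_n} = e^{i (n_+ - n_-) \psi}\, T_{i_1 i_2 \ldots i_n} \equiv e^{i m \psi}\, T_{i_1 i_2 \ldots i_n} , \tag{B.2.26} $$

where $n_+$ and $n_-$ count the number of plus and minus indices in $i_1 \ldots i_n$, respectively. Helicity is defined as the difference $m \equiv n_+ - n_-$.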

In the helicity basis $e_\pm, e_3$, a 3-scalar $\alpha$ has a single component with no indices and is therefore obviously of helicity 0; a 3-vector $\beta_i$ has 3 components $\beta_+, \beta_-, \beta_3$ of helicity $\pm 1$ and 0; a symmetric and traceless 3-tensor $\gamma_{ij}$ has 5 components $\gamma_{--}, \gamma_{++}, \gamma_{-3}, \gamma_{+3}, \gamma_{33}$ (the tracelessness condition makes $\gamma_{-+}$ redundant), of helicity $\pm 2$, $\pm 1$ and 0.
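The phases in (B.2.25) and (B.2.26) can be checked numerically. The sketch below assumes that the rotation matrix of (B.2.24) is applied to the Cartesian components of the helicity basis vectors; the helper names and the angle are arbitrary choices for illustration.

```python
import cmath
import math

def rotate(v, psi):
    """Apply the rotation of (B.2.24) to the Cartesian components of v."""
    c, s = math.cos(psi), math.sin(psi)
    return (c * v[0] + s * v[1], -s * v[0] + c * v[1], v[2])

def outer(u, v):
    """Rank-2 tensor with components u_i v_j."""
    return [[ui * vj for vj in v] for ui in u]

psi = 0.7  # arbitrary rotation angle
e_plus = (1 / math.sqrt(2), 1j / math.sqrt(2), 0.0)
e_minus = (1 / math.sqrt(2), -1j / math.sqrt(2), 0.0)

rot_plus = rotate(e_plus, psi)
rot_minus = rotate(e_minus, psi)

# Helicity +1 and -1: the basis vectors pick up exp(+i psi) and exp(-i psi).
phase_plus = rot_plus[0] / e_plus[0]
phase_minus = rot_minus[0] / e_minus[0]

# Helicity +2: a tensor with two plus indices picks up exp(2 i psi).
T = outer(e_plus, e_plus)
T_rot = outer(rot_plus, rot_plus)
phase_plus_plus = T_rot[0][0] / T[0][0]

print(phase_plus, phase_minus, phase_plus_plus)
```

Each additional plus (minus) index contributes one more factor of $e^{i\psi}$ ($e^{-i\psi}$), which is the counting rule $m = n_+ - n_-$.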

Rotational invariance of the background implies that helicity scalars, vectors and tensors

evolve independently.2

Proof:

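Consider $N$ perturbations $\delta Q_I$, $I = 1, \ldots, N$, of helicity $m_I$. The linear evolution is

$$ \delta Q_I(t_2, \mathbf{k}) = \sum_{J=1}^{N} T_{IJ}(t_2, t_1, \mathbf{k})\, \delta Q_J(t_1, \mathbf{k}) , \tag{B.2.27} $$

where the transfer matrix $T_{IJ}(t_2, t_1, \mathbf{k})$ follows from the Einstein Equations. Under rotation the perturbations transform as

$$ \delta Q'_I(t, \mathbf{k}) = e^{i m_I \psi}\, \delta Q_I(t, \mathbf{k}) , \tag{B.2.28} $$

and

$$ \delta Q'_I(t_2, \mathbf{k}) = \sum_{J=1}^{N} e^{i m_I \psi}\, T_{IJ}(t_2, t_1, \mathbf{k})\, e^{-i m_J \psi}\, \delta Q'_J(t_1, \mathbf{k}) . \tag{B.2.29} $$

By rotational invariance of the equations of motion

$$ T_{IJ}(t_2, t_1, \mathbf{k}) = e^{i m_I \psi}\, T_{IJ}(t_2, t_1, \mathbf{k})\, e^{-i m_J \psi} = e^{i (m_I - m_J) \psi}\, T_{IJ}(t_2, t_1, \mathbf{k}) , \tag{B.2.30} $$

which has to hold for any angle $\psi$; it follows that either $m_I = m_J$, i.e. $\delta Q_I$ and $\delta Q_J$ have the same helicity, or $T_{IJ}(t_2, t_1, \mathbf{k}) = 0$. This proves that the equations of motion don't mix modes of different helicity. QED.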

B.2.2 SVT-Decomposition in Real Space

In the last section we have seen that 3-scalars correspond to helicity scalars, 3-vectors decompose

into helicity scalars and vectors, and 3-tensors decompose into helicity scalars, vectors and

tensors. We now look at this from a different perspective.

A 3-scalar is obviously also a helicity scalar

$$ \alpha = \alpha^S . \tag{B.2.31} $$

2The following proof was related to me by Uros Seljak and Chris Hirata.


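Consider a 3-vector $\beta_i$. We argue that it can be decomposed as

$$ \beta_i = \beta^S_i + \beta^V_i , \tag{B.2.32} $$

where

$$ \beta^S_i = \nabla_i \hat\beta , \quad \nabla^i \beta^V_i = 0 , \tag{B.2.33} $$

or, in Fourier space,

$$ \beta^S_i = -\frac{i k_i}{k}\, \beta , \quad k_i \beta^V_i = 0 . \tag{B.2.34} $$

Here, we have defined $\beta \equiv k \hat\beta$.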

Exercise 1 (Helicity Vector) Show that βVi is a helicity vector.

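Similarly, a traceless, symmetric 3-tensor can be written as

$$ \gamma_{ij} = \gamma^S_{ij} + \gamma^V_{ij} + \gamma^T_{ij} , \tag{B.2.35} $$

where

$$ \gamma^S_{ij} = \left( \nabla_i \nabla_j - \frac{1}{3} \delta_{ij} \nabla^2 \right) \hat\gamma , \tag{B.2.36} $$

$$ \gamma^V_{ij} = \frac{1}{2} \left( \nabla_i \hat\gamma_j + \nabla_j \hat\gamma_i \right) , \quad \nabla_i \hat\gamma_i = 0 , \tag{B.2.37} $$

$$ \nabla_i \gamma^T_{ij} = 0 , \tag{B.2.38} $$

or

$$ \gamma^S_{ij} = \left( -\frac{k_i k_j}{k^2} + \frac{1}{3} \delta_{ij} \right) \gamma , \tag{B.2.39} $$

$$ \gamma^V_{ij} = -\frac{i}{2k} \left( k_i \gamma_j + k_j \gamma_i \right) , \quad k_i \gamma_i = 0 , \tag{B.2.40} $$

$$ k_i \gamma^T_{ij} = 0 . \tag{B.2.41} $$

Here, we have defined $\gamma \equiv k^2 \hat\gamma$ and $\gamma_i \equiv k \hat\gamma_i$.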

Exercise 2 (Helicity Vectors and Tensors) Show that γVij and γTij are a helicity vector and

a helicity tensor, respectively.

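Choosing $\mathbf{k}$ along the 3-axis, i.e. $\mathbf{k} = (0, 0, k)$, we find

$$ \gamma^S_{ij} = \frac{1}{3} \begin{pmatrix} \gamma & 0 & 0 \\ 0 & \gamma & 0 \\ 0 & 0 & -2\gamma \end{pmatrix} , \tag{B.2.42} $$

$$ \gamma^V_{ij} = -\frac{i}{2} \begin{pmatrix} 0 & 0 & \gamma_1 \\ 0 & 0 & \gamma_2 \\ \gamma_1 & \gamma_2 & 0 \end{pmatrix} , \tag{B.2.43} $$

$$ \gamma^T_{ij} = \begin{pmatrix} \gamma_\times & \gamma_+ & 0 \\ \gamma_+ & -\gamma_\times & 0 \\ 0 & 0 & 0 \end{pmatrix} . \tag{B.2.44} $$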
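The component patterns in (B.2.42)-(B.2.44) suggest a simple recipe for splitting a given symmetric traceless matrix when $\mathbf{k}$ points along the 3-axis. The Python sketch below is a hypothetical helper, not part of the notes: it reads off the scalar amplitude from the 33 component, assigns the 13 and 23 entries to the vector part, and leaves the rest as the tensor part. The numerical entries of the example matrix are arbitrary.

```python
def svt_split(g):
    """Split a symmetric traceless 3x3 matrix g, for k along the 3-axis.

    Returns (S, V, T) following the component patterns of (B.2.42)-(B.2.44);
    the overall -i/2 normalization of the vector part is not factored out.
    """
    gamma = -1.5 * g[2][2]                      # from S_33 = -2*gamma/3
    S = [[gamma / 3, 0.0, 0.0],
         [0.0, gamma / 3, 0.0],
         [0.0, 0.0, -2 * gamma / 3]]
    V = [[0.0, 0.0, g[0][2]],
         [0.0, 0.0, g[1][2]],
         [g[2][0], g[2][1], 0.0]]
    T = [[g[i][j] - S[i][j] - V[i][j] for j in range(3)] for i in range(3)]
    return S, V, T

# Arbitrary symmetric traceless example.
g = [[1.0, 2.0, 3.0],
     [2.0, -4.0, 5.0],
     [3.0, 5.0, 3.0]]

S, V, T = svt_split(g)
print(T)  # only the upper-left 2x2 block is non-zero, and it is traceless
```

The counting works out as in the text: one scalar amplitude, two vector amplitudes and two tensor amplitudes make up the five independent components.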


B.3 Scalars


Four scalar metric perturbations $\Phi$, $B_{,i}$, $\Psi \delta_{ij}$ and $E_{,ij}$ may be constructed from 3-scalars, their derivatives and the background spatial metric, i.e.

$$ ds^2 = -(1 + 2\Phi)\, dt^2 + 2 a(t) B_{,i}\, dx^i dt + a^2(t) \left[ (1 - 2\Psi) \delta_{ij} + 2 E_{,ij} \right] dx^i dx^j . \tag{B.3.45} $$

Here, we have absorbed the $\nabla^2 E\, \delta_{ij}$ part of the helicity scalar $E^S_{ij}$ in $\Psi\, \delta_{ij}$.

The intrinsic Ricci scalar curvature of constant time hypersurfaces is

$$ R_{(3)} = \frac{4}{a^2} \nabla^2 \Psi . \tag{B.3.46} $$

This explains why Ψ is often referred to as the curvature perturbation.

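There are two scalar gauge transformations

$$ t \to t + \alpha , \tag{B.3.47} $$

$$ x^i \to x^i + \delta^{ij} \beta_{,j} . \tag{B.3.48} $$

Under these coordinate transformations the scalar metric perturbations transform as

$$ \Phi \to \Phi - \dot\alpha , \tag{B.3.49} $$

$$ B \to B + a^{-1} \alpha - a \dot\beta , \tag{B.3.50} $$

$$ E \to E - \beta , \tag{B.3.51} $$

$$ \Psi \to \Psi + H \alpha . \tag{B.3.52} $$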

Note that the combination $\dot E - B/a$ is independent of the spatial gauge and only depends on the temporal gauge. It is called the scalar potential for the anisotropic shear of world lines orthogonal to constant time hypersurfaces. To extract physical results it is useful to define gauge-invariant combinations of the scalar metric perturbations. Two important gauge-invariant quantities were introduced by Bardeen

$$ \Phi_B \equiv \Phi - \frac{d}{dt} \left[ a^2 (\dot E - B/a) \right] , \tag{B.3.53} $$

$$ \Psi_B \equiv \Psi + a^2 H (\dot E - B/a) . \tag{B.3.54} $$
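The gauge invariance of the Bardeen potentials can be checked numerically. The Python sketch below is only a consistency test under stated assumptions: time derivatives are taken with central finite differences, the background is an arbitrary power law $a \propto t^{2/3}$, and the perturbations and gauge functions $\alpha(t)$, $\beta(t)$ are arbitrary smooth test functions with the spatial dependence suppressed. The transformation rules are (B.3.49)-(B.3.52) with overdots denoting time derivatives.

```python
import math

def ddt(f, h=1e-4):
    """Central finite-difference time derivative of a function of t."""
    return lambda t: (f(t + h) - f(t - h)) / (2 * h)

a = lambda t: t ** (2.0 / 3.0)          # assumed background scale factor
H = lambda t: ddt(a)(t) / a(t)          # Hubble rate

def bardeen(Phi, B, E, Psi):
    """Return the functions Phi_B(t), Psi_B(t) of (B.3.53)-(B.3.54)."""
    X = lambda t: ddt(E)(t) - B(t) / a(t)
    Phi_B = lambda t: Phi(t) - ddt(lambda s: a(s) ** 2 * X(s))(t)
    Psi_B = lambda t: Psi(t) + a(t) ** 2 * H(t) * X(t)
    return Phi_B, Psi_B

# Arbitrary test perturbations and gauge functions (hypothetical choices).
Phi = math.sin
B = math.cos
E = lambda t: t ** 2
Psi = lambda t: math.exp(-t)
alpha = lambda t: 0.3 * math.sin(2 * t)
beta = lambda t: 0.1 * t ** 3

# Gauge-transformed perturbations.
Phi2 = lambda t: Phi(t) - ddt(alpha)(t)
B2 = lambda t: B(t) + alpha(t) / a(t) - a(t) * ddt(beta)(t)
E2 = lambda t: E(t) - beta(t)
Psi2 = lambda t: Psi(t) + H(t) * alpha(t)

t0 = 1.3
Phi_B, Psi_B = bardeen(Phi, B, E, Psi)
Phi_B2, Psi_B2 = bardeen(Phi2, B2, E2, Psi2)

diff_Phi = abs(Phi_B(t0) - Phi_B2(t0))
diff_Psi = abs(Psi_B(t0) - Psi_B2(t0))
print(diff_Phi, diff_Psi)
```

Both differences vanish up to rounding noise, while the individual potentials $\Phi$ and $\Psi$ change by order-one amounts under the same transformation.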


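Matter perturbations are also gauge-dependent, e.g. density and pressure perturbations transform as follows under temporal gauge transformations

$$ \delta\rho \to \delta\rho - \dot{\bar\rho}\, \alpha , \quad \delta p \to \delta p - \dot{\bar p}\, \alpha . \tag{B.3.55} $$

Adiabatic pressure perturbations are defined as

$$ \delta p_{\rm ad} \equiv \frac{\dot{\bar p}}{\dot{\bar\rho}}\, \delta\rho . \tag{B.3.56} $$

The non-adiabatic, or entropic, part of the pressure perturbations is then gauge-invariant

$$ \delta p_{\rm en} \equiv \delta p - \frac{\dot{\bar p}}{\dot{\bar\rho}}\, \delta\rho . \tag{B.3.57} $$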


The scalar part of the 3-momentum density, $(\delta q)_{,i}$, transforms as

$$ \delta q \to \delta q + (\bar\rho + \bar p)\, \alpha . \tag{B.3.58} $$

We may then define the gauge-invariant comoving density perturbation

$$ \delta\rho_m \equiv \delta\rho - 3 H \delta q . \tag{B.3.59} $$

Finally, two important gauge-invariant quantities are formed from combinations of matter and metric perturbations. The curvature perturbation on uniform density hypersurfaces is

$$ -\mathcal{R} \equiv \Psi + \frac{H}{\dot{\bar\rho}}\, \delta\rho . \tag{B.3.60} $$

The comoving curvature perturbation is

$$ \zeta = \Psi - \frac{H}{\bar\rho + \bar p}\, \delta q . \tag{B.3.61} $$

We will show that $\zeta$ and $\mathcal{R}$ are equal on superhorizon scales, where they become time-independent. The computation of the inflationary perturbation spectrum is most clearly phrased in terms of $\zeta$ and $\mathcal{R}$.

B.3.3 Einstein Equations

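To relate the metric and stress-energy perturbations, we consider the perturbed Einstein Equations

$$ \delta G_{\mu\nu} = 8\pi G\, \delta T_{\mu\nu} . \tag{B.3.62} $$

We work at linear order. This leads to the energy and momentum constraint equations

$$ 3H(\dot\Psi + H\Phi) + \frac{k^2}{a^2} \left[ \Psi + H(a^2 \dot E - aB) \right] = -4\pi G\, \delta\rho , \tag{B.3.63} $$

$$ \dot\Psi + H\Phi = -4\pi G\, \delta q . \tag{B.3.64} $$

These can be combined into the gauge-invariant Poisson Equation

$$ \frac{k^2}{a^2} \Psi_B = -4\pi G\, \delta\rho_m . \tag{B.3.65} $$

The Einstein equations also yield two evolution equations

$$ \ddot\Psi + 3H\dot\Psi + H\dot\Phi + (3H^2 + 2\dot H)\Phi = 4\pi G \left( \delta p - \frac{2}{3} k^2 \delta\Sigma \right) , \tag{B.3.66} $$

$$ (\partial_t + 3H)(\dot E - B/a) + \frac{\Psi - \Phi}{a^2} = 8\pi G\, \delta\Sigma . \tag{B.3.67} $$

The last equation may be written as

$$ \Psi_B - \Phi_B = 8\pi G a^2\, \delta\Sigma . \tag{B.3.68} $$

In the absence of anisotropic stress this implies $\Psi_B = \Phi_B$.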


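Energy-momentum conservation, $\nabla_\mu T^{\mu\nu} = 0$, gives the continuity equation and the Euler equation

$$ \dot{\delta\rho} + 3H(\delta\rho + \delta p) = \frac{k^2}{a^2} \delta q + (\bar\rho + \bar p) \left[ 3\dot\Psi + k^2 (\dot E + B/a) \right] , \tag{B.3.69} $$

$$ \dot{\delta q} + 3H \delta q = -\delta p + \frac{2}{3} k^2 \delta\Sigma - (\bar\rho + \bar p)\Phi . \tag{B.3.70} $$

Expressed in terms of the curvature perturbation on uniform-density hypersurfaces, $\zeta$, (B.3.69) reads

$$ \dot\zeta = -H \frac{\delta p_{\rm en}}{\bar\rho + \bar p} - \Pi , \tag{B.3.71} $$

where $\delta p_{\rm en}$ is the non-adiabatic component of the pressure perturbation, and $\Pi$ is the scalar shear along comoving worldlines

$$ \frac{\Pi}{H} \equiv -\frac{k^2}{3H} \left[ \dot E - B/a + \frac{\delta q}{a^2 (\bar\rho + \bar p)} \right] \tag{B.3.72} $$

$$ = -\frac{k^2}{3 a^2 H^2} \left[ \zeta - \Psi_B \left( 1 - \frac{2\bar\rho}{9(\bar\rho + \bar p)} \frac{k^2}{a^2 H^2} \right) \right] . \tag{B.3.73} $$

For adiabatic perturbations, $\delta p_{\rm en} = 0$, on superhorizon scales, $k/(aH) \ll 1$ (i.e. $\Pi/H \to 0$ for finite $\zeta$ and $\Psi_B$), the curvature perturbation $\zeta$ is constant. This is a crucial result for our computation of the inflationary spectrum of $\zeta$. It justifies computing $\zeta$ at horizon exit and ignoring superhorizon evolution.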

B.3.4 Popular Gauges

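For reference we now give the Einstein Equations and the conservation equations in various popular gauges:

• Synchronous gauge

A popular gauge, especially for numerical implementation of the perturbation equations (cf. CMBFAST or CAMB), is synchronous gauge. It is defined by

$$ \Phi = B = 0 . \tag{B.3.74} $$

The Einstein equations become

$$ 3H\dot\Psi + \frac{k^2}{a^2} \left[ \Psi + H a^2 \dot E \right] = -4\pi G\, \delta\rho , \tag{B.3.75} $$

$$ \dot\Psi = -4\pi G\, \delta q , \tag{B.3.76} $$

$$ \ddot\Psi + 3H\dot\Psi = 4\pi G \left( \delta p - \frac{2}{3} k^2 \delta\Sigma \right) , \tag{B.3.77} $$

$$ (\partial_t + 3H)\dot E + \frac{\Psi}{a^2} = 8\pi G\, \delta\Sigma . \tag{B.3.78} $$

The conservation equations are

$$ \dot{\delta\rho} + 3H(\delta\rho + \delta p) = \frac{k^2}{a^2} \delta q + (\bar\rho + \bar p) \left[ 3\dot\Psi + k^2 \dot E \right] , \tag{B.3.79} $$

$$ \dot{\delta q} + 3H \delta q = -\delta p + \frac{2}{3} k^2 \delta\Sigma . \tag{B.3.80} $$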


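• Newtonian gauge

The Newtonian gauge has its name because it reduces to Newtonian gravity in the small-scale limit. It is popular for analytic work since it leads to algebraic relations between metric and stress-energy perturbations.

Newtonian gauge is defined by

$$ B = E = 0 , \tag{B.3.81} $$

and

$$ ds^2 = -(1 + 2\Phi)\, dt^2 + a^2(t)(1 - 2\Psi)\delta_{ij}\, dx^i dx^j . \tag{B.3.82} $$

The Einstein equations are

$$ 3H(\dot\Psi + H\Phi) + \frac{k^2}{a^2} \Psi = -4\pi G\, \delta\rho , \tag{B.3.83} $$

$$ \dot\Psi + H\Phi = -4\pi G\, \delta q , \tag{B.3.84} $$

$$ \ddot\Psi + 3H\dot\Psi + H\dot\Phi + (3H^2 + 2\dot H)\Phi = 4\pi G \left( \delta p - \frac{2}{3} k^2 \delta\Sigma \right) , \tag{B.3.85} $$

$$ \frac{\Psi - \Phi}{a^2} = 8\pi G\, \delta\Sigma . \tag{B.3.86} $$

The continuity equations are

$$ \dot{\delta\rho} + 3H(\delta\rho + \delta p) = \frac{k^2}{a^2} \delta q + 3(\bar\rho + \bar p)\dot\Psi , \tag{B.3.87} $$

$$ \dot{\delta q} + 3H \delta q = -\delta p + \frac{2}{3} k^2 \delta\Sigma - (\bar\rho + \bar p)\Phi . \tag{B.3.88} $$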

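• Uniform density gauge

The uniform density gauge is useful for describing the evolution of perturbations on superhorizon scales. As its name suggests it is defined by

$$ \delta\rho = 0 . \tag{B.3.89} $$

In addition, it is convenient to take

$$ E = 0 , \quad -\Psi \equiv \mathcal{R} . \tag{B.3.90} $$

The Einstein equations are

$$ 3H(-\dot{\mathcal{R}} + H\Phi) - \frac{k^2}{a^2} \left[ \mathcal{R} + aHB \right] = 0 , \tag{B.3.91} $$

$$ -\dot{\mathcal{R}} + H\Phi = -4\pi G\, \delta q , \tag{B.3.92} $$

$$ -\ddot{\mathcal{R}} - 3H\dot{\mathcal{R}} + H\dot\Phi + (3H^2 + 2\dot H)\Phi = 4\pi G \left( \delta p - \frac{2}{3} k^2 \delta\Sigma \right) , \tag{B.3.93} $$

$$ (\partial_t + 3H) B/a + \frac{\mathcal{R} + \Phi}{a^2} = -8\pi G\, \delta\Sigma . \tag{B.3.94} $$

The continuity equations are

$$ 3H \delta p = \frac{k^2}{a^2} \delta q + (\bar\rho + \bar p) \left[ -3\dot{\mathcal{R}} + k^2 B/a \right] , \tag{B.3.95} $$

$$ \dot{\delta q} + 3H \delta q = -\delta p + \frac{2}{3} k^2 \delta\Sigma - (\bar\rho + \bar p)\Phi . \tag{B.3.96} $$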


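• Comoving gauge

Comoving gauge is defined by the vanishing of the scalar momentum density,

$$ \delta q = 0 , \quad E = 0 . \tag{B.3.97} $$

It is also conventional to set $-\Psi \equiv \zeta$ in this gauge.

The Einstein equations are

$$ 3H(-\dot\zeta + H\Phi) + \frac{k^2}{a^2} \left[ -\zeta - aHB \right] = -4\pi G\, \delta\rho , \tag{B.3.98} $$

$$ -\dot\zeta + H\Phi = 0 , \tag{B.3.99} $$

$$ -\ddot\zeta - 3H\dot\zeta + H\dot\Phi + (3H^2 + 2\dot H)\Phi = 4\pi G \left( \delta p - \frac{2}{3} k^2 \delta\Sigma \right) , \tag{B.3.100} $$

$$ (\partial_t + 3H) B/a + \frac{\zeta + \Phi}{a^2} = -8\pi G\, \delta\Sigma . \tag{B.3.101} $$

The continuity equations are

$$ \dot{\delta\rho} + 3H(\delta\rho + \delta p) = (\bar\rho + \bar p) \left[ -3\dot\zeta + k^2 B/a \right] , \tag{B.3.102} $$

$$ 0 = -\delta p + \frac{2}{3} k^2 \delta\Sigma - (\bar\rho + \bar p)\Phi . \tag{B.3.103} $$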

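Equations (B.3.98), (B.3.99) and (B.3.103) may be combined into

$$ \Phi = \frac{-\delta p + \frac{2}{3} k^2 \delta\Sigma}{\bar\rho + \bar p} , \quad k^2 B = \frac{4\pi G a^2 \delta\rho - k^2 \zeta}{aH} . \tag{B.3.104} $$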

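• Spatially-flat gauge

A convenient gauge for computing inflationary perturbations is spatially-flat gauge

$$ \Psi = E = 0 . \tag{B.3.105} $$

During inflation all scalar perturbations are then described by $\delta\phi$.

The Einstein equations are

$$ 3H^2\Phi + \frac{k^2}{a^2} \left[ -aHB \right] = -4\pi G\, \delta\rho , \tag{B.3.106} $$

$$ H\Phi = -4\pi G\, \delta q , \tag{B.3.107} $$

$$ H\dot\Phi + (3H^2 + 2\dot H)\Phi = 4\pi G \left( \delta p - \frac{2}{3} k^2 \delta\Sigma \right) , \tag{B.3.108} $$

$$ (\partial_t + 3H) B/a + \frac{\Phi}{a^2} = -8\pi G\, \delta\Sigma . \tag{B.3.109} $$

The continuity equations are

$$ \dot{\delta\rho} + 3H(\delta\rho + \delta p) = \frac{k^2}{a^2} \delta q + (\bar\rho + \bar p) \left[ k^2 B/a \right] , \tag{B.3.110} $$

$$ \dot{\delta q} + 3H \delta q = -\delta p + \frac{2}{3} k^2 \delta\Sigma - (\bar\rho + \bar p)\Phi . \tag{B.3.111} $$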


B.4 Vectors


Vector type metric perturbations are defined as

$$ ds^2 = -dt^2 + 2 a(t) S_i\, dx^i dt + a^2(t) \left[ \delta_{ij} + 2 F_{(i,j)} \right] dx^i dx^j , \tag{B.4.112} $$

where $S_{i,i} = F_{i,i} = 0$. The vector gauge transformation is

$$ x^i \to x^i + \beta^i , \quad \beta_{i,i} = 0 . \tag{B.4.113} $$

It leads to the transformations

$$ S_i \to S_i + a \dot\beta_i , \tag{B.4.114} $$

$$ F_i \to F_i - \beta_i . \tag{B.4.115} $$

The combination $\dot F_i + S_i/a$ is called the gauge-invariant vector shear perturbation.


We define the vector part of the anisotropic stress by

δΣij = ∂(iΣj) , (B.4.116)

where Σi is divergence-free, Σi,i = 0.


For vector perturbations there are only two Einstein Equations,

$$ \dot{\delta q}_i + 3H \delta q_i = k^2 \delta\Sigma_i , \tag{B.4.117} $$

$$ k^2 (\dot F_i + S_i/a) = 16\pi G\, \delta q_i . \tag{B.4.118} $$

In the absence of anisotropic stress ($\delta\Sigma_i = 0$) the divergence-free momentum $\delta q_i$ decays with the expansion of the universe; see Eqn. (B.4.117). The shear perturbation $\dot F_i + S_i/a$ then vanishes by Eqn. (B.4.118). Under most circumstances vector perturbations are therefore subdominant.

They won’t play an important role in these lectures. In particular, vector perturbations aren’t

created by inflation.

B.5 Tensors


Tensor metric perturbations are defined as

$$ ds^2 = -dt^2 + a^2(t) \left[ \delta_{ij} + h_{ij} \right] dx^i dx^j , \tag{B.5.119} $$

where $h_{ij,i} = h_{ii} = 0$. Tensor perturbations are automatically gauge-invariant (at linear order). It is conventional to decompose tensor perturbations into eigenmodes of the spatial Laplacian, $\nabla^2 e_{ij} = -k^2 e_{ij}$, with comoving wavenumber $k$ and scalar amplitude $h(t)$,

$$ h_{ij} = h(t)\, e^{(+,\times)}_{ij}(x) . \tag{B.5.120} $$

Here, + and × denote the two possible polarization states.



Tensor perturbations are sourced by anisotropic stress Σij , with Σij,i = Σii = 0. It is typically

a good approximation to assume that the anisotropic stress is negligible, although a small

amplitude is induced by neutrino free-streaming.


For tensor perturbations there is only one Einstein Equation. In the absence of anisotropic stress this is

$$ \ddot h + 3H \dot h + \frac{k^2}{a^2} h = 0 . \tag{B.5.121} $$

This is a wave equation describing the evolution of gravitational waves in an expanding universe. Gravitational waves are produced by inflation, but then decay with the expansion of the universe. However, at recombination their amplitude may still be large enough to leave distinctive signatures in B-modes of CMB polarization.
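To get a feeling for the wave equation (B.5.121), one can integrate it numerically. The Python sketch below makes several simplifying assumptions that are not part of the text: an exact de Sitter background with constant $H$ and $a = e^{Ht}$, units with $k = H = 1$, a fourth-order Runge-Kutta integrator, and initial data taken from the real solution $h = \cos(k\tau) + k\tau \sin(k\tau)$ with conformal time $\tau = -1/(aH)$. It shows the mode oscillating inside the horizon and freezing to a constant outside.

```python
import math

def mode_exact(k, H, t):
    """Real de Sitter solution h = cos(x) + x sin(x), with x = k*tau = -k/(aH)."""
    x = -k * math.exp(-H * t) / H
    return math.cos(x) + x * math.sin(x)

def evolve(k=1.0, H=1.0, t0=-3.0, t1=6.0, dt=1e-3):
    """Integrate h'' + 3H h' + (k/a)^2 h = 0 with classical RK4; return h(t1)."""
    def rhs(t, h, v):
        a = math.exp(H * t)
        return v, -3.0 * H * v - (k / a) ** 2 * h

    x0 = -k * math.exp(-H * t0) / H
    h = math.cos(x0) + x0 * math.sin(x0)
    v = -H * x0 ** 2 * math.cos(x0)      # dh/dt of the exact solution at t0
    t = t0
    for _ in range(round((t1 - t0) / dt)):
        a1, b1 = rhs(t, h, v)
        a2, b2 = rhs(t + dt / 2, h + dt / 2 * a1, v + dt / 2 * b1)
        a3, b3 = rhs(t + dt / 2, h + dt / 2 * a2, v + dt / 2 * b2)
        a4, b4 = rhs(t + dt, h + dt * a3, v + dt * b3)
        h += dt / 6 * (a1 + 2 * a2 + 2 * a3 + a4)
        v += dt / 6 * (b1 + 2 * b2 + 2 * b3 + b4)
        t += dt
    return h

h_late = evolve()                        # k/(aH) ~ e^{-6} at the end
h_mid = evolve(t1=4.0)                   # already well outside the horizon
print(h_late, h_mid, mode_exact(1.0, 1.0, 6.0))
```

The two late-time values agree closely with each other and with the analytic solution, illustrating that the amplitude stops evolving once $k \ll aH$ in this idealized background.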

C Exercises

Exercises for Chapter 1

Problem 1 (Horizon Problem)

- Consider an FRW metric in conformal coordinates, $ds^2 = a(\tau)^2 [-d\tau^2 + d\mathbf{x}^2]$. The scale factor in front of the whole metric does not affect the propagation of light rays and therefore does not affect causality. So why does the condition $\ddot a > 0$ solve the horizon problem?

- Suppose all matter fields (photons, for example) are coupled not directly to the metric $g_{\mu\nu}$, but to $\tilde g_{\mu\nu} = h(\phi) g_{\mu\nu}$, where $\phi$ is a scalar field evolving in time and $h$ is a given function. We assume that we are in the 'Einstein frame', i.e. that the action for $g_{\mu\nu}$ is just the standard Einstein-Hilbert action. Is it enough to have acceleration of the "effective" scale factor $\tilde a$ of the metric $\tilde g$ to solve the horizon problem?

- Assuming instantaneous reheating after inflation at temperature $T_{\rm rh}$, show, using entropy conservation, that we need at least

$$ N = 46 + \log \frac{T_{\rm rh}}{10^{10}\, {\rm GeV}} + \frac{1}{2} \log |\Omega_i - 1| $$

e-folds of inflation to solve the flatness problem. Here, $\Omega_i$ is the curvature parameter when inflation begins.
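To see the size of the numbers involved, the bound can be evaluated for a few reheating temperatures. The sketch below simply codes the formula quoted in the problem, assuming that "log" denotes the natural logarithm; the function name and the sample inputs are illustrative choices, and the sketch does not replace the derivation asked for.

```python
import math

def min_efolds(T_rh_GeV, omega_i_minus_one):
    """Minimum e-folds from the quoted formula (natural logarithms assumed)."""
    return (46.0
            + math.log(T_rh_GeV / 1e10)
            + 0.5 * math.log(abs(omega_i_minus_one)))

n_ref = min_efolds(1e10, 1.0)    # reference point of the formula
n_gut = min_efolds(1e15, 1.0)    # hypothetical high reheating temperature
print(n_ref, n_gut)
```

The dependence on the reheating temperature is only logarithmic, so changing $T_{\rm rh}$ by five orders of magnitude shifts the required number of e-folds by about 11.5.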

Problem 2 In a bouncing model, the scale factor is initially contracting, then reaches a minimum (the bounce) and then starts a (decelerated) expansion. Compare the diagram $k^{-1} a$ vs. $H^{-1}$ for inflation and for bouncing models. Discuss how to realize a bounce.

Problem 3 The ratio between pressure and energy density is usually called w ≡ p/ρ. What

are the possible values of w for a scalar field, with standard kinetic term and potential, which

evolves in time, assuming ρ > 0? And if we assume a positive definite potential?

Problem 4 ($\lambda\phi^4$ Inflation) Determine the predictions of an inflationary model with a quartic potential

$$ V(\phi) = \lambda \phi^4 . $$

1. Compute the slow-roll parameters ε and η in terms of φ.

2. Determine φend, the value of the field at which inflation ends.


3. To determine the spectrum, you will need to evaluate $\epsilon$ and $\eta$ at horizon crossing, $k = aH$ (or $-k\tau = 1$). Choose the wavenumber $k$ to be equal to $a_0 H_0$, roughly the horizon today. Show that the requirement $-k\tau = 1$ then corresponds to

$$ e^{60} = \int_0^N dN' \, \frac{e^{N'}}{H(N')/H_{\rm end}} , $$

where $H_{\rm end}$ is the Hubble rate at the end of inflation, and $N$ is defined to be the number of e-folds before the end of inflation

$$ N \equiv \ln \left( \frac{a_{\rm end}}{a} \right) . $$

4. Take this Hubble rate to be a constant in the above with $H/H_{\rm end} = 1$. This implies that $N \approx 60$. Turn this into an expression for $\phi$. The simplest way to do this is to note that $N = \int_t^{t_{\rm end}} dt'\, H(t')$ and assume that $H$ is dominated by potential energy. Show that this mode leaves the horizon when $\phi = 22 M_{\rm pl}$.

5. Determine the predicted values of ns, r and nt. Compare these predictions to the latest

CMB data.

6. Estimate the scalar amplitude in terms of $\lambda$. Set $\Delta_s^2 \approx 10^{-9}$. What value does this imply for $\lambda$?

This model illustrates many of the features of generic inflationary models: (i) the field is of

order – even greater than – the Planck scale, but (ii) the energy scale V is much smaller than

Planckian because of (iii) the very small coupling constant.
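The numbers quoted in this problem can be cross-checked with a few lines of Python. The sketch below is a check under stated assumptions, not a full solution: it uses the potential slow-roll parameters $\epsilon = \frac{1}{2} M_{\rm pl}^2 (V'/V)^2$ and $\eta = M_{\rm pl}^2 V''/V$, ends inflation at $\epsilon = 1$, uses the slow-roll e-fold integral, and takes the first-order relations $n_s = 1 - 6\epsilon + 2\eta$, $r = 16\epsilon$, $n_t = -2\epsilon$. These conventions are assumed here and may differ from those of the main text by factors in the definition of $\eta$.

```python
import math

M_PL = 1.0  # reduced Planck mass; all field values are in these units

def epsilon(phi):
    """Slow-roll epsilon for V = lambda*phi^4, where V'/V = 4/phi."""
    return 0.5 * M_PL**2 * (4.0 / phi) ** 2

def eta(phi):
    """Slow-roll eta for V = lambda*phi^4, where V''/V = 12/phi^2."""
    return M_PL**2 * 12.0 / phi**2

# End of inflation: epsilon(phi_end) = 1.
phi_end = math.sqrt(8.0) * M_PL

def efolds(phi):
    """Slow-roll e-folds between phi and phi_end: integral of phi/(4 M_pl^2) dphi."""
    return (phi**2 - phi_end**2) / (8.0 * M_PL**2)

def phi_at(N):
    """Field value N e-folds before the end of inflation."""
    return math.sqrt(8.0 * M_PL**2 * N + phi_end**2)

phi_60 = phi_at(60.0)
n_s = 1.0 - 6.0 * epsilon(phi_60) + 2.0 * eta(phi_60)
r = 16.0 * epsilon(phi_60)
n_t = -2.0 * epsilon(phi_60)

print(phi_60, n_s, r, n_t)
```

With these conventions the field value at $N = 60$ comes out close to $22 M_{\rm pl}$, consistent with item 4, and the tensor-to-scalar ratio is of order a few tenths.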



Problem 1

- A harmonic oscillator of frequency $\omega_i$ is in its vacuum state. Its frequency is instantaneously changed to $\omega_f$. Write the state of the system in terms of the new eigenstates and calculate its energy. This integral of Hermite polynomials may be useful:

$$ \int_{-\infty}^{+\infty} e^{-x^2} H_{2m}(xy)\, dx = \sqrt{\pi}\, \frac{(2m)!}{m!}\, (y^2 - 1)^m . $$
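The integral identity above can be spot-checked numerically before it is used. The Python sketch below assumes that $H_n$ denotes the physicists' Hermite polynomials (generated by $H_{n+1} = 2z H_n - 2n H_{n-1}$) and approximates the integral with a trapezoidal rule on a finite interval; the function names, the cutoff and the grid size are arbitrary choices.

```python
import math

def hermite(n, z):
    """Physicists' Hermite polynomial H_n(z) via the three-term recurrence."""
    if n == 0:
        return 1.0
    h_prev, h = 1.0, 2.0 * z
    for j in range(1, n):
        h_prev, h = h, 2.0 * z * h - 2.0 * j * h_prev
    return h

def integral_lhs(m, y, cutoff=10.0, points=4001):
    """Trapezoidal estimate of the integral of exp(-x^2) H_{2m}(x y) over the real line."""
    dx = 2.0 * cutoff / (points - 1)
    total = 0.0
    for i in range(points):
        x = -cutoff + i * dx
        weight = 0.5 if i in (0, points - 1) else 1.0
        total += weight * math.exp(-x * x) * hermite(2 * m, x * y)
    return total * dx

def closed_form_rhs(m, y):
    """Right-hand side sqrt(pi) (2m)!/m! (y^2 - 1)^m."""
    return (math.sqrt(math.pi) * math.factorial(2 * m) / math.factorial(m)
            * (y * y - 1.0) ** m)

lhs = integral_lhs(2, 0.5)
rhs = closed_form_rhs(2, 0.5)
print(lhs, rhs)
```

For smooth, rapidly decaying integrands the trapezoidal rule converges very quickly, so the two numbers agree to many digits.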

- A harmonic oscillator of frequency $\omega_i$ is in its first excited state and its frequency is

instantaneously changed to ωf ωi. What is its final energy? [No long calculation is needed.]

Problem 2 Photons are massless, but they are not produced during inflation. Why?

Problem 3 Calculate the equal time 2-point function of a massless scalar in a fixed de Sitter

background in real space. What is the physical meaning of the IR divergence?

Problem 4 Using symmetry and simple scaling arguments, calculate the tilt of the spectrum of a scalar with small mass, $m^2 \ll H^2$, in a fixed de Sitter background.



Problem 1

The objective of this problem is to reproduce the results of Seljak 94 (S94). Starting with

the equations in Ma & Bertschinger 95 derive equation (3) in S94. Derive the solutions for the

evolution of the background quantities y, η as a function of x. Write a routine in Mathematica

(using NDSolve) that solves the equations for the perturbations up to recombination for a given

value of κ and cosmological parameters. Reproduce the top panel of figure 1. Derive equation

(5) starting from the integral solution. Make a spline of the sources at recombination as a function of κ and integrate them to obtain $C_\ell$. Reproduce the bottom panel of figure 2. Include Silk damping as done in S94. For the more ambitious, you can include damping by adding shear viscosity directly to the equations (see Mukhanov's book).

Problem 2

Use the equations derived in the previous problem to determine the parameters of the cosmological model that govern the dynamics of the perturbations up to recombination. What is the

effect of changing the distance to the last scattering surface? Assume you want to determine

Ωm, Ωb, ΩΛ and h using the CMB temperature power spectra. Argue that there is a degeneracy

between parameters.

Problem 3

The objective of this problem is to get a sense of how the temperature power spectrum depends

on the cosmological parameters and how well current data can determine these parameters. We

will consider the following parameters to describe the matter content of the universe (ΩΛ, Ωbh2,

Ωbh2) and we will restrict ourselves to flat models, (Ωk = 0). To describe the power spectrum

of initial curvature fluctuations we will use the amplitude As and the spectral index ns. We will

not consider gravity waves.

We will compare theoretical models with the latest WMAP data. All the necessary ingredients can be found in LAMBDA http://lambda.gsfc.nasa.gov/. Use the CAMB online tool from LAMBDA to compute the temperature power spectra for the WMAP best-fit model. You can choose the specific parameters from the WMAP best-fit table, also in LAMBDA. Produce three families of models with one parameter of ($\Omega_\Lambda$, $\Omega_b h^2$, $\Omega_b h^2$) varying in each. Produce plots that show the $C_\ell$ for each family. In these plots also show the WMAP data with its corresponding error bars (you can download a table with the power spectra and its error bars from LAMBDA). For each family explain the physics behind the changes in the power spectra. Roughly estimate (by comparing the changes produced by each parameter with the error bars in the data) the range of values for each parameter that seems acceptable. Compare with what is given in the table of best-fit parameters.



Problem 1 Consider chaotic inflation with $V(\phi) = \frac{1}{2} m^2 \phi^2$ and the coupling $g^2 \phi^2 \chi^2$. Broad resonance occurs for $g > 2m/\Phi \sim 10^{-6}$, where we used $\Phi \sim M_{\rm pl}$ and $m \sim 10^{-6} M_{\rm pl}$ (from COBE normalization). Is such a large coupling consistent with naturalness of the inflationary potential? Consider the one-loop correction to the inflaton mass

$$ \delta m^2 = \frac{g^2}{16\pi^2} \Lambda_{\rm uv}^2 \sim \frac{g^2}{16\pi^2} M_{\rm pl}^2 . $$

Naturalness requires $\delta m < m \sim 10^{-6} M_{\rm pl}$, or $g < 10^{-5}$. This seems to disallow the regime of broad parametric resonance for chaotic inflation. The reheating of this model is now predominantly through the elementary decays of $\phi$ and narrow resonances in $\chi$.



Problem 1 Using symmetry arguments, show that the n-point function of ζ in Fourier space

in a generic model of inflation is of the form

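$$ \langle \zeta_{\mathbf{k}_1} \cdots \zeta_{\mathbf{k}_n} \rangle = (2\pi)^3 \delta(\mathbf{k}_1 + \cdots + \mathbf{k}_n)\, F(\mathbf{k}_n) , \tag{C.0.1} $$

where $F$ is a homogeneous function of the $\mathbf{k}$s of degree $-3(n-1)$.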

Problem 2 Consider a massless scalar φ in de Sitter space with an interaction Mφ3. Calculate

the 3-point function 〈φk1φk2φk3〉.



Problem 1

Consider a probe D3-brane in an AdS5 ×X5 throat

$$ ds^2 = \frac{r^2}{R^2} \left( -dt^2 + d\mathbf{x}^2 \right) + \frac{R^2}{r^2}\, dr^2 + ds^2_{X_5} . $$

Imagine the throat is cut off at r = ruv ∼ R and connected to a compactification. The brane can

move in r, with canonically normalized field φ ∼ r/α′. Compute the four-dimensional Planck

mass as a function of the maximum value of the canonically normalized field φuv ∼ ruv/α′ and

the geometry of the compact dimensions. Using the Lyth bound, convert this to a bound on the

tensor to scalar ratio as a function of these quantities.

Problem 2

- Consider the DBI action

$$ S = -\int d^4x \sqrt{-g}\; \frac{\phi^4}{\lambda} \sqrt{1 - \lambda (\partial_\mu \phi)^2 / \phi^4} - V(\phi) . $$

Derive the stress-energy tensor. Show explicitly that inflation can occur even on a steep potential

that does not satisfy the slow-roll conditions.

- Generalize this to

$$ S = \int d^4x \sqrt{-g}\; P(X, \phi) , \quad X \equiv -g^{\mu\nu} \partial_\mu \phi\, \partial_\nu \phi . $$
