yorikiyo nagashima - archive

967

Upload: others

Post on 18-Apr-2022

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Yorikiyo Nagashima - Archive
Page 2: Yorikiyo Nagashima - Archive

Yorikiyo Nagashima

Elementary Particle Physics

Volume 1: Quantum Field Theory and Particles

WILEY-VCH Verlag GmbH & Co. KGaA

Page 3: Yorikiyo Nagashima - Archive
Page 4: Yorikiyo Nagashima - Archive

Yorikiyo Nagashima

Elementary Particle Physics

Page 5: Yorikiyo Nagashima - Archive

Related Titles

Nagashima, Y.

Elementary Particle PhysicsVolume 2: Standard Model and Experiments

Approx. 2013ISBN 978-3-527-40966-2

Russenschuck, S.

Field Computation for Accelerator MagnetsAnalytical and Numerical Methods for Electromagnetic Design and Optimization

2010ISBN 978-3-527-40769-9

Stock, R. (ed.)

Encyclopedia of Applied High Energy and Particle Physics

2009

ISBN 978-3-527-40691-3

Griffiths, D.

Introduction to Elementary Particles

2008ISBN 978-3-527-40601-2

Belusevic, R.

Relativity, Astrophysics and Cosmology

2008ISBN 978-3-527-40764-4

Reiser, M.

Theory and Design of Charged Particle Beams

2008ISBN 978-3-527-40741-5

Page 6: Yorikiyo Nagashima - Archive

Yorikiyo Nagashima

Elementary Particle Physics

Volume 1: Quantum Field Theory and Particles

WILEY-VCH Verlag GmbH & Co. KGaA

Page 7: Yorikiyo Nagashima - Archive

The Author

Yorikiyo NagashimaOsaka [email protected]

Cover Image

Japanese symbol that denotes“void” or “nothing”; alsosymbolizes “supreme state ofmatter” or “spirit”.

All books published by Wiley-VCH are carefullyproduced. Nevertheless, authors, editors, andpublisher do not warrant the informationcontained in these books, including this book, tobe free of errors. Readers are advised to keep inmind that statements, data, illustrations,procedural details or other items mayinadvertently be inaccurate.

Library of Congress Card No.: applied for

British Library Cataloguing-in-Publication Data:A catalogue record for this book is availablefrom the British Library.

Bibliographic information published by theDeutsche NationalbibliothekThe Deutsche Nationalbibliothek lists thispublication in the Deutsche Nationalbibliografie;detailed bibliographic data are available on theInternet at http://dnb.d-nb.de.

© 2010 WILEY-VCH Verlag GmbH & Co. KGaA,Weinheim

All rights reserved (including those of translationinto other languages). No part of this book maybe reproduced in any form – by photoprinting,microfilm, or any other means – nor transmittedor translated into a machine language withoutwritten permission from the publishers. Regis-tered names, trademarks, etc. used in this book,even when not specifically marked as such, arenot to be considered unprotected by law.

Typesetting le-tex publishing services GmbH,LeipzigPrinting and Binding betz-druck GmbH,DarmstadtCover Design Adam Design, Weinheim

Printed in the Federal Republic of GermanyPrinted on acid-free paper

ISBN 978-3-527-40962-4

Page 8: Yorikiyo Nagashima - Archive

V

Foreword

This is a unique book. It covers the entire theoretical and experimental content ofparticle physics.

Particle physics sits at the forefront of our search for the ultimate structure ofmatter at the smallest scale, but in the process it has also learned to question thenature of our space and time in which they exist. Going hand in hand with tech-nological advances, particle physics now has extended its reach to studies of thestructure and the history of the universe. It seems that both the ultimate small andthe ultimate large are linked together. To explore one you must also explore theother.

Thus particle physics covers a vast area. To master it, you usually have to readdifferent books, at increasingly advanced levels, on individual subjects like its his-torical background, basic experimental data, quantum field theory, mathematicalconcepts, theoretical models, cosmological concepts, etc. This book covers most ofthose topics in a single volume in an integrated manner. Not only that, it showsyou how to derive each important mathematical formula in minute detail, thenasks you to work out many problems, with answers given at the end of the book.Some abstract formulas are immediately followed by their intuitive interpretationand experimental consequences. The same topics are often repeated at differentlevels of sophistication in different chapters as you read on, which will help deepenyour understanding.

All these features are quite unique to this book, and will be most helpful to stu-dents as well as laymen or non-experts who want to learn the subject seriously andenjoy it. It can serve both as a text book and as a compendium on particle physics.Even for practicing particle physicists and professors this will be a valuable refer-ence book to keep at hand. Few people like Professor Nagashima, an accomplishedexperimental physicist who is also conversant with sophisticated theoretical sub-jects, could have written it.

Chicago, October 2009 Yoichiro Nambu

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 9: Yorikiyo Nagashima - Archive
Page 10: Yorikiyo Nagashima - Archive

VII

Contents

Foreword V

Preface XVII

Acknowledgements XXI

Part One A Field Theoretical Approach 1

1 Introduction 31.1 An Overview of the Standard Model 31.1.1 What is an Elementary Particle? 31.1.2 The Four Fundamental Forces and Their Unification 41.1.3 The Standard Model 71.2 The Accelerator as a Microscope 11

2 Particles and Fields 132.1 What is a Particle? 132.2 What is a Field? 212.2.1 Force Field 212.2.2 Relativistic Wave Equation 252.2.3 Matter Field 272.2.4 Intuitive Picture of a Field and Its Quantum 282.2.5 Mechanical Model of a Classical Field 292.3 Summary 322.4 Natural Units 33

3 Lorentz Invariance 373.1 Rotation Group 373.2 Lorentz Transformation 413.2.1 General Formalism 413.2.2 Lorentz Vectors and Scalars 433.3 Space Inversion and Time Reversal 453.4 Covariant Formalism 473.4.1 Tensors 473.4.2 Covariance 483.4.3 Supplementing the Time Component 49

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 11: Yorikiyo Nagashima - Archive

VIII Contents

3.4.4 Rapidity 513.5 Lorentz Operator 533.6 Poincaré Group* 56

4 Dirac Equation 594.1 Relativistic Schrödinger Equation 594.1.1 Dirac Matrix 594.1.2 Weyl Spinor 614.1.3 Interpretation of the Negative Energy 644.1.4 Lorentz-Covariant Dirac Equation 694.2 Plane-Wave Solution 714.3 Properties of the Dirac Particle 754.3.1 Magnetic Moment of the Electron 754.3.2 Parity 774.3.3 Bilinear Form of the Dirac Spinor 784.3.4 Charge Conjugation 794.3.5 Chiral Eigenstates 824.4 Majorana Particle 84

5 Field Quantization 895.1 Action Principle 895.1.1 Equations of Motion 895.1.2 Hamiltonian Formalism 905.1.3 Equation of a Field 915.1.4 Noether’s Theorem 955.2 Quantization Scheme 1005.2.1 Heisenberg Equation of Motion 1005.2.2 Quantization of the Harmonic Oscillator 1025.3 Quantization of Fields 1055.3.1 Complex Fields 1065.3.2 Real Field 1115.3.3 Dirac Field 1125.3.4 Electromagnetic Field 1145.4 Spin and Statistics 1195.5 Vacuum Fluctuation 1215.5.1 The Casimir Effect* 122

6 Scattering Matrix 1276.1 Interaction Picture 1276.2 Asymptotic Field Condition 1316.3 Explicit Form of the S-Matrix 1336.3.1 Rutherford Scattering 1356.4 Relativistic Kinematics 1366.4.1 Center of Mass Frame and Laboratory Frame 1366.4.2 Crossing Symmetry 1396.5 Relativistic Cross Section 141

Page 12: Yorikiyo Nagashima - Archive

Contents IX

6.5.1 Transition Rate 1416.5.2 Relativistic Normalization 1426.5.3 Incoming Flux and Final State Density 1446.5.4 Lorentz-Invariant Phase Space 1456.5.5 Cross Section in the Center of Mass Frame 1456.6 Vertex Functions and the Feynman Propagator 1476.6.1 eeγ Vertex Function 1476.6.2 Feynman Propagator 1516.7 Mott Scattering 1576.7.1 Cross Section 1576.7.2 Coulomb Scattering and Magnetic Scattering 1616.7.3 Helicity Conservation 1616.7.4 A Method to Rotate Spin 1616.8 Yukawa Interaction 162

7 Qed: Quantum Electrodynamics 1677.1 e–μ Scattering 1677.1.1 Cross Section 1677.1.2 Elastic Scattering of Polarized e–μ 1717.1.3 e�eC ! μ�μC Reaction 1747.2 Compton Scattering 1767.3 Bremsstrahlung 1817.3.1 Soft Bremsstrahlung 1837.4 Feynman Rules 186

8 Radiative Corrections and Tests of Qed* 1918.1 Radiative Corrections and Renormalization* 1918.1.1 Vertex Correction 1918.1.2 Ultraviolet Divergence 1938.1.3 Infrared Divergence 1978.1.4 Infrared Compensation to All Orders* 1998.1.5 Running Coupling Constant 2048.1.6 Mass Renormalization 2088.1.7 Ward–Takahashi Identity 2108.1.8 Renormalization of the Scattering Amplitude 2118.2 Tests of QED 2138.2.1 Lamb Shift 2138.2.2 g � 2 2148.2.3 Limit of QED Applicability 2168.2.4 E821 BNL Experiment 216

9 Symmetries 2219.1 Continuous Symmetries 2229.1.1 Space and Time Translation 2239.1.2 Rotational Invariance in the Two-Body System 2279.2 Discrete Symmetries 233

Page 13: Yorikiyo Nagashima - Archive

X Contents

9.2.1 Parity Transformation 2339.2.2 Time Reversal 2409.3 Internal Symmetries 2519.3.1 U(1) Gauge Symmetry 2519.3.2 Charge Conjugation 2529.3.3 CPT Theorem 2589.3.4 SU(2) (Isospin) Symmetry 260

10 Path Integral: Basics 26710.1 Introduction 26710.1.1 Bra and Ket 26710.1.2 Translational Operator 26810.2 Quantum Mechanical Equations 27110.2.1 Schrödinger Equation 27110.2.2 Propagators 27210.3 Feynman’s Path Integral 27410.3.1 Sum over History 27410.3.2 Equivalence with the Schrödinger Equation 27810.3.3 Functional Calculus 27910.4 Propagators: Simple Examples 28210.4.1 Free-Particle Propagator 28210.4.2 Harmonic Oscillator 28510.5 Scattering Matrix 29410.5.1 Perturbation Expansion 29510.5.2 S-Matrix in the Path Integral 29710.6 Generating Functional 30010.6.1 Correlation Functions 30010.6.2 Note on Imaginary Time 30210.6.3 Correlation Functions as Functional Derivatives 30410.7 Connection with Statistical Mechanics 306

11 Path Integral Approach to Field Theory 31111.1 From Particles to Fields 31111.2 Real Scalar Field 31211.2.1 Generating Functional 31211.2.2 Calculation of det A 31511.2.3 n-Point Functions and the Feynman Propagator 31811.2.4 Wick’s Theorem 31911.2.5 Generating Functional of Interacting Fields 32011.3 Electromagnetic Field 32111.3.1 Gauge Fixing and the Photon Propagator 32111.3.2 Generating Functional of the Electromagnetic Field 32311.4 Dirac Field 32411.4.1 Grassmann Variables 32411.4.2 Dirac Propagator 33111.4.3 Generating Functional of the Dirac Field 332

Page 14: Yorikiyo Nagashima - Archive

Contents XI

11.5 Reduction Formula 33311.5.1 Scalar Fields 33311.5.2 Electromagnetic Field 33711.5.3 Dirac Field 33711.6 QED 34011.6.1 Formalism 34011.6.2 Perturbative Expansion 34211.6.3 First-Order Interaction 34311.6.4 Mott Scattering 34511.6.5 Second-Order Interaction 34611.6.6 Scattering Matrix 35111.6.7 Connected Diagrams 35311.7 Faddeev–Popov’s Ansatz* 35411.7.1 A Simple Example* 35511.7.2 Gauge Fixing Revisited* 35611.7.3 Faddeev–Popov Ghost* 359

12 Accelerator and Detector Technology 36312.1 Accelerators 36312.2 Basic Parameters of Accelerators 36412.2.1 Particle Species 36412.2.2 Energy 36612.2.3 Luminosity 36712.3 Various Types of Accelerators 36912.3.1 Low-Energy Accelerators 36912.3.2 Synchrotron 37312.3.3 Linear Collider 37712.4 Particle Interactions with Matter 37812.4.1 Some Basic Concepts 37812.4.2 Ionization Loss 38112.4.3 Multiple Scattering 38912.4.4 Cherenkov and Transition Radiation 39012.4.5 Interactions of Electrons and Photons with Matter 39412.4.6 Hadronic Shower 40112.5 Particle Detectors 40312.5.1 Overview of Radioisotope Detectors 40312.5.2 Detectors that Use Light 40412.5.3 Detectors that Use Electric Signals 41012.5.4 Functional Usage of Detectors 41512.6 Collider Detectors 42212.7 Statistics and Errors 42812.7.1 Basics of Statistics 42812.7.2 Maximum Likelihood and Goodness of Fit 43312.7.3 Least Squares Method 438

Page 15: Yorikiyo Nagashima - Archive

XII Contents

Part Two A Way to the Standard Model 441

13 Spectroscopy 44313.1 Pre-accelerator Age (1897–1947) 44413.2 Pions 44913.3 πN Interaction 45413.3.1 Isospin Conservation 45413.3.2 Partial Wave Analysis 46213.3.3 Resonance Extraction 46613.3.4 Argand Diagram: Digging Resonances 47213.4 � (770) 47513.5 Final State Interaction 47813.5.1 Dalitz Plot 47813.5.2 K Meson 48113.5.3 Angular Momentum Barrier 48413.5.4 ω Meson 48513.6 Low-Energy Nuclear Force 48713.6.1 Spin–Isospin Exchange Force 48713.6.2 Effective Range 49013.7 High-Energy Scattering 49113.7.1 Black Sphere Model 49113.7.2 Regge Trajectory* 494

14 The Quark Model 50114.1 S U(3) Symmetry 50114.1.1 The Discovery of Strange Particles 50214.1.2 The Sakata Model 50514.1.3 Meson Nonets 50714.1.4 The Quark Model 50914.1.5 Baryon Multiplets 51014.1.6 General Rules for Composing Multiplets 51114.2 Predictions of SU(3) 51314.2.1 Gell-Mann–Okubo Mass Formula 51314.2.2 Prediction of Ω 51414.2.3 Meson Mixing 51614.3 Color Degrees of Freedom 51914.4 S U(6) Symmetry 52214.4.1 Spin and Flavor Combined 52214.4.2 S U(6) � O(3) 52514.5 Charm Quark 52514.5.1 J/ψ 52514.5.2 Mass and Quantum Number of J/ψ 52714.5.3 Charmonium 52714.5.4 Width of J/ψ 53314.5.5 Lifetime of Charmed Particles 53614.5.6 Charm Spectroscopy: SU(4) 537

Page 16: Yorikiyo Nagashima - Archive

Contents XIII

14.5.7 The Fifth Quark b (Bottom) 53914.6 Color Charge 53914.6.1 Color Independence 54214.6.2 Color Exchange Force 54414.6.3 Spin Exchange Force 54514.6.4 Mass Formulae of Hadrons 547

15 Weak Interaction 55315.1 Ingredients of the Weak Force 55315.2 Fermi Theory 55515.2.1 Beta Decay 55515.2.2 Parity Violation 56215.2.3 π Meson Decay 56415.3 Chirality of the Leptons 56715.3.1 Helicity and Angular Correlation 56715.3.2 Electron Helicity 56915.4 The Neutrino 57115.4.1 Detection of the Neutrino 57115.4.2 Mass of the Neutrino 57215.4.3 Helicity of the Electron Neutrino 57615.4.4 The Second Neutrino νμ 57815.5 The Universal V–A Interaction 57915.5.1 Muon Decay 57915.5.2 CVC Hypothesis 58415.6 Strange Particle Decays 58915.6.1 ΔS D ΔQ Rule 58915.6.2 Δ I D 1/2 Rule 59115.6.3 Kl3 W KC ! π0 C lC C ν 59215.6.4 Cabibbo Rotation 59615.7 Flavor Conservation 59815.7.1 GIM Mechanism 59815.7.2 Kobayashi–Maskawa Matrix 60015.7.3 Tau Lepton 60115.7.4 The Generation Puzzle 60515.8 A Step Toward a Unified Theory 60815.8.1 Organizing the Weak Phenomena 60815.8.2 Limitations of the Fermi Theory 61015.8.3 Introduction of SU(2) 614

16 Neutral Kaons and CP Violation* 61716.1 Introduction 61816.1.1 Strangeness Eigenstates and CP Eigenstates 61816.1.2 Schrödinger Equation for K0 � K

0States 619

16.1.3 Strangeness Oscillation 62216.1.4 Regeneration of K1 626

Page 17: Yorikiyo Nagashima - Archive

XIV Contents

16.1.5 Discovery of CP Violation 63016.2 Formalism of CP and CPT Violation 63216.2.1 CP, T, CPT Transformation Properties 63216.2.2 Definition of CP Parameters 63516.3 CP Violation Parameters 64016.3.1 Observed Parameters 64016.3.2 � and �0 64416.4 Test of T and CPT Invariance 65316.4.1 Definition of T- and CPT-Violating Amplitudes 65416.4.2 T Violation 65416.4.3 CPT violation 65616.4.4 Possible Violation of Quantum Mechanics 66216.5 Experiments on CP Parameters 66416.5.1 CPLEAR 66416.5.2 NA48/KTeV 66616.6 Models of CP Violation 673

17 Hadron Structure 67917.1 Historical Overview 67917.2 Form Factor 68017.3 e–p Elastic Scattering 68317.4 Electron Proton Deep Inelastic Scattering 68717.4.1 Cross-Section Formula for Inelastic Scattering 68717.4.2 Bjorken Scaling 69017.5 Parton Model 69317.5.1 Impulse Approximation 69317.5.2 Electron–Parton Scattering 69617.6 Scattering with Equivalent Photons 69917.6.1 Transverse and Longitudinal Photons 69917.6.2 Spin of the Target 70217.6.3 Photon Flux 70317.7 How to Do Neutrino Experiments 70517.7.1 Neutrino Beams 70517.7.2 Neutrino Detectors 70917.8 ν–p Deep Inelastic Scattering 71217.8.1 Cross Sections and Structure Functions 71217.8.2 ν, ν–q Scattering 71517.8.3 Valence Quarks and Sea Quarks 71617.8.4 Comparisons with Experimental Data 71717.8.5 Sum Rules 71917.9 Parton Model in Hadron–Hadron Collisions 72117.9.1 Drell–Yan Process 72117.9.2 Other Hadronic Processes 72417.10 A Glimpse of QCD’s Power 725

Page 18: Yorikiyo Nagashima - Archive

Contents XV

18 Gauge Theories 72918.1 Historical Prelude 72918.2 Gauge Principle 73118.2.1 Formal Definition 73118.2.2 Gravity as a Geometry 73318.2.3 Parallel Transport and Connection 73418.2.4 Rotation in Internal Space 73718.2.5 Curvature of a Space 73918.2.6 Covariant Derivative 74118.2.7 Principle of Equivalence 74318.2.8 General Relativity and Gauge Theory 74518.3 Aharonov–Bohm Effect 74818.4 Nonabelian Gauge Theories 75418.4.1 Isospin Operator 75418.4.2 Gauge Potential 75518.4.3 Isospin Force Field and Equation of Motion 75718.5 QCD 76018.5.1 Asymptotic Freedom 76218.5.2 Confinement 76718.6 Unified Theory of the Electroweak Interaction 77018.6.1 S U(2) � U(1) Gauge Theory 77018.6.2 Spontaneous Symmetry Breaking 77418.6.3 Higgs Mechanism 77818.6.4 Glashow–Weinberg–Salam Electroweak Theory 78218.6.5 Summary of GWS Theory 784

19 Epilogue 78719.1 Completing the Picture 78819.2 Beyond the Standard Model 78919.2.1 Neutrino Oscillation 78919.2.2 GUTs: Grand Unified Theories 79119.2.3 Supersymmetry 79219.2.4 Superstring Model 79519.2.5 Extra Dimensions 79619.2.6 Dark Matter 79719.2.7 Dark Energy 798

Appendix A Spinor Representation 803A.1 Definition of a Group 803A.1.1 Lie Group 804A.2 S U(2) 805A.3 Lorentz Operator for Spin 1/2 Particle 809A.3.1 S L(2, C ) Group 809A.3.2 Dirac Equation: Another Derivation 811

Page 19: Yorikiyo Nagashima - Archive

XVI Contents

Appendix B Coulomb Gauge 813B.1 Quantization of the Electromagnetic Field in the Coulomb Gauge 814

Appendix C Dirac Matrix and Gamma Matrix Traces 817C.1 Dirac Plane Wave Solutions 817C.2 Dirac γ Matrices 817C.2.1 Traces of the γ Matrices 818C.2.2 Levi-Civita Antisymmetric Tensor 819C.3 Spin Sum of jM f i j

2 819C.3.1 A Frequently Used Example 820C.3.2 Polarization Sum of the Vector Particle 822C.4 Other Useful Formulae 823

Appendix D Dimensional Regularization 825D.1 Photon Self-Energy 825D.2 Electron Self-Energy 830

Appendix E Rotation Matrix 833E.1 Angular Momentum Operators 833E.2 Addition of the Angular Momentum 835E.3 Rotational Matrix 835

Appendix F C, P, T Transformation 839

Appendix G S U(3), S U(n) and the Quark Model 841G.1 Generators of the Group 841G.1.1 Adjoint Representation 842G.1.2 Direct Product 843G.2 SU(3) 844G.2.1 Structure Constants 844G.2.2 Irreducible Representation of a Direct Product 846G.2.3 Tensor Analysis 851G.2.4 Young Diagram 854

Appendix H Mass Matrix and Decaying States 859H.1 The Decay Formalism 859

Appendix I Answers to the Problems 865

Appendix J Particle Data 915

Appendix K Constants 917

References 919

Index 929

Page 20: Yorikiyo Nagashima - Archive

XVII

Preface

The purpose of this book is to present introductory explanations to junior gradu-ate students of major advances in particle physics made in the last century. It isalso aimed at laypeople who have not received a standard education in physics butare willing to make an extra effort to follow mathematical logic. Therefore it is or-ganized to be suitable for self-study and is as self-contained as possible. This iswhy it starts with an introduction to field theory while the main purpose is to talkabout particle physics. Quantum field theory is an essential mathematical tool forunderstanding particle physics, which has fused quantum mechanics with specialrelativity. However, phenomena that need the theory of relativity are almost com-pletely limited to particle physics, cosmology and parts of astrophysics, which arenot topics of everyday life. Few students dare to take a course in special or gen-eral relativity. This fact, together with the mathematical complexity of field theory,makes particle physics hard to approach. If one tries to make a self-contained expla-nation of mathematical aspects of the field, hardly any space is left for the details ofparticle-physics phenomena. While good textbooks on field theories are abundant,very few talk about the rich phenomena in the world of elementary particle physics.One reason may be that the experimental progress is so rapid that any textbook thattries to take an in-depth approach to the experimental aspect of the field becomesobsolete by the time it is published. In addition, when one makes a great effort tofind a good book, basic formulae are usually given without explanation. The au-thor had long felt the need for a good textbook suitable for a graduate course, wellbalanced between theory and experiment.

Without doubt many people have been fascinated by the world that quantummechanics opens up. After a student has learned the secrets of its mystery, he(or she) must realize that his world view has changed for life. But at what depthwould it have been if he thought he had grasped the idea without knowledge ofthe Schödinger equation? This is why the author feels uneasy with presentation ofthe basic formulae as given from heaven, as is often the case in many textbooks onexperimental particle physics.

This book is the first of two volumes. Volume I paves the way to the StandardModel and Volume II develops applications and discusses recent progress.

Volume I, i.e. this book, is further divided into two parts, a field theoretical ap-proach and the way to the Standard Model. The aim of Part I is to equip students

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 21: Yorikiyo Nagashima - Archive

XVIII Preface

with tools so they can calculate basic formulae to analyze the experimental datausing quantized field theories at least at the tree level. Tailored to the above pur-pose, the author’s intention is to present a readable introduction and, hopefully, anintermediate step to the study of advanced field theories.

Chapters 1 and 2 are an introduction and outline of what the Standard Model is.Chapter 3 picks out essential ingredients of special relativity relevant for particlephysics and prepares for easy understanding of relativistic formulae that follow.Chapters 4–7, starting from the Dirac equation, field quantization, the scatteringmatrix and leading to QED (quantum electrodynamics), should be treated as one,closed subject to provide basic knowledge of relativistic field theories and step-by-step acquisition of the necessary tools for calculating various dynamic processes.For the treatment of higher order radiative corrections, only a brief explanation isgiven in Chapter 8, to discuss how mathematical consistency of the whole theoryis achieved by renormalization.

Chapter 9 deals with space-time and internal symmetry, somewhat independenttopics that play a double role as an introduction to the subject and appendices toparticle physics, which is discussed in Part II.

Two chapters, 10 and 11, on the path integral are included as an addition. Themethod is very powerful if one wants to go into the formal aspects of field theory,including non-Abelian gauge theories, in any detail. Besides it is the modern wayof interpreting quantum mechanics that has many applications in various fields,and the author felt acquaintance with the path integral is indispensable. However,from a practical point of view of calculating the tree-level formulae, the traditionalcanonical quantization method is all one needs. With all the new concepts andmathematical preparation, the chapter stops at a point where it has rederived whathad already been calculated before. So readers have the choice of skipping thesechapters and coming back later when their interest has been aroused.

By jumping to Chapter 18 which describes axioms of the Standard Model, thefield theoretical approach can be concluded in a self-contained way, at least formal-ly. But it was placed at the end of Part II because the phenomenological approachalso culminates in the formulation of the unified gauge theories. Although two al-ternative approaches are prepared to reach the goal, the purpose of the book is topresent particle physics, and field theory is to be considered as a tool and treated assuch.

As part of the minimum necessary knowledge, a chapter was added at the end ofPart I on basic techniques of experimental measurements, the other necessary toolto understand particle physics phenomena. Readers who are perplexed with thecombination may skip this chapter, but it is the author’s wish that even studentswho aim at a theoretical career should understand at least this level of experimentaltechniques. After all, the essence of natural science is to explain facts and theoristsshould understand what the data really mean. In addition, a few representativeexperiments are picked out and scattered throughout the book with appropriatedescriptions. They were chosen for their importance in physics but also in order todemonstrate how modern experiments are carried out.

Page 22: Yorikiyo Nagashima - Archive

Preface XIX

In Part II, various particle phenomena are presented, the SU(3) group is intro-duced, basic formulae are calculated and explanations are given. The reader shouldbe able to understand the quark structure of matter and basic rules on which themodern unified theory is constructed. Those who do not care how the formulaeare derived may skip most of Part I and start from Part II, treating Part I as anappendix, although that is not what the author intended. Part II ends when theprimary equations of the Standard Model of particle physics have been derived.

With this goal in mind, after presenting an overview of hadron spectroscopy inChapter 13 and the quark model in Chapter 14, we concentrate mainly on the elec-tro-weak force phenomena in Chapter 15 to probe their dynamical structure andexplain processes that lead to a unified view of electromagnetic and weak inter-actions. Chapter 17 is on the hadron structure, a dynamical aspect of the quarkmodel that has led to QCD. Chapter 18 presents the principles of gauge theory; itsstructure and physical interpretation are given. Understanding the axioms of theStandard Model is the goal of the whole book.

Chapter 16 on CP violation in K mesons is an exception to the outline describedabove. It was included for two reasons. First, because it is an old topic, dating backto 1964, and second, because it is also a new topic, which in author’s personal per-ception may play a decisive role in the future course of particle physics. However,discussions from the viewpoint of unified theories is deferred until we discuss therole of B-physics in Volume II. Important though it is, its study is a side road fromthe main route of reaching the Standard Model, and this chapter should be con-sidered as a topic independent of the rest; one can study it when one’s interest isaroused.

Although the experimental data are all up-to-date, new phenomena that the Stan-dard Model predicted and were discovered after its establishment in the early 1970sare deferred to Volume II. The last chapter concludes what remains to be done onthe Standard Model and introduces some intriguing ideas about the future of par-ticle physics.

Volume II, which has yet to come, starts with the axioms of gauge field theo-ries and presents major experimental facts that the Standard Model predicted andwhose foundations were subsequently established. These include W, Z, QCD jetphenomena and CP violation in B physics in both electron and hadron colliders.The second half of Volume II deals issues beyond the Standard Model and recentdevelopments of particle physics, the Higgs, the neutrino, grand unified theories,unsolved problems in the Standard Model, such as axions, and the connection withcosmology.

A knowledge of quantum mechanics and the basics of special relativity is re-quired to read this book, but that is all that is required. One thing the author aimedat in writing this series is to make a reference book so that it can serve as a sortof encyclopedia later on. For that reason, each chapter is organized to be as self-contained as possible. The list of references is extensive on purpose. Also, as a re-sult, some chapters include sections that are at higher level than that the reader haslearned up to that point. An asterisk * is attached to those sections and problems.

Page 23: Yorikiyo Nagashima - Archive

XX Preface

If the reader feels they are hard, there is no point in getting stuck there, skip it andcome back later.

This book was based on a series of books originally written in Japanese and pub-lished by Asakura Shoten of Tokyo ten years ago. Since then, the author has re-organized and reformulated them with appropriate updating so that they can bepublished not as a translation but as new books.

Osaka, October, 2009 Yorikiyo Nagashima

Page 24: Yorikiyo Nagashima - Archive

XXI

Acknowledgements

The author wishes to thank Professor J. Arafune (Tokyo University) for valuableadvice and Professors T. Kubota and T. Yamanaka (Osaka University) for readingpart of the manuscript and for giving many useful suggestions. Needless to say, anymistakes are entirely the author’s responsibility and he appreciates it if the readernotifies them to him whenever possible.

The author would like to express his gratitude to the authors cited in the textand to the following publishers for permission to reproduce various photographs,figures and diagrams:

� American Institute of Physics, publisher of Physics Today for permission to re-produce Figure 19.4;

� American Physical Society, publisher of the Physical Review, Physical ReviewLetters and the Review of Modern Physics, for permission to reproduce Figures4.3, 5.2ab, 8.8, 9.2, 9.3ab, 12.18abc, 13.11a, 13.17, 13.19, 13.21, 13.23, 13.24ab,14.6, 14.13–15, 14.20b, 14.25, 14.30, 15.3ab, 15.13, 15.15, 15.17a, 15.21, 16.5–7ab, 16.19, 16.21–23, 17.2b–17.7, 17.12c, 17.15a, 17.21ab, 18.10ab, 18.11abc;

� Annual Reviews, publisher of Annual Review of Nuclear and Particle Sciencefor permission to reproduce Figures 12.23ab, 13.29, 14.18, 17.11;

� Elsevier Science Ltd., publisher of Nuclear Physics, Physics Letters, Physics Re-port, for permission to reproduce Figures 8.9ab, 12.24, 12.25, 12.29, 12.33ab,12.35ab, 12.37, 12.40–42, 13.14abcd, 13.18, 13.30, 15.8, 15.17b, 15.22, 15.23,16.1b, 16.8ab–16.10ab, 16.12, 16.13, 16.15–17ab, 17.19, 18.18;

� Institute of Physics Publishing Ltd., publisher of Report on Progress of Physicsfor permission to reproduce Figures 15.11 and 15.12;

� Japan Physical Society, publisher of Butsuri for permission to reproduce Figure12.26b;

� Nature Publishing Group, publisher of Nature for permission to reproduce Fig-ures 13.5 and 14.1;

� Particle Data Group, publisher of Review of Particle Physics for permission to re-produce Figures 12.10, 12.15–12.17, 12.20, 12.28b, 12.30ab, 12.38, 12.47, 13.25,13.26, 14.11, 14.16, 14.24, 14.26, 17.17, 18.16;

� Royal Society, publisher of the Proceedings of the Royal Society for permissionto reproduce Figure 13.4;

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 25: Yorikiyo Nagashima - Archive

XXII Acknowledgements

� Shokabo, publisher of the Cosmic Rays for permission to reproduce Figure12.22;

� Springer Verlag, publisher of Zeitschrift für Physik and European Journalof Physics for permission to reproduce Figures 12.43, 16.18, 17.14c, 17.15b,17.18ab, 17.24.

Page 26: Yorikiyo Nagashima - Archive

Part One A Field Theoretical Approach

Page 27: Yorikiyo Nagashima - Archive
Page 28: Yorikiyo Nagashima - Archive

3

1Introduction

1.1An Overview of the Standard Model

1.1.1What is an Elementary Particle?

Particle physics, also known as high-energy physics, is the field of natural sciencethat pursues the ultimate structure of matter. This is possible in two ways. One is tolook for elementary particles, the ultimate constituents of matter at their smallestscale, and the other is to clarify what interactions are acting among them to con-struct matter as we see them. The exploitable size of microscopic objects becomessmaller as technology develops. What was regarded as an elementary particle atone time is recognized as a structured object and relinquishes the title of “elemen-tary particle” to more fundamental particles in the next era. This process has beenrepeated many times throughout the history of science.

In the 19th century, when modern atomic theory was established, the exploitablesize of the microscopic object was � 10�10 m and the atom was “the elementaryparticle”. Then it was recognized as a structured object when J.J. Thomson extract-ed electrons in 1897 from matter in the form of cathode rays. Its real structure (theRutherford model) was clarified by investigating the scattering pattern of α-parti-cles striking a golden foil. In 1932, Chadwick discovered that the nucleus, the coreof the atom, consisted of protons and neutrons. In the same year, Lawrence con-structed the first cyclotron. In 1934 Fermi proposed a theory of weak interactions.In 1935 Yukawa proposed the meson theory to explain the nuclear force actingamong them.

It is probably fair to say that the modern history of elementary particles beganaround this time. The protons and neutrons together with their companion pions,which are collectively called hadrons, were considered as elementary particles un-til ca. 1960. We now know that they are composed of more fundamental particles,the quarks. Electrons remain elementary to this day. Muons and τ-leptons, whichwere found later, are nothing but heavy electrons, as far as the present technolo-gy can tell, and they are collectively dubbed leptons. Quarks and leptons are thefundamental building blocks of matter. The microscopic size that can be explored

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 29: Yorikiyo Nagashima - Archive

4 1 Introduction

by modern technology is nearing 10�19 m. The quarks and leptons are elementaryat this level. They may or may not be at a smaller level. The popular string theoryregards the most fundamental matter constituent not as a particle but as a stringat the Planck scale (� 10�35 m), but in this book we limit ourselves to treating onlyexperimentally established facts and foreseeable extensions of their models.

1.1.2The Four Fundamental Forces and Their Unification

Long-Range Forces: Gravity and the Electromagnetic Force There are four kinds offundamental forces in nature, the strong, electromagnetic, weak and gravitationalforces. The strong force, as the name suggests, is the strongest of all; it is a thou-sand times stronger, and the weak force a thousand times weaker, than the electro-magnetic force. The gravitational force or gravity is negligibly weak (that betweenthe proton and electron is 10�42 times the electromagnetic force) at elementaryparticle levels. Its strength becomes comparable to the others only on an extreme-ly small scale (Planck scale � 10�35 m), or equivalently at extremely high energy(Planck energy � 1019 GeV), not realizable on the terrestrial world. Despite its in-trinsic weakness, gravity controls the macroscopic world. Indeed, galaxies hundredmillions of light years apart are known to be mutually attracted by the gravita-tional force. The strength of the force decreases (only) inversely proportional tothe distance squared, but it also scales as the mass, and the concentration of themass more than compensates the distance even on the cosmic scale. We refer to itas a long-range force. The behavior of the electromagnetic force is similar to andfar stronger than gravity, but, in general, the positive and negative charges com-pensate each other. However, its long-distance effect manifests itself in the formof the galactic magnetic field, which spans hundreds or possibly millions of lightyears. It binds nuclei and electrons to form atoms, atoms to form molecules andmolecules to form matter as we observe them. Namely, it is the decisive force inthe microworld where the properties of matter are concerned. Atomic, molecu-lar and condensed matter physics need only consider the electromagnetic force. Amathematical framework, called quantum electrodynamics (QED), is used to de-scribe the dynamical behavior of point-like charged particles, notably electrons, theelectromagnetic field and their interactions. Mathematically, it is a combination ofquantized Maxwell equations and relativistic quantum mechanics.

Short-Range Forces: The Strong and Weak Forces In the ultramicroscopic world ofthe nuclei, hadrons and quarks (at scales less than� 10�15 m), both the strong andthe weak force come in. The reason why they are important only at such a smallscale is that they are short ranged reaching only a few hadron diameters, and ac-cumulation of the mass does not help to make it stronger. They are referred to asshort-range forces. The strong force (referred to as the color force at the most fun-damental level) acts between quarks to bind them to form hadrons. Historically, thestrong force was first discovered as the nuclear force to bind protons and neutrons.But as they were found to be composites of the quarks, the nuclear force is recog-

Page 30: Yorikiyo Nagashima - Archive

1.1 An Overview of the Standard Model 5

nized as a kind of molecular force (van der Waals force) that can be derived fromthe more fundamental color force. In 1935, Yukawa predicted the existence of theπ meson as the carrier of the nuclear force. The idea that the force is transmittedby a “force carrier particle” was revolutionary and laid the foundation for present-day gauge theories. Later, it was clarified that the pion was a composite and cannotbe a fundamental force carrier, but the basic idea remains valid. The weak force isknown to act in the decay of hadrons, notably in nuclear decays. It is also knownto control the burning rate of the sun and to play a decisive role to in the explosionof type II supernovae.

The electromagnetic force, though playing an essential role at the microscopiclevel, can also act at the macroscopic distance. We observe electromagnetic phe-nomena in daily life, for example electrically driven motors or the reaction of a tinymagnet to the earth’s magnetic field. The reason is that its strength is proportion-al to the inverse square of the distance (Coulomb’s law). In general, a long-rangeforce decreases only as the inverse power of the distance (� r�n) and can act on anobject no matter how far away it is if there is enough of the force source. A short-range force, on the other hand, is a force whose strength diminishes exponentiallywith the distance. The nuclear force, for instance, behaves like � e�r/ r0/r2(r0 �

10�15 m) and reaches only as far as r . r0. The range of the weak force is evenshorter, of the order of� 10�18 m. Because of this, despite their far greater strengththan gravity, their effects are not observed in the macroscopic world. However, ifone can confine enough matter within the reach of the strong force and let it re-act, it is possible to produce a large amount of energy. Nuclear power and nuclearfusion are examples of such applications.

When a Long-Range Force Becomes a Short-Range Force The above description em-phasizes the difference between the properties of the forces. Modern unified theo-ries tell us, however, that all the fundamental forces, despite their variety of appear-ances, are essentially long-range forces and can be treated within the mathematicalframework of gauge theories. The seemingly different behavior is a result of meta-morphosis caused by environmental conditions. We introduce here one example ofa long-range force becoming short range under certain circumstances. A train witha linear induction motor, the next generation supertrain, uses a phenomenon calledthe Meissner effect to magnetically levitate its body in the air. There, the magnet-ic field is completely repelled from the superconductor. This happens because, atvery low temperatures near absolute zero, electron pairs in a special configurationcalled the “Cooper pair” collectively react to the magnetic field to form a strong eddycurrent to cancel the invading magnetic flux. The cancellation is complete exceptfor a shallow depth at the surface, into which the flux can invade. In other words,the electromagnetic force, which is originally long range, becomes short range inthe superconductor. This example inspired Nambu [287] to formulate an importantconcept known as “spontaneous symmetry breaking”, which played a crucial rolein forming the Standard Model.

Page 31: Yorikiyo Nagashima - Archive

6 1 Introduction

Higgs Mechanism The essence of modern unified theory is to recognize that “thevacuum in which we live is in a sort of superconducting phase”. What correspondsto the Cooper pair in the superconductor is called the Higgs particle, unseen byus but thought to permeate everywhere in the universe. The weak force is longrange originally, but because of the ultracold vacuum that surrounds us becomesshort range, owing to the pseudo-Meissner effect caused by the Higgs particles.The short-range force translates to a finite mass of the force carrier particle inquantum field theory, as we shall see in the following sections. The Higgs parti-cle condenses at the ultracold temperature and causes a phase transition of thevacuum, a phenomenon similar to the vapor-to-water phase transition. Elementaryparticles, which have zero mass originally, acquire mass because of the draggingeffect caused by swimming through the condensed sea of the Higgs particle. Thismass-generating scheme is called the Higgs mechanism. It is thought that at hightemperature (such as at the hot Big Bang) and before the Higgs condensation theweak force was a long-range force, just like the electromagnetic force, and bothwere just different aspects of the same force. The relation between them is similarto that between magnetic and electric forces. They are unified in the sense theyare described by a single mathematical framework, just like the electric and mag-netic forces are described by the Maxwell equations. After Glashow, Weinberg andSalam, who constructed the unified theory of the electromagnetic and the weakforces, it is often called as GWS theory or simply the electroweak theory.

Confinement and Screening of Color Contrary to the weak force, the strong forcebecomes short range for a different reason. A group of quarks bind together bythe color force to become hadrons. Protons and pions are examples. At short dis-tances, the color force between the quarks is inversely proportional to the squareof the distance, just like any long-distance forces but at long distances, its behavioris different and has a component that does not diminish as the distance increases.Therefore, stored potential energy between them increases with distance, hencean infinite energy is required to separate them to a macroscopic distance. Thequarks are, in a sense, connected by a string and, like the north and south polesof a magnet, cannot be isolated individually, a phenomenon called “confinement”.They tend not to separate from each other by more than � 10�15 m, which is thetypical size of hadrons. When the string is stretched beyond its limit, it is cut intomany pieces, producing more hadrons. The color charges of the quarks come inthree kinds dubbed “R, G, B”, for red, green and blue, but only a neutral color com-bination can form hadrons.1) In the vicinity of the hadron, quarks of different colorare located at different distances and hence the effect of individual color can be felt.But from far away the color looks completely neutralised and the strong force be-tween the hadrons vanishes, making it a short-range force. The force between thehadrons, including the nuclear force, is an action of partially compensated residual

1) The name “color” is used as a metaphor for three kinds of force-generating charge. We mayequally call them by a number or any other index. Only a certain mixture of R, G, B, or colorwith anticolor (complementary color) that makes white can form hadrons. Hence in the colorterminology we can conveniently say hadrons are color-neutral.

Page 32: Yorikiyo Nagashima - Archive

1.1 An Overview of the Standard Model 7

colors and is therefore secondary in its nature, like the van der Waals force actingbetween molecules. The action of color forces is, mathematically, almost identicalto that of the electromagnetic force and is also described by a gauge theory, thedifference being only in the number of charges (three).2) This is the reason for thename quantum chromodynamics (QCD) to describe the mathematical frameworkof the strong force dynamics.

1.1.3The Standard Model

The weak force can be described by a gauge theory that contains two kinds ofcharge. General relativity, the theory of gravity, was also recognized as a kind ofgauge theory. Considering that the electromagnetic and weak forces are unifiedand all four fundamental forces work in the same mathematical framework, it isnatural to consider that all the forces are unified but show different aspects in dif-ferent environments. The Grand Unified Theory (GUT), to unify the electroweakand strong interactions, and the Super Grand Unified Theory,3) to combine all theforces, including gravity, are currently active research areas. Among them “super-gravity” and “superstring theories” are the most popular. But only the electroweaktheory and QCD are experimentally well established and are called collectively “theStandard Model” (SM) of elementary particles. The history and current situation offorce unification are shown in Fig. 1.1.

The essence of the Standard Model can be summarized as follows:

1. The elementary particles of matter are quarks and leptons.2. The mathematical framework for the force dynamics are gauge theories.3. The vacuum is in a sort of superconducting phase.

Particle species in the Standard Model are given in Table 2.1 of Sect. 2.1. Matter par-ticles, the quarks and leptons, are fermions having spin 1/2, and the fundamentalforce carriers, also called gauge particles, are spin 1 bosons. The Higgs boson is aspin 0 particle and is the only undiscovered member. Whether it is elementary ora dynamical object remains to be seen. The Standard Model was proposed around1970 and was firmly established experimentally by the end of the 1970s. The Stan-dard Model was constructed on the torrent of data from the 1960s to 1980s, whenlarge accelerators become operational (see Table 1.1). It marked the end of the eraof the 80-year quest for the fundamental building blocks of matter that started withRutherford at the beginning of the 20th century, and opened a new epoch. Theeffect on our consciousness (material view) was revolutionary and comparable tothat of relativity and quantum mechanics at the beginning of the 20th century. The

2) Gluons, the equivalent of photons in electromagnetism, come in eight colors, which arecombinations of the three basic colors, excluding white.

3) The prefix “super” is attached for theories that contain the supersymmetry that combinesfermions and bosons to form one family. They are examples of theories beyond the StandardModel and are briefly described in Chap. 19.

Page 33: Yorikiyo Nagashima - Archive

8 1 Introduction

Strong Force Electro-magnetic Weak Force Gravity(Nuclear Force) Force

Strength ~0.1 (~10) 1/137 ~10 ~10 # -5 -42

Source Color Charge Electric Charge Weak Charge Mass (hadron) (Gravitational Charge)

Classical Newton's Law Theory Maxwell Equation

General Relativity

1935 1934 Yukawa Theory Fermi Theory 1946~1949

Quantum Quantum Electro-Dynamics Theory (Quark Model) (QED)

Quantum Chromo- Glashow-Weinberg- Dynamics (QCD) Salam Theory

Standard Model

Grand Unified Theory ? New Kaluza-Klein Theory

Super Grand Unified Theory

String Theory ?

Unification of the Forces

~1973 1960~1968

16651864

1914

# At r = 10 m -18

Figure 1.1 Unification of the forces. Those enclosed in dashed lines are not yet established.Only the strong force changes its strength appreciably within currently available energy rangesor, equivalently, distances. This is why the distance of the strong force is specified.

Standard Model is very powerful and can explain all the phenomena in the mi-croworld of particles in a simple and unified way, at least in principle. Forty yearshave passed since the Standard Model was established, yet there has appeared on-ly one phenomenon that goes beyond the Standard Model. Neutrino oscillation inwhich a neutrino of one kind is transformed to another while it propagates, doesnot happen if the mass of the neutrino vanishes, as is assumed in the StandardModel. But it is fair to say it only needs a small stretch and is not a contradictionto the Standard Model. The model is so powerful to the extent that it is not easyto think of an experiment with currently available accelerators that could challengethe Standard Model in a serious way. A quantum jump in the accelerator energyand/or intensity is required to find phenomena that go beyond the Standard Modeland explore new physics.

Page 34: Yorikiyo Nagashima - Archive

1.1 An Overview of the Standard Model 9

Table 1.1 Chronicle of particle physics. Important discoveries and contributing accelerators.Items in parentheses (� � � ) are theoretical works.

Year Names Discovered or proposed Accelerator

1897 J.J. Thomson Electron Cathode tube

1911 E. Rutherford Atomic model α Isotope1929 W. Heisenberg (Quantum field theory)

W. Pauli

1930 P.A.M. Dirac (Dirac equation)1930 W. Pauli (Neutrino)

1932 J. Chadwick Neutron α Isotope

" C.D. Anderson Positron Cosmic ray" E.O. Lawrence Cyclotron Cyclotron

1934 E. Fermi (Weak interaction theory)1935 H. Yukawa (Meson theory)

1937 C.D. Anderson Muon Cosmic ray

S.H. Neddermyer1947 G.F. Powell Pion Cosmic ray

" G.O. Rochester Strange particles Cosmic ray

C.C. Butler1946–1949 S. Tomonaga (QED)

J. SchwingerR. Feynman

1950–1955 Strange particle production Cosmotron

1953 K. Nishijima (Nishijima–Gell-Mann law)M. Gell-Mann

1954 C.N. Yang (non-Abelian gauge theory)

R.L. Mills1955 E. Segre Antiproton Bevatron

O. Chamberlain1956 C.L. Cowan Neutrino Reactor

F. Reines

1956 C.N. Yang (Parity violation)T.D. Lee

1958 R. Feynman (V-A theory)

and others1960 Y. Nambu (Spontaneous breaking

of symmetries)1960–1970 L.W. Alvarez et al. Resonance production Bevatron

by bubble chambers Cosmotron

1962 L. Lederman 2-neutrinos BNL/AGSM. Schwarz

J. Steinberger

Page 35: Yorikiyo Nagashima - Archive

10 1 Introduction

Table 1.1 (continued)

Year Names Discovered or proposed Accelerator

1964 M. Gell-Mann (Quark model)

G. Zweig" V. Fitch CP violation BNL/AGS

J. Cronin

" P. Higgs (Higgs mechanism)" N. Samios et al. Ω BNL/AGS

1961–1968 S. Glashow (Electro-weak unification)

S. WeinbergA. Salam

1969 J.I. Friedman Parton structure of the nucleon SLAC/Linac

H.W. KendallR.E. Taylor

� 1971 G. ’t Hooft (Renormalization of EW theory)M.J.G. Veltman

1972–1980 Neutrino experiments Fermilab/PS

CERN/SPS1973 M. Kobayashi (KM model on CP)

T. Maskawa

1973 H.D. Politzer (Asymptotic freedom and QCD)D.J. Gross

F. Wilczek" A. Lagarrigue et al. Neutral current CERN/PS

1974 C.C. Ting J/ψ (charm quark) BNL/AGS

B. Richter SPEAR1974–1980 Charm spectroscopy SPEAR, DORIS

1975 M. Pearl τ lepton SPEAR

1978 L. Lederman � (bottom quark) Fermilab/PS1978–1979 Bottom spectroscopy CESR/SPS

Gluon DESY/PETRA1983 C. Rubbia W, Z0 S NppS

van de Meer

1990–2000 Nν D 3 LEPPrecision test of EW theory LEP, TEVATRON

1994 Top quark TEVATRON

1998 Y. Totsuka et al. Neutrino oscillation Cosmic rays2000 SLAC/BaBar Kobayashi–Maskawa SLAC/B

KEK/BELLE model confirmed KEK/B

Page 36: Yorikiyo Nagashima - Archive

1.2 The Accelerator as a Microscope 11

The fact that the kinetic energy of particles in particle physics far exceeds themass energy makes special relativity an indispensable tool to describe their behav-ior. Understanding the mathematical formalism that is built on the combination ofspecial relativity and quantum mechanics, also known as quantum field theory, re-quires a fair amount of effort, and the ensuing results very often transcend humancommon sense. The only other fields that need both disciplines are cosmology andextreme phenomena in astrophysics, such as supernovae and black-hole dynam-ics.

1.2The Accelerator as a Microscope

In order to investigate the microscopic structure of matter, we need a tool, a micro-scope, that magnifies the object so that we can see it, or more generally recognize itusing our five senses. What corresponds to the light-collecting device in the micro-scope is a particle accelerator. Shining light on the object corresponds to makinghigh-energy particles collide with target particles whose structure we want to inves-tigate. The scattered light is collected by objective lenses and focused to an image,which we observe through an eyepiece. In the accelerator experiment, we use aradioisotope detector to measure patterns of the scattered particles. By computerprocessing they can be transformed to a pattern that we can recognize as a mag-nified image. The magnification of a microscope is determined by a combinationof lenses, but they merely transform the image to help human eyes to recognizethe scattered pattern of light. The resolution, the ability of the optical microscopeto discriminate, is largely determined by the wavelength of the light. The morepowerful electron microscope uses an electron beam and instead of optical lenses,quadrupole magnets to focus the beam. The scattered electrons are illuminated ona fluorescent board to make an image that human eyes can recognize. In otherwords, the electron microscope is a mini accelerator–detector complex.

The quantum principle dictates that particles have a dual wave/particle propertyand the wavelength is given by the Planck constant divided by the particle momen-tum (the de Broglie wavelength λ D h/p ). Therefore, the resolving power is directlyproportional to the momentum or the energy of the incoming particle. This is whywe need high-energy accelerators to explore the structure of matter. The smaller thescale, the higher the energy. Therefore, to explore elementary particles, the highestenergy technologically realizable at the era is required. This is why particle physicsis often called high-energy physics.

Figure 1.2, commonly referred to as the Livingston chart, shows the history ofmajor accelerators and their energy. On the right are indicated the approximateranges of energy that are suitable for the investigation of the particle structure atthe specified level. We summarize important discoveries and tools used for them inthe history of particle physics in Table 1.1. We can see that the most effective toolsfor particle physics are accelerators of the highest energy available at the time. Theone with the highest energy currently is the LHC (Large Hadron Collider; a proton–

Page 37: Yorikiyo Nagashima - Archive

12 1 Introduction

Figure 1.2 The Livingstone chart showing development of accelerators and the investigatedconstituent particles.

proton colliding accelerator of total energy 14 TeV) in Europe. It started operationin 2009. The LHC can explore a micro-size of� 10�20 m and is expected to discoverthe Higgs particle.

Page 38: Yorikiyo Nagashima - Archive

13

2Particles and Fields

In particle physics, a particle is defined as an energy quantum associated with a“field”. The field is an entity that is defined over all space and time (collectivelycalled space-time) and is able to produce waves, according to classical physics, orquanta, in present day terminology, when excited or when some amount of energyis injected. In a microworld the quanta are observed as particles, but collectivelythey behave like waves. A quantum is the name given to an object that possessesthe dual properties of both a particle and a wave. A typical example of a field is theelectromagnetic field. It is created by a charge and extends all over space. It is staticwhen the charge that creates it does not move, but it can be excited by vibratingthe source of the field, the electric charge; then the vibrating field propagates. Thisis a charge radiating an electromagnetic wave. It is well known that historicallyquantum mechanics has its origin in Planck’s recognition that blackbody radiationis a collection of countable quanta, photons.

In 19th century physics, waves could only be transmitted by some kind of vi-brating medium and the electromagnetic waves were considered to propagate in amedium called the “ether”. The existence of the ether was ruled out with the adventof special relativity, and people began to consider that the vacuum, though void ofanything in the classical sense, has the built-in property of producing all kinds offields, whose excitations are observed as quanta or particles. The Standard Modelhas advanced the idea that the vacuum is not an empty entity but rather filled withvarious kinds of exotic material, including the Higgs particle, and exhibits dynam-ical properties like those observed in ordinary matter. The vacuum as we view itnowadays is a kind of resurrected ether with strange attributes that nobody hadthought of. In this chapter, we describe an intuitive picture of both particles andfields, define some associated variables and prepare basic tools and terminologiesnecessary for the treatment of quantum field theory.

2.1What is a Particle?

There are several properties that characterize a particle.

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 39: Yorikiyo Nagashima - Archive

14 2 Particles and Fields

(1) A particle is spatially localized at each instant and its number is countable.In classical mechanics, particle’s mass m and velocity v or energy E and mo-

mentum p can be defined and measured. In the nonrelativistic limit, where theparticle’s velocity is far smaller than the velocity of light c, these variables of a freemoving particle are given by

p D mv , E D12

mv2 Dp 2

2m, v D jv j , p D jp j (2.1)

where bold letters denote three-component vectors. However, when the velocity isclose to c, Einstein’s formulae

p Dmvp

1� (v/c)2, E D

mc2p1 � (v/c)2

(2.2)

have to be used instead. In the microworld, the energy and the momentum aremore important quantities directly related to observables. We can define the massand velocity in terms of the energy and the momentum. The relation can be derivedfrom Eq. (2.2):

E 2 D (p c)2 C (mc2)2 , v Dp c2

E(2.3)

Note, Eq. (2.3) also holds for particles with zero mass, such as the photon. Whenthe mass vanishes, E D p c and v is always equal to c.

(2) The particle can be created or annihilated.This results from Eq. (2.2). The mass of a particle is but one form of energy and

can be created if enough energy exists, or the energy can be converted back to massenergy subject to selection rules of one kind or another. Examples of reactions are

eC C e� �! γ C γ (2.4a)

�! μC C μ� (2.4b)

�! p C p (2.4c)

p C p �! any number of π’s (2.4d)

πC �! μC C νμ (2.4e)

π� C p �! K0 C Λ (2.4f)

As a result, the existence probability of a particular kind of particle is not con-served, and the one-particle treatment of the Schrödinger equation is not satisfac-tory for treating relativistic particles. Quantum field theory is needed to treat theproduction and annihilation of particles.

Page 40: Yorikiyo Nagashima - Archive

2.1 What is a Particle? 15

(3) A particle is not necessarily stable.The statement is a corollary to statement (2). The Equation (2.4e) says that the

pion is unstable and decays to a muon and a neutrino. Its average lifetime τ ismeasured to be approximately 2.6� 10�8 s. This means there is a time uncertaintyΔ t � τ for the existence of the pion. Then the Heisenberg uncertainty principle

ΔE � Δ t & „ (2.5)

tells us that there is also uncertainty in the energy or the mass in the rest frame.Here „ D h/2π is Planck’s constant divided by 2π and is also called Planck’sconstant when there is no confusion. In the case of the pion

Δmc2 � ΔE �„

Δ t�

6.58 � 10�22 MeV � s2.6 � 10�8 s

� 2.5 � 10�14 MeV

) Δmm�

2.5 � 10�14

140� 2 � 10�16 (2.6)

the uncertainty in the mass is, to all practical purposes, negligible. However, in thecase of � meson, the lifetime is � 4.2 � 10�24 s1) and the mass uncertainty is

Δm �6.58 � 10�22 MeV � s

4.2 � 10�24 s � c2� 160 MeV/c2 (2.7)

which cannot be neglected compared to its mass 780 MeV/c2. The relative uncer-tainty is very large:

Δmm� 0.21 (2.8)

Let us treat the situation a little more quantitatively. When the average lifetimeis given by τ, the particle can die in a brief time interval d t with a probability d t/τ.If dN of N particles die in d t

dNND �

d tτ

) N(t) D N0e�t/τ(2.9)

In quantum mechanics, the existence probability of a particle is proportional to thesquare of its wave function ψ

jψj2 / e�t/τ (2.10)

and the wave function of the particle in its energy eigenstate has time dependenceψ(x , t) � ψ(x )e�i E t/„. By putting

E D E0 � iΓ2

(2.11)

1) In practice, such a short time interval cannot be measured. It is inversely derived from the masswidth Eq. (2.16).

Page 41: Yorikiyo Nagashima - Archive

16 2 Particles and Fields

the time dependence of the probability is given by

jψ(t)j2 D jψ(x )e�i(E0�i Γ /2)t/„j2 D jψ(x )j2e�Γ t/„

) Γ D„

τ

(2.12)

From the equations, we know that being unstable is equivalent to having a complexenergy with negative imaginary part Γ > 0. The next question is then what kindof mass distribution does the particle have? The wave function as a function of theenergy can be expressed as a Fourier transform

Qψ(E ) D1p

Z 1

�1d t ψ(t)e i E t/„ (2.13)

Assuming the particle was created at t D 0 and x D 0, ψ(t) D 0 for t < 0, then

Qψ(E ) D1p

Z 1

0d t ψ(x D 0, t)e i E t/„

D1p

Z 1

0d t ψ(0)e�i(E0�i Γ/2)t/„e i E t/„

Dψ(0)p

i„E � E0 C i(Γ /2)

(2.14)

Therefore, the probability that the particle takes an energy value E is given by

P(E ) / j Qψ(E )j2 Djψ(0)j2

2π„2

(E � E0)2 C (Γ /2)2 (2.15)

The functional shape is shown in Figure 2.1 and is called a Lorentzian, the samefunction as the Breit–Wigner resonance formula, which describes unstable levels ofnuclei. From the shape, Γ represents the full width at half maximum (FWHM) ofthe function. Γ can be regarded as the energy uncertainty of the unstable particle,hence the Heisenberg uncertainty principle

ΔE � Δ t � Γ � τ D „ (2.16)

E0

Γ

E0

0.5

1.0

Figure 2.1 Lorentzian shape or Breit–Wigner resonance formula for the energy (or mass) distri-bution of unstable particles.

Page 42: Yorikiyo Nagashima - Archive

2.1 What is a Particle? 17

is reproduced. The unstable elementary particles such as � and Δ(1236) have themass and width

m(�) D 776 MeV/c2 , Γ D 158 MeV (2.17)

m[Δ(1236)] D 1236 MeV/c2 , Γ D 115 MeV (2.18)

Historically, they were observed as strong enhancements in the cross sections ofπ-π and π-p scatterings, namely as resonances. The process strongly suggeststhem to be of dynamical origin, and in the early days of hadron dynamics they weretreated as such. Today, all hadrons, including nucleons, pions and resonances, areidentified as the energy levels of quark bound states and are treated on an equalfooting.

(4) A particle has spins and other degrees of freedom.Energy and momentum are not the only attributes a particle possesses. It acts as

if it were spinning on some axis. Spin means angular momentum and its magni-tude is quantized to integers or half-integers in units of the Planck constant „. Spincomponents or a projection of the spin onto a certain axis (z-axis) is also quantizedin units of „:

Sz D S, S � 1, S � 2, � � �,�(S � 1),�S (2.19)

If a charged particle has a finite spin, it has a magnetic moment.2)

Spin in quantum mechanics has, mathematically, all the properties of angularmomentum, but the picture of a self-rotating object such as the rotating earth doesnot apply, because a point particle cannot have angular momentum in the classicalsense (L D r�p ). Therefore the spin component should rather be regarded as just adegree of freedom. For a particle of spin 1/2 like an electron, only (Sz D 1/2,�1/2)is allowed. For the electron, energy, momentum, electric charge and spin orienta-tion exhaust its degrees of freedom. This is evident as only two Sz D ˙1/2 elec-trons can occupy the same orbit in a hydrogen atom. The Pauli exclusion principleforbids other electrons of the same attributes to occupy the same quantum state atthe same space point. Other particles may have more attributes, such as isospin,color, strangeness, etc., which are generically called internal degrees of freedom.

(5) A particle has a corresponding antiparticle.The particle and its antiparticle have identical mass, lifetime and spin values,

but the spin-components and all the internal quantum numbers are opposite. Thepositron is the antiparticle of the electron. The photon is its own antiparticle, asituation that holds for other particles, such as neutral pions, which have no dis-criminating internal quantum numbers. The sum of spin components and internalquantum numbers of a particle and its antiparticle is zero, namely the same quan-tum number as that of the vacuum. When they meet, they annihilate pair-wiseand become pure energy, usually in the guise of photons, gluons and other gauge

2) A composite particle such as the neutron can generally have magnetic moment even if it iselectrically neutral. Think of two oppositely charged particles rotating in reverse directions. Even aneutral elementary particle like the neutrino can have it due to higher order radiative corrections.

Page 43: Yorikiyo Nagashima - Archive

18 2 Particles and Fields

Table 2.1 Elementary particles and their characteristics

Matter constituent particlesSpin Electric charge a I3

b Generation

I II III

Quarks 1/2 C2/3 C1/2 u (up) c (charm) t (top)

1/2 �1/3 �1/2 d (down) s (strange) b (bottom)

Leptons 1/2 0 C1/2 νe νμ ντ

1/2 �1 �1/2 e� μ� τ�

a In units of proton charge.b The third component of weak isospin.

bosons. If sufficient energy is given to the pair, the produced energy instantly re-materializes to any number of particle–antiparticle pairs or bosons of heavier masswithin the allowed energy limit.

(6) Matter particles and force carriers.Minimum elements of matter that can be extracted fall into two classes. One ex-

erts the strong force and the other does not. The former are called hadrons, the lat-ter, leptons. Hadrons with half-integer spin are called baryons and include protons,neutrons and other heavier particles such as Λ, Σ , � , � � � . Those with integer spinare called mesons and include π, K, �, � � � . All hadrons are composites of quarksthat have spin 1/2, but the quarks cannot be isolated (confinement). One mightdoubt the real existence of such objects, but there is ample evidence for their exis-tence inside the hadrons. Today, six kinds of quarks (u, d, s, c, b, t) are known withincreasing mass, each having three color degrees of freedom and electric charge2/3 or �1/3 in units of the proton charge e. The leptons have spin 1/2, too. Sixkinds of leptons are known, three kinds of charged leptons (e, μ, τ) and their as-sociated neutral leptons (neutrinos) (ν e , νμ , ντ ). In the Standard Model, the massvalue of the neutrino is assumed zero.3) To summarize, the constituent particles ofmatter are six kinds of quarks and leptons distinguished by “flavor”. They act pair-wise in the weak interaction and are further divided into three generations. Theparticle species are summarized in Table 2.1.

Properties of the particles in the same row but in different generations are al-most identical apart from their mass, which becomes heavier with the generation(see Table 2.2). Known matter in the universe is made mostly of the first gener-

3) In 1998, a phenomenon called neutrinooscillation, in which νμ is converted toanother kind of neutrino, was discovered inatmospheric neutrinos. This implies a finitemass of the neutrino. Within the context of

this book, however, the neutrino mass maysafely be assumed zero until later, whenneutrino properties are discussed.

Page 44: Yorikiyo Nagashima - Archive

2.1 What is a Particle? 19

Table 2.2 Mass values of quarks and leptons [311]

quark u d s c b t

a � 3 MeV � 5 MeV � 104 MeV � 1.27 GeV � 4.2 GeV 171 GeV

lepton ν1b ν2

b ν3b e μ τ

c ? � 0.009 eV � 0.05 eV 0.511 MeV 105.7 MeV 1777 MeV

a Current mass which is determined by dynamical processes like scattering.b Observed νe , νμ , ντ are mixtures of mass eigenstates ν1, ν2, ν3.c For the neutrinos, only mass differences Δm2

i j D jm2i � m2

j j are known. Consequently,m3 � m2 � m1, is assumed. Note: MeVD 106 eV, GeVD 109 eV

ation.4) Others are generated in high-energy phenomena or supposedly exist inextreme conditions such as the interior of neutron stars. The existence and role ofthe generation is an unsolved problem.5) There is no clear evidence that highergenerations do not exist, but the weak gauge boson Z0 can decay into pairs of neu-trinos and antineutrinos producing no others except the known three kinds. If thefourth generation exists, its neutrino has a mass greater than mZ /2 D 45 GeV/c2.Considering it an unlikely situation, the Standard Model assumes only three gen-erations.

Quantum field theory requires the existence of a particle that mediates the in-teraction, a force carrier. It is a quantum of the force fields. Gauge theory requiresthe force field to be a vector field, hence the force carrier must have spin 1.6) Cor-responding to the four fundamental forces in nature, there are four kinds of forcecarrier particles, gluons for the strong force, photons for the electromagnetic force,weak bosons (W ˙, Z0) and gravitons for gravity. A list of fundamental interactionsand their characteristics are given in Table 2.3.

Vacuum particle? If the Higgs particle is discovered, there could be another cat-egory of spin 0 fundamental particle. It may be called a vacuum particle from therole of the Higgs. However, whether it is fundamental or of dynamical origin hasyet to be determined. A typical example of the dynamical Higgs is if it is composedof a t t pair [277], just like the Cooper pair in superconductivity is composed of anelectron pair of opposite momentum and spin orientation.

4) There is an unidentified matter form in theuniverse known as “dark matter”, whichmakes up � 22% of all cosmic energy. Theordinary matter of known type makes uponly � 5%. The rest is unidentified vacuumenergy called “dark energy”.

5) The muon, for instance, could be termeda heavy electron. Except for the fact that itcan decay into an electron and neutrinos,all the reaction formulae for the muonmay be reproduced from those of theelectron simply by replacing its mass. The

seemingly mysterious existence of the muonis expressed in a phrase by I. Rabi: “Whoordered it?”. The generation puzzle is theextension of the muon puzzle.

6) This is true for the Standard Model forcecarriers. The quantization of the gravitationalfield produces a spin 2 graviton. This isdue to the special situation that the sourceof the gravity is mass, or more generallyenergy-momentum, which itself is a secondrank tensor in space-time.

Page 45: Yorikiyo Nagashima - Archive

20 2 Particles and Fields

Tabl

e2.

3Li

stof

fund

amen

talf

orce

san

dth

eirc

hara

cter

istic

s.

Forc

eSt

rong

(Nuc

lear

)aEl

ectr

omag

netic

Wea

kG

ravi

ty

Sour

ceCo

lor

char

ge(h

adro

n)El

ectr

icch

arge

Wea

kch

arge

Mas

sb

(RG

B)(I

3D

˙1/2

)

Stre

ngth

0.1

c(1

0)1/

137

10�5

10�4

2

Ran

ge[m

]1

d(1

0�15 )

110

�18

1

Pote

ntia

lsha

pek 1

1 rC

k 2r

exp(�

r)r

e1 r

exp(�

mW

r)r

e1 r

Forc

eca

rrie

rg

(glu

on)

πm

eson

γ(p

hoto

n)W

˙,Z

0gr

avito

n

Spin

10

11

2T

heor

yQ

CD

(Yuk

awa

theo

ry)

QED „

ƒ‚…

Gen

eral

rela

tivity

GW

Sf

The

ory

Gau

gesy

mm

etry

SU(3

)—

U(1

)�SU

(2)

Poin

caré

grou

p

aSe

cond

ary

stro

ngfo

rce

actin

gon

hadr

ons.

bTo

beex

act,

ener

gy-m

omen

tum

tens

or.

cA

tdis

tanc

e10

�18

m.T

hedi

stan

ceis

spec

ified

asth

eco

lor

forc

ech

ange

sits

stre

ngth

cons

ider

ably

with

inth

em

easu

red

rang

e10

�15–1

0�18

m.

dTh

era

nge

beyo

nd�

10�1

5m

ism

eani

ngle

ssbe

caus

eof

confi

nem

ent.

em

π,m

Wsh

ould

bere

adas

c/„

,mW

c/„

,i.e

.as

inve

rse

Com

pton

wav

elen

gth.

fG

lash

ow–W

einb

erg–

Sala

m.

Page 46: Yorikiyo Nagashima - Archive

2.2 What is a Field? 21

2.2What is a Field?

2.2.1Force Field

Force at a Distance, Contact Force and the Force FieldIn classical mechanics, the gravitational force is expressed by Newton’s formula

f D �GNmM

r2(2.20)

This means that an object with mass M exerts its force directly on another objectwith mass m at a distance r. Einstein considered such a “force at a distance” isunacceptable. To understand why, let’s carry out a thought experiment. Supposethe sun suddenly disappears at t D 0 and infer what would happen to the earth.According to Newton’s way of thinking, the sun’s attraction would disappear in-stantly, because the mass M, the source of the force, has disappeared and the earthwould be thrown off its orbit into cosmic space at the same instant. According toEinstein’s principle of relativity, nothing can travel faster than light, and the infor-mation of the sun’s disappearance would take at least 8 minutes to reach the earth.During the 8 minute’s interval there is no way for the earth to know that the sunhas vanished, which means that the gravitational force continues to act. It followsthat something other than the sun must be there to exert the force on the earth.We call this something a “field”. The sun (mass) creates a gravitational field overall space and the force acting on the earth is the action of the local field at the samespace point as where the earth is located. Namely, all forces must be “contact forces”that can act only upon contact with the object on which they exert. In this way ofthinking, wherever a force acts there must be an associated field, a gravitationalfield for gravity and an electromagnetic field for the electromagnetic force.

Take a look at the Maxwell equations of the electric force, which are written as

r � E (t, x ) D� (t, x )

�0, f (ty , y ) D q1E (ty , y ) , r � E (t, x ) D 0 (2.21)

This translates to the statement that an electric charge distribution � at a spacepoint x at time t creates an electromagnetic field E (x , t) over all space. A pointcharge q1 at a spacetime point (ty , y ) receives the force from the local electric fieldE at the same space-time where it is located. The third equation is an auxiliarycondition that the static field must satisfy. If the static electric field is created by apoint charge q2 at point r D 0

� D qδ(r) , δ(r) D δ(x )δ(y )δ(z) (2.22)

Solving Eq. (2.21) with the above input gives Coulomb’s law

f Dq1q2

4π�0

1r2

rr

, r D jrj (2.23)

Page 47: Yorikiyo Nagashima - Archive

22 2 Particles and Fields

The result has the same form as Newton’s law, but in the dynamical (i.e. time-dependent) solution the influence of a changing force never propagates faster thanlight. Namely, the Maxwell equations respect the principle of relativity. This is guar-anteed by the local nature of the differential equations. Integral forms are usefulfor a intuitive interpretation, but the equations have to be written in differentialforms. Historically, Maxwell’s equations were obtained empirically (except for thedisplacement current), then the Lorentz transformation was derived to make theforce action consistent in the rest frame and in a moving frame, which is a basis ofthe theory of relativity.

PhotonAn oscillating electromagnetic field propagates through space and becomes an elec-tromagnetic wave. It can be created by oscillating the source, but once created, itbecomes a wave acting independently of the source. Quantum mechanics tells usthat the wave behaves also as a particle. An entity exhibiting the dual behavior of aparticle and a wave is called a quantum. The quantum of the electromagnetic fieldis called a photon. The above statement can be summarized as follows

The principle of relativity requires the force field and the quantum principlerequires quanta of the field, i.e. force carrier particles.

The electromagnetic wave can carry energy and when it meets another electrical-ly charged object it forces it to oscillate, because classically it is an oscillating forcefield. But as a particle, when it hits the object, it is either scattered or absorbed, giv-ing it a kick. The energy and momentum of the quantum are given by the relations

E D hν D „ω , p D „k Dhλ

kjkj

(2.24)

where ν is the wave’s frequency, ω D 2πν the angular frequency, λ the wave-length and k the wave vector. As we have seen, the creation of the electromagnetic(force) field is equivalent to the emission of a photon (force carrier particle) andthe exertion of the force by the electromagnetic (force) field on the charge (source)is equivalent to absorption of the photon (quantum) by the charged object (anoth-er source). In quantum field theory, action of a force means an exchange of theforce carrier particle. In other words, action/reaction of the force is synonymous toabsorption/emission of the force quantum by the object on which the force acts.In a mathematical treatment of quantum mechanics, energy is transmitted oftenin momentum or energy space without ever referring to the space variables. Thenotion of a force is a bit different from that used in classical mechanics, where itrefers to an energy variation associated with space displacement ( f D �@E/@x ). Soinstead of saying force exertion, the word “interaction” is used where any exchangeof energy occurs.

Page 48: Yorikiyo Nagashima - Archive

2.2 What is a Field? 23

Strength of the ForceIn the following, we talk of (potential) energy instead of force. The electromagneticinteraction between, say, a proton and an electron in a hydrogen atom is expressedby the Coulomb potential and the strength of the force by α where

V(r) De2

4π�0

1rD α„cr

α �e2

4π�0„c�

1137

(2.25)

Here, to define a dimensionless quantity α, we used the fact that „c has the dimen-sion of [energy�distance]. e is the unit electric charge that both the electron and theproton have, which has a value of 1.6 � 10�19 Coulomb in rationalized MKS units.α is called the fine structure constant and represents the fundamental strengthof the electromagnetic interaction. The dimensionless strength of the gravitationalforce can be defined similarly:

V(r) D Gm1m2

rD αG

�m1m2

m p me

�„cr

αG D GNm p me

„c� 3 � 10�42

(2.26)

where GN is Newton’s constant of universal gravitation and m p , me are the protonand electron mass. In particle physics gravity can be neglected because (� � � ) � O(1)hence αG is by many orders of magnitude smaller than α. The strength of thestrong and weak force will be described in the following section.

Problem 2.1

(1) Calculate and confirm the values of the fine structure constant and the gravita-tional strength.(2) Calculate the mass value which makes αG D 1. Express it in units of GeV. It iscalled the Planck energy.

Force RangeGenerally a particle has finite mass. What will happen if a force carrier particle hasnonzero mass? To exert the force, it has to be emitted and absorbed by interactingobjects. The finite mass of the force carrier if emitted produces energy uncertaintyof at least Δmc2. The Heisenberg uncertainty principle dictates that the time inter-val Δ t that allows energy violation should be less than� „/ΔE . If the force carrieris not absorbed within this time interval, the force cannot be transmitted. As theparticle velocity is less than the light velocity c, the force range r0 should be lessthan

r0 . cΔ t � c„/mc2 D „/mc (2.27)

Therefore, the range of a force mediated by a force carrier having a finite mass islimited to its Compton wavelength.

Page 49: Yorikiyo Nagashima - Archive

24 2 Particles and Fields

Let us treat the problem a little more quantitatively. As will be explained soon, anyrelativistic particle should be a quantum of a field that satisfies the Klein–Gordonequation�

1c2

@2

@t2 �r2 C μ2

�' D 0, μ D mc/„ (2.28)

In analogy with the electric potential, the force field created by a point source wouldsatisfy the equation�

1c2

@2

@t2 �r2 C μ2

�' D gδ3(r) (2.29)

where g denotes the strength of the force corresponding to the electric charge of theelectric force. For a static potential where @/@t D 0, the equation is easily solved togive

'(r) Dg

4πe�r/ r0

r, r0 D

1μD„

mc(2.30)

Problem 2.2

Derive Eq. (2.30) from Eq. (2.29).Hint: Use the Fourier transform

'(r) DZ

d3 p(2π)3 e i p�r Q'(p ), δ3(r) D

Zd3 p

(2π)3 e i p�r .

The reach of a short-range force is mathematically defined as Eq. (2.30), whichis called the Yukawa potential. When r � r0, the potential is of the Coulombtype, but for r r0, decreases very rapidly. For gravitation or the Coulomb force,r0 D 1 and they are called long-range forces. Based on the above formula, Yukawapredicted the mass value of the π meson, the nuclear force carrier, to be roughly200 times that of the electron from the observation that the reach of the nuclearforce is approximately � 10�15 m. The strength of the nuclear force is measuredto be g2/4π ' 10, which is 1000 times the electromagnetic force. The source ofthe nuclear force is a nucleon, but as the nucleon as well as the pion turned outto be composites of quarks, the real source of the strong force was later identifiedas the color charge, which all quarks carry. The true strong force carrier is calledthe gluon and its mass is considered to be zero, just like the photon. The potentialbetween the quarks is of the Coulomb type, but for reasons to be described later,has an extra term and is approximately (phenomenologically) given by

φ(r) D k11rC k2 r (2.31)

The second term increases linearly with distance and is called the confining poten-tial. k1 decreases logarithmically at short distance, a phenomenon called asymptoticfreedom and takes a value � 0.1 at r � 10�18 m.

Page 50: Yorikiyo Nagashima - Archive

2.2 What is a Field? 25

The weak force carriers come in three kinds, charged W ˙ with mass of80.4 GeV/c2 and a neutral Z0 with mass of 91.2 GeV/c2. Their mass value meansr0 � 10�18 m for the weak force. The range of the weak force is so small thatpractically it could be replaced with a contact force before the advent of big acceler-ators. The coupling strength was expressed in terms of the dimensionful quantityGF � 10�5 m2

p for historical reasons, where m p is the mass of the proton.

Problem 2.3

From the mass values of W ˙, Z0, calculate the weak force range.

Problem 2.4

From observation of stars and polarization of star light, it is known that a coherentmagnetic field of � 3 μ(μ D 10�6) gauss exists in the Crab nebula. This meansthat the electromagnetic force range is at least hundreds of light years. If the in-tergalactic magnetic field is confirmed, the range is even larger, millions of lightyears. Calculate the upper limit of the photon mass.

2.2.2Relativistic Wave Equation

Physical quantities that characterize a particle are energy E and p , but in quantummechanics the particle is a quantum and shows wave behavior. The wave picture ofa particle can be derived by replacing

E ! i„@

@t, p ! �i„r (2.32)

and regarding them as operators acting on a wave function. Inserting Eq. (2.32) intothe nonrelativistic energy-momentum relation (2.1), we obtain a wave equation forthe de Broglie wave:

i„@ψ@tD �

„2

2mr2ψ (2.33)

If the particle is moving in a potential, one adds a potential V(x ) and the result is thewell-known Schrödinger equation. To see if we can obtain a relativistic Schrödingerequation, we apply (2.32) to Eq. (2.3) and obtain�

1c2

@2

@t2 �r2 C μ2

�' D 0 , μ �

mc„

(2.34)

which is called the Klein–Gordon equation. Equation (2.34) has a plane wave solu-tion

'(r) � '(x , t) D N ei(p�x�E t )/„ (2.35)

Page 51: Yorikiyo Nagashima - Archive

26 2 Particles and Fields

where N is a normalization constant. A general solution can be expressed asa superposition of plane waves. Unfortunately, unlike the nonrelativistic equa-tion (2.33), we cannot regard ' as a probability amplitude for a particle. If ' is tobe regarded as a probability amplitude just like ψ in the Schrödinger equation,the probability density � and its flow (current) j have to be defined to satisfy thecontinuity equation (i.e. conservation of probability)

@�

@tCr � j D 0 (2.36)

The current j may assume the same form as the Schrödinger equation:

j D �i„

2m

�'�r' � 'r'�� (2.37)

Then, in a relativistic treatment (to be discussed in Chap. 3), �c has to be the fourthcomponent of the four-dimensional current vector.7)

�c Di„

2mc

�'� @'

@t� '

@'�

@t

�(2.38)

It is easily verified that � and j defined above satisfy the continuity equationEq. (2.36).

One is tempted to interpret � as the probability density, but it does not work as itencounters difficulties.

Negative Energy The first difficulty stems from the fact that the energy-momentum equation E 2 D (p c)2 C (mc2)2 has two solutions

E D ˙q

(p c)2 C (mc2)2 (2.39)

Namely, E can be negative. The particle with negative energy cannot occupy a stablestate. Because it can be transferred to an indefinitely lower energy state by emittinga photon, hence is unstable. Historically, people tried to construct a theory withoutthe negative energy solution. Mathematically, however, they are required to satisfythe completeness condition as well as causality and trying to construct a closedsystem without them has produced various paradoxes.

Negative probability The second difficulty stems from the fact that the Klein–Gordon equation is a second-rank differential equation and requires two initialconditions '(t D 0) and @'/@t(t D 0) to define its solution uniquely. In otherwords, ' and @'/@t can be chosen arbitrarily and do not always guarantee a positivevalue of �. In fact, if we calculate j μ using the plane wave solution (2.35) we have

j μ D jN j2p μ

mD jN j2

�E

mc,

pm

�(2.40)

from which we see that a negative energy solution also gives negative probability.For these reasons, the Klein–Gordon equation was once abandoned historically.

7) c is multiplied to make the physical dimension the same as the space components.

Page 52: Yorikiyo Nagashima - Archive

2.2 What is a Field? 27

2.2.3Matter Field

However, we can still produce wave behavior by interpreting ' as a classical field.The four-dimensional current

q j μ � (�c, j ) D i„q('�@μ' � '@μ'�) , @μ �

�1c

@

@t,�r

�(2.41)

can now be interpreted as a current (charge flow) having positive as well as negativecharges. From the classical field point of view, negative energy means negative fre-quency and simply means a wave propagating in the opposite direction. We shallsee later that particles, or quanta to be exact, appear by quantizing the the ampli-tude of the waves. From the wave amplitude of negative frequency appear particleswith quantum number opposite to that of positive frequency, as suggested by thecurrent flow. Both quanta have positive energy and particles with negative energynever appear. As it happened, history took a detour in coping with the notion ofnegative energy.

The above statement is equivalent to considering matter particles on an equalfooting with the photon. Just as the photon is the quantum of the classical electro-magnetic field, we regard a matter particle as a quantum of some kind of matterfield. To see the parallelism between the matter field and the electromagnetic field,observe that the electric field E and the magnetic field B are expressed in terms ofa scalar and a vector potential

E D �rφ �@A@t

, B D r � A (2.42a)

Aμ �

�φc

, A�

(2.42b)

In the Lorentz gauge, the equations that the potentials satisfy are given as�1c2

@2

@t2 �r2�

φ D q�

�0(2.43a)�

1c2

@2

@t2 �r2�

A D μ0q j (2.43b)

where q denotes electric charge. In vacuum, where � D j D 0, Eq. (2.43) areequations for a freely propagating electromagnetic field, namely, the photon. Equa-tion (2.43) are nothing but the Klein–Gordon equation with m D 0. Note that '

which we have introduced is a scalar field while the photon is a vector field. Thedifference results in the spin of the field quantum, but does not affect the discus-sion here.

The particle behavior lost by treating ' as a classical field can be recovered byquantizing the field (a process sometimes called as the second quantization), justas the photon is the quantized electromagnetic field. Quantization of the field is an

Page 53: Yorikiyo Nagashima - Archive

28 2 Particles and Fields

inevitable consequence of fusing quantum mechanics with the theory of relativity.8)

Since the field is a collection of an infinite number of harmonic oscillators (see thefollowing arguments), the quantization of the field can be achieved by quantizingits harmonic components.

2.2.4Intuitive Picture of a Field and Its Quantum

Before we jump into the mathematics of field quantization, it may help to have aphysical image of what the field is made of. We use an analogy to visualize the fieldto help the reader to understand the otherwise formal and mathematical procedureof quantization. Of course, there is always a danger of stretching the analogy toofar and applying it where it should not be. True understanding is obtained only byinterpreting the mathematics correctly. With those caveats, I proceed and pick upan example from a familiar phenomenon in condensed matter physics.

The example we describe is metal, a form of crystal. In the metal, positively ion-ized atoms are distributed regularly lattice wise. A mechanical model of point mass-es connected by springs is a good approximation (Fig. 2.2). Atoms in the lattice canvibrate and transmit their vibration to neighboring lattices. Namely, the lattice ofpositively ionized atoms has the property of the field we want to talk about. Thequantum of the lattice vibration is called a phonon. In the hypothetical conditionswhere the temperature is absolute zero and all the lattice vibrations come to rest,the electric field made by the lattice is uniform, at least at a scale larger than thelattice spacing, and therefore no net forces exist to act on the electrons. This is be-cause the direct Coulomb forces between the electrons are cancelled by negativeCoulomb forces acting between the electron and the lattice.

Put an electron at some space point in the metal, and its electric field woulddeform the regular lattice configuration (or field) in its neighborhood. Anotherelectron placed at some distance would feel the field distortion. Namely, the twoelectrons can exert forces on each other through coupling with the field. If the elec-

8) The field could be relativistic or nonrelativistic. In this book, we talk about only relativistic fields.

Figure 2.2 Mechanical oscillator model of a field.

Page 54: Yorikiyo Nagashima - Archive

2.2 What is a Field? 29

tron is in motion, the changing field it produces can excite the lattice vibration or,in quantum mechanics, create a phonon. The vibration can propagate through thefield and disturbs the other electron, or equivalently the electron is scattered byabsorbing the phonon.

This classical treatment of the electron coupling with the lattice vibration can berephrased as an interaction of the electron and the phonon. The quantum treat-ment of the phonon–electron interaction and that of the photon–electron interac-tion in particle physics are very much alike, except that a relativistic treatment ishardly necessary in condensed matter physics. Historically, the latter framework(quantum electrodynamics, QED) was established first, with subsequent applica-tion to condensed matter. The phonon is the quantum of the field or a wave excitedin a real medium made of atoms, and so it cannot be identified as a real particle thattravels in the vacuum. Suppose the medium is infinitely large, the lattice spacingso small that we cannot realize its discreteness by any technology we can use, andthe lattice is invisible. Then the phonon has every reason to qualify as a real parti-cle flying in the vacuum. Whatever the properties of the vacuum that surrounds usmay be, we may consider it to be made of such a superlattice,9) and has the abilityto produce waves and transmit them.

The caveats I warned about are as follows. The mechanical field model can createa longitudinal wave whereas the electromagnetic wave in vacuum is always trans-verse. The fields we deal with in particle physics have infinite degrees of freedomwhile the lattice degrees of freedom are many but finite because of the finite sizeand spacing, though for all practical purposes it can be treated as infinite. Absolutezero temperature cannot be realized, and the lattice is perpetually in thermal excita-tion. The metal environment provides electrons with a hot bath of phonons. Whenan electron travels, it is constantly scattered by the vibrating lattice or phonons. Thenet effect is a velocity-dependent resistive force, and the velocity saturates when theresistive force cancels the local electric force.

2.2.5Mechanical Model of a Classical Field

Here, we show that when the field is a collection of an infinite number of connect-ed harmonic oscillators distributed continuously in space, it satisfies the Klein–Gordon equation, which is the fundamental equation of motion for relativisticfields: �

@2

@t2� r2 C μ2

�'(t, x ) �

�@μ@μ C μ2'(x ) D 0 (2.44)

In order to simplify the mathematics, we consider a one-dimensional oscillator(Figure 2.3) consisting of N C 1 point masses m with interval a spanning totallength D D (N C 1)a. Let u n�1, u n , u nC1 be the displacements of the the (n �1)th, nth and (n C 1)th mass points, respectively. Then their equation of motion is

9) We use this as a metaphor. In condensed matter physics superlattice has its own meaning.

Page 55: Yorikiyo Nagashima - Archive

30 2 Particles and Fields

Figure 2.3 N C 1 mass points are connected by N springs of length a. un�1, un , unC1 aredisplacements of the (n � 1)th, nth and (n C 1)th points, respectively.

expressed as

md2u n

d t2 D T u nC1 � u n

a

�� T

u n � u n�1

a

�(2.45)

Here, we have assumed the springs had been prestretched to a length a to give ten-sion T and that the additional increase of the tension is proportional to its relativestretch. Equation (2.45) can be derived from the Lagrangian

L D K � V DX

n

"m2

�du n

d t

�2

�T2a

(u nC1 � u n)2

#(2.46)

where K is the kinetic energy and V is the potential energy, by using Euler’s equa-tion

dd t

�@L

@ Pu n

��

�@L

@u n

�D 0 (2.47)

where Pu D du/d t.Making N large and keeping the tension constant, we put D D (NC1)a, x D na,

Δx D a, and rename u n�1, u n , u nC1 as u(x � Δx ), u(x ), u(x C Δx ). Then theequation becomes

m@2u@t2

(x , t) D T�

u(x C Δx , t) � u(x , t)Δx

�u(x , t)� u(x � Δx , t)

Δx

�(2.48)

Divide both sides by a D Δx , define linear mass density σ D m/a and usefu(x C Δx ) � u(x )g/Δx ' @u(x )/@x . Then we obtain a wave equation for a wavepropagating with velocity v:

σ@2u@t2

(x , t) D T1

Δx

�@u@x

(x , t) �@u@x

(x � Δx , t)�' T

@2u@x2

(x , t) (2.49)

) 1v2

@2 u@t2 (x , t) �

@2u@x2 (x , t) D 0 , v D

rTσ

(2.50)

If we consider N oscillators under the influence of gravity as in Figure 2.4, theLagrangian (2.46) becomes [Problem 2.5]

L DX

n

"m2

�du n

d t

�2

�T2a

u nC1 � u n

�2�

mg2`

u2n

#(2.51)

Page 56: Yorikiyo Nagashima - Archive

2.2 What is a Field? 31

a a

l

1 2 N N+1

Figure 2.4 Connected pendulums under the influence of gravity (` D length of the pendulums).

where g D gravitational acceleration, and in the continuum limit

1v2

@2u@t2 (x , t)�

@2u@x2 (x , t)C

ω20

v2 u(x , t) D 0 , ω20 D

g`

(2.52)

The equation has exactly the same form as the Klein–Gordon equation, as one cansee by the substitution v ! c, ω0/v ! μ.

Problem 2.5

Prove that the Lagrangian (2.46) is modified to Eq. (2.51) when the oscillators areconnected pendulums of length ` (Fig. 2.4). Prove, in the continuum limit, that theequation of motion becomes Eq. (2.52).

Problem 2.6

Show that the total energy of N C 1 pendulums is given by

E DX

n

"m2

�@u n

@t

�2

C12

Ta

u nC1 � u n

�2C

m2

g`

u2n

#(2.53)

In the continuum limit, the Lagrangian is given by

L DZ L

0dx

"σ2

�@u@t

(x , t) 2

�T2

�@u@x

(x , t) 2

�σω2

0

2(u(x , t))2

#(2.54)

Redefining ' Dp

T u, μ Dq

ω20/v2 and recovering original 3-dimensionality, the

Lagrangian and the energy become

L �Z

d3xL D

•dx d y dz

12

"1v2

�@'

@t

�2

� (r' � r')� μ2'2

#(2.55)

E DZ

d3x12

"1v2

�@'

@t

�2

C (r' � r')C μ2'2

#(2.56)

Page 57: Yorikiyo Nagashima - Archive

32 2 Particles and Fields

The corresponding Hamiltonian can be obtained by

H DZ

d3xH D

Zd3x [p P' �L] p D

δL

δ P'(2.57)

which of course reproduces Eq. (2.56).Taking the limit D ! 1 and putting v ! c reproduces the Lagrangian for the

Klein–Gordon equation:

1c2

@2'

@t2 � r2' C μ2' D 0 (2.58)

The solution to the Klein–Gordon equation may be expressed as a superposition ofplane waves (mathematically as the Fourier transform)

'(x , t) D1

(2π)3

Zd3k Q'(k)e i(k�x�ω t ) , ω2 D c2(jkj2 C μ2) (2.59)

When the Klein–Gordon equation is quantized as described in Sect. 5.3, μ2 be-comes the mass of the quantum.

2.3Summary

We have visualized the field that obeys the Klein–Gordon equation as three-dimensional connected springs distributed over all space. They have the followingcharacteristics.

1. A classical field is a collection of an infinite number of harmonic oscillatorsnumbered by space coordinate x D (x , y , z) interacting with neighboring os-cillators.

2. The amplitude difference in the discrete oscillators becomes differential in thecontinuum limit. This means an oscillator at point x interacts with neighbor-ing oscillators through spatial derivatives. However, as is well known in clas-sical mechanics, an infinite number of connected (i.e. interacting) harmonicoscillators is equivalent to the same number of “normal modes”, which oscil-late independently of each other. The normal modes are plane waves, givenin (2.59), and the interaction with the neighboring oscillators in the originalmode is converted simply to propagation of the wave in the normal mode. Aninfinite number of interacting harmonic oscillators is equivalent to a collectionof freely propagating waves numbered by the wave number k. In short, an al-ternative way of viewing the field is to regard it as a collection of an infinitenumber of harmonic oscillators numbered by the wave number k.

3. Plane waves show particle behavior when they are quantized. As their energy-momentum is given by E D „ω, p D „k , the mass m of the quantum is givenby

m D

pE 2 � (p c)2

c2 D„μc

(2.60)

Page 58: Yorikiyo Nagashima - Archive

2.4 Natural Units 33

It is interesting to note the origin of the mass term μ or ω0 in the mechanicaloscillator model. It did not exist in the medium that excites the wave. It wasgiven as an external force effect acting on all the oscillators uniformly. This hasa far-reaching implication. In the Standard Model, we shall learn that all thefundamental particles in nature have vanishing mass originally and acquirefinite mass through interactions with the Higgs field, which pervades the sur-rounding space. Namely, the mass of an elementary particle is not an intrinsicproperty of its own but is an acquired attribute as a result of environmentalchanges. This concept served as a major breakthrough in elucidating the deeproots of gauge theory and so reaching unified theories.

2.4Natural Units

Universal constants such as „ and c appear frequently in equations. In order toavoid writing these constants repeatedly and risking mistakes in the expression,we introduce “natural units”. All physical quantities have dimensions. They areexpressed in terms of length, mass and time. The familiar MKS units we use aremeter (m) for the length L, kilogram (kg) for the mass M and second (s or sec) forthe time T. The dimension of any physical quantity Q is expressed as

[Q ] D [La M b T c ] (2.61)

Velocity, for instance, has the dimension

[v ] D [LT �1] D m/s (2.62)

Similarly, the dimensions of other mechanical quantities, e.g. acceleration α, force fand energy E, are expressed as

[α] D [LT �2] D m/s2 (2.63a)

[ f ] D [M LT �2] D kg �m/s2 � Newton (2.63b)

[E ] D [M L2T �2] D kg �m2/s2 � Joule (2.63c)

One need not, however, necessarily adopt M LT as the basic units to constructthe dimensions of physical quantities. Any three independent physical quantitiesare equally eligible. Besides, MKS units are too large for applications in parti-cle physics. We may choose energy as one of the basic units and express it inelectron volts (eV), an amount of energy obtained when an electron or proton ofunit charge is accelerated by 1 V of electric potential. As the unit electric charge is1.6021 � 10�19 Coulomb

1 eV D 1.6021 � 10�19 Coulomb � Volt D 1.6021 � 10�19 Joules (2.64)

Page 59: Yorikiyo Nagashima - Archive

34 2 Particles and Fields

Moreover as the extension of electronvolts

1 keV D 103 eV

1 MeV D 106 eV

1 GeV D 109 eV

1 TeV D 1012 eV

(2.65)

are also used, depending on the kind of phenomena concerned. For the time being,we will use MeV as the standard energy unit. In quantum mechanics and relativitytheory, „ D h/2π, where h is Planck’s constant and c the velocity of light, frequent-ly appears. Adopting c, „ and energy as the three basic units, the description ofphysical quantities is simplified, as we describe now. Since

„ D 1.055 � 10�34 J s (Joules � s) D 6.582 � 10�22 MeV s (2.66a)

„c D 1.9733 MeV � 10�13 m (2.66b)

Length and time can be expressed as

[L] D [„c] � [E�1]

[T ] D [„] � [E�1](2.67)

Namely, length is measured by inverse energy E�1 in units of 1.9733 � 10�13 m.Time is also measured by inverse energy E�1 in units of 6.582 � 10�22 s. Now wedefine natural units by putting „ D c D 1. Then the constants „ and c disappearfrom all the equations. The velocity v becomes a dimensionless quantity measuredin units of the velocity of light, which we had to write as D v/c previously.Every dimensionful quantity is expressed in terms of energy or inverse energy.The mass multiplied by c2 and the momentum multiplied by c become energy, sothey are all expressed in MeV. Sometimes the mass is expressed in MeV/c2, andthe momentum in MeV/c, to differentiate them from real energy. To recover theoriginal physical dimension, we only need to reconstruct the right dimension bymultiplying or dividing by „ and c. For instance, the pion mass mπ is 140 MeV, andwe talk about distance or time as 1/mπ . Then

1mπD„c[MeV �m]

140 MeVD

1.97 � 10�13 m140

D 1.4 � 10�15 m

1mπD„[MeV � s]140 MeV

D6.58 � 10�22 s

140D 4.7 � 10�24 s

(2.68)

When we talk of a cross section σ being 1/m2p , its value in MKS units is given by

σ D1

m2pD

(„c)2�m p c2

�2 D

�1.97 � 10�13[MeV �m]

938.3 MeV

2

D 4.4 � 10�28 cm2

(2.69)

Page 60: Yorikiyo Nagashima - Archive

2.4 Natural Units 35

Problem 2.7

Show the value of the classical electron radius re is 2.82�10�15 m, where re is givenby α/me . α D 1/137 is the fine structure constant and me , the electron mass, is0.511 MeV.

Problem 2.8

The neutrino changes its flavor periodically while it propagates, a phenomenoncalled neutrino oscillation. The probability of a ν e changing into νμ at time t isgiven by

P(ν e ! νμ W t) / sin2 Δm2

4Et (2.70a)

Δm2 D jm21 � m2

2j (2.70b)

If Δm2 is given in units of (eV)2, E in MeV, the length in meters, show that theoscillation wavelength L is given by

L[m] D4πEΔm2 D 2.4 �

E [MeV]Δm2 [eV2]

(2.71)

In electromagnetism, we have used, so far, rationalized MKS units. From nowon we will use Heaviside–Lorentz units, which put �0 D μ0 D 1 and natural unitsinstead of MKS units.

Page 61: Yorikiyo Nagashima - Archive
Page 62: Yorikiyo Nagashima - Archive

37

3Lorentz Invariance

In particle physics, most particles we treat are relativistic, i.e. E p c mc2,and the theory of special relativity is an indispensable mathematical tool. A notablefeature is that space and time mix and cannot be treated independently as in New-tonian mechanics. Physical objects that were considered as an independent three-component vector and a scalar in nonrelativistic physics mix in high-energy phe-nomena. They have to be combined to make a four-component Lorentz vector thattransforms like a time and space coordinate (its difference to be exact). For theirconsistent and unified treatment, we have to rely on Einstein’s theory of relativity,which is based on two principles.

1. Invariance of the velocity of light, which postulates that the velocity of lightalways remains as the constant c in any inertial frame.

2. Relativity principle, which requires covariance of the equations, namely thephysical law should keep its form invariant in any inertial frame. In mathemat-ical language this amounts to the fact that physical laws have to be expressed inLorentz tensors.

The relativity principle applies to Galilei transformation and is valid in Newtoni-an mechanics also, but the invariance of the velocity of light necessitates Lorentztransformation in changing from one inertial system to another that are movingrelative to each other with constant speed.

3.1Rotation Group

Before introducing Lorentz transformation operators, it is useful first to review thefamiliar rotation operator in three dimensions, which makes a group O(3). For thedefinition of a group, see Appendix A. The basic operations are very similar, andin fact the Lorentz group contains the rotation group as its subgroup. Let r B D

(xB , yB , zB ) be a point in a coordinate frame L that can be obtained from point r A

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 63: Yorikiyo Nagashima - Archive

38 3 Lorentz Invariance

by rotation through angle θ . Then in matrix form

rB �

24xB

yB

zB

35 D [Ri j ]

24xA

yA

zA

35 � R r A (3.1)

As the rotation does not change the length r Dp

x2 C y 2 C z2,

r2B D rT

B r B D rTARTR r A D r2

A D rTAr A (3.2a)

) RT R D 1 (3.2b)

Namely, R is a 3 � 3 orthogonal matrix. A set of three variables that transformssimilarly by rotation is called a vector and is expressed as V D (Vx , Vy , Vz ). As anexample, let us write down a rotation around the z-axis by angle θ (Figure 3.1a).

24V 0

xV 0

y

V 0z

35 D

24cos θ � sin θ 0

sin θ cos θ 00 0 1

3524Vx

Vy

Vz

35$ V 0 D Rz (θ )V (3.3)

Here, we have defined the rotation as moving the vector from point A to point Breferred to as active transformation. Some authors use the word rotation to meantransformation of the coordinate frame without actually moving the vector, whichwe refer to as passive transformation. Mathematically, the active rotation is equiv-alent to passive rotation by angle �θ (Figure 3.1b), i.e. the passive transformationis an inverse active transformation. We use active transformation throughout thisbook. What we are concerned with is the effect of the operation on a “vector fieldV (x )”, a vector defined at every space coordinate x . The effect can be expressed as

V 0(x ) D D(R)V (R�1x ) (3.4a)

xB

xA

R

y

x = RxB A

e = Re =R eB A ij A j ψ (x ) = ψ (x )B B A A

A B B AV’ = e ψ = RV

A A AV = e ψ

x’

x

yy’

θxA

AV

(a) (b)

R-1

Figure 3.1 (a) Active transformation moves the vector. (b) Passive transformation moves thecoordinate frame.

Page 64: Yorikiyo Nagashima - Archive

3.1 Rotation Group 39

where D(R) acts on the components of the vector. Expressed in terms of compo-nents

V 0i (xk ) D

Xj

Ri j Vj

Xl

R�1k l xl

!(3.4b)

To prove the above equation, let V 0(x) be a new vector field obtained from the oldvector field V (x ) by rotation. Both fields are defined at all space points x . Let V B D

V (x B ) and V A D V (x A) and V 0A D V 0(x B ). The vector V 0

A is obtained by rotatingV A from point x A to point x B (Figure 3.1a). Note that the vector V B is the old vectorfield evaluated at x B and generally differs from V A or V 0

A in both magnitude anddirection. We can write down three-component vectors in factorized form, namelyas a product of their magnitude and a direction vector

V A D jV AjeA , V A0 D jV A

0jeB (3.5)

where eA D V A/jV Aj and eB D V B/jV B j are two base vectors of unit lengthconnected by the rotation. jV j is the magnitude or length of the vector. The basevectors are specified by their direction only. Since the magnitude of the vector doesnot change by rotation, but the direction of the base vector changes, we have

jV 0Aj D RjV Aj D jV Aj D jV (x A)j D jV (R�1x B )j (3.6a)

eB D R eA i.e. (eB)k DX

l

Rk l(eA)l � D(R)eA (3.6b)

) V 0(x B ) D jV 0AjeB D jV AjR eA D RfjV (x A)jeAg D RV (R�1x B )

D Rk l [V(R�1x B )]l D D(R)V (R�1x B )(3.6c)

Rewriting x B as x , we obtain Eqs. (3.4). When the rotation angle around the z-axisis infinitesimal, we can obtain D(Rz)(δθ ) from Eq. (3.3).

D(Rz )(δθ ) D

24 1 �δθ 0

δθ 1 00 0 1

35 D 1� i δθ

240 �i 0

i 0 00 0 0

35 D 1 � i δθ Sz

(3.7a)

Sz � idRz

ˇθD0

(3.7b)

We can also extract an operator that carries out the reverse rotation on the argu-ments as follows. For any function of space point x

f (R�1z x ) D f (x C δθ y , y � δθ x , z)

D f (x , y , z)C δθ (y@x � x@y ) f (x , y , z)

D (1� i δθ L z) f (x , y , z)

L z D �i(x@y � y@x )

(3.8)

Page 65: Yorikiyo Nagashima - Archive

40 3 Lorentz Invariance

L z is the familiar orbital angular momentum operator and Sz is the spin angularmomentum operator. Combining the operations on the coordinates and compo-nents

Rz (δθ )V (R�1z x ) D (1� i δθ Jz)V (x )

Jz D L z C Sz(3.9)

Rotation through a finite angle θ can be obtained by repeating the infinitesimalrotation δθ D θ /n n times and making n !1:

Rz (θ ) D limn!1

Y�1 � i Jz

θn

�n

D e�i θ Jz (3.10)

Jz is called a “generator” of rotation on axis z. Other spin generators are definedsimilarly.

Sx D

240 0 0

0 0 �i0 i 0

35 Sy D

24 0 0 i

0 0 0�i 0 0

35 Sz D

240 �i 0

i 0 00 0 0

35 (3.11)

They satisfy the following commutation relations:

[ Ji , J j ] D i ε i j k Jk (3.12a)

[L i , L j ] D i ε i j k L k (3.12b)

[Si , S j ] D i ε i j k Sk (3.12c)

ε i j k D

(C1 i j k D even permutation of 123

�1 i j k D odd permutation of 123(3.13)

The relation Eq. (3.12) is called Lie algebra and �i j k is called the structure func-tion of the rotation group O(3). The rotation can be specified by a set of threeparameters, a unit vector to define the rotation axis and the rotation angle θ , and isexpressed as a rotation vector θ . Generally, the 3 � 3 rotation matrix has nine vari-ables, but they are reduced to three because of the orthogonality relation (3.2b). Itis known that essential properties1) of the rotational group O(3) (or more generallythose of a Lie group) is completely determined by the three generators and theircommutation relations (i.e. structure constant). The rotation operator is expressedas

R(n, θ ) D e�i J�θ D e�i J�nθ (3.14)

1) Meaning local properties. Global properties may differ. For instance S U(2) and O(3) have thesame algebra but there is 2 : 1 correspondence between the two. See Appendix A.

Page 66: Yorikiyo Nagashima - Archive

3.2 Lorentz Transformation 41

3.2Lorentz Transformation

3.2.1General Formalism

Introducing a coordinate variable x0 D c t, we define two kinds of Lorentz vector

dx μ D (dx0, dx1, dx2, dx3) D (cd t, dx , d y , dz) (3.15a)

dxμ D (dx0, dx1, dx2, dx3) D (cd t,�dx ,�d y ,�dz) (3.15b)

Lorentz transformation is defined as one which keeps the Minkowski metric, thescalar product of the two vectors defined by

d s2 D c2d t2 � dx2 � d y 2 � dz2 � dx μ dxμ � gμν dx μ dx ν (3.16)

invariant. Here we adopted the convention (Einstein’s contraction) that when thesame Greek index appears simultaneously as a superscript and a subscript, sum-mation from 0 to 3 is assumed. “gμν” defined by Eq. (3.16) is called the metrictensor and is expressed in matrix form as

G � [gμν] D

2664

1 0 0 00 �1 0 00 0 �1 00 0 0 �1

3775 ,

G�1 � [gμν] D

2664

1 0 0 00 �1 0 00 0 �1 00 0 0 �1

3775

(3.17)

Minkowski space is defined as a space having the metric (length) defined byEq. (3.16). In Minkowski space the inverse of the metric tensor is the same asthe metric itself, a feature which does not hold in general.2) In general relativity,the metric tensor is a function of space-time coordinates and the Einstein fieldequation is a differential equation for gμν. Introducing a column vector x for thevector with upper index and row vector xT for the vector with lower index, the(homogeneous) Lorentz transformation can be expressed in the form

x μ 0D Lμ

ν x ν , x0 D

26664

x00

x10

x20

x30

37775 D Lx , L D [Lμ

ν ] D@x μ 0

@x ν (3.18)

2) The distinction of the upper and lower index plays a minor role in Minkowski space. In fact, in theold convention, x4 D i ct was used to keep the metric in the same form as the Euclidian definitionof length without distinction of upper and lower indexes. But it is essential in general relativity.

Page 67: Yorikiyo Nagashima - Archive

42 3 Lorentz Invariance

The general homogeneous Lorentz transformation includes space rotation andconstant velocity parallel transformation, called (Lorentz) boost, which are specifiedby two vectors, ω and v . The magnitude of the rotation vector ω represents a rota-tion angle, and its direction the rotation axis. Consider the special case of Lorentzboost in the x direction. Here, a particle at (t, x , y , z) in a coordinate frame L isboosted to (t0, x 0, y 0, z0) with velocity v. This statement is equivalent to changing toanother coordinate frame L0 which is moving in the x direction at velocity �v. L0 isassumed to coincide with L at t D t0 D 0. Then the two coordinates are related by

t ! t0 Dt C (v/c2)xp

1 � (v/c)2) x 00

D γ (x0 C x ) (3.19a)

x ! x 0 Dx C v tp1 � (v/c)2

) x 01D γ (x0 C x ) (3.19b)

x 02D x2 , (3.19c)

x 03D x3 , D

vc

, γ D1p

1� 2(3.19d)

or in matrix form26664

x00

x10

x20

x30

37775 D

2664

γ γ 0 0γ γ 0 00 0 1 00 0 0 1

37752664

x0

x1

x2

x3

3775 (3.20)

A combination of four variables that transforms like the coordinate x μ (morerigorously the difference of the coordinate dx μ) is called a contravariant vector andis denoted as Aμ . A covariant vector is defined as

A μ D gμν Aν D (A0,�A1,�A2,�A3) � (A0,�A) (3.21)

Conversely, the contravariant vector can be defined using the inverse of the metrictensor

Aμ D gμν A ν D (A 0,�A 1,�A 2,�A 3) , gμ� g�ν D δμν (3.22)

In the nonrelativistic limit, the 0th (time) component and the space componentsare separated, the time component behaving as a scalar under space rotation, hencethere is no distinction between the two types of vectors. However, under the Lorentztransformation, the distinction has to be made. The contra- and covariant vectors inMinkowski space can be exchanged simply by changing the sign of the space com-ponents, an operation called parity transformation. The Lorentz transformation ofthe covariant vector is expressed as

A0μ D A ν M ν

μ , or A0TD ATM (3.23)

Page 68: Yorikiyo Nagashima - Archive

3.2 Lorentz Transformation 43

Putting A μ D dxμ ,

dx 0μ dx μ 0

D (dxTM)(Ldx) D dxμ dx μ D dxTdx (3.24a)

) M D L�1 , M νμ D

@x ν

@x μ 0 (3.24b)

which means the Lorentz transformation of the covariant vector is given by theinverse transformation. In a boost like Eq. (3.19), it is obtained simply by the sub-stitution v ! �v .

Problem 3.1

Show that a boost L v in general direction (along v ) is given by

Lv D

24 γ �γ

�γ 1C O� ˝ O�(γ � 1)

35

D

26664

γ x γ y γ z γx γ O2

x γ C O2y CO2

zOxO y (γ � 1) Ox

Oz (γ � 1) y γ O y

Ox (γ � 1) O2x CO2

y γ C O2z

O yOz (γ � 1)

z γ OzOx (γ � 1) Oz

O y (γ � 1) O2x CO2

y CO2

z γ

37775

( O� ˝ O�)i j � O iO j , O� D

�j�j

(3.25)

Hint: Separate the three-dimensional vector in the longitudinal (k�) and the trans-verse directions (? �)

x D [x � O�( O� � x )]C O�( O� � x ) (3.26)

and apply the boost to the longitudinal part.

3.2.2Lorentz Vectors and Scalars

We define Lorentz vectors and Lorentz scalars, which will appear frequently inquantum field theory. An important physical variable is the proper time τ. It is thetime an observer feels in the observer’s rest frame.

Proper Time The proper time dτ � d tp

1 � 2 is a Lorentz-invariant scalar.

Proof: d s2 D (cd t)2 � dx2 � d y 2 � dz2 D c2d t2f1 � (dx/d t)2 � (d y/d t)2 �

(dz/d t)2g D (cd t)2(1� 2) D (cdτ)2

is Lorentz invariant by definition. �

Page 69: Yorikiyo Nagashima - Archive

44 3 Lorentz Invariance

Since a vector multiplied or divided by a scalar is also a vector, it immediatelyfollows that the variables in Table 3.1 are Lorentz vectors.

Table 3.1 Useful Lorentz vectors.

Name Contravariant vector Covariant vector

Coordinate difference dx μ dxμ (3.27a)

Velocity uμ D dx μ /dτ D (γ c, γ v ) uμ D dxμ /dτ (3.27b)Acceleration αμ D duμ /dτ αμ D duμ /dτ (3.27c)

Energy-momentum p μ D muμ D (E/c, p ) pμ D (E/c,�p ) (3.27d)

Wave vector kμ D (ω/c, k) kμ D (ω/c,�k) (3.27e)Current j μ D �0uμ D (�c, j ) a jμ D (�c,� j ) (3.27f)

Electromagnetic potential Aμ D (φ/c, A) A μ D (φ/c,�A) (3.27g)Differential operator @μ D @

@xμD�

1c

@@t ,�r

�b @μ D

@@x μ D

�1c

@@t ,r

�(3.27h)

a �0 is (charge) distribution in its rest frame.b Notice the minus sign of the space components of the differential vector.

Notice, space components of the four-velocity uμ differ from conventional onesby the factor γ .

Problem 3.2

Prove that (3.28e), (3.28g), (3.28h) are Lorentz vectors.

Scalar Product If the scalar product of a given Lorentz vector Aμ and a four-com-ponent variable B μ satisfy the equation

A0μ B 0μ D Aμ Bμ D A μ B μ D A0

μ B 0 μD constant (3.28)

then B μ is a Lorentz vector. We shall denote the scalar product as A � B � A μ B μ .3)

Proof:

A0μ B 0μ D Lμ

� A� B 0μ D A� B� ) B 0

μ Lμ� D B�

Multiplying by M�σ D [L�1]�σ from the right

B 0μ Lμ

� M�σ D B 0

μ δμσ D B 0

σ D B�M�σ

which shows Bσ is a covariant vector. �

By making scalar products with two vectors, it is shown that variables in Table 3.2are Lorentz-invariant quantities.

3) We denote the usual scalar product in three-dimensional space by bold letters, i.e. A � B, and theLorentz scalar product by normal letters, A � B.

Page 70: Yorikiyo Nagashima - Archive

3.3 Space Inversion and Time Reversal 45

Table 3.2 Useful Lorentz scalars.

Name Scalar

Einstein’s relation p2 D p μ pμ D (E/c)2 � p2 D (mc)2 (3.29a)

d’Alembertian @μ@μ D 1c2

@2

@t2 �r2 D � (3.29b)

Phase of a wave kx � kμ x μ D ω t � k � x (3.29c)

Current conservation law @μ j μ D@�

@t Cr � j D 0 (3.29d)

Phase space volume θ (p0)δ(p2 � (mc)2)d4 p D d3 p2E/c (3.29e)

Volume element in Minkowski space d4 x D dx0 dx1 dx2 dx3 (3.29f)

Equation (3.29f) can be proved by calculating the Jacobian of the transformation:

d4x 0 Dˇˇ@x 0

@x

ˇˇ d4x D (det L)d4x D d4 x [See Eq. (3.36)] (3.30)

Comment 3.1 In high-energy experiments, where E � p c, the production cross sectionis often expressed in approximately Lorentz-invariant form

d3σc3d3 p/E

'1E

d2σdE dΩ

(3.31)

3.3Space Inversion and Time Reversal

Since the Lorentz transformation keeps the metric d s2 invariant in any inertialcoordinate frame,

d s2 D gμν dx μ dx ν D gμν dx μ 0dx ν 0D gμν Lμ

� Lνσ dx� dx σ

D g�σ L�μ Lσ

ν dx μ dx ν

) gμν D g�σ L�μ Lσ

ν

(3.32)

where in the second line, indices were interchanged since it does not make anydifference when summation is carried out. It can be rewritten in matrix represen-tation as

G D LTGL (3.33)

Multiplying by G�1 from the left

1 D G�1LTGL D [GLG�1]TL) L�1 D [GLG�1]T (3.34)

where we have used the relations GT D G, [G�1]T D G�1. From Eq. (3.34)

det L�1 D det G det L det G�1 (3.35)

Page 71: Yorikiyo Nagashima - Archive

46 3 Lorentz Invariance

Using det L�1 D 1/ det L, we obtain

det L D ˙1 (3.36)

We see there are two kinds of Lorentz transformation, depending on the sign ofdet L. Taking the (0-0) component of Eq. (3.32),

1 D (L00)2 �

Xi

(Li0)2 ) jL0

0j > 1 (3.37)

The transformation with L00 > 1 is called orthochronous and the one with L0

0 < 1is called non-orthochronous, meaning “changes or does not change the time direc-tion”. The Lorentz transformation can be classified into four classes depending onthe sign of det L and L0

0, and that with det L > 0 and L00 > 0 is called the proper

Lorentz transformation Lp .The proper Lorentz transformation becomes a non-transformation or identity

transformation in the limit when the transformation parameter becomes 0, namelyit is continuously connected with L D 1. It includes boost as well as rotation R inthree-dimensional space. The rotation is expressed as

L D

2664

1 0 0 000 R0

3775 (3.38)

Defining P, T and PT by

P D

2664

1 0 0 000 �10

3775 , T D

2664�1 0 0 000 10

3775 ,

PT D

2664�1 0 0 000 �10

3775 (3.39)

then det Lp D det(PT) D 1, det P D det T D �1. P is called space inversion or paritytransformation, T time reversal and PT total reflection. Their transformation is dis-crete and is not connected smoothly with the identity transformation. Any Lorentztransformation can be expressed as a combination of the proper transformationwith the three discrete transformations and needs six independent parameters (ve-locity and rotation vector) to specify it.

Page 72: Yorikiyo Nagashima - Archive

3.4 Covariant Formalism 47

3.4Covariant Formalism

3.4.1Tensors

Products of two vectors transform under Lorentz transformation as

Aμ 0B ν 0D Lμ

� Lνσ A� B σ (3.40a)

A μ0Bν

0 D M�μ M σ

ν A � Bσ (3.40b)

Aμ 0Bν0 D Lμ

� M σν A�Bσ (3.40c)

and make contravariant, covariant and mixed tensors of rank 2, which are called thedirect product of two vectors. More generally, any set of 16 variables that behavesin the same way as the product of two vectors is called a rank 2 Lorentz tensor. Theconcept can be extended to arbitrary rank r

T μν����σ��� � Aμ B ν � � � P� Q σ � � � product of r vectors (3.41)

Here, � means “transforms like”. By multiplying a tensor by δ�μ , gμν or g�σ and

making a contraction (summation of the same upper and lower indices), the rankr tensor can be converted to a new tensor of rank r � 2:

T μν����σ��� (rank r)!

δ�μ T μν���

�� D T 0 ���

gμν T μν����σ��� D T 0 ���

�σ���g�σ T μν���

�σ��� D T 0 μν������

9=;

D rank r � 2 tensor

(3.42)

We introduce another useful tensor, called the Levi-Civita tensor, defined as

�μν�σ D

(˙1, μν�σ D even (odd) permutation of 0123

0, any pair of μν�σ are equal

�0123 D ��0123 D 1

(3.43)

The Levi-Civita tensor is a Lorentz tensor that transforms to itself because

�μν�σ0D Lμ

a Lνb L�

c Lσd�abcd D �μν�σ det L D �μν�σ (3.44)

Page 73: Yorikiyo Nagashima - Archive

48 3 Lorentz Invariance

3.4.2Covariance

When equations are made of tensors of the same rank, they keep the same form inany inertial frame. For example,

Aμ D B μ LH) Aμ 0 D B μ 0 (3.45a)

A μ B μν D C ν LH) A μ

0B μν0 D C ν 0 (3.45b)

They are said to be (Lorentz) covariant. By transformation the values of both sides ofequations may change, but the functional form is kept invariant. Note, Lorentz co-variance is not to be confused with Lorentz invariance in strict sense, which meansthat the value itself does not change under Lorentz transformation and which ap-plies only to Lorentz scalars. Note, however, many authors including myself oftenuse Lorentz invariance to mean Lorentz covariance. The principle of relativity re-quires covariance under transformations. In special relativity, the transformationis Lorentz transformation; in general relativity it is general coordinate transforma-tion.

To get used to the usage of covariant formalism, let us take a look at the familiarMaxwell equations:

r � E D q�

�0, r � B �

1c2

@E@tD μ0qj (3.46a)

r � E C@B@tD 0 , r � B D 0 (3.46b)

The electric field E and the magnetic field B can be expressed in terms of poten-tials:

E D �rφ �@A@t

, B D r � A (3.47)

Since @μ and Aμ D (φ/c, A) are both Lorentz vectors, Equation (3.47) may berewritten as an antisymmetric tensor of rank 2

F μν � @μ Aν � @ν Aμ D

2664

0 �E1/c �E2/c �E3/cE1/c 0 �B3 B2

E2/c B3 0 �B1

E3/c �B2 B1 0

3775 (3.48)

This means that electromagnetic fields look like two sets of vectors but actuallyare different components of a single second-rank tensor. Using F μν, the Maxwellequations (3.46) can be expressed in two sets of tensor equations

@μ F μν D μ0q j μ (3.49a)

@μ Fνλ C @ν Fλμ C @λ Fμν D 0�Fμν D gμ� gνσ F�σ� (3.49b)

Page 74: Yorikiyo Nagashima - Archive

3.4 Covariant Formalism 49

Problem 3.3

Confirm Eq. (3.48) and Eq. (3.49).

Problem 3.4

Prove that a dual tensor defined by

QF μν D12

�μν�σ F�σ (3.50)

exchanges E ! B, B ! �E , and Eq. (3.49b) can be rewritten as

@μ QF μν D 0 (3.51)

From the definition of the electromagnetic tensor F μν, the Maxwell equationwithout sources (3.49b) is actually an identity equation. Namely Eq. (3.46b) isequivalent to saying that the electromagnetic potential Aμ is a Lorentz vector. InEq. (3.49), the two Maxwell equations are written as rank 1 and rank 3 tensorequations. In this expression, Lorentz covariance of the Maxwell equations is quiteevident. Historically the converse held. The Maxwell equations were derived first,then Lorentz transformation was devised to make the equation covariant, whichled Einstein to special relativity.

Problem 3.5

By its construction, f μ D qF μνu ν , where uμ is defined in Eq. (3.28b), is a con-travariant vector. Show that in the nonrelativistic limit its space component reducesto the Lorentz force F D q(E C v � B) and hence f μ is its relativistic extension.

3.4.3Supplementing the Time Component

In constructing a covariant equation, sometimes it happens that we know only thespace component of a four-vector; the time (zeroth) component is unknown. We didnot know the time component of the Lorentz force, but the four-vector defined inProblem 3.5 naturally supplements it. In the following we show two more examplesto explicitly construct the yet unknown component of the four-vectors.

Example 3.1: Newton’s Second Law

Here, we construct a four-vector f μ which represents a force whose space compo-nent, in the nonrelativistic limit, reduces to F that appears in Newton’s second law.Referring to the list of Lorentz vectors Table 3.1, a Lorentz-covariant extension of

Page 75: Yorikiyo Nagashima - Archive

50 3 Lorentz Invariance

the second law may be expressed as

mduμ

dτD f μ , f μ D ( f 0, f ) � ( f 0, γ F ) (3.52)

It is straightforward to show that the space component of Eq. (3.52) reproducesNewton’s second law

mdvd tD F (3.53)

To derive the time component of the four-vector f μ , we differentiate a Lorentzscalar u μ uμ D 1 and use Eq. (3.52) to obtain

m�

u μduμ

�D m

�u0

du0

dτ� u �

dudτ

�D u0 f 0 � u � f D 0

) f 0 Du � f

u0D γ

v � Fc

γ!0�!

v � Fc

(3.54)

Then the time component of Eq. (3.52) becomes

mdu0

dτD γ

d(mγ c)d t

D f 0 Dγ v � F

c,)

dd t

mc2p

1 � (v/c)2

!D v � F (3.55)

The right hand side of the equation is the rate of energy increase, and the aboverelation led Einstein to derive the famous relation E D mc2.

Example 3.2: Equation of Spin Precession*

We derive here the spin precession frequency of a particle moving in a homoge-neous electromagnetic field. Although the quantum mechanical motion should bederived using the Dirac equation (see Chap. 4) with inclusion of a Pauli term, theexpectation value of the spin operator is known to follow the same time dependenceas that would be obtained from a classical equation of motion [47, 216]. The spinoperator is an antisymmetric tensor S μν (see Sect. 3.5), but a formulation in termsof a covariant (axial) four-vector would be much more convenient and we will try toconstruct it. We define the spin four-vector S μ which reduces to the conventional

spin S μ RD (0, s) in the particle’s rest frame (R).4) It obeys the equation of motion

d sdτ

RD g

q2m

s � B , q D ˙e (3.56)

where μ D gq„/2m („ and c omitted in natural units) is the magnetic momentof the particle and also defines the g-factor. Denote the particle’s four-velocity by

4) Actually, S μ is the dual of the tensor S μν in the sense that S μ D (1/2c)�μν�σ u ν S�σ , where uμ isthe particle’s four velocity.

5) We assume the equation is linear in the spin S μ and the external field F μν .

Page 76: Yorikiyo Nagashima - Archive

3.4 Covariant Formalism 51

uμ D (u0, u) D γ (1, v ), which obeys the equation of motion

mduμ

dτD qF μνu ν D q(u � E , u0E C u � B) (3.57)

where we used Eq. (3.52) and the Lorentz force qF μνu ν supplied by the electro-magnetic field. To obtain the zeroth component S0 of the spin vector, we considera Lorentz scalar u � S that vanishes in the rest frame and hence in any frame. Then

S � u D S0u0 � u � S D 0! S0 D v � S (3.58)

The equation of motion for S0 in the rest frame is readily obtained

dS0

dτD

dvdτ� S C v �

dSdτ

RD

qm

s � E (3.59)

Eq. (3.56) and Eq. (3.59) may be combined and generalized to a Lorentz covariantform:5)

dS μ

dτD g

q2m

�F μν Sν C (Sμ F μν u ν)uμ � q

m

�dudτ� S�

D gq

2mF μν Sν C

�g � 2

2

�qm

(Sμ F μν u ν)uμ (3.60a)

F μν Sν D (S � E , S0E C S � B) RD (s � E , s � B) (3.60b)

Sμ F μν u ν D S0(u � E ) � u0(S � E ) � (S � u � B) RD �s � E (3.60c)

where (3.57) is used in obtaining the second line. Equation (3.60a) is the relativisticequation of motion of the spin vector. It can easily be shown that Eq. (3.60a) reducesto Eq. (3.56) and Eq. (3.59) in the rest frame if one makes use of Eq. (3.60b) and(3.60c). For experimental use, we are interested in the spin precession frequency.The formula Eq. (3.60a) is used to determine a precise value of the g � 2 factor ofthe muon, which will be treated in more detail in Sect. 8.2.2.

3.4.4Rapidity

Successive Lorentz boost in the same direction is represented by a single boostwhere the transformation velocity is given by

00 D jv/cj00 D C 0

1C 0 (3.61)

Proof: Assume velocity v 0 in frame L is observed as v 00 in frame L0, where theframe L0 is traveling in the x direction with �v in frame L. Space coordinates(t0, x10) are expressed in terms of (t, x1) using transformation of Eq. (3.19). We

Page 77: Yorikiyo Nagashima - Archive

52 3 Lorentz Invariance

omit other coordinates for simplicity:

x00D γ (x0 C x1) (3.62a)

x10D γ (x0 C x1) (3.62b)

0 Dv 0

cD

dx1

dx0 (3.62c)

Then

00 Ddx10

dx00 Dγ (dx0C dx1)γ (dx0 C dx1)

D C 0

1C 0 (3.63)

Equation (3.63) shows the velocity is not an additive parameter, i.e. nonlinear,in the successive transformation. A more convenient additive parameter “rapidity”can be defined by

D tanh η , or η D12

ln1C 1�

(3.64)

Problem 3.6

Show that the rapidity is additive, i.e. η00 D η C η0.

Using the rapidity, a Lorentz transformation with finite η, can be decomposedinto N successive transformations with rapidity Δη D η/N , which we discussmore in the following section. Solving , γ in terms of η we have

D tanh η , γ D cosh η , γ D sinh η (3.65)

Lorentz boost (3.19) can be rewritten as

x00D cosh ηx0 C sinh ηx1 (3.66a)

x10D sinh ηx0 C cosh ηx1 (3.66b)

Compare this with rotation in the x–y plane.

x 0 D x cos θ � y sin θ (3.67a)

y 0 D x sin θ C y cos θ (3.67b)

As Eq. (3.66) can be obtained from Eq. (3.67) by the substitution θ ! �i η, x !i x0, y ! x1, Lorentz boost (in the x direction) is formally a rotation by angle �i ηin the x and imaginary time (i x0) plane.

Page 78: Yorikiyo Nagashima - Archive

3.5 Lorentz Operator 53

Comment 3.2 In high-energy collider experiments, produced secondary particles areboosted in the z direction (along the incoming particle axis). The boosted angular distri-bution is better expressed as rapidity distribution. dN/dη � const. is observed experi-mentally. At high energies, each particle has E � p c, pk D p cos θ , and its rapidity isapproximated by so-called pseudo-rapidity

η0 D12

ln1C k1� k

D12

lnE C pkcE � pkc

� � ln tanθ2

(3.68)

This fact is taken into account in designing detectors, which are divided into modulesthat span the same solid angle in the η–φ (azimuthal angle) plane.

3.5Lorentz Operator

In order to obtain a Lorentz transformation operator that will act on a field ψ,where ψ can have multiple components, the Lorentz operation of the field can bedefined similarly to rotation [see Eq. (3.4)]

ψ(x )L! ψ0(x ) D D(L)ψ(L�1x ) (3.69)

where Lx means for small boost

x00D x0 C Δηx1 , x10

D x1 C Δηx0 (3.70)

where we have considered boost only in the x direction for simplicity.First we consider Lorentz transformation operations of the scalar field. Here only

the coordinate variables are affected:

ψ0(x ) D ψ(L�1x ) D ψ(x0 � Δηx1, x1 � Δηx0)

D ψ(x0, x1) � Δη(x1@0ψ C x0@1ψ) � (1� iΔηN1)ψ(x ) (3.71)

As a transformation with finite η can be achieved by N successive infinitesimaltransformations with Δη D η/N :

ψ0(x ) D limN!1f1 � i(η/N )N1g

N ψ D exp(�i ηN1)ψ (3.72)

N1 is called a generator of Lorentz boost and is defined by the formula

N1 D �i(x0@1 C x1@0) D i(x0@1 � x1@0) (3.73a)

Similarly, considering boosts in the x2 and x3 directions

N2 D i(x0@2 � x2@0) (3.73b)

N3 D i(x0@3 � x3@0) (3.73c)

Page 79: Yorikiyo Nagashima - Archive

54 3 Lorentz Invariance

As is well known, orbital angular momentum operators, which are generators ofspace rotation, are given as

L i D �i(x j @k � x k@ j ) D i(x j @k � x k@ j ) , (i, j, k) D (1, 2, 3) and cyclic

(3.73d)

It is easily verified that Ni and L j satisfy the following commutation relations:

[L i , L j ] D i�i j k L k (3.74a)

[Ni , N j ] D �i�i j k L k (3.74b)

[Ni , L j ] D i�i j k Nk (3.74c)

�i j k D

(C1 i j k D even permutation of 123

�1 D odd permutation of 123(3.74d)

Now define Lorentz generators M μν and Lorentz variables ωμν by

M μν D i(x μ@ν � x ν@μ) (3.75a)

M μν D �M νμ , ωμν D �ωνμ (3.75b)

(M23, M31, M12) D (L1, L2, L3) (3.75c)

(M01, M02, M03) D (N1, N2, N3) (3.75d)

(ω23, ω31, ω23) D (θ1, θ2, θ3) (3.75e)

(ω01, ω02, ω03) D (η1, η2, η3) (3.75f)

Then the general Lorentz transformation operators that act on coordinate variablescan be written as

L � exp[�i(N � η) � i(L � θ )] D exp��

i2

ωμν M μν�

(3.76)

Next we consider Lorentz transformations on multi-component fields and includespin operators. Then a general Lorentz boost operator is expressed as

M μν D i(x μ@ν � x ν@μ)C S μν (3.77a)

(S23, S31, S12) D (S1, S2, S3) (3.77b)

(S01, S02, S03) D (K1, K2, K3) (3.77c)

Page 80: Yorikiyo Nagashima - Archive

3.5 Lorentz Operator 55

where S and K satisfy the same commutation relations as L and N in Eq. (3.74).In the following we consider a Lorentz transformation of the vector fields.6) It canbe expressed as

V μ(x )L! V μ 0(x ) D Lμ

ν V ν(L�1x ) (3.78)

Problem 3.7

(a) By inspecting Eq. (3.66) and Eq. (3.67) and generalizing to other directions, showthat infinitesimal proper Lorentz transformation with θ D δθ , η D δη acting onvector components is given by

Lμν D δμ

ν C �μν (3.79a)

�μν D

2664

0 δη1 δη2 δη3

δη1 0 �δθ3 δθ2

δη2 δθ3 0 �δθ1

δη3 �δθ2 δθ1 0

3775 (3.79b)

(b) Show also that

�μν � gμ���ν D ��νμ (3.79c)

(c) Show that Eq. (3.32) alone constrains εμν to be antisymmetric.

As the Lorentz vector transforms in the same way as the space-time coordinates,we can derive the spin part of the Lorentz generator S μν by using Eq. (3.66) or(3.79). Let us derive K1, the Lorentz boost generator in the x direction. Writing

Lμν D [1 � i δη � K � i δθ � S ]μ

ν (3.80)

we can derive K1 by

[K1]μν D i

@

@η1Lμ

ν

ˇˇ

η1D0D

2664

0 i 0 0i 0 0 00 0 0 00 0 0 0

3775 (3.81)

Other generators are obtained similarly.

Problem 3.8

Show

K1 D

2664

0 i 0 0i 0 0 00 0 0 00 0 0 0

3775 , K2 D

2664

0 0 i 00 0 0 0i 0 0 00 0 0 0

3775 , K3 D

2664

0 0 0 i0 0 0 00 0 0 0i 0 0 0

3775 (3.82a)

6) For the treatment of particles with half-integer spin, see Appendix A.

Page 81: Yorikiyo Nagashima - Archive

56 3 Lorentz Invariance

S1 D

2664

0 0 0 00 0 0 00 0 0 �i0 0 i 0

3775 , S2 D

2664

0 0 0 00 0 0 i0 0 0 00 �i 0 0

3775 , S3 D

2664

0 0 0 00 0 �i 00 i 0 00 0 0 0

3775

(3.82b)

Show also that Ki and Si satisfy the same commutation relations Eq. (3.74) as Ni

and L i .

3.6Poincaré Group*

So far we have considered only homogeneous Lorentz transformation. However,the reader should be aware that the Minkowski metric Eq. (3.16) is invariant undermore general Lorentz transformations known as the Poincaré transformation:

x μ 0D Lμ

ν x ν C aμ , x0 D LxC a , L D [Lμν ] D

@x μ 0

@x ν (3.83)

If we write the transformation as T(L, a), it satisfies

T(L0, a0)T(L, a) D T(L0L, L0a C a0) (3.84)

and makes a Poincaré (or inhomogeneous Lorentz) group. If we consider an in-finitesimal transformation, the transformation matrix can be written as

Lμν D δμ

ν C ωμν , aμ D εμ (3.85)

The infinitesimal translation can be expressed as

f 0(x μ C εμ) D f (x μ � εμ) D f (x μ) � εμ@μ f (x μ) D [1C i εμ(i@μ)] f (x μ)

D (1C i εμ P μ) f (x μ)

P μ D i@μ D i(@0,�r) (3.86)

The homogeneous Lorentz transformation is already given by Eq. (3.75). Thereforethe inhomogeneous Lorentz transformation operator can be written as

f (L�1x � a) D�

1� i12

ωμν M μν C i εμ P μ�

f (x ) (3.87)

The finite transformation is obtained as usual by successive operation and is givenby

U(L, a) D exp��

i2

ωμν M μν C i aμ P μ�

(3.88)

Page 82: Yorikiyo Nagashima - Archive

3.6 Poincaré Group* 57

Fields are transformed by the Poincaré group operation as

ψ0(x ) D D(L, a)ψ(L�1x � a) D U(L, a)ψ(x ) (3.89)

where M μν now includes the spin operator. Definitions Eq. (3.86) and Eq. (3.75)show that the operators P μ and M μν satisfy the relations

[P μ , P ν ] D 0 (3.90a)

[M μν , P�] D i(gν�P μ � gμ�P ν) (3.90b)

[M μν , M�σ ] D i(gν�M μσ � gμ�M νσ � gνσ M μ� C gμσ M ν�) (3.90c)

Note, the last commutation relation is the same as Eq. (3.74), but re-expressed interms of M μν instead of L i and Ki . They constitute a closed Lie algebra, as theyshould to make a group, and thus are generators of the Poincaré group. FromEq. (3.75) and Eq. (3.90), we see that the momentum as well as the angular mo-mentum operators, J D (M23, M31, M12), commute with the Hamiltonian (H DP0) and hence are constants of motion, but the Lorentz boost operators fK D(M01, M02, M03)g are not. We can make two Casimir operators that commute withall the generators:

m2 D Pμ P μ (3.91a)

Wμ W μ D � J( J C 1)m2 , Wμ D �12

εμν�σ M ν�P σ (3.91b)

Wμ is called the Pauli–Lubanski operator. Equation (3.91) can be used to define themass and spin of a particle. It can be verified that

[P μ , W ν ] D 0 (3.92)

Therefore we can diagonalize the mutually commutable operators simultaneously,

P μ , W3, P2, W 2 (3.93)

and use their eigenvalues to specify a one particle state. Particle states may be clas-sified by the value of m2 and P μ as follows:

1. P2 D m2 > 0 ordinary massive particle2. P2 D m2 D 0 photon, graviton, etc.3. P2 < 0 tachyon or virtual particle4. P μ D 0 vacuum

(3.94)

Only particles in categories 1 and 2 are physical particles and we consider themonly. For particles with m2 > 0, we can also specify their spin S. In their restframe, P μ D (m , 0, 0, 0), Wi becomes

Wi D �12

ε i ν�σ J ν�P σ D �m2

ε i j k0 J j k D mSi (3.95)

Page 83: Yorikiyo Nagashima - Archive

58 3 Lorentz Invariance

and the state can also be specified by its spin components. However, for particleswith m2 D 0 including photons and gravitons, which fly always at the speed oflight, the notion of rotation in the rest frame cannot be applied. As

P2 D W 2 D P � W D 0 (3.96)

P μ and W μ are orthogonal and both are on the light cone. So if W μ is a real vector,it means they are proportional:

W μ D λP μ (3.97)

Therefore, the particle state with m2 D 0 is characterized by one number, whichis proportional to W μ/P μ and has the dimension of angular momentum. It iscalled helicity (λ D J3 along the momentum direction). With the parity operation(if conserved), the helicity can be flipped. It can be shown that J3 D ˙j J j. Thephoton has J D 1 and its interaction preserves parity. Therefore the photon canhave J3 D ˙1, but a state with J3 D 0 is not allowed, which means the photon is aquantum of a transverse wave. In summary, a massless particle can have only twostates, specified by its helicity. For the case of a spin 1/2 massless particle, whichis known as the Weyl particle, we shall give an explicit expression of the state inSect. 4.1.2.

In summary, in the framework of special relativity, a particle state can be spec-ified by its energy-momentum, its spin value s, and spin components sz . If theparticle is massive, sz has 2S C 1 components but if it is massless sz D ˙S hasonly two components.

Page 84: Yorikiyo Nagashima - Archive

59

4Dirac Equation

4.1Relativistic Schrödinger Equation

4.1.1Dirac Matrix

We have seen in Chap. 2, that the relativistic wave equation encounters problems,such as negative probability and negative energy, if we try to interpret the wavefunction as a probability amplitude. The difficulties can be avoided by interpret-ing it as an equation for a (classical) field. Historically, a relativistic version of theSchrödinger equation without negative probability was first obtained by Dirac. Italso turned out to be the basic wave equation for the spin 1/2 particle. Since quarksand leptons, the fundamental particles of matter, have spin 1/2, we have to under-stand the structure of the spin 1/2 field before we quantize the Dirac fields. For themoment, we follow Dirac’s argument to derive a relativistic Schrödinger equation.His heuristic argument helps in understanding the physical meaning of otherwisemathematical equations. Dirac’s logic was as follows.

1. The negative probability originates in the fact that the Klein–Gordon equa-tion contains second time derivatives. Therefore, the equation must be a linearfunction of the first time derivative. Relativistic covariance requires that spacederivatives have to be linear, too.

2. The wave function ψ should have more than one component. Because theequation of motion generally includes second derivatives and they can only beincluded by introducing coupled multicomponent equations.1) If we define the

1) This can be understood, for instance, byconsidering a simple harmonic oscillatorequation @2q/@t2 D �ω2 q which includes asecond derivative. By introducing q1 D q,q2 D �i(@q/@t)/ω, it can be converted to twosimultaneous equations, @q1/@t D iωq2 ,@q2/@t D iωq1 . In matrix form, they can be

combined to a single linear equation.

dψdtD H ψ, ψ D

�q1

q2

�, H D

�0 iω

iω 0

�(4.1)

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 85: Yorikiyo Nagashima - Archive

60 4 Dirac Equation

probability as

� DX

r

ψ�r ψr D ψ† ψ , ψ D

264

ψ1

ψ2...

375 (4.2)

� stays positive definite.3. The Klein–Gordon equation (i.e. the Einstein relation E 2 D p2C m2) has to be

satisfied.

Applying (1) and (2), the equation must be of the form

i@ψ@tD H ψ D

�iX

i

α i@

@x i C m

!ψ D (α � p C m)ψ (4.3)

Here, α i and are matrices. p is a differential operator but becomes a c-numberwhen we apply it to a plane wave ψ(x ) D Ae�i p x D Ae�i E tCp�x . Since any wavecan always be decomposed into plane waves, we shall use (E , p ) interchangeablybetween differential operators and c-numbers unless there is confusion betweenthe two:

pμ D (E ,�p ), i@μ , p μ D (E , p ), i@μ D

�i

@

@t,�ir

�(4.4)

The Schrödinger equation (4.3) can be expressed in the form

E ψ D (α � p C m)ψ (4.5)

Statement (3) requires

E 2 D p2Cm2 D

3Xi, j D1

(α i α j C α j α i )p i p j Cm3X

iD1

(α i C α i )p i C 2m2

(4.6a)

) α2i D 2 D 1 , α i α j Cα j α i D 0 (i ¤ j ) , α iCα i D 0 (4.6b)

Matrices that satisfy Eq. (4.6b) are called Dirac matrices. Their dimension has to bean even number.

Proof:(a) Eigenvalues of α i , are˙1.

* α i φ D λφ ! α2i φ D φ D λ2φ, ) λ D ˙1 (4.7)

(b) α i , are traceless.

* Tr[α i ] D Tr[α i 2] D Tr[α i ] D �Tr[α i 2] D �Tr[α i ] (4.8)

The third equality follows because a cyclic order change does not affect the tracevalue and the fourth equality uses the anticommutativity of α i and . (a) and (b)together restrict the dimension to be an even number. �

Page 86: Yorikiyo Nagashima - Archive

4.1 Relativistic Schrödinger Equation 61

4.1.2Weyl Spinor

If m D 0, α i D ˙σ i satisfy the Dirac condition (4.6b), where σ i are Pauli matrices:

i@ψ@tD ˙σ � (�ir)ψ (4.9)

σ1 D

�0 11 0

�, σ2 D

�0 �ii 0

�, σ3 D

�1 00 �1

�(4.10)

This particular form is called the Weyl equation. We shall consider a plane-wavesolution of the type ψ(x ) � u(p )e�i p x , where u(p) is a two-component wave func-tion. We distinguish two solutions φL and φR, corresponding to different signs ofthe Weyl equation (4.9). They satisfy

E φL D �σ � pφL , E φR D Cσ � pφR (4.11a)

φL D �jpjE

σ � pjpj

φL , φR DjpjE

σ � pjpj

φR (4.11b)

Here,

h �σ � pjpjD σ � Op (4.12)

is a spin projection operator in the direction of motion and is called a helicity oper-ator. Op is a unit vector in the direction of p . Classically, the helicity eigenstate withh D (C) represents a moving state rotating clockwise2) and the particle is said tobe right handed. Similarly, the helicity (�) state is left handed.

Problem 4.1

Let Op be in the direction (θ , φ). Then the helicity operator is expressed as

σ � Op D�

cos θ sin θ e�i φ

sin θ e i φ � cos θ

�(4.13)

Show that eigenstates �˙ of the helicity operator are expressed as

�C D

"cos θ

2 e�i φ/2

sin θ2 e i φ/2

#, �� D

"� sin θ

2 e�i φ/2

cos θ2 e i φ/2

#(4.14)

As m D 0, jp j/E D ˙1. Choosing �� for φL gives

φL D �� E > 0 , φL D �C E < 0 (4.15)

2) The orbital angular momentum is defined as l D r � p and lz > 0 means clockwise rotationviewed from upstream or in the �z direction. We consider the spin operator to have the sameattributes.

Page 87: Yorikiyo Nagashima - Archive

62 4 Dirac Equation

In other words, the solution φL to the Weyl equation is a pure helicity state of a par-ticle but flipped for its antiparticle (E < 0). The exact definition of an antiparticlewill be given in the next section. We shall use it to denote the negative energy par-ticle. φL does not contain a �C component for the particle and no �� componentsfor the antiparticle. We have now justified the use of L/R for the subscript. We callit the chirality (�) state, an extended helicity concept applicable to the antiparticle.It will be discussed in more detail in Sect. 4.3.5. φR is defined similarly. φL and φR

are solutions of different equations and hence are particles independent of eachother. They are called Weyl particles.

However, the two equations (4.11a) and hence φL and φR interchange by a paritytransformation:

x $ �x , p $ �p (4.16)

Therefore the solutions cannot be applied to particles that have strong or electro-magnetic interactions, as they are known to conserve parity. The Weyl solutioncould describe the neutrino, which interacts only weakly apart from the universalgravity.3) To describe particles in the real world, we have to combine the two solu-tions. Let us extend ψ from a two-components to 4-component wave function anddefine

ψ D�

φL

φR

�, α �

��σ 00 σ

�, �

�0 11 0

�(4.17)

is a matrix to exchange φR and φL. The above α and satisfy the Dirac condition(4.6b) and so the equation using α and defined above satisfies all the require-ments for describing relativistic particles having finite mass. It is called the Diracequation. Note that the way to determine α and is not unique. Different α0 and0 defined by transformation matrix T

α0 D T αT �1 , 0 D T T �1 (4.18)

satisfy the same Dirac conditions. The matrix form in Eq. (4.17) is called the Weylrepresentation. Another representation commonly used in many textbooks is thePauli–Dirac representation, which is given by

ψD D T ψWeyl , T D1p

2

�1 1�1 1

αD D

�0 σσ 0

�, D D

�1 00 �1

� (4.19)

The physics content does not depend on which representation is used, but it iseasier to proceed with an explicit representation. In this book we use the Weyl rep-resentation unless otherwise noted because it diagonalizes the energy-momentum

3) However, the phenomenon of neutrino oscillation, discovered in 1998 [357], is evidence for thefinite mass of the neutrino, hence the possibility of the neutrino being a Weyl particle is denied.

Page 88: Yorikiyo Nagashima - Archive

4.1 Relativistic Schrödinger Equation 63

term and is more suitable for describing relativistic (E , p m) particles. ThePauli–Dirac representation diagonalizes the mass term and is convenient for de-scription in the nonrelativistic energy region. Let us proceed. The Dirac equationis given by

i@ψ@tD (�α � ir C m)ψ , E ψ D [α � pC m]ψ (4.20)

Now we try to construct a probability current by assigning

j μ(x ) D (�(x ), j (x )) (4.21a)

�(x ) D ψ†(x )ψ(x ) , j (x ) D ψ†(x )αψ(x ) (4.21b)

Using the Dirac equation (4.20)

@�

@tD

@ψ†

@tψ C ψ† @ψ

@tD �if�(H ψ)†ψ C ψ† H ψg

D �i [�f(�i α � r C m)ψg†ψ C ψ†f(�i α � r C m)ψg]

D �r � (ψ† αψ) D �r � j

(4.22)

so the current defined above respects the equation of continuity. Since � is alwayspositive definite, the interpretation of ψ as a probability amplitude wave functionis possible. Dirac seems to have succeeded in constructing a relativistic version ofthe Schrödinger equation.

Problem 4.2

Consider Eq. (4.20) as expressed in the Heisenberg picture. Using the Heisenbergequation of motion

dOdtD i [H, O ] (4.23)

and setting O D x , show that dx/d t D α.

Therefore α can be considered as a velocity operator. This also means j μ D (�, j )can be interpreted as a classical four-component current (�, �v ).

Problem 4.3

Prove that H D α � p C m does not commute with L but does with J , where

L D x � p D x � (�ir) , J D LC S D LC12

Σ , Σ D�

σ 00 σ

�(4.24)

S satisfies the commutation relation of the angular momentum. It can be con-sidered as a four-component version of the spin operator with s D 1/2.

Page 89: Yorikiyo Nagashima - Archive

64 4 Dirac Equation

Problem 4.4

Show that the four-component helicity operator h D Σ � Op commutes with theDirac Hamiltonian.

Therefore, the helicity eigenstate is also an energy eigenstate of the Hamiltonian,and its quantum number can be used to specify the quantum state of the spin 1/2particle.

4.1.3Interpretation of the Negative Energy

Spin of the Dirac ParticleThere still remains the negative energy problem, but first we will try to understandwhat kind of freedom the wave function possesses. In the limit of p ! 0, the Diracequation, Eq. (4.20), becomes

E ψ D mψ D m�

0 11 0

�ψ (4.25)

Solutions are easy to find:

ψC,r D

��r

�r

�, ψ�,r D

��r

��r

�(4.26)

where �r is a two-component spin wave function

�C D�

10

�, �� D

�01

�(4.27)

ψ˙,r are eigenstates of the Hamiltonian as well as the four-component spin oper-ator S defined in Eq. (4.24). Namely,

E ψC,r D mψC,r E ψ�,r D �mψ�,r (4.28a)

S3ψ˙,C D12

ψ˙,C S3ψ˙,� D �12

ψ˙,� (4.28b)

We now have proved that the Dirac equation describes a particle of spin 1/2, butalso has a freedom corresponding to the sign of the energy. The negative energystate seems unavoidable.

Consequences of the Dirac EquationDirac applied his equation to the problem of energy levels in the hydrogen atom.By solving the Dirac equation including a static Coulomb potential, he obtainedthe energy spectrum given in Fig. 4.1 and explained the fine and hyperfine struc-ture of the energy levels, which were all degenerate according to the nonrelativisticSchrödinger equation.4) He also investigated the property of the electron in a mag-

4) Actually Dirac did not calculate the energy levels of the hydrogen when he published the paper onhis equation. Later, in his Oppenheimer memorial talk in 1969, he confessed that he did not doit because he was afraid his equation might be disproved by detailed calculations. The author isgrateful to Professor T. Kubota for pointing out this historically interesting anecdote.

Page 90: Yorikiyo Nagashima - Archive

4.1 Relativistic Schrödinger Equation 65

Fine structure10.9 GHz

Lamb shift1057 MHz

Hyperfine structure1420 MHz

1S1/2

3S1/2

2S1/2 2P1/2

3P3/23P1/2

2P3/2

3D3/2

3D5/2

Triplet

Singlet

Figure 4.1 Low-lying energy spectrum ofthe hydrogen atom. Degenerate states withthe principal quantum number n are sepa-rated by spin–orbit coupling (fine structure)and spin–spin coupling (hyperfine struc-ture). The Lamb shift is due to a higher or-der QED effect. The diagram is not drawn

to scale. Note that the hyperfine splitting of1S1/2 corresponds to an energy splitting of5.89 � 10�6 eV D 1420 MHz D 21 cm radiowave, which is extensively used in radio as-tronomy to probe the hydrogen abundance inthe universe.

netic field and succeeded in explaining the g D 2 value of the moment, which isdescribed in Sect. 4.3. By his great success, he must have convinced himself of thecorrectness of his equation. Then the negative energy puzzle must correspond tosome physical reality and its apparent contradictory property should have some ex-planation. The contradiction stems from the fact that if a particle is in a negativeenergy state, it can lose further energy by emitting photons and falls down withoutlimit and hence cannot be stable.

Dirac considered that the vacuum must be a state where all the negative energylevels are occupied. Then, because of the Pauli exclusion principle, electrons cannotfall down to negative energy levels and thus become stable. However, the conceptopened a new possibility. When an energy greater than twice the electron mass isgiven to the vacuum, an electron in the negative-energy sea is excited to a positiveenergy level and a hole is created in the sea (Fig. 4.2a). When an electric field isapplied, a negative energy electron next to the hole moves in to fill the hole andcreates another hole where it was before (Fig. 4.2b). The process is repeated and thenet effect is that the hole moves in the opposite direction to the electron. It behavesas an ordinary particle with positive energy, but having charge and momentumopposite to the electron. Its spin component is also opposite, because when thehole is filled, the quantum number is that of the vacuum. Thus, Dirac predictedthe existence of the antiparticle and that a pair creation of a particle and antiparticlefrom vacuum will result if enough energy is injected. Thus materialization of pureenergy became a possibility, as was already implicit in Einstein’s equation E Dmc2.

Page 91: Yorikiyo Nagashima - Archive

66 4 Dirac Equation

Figure 4.2 (a) The vacuum is a sea of negative-energy electrons. A hole is created by exciting anelectron to a positive energy level. (b) The hole moves in the opposite direction to the electron.

Figure 4.3 Anderson’s discovery of thepositron in a Wilson cloud chamber [21].The incident charged particle in the cosmicrays leaves a curved track in the magneticfield. The particle loses energy after passingthrough a 9 mm thick lead plate, leading to alarger curvature (smaller radius). Thereforeit has passed through the plate from below.

From the orientation of the magnetic field andthe direction of the bend, the charge of theparticle is determined to be positive. From thedifference of the curvature before and after thepassage of the plate, the energy loss, hencethe mass, of the particle can be deduced. Itwas concluded that it had much lighter massthan the proton. Photo from [22].

Dirac considered first that the antiparticle might be the proton, but it soon be-came clear that the antiparticle must have the same mass, same spin, but oppositespin component as well as the opposite magnetic moment. In 1932, C.D. Andersondiscovered the positron [21] (see Fig. 4.3), a positively charged particle that has thesame mass as the electron, in cosmic rays using a cloud chamber and confirmedDirac’s theory.

Particles Traveling in the Opposite Time DirectionAlthough the existence of the antiparticle was confirmed experimentally, Dirac’shole theory cannot be applied to bosons, for which the Pauli exclusion principle

Page 92: Yorikiyo Nagashima - Archive

4.1 Relativistic Schrödinger Equation 67

Figure 4.4 The antiparticle is (a) a hole in the sea of negative electrons or (b) an electron travel-ing in the reverse time direction.

does not apply.5) To avoid the use of the Pauli exclusion principle, let us take aminute to consider how an electron with negative energy behaves in the hole the-ory. Suppose an electron and an antiparticle having energy-momentum p1 and q1

are scattered into a state with p2 and q2 (Fig. 4.4a). The process can be expressedby

p1 C q1 ! p2 C q2 (4.29)

In the hole theory, a hole with q1 became a hole with q2, as depicted in Figure 4.4a,but this is a reinterpretation of an event where an electron with �q2 was scatteredinto an electron with �q1. Writing the equation in terms of the original electrongives

p1 C (�q2)! p2 C (�q1) (4.30)

Here the scattering of the electron with (�q2) precedes the filling of the hole (�q1).Equation (4.29) and Eq. (4.30) are equivalent mathematically. What actually hap-pens is Eq. (4.29), but we can consider it as Eq. (4.30), but with the negative particletraveling backward in time. The interpretation is that when a negative-energy par-ticle of energy-momentum �p μ D (�E ,�p ), E > 0, travels backward in time,it is observed as an antiparticle of positive energy-momentum p μ traveling in thenormal time direction, as is depicted in Figure 4.4b. The arrow indicates the timedirection.

The interpretation is valid for bosons, where no Pauli exclusion principle canbe applied. Of course, in reality, there are no backward-traveling particles. This isan example of how one has to make a physically correct interpretation of a math-ematical equation that exhibits an apparent contradiction. Feynman and Stückel-berg abandoned the idea of the negative-energy sea and only applied the concept ofbackward traveling to the negative-energy particle.

5) The concept of holes is useful and is routinely used in today’s condensed matter physics. Inthe ground state, at zero absolute temperature, all the electrons occupy states from the lowestlevel up to the Fermi level, which may be called the vacuum state. Then, at finite temperature,some electrons are excited to above the Fermi level and holes are created. Many phenomena aresuccessfully described using particle–hole interactions.

Page 93: Yorikiyo Nagashima - Archive

68 4 Dirac Equation

Let us consider a wave field instead of a wave function, obeying the spinlessKlein–Gordon equation for simplicity. The wave with negative frequency, i.e. p μ D

(�E ,�p ), E > 0, can be written as

e�i(p 0t�p�x )ˇ

p μ ! e if(�E)(�t )�(�p)�(�x)g D e i(E t�p�x) (4.31)

This means a wave with negative frequency, i.e. p μ D (�E ,�p ), E > 0, travelingbackward in time as well as in space direction gives a wave traveling in the samedirection as that with positive frequency. A current created by a plane wave fieldφ D N e�i p x is expressed as

j μ D i e(φ�@μ φ � @μ φ�φ) � ejN j2 p μ D ejN j2(E , p ) (4.32)

The plane wave with negative frequency p μ D (�E ,�p ) gives a current

j μ D ejN j2(�jE j,�p) D �ejN j2(jE j, p ) (4.33)

which is the same current as the positive energy current, only the charge is oppo-site. If we interpret ψ not as a relativistic probability amplitude but as a classicalwave field, the negative energy simply means negative frequency and presents nodifficulty in its interpretation.

The concept of backward time traveling, though unnecessary once quantizedfield theory is introduced, brings in a new concept about scattering phenomena.

Consider a particle A proceeding in time. It will behave as described in Fig. 4.5a,where kinks denote interactions. In the conventional way of thinking, the particlenever travels backward in time. Using the new concept of backward travel for a

Figure 4.5 If backward time travel is allowed,a particle–antiparticle pair appears. (a) Ordi-nary scattering. (b) When backward travel isallowed, a scattering backwards in time oc-curs. From the observational point of view, anantiparticle B and a particle C appear at time

t D t2. The antiparticle and the original par-ticle A annihilate each other and disappear attime t D t3. (c) Within an interval allowedby the Heisenberg uncertainty principle, anynumber of particles can appear.

Page 94: Yorikiyo Nagashima - Archive

4.1 Relativistic Schrödinger Equation 69

negative-energy particle, a process like Fig. 4.5b is allowed. Observationally no par-ticle travels backward in time. What really happens is that an antiparticle B and aparticle C appear at time t D t2, and subsequently the particle B annihilates withthe original particle A and disappears at time t D t3. The particle C takes over thepart of A and proceeds onwards. A pair creation from vacuum has occurred. As amatter of fact, any number of particles can be created in a time interval allowedby the Heisenberg uncertainty principle. The charge has to be conserved, whichmeans the backward-traveling particle must have the opposite charge.

What we have learned is, whether a sea of negative energy or backward travelin time is considered, a negative-energy state forces us to go from a one-particledescription to a multiparticle world. A backward-traveling particle induces vacuumfluctuation. The vacuum is no longer a stable static state, but rather like boilingwater, constantly creating and annihilating pairs of particles and antiparticles.

4.1.4Lorentz-Covariant Dirac Equation

The Schrödinger equation involves special treatment of the time variable or energyand its form is not manifestly covariant. For a covariant treatment, we multiplyboth sides of Eq. (4.20) by and rearrange the equation to get the Dirac equationin a covariant form:�

γ μ pμ � m�

ψ D�γ μ i@μ � m

�ψ(x ) D 0 (4.34)

where we have defined the Dirac γ matrices by

γ 0 D D�

0 11 0

�, γ D (γ 1, γ 2, γ 3) D α D

�0 σ�σ 0

�(4.35)

They satisfy the following relations

γ μ γ ν C γ ν γ μ D gμν (4.36)

γ 0 † D γ 0 , γ† D �γ (4.37)

Equation (4.36) is called the Dirac condition for gamma matrices. To obtain a con-jugate form of the Dirac equation, we take the hermitian conjugate of Eq. (4.34)and multiply by γ 0 from the right. Using γ 0γ μ†γ 0 D γ μ we obtain

ψ(γ μ i �@μ C m) D 0 , ψ D ψ† γ 0 (4.38)

ψ is called the adjoint of ψ. The equation of continuity is now expressed as

@μ j μ D @μ(ψγ μ ψ) D 0 (4.39)

Page 95: Yorikiyo Nagashima - Archive

70 4 Dirac Equation

The form suggests that the current ψγ μ ψ is a Lorentz 4-vector. Indeed if we defineS μν by

S μν Di4

[γ μ , γ ν ] , S μν D �S νμ (4.40a)

S i j D12

�σ 00 σ

��

12

Σ , S0 i Di2

α i Di2

��σ 00 σ

�(4.40b)

they satisfy the commutation relations Eq. (3.74) for Lorentz group generators.Therefore

S(L) D exp��

i2

ωμν S μν�D exp(α � η/2 � i Σ � θ/2) (4.41)

is a Lorentz operator acting on ψ(x ). The above formula shows that the rotationoperator is unitary but the boost operator is not.6) Using Eq. (4.41) and the Diraccondition Eq. (4.36) for the gamma matrices, we obtain

S�1 D γ 0S† γ 0 (4.42)

A Lorentz-transformed spinor takes the form [see Eq. (3.69)]

ψ(x )L! ψ0(x ) D D(L)ψ(L�1x ) (4.43)

or equivalently, expressed in terms of S(L),

ψ(x )! ψ0(x 0) D S(L)ψ(x ) , ψ(x )! ψ0(x 0) D ψ(x )S�1(L) (4.44)

A bit of algebra shows ψψ and ψγ μ ψ are a Lorentz scalar and a vector, respec-tively. Although ψ itself is not a vector, its bilinear form with constant matricessandwiched in between behaves like a vector (or a tensor) and is called a spinor.

Problem 4.5

For Lorentz boost in the x direction Eq. (4.41) becomes

S(L x ) D exp(αx ηx /2) (4.45)

Show ψγ μ ψ follows the same transformation law as given in Eq. (3.62) and thatψψ remains unchanged.

More detailed treatment of the spinor is given in Appendix A. Loosely speaking,we may say the spinor is like the square root of the vector. Since

ψγ μ ψL! ψ0γ μ ψ0 D ψS�1γ μ S ψ D Lμ

v ψγ v ψ (4.46a)

6) This is related to the fact that the Lorentz group is not a compact group. The range of theLorentz parameter is �1 < v < 1 or �1 < η < 1. There is no finite-dimensional unitaryrepresentation. As the argument becomes very mathematical, we only note that it is possible torealize states that transform as unitary representations of the Poincaré group.

Page 96: Yorikiyo Nagashima - Archive

4.2 Plane-Wave Solution 71

we may simply write

γ μ L! γ μ0

D S�1γ μ S D Lμν γ ν (4.46b)

In this sense we also refer to γ μ as a Lorentz 4-vector, but it is meaningful onlywhen it is sandwiched between spinors.

Now we prove the Lorentz covariance of the Dirac equation. By a Lorentz trans-formation Eq. (4.34) should take the form

(γ μ i@0μ � m)ψ0(x 0) D 0 (4.47)

Inserting (4.44) and using

@0μ D (L�1)�

μ@� (4.48)

Eq. (4.47) becomes

[γ μ(L�1)�μ i@� � m]S ψ(x ) D 0 (4.49)

Multiplying by S�1 from left and using Eq. (4.46b), we obtainh(S�1γ μ S )(L�1)�

μ i@� � mi

ψ(x ) D [Lμσ(L�1)�

μ γ σ i@� � m]ψ(x ) D 0

) (γ μ i@μ � m)ψ(x ) D 0 (4.50)

Thus the Dirac equation has been proved to keep its form under the Lorentz trans-formation.

4.2Plane-Wave Solution

Rewriting the four-component Dirac equation again in terms of two-componentwave functions, we obtain

i(@0 � σ � r)φL(x ) D mφR(x ) (4.51a)

i(@0 C σ � r)φR(x ) D mφL(x ) (4.51b)

See Eq. (4.17), (4.34) and (4.35) to obtain the above equations. We see that the massis the cause of mixing of φL and φR.7) For plane-wave solutions, φL and φR can be

7) The reason why the two helicity states mixwhen the mass is finite can be understoodas follows. The velocity of a particle withfinite mass is always less than the speed oflight c. Therefore it is possible to transformto a coordinate system that is flying fasterthan the particle. In that coordinate system,

the particle is moving backwards but rotatingin the same direction. Therefore the helicitystatus is inverted. The amplitude for mixingis proportional to� m/p . If m D 0, there isno such coordinate system and hence thehelicity does not mix.

Page 97: Yorikiyo Nagashima - Archive

72 4 Dirac Equation

expressed as

φL(x ) D N φL(p )e�i p x (4.52a)

φR(x ) D N φR(p )e�i p x (4.52b)

They satisfy the following coupled equations:

(p 0 C σ � p )φL(p ) D mφR(p ) (4.53a)

(p 0 � σ � p )φR(p ) D mφL(p ) (4.53b)

From Eq. (4.53a), we can express φR in terms of φL

φR D1m

(p 0 C σ � p )φL (4.54)

Then inserting this equation into Eq. (4.53b), we obtain

1m

(p 0 � σ � p )(p 0C σ � p )φL D1m

(p 02� p2)φL D mφL (4.55)

Therefore, φL satisfies the Klein–Gordon equation. Similarly φR satisfies theKlein–Gordon equation too. To obtain the solution to Eq. (4.53), we assumep 0 D E > 0 and rewrite E , p in terms of rapidity. Using E D mγ D m cosh η,p D mγ Op D m sinh η Op [see Eq. (3.65)], we have

E ˙ σ � p D m (cosh η ˙ sinh ησ � Op ) D m exp(˙σ � η) (4.56a)

η D12

lnE C pE � p

, Op Dpjp j

, p D jp j , η D η Op (4.56b)

Note: When dealing with a transcendental function of matrices, we can alwaysexpand it in series. Thus

exp(σ � η) D 1C (σ � η)C(σ � η)2

2!C

(σ � η)3

3!C � � � (4.57a)

Using

(σ � η)2n D η2n1 , (σ � η)2nC1 D η2nC1(σ � Op ) (4.57b)

and rearranging the series, we obtain

D

�1C

η2

2!C � � �

�C

�η C

η3

3!C � � �

�(σ � Op )

D cosh η C sinh η (σ � Op )(4.57c)

Page 98: Yorikiyo Nagashima - Archive

4.2 Plane-Wave Solution 73

Let us introduce four-component spin vectors σμ , σμ and dot products by

σμ D (1, σ) D σμ , σμ D (1,�σ) D σμ (4.58a)

p � σ D p μ σμ D E � σ � p , (4.58b)

p � σ D p μ σμ D E C σ � p (4.58c)

Then multiplying byp

p � σ Dp

m exp(�σ � η/2) on both sides of Eq. (4.53a) andusing (E � σ � p )(E C σ � p ) D E 2 � p2 D m2, we obtainp

p � σφL Dp

p � σφR (4.59)

which has solutions

φL D Np

p � σ�r D Np

m exp��

12

σ � η�

�r (4.60a)

φR D Np

p � σ�r D Np

m exp�C

12

σ � η�

�r (4.60b)

Combining the two two-component wave functions to obtain a four-componentsolution

u r (p ) ��

φL

φR

�E>0

D N�p

p � σ�rpp � σ�r

�D Np

m exp�

12

α � η��

�r

�r

�(4.61a)

where �r , (r D 1, 2) is a basis of two-component spin eigenstates, and N is anormalization constant taken as N D 1 because we adopt normalizations suchthat there are 2E particles per volume. (See Sect. 6.5.2 for detailed discussion.)Comparing with Eq. (4.41), we see the plane-wave solution ψ(x ) � u(p)e�i p x isnothing but a Lorentz-boosted spinor from the rest frame. For spin functions, it isoften convenient to use the helicity eigenstate �˙ instead of �r . In that case, p � σand p � σ acting on them are pure c-numbers.

For the solution with p 0 D �E < 0, we use e i p x D e i(E t�p�x) for a plane-waveexpression instead of e�i p xjp 0D�E D e i(ECp�x ) and invert the momentum togetherto represent a particle propagating in the same direction as that of positive energy.Replacing (p 0 ˙ σ � p )! �(E ˙ σ � p ) in Eq. (4.53), the solution becomes

vr (p ) ��

φL

φR

�E<0

D

� pp � ση r

�p

p � ση r

�, η D i σ2� � D

�0 1�1 0

�� � (4.61b)

where η, a flipped spin state, was chosen instead of � for reasons which will laterbecome clear (see Sect. 4.3.4). Sometimes, a somewhat different expression foru r(p ) and vs(p ) may be useful. Going back to Eq. (4.60) and using

exp��

12

σ � η�D cosh

η2� sinh

η2

σ � Oη D coshη2

1� tanh

η2

σ � Op�

coshη2D

rE C m

2msinh

η2D

rE � m

2m

tanhη2D

sE � mE C m

Dp

E C m

(4.62)

Page 99: Yorikiyo Nagashima - Archive

74 4 Dirac Equation

the plane-wave solutions are expressed as

u r(p ) D�p

p � σ�rpp � σ�r

�D

rE C m

2

24

1� σ�pECm

��r

1C σ�pECm

��r

35 (4.63a)

vr(p ) D� p

p � ση r

�p

p � ση r

�D

rE C m

2

24

1 � σ�pECm

�η r

1C σ�pECm

�η r

35 (4.63b)

These solutions satisfy the following relations:

ur (p )u s(p ) D 2mδ r s , u†r (p )u s(p) D 2E δ r s (4.64a)

v r (p )vs(p ) D �2mδ r s , v†r (p )vs(p ) D 2E δ r s (4.64b)

ur (p )vs(p ) D v r (p )u s(p ) D 0 (4.64c)

u†r (p )vs(�p) D v†

r (�p)u s(p) D 0 (4.64d)

Note: u†r (p )vs(p ) ¤ 0 , v†

r (p )u s(p ) ¤ 0 (4.64e)

Problem 4.6

Prove Eq. (4.64).

If we take the z-axis in the direction of the momentum, �C D�

10

�, �� D

�01

�,

all the plane-wave solutions can be written down concisely.

Problem 4.7

Using c-number p μ, prove

( 6p � m)u(p) D 0 , u(p)( 6p � m) D 0 (4.65a)

( 6p C m)v (p) D 0 , v (p)( 6p C m) D 0 (4.65b)

6p � γ μ pμ (4.65c)

The notation p/ is referred as the Feynman slash.

Problem 4.8

Denoting the a-th component of u r (p ) as fu r(p )ga , show

[ΛC]ab �1

2m

XrD1,2

fu r(p )gafur (p )gb D�6p C m

2m

�ab

[Λ�]ab �1

2m

XrD1,2

fvr (p )gafv r (p )gb D�6p � m

2m

�ab

(4.66)

Page 100: Yorikiyo Nagashima - Archive

4.3 Properties of the Dirac Particle 75

are projection operators for picking up positive and negative energy solutions re-spectively. Namely

ΛCu r(p ) D u r(p ) , ΛCvr (p ) D 0 , Λ�u r(p ) D 0 ,

Λ�vr (p ) D vr (p)(4.67)

The properties of projection operators are

ΛC C Λ� D 1 , Λ2C D ΛC , Λ2

� D Λ� , Λ˙Λ� D 0 (4.68)

Equations (4.66) meanXrD1,2

u r(p )ur (p ) D6p C m ,X

rD1,2

vr(p )v r (p ) D6p � m (4.69)

which will appear frequently in the calculation of the transition amplitude.

4.3Properties of the Dirac Particle

4.3.1Magnetic Moment of the Electron

To understand the nonrelativistic properties of the Dirac particle, let us derive alow-energy approximation of the Dirac equation. We apply it to the electron, which

has electric charge q D �e. Setting ψ D�

u A

u B

�, first we write down the equation

in terms of two-component wave functions. We also make the substitution pμ !

pμ C eA μ to include the electromagnetic interaction:

fγ μ(i@μ C eA μ) � mgψ(x ) D f(i@0 C eφ)γ 0 � γ � (�ir C eA) � mgψ(x )

D

��m i@0 C eφ � σ � (�ir C eA)

i@0 C eφ C σ � (�ir C eA) �m

� �u A

u B

�D 0

(4.70a)

This is a set of two equations for the two-component spinors uA, uB.

fi@0 C eφ � σ � (�ir C eA)guB D muA (4.70b)

fi@0 C eφ C σ � (�ir C eA)guA D muB (4.70c)

In the high-energy limit, where the mass can be neglected, equations for uA anduB are disconnected and represent two independent particles (uA � φL, uB � φR).But in the low-energy limit, the difference disappears and uA uB. For this reason

Page 101: Yorikiyo Nagashima - Archive

76 4 Dirac Equation

we introduce8)

uC D1p

2(uA C uB) u� D

1p

2(�uA C uB) (4.71)

with the expectation u� ! 0 in the nonrelativistic limit. Then the Dirac equationreads

(i@0 C eφ)uC � σ � (�ir C eA)u� D muC (4.72a)

(i@0 C eφ)u� � σ � (�ir C eA)uC D �mu� (4.72b)

Up to now the equation is exact. At this point we choose the p 0 > 0 solution andset

uC D ' e�i(mCT )t u� D �e�i(mCT )t (4.73)

Then we can make the substitution i@0 ! m C T , where T now represents thekinetic energy. Substituting Eq. (4.73) in Eq. (4.72) and applying the nonrelativisticapproximation T, eφ � m, Eq. (4.72b) becomes

� '1

2mσ � (�ir C eA)' (4.74)

Substituting this expression in Eq. (4.72a), we obtain

T' D

��eφ C

12m

σ � (�ir C eA)σ � (�ir C eA)�

' (4.75)

Then we make use of the relations

(σ � C)(σ � D) D C � D C i σ � (C � D)

(�ir C eA) � (�ir C eA)' D �i e(r � AC A � r)'

D �i e(hhr � Aii' C hhr'ii � AC hhA � r'ii) D �i eB'

(4.76)

Here hh� � � iimeans that the action of the nabla does not go beyond the double brack-et and B is the magnetic field. After inserting Eq. (4.76) in Eq. (4.75) and rearrang-ing, we obtain

T' D

��

12m

(r C i eA)2 � eφ Ce

2mσ � B

�' (4.77)

which is the Schrödinger equation including the electromagnetic potential plus anextra magnetic interaction term �μ � B. This means that the Dirac particle has amagnetic moment

μ D �e

2mσ D g

q2m

s (g D 2) (4.78)

8) This is equivalent to adopting the Pauli–Dirac representation in Eq. (4.19).

Page 102: Yorikiyo Nagashima - Archive

4.3 Properties of the Dirac Particle 77

Using the general expression for the magnetic moment μ D g(q/2m)s, this meansthat the Dirac particle has a g-factor of 2. This solved the then mysterious value ofg D 2 for the spin magnetic moment of the electron. This fact again defies theclassical notion of a spin being the angular momentum of a self-rotating object.From classical electromagnetism, a circulating electric charge should have g D 1.

The classical magnetic moment of a circulating charge can be calculated asfollows. The Maxwell equation gives the value of the magnetic moment μ DI S to a closed loop current I with inner area S. Assume the current is madeby a particle with charge q circulating at radius r. Taking its velocity v andcirculating period T, or the frequency ν D 1/T ,

μ D I S D qνπr2 D qv

2πrπr2 D

q2m

r mv Dq

2mL

(L D angular momentum)(4.79)

Deviation of the value g from 2 is called an anomalous magnetic moment. Forinstance, the proton and the neutron have magnetic moments of 2.79 and �1.91(in units of the nuclear magneton q/2mN) and their deviation from the standardvalue is large, which exposes the nonelementary nature of the particles. Even for theDirac particle g � 2 could be nonzero if we take into account higher order effects(radiative corrections) of the quantum electromagnetic fields (see arguments inSect. 8.2.2).

4.3.2Parity

In Sect. 4.1.2, we saw that φR and φL are exchanged by parity operation and intro-duced the matrix to do the action. Here, we treat the parity operation of the four-component Dirac equation

(γ μ i@μ � m)ψ(x ) D 0 (4.80)

By parity operation, the Dirac equation is transformed to

(γ μ i@μ0 � m)ψ0(x 0) D 0 (4.81a)

x μ D (x0, x )! x μ 0D (x0,�x) D xμ (4.81b)

ψ(x )! ψ0(x 0) D S ψ(x ) (4.81c)

So if there is a matrix S that satisfies

S�1γ 0S D γ 0 , S�1γ k S D �γ k (k D 1 to 3) (4.82)

Page 103: Yorikiyo Nagashima - Archive

78 4 Dirac Equation

We can claim that the Dirac equation is parity reflection invariant. Obviously

S D ηPγ 0 (4.83)

satisfies the condition as expected. Here, ηP is a phase factor with the absolutevalue of 1. Applying the parity operation to the plane-wave solutions (4.61a) and(4.61b), we obtain

Su r(�p ) D ηPu r(p ) (4.84a)

S vr(�p ) D �ηPvr (p) (4.84b)

which means that a particle and its antiparticle have relatively negative parity. Usu-ally ηP D 1 is assumed and hence the particle has positive parity and the antiparti-cle negative.

4.3.3Bilinear Form of the Dirac Spinor

A hermitian bilinear form made of four-component Dirac spinors T D ψΓ ψ has16 independent components. We have seen V μ � ψγ μ ψ behaves as a contravari-ant vector under proper Lorentz transformation. Under parity operation,

V 00D ψS�1γ 0S ψ D ψγ 0γ 0γ 0ψ D ψγ 0ψ D V 0 (4.85a)

V 0 D ψγ 0γ γ 0ψ D �ψγ ψ D �V (4.85b)

Therefore, V μ is a polar vector. Introducing

γμ D gμν γ ν , σμν D gμ�gνσ σ�σ (4.86)

Equation (4.85) can be expressed as

V μ P! V μ 0

D Vμ (4.87)

All the 16 independent bilinear forms are classified according to their Lorentztransformation property and are summarized in Table 4.1.

Problem 4.9

Confirm the parity transformation property of each item in Table 4.1.

Problem 4.10

Using (4.63), prove that the matrix elements hp 0jΓ jp i D u(p 0)Γ u(p ) in the non-relativistic limit are given to the order O(p/m) as shown in Table 4.2.

Page 104: Yorikiyo Nagashima - Archive

4.3 Properties of the Dirac Particle 79

Table 4.1 Sixteen independent bilinear forms ψΓ ψ.

Name Bilinear form # a After parity transf. Parity

Scalar S ψψ 1 ψψ C

Pseudoscalar P iψ γ5ψ b 1 �iψ γ5ψ �

Vector V ψγ μ ψ 4 ψγμ ψ �

Axial vector A ψγ5γ μ ψ 4 �ψγ5γμ ψ C

Tensor T ψσμν ψ 6 ψσμν ψ C

a # is the number of independent components.b γ5 D iγ0 γ1 γ2γ3. See Eq. (4.106). i is multiplied to make the form hermitian.

Table 4.2 Nonrelativistic limit of the γ matrices for plane wave solution.

Γ hp 0jΓ jpi/2 m

1 1

γ5 �[σ � (p 0 � p )]/(2m)

γ0 1γ (p C p 0)/(2m)C [iσ � (p 0 � p )]/(2m)

γ5γ0 �[σ � (p 0 C p )]/(2m)γ5γ �σσ i j D i

2 [γ i , γ j ] �i j k σka

σk0 D i2 [γ k , γ0] if(p 0 � p )/(2m)C [iσ � (p 0 C p )]/(2m)g

a i j k is a cyclic of 123.

Using the fact that in the nonrelativistic limit p mv , the above quantities can beexpressed as

h1i � 1 , hγ 0i � 1 , hγ μi � (1, v) , hγ μ γ 5i � (σ � v , σ) (4.88)

These expressions will help in understanding the physical interpretation or dy-namical properties of hΓ i.

4.3.4Charge Conjugation

Charge conjugation is the operation of exchanging a particle and its antiparticle.The fact that the signature of an antiparticle appears as the opposite sign in thephase of the wave suggests a connection with complex conjugate operation. Thedifference between the particle and its antiparticle manifests itself in the interac-tion with electromagnetic fields. Interaction with the electromagnetic field can be

Page 105: Yorikiyo Nagashima - Archive

80 4 Dirac Equation

taken into account by the substitution

p μ ! p μ � qAμ (4.89)

and that of the antiparticle by replacing q by �q. So if we can transform the fieldwhich satisfies the same equation of motion but with opposite charge, the opera-tion is the charge conjugation. Let us start with the simplest example, the Klein–Gordon equation. Substituting Eq. (4.89) in the Klein–Gordon equation,�

(@μ C i qA μ)(@μ C i qAμ)C m2 φ D 0 (4.90)

Taking the hermitian conjugate of the above equation we have�(@μ � i qA μ)(@μ � i qAμ)C m2 φ† D 0 (4.91)

Comparing Eq. (4.90) with Eq. (4.91), we see that if φ describes a particle, then φ†

describes its antiparticle, thus confirming our original guess. It also means that ifφ is a real field, it cannot have antiparticle or it is its own antiparticle. It followsthat a real field describes a neutral particle. The converse is not true. If a particlehas a quantum number other than charge that differentiates the particle from itsantiparticle, for example strangeness, the field is complex.

For the case of the Dirac particle, it is a little more complicated since mixing ofthe components could result from the charge conjugation operation. Let us startfrom the four-component Dirac equation including the electromagnetic interac-tion: �

γ μ(i@μ � qA μ)� m

ψ D 0 (4.92)

If there exists ψ c that satisfies the equation�γ μ(i@μ C qA μ) � m

ψ c D 0 (4.93)

then ψ c is the charge conjugate of ψ. Taking the hermitian conjugate of Eq. (4.92),we obtain

ψ†[γ μ†(�i �@μ � qA μ) � m] D 0 (4.94)

Multiplying by γ 0 from the right, and using the relation γ μ † D γ 0γ μ γ 0, we obtain

ψ[�γ μ(i �@μ C qA μ)� m] D 0 (4.95)

Now, taking the transpose of the above equation

[�γ μT(i@μ C qA μ) � m]ψTD 0 (4.96)

and multiplying by C from the left, we obtain

[�C γ μTC�1(i �@μ C qA μ) � m]C ψT

D 0 (4.97)

Page 106: Yorikiyo Nagashima - Archive

4.3 Properties of the Dirac Particle 81

So, if there is a matrix C that satisfies the relation

C γ μTC�1 D �γ μ (4.98)

we can define a charge conjugate operation and

ψ c D C ψT (4.99)

satisfies the equation (4.93) for the antiparticle.Noting

γ μT D γ μ(μ D 0, 2) , γ μT D �γ μ(μ D 1, 3) (4.100)

we can adopt

C D i γ 2γ 0 D

�i σ2 00 �i σ2

�(4.101)

i σ2 flips the spin orientation. Therefore, the charge conjugation operation includesspin flipping, as conjectured in Sect. 4.1.3. This is why we adopted the flipped-spin solution for the negative energy wave function (4.61b). This C matrix has theproperties

C D C� D �C�1 D �CT D �C† (4.102)

The charge-conjugated Dirac wave function is given by

ψ c D C ψTD i γ 2γ 0γ 0T

ψ� D i γ 2ψ� (4.103a)

ψ cD ψ c† γ 0 D (C ψT)† γ 0 D ψ�C† γ 0 D ψT γ 0�

C†γ 0 D �ψT C�1

(4.103b)

Naturally, an electromagnetic current should change its sign under the charge-con-jugation operation. In quantum mechanics, it is a definition and the sign has to bechanged by hand, but in quantum field theory it comes automatically.

Proof:

j μ cD qψ c γ μ ψ c D �qψTC�1γ μ C ψT

D qψTγ μT ψT

D �q(ψγ μ ψ)T D �qψγ μ ψ D � j μ(4.104)

The minus sign in the first equation of the second line comes from the anticommu-tative nature of the fermion field, which we will show in Chap. 5. We can removethe transpose in the penultimate equation because the quantity in the parenthesesis a 1 � 1 matrix. �

Page 107: Yorikiyo Nagashima - Archive

82 4 Dirac Equation

Problem 4.11

Prove (u r)c D C uTr D vr , (vr )c D C vT

r D u r (4.105)

4.3.5Chiral Eigenstates

We define a chirality operator γ 5 by

γ 5 D i γ 0γ 1γ 2γ 3 D �i α1α2α3 D

��1 00 1

�(4.106)

which satisfies the relation

(γ 5)2 D 1 , γ 5γ μ C γ μ γ 5 D 0 (4.107)

We also define chiral eigenstates, which play an important role in the weak inter-action, by

ψL(x ) D12

(1� γ 5)ψ D�

1 00 0

��φL

φR

�D

�φL

0

�(4.108a)

ψR(x ) D12

(1C γ 5)ψ D�

0 00 1

��φL

φR

�D

�0

φR

�(4.108b)

It is easily seen that ψL and ψR are eigenstates of the chirality operator γ 5

γ 5ψL D �ψL , γ 5ψR D CψR (4.109)

In the Weyl representation, the two chiral eigenstates are expressed in terms of thetwo-component spinors φL and φR separately. In other representations, they arenot necessarily separated, but the following arguments are equally valid. Multiply-ing Eq. (4.34) by γ 5 and using Eq. (4.107) we can obtain an equation for γ 5ψ. Thenadding and subtracting it from Eq. (4.34), we obtain

γ μ i@μ ψL D mψR (4.110a)

γ μ i@μ ψR D mψL (4.110b)

which reproduces the four-component version of Eq. (4.53). Since γ 5 anticom-mutes with all the γ μ ’s (μ D 0 to 3), it commutes with the Lorentz operator (4.41).Therefore the chiral eigenstate is invariant under the proper Lorentz transforma-tion. Under the parity operation

ψ(x )P�! ψ0(x 0) D ψ0(x0,�x ) D γ 0ψ(x ) (4.111)

Since γ 0 anticommutes with γ 5, ψL is converted to ψR and vice versa. In the Weylrepresentation, φL and φR are the two chiral eigenstates.

Page 108: Yorikiyo Nagashima - Archive

4.3 Properties of the Dirac Particle 83

Let us take a look at the spin properties of the plane-wave solution and how theyare related to the Weyl solutions. We notice that in the high-energy or small-masslimit p ! E

limp!E

p � σ D 2E�

1� σ � Op2

�, lim

p!Ep � σ D 2E

�1C σ � Op

2

�(4.112)

They are just projection operators to pick up the helicity � state. This means φL

and φR approach the Weyl solution in the high-energy limit (p ! E or m ! 0).For p ¤ E ,

pp � σ,

pp � σ preferentially pick up helicity� components. To decide

the helicity expectation value of φL, φR, apply them to an unpolarized helicity state,i.e. �0 D �C C ��:

φL Dp

p � σ�0 Dp

E C p �� Cp

E � p �C � (a�� C b�C) (4.113)

Therefore

φL D (p

E C p �� Cp

E � p �C)

p!E���!

p2p

"�� C

�m2p

��C C O

�mp

�2#

φR D (p

E C p �C Cp

E � p ��)

p!E���!

p2p

"�C C

�m2p

��� C O

�mp

�2#

(4.114)

We note that φL is primarily a negative helicity state, i.e. left-handed, but containsa small � O(m/p ) component of the opposite helicity state. The expectation valueof the helicity is given by

hh(φL)i D(φ†

L σ � Op φL)

(φ†L φL)

D�jaj2 C jbj2

jaj2 C jbj2D �

pED �v (4.115)

Similarly, we obtain hh(φR)i D Cv . For antiparticles, whose wave function is rep-resented by v (p ), the helicity is inverted. The chirality is a Lorentz-extended versionof the helicity operator, with its handedness inverted for antiparticles. Knowledgeof the chiral characteristics comes in very handy in understanding weak-interac-tion phenomenology. Note, however, the chiral states are not the eigenstates of theequation of motion, because γ 5 does not commute with the Hamiltonian. Helicitystates are the Hamiltonian eigenstates, but they change their state by interactions.We have also seen in Eq. (4.110) that the two chiral states mix when the mass termexists in the Hamiltonian. But the chirality is Lorentz invariant and is conservedin vector interactions, which include the strong, weak and electromagnetic interac-tions.

Page 109: Yorikiyo Nagashima - Archive

84 4 Dirac Equation

Chirality Conservation in the Vector InteractionWe prove that chirality is conserved in vector-type interactions before and afterinteraction. The photon, for instance, couples to currents

ψL(R)γμ ψR(L) D ψ†

�1� γ 5

2

�†

γ 0γ μ�

1˙ γ 5

2

�ψ

D ψ† γ 0γ μ�

1� γ 5

2

��1˙ γ 5

2

�ψ D 0

ψγ μ ψL(R) D ψL(R)γμ ψ D ψL(R) γ

μ ψL(R) ¤ 0 (4.116)

In the high-energy limit, where the mass can be neglected, chirality is equivalentto helicity except for the sign, hence helicity is conserved, too.

Note that an interaction that respects the gauge principle is obtained by the re-placement @μ ! @μ C i qA μ. Namely, the interaction Lagrangian has the form�q j μA μ D �qψγ μ ψA μ . It is good to remember that the fundamental interac-tion in the Standard Model is of vector type and preserves chirality.

Comment 4.1 In high-energy experiments we are mainly interested in an energy rangefrom 1 GeV to 1 TeV D 1000 GeV. The mass of the u, d, s quarks and that of the lep-tons (except the τ) are less than � 100 MeV (see Table 2.2). They can, for all practicalpurposes, be regarded as massless particles. For them the chiral eigenstates are essential-ly equivalent to the helicity eigenstates (flipped for their antiparticles) and hence areeigenstates of the Hamiltonian, too. c, b quarks and the τ lepton have mass in the fewgigaelectronvolts range, and in many cases they can be treated similarly. Only for the topquark, whose mass is � 171 GeV, is the effect of the mass sizable. This is why we usethe Weyl representation, which diagonalizes the chiral operator γ 5, instead of the Diracrepresentation, which diagonalizes the mass term . For historical reasons, the latter hasbeen commonly used in most textbooks. It is convenient only in treating phenomena ofrelativistic origin in the nonrelativistic energy range.

4.4Majorana Particle

Quantum numbers that differentiate particles from antiparticles include electriccharge, isospin, strangeness and lepton/baryon number.

Examples eC ¤ e� (electric charge, lepton number)n ¤ n (magnetic moment, baryon number)

K0 ¤ K0

(strangeness)

π0 and the photon do not have such quantum numbers, and thus they cannot bedistinguished from their antiparticles. Now, consider neutrinos. They belong to thelepton family and are considered to have lepton numbers just as charged leptonsdo. The lepton numbers 1 and�1 can be assigned to the neutrino and antineutrino,

Page 110: Yorikiyo Nagashima - Archive

4.4 Majorana Particle 85

respectively. Experimental observations seem to respect lepton number conserva-tion in neutrino reactions. For instance, νμ and νμ are produced through the decayof pions:

πC ! μC C νμ , π� ! μ� C νμ (4.117)

and when they collide with nucleons, they produce μ� (lepton number 1) and μC(lepton number �1)

νμ C n ! μ� C p , νμ C p ! μC C n (4.118)

in such a way as to conserve the lepton number. A νμ does not produce a μC ifit collides with a proton. However, the weak interaction acts only on particles withnegative chirality.9) If the particle’s mass is zero, it is a Weyl particle and is in apure helicity (�) state, while the antiparticle is in pure helicity (C) state. What wasconsidered as lepton number conservation may have been nothing but a test forhelicity conservation, although experimentally there is no evidence against leptonnumber conservation as far as the neutrino is concerned.

Let us assume that the neutrino is a Majorana particle, a name given to a fermionthat does not have its own antiparticle. Conditions for the Majorana particle are

1. It satisfies the Dirac equation.(γ μ i@μ � m)N(x ) D 0

2. The particle and its antiparticle are identical.N c D N

(4.119)

In the Weyl representation, we can put ψL D

��0

�and ψR D

�0η

�. Then define N1

and N2 as follows:

N1 D ψL C (ψL)c D

��

�i σ2� �

�(4.120a)

N2 D ψR C (ψR)c D

�i σ2η�

η

�(4.120b)

They satisfy N c1 D N1, N c

2 D N2. Applying the Dirac equation for N1

�γ μ i@μ � m

N1 D

��m i@0 C i σ � r

i@0 � i σ � r �m

���

�i σ2� �

�D 0 (4.121)

and rewriting the equation for two-component spinors, we obtain

(@0 � σ � r)� D �m(σ2� �)

(@0 C σ � r)(σ2� �) D m�(4.122)

9) This is true only for charged current interactions, where W ˙ is exchanged. The neutral currentinteractions act on both chiralities.

Page 111: Yorikiyo Nagashima - Archive

86 4 Dirac Equation

The two equations are not independent. The second equation can be derived fromthe first. It is indeed an equation for a two-component spinor. N1 and N2 have onlytwo independent components despite their seemingly four-component structure.Similarly we can obtain the equation for N2:

(@0 C σ � r)η D m(σ2η�)

(@0 � σ � r)(σ2η�) D �mη(4.123)

From the above argument, we see � and η satisfy two independent equations. Inthe limit of m ! 0, the equations become identical to the Weyl equation (4.9).If m ¤ 0, the deviation from the Weyl solution is small for relativistic particles(m � E ). � is in a chirality minus state and is predominantly left-handed. Similar-ly, η is predominantly right-handed. Therefore, we call � (η) a left (right)-handedMajorana particle. Since they satisfy independent equations, their mass need notbe the same in general (mL ¤ mR).

Note, the Majorana particle is a quantum mechanical concept and there is noequivalent in classical mechanics. If we consider � or η as expressed by pureclassical numbers, it is straightforward to show that Equation (4.122) has nosolution. This can be seen by inserting � T D (a, b) and p D 0 and solvingthe equation explicitly for a and b. Another way to see this is to note that theLagrangian to derive the mass term should contain a term like � † i σ2� � Di � �

i fσ2gi j � �j . However, σ2 is antisymmetric and it would vanish if � were an

ordinary c-number field. In quantum field theory, � is an anticommuting fieldoperator and the contradiction does not appear.

Let us see the effect of charge conjugation on the left- and right-handed Majorananeutrinos. After decomposing the four-component spinors N1 and N2 into NL andNR using the projection operators (1� γ 5)/2

NL D12

(1� γ 5)N , NR D12

(1 � γ 5)N (4.124)

we apply the charge conjugation operation to them:

(NL)c D C NTL D i γ 2γ 0N

TL D i γ 2 1

2(1� γ 5)N�

D12

(1C γ 5)i γ 2γ 0NTD

12

(1C γ 5)N c D NR

) (NL)c D (N c)R D NR (4.125a)

(NR)c D (N c)L D NL (4.125b)

The Majorana particle changes its handedness by the charge conjugation operation.No distinction exists between a particle and its antiparticle. The handedness can be

Page 112: Yorikiyo Nagashima - Archive

4.4 Majorana Particle 87

Figure 4.6 Three possibilities for the neutrino:(a) four-component Dirac neutrino m ¤ 0,(b) two-component Weyl neutrino (m D 0)and (c) two-component Majorana neutrino

(m ¤ 0). P and C are the parity and charge-conjugation operations, respectively. L is aLorentz boost. The neutrino states transformto each other by the designated operation.

exchanged either by the parity or the charge conjugation operation. From now on,we adopt the convention that ψ c

L � (ψ c)L, to denote left handed antiparticle for theDirac particle and left handed particle for the Majorana particle.

Problem 4.12

Since i σ2� � (�i σ2η�) satisfies the same equation as η (� ), it is a right (left)-handed particle. Show that if mL D mR then the chirality L(R) component ofψ D (N1 C i N2)/

p2 satisfies Eq. (4.51).

Therefore the Dirac particle is equivalent to two Majorana particles having thesame mass. Actually, having the same mass guarantees the Lagrangian to havean extra symmetry, namely invariance under phase transformation (ψ ! ψ 0 De�i α ψ, ψ† ! ψ†0

D e i α ψ†) and to have a conserved particle number accordingto Noether’s theorem (see Sect. 5.2).

We summarize relative relations of the Dirac, Majorana and Weyl particles inFigure 4.6.

How Can We Tell the Difference Between the Dirac and Majorana Neutrinos?If the neutrino is a Majorana particle, lepton number is not conserved. The sim-plest reaction that can test it irrespective of the helicity state is a neutrinoless (0ν)double beta decay, in which a nucleus emits two electrons with no neutrinos:

0ν decay: (A, Z )! (A, Z C 2)C e� C e� (4.126)

A double beta decay that emits two neutrinos (2ν) (to conserve the lepton number)appears as a higher order weak interaction and has been observed:

2ν decay: (A, Z )! (A, Z C 2)C e� C e� C ν e C ν e

Another clue to distinguish the Majorana particle from the Dirac particle is thefact that the former cannot have a magnetic moment. This is because by charge

Page 113: Yorikiyo Nagashima - Archive

88 4 Dirac Equation

conjugation the magnetic moment changes its sign but the particle itself remainsas it is. However, if there is more than one kind of Majorana particle they cantransform into each other through magnetic interaction. This feature is sometimesapplied to solar or cosmic magnetic field phenomena to test the neutrino property.A charged Dirac particle can have a magnetic moment, as already described forthe case of the electron. A neutral Dirac particle can also have a magnetic momentthrough high-order intermediate states and also an electric dipole moment if timereversal invariance is violated.10) According to the Standard Model calculation, themagnetic moment of the Dirac neutrino is [151]

μν �3eGF

8p

2π2mν � 3 � 10�19

1 eV

�μBohr , μBohr D

e2me

(4.127)

which is far below the observable limit with the present technology. However, if anew interaction exists which induces a large magnetic moment, it may be exposedby improved experiments.

10) We will defer detailed discussion of time reversal to Chap. 9.

Page 114: Yorikiyo Nagashima - Archive

89

5Field Quantization

Before we quantize fields, we need to understand the Lagrangian formalism andderivation of invariants when the Lagrangian respects a certain symmetry. Impor-tant invariants are the energy-momentum and the electric charge, whose underly-ing symmetries are translational invariance in space-time and gauge invariance.

5.1Action Principle

5.1.1Equations of Motion

In solving a particle motion in classical mechanics, we usually start from Newton’sequation of motion, where the path of the particle motion is traced by successivelyconnecting infinitesimal changes at each point. There is another approach basedon the action principle, where all the possible paths between two points are inves-tigated and the one with the minimum action is selected. The two approaches areequivalent in classical mechanics. In quantum mechanics, however, because of thewave nature of a particle, the path chosen by the least action is the most proba-ble, but other paths are possible, too. The deviation from the most probable pathis very small, of the order determined by Planck’s constant, but it is quite sizablein the microcosm extending from molecules and atoms down to elementary par-ticles, i.e. in the world where quantum mechanics applies. A good example is thedouble slit interference phenomenon of an electron beam. Therefore, in our studyof elementary particles, a logical choice is to start from the action principle.

In the action principle, a particle’s path from a point q1 D q(t1) to q2 D q(t2) issuch that it minimizes the “action S” defined by an integral of the Lagrangian L

S DZ t2

t1

L(q, Pq)d t (5.1a)

L D K � V D12

m Pq2 � V(q) (5.1b)

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 115: Yorikiyo Nagashima - Archive

90 5 Field Quantization

where K is the kinetic energy of the particle and V is some potential in which itmoves. The action is a functional of both the position q and the velocity Pq of theparticle, which, in turn, are functions of time t. Fixing the positions q1 and q2,we change the path q(t) by an infinitesimal amount δq D δq(t), then the smallchange in S (variation δS ) should vanish if the action is at the minimum:

δS DZ t2

t1

d tδL DZ t2

t1

d t�

δLδq

δq CδLδ Pq

δ Pq�

D

�δLδ Pq

δq�t2

t1

Z t2

t1

d t�

dd t

�δLδ Pq

��

δLδq

�δq D 0

(5.2)

We have used δ Pq D δ(dq/d t) D (dδq/d t). Since δq(t1) D δq(t2) D 0 and δq isarbitrary

dd t

�δLδ Pq

��

δLδqD 0 (5.3)

which is called the Euler–Lagrange equation. One advantage of the Lagrangian for-malism is that the equation holds for a general coordinate system. Applying it to aharmonic oscillator where V(q) D mω2q2/2, the Lagrangian becomes

L D12

m Pq2 �12

mω2q2 (5.4)

The Euler–Lagrange equation gives

Rq D �ω2q (5.5)

5.1.2Hamiltonian Formalism

The Hamiltonian formalism is another way to derive equations of motion equiva-lent to the action principle and is particularly useful in deriving quantum mechan-ical equations of motion. In mechanics of a point particle, the Lagrangian uses qand Pq, while the Hamiltonian formalism chooses to use q and its canonical conju-gate momentum defined by

p D@L@ Pq

(5.6)

as two independent variables. q and p are called canonical variables. Then

δL D@L@q

δq C@L@ Pq

δ Pq D@L@q

δq C p δ Pq D@L@q

δq C δ(p Pq) � Pqδ p (5.7)

If we define the Hamiltonian by

H � p Pq � L (5.8)

Page 116: Yorikiyo Nagashima - Archive

5.1 Action Principle 91

then

δH D δ(p Pq � L) D �@L@q

δq C Pqδ p D � Pp δq C Pqδ p (5.9)

We used the Euler–Lagrange equation in deriving the last equation. From the aboverelation, it follows that

Pq D@H@p

, Pp D �@H@q

(5.10)

Equations (5.9) and (5.10) mean the Hamiltonian is a constant of motion if all thetime dependences are in p and q, namely if it does not have explicit time depen-dence. For a harmonic oscillator

H Dp 2

2mC

12

mω2q2 (5.11)

Pq Dpm

, Pp D �mω2q ) Rq D �ω2q (5.12)

Eq. (5.12) reproduces the equation of motion for the harmonic oscillator Eq. (5.4).

5.1.3Equation of a Field

Let us extend the action to a system of N 3 particles oscillating independently thatare distributed uniformly in a large but finite volume V D L3, each oscillator occu-pying a volume ΔV D (L/N )3. This is expressed as

S DZ

d tN3X

rD1

L(qr , Pqr) DZ

d tN3Xr

�12

mr Pq2r �

12

mr ω2r q2

r

�(5.13)

Taking N 3 independent variations for qr , we obtain N 3 Euler–Lagrange equations:

dd t

�δLδ Pqr

��

δLδqrD 0 (5.14)

In the continuous limit where the spacing L/N goes to zero, the collection of os-cillators is called a “field”. Instead of numbering by r, we may use the space co-ordinate x D (x , y , z) to number the oscillators. The nomenclature “field” is bydefinition some quantity defined at all space positions. In this book we restrictits usage to mean an infinite collection of oscillators distributed over all space. Tomake a connection between the field and discrete oscillators, the field φ(t x ) maybe represented by the average value over ΔV at r-th point, and associate qr withφ(x ) scaled with the volume ΔV :

many-particle system ! fieldqr ! φ(x)numbering by r ! numbering by xPN3

rD1 ΔV !R

d3x

(5.15)

Page 117: Yorikiyo Nagashima - Archive

92 5 Field Quantization

φ(x ) need not necessarily be space oscillators. It may represent any oscillatingphysical object, such as the electromagnetic field. Therefore the action is repre-sented as

S DZ

d t�Z

d3xL�φ(x ), Pφ(x )

���

Zd4xL(φ(x ), @t φ(x )) (5.16)

In the last equation x is used to denote four space-time coordinates instead of xas φ is a function of time, too. L is called Lagrangian density. In field theory, theaction appears as a 4-integral of the Lagrangian density, S D

Rd tL D

Rd4xL; we

may call it simply the Lagrangian where there is no confusion. The field is an infi-nite number of some physical objects oscillating independently at position x . If wewant a field whose oscillation propagates in space, namely a wave, each oscillatorhas to have some connections with its neighbors. Remember the mechanical mod-el of the field considered in Sect. 2.2.5. Receiving force from or exerting force onthe neighboring oscillators was the cause of wave propagation. In the continuouslimit, that means the oscillator is a function of derivatives of the space coordinateand Eq. (5.16) should include a space differential:

L(φ(x ), Pφ(x )) �! L(φ(x ), @μ φ(x )) @μ D (@0, r) (5.17)

where @t is replaced by @0 D (1/c)(@/@t) to make the dimension of the four-differ-ential equal. Taking an infinitesimal variation of the action

δS DZ

d4x�

δL

δφδφ C

δL

δ(@μ φ)δ(@μ φ)

�(5.18)

and partial integration of the second term gives surface integrals, including δφ(x ).In a mechanical system with finite degrees of freedom, the boundary conditionwas δqr(t1) D δqr(t2) D 0. For the field we require that δφ(x ) D 0 also at someboundary point, which is usually chosen at infinity. Then the surface integral van-ishes and

δS D �Z

d4x

"@μ

(δL

δ�@μ φ

�)�

δL

δφ

#δφ(x ) D 0

) @μδL

δ(@μ φ)�

δL

δφD 0

(5.19)

This is the Euler–Lagrange equation for the field. An obvious advantage of the La-grangian formalism is that the Lagrangian density is a Lorentz scalar and easy tohandle. The symmetry structure is also easy to see. The Lagrangian density is cho-sen to give known equations of the field or sometimes given a priori from somesymmetry principle. Those to produce the real or complex Klein–Gordon equa-tions, the Maxwell equations of the electromagnetic field, and the Dirac field aregiven by

Page 118: Yorikiyo Nagashima - Archive

5.1 Action Principle 93

LKG,r D12

(@μ φ@μ φ � m2φ2) real field (5.20a)

LKG,c D @μ'†@μ' � m2'†' complex field (5.20b)

Lem D �14

Fμν F μν D12

(E2 � B2) electromagnetic field (5.20c)

LDirac D ψ(i γ μ@μ � m)ψ Dirac field1) (5.20d)

Problem 5.1

Using the Euler–Lagrange equations, derive the equations of motion from the La-grangian Eq. (5.20).

Massive Vector FieldFor comparison, we give here the Lagrangian of a massive vector field to realizesome special features of the massless electromagnetic field. The Lagrangian of amassive vector field is given by analogy to the electromagnetic Lagrangian by

Lem D �14

f μν f μν C12

m2 Wμ W μ � g Wμ j μ (5.22a)

f μν D @μ W ν � @ν W μ (5.22b)

The equation of motion is given by

@μ f μν C m2 W ν D g j ν (5.23)

which is called the Proca equation. If j μ is a conserved current, then by taking thedivergence of Eq. (5.23), we can derive an equation

@μ W μ D 0 (5.24)

Then Eq. (5.23) is equivalent to the following two equations:�@μ@μ C m2� W ν D g j ν , @μ W μ D 0 (5.25)

Equation (5.25) shows that the number of independent degrees of freedom of theProca field is three, and when quantized, it represents a particle with spin 1. Inthe case of the electromagnetic field (or m D 0 vector field in general), unlike themassive field, the condition Eq. (5.24) does not come out automatically and has tobe imposed as a gauge condition.

1) Sometimes a symmetrized version is used:

LDirac D12

���i@μ ψγ μ ψ � mψ ψ

�C ψ

�γ μ i@μ � m

�ψ

(5.21)

It differs from Eq. (5.20d) by a surface integral.

Page 119: Yorikiyo Nagashima - Archive

94 5 Field Quantization

Hamiltonian Formalism for FieldsThe conjugate momentum field is defined just like particle momentum Eq. (5.6):

π(x ) DδL

δ Pφ(5.26)

Likewise, the Hamiltonian density for the field can be obtained using the sameformula as Eq. (5.8)

H (x ) D π(x ) Pφ(x )�L (5.27)

The Hamiltonian is a constant of motion as is the case for point particle mechanics,i.e.

dHdtD

dd t

Zd3 xH D 0 (5.28)

Proof:

PH Ddd t

(π Pφ �L) D Pπ Pφ C π Rφ ��

@L

@tC

δL

δφPφ C

δL

δ PφRφ C

δL

δ(@k φ)@k Pφ

�(5.29a)

Pπ Pφ Ddd t

�δL

δ Pφ

�Pφ D

��@k

�δL

δ(@k φ)

�C

δL

δφ

�Pφ (5.29b)

where we have used the Euler–Lagrange equation to obtain the third equatity inEq. (5.29b). Partially integrating the first term gives

Pπ Pφ D�

δL

δ(@k φ)@k Pφ C

δL

δφPφ�

(5.29c)

Because of this equality, the first term of Eq. (5.29a) is canceled by the second andfourth terms in large parentheses in Eq. (5.29a) and the second term by the thirdterm in large parentheses. Then

PH D �@L

@t(5.29d)

Therefore unless the L has an explicit dependence on t,2) the time variation of theHamiltonian density vanishes. �

The Hamiltonian formalism is convenient for quantization. But as we shall seesoon, it is the 00 component of a Lorentz tensor and hence more complicated fromthe Lorentz covariance point of view than a simple scalar Lagrangian.

2) Usually this happens when a time-dependent external force acts on the system. Remembermechanical energy conservation under the effect of a potential.

Page 120: Yorikiyo Nagashima - Archive

5.1 Action Principle 95

5.1.4Noether’s Theorem

We have seen that the energy of a field is a conserved quantity. But a question aris-es immediately: are there any other conserved quantities? There is a convenientand systematic way of deriving conserved quantities known as Noether’s theorem.It shows that there is a constant of motion for each continuous symmetry trans-formation that keeps the Lagrangian invariant. Then observed selection rules canbe rephrased in terms of symmetry requirements of the Lagrangian and serve asuseful guides for introducing new interaction terms when developing new theo-ries. Let us assume the Lagrangian is a function of the fields φ r and @μ φ r , where rrepresents some internal degree of freedom, and that it has no explicit dependenceon the coordinates x μ . This is the case when the fields do not couple with an exter-nal source, i.e. if the system is closed. Then for some infinitesimal variation of thefield φ r , the variation of the Lagrangian is given by

δL DδL

δφ rδφ r C

δL

δ(@μ φ r )δ(@μ φ r ) (5.30)

As δ(@μ φ r ) D @μ δφ r , the above equation can be rewritten as

δL D

�δL

δφ r� @μ

δL

δ(@μ φ r )

�δφ r C @μ

�δL

δ(@μ φ r)δφ r

(5.31)

The first term vanishes because of the Euler–Lagrange equation. So if the La-grangian is invariant for some continuous transformation of φ r , the second termmust vanish, which defines a conserved current J μ . This is called the Noether cur-rent and satisfies the continuity equation

J μ DδL

δ(@μ φ r )δφ r , @μ J μ D 0 (5.32)

The meaning of the conserved current is that the space integral of the 0th compo-nent does not change with time, i.e.

dQdtD 0 , Q D

Zd3x J0 (5.33)

Proof:

dQdtD

Zd3x@t J0 D �

Zd3 xr � J D �

ZS

dσ Jn D 0 (5.34)

where in going to the fourth equality we have used Gauss’s theorem to changethe volume integral to a surface integral, and the last equality follows from theboundary condition that the fields vanish at infinity. �

Page 121: Yorikiyo Nagashima - Archive

96 5 Field Quantization

Conserved ChargeAs an example, let us derive a charged current for a complex field. The Lagrangiangiven by Eq. (5.20b)

LKG D @μ'†@μ' � m2'†' (5.35)

is invariant under continuous phase transformation

' ! e�i qα', '† ! '†e i qα (5.36)

where α is a continuous variable but constant in the sense it does not depend on x.The phase transformation has nothing to do with the space-time coordinates andis an example of internal symmetry, referred to as U(1) symmetry. Then taking

δφ1 D δ' D �i qδα' , δφ2 D δ'† D i qδα'† (5.37)

the Noether current is expressed as

J μ D i q�'†@μ' � (@μ'†)'

δα (5.38)

Removing δα we have reproduced the conserved current: [Eq. (2.41)]. The con-served charge is expressed as

Q D �i qZ

d3xX

r

�π r(x )'r (x )� π†

r (x )'†r (x )

(5.39a)

D i qZ

d3x ['† P' � P'†'] (5.39b)

The first expression is a more general form, while the second one is specific to thecomplex Klein–Gordon field.

Problem 5.2

The Lagrangian density of the Dirac field is made up of combinations of ψ(x ), ψ(x )and is invariant under the gauge transformation

ψ ! e�i α ψ , ψ ! ψ e i α (5.40)

Prove that the Noether current corresponding to infinitesimal transformationsδψ D �i δαψ, δψ D i δαψ is given by

j μ D ψγ μ ψ (5.41)

Energy-Momentum TensorWe consider here translational invariance under an infinitesimal displacement ofthe coordinate

x 0 μD x μ C δx μ (5.42)

Page 122: Yorikiyo Nagashima - Archive

5.1 Action Principle 97

Then

δφ r D φ r (x μ C δx μ) � φ r (x ) D @μ φ r δx μ (5.43a)

δ(@μ φ r ) D @μ φ r (x ν C δx ν) � @μ φ r (x ) D @μ@ν φ r δx ν (5.43b)

If the Lagrangian does not depend explicitly on x μ , the variation can be expressedin two ways:

δL1 D @μLδx μ (5.44a)

δL2 DδL

δφ rδφ r C

δL

δ(@μ φ r )δ(@μ φ r ) (5.44b)

In calculating the action variation, unlike the internal transformation where thevariation of the field is evaluated at the same space-time, we have to take into ac-count the volume change in the integral, too, i.e.

δS DZ

d4x 0L(x 0) �Z

d4xL(x ) DZ

d4x δLC

Z(d4x 0 � d4x )L (5.45)

The second term can be calculated using the Jacobian of the transformation, buthere we rely on the fact that δS D 0, which means it is equal to minus the firstterm. Therefore,

δS DZ

d4x (δL2 � δL1)

D

Zd4x

�δL

δφ rδφ r C

δL

δ(@μ φ r)δ(@μ φ r )� @μLδx μ

� (5.46)

Inserting Eq. (5.43) into the first term of Eq. (5.46), the integrand becomes�δL

δφ r� @μ

δL

δ(@μ φ r )

�@ν φ r δx ν C @μ

�δL

δ(@μ φ r )@ν φ r � δμ

ν L

δx ν (5.47a)

The first term vanishes because of the Euler–Lagrange equation. Thus we haveobtained a conserving tensor called the energy-momentum tensor:

@μ T μν D 0 (5.47b)

T μν DδL

δ(@μ φ r )@ν φ r � gμνL (5.47c)

The name is given because

T 00 DδL

δ(@0φ r )@0 φ r �L (5.48)

agrees with the Hamiltonian density given in Eq. (5.27). Its space integral is aconstant of motion and reconfirms Eq. (5.28). Then, by Lorentz covariance, P i DR

d3x T 0 i �R

d3xP i are momenta of the field. They are also constants of motion.

Page 123: Yorikiyo Nagashima - Archive

98 5 Field Quantization

Problem 5.3

Using the Jacobian prove thatZ(d4x 0 � d4x )L D

Zd4x

��@μLδx μ� (5.49)

Problem 5.4

Calculate the energy momentum tensor of the scalar field whose Lagrangian isgiven by Eq. (5.20a) and show that it is symmetric with respect to suffixes μ and ν.

We can derive the angular momentum tensor by requiring the invariance underan infinitesimal Lorentz transformation given in Eq. (3.79)

δx μ D �μν x ν (5.50a)

δφ(x ) D �μν x ν@μ φ(x ) (5.50b)

Problem 5.5

(a) Show the corresponding angular momentum tensor is given by

M�, μν D T �ν x μ � T �μ x ν (5.51)

(b) Show that

@� M� μν D T μν � T νμ (5.52)

Hint: Replace the second term in Eq. (5.47a) by δx ν ! �μν x ν and use �νμ D ��μν.

Problem 5.5 shows that T μν D T νμ is a necessary condition for the angularmomentum tensor to be conserved. However, the expression for T μν in Eq. (5.47c)is not manifestly symmetric and indeed it is not always symmetric. We note that theenergy momentum tensor itself is not a measurable quantity but its space integralis. So there is the freedom to add a total derivative. We may add a term of the form

@λ f λμν , f λμν D � f μλν (5.53)

so that

@λ@μ f λμν � 0 (5.54)

We consider a new energy momentum tensor

T μν ! T μν C @λ f λμν (5.55)

Page 124: Yorikiyo Nagashima - Archive

5.1 Action Principle 99

This new energy momentum tensor is conserved and we can always choose thisextra term to make it symmetric in μ and ν.

There is another reason to have a symmetric energy momentum tensor. In gen-eral relativity, the symmetric gravitational field or curvature tensor Rμν couples tothe energy momentum tensor through Einstein’s equations:

Rμν �12

gμν R D �8πG

c2 Tμν (5.56)

The Ricci tensor Rμν and the metric tensor gμν are both symmetric and Tμν shouldbe symmetric, too.

The Complex FieldWe now apply the energy momentum tensor formula to Eq. (5.20b) and derive theHamiltonian and the momentum operators expressed in terms of the fields:

δL

δ(@μ')D @μ'† δL

δ(@μ'†)D @μ'

H D T 00 DX

r

δL

δ(@0φ r )@0φ r �L

D @0'† P' C P'†@0' � (@μ'†@μ' � m2'†')

D P'† P' Cr'† � r' C m2'†'

(5.57a)

P DX

r

T 0 i DδL

δ(@0φ r )@i φ r D �

P'†r' C r'† P'

�(5.57b)

H DZ

d3xH D

Zd3x

�P'† P' Cr'† � r' C m2'†'

(5.58a)

P DZ

d3xP D �

Zd3x

�P'†r' Cr'† P'

(5.58b)

Problem 5.6

(1) Prove that H and P are conserved quantities by explicitly differentiating them.(2) Prove that the Hamiltonian density H and the momentum density P satisfythe continuity equation, namely

@H

@tCr �P D 0 (5.59)

Problem 5.7

Using the Lagrangian of the electromagnetic field, Eq. (5.20c), show

H DZ

d3x12

�E2 C B2� (5.60a)

P DZ

d3x E � B (5.60b)

Page 125: Yorikiyo Nagashima - Archive

100 5 Field Quantization

Problem 5.8

Using the Lagrangian of the Dirac field, Eq. (5.20d), show

H DZ

d3x ψ(�i γ � r C m)ψ DZ

d3x ψ† i Pψ (5.61a)

P DZ

d3x ψ†(�ir)ψ (5.61b)

Q D qZ

d3x ψ† ψ (5.61c)

5.2Quantization Scheme

There are several ways to make the transition from classical mechanics to quantummechanics. Probably the most powerful method is that of the path integral, whichdeals with the action, taking all possible paths into account. Historically, howev-er, the Hamiltonian formalism was used first and has been extensively used sincethen. The path integral method is a powerful tool for treating the formal aspectsof the theory, and indispensable in discussing non-Abelian gauge theories, but ifone is interested in quick practice at calculating transition rates, the conventionalHamiltonian formalism is more easily accessible using the accumulated knowl-edge of quantum mechanics. Since the path integral uses mathematical approach-es that for laypeople need some acquaintance before solving realistic problems, weshall defer the study of the path integral method until Chap. 10.

5.2.1Heisenberg Equation of Motion

To derive the quantum mechanical representation of the equation of motion that issuitable for the treatment of the harmonic oscillator, we start from the Schrödingerequation

i@

@tjψ(x )i D H jψ(x )i (5.62)

Here, we have used Dirac’s notation to describe state vectors in Hilbert space onwhich operators in quantum mechanics operate. The formal solution to the equa-tion is expressed as

jψ(t)i D e�i H tjψ(0)i (5.63)

where we have suppressed the space coordinate dependence. The Hamiltonian inthe Schrödinger equation contains operators as functions of canonical variablesq, p in the form q ! x , p ! �ir and which are time independent. The time

Page 126: Yorikiyo Nagashima - Archive

5.2 Quantization Scheme 101

dependence is in the state vector. This is called the Schrödinger picture. In thefollowing, we attach a subscript S to quantities in the Schrödinger picture and asubscript H to the Heisenberg picture that we describe now. We define state vectorsand operators in the Heisenberg picture by

jψHi D e i HS t jψSi D jψS(0)i (5.64a)

OH D e i HS t OSe�i HS t (5.64b)

In the Heisenberg picture, all the time dependence is in the operators and thestate vectors do not change with time. From Eq. (5.64b), we can derive the timedependence of the Heisenberg operators:

dOH

d tD i [HS, OH] (5.65)

Putting OS D HS, Eq. (5.64b) shows HH D HS and from Eq. (5.65) we see that theHamiltonian does not change with time, i.e. it is a conserved quantity.3) From nowon, we omit the subscript of the Hamiltonian and simply write it as H. Then theequation that the Heisenberg operators satisfy is written as

dOH

d tD i [H, OH] (5.66)

which is called the Heisenberg equation of motion. Its relativistic version can beexpressed as

@μ OH D i [Pμ, OH] (5.67)

This is still a quantum mechanical equation. We can prove that Eq. (5.67) is alsovalid in quantum field theory, namely if the operators Pμ and OH are functions ofquantized fields. We shall prove this later in Sect. 9.1.1.

Problem 5.9

Derive the space part of Eq. (5.67).

We can make the transition from classical mechanics to quantum mechanics bythe following steps:

(1) Regard the classical variablesas quantum operators

(2) Impose the commutation relation [qi , p j ] D i δ i j

(3) Then calculate @μ OH D i [Pμ , OH]

(5.68)

The advantage of the Heisenberg picture is that it makes the correspondencewith classical mechanics apparent.

3) See Eq. (5.29d) for a caveat.

Page 127: Yorikiyo Nagashima - Archive

102 5 Field Quantization

5.2.2Quantization of the Harmonic Oscillator

We start with the quantization of the harmonic oscillator, a prescription describedin any standard textbook of quantum mechanics. The Hamiltonian is given byEq. (5.11). Using the Heisenberg equation of motion

Pq D i [H, q] Dpm

, Pp D i [H, p ] D �mω2q ) Rq D �ω2q (5.69)

which is the same equation as Eq. (5.12), but q and p are now operators. Now weintroduce operators defined by

q D

r1

2mω(a C a†) , p D �i

rmω

2(a � a†) (5.70)

The canonical commutation relation (5.68) becomes

[a, a†] D 1 (5.71)

The Hamiltonian in terms of a, a† is given as

H D12

ω�a† a C aa†� D ω

�a† a C

12

�(5.72)

Inserting the new variables into the Heisenberg equation gives

Pa D �i ωa , Pa† D i ωa† (5.73a)

) a D a0e�i ω t, a† D a†0 e i ω t (5.73b)

We rename a0 and a†0 as a and a† and consider their properties. As will be shown

in the following, eigenvalues of the operator N � a† a are either zero or positiveintegers. Hence the energy intervals of the harmonic oscillator are all the same(ΔE D „ω).

Problem 5.10

Using the Hamiltonian (5.72) and the commutator (5.71), derive Eq. (5.73a).

The Number OperatorWe prove that the operator N D a† a has positive integer eigenvalues and can beconsidered as the number operator. It satisfies the commutation relations

[N, a] D �a , [N, a†] D a† (5.74)

Let us consider an eigenstate of the operator N with eigenvalue n:

N jni D njni (5.75)

Page 128: Yorikiyo Nagashima - Archive

5.2 Quantization Scheme 103

Using the commutator (5.74), we see

N a†jni D a†(1C N )jni D (n C 1)a†jni (5.76a)

N ajni D (N � 1)ajni D (n � 1)a†jni (5.76b)

which means that a†jni, ajni are also eigenstates of N with eigenvalues (n ˙ 1).Therefore, a, a† are operators that change the eigenvalue by +1 and �1, respective-ly. Set the normalization condition as

hnjni D 1 , ajni D cnjn � 1i (5.77a)

n D hnjN jni D hnja† ajni D jcn j2hn � 1jn � 1i D jcn j

2 ! cn Dp

n

(5.77b)

We have chosen the phase of cn to make it real. Equation (5.76) means that theexpectation values of a†, a are nonzero only between jni and jn ˙ 1i:

jni Da†jn � 1i

cnD

(a†)2jn � 2icn cn�1

D � � � D(a†)mjn � mi

cn cn�1 � � � cn�mC1(5.78)

Repeated operation of a on jni eventually makes the eigenvalue negative. Becauseof Eq. (5.77b), the state cannot have negative eigenvalues. To avoid contradiction,there must exist a state j0i that satisfies

aj0i D 0, N j0i D 0 (5.79)

We call it the ground state or the vacuum state. Equation (5.78) tells us that anystate can be constructed from j0i by multiplying by a†:

jni D(a†)n

pn!j0i , N jni D njni (5.80a)

H jni D ω�

a† a C12

�jni D ω

�n C

12

�jni (5.80b)

The value of the energy is given by a multiple of a positive integer times ω plussome fixed constant.

The fact that all the energy intervals of a harmonic oscillator are equal becomesimportant when we quantize the field. At this point we make an important re-interpretation. We do not consider that the energy is n multiples of ω but that nparticles with energy ω are created. Then N D a† a represents a number operatorof the particle and the Hamiltonian H D ωa† a represents the energy sum of nparticles. a and a† can be interpreted as the annihilation and creation operatorsof the particle. The constant term ω/2 in the Hamiltonian is called the zero-pointenergy. For the moment we ignore it because the reference energy can always beshifted and hence unobservable. We shall come back to it in the last section of thischapter.

Page 129: Yorikiyo Nagashima - Archive

104 5 Field Quantization

Quantization of the FermionWe have derived the equation for a harmonic oscillator in two ways: A using q, p ,and B using a, a†. Method A uses Eq. (5.11) as the Hamiltonian and the commu-tator Eq. (5.68) while method B uses Eq. (5.72) and Eq. (5.71). They are connectedby the relation Eq. (5.70). Whichever we choose as the starting point, we can derivethe other. Therefore, both methods are equivalent. However, we notice that if wechoose B as the starting point, the commutator Eq. (5.71) is not a unique choice.If, instead of the commutator, we require anticommutator relations, we have

[a, a†]C � fa, a†g � aa† C a† a D 1 (5.81a)

fa, ag D fa†, a†g D 0 (5.81b)

We can equally obtain the harmonic oscillator using the Heisenberg equation.

Problem 5.11

Using the Hamiltonian Eq. (5.72) and the anticommutator Eq. (5.81), prove thatthe Heisenberg equation of motion gives the same harmonic oscillator equationEq. (5.73a) for a, a†.

In this case, however, the eigenvalues of the number operator are limited only to1, 0:

N 2 D a† aa† a D a†(1� a† a)a D a† a D N (5.82a)

) N D 1, 0 (5.82b)

The eigenvectors are j1i and j0i. Besides,

N a†j0i D a† aa†j0i D a†(1� a† a)j0i D a†j0i (5.83a)

) a†j0i D j1i (5.83b)

Similarly,

aj1i D j0i , aj0i D 0 , a†j1i D 0 (5.83c)

The last equation shows that a† behaves like an annihilation operator acting onj1i, because there is no n D 2 state. In summary, a, a† can be interpreted as theannihilation or creation operator of a particle; only one particle can occupy a state.Such a particle is called a fermion. Expression of the Hamiltonian in terms of q, pusing the relation Eq. (5.70) gives

H D i ω q p (5.84)

Page 130: Yorikiyo Nagashima - Archive

5.3 Quantization of Fields 105

which has no simple interpretation in classical mechanics and is, therefore, a pure-ly quantum mechanical object. Expressed in terms of a, a†

H D ω�

a† a �12

�(5.85)

which has a logical physical interpretation.Particles that satisfy the commutation relation are said to obey Bose–Einstein

statistics and are called bosons, while those that satisfy the anticommutation re-lation obey Fermi–Dirac statistics. Any number of bosons can occupy a state, butonly one fermion can enter a state.

5.3Quantization of Fields

How can we quantize fields? In Eq. (5.15), we have seen that the field is a collectionof oscillators qr distributed over all space x , and that x has been chosen for num-bering instead of r. To make the transition from discrete r to continuous x , let usdivide the space into rectangular boxes of volume ΔV D ΔxΔyΔz and numberthem by r. Define qr as the field averaged over ΔV at point x D x r multiplied byΔV , i.e. qr(t) D φ(t, x )ΔV . Then the field Lagrangian can be rewritten in terms ofqr :

L D12

Zd3x [ Pφ2 � (rφ)2 C m2φ2] D

Zd3 xL D

XΔV Lr

D1

2ΔV

Xr

Pq2r � (terms independent of Pqr) (5.86)

We define the conjugate momentum by

pr(t) D@L@ PqrDPqr

ΔVD

@(P

r ΔV Lr )

@( Pφ(x r )ΔV )D

@L

@ Pφ(x r )D Pφ(x r ) (5.87)

Then the Hamiltonian is given by

H DX

pr Pqr � L DX

π(x r , t) Pφ(x r )ΔV � LΔV!0����!

Zd3xH (5.88a)

H D π(x , t) Pφ(t, x )�12

[ Pφ2 � (rφ)2 C m2] (5.88b)

which reproduces the field Hamiltonian in the continuous limit. Since we havereplaced the field by a collection of discrete oscillators, the quantization can beperformed in a quantum mechanical way:

[qr , ps] D i δ r s , [qr , qs] D [pr , ps ] D 0 (5.89)

Page 131: Yorikiyo Nagashima - Archive

106 5 Field Quantization

The equal time commutator of the field can be obtained by replacing qr , pr byφ(t, x r )ΔV , π(t, x r )

[φ(t, x r ), π(t, x s)] D iδ r s

ΔVΔV!0����! [φ(t, x ), π(t, y )] D i δ3(x � y ) (5.90)

In summary, it is reasonable to assume that the quantization of the fields can beachieved by

1. interpreting the classical fields as quantum field operators in the Heisenbergpicture, and

2. imposing equal time commutation relations of the fields and their conjugatemomentum fields

[φ s(x ), π t (y )]txDt y D i δ s t δ3(x � y ) (5.91a)

[φ s(x ), φ t (y )]txDt y D [π s(x ), π t (y )]txDt y D 0 (5.91b)

where the indices s, t specify additional internal degrees of freedom. We haveused x to denote the four-dimensional space-time coordinate, i.e. x D (t, x ).The conjugate momentum field is defined by

π s(x ) DδL

δ Pφ s(5.91c)

In this book, however, we shall adopt an alternative method to quantize plane waveswith specific wave number k and frequency ω. Because any wave field can be(Fourier) decomposed in terms of waves that are harmonic oscillators, the methodwe learned in quantum mechanics is directly applicable. Both methods are equiva-lent, but the latter probably makes it easier to grasp the physical picture behind themathematical formula, although once one is used to the method, the former maybe easier to handle.

5.3.1Complex Fields

We start from a spin 0 complex scalar field that obeys the Klein–Gordon equation.It has two internal degrees of freedom and can be expressed in an alternative form:

' D1p

2(φ1 C i φ2) , '† D

1p

2(φ1 � i φ2) (5.92)

Here, we treat ' and '† as two independent fields. The Lagrangian is already givenby Eq. (5.20b) and the equation of motion can be derived from the Euler–Lagrangeequation. The energy, momentum and other conserved physical quantities were

Page 132: Yorikiyo Nagashima - Archive

5.3 Quantization of Fields 107

given in [Eq. (5.58)].4)

H DZ

d3x��

@'†

@t

��@'

@t

�C (r'† � r')C m2'†'

�(5.93a)

P D �Z

d3x�

@'†

@tr' Cr'† @'

@t

�(5.93b)

Q D i qZ

d3x�'† @'

@t�

@'†

@t'

�(5.93c)

Since the Klein–Gordon equation is a linear wave equation, it can be expanded interms of waves having wave number k

'(x ) DX

k

qk e i k�x continuous limit���������! V

Zd3 p

(2π)3 qk e i k�x 5) (5.94)

The continuous limit is written for later reference, but in this chapter we stick todiscrete sum to aid intuitive understanding of the quantization procedure. Apply-ing the Klein–Gordon equation

[@μ@μ C m2]' DX

k

�Rqk C ω2qk

�e i k�x D 0 , ω D

qk2C m2 (5.95)

which shows that the Klein–Gordon field is a collection of an infinite number ofharmonic oscillators indexed by the wave number k. Those with different k are indifferent oscillation modes, and hence independent of each other. We separate q k

into positive and negative frequency parts6) and write

'(x ) DX

k

1p

2ωV

�ak e�i ω t C ck e i ω t� e i k�x (5.96)

where V is the normalization volume for the plane wave. Inserting Eq. (5.96) intoEq. (5.93), and using the orthogonal relation (refer to Eq. 6.73)

1V

Zd3x ei(k�k0)�x D δkk0 (5.97)

4) The wording “electric charge” is used here forsimplicity, but the argument applies to anyconserved charge-like quantum numbers, forexample, strangeness.

5) Later we will set V D 1, but we retain theintegration volume V for a while. WhenV D L3 is finite, we should use a discrete

sum, but even in the continuous limit (i.e.V ! 1), it is sometimes convenient tothink V is large but finite. See Sect. 6.6.2 fordetailed discussion.

6) By convention we call e�iωt positivefrequency and eiωt negative frequency.

Page 133: Yorikiyo Nagashima - Archive

108 5 Field Quantization

we obtain

H DX

k

ω�a†

k ak C c†k ck

(5.98a)

P DX

k

k�a†

k ak � c†k ck

(5.98b)

Q DX

k

q�a†

k ak � c†k ck

(5.98c)

Then we impose�ak , a†

k0D δkk0 ,

�ak , ak0

D�a†

k , a†k0D 0 (5.99a)

�ck , c†

k0D δkk0 ,

�ck , ak0

D�c†

k , c†k0D 0 (5.99b)

�ak , c†

k0D�a†

k , ck0D�ak , ck0

D�a†

k , c†k0D 0 (5.99c)

Here,�a†

k , ak�

can be interpreted as creation and annihilation operators of a particlewith energy-momentum (E D „ω, p D „k). However, if we apply the same rule tothe negative frequency part,

�c†

k , ck�

creates or annihilates a particle with negativeenergy. This is not allowed since there should be no particles with negative energy.Let us examine more carefully what effects c†

k and ck produce when operating onthe state vector. As before, we define the number operator N by

N DX�

a†k ak C c†

k ck�

(5.100)

Using the commutation relation Eq. (5.99), we obtain the relations�N, a†

k

D a†

k , [N, ak ] D �ak (5.101a)

�N, c†

k

D c†

k , [N, ck ] D �ck (5.101b)

which shows that a†k , c†

k can be interpreted as creation and ak , ck as annihilationoperators of particles with momentum k. Now we apply the Heisenberg equationto the field ' and have it operate on a energy eigenstate jEi

P'jEi D i [H, ']jEi , H jEi D E jEi (5.102)

Compare the coefficients of e�i ω t and eCi ω t on the two sides of the first equation.Using the relation (5.101), we obtain

�i ω ak jEi D i [H, ak ]jEi D i(H � E )akjEi

i ωck jEi D i [H, ck ]jEi D i(H � E )ckjEi

) H ak jEi D (E � ω)ak jEi , (5.103a)

H ck jEi D (E C ω)ck jEi (5.103b)

Page 134: Yorikiyo Nagashima - Archive

5.3 Quantization of Fields 109

ak , the coefficient of the positive-frequency waves, reduces the number of particlesby 1 and also reduces the energy of the state jEi by ω. So it is a legitimate annihila-tion operator. But ck , the coefficient of the negative-frequency wave term, actuallyincreases the energy of the state by ω. A logical interpretation is then that ck cre-ates a particle and should be considered as a creation operator. We then redefinec�k ! b†

k , c†�k ! bk and the commutation relations as

bk � c†�k , b†

k � c�k (5.104a)

�bk , b†

k0D δkk0 ,

�bk , bk0

D�b†

k , b†k0D 0 (5.104b)

�ak , b†

k0D�ak , bk0

D�a†

k , b†k0D 0 (5.104c)

The reason for adopting �k is to make the second term in Eq. (5.98b) positive sothat the particle has positive momentum as well as positive energy. We obtain thenewly defined field and Hamiltonian, etc. as

'(x ) DX

k

1p

2ωV

�ak e�i ω t C ck e i ω t�e i k�x

DX

k

1p

2ωV

�ak e�i ω tCi k�x C b†

k e i ω t�i k�x �In terms of the new creation/annihilation operators, the energy, momentum andcharge operators are rewritten as

H DX

k

ω�a†

k ak C b†k bk C 1

(5.105a)

P DX

k

k�a†

k ak C b†k bk C 1

(5.105b)

Q DX

k

q�a†

k ak � b†k bk � 1

(5.105c)

The constant term appears because c†�k c�k D bk b†

k D 1 C b†k bk . The above ex-

pression tells us that bk is an annihilation operator of a particle having the energy-momentum (ω, k), but opposite charge, which we call an antiparticle. Note, we of-ten encounter a statement that the antiparticle has negative energy. From the abovediscussion, the reader should be convinced that there are no particles with negativeenergy. The antiparticle has positive energy, too.

Summary of Field QuantizationField Expansion

'(x ) DX

k

1p

2ωV

�ak e�i ω tCi k�x C b†

k e i ω t�i k�x � (5.106)

Page 135: Yorikiyo Nagashima - Archive

110 5 Field Quantization

Commutation Relations�ak , a†

k0D δkk0 ,

�bk , b†

k0D δkk0 (5.107a)

�ak , ak0

D�a†

k , a†k0D 0 ,

�bk , bk0

D�b†

k , b†k0D 0 (5.107b)

�ak , b†

k0D�a†

k , bk0D 0

�ak , bk0

D�a†

k , b†k0D 0 (5.107c)

Energy, Momentum and Charge Operators

H DX

k

ω�a†

k ak C b†k bk C 1

(5.108a)

P DX

k

k�a†

k ak C b†k bk C 1

(5.108b)

Q DX

k

q�a†

k ak � b†k bk � 1

(5.108c)

Using relations in the box, it is a straightforward calculation to prove

['(x ), π(y )]txDt y D i δ3(x � y ) (5.109a)

['(x ), '(y )]txDt y D [π(x ), π(y )]txDt y D 0 (5.109b)

where

π(x ) �δL

δ( P'(x ))D P'†(x ) (5.109c)

Problem 5.12

Decomposing the complex field and its conjugate momentum in terms of annihi-lation and creation operators, prove Eq. (5.109).

Note: As the conjugate momenta of ' and '† are given by

π' DδL

δ P'D P'† , π'† D

δL

δ P'† D P' (5.110)

the commutators for [', π' ] and�'†, π'†

are equivalent in the sense that one is the

hermitian conjugate of the other. Therefore there are no additional commutatorsto Eq. (5.109).

Page 136: Yorikiyo Nagashima - Archive

5.3 Quantization of Fields 111

Equation (5.106) may be inverted as

ak D iZ

d3 x e�k (x ) !@t '(x ) , a†

k D �iZ

d3x ek (x ) !@t '†(x ) (5.111a)

bk D iZ

d3x e�k (x ) !@t '†(x ) , b†

k D �iZ

d3x ek (x ) !@t Oφ(x ) (5.111b)

ek (x ) D1

p2ωV

e�i k �x , e�k (x ) D

1p

2ωVei k �x (5.111c)

where

A(t) !@t B(t) D A(t)

@B(t)@t�

@A(t)@t

B(t) (5.112)

Quantization of the fields can also be achieved by imposing the commutationrelations Eq. (5.109) on the fields and their conjugate momentum fields. Then thecommutation relations for the creation and annihilation operators Eq. (5.107) canbe proved by using Eq. (5.111) and the commutation relations Eq. (5.109). The twomethods of quantization are mathematically equivalent and one can be derivedfrom the other.

Problem 5.13

Starting from the field commutators Eq. (5.109) and using Eq. (5.111), prove thecommutation relations for the creation and annihilation operators Eq. (5.107).

5.3.2Real Field

The expression for the real field is obtained by setting ' D '† D φ/p

2 and insert-ing it into Eq. (5.93), giving

H D12

Zd3 x

�Pφ2 C (rφ � rφ)C m2φ2 (5.113a)

P D �Z

d3x�

@φ@trφ

�(5.113b)

The complex scalar field is actually two real scalar fields of the same mass addedtogether. Inserting ' D (φ1C i φ2)/

p2, '† D (φ1 � i φ2)/

p2, the energy-momen-

tum of ' and φ1, φ2 can be converted into each other. The difference is that nocharge operator can be defined for the real field, which is readily seen by looking atEq. (5.93c). Therefore the real field describes electrically neutral particles, but theconverse is not true. For instance, K0 and K

0are electrically neutral, but have the

“strangeness” quantum number to distinguish them and hence are described bythe complex field.

Page 137: Yorikiyo Nagashima - Archive

112 5 Field Quantization

Fock SpaceAs can be seen from Eq. (5.106), ' connects states differing in particle number by 1.Especially, a matrix element between the vacuum and one-particle system appearsfrequently. Defining jki D

p2ωa†

k j0i7)

ak j0i D h0ja†k D 0 , h0jak jki D hkja

†k j0i D

p2ω (5.114a)

h0j'(x )jki D1p

Ve�i k �x , hkj'(x )j0i D

1p

Vei k �x (5.114b)

k � x � ω t � k � x (5.114c)

Problem 5.14

Show that

jEi D jk1, k2, � � � , km , q1, q2, � � � , qni

�p

2ωk1a†k1

p2ωk2a†

k2� � �p

2ωk m a†km

p2ωq1b†

q1

p2ωq2b†

q2

� � �p

2ωqn b†qnj0i (5.115)

is a state with m particles and n antiparticles whose total energy and momentumare given by

E D ωk1 C � � � ωkm C ωq1 C � � � ωqn (5.116a)

P D k1 C � � � C k m C qq C � � � C qn (5.116b)

A state with any number of particles represents a Fock vector, and all Fock vectorsconstitute a Fock space.

Summary The relativistic wave function suffered from negative-energy particles.But if it is reinterpreted as the classical field and quantized again, the difficulty dis-appears. Because of this, the quantization of the field is often called second quanti-zation. The quantized field φ is no longer a wave function but an operator that actson the Fock vectors (states in the Heisenberg picture). It has wave characteristics inits expansion in terms of e�i k �x and also a particle nature, in that the coefficientsak , a†

k can be interpreted as the square root of the particle number.

5.3.3Dirac Field

From the Lagrangian of the Dirac field Eq. (5.20d), using Noether’s theorem we canderive expressions for the Hamiltonian etc.:

H DZ

d3x ψ†�

α �r

iC m

�ψ D

Zd3x ψ† i

@ψ@t

(5.117a)

7) This is a relativistic normalization that we will adopt. The details will be described in Sect. 6.5.2.

Page 138: Yorikiyo Nagashima - Archive

5.3 Quantization of Fields 113

P DZ

d3x ψ†(�ir)ψ (5.117b)

q J μ D qZ

d3x ψγ μ ψ D qZ

d3x ψ†(1, α)ψ ! Q D q J0

D qZ

d3x ψ† ψ (5.117c)

We expand the field in terms of plane-wave solutions of the Dirac equation (seeSect. 4.2). Writing down the Dirac fields as

ψ DXp ,r

1p2Ep V

ha p r u r(p )e�i p �x C b†

p r vr (p )e i p �xi

(5.118a)

ψ DXp ,r

1p2Ep V

ha†

p r ur(p )e i p �x C b p r v r (p )e�i p �xi

(5.118b)

H, P , Q can be rewritten as

H DXp , r

Ep

a†

p ,r a p ,r � b p,r b†p ,r

�(5.119a)

P DXp ,r

p

a†p ,r a p ,r � b p,r b†

p ,r

�(5.119b)

Q DXp ,r

Ep

a†

p ,r a p ,r C b p ,r b†p ,r

�(5.119c)

Notice the difference from and similarity to Eq. (5.98). Here we note that if we applythe same logic as we followed after Eq. (5.98), we cannot use the commutationrelations like those of Eq. (5.104), because they produce negative energies, as isshown by the second term of the Hamiltonian. But if we impose the followinganticommutators:n

a p ,r , a†p 0 ,r 0

oDn

b p ,r , b†p 0 ,r 0

oD δ p ,p 0 δ r,r 0

fa p ,r , a p 0 ,r 0g Dn

a†p ,r , a†

p 0 ,r 0oDn

b†p ,r , b†

p 0 ,r 0oDn

b†p ,r , b†

p 0 ,r 0oD 0n

a p ,r , b†p 0 ,r 0

oD fa p ,r , b p 0 ,r 0g D

na†

p ,r , b†p 0 ,r 0

oD 0

(5.120)

the energy becomes positive definite and we obtain physically consistent interpre-tations:

H DXp ,r

Ep

a†

p ,r a p ,r C b†p ,r b p ,r � 1

�(5.121a)

P DXp ,r

p

a†p ,r a p ,r C b†

p ,r b p ,r � 1�

(5.121b)

Q DXp ,r

Ep

a†

p ,r a p ,r � b†p ,r b p ,r C 1

�(5.121c)

Therefore, the Dirac field must obey Fermi–Dirac statistics.

Page 139: Yorikiyo Nagashima - Archive

114 5 Field Quantization

Problem 5.15

Show that Eq. (5.118) can be inverted to give

a p ,r D

Zd3x up ,r(x )e�

p (x )γ 0ψ(x ) , a†p ,r D

Zd3x ψ(x )γ 0u p ,r(x )e p (x )

(5.122a)

b p ,r D

Zd3 x ψ(x )γ 0vp ,r(x )e p (x ) , b†

p ,r D

Zd3x v p ,r(x )e�

p (x )γ 0ψ(x )

(5.122b)

e p (x ) D1p

2Ep Ve�i p �x , e�

p (x ) D1p

2Ep Vei p �x (5.122c)

The equal time commutators of the Dirac field are given as follows. The conju-gate field is defined by

π r DδL

δψrD i [ψγ 0]r D i ψ†

r (5.123)

Since the Lagrangian contains no derivatives of ψ, there is no conjugate momen-tum to ψ†.8) The above equation shows that i ψ† itself is the conjugate momentumto ψ. Therefore the commutators are˚

ψr (x ), ψ†s (y )

�tx Dt y

D δ r s δ3(x � y ) (5.124a)

fψr (x ), ψs(y )gtx Dt yD˚

ψ†r (x ), ψ†

s (y )�

tx Dt yD 0 (5.124b)

5.3.4Electromagnetic Field

The electromagnetic field can be expressed by the potential Aμ D (φ, A), where φis the scalar or Coulomb potential and A the vector potential. It is a real vector po-tential and, when quantized, represents a spin 1 neutral particle, called the photon.It obeys the Maxwell equation in vacuum:

@μ F μν D @μ@μ Aν � @ν �@μ Aμ� D 0 (5.125a)

F μν D @μ Aν � @ν Aμ (5.125b)

As it has gauge symmetry, which means the electromagnetic field F μν does notchange by a gauge transformation,

Aμ ! A0 μ(x ) D Aμ(x )C @μ �(x ) (5.126)

8) If one uses the symmetric Lagrangian of Eq. (5.21), one sees that the conjugate momentum to ψ†

is�iψ, which yields the same commutation relation.

Page 140: Yorikiyo Nagashima - Archive

5.3 Quantization of Fields 115

The gauge has to be fixed for calculating observable quantities. The usual conven-tion is to adopt the Lorentz gauge

@μ Aμ D 0 W Lorentz gauge (5.127)

Then the Maxwell equation becomes

@μ@μ Aν D 0 (5.128)

which is the Klein–Gordon equation with m D 0. Even after adopting the Lorentzgauge, we still have an extra degree of freedom to change the potential using afunction that satisfies

Aμ ! A0μD Aμ C @μ � , @μ@μ � D 0 (5.129)

We shall show that although the potential has four components, it has only two de-grees of freedom because of the gauge symmetry. Writing the polarization vector ofthe photon having energy-momentum k as εμ(k, λ), the potential can be expandedas

Aμ DXk,λ

1p

2ωV

hak,λ εμ(k, λ)e�i k �x C a†

k ,λ ε�μ(k, λ)e i k �xi

(5.130)

For the free electromagnetic potential, the following Einstein relation holds:

k2 D kμ k μ D 0 or ω D k0 D jkj (5.131)

The Lorentz condition Eq. (5.127) means

kμ εμ(k, λ) D 0 (5.132)

Equation (5.129) means ε has an extra degree of freedom to change

εμ ! ε0 μ D εμ C i k μ � (k) , k2 D 0 (5.133)

We can choose � (k) to make ε0 D 0. Then the Lorentz condition becomes

k � � D 0 (5.134)

Therefore the free electromagnetic field (i.e. the electromagnetic wave) has onlytransverse polarization, with two degrees of freedom.

Coulomb Gauge and the Covariant GaugeBecause of the gauge freedom, the quantization of the electromagnetic field is non-trivial. The easiest way to understand it intuitively is to adopt the Coulomb gauge,which minimizes the degrees of freedom. In this gauge, the potential satisfies thecondition

r � A D 0 W Coulomb gauge (5.135)

Page 141: Yorikiyo Nagashima - Archive

116 5 Field Quantization

Then the field can be expressed in terms of transverse potential A? and instanta-neous Coulomb potential (see Appendix B). In this gauge, only the transversely po-larized photons are quantized; they are treated as two independent massless Klein–Gordon fields. The instantaneous Coulomb field has no time dependence, hence isnot a dynamical object. It is left as a classical field. In this gauge, manifest Lorentzcovariance is lost, although a detailed examination shows it is not. Due to absenceof manifest covariance, calculations in general become complicated, and the usu-al convention is to adopt a method we refer to as the covariant gauge. It treats allfour components of the field potential Aμ on an equal footing and independently.Then we expect to have commutators like Eq. (5.141a), which is in conflict with theLorentz condition Eq. (5.127), at least as an operator condition. We also encounteranother problem. The Lagrangian density does not contain PA0, which means thatthe conjugate momentum π0 does not exist. Therefore A0 cannot be a dynamicalvariable. A remedy to the problem is to add an additional term, called the gaugefixing term, to the Lagrangian

L D �14

F μν Fμν �1

2λ(@μ Aμ)2 (5.136)

Then the Euler–Lagrange equation becomes

@μ@μ Aν �

�1 �

�(@μ Aμ) D 0 (5.137)

which reproduces the desired Maxwell equation without imposing the Lorentzgauge condition Eq. (5.127), if we set λ D 1. The price to pay is that the addedterm is not gauge invariant and we no longer have the freedom to make anothergauge transformation. Namely, the gauge has been fixed and the choice of λ D 1 iscalled the Feynman gauge.

In the covariant gauge, the four potential fields are independent and we can as-sign four independent polarization vectors to each component. In the coordinatesystem where the z axis is taken along the photon momentum k, they can be ex-pressed as

εμ(0) D (1, 0, 0, 0) (5.138a)

εμ(1) D (0, 1, 0, 0) (5.138b)

εμ(2) D (0, 0, 1, 0) (5.138c)

εμ(3) D (0, 0, 0, 1) (5.138d)

Circular polarization ε(˙) is also used instead of linear polarizations ε(1), ε(2)

εμ(C) D (0,�1/p

2,�i/p

2, 0) (5.138e)

εμ(�) D (0, 1/p

2,�i/p

2, 0) (5.138f)

Page 142: Yorikiyo Nagashima - Archive

5.3 Quantization of Fields 117

A quantum that has polarization ε(3) is called a longitudinal photon, and that withε(0), a scalar photon. The four polarization vectors satisfy the orthogonal condition

εμ(λ)εμ(λ0)� D gλλ0 (5.139)

Notice the polarization of the scalar photon, unlike others, is time-like, i.e. (εμ(0)εμ(0) D 1 > 0). Then the completeness condition or sum of the polarization statesis written asX

λ

gλλ0 εμ(λ)εν(λ0)� D gμν (5.140)

Unphysical PhotonsThe usual quantization method we have developed so far leads to a commutationrelation for the electromagnetic fields

[Aμ(x ), πν(y )]txDt y D �i gμνδ3(x � y ) (5.141a)

πμ DδL

δ PAμD��@μ Aμ , E

�(5.141b)

As we can see, the Lorentz gauge condition (@μ Aμ D 0) contradicts the abovecommutation relation and forces π0 to vanish, thereby invalidating constructionof canonical commutation relations for the fields. So we adopt a weaker Lorentzcondition valid only for the observable states jΦ i, jφi

hφj@μ AμjΦ i D 0 (5.142)

and eliminate unwanted components from the final expression for the observ-ables. As a rigorous mathematical treatment of the method is beyond the scopeof this textbook, we only describe the essential ingredients that are necessary forlater discussions. The commutation relation consistent with the field commutatorEq. (5.141a) ish

ak ,λ , a†k,λ

iD �gλλ0 δk k0 (5.143)

which gives a negative norm (hλ D 0jλ D 0i D �1) for the scalar photon. Obvi-ously a photon having negative probability is unphysical. The fact is that the longi-tudinal and the scalar photon appear only in intermediate states and never appearas physical photons, which satisfy the Einstein relation kμ k μ D ω2 � jkj2 D 0.The unusual property of the scalar photon works in such a way as to compensateunphysical contributions of the longitudinal polarization states.

Intermediate photons created by electric current j μ satisfy the Maxwell equation

@μ@μ Aν D q j ν (5.144)

Comparing both sides in momentum space, it is clear that the energy �momentumof the intermediate photon does not satisfy the Einstein relation for the particle

kμ k μ ¤ 0 (5.145)

Page 143: Yorikiyo Nagashima - Archive

118 5 Field Quantization

The photon is said to be off-shell, or in a virtual state. Equation (5.144) is a resultof interaction terms in the Lagrangian

L! L0 D L � q j μAμ (5.146)

Later we shall learn that the interaction term in the above equation appears natu-rally as a consequence of the “local” gauge invariance for a system including boththe matter (Dirac) field and the electromagnetic field, but here we assume it as giv-en. Virtual photons are produced and absorbed by the currents (i.e. by interactingcharged particles) and appear as a propagator in the form � 1/ k2 (to be describedsoon) in the equation. While in the Coulomb gauge only the transverse polariza-tions need to be considered, in the covariant gauge all four polarization states haveto be included:

XλD˙

εμ(λ)εν(λ)� D δ i j �k i k j

jkj2Coulomb gauge (5.147a)

!

3XλD0

εμ(λ)εν(λ)� ! �g μν covariant gauge (5.147b)

The arrow in Eq. (5.147b) means it is not an actual equality. Negative norm for thescalar polarization is implicitly assumed as shown by Eq. (5.140). To see the dif-ference between the two methods, we use the fact that in the transition amplitudethey appear sandwiched by their sources, the conserved currents. As a typical ex-ample, take the matrix element for electron–muon scattering, which has the form[see Eq. (7.12)]

M � j μ(e)�gμν

k2 j ν(μ) (5.148)

Since the conserved current satisfies

@μ j μ D 0! kμ j μ D k0 j 0 � jkj j 3 D 0 (5.149)

in the coordinate system where the z-axis is taken along the momentum direction,Eq. (5.148) after summation of polarization states can be rewritten as

j μ�gμν

k2j ν D

1k2

"�gμ0gν0

„ ƒ‚ …scalar

Ck i k j

k2„ƒ‚…longitudinal

C

�δ i j �

k i k j

k2

�„ ƒ‚ …

transverse

#j μ j ν

D1k2

�(k0)2

k2 � 1�

( j0)2 Cj ? � j ?

k2D

j0 j0

jkj2C

j ? � j ?k2

(5.150a)

j ? D j �kjkj2

(k � j ), k � j ? D 0 (5.150b)

Page 144: Yorikiyo Nagashima - Archive

5.4 Spin and Statistics 119

The first term represents the contribution of an instantaneous Coulomb potential9)

and the second is the contribution of the current coupled to the transverse photon.In the Coulomb gauge, the two terms are treated separately and only the transversephotons are quantized. In the covariant gauge it is included in the contributionof the longitudinal and the scalar photon. Note, the proof was given in a specialcoordinate system, but Eq. (5.148) and the current conservation law Eq. (5.149)are written in Lorentz-invariant form. Therefore, a proof valid in one coordinatesystem is equally valid in any coordinate system connected by the proper Lorentztransformation.

When the photon appears as a real (external) particle, we may still take the totalsum instead of only transverse polarizations, as will be shown later in Eq. (7.54).

In summary, longitudinal and scalar photons, which are artifacts of the covari-ant gauge, have unphysical properties, but they appear only in intermediate states,manifesting themselves as the instantaneous Coulomb potential. We can treat allfour components of the electromagnetic potential in a Lorentz-covariant way andmay safely take the polarization sum from 0 to 3 to obtain correct results.

5.4Spin and Statistics

We have seen that the creation and annihilation operators have to satisfy the com-mutation relation for the scalar and vector fields and anticommutation relation forthe Dirac field. This was a requirement to make the energy positive definite. Therelation can be generalized and the following well-known spin-statistics relationholds.

Particles having integer spin obey Bose–Einstein statistics and those with half-integer spin obey Fermi–Dirac statistics.

We shall re-derive the spin-statistics from a somewhat different point of view.We allow both commutation and anticommutation relations for the creation–annihilation operators and see what happens. Writing [A, B ]˙ D AB˙ B A, we seth

ak , a†k0i

˙D δkk0 , [ak , ak0 ]˙ D

ha†

k , a†k0i

˙D 0 (5.151)

We first calculate the commutators of the scalar fields. Using the above equationswe can show

[φ(x ), φ(y )]� DX

k

12ωV

[e�i k �(x�y )� e i k �(x�y )] D

(i Δ(x � y )

i Δ1(x � y )(5.152)

9) ( j 0)2/k2 is the Fourier transform of a static Coulomb potential V D �(x )�(y)/jx � yj.

Page 145: Yorikiyo Nagashima - Archive

120 5 Field Quantization

where

i Δ(x � y ) DZ

d3k(2π)32ωk

[e�i k �(x�y )� e i k �(x�y )]10)

D

Zd4 k

(2π)3 ε(k0)δ(k2 � m2)e�i k �(x�y )

(5.153a)

i Δ1(x � y ) DZ

d3k(2π)32ωk

[e�i k �(x�y )C e i k �(x�y )]

D

Zd4 k

(2π)3δ(k2 � m2)e�i k �(x�y )

�exp

h� m

pjx � y j2 � (x0 � y 0)2

ijx � y j2 � (x0 � y 0)2

for � (x � y )2 1

m2 (5.153b)

where ε(k0) D ˙1 for k0 ? 0 and we used

δ(k2 � m2) D1

2E

�δ(k0 � E )C δ(k0 C E )

, E D

qk2 C m2 (5.154)

The function Δ(x ) has the properties

Δ(x )jx0D0 D 0 ,@

@tΔ(x )

ˇˇ

x0D0D �δ3(x) (5.155a)

[@μ@μ C m2]Δ(x ) D 0 (5.155b)

It is Lorentz invariant, as can be seen from Eq. (5.153a), because the factors inthe integral are all Lorentz scalars. This applies also to ε(k0), since it lies on thelight cone (k2 � m2 D 0) and does not change sign by any proper (orthochronous)Lorentz transformation: time-like (k2 > m2) momentum vectors with k0 > 0 al-ways lie in the forward light cone and those with k0 < 0 always in the backwardlight cone. Since Δ(x�y ) D 0 for x0 D y 0, the Lorentz invariance dictates that it al-so vanishes at space-like distance [(x� y )2 < 0]. This is not true for the function Δ1.It is also Lorentz invariant but it does not vanish at equal time. In fact, according toEq. (5.153b) it does not vanish outside the light cone, at least within a distance ofthe Compton wavelength of the particle. This means that if we adopt the anticom-mutator for the scalar field it is described by the function Δ1(x� y ), and causality isviolated.11) Remember that in quantum mechanics, uncommutative operators obeythe uncertainty principle and their values cannot be determined independently. Onthe other hand, if we adopt the commutator then it becomes Δ(x� y ), the two mea-surements at space-like distance are not related and causality is respected.

10) For the relation between discrete sum and continuous integral, see Eq. (5.94).11) Strictly speaking we should talk about commutators of observables composed of φ’s, but they are

related by simple algebra.

Page 146: Yorikiyo Nagashima - Archive

5.5 Vacuum Fluctuation 121

We can do similar calculations for the Dirac field:

[ψ(x ), ψ(y )]˙ DXp ,r

12Ep V

�u r(p )ur (p )e�i p �(x�y )˙ vr (p )v r(p )e i p �(x�y )

DX

p

12Ep V

[( 6p C m)e�i p �(x�y )˙ ( 6p � m)e i p �(x�y )]

D (ir/C m)Z

d3 p(2π)32Ep

[e�i p �(x�y )� e i p �(x�y )]

D

((ir/C m)i Δ(x � y )

(ir/C m)i Δ1(x � y )

(5.156)

where p/ D γ μ pμ is the Feynman slash. This shows that unless we choose theanticommutation relation for the Dirac field, the causality condition is not satisfied.Since fields of spin higher than 1/2 can be considered as being built up of productsof spin-1/2 fields, any (half) integer spin field has to satisfy (anti)commutationrelations if causality is respected.

5.5Vacuum Fluctuation

We have seen that the Hamiltonian of the fields [see Eq. (5.105a) and Eq. (5.121a)]has a constant term, i.e. the zero point energy. In the case of the harmonic oscil-lator, it represents the uncertainty of the position in confinement of the potential,and the resultant uncertainty of the momentum, in turn, produces the minimumenergy even in the ground state. Namely, it is a quantum fluctuation associatedwith the uncertainty principle. In field theory it becomes a diverging vacuum ener-gy. Usually, observed energy is measured from a certain reference point and onlythe difference matters in ordinary situations. As long as the infinite energy staysconstant, it does no harm, unless we are treating gravity. Experimentally, however,there is evidence that the zero-point energy is real. It is the energy of the vacuum,where “nothing!” exists. By nothing we mean every form of matter that is known tous and energy fields such as the electromagnetic field. It permeates space, so howdo we know of its existence? The fact is that lacking a gravitational tool we cannotknow its absolute value. But we can change the boundary condition surroundingthe vacuum we want to measure and by doing so its variation can be measured.Other known examples of vacuum fluctuation are the Lamb shift, which is de-scribed in Sect. 8.2.1, and van der Waals forces, which acts on molecules.

We emphasize the importance of understanding the vacuum energy for two rea-sons. First, the Standard Model of elementary particles has its foundation in theHiggs mechanism, which considers the vacuum to be a dynamical object that canmetarmorphose and is indeed in a sort of superconducting phase as a way of gen-erating mass for the particles. Second, the recently established concordance modelof cosmology acknowledges that 70% of the cosmic energy is the vacuum energy,

Page 147: Yorikiyo Nagashima - Archive

122 5 Field Quantization

alias dark energy which causes accelerated expansion of the universe. Pursuit ofthe vacuum energy could well be the theme of the 21st century.

5.5.1The Casimir Effect*

Let us investigate the existence of the zero-point energy. As an example, consider alimited space between two large parallel metal plates.

We idealize the plate by assuming it to be perfectly conducting and to have asquare shape of size L2 at a gap distance d � L, as depicted in Fig. 5.1; we ignoreedge effects. The vacuum energy of the electromagnetic field is given by

E0 D12

Xk,λ

ωk , ωk Dq

k2x C k2

y C k2z (5.157)

When the gap distance is sufficiently small, the wave modes that can exist insidethe gap volume are limited. Only standing waves are allowed. We expect the vol-ume to be filled with less energy inside the boundary than outside the boundary,and there will be an attractive force between the plates, which is called the Casimirforce [92].12) Although the zero-point energy in each case is divergent, the differ-ence turns out to be finite. The calculation goes as follows. The wave number,kz , that is perpendicular to the plates can have only discrete values kz D nπ/d(n D 0, 1, 2, � � � ), whereas it can be continuous if the plates do not exist. Only waveswith transverse polarization contribute to the energy density:

Efree D12

Xk ,λD1,2

ωk D12

2L2d(2π)3

Zd3k

qk2

x C k2y C k2

z

D12

L2

(2π)2

Zdkx dky

Z 1

�1ddkz

2π2q

k2x C k2

y C k2z

D12

L2

(2π)2

Zdkx dky

Z 1

0dn2

rk2

x C k2y C

nπd

�2(5.158a)

Egap D12

L2

(2π)2

Zdkx dky

(qk2

x C k2y C 2

1XnD1

rk2

x C k2y C

nπd

�2)

(5.158b)

In writing Eq. (5.158b), we have taken into account that when kz D 0, the propa-gating waves in the gap can have only one polarization mode. Then the difference

12) The sign of the Casimir force depends on the geometry. For instance, the geometry of twohemispheres causes a repulsive force [74].

Page 148: Yorikiyo Nagashima - Archive

5.5 Vacuum Fluctuation 123

L

L

d

xy

z

VACUUM FLUCTUATION

STANDING WAVES ONLY

Figure 5.1 Casimir effect illustrated. Between two metal plates, an attractive force is generatedthrough vacuum fluctuations.

in the energy density is

E D Egap � Efree

L2d

D1

2πd

Z 1

0kdk

(k2C

1XnD1

rk2 C

nπd

�2�

Z 1

0dn

rk2 C

nπd

�2)

Dπ2

8d4

Z 1

0dτ

" 1XnD�1

pτ C n2 �

Z 1

�1dzp

τ C z2

#D �

π2

720d4

(5.159)

Proof: Equation (5.159) is apparently not well defined because it diverges as τ !1. Therefore we regularize the equation by putting13)

f α(τ, z) Dp

τ C z2e�αp

τCz2(5.160)

Then we can use the Abel–Plana formula

1XnD0

f (n) �Z 1

0f (z)dz D

12

f (0)C iZ 1

0

f (i t)� f (�i t)e2π t � 1

d t (5.161)

13) The regularization treatment is not quite an ad hoc patch up. For wavelengths shorter than theatomic size, it is unrealistic to use a perfect conductor approximation. So it is natural to introducea smooth cutoff, which attenuates the wave when k � 1/a, where a ' atomic distance. In fact, itis known that above the plasma frequency ω D

pne2/mε0 a metal becomes transparent to light.

Page 149: Yorikiyo Nagashima - Archive

124 5 Field Quantization

Conditions for the formula to be valid are:(1) f (z) is analytic (differentiable) in the region Re(z) � 0.(2) limy!˙1 e�2πjy j f (x C i y ) D 0 uniformly for all x � 0.(3) limx!1

R1�1 e�2πjy j f (x C i y )d y D 0.

Obviously the regularized function f (z) DR1

0 dτ f α (τ, z) Eq. (5.160) satisfies theconditions (2) and (3) for α > 0. We note we can safely put α D 0 in the r.h.s. ofthe Abel–Plana formula. When τC t2 < 0, we choose the branch of the square rootfunction as follows to ensure condition (1):

f (i t) D f (�i t) Dp

τ � t2 for t2 < τ (5.162a)

f (i t) D � f (�i t) D ip

t2 � τ for t2 > τ (5.162b)

Then Eq. (5.159) becomes

E D π2

8d4

Z 1

0dτ2

" 1XnD0

pτ C n2 �

Z 1

0dzp

τ C z2 �12

#

Dπ2

8d4

Z 1

0dτ2 i

Z 1

0d t

f (i t)� f (�i t)e2π t � 1

D �π2

2d4

Z 1

0dτZ 1

d t

pt2 � τ

e2π t � 1

D �π2

2d4

Z 1

0d tZ t2

0dτ

pt2 � τ

e2π t � 1D �

π2

3d4

Z 1

0d t

t3

e2π t � 1

D �π2

3d4

Z 1

0d t

1XnD1

t3e�2πnt D �1

8π2 � (4) D �π2

720d4

(5.163)

where the Zeta (� ) function is defined in the region Re(s) > 1 as

� (s) D1X

nD1

1ns

(5.164)

which is known to be holomorphic (an analytic function) in all the complex s planeexcept at s D 1. �

Now that we have obtained the extra energy density in the gap, we can calculatethe Casimir force per unit area that acts on the parallel plates:

FCasimir D �@

@d

�EVL2

�D �

π2„c240d4 D

1.3 � 10�3

fd[μm]g4N/m2 (5.165)

Practically, the above value has to be corrected for nonperfect conductivity and forthe finite temperature environment. When an electric potential of 17 mV is ap-plied between the plates at the distance of d D 1 μm, the Coulomb force becomesstronger, and therefore it is a very difficult experiment to prove the formula. Ex-perimentally, it is difficult to configure two parallel plates uniformly separated by

Page 150: Yorikiyo Nagashima - Archive

5.5 Vacuum Fluctuation 125

a distances of less than a micrometer, a minimum requirement for the Casimirforce to become dominant over gravity between the plates. Therefore, one plate isreplaced by a sphere of radius R. For R d, where d is the minimum distance be-tween the sphere and the plate, the Casimir force acting on the sphere is modifiedto

Fsphere D 2πRE D � π3

360R„cd3 (5.166)

The first successful experiment [249] proved the existence of the Casimir force to anaccuracy of better than 5%. Recent experiments (Fig. 5.2) have reached an accuracyon the 1% level.

Figure 5.2 (a) The experimental apparatusof Harris et al. [197]. A metal sphere is sup-ported by a cantilever. When the Casimir forceacts on the sphere, the cantilever bends andits motion is measured by the displacement

of the laser reflection at the photodiodes. (b)Experimental data on the Casimir force as afunction of the plate–sphere distance. Thedata exactly fit the theoretical curve.

Page 151: Yorikiyo Nagashima - Archive

126 5 Field Quantization

Strange Properties of the VacuumNegative Pressure A vacuum filled with zero-point energy has the same density every-where. This means that the vacuum has negative pressure. Compressed air confinedin a balloon has outward pressure. Does the negative pressure of the vacuum produceinward forces? If we confine a small volume of vacuum in a cylinder and considerthe work the vacuum does outside as depicted in Fig. 5.3, P D �� can be shown.Another way is to use the first law of thermodynamics. For an isolated system, theinternal energy decreases if work is done to the outside world, namely dU D �P dV .Since U D �V , we obtain

d�

dVD �

�C PVD 0 , ) P D �� (5.167)

Figure 5.3 Negative pressure of vacu-um. When confined vacuum in a box withvolume V expands, exerting force on themoving wall, the pressure is given by

P D F/S D �@E/@V D �(1/S)(@E/@x).Since the total vacuum energy in a volumeis �v V(�v D constant), P D ��v .

Antigravity We know our universe is expanding, and until only a few years ago itsrate was thought to be decreasing because the gravitational force of matter in theuniverse counteracts it. Intuitively one might think the space filled with vacuum en-ergy would shrink even more. On the contrary, it accelerates the expansion accordingto general relativity. Namely, the vacuum energy has repulsive gravitational force.Whether the universe keeps expanding or eventually collapses depends critically onthe balance between matter and vacuum energy in the universe. It is known that,in cosmic history, there were two epochs in which the vacuum energy dominated:the inflation epoch and now. The vacuum energy of the former could be of dynam-ic origin, where the field (dubbed inflaton) breaks a symmetry spontaneously andliberates the vacuum energy in the form of latent heat that has disappeared as timewent by. But the vacuum energy (also known as dark energy), which occupies� 70%of the present cosmic energy budget, seems static, i.e. has constant energy densi-ty (see Section 19.2.7). This means it could also have existed in the inflation epoch.However, the observed value differs from that of the inflation by some 120 orders ofmagnitude! Remember, for instance, that a tiny quark (< 10�18 cm) could be largerthan the whole universe (> 130 billion light years) if magnified by a mere 50 ordersof magnitude. How could nature provide two kinds of energy of such magnitude?This is a big number. This is the notorious cosmological constant problem. The con-stant vacuum energy density means also that the total cosmic energy increases as theuniverse expands. Is the law of energy conservation violated?

Page 152: Yorikiyo Nagashima - Archive

127

6Scattering Matrix

Clues to clarify the inner structure of particles can be obtained, in most cases, by in-vestigating their scattering or decay patterns. In nonrelativistic quantum mechan-ics, the transition rates of these processes are usually calculated using perturbationexpansions. The underlying assumption is that the interaction energy, i.e. the ex-pectation value of the interaction Hamiltonian, is small compared to the systemenergy, which is represented by the total Hamiltonian H. The transition rate inquantum mechanics is given by the so-called Fermi’s golden rule:

W f i D 2πjh f jT jiij2� (6.1a)

h f jT jii D h f jHI jii CXn¤i

h f jHI jnihnjHI jiiEi � En

(6.1b)

Here, h f jHI jii is the matrix element of the interaction Hamiltonian between theinitial state and the final state. � is the final state density, and in the case of a one-body problem with a scattering center it is expressed as

� DdndED δ(Ei � E f )

Vd3 p(2π)3 (6.1c)

where V is the normalization volume. The purpose of this chapter is to derive rela-tivistic version of Eqs. (6.1a)–(6.1c) and prepare materials. Namely, we calculate therelativistic version of the vertex functions (hnjHI jii) of the electromagnetic interac-tion and the propagators [1/(Ei � En)] in Eq. (6.1b). QED calculation of the variousprocesses, which amounts to assembling these materials in the form of scatteringmatrices, will be dealt with in the next chapter.

6.1Interaction Picture

In quantized field theory, the field operators in the Heisenberg picture obey theHeisenberg equation of motion:

dOH

d tD i [H, OH] , OH D O(φH, πH) (6.2)

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 153: Yorikiyo Nagashima - Archive

128 6 Scattering Matrix

where “O” means any operator composed of the field φ and its conjugate π. Stateson which the operators act are the Fock vectors jαHi, which are constant in time.So far we have expressed every physical object in terms of the free fields. From nowon we want to handle interacting fields. The Heisenberg picture naturally takes in-to account interactions when the Hamiltonian contains the interaction. However,generally we do not know how to express the exact solutions in analytical form,if they can be solved formally. The standard procedure is that we start from thefree field, for which the exact solutions are known, and treat the interaction assmall perturbation. This is the most practical way to approach realistic problems.It is convenient to introduce the interaction picture to handle the perturbative ex-pansion. To formulate it we start from the Schrödinger picture. In the Schrödingerpicture, operators are constants of time and all the time dependence is contained inthe state vectors. In the following, we attach the subscripts S, H and I for operatorsand vectors in the Schrödinger, Heisenberg and interaction pictures, respectively.The Schrödinger equation is expressed as

i@

@tjαS(t)i D HSjαS(t)i (6.3)

A formal solution can be written as

jαS(t)i D e�i HS t jαS(0)i (6.4)

where jαS(0)i is to be determined by initial conditions. The Heisenberg picture canbe obtained by a unitary transformation

jαH(t)i D e i HS t jαS(t)i D jαS(0)i (6.5a)

φH D e i HS t φS e�i HS t , πH D e i HS t πSe�i HS t (6.5b)

OH D e i HS t OS(φS, πS)e�i HS t D O(φH, πH) (6.5c)

HH D e i HS t HS e�i HS t D HS D H(φH, πH) (6.5d)

Note, the last equation tells us that the Hamiltonian in the Heisenberg picture ex-pressed in terms of the Heisenberg operators is the same as that in the Schrödingerpicture. So we shall simply use H to denote the total Hamiltonian in both pictures.Despite the time independence of the Heisenberg state vector, the time variableis retained. It does not mean the state changes with time, but rather it should beinterpreted as reflecting the dynamical development of the operator at time t. TheHeisenberg equation (6.2) can easily be derived from Eq. (6.5c).

Page 154: Yorikiyo Nagashima - Archive

6.1 Interaction Picture 129

The interaction picture is obtained by first separating the total Hamiltonian intothe free and interacting parts:

H D H0 C HIS (6.6a)

jαI(t)i D e i H0 t jαS(t)i (6.6b)

φI(t) D e i H0 t φSe�i H0 t , πI(t) D e i H0 t πSe�i H0 t (6.6c)

The equation of motion that jαI(t)i satisfies is given by

i@

@tjαI(t)i D e i H0 t

��H0 C i

@

@t

�jαS(t)i D HI(t)jαI(t)i (6.6d)

HI(t) D e i H0 t HISe�i H0 t D HI[φI(t), πI(t)] (6.6e)

The operators in the interaction picture satisfy the free field equation

dφI(t)d t

D i [H0, φI(t)] (6.6f)

where H0 D H0fφI(t), πI(t)g. State vectors and operators in the Heisenberg andthe interaction picture are related by a unitary transformation:

jαH(t)i D e i H tjαS(t)i D e i H t e�i H0 t jαI(t)i � U†(t)jαI(t)i (6.7a)

OH D U†(t)OIU(t) (6.7b)

The time dependence of the operators in the interaction picture is dictated by H0,and that of the state vector by HI. Therefore, if HI vanishes, jαI(t)i does not dependon time and the interaction picture coincides with the Heisenberg picture. Theequation of motion for the state vector jαI(t)i resembles the Schrödinger equationbut the Hamiltonian HI includes the free field operators that are time dependent.

A convenient feature of the interaction picture is that the field operators alwayssatisfy the equation of free fields and that the field expansion developed in the pre-vious section can be directly applied. That is no longer true for fields in the Heisen-berg picture. The interacting Klein–Gordon field, for instance, can be expandedjust like the free field

'H(x ) DX

k

1p

2ωV

haH(k)e�i k �x C b†

H(k)e i k �xi

(6.8)

However, the coefficients of the plane waves are related with those of free fields by

aH(k) D U†(t)afree(k)U(t) , b†H(k) D U†(t)b†

free(k)U(t) (6.9)

and no longer retain their simple physical interpretation as annihilation and cre-ation operators for single quantum of the field. Note, the interacting fields satisfy

Page 155: Yorikiyo Nagashima - Archive

130 6 Scattering Matrix

the same commutation relation as the free fields, because the commutation rela-tion is invariant under unitary transformation.

What we shall need in the following is the transition amplitude from, say, a statejα(t0)i at time t0, which develops to jα(t)i at time t, then transforms to anotherstate j(t)i:

h f jii D hH(t)jαH(t)i D hS(t)je�i H(t�t 0)jαS(t0)i (6.10a)

where we have used Eqs. (6.5a) and (6.4). The transition amplitude does not dependon which picture is used because they are related by unitary transformations. Nowwe express it in the interaction picture. Using Eq. (6.7a), they can be expressed as

h f jii D hI(t)jαI(t)i D hI(t)je i H0 t e�i H(t�t 0) e�i H0 t 0jαI(t0)i (6.10b)

jαI(t)i D U(t, t0)jαI(t0)i (6.10c)

U(t, t0) D e i H0 t e�i H(t�t 0) e�i H0 t 0D U(t)U†(t0) (6.10d)

U(t, t0) is referred to as the time evolution operator. U(t, t0) is manifestly unitaryand satisfies the following identity (for t1 � t2 � t3):

U(t1, t2)U(t2, t3) D U(t1, t3)

U(t1, t3)U†(t2, t3) D U(t1, t2)

U(t, t) D 1

(6.11)

If the interaction works only in a limited volume, an assumption which is gen-erally justified, it vanishes for t ! 1, t0 ! �1 in scattering processes.1) Then,limt!�1 jαI(t)i, limt!1 jI(t)i can be taken as free Hamiltonian eigenstates.We may call the states prepared before scattering “in” states and those after scat-tering “out” states. As they are both eigenstates of the free Hamiltonian and areconsidered to make a complete orthonormal basis, they are connected by a uni-tary transformation, which we call the scattering matrix and denote as S. ThenEq. (6.10b) and Eq. (6.10c) give

S D U(1,�1) (6.12a)

h f jii D houtjαini D hinjS jαini D houtjS jαouti (6.12b)

houtj D hinjS , or jαini D S jαouti (6.12c)

The mean transition probability per unit time is given by

W f i D limt!1

t0!�1

jhI(t)jU(t, t0)jαI(t0)i � δ f i j2

t � t0(6.13)

1) The interaction takes place typically within the scale of hadron size � 10�15 m. If a particletraverses the region at the velocity of light, the typical time scale of scattering is � 10�24 s.Comparatively speaking, the human time scale, for instance “a second”, is infinite. For along-range Coulomb-like force, the interaction range is, in principle, infinite but a screening effectdue to the dielectricity of matter effectively damps the force beyond a few atomic distances.

Page 156: Yorikiyo Nagashima - Archive

6.2 Asymptotic Field Condition 131

where by subtracting δ f i , we have eliminated the process where the incomingparticles pass without interaction. If the asymptotic states are represented by planewaves of definite momentum, the wave function extends over all space and cannever be free. For correct treatment we need to use a wave packet formalism, buthere we simply assume an adiabatic condition is implicit, in which the interactionHamiltonian is slowly turned on and off at jtj ! 1 so that there is no energytransfer to the system. A mathematical technique to warrant the assumption is tomodify

HI(t)! H εI , H ε

I D HIe�jεjt (6.14)

and let ε go to zero after all the calculations. This procedure is called the adiabaticapproximation.

6.2Asymptotic Field Condition

We consider an explicit representation for the interacting fields. In the relativisticaction approach, the interaction Lagrangian is just minus the free Hamiltonian inthe absence of derivative couplings. The fundamental interactions in the real worldare governed by gauge theories. Other types of interactions appear generally asapproximations in phenomenological approaches. In gauge theories, replacementof derivatives by covariant derivatives produces the interaction term and gives thesource of the electromagnetic (or more generally gauge) field:

ψγ μ i@μ ψ(x )! ψγ μ i Dμ ψ(x ) D ψγ μ i@μ ψ(x )� qψγ μ ψA μ (6.15a)

@μ F μν D q J ν D qψγ ν ψ (6.15b)

The interaction of quarks is governed by QCD, which is also a gauge theory withthe color degree of freedom obeying S U(3) internal symmetry. However, the pion,a composite of quarks, can phenomenologically be treated as a (pseudo)scalar field,which obeys the Klein–Gordon equation. Its interaction with the nucleon field ψN

is described by the Yukawa interaction, whose interaction Lagrangian is expressedas

LINT D �gψN(x )γ 5ψN(x )π(x ) (6.16)

The equation of motion can be expressed in a similar form as for the electromag-netic field. Setting gψN(x )γ 5ψN(x ) D J

Kx φ(x ) � (@μ@μ C m2)x φ(x ) D J(x ) (6.17)

For the following discussions, we generically write the source term as J or J μ ,depending on whether it is a scalar or a vector, and do not concern ourselves withits detailed structure.

Page 157: Yorikiyo Nagashima - Archive

132 6 Scattering Matrix

Let us solve Eq. (6.17) formally. Denoting the Green’s function by G(x � y ), wehave, by definition�

@μ@μ C m2�x G(x � y ) D �δ4(x � y ) (6.18)

The general solution to Eq. (6.17) is the sum of the homogeneous solutions (freefield) to the equation and the integral of the Green’s function times the inhomoge-neous term:

φ(x ) D φfree(x ) �Z

d4 y G(x � y ) J(y ) (6.19)

We have to specify the boundary conditions to determine φfree. The Green’s func-tion itself is not unique, either. Here we choose to use the advanced and retardedGreen’s function

Δret(x � y ) D 0 for x0 � y 0 < 0 (6.20a)

Δadv(x � y ) D 0 for x0 � y 0 > 0 (6.20b)

Then the second term in Eq. (6.19) vanishes as t ! �1, depending on whetherwe use the retarded or advanced functions. Therefore, we may identify the free fieldas φ in

out, which are generators of the “in” and “out” states:

φ(x ) D φin(x )CZ

d4 yΔret(x � y ) J(y ) (6.21a)

φ(x ) D φout(x )CZ

d4 yΔadv(x � y ) J(y ) (6.21b)

The “in” and “out” fields are related by

φin D S φoutS�1 (6.22)

This can be easily seen as

a†in(p ) D S a†

out(p )S�1 aout(p 0) D S�1ain(p 0)S (6.23)

and

jp ini � a†in(p )j0i D S a†

out(p )S�1j0i D S a†out(p )j0i � S jpouti (6.24)

where�means “equal” except for the normalization. We also assumed the stabilityof the vacuum, namely S j0i D j0i apart from the phase, which can be set as 1.

Since φ(x ) approaches φoutin

(x ) asymptotically in the limit t ! ˙1, we aretempted to think

φ(x )t!˙1�����!

pZ φout

in(x ) (6.25)

Page 158: Yorikiyo Nagashima - Archive

6.3 Explicit Form of the S-Matrix 133

which is called the “strong” asymptotic condition. But it has been shown that if itis imposed as an operator condition, no scattering occurs. Therefore, we impose a“weak” asymptotic condition

limt!˙1

h f jφ(x )jii Dp

Zh f jφoutin

(x )jii (6.26)

The renormalization constant Z appears because, φ has ability to produce extravirtual states and reduce the probability amplitude of remaining in a state that φfree

can create. However, for most of the following discussions we shall assume Z D 1.We need to include Z explicitly when considering the higher order corrections,which we will treat in Chap. 8.

6.3Explicit Form of the S-Matrix

We now want to calculate the time evolution operator. Inserting Eq. (6.10c) intoEq. (6.6d), we obtain an equation for U:

i@

@tU(t, t0) D HIU(t, t0) (6.27)

The initial condition is

U(t0, t0) D 1 (6.28)

Using Eq. (6.28), Eq. (6.27) can be converted to an integral equation

U(t, t0) D 1 � iZ t

t0

d tHI(t)U(t, t0) (6.29)

Expanding successively,

U(t, t0) D 1� iZ t

t0

d tHI(t)C (�i)2Z t

t0

d t1

Z t1

t0

d t2 HI(t1)HI(t2)C� � � (6.30a)

) S D 1� iZ 1

�1d tHI(t)C (�i)2

Z 1

�1d t1

Z t1

�1d t2HI(t1)HI(t2)C � � �

(6.30b)

Interchanging the variables t1 $ t2 in the third term and changing the integrationorder givesZ 1

�1d t1

Z t1

�1d t2HI(t1)HI(t2) D

Z 1

�1d t1

Z 1

t1

d t2HI(t2)HI(t1)

D12

Z 1

�1d t1

�Z t1

�1d t2HI(t1)HI(t2)C

Z 1

t1

d t2 HI(t2)HI(t1)�

D12

Z 1

�1d t1

�Z 1

�1d t2T [HI(t1)HI(t2)]

� (6.31)

Page 159: Yorikiyo Nagashima - Archive

134 6 Scattering Matrix

where T [� � � ] is a time-ordered product defined as

T [HI(t1)HI(t2)] D

(HI(t1)HI(t2) t1 > t2

HI(t2)HI(t1) t1 < t2(6.32)

Expressing HI in terms of the Hamiltonian density

HI(t) DZ

d3xH (t, x ) �Z

d3xH (x ) (6.33)

The S matrix can be expressed as

S D 1� iZ

d4xH (x )C(�i)2

2!

“d4x1d4x2T [HI(x1)HI(x2)]C � � �

C(�i)n

n!

“� � �

Zd4x1d4x2 � � � d4xn T [HI(x1)HI(x2) � � �HI(xn)]C � � �

� T exp��iZ

d4xHI(x )�

(6.34)

Note that the last equation is a formal expression. Its real meaning is given bythe preceding equation. As we can see from Eq. (6.34), the integration is carriedout over all space-time and HI D �LI if the interaction does not contain timederivatives, and hence relativistically invariant. The effect of multiplying by T isa bit uncertain, but it does not destroy Lorentz invariance either, if the operatorcommutator vanishes in the space-like distance.

Next we define the transition operator T and the Lorentz-invariant matrix ele-ment M f i by

h f jS jii D δ f i � i2πδ(Ei � E f )h f jT jii

D δ f i � i(2π)4δ4(Pi � P f )h f jMjii (6.35)

where Pi and P f represent the sum of the initial- and final-state energy-momen-tum, respectively. Although we obtained a formal solution Eq. (6.34) for the scat-tering matrix, expansion up to second order is enough for most of our later treat-ments. In the nonrelativistic treatment, we may extract and separate time depen-dence explicitly.

h f jHI(t)jii D h f je i H0 t HISe�i H0 t jii D h f jHISjiie�i(Ei�E f )t (6.36)

Inserting the above equation into Eq. (6.30), we obtain an equivalent formula toEq. (6.35) as

h f jS jii D δ f i � 2π i δ(Ei � E f )h f jT(E D Ei)jii (6.37a)

h f jT(Ei)jii D h f jHISjii CX

n

h f jHISjnihnjHISjiiEi � En C i ε

C � � � (6.37b)

Page 160: Yorikiyo Nagashima - Archive

6.3 Explicit Form of the S-Matrix 135

where the adiabatic approximation H εI D HIe�εjtj was used in the calculation.

Thus we have re-derived Eq. (6.1b). We see the adiabatic approximation gives aprescription for handling the singularity at Ei D En . T(E ) can be expressed inoperator form as

T(E ) D HI C HI1

E � H0 C i�HI

C HI1

E � H0 C i�HI

1E � H0 C i�

HI C � � � (6.38)

where we have used HI for HI(0) D HIS. This equation is the basis for the non-relativistic treatment of perturbation theory, but sometimes it is more useful thancompletely relativistic equations.

Problem 6.1

Replacing HI with H εI of Eq. (6.14), derive Eq. (6.37) from Eq. (6.30).

6.3.1Rutherford Scattering

Before using the relativistic scattering amplitude, let us reproduce the Rutherfordscattering formula for later comparison. This is scattering of a charged particlehaving spin 0 by the Coulomb potential created by a scattering center. The matrixelement in Eq. (6.37) is expressed as

h f jHISjii DZ

d3 x ψ�f (x )HISψ i (x ) (6.39a)

HIS DZ e2

4π1r

(6.39b)

If the 3-momentum of the incoming and outgoing particles are assumed to be p iand p f , respectively, the wave functions are given by

ψ i D N ei p i �x , ψ f D N ei p f �x (6.40)

where N D 1/p

V is a normalization factor, normalized as one particle per unitvolume. Inserting Eqs. (6.39b) and (6.40) into Eq. (6.39a)

h f jH jii DZ e2

4πV

Zd3x

ei(p i �p f )�x

r, r D jx j (6.41)

Putting q D p i � p f and carrying out the integration we find

I �Z

d3xei q�x

rD 2π

Z 1

0drZ 1

�1dz

ei qr z

rr2

D2πi q

Z 1

0dr(e i qr � e�i qr) (6.42a)

Page 161: Yorikiyo Nagashima - Archive

136 6 Scattering Matrix

The integral diverges formally, but we may consider that the potential has a damp-ing factor e�μ r due to the screening effect of the surrounding matter. Then

I D2πi q

Z 1

0dr

e�μ rCi qr � e�μ r�i qr�D

2πi q

�1

μ � i q�

1μ C i q

D4π

q2 C μ2 !4πq2 (6.42b)

Inserting Eq. (6.42b) into Eq. (6.41)

h f jH jii D�

Z e2

4πV

�4πq2 D

Z αV

4πq2 , α D

e2

4π'

1137

(6.43a)

q2 D jp i � p f j2 D 4p 2 sin2 θ

2(6.43b)

where θ is the angle between p i and p f , and jp i j D jp f j D p by energy conser-vation. The final state density is calculated to be

� D δ(Ei � E f )Vd3 p(2π)3 D V

�dEd p

��1 p 2dΩ8π3 D V

m p dΩ8π3 (6.44)

where we used E D p 2/2m and integrated over E. Inserting Eqs. (6.43) and (6.44)into Eq. (6.1a)

W f i D 2π�

Z αV

�2 �4πq2

�2

Vm p dΩ

8π3D

pV

4Z2α2mq4

dΩ (6.45)

To convert the transition probability into a cross section, it has to be divided byincoming flux. The flux is the number of particles passing through unit area inunit time. When the wave function is normalized to one particle per unit volume,the incoming flux can be given as v/V , where v D p/m is the velocity of theparticle,

) dσdΩD

4Z2α2m2

q4 DZ2α2m2

4p 4 sin4(θ /2)(6.46)

The normalization volume disappears from the final expression as it should. Thisis the famous Rutherford scattering formula and coincides with that derived clas-sically.

6.4Relativistic Kinematics

6.4.1Center of Mass Frame and Laboratory Frame

In dealing with reactions of elementary particles, there are two commonly used ref-erence frames. For theoretical treatments and for collider accelerator experiments,

Page 162: Yorikiyo Nagashima - Archive

6.4 Relativistic Kinematics 137

the center of mass (CM) frame, in which total momentum is zero, is used. Beforethe advent of colliders, most experiments were of the type that a beam extractedfrom an accelerator hits a target at rest. This is referred to as the laboratory (LABfor short) frame, and most nuclear and some particle experiments today are of thistype. They are connected by Lorentz transformation and relations that will becomeuseful later are derived here. Let us consider two-body reactions 1C 2! 3C 4. Weassign numbers to particles as follows: incoming (1), target (2), scattered (3) andrecoiled (4), and denote 4-momenta and masses as p1, p2, p3, p4Im1, m2, m3, m4.We indicate variables in the CM frame by an asterisk. In the laboratory frame, theyare expressed as

p1 D (E1, p2), p2 D (m2, 0), p3 D (E3, p3) , p4 D (E4, p 4) (6.47)

and in the CM frame

p �1 D

�E�

1 , p�� , p �2 D

�E�

2 ,�p�� , p �3 D

�E�

3 , p 0 �� ,

p �4 D

�E�

4 ,�p 0 �� (6.48)

Graphically they are expressed as in Fig. 6.1. By energy-momentum conservation

1 1 2

4

33

p’*

p*–p’*

–p*φ

θ θ*

4

2

(a) (b)

Figure 6.1 Two-body scattering in (a) the laboratory frame and (b) the center of mass frame.

p1 C p2 D p3 C p4 (6.49)

Consider the 1 C 2 particle system as composing a single particle having Etot D

E1 C E2 and p tot D p 1 C p 2, then in the CM frame the energy W D E�tot can

be considered as the mass of the two-body system because p �tot D 0 by definition.

The Lorentz factors and γ to transform the LAB frame to the CM frame can beobtained by applying one-body formulae ( D p/E , γ D E/m) to the system, i.e.

Dp tot

EtotD

p1

E1 C m2, γ D

1p1� 2

DE1 C m2

W(6.50a)

W D�m2

1 C 2E1m2 C m22

�1/2(6.50b)

Page 163: Yorikiyo Nagashima - Archive

138 6 Scattering Matrix

Now that we know the Lorentz factors to connect variables in the two frames, wecan apply the standard Lorentz transformation formula to each particle:

ELAB D γ�E� C p �

k�

(6.51a)

pk LAB D γ�E� C p �

k�

(6.51b)

p? LAB D p �? (6.51c)

where pk, p? are components of the momentum along and perpendicular to thez-axis (direction of CM frame motion):

pk LAB D pLAB cos θ , p? LAB D pLAB sin θ (6.52a)

p �k D p � cos θ � , p �

? D p � sin θ � (6.52b)

An important point here is that the transverse components are not affected by theLorentz transformation. This fact is extensively utilized in analyzing particles, forexample sorting b-quarks out of many-particle jets.2)

Problem 6.2

Prove that particles that are distributed isotropically in the CM frame (dN/dΩ � Dconstant), are distributed uniformly in energy in the LAB frame (dN/dE D

constant).

Although Eqs. (6.52) are the bases for connecting variables in the two frames, itis more convenient in actual calculations to use the Lorentz-invariant Mandelstamvariables, defined as follows:

s � (p1 C p2)2 D (E1 C E2)2 � (p1 C p 2)2 (6.53a)

t � (p1 � p3)2 D (E1 � E3)2 � (p1 � p3)2 (6.53b)

u � (p1 � p4)2 D (E1 � E4)2 � (p1 � p4)2 (6.53c)

As they are Lorentz scalars, their values do not change between frames. For in-stance

s D

(�E�

1 C E�2

�2 : CM

(E1 C m)2 � p21 D m2

1 C 2E1m2 C m22 : LAB

(6.54)

As s D W 2, s can be considered as the invariant mass (squared) of the two-bodysystem. Writing s in terms of CM variables

ps D W D E�

1 C E�2 D

qp�2 C m2

1 C

qp�2 C m2

2 (6.55)

2) B hadrons, which contain a b (or b) quark, have mass around mB � 5 GeV. When they areproduced in high-energy collisions, they typically appear as jets in which many hadrons flymore or less in the same direction having high longitudinal momentum but limited transversemomentum pT < mB/2 relative to the jet axis.

Page 164: Yorikiyo Nagashima - Archive

6.4 Relativistic Kinematics 139

we can solve the CM energy in terms of s

E�1 D

s C m21 � m2

2

2p

s, E�

2 Ds � m2

1 C m22

2p

s(6.56a)

p � Dq

E�1

2� m2

1 Dλ�s, m2

1, m22

�2p

sD

ps

2λ(1, x1, x2) (6.56b)

λ(x , y , z) D (x2 C y 2 C z2 � 2x y � 2y z � 2zx )1/2 (6.56c)

x1 D m21/s, x2 D m2

2/s (6.56d)

jtj is referred to as the momentum transfer (squared) and can be written downsimilarly

t D (E1 � E3)2 � (p1 � p3)2 D m21 C m2

3 � 2(E1E3 � jp1jjp3j cos θ ) : LAB

(6.57)

θ is the scattering angle and is defined by cos θ D (p2 � p3)/jp1jjp3j. In most high-energy experiments we are interested in E m , E ' p D jp j, then

t ' �2p1 p3(1� cos θ ) ' �4E1E3 sin2 θ2

(6.58)

t has the same form either in LAB or in CM. The role of u is similar to t, exceptparticle 3 is replaced by particle 4. In CM, t and u are expressed in the relativisticapproximation (E m),

t ' �2p �1 p �

3 (1� cos θ ) (6.59a)

u ' �2p �1 p �

3 (1C cos θ ) (6.59b)

From the above formula, we see that s gives the total energy, while t, u give themeasure of the scattering angle. When the masses can be neglected, i.e. s m2

i ,the ranges of the variables are s � 0, t 0, u 0.

Problem 6.3

Show that s, t, u are not independent but satisfy the relation

s C t C u D4X

iD1

m2i (6.60)

6.4.2Crossing Symmetry

The physical meanings of the three variables are apparently different, but consid-ering the fact that mathematically antiparticles appear as having negative energy-

Page 165: Yorikiyo Nagashima - Archive

140 6 Scattering Matrix

momentum and as backward-traveling particle, we can show that the roles of s, tand u are equal and symmetric. Take the example of π–proton elastic scattering(Fig. 6.2a)

πC(p1)C p (p2) ! πC(p3)C p (p4) (6.61)

By changing the variables

p1 ! �p1, p4 ! �p4 (6.62)

a process we call “crossing”, the physical meaning of the scattering is modified tothat of Fig. 6.2b

p (p4)C p (p2) ! π�(p1)C πC(p3) (6.63)

Because by the change in Eq. (6.62) the Mandelstam variables are changed to

s D (p1C p2)2 D (�p4� p3)2 ! u t � (p4� p3)2 � �2p 2t (1Ccos θt ) (6.64a)

t D (p1 � p3)2 D (�p4 C p2)2 ! s t � (p4 C p2)2 � 4p 2t (6.64b)

u D (p1 � p4)2 ! tt � (p4 � p1)2 � �2p 2t (1� cos θt ) (6.64c)

pt represents the momentum in the p p CM frame, θt is the angle that the outgo-ing π� makes with the incoming antiproton p . Here, t � 0, s, u 0 and t � s t ,u � tt , s � u t play the role of s, t, u in reaction (6.61). The reactions (6.61) and(6.63) are referred to as s-channel and t-channel reactions, respectively. Similarly,the u-channel reaction (Fig. 6.2c) can be defined. Thus, the three variables s, t, u canbe negative as well as positive and, depending on their value, represent three differ-ent reactions. This means that if the scattering amplitudes are written in terms ofs, t, u, other crossed-channel reactions are obtained immediately by crossing. Fig-ure 6.3 shows the three physical regions in an oblique coordinate frame, for thecase where the pion mass is μ and the proton mass M.

Figure 6.2 Crossing symmetry. Mutual relations of three different scatterings.

Page 166: Yorikiyo Nagashima - Archive

6.5 Relativistic Cross Section 141

Figure 6.3 Three physical regions of s, t, u in an oblique coordinate frame. A point in the planerepresents (s, t, u) by its signed perpendicular length to the s D 0, t D 0, u D 0 lines.

6.5Relativistic Cross Section

6.5.1Transition Rate

We first derive the relativistic version of Eq. (6.1a) (W f i D 2πjT f i j2�). The mean

transition probability per unit time is given by Eq. (6.13), so referring to Eq. (6.35)

jS f i � δ f i j2 D j � (2π)4 i δ4(pi � p f )M f i j

2

D (2π)4δ4(pi � p f )�Z

d4x ei(p i�p f )�x�jM f i j

2

D (2π)4δ4(pi � p f )�Z

d4x�jM f i j

2

D (2π)4δ4(pi � p f )V T jM f i j2

(6.65)

Here, the δ-function was first expressed as an integral and we used the fact that thetotal integrated volume V T is large on microscopic scale but finite on macroscopicscale. The transition rate per unit time per unit volume is obtained by dividing W f i

by V T and multiplying by the final state density � f :

W f i D (2π)4δ4(pi � p f )jh f jMjiij2� f (6.66)

Then the cross section is obtained by dividing W f i by the incoming flux and targetparticle density.

Page 167: Yorikiyo Nagashima - Archive

142 6 Scattering Matrix

6.5.2Relativistic Normalization

In the relativistic treatment, the usual normalization conditionZ�(x )dV D 1 (6.67)

is unsatisfactory. For flying particles the volume element dV is Lorentz contracted,and the space integral in the rest frame overestimates the volume by γ D E/m.It is better to adopt a normalization that is proportional to γ . Therefore, startingfrom this section, we adopt relativistic normalization, which we choose to setZ

�dV D 2E (6.68)

Writing plane-wave solutions obeying the Klein–Gordon equation as ' D N e�i p �xand those obeying the Dirac equation as ψ D N u r(p )e�i p �x, the normalizationconditions for both of them areZ

�d3x KGD

Zd3x i

�'� P' � P'�'

D 2E N 2V boson (6.69a)

DiracD

Zd3x ψ�ψ D N 2V u†

r (p )u(p )D 2E N 2V fermion (6.69b)

In both cases, they give N D 1/p

V . The condition Eq. (6.68) means the wavefunctions of the Dirac and Klein–Gordon fields are normalized as

f p (x ) D1p

Ve�i p �x boson (6.70a)

Up r D u r(p ) f p (x ) fermion (6.70b)

In the above treatment we integrated the density over a large but finite volume V DL3, which in turn means that the momentum is discrete. If we impose periodicboundary conditions for the plane wave,

e i px (xCL) D e i px x ,! px L D 2πnx , nx D 0,˙1,˙2, � � � ,

) Δnx DL

2πΔpx , Δny D

L2π

Δp y , Δnz DL

2πΔpz

(6.71)

then the number of wave states in the volume is given by

p D (px , p y , pz ) , p i L D 2πni ni D 0,˙1,˙2, � � � (6.72a)

Δn D�

L2π

�3

Δ p (6.72b)

The discrete sum becomes an integral in the large volume limit:

Xp

DX

n

Δn D�

L2π

�3 Xp

Δ p ! VZ

d3 p(2π)3 (6.72c)

Page 168: Yorikiyo Nagashima - Archive

6.5 Relativistic Cross Section 143

The integral of the plane waves in discrete or continuous format is

1V

Zd3x ei(p�p 0)�x D δ p ,p 0

V!1�����!

(2π)3

Vδ3(p � p 0) (6.73)

Proof:

1L

Z L/2

�L/2e i(p x�p 0

x )x dx D

8<:

1 I (px D p 0x )

2L(px � p 0

x )sin

L2

(px � p 0x ) D 0 I (px ¤ p 0

x )

* px L D 2πn , (n D 0,˙1,˙2 � � � ) (6.74)�

The one-particle state orthogonality and completeness conditions are now

hp 0jp i D(2π)3

V2Ep δ3(p 0 � p )

1 D VZ

d3 p(2π)32Ep

jpihp j(6.75)

The commutator of the discrete creation/annihilation operators becomes in thecontinuous limith

a p r , a†p 0s

i˙D

(2π)3

Vδ r s δ3(p � p 0) (6.76)

Then the one-particle state is expressed in terms of the creation operator as

jpi Dp

2E a†(p )j0i (6.77)

Having defined normalizations with finite V, we set the volume as 1 from now onunless otherwise noted, because they always disappear from the final expressionfor measurable quantities.3)

With the above definitions, the expansion formulae for the fields now become

'(x ) DX

k

1p

hak e�i k �x C b†

k e i k �xi

D

Zd3 k

(2π)3p

hak e�i k �x C b†

k e i k �xi

: scalar (6.78a)

ψ(x ) DZ

d3 p

(2π)3p

2Ep

Xr

ha p u r (p )e�i p �x C b†

p v (p )r e i p �xi

: Dirac

(6.78b)

Aμ(x ) DZ

d3k

(2π)3p

2ωXλ

hak λ εμ(k, λ)e�i k �x C a†

k λ εμ(k, λ)�e i k �xi

: photon (6.78c)

3) We shall prove this for the case of the cross section in the following section.

Page 169: Yorikiyo Nagashima - Archive

144 6 Scattering Matrix

The matrix elements of the field between the vacuum and one-particle state appearfrequently and we give them here:

h0j'(x )jki D e�i k �x hkj'†(x )j0i D e i k �x (6.79a)

h0jψ(x )jp , ri D u r(p )e�i p �x hp jψ(x )j0i D ur (p )e i p �x (6.79b)

h0jAμ(x )jkλi D εμ(k, λ)e�i k �x hkjAμ(x )j0i D εμ(k, λ)�e i k �x (6.79c)

Note As we said, taking the continuous limit does not necessarily mean V isinfinite. Therefore δ(0) should be interpreted as

δ3(0) D1

(2π)3

ZV

d3x DV

(2π)3 (6.80)

Then

hp jpi D(2π)3

V2E δ3(0) D 2E (6.81)

6.5.3Incoming Flux and Final State Density

In the relativistic normalization, the final state density of one particle is given by

d� f DV

(2π)3

d3 p2E

(6.82)

for each particle in the final state. Let us name the incoming particle 1, the targetparticle 2 and assume N f particles are produced in the final state. The incomingflux is given by 2E1v12/V , where v12 is the incoming velocity relative to the target,and the target particle density is given by 2E2/V . Then the cross section for thereaction becomes

dσ DV 2

2E12E2v12jM f i j

2(2π)4δ4

0@p1 C p2 �

N fXf D1

p f

1A N fY

f D1

�Vd3 p f

(2π)32E f

�(6.83)

We define Lorentz-invariant flux F by

F � 4E1E2v12 D 4E1E2

�p1

E1�

p2

E2

�D

(4p � W CM

4p1m2 LAB

D 4�(p1 � p2)2 � m2

1m22

1/2D 2sλ(1, x1, x2)

λ(x , y , z) D (x2 C y 2 C z2 � 2x y � 2y z � 2zx )1/2

x1 D m21/s , x2 D m2

2/s

(6.84)

Page 170: Yorikiyo Nagashima - Archive

6.5 Relativistic Cross Section 145

As the state vector contains a factor 1/p

V for each particle, M f i produces a totalof (N f C 2) 1/

pV . Then naturally, all the volume V’s are canceled and disappear

from the expression of dσ. As V always disappears from the final expression formeasurable quantities, there is no harm in setting V D 1.

6.5.4Lorentz-Invariant Phase Space

We introduce Lorentz-invariant phase space by

dLI P S � (2π)4δ4

0@X

i

p i �X

f

p f

1A 1

m!

Yf

d3 p f

(2π)32E f(6.85)

where m! is a correction factor if identical particles are included in the state. Insert-ing Eqs. (6.84) and (6.85) into Eq. (6.83), we obtain the cross-section formula

dσ DjM f i j

2

2sλ(1, x1, x2)dLI P S (6.86)

The decay rate for particle 2 into many particles dΓ (2 ! 3 C 4C � � � ) is obtainedby dividing the transition rate W f i D W(2! 3C 4I � � � ) by the flux of particle 2 inits rest frame

dΓ DjM f i j

2

2m2dLI P S (6.87)

Equations (6.86) and (6.87) are the relativistic version of Fermi’s golden ruleEq. (6.1a).

6.5.5Cross Section in the Center of Mass Frame

Let us calculate the cross section for two-body scattering

1C 2! 3C 4

in the CM frame. Using Eq. (6.83), the cross section for two-body scattering is givenby

dσ D1

4E1E2v12jM f i j

2(2π)4δ4(p1C p2� p3� p4)d3 p3

(2π)32E3

d3 p4

(2π)32E4(6.88)

In CM, p1Cp2 D 0, E1CE2 D W . Writing p i D jp1j D j�p2j, p f D jp 3j D j�p4j,

v12 D

ˇˇ p1

E1�

p2

E2

ˇˇ D p i(E1 C E2)

E1E2D

p i WE1E2

(6.89a)

) F D 4E1E2v12 D 4pi W (6.89b)

Page 171: Yorikiyo Nagashima - Archive

146 6 Scattering Matrix

dLI P SCM D1

16π2 δ(W � E3 � E4)δ3(p3 C p4)d3 p3

E3

d3 p4

E4(6.89c)R

d3 p4 gives 1. Then to carry out p3 integration, we use the relation δ(G )d p f D

(@G/@p f )�1δ(G )dG :

G D E3 C E4 � W Dq

p 2f C m2

3 Cq

p 2f C m2

4 � W (6.89d)

@G@p fD

p f

E3C

p f

E4D

p f WE3E4

(6.89e)

) dLI P SF

D1

4pi W

�@G@p f

��1 p 2f dΩCM

16π2E3E4D

164π2 s

�p f

p i

�dΩCM

(6.89f)

We obtain

dσdΩ

ˇˇCMD

�p f

p i

�jM f i j

2

64π2 s(6.90)

Problem 6.4

Prove that the cross section in LAB for the case m1 D m3 D 0, m2 D m4 D M isgiven by

dσdΩ

ˇˇLABD

�p3

p1

�2 jM f i j2

64π2M2(6.91)

Problem 6.5

Sometimes it is useful to express the cross section in terms of the Lorentz-invariantvariable t rather than the scattering angle cos θ . Assuming m1 D m3 D m, m2 D

m4 D M , show that

d t D 2p1 p3d cos θ Dp1 p3

πdΩ '

(s � M2)2

4π sdΩCM (6.92a)

dσd tD

πp1 p3

dσdΩ

ˇˇCMD

jM f i j2

16πλ(s, m2, M2)2'

jM f i j2

16π(s � M2)2(6.92b)

where the last equation in both lines holds when s m2.Hint: Use

t D q2 D (p1 � p3)2 D 2m2 � 2(E1E3 � p1 p3 cos θ ) in any frame (6.93)

and in CM

p1 D p3 D

ps

2λ�

1,m2

s,

M2

s

�'

s � M2

2p

s(6.94)

Page 172: Yorikiyo Nagashima - Archive

6.6 Vertex Functions and the Feynman Propagator 147

6.6Vertex Functions and the Feynman Propagator

In Sect. 6.3 we derived an explicit formula for the scattering matrix that is therelativistic version of Eq. (6.1b):

h f jT jii D h f jHIjii CX

n

h f jHIjnihnjHIjiiEi � En

(6.95)

h f jHIjii is the first-order matrix element of the interaction Hamiltonian, but inFeynman diagrams, which will appear in abundance in the following, it is referredto as the vertex function and 1/(Ei � En) as the propagator, the relativistic versionof which is referred to as the Feynman propagator. It is necessary as well as eluci-dating to learn beforehand how elements h f jHIjii and 1/(Ei � En) that appear inthe above formula are expressed in quantized field theory. To calculate the vertexfunction, we need an explicit form of the interaction Hamiltonian. In the following,we mainly discuss the electromagnetic interaction.

6.6.1eeγ Vertex Function

As was described in Chap. 4, the interaction of electrons (q D �e) with the elec-tromagnetic field is given by the gauge principle, which is to replace derivativesby covariant derivatives @μ ! Dμ D @μ � i eA μ. This replacement produces theinteraction Hamiltonian density HI for a Dirac particle

HI D �Lint D qψγ μ ψA μ D q j μA μ (6.96)

Similarly, the interaction Hamiltonian for the complex scalar field is given by

L D @μ'†@μ' � m2'†' ) LCLI

HI D qA μ�i�'†@μ' � @μ'†'

�� q2A μ Aμ'†'

(6.97)

Consider a transition from an initial state with an electron having the energy-mo-mentum p and spin state r to a final state with an electron having p 0, s and a pho-ton with k and polarization λ. The expectation value or the matrix element of theHamiltonian is given by

hp 0 sI k λjHI(x )jp ri D �ehp 0sjψ(x )γ μ ψ(x )jp rihkλjA μ(x )j0i (6.98)

Here, we have separated the electron and photon, because their Fock spaces areindependent of each other. We have also expressed the sign of the charge explicitly(q D �e). The expectation value of the electromagnetic field is readily obtainedusing Eq. (6.78c)

hkλjA μ(x )j0i D εμ(k, λ)�e i k �x (6.99)

Page 173: Yorikiyo Nagashima - Archive

148 6 Scattering Matrix

We use Eq. (6.78b) for the expression of ψ, ψ. Considering jp ri � a†p r j0i, hp 0 sj �

h0ja p 0 s , the expectation value of the current � ψψ contains a term

h0ja p 0 a†s a t a†

p j0i (6.100)

where we have used the abbreviated notation p 0, p , t, s to denote both momentumand spin states:

at a†p D

�˚at , a†

p

�� a†

p atj0i D (2π)3δ3(t � p )j0i

h0ja p 0 a†s D h0j

�˚a p 0 a†

s

�� a†

s a p 0D h0j(2π)3δ3(s � p 0)

(6.101)

In deriving the last equation of both lines, we used the fact that the annihilation(creation) operator acting on the vacuum state from the left (right) annihilates thestates. Therefore

h0ja p 0 a†s a t a†

p j0i D (2π)6δ3(t � p )δ3(s � p 0) (6.102)

Taking into account the normalization factor (p

2Ep � � � ) for the states and integrat-ing over t, s (note we are also taking the sum over spin)

hp 0sjψ(x )γ μ ψ(x )jp ri D us(p 0)γ μ u r (p )e�i(p�p 0)�x (6.103)

Therefore

(a) e ! e C γ

hp 0sI k λjHIjp i D �eus(p 0)γ μ u r(p )εμ(k, λ)�e�i(p�p 0�k )�x (6.104a)

This matrix element corresponds to diagram (a) of Fig. 6.4. Notice the sign of theenergy momentum that appears in the exponent. The sign of the energy-momen-tum p of an incoming and annihilated particle in the exponent is negative and thatof outgoing or created particles is positive. As the interaction can happen anywherein space-time, the matrix element will be integrated over space-time, producingδ-functions. Note, the action is an integral of the Lagrangian over all space-time.Namely, the energy and the momentum are conserved at each interaction point orat the “vertex” of the diagram:

E D E 0 C ω, p D p 0 C k (6.104b)

In a similar way, the diagram in Fig. 6.4b corresponds to

(b) γ C e ! e

hp 0 sjHIjp rI kλi D �eus(p 0)γ μ u r (p )εμ(kλ)e�i(p�p 0Ck )�x (6.104c)

Next, we consider the interaction of an antiparticle. Corresponding to the diagramin Fig. 6.4c,

(c) eC ! eC C γ

hp 0 sI k λjHIjp ri D Cev r (p )γ μvs(p 0)εμ(kλ)�e�i(p�p 0�k )�x (6.105a)

Page 174: Yorikiyo Nagashima - Archive

6.6 Vertex Functions and the Feynman Propagator 149

Figure 6.4 Various processes of the electron–photon interaction in the lowest order that isgiven by LINT D �qA μ j μ . Time goes upward; a line with an upward-going arrow designates aparticle and one with a downward arrow, an antiparticle.

Here again, the sign of the energy-momentum in the exponent gives (�) for incom-ing and (C) for outgoing particles and there is no distinction between the particleand antiparticle. The remnant of the statement that the antiparticle goes backwardin time is on the order of v r(p )γ μ vs(p 0). Notice also that the sign of the electriccharge is correctly given, i.e. opposite to the electron, as it must be. The contribu-tion to the matrix element is a product of bs v(s), which is in the positive-frequencypart of ψ, and b†

t v (t), which is in the negative-frequency part of ψ

hp 0jbs b†t jpi D hp

0jfbs b†t g � b†

t b sjpi D hp 0j(2π)3δ3(s � t) � b†t b sjpi

The second term gives the above matrix element with the correct negative sign.The negative sign results from the anticommutation relation, but it also producesthe first term, an unwanted constant contribution, which gives a diverging inte-gral. This first term can be treated in a similar manner as the constant term inthe Hamiltonian, namely such an unphysical term is to be subtracted at the verybeginning of the calculation. The technique is to assume that terms like bs b†

t inthe bilinear expression ψψ, in which the annihilation operator comes to the leftof the creation operator, are already ordered like �b†

t b s from the beginning. Op-erators arranged in such a manner are called “normal ordered” or “N product” inquantized field theory. We shall assume that the interaction Hamiltonian is normalordered.4) The diagram in Fig. 6.4d is quite similar to Fig. 6.4b with appropriatechanges (u r(p )! vs(p 0), us(p 0)! v r (p ), �e ! Ce)

(d) γ C eC ! eC

hp 0 sjHIjp rI kλi D Cev r (p )γ μvs(p 0)εμ(kλ)e�i(p�p 0Ck )�x (6.105b)

4) The normal ordering is a formal manipulation to avoid unwanted constants, which are ofteninfinite. Whether this is physically allowed is a matter to be treated carefully. For instance, thevacuum energy that was inherent in quantized harmonic oscillators can be eliminated, but wehave seen that its existence is experimentally established by the Casimir effect [92].

Page 175: Yorikiyo Nagashima - Archive

150 6 Scattering Matrix

Figures 6.4e,f, which describe pair creation or annihilation by the photon, are ex-pressed as

(e) γ ! e� C eC

hp r, p 0 sjHIjkλi D �eur(p )γ μ vs(p 0)εμ(kλ)e�i(k�p�p 0)�x (6.106a)

( f ) e� C eC ! γ

hkλjHIjp r, p 0 si D �ev s(p 0)γ μ u r(p )εμ(kλ)�e�i(pCp 0�k )�x (6.106b)

Finally Figs. 6.4g,h, which describe total transition to vacuum and vice versa, areexpressed as

(g) γ C e� C eC ! vacuum

h0jHIjp r, p 0 sI k λi D �ev s (p 0)γ μ u r(p )εμ(kλ)e�i(pCp 0Ck )�x (6.107a)

(h) vacuum ! e� C eC C γ

hp r, p 0 sI k λjHIj0i D �eur (r)γ μ vs(p 0)εμ(kλ)�e i(pCp 0Ck )�x (6.107b)

Virtual ParticlesIn quantum field theory, we describe most physical phenomena in terms of parti-cles. However, in the case of interacting electromagnetic fields, for instance a staticCoulomb field between two charged particles, no particles are emitted, yet we talkof the force as the result of particle exchange. What is the difference between a realphoton, which appears as a quantum of freely propagating waves, and the quantumof the particles exchanged between two interacting objects? The freely propagatingwave satisfies the Klein–Gordon equation in vacuum. Then Einstein’s relation

p 2 D E 2p � jp j

2 D m2 massive particle (6.108a)

k2 D ω2 � jkj2 D 0 photon (6.108b)

should hold for the field quanta. A particle satisfying Eq. (6.108) is often said tobe “on the mass shell” or simply “on shell”. The electric field that is produced by apoint charged particle satisfies

@μ@μ Aν(x ) D eδ3(r)! k2 ¤ 0 (6.109)

The quantum of the interacting field, therefore, does not satisfy Einstein’s rela-tion. It is said to be “off shell” or a virtual particle. It is not a physical particle andcannot be observed as such. The reader is reminded that for the virtual particle,the ordinary physical picture does not apply. For instance, if the force is createdby exchange of a real particle, it can only give a repulsive force; no attractive forcecan result because emission or absorption of a particle having momentum alwayscauses the party involved to recoil. Another example: the real photon has only twodegrees of freedom in its polarization. Since the polarization is a vector in space-time, it needs four components in a relativistic treatment. Artificially (mathemati-cally) introduced extra quanta (longitudinal and scalar photons) appear only in the

Page 176: Yorikiyo Nagashima - Archive

6.6 Vertex Functions and the Feynman Propagator 151

intermediate state of a process. They show peculiar properties such as negativeprobability, superluminal velocity5) and nonzero mass.

The vertices described above generally connect initial states to intermediate andfinal states. Initial and final states that are prepared by experimental set ups arecomposed of real (on mass shell) particles that satisfy the Einstein relation E Dp

p2 C m2, ω (photon) D jkj, or in relativistic notation p 2 D m2, k2 D 0. Thismeans that once the momentum p of a particle is specified, its energy is fixed, too.At the vertex, both the energy and the momentum are conserved in a relativistictreatment, which in many cases does not allow all the particles to be “on shell”;one or two or all of the states are off mass shell. An exception is an initial massiveparticle (say a π) converting to two less-massive particles (μ C ν), where all theparticles could be “on shell” if energy-momentum conservation is satisfied.

6.6.2Feynman Propagator

We have so far studied only free fields, which satisfy a certain equation of motion,a homogeneous differential equation. An interacting field, which is created by asource, obeys an inhomogeneous differential equation. A formal solution to theinteracting field is given using a Green’s function. In quantized field theories thisis the role played by the Feynman propagator. It appears everywhere and is anindispensable tool for the calculation of scattering matrices. The definition and theproperties of the Feynman propagator are given below.

Feynman Propagator of the PhotonThe Feynman propagator of the photon is defined as

i DFμν(x � y )6) � h0jT [A μ(x )A ν(y )]j0i

D

(h0jA μ(x )A ν(y )j0i tx > ty

h0jA ν(y )A μ(x )j0i tx < ty

(6.110)

T(� � � ) is the “T product”, defined in Eq. (6.32). Inserting a unit operator 1 DPjnihnj and considering that A μ has matrix elements only between states dif-

fering in particle number by 1, only one-particle intermediate states differing inmomentum or spin contribute. Inserting Eq. (6.75) and considering

h0jA μ(x )jqλi D εμ(λ)e�i q�x (6.111a)

hqλjA ν(y )j0i D εν (λ)�e i q�y (6.111b)

εμ(λ)εν(λ)� ! �gμν see Eq. (5.147b) (6.111c)

5) The space-like momentum k2 < 0 means its (formal) velocity is larger than the velocity of light.6) It is customary to write DF for particles with m D 0 and ΔF when m2 ¤ 0. This applies to other

functions, e.g. Δ, Δ retadv

, too.

Page 177: Yorikiyo Nagashima - Archive

152 6 Scattering Matrix

we can express the Feynman propagator as

h0jT [A μ(x )A ν(y )]j0i

D �gμν

Zd3q

(2π)32ω

˚θ (x � y )e�i q�(x�y )C θ (y � x )e i q�(x�y )� (6.112a)

θ (x ) D

(1 t > 0

0 t < 0(6.112b)

The step function can be expressed by a contour integral in the complex plane:

θ (t) D1

2π i

Z 1

�1dz

ei z t

z � i ε(6.113)

where i ε is a small imaginary part prescribed to handle the singularity. Then

h0jT [A μ(x )A ν(y )]j0i

D �gμν

Zd3q

(2π)32ω1

2π i

Zdz

"e�i(ω�z)t 0Ci q�x0

z � i εC

e�i(z�ω)t 0�i q�x 0

z � i ε

#x 0Dx�y

D�i gμν

(2π)4

Zd3q2ω

Zdq0

�1

q0 � ω C i ε�

1q0 C ω � i ε

�e�i q0 t 0Ci q�x0

(6.114a)

D�i gμν

(2π)4

Zd4 q

e�i q�(x�y )

q2 C i ε(6.114b)

In summary, the Feynman photon propagator is expressed as an integral

h0jT [A μ(x )A ν(y )]j0i � i DF μν(x � y ) D�i gμν

(2π)4

Zd4q

e�i q�(x�y )

q2 C i ε(6.115)

It is a vacuum expectation value of two field operators that create a photon at yand annihilate it at x, and is referred to as a two-point function of the field. Inthe language of classical field theory, it is the effect of the electromagnetic fieldcreated by a source (a charged particle) at a point y, acting on a charged particle atpoint x. That the time ordering is considered suggests the role of a Green function.We shall see that the Feynman propagator DF is indeed a Green’s function of theelectromagnetic field that satisfies the equation

@μ@μ Aν(x ) D q j ν(x ) (6.116)

Green’s FunctionLet us start from the simplest relativistic field φ(x ) that satisfies a spin-less Klein–Gordon equation. The field produced by a source obeys the equation given by�

@μ@μ C m2 φ D �(x ) (6.117)

Page 178: Yorikiyo Nagashima - Archive

6.6 Vertex Functions and the Feynman Propagator 153

where � is a source to produce the field. A formal solution to the equation is ex-pressed using a Green’s function G(x , y ), which satisfies

φ(x ) D φ0(x ) �Z

d4 y G(x , y )�(y ) (6.118a)

�@�@� C m2�G(x , y ) D �δ4(x � y ) (6.118b)

�@�@� C m2� φ0(x ) D 0 (6.118c)

The physical meaning of the Green’s function is that the field at space-time point xis given by the action of the field created by a source at space-time point y that thenpropagates to point x.

The boundary conditions imposed on G is most frequently translationally invari-ant, so G can be regarded as a function of x� y . Then it can be solved using Fouriertransformation:

G(x , y ) D1

(2π)4

Zd4q QG(q)e�i q�(x�y ) (6.119a)

δ4(x � y ) D1

(2π)4

Zd4qe�i q�(x�y ) (6.119b)

Inserting Eq. (6.119) into Eq. (6.118b) we obtain

QG (q) D1

q2 � m2, q2 D (q0)2 � q2 , ω D

qq2 C m2 (6.120)

The Green’s function is expressed as

G(x , y ) D1

(2π)4

Zd4q

e�i q�(x�y )

q2 � m2

D1

(2π)4

Zd3qei q�(x�y )

Zdq0 e�i q0(tx �t y )

�1

q0 � ω�

1q0 C ω

�(6.121)

The integration on q0 can be interpreted as a contour integral in the complex plane,but a rule has to be established to handle the singularity. The retarded and advancedGreen’s function can be defined by requiring G(x� y ) D 0 for tx < ty and tx > ty .For tx > ty (tx < ty ), the contour can be closed in the lower (upper) semi-circle,therefore, adding a small imaginary part, q0 ! q0 ˙ i ε, gives the desired effect:

G retadv

(x � y ) D1

(2π)4

Zd3qei q�(x�y )

Zdq0 e�i q0(tx �t y )

�1

q0 ˙ i ε � ω�

1q0 ˙ i ε C ω

D �i θ [˙(x0 � y 0)]Z

d3 q(2π)32ω

[e�i q�(x�y ) � e i q�(x�y )]

D

(θ (x0 � y 0)Δ(x � y )

�θ (y 0 � x0)Δ(x � y )(6.122a)

Page 179: Yorikiyo Nagashima - Archive

154 6 Scattering Matrix

iΔ(x ) �1

(2π)3

Zd3q2ω

[e�i q�x � e i q�x ] DZ

d4q(2π)3

�(k0)δ(k2 � m2)e�i k �x

(6.122b)

where �(k0) D ˙1 for k0 ? 0. In relativistic quantum theory, we have learned thatthe negative energy contribution should be interpreted as a particle proceedingbackward in time. Then the contribution of the negative energy should vanish fortx > ty .

To make e�i q�x an ordinary propagating wave when q0 D ω > 0 and a backward-traveling wave when q0 D �ω < 0, we have to close the integral path in such away as to include the pole at q0 D ω when t > 0 and q0 D �ω when t < 0.This can be achieved by adding or subtracting a small imaginary number i ε to thedenominator. Using

Z C1

�1dq0 e�i q0 t

(q0)2 � ω2 C i εD

Z C1

�1dq0 e�i q0 t

�1

q0 � ω C i ε�

1q0 C ω � i ε

D

8<ˆ:�2π i

e�i ω t

2ωt > 0

�2π ieCi ω t

2ωt < 0

(6.123a)

the Green’s function for the quantized Klein–Gordon field can be expressed as

GF(x � y ) D �iZ

d3q(2π)32ωq

[θ (x0 � y 0)e�i q�(x�y )C θ (y 0 � x0)e i q�(x�y )]

(6.123b)

which is nothing but the Feynman propagator Eq. (6.112) except for the polariza-tion sum factor �gμν and i. The Feynman propagator for a spinless scalar field isdefined as

h0jT ['†(x )'(y )]j0i D iΔF(x � y ) Di

(2π)4

Zd4q

e�i q�(x�y )

q2 � m2 C i ε

(@μ@μ C m2 � i ε)ΔF(x � y ) D �δ4(x � y )(6.124)

Electron PropagatorThe propagator for the Dirac particle can be defined similarly:

i SF(x � y ) � h0jT [ψ(x )ψ(y )]j0i

D

(h0jψ(x )ψ(y )j0i tx > ty

�h0jψ(y )ψ(x )j0i tx < ty

(6.125)

The minus sign in the second line comes from the anticommutativity of thefermion field. Taking the sum over intermediate states, we obtain a similar expres-sion as for the photon propagator except for the spin sum, which has to be done

Page 180: Yorikiyo Nagashima - Archive

6.6 Vertex Functions and the Feynman Propagator 155

separately for particle and antiparticle. Using the spin sum expression Eq. (4.69)in Sect. 4.2,

i SF(x , y ) DZ

d3 p(2π)32E

"θ (x � y )

Xr

u r(p )ur (p )e�i p �(x�y )

�θ (y � x )X

r

vr(p )v r (p )e i p �(x�y )

#

D

Zd3 p

(2π)32E

�θ (x � y )( 6p C m)e�i p �(x�y )

C θ (y � x )(� 6p C m)e i p �(x�y )D (ir/C m)

Zd3 p

(2π)32E[θ (x � y )e�i p �(x�y )

C θ (y � x )e i p �(x�y )]

D (ir/C m)ΔF(x � y ) Di

(2π)4

Zd4 p e�i p �x ( 6p C m)

p 2 � m2 C i ε(6.126)

The spin sum ( 6p ˙ m) was replaced with derivatives in the penultimate line. Thisis permissible because the two time derivatives of the step functions compensateeach other and vanish. The electron propagator Eq. (6.126) has exactly the sameform as the photon propagator Eq. (6.115) except for the numerator, where thephoton polarization sum is replaced by the electron spin sum when tx > ty andminus the positron spin sum when tx < ty :

QSF(p ) D6p C m

p 2 � m2 C i ε�

16p � m C i ε

(6.127)

The last equality is often used when one wants a simple expression, but its realmeaning is defined by the preceding equation. Since it is easily shown that theelectron propagator satisfies the equation�

γ μ i@μ � m�

SF(x , y ) D δ4(x � y ) (6.128)

it, too, is the Green’s function for the Dirac equation.

Relation with the Nonrelativistic PropagatorApart from the spin sum in the numerator, the Feynman propagator consists oftwo energy denominators:

QDF(q) �1

q2 C i εD

12ω

�1

q0 � ω C i εC

1�q0 � ω C i ε

�(6.129)

It is elucidating to consider how they are related to the nonrelativistic propagator

1Ei � En

(6.130)

Page 181: Yorikiyo Nagashima - Archive

156 6 Scattering Matrix

Figure 6.5 The two processes of e–μ scattering (a) tx > ty , (b) tx < ty , and Compton scatter-ing (c) tx > ty , (d) tx < ty . The Feynman propagator contains both of them together.

Let us consider the example of e–μ scattering via hypothetical spinless photon ex-change. In the nonrelativistic treatment, the time sequence of the process has to beconsidered. There are two intermediate states, depending on which particle emitsa photon first (see Fig. 6.5a,b). As the energy of the two intermediate states afteremission of the photon is given by

En(a) D E1 C E4 C ω , En(b) D E2 C E3 C ω (6.131a)

the virtual photon energy that the electron at point x receives from the muon atpoint y is given by

q0 D

(E3 � E1 D E2 � E4 > 0 case (a)

E3 � E1 D E2 � E4 < 0 case (b)(6.131b)

The two denominators of the Feynman propagator can be rewritten as

q0�ω D (E2�E4)�fEn(a)�E1�E4g D E1CE2�En(a) D Ei�En(a) (6.131c)

�(q0Cω) D �(E3�E1)�fEn(b)�E2�E3g D E1CE2�En(b) D Ei �En(b)

(6.131d)

which exactly reproduces the two energy denominators of the nonrelativistic treat-ment. The factor 1/2ω arises from the relativistic normalization. Namely, the Feyn-man propagator contains both intermediate states by allowing the particle to takenegative energy and interpreting it as propagating backward in time. Actually, itmeans the reaction of the electron back to the muon, and really there is no particletraveling backward in time.

Negative energy is meant to represent an antiparticle, too. In the case of thephoton, there is no distinction between particle and antiparticle and the spin sumgives (�gμν) in both cases. For the case of Compton scattering (see Fig. 6.5c,d),the Feynman propagator in (c) represents an electron created at y proceeding to xwith positive energy, while that in (d) represents a particle proceeding with negativeenergy going backward in time. The latter is an antiparticle, i.e. positron, emittedat x that proceeds normally to y. The expression for the energy denominator is the

Page 182: Yorikiyo Nagashima - Archive

6.7 Mott Scattering 157

same as that of the photon propagator in e–μ scattering, but now the spin sum hasto be done separately for the particle and antiparticle:

QSF(p ) �1

2E

�PrD1,2 u(p )u(p )

p 0 � E C i εC�P

rD1,2 v (�p )v(�p )

�p 0 � E C i ε

)6p C m

p 2 � m2 C i ε

(6.132)

where the minus sign in front of the v (�p )v(�p ) comes from the Feynman pre-scription for the change of time order and the negative momentum comes fromthe change of the integration variable p ! �p . Recall that u(p )u(p ) and v (p )v(p )appeared as the coefficient of e�i p �(x�y ) and e i p �(x�y ) originally.

Note: The intermediate photon in the above example has the energy-momentumqμ D (q0, q). Since q D p3 � p1 D p2 � p4, q2 ¤ 0, the photon is not on shell. Onthe other hand, intermediate states inserted to calculate the Feynman propagatorare particles created by the creation operator of the free field and therefore are onshell in the sense that their energy ω is defined by the relation ω D jqj. Where didthe difference come from? The fact is that q0 that appears in the Feynman propa-gator is an integration variable obtained by setting q0 D ˙(z C ω) in Eq. (6.114a).It is this q0 that satisfies energy conservation and has nothing to do with ω, theenergy of the photon. Some people say that in the nonrelativistic treatment parti-cles in the intermediate states are on shell but the energy is not conserved in thetransition, and in the relativistic treatment the energy and momenta are conservedbut the particles are not on shell. The reader should realize that the latter half ofthis statement is not correct, as is clear from the above argument.

6.7Mott Scattering

6.7.1Cross Section

A process in which an electron is scattered by an unmovable scattering center, typ-ically a heavy nucleus, is called Mott scattering. The target should be heavy enoughfor its recoil to be neglected. However, the electron should be energetic enough tobe treated relativistically. The interaction Hamiltonian in this case is

HI D q j μA μ , j μ D ψγ μ ψ (6.133)

As long as the recoil of the target can be neglected, there is no effect that thescattering electron can give to the target. Then the field created by the targetis not disturbed and can be treated as an external source, namely as a classicalc-number potential. Such external fields are, for example, a static Coulomb force[Aμ D (Z q0/4πr, 0, 0, 0)] or an oscillating magnetic field [Aμ D (0,�y B sin ω t,

Page 183: Yorikiyo Nagashima - Archive

158 6 Scattering Matrix

x B sin ω t, 0)]. The scattering matrix element is given by

S f i � δ f i D �iZ

d4 xh f jHIjii D �iZ

d4xhp f sjq j μjpi riA μ (6.134a)

A μ(x ) D (Z q0/4πr, 0, 0, 0) (6.134b)

where p i r and p f s denote the momentum and the spin state of the incomingand outgoing electrons, respectively. The matrix element of the Hamiltonian wascalculated in Sect. 6.6.1:

hp f sjq j μ(x )jpi ri D qhp f sjψ(x )γ μ ψ(x )jpi ri

D qus (p f )γ μ u r(pi)e�i(p i�p f )�x (6.135)

Inserting Eq. (6.135) and Eq. (6.134b) into Eq. (6.134a), we obtain

S f i � δ f i D �2π i δ(Ei � E f )qus (p f )γ 0u r(pi)

"Z q0

Zd3x

ei(p i �p f )�x

r

#

D �2π i δ(Ei � E f )qus (p f )γ 0u r(pi)�QV (q)

D �2π i δ(Ei � E f )

Z qq0

jqj2us(p f )γ 0u r(pi)

QV (q) DZ

d3x ei q�x V(x) DZ q0

jqj2, q D p i � p f

(6.136)

Then

dσ D1

2Ei v

�Z e2

�2 � 4πjqj2

�2

jus(p f )γ 0u r(pi)j22πδ(Ei � E f )d3 p f

(2π)32E f

(6.137)

This expression, like that for Rutherford scattering, does not contain the δ-functionδ3(p i� p f ), i.e. the momentum is not conserved. This originates from the externalfield approximation that the target nucleus does not recoil.

In order to obtain the cross section, we need to calculate jus(p f )γ 0u r (pu)j2. Giv-en a special example, where only the 0th component of the spinor contributes, itmay be easier and faster to use the explicit expression for u r (p ) given in Sect. 4.2,Eq. (4.63). But for later applications, we adopt a more general approach here. Un-der usual experimental conditions, the incoming particles are not polarized, andthe final polarization is not measured either. In that case, we take the average (takethe sum and divide by the degrees of freedom) of the initial-state polarization andthe sum of the final-state polarization. Write the quantity Lμν as

Lμν �12

2Xr,sD1

�us (p f )γ μ u r(pi)

�us(p f )γ ν u r(pi)

� (6.138a)

Page 184: Yorikiyo Nagashima - Archive

6.7 Mott Scattering 159

�us(p f )γ ν u r(pi)

�D ur (pi)Γ u s(p f ) (6.138b)

Γ D γ 0Γ † γ 0 (6.138c)

When Γ D γ μ , Γ D γ μ ,

Lμν D12

2Xr,sD1

�us(p f )γ μ u r(pi)

�ur (pi)γ ν u s(p f )

(6.139)

Now we use the properties of the Dirac spinorsXr

u r(p )ur (p ) D6p C m ,X

r

vr (p )v r (p ) D6p � m (6.140)

and insert them in Eq. (6.139) to obtain

Lμν D12

Tr�γ μ( 6p i C m)γ ν( 6p f C m)

(6.141)

The above trace of the γ matrix can be calculated by making use of the propertiesof γ matrices (see Appendix C),

Tr[γ μ1 γ μ2 � � � γ μ2nC1 ] D 0 (6.142a)

Tr[γ μ1 γ μ2 � � � γ μn ] D (�1)nTr[γ μn � � � γ μ2 γ μ1] (6.142b)

Tr[γ μ γ ν] D 4gμν (6.142c)

Tr[γ μ γ ν γ� γ σ] D 4(gμν g�σ C gμσ gν� � gμ�gνσ) (6.142d)

Then the final expression for Lμν is

Lμν D 2hn

p μf p ν

i C p νf p μ

i � (p f � pi)gμνoC m2gμν

iD 2

�p μ

f p νi C p ν

f p μi C

q2

2gμν

�(6.143a)

q2 D (pi � p f )2 D 2m2 � 2(p f � pi)

' �2p f p i(1� cos θ ) (pi , p f m) (6.143b)

Problem 6.6

Using the properties of γ matrices Eq. (4.36), derive Eq. (6.142).Hint: To derive the first equation, use (γ 5)2 D 1, γ 5γ μ D �γ μ γ 5.

Now, going back to the Mott scattering cross section, only μ D ν D 0 com-ponents contribute. Since the energy is conserved by virtue of the δ-function,Ei D E f D E , jp i j D jp f j D p , v D p/E . Then

L00 D 2�2E f EiCm2�E f EiC p f p i cos θ

D 4E 2

�1� v2 sin2 θ

2

�(6.144)

Page 185: Yorikiyo Nagashima - Archive

160 6 Scattering Matrix

The incoming flux and the final state density are given by

2Ei v D 2p i D 2p (6.145a)

δ(Ei � E f )d3 p

(2π)32ED

�dEd p

��1 p 2dΩ16π3E

Dp dΩ16π3 (6.145b)

Inserting Eqs. (6.144) and (6.145) into Eq. (6.137), we obtain

dσdΩD

4Z2α2

q4 E 2�

1 � v2 sin2 θ2

�(6.146)

which is called the Mott scattering formula. Comparing this with the Rutherfordscattering formula (6.46), we see that the relativistic effect makes m ! E or (v Dp/m ! p/E ), and the spin effect of the electron gives

1!12

Pjus(p f )γ 0u r(pi)j2

4E1E2D

�1� v2 sin2 θ

2

�(6.147)

Problem 6.7

Using the explicit solution Eq. (4.63)

u r(p ) D�p

p � σ�rpp � σ�r

�D

rE C m

2

24

1 � σ�pECm

��r

1C σ�pECm

��r

35 (6.148)

show that spin nonflip and spin flip cross sections are proportional to

dσ(spin nonflip) /�

1Cp 2

(E C m)2cos θ

�2

(6.149a)

dσ(spin flip) /�

p 2

(E C m)2 sin θ�2

(6.149b)

and that their sum is equal to Eq. (6.147) apart from the normalization factor.

Problem 6.8

Using the current of the charged scalar particle

q j μ D �i e('†@μ' � @μ'†') (6.150)

derive the scattering formula for a given external Coulomb potential and show thatit agrees with the first term of Eq. (6.146).

Page 186: Yorikiyo Nagashima - Archive

6.7 Mott Scattering 161

6.7.2Coulomb Scattering and Magnetic Scattering

Scattering by the electromagnetic field can be divided into that by the electric andthat by the magnetic field. The Hamiltonian is written as

HI D qφ � μ � B , μ D �e

2mg s (6.151)

where s is the electron spin and g D 2 is the g-factor. Therefore, we see scatteringby the electric field is spin nonflip and that by the magnetic field is a spin-flip pro-cesses. Mott scattering by a static Coulomb potential contains the spin-flip term.This is because, viewed from the electron’s rest frame, the target is moving andhence a magnetic field is induced. As is clear from Eq. (6.146), the magnetic contri-bution is proportional to v2, and it is small for nonrelativistic particles but becomesimportant and comparable to the electric contribution for relativistic particles.

6.7.3Helicity Conservation

In the high-energy limit, where the electron mass can be neglected, v D 1 and theMott scattering cross section becomes

dσdΩ

ˇˇMottD

4Z2α2

q4 E 2 cos2 θ2

(6.152)

which differs from that of spinless particles by cos2(θ /2). This is a helicity rotationeffect. The reason is

1. the massless particle is a Weyl particle, and only polarizations along the direc-tion of its momentum are allowed (i.e. helicity eigenstates) and

2. a vector-type interaction, like that of the electromagnetic force, conserves thechirality (see arguments in Sect. 4.3.5), i.e. helicity is conserved in the high-energy limit.

The initial helicity is along the z-axis (parallel to the incoming momentum), andthe final helicity along the scattered particle at angle θ . The matrix element of therotation operator is given by d j

f,i D h j, j 0z j exp(�i θ Jy )j j, j zi and for the Mott case

j D 1/2, j z D ˙1/2, d1/21/2,1/2 D cos(θ /2). (See Appendix E for the rotation matrix.)

6.7.4A Method to Rotate Spin

The above argument inspires a method to rotate spin using low-energy Coulombscattering. Place heavy nuclei around a circle and have a low-energy charged parti-cle with longitudinal polarization start from one end of a diameter. Let it be scat-tered by a nucleus and detected at the other end of the diameter (see Fig. 6.6). As

Page 187: Yorikiyo Nagashima - Archive

162 6 Scattering Matrix

Figure 6.6 Converting longitudinal polarization to transverse polarization using Mott scattering.

the scattering angle is 90ı and the spin does not flip, the particle’s polarization isrotated by 90ı, and now the particle has transverse polarization, which is suitablefor measurement. The method was used to decide the helicity of the decay electronfrom decay.

6.8Yukawa Interaction

Yukawa interaction was first proposed to explain the nuclear force and it wasclaimed it was due to the exchange of a spin 0 massive meson. The interaction of aspin 1/2 particle with a scalar field is important both historically and for future ap-plications. Historically, it offered the first viable explanation of low-energy nuclearinteractions, i.e. strong interactions. For future applications, it is important for in-teraction with the Higgs field, which generates the mass of all otherwise masslessquarks and leptons. Here, we investigate a low-energy nuclear scattering mediatedby a scalar or a pseudoscalar meson. The relevant Lorentz-invariant interaction isof the type

LINT D �gψΓ ψφ (6.153)

where Γ D 1 if φ is a scalar and Γ D i γ 5 if φ is a pseudoscalar. We consider aprocess

N1(p1)C N2(p2)! N1(p3)C N2(p4) (6.154)

and assume N1 and N2 are distinguishable. The mass of the nucleon and the me-son are taken to be m and μ, respectively. The second-order scattering amplitudecorresponding to the exchange of the φ meson is (Fig. 6.7)

Page 188: Yorikiyo Nagashima - Archive

6.8 Yukawa Interaction 163

Figure 6.7 Nucleon–nucleon interaction via a scalar meson exchange.

S f i � δ f i D �i T f i

D hp3 p4jT�

(�i)2

2!gZ

d4x ψ(x )Γ ψ(x )φ(x )

� gZ

d4 y ψ(y )Γ ψ(y )φ(y )�jp1 p2i

D(�i g)2

2!

�hp3j

Zd4x ψ(x )Γ ψ(x )jp1ih0jT [φ(x )φ(y )]j0i

� hp4j

Zd4 y ψ(y )Γ ψ(y )jp2i C (x $ y )

�(6.155)

Interactions at x and y can be separated because particles 1 and 2 are two distin-guishable particles and the interactions take place at separate space-time points.The second term, where x and y are exchanged, gives an identical contribution tothe first, it merely compensates the factor 2!. Inserting

hp3j

Zd4x ψ(x )Γ ψ(x )jp1i D

Zd4x e�i(p1�p3)�x u(p3)Γ u(p1) (6.156a)

h0jT [φ(x )φ(y )]j0i DZ

d4 qe�i q�(x�y )

(2π)4

iq2 � μ2 (6.156b)

hp4j

Zd4 y ψ(y )Γ ψ(y )jp2i D

Zd4 y e�i(p2�p4)�x u(p4)Γ u(p2) (6.156c)

into Eq. (6.155), we obtain

T f i D (2π)4δ4(p1 � p3 C q)(2π)4δ4(p2 � p4 � q)Z

d4 q(2π)4

� u(p3)Γ u(p1)g2

q2 � μ2 u(p4)Γ u(p2)

D (2π)4δ4(p1 C p2 � p3 � p4)

� u(p3)Γ u(p1)g2

q2 � μ2u(p4)Γ u(p2)

ˇˇ

qDp3�p1

(6.157)

Page 189: Yorikiyo Nagashima - Archive

164 6 Scattering Matrix

Scalar Meson ExchangeΓ D 1

To evaluate the amplitude in the nonrelativistic limit, we retain terms only to thelowest order in the 3-momenta:

p1 D (m , p1), p2 D (m , p2), p3 D (m , p3) , p4 D (m , p4) (6.158)

Then

us(p3)u r(p1) D � †s (p

p3 � σp

p3 � σ)�p

p1 � σpp1 � σ

��r ' 2mδ r s (6.159a)

q2 � μ2 D �(q2 C μ2) D ��(p3 � p1)2 C μ2 (6.159b)

The amplitude becomes

T f i D �(2π)4δ(p1C p2 � p3 � p4)g22mδ r s2mδ t u

jqj2 C μ2 (6.160)

The amplitude of a scattering by a potential V in quantum mechanics is given by

T f i D 2πδ(Ei � E f ) QV (q)

QV (q) DZ

d3x ψ†(x )V(x )ψ(x ) D1

(2π)3

Zd3x ei q�x V(x )

(6.161)

Therefore QV (q) for the scalar meson exchange is given by

QV (q) D �g2

q2 C μ2 (6.162)

We have dropped (2m)2 and the δ-function for the momentum conservation. Thefactor (2m)2 comes from the relativistic wave function normalization and shouldbe dropped for comparison with the nonrelativistic formula. The δ-function origi-nates from integration over the total coordinate R, whereZ

d3x ei(p1�p3)�xZ

d3 y ei(p2�p4)�y

D

Zd3R ei(p1Cp2�p3�p4)�R

Zd3r e i q�r

D (2π)3δ3(p1 C p2 � p3 � p4)Z

d3r e i q�r

R D12

(x C y ) , r D x � y

(6.163)

and means the CM frame is unaffected by the internal motion. In the nonrelativis-tic treatment, only the relative motion is treated, so it does not enter the equation.The potential in the space coordinate is given by

V(x) D1

(2π)3

Zd3qei q�x �g2

q2 C μ2 D �g2

4πe�μ r

r(6.164)

Page 190: Yorikiyo Nagashima - Archive

6.8 Yukawa Interaction 165

which gives an attractive force. Yukawa postulated the existence of a massive forcecarrier of the nuclear force and knowing the force range to be approximately10�15 m, he predicted its mass to be � 200 me.

For the pseudoscalar meson exchange, the nuclear force becomes repulsive. Weknow that the π meson is actually a pseudoscalar meson. How then can the nuclearforce be attractive so that it can form a bound state, the deuteron? This is relatedto the charge exchange force, which will be discussed in connection with isospinexchange in Sect. 13.6.

Page 191: Yorikiyo Nagashima - Archive
Page 192: Yorikiyo Nagashima - Archive

167

7Qed: Quantum Electrodynamics

Quantum electrodynamics (QED) is a mathematical framework for treating thequantum dynamics of electrons with the quantized electromagnetic field. Howev-er, it can be applied to the electromagnetic interactions of any charged Dirac par-ticle, including quarks and leptons. It is a gauge theory based on the symmetryU(1) and shares many basic features with more complicated non-Abelian gaugetheories that govern the weak and the strong interactions. Thus learning QED isan indispensable step in understanding the dynamical aspect of particle physics.

7.1e–μ Scattering

As the next logical step after derivation of the Rutherford and Mott scattering, wetreat the scattering process of a spin 1/2 particle by a target also having spin 1/2.When the recoil of the target is not negligible, as is the case for e–μ scattering,we have to use the quantized electromagnetic field. Experimentally, a viable choicewould be e–p (electron–proton) scattering, but the proton is a hadron, which has afinite size as well as a large anomalous magnetic moment, and therefore is not aDirac particle. So we consider e–μ scattering, which is impractical from the experi-mental point of view, but has applications to e–q (electron–quark) scattering and isuseful for analyzing deep inelastic scattering data to probe the inner structure ofthe hadron. Besides, using crossing symmetry, the process can also be applied toe�eC ! μ�μC and e�eC ! qq, (i.e. e�eC ! hadrons), which are widely usedcollider experiments.

7.1.1Cross Section

We define the kinematic variables of e–μ scattering as follows:

e(p1)C μ(p2)! e(p3)C μ(p4) (7.1)

In pi (i D 1 to 4), we include spin orientation as well, unless otherwise specified.

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 193: Yorikiyo Nagashima - Archive

168 7 Qed: Quantum Electrodynamics

Figure 7.1 e–μ scattering process.

In the lowest order, e–μ scattering occurs as shown in Fig. 7.1, by exchange of aphoton. As there are two vertices in the diagram, it is a second order process. TheHamiltonian contains electron as well as muon interactions:

HI D q( j νe C j ν

μ )A ν D �e�ψ e γ ν ψe C ψ μ γ ν ψμ

�A ν (7.2)

The scattering matrix is given by

S f i � δ f i D(�i)2

2!

Zd4x

Zd4 y T [hp3, p4jHI(x )HI(y )jp1, p2i] (7.3)

Considering the e-transition p1 ! p3 and μ-transition p2 ! p4 to occur in dif-ferent space-time positions, they can be separated. We insert intermediate states nbetween the two Hamiltonians,

1 DX

n

jnihnj (7.4)

where n represents any physical state that can be realized through the interaction.The intermediate state in Fig. 7.1 corresponds to one photon with energy-momen-tum qμ and polarization λ,

Xn

jnihnj DX

λ

Zd3q

(2π)32ωjq, λihq, λj, ω D jqj (7.5)

Inserting the intermediate states between the two Hamiltonians, T h� � � i in Eq. (7.3)is expressed as

Zd3q

(2π)32ωT [hp3jHI(x )jp1, qλihp4, qλjHI(y )jp2i] (7.6)

There is also a term with p1 $ p2, p3 $ p4, but exchanging the integrationvariable x $ y makes it identical to Eq. (7.6) and merely compensates the factor 2!.Therefore it is enough to consider Eq. (7.6) only. Referring to Eqs. (6.104a) and(6.104b), Eq. (7.6) can be expressed as

hp3jHI(x )jp1, qλi D �eus(p3)γ μ u r(p1)e�i(p1�p3)�xh0jA μ(x )jqλi

hp4, qλjHI(y )jp2i D �eum(p4)γ ν u l(p2)e�i(p2�p4)�y hqλjA ν(y )j0i(7.7)

Page 194: Yorikiyo Nagashima - Archive

7.1 e–μ Scattering 169

Then

S f i � δ f i D (�i)2Z

d4xZ

d4 y˚�eus (p3)γ μ u r(p1)e�i(p1�p3)�x�

�X

λ

Zd3q

(2π)32ωT [h0jA μ(x )jqλihqλjA ν(y )j0i]

�˚�eum (p4)γ ν u l (p2)e�i(p2�p4)�y�

(7.8)

As the middle term is the Feynman propagator, we can insert Eq. (6.115) into it.Then

S f i � δ f i D (�i)2Z

d4xZ

d4 y˚�eus (p3)γ μ u r(p1)e�i(p1�p3)�x�

�1

(2π)4

Zd4qe�i q�(x�y ) �i gμν

q2 C i ε

�˚�eum (p4)γ ν u l (p2)e�i(p2�p4)�y�

(7.9)

As integration over x , y givesZd4x ! (2π)4δ4(p1 � p3 C q)Zd4 y ! (2π)4δ4(p2 � p4 � q)

(7.10a)

1(2π)4

Zd4q

�“d4x d4 y

�! (2π)4δ4(p1 C p2 � p3 � p4) (7.10b)

the matrix element of the scattering matrix becomes

S f i D δ f i � (2π)4 i δ4(p1 C p2 � p3 � p4)M f i

D δ f i C (2π)4δ4(p1 C p2 � p3 � p4)

� us(p3)(i eγ μ)u r(p1)��i gμν

q2 C i ε

�um(p4)(i eγ ν)u l(p2)

q D p3 � p1 D p2 � p4

(7.11)

which gives the Lorentz invariant matrix element M f i

M f i D � j μ(e)gμν

q2 C i εj ν(μ)

D �e2us(p3)γ μ u r(p1)1

q2 C i εum(p4)γμ u l (p2)

(7.12)

Under normal experimental situations, the polarization states of the particle are notmeasured. Then to obtain the transition amplitude, we have to take the spin sum(average of the initial- and sum of the final-state polarization). We have alreadydone part of the calculation [see Eqs. (6.139)–(6.143)],

Lμν D12

2Xr,sD1

�us(p3)γ μ u r(p1)

�ur(p1)γ ν u s(p3)

D 2�

p3μ p1

ν C p3ν p1

μ Cq2

2gμν

� (7.13a)

Page 195: Yorikiyo Nagashima - Archive

170 7 Qed: Quantum Electrodynamics

The same calculation for the muon part gives

Mμν D 2�

p4 μ p2ν C p4ν p2μ Cq2

2gμν

�(7.13b)

The current conservation law @μ j μ D 0 gives

qμ Lμν D qν Lμν D 0 , qμ M μν D qν M μν D 0 (7.14)

which can be explicitly verified from expressions (7.13).

Problem 7.1

Prove Eq. (7.14) by actually multiplying Eq. (7.13) by qμ.

Using the relation

p4 D p4 � p2 C p2 D �q C p2 (7.15)

and Eq. (7.14), one sees that p4 in Mμν can be replaced by p2:

N D Lμν Mμν

D 4�

p3μ p1

ν C p3ν p1

μ Cq2

2gμν

��2p2 μ p2 ν C

q2

2gμν

D 4

"4(p1 � p2)(p2 � p3)C q2 p 2

2 C q2(p1 � p3)C�

q2

2

�2

� 4

# (7.16a)

Making use of

2(p1 � p2) D (p1 C p2)2 � p 21 � p 2

2 D s � m2 �M2

2(p2 � p3) D �(p3 � p2)2 C p 23 C p 2

2

D �(p1 � p4)2 C m2 C M2 D m2 C M2 � u

2(p1 � p3) D �(p1 � p3)2 C p 21 C p 2

3 D 2m2 � t

N is rewritten as

N D 4�

(s � m2 �M2)(m2 C M2 � u)C t(m2 C M2)Ct2

2

�D 2

�(s � m2 � M2)2 C (u � m2 �M2)2 C 2t(m2 C M2)

(7.16b)

where we used

s C t C u D 2(m2 C M2) , q2 D t (7.16c)

in deriving the last equality. Substituting Eqs. (7.16b) in the square of the matrixelement Eq. (7.12), the invariant scattering amplitude (squared) becomes

jM f i j2 D

e4

q4 N

D4e4

q4

�(s � m2 � M2)(m2 C M2 � u)C t(m2 C M2)C

t2

2

� (7.17)

Page 196: Yorikiyo Nagashima - Archive

7.1 e–μ Scattering 171

where jM f i j2 denotes that the average of the initial spin and sum of the final spin

state were taken for jM f i j2. In the LAB frame, neglecting the electron mass (m D

0), the invariant Mandelstam variables are expressed as

s � M2 D 2p1M , M2 � u D 2p3M , t D �2p1 p3(1� cos θ ) (7.18)

Inserting the above expression into Eq. (7.16b), the invariant matrix element is

jM f i j2LAB D

e4

q4 16M2 p1 p3 cos2 θ2

�1�

q2

2M2 tan2 θ2

�(7.19)

As the cross section in LAB is given by Eq. (6.91),

dσdΩ

ˇˇLABD

�p3

p1

�2 164π2 M2

jM f i j2LAB (7.20a)

D dσMottp3

p1

�1 �

q2

2M2 tan2 θ2

�(7.20b)

dσMott D4α2E 2

3

q4 cos2 θ2

(7.20c)

The first term in square brackets [� � � ] represents scattering by the electric fieldcreated by the target and the second by the magnetic field. As p3 D p1/[1C p1(1�cos θ )/M ], the limit M !1 reproduces the Mott formula.

The cross section in CM is given by Eq. (6.90)

dσdΩ

ˇˇCMD

p f

p i

164π2 s

jM f i j2CM (7.21)

In many cases of high-energy scattering, where s, u M2, we can set M2 D 0.Then

dσdΩ

ˇˇCMD

p f

p i

164π2 s

(4πα)2

q4 � 2(s2 C u2)

Dα2

q4

s2 C u2

2sD

α2

q4 s

1C

1C cos θ

2

!2

2

(7.22)

This is a formula that appears frequently later in discussions of deep inelastic scat-tering.

7.1.2Elastic Scattering of Polarized e–μ

We consider in this section longitudinally polarized electron–muon scattering. Weuse here chiral eigenstates to describe the processes. The advantage is that thechirality does not change during the interaction and the calculation is easier than

Page 197: Yorikiyo Nagashima - Archive

172 7 Qed: Quantum Electrodynamics

that using the helicity eigenstate. Besides, it is also the eigenstate of the weak in-teraction and the derived formula can be applied directly to neutrino scattering.The disadvantage is that the chiral states are not eigenstates of the Hamiltonian,and the scattering amplitude written in terms of the chiral eigenstates does notdescribe the physical process accurately and hence does not necessarily reflect theproperties of helicity eigenstates correctly. However, as was explained in Sect. 4.1.2,the massless particle is a Weyl particle and chirality L(R) state and the helicity statesare the same except for a sign change between particles and antiparticles. In mosthigh-energy experiments, the energy is high enough to justify setting the mass ofthe particle to zero. This is certainly true for electrons (m ' 0.5 MeV), muons andlight quarks (u, d, s) (m ' 5 to 100 MeV) for typical experiments where the energyis above � 1 GeV.

The conservation of chirality in vector-type interactions, which include all thestrong, weak and electromagnetic interactions in the Standard Model, was alreadyshown in Sect. 4.3.5. This means

ψγ μ ψ D ψL γ μ ψL C ψRγ μ ψR

ψL γ μ ψR D ψRγ μ ψL D 0(7.23)

If we make use of the property that the chiral eigenstate does not change afterinteractions, the unpolarized e–μ scattering can be decomposed into the followingfour processes:

eLμL ! eL μL (LL) (7.24a)

eRμL ! eRμL (RL) (7.24b)

eLμR ! eLμR (LR) (7.24c)

eRμR ! eRμR (RR) (7.24d)

Let us consider eL μL ! eLμL first. The scattering amplitude can be obtained fromthat of unpolarized e–μ scattering by the substitution γ μ ! γ μ(1 � γ 5)/2, hence

dσdΩ

ˇˇCMD

�p f

p i

�1

64π2 sjM f i j

2 (7.25a)

jM f i j2 D

e4

q4 KLμν(p1, p3)KLμν(p2, p4) (7.25b)

KLμν(p1, p3) D

14

Xr,s

us(p3)γ μ(1�γ 5)u r(p1)ur(p1)γ ν(1�γ 5)u s(p3) (7.25c)

D14

Tr�γ μ(1� γ 5)( 6p1 C m1)γ ν(1 � γ 5)( 6p3C m3)

(7.25d)

Page 198: Yorikiyo Nagashima - Archive

7.1 e–μ Scattering 173

To calculate the trace, we use the properties of the γ 5 matrix:

γ μ γ 5 D �γ 5γ μ , γ 5 D �i4!

εμν�σγ μ γ ν γ�γ σ (7.26a)

ε0123 D �ε0123 D 1 (7.26b)

εμν�σ D

8<ˆ:�1 μν�σ D even permutation of 0123

C1 μν�σ D odd permutation of 0123

0 otherwise

(7.26c)

εμν�σ εμνγ δ D 2(g�δ gσγ � g�γ gσδ) (7.26d)

Then

Tr[γ 5] D Tr[γ 5γ μ ] D Tr[γ 5γ μ γ ν ] D Tr[γ 5γ μ γ ν γ�] D 0 (7.27a)

Tr[γ 5γ μ γ ν γ� γ σ] D �4 i εμν�σ (7.27b)

Using these relations, Eq. (7.25d) can be simplified to give

KLμν(p1, p3) � P μν C Qμν (7.28a)

D 2[fp1μ p3

νC p1ν p3

μ � gμν(p1 � p3)gC i εμν�σ p1� p3 σ ] (7.28b)

K μνL (p1, p3)KLμν(p2, p4) can be calculated using the following equalities:

P μν(p1, p3)Pμν(p2, p4) D 8[(p1 � p2)(p3 � p4)C (p1 � p4)(p2 � p3)] (7.29a)

Qμν(p1, p3)Q μν(p2, p4) D 8[(p1 � p2)(p3 � p4) � (p1 � p4)(p2 � p3)] (7.29b)

P μν Q μν D Pμν Qμν D 0 (7.29c)

4(p1 � p2)(p3 � p4) D�(p1 C p2)2 � p 2

1 � p 22

�(p3 C p4)2 � p 3

3 � p 24

D�s � m2

1 � m22

� �s � m2

3 � m24

�(7.29d)

Similarly

4(p1 � p4)(p2 � p3) D�u � m2

1 � m24

� �u � m2

2 � m23

�(7.29e)

Then

K μνL (p1, p3)KLμν(p2, p4) D 16(p1 � p2)(p3 � p4)

D 4(s � m21 � m2

2)(s � m23 � m2

4)mi !0D 4s2

(7.30)

Inserting Eq. (7.30) into Eq. (7.25), we obtain

dσLL �dσdΩ

ˇˇCM

(LL) Dα2

t2 s (7.31)

Page 199: Yorikiyo Nagashima - Archive

174 7 Qed: Quantum Electrodynamics

In calculating the cross section for eRμL scattering, we only need to replace γ 5 by�γ 5 in the electron part of KL(p1, p3). Then, Qμν in KL(p1, p3) becomes �Qμν, butthat in the muon part does not change. Then Qμν Q μν is changed to �Qμν Q μν andthe net effect is

(p1 � p2)(p3 � p4)! (p1 � p4)(p2 � p3) or s ! u (7.32a)

) dσRL Dα2

t2

u2

sD

α2

t2 s�

1C cos θ2

�2

(7.32b)

Others can be calculated similarly:

dσLR D dσRL , dσRR D dσLL (7.33)

The unpolarized scattering cross section is simply the sum of all the above crosssections divided by 4 to account for average L, R fractions in the beam:

dσdΩ

ˇˇCMD 2 �

�14

α2

t2

s2 C u2

s

�D

α2

t2

s2

"1C

�1C cos θ

2

�2#

(7.34)

As t D q2, this agrees with Eq. (7.22).The above expression can be understood in terms of helicity conservation. In the

CM frame, the eL μL and eRμL two-body systems are in the helicity eigenstates ofthe total angular momentum J D 0, 1. The system scattered by angle θ can beobtained from the incoming system by rotation if there are no other changes. Therotation matrix elements (see Appendix E) are given by

d jj 0z , j zD h j, j 0

z je�i Jy θ j j j zi (7.35a)

d00,0 D 1 , d1

1,1 D12

(1C cos θ ) (7.35b)

The helicity states are described in Fig. 7.2. For LL or RR scattering, the sum of theJz is zero and is conserved at any angle, but for RL or LR scattering Jz D ˙1 andit is inverted at the scattering angle θ D π, violating angular momentum conser-vation. The rotation matrix reflects this conservation law and gives the suppressionfactor (1C cos θ )2/2 in Eq. (7.35).

Note, that eL $ eR exchange gives s $ u in the scattering matrix. As s $ uexchange can be alternatively realized by crossing (p1 ! �p3, p3 ! �p1), par-ticle $ antiparticle exchange of one or other partner in the scattering and thatof R $ L are equivalent as far as the kinematics of the scattering is concerned.In e–μ scattering the crossing can be distinguished by seeing the charge changeof the particle, but for neutral particles, such as neutrinos, looking at the angulardistribution difference is one way to tell particles from antiparticles.

7.1.3e�eC ! μ�μC Reaction

It is possible and not so difficult to calculate the ee ! μμ production cross sec-tion using the standard procedure we have learned so far [see Eq. (C.17)]. Here,

Page 200: Yorikiyo Nagashima - Archive

7.1 e–μ Scattering 175

Figure 7.2 The difference of the angular dis-tributions by helicity conservation. For LLscattering, sum of the angular componentsin the e–μ two-body system is zero and itis conserved at any angle. For RL scatter-

ing Jz D 1/2 C 1/2 before scattering,but after scattering at the scattering angleθ D π, Jz D �1 and the angular momen-tum is not conserved. The suppression factor(1C cos θ )2/4 reflects this fact.

instead, we use crossing to derive the cross section. The eμ scattering amplitude asshown in Fig. 7.3a can be converted to ee ! μμ reaction (Fig. 7.3b) by crossing(p3 ! �p3, p2 ! �p2).

We attach a subscript t to the crossed channel variables to distinguish those be-fore crossing,

s D (p1 C p2)2 ! (p1 � p2)2 D u t (7.36a)

t D (p1 � p3)2 D q2 ! (p1 C p3)2 D s t (7.36b)

u D (p1 � p4)2 ! (p1 � p4)2 D tt (7.36c)

The scattering angle θt in the crossed channel is defined by the angle between eand μ. The spin-averaged scattering matrix element (squared) is then converted to

jM f i j2(eμ ! eμ) �

s2 C u2

t2! jM f i j

2(ee ! μμ) �t2

t C u2t

s2t

(7.37)

Figure 7.3 Crossing (p1 ! �p1, p3 ! �p3) converts eμ scattering (a) to ee ! μμ reac-tion (b).

Page 201: Yorikiyo Nagashima - Archive

176 7 Qed: Quantum Electrodynamics

Dropping the subscript, the cross section for ee ! μμ becomes

dσdΩ

ˇˇCMD

α2

st2 C u2

2s2 D4πα2

3s

�3

16π(1C cos2 θ )

�(7.38a)

σTOT D

Zdσ D

4πα2

3sD

87

s[GeV2]nb (nb D 10�33 cm2) (7.38b)

This is a basic cross section used in electron–positron colliding accelerator experi-ments.

Problem 7.2

Show if the mass of the produced particle is not neglected

σTOT D4πα2

3sF()

F() D �

3� 2

2

�, D particle velocity in CM

(7.39)

7.2Compton Scattering

Scattering AmplitudeThe photon–electron scattering is known as the Compton scattering. Historically,it is famous as the process to have shown the particle nature of the light waveand have proved Einstein’s light quantum hypothesis proposed in 1905. Here, wecalculate the process with later application to QCD in mind. The variables in thereaction are defined as

γ (k)C e(p ) ! γ (k0)C e(p 0) (7.40)

In the lowest order, two processes shown in Fig. 7.4 contribute.The scattering matrix is expressed as

S f i � δ f i D(�i q)2

2!

Zd4x d4 yh f jT [ j μ(x )A μ(x ) j ν(y )A ν(y )]jii (7.41)

From now on we will omit δ f i from the scattering matrix where there is no confu-sion. Separating the photon and electron parts

S f i D (�i q)2Z

d4x d4 y T�hk0jA μ(x )j0ihp 0j j μ(x ) j ν(y )jp ih0jA ν(y )jki

D (�i q)2

Zd4x d4 y εμ(k0)�hp 0jT [ j μ(x ) j ν(y )]jp iεν(k)e i(k 0�x�k �y )

(7.42a)

Page 202: Yorikiyo Nagashima - Archive

7.2 Compton Scattering 177

tx

k, ε(k) p k, ε(k) p

k’, ε(k’) p’ p’ k’, ε(k’)

k+p p k’

x

y y

x

(a) (b)

Figure 7.4 Two diagrams that contribute to Compton scattering.

hp 0jT [ j μ(x ) j ν(y )]jp i D hp 0jT [ψ(x )γ μ ψ(x )ψ(y )γ νψ(y )]jp i (7.42b)

where the polarization index λ has been suppressed for simplicity. The factor 1/2is canceled by an identical contribution obtained by x $ y in Eq. (7.41). Corre-sponding to two annihilation operators ψ, there are two ways of making the matrixelement. When x0 � y 0 > 0, writing the matrix elements suffixes explicitly

(1) (2)

hp 0jψ α(x )γ μα ψ(x )ψ γ (y )γ ν

γ δ ψδ(y )]jpi

(2) (1) (7.43)

In the bracket labeled (1), the incoming particle jp i is annihilated by ψδ at point yand the outgoing particle hp 0j is created by ψ α at point x. In brtacket (2), jp i isannihilated by ψ at point x and the outgoing particle hp 0j is created by ψ γ atpoint y. Therefore, bracket (1) corresponds to diagram (a) of Fig. 7.4 and bracket (2)to (b). The matrix element of bracket (1) can be written as

hp 0jT [ j μ(x ) j ν(y )]jp ij(a) x0>y 0

D uα(p 0)γ μαh0jψ(x )ψγ (y )j0iγ ν

γ δ u δ(p )e i(p 0�x�p �y )(7.44)

When x0 � y 0 < 0

hp 0jT [ j μ(x ) j ν(y )]jp i D hp 0j j ν(y ) j μ(x )jp i (7.45)

This is the same as Eq. (7.43) if the arguments are interchanged:

x $ y , μ $ ν , α $ γ δ (7.46)

(1) (2)

hp 0jψ γ (y )γ νγ δ ψδ(y )]ψ(x )α γ μ

α ψ(x )jpi

(2) (1) (7.47)

Page 203: Yorikiyo Nagashima - Archive

178 7 Qed: Quantum Electrodynamics

This time diagram (a) corresponds to bracket (2), and its matrix element is givenby

hp 0jT [ j μ(x ) j ν(y )]jp ij(a) x0<y 0

D uα(p 0)γ μαh0j � ψ γ (y )ψ(x )j0iγ ν

γ δ u δ(p )e i(p 0�x�p �y )(7.48)

Combining Eqs. (7.44) and (7.48)

hp 0jT [ j μ(x ) j ν(y )]jp ij(a) D u(p 0)γ μh0jT [ψ(x )ψ(y )]j0iγ ν u(p )e i(p 0�x�p �y )

(7.49)

hj � � � ji is just the Feynman propagator. Using Eq. (6.126), the scattering matrixelement corresponding to diagram (a) of Fig. 7.4 becomes

S f i j(a) D

Zd4x d4 y ei(k 0Cp 0)�x�i(kCp )�yεμ(k0)�u(p 0)(�i qγ μ)

�i

(2π)4

Zd4qe�i q(x�y ) 6 q C m

q2 � m2 C i ε

�(�i qγ ν)u(p )εν(k)

(7.50a)

D (2π)4δ4(k C p � k0 � p 0) (7.50b)

� εμ(k0)�u(p 0)(�i qγ μ)i( 6pC 6 k C m)

(p C k)2 � m2 C i ε(�i qγ ν)u(p )εν(k)

(7.50c)

The scattering matrix for diagram (b) can be obtained similarly, but is more easilyobtained by crossing diagram (a), i.e. by interchanging the variables k $ �k0,ε(k0)� $ ε(k). Then the total scattering amplitude is given by

M DMa CMb (7.51a)

D e2u(p 0)�6 ε0� 6pC 6 k C m

(p C k)2 � m26 ε C 6 ε

6p� 6 k0 C m(p � k0)2 � m2

6 ε0��

u(p ) (7.51b)

where ε/D γ μ εμ(k), ε/0 D γ μ εμ(k0) are used for brevity. The calculations that followare familiar trace algebra. But what will become important for later applications isthe high-energy behavior at s m2. So we will set m D 0, which considerablysimplifies the calculation.

Summation of the Photon Polarization Assuming that neither photon nor electronare polarized, we take the average of the initial and sum of the final states. Asdiscussed in Sect. 5.3.4, the polarization sum can be extended to all four states,giving

XλD˙

εμ(λ)�εν(λ) D δ i j �k i k j

jkj2!

XλDall

εμ(λ)�εν(λ) D �gμν (7.52)

Page 204: Yorikiyo Nagashima - Archive

7.2 Compton Scattering 179

This is because the polarization vector always appears coupled with the conservedcurrent. Since the conserved current satisfies

@μ j μ D 0! kμ j μ D k0 j 0 � jkj j 3 D 0 (7.53)

in the coordinate system where the z-axis is taken along the momentum direction.Then the polarization sum of the matrix elements (squared) is rewritten asX

λ

jMj2 �X

λ

jεμ(k, λ) j μ � � � j2 � �gμν j μ j ν† � (� � � )

�gμν j μ j ν† D � j 0 j 0†C j 3 j 3†„ ƒ‚ …j 0 j 0†

��1C

(k0)2

jkj2

�C j 1 j 1†C j 2 j 2†„ ƒ‚ …j ? � j ?

Dk2

k2 j j0j2 C j ? � j †

? D j ? � j †? D

�δ i j �

k i k j

jkj2

�j i j j †

(7.54)

The first term vanishes because for the real photon k2 D 0.As the degrees of freedom of both the photon polarization and electron spin are

two, we divide by 4 in taking the spin-polarization average:

jMj2 D14

XjMa CMb j

2 D14

X(jMaj

2 C jMb j2 C 2 Re MaM�

b )

D 2e2�

us�

su

�(7.55a)

We prove the first term. Setting m D 0 and picking up only the numerators, wehave X

r,s,λ

jMaj2 � e4

Xus(p 0) 6 ε0 �( 6pC 6 k) 6 εu r(p )ur (p ) 6 ε�( 6pC 6 k) 6 ε0u s(p 0)

(7.56a)

Using (see Appendix C for trace calculation formula)Xλ

6 ε� � � � 6 ε DX

λ

ε�μ γ μ � � � εν γ ν D (�gμν)γ μ � � � γ ν D �γμ � � � γ μ (7.56b)

γμ C/ γ μ D �2 C/ (7.56c)

6 ε� X

r

u r(p )ur (p )

!6 ε D �γμ( 6p C m)γ μ ' 2 6p (7.56d)

6 ε� X

s

u r(p 0)us (p 0)

!6 ε0 ' 2 6p 0 (7.56e)

) 14

Xr,s,λ

jMaj2 D e4Tr[( 6pC 6 k) 6p ( 6pC 6 k) 6p 0]

D e4Tr[6p 6p 6p 6p 0C 6p 6p 6 k 6p 0C 6 k 6p 6p 6p 0C 6 k 6p 6 k 6p 0]D e4Tr[6 k 6p 6 k 6p 0] D 4e4[2(k � p )(k � p 0) � k2(p � p 0)]D �2e4 su (7.56f)

Page 205: Yorikiyo Nagashima - Archive

180 7 Qed: Quantum Electrodynamics

Here we have used the relations

6p 6p D p 2 D m2 D 0 , 6 k 6 k D k2 D 0 , s D 2(k � p ) , u D �2(k � p 0)(7.56g)

Recovering other factors, and using that the denominators give (pC k)2�m2 ' s2

jMaj2 D �2e4 u

s(7.56h)

The interference term vanishes. Then the cross section is obtained from Eq. (6.92)

dσd tD

jM f i j2

16π(s � m2)2 '2πα2

s2

us�

su

�(7.57)

Problem 7.3

When Eq. (7.51) is written as ε0 �μ Mμν εν , gauge symmetry requires (Sect. 5.3.4)

invariance under the transformations

ε0�μ ! ε0�

μ C ck0μ , ε�

ν ! ε�ν C ckν (7.58)

Therefore the following equalities should hold:

k0μMμν D kνMμν D 0 (7.59)

Prove the equality explicitly using Eq. (7.51)

Problem 7.4

Show that gauge invariance does not hold with either Ma or Mb alone, but holdswith both terms added together.

Problem 7.5

Show that if we assume that the incoming photon is virtual and has mass k2 D

�Q2, then the interference term does not vanish, and Eq. (7.55a) becomes

jMj2 D 2e4��

us�

suC

2Q2 tsu

�(7.60)

This equation appears frequently in QCD.

Problem 7.6*

If we assume the electron mass is nonzero, the calculation can be carried out inthe LAB frame. Writing the variables as

k D (ω, k) , k0 D (ω0, k0) , p D (m , 0) , ε D (0, �)

Page 206: Yorikiyo Nagashima - Archive

7.3 Bremsstrahlung 181

then the following equalities are satisfied:

ε � p D ε0 � p D ε � k D ε0 � k0 D 0 (7.61a)

ε � p 0 D �ε � k0 , ε0 � p 0 D ε0 � k (7.61b)

Taking the average of the electron spin but leaving the photon polarization as it is,derive after a straightforward but long calculation

dσdΩ

ˇˇLABD

α2

4m2

�ω0

ω

�2 � ωω0 C

ω0

ωC 4(� � �0)2

�(7.62)

This is known as the Klein–Nishina formula.

7.3Bremsstrahlung

Scattering Amplitude*Bremsstrahlung (braking radiation in German) is a process in which a chargedparticle emits a photon when its path is bent by the electric field of the nucleus. Itcan be considered as Compton scattering by a virtual photon (γ�) produced by thenucleus:

γ�(q)C e(pi)! γ (k)C e(p f ) (7.63)

The scattering amplitude is obtained from that of the Compton scattering Eq. (7.51)by replacing the incoming photon (q) with an external field Aμ D (Z e/r, 0, 0, 0).

εμ(q)e�i q�y !Z e

4πrδμ0 D

Z e(2π)3

Zd4q ei q�y δμ0δ(q0)

jqj2

q D p f � p i C k(7.64)

S f i D C2πδ(Ei � E f � ω)Z ejqj2

u(p f )

�(i e 6 ε�)

i6p f C 6 k � m

(i eγ 0)C (i eγ 0)i

6pi � 6 k � m(i e 6 ε�)

�u(p i)

D 2π i δ(Ei � E f � ω)u(p f )eΓ u(pi)A0

Γ D6 ε� ( 6p fC 6 k C m)(p f C k)2 � m2

γ 0 C γ 0 ( 6p i� 6 k C m)(pi � k)2 � m2

6 ε� ,

A0 D �Z e2

jqj2

(7.65)

Page 207: Yorikiyo Nagashima - Archive

182 7 Qed: Quantum Electrodynamics

The unpolarized cross section becomes

dσ(pi ! p f C k) D 2πδ(Ei � E f � ω)Z2e6

2Ei jv i j

Xju f Γ u i j

2

�1jqj4

d3 p f

2E f (2π)3

d3k2ω(2π)3

(7.66)

whereP

takes the average of the initial and the sum of the final electron spinstates and the sum of the photon polarization states. The trace we want to calculateis X

ju f Γ u i j2 D

12

Tr

"�6 ε� 6p fC 6 k C m

2p f � kγ 0 � γ 0 6pi� 6 k C m

2pi � k6 �

� ( 6p i C m)

�γ 0 6p fC 6 k C m

2p f � k6 ε� 6 ε

6p i� 6 k C m2pi � k

γ 0�

( 6p f C m)

#

�18

(T1 C T2 C T3) (7.67)

We have rewritten the denominator using the relation

(p f C k)2 � m2 D 2(k � p f ) , (pi C k)2 � m2 D 2(k � p i) (7.68)

Choosing �0 D 0, the calculation of the traces yields

T1 D (p f � k)�2X

ε

Tr

�˚6 ε( 6p fC 6 k C m)γ 0( 6p i C m)γ 0( 6p fC 6 k C m) 6 ε( 6p f C m)

�D 8(k � p f )�2

�X

ε

�2(ε � p f )2 ˚m2 C 2Ei E f C 2Ei ω � (pi � p f ) � (k � p i)

�C2(ε � p f )(ε � p i)(k � p f )C 2Ei ω(k � p f ) � (k � p i)(k � p f )

(7.69a)

T2 D T1(pi $ �p f ) (7.69b)

T3 D �(p f � k)�1(pi � k)�1X

ε

Trh

γ 0( 6p i� 6 k C m) 6 ε( 6p i C m)γ 0

� ( 6p fC 6 k C m) 6 ε( 6p f C m)C (pi $ �p f )i

D 16(k � p f )�1(k � pi)�1X

ε

"(ε � pi)(ε � p f )

n(k � pi )� (k � p f )

C 2(p i � p f )� 4Ei E f � 2m2o

C (ε � p f )2(k � pi )� (ε � pi)2(k � p f )C (k � pi)(k � p f ) � m2ω2

C ωn

ω(pi � p f ) � Ei(p f � k) � E f (k � pi )o#

(7.69c)

Page 208: Yorikiyo Nagashima - Archive

7.3 Bremsstrahlung 183

pi θi

z

y

θfpf

k

Figure 7.5 Bremsstrahlung. The emitted photon is in the z direction and the emitted electron isin the y–z plane. The incoming electron is in a plane rotated by angle φ from the y–z plane.

The reaction does not take place in a plane, so we choose a coordinate frame asdescribed in Fig. 7.5.

We take the z-axis along the photon momentum k, the emitted electron mo-mentum p f in the y–z plane and the incoming electron momentum p i in a planerotated by an angle φ from the y–z plane. In this coordinate system, the parametersare

k μ D (ω, 0, 0, k) ω D jkj D k

p μf D (E f , 0, p f sin θ f , p f cos θ f )

p μi D (Ei , pi sin θi sin φ, p i sin θi cos φ, pi cos θi)

ε1 D (0, 1, 0, 0, ) ε2 D (0, 0, 1, 0)

(7.70)

where pi D jp i j, p f D jp f j. Then the cross section is expressed as

dσ DZ2α3

4π2

p f

p i jqj4dωω

dΩk dΩ f

"p 2

f sin2 θ f

(E f � p f cos θ f )2(4E 2

i � jqj2)C

p 2i sin2 θi

(Ei � pi cos θi )2

� (4E 2f � jqj

2)C 2ω2p 2

i sin2 θi C p 2f sin2 θ f

(E f � p f cos θ f )(Ei � pi cos θi )

� 2pi p f sin θi sin θ f cos φ

(E f � p f cos θ f )(Ei � pi cos θi )(4Ei E f � jqj2 C 2ω2)

#(7.71)

This is known as the Bethe–Heitler formula.

7.3.1Soft Bremsstrahlung

Now we consider the process of soft Bremsstrahlung. Here we are only interestedin the soft photon limit ω D k0 ! 0. Then the photon momentum k in thenumerator of the scattering amplitude Eq. (7.65) can be ignored. Then using the

Page 209: Yorikiyo Nagashima - Archive

184 7 Qed: Quantum Electrodynamics

relations

( 6p i C m) 6 ε�(k)u(pi) D�2�ε� � pi

�� 6 ε�( 6p i � m)

u(pi)

D 2�ε� � pi

�u(pi)

u(p f ) 6 ε�(k)( 6p f C m) D�2�ε� � p f

�� ( 6p f � m) 6 ε�

D 2u(p f )(ε� � p f )

(7.72)

the Bremsstrahlung amplitude can be reexpressed as

S f i D 2π i δ(Ei � E f � ω)u(p f )γ 0u(p )A0

�e�

(ε� � p f )(k � p f )

�(ε� � pi)(k � pi )

��A0 D �

Z e2

jqj2(7.73)

If we approximate q D p f �p iCk ' p f �p i in A0, this is just the elastic (i.e. Mott)scattering amplitude [see Eq. (6.136)] times a factor for emission of a photon. Thusthe Bremsstrahlung cross section for soft photon emission is the Mott scatteringcross section times the probability Pγ of emitting a photon:

dσ[e(pi)! e(p f )C γ (k)] D dσMott(pi ! p f )dPγ

dPγ Dd3k

2ω(2π)3

XλD1,2

e2

ˇˇ (ελ � p f )

(k � p f )�

(ελ � pi)(k � pi)

ˇˇ2 (7.74)

Since the polarization sum can be extended to all λ D 0, . . . , 3, we havePλ ελ

μ ελν D �gμν, and we can write

Pγ Dαπ

Z2dω

ω

ZdΩ4π

"(pi � p f )

( Ok � pi)( Ok � p f )�

12

(m2

( Ok � pi )2C

m2

( Ok � p f )2

)#

(7.75)

where Ok D k/jkj. The integrand goes to1 as ω ! 0. This is known as the infrareddivergence. The resolution of this question comes when the one-loop correctionsare added properly. We shall discuss the infrared divergence in detail in Sect. 8.1.3.Note that the number of photons increases as ω ! 0, but the emitted energyremains finite. Here, we simply cut off at ω D μ, assuming the photon has asmall finite mass. The total photon emission probability also diverges formally ifintegrated over all the momentum range. But remember we have limited jkj �jqj ' jp i � p f j in making the approximation, so the integral should have an upperlimit around jqj. Terms in the braces f� � � g are easily calculated to giveZ

dΩ4π

m2

( Ok � p )2D

m2

2E 2

Z C1

�1d cos θ

1(1� cos θ )2

D 1 (7.76)

The first term in square brackets [� � � ] can be integrated using Feynman’s parame-terization

1abD

Z 1

0dx

1[ax C b(1� x )]2

(7.77)

Page 210: Yorikiyo Nagashima - Archive

7.3 Bremsstrahlung 185

Writing ( Ok � pi) D Ei(1 � Ok � � i ) � � � ,ZdΩ4π

(pi � p f )

( Ok � pi)( Ok � p f )

D

Z 1

0dx

Z C1

�1

d cos θk

2

1� � i � � f

[1� Ok � f� f x C � i (1� x )g]2

D

Z 1

0dx

1� i f cos θ1 � j� f x C � i (1� x )j2

'

Z 1

0dx

1� 2 cos θ1 � 2 C 42 sin2(θ /2)x (1 � x )

D

Z 1

0dx

m2 � q2/2m2 � q2x (1 � x )

'

(1C [�q2/(3m2)]C O((m2/(�q2))2) m2/(�q2) 1

ln(�q2/(m2))C O(m2/(�q2)) m2/(�q2)� 1

�q2 D �(p f � pi )2 ' q2 D 4p2 sin2(θ /2)

(7.78)

where the integration over θk was carried out choosing the z-axis along � f x C� i (1� x ). The soft Bremsstrahlung cross section is then

dσdΩ f

D

�dσdΩ

�elastic

απ

A IR(q2) lnω2

max

μ2

A IR(q2) DZ 1

0dx

m2 � q2/2m2 � q2x (1 � x )

� 1

D

8<:

�q2

3m2 C O��q2

m2

�2

Nonrelativistic

ln�q2

m2

�� 1C O

m2

�q2

�Extremely relativistic

(7.79)

The angular distribution of the emitted photon is interesting. When the electronenergy is much larger than its mass, i.e. ! 1, the denominators of the expres-sions in the bracket of Eq. (7.75) go to zero as θ ! 0 and the radiation is stronglypeaked forward:

1 � cos θ D 1 ��

1 �1

γ 2(1C )

�cos θ '

θ 2

2C

12γ 2

) 1/21 � cos θ

'1

θ 2 C

1γ2

� D 1

θ 2 Cm

E

�2

(7.80)

which means about 1/2 of the total radiation is within a narrow angle of θ < 1/γ .The emission, therefore, is strongly concentrated in the initial and the final electrondirections.

Note, the soft photon emission probability Eq. (7.79) factorizes as the productof the elastic scattering and the photon emission probability. In the soft photon

Page 211: Yorikiyo Nagashima - Archive

186 7 Qed: Quantum Electrodynamics

limit, the momentum loss of the electron is negligible. Then the electron does notchange its state after the soft photon emission and the emission probability of thesecond soft photon is the same as the first. This means the soft emission is sta-tistically independent of the elastic scattering and we expect a Poisson distributionfor the n-photon emission probability. The phenomenon is treated in more detailin Sect. 8.1.4. The electron current is then a classical current source of the photonemission and we expect a semi-classical treatment would produce the same result.

Problem 7.7

Consider a classical source where the momentum of a moving charged particlechanges instantaneously at t D 01). The current can be expressed as

j μ(x ) D eZ 1

�1dτ

dx μ

dτδ4 (x μ � x μ(τ))

x μ(τ) D

((pi

μ/m)τ τ < 0

(p fμ/m)τ τ > 0

(7.81)

(1) Show that

Qj (k)μ D

Zd4x ei k �x j μ(x ) D i e

�p f

μ

(k � p f )�

p iμ

(k � pi )

�(7.82)

We may assume the integrand of Eq. (7.81) is multiplied by e��jτj to assure conver-gence.(2) Show that the one-photon emission probability agrees with dPγ given inEq. (7.74)Hint: Calculate S f i � δ f i D �i

Rd4xhγ (k)jA μ(x )j0i j μ(x ), treating the photon

quantum mechanically.

7.4Feynman Rules

Until now, we have used Feynman diagrams to aid calculation, but there is a deeper,in fact, one-on-one correspondence between the diagrams and the scattering ampli-tudes. With some experience, the calculations are made much easier. We summa-rize here the correspondence so that one can reconstruct the scattering amplitudeeasily by simply looking at the Feynman diagrams. As the scattering amplitude isexpressed as S D T exp

��iR

d4xHI�, once we know the interaction Hamiltoni-

an, the corresponding terms are uniquely defined. The electromagnetic interactionHamiltonian is obtained from the Lagrangian (HI D �LI) by the replacement

1) This is an idealization. In reality, the interaction takes a finite amount of time Δ t, in which casethe emitted photon energy would be limited ω < 1/Δ t, providing a natural cutoff for emittedphoton energy.

Page 212: Yorikiyo Nagashima - Archive

7.4 Feynman Rules 187

@μ ! @μ C i qA μ

spin 1/2 particle � iHI D �i qψ(x )γ μ ψ(x )A μ(x ) (7.83a)

scalar particle � iHI D qA μ['†@μ' � @μ'†']C i e2A μ Aμ'†' (7.83b)

(1) Feynman diagrams First draw all the topologically different diagrams for thedesired process.(2) External lines (Fig. 7.6)

Figure 7.6 Basic elements of the Feynman diagram: external lines.

Assign wave functions to the external lines (incoming or outgoing particles)

creation annihilationscalar boson W '†(x ) ! e i k 0�x '(x ) ! e�i k �x

electron W ψ(x ) ! u(p 0)e i p 0�x ψ(x ) ! u(p )e�i p �x

positron W ψ(x ) ! v (p 0)e i p 0�x ψ(x ) ! v(p )e�i p �xphoton W A μ(x ) ! εμ(k0)�e i k 0�x A μ(x ) ! εμ(k)e�i k �x

(7.84)

(3) Vertices (Fig. 7.7)

Figure 7.7 Basic elements of the Feynman diagram: vertices.

To the vertices assign

scalar-photon 3-point vertex ! �i q(k C k0)μ

scalar-photon 4-point vertex ! 2 i q2gμν

electron-photon 3-point vertex ! �i qγ μ(7.85)

(4) Internal lines (Fig. 7.8) To the internal lines that connect two vertices, assignthe Feynman propagators

Page 213: Yorikiyo Nagashima - Archive

188 7 Qed: Quantum Electrodynamics

Figure 7.8 Basic elements of the Feynman diagram: internal lines.

scalar h0jT ['(x )'(y )]j0i ! iΔF(k2) Di

k2 � m2 C i ε

photon h0jT [A μ(x )A ν(y )]j0i ! i DF(k2) D�i gμν

k2 C i ε

electron h0jT [ψ(x )ψ(y )]j0i ! i SF(k2) Di( 6p C m)

p 2 � m2 C i εmassive vector2) h0jT [W μ(x )W ν(y )]j0i ! iΔμν

F (k2)

D�i�gμν � k μ k ν/m2

�k2 � m2 C i ε

(7.86)

(5) Momentum assignment (Fig. 7.9)

p

p’

p’ k

p k

kq

q

q

k kk q

p

p

p k

(a) (b) (c)

Figure 7.9 Momentum assignment to loopFeynman diagrams. From left to right: vertexcorrection, photon self-energy and fermionself-energy. For the fermion loops �1 should

be multiplied for each loop. The variable k inthe figure is not contrained by δ functions atvertices hence has to be integrated.

Assign ki to every boson and p i to every fermion.

external lines ! determined by experimental conditions.internal lines ! Starting from external lines, assign ki , pi to

all the internal lines so as to make the totalsumD 0 at the vertices.

loop lines !

Zd4k

(2π)4

! multiply by � 1 for each fermion loop.

(7.87)

The momentum can be assigned to all the internal lines starting from the ex-ternal lines, whose momenta are decided by experimental conditions. Space-time

2) The difference from the photon is its mass and longitudinal polarization, which gives an extraterm in the numerator of the propagator.

3) Tree diagrams are those which do not contain loop diagrams.

Page 214: Yorikiyo Nagashima - Archive

7.4 Feynman Rules 189

integrals produce δ-functions at each vertex, which ensure that the sum of theenergy momentum is conserved. The δ-functions constrain all the internal mo-menta in tree diagrams3) and disappear by integration, except the one that ensuresthe conservation of overall energy-momentum. When loops appear as higher or-der effects, their internal momentum is not restricted by the δ-function and hasto be integrated, hence the factor

Rd4k/(2π)4. Examples are shown in Fig. 7.9a–c,

where k has to be integrated. For the fermion loops, �1 should be multiplied foreach loop.

Page 215: Yorikiyo Nagashima - Archive
Page 216: Yorikiyo Nagashima - Archive

191

8Radiative Corrections and Tests of Qed*

8.1Radiative Corrections and Renormalization*

So far we have calculated the matrix elements only in the lowest order (tree approx-imation). Most of the essential features of the process are in it, and it is not theauthor’s intention to get involved with a theoretical labyrinth by going into high-er orders. However, the appearance of the infinity was inevitable when we tried tocalculate higher order corrections in the perturbation series. It is an inherent dif-ficulty deeply rooted in the structure of quantized field theory. We should at leastunderstand the concept of how those infinities are circumvented to make QED aself-consistent mathematical framework. We may as well say that the history oftheoretical particle physics has been the struggle to cope with and overcome theinfinity problem. The quest is still going on. The standard remedy is “renormaliza-tion”, which extracts finite results by subtracting infinity from infinity. The processis rather artificial, but has stood the long and severe test of observations. Today,renormalizability is considered an essential ingredient for the right theory. Thehigher order corrections are indispensable in comparing theoretical predictionswith precision experiments and serve as crucial tests for the internal consistencyof the theory. The author considers it important that the reader understand at leastthe philosophy behind it. For this purpose, we describe the essence of the radiativecorrections in order to understand the renormalization prescription.1)

8.1.1Vertex Correction

We consider first the effect of radiative corrections on the strength of the couplingconstant. The strength of the electromagnetic constant is typically measured by alow-energy elastic scattering by nuclei referred to as Mott scattering. The scattering

1) This chapter deals with the self-consistency of QED. The discussion is mathematical and ratherformal, except for Sect. 8.2, which summarizes the validity of QED. Readers may skip this chapter;it is not compulsory for understanding later chapters. They may come back when they havegained enough understanding and when they are inclined to.

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 217: Yorikiyo Nagashima - Archive

192 8 Radiative Corrections and Tests of Qed*

matrix element can be expressed as [see Eq. (6.136)]

MMott D �i qu(p 0)γ μ u(p )�i gμν

q2 [q j ν(q)] (8.1)

j μ is the source of an external field or that created by a static scattering center,which is symbolically denoted by a cross in Fig. 8.1.

So far we have treated the interaction of an electron with the field only in thelowest order, which is depicted as diagram (b). Higher order effects include thevertex correction (c), the photon propagator correction (d), and electron propagatorcorrections (e,f). Emission of very low energy photons (g,h) is physically indistin-guishable from the process that radiates none and should be included in a consis-tent treatment. We first consider the vertex correction (c) to the process. Applyingthe Feynman rules described in the previous section, �i qγ μ in Eq. (8.1) has anextra contribution

� i qγ μ ! �i q(γ μ C Γ μ)

Γ μ D

Zd4k

(2π)4

�i g�σ

k2

�(�i qγ�)

i( 6p 0� 6 k C m)(p 0 � k)2 � m2 C i ε

γ μ i( 6p� 6 k C m)(p � k)2 � m2 C i ε

(�i qγ σ)� (8.2)

As one can see easily by counting the powers of k, the integral diverges like� log k2 as k ! 1, which is called the ultraviolet divergence. It also diverges ask ! 0, which is called the infrared divergence. The latter has a logical and physi-cally justifiable interpretation and hence is not a fundamental difficulty.

QED vertex higher order corrections

(a) (b) (c) (d)

(e) (f) (g) (h)

p p p p

p’ p’ p’ p’

q q q q q

q q q q

k

k

k

k

q-kp-k

p’-k

p-k

p’-k

p p p p

p’ p’ p’ p’

++++

++

Figure 8.1 Corrections to the vertex function.The effective vertex including overall effectsis symbolically represented as the black circlein diagram (a), which in the lowest order isgiven by the vertex in (b). Higher order effectsinclude the vertex correction (c), the photonpropagator correction (d), and the fermionpropagator corrections (e, f). Emission of very

low energy photons (g, h) are physically indis-tinguishable from the process that radiatesnone. Two kinds of divergences, ultravioletand infrared, occur once diagrams involvingloops are included. The former is treated byrenormalization and the latter is canceled by(g, h).

Page 218: Yorikiyo Nagashima - Archive

8.1 Radiative Corrections and Renormalization* 193

8.1.2Ultraviolet Divergence

The ultraviolet divergence is inherently connected with the infinite degrees of free-dom of the field and rather complicated to handle. A prescription is the renormal-ization developed by Tomonaga, Schwinger and Feynman.2) To understand it, letus calculate Eq. (8.2) and see how the difficulties arise. Throughout this calculationwe tacitly assume that the vertex function is sandwiched by two on-shell spinors. Acurrent operator sandwiched by on-shell spinors is generally expressed as

hp 0j j μ(x )jp i � hp 0j Qj μ(q)jp ie i q�x qμ D p 0 μ � p μ

� u(p 0)�

γ μ F1(q2)Ci σμνqν

2mF2(q2)

�u(p )e i q�x (8.3)

From Lorentz invariance alone, terms proportional to qμ and p 0 μCp μ should exist.But the current conservation constrains

@μ j μ(x ) � qμ Qj μ(q) D 0 (8.4)

So there should be no term proportional to qμ . γ μ sandwiched with u(p 0) and u(p )gives

2mγ μ D 6p 0γ μ C γ μ 6p D 2p 0 μ � γ μ( 6p 0� 6p ) D 2p 0 μ � γ μ 6 q

D 2p μ C ( 6p 0� 6p )γ μ D 2p μC 6 qγ μ D (p 0 μ C p μ)�12

[ γ μ , 6 q]

D (p 0 μ C p μ)C ii2

[γ μ , γ ν ]qν D (p 0 μ C p μ)C i σμνqν

(8.5)

where we have used 6p u(p ) D u(p 0)6p 0 D m. The relation is called the Gordon equal-ity and shows that p 0 μ C p μ can be expressed in terms of γ μ and i σμν qν. In thestatic limit, p 0 μ D p μ D (m , 0), q2 ! 0, i qμ D (0,�i q) � (0,r), and γ μ coupledwith the electromagnetic field reduces to

hp 0jeZ

d3x ei q�x ψ(x )γ μ ψ(x )A μ(x )jp iq2!0����!

heφ C

e2m

σ � Bi

(8.6)

This expression means γ μ includes both the electric charge and the magnetic mo-ment of a Dirac particle with g D 2. Then F1(q2) and F2(q2) describe the dynamicpart of the charge and the anomalous magnetic moment, which arise from theinner structure of the particle. They are constrained by Eq. (8.6)

F1(0) D 1 , F2(0) D � D g � 2 (8.7)

From the above argument we expect Γ μ defined by Eq. (8.2) to have the samestructure as Eq. (8.3) with F1(0) D 0 because Γ μ is defined as a correction to γ μ .

2) Any standard textbook on quantized field theory has a detailed description of it. We quote somerepresentative books: [63, 64, 215, 322, 338].

Page 219: Yorikiyo Nagashima - Archive

194 8 Radiative Corrections and Tests of Qed*

In order to calculate Eq. (8.2), we begin with the Feynman parameter trick:

1a1a2 � � � an

D (n � 1)!Z 1

0� � �

Z 1

0

nYiD1

dziδ(1�

Pi zi )

(P

i a i zi )n (8.8)

We put

a1 D k2�μ2Ci ε , a2 D (p 0�k)2�m2Ci ε , a3 D (p�k)2�m2Ci ε (8.9)

where we have introduced a small photon mass μ temporarily to avoid the infrareddivergence. Then Eq. (8.2) is expressed as

Γ μ D �2 i e2Z

d4k(2π)4

Z 1

0dz1dz2 dz3δ(1� z1 � z2 � z3)

�N μ(k)

(k0 2 �M2 C i ε)3

(8.10a)

N μ(k) D γν( 6p 0� 6 k C m)γ μ( 6p� 6 k C m)γ ν

k0 D k � p 0z2 � p z3

M2 D m2(1� z1)2 C μ2z1 � q2z2z3

(8.10b)

Now we change the integration variable from k to k0, and rewrite the numerator interms of k0. Then terms in odd powers of k0 vanish because of symmetry. Rewritingk0 as k, the numerator becomes

N μ(k)! N μ(k C p 0z2 C p z3) D N1 C N2 C (terms linear in k μ)

N1 D �2��k2γ μ C 2k μ k ν γν

N2 D γν [6p 0(1� z2)� 6p z3 C m]γ μ[� 6p 0z2C 6p (1� z3)C m]γ ν

(8.11)

After straightforward but long calculations of Dirac traces, using relations

γν γ μ γ ν D �2γ μ ,

γν A/ B/ γ ν D 4(A � B) ,

γν A/ B/ C/ γ ν D �2 C/ B/ A/

γ μ 6 q D qμ C i σμνqν

6 qγ μ D qμ � i σμνqν

(8.12)

and Gordon decomposition, N μ becomes

N μ D N1 C N2 D �2�� k2γ μ C 2k μ k ν γν C q2(1� z2)(1� z3)γ μ

C m2�1� 4z1 C z21

�γ μ C z1(1� z1)i σμνmqν

C (2� z1)(z2 � z3)mqμ (8.13)

The term proportional to qμ is forbidden by current conservation. Here, the coef-ficient of qμ is odd for the exchange z2 $ z3 and therefore vanishes after integra-tion over z2 and z3, as it must. The first two terms are divergent, as can be seen by

Page 220: Yorikiyo Nagashima - Archive

8.1 Radiative Corrections and Renormalization* 195

counting the powers of k. Obviously we need to separate the divergent and finiteparts and extract a meaningful result out of them. The simplest method is simplyto stop integration at some finite cutoff value Λ. But usually that destroys the built-in symmetry. For QED, the Pauli–Villars regularization method which respects theLorentz invariance has frequently been used.3) But for non-Abelian gauge theory itis unsatisfactory because it does not respect the gauge symmetry and the preferredmethod is dimensional regularization, where the space-time dimensionality is ana-lytically continued from the real physical value, four, to a complex value D. For realD < 4, the integral of the form

Rd D k/(k2C � � � )2 converges. When we recover the

original dimension D D 4 � ε, ε ! 0, the divergence reappears in the form of apole at D D 4, which can be subtracted. Other poles appear at D D 6, 8, . . ., but itturns out that the integral is analytic in the complex D-plane and we can make ana-lytic continuation to any complex values except those poles. The relevant formulasin this case are [see Eq. (D.20) and discussions in Appendix D]

Zd D k

[k2 � M2 C i ε]nD i(π)D/2 (�1)n

(M2)n�D/2

Γ�n � D

2

�Γ (n)

(8.15a)

Zk2d D k

[k2 � M2 C i ε]n D i(π)D/2(�1)n�1 D/2(M2)n�1�D/2

Γ�n � 1� D

2

�Γ (n)

(8.15b)Zk μ k ν d D k

[k2 � M2 C i ε]nD

gμν

D

Zk2d D k

[k2 � M2 C i ε]n(8.15c)

where Γ is the well-known gamma function defined by

Γ (x ) DZ 1

0d t e�t t x�1 (8.16)

Setting n D 3, defining an infinitesimal number ε D 4� D and using

Γ ε

2

�D

2ε� γE C O(ε) (8.17a)

x ε/2 D e(ε/2) ln x D 1Cε2

ln x C O(ε2) (8.17b)

we obtainZd4kk2

[k2 � M2 C i ε]3D i π2 �Δ � ln M2 (8.18a)

Δ D2ε� γE (8.18b)

3) In this method the propagator is replaced

1k2 � m2

!1

k2 � m2�

1k2 � Λ2

Dm2 � Λ2

(k2 � m2)(k2 � Λ2)(8.14)

and the integral made finite. After the calculation, Λ !1 is taken to recover the original form.The divergent terms appear as Λ2 or ln Λ.

Page 221: Yorikiyo Nagashima - Archive

196 8 Radiative Corrections and Tests of Qed*

Zd4kk μ k ν

[k2 � M2 C i ε]3D

gμν

4

Zd4k k2

[k2 � M2 C i ε]3(8.18c)Z

d4k[k2 � M2 C i ε]3

D �i π2

21

M2 (8.18d)

where γE is the Euler constant.4) The divergent part is included in Δ. Inserting theabove equalities, we obtain

Γ μ Dα

Z 1

0dz1dz2 dz3δ(1� z1 � z2 � z3)

�(Δ � ln M2)γ μ

C γ μ�

m2(1 � 4z1 C z21 )C q2(1 � z2)(1� z3)

M2

Ci σμνqν

2m2m2z1(1� z1)

M2

� (8.20)

Though we could separate the divergent part, it is by no means unique. If we addsome finite part to the infinity, it is still infinity. To determine the infinite part, wenote that, in the static limit, the current has to have the form of Eq. (8.6) and Γ μ

has to vanish in the limit q2 ! 0. This suggests a prescription to subtract termsevaluated at q2 D 0. Then we can write

γ μ C Γ μ D γ μ C�Γ μ(q2) � Γ μ(0)

D γ μ F1(q2)C

i σμν

2mF2(q2) (8.21a)

F1(q2) D 1Cα

Z 1

0dz1dz2dz3 δ(1� z1 � z2 � z3)

�ln�

m2(1� z1)2 C μ2z1

m2(1� z1)2 � q2z2z3 C μ2z1

Cm2(1� 4z1 C z2

1 )C q2(1� z2)(1� z3)m2(1 � z1)2 � q2z2 z3 C μ2z1

�m2(1� 4z1 C z2

1 )m2(1� z1)2 C μ2z1

�(8.21b)

F2(q2) Dα

Z 1

0dz1dz2 dz3δ(1� z1 � z2 � z3)

�2m2 z1(1� z1)

m2(1 � z1)2 C μ2z1 � q2z2 z3 � i ε

� (8.21c)

The expressions no longer contains the ultraviolet divergence. The treatment of thefinite part of F1(q2) will be postponed until we discuss corrections to the electronpropagator. The calculation of F2(q2) is easier. It does not include the ultraviolet

4) Defined as

γE D limN!1

"NX

nD1

1n� ln N

#D 0.57721566 � � � . (8.19)

Page 222: Yorikiyo Nagashima - Archive

8.1 Radiative Corrections and Renormalization* 197

or infrared divergence. We can safely set μ D 0. To evaluate the static anomalousmagnetic moment, we can set q2 D 0 and

F2(0) Dα

Z 1

0dz1dz2dz3 δ(1� z1 � z2 � z3)

�2m2z1(1� z1)

m2(1� z1)2

Z 1

0dz1

Z 1�z1

0dz2

2z1

(1� z1)D

α2π

(8.22)

Equation (8.7) tells us that the real value of the magnetic moment of the electronμ D g(e/2m)s is not given by g D 2 but (g � 2)/2 D F2(0) D α/2π ' 0.0011614.

8.1.3Infrared Divergence

F1(q2) needs more elaborate calculation. Here we concentrate on the dominant partas μ2 ! 0. It is given as

F1(q2) ' 1Cα

Zdz1 dz2dz3 δ(1� z1 � z2 � z3)

�m2(1� 4z1 C z2

1 )C q2(1� z2)(1� z3)m2(1� z1)2 � q2z2z3 C μ2z1

�m2(1� 4z1 C z2

1 )m2(1� z1)2 C μ2z1

� (8.23)

The divergence occurs in the integration region where z1 ! 1, z2, z3 ! 0. In thisregion, we can set z1 D 1, z2 D z3 D 0 in the numerator of Eq. (8.23). We can alsoset μ2z1 ! μ2. The integral then becomes

F1(q2) ' 1Cα

Z 1

0dz1

Z 1�z1

0dz2

��2m2 C q2

m2(1� z1)2 � q2z2(1 � z1 � z2)C μ2�

�2m2

m2(1� z1)2 C μ2

�(8.24)

By changing the variables

� Dz2

1� z1, η D 1 � z1, dz1 dz2 D �ηd� dη (8.25)

we obtain

F1(q2) ' 1Cα

Z 1

0d�

Z 1

0

d(η2)2

��2m2 C q2

[m2 � q2� (1� � )]η2 C μ2 ��2m2

m2η2 C μ2

D 1�α

Z 1

0d�

�m2 � q2/2

m2 � q2� (1� � )ln�

m2 � q2� (1� � )μ2

� ln

�m2

μ2

��(8.26)

Page 223: Yorikiyo Nagashima - Archive

198 8 Radiative Corrections and Tests of Qed*

In the limit μ2 ! 0, the parameter � in the logarithm is not an important part ofthe integrand. We have already encountered the coefficient of the log function aspart of the soft photon emission probability [see Eq. (7.79)]:

A IR(q2) �Z 1

0d�

m2 � q2/2m2 � q2� (1� � )

� 1

'

([�q2/(3m2)]C Of(�q2/m2)2g �q2 � m2

ln(�q2/m2)f1C O(m2/(�q2))g � 1 �q2 m2

(8.27)

The final expression for F1(q2) becomes

F1(q2) D 1�α

([�q2/(3m2)] ln(m2/μ2) �q2 � m2

ln(�q2/m2) ln(�q2/μ2) �q2 m2(8.28)

The q2 dependence of the second line is known as the Sudakov double logarithm.Since this is a second-order correction to the vertex, the contribution to the crosssection for the elastic scattering can be written (for the case q2 m2) as

dσ(p ! p 0) D dσ0jF1(q2)j2 D dσ0

�1 �

απ

ln�q2

m2ln�q2

μ2C O(α2)

�(8.29)

The cross section for the soft bremsstrahlung (contribution from Fig. 8.1g,h) wasgiven in Eq. (7.79):

dσ(p ! p 0) D dσ0απ

ln�q2

m2ln

ω2max

μ2(8.30)

When the two cross sections are added together, we find

dσ(p ! p 0) D dσ0

�1 �

απ

ln�q2

m2ln�q2

ω2maxC O(α2)

�(8.31)

which is a finite, converging result independent of μ2, as desired.

Summary of the Infrared Divergence The infrared divergence is induced by low-en-ergy (k ! 0) photons (soft photons). Diagram (c) in Fig. 8.1 is a third-order [O(e3)]effect but it is indistinguishable from the lowest order effect. Therefore, their con-tributions have to be added to the lowest order vertex diagram (b) and squared toderive a measurable quantity. Diagram (c) produces an O(e6) effect by itself, butalso produces O(e4) by interfering with (b). This is what we have in Eq. (8.29).Bremsstrahlung processes that emit very low energy photons (g,h) cannot be dis-tinguished from those which emit none, and their contributions have to be added(after squaring the amplitude). This effect is of O(e4). We have shown that the con-tribution from emission of a real soft photon and that from the soft-photon loops

Page 224: Yorikiyo Nagashima - Archive

8.1 Radiative Corrections and Renormalization* 199

compensate each other and no infinities appear. This is true for all orders of per-turbations in QED and is demonstrated in the next section. In fact the infraredcompensation works for any gauge theory.

The reader may consider that the photon-emitting process is physically distinctfrom that with no emission. It is certainly true that high-energy photons can bemeasured and the process is distinct, but consider experimentally. The passage ofthe electron is measured typically using counters of some kind and its momentumby bending in a magnetic field. The counters have a finite size and finite energyresolution, and low-energy photons that fly close to the electron are measured to-gether with the electron, and cannot be separated. If you say this is the problem ofthe detector and not the principle, I would argue that even in principle one couldnever make a detector with infinite accuracy. The detector cannot avoid counting1, 2, 3, . . . , n extra photons, and the sum of all of them becomes finite theoretically.Therefore, for the calculation of the infrared contributions, knowledge of the exper-imental setup becomes necessary in principle. In practice, the calculation is carriedout with some cutoff. If the soft photons are not observed, their contribution can beintegrated over the variables. When one calculates the total cross section or inclu-sive processes (in which one measures some finite number of particles and doesnot observe the rest), one can derive a finite result independent of the experimentalconditions. Thus, we can consider that the infrared divergence is not an inherentproblem.

8.1.4Infrared Compensation to All Orders*

Here we show that the infrared divergence due to photon emission is compensatedby that of virtual photon exchange to all orders. The most comprehensive treatmentof the infrared divergence is given in [397]. Multiple emission of soft photons wasfirst considered by Bloch and Nordsieck [67] classically. The discussion here followsclosely that of Weinberg [382].

We have learned that the soft bremsstrahlung process was well approximatedby elastic scattering of an electreon at a nucleus multiplied by the probability ofemitting a soft photon [e(p )! e(p 0)C γ (k)]. The cross section in the soft-photonlimit was given by Eq. (7.74)

dσ(p ! p 0 C k) D dσ(p ! p 0)Pγ

Pγ D

Zd3 k

2ω(2π)3

XλD1,2

e2

ˇˇ (ελ � p f )

(k � p f )�

(ελ � pi)(k � pi )

ˇˇ2

'απ

ln�q2

m2ln

ω2max

μ2� q2 m2

(8.32)

where μ is the small photon mass introduced to avoid the infrared divergence andωmax is the maximum photon energy the detector is capable of detecting. We haveseen that the infrared divergence vanishes to order α if we include the virtual pho-ton contribution to the vertex.

Page 225: Yorikiyo Nagashima - Archive

200 8 Radiative Corrections and Tests of Qed*

Here we intend to show that the cancellation is effective at every order of photonemission. This may seem an impossible task, since there are a variety of ways inwhich the infrared divergence can enter the various Feynman diagrams. However,the problem is made tractable by extracting only those in which soft photons areemitted from the incoming or outgoing electrons on the mass shell.

The Feynman diagram where a total of n soft photons are emitted from on-shell electrons has the structure depicted in Fig. 8.2a. Only soft photons emittedfrom (nearly) on-shell electrons contribute, because the denominator of the elec-tron propagator before (or after) the photon emission is given as

1(p ˙ k1)2 � m2 C i ε

D1

(p 2 � m2)˙ 2(p � k1)(8.33)

and becomes large only when p 2 � m2 D 0 and k1 ! 0. Contributions of hardphotons or soft photons emitted from deeply off-shell electrons, where (p ˙ k)2 �

m2 m2, are negligible.First we consider a process where all the soft photons are emitted from the out-

going electron. The amplitude has the following form, where we have numbered

p

p-km+1

p-km+1-km+2

p’

km+1

km+2

kn

km

k3

k2

k1

k1k2 k3

k2n

p’

p

HARD PHOTONS

(kj)

(a) (b)

ki

Figure 8.2 Diagrams that contribute to theinfrared divergence. (a) Only soft photonsemitted from on-shell electrons, namely thoseincoming and outgoing electrons at the outerlegs give large contribution. Soft photons orhard photons from deep off-shell electrons

(denoted by the gray blob) do not contributemuch. (b) Only soft-photon internal lines ex-changed by on-shell electrons contribute tocancel the divergence created by real photonemission.

Page 226: Yorikiyo Nagashima - Archive

8.1 Radiative Corrections and Renormalization* 201

the photons as in Fig. 8.2:

u(p 0) � � � (�i q 6 ε�j )

i( 6p 0C 6 k1 C � � �C 6 k j )C m2(p 0 � k1 C � � � C k j )

(�i q 6 ε�j C1)

�i( 6p 0C 6 k1 C � � �C 6 k j C1)C m

2(p 0 � k1 C � � � C k j C1)� � � u(p )

(8.34)

We are only interested in the soft-photon limit (ω D jk0j ! 0). Therefore, we canneglect k/ terms in the numerator. Next we move p/0 to the left and use u(p 0)(p/0 �m) D 0. Namely

u(p 0) 6 ε�( 6p 0C m) 6 ε�( 6p 0C m) � � � u(p )

D u(p 0)[2(ε� � p 0) � ( 6p 0 � m) 6 ε�]( 6p 0 C m) � � � u(p )

D u(p 0)2(ε� � p 0) 6 ε�( 6p 0 C m) � � � u(p )

D u(p 0)2(ε� � p 0)2(ε� � p 0) � � � u(p )

(8.35)

Thus all the numerators can be replaced with (2ε� � p 0). Furthermore, we can usethe formulaX

perm

1(p 0 � k1)

1(p 0 � (k1 C k2))

� � �1

(p 0 � (k1 C � � � kn))

D1

(p 0 � k1)1

(p 0 � k2)� � �

1(p 0 � kn)

(8.36)

which can easily be proven for n D 2:

1(p 0 � k1)

1(p 0 � (k1 C k2))

C (1$ 2) D1

(p 0 � k1)1

(p 0 � k2)(8.37)

The result can be proved for general n by induction. Then the scattering amplitudewith subsequent n-photon emission that is effective at k j ! 0 can be written as

u(p 0)iMe l u(p )nY

j D1

e(ε� � p 0)(p 0 � k j )

(8.38a)

Next we assume all the soft photons are emitted from the incoming electron andrepeat the same procedure. We obtain

u(p 0)iMe l u(p )nY

j D1

��

e(ε� � p )(p � k j )

�(8.38b)

which can be obtained from Eq. (8.38a) by the substitutions

p 0 ! p , k j ! �k j

Page 227: Yorikiyo Nagashima - Archive

202 8 Radiative Corrections and Tests of Qed*

as can be seen in the lower part of Fig. 8.2a. In reality, the n photons can be emittedfrom either the outgoing or the incoming electron. For this case, we have

A real D u(p 0)iMe l u(p )Xperm.

n1Cn2Dn

n1Yi

n2Yj

e(ε� � p 0)(p 0 � ki )

��e

(ε� � p )(p � k j )

D u(p 0)iMe l u(p )nY

j D1

e�

(ε� � p 0)(p 0 � k j )

�(ε� � p )(p � k j )

� (8.39)

Then the probability of emitting n photons is obtained by squaring the amplitude,taking the polarization sum, which gives

Pλ (ε � A)(ε� � B) D �(A � B), and mul-

tiplying by the final-state phase space volume. Note 1/n! comes in because of theBose statistics of the photon. Summing over all n’s, we obtain

dσ D dσ(p ! p 0)1X

nD0

1n!

R n D dσ(p ! p 0) exp[R]

R DZ

d3k(2π)32ω

e2X

λ

�(ε�

λ � p0)

(p 0 � k)�

(ε�λ � p )

(p � k)

�2

(7.79)'

απ

ln��q2

m2

�ln�

ω2max

μ2

�for � q2 D �(p 0 � p )2 m2

(8.40)

where ωmax is the maximum energy the detector is capable of seeing and μ is thephoton mass temporarily introduced to avoid the infrared divergence. Note the one-photon emission probability R was already given as Pγ of the bremsstrahlung crosssection dσ(p ! p 0 C γ ) D dσ(p ! p 0)Pγ in Eqs. (7.74) and (7.79).

Next we consider the corresponding internal soft-photon contributions. Corre-sponding to the n real-photon emission cross section, we need to consider an in-terference term of order 2n (n soft internal lines) with the 0th-order diagram. Firstwe arrange 2n external photons arbitrarily on either outgoing or incoming electronlines. Then we pick up a pair (ki , k j ) of photon lines and connect them to make aninternal line. Recollecting that to each external photon line a factor (�� � p 0)/(p 0 � ki )or�(�� � p )/(p �ki ) is attached, depending on whether the i, j th line is emitted fromthe outgoing or the incoming electron, we have an amplitude for each internal line

A internal D e2Z

d4ki

(2π)4

�ik2

i C i ε

�pi

(pi � ki )

��

�p j

�(p j � ki )

�(8.41)

where the dot product is produced by �gμν (DP

��μ �ν) in the numerator of the

photon propagator. pi and p j stand for p 0 or p in the numerator and p 0 or �p inthe denominator, depending on whether the photon i, j is emitted from the out-going or incoming line. The extra minus sign in the second denominator appearsbecause if we define k as the photon momentum emitted by the line i then k mustbe absorbed by line j.

In making n internal lines out of 2n external lines for all possible combinations,we use Eq. (8.39) to express 2n external photon lines, choose lines i and j (i ¤ j )

Page 228: Yorikiyo Nagashima - Archive

8.1 Radiative Corrections and Renormalization* 203

and connect pairwise to give

Xperm

n1Cn2D2n

n1YiD1

n2Yj D1

(ε�i � p

0)(p 0 � ki )

(ε�j � p )

(p � k j )

!

D

nYiD1

nYj D1

�(ε�

i � p0)

p 0 � ki�

(ε�i � p )

p � ki

�" (ε�j � p

0)p 0 � k j

�(ε�

j � p )

p � k j

#

! �

nYi

��p 0

(p 0 � ki )�

p(p � ki )

��

�p 0

(p 0 � k j )�

p(p � k j )

��k j D�ki

(8.42)

whereP

�μi

��ν

j jiD j D �gμν was used in going to the third line. We have to dividethe sum by 2n n! because, on integrating over ki , interchanging n virtual photonsand two directions of a pair with each other produces the same diagrams. Thereforethe sum of all different combinations gives

A n,virtual DV n

n!(8.43)

V De2

2

Zd4k

(2π)4

�ik2 C i ε

�p 0

(p 0 � k)�

p(p � k)

��

�p 0

�(p 0 � k)�

p�(p � k)

' �α

2πln��q2

m2

�ln��q2

μ2

�for � q2 m2 (8.44)

Derivation of the last line can be done following the same logic as was done to cal-culate the one loop vertex correction function of Eq. (8.2). In fact, one can recognizethat V is just its infrared limit which resulted in Eq. (8.28). Virtual photon exchangediagrams may contain m extra external photon emissions, but they factorize andconstitute a part of Eq. (8.40). Therefore adding 0, 1, 2 . . . , n, . . . virtual photon ex-change contributions, we can obtain the correction factor A virtual that multipliesthe elastic scattering amplitude. The scattering cross section for emission of anynumber of photons becomes

1XnD0

dσ(p ! p 0 C nγ ) D dσ0

1XnD0

R n

n!

! ˇˇ 1X

nD0

V n

n!

ˇˇ2

D dσ0 exp(R) exp(2V )

D dσ0 � exp�

απ

ln��q2

m2

�ln�

ω2max

μ2

��

� exp��

απ

ln��q2

m2

�ln��q2

μ2

��

D dσ0 exp��

απ

ln��q2

m2

�ln��q2

ω2max

��for q2 m2 (8.45)

Page 229: Yorikiyo Nagashima - Archive

204 8 Radiative Corrections and Tests of Qed*

This expression depends on the detector sensitivity, but no longer on the infraredcutoff μ2. Unlike the O(α) correction, which could go to�1,5) the correction factoris finite and lies between 0 and 1. If we set ωmin instead of ωmax as the minimumenergy the detector can detect, the probability of a scattering cross section with nophoton emission is

dσ(p ! p 0 C 0γ ) D dσ0 exp��

απ

ln��q2

m2

�ln��q2

ω2min

��(8.46)

which decreases faster than any power of q2. The exponential correction factorcontains the Sudakov double logarithm and is known as the Sudakov form factor.

n-Photon EmissionIf we set up a detector to observe any number of photons with energy ωmin < ω <

ωmax, the probability of finding n photons would be proportional to R n/n!, whereR is given by integrating the photon energy from ωmin to ωmax in Eq. (8.40)

Pn /1n!

R n

R DZ

d3k(2π)32ω

e2X

λ

�(ε�

λ � p0)

(p 0 � ki )�

(ε�λ � p )

(p � ki )

�2

'απ

ln��q2

m2

�ln�

ω2max

ω2min

�for � q2 D �(p 0 � p )2 m2

(8.47)

As the total probability of observing any number of photons should equal 1, we cancalculate the normalization constant N from

1 D N1X

nD0

R n

n!D N eR (8.48a)

Pn DR n

n!e�R , R D �

απ

ln��q2

m2

�ln�

ω2max

ω2min

�(8.48b)

This gives a Poisson distribution, as we expected from discussions in Sect. 7.3.1.

8.1.5Running Coupling Constant

We now turn to other higher order corrections that involve the self-energy of thephoton and the electron. Let us consider the photon propagator (Fig. 8.3).

When the contribution of the loop correction (second diagram on the right handside, rhs) is added, the photon propagator is modified from the lowest order ex-

5) The whole perturbation argument breaks down when α ln(m/μ) becomes comparable tounity, which occurs at μ � e�137 m � 10�60 m. Physically this is an absurd situation, butmathematically it is a problem.

Page 230: Yorikiyo Nagashima - Archive

8.1 Radiative Corrections and Renormalization* 205

= + + + q q q q q q q q

p-q

p

Figure 8.3 When the higher order radiative corrections are added to the lowest propagator DF ,it is dressed to become iDF0 D iDFC iDF (�iΣγ γ )iDFC . . .. Here only ladder-type correctionsare summed and included.

pression

�i gμν

q2 !gμν

i q2 Cgμ

λ

i q2 (�i Π λσ)gν

σ

i q2 D

�gμν C

1i q2 (�i Π μν)

�1

i q2 (8.49a)

�i Π μν D �

Zd4 p

(2π)4Tr�

i eγ μ i( 6p C m)p 2 � m2 C i ε

i eγ ν i( 6p � 6 q C m)(p � q)2 � m2 C i ε

�(8.49b)

There is an extra minus sign multiplied for each fermion loop. It originates fromthe anticommutativity of the fermion field. The first term in the first line is thelowest order photon propagator, which is the Feynman propagator of the free field.Counting the powers of k, the integral seems to diverge as � k2. However, becauseof gauge invariance, Π μν satisfies the constraint

qμ Π μν D qν Π μν D 0 (8.50)

which means its functional form has to be of the type

Π μν D (q2gμν � qμ qν)Πγ (8.51)

Therefore, the powers of k are effectively reduced by 2 in Πγ , and it only divergesas log k2, just like the loop integral of the diagram in Fig. 8.1c. For the sake of lateruse, we define

Σγ γ � q2Πγ (8.52a)

! Π μν D

�gμν �

qμ qν

q2

�Σ γ γ (8.52b)

If we expand Πγ (q2) in Taylor series6)

Πγ (q2) D Πγ (0)C q2 @

@q2Πγ (q2)

ˇˇ

q2D0C � � � � Πγ (0)C OΠγ (q2) (8.53)

By definition OΠγ (0) D 0. The power of k2 in the integral Eq. (8.49b) which con-tributes to the coefficients of the Taylor series reduces as the power of q2 increases.Since the constant term Πγ (0) contains the highest power of k2, if it diverges at

6) It can be shown that generally Πγ (q2) does not have a pole at q2 D 0.

Page 231: Yorikiyo Nagashima - Archive

206 8 Radiative Corrections and Tests of Qed*

most logarithmically, we expect that coefficients of the higher q2 powers do notdiverge and are finite.

Indeed, actual calculation (see Appendix D for details) shows that the divergentterm is included only in Πγ (0):

Πγ (q2) D Πγ (0) �2απ

Z 1

0dz z(1 � z) ln

�1�

q2

m2 z(1 � z)�

(8.54)

Πγ (0) Dα

3π(Δ � ln m2) (8.55)

Δ is the divergent term and is given by Eq. (8.18). If we use dimensional regulariza-tion, and carry out the integration in D-dimensional space, the integral convergesfor D � ε, ε > 0. The divergence comes back when we make ε ! 0 as Δ containsa pole term 1/ε. The second term is well defined and finite and behaves like

OΠγ (q2) D

8<ˆ:

αq2

15πm2 �q2 ! 0

�α

3πln�q2

m2 �q2 !1

(8.56)

Including the second-order effect, the lowest order Mott scattering amplitudeEq. (8.1) is modified to

i eu(p 0)γ μ u(p )�

gμν

i q2 Cgμλ

i q2 (�i Π λσ)gσν

i q2

�[e j ν(q)] (8.57a)

The term proportional to qμ qν vanishes when coupled with the conserved current(qμ j μ D 0 or qμ uγμ u D 0). The above expression gives

i eu(p 0)γ μ u(p )�i gμν

q2

h1� Πγ (0) � OΠγ (q2)

i[e j ν(q)]

' i eu(p 0)γ μ u(p )�i gμν

q2 [1C Πγ (0)]�1h1 � OΠγ (q2)

i[e j ν(q)]

(8.57b)

Then defining

Πγ (0) Dα

3π(Δ � ln m2) D

1Z3� 1 � δZ3 (8.58a)

eR D e�1C Πγ (0)

�1/2� Z1/2

3 e (8.58b)

we rewrite Eq. (8.57) to give

! i eRu(p 0)γ μ u(p )�i gμν

q2

h1 � OΠγ (q2)

i[eR j ν(q)] (8.59a)

α ! αR De2

R

4π(8.59b)

We shall call e the bare coupling constant and eR the renormalized coupling con-stant. The bare coupling constant, which appears in the original Lagrangian, is

Page 232: Yorikiyo Nagashima - Archive

8.1 Radiative Corrections and Renormalization* 207

then not the true coupling constant. The true coupling constant should include allthe higher order corrections and is an observable because it appears in measurablequantities such as cross sections. Here we have treated δZ3 D Πγ (0) as a smallnumber in the perturbation series in terms of α, though actually it is divergent.

In effect, we have subtracted infinity from infinity to obtain a finite expression,which is not allowed because it is an ambiguous procedure. That is what we learnedin high-school mathematics. The justification is as follows. Higher order correc-tions for the photon come in because of the interactions with charged particles.But if q2 D 0, the photon is on shell, and it is a freely propagating wave. The fieldis completely decided by the equation of motion without a source. We know thepropagator for the free photon. It is the Green’s function to the free Klein–Gordonequation, which is �i gμν/q2. Whatever form the modified propagator takes, it hasto respect this condition. Therefore, in principle, the correction to the lowest orderterm must vanish at q2 D 0 or Πγ (0) D 0 must hold. In practice it does not, so wesubtract it as redundant.

Now, the photon can make an electron–positron pair at any time and acquire anextra self-energy. So applying the replacement Eq. (8.49) to the photon propagatorsuccessively, the net effect is to change DF ! D 0

F (to dress the photon), the dia-grams of which are depicted in Fig. 8.3. Only ladder-type diagrams are collectedhere. More complex diagrams have to be included if we go to higher orders.

Then the expression for D 0F is given by

D 0F

μν(q2) Dgμν

i q2

�1C (�i Σγ γ )

1i q2C(�i Σγ γ )

1i q2

(�i Σγ γ )1

i q2C � � �

Dgμν

i [q2 C Σγ γ (q2)]D

gμν

i q2

11C Πγ (q2)

Dgμν

i q2

1

1C Πγ (0)C OΠγ (q2)�

gμν

i q2

1

Z�13 C OΠγ (q2)

Dgμν

i q2

Z3

1C Z3 OΠγ (q2)(8.60)

As the propagator is always sandwiched between two vertices, it accompanies e2

[see Eq. (8.49)]. Therefore, we move the Z3 factor to them and rename

e2 ! e2R D Z3e2 (8.61)

OΠγ itself is proportional to e2 and appears as multiplied by Z3. Therefore we canreplace e2 in OΠγ by e2

R, too.

Z3 OΠγ (q2, e2) D OΠγ (q2, e2R) (8.62)

Although Z3 is a diverging constant, if we consider Z3 and consequently eR tobe finite, the photon propagator can still be expressed as gμν/q2, but the couplingconstant modified to

e2R(q2) �

Z3e2

1C Z3 OΠγ (q2, e2)D

e2R

1C OΠγ (q2, e2R)

(8.63)

Page 233: Yorikiyo Nagashima - Archive

208 8 Radiative Corrections and Tests of Qed*

e2R is now a function of q2 and is referred to as the effective or running coupling

constant. Apart from the treatment of the infinite constants, all the higher ordercorrections to the photon propagator result in simply making the coupling con-stant run. From the above argument, the ultraviolet problem, as far as the photoncorrection is concerned, can be contained in the coupling constant, a procedurecalled renormalization.

Finally, we give a form frequently quoted as the running coupling constant ofQED at high energy. Substituting Eq. (8.56) in Eq. (8.63) we have at large Q2 D

�q2 > 0

α(Q2) Dα

1�α

3πln

Q2

m2e

(8.64)

where α D α(0). The screening due to vacuum polarization gives the smallestvalue of α at infinite distance. Conversely α becomes larger at larger value of Q2

or at shorter distance.

8.1.6Mass Renormalization

Next let us consider the radiative corrections to the electron (Fermion) propagator.As an example consider the Compton scattering amplitude. Its inner line corre-sponds to the propagator

i SF(p ) Di( 6p C m)

p 2 � m2 C i εD

i6p � m

(8.65)

The next higher order correction is a process to emit and reabsorb a photon, asshown in Fig. 8.4. Using the same argument for the photon propagator, we maysum all the ladder-type repetition of the self-energy:

i SF(p )! i S 0F(p ) D

i6p � m

Ci

6p � m(i Σe(p ))

i6p � m

C � � � (8.66a)

i Σe(p ) DZ

d4k(2π)4

�i gμν

k2 � μ2 C i ε(i eγ μ)

i6p� 6 k � m C i ε

(i eγ ν) (8.66b)

where we introduced a small photon mass μ to avoid the infrared divergence. Aswe can see by counting the powers of k, the integral diverges as k ! 1. Σe(p )can be calculated again using the same regularization method as the photon (see

Figure 8.4 iS 0F D iSF C iSF(�iΣe )iSF C . . ..

Page 234: Yorikiyo Nagashima - Archive

8.1 Radiative Corrections and Renormalization* 209

Appendix D). Then S 0F(p ) can be rewritten as

S 0F(p ) D

i6p � m C i ε

�1C Σe(p )

16p � m

��1

Di

6p � m C Σe(p )(8.67)

Note, the pole of the propagator has shifted, meaning the mass value has changedas the result of higher order corrections. Then the original mass term m, called thebare mass, that appeared in the Lagrangian is not the true mass, but a parameterhaving no physical meaning. The true mass should include all the corrections andthe propagator should have the pole at the correct (or observed) mass, which wedenote by mR. Then we perform the Taylor expansion around 6p D mR in the samespirit as we did for the photon propagator around q2 D 0:

Σe(p ) D δm C [δZ2 C OΠe( 6p )]( 6p � mR) (8.68a)

δm D Σe(p )j p/DmR D Σe (m)C O(α2) ' �3α4π

mΔ (8.68b)

δZ2 �1

Z2� 1 D

@Σe(p )@p/

ˇˇ

p/DmR

' Cα

4πΔ (8.68c)

OΠe(mR) D 0 (8.68d)

where ' means that some finite contributions are omitted. The diverging quan-tities are contained in δm and δZ2. After they are subtracted, the remaining OΠe

is finite. OΠe(mR) D 0 since it is the third term of the Taylor expansion around6p D mR. Using the above quantities, S 0

F (8.66) can be rewritten as

S 0F(p ) D

i6p � m C Σe(p )

Di

[Z�12 C OΠe(p 2)]( 6p � mR)

Di Z2

[1C Z2 OΠe(p 2)]( 6p � mR)

mR D m � δm

(8.69)

As the propagator accompanies e2 also, the replacement e2 ! e0 2R D Z2e2 removes

the diverging Z2, in the same way as we removed Z3. Therefore

i e2SF(p )! i e02RS 0

F De02

R

1C OΠe(p 2)

16p � mR

(8.70a)

e02R D Z2 e2 , mR D m � δm (8.70b)

Thus we have shown that the renormalization effect is to make the replacementm ! mR and multiply the electron propagator by Z2 and the photon propagatorZ3. Z2 and Z3 are absorbed in the definition of the running coupling constant:

e2 !e2

R

(1C OΠe(p 2)(1C OΠγ (q2)� F1(q2) (8.71)

Page 235: Yorikiyo Nagashima - Archive

210 8 Radiative Corrections and Tests of Qed*

where F1 is a vertex correction Eq. (8.24). Note, when the higher order correctionsare applied to the external lines, p 2 D p 02

D m2R or q2 D 0 for the photon,

OΠe(mR) D OΠγ (0) D 0 holds. Therefore, the effect of the renormalization on theexternal line is only to change m ! mR and multiply the wave functions by Z2 orZ3 (see arguments in Sect. 8.1.8).

8.1.7Ward–Takahashi Identity

The correction to the vertex function was calculated from Eq. (8.2)

u(p 0)γ μ u(p )! u(p 0)[γ μ C Γ μ ]u(p ) (8.72)

where Γ μ is given in Eq. (8.20) and includes a divergent term. It can be separatedand removed in a similar way as was done for the propagator. Namely, when wetake the limit

q D p 0 � p ! 0 , 6p 0 ! m , 6p ! m (8.73)

then

Γ μ(p 02, p 2) D Γ μ(m2, m2)C (finite term) � (Z�11 � 1)γ μ C (finite term)

(8.74)

As a consequence, the vertex is modified by higher order corrections to

γ μ ! Z�11 � γ μ(1C finite term) (8.75)

Actually, Z1 can be obtained without explicitly calculating Eq. (8.2). The trick is touse the following relation:

@

@pμSF(p ) D

@

@pμ

16p � m

D �1

6p � mγ μ 16p � m

(8.76)

and

SF(p )S�1F (p ) D 1 (8.77)

Differentiating Eq. (8.77) and using Eq. (8.76), we obtain

@

@pμS�1

F (p ) D �S�1F (p )

�@

@pμSF(p )

�S�1

F D γ μ (8.78)

The meaning of Eqs. (8.76) and (8.78) is that the derivative of a propagator withrespect to the momentum is equivalent to attaching a zero-energy photon or con-verting the propagator to a vertex with zero momentum transfer. The argument

Page 236: Yorikiyo Nagashima - Archive

8.1 Radiative Corrections and Renormalization* 211

can be generalized and any vertex function can be obtained by differentiating thecorresponding propagator:

Γ μ(p , p ) D@

@pμS 0 �1

F (p )� γ μ D@Σe (p )

@pμ(8.79)

which is referred to as the Ward–Takahashi identity. It is easily seen that the rela-tion holds between the lowest corrections Eq. (8.2) and Eq. (8.66b). DifferentiatingEq. (8.68a) and comparing with Eq. (8.74), we immediately see

Z1 D Z2 (8.80)

8.1.8Renormalization of the Scattering Amplitude

To see where all the renormalized constants Z1, Z2, Z3 and δm end up in the finalscattering amplitude, we consider an example, Compton scattering (Fig. 8.5).

Figure 8.5 (a) Compton scattering in the lowest order. (b) Up to fourth order with dressing. SeeFig. 8.6 for dressing.

The scattering amplitude corresponding to the first Feynman diagram on the rhsin Fig. 8.5a is expressed as

M � ε0 �μ u0(i eγ μ)

i6pC 6 k � m

(i eγ ν)uεν (8.81)

where we have suppressed the spin and polarization suffixes and have writtenus(p 0)! u0, u r(p )! u, εμ(k0, λ0)! ε0. Taking into account higher order correc-tions, the internal as well as external lines of the electron propagator acquire Z2 inaddition to m ! m � δm, the photon propagator Z3, and the vertex function Z�1

1 .Namely, after including the corrections, the scattering amplitude is modified to

M � (Z3ε0 �μ )(Z2u0)

�i Z�1

1 eγ μ� � i Z2

6pC 6 k � m C δm

�� (i Z�1

1 eγ ν)(Z2u)(Z3εν)(1C � � � )(8.82)

where � � � are contributions from OΠe(p 2), OΠγ (q2), finite terms in the vertex correc-tion and others. We divide Z2, Z3 into two as Z1/2

2 � Z1/22 , Z1/2

3 � Z1/23 and move

Page 237: Yorikiyo Nagashima - Archive

212 8 Radiative Corrections and Tests of Qed*

them to combine them with the wave function and the coupling constant e. Then

M ��Z1/2

3 ε0μ

���Z1/22 u0��i Z�1

1 Z2Z1/23 eγ μ� i

6p � mR

��i Z�1

1 Z2Z1/23 eγ ν��Z1/2

2 u��

Z1/23 ε

�(1C � � � )

D ε0R

�u0R(i eRγ μ)

i6p � mR

(i eRγ ν) uRεR(1C � � � )

(8.83)

The modified expression does not contain diverging constants (Zi ’s, δm). In sum-mary, by the replacements

e ! eR D Z�11 Z2Z1/2

3 e D Z1/23 e (8.84a)

m ! mR D m � δm (8.84b)

ψ ! ψR D Z1/22 ψ (8.84c)

A μ ! A Rμ D Z1/23 A μ (8.84d)

all the diverging constants are absorbed in the redefinition of the coupling con-stant, mass, and wave function normalizations. Corrections to the second Feyn-man diagram in Fig. 8.5a can be treated similarly. We call the propagators and thevertex functions with corrections “dressed”. Then the two lowest order amplitudeswritten in terms of dressed particles are modified to the two diagrams written inthin bold lines in Fig. 8.5b. The dressed diagram contains diagrams as depicted inFig. 8.6.

Figure 8.6 Irreducible diagrams such as thefirst term on the rhs can be dressed by takingthe self-energy and vertex corrections. Re-ducible diagrams to the right of irreducible

diagrams can be included in the irreduciblediagram simply by dressing the propagatorsand the vertices of the irreducible diagram.

There are other higher order effects that cannot be expressed by simply dressingthe particle, for example the third and fourth diagrams in Fig. 8.5b. Their contri-bution is finite and is included in (1C � � � ). They are referred to as irreducible dia-grams, which are defined as ones that cannot be separated by cutting a single line.They have to be calculated separately. Examples of reducible diagrams are shownon the rhs of Fig. 8.6. The reducible diagrams in Fig. 8.6 (diagrams on the rhs) canbe included in the irreducible diagram (the first diagram) by simply dressing itspropagators and vertices.

Page 238: Yorikiyo Nagashima - Archive

8.2 Tests of QED 213

The prescription developed above seems rather artificial. All the infinities arecontained in the mass mR, the coupling constant e2

R and the wave function normal-ization. We simply swept the dust under the carpet. However, what we measure inexperiments are masses and coupling constants with all the theoretical correctionsincluded. Then m, α that appear in the Lagrangian should be regarded as mere pa-rameters, and measured numbers, such as m D 0.511 MeV, α D 1/137, . . . shouldbe assigned to mR D m � δm and eR D e � δe. As renormalized theory is writtenpurely in terms of eR and mR, which are observed quantities, it does not includeany infinities and is mathematically consistent in itself. QED, or quantum electro-dynamics, thus established by Tomonaga, Schwinger and Feynman was the firstsuccessful gauge field theory that could give consistent and unambiguous predic-tions starting from first principles. Nevertheless, ad hoc treatments of infinities,such as renormalization, can only be justified if they agree with experiment.

8.2Tests of QED

8.2.1Lamb Shift

The first term in Eq. (8.59) of the Mott scattering amplitude with radiative correc-tion represents the usual Coulomb force in the static limit. The second term isthe QED correction and is referred to as the vacuum polarization effect. In quan-tized field theories, the vacuum is not an empty space but constantly creating andannihilating charged particle pairs. Under the influence of the applied externalelectromagnetic effect, those pairs naturally polarize. The vacuum behaves like adielectric material and its effect appears as a part of the radiative correction. Onephenomenon in which this correction appears is the energy shift of the hydrogenlevels. They can be calculated by including the Coulomb potential in the Diracequation, but without using quantized field theory the levels 2s1/2 and 2p1/2 are de-generate. Experimentally, there is a gap between the two energy levels (see Fig. 4.1),which is called the Lamb shift [248]. The energy levels are calculated by taking ma-trix elements of the effective Coulomb potential between the two hydrogen wavefunctions. The effective Coulomb potential can be obtained by considering higherorder corrections to the Mott scattering amplitude. In addition to the vacuum po-larization effect, they also include vertex correction, together with the infrared andthe anomalous magnetic moment effects, as we saw in Sect. 8.1.2. Altogether, thenet effect on the Mott amplitude is given by

u(p 0)�

γ μ�

1 �αq2

3πm2

�ln

mμ�

38�

15

� C

α2π

i σμνqν

2m

�1jqj2

u(p ) (8.85)

The first term in the braces f� � � g represents the first-order Mott scattering ampli-tude. The first and second terms in the parentheses (� � � ) are the vertex corrections,�1/5 is the vacuum polarization and the last term in the square brackets [� � � ] is

Page 239: Yorikiyo Nagashima - Archive

214 8 Radiative Corrections and Tests of Qed*

the contribution due to the anomalous magnetic moment. The correction includesterms proportional to q2, which for the static field is r2 acting on the space coor-dinate of the field. The operation of r2 on the Coulomb potential produces δ3(0).Namely, only the wave function at the origin contributes, hence it affects the s statebut not the p state. Therefore, the correction removes the degeneracy between s1/2

and p1/2. In configuration space, the effective potential shift becomes [215]

ΔVeff D4α2

3m2

�ln

mμ�

38�

15C

38

�δ3(x )C

α2

4πm2

σ � Lr3

(1) (2) (3) (4) (40)(8.86)

The vertex correction terms [(1) and (2)] are larger than the vacuum polarizationeffect (3) in the Lamb shift. The infrared cut is determined by extrapolating itscontribution and connecting it smoothly with that obtained by nonrelativistic treat-ment, which is finite. The contribution from the anomalous magnetic moment[terms (4) and (40)] can be calculated by extracting the lowest order matrix elementof the effective Hamiltonian

H D ��e

2mψ(x )

�12

σ0 i F0 i

�ψ(x ) , � D

α2π

(8.87)

If we express the energy shift by its corresponding frequency (ΔE D hΔν), thevertex correction amounts to 1010 MHz, the vacuum correction to�27.1 MHz, andthe anomalous magnetic moment to 68 MHz giving a total of 1051 MHz which isaccurate to about 6 MHz. Since then, higher order corrections have been calculated,so the accuracy is now � 0.01 MHz.

Theory total 1057.8445 ˙ 0.0026 MHz [279]Experiment 1057.8450 ˙ 0.0090 MHz

As one can see, the measurement is precise enough to test the correction. It wasQED’s first major success to convince people that it is the right theory for the elec-tromagnetic interactions of the electron.

8.2.2g � 2

In the radiative correction to the vertex, a term proportional to qν σμν appears. Thisis just σ � r � A � μ � B and is a higher order correction to the magnetic mo-ment that alters the value of g � 2 from zero. The statement that the Dirac particlehas g D 2 is no longer correct when higher order QED effects are included. Wehave calculated the first loop correction to the vertex and obtained a correction thatamounts to α/2π [see Eq. (8.22)]. Calculations of even higher orders have since

Page 240: Yorikiyo Nagashima - Archive

8.2 Tests of QED 215

been performed [219, 232] and have been compared with the experimental values:�g � 2

2

�eD

α2π� 0.328 478 444 002 90 (60)

απ

�2

C 1.181 234 016 828 (19) α

π

�3

� 1.914 4(35) α

π

�4C 0.0(3.8)

απ

�5C � � �

(8.88a)

Inserting the value of the fine structure constant, the theoretical value can be com-pared with the experimental data:

(1 159 652 180.85 ˙ 0.76) � 10�12 experiment [219, 311] (8.88b)

The experimental value nowadays is so accurate that it perfectly agrees with thetheoretical value. In fact, the theoretical error is larger due to uncertainty in thefine structure constant. Therefore, the experimental value is used to determine themost accurate value of the fine structure constant [219]:

α�1 D 137.035 999 069 (96) [relative error D 0.70 ppb 7)] (8.89)

The experimental value of g � 2 is no longer a value to be tested but a standard.Similar calculations have also been made for the muon [311]�

g � 22

�μD

α2πC 0.765 857 410

απ

�2C 24.050 509

απ

�3

C 130.805α

π

�4� � �

(8.90a)

The formula gives

D (116 584 718.10˙ 0.16) � 10�11 QED only

(C154˙ 2) � 10�11 electroweak

(C6 916˙ 46) � 10�11 hadronic

D (116 591 788˙ 46) � 10�11 theory total

(8.90b)

D (116 592 080˙ 54) � 10�11 experiment (8.90c)

The different value of the muon g � 2 reflects the difference of the mass. Theerror in the muon is much larger than that of the electron. This is because for themuon the hadronic as well as weak interaction correction is sizeable (Fig. 8.7b–e).The hadronic correction enters in the process in which a virtual photon becomeshadrons (i.e. quark pairs). The hadronic correction uses the high-energy e�eC !hadrons data.

Although the procedure of subtracting infinity from infinity is aesthetically un-satisfactory, a theory that can reproduce the experimental result to the accuracy of

7) ppb: parts per billionD 10�9.

Page 241: Yorikiyo Nagashima - Archive

216 8 Radiative Corrections and Tests of Qed*

(a) (b) (c) (d) (e)

h

γ

μγ γ

μ ν μ

W W

γ

μ μγ

μ Z

h

μγ

γ

μ μγ

γμμ μ

μ

Figure 8.7 Feynman diagrams for g � 2 of the muon. (a) Schwinger term [O(α) correction],(b) hadronic contribution, (c) hadronic light-by-light contribution, (d,e) lowest order W ˙, Zcontribution.

10�10 must have some truth in it. That we trust in the renormalization procedurehas firm foundations experimentally. In fact, the confidence in the gauge theorystarting from QED is so solid that any deviation of the experimental measurementof g�2 is interpreted as an effect of unknown new physics. In fact, a small discrep-ancy from the theoretical prediction was detected in measurements of the g � 2 ofthe muon [281]. It could be an experimental error, but if real it could signal a possi-ble new physics, for instance, supersymmetry (see Chap. 19) and is one of the hottopics of the day (as of 2010). Nowadays, the precision measurement of a physicalobject to look for a possible deviation from the theoretical prediction is considereda valid and powerful method to search for new physics.

8.2.3Limit of QED Applicability

QED is a quantized version of the well-established classical Maxwell theory. At themacroscopic level, it can be applied to the magnetic field in the periphery of Jupiterand to the galactic magnetic field. Namely, it is valid at the scale of at least a fewhundred light years. What about at the microscopic level? The fact that there was alimit to the applicability of classical electromagnetism at the turn of the 20th cen-tury led to the introduction of quantum theory. We can test the validity of QED atsmall distances or at high energy by performing e�C eC ! μ�C μC and Bhabhascattering (e�C eC scattering), etc. If we parametrize the difference of the electronpropagator between the theory and the experiment by F D 1C s/Λ2, where s is thetotal energy squared, we can estimate quantitatively the applicability limit of QEDby the value of Λ. At the present experimental level, we see no deviation from QED.The present experimental lower limit of Λ is larger than 1 TeV. In other words, weknow that QED is valid down to the length scale of 10�20 m. Therefore, QED isvalid in scales over the range of 40 orders of magnitude and is still extending itsvalidity. QED’s triumph is remarkable.

8.2.4E821 BNL Experiment

Although the measured value g�2 of the electron is much more precise and is usedto define the prime fine structure constant α, the muon is much more sensitive to

Page 242: Yorikiyo Nagashima - Archive

8.2 Tests of QED 217

effects other than QED, including the electroweak effect (i.e. the contribution ofFeynman diagrams that contain W ˙, Z0) and hadronic interactions and possiblynew physics. The contribution of the physics of the energy scale M to g � 2 isproportional to (m/M )2 and hence the effect is � (mμ/me )2 ' 4.3 � 104 larger forthe muon. Even though aμ � (g � 2)/2 is � 800 times less precise than ae , it isstill � 50 times more sensitive to new physics.

Since hadronic as well as weak effects on g�2 can be calculated accurately usingthe now established Standard Model (SM), the deviation, if any, of a measurementfrom the prediction of SM could be a possible clue to new physics. For this reason,the precise measurement of g � 2 of the muon not only serves as a crucial test ofQED but is also at the forefront of experimental research in high-energy physics.We describe briefly how to measure the value of g � 2 experimentally.

Principle of the Measurement A charged elementary particle with half-integerspin has a magnetic dipole moment μ with spin s:

μ D gs

q2m

�s (8.91)

where q D ˙e is the charge of the particle and the proportionality constant gs

is referred to as the Landé g-factor. In a homogeneous magnetic field where themagnetic energy is given by H D �μ � B , the spin receives a torque from the field,which is given by N D μ � B , and the spin makes a precessive motion on the axisparallel to the magnetic field. In the particle’s rest frame the precession (Larmor)frequency is given by

ωL D gsqB2m

(8.92)

but when the particle is moving, a relativistic effect called Thomas precession isadded and the net frequency is given by

ω s D ωL C ωT D gsqB2mC

qBγ m

(1� γ ) (8.93)

The particle makes a circular orbit perpendicular to the magnetic field. Assumingv � B D 0, its angular velocity is given by the cyclotron frequency ωc and the spinprecession frequency relative to the orbit becomes

ωa D ω s � ωc D gsqB2m„ƒ‚…

Larmor

CqBmγ

(1� γ )„ ƒ‚ …Thomas

�qBm㄃‚…

cyclotron

D

�gs � 2

2

�qBm� aμ

qBm

(8.94)

Since, m (the muon mass) is known and the strength of the magnetic field can befixed by experimental conditions, what we need to determine the anomaly aμ isto measure the spin precession frequency. But at the precision level of parts perbillion (ppb), B and m have to be measured with full care, too.

Page 243: Yorikiyo Nagashima - Archive

218 8 Radiative Corrections and Tests of Qed*

Thomas precession may be understood as follows. Imagine an airplane flyingin a circular orbit. Its motion may be considered as a series of longitudinaladvances by d`k ' rΔθ accompanied by a transverse shift d`? D Δθ d`k.Inside the airplane (variables denoted by a prime), the longitudinal distance isLorentz contracted (d`0

k D d`k/γ ), but the transverse distance is not (d`0? D

d`?), giving the angular shift (Δθ 0 D γ Δθ ). When the airplane comes backto the original point, the rotation angle in the airplane frame (comoving frame)has an excess 2π(γ � 1) relative to the ground frame. Therefore, the rate ofextra advance rotation per period (angular frequency) is given by

ωcom

ωcircD

2π(γ � 1)/T2π/T

D γ � 1 (8.95)

This is a shift of a reference axis with which the rotation angle of a spin-ning object is measured. The advance shift of the reference means the spinprecession frequency of a muon moving in a circular orbit with the veloci-ty v D 2πr ω in the laboratory frame is smaller by γ � 1. Since the circularmotion of the muon is due to the Lorentz force of the magnetic field and its an-gular frequency (cyclotron frequency ωc) itself becomes smaller by γ becauseof the relativistic mass increase, the net change in the angular frequency dueto the Thomas precession becomes ΔωThomas D �(qB/m)(γ � 1)/γ .

Note: The Thomas precession is a purely kinematical effect independent ofthe dynamics of the spinning object and is common to objects in quantummechanics and classical tops or gyroscopes.

Experimental Method(i) Measurement of ωa : Firstly, we have to prepare a large enough number of po-larized muons and a method to measure their direction. Polarized muons can beobtained by collecting decay muons emitted in the forward direction of flying pio-ns. The weak interaction mediated by the gauge boson W ˙ acts only on the left-handed particles. Because of this, the decay neutrino from πC in the reaction

πC ! μC C νμ (8.96a)

μC ! eC C ν e C νμ (8.96b)

has helicity �1 and by angular momentum conservation μC has also helicity �1,namely is 100% polarized in the pion rest frame. Consequently, muons emitted inthe forward direction in the laboratory frame along the high-energy pion beam arealso highly polarized. The left-handed nature of the weak interaction also works inmuon decay and e˙ are preferentially emitted antiparallel (parallel) to the polar-ization of μ˙. Therefore the electron intensity at a fixed direction relative to themuon momentum is modulated by the muon spin precession (see Fig. 8.8). Name-

Page 244: Yorikiyo Nagashima - Archive

8.2 Tests of QED 219

ly, the muon has a built-in polarization analyzer. The modulation frequency directlyreflects ωa in Eq. (8.94).(ii) Measurement of B: The magnetic field B is measured by observing the Larmorfrequency of stationary protons, ω p D g p (eB/2m p ), in a nuclear magnetic reso-nance (NMR) probe. This requires the value of μμ/μ p to extract aμ from the data.Fortunately, it can be determined precisely from the measurement of the hyperfinestructure in muonium (μ–e bound state). From Eq. (8.94) one obtains

aμ DR

μμ/μ p �R , R D ωa

ω p(8.97)

(iii) Focusing of the muon beam: To store muons in a circular orbit and retain themfor the duration of the measurement, focusing of the beam is necessary. This canbe achieved by making gradients on the magnetic field, but then the homogene-ity of the field is sacrificed, which is the most critical factor in the experiment toobtain high precision at ppb level. Therefore, E821 (and also previous CERN exper-iments [36]) used an electrostatic quadrupole field focusing. With the presence ofthe electric field, the precession frequency is modified to [216, 276]

ωa Dqm

�aμ B �

�aμ �

1γ 2 � 1

�v � E

�(8.98)

For γmagic D 29.3 (pμ D 3.09 GeV/c), the second term vanishes and aμ reduces toEq. (8.94) again.

Figure 8.9a shows schematically the muon injection and storage in the g � 2ring. The trajectory of the injected beam is modified by three kicker modules tobecome a circle that stays in the ring. The spin precesses � 12ı/circle (Fig. 8.9b).The decay electrons are measured by 24 lead-scintillator calorimeters distributedalong the inner circle of the ring. Figure 8.8 shows the measured distribution ofelectron counts. The result gives

aexpμ D 1.165 920 80 (63) � 10�3 (8.99)

Figure 8.8 Distribution of electron counts versus time for 3.6� 109 muon decays in the R01 μ�data-taking period. The data are wrapped around modulo 100 μs. Data from [281].

Page 245: Yorikiyo Nagashima - Archive

220 8 Radiative Corrections and Tests of Qed*

Protons Pions Polarized Muons Inflector

KickersStorage Ring

Injection Orbit

Ring Orbit

p = 3.1 GeV/cTarget

Polarized Muons

Storage Ring

Momentum

Spin

ωa = aμ eB mμ

24 Calorimeters

Decay Electrons

(a) (b)

Figure 8.9 (a) Schematic diagram of the E821muon storage ring [281]. Polarized muonswith � 3.1 GeV/c enter the storage ringthrough a field-free channel. The three kick-er modulators provide a short current pulse,which gives the muon bunch a transverse10 mrad kick and places it in the ring orbit.

(b) Schematic diagram of the muon spin pre-cession and the electron detector layout. Notethe spin precession rate is exaggerated. Inreality it is � 12ı/turn. Electrons are prefer-entially emitted parallel or antiparallel to themuon spin direction. Figure from [219].

where the number in parenthesis denotes error in the last two digits. The experi-mental value is compared with the theoretical prediction

atheorμ D 1.165 917 93 (68) � 10�3 (8.100)

which is 3.2σ (σ D standard deviation) off. Whether the deviation is a possible signof new physics is under much debate.

Page 246: Yorikiyo Nagashima - Archive

221

9Symmetries

When one does not know the principles or the fundamental equations that governthe phenomena one observes, one often obtains clues to them by looking at thesymmetries or selection rules they exhibit. Historically, Maxwell constructed hisunified theory of electromagnetism by assuming the existence of the displacementcurrent, which was required by current conservation. Discovery of special relativitywas inspired by Lorentz invariance of Maxwell’s equations. History also shows thatdisclosed symmetries, more often than not, turn out to be approximate. Isospin andflavor S U(3), which led to the discovery of the quark model, are such examples. To-day what people believe to be strictly conserved are based on gauge symmetries.Energy-momentum, angular momentum and charge conservations are in this cat-egory. The conservation of baryon and lepton numbers, although phenomenolog-ically valid, are not considered as strict laws. Their mixing is an essential featureof the grand unified theories (GUTs). Symmetries and conservation laws are inti-mately connected. Whether based on gauge symmetry or not, apparent symmetriesand conservation laws can be powerful tools for discovering new physics.

Symmetry means one cannot tell the difference before and after a certain trans-formation. For the case of continuous space-time translation, this means thereis no special position or absolute coordinate system in setting space-time coordi-nates for describing a physical phenomenon. The same experiments done at twodifferent places (say in Tokyo and in New York) or done at different times (todayor tomorrow) should produce the same result. In other words, translational in-variance means the existence of certain nonobservables, i.e. absolute position andtime. So far, we have elaborated on the relation between space-time translation-al symmetry and the resultant energy-momentum conservation. However, it is acommon feature of symmetry. Similar relations can be derived for a variety of sym-metry transformations. We list some of them here in Table 9.1 before proceedingfurther.

Symmetries come in two categories, first, those related to space-time structure,which are often referred to as external, and second, those related to internal sym-metries, such as electric charge, isospin and color charge. Mathematically speak-ing, the latter leave the Lagrangian density invariant after the symmetry operationapart from a total derivative. It gives a surface integral that vanishes under normalcircumstances. Space-time symmetries change the coordinate values, so only the

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 247: Yorikiyo Nagashima - Archive

222 9 Symmetries

Table 9.1 Example symmetries and conservation laws.

Nonobservables Symmetry Conserved observablestransformation or selection rules

Space-time symmetry

Absolute position x ! x C a MomentumAbsolute time t ! t C b Energy

Absolute direction θ ! θ C α Angular momentum

Absolute velocity Lorentz transformation Lorentz invarianceRight or left x !�x Parity

Internal symmetry

Particle identity Permutation Bose–Einstein

or Femi–Dirac statisticsCharge + or � Q!�Q C invariance

Absolute phase '! e�iQα ' Charge– among quarks S U(3) gauge transformation Color charge

Approximate symmetry

– among leptons a ψ ! e�iLα ψ Lepton number

– among quarks ψ ! e�iBα ψ Baryon numberNear equal mass of u, d Flavor S U(2) Isospin

– of u, d, s Flavor S U(3) Unitary spin

a Phenomenologically, lepton and baryon numbers are conserved.

action, which is an integral of a Lagrangian over all space-time, remains invari-ant. In other classifications, they are either continuous or discrete. We discuss thecontinuous space-time symmetries first.

9.1Continuous Symmetries

Where there is a symmetry, there exists a corresponding conservation law. Whatthis means is that when an equation of motion or the Lagrangian is invariant undersome symmetry operation, there exists a physical observable that does not changeas a function of time. We have already shown some examples in Sect. 5.1.4 in theform of Noether’s theorem. Here, we start by comparing how the symmetry and theinvariance of the equation of motion are related in classical, quantum mechanicaland quantum field theoretical treatments.

Page 248: Yorikiyo Nagashima - Archive

9.1 Continuous Symmetries 223

9.1.1Space and Time Translation

Classical Mechanics In the Lagrangian formalism, the equation of motion is givenas the Euler–Lagrange formula

dd t

�@L(qi , Pqi )

@ Pqi

��

@L(qi , Pqi)@qi

D 0 (9.1)

If the Lagrangian is translationally invariant, namely,

L(qi) D L(qi C ai ) (9.2)

or equivalently @L/@qi D 0, the Euler–Lagrange formula Eq. (9.1) tells us that theconjugate momentum defined by

pi D@L@ Pqi

(9.3)

is a constant of time. Therefore, translational invariance in space coordinates leadsto the existence of the conjugate momentum, which is time invariant. Since theLagrangian formalism is valid in a general coordinate system, this statement isgeneral. If qi is an angular variable, the conservation of angular momentum fol-lows. The time derivative of the Hamiltonian is expressed as

dHdtD

dd t

�Pq@L@ Pq� L

�D Rq

@L@ PqC Pq

dd t

�@L@ Pq

��

�@L@tC

@L@qPq C

@L@ PqRq�

D �@L@t

(9.4)

where the Euler–Lagrange equation was used to arrive at the last equation. Theabove expression tells us that the Hamiltonian is a constant of motion if the La-grangian does not contain time variables explicitly. Namely, translational invariancein time leads to energy conservation.

Quantum Mechanics Observables in quantum mechanics are expressed as matrixelements of hermitian operators. If there is a unitary operator OU 1) that does nothave explicit time dependence, and if the transformed wave function ψ0 D OU ψobeys the same Schrödinger equation, then

i@

@tψ D OH ψ ! i

@

@tOU ψ D OU OH OU�1 OUψ

) i@

@tψ0 D OH ψ0 ! OU OH OU�1 D OH or [ OH , OU ] D 0

(9.5)

1) The symmetry operation has to be unitary because the transition probability of the state beforeand after symmetry transformation has to be the same. (hψ0jψ0i D hψj OU† OUjψi D hψjψi).We attach a hat O to differentiate operators from numbers when we discuss objects in quantummechanics.

Page 249: Yorikiyo Nagashima - Archive

224 9 Symmetries

The operator OU commutes with the Hamiltonian. If OU is a continuous function ofsome parameter ε and OU ! 1 when ε ! 0, then for infinitesimal ε

OU ' 1� i ε OQ (9.6)

As OU is unitary, OQ is hermitian. The operator OQ in OU D e�i OQα , where α is a contin-uous parameter, is called the generator of the transformation. Since, the hermitianoperator in quantum mechanics is an observable, OQ is an observable. As OQ com-mutes with the Hamiltonian, the time derivative of the expectation value is givenby

dh OQid tD

dd thψj OQjψi D ihψj[ OH , OQ]jψi D 0 (9.7)

Therefore, if there is a continuous symmetry represented by a unitary operator OUthat leaves the equation of motion invariant, its generator OQ is a constant of time.

The hamiltonian is a time translation operator. This is obvious from

ψ(t) D e�i OH t ψ(0)! ψ(t C a) D e�i a OH ψ(t) (9.8)

To find the corresponding space translation operator OU(a) D e�i a OQ that is definedby

ψ(x )! ψ(x C a) D OU(a)ψ(x ) (9.9a)

we make the displacement infinitesimal:

OU(�)ψ(x ) D [1 � i� OQC O(�2)]ψ(x ) D ψ(x C �)

D ψ(x )C �@

@xψ(x )C O(�2)

) OQ D i@

@xD � Op

(9.9b)

A Lorentz invariant expression can be obtained by combining Eqs. (9.8) and (9.9):

ψ(x μ C aμ) D e�i aμ Pμ ψ(x μ) (9.10a)

Pμ D i(@0,r) (9.10b)

Problem 9.1

For a translational operator Op where O(x ) is a function of Op and x:

O(x C a) D e i a Op O(x )e�i a Op (9.11a)

where O(x ) is a function of Op and x, prove

rO(x ) D i [ Op , O(x )] (9.11b)

Page 250: Yorikiyo Nagashima - Archive

9.1 Continuous Symmetries 225

Translational Invariance in Quantum Field TheoryCovariant version of the Heisenberg equation of motion is expressed as

@μ O D i [Pμ , O ] (9.12a)

O(x C a) D e i aμ Pμ O(x )e�i aμ Pμ (9.12b)

The field theoretical version of the energy momentum operator is given by

P μ D

Zd3x T μ0 (9.13)

where T μν is the energy-momentum tensor that can be derived from the La-grangian using Noether’s theorem. Here, P μ and O(x ) are functions of thefield operators. Note, corresponding to the negative sign of space variables inthe Lorentz-invariant expression (aμ Pμ D a0P0 � a � P ), Pμ D (P0,�P) is used.

We first prove that Eq. (9.12b) is equivalent to the Heisenberg equation (9.12a).To prove the equivalence, it is enough to show that one can be derived from theother and vice versa. Consider aμ in Eq. (9.12b) as a small number εμ , expand bothsides of the equation and compare terms of O(ε):

(1C i εμ Pμ)O(x )(1� i εμ Pμ) ' O(x )C εμ@μ O (9.14a)

) i [Pμ , O(x )] D @μ O(x ) (9.14b)

Conversely, to derive Eq. (9.12b) from Eq. (9.12a), we use the Baker–Campbell–Hausdorff (BCH) formula.

Baker–Campbell–Hausdorff formulae Let A, B be operators.

Theorem 9.1

eAB e�A D B C [A, B ]C12!

[A, [A, B ]]C � � �

C1n!

nA0 s‚ …„ ƒ[A, [A, [A, � � � [A, B ]]]]C � � �

(9.15)

Corollary 9.1

[B, e�A] D e�A�

[A, B ]C12

[A, [A, B ]]C � � ��

(9.16a)

[eA, B ] D�

[A, B ]C12

[A, [A, B ]]C � � ��

eA (9.16b)

Page 251: Yorikiyo Nagashima - Archive

226 9 Symmetries

Theorem 9.2

Assuming [A, [A, B ]] D [B, [B, A]] D 0

eACB D e� 12 [A, B ]eAeB D eAeB e� 1

2 [A,B ] (9.17a)

eAeB D e[A,B ]eB eA (9.17b)

Corollary 9.2

eA/2eB eA/2 D eACB (9.18)

Problem 9.2

Prove Theorem 9.1 and Theorem 9.2

Replace A by i Pμ aμ , B by O, and make use of Eq. (9.12a), then

e i Pμ aμO(x )e�i Pμ aμ

D O(x )C aμ@μ O(x )C � � � Can

n!@

(n)μ O(x )C � � �

D O(x C a)(9.19)

It remains to prove Eqs. (9.12) with Pμ expressed as Eq. (9.13). We consider thecase of a complex scalar field. Generalization to other fields is straightforward. Theenergy-momentum operator is given in Eq. (5.93).

H DZ

d3x��

@'†

@t

��@'

@t

�C (r'† � r')C m2'†'

�(9.20a)

P D �Z

d3x�

@'†

@tr' Cr'† @'

@t

�(9.20b)

Using the equal time commutation relation

['(x ), π(y )]txDt y D ['(x ), P'†(y )]txDt y D i δ3(x � y ) (9.21)

it is easy to verify Eq. (9.12a) if O D '(x ) or O D π(x ).Then for any operators A, B that satisfy

@μ A D i [Pμ , A] , @μ B D i [Pμ , B ] (9.22a)

i [Pμ, AB ] D i [Pμ , A]B C Ai [Pμ, B ] D (@μ A)B C A@μ B D @μ(AB) (9.22b)

it follows that the equation is valid if (AB) is replaced with any polynomials ofA and B. Since Pμ is a conserved operator that does not include any space-time

Page 252: Yorikiyo Nagashima - Archive

9.1 Continuous Symmetries 227

coordinates, applying any function of space-time differentiation g(@) on both sidesof the equation gives

i [Pμ , fg(@)O(A, B)g] D @μfg(@)O(A, B)g (9.23)

Therefore, if O is a local operator, i.e. a polynomial of '(x ), π(x ) and their differen-tials, it satisfies

@μ O D i [Pμ O ] (9.24)

q.e.d.In summary, the notion that for the space-time translational symmetry there ex-

ists a corresponding conserved quantity, the energy-momentum, is valid in quan-tum field theory as it was in classical field theory. In quantum field theory, the en-ergy-momentum operators are the generators of the translational transformation.Note also that the Heisenberg equation holds even if the fields are interacting, be-cause the equal time commutation relation, which was the basis of the proof, isalso valid for interacting fields. The validity is ensured by the fact that they areconnected by a unitary transformation [see Eq. (6.7)].

9.1.2Rotational Invariance in the Two-Body System

Discussions in the previous section have shown that energy-momentum conser-vation is a result of translational invariance in Cartesian coordinates. As the La-grangian formalism is valid for a general coordinate system, translational invari-ance in polar coordinates φ ! φ C α results in another conserved quantity, theangular momentum. Expressed in the original Cartesian coordinates

pφ D �i@φ D �i(x@y � y@x ) D r �r

iD l z (9.25)

In Chap. 3 we have shown that Lorentz invariance leads to the existence of spin an-gular momentum, which acts on spin components of the field and is an intrinsicproperty of particles, the quanta of the field. Determining the spin is an essentialstep in the study of elementary particle physics and the method is based on theconsideration of rotational invariance. Experimentally, the most frequently usedreactions are two-body scatterings. If rotational invariance holds, angular momen-tum is conserved. Then the scattering amplitude can be decomposed into partialwaves of definite angular momentum, and by analyzing the angular distributionsof the two-body scattering state the spin of the system can be determined. We havealready learned in quantum mechanics that the scattering amplitude in the centerof mass (CM) frame can be expanded in partial waves:

dσdΩD j f (θ )j2 (9.26a)

f (θ ) D1

2 i p

XJ

(2 J C 1)[S J (E )� 1]PJ (θ ) (9.26b)

Page 253: Yorikiyo Nagashima - Archive

228 9 Symmetries

where p is the momentum in CM, S J (E ) is the partial element of the scatteringmatrix with angular momentum J and PJ (θ ) is the Legendre function of cos θ .The above formula is valid for a spinless particle. It can be extended to particleshaving spin [217]. For the treatment, it is convenient to work in the eigenstateswhere the incoming and outgoing states are specified by their helicity

h DJ � pjp j

(9.27)

instead of Jz , which is a component of the angular momentum along the fixedz-axis. The helicity is by its definition rotationally invariant.

What we are dealing with is a two-particle system having momentum and helicityp1, λ1 and p2, λ2, respectively:

jΨ i D jp1λ1I p 2λ2i D ψ1(p1, λ1)ψ2(p2, λ2) (9.28)

Let us first separate the motion of the two-body reaction

A(p1)C B(p2)! C(p3)C D(p4) (9.29)

into that of the CM frame itself and the relative motion in it. Defining the mo-mentum of the CM system P and the relative momentum p in the initial and finalstates by

P i D p1 C p2 , P f D p3 C p 4 (9.30a)

p i D12

(p1 � p 2) , p f D12

(p3 � p4) (9.30b)

and working in the CM frame, in which Pi D P f D (E , 0), state vectors in CM canbe expressed by their relative momentum in polar coordinates. We rewrite the two-particle state as

jp θ φI λ1λ2i D ψ1(p λ1)ψ2(p λ2) (9.31)

where particle 1 has momentum p D (p , θ , φ) in polar coordinates and helicity λ1

and particle 2 has �p and helicity λ2. Sometimes we will use the total helicity

λ D λ1 � λ2 , μ D λ3 � λ4 (9.32)

to denote the two-particle helicity state, i.e. jλi D jλ1λ2i, when there is no dangerof confusion. Expanding the plane-wave state in terms of those that have angularmomentum J, M is the center of discussion.

If rotational invariance holds, angular momentum is conserved and the scatter-ing matrix elements can be expressed as

h J 0M 0λ3λ4jS j J M λ1λ2i D δ J J 0 δM M 0hλ3 λ4jS J jλ1 λ2i (9.33)

Page 254: Yorikiyo Nagashima - Archive

9.1 Continuous Symmetries 229

where j J M λ1λ2i is a state of definite angular momentum constructed out of statescontaining particles of definite helicity. The essence of the partial-wave expansionis condensed in this equality.

We note that the final state, where the particle is scattered by angle θ , can beobtained by rotation from the initial state, where the momentum is along the z-axis (θ D φ D 0). The rotation in this case is around the y-axis by θ and thenaround the z-axis by φ:

jp θ φI λi D e�i JZ φ e�i Jy θ jp00I λi � R(θ , φ)jp00I λi (9.34)

where R(θ , φ) denotes the rotation operator. Conventionally the alternative form

R(θ ) D e�i JZ φ e�i Jy θ e i JZ φ (9.35)

is often used, which we will adopt here. It differs from Eq. (9.34) only in the phase.Now we define the rotation matrix by

h J 0M 0λjR(θ , φ)j J M λi D δ J J 0 e�i(M 0�M)φh J M 0λje�i θ Jy j J M λi

D δ J J 0 e�i(M 0�M)φ d JM 0 M (θ )

(9.36)

The two states have the same helicity, as it is conserved by rotation. The rotationmatrix d J

M M 0 (θ ) satisfies the orthogonality condition and is normalized by

Z 1

�1d cos θ d J

M M 0 (θ )d J 0M M 0 (θ ) D δ J J 0

22 J C 1

(9.37)

General properties and expressions for d JM 0 ,M (θ ) are given in the Appendix E. Mul-

tiplying Eq. (9.34) by h J M I λj from the left gives

h J M λjp θ φI λi DXJ 0 M 0h J M λjR(θ , φ)j J 0M 0ih J 0M 0jp00I λi

DXM 0

e i(M 0�M)φ d JM M 0 (θ )h J M 0λjp00I λi

D e i(λ�M)φ d JM λ(θ )h J M λjp00I λi

� N J ei(λ�M)φ d JM λ(θ ) (9.38)

The penultimate equality follows because the momentum of the state jp00I λi isalong the z axis and its total helicity λ D λ1 � λ2 is exactly the same as M D Jz ,hence

h J M 0λjp00I λi D δM 0 λ N J (9.39)

Page 255: Yorikiyo Nagashima - Archive

230 9 Symmetries

N J is calculated to be

N J D

r2 J C 1

4π(9.40)

To derive N J , we use the normalization conditions

h J 0M 0λ0j J M λi D δ J J 0 δM M 0 δλ01λ1

δλ02λ2

(9.41a)

hp 0θ 0φ0I λ0jp θ φI λi D δ(cos θ � cos θ 0)δ(φ � φ0)δλ01λ1

δλ02λ2

(9.41b)

and apply them to

1 D h J M λj J M λi DX

μ

Zd cos θ dφh J M λjp θ φI μihp θ φI μj J M λi

(9.38)(9.37)D

4π2 J C 1

jh J M λjp00I λij2 D4π

2 J C 1N 2

J (9.41c)

Next we expand the state Eq. (9.34) in terms of angular momentum eigenstates:

jp θ φλi DXJ,M

j J M λih J M λjp θ φI λi DXJ M

N J d JM λ e i(λ�M)φj J M λi

(9.42)

Then the scattering matrix between the two states expressed in polar coordinatescan be written

hp f θ φI μjS jpi00I λi(9.33)D

XJ M

2 J C 14π

e i(μ�M)φ d J

M μ

��hμjS J (E )jλi

e i(λ�M)φ d J

M λ

�jθDφD0

DXJ M

2 J C 14π

hμjS J (E )jλid Jλμ(θ )e i(λ�μ)φ

hμjS J (E )jλi � hλ3 λ4jS J (E )jλ1λ2i , λ D λ1 � λ2 , μ D λ3 � λ4

(9.43)

To normalize the scattering amplitude, we compare the scattering matrix in linearmomentum space and in polar coordinates in the CM system:

hp3 p4I μjS jp1 p2I λi D δ i f � (2π)4 i δ4X

pi �X

p f

�M f i

D (2π)4δ4(Pi � P f ) � (4π)2

ss

p i p f

� hp f θ φI μjS jpi00I λi (9.44)

Page 256: Yorikiyo Nagashima - Archive

9.1 Continuous Symmetries 231

where s D (p1C p2)2. The factor in the second line results from the normalizationconditions Eq. (9.41). To prove it, set S D 1 and use

hp3 p4jp1 p2i D (2π)6(16E1E2E3E4)1/2δ3(p1 � p3)δ3(p2 � p4)

δ3(p1 � p3)δ3(p2 � p4)d3 p3d3 p4 D δ3(P i � P f )δ3(p i � p f )d3P f d3 p f

D δ4(Pi � P f )δ(cos θ � cos θ 0)δ(φ � φ0)�

@E f

@p f

��1

p f2d4P f d cos θ dφ

(9.45)

and symmetrize between the initial and the final state. We define the scatteringmatrix T J (E ) by2)

hλ3λ4jS J (E )jλ1λ2i D δλ1λ3 δλ2λ4 C ihλ3λ4jT J (E )jλ1λ2i (9.46)

Then using Eq. (9.44) and comparing with the cross section formula (6.90) fordσ/dΩ jCM, we obtain

dσdΩD j f λ3λ4,λ1 λ2 j

2 (9.47a)

f λ3λ4,λ1 λ2 D

rp f

p i

M f i

8πp

s(9.47b)

D1

2 i p i

XJ

(2 J C 1)hλ3λ4j�S J (E )� 1

�jλ1λ2id

Jλμ(θ )e i(λ�μ)φ

(9.47c)

The equation is an extension of Eq. (9.26). In fact, referring to Eq. (E.21) in Ap-pendix E,

d J00 D PJ (θ ) (9.48)

This agrees with Eq. (9.26) when the particles have no spin.

Unitarity of S-matrixFrom the unitarity of the scattering matrix we have

S S† D S†S D 1 (9.49)

Sandwiching it between an initial and final state gives

δ f i DX

n

S�n f Sni (9.50)

In terms of the T matrix defined in Eq. (9.46)

i(T �i f � T f i ) D

Xn

T �n f Tni (9.51)

2) T J D (S J � 1)/2i is also often used in the literature.

Page 257: Yorikiyo Nagashima - Archive

232 9 Symmetries

If it is a helicity amplitude, time reversal invariance requires S f i D Si f

[Eq. (9.121)],

2 Im (T f i) DX

n

T �n f Tni (9.52)

For forward scattering (i D f ),

2 Im (Ti i) DX

n

jTni j2 (9.53)

The right hand side (rhs) is proportional to the total cross section and the left handside (lhs) to the imaginary part of the forward scattering amplitude. The relation isgeneral in the sense that it is valid for inelastic as well as elastic scattering. Namely,

σTOT D k Im�

f (θ D 0)

(9.54)

To obtain the proportionality constant, we use Eq. (9.47) and

T f i D �(2π)4δ4(Pi � P f )M f i (9.55)

Referring to the cross section formula in terms of M f i [see (6.86)]

σTOT DX

f

(2π)4δ4(Pi � P f )jM f i j

2

2sλ(1, x1, x2)D

2 Im (M f i )4pp

s

D4πp

Im f f (0)g

(9.56)

where we have used the relation pCM D λ(s, m21, m2

2)/(2p

s) Dp

sλ(1, x1x2)/2[Eq. (6.56)].

Problem 9.3

Using Eq. (9.47), calculate the total cross section for the elastic scattering

σTOT D

ZdΩ

Xλ3λ4

j f λ3λ4,λ1 λ2 j2 (9.57)

and then prove that

σTOT(λ1, λ2) D4πp

Im ( f λ3λ4,λ1 λ2 ) (9.58)

assuming there is no inelastic scattering.

Page 258: Yorikiyo Nagashima - Archive

9.2 Discrete Symmetries 233

9.2Discrete Symmetries

Discrete symmetries are not functions of a continuous parameter and no infinitesi-mal variation or differentiation is possible. Parity transformation, time reversal andcharge conjugation to exchange particles and antiparticles are such examples.

9.2.1Parity Transformation

General Properties Parity, or P operation for short, is often referred to as “mirror-ing” (reflection) of an object, but the exact definition is reversal of signs on all spacecoordinates:

xP! �x (9.59)

Performing the parity transformation again brings the state back to the originalone, that is P2 D 1. Therefore the corresponding unitary transformation UP satis-fies

UP UP D 1 ) UP D U�1P D U†

P (9.60)

The parity transformation is at the same time unitary and hermitian, hence an ob-servable by itself. Its eigenvalue is˙1. It is known that parity is violated maximallyin the weak interaction, but that it is conserved in the strong and electromagneticinteractions, at least within experimental limits.

Momentum and Angular Momentum There are many observables that have theirown parity. Classically, the momentum p is mv D mdx/d t and quantum me-chanically �ir; it changes its sign under parity transformation. Orbital angularmomentum, on the other hand, has positive parity since it is given by x � p . Theparity of the spin angular momentum S is not clear, but we can assume it has thesame property as its orbital counterpart. The above examples show that there aretwo kinds of (three-dimensional space) vectors. Those that change their sign un-der the parity transformation are called polar vectors and the others, which do not,axial vectors. Similarly, a quantity that is rotationally invariant but changes its signunder the P operation is called a pseudoscalar. One example is the helicity operatorσ � p/jp j, the spin component along the momentum direction.

Intrinsic Parity of a Particle When the Hamiltonian is a function of the distancer D jxj

H Dp 2

2mC V(r) (9.61)

it is invariant under parity operation, and if ψ(x ) is a solution then ψ(�x ) is also asolution:

ψ0(�x ) D UP ψ(x) D ˙ψ(x ) (9.62)

Page 259: Yorikiyo Nagashima - Archive

234 9 Symmetries

Generally, a particle has its own intrinsic parity. When operated on a one particlestate,

P jqi D ˙ηP jP qi (9.63)

Here, q is a quantum number that specifies the state, and ηP is a phase that accom-panies the parity transformation. The phase degree of freedom is allowed becausethe expectation value of an observable is always sandwiched by a state and its con-jugate. Usually, the phase is taken to be 1 for simplicity.

Parity of Fields The parity of the Dirac bilinears was already given in Sect. 4.3.3.The parity of the scalar, vector and tensor fields are the same as the correspondingDirac bilinears. We reproduce in Table 9.2 the parity transformation properties ofthe fields.

Parity of the Photon Now let us consider the parity of the photon. Since the timecomponent of a charged current (� D j 0) is a scalar with positive parity and thespace component is a polar vector, the Maxwell equations

r � E D q�,@B@tD �r � E (9.64)

tell us that

E (x)! E 0(�x) D �E (x) , B(x )! B 0(�x) D B(x ) (9.65)

Since the electromagnetic potential is connected to the field by

E D rφ �@A@t

, B D r � A (9.66)

the transformation property of the potential is given by

φ(x)! φ0(�x) D φ(x) , A(x )! A0(�x) D �A(x) (9.67)

This means the photon is a polar vector and has negative parity.

Table 9.2 Property of Dirac bilinears under parity transformation.

S(t, x ) P(t, x ) V μ (t, x ) Aμ (t, x ) T μν (t, x )

P S(t,�x ) �P(t,�x ) Vμ (t,�x ) �A μ (t,�x ) Tμν (t,�x )

Note 1: S D ψ ψ, P D iψ γ5 ψ, V μ D ψ γ μ ψ, Aμ D ψ γ μ γ5 ψ, T μν D ψ σμν ψNote 2: V μ D (V 0, V ), Vμ D (V 0,�V )

Page 260: Yorikiyo Nagashima - Archive

9.2 Discrete Symmetries 235

Parity of Many-Particle SystemsThe quantum number of a many-particle system is additive for those correspond-ing to continuous symmetry and multiplicative for those corresponding to discretesymmetry, provided they are independent. The wave function of a many-particlesystem when there are no interactions between the particles is described by

Ψ (x1, x2, � � � , xn) D ψ1(x1)ψ2(x2) � � � ψn(xn) (9.68a)

Ψ 0 D U Ψ D ψ01 ψ0

2 � � � ψ0n D U ψ1U ψ2 � � �U ψn(xn) (9.68b)

When U D e i Qα

e i Qα Ψ D Π e i Qi α ψ i D e i(Q1CQ2C��� )α Π ψ i

) Q D Q1 C Q2 C � � � C Q n(9.69)

When the transformation is discrete, U itself is converted to its quantum number,the rhs becomes Π η i Π ψn and

η D η1η2 � � � ηn (9.70)

Parity of a Two-Body SystemConsider the case where the intrinsic parity of spinless particles 1, 2 is P1, P2 andthe wave function of the relative motion is Φ (x ) D f (r)YLM (θ , φ), where YLM isa spherical harmonic function. Under the parity operation θ ! π � θ and φ !φCπ. Using the property of the spherical harmonic function YLM (π�θ , φCπ) D(�1)L YLM (θ , φ),

U Ψ (x1, x2) D U�φ1(x1)Φ (x1 � x2)ψ2(x2)

D P1P2(�1)L Ψ (9.71)

The parity of a three-body system can be determined similarly as P1P2P3(�1)`CL,where ` is the relative angular momentum between the particles 1, 2 and L is thatof the particle 3 relative to the CM of the 1 C 2 system. Now consider a reactiona C b ! c C d, which has relative angular momentum `, `0 before and after thereaction, respectively:

hcdjS jabi D hcdjP�1P S P�1P jabi D Pa Pb(�1)`Pc Pd (�1)`0hcdjS jabi

)˚1 � Pa Pb(�1)`Pc Pd (�1)`0�

hcdjS jabi D 0 (9.72)

If hcdjS jabi ¤ 0 and [P, S ] D 0, the parity before and after the reaction is con-served:

Pa Pb(�1)` D Pc Pd (�1)`0(9.73)

When the parity of the particles a, b, c is known, that of d can be determined usingthe above equation. We shall determine the parity of the pion in the next sectionthis way. Equation (9.73) does not hold if hcdjS jabi D 0 by some other selectionrules. Therefore, there are as many ambiguities in the determination of the parity

Page 261: Yorikiyo Nagashima - Archive

236 9 Symmetries

assignment as the number of selection rules. For instance, to determine the parityof π0, one can use a process p ! p C π0 or n ! n C π0, but to determine that ofπC one cannot use p ! p C πC, because it is forbidden by charge conservation.The process p ! n C πC is allowed, but one needs to know the relative parity ofn to p. Similarly one cannot determine the parity of KC using the process p !n C KC as it is forbidden by strangeness conservation. p ! Λ C KC is allowed,then again the parity of KC and Λ cannot be determined independently. The usualassumption is that all the quarks u, d, s, etc. have relatively positive parities. Sinceall the hadrons are made of quarks, the parities of the hadrons can be determined inprinciple once the relative angular momentum among the quarks is specified. Theparity of the familiar baryons (p , n, Λ, � � � ), therefore, is assumed to be positive.

Parity Transformation of Helicity States*The parity transformation property of the helicity amplitude is a bit complicated,because it does not use the orbital angular momentum explicitly. Those who arenot interested in the complication of the derivation may skip this part and use onlythe relevant formula Eq. (9.87). The following arguments follow closely those ofJacob and Wick [217].

To determine the parity of a partial wave state j J M λi in the helicity formalism,we must go back to the beginning and evaluate the effect on the helicity statesEq. (9.31)

jp00λ1λ2i D ψ1(p λ1)ψ2(p λ2) (9.74)

where particles 1 and 2 are moving in Cz and �z directions, respectively. ψ1 isobtained by Lorentz boost from its rest frame. Since the eigenvalues of the pari-ty operation are only phases, we need to fix the relative phases of various statesbeforehand. They can be determined from a requirement that ψ(0λ) satisfies theusual angular momentum formula

J˙jsλi D ( Jx ˙ Jy )jsλi D [(s � λ)(s ˙ λ C 1)]1/2js, λ ˙ 1i (9.75)

The parity operation does not change the spin direction and at rest Jz D λ

P ψ(0λ) D ηψ(0λ) (9.76)

where η is the particle’s intrinsic parity. It is convenient to define a mirror operationwith respect to the x z-plane:

Y � e�i π Jy P (9.77)

Using the relations

e�i θ Jy j J M λi DXM 0

d JM 0 M (θ )j J M 0λi (9.78a)

d JM 0 M (π) D (�1) j CM 0

δM 0 ,�M (9.78b)

Page 262: Yorikiyo Nagashima - Archive

9.2 Discrete Symmetries 237

we obtain

Yψ1(p λ1) D η1(�1)s1�λ1 ψ1(p ,�λ1) (9.79)

Similarly

Yψ2(p λ2) D η2(�1)s2Cλ2 ψ2(p ,�λ2) (9.80)

The reason for �λ2 in the exponent is that the particle’s momentum is �p , henceM D Jz D �λ. Combining the two equations

Y jp00λ1λ2i D η1η2(�1)s1Cs2�λ1Cλ2 jp00,�λ1,�λ2i (9.81a)

) P jp00λ1λ2i D η1η2(�1)s1Cs2�λ1Cλ2 e i π Jy jp00,�λ1,�λ2i (9.81b)

Now, what we want is the phase that appears in

P j J M λ1λ2i D e i αj J M,�λ1,�λ2i (9.82)

Since the parity and the angular momentum operators commute, the phase α doesnot depend on M. Therefore, without loss of generality, we can set M D λ Dλ1� λ2. For the same reason, we can confine our argument to the case θ D φ D 0.Expanding the rhs in Eq. (9.81b) in partial waves

e i π Jy jp00,�λ1,�λ2i DX

J M μ1 μ2

XJ 0 M 0 ν1 ν2

j J M μ1μ2i

� h J M μ1μ2je i π Jy j J 0M 0ν1ν2ih J 0M 0ν1ν2jp00,�λ1,�λ2i

(9.39)D

XJ M

j J M,�λ1,�λ2iN J d JM,�λ (�π)

DX

J

N J (�1)λ� J j J λ,�λ1 � λ2i

(9.83)

Inserting this into Eq. (9.81b) gives

P jp00λ1λ2i DX

J

N J η1η2(�1)s1Cs2� J j J λ,�λ1,�λ2i (9.84)

On the other hand, expanding the lhs of Eq. (9.81b) directly using Eq. (9.42) gives

P jp00λ1λ2i DXJ M

P j J M λ1λ2ih J M λ1λ2jp00λ1λ2i

DX

J

N J P j J λλ1λ2i(9.85)

Page 263: Yorikiyo Nagashima - Archive

238 9 Symmetries

Comparing Eq. (9.84) and Eq. (9.85)

P j J λλ1λ2i D η1η2(�1)s1Cs2� J j J λ,�λ1,�λ2i (9.86)

Noting s1 C s2 � J is an integer, we finally obtain

P j J λλ1λ2i D η1η2(�1) J�s1�s2 j J λ,�λ1,�λ2i (9.87)

When the parity is conserved, applying Eq. (9.87) to the scattering matrix,

h�λ3,�λ4jS J j � λ1,�λ2i D ηS hλ3λ4jS J jλ1λ2i

ηS D η1η2η3η4(�1)s3Cs4�s1�s2(9.88)

Using Eq. (9.47) and d Jnm(π � θ ) D (�1) JCn d J

n,�m (θ ), a similar formula for thescattering amplitude can be obtained:

h�λ3,�λ4j f (θ , φ)j � λ1,�λ2i D ηS hλ3λ4j f (θ , π � φ)jλ1λ2i (9.89)

Parity Violation in the Weak InteractionWe will describe the detailed dynamics of the weak interaction in Chap. 15. Here,we describe only the essence of parity violation. The momentum p is an observablewith negative parity, but it does not mean that parity is violated, since particles with�p exist equally. However, if the S-matrix contains an observable that has negativeparity, it means parity is violated in the scattering. For instance,

J � p D J p cos θ (9.90)

is such an observable. In the strong magnetic field at ultra-low temperatures, anucleus can be polarized along the magnetic field. An asymmetry in the angulardistribution of the decay particles from the nuclei means that parity is violated inthe decay. Here, the angle of the particle is defined relative to the magnetic field, i.e.the spin orientation of the parent nuclei and the asymmetry is relative to a planeperpendicular to the polarization axis, i.e. whether θ is ? π/2. The experiment ofWu et al. [392] that proved parity violation in the weak interaction was determinedthis way.

In π-p scattering where the initial proton is unpolarized, the final proton canbe polarized perpendicular to the plane of scattering, namely σ � k1 � k2 ¤ 0;where k1, k2 are the pion momenta before and after scattering, respectively. Thisis because the parity of σ � k1 � k2 is positive. However, if the recoiled protonis longitudinally polarized, i.e. polarized along its momentum, which meansσ � p ¤ 0, conservation of parity is violated.

The origin of the parity-violating transition can be traced back to the Hamilto-nian. If parity is conserved in an interaction, the parity operator commutes withthe Hamiltonian. Since the scattering matrix is made from the Hamiltonian, itcommutes with the S-matrix, too. In order for the S-matrix to contain the parity-

Page 264: Yorikiyo Nagashima - Archive

9.2 Discrete Symmetries 239

violating term, the Hamiltonian must contain a parity-violating component, too:

H D H0 C HPV (9.91a)

P H0P�1 D H0, P HPV P�1 D �HPV (9.91b)

where H0 and HPV are a scalar and a pseudoscalar, respectively. HPV is the originof the parity-violating transition. We shall see how the parity-violating observableis related to HPV.

Let us pause to consider the meaning of the extra parity-violating term in theLagrangian. There are observables that change sign by the parity transforma-tion P. The momentum p and helicity σ � p are examples. Their existencealone does not mean parity violation. We claim that parity is violated if a phe-nomenon exhibits a different behavior in the mirror world than in ours. Thismeans the dynamical motion in the mirror world is different for a given ini-tial state. Namely, it happens if the transition amplitude includes a term thatbehaves differently in the mirror world, in other words, if it includes odd-par-ity terms such as σ � p . As the transition amplitude is a matrix element ofthe Hamiltonian for given initial and final states, the Hamiltonian itself hasto include an odd-parity term to induce the parity-violating phenomenon. Inmathematical language, it simply means [P, H ] ¤ 0. Calculations of the tran-sition amplitude will show that if C 0 is the coefficient of the odd-parity termin the Hamiltonian, the parity-violating observables always appear multipliedby C 0.

The weak interaction is known to violate parity. Beta decay, (A, Z ) ! (A, Z C1) C e� C ν e or in the quark model d ! u C e� C ν e , is mediated by a chargedforce carrier, the W � vector boson, but in the low-energy limit, the interactionHamiltonian is well described by the four-Fermi interaction, which can be writtenas

HINT D (uγμ d)˚

e�CV γ μ C C 0

Vγ 5� ν e�

C (uγμ γ 5d)˚

e�CA γ μ γ 5 C C 0

Aγ μ� ν e�C h.c.

(9.92)

where u, d, ν e , e� stand for the Dirac fields and h.c. means the hermitian con-jugate of the preceding term. Referring to Table 9.2, we see that terms with CV,CA, which contain Vμ V μ , A μ Aμ , are parity conserving, while those with C 0

V, C 0A,

which contain Vμ Aμ , A μV μ , are parity violating. Namely, CV, CA and C 0V, C 0

A arethe strengths of the parity-conserving and parity-violating terms, respectively. Sincethe observed rate is proportional to the square of the amplitude, the effect of theparity violation appears as the interference term. The total rate Γ is proportional to

Page 265: Yorikiyo Nagashima - Archive

240 9 Symmetries

jH0j2 C jHPVj

2. By preparing suitable initial and final states of the nucleus, it canbe arranged to pick up only the CA, C 0

A part.3) Then the angular distribution of theelectron is shown to have the form

dΓdΩ� 1C α

( J � p e)me

D 1C αP v cos θ (9.93a)

α D2 Re (C�

A C 0A)

jCAj2 C jC 0Aj

2 (9.93b)

P is the polarization of the parent nucleus. This shows clearly that the origin ofthe parity-violating observable J � p e is the parity-violating Hamiltonian HPV. Theeffect is maximal when jCAj D jC 0

Aj, and experimental data have shown this is thecase (C 0

A ' �CA). Experimental data have fixed the strengths as well as the phasesof the coupling constants (C 0

V D �CV, C 0A D �CA, CA D CV) and excluded other

types (i.e. S, P, T) of quadrilinear forms. The Hamiltonian for beta decay has beenshown to be

HINT DGFp

2fuγμ(1� γ 5)dgfeγ μ(1 � γ 5)ν eg (9.94)

Here, CV was replaced by the universal Fermi coupling constant GF/p

2, GF D

10�5 � m2p, where mp is the mass of the proton. This is called the V–A interac-

tion and was the standard phenomenological Hamiltonian before the advent of theStandard Model.

9.2.2Time Reversal

Time Reversal in Quantum MechanicsTime reversal of a process means to make it proceed backward in time. In a macro-world, when a glass is dropped on the floor, the reverse process is that the brokenpieces fly back to where the glass was broken and amalgamate to shape the originalglass. Everybody knows this does not happen. But in the micro-world, Newton’sequation of motion under the influence of a force F(x) that does not depend ontime

md2xd t2D F (x) (9.95)

is invariant under the time reversal operation t ! �t, and this is indeed realized.Physical objects that are dependent on time, for example the velocity v D dx/d t,the momentum p D mv , the angular momentum L D x � p change their signunder time reversal and are suitable observables to test the hypothesis. Take thetypical example of π–p scattering:

π(p1)C p (p2)! π(p3)C p (p4) (9.96)

3) If the so-called Gamow–Teller nuclear transitions with jΔ J j D 1 and no parity change areselected, the vector interaction does not contribute (CV D C 0

V D 0).

Page 266: Yorikiyo Nagashima - Archive

9.2 Discrete Symmetries 241

Figure 9.1 Time reversal. (a) Particles having momenta p1 and p2 enter, and exit as two parti-cles with momenta p3 and p4. (b) Time-reversed reaction of (a). Particles with momenta �p3and �p4 enter, and exit as particles with momenta �p1 and �p2.

The time-reversed process is a reaction where the outgoing particles reverse theirmomenta, go back to where they were scattered, undergo the reverse interactionand recover the initial states with their momenta reversed (Fig. 9.1):

π(�p3)C p (�p4)! π(�p1)C p (�p2) (9.97)

Namely, their positions are unchanged, but their momenta reversed and the initialand final states are interchanged. Expressing this in terms of the transition ampli-tude

hT φ(p f )jT ψ(p i )i D hψ0(p 0

i )jφ0(p 0

f )i D hφ(�p f )jψ(�p i )i� (9.98)

The time-reversal operation seems to include the complex conjugate operation inaddition to ordinary unitary transformation. Let us see if the notion is valid inquantum mechanics. Consider a Schrödinger equation

i@ψ(t)

@tD H ψ(t) (9.99)

and reverse the time direction of the wave function (H is independent of time)

t ! t0 D �t , ψ(t)! ψ0(t0) D T ψ(t) (9.100)

Then

i@ψ0(t0)

@t0 D H 0ψ0(t0) , H 0 D T H T �1 (9.101)

If H 0 D H ,

�i@(T ψ)

@tD H(T ψ) (9.102)

This shows that in order for the transformed wave function T ψ to satisfy the sameequation as Eq. (9.99), requiring T H T �1 D H is not enough; it is also necessaryto take the complex conjugate of both sides. The process is known as Wigner’stime reversal. In other words, Eq. (9.98) is a necessary condition for the time re-versal operation in quantum mechanics. Under time reversal, the wave function istransformed to

ψ(t)T! ψ0(t0) D T ψ(t) D ψ�(�t) (9.103)

Page 267: Yorikiyo Nagashima - Archive

242 9 Symmetries

When ψ represents a plane wave with momentum p

ψ(t, x I p ) � e i p�x�i E t T! e i p�x�i E t 0

j�t 0D�t D e�i p �x�i E t (9.104)

time reversal reverses the momentum of the state to �p as expected. In general, atransformation of the form

ψ(t) D aφ1(t)C bφ2(t)

! ψ0 D a�φ01(t0)C b�φ0

2(t0) D a�φ�1 (�t)C b�φ�

2 (�t)(9.105)

is called an antiunitary transformation. This means the T operation is expressedas a product of a unitary transformation U and complex conjugate operation K.Time-reversal invariance means the observables do not change under the opera-tion Eq. (9.98). As the observables are expressed as the square of the transitionamplitude, their antiunitarity does not produce any contradiction.

In fact, the necessity as well as consistency of complex conjugation can be shown

in many examples. For instance, in classical mechanics p D mdx/d tT! �p , but

in quantum mechanics p is a space differential operator and has no t dependence.By complex conjugation it obtains the right sign. Besides, the quantum condition

[xi , p j ] D i δ i j (9.106)

and the commutator of the angular momentum

[L i , L j ] D i ε i j k L k (9.107)

are not invariant for the change p ! �p , L ! �L but are invariant after anadditional complex conjugate operation. In conclusion, the time-reversal operationrequires, in addition to flipping the time, the complex conjugate to be taken of allc-numbers in the equation.

Problem 9.4

Show the transformation properties under the time-reversal operation.

Electric field E (t) ! E 0(�t) D E (t)Magnetic field B(t) ! B0(�t) D �B(t)Potential Aμ D (φ, A) ! Aμ 0(t0) D (φ0(�t), A0(�t))

D (φ(t),�A(t)) D A μ(t)

(9.108)

Time Reversal of the S-MatrixThe S-matrix is defined as lim t!1

t 0!�1U(t, t0) [Eqs. (6.11), (6.12)]. Since T U(t)T �1D

U(�t), it is obvious that T S T �1 D S†. Another way of seeing this is to use

hout W q f jqi W ini D hout W q f jS jqi W outi D hin W q f jS jqi W ini (9.109)

Page 268: Yorikiyo Nagashima - Archive

9.2 Discrete Symmetries 243

From the first expression, we obtain

hout W q f jqi I ini D hout W q f jT �1T jqi W ini D hin W T(q f )jT(qi) W outi�

D hout W T(qi)jT(q f )I ini D hout W T(qi)jS jT(q f )I outi

(9.110a)

where we have used T jini D jouti, T jouti D jini to go to the third equality. Fromthe last expression in Eq. (9.109), we obtain

hin W q f jS jqi W ini D hin W q f jT �1T S T �1T jqi W ini

D hout W T(q f )j(T S T �1)jT(qi) W outi�

D hout W T(qi )j(T S T �1)†jT(q f ) W outi

(9.110b)

Comparing the last expressions of Eq. (9.110a) and Eq. (9.110b), we conclude

T S T �1 D S† (9.111)

Time Reversal of Partial WavesAs the spin operator has no correspondence in classical mechanics, we are not surehow it transforms under the T operation. Let us assume its transformation propertyis the same as that of the orbital angular momentum, then s � x � p changes itssign, too. Therefore

T Ji T �1 D � Ji , T J˙T �1 D � J� (9.112a)

Jz T j J Mi D �T Jz j J Mi D �M T j J Mi

) T j J Mi D η( J, M )j J,�Mi (9.112b)

The second equality of the first line follows from complex conjugation of i. To deter-mine the phase η, we use the conventional relation among the angular momentumeigenstates:

T J�j J Mi D [( J C M )( J � M C 1)]1/2T j J M � 1i

D [( J C M )( J � M C 1)]1/2η( J, M � 1)j J,�M C 1i

D � JCT j J, Mi D �η( J, M ) JCj J,�Mi

D �η( J, M )[( J C M )( J � M C 1)]1/2j J,�M C 1i

) η( J, M ) D �η( J, M � 1) (9.113a)

This equation means

η( J, M ) D η( J )(�1)M (9.114)

η( J ) is a factor independent of M, and can be chosen considering the rotational

Page 269: Yorikiyo Nagashima - Archive

244 9 Symmetries

invariance of the S-matrix and the antiunitarity of the T reversal. By choosing

η( J ) D (�1) J 4) (9.115)

we can fix the phase as

T j J Mi D (�1) J�M j J,�Mi (9.116)

The phase of the helicity state can be determined similarly:

T j J M λi D (�1) J�M j J,�M λi (9.117)

To prove the equality, we use the fact that the helicity does not change its sign underT and that the T-transformed one-particle state jp λi is ηj� p λi which is equivalentto 180ı rotation on the y axis. Namely

T jp00λ1λ2i D εe�i π Jy jp00λ1λ2i (9.118)

where ε is a phase factor. The rest of the argument uses the same logic as wederived the parity of the state in Eq. (9.87).

Problem 9.5

Prove that the T-operated helicity eigenfunctions for a spin 1/2 particle given byEq. (4.14)

�C D�

cos θ2 e�i φ/2

sin θ2 e i φ/2

�, �� D

�� sin θ

2 e�i φ/2

cos θ2 e i φ/2

�(9.119)

are given by

T �˙ D �i σ2��˙ D (�1)S�M �� (9.120)

Finally, as to the helicity scattering amplitude, using the antiunitarity of the Ttransformation and Eq. (9.117) we find

hλ3λ4jS J jλ1λ2i D hλ1λ2jS J jλ3λ4i (9.121)

Time Reversal of FieldsScalar and Vector Fields We define the T reversal of a scalar field that obeys theKlein–Gordon equation by

'(t, x )T! '0(x 0) D T'T �1 D '0(�t, x ) (9.122)

Since the free-field Lagrangian is bilinear in form in ' and '†, both choices

'0 D ' and '0 D '† (9.123)

4) The conventional spherical harmonic function YLM is defined with η( J ) D 1. To be consistentwith our definition, it has to be changed to YLM ! iL YLM , but this is not a problem in this book.

Page 270: Yorikiyo Nagashima - Archive

9.2 Discrete Symmetries 245

are possible. But here we impose a constraint by requiring that the wave function(denoted as ψ D c-number) should respect the rule we saw in quantum mechanics.As the field ' can be expanded in terms of plane waves as

' DX

k

1p

�ak e�i k �x C b†

k e i k �x (9.124)

the wave function ψ(t, x ) describing a particle with momentum p can be extractedby

ψ(t, x ) D h0j'(t, x )jp i (9.125)

Similarly, the T-reversed state should be able to be extracted by

ψ0(�t, x ) D hT 0j'(t, x )jTp i (9.126)

But for a c-number wave function, it obeys the rule of Eqs. (9.98) and (9.103)

ψ0(�t, x ) D ψ(�t, x )� Eq. (9.98)D h0j'(�t, x )jp i�

D hT 0jT f'(�t, x )pgi D hT 0jT'(�t, x )T �1jTp i(9.127)

Comparing Eqs. (9.126) and (9.127), we can choose the T-transformed field as

T'(t, x )T �1 D '(�t, x ) (9.128)

subject to the additional constraint of taking the complex conjugate of all thec-numbers.5) Referring to the expansion formula of ', the transformation proper-ties of the annihilation and creation operators are

ak ! Tak T �1 D a�k , b†k ! T b†T �1 D b†

�k (9.129)

The transformation is consistent with the expression

jki Dp

2ω a†k j0i

T!p

2ω a†�k j0i D j � ki (9.130)

Under T reversal, the current operator of the scalar field

j μ D i q('†@μ' � @μ'†') (9.131)

changes its argument t ! �t and i ! �i . Therefore the transformation is

T j 0(t, x )T �1 D j 0(�t, x ) , T j (t, x )T �1 D � j (�t, x )

i.e. j μ(t, x )T! j μ(�t, x )

(9.132)

As the vector field obeys Proca’s equation

@μ f μν C m2V ν D j ν , f μν D @μ V ν � @ν V μ (9.133)

the consistency argument tells us that

V μ(t, x )T! Vμ(�t, x ) (9.134)

5) As a matter of fact, an alternative definition of the reversal accompanied by the hermitianconjugation operation '(t, x )! '†(�t, x ) exists, but we will not elaborate on it in this book.

Page 271: Yorikiyo Nagashima - Archive

246 9 Symmetries

Dirac Field As the component of angular momentum changes its sign under timereversal, rearrangement of the spin component is necessary for the Dirac field.What we have to do is to find a transformation such that

ψ(x )T! T ψ(x )T �1 D ψ0(t0, x ) D B ψ(�t, x ) (9.135)

satisfies the same Dirac equation. Here, B is a c-number 4� 4 matrix that operateson spin indices. Consider the Dirac equation for the wave field

(γ μ i@μ � m)ψ(t, x ) D 0T! (γ μ i@0

μ � m)�ψ0(t0, x ) D 0 (9.136)

where @0μ D (�@0,r). Substituting Eq. (9.135) in Eq. (9.136) and multiplying B�1

from the left, we obtain

B�1[γ 0 �(�i)(�@0)C γ� � (�ir)� m]B ψ(�t, x )

D [B�1γ 0 �B(i@0)C B�1γ�B � (�ir)� m]ψ(�t, x ) D 0(9.137)

Therefore, if B which satisfies the following relations exists

B�1γ 0 �B D γ 0 , B�1γ k �B D �γ k (9.138)

the time reversed field ψ satisfies the same equation and T transformation invari-ance holds. Considering γ 2 � D �γ 2, γ μ � D γ μ(μ ¤ 2), we can adopt as B

B D i γ 1γ 3 D �i γ 5C D �Σ2 D

��σ2 0

0 �σ2

�(9.139)

Then

ψ0(t0) D T ψ(t)T �1 D B ψ(�t) D i γ 1γ 3ψ(�t) (9.140a)

ψ0(t0) D T ψ(t)T �1 D ψ(�t)B�1 D ψ(�t)i γ 1γ 3 . (9.140b)

The matrix B has the property that

B D B† D B�1 D �B�

B�1γ μ �B D γμ(9.141)

Therefore, the transformation property of the Dirac bilinear vector is given by

ψ1(t)γ μ ψ2(t)T! ψ0

1(�t)γ μ �ψ02(�t) D ψ1B�1γ μ �B ψ2

D ψ1(�t)γμ ψ2(�t) (9.142a)

ψ1(t)γ μ γ 5ψ2(t)T! ψ1(�t)γμ γ 5ψ2(�t) (9.142b)

The transformation properties of other Dirac bilinears are given similarly and welist them in Table 9.3.

Page 272: Yorikiyo Nagashima - Archive

9.2 Discrete Symmetries 247

Table 9.3 Properties of Dirac bilinears under T transformation.

S(t, x ) P(t, x ) V μ (t, x ) Aμ (t, x ) T μν (t, x )

T S(�t, x ) �P(�t, x ) Vμ (�t, x ) A μ (�t, x ) �Tμν (�t, x )

Note 1: S D ψ ψ, P D iψ γ5 ψ, V μ D ψγ μ ψ, Aμ D ψγ μ γ5 ψ, T μν D ψ σμν ψNote 2: V μ D (V 0, V ), Vμ D (V 0,�V )

Experimental TestsPrinciple of Detailed Balance: The amplitude of a scattering process from a initialstate jii to a final state j f i and its inverse f ! i are related by Eq. (9.121). Thecross section of a process ACB! CCD, assuming both A and B are unpolarizedis expressed as

dσdΩ

(AB! CD) D1

(2SA C 1)(2SB C 1)

j f λCλD,λA λB j2 (9.143)

On the other hand the cross section of the inverse process is, again assuming bothC and D are unpolarized,

dσdΩ

(CD! AB) D1

(2SC C 1)(2SD C 1)

j f λA λB,λC λD j2 (9.144)

As Eqs. (9.121) and (9.47) mean

p 2AB

j f λCλD,λA λB j2 D p 2

CD

j f λA λB ,λC λD j2 (9.145)

where pAB and pCD are the momenta of particles in CM of AB and CD, respectively.we obtain

dσ(AB! CD)dσ(CD! AB)

DpCD

2(2SC C 1)(2SD C 1)pAB

2(2SA C 1)(2SB C 1)(9.146)

The equation is referred to as the “principle of detailed balance”.

Strong Interaction As an example of the application of the principle of detailed bal-ance to test time-reversal invariance, we show both cross sections of the reactionsp C 27Al• αC 24Mg in Fig. 9.2. The data show T-reversal invariance is preservedin the strong interaction to better than 10�3.

Weak InteractionThere is evidence that small CP violation exists in the kaon and B-meson decays(see Chap. 16), hence the same amount of T violation is expected to exist as longas CPT invariance holds, which is discussed in Sect. 9.3.3. In fact, a T-violationeffect consistent with CPT invariance has been observed in the neutral K meson

Page 273: Yorikiyo Nagashima - Archive

248 9 Symmetries

decays [107], which will be described in detail in Chap. 16. With this one exception,there is no other evidence of T violation. As it offers some of the most sensitive testsfor models of new physics, we will discuss some of the experimental work below.

Observables that violate T invariance can be constructed as a triple product ofT-odd variables like σ � p1� p2, p1 � p2� p3. In the first example, σ can be preparedby a polarized beam, polarized targets or spin of scattered/decayed particles.

KC ! π0 C μC C νμ : In the decay KC ! π0 C μC C νμ , the transversecomponent of muon polarization relative to the decay plane determined by p μ� p π(which is σμ � p μ � p π) was determined to be 1.7 ˙ 2.5 � 10�3 [225, 226]. Theimaginary (i.e. T-violating) part of the decay amplitude was also determined to beIm (� ) D �0.006˙ 0.008 (see Sect. 15.6.3 for the definition of � ).

n! pC e� C Nνe: Another example is the triple correlation D of the neutronpolarization and the momenta of electron and antineutrino in the beta decay n !p e�ν e . It is defined as

d W / 1C D P n � p e � p ν (9.147a)

where P n is the polarization of the neutron. The experimental value is given asD D �4˙ 6� 10�4 [350]. The data can be used to define the imaginary part of thecoupling constant. The relative phase φAV is related to D by

λ Dˇˇ gA

gV

ˇˇ , φAV D Arg

�gA

gV

�(9.147b)

sin φAV D D(1C 3λ2)

2λ(9.147c)

where gA, gV are axial and vector coupling constants of the weak interaction [CV

and CA of Eq. (9.92)]. φAV is given as 180.06ı ˙ 0.07ı.

Figure 9.2 Test of detailed balance and time reversibility in the reaction 27AlC p • 24MgC α.The intensity of the T-violating effect (� D j fT-violating/ fT-invj

2) is smaller than 5 � 10�4 [66].

Page 274: Yorikiyo Nagashima - Archive

9.2 Discrete Symmetries 249

The Electric Dipole Moment of the NeutronThe definition of the electric dipole moment (EDM) is classically

d DZ

�(x )x d3x (9.148)

which is nonzero if the charge distribution within an object is polarized, in otherwords, not distributed evenly. When the object is a particle like the neutron, re-gardless of whether of finite or point size, its only attribute that has directionalityis the total spin σ. If the particle has a finite d, it has to be proportional to σ. Whilethe transformation property of σ under P, T isC,�, respectively, that of d is �,C,as can be seen from Eq. (9.148). Therefore the existence of the EDM of a particleviolates both P and T. As the neutron is a neutral composite of quarks, its EDM canbe a sensitive test of T-reversal invariance in the strong interaction sector as well asthe weak interaction. The interaction of the magnetic and electric dipole momentswith the electromagnetic field is given by

HINT D �μ � B � d � E (9.149)

Its relativistic version can be expressed as

HINT D �qψγ μ ψA μ Ci2

dψγ 5σμν ψFμν (9.150)

Problem 9.6

Show that in the nonrelativistic limit, Eq. (9.150) reduces to Eq. (9.149).

For an s D 1/2 particle, depending on the magnetic quantum number m D˙1/2, the energy level splits in the electromagnetic field. When an oscillating fieldis injected, a resonance occurs corresponding to the Larmor frequency

ΔE D 2μB ˙ 2dE D h(ν ˙ Δν) (9.151)

Here ˙ corresponds to the directions of the magnetic and electric fields beingeither parallel or antiparallel to each other. Therefore from the deviation Δν ofthe resonance frequency when the electric field is applied, one can determine thestrength of the EDM. What has to be carefully arranged experimentally is the par-allel alignment of the electric field relative to the magnetic field, because if thereexists a perpendicular component E? a magnetic field of γ v � E? for a movingsystem with velocity v is induced and gives a false signal.

An example material used for the experiment is the ultra cold neutron (UCN) ofT � 0.002 K, which has energy � 2 � 10�7 eV and velocity v � 6 ms�1. This isthe energy the neutron obtains in the earth’s gravitational field when it drops 2 m.The de Broglie wavelength is very long (λ � 670 Å), and consequently the neutroninterferes coherently with material, allowing the index of refraction n to be defined:

n D�

1 �λ2N acoh

π˙

μBM v2/2

�1/2

(9.152)

Page 275: Yorikiyo Nagashima - Archive

250 9 Symmetries

Figure 9.3 (a) The neutron EDM experimen-tal apparatus of the RALySussex experimentat ILL [37, 198]. The bottle is a 20 liter stor-age cell composed of a hollow upright quartzcylinder closed at each end by aluminum elec-trodes that are coated with a thin layer of car-bon. A highly uniform 1 μT magnetic field B0parallel to the axis of the bottle is generated bya coil and the electric field (E0) is generatedby applying high voltage between the elec-trodes. The storage volume is situated within

four layers of mu metal, giving a shielding fac-tor of about 10 000 against external magneticfluctuations. (b) A magnetic resonance plotshowing the UCNs with spin-up count afterthe spin precessing magnetic field has beenapplied. The peak (valley) corresponds to themaximum (minimum) transmission throughthe analyzing foil. The measured points to de-tect the EDM effect are marked as four cross-es. The corresponding pattern for spin downis inverted but otherwise identical.

where N is the number of nuclei per unit volume, acoh is the coherent forwardscattering amplitude and M v2/2 is the kinetic energy of the neutron. The ˙ signdepends on whether the magnetic moment and the field are parallel or antiparallel.Inserting the actual values, we obtain a total reflection angle � 5ı for v � 80 m�1,but for a velocity v < 6 m�1 the angle exceeds 90ı and total reflection is obtained atany angle. Namely, the neutron, if it is slow enough, can be transported through abent tube just like light through an optical fiber and also can be confined in a bottlefor a long duration of time (referred to as a magnetic bottle or a neutron bottle).That the refractive index differs depending on the polarization is used to separateneutrons of different polarization.

Thermal neutrons emerging from the deuterium moderator of a reactor aretransported through a curved nickel pipe and further decelerated by a totally re-flecting turbine before arriving at the entrance of the apparatus. Figure 9.3 showsthe apparatus for the experiment [37, 198].

A magnetized iron–cobalt foil of 1 μm thickness is used to block neutrons of onespin orientation using the principle of total reflection stated above. The neutronbottle, a cylindrical 20-liter trap within a 1 μT uniform magnetic field B0, is able to

Page 276: Yorikiyo Nagashima - Archive

9.3 Internal Symmetries 251

store neutrons for more than 100 s. The electric field, of approximately 10 kV/cm,was generated by applying high voltage (HV) to the electrode. After application of aresonant oscillating magnetic field (� 30 Hz) perpendicular to B0, which turns theneutron spin perpendicular to the magnetic field and makes it precess freely, theshutter is opened, and the neutrons leave the bottle, dropping and passing throughthe iron–nickel foil again, which this time acts as an analyzer of the polarization.Only those that remain in the initial spin state can reach the detector, which islocated below the UCN polarizing foil. The ratio of the two spin states is determinedby the exact value of the applied frequency, as shown in Eq. (9.151).

Four points, marked with crosses, slightly off resonance, are measured wherethe slope is steepest. The existence of the EDM should appear as a frequency shiftwhen the electric field is applied. The measurement did not detect any shift, givingthe upper limit of the dipole moment of the neutron as

dn < 2.9 � 10�26 e cm (9.153)

Let us pause and think of the sensitivity that the measurement represents. Crudelyspeaking, the neutron is spatially spread to the size of the pion Compton wave-length (� 10�13 cm). If the T violation effect is due to the weak interaction, it isprobably reasonable to assume that the relative strength of emission and reabsorp-tion of the W boson is � (g2

w/m2w)/(g2

s /m2π ). Here, gw, gs are the strength of the

weak and strong interactions. If we assume gw � gs, following the spirit of theunified theory, then the strength of the EDM induced by the weak interaction isexpected to be roughly of the order 10�13 � (mπ/mW )2 � 10�19e cm. Since theexperimental value is 6 orders of magnitude smaller than the expectation, we canconsider the EDM of the neutron a good test bench for T-reversal invariance. Atpresent, the expected value of the Standard Model using the Kobayashi–Maskawamodel is 10�33–10�34 e cm. The probability of detecting the finite value of the EDMis small as long as the Standard Model is correct. However, there are models thatpredict a value just below the present experimental upper limit [2, 129], and animproved experimental measurement is desired.

9.3Internal Symmetries

9.3.1U(1) Gauge Symmetry

Conserved ChargeWe have already introduced the most fundamental result of the internal symmetry,the conserved charge current Eq. (5.38), in Chap. 5:

J μ D i q�'†@μ' � @μ'†'

(9.154)

Page 277: Yorikiyo Nagashima - Archive

252 9 Symmetries

This was the result of the Lagrangian’s symmetry, namely invariance under thephase transformation

' ! e�i qα' , '† ! '†e i qα (9.155)

which generally holds when the Lagrangian is bilinear in complex fields, includingDirac fields. The transformation Eq. (9.155) has nothing to do with the space-timecoordinates and is an example of internal symmetry, referred to as U(1) gauge sym-metry of the first kind.6) The space integral of the time component of Eq. (9.154) isa conserved quantity and generically called a charge operator:

Q D �i qZ

d3xX

r

�π r(x )'r (x )� π†

r (x )'†r (x )

(9.156a)

D i qZ

d3xh'† P' � P'†'

i(9.156b)

The first expression is a more general form, while the second one is specific to thecomplex Klein–Gordon field. It is easy to show that Q satisfies the commutationrelations

[Q, '] D �q' , [Q, '†] D q'† (9.157)

This means that ' and '† are operators to decrease and increase the charge of thestate by q. Then using the BCH formulae Eq. (9.15),

eAB e�A D B C [A, B ]C12!

[A, [A, B ]]C � � � C1n!

[A, [A, [A � � � [A, B ]]]]C � � �

(9.158)

we can show that

e i αQ' e�i αQ D e�i qα' , e i αQ'† e�i αQ D eCi qα'† (9.159)

Namely, the charge operator is the generator of the gauge transformation. In sum-mary, we rephrase Noether’s theorem in quantum field theory. If there is a contin-uous symmetry, i.e. a unitary transformation that keeps the Lagrangian invariant,a corresponding conserved (Noether) current exists. The space integral of its 0thcomponent is a constant of time and is also the generator of the symmetry trans-formation.

9.3.2Charge Conjugation

Charge conjugation (CC for short or C transformation) is an operation to exchangea particle and its antiparticle, or equivalently an operation to change the sign of

6) It is also called a global symmetry. When the phase is a function of space-time, i.e. α D α(x ), it iscalled a gauge transformation of the second kind or a local gauge transformation, which plays akey role in generating the known fundamental forces (See Chapter 18).

7) Here we use the word “charge” in a general sense. It includes not just electric charge, but alsostrangeness, hypercharge, etc.

Page 278: Yorikiyo Nagashima - Archive

9.3 Internal Symmetries 253

the charge Q.7) Examples of the C transformation are exchanges of an electronand positron e� $ eC, proton and antiproton p $ p , and πC $ π�. Thephoton and the neutral pion are not changed since they are charge neutral andthe particles are their own antiparticles. The charge conjugation operation dealswith variables in internal space, but because of the CPT theorem, it is inseparablyconnected with PT, space-time symmetry. This is because the antiparticle can beconsidered mathematically as a particle with negative energy-momentum travelingbackward in time.

Charge Conjugation of the Field OperatorsWe learned in Chap. 4 that the charge conjugation operation generally involvestaking the complex (hermitian for the operator) conjugate of the wave function orthe field. This resulted from the fact that the interaction with the electromagneticfield is obtained by the gauge principle, which means replacement of the derivativeby its covariant derivative

@μ ! Dμ D @μ C i qA μ (9.160)

Charge conjugation means essentially changing the sign of the charge, and theabove form suggests it involves complex conjugation. Let us see if this is true.

We start by considering the charge eigenstate

Qjq, p , szi D qjq, p , szi (9.161)

By definition, charge conjugation means

C jq, p , szi D ηC j � q, p , szi (9.162)

ηC is a phase factor. Then

QC jq, p , szi D ηC Qj � q, p , szi D �ηC qj � q, p , szi

C Qjq, p , szi D qC jq, p , szi D qηC j � q, p , szi

Therefore

C Q D �QC or C QC�1 D �Q (9.163)

As a particle state is constructed from the creation operator a† and its antiparticlefrom b†, we define the charge conjugation operator C with the properties

Caq C�1 D ηCbq , C b†q C�1 D ηCa†

q (9.164a)

C bq C�1 D η†Caq , Ca†

q C�1 D η†Cb†

q (9.164b)

C C† D C†C D 1, ηCη†C D 1 (9.164c)

Applying the charge conjugation twice brings the field back to its original form,hence its eigenvalue is ˙1 and is referred to as C parity. The definition Eq. (9.164)

Page 279: Yorikiyo Nagashima - Archive

254 9 Symmetries

of charge conjugation means the exchange ' $ '† for the Klein–Gordon field, aswe guessed from a simple argument. Namely

C'C�1 D ηC'† , C'†C�1 D η†C' (9.165)

The expression for the charge operator Eq. (9.156) shows clearly that the inter-change ' • '† changes the sign of the charge operator. With the above proper-ty, it can easily be shown that the charge operator defined by Eq. (9.156) satisfiesEq. (9.163).

Problem 9.7

Prove

C D exp

"i

π2

Xk

�b†

k � a†k

�(bk � ak )

#(9.166)

has the required properties for the charge conjugation operator.

Charge Conjugation of the Dirac FieldFor a field with spin, an additional operation is required to flip the spin component.For instance, for the Dirac field,

C ψ(x )C�1 D ηCC 0ψT(x ) D ηCC 0γ 0ψ�(x )

C ψ(x )C�1 D �η�C ψT(x )C 0 �1

(9.167)

where C 0 on the rhs is a 4 � 4 matrix that acts on the spin components as given inEq. (4.101). Using Eq. (9.167) and the property of the C 0 matrix [see Eq. (4.98)],

C 0 �1γ μ C 0 D �(γ μ)T (9.168)

the bilinear form of the Dirac operators is changed to

ψ2γ μ ψ1C!˚�η�

C ψT2 C 0 �1� γ μ

nηCC 0ψT

1

oD ψT

2 (γ μ)TψT1 D �

�ψ1γ μ ψ2

�T

D �ψ1γ μ ψ2 D ��ψ2γ μ ψ1

�† (9.169)

The minus sign in the last equality of the second line comes from the anticommu-tativity of the field and the removal of the transpose in the last equality is allowedbecause it is a 1 � 1 matrix. By setting ψ2 D ψ1 D ψ, we have a C-transformedcurrent

j μC D C(qψγ μ ψ)C�1 D �qψγ μ ψ D � j μ (9.170)

In quantum mechanics, the change of the sign is a part of the definition and hadto be put in by hand, but it is automatic in field theory. Transformation propertiesof other types of bilinear Dirac fields can also be obtained by using Eq. (9.168), seeTable 9.4.

Page 280: Yorikiyo Nagashima - Archive

9.3 Internal Symmetries 255

Table 9.4 Property of Dirac bilinears under C transformation.

S(t, x ) P(t, x ) V μ (t, x ) Aμ (t, x ) T μν (t, x )

C S†(t, x ) P†(t, x ) �V μ †(t, x ) Aμ †(t, x ) �T μν †(t, x )

Note 1: S D ψ ψ, P D iψ γ5 ψ, V μ D ψγ μ ψ, Aμ D ψγ μ γ5 ψ, T μν D ψ σμν ψNote 2: V μ D (V 0, V ), Vμ D (V 0,�V )

Charge Conjugation of Vector FieldsAs the vector and the scalar field satisfy the same Klein–Gordon equation, thecharge conjugation operation transforms them as follows:

' ! C'C�1 D ηC'† , V μ ! C V μ C�1 D ηCV μ † (9.171)

The phase factor ηC cannot be determined for the free field. But if the field inter-acts, for instance, with the Dirac field, a Lorentz-invariant interaction Lagrangianhas the form

LINT D f ψ1ψ2' C gψ1γ μ ψ2Vμ C (h.c.) (9.172)

The second term (h.c.) is the hermitian conjugate of the first term, and by its pres-ence the Lagrangian becomes hermitian. The requirement of C invariance for theLagrangian fixes the phase. Using Table 9.4 for the transformation of the Diracbilinears, we obtain

'C! '† , V μ C

! �V μ† (9.173a)

f D f � , g D g� (9.173b)

As we noted earlier, by C operation, in addition to the electric charge, the strange-ness, baryon number, lepton number and all other additive quantum numberschange their sign, and therefore the eigenstate of the charge conjugation has tobe truly neutral in the sense that all the quantum numbers have to be zero.

C Parity of the Photon The photon is a massless vector boson and is described bythe Maxwell equation.

@μ@μ Aν D q j ν (9.174)

The interaction Lagrangian of the electromagnetic interaction is given by LINT D

�q j μA μ. Since the electric current changes its sign by C transformation, A μ mustalso change sign to keep the Lagrangian invariant. Therefore, the C parity of thephoton is �1. This can also be derived from Eq. (9.173) when applied to a realvector field. The C parity of an n-photon system is (�1)n . This is independent ofboth the orbital and the spin angular momentum.

Page 281: Yorikiyo Nagashima - Archive

256 9 Symmetries

Experimental TestsC in the Strong Interaction: Let us consider a process

p C p ! πC(π0)C XC ! p C p ! π�(π0)C X (9.175)

If the strong interaction conserves the charge conjugation symmetry, the angulardistribution measured relative to the incoming antiprotons should exhibit a sym-metry between θ and π � θ , as shown in Fig. 9.4 and in the following equations.The energy spectrum should be identical in both cases, too:

dσdΩ

(πCI θ ) DdσdΩ

(π�I π � θ ) (9.176a)

dσdΩ

(π0I θ ) DdσdΩ

(π0I π � θ ) (9.176b)

dσdE

(πC) DdσdE

(π�) (9.176c)

These relations have been confirmed experimentally within the errors.

C Parity of the Neutral Two-Particle System: A two-particle system consisting of aparticle and its antiparticle is neutral and can be a C eigenstate. We consider themas identical particles in different states. If the spin and orbital angular momentumof the system are L, S , the C parity of the system is expressed as

C D (�1)LCS (9.177)

Proof: If they are fermions ( f f D e�eC, p p , etc.) they change sign by a par-ticle exchange because of Fermi statistics. The exchange consists of that of thecharge (C), space coordinates and spin coordinates. Therefore

�1 D C(�1)L(�1)SC1 (9.178)

Here, we have used the fact that the spin wave function is symmetric when S D 1and antisymmetric when S D 0. The case for the boson system can be provedsimilarly. �

p p

π+

θ p p

π-

π θ

C.C.

(a) (b)

Figure 9.4 (a) p C p ! πC C X , (b) p C p ! π� C X . If C is conserved, dσ(p C p !πC) D dσ(p C p ! π�), where dσ stands for either dσ/dΩ or dσ/dE . If the pion angle ismeasured relative to the incident antiprotons, dσ/dΩ (θ ) D dσ/dΩ (π � θ ).

Page 282: Yorikiyo Nagashima - Archive

9.3 Internal Symmetries 257

Problem 9.8

The �0 meson is a vector particle with J P D 1� and decays into πC C π�. Proveit cannot decay into 2π0.

C in Electromagnetic Interactions: The most stringent test of C symmetry in elec-tromagnetic interactions comes from the nonexistence of the π0 ! 3γ decay [311]:

Γ (π0 ! 3γ )Γtotal

< 3.1 � 10�8 90%CL (CL D confidence level) (9.179)

The following decays are known to occur via the electromagnetic interactionthrough their strength (the decay rate). Asymmetry tests are not as stringent asthat of decay branching ratios, but we list them here as another test. As η(548)is a neutral scalar meson having 0� and mass of 548 MeV, it changes to itself byC operation and so do π0 and γ . Since πC and π� are interchanged, the energyspectrum of the decays

η ! πC C π� C π0 (9.180a)

η ! πC C π� C γ (9.180b)

should be identical. Experimentally, the asymmetry is less than 0.1%.As π0, η decay into 2γ , their C parity is positive and they cannot decay into 3γ .

Conversely, if a particle decays into 3γ or π0γ , ηγ its C parity is negative.

Problem 9.9

Positronium is a bound state of an electron and a positron connected by theCoulomb force. For the L D 0 ground state, there are two states, with S D 0, 1.Show that the S D 0 state decays into 2γ and the S D 1 state into 3γ , but not viceversa.

C Violation in Weak Interactions: Next we consider weak interactions, for example

μ� ! e� C ν C ν (9.181)

If C symmetry is respected, the helicity of the electron and that of the positronshould be the same provided other conditions are equal. The helicity can be mea-sured first by making the electron emit photons by bremsstrahlung and then let-ting the photons pass through magnetized iron. The transmissivity of the photondepends on its polarization, which in turn depends on the electron helicity. Theresult has shown that h(e�) ' �1, h(eC) ' C1 [111, 267]. Therefore charge con-jugation symmetry is almost 100% broken in weak interactions. The origin lies inthe Hamiltonian. The parity-violating weak interaction Hamiltonian was given inEq. (9.92). If we take into account the C transformation property of the axial vector(see Table 9.4) we have

ψ2γ μ γ 5ψ1C! ψ1γ μ γ 5ψ2 (9.182)

Page 283: Yorikiyo Nagashima - Archive

258 9 Symmetries

This and Eq. (9.173) mean under C transformation

CV ! C�V , CV

0 ! �CV0 �

CA ! C�A , C 0

A ! �C 0 �A

(9.183)

Namely, if C invariance holds, CV, CA have to be real and C 0V, C 0

A have to be pure-ly imaginary. But the decay asymmetry of a polarized nucleus that was discussedfollowing Eq. (9.93) has shown C 0

A D �CA D real. Therefore, C is also maximallyviolated. But the combined CP symmetry does not change the V–A Hamiltonian ofEq. (9.94) and is conserved.

The chirality operator (1 � γ 5) in front of the lepton field operator in Eq. (9.94)is the origin of h(e�) ' h(ν e) D �1, h(eC) ' h(ν e) D C1 as we discussed inSect. 4.3.5. The latter is typical of the weak interaction where the weak boson W μ

couples only to the left-handed field (i.e. (1� γ 5)ψ). This will be discussed in detailin Chap. 15.

The V–A interaction Eq. (9.94) respects T invariance as well as CP invariance. Inprinciple, CP invariance and T invariance are independent. However, the combinedCPT invariance is rooted deeply in the structure of quantum field theory and is thesubject of the next section.

As long as CPT invariance holds, CP violation means T violation. There is ev-idence that small CP violation exists in the kaon and B-meson decays, hence thesame amount of T violation is expected to exist. In fact, a T-violation effect consis-tent with CPT invariance has been observed in the neutral K meson decays [107],which will be described in detail in Chap. 16.

9.3.3CPT Theorem

CPT is a combined transformation of C, P and T. All the coordinates are inverted(t, x )! (�t,�x ) and the operation is antiunitary because of the T transformation.All c-numbers are changed to their complex conjugates. Writing the combined op-erator as Θ , its action is summarized as

(t, x )CPT��! (�t,�x)

c-numbers ! (c-numbers)*houtjH jini ! hΘ injH jΘoutiΘ jq, p , σi ! jq, p ,�σi

(9.184)

The CPT combined transformation property of the Dirac bilinears is given by

ψ1Γ ψ2CPT��!

(ψ2Γ ψ1 Γ D S, P, T

�ψ2 Γ ψ1 Γ D V, A(9.185)

independent of the order of the C, P, T operations. If we take the hermitian con-jugate of the bilinears, the original form is recovered except for the sign of the co-ordinates. Transformation properties of the scalar, pseudoscalar, vector fields, etc.

Page 284: Yorikiyo Nagashima - Archive

9.3 Internal Symmetries 259

are shown to be the same as the corresponding Dirac bilinears. It follows that anyLorentz scalar or second-rank tensor made of any number of fields are broughtback to their original form by a combined operation of (CPT C hermitian conju-gate). Therefore the Lagrangian density (a scalar) and the Hamiltonian density (asecond rank tensor) transform under CPT

L(t, x )CPT��! L†(�t,�x) (9.186a)

H (t, x )CPT��! H †(�t,�x ) (9.186b)

Since L and H are hermitian operators and the action is their integral over allspace-time, we conclude that the equations of motion are invariant under CPTtransformation. The Hamiltonian H D

Rd3xH changes the sign of time t, but as

H is a conserved quantity and does not depend on time, the transformed Hamilto-nian is the same as before it is transformed. As the scattering matrix is composedof the Hamiltonian, it is invariant, too. Unlike individual symmetry of C, P or T,the combined CPT invariance theorem can be derived with a few fundamental as-sumptions, such as the validity of Lorentz invariance, the local quantum field theo-ry and the hermitian Hamiltonian. The violation of CPT means violation of eitherLorentz invariance or quantum mechanics.8) Theoretically it has firm foundations.Experimental evidence of CPT invariance is very strong, too.

We list a few examples of the CPT predictions:(1) The mass of a particle and its antiparticle is the same.

Proof: Let H be the CPT-invariant Hamiltonian. As the mass is an eigenstate ofthe Hamiltonian in the particles rest frame

m D hq, σjH jq, σi D hq, σjΘ�1Θ H Θ�1Θ jq, σi D hΘ q, σjH jΘ q, σi�

D hq,�σjH jq,�σi (9.187)

The spin orientation of jq,�σi is different from jq, σi, but the mass does notdepend on the spin orientation (see Poincaré group specification for a particle statein Sect. 3.6). Therefore the masses of the particle and antiparticle are the same.

(2) The magnetic moment of a particle and its antiparticle is the same but itssign is reversed.

(3) The lifetime of a particle and its antiparticle is the same.

Proof: Writing the S-matrix as

Sα D 1C i2πδ(Mα � E f )hIoutjT jαIouti (9.188)

8) It is pointed out that CPT might be violated in quantum gravity [130, 210].

Page 285: Yorikiyo Nagashima - Archive

260 9 Symmetries

Table 9.5 Summary of transformation properties under C, P and T.

S(t, x ) P(t, x ) V μ (t, x ) Aμ (t, x ) T μν (t, x )

P S(t,�x ) �P(t,�x ) Vμ (t,�x ) �A μ (t,�x ) Tμν (t,�x )

C S†(t, x ) P†(t, x ) �V μ †(t, x ) Aμ †(t, x ) �T μν †(t, x )T S(�t, x ) �P(�t, x ) Vμ (�t, x ) A μ (�t, x ) �Tμν (�t, x )

CP S†(t,�x ) �P†(t,�x ) �Vμ†(t,�x ) �A μ

†(t,�x ) �Tμν†(t,�x )

CPT S†(�t,�x ) P†(�t,�x ) �V μ †(�t,�x ) �Aμ †(�t,�x ) T μν †(�t,�x )

Note 1: S D ψ ψ, P D ψγ5 ψ, V μ D ψγ μ ψ, Aμ D ψγ μ γ5 ψ, T μν D ψ σμν ψNote 2: The differential operator @μ transforms exactly the same as the vector field Vμ except it doesnot change sign under C-operation.

the total decay rate of α is given by

Γ (α, qα) D 2πX, q

δ(Mα � E f )jh, qI outjT jα, qαI outij2

D 2πX, q

δ(Mα � E f )jh, qI outjΘ�1Θ T Θ�1Θ jα, qαI outi�j2

D 2πXN, Nq

δ(MNα � E f )jh N, NqI injT j Nα, NqαI ini�j2

D Γ ( Nα, Nqα) (9.189)

where, qα , q denote quantum numbers other than particle species. We have usedthe fact that for one particle state, jα, qαI ini D jα, qαI outi, Mα D MNα and thatboth “in” and “out” states form complete sets of states, i.e.X

, q

j, qI inih, qI i nj DX, q

j, qI outih, qI outj D 1 (9.190)

Finally, in Table 9.5 we give a summary list of the transformation properties of C,P and T.

9.3.4SU(2) (Isospin) Symmetry

Isospin MultipletsAmong hadrons there are many small groups in which members having differentelectric charge share some common properties. For instance, in the following ex-ample members have almost the same mass values (given in units of MeV) within

Page 286: Yorikiyo Nagashima - Archive

9.3 Internal Symmetries 261

groups.

24 p

n

35 938.272

939.565

24πC

π0

π�

35 139.570

134.977139.570

24KC

K0

35 493.68

497.65

24Σ C

Σ 0

Σ �

35 1189.37

1192.641197.45

δm/m 0.14 % 3.3 % 0.8 % 0.33 %

(9.191)

In addition, it is known that the strength of the interaction is almost the same with-in the accuracy of a few percent. Because they have different electric charges, thedifference can be ascribed to the electromagnetic interaction, which is weaker by� O(α) D 1/137. Therefore, if we neglect the small mass difference, there is nodistinction among the members of the groups as far as the strong interaction isconcerned. They constitute multiplets but are degenerate. In analogy to spin multi-plets, which are degenerate under a central force, we call them “isospin multiplets”.Just as the degeneracy of the spin multiplets is resolved by a magnetic field (theZeeman effect), that of the isospin multiplets is resolved by turning on the electro-magnetic force. We conceive of an abstract space (referred to as internal space incontrast to external or real space) and consider the strong interaction as a centralforce with the electromagnetic force violating the rotational invariance just like themagnetic field in external space. If we identify the isospin as the equivalent of spinin real space, the mathematical structure of isospin is exactly the same as that ofspin, which is the origin of the name.

Charge IndependenceHistorically, the concept of isospin was first proposed by Heisenberg in 1932 toconsider the proton and the neutron as two different states of the same particle(nucleon) when he recognized the fact that the energy levels of mirror nuclei arevery similar. Consider the potentials between the nucleons V. When

Vp p D Vnn charge symmetry (9.192)

we call it charge symmetry and when

Vp p D Vnn D Vp n charge independence (9.193)

we call it charge independence. To formulate the mathematics of the charge inde-pendence, we consider transformation of the proton wave function ψp and that ofthe neutron ψn and define the doublet function ψ by

ψ ��

ψp

ψn

�! ψ0 D

�ψp

0ψn

0

�D

�αψp C ψn

γ ψp C δψn

�D U ψ (9.194)

Charge independence means the expectation value of the potential does not changeunder the transformation U. Conservation of probability requires U to be a unitarymatrix

UU† D U†U D 1 (9.195)

Page 287: Yorikiyo Nagashima - Archive

262 9 Symmetries

Under this restriction, U should have the form

U D e i φ�

α �� α�

�, jαj2 C jj2 D 1 (9.196)

If we further require U to be unimodular (det U D 1), we have φ D 0. Generally,any transformation that keeps the norm of N complex numbers

ψ† ψ D jψ1j2 C jψ2j

2 C � � � C jψN j2 (9.197)

invariant makes a group that is called a unitary group U(N ), and when det U D 1it is called a special unitary group SU(N ). An SU(N ) matrix can be expressed as

U D exp

24� i

2

N2�1XiD1

λ i θi

35 (9.198)

where λ i are traceless hermitian matrices (Appendix G) called generators of theSU(N) transformation. For N D 2, there are three traceless hermitian matricesand we can adopt the Pauli matrices for them:

τ1 D

�0 11 0

�, τ2 D

�0 �ii 0

�, τ3 D

�1 00 �1

�(9.199)

Since I D τ/2 satisfies the same commutation relation

[Ii , I j ] D i ε i j k Ik (9.200)

as the angular momentum, the SU(2) transformation is mathematically equivalent(holomorphic) to the rotational operation. The difference is that the operand is nota spin state in real space, but a multiplet consisting of particles of roughly thesame mass as those in Eq. (9.191). Namely, the act of rotation is performed not ina real space but in a kind of abstract space (referred to as isospin space) and thestate vector ψ denotes p when its direction is upward and n when downward. Suchsymmetry in the internal space is called an internal symmetry. External symmetriesrefer to those in a real space such as spin and parity. If we denote the differenceof the states like p or n by a subscript r as φ r (x ), the internal symmetry operationacts on r, while the external symmetry acts on the space-time coordinate x and legs(components) of Lorentz tensors. They are generally independent operations, butsometimes connected like CPT.

As the mathematics of SU(2) is the same as that of angular momentum, it isconvenient to express various physical quantities using the same terminology asused in space rotation. For instance, p and n make a doublet; we say the nucleonhas isospin I D 1/2. The three kinds of pions π˙, π0 constitute a triplet, or a vectorin isospace, and have I D 1. The term “charge independence” of the nuclear forcemeans the interaction between the nucleon and the pion is rotationally invariant inisospin space, namely the interaction Hamiltonian is an isoscalar.

Page 288: Yorikiyo Nagashima - Archive

9.3 Internal Symmetries 263

Conserved Isospin CurrentWe consider the proton and the neutron as the same particle, but they have dif-ferent projections in the abstract isospin space and constitute a doublet, called anisospinor:

ψ D�

pn

�(9.201)

Here we have adopted the notation p , n to denote ψp , ψn . The Lagrangian densityis written as

L D ψ(i γ μ@μ � m)ψ

D p (i γ μ@μ � m)p C n(i γ μ@μ � m)n(9.202)

They should have the same mass because of the isospin symmetry. Note, p is aspinor with four components in real space, and ψ a two-component spinor inisospin space. Therefore, ψ has 8 independent components. In exactly the samemanner as a rotation in real space, a rotation in isospin space can be carried out byusing a unitary operator

UI D e�iP

i α j I j D e�i α�I (9.203)

where I D (I1, I2, I3) satisfies Eq. (9.200). Let us consider transformations definedas

ψ ! ψ0 D e� i2 τ j α j ψ � e� i

2 τ�α ψ, (9.204a)

and

ψ ! ψ0D ψ e

i2 τ�α where ψ D (p , n) (9.204b)

where α D (α1, α2, α3) is a set of three independent but constant variables. Sinceτ i , α i are independent of space-time, the transformation does not change anyspace-time structure of the spinors p , n, and the Lagrangian density is kept in-variant under the continuous isospin rotation. Equations (9.204) have exactly thesame form as the U(1) phase transformation of Eq. (9.155) and is called the (global)SU(2) gauge transformation. Now consider an infinitesimal rotation

ψ ! ψ C δψ , ψ ! ψ C δψ

δψ D �i2

τ i ψε i D �i2

[τ i ]ab ψb ε i , δψ Di2

ψτ i ε i Di2

ψ a [τ i ]ab ε i

(9.205)

Applying Noether’s formula Eq. (5.32) to the transformation, we obtain

J μ DδL

δ(@μ φ r)δφ r

φ rDψ ,ψD

δL

δ(@μ ψ)δψ C δψ

δL

δ(@μ ψ)

D

��

δL

δ(@μ ψa )i2

[τ i ]ab ψb Ci2

ψ a [τ i ]abδL

δ(@μ ψ b)

�ε i

D12

ψγ μ τ i ψε i

(9.206)

Page 289: Yorikiyo Nagashima - Archive

264 9 Symmetries

The second term in the second line vanishes because there is no @μ ψ in the La-grangian. Since ε i is an arbitrary constant, Eq. (9.206) defines a conserved currentreferred to as isospin current:

( J μ) j D ψγ μ τ j

2ψ (9.207)

It is easy to check that an isospin operator defined by

OI j D

Zd3x ψγ 0 τ j

2ψ (9.208)

satisfies the relations

[I j , ψ] D �I j ψ , [I j , ψ†] D I j ψ , e i α�OI ψ e�i α OI D e�i α�I ψ (9.209)

Therefore the operator OI is the generator of the isospin rotation which appeared inEq. (9.203). The isospin current differs from the Dirac current only in the presenceof the isospin operator τ/2 sandwiched between the isospinors ψ and ψ.

In a similar vein, when we have three identical real Klein–Gordon fields φ D(φ1, φ2, φ3), we can also establish the rotational invariance of φ �φ D φ1

2Cφ22C

φ32 and that of the Lagrangian, which can be expressed as

L D @μ φ† � @μ φ � m2φ† � φ (9.210)

leading to the existence of conserved isospin angular momentum, for which φ isa I D 1 triplet. Since the pions have spin 0 and constitute a triplet of almost thesame mass, we can identify them with the field φ.9) In analogy to Eq. (9.208), itcan be expressed as

I π D �iZ

d3x πa [t]ab φb (9.211a)

t1 D

240 0 0

0 0 �i0 i 0

35 , t2 D

24 0 0 i

0 0 0�i 0 0

35 , t3 D

240 �i 0

i 0 00 0 0

35 ,

(9.211b)

However, to identify (jπCi, jπ0i, jπ�i) as the T3 D (C1, 0,�1) state, it is moreconvenient to use a slightly different representation:

t10 D

1p

2

240 1 0

1 0 10 1 0

35 , t2

0 D1p

2

240 �i 0

i 0 �i0 i 0

35 ,

t30 D

241 0 0

0 0 00 0 �1

35 ,

(9.212)

9) Note, the pion field is a I D 1 triplet representation of the SU(2) group. If the fundamentalrepresentation with I D 1/2 does not exist, we may have to consider the pion as the fundamentalrepresentation of the SO(3) group, as π j ’s are real fields.

Page 290: Yorikiyo Nagashima - Archive

9.3 Internal Symmetries 265

Problem 9.10

Prove that I , I 0, satisfy

[Ii , I j ] D i ε i j k Ik , [Iπ i , Iπ j ] D i ε i j k Iπk (9.213a)

[Ii , ψ j ] D i ε i j k ψk , [Iπ i , φ j ] D i ε i j k φk (9.213b)

Problem 9.11

Prove �Iπ3,

φ1 ˙ i φ2p

2

�D ˙

φ1 ˙ i φ2p

2, [Iπ3, φ3] D 0 (9.214)

This equation can be used to define

φ1 ˙ i φ2p

2j0i D α p jπ˙i D �jπ˙i , φ3 D jπ0i (9.215)

The phase α p was chosen to produce

Iπ˙jI D 1, I3 D 0i Dp

2j1,˙1i (9.216)

which is a special case of the usual convention adopted from the angular momen-tum operator:

Iπ˙jI, I3i D [(I � Iπ3)(I ˙ I3 C 1)]1/2jI, I3 ˙ 1i (9.217)

The real nucleon and the pion do not have exactly the same mass within the multi-plet. The isospin rotation is not an exact symmetry, but as long as the difference issmall we can treat the isospin as a conserved quantity.

Isospin of the AntiparticleIn order to find the isospin of the antiparticle, we apply the charge conjugation op-eration to Eq. (9.194). Then the field represents the antiparticle. As the C operationcontains complex conjugation, the transformation matrix becomes

p 0 D α� p C �n (9.218a)

n0 D � p C α n (9.218b)

here, p , n denote the fields of the antiparticles p , n. In general, if the symmetrytransformation matrix is U, that of the antiparticle is given by its complex conju-

Page 291: Yorikiyo Nagashima - Archive

266 9 Symmetries

gate U�. Rearranging Eq. (9.218), we obtain

(�n0) D α (�n)C p (9.219a)

p 0 D ��(�n)C α� p (9.219b)

which means��n0

p 0�D

�α �� α�

���n

p

�D U

��n

p

�(9.220)

The equation means that if (p , n) is an isodoublet, (�n, p ) is also an isodoublet thattransforms in the same way.

Charge Symmetry We mentioned charge symmetry in Eq. (9.192). It can be de-fined more explicitly as the rotation in isospin space by π on the y-axis. For a dou-blet its operation is

e�i π I2 D e�i(π/2)τ2 D �i τ2 D

�0 �11 0

�(9.221)

Page 292: Yorikiyo Nagashima - Archive

267

10Path Integral: Basics

The path integral is now widely used in many fields wherever quantum mechanicaltreatments are necessary, including quantum field theory and statistical mechan-ics. In solving the simple problems that are treated in this book, it is hardly a morepowerful tool than the ordinary canonical formalism. In fact, when ambiguitiesarise in the path integral formulation, they are usually resolved by comparing itwith the canonical quantization formalism or Schrödinger equation. But becauseof its powerful concept and widespread use, acquaintance with the method is in-dispensable. Feynman’s path integral expresses a particle’s dynamical propagationas a coherent sum of an infinite number of amplitudes over all possible paths.The path that is dictated by the classical equation of motion is simply the mostprobable, thus the method yields important insight into the connection betweenthe classical and quantum mechanical approaches. It has the remarkable feature ofnot using operators in Hilbert space. Only ordinary c-numbers appear in the for-mula, although at the expense of an infinite number of integrations. What we dealwith primarily is the propagator or the transition amplitude.

The purpose of this chapter is to give a basic idea of the path integral. Applica-tions to field theory are given in the next chapter.

10.1Introduction

Since the underlying ideas and mathematics are very different from those of thestandard canonical quantization, we go back to the basics of quantum mechanicsand review some of their features to pave the way to the path integral.1)

10.1.1Bra and Ket

We define a state or Dirac’s ket jqi as the eigenstate of a position operator Oq, namely

Oqjqi D qjqi (10.1)

1) In this chapter we write „ and c explicitly.

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 293: Yorikiyo Nagashima - Archive

268 10 Path Integral: Basics

where q is the coordinate of the position; it is a c-number. In this book we use a“hat”, e.g. OA, to denote an operator with eigenvalue A. For any ket vector, we candefine its conjugate, bra vector and a scalar (inner) product of a bra hφj and a ketjψi as a pure number with the property

hφjψi D hψjφi� (10.2)

The position eigenstates define a complete orthonormal basis. Namely they satisfy

hqjq0i D δ(q � q0) W orthonormality (10.3a)Zdqjqihqj D 1 W completeness (10.3b)

Similarly, eigenstates of the momentum operator Op also satisfy the orthonormalityand completeness conditions

Op jp i D p jp i

hp jp 0i D 2π„ δ(p � p 0)Zd p

2π„jp ihp j D 1

(10.4)

The position and momentum operators satisfy the commutation relation

[ Oq, Op ] D i„ (10.5)

10.1.2Translational Operator

If we define a translational operator by

OU(a)jqi D jq C ai (10.6)

it satisfies the relation

Oq OU(a)jqi D Oqjq C a >D (q C a)jq C ai D f OU(a) Oq C a OU(a)gjqi (10.7)

Since this holds for any jqi that are a complete orthonormal basis,

[ Oq, OU(a)] D a OU(a) (10.8)

We learned in the previous chapter that OU(a) can be expressed as

OU(a) D e�i a Op (10.9)

Page 294: Yorikiyo Nagashima - Archive

10.1 Introduction 269

It is straightforward to show

OU(a)�1 Oq OU(a) D e i a Op Oqe�i a Op D Oq C a (10.10a)

OU(a) OU(b) D OU(a C b) (10.10b)

It follows that an operator O( Oq, Op ) that can be Taylor expanded in powers of Opand Oq is transformed by OU(a) as

OU(a)�1O( Oq, Op ) OU(a) D O( Oq C a, Op ) (10.11)

Using the translation operator, any eigenvector of the Oq is expressed as

jqi D OU(q � q0)jq0i (10.12)

where q0 is the origin of the coordinate. When translational invariance holds, theorigin can be placed anywhere. Without loss of generality we may put q0 D 0 andthe position eigenvector is expressed as

jqi D OU(q)j0iq (10.13)

The subscript “q” is attached to remind us it is a position eigenvector. Let us seethe effect of Op acting on the position vector:

jq C dqi D�

1� dqi„Op�jqi C O(dq2)

) Op jqi D limd q!0

i„jq C dqi � jqi

dqD i„

@

@qjqi

(10.14)

Multiplying by hp j on both sides, we obtain

hp j Op jqi D p hp jqi D i„@

@qhp jqi (10.15)

This has a solution

hp jqi D N exp��

i„

p q�

(10.16)

To determine the constant N, we use Eq. (10.4). Inserting a complete set in theintermediate states

hp 0jp i DZ

dqhp 0jqihqjp i DZ

dqN e� i„ q p 0

N�ei„ q p 0

D 2π„δ(p 0 � p )jN j2

) jN j D 1

(10.17)

Therefore, we have obtained the transformation amplitude:

hqjp i D hp jqi� D exp�

i„

q p�

(10.18)

Page 295: Yorikiyo Nagashima - Archive

270 10 Path Integral: Basics

Then it can easily be shown that

jqi DZ

d p2π„jp ihp jqi D

Zd p

2π„e� i

„ q p jp i (10.19a)

jp i DZ

dq jqihqjp i DZ

dq ei„ q p jqi (10.19b)

Namely, the position basis vector and the momentum basis vector are connected toeach other by Fourier transformation.

To find representations of Op in the position basis, we use Eq. (10.14). Multiplyingby hqj on both sides, we obtain

hqj Op jq0i D hqji„@

@q0 jq0i D

�i„

@

@q0 hqjq0i�D i„

@

@q0 δ(q � q0)

D �i„@

@qδ(q � q0)

(10.20)

Note: The operator Op acts on the ket vector jqi and does nothing to ordinary c-num-bers. Equation (10.15) is different from the one used in the Schrödinger equation,p D �i@/@q. This is because “bra” and “ket” are elements of abstract Hilbert spaceand the Schrödinger wave function is a representation of the ket vector jψi in thecoordinate basis.2) Defining the wave function as

ψ(q) D hqjψi (10.21)

The relations between the ket vectors

Oqjqi D qjqi (10.22a)

Op jqi D i„@

@qjqi (10.22b)

hqjq0i D δ(q � q0) and 1 DZ

dqjqihqj (10.22c)

are converted to operations acting on the wave function:

hqj Oq D qhqj ! hqj Oqjψi D qhqjψi D qψ(q) (10.23a)

hqj Op D �i„@

@qhqj ! hqj Op jψi D �i„

@

@qhqjψi D �i„

@

@qψ(q) (10.23b)

hφjψi DZ

dqhφjqihqjψi DZ

dqφ(q)�ψ(q) (10.23c)

2) The situation is reminiscent of the vector Ain Newton mechanics. Its length, directionand inner product are defined, but otherwiseit is an abstract entity. A representation(A1, A2, A3) can be obtained by expandingin terms of the Cartesian base vectorse i . In the ket notation, it is written as

jAi DPje i ihe i jAi. The effect of an action

on the basis vectors is opposite to thaton the representation. For instance, theexpression for an active rotation of a vector ismathematically equivalent to inverse rotationof the base vectors.

Page 296: Yorikiyo Nagashima - Archive

10.2 Quantum Mechanical Equations 271

Using

hqjp i D e i p q/„ (10.24)

ψ(q) D hqjψi D hqj�Z

d p2π„jp ihp j

�jψi D

Zd p

2π„hqjp ihp jψi

D

Zd p

2π„e i p q/„ψ(p )

(10.25a)

Similarly

ψ(p ) DZ

dqe�i p q/„ψ(q) (10.25b)

Notice the sign of the exponent is different from that of the jqi and jp i Fouriertransformation of Eq. (10.19).

10.2Quantum Mechanical Equations

10.2.1Schrödinger Equation

In terms of bras and kets, the Schrödinger equation is expressed as

i„@

@tjψ(t)i D H( Op , Oq)jψ(t)i (10.26)

where the Hamiltonian H is time independent. Let us calculate the matrix elementof the Hamiltonian operator in the position basis:

OH DOp 2

2mC V( Oq) (10.27)

hqjOp 2

2mjq0i D

12m

Zd y hqj Op jyihy j Op jq0i

D1

2m

Zd y

��i„

@

@qδ(q � y )

���i„

@

@yδ(y � q0)

D �„2

2m

Zd y

@

@qδ(q � y )

@

@yδ(y � q0)

D �„2

2m@2

@q2 δ(q � q0)

hqjV( Oq)jq0i D V(q)δ(q � q0)

) hqj OH jq0i D��„2

2m@2

@q2 C V(q)�

δ(q � q0) (10.28)

In the q representation

i„@

@thqjψ(t)i D

Zdq0 hqj OH ( Op , Oq)jq0ihq0jψ(t)i (10.29a)

Page 297: Yorikiyo Nagashima - Archive

272 10 Path Integral: Basics

Inserting Eq. (10.28)

i„@

@tψ(t q) D OH(q,�i„r)ψ(t q) (10.29b)

This is the familiar Schrödinger equation one encounters in textbooks of quantummechanics.

10.2.2Propagators

The Schrödinger equation can be cast in an integral form. To do this we define atime evolution operator by

jψ(t1)i D

(OK (t1, t2)jψ(t2)i t1 > t2

0 t1 < t2(10.30)

If we insert Eq. (10.30) into the Schrödinger equation (10.26) and integrate, weobtain

OK(t1, t2) D θ (t1 � t2)e� i„ OH t1 e

i„ OH t2 D θ (t1 � t2)e� i

„ OH (t1�t2) (10.31)

The last equation holds when the Hamiltonian is time independent. The time evo-lution operator satisfies�

i„@

@t1� OH

�OK (t1, t2) D i„δ(t1 � t2) (10.32)

The propagator is given by taking matrix elements in a given basis. In the coordi-nate basis defined by Oqjqi D qjqi, it is written as

hq1j OK (t1, t2)jq2i � K(t1q1I t2q2) (10.33)

Because OK (t1, t2)! 1 as t1 ! t2, it satisfies the initial condition

limt1!t2

K(t1q1I t2q2) D δ(q1 � q2) (10.34)

Equation (10.32) shows that it obeys the equation�i„

@

@t1� OH

�K(t1q1I t2q2) D i„δ(t1 � t2)δ(q1 � q2) (10.35)

which shows that it is the Green’s function (or the propagator) for the time-depen-dent Schrödinger equation. We remind the reader that OK(t1, t2) is an operator whileK(t1q1I t2 q2) is a number.

The Schrödinger equation assumes all the time dependence is contained in thewave function and the operator is time independent. We can also take the attitudethat all the time dependence is contained in the operator and the wave function is

Page 298: Yorikiyo Nagashima - Archive

10.2 Quantum Mechanical Equations 273

time independent, which is called the Heisenberg picture. They are mathematicallyequivalent. The difference lies in the choice of base vectors. In the field theoreti-cal treatment, the Heisenberg picture is generally more convenient and intuitivelyappealing, because equations of quantized fields are straightforward extensions ofclassical field equations.

So we will investigate the propagator in the Heisenberg picture. Attaching thesubscripts S or H to differentiate objects in the two pictures

OH(t) D ei„ OH t OSe� i

„ OH t (10.36a)

jt qiH D ei„ OH t jqiS (10.36b)

Note, in the Heisenberg picture

OqHjt qiH D qjt qiH (10.37)

Despite its apparent dependence on the time coordinate in the ket, it representsa stationary state. Therefore, it should not be interpreted as that the ket changeswith time; rather it should be interpreted as that the dynamical development ofan operator at time t is transformed to the state. The transition amplitude in theHeisenberg picture is given by

Hht1 q1jt2q2iH D S hq1je� i„ OH t1 e

i„ OH t2 jq2iS D S hq1j OK (t1, t2)jq2iS

D K(t1q1I t2q2)(10.38)

Equation (10.38) states that the propagator is a probability amplitude for finding thesystem at some point q1 in the configuration space at time t1 when it was originallyat q2 at time t2. Or stated differently, it is a matrix element of the time-ordered timeevolution operator between the coordinate basis states in the Heisenberg picture.

As stated earlier, the state vector jψiH D ei„ OH t jψ(t)iS in the Heisenberg pic-

ture is stationary, hence the wave function hqjψiH is stationary. But as Eq. (10.36b)shows, the coordinate basis is moving and Hht qjψiH is the same wave function asthat appears in the Schrödinger equation:

Hht qjψiH D S hqjψ(t)iS D ψ(t q) (10.39)

From now on, we omit the subscript H. Completeness of states enables us to write

ψ(t f q f ) D ht f q f jψi DZ

dqiht f q f jti q iihti q i jψi

D

Zdqi K(t f q f I ti q i)ψ(ti q i )

(10.40)

This is Huygens’ principle, that a wave at later time t f is the sum of propagatingwaves from qi at time ti . The equation is very general and satisfies the causalitycondition (K(t f I ti ) D 0 for ti > t f ).

Page 299: Yorikiyo Nagashima - Archive

274 10 Path Integral: Basics

10.3Feynman’s Path Integral

10.3.1Sum over History

To calculate the propagator, we first divide the continuous time interval into N smallbut finite and discrete intervals

t j D t0 C j ε , ε Dt f � ti

N, t0 D ti , tN D t f

exp��

i„OH (t f � ti )

�D

NYj D1

exp��

i„OH (t j � t j �1)

� (10.41)

and insert complete sets of coordinate bases in the intermediate time point (seeFig. 10.1).

�t

�q�

qiti D t0 ���

������t1

���t2

�...

��t j q j�

������

����

...t f D tN �

q f

Figure 10.1 To perform the path integral, the time coordinate is discretized to N intervalsbetween ti and t f . At each t D t j , q j is integrated from �1 to1. qi and q f are fixed.R Dq � limN!1

Rdq1 � � � dqN�1.

ht f q f jti q ii D

Z� � �

Zdq1dq2 � � � dqN�1ht f q f jtN�1, qN�1i

� � � ht j , q j jt j �1, q j �1i � � � ht1q1jti q ii

(10.42)

Referring to Eq. (10.38), we can rewrite each intermediate inner product as

ht j q j jt j �1q j �1i D hq j je� i„ OH (t j �t j �1)jq j �1i D hq j je� i

„ ε OH jq j �1i

(10.43)

If the Hamiltonian is of the form

OH DOp 2

2mC V( Oq) (10.44)

Page 300: Yorikiyo Nagashima - Archive

10.3 Feynman’s Path Integral 275

hq j jV( Oq)jq j �1i D V(q j �1)hq j jq j �1i D

Zd p

2π„hq j jp ihp jq j �1iV(q j �1)

hq j jOp 2

2mjq j �1i D

Zd p 0

2π„d p

2π„hq j jp 0ihp 0j

Op 2

2mjp ihp jq j �1i

D

Zd p 0

2π„d p

2π„e

i„ p 0q j e� i

„ p q j �1 (2π„)δ(p � p 0)p 2

2m

D

Zd p

2π„hq j jp ihp jq j �1i

p 2

2m(10.45)

) hq j jH( Op . Oq)jq j �1i D

Zd p

2π„hq j jp ihp jq j �1iH(p , q j �1) (10.46)

The Hamiltonian is no longer an operator but is written in terms of classical num-bers. Generally, the Hamiltonian may include a polynomial of the product of op-erators Oq, Op , as, for instance, the magnetic force on the charged particle. In suchcases, a certain subtlety arises. If all Op ’s stand to the left of the coordinate opera-tors, we can insert the complete momentum basis to the left of the Hamiltonianand obtain

hq j je� i„ εH( Op,Oq)jq j �1i '

Zd p

2π„hq j jp ihp j

�1 �

i„

εH( Op , Oq)�jq j �1i

D

Zd p

2π„hq j jp ihp jq j �1i

�1 �

i„

εH(p , q j �1)�

'

Zd p

2π„hq j jp ihp jq j �1ie� i

„ εH(p ,q j �1) (10.47)

which has the same form as Eq. (10.46). On the other hand, if all Op ’s stand to theright of the coordinates,

hq j je� i„ εH( Op,Oq)jq j �1i '

Zd p

2π„hq j j

�1 �

i„

εH( Op , Oq)�jp ihp jq j �1i

D

Zd p

2π„hq j jp ihp jq j �1i

�1 �

i„

εH(p , q j )�

'

Zd p

2π„hq j jp ihp jq j �1ie� i

„ εH(p ,q j ) (10.48)

As there is no well-defined way to order the product of operators, we take a mid-point.3) Putting hq j jp i D e i p q j , we obtain

ht j q j jt j �1 q j �1i D

Zd p

2π„exp

�i p (q j � q j �1) �

i„

εH�

p ,q j C q j �1

2

��(10.49)

3) For a special ordering called Weyl ordering, where the ordering of operators is symmetrized,like p2 q D 1

3 (p2 q C p q p C q p2), the midpoint method is exact, and this is the conventionalassumption. See [113, 255] for more detailed treatments.

Page 301: Yorikiyo Nagashima - Archive

276 10 Path Integral: Basics

Writing all the inner products similarly, we obtain

ht f q f jti q ii D limε!0

Z� � �

Zdq1 � � � dqN�1

d p1

2π„� � �

d pN

2π„

� exp

24 i„

NXj D1

�p j (q j � q j �1) � εH

�p j ,

q j C q j �1

2

� 35(10.50)

Note that there are N inner products, hence N integrals over the momenta, butthere are only N�1 intermediate states, hence N�1 integrals over the coordinates.In the continuum limit of ε ! 0, the exponent can be rewritten as

limε!0

N!1

i„

NXj D1

�p j (q j � q j �1)� εH

�p j ,

q j C q j �1

2

D limε!0

N!1

i„

εNX

j D1

�p j

q j � q j �1

ε

�� H

�p j ,

q j C q j �1

2

Di„

Z t f

t i

d t�p Pq � H(p , q)

D

i„

Z t f

t i

d tL Di„

S [q]

(10.51)

where S [q] is the classical action. Therefore, the propagator may be symbolicallywritten as

ht f q f jti q ii D

ZDqD

p2π„

�exp

�i„

Z t f

t i

d tfp Pq � H(p , q)g�

(10.52)

This is the path integral expression for the propagator from (ti q i) to (t f q f ). Inthe continuum limit q becomes a function of t, and the integral is a functionalintegral. Each function q(t) and p (t) defines a path in phase space. The integrationis over all functions and is infinite dimensional. An outstanding characteristic ofthe path integral is that the quantum operators Oq and Op do not appear, but onlytheir representations expressed in ordinary numbers. It is not obvious, however,that the infinite-dimensional integrals of this type are well-defined mathematically,i.e. if they converge. We proceed assuming they do.

If the Hamiltonian has a special form, namely quadratic in the momenta, suchas

H(p , q) Dp 2

2mC V(q) (10.53)

Page 302: Yorikiyo Nagashima - Archive

10.3 Feynman’s Path Integral 277

we can perform the p-integration:

Zd p j

2π„exp

"i„

ε

p j (q j � q j �1)

ε�

p 2j

2m

!#

D

Zd p j

2π„exp

"�

i ε2m„

(�p j �

m(q j � q j �1)ε

�2

�m(q j � q j �1)

ε

�2)#

D1

2π„

�2πm„

i ε

�1/2

exp�

i mε2„

q j � q j �1

ε

�2�

D m

2π i„ε

�1/2exp

�i mε2„

q j � q j �1

ε

�2�

4) (10.54)

Substituting this expression into Eq. (10.50), we obtain

ht f q f jti q ii D limε!0

N!1

m2π i„ε

�N/2Z� � �

Zdq1 � � � dqN�1

� exp

24 i„

εNX

j D1

�m2

q j � q j �1

ε

�2� V

�q j C q j �1

2

� 35D C

ZDq exp

�i„

Zd t�

12

m Pq2 � V(q) �

D CZ

Dq ei„ S [q] (10.56)

where C is a constant independent of the dynamics of the system and S [q] is theaction of the system. This is the Feynman path integral in an alternative form.Although it is proved only for a special form of the Hamiltonian, the use of theLagrangian has many desirable features. The action is a Lorentz scalar and thesymmetry structure of the system is very easy to grasp. Besides it has the classi-cal path as the first approximation, so the transition from classical mechanics toquantum mechanics is transparent in this formalism. Therefore, we will use thisas the starting point from which any equation of motion is derived. The propagatoris expressed as a sum of all the path [q(t)] configurations connecting ti q i with t f q f

with the weight e(i/„)S [q]. Classically, the path is determined as the one to minimizethe action, but, in quantum mechanics, any path connecting the end points is pos-sible, though the action of a path far from the classical one generally gives a large

4) The Gaussian integral can be extended by analytically continuing jAj to jAje iθ (�π/2 � θ � π/2)inZ

dqe�jAjq2D

rπjAj

!

Z 1

�1dq e�ijAjq2

D

s

π ijAj

(10.55a)

This is a variation of the Fresnel integralsZ 1

0dx sin x2 D

Z 1

0dx cos x2 D

12

rπ2

(10.55b)

Page 303: Yorikiyo Nagashima - Archive

278 10 Path Integral: Basics

value relative to „, hence a small weight because of the oscillatory nature of thefactor e(i/„)S [q]. If there is more than one path, as in the simple and familiar two-slitexperiment, both paths contribute and the interference is an important aspect ofquantum mechanical treatment. If the Hamiltonian is not of the type Eq. (10.53),we may still use Eq. (10.56), but in that case we have to be careful in defining thepath integral measure Dq.

10.3.2Equivalence with the Schrödinger Equation

From the form of the path integral, it is hard to think there is any connectionwith the Schrödinger equation, but we have developed the formalism starting fromthe Schrödinger equation. Both forms are known to reproduce the quantum me-chanical phenomena correctly. We show here that the path integral includes theSchrödinger equation. The latter is a differential equation, so we examine an in-finitesimal form of the path integral. For a small time difference t f D ti C ε, thepropagator in Eq. (10.56) is expressed as

ht f D t C ε, q f jt q ii D m

2π i„ε

�1/2exp

�i„

ε�

m2

q f � qi

ε

�2

�V�

q f C qi

2

� � (10.57)

As the propagator describes the propagation of a wave function and satisfies

ψ(t C ε, q) DZ 1

�1d y K(t C ε, qI t y )ψ(t, y ) (10.58)

we can insert Eq. (10.57) in Eq. (10.58) and obtain

ψ(t C ε, q) D m

2π i„ε

�1/2Z

d y exp�

i„

ε�

m2

q � yε

�2

�V�

q C y2

� �ψ(t, y )

(10.59)

Changing the integration variable to z D y � q, we can rewrite the expression as

ψ(t C ε, q) D m

2π i„ε

�1/2Z

dz exp�

i m2„ε

z2�

exp��

i„

εV

q Cz2

��� ψ(t, q C z)

(10.60)

As ε is a small number, the dominant contribution will come from the regionwhere ˇ m

2„εz2ˇ. 1 (10.61)

Page 304: Yorikiyo Nagashima - Archive

10.3 Feynman’s Path Integral 279

In this region the first term is of the order of 1. We can expand the second and thethird terms in ε. Note each power of z n gives a factor of εn/2. Then

ψ(t C ε, q)

D m

2π i„ε

�1/2Z

dz exp�

i m2„ε

z2��

1�i„

εV(q)C O(εz)�

�ψ(t q)C zψ0(t q)C

z2

2ψ00(t q)C O(z3)

D m

2π i„ε

�1/2Z

dz exp�

i m2„ε

z2��

ψ(t q)�

1 �i ε„

V(q)�

Czψ0(t q)Cz2

2ψ00(t q)C O(z3, εz)

�(10.62)

Each term can easily be integrated and the results areZ 1

�1dz exp

�i m2„ε

z2�D

�2π i„ε

m

�1/2

(10.63a)Z 1

�1dzz exp

�i m2„ε

z2�D 0 (10.63b)

Z 1

�1dzz2 exp

�i m2„ε

z2�D

i„εm

�2π i„ε

m

�1/2

(10.63c)

Substituting the above equation back into Eq. (10.62), we obtain

ψ(t C ε, q) D ψ(t q)Ci„ε2m

ψ00(t q)�i ε„

V(q)ψ(t q)C O(ε2) (10.64)

In the limit ε ! 0, we obtain

i„@ψ(t q)

@tD

��„2

2m@2

@q2 C V(q)�

ψ(t q) (10.65)

Therefore the path integral contains the Schrödinger equation.

10.3.3Functional Calculus

The path integral is a functional integral, which uses a type of mathematics readersmay not have encountered. Perhaps it is useful to introduce basic operations of thefunctional and to evaluate simple examples for future applications. The simplestfunctional we know is the action functional defined by

S [q] DZ t f

t i

d tL(q, Pq) (10.66)

Unlike a function, which gives a value for a particular point q [or q D (q1, � � � , qN )],the functional gives a value for an entire trajectory q D q(t) along which the in-tegration is carried out. If the trajectory connecting qi D q(ti) and q f D q(t f )

Page 305: Yorikiyo Nagashima - Archive

280 10 Path Integral: Basics

is different, the value of the functional is different. Thus the functional has thegeneric form

F [ f ] DZ

dqF( f (q)) (10.67)

Traditionally square brackets are used to denote the functional. Some people callF( f (q)) a functional, meaning a function of functions. Note that, unlike our defini-tion Eq. (10.67), which maps a continuous set of variables to one number, it mapsto another continuous set of numbers. The notion of derivatives is also used in thefunctional. We define the functional derivative by

δF [ f (q)]δ f (y )

D limε!0

F [ f (q)C εδ(q � y )]� F [ f (q)]ε

(10.68)

It follows from the above equation

δ f (q)δ f (y )

D δ(q � y ) (10.69)

Problem 10.1

Prove that if

F [ f ] DZ

d y F( f (y )) DZ

d y f f (y )gn

thenδF [ f ]δ f (q)

D n f f (q)gn�1(10.70)

Problem 10.2

Prove that

δSδq(t0)

D �dd t

�@L@ Pq

�tDt 0C

@L@q

ˇˇ

tDt 0(10.71)

where S is the action given by Eq. (10.66).

Problem 10.3

Define

Fx [ f ] DZ

d y G(x , y ) f (y ) (10.72)

Here, x is to be regarded as a parameter. Prove

δFx [ f ]δ f (z)

D G(x , z) (10.73)

Page 306: Yorikiyo Nagashima - Archive

10.3 Feynman’s Path Integral 281

We can make a Taylor expansion of the functional, too:

S [q] D S [qc C η] D S [qc]CZ

d t η(t)δS [q]δq(t)

ˇˇ

qDqc

C12!

Zd t1d t2η(t1)η(t2)

δ2S [q]δq(t1)δq(t2)

ˇˇ

qDqc

C � � �

(10.74)

The form of the Taylor expansion Eq. (10.74) can be understood most easily if wedivide the time integral into N segments (see Fig. 10.2):

S [q] DZ t f

t i

d t L(q(t), � � � ) DNX

j D1

ε j L(q j , � � � )

D F [q0, q1, q2 � � � , qN�1, qN ] , q0, qN fixed

(10.75)

In this form, the functional becomes an ordinary multivariable function:

(q0, q1, q2, . . . , q j , � � � qN )

Then the Taylor expansion of the functional can be carried out using the ordinaryTaylor expansion:

F [q C η] D F(q0C η0, q1 C η1, � � � , qN C ηN )jη0DηN D0

D F [q]C1X

nD0

NXj1D1

� � �

NXj nD1

1n!

Tn( j1, � � � , j n) η j1 � � � η j n

(10.76)

where

Tn( j1, � � � , j n) D@n F [q]

@q j1 � � � @q jn

ˇˇ

ηD0

(10.77)

Going back to continuous variables

j k ! tk , η j k ( j k D 1, � � �N )! η(tk ),X

j k

!

Zd tk , ti tk t f (10.78)

xi

xf

xj 1xj ηj

ηj+1xj+1

ηj 1

x

t

Figure 10.2 Functional differential path.

Page 307: Yorikiyo Nagashima - Archive

282 10 Path Integral: Basics

the second term in Eq. (10.76) with n D 1 becomes

Xk

@F@qk

ηk DX

k

εk

�1εk

@F@qk

�ηk D

Zd t0η(t0)

δS [q]δq(t0) (10.79)

Note, in the second equality, (1/ε)@F/@qk is well behaved because of Eq. (10.75)and gives the second term in Eq. (10.74). Other higher order terms can be derivedsimilarly.

10.4Propagators: Simple Examples

Here we calculate the propagator in some simple examples to gain familiarity withthe path integral method.

10.4.1Free-Particle Propagator

The free particle is the simplest example. In fact, the propagator can be obtainedeasily without depending on the path integral. For the sake of later applications, wederive the result in two ways.

(1) Canonical Method The definition of the transition amplitude Eq. (10.38) gives

Kfree( f, i) � K0(t f q f I ti q i) D ht f q f jt1qiiHDH0 D hq f je� i„ H0(t f �t i )jqii

(10.80)

The Hamiltonian of the free particle is given by

H0 DOp 2

2m(10.81)

It does not contain coordinate operators, therefore it can be diagonalized by simplyinserting the momentum basis in the intermediate states:

K0(t f q f I ti q i ) D θ (t f � ti )Z

d p 0

2π„d p

2π„hq f jp 0ihp 0je� i

„ H0(t f �t i )jp ihp jqii

D

Zd p

2π„

Zd p 0

2π„exp

�i„

(p 0q f � p qi)�

�θ (t f � ti )2π„δ(p � p 0) exp

��

i„

p 2

2m(t f � ti )

��

D θ (t f � ti )Z

d p2π„

exp�

i„

�p (q f � qi) �

p 2

2m(t f � ti )

D θ (t f � ti )

sm

2π i„(t f � ti )exp

�i„

m(q f � qi)2

2(t f � ti )

�(10.82)

Page 308: Yorikiyo Nagashima - Archive

10.4 Propagators: Simple Examples 283

where we have usedR

dze�Az2Dp

π/A and its analytical continuation to A DjAjei θ , (π/2 θ �π/2). It expresses the well known fact that a localized wavepacket spreads with time.

It is often more convenient to work in momentum space. The boundary condi-tion K D 0 for t � t0 < 0 can be explicitly taken into account in the momentumrepresentation. Using the integral representation for the step function

θ (t) D1

2π i

Z 1

�1dzei z t

z � i ε(10.83)

and inserting it into Eq. (10.82), we obtain

K0(t q W t0q0) DZ

d p2π„

dω2π i

1ω � i ε

� exp�

i„

�p (q � q0) �

�p 2

2m� „ω

�(t � t0)

� (10.84)

In this form, the energy no longer takes its on-shell value p 2/2m, but any valueE D p 2/2m � „ω. So we use the energy variable to express the propagator

K0(t q W t0q0) DZ

d p2π„

dE2π„

ei„ [p (q�q0)�E(t�t 0)] i„

E � (p 2/2m)C i ε(10.85)

This is the momentum representation of the free propagator. In field theory, wewill encounter its relativistic counterpart.

Note: Classically, the exponent of the free propagator can be expressed as(i/„)(p q/2)(p D m(q f � qi )/(t f � ti ) D m Pq, q D q f � qi ), but quantum me-chanically, the momentum and the position cannot be specified simultaneously.The action has to be written in terms of the initial and final coordinates and theelapsed time only, as expressed in the last equation of Eq. (10.82). Let us comparethe result with the classical relation. The solution to the action

S [q] DZ t f

t i

d tm2Pq2 (10.86)

is given by solving the Euler–Lagrange equation, which is written as

m Rq D 0! Pq D v D constant (10.87)

Therefore, the classical action gives

S [qcl] DZ t f

t i

d t12Pq2 D

m2

v2(t f � ti ) Dm2

(q f � qi)2

t f � ti(10.88)

Thus the propagator is written as

Kfree( f, i) D�

m2π i„(t f � ti )

�1/2

ei„ Scl (10.89)

This is a special example because the quantum solution is completely determinedby the classical path.

Page 309: Yorikiyo Nagashima - Archive

284 10 Path Integral: Basics

(2) Gaussian Integral The second method is a brute-force way, but provides a goodexercise for path integral calculation. Since the Hamiltonian contains the momen-tum in quadratic form, we can use Eq. (10.56) for the path integral calculation. Theaction is given by Eq. (10.86). Then the propagator is calculated to give

Kfree( f, i)

D limε!0

N!1

m2π i„ε

�N/2Z

dq1 � � � dqN�1 exp

24 i„

εNX

j D1

m2

q j � q j �1

ε

�2

35

D limε!0

N!1

m2π i„ε

�N/2Z

dq1 � � � dqN�1 exp

24 i m

2„ε

NXj D1

�q j � q j �1

�2

35(10.90)

UsingR

dze�Az2Dp

π/A with its analytical continuation to A D jAje i θ (�π/2 θ π/2), we can evaluate the integral. Changing the integration value to

y j D m

2 i„ε

�1/2q j (10.91)

we have

Kfree( f, i) D limε!0

N!1

m2π i„ε

�N/2�

2 i„εm

�(N�1)/2

Zd y1 � � � d yN�1 exp

24� NX

j D1

�y j � y j �1

�2

35 (10.92)

Let us integrate the jth integral and see what relations exist between different j’s:

I j �

Zd y j d y j C1 exp

��

�(y j � y j �1)2 C (y j C1 � y j )2

C (y j C2 � y j C1)2 �

D

Zd y j d y j C1 exp

��

�2�

y j �y j C1 C y j �1

2

�2

C12

(y j C1 � y j �1)2 C (y j C2 � y j C1)2 �

Integration over y j will take I j to the form

I j Dπ

2

�1/2Z

d y j C1 exp��

�32

�y j C1 �

2y j C2 C y j �1

3

�2

C13

(y j C2 � y j �1)2 �

2

�1/2�

2π3

�1/2

exp��

13

(y j C2 � y j �1)2�

(10.93)

Page 310: Yorikiyo Nagashima - Archive

10.4 Propagators: Simple Examples 285

From the above calculation, we see that repeating the integration j times startingfrom j D 1 gives

Z� � �

Zd y1 � � � d y j D

π2

�1/2�

2π3

�1/2

� � �

�j π

j C 1

�1/2

� exp��

1j C 1

(y j C1 � y0)2�

D

�(π) j

j C 1

�1/2

exp��

1j C 1

(y j C1 � y0)2�

(10.94)

Inserting Eq. (10.94) into Eq. (10.92), we obtain

Kfree( f, i) D limε!0

N!1

m2π i„ε

�N/2�

2 i„εm

�(N�1)/2

"�(π)N�1

N

�1/2

exp��

1N

(yN � y0)2 #

D limε!0

N!1

m2π i„εN

�1/2exp

�i m

2„N ε(qN � q0)2

D

�m

2π i„(t f � ti )

�1/2

exp�

i„

m(q f � qi )2

2(t f � ti )

�(10.95)

where we have used N ε D t f � ti , q0 D qi , qN D q f in the last equation. Note allthe potentially singular terms have disappeared from the expression. In the limitt f ! ti

Kfree( f, i) D ht f q f jti q iit f !t i����! δ(q f � qi) (10.96)

which is nothing but the orthonormality relation in the coordinate basis.

10.4.2Harmonic Oscillator

Next we consider the one-dimensional harmonic oscillator. This is an example thatcan also be solved exactly. In fact, the method and conclusion are valid for any ac-tion that contains variables up to quadratic terms, and thus have very general uses.In the following, we consider a harmonic oscillator interacting with an externalsource described by the Lagrangian

L D12

m Pq2 �12

mω2q2 C J q (10.97)

The time-dependent external source J(t) can be thought of as an electric field if theoscillator carries an electric charge. The well-known result for the free harmonicoscillator can be obtained from this system in the limit J(t) ! 0. Furthermore,

Page 311: Yorikiyo Nagashima - Archive

286 10 Path Integral: Basics

if the source is time independent, the problem can also be solved exactly. In thiscase the system is nothing but an oscillator with its equilibrium position displaced,because the Lagrangian can be rewritten to give

L D12

m Py 2 �12

mω2 y 2 CJ2

2mω2, y D q �

Jmω2

(10.98)

This is the case, for instance, of a suspended spring under the effect of gravity.The Euler–Lagrange equation for the action Eq. (10.97) is determined by the re-

quirement

δS [q]δq(t)

ˇˇ

qDqcl

D 0 (10.99)

which gives the equation of motion to determine the classical path qcl(t).

m Rqcl C mω2qcl D J (10.100)

In quantum mechanics, q(t) can take any path, but it is convenient to expand itaround the classical path. Define the quantum fluctuation path η(t) by

q(t) D qcl(t)C η(t) (10.101)

Because of its quadratic nature, the Taylor expansion of the action terminates at thesecond power of η(t)

S [q] D S [qcl C η]

D S [qcl]CZ

d t η(t)δS [q]δq(t)

ˇˇ

qDqcl

C12!

Zd t1d t2η(t1)η(t2)

δ2S [q]δq(t1)δq(t2)

ˇˇ

qDqcl

(10.102)

The second term in Eq. (10.102) vanishes because the classical path is where theaction is at its extremum. Then

S [q] D S [qcl]C12!

Zd t1d t2 η(t1)η(t2)

δ2S [q]δq(t1)δq(t2)

ˇˇ

qDqcl

(10.103)

Calculating the functional derivatives, this becomes

S [q] D S [qcl C η] D S [qcl]Cm2

Z t f

t i

d t�Pη(t)2 � ω2 η2 (10.104)

The variable η(t) represents the quantum fluctuations around the classical path.As the end points of the trajectories are fixed, they satisfy the boundary condition

η(ti) D η(t f ) D 0 (10.105)

Page 312: Yorikiyo Nagashima - Archive

10.4 Propagators: Simple Examples 287

As the harmonic oscillator is an important example, we calculate the propagatoragain in two ways.

(1) Matrix MethodSumming over all the paths is equivalent to summing over all possible fluctuationssubject to constraints Eqs. (10.105). The propagator is expressed as

Kosci( f, i) � ht f q f jti q iijharmonic oscillator

D CZ

Dq ei„ S [q] D C e

i„ S [qcl ]

ZDη

� exp�

i„

�m2

Zd t f Pη(t)2 � ω2η2g

�(10.106a)

D limε!0

N!1

m2π i„ε

�N/2e

i„ S [qcl ]

Z 1

1dη1 � � � dηN�1

� exp

24 i m

2„ε

NXj D1

( η j � η j �1

ε

�2� ω2

�η j C η j �1

2

�2)35

D limε!0

N!1

m2π i„ε

�N/2�

2 i„εm

�(N�1)/2

ei„ S [qcl ]

Zd y1 � � � d yN�1

� exp

24� NX

j D1

(�y j � y j �1

�2� ε2ω2

�y j C y j �1

2

�2)35 (10.106b)

We have the boundary condition y0 D yN D 0 because η(ti) D η(t f ) D 0. Theexponent is quadratic in y and can be rewritten in matrix form:

NXj D1

(�y j � y j �1

�2� ε2ω2

�y j C y j �1

2

�2)D

N�1Xj,kD1

y j M j k yk D yTM y

(10.107)

where

y D

266666664

y1...

y j...

yN�1

377777775

, M D

26664

a b 0 � � �

b a b � � �

0 b a � � �...

......

. . .

37775 (10.108a)

a D 2�

1 �ε2ω2

4

�, b D �

�1C

ε2ω2

4

�(10.108b)

Page 313: Yorikiyo Nagashima - Archive

288 10 Path Integral: Basics

M is a symmetric or more generally hermitian matrix and therefore can be diago-nalized by a unitary matrix U. Then

yTM y D yTU† M D U y D z† MD z DN�1Xj D1

λ j z2j (10.109a)

where

z D U y , MD D

26664

λ1 0 � � �

0 λ2 � � �...

...λN�1

37775 (10.109b)

The Jacobian for the change of the variables from y j to z j is unity, because thetransformation is unitary. The integration over all z j gives"

π(N�1)QN�1JD1 λ j

#1/2

D

�π(N�1)

det M

�1/2

(10.110)

Substituting Eq. (10.110) back into Eq. (10.106) gives

Kosci( f, i) D limε!0

N!1

m2π i„ε

�N/2�

2 i„εm

�(N�1)/2

ei„ S [qcl ]

Zd y1 � � � d yN�1 exp[�y† M y ]

D limε!0

N!1N εDt f �t i

m2π i„ε det M

�1/2e

i„ S [qcl ]

(10.111)

The action has an interaction term, but the result is still governed by the classicalpath. This is because the Lagrangian contains only up to quadratic terms. In thatcase the path integral over the quantum fluctuation only changes the normalizationfactor, which appears as

p1/ε det M . It is clear that the path integral can be defined

only if the matrix M does not have any vanishing eigenvalues.It remains to evaluate the determinant of the matrix. Denoting the determinant

of a j � j submatrix as D j , and inspecting Eq. (10.108a), we see it satisfies thefollowing recursion relations:

D j C1 D aD j � b2D j �1 (10.112a)

D�1 D 0 , D0 D 1 , D1 D a (10.112b)

We note that we can reformulate the recursion relations also in a simple matrixform as�

D j C1

D j

�D

�a �b2

1 0

� �D j

D j �1

�(10.113)

Page 314: Yorikiyo Nagashima - Archive

10.4 Propagators: Simple Examples 289

Successive insertion gives

�DN�1

DN�2

�D

�a �b2

1 0

�� � �„ ƒ‚ …

(N�2) factors

�D1

D0

�D

�a �b2

1 0

�N�2 �a1

�(10.114)

We can determine eigenvalues of the basic 2� 2 matrix in Eq. (10.114) easily. From

detˇˇa � λ �b2

1 �λ

ˇˇ D 0 (10.115)

we obtain

λ˙ D12

(a ˙p

a2 � 4b2) (10.116a)

a D λC C λ� (10.116b)

A similarity matrix to diagonalize the basic matrix is also easily obtained:

S D�

cλC dλ�c d

�, S�1 D

1λC � λ�

�1/c �λ�/c�1/d λC/d

�(10.117)

Then �a �b2

1 0

�D S

�λC 00 λ�

�S�1 (10.118)

Inserting this in Eq. (10.114), we obtain�DN�1

DN�2

�D S

�λN�2

C 00 λN�2�

�S�1

�a1

�(10.119)

Inserting Eq. (10.117) and Eq. (10.116) in Eq. (10.119)

DN�1 DλN

C � λN�λC � λ�

(10.120)

This formula is consistent with Eqs. (10.112). Using Eq. (10.116a) and Eq. (10.108b)

λC � λ� D (a2 � 4b2)1/2 D

"4�

1�ε2ω2

4

�2

� 4�

1Cε2ω2

4

�2#1/2

D (�4ε2ω2)1/2 D 2 i εω (10.121a)

λNC D

"a Cp

a2 � 4b2

2

#N

D

242

1� ε2ω2

4

�C 2 i εω

2

35

N

D�1C i εω C O(ε2)

N

(10.121b)

Page 315: Yorikiyo Nagashima - Archive

290 10 Path Integral: Basics

λN� D

"a �p

a2 � 4b2

2

#N

D

242

1� ε2ω2

4

�� 2 i εω

2

35

N

D�1� i εω C O(ε2)

N

(10.121c)

Substituting Eqs. (10.121) in Eq. (10.120), we obtain

limε!0

N!1εDN�1 D lim

ε!0N!1

ελN

C � λN�λC � λ�

D limε!0

N!1

ε2 i εω

�(1C i εω)N � (1 � i εω)N

D limε!0

N!1

ε2 i εω

"�1C

i ωTN

�N

�1�

i ωTN

�N#

De i ωT � e�i ωT

2 i ωD

sin ωTω

(10.122)

We obtain the propagator for the harmonic oscillator as

Kosci( f, i) D limε!0

N!1N εDt f �t i

m2π i„ε det M

�1/2e

i„ S [qcl ]

D mω

2π i„ sin ωT

�1/2e

i„ S [qcl ]

(10.123)

Note all the potentially singular terms have disappeared and the end result is finite.

Problem 10.4

Show that for L D 12 m Pq2 � 1

2 mω2q2 with boundary condition q(ti ) D qi , q(t f ) Dq f

ei„ S [qcl ] D exp

�i mω

2„ sin ωT

n(q2

f C q2i ) cos ωT � 2q f q i

o�(10.124a)

Kosci( f, i) D mω

2π i„ sin ωT

�1/2

� exp�

i mω2„ sin ωT

n(q2

f C q2i ) cos ωT � 2q f q i

o� (10.124b)

(2) Operator MethodIt is a well-known fact that a linear operator5) has matrix representations and thusthe matrix method developed in the previous section can easily be extended to the

5) If an operator satisfies the relation

OO(ajψ1i C bjψi) D a OOjψ1i C b OOjψ2i (10.125)

it is said to be linear.

Page 316: Yorikiyo Nagashima - Archive

10.4 Propagators: Simple Examples 291

case of operators. Here, we calculate the determinant of the hermitian operatorexplicitly, which has more applications later in field theory. The important part ofthe path integral for the harmonic oscillator has the form

I DZ

Dη exp�

i„

Z t f

t i

d t( Pη2 � ω2η2)�

(10.126)

By partial integration, it can be rewritten as

I DZ

Dη exp�

i„

Z t f

t i

d tη(t)��

d2

d t2 � ω2

η(t)�

(10.127)

To handle the operator as a matrix we introduce two mathematical tricks.1) First we have to make sure the integral is well behaved i.e. convergent. This is

made possible when the function has a certain property so that we could analyticallycontinue into the complex plane by rotating the time axis t ! t e�i θ . This is calledWick rotation. The technique is widely used and when we set the rotation angleθ D π/2 or t ! �i τ, and correspondingly d2/d t2 ! �d2/dτ2, we have moved tothe Euclidean space from the Minkowski space. Na mely,

�SE(x , τ) D i SM(x ,�i τ) (10.128)

The sense of the rotation is completely fixed by the singularity structure of thetheory. We shall discuss it further later at Sect. 10.6 when we define the generatingfunctional. Then we have the Euclidean action

SE D

Z τ f

τ i

dτη(τ)��

d2

dτ2 C ω2�

η(τ) �Z τ f

τ i

dτ η(τ)Aη(τ) (10.129)

and the path integral

I DZ

Dq e� 1„ SE (10.130)

becomes a well-behaved integral as long as SE is positive definite. After performingthe calculation, we can go back to the original Minkowski space by setting τ backto τ D i t.

2) To show the matrix nature of the operator A, we start as usual from the discretetime interval.

If we divide the Euclidean time interval into N segments and put

ε D ε0 Dτ f � τ i

N, η j D η( j ε) , ηk D η(kε0)

d2

dτ2 η(τ) D1ε2 (η j C1 � 2η j C η j �1)

(10.131)

The time integral is expressed as the continuous limit of the discrete sum

SE D �

N�1Xj

εη j

�1ε2

(η j C1 � 2η j C η j �1) � ω2η j

DX

j,k

εε0η j A j k ηk DX

j,k

εε0ηTAη(10.132)

Page 317: Yorikiyo Nagashima - Archive

292 10 Path Integral: Basics

where

A j k D �1ε0

�δ j C1,k � 2δ j,k C δ j �1,k

ε2

�C ω2 δ j,k

ε0 (10.133)

In the last line of Eq. (10.132) we used bold letters for η to emphasize its vectorproperty η D (η1, � � � , ηN ). In the continuous limit, this gives

SE �

ZdτηT(τ)A(τ)η(τ) D

Zdτdτ0 ηT(τ)A(τ, τ0)η(τ0)

A(τ, τ0) D��

d2

dτ2 C ω2�

δ(τ � τ0)(10.134)

where we have used the relation

limε0!0

δ j,k

ε0 D δ(τ � τ0) (10.135)

In this form the operator is clearly seen to be a matrix whose elements are speci-fied by two continuous variables. Referring to Eq. (10.133), A(τ, τ0) is a symmetricoperator and in the proper basis satisfies the eigenequation

Aφ j D λ j φ j (10.136)

As the eigenfunction constitutes a complete basis, the variable η can be expandedin terms of φ j

η(τ) DX

j

a j φ j (τ) (10.137)

The boundary condition η(τ i) D η(τ f ) D 0 translates to φ j (τ i) D φ j (τ f ) D 0.The normalized eigenfunction is easily obtained as

φ j (τ) D

s2

(τ f � τ i )sin

j π(τ f � τ i )

(τ � τ i) j D 1, � � � , N � 1 (10.138a)

λ j D

�j π

τ f � τ i

�2

C ω2 (10.138b)

We note that the eigenvalues are positive definite and ensure the convergence ofthe Euclidean integral. Note, also, that if we assume the time integral is discretizedinto N intervals as before, there are N � 1 intermediate time points and there canbe just as many (N � 1) independent coefficients a j in the Fourier expansion inEq. (10.137). Therefore it can be considered as an orthogonal transformation todiagonalize the matrix A. Equation (10.137) is still valid if N !1 and remains anorthogonal transformation that keeps the value of the Jacobian at unity. When we

Page 318: Yorikiyo Nagashima - Archive

10.4 Propagators: Simple Examples 293

change the path integral variable from η to a j , we obtain

Zdτ(ηTAη) D

Zdτdτ0

1Xj,kD1

a j φ j (τ)A(τ, τ0)ak φk (τ0)

D

1Xj D1

λ j a2j

)Z

Dηe� 1„R

d τ ηT Aη D

1Yj D1

Zda j e� 1

„ λ j a2j D

1Yj D1

�π„λ j

�1/2

D C (det A)�1/2 (10.139)

The determinant of the hermitian operator is expressed formally as

det��

d2

dτ2C ω2

�D det A D

1Yj D1

λ j

D

1Yj D1

�j π

τ f � τ i

�2 1Yj D1

"1C

�ω(τ f � τ i )

j π

2#

D C 0 sinh ω(τ f � τ i )ω(τ f � τ i)

6)

C 0 D1Yj D1

�j π

τ f � τ i

�2

(10.140)

C, C 0 are divergent constants that do not contain dynamical variables and can beabsorbed in the normalization constant. Analytically continuing this back to realtime, we obtain

det��

d2

dτ2C ω2

�τDi t���! det

�d2

d t2C ω2

�D N 0 2 sin ω(t f � ti )

ω (t f � ti )(10.142)

where N 0 is a normalization constant. It may be divergent, but as long as it is inde-pendent of dynamical variables, the actual value is irrelevant for physical outcomes,as will be clarified in later discussions. As a result we obtain

I DZ

Dη exp�

i„

Z t f

t i

d t�

d2

d t2 � ω2�

η�D N

�sin ω(t f � ti )

ω(t f � ti )

��1/2

(10.143)

6) We used the formulae

sin x D x1Y

nD1

�1�

x2

n2π2

�, cos x D

1YnD1

�1�

4x2

(2n � 1)2 π2

�(10.141)

Page 319: Yorikiyo Nagashima - Archive

294 10 Path Integral: Basics

The constant N can be determined by noting that, in the limit ω ! 0, the propa-gator should agree with that of the free particle. The final result is

Kosci( f, i) D�

mω2π i„ sin ω(t f � ti )

�1/2

ei„ S [qcl ] (10.144)

which agrees with what we obtained using the matrix method. Our result was ob-tained for the harmonic oscillator, but the result can be extended to general quadrat-ic actions of the form

I DZ

Dη exp�

i„

Z t f

t i

d t ηT(t)Aη(t)�D

Np

det A(10.145)

So far, we have used only real variables. The path integral can be generalized tocomplex variables. We note

2πaD

Zdx e�ax2/2

Zd y e�a y 2/2 D

Zdx d y e�a(x2Cy 2)/2 (10.146)

Introducing a complex variable z D (x C i y )/p

2, we write

2πaD

1i

Zdzdz�e�az�z (10.147)

Note, z and z� are independent variables. Generalizing to n complex variables andwriting z D (z1, z2, . . .),Z

d n zd n z�e�z† Az D

Zd n zd n z�e�z�

l A j k zk D(2π i)n

det A(10.148)

In this case, it goes without saying that the N-dimensional symmetric matrixshould be replaced by the hermitian matrix. Generalization of the path integralformula Eq. (10.145) to complex variables is now straightforward. Introducing thecomplex variable '(t) D (η C i � )/

p2

I DZ

D'�D' exp�

i„

Z t f

t i

d t '†(t)A'(t)�D

N 0

det A(10.149)

where A is a hermitian operator.

10.5Scattering Matrix

In this section, we try to calculate the scattering amplitude when we have a scatter-ing potential in the Hamiltonian. In general, the propagator is not exactly calcula-ble and we rely on the perturbation expansion. This is valid when the potential or,more precisely, when the time integral of V(t q) is small compared with „.

Page 320: Yorikiyo Nagashima - Archive

10.5 Scattering Matrix 295

10.5.1Perturbation Expansion

The propagator is given by Eq. (10.56)

K(t f q f I ti q i) D CZ

Dq exp�

i„

Zd t�

12

m Pq2 � V(q) �

D limε!0

N!1

m2π i„ε

�N/2Z� � �

Zdq1 � � � dqN�1 exp

24 i„

εNX

j D1

˚T j � Vj

�35(10.150)

where T j and Vj are the kinetic and potential part of the Lagrangian at t D t j . Asthe action in the path integral is a classical action , the Lagrangian can be separatedinto the kinetic and potential terms. We then expand the potential part

exp��

i„

Z tk

tk�1

d t V(t q)�D 1�

i„

εV(tk , qk )C12!

��

i„

εV(tk , qk )�2

� � � (10.151)

When Eq. (10.151) is substituted back into Eq. (10.150), we obtain a series expan-sion

K D K0 C K1 C K2 C � � � (10.152)

The first term does not contain V and is the free propagator:

K0 D CZ

Dq exp�

12

m Pq2�

Eq. (10.82)D θ (t f � ti )

�m

2π i„(t f � ti )

�1/2

exp�

i m(q f � qi)2

2„(t f � ti )

� (10.153)

K1 includes only one V(tk , qk ) and is expressed as

K1 D�i„

limε!0

N!1

N�1XkD1

εZ

dqk

24C N�k

ZdqkC1 � � � dqN�1 exp

8<: i m

2„ε

N�1Xj Dk

(q j � q j �1)2

9=;35

� V(tk , qk )

24C k

Zdq1 � � � dqk�1 exp

8<: i m

2„ε

k�1Xj D1

(q j � q j �1)2

9=;35

D �i„

Z t f

t i

d tZ 1

�1dq K0(t f q f I t q)V(t q)K0(t qI ti q i) (10.154)

where C Dp

m/2π i„ε. As K(t f q f I t q) vanishes if t > t f , and K(t qI ti q i) if t < ti ,the integration over time can be extended to all values of t to give

K1 D �i„

Z 1

�1d tZ 1

�1dqK0(t f q f I t q)V(t q)K0(t qI ti q i) (10.155)

Page 321: Yorikiyo Nagashima - Archive

296 10 Path Integral: Basics

Similarly, we can prove that the second-order term is expressed as

K2 D

��

i„

�2 Z 1

�1d t1d t2

Z 1

�1dq1dq2

��K0(t f q f I t2 q2)V(t2q2)K0(t2q2I t1 q1)V(t1q1)K0(t1q1I ti q i)

(10.156)

Analogous expressions hold for all K j in the expansion Eq. (10.152). Note, thereis no 1/2! in Eq. (10.156). The reason for this is that the two interactions occur atdifferent times, but they are integrated over all space-time and they are indistin-guishable. The effect is merely to compensate 1/2!. Similarly, there are j ! differentcombinations that give identical contributions, and no 1/ j ! appears in the expres-sion for K j . Equation (10.152) is now expressed as

K(t f q f I ti q i)

D K0(t f q f I ti q i) �i„

Z 1

�1d tZ 1

�1dqK0(t f q f I t q)V(t q)K0(t qI ti q i)C � � �

D K0(t f q f I ti q i) �i„

Z 1

�1d tZ 1

�1dqK0(t f q f I t q)V(t q)K(t qI ti q i)

(10.157)

The first equation is the perturbation expansion of the propagator, called the Bornseries. The second equation is obtained by replacing K0 in the second term byK, which is equivalent to including all the successive terms. It is exact and is anintegral equation for the propagator.

We have elaborated on using the path integral method to derive the perturbationseries. In fact, it can be derived directly from Eq. (10.35). Separating the potentialfrom the Hamiltonian H D H0 C V it can be rewritten as�

i„@

@t1� H0

�K(t1q1I t2q2) D i„δ(t1� t2)δ(q1� q2)CV(t1q1)K(t1q1I t2q2)

(10.158)

Since K0 satisfies the same equation with V D 0, the equation can be transformedto a integral equation that reproduces Eq. (10.157).

Using the propagator, we can now express the time development of a wave func-tion. Inserting Eq. (10.157) in Eq. (10.40),

ψ(t f q f ) DZ

dq i K(t f q f I ti q i )ψ(ti q i )

D

Zdq i K0(t f q f I ti q i )ψ(ti q i )

�i„

Zd tdq

Zdq i K0(t f q f I tq)V(t q)K(t qI ti q i )ψ(ti q i )

D

Zdq i K0(t f q f I ti q i )ψ(ti q i )

�i„

Zd tdq K0(t f q f I tq)V(t q)ψ(tq) (10.159)

Page 322: Yorikiyo Nagashima - Archive

10.5 Scattering Matrix 297

Note, the time sequence is such that t f > t > ti , but the integral can be madeover the entire time region from �1 toC1, because the existence of the retardedpropagator guarantees the vanishing of the integrand for t > t f or t < ti . Herewe have changed from one space dimension to three. This is an integral equationfor the wave function. Now we assume that for t ! �1, ψ becomes a free planewave function, which we denote as φ. Then, the first term of the rhs of Eq. (10.159)is also a plane wave, because the free propagation of a plane wave is also a planewave:

ψ(t f q f ) D φ(t f q f ) �i„

Zd tdq K0(t f q f I tq)V(t q)ψ(tq) (10.160)

As ψ(t q) obeys the Schrödinger equation, namely�i

@

@t� H0

�ψ(tq) D V(t q)ψ(tq) (10.161)

where H0 is the free Hamiltonian and φ(t q) is the solution to the free equation(V D 0), the propagator must obey the equation�

i@

@t� H0

�K0(tqI t0q0) D i„δ(t � t0)δ3(q � q0) (10.162)

This is the equation for the Green’s function of Eq. (10.161), which reproducesEq. (10.35).

Problem 10.5

Using the free-particle propagator Eq. (10.82) or Eq. (10.95), prove thatZdqi K0(t f q f I ti q i)e

i„ (p qi�i E ti ) D e

i„ (p q f �i E t f ) , E D

p 2

2m(10.163)

Extension to three dimensions is straightforward.

10.5.2S-Matrix in the Path Integral7)

Now we want to apply our results to more practical situations, namely how to re-late the wave function to experimental measurements. In a typical scattering ex-periment a bunch of particles is provided with a definite initial condition, say, fixedmomentum p i , they illuminate a target and the probability that they are scatteredto a specific final state, at momentum p f , is measured. The incident and scat-tered particles are considered free in the distant past t ! �1 or in the far futuret ! C1. We represent the initial- and final-state wave functions of particles with

7) Compare the discussion here with that in Sects. 6.1–6.3.

Page 323: Yorikiyo Nagashima - Archive

298 10 Path Integral: Basics

momentum p i and p f by

ψin(tqI p i ) D N ei„ (p i �q�Ei t ) D htqjp ii (10.164a)

ψout(tqI p f ) D N ei„ (p f �q�E f t ) D htqjp f i (10.164b)

One problem with the expressions is that the plane wave spreads over all spaceand time, including the scattering center V(t q), so the particles can never be free.We could use a wave packet formalism to circumvent the problem but usually thisinvolves complicated calculation. The conventional way around the problem is theadiabatic treatment, in which the potential is switched on and off slowly as a func-tion of time. Namely, we replace V by Ve�εjtj so that V D 0 at t D ˙1.8) Assum-ing that is the case, and with the initial conditions ψ D ψin, V ! 0 for jtj ! 1,the wave function at time t D t f is expressed as

ψC(t f q f ) DZ

dq i K0(t f q f I ti q i )ψin(ti q i )

�i„

Zd tdqK0(t f q f I tq)V(t q)ψC(tq)

(10.165)

This is the same equation as Eq. (10.159) except ψ(ti , q f ) in the first termis replaced by ψin and the implicit assumption that jV j ! 0 at t ! �1. Thesuperscript on ψC(tq) denotes that it corresponds to a wave that was free att ! �1, which is the result of the retarded nature of the propagator such thatK0(tqI t0q0) D 0 for t < t 0. To eliminate ψC on the rhs, we make successiveiterations, using the first term as the first input to ψC in the second term. Theresult is

ψC(t f q f ) DZ

dq i K0(t f q f I ti q i )ψin(ti q i )

�i„

Zd tdqdq i K0(t f q f I tq)V(t q)K0(tqI ti q i )ψin(ti q i )C � � �

(10.166)

8) This is equivalent to introducing a small negative imaginary part for the energy, i.e. E ! E � iε,which in turn is Feynman’s recipe for handling the singular point in the propagator.

�t

�qti , qi

�������

t, q�����t f , q f

ti , qi

��������t1, q1����t2, q2 �������

t f , q f

Figure 10.3 First- and second-order scattering diagram. Time runs from bottom to top.

Page 324: Yorikiyo Nagashima - Archive

10.5 Scattering Matrix 299

We are interested in the transition amplitude from the state jψini at t D �1 tojψouti at t D 1, which is called the scattering amplitude. It is defined as

S f i D hψoutjψCit f !1 DZ

dq f hψoutjt f q f iht f q f jψCi

D

Zdq f ψ�

out(t f q f )ψC(t f q f )

D

Zdq f dq i ψ�

out(t f q f )K0(t f q f I ti q i )ψin(ti q i )

�i„

Zd tdq f dqdq i ψ�

out(t f q f )K0(t f q f I tq)

� V(tq)K0(tqI ti q i )ψin(ti q i)

C

��

i„

�2 Zd t1d t2 dq f dq2dq1dq i

� ψ�out(t f q f )K0(t f q f I t2 q2)V(t2, q2)K0(t2q2I t1 q1)

� V(t1q1)K0(t1q1I ti q i )ψin(ti q i )C � � �

(10.167)

where � � � are higher order terms. The formula can be depicted as in Fig. 10.3.

1. Attach ψin(ti , qi) at point (ti , qi).2. Draw a line from (ti , qi ) to (t, q). Attach K0(ti q i I t, q) to the line.3. Attach the vertex

��� i

„�

V(t, q)

to the point (t, q).4. Draw a line from (t, q) to (t f , q f ). Attach K0(t qI t f , q f ) to the line.5. Attach ψout(t f , q f ) at point (t f , q f ).6. Integrate over time t and over all the intermediate positions q i , q, q f .

This completes the first-order scattering process. The corresponding picture isdrawn in Fig. 10.3 (left). The second-order scattering process can be drawn in asimilar manner. The ordering of the factors is such that in writing each factor onegoes from the bottom upwards. In the nonrelativistic treatment, one does not godownwards (backwards in time) in Fig. 10.3. The correspondence is called the Feyn-man rules.9) In nonrelativistic quantum mechanics, which we are dealing with atpresent, the pictorial representations shown in Fig. 10.3 are hardly necessary, butin quantum field theory they are a great aid to calculation.

As Problem 10.5 showsZdqi K0(t f q f I ti q i)ψin(ti q i ) D ψin(t f q f )Z

dqi ψ�out(t f q f )K0(t f q f I ti q i ) D ψ�

out(ti q i )(10.168)

9) In field theory, particles with negative energy can be formally interpreted as going backwards intime. The relativistic Feynman propagator includes both retarded and advanced propagation. Onecan go upwards or downwards freely in time and need not worry about the time sequence. TheFeynman propagator takes care of it automatically (see discussion in Sect. 6.6.2).

Page 325: Yorikiyo Nagashima - Archive

300 10 Path Integral: Basics

Using Eq. (10.168), we can rewrite Eq. (10.167). The first term gives simply δ3(p f �

p i ), which means no scattering. The interesting part is included in the second andfollowing terms. Here, we write only up to second terms, which is the first-orderterm in the power expansion of the potential V:

S f i D δ3(p f � p i ) �i„

Zd tdq ψ�(tq)V(t q)ψin(tq)

D δ3(p f � p i ) �i„

N 2Z

d tdq e� i„ (p f �q�i E f t )V(t q)e

i„ (p i �q�i Ei t )

(10.169)

where N is the normalization factor. For the time independent Coulomb potential,the time integration merely gives a δ function. Thus, the second term gives

S f i � δ f i D �2π i δ(E f � Ei)N 2Z

ei„ (p i �p f )�qV(q)dq (10.170)

This is the well-known Born approximation in quantum mechanics.

Problem 10.6

Prove Eqs. (10.169).

10.6Generating Functional

10.6.1Correlation Functions

So far we have considered the transition amplitude in the form of a path integral:

ht f q f jti q ii D NZ

Dqei„ S [q] (10.171)

Later on, we shall need matrix elements of operator products such as

D12 D ht f q f j Oq(t1) Oq(t2)jti q ii

D

ZDq q1q2 exp

�iZ t f

t i

d tL(q)�

t f � t1 � t2 � ti(10.172)

where the boundary conditions on the path integral are

q f D q(t f ) , qi D q(ti) , q1 D q(t1) , q2 D q(t2) (10.173)

We can show that it is related to the two-point vacuum correlation function

hΩ jT [ Oq(t1) Oq(t2)]jΩ i (10.174)

Page 326: Yorikiyo Nagashima - Archive

10.6 Generating Functional 301

where jΩ i is the vacuum state in the Heisenberg picture. We add a hat, e.g. Oq, whenwe want to distinguish operators from their representations q, which are ordinarynumbers. To evaluate Eq. (10.172) we break up the functional integral as follows:Z

Dq DZ

dq1

Zdq2

Zq(t1)Dq1q(t2)Dq2

Dq t f � t1 � t2 � ti (10.175)

The main functional integralR Dq is now constrained at times t1 and t2 (in addi-

tion to the endpoints t f and ti ). There q1 D q(t1) and q(t2) are held constant. Theirintegration has to be done separately and is placed in front of the main function-al integral. Then the numbers q1, q2 can be taken outside the main integral. Themain integral is now divided into three transition amplitudes:

D12 D

Zdq1

Zdq2q1q2

Zq(t1)Dq1q(t2)Dq2

Dq exp�

iZ t f

t i

d tL�

D

Zdq1

Zdq2 q1q2ht f q f jt1q1iht1 q1jt2q2iht2q2jti q ii

D

Zdq1

Zdq2 q1q2hq f je�i H(t f �t1)jq1ihq1je�i H(t1�t2)jq2i

� hq2je�i H(t2�t i )jqii

(10.176)

We can turn the c-number q1 into a Schrödinger operator using qjq1i D OqSjq1i

[see Eq. (10.36b)]. The completeness relationR

dq1jq1ihq1j D 1 allows us to elimi-nate the intermediate state jq1ihq1j. The q2 integration can be eliminated similarly,yielding the expression

D12 D hq f je�i H(t f �t1) OqSe�i H(t1�t2) OqSe�i H(t2�t i )jqii (10.177)

The Schrödinger operator can be converted into the Heisenberg operator using therelation (10.36a). For the case t2 > t1, q1 enters the path integral earlier than q2.Therefore the ordering of the operators Oq(t1) Oq(t2) in D12 [Eq. (10.172)] should bereversed. Using a time-ordered product we can combine the two cases into one togive

D12 D hq f je�i H t f T [ OqH(t1) OqH(t2)]e i H ti jtii D hq f t f jT [ OqH(t1) OqH(t2)]jti q ii

(10.178)

We see that the path integral has a built-in structure to give the time-ordered corre-lation function as its moments. A similar formula holds for arbitrary functions ofoperators:

ht f q f jT [O1( OqH(t1)) � � �ON ( OqH(tN ))]jti q ii

D NZ

DqO1(q(t1)) � � �ON (q(tN ))ei„ S [q]

(10.179)

The salient feature of the transition amplitude is that we can extract the vacuumstate by sending ti , t f to ˙1 in a slightly imaginary direction (t ! t0 D t e�i δ).

Page 327: Yorikiyo Nagashima - Archive

302 10 Path Integral: Basics

This will extract the vacuum state jΩ i from jq f i and jqii, because we can get ridof all the n ¤ 0 states by making t f , ti large with a small negative imaginary partt f , ti !˙1(1 � i δ):

limt i!�1(1�i δ)

e i H ti jqii

D limjt i j!1(1�i δ)

"e�i E0jt i jjΩ ihΩ jqii C

Xn¤0

e�i Enjt i jjnihnjqii

#

) limjt i j!1

jΩ ihΩ jqiie�i E0�jt i j(1�i δ)

�(10.180a)

where E0 D hΩ jH jΩ i. Any state jni for n ¤ 0 has a larger damping factor thanthe n D 0 state and hence the ground state is the only surviving state after jti j ismade large. Similarly, it is shown that

hq f je�i H t f D limt f !1(1�i δ)

(hq f jΩ ie�i E0�t f (1�i δ))hΩ j (10.180b)

Using Eq. (10.180b) and (10.180a), the time-ordered correlation function is ex-pressed as the vacuum expectation value:

limt i!�1t f !1

ht f q f jT [ Oq(t1) Oq(q2)]jti q ii

D (hq f jΩ ie�i E0�1(1�i δ))hΩ jT [ Oq(t1) Oq(q2)]jΩ i(hΩ jqiie�i E0�1(1�i δ))

(10.181)

) hΩ jT [ Oq(t1) Oq(q2)]jΩ i D limt i!�1t f !1

ht f q f jT [ Oq(t1) Oq(q2)]jti q ii

hq f jΩ ihΩ jqiie�i E0(t f �t i )(1�i δ)

(10.182)

The rhs is independent of the endpoints and therefore the lhs must also be in-dependent of the endpoints. The awkward extra phase and overlap factors can beeliminated by dividing by the same expression as D12 but without q1 and q2. Ourfinal expression for the n-point correlation function is

hΩ jT [ OqH(t1) � � � OqH(tn)]jΩ i D limt f ,t i !˙1(1�i δ)

R Dq q1 � � � qn exph

iR t f

t id tL

iR Dq exp

hiR t f

t id tL

i(10.183)

10.6.2Note on Imaginary Time

Tilting the time axis is a mathematical trick called the Wick rotation (Fig. 10.4a).The direction of the rotation is mandated by the requirement that there are no

Page 328: Yorikiyo Nagashima - Archive

10.6 Generating Functional 303

singularities of the integrand encountered in the rotation of the time axis. This isthe case for

t ! t e�i δ 0 δ < π (10.184)

For larger angles, the rhs of Eq. (10.180) develops essential singularities for ti !

�1 and t f ! 1. This rotation corresponds to an analytic continuation in thevariable t. The real four-dimensional Minkowski space can be viewed as a subspaceof a four-dimensional complex space (or of a five-dimensional space with threereal space coordinates and one complex time coordinate). For δ D π/2, this corre-sponds to t D �i τ (τ real) and the space becomes Euclidean. Physical results areobtained by evaluating the relevant expressions in Euclidean time because their os-cillatory behavior is converted to an adequate damping factor and is well behaved.Then after the calculation, one can rotate the result (τ D i t) back to the real timeaxis (Minkowski time t) to obtain physically meaningful results. The method workswell only if no singularities are encountered on the way.

We note that the end result of the Wick rotation and rotation back to Minkowskitime is equivalent to addition of a small negative imaginary part in the energy. Thisproduces the proper damping of the oscillating exponentials to make the integralconverge. We know from our study of the Green’s function in the complex energyplane (Fig. 10.4b), that the singularities have to occur at ˙ω � i ε to incorporatethe causal boundary conditions (see Sect. 6.6.2). Therefore, an analytic continuationfrom Re(k0) to Im(k0) is meaningful only if the rotation is counterclockwise. Thetwo rotations are equivalent since we can represent k0 ! i„@/@t.

Furthermore, the Wick rotation also rationalizes the treatment of the scatteringmatrix in terms of plane waves, which are not local. The interaction is made tobe turned on and off slowly without any energy transfer (i.e. adiabatic approxima-tion) so that the notion of an asymptotically free field at large time is justified (seearguments in Sect. 10.5.2). All the reasonings stem from the constraint that thetransition amplitude has to be finite at large time. In the following, we shall notalways indicate it explicitly, but it should be understood as being done.

Im t

Re t

δ+T

-T

Im k0

Re k0X

X

ω iε

ω + iε

(a) (b)

Figure 10.4 (a) Rotation of the time axis to ensure the convergence of the path integral. (b)Rotation in the complex energy plane to incorporate the causal boundary condition.

Page 329: Yorikiyo Nagashima - Archive

304 10 Path Integral: Basics

10.6.3Correlation Functions as Functional Derivatives

Next, we show that we can generate various correlation functions by simply addingan appropriate external source. For instance, a charged particle generates an elec-tromagnetic field, and its interaction is given by A μ J μ . The current J μ plays therole of the source to generate the interactions. We define a modified action of theform

S [q, J ] D S [q]CZ t f

t i

d t q(t) J(t) (10.185)

and further define

Z [ J ] D hΩ jΩ i J D NZ

Dqei„ S [q, J ] (10.186)

where N is a normalization constant to make Z [0] D 1. We know how to rotatethe time axis in the definition of Z [ J ] to isolate the vacuum state. Equivalently, wemay add a small negative imaginary part to the Hamiltonian. Adding i εq2 to theLagrangian will achieve this:

Z [ J ] D NZ

Dq exp�

i„

Z 1

�1d t�LC J q C i εq2�� (10.187)

Then

δZ [ J ]δ J(t1)

D NZ

Dqi„

δS [q, J ]δ J(t1)

ei„ S [q, J ] D N

ZDq

i„

q(t1)ei„ S [q, J ] (10.188)

Therefore,

δZ [ J ]δ J(t1)

ˇˇ

JD0D N

ZDq

i„

q(t1)ei„ S [q] D

i„

NhΩ j Oq(t1)jΩ i (10.189)

Repeating the functional differentiation, we have

δ2Z [ J ]δ J(t1)δ J(t2)

ˇˇ

JD0D N

ZDq

�i„

�2

q(t1)q(t2)ei„ S [q, J ]

D

�i„

�2

NhΩ jT [ Oq(t1) Oq(t2)]jΩ i

(10.190)

In general, we have

δn Z [ J ]δ J(t1) � � � δ J(tn)

ˇˇ

JD0D

�i„

�n

NhΩ jT [ Oq(t1) � � � Oq(tn)]jΩ i (10.191)

Page 330: Yorikiyo Nagashima - Archive

10.6 Generating Functional 305

Then we have

Z [ J ] D1X

nD0

1n!

�i„

�n Zd t1 � � � d tn J(t1) � � � J(tn)hΩ jT [ Oq(t1) � � � Oq(tn)]jΩ i

(10.192)

Namely, Z [ J ] is a generating functional of n-point time-ordered correlation func-tions or the Green’s functions in the vacuum. It is known as the vacuum functionalor the generating functional of the vacuum Green’s functions. Using Eqs. (10.192)and (10.182), the time-ordered n-point function is expressed as

hΩ jT [ Oq(t1) � � � Oq(tn)]jΩ i D�„

i

�n � 1Z( J )

δn Z [ J ]δ J(t1) � � � δ J(tn)

�JD0

(10.193)

The usefulness of Eq. (10.193) lies in the fact that any scattering amplitude canbe converted to n-point vacuum expectation values using reduction formula (seeSect. 11.5). Therefore, if one knows all the vacuum Green’s functions, one canconstruct the S-matrix of the theory and therefore solve the problem. In quantumfield theory, these correlation functions play a central role.

It is convenient to introduce another related functional W( J ) defined by

Z [ J ] D ei„ W [ J ] or , W [ J ] D �i„ ln Z [ J ] (10.194)

Then taking the functional derivative, we obtain

δ W [ J ]δ J(t1)

ˇˇ

JD0D

�„

i1

Z [ J ]δZ [ J ]δ J(t1)

�JD0D hΩ j Oq(t1)jΩ i � h Oq(t1)i (10.195)

where h� � � imeans the vacuum expectation value. We have learned that in the caseof free particles and harmonic oscillators, the path integral for the transition ampli-tude is proportional to the exponential of the classical action [Eqs. (10.89), (10.123)].For this reason W [ J ] is also called the effective action. The advantage of usingW( J ) over Z( J ) is that it gives statistical deviations from the mean values. This canbe shown by further differentiating Eq. (10.195):

iδ2 W [ J ]

δ J(t1)δ J(t2)

ˇˇ

JD0

D

�„

i

�2 � 1Z [ J ]

δ2Z [ J ]δ J(t1)δ J(t2)

�1

Z2[ J ]δZ [ J ]δ J(t1)

δZ [ J ]δ J(t2)

�JD0

D˝T�Oq(t1) Oq(t2)

˛� h Oq(t1)ih Oq(t2)i

D˝T�( Oq(t1) � h Oq(t1)i)( Oq(t2) � h Oq(t2)i)

˛(10.196)

Page 331: Yorikiyo Nagashima - Archive

306 10 Path Integral: Basics

We recognize this is the second-order deviation from the mean. The third-ordermean deviation is given by

�„

i

�2 δ3 W [ J ]δ J(t1)δ J(t2)δ J(t3)

ˇˇ

JD0

D

�„

i

�3 � 1Z [ J ]

δ3Z [ J ]δ J(t1)δ J(t2)δ J(t3)

�1

Z2[ J ]δ2Z [ J ]

δ J(t1)δ J(t2)δZ [ J ]δ J(t3)

�1

Z2[ J ]δ2Z [ J ]

δ J(t2)δ J(t3)δZ [ J ]δ J(t1)

�1

Z2[ J ]δ2Z [ J ]

δ J(t3)δ J(t1)δZ [ J ]δ J(t2)

C1

Z3[ J ]δZ [ J ]δ J(t1)

δZ [ J ]δ J(t2)

δZ [ J ]δ J(t3)

�JD0

D˝T�( Oq(t1) � h Oq(t1)i)( Oq(t2) � h Oq(t2)i)( Oq(t3)� h Oq(t3)i)

˛(10.197)

We can go on to higher order deviations from the mean. In quantum field theory,W [ J ] is known as the generating functional for the connected vacuum Green’sfunction, which will be discussed later in Sect. 11.6.7.

10.7Connection with Statistical Mechanics

A typical application of the path integral outside the field of particle physics isto calculate the partition functions in statistical mechanics. Once it is given as afunction of the temperature and volume (or pressure), the internal energy, entropy,pressure and other standard thermodynamical quantities are obtained by simplydifferentiating the partition function.

In a statistical ensemble of many particles that are confined in a volume V andin thermal equilibrium with a heat bath at temperature T, the probability of eachparticle being in a state i is given by

pi D1Z

e�Ei , D1

kBT, (10.198a)

where kB D 8.617�10�5 eV K�1 is the Boltzmann constant. The partition functionZ and the Helmholtz free energy F D U � T S are defined for a system in thermalequilibrium, hence the Hamiltonian can be considered as time independent. Theyare defined by

Z DX

i

e�Ei DX

i

hije� OH jii D Trhexp(� OH)

i(10.198b)

F D �1

ln Z D �kBT ln Z (10.198c)

Page 332: Yorikiyo Nagashima - Archive

10.7 Connection with Statistical Mechanics 307

Thermodynamical variables such as entropy and pressure can be derived from thepartition function:

S D �@F@T

ˇˇVD kB 2 @F

@, P D �

@F@V

(10.198d)

Just to refresh our memory of thermodynamics, we give a derivation of theHelmholtz free energy in the boxed paragraph at the end of this chapter.

On the other hand the functional integral is used to calculate

G(x , t W y ) � ht, x jti D 0, yi D hx j exp��

i t„OH�jyi

Eq. (10.52)D

ZDqD

p2π„

�exp

�i„

Z t f

t i

d t fp Pq � H(p , q)g�

(10.199)

Namely, if one can calculate the propagator Eq. (10.199), all one has to do to obtainthe partition function is “to replace i t/„ with (or better to say analytically contin-ue to t ! �i„) and to take the trace”. Note, the trace can be taken in any basis.In particular, if we choose the basis in the Schrödinger picture, we can write

Z DZ

dqhqje� OH jqi (10.200)

If we introduce an imaginary time τ D i t/„, denote the τ derivative of q as Pq andset τ i D 0 for simplicity, the partition function is given by

Z DZ

dqhqje� OH jqi DZ

dqZ

q(0)Dq()DqD

p2π„

� exp

"�

Z

0dτ fH(p , q)� i p Pqg

# (10.201a)

For the special case H(p , q) D p 2/2m C V(q), this becomes

Z DZ

dqZ

q(0)Dq()Dq exp

"�

Z

0dτ�

12

m Pq2 C V(q) #

(10.201b)

where

Dq D limN!1

m2π„�

�N/2dq1dq2 � � � dqN�1 (10.201c)

We have already encountered this type of functional integral in Sect. 10.6.2 as theWick rotation.

We add a few simple exercises in statistical mechanics just to remind the readerof some basic uses of the partition function.

Page 333: Yorikiyo Nagashima - Archive

308 10 Path Integral: Basics

Example 1: Free Particle Using Eq. (10.201b) and substituting t f � ti D �i„,q f D qi in Eq. (10.95)

Z D [(10.201b)]free particle D

Zdq[(10.95)]t f �t iD�i„

q f Dqi

D

�m

2π„2

�1/2 Zdq

(10.202)

Then the Helmholtz free energy is given by

F D �1

ln Z D �1

ln

"�m

2π„2

�1/2

V

#(10.203)

Using Eqs. (10.203) and (10.198d), we obtain the equation of state for the ideal gasP V D kBT .

Example 2: Harmonic Oscillator

Problem 10.7

(1) Calculate the propagator for the harmonic oscillator at imaginary time t D�i„ to obtain

K(x W y ) D�

mω2π„ sinh(„ω)

�1/2

� exp��

mω2„ sinh„ω

˚(x2 C y 2) cosh(„ω) � 2x y

�� (10.204)

Hint: Use (10.124).(2) Calculate the partition function and the Helmholtz free energy. Show they aregiven by

Z D1

2 sinh„ω/2, F D kT ln

�2 sinh

„ω2k T

�(10.205)

(3) Using the energy eigenvalues of the harmonic oscillator, derive the partitionfunction directly by calculating Eq. (10.198b), and show it agrees with Eq. (10.205).(4) Derive the Planck blackbody radiation formula from Eq. (10.205).

Page 334: Yorikiyo Nagashima - Archive

10.7 Connection with Statistical Mechanics 309

Derivation of the Helmholtz Free Energy Having defined the partition function foran ensemble of particles in a volume V and in thermal equilibrium with a heat bathat temperature T, we will prove that the Helmholtz free energy is given by

F � U � T S D �kB T ln Z (10.206)

Note, Ei is a function of V, hence Z is a function of T and V. The internal energy Uof a system is given as the expectation (or average) value of the energy. Denoting theprobability of a particle in the state i as pi and using

PEi e�Ei D �@Z/@, the total

energy is given by

U �X

pi Ei D1Z

XEi e�Ei D �

1Z

@Z@

ˇˇVD �

@ ln Z@

ˇˇV

(10.207)

Similarly, the pressure is given as an energy change ΔU divided by an infinitesimalvolume change ΔV under equitemperature conditions:

P D �1Z

X @Ei

@Ve�Ei D

1Z

@

@V

Xe�Ei

ˇˇ

TD

1

�@ ln Z

@V

�T

D �@F@V

ˇˇ

T(10.208)

When the volume of the system changes, two things happen. First, each energy levelshifts slightly, and second, heat energy d Q flows in from the heat bath to keep thetemperature constant. The energy increase dU is given by

dU D �P dV C d Q (10.209)

Noting d(U)D dUC U d , we can calculate d Q/ T from

d QTD kB (dUC P dV )D kB

�d(U)� U d C P dV

(10.210)

Using Eqs. (10.207) and (10.208), we obtain

d QTD kB

�d(U)C

@ ln Z@

d C@ ln Z

@VdV�

(10.211)

The rhs is a total derivative. Therefore writing d Q/ T D d S , kB D 1/ T and inte-grating we have

S DUTC kB ln Z(T, V) �

UT�

FT

, or F D U � T S (10.212)

which proves Eq. (10.206). Using F D �(ln Z )/, U D �@ ln Z/@jV D@(F )/@jV , we can rewrite the entropy as

S DU � F

TD kB

�@(F )

@

ˇˇV� F

�D kB2 @F

@

ˇˇVD �

@F@T

ˇˇV

(10.213)

From Eqs. (10.212), (10.207) and (10.198a), it can also be shown that

S D �kB

Xi

p i ln pi (10.214)

which is the well-known expression for the entropy.

Page 335: Yorikiyo Nagashima - Archive
Page 336: Yorikiyo Nagashima - Archive

311

11Path Integral Approach to Field Theory

Applications of the path integral method to field theory are given in this chapter.Scattering amplitudes of QED processes treated in Chap. 7 and the self-energy ofthe photon and electron already considered in Chap. 8 are rederived.

11.1From Particles to Fields

Up to now we have discussed the one-particle system. We can extend the pathintegral method to systems of particles with many degrees of freedom. Considera system of N particles having coordinates qr (r D 1, � � � , N ). Then the transitionamplitude is expressed as

ht f q f jti q ii )ht f q1f , � � � qN

f jti q1i , � � � q

Ni i D N

Z Yr

Dqr(t)ei„ S [q]

qrf D qr(t f ) , qr

i D qr(ti )

(11.1)

The path integral on the rhs is now N-fold. The action has the form

S [q] DZ t f

t i

d tX

r

L(qr , Pqr) (11.2)

and the integration is over all paths starting at qri at t D ti and ending at qr

f att D t f . The field φ(x ) D φ(t, x ) can be considered as an infinite set of qr indexedby the position x , namely

r ! x

qr ! φ(x)ΔxXr

!

Z (11.3)

To define the path integral, instead of dividing just time into segments, we dividespace and time, and the Minkowski space is broken down into four-dimensionalcubes of volume ε4. Inside each cube at coordinate (t j , xk , y l , zm), φ is taken to be

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 337: Yorikiyo Nagashima - Archive

312 11 Path Integral Approach to Field Theory

a constant:

φ( j, k, l, m) D Nφ(t j , xk , y l , zm) (11.4)

Derivatives are approximated by, for example,

@φ@x

ˇˇ

j,k ,l,m'Nφ(t j , xk C ε, y l , zm ) � Nφ(t j , xk , y l , zm )

ε(11.5)

We replace the four indices ( j, k, l, m) formally by the single index a and write

φ(t j , xk , y l , zm), @μ φ(t j , xk , y l , zm )�D L(φa , @μ φa) D La (11.6a)

S [φ] DZ

d4xL '

N4XaD1

ε4La (11.6b)

The path integral of the field is defined asZDφe

i„ S [φ] D

ZDφ exp

�i„

Zd4xL(x )

D limN!1

Z N4YaD1

dφa exp

"i„

NXa

ε4La

#

D limN!1�!0

Z N4YaD1

dφa exp�

i„

ε4La

�(11.7)

11.2Real Scalar Field

11.2.1Generating Functional

Now we consider the real scalar field φ(x ) D φ(t, x ), which obeys the Klein–Gordon equation. From now on, we adopt natural units and omit „. We contin-ue to consider that the field is interacting with an external field J(x ). Then theclassical equation that φcl satisfies is

[@μ@μ C m2 � i ε]φcl D J(x ) (11.8)

We have introduced a small negative imaginary part to the mass squared to en-sure that the path integral is well behaved, as we discussed in Sect. 10.6.2. TheFeynman’s Green function is defined as one that obeys

[@μ@μ C m2 � i ε]ΔF(x � y ) D �δ4(x � y )

ΔF(x � y ) D ��@μ@μ C m2 � i ε

�1δ4(x � y ) D

Zd4q

(2π)4

e�i q�(x�y )

q2 � m2 C i ε

(11.9)

Page 338: Yorikiyo Nagashima - Archive

11.2 Real Scalar Field 313

The second equation is written to demonstrate that ΔF is formally defined as theinverse of the Klein–Gordon operator. The prescription of how to handle the sin-gularity of the Green’s function is dictated by the small negative imaginary part ofthe mass, which reproduces the recipe discussed in Sect. 6.6.2 to choose positiveenergy for particles going forward in time and negative energy for backward-goingparticles. The vacuum-to-vacuum amplitude is given by

Z0[ J ] D NZ

Dφ exp�

iZ

d4x�

12

˚@μ φ@μ φ � (m2 � i ε)φ2�C J φ

�(11.10)

We have attached the subscript “0” to Z to denote that we have not included theinteraction of the scalar field apart from the artificial external source and hence φcan be considered free except for the external source J. The notation Z [ J ] is savedfor future use to denote the generating functional of interacting fields. We remindthe reader that φ in the path integral is not the field satisfying Eq. (11.8), but theintegration variable, which can have any value. We modify the action slightly, usingthe identityZ

d4x@μ φ@μ φ DZ

d4 x@μ(φ@μ φ)�Z

d4x φ@μ@μ φ (11.11)

The first term is a surface integral and vanishes if φ ! 0 at infinity. Then Z0[ J ] isexpressed as

Z0[ J ] D NZ

Dφ exp��iZ

d4x�

12

φ�@μ@μ C m2 � i ε

�φ � J φ

�(11.12)

To re-formulate the path integral, we use the fact that φ is an integration variableand we are free to shift the reference path. We choose the reference path as thesolution to Eq. (11.8):

φ(x )! φ0(x ) D φ(x )C φcl(x ) D φ(x ) �Z

d4 yΔF(x � y ) J(y ) (11.13)

The action in Eq. (11.12) can be rewritten as

� iZ

d4x�

12

φ0 �@μ@μ C m2 � i ε�

φ0 � J φ0�

D �iZ

d4x�

12

φ�@μ@μ C m2 � i ε

�φ

C12

φ�@μ@μ C m2 � i ε

�φcl C

12

φcl�@μ@μ C m2 � i ε

�φ

C12

φcl�@μ@μ C m2 � i ε

�φcl � J φcl � J φ

�(11.14)

The second and third terms on the rhs give the same contribution when integrat-ed over and are canceled by the last term because of Eq. (11.8). The fourth termis 1

2 J φcl again because of Eq. (11.8). Then using Eq. (11.13),R

d4x J φcl can berewritten as �

’d4x d4 y J(x )ΔF(x � y ) J(y ).

Page 339: Yorikiyo Nagashima - Archive

314 11 Path Integral Approach to Field Theory

Collecting all the terms, we can now write the action as

S [φ] D �i2

Zd4x φ(@μ@μ C m2 � i ε)φ �

i2

Zd4x d4 y J(x )ΔF(x � y ) J(y )

(11.15)

The second term does not contain φ, hence it can be taken out of the functionalintegral. Z0[ J ] now takes the form

Z0[ J ] D N exp��

i2

Zd4 x d4 y J(x )ΔF(x � y ) J(y )

ZDφ exp

��

i2

Zd4x φ(@μ@μ C m2 � i ε)φ

�(11.16)

The second factor, formally giving (det[i(@μ@μ C m2 � i ε)])�1/2, is a constant, be-cause it is integrated over all functions φ. We will calculate det[i(@μ@μ C m2 � i ε)]later (see Sect. 11.2.2). Here we simply denote it by N�1. So the final expressionfor Z0[ J ] is

Z0[ J ] D exp��

i2

Zd4 x d4 y J(x )ΔF(x � y ) J(y )

�(11.17)

Alternatively, the above equation can be derived using the generalized Gaussianintegral. Note that just as integration of quadratic polynomials in an exponent isgiven byZ 1

�1dq e�aq2CbqCc D

rπa

exp�

b2

4aC c

�(11.18)

the N-dimensional Gaussian integral in such a form is generally expressed asZd y1 � � � d yN e� 1

2 yT AyCBT yCC D(2π)N/2p

det[A]exp

�12

BT A�1B C C�

(11.19)

where y D (y1, y2, � � � , yN ) and A and B are N � N and N � 1 matrices. The gen-eralization to the path integral of the bilinear scalar field action can easily be done.It is to be understood that the integration can be carried out in Euclidian spaceand analytically continued back to give the real space-time result so that formal ap-plication of the Gaussian formula Eq. (11.19) is valid for the oscillating integrand.Setting

A D i�@μ@μ C m2 � i ε

�, B D i J , C D 0 (11.20)

the straightforward application of Eq. (11.19) results in

Z0[ J ] D NZ

Dφ exp��iZ

d4x�

12

φ˚@μ@μ C m2 � i ε

�φ � J φ

DN Dp

det Aexp

�12

BTA�1B�

D exp��

i2

Zd4 x d4 y J(x )ΔF(x � y ) J(y )

�(11.21)

Page 340: Yorikiyo Nagashima - Archive

11.2 Real Scalar Field 315

where D is a diverging constant, which can be absorbed in the normalization con-stant. Equation (11.21) reproduces the result Eq. (11.17).

Normalization of the Path Integral Let us settle the question of the normalization.In most field theoretical questions, one is primarily interested in calculating expec-tation values of time-ordered operator products between the vacuum states, name-ly vacuum to vacuum transition rate. Consequently, the value of the normalizationconstant N is irrelevant, because we are only interested in the normalized tran-sition amplitude. As Z0[ J ] is the vacuum-to-vacuum transition amplitude in thepresence of the source J, it is sensible to normalize it to Z0[ J D 0] D 1. Or equiva-lently we can eliminate the awkward extra constant by dividing the path integral byZ0[0], which produces exactly the same constant. This is exactly what we have donein defining the n-point time-ordered correlation functions in Eq. (10.183). ThenEq. (11.10) must be rewritten as

Z0[ J ] D

R Dφ exp��iR

d4x˚ 1

2 φ�@μ@μ φ C m2 � i ε

�φ � J φ

�R Dφ exp��iR

d4 x˚ 1

2 φ�@μ@μ φ C m2 � i ε

�φ�

D exp��

i2

Zd4x d4 y J(x )ΔF(x � y ) J(y )

� (11.22)

We have two equivalent expressions for the generating functional. The first, in theform of the path integral, is convenient for defining time-ordered n-point vacuumcorrelation functions. The second does not contain the path integral, and is conve-nient in calculating their functional derivatives.

11.2.2Calculation of det A

As we discussed in the previous section, the actual value of det A is not important aslong as it does not contain any dynamical variables. It can always be factored awayby proper normalization. However, it is helpful to become familiar with the actualcalculation of the determinant, and we do it for the case of the most familiar Klein–Gordon operator. Perhaps it may be of interest to mention that det A in Euclidiantime is the partition function (see discussion in Sect. 10.7) and that it has a practicaluse in statistical mechanics:

det A D det�i�@μ@μ C m2� δ4(x � y )

(11.23)

As we will show, this is a divergent integral, but it does no harm as stated above. Toevaluate it we go to Euclidian space (x0 D �i x4

E) and write

i S D �i2

Zd4x φ(x )

�@μ@μ C m2� φ(x )

D �12

Zd4xE φ(xE)

��@μ@μ C m2�

E φ(xE)

� �12

Zd4xE d4 yEφ(xE)A(xE, yE)φ(yE)

(11.24a)

Page 341: Yorikiyo Nagashima - Archive

316 11 Path Integral Approach to Field Theory

xE D�x1

E , x2E , x3

E , x4E

�D (x1, x2, x3, i x0) (11.24b)

where x4E is treated as a real number. Now

A(xE, yE) D��@μ@μ C m2�

E δ4(xE � yE) (11.25)

is a matrix whose row and column indices are continuous variables. The followingreformulation is useful:

det A D exp[ln det A] D exp

24ln

Yj

λ j

35 D expfTr[ln A]g (11.26)

where the λ’s are the eigenvalues of the matrix A. The validity of the equalities canbe most easily seen in the diagonalized form, noting that the trace is invariant ifthe matrix is diagonalized or not. The eigenvalues of the matrix A are easily seenin its Fourier transform

A D��@μ@μ C m2�

E δ4(xE � yE) D (�@μ@μ C m2)E

Zd4kE

(2π)4e i kE�(xE�yE)

D

Zd4kE

(2π)4 e i kE�(xE�yE) �k2E C m2� (11.27a)

kE D�k1

E, k2E, k3

E, k4E

�D (k1, k2, k3,�i k0) (11.27b)

We see A�1 is not singular, namely A has an inverse, since the eigenvalues arepositive definite. Of course, we knew the inverse exists. It was already given byEq. (11.9) as the Feynman propagator.

Next we prove that a function of matrices f [A(x , y )] is the Fourier transform ofthe same function of the Fourier-transformed matrices, namely

f [A(x , y )] DZ

dk2π

dk0

2πe i(k x�k 0y ) f [A(k, k0)] (11.28a)

A(k, k0) DZ

dx d y e�i(k x�k 0y )A(x , y ) (11.28b)

Proof: For simplicity we consider in one dimension. In the above statement, thefunction is to be understood as expandable in a Taylor series:

f (A) D c0 C c1AC c2A2 C � � � cn An C � � �

or f [A(x , y )] D c0 C c1A(x , y )C c2

Zdz1A(x , z1)A(z1, y )C � � �

C cn

Zdz1 � � � dzn�1A(x , z1)A(z1, z2) � � � A(zn�1, y )C � � �

(11.29)

Page 342: Yorikiyo Nagashima - Archive

11.2 Real Scalar Field 317

For n D 2 it is easy to show thatZdz1 A(x , z1)A(z1, y )

D

Zdz1

dk2π

dk0

2πd p2π

d p 0

2πe i(k x�k 0z1) e i(p z1�p 0 y )A(k, k0)A(p , p 0)

D

Zdk2π

d p 0

2πe i(k x�p 0y )

Zdk0

2πA(k, k0)A(k0, p 0) (11.30)

The nth term has n � 1 integrations, yielding n � 1 δ functions, and the result isZdz1 � � � dzn�1 A(x , z1)A(z1, z2) � � � A(zn�1, y )

D

Zdk2π

dk0

2πe i(k x�k 0y )

Zdk1

2π� � �

dkn�1

2πA(k, k1)A(k1, k2) � � � A(kn�1, k0)

D

Zdk2π

dk0

2πe i(k x�k 0y )An(k, k0) (11.31)

which proves Eq. (11.28a). If A(k, k0) is diagonal, namely A(k, k0) D A(k)(2π)δ(k�k0), or, equivalently, if A(x , y ) is a function of x � y , Eq. (11.28a) can be expressedas

f [A(x , y )] DZ

dk2π

e i k(x�y ) f [A(k)] (11.32)

then

Tr[ f [A(x , y )]] DZ

dx d y δ(x � y )Z

dk2π

e i k(x�y ) f [A(k)]

D

Zdx

Zdk2π

f [A(k)](11.33)

Now we go back to four dimensions. Equation (11.27) tells us that

A�kE, k0

E

�D�k2

E C m2� δ4�kE � k0E

�(11.34)

Then setting f (A) D ln A, the determinant A of Eq. (11.27) is given as

det A D exp�Tr ln A(xE, yE)

D exp

�Zd4xE d4 yE δ4(xE � yE)

�Zd4kE

(2π)4 e i k �(xE�yE) ln�k2

E C m2� �

D exp�Z

d4xE

Zd4kE

(2π)4ln�k2

E C m2�� (11.35)

Analytically continuing back to Minkowski space by setting x4E D i x0, k4

E D �i k0,det A is given by

det A D exp�Z

d4xZ

d4k(2π)4 ln(�k2 C m2)

�(11.36)

Page 343: Yorikiyo Nagashima - Archive

318 11 Path Integral Approach to Field Theory

As announced, this is a divergent quantity. The factorR

d4x arose from the trans-lational symmetry. The action S stays constant for a class of the field which canbe obtained by symmetry operations. Given a constant integrand the trace gives adiverging contribution, so it should be removed from the path integral. Our defini-tion of normalization Eq. (11.22) takes care of this.

11.2.3n-Point Functions and the Feynman Propagator

In Sect. 10.6, we derived a relation between n-point functions and generating func-tionals. The field theoretical version of an n-point function is

h0jT [ Oφ(x1) � � � Oφ(xn)]j0i D�

1i

�n δn Z0[ J ]δ J(x1) � � � δ J(xn)

ˇˇ

JD0(11.37)

which is easily obtained by taking functional derivatives of the first expression ofthe generating functional Eq. (11.22). Note, here we express the vacuum state byj0i, which is the vacuum of the free field, because we are using Z0[ J ], the generat-ing functional of the free field. Looking at the second expression of Eq. (11.22), wesee that n-point functions can be expressed in terms of Feynman propagators.

First we show that for a quadratic action such as that of the scalar field, the one-point vacuum function vanishes:

1i

δZ0[ J ]δ J(x1)

D1i

δδ J(x )

exp��

i2

Zd4x d4 y J(x )ΔF(x � y ) J(y )

D �

Zd4z J(z)ΔF(x1 � z)Z0[ J ]

(11.38)

Using Eq. (11.37), this clearly gives

h0jT [ Oφ(x )]j0i D h0j Oφ(x )j0i D1i

δZ0[ J ]δ J(x1)

ˇˇ

JD0D 0 (11.39)

Next, as the two-point correlation function is given by

h0jThOφ(x1) Oφ(x2)

ij0i D

�1i

�2 δ2Z0[ J ]δ J(x1)δ J(x2)

ˇˇ

JD0(11.40)

we obtain by differentiating Eq. (11.38)�1i

�2 δ2Z0[ J ]δ J(x1)δ J(x2)

D

�iΔF(x1 � x2)C

Zd4z1ΔF(x1 � z1) J(z1)

Zd4z2ΔF(x2 � z2) J(z2)

�Z0[ J ]

(11.41)

After setting J D 0, we obtain

h0jThOφ(x1) Oφ(x2)

ij0i D

�1i

�2 δ2Z0[ J ]δ J(x1)δ J(x2)

ˇˇ

JD0D iΔF(x1 � x2) (11.42)

Page 344: Yorikiyo Nagashima - Archive

11.2 Real Scalar Field 319

which shows that the two-point correlation function is nothing other than the Feyn-man propagator. This agrees with Eq. (6.124), of course, which was derived usingthe canonical quantization method.

11.2.4Wick’s Theorem

We continue with n-point functions. We note that the generating functional Z0[ J ]can be Taylor expanded around J:

Z0[ J ] D 1CZ

d4x1δZ0[ J ]δ J(x1)

ˇˇ

JD0J(x1)

� � � C1n!

Zd4x1 � � � d4 xn

δn Z0[ J ]δ J(x1) � � � δ J(xn)

ˇˇ

JD0J(x1) � � � J(xn)C � � �

D

1XnD0

1n!

Zd4x1 � � � d4xn i nh0jT [ Oφ(x1) � � � Oφ(xn)]j0i J(x1) � � � J(xn)

(11.43)

where we have used Eq. (11.37). Another way of expressing the generating func-tional is

Z0[ J ] D1X

kD0

1k!

��

i2

Zd4x d4 y J(x )ΔF(x � y ) J(y )

�k

D 1C1X

kD1

1k!

��

i2

�k Zd4 x1 � � � d4x2k Δ12Δ34 � � � Δ2k�1,2k J1 J2 � � � J2k

(11.44a)

Δ i j D ΔF(xi � x j ) (11.44b)

Since, Z0[ J ] contains only even powers of J, it is obvious that

h0jT [ Oφ(x1) � � � Oφ(x2k�1)]j0i D1

i2k�1

δ2k�1Z0[ J ]δ J(x1) � � � δ J(x2k�1)

ˇˇ

JD0D 0

(11.45a)

If n D 2k is even, the n-point function is a sum of products of two-point functions:

h0jT [ Oφ(x1) � � � Oφ(x2k )]j0i DX

perms

iΔF(xp1 � xp2 ) � � � iΔF(xp2k�1 � xp2k )

(11.45b)

where the sum is taken for all the permutations. This is one version of Wick’stheorem.

Page 345: Yorikiyo Nagashima - Archive

320 11 Path Integral Approach to Field Theory

Note the factor 1/2k k! has disappeared from the expression for the 2k-point func-tion. This can be seen as follows. n-Functional derivatives are completely symmet-ric and there are n! permutations (different orders) of n variables x1 � � � xn . Whenn D 2k, the coefficient of J1 . . . J2k in Eq. (11.44a) Δ12Δ34 � � � Δ2k�1,2k has 2k pair-wise symmetry because Δ i j D Δ j i and another k!-fold symmetry because the or-der of the product of k Δ i j ’s can be arbitrary. That leaves (2k)!/(2k k!) D (2k�1)!! D(2k � 1) � � � 3 � 1 irreducible combinations in Eq. (11.45b). As an explicit example weconsider the case n D 4. We then should have 4!/(222!) D 3 terms for the 4-pointfunction, which is given by

h0jT [ Oφ(x1) Oφ(x2) Oφ(x3) Oφ(x4)j0i D1

222!

X4! permutations

iΔ i1 i2 iΔ i3 i4

D �ΔF(x1 � x2)ΔF(x3 � x4)� ΔF(x1 � x3)ΔF(x2 � x4)

� ΔF(x1 � x4)ΔF(x2 � x3)

(11.46)

Equations (11.43) and (11.45) are important because, as we will learn lat-er (Sect. 11.5), the transition rate of any physical process can be expressed interms of n-point vacuum correlation functions using the Lehmann–Symanzik–Zimmermann (LSZ) reduction formula.

So far we have treated only the generating functionals of the free Lagrangian.The relation between the n-point vacuum correlation functions and the functionalderivatives is equally valid if the Lagrangian includes interaction terms.

11.2.5Generating Functional of Interacting Fields

So far we have dealt with the action of up to quadratic terms or exactly solublecases. In general, the action includes higher order polynomials or transcendentalfunctions that are not analytically soluble. Consider adding a potential V(φ) to thefree Lagrangian:

L D12

�@μ φ@μ φ � (m2 � i ε)φ2 � V(φ) (11.47)

It is not possible to solve this problem exactly even when V � 1. But as long as Vis small, we can solve it perturbatively in the following way. We retain the externalsource as before and write

S [ J ] D S0[φ, J ] �Z

d4x V(φ)

S0[φ, J ] DZ

d4x�

12

�@μ φ@μ φ � (m2 � i�)φ2C J(x )φ(x )

� (11.48)

Z [ J ] D NZ

Dφe�iR

d4zV(φ(z))e i S0[φ, J ] (11.49)

We now use the relation

δδ J(x )

Z0[ J ] Dδ

δ J(x )

ZDφe i S0[φ, J ] D

ZDφ i φ(x )e i S0[φ, J ] (11.50)

Page 346: Yorikiyo Nagashima - Archive

11.3 Electromagnetic Field 321

This relation is also true for any power of φ, or for a potential V(φ) that is Taylorexpandable:

)Z

Dφ V(φ)e i S0[φ, J ] D V�

1i

δδ J(x )

�Z0[ J ] (11.51)

This relation allows us to take the V-dependent factor in Eq. (11.49) out of the pathintegral. Consequently, after exponentiation, we obtain

Z [ J ] D NZ

Dφ e�iR

d4zV(φ(z))e i S0[φ, J ]

D exp��iZ

d4z V�

1i

δδ J(z)

��Z0[ J ]

(11.52)

where Z0[ J ] is the vacuum functional without interaction. Using the expres-sion (11.22) for Z0[ J ]

Z [ J ] D exp��iZ

d4z V�

1i

δδ J(z)

��

� exp��

i2

Zd4x d4 y J(x )ΔF(x � y ) J(y )

� (11.53)

If V is small we can expand the first exponential in powers of V, and the problemcan be treated perturbatively order by order.

11.3Electromagnetic Field

11.3.1Gauge Fixing and the Photon Propagator

The electromagnetic field is gauge invariant and this produces some difficulties inthe path integral. The action of the free electromagnetic field is given by

S DZ

d4x��

14

Fμν F μν�D �

14

Zd4x (@μ A ν � @ν A μ)(@μ Aν � @ν Aμ)

D �14

Zd4x

�@μ A ν@μ Aν � @ν A μ@μ Aν � @μ A ν@ν Aμ C @ν A μ@ν Aμ

(11.54)

By partial integration

S D �14

Zd4x

��2A �@μ@μ A� C 2Aμ@μ@ν Aν�

D12

Zd4x Aμ(x )

�gμν@�@� � @μ@ν

Aν(x )

D12

Zd4 k

(2π)4QA μ(k)

��gμν k2 C k μ k ν QA ν(k) (11.55)

Page 347: Yorikiyo Nagashima - Archive

322 11 Path Integral Approach to Field Theory

where QA(k) is the Fourier transform of A(x ). The action is gauge invariant, whichmeans it does not change if the electromagnetic field undergoes gauge transforma-tion:

QA μ(k)! QA0μ(k) D QA μ(k)C kμ

Qλ(k) (11.56)

or L D constant for a vast class of λ, which makes the path integral badly divergent:

ZDAei S [A] �

N4YaD1

3YμD0

dAμa � constant!1 (11.57)

where a stands for t j , xk , y l , zm . Namely, A μ D @μ λ(x ) is equivalent to A μ(x ) D 0and should not be included in the path integral. The divergence problem is linkedto another difficulty: the operator Λμν D gμν@�@� � @μ@ν does not have an inverse.This means that the equation of the Feynman propagator�

gμν@�@� � @μ@ν

DFνσ(x � y ) D δσ

μ δ4(x � y )

or��gμν k2 C kμ kν

QD νσ

F (k) D δσμ

(11.58)

has no solution. To see this, put D ν�

F (k) D Agν� C Bk ν k�, insert it in Eq. (11.58)and solve for A and B. The remedy we discussed in the Lorentz covariant treatmentof the canonical quantization was to add a gauge-fixing term to the Lagrangian:

L D LEM CLgf D �14

Fμν F μν �1

2λ(@μ Aμ)2 (11.59)

In particular, we saw that setting λ D 1 in the canonical quantization produces theMaxwell equation, which satisfies the Lorentz condition

@μ@μ Aν D 0 (11.60a)

hφj@μ Aμjψi D 0 (11.60b)

where hφj and jψi are physical state vectors. The same prescription works for thepath integral. If we add the gauge-fixing term to the Lagrangian, it is no longergauge invariant and there is no redundant integration which produces unwanteddivergence. Following the same reformulating procedure as led to Eq. (11.55), weobtain a new action

S DZ

d4 x��

14

Fμν F μν �1

2λ(@μ Aμ)2

D12

Zd4x A μ(x )

�gμν@�@� �

�1�

�@μ@ν

�A ν(x )

D12

Zd4k QA μ(k)

��gμν k2 C

�1 �

�k μ k ν

�QA ν(k) (11.61)

The operator

Λμν D gμν@�@� �

�1 �

�@μ@ν (11.62)

Page 348: Yorikiyo Nagashima - Archive

11.3 Electromagnetic Field 323

now has an inverse and the Feynman propagator for the electromagnetic field (thephoton) is a solution to the equation�

gμν@�@� �

�1 �

�@μ@ν � i ε

�DF

νσ(x � y ) D δσμ δ4(x � y ) (11.63)

where a small negative imaginary part is added to keep the path integral well be-haved at large time. In momentum space it is given by

QDFμν D �1

k2 C i ε

�gμν � (1� λ)

kμ kν

k2

�(11.64)

The constant λ is arbitrary. Note, setting λ ! 1 reproduces the gauge invariantLagrangian, but no propagator exists in this case. Choosing λ D 1 reproduces thefamiliar Feynman propagator, which we will adopt from now on. In this case

DFμν(x � y ) DZ

d4k(2π)4

e�i k �(x�y ) �gμν

k2 � i εD �gμν ΔF(x � y )jm2D0

@�@� DFμν(x � y ) D gμν δ4(x � y ) (11.65)

The addition of the gauge-fixing term has made the action and hence the path in-tegral gauge dependent, consequently the propagator is also gauge dependent, asis reflected by the arbitrary choice of λ in Eq. (11.64). However, they all lead to thesame physics because in the transition amplitude DFμν appears coupled to con-served currents, which fulfills @μ j μ D 0 ! kμ Qj μ(k) D 0. As a consequence, theλ-dependent term in the propagator always drops out and the physically observabletransition amplitudes are gauge independent.

11.3.2Generating Functional of the Electromagnetic Field

Once we have fixed the action for the electromagnetic field, the generating func-tional can be constructed in a similar manner as the scalar field. We add the ex-ternal field to the Lagrangian and reformulate the action in terms of the externalfield and the Feynman propagator. The bilinear electromagnetic field term can beintegrated out just like the scalar field in deriving Eq. (11.22) and again we have twoalternative expressions for the generating functional, just as for the scalar field:

ZEM0 [ J ] D N

ZDA exp

�iZ

d4x��

14

Fμν F μν �12

(@μ Aμ)2 C Jμ Aμ �

D

R DA exp�iR

d4x˚ 1

2 Aμ Λμν Aν C Jμ Aμ�R DA exp

�iR

d4x˚ 1

2 Aμ Λμν Aν�

D exp��

i2

Zd4x d4 y J μ(x )DFμν(x � y ) J ν(y )

�(11.66)

where Λμν is defined in Eq. (11.62). Note this is the generating functional ofthe free electromagnetic field. That of the interacting field will be discussed inSect. 11.6.

Page 349: Yorikiyo Nagashima - Archive

324 11 Path Integral Approach to Field Theory

The two-point vacuum expectation value can be derived from the above equationand again proves to be the Feynman propagator of the photon:

h0jT [A μ(x )A ν(y )]j0i D�

1i

�2 δ2Z0[ J ]δ J μ(x )δ J ν(y )

D i DFμν(x � y ) DZ

d4k(2π)4

�i gμν

k2 � i εe�i k �(x�y )

(11.67)

11.4Dirac Field

The path integral deals only with classical numbers. However, we cannot writethe Lagrangian for the anticommuting fermion fields using normal or classicalc-numbers. We need the notion of anticommuting classical numbers, mathemati-cally known as Grassmann variables.

11.4.1Grassmann Variables

Polynomial We use G-variables or G-numbers to denote Grassmann variables orGrassmann generators. Two independent Grassmann variables η i and η j are anti-commuting, i.e.

η i η j D �η j η i (11.68)

In particular,

η2i D 0 (i not summed) (11.69)

This means the G-numbers cannot have a length or an absolute value. Becauseof Eq. (11.68), any function of Grassmann variables including a c-number has asimple Taylor expansion:

f (η) D a C bη (11.70)

where a and b are functions of ordinary c-numbers. For instance, an exponentialfunction of G-numbers is

eaη D 1C aη (11.71)

Complex Conjugate A complex Grassmann number is defined by using two realG-numbers � and η

� D � C i η, �� D � � i η (11.72)

Page 350: Yorikiyo Nagashima - Archive

11.4 Dirac Field 325

If we require ��� to be real, i.e.

(��� )� D ��� D �� �� (11.73)

then

(� � i η)(� C i η) D � 2 C η2 C 2 i � η D 2 i � η D (2 i � η)�

) (� η)� D �� η ! (� η)� D η�(11.74)

Therefore

�1�2 D (�1 C i η1)(�2 C i η2) D (�1�2 � η1η2)C i(�1η2 C η1�2)

��2 ��

1 D (�2 � i η2)(�1 � i η1) D �2 �1 � η2η1 � i �2η1 � i η2�1

D�(�1�2 � η1η2)C i(�1η2 C η1�2)

�) (�1�2)� D ��

2 ��1 (11.75)

This shows that the complex conjugate of a G-number product behaves like thehermitian conjugate of matrix products.

Derivative As Grassmann variables are anticommuting, two distinguishablederivatives are defined. A left derivative would give

@

@η i(η j ηk ) D

�@η j

@η i

�ηk � η j

�@ηk

@η i

�D δ i j ηk � δ i k η j (11.76)

whereas a right derivative would give

@

@η i(η j ηk ) D η j

�@ηk

@η i

��

�@η j

@η i

�ηk D δ i k η j � δ i j ηk (11.77)

Thus we have to be careful as to which one is used. In our discussion, we assumewe use the left derivatives unless otherwise noted. The left derivative can be gener-alized to multivariables:

@

@ηb(ηa � � � ηb � � � η c) D (�1)P (ηa � � �

ηb missinghi � � � η c) (11.78)

where P is the number of permutations to move ηb to the left. We note that twosuccessive derivatives anticommute just like G-numbers. Namely,

@

@η i

@

@η jD �

@

@η j

@

@η i(11.79a)

which also means�@

�2

D 0 (11.79b)

Page 351: Yorikiyo Nagashima - Archive

326 11 Path Integral Approach to Field Theory

This relation implies that there is no inverse to the derivative. This can be seen bymultiplying the defining equation for the inverse

@

�@

��1

φ(η) D φ(η) (11.80)

from the left by @/@η

@

@η@

�@

��1

φ(η) D 0 ��

@

��1

φ(η) D@

@ηφ(η) (11.81)

which is a contradiction. Thus the inverse does not exist.The conventional commutation relation between the derivative and the coordi-

nate takes the form�@

@η i, η j

D

@

@η iη j C η j

@

@η iD δ i j (11.82)

Integral Because there is no inverse to the derivative, the integral in the space ofG-variables can only be defined formally, which reserves some general propertiesof ordinary integration. Since any function can be expressed as a polynomial in theform of Eq. (11.70), there are only two types of integral:Z

dη ,Z

dηη (11.83)

All other integrals are zero. To derive rules for the integrals, we require that Grass-mann variables respect translational invariance. Let � be a constant G-number andchange the integration variable from η to η0 D η � � . If translational invarianceholds, the two integrals should give the same result:Z

dη0 DZ

dη (11.84a)Zdη0η0 D

Zdη(η � � ) D

Zdηη C �

Zdη D

Zdηη (11.84b)

)Z

dη D 0 (11.85a)

To evaluateR

dηη, we use the fact that a G-number commutes with a pair of otherG-numbers:Z

dηη� D �Z

dηη

This meansR

dηη is a c-number and we can define it to take a value 1. NamelyZdηη D 1 (11.85b)

Page 352: Yorikiyo Nagashima - Archive

11.4 Dirac Field 327

The definition can be justified if we consider the fact that the integral of a totalderivative

R@ must vanish if we ignore surface terms, and, the definite integral

being a constant, its derivative must also vanish, namelyZ@ D @

ZD 0 (11.86)

Since the definitions of the integral Eqs. (11.85a) and (11.85b) mean that for anyfunction we haveZ

dη i f (η j ) DZ

dη i (a C bη j ) D bδ i j Dd

dη if (η j ) (11.87)

Namely, the differential and integral operations are identical. HenceZ@ D @

ZD @2 D 0 (11.88)

which confirms Eq. (11.86).Because the integral is the same as the differential, the Jacobian of the variable

transformation differs from that of ordinary integration. If we redefine the vari-ables of the integration from η to η0 D aη, thenZ

dη f (η) D@ f (η)

@ηD a

@ f (η0/a)@η0 D a

Zdη0 f

�η0

a

�(11.89)

The last equality is opposite to what happens to an ordinary integral. Namely. theJacobian that appears in the redefinition of the Grassmann variables is the inverseof what one would expect for the ordinary integral.

Multiple Integral In multiple integrals the integration variable has to be broughtfirst next to the integration measure:Z

dη1

Zdη2η1η2 D �

Zdη1

Zdη2η2η1 D �

Zdη1η1 D �1 (11.90)

Problem 11.1

Consider N-variable transformation �i DP

j Ui j η j . Write

Z NYiD1

dη i f (η1, � � � , ηN ) D JG

Z NYiD1

d�i g(�1, � � � , �N )

g(� ) D f (U�1� )

(11.91)

Using Eq. (11.85) and anticommutativity of the G-variables, prove

JG D@(�1, � � � , �N )@(η1, � � � , ηN )

D det[Ui j ] (11.92)

Page 353: Yorikiyo Nagashima - Archive

328 11 Path Integral Approach to Field Theory

δ function A delta function in the space of Grassmann variables is written as

δ(η � η0) D η � η0 (11.93)

Proof: As f (η) D a C bηZdη(η � η0) f (η) D

Zdη(η � η0)(a C bη)

D@

@η(η � η0)(a C bη) D a C bη0 D f (η0)

(11.94)

Note, as δ(η) D η, δ(aη) D aη D f@(aη)/@ηgη. Generally

δ( f (η)) D@ f (η)

@ηδ(η) (11.95)

which is consistent with the Jacobian defined by Eq. (11.92).An integral representation of the δ-function is obtained by noting that if η is a

G-variable

1i

Zdηe i η(��� 0) D

1i

Zdη[1C i η(� � � 0)] D � � � 0 D δ(� � � 0) (11.96)

Gaussian Integral Next we treat the Gaussian integral of the Grassmann variables.First we prove that for an integral of complex Grassmann variablesZ

dη�dηe�η�η D 1 (11.97)

The above equality can easily be proved becauseZdη�dηe�η�η D

Zdη�dη(1� η�η) D

@

@η�@

@η(1� η�η) D 1 (11.98)

Next we generalize the Gaussian integral to n complex variables. Denoting a Grass-mann vector ηT D (η1, η2, � � � , ηn), we consider an integral

IG �

Z nYiD1

dη�i dη i exp[�η† Aη] (11.99)

where A is an n � n matrix and we assume A† D A, because in the path integralwe are interested only in hermitian matrices. Then it can be diagonalized by usinga unitary matrix U:

D D UAU† or A D U† DU (11.100)

where D is a diagonal matrix and is expressed as

D j k D δ j k λ j (11.101)

Page 354: Yorikiyo Nagashima - Archive

11.4 Dirac Field 329

Changing the variables to � D U η

IG D

Z nYiD1

dη�i dη i exp[�� †UAU† � ]

D det U† det UZ nY

iD1

d� �i d�i exp[�� † D� ]

D

Z nY

iD1

d� �i d�i

!exp

24�X

j

λ j � �j � j

35

D

nYj D1

�Zd� �

j d� j exph�λ j � �

j � j

i�

D

nYj D1

�Zd� �

j d� j

h1� λ j � �

j � j

i�D

nYiD j

λ j D det A (11.102)

We have used the fact that the Jacobian is unity:

J D (det U†)(det U) D det(U†U) D 1 (11.103)

We also used the fact thatR

d� �d� is a c-number and can be moved freely inthe product. Note, the integral is 2n dimensional but the matrix is n dimensional.This formula is to be contrasted with the n-fold Gaussian integral over ordinaryc-number complex variables [Eq. (10.148)]:

Ic D

Zdz�

1 dz1 � � � dz�n dzn e�z† Az D

(2π i)n

det A(11.104)

We note that the determinant of the Grassmann Gaussian integral appears in thenumerator whereas that of the ordinary Gaussian appears in the denominator.

The generalization to the functional integral is straightforward. Change

η i ! η(x )nY

iD1

dη�i dη i ! Dη(x )Dη(x )

η† Aη !Z

d4x d4 y η�(x )A(x , y )η(y )

(11.105)

and we obtainZDη(x )Dη(x ) exp

��iZ

d4x d4 y η�(x )A(x , y )η(y )�D det i A (11.106)

Next, we evaluate an integral of the form

IF D

Z nYiD1

dη�i dη i ηk η�

l exp[�η† Aη] (11.107)

Page 355: Yorikiyo Nagashima - Archive

330 11 Path Integral Approach to Field Theory

It is clear that an integral containing one or an odd number of G-variables in frontof the exponential function vanishes. After unitary transformation the integral be-comes [see Eq. (11.102)]

I f D

nYj D1

Zd� �

j d� j (U�1k s �s � �

t Ut l)h1� λ j � �

j � j

i(11.108)

From this equation we see only terms of the formZd� �

j d� j � j � �j (1� λ j � �

j � j ) DZ

d� �j d� j � j � �

j (11.109)

gives non-zero contribution. Namely, �s � �t with s D t D j survive and λ j is lost.

Therefore, the effect of inserting an extra ηk η�l in the integral IG is to make the

change

λ j ! U�1k j U j l (11.110)

or equivalently to multiply IG [Eq. (11.102)] by

U�1k j

1λ j

U j l D (A�1)k l (11.111)

Thus the end result becomes

IF D

Z nYiD1

dη�i dη i ηk η�

l exp[�η† Aη] D det A(A�1)k l (11.112)

Inserting another pair ηm η�n in the integrand would yield a second factor (A�1)mn .

Gaussian Integral of Real Grassmann Variables We can obtain similar expressionsfor real-variable Gaussian integrals with the orthogonal matrix A:

Zdη1 � � � dηn e� 1

2 ηT Aη D

(pdet A n even

0 n odd(11.113)

The proof is more complicated than that for complex variables because the matrixis antisymmetric in this case. As real-variable integrals are used less than complex-variable integrals we leave the proof as an exercise.

Problem 11.2

Let η be an n-dimensional real Grassmann vector. Expand the Gaussian integral ina Taylor series:

e� 12 ηT Aη D 1�

12

ηT AηC

� 12

�2

2!

��ηT Aη

�2C� � �C

� 12

� j

j !

��ηT Aη

� jC� � � (11.114)

Page 356: Yorikiyo Nagashima - Archive

11.4 Dirac Field 331

Since the jth term contains 2j G-variables, the series does not go beyond j >

n/2.Show that

(1) For an n-dimensional integral In

I2 D

Zdη1dη2 e� 1

2 ηT Aη D12

(A 12 � A 21) D A 12 Dp

det A (11.115a)

I3 D 0 (11.115b)

I4 D

Zdη1 � � � dη4e� 1

2 ηT Aη

D

� 12

�2

2

4Xi j k lD1

A i j A k l

Zdη1 � � � dη4 η i η j ηk η l

D (A 12A 34 � A 13A 24 C A 14A 23) Dp

det A (11.115c)

(2) Prove Eq. (11.113).*

Problem 11.3

Prove that a similar equation to Eq. (11.112) holds for ordinary numbers, namelyfor an N-dimensional real vector x D (x1, � � � , xN ) and real symmetric and positivedefinite D � D matrix A

Z NYiD1

dx i xk1 � � � xk n e� 12 xT Ax

D

8<:

(2π)N/2

pdet A

�A�1

k1 k2� � � A�1

kn�1 knC permutations

�n D even

0 n D odd

(11.116)

11.4.2Dirac Propagator

We are now ready to express the anticommuting Dirac field in terms of Grassmannvariables. We may define a Grassmann field ψ(x ) in terms of any set of orthonor-mal basis functions:

ψ(x ) DX

j

ψ j φ j (x ) (11.117)

The basis function φ j (x ) are ordinary c-number functions, but the coefficients ψ j

are Grassmann numbers. Then the Dirac field can be expressed by taking φ(x )to be the four-component Dirac spinors u r(p )e�i p �x and vr (p )e�i p �x. Using the

Page 357: Yorikiyo Nagashima - Archive

332 11 Path Integral Approach to Field Theory

Grassmann–Dirac field, the 2-point correlation function for the Dirac field is ex-pressed as

h0jT [ Oψ(x1) Oψ(x2)]j0i D

R DψDψψ(x1)ψ(x2) exp�iR

d4x ψ(i 6@� m)ψR DψDψ exp

�iR

d4x ψ(i 6@� m)ψ

(11.118)

It is also to be understood that the integral can be analytically continued in the timevariable complex plane to ensure the good behavior of the integral or equivalentlya small imaginary part i ε is implicit to produce the right propagator. ComparingEq. (11.118) with Eq. (11.112), we obtain

h0jT [ Oψ(x ) Oψ(y )]j0i D i SF(x � y ) D [�i(i 6@� m)]�1δ4(x � y )

SF(x � y ) DZ

d4 p(2π)4

e�i p �(x�y )

6p � m C i ε

D

Zd4 p

(2π)4 e�i p �(x�y ) 6p C mp 2 � m2 C i ε

(11.119)

11.4.3Generating Functional of the Dirac Field

Using the Feynman propagator, the generating functional of the Dirac field is writ-ten as

ZDirac0 [η, η] D

R DψDψ exp�iR

d4x˚

ψ(i 6@� m)ψ C ηψ C ψη�R DψDψ

�iR

d4x˚

ψ(i 6@� m)ψ�

D exp��iZ

d4x d4 y η(x )SF(x � y )η(y )�

(11.120)

The derivation of the second line from the first is completely in parallel with thatof the scalar field [see Eq. (11.22)]. Then the n-point correlation function can bederived by successive differentiation of the generating functional. We have to becareful only about the sign change depending on the position of the Grassmannvariables. The operation �i δ/δη pulls down ψ but to pull down ψ, i δ/δη is nec-essary because it has to be anticommuted through ψ before it can differentiate η.If we apply two derivatives to Z0 to pull down ψ(x )ψ(y ):��

1i

δδη(y )

��1i

δδη(x )

�ZDψDψ exp

�iZ

d4z˚

ψ(i 6@� m)ψ C ηψ C ψη��

D

��

1i

δδη(y )

�ZDψDψ ψ(x ) exp

�iZ

d4 z˚

ψ(i 6@� m)ψ C ηψ C ψ η��

D �

ZDψDψ ψ(x )ψ(y ) exp

�iZ

d4z˚

ψ(i 6@� m)ψ C ηψ C ψη��

(11.121)

Page 358: Yorikiyo Nagashima - Archive

11.5 Reduction Formula 333

Note the minus sign was added because δ/δη has to go through ψ, which waspulled down by the first derivative. If the ordering of the operators and of the deriva-tives is the same, no extra sign appears, namely

�1i

�2 δ2Z0[η, η]δη(y )δη(x )

ˇˇ

ηDηD0

D h0jT [ Oψ(x ) Oψ(y )]j0i (11.122)

This formula can be extended to a general 2n-point function:

�1i

�2n δ2n Z0[η, η]δη(y1) � � � δη(yn)δη(x1) � � � δη(xn)

ˇˇ

ηDηD0

D h0jT [ Oψ(x1) � � � Oψ(xn) Oψ(y1) � � � Oψ(yn)]j0i

(11.123)

To prove that we have the right phase in the above equation, we first show that ifthe derivatives are ordered in exactly the same way as the operator products, therelation between the two is expressed as�

�1i

�n �1i

�n δnCn Z0[η, η]δη(yn) � � � δη(y1)δη(xn) � � � δη(x1)

ˇˇ

ηDηD0

D h0jT [ Oψ(yn) � � � Oψ(y1) Oψ(xn) � � � Oψ(x1)]j0i

(11.124)

The proof goes as follows. The first derivative δ/δη(x1) pulls ψ(x1) down fromZ0 and produces no sign change. The second derivative has to be anticommutedwith ψ(x1) before pulling down ψ(x2) from Z0, producing one extra �1. But it iscanceled by moving the pulled-down ψ(x2) to the left of ψ(x1). Similarly, the ithderivative has to be anticommuted i times before it pulls ψ(xi ) down from Z0, butthe generated sign is canceled exactly by moving the pulled down ψ back to the ithposition. The argument is similar for δ/δη except for an extra �1 to go to the rightof ψ in the exponent before the derivative is applied to η.

Rearranging the ordering of the operators to move all ψ(y i) to the right of ψ(x j )produces the minus sign of (�1)n2

. Hence the total number of�1’s becomes n2Cnand (�1)n2Cn D 1. Renaming xi and y i in ascending order recovers Eq. (11.123).

11.5Reduction Formula

11.5.1Scalar Fields

Here we introduce the LSZ (Lehmann–Symanzik–Zimmermann) reduction for-mula [260], which relates the scattering matrix to n-point vacuum correlation func-tions. We first consider the case of scalar particles. It is straightforward to gen-eralize to other types of particles. We assume the weak asymptotic condition of

Page 359: Yorikiyo Nagashima - Archive

334 11 Path Integral Approach to Field Theory

Eq. (6.26) with Z set to be 1, namely

limt!�1

h f j Oφ(x )jii D h f j Oφ inout

(x )jii (11.125)

The “in” and “out” states are free states, i.e. on the mass shell, and can be describedby the free field. For scalar fields, the free-field operator can be expressed as

Oφ inout

(x ) DZ

d3q(2π)3

p2ωq

�a in

out(q)e�i q�x C a†

inout

(q)e i q�x�

(11.126)

The scattering matrix is given as

S f i D hout W f ji W ini D hin W f jS ji W ini D hout W f jS ji W outi (11.127)

Then we can define the in (out) state vector of definite momentum as

jq inouti Dp

2ωa†in

out(q)jΩ i D �i

p2ω

Zd3x eq(x )

!@t Oφ in

out(x )jΩ i

D limt!�1t!C1

��iZ

d3x f q(x ) !@t Oφ(x )jΩ i

� (11.128)

where

eq De�i q�xp

2ω, f q D e�i q�x , A

!@t B D A@t B � (@t A)B (11.129)

Here, although not explicitly written, we assume that the operators appear sand-wiched between the free states, as they should be consistent with the conditionEq. (11.125). Next, we wish to convert the three-dimensional integral into a Lorentz-invariant four-dimensional integral. To do that we use the identity

limt!1� lim

t!�1

� Zd3 x A(x ) D

Z 1

�1d t@t

Zd3x A(x ) (11.130)

Consider, for example, an S-matrix element for the scattering of m particles withmomenta (q1, q2, � � � , qm) to n particles of momenta (p1, � � � , pn). We want to ex-tract the free field Oφ in

out(x ) from it. First note

hout W jα W ini D hout W jp

2ω a†in(q)jα � q W ini (11.131)

where we have used hj, jαi and jα� qi to denote states containing n, m and m�1particles, respectively. In jα � qi a particle having momentum q is taken out. Thenusing Eqs. (11.128) and (11.130), Eq. (11.131) becomes

hout W jα W ini D limt!�1

��iZ

d3x f q(x ) !@t hout W j Oφ(x )jα � q W ini

D iZ

d4x@t

hf q(x ) !@t hout W j Oφ(x )jα � q W ini

i� lim

t!1

�iZ

d3x f q(x ) !@t hout W j Oφ(x )jα � q W ini

� (11.132)

Page 360: Yorikiyo Nagashima - Archive

11.5 Reduction Formula 335

The second term is just a creation operator of the “out” state, or an annihilationoperator if it acts on the left, therefore

hout W ja†out(q)

p2ωq D hout W � p j , p j ja

†out(q)

p2ωq

D hout W � p j jaout(p j )a†out(q)

p2ω j

p2ωq

D�p

2ω j 2ωq�hout W � p j j

n[aout(p j ), a†

out(q)]C a†out(q)aout(p j )

oD 2ω j (2π)3δ3(p j � q)hout W � qj

Cp

2ω jp

2ωqhout W � p j ja†out(q)aout(p j ) (11.133)

The process can be repeated until a†out(q) goes to the left end and destroys the

vacuum. The term vanishes if hj contains no particles with momentum q. Thisterm is called a “disconnected diagram”, because one particle emerges unaffectedby the scattering process and is hence disconnected from the rest of the particles.

The first term on the rhs of Eq. (11.132) gives

D iZ

d4 x @t

hf q(x ) !@t hout W j Oφ(x )jα � q W ini

iD �i

Zd4x

˚@2

t f q(x )� f q(x )@2t

�hout W j Oφ(x )jα � q W ini

D iZ

d4 x˚(�r2 C m2) f q(x )C f q(x )@2

t

�hout W j Oφ(x )jα � q W ini

D iZ

d4 x˚

f q(x )(@2t � r

2 C m2)�hout W j Oφ(x )jα � q W ini (11.134)

where in deriving the third line we have used the fact that the plane wave functionf q(x ) satisfies the Klein–Gordon equation and the last line was obtained by partial

integration. We now have a Lorentz-invariant expression. Collecting terms, we find

hout W jα W ini D ( disconnected diagram )

C iZ

d4x f q(x )(@μ@μ C m2)hout W j Oφ(x )jα � q W ini(11.135)

We have extracted one particle from the abstract S-matrix and converted it into amatrix element of the field operator Oφ. Next, we extract the second particle withmomentum p from the “out” state. Disregarding the disconnected diagram, we get

hout W j Oφ(x )jα � q W ini D hout W � p jaout(p )p

2ω p Oφ(x )jα � q W ini

D hout W � p j Oφ(x )p

2ω p ain(p )jα � q W ini

C hout W � p jfaout(p )p

2ω p Oφ(x ) � Oφ(x )p

2ω p ain(p )gjα � q W ini

D hout W � p j Oφ(x )jα � q � p W ini

C iZ

d3x 0�

limt 0!1

�f �

p (x 0) !@t

0 hout W � p j Oφ(x 0) Oφ(x )jα � q W ini �

� iZ

d3x 0�

limt 0!�1

�f �

p (x 0) !@t

0 hout W � p j Oφ(x ) Oφ(x 0)jα � q W ini �

Page 361: Yorikiyo Nagashima - Archive

336 11 Path Integral Approach to Field Theory

(11.136a)

Disregarding again the disconnected diagram (the first term), the second and thethird terms can be combined to give

D iZ

d4x 0@t0�

f �p (x 0)

!@t

0 hout W � p jT [ Oφ(x 0) Oφ(x )]jα � q W ini�

D iZ

d4x 0 f �p (x 0)(@μ

0@μ 0C m2)hout W � p jT [ Oφ(x 0) Oφ(x )]jα � q W ini

(11.136b)

where T [� � � ] is a time-ordered product. The process can be repeated on both sidesuntil we extract all the particles. We can always choose the “in” and “out” momen-ta so that there are no disconnected diagrams. Then the final expression for thescattering matrix for m incoming particles and n outgoing particles is given by

hjS jαi D i mCnZ nY

j D1

d4 x 0j

mYkD1

d4 xk f �p j

(x 0j )(����!@μ

0@μ 0C m2) j

� hΩ jThOφ(x 0

1) � � � Oφ(x 0n) Oφ(x1) � � � Oφ(xm )

ijΩ i( ���@μ@μ C m2)k f qk (xk )

(11.137)

where we have arranged the incoming part of the operation to the right of thecorrelation function, although how it is ordered is irrelevant for the case of bosonfields. This is called the LSZ reduction formula to convert a S-matrix to a vacuumexpectation value of an mC n-point correlation function. Note, we have derived theabove formula in the Heisenberg representation. The field Oφ is that of interactingparticles, which contains all the information on the interaction and the vacuumstate is also that of the full, interacting theory.

As the n-point time-ordered correlation function, in turn, can be expressed as afunctional derivative [see Eq. (11.37)], the S-matrix expressed in terms of generatingfunctionals is given by

hjS jαi D i mCnZ nY

j D1

d4 x 0j

mYkD1

d4 xk f �p j

(x 0j )(����!@μ

0@μ 0 C m2) j

�1i

�mCn δmCn Z0[ J ]δ J(x1) � � � δ J(xmCn)

ˇˇ

JD0

( ���@μ@μ C m2)k f qk (xk )

(11.138)

Page 362: Yorikiyo Nagashima - Archive

11.5 Reduction Formula 337

11.5.2Electromagnetic Field

Since the free electromagnetic field (in the Lorentz gauge) obeys the Klein–Gordonequation with mass D 0 and is expressed as

Aμinout

(x ) DZ

d3k(2π)3

p2ωk

hεμ(k, λ)a in

out(k, λ)e�i k �x

C εμ�(k, λ)a†inout

(k, λ)e i k �xi (11.139)

the reduction formula of the electromagnetic field is almost the same as that of thescalar field, except for the addition of the polarization vector. Namely,

hjS jαi D i mCnZ nY

j D1

d4x 0j

mYkD1

d4xk f �p j

(x 0j )ε

�� j

(p j , λ j )(����!@μ

0@μ 0) j

� hΩ jThOA�1 (x 0

1) � � � OA�n (x 0n) OAσ1 (x1) � � � OAσm (xm)

ijΩ i

� ( ���@μ@μ)k εσk (qk , λk ) f qk (xk ) (11.140)

We can express the vacuum correlation function in terms of functional derivatives.Referring to Eq. (11.37),

hjS jαi DZ nY

j D1

d4x 0j

mYkD1

d4xk f �p j

(x 0j )ε

�� j

(p j , λ j )(����!@μ

0@μ 0) j

�δmCn Z [ J ]

δ J�1 (x 01) � � � δ J�n (x 0

n)δ J σ1 (x1) � � � δ J σm (xm )

� ( ���@μ@μ)k εσk (qk , λk ) f qk (xk ) (11.141)

11.5.3Dirac Field

It is straightforward to apply the reduction formula to the Dirac field. The differ-ence between the scalar field is that we have to be careful about the ordering of theoperators since the Dirac field is a four-component spinor and anticommuting. Asbefore, we first postulate the asymptotic condition as

limt!�1t!1

h f j Oψ(x )jii D h f j Oψ(x ) inoutjii (11.142)

where the free Dirac field can be expressed as

Oψin(x ) DZ

d3 p

(2π)3p

2Ep

hU r

p (x )ain(p , r)C V rp (x )b†

in(p , r)i

(11.143a)

U rp (x ) D u r (p )e�i p �x , V r

p (x ) D vr (p )e i p �x (11.143b)

Page 363: Yorikiyo Nagashima - Archive

338 11 Path Integral Approach to Field Theory

(i!6@ � m) Oψin(x ) D 0 , Oψ in(x )(�i

�6@ � m) D 0 (11.143c)

ain(p , r) DZ

d3xU

rp (x )p2Ep

γ 0 Oψin(x ) , a†in(p , r) D

Zd3x Oψ in(x )γ 0

U rp (x )p2Ep

(11.143d)

bin(p , r) DZ

d3x Oψ in(x )γ 0V r

p (x )p2Ep

, b†in(p , r) D

Zd3x

Vrp (x )p2Ep

γ 0 Oψin(x )

(11.143e)

The reduction formula is obtained following the same logical flow as the scalarfield:

hout W p jq W ini D hout W p jq

2Eq a†in(q, r)jΩ W ini

D �hout W p jn

a†out(q, r)

q2Eq �

q2Eq a†

in(q, r)ojΩ W ini

C hout W p jq

2Eq a†out(q, r)jΩ W ini

D �( limt!1� lim

t!�1)Z

d3x hout W p j Oψ(x )jΩ W iniγ 0U rq (x )

C disconnected part

D iZ

d4x hout W p ji@0

nOψ(x )γ 0U r

q (x )ojΩ W ini C (� � � )

D iZ

d4x hout W p jnOψ(x )i

�@0 γ 0U r

q (x )C Oψ(x )i!@0 γ 0U r

q (x )o

� jΩ W ini C (� � � )

(11.144)

We convert the time derivative of U rq (x ) to a space derivative using the equation of

motion Zd4x Oψ(x )i

!@0 γ 0U r

q (x ) DZ

d4x Oψ(x )f�i γ �!r C mgU r

q (x )

D

Zd4x Oψ(x )fi γ �

�r C mgU r

q (x )(11.145)

where the last equation is obtained by partial integration. Then the first term in thelast equality of Eq. (11.144) is combined to become

hout W p jq W ini D iZ

d4x hp j Oψ(x )jΩ W ini(i �6@ C m)U r

q (x ) (11.146)

Page 364: Yorikiyo Nagashima - Archive

11.5 Reduction Formula 339

Similar calculations can be carried out and give

p2ωhout W f ja†

in(q)ji W ini D (�i)Z

d4x h f j Oψ(x )jii(�i �6@ � m)x U r

q (x )q2Ep hout W f jaout(p )ji W ini D (�i)

Zd4 y U

sp (y )(i

!6@ � m)y h f j Oψ(y )jii

p2ω0hout W f jb†

in(q0)ji W ini D iZ

d4x 0 Vmq0 (x 0)(i

!6@ � m)x 0h f j Oψ(x 0)jiiq

2Ep 0hout W f jbout(p 0)ji W ini D iZ

d4 y 0 h f j Oψ(y 0)jii(�i �6@ � m)y 0 V n

p 0 (y 0)

(11.147)

For the one-particle transition q ! p , we have

hout W p jq W ini

D (�i)2Z

d4x d4 y U p (y )(i!6@ � m)y hΩ jT [ψ(y )ψ(x )]Ω i

� (�i �6@ � m)x Uq(x )

D (�i)2Z

d4x d4 y U p (y )(i!6@ � m)y

��

1i

δδη(y )

��1i

δδη(x )

�Z0[η, η](�i

�6@ � m)x Uq(x )

(11.148)

We now have to extend the reduction formula to many-particle systems. As theordering of the operators is important, we begin with the transition amplitude. Inthe transition amplitude, particles are arranged in such a way that the subscripts in-crease from left to right and antiparticles are placed to the right of particles. Namely,denoting the momenta by

q1 C � � � C qm„ ƒ‚ …particles

C q1 C � � � C qm0„ ƒ‚ …antiparticles

! p1 C � � � C pn„ ƒ‚ …particles

C p1 C � � � C p n0„ ƒ‚ …antiparticles

the transition amplitude can be written as

hout W jα W ini D hout W p1, � � � , pnI p1, � � � , p n0 jq1, � � � , qmI

q1, � � � , qm0 W ini

D (i)m0Cn0(�i)mCn

Z mYiD1

n0Yi 0D1

d4xi d4 y 0i 0

nYj D1

m0Yj 0D1

d4x 0j 0 d4 y j

� U p (y j )(i!6@ � m)y j V q0 (x 0

j 0 )(i!6@ � m)x 0

j 0

� hΩ jT [ψ(y1) � � � ψ(yn)ψ(y 01) � � � ψ(y 0

n0)ψ(x1) � � � ψ(xm)

ψ(x 01) � � � ψ(x 0

m0 )]jΩ i(�i �6@ � m)xi Uq(xi )(�i

�6@ � m)y 0

i0Vp 0 (y 0

i 0)

(11.149)

Page 365: Yorikiyo Nagashima - Archive

340 11 Path Integral Approach to Field Theory

Here, the ordering of the operators matches that in hout W jα W ini exactly. Rear-rangement of the operators produces additional sign changes, of course. To expressthe above formula in terms of functional derivatives, the vacuum expectation valueof the operator T-product can be replaced by the functional derivatives

Oψ(y i)$1i

δδη(y i)

, Oψ(xi )$ �1i

δδη(xi)

(11.150)

provided we keep the ordering of the functional derivatives exactly as that of theoperators in the T-product [see Eq. (11.124)]. In the following, we only give the casefor n D m , n0 D m0 D 0:

hout W p1 � � � pnjq1 � � � qn W ini DZ nY

iD1

d4xi d4 y i U pi (y i)(i!6@ � m)yi

�δ2n Z0[η, η]

δη(x1) � � � η(xn)δη(y1) � � � δη(yn)(� �6@ � m)xi Uqi (xi ) (11.151)

where the phases are exactly canceled first by that of the functional derivatives andsecond by moving δ/η(y i)’s to the right of δ/δη(x )’s.

11.6QED

We are now ready to apply the method of the path integral to actual calculationof QED processes. We shall derive the scattering amplitude for reactions such asthe Compton and e–μ scatterings and show that they produce the same results ascalculated using the canonical quantization method.

11.6.1Formalism

We now want to write down the scattering matrix describing the interactions ofelectrons and photons. In the previous section, we learned to express the scatteringmatrix in the form of the LSZ reduction formula. As it is expressed in terms ofthe functional derivatives of the generating functional of the interacting fields, theremaining task is to derive the generating functional of interacting fields involvingthe Dirac and electromagnetic fields.

In QED, which describes phenomena involving the electron (a Dirac field) inter-acting with the electromagnetic field, the whole Lagrangian LQED can be obtainedfrom the free Lagrangian of the electron and the electromagnetic field by replacingthe derivative @μ by the covariant derivative Dμ D @μ C i qA μ, a procedure called“gauging”.

Page 366: Yorikiyo Nagashima - Archive

11.6 QED 341

First we write the gauge-invariant full Lagrangian for the electron interactingwith the electromagnetic field (photon).

Lfree@μ!Dμ�����! LQED D �

14

Fμν F μν C ψ(i D/ � m)ψ (11.152a)

D/ D γ μ(@μ C i qA μ), (q D �e)

LQED D L0 CLint (11.152b)

L0 D �14

Fμν F μν C ψ(i 6@� m)ψ

Lint D �qψγ μ ψA μ

The action of the generating functional of the interacting electrons and photonsshould include, in addition to L of Eq. (11.152b), a gauge-fixing term

�1

2λ(@μ Aμ)2 (11.153)

and source terms, and hence is not gauge invariant. We adopt the Feynman gaugeand set λ D 1. Then we can write down the generating functional:

Z [η, η, J ] D NZ

DψDψDA exp�

iZ

d4x

�L0 CLint �

12

(@μ Aμ)2 C Jμ Aμ C ηψ C ψη � (11.154)

where N is a normalization constant to make Z [0, 0, 0] D 1. After the standardmanipulation to integrate the bilinear terms [refer to Eqs. (11.66) and (11.120)], thegenerating functional Z0[η, η, J μ ] for the free fields, namely those with Lint D 0,can be expressed as

Z0[η, η, J ] D exp��iZ

d4x d4 y

�12

J μ(x )DFμν(x � y ) J ν(y )C η(x )SF(x � y )η(y ) �

(11.155)

We learned in Sect. 11.2.5, that the generating functional of the interacting fieldscan be obtained by replacing the field in the interaction Lagrangian by its func-tional derivatives [see Eq. (11.53)]. In QED, we replace the fields in the interactionLagrangian by

ψ ! �1i

δδη

, ψ !1i

δδη

, A μ !1i

δδ J μ (11.156)

Page 367: Yorikiyo Nagashima - Archive

342 11 Path Integral Approach to Field Theory

Therefore, the generating functional for QED is given by

Z [η, η, J ] D N exp��iZ

d4x��

1i

δδη

�(qγ μ)

�1i

δδη

��1i

δδ J μ

��� Z0[η, η, J ]

D N exp�Z

d4x��

1i

δδη

�(�i qγ μ)

�1i

δδη

��1i

δδ J μ

��� Z0[η, η, J ] (11.157)

We retain the minus sign in front of the coupling constant because the action isi S D i

Rd4xLint D

Rd4xf�iV(x )g and the perturbation expansion is carried out

in powers of minus the coupling constant, i.e. in (�i q)n.

11.6.2Perturbative Expansion

The exponential of the interaction part in Eq. (11.157) can be Taylor-expanded inthe coupling strength jqj

Z [η, η, J ] D N1X

nD0

1n!

�Zd4x

��

1i

δδη

�(�i qγ μ)

�1i

δδη

��1i

δδ J μ

� n

Z0[η, η, J ]

(11.158)

Note, η and η are actually four-component spinors and the above expression is anabbreviation of

��

δδη

�γ μ�

δδη

�D

4Xa,bD1

��

δδηa

�(γ μ)ab

�δ

δηb

�(11.159)

In the following we calculate the first- and second-order terms explicitly and ex-press these results graphically. The Feynman rules that establish the connectionbetween the algebraic and graphical representations are very useful in understand-ing the so far very formal treatment of the functional derivatives and are of greathelp in performing actual calculations.

Page 368: Yorikiyo Nagashima - Archive

11.6 QED 343

11.6.3First-Order Interaction

The first-order interaction term is given by

Z1 � N�Z

d4x��

1i

δδη(x )

�(�i qγ μ)

�1i

δδη(x )

�1i

δδ J μ(x )

� Z0[η, η, J ] (11.160a)

D N�Z

d4 x��

1i

δδηa(x )

���i qγ μ

ab

��1i

δδηb(x )

��1i

δδ J μ(x )

� exp�� i

Zd4z2 d4z3

�12

J�(z2)DF�σ(z2 � z3) J σ(z3)

C η l (z2)fSF(z2 � z3)gl m ηm(z3) �

(11.160b)

D N�Z

d4 x��

1i

δδηa(x )

���i qγ μ

ab

�Zd4 z1d4z3DFμσ(x � z1) J σ(z1)fSF(x � z3)gbm ηm(z3)

� exp�� i

Zd4w2d4w3

�12

J α(w2)DFα(w2 � w3) J (w3)

C η s(w2)fSF(w2 � w3)gs t η t (w3) �

D

�C i N

Zd4x d4z1 DFμσ(x � z1) J σ(z1)

˚�� i qγ μ

ab

�fSF(x � x )gba

�C N

Zd4x d4z1d4z2 d4z3 DFμσ(x � z1) J σ(z1)

��� i qγ μ

ab

�fSF(x � z3)gbm ηm(z3)η l (z2)fSF(z2 � x )gl a

�Z0[η, η, J ]

D

�i N

Zd4 x d4z1Tr

�(�i qγ μ)SF(x � x )

DFμν(x � z1) J ν(z1)

C (�1)NZ

d4x d4z1 d4z2d4z3 DFμν(x � z1) J ν(z1)

� η(z2)SF(z2 � x )(�i qγ μ)SF(x � z3)η(z3)�

Z0[η, η, J ] (11.160c)

The first term vanishes because

Tr[γ μ SF(0)] DZ

d4 p(2π)4

Tr[γ μ( 6p � m)]p 2 � m2 � i ε

D

Zd4 p

(2π)4

4p μ

p 2 � m2 � i εD 0

(11.161)

Page 369: Yorikiyo Nagashima - Archive

344 11 Path Integral Approach to Field Theory

Note the extra (�1) phase in the second term of the last line, which was generatedby moving η to the left of η. This does not happen for boson fields. The finalexpression for the first-order generating functional is

Z1[η, η, J ] D �NZ

d4x d4z1 d4z2d4z3DFμν(x � z1) J ν(z1)

� η(z2)SF(z2 � x )(�i qγ μ)SF(x � z3)η(z3)Z0[η, η, J ]

(11.162)

The expression Z1/Z0 is pictorially illustrated in Figure 11.1.It represents several different processes. The reduction formula decides which

one to pick up. In Figure 11.1a, an electron is created by a source denoted by across at z3, goes upward in time, interacts at x with the electromagnetic field and isdestroyed by a source (or sink) at z2. Here, a photon is created at z1 and is absorbedby the electron. But in (b), the photon is emitted by the electron at x and destroyedby the source (sink). The electromagnetic field is real and hence the photon is itsown antiparticle. This is why the same source can act both as a source to create aphoton and as a sink (negative source) to destroy it. In Figure 11.1c, the sense oftime direction is reversed to denote a positron (electron’s antiparticle) created at z3

and destroyed at z2; otherwise it is the same as (a).The pictorial description of the expression is as follows:

1. Crosses represent sources or sinks (η, η, J ν).2. Dots represent interactions. “�i qγ μ” denotes an interaction in which a Dirac

spinor comes in and goes out, at the same time emitting or absorbing a photon.3. Lines with an arrow represent the electron propagator, SF(x � z3) going from

z3 to x and SF(z2 � x ) from x to z2. The arrow is directed downward if theline represents an antiparticle (positron). The spin state of the electron wassuppressed but can be attached if necessary.

4. Wavy lines represent the photon propagator DFμν(x � z1).

z2 z1

z3

x

z2

z1z3

x

(a) (b) (c) (d) (e)

t

x

z2

z1

z3

x

z3

z1

z2

x

z3

z1z2

x

Figure 11.1 First-order generating functionalof the interaction with three external sources.The same expression can represent variousprocesses. (a,b) An electron is created at low-er marked by a cross (z3), interacts at thevertex (x) and is destroyed at the upper cross(z2). A photon is created by a source at z1

and absorbed by the electron at the vertex (a)or emitted by the electron and destroyed bythe source (b). (c) The same as (a) except apositron is created at z3 and destroyed at z2.(d) Electron positron pair creation by a pho-ton. (e) Pair annihilation to a photon.

Page 370: Yorikiyo Nagashima - Archive

11.6 QED 345

11.6.4Mott Scattering

As a preparatory step to quantum field theory of the scattering matrix, we first de-scribe a semiclassical treatment of electron scattered by a fixed Coulomb potential.This applies to the scattering of electrons by heavy nuclei. The nucleus is very heavyand its recoil can be neglected in low-energy scattering. Then the electric field cre-ated by the nucleus can be treated classically and only the electrons are treatedusing the quantized field. The electromagnetic field is not replaced by a functionalderivative but is a classical field; it is represented by a Coulomb potential

A μ D (φ, 0, 0, 0), φ DZ q0

4π1r

(11.163)

where the electric charge of the nucleus is taken to be Z q0 and that of the electronq D �e. The generating functional becomes

Z1 � N�Z

d4x��

1i

δδη(x )

�(�i qγ μ)

�1i

δδη(x )

�A μ(x )

� exp��iZ

d4z2 d4z3 η(z2)SF(z2 � z3)η(z3)�

(11.164a)

D N��iZ

d4x Tr[(�i qγ μ)SF(x � x )]A μ(x )

C

Zd4x d4z2 d4z3η(z2)SF(z2 � x )(�i qγ μ)SF(x � z3)η(z3)A μ(x )

� exp��iZ

d4w2d4w3 η(w2)SF(w2 � w3)η(w3)�

(11.164b)

The first term does not contribute. The reduction formula Eq. (11.148) for an elec-tron of momentum p coming in and one of p 0 going out becomes

S f i D hout W p 0jp W ini

D (�i)2Z

d4w2d4 w3U p 0 (w2)(i!6@ � m)w2

�1i2

�δ2Z1[η, η]

δη(w3)δη(w2)

ˇˇ

ηDηD0(�i �6@ � m)w3 Up (w3)

D

Zd4x d4 w2d4 w3U p 0 (w2)(i

!6@ � m)w2

� SF(w2 � x )(�i qγ μ)SF(x � w3)(�i �6@ � m)w3 Up (w3)A μ(x )

(11.165)

From the properties of the Feynman propagators and the Dirac wave functions

(i!6@ � m)w2 SF(w2 � x ) D δ4(w2 � x )

SF(x � w3)(�i �6@ � m)w3 D δ4(x � w3)

Up (x ) D u r(p )e�i p �x, U p 0 (x ) D u s(p 0)e i p 0�x(11.166)

Page 371: Yorikiyo Nagashima - Archive

346 11 Path Integral Approach to Field Theory

the scattering amplitude becomes

S f i D

Zd4x e�i(p�p 0)�x us(p 0)(�i qγ μ)u r (p )A μ(x )

D �2π i δ(E 0 � E )Z

d3x ei(p�p 0)�x�

Z qq0

4π1r

�us(p 0)γ 0u r(p )

D �2π i δ(E 0 � E )�

qq0

jqq0j

�4πZ α

us(p 0)γ 0u r(p )q2

q2 D (p � p 0)2 D 4jp j2 sin2 θ2

(11.167)

where α D e2/4π D 1/137 is the fine structure constant and θ is the scatteringangle. Equation (11.167) has reproduced the Mott scattering cross section formulaEq. (6.136).

11.6.5Second-Order Interaction

Referring to Eq. (11.162), the second-order interaction is given by

Z2 �N2!

�Zd4x

��

1i

δδη(x )

�(�i qγ μ)

�1i

δδη(x )

��1i

δδ J μ(x )

� 2

� Z0[η, η, J ]

DN2

�Zd4 y

��

1i

δδη(y )

�(�i qγ�)

�1i

δδη(y )

��1i

δδ J�(y )

��

Zd4x d4 z1d4z2d4z3 η(z2)SF(z2 � x )(�i qγ μ)

� SF(x � z3)η(z3)DFμν(x � z1) J ν(z1)Z0[η, η, J ]�

�N2

�Zd4 y

��

1i

δδη(y )

�(�i qγ�)

�1i

δδη(y )

��1i

δδ J�(y )

�Zd4x d4z1d4z2d4 z3C [η, η, J ]Z0[η, η, J ]

�(11.168)

where we have separated the part of the fields already pulled out of Z0[η, η, J ]by the first interaction derivatives and abbreviated it as C [η, η, J ]. Then the effectof the second interaction derivatives on Z0 is to pull an electron or photon linewith external source out of Z0 and connect it to the second vertex at y. Namely,it adds an incoming or outgoing particle to the vertex at y. On the other hand ifthe derivatives act on C [η, η, J ], they connect either an electron line or photon linefrom the vertex x to the vertex y. Namely it draws an internal line between the twovertices. Depending on which of the terms (C [η, η, J ] or Z0[η, η, J ]) are acted onby the three functional derivatives, there are six distinct processes, which will beexplained one by one.

Page 372: Yorikiyo Nagashima - Archive

11.6 QED 347

Disconnected Diagrams When all three derivatives act on Z0[η, η, J ], they pulldown another trio of external lines and there are no internal lines to connect xand y. The contribution is

Z21 D12

��

Zd4 x d4z1d4z2d4z3 η(z2)SF(z2 � x )(�i qγ μ)

� SF(x � z3)η(z3)DFμν(x � z1) J ν(z1)�

��

Zd4 y d4w1d4 w2d4 w3η(w2)SF(w2 � y )(�i qγ�)

� SF(y � w3)η(w3)DF�σ (y � w1) J σ(w1)�

Z0[η, η, J ]

D12

�Z1

Z0

�2

Z0 (11.169)

where Z1 is defined by Eq. (11.162). This is two separate processes generated bytwo independent “trios of external sources”, each one identical to the first-orderinteraction, and is depicted in Fig. 11.2.

z2

z1

z3

x

w2

w1

w3

y

(a) (b)

Figure 11.2 Among the second-order processes, these discon-nected diagrams describe two independent processes, each oneidentical to the first-order interaction in Fig. 11.1.

This is the so-called disconnected diagram and describes two independent pro-cesses, each of which has already appeared in the lower order perturbation. Wecan eliminate the disconnected diagrams by using W [η, η, J ] D �i ln Z [η, η, J ]instead of Z [η, η, J ], which will be explained later in Sect. 11.6.7.

Four External Electron Lines This is the case when both δ/δη and δ/δη act onZ0 and pull down two electron lines while δ/δ J acts on C and connects the twovertices by a photon line. This contributes to electron–electron or Bhabha (e�–eC)

z2

z3

w2

w3

x y

(a) (b) (c)

z2

z3 w2

w3

x y

z2z3

w2 w3

x

y

Figure 11.3 Four-point functions: Diagrams that contribute to (a) electron–electron or (b,c)Bhabha (e�– eC) scattering.

Page 373: Yorikiyo Nagashima - Archive

348 11 Path Integral Approach to Field Theory

scattering (see Fig. 11.3):

Z22 DN2

�Zd4 y

��

1i

δδη(y )

�(�i qγ�)

�1i

δδη(y )

��1i

δδ J�(y )

��

Zd4 x d4z1d4z2d4z3

˚η(z2)SF(z2 � x )(�i qγ μ)

� SF(x � z3)η(z3)DFμν(x � z1) J ν(z1)�

Z0[η, η, J ]�

Di2

NZ

d4x d4 y d4z2 d4z3d4w2d4w3

�˚

η(z2)SF(z2 � x )(�i qγ μ)SF(x � z3)η(z3)DFμ�(x � y )

� η(w2)SF(w2 � y )(�i qγ�)SF(y � w3)η(w3)�

Z0 (11.170)

The S-matrix of the process will be discussed later.

Compton-Type Process Here δZ0/δ J and δZ0/δη pull down one more photonline and an electron line while the remaining δC/δη connects the two vertices byan electron line. This gives the first term in the next equation and can representCompton scattering, pair annihilation into two gammas or pair creation by twophotons. Another one where the role of η and η is exchanged gives the secondterm:

Z23 DN2

�Zd4 y

��

1i

δδη(y )

�(�i qγ�)

�1i

δδη(y )

��1i

δδ J�(y )

��

Zd4 x d4z1d4z2d4z3 η(z2)SF(z2 � x )(�i qγ μ)

� SF(x � z3)η(z3)DFμν(x � z1) J ν(z1)Z0[η, η, J ]�

Di2

NZ

d4x d4 y d4z1 d4z3d4w1d4w3˚

J σ(w1)DFσ�(w1 � y )

�DFμν(x � z1) J ν(z1)� �

η(z3)SF(z3 � x )(�i qγ μ)

� SF(x � y )(�i qγ�)SF(y � w3)η(w3)

C η(w3)SF(w3 � y )

� (�i qγ�)SF(y � x )(�i qγ μ)SF(x � z3)η(z3)

Z0 (11.171)

The second term is identical to the first and compensates the factor two. This isseen if we exchange the variables x $ y , z3 $ w3, z1 $ w1 and μ $ �. Note, thediagrams (a–d) in Fig. 11.4 are described using the notation of the first term, butevery diagram is included in both terms as every variable is integrated and can berenamed. The S-matrix of the process will be treated later.

Vacuum Energy When all three derivatives act on C [η, η, J ], they connect all theexternal lines with each other and no external lines remain (Fig. 11.5a). This is a

Page 374: Yorikiyo Nagashima - Archive

11.6 QED 349

z1

w1

w3

z3z1

w1

z3

w3

y

x

(a) (b) (c) (d)

x y x y x y

z1 w1

w3z3 z1 w1

w3z3

Figure 11.4 Four-point functions: Diagrams that contribute to Compton-like processes. (a,b)Compton scattering, (c) pair annihilation and (d) pair creation of photons.

vacuum-to-vacuum transition. The expression has the form

Z24 Di2

NZ

d4x d4 y Tr�(�i qγ�)SF(y � x )(�i qγ μ)

� SF(x � y )DF�μ (x � y )Z0[η, η, J ]

(11.172)

It is absorbed by the normalization constant N to give Z2[0, 0, 0] D 1.

Photon Self-Energy When only the photon derivative acts on Z0, and the otherson C, the two electron lines already pulled down into C are connected to each oth-er to make a loop. This is called a photon self-energy diagram (Fig. 11.5b). Theexpression becomes

Z25 D �N2

Zd4x d4 y d4zd4 w J σ(w )DFσ�(w � y )

� Tr�(�i qγ�)SF(y � x )(�i qγ μ)SF(x � y )

� DFμν(x � z) J ν(z)Z0

(11.173)

This is a correction to the photon propagator. Note the extra minus sign for theloop. Whenever there are fermion loops, an extra sign (�1) is added, which comesfrom reordering the fermion fields. This appears in the higher order correction inthe vacuum expectation value of the two-point time-ordered correlation function.The whole propagator consists of functional derivative of Z0 and Z25 and it is given

w

x

y xx

yy

z

w

z(a) (b) (c)

Figure 11.5 (a) Vacuum-to-vacuum transition amplitude, (b)photon self-energy and (c) electron self-energy.

Page 375: Yorikiyo Nagashima - Archive

350 11 Path Integral Approach to Field Theory

by

hΩ jT [ OA μ(x ) OA ν(y )]jΩ i D�

1i

�2 δ2

δ J μ(x )δ J ν(y )Z [η, η, J ]

ˇˇ

ηDηD JD0

D

�1i

�2 δ2

δ J μ(x )δ J ν(y )fZ0 C Z1 C (� � � C Z25 C � � � )C � � � g jηDηD JD0

D i DFμν(x � y )CZ

d4 w d4zDFμ�(x � w )

� Tr�(�i qγ�)SF(w � z)(�i qγ σ)SF(z � w )

DFσν(z � y ) (11.174)

The factor 1/2 was canceled by taking second derivatives of the quadratic J σ J ν

term. Using the momentum representation of the propagators Eqs. (11.65) and(11.119) and integrating over w and z, the second term becomesZ

d4k(2π)4 e�i k �(x�y ) �gμ�

k2 C i ε

Zd4 p

(2π)4

� Tr�

(�i qγ�)1

6p � m C i ε(�i qγ σ)

16p� 6 k � m C i ε

��gσν

k2 C i ε

(11.175)

Then the momentum representation of the photon propagator including the sec-ond-order correction is expressed as

i QDFμν(k) D�i gμν

k2 C i εC�i gμ�

k2 C i ε(�i Π �σ)

�i gσν

k2 C i ε

�i Π �σ D �

Zd4 p

(2π)4

� Tr�

(�i qγ�)1

6p � m C i ε(�i qγ σ)

16p� 6 k � m C i ε

� (11.176)

Thus we have reproduced the radiative correction to the photon propagatorEq. (8.49b).

Electron Self-Energy When only one of the electron derivatives acts on Z0 and theothers on C, this pulls down an electron line from the source and connects it to thesecond vertex through both a photon and another electron line, i.e. this is the self-energy correction to the electron vertex (See Fig. 11.5c).

Problem 11.4

Derive the electron self-energy corresponding to Fig. 11.5c and show that its vacu-um expectation value becomes a higher order correction term to i SF(x � y )

Page 376: Yorikiyo Nagashima - Archive

11.6 QED 351

i SF(x � y )2nd order

D

Zd4w d4z SF(x � w )(�i qγ μ)SF(w � z)(�i qγ ν)SF(z � y )

� DFμν(w � z)

D

Zd4 p

(2π)4 e�i p �(x�y ) i6p � m C i ε

i Σe(p )i

6p � m C i ε

i Σe(p ) DZ

d4k(2π)4

�i gμν

k2(�i qγ μ)

i6p� 6 k � m C i ε

(�i qγ ν) (11.177)

which reproduces Eq. (8.66).

11.6.6Scattering Matrix

Electron–Muon Scattering To treat the interaction of the muon and the electron,we only need to add the muon Lagrangian to the QED Lagrangian. The former isidentical to that of the electron except the particle species must be distinguished.The interaction Lagrangian becomes

Lint D Lint,e CLint, μ D �qψ e γ μ ψe � qψμ γ μ ψμ (11.178)

where the electric charge of the electron and muon is q D �e. The second-ordergenerating functional Z2[η, η, J ] contains

12!

�iZ

d4xLint

�2

D12

�iZ

d4x (Lint,e CLint, μ )�2

D12

�iZ

d4xLint,e

�2

C12

�iZ

d4xLint, μ

�2

C iZ

d4xLint,e iZ

d4 yLint, μ

(11.179)

Therefore, the generating functional of e–μ scattering is readily obtained fromEq. (11.170):

Zeμ D N�Z

d4 y��

1i

δδη(y )

�(�i qγ�)

�1i

δδη(y )

��1i

δδ J�(y )

��

Zd4x d4z1d4z2d4z3

˚η(z2)SF(z2 � x )(�i qγ μ)

� SF(x � z3)η(z3)DFμν(x � z1) J ν(z1)�

Z0[η, η, J ]�

D iZ

d4x d4 y d4z2 d4z3d4w2d4w3

� fη(z2)SF(z2 � x )(�i qγ μ)SF(x � z3)η(z3)ge DFμ�(x � y )

� fη(w2)SF(w2 � y )(�i qγ�)SF(y � w3)η(w3)gμ Z0 (11.180)

Page 377: Yorikiyo Nagashima - Archive

352 11 Path Integral Approach to Field Theory

where we have distinguished the electron and the muon. To calculate the transi-tion amplitude of the electron–muon scattering, we assign the following kinematicvariables:

e(p1)C μ(p2) ! e(p3)C μ(p4) (11.181)

Application of the LSZ reduction formula Eq. (11.151) to e–μ scattering gives

hout W p3 p4jp1 p2 W ini D (�i)4Z

d4x1 d4x2d4x3 d4x4

� U p3 (x3)(i!6@ � me)x3 U p4 (x4)(i

!6@ � mμ)x4

�1i

�4 δ4Zeμ [η, η, J ]δημ(x2)δη e(x1)δηe(x3)δημ(x4)

ˇˇ

ηDηD JD0

� (�i �6@ � me)x1 Up1 (x1)(�i

�6@ � mμ)x2 Up2 (x2)

(11.182)

The effect of the 4-derivatives in the reduction formula on the generating func-tionals Zeμ in Eq. (11.180) is given by

δ4Zeμ [η, η, J ]δημ(x2)δη e(x1)δηe(x3)δημ(x4)

ˇˇ

ηDηD JD0

D iZ

d4x d4 yhfSF(x3 � x )(�i qγ μ)SF(x � x1)ge

� DFμ�(x � y ) fSF(x4 � y )(�i qγ�)SF(y � x2)gμi

(11.183)

Functional derivatives acting on Z0 do not contribute since they contain the sourcefunctions. As the effect of the Dirac operators in the reduction formula is only toconvert the outermost propagators to the corresponding wave functions, the finalexpression for the scattering amplitude becomes

hout W p3, p4jp1, p2 W ini

D iZ

d4x d4 y˚U p3 (x )(�i qγ μ)Up1 (x )

�e

� DFμ�(x � y )˚U p4 (y )(�i qγ�)Up2 (y )

�μ

(11.184)

The formula is depicted in Fig. 11.6a.When we insert the expressions

Up (x ) D u r (p )e�i p �x , U p 0 D us (p 0)(x )e i p 0�x (11.185a)

DFμν(x � y ) DZ

d4k(2π)4 e�i k �(x�y ) �gμν

k2 C i ε(11.185b)

(where we have recovered the subscript for the spin degree of freedom) in this for-mula, the integration of x and y produces two δ-functions, and after k integration

Page 378: Yorikiyo Nagashima - Archive

11.6 QED 353

p3

p1

p4

p2

x y

k’

k

p’

p

xy

k’

k

p’

p

x

y

(a) (b) (c)

Figure 11.6 (a) Electron–muon scattering and (b,c) Compton scattering.

the scattering amplitude becomes

hout W p3, p4jp1, p2 W ini D (2π)4δ4(p1 C p2 � p3 � p4)

� [us(p3)(�i qγ μ)u r(p1)]e�i gμν

(p1 � p3)2 C i ε[un(p4)(�i qγ ν)u m(p2)]μ

(11.186)

This reproduces the scattering amplitude Eq. (7.11) obtained using the standardcanonical quantization method.

Problem 11.5

Derive the Compton scattering amplitude and confirm it agrees with Eq. (7.51).Hint: As the photon source J μ has no directionality, it could serve either as a sourceor a sink. There are two ways to make the photon derivatives, producing two distinctscattering processes, as depicted in Fig. 11.6b,c.

11.6.7Connected Diagrams

In the expansion of the generating functional, we saw that it contained the discon-nected diagrams that describe two (or more) independent processes. We will showthat the generating functional W produces only connected diagrams. It is relatedto Z by

Z [ J ] D e i W [ J ] or W [ J ] D �i ln Z [ J ] (11.187)

We consider the case of the second-order interaction. The generating functional ofQED was given by Eq. (11.157), which we reproduce here:

Z [η, η, J ] D N exp�Z

d4x��

1i

δδη

�(�i qγ μ)

�1i

δδη

��1i

δδ J μ

��� Z0[η, η, J ] (11.188)

For ease of calculation we abbreviate the exponent by i L I and express Z [ J ] as fol-lows:

Z [ J ] � Z [η, η, J ] D N Z0Z�10 e i L I Z0[ J ] (11.189)

Page 379: Yorikiyo Nagashima - Archive

354 11 Path Integral Approach to Field Theory

Taking the logarithm of Z [ J ], we can Taylor expand to the second order of theinteraction Lagrangian i L I , assuming it is small:

ln Z [ J ] D ln N Z0 C ln�Z�1

0 e i L I Z0�

D ln N Z0 C ln˚1C Z�1

0

�e i L I � 1

�Z0�

D ln N Z0 C ln�

1C Z�10

�i L I C

12!

(i L I )2 C � � �

�Z0

D ln N Z0 C Z�10

�i L I C

12!

(i L I )2

Z0 �12

˚Z�1

0 (i L I ) Z0�2C � � �

D ln N C S0 C S1 C

�S2 �

12

S21

�C � � � (11.190)

Recovering the original notation

S0 D ln Z0

D �i2

Zd4 x d4 y

�J μ(x )DFμν(x � y ) J ν(y )C η(x )SF(x � y )η(y )

S1 D Z�1

0 (i L I )Z0

Di

Z0

Zd4xLint

��

1i

δδη

,1i

δδη

,1i

δδ J μ

�Z0[η, η, J ] D

Z1

Z0

S2 D Z�10

12!

(i L I )2Z0

D1

Z0

12!

�iZ

d4xLint

��

1i

δδη

,1i

δδη

,1i

δδ J μ

� 2

Z0[η, η, J ]

DZ2

Z0

12

S21 D

12

�Z1

Z0

�2

(11.191)

where Z1 and Z2 are the first- and second-order terms of Z [ J ] in the power ex-pansion of the interaction Lagrangian defined in Eqs. (11.162) and (11.168). Thedisconnected diagram that appeared in the second-order interaction was given by

Z21 D12

Z1Z0

�2Z0 [see Eq. (11.169)], which is exactly canceled in ln Z [ J ], as is seen

in the parentheses in the last equation of Eq. (11.190). Higher order disconnecteddiagrams can also be shown to be canceled by the corresponding products of lowerorder contributions.

11.7Faddeev–Popov’s Ansatz*

This section is provided to explain why a ghost field is necessary in the covariantformalism of non-Abelian gauge theory. The discussion is formal and mathemati-cal. The reader is advised to read Chap. 18 first and then come back here.

Page 380: Yorikiyo Nagashima - Archive

11.7 Faddeev–Popov’s Ansatz* 355

11.7.1A Simple Example*1)

The concept of Faddeev and Popov’s gauge-fixing method is rather hard to grasp.However, the method constitutes an important concept for handling symmetry inthe path integral and becomes an indispensable tool when we extend our discus-sion to non-Abelian gauge theory. Therefore we shall make a small detour, spend-ing a few extra pages rederiving the gauge-fixing condition that could be phrasedin one sentence for Abelian gauge theory. Perhaps it helps if we do a simple exer-cise that follows their strategy closely to isolate the redundant degrees of freedom.Consider an integral

W D

“dx d y G(x , y ) D

“d r G(r) , r D

qx2 C y 2 (11.192)

Here, we assume G(x , y ) is a function of r and has rotational symmetry. The natureof the symmetry is apparent in polar coordinates:

W D

Zdθ

Zd r r G(r) (11.193)

The integral is invariant under coordinate rotations

r D (r, θ )! rφ D (r, θ C φ) (11.194)

The essential part of the integrand that does not change by rotation is isolated as afunction of r. The symmetry manifests itself in that it does not depend on θ andthat θ integration merely adds an integration volume over the symmetry space,which is the cause of an infinity in the path integral. Our aim is to embed thissymmetry condition in the original coordinates. To do that we insert

1 DZ

dφδ(φ � θ ) (11.195)

Then Eq. (11.193) becomes

W D

Zdφ

Zd r dθ δ(φ � θ )r G(r) (11.196)

By this insertion, the second integration has been reduced from a two-dimensionalto a one-dimensional one by the delta function, which constrains the two-dimen-sional integration to be carried out at a fixed angle θ D φ.

More generally, a condition f (r) D 0 that fixes the angle at some value of φ maybe used. φ can vary as a function of r as long as f (r) D 0 gives a unique solutionof φ for a given value of r. As

Rdφδ[ f (r)] ¤ 1, we define a quantity Δ f (r) by

Δ f (r)Z

dφ δ[ f (r φ)] D 1 (11.197)

1) Here we closely follow the arguments of Cheng and Li [97].

Page 381: Yorikiyo Nagashima - Archive

356 11 Path Integral Approach to Field Theory

Then

[Δ f (r)]�1 �

Zdφ δ[ f (r φ)] D

Zdφ

�d fdφ

��1

δ[φ � θ (r)]

) Δ f (r) D@ f (r)@θ

ˇˇ

f D0

(11.198)

Δ f (r) itself is invariant under the rotation because

[Δ f (rφ0 )]�1 D

Zdφδ[ f (r φCφ0)] D

Zdφ00δ[ f (rφ00 )] D [Δ f (r)]�1

(11.199)

Insertion of Eq. (11.197) in Eq. (11.192) gives

W D

Zdφ Wφ D

Zdφ

�Zd rG(r)Δ f (r)δ[ f (r)]

�(11.200)

Wφ is rotationally invariant because

Wφ0 D

Zd r G(r)Δ f (r)δ[ f (r φ0 )] D

Zd r 0G(r0)Δ f (r0)δ[ f (r 0)]

D Wφ

(11.201)

where r 0 D (r, φ0) and we have used the fact that both G(r) and Δ f (r) are invariantunder rotation. We have succeeded in isolating the symmetry-independent volumefactor from the symmetry-invariant integral.

11.7.2Gauge Fixing Revisited*

In gauge theories, we consider a local gauge transformation2)

A μ ! Aωμ (11.202a)

Aωμ D U(ω)A μU�1(ω)�

ig

U(ω)@μU�1(ω) (11.202b)

U(ω) D exp

"�i

NXaD1

ωa (x )Ta

#(11.202c)

In the following discussion, we consider only an infinitesimal transformation, butthe conclusion is general. Then the transformation is written as

Aμa

ωD Aμ

a C f abc ωb Aμc C

1g

@μ ωa (11.203)

2) Gauge transformation properties are discussed in detail in Chap. 18. Here, it suffices to acceptthat general infinitesimal gauge transformations are given by Eq. (11.203).

Page 382: Yorikiyo Nagashima - Archive

11.7 Faddeev–Popov’s Ansatz* 357

where a, b, c (a, b, c D 1 � � �Nc) are color indices for non-Abelian gauge transfor-mation and f abc’s are structure constants of the unitary group representation. ForAbelian groups, f abc D 0. For a given set of fields Aμ

a(x ) in the gauge-invariantLagrangian, the action is constant on any path of Aμ ω

a (x ) that can be obtained bythe gauge transformation U(ω). This will naturally give a infinite volume becausethe path integral is over the whole range of A, including Aω ’s that are gauge trans-formed from A. We want to separate this part of the path integral. The procedurewill be exactly the same as in the simple example we have just discussed. First wehave to fix the gauge at some value. The gauge condition is expressed genericallyas

ga (A μ) D 0 , i.e. ga(Aωμ)jωD0 D 0 (11.204)

Some commonly used gauge-fixing conditions may be represented in the form

Lorentz gauge: @μ Aμa D 0 (11.205a)

Coulomb gauge: r � Aa D 0 (11.205b)

Axial gauge: A3a D 0 (11.205c)

Temporal gauge: A0a D 0 (11.205d)

The generating functional of the gauge fields is expressed as

Z DZ

DAei S [A] (11.206)

Then we insert

Δ[Aω ]Z

Dωδ[g(Aω)] D 1 (11.207)

which is analogous to Eq. (11.197) and defines Δ[Aω ]. The generating functional istransformed to

Z DZ

DAei S [A]Z

Dω Δ[Aω ]δ[g(Aω)]

DADω D3Y

νD0

NcYaD1

Yx�

dAνa(x�)

NcYbD1

Yy σ

dωb(y σ)(11.208)

Notice, however, that Eq. (11.207) is a functional integral and “the delta functional”should be considered as an infinite product of δ-functions, one for each space-timepoint x:

δ[g(Aω)] DNcY

aD1

Yx μ

δ(ga(Aω(x μ))) (11.209)

Page 383: Yorikiyo Nagashima - Archive

358 11 Path Integral Approach to Field Theory

Namely Eq. (11.207) should read

1 D

NcYa

Zdωa δ[ga(Aω)]

!det

�@ga

@ωb

D limN!1

NYn

Zdωn δ[gn(Aω)]

@(g1, � � � , gN )@(ω1, � � � , ωN )

(11.210)

where n includes both color and the discrete space-time point. Then

Δ[A]�1 D

NcYaD1

Yx�

Zdωa(x�)δ(ga(Aω(x�)))

D

NcYaD1

Yx

Zdga(x )δ(ga(Aω(x )))

„ ƒ‚ …D1

@[ω1(x ), � � � , ωNc (x )]@[g1(x ), � � � , gNc (x )]

� det�

δωδg

�yDxgD0

(11.211)

The last step defines the functional determinant of the continuous matrix @ωa (x )/@gb(y ), with rows labeled by (a, x ) and columns by (b, y ). The gauge conditionga(A(x )) D 0 can be rewritten as ga(Aω(x ))jωD0 D 0. Δ[A] is then expressed as

Δ[A] D det�

δg(Aω)δω

�ωD0

for A satisfying g(A) D 0 (11.212)

The functional determinant in Eq. (11.212) can be calculated using Eq. (11.203).Clearly, Δ(Aω) is gauge invariant

Δ[A] D Δ[Aω ] (11.213)

because, in Eq. (11.207), if we perform another gauge transformation ga(Aω) !ga(AωCω0

), the group nature of the measure means Dω D D(ω C ω0) and hence

Δ[AωCω0] D

ZDωδ[g(AωCω0

)] DZ

D(ω C ω0)δ[g(AωCω0)]

D

ZDω00δ[g(Aω00

)] D Δ[Aω ](11.214)

At this stage, we can look at the generating functional Z in Eq. (11.208) andchange the order of DA and Dω. Since e i S [A], Δ[Aω ] are all gauge invariant, theycan be replaced by e i S [Aω ], Δ[Aω ]. Now we change the integration variable to DAω ,but the new integration variable can be renamed back to A, leading to

Z DZ

DAei S [A]Z

Dω Δ[Aω ]δ[g(Aω)]

D

ZDω

�ZDAei S [A]Δ[A]δ[g(A)]

� (11.215)

Page 384: Yorikiyo Nagashima - Archive

11.7 Faddeev–Popov’s Ansatz* 359

where we have been able to take out the factor Dω since the integral in large squarebrackets is independent of ω. This is the Faddeev–Popov ansatz [136]. The integra-tion over ω simply gives an overall multiplicative constant to Z and can be absorbedin the normalization constant. Thus we have successfully isolated the integrationover redundant degrees of freedom originating from the gauge symmetry. In orderto obtain a more tractable expression, however, we go one step further to manipu-late Eq. (11.207) before we put it back into the generating functional.

The identity Eq. (11.207) can be generalized to

Δ(A)Z

DωNY

aD1

δ[ga(Aω) � ha(x )] D 1 (11.216)

where ha(x ) is a set of arbitrary functions. This merely redefines ga to make thegauge condition ga(Aω(x )) D ha(x ). As ha(x ) is independent of Aμ , Δ[A] is un-affected. Now multiply both sides by an arbitrary functional H [h] D H [h1(x ), � � � ,hN (x )] and integrate over all ha(x ). The result is a more general identity:

Δ[A]

R DωDhQN

aD1 δ�ga(Aω)� ha(x )

H [h]R DhH [h]

D Δ[A]

R DωH [g(Aω)]R DhH [h]D 1

(11.217)

Now insert this identity instead of Eq. (11.207) in the generating functional Z, andas before change the order of DA and Dω. The same arguments we used in ob-taining Eq. (11.215) can isolate

R Dω and we obtain

Z D Δ[A]

R DωR DhH [h]

ZDAei S [A]H [g(A)] D N

ZDAei S [A]H [g(A)]Δ[A]

(11.218)

It remains to evaluate Δ[A] and fix H [g(A)].

11.7.3Faddeev–Popov Ghost*

Lorentz GaugeTo proceed further, we need to fix the gauge condition. Let us choose the Lorentzgauge, namely

ga (A) D @μ Aμa (11.219)

Then for A satisfying @μ Aμ D 0

ga(Aω) D @μ

�Aμ

a C1g

@μ ωa C f abc ωb Aμc

D1g

@μ@μ ωa C f abc Aμc@μ ωb

(11.220)

Page 385: Yorikiyo Nagashima - Archive

360 11 Path Integral Approach to Field Theory

Hence

δg(x )δω(y )

ˇˇ

ωD0D

�1g

δab@μ@μ C f abc Aμc@μ

�δ(x � y )

) Δ[A] D det�

δab

g@μ@μ C f abc Aμ

c@μ

� (11.221)

For the Abelian field, f abc D 0 and Δ[A] does not depend on A. It can be absorbedin the normalization constant. However, for the non-Abelian field, Δ[A] depends onA and is therefore a dynamical object. It cannot be absorbed by the normalizationconstant. A convenient way is to rewrite it as a path integral in terms of anticom-muting c-numbers. We cannot use the ordinary commuting c-numbers becausethe determinant has to be inverse in that case. We refer to Eq. (11.106), which isreproduced as the next equation:Z

Dη(x )Dη(x ) exp��iZ

d4 x d4 y η�(x )A(x , y )η(y )�D det i A (11.222)

Using this formula, we have

Δ[A] DZ

Dη(x )Dη(x ) exp�

iZ

d4xLghost

�Lghost D �η�

a (x )[δab@μ@μ C g f abc Aμc (x )@μ ]ηb(x )

(11.223)

The factor 1/g is absorbed into the normalization of the η and η fields. By ex-pressing the determinant as a path integral, we have introduced a new field. It haspeculiar characteristics: it is a scalar field obeying the Klein–Gordon equation butis anticommuting. It is called the Faddeev–Popov ghost. As it has a color index,there are as many (i.e. Nc) ghost fields as gauge fields. It interacts only with thegauge field through derivative couplings. Since there is no reason to consider it asa physical particle we do not introduce an external source and hence it only appearsin the intermediate states, namely, in closed loops in Feynman diagrams.

We still have the freedom to adopt an arbitrary functional H [h] and we choose itto be a Gaussian integral

H [h] D exp��

i2λ

Zd4x g2

�D exp

��

i2λ

Zd4x (@μ Aμ

a)2�

(11.224)

It amounts to changing the gauge field Lagrangian

L! L �1

2λ(@μ Aμ

a)2 (11.225)

which is exactly the gauge-fixing term we introduced before to circumvent the dif-ficulty (see Eq. (11.59)).

Page 386: Yorikiyo Nagashima - Archive

11.7 Faddeev–Popov’s Ansatz* 361

In summary, the final form for Z, taking note of Eqs. (11.223) and (11.224), takesthe form

Z D NZ

DADηDη exp�

iZ

LnAbel

�(11.226a)

LnAbel D Lfree CLGF CLFPG (11.226b)

D Lfree �1

2λ(@μ Aμ

a)2 � ηa Mab ηb (11.226c)

Mab D δab@μ@μ C f abc Aμc@μ (11.226d)

Axial GaugeThe ghosts have the wrong spin-statistics and appear only in intermediate states.Their existence, however, is essential to retain the unitarity of the theory. This isanalogous to the scalar photon with indefinite norm, i.e. negative probability, thatappeared in the covariant Lorentz gauge of the Abelian theory. In the non-Abeliangauge, for instance in quantum chromodynamics (QCD), similar situations occurfor scalar and longitudinal gluons. In addition, because of the nonlinear natureof the non-Abelian theory, they couple to themselves. The creation of unphysicalgluon pairs cannot be canceled by themselves. The ghost’s role is precisely to com-pensate these unwanted contributions and to maintain the unitarity of the theoryorder by order. In the Abelian case, when the Coulomb gauge was adopted onlyphysical photons appeared and we did not need unphysical photons. Similarly, wemay ask if there is a gauge in non-Abelian gauge theory in which the ghosts do notappear. Indeed there is: the axial gauge,3) where A3

a(x ) D 0. More generally, wedefine the axial gauge with a unit vector t μ that satisfies the condition

tμ Aμa D 0, tμ t μ D ˙1 (11.227)

Then the gauge-fixing term is

ga (A) D tμ Aμa (11.228)

and under the gauge transformation Eq. (11.203)

δga [Aω ] D f abc Aμb tμ ω c C t μ@μ ωa D t μ@μ ωa (11.229)

Hence

δga

δωbD δab t μ@μ (11.230)

This does not contain Aμa, and its contribution can be integrated out. We do not

need ghosts in the axial (or temporal) gauge. However, just like the Coulomb gauge,the Feynman propagator is complicated in these gauges and calculation is moredifficult.

3) By choosing tμ to be timelike, the temporal gauge is included also.

Page 387: Yorikiyo Nagashima - Archive
Page 388: Yorikiyo Nagashima - Archive

363

12Accelerator and Detector Technology

12.1Accelerators

Particle physics explores the deepest inner structure of matter. The smallest sizewe can explore is directly proportional to the energy we can transfer to the par-ticles. For this reason, accelerators of the highest possible energy are the majorapparatus one needs. With them we can boost particles, make them collide and de-compose and investigate their inner structure by looking at their debris. The roleof detectors is to identify secondary particles, reconstruct their space-time struc-ture by measuring energy-momentum and elucidate their nature. Particle physicsis a science based on observation and experimentation. Its progress is inextricablyrelated to the advent of technologies. The ability to devise an experiment to testtheoretical hypotheses and to understand connections between natural phenome-na and predictions induced from first principles are attributes that researchers inthe field have to acquire. To interpret experimental data correctly, it is essential tograsp the essence of experimental techniques and their underlying principles andto recognize their advantages and disadvantages.

There are numerous examples from the past in which people, confused by falsedata, have taken a roundabout route. Acquiring correct knowledge of experimentaltechniques and cultivating one’s own critical power help to assess the observed da-ta correctly and enhance one’s ability to devise new methods that can lead to newideas. However, the general knowledge that is necessary for the experimental tech-niques is mostly based on electromagnetism, condensed matter physics, and thephysics of molecules, atoms and nuclei. These are, to say the least, detached fromelementary particle physics. To avoid writing a manual-like textbook on experimen-tal methods by going into too much detail, I restrict myself here to describing onlybasic ideas and a few examples of representative accelerators and detectors. In thischapter we discuss only experimental tools. Important experiments are introducedwhenever possible as we pick up each subject throughout the book.

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 389: Yorikiyo Nagashima - Archive

364 12 Accelerator and Detector Technology

12.2Basic Parameters of Accelerators1)

12.2.1Particle Species

The basic idea of particle acceleration is to apply an electric field to charged parti-cles. Let the charge of the particle be q, the applied voltage V (in volts) and assumethe field is uniform. Then the particle is accelerated to energy

E D qV D qεd (12.1)

where ε D dV/d t is the gradient of the electric potential and d the length of thefield. When a charged particle has unit charge (D 1.6 � 10�19 Coulomb) and isaccelerated by 1 Volt of electric potential, we say it has acquired the energy of 1electron volt (eV for short). The electron volt is a small unit, and we list below larg-er units and their applied physics in that energy region:

keV D 103 eV : X-rays, structure of atoms and moleculesMeV D 106 eV : nuclear reactions

: mπ (pion mass)D 139.6 MeV/c2

: hadron size� „/mπ c ' 1.4� 10�13 cmGeV D 109 eV : mp (proton mass)D 0.938 GeV/c2

: mW (W boson mass)D 80.26 GeV/c2

: (p

2GF)�1) (energy scale of electroweak interaction):D 250 GeV

TeV D 1012 eV : typical energy of particle physics frontier

: Note: total energy of LHC2)D 14 TeV

Particles that can be accelerated are usually limited to stable charged particles,which are electrons (e) and protons (p), including their antiparticles. When onewants to use a beam of unstable particles such as pions (π), kaons (K) and hyperons(Y), one irradiates protons on matter (beryllium, iron, etc.) and uses the secondaryparticles that are produced. In the case of neutrinos (ν) and muons (μ), tertiaryparticles, which are the decay products of secondary particles, are used. The energyof the secondary or tertiary beams is usually less than half that of the primary andthe intensity is of the order of 10�6–10�7 of the primary. Until the unified theorieswere established, strong, electromagnetic and weak interactions were consideredas independent phenomena and experiments with different particle species wereconsidered as belonging to different research fields. Historically, hadron beams(p , π, K ) on fixed targets were very powerful in discovering new particles and res-onances. Charting and classifying them (hadron spectroscopy) was an important

1) [266].2) The Large Hadron Collider (LHC) is a proton–proton collider at CERN that started operation in

2009.

Page 390: Yorikiyo Nagashima - Archive

12.2 Basic Parameters of Accelerators 365

discipline of the field. The hyperon beam [247], a secondary hadron beam that con-tains a large fraction of hyperons, contributed to measuring the magnetic momentof hyperons and elucidating their decay properties. Neutrino beam experiments inthe 1970s [145] were used to discover the neutral current and, together with in-tense electron beam experiments, clarified the quark–parton structure of hadrons.Both approaches, spectroscopy and dynamics, were instrumental in establishingthe unified theories. Muon beam experiments [124] extended the knowledge ob-tained from the electron beam to the deeper inner structure of the hadron andenhanced the accuracy obtained by the neutrino experiments, thereby clarifyingthe quark–gluon structure, and contributed to establishing QCD (quantum chro-modynamics).

With the advent of the unified theories, however, it became possible to interpretvarious phenomena produced by different particles in a unified way. From thisviewpoint, the energy is the most important parameter and the particle speciesplays a secondary role.3) The discovery of new particles at the energy frontierhas been accorded the foremost importance in testing the correctness of proposedmodels by a potential breakthrough. In terms of energy, colliding accelerators arefar more advantageous than those utilizing fixed targets. From the accelerator tech-nological point of view, the proton is much easier to accelerate to higher energiesthan the electron. Consequently, hadron colliders (usually p on p or p ) have anadvantage compared to electron colliders as the energy frontier machine. Howev-er, the hadron is a quark composite while the electron is elementary. Thereforehadron-associated phenomena have an intrinsic difficulty in their analysis and the-oretical interpretation. It is also not clean experimentally, because the hadron isa strongly interacting particle and produces a great number of soft gluons, whichappear as QCD background to a reaction one wants to investigate. The search fora new particle in hadron reactions is literally looking for “a needle in a haystack”.Thus electron machines are usually more advantageous in doing precise and de-cisive experiments to test theoretical models, except for those unreachable in theelectron machine.

Today, fixed target experiments are carried out mostly in cases where the intensityof the beam and the species of the particle are the most important assets, namelywhen high precision in flavor physics is at stake. Examples are CP violation tests,rare decay modes in kaons and B-mesons, and neutrino and muon experimentsto probe lepton flavor mixing. Generally speaking, for collider experiments at theenergy frontier, a general purpose detector that is a combination of a variety ofdevices covering the largest possible solid angle is required. On the other hand, aclear and definite purpose is ascribed to fixed target experiments. They are mostoften accompanied by unique and well thought out measuring devices specific tothe purpose.

3) To be fair, since the Standard Model has been established, the origin of particle species is anunanswered question. The universe can live with just the first generation (u, d) quarks and (νe , e)leptons and the existence of the second and third families remains an unsolved “flavor problem”.

Page 391: Yorikiyo Nagashima - Archive

366 12 Accelerator and Detector Technology

12.2.2Energy

The highest energy is the most fundamental parameter of an accelerator becauseit determines the minimum size of the object being probed. Equally important isits luminosity, which determines the number of events one can sample. It maybe compared with the clearness of an image if the energy is the magnification ofa microscope. The minimum scale (λ) one can probe with an accelerator havingenergy E is related to the de Broglie wavelength of the incident particle:

λ Dhp'„cED

2 � 10�16

ps/2[GeV]

m (12.2)

E is the energy of colliding particles in the center of mass system. Throughoutin this book, we use “

ps D 2E” to denote the total center of mass energy. s is

a Mandelstam variable and is defined by the following formula using the energy-momentum of two colliding particles (see Sect. 6.4):

s D c2(p1 C p2)2 D (E1 C E2)2 � (p1c C p2c)2 (12.3)

In the collider accelerator, p1 D �p2 andp

s D E1CE2, the total energy of collidingparticles.4) For fixed target experiments p2 D 0 and when E1 m1c2, m2c2

ps D

q(E1 C m2c2) � (p1c)2 '

p2E1m2c2 (12.4)

This means the equivalent center of mass energy of the fixed target experimentsincreases only proportionally to the square root of the accelerator energy and isinefficient compared to the collider accelerators. This is why, with a few exceptions,most of the important results in the last 30–40 years have been obtained in colliderexperiments.

Problem 12.1

The energy of the Tevatron at Fermilab in the USA hasp

s D 1.96 TeV D

(980 C 980) GeV. Derive the center of mass energy when one uses this beamfor a fixed target experiment and prove it is not possible to produce a W boson(mW D 80.4 GeV). Assume a hydrogen target, m p D 0.938 GeV.

Note the Tevatron [343] started as an accelerator with a beam energy of 400 GeVfor fixed targets in 1973 and W production was one of its major aims.

Effective Energy of Proton Colliders The proton is a composite of quarks, and theenergy that participates in the reactions is that of the constituent quarks. Assuming

4) For asymmetric colliders (p 1 ¤ �p 2) such as B-factories (KEKB and PEP II),p

s has to becalculated using Eq. (12.3). The B-factory is a high luminosity accelerator that operates atp

s D 2mB � 10 GeV designed to produce copious B-particles, which contain b-quarks.

Page 392: Yorikiyo Nagashima - Archive

12.2 Basic Parameters of Accelerators 367

the quark in the proton has a fraction x of the parent proton momentum, the totalenergy

pOs of the colliding quarks is given by

pOs D

q(x1 p1c C x2 p2c)2 '

px1x2p

s (12.5)

The average fractional momentum hxi has been measured in deep elastic scatter-ings by electrons and muons (see Chap. 17) to be 0.1–0.3, which means 0.01

ps .p

Os .p

s. However, the probability of quarks having x1x2 � 1 is not zero andreactions with

pOs 'p

s are not impossible. Whether the reaction is useful or notdepends on the luminosity and the skill of experimenters in sorting out signalsfrom the background.

12.2.3Luminosity

Events that a detector picks up are mixtures of signals and background. The lu-minosity L plays a critical role in extracting clean signals out of unwanted back-grounds. It is defined as the quantity that gives an event rate when multiplied withthe cross section.

N (reaction rate/s) D Lσ (12.6)

The unit of L is “per square centimeter per second” (cm�2 s�1). The cross sectionσ is predetermined by nature and is not controllable. Therefore the number ofevents, a decisive factor in reducing statistical errors, is directly proportional to theluminosity. When large background signals exist, one has to filter them out usuallyby applying cuts in various kinematical variables, such as momenta, angles andspatial as well as time locations. However it is not possible to make an ideal filterthat is capable of cutting out all the unwanted events while passing all the signals.In most cases, the signals are reduced, too. When the background signals areabundant, the events have to go through many filters and the number of survivingsignals critically depends on the luminosity. The luminosity of a circular collider isgiven by [240, 316]

L DN1N2 f

nb AN1, N2 : total number of particles in the beam

f : revolving frequency of the particlesnb : number of bunches in a circle

(the beam is not continuous but bunched)

A : the beam’s cross-sectional area at the colliding pointA Dp

εx xp

ε y y

x , y : beam amplitude at the interaction pointεx , ε y : beam emittance

Page 393: Yorikiyo Nagashima - Archive

368 12 Accelerator and Detector Technology

The beam emittance is defined as a phase space area in the ( i , 0i D d i/dz)-

plane of the particles in the beam, where z is the coordinate in the beam direction.Namely

ε i D

Z0

i d (12.7)

Luminosities that can be achieved by electron and proton beams are slightly dif-ferent. L e D 1034 cm�2 s�1 has been achieved at B-factories, which is an approxi-mately 102 increase compared to their predecessors, and L p ' 3 � 1032 cm�2 s�1

has been achieved at the Tevatron. At LHC the designed luminosity is L p D

1034 cm�2 s�1 and up to 1035 cm�2 s�1 is envisaged in the future. For fixed-tar-get experiments, the reaction rate N (s�1) can be calculated using the number ofincoming particles (Nin) and target particles (Ntgt) as

N D NinσNtgt D Ninσ�`NA

� W density of the target in g/cm3

` W the length of the target in cm

NA W Avogadro number D 6.022 � 1023 mol�1

(12.8)

which gives

L D Nin�`NA (12.9)

As a typical example, taking � D 10 g/cm3, ` D 1 cm, Nin D 107 s�1 (secondarybeam) to 1012 s�1 (primary beam) gives L D 1032 to 1037 s�1. Until the end of the20th century, the luminosity at a fixed target was generally larger but the gap isclosing fast. In electron collider experiments, a basic cross section is given by theprocess e�eC ! μCμ�. Note, in the Standard Model, for colliding e�eC all theproduction cross sections by the electroweak interaction are of the same order andthe following number is a good number to keep in mind:

σμμ � σ(e�eC ! μCμ�)

D8π3sD

87[nb]s[GeV2]

�10�31

s[GeV2][cm2]

(12.10)

where nb means nanobarn. Atp

s D 10 GeV, L D 1034 cm�2 s�1, N � 100 s�1.

Signal-to-Noise Ratio (S/N) As the proton is a quark composite, quarks that havenot participated in the reaction (called spectators) escape from the parent proton,radiating soft gluons, which hadronize, producing jets and constituting the so-called QCD backgrounds. For this reason, clean signals (with large S/N ratio) arehard to obtain in hadron colliders. Suppose one wants to detect a Higgs meson. Itsproduction cross section is � 10�35 cm2. Furthermore, the decay branching ratioto lepton pairs (H0 ! e�eC, μ�μC), which are suitable for identifying the Higgs,

Page 394: Yorikiyo Nagashima - Archive

12.3 Various Types of Accelerators 369

is � 10�4. In contrast, a typical hadronic cross section that appears as QCD back-grounds in hadron colliders is σ � 10�25 cm2 and with L D 1032 to 1034 cm�2 s�1

produces events at a rate of N D 107 to 1010 s�1, a formidable number to copewith. The S/N ratio at its most naive level is estimated to be 10�14, a challenge forexperimentalists.

At electron colliders, production cross sections of interest relative to that of ee !μμ are usually of the order of 10�1 and that of hadronic production is � 3 � 4,therefore S/N ratios are much better. Generally speaking, electron colliders are ad-vantageous in that because the incoming particles are elementary theoretical in-terpretations are straightforward. Furthermore, the whole energy of the collidingelectrons can be used to produce the wanted particles and the backgrounds arefew. Therefore, signals are cleaner and easier to interpret theoretically. However,the electron, being much lighter than the proton (me/m p ' 2000), emits photonsif bent by magnetic fields (synchrotron radiation). Synchrotron light source facili-ties take advantage of this and are extensively used in the fields of materials scienceas well as atomic, molecular and biophysics. From the high energy physics pointof view, it is a pure loss. Since the loss increases as � E 4, it is increasingly difficultto construct a high energy, high luminosity circular accelerator for electrons. It isbelieved that the LEP5) at CERN with its energy

ps D 200 GeV is about the limit

of a circular machine. If higher energy is desired, one has to rely on linear col-liders. So far only one linear collider has been operational, the SLC at SLAC withp

s � 90 GeV, and its technology is not as mature as that of circular accelerators.In summary, with hadron colliders, innovative approaches to detector designs

are required to cope with copious backgrounds, whereas with electron collidersdeveloping advanced accelerator technology is the critical factor. With present tech-nology, when physics of the TeV region is to be investigated, hadron machines areeasier to construct, but linear colliders of the TeV range are a must for future ex-periments and extensive R&D is being carried out.

12.3Various Types of Accelerators

12.3.1Low-Energy Accelerators

One accelerator on its own, whatever its mechanism, cannot accelerate particlesabove an energy of about 1 GeV, and a variety of accelerators are utilized in cascadeto achieve this. The final stage of a high energy accelerator complex, so far, withthe exception of the SLAC linear accelerator (SLC), has always been a synchrocy-clotron, a kind of circular accelerator. It confines particles inside a circular pipewith fixed radius, places an accelerator tube at one point or two and accelerates

5) An electron collider that operated from 1989 to 2000 withp

s D 2mz (180 GeV) � 200 GeV whichprovided precision data to confirm the validity of the Standard Model.

Page 395: Yorikiyo Nagashima - Archive

370 12 Accelerator and Detector Technology

the particles every time they pass the tube. As a single acceleration tube can beused many times, it is very effective and has a high performance-to-cost ratio. Elec-trons or protons emanating from a source are first accelerated by a static electricfield accelerator (usually a Cockroft–Walton accelerator) to a few hundred thousandvolts, then further accelerated by a linac to hundreds of million electron volts, andfinally injected into a synchrotron. As the particles can be accelerated more effec-tively if the ratio of the final energy to the injected energy is small, two or moresynchrotrons are used in cascade. The first of them is often called a booster or anaccumulator and that in the final stage the main ring.

Cockroft–Walton This type of accelerator uses a static electric field to accelerateparticles and has the structure shown in Fig. 12.1. A combination condenser–diodecircuit is arranged to work as a parallel connection in charging the capacitors and aseries connection in discharging them. Figure 12.1 shows an eightfold multiplica-tion of the applied voltage. The Cockroft–Walton accelerator used for acceleratingthe 12 GeV proton synchrotron (hereafter referred to as PS) at KEK6) produced750 kV.

2V

4V

6V

8V

AC

CE

LE

RA

TO

R T

UB

EION SOURCE

V sin ωt

BEAM

0V

Figure 12.1 Cockloft–Walton accelerator: The circuit is con-nected in such way that applied voltages are arranged in par-allel when charging and in series when discharging, giving then-fold (n D 8 in the figure) voltage.

Linac Protons accelerated by the Cockroft–Walton accelerator are injected into alinear accelerator (linac). It is composed of a series of electrodes in a cavity (a hollowtube made of copper), to which an alternating electric voltage is applied by a radiofrequency (RF) supplier (see Fig. 12.2).

The polarity of the applied voltage is arranged to decelerate the particles whenthey pass the cavities and to accelerate them when they cross the gaps between

6) High Energy Accelerator Organization in Japan.

Page 396: Yorikiyo Nagashima - Archive

12.3 Various Types of Accelerators 371

RF Generator

ParticleSource

Figure 12.2 Proton Linear Accelerator: The accelerating RF voltage is applied to particles whenthey cross gaps and a decelerating voltage is applied when they are in the tube. As the particlesgain energy, their velocity increases and the corresponding tube is longer.

them. They feel no voltage inside the cavity because of its electric shielding. Theflow of injected protons is continuous, but only those synchronized with the ap-plied voltage are accelerated and consequently they emerge bunched as a series ofdiscontinuous bundles. The length of the cavity needs to be longer as the parti-cles proceed because they fly faster each time they are accelerated. The acceleratingvoltage is usually limited to a few MeV (million electron volts) per meter, and anenergy of 100–200 MeV is reached at the final stage. Electrons acquire near-lightvelocity at a few MeV, so the length becomes constant after a meter or two. Thelongest electron linac is SLC at SLAC (Stanford Linear Accelerator Center), whichhasp

s D 90 (45C 45) GeV, and a length of 3 km. It uses 230 klystrons (large vac-uum-tube RF amplifiers) and makes two beams of electrons and positrons collide60 times per second. The positrons are made by bombarding high-energy electrons(34 GeV at SLC) on matter, returning them to the starting point and re-acceleratingthem using the same accelerator tube a half-period of the RF later.

Cyclotron A charged particle with its velocity perpendicular to a uniform magneticfield makes a circular motion. When the particle with mass m, charge q and mo-mentum p moves on a circular orbit of radius R in a magnetic field of strength B,the equation of motion with centrifugal force supplied by the magnetic field is giv-en by

mv2

RD qv B

p [GeV/c] D mv D qB R D 0.3B R [T m](12.11)

where T stands for tesla. The equation is valid at relativistic energy if the relativisticmass (m ! mγ D m/

p1� (v/c)2) is used. The radius of curvature is proportional

to the momentum and the inverse of the field strength. The angular frequency isgiven by

ω D 2πv

2πRD

qBmγ

(12.12)

and is referred to as the cyclotron frequency. When the particle motion is nonrel-ativistic (v � c), ω is a constant if the magnetic field strength is constant. In thecyclotron a pair of half-cut hollow discs called D-electrodes or “dees” are placed in

Page 397: Yorikiyo Nagashima - Archive

372 12 Accelerator and Detector Technology

S

N

D1 D2

ION SOURCE

COIL

VACUUMCHAMBER

(a) (b)

D2

RF VOLTAGE

DEE

DEE

D1

B

Figure 12.3 Layout of a cyclotron: (a) Thecross section. Two electrodes, the dees, are inthe magnetic field. (b) RF voltage is appliedbetween the two dees to accelerate particles

when they pass the gap. The orbital radiusgrows as the particles are accelerated, but therevolution frequency stays the same.

a uniform magnetic field (Fig. 12.3a) and a RF voltage is applied across the dees.The particles, injected near the center of the magnetic field, are accelerated onlywhen they pass through the gap between the electrodes. The combined effect ofthe magnetic field and the acceleration forces the particles to travel in a spiral pathand eventually leave the dees (Fig. 12.3b).

Synchrocyclotron As particles are accelerated, their mass increases, the cyclotronfrequency is no longer constant and their circular motion is not synchronous withthe accelerating field. The cyclotron’s acceleration is effectively limited to v/c . 0.1or in the case of protons about 15 MeV. To overcome this limit, either ω or B has tobe changed. The synchrocyclotron increases the maximum energy by decreasingthe frequency as the mass increases. The disadvantage of this method is that onlyclusters of particles can be accelerated. Once the RF is synchronized to an injectedcluster, the next cluster has to wait until the preceding cluster has reached its min-imum frequency and is extracted. The intensity is reduced to a fraction (� 1/100)of that of the cyclotron.

Isochronous Cyclotron In the isochronous cyclotron, in contrast to the synchrocy-clotron, the RF is kept constant and the magnetic field is synchronized with themass increase. This is achieved by dividing the poles of the magnet into concentricsectors, and this type of cyclotron is referred to as a sector or ring cyclotron. Theparticles fly freely between the sectors and their trajectories are bent only whenthey pass through the sector magnets. The shape of the sectors is designed to keepthe combined total flight time in free space and in the sectors constant as the par-ticle becomes heavier. Thus the sector cyclotron can achieve high energy (severalhundred MeV for protons) without loss of intensity. Figure 12.4 shows an example,the 590 MeV Ring Cyclotron of PSI (Paul Sherrer Institute in Switzerland), from

Page 398: Yorikiyo Nagashima - Archive

12.3 Various Types of Accelerators 373

Figure 12.4 (a) PSI 590 MeV Ring Cyclotron.The beam is accelerated first to 870 kV by aCockroft–Walton accelerator, then pre-accel-erated to 72 MeV by a small ring cyclotronand injected into the 590 MeV Ring Cyclotron.(b) Floor layout of the Ring cyclotron. Fouraccelerating tubes (KAV1–4) and eight sec-

tor magnets (SM1–8) are shown. The sectormagnets are of C-type with the open edge di-rected toward the center. The beam comes infrom the left, is injected into the inner edgeof SM3 at the upper right and extracted fromAHB at the upper right. Photo courtesy of PaulScherrer Institute [328].

which a 2 mA DC beam can be extracted [328]. The highest energy is limited by thecost of making giant magnets.

12.3.2Synchrotron

As is shown by Eq. (12.11), if one wants to increase the energy, either B or R orboth has to be increased. The synchrotron keeps the radius of the circle constantby increasing the strength of the magnetic field. If one wants to go higher than, say,1 GeV energy, it is the most practical method. It is accompanied by one or more pre-accelerators because a certain amount of energy is necessary to keep particles ona circular orbit of a definite radius. Pre-accelerators usually consist of a Cockroft–Walton accelerator followed by a linear accelerator. Sometimes two synchrotrons

Page 399: Yorikiyo Nagashima - Archive

374 12 Accelerator and Detector Technology

LHC

LHC-b

ATLAS

ALICE

SPS

PSBOOSTER

AD CN

GS

neutrinos

LEIRP Pb ions

East Area

Gran Sasso730 km

LINACS

West Area

pbar

Antiprotondecelerator

ISO

LDE

Figure 12.5 CERN accelerator complex:Protons are accelerated first to 50 MeV bya linac, then to 1.4 GeV by the booster, to25 GeV by the PS (Proton Synchrotron), to450 GeV by the SPS (Super PS) [133] and fi-nally to 7 TeV by the LHC main ring. ISOLDEis used to study unstable nuclei. Part of the

beam is used for fixed-target experiments. Forinstance, CNGS (CERN Neutrinos to GranSasso) is a neutrino beam for neutrino oscil-lation experiments carried out at LNGS (theGran Sasso National Laboratory) of INFN (Is-tituto Nationale de Fisica Nucleare) in Italy730 km away. Taken from [93].

are used in cascade to bring the energy even higher. This is because the accep-tance of the injected beam can be made higher if the ratio of injected to maximumenergy is lower, and there is a net gain in intensity. For this reason the first accel-erator is called a booster and the last the main ring. Figure 12.5 shows CERN’s7)

accelerator complex [93]. Protons are first accelerated to 50 MeV by a linac, then to1.4 GeV by the booster, to 28 GeV by the PS (Proton Synchrotron), to 450 GeV bythe SPS (Super PS) and finally to 7 TeV by the LHC main ring. PS was the mainring between 1960 and 1980 and is still used for a variety of experiments. SPS wasbuilt originally as an antiproton–proton collider and achieved the historical discov-ery of W and Z0 bosons. It also served as an intermediate accelerator for electronsand positrons to be injected into the Large Electron–Positron Collider (LEP). LEPalso used to be the LHC main ring; it operated from 1989 to 2000 at

ps D mZ

and later atp

s D 200 GeV. It was the largest electron collider and instrumental inestablishing the Standard Model of particle physics.

Elements of the Synchrotron The magnets that are used to keep the particles ontheir circular trajectories are of dipole type, like the one shown in Fig. 12.6a, orC-type, obtained by cutting the magnet at the center.

The accelerating tube is a RF cavity and is placed at a point on the circle. The fre-quency is synchronized so that it accelerates the particles every time they pass the

7) European Organization for Nuclear Research. The acronym CERN originally stood for the earlierFrench name Conseil Européen pour la Recherche Nucléaire. See for example [93] for moredetails.

Page 400: Yorikiyo Nagashima - Archive

12.3 Various Types of Accelerators 375

N

S

COIL S

S

N

N

d

(a) (b)

(c)

Figure 12.6 Cross sections transverse tothe particle path of a typical dipole (a) and aquadrupole magnet (b). Thin lines with ar-rows denote magnetic field directions andthick arrows denote magnetic forces. In thedipole, the force is one directional, but in the

quadrupole, it is inward (converging) hori-zontally but outward (diverging) vertically.(c) A pair of dipoles can make the particle rayconvergent in both the x and y directions, act-ing like a combination of convex and concavelenses.

RF cavity. With present technologies, the maximum field strength of the magnetis limited to � 1.4 T (normal conducting) or 3 T (Tevatron) to 10 T (LHC) (super-conducting).8) Let us consider the cases of the Tevatron (980C 980 GeV) and LHC(7C 7 TeV). Inserting B D 3 T (3.8 T), R D 1 km (2.8 km), Eq. (12.11) tells us thatE � p c � 900 GeV (7 TeV). Namely, the size of the accelerator becomes largerin proportion to the maximum energy. The revolution frequency at the Tevatron(LHC) is � 10�5 (10�4) and the accelerating voltage is � 1 MeV; it takes � 10 s(� 700 s) to reach the maximum energy. When a continuous beam is injected intothe linac, it becomes bunched in the synchrotron with a cluster length of � 1 mm.

Beam Stability Usually particles in the beam have dispersion (spread) in their ve-locity as well as their position. The accelerator will work only when the particlesthat are not exactly synchronous with those in the central orbit tend to convergeor oscillate around the center. If their motion diverges, they will not be acceleratedstably.

i) Betatron Oscillation Stability Transverse oscillation relative to the orbit is calledbetatron oscillation. To control the betatron oscillation and keep the particles onthe orbit, quadrupole-type magnets are used. They have the structure shown inFig. 12.6b. With the configuration depicted in the figure, the magnetic field exertsa converging force (focus, F) on the particles in the x direction and diverging force

8) In laboratories, if only a small volume is activated for a short time, magnetic field strengths ofseveral hundred tesla have been achieved. Here, we are talking of industrial level magnets that areof large size and capable of stable operation.

Page 401: Yorikiyo Nagashima - Archive

376 12 Accelerator and Detector Technology

EE’

L’L

O

Central Orbit

Phase Φs ωst

Acc

eler

atin

gV

olta

ge

Figure 12.7 Synchrotron oscillation and phasestability. Point O denotes where the revolu-tion frequency and the acceleration frequencysynchronize. A particle which arrives early (E)(late, L) receives larger (smaller) acceleration

and its orbital radius becomes larger (small-er). Therefore, it takes more (less) time to goaround and arrives later (E0) (earlier, L0) at thenext period.

(defocus, D) in the y direction. The quadrupole magnet is a kind of convex lens andconcave lens in optics in one. Just as a combination of concave and convex lensesplaced at a distance d has a combined focal length F, which is given by

1FD

1f1C

1f2�

df1 f 2

(12.13)

a pair of quadrupole magnets acts as a focusing device. If we take, for example,f1 D � f2 D f , then F D d/ f 2 and the pair can exert a converging force in both

the x and y directions (Fig. 12.6c). In the synchrotron dipole magnets are placed atequal distances to keep the particles moving on the circular orbit and quadrupolesare distributed between the dipoles in a configuration (FDFDFD. . .). The use ofmany FD combinations is called strong focusing.

ii) Phase Stability Longitudinal oscillation in the beam direction around the cen-tral point synchronized with the applied RF voltage is called synchrotron oscil-lation. Suppose the circular motion of particles at point O (see Fig. 12.7) is syn-chronous with the RF frequency and they come exactly back to point O after aperiod. The phase of point O is called synchronous phase. Particles at E, which ar-rive earlier, experience higher acceleration, which causes their orbit to expand andconsequently they take more time to circulate, arriving later at E0 in the next period.On the other hand, particles at L, arriving later, receive less acceleration and arriveearlier, at L0, next time. Therefore the particles’ phase tends to converge to the syn-chronous phase and thus the beam can be accelerated stably. The convergence ofthe longitudinal motion in the particles’ orbit is referred to as phase stability.

Synchrotron radiation, which is described in the next paragraph, attenuates boththe betatron and synchrotron oscillation and thus has the effect of stabilizing thebeam.

Synchrotron Radiation Synchrotron radiation is the phenomenon that a particleemits light as its path is bent by the magnetic field. Because of this, the energy

Page 402: Yorikiyo Nagashima - Archive

12.3 Various Types of Accelerators 377

cannot be made arbitrarily large proportional to the radius. At some point the en-ergy loss due to synchrotron radiation becomes so large that the compensation ofthe energy becomes technically formidable. The energy loss ΔE per turn of thesynchrotron radiation is given by

ΔE D4π„c3γ 4

3R

ΔE [MeV] �0.0885(E [GeV])4

R[m]W for the electron

(12.14)

At LEP, R D 2804 m and E D 100 GeV, which gives ΔE D 3.2 GeV/turn.Power/RING D 16 MW. At LHC, E D 7 TeV, which gives ΔE D 6.7 keV/turn.

Synchrotron radiation is almost monochromatic and collimated, which makesa high quality light source but is a loss as an accelerator. The proton being 2000times heavier than the electron, the synchrotron radiation loss in the proton ma-chine is (2000)4 � 1013 smaller and hence has never been a problem. LHC is thefirst machine to consider the loss seriously, but at E D 7 TeV, R D 2804 m theenergy loss is a mere 6.7 keV/turn. This is the main reason for the technologicallyavailable maximum energy of the electron accelerator being roughly a tenth that ofthe proton machine.

12.3.3Linear Collider

As the energy loss due to synchrotron radiation increases rapidly, like� E 4, and be-cause of the already formidable energy loss, a circular electron machine beyond thesize of the LEP is considered impractical. This is why a linear collider is being con-sidered as the future machine for electron–positron collisions. The first workinglinear collider was SLC (Stanford Linear Collider), which operated at

ps D 90 GeV.

The technology of the linear collider is not mature and SLC could barely competewith LEP in producing physics results. However, for precision physics beyond theTeV region, it is considered a must and a team of collaborators from all over theworld is working to realize the ILC (International Linear Collider). A starting en-ergy of

ps D 500 GeV is envisaged, later to be increased to 1 � 2 TeV. The main

difficulties of the linear collider are:

1. Unlike circular accelerators, an accelerating tube can accelerate particles onlyonce on a trajectory. Therefore, an increase of the acceleration voltage per unitlength is necessary from the present level of � 1 MeV/m to � 100 MeV/m. Ifthis is achieved, the total length of the accelerator becomes 1 TeV/100 MeV �10 km.

2. As two beams can collide only once, the collision probability has to be in-creased, too. This requires the colliding beam size to be on order of � 0.1 μmor less compared to � 10 μm at present.

3. As a corollary to the second item, technologies have to be developed to maintainand control the accuracy of 0.1 μm over a distance of 10 km.

Page 403: Yorikiyo Nagashima - Archive

378 12 Accelerator and Detector Technology

12.4Particle Interactions with Matter9)

12.4.1Some Basic Concepts

A particle detector is a device that determines the energy, momentum, spatial posi-tion or time response of an incoming particle by inducing interactions with specificmaterials suitable for the purpose. It is useful to understand basic features of inter-actions that are used for measurement before discussing individual instruments.Particles are classified roughly into two classes depending on their interactionswith matter: hadrons, which interact strongly, and electrons, photons and muons,which interact only electromagnetically. From the detector point of view, weak in-teractions (and gravity, too) can be neglected. Neutrinos are usually recognized as“missing energy” except for specific devices to detect neutrino interactions directly.Suppose a particle with definite energy and direction passes through a material.It interacts either electromagnetically (Fig. 12.8a, b) or strongly (c, d). A strong-ly interacting particle is usually elastically scattered if the energy is low (. a fewhundred MeV), but at high energy (E m p c2 � 1 GeV) its reaction is inelastic,producing one or more particles. From the incident particle point of view, the en-ergy is regarded as lost or absorbed by the material. In the case of electromagneticinteractions, the incident particle loses its energy (E ! E � ΔE ) by scatteringelectrons (namely ionizing atoms). It also undergoes multiple scattering, a seriesof small-angle scatterings by the atomic Coulomb potential, and changes its angleθ ! θ C Δθ . If a particle is energetic enough, secondary particles produced bystrong or electromagnetic interactions induce a series of inelastic reactions, mak-ing a shower of particles. Among the secondary particles, neutral pions decay im-

9) [138, 261, 311].

Δθ

Δθ

(a) (b)

(c) (d)

N(E)

ΔΕ

E0

Figure 12.8 Particles’ interactions with matter.Electromagnetic interaction: (a) The particleloses energy ΔE by ionizing atoms. (b) Theparticle changes direction by multiple scatter-

ing. Strong interaction: (c) Reaction with (ab-sorption by) nuclei. (d) Shower phenomenon:a combination of strong and EM interaction.

Page 404: Yorikiyo Nagashima - Archive

12.4 Particle Interactions with Matter 379

mediately to two energetic gamma rays and by a combination of pair productionand bremsstrahlung produces an electromagnetic shower. Therefore the showerphenomenon induced by hadrons is a combination of strong and electromagneticinteractions.

Collision Length and Absorption Coefficient We define a few quantities to give ameasure of reaction probability.

Cross Section The cross section σ is defined as the probability of reaction whenon average one particle per unit area comes in and hits a target whose density isone particle per unit volume. Then when Nin particles uniformly distributed overan area A hit a target consisting of Ntgt particles uniformly distributed in a volumeV D AL, the number of reactions N is given by

N D A �Nin

A�

Ntgt

V� L � σ D

NinNtgt

A� σ (12.15)

The cross section has the dimension of an area and may be considered as the tar-get’s effective area size. The standard unit is

1 barn � 1 b D 10�24 cm2 (12.16)

but also

1 mb (millibarn) D 10�3 b D 10�27 cm2

1 μb (microbarn) D 10�6 b D 10�30 cm2

1 nb (nanobarn) D 10�9 b D 10�33 cm2

1 pb (picobarn) D 10�12 b D 10�36 cm2

1 fb (fentobarn) D 10�15 b D 10�39 cm2

1 at (attobarn) D 10�18 b D 10�42 cm2

(12.17)

are used.A process where particles A and B collide and become particles C (DA) and D

(DB) is called elastic scattering. Sometimes when the particle species change (C ¤A, D ¤ B , for example π� p ! K0Λ), it is called quasielastic. If the final stateconsists of three or more particles C, D, E . . ., it is called inelastic scattering. Let thefraction of (CCD) production of all the reactions be B R(CD) (branching ratio) andthe total cross section σTOT, then the cross section of the reaction (ACB! CCD)is given by

σ(AC B! CCD) D σTOTB R(CD) (12.18)

The angular distribution of secondary particles contains important information.The fraction of the cross section in which the secondary particle is scattered intosolid angle dΩ at angle (θ , φ) (Fig. 12.9) is called the differential cross section andis defined as

σ DZ

dσdΩ

dΩ DZ

dσdΩ

sin θ dθ dφ (12.19)

Page 405: Yorikiyo Nagashima - Archive

380 12 Accelerator and Detector Technology

θ

dΩIncident Beam

Unit Area

Figure 12.9 When a flux of particles whoseintensity is one particle per unit area per unittime is incident on a target consisting of a sin-gle particle and dN particles per unit time are

scattered into the solid angle dΩ at (θ , φ),the differential cross section is defined bydσdΩ

(θ , φ) D dN .

When a particle moves a distance d in matter, the number of collisions it experi-ences is given by Eq. (12.15) by putting Nin D 1, A D unit area, V D d

Ncoll D σnd (12.20)

where n is the number density of target particles in the matter. Therefore the aver-age length before a particle experiences a collision is given by

λ D1

σ�NA(12.21)

where � is the matter density [g/cm3] and NA D 6.023 � 1023 g�1 is the Avogadronumber. λ is referred to as the collision length or the absorption length when onlyinelastic collisions are counted.

The number of particles passing through the matter decreases as they interact.If the length is sufficiently short, the rate can be considered to be proportional tothe length. As the probability that a particle is lost while moving a distance dx isgiven by dx/λ, the number of particles lost dN out of N particles is expressed as

dNND �

dxλ� �kdx (12.22)

k is called the absorption coefficient. Solving this equation gives

N D N0 exp(�kx ) D N0(�x/λ) (12.23)

where N0 is the initial number of particles. It is often more convenient to use not x,the length, but the weight per unit length (� D �x ). Denoting the flux (the numberof particles passing unit area per unit time) as I

I D I0 exp(�μ� )

� D �x , μ D1

λ�

(12.24)

μ is referred to as the mass absorption coefficient.

Page 406: Yorikiyo Nagashima - Archive

12.4 Particle Interactions with Matter 381

When the energy of strongly interacting particles, such as protons and pions,exceeds � 10 GeV, the total cross section is � 25 mb and the collision length inmatter with density � 1 g/cm3 is � 67 cm. The cross section of electromagneticscattering by electromagnetic interaction (not ionization), for instance, Thomsonscattering (elastic scattering) by protons is σ � (8π/3)α2(„/m p c)2 � 0.2 μb, whichgives the mean collision length � 100 km. The cross section of weak interactionsis, taking as an example neutrinos from the sun (E � 1 MeV), σ � 10�44 cm2 andthe collision length is� 1 light year in matter with � � 1 g/cm3. Thus the collisionlength gives a measure to distinguish one type of interaction from another. Exam-ples that are often mentioned in the instrumentation, for instance in the design ofa calorimeter, are the collision and absorption lengths of iron (Fe): λ ' 10.5 cmand 16.8 cm, respectively.

A good guess at the strong interaction rate is given as follows. Hadrons such asnucleons constantly emit and reabsorb pions and are considered as an extendedobject whose size is of the order of the pion Compton wavelength (λπ D „/mπ c '1.4 � 10�13 cm). As a nucleus with atomic mass A can be considered as denselypacked nucleons, it is a good approximation to treat the nucleus as a sphere withits radius approximated by rA D A1/3λπ . If a hadron hits a nucleus, it reacts imme-diately at the surface and the nucleus looks like a totally absorbing black disk. Thearea of the disk is given by

σA D 2π�

λπ

2

�2

A2/3 ' 3.2 � 10�26 A2/3 cm2 10) (12.25)

This is called the geometrical cross section. In practice, adopting the measuredcross section of π p scattering, the empirical formula

σA D 2.5 � 10�26A2/3 cm2 (12.26)

may be used as a measure.

12.4.2Ionization Loss

Ionization Loss per Unit Weight (d E/d � ) The process of losing energy by ioniz-ing atoms while passing through matter is the most fundamental phenomenonapplied for measurement of charged particles [15]. It is important to understandthe phenomenon in detail. Electrons are bound in atoms and although they are notat rest their motion is sufficiently slow that they can be approximated as a statictarget. When a particle with charge ze passes through matter with velocity v, and ifits mass M is large compared with that (m) of electrons (M m), the recoil of theparticle by scattering electrons can be ignored and the particle may be consideredto follow a straight line track. In the following, we try to calculate the energy loss

10) Quantum mechanically, the total cross section of a black disk is 2πR2, taking into account shadowscatterings.

Page 407: Yorikiyo Nagashima - Archive

382 12 Accelerator and Detector Technology

using the quantum field theories we learned in Chap. 7 as much as possible. Dur-ing the calculation we use natural units but recover „ and c when expressing thefinal results. Let j μ

e be the current the electrons make and j μm that of the incoming

charged particle. Then the scattering amplitude is given by [see Eq. (7.3)]

i T f i D

Zd4x d4 yh f jT

�˚�i e j μ

e (x )A μ(x )� ˚�i z e j ν

m(y )A ν(y )�jii

D �ze2Z

d4x d4 yh f j j μe (x )jiih0jT [A μ(x )A ν(y )]j0ih f j j ν

m(y )jii

(12.27)

Putting the energy-momentum of the initial and final state pi D (m , 0), p f D

(E f , p f ) � (E , p ), we find the matrix element of the electron current becomes

h f j j μe (x )jii D u(p f )γ μ u(pi)e�i(p i�p f )�x (12.28)

h0jT [A μ(x )A ν(y )]j0i is the Feynman propagator given by Eq. (6.115):

h0jT [A μ(x )A ν(y )]j0i D �i gμν

(2π)4

Zd4q

e�i q�(x�y )

q2 C i ε(12.29)

If we neglect the recoil and energy loss of the charged particle, it can be treatedclassically. Assuming the particle follows a straight track along the z-axis, we have

j νm(y ) D

Z 1

�1dτ uν δ4(y μ � uμ τ) D v ν δ(y 1)δ(y 2)δ(y 3 � v t)

v ν D (1, 0, 0, v )(12.30)

where uμ D (γ , γ v) is the particle’s four-velocity and τ (D t/γ ) is the proper time.Integration of x gives (2π)4δ4(q C pi � p f ), and further integration of q makes it(2π)4. Integration of y gives the Fourier transform of j ν

m:

j νω �

Zd4 y j ν

m(y )e i q�y Dv ν

v

Z L/2

�L/2dzei(ω/v�qz)z D

v ν

v2πδ

qz �

ωv

�qμ D (ω, q) D p μ

f � p μi D (E � m , p ) � (T, p ) (12.31)

where we have explicitly shown the integration limits of z to remind us that thelength the particle travels is infinite microscopically but finite macroscopically. In-serting these equations in Eq. (12.27), we obtain

T f i D2πv

δ�

pz �Tv

�ze2

q2vμ u(p f )γ μ u(pi) (12.32)

The scattering probability is given by squaring T f i , multiplying by the final statedensity, dividing by the initial state density 2Ei and taking the average and sum ofthe electron spin state:

dw f i D1

2Ei

0@1

2

Xspin

jT f i j2

1A d3 p f

(2π)32E f

Dz2e4L

16πEi E f v2 δ�

pz �Tv

�1q4 vμ vν Lμν d p 2

T d pz

(12.33)

Page 408: Yorikiyo Nagashima - Archive

12.4 Particle Interactions with Matter 383

Here we have used j2πδ(pz � T/v )j2 D 2πδ(pz � T/v )R

dz D 2πLδ(pz � T/v ).Lμν is given by Eq. (6.143). Using

Lμν D12

Xr,s

[us(p f )γ μ u r(pi)]�us(p f )γ ν u r (pi)

�D 2

�p μ

i p νf C p ν

i p μf C gμν q2

2

vμ p μ

i D m , vμ p μf D E f � v p f z D E � T D m

vμ vν gμν D 1� v2 D1

γ 2

(12.34)

vμ vν Lμν is expressed as

vμ vν Lμν D 2�

2m2 Cq2

2γ 2

D 4m2

�1C

q2

4m2 γ 2

�(12.35)

Inserting Eq. (12.35) in Eq. (12.33), the ionization probability is expressed as

dw f i D4πz2α2L

v2

m2

Ei E f

�1q4 C

1/4m2 γ 2

q2

�d p 2

T (12.36)

The energy loss per unit length is obtained by multiplying Eq. (12.36) by the energyT given to the electron and its number density ne , dividing by L and integratingover p 2

T :

dEdxD

4πz2α2ne

mv2

�m2

ZT

m C T

�1q4C

1/4m2 γ 2

q2

d p 2

T

�(12.37)

The momentum transfer can be expressed in terms of pT and T using pz D T/v :

�q2 D jqj2 � ω2 D p 2T C

�Tv

�2

� T 2 D p 2T C

�T

γ

�2

(12.38a)

On the other hand

�q2 D �(E f � Ei)2 C (p f � p i )2 D �(E f � m)2 C p2

f

D 2m(E f � m) D 2mT(12.38b)

which is an exact relation with the mass of the charged particle taken into account.

Page 409: Yorikiyo Nagashima - Archive

384 12 Accelerator and Detector Technology

As the energy loss T given to the electron is mostly in the nonrelativistic region, wemay approximate

E f D m C T ' m (12.39a)

Q2 � �q2 � p 2T (12.39b)

Then Eq. (12.37) can be approximated by

dEdxD

4πz2α2ne

mv2

"12

Z Q2max

Q2min

�1

Q2 �1

4m2γ 2

dQ2

#(12.40)

The maximum value of Q2max D 2mTmax occurs when the charged particle hits

head on and scatters the electron in the forward direction. The value of T is givenby

T D2m2γ 2 cos2 θ

(m/M C γ )2 � 2γ 2 cos2 θ(12.41)

where θ is the electron scattering angle and M the mass of the charged particle.From this Tmax ' 2m2γ 2. Q2

min is (Tmin/γ )2 from Eq. (12.38a). T must be atleast larger than the ionization energy I of the atom to be ionized, which meansQ2

min D (I/γ )2. Thus the expression for dE/dx is

dEdxD

2πz2α2ne

mv2

�ln

2m2γ 2Tmax

I 2 � 2�

(12.42)

which is referred to as the Bethe–Bloch formula [137].For practical applications, it is more convenient to give the ionization loss per

unit weight (� D �x ). Inserting „ and c to recover the right dimensions and takinginto account the density effect (δ/2, which will be described soon), the ionizationloss per gram�centimeter according to the Particle Data Group [311] is given by

dEd�D K

ZA

�z

�2 �12

ln�

2me c22γ 2Tmax

I 2

�� 2 �

δ2

�(12.43)

d� D �dx � : density of matterK D 4πNA r2

e me c2 D 0.3071 MeV cm2/g

NA D 6.022� 1023 g�1 D Avogadro number

re D e2/(4πε0 me c2) ' 2.82� 10�13 cm D classical electron radiusZ, A D atomic number and atomic mass of matter

I ' 16Z0.9 eV D average ionization potential of matterδ D density effect

Page 410: Yorikiyo Nagashima - Archive

12.4 Particle Interactions with Matter 385

From the formula, some characteristics of the ionization loss can be read off. Theionization loss per unit weight dE/d� is

1. proportional to Z/A and oblivious of matter species (weak dependence on ln I )and

2. a function of the velocity of the incoming particle.3. At low γ . 1, the curve has 1/2 dependence, while at large γ 1 has a

logarithmic rise (� ln γ 2).

Figure 12.10 shows the energy loss of the positively charged muon in copper (Cu).The Bethe–Bloch formula is relevant in the region γ D 0.1 to 1000. At higher en-ergies, radiative losses due to bremsstrahlung and pair creation become important.

For hadrons (π, p , etc.), strong interactions become dominant at high energiesbefore the radiative processes come in. As can be seen in Fig. 12.10, the ionizationrises slowly (� ln γ ) and for a wide range of γ the ionization loss is more orless constant. This is referred to in the literature as minimum ionization loss. It isconvenient to remember the fact that

Minimum Ionization Loss: The energy loss per gram of relativistic particles isabout 1.8 MeV and barely depends on the energy.

The ionization loss is a statistical phenomenon. The measured data fluctuatesaround the value given by Eq. (12.43). The distribution is Gaussian, but with thinmaterials the statistical occurrence of ionization just once or twice with a big kickto the electron is sizeable, producing an asymmetric distribution with a long tailon the high-energy side, which is called the Landau distribution (Fig. 12.11a).

Muon momentum

1

10

100

Sto

ppin

g po

wer

[M

eV c

m2 /

g]

Lin

dhar

d-

Sch

arff

Bethe-Bloch Radiative

Radiative effects

reach 1%

μ+ on Cu

Without δ

Radiative losses

βγ0.001 0.01 0.1 1 10 100 1000 104 105 106

[MeV/c /VeG[] c]

100101 0011.0 10 0011 101

[TeV/c]

Anderson- Ziegler

Nuclear losses

Minimum ionization

Eμc

μ−

Figure 12.10 dE/dx from positive muons in copper as a function of γ D p/M c. Figure takenfrom [311].

Page 411: Yorikiyo Nagashima - Archive

386 12 Accelerator and Detector Technology

Figure 12.11 (a) The Landau distribution,which is a typical energy loss in a thin ab-sorber. Note that it is asymmetric with along high-energy tail. Figure from [311]. (b)

dE/d� curves as a function of momentump D M γ . As dE/d� is a function of γonly, the curves separate depending on theparticle’s mass.

In practical applications, information on momentum (p D M γ ) rather thanvelocity is easily obtained. Then the ionization curve is different for different mass-es (Fig. 12.11b). Thus combined knowledge of p and dE/d� can be used to identifyparticle species.

Density Effect The logarithmic rise ln γ 2 in the ionization loss is due to the rela-tivistic effect of contraction of the distance from the passing particle to the atomsbeing ionized (or transverse electric field E? / γ ). However, if many atoms ex-ist in this region, the screening effect due to dielectric polarization attenuates theeffect, which is referred to as the density effect.

Let us consider how the ionization formula is affected. In matter, the photonsatisfies the dispersion relation

ω2 Djqj2c2

εDjqj2c2

n2 (12.44)

where ω is the wave frequency, q the wave number, ε (normalized as ε0 D 1) thedielectric constant and n the refractive index. Denoting the phase velocity of lightin matter by cm(D c/

pε), the transverse wave number by qT and using qz D ω/v ,

we have

qT Dωv

v2

c2 � 1 (12.45)

For ε > 1, a real photon (Cherenkov light) is emitted if v > cm (described later inSect. 12.4.4).

Figure 12.12 shows argon’s dielectric constant as a function of photon energy.The region of interest is the absorption region (10 eV . „ω . 10 keV), where the

Page 412: Yorikiyo Nagashima - Archive

12.4 Particle Interactions with Matter 387

1 102 104 [eV]

1 102 104 [eV]

10 1

10 2

10 3

10 3

10 5

10 7

10 3

10 5

10 7

1/ε 2

ε 1−1

(a)

(b)

Optical Absorption X ray Region Region Region

Figure 12.12 Dielectric constant (ε Dε1 C iε2) of argon gas at 1 atmosphere. (a)ε2 is displayed as photon absorption length inmeters (/ 1/ε2). (b) ε1 � 1. Abscissa is pho-ton energy in eV. In the optical region, matter

is transparent with ε2 D 0, ε1 > 1. This isthe region in which Cherenkov light can beemitted. The absorption region has an effecton the ionization loss. The transition radiationoccurs in the X-ray region. Figure from [14].

dielectric constant is expressed as

ε D ε1 C i ε2

ε1 ' 1�ω2

p

ω2< 1

„ωp D „

se2ne

ε0me'

�2

ZA

�[g/cm3] 1/2

� 21 eV

(12.46)

where ne is the electron number density and ωp is the plasma frequency. In theabsorption region, the light is partially absorbed and the dielectric constant is com-plex. In this region, qT is imaginary and the transverse electric field decreases ex-ponentially. Writing qT D i�,

� Dq

0γ 0 Dωv

r1� ε

vc

�2'

ωc

s1

(γ )2 Cω2

p

ω2

γ!1�����!

ωp

c

0 D n Dv

cm, γ 0 D

1p1� 0 2

(12.47)

where cm is the velocity of light in matter. The above formula tells us that whensaturation due to the density is effective, Qmin is not given by I/(cγ ) but by „ωp/c.Formally, we only need to replace e by e/n and c by cm D c/n in the formula togive the ionization loss in matter. As a result of this the second term in the lastequality of Eq. (12.38a) is replaced by �2. In the language of quantized field theory,the photon has acquired mass due to the polarization effect of matter. From the

Page 413: Yorikiyo Nagashima - Archive

388 12 Accelerator and Detector Technology

relation (ω/c)2 D q2/n2 ' q2(1� δn2) D q2C (ωp/c)2, the acquired mass is givenby „ωp/c2.

Let us estimate at what value of γ the density effect becomes sizeable. Themaximum transverse distance ymax at which the charged particle can have influ-ence without considering the density effect is given roughly by

ymax '„

pT maxD„cγ

I'

�20 eV

I

�γ � 10�6 cm (12.48)

The size 10�6 cm is of the order of the average intermolecular distance of gas atone atmosphere. On the other hand, the maximum distance of the density effect atmaximum saturation is given by

ym,max Dc

ωp'

1p

�� 10�6 cm (12.49)

Comparing the two, the value of γ to reach saturation is given by γ � 1/p

�,which is� 100 for gas. But for a liquid or solid saturation is reached at small valuesof γ . When the density effect is sizeable, the ionization loss is smaller by δ/2

δ2D ln

ymax

ym,maxD ln

„ωp

IC ln γ (12.50)

Comparing this equation with the original formula, dE/d� increases only as� ln γ in contrast to the original � ln(γ )2 (see the curve in Fig. 12.10 with andwithout the density effect).

Range When a charged particle enters a thick layer of matter, it loses all its energyby ionization loss and stops in the matter. The distance it travels is called the flightdistance or range. Let E be the original energy of the particle and R the range, then

R DZ R

dx DZ E dE

dE/dx(12.51)

As E/M c2 D γ D 1/p

1� 2 and dE/dx are both functions of only, R/M isalso a function of only. Referring to Eq. (12.43), the expression for R can be givenas

R DMz2 G() (12.52)

Namely, the range is proportional to the mass of the incoming particle, and inverse-ly proportional to the charge squared. The concept of the range is useful only whenenergy loss other than ionization loss can be neglected. This is the case for muonswith energy less than a few hundred GeV and low-energy hadrons (R < λ, whereλ is the nuclear interaction length or roughly those with E � M . a few hundredMeV). Energetic hadrons (E � M & 1 GeV) usually interact strongly with matterbefore losing all their energy and induce shower phenomena by cascade interac-tions. The calorimeter utilizes the shower phenomenon to measure the hadronicenergy.

Page 414: Yorikiyo Nagashima - Archive

12.4 Particle Interactions with Matter 389

12.4.3Multiple Scattering

In the consideration of the ionization loss, recoils the traversing particle receivesfrom electrons were totally neglected. However, scattering of a particle by nuclei isnonnegligible because they are heavy. Because the probability of Rutherford scat-tering is inversely proportional to sin4 θ /2 (see Sect. 6.3.1), most of the scatteringoccurs at small angles. Therefore, when a particle traverses a thick material, theeffect of Rutherford scattering is a superposition of statistically independent small-angle scatterings. Namely

hθ 2i DX

θ 2i (12.53)

which shows that the average scattering angle increases as the particle proceeds.Since it is a statistical phenomenon, the distribution is Gaussian. Namely, the an-gular distribution is given by

12π

exp��

θ 2

2θ 20

�dΩ (12.54)

Let us calculate the average scattering angle using the Mott formula given inSect. 6.7. Since the angle is small, the spin effect can be neglected. Using jqj2 D2p 2(1 � cos θ ) ' p 2θ 2,

dσ D4Z2α2E 2

q4

�1 � v2 sin2 θ

2

�dΩ '

4πZ2α2

v2

dq2

q4 (12.55)

The average scattering angle per collision can be found fromZθ 2dσ D

4πZ2α2

v2 p 2

Zdq2

q2 D8πZ2α2

v2 p 2 lnqmin

qmax

D8πZ2α2

v2 p 2ln

rmax

rmin

(12.56)

divided byR

dσ. Here, rmax D 1/qmin is the distance at which the Coulomb fieldof the nuclei is screened by electron clouds. We take rmax as the Thomas–Fermiatomic radius (ratm D 1.4aB Z�1/3 D 7.40 � 10�9 Z�1/3), where aB D „/αme c isthe Bohr radius. For rmin, we adopt the nuclear radius rA D A1/3„/mπ c. When aparticle traverses a distance X cm, it is scattered N (N D X

Rdσ �NA/A) times on

average. Then recovering „ and c, the average scattering angle θ0 is given by

hθ 20 i D Nhθ 2

i i D X16πZ2(α„c)2�NA/A

v2 p 2 ln�

rmax

rmin

�1/2

�E 2

s

v2 p 2

XX0

(12.57a)

Es D me c2

r4παD 21.2 MeV (12.57b)

Page 415: Yorikiyo Nagashima - Archive

390 12 Accelerator and Detector Technology

X �10 D 4αr2

eNAZ2�

Aln

183Z1/3 (12.57c)

Es is referred to as the scale energy and X0 as the radiation length. It is customaryto express the radiation length in g/cm2, in which case X0 ! X 0

0 D �X0 is to beunderstood, and we follow the custom in the following. According to more detailedcalculation or fits to the data [138, 311], the average scattering angle is given by

θ0 �

qhθ 2

0 i D zin

�Es

v p

�sXX0

�1C 0.038 ln

XX0

�11) (12.58a)

X0 D

�4αr2

eNAZ(Z C 1)

Aln

287Z1/2

��1

D(716.4 g cm�2)A

Z(Z C 1) ln(287/p

Z )(12.58b)

The accuracy of measuring the spatial position of a particle track is considerablyaffected by multiple scattering, especially at low energies. In fact, in deciding theparticle momentum using the curvature of a charged track in the magnet, the limitof accuracy very often comes from the indefiniteness due to multiple scattering.

12.4.4Cherenkov and Transition Radiation

A charged particle flying at constant speed does not radiate in vacuum, but canemit photons in matter. Physically, this can be interpreted as dipole radiation of po-larized matter that the moving charge has created. If the particle velocity is smallerthan the velocity of light in the matter v < c/n, the polarization is symmetric andno light is emitted because of destructive interference (Fig. 12.13a).

Figure 12.13 Polarization of a medium as acharged particle traverses it. (a) When the par-ticle velocity v is smaller than the light velocityin the matter c/n, the polarization is symmet-ric and no light emitted. (b) When v > c/n,

the polarization is asymmetric and light isemitted (Cherenkov light). If the medium isnot uniform, the polarization becomes asym-metric and light is emitted even if v < c/n(transition radiation).

If v > c/n, the polarization is asymmetric and constructive interference results inlight emission (Cherenkov light). If the medium is not uniform, the polarizationdistribution is asymmetric and light is emitted even if v < c/n.

11) θ Dq

θ 2x C θ 2

y is a two-dimensional angle. If a one-dimensional projected θx is desired, use

Es ! Es/p

2D 13.6 MeV.

Page 416: Yorikiyo Nagashima - Archive

12.4 Particle Interactions with Matter 391

The transition amplitude for radiation by a particle is given by

i T f i D h f j � iZ

Hint d4x jii D �i eZ

d4xh f jA μ(x ) j μm(x )jii (12.59)

j μm(x ) is a classical current created by the incident charged particle and is given by

Eq. (12.30). Treating photons as a quantized field

h f jA μ(x )jii D ε�μ (λ)e i k x (12.60a)

Integration over x gives

j μm(x )!

v μ

vj ω

j ω D

Zδ(x )δ(y )δ(z � v t)e i k �x d4x D

Z L/2

�L/2dzei(ω/v�k cos θ )z

(12.60b)

where L is the length of material that the particle goes through. The transitionamplitude becomes

dw f i DX

λD˙jT f i j

2 d3k(2π)32ω

DX

λD˙e2ˇˇ εμ v μ

v

ˇˇ2 j j ωj

2 d3k(2π)32ω

(12.61)

The polarization sum is given by

XλD˙

ˇˇεμ

v μ

v

ˇˇ2 D

�δ i j �

ki k j

jkj2

�vi v j

v2 D sin2 θ 2 (12.62)

The above equations apply to vacuum. The electromagnetic field (photon) in mattercan be obtained by replacing

e !ep

ε, k ! nk (or c ! cm D c/

pε) , ε D n2 (12.63)

Inserting Eqs. (12.60), (12.62) and (12.63) in Eq. (12.61), the number of emittedphotons per unit frequency and solid angle dΩ is expressed as follows (recoveringphysical dimensions with „ and c):

d2 NdωdΩ

Dαωn4π2c2

ˇˇZ L/2

�L/2dz( Ok � Ov )e i z/ZF

ˇˇ2 (12.64a)

Ok Dkjkj

, Ov Dvjv j

ZF Dhω

v� nk cos θ

i�1(12.64b)

ZF is referred to as the formation zone.

Page 417: Yorikiyo Nagashima - Archive

392 12 Accelerator and Detector Technology

Cherenkov Radiation If the medium is uniform, the integration gives a δ-functionand nonzero value only when the phase vanishes, namely

cos θc D1

nD

1v/cm

, cm Dcn

(12.65)

Therefore, only when the particle velocity exceeds that of light in the matter doesthe traveling particle emit photons, which are called Cherenkov light. Carrying outthe integration, using j2πδ(� � � )j2 D 2πδ(� � � )

Rdz D 2πδ(� � � )L and dividing by L,

we find the number of emitted photons per unit length and per unit energy interval(dE D d„ω) is

d2NdE dx

Dα„c

�1 �

1n2 2

�D

α„c

sin2 θc 370 sin2 θc eV�1 cm�1 (12.66)

Note, the index of refraction is a function of the photon energy (E D „ω).

Transition Radiation If the velocity of the charged particle does not exceed the ve-locity of light in the material, the phase of Eq. (12.64) does not vanish and no radi-ation is emitted in a uniform medium. However, if there exists nonuniformity inthe medium, for instance, if there is a thin dielectric foil in vacuum with thicknessL at position �L/2 6 z 6 L/2 (see Fig. 12.14), they give a finite contribution to theintegral and the photon is emitted. This is called transition radiation. As is clearfrom the above arguments, transition radiation and Cherenkov radiation have acommon origin.

For this configuration, the number of photons per unit energy interval and perunit solid angle is obtained from Eq. (12.64) by integration. The integral is of theform

Z L/2

�L/2f (n1)C

�Z �L/2

�1C

Z 1

L/2

�f (n2) D

Z L/2

�L/2f f (n1)� f (n2)gC

Z 1

�1f (n2)

(12.67)

VACUUM DIELECTRIC VACUUML

θ

φ

v βc

Figure 12.14 Emission of transition radiation in a thin slab.

Page 418: Yorikiyo Nagashima - Archive

12.4 Particle Interactions with Matter 393

where n2 D 1 is the index of refraction in vacuum. The second term gives a δ-func-tion and vanishes in vacuum. Therefore,

d2NdE dΩ

Dαω

4π2c2n sin2 θ

�sin[(L/2) (ω/v � n1k cos θ )]

ω/v � n1k cos θ� (n1 ! n2)

�2

4π2ωn sin2 θ

�sin[(ωL/2c) (1/ � n1 cos θ )]

1/ � n1 cos θ� (n1 ! n2)

�2

(12.68)

In the region jzj ZF, the exponential function oscillates violently and does notcontribute, but the integral over the formation zone (jzj . ZF) can give a finitecontribution. This means the foil thickness L must be larger than the size of theformation zone, which in practical situations described later is tens of microme-ters. There are no constraints on the index of refraction n and transition radiationoccurs for a wide range of ω. For ω in the optical region (see Fig. 12.12), it givesrise to optical transition radiation at wide angles at modest velocities, that is for below the Cherenkov threshold. However, we are interested in applications for veryrelativistic particles with, say, γ & 1000. In the X-ray region, the index of refractionis smaller than and very close to 1. It can be written as n D

pε ' 1� ω2

p/2ω2 [seeEq. (12.46)]. The intensity of emitted photons is high only where the denominatorin Eq. (12.68) nearly vanishes, which occurs at very small angles of θ ' 1/γ . Thenusing ' 1� 1/2γ 2, the denominator can be approximated by

1� n cos θ '

12

1γ 2 C

ω2p

ω2 C θ 2

!(12.69)

where we have put n1 D 1 � ω2p/2ω2 and n2 D 1. The emission rate becomes

d2NdE dΩ

π2ωφ2 sin2

"ωL2c

1γ 2 C

ω2p

ω2 C φ2

!#

"1

1/γ 2 C ω2p/ω2 C φ2

�1

1/γ 2 C φ2

#2 (12.70)

The two numerators taken approximately equal were factored out and we have used

n sin2 θ 2 ' (n sin θ )2 D sin φ2 ' φ2 (12.71)

where φ is the emission angle in vacuum (see Fig. 12.14). Equation (12.70) repre-sents the interference between the two equal and opposite amplitudes with relativephase given by the argument of the sin2 factor. The interference can be important,but if there is no interference, the flux is

d2NdE dΩ

π2ωφ2

"1

1/γ 2 C ω2p/ω2 C φ2 �

11/γ 2 C φ2

#2

(12.72)

Page 419: Yorikiyo Nagashima - Archive

394 12 Accelerator and Detector Technology

Integration over solid angle dΩ D πdφ2 gives

d IdED „ω

dNdED

απ

"(1C 2

�ω

γ ωp

�2)

ln�

1C γ ωp

ω

�2 � 2

#

Dαπ

(2 ln

�γ ωp/ω

�ω/γ ωp � 1

16

�γ ωp/ω

�4 ω/γ ωp 1

(12.73)

The spectrum is logarithmically divergent at low energies and decreases rapidlyfor ω > γ ωp. Integration of Eq. (12.73), however, is finite and gives a total ener-gy flux αγ„ωp/3 and a typical photon energy of order γ„ωp/4 [216, 311], where„ωp 20 eV. Therefore the radiated photons are in the soft X-ray range 2–20 keV.Since the emitted energy is proportional to γ , the transition radiation is inherentlysuitable for measurements of very energetic particles. However, only � α photonsare emitted with one thin foil. The yield can be increased by using a stack of foilswith gaps between them.

12.4.5Interactions of Electrons and Photons with Matter

Bremsstrahlung The interactions of electrons and photons differ considerablyfrom those of other particles and need special treatments. The reason is as fol-lows. An energetic charged particle loses its energy by bremsstrahlung under theinfluence of the strong electric field made by the nuclei. According to classicalelectrodynamics, the loss is proportional to (2/3)αjRrj2, where Rr is the accelerationthe particle receives. Since jRrj / Z/m, the loss rate is proportional to (Z/m)2,which is large for light particles, especially electrons.

The bremsstrahlung cross section is given by the Bethe–Heitler formulaEq. (7.71). The cross section for emitting a photon per photon energy intervalis obtained by integrating over the electron final state and photon solid angle.The calculation is tedious. Besides, target nuclei are not isolated but surround-ed by electrons, which effectively screen the nuclear electric field. Therefore, weonly quote a result in the relativistic limit with the screening effect taken intoaccount [61, 203]:

dσ D 4Z2αr2e

E f

Ei

dωω

" E 2

i C E 2f

Ei E f�

23

!ln

183Z1/3

C19

#

forEi E f

me ω α�1Z�1/3

(12.74)

where Ei , E f D Ei �ω, ω are the energies of the initial and final electron and pho-ton, respectively, re D α„/(me c) being the classical electron radius. If we neglect

Page 420: Yorikiyo Nagashima - Archive

12.4 Particle Interactions with Matter 395

Table 12.1 Radiation length and critical energy

Material X0 [g] Ec [MeV]

H2 (liquid) 69.3 382

C 44.4 102Al 24.3 47

Fe 13.8 23.9

Pb 5.83 6.9

the second term in the square brackets of Eq. (12.74), which is on the order of afew percent, the equation can be rewritten as

dσdωD

AX0NA

�43�

43

y C y 2�

(12.75a)

y DωEi

(12.75b)

where the radiation length X0 is given by Eq. (12.57c).A characteristic of bremsstrahlung is that the number of emitted photons is pro-

portional to 1/ω and hence the energy spectrum is nearly flat. The energy loss isobtained by

�dEdxD nA

Z „ωmax

dσdω

dω (12.76)

where nA is the number density of atoms. The result is an easy to remember for-mula:

�dEdxD

EX0

, ! E D E0e�x/X0 (12.77)

Actually X0 is defined as the length where the energy loss by bremsstrahlung isequal to that by ionization. Figure 12.15 shows Eq. (12.77) is a good approximationfor energies higher than a few hundred MeV. As 1/X0 / Z2, (dE/d� )brems / Z2

and (dE/d� )ionization / Z ,

(dE/d� )brems

(dE/� )ionization'

E [MeV]800/(Z C 1.2)

�EEc

(12.78)

Ec D 800 [MeV]/(Z C 1.2) is called the critical energy.12) At E > Ec the energyloss due to bremsstrahlung is larger than that due to ionization and at E < Ec

the converse is true. The critical energy is an important parameter in designing acalorimeter. Table 12.1 lists radiation lengths and critical energies of a few exam-ples.

12) There are other definitions of Ec and refined approximations. See [311].

Page 421: Yorikiyo Nagashima - Archive

396 12 Accelerator and Detector Technology

Bremsstrahlung

Lead (Z 82)Positrons

Electrons

IonizationMøller (e )

Bhabha (e+)

Positron annihilation

1.0

0.5

0.20

0.15

0.10

0.05

(cm

2 g−1

)

E (MeV)10 10 100 1000

1 E−

dE dx(X

0−1)

Figure 12.15 Radiation loss of the electron. Figure taken from [311].

Photon Absorption Photons lose energy through three processes: the photoelectriceffect, Compton scattering and pair creation. For Eγ a few MeV, pair creation isthe dominant process.

Pair creation is closely related to bremsstrahlung. It can be considered asbremsstrahlung by an electron with negative energy and that the electron itselfis transferred to a positive energy state. The cross section is almost identical tothat of bremsstrahlung and hence shares many common features. Like electrons,the photon intensity (energy � number per unit area per unit time) as it passesthrough matter is approximated by

d I D �μ I dx

) I D I0e�μx � Io e�x/λ(12.79)

μ is called the mass absorption coefficient if x is expressed in g/cm2. The (mass) at-tenuation length λ D 1/μ of the photon for various materials is given in Fig. 12.16.At high energies (Eγ � 100 MeV), they become constant and Eq. (12.79) is a goodapproximation in the high-energy region. The attenuation length is conceptuallyanalogous to the radiation length. Above Eγ > 1 GeV

λ D97

X0 (12.80)

holds to within a few percent.

Cascade ShowerLongitudinal Profile A very high energy electron loses its energy by bremsstrahlungbut the radiated photon produces a pair of electrons, which produce photons again.

Page 422: Yorikiyo Nagashima - Archive

12.4 Particle Interactions with Matter 397

Photon energy

100

10

10 4

10 5

10 6

1

0.1

0.01

0.001

10 eV 100 eV 1 keV 10 keV 100 keV 1 MeV 10 MeV 100 MeV 1 GeV 10 GeV 100 GeV

Abs

orpt

ion

len

gth

λ (

g/cm

2)

Si

C

Fe Pb

H

Sn

Figure 12.16 Photon attenuation length. Figure taken from [311].

Thus a great number of electrons and photons are created in a cascade by an en-ergetic electron or photon. The multiplication continues until the electron energyreaches the critical energy, after which the ionization effect takes over. If the materi-al length is measured in units of radiation length and the ionization and Comptoneffect are neglected (Rossi’s approximation A [336]), the shower profile is indepen-dent of material. If only the Compton effect is neglected (approximation B) and ifthe energy is measured in units of critical energy, the result is also independent ofmaterial.

A Simple Model Basic characteristics of the cascade shower can be understood byconsidering a very simple model [203]. Let us assume that the electron as well asthe γ -ray lose half their energy as they pass a radiation length X0. The electronemits a γ -ray and the γ -ray creates a positron–electron pair. Then at the depth of2X0, the incident electron that had E0 is converted to two electrons, one positronand one photon, each having energy E0/4. Proceeding this way, at the depth t inunits of X0, the numbers of electrons and photons are given by

ne D2tC1 C (�1)t

3'

23

2t (12.81a)

nγ D2t � (�1)t

3'

13

2t (12.81b)

and the energy of each electron is given by E D E0/2t . Assuming the shower stopsat E D Ec, then Ec D E0/2tmax . Therefore

tmax Dln(E0/Ec)

ln 2(12.82a)

Nmax D23

2tmax D23

E0

Ec(12.82b)

Page 423: Yorikiyo Nagashima - Archive

398 12 Accelerator and Detector Technology

N(> E ) DZ t (E)

d tN D23

Z t (E)

d t exp(t ln 2)

'23

e t (E) ln 2

ln 2D

23

E0

E � ln 2'

E0

E

(12.82c)

) dNdED �

23

E0

ln 2�

1E 2

(12.82d)

As there are two electrons for each photon, the length per track is (2/3)X0, whichgives the total length traveled by the charged particles

L D23

Z tmax

d tN D�

23

�2 Zd t e t ln 2 '

49 ln 2

2tmax 'E0

Ec(12.83)

As the ionization loss is approximately constant regardless of the particle’s ener-gy, the total energy of the incident particle transferred to the cascade shower isgiven by (dE/dx ) � L / E0. The depth at shower maximum is tmax ' ln(E0/Ec).This means the length of the shower increases logarithmically with the energy. Theshower stops abruptly at the shower maximum owing to the simplicity of the mod-el whereas the real shower decreases exponentially. However, basic characteristicsof the model are confirmed qualitatively by both Monte Carlo calculations and data.

Analytical formulae are useful for understanding the characteristics of the show-er, but with the advent of high speed computers Monte Carlo programs have comein handy for simulating showers. Figure 12.17 shows a longitudinal profile of theshower produced by a 30 GeV electron. The longitudinal shape is well reproduced

0.000

0.025

0.050

0.075

0.100

0.125

0

20

40

60

80

100

(1/

E0)

dE

/d

t

t = depth in radiation lengths

Nu

mbe

r cr

ossi

ng

plan

e

30 GeV electron incident on iron

Energy

Photons × 1/6.8

Electrons

0 5 10 15 20

Figure 12.17 A profile of a 30 GeV electronshower generated by the EGS4 simulationprogram. The histogram shows fractional en-ergy deposition per radiation length and the

curve is a gamma-function fit to the distribu-tion. Circles and squares indicate the numberof electrons and photons with total energygreater than 1.5 MeV. Figure taken from [311]

Page 424: Yorikiyo Nagashima - Archive

12.4 Particle Interactions with Matter 399

by a gamma function of the form

dEd tD E0b

(bt)a�1e�bt

Γ (a)(12.84a)

t D x/X0 , tmax Da � 1

bD ln(E/Ec C Ci ) (12.84b)

b ' 0.5 , 13) Ce D �0.5 , Cγ D 0.5 (12.84c)

where tmax is the thickness at shower maximum and a is calculated fromEq. (12.84b).

Transverse Profile So far we have ignored transverse expansion of the shower.In actual situations, although both bremsstrahlung and pair creation are well char-acterized by the radiation length, Coulomb scattering becomes significant belowthe critical energy. A characteristic length representing the transverse expansion isgiven by the Molière unit

RM �Es

EcX0 '

7AZ

[g/cm2] (12.85)

where Es is the scale energy defined in Eq. (12.57b). It is known that 90% of thetotal energy is contained within a radius r D RM and 99% within 3.5RM. A detectorto measure the total energy of an incident particle by inducing showers is called acalorimeter.

LPM Effect In bremsstrahlung, both the scattering angle of electrons and theemission angle of photons are small. This means the kick given to the electronis almost transverse and the longitudinal momentum transfer is very small:

kk D pi� p f �ω Dq

E 2i � m2

e�q

E 2f � m2

e�ω 'm2

e ω2Ei E f

2γ 2 (12.86)

Then, due to the Heisenberg uncertainty principle, the region of photon emissionis spatially extended � 1/ kk. This is reintroduction of the formation zone (ZF) thatwe described in the section on transition radiation. The length of the formationzone is given by � 2γ 2/ω, which is on the order of 10 μm for 100 MeV photonsemitted from 25 GeV electrons. If the condition of the electron changes before itgoes out of the zone, interference with the emitted photon occurs. For instance,if the electron receives a sizeable multiscattering due to many scattering centersinside the zone and if the net scattering angle becomes comparable to that of theemitted photon, the emission is attenuated. This is referred to as the LPM (Landau–Pomeranchuk–Migdal) effect [234, 250, 275]. When the emitted photon energy issmall (i.e., has long wavelength), the density effect (see the paragraph “DensityEffect” in Sect. 12.4.2) has to be taken into account, too. Here, due to dielectriceffect of matter, the wave number becomes k ! nk, which induces a phaseshift(n � 1)kZF and hence destructive interference.

13) Exact numbers depend on material. See for instance, Figure 27.19 of [311].

Page 425: Yorikiyo Nagashima - Archive

400 12 Accelerator and Detector Technology

In order to quantify the above argument, we note that the amplitude of photonemission is proportional to the formation zone ZF. Using the approximations 1 � ' 1/2γ 2, 1 � n ' ω2

p/ω2 and 1 � cos θ ' θ 2/2, Eq. (12.64b) can be written as

ZF D2cγ 2

ω1

1C γ 2θ 2 C γ 2ω2p/ω2 (12.87)

Equation (12.87) shows that the attenuation of the photon emission occurs whenthe second and third terms in the denominator become nonnegligible compared tothe first term. The LPM attenuation results from the second term and the densityattenuation from the third term. Defining ωLPM by the condition that the multiplescattering angle becomes equal to the emission angle

γ 2 ˝θ 20

˛jX DZF(γ2 θ 2D1) D 1! ωLPM D

E 2e

ELPM(12.88a)

ELPM [TeV] D

�me c2

�4 X0

„cE 2S

D 7.7X0 [cm] (12.88b)

From the above formula we see that the photon emission is suppressed when

„ω < „ωLPM or„ωEe

<Ee

ELPM(12.89)

As ELPM is given as 4.3 TeV in lead and 151 TeV in carbon [234], this is primarily anextremely high energy phenomenon. The quantitative estimation of the suppres-sion is not easy, because the number of terms in the bracket of Eq. (12.68) increas-es. Qualitatively speaking, the ω dependence on the number of emitted photonsshifts from 1/ω to 1/

pω at ω � Ee . Figure 12.18a shows a comparison of the

energy loss per radiation length of uranium by a 25 GeV electron with the standardBethe–Heitler formula [26].

We may say that the ultrahigh energy electron behaves like a pion, because emis-sion of photons and electrons is suppressed and the effective radiation length be-comes larger.

The density attenuation comes in when the phase shift of the emitted photonbecomes of the order of 1:

(n � 1)kZF ' γ 2ω2p/ω2 ' 1

i.e. „ω � γ„ωp � γf� [g/cm3]g1/2 � 21 eV(12.90)

The suppression effect is due to the third term in the denominator of Eq. (12.87).For the case of electrons with 25 GeV energy (γ D 5�104), the density effect comesin at ω � 5 MeV in lead and � 1 MeV in carbon, which are both smaller than theLPM effect. The LPM effect exists also in pair creation by photons.

Although the LPM effect was pointed out long ago theoretically, it has been con-sidered only in treating ultrahigh energy cosmic ray reactions [224, 278] and a quan-titative treatment was difficult. One experiment was carried out using an electron

Page 426: Yorikiyo Nagashima - Archive

12.4 Particle Interactions with Matter 401

Figure 12.18 LPM and dielectric effect inbremsstrahlung [25, 26]. (a) Calculated ener-gy spectrum of photons emitted by 25 GeVelectrons incident on uranium. The LPMcurve (dashed line) is compared with theBethe–Heitler (BH) formula (solid line).Experimental data are the emitted photonspectra from 25 GeV electrons incident on

5% X0 uranium (b) and 6% X0 carbon (c). Thesolid line is a Monte Carlo calculation withLPMC DE (density effect)C TR (transitionradiation). The dotted line (LPMC TR1) andthe dashed line (BHC TR2) use different TRformulae. The deviation of BH from a flat lineis due to multiple scattering.

beam at SLAC [26]. Figure 12.18 shows the result. Both the LPM and density effectsare observed. The rise of the spectrum at the low energy end is due to transitionradiation.

The LPM effect is sizeable only at extremely high energies. However, it may be-come a problem when designing a calorimeter for the future LHC or ILC accelera-tors in the tens of TeV region.

12.4.6Hadronic Shower14)

Energetic hadrons incident on thick material also make a shower. If we scale bynuclear absorption length instead of radiation length, arguments from the electro-magnetic shower are generally valid, too. Namely, the total length of all the tracksis proportional to the incident energy and the size of the shower grows as ln E ,etc. There are some characteristics peculiar to hadronic showers. We show typicalshower configurations of the hadronic and electromagnetic showers side by side(Fig. 12.19) to demonstrate their similarity as well as their difference.i) Absorption Length Usually the absorption length λ is much longer than the ra-diation length: λFe D 131.9, λPb D 194, X0 Fe D 13.84, X0 Pb D 6.37 (all in unitsof g/cm2). The hadronic shower grows slowly compared with the electromagneticshower. This fact can be used to separate them.ii) e/h Compensation In nuclear reactions, a sizeable fraction of the energy disap-pears. Some fraction is used to break nuclei (binding energy) and some is carried

14) [135, 152, 336, 346].

Page 427: Yorikiyo Nagashima - Archive

402 12 Accelerator and Detector Technology

Figure 12.19 Monte Carlo simulation of the different development of hadronic and electromag-netic cascades in the earth’s atmosphere, induced by 100 GeV protons (a) and photons (b). [Z(vertical) axis range: 0–30.1 km, X/Y axis range:˙5 km around shower core.] Image by [345].

away by neutrinos, which are the decay product of pions. Besides, slow neutrons,which are produced in plenty at nuclear breakup, are hard to observe. Consequent-ly, the energy resolution of hadronic calorimeters is worse than that of electromag-netic calorimeters. Furthermore, the secondary particles are mostly pions, of which� 1/3 are neutral and promptly decay to two photons. They produce electromag-netic showers, which are localized relative to hadronic showers. If all the energy ismeasured this is not a problem, but a practical calorimeter is usually of samplingtype and has a structure consisting of layers of absorbers with active detectors sand-wiched between them. The electromagnetic part of the shower may not be sampledor, if it is, the measured energy is much higher because of its local nature. Both ef-fects enhance the statistical fluctuation, especially that of the first reaction. Thehadronic calorimeter has an intrinsic difference in the detection efficiency for theelectronic and hadronic components and its ratio is dubbed the e/h ratio. The per-formance can be improved by enhancing the detection efficiency of the neutronand deliberately decreasing that of the electric component. As uranium capturesslow neutrons with subsequent photon emission, the efficiency can be improved ifit is used as the absorber material [134, 389]. However, as it is a nuclear fuel andhighly radioactive, a conventional choice is to use a material with high hydrogencontent (for example, organic scintillators) to enhance the detection efficiency forslow neutrons and to optimize the ratio of the absorber thickness to that of the de-tector material to adjust the e/h ratio at the expense of reduced energy resolution.

Page 428: Yorikiyo Nagashima - Archive

12.5 Particle Detectors 403

Balancing the responses of electron and hadron components improves linearity ofthe energy response and is effective in measuring the so-called missing energy.

12.5Particle Detectors

12.5.1Overview of Radioisotope Detectors

Particle detectors derive information from light or electrical signals that are pro-duced by particles when they interact with material as they enter the detector. His-torically the first detector was a fluorescent board, with which Röntgen discoveredthe X-ray. Rutherford used a ZnS scintillator to measure the α-ray scattering by agold foil, which led to his celebrated atomic model. The measurement was a tediousobservation by the human naked eye using a microscope. Later, ionization cham-bers, proportional counters and Geiger counters were introduced. Their signal iselectrical. Then, around 1940, there was a major step forward with the invention ofphotomultipliers, which enabled amplification of fluorescent light combined withfast electronics. Another breakthrough was made by G. Charpak when he inventedthe multiwire proportional chamber (MWPC) in 1968, which laid the foundationsof modern detector technology.

A Wilson cloud chamber was used to discover the positron and strange particles,emulsions to discover π mesons, and bubble chambers to discover a great numberof resonances. They were of a type that covered the complete 4π solid angle. Theywere powerful in that they could reconstruct the entire topology of reactions, in-cluding many secondary particles, and played an important role historically. How-ever, because of their long response and recovery times, they become obsolete asthe need for rapid measurement and fast processing time increased.

Modern measuring devices use a combination of various detectors, includinglight-sensitive counters, gas chambers and radiation-hard solid-state detectors withcomputer processing, and achieve the functions of (1) complete-solid-angle cover-age, (2) the ability to reconstruct the whole topology and (3) high-speed processing.

In the following, we first explain detectors classified according to their output sig-nals, then describe their usage depending on what one wants to measure. Signalsare of two kinds, photonic and electronic. Detectors with light outputs are scintil-lation counters, Cherenkov counters and transition-radiation counters. Detectorswith electric signals are semiconductor detectors and various gaseous chambers.Roughly speaking, light signals are faster than electric signals by at least an orderof magnitude, but the latter excel in precision.

Page 429: Yorikiyo Nagashima - Archive

404 12 Accelerator and Detector Technology

12.5.2Detectors that Use Light

a) Scintillation Countersi) Organic Scintillators Fluorescent materials such as anthracene (C14H10), p-ter-phenyl (C18H14), tetraphenyl butadiene (C28H22) and their mixtures are used aslight-emitting scintillators. They come as liquid scintillators, when dissolved in anorganic solvent such as toluene (C7H-8), or solid plastic scintillators, when mixedin plastic bases.

Charged particles ionize atoms in the scintillator as they pass through. Ionizedelectrons excite molecules to excited energy levels and when they return to theground state they emit UV (ultraviolet) photons, which are generally outside thesensitive regions (400–700 nm; 1 nm D 10�9 m) of photomultipliers. Therefore,usually wavelength shifters (another type of fluorescent material) are mixed to gen-erate photons of longer wavelength (Fig. 12.20). Commonly used organic scintilla-tors produce light pulses with attenuation time � 10�9 s and are suitable for fastcounting.ii) Inorganic Scintillators (NaI, CsI, BiGe, BGO (bismuth germanate), etc.) Histor-ically NaI has been in long use but others share a common light-emitting mech-anism. A small amount of thallium is mixed in NaI (NaI-Tl), which become pho-toactive centers, capture ionized electrons and emit light on falling from the excit-ed state to the ground state. One advantage of inorganic scintillators is a large lightoutput (& 10 times) compared to organic scintillators, which gives much better res-olution (a few percent at 1 MeV). Another advantage is their short radiation length(2.6 cm with NaI, 1.12 cm with BGO), which is used to make a compact electromag-netic calorimeter. Disadvantages are that they are slow (attenuation time � 10�6 s)and generally expensive.

Ionization excitation of base plastic

Forster energy transfer

γ

γ

base plastic

primary fluorescer (~1% wt/wt )

secondary fluorescer (~0.05% wt/wt )

photodetector

emit UV, ~340 nm

absorb blue photon

absorb UV photon

emit blue, ~400 nm1 m

10 4 m

10 8 m

Figure 12.20 Scintillation ladder depictingthe operating mechanism of plastic scintil-lators. Förster energy transfer is an energytransfer between two chromophores (part ofmolecules responsible for its color) in proxim-

ity (�10�8 m) without radiation. Approximatefluorescer concentrations and energy transferdistances for the separate subprocesses areshown [311].

Page 430: Yorikiyo Nagashima - Archive

12.5 Particle Detectors 405

iii) Structure of Plastic Scintillators Light from scintillators is guided to photonsensors (typically photomultipliers but photodiodes and other semiconductor de-vices are also used) through transparent plastic light guides. Figure 12.21 showsa few examples of the basic structure of plastic scintillators. Figure 12.21a is themost primitive; (b) is an improved (called adiabatic) type, which has better lightcollection efficiency. This was achieved by matching the areas of the light-emittingsurface and receiving surface by using bent rectangular strips. Liouville’s theoremof phase-space conservation works here. With the advent of collider detectors, acompactified multilayer structure without dead space looking into the interactionpoint has become important. Figure 12.21c uses optical fiber to achieve this at theexpense of light collection efficiency and guides light to far away photomultipliersor other light-collecting devices.

A typical structure of a photomultiplier is shown in Fig. 12.22. When photonshit the photocathode, an electron is emitted with probability 10–20% or higher andhits the first dynode D1, emitting a few more electrons. They are attracted andaccelerated by the higher electric potential of successive dynodes. The potentialis supplied by the accompanying circuit shown at the top of the figure. This isrepeated as many times as the number of dynodes and anode. For an incidentelectron, 3–4 electrons are emitted, thus multiplying the number of electrons incascade. As they reach the anode, the number of electrons is multiplied to (3 �4)14 ' 108, producing a strong electric pulse.

Plastic scintillation counters excel in time response and resolution and are usedto detect the arrival time of particles and trigger other devices. They can also mea-

Figure 12.21 Scintillation counters with prim-itive (fishtail) (a) and improved (adiabatic) (b)light guides. (c) The use of optical fibers em-bedded in a groove as light guides is mostly

used for stacks of scintillators, typically in acalorimeter where compactness takes prece-dence over light collection efficiency. PM =photomultiplier.

Figure 12.22 Schematic layout of a photomultiplier and voltage supply taken from [298].

Page 431: Yorikiyo Nagashima - Archive

406 12 Accelerator and Detector Technology

sure the time interval that the particle takes to fly a certain length (TOF; time offlight) and determine the particle’s velocity. Combined with the momentum infor-mation this can be used to determine the mass, i.e. particle species. A typical plasticscintillator has a density of 1 g/cm3. Therefore, when an energetic charged particlepasses a scintillator of thickness 1 cm, it deposits � 1.8 MeV of energy in it. As-suming an average ionization energy of 30 eV, it creates � 6 � 104 ion pairs andproduces � 10 000 photons. As the average energy of fluorescence is hhνi � 3 eV,about 1–2% of the energy is converted to photons. Assuming geometrical efficiencyof 10% for the photons to reach the photomultiplier and assuming quantum effi-ciency � 20% for conversion to electrons, the number of emitted photoelectronsis about 200, which has statistical uncertainty of 10–20%, a typical resolution ofplastic scintillators.

Problem 12.2

Assume that light signals with an attenuation time of 10 ns (1 ns D 10�9 s) are10 ns long triangular pulses and are amplified by the photomultiplier given inFig. 12.22. Assuming a gain (amplification factor) of 107, R D 50Ω (R C � 10 ns,where C is stray capacitance of the circuit), what is the voltage of the pulse heightobtained from 200 photoelectrons?

b) Cherenkov Counteri) Basic concept The Cherenkov counter resembles the scintillation counter in thatit extracts light from a transparent material. It differs in that it does not emit light ifthe particle velocity is below a threshold ( D v/c D 1/n; n D index of refraction),and also in that the emission is not isotropic but directional. The response timeis also much faster, restricted only by the photomultiplier response. The emissionangle θ with respect to the particle direction is determined by

cos θ D1

n(12.91)

The energy loss per unit length is given by Eq. (12.66). Denoting the frequency ofthe Cherenkov light as ν

dEdxD

2πz2αc

�1�

12n2

�hνdν (12.92)

Here, ze is the charge of the incident particle. In terms of the wavelength it isproportional to dλ/λ3 and the spectrum is dominated by photons in the violet toultraviolet region. The number of photons per unit length is obtained by dividingthe above formula by hν. Putting n approximately constant in the narrow region(ν1–ν2) where photomultipliers are sensitive, we obtain

dNdxD

2πz2αc

(ν1 � ν2) sin2 θc ' 500z2 sin2 θc cm�1 (12.93)

Page 432: Yorikiyo Nagashima - Archive

12.5 Particle Detectors 407

As an example, putting z D 1, D 1, n D 1.33 (water), dN/dx ' 200 cm�1.Further setting λ D 600 nm, dE/dx ' 400 eV/cm. As the ionization loss is� 1.8 MeV/g, only 0.02% of the energy is used and the light emission efficiencyis � 1/100 that of a scintillator.ii) Threshold-Type Cherenkov Counter Counters of this type are used to identifyparticle species by utilizing the fact that light is emitted only when > 1/n. Forinstance, when there is a mixed beam of electrons, muons and pions with fixedmomentum, the velocities are e > μ > π because me < mμ < mπ . As theindex of refraction of the gas changes in proportion to the pressure, two Cherenkovcounters can be arranged so that one of them detects electrons only and the otherelectrons and muons but not pions. Then the two counters in tandem can separatethree species of the particle.iii) Differential-Type Cherenkov Counter Even if > 1/n, the emission angle de-pends on the velocity. By picking up light emitted only in the angle interval fromθ to θ C Δθ , particle identification is possible. Figure 12.23a shows a differential-type Cherenkov counter called DISC [265], which was used to identify particles inthe hyperon beam (a 15 GeV secondary beam produced by a 24 GeV primary pro-ton beam at CERN). The beam enters from the left and goes through a gas radiator.Emitted light is reflected back by a mirror, through a correction lens and slit, re-flected through 90ı by another mirror and collected by photomultipliers. Only lightbetween θ and θ C Δθ is detected by use of a slit. To achieve Δ/ D 5 � 10�5,a correction lens made of silica gel is used. Figure 12.23b shows a velocity curve ofthe Cherenkov counter. The chosen velocity is varied by changing the pressure ofthe nitrogen gas.iv) RICH (Ring Image Cherenkov Counter): In this type of counter the emitted lightis mapped on a plane to produce characteristic ring patterns. An example used aspart of the collider detector complex at DELPHI [119] is shown in Fig. 12.24.

Figure 12.23 (a) A differential-type gasCherenkov counter used to identify particlesin the hyperon beam [265]. (b) A velocity curveof the Cherenkov counter to identify 15 GeV/c

particles produced by the 24 GeV primary pro-ton beam. The velocity is varied by changingthe pressure of the nitrogen gas.

Page 433: Yorikiyo Nagashima - Archive

408 12 Accelerator and Detector Technology

Figure 12.24 RICH counter. A part of DELPHI detector at LEP/CERN [119].

Figure 12.25 Cherenkov light detected by the RICH counter. (a,b) Cherenkov light from theliquid radiator projected on a plane perpendicular to the incident particles. (a) Photons emittedby a single particle. (b) Accumulated light from 246 cosmic ray muons.

Functionally, it is two counters in one. The front section uses the liquid radiatorC6F14 and the rear section uses the gas radiator C5F12. Cherenkov light from thegas radiator is reflected back by mirrors and detected by the same drift tube de-tector that detects the light from the liquid radiator. The drift tube (similar to thedrift chamber to be described later) contains liquid CH4C C2H6C TMAE, whereTMAE is tetrakis(dimethylamino)ethylene. The signals are processed by comput-ers. Figure 12.25 shows a pattern projected on a plane perpendicular to the incidentparticles.

Figure 12.25a is the response for a single particle and (b) is a collection oflight signals obtained from 246 cosmic ray muons. The poor photon statistics ofCherenkov light is reflected in the small number of detected photons.

c) Transition Radiation Counter The transition radiation counter uses light emittedat the boundary of two materials with different dielectric constants. As described

Page 434: Yorikiyo Nagashima - Archive

12.5 Particle Detectors 409

before, the origin is dipole emission of polarized media induced by the passage ofa particle. As the emitted photon energy is proportional to γ D 1/

p1� 2, it is

most suitable for detection of extremely relativistic (γ & O(1000)) particles. Thedifferential emission spectrum is given by Eq. (12.72). The emission angle of thephoton is concentrated in the forward region (θ 1/γ ). The spectrum decreasesrapidly above ω > γ ωp and the lower cutoff is determined by the detector condi-tion. Taking the cutoff frequency ωc D 0.15γ ωp, the detected number of photonsper layer is given roughly by

Nγ (ω > 0.15γ ωp) � 0.5α (12.94)

As the number of photons emitted is extremely small, some several hundred layerswith gaps between are necessary for practical use. In a practical design, the transi-tion radiation counter consists of radiators and detectors. A dominant fraction ofsoft X-rays are absorbed by the radiator itself. To avoid absorption of emitted X-raysby the absorbers, they have to be thin. But if they are too thin the intensity is re-duced by the interference effect of the formation zone (ZF ' a few tens of microm-eters). Besides, for multiple boundary crossings interference leads to saturation atγsat ' 0.6ωp

p`1`2/c, where `1, `2 are the thickness of the radiators and the spac-

ing, respectively [30, 155]. As the X-ray detectors are also sensitive to the parent par-ticle’s ionization, sufficient light to surpass it is necessary. A practical design has tobalance the above factors. Radiators that have been used come in two kinds: a stackof thin foils or a distribution of small-diameter straw tubes. Figure 12.26 showsthe latter type used in VENUS/TRISTAN [340]. The radiator consists of polypropy-lene straw tubes with 180 μm diameter placed in He gas. With cutoff at 11 keV, itachieved 80% efficiency for electrons with rejection factor greater than 10 againstpions.

Figure 12.26 Transition radiation counter. (a) Schematic layout of one used inVENUS/TRISTAN. (b) Its radiation spectra [340, 341]. Note the pion’s signal is due to ioniza-tion.

Page 435: Yorikiyo Nagashima - Archive

410 12 Accelerator and Detector Technology

12.5.3Detectors that Use Electric Signals

a) Gaseous DetectorsBasic Structure Electrons and ions can move at high speed in gas and when theproper potential is applied they can be extracted as electric signals. Three types ofgaseous detectors – ionization, proportional, and Geiger tubes – were used firsthistorically. They share the same structure but have different operating voltages(Fig. 12.27). A rare gas such as argon (Ar) or xenon (Xe) is confined in a cylindricaltube with a metal wire stretched at the central position. The reason for using raregas is so that ionized electrons do not easily recombine with positive ions. Theouter wall of the cylinder (cathode) is grounded and high voltage is applied to thecentral wire (anode). The strength of the electric field is proportional to 1/r . Whena charged particle passes through the tube, electrons and positive ions are madealong the track. The number of ion pairs is proportional to the energy loss of theparticle. If no voltage is applied, they recombine and restore electric neutrality, butif a voltage is applied, electrons are attracted to the anode and positive ions areattracted to the cathode, creating an electric current. However, as positive ions aremuch heavier they are slow in reaching the cathode and the signals they create areusually discarded.Operation Mode When the applied voltage is low, exactly the same number of freeelectrons as are created by ionization are collected by the anode. Detectors oper-ating in this region (ionization mode) are called ionization tubes (or chambers,depending on the shape of the detector). The electric signal is small and used on-ly when a large enough number of electrons are created. With higher voltage, the

CATHODE

ANODE WIRE

CHARGED PARTICLE

+HV

SIGNAL

(a) (b)0 250 500 750 1000

1012

1010

108

106

104

102

1

Num

ber

of C

olle

cted

Ion

s

Electron

α particle

IonizationMode

ProportionalMode

Limited

ModeStreamer

GeigerMode

DischargeRegion

RecombinationBefore Collection

Applied Voltage (volt)

Figure 12.27 (a) Basic structure of a gaseous counter. (b) The number of collected ions in vari-ous detection modes depending on the applied voltage.

Page 436: Yorikiyo Nagashima - Archive

12.5 Particle Detectors 411

accelerated electrons knock out additional electrons from the gas molecules. Bymaking the radius of the anode wire small, an electric field with large gradientis created near the anode, the secondary electrons can ionize the molecules andmultiplication begins, leading eventually to a cascade. When the voltage is not toohigh, the number of electrons collected at the anode is proportional to the numberproduced by ionization (proportional mode). Detectors in this range (proportionaltubes) can attain an amplification factor of 103–106 by varying the voltage.

With further increase of the applied voltage, the space charge effect due to theion pairs distorts the electric field and the proportionality is lost (limited streamermode). However, some designs take advantage of the high output, which elimi-nates the need for electronic amplifiers. The induced cascade is still local, but witheven higher voltage it spreads all over the detector region (Geiger mode), givinga saturated output signal regardless of the input. The reason for the spreading ofthe cascades is that excited molecules emit light when going back to the originalstates and excite more molecules in the far distance. In this mode, a quenchinggas to absorb light is needed to stop the discharge. The recovery time is long andit is not suitable for high-speed counting. Note that, depending on the inserted gasspecies, the region of limited proportionality may not appear. The relation betweenthe voltage and the number of output electrons is depicted in Fig. 12.27b.Ionization Chamber As the output is small in the ionization mode, it is used onlyfor special purposes, such as a calorimeter, where all the energy is deposited inthe detector. The use of liquid increases the density and enables the detector tocontain showers in a small volume and also increase the number of ionized elec-trons per unit length. In liquids, molecules such as oxygen capture electrons and asmall fraction of such impurities prevents electrons from reaching the anode. Forthis reason only rare gas liquids such as Ar and Xe are used. Even in this case,the concentration of oxygen impurities has to be less than a few ppm (parts permillion D 10�6). There are many advantages in using it in high-energy experi-ments, but the disadvantage is that the liquid state can be maintained only at liquidnitrogen temperature (77 K) and therefore necessitates a cooling vessel (cryostat),which makes a dead region. To avoid this, so-called warm liquids (organic liquidsreferred to as TMP and TME) are being developed.MWPC (Multiwire Proportional Chamber) The MWPC, invented by Charpak [95]in 1968, made a big step forward in the high-speed capability of track detection. Ca-pable of high speed and high accuracy, it has accelerated the progress of high-ener-gy physics and has found widespread application as an X-ray imager in astronomy,crystallography, medicine, etc. For this contribution, Charpak was awarded the No-bel prize in physics in 1992. The basic structure is depicted in Fig. 12.28a. Manywires are placed at equal intervals between two cathode plates and high voltage isapplied between the wires (anodes) and the cathodes to be operated in proportionalmode. The electric field is shaped as in Fig. 12.28b and each wire acts as an in-dependent proportional tube. When a particle traverses the plate perpendicularly,ionized electrons are collected by nearby wires, thus achieving position accuracydetermined by the spacing of the anode wires.

Page 437: Yorikiyo Nagashima - Archive

412 12 Accelerator and Detector Technology

Figure 12.28 (a) A schematic of a MWPC and (b) lines of the electric field [312].

A typical example has wire spacing 0.5–2 mm, plate separation 1.5–2 cm, electricvoltage 3–4 kV and radius of the anode wires 10–50 μm. The field is 104–105 V/cmand the amplification factor is� 104. A space resolution of 0.5 mm is obtained. Forthe gas in the chamber, Ar with 10% methane or so-called magic gas [Ar 75% Cisobutane (C4H10) C Freon 13 (CF3Br)] are commonly used.

Usually two MWPCs are used in combination to determine both x and y coor-dinates. Sometimes more planes with u or/and v coordinates (slanted relative tox , y ) are used to eliminate multi-hit ambiguity. If strips of metal sheet instead ofplanes are used as cathodes, a single plane of the MWPC can obtain x and y si-multaneously. The risetime of the MWPC is fast and the decay time is dictated bythe ion mobility and the size of the chamber. If the wire spacing is a few millime-ters, the decay time can be very fast (25–30 ns) and the MWPC can sometimes beused to obtain time information. Each wire can cope with a very high luminosity of� 104 s�1 mm�1.i) Drift Chamber (DC) The basic structure of a drift chamber is given in Fig. 12.29.Contrary to the proportional chamber, the field lines in the DC are horizontal be-tween the anode and cathode wires. Besides, a number of field wires with gradu-ated voltages are used to obtain a uniform electric field. Ions produced by incidentparticles drift towards the anode with constant velocity. The drift time together withthe time information about the particle passage obtained separately determines thedistance from the anode wire. The anode wires are in the proportional mode andthe field distribution near the anode is similar to that in the MWPC. The uniform

Figure 12.29 Basic structure of a drift chamber [77]. The field lines in the figure are horizontalbetween the anode and cathode wires. A number of field wires with graduated voltages are usedto obtain a uniform electric field.

Page 438: Yorikiyo Nagashima - Archive

12.5 Particle Detectors 413

electric field has � 1 kV/cm extending 1–10 cm. As the drift velocity of the elec-trons is 4–5 cm/μs and the accuracy of time measurement is � 2 ns, it can givean accuracy of position determination � 100 μm. Drift chambers are often usedin a magnetic field to determine the momentum of particles. In a magnetic field,electrons feel the Lorentz force and the voltage applied to the field wires is variedto compensate it. As the operational mode is the same as the MWPC, the responsetime and radiation hardness are similar. There are many variations of drift cham-bers, some of which will be introduced later.ii) GEM (Gas Electron Multiplier) With the advent of semiconductor microchiptechnology, much smaller strips of conductive plates have come into realization.GEM is one such technology among many similar detectors (micromega detector,etc., see [311]). Figure 12.30 shows the layout of the GEM detector. It has essentiallythe same structure as the MWPC, however, the anode intervals are much smaller.Multiplication of ions is carried out not by a cylindrical electric field but by a dif-ferent potential across the hole and outcoming electrons are captured by nearbyanodes.

Figure 12.30 (a) Layout of a GEM detector.Ions in the drift space are attracted by theGEM, go through an amplification hole andare collected by nearby anodes. (b) Schematicview and typical dimensions of the hole struc-ture in the GEM amplification cell. Electricfield lines (solid) and equipotentials (dashed)

are shown. On application of a potential dif-ference between the two metal layers, elec-trons released by ionization in the gas volumeabove the GEM are guided into the holes,where charge multiplication occurs in the highfield [311].

b) SSDs (Solid State Detectors) When a small amount of an element with fivevalence electrons (P, As, Sb) is added to a semiconductor with four valence elec-trons such as silicon (Si) or germanium (Ge), excess electrons are easily separatedto become free electrons (n-type semiconductors or n-SSD) and the SSD becomesslightly conductive. On the other hand if an element with three valence electrons(B, Al, Ga, In) is added, one electron is missing for each covalent bond and anelectron is borrowed from somewhere else, resulting in excess positive freely mov-ing holes (p-type semiconductor or p-SSD). The elements with five valence elec-trons are called donors and those with three, acceptors. Free electrons and holes

Page 439: Yorikiyo Nagashima - Archive

414 12 Accelerator and Detector Technology

are called carriers. When p-SSD and n-SSD are in plane contact with each other(p-n junction), thermal motion causes electrons to diffuse into p-SSD and holesinto n-SSD. The diffused electrons (holes) annihilate holes (electrons) on the otherside, leaving both sides of the junction with ionized donors and accepters. As theseions are bound in lattices, there remain distributed static charges, namely electricfields. The field works in the opposite direction to prevent conduction and even-tually balances with the diffusing force. Carriers are absent in this region, whichis called the depletion layer. The voltage across the depletion region is called thediffusion voltage. The outside region is neutral and the potential is nearly flat.

When one applies reverse bias, namely, positive (negative) voltage on the n (p)side of the junction (Fig. 12.31a), electrons and holes are attracted and extend thedepletion region. No current flows. If, on the other hand, forward bias is applied(Fig. 12.31b), the diffusion potential is reduced, causing electrons on the n side andholes on the p side to diffuse beyond the depletion region, and a current flows.

If a charged particle passes through the depletion region, ionized electrons andions along the track are attracted by the diffusion voltage and a pulse current flows.Ion pairs outside the depletion region diffuse slowly, giving a slow component thatis mostly ignored. The p-n junction can be considered as a condenser with elec-trodes placed at the two edge of the depletion layer. Let C be its capacitance, CA

the effective capacitance of an amplifier connected to the junction to read out theelectric signal, and Q the amount of all the charges collected. Then the magnitudeof the pulse is given by Q/(C C CA). A characteristic of a semiconductor is that theenergy levels of the impurity are very close to the conduction band, and it takes onaverage only 3.0 eV (Ge) or 3.7 eV (Si) to make pairs of ions. We have already statedthat in ordinary matter it takes about 30 eV to ionize atoms. This means that thenumber of created ions in the SSD is ten times and statistical accuracy is greatly en-hanced. An ionization loss of 1.8 MeV can produce 1.8�106/3.7 ' 5�105 ions withstatistical error of 1/

pN ' 0.2%. If we consider the additional geometrical loss of

Figure 12.31 A p-n junction in reverse bias (a)and in forward bias (b). When there is no biasvoltage across the junction, electrons fromthe n side and holes from the p side diffuseacross the junction and vanish after recom-bination, creating a positively charged regionon the n side and a negatively charged regionon the p side. As a result, a region with no

conducting electrons and holes, referred to asthe depletion region, develops. With reversebias (a) the depletion region expands withlittle mobility. With forward bias (b), both elec-trons and holes are attracted and absorbed bythe electrodes, which in turn supply more ofthem, and the current flows.

Page 440: Yorikiyo Nagashima - Archive

12.5 Particle Detectors 415

- HV

OutputParticle

+

+

+

+

+

+

-

-

- --

-

-

-

Al Electrode

B-doped Sip type

As-doped Sin type

Al ElectrodeSensitive Area~300 μm

~30 μm

Figure 12.32 Schematic layout of a silicon microstrip detector. The strips are capacitively cou-pled.

scintillation counters, the resolution is at least �10 that of NaI and approximately�100 that of plastic scintillation counters.

If a SSD is made in the form of arrays of strips with a width of say 30 μm us-ing methods taken from the manufacture of integrated circuits (Fig. 12.32), it canfunction as a high precision position detector. Applications of SSDs as high-qualitydetectors are plentiful. The disadvantage is that it is difficult to manufacture largedetectors and they are prone to radiation damage.

12.5.4Functional Usage of Detectors

The information one wants to obtain using particle detectors is classified as follows.(1) Time of particle passage: Detectors are used to separate events that occur withina short time interval or to determine the particle’s velocity by measuring it at twodifferent locations. The most widely used detectors are scintillation counters withtime resolution Δ t ' 100 to 200 ps (1 ps D 10�12 s).(2) Position of particles: Detectors are used to reconstruct tracks, the direction ofmomentum and patterns of reactions. MWPCs (multiwire proportional chamberswith position resolution Δx � 1 mm), DCs (drift chambers with Δx ' 100 to200 μm) and SSDs (solid state detectors with Δx ' 10 to 100 μm) are used. Forsome special experiments where detection of fast decaying particles is at stakeemulsions are used to obtain Δx ' 0.1 to 0.5 μm. When the speed matters, ho-doscopes with reduced position resolution in which strips of scintillation countersare arranged to make a plane come in handy. They are used for triggering (select-ing desired events from frequently occurring reactions and initiating data taking)or for crude selection of events.(3) Energy-momentum (E , p ) is obtained by dE/dx counters (which measure theionization loss by a thin material), range counters or calorimeters (which measurethe total energy deposited in a thick material). To obtain the momentum one mea-sures the curvature of a particle’s circular motion in a magnetic field.

Page 441: Yorikiyo Nagashima - Archive

416 12 Accelerator and Detector Technology

(4) Particle identification (PID) is carried out by measuring the mass of the parti-cle. With combined knowledge of the momentum p and velocity v, the mass valuecan be extracted from m D p/v γ . A pattern of particle reactions contains muchvaluable information. A long track without interaction may be identified as a μ,an electromagnetic shower as γ or e (depending on the existence of a pre-track),a V-shaped decay pattern with long (typically a few cm � 1 m) missing tracks asK0, Λ and with (typically 10 μm � 1 mm) short missing tracks as τ leptons or B, Cparticles. In a many-body system, knowing the energies and momenta of particlescan give an invariant mass ((mc2)2 D (

PEi)2 � (

Pp i )

2) of the whole system.In the following we list detectors classified by their function.

Momentum Measurement A particle with momentum p makes a circular motionin a magnetic field, and from its curvature the momentum can be determined [seeEq. (12.11)]:

pT[GeV/c] D 0.3B R [T m] (12.95)

Note, however, that the measured momentum is a component (pT) transverse tothe magnetic field. The total momentum can be calculated if the angle with re-spect to the magnetic field is known. Let us assume that a track-measuring deviceis placed in the magnetic field and n points of a track with total projected length Lwithin the magnet are measured. The momentum can be determined by measur-ing a quantity called the sagitta, which is defined as the distance from the middlepoint of the arc to a line made by two end points. For a not so large angle whereθ ' sin θ holds

s 'L2

8RD

R8

�0.3B L

pT

�2

%sagitta (12.96)

From the formula, one sees that the momentum resolution improves as the mag-netic field and path length become larger and as the resolution of space measure-ment improves. The momentum resolution is given by

ΔpT

pTD

ΔRRD

Δ ssD Δ s

8RL2 D

80.3

pT

B L2 Δ s (12.97)

One needs to measure at least three points to determine the sagitta and the res-olution improves as the number of measured points increases. If we denote theresolution of a point measurement as Δx , the relation and the momentum resolu-tion are given by

ΔpT

pTD

8<ˆ:r

32

80.3

pTΔxB L2 N D 3

pTΔxB L2

10.3

r720

N C 4N & 10 15)

(12.98)

In addition to this measurement error, one has to consider the error due to multiplescattering (see Sect. 12.4.3).

15) [175].

Page 442: Yorikiyo Nagashima - Archive

12.5 Particle Detectors 417

Figure 12.33 An example of momentum mea-surement by a drift chamber in the magnet.(a) Cross section perpendicular to beam ax-is of the Central Drift Chamber (CDC) in theVENUS detector [27]. (b) Electric field linesand equipotentials in a unit cell. CDC has 10layers, each of which, except the innermostlayer, is a combination of three planes of drift

chambers staggered by half a cell to eachother. Time information from one cell alonesuffers from left–right ambiguity but informa-tion from two staggered planes can determinethe vector of the particle motion. Informationfrom the third plane is used to obtain the zcoordinate along the axis.

Figure 12.33 shows a cylindrical drift chamber to determine momenta of sec-ondary particles built in the VENUS/TRISTAN detector [27]. The magnetic field isalong the beam direction and the wires are also strung in the same direction. Aunit cell is made of one anode wire and eight field wires. The equipotentials of theelectric field are circular and the position accuracy does not depend on the angle ofincidence of the particle.

Energy Measurement If a particle loses energy only by ionization, the range(Sect. 12.4.2) can determine the total kinetic energy of the particle. However, thismethod can be applied only to muons or low-energy hadrons below a few hundredMeV. The majority of particles interact in one way or another before they stop. Acalorimeter can measure the total energy of practically all known particles exceptmuons and neutrinos. A calorimeter can measure the energy of neutral particles,too. They have become an indispensable element of modern high-energy colliderexperiments. They are suitable for observation of jets, which have their origins inenergetic quarks and gluons, and by making a calorimeter a modular structure,it is possible to cover a large solid angle. The depth of a calorimeter grows onlylogarithmically with energy.i) Total Absorption Cherenkov Counter This is a Cherenkov counter of total ab-sorption type. A transparent material that contains a heavy metal having largedensity and short radiation length is suitable for the purpose. Lead glass (� � 5,X0 � 2 cm), for instance, with a depth of 15–20X0 can contain almost all electro-magnetic showers and can detect Cherenkov light, whose output is proportional to

Page 443: Yorikiyo Nagashima - Archive

418 12 Accelerator and Detector Technology

the total length of the shower tracks. Thus it serves as an electromagnetic calorime-ter and can measure the total energy of an incident electron or a gamma ray. Othermaterials that can be used include NaI, CsI, and BGO.ii) Sampling Calorimeter For hadrons a total absorption counter made totally ofactive material is impractical because the size becomes too big. For economic rea-sons, a sandwich of iron or lead absorbers and active detectors is frequently used.As detectors plastic scintillators or liquid argon proportional chambers are used.The calorimeter is usually placed in the magnetic field, but it is better to placethe photomultipliers outside the field. Figure 12.34a shows the basic structure of acalorimeter. Showers are sampled when they pass through the active detector. Ide-ally a separate counter should be used for each sampled signal, but in order to avoidhaving a large space occupied by light guides and photomultipliers, which may cre-ate a dead angle or space, light outputs are often transmitted to a wavelength-shifterand led to a photomultiplier through light guides. Figure 12.34b shows a configu-ration in which UV light is converted to blue light by the wavelength-shifter in thescintillator, which in turn hits a second wavelength-shifter after passing throughthe air gap and is converted to green light. Light with longer wavelength can trav-el further and thus can be guided along a long path to photomultipliers that areplaced behind the absorber/radiator or where there is little effect of the magneticfield.Recently, optical fibers such as in Fig. 12.21 have also been used. As show-ers can produce a large number of electrons, ionization chambers can be used asdetectors that can extract more accurate information. Here absorbers play a dou-ble role, being electrodes, too (Figure 12.34c). A practical design of a liquid argoncalorimeter (that used in the ATLAS/LHC detector) is shown in Fig. 12.35.

As the difference between the electromagnetic and hadronic calorimeters lies inthe use of radiation length versus interaction length, different materials and sam-pling rates should be used. Usually lead is used for electromagnetic calorimetersand iron for hadronic calorimeters. However, the structure can be the same. If 20X0

of lead is used for the electromagnetic calorimeter, this amounts to 0.65 interaction

Figure 12.34 Structure of sampling calorime-ters. (a) A calorimeter compactified to be usedin a collider detector. To avoid a dead angleand to make the solid angle as large as possi-ble, light is transported through a wavelength-shifter and light guide to distant photomulti-

pliers. (b) Conceptual and enlarged layout ofthe light transmission mechanism of (a). (c)Liquid argon is used instead of scintillators asphoton detectors. Absorbers are immersed inliquid argon in a cryogenic vessel.

Page 444: Yorikiyo Nagashima - Archive

12.5 Particle Detectors 419

Δϕ = 0 0245

Δη = 0 02537 5mm/8 = 4 69 mm Δη = 0 0031

Δϕ=0 0245x436 8mmx4 =147 3mm

Trigger Tower

TriggerTowerΔϕ = 0 0982

Δη = 0 1

16X0

4 3X0

2X0

1500

mm

470 m

m

η

ϕ

η = 0

Strip towers in Sampling 1

Square towers in Sampling 2

1 7X0

Towers in Sampling 3 Δη 0.0245 0.05

(a) (b)

Figure 12.35 A module of the liquid ar-gon calorimeter used in the ATLAS/LHCexperiment. (a) Accordion structure ofthe sampling layers with a shower simu-lation superimposed. (b) Actual designof a module with solid angle coverage ofΔη � Δφ (radian) D 0.025 � 0.0245. For

the definition of η, see Eq. (12.100). The thick-ness of a lead layer is 1–2 mm and a readoutelectrode made of thin copper and Kapton(polyimide) sheets is placed in the middle ofthe lead layers. The total thickness is 22 radia-tion lengths. It has attained an energy resolu-tion of σ/E D 10%/

pE C 0.3% [32, 209].

length. For this reason, electromagnetic calorimeters are usually placed in front ofhadronic calorimeters.iii) Energy Resolution of a Calorimeter The calorimeter uses a shower event, whichis a statistical phenomenon. The detected total length is proportional to the energy.If most of the shower can be contained in the calorimeter, the energy resolution σ isexpected to be� 1/

pE considering the statistical (Poisson) nature of the shower. In

the sampling calorimeter, the statistics become worse proportional to the absorberthickness d. The combined effects on the resolution are expected to be

σp

ED k

rdE

(12.99)

As the above argument shows, the energy resolution of the calorimeter is worseat low energy (. 10 GeV). On the other hand, momentum resolution using themagnetic field is worse in proportion to the inverse of the momentum. Namely, inthe low-energy region momentum measurement using a magnetic field is accurate,and at high energies the calorimeter is a better method. Modern collider detectorsusually use both to cover all the energy regions.

Particle Identification (PID) The identification of particle species can be achievedby knowing its mass. It can be determined if two variables out of v, E D mγ , p D

Page 445: Yorikiyo Nagashima - Archive

420 12 Accelerator and Detector Technology

Figure 12.36 Particle identification using TOF method (Δ t D 200 ps, L D 175 cm).

mγ are known. We have already shown that at fixed momentum the Cherenkovcounter can select velocities and can identify particles.i) Time-of-Flight (TOF) Method Particle identification is done by measuring thevelocity of particles by the most direct method, namely by measuring the time offlight between two counters. This method usually uses two scintillation counters,but any counters capable of fast timing can be used. Figure 12.36 shows data takenby TOF counters in VENUS/TRISTAN, which has Δ t D 200 ps and a flight dis-tance of 175 cm. The figure shows that particles of p . 1 GeV/c can be identifiedwith the above setup.ii) E/p Method The momentum of a particle is measured with a track-measur-ing device in the magnetic field and the energy is measured by electromagneticcalorimeters. Electrons give E/p ' 1, but hadrons (pions) give E/p � 1, becausethey hardly deposit any energy in the electromagnetic calorimeter as λ (nuclear in-teraction length) X0 (radiation length). Usually the method is used to separateelectrons from hadronic showers, but taking the shower profile further into con-sideration, a rejection factor e/π of 103–104 has been obtained without degradingthe detection efficiency (ε & 90%) [134, 389].iii) Method of dE/dx As dE/dx is sensitive only to , knowing the particle mo-mentum enables PID to be achieved. For this purpose dE/dx has to be measuredaccurately. The most advantageous detector for this method is the time projectionchamber (TPC). It can measure p, dE/dx and space position of all particles si-multaneously and thus can reconstruct the topology of an entire reaction that hasoccurred in the chamber. It is a variation of the drift chamber and its structure isdepicted schematically in Fig. 12.37. The chamber is a large-diameter cylindricaltube with a central partition that works as a cathode and two endcaps that work asanodes. It is placed in a solenoidal magnet and is used as a collider detector. Thebeam goes through the central axis of the TPC and the interaction point is at thecenter of the chamber.

A TPC that was used for TOPAZ/TRISTAN experiments has an inner radius of2.6 m, length of 3 m and the applied voltage is �43.25 kV [223]. All the secondary

Page 446: Yorikiyo Nagashima - Archive

12.5 Particle Detectors 421

Charged ParticleIons

Cathode PadsMWPC

e- e+

InteractionPoint

E EB

- 43.75 kV

( 2 Tesla )

r

Central Partition (- 43.75 kV)

Figure 12.37 Illustration of the TPC used inTOPAZ/TRISTAN [223]. Inner radius 2.6 m,length 3 m. Ions from particle tracks are at-tracted away from the center of the TPC inthe z direction. At the end wall anode wiresof eight trapezoidal MWPCs read the r coor-dinate of the ions. At the back of the anodes,

cathode pads obtain the φ coordinate by read-ing the induced voltage. The arrival time givesthe z coordinate and the number of ions givesdE/dx information. By reconstructing tracksand measuring their curvature, the momen-tum is determined and with dE/dx the parti-cle species can be identified.

Figure 12.38 dE/d� curves as a function of momentum. As dE/d� is a function of γ on-ly, curves made by different particle species at fixed momentum p D mγ separate. Figurefrom [311].

particles go through the chamber except at small solid angles along the beam,therefore it essentially covers 4π solid angle. A magnetic field is applied parallelto the cylinder axis, which is used to measure particle momenta. It is also designedto prevent transverse diffusion of ions so that they keep their r–φ position untilthey reach the anodes. Ions generated along tracks of the secondary particles driftparallel to the beam axis at constant velocity toward the endcap. The endcap con-

Page 447: Yorikiyo Nagashima - Archive

422 12 Accelerator and Detector Technology

sists of eight trapezoidal MWPCs, which can read the r position of the ions. Padsat the back side of the MWPCs read the induced voltage capacitively and obtainthe coordinate φ perpendicular to r. From the drift time of ions the z coordinate isknown. The signal strength at the anode is proportional to the ionization loss of theparticles, and is sampled as many times as the number of wires, giving good statis-tical accuracy for the measurement of dE/dx . From the curvature of the tracks inthe magnetic field, the momentum is determined and together with dE/dx infor-mation this allows particle species to be identified. Figure 12.38 is a dE/d� curveobtained by the TPC of the PEP4 experiment [311].

As described above, the TPC is capable of reconstructing three-dimensional coor-dinates of all the tracks, momenta and dE/dx . The information obtained is compa-rable to that of bubble chambers and yet TPC can function at high speed. It is a verypowerful detector suitable for electron–positron collider experiments. However, itis not without its limitations in that it takes 30–50 μs for ions to reach the endcaps,and has difficulties in applications that require high-repetition data taking, such asin hadron collider experiments.

12.6Collider Detectors

a) Basic Concept The mainstream modern detector is a general-purpose colliderdetector, which is a combination of detectors with various functions. Many of themshare a common basic structure as given in Fig. 12.39, where the basic elementsand the particle’s behavior in them are illustrated. A typical example, the ALEPHdetector used in LEP/CERN [10], is shown in Fig. 12.40. It is used to measure Z0

properties accurately and, together with other detectors at LEP, has been instru-mental in establishing the validity of the Standard Model.

ElectromagneticShower

HadronicShower

μ ν

π,pn

K0 , Λ

Muon ChamberReturn YokeHadron CalorimeterSuperconducting CoilElectromagnetic CalorimeterParticle Identifying Detector Track Chamber

Vertex DetectorBeam Pipe

Figure 12.39 Schematic diagram of a typical collider detector shown in cross section perpendic-ular to the beam axis. Dotted lines indicate neutral and hence invisible particles.

Page 448: Yorikiyo Nagashima - Archive

12.6 Collider Detectors 423

We list a few characteristics common to all the general-purpose collider detectors.(i) General purpose: They are made to respond to almost all (& 90%) kinds ofconceivable reactions. Particle species that can be identified include e (electronsand positrons), γ and π0, μ, charged hadrons (mostly pions) and neutral hadrons(neutron, K0, Λ, etc.).(ii) Necessary conditions for a general-purpose detector are:

1. capability of identifying a particle’s species, measuring the energy-momentumno matter in which direction a particle flies,

2. uniformity of the sensitivity and accuracy over the complete solid angle,3. hermeticity: ideally the detector should have no dead angle, but practically the

beam pipe makes a dead zone in the forward and backward regions. But atleast the momentum and energy components of all the particles transverse tothe beam direction should be able to be measured. This is important in orderto measure the missing energy, which gives important information about theproduction of neutrinos or some of the suggested new particles.

As can be seen from Fig. 12.40, the detector is separated into two parts, a bar-rel region (45ı . θ . 135ı) and an endcap region (a few degrees . θ . 45ı).They have radial symmetry in the cross-sectional view perpendicular to the beamdirection (Fig. 12.41) and also forward–backward symmetry with respect to the in-teraction point.16) The barrel and endcap parts have different geometrical structure

16) This is not the case for asymmetric colliders like those at B-factories and HERA (e–p collider atDESY).

Figure 12.40 Side view of the ALEPH detector at LEP [10].

Page 449: Yorikiyo Nagashima - Archive

424 12 Accelerator and Detector Technology

Figure 12.41 Cross-sectional view of theALEPH detector at LEP [10]. The ALEPH de-tector consists of (from the center outwards)1. beam pipe, 2. silicon semiconductor vertex

detector, 4. inner track chamber, 5. TPC, 6.electromagnetic calorimeter, 7. superconduct-ing coil, 8. hadron calorimeter, and 9. muonchamber.

and necessitate different approaches in designing them, but functionally it is desir-able to have uniformity to avoid experimental errors.

We will describe a few of the important functions of the components.

b) Measurement of Interaction and Decay Points of Short-Lived Particles Reactionsproduced by beam particles colliding with residual gas in the beam are unwantedevents. To remove them it is necessary to determine the interaction point by trac-ing back several tracks. Track detectors inserted in the magnetic field can reproducethem in principle, but if one wants to measure decay points of short-lived particles(charm and beauty particles), which have lifetimes 10�12 s . τ . 10�13 s and fly adistance of only ` D cτγ , i.e. tens or hundreds of micrometers, a specialized vertex(crossing points of tracks) detector is needed. SSD detectors are used, and resolu-tion of � 10 μm in the r φ direction,� 20 μm in the z direction were realized withthe ALEPH detector. Figure 12.42 shows an example of decay-point measurementsand Fig. 12.43 a lifetime curve of a short-lived particle.

c) Momentum Analysis Momenta of charged particles can be determined with acombination of magnet and track chamber. A superconducting solenoidal coil isused to generate a strong magnetic field of 1–1.5 T (Tesla). The direction of the mag-netic field is usually chosen along the beam direction and thus only the transversecomponent of the momentum is measured but with knowledge of the polar angle θof the particle the total momentum can be reconstructed. The tracking chamber us-es drift chambers of one type or another. By stretching wires along the beam direc-

Page 450: Yorikiyo Nagashima - Archive

12.6 Collider Detectors 425

Figure 12.42 Display of a two-jet event record-ed by the ALEPH detector, in which a B0Scandidate has been reconstructed in the fol-lowing mode: B0S ! DC

S C e C νe ,DC

S ! 'πC , ' ! KC . . . . (a) Overall

view in the plane transverse to the beams. (b)Detailed view of the vertex detector, showingthe hits in the two layers of silicon. (c) Zoomclose to the interaction point (IP), showingthe reconstructed tracks and vertices [106].

δ

Λ →

→ →

Figure 12.43 Lepton impact parameter distribution of Λ`� candidates. The solid curve is theprobabilitiy function at the fitted value of the lifetime while the shaded areas show, one on thetop of the other, the background contributions [11].

Page 451: Yorikiyo Nagashima - Archive

426 12 Accelerator and Detector Technology

tion a good spatial accuracy can be obtained for r–φ coordinates. The z coordinatesare usually obtained by layers of slanted wires stretched at a small angle relative tothe main layers and their accuracy is not as good as that of r or φ. ALEPH used aTPC, and with B D 1.5 T, Δ s D 100 μm obtained δ pT/p 2

T D 0.0009 (GeV/c)�1.Δ s D 100 μm was obtained for particles having transverse track length L D 1.4 m.Because of the need to minimize multiple scattering, which restricts the accuracyof momentum and also of the long track length, the tracking detector is usuallyplaced in the central, inner part of the whole detector. The track chamber can al-so identify unstable neutral particles such as K0 and Λ by detecting their decayvertices.

d) Energy Vectors Calorimeters are used to measure energies of charged as wellas neutral particles and are usually placed outside the tracking chambers. They aregenerally modularized to obtain space point information. One module is usuallydesigned geometrically to have a fixed interval of rapidity Δη and azimuthal angleΔφ so that counting rate per module becomes uniform. The QCD hadron produc-tion rate is known to be roughly constant in rapidity η or pseudorapidity η0, whichis related to polar angle θ by

η0 D12

lnE C pkE � pk

'12

ln1C cos θ1 � cos θ

D � ln tanθ2

(12.100)

In ALEPH, each module extends over a solid angle of 3�3 cm2 or 0.8ı�0.8ı and isdistributed lattice wise. A position resolution of 1.5 mm in determining the show-er generation point was obtained. By connecting this point with the interactionpoint the direction and energy (i.e. energy vector) of a particle can be measured.Calorimeters are physically and functionally divided into two parts, the electromag-netic calorimeter placed inside and the hadronic calorimeter placed outside thesuperconducting coil. The former is composed of a lead/scintillation counter sand-wich, which has a total of 23 radiation lengths, whereas the latter is a sandwich ofiron plates with 5–10 cm thickness and streamer tubes (a gas tube that works instreamer mode). The total absorption length is λ D 7.6 or � 1.2 m of iron. Theiron absorber plays a double role as the return yoke of the magnetic flux createdby the superconducting coil. As described, a particle’s momentum is measured bythe track chamber, its energy by the calorimeter and, using the E/p method, elec-trons (γ ’s too if there are no tracks) and hadrons are separated. In ALEPH, thecontamination of hadrons in the electron signal was less than 10�3. The energyresolution was ΔE/E D 1.7C 19/

pE [GeV]% for the electromagnetic calorimeter

and � 60%/p

E [GeV] for the hadronic calorimeter.

e) Particle Identification Depending on the physics aim of the detector, a particleidentification device such as a transition radiation counter to separate e from π ora RICH counter to identify hadron species is usually inserted between the track-ing chamber and the calorimeter. ALEPH’s TPC serves a double role as a trackingchamber and a particle identification device.

Page 452: Yorikiyo Nagashima - Archive

12.6 Collider Detectors 427

f ) Muon Identification The muon can be identified as a noninteracting long track.Therefore a track detector outside the magnet yoke, sometimes with additional ab-sorbers totalling & 10λ, serves as the muon identifier. The only other particle thatcan penetrate this much absorber is the neutrino or yet undiscovered weakly in-teracting particles. Backgrounds for muons directly produced in the reaction aredecay muons from pions and false signals of pions, a small albeit nonnegligiblecomponent of pions that survive after the absorbers.

g) Missing Energy In the center of mass system of the colliding particles, the totalsum of momenta should vanish. Missing energy is defined as the vector sum of allthe energy vectors of charged as well as neutral particles but with the sign inverted.As each hadronic particle is not identified, they are assumed to be pions or zero-mass particles. Then the energy vector is the same as the momentum vector. Whenit is known that only one neutrino is missing the missing energy vector is that ofthe neutrino. A reaction to produce W (uC d ! W C ! e(μ)CC ν) was measuredthis way and the mass of W was determined.

New particles that are predicted by theoretical models, for example supersym-metry (SUSY) particles, also give a missing energy accompanied by (an) energeticlepton(s) or mono-jet. The accurate measurement of the missing energy is an im-portant element of a general-purpose detector. This is why the hermeticity of thedetector is important. Still, energies that escape to the beam pipe are sizeable andonly the transverse component of the energy vectors can be determined.

h) Luminosity Measurement In addition to the main detector, small luminositycounters made of a tracking chamber and a calorimeter are placed in the for-ward and backward regions and measure well-known reactions such as small-angleBhabha (e�eC elastic) scattering for electron colliders. As the cross section is wellknown and calculable, knowing the counting rate gives back the intensity of inci-dent beam. It is necessary to measure an absolute cross section of a reaction.

i) Summary A collider detector has all the properties described above, and canserve as a general-purpose detector. If the total energy

ps is known, reactions that

can be induced can be more or less be predicted and similarity in designs of differ-ent groups is conspicuous. For instance, at the LEP/CERN collider,

ps D mZ0 D

91 GeV and four detectors ALEPH, DELPHI, L3 and OPAL were constructed. Themajor goals to measure the mass, decay width and modes of Z0 accurately werecommon to all and all the detectors were capable of doing it. Therefore, from thestandpoint of testing the validity of the Standard Model, differences between thedetector performances can be considered small. The difference, however, appearsin the details of the detector component, depending on whether one wants to makeprecision measurements of leptonic modes or to carry out detailed tests of QCD orwhatever. As a result of this, sensitivity to or the potential for in-depth investigationof a new phenomenon may be different. How much effort is put into apparently ir-relevant functions of detectors strongly reflects the group’s philosophy and insight.

Page 453: Yorikiyo Nagashima - Archive

428 12 Accelerator and Detector Technology

12.7Statistics and Errors

12.7.1Basics of Statistics

Particle reactions are statistical phenomena. Detectors are devices that convert aparticle’s passage into signals that a human can recognize and their performance isalso dictated by statistics. We describe here basic concepts of statistical phenomenato understand the quality of measurements and the errors associated with them.Frequently used formula that are stated in a standard textbook of statistics andprobability are given here without proof. Proofs are given only when they are spe-cially connected to particle measurements.

There are no measurements without errors. They come in two kinds, systemat-ic and statistical. Systematics are associated with the setup of reference points ofmeasurement and make measured values either overestimated or underestimated.When the source of the error is understood, it could be removed in principle. Butsome part of it remains always unexplained or uncontrollable. For instance, we saythe peak of Mt. Chomolungma is 8848 m above sea level. The reference point isat sea level h D 0, but the value of h could be either Ch0 or �h0 and we nev-er know its exact value. However, by careful arrangement, it is possible to set anupper limit to it, i.e. (jhj < ε) and ε is referred to as the systematic error in themeasurement of mountain heights. The systematic error has its origin in the setupof each experiment (or an experimenter’s skill) and cannot be discussed in unbi-ased way. Therefore, we only discuss statistical errors. Their origin inherently liesin the measured object itself and/or in the measuring devices. The measured val-ue fluctuates irregularly (at random) around a true value. Repeated measurementcan decrease the size of statistical errors, but smaller values than the systematicerror are fruitless, so usually measurements are ceased when the statistical errorbecomes comparable to the systematic error.

The behavior of errors or their distribution depends on the circumstances. Whenthe probability function of events is unknown, a normal distribution, to be de-scribed later, is assumed. With such a blind assumption, usually no serious mis-takes are induced in most circumstances because of

1. the law of large numbers,2. central limit theorem, and3. general properties of the normal distribution.

Law of Large Numbers Given a random variable x with a finite expected value (μ), ifits values are repeatedly sampled (sample value xi ), as the number n of these obser-vations increases, their mean will tend to approach and stay close to the expected

Page 454: Yorikiyo Nagashima - Archive

12.7 Statistics and Errors 429

value. Namely,

limn!1

nXiD1

xi

n! μ (12.101)

Central Limit Theorem The central limit theorem states that as the sample sizen increases the distribution of the sample average of these random variables ap-proaches the normal distribution with a mean μ and variance (to be described soon)σ2/n irrespective of the shape of the original distribution.

However, one has to be careful because very often one tends to put too muchconfidence in the omnipotentiality of the normal distribution.

We first define some terminology and a few definitions.Average Value or Mean If sample events described by a variable x are realized in-dependently according to a distribution function f (x ), the average μ is defined by

μ DZ

dx x f (x ) � E(x ) (12.102)

with normalizationZdx f (x ) D 1 (12.103)

Generally E(g(x )) is defined by

E(g(x )) DZ

dx g(x ) f (x ) (12.104)

When the variable x is not continuous but discrete, replace f (xi ) by f i and take thesum instead of integral (

Rdx !

Pi ). In the following we take x as a representative

of both continuous and discrete variables unless otherwise stated.Variance This is a measure of dispersion of the variable x around the mean μ andis defined by

σ2 � E((x � μ)2) DZ

dx (x � μ)2 f (x ) (12.105)

σ Dp

σ2 is called the standard deviation. It is easy to verify that

E((x � μ)2) D E(x2) � E(x )2 D E(x2) � μ2 (12.106)

Correlation When there are two statistical variables x , y with means μx , μ y andstandard deviations σx , σ y , their covariance and correlation r are defined by

cov(x , y ) D E((x � μx )(y � μ y )) (12.107a)

r Dcov(x , y )

σx σ y(12.107b)

r takes values �1 r 1. When there exists a linear relation such as y D ax C b,jrj D 1, and when x and y are completely independent, r D 0. “Independent”means the distribution function can be separated like f (x , y ) D g(x )h(y ). Fig-ure 12.44 shows some examples of correlation.

Page 455: Yorikiyo Nagashima - Archive

430 12 Accelerator and Detector Technology

Figure 12.44 Examples of correlation.

Normal Distribution The normal distribution is also called the Gaussian distri-bution. It is the most frequently occurring function in all statistical phenomena.Generally, measured values and detector fluctuation (errors) can be considered asobeying the normal distribution. The functional form with mean μ and standarddeviation σ is given by

G(μ, σ) D1

p2πσ

exp��

(x � μ)2

2σ2

�(12.108)

Figure 12.45 shows its functional form. The area in 1σ range (μ � σ x μC σ)is 68.3%, 2σ 95.5%, and 3σ 99.7%.17) In the figure, 90% of the total area is shownas a white area with boundary at jx � μj D 1.64σ. It is worth remembering that theprobability of the variable x deviating more than one standard deviation is as largeas 30%.

3 2 1 0 1 2 3

f (x; μ,σ)

α /2α /2

(x μ) /σ

1 α

Figure 12.45 Gaussian distribution (α D 0.1). 90 % of the area is within the region (x�μ)/σ D˙1.64.

Another measure of dispersion is the full width at half maximum (FWHM),which is often quoted instead of the standard deviation. This is the x range wherethe probability exceeds half the maximum value. FWHM is denoted as Γ and forthe normal distribution is given by

Γ D 2σp

2 ln 2 D 2.35σ (12.109)

Overlap of the Gaussian Distribution Let there be two independent variables x1, x2,both following Gaussian distributions with mean 0 and variance σ2

1, σ22, and mea-

17) Conversely, the values of n (jx � μj < nσ) that give 90%, 95% and 99% area are 1.64, 1.96 and2.58.

Page 456: Yorikiyo Nagashima - Archive

12.7 Statistics and Errors 431

sured as the combination

x D ax1 C bx2 (12.110)

Then their overlap will obey the distribution

G(x ) DZ

dx1dx21q

2πσ21

exp��

x21

2σ21

�1q

2πσ22

exp��

x22

2σ22

�(12.111)

Its Laplace transform is given byZ 1

0dx eλx G(x ) D exp

�12

λ2�a2σ21 C b2σ2

2

��(12.112)

This is the Laplace transform of the Gaussian with the standard deviation

σ Dq

a2σ21 C b2σ2

2 (12.113)

Namely, the error of an overlap of two independent statistical phenomena is giv-en by the square root of the weighted sum of the error squared. If the differenceinstead of the sum is taken in Eq. (12.110), the error is still given by the sameformula.

Poisson Distribution Uniformly distributed but independent events obey the Pois-son distribution. Reactions and decays in particle physics, the number of ions perunit length created by passing charged particles, electrons emitted from a heatedmetal surface or the photocathode of photomultipliers, all of which are intimatelyconnected with detector operations, obey the Poisson distribution. If we let the av-erage occurrence of an event per unit time be μ, then the probability that it occursr times in time t is given by

P(μI r) D(μ t)r

r!e�μ t (12.114)

This can be explained as follows. Let us assume that the event occurred M times ina time interval T. Both T and M are taken to be large enough. Then the probabilitythat r events fall in a small time interval t (t � T ) is given by p r(1� p )M�r , wherep D t/T . As there are M Cr ways to sample r out of M events, the probability that revents occur in the time interval t is given by the binomial distribution

P(M I r) D M Cr p r(1� p )M�r

DM(M � 1) � � � (M � r C 1)

r!p r(1� p )M�r

D1 � f1 � 1/Mg � � � f1 � (r � 1)/Mg

r!(M p )r

�1�

M pM

�M�r

(12.115)

Page 457: Yorikiyo Nagashima - Archive

432 12 Accelerator and Detector Technology

0.6

0.5

0.4

0.3

0.2

0.1

00 5 10 15 20 25 30 35

1

234

510

20

POISSON DISTRIBUTION

P (μ; r) μr! e-μ

P( μ

; r)

μ 0.5

Figure 12.46 Poisson distribution: (μ r / r!)e�μ .

Taking the limits M !1, T !1, keeping M p D M t/T ! μ t fixed, and usinglimM!1(1� x/M )M D e�x , we obtain the Poisson distribution Eq. (12.114). Themean occurrence Nr of events in unit time interval and the variance of the Poissondistribution are given by

Nr D1X

rD0

r P(r) D μ, σ2 D

1XrD0

(r � μ)2P(r) D μ (12.116)

Figure 12.46 shows the Poisson distributions for several values of μ. Note the dis-tribution is not symmetric around μ. If we equate μ D n, when n is large thePoisson becomes the Gaussian with mean n and variance n. We might as well sayit is a good approximation already at n D 10 (see Fig. 12.46). This is an example ofthe central limit theorem.

Problem 12.3

When n is large, show that the Poisson distribution with mean n, variance n ap-proaches the Gaussian distribution:

limn!1

�f (r) D

nr

r!e�n

�!

1p

2πnexp

��

(r � n)2

2n

�(12.117)

Hint: Use Sterling’s formula

n! Dp

2πnnC1/2e�n (12.118)

and expand ln f (r) in a Taylor series.

Example of Application: Deciding the Upper Limit of the Probability Value whenno Events Are Observed When one tries to determine the validity of a certain the-oretical formula, for instance the lifetime of double beta decay, and ends up withno observation, the upper limit of the rate or the lower limit of the lifetime can bedetermined as follows. Let the decay rate be λ. The probability that no events are

Page 458: Yorikiyo Nagashima - Archive

12.7 Statistics and Errors 433

observed during the observation time T is given by the Poisson distribution as

P(λI 0) D e�λT (12.119)

The equation can be interpreted as the probability of observing no events as a func-tion of λ. Therefore, the probability λ λu is given by

P(λ λu) D TZ λu

0dλ e�λT D 1� e�λu T � CL (12.120)

T is the normalization constant when λ is taken as the variable. We call the aboveequality the confidence level (CL) of λ lying in the interval 0 λ λu,

) λu D �1T

ln(1� CL) (12.121)

For CL D 90%, λu D 2.3/T . Namely, when no events are observed, the upper limitwith 90% confidence level is given by the rate with 2.3 as the number of events.Simultaneous Observation of Two Independent Phenomena Suppose one is mea-suring the lifetime of a particle. The counts one obtains include noise of the detec-tor, which is also statistical but of different and independent origin:

n D n1(particle) C n2(noise) (12.122)

The probability of n events being observed is given by

P(μ1, μ2I n) DX

n1Cn2Dn

P(μ1I n1)P(μ2I n2) DX (μ1)n1

n1!e�μ1

(μ2)n2

n2!e�μ2

D(μ1 C μ2)n

n!e�(μ1Cμ2) D P(μ1 C μ1I n) (12.123)

12.7.2Maximum Likelihood and Goodness of Fit

a) Maximum Likelihood Method The real value μ and its variance of a measuredobject can never be known, but a good guess can be obtained by measurements.Suppose there are n independent samples xi (i D 1 . . . n), which we call a statisticalvariable obeying a distribution function f (x I a). a is a parameter to be determined.The maximum likelihood function is defined by

L(x1, . . . , xn I a) D f (x1I a) f (x2I a) . . . f (xn I a) (12.124)

It can also be considered as the probability that gives the observed values (x1, x2, . . . ,xn). If the function L is analytic, the value of a can be obtained by solving

dLdaD 0

ordL�

da�

d ln Lda

D 0(12.125)

Page 459: Yorikiyo Nagashima - Archive

434 12 Accelerator and Detector Technology

The two equations are equivalent in obtaining a and either one is used dependingon the situation. When there is more than one parameter, simultaneous equationsare obtained. When the solution of Eq. (12.125) is written as Oa, it is not the true abut an estimated value. An estimated variance is given by

Oσ2 D

Zdx1 . . . dxn (a � Oa)2L(x1, . . . , xn I a) (12.126)

This formula, however, is not necessarily calculable unless the analytical form off (x I a) is known. When the probability function is given by the Gaussian, the max-

imum likelihood function is given by

L DnY

iD1

1p

2πσexp

��

(xi � μ)2

2σ2

�(12.127)

If we write the estimate of μ as Oμ and that of σ2 as Oσ2, L is maximized by

@L@μD L

X xi � μσ2

D 0 ) Oμ D1n

Xxi D Nx

(12.128a)

@2 L@μ2 D L

��

nσ2 C

X 1σ2

xi � μσ2

�2�D 0) Oσ2 D

1n

nXiD1

(xi � μ)2

(12.128b)

As the real μ is unknown, it is replaced by Oμ. Then the degrees of freedom arereduced by 1 and it is known that

s2 �1

n � 1

nXiD1

(xi � μ)2 Dn

n � 1Oσ2 (12.129)

gives a better estimate of the variance. If n is sufficiently large, the difference issmall. The arithmetic average Nx is an estimate of μ, but is not the real μ. Thismeans another n samples will give a different Nx . In this sense, Nx itself is a statisticalvariable. Consequently, the variance of Nx can be defined:

σ2( Nx) D E(( Nx � μ)2) D1n2 E

0@ 1

n

Xi

xi � μ

!21A D 1

n2

XE((xi � μ)2)

Dnσ2

n2D

σ2

n(12.130)

Namely σ( Nx) Dp

σ2( Nx ) D σ/p

n gives the error associated with the arithmeticmean Nx . σ( Nx) is referred to as the standard error of the mean. As is clear from the

Page 460: Yorikiyo Nagashima - Archive

12.7 Statistics and Errors 435

above derivation, the equality σ( Nx) D σ/n holds independently of the functionalform of the distribution.

The reader is reminded here that Oσ2 is an estimate of σ2, which is the real vari-ance of the variable x, and not an estimate of the variance of Nx . σ decides the fluc-tuation of the variable xi distribution and does not change no matter how manysamples are taken. Oσ2 approaches σ2 as n increases. Likewise Nx approaches true μas n increases, and its standard deviation is given by σ/

pn, which approaches 0 as

n increases.Now, note that at a point @L�/@μ D 0,

�@2L�

@μ2 Dnσ2 D

1σ2( Nx )

(12.131)

Equation (12.126) is a general expression, but when it cannot be calculated analyt-ically one applies the central limit theorem and assumes the same formula as forthe Gaussian distribution is valid. Namely,

1σ2( Oa)

D �

�@2 L�

@a2

�aDOa

(12.132)

Note, however, this is generally an approximation and valid only when the statisticsis large. When more than one parameter is involved, an error matrix U can bedefined as the inverse of

(U�1)i j D �@2 L�

@ai @a j(12.133)

Then it can be shown that the diagonal elements σ2i are equal to the variance of Oai

and the nondiagonal elements are covariances. Namely

U D

2664

σ21 cov(1, 2) cov(1, 3) � � � � � �

σ22 cov(2, 3) � � � � � �

σ23 � � � � � �

� � � � � �

3775 (12.134)

We note that the estimate determined by the maximum likelihood method is in-variant under transformation. That is, if b D g(a), the best estimate of b is givenby Ob D g( Oa).

b) Propagation of Errors One often wants to calculate another quantity using theobtained sample variables. Here we are interested in the error of u D f (x , y ) given(x , y ) with standard deviations σx , σ y . The variance σ2

u of u is given by

σ2u D E((u � Nu)2) (12.135)

In the first approximation, Nu D f ( Nx , Ny ). How the variable u moves around Nu canbe obtained by the Taylor expansion

u � Nu ' (x � Nx )@ f@x

ˇˇ NxC (y � Ny )

@ f@y

ˇˇ Ny

(12.136)

Page 461: Yorikiyo Nagashima - Archive

436 12 Accelerator and Detector Technology

Inserting the square of Eq. (12.136) in Eq. (12.135), we obtain

E((u � Nu)2) D E

"(x � Nx )2

�@ f@x

�2

C (y � Ny )2�

@ f@y

�2#

C 2(x � Nx )(y � Ny )@ f@x

@ f@y

(12.137)

Using Eqs. (12.105) and (12.107), we obtain

σ2u D σ2

x

�@ f@x

�2

C σ2y

�@ f@y

�2

C 2cov(x , y )@ f@x

@ f@y

(12.138)

Namely, the variance of u is the sum of the squares of its argument errors mul-tiplied by their derivative. The third term vanishes if the variables x and y are in-dependent, but if there is correlation between the two, the propagation error ismodified greatly from the sum of squares. In the following we ignore correlations.

Example 12.1

u D x ˙ y

σ2u ' σ2

x C σ2y

(12.139)

which reproduces Eq. (12.113).Consider an experiment with a neutron target. This is achieved by subtracting

proton target data from deuteron target data. Let n1, n2 be the number of eventsfrom each experiment measured in the time intervals t1, t2. One wants to decidethe best balance in dividing the time into t1 and t2 with the total time T D t1 C t2

fixed. The counting rates are r1 D n1/ t1, r2 D n2/ t2 and r1 � r2 is the desirednumber. As the distribution is Poissonian, the variance σ2 of n is σ2 D n and theerror δn D

pn. Using Eq. (12.139),

δ r2 D δ r21 C δ r2

2 Dδn2

1

t21C

δn22

t22D

n1

t21C

n2

t22D

r1

t1C

r2

T � t1(12.140)

To minimize δ r , we differentiate Eq. (12.140) with respect to t1, put it equal to 0and obtain

t1

t2D

rr1

r2(12.141)

Example 12.2

u D x �y or u Dxy

σ2u

u2D

σ2x

x2C

σ2y

y 2

(12.142)

Page 462: Yorikiyo Nagashima - Archive

12.7 Statistics and Errors 437

Example 12.3: Asymmetry of Angular Distribution

Asymmetry of a distribution is a frequently measured quantity. Let R be the num-ber of counts in the interval 0 cos θ 1 and L in �1 cos θ 0. We want toestimate the error of its asymmetry

A DR � LR C L

(12.143)

@A@RD

2L(R C L)2

,@A@LD�2R

(R C L)2(12.144)

We put N D R C L:

σ2A D

�2LN 2

�2

σ2R C

�2RN 2

�2

σ2L (12.145)

As both L and R obey the Poisson distribution σ2R D R , σ2

L D L

σ2A D

4L2R C 4R2LN 4

D 4R LN 3D

1 � A2

N(12.146)

When A is small

σA '

r1N

(12.147)

The error has to be smaller than the value of the asymmetry itself. Let us define afigure of merit (FOM) by

FOM �AσA

(12.148)

The number of events N to obtain FOM is

FOM DAσAD

Ar1� A2

N

! N D (FOM)2 (1 � A2)A2 (12.149)

Let us further assume one wants at least FOM � 3. Then the required number ofevents is

N D

(30 000 A D 0.01

9 A D 0.5(12.150)

The experimental time to achieve the desired FOM depends greatly on the value ofthe asymmetry itself.

Page 463: Yorikiyo Nagashima - Archive

438 12 Accelerator and Detector Technology

12.7.3Least Squares Method

Consider a function y D f (x I a1, a2, � � � , am ) having x as a variable. We want todetermine the value of m ai ’s. Assume a measurement was carried out at n pointsxi (i D 1, . . . , n) and measured values y i and their errors σ i have been obtained.The least squares method minimizes the following quantity:

S DnX

iD1

�y i � f (xi I a1, � � � , am)

σ i

�2

(12.151)

In order to obtain a meaningful result the number n must be larger than the num-ber of parameters to be determined. If the distribution is Gaussian with the meanf (xi I a1, � � � , am), it is easy to prove the result agrees with that given by the max-

imum likelihood method. In this case S is called the chi square (�2) distribution.Consequently, it is often said that the least squares method minimizes the �2 dis-tribution. If the y i distribution is not Gaussian, the statement is not exact, but isnot far from the truth because of the central limit theorem. Note we have assumedno errors in determining xi in Eq. (12.151). If errors have to be assigned,

σ2y ! σ2

y C σ2x

�@ f@x

�2

(12.152)

has to be used instead of σ2y . To obtain the best fit, S has to be a minimum and the

conditions

@S@a jD 0 (12.153)

give associated equations to be solved. When analytical solutions are not available,numerical calculations using computers have to be carried out. With similar argu-ments as before, the error matrix U of a j is given by

U�1 D12

@2S@ai @a j

(12.154)

If S is nonlinear, computer calculation is the only way. In this case, the error matrixhas to be solved numerically, too. A way to determine the variance is again guidedby relations of the Gaussian distribution. When the distribution is one-dimensionalGaussian, we have

σ2( Oa) D�

@2S@a2

��1

aDOa(12.155)

Taylor expanding S around the minimum, we obtain

S(a) ' S( Oa)C12

(a � Oa)2 @2S@a2D S( Oa)C

1σ2

(a � Oa)2 (12.156)

Page 464: Yorikiyo Nagashima - Archive

12.7 Statistics and Errors 439

Therefore, we have the equality

S( Oa C σ) D S( Oa)C 1 (12.157)

Generalizing the relation, we may define the standard error σ as the deviation ofa from Oa to give S the value Smin C 1. Equation (12.155) also shows how to obtainσ numerically. When there are two parameters to be determined, a closed contourwhere S D Smin C 1 gives an error region within 1σ in the a1–a2 plane.

�2 Distribution The chi square (�2) distribution is a theoretical probability distri-bution in inferential statistics, e.g., in statistical significance tests. The �2 distribu-tion with n degrees of freedom using x (x � 0) as its variable is given by

Kn(x ) Dx n/2�1e�x/2

2n/2Γ (n/2)(12.158)

where Γ (x ) is the Gamma function. The function’s form is given in Fig. 12.47.Suppose the distribution function of a certain variable μ is given by F(μ, ai ),

where ai are m parameters one wants to determine by an experiment. Let y i bemeasured points of μ obeying the normal distribution G(μ i , σ2). Then �2 definedby

�2 D

nXiD1

(y i � μ)2

σ2 (12.159)

obeys Eq. (12.158). The �2 distribution is referred to as the goodness of fit andis a measure of how well the determined parameters ai reproduce the measuredvalues. The measured n values y i � μ, (i D 1, . . . , n) are independent variables,but if the average Ny is substituted for μ, y i � Ny are n � 1 independent parametersand n � 1 should be used as the degrees of freedom in applying the � square test.When there are m parameters to be determined, the number of degrees of freedombecomes n � 1 � m. The confidence level CL is given by

CL(S ) DZ 1

Sdx Kn(x ) (12.160)

n=1

n=2n=3n=4

n=5

0 1 2 3 4 5 6 7 8 9 100

0.2

0.8

0.6

0.4

χ 2

K n(χ

2 )

0 10 20 30 40 500.0

0.5

1.0

1.5

2.0

2.5

Degrees of freedom n

50%

10%

90%99%

95%68%32%

5%1%

χ2 /n

Figure 12.47 �2 distribution: Kn (x) D (x n/2�1 e�x/2)/(2n/2 Γ (n/2)) and the reduced �2 is equalto �2/n for n degrees of freedom. The curves are given for given values of CL.

Page 465: Yorikiyo Nagashima - Archive

440 12 Accelerator and Detector Technology

Usually, taking CL D 5% and when �2 S , the fit is approved as good. As arough estimate, if μ is considered as a good representative, (y i � μ) should take theapproximate value σ2 and so �2/(n � 1) � 1 is expected. If the value of �2 is too faraway, the original theoretical assumption has to be reconsidered.

Page 466: Yorikiyo Nagashima - Archive

Part Two A Way to the Standard Model

Page 467: Yorikiyo Nagashima - Archive
Page 468: Yorikiyo Nagashima - Archive

443

13Spectroscopy

Spectroscopy originally meant the use of a prism to disperse visible light and an-alyze a complex wave form as a function of its wavelengths. The concept was ex-panded to include any measurement of observables as a function of wavelengthor oscillation frequency. As the frequency is equivalent to energy in quantum me-chanics, collecting and sorting of energy levels in any phenomenon is also calledspectroscopy. It is an indispensable procedure in science to elucidate the under-lying basic rules from observation of seemingly complex phenomena. The role ofthe periodic table in understanding atomic structure and radiation spectra in clar-ifying the nature of quantum mechanics come to mind immediately. In a similarvein, collecting the masses of hadrons, including both elementary particles andresonances, which were thought of as dynamic processes of interacting elementaryparticles, and determining their spin, parity and other quantum numbers were es-sential procedures in reaching the quark structure of the hadrons. Later on, we willdeal with level structures of quark composites, but here we follow history to learna heuristic way to reach the truth. The first step is to collect facts. The second stepis to sort them out, find a regularity and see if they can be reproduced by a simplemodel. The third step is to elucidate the underlying law that governs the dynamicsand reach the truth.1)

Discoveries in the pre-accelerator era were made using isotopes and cosmic rays.That was the age of hunting particles, exciting but very much dependent on luck.With the advent of large accelerators – Cosmotron (3.3 GeV, 1953–1968), Bevatron(6.2 GeV, 1954–1971), PS (28 GeV, 1959–) and AGS (33 GeV, 1960–) – the age of har-vesting particles had arrived. It became possible to plan and prepare experimentsin an organized way, controlling the particle’s conditions.

In this chapter, we describe how to extract information on hadrons (π, nucleon,K, �, ω, etc.) and resonances and determine their properties. This is the first stepdescribed above. We start from the basic structure of the interactions between pi-ons and nucleons, which are representatives of the meson and baryon families ofthe hadron. Analysis of hadron interactions exemplified by π–N scattering is es-sential in order to understand their dynamics. By analyzing their behavior we can

1) This is Sakata’s syllogism that prevailed in his school, whose pupils proposed IOO symmetry [212](flavor S U(3) of the Sakata model [339]), a predecessor of the quark model, lepton–baryonsymmetry, the fourth quark [271] and the Kobayashi–Maskawa model [239].

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 469: Yorikiyo Nagashima - Archive

444 13 Spectroscopy

extract resonances and determine their spin-parity as well as other features. Classi-fication of resonances and construction of a model are the second step, which willbe discussed in the next chapter. A brief description of low-energy nuclear forcesand the high-energy behavior of hadronic cross sections is given in the last part ofthis chapter.

13.1Pre-accelerator Age (1897–1947)

The history of particle physics as a research discipline may be said to have begunwith Fermi’s (1934) and Yukawa’s (1935) proposal of the weak and strong (meson)theories. However, the first discovery of an elementary particle as we know it todaydates back to as early as 1897, when J.J. Thomson discovered the electron, althoughthe early history of the particle is more closely related to atomic structure and thequantum concept of particle–wave duality. The photon was recognized as a particlefollowing Einstein’s light quantum hypothesis in 1905 and subsequent demonstra-tion of Compton scattering in 1925. The discovery of the neutron in 1935 showedthat the nucleus consists of particles. It triggered Fermi’s and Yukawa’s applica-tions of field theory to the weak and strong forces, incorporating particle creationand annihilation. In their theories, new particles (the neutrino and the meson) areclaimed as indispensable players in the dynamical phenomena. Actually, the neu-trino was proposed by Pauli in 1930 to reconcile apparent violations of energy andangular momentum conservation in beta decays. Fermi included it and applied thequantized field theory invented by Heisenberg and Pauli to beta decay. Yukawa’streatment in his first paper was semiclassical but his idea of a quantum of a forcecarrier was revolutionary and remains one of the prime axioms in modern particletheory.

It was the beginning of new era in which people no longer hesitated to claimnew particles to explain newly discovered phenomena. Until then, the change ofthe concept of matter aroused by the developement of quantum mechanics hadbeen so violent that people were more willing to abandon fundamental laws than tointroduce dubious new particles. For instance, Bohr, who represented mainstreamthinking at the time, opted to discard energy conservation in beta decay ratherthan introduce unknown new particles. That he did not like Yukawa’s idea eitherwas indicated in his comment “You like new particles so much” when Yukawapresented the new theory to him on his visit to Japan in 1937.

Electron The first particle recognized as an elementary particle was the electron. Itwas known for a while as the cathode ray, which is emitted from a hot filament in avacuum (low pressure to be exact) glass tube when high voltage is applied betweenan anode plate and the cathode. In fact, Hertz, who proved the existence of theelectromagnetic wave experimentally, knew it. He erroneously concluded that thecathode ray was not charged. It was only after vacuum technology improved that the

Page 470: Yorikiyo Nagashima - Archive

13.1 Pre-accelerator Age (1897–1947) 445

High Voltage

Cathode

Anode/Collimator

Deflector

+

+ Displacement

Figure 13.1 The apparatus with whichJ.J. Thomson discovered the electron. A highvoltage is applied between the cathode andthe anode plate. The emitted cathode ray goesthrough a hole in the anode, is collimated by

a slit, hits the glass wall and produces a fluo-rescent point. When a transverse electric fieldis applied across two plates in the middle (de-flector), the cathode ray can be deflected [366].

deflection of the cathode ray was detected by J.J. Thomson in 1897 in the apparatusshown in Fig. 13.1 [366].

The cathode ray, after going through collimators, is deflected by a transverse elec-tric field. From the direction of the deflection the charge can be determined. Thebeam can also be deflected by a magnetic field created by two coils applied perpen-dicularly to both the beam and the electric field. By adjusting the field strength tocompensate the deflection by the electric field, the velocity of the particles can bemeasured. The value of e/m can be determined by the deflection angle created bythe electric field. Its value was a thousand times smaller than that of the hydrogenion (the proton) measured in electrolysis. Thus Thomson concluded that the cath-ode ray consisted of negative particles with mass much smaller than any knownmatter. An accurate value of the electric charge was obtained by Millikan’s oil dropexperiment and the mass was determined.

Atomic Structure Thomson proposed an atomic model that consists of a positivecharge cloud of size � 10�8 cm in which electrons are embedded like the seedsof a watermelon. The other model to be contrasted with Thomson’s was Nagaoka’ssaturnian model [284], in which electrons circulate around a positive nucleus. Themodels could be tested by making an α-particle scatter on matter. The α-particlewas discovered by Becquerel, and by 1911 it was known to have charge C2e anda mass four times that of hydrogen. Since the α-particle is much heavier than theelectron, it would not be deflected by electrons. Deflection by the positive chargewould also be weak in the Thomson model because of its extended distribution.Rutherford noticed that these characteristics of the Thomson model did not agreewith observations. He further conjectured that if the Thomson model were right,large-angle scattering is possible only as an accumulation of small scattering an-gles and the distribution would be Gaussian. On the other hand, if the charge isconcentrated in a small region and the target mass is large, when the α-particle hitsthe atom head on it will be repelled back. If the collision is not head-on but withina small transverse distance (impact parameter), the deflection would still be large.Figure 13.2 shows trajectories (magnified 200 billion times) that an incoming par-

Page 471: Yorikiyo Nagashima - Archive

446 13 Spectroscopy

ticle makes for different impact parameters under the influence of the repulsiveCoulomb force.

Following Rutherford’s suggestion, Geiger and Marsden carried out an experi-ment using the detector shown in Fig. 13.3 and obtained the angular distributionshown in Fig. 13.2, which agreed with Rutherford’s formula given in Eq. (6.46). Theformula was calculated classically assuming the scattering pattern in Fig. 13.2 and

10

102

103

104

105

0º 60º 120º 180ºScattering Angle

Num

ber o

f α P

artic

les

Fixed Target

b Im

pact

Par

amet

er

Trajectories in the Coulomb Potential(a) (b)

Figure 13.2 (a) Trajectories of incoming par-ticles with different impact parameters scat-tered by the Coulomb force of the fixed scat-tering center. Consider it as a 2 � 1011 timesmagnified picture of an α-particle scattered byan atom. When the impact parameter is small,

the α-particle is deflected through a largeangle. (b) Data fit well with the Rutherfordscattering formula. The dashed line indicatesthe predicted scattering distribution for theThomson model [158].

Gold Foil

Microscope Source

α particlesFluorescent Screen(ZnS)

Vacuum Pump

θ

Micr

osco

pe

Gold Foil

(a) (b)

Figure 13.3 Side (a) and plan (b) viewof Geiger and Marsden’s apparatus: an αsource, scattering gold foil, and a microscopewhose front end is coated with fluorescentmaterial (ZnS) are contained in a metal cylin-der. The source is concentrated radon andα-particles are incident on the foil at a rightangle through a slit. The source and the foil

are fixed to the axis, but the cylinder togetherwith the microscope can be rotated aroundthe axis so that the scattering angle of theα-particles can be varied. The measurementwas carried out by flickers of the fluorescencebeing observed by the human eye and count-ed [158].

Page 472: Yorikiyo Nagashima - Archive

13.1 Pre-accelerator Age (1897–1947) 447

an uniform impact parameter distribution. Rutherford was fortunate because it isa rare example of the classical treatment producing the same result as quantummechanics.

The expected curve according to the Thomson model is depicted as a dashedline. As we discuss later in Chap. 17 or more simply according to the Heisenberguncertainty principle, a measure of the minimum distance that the projectile canpenetrate into the target is given by the de Broglie wavelength of its momentumtransfer Q D 2p sin(θ /2) . p . Since the data agreed with a formula that assumesa point target, the size of the nucleus is at most the inverse of the α-particle mo-mentum. Assuming kinetic energy of 4 MeV, which gives p � 170 MeV/c, one canestimate the nuclear size to be at most r0 � „c/170 MeV/c � 10�13cm, which is105 times smaller than an atom. This experiment unambiguously established thefact that the atom has a structure with positive charge concentrated in a regionr < 10�13 cm and a surrounding electron cloud extending to 10�8 cm.

Neutron In 1932, Irène and Frédéric Joliot-Curie noticed that neutral particleswith strong penetrating power are emitted when α-particles from polonium col-lide with a beryllium target. Unfortunately, they mistook them for energetic gam-ma rays. Chadwick, knowing Rutherford’s idea of the possible existence of a neutralheavy particle, a possible composite of the proton and the electron, proved in 1932that the mass of the neutral particle of the Joliot-Curies was as large as that of theproton. His apparatus is shown in Fig. 13.4. The mass measurement was done asfollows. In Newtonian mechanics, if a ball having mass m collides with another ballwith the same mass at rest, the incident ball stops and the target moves on withthe same velocity as that of the incident ball. If the target ball has mass 14m, thevelocity that the target obtains is 2/15 that of the incident ball. These numbers canbe obtained by a simple calculation using the law of conservation of momentum.Placing a paraffin foil which contains hydrogens or nitrogen (mN D 14mH) gaschamber, the maximum velocity of recoiled hydrogens or nitrogen are those ob-

Pump

BerylliumPolonium(α Source)

Ionization Chamber

Amplifierneutron

paraffin

proton

Figure 13.4 Chadwick’s apparatus for detect-ing the neutron. The neutron is knocked outfrom Be by the α-particle emitted from polo-nium. Various materials are placed in front

of the ionization chamber on the right, whichmeasures the ionization loss of charged parti-cles scattered by the neutron [94].

Page 473: Yorikiyo Nagashima - Archive

448 13 Spectroscopy

tained from head-on collisions. The velocity can be determined from the ionizationloss of the particles, which is converted to electric current, amplified and automat-ically recorded. It was the most advanced instrument at the time. The ratio of thevelocity as determined by ionization loss between the paraffin and the nitrogen gasfoil was 2/15, confirming that the mass of the neutron was equal to that of theproton.

Positron Anderson’s discovery of the positron (1933) has already been describedin Chap. 4 in explaining the concept of the antiparticle. We only mention that thediscovery was made by using a cloud chamber, which was invented by Wilson in1911. It played an important role also in discovering the muon and pion, as wellas strange particles later. It is a sealed box containing supercooled, supersaturatedwater or alcohol vapor. When a charged particle goes through the chamber, it ion-izes the matter along its path. The ions act as condensation nuclei, around whicha mist will form, creating a track just like an airplane contrail. The velocity and themomentum can be measured by the particle’s ionization losses and the curvatureof the track in the magnetic field.

Muon and Pion The discovery by Anderson and Neddermeyer in 1937 [293] ofa particle with mass � 200 me in cosmic rays aroused a considerable amount ofexcitment in view of Yukawa’s prediction of the existence of a nuclear force carrier.However, the particle had a long life, interacting weakly with matter, which castdoubt on its role as the strong force carrier. It took a long and confusing time beforethe issue was finally settled by Powell’s demonstration of sequential decay (π !μ ! e) on emulsion photographs (Fig. 13.5), thus establishing the pion as thestrong force carrier predicted by Yukawa. As to the muon, however, although veryinstrumental in clarifying the properties of the weak interaction, its very existence,i.e. its role, remains a mystery to this day.

e

μπ

Figure 13.5 Emulsion photograph of π� μ� e decay. A pion decays into a muon (π! μC νμ )and the muon decays into an electron (μ! eC νe C νμ ) [80].

Muon Neutrino The existence of the second neutrino was inferred from the muondecay in μ ! e C ν e C νμ , because the energy spectrum of the produced electronwas not monochromatic but showed a continuous spectrum. That νμ might bedifferent from ν e , which appears in beta decay, was suspected from the absenceof μ ! e C γ , which will be treated in more detail in the discussion of the weakinteraction in Chap. 15.

Page 474: Yorikiyo Nagashima - Archive

13.2 Pions 449

Strange Particles Strange particles were also discovered in cloud chambers ex-posed to cosmic rays, but their detailed properties were investigated using accelera-tors. The pursuit of their properties and role in particle physics led to the discoveryof the quark model, which will be explained in the next chapter.

13.2Pions

Pions, nucleons and their interactions are the basis and also the best studied sub-jects of all hadrons. This is because the pion is the lightest hadron and an intensepion beam can be easily made by bombarding protons on matter. We start by de-termining the spin-parity of the pion.

Spin of πC To determine the spin of πC, the principle of detailed balance wasapplied to the reaction

πC C D • p C p (13.1)

Denoting the spin of π, D, p as s, sD and s p , respectively, and the momentum ofπ, p in their system as k, p, Eq. (9.146) is written as

dσ(p p ! πD)σ(πD ! p p )

D 2k2

p 2

(2s C 1)(2sD C 1)(2s p C 1)2 D

32

k2

p 2 (2s C 1) (13.2)

The factor 2 in front of the equation is a correction factor for two identical particlesin the final p p system. We used the fact that sD D 1, s p D 1/2 to derive thelast equation. There is a factor 3 difference in the cross section when the spin is 1compared to s D 0 and it is easy to distinguish them experimentally.

Knowing σ(p p ! πD) D 4.5˙ 0.8 mb at Tp D 340 MeV or Tπ D 21.4 MeV [91]gives the predictions σ(πD ! p p ) D 3.0 ˙ 1.0 mb if s D 0 and 1.0 ˙ 0.3 mb ifs D 1. The measurement showed that σ D 3.1˙0.3 mb at Tπ D 25 MeV [100, 126].The spin of the pion was determined to be 0.

Parity of π� Let us consider a process in which a slow pion stops in deuterium(D), is absorbed by it and then dissociates into two neutrons:

π� C D ! n C n (13.3)

The stopping π is trapped by the Coulomb force to fall in an orbit of the deuterium,drops down to the S-state and reacts. Therefore the orbital angular momentum is0 before the reaction. Assuming the orbital angular momentum in the final stateas L and using the parity conservation law [Eq. (9.73)]:

Pπ D PD (Pn)2(�1)L (13.4)

Deuterium is a bound state of the proton p and the neutron n and is known tohave spin 1 mostly in the S (` D 0) state with a small mixture of D (` D 2) state

Page 475: Yorikiyo Nagashima - Archive

450 13 Spectroscopy

PD D Pp Pn(�1)`. As we discussed before, if Pp Pn D 1 is assumed, the aboveequation leads to

Pπ D (�1)L (13.5)

To determine L, we use the angular momentum conservation. As sn D 1/2, thetotal spin of the two neutron system is either 0 or 1. Taking into account sπ D 0,the initial value of the angular momentum is Ji D sD D 1. The final value is avector sum of s and L. Then

s D 0 ! J f D Ji D 1 ! L D 1s D 1 ! J f D L � 1, L, LC 1 ! L D 2, 1, 0

(13.6)

We need one more condition to determine the parity. We note that the final stateis a two-fermion state and has to be antisymmetric under permutation. Since thewave function is a product of both the spin and spatial wave functions, if the spinwave function is symmetric then the spatial wave function must be antisymmetricand vice versa. As the s D 0 (1) state is antisymmetric (symmetric), L has to be even(odd) correspondingly:

s D 0 ! L ¤ 1s D 1 ! L D odd

(13.7)

The only allowed value is L D 1, therefore

Pπ D �1 (13.8)

This means if the reaction (13.3) exists, the pion’s parity has to be negative. If theparity is positive the reaction does not happen. The reaction has been observed andthe parity of the pion (π� to be exact) is determined to be negative.

Spin and Parity of π0 The neutral pion decays into two photons with nearly 100%probability. This fact alone excludes the pion’s spin from being 1. In the π0 restframe, the two γ ’s are emitted back to back. Taking the z-axis along the photonmomentum, and writing R L when the photon moving in the z direction is righthanded (sz D 1) and the other moving in the �z direction is left handed (sz D 1), atotal of four states (R L, LR , R R, LL) are conceivable (Fig. 13.6). As L z � (x � p ) �p D 0, Jz includes only spin components. R L, LR states, which have Jz D ˙2,cannot be realized unless J � 2. Jz of R R, LL states is 0, which does not rule outhaving J D 1. But if J D 1, the wave function should behave as the eigenfunctionof ( J, Jz ) D (1, 0), namely Y10(θ , φ) � cos θ , which gives �1 at θ D π. But 180ırotation of R R or LL is equivalent to particle exchange of 1 and 2, which has to giveC1 by their bosonic nature. This is a contradiction, hence J D 1 is not allowed.

Next we assume the spin of π0 is zero and derive its parity. The parity transfor-mation changes the direction of the momentum but not that of the spin. ThereforeR and L are exchanged, which, in turn, means R R and LL are exchanged. Then theparity eigenstate must be of the form

jPi DR R ˙ LLp

2$ P D ˙1 (13.9)

Page 476: Yorikiyo Nagashima - Archive

13.2 Pions 451

Z

RL LR

RR LL

Jz 2 Jz -2

Jz 0 Jz 0

(a) (b)

(c) (d)

Figure 13.6 Polarization states of two photons in π0 CM. The wavy lines denote the photonmomentum and the bold arrows denote their spin orientation. (a) RL, (b) LR have Jz D ˙2while (c) RR, (d) LL have Jz D 0.

Writing the photon polarization vectors as ex , e y , we can express the right-handedand left-handed states as

jRi D �ex C i eyp

2, jLi D

ex � i eyp

2(13.10)

Then

P D ˙1$ R R ˙ LL �

(ex ex � e y e y

ex e y C e y ex(13.11)

This means that if P D C1, the polarization of the two photons is parallel, whilethe polarization of the negative parity states is orthogonal. Since the direction ofthe polarization is that of the electric field, electron–positron pairs produced inγ ! eC C e� tend to be emitted in the direction of the electric field. Among theπ0 decay reactions, there is a mode called Dalitz decay

π0 ! (eC C e�)C (eC C e�) (13.12)

which can happen via inner bremsstrahlung. Therefore an experiment to measurethe angular distribution of the two electron pairs can reveal the parity of π0. Atheoretical calculation gives 1 ˙ A cos θ , where θ is the angle between the twoplanes that the pairs make and A ' 0.48. Measurement gives Pπ D �1.

Tensor Method It is instructive to rederive the previous result using anothermethod, the tensor method. It is complementary to the partial wave analysis andin many cases simpler and intuitive. We try to construct the S-matrix, at least partof it, using materials available in the problem. We note first that the S-matrix is ascalar because of rotational invariance. In the decay π0 ! 2γ , if π0 is has spin 0,the wave function has only one degree of freedom and is a scalar. But if π0 has spin1, it has three degrees of freedom, hence the S-matrix has to be a scalar productof the pion polarization vector π with another vector. The same argument appliesto any incoming or outgoing particles. Therefore the amplitude must be linear in�1, �2 and also π if the pion is a vector. The materials to construct the S-matrix are

Page 477: Yorikiyo Nagashima - Archive

452 13 Spectroscopy

Table 13.1 Tensor analysis of π0 ! γ γ decay amplitude.

J P (π0) Amplitude Polarization direction

0C (�1 � �2) f (k2) �1 k �2

0� (k � �1 � �2) f (k2) �1 ? �2

1� (π � k)(�1 � �2)g(k2) �1 k �2

1C (π � �1 � �2)g(k2) �1 ? �2

Note: k � (�1 � �2)D �1(k � �2)� �2(k � �1) D 0

the momenta (k1, k2) and polarization vectors (�1, �2) of the photon. The problemsimplifies if we choose the working frame properly, namely, the pion rest frameand the Coulomb gauge, k1 D �k2 D k and �1 � k D �2 � k D 0. The amplitudesthat can be made of these materials are listed in Table 13.1, where f (k2), g(k2)represent the remainder of the amplitude.

Under the parity transformation, k ! �k and the pion wave function alsochanges its sign depending on its intrinsic parity. Since the final state is composedof two identical bosons, the amplitude has to be symmetric under exchange of thetwo particles, i.e. k $ �k and �1 $ �2. The 1˙ amplitudes do not satisfy this re-quirement and are forbidden. The remaining choice is between 0C and 0�, whichmust be determined experimentally.

While the possibility of s � 2 is not excluded, it is unlikely considering the factthat the three members of the pion (πC, π0, π�) have almost the same mass. Theyconstitute an isospin triplet (to be discussed later), and are considered to share thesame properties. If s � 2, the angular distribution of π0 ! 2γ would be anisotrop-ic, which is not observed experimentally. In high-energy reactions, a great numberof pions are produced and their multiplicity is expected to be proportional to theirdegrees of freedom (2s C 1). But it is the same for all π˙, π0, so we may assumethey have the same spin and parity. Today, pions are considered as composites ofu, d quarks and u, d antiquarks, πC � �ud, π0 � uu � dd, π� � du. Their totalspin is s D 0, the relative orbital angular momentum ` D 0 and they constituteJ P D 0� mesons.

Problem 13.1

Prove the following statements assuming parity invariance.

1. The decay 0C ! 0C 0 is allowed, but 0� ! 0C 0 is not, where the final stateis composed of two identical particles. Experimentally, η(548) is known to havespin 0 and

Γ (η ! πCπ�)Γtotal

< 1.3� 10�5 ,Γ (η ! π0π0)

Γtotal< 4.3� 10�4 90% CL

(13.13)

Page 478: Yorikiyo Nagashima - Archive

13.2 Pions 453

which tells us that η has J P D 0�.2. 0C ! 0� C 0� C 0� is forbidden.3. 0� ! 0C C γ , 0� C γ are forbidden.4. 0� ! 0� C 0� C 1� is allowed.5. 1� ! 0� C 1� is allowed.6. Let slow antiprotons stop in hydrogen. Prove that the process p C p ! 2π0 is

forbidden.

G Parity As was shown in Section 9.3.4, the transformation of a particle doubletto an antiparticle doublet is obtained by the successive operation of C and e�i π I2,rotation about the axis I2 by angle π. It is referred to as the G operation:�

pn

�!

��np

�D C

��np

�D C

�0 �11 0

��pn

�� G

�pn

�G D C e�i π I2

(13.14)

The fact that the G-operated state transforms in the same way under isospin ro-tation means that the G and isospin operations are commutable. Let us operatewith G on the pion fields. The C operation interchanges π D (π1 C i π2)/

p2 and

π† D (π1 � i π2)/p

2, which means π2 ! �π2. The charge symmetry operationis a rotation around the y-axis and changes π1, π3 ! �π1, �π3. The net effect ofthe G transformation is to change the sign of all the pions:

G π D �π (13.15)

This means a system of n pions has G parity (�1)n . Unlike C parity, G parity canbe defined for a charged system and is more convenient. However, since the Goperation includes a rotation in isospin space, the applicability is limited to thestrong interaction. In the electromagnetic interaction, G parity is not conserved.Conversely it can be used to differentiate one from the other. Especially useful isan application to the neutral I3 D 0 system.

Theorem 13.1

The G parity of a system consisting of a particle q and an antiparticle q is given by(�1)SCLC1.

Proof: As the isospin mathematics is the same as that of the angular momentum,the eigenvalue of e�i π I2 has the same form as that of e i πL y . As the system is neutraland has I3 D 0, the eigenfunction is YI0 � PI (the Legendre polynomial of Ithorder). As PI (�x ) D (�1)I PI (x ), and rotation by π around the y-axis changes(θ ! π � θ , φ ! π � π) or cos θ ! � cos θ ,

e i π I2 YI0 D (�1)I (13.16)

Page 479: Yorikiyo Nagashima - Archive

454 13 Spectroscopy

On the other hand C operation on a neutral system yields C D (�1)LCS [seeEq. (9.177)]. Consequently, G-parity of a neutral system is G D (�1)LCSCI . �

Examples of the G Parity Selection Rule(1) π production in p n system: The system has I D 1. For slow p stopping indeuterium, the reaction takes place in the S (L D 0) state. Then an even (odd)number of pions are produced in the decay from 3S1 (1S0) states.(2) �! 2π:

�˙(1�)! π˙π0 ) G� D C ! � �!� 3π (13.17a)

ω(1�)! πCπ�π0 ) Gω D � ! ω �!� 2π (13.17b)

Problem 13.2

Main decay modes of the η meson (mη D 548 MeV, Γη D 1.3 MeV) are

η ! πC C π� C π0 22.73˙ 0.34%

! π0 C π0 C π0 32.56˙ 0.23%

! 2γ 39.31˙ 0.20%

! πC C π� C γ 4.60˙ 0.16% (13.18)

All the decays are via electromagnetic interaction. Why? Considering C, P, J selec-tion rules, show that J P C of η is 0�C if J < 2.

13.3πN Interaction

13.3.1Isospin Conservation

Charge Independence of πN Coupling As the pion is a spin 0 scalar in real space,the simplest Lorentz invariant interaction can be expressed as

HINT � i gψγ 5ψπ 2) (13.19)

In isospin space, an obvious invariant that can be made from the two vectors ψτψand π is ψτ � πψ. The interaction Hamiltonian is now

HINT D i gψγ 5τ � πψ

D i g[(np C p n)π1 C i(np � p n)π2 C (p p � nm)π3]

D i g[p

2(p nπC C n p π�)C (p p � nn)π0] (13.20a)

2) ( f /mπ )ψ γ μ γ5 ψ@μ π is equally valid, but in the low-energy limit it is equivalent to thepseudoscalar interaction of Eq. (13.19) with f D g(mπ/2m p ) [refer to the low-energyapproximation of the Dirac bilinears in Eq. (4.2)].

Page 480: Yorikiyo Nagashima - Archive

13.3 πN Interaction 455

π� Dπ1 C i π2p

2, πC D

π1 � i π2p

2, π0 D π3 (13.20b)

where in the second and third lines we have omitted the space-time variables asthey play no role in the following discussion. The last line of Eq. (13.20a) meansthat the strengths of the interactions

p $ n C πC W n $ p C π� W p $ p C π0 W n $ n C π0 (13.21)

are proportional top

2 Wp

2 W 1 W �1 (13.22)

The charge independence condition Eq. (9.193) Vp p D Vnn D Vp n is derived fromthe above relation. The argument goes like this. The strength of the potential be-tween nucleons is expressed in Fig. 13.7.

Figure 13.7 The strengths of (a) Vp p , (b) Vnn and (c,d) Vp n are the same.

Fig. 13.7a Exchange of π0 Vp p / g2[(1) � (1)] D g2

Fig. 13.7b Exchange of π0 Vnn / g2[(�1) � (�1)] D g2

Fig. 13.7c Exchange of π0 Vp n / g2[(1) � (�1)] D �g2

Fig. 13.7d Exchange of π˙ Vp n / g2[(p

2) � (p

2)] D 2g2

Since p p and nn are both I D 1 states and symmetric with exchange of the parti-cles 1 and 2, the I D 1 part of the potential becomes

Vp n(c)C Vp n(d) D g2 D Vp p D Vnn (13.23)

which shows the charge independence of the nuclear force.

Isospin Exchange Force In Section 6.8 we saw that when two distinguishablefermions scatter by exchanging a scalar meson the scattering matrix is given byEq. (6.157)

T f i D (2π)4δ4(p1 C p2 � p3 � p4)

� u(p3)Γ u(p1)g2

q2 � μ2 u(p4)Γ u(p2)ˇˇ

qDp3�p1

(13.24)

Page 481: Yorikiyo Nagashima - Archive

456 13 Spectroscopy

where Γ D 1 (i γ 5) for a (pseudo)scalar meson exchange. In the nonrelativisticlimit, the scalar meson propagator becomes �g2/(q2 C μ2), which is the Fouriertransform of an attractive Yukawa potential. We now extend the discussion to in-clude the isospin degrees of freedom. The interaction including the isospin is givenin Eq. (13.20):

HINT D gψΓ τ � πψ D gX

i

ψΓτ i ψπ i (13.25)

where ψ is now a 2� 1 matrix in the isospin space. Then Eq. (13.24) is modified to

T f i � u(p3)Γτ1,i u(p1)g2δ i j

q2 � μ2 u(p4)Γτ2, j u(p2)ˇˇ

qDp3�p1

(13.26)

where we have omitted the δ-function. We see the only modification due to inclu-sion of the isospin is

g2

q2 � μ2!

g2

q2 � μ2(τ1 � τ2) (13.27)

where we have omitted the isospin wave function and expressed the scattering ma-trix as a 2 � 2 matrix in isospin space. The subscripts 1, 2 were attached to denotethat the τ matrix acts only on particle 1 or 2.

Now, a system of two identical fermions must be odd under permutation of par-ticles 1 and 2. Since we have included the isospin degree of freedom, regarding thenucleon (p , n) as identical but in different states, the permutation must be done onall three kinds of wave functions:

(1$ 2) � (isospin) � (spin ) � (spatial wave function) (13.28)

Since the two nucleons both have I D 1/2, the total isospin is either 1 or 0 andsymmetric or antisymmetric accordingly. The isospin wave functions are given by

I D 1 D

8<ˆ:jp p i I3 D C1

1p2(jp ni C jn p i) I3 D 0

jnni I3 D �1

(13.29a)

I D I3 D 01p

2(jp ni � jn p i) (13.29b)

The value of h f jτ1 � τ2jii is

τ1 � τ2

2D τ1

2C

τ2

2

�2� τ1

2

�2� τ2

2

�2D I(I C 1) � 2 �

12

�12C 1

D

(12 I D 1

� 32 I D 0

(13.30)

The deuteron, a bound state of (p n), has I D 0, S D 1 and only the even orbitalangular momentum L D 0, 2, . . . states are allowed, which is indeed the case. It hasa dominant S (L D 0) state with a small mixture of D (L D 2) state.

Page 482: Yorikiyo Nagashima - Archive

13.3 πN Interaction 457

For I D 0, τ1 � τ2 < 0 and the sign of the potential is reversed. Namely, thescalar meson exchange gives a repulsive force and cannot make a bound state ifthe isospin degree of freedom is taken into account. However, if the exchangedmeson is a pseudoscalar, Eq. (13.26) gives an attractive potential and can explainthe existence of a nuclear bound state. Since it takes some extra algebra to showthat the pseudoscalar exchange indeed gives an attractive force, explicit calculationswill be deferred to Section 13.6.

The appearance of the isospin exchange force (� τ1 � τ2) is due to the exchangeof isospin carried by the pion. Whenever the force carrier has a quantum numberthat is described by S U(N ), N � 2, a similar exchange force term appears. Thecolor degree of freedom carried by the gluon induces a color exchange force inS U(3). and it played a crucial role in disentangling the hadron mass spectrumand in proving the color charge to be the source of the strong force, which will bediscussed in detail in Section 14.6.

33 Resonance (Δ(1232)) In the early days before the discovery of quarks, hadrons,including pions and nucleons, were considered elementary and their dynamics wasextensively studied through their reactions. Figure 13.8 shows one example of suchinvestigations. One immediately notes a rich structure in their behavior and manyresonances have been identified. Today, they are recognized as superpositions ofexcited energy levels of quark composites. But discovery does not happen in a day.Given this kind of data, it is necessary to isolate levels, identify their propertiesand classify them to make a Mendeleev periodic table of hadrons. Namely, findingregularities among the many levels is the first step in reaching a deeper under-standing of the structure. This is the process called hadron spectroscopy. Beforelaunching a systematic investigation, we pick out the most conspicuous resonanceat M D 1232, which used to be called the 33 resonance, and apply the algebraof isospin symmetry to show how it works. There are eight combinations in πNscattering, but we pick out three easily accessible reactions in the following:

πC C p ! πC C p (13.31a)

π� C p ! π� C p (13.31b)

π� C p ! π0 C n (13.31c)

Expressing only the isospin indices, we write the scattering amplitude as

h f jT jii DXI,I3

XI 0 ,I 0

3

h f jI I3ihI I3jT jI 0 I 03ihI

0 I 03jii (13.32)

If charge independence is valid, the total isospin I and I3 are conserved. The dif-ference of the I3 is that of the electric charge, and so I3 conservation follows fromelectric charge conservation, too. Just like rotational invariance in real space, thescattering amplitude can be decomposed into partial amplitudes of definite isospin.

Page 483: Yorikiyo Nagashima - Archive

458 13 Spectroscopy

Figure 13.8 Resonance structure of low-ener-gy π˙–p scattering data. The names are N forI D 1/2 and Δ for I D 3/2 resonances. Thenumber in parenthesis is its mass value. S, P,D, F, G, and H stand for the L D 0, 1, 2, 3, 4and 5 states. D15 means L D 2, I D 1/2,J D 5/2. Some resonances are overlaps of

two or more states. N(1680) may be an over-lap of N(1675) D15 and N(1680) F15. Simi-larly, Δ(1920) may be an overlap of Δ(1920)P33 and Δ(1930) D35. N(2200) may also bean overlap of (N(2190) G17 and N(2220)H19) [156].

The Wigner–Eckart theorem tells us that

hI I3jT jI 0 I 03i D δ I I 0 δ I3 I 0

3hI jT jI i � δ I I 0 δ I3 I 0

3T(I ) (13.33)

The transformation matrices hI I3jii can be obtained from the Clebsch–Gordancoefficient table (see Appendix E). Let us first rewrite the jπNi states in terms ofjI I3i: ˇ

ˇ32 ,32

�D jπC p i (13.34a)ˇ

ˇ32 ,12

�D

r13jπCni C

r23jπ0 p i (13.34b)ˇ

ˇ32 ,�12

�D

r23jπ0ni C

r13jπ� p i (13.34c)ˇ

ˇ32 ,�32

�D jπ�ni (13.34d)

Page 484: Yorikiyo Nagashima - Archive

13.3 πN Interaction 459

ˇˇ12 ,

12

�D

r23jπCni �

r13jπ0 p i (13.35a)ˇ

ˇ12 ,�12

�D

r13jπ0ni �

r23jπ� p i (13.35b)

Solving the above equations in terms of jI I3i, we obtain3)

jπC p i Dˇˇ32 3

2

�(13.36a)

jπ� p i D

r13

ˇˇ32 ,�

12

��

r23

ˇˇ12 ,�

12

�(13.36b)

jπ0ni D

r23

ˇˇ32 ,�

12

�C

r13

ˇˇ12 ,�

12

�(13.36c)

Using the Wigner–Eckart theorem

hπC p jT jπC p i D�

32

32

ˇˇ T

ˇˇ32 3

2

�D T

�32

�(13.37a)

hπ� p jT jπ� p i D

"r13

�32

,�12

ˇˇ �

r23

�12

,�12

ˇˇ#

T

"r13

ˇˇ32 ,�

12

��

r23

ˇˇ12 ,�

12

�#

D13

�T�

32

�C 2T

�32

��(13.37b)

hπ0njT jπ� p i D

"r23

�32

,�12

ˇˇC

r13

�12

,�12

ˇˇ#

T

"r13

ˇˇ32 ,�

12

��

r23

ˇˇ12 ,�

12

�#

D

p2

3

�T�

32

�� T

�32

��(13.37c)

The three amplitudes (or all eight amplitudes) are written in terms of two indepen-dent amplitudes T(3/2) and T(1/2). Combining the three equations in Eq. (13.37),

�hπC p jT jπC p i C hπ� p jT jπ� p i Cp

2hπ0njT jπ� p i D 0 (13.38)

The three amplitudes make a triangle in the complex plane. If we express the cor-responding cross sections by dσC, dσ�, dσ0, Eq. (13.38) can be rewritten as

jp

dσ� �p

2dσ0j <p

dσC <p

dσ� Cp

2dσ0 (13.39a)

3) The Clebsch–Gordan coefficients make an orthogonal matrix. Therefore, reading the tablevertically instead of horizontally, the inverse matrix elements can be obtained.

Page 485: Yorikiyo Nagashima - Archive

460 13 Spectroscopy

jp

2dσ0 �p

dσCj <p

dσ� <p

2dσ0 Cp

dσC (13.39b)

jp

dσC �p

dσ�j <p

2dσ0 <p

dσC Cp

dσ� (13.39c)

As the cross sections are proportional to the square of the amplitude, they can beexpressed as

dσC D kˇˇT�

32

�ˇˇ2 (13.40a)

dσ� D kˇˇ13�

T�

32

�C 2T

�12

� ˇˇ2 (13.40b)

dσ0 D k

ˇˇp2

3

�T�

32

�� T

�12

� ˇˇ2 (13.40c)

If jT(3/2) T(1/2)

dσC W dσ� W dσ0 D 9 W 1 W 2 (13.41)

If jT(3/2)� T(1/2)

dσC W dσ� W dσ0 D 0 W 2 W 1 (13.42)

The above relations use only isospin conservation; no other restrictions applyand consequently they are valid for any value of dynamic variables such as thescattering angle or the energy as long as the comparison is made with the samecondition. The data of a conspicuous resonance Δ(1232) at the total energy

ps D

1100 � 1300 MeV or the kinetic energy of the pion 100 . Tπ . 300 are shown inFig. 13.9. The measured total cross sections at the peak are

210 mb W 24 mb W 48 mb 4) (13.43)

which shows that Δ(1232) has I D 3/2. On the other hand, Δ(1232) is observedin πC p , πCn, π� p , and π�n channels having electric charge (CC,C, 0,�), mak-ing a quartet, and I D 3/2 can be inferred without the above demonstration. Butthe fact that the cross sections were just as predicted is strong evidence that thehypothesis of charge independence is valid in πN reactions or more generally inthe strong interaction.

Other Tests Other evidence is given by the absence of

d C d ! 4HeC π0 (13.44)

The deuteron and helium are isosinglets (I D 0), and the pion an isotriplet (I D 1),so the above reaction, if it exists, violates isospin conservation.

4) 1 mb D 10�27 cm2.

Page 486: Yorikiyo Nagashima - Archive

13.3 πN Interaction 461

Figure 13.9 Three scattering cross sections in the Δ(1232) region.

Problem 13.3

Helium 3He and tritium 3H make an isospin doublet. Show that

dσ(p d ! 3HπC)dσ(p d ! 3Heπ0)

D 2 (13.45)

Problem 13.4

Show that

dσ(p p ! dπC)dσ(n p ! dπ0)

D 2 (13.46)

Problem 13.5

(KC, K0), (Σ C, Σ 0, Σ �) make I D 1/2 and 1 multiplets. Writing the amplitudesfor

πC C p ! KC C Σ C , π� C p ! K0 C Σ 0 , π� C p ! KC C Σ �

(13.47)

as aC, a0, a�, show thatp

2a0 C a� � aC D 0 (13.48)

Problem 13.6

(KC, K0), (� 0, � �) make I D 1/2 doublets. Consider the reactions

K�n ! K0� � , K� p ! KC� � , K� p ! K0� 0 (13.49)

Page 487: Yorikiyo Nagashima - Archive

462 13 Spectroscopy

and show the relation among the amplitudes

f (K0� �) D f (KC� �)C f (K0� 0) (13.50)

Problem 13.7

Λ0 is an isosinglet particle. Show that

dσ(πCn ! KCΛ0) D dσ(π� p ! K0Λ0) (13.51)

Fermi–Yang Model If we make an isotriplet from two isodoublets (p , n) and(�n, p ) we have

j1, 1i D �p n , j1, 0i D1p

2(p p � nn) , j1,�1i D n p (13.52)

These states have baryon number 0, isospin I D 1 and the same quantum numberas the pions (πC, π0, π�). Therefore the idea that the pions are bound states of anucleon and an antinucleon was proposed by Fermi and Yang [139]. This was thepredecessor of the Sakata model, which extended the S U(2) Fermi–Yang model toS U(3), taking (p , n, Λ) as the three basic constituents of all the hadrons, which inturn developed into the successful quark model.

13.3.2Partial Wave Analysis

A scattering process can be decomposed into a sum of contributions with definiteangular momentum J, which is called partial wave expansion. Even if a resonanceis observed at a certain value of J, there are contributions from other partial waves,which may not form resonances. Generally they are slowly varying backgrounds asa function of the energy. If the energy is high and many inelastic channels consist-ing of more than one π and a nucleon are open, the contribution of the resonancemay be only a small fraction and may not appear as a sharp peak. To find reso-nances in such a case, a partial wave analysis taking into account other contribu-tions is necessary. The starting point is the partial wave expansion formula derivedin Chap. 9. We reproduce them here:

dσdΩD j f λ3 λ4,λ1 λ2 j

2 (13.53a)

Page 488: Yorikiyo Nagashima - Archive

13.3 πN Interaction 463

f λ3λ4,λ1 λ2 D

rp f

p i

M f i

8πp

s5) (13.53b)

D1

2 i p i

XJ

(2 J C 1)hλ3λ4j[S J (E )� 1]jλ1 λ2idJλμ(θ )e i(λ�μ)φ

(13.53c)

P j J λλ1λ2i D η1η2(�1) J�s1�s2 j J λ,�λ1,�λ2i (13.55a)

h�λ3,�λ4jS J j � λ1,�λ2i D ηS hλ3λ4jS J jλ1λ2i (13.55b)

ηS D η1η2η3η4(�1)s3Cs4�s1�s2 (13.55c)

h�λ3,�λ4j f (θ , φ)j � λ1,�λ2i D ηS hλ3λ4j f (θ , π � φ)jλ1λ2i (13.55d)

where all the kinematical variables are to be evaluated in the center of mass frame.P is the parity operator and λ i , η i are helicity and intrinsic parity of the particle irespectively. As the nucleon and the pion have spin 1/2 and 0 respectively, we setλ1 D 0, λ2 D ˙1/2. In the following we write ˙ instead of ˙1/2 for simplicity.According to Eq. (13.55b), we have

S JCC D S J

�� , S JC� D S J

�C (13.56)

Denoting the orbital angular momentum as L, we have L D J ˙ 1/2. Consideringthat π has intrinsic parity �1, the parity transformation formula (13.55a) becomes

P j J M I˙i D �(�1) J�1/2j J M I�i (13.57)

where J, M denote the angular momentum and its component along z-axis. Then,eigenstates of the parity operator can be constructed as

j J�i Dj J M ICi C j J M I �i

p2

j JCi Dj J M ICi � j J M I�i

p2

(13.58)

and their parity is given by

P j J˙i D ˙(�1) J�1/2j J˙i D �(�1) J˙1/2j J˙i (13.59)

Equation (13.59) means j J˙i corresponds to L D J˙1/2. According to Eq. (13.56),there are only two independent amplitudes for a fixed J. Two L’s with the same J

5) The relation with the scattering matrix is given by Eq. (6.35):

S f i D δ f i � iT f i D δ f i � i(2π)4 δ4X

i

pi �X

f

p f

�M f i (13.54)

Page 489: Yorikiyo Nagashima - Archive

464 13 Spectroscopy

state differ by 1 and hence correspond to states of different parity. If parity is con-served, they do not mix. Therefore, partial matrices made of states J� are diago-nalized. From the unitarity of the S-matrix, we have

S J� D e2 i δ J� , S JC D e2 i δ JC (13.60)

where δ J is referred to as the phase shift. When inelastic channels are open, theunitarity condition changes. If we are considering only elastic scattering channel,the effect appears as attenuation of the flux. It can be taken into account by consid-ering a complex phase shift that has a positive imaginary part. We define

e2 i δ J D e2 i(Reδ J CiImδ J ) D η J e2 iReδ J (13.61)

Then, using Eqs. (13.58), we have

h J M ICjS � 1j J M ICi2 i p

�S J

CC � 1

2 i pD

e2 i δ� C e2 i δC � 24 i p

�f J� C f JC

2(13.62a)

h J M ICjS � 1j J M I �i2 i p

�S J

C� � 1

2 i pD

f J� � f JC2

(13.62b)

f J˙ D1p

ei δ J˙ sin δ J˙ (13.62c)

which defines the partial scattering amplitudes. Inserting Eqs. (13.62) inEq. (13.53c) and rewriting d j

λμ using formula in Eq. (E.22), we obtain

fCC DX

J

( f J� C f JC ) cosθ2

hP 0

JC1/2(θ )� P 0J�1/2(θ )

i(13.63a)

fC� DX

J

( f J� � f JC ) sinθ2

hP 0

JC1/2(θ )C P 0J�1/2(θ )

ie�i φ (13.63b)

P 0L D

dPL(θ )d cos θ

(13.63c)

P0(x ) D 1 , P1(x ) D x , P2(x ) D3x2 � 1

2,

P3(x ) D5x2 � 3x

2, � � � (13.63d)

where PL(θ )’s are Legendre polynomials. The other two amplitudes can be ob-tained from Eq. (13.55d). Without loss of generality we can set φ D 0 and we have

f�� D fCC , f�C D � fC� (13.64)

In ordinary experiments, helicity states are not observed. In this case, the averageof the initial and sum of the final helicity states are taken to give

dσdΩD

12

Xλμ

j f μλj2 D j fCCj2 C j fC�j2 (13.65)

Page 490: Yorikiyo Nagashima - Archive

13.3 πN Interaction 465

Polarization of the Nucleon If the target nuclei are unpolarized, nucleons in thefinal states of πN scattering are polarized perpendicular to the reaction plane. Thereasoning is as follows. For fixed z-axis, there are two spin degrees of freedomin both the initial and final states. Consequently the scattering amplitude can beexpressed as a 2 � 2 matrix. Any 2 � 2 matrix can be expanded in terms of a unitmatrix and Pauli’s matrices σ. The rotational invariance of the interaction restrictsthem to be scalars. Since the only available physical quantities in the center of massframe are initial momentum p i and final momentum p f , the scalars that can bemade are limited to

1 , σ � p i , σ � p f , σ � p i � p f (13.66)

σ � p i and σ � p f change sign under the parity operation and hence do not contributein parity-conserving reactions. Only the terms 1 and σ � p i � p f are allowed and theformer does not change the polarization of the nucleon. From the latter, we knowthat the polarization generated by the scattering is perpendicular to the scatteringplane.

If we set φ D 0, the scattering occurs in the x–z plane and the polarization isin the y direction. If we adopt the z-axis along the final momentum direction, theeigenstates of σ y can be constructed from the helicity eigenstates in the followingway:

j "i D1p

2(jCi C ij�i)

j #i D1p

2(jCi � ij�i)

(13.67)

The probability of the spin being upward (s y D 1/2) is given by

dσdΩ

(") D12

jh" j f jλij2 D14

j fCλ � i f�λj2

D Im( f �C� fCC)C

12

(j fCCj2 C j fC�j2)

(13.68)

As the probability of the spin being downward can be calculated similarly, the de-gree of polarization is given by

P �dσ(")� dσ(#)dσ(")C dσ(#)

D2Im( f �

C� fCC)

j fCCj2 C j fC�j2(13.69)

The polarization gives information independent from that given by the scatteringcross section. For instance, as is described in the following, the angular distribu-tion of scattering cross sections having the same J but different L is the same andcannot be distinguished for different L or equivalently different parity states. Thena polarization measurement can resolve the degeneracy.

Page 491: Yorikiyo Nagashima - Archive

466 13 Spectroscopy

Differential Cross Section When only a specific J contributes, the unpolarizedcross section is given by

dσdΩD

�J C

12

�2

j f J˙ j2h�

d J1/2,1/2

�2C (d J

1/2,�1/2)2i

D 2π(2 J C 1)j f J˙ j2 I J (θ )

(13.70)

where we used Eq. (E.18) and defined I J (θ ) as a normalized angular distributionsuch that integration over 4π solid angle gives 1. When J D 1/2, f J� correspondsto L D 0 and f JC to L D 1. If only a single partial wave contributes, we have, usingEqs. (13.63),

S1/2 W fCC D f1/2� cos(θ /2) , fC� D f1/2� sin(θ /2)P1/2 W fCC D f1/2C cos(θ /2) , fC� D � f1/2C sin(θ /2)

(13.71)

Using the notation commonly used in spectroscopy,6) the differential cross sectionsare expressed as

S1/2 WdσdΩD j fCCj2 C j fC�j2 D j f1/2�j2 (13.72a)

P1/2 WdσdΩD j f1/2Cj2 (13.72b)

Similarly using Eq. (E.22)

d3/2˙1/2,1/2 D

(3 cos θ ˙ 1)4

�cos(θ /2)sin(θ /2)

�(13.72c)

P3/2 WdσdΩD 8πj f3/2�j2

(1C 3 cos2 θ )8π

(13.72d)

D3/2 WdσdΩD 8πj f3/2Cj2

(1C 3 cos2 θ )8π

(13.72e)

From the above formula and also from Eq. (13.70), we see that the angular distri-butions are solely functions of J and are independent of parity or L for the same J.

13.3.3Resonance Extraction

Reactions between pions (π) and nucleons (denoted as N, not to be confused withn to denote neutrons) are among the best investigated phenomena. This is becausethe π meson is light and an intense pion beam can easily be made by bombardingany material with protons with energy higher than 300 MeV. Resonances, whichappear as large peaks of the scattering cross sections, were of immediate interest.With the advent of (then) high-energy intense accelerators, together with the in-vention of a powerful detector, the bubble chamber, (see Fig. 14.2 for more detaileddescription) many hadronic resonances were discovered and investigated in the

6) S, P, D, F, . . . denotes L D 0, 1, 2, 3, . . . .7) As the antinucleon has the baryon number �1, extra nucleon–antinucleon pairs are allowed in

the final state, but we do not consider them staying in relatively low-energy regions.

Page 492: Yorikiyo Nagashima - Archive

13.3 πN Interaction 467

1960s. Although the resonances as well as the pion and the nucleon (collectivelyreferred to as hadrons) are actually composites of quarks, we will stick mostly tothe terminology used in spectroscopy in this chapter. The composite nature of thehadrons will be treated in the next chapter. Kaons and hyperons Y (baryons thatcarry strangeness quantum number) were also investigated in this era. Apart fromselection rules arising from the strangeness, kinematical structures of K N or πYare similar to those of πN . Therefore, we mostly concentrate on πN phenomena.Resonances in the πN system denoted as N�, N��, � � � are dynamical enhance-ments of many particles in scattering states. Like those of atoms and nuclei, higherresonances quickly descend to lower energy levels, eventually settling in the groundstate as a nucleon and mesons.7) Namely

π C N ! N�� ! N� C nπ ! N C mπ (13.73)

Because of baryon number conservation, N�� and N� have the baryon number 1.It is a number assigned to the nucleon, hyperon and their excited states. In con-trast, mesons have baryon number 0. A general pattern of πN reactions is depictedin Fig. 13.10. Hadronic resonances typically have widths (Γ ) of 10–100 MeV, which

= = + + + . . .

(a) (b) (c) (d) (e)

Figure 13.10 A general pattern of πN reac-tions. Thin lines denote π and N and thicklines denote resonances. Resonances haveshort decay times, typically on the order of

the strong time scale� 10�23 s. π Mesonsand other weakly decaying particles (K, Λ, Σ ,etc.) which have lifetimes of 10�10–10�8 s areconsidered as stable in strong interactions.

means their lifetimes are τ � „/Γ � 10�22–10�23 s.8) At flight velocity near thespeed of light their flight distance before decay is at most � 10�15 m, compara-ble to hadron size and far below observable limits. What is observed is debris ofdecayed resonances. However, energy-momentum conservation helps us to recon-struct their mass. We define the invariant mass M of a many-particle system by

M2 D (p1 C p2 C � � � )2 D (E1 C E2 C � � � )2 � (p1 C p2 C � � � )2 (13.74)

where pi ’s are four-component momenta of particles. Therefore, one method tofind the resonances is to pick up mπ or mπ C N out of an n-particle final statein the multi-pion production reaction, plot the number of appearances as a func-tion of their invariant mass and see if a peak or peaks appear. In the special case

8) A particle flying at near light speed can have longer life time because of Lorentz time dilatation.Here we are considering only the order of magnitude.

Page 493: Yorikiyo Nagashima - Archive

468 13 Spectroscopy

where resonances can be formed by an incoming π (particle 1) and a target nucleon(particle 2) (Fig. 13.10c), the invariant mass is given by

M2 D (p1 C p2)2 � s D (E1 C m)2 � p22

D (m1 C m2)2 C 2m2T1

T1 D E1 � m1

(13.75)

Here we assumed the target is at rest and used p 2 D 0. A plot of scattering crosssection as a function of the invariant mass immediately indicates the existence ofa resonance as a peak or bump (see Fig. 13.8) and its mass can be determined byEq. (13.75). To sort out many resonances, their quantum numbers, including spin,parity and isospin, have to be determined.

Resonance Behavior Let us look at the behavior of a resonance when it is a dom-inant contribution to the scattering amplitude. Since the resonance is an unstableintermediate state, we may treat it as a particle that has energy with negative imagi-nary part Γ /2 [see Eq. (2.14)]. When a resonance intermediate state jRi dominates,perturbation expansion of the scattering amplitude in quantum mechanics gives

T f i DX

n

h f jHI jnihnjHI jiiE0 � En

�h f jHI jRihRjHI jii

E0 ��ER � i Γ

2

� /

pΓ fp

Γi

E0 ��ER � i Γ

2

�(13.76)

Here Γ f � jh f jHI jRij2, Γi � jhijHI jRij2 are partial decay widths to the statesj f i, jii and the total decay width is given by

Γ DX

k

Γk /X

k

jhkjHI jRij2 (13.77)

The elastic scattering matrix for a single partial wave can be expressed as

S D e2 i δ Dcot δ C icot δ � i

(13.78)

Let us expand cot δ in Taylor series in the neighborhood of ER , which gives δ Dπ/2,

cot δ D cot δER C (E � ER )d cot δ

dE

ˇˇ

ER

C � � � (13.79)

At E D ER the real part of the denominator vanishes, we can put

cot δER D 0, �d cot δ

dE

ˇˇ

ER

D �dδdE

ˇˇ

ER

� �2Γ

(13.80a)

The scattering matrix can be approximated as

S �E � ER � i(Γ /2)E � ER C i(Γ /2)

(13.80b)

Page 494: Yorikiyo Nagashima - Archive

13.3 πN Interaction 469

which results in the scattering amplitude

f (E ) De i δ sin δ

pD �

1p

(Γ /2)E � ER C i(Γ /2)

(13.81)

This has the desired form to reproduce a resonant shape of the scattering crosssection. In other words, the resonance occurs where the phase shift δ J of a partialwave becomes π/2 ˙ nπ. As can be seen from the definition of Γ [Eq. (13.80a)],if the phase shift as a function of energy increases rapidly in the vicinity of π/2, Γbecomes small, giving a sharp peak.

Unitary Limit For each partial wave, the range of value S J D e2 i δ J is constrainedto je2 i δ J j 1 and we have the following unitary limit for the scattering amplitude:

j f J˙ j2 D

sin2 δ J˙p 2

1p 2

(13.82)

This is equivalent to saying that the probability should not exceed 1. The crosssection of a partial wave for an unpolarized target in the vicinity of a resonance isgiven by

dσdΩD

2π(2 J C 1)p 2

(Γ /2)2

(E � ER )2 C (Γ /2)2 I J (θ ) (13.83)

For inelastic scattering (i ¤ f ) it becomes [see Eq. (13.76)]

dσdΩD

2π(2 J C 1)p 2

(Γ f /2)(Γi/2)(E � ER )2 C (Γ /2)2

I J (θ )9) (13.84)

In particular, for J D 3/2

dσdΩD

8πp 2

(Γ f /2)(Γi/2)(E � ER )2 C (Γ /2)2

�1C 3 cos2 θ

�(13.86)

From Eq. (13.83), we see that the resonance occurs at E D ER and when no inelasticchannels are open it can reach the unitary limit.

Spin and Isospin of Δ(1232) Δ(1232), denoted as Δ in this chapter, appeared inthe πN scattering cross section as the first prominent peak in Fig. 13.8:

π C N ! Δ ! π C N (13.87)

Its isospin is 3/2 since four charged states (ΔCC, ΔC, Δ0, Δ�) were observed.In order to determine its angular momentum J, we compare the total cross section

9) Equations (13.83) and (13.84) apply to spin-averaged πN scattering. More generally, for incidentparticles with spin sa and sb , the spin-averaged resonance formula is given by

dσdΩD

4π(2 J C 1)(2sa C 1)(2sb C 1)

1p2

(Γ f /2)(Γi /2)

(E � ER )2 C (Γ /2)2I J (θ ) (13.85)

Page 495: Yorikiyo Nagashima - Archive

470 13 Spectroscopy

with the unitary limit. As it occurs around T ' 180 MeV, which is low enough to beable to ignore the inelastic channel contribution, we may approximate Γi D Γ f D

Γ . Then the cross section should take the value of unitary limit at the resonancepeak.

Figure 13.11a [264] shows that the unitary limit is reached almost completely,which determines the spin of Δ(1232) to be J D 3/2. Then, the angular distribu-tion should be that of J D 3/2 given by Eq. (13.86). The data [31] (Fig. 13.11b) atT D 170 MeV confirm this. As previously explained, the angular distribution at theresonance alone cannot distinguish the L value or parity of Δ,

PΔ D Pπ Pp (�1)L D (�1)LC1 (13.88)

but considering the interference effect with other partial waves, it can be deter-mined. Classically, the orbital angular momentum corresponds to L D r � p ,where r is restricted by the nuclear force range (jrj < R), which is known tobe � 10�13 cm. At the Δ region, the pion momentum is � 300 MeV/c, and on-ly L D 0, 1 are significant. Thus we are led to conjecture that Δ is P3/2 with L D 1.To confirm this, let us consider how the angular distribution is affected if there is asmall L D 0 background, which is expected to vary slowly. Using Eqs. (13.63), thescattering amplitude including both S (L D 0) and P (L D 1) wave contributions isexpressed as

dσ � jS j2 C jP j2(1C 3 cos2 θ )C 4Re(S�P ) cos θ (13.89a)

On the other hand, if we assume S (L D 0) and D (L D 2), the expression becomes

dσ � jS j2 C jDj2(1C 3 cos2 θ )C 4Re(S�D)(3 cos2 θ � 1) (13.89b)

From Eqs. (13.89), we see that the interference term gives cos θ dependence andproduces asymmetry for the assignment Δ D P , but no asymmetry for D. In gen-eral, asymmetric terms arise from interferences between waves of opposite parity.The resonance amplitude is expressed as

f J �1

E � ER C i(Γ /2)D

E � ER � i(Γ /2)(E � ER)2 C (Γ /2)2 (13.90)

and its functional form is described in Fig. 13.12. Assuming the S wave does notproduce a resonance in this region, its amplitude is a slowly varying function. Thenthe interference term with the real part of the resonance amplitude changes its signbelow and above the resonance energy. Angular distributions at Tπ D 98, 310 MeVshown in Fig. 13.11b behave exactly as expected for the existence of cos θ interfer-ence, which vanishes at the resonance but produces opposite asymmetries belowand above the resonance, confirming our previous conjecture that L D 1. In sum-mary, the spin parity of Δ is determined to be J P D (3/2)C.

Page 496: Yorikiyo Nagashima - Archive

13.3 πN Interaction 471

0 30 60 90 120 150 180CM angle in degrees

dσ/d

Ω i

n m

b/st

er

0

5

10

15

20

25

30

Tπ = 170 MeV

Tπ = 98 MeV

Tπ = 310 MeV

(1 + cos2θ)

(a)

(b)

Figure 13.11 (a) The production cross sec-tion of Δ(1232) reaches its unitary limit8πλ/2 D 8π/p 2 [264]. (b) Angular distri-butions of the πN system in the neighbor-hood of Δ(1232). At the resonance peak

(Tπ D 180 MeV, (s D 1232 MeV)2), theangular distribution is � 1C 3 cos2 θ . Belowand above the resonance, the angular distri-bution is asymmetric due to interference withS-wave amplitude (see text) [23, 31, 90].

Page 497: Yorikiyo Nagashima - Archive

472 13 Spectroscopy

Im ap

Re ap

EER

Re as

Figure 13.12 Qualitative behavior of a resonance amplitude aP and background amplitude aS.

13.3.4Argand Diagram: Digging Resonances

Not all resonances are as conspicuous as Δ(1232). Some of them appear as peaksand bumps, but many are hidden behind conspicuous resonances or overwhelm-ing backgrounds that come from inelastic channels. To disentangle many overlap-ping resonances and backgrounds, it is necessary to make a detailed partial waveanalysis as a function of energy and angles. If it is assumed that the maximumorbital angular momentum is limited by Lmax ' p R, R ' 1/mπ , the scatteringamplitude can be expressed as an finite number of partial waves. When the cross-section as well as polarization data are fitted as functions of energy and angle, everypartial wave scattering amplitude (A J ) can be determined, at least in principle. Letus examine how these partial amplitudes behave. If inelastic channels are open that

Im AJ

Re AJ

r 1/2

2δJ

ηJ2

Figure 13.13 A complex partial-wave ampli-tude lies within a circle of radius 1/2 (unitari-ty). For an elastic channel, it is located at theorigin when the kinetic energy is zero. As theenergy increases, it moves clockwise or coun-

terclockwise depending on whether the forceis repulsive or attractive. When δ J D π/2,namely when the amplitude crosses the imag-inary axis counterclockwise, it shows a reso-nance behavior.

Page 498: Yorikiyo Nagashima - Archive

13.3 πN Interaction 473

damp the elastic waves, its effect can be taken into account by introducing elasticityas defined in Eq. (13.61). Then the partial-wave scattering amplitude can be writtenas

A J � p f J Dη J e2 i δ J � 1

2 i(13.91)

On the Re(A J )–Im(A J ) plane, A J is confined to within a circle of unit radius withits center located at x D 0, y D 1/2. This is called the Argand diagram. Thisis another expression for the unitary limit. Starting at the origin when the kineticenergy T D 0, A J proceeds on the circle clockwise when the force is repulsiveand counterclockwise when attractive. As the rotation angle is given by 2δ and itsdistance from the center by η J /2, the track departs from the circle when inelasticchannels such as 2π C N, 3π C N, � � � are open (see Fig. 13.13). When it crossesthe imaginary axis counterclockwise, δ J D π/2, it becomes a resonance amplitude.When the speed at crossing is fast, the resonance width is narrow [see Eq. (13.80)].In the vicinity of the resonance, the amplitude A J takes the form of Eq. (13.81); wecan express it as a function of two parameters xσ , xλ and � defined by

A σλ D �

pxσ xλ

� C i

xσ DΓσ

Γ, � D (E � ER)

(13.92)

When many channels mix, the behavior of the resonance (in the Argand diagrams)is very complex (see Problem 13.8). Generally, if the track is circling counterclock-wise, a place where it runs parallel to the x-axis can be considered as a resonance.For πN and K N scattering, there are rich stocks of data and detailed analyses havebeen carried out [204]. Figure 13.14 shows part of two independent analyses usingdifferent formalisms. In general, they show good agreement. We see that many res-onances that are not apparent in Fig. 13.8 were revealed by doing detailed partialwave analysis.

Problem 13.8

Show that resonances in the Argand diagram in Fig. 13.15 occur for the followingcases:

(a) Resonance amplitude only.(b, c) Resonance when it coexists with a background wave that has attractive force

and the same spin-parity.(d) Resonance when it coexists with a background wave that has repulsive force

and the same spin-parity.

Page 499: Yorikiyo Nagashima - Archive

474 13 Spectroscopy

Figure 13.14 Examples of πN scatteringpartial-wave analysis. It is common to useN and Δ to denote I D 1/2 and I D 3/2with S (strangeness)D 0 members. (a)P33 (L D 1, I D J D 3/2) contour in-cludes Δ(1232), etc. (b) D13 (L D 2, I D

1/2, J D 3/2) contour includes N(1520), etc.(c) S11 (L D 0, I D 1/2, J D 1/2) contourincludes N(1535), etc. (d) S31 (L D 0, I D3/2, J D 1/2) contour includes Δ(1620),etc. [204].

Page 500: Yorikiyo Nagashima - Archive

13.4 � (770) 475

Im AJ

Re AJ

Im AJ

Re AJ

1/2 1/2

a

b

cd

Figure 13.15 Depending on the elasticity η J being smaller or larger than 0.5 or/and the exis-tence of background amplitudes, there exist a variety of resonance behaviors.

13.4� (770)

Mass and Width � was the first resonance discovered in the ππ channel. Threecharged states are known

�C ! πC C π0 , �0 ! πC C π� , 10) �� ! π� C π0 (13.93)

which means the isospin of � is I D 1. If real ππ scattering is possible, its spin andparity could be determined in the same way as Δ(1232). π is an unstable particleand cannot be used as a real target, but there is a way to make one effectively. Dueto the Heisenberg uncertainty principle, a proton can emit and reabsorb pions asfollows:

p $ n C πC, p $ p C π0 (13.94)

The transition is an effect of strong interaction and occurs frequently. Consequent-ly, the proton can be considered as surrounded by a cloud of pions. which couldserve as the target. If we can shoot pions at the cloud, we can make ππ scatteringhappen. A necessary condition is that the time for which a π is virtually free is longcompared to the collision time. Referring to Fig. 13.16, this means the contribu-tion of diagram (b) must dominate. Using (q, μ), (p p , M ), (pn, M ) to denote thefour-momenta and masses of the target π, proton and neutron, respectively, thecondition for the pion to be free is

q2 D qα qα D μ2 (13.95)

10) Note there is no �0 ! π0C π0 mode. This is forbidden by isospin invariance and boson statistics.The I D 1 state made of two I D 1 states is antisymmetric, and 2π0 can only be symmetric.

Page 501: Yorikiyo Nagashima - Archive

476 13 Spectroscopy

= + + . . .+

π+ π+ π+ π+ π+ π+ π+

π π+ π+ π+p p p p

n n n nπ+

(a) (b) (c) (d)

π+

+

ρ+

Figure 13.16 Diagrams contributing to πN ! ππN reaction: π exchange (b) and othercontributions (c,d).

In a frame where jp p j D jp nj D p , assuming the masses of the proton and theneutron are the same,

Ep D En

q2 D (p p � pn)2 D (Ep � En)2 � (p p � p n)2 D 2p 2(1 � cos θ ) < 0(13.96)

q2 is negative, meaning it is not free but a virtual particle. The difference betweenthe mass of the free and the virtual particle δμ D

pμ2 � q2 indicates the degree

of virtuality. From the uncertainty principle, the time duration for which the targetpion is free may be measured by τfree � 1/δμ. q2 cannot be made positive. Then, tomake τfree large, i.e. δμ as small as possible, jq2jmust be small, or from Eq. (13.96)we have to make cos θ ! 1. Considering the requirement that the target π shouldbe well away from the nucleon, in other words, the space extension of the virtual πshould be much larger than the size of the proton R � 1/μ

jqj D jp p � p n j '

q2p 2(1� cos θ )� μ (13.97)

we see τfree � 1/δμ � 1/μ. The collision time τcoll is given by the time necessaryfor a particle to go through the target. Since the target looks thin because of theLorentz contraction, it is given by τfree � R/cγ � (1/μ)(Ein/M )�1, where Ein isthe energy of the incoming π. Therefore conditions for τcoll � τfree are given byEq. (13.97) plus

Ein M (13.98)

If they are satisfied we can consider the concept of the virtual target to be valid.11)

11) The objection may be raised that because ofthe Regge exchange dominance in the high-energy forward scattering (see Section 13.7)the exchanged particle may not be limitedto a single π. The reason justifying thedominance of diagram (b) in Fig. 13.16comes from the experimental observationthat

r �σ(π� p ! π�πC n)σ(π� p ! π� π0 n)

' 2 (13.99)

Assumption of isospin invariance gives

σ(π� πC ! π� πC)σ(π� π0 ! π� π0)

D 1 (13.100)

Since there is a factorp

2 differencebetween πC and π0 in the Clebsch–Gordancoefficients of πN coupling, the observedratio confirms the isospin invariance, hencethe dominance of the target pion.

Page 502: Yorikiyo Nagashima - Archive

13.4 � (770) 477

Therefore, when performing the experiment

πC p ! πC C π0 C p (13.101)

which includes two π’s in the final state, if we set the incident π energy sufficientlyhigh, and choose events with small recoil angle of the target proton, we are doingthe pseudoexperiment

πC C πtgt ! πC C π (13.102)

where πtgt is π’s emitted from the proton. Once the experimental conditions for ππscattering are settled, the determination of the mass and width of a ππ resonanceis straightforward. All we have to do is to plot the cross section as a function ofincident energy and measure the position and width of a peak. The observed values

Figure 13.17 Angular distributions in thedi-pion CM frame of the reaction πC p !�C p ! πC C π0 C p . (a) 640 <

ps < 680,

(b) 700 <p

s < 780, (c) 790 <p

s < 820.Atp

s D m� ' 740 MeV, the angular dis-

tribution fits with cos2 θ . The cross section isnot zero at θ D 90ı because of nonresonantcontributions. Note the absence of the cos θ 2

term below and above the resonance [88].

Page 503: Yorikiyo Nagashima - Archive

478 13 Spectroscopy

are

m(�) D 775 MeV

Γ D 146 MeV(13.103)

Spin In a similar manner to the case of Δ(1232), by choosing the z-axis along theincident πC direction, we can set Jz D L z C Sz D 0, where we have used thefact that the spin of π is 0. If a resonance is made with angular momentum J, theamplitude which the two π’s make is proportional to d J

00 D PJ in the center ofmass (CM) frame of the two π’s. PJ is a Legendre polynomial and is given by

I(θ ) � jP0j2 � constant J D 0

� jP1j2 � cos2 θ J D 1

� jP2j2 � (3 cos2 θ � 1)2 J D 2 (13.104)

When we plot the cross section as a function of the invariant mass m2ππ, the peak

position gives m2�. The angular distribution at the resonance is given in Fig. 13.17,

which fits well with cos2 θ and proves J P D 1�.

13.5Final State Interaction

13.5.1Dalitz Plot

In the previous section, we described a method to extract ππ scattering from areaction

π C N ! π C π C N (13.105)

There, the target nucleon played an auxiliary role in providing a virtual π target.A more general method to establish the existence of resonances is to make aninvariant mass distribution of an arbitrary number of particles extracted from mul-tiparticle final states. Using the invariant mass, we can investigate resonances notonly in the ππ channel but also in many π0s, πN channels and other combina-tions. In order to investigate the behavior of the matrix element, we need to knowevent distributions when it is a constant, namely the distribution of phase space.In general, when there are n particles, the scattering cross section is given by (seeSection 6.5.4)

dσ, dΓ / jh f jT jiij2dLI P S (13.106a)

dLI P S D1

m!(2π)4δ4

Pi �

nXf

p f

� nYf

d3 p f

(2π)32E f(13.106b)

Page 504: Yorikiyo Nagashima - Archive

13.5 Final State Interaction 479

where Pi is the total four-momentum in the initial state and m! is a correctionfactor for m identical particles.

Let us consider, as an example, a case with n D 3 like reaction Eq. (13.105).Integrating over three-momentum p3, the rest of the integral becomes proportionalto

p 21 p 2

2 d p1d p2dΩ1dΩ2

E1E2E3δ(E � E1 � E2 � E3) (13.107)

where pi D jp i j. In the CM frame of the three particles, using p3 D �(p1 C p 2),we have

E1 D

qp 2

1 C m21, E2 D

qp 2

2 C m22

E3 D

qp 2

1 C p 22 C m2

3 C 2p1 p2 cos �(13.108)

where � is the angle between p1 and p2. If the initial state is unpolarized, thecoordinate axis can be chosen in any direction, giving

RdΩ1 D 4π and dΩ2 D

2πd cos � . Using Eq. (13.108), variables can be changed from (p1, p2, cos � ) to(E1, E2, E3). Calculating the Jacobian of the transformation, we obtain

d p1d p2d cos � DE1E2E3

p 21 p 2

2dE1dE2dE3 (13.109)

Because of the δ-function, the integration over E3 is trivial, giving

dLI P S �Z

d3 p1d3 p2

E1E2E3δ(E � E1 � E2 � E3) / dE1dE2 (13.110)

Namely, the phase space of the three-body final states is flat on the E1–E2 plane.Therefore, the event rate dN is given by

dN � jh f jT jiij2dE1dE2 (13.111)

and the event number density on the E1–E2 plane is directly proportional to thesquare of the matrix element, which is known as the Dalitz plot. The boundary ofthe distribution is given by j cos � j 1 and using Eq. (13.108) it leads to

�1 E 2 � 2E(E1 C E2)C 2E1E2 C m2

1 C m22 � m2

3

2q

E 21 � m2

1

qE 2

2 � m22

1 (13.112)

Sometimes it is more convenient to use the invariant mass instead of the particleenergy. The invariant mass (m2

23) of particles 2 and 3 is given by

m223 D (p2 C p3)2 D (E � E1)2 � p2

1 D E 2 � 2E E1 C m21

) dm223 D �2E dE1

(13.113a)

Similarly

dm231 D �2E dE2 (13.113b)

Page 505: Yorikiyo Nagashima - Archive

480 13 Spectroscopy

We see the distribution on the E1–E2 plane is directly proportional to that on them2

23–m231 plane.

Figure 13.18 [38] shows an event distribution on the m2(K0π�)–m2(nπ�) plane

obtained by shooting a 3 GeV K beam on a neutron target (actually a deuteron tar-get) and extracting K

0π�n from the final state. One can clearly see concentrations

of events corresponding to K�(892) and Δ(1232). Projection of events on each co-ordinate axis shows a sharp peak on top of a smooth slowly varying hill. The hillcorresponds to the phase space volume and the sharp peak reflects the shape of thematrix element. It indicates that the matrix element M is proportional to

M /1

(E � ER )C i Γ /2(13.114)

and that the resonance appears as dominant intermediate states. Similarly, the in-variant mass distributions of πCπ0, πCπ� and πCπ�π0 obtained from the re-action πC C p ! 3π (or 4π)C p are plotted in Fig. 13.19 [12]. One immediatelysees �C(770) in Fig. 13.19a, �0(770) in (b) and η(548) and ω(782) in (c).

300

200

100

1001.01.52.0

(GeV/c2)2 M2(K0π–)

2.5 0.5

200

1.7

2.6

3.1

3.8

4.5

(GeV

/c2) 2

M2(n

0π–)

Figure 13.18 Dalitz plot of the reaction K� n ! K0π�n and its projection on each coordinateaxis. Two resonances K�(892) in K π and Δ(1232) in the πN channel are clearly seen [38].

Page 506: Yorikiyo Nagashima - Archive

13.5 Final State Interaction 481

Figure 13.19 Distributions of invariantmasses in the final state of πC p reac-tions. (a) πC C p ! πC C π0 C p(b) πC C p ! πC C πC C π� C p(c) πCC p ! πCCπCCπ�Cπ0C p . The

abscissa is the invariant mass (a) m(πCπ0),(b) m(πCπ�), (c) m(πCπ� π0) in units ofMeV. One can see �C in (a), �0 in (b) andη(550) and ω(783) in (c) [12].

Thus, by observing invariant mass distributions of multiparticle final states inπN and K N reactions, many resonances were discovered in the 1960s.

13.5.2K Meson

In the K meson decay

KC ! πC C πC C π� (13.115)

the final-state particles have a good symmetry in that all the particles are π’s ofmore or less the same mass. In such a case it is better to treat the three variablesE1, E2, E3 equally. Instead of orthogonal coordinates, one may plot them as a pointon the inner area of an equilateral triangle as in Fig. 13.20. If we use kinetic energyTi D Ei � mπ as variables,

T1 C T2 C T3 D mK � 3mπ D Q (13.116)

Since the sum of three perpendicular lines from a point P to the bases is equal tothe height CC0 and is a constant, the point P can represent an event in K ! 3π.We identify the lengths of three lines as T1, T2, T3 divided by Q so that the sumis normalized to 1. The boundary is constrained by j cos � j 1. For Ti � mπ ,

Page 507: Yorikiyo Nagashima - Archive

482 13 Spectroscopy

T3

T1T2

T3 /Q

(T1 - T2)/√3QA

A’ C

C’

B

B’

I

II

P

Figure 13.20 Symmetrical Dalitz plot for3-particle decays. Point P corresponds to anevent with T1 C T2 C T3 D Q, as the sumof the three perpendicular lines T1, T2, T3 tothe bases is equal to the height CC0, which isa constant. The abscissa measured from C is

equal to (T1 � T2)/p

3. For nonrelativistic par-ticles, the boundary is a circle (dashed line),but it is deformed to a shape close to a trian-gle as they become relativistic. For the shadedregions I, I I see the text.

it becomes a circle; this is almost the situation for K decay since the Q-value is75 MeV, which has to be shared among the three π’s.

Problem 13.9

Show that the boundary in the triangular Dalitz plot of Fig. 13.20 is a circle or atriangle with apexes at A,B,C for Ti � mπ or Ti mπ .

At the boundary, two of the momenta are parallel or antiparallel. On the linesAA0, BB0, CC0, the energy of the two particles is the same. Setting π1 D π2 D πC,the distribution should be symmetric for 1 $ 2 exchange, and hence we use onlythe right half of the plane. If we set the orbital angular momentum in the CMframe of πCπC as LC and relative orbital angular momentum of π� to the twoπC CM as L�, the spin of KC (J) is in the range

jLC � L�j J LC C L� (13.117)

Since LC is the orbital angular momentum of two identical spinless particles, itmust be even. We list possible values of J in Table 13.2. If J ¤ 0 one of LC, L� isnot zero. If LC � 2, and πCπC are relatively at rest, the angular momentum barri-er (see the following section) would diminish the value of the matrix element. Thishappens when the energy of π� is maximum, namely in the region I in Fig. 13.20.Similarly if L� > 0, the event rate in region II, where π� is at rest, would alsodecrease. The data shown in Fig. 13.21 [305] seem very uniform over the entireregion and we conclude the spin of KC is zero.

The three π final state has parity P3π(�1)LCCL� D �1 and was called the τ de-

cay mode. There is another decay channel, KC ! πCπ0, which was called theθ mode. The parity of θ is positive, hence τ and θ must be different particles ifparity is conserved. However, τ and θ had the same mass and lifetime and weremost likely the same particle. The two contradictory facts were called the τ–θ puz-

Page 508: Yorikiyo Nagashima - Archive

13.5 Final State Interaction 483

Table 13.2 Possible values of J P for given (LC , L�).

L� LC D 0 LC D 2

0 0� 2�

1 1C 1C, 2C, 3C

2 2� 0�, 1�, 2�, 3�, 4�

T 3/Q

(T1 T2)/√3Q0.50

0.5

0

1.0A

Figure 13.21 Dalitz plot of KC ! πC CπCCπ�. The ordinate is T3 and the abscissais (T1 � T2)/

p3 normalized to Q. The sol-

id line describes a circle and the dashed line

indicates the actual boundary. From the uni-formity of the distribution, the spin of the Kmeson was determined to be 0 [305].

zle. It led Lee and Yang to question the validity of parity invariance in the weakinteraction. We now know that parity is 100% broken in the weak interaction andhence the parity of the K meson cannot be determined with reactions in which theweak force is active. The parity of K was eventually determined by kaon capture byhelium, forming a hypernucleus in the reaction

K� CHe4 !Λ H4 C π0 (13.118)

In the hypernucleus one neutron is replaced with a Λ0 particle having m D

1116 MeV and strangeness S D �1. The measurements showed that both He4 andΛH4 have J D 0. The K meson is expected to be captured in the S (L D 0) state andthe parity of K is determined to be negative, assuming the parity of Λ relative to the

Page 509: Yorikiyo Nagashima - Archive

484 13 Spectroscopy

proton is positive. The indefiniteness originates from the fact that the strangenessis conserved in the strong interaction (see argument in Section 9.2.1).

13.5.3Angular Momentum Barrier

When the range R of a force is finite, classically there should be no contributionsfrom partial waves with l > lmax D p R, where p is the particle momentum or wavenumber. In quantum mechanics, however, all the partial waves have componentsat r < R and they contribute. This can be seen as follows. A plane wave can beexpanded as

e i p x DXlD0

i l(2l C 1)Pl (cos θ ) j l(p r) (13.119)

This shows that a component of a partial wave in the plane wave is proportionalto regular solutions of the spherical Bessel function j l(p r). Since limp r!0 j l(p r) �(p r)l , at low energies where p � R�1, the lth component of the partial wave inthe potential is proportional to (p R)l . It can be rephrased as follows. If a particlehas angular momentum l, a centrifugal potential / l(l C 1)/r2 is created, whichacts as a repulsive force. When it is superimposed on the short-range attractiveforce, the potential looks like that in Fig. 13.22. An excited bound state level can beformed far below the peak of the barrier but with E > 0. It is an unstable state,because the wave can penetrate the barrier to leak outside of it (tunneling effect)and thus has finite probability of decaying. With the above reasoning, the decaymatrix element becomes proportional to p l , making the decay rate Γ / p 2lC1.The mechanism is effective for incoming waves (scattering or absorption) as wellas outgoing waves, and makes the scattering amplitude proportional to p 2l . ThenEq. (13.62c) tells us that δ l � p f l � p 2lC1. Thus, in the low-energy region (p �R�1), components with high angular momentum are highly damped, which isreferred to as the angular momentum barrier.

V(r)Angular Momentum

Barrier l(l+1)r2

Strong Attractive Potential

R

rUnstable state

Figure 13.22 The quantum tunneling effect allows a particle to leak in or out through the angu-lar momentum barrier with a damping factor� (p R)l .

Page 510: Yorikiyo Nagashima - Archive

13.5 Final State Interaction 485

13.5.4ω Meson

The ω meson was discovered in the reaction [268]

p C p ! πC C πC C π0 C π� C π� (13.120)

as a relatively narrow resonance. We also have seen it in Fig. 13.19c. It was seenonly in neutral combinations and not in charged ones. It fixes the isospin I D 0.The ω meson has mass and width

mω D 782 MeV , Γω D 8.5 MeV (13.121)

This implies the decay is strong and that the G parity is�1. I D 0 implies C D �1.Since the isospin is a good quantum number in the strong interaction, it constrainsthe form of the wave function. As π is an isovector, the only way to construct an I D0 combination from three pions is I 1 � I 2� I 3, which is totally antisymmetric. ThenBose statistics fixes the spatial part of the wave function to be totally antisymmetric,too. This property can be used to determine the spin and parity of the ω mesonin the Dalitz plot of the 3π final state. We consider constructing matrix elementsusing the tensor method as we did for π0 in Section 13.2. Materials we can use areE1, E2, E3 and p1, p2, p3. We consider three cases.

JP D 0�: A totally antisymmetric scalar that can be made of three momenta isp1 � p2 � p3, which vanishes identically because p1 D �(p2 C p3). Therefore, thedistribution must be proportional to

f / (E1 � E2)(E2 � E3)(E3 � E1) (13.122)

The event density on the Dalitz plot must vanish when two particles have the sameenergy, which are the lines AA0, BB0 and CC0 in Fig. 13.20, and is depicted asa three-dimensional plot in Fig. 13.23a. Note the difference from K and η decay(Problem 13.2). For them there are no constraints from the isospin. For ω to havea flat distribution, it has to be an isovector [398].

JP D 1�: Because 3π has negative parity, we need an axial vector here. Sincep1 C p 2 C p3 D 0, we have only one axial vector

f / q D p1 � p2 D p2 � p3 D p3 � p1 (13.123)

Figure 13.23 Event distributions in Dalitz plot for I D 0 and J P D 0�, 1�, 1C [354, 398].

Page 511: Yorikiyo Nagashima - Archive

486 13 Spectroscopy

which is already totally antisymmetric. At the boundary of the Dalitz plot plane,two momenta are parallel or antiparallel, hence the density vanishes as depicted inFig. 13.23b.

JP D 1C: A completely antisymmetric vector can be reconstructed as

f / p1(E2 � E3)C p 2(E3 � E1)C p3(E1 � E2) (13.124)

At the center E1 D E2 D E3, hence f D 0. For E2 D E3, the first term vanishesand the second and third can vanish if p1 D maximum, because then p2 D p3 D

�p3/2. The depletion region is shown in Fig. 13.23c. We can go on to considerhigher spins but we stop here. The interested reader can refer to [369, 398].

The results can be compared with experimental data (Fig. 13.24). Since all thedistributions considered here have sixfold symmetry, it is convenient to plot allthe points in one sector. Figure 13.24b shows such a plot and is compared withtheoretical predictions. It is clear ω has J P D 1�.

Figure 13.24 (a) Dalitz plot for the decayω0 ! πC C π0 C π� [12, 13]. (b) Numberof events in one sextant of the Dalitz plot. Thecurves correspond to the theoretical distribu-

tion for a vector meson (solid line), an axialvector (dashed line) and a pseudoscalar (dot-dashed line) [354].

Page 512: Yorikiyo Nagashima - Archive

13.6 Low-Energy Nuclear Force 487

13.6Low-Energy Nuclear Force

13.6.1Spin–Isospin Exchange Force

The scattering amplitude for N N interaction with exchange of a spin 0 isoscalarmeson was given in Eq. (6.157):

T f i D (2π)4δ4(p1Cp2� p3� p4)u(p3)Γ u(p1)g2

q2 � μ2u(p4)Γ u(p2) (13.125)

where μ is the mass of the exchanged meson. Here we investigate the exchange of aπ meson, which is a pseudoscalar (Γ D i γ 5) in space-time and a vector in isospinspace. We have seen in Section 13.3.1 that the modification caused by inclusionof the isospin is only multiplication by an additional factor τ1 � τ2, but the calcu-lation of the pseudoscalar exchange effect needs some algebra. Using Eq. (4.63),u(p 0)i γ 5u(p ) becomes in the nonrelativistic limit

us(p 0)i γ 5τu r(p ) D i(τ) f i � †s

hp(p 0 � σ)(p � σ) �

p(p 0 � σ)(p � σ)

i�r

' i(τ) f i � †s σ � (p � p 0)�r D i(τ)i f � †

s σ � q�r (13.126)

where (τ) f i is a matrix element in isospin space with f, i denoting the proton orneutron state. The equation shows that γ 5 interaction is effectively a derivativecoupling, or that the pion couples with the nucleon predominantly in the P waveas compared to S wave, which is generally dominant in low-energy reactions. Thisis due to the negative parity of the pion, because in the reaction N ! N C πparity and angular momentum conservation forbids the pion to be produced inthe S state. N N interaction is another matter. The scattering amplitude contains apropagator due to exchange of the meson. It has the form

T f i �i

(p1 � p3)2 � μ2D

im2

1 C m22 � 2(E1E2 � p1 p2 cos θ )� μ2

(13.127)

which contains the S wave as well as other waves.12)

From now on we drop the spin and isospin wave function for simplicity andleave the scattering amplitude in matrix form in the spin and isospin space. Thescattering matrix for the pseudoscalar exchange is expressed as

T f i D (2π)4δ4(p1 C p2 � p3 � p4)g2 (σ1 � q)(σ2 � q)q2 � μ2 (τ1 � τ2) (13.129)

12) From Eq. (9.26), the Jth partial wave amplitude can be extracted using

SJ (E )� 12i

D

Z C1

�1d cos θ f (θ )PJ (θ ) (13.128)

Page 513: Yorikiyo Nagashima - Archive

488 13 Spectroscopy

where the subscripts 1, 2 were attached to distinguish the particles 1 and 2. To gaininsight into the propreties of the potential, we rewrite the matrix element as

hp3ji gZ

d3x ψ(x )γ 5τψ(x )π(x )jp1, qi (13.130a)

' i gZ

d3x e�i(p1�p3)�x σ � (p1 � p3)h0jτ � π(x )jqi

D �e�i(E1�E3)tx gZ

d3x e�i(p1�p3)�x (σ � r)h0jτ � π(x )jqi (13.130b)

where we used a charge-independent interaction and ' means a nonrelativisticapproximation. We made a partial integration to obtain the last equality. The ap-proximation is equivalent to replacing the Hamiltonian by

HINT D i gZ

d3x ψ(x )γ 5τψ(x )π(x )! �gX

k

ψ†(x )σk τψ(x ) � @k π(x )

(13.131)

where ψ†(x ), ψ(x ) in the last equation are two-component fields. This is called thestatic approximation. Then the scattering matrix is given by

S f i � δ f i D �i T f i D hp3 p4j(�i)2Z

d4x HINT(x )Z

d4 y HINT(y )jp1p3i

D �g2Z

d4x d4 y e�i(p1�p3)�x�i(p2�p4)�y

� (σ1 � rx )(σ2 � ry )τ1ih0jT [π i (x )π j (y )]j0iτ2 j

D �g2Z

d4x d4 y e�i(p1�p3)�x�i(p2�p4)�y

� (σ1 � rx )(σ2 � ry )τ1i

�Zd4q

(2π)4 e�i q�(x�y ) i δ i j

q2 � μ2

�τ2 j

(13.132)

Separating the CM system R D (x C y )/2 from the relative motion r D x � y andintegrating over R gives a δ-function for momentum conservation:

T f i D (2π)4δ4(p1 C p2 � p3 � p4)Z

d3 r e�i q�rˇˇ

rDx�y

�(τ1 � τ2)(σ1 � rx )(σ2 � ry )

�1

(2π)3

Zd3qei q�(x�y ) �g2

q2 C μ2

�(13.133)

We have used E1 ' E3 ' m in the denominator of the pion propagator. Then thecontent of the curly brackets is the static Yukawa potential [D �g2e�μ r/(4πr)]. The

Page 514: Yorikiyo Nagashima - Archive

13.6 Low-Energy Nuclear Force 489

nonrelativistic potential in space coordinates is given as the content of the squarebrackets divided by (2m)2 to correct for the relativistic normalization:

V(r) D (τ1 � τ2)(σ1 � rx )(σ2 � ry )��

f 2

4πμ2

e�μ r

r

�ˇˇ

rDx�y(13.134a)

f Dμ

2mg (13.134b)

Performing the differentiation, we obtain

V(r) Df 2

4πμ(τ1 � τ2)

�13

(σ1 � σ2)C�

1s2C

1sC

13

�S12

�Y(s) (13.135a)

S12 D3(σ1 � s)(σ2 � s)

s2 � (σ1 � σ2) , Y(s) De�s

s(13.135b)

s D μ r, s D jsj , r D x � y (13.135c)

σ1 � σ2 is called a spin exchange force13) and S12 a tensor force. S12 gives a repulsiveforce along the spin direction and attractive force perpendicular to it (or vice versadepending on the overall sign), and averages to zero if integrated over all angles. Itvanishes for the S wave. therefore we will concentrate on the spin exchange force(the first term).

Problem 13.10

Derive Eq. (13.135) from Eq. (13.134).

We have seen in Section 13.3.1 that τ1 � τ2 gives the isospin exchange force

τ1 � τ2

2D

(12 I D 1

� 32 I D 0

(13.136)

The same formula holds for the spin functions and gives a positive sign for S D 1.The deuteron, a bound state of (pn), has I D 0, S D 1 and only the even orbitalangular momentum L D 0, 2, . . . states are allowed, which is indeed the case. Ithas a dominant S (L D 0) state with a small mixture of D (L D 2) state. For I D 0,τ1 � τ2 < 0 and the sign of the potential is reversed. Namely, the scalar meson ex-change gives a repulsive force and cannot make a bound state if the isospin degreeof freedom is taken into account. But if the exchanged meson is a pseudoscalar,Eq. (13.135) gives an attractive potential for the I D 0, S D 1 state and can explainthe deuteron bound state.

The appearance of the spin exchange force (� σ1 � σ2) is due to the exchangeof the angular momentum carried by a P-wave meson. Similarly, the appearanceof the isospin exchange force (� τ1 � τ2) is due to the exchange of isospin carriedby the pion. Whenever the force carrier has a quantum number that is describedby S U(N ), N � 2 a similar exchange force term appears. The color degree offreedom carried by the gluon induces a color exchange force in S U(3), which willbe discussed in detail in the next chapter.

13) For the electromagnetic force, it is a magnetic dipole–dipole interaction.

Page 515: Yorikiyo Nagashima - Archive

490 13 Spectroscopy

13.6.2Effective Range

A highlight of the Yukawa theory was the prediction of the π meson with mπ '

200 me . It was based on the knowledge that the nuclear force is short range� 10�13 cm. How do we know the range of a force? In the low-energy limitδ l / k2lC1 (see Section 13.5.3), where k is the CM momentum. The S-wave phaseshift δ0 being � k, we have in the low-energy limit

k cot δ ' �1aC

12

r0k2 C O(k3) (13.137)

which is referred to as the effective range approximation (see Problem 13.12).Then the scattering amplitude, hence the cross section, can be calculated usingEqs. (13.78) and (13.53). The formula (13.137) is general in the sense that it is in-sensitive to the potential shape as long as it falls faster than 1/r . “a” is referred toas the scattering length and is a measure of the target size (see Problem 13.11). Itis positive (negative) for an attractive (repulsive) force. r0 is referred to as the ef-fective range and is a measure of the force range. Equation (13.137) means that bymeasuring low energy scattering data, a and r0 can be obtained as well as the signof a.

Problem 13.11

Show that in the limit k ! 0, σTOT D 4πa2.

Problem 13.12

Consider S-wave scattering due to a spherical potential with finite range:

V(r) D 0 (r > R) , V(r) ¤ 0 (r < R) (13.138)

Setting E D k2/2m and the L D 0 radial wave function as S(r), we define u(r) Dr S(r)jr<R, v (r) D r S(r)jr>R and denote the solution at k D 0 as u0(r), v0(r). Provethe following equations:

v0(r) D 1 �ra

(13.139)

k cot δ D �1aC

12

r0k2 C O(k3) (13.140)

r0 D 2Z R

0dr�v2

0 � u20

(13.141)

Using the potential Eq. (13.135), the properties of the deuterons and low-energyN N scattering behavior were reproduced well by setting

f 2

4π' 0.08 or

g2

4π' 14 (13.142)

Page 516: Yorikiyo Nagashima - Archive

13.7 High-Energy Scattering 491

which was the original success of the Yukawa theory. However, the coupling con-stant is� 1000 times bigger than the electromagnetic interaction and is too big forperturbational treatment. This was the primary difficulty in constructing the theo-ry of strong interactions. It was only after the quark model was established that amathematical framework of the strong interaction, i.e. QCD, was developed.

13.7High-Energy Scattering

13.7.1Black Sphere Model

As we have seen in Fig. 13.8, πN cross sections in the low-energy region (T .1 GeV) show a rich structure with peaks and dips. At high energies, however, manyresonances overlap and begin to show a smooth structure. Furthermore, contribu-tions from inelastic channels dominate and the fraction of elastic scattering crosssection diminishes (Fig. 13.25). Above 10 GeV, the π p total cross section is almostflat, rising slowly (� log2 E ) above 100 GeV (Fig. 13.26). The cross section in thisregion (10 < E < 100 GeV) is� 25 mb. If it is replaced with the geometrical area ofacircle πR2, R D 0.9 � 10�13 cm and reproduces the strong (nuclear) force range.

Besides, as can be seen from Fig. 13.26, cross sections of the particle and antipar-ticle (p and p, π� p and πC p , K� p and KC p ) approach each other. There is an oldprediction that their cross sections should agree at high energies, which is known

Figure 13.25 πC p total and elastic cross sections and their behavior at high energy [311].

Page 517: Yorikiyo Nagashima - Archive

492 13 Spectroscopy

Figure 13.26 High-energy behavior of cross sections of various particles. The rise fits a ln2 Edependence [313].

as Pomeranchuk’s theorem [302, 325]. Combination with isospin invariance gives

σ(π�n) D σ(πC p ) D σ(π� p ) (13.143)

Qualitatively we may expect that the black sphere approximation reasonably re-produces the high-energy behavior because inelasticity grows as the energy goeshigher, becoming totally absorptive in the high-energy limit. The argument goes asfollows. Ignoring spin effects, the elastic scattering amplitude can be expanded interms of partial waves [Eqs. (13.53c), (13.91)]:

f (s, θ ) DX

(2l C 1) f l (s)Pl(cos θ ) (13.144a)

f l Dη l e i δ l � 1

2 i p(13.144b)

If there is no absorption η l D 1 and for total absorption η l D 0. In the black spheremodel, total absorption in all partial waves is assumed (η l D 0) and only sums upto l lmax are taken. lmax � p R, where R is the force range, is introduced from asemiclassical argument. Then the forward scattering amplitude f (θ D 0) can be

Page 518: Yorikiyo Nagashima - Archive

13.7 High-Energy Scattering 493

obtained by inserting Pl (cos θ D 1) D 1

f (s, 0) Di

2p

lmaxXl

(2l C 1) �i2

p R2 (13.145)

Application of the optical theorem [Eq. (9.56)] gives

σTOT D4πp

Im f (s, 0) D 2πR2 (13.146)

Equation (13.146) reproduces the approximate value and constancy of the observedcross section. However, quantum mechanically, the force range is not cut offabruptly at a certain value R but extends beyond its limit. From a purely analyticalargument, lmax �

ps ln s can be deduced, which sets an upper limit to the total

cross section:

σTOT < constant � (ln s)2 (13.147)

This is referred to as the Froissart bound [150]. Observed data abovep

s > 100 GeVseem to saturate and satisfy this limit (Fig. 13.26).

So qualitatively the black sphere model seems to work. However, it cannot repro-duce the angular distributions in high-energy elastic scattering. Using approxima-tions valid at small angles θ � 0

t D �4p 2 sin2 θ /2 �(p θ )2

Pl (cos θ ) J0((l C 1/2)θ )(13.148)

we can calculate

f (s, 0) Di

2p

p RXlD0

(2l C 1)Pl (cos θ ) i

2p

Z p R

02x dx J0(x θ )

i Rθ

J1(p R θ ) Di p Rp�t

J1(Rp�t)

(13.149)

This is a typical diffraction pattern with dips and peaks.14) Experimental data give,irrespective of particle species,

dσd t

�dσd t

�tD0

eatCbt2(13.150)

where a and b are constants dependent on particle species. This is different fromthe angular distribution given by the black sphere model.

14) This is a striped pattern simlar to the Fraunhofer diffraction that light waves make through apinhole.

Page 519: Yorikiyo Nagashima - Archive

494 13 Spectroscopy

13.7.2Regge Trajectory*

To understand the observed angular distributions, let us examine the quantumfield theoretical possibilities. Interactions (scatterings) are, in general, mediated byexchange of particles. Let the spin of the exchanged t-channel particle be J. Thenthe high-energy scattering amplitude mediated by a particle exchange has the form

f (s, t) � g2J

(�s) J

t � m2J

s !1 (13.151)

It is smooth as a function of s and has the property of becoming small as jtj grows.g2

J can be considered as the square of the coupling constant. Indeed, the one-pionexchange model is known to reproduce high-energy small-angle scattering behav-ior qualitatively. We expand the notion and consider that the scattering amplitudecan be expressed as a sum of particle exchange amplitudes of various kinds. Asthese particles give poles in the complex t-plane, a dominant contribution to thescattering amplitude with given t naturally comes from the nearest pole. For s-chan-nel scattering, t is negative and as far as small-angle scattering (t 0) is concerned,the contribution of a particle with the smallest mass, namely that of π, should dom-inate. By taking into account other heavier particles, we are led to the idea of writingthe scattering amplitude as a sum of poles generated by t-channel particles and res-onances. Such formalism is naturally realized in the crossed t-channel amplitudeintroduced in Section 6.4.2. As an example, let us consider the s-channel π p ampli-tude starting from the t-channel ππ scattering. We expand it in partial waves in thet channel. Note, however, the crossing symmetry can be applied only for Lorentz-invariant amplitudes, so we have to use A(s, t) D �8π

ps f (s, t) instead of f (s, t)

[see Eq. (13.53b)]. The partial wave expansion formula in the t channel is given by

A(s, t) DX

(2l C 1)A l(t)Pl(cos θt ) (13.152a)

cos θt D �1�2s

t � 4μ2 (13.152b)

where μ is the π mass. In order to convert this amplitude to the s-channel π pscattering amplitude, all we need to do is to analytically continue from the t > μ2,s < 0 region to s > 0, t < 0. However, when cos θt is considered as a complexvariable, the convergence area of the Legendre function Pl (cos θt ) is limited to in-side an ellipse tangential to the nearest pole and with foci at cos θt D ˙1. In orderto circumvent the difficulty, Regge regarded the amplitude as a function of com-plex variable l. In this case, it can be rewritten in a form known as the Watson–Sommerfeld transformation

A(s, t) Di2

ZC1

d l(2l C 1)A(l, t)Pl (� cos θt )

sin π l(13.153)

The contour C1 is described in 13.27a. The formula is equivalent to Eqs. (13.152),because sin π l has poles at l D 0,˙1,˙2, � � � with residue (�1)l/π. Then we de-form the contour as C2 in Fig. 13.27b. The properties of the Legendre function

Page 520: Yorikiyo Nagashima - Archive

13.7 High-Energy Scattering 495

C1

l 0 1 2 3 4

Im l

Re l

Im l

Re l

Re l 1/2C2

(a) (b)

Figure 13.27 (a) Integration contour C1 in the Watson–Sommerfeld transformation. (b) De-formed integration contour C2 consists of a few circles around Regge poles and a line along thedashed line Reflg D �1/2.

Pl (cos θt ) are well known. It is convergent as long as the contour C2 stays to theright of Reflg D �1/2. The problem is the behavior of A(l, t), but Regge showedthat for nonrelativistic potentials (superposition of Yukawa potentials) it has atmost a few poles, which are referred to as Regge poles. Writing the Regge poleterms as

A(l, t) �n(t)

l � αn(t)(13.154)

Eq. (13.153) can be rewritten as

A(s, t) Di2

ZC2

d l(2l C 1)A(l, t)Pl(� cos θt )

sin π l

� πX

n

(2αn(t)C 1)n(t)sin παn(t)

Pαn (� cos θt )(13.155)

As l, αn are complex numbers, convergence is not lost if we go to the region t < 0,s > 0, and thus analytic continuation is possible.

Properties of Regge poles for potential scattering are well studied. As the energyincreases, the pole position of αn(t) on the Reflg–Imflg plane moves with t asdepicted in Fig. 13.28a; this is called a Regge trajectory. Starting from Reflg D�1/2, the Regge pole moves to the right on the real axis, departs from it at somevalue of t (at the threshold t D 4μ2 in the ππ channel where the ππ scatteringbecomes physical), goes to the upper right and eventually changes direction to goback to Reflg D �1/2. When αn(t) is an integer on the real axis, it represents abound state. After passing the threshold, when it has integer real part and a smallimaginary part, it represents a resonance. To see this we calculate the contributionof the Regge pole to the partial-wave amplitude. Using the relation

12

Z 1

�1dz Pl (z)Pα(�z) D

sin παπ(α � l)(αC l C 1)

(13.156)

Page 521: Yorikiyo Nagashima - Archive

496 13 Spectroscopy

3

Re E

-1

1

2

6

5

4

Re l

-1 1 2 3

Re l

Im l

Threshold for reactions

Below threshold

Positive Energy Region

(a) (b)

Figure 13.28 A Regge pole moves in the com-plex l plane as the energy changes; its trackis called a Regge trajectory. A pole on the re-al axis corresponds to a bound state while

Imflg > 0 corresponds to a resonance. (a)The behavior of a Regge trajectory in a nonrel-ativistic potential. (b) A nonrelativistic Chew–Frautschi plot.

we can obtain

A l(t) D12

Z 1

�1d cos θ A(t, cos θ )Pl(cos θ )

D(2αn C 1)n(t)

(αn(t)� l)(αn(t)C l C 1)

(13.157)

If the Regge pole is located near the real axis, we can expand the above equation inthe neighborhood of Refαn(tr )g D l and the partial amplitude becomes

A l n(tr )

(t � tr )dRefαn(t)g

d t

ˇˇ

tDtr

C iImfαn(tr )g(13.158)

which is a well-known resonance formula.A diagram in which Reflg is displayed as a function of energy is called a Chew–

Frautschi plot. For potential scattering (see Fig. 13.28b), Reflg diminishes at highenergy. The field theoretically extended version or phenomenologically obtainedplot seems to extend linearly without limit (Fig. 13.29 [98]).

In applications to field theory, we need to take into account exchange forces. Forinstance, when the t channel is ππ, the bosonic nature of π together with isospinindependence restricts l to either odd or even numbers. In other words, we have toreplace the complex Legendre polynomial by

Pαn (cos θ )!12

[Pαn (cos θ )˙ Pαn (� cos θ )] (13.159)

For the Regge trajectory including 1�� vectors such as �, ω, φ, the (�) sign hasto be adopted and for 0CC, 2CC mesons including f0, a2, the (C) sign has to

Page 522: Yorikiyo Nagashima - Archive

13.7 High-Energy Scattering 497

5Re α(t)

4

3

2

1

0 1

–1dip

t = (Mass)2 GeV2

2 3 4 5

observed, spin known

αp=.44 + .93tT(2

200)

S(1930)

g(1670)

f(1260)

B(1235)

π A(1640)

π(140)

φ(140)K (892)

Kω (1

420)

f (1514)

A 2(1

300)

ω(784)

p(765)

observed, spin unknown

points from π–p > π°n data

6

Figure 13.29 A Chew–Frautschi plot. Meson resonances are shown [98]. The data in the nega-tive t region can be obtained by fitting the data to the t-channel particle exchange amplitudes.Circles were obtained by fitting to π� p ! π0 n data should lie on the extended � trajectory.

be chosen. Therefore for one Regge trajectory, resonances appear at every otherinteger with Δ l D 2. For a parity-degenerate trajectory, Δ l D 1.

Now let us see how the s channel amplitude behaves if analytically continuedfrom the t channel. We are interested in π p scattering for values s ! 1, t < 0.As Pl(� cos θ ) ! s l in the limit s ! 1, a Regge pole positioned at the extremeright (the highest trajectory in the Chew–Frautschi plot) gives the dominant contri-bution. Therefore, instead of particles with discrete integer spin, we are talking ofa scattering amplitude due to exchange of Regge poles with continuous l value:

A(s, t) �πX

n

(αn(t)C 1/2)n(t)�

τ C e�i παn(t )

sin παn(t)

�Pαn (� cos θt )

)�X

n

�ss0

�αn (t )

γn(t)�

τ C e�i παn(t )

sin παn(t)

�(13.160)

where s0 is a parameter having the same dimension as s and αn(t) denotes a Reggepole representing bound states or resonances in the t channel. The numerator inthe square brackets is a generalization of Eq. (13.159) and, depending on τ D ˙, itis called an even (odd) trajectory. Using dσ/d t � jA(s, t)j2/s2, we obtain

dσd tD

�dσd t

�tD0

e2[α(t )�1] ln(s/s0) (13.161a)�dσd t

�tD0D (1C r2)

(σTOT)2

16π, r D

RefAgImfAg

(13.161b)

Page 523: Yorikiyo Nagashima - Archive

498 13 Spectroscopy

Since we can write α(t) α(0)C α0 t in the vicinity of t 0, we have reproducedthe observed t dependence Eq. (13.150).

In order to have an approximately constant total cross section (σTOT � �ImA(t D0)/s), A(s, t) � s is necessary, which, in turn, means αP(0) D 1C �, � ' 0. If thisRegge trajectory has the same quantum number as the vacuum, the particle andantiparticle share a common dominant term, leading to Pomeranchuk’s theorem.Therefore, the trajectory αP is called the Pomeranchuk trajectory or Pomeron.

A t-channel Regge trajectory exchanged in a charge exchange scattering such asπ� p ! π0n has quantum number I D 1. It becomes the � resonance at t D m2

and l D 1 [i.e. α�(m2�) D 1], and is called the � trajectory. It crosses the point where

α�(0) D 1/2 (see Fig. 13.29), and makes the cross section behave as s�1.Other trajectories corresponding to mesons having different quantum numbers,

for example, ω (I D 0, J P D 1�) and K�(I D 1/2, J P D 1�) follow more or lessthe same path.

Crossing between s $ u converts one member to an antiparticle, but its effecton the Regge trajectory is simply an exchange cos θt $ � cos θt and it changes itonly by a multiplicative factor τ, as can be seen from Eq. (13.160). Consequently,by taking the difference of the cross sections of the particle and antiparticle [ forinstance, σ(π� p ) � σ(πC p )], we can extract the Regge trajectory having τ D �(here � trajectory) and by taking the sum of the two we can separate that with τ D C(Pomeron, f0, etc.). If we assume the scattering amplitude takes the form given byEq. (13.151), the residue of a Regge pole exchanged in the reaction ACB! ACBcan be written as a product γn ! γRAAγRBB (factorization). Here, γRAA, γRBB arequantities proportional to the coupling constant of the Regge trajectory with theparticle A or B, and can be related by S U(3) symmetry, as explained in the nextchapter. Thus all hadronic total cross sections may be expressed by a small numberof parametrized Regge poles. Furthermore, in the t region below threshold, thephase of the Regge trajectories depends only on the factor

S D �τ C e�i πα(t )

sin πα(t)(13.162)

and we have

Re[ f (s, 0)] D tanhπ

2α(0)

i� Imf f (s, 0)g τ D � (13.163a)

D � cothπ

2α(0)

i� Imf f (s, 0)g τ D C (13.163b)

Using parameters obtained by fitting the formula to the data, we can predict the re-al part of the forward scattering amplitude. As trajectories other than the Pomeroncontribute to the real part, and dominant trajectories (� and f0, etc.) among themhave α(0) D 1/2, we expect the real part to decrease as� s�0.5 at high energies. Ex-perimentally, the real part can be determined by measuring the interference effectof the strong scattering amplitude with the Coulomb scattering; the result confirmsour prediction [45, 46].

Page 524: Yorikiyo Nagashima - Archive

13.7 High-Energy Scattering 499

102

PROTON-PROTON ELASTIC SCATTERING

CLYDE

⎧⎨⎩

MOMENTA (GeV/c)

(1966) 30, 50, 70, 71

(1967) 14.2

(1968) I 71, 101, 121

(1968) II 19.2

(1968) 3.0

(1971) 100, 12.0, 14.2, 24.0

(1972) 1500

ALLABY et al.

ALLABY et al.

ACHGT

ANKENBRANDT et al.3

5

10

12

14

19

241500

FOUR-MOMENTUM TRANSFER SQUARED, |t| (GeV2)86420

101

100

10–1

10–2

10–3

10–4

dσ/

dt

(mb

/GeV

2 )

10–5

10–6

10–7

Figure 13.30 p p elastic cross sections as a function of jtj for a variety of incident momen-ta [183]. At high energy, a diffraction peak appears, which shrinks as the energy increases.

Phenomenologically, the behavior of the high-energy total cross sections can bewell reproduced by using two effective Regge poles, the Pomeron and Reggeon[312]. The latter is an effective Regge trajectory with �, ω, f , etc. combined:15)

σab D σab D X ab s� C Y ab s�η

� D αP(0) � 1 D 0.093˙ 0.002

η D 1 � αR(0) D 0.560˙ 0.017

(13.164)

where s is in GeV2. X ab and Y ab are parameters that depend on particle species.The first term represents the Pomeron and the second the Reggeon.

15) [312] gives three terms, separating the Reggeon contribution into C D C1 and �1 parts, thusreproducing the small differences between the particle (σab ) and antiparticle (σab) cross sections,but with somewhat poor fitting the difference can be neglected. Recently, owing to accumulation ofmore precise data, the Pomeron contribution has been modified to X s� ! ZCB log2(s/s0 )2 [311].

Page 525: Yorikiyo Nagashima - Archive

500 13 Spectroscopy

According to Eq. (13.161), the coefficient of t in the exponent grows with s. Thismeans that the forward scattering peak shrinks as s becomes larger. This is ob-served in p p scattering (Fig. 13.30), but in many other channels it is not the case.This means either s is not large enough to be in the asymptotic region or a simpleapproximation with a few Regge poles is not sufficient to reproduce the data.

In summary, the Regge model is very attractive as a way to explain the high-energy behavior and is frequently used in representing the asymptotic behavior ofthe scattering cross sections. In the language of the Standard Model, the Pomeronis not due to a specific resonance but a dynamical composite effect of multiplegluons. In the Regge representations, a scattering amplitude in terms of t-channellow-energy resonances describes the high-energy part of the s channel, and theconverse is also true. Thus, there is a duality in the scattering amplitude. Extendingthe idea, each scattering amplitude in the crossed channel may be totally expressedas a sum over resonances in s, t, u channels and indeed such a model was created(Veneziano model [378]). The duality in which one scattering amplitude can beconsidered as both an s and a t channel amplitude induced a (hadronic) stringmodel [288, 329, 358], which, in turn, triggered the modern string models. In thissense, the analysis with Regge trajectories may be said to be an important turningpoint in the history of particle physics.

Page 526: Yorikiyo Nagashima - Archive

501

14The Quark Model

Nucleons and pions have been thought of as elementary particles for a long time. Ittook over 30 years of trial and error to realize they are composites made of quarks.During the process, there were two turning points. One was the discovery of thestrange particles, which began in the late 1940s, and the other that of charmedparticles in 1974. By the time the charmed particles were discovered, particle tax-onomy was fairly advanced and the significance of the discovery was immediatelyrecognized, creating a turbulent but transient transformation of the particle viewreferred to as “the November revolution”. On the other hand, the characteristics ofthe strange particles have been a central riddle, something researchers had to pur-sue for a long time. In retrospect, when the characteristics of the strange particleswere understood, the essence of the quark model was already in hand. The processwas slow and did not create a fever like that of the charmed particles, but was farmore important in its contribution to the progress of particle physics.

14.1SU(3) Symmetry

Historically, the discovery of strange particles added to isospin another conservedinternal quantum number, the strangeness S. Just as isospin (S U(2) symmetry)was introduced when the equality of both the mass and the properties of the pro-ton and neutron was noticed, the near equality (m p ' mn ' mΛ) of three basicelements of the Sakata model [339] led to S U(3) symmetry, known as IOO sym-metry [212] at the time of introduction. Based on the Sakata model and S U(3),the meson octet was successfully explained. The discovery of the Ω particle firm-ly established the validity of S U(3). However, one misplacement of p , n, Λ as thethree basic elements, hindered the Sakata model and the “eightfold way” took over,which, in turn, led to the quark model. With the discovery of the charmed parti-cles, S U(3) was extended to S U(4), but all the essential features of so-called flavorsymmetry are contained in S U(3) and the quark model. In this chapter we explainthat static properties of the hadrons, including the mass and magnetic moment,are well reproduced by the quark model. We defer discussions of the dynamic

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 527: Yorikiyo Nagashima - Archive

502 14 The Quark Model

properties such as the inner structure of the hadron, which can be investigatedby scattering experiments, until Chap. 17.

14.1.1The Discovery of Strange Particles

Around the late 1940s, Rochester and Butler [332], while investigating reactionsinduced by high-energy cosmic rays, observed strange particles that left V-shapedtracks in cloud chambers (Fig. 14.1).

Some of them were neutral, decaying into π� p , or to π�πC (Fig. 14.1a) andothers were charged and decayed into π�n or π˙π0 (Fig. 14.1b):

V1 ! π� C p

V2 ! π� C πC

V3 ! π� C n

V4 ! πC C π0

(14.1)

The pion was a new particle when it was discovered, but it was theoretically antic-ipated as a necessary ingredient of the nuclear force carrier. In contrast, theoristsanticipated neither the existence nor the necessity of the “V” particles. Namely, theycould not understand their properties. The most outstanding puzzle was that theywere produced frequently in the cloud chamber, meaning the production was a fastprocess, but they left visible tracks, which is a slow process. The fact that the pro-duction was fast meant it was induced by the strong force. Then if the “V” particlewere a strongly interacting particle, its decay time should be 10�24–10�22 s (some-times referred to as the strong time scale) and it could not leave visible tracks.Visible tracks as long as a few centimeters mean a decay time of 10�10–10�8 s,which is a typical weak interaction time scale. The riddle was that a particle inter-acted strongly at one time and weakly at another. Then it was realized that all the“V” particles were produced in association with another “V” particle [310]. Nakanoand Nishijima [286] and Gell-Mann [159] independently proposed a new quantumnumber “strangeness S” to explain the controversy and that it is conserved in thestrong interaction but not in the weak interaction.

(a) (d)(c)(b)

Figure 14.1 Production of new particles in the Wilson cloud chamber. Among the many show-ers penetrating through a thick lead absorber, a V-shaped track (a,b) and a kink (c,d) are ob-served [332].

Page 528: Yorikiyo Nagashima - Archive

14.1 S U(3) Symmetry 503

Let us rename V1, V2 as Λ, K0 and assume they have strangeness S D �1 andS D C1, respectively. All the hitherto known particles, including π, N , are as-sumed to have S D 0. Then the strangeness number in the production reactioncan be written as

π�C p ! K0 C ΛS W 0 0 C1 �1

(14.2)

S is conserved before and after the reaction. On the other hand, in the decays

Λ ! π� C pS I �1 0 0

K0 ! πC C π�S I C1 0 0

(14.3)

S is not conserved and the strong interaction is forbidden. Similarly, if V3, V4 areΣ �, KC, the reaction becomes

π� C p ! KC C Σ � (14.4)

The reaction is clearly visualized in a bubble chamber picture exposed to the Beva-tron beam at LBL (Lawrence Berkeley Laboratory) (Fig. 14.2).

Interestingly, these strange particles have partners with different electric chargebut with nearly the same mass. Namely they also form isospin multiplets, just likenucleons and pions. Nishijima and Gell-Mann proposed a relation between thestrangeness S, the electric charge Q (in units of the proton charge), and the thirdcomponent of the isospin:

Q D I3 CB C S

2� I3 C

Y2

Y D B C S(14.5)

Here, B is the baryon number and Y is called the hypercharge. This is a simple andyet very powerful rule that is used to predict missing partners of isospin multiplets.The law applies to all the hadrons discovered later. Table 14.1 lists a sample ofhadrons. Particles with B D 0 are called mesons, B D 1 baryons and B D 1, S ¤ 0

Table 14.1 Quantum numbers of representative hadrons.

B (baryon number) 0 0 1J P 0� 1� (1/2)C

Y D 1 (KC, K0) (K�C, K�0) (p , n)

Y D 0 (πC, π0, π�) (�C, �0, ��) (Σ C, Σ 0, Σ �)η, η0 ω, φ Λ

Y D �1 (K0, K�) (K

�0, K��) (� 0, � �)

Page 529: Yorikiyo Nagashima - Archive

504 14 The Quark Model

Λ0

K0

p

π –

π –

π+

π –

Figure 14.2 A bubble chamber picture tak-en at the 6 GeV Bevatron/LBL. π� C p !K0 C Λ, K0 ! π� C πC, Λ ! π� C p .Photo LBL [254]. A bubble chamber is a vesselfilled with a superheated transparent liquid(hydrogen, propane). Ions made by a passingcharged particle act as evaporation centers,

around which bubbles form, creating a track.The mechanism is similar to that of the cloudchamber with gas and mist replaced by liquidand bubbles. The high density of the liquidmakes the bubble chamber a very effectiveparticle detector. It was invented by Glaser in1952 [172].

hyperons. All the hundreds of hadrons so far discovered satisfy the Nishijima–Gell-Mann law.

In the 1960s, after the Cosmotron (3 GeV) at BNL and the Bevatron (6 GeV) atLBL began operation, detailed measurements of K N scattering and systematicsearches for K N, K K N, πY, K Y channels in multiparticle productions were car-ried out, clarifying the existence and properties of S D 0,�1,�2 particles andresonances. We have explained part of the story already in the previous chapter(Fig. 13.14–Fig. 13.19). Hundreds of hadrons were thus discovered. Then, one nat-urally raises the question of whether they are really elementary. If they are consid-ered as composites of more fundamental particles, what properties should thesehave?

Page 530: Yorikiyo Nagashima - Archive

14.1 S U(3) Symmetry 505

14.1.2The Sakata Model

Hadrons have a variety of spins and isospins (s, I D 0, 1/2, 1, 3/2, . . .). As any spin(or isospin) can be constructed from s D 1/2 (I D 1/2) bases, the fundamental par-ticles must comprise s D I D 1/2 particles, which requires at least a pair of s D 1/2particles with different electric charge. Conversely, two kinds of species with thisattribute can compose any value of spin s and isospin I. But to have the strangenessS, a third fundamental particle with nonzero S is necessary. The Sakata [339] modelchose (p , n, Λ) as the three fundamental particles. Noticing the nearly same valueof the mass of these particles, Ikeda, Ohnuki and Ogawa [212] proposed that thestrong interaction is invariant under the exchange p $ n $ Λ $ p (IOO sym-metry). This is the introduction of S U(3).1) However, just as isospin conservation(S U(2) symmetry) was an approximate symmetry, S U(3) symmetry (sometimescalled unitary spin) is even more so. The mass difference between p and n is� 0.1%but that between the nucleon and Λ is� 20%. S U(3) symmetry is considered validonly within this accuracy.

Let us define three basis vectors of the S U(3) group ψ and write them in matrixform:2)

ψ D

24 p

35 �

24� 1

� 2

� 3

35 D � i e i

e1 D

241

00

35 , e2 D

240

10

35 , e3 D

240

01

35

(14.6)

A triplet (p , n, Λ) represented as a 3 � 1 column vector is a three-dimensionalrepresentation of S U(3) having (I3, Y ) D (1/2, 1), (�1/2, 1), (0, 0) and is denot-ed as 3. The antiparticles can be represented as a 1 � 3 row vector (p , n, Λ) �(�1, �2, �3). The antitriplet has quantum numbers opposite to 3, i.e. (I3, Y ) D(�1/2,�1), (1/2,�1), (0, 0) and belongs to 3�.3) Denoting U as a unitary uni-mod-ular (det U D 1) 3� 3 matrix, the S U(3) transformation can be expressed in termsof eight (D 32 � 1) hermitian traceless matrices λ i as

U D exp

i2

8XiD1

λ i θi

!

Tr(λ i ) D 0

(14.7)

1) To be exact, this is flavor S U(3), to bedistinguished from color S U(3) to bedescribed later. The latter is the symmetry ofthe color charge that the quark has.

2) For the definition of groups andterminologies, refer to Appendices A and G.

3) Note 3� ¤ 3 in S U(3), but 2� D 2 in S U(2),because they occupy the same point on the

I3 map. For S U(2), the quantum numbersof the antidoublet are the same as thoseof the basis doublet. Hence by a suitabletransformation (p , n) ! (n,�p ), they canbe made to belong to the same I D 1/2representation. [See Eq. 9.220]

Page 531: Yorikiyo Nagashima - Archive

506 14 The Quark Model

(where θi are eight independent real numbers and the factor 1/2 in the exponentis a historical convention) just as the S U(2) transformation matrix is expressed interms of three (D 22 � 1) Pauli matrices. For λ i , the convention is to adopt Gell-Mann matrices, which are defined as

λ1 D

240 1 0

1 0 00 0 0

35 , λ2 D

240 �i 0

i 0 00 0 0

35 , λ3 D

241 0 0

0 �1 00 0 0

35 (14.8a)

λ4 D

240 0 1

0 0 01 0 0

35 , λ5 D

240 0 �i

0 0 0i 0 0

35 , (14.8b)

λ6 D

240 0 0

0 0 10 1 0

35 , λ7 D

240 0 0

0 0 �i0 i 0

35 , λ8 D

1p

3

241 0 0

0 1 00 0 �2

35

(14.8c)

The number of diagonal matrices is two, which is the rank of S U(3). The eight λ i ’ssatisfy the following commutation relations:�

λ i

2,

λ j

2

�D i f i j k

λk

2(14.9a)�

λ i

2,

λ j

2

D

13

δ i j C i di j kλk

2(14.9b)

f i j k is called the structure constant of S U(3), where fa, bg D ab C ba and f i j k ,di j k are totally antisymmetric or symmetric with respect to i j k permutations andhave the values listed in Table 14.2.

Table 14.2 Structure constants of S U(3).

f123 D 1

f147 D f246 D f257 D f345 D f516 D f637 D12

f458 D f678 D

p3

2d118 D d228 D d338 D �d888 D

1p

3

d146 D d157 D d256 D d344 D d355 D12

d247 D d366 D d377 D �12

d448 D d558 D d668 D d779 D �1

2p

3

Page 532: Yorikiyo Nagashima - Archive

14.1 S U(3) Symmetry 507

14.1.3Meson Nonets

Composites of a baryon and an antibaryon have the baryon number B D 0, hencethey can be identified as mesons. Since under S U(3) transformation

ψ ! ψ0 D U ψ , ψ ! ψ0D ψU† 4) (14.10)

the combination ψ† ψ D �i � i D p p C nn C ΛΛ is invariant and hence

η0 �1p

3(p p C nn C ΛΛ) (14.11)

makes an irreducible representation by itself of dimension 1 (singlet). The remain-ing eight elements made up of the combinations

P ij D � i � j � (1/3)Tr(� i � j )

D

24 1

3 (2p p � nn � ΛΛ) n p Λ pp n 1

3 (2nn � ΛΛ � p p ) Λnp Λ nΛ 1

3 (2ΛΛ � p p � nn)

35

(14.12)

mix with each other by the transformation and constitute the bases of an irre-ducible representation of dimension 8 (octet). The above process of multiplicationand decomposition can be expressed in the simple form

3˝ 3� D 1˚ 8 (14.13)

The process is an extension of S U(2) isospin addition to S U(3). In S U(2), whenI D 1/2 and 1/2 are added together, they can be decomposed to I D 0 and I D 1,which, in the above notation, can be expressed as 2˝ 2 D 1˚ 3. By comparing thequantum numbers of the octet members, the observed mesons can be identified as

πC D � 1�2, π0 D �(� 1�1 � � 2�2)/p

2, π� D �� 2�1

KC D � 1�3, K0 D � 2�3, K0D �� 3�2, K� D � 3�1

η D (� 1�1 C � 2�2 � 2� 3�3)/p

6

(14.14)

The negative sign of π0, π�, K0

reflects the fact that (n,�p ) makes an I D 1/2doublet [see Eq. (9.220)]. The construction of neutral particles is not unique, butconsidering the isospin is a good quantum number, π0 is made by applying I� DI1� i I2 to πC and η is made to be orthogonal to both π0 and η0. Using Eq. (14.14),Eq. (14.12) can be rewritten as

P ij D

2664� π0p

2C

ηp6

πC KC

�π� π0p2C

ηp6

K0

K� �K0

�q

23 η

3775 (14.15)

4) Actually ψ means ψ† as far as S U(3) operation is concerened, but very often it also represents theLorentz conjugate. So the latter’s nomenclature is also used.

Page 533: Yorikiyo Nagashima - Archive

508 14 The Quark Model

Mapping octet members on the I3–Y plane gives Fig. 14.3a. Note, in Fig. 14.3 thequark model (u, d, s) is used instead of (p , n, Λ) for the sake of later discussions.Meson octets have the same (I3, Y ) asignments in both the Sakata and quark mod-els. Such a diagram is called a weight diagram. They reproduce a spectrum of ob-served spin D 0 mesons, although when it was first made, η and η0 had not yetbeen discovered. For the spin D 1 members, replacements π ! �, K ! K�, η,η0 ! ω, φ make the weight diagram as shown in Fig. 14.3b. Thus, the Sakatamodel prepared a mathematical tool S U(3) and obtained the correct meson as-signment, but because it put aside (p , n, Λ) as a special trio, it failed to reproducethe baryon octet correctly (see Fig. 14.4b). The correct assignments were made bythe eightfold way of Gell-Mann [160] and Ne’eman [294]. From then on the Sakata

+1

-1

-1/2 1/2 +1-1 I3

Y

K+

K-

K0

K0_

π+π0, η, η'π-

(us)_

(ud)_

(us)_

(ds)_

(du)_

(uu, dd, ss)_

JP = 0-

(a)

-1/2 1/2 +1-1 I3

Y

K*+

K*-

K*0

K*0_

ρ+ρ0, ω, φρ-

(us)_

(ud)_

(ds)_

(us)_

(ds)_

(du)_

(uu, dd, ss)_

JP = 1-

(b)

-1

+1K(496)

K(496)

π(138)η(548)η'(958)

K*(894)

K*(894)

ρ (776)ω (783)φ (1020)

(ds)_

Figure 14.3 Nonet D singletC octet made of (u, d, s) and (u, d , s). Numbers in parenthesesare mass values in MeV averaged within the isospin multiplet. (a) Correspondence betweenobserved 0� mesons and quark–antiquark pair combinations. (b) Same as (a) for 1� mesons.

-1/2 1/2 +1-1 I3

Y

p

Ξ-

n

Ξ0

Σ+Σ0, Λ0Σ

-1

+1

-1/2 1/2 +1-1 I3

Y

uud

uus

udd

dss uss

dds dus

JP = 1/2+

(b) (a)

-1

+1 Ν(939)

Σ (1193)Λ (1116)

Ξ (1318)

Figure 14.4 (a) An octet made of three quarks. (b) Observed (1/2)C baryons.

Page 534: Yorikiyo Nagashima - Archive

14.1 S U(3) Symmetry 509

model was superseded by the quark model. The arguments for the meson we havedeveloped so far are equally valid for the quark model, too.

14.1.4The Quark Model

The eightfold way confers a more fundamental role on the octet since, experimen-tally, baryons with J P D (1/2)C as well as mesons were observed as octets. Ifhypercharge Y D S C B is adopted instead of the strangeness S as in the Saka-ta model, both baryons and mesons can be treated in a unified way. Though theeightfold way could reproduce the observed multiplets correctly, it lacked logicaljustification of why S U(3) works. If one asks for the existence of the fundamentalparticles as bases of the octet, one needs three bases. They are similar to p , n, Λ ofthe Sakata model but have slightly different quantum numbers and are referred toas u, d, s quarks. If they have baryon numbers, 3�˝3 D 1˚8 can reproduce mesonoctets. But to realize an octet with baryon number 1, a different combination hasto be introduced. As it happens, a product of three 3’s can make an octet that canbe identified with the baryon octet. If that is so, the baryon number of the quarksmust be 1/3. Now one uses the same assignment of the Sakata model except forthe baryon number and requires that Nishijima–Gell-Mann’s law Eq. (14.5) appliesto quarks, too. Then their quantum number can be assigned as in Table 14.3.

According to Table 14.3, the electric charge of the quarks would become 2/3 or�1/3, not integers. This was an extraordinary concept at that time and even the pro-poser Gell-Mann [161] claimed for a while that the quarks were not real but shouldbe considered as a convenient mathematical tool. The other proposer, Zweig [400],is said to have thought them real. According to the modern interpretation, all thestrong interactions originate from color charges that all the quarks carry and theirstrength does not depend on the quark species. As mu � md � m p /3 � 330 MeV,ms � muC(mΛ�m p ) � 500 MeV can be inferred from phenomenological consid-erations [see Eq. (14.57)], it is clear that the isospin [S U(2)] and unitary spin [S U(3)]are approximate symmetries whose validity is limited to phenomena where themass difference is negligble on the energy scale in question. Namely, S U(3) sym-metry under exchange of u, d, s is rather accidental. As the quark species u, d, s, . . .are said to differ in “flavor”, this symmetry is now referred to as flavor S U(3) todistinguish it from color S U(3), which plays a fundamental role in the Standard

Table 14.3 Quantum numbers of the quarks.

Name spin I3 S B Y Q

u (up) 1/2 1/2 0 1/3 1/3 2/3

d (down) 1/2 �1/2 0 1/3 1/3 �1/3

s (strange) 1/2 0 �1 1/3 �2/3 �1/3

Page 535: Yorikiyo Nagashima - Archive

510 14 The Quark Model

Model. Historically, however, flavor S U(3) was instrumental in reaching the quarkstructure of hadrons.

14.1.5Baryon Multiplets

To construct baryons, let us first look at the transformation property of nine prod-ucts of two quarks T i j D � i � j . They are not irreducible, but if they are dividedinto symmetric and antisymmetric parts

T i j D12

(� i � j � � j � i)C12

(� i � j C � j � i ) D Ai j C S i j (14.16)

each part transforms to itself because the symmetry property is conserved in thetransformation (Problem 14.1). In other words, nine product tensors T i j are sepa-rated into three antisymmetric and six symmetric tensors:

3˝ 3 D 3� ˚ 6 (14.17)

Why the three Ai j ’s belong to 3� and not to 3 is clearly seen by looking at thequantum numbers they possess. Alternatively, the transformation property of ηk �

2A i j D �i j k � i � j can be shown to be the same as that of (� k )� (Problem 14.2).Multiplying Eq. (14.17) by one more 3, the first term on the rhs of Eq. (14.17) gives

3� ˝ 3 D 1˚ 8 (14.18)

just like the meson nonets. However, unlike the meson singlet, 1 here is a totallyantisymmetric combination of Ai j and � k , i.e. ψ1 D �i j k � i � j � k (Problem 15.3),and the remaining eight tensors become an octet. From the second term 6, 18 ten-sors can be made. Among them, ten are totally symmetric and constitute a decupletwhereas the reamining eight tensors make another octet. Namely we have obtained

6˝ 3 D 8˚ 10 (14.19)

In summary, we have derived

3˝ 3˝ 3 D 1A ˚ 8MA ˚ 8MS ˚ 10S (14.20)

Among them, 1 is totally antisymmetric and 10 is totally symmetric with re-spect to three indices. The two octets, 8MA, 8MS occupy the same position on theI3–Y plane, but as can be seen from the way they are constructed, they have mixedsymmetry, one antisymmetric and the other symmetric in the first two indices. Ex-perimentally, (1/2)C baryons constitute 8 (Fig. 14.4) and (3/2)C baryons constitute10 (Fig. 14.5). This means the real baryon (1/2)C octet is a combination of the twooctets with mixed symmetry.

Page 536: Yorikiyo Nagashima - Archive

14.1 S U(3) Symmetry 511

Ξ- Ξ0

Σ+Σ0Σ-

(a)

Ω-

-1/2 1/2 +1-1 I3

Yuud

uus

udd

dss uss

dds uds

-1

+1

3/2-3/2

-2

0

uuuddd

sss

-1/2 1/2 +1-1 I3

YJP = 3/2+

-1

+1

3/2-3/2

-2

0

(b)

I=3/2

I=1

I=1/2

I=0

Δ++Δ+Δ0Δ-

Δ (1232)

Σ (1385)

Ξ (1530)

Ω (1672)

Figure 14.5 (a) 10 can be constructed as totally symmetric combinations of three quarks. (b)Observed spectrum of (3/2)C resonances. The numbers on the right denote their mass in MeV.

Problem 14.1

Show that the symmetry property of Eq. (14.16) does not change by transformation.Namely what was (anti)symmetric before is also (anti)symmetric after the transfor-mation.

Problem 14.2

Show that Ai j in Eq. (14.16) belongs to 3� in two ways: 1) by looking at the quan-tum number and 2) by looking at its transformation property.

Problem 14.3

Show that a totally antisymmetric wave function ψ1 D �i j k � i � j � k is invariantunder S U(3) transformation and hence constitutes a singlet.

14.1.6General Rules for Composing Multiplets

In order to understand the basic rules for constructing multiplets from basis vec-tors, we have shown some simple examples. When the dimension of the multipletsincreases, the method by hand like the one we have used rapidly becomes complex.There are general formulae in [S U(2)] for constructing any value of angular mo-mentum J out of spin 1/2 bases using the Clebsch–Gordan coefficients. There isa similar formula in S U(3) and tables of the corresponding Clebsch–Gordan co-efficients are available (see for instance [311]). We refer to Appendix G for more

Page 537: Yorikiyo Nagashima - Archive

512 14 The Quark Model

detailed arguments. We restrict arguments here merely to a simple extension ofS U(2) to S U(3) and how to use tables derived from general addition rules of S U(3).In S U(2), a multiplet is specified by the angular momentum j.5) Elements in amultiplet are specified by the eigenvalue of Jz D m D ( j, j � 1, � � � ,� j ). A groupconsisting of (2 j1C1)(2 j2C1) elements made by multiplying the 2 j1C1 elementsof a multiplet specified by j1 by the (2 j2C1) elements of another multiplet j2 gen-erally forms a reducible group. But such a group can be decomposed into a sum ofirreducible groups (multiplets) as

j 1 ˝ j 2 D ( j 1 C j 2)˚ ( j 1 C j 2 � 1)˚ � � � ˚ (j j 1 � j 2j) (14.21)

Elements of the irreducible group belonging to ( J, M ) can be expressed in terms ofClebsch–Gordan coefficients as (see Appendix E)

j J Mi DX

m1,m2

C J Mj1 m1, j2 m2

j j1m1ij j2m2i

m1 C m2 D M(14.22)

In S U(3), specifying the dimension alone is not enough, as we saw that 3� ¤ 3. Anadditional indicator is necessary. This is because there are two Casimir operators inS U(3). It is convenient to use a pair of numbers (p , q) to specify a multiplet wherep, q are effective numbers of quarks and antiquarks6) . The dimension is given by

D(p , q) D12

(p C 1)(q C 1)(p C q C 2) (14.23)

We list some of the correspondences between the notation 1, 3, etc. and D(p , q):

D(0, 0) D 1 , D(1, 0) D 3 , D(0, 1) D 3� , D(1, 1) D 8 D 8� ,

D(2, 0) D 6 , D(0, 2) D 6� , D(3, 0) D 10 , D(0, 3) D 10� , . . .(14.24)

The decomposition corresponding to Eq. (14.21) in S U(2) can be made usingYoung diagrams and expressed as

D(p1, q1)˝ D(p2, q2) DX

n

n(p , q)D(p , q)

An example: 8˝ 8 D 1˚ 8˚ 8˚ 10˚ 10� ˚ 27(14.25)

The difference from S U(2) is that in Eq. (14.21), specification of J is enough touniquely determine the representation, while in S U(3) there is more than one rep-resentation with the same dimension and n(p , q) becomes an integer larger than 1.In the above example, two octets appear with differing symmetry. The composition

5) In mathematical language, by the eigenvalues of the Casimir operator, which commutes withall the transformations in the group except the trivial identity operator. In S U(2) it is J2 witheigenvalue j( j C 1).

6) An antisymmetric pair of quarks (or antiquarks) are equivalent to an antiquark (or a quark). Seearguments following Eq. (14.17) and Appendix G for details.

Page 538: Yorikiyo Nagashima - Archive

14.2 Predictions of SU(3) 513

formula corresponding to Eq. (14.22) is

jμγ , νi DXν1,ν2

�μγ μ1 μ2

ν ν1 ν2

�jμ1ν1ijμ2ν2i (14.26)

Here, μ specifies the multiplet expressed in (p , q) and the subscript γ attached toμ differentiates μ with the same dimension. ν denotes different members withina multiplet, usually specified as (Y, I, I3) with the condition Y D Y1 C Y2, I3 D

(I1)3 C (I2)3.

Wigner–Eckart Theorem Corresponding to the Wigner–Eckart theorem in S U(2),which is expressed as

h j2m2jT( J M )j j1m1i D δm2,MCm1 C j2 m2J M, j1 m1

h j2jjT J jj j1i (14.27)

its analogue in S U(3) is given by

hμ2ν2jT(μν)jμ1ν1i DX

γ

�μγ μ1 μ2

ν ν1 ν2

�hμ2jjT μjjμ1i (14.28)

where the reduced matrix element hμ2jjT μjjμ1i depends only on the representa-tions involved. The Gell-Mann–Okubo mass formula described in the followingsection is an application of the Wigner–Eckart theorem.

14.2Predictions of SU(3)

14.2.1Gell-Mann–Okubo Mass Formula

If S U(3) symmetry were exact, all the quarks would have equal masses (mu D

md D ms). Although mu ' md is approximately true [m p (938 MeV) ' mn

(939 MeV)], Δm D ms � mu ' mΛ(1116 MeV) � m p (938 MeV) is fairly large.If we consider the mass difference is the origin of S U(3) symmetry breaking, theHamiltonian mass term would be

Hm D mu(uuC dd)C ms ss

D2mu C ms

3(uuC dd C s s) �

Δm3

(uuC dd � 2s s)

� m0ψψ CΔmp

3ψλ8ψ � HS C HM S (14.29)

The first term transforms as a singlet of S U(3) and the second as an eighth mem-ber (I D Y D 0) of an octet and hence is the symmetry-breaking term. Using

Page 539: Yorikiyo Nagashima - Archive

514 14 The Quark Model

Eq. (14.29) Gell-Mann [160] derived a sum rule that holds for particles in an octet(Problem 14.4).

12

(mN C m� )„ ƒ‚ …1128.5 MeV

D14

(3mΛ C mΣ )„ ƒ‚ …1135.0 MeV

(14.30a)

m2K„ ƒ‚ …

0.246 MeV2

D14

3m2

η C m2π

�„ ƒ‚ …

0.231 MeV2

(14.30b)

The agreement with experimental data is very good. Then Okubo, using Eq. (14.29),derived a mass formula applicable to any multiplet [300]:

M D a C bY C c�

I(I C 1) �14

Y 2�

(14.31)

a, b, c are constants that differ for each multiplet. Equation (14.31) is referred toas the Gell-Mann–Okubo mass formula. The reason for using mass squared ratherthan mass itself in Eq. (14.30b) is motivated by the Lagrangian mass term of theKlein–Gordon equation. The agreement with experiments is certainly better withthe mass squared.

Problem 14.4

Derive Eqs. (14.30) from Eq. (14.31).

Problem 14.5*

Derive Eqs. (14.30) from Eq. (14.29).

14.2.2Prediction of Ω

When flavor S U(3) symmetry was proposed, three members of the (3/2)C decu-plet, Δ(1232), Σ (1385), � (1530) were known.7) They fit as members of the decu-plet. However, the last member Ω was not known at the time. If S U(3) symmetryis correct, Ω with J P D (3/2)C should exist and have I D 0, Q D �1, S D �3.Furthermore, applying Eq. (14.31) the following relations are obtained:

M(Ω )� M(� �) D M(� �)� M(Σ �) D M(Σ �) �M(Δ) (14.32)

We have attached an asterisk to denote that they are members of the decuplet.Inserting observed values M(Δ) D 1232 MeV, M(Σ �) D 1385 MeV, M(� �) D

7) The notation for baryonic resonances with different J P is as follows. If they have the same (I, Y )as [N(1/2, 1), Δ(3/2, 1), Λ(0, 0), Σ (1, 0), � (1/2,�1) . . .], they are expressed as [N(1440), Δ(1232),Λ(1520), Σ (1385), � (1530), . . .] with the mass value in parentheses.

Page 540: Yorikiyo Nagashima - Archive

14.2 Predictions of SU(3) 515

1530 MeV, we obtain M(Ω ) ' 1680 MeV. The Ω mass value is significant. Ω can-not decay by the strong interaction, because a composite baryon with S D �3 canbe constructed from (� K ), (ΛK K ), (N K K K ), but they all have mass values largerthan Ω : M(� K) � 1809 MeV, M(ΛK K ) � 2107 MeV, M(N K K K ) � 2423 MeV.Consequently, Ω cannot decay to them because of energy conservation. Decaysto (ΛK), (� π) are allowed, but only weakly because they have S D �2, violatingstrangeness conservation. As a result of this prediction, a bubble chamber team atBNL launched a search for Ω , and out of 105 pictures, they found one with evidenceof Ω production (Fig. 14.6). The picture clearly shows the following sequence ofreactions:

K0 is not seen in the picture, but from energy-momentum conservation the miss-ing mass Mmiss can be calculated; Mmiss D mK indicates that it is a K0. The Ω �leaves a visible track and its lifetime and mass were calculated to be � 10�10 s and1686˙ 12 MeV, confirming the S U(3) prediction. This was a result of S U(3) sym-

π- P

K+

K0

K-

Λ0

Ξ0

γ

γ

Ω- π-

Figure 14.6 A picture of Ω production obtained in a bubble chamber at BNL [49].

Page 541: Yorikiyo Nagashima - Archive

516 14 The Quark Model

metry and not of the quark model, but historically the latter was inspired by theformer.

Note that so far resonances that appear in the strong interaction having lifetime� 10�23 s were thought of as dynamical phenomena, not in the same category asquasistable particles with long lifetime & 10�10 s. But from the fact that the Ω �belongs to the same decuplet as other resonances, it is clear that the lifetime doesnot play any essential role in the classification of particle species. This is analogousto treating all the excited atomic or nuclear energy levels, stable or unstable, onequal footings. As quark composites, it is legitimate to treat hadronic resonancesequally with other hadrons such as p and π.

14.2.3Meson Mixing

As stated before, flavor S U(3) is not a rigorous symmetry and the Hamiltonian con-tains a symmetry-breaking interaction as in Eq. (14.29). It has the same transforma-tion property as the neutral member of an octet and has nonzero matrix elementsbetween particles belonging to different multiplets but having the same Y and I3.For example, it induces mixing between a singlet meson φ0 D (uuC dd C s s)/

p3

and the Y D I3 D 0 member of an octet φ8 D (uuCdd�2s s)/p

6. For 1� mesons,we write

φ D cos θ φ8 � sin θ φ0

ω D sin θ φ8 C cos θ φ0(14.33)

φ, ω are observed mesons and are mass eigenstates of a 2 � 2 matrix M2:

M2�

φω

�D

�m2

φ 00 m2

ω

� �φω

�(14.34a)

On the other hand, the mass matrix M2 0 with bases (φ8, φ0) can be expressed as

M2 0�

φ8

φ0

�D

�M2

8 M280

M208 M2

0

��φ8

φ0

�, M2

80 D M208 (14.34b)

Using TrM2 D TrM2 0, det M2 D det M2 0, we obtain

m2φ C m2

ω D M28 C M2

0

m2φ m2

ω D M28 M2

0 � M208M2

80

(14.35)

Gell-Mann and Okubo’s formula Eq. (14.30), which is valid for no mixing, means

M28 D

4m2K� � m2

3(14.36)

Page 542: Yorikiyo Nagashima - Archive

14.2 Predictions of SU(3) 517

Using Eqs. (14.33) and (14.36), the following relations are obtained:

tan2 θ DM2

8 � m2φ

m2ω � M2

8D

4m2K� � m2

� � 3m2φ

3m2ω � 4m2

K� C m2�

(14.37a)

tan 2θ D2M2

08

M20 � M2

8(14.37b)

Table 14.4 gives the relevant mass values and mixing angles.

Problem 14.6

Derive Eqs. (14.37).

When sin θ D 1/p

3(' 0.577, θ ' 35.3),

φ D

r23

uuC dd � 2s sp

6�

r13

uuC dd C s sp

3D �s s (14.38a)

ω D

r13

uuC dd � 2s sp

6C

r23

uuC dd C s sp

3D

uuC ddp

2(14.38b)

This is called ideal mixing. Table 14.4 shows that for vector 1� and tensor 2Cmesons the mixing is almost ideal. For the pseudoscalars, the mixing angle is smalland negative. This is because mη0 > M08 and is attributed to the chiral anomalycontribution [97, 208, 361], which is an interesting topic theoretically but we do notdiscuss it here. If ideal mixing is realized, φ, f 0

2 � s s, ω, f2 � uuCdd, the formeris expected to decay predominantly to K K and the latter to 2π, 3π (see OZI rulebelow). The observed branching ratios are [311]

BR(φ ! K K) ' 84% , BR(φ ! 3π) ' 15%

BR(ω ! 3π) ' 90% ,

BR( f 02 ! K K) ' 100% ,

BR( f2 ! 2π C 4π) ' 95% , BR( f2 ! K K) ' 4%

(14.39)

which confirms the expectation.

Table 14.4 Meson mixing angle [311].

0� 1� 2C

Mass η0 957.7 ω 782.0 f2 1275.1[MeV] η 547.9 φ 1019.5 f 0

2 1525.0

π0 135.0 �0 770.0 a2 1318.3

K0 497.6 K� 891.7 K�2 1425.6

M8 569.3 928.7 1459.6

θ �11.3ı 40.0ı 32.1ı

Page 543: Yorikiyo Nagashima - Archive

518 14 The Quark Model

φs

s

u

ud

d

π

π+

π0 φs

s

s

s K+

K-

(a) (b)

d

d

u

u

Figure 14.7 (a) φ! 3π, (b) φ! KCK�, K0K0

OZI Rule The suppression of φ ! 3π and ω, f2 ! K K can be explained by theso-called OZI (Okubo–Zweig–Iizuka) rule [211, 301, 400], which claims that decayswith disconnected quark diagrams are suppressed. It is respresented pictorially inFig. 14.7.

Since φ is predominantly composed of s s, it can go to K K with connected quarklines, whereas the quark lines have to be cut and regenerated to go to 3π’s [note:φ ! 2π is forbidden by G-parity (G D (�1)I C D �1), see paragraph in Sect. 13.2].This is an empirical rule. The reason for it was clarified with the advent of QCD(see Sect. 14.5.4).

Problem 14.7

Experimentally, σ(K� p ! φΛ) ' 100σ(π � p ! φn). By drawing the quarkdiagram. show that the OZI rule applies here.

Problem 14.8

The decay process of a vector meson 1� to lepton pairs l l goes through a photonintermediate state and its decay width is proportional to the square of the electriccharges Q2 D (

Pi Q i )2

16πα2hQi2jψ(0)j2

m2V

(14.40)

where α is the fine structure constant, ψ is the wave function of the vector mesonand mV is its mass. Assuming m� D mω D mφ and ideal mixing, show the relativeratios of the decays are

Γ� W Γω W Γφ D12W

118W

19

(14.41a)

Page 544: Yorikiyo Nagashima - Archive

14.3 Color Degrees of Freedom 519

The experimental data are

Γ� D 7.04˙ 0.06 keV , Γω D 0.60˙ 0.02 keV, Γ� D 1.27˙ 0.04 keV

(14.41b)

and agree fairly well with the expectation.

Problem 14.9

Derive Eq. (14.40)*.Hint: V ! l l can be considered as qq ! l l scattering in which quark flux isprepared by the vector meson (see Fig. 14.8). Use:

1. σ(e�eC ! μ�μC) in Eq. (C.17) with appropriate replacement of the quarkmass and charge,

2. The quark flux prepared by the meson is F(qq) D 2 q jψ(0)j2, where q is thequark velocity in the total center of mass frame energy

ps D mV ,

3. The result would still be a factor 3 off, which can be corrected by consideringthe color factor described below.

V

qi

qi

l

l

γ

Qie e

Figure 14.8 A diagram that contributes to V ! `` decay.

14.3Color Degrees of Freedom

The success of S U(3) immediately raised a new problem. Ω � and ΔCC are com-posed of s s s and uuu, which are symmetric. Their spin is 3/2, which means theconfiguration is """ and is also symmetric. Since ground states of the quark com-posites are expected to have orbital angular momentum L D 0 (states with L � 1are generally at higher energy levels), the particles’ wave functions in the decu-plet are totally symmetric. This contradicts quarks having spin 1/2, which shouldobey Fermi statistics. Therefore, we need to introduce another degree of freedomto make the wave function totally antisymmetric, which we refer to as “color”. Inorder to make three particles totally antisymmetric, the color freedom has to bethree. We have already learned that a singlet made from 3˝ 3˝ 3 in S U(3) is total-ly antisymmetric [see arguments following Eq. (14.18)]. We name the three degrees

Page 545: Yorikiyo Nagashima - Archive

520 14 The Quark Model

of color freedom R (red), B (blue) and G (green). The names are arbitrary. We couldjust as well call them η1, η2 and η3 or use numbers. However, it is a very conve-nient name, since all the observed hadrons seem to be color singlets, and can becalled white or color neutral figuratively.

Evidence for Color Degrees of Freedom There is a great deal of evidence for theexistence of three colors (Nc D 3).

1) The spin statistics of Ω � and ΔCC that we have just stated.2) π0 ! γCγ : As the neutral pion π0 is a bound state of qq, i.e. jπ0i D (�juuiCjddi)/

p2 [see Eq. (14.14)], the process goes through quark loop intermediate states

as depicted in Fig. 14.9. The decay matrix element is the sum of the two diagramsin Fig. 14.9. If the quark has Nc colors, the number of diagrams is Nc and thedecay rate has to be multiplied by N 2

c . The quark model calculation with Nc D 3gives [7, 55]

Γ (π0 ! 2γ ) D N 2c

�Q2

u � Q2d

�2 α2m3π0

32π3 f 2π

(14.42)

where f π is the pion decay constant defined in Eq. (15.44). Calculation givesΓ (π0 ! 2γ ) D 7.64 eV, in good agreement with the observed lifetime τexp D

8.4 ˙ 0.6 � 10�17 s, which gives Γexp D „/τ D 7.84 ˙ 0.6 eV. Without color, thetheoretical value would be a factor of 9 less than the experimental value.

3) Drell–Yan process (p C p ! l l C X ): In the p (π)–p reactions to make alepton pair with large invariant mass, the production cross section can be shown tobe proportional to Nc (see Sect. 17.9.1) and a comparison with experiment showsNc D 3.

4) R-value: The R-value is defined by

R �σ(e�eC ! all hadrons)

σ(e�eC ! μ�μC)D Nc

Xi

Q2i (14.43)

D 3 �

"�23

�2

C

��

13

�2

C

��

13

�2

C � � �

#

D

8<ˆ:

2 u, d, sp

s < 2mc103 u, d, s, c 2mc <

ps < 2mb

113 u, d, s, c, b 2mb <

ps

(14.44)

π0

γe

γ

u

u

u

23

e23

e13

π0

γ

γ

d

d

d

e13

+

(a) (b)

Figure 14.9 Triangular diagrams: π0 ! γ C γ goes through quark loops.

Page 546: Yorikiyo Nagashima - Archive

14.3 Color Degrees of Freedom 521

e e

μ μ_

_

(a)

tx

e e

qi qi_

_

e

e e

Qie

HADRONS

(b)

Figure 14.10 One-photon exchange diagrams for μμ and hadron production in e� eC reaction.R D σ(e� eC ! hadrons)/σ(e� eC ! μ� μC) D 3

Pi Q2

i .

Feynman diagrams for e�eC ! μ�μC and e�eC ! all hadrons are shown inFig. 14.10.

Quarks are confined and it means the quarks produced in e�eC reaction areconverted 100% to hadrons. From Fig. 14.10 it is clear that production of a quarkpair is a pure electromagnetic process and its cross section is proportional to Q2

itimes that of μμ production. Multiplying by Nc and summing over all the kinds ofquarks that can be produced gives the R-value.

Figure 14.11 gives a compilation of R-value data. It clearly shows that at thethreshold of going over

ps > 2mq , a variety of resonances appear, but that the

R-value agrees with the predictions atp

s 2mq , q D c, b for each energy rangewith mass values given in Table 2.2.

10-1

1

10

102

103

1 10 102

R ω

ρ

φ

ρ

J/ ψ ψ (2 S )Υ

Z

√s [GeV]

u+d+s u+d+s+c u+d+s+c+b

No color

Figure 14.11 Compilation of R-value data [311]. Lines indicated by arrows represent the values2, 10/3 and 11/3, respectively. The dashed line indicates the R-value for uC d C s C c C b withno color. Lines J/ψ, � indicate sharp resonances.

Page 547: Yorikiyo Nagashima - Archive

522 14 The Quark Model

5) The weak force carrier W ˙ boson decays into

W ! eν e , μνμ , τντ , ud0 , cs0 (14.45)

and the quarks hadronize after the decay. d0, s0 are Cabibbo-rotated d, s (seeSect. 15.6.4). If the masses of the leptons and quarks are neglected comparedto mW D 81 GeV, the partial decay rates are all the same because the coupling ofthe W boson is universal. Therefore, the branching ratio of W ! eν e (or any singleleptonic decay) is 1/5 without color, but 1/9 if color is included. The experimentalvalue is � 11% and supports the existence of the color degrees of freedom.

14.4SU(6) Symmetry

14.4.1Spin and Flavor Combined

SU(6) Multiplets As the quark has color degrees of freedom, a wave function ofquark composites can be expressed as a product:

Total wave function D color � flavor � spin � space

Assuming all hadrons are color singlets and observed low-lying hadrons are inthe S states of orbital angular momentum, the remaining degrees of freedom are3 � 2 D 6. If the six degrees of freedom in the flavor and spin are equal andif the strength of the interaction does not change by exchange among them, thesymmetry to be considered is S U(6). Base vectors are

(� 1, � 2, � � � , � 6) D (u ", u #, d ", d #, s ", s #) (14.46)

We designate this as 6 or, writing flavor and spin separately, as (3, 1/2). The wayto make product representations from the above bases is quite similar to that ofS U(3). Meson multiplets are (see Appendix G)

6˝ 6� D 1˚ 35 D (1, 0)˚ (1, 1)˚ (8, 0)˚ (8, 1) (14.47)

which means both mesons with J P D 0�, 1� appear in a singlet and an octet. It iscustomary to combine them and call them a nonet. Baryons appear in

6˝ 6˝ 6 D 20A ˚ 70MA ˚ 70MS ˚ 56S (14.48)

However, the Fermi statistics of the quark allows only totally symmetric 56S repre-sentation. Separating the flavor and the spin, we can re-express it as

56 D (8, 1/2)˚ (10, 3/2) (14.49)

Therefore the baryon (1/2)C octet and the decuplet (3/2)C exhaust the availablespectra. Namely, the baryons, unlike the mesons, do not constitute a nonet. Note,in Sect. 14.1.5 we stated that there are two baryon octets with different symmetry,now we know that combined with spin wave functions they should make a totallysymmetric wave function (see Problem 14.11).

Page 548: Yorikiyo Nagashima - Archive

14.4 S U(6) Symmetry 523

Wave Functions in SU(6) Wave functions of the baryons in flavor octet and thoseof spin are of mixed symmetry. The combined wave functions have to be totallysymmetric in the S states. As an example, let us consider jp i � juudi. With respectto the first two quarks it is symmetric, and hence the spin wave function of the firsttwo must also be symmetric. A symmetric spin wave function is expressed as

j1, 1i D"", j1, 0i D1p

2("# C #"), j1,�1i D## (14.50)

In order to make a spin 1/2 wave function from the product of Eq. (14.50) and thethird quark, we use the Clebsch–Gordan coefficients to give

j1/2, 1/2i D

r23""# �

r13

("# C #") "p

2

D1p

6[2 ""# �("# C #") "] � �M S

(14.51)

As the wave function must be totally symmetric, we make a cyclic replacement ofthe particle 1! 2! 3 and add them all together. Therefore

jp "i � uud[2 ""# �("# C #") "]C cyclic

D1p

18(2u " u " d # �u " u # d " �u # u " d "

C 2u " d # u " �u # d " u " �u " d " u #

C 2d # u " u " �d " u " u # �d " u # u ")

(14.52)

The normalization constant is determined to make hp jp i D 1.

Problem 14.10

Show that the same wave function is obtained if one starts from the antisymmetricwave function for the first two quarks

φM A�M A �(ud � du)p

2u

("# � #")p

2" (14.53)

Problem 14.11

Using φM S D1p

6f2uud � (ud C du)ug and �M S , φM A, �M A given in Eqs. (14.51)

and (14.53), prove that the S U(6) wave function of Eq. (14.52) can be written as

jp "i D1p

2[φM S �M S C φM A�M A] (14.54)

φM S , φM A are wave functions belonging to 8MS, 8MA, as is clear from the sym-metry. φM S , φM A of other baryons are given in Table G.2.

Page 549: Yorikiyo Nagashima - Archive

524 14 The Quark Model

Magnetic Moment of the Proton The S U(6) wave function is given by Eq. (14.52),but the relative spin direction of u and d is the same if cyclic permutation is per-formed. Therefore the first term is enough for considering the magnetic moment.The normalization, however, is important and has to be redefined each time. Con-sequently we consider

jp i D uud1p

6

�2 ""# �("#" C #"")

(14.55)

The magnetic moment of the proton can be calculated as a vector sum of that ofeach quark. From the first term, the proton magnetic moment receives a contri-bution from the quarks μ p D μu C μu � μd with probability (2/

p6)2 D 2/3.

From the second term, the proton receives μ p D μu � μu C μd with probability2 � (1/

p6)2 D 1/3

) μ p D23

(2μu � μd)C13

μd D13

(4μu � μd) (14.56a)

Similarly for the neutron,

μn D13

(4μd � μu) (14.56b)

If S U(6) is an exact symmetry, mu D md , hence μd D �μu/2, which makesμn/μ p D �(2/3). The agreement with the observed value (�0.685) is very goodand was considered a great success of S U(6) symmetry. Here, we go one step fur-ther and calculate all the magnetic moments of the 1/2, 3/2 baryons, but regardmu, md , ms as parameters to be adjusted with the observed values. Table 14.5 liststhe magnetic moments of other baryons together with fitted and observed values.

Problem 14.12

Derive the expressions in the second column of Table 14.5.

We see that all nine magnetic moments are well reproduced with only threeadjustable variables. The quark masses determined from the above table usingμ i D qi e/2mi are

mu D 338 MeV , md D 322 MeV , ms D 510 MeV (14.57)

We see mu ' md ' m p /3, ms � mu ' mΛ � m p . They are called the constituentmass (effective static quark mass) in contrast to current mass (see Table 14.5),which is determined by the scattering data. The constituent masses are known toreproduce static properties of the hadron, including the magnetic moment and themass. The riddle of two different quark masses will be resolved later when we learnabout asymptotic freedom (see Sect. 18.5.1). It states that quarks with the currentmass are freely flying at almost the speed of light but confined inside a sphere withthe effective size of hadrons (. 10�15 m). In describing static properties, however,

Page 550: Yorikiyo Nagashima - Archive

14.5 Charm Quark 525

Table 14.5 Magnetic moments derived using S U(6) wave function [117].

Baryon Magnetic moment Theory Observed value a

p (4μu � μd )/3 input 2.792847386˙ (63)

n (4μd � μu )/3 input �1.91304275˙ (45)Λ μ s input �0.613˙ 0.004

Σ C (4μu � μ s )/3 2.67 2.458˙ 0.01

Σ 0 (2μu C 2μd � μ s )/3 0.79 –Σ 0 ! Λ (μd � μu )/

p3 b –1.63 �1.61˙ 0.08

Σ � (4μd � μ s )/3 –1.09 �1.160˙ 0.025

� 0 (4μ s � μu )/3 –1.43 �1.250˙ 0.014� � (4μ s � μd )/3 –0.49 �0.651˙ 0.0025

Ω � 3μ s –1.84 �2.02˙ 0.05

a In units of proton Bohr magneton (e/2m p ). Data are taken from [311].b Transition magnetic moment Σ 0 ! Λ.

the contributions of the quark kinetic energy and the potential made by the colorforce field add up to make the constituent mass. The constituent quarks are almostat rest and nonrelativistic treatment reproduces the static properties of the hadronquite well.

14.4.2SU(6) � O(3)

Excitations of orbital angular momentum with L D 0(S ), L D 1(P ), L D 3(D), . . .are possible. Using the spectroscopic terminology 2SC1L J ( J P C ) to denote quan-tum numbers of the excited states, the known 0�, 1� are denoted as 1S0(0�C),3S1(1��). For L > 0 meson states, 1P1(1C�), 3P0(0CC), 3P1(1CC), 3P2(2CC), . . .have been observed (Fig. 14.12).

14.5Charm Quark

14.5.1J/ψ

In November 1974, an incident remembered as the “November Revolution” oc-curred. A precursor was the discovery that summer of an anomalously large peakby Ting and group at mee ' 3 GeV in a reaction p p ! e�eC C X carried out atBNL (Brookhaven National Laboratory). Its spectrum was so anomalous, the eventsshowing a large, sharp peak with practically no backgrounds, that the group was

Page 551: Yorikiyo Nagashima - Archive

526 14 The Quark ModelL

1

2

0

ρ3(1690) K3*(1780) ω3(1670) φ3(1850) K2 (1820)ρ1(1700) K* (1680) ω (1650)π2(1670) K2 (1770) η2 (1645) η2(1870)

D

a2(1320) K2* (1430) f2 (1270) f2’(1525)P

a1(1260) K1A(1400) f1 (1285) f1 (1420) a0(1450) K0* (1430) f0 (1370) f0 (1710) b1(1235) K1B(1270) h1(1170) h1(1380)

S’ ρ (1450) K*(1410) ω (1420) φ (1680)π (1300) K (1460) η (1295) η’ (1475)

S ρ (770) K* (892) ω (782) φ (1020)

π K η’ (958) η (549)

P’

S’’1--0-+

1--0-+

1++0++

2++

1+-

2--1--

3--

2-+

0 1 2 3M2 (GeV/c2)

Figure 14.12 A table of observed mesons classified by S U(6) � O(3) [251, 311]. Particlesin the blank spaces have not yet been observed. At the right of each box J P C ’s are indicated.S 0, S 00 , P 0 are radial excitations of S and P.

not certain about the validity of their observation and repeated tests without evermaking it public. On the other hand, at SLAC (Stanford Linear Accelerator Center)on the west coast, Richter’s group had been measuring the total cross sections ofthe e�eC reaction since the summer at the electron–positron collider acceleratorSPEAR and were frustrated by anomalous behavior of the observed reaction rate,which went up and down every time they re-measured with no sign of stabilization.Eventually, they noticed that the lack of reproducibility might be due to the inaccu-racy of the energy setting. They repeated the experiment with improved accuracyand they also observed a very narrow peak at 3 GeV. Ting named it the J-particle andRichter named it ψ, and today it is known as J/ψ. The reason why people madea fuss about J/ψ was that despite its large mass value of 3 GeV the width was farsmaller than the experimental resolution of a few MeV. In conventional thinking,when the parent’s mass is large compared to the offspring’s mass (if they are a fewpions, the total mass is at most 1/10 that of the parent), the parent is expected tohave a short lifetime because of the large Q-value. In other words, the decay widthshould be at least as large as that of resonances with comparable mass (for instance� has mass 770 MeV and width Γ� � 150 MeV). The situation is very similar to thediscovery of the strange particles. Hot debates ensued, but it was soon realized thatJ/ψ is composed of the fourth, “charm” quark; Ting and Richter received the Nobelprize in 1975. The short duration between the discovery and the award of the prizewas second only to the discovery of parity violation by Lee and Yang, indicating thelarge impact that made on the high-energy community.

Page 552: Yorikiyo Nagashima - Archive

14.5 Charm Quark 527

14.5.2Mass and Quantum Number of J/ψ

J/ψ was produced in the following reactions [33, 34]:

BNL W p C B e ! J/ψ C � � �

J/ψ ! e�eC (14.58a)

SLAC W e� C eC ! J/ψ ! e�eC, μ�μC, hadrons (14.58b)

Figures 14.13 and 14.14 show J and ψ production cross section results. Notethe scale of the abscissa. In Figure 14.15 the cross section goes down just beforerising to the peak of the resonance. This is due to an interference effect betweenthe photon and J/ψ intermediate states in ee ! μμ and indicates that J/ψ sharesthe quantum number J P C D 1�� with the photon. The shape of the resonancebefore and after the peak is not symmetric. This is due to the radiative effect, inwhich the electron and positron with larger energy than the resonance mass emitphotons and lose energy, and the remaining less energetic electron–positron pairforms the resonance.

Isospin and G-Parity of J/ψ If I D 1, the decay to �0π0 is forbidden. This is easilyseen by looking at the Clebsch–Gordan coefficients of making I D 1 from twoI D 1 objects. I D 0 was confirmed by the decay mode to ΛΛ and

BR( J/ψ ! �0π0)BR( J/ψ ! �C π�)C BR( J/ψ ! �� πC)

�12

(14.59)

Since C D (�), this gives G D (�1)I C D (�). This is also seen by the predominantdecay into an odd number of pions rather than an even number of pions.

14.5.3Charmonium

Potentials of the Quarks Following the discovery of J/ψ (3096.9 ˙ 0.1 MeV), ahigher state ψ0 D ψ(3686.0 ˙ 0.1 MeV) and its decay products � c0(3415.1 ˙ 1.0),� c1(3510.6 ˙ 0.5), � c2(3556.3 ˙ 0.4), etc. from ψ0 ! γ � were discovered succes-sively. When one makes a chart in terms of J P C (Fig. 14.16), it resembles boundstate levels of the positronium made of an electron and a positron and thus theyare named charmonium states. From the similarity of the levels, their structure isexpected to be almost identical. Namely, it is inferred that a pair of point particleswith spin 1/2 interacting in a Coulomb-like potential has formed a bound state.The level structure is better reproduced (Fig. 14.17) if one uses

V(r) D �43

α s

rC k r 7) (14.60)

7) V D rn , �1 � N � 2 (1/r, ln r, r, r2 ) produce similar spectra. However, only V � r gives theRegge trajectories ( J / m2) shown in Fig. 14.18.

Page 553: Yorikiyo Nagashima - Archive

528 14 The Quark Model

Figure 14.13 J-particle production in p–Be collisions. A very narrow resonance is observed inthe e� eC invariant mass distribution. The experiment was carried out with the 28 GeV AGS atBrookhaven National Laboratory [33].

The coefficient 4/3 of the first term is introduced to be consistent with the QCDpotential, which will be described later [see Eq. (14.86)]. The second term, if extrap-olated to r !1, gives the confining potential, because to separate the two quarksto r D 1 requires an infinite amount of energy.

Evaluation of α s For a short distance (r . 1/mπ ' 10�15 m), the first term isdominant and we can drop the second term for a qualitative argument about thebound state.9) Energy levels of a bound particle due to the Coulomb potential (i.e.the levels of hydrogen atoms and positronium) are expressed as

En D �α2

2μn

n: principal quantum number (14.61)

where α is the fine structure constant and μ D m1m2/(m1 C m2) is the reducedmass. The energy difference between the two levels of 3S1 with n D 1 and n D 2

9) This is a good approximation for bottomonium to be discussed later. For the case of charmonium,the second term is comparable to the first term.

Page 554: Yorikiyo Nagashima - Archive

14.5 Charm Quark 529

Figure 14.14 Production cross sections of e� eC ! ψ ! X , where X is (a) hadrons, (b)μ� μC and (c) e� eC. The experiment was carried out with PEP at Stanford Linear AcceleratorCenter [34, 73].

is given by

ΔE ' 5 eV Dα2me

4

�112 �

122

�D

316

α2me (14.62)

where the reduced mass me/2 is used for the positronium. For charmonium, wechange ΔE ! 600 MeV, α ! (4/3)α s , me ! mc . We set mc ' 3700/2 MeV,because while the decay width of ψ0(3686) is narrow, that of ψ00 D ψ(3770)! D D(D D are hadrons which contain a charm quark) is wide (� 25 MeV), indicating astrong decay. In other words, the mass of ψ exceeds 2mc between the two levels.Inserting numerical values (me ' 0.5 MeV, α ' 1/137), we obtain α s ' 1. Itis very surprising that levels at energy scales differing by a factor of 100 millionexhibit the same structure. Thus, it was clarified that a Coulomb-like force, but 100times stronger, is at work between the quarks.

Confinement and the Regge Trajectory The Regge trajectory represents the phe-nomenon that a series of particles with the same quantum numbers other thanspin are on straight tracks with the same slope on the m2–J plane (Figs. 14.18

Page 555: Yorikiyo Nagashima - Archive

530 14 The Quark Model

1.00

0.30

0.10

0.033.1003.0953.0853.000 3.090

Collision Energy

Rela

tive

yiel

d of

μ to

ele

ctro

n

Figure 14.15 Interference effect between γand J/ψ [73]. If J/ψ has the same quantumnumber as γ , the interference effect appearsin the e� eC ! μ� μC reaction as a changein the cross section just before the resonance

sets in (solid line). If the quantum number isdifferent, no interference occurs (dashed line).The data show the existence of the interfer-ence.

and 13.29). We shall show that V(r) D k r can reproduce the Regge trajectories.Assume the attractive potential V(r) D k r n , that two massless quarks are rotatingon a circle and that the centrifugal force is in balance with the potential. The massof the two-particle system, i.e. the total energy, is given by

H D 2p C k r n (14.63)

As they are rotating, they have angular momentum J D 2p r . They are stationaryat the distance where H is minimum,

@H@rD 0! J D nkr nC1 (14.64)

Inserting Eq. (14.64) in Eq. (14.63) we have

H / J n/(nC1) ! J / M (nC1)/n (14.65)

To reproduce the Regge trajectory, n has to be 1. Inserting n D 1 and recalculating,we obtain

J Dm2

4k�

(mc2)2

4k„c(14.66)

where we have recovered „ and c in the last equation. From Figure 14.18 the slopeof the Regge trajectories is obtained to be � 1 (GeV)�2 and

k D 0.25 GeV2/„c ' 1.3 GeV/10�15 m (14.67)

which is often referred to as the string tension. The work necessary to separate thequarks to a distance of 1 m is � 1015 GeV � 150 tons� 1 m. It is impossible to givethat amount of energy to a quark.

Page 556: Yorikiyo Nagashima - Archive

14.5 Charm Quark 531

3500

3000

ψ(3S)(3770)

ψ(2S)(3686)

J/ψ(1S)(3097)

ηc(2S)(3654)

ηc(1S)(2980)

χc0(1P)χc1(1P)

χc2(1P)DD2 MD

Charmonium Levels

Positronium Levels

γγγγ

γγ γ

γ

γ

γγ∗

γ∗hadrons

hadrons

hadrons

hadrons

hadrons

ππηπ

hadronshadrons

JPC 0-+ 1-- 0++ 1++ 2++

JPC 0-+ 1-- 0++ 1++ 2++

n=2

n=1n=1

n=2 3S1

3S1

3P13P2

3P0

E1

E1

M1

M1

M11S1

1S1~5 eV

Mas

s (M

eV)

Figure 14.16 Comparison of levels between charmonium and positronium [311].

Mas

s (G

eV)

Har

mon

icos

cilla

tor

Line

ar

Line

ar +

Coul

omb

+ sp

inco

uplin

g

4.0

3.5

3.0

ψ´

ψ

χ

χ

2d

1d

1p1p

2p3s

2s2s, 1d

2p, 1f

3s, 2d, 1g

1s 1s

Figure 14.17 A comparison of level structureby using various kind of potentials [101]. Therightmost column shows the levels of a har-monic oscillator (V � r2), the second fromright is due to a linear potential and the third

is a combination of the linear and Coulombpotentials [Eq. (14.60)]. Introducing spin cou-plings reproduces the observed spectrum,shown in the fourth column (at the left).

Page 557: Yorikiyo Nagashima - Archive

532 14 The Quark Model

Figure 14.18 Regge trajectories: A group of particles with the same quantum numbers excepttheir spin are on straight tracks ( J / m2) with the same gradient on the m2–J plane. This iscalled the Chew–Frautschi plot [98].

V(r) D k r means the quarks are attracted to each other by a string with constanttension. Hence k is referred to as the string constant. As 1.3 GeV � 9mπ , an ener-gy equivalent of 9π’s is stored if the two quarks are � 10�15 m apart. One noticesthat this is about the size and mass of a hadron. If the string between the quarks isstretched, a large amount of energy is stored in it. According to the Heisenberg un-certainty principle, pairs of particles and antiparticles are constantly produced andannihilated in the vacuum. If a pair is produced on the line, they make two colorsinglets with the original quarks. Then the force between the pair is weaker thanthat between the members of the singlet and the string is cut off. This means thatmaking hadrons by extracting quark pairs from the vacuum is more advantageousthan stretching the string further (Fig. 14.19). Thus, the string stretched betweentwo energetic quarks is cut into pieces, producing many hadrons, which is referredto as quark hadronization. This is why the quark is believed to be unable to separate

Page 558: Yorikiyo Nagashima - Archive

14.5 Charm Quark 533

r

(a)

(b)

Figure 14.19 Energy stored between twoquarks increases in proportion to the separa-tion and behaves like a string with constanttension. If a pair of quarks is produced in vac-uum and combines with the original pair tomake a new pair of color singlets, the orig-

inal color force is terminated there and thestring is cut. In other words, for r > 10�15 mit is more advantageous to cut the string andhadronize than to stretch the string further.Many hadrons are created.

as a single entity (confinement). An energetic quark (and gluon also) is convertedinto many hadrons flying as a bundle in the same direction, which is observed asa jet phenomenon. Figure 14.20a shows a typical two-jet event (e�C eC ! q C q)observed by the VENUS detector of TRISTAN e�eC collider [285, 380]. One imme-diately observes that many tracks and associated energies are concentrated in twoopposite directions. Figure 14.20b shows an azimuthal angle distribution of thejet axis produced by polarized electrons at SPEAR [195]. The theoretical angulardistribution of a particle in the e�eC reaction is expressed as

dσdΩ/ 1C α cos2 θ C P2α sin2 θ cos(2') (14.68)

where P is beam polarization and α D C1 for spin 1/2 particles and α D �1 forspin 0 particles. The data show α D 0.78˙ 0.12, confirming the quark’s spin to be1/2.

14.5.4Width of J/ψ

The resonance width of the J/ψ was narrower than the experimental resolution.We discuss a method to determine the width. When there is a resonance, the totalcross section to a final state f is given by [see Eq. (13.85)]

e�eC ! J/ψ ! f

σ D2 J C 1

(2s1 C 1)(2s2 C 1)4πp 2

�Γee

Γ

��à f

Γ

�Γ 2/4

(E � ER)2 C Γ 2/4

(14.69)

Here, Γe/Γ and Γ f /Γ are the branching ratios of the J/ψ decaying to electronsor the final f state, s1 D s2 D 1/2 are electron spins and J D 1. The observedvalue is a cross section convoluted with the experimental resolution δexp. Undernormal circumstances, Γ δexp and the functional shape of Eq. (14.69) can bereproduced faithfully. If, on the other hand, Γ is much smaller than δexp (here,δexp/E ' 0.1%, δexp ' 3 MeV 93 keV), the measured cross section is nothingbut the error function (see Fig. 14.21).

Page 559: Yorikiyo Nagashima - Archive

534 14 The Quark Model

≈≈ 0 45 90 135 180

400

1000

800

600

Azimuthal Angle of Jet Axis(degrees)

√s = 7.4 GeV

dσ /d Ω ~ 1 + α cos 2θ + P 2α sin 2θ cos(2ϕ)

α = 0.78 ± 0.12

(b)

(a)

Figure 14.20 Quark jets produced ine� C eC ! q C q reaction. (a) Two viewsof tracks in the central drift chamber. Num-bers in rectangles at outer circle are energydeposits in the lead glass calorimeter. Lowerright: Back-to-back emisson of two sharply

peaked jets is evident in the Lego plot, a 3-dimensional display of energy deposits inφ � z plane [285]. (b) Azimuthal angulardistribution of the jet axis produced by a po-larized beam. α D 1 for spin 1/2 and α D �1for spin = 0 particles [195].

When f D ee, the integral of Eq. (14.69) over energy with p D pR D

1548 MeV/c fixed at the resonance peak gives

Z 1

0σ(E )dE D

3π2

2p 2R

�Γee

Γ

�2

Γ (14.70)

which should be equal to the area under the resonance curve of σ(ee ! e�eC)(Fig. 14.14c). As the branching ratio Γee/Γ D 0.0594 can be measured indepen-dently, comparison with the data gives Γ D 93.2˙ 2.1 keV, Γee D 5.52˙ 0.14 keV.The smallness of Γ D 93 keV may be recognized by comparing it with hadronic

Page 560: Yorikiyo Nagashima - Archive

14.5 Charm Quark 535

Real Cross sectionWidth (Γ )

Observed Cross section Width (σ)

Experimental ResolutionWidth (σ)

Figure 14.21 When the experimental resolution is much worse than the resonance width,δexp � Γ , the observed shape of the resonance is that of the error function.

J/ψc

c

u

ud

d

π

π+

π0d

dψ(3S)

c

c

c

c

D+

D-(a) (b)

d

d

Figure 14.22 (a) J/ψ ! 3π has disconnected charm lines. (b) In ψ(3770) ! DD the charmlines are connected. By the OZI rule, (a) is heavily suppressed. Compare with Fig. 14.7.

decays of comparable resonances:

� W M D 768 MeV, Γ� D 152 MeV, Γ�!ee D 6.8 keVω W M D 782 MeV, Γω D 8.4 MeV, Γω!ee D 0.6 keVφ W M D 1020 MeV, Γφ D 4.4 MeV, Γφ!ee D 1.37 keV

(14.71)

We see the width of J/ψ is 1000 times smaller than typical hadron widths andalmost comparable with the electromagnetic decay widths.

This was why people were surprised by the discovery of J/ψ. As the mass ofJ/ψ is far larger than p , n, Λ, Σ , etc., it should decay strongly to pions and otherlight particles, giving Γ � O(100) MeV, unless some selection is at work. Indeed,ψ(3S )(3772.92 ˙ 0.35) has a broad width � 27.3 ˙ 1.0 MeV and predominantlydecays to DCD� or D0D

0(Fig. 14.22b). The D particles were discovered in e� C

eC ! DCD�, D0D0

[114, 177, 178, 321] and they are quasistable particles. Their

Page 561: Yorikiyo Nagashima - Archive

536 14 The Quark Model

masses and lifetimes are

M(D˙) D 1869.6˙ 0.2 MeV , τ(D˙) D 10.4˙ 0.07 � 10�13 s (14.72a)

M(D0) D 1864.8˙ 0.17 MeV , τ(D0) D 4.10˙ 0.015 � 10�13 s (14.72b)

These phenomena cannot be explained if these particles contain only the knownquarks u, d, s. They can be understood if J/ψ, ψ(3770) are assumed to be made ofnew quarks cc. The fact that D particles are heavy (m � 2 GeV) and yet quasistablecan be understood if they contain c or c. We now know that DC D cd, D� D cd,D0 D cu, D

0D cu. The c quark has a new quantum number “charm” that is con-

served in the strong interaction, hence it cannot decay strongly to ordinary particlescontaining only u, d, s, but it can decay weakly. J/ψ has total charm quantum num-ber zero and the decay mode conserves the charm quantum number, but drawingquark lines of the decay, the charm line has to be closed (see Fig. 14.22a). This isbecause m( J/ψ) < 2mD and no process satisfying the OZI rule (see Sect. 14.2.3)exists. On the other hand the charm quark line in the decay ψ(3770) ! D D isconnected and hence the OZI rule is satisfied.

In the language of QCD, where the quark lines are disconnected, they are con-nected by gluon lines. Each time a gluon is exchanged, a factor α s/2π � 0.1,where α s D g2/4π, has to be multiplied. For J/ψ three gluons are exchanged(Problem 14.13). Therefore the decay rate is suppressed by (α s/2π)6 compared toordinary strong decays.

Problem 14.13

Explain why there have to be at least three gluons in the intermediate state inFig. 14.22.

14.5.5Lifetime of Charmed Particles

The fourth quark was proposed as early as 1964 based on baryon–lepton symme-try [196, 270],10) but it was not until 1970 that it was taken seriously, when Glashow,Iliopoulos and Maiani showed that the absence of strangeness-changing neutralweak current can be explained if the existence of the charm quark is assumed (GIMmechanism [174], see Sect. 15.7.1 for details). It constitutes a doublet in the weakinteraction paired with the s quark. If D0 is made of the charm quark in a weakdoublet, it can decay predominantly to the s quark by emitting a W C weak boson,

10) The charm quark itself was discovered byNiu [297] prior to the November revolution.His group found a short-lived particlewith mass and lifetime m � 2–3 GeV,τ � 10�14 s, decaying to π0 and a chargedparticle in an emulsion stack placed in an

airplane. The Nagoya group analyzed itregarding it as the fourth quark. However, asit was a cosmic ray event with poor statistics,and the particle identification was notdefinitive, the discovery was largely ignoredoutside Japan.

Page 562: Yorikiyo Nagashima - Archive

14.5 Charm Quark 537

l+

νl

D0 K

u u

sc

W+u

D+

K

d

sc

W+

d

su

d

u

π+

π+

(a) (b)

Figure 14.23 (a) D0 ! K� C lC C ν l , (l D e, μ), where u acts only as a spectator. Herethe decay can be considered as c ! s C lC C ν l . (b) An example of hadronic decay DC !K�πCπC.

producing a variety of decay products (Fig. 14.23). Let us consider a semi-lepton-ic decay D0 ! K� C lC C ν l (l D e, μ). Here, the u quark does not participatein the decay, playing only a spectator role, and the process can be considered asc ! s C lC C ν l , which has identical kinematical structure to μ ! e C ν e C νμ ,differing only in the mass values. Since the lifetime of the muon is /

�G2

F m5μ

��1

[see Eq. (15.84)], the lifetime of D is estimated to be roughly

τD � BR(D ! K�eCν e)�

mc

�5

τμ

� 0.036 � (0.1/1.5)5 � 2.2 � 10�6 s � 10�13 s

(14.73)

which reproduces the experimental value (τD exp D (4.10˙ 0.02)� 10�13 s) approx-imately. This shows that the D meson decays weakly.

14.5.6Charm Spectroscopy: SU(4)

If there are four quarks, the flavor symmetry S U(3) is extended to S U(4). However,the mass of the charm quark � 1.3 GeV11) is far greater than others, and S U(4) isnot considered as a good symmetry. Nevertheless, it is useful in classifying andpredicting the existence of hadrons containing the c quark. We briefly describemultiplets that are generated by S U(4) (see Appendix G for S U(4) algebra). Themeson nonets in S U(3) become 16-plets in S U(4).

Charmed Mesons

4˝ 4� D 1˚ 15 (14.74)

Among the 15 members of 15, we already know eight members, which do notcontain c. The remaining seven members contain a c quark and are called charmed

11) This is the current mass, which is somewhatsmaller than the constituent mass, but not asdrastically small as u, d, s.

Page 563: Yorikiyo Nagashima - Archive

538 14 The Quark Model

mesons. They are

J P D 0�, 1�

8<ˆ:

C D 1 W D0(cu), DC(cd), Ds(cs)

C D 0 W η c � cc

C D �1 W D0(cu), D�(cd), D s(cs)

(14.75)

Their configuration in I3–Y–c space can be expressed as in Fig. 14.24, (a) for 0�and (b) for 1�. All of them have already been discovered.

Charmed Baryons

4˝ 4˝ 4 D 4A ˚ 20MA ˚ 20MS ˚ 20S (14.76)

The two 20S with mixed symmetry correspond to (1/2)C and the totally symmetric20S corresponds to J P D (3/2)C. Consequently, there are 12 charmed baryons with(1/2)C and 10 charmed baryons with J P (3/2)C (Fig. 14.24c,d). ΛC

c , Σc , �c and Ωc

have been observed.

(b)

sD

DD

sD

ρ +ρ

K

*0

K*

*+K*0

D 0*D*

*

*+

*+

cdcucs

usds

su sdud

ucsc

dc

0ρ ωφψJ/

du

K 0*

(a)

sD0D

sDD

0K

π π

+K

– Kuc

scdc

cdcucs

+

D+

+

K 0

usds

su sddu

0D

ηηηc

π 0

ud

C

I

Y

(c)

Ξ +c

Σ ++c

Ξ 0

n pΞ c

0

udc

dssdds

ussuus

uududduds

ssc

uscdscuuc

uccscc

dcc

Ω+cc

Ξ ++ccΞ +

cc

Σ 0c

Ξ

Σ Σ +Λ,Σ 0

Σ +cΛ+

c,

Ω 0c

ddc

(d)

Ω++ccc

Ξ ++ccΞ +

cc

Ω+cc

Σ ++c

Ξ +cΞ 0

c

ΩΞ 0

Σ +

Δ+Δ0Δ

ΣΞ

Δ++

ddcdsc usc

uuc

uuduus

ussdss

udddds

ddd uuu

Σ 0

udccΣ +

Σ 0c

dcc ucc

uds

ssc

scc

sss

Ω 0c

Figure 14.24 Charm spectrum: meson 16 andbaryon 20. (a) 0� (b) 1�. The central planesof (a) and (b) represent C D 0 mesons andexcept for ηc and J/ψ are S U(3) multiplets.

(c) (1/2)C, (d) (3/2)C . The plane at the bot-tom represents C D 0 baryons, which alreadyappeared in S U(3).

Page 564: Yorikiyo Nagashima - Archive

14.6 Color Charge 539

14.5.7The Fifth Quark b (Bottom)

A repeat of almost the same procedure was used to discover upsilon � (bb boundstate) in 1977 [213] shortly after the charm discovery

p C p ! � C X

� ! μ� C μC (14.77)

and later, a series of upsilons were discovered [323] at DESY (Deutsches Elektronen-Synchrotron) and at CESR (Cornell Electron Storage Ring) [24, 71]:

e� C eC ! � (9460) , � 0(10023) , � 00(10355) , � 000(10580) (14.78)

Among them the first three had narrow widths and � 000 had a broad width, indicat-ing mb � 5 GeV. Figure 14.25 shows the first three upsilons.

The level structure of the � ’s is quite similar to that of the charmonium and col-lectively they are called bottomonium. However, corresponding to the larger massmb ' 5 GeV, bound states are formed deep in the potential, and up to � (3S ) arequasistable (Fig. 14.26). For the observed hadrons that include the b quark, refer tothe Particle Data Group table [311].

14.6Color Charge

The discovery of the color degrees of freedom was the key to the true understandingof the strong force. It was also the discovery of the “color charge”. In electromag-

Figure 14.25 �, � 0, � 00 observed in e� eC ! � ’s [24, 71].

Page 565: Yorikiyo Nagashima - Archive

540 14 The Quark Model

=

BB threshold

(4S)

(3S)

(2S)

(1S)

(10860)

(11020)

hadrons

hadrons

hadrons

γ

γ

γ

γ

ηb(3S)

ηb(2S)

χb1(1P) χb2(1P)

χb2(2P)

hb(2P)

ηb(1S)

JPC 0 + 1 1+ 0++ 1++ 2++

χb0(2P)χb1(2P)

χb0(1P)hb (1P)

Figure 14.26 Level structure of bottomonium [311].

netism, the electric force is generated by an electric charge and the magnetic forceby its current. Since current is nothing but a moving charge, the electric and mag-netic forces are unified by the relativity principle. The current is a Lorentz fourvector and its 0th or scalar component generates the electric force and is spin inde-pendent in its rest frame. The space component, a vector, generates the magneticforce, which acts on (couples to) the current (the moving charge) or more generallyangular momentum, including spin. This characteristic is closely related to the factthat the photon is a Lorentz vector. Just like the electromagnetic force, the strongforce (or more rigorously color force to discriminate it from the nuclear force) isgenerated by the color charge and its current. This is why color force acts like theCoulomb force at short distances. Since the characteristics of the two forces arevery much alike, the color force is expected to obey the same mathematical frame-work as that of the electromagnetic force, namely gauge theroy. The difference liesin the number of charges. There are three color charges (R , G, B), which gener-ate the strong force with equal strength. The action (or emission/absorption of theforce carrier, which is referred to as the gluon) changes the color. So we can infer

Page 566: Yorikiyo Nagashima - Archive

14.6 Color Charge 541

that the characteristics of the color force are basically similar to those of the electro-magnetic force with added effects due to color exchange. We have already learnedthat the nuclear exchange force stems from the exchange of pions, which carryisospin. Unlike isospin, whose symmetry is S U(2), color comes in three kinds andits symmetry is S U(3). This is the origin of color S U(3) and, unlike flavor S U(3),it is considered to be an exact symmetry.12)

From the above arguments, it is natural that a gauge theory based on S U(3) hasemerged as a strong candidate for governing the strong interaction. Because ofits similarity to QED (quantum electrodynamics), the mathematical framework ofthe color force is called QCD (quantum chromodynamics). A strict formalism forconstructing the QCD Lagrangian will be given in Chap. 18, but here we show thata lot of insight into the QCD features can be obtained by simple color countingif we restrict our arguments to the lowest order corrections in the gluon emissionor exchange, namely as long as we do not touch the nonlinear aspect of QCD.Much of the elementary dynamics can be obtained using QED formula we alreadyknow with additional color factors. To obtain a glimpse of the power of QCD, weconsider a color exchange potential and how low-lying mass levels of the hadronare reproduced.

The force carrier, the gluon, comes in eight kinds. Let us try to understand why.Analogy with the nuclear force helps because this is also an exchange force obeyingS U(2) symmetry that alters the identity of the force source – p (proton) or n (neu-tron) – by the force action. The nuclear force carrier pion comes in three kinds. Ifwe omit the space-time structure, the nuclear force interaction is expressed as

H D gN

3XiD1

ψτ i

2ψπ i D

gN

2

hp2(p nπC C n p π�)C (p p � nn)π0

i(14.79a)

ψ D�

pn

�, (πC, π0, π�) D

�π1 � i π2p

2, π3,

π1 C i π2p

2

�(14.79b)

where τ i is the Pauli matrix. The pion field operators (πC, π0, π�) have the sameisospin function as fp n, (p p � nn)/

p2, n pg and change the isospin component of

the force source ψ. For instance, absorption of πC changes n to p (i.e. πC � p n).Actually there are four combinations to change isospin components (p , n) to (p , n),but the combination (p p C nn)/

p2 is totally neutral (isospin singlet) and does not

count. Therefore, the number of force carriers is 22 � 1 D 3. In mathematicallanguage, the pion belongs to the regular representaion of the S U(2) group. Incorrespondence to the isospin exchange hamiltonian, one that generates the colorexchange interaction can be expressed as

H D gs

8XiD1

ψλ i

2ψA i , ψ D

24R

GB

35 (14.80)

12) At long range, the confinement force becoms effective, which is a nonlinear effect of the colorcharge.

Page 567: Yorikiyo Nagashima - Archive

542 14 The Quark Model

λ i is the Gell-Mann matrix [Eq. (14.8)] and is the regular representation of S U(3).A i is the force field, referred to as the gluon. There are 32�1 D 8 gluons (A i). Theabsorption or emission of the gluon changes the color charge or the color state ofthe quark (R $ G $ B $ R), just as the pion changes p $ n. The gluon is aneight-component vector in color space, just as the pion is an isovector in isospinspace.

Unlike the pion, the gluon is a Lorentz vector, like the photon, and its spatialpart induces the magnetic interaction. This is the origin of color and spin exchangeforces, which will be described shortly.

14.6.1Color Independence

We show the color independence of the quark–gluon coupling strength justlike the charge independence of the pion–nucleon coupling (recall argumentsin Sect. 13.3.1).

We first reformulate the gluon fields to grasp an intuitive image of the gluon’srole. Just as we used (πC, π0, π�) instead of abstract (π1, π2, π3), we define gluons,gi ’s (i D 1, . . . , 8), as follows. Omitting the space-time part of the field subscriptsfor simplicity, we write down the Gell-Mann matrices explicitly:

�LQCD D gs

Xψ(λa/2)ψA a

Dgsp

2

�R G g1 C G R g2 C

1p

2(RR � G G )g3 C R B g4

CB R g5 C G B g5 C B G g7 C1p

6(RR C G G � 2BB)g8

�(14.81a)

where

g1 D1p

2(A 1 � i A 2) , g2 D

1p

2(A 1 C i A 2) , g3 D A 3

g4 D1p

2(A 4 � i A 5) , g5 D

1p

2(A 4 C i A 5) ,

g6 D1p

2(A 6 � i A 7) , g7 D

1p

2(A 6 C i A 7) ,

g8 D A 8

(14.81b)

The ga ’s are redefined gluon fields. Equation (14.81a) means, for instance, g1 hasthe ability to change its color G ! R if absorbed by G, in other words, as far as thecolor degrees of freedom are concerned, it plays an equivalent role to R G . This isanalogous to πC being � n p .

Let us calculate the strength of the force that acts between two quarks using theLagrangian in Eq. (14.81a). First of all, between R R we can draw two Feynman di-agrams, shown in Fig. 14.27a,b. Here, at the vertices R is annihilated and recreated

Page 568: Yorikiyo Nagashima - Archive

14.6 Color Charge 543

(d) (e)

g'

g'√ 2

1 g'√ 6

1 g'√ 6

1 g'√ 6

2 g'√ 6

2g'√ 2

1

√ 21

√ 61

√ 61

g'

g3= (RR GG) g8= (RR + GG 2BB) g8= (RR + GG 2BB)

g8= (RR + GG 2BB) g4= RB g5= RB

(a) (b) (c)

R R

RR

R R

RR

R B

BR

R B

RB

B B

BB

g3 g8 g8

g8 g4, g5g'

√ 62g'

√ 61

√ 61

Figure 14.27 Color independence, i.e. universality of the quark–gluon coupling strength.

(i.e. R ! R � R R). Among the gluons in Eq. (14.81b), one sees that g3 and g8

couple to R R. The strength of the g3, g8 exchange is given by

g0 2

"�1p

2

�2

C

�1p

6

�2#D

23

g0 2 (14.82)

where g0 D gs/p

2. Next we calculate the strength of the B B force (Fig. 14.27c).Looking at Eq. (14.81a), we see only g8 couples to B B , consequently

g0 2 �

��

2p

6

�2

D23

g0 2 (14.83)

This has the same strength as the R R force. Considering the universality of theforces acting between R , B, G , it is logical to obtain the same strength. Next weconsider the R B force. First, in addition to the contribution (Fig. 14.27d) due to g8,which couples to both R and B, we have to consider the R B exchange force due tog4, g5 (Fig. 14.27e). Combining the two

g0 2��

1p

6

��

��

2p

6

�C (1) � (1)

�D

23

g0 2 (14.84)

one sees that the universality is maintained. Next, let us consider the force in asinglet meson (RR C G G C B B)/

p3.

Page 569: Yorikiyo Nagashima - Archive

544 14 The Quark Model

g'

g4= RB g5= RB

(a) (b) (c)

B

RR

g8= (RR + GG 2BB)B B

BB

g8 g4, g5g'

g6= GB g7= GBBB

GG

g6, g7

B B

g'g'g'√ 6

2 g'√ 6

2

√ 61

Figure 14.28 Gluon exchanges between BB in a color singlet meson.

Since each force that acts on the three pairs is equal, we only consider an ini-tial state B B and multiply by 3. But as the weight of R R in the singlet meson is(1/p

3)2 D 1/3, all we need to do is to calculate the BB contribution. Consider-ing B B can be converted to B B, R R, G G , there are three diagrams to consider(Fig. 14.28). As the gluon is a vector and changes its sign for the antiparticle, wehave three contributions from each diagram:

g0 2��

2p

6

���

2p

6

�C (1)(�1)C (1)(�1)

�D �

83

g0 2 D �43

g2s (14.85)

The formula tells us that there is an attractive force between the quarks forminga singlet meson. This is the strength of the color force that acts between q and q,corresponding to that of the Coulomb force in QED. Writing α s D g2/4π, we have

43

α s (color force in QCD)$ α (Coulomb force in QED) (14.86)

Now we have reproduced the coefficient of the Coulomb-like force that was intro-duced in Eq. (14.60) to explain the charmonium spectrum.

14.6.2Color Exchange Force

The calculations we have done are intuitive and easy to understand but rather com-plicated. In general, we rely on group theory. The force acting on qq or qq can beexpressed as a potential. Expressing it in scattering matrix form, we have

M(c C c0 ! b C b0) � g2s

XA

˝bb0j

�ψ b(λA/2)bc ψc

�ψ b0 (λA/2)b0 c0 ψc0

jcc0˛

� g2s

XA

hbb0jF1AF2Ajcc0i

(14.87)

where we have suppressed all irrelevant factors except color indices. F1A, F2A arerepresentation matrices of S U(3) generators that act on particles 1 and 2. If we take

Page 570: Yorikiyo Nagashima - Archive

14.6 Color Charge 545

the sum over color indices A, Eq. (14.87) reduces to*XA

F1AF2A

+� hF 1 � F2i D

12h(F1 C F 2)2 � F2

1 � F 22i

D12hF 2 � F2

1 � F22i

(14.88)

which is a Casimir operator of S U(3) [see Eq. (G.50) in Appendix G]. This is theS U(3) extension of the isospin exchange force τ1 � τ2 [see Eq. (13.27)]. If we assumethat three-body forces are the sum of two-body forces, the sign and strength of itspotential are proportional to

CF DXi< j

hF i � F j i D12

*F2 �

3XiD1

F 2i

+F D F 1 C F2 C F3 (14.89)

Values of CF [refer to Eq. (G.51)] are given in Table 14.6. By surveying the table, onenotices that a strong attractive force acts on color singlets and qualitatively repro-duces the feature that only color singlets are observed in nature. The force actingon qq[3�] has to be attractive, being a part of qqq and making a singlet by couplingwith q[3]. The coupling among qqqq[3] is the same as for qqq[1], meaning thatattaching an extra q to qqq does not give a stronger attraction. Values of the forcestrength given in Table 14.6 are based on one-gluon exchange only. They cannotbe considered as quantitatively correct as they do not take into account nonlinearcouplings, which are special features of QCD. Nevertheless we are pleased to seethat they give qualitatively correct results.

14.6.3Spin Exchange Force

Before we go into the details of the color force, we review the space-time structureof the force. From what we have studied so far, the color force acting on the quarks

Table 14.6 Color factorPhF j � F j i.

Configuration Multiplets Strength

qq 1 �4/3

qq 8 C1/6

qq 3� �2/3qq 6 C1/3

qqq 1 �2

qqq 8 �1/2qqq 10 +1

qqqq 3 �2

Page 571: Yorikiyo Nagashima - Archive

546 14 The Quark Model

is like the Coulomb force, at least at distances comparable with the hadron size.This suggests that we may acquire some insight into the level structure by studyingthat of the hydrogen atom, which is a bound state as a result of the Coulomb force(Fig. 4.1). The levels are obtained by solving the Dirac equation, but the principalpart can be obtained from the Schrödinger equation with a central force α/r to give

En D �α2

2μn2

n D 1, 2, . . . (14.90)

where μ is reduced mass. For hydrogen m1 D m p m2 D me and μ ' me.The solutions of the Schrödinger equation having the same principal quantum

number n but different L and spin have the same energy, for instance (2S,2P),(3S,3P,3D), . . . are degenerate. Solving the Dirac equation, which takes into ac-count spin–orbital angular momentum interaction (LS force) and spin–spin inter-action (S S force), we find the levels split. Splitting due to the LS force is calledthe fine structure, and that due to spin–spin coupling the hyperfine structure. TheLS force arises from the magnetic interaction between the magnetic moment μL

produced by the circulating electron and the electron spin magnetic moment μ e .The Hamiltonian is given by

HLS � �μL � μ e D �

�e

2meL��

�ge

e2me

s e

�� �

αm2

eL � s (14.91)

The hyperfine structure arises from the spin–spin interaction, which is a magneticinteraction produced by spin magnetic moments of the electron and proton:

Hss � �μ e � μ p D

�ge

e2me

s e

��

�g p

e2m p

s p

��

αme m p

s e � s p (14.92)

The interactions exchange the spin component of the interacting particle and arecalled spin exchange forces. As is clear from Eqs. (14.91) and (14.92), the differentmagnitude of the level splitting between the fine and hyperfine structures stemsfrom the different masses of the electron and proton. In fact, the positroniumshows the same structure as the hydrogen atom, but m p is replaced by me, thusthe LS and S S forces generate approximately the same magnitude of splitting.Calculation of the hyperfine splitting using quantum mechanics gives (see for in-stance [63])

Ess D8π3

e i e j

mi m jjψ(0)j2 s i � s j (14.93)

ψ(0) is the value of the wave function at the origin. Since

s i � s j D12

h(s i C s j )2 � s2

i � s2j

iD

12

[s(sC1)� s i (s iC1)� s j (s jC1)] (14.94)

Page 572: Yorikiyo Nagashima - Archive

14.6 Color Charge 547

where s is the total spin, the spin exchange force produces the energy differencebetween different s, hence the hyperfine splitting.

Since the gluon is a Lorentz vector just like the photon, we expect the same kindof forces to be generated by the gluon. That is to say, the spin-independent colorelectrostatic force is expected to generate the principal level splitting differentiatedby n (D 1, 2, . . .) and the color magnetic force is expected to generate the fine struc-ture, which is different depending on the value of L and S (not to be confused withthe strangeness). The LS force is absent for L D 0 states.

14.6.4Mass Formulae of Hadrons

Now we extend the level splitting force to include the color exchange. When theS U(3) operator is written as F (D F1, F2, � � � , F8), the gluon exchange includes bothcolor and the (real) spin exchange force. The Hamiltonian of the two-quark inter-action is expected to have the form

Hcolor � (F a � F b)(sa � sb)

F a � F b �

8Xi

Fai Fbi , sa � sb �

3Xi

sa j sb j(14.95)

Comparing this with the interaction of the hydrogen atom, we may interpret it asthe color magnetic moment interaction that is generated by three color charges.Considering the mass and color charge strength, we see the Hamiltonian will havethe form

H D H0 � AXab

α s

ma mb(F a � F b)(sa � sb) (14.96)

where ma , mb are the quark masses and α s D g2s /4π is the color coupling con-

stant. In the above expression, we considered only low lying states with orbitalangular momentum L D 0, hence there is no LS force and we are dealing with hy-perfine splittings. H0 generates the confining force, which has principal quantumnumbers. The color exchange term (F a � F b) was evaluated in Table 14.6. Since allthe hadrons are assumed to be color singlets, constituent quarks in the meson onwhich the operator (F a � F b) acts are in a singlet state. For baryons, it acts on a pairout of three quarks, but as the pair forms a singlet with the remaining q (3), thequark pair belongs to 3�. We obtain

(F a � F b) D

(�4/3 meson

�2/3 baryon(14.97)

Page 573: Yorikiyo Nagashima - Archive

548 14 The Quark Model

If the quark masses are degenerate (mu D md D ms), we can sum over spincomponents. For mesons

meson: (sa � sb) D

(�3/4 0�

1/4 1� (14.98a)

and for baryons

baryon:Xab

(sa � sb) D

(�3/4 (1/2)C

3/4 (3/2)C (14.98b)

Therefore, the mass value differs depending on the spin. The above calculationgives m(1�) > m(0�), m(3/2) > m(1/2) and the relative separation is given by

m(1�)� m(0�) > m(3/2) � m(1/2) (14.99)

which is consistent with observations. If there is no color exchange, the levels formesons or baryons are inverted, contradicting the observations.

Problem 14.14

Derive Eqs. (14.98).

If we assume that the masses of hadrons and quarks are large compared to theirbinding energy (nonrelativistic approximation), the hadron mass in the first ap-proximation [H0 in Eq. (14.96)] can be expressed as a sum over quark masses.Then we have

baryon W mB D m1 C m2 C m3 C B�

(s1 � s2)m1m2

C(s2 � s3)m2m3

C(s3 � s1)m3m1

DX

mi CX

i j

Bmi m j

(s i � s j ) (14.100a)

meson W mM D m1 C m2 CC

m1m2(s1 � s2) (14.100b)

For baryons there are a (1/2)C octet and a (3/2)C decuplet and considering theisospin degeneracy (mu ' md ), there are eight levels. Here, as an example, wecalculate the mass of Λ(1/2) � Σ (1/2) � Σ �(3/2), which contain an s quark. Weset 1 D s, 2, 3 D u, d, and m2 D m3 D mu D md , m1 D ms . Note, the totalwave function including both flavor and spin should be totally symmetric. Sincethe quarks 2 and 3 make I D 1 (Σ , Σ �) and I D 0 (Λ), the spin wave functionsmust be s D 1 (I D 1) and s D 0 (I D 0), respectively. For

Xi j

s i � s j

mi m jD

s2 � s3

m2uC

s1 � (s2 C s3)mu ms

(14.101)

Page 574: Yorikiyo Nagashima - Archive

14.6 Color Charge 549

we obtain the following:

2(s2 � s3) D (s2 C s3)2 � s22 � s2

3 D s23(s23 C 1) � 3/2

(s2 � s3) D

(1/4 s23 D 1 (Σ , Σ �)

�3/4 s23 D 0 (Λ)(14.102a)

s1 � (s2 C s3) D (1/2)˚

s2 � s21 � (s2 C s3)2�

D (1/2) fs(s C 1) � 3/4 � s23(s23 C 1)g

s(s C 1) D

(15/4 s D 3/2 (Σ �)

3/4 s D 1/2 (Σ , Λ)

s23(s23 C 1) D

(2 s23 D 1 (Σ , Σ �)

0 s23 D 0 (Λ)

) s1 � (s2 C s3) D

8<ˆ:

1/2 (Σ �)

�1 (Σ )

0 (Λ)

(14.102b)

Inserting Eqs. (14.102) in Eqs. (14.101) and (14.100a) we obtain

m(Σ �) D 2mu C ms C B�

14m2

uC

12mu ms

�(14.103a)

m(Σ ) D 2mu C ms C B�

14m2

u�

1mu ms

�(14.103b)

m(Λ) D 2mu C ms C B��

34

1m2

u

�(14.103c)

Other baryons can be calculated similarly to get

m(Δ) D 3mu C B�

34

1m2

u

�(14.104a)

m(N ) D 3mu C B��

34

1m2

u

�(14.104b)

m(� �) D mu C 2ms C B�

14m2

sC

12mu ms

�(14.104c)

m(� ) D mu C 2ms C B�

14m2

s�

1mu ms

�(14.104d)

m(Ω ) D 3ms C B�

34

1m2

s

�(14.104e)

There are eight equations with three adjustable parameters. The mass formulaefor mesons can be calculated in a similar manner. We compare our results withobserved values in Fig. 14.29 and Table 14.7.

Page 575: Yorikiyo Nagashima - Archive

550 14 The Quark Model

(a) (b) (c)800

1500

1700

1000

Mas

s (M

eV)

1+ 3+

2 2

3mu

2mu + ms

mu + 2ms

3ms

N/Δ(1089)

Σ(1264)

Ξ(1439)

Ω(1614)

N(939)

Λ(1116)Σ(1193)

Ξ(1318)

Δ (1232)

Σ(1384)

Ξ(1533)

Ω (1672)

Figure 14.29 Levels of the baryon splitting due to spin–spin coupling [101]. (a) Levels due toquark mass splitting, (b) (c) spin-dependent force resolves the degeneracy.

Table 14.7 Mass values calculated including color–color and spin–spin couplings [296].

Baryon Theory Exp. Meson Theory Exp.

Ω 1682 1672 φ 1032 1020 For baryons� � 1529 1533 K� 896 892 mu D md D 363 MeV

Σ � 1381 1384 ω 780 783 ms D 538 MeV

Δ 1239 1232 � 780 776 B D 200 m2u MeV

� 1327 1318 η 559 549 For mesons

Σ 1179 1193 K 485 496 mu D md D 310 MeVΛ 1114 1116 π 140 138 ms D 483 MeV

N 939 939 C D 640 m2u MeV

Problem 14.15

Derive Eqs. (14.104).

Problem 14.16

Calculate meson masses using Eq. (14.100b).

The quark masses determined from baryons and mesons may seem different,but considering the simple approximation we used, they are surprisingly close.Note that those determined from baryon magnetic moments [mu D 338 MeV,md D 322 MeV, ms D 510 MeV, see Eq. (14.57)] are in between. These mass for-

Page 576: Yorikiyo Nagashima - Archive

14.6 Color Charge 551

mulae agree with the Gell-Mann–Okubo mass formula extended to S U(6) [192]

m D AC B Y C C [I(I C 1) � Y 2/4]C D S(S C 1) (14.105)

to first order in power expansion of Δm D ms � mu.

Mass Formula Including Orbital Angular Momentum We have to include the spin–orbit coupling (LS force) to calculate the masses of states with orbital angular mo-mentum L > 0. A representative formula, given by [117], has the form

H D L(r1, r2, r3)CX�

mi Cp2

i

2mi

��

23

α s

XSi j (14.106a)

Si j D1r�

12mi m j

� p i � p j

rC

(r � p i )(r � p j )

r3

�1

mi m j

�8π3

(s i � s j )δ3(r)C1r3

˚3(s i � Or)(s j � Or) � (s i � s j )

��

�1

2r3

"1

m2i

(l i � s i ) �1

m2j(l j � s j )C

1mi m j

˚(l i � s j ) � (l j � s i )

�#

C � � � � � � (relativistic correction) (14.106b)

where Or D r/jr j, l D r � p . The first term of Eq. (14.106a) gives the confinementpotential and phenomenologically a harmonic oscillator potential is often used.The second term is the mass and kinetic energy term and the third term Si j is

mas

s (M

eV)

2200

2100

2000

1900

1500

1400

1600

1700

1800

N*1/2–Δ*1/2–Λ*1/2–Σ*1/2–Ξ*1/2–Ω*1/2–N*3/2–Δ*3/2–Λ*3/2–Σ*3/2–Ξ*3/2–Ω*3/2–N*5/2–Λ*5/2–Σ*5/2–Ξ*5/2–

Figure 14.30 A comparison of the baryon mass prediction and experimental values based onthe nonrelativistic quark model [214]. Hatched boxes are observed data of resonance massesand the black bars represent the prediction.

Page 577: Yorikiyo Nagashima - Archive

552 14 The Quark Model

the “Breit interaction” (see for instance [58]) commonly used in nuclear physics.The second term in the first line of the Breit interaction is referred to as the Dar-win term and the third term as the zitterbewegung, the second line represents thespin–spin interaction and the third line represents the spin–orbit interaction. Phe-nomenologically, however, it is known to reproduce the observed baryons’ massand width when the spin–orbit interaction is neglected and only the spin–spin in-teraction is included. In the previous calculation, we used the first term in the spin–spin interaction term. As an example, Fig. 14.30 [204, 214] shows a comparison ofthe calculation with the observed baryon mass spectrum with L D 1 and negativeparity.

With Λ�(1/2)� as the only exception, the agreement is remarkable. The numberof parameters was five.

It is not well understood why a nonrelativistic potential can reproduce the ob-served data so well.13) However, the phenomenological success in hadron spec-troscopy can certainly be considered strong evidence for the validity of the quarkmodel and QCD.

13) Because of the asymptotic freedom, which will be explained in Chap. 18, the high-energyphenomena of QCD are fairly well understood, but the low-energy part of QCD needsnonperturbative approaches and remains a little-explored area to this day.

Page 578: Yorikiyo Nagashima - Archive

553

15Weak Interaction

All hadrons and leptons are known to take part in the weak interaction, but its effectis felt only when the strong and electromagnetic interactions are forbidden by someselection rules. Since weak interaction phenomena were hard to reproduce experi-mentally, the progress in clarifying the weak process was slow at the beginning. Butits very weakness made the perturbative approach effective and clear connectionsof theoretical ideas with observations were possible. Fermi’s theory, which was pro-posed in 1934 and adapted to V–A interaction in 1957 to take parity violation intoaccount, reproduced observed experimental data well. However, it is based on aneffective Lagrangian valid only in first order and, lacking self-consistency, has nopredictive power for higher order corrections. Nonetheless, its similarity with theelectromagnetic interactions inspired the pursuit of a self-consistent mathematicalframework modeled on QED – the only self-contained gauge theory at that time– and finally led to the unified theory of the electromagnetic and weak forces.

To help comprehend the logical flow that is described in the following, we startfrom known basic principles of the weak interaction, which serve as backgroundknowledge, then follow in the footsteps of our great leaders, because it is instructiveto learn how they gained insight from observations. It is especially illuminating tounderstand the similarities to and the differences from the electromagnetic forceat each step. The purpose is to learn heuristic methods as well as to study the factsand logic of elementary particle physics in the weak force world.

15.1Ingredients of the Weak Force

The weak force is generated by the “weak charge”, just as the electromagnetic forceis generated by the electric charge. This means there is a “weak static force” aswell as a “weak magnetic force”, the former produced by the static weak chargeand the latter by its current, i.e. the moving charge. Correspondingly, the forcecarrier (denoted as W μ) is a four-vector and couples to a four-current j μ

W (weakcurrent and HWI D gW j μ

W Wμ) generated by the motion of the weak charge carrier(i.e. quarks and leptons) just as the photon is a Lorentz vector that couples to the

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 579: Yorikiyo Nagashima - Archive

554 15 Weak Interaction

electric current (HEM D e j μe A μ). The coupling constant gW is the strength of the

weak charge, corresponding to the unit electric charge e.The first major difference is that the weak charge comes in two kinds, to be de-

noted temporarily as Q W1, Q W 2, which we call the “weak isospin doublet”. Forexample, the electron–neutrino has Q W1 and the electron has Q W 2, and togetherthey form a weak doublet (ν e , e�). The term “weak doublet” is used to distinguishit from the isospin doublets that appeared in the strong interaction. The namingis not accidental, though. The generators of the nuclear force, the proton and neu-tron, constitute an isospin doublet (p , n) and change their identity by the nuclearforce action (referred to as the exchange force), because the force carriers (i.e. thepions) are charged. Like the pion, the W boson also comes in three kinds and con-stitutes a weak isospin triplet (W C, Z0, W �).1) Since the lepton is a fermion andfermion number is known to be conserved, we assign lepton number 1 to all lep-tons and �1 to all antileptons. The sum of the lepton number is conserved in theweak as well as in other interactions. The action and reaction of the weak force(absorption and emission of the W) change the identity of the weak charge carri-ers. For instance, by emitting W C (or by absorbing W �), ν e changes into e�. Thesix known leptons are classified into three doublets. A notable feature of the weakforce is that it changes the lepton only into itself or into the other member of thesame doublet. Namely, the three doublets in the lepton family (ν e , e�), (νμ , μ�),(ντ , τ�) are independent. They do not mix with each other, which is referred to aslepton flavor conservation.2)

Quark doublets:�

ud

� �cs

� �tb

�Lepton doublets:

�ν e

e�

� �νμ

μ�

� �ντ

τ�

The situation in the quark sector is more complicated. Its known six members alsoconstitute three doublets in the weak interaction. However, mixing among them,known as CKM (Cabibbo–Kobayashi–Maskawa) mixing, occurs.

The second major difference between the weak and electromagnetic forces istheir reach. The force range r0 is determined by the Compton wavelength of theforce carrier. Since the masses of W ˙ and Z0 are

MW D 80.40˙ 0.025 GeV , MZ D 91.186˙ 0.002 GeV (15.1)

r0 is � 2 � 10�18 m. For practically all reactions we deal with in this chapter, theenergy in a process is much smaller than mW . This means that the mass of W, Zcan be regarded as infinite or equivalently the force range as zero. Namely, the

1) The reason for denoting the neutral memberof the W family not as W 0 but as Z0 isbecause it is a mixture of W 0 with thephoton, changing its identity and at the sametime acquiring a different mass as well ascoupling strength from its charged partners.

But in this chapter the distinction of W 0

from Z0 does not matter.2) The neutrino oscillation, discovered in 1998

(to be discussed in Sect. 19.2.1), mixes them,but mixing among the charged leptons hasnot been observed. In this book we assumelepton flavor is conserved.

Page 580: Yorikiyo Nagashima - Archive

15.2 Fermi Theory 555

weak force is effective only when the participating particles are in direct contactwith each other.

15.2Fermi Theory

15.2.1Beta Decay

Neutrino Hypothesis The weak interaction was first discovered in nuclear beta ()decays back in 1896 by Becquerel. In alpha (α) and gamma (γ ) decays discovered atthe same time, the emitted particles had a monochromatic energy spectrum, whichagreed with the difference of the energy between the parent and the daughter nu-clei. However, in decay, the energy of the emitted rays, i.e. of the electrons,was continuous and there was no sign of other particles. Namely, the law of con-servation of energy seemed to be violated. Bohr was willing to abandon the lawof conservation of energy, but Pauli proposed in 1930 that a yet unknown parti-cle (neutrino) carries away the missing energy. In 1934, taking into account Pauli’sneutrino hypothesis, Fermi proposed the first weak interaction theory to explainthe phenomenon. Its formalism was made in analogy with the electromagnetic in-teraction. The gamma decay process, in which a proton in a nucleus emits a photon(Fig. 15.1a) can be described using an interaction Lagrangian

LEM D �q j μA μ D �e p γ μ p A μ3) (15.2)

Here, p , p , A μ are field operators of the proton and the photon. Namely, in thequantum field theoretical interpretation, the operator p annihilates a proton andA μ creates a photon and at the same time p recreates a proton. The decay is aprocess in which a neutron in the nucleus is converted into a proton and at thesame time emits an electron–neutrino pair instead of the photon (Fig. 15.1b):4)

n ! p C e� C ν e (15.4)

In terms of the field operators, the process can be described by a four-fermioninteraction Lagrangian

L D �G( j Ch )μ( j �

l )μ D �G(p γ μ n)(eγμ ν)C h.c. (15.5)

3) The proton is not a Dirac particle, as it hasa large anomalous magnetic moment and isspatially extended. Therefore, interactionswith the anomalous magnetic momentμ p D g p e/2m p

LDg p � 2

2e

2m pσμν (@μ A ν � @ν A μ ) (15.3)

have to be included from the phenomeno-logical viewpoint. Equation (15.2) is to beconsidered as a starting point.

4) It is actually an antineutrino in modernterminology.

Page 581: Yorikiyo Nagashima - Archive

556 15 Weak Interaction

γ uL eLνL

W-

p

p

p

n

e

ν

dL

GFe

e(pγ μp)Aμ

_GF(pγ μn)(eγμν)

_ _

Photon Emission β decay (Fermi) Standard Model

PropagatorQ2

Q2 +

<<

mW2

mW2

gW2

tx

(a)

GF ~gW

2

mW2

gWgW

(b) (c)

Figure 15.1 The weak interaction was con-ceived in analogy with the electromagneticinteraction. (a) A field theoretical illustrationof a photon emission. (b) Fermi interaction:Instead of γ , an (e� νe) pair is emitted. (c)

The decay in the Standard Model. A d quarkin the neutron is converted to a u quark withthe emission of e� νe . Note the W ˙ cou-ples only to left-handed particles, hence thesubscript L.

where h.c. stands for the hermitian conjugate and is added to make the Lagrangianhermitian. We omit the subscript e from ν in the four-fermion interaction untilthe difference between ν e and νμ is discovered. This was the first application ofHeisenberg and Pauli’s quantized field theory. In the case of photon emissions,quantum field theory could be considered as a reformulation of known formalism,but in the decay the electron and the neutrino were literally created inside thenucleus. G is called the Fermi coupling constant and is a measure of the weakforce strength. The Lagrangian is a product of two currents, hadronic and lepton-ic, and is called current–current coupling. To distinguish them from the electro-magnetic current, they are called weak currents. The subscripts h,l attached to thecurrents are abbreviations of hadronic or leptonic. Unlike the electromagnetic cur-rent, which does not change the identity of particles, the weak currents change theelectric charge [n ! p and ν ! e� or (vacuum! e�ν)] and are called chargedcurrents. In this sense the electromagnetic current is a kind of neutral current.

In the Standard Model, the charged current couples to W ˙:

LWI D gW

hj C μ

h W Cμ C j � μ

l W �μ

iC h.c.

D gW

h(uL γ μ dL) W C

μ C (eLγ μ ν eL) W �μ

iC h.c. (15.6)

Here, W C(�)μ is a field operator to annihilate (create) W C(�) bosons. Beta decay in

the Standard Model is the two-step process depicted in Fig. 15.1c, whose effectiveLagrangian can be expressed as [see for instance Eq. (7.12)]

LSM, 2nd order D g2W (uLγ μ dL)

gμν

q2 � m2W

(eL γ ν ν eL) (15.7)

where q2 D (pd � pu)2 is the momentum transfer squared. Since the energy differ-ence of the nucleus before and after the decay is a few MeV and can be neglected

Page 582: Yorikiyo Nagashima - Archive

15.2 Fermi Theory 557

compared to mW D 80 GeV, the Lagrangian can be rewritten as

LSM, 2nd order ' �G (uL γ μ dL)�eL γμ ν eL

�, G D

g2W

m2W

(15.8)

which essentially reproduces the Fermi interaction, except that only left-handedparticles participate in the Standard Model, as is indicated by subscript L to thefield operators. The left-handed nature of the weak interaction will be explainedlater in connection with parity violation. The Lagrangian of this type is called thefour-fermion interaction.

Allowed Transitions To obtain the rate of decay, it is necessary to calculate thetransition amplitude from an initial to a final state:

T f i D G

Zd4 xhp jp(x )γ μ n(x )jniheνje(x )γμ ν(x )j0i (15.9)

For a while, we follow the historical development, treating the proton and neutronas elementary particles. If we consider them as represented by plane waves, thenucleonic part of Eq. (15.9) is

hp j j hμ(x )jni � u(p 0)γ μ u(p )e�i(p�p 0)�x (15.10)

where p , p 0 are four-momenta of the proton and the neutron. As the Q value (en-ergy difference before and after the reaction) of the decay is at most a few MeVand overwhelmingly small compared to the nuclear mass, the recoil of the nucleuscan be neglected. Referring to Table 4.2, only the time component survives in thenonrelativistic limit and is proportional to 1:Z

d3x hp jp(x )γ μ n(x )jni ' δμ0u†(p 0)u(p )δ3(p0 � p) � δμ0h1i (15.11)

In practice, the proton and neutron are in a nucleus. The expectation value inEq. (15.11) should be calculated not by using the free nucleon wave functions butby using initial and final wave functions of the nucleus. As the operation includesa change of a neutron to a proton, the operator is not exactly 1 but 1τC, whereτ˙ D (τx ˙ i τ y )/2 is the raising (lowering) operator of the isospin.

As for the electron and neutrino, they must be treated relativistically becausetheir mass is much smaller than typical -decay energies. Namely, the leptoniccurrent is expressed as

heν ej j μe (x )j0i � u(pe)γ μ v (pν)e i(p eCpν)�x (15.12)

Therefore, the transition matrix element becomes, after integration over the timevariable,

T f i D 2πδ(Δ � Ee � Eν)G

Zd3x Φ �

F (x )ΦI (x )e�i(p e Cp ν )�x

��u(pe)γ 0v (pν)

Δ D M(Z, A)� M(Z C 1, A)

(15.13)

Page 583: Yorikiyo Nagashima - Archive

558 15 Weak Interaction

where ΦI , ΦF are wave functions of the nucleus before and after the decay, re-spectively, and Δ is the mass difference of the parent and daughter nucleus. Asjp ej, jp νj . a few MeV, the de Broglie wavelength of the electron and neutrino is� 10�13 m, which is much larger than the nuclear size � 10�14 m. This means(p e C p ν ) � x � 1 in the overlap region of the nuclear wave functions. Thereforethe higher powers in the expansion

e�i(p e Cp ν )�x D 1� i(p e C p ν) � x C � � � (15.14)

can be neglected in the first approximation. The allowed processes under this con-dition are called “allowed transitions”. If some selection rule is effective and thefirst term does not contribute, one needs to consider the higher order terms (for-bidden transition), but we do not discuss them here as they are more relevant fornuclear physics. For the allowed transition, the nuclear part

h1i DZ

d3x Φ �F ΦI (15.15)

is separated from the leptonic part and is a constant that does not include anydynamical variables.

ft Value Calculating the decay rate from Eq. (15.13), we have

dΓ D G2jh1ij

2Xspin

L002πδ(Δ � Ee � Eν)d3 pe

(2π)32Ee

d3 pν

(2π)32Eν(15.16a)

Xspin

L00 DX

r,s

ur (pe)γ 0vs(pν)v s(pν)γ 0u r(pe) D Tr�6pe γ 0 6pν γ 0

D 4�2p 0

e p 0ν � g00(pe � pν)

D 4Ee Eν(1C ve cos θ ) (15.16b)

where ve D pe/Ee is the electron velocity and θ is the angle between the electronand the neutrino.

Pspin Lμν is given in Eq. (6.143). Integration over the phase space

gives

δ(Δ � Ee � Eν)p 2ν d pν D pν Eν

pν Eν D (Δ � Ee)2 if mν D 0(15.17)

which gives

dΓ D G2jh1ij

2(1C v cos θ )1

(2π)5 p E(Δ � E )2dE dΩe dΩν (15.18)

Here, we have put p D pe, E D Ee, v D ve . From the expression we know thatthe electron energy spectrum is wholly determined by the phase space volume. Wedefine

f �Z Δ

me

F(Z, E )p E(Δ � E )2dE (15.19)

Page 584: Yorikiyo Nagashima - Archive

15.2 Fermi Theory 559

F(Z, E ) is a correction factor due to the electron Coulomb wave function and is aknown function [280]. When the electron is nonrelativistic it is approximated by

F(Z, E ) '2πZ α/v

1 � exp(�2πZ α/v )(15.20)

For light to medium elements F � 1, f grows in proportion to� Δ5 and for a smallchange of Δ (i.e. Q-value of the decay) the lifetime changes drastically from 10�8

to 100 s. If we use the half lifetime t1/2 D ln 2/Γ , the total decay rate is written as

Γ DZ

dΓ DG2

2π3 jh1ij2 f D

ln 2t1/2

(15.21)

We define the f t value by

f t1/2 D2π3 ln 2G2

jh1ij2

(15.22)

As f, jh1ij are calculable using the nuclear shell model and t1/2 is an observable, wecan determine G .

Fermi and Gamow–Teller Transitions In allowed transitions, the orbital angularmomentum of the lepton pair relative to the nucleus vanishes and if there is nospin flip in the nuclear transition; the angular momentum transfer from the initialto the final state of the nucleus is Δ J D 0 with no parity change. This is calledthe Fermi transition. In actual decays, transitions with the angular momentumtransfer equal to one (Δ J D 0,˙1, but 0! 0 is forbidden5) ) with no parity change(Gamow–Teller transition) have been observed, hence the original Lagrangian pro-posed by Fermi was not sufficient to explain all the observed decays. Further-more, parity violation was discovered in 1957 and its effect must also be taken intoaccount. The most general form of the four-fermion interaction Lagrangian for decay can be written as follows:

L D �G

X�(p Γ i n)

�Ci eΓi ν C C 0

i eΓi γ 5ν�C h.c.

(15.23)

Γ i D 1(S) , γ μ(V) , σμν(T) , γ μ γ 5(A) , i γ 5 (P)

where S, V, T, A and P represent scalar, vector, tensor, axial vector and pseudoscalar,respectively. By setting CV D 1 and Ci D C 0

i D 0 for all others, Eq. (15.23) reducesto the original Fermi interaction Lagrangian Eq. (15.5). For more general argu-ments, interactions including derivatives have to be considered, but we omit themas the inclusion of derivatives changes the energy spectrum and there is no sign ofsuch terms. Since Ci , C 0

i are complex numbers, there are 20 parameters in total.Using the transformation property of each term (see Appendix F), one sees thatC 0

i D 0 for P invariance, Ci (C 0i ) are real (pure imaginary) for C invariance and all

Ci , C 0i must be real for T invariance.

5) See Problem 13.1 (3).

Page 585: Yorikiyo Nagashima - Archive

560 15 Weak Interaction

Problem 15.1*

Prove the above statement.

In the real world, P and C are maximally violated and T (or CP if CPT invari-ance holds) invariance is almost true. In the following discussions, we assume Tinvariance until proven otherwise and set all Ci , C 0

i as real, reducing the numberof parameters to 10. For a while we proceed assuming P, C invariance as well, andset C 0

i D 0. In this case, the number of parameters is 5. In the approximationneglecting nuclear recoils, the transition matrix elements can be calculated usingTable 4.2, and we obtain

T f i � G�h1i

˚CSu(pe)v (pν)C CV u(pe)γ 0v (pν)

�C hσi �

˚CA u(pe)γ 0σv (pν)C CTu(pe)σv (pν)

� (15.24a)

h1i �Z

d3x Φ �F (s)τCΦI (s) (15.24b)

hσi �Z

d3x Φ �F (x )στCΦI (x ) (15.24c)

Resultant expressions for the decay rates have the same form as Eqs. (15.18) and(15.21) but the matrix elements squared jh1ij2 and the angular correlation coeffi-cient 1 are replaced by [156, 280]

jh1ij2 ! C2F jh1ij

2 C C2GTjhσij

2 (15.25a)

C2F D jCSj

2 C jCVj2 C

2me

EeCS CV (15.25b)

C2GT D jCAj

2 C jCTj2 �

2me

EeCACT (15.25c)

1C v cos θ ! 1C λv cos θ

λ D

��jCSj

2 C jCVj2�jh1ij2 C (1/3)

��jCAj

2 C jCTj2�jhσij2

C2F jh1ij2 C C2

GTjhσij2(15.25d)

If the parity-violating terms are included, all the coefficients are replaced by

jCi j2 ! jCi j

2 C jC 0i j

2 (15.26)

Later, jC 0i j D jCi j was clarified, and consequently Eqs. (15.25) are equally valid

after the discovery of parity violation if we use G/p

2 instead of G . The readermay have noticed that P-type interaction is missing. It does not contribute to theallowed transitions. This is because P-type coupling is equivalent to derivative cou-pling in the low-energy limit (see Table 4.2 and discussions in Sect. 13.6.1); thematrix element is proportional to the recoil momentum of the nucleus, hence canbe neglected. When the interaction type is S or V, the selection rule is the sameas V, namely Δ J D ΔP D 0, and is referred to as the Fermi transition. When

Page 586: Yorikiyo Nagashima - Archive

15.2 Fermi Theory 561

the interaction is A or T, the electron and the neutrino carry away the angular mo-mentum 1 generated by the spin flip nuclear transition. This is referred to as theGamow–Teller transition, in which Δ J D 0,˙1, ΔP D 0.

Problem 15.2

Prove that if F(Z, E ) ' 1, Ee me , then f D Δ5/30 and the total decay ratebecomes

Γ DG2

Δ5

60π3

�C2

F jh1ij2 C C2

GTjhσij2� (15.27)

Determination of the Interaction Type When the initial and final states of nucleartransitions are known, the mixing rate of the Fermi and Gamow–Teller transitions

F DC2

F jh1ij2

C2F jh1ij

2 C C2GTjhσij2

(15.28)

is a function of jCGT/CFj. If the f t value is measured as a function of F, jCGT/CFj

can be determined [233]. The experimental values are given by

G D 1.14730 ˙ 0.00064 � 10�5 GeV2

jCGT/CFj D 1.2601˙ 0.0025 (15.29)

When one interaction type dominates, the angular correlation coefficient λ is givenby

λ D �1 (S ) , D C1 (V ) ,

D 1/3 (T ) , D �1/3 (A) (15.30)

Measuring the angular coefficient λ for various nuclei, and plotting it as a functionof mixing parameter F, one can determine which of the interactions are effective(Fig. 15.2). Historically, the interaction type was determined this way. The result is

CS D CT D 0 ,

CV D 1 , CA ' �1.260CV (15.31)

The sign of CA was determined by angular correlation of (eν) and of the neutrinowith respect to the spin of free polarized neutrons [82, 83].

Beta Decay of the Neutron When the initial and final states belong to the sameisospin multiplet, the values of h1i, hσi can be calculated without knowing thenuclear wave functions. For the neutron decay, jp i, jni are eigenstates of isospinI D 1/2,

h1i2 D 1, hσi2 D 3 (15.32)

Page 587: Yorikiyo Nagashima - Archive

562 15 Weak Interaction

Figure 15.2 Electron–neutrino angular correlation coefficients plotted against the mixing pa-rameter F of the decays in the allowed transitions [280].

As Δ D mn � m p D 1.29 MeV is the same order as the electron mass me D

0.511 MeV, the expression for Γ in Eq. (15.27) is not usable, but integration of fand consideration of the weak magnetism, which will be explained later, gives avalue that agrees well with the experimental value (τn D 885.7˙ 0.8 s).

15.2.2Parity Violation

Probably the most shocking event in the history of particle physics was the discov-ery of parity violation in 1957. Today any conservation law is not sacred, consideredas an entity to be checked by observations, but at that time it was beyond imagi-nation that such a fundamental law as space-time symmetry could be broken. Thefirst indication was the so-called τ–θ puzzle. The particle, known as the K mesontoday, decays to 3π (τ particle) at one time and to 2π (θ particle) at another. Both ofthem were known to have spin s D 0. Since the pion is known to have spin-parityJ P D 0�, to make J D 0 in the 2π combination, the orbital angular momentumhas also to be zero, resulting in positive parity. In the 3π system, taking l as theorbital angular momentum of 2π and L as that of the third π relative to the centerof mass of the 2π, l C L D J D 0 must hold. This means l D L and the parityis P D (�1)3ClCL D �1. So τ and θ must be different particles. However, themystery was that they had the same mass and the same lifetime, indicating theywere one and the same particle.

Page 588: Yorikiyo Nagashima - Archive

15.2 Fermi Theory 563

Lee and Yang investigated past experiments and found that there was ample ev-idence of parity conservation in the strong and electromagnetic interactions butno evidence in the weak interaction. They proposed several experiments to test thevalidity [258].

How to Test Parity Violation The momentum p changes its sign by parity transfor-mation but the angular momentum J � r � p does not. Therefore if an observablesuch as J � p is found in a decay process, it is evidence of parity violation. Let p bethe momentum of decay electrons from a polarized nucleus having angular mo-mentum J . Then J � p is proportional to the cosine of the electron emission angleθ with respect to the polarization axis. Therefore, by observing decay electronsfrom polarized nuclei, one can test the existence of the term J � p . It appears asup–down asymmetry with respect to a plane perpendicular to the polarization axis.

Wu’s Experiment C.S. Wu, who was at the same university as Lee and Yang, madepolarized 60Co by cooling it down to 0.01 K and measured the angular distribu-tion of the decay electrons. Figure 15.3a shows her apparatus [392]. By applyinga weak magnetic field (a few hundred gauss6) ), electrons in the outer shell ofthe atom can be polarized. The nucleus is polarized dynamically by spin–spin in-

6) One gauss (1 Gs) is 10�4 tesla (10�4 T).

Figure 15.3 (a) Apparatus used to discoverparity violation. (b) Observed data. 60Co(5C) decays to 60Ni(4C) followed by gammatransitions 4C ! 2C ! 0C. The agreement

between the γ anisotropy and the asymmetryof the electron is evidence that the asymmetryexists and is proportional to the polariza-tion [392].

Page 589: Yorikiyo Nagashima - Archive

564 15 Weak Interaction

teraction with the electron. The degree of nuclear polarization was determinedby measuring simultaneously the anisotropy of its γ radiation in the transition60Ni(4C)! 2C ! 0C:

� DW(π/2) � W(0)

W(π/2)(15.33)

Here, W(θ ) is the relative intensity of the emitted gamma ray at θ with respect tothe polarization axis. Since 60Co(5C)! 60Ni(4C) is a Gamow–Teller transition, weobtain the angular distribution by setting Ci D C 0

i D 0, i ¤ A in Eq. (15.23). Theresultant angular distribution formula is

S(θ ) � 1C AP v cos θ

A D2Re(CAC 0

A�)

jCAj2 C jC 0Aj

2 (15.34)

where P is the polarization of the Co and v the electron velocity. The experimentalresult was P D 0.65, v � 0.6, AP v D �0.38, which means

A D �1˙ 0.1) C 0A � �CA (15.35)

Therefore the parity violation is maximal. It also shows the maximum C violation,because C 0

A is not pure imaginary with magnitude as big as CA. It was later con-firmed that for V type decay also

C 0V � �CV (15.36)

Inserting Eqs. (15.31), (15.35) and (15.36) in Eq. (15.23), one can express the La-grangian for decay as

L D �Gp

2[p γ μ(1� 1.26γ 5)n][eγμ(1� γ 5)ν] (15.37)

The factor 1/p

2 is inserted in order to use the same G as before the discovery ofparity violation. The Lagrangian (15.37) contains V and A type with approximatelyequal strength and opposite sign and is commonly referred to as V–A type. We havenow obtained an important characteristic of the weak interaction. Because of 1�γ 5,as far as the lepton is concerned only components with left chirality contribute tothe interaction. For the notion of chirality, see Sect. 4.3.5. We defer consideration ofthe hadrons (or quarks), where the axial vector coupling differs by as much as 26%,until we discuss the conserved vector current (CVC) hypothesis in Sect. 15.5.2.

15.2.3π Meson Decay

We determined the coupling constants of S, V, T and A from decays, but asthe P type does not contribute in allowed transitions, we need to investigate pion

Page 590: Yorikiyo Nagashima - Archive

15.2 Fermi Theory 565

decay. Let us first calculate the decay rate using the interaction Lagrangian (15.37)as determined by decay. Almost 100% of the pions decay as

π� ! μ� C ν, πC ! μC C ν (15.38)

with lifetime [311]

τ D 2.603˙ 0.0005 � 10�8 s (15.39)

A conspicuous feature of pion decay is that the branching ratio of π ! e C ν isextremely small:

BR(π˙ ! e˙,--ν ) DΓ (π˙ ! e˙,--ν )

Γ (π˙ ! all)D 1.230˙ 0.004 � 10�4 (15.40)

Considering its larger phase space volume, one would naively expect a larger decayrate than for π ! μν, an expectation not fulfilled by observation. To investigatethe reason, let us write down the variables in the reaction as

π(k)! l(p )C ν(q) (15.41)

Since the muon has the same properties as the electron except for its mass, weinfer its emission can be treated in the same way as the decay. Therefore, wesimply make the replacement e ! μ in Eq. (15.37) and the interaction Lagrangianshould have the form

LWI D �Gp

2[p γ μ(1� 1.26γ 5)n][μγμ(1� γ 5)ν] (15.42)

The transition amplitude becomes

T f i D hμνj �LWIjπi D (2π)4δ4(k � p � q)

�Gp

2h0jp γ μ(1� 1.26γ 5)njkihp , qjμγμ(1� γ 5)νj0i (15.43)

The hadronic part contains corrections due to strong interaction (wave functionsmade by quarks) and cannot be calculated easily from first principles, but usingsymmetry arguments some restrictions can be applied. From Lorentz invariance,the hadronic part should be able to be expressed as

h0jp γ μ(1 � 1.26γ 5)njπi D �1.26h0jp γ μ γ 5njπi � �i k μ f π (15.44)

The first term in the parentheses does not contribute because the pion has negativeparity. The restriction comes from the vector nature of the current and that the pionmomentum is the only vector that appears in the process. f π is referred to as thepion decay constant. From energy conservation and the Dirac equation

kμhp , qjμγ μ(1� γ 5)νj0i D (p C q)μ u(p )γ μ(1� γ 5)v (q)

D mμ u(p )(1� γ 5)v (q)(15.45)

Page 591: Yorikiyo Nagashima - Archive

566 15 Weak Interaction

Therefore the transition matrix is written as

T f i D (2π)4δ4(k � p � q)Gp

2i f π mμ u(p )(1� γ 5)v (q) (15.46)

The decay rate can be calculated using the γ -matrix algebra given in Appendix C,and becomes

Γ (π ! μν) D (G f π)2 m2μ

(m2π � m2

μ)2

m3π

(15.47)

Since G is already given by decay, the measurement of the decay rate gives

f π D 131.74 ˙ 0.15 MeV ' 0.94mπ (15.48)

The decay rate Γ (π ! eν) is obtained simply by replacing mμ ! me ,

) Γ (π ! eν)Γ (π ! μν)

Dm2

e

�m2

π � m2e

�2

m2μ

�m2

π � m2μ

�2 D 1.284 � 10�4 (15.49)

Including the radiative corrections, the theoretical value becomes 1.233�10�4 [231]and agreement with the experimental value Eq. (15.40) is very good.

Why the decay rate is proportional to the mass squared of the charged leptoncan be understood as follows. In the rest frame of the πC, the neutrino and thecharged lepton fly away in opposite directions (Fig. 15.4). Since the neutrino haszero mass, the factor (1 � γ 5) forces the neutrino helicity to be 100% negative.(Note, in π� decay the antineutrino is produced, which has 100% positive helicity.)Since the pion spin is zero, angular momentum conservation forces the μC tohave negative helicity, too. As lC is an antiparticle, the negative helicity componentis suppressed by a factor ml/p in the negative chirality state (see arguments inSect. 4.3.5). Therefore, if ml D 0, the reaction is forbidden completely. It survivesowing to its finite mass but with suppression factor � m/p . The origin of thehelicity suppression is 1 � γ 5.

That the lepton in the weak interaction has the factor (1 � γ 5) can also be con-firmed by the direct observation of the muon helicity in the π ! μν decay. Experi-mental data [140] show

0.9959 < jh(μC)j < 1.0 (15.50)

ν μ+π +

Figure 15.4 The decay rate πC ! μC C νis proportional to m2

μ . The pion spin is 0, andthe neutrino’s helicity is negative. Thereforeangular momentum conservation forces themuon helicity to be negative, too. But in the

V–A interaction, the component of the pos-itive helicity in the muon wave function isproportional to mμ /p , which gives the decayrate� m2

μ /p 2.

Page 592: Yorikiyo Nagashima - Archive

15.3 Chirality of the Leptons 567

Exclusion of P-Type Interaction The pion decay rate also severely constrains theexistence of the P-type interaction. If the Lagrangian is of P (Γ D i γ 5) type, thedecay amplitude becomes

T f i � G CPh0jp i γ 5njπiu(p )i γ 5(1� γ 5)v (q)

D G CP f 0π u(p )(1� γ 5)v (q)

(15.51)

Note, no lepton mass appears in the expression. This is because the P-type inter-action flips the chirality (see next section) and the suppression factor disappears.Therefore, if the P-type dominates, the rate will be determined largely by the phasespace volume and the electron decay rate will be larger than the muon decay rate.Quantitatively, the decay rate ratio when the Lagrangian contains an extra P-typeinteraction is given by

Γ (π ! eν)Γ (π ! μν)

D

ˇˇ 1 � (me/mπ )2

1� (mμ/mπ )2

ˇˇ2ˇˇ CP f 0

π � (me/mπ )2CV–A f π

CP f 0π � (mμ/mπ )2CV–A f π

ˇˇ2 (15.52)

Comparing the formula with the experimental value, one can easily understandthat the P-type interaction is highly excluded. In order not to contradict the experi-mental value we need jCPj

2/jCV–Aj2 . 10�4.

Summarizing the result, one may say that the characteristic of the V–A interac-tion manifests itself most conspicuously in the pion decay mode. Historically, thedetermination of the weak interaction type took a long and meandering road. Atone time erroneous experiments puzzled people as to which of the solutions V, Aor S, T, P should be adopted. Logically, however, it is much clearer to grasp variousfeatures if one simply looks at the helicity of the decay leptons. Simply from thefact that the decay lepton has negative helicity [h(ν) D �1, h(l) D �vl ] while thatof the antilepton is opposite [h(ν) D C1, h(l) D Cvl ], one can conclude that theparity is maximally violated and that

CS D CT D CP D 0 , CA D �CV7)

C 0i D �Ci i D V, A

(15.53)

which will be discussed in the next section.

15.3Chirality of the Leptons

15.3.1Helicity and Angular Correlation

The significance of Eq. (15.37) cannot be overestimated. If the neutrino has van-ishing mass, (1� γ 5)ν implies a two-component neutrino and that the chirality of

7) The reason for setting CA D �CV and not CA D �1.26CV will be clarified in the discussion ofthe CVC hypothesis in Sect. 15.5.2.

Page 593: Yorikiyo Nagashima - Archive

568 15 Weak Interaction

the lepton that appears in the weak interaction is always negative. This means thehelicity of the emitted e�, ν must be �v, +1 (for eC, ν, the opposite is true). Wenotice that the V, A-type interaction conserves the chirality, but S, T, P inverts it. Tosee this we note that the γ matrix has the following properties:

γ 5γ 5 D 1 , γ 5γ 0 D �γ 0γ 5 (15.54a)

γ 5Γ D

(�Γ γ 5 V, A

CΓ γ 5 S, T, P(15.54b)

Then as [(1 � γ 5)/2]2 D (1� γ 5)/2

12

eΓ (1� γ 5)ν D e†γ 0Γ�

1� γ 5

2

�2

ν

D

�1� γ 5

2e�†

γ 0Γ�

1� γ 5

2

�ν

(� V, A

C S, T, P(15.55)

This equation means that the measurement of the helicity of the emitted leptoncan distinguish STP from VA. Once one accepts the fact that the neutrino hasnegative chirality, the fact that the angular correlation coefficient has value λ D 1for V and λ D �1/3 for A [Eq. (15.25d)] can be understood as follows. In the V-typeinteraction Δ J D 0 and the eν pair has total angular momentum J D 0. Since thehelicity of the electron is negative and that of ν is positive, they tend to fly in thesame direction to have J D 0 (Fig. 15.5). Namely they want to have θ D 0, hence

ν_

e-

60CoNi

5+ 4+ + 1

ΔJ 1

θ 180 o

e-ν

ΔJ 0

e-_ ν_

(a)

(c)

(b)

Figure 15.5 Explanation of angular correla-tion between e and ν. The negative chiralityforces the electron to have h(e�) D �1and h(ν) D C1. In the Fermi transition (a),Δ J D 0 and the two leptons have total spin

zero, therefore they align themselves in thesame direction, while in the Gamow–Tellertransition (b), Δ J D 1 and the opposite istrue. The asymmetry in Co decay can be ex-plained in the same way.

Page 594: Yorikiyo Nagashima - Archive

15.3 Chirality of the Leptons 569

λ > 0. For A-type interaction, Δ J D 1 and, the total spin of eν being 1, they wantto fly in opposite directions (θ D π), hence λ < 0 (Fig. 15.5). The factor 1/3 reflectsthe spin degree of freedom.

In a similar manner, the reason for the asymmetry A D �1 in Wu’s cobalt exper-iment can be explained. In the process Co(5C)! Ni(4C) (Fig. 15.5c), Δ J D 1 andthe total spin of eν has to be 1. They have to be emitted in opposite directions andtheir total spin has to be aligned along the same direction as the parent Co’s spin.This means A < 0.

15.3.2Electron Helicity

The helicity of the electron can be measured after converting the longitudinal po-larization to transverse polarization. There are several methods. We list a few here.

1. The electrostatic deflector changes the momentum but does not change thespin direction (Fig. 15.6).

2. Applying both the electric and the magnetic field perpendicular to the electronmotion in such a way as to keep the electron’s path straight, the magnetic fieldcan induce spin rotation according to

idσd tD [σ, H ] D [σ,�μ � B] D ijμjσ � B (15.56)

3. Use Coulomb scattering as described in Sect. 6.7.4.

The transverse polarization can be measured using Mott scattering with nu-clei. The electrostatic interaction produces symmetric scattering but the magnetic(spin–orbit coupling H D �g s � l) interaction produces left–right asymmetry in thescattering. The magnetic field is induced by the motion of the electron and acts on

β- source

σ

σ σ

μθ

(a)

z y

x

B

p

+

+

+

-

--

Deflector

Laboratory Frame Rest Frame of Electron

Electro-static

(b) (c)

Figure 15.6 Measurement of the electronhelicity. The helicity is determined by first con-verting the longitudinal polarization to trans-verse polarization and then measuring theleft–right asymmetry of the Mott scattering.

Here, an electrostatic deflector (a) is used.The magnetic interaction of the spin inducesasymmetry in the scattering and its magnitudecan be calculated using QED.

Page 595: Yorikiyo Nagashima - Archive

570 15 Weak Interaction

Coulomb Magnetice

+Ze

e e

down upx

y

(a) (b)

Figure 15.7 (a) Scattering by the Coulomb force is symmetric. (b) Scattering by the spin–orbitcoupling force is asymmetric.

1 1 4 10 40 100 400 1000kinetic energy (keV)

0.8

0.6

0.4

0.2

0.80.60.40.20

0v/c

60Co32pLAZARUSVAN KLINKENECKARDTWENNINGERBROSIBIENLEINULLMANN

high velocities

intermediate velocities

present 3H-dataat low velocities

Figure 15.8 A compilation of measured electron helicity from allowed decays. Λ is a smallcorrection for Coulomb and screening effects [241].

the spin magnetic moment of the electron, because, in the electron rest frame, thetarget nucleus produces an electric current.

The reason for the asymmetry production can be understood classically as fol-lows. As is shown in Fig. 15.7b, let us define the x-axis in the direction of the elec-tron motion. Assume the electron is polarized in the z direction. When the electronis on a trajectory with y > 0, l D r � p is along the �z direction, whereas whenthe electron is on a trajectory with y < 0, l D r � p is along the +z direction, andthe Hamiltonian changes its sign. In other words, the force (produced by a nonuni-form magnetic field) is either attractive or repulsive depending on y > 0 or y < 0,and always bends the electron in the same direction, producing the asymmetry.

Measurements showed the electron’s helicity to be h(e�) D �v and that of thepositron h(eC) D Cv and confirmed the negative chirality of the decay electron asshown in Fig. 15.8 [147, 280].

Page 596: Yorikiyo Nagashima - Archive

15.4 The Neutrino 571

15.4The Neutrino

15.4.1Detection of the Neutrino

The neutrino is the only particle that does not interact strongly or electromagnet-ically. The weak interaction, literally, couples weakly to matter, meaning not on-ly that its strength is weak but also that it occurs very seldom. It took nearly 30years after Pauli’s prediction before it was actually detected by Reines and Cowanin 1956 [105, 330]. Today the neutrino is mass produced in reactors as well as inaccelerators and even a neutrino telescope has been set up to watch supernovae.However, Reines and Cowan’s method of detecting the neutrino is still valid and isbeing used in a forefront experiment to confirm neutrino oscillation. Therefore, itis worthwhile describing it here. The scheme of their detector is shown in Fig. 15.9.The process used is an inverse decay and the produced secondary particles weredetected using the following reactions:

ν e C p (u)! (eC C W �)C p (u)! eC C n(d) (15.57a)(eC C e� ! γ C γ

n C 108Cd ! 109Cd�! 109CdC γ

(15.57b)

where we have denoted participating quarks in parentheses. Neutrinos produced ina reactor were used for the detection. They are actually antineutrinos produced inthe decays of fission products. Because of lepton number conservation, they canonly be converted to positrons by (virtually) emitting W �, and the target proton(actually a u quark in the proton) is converted to a neutron (d quark) by absorbingW �. As the positron is an antielectron, it is annihilated immediately by an electronin matter, producing two gamma rays with total energy 2me . The recoil neutron is

Reactor β decay νe_

pn Cd

e+

Photomultipliers

Liquid Scintillator

+ CdCl2Water

Figure 15.9 Layout of the apparatus used todetect the neutrino in 1956. The antineutrinoproduced in decays of nuclear fission prod-ucts reacts with a proton and emits a positronand a neutron. The positron immediately an-

nihilates an electron in matter, emitting twoγ rays. The slow neutron loiters for a while,eventually being captured by a Cd nucleus toemit another γ [105, 330].

Page 597: Yorikiyo Nagashima - Archive

572 15 Weak Interaction

nonrelativistic and is scattered many times by matter before it is captured by a Cdnucleus, thereby emitting another gamma ray. A time as long as 10�4 s will havepassed and the neutron “totters” � 1 m. The gamma rays kick electrons (Comp-ton scattering) in matter and the recoil electrons, in turn, emit bremsstrahlungphotons, creating pairs. These electrons are detected by scintillation counters. Thesignal of the neutrino reaction is thus the occurrence of a prompt signal producedby the positron followed by a delayed signal produced by neutron capture. The ex-periment was carried out at the Savannah River Plant in South Carolina, USA.

The detector consisted of a water tank sandwiched by liquid scintillators, whichwere viewed by photomultipliers. The tank contained about 200 liters of water with45 kg of cadmium chloride (CdCl2) dissolved in it. It was calculated that about 2 �1014 neutrinos were emitted per 1 kW of the reactor power and the flux injected intothe detector was 5 � 1013 s�1 cm�2. The intensity was 100 times that of the solarneutrino, but the reaction rate was extremely small and suffered severely from falsesignals created by cosmic rays. To date, the cosmic ray background is a seriousproblem to be considered in designing an experiment. This was especially truefor the first experiment. Eventually, taking the difference between mesurementswith the reactor in operation and not in operation, the experimenters confirmed anevent rate of one per 20 min.

15.4.2Mass of the Neutrino

The neutrino mass, i.e. whether it vanishes or not, has been a central issue inclarifying the properties of the neutrino. It still is, because a nonvanishing neutrinomass is a sign that the Standard Model is incomplete. There is evidence that it isfinite because neutrino oscillation, in which a neutrino changes its flavor duringflight, has been observed (see Sect. 19.2.1). However, the oscillation is sensitiveonly to the mass difference and determination of its absolute value is a forefrontresearch goal as it offers a clue to new physics.

A most promising reaction that may be able to determine the neutrino massdirectly is the nuclear decay with a small Q value. The energy spectrum of thedecay electron is obtained by integrating Eq. (15.18) over angles:

dΓ DG2

2π3jMj2F(Z, Ee)pe Ee pν Eν dEe (15.58)

where we have used a more general expression jMj2F(Z, E ), instead of jh1ij2, thatincludes the Coulomb correction factor F(Z, E ) and recovered the correct neutrinovariables to discuss the mass of the neutrino. We have already shown that jMj2

does not contain the energy variable in an allowed transition. If mν ¤ 0

pν Eν D (E0 � Ee)q

(E0 � Ee)2 � m2ν ,

E0 D M(Z, A) �M(Z C 1, A)� mν

(15.59)

Page 598: Yorikiyo Nagashima - Archive

15.4 The Neutrino 573

0 5 10 15 20Electron Energy (keV)

β-D

ecay

Rat

e

(a)

ν 0ν 0

σ 0K

urie

Plo

t

(b) E0 mν E0 Ee E0

ν 0 ν 0

σ 0

(c)

Figure 15.10 (a) Spectrum of the electron in decays. (b) Theoretical Kurie plot near end point.(c) Kurie plot with smearing. Smearing is due to experimental resolution and contributions offinal excited states.

We define a spectral function referred to as a Kurie plot by

K(E ) D�

dΓ /dEe

F(Z, Ee)pe Ee

�1/2

(15.60)

If mν D 0, K(E ) is proportional to E0 � Ee and describes a straight line.Figure 15.10a depicts a whole electron spectrum from decays. For the mν de-

termination, only electrons near the end point are important. If mν ¤ 0 the Kurieplot deviates from a straight line and is vertical at the end point (Fig. 15.10b), butafter convolution with the experimental resolution (σ), the curve becomes blurred(Fig. 15.10c). As mν can be determined more accurately with smaller value of E0,the tritium isotope (T) with E0 D 18.6 keV is usually chosen as the source. Manyexperiments have been carried out, so far, in vain. The result has always been con-sistent with zero mass, albeit with progressively improved upper limit.

Troizk–Mainz Experiment The previous arguments are valid for early experiments,in which the mass sensitivity was no more than 10–100 eV. Modern neutrino exper-iments have resolutions close to or better than a few eV, so that atomic or molecularbinding energy (a few eV) of the final states have to be considered (see [308] for a re-view), because the ray source used for the experiment is not a tritium nucleus butis prepared as a neutral tritium atom or molecule T2. In the following experimentsthe molecular gas of tritium T2, whose level structure is well understood, was used.

Page 599: Yorikiyo Nagashima - Archive

574 15 Weak Interaction

Taking into account various final states of daughter products [(3He�T)C+e], the ray energy spectrum is modified to

dΓ � pe Ee pν Eν D pe Ee(E0 � Ee)q

(E0 � Ee)2 � m2ν

! pe Ee

Xi

Pi (E0 � Vi � Ee)q

(E0 � Vi � Ee)2 � m2ν

(15.61)

where Vi is the excited energy of a final state and Pi is its probability. The re-coil energy of the daughter is included in E0. Note that the above formula tells usthat the experimental observable is m2

ν , not mν . The spectrum shape is shown inFig. 15.10c, which includes smearing due to the excited energy of the daughters.

In the early measurements, a magnetic spectrometer was used to determine themomentum of the decay electrons, but it only measures the transverse componentof the momentum relative to the magnetic field lines and smearing due to scatter-ing cannot be avoided, thus making it hard to resolve the contributions of the finalstates. Modern detectors use an electrostatic spectrometer, which measures the to-tal energy of the electron. It filters and passes only a small portion of the high-endspectrum, which is the region most sensitive to the mass value.

Here we describe the basic principles of the electrostatic filter of the Troizk [54]and Mainz experiment [243, 379], which uses MAC-E (Electrostatic filter with Mag-netic Adiabatic Collimator). A schematic layout is shown in Fig. 15.11. Two super-conducting solenoids provide the guiding magnetic field. Beta particles, which areemitted from the tritium source in the strong field BS inside the left solenoid areguided magnetically into the spectrometer along the magnetic field line and de-tected by a detector, which is located at the other end of the strong field BD. HV(high voltage) electrodes generate a retarding (accelerating) graded electric field inthe left (right) half of the spectrometer and act as a filter to pass only high-energy particles determined by the voltage U0.

From the source, particles are emitted isotopically. When a particle is emittedat angle θ , its transverse momentum relative to the magnetic field line causes itto describe a circular motion and proceeds forward in a spiral along the field line.The angular frequency of the circular motion is given by the cyclotron frequencyωc D eB/γ m, where m is the electron mass and γ the relativistic Lorentz factor.The orbital angular momentum is a constant of motion for a constant magneticfield. The magnetic field of the spectrometer is not uniform but strong at bothends and weakest (B D Bmin) in the central plane of the spectrometer. However, aslong as the adiabatic condition

1B

dBdt� ωc (15.62)

is satisfied, the orbital angular momentum is still a constant of the motion. Theorbital angular momentum L (times e/2m) is a product of the magnetic momentμ created by the circular motion and the relativistic γ factor and can be expressedas

e2m

L D γ μ D γ I S D γeωc

2ππ�2 D

p 2T

2mB�

ET

B(15.63)

Page 600: Yorikiyo Nagashima - Archive

15.4 The Neutrino 575

T2-source

BS Bmax Bmin

adiabatic transformation E⊥ → E||

BD

U0

HV electrodes

detector

s.c. solenoid s.c. solenoidqE

Figure 15.11 Principle of the MAC-E filter:Two superconducting solenoids produce amagnetic guiding field. The tritium source isplaced in the left solenoid, while the detectorsits in the right solenoid. Between the twosolenoids there is an electrostatic retardationsystem consisting of several cylindrical elec-trodes. In the central plane perpendicular to

the field lines, the minimum of the magnet-ic field Bmin coincides with the maximum ofthe electric retarding potential U0. The vec-tors in the lower part of the picture illustratethe aligning of the momentum vector alongB in the region of the low magnetic field bythe adiabatic transformation (plotted withoutelectrostatic retardation) [308].

where I is the circular current, S the area of the circle, � its radius and ET thetransverse energy of the electron. In the nonrelativistic approximation, which isvalid for the tritium decay,

μ DET

BD constant 8) (15.64)

Equation (15.64) means the transverse energy diminishes in proportion to thestrength of the magnetic field. Without the electric field, the total energy E DEk C ET is a constant of the motion, namely, the spectrometer transfers the trans-verse energy to the longitudinal energy, as is illustrated at the bottom of Fig. 15.11.At the central plane (analyzing plane), Bmin/BS D 3 � 10�4 [Tesla]/6 [Tesla] and allthe particles are almost parallel to the axis of the two solenoidal coils. The electricfield produces the retarding field and pushes all the particles back except those withenergy higher than the maximum potential eU0. The right half of the spectrometer,on the other hand, accelerates the particles that have survived the electric filter anddelivers them to the detector.

8) This also means magnetic flux conservation in a circle, because

Φ D π�2 B D π�

pT

eB

�2

B D2πm

e2

ET

B(15.65)

Page 601: Yorikiyo Nagashima - Archive

576 15 Weak Interaction

Electrostatic Spectrometers

Magnetic Spectrometers

1990 1995 2000 2005YEAR

BEIJING

LIVERMORE

LOS ALAMOS

MAINZ

TOKYO

TROIZK

TROIZK (step)

ZURICH

mν2 [

eV]2

25

0

25

50

75

100

125

150

175

Figure 15.12 Squared neutrino mass values obtained from tritium decay in the period 1990–2005 plotted against the year of publication [308].

Practically all the particles that are directed toward the detector, except thosewith a tiny fraction (Bmin/BS) of ET, are accepted. Therefore the geometrical accep-tance of the source is nearly 2π.

Figure 15.12 shows the history of neutrino mass measurements. The currentbest value is [311]

mν D 1.1˙ 2.4 eV (15.66)

Since from the oscillation data m . 0.05 eV is inferred (see Sect. 19.2.1), the abso-lute mass measurement should be aimed at least at this value when future experi-ments are being planned.

15.4.3Helicity of the Electron Neutrino

The helicity of the neutrino cannot be measured directly, but it can be deter-mined by observing the recoil nucleus as it emits a neutrino following electroncapture [179]. The reaction used is described in Fig. 15.13a.

e� C 152Eu ! 152Sm� C ν , Sm� ! Sm C γJ 1/2 0 1 1/2 1 0 1

(15.67)

In the first reaction, Eu at rest absorbs an electron in an orbital angular mo-mentum S state (K-capture), therefore the spin of the recoiled Sm� on emitting aneutrino is aligned in the same direction as the electron spin and opposite to theneutrino spin. If the momentum of the emitted γ is along the spin direction of the

Page 602: Yorikiyo Nagashima - Archive

15.4 The Neutrino 577

(0, )(3, )

(1, )

(2,+)(0,+)

γ1

γ2

γ

152Eu (0, ) : Qasi stable state152Sm* (1, ) : t (7+2)x10 14 sec

50+25 keV152Eu

0.956 MeV

0.963 MeV

0.12MeV

152Sm*

e ν

(a) (b)

Eu 132 m Source

Analyzingmagnet

Scale

1''

Pb

Sm2 O3

Scatterer

RCA6342

Fe + Pb Shield

Mu Metal hhield

2''×3 1''2

No I(T )

Figure 15.13 A scheme for measuring thehelicity of the neutrino and the correspond-ing apparatus. (a) Eu(0�) captures an elec-tron, emits a neutrino, is excited to Sm�(1�)and subsequently decays to Sm, emitting γrays. Only those γ ’s that are emitted in thesame direction as the recoil momentum haveenough energy to scatter and excite anoth-er Sm to a resonant state 963 keV above theground state. This is possible because the

half-life of 152Sm� is � 7 � 10�14 s, with re-sultant broadening of the level, thus allowingthe γ energy to reach the resonance energy.The helicity of the γ can be measured usingthe different absorption in passing throughmagnetized iron. (b) The apparatus contains152Eu in the magnetized iron. Emitted γ raysare scattered by Sm2O3 below and detected bya NaI scintillation counter [179].

parent Sm�, the γ preserves the parent’s helicity. Therefore by measuring the γhelicity, the helicity of the neutrino can be determined. The trick of this ingeniousexperiment is to pick out exclusively the forward-going γ by using resonance scat-tering. Resonance scattering occurs only when the incident γ has the right energy

Page 603: Yorikiyo Nagashima - Archive

578 15 Weak Interaction

to raise Sm to the resonant level 963 keV above the ground state

γ C 152Sm! 152Sm� ! 152SmC γ (15.68)

This means the energy of the incoming γ ray has to be slightly larger than 963 keVto give the recoil energy to the target. This extra energy is supplied by movingSm�, which emits the gamma ray in the forward direction. The helicity of the γcan be determined by passing it through magnetized iron as the absorption ratedepends on the value of the helicity. When the electron spin in the iron atom isantiparallel to the γ spin, it can absorb the γ by flipping its spin, but when the spinis parallel it cannot flip its spin, hence there is less absorption. The experimenthas demonstrated the negative helicity of the neutrino and confirmed the weakinteraction type as represented by Eq. (15.37).

15.4.4The Second Neutrino νμ

The decay electron from the muon has a continuous energy spectrum, indicatingtwo neutral particles (neutrinos) are emitted.

μ� ! e� C ν e C νμ (15.69)

The initial question was whether they are the same two neutrinos (ν e , ν e) as appearin nuclear decays or two different neutrinos (ν e , νμ).

If νμ is identical to ν e the muon should be able to decay to eCγ through the pro-cess depicted in Fig. 15.14, because the second neutrino (νμ) couples to the muonand the first to the electron. The calculated decay rate was within experimentalreach but the process has not been observed to this day [311]

BR(μ ! e C γ ) < 1.2 � 10�11 (15.70)

If νμ ¤ ν e , the process in Fig. 15.14 is absent and its nonexistence can be ex-plained. The hypothesis can be tested by making decay neutrinos from the pion

πC ! μC C νμ (15.71)

γ

ν

e

μ

W

Figure 15.14 μ ! e C γ can occur through this process if the νμis identical or mixes with νe , giving a much higher decay rate than isobserved.

Page 604: Yorikiyo Nagashima - Archive

15.5 The Universal V–A Interaction 579

react with matter. If νμ ¤ ν e , electrons will not be produced by νμ N reactions.

νμ C n ! μ� C p νμ C n !/ e� C p

νμ C p ! μC C n νμ C p !/ eC C n (15.72a)

ν e C n ! e� C p ν e C n !/ μ� C p

ν e C p ! eC C n ν e C p !/ μC C n (15.72b)

An experiment carried out at BNL in 1962 confirmed the hypothesis [112]. Onlymuons are produced and no electrons were observed in the reaction using neu-trinos from π decay. This was the first evidence that there exists a new quantumnumber (one might call it “e-ness” and “mu-ness” to distinguish the two kinds ofleptons) that is conserved in the leptonic weak interaction. Later, the tau lepton τand its associated neutrino were discovered in the e�C eC reaction and “tau-ness”was added. Today, they are generically called lepton flavor.

15.5The Universal V–A Interaction

15.5.1Muon Decay

The muon was discovered in 1937 by Anderson and Neddermeyer [293] and wasmistaken as the meson predicted by Yukawa because it had about the right mass(mμ ' 200me). The confusion lasted for ten years until finally the pion was discov-ered in 1947 by Powell [80] (see Fig. 13.5). But after the pion had been identifiedas the right Yukawa meson, the muon remained a mystery. Except for its mass,the muon had identical properties to the electron and may well be called a “heavyelectron”. Except for the fact that it can decay into an electron and neutrinos, allthe reaction formula for the muon may be reproduced from those of the electronsimply by replacing its mass. Nobody could explain why it should exist. The per-plexity was exemplified by Rabi’s words: “Who ordered it?” Although its existenceitself was a mystery, its properties have provided essential clues in clarifying thecharacteristics of the weak interaction. Precision experiments on the muon prop-erties provide a sensitive test of the Standard Model and remain a hot topic to thisday.

Decay Rate The muon decays practically 100% of the time to an electron and twoneutrinos. Here we investigate its decay rate and see what it tells us. Let us definethe kinematical variables as

μ�(p1)! e�(p4)C ν e(p2)C νμ(p3) (15.73)

Page 605: Yorikiyo Nagashima - Archive

580 15 Weak Interaction

The interaction Lagrangian may be written as

LWI D �Gμp

2

�νμ γ μ(1� γ 5)μ

�eγμ(1 � γ 5)ν e

(15.74)

Here, the Fermi coupling constant is denoted as Gμ for the sake of later discus-sions. The transition amplitude is given by

T f i D (2π)4δ4(p1 � p2 � p3 � p4)

�Gμp

2

�u(p3)γ μ(1� γ 5)u(p1)

�u(p4)γμ(1� γ 5)v (p2)

(15.75)

If the muon is not polarized, taking the spin average of the initial and the sum ofall the final state spins, we can express the decay rate as

dΓ D1

2mμjMj2dLI P S (15.76a)

jMj2 D

�Gμp

2

�2 162

KLμν(p1, p3)KLμν(p2, p4) (15.76b)

dLI P S D (2π)4δ4(p1� p2� p3� p4)d3 p2

(2π)32E2

d3 p3

(2π)32E3

d3 p4

(2π)32E4(15.76c)

where

KLμν(p1, p3) D (1/4)

Xspin

�u(p3)γ μ(1� γ 5)u(p1)

��u(p3)γ ν(1� γ 5)u(p1)

�D 2

�˚p μ

1 p3ν C p1

ν p μ3 � gμν(p1 � p3)

�C i�μν�σ p1� p3σ

� P μν(p1, p3)C Qμν(p1, p3) (15.77)

[see Appendix Eq. (C.10d)]. Using

P μν(p1, p3)Pμν(p2, p4) D 8 f(p1 � p2)(p3 � p4)C (p1 � p4)(p2 � p3)g

Qμν(p1, p3)Q μν(p2, p4) D 8 f(p1 � p2)(p3 � p4) � (p1 � p4)(p2 � p3)g

P μν Q μν D Qμν Pμν D 0

(15.78)

we obtain

KLμν(p1, p3)KLμν(p2, p4) D 16(p1 � p2)(p3 � p4) (15.79)

If Eq. (15.79) is inserted in Eqs. (15.76), the expression for the decay rate becomes

jMj2 D 64G2μ (p1 � p2)(p3 � p4) (15.80a)

dΓ D64G2

μ(p1 � p2)(p3 � p4)

2mμ(2π)5

d3 p2d3 p3d3 p4

8E2E3E4δ4(p1 � p2 � p3 � p4) (15.80b)

Page 606: Yorikiyo Nagashima - Archive

15.5 The Universal V–A Interaction 581

Problem 15.3

When m2 D m3 D 0, show

S μν �

Zp μ

2 p3ν δ4 (P μ � p2 � p3)

d3 p2

2E2

d3 p3

2E3D

π24

(gμν P2 C 2P μ P ν)

(15.81)

Hint: From Lorentz invariance the equation can be expressed as Agμν C B P μν.Calculate A and B in the center of mass frame.

Using Eq. (15.81), we have

dΓ DG2

μ

2π5 (p1μ S μν p4 ν)d3 p4

mμ E4(15.82)

If we neglect the electron mass, energy takes its minimum value 0 in the muonrest frame, and maximum value mμ/2 when the momenta of the two neutrinos areboth in the opposite direction to that of the electron. Putting p1 D (mμ , 0, 0, 0) andsetting me D 0 we obtain the electron energy spectrum

dΓdxD

G2μ m5

μ

96π3 x2(3� 2x ) , x DE4

mμ/20 x 1 (15.83)

This reproduces the experimental data (Fig. 15.15) quite well.Integrating over the energy, we obtain the total decay rate

Γ DG2

μ m5μ

192π3 (15.84)

Problem 15.4

Derive Eq. (15.83).

Problem 15.5

Show for me > 0 Eq. (15.84) is modified to

Γ DG2

μ m5μ

192π3 f

m2

e

m2μ

!, f (y ) D 1� 8y C 8y 3 � y 4 � 12y 2 ln y (15.85)

The precise expression including the radiative correction is given by [53, 333]

Γ DG2

μ m5μ

192π3 f

m2

e

m2μ

!�1C

α2π

�1C

2α3π

lnmμ

me

��254� π2

��(15.86)

Page 607: Yorikiyo Nagashima - Archive

582 15 Weak Interaction

Figure 15.15 Energy spectrum of the decay electrons in μ! eνν. The line is a theoretical curveincluding the radiative correction with the Michel parameter � D 3/4 [44].

Inserting the observed muon lifetime [311]

τ D1ΓD 2.197019 ˙ 0.000021 � 10�8 s (15.87)

we find the coupling constant is determined as

Gμ ' 10�5 GeV�2 (15.88)

which is almost identical with G [Eq. (15.29)] obtained from nuclear decays.This was the first indication that the coupling constant of the weak interactionwas universal. For later discussion we list the two coupling constants and comparethem:

Gμ D 1.16637 ˙ 0.00001 � 10�5 GeV�2 (15.89a)

G D 1.14730 ˙ 0.00064 � 10�5 GeV�2 (15.89b)

The small but significant difference between the two coupling constants will bediscussed later.

Page 608: Yorikiyo Nagashima - Archive

15.5 The Universal V–A Interaction 583

Michel Parameters In calculating the muon decay rate, we assumed V–A inter-action. Historically, however, the most general expression [Eq. (15.23)] was usedto determine the interaction type. In the electron energy spectrum from polarizedmuons, angular correlations with the polarization axis vary according to the inter-action type. Its general form is expressed in terms of Michel parameters [115, 141,274]. The usefulness of the parameter lies in that it can be applied not only to themuon but also to the tau or any new lepton and determine its interaction type.Development of surface muons9) has enabled precise experiments using polarizedmuons, providing a very effective test bench to check the validity of the StandardModel and beyond. The distribution of the decay electron from the polarized muonμ� can be expressed in the approximation neglecting the electron mass as

dΓdx d cos θ

D 4x2�

3(1� x )C23

�(4x � 3)� Pμ � cos θ

�1� x C

23

δ(4x � 3) � (15.90)

where Pμ is the muon polarization, x D 2Ee/mμ and θ is the angle between themuon spin and the electron momentum. Note that the term with the polarizationis the parity-violating term. �, � , δ are the Michel parameters [141]. In order not tomake the equation too complicated, we restrict the interaction type to V and A inthe following. The general formula can be found in [311]. The Lagrangian can bewritten in the form

LWI D �Gμp

2

�eγμ(1 � γ 5)ν e

˚νμ γ μ �gLL(1� γ 5)C gLR(1C γ 5)

μ�

Ceγμ(1C γ 5)ν e˚

νμ γ μ �gRL(1� γ 5)C gRR(1C γ 5)

μ�C h.c.

(15.91)

The Michel parameters are expressed as

� D34jgLLj

2 C jgRRj2

D, δ D

34jgLLj

2 � jgRRj2

D, � D 1

D D jgLLj2 C jgLRj

2 C jgRLj2 C jgRRj

2(15.92)

In the V–A or the Standard Model, gLL D 1, gLR D gRL D gRR D 0 and gives

� D δ D34

, � D 1 (15.93)

If we adopt V–A as proven for the electron, we can set gRL D gRR D 0 and test onlythe muon part, gLL D 0, gLR D 1 for the VCA type and gLL D gLR(gLL D �gLR) for

9) If matter is irradiated with pions, slow pions stopping in the matter, muons produced inπ ! μν have some definite energy. Especially, those from the surface of matter have a cleanmonochromatic spectrum.

Page 609: Yorikiyo Nagashima - Archive

584 15 Weak Interaction

pure V(A) type, giving � D 0 (V–A), 3/8 (V), 3/8 (A). The data show [311]

� D 0.7509˙ 0.0010 (15.94a)

δ D 0.7495˙ 0.0012 (15.94b)

j� Pμj D 1.0007˙ 0.0035 (15.94c)

jPμj is the polarization of the muon and should be 100% for muons obtained fromπ ! μν in the V–A interaction. An experimental analysis using the surface muonbeam gives [40]

jgLLj > 0.888 , jgLRj < 0.060 , jgRLj < 0.110 ,

jgRRj < 0.033 90%CL (15.95)

In summary, the data confirms the V–A-type Lagrangian with high accuracy.

15.5.2CVC Hypothesis

Isospin Current The form of the interaction determined by the nuclear and muondecay can be combined to give

LWI DGp

2

�p γμ(1� 1.26γ 5)n

�eγ μ(1 � γ 5)ν e C μγ μ(1� γ 5)νμ

C

Gμp

2

�νμ γμ(1� γ 5)μ

�eγ μ(1� γ 5)ν e

C h.c. (15.96)

As far as the vector part of the weak current is concerned, the coupling constantof the muon decay Gμ and that of nuclear decay (G) agree within 2%. n ! ptransformation is really d ! u transformation and is influenced by quark wavefunctions to form a proton bound state. Namely, the coupling here includes thestrong interaction correction. Based on near identity of the strength, it was sug-gested that the weak vector current is a conserved current, namely

@μ j WCμ � @μ p γ μ n D @μ ψγ μ τCψ D 0

ψ D�

pn

�, τC D

�0 10 0

� (15.97)

The inference is based on the analogy of the identical strength of the electriccharge for the electron and the proton. The proton, interacting strongly, couplesto the electromagnetic current not only as a single proton but also through pionclouds as depicted in Fig. 15.16a, and yet the strength of the charge is identicalto that of the electron, which does not have such a complication. The identity isguaranteed by current conservation:

@μ j μEM D @μ p γ μ p D @μ ψγ μ

�1C τ3

2

�ψ D 0 (15.98a)

Page 610: Yorikiyo Nagashima - Archive

15.5 The Universal V–A Interaction 585

(b)

p

p

e

p

p

e

p

p

n

p

pn

ep

p

pe

π0 π+ + + + + ...

p

n

p

n

p

n

p

n

p

n

+ + + + π0 π- ...

π+

π+

π-

π0e

ν

e

ν

e

νp

pnCV CV CV

CV CV

(a)

γ γ γ γ γ

n

p

Figure 15.16 Diagrams to show the universal-ity of the coupling constant. The strength ofthe electromagnetic current is invariant underthe strong interaction effect as depicted in (a)because of the current conservation. Similarly,

the CVC hypothesis makes the weak chargeinvariant under the influence of strong interac-tion corrections. If (e� ν) is replaced by W �the correspondence is more apparent.

Here, in moving from the second equality to the third, we used

p D1C τ3

2ψ, ψ D

�pn

�, τ3 D

�1 00 �1

�(15.98b)

Equations (15.98) have the form of Nishijima–Gell-Mann’s rule in a differential ex-pression and (1C τ3)/2 corresponds to Y/2 C I3. Comparing (15.97) and (15.98),we see the τ3 part of the electromagnetic current and the weak vector current(/ τ˙) have the same form. This suggests they are just different components ofthe same isospin current

j μI D ψγ μ

τ2

�ψ , @μ j μ

I D 0 (15.99)

This is referred to as the CVC (conserved vector current) hypothesis [167].

Chiral Invariance The coefficient of the axial vector current (γ 5γ μ) of the nucleondiffers from that of the lepton by 26%. If this is regarded as the result of the stronginteraction correction, it is logical to believe that, at the quark level, the couplingmust be equal. This advances to a stage one step beyond where the nuclear decay(Z, A)! (Z C 1, A) was considered as n ! p , and decay should be consideredas a d ! u process. At this most fundamental level, it is conceivable that the weakinteraction is described by

LWI D �Gp

2

�uγ μ(1� γ 5)d

�eγμ(1� γ 5)ν e

(15.100)

Page 611: Yorikiyo Nagashima - Archive

586 15 Weak Interaction

This is referred to as the V–A interaction. Later Adler [5] and Weisberger [384] usedthe PCAC (partially conserved axial vector current) hypothesis10) and succeeded inderiving CA/CV D 1.26 in terms of π–N scattering cross section, thereby confirm-ing the above conjecture. Equation (15.100) has a simple and beautiful form andagrees with what can be derived theoretically from the chiral invariance [355, 356].It requires that the equation of motion should be invariant under the chiral trans-formation

ψ ! ψ0 D exp(�i γ 5α)ψ , ψ† ! ψ0 † D ψ† exp(i γ 5α) (15.102)

Note, it is a gauge transformation but changes the phase of the left- and right-handed components in the opposite direction. The V–A type was also inferred fromthe two-component neutrino theory [144] and mass inversion requirement [342].Namely, the V–A interaction has a certain theoretical foundation as a basic startingpoint of the equation of motion. Since

ψ D�

1 � γ 5

2C

1C γ 5

2

�ψ D ψL C ψR

γ 5ψL D �ψL, γ 5ψR D ψR

(15.103)

the Lagrangian for the free Dirac particle has the form

LDirac D ψ(i γ μ@μ � m)ψ

D i ψLγ μ@μ ψL C i ψRγ μ@μ ψR C m(ψL ψR C ψRψL) (15.104)

The second equality can be derived using Eq. (15.103) and anticommutativity ofγ 5 with all other γ μ’s. It is easy to verify that the vector and axial vector currentare invariant under chiral transformation, i.e. ψ0γ μ ψ0 D ψγ μ ψ, ψ0γ 5γ μ ψ0 Dψγ 5γ μ ψ. However, the mass term transforms as

ψ0ψ0 D ψ† e i αγ5γ 0e�i αγ5

ψ

D ψ e�2 i αγ5ψ D cos 2αψψ � i sin 2αψγ 5ψ (15.105)

that is, it is not invariant. Namely, chiral invariance requires the vanishing of thefermion mass, which, in turn, makes the left- and right-handed components in-dependent, as shown by Eq. (15.104), in the absence of interactions to couple left-and right-handed components. One notices that Yukawa-type interaction LYukawa �

(gψψ C g0ψγ 5ψ)φ mixes the L-R components but current-type interactions suchas LEM � eψγ μ ψA μ or four-fermion interaction do not. The chiral transforma-tion can be considered as independent gauge transformations for left and rightcomponents and is referred to as the chiral gauge transformation:

ψL ! ψ0L D e�i αL ψL , ψR ! ψ0

R D e�i αR ψR (15.106)

10) The axial current is not conserved becauseCA D �1.26, but the deviation is small andone may say it is almost conserved. In a loosesense, this is what PCAC means. In a morerigorous definition, it is expressed as [162]

@μ j Aμ D f π m2

π φπ (15.101)

which is conserved in the limit mπ ! 0.Here, f π is the decay constant defined inEq. (15.44) and φπ is the pion field.

Page 612: Yorikiyo Nagashima - Archive

15.5 The Universal V–A Interaction 587

Problem 15.6

Show that α in Eq. (15.105) and αL, αR in Eq. (15.106) are connected by α D (αR�

αL)/2.

The fact that the weak Lagrangian has the form that is required by chiral gaugeinvariance has a fundamental significance for an understanding of the weak inter-action. It means that the fermion mass has to vanish and that the left- and right-handed components are independent, i.e. different particles. Indeed, if the weakforce is generated by the weak charge, only the left- handed component carries it.The CVC hypothesis and chiral invariance provide an important clue for the unifi-cation of the electromagnetic and weak forces.

Experimental Test of CVC(i) Weak Magnetism: Using Lorentz invariance, a general expression for the protonelectromagnetic current can be written as

JEMμ D hp 0jp γ μ p jp i

D u(p 0)�

Fp1(q2)γ μ C� p

2m pFp 2(q2)i σμνqν

�u(p )

q2 D (p � p 0)2 , Fp1(0) D Fp 2(0) D 1

(15.107)

If the proton is a point particle, i.e. a Dirac particle without higher order cor-rections, Fp1(q2) D 1, Fp 2(q2) D 0. � p is the anomalous magnetic moment ofthe proton in units of nuclear magneton e/2m p . Note the term proportional toqμ D p μ � p 0 μ vanishes by current conservation and that proportional to p μC p 0 μ

can be expressed by the above two terms using the Dirac equation for u(p 0), u(p )[see Eq. (8.5)]. The electromagnetic current of the neutron can be obtained bychanging the subscript p ! n. Since the neutron has vanishing electric chargebut finite anomalous magnetic moment, Fn1(0) D 0, Fn2(0) D 1. ComparingEq. (15.107) with Eq. (15.98), we see the τ3 part of Eq. (15.98) may be written as

JEMμ D ψ

hFV1(q2)γ μ C

�V

2m pFV2(q2)i σμνqν

i τ3

2ψ (15.108a)

FV1 D Fp1 � Fn1q2!0����! 1 (15.108b)

�V FV2 D � p Fp 2 � �n Fn2q2!0����! � p � �n

� p D 2.792847351(28) , �n D �1.91302473(45) (15.108c)

where the number in parenthesis of the experimental value denotes the error in thelast two digits. The CVC hypothesis means the weak charged current Eq. (15.97) isthe same as the terms in square brackets of Eq. (15.108a). Namely, the vector partof the weak current should be expressed as

JWCμ D ψ

�FV1(q2)γ μ C

�V

2m pFV2(q2)i σμνqν

�τCψ (15.109)

Page 613: Yorikiyo Nagashima - Archive

588 15 Weak Interaction

Therefore, if the second term (referred to as weak magnetism) in the square brack-ets of Eq. (15.109) can be shown to be the same as the anomalous magnetic mo-ment of the nucleon, the CVC hypothesis is proved. As the term is proportional toq, it does not contribute to the allowed transition of decay and one has to turn tothe forbidden transition to discuss the matter. Since we do not want to go into toomany details of nuclear theory, we do not discuss the CVC hypothesis in the de-cay any more here. Historically, however, the weak magnetism was measured [259]and provided the first evidence of the CVC hypothesis.

Another test is to perform high-energy neutrino scattering by a proton (ν lC p !l C n) and see if it has the same q2 dependence as e p elastic scattering (e C p !e C p ) [9]. Here, we consider the effect of CVC in pion decay.(ii) πC ! π0 C eC C ν: The electromagnetic current of π˙ is proportional tothe third component of the isospin by the Nishijma–Gell-Mann rule. ConsideringLorentz invariance and separating the coupling constant, one finds

JEMμ D hπC(k0)j jEM

μ (x )jπCi

D�F1(q2)(k0 μ C k μ)C F2(q2)(k0 μ � k μ)

hπCjI3jπCie�i q�x

q2 D (k0 � k)2 (15.110)

where k, k0 are the initial and final momenta of the pion. The form factor F1 isnormalized to become 1 in the static limit (q2 D 0). Current conservation means

@μ jEMμ D 0

) F1(q2)(k0 2 � k2)C F2(q2)(k0 � k)2 D 0(15.111)

Substitution of k0 2 D k2 D m2π makes F2(q2) D 0. Therefore the electromagnetic

current is expressed as

JEMμ D (k0 μ C k μ)F1(q2)h f jI3jiie�i q�x (15.112)

The CVC hypothesis requires

JWμ D (k0 μ C k μ)F1(q2)h f jI�jiie�i q�x (15.113)

since I�j1,C1i Dp

2j1, 0i, i.e. hπ0jI�jπCi Dp

2 [Eqs. (E.3), (E.10)]. For the πCdecay q2 is small and the form factor can be approximated by F1(q2) ' F1(0) D 1.We have the matrix element

M(πC ! π0eCν) DGp

2

p2(k0 μ C k μ)u(pe)γμ(1� γ 5)v (pν) (15.114a)

Then the decay rate can be calculated as

Γ D (2π)4δ4(Δ � pe � pν)1

2mπ

Xspin

jMj2d3 pπ

(2π)32ωd3 pe

(2π)32Ee

d3 pν

(2π)32Eν

(15.114b)

Page 614: Yorikiyo Nagashima - Archive

15.6 Strange Particle Decays 589

The result is given in [104, 222]:

Γ DG2

Δ5

30π3

�1 �

12

ΔmπC

�3

Rme

Δ

�' 0.393 s�1

R(x ) Dp

1� x2

�1�

92

x2 � 4x4�C

152

x4 ln

1Cp

1� x2

x

! (15.115)

The observed value is 0.394 ˙ 0.01 s�1 and agrees with the CVC hypothesis verywell.

Problem 15.7

*

As Δ D mπC � mπ0 � mπ0 , we may neglect the π0 momentum, which givesk0 μ C k μ D (mπC C mπ0 )δμ

0 . In this approximation, show Eq. (15.114b) becomes

Γ DG2

Δ5

30π3

(1 � Δ/2mπC )2

1 � Δ/mπCRme

Δ

�' 0.414 s�1 (15.116)

which agrees with Eq. (15.115) within a few %.

15.6Strange Particle Decays

15.6.1ΔS D ΔQ Rule

Particles carrying strangeness may not have the same interaction as the familiarhadrons composed of u and d quarks. So we shall see first if we can apply the samerules to phenomena associated with strange particles if we replace the d quark bythe s quark. As the hadronic part of the Lagrangian of the decay was expressed as

j CW

μ � uγ μ(1� γ 5)d (15.117)

it is natural to think that the decay of strange particles is induced by

j CW

μ � uγ μ(1� γ 5)s (15.118)

with leptonic part unchanged. Under this assumption, the relative ratio of K˙ !e˙ C ν and K˙ ! μ˙ C ν can be calculated in the same way as for the pion,leading to the formula corresponding to Eq. (15.49):

Γ (K ! eν)Γ (K ! μν)

Dm2

e

�m2

K � m2e

�2

m2μ

�m2

K � m2μ

�2 (15.119)

Page 615: Yorikiyo Nagashima - Archive

590 15 Weak Interaction

That the formula gives a value which agrees well with the experimental value2.45˙ 0.11� 10�5 supports the above assumption. If the hadronic weak current isexpressed by Eq. (15.118) this is an operator to annihilate s and create u (and viceversa). Since u carries (I D I3 D 1/2) and s (S D �1, I D 0), we expect that in thestrangeness changing reaction, relations

s ! u Q W �13!

23

, I W 0!12

, S W �1! 0 (15.120)

hold which means that the quantum number of the hadronic part undergoeschanges

ΔS D ˙1 , ΔS D ΔQ , Δ I D 1/2 (15.121)

This is also obtained from Nishijima–Gell-Mann’s law:

Q D I3 C12

(B C S ) (15.122)

Consider the list of observed leptonic decays of the strange particles

K˙ ! π0 C e˙ C ,--ν (15.123a)

K0L ! π˙ C e� C ,--ν (15.123b)

Λ ! p C e� C ν (15.123c)

Σ � ! n C e� C ν (15.123d)

� 0 ! Σ C C e� C ν (15.123e)

The regularity is obvious. They all satisfy ΔS D ΔQ. On the other hand, a reactionwith ΔS D �ΔQ such as

Σ C ! n C eC C ν (15.124)

has not been observed and its decay rate relative to (15.123d) is below 0.009. It iseasy to understand if we think in terms of the constituent quarks. As Σ � D uus, scan be changed to u by emitting W �, converting Σ � to n D uud, but Σ C D uuscan only be changed to uuu and not to n D udd. K0

L can be obtained either fromK0 or K

0and it is a mixture of S D 1 (d s) and S D �1 (sd). Therefore, it should

be able to decay to both, π˙ l�,--ν . However, once KL’s parent is identified as K0

or K0, one of them violates the ΔS D ΔQ rule and should be forbidden. One can

measure an observable

x D(ΔS D �ΔQ) rate(ΔS D ΔQ) rate

Dhπ� lCνjT jK

0i

hπ� lCνjT jK0i(15.125)

Page 616: Yorikiyo Nagashima - Archive

15.6 Strange Particle Decays 591

where KL ! π� lC,--ν in the numerator and denominator are separately obtainedfrom K

0and K0. The measured value gives [311]

Refxg D �0.0018˙ 0.006 , Imfxg D �0.0012˙ 0.002 (15.126)

and is consistent with 0. A more comprehensive treatment is given in the nextchapter.

15.6.2Δ I D 1/2 Rule

We turn now to the Δ I D 1/2 rule. It is easy to understand it in leptonic or semilep-tonic decays in Eq. (15.123), but consider hadronic decays such as

Λ ! p C π�, Λ ! n C π0 (15.127)

The transition amplitude may be written as

T f i D GShp juγ μ(1� γ 5)sjΛihπ�jdγμ(1� γ 5)uj0i (15.128)

where GS has been used instead of G , since it is to be determined by experiment.The above expression contains both a Δ I D 1/2 (us) and a Δ I D 1 (du) part,allowing both Δ I D 1/2 and Δ I D 3/2 transitions. Experimentally, however, theΔ I D 1/2 rule seems to be respected in hadronic decays as well as leptonic decays.If one requires the rule, we should have

Γ (Λ ! nπ0)Γ (Λ ! nπ0)C Γ (Λ ! p π�)

D13

(15.129)

Correcting for the phase space difference, the right-hand side becomes 0.345,which is to be compared with the experimental value 0.358˙ 0.005.

The decay KC ! πCπ0 (I D 2) is a Δ I D 3/2 process (Problem 15.8). It ismuch less common than the Δ I D 1/2 process K0

S ! ππ (I D 0):

Γ (KC ! πCπ0 (I D 2))Γ (K0

S ! ππ (I D 0))D 4 � 10�2 (15.130)

The validity of the Δ I D 1/2 rule in hadronic decay is believed to originate from thedynamics of the strong interaction, but it is not well understood yet and remainsas a problem for QCD to solve.

Problem 15.8

Show that πCπ0 cannot have I D 1 when J D 0. Therefore, if Δ I D 1/2 is strict,KC ! πCπ0 is forbidden.

Page 617: Yorikiyo Nagashima - Archive

592 15 Weak Interaction

Problem 15.9

(a) Using the Δ I D 1/2 rule show that

Γ (KS ! πCπ�)Γ (KS ! π0π0)

DΓ (KL ! πCπ�)Γ (KL ! π0π0)

D 2 (15.131a)

Γ (� � ! Λπ�)Γ (� 0 ! Λπ0)

D 2 (15.131b)

(b) Assuming aC, a�, a0 are decay amplitudes for Σ C ! nπC, Σ � ! nπ�,Σ C ! p π0, show that

aC Cp

2a0 D a� (15.131c)

Hint: Assume the existence of a hypothetical particle, the “spurion”, carrying I D1/2 and consider the decay as the scattering with the spurion where the isospin isconserved (see Sect. 13.3.1 for isospin analysis).

15.6.3Kl3 W KC ! π0 C lC C ν

Decay Rate As a representative of decay processes satisfying jΔS j D 1, we pickout KC ! π0 lCν l (l D e, μ). The reaction is similar to πC ! π0eν, Eq. (15.114a),as far as the kinematics is concerned. Therefore, denoting the coupling constant asGS, the decay amplitude is given by

T f i DGSp

2hπ0(kπ )jsγ μ(1� γ 5)ujKC(kK )ihlCνjνγμ(1� γ 5)lj0i

�GSp

2S μ lμ

S μ D hπ0(kπ )jsγ μ(1� γ 5)ujKC(kK )i

D fC(q2)�kK

μ C kπμ�C f�(q2)

�kK

μ � kπμ�

lμ D u(pν)γμ(1 � γ 5)v (pe) (15.132)

As both K and π have negative parity, the axial vector current does not contribute.f˙ are functions of q2 D (kK � kπ )2 and real if T invariant. As

kK � kπ D pe C pν , kK C kπ D 2kK � (pe C pν) (15.133)

and multiplication of the leptonic current by the lepton momenta produces ml ,f� is multiplied by m2

l . Therefore, when l D e, the f� term can be neglected.As an extension of the CVC hypothesis, let us assume that the vector part of thestrangeness-changing weak current is conserved, too, Then just like the isospincurrent [S U(2) symmetry] was induced from CVC, S U(3) symmetry can be appliedhere and fC(0) becomes 1/

p2. As S U(3) symmetry is not exact, the correction

Page 618: Yorikiyo Nagashima - Archive

15.6 Strange Particle Decays 593

changes the value slightly (1 ! 0.982 ˙ 0.008 [263]). Since q2 is a small number,the form factor is usually parametrized as

f˙(q2) D f˙(0)h1C λ˙

�q2/m2

π

�i(15.134)

in the experimental analysis. In the Kμ3 decay, both fC and f� can be determined,but in Ke3 decay only fC is determined. We discuss Ke3 only here. The electronenergy spectrum and (π0ν) angular correlation can be calculated from Eq. (15.132)

dΓ D (2π)4δ4(kK � kπ � pe � pν)jMj2

2mK

d3 kπ

(2π)32ωπ

d3 pe

(2π)32Ee

d3 pν

(2π)32Eν

(15.135a)

jMj2 DXspin

ˇˇ GSp

2S μ lμ

ˇˇ2 D G2

S

2

Xspin

ˇfCu(pν)2 6 kK (1� γ 5)v (pe)

ˇ2D 2G2

S j fCj2 Tr�( 6pe � m) 6 kK (1� γ 5) 6pν 6 kK (1� γ 5)

D 16G2

S j fCj2�2(kK � pe)(kK � pν) � m2

K (pe � pν)

(15.135b)

Electron Spectrum To obtain the electron spectrum, we have to integrate over kπ

and pν. We set fC(q2) D fC(0) in the following calculation. We need to calculateZpν

μ δ4(pνC kπ � a)d3 pν

d3kπ

ωπD Aaμ , where a D kK � pe (15.136a)

The equality is deduced from Lorentz invariance. To find A, we multiply by aμ onboth sides and, making use of a2 � m2

π D 2(a � pν), we obtain

A Da2 � m2

π

2a2

Zd3 pν

d3kπ

ωπδ4(pν C kπ � a) (15.136b)

The integration can be performed easily in a frame where a D 0:Zd3 pν

d3kπ

ωπδ4(pν C kπ � a) D

Zd3 pν

ωπ Eνδ(Eν C ωπ � Ea)

D 4πpν a0

a20D 2π

a2 � m2π

a2

(15.136c)

As a result we obtain

Aaμ D π

�a2 � m2

π

�2

a4 aμ (15.136d)

Inserting Eq. (15.136d) in Eq. (15.135) gives

dΓ DG2

S j fCj2

(2π)5EK

�2(kK � pe)(kK � a) � m2

K (pe � a)

π

�a2 � m2

π

�2

a4

d3 pe

Ee(15.137)

Page 619: Yorikiyo Nagashima - Archive

594 15 Weak Interaction

800

700

600

500

400

300

200

100

300

200

100Num

ber o

f ent

ries

60 80 100 120 140 160 180 200 220 –1 –0.5 0.50

• Experimental data(a) Vector spectrum (with radiative corrections)(b) Scalar spectrum(c) Tensor spectrum

(a)(b)

(c)

S

T

V

cos αMomentum (MeV/c)

e+ νe

K+

απ0

Figure 15.17 Comparison of V, S, T interaction type with KC ! π0eC νe . (a) The energyspectrum of the decay electron [132]. (b) π0νe angular correlation. α is the angle between π0

and ν in the dilepton center of mass system [75].

Using a D kK � pe, and evaluating variables in the kaon rest frame, we find

a2 D m2K � 2mK Ee , (15.138a)

a2 � m2π D 2mK (Emax � Ee) , Emax D

m2K � m2

π

2mK(15.138b)

2(kK � pe)(kK � a) � m2K (pe � a) D m2

K Ee(mK � 2Ee) (15.138c)

where Emax is the maximum energy of the decay electron and we have set me D 0.Taking into account Eqs. (15.138), we finally obtain the energy spectrum of thedecay electron:

dΓdEeD

G2S j fCj2

2π3 mK E 2e

(Emax � Ee)2

(mK � 2Ee)(15.139)

This formula agrees with experimental data (Fig. 15.17a).We have confirmed that the vector-type interaction (with maximal parity-violat-

ing component, of course) reproduces the experimental data very well. In derivingEq. (15.139), we assumed fC(q2) D constant.

Page 620: Yorikiyo Nagashima - Archive

15.6 Strange Particle Decays 595

Form Factors of Kl3 How can we derive information on fC(q2)? Since q2 D (kK �

kπ)2 D m2K C m2

π � 2mK ωπ is a function of only ωπ , information on fC(q2) canbe obtained by observing the energy spectrum of the pion. It can be calculated byperforming a similar calculation as for the electron spectrum. The formula is

dΓdωπ

DG2

S j fCj2

12π3 mK k3π (15.140)

Then from Ke3 data one can obtain λC and from Kμ3 data, λ�. However, recently,it has become more conventional to use [311]

f0(q2) D fC(q2)C f�(q2)q2

m2KC � m2

π0

(15.141a)

f0(q2) D f0(0)

1C λ0

q2

m2πC

!(15.141b)

The present value of observed λ˙ [311] is given by

λC D 0.0296˙ 0.0006

λ0 D 0.0196˙ 0.0012

Im � � Im�

f�fC

�D �0.006˙ 0.008

(15.142)

The information on Im� comes from T (time reversal) violating transverse polar-ization PT of μ [226], where

PT D Im �mμ

mK

jp μj

Eμ C jp μj cos θμν � m2μ/mK

(15.143)

Interaction Type In Fig. 15.17, theoretical spectra for the interaction type S (scalar)and T (tensor) are also given. They can be calculated starting from

T f i DGSp

2

�2mK fS

˚e(1� γ 5)ν l

�C

2 fT

mKkK

μ kπν ˚eσμν(1� γ 5)ν l

��(15.144)

instead of Eqs. (15.132). The angular correlation between π0 and ν in the dileptoncenter of mass frame is given by [75]

�(Eπ , cos α) D �

�j fS C fT cos αj2mK ωπ C

12j fCj2kπ sin2 α

�(15.145)

Figure 15.17b shows π0ν e angular correlation. It is clear from the figure that V–Ainteraction is equally valid for strangeness-changing weak interactions.

Page 621: Yorikiyo Nagashima - Archive

596 15 Weak Interaction

Strength of Strangeness-Changing Weak Decays The total decay rate can be ob-tained from Eq. (15.139) by integrating over the electron energy:

ΓKe3 DG2

S f 2Cm5

K

768π3f�

m2π

m2K

�f (y ) D 1� 8y C 8y 3 � y 4 � 12y 2 ln y

(15.146)

f�m2

π/m2K

�D 0.579 for mπ/mK D 0.273. Comparing Eq. (15.146) with the exper-

imental value τK D 1.2380 ˙ 0.0021 � 10�8 s, BR�KC ! π0eCν e

�D 0.0508 ˙

0.0005, gives

GS/G � 0.22 (15.147)

Problem 15.10

Derive Eqs. (15.140) and (15.146).

15.6.4Cabibbo Rotation

We have learned that the V–A interaction derived from the muon and nuclear decays are equally valid for strangeness- changing decays. However, the measuredcoupling constant G2

S is approximately 1/20 of G2 and is very small. This is also

true for other strangeness-changing decays. In other words, the coupling is univer-sal in the sense that the strength is equal for all strangeness-changing processes,yet the universality does not apply between reactions with ΔS D 0 and ΔS ¤ 0.This very annoying situation was saved by introduction of the Cabibbo rotation [84].

It frequently happens in quantum mechanics that a good quantum numberchanges depending on what kind of interaction is active. For example, the orbitalangular momentum is a good quantum number when the central force is activeand spherical harmonic wave functions are suitable for the treatment of dynamics.However, when spin–orbit coupling is active, the total angular momentum is thegood quantum number and the wave functions are mixtures of differing orbitalangular momentum. Cabibbo thought that when the weak force is active, a statesuitable for dynamic treatment may not necessarily be the mass eigenstate that isobservable as a particle. The state suitable for describing the dynamical behavior ofthe particle is actually a mixture of mass eigenstates

d0 D cos θcd C sin θc s (15.148)

and the particle that participates in the weak interaction is not d but d0, which isreferred to as the “weak eigenstate”. Cabibbo proposed that the weak current shouldbe expressed as

j CW

μD uγ μ(1� γ 5)d0 (15.149)

Page 622: Yorikiyo Nagashima - Archive

15.6 Strange Particle Decays 597

This is referred to as Cabibbo rotation. According to this idea, the Fermi couplingconstant should be universal, but in experiments only the mass eigenstates arepicked up. Consequently, only a component of the particle that is contained in theweak eigenstate is measured. The fraction is given by the (square of) of the Cabibborotation angle. In the strangeness-changing process the fraction is given by sin2 θc

changing Gμ ! GS D Gμ sin θc. From Ke3 data [75]

sin θc D 0.220˙ 0.002 (15.150)

From Λ decay [72], it was determined as

sin θc D 0.231˙ 0.003 (15.151)

and on the average,

sin θc D 0.221˙ 0.00211) (15.152)

Strictly speaking, the above value should be restricted to vector coupling only, butother measurements including the axial current also confirmed the same value. Insummary, the V–A interaction can be applied equally well to all the strangeness-changing weak interactions by introducing a single parameter, the “Cabibbo an-gle” θc.

Cabibbo’s proposal means that the coupling constant G of the nuclear decayis actually G D Gμ cos θc. No Cabibbo rotation is necessary for the leptonic cur-rent. Since there was a small but significant difference between G and Gμ , as wasshown in Eq. (15.89), identifying the difference as cos θc gives the same value ofthe Cabibbo angle and

cos θc D 0.973˙ 0.002 12) (15.153)

From now on we use GF to denote the strength of the universal four-Fermi interac-tions instead of Gμ in honor of Fermi.

11) When the quark family is extendedto three generations including (c, s),(t, b), the Cabibbo rotation becomes theKobayashi–Maskawa (KM) matrix. In thiscase sin θc should be replaced by V12 or Vus .Numerically, however, the difference is verysmall.

12) For the same reason as in footnote 11, cos θc

should be replaced by V11 or Vud . From the

unitarity of the Kobayashi–Maskawa matrix

jV11j2CjV12j

2CjV13j2 D 1 (15.154)

Hence jV11j2 C jV12j

2 need not necessarilybecome unity, but it accidentally happenedthat jV13j

2 jV11j2, jV12j

2, and the Cabibborotation, which closes within the twogenerations, worked well.

Page 623: Yorikiyo Nagashima - Archive

598 15 Weak Interaction

15.7Flavor Conservation

15.7.1GIM Mechanism

A careful reader might have already noticed that so far we have talked only aboutthe charged current weak interaction or that mediated by charged W ˙, where theparticipating members change their electric charge. It is natural to imagine thatthere is a neutral current interaction and it is compulsory if the weak force carrierW constitutes an S U(2) triplet, as is the case in the Standard Model. Indeed, theexistence of Z0 and its associated reactions was among the major predictions of theGlashow–Weinberg–Salam model, and subsequent experimental confirmation es-tablished its validity. Historically and phenomenologically, the need for the neutralcurrent was not strong. The effect of the ordinary (strangeness conserving) neutralcurrent (i.e. j μ

NC � uγ μ u, dγ μ d) is hidden by much stronger electromagnetic cur-rents, hence it is not surprising that their effect was not observed. Moreover, therewas strong evidence of the absence (or suppression to be exact) of the strangeness-changing neutral current. For instance, consider the kaon decays in Table 15.1 thatare induced by the charged current (CC) and the corresponding neutral currentinteraction (NC). Their Feynman diagrams are depicted in Fig. 15.18. Table 15.1suggests the absence of Z0 for the diagrams (b) and (d) in Fig. 15.18. However,there was no reason to believe in the nonexistence of the strangeness-changingneutral current and it was a mystery at that time. In 1970 Glashow, Iliopoulos andMaiani [174] proposed the existence of the fourth quark, and that it should make adoublet with s0, an orthogonal state to the Cabibbo-rotated d0:

d0 D d cos θc C s sin θc

s0 D �d sin θc C s cos θc(15.155)

The Cabibbo-rotated d0 makes a neutral current

d0d0 D dd cos2 θc C s s sin2 θc C sin θc cos θc(sd C d s) (15.156)

where space-time structure was omitted in the above equation. Equation (15.156)contains the strangeness-changing current sdC d s, which should induce diagrams

Table 15.1 Suppression of strangeness-changing neutral current decays.

CC BR NC BR

KC ! μC νμ 63.54˙ 0.14% K0 ! μ� μC 6.84˙ 0.11� 10�9

KC ! π0 eCνe 5.08˙ 0.05% KC ! πCν e νe 1.5C1.3�0.9 � 10�10 a

a The experimental value is for KC ! πC CP

i ν i ν i .

Page 624: Yorikiyo Nagashima - Archive

15.7 Flavor Conservation 599

e+νμμ+

W+

u s

K+

W+

ν ν

u s

K+

u d

Z0

μ+

d s

K0

Z0

μ νe

u s

K+

u u

(a) (b) (c) (d)

π0 π+

Figure 15.18 The charged current decays (a) KC ! μC ν and (c) KC ! π0eC νe and thestrangeness-changing neutral current decays (b) K0! μ� μC and (d) KC! πCνν.

(b) and (d) of Fig. 15.18. But if a state expressed as s0 in Eq. (15.155) exists, theneutral current that s0s0 makes

s0s0 D dd sin2 θc C s s cos2 θc � sin θc cos θc(sd C d s) (15.157)

cancels the strangeness-changing part of d0d0 and together with d

0d0 the neutral

current becomes

d0d C s0 s D dd C s s (15.158)

The absence of the strangeness-changing current is explained. This is called theGIM mechanism. There had been other predictions of the fourth quark as early as1964, but its existence was seriously considered only after the GIM proposal. Thecharm quark was subsequently discovered in 1974.

The reason why the strangeness-changing neutral current exists, albeit very sup-pressed, is due to the higher order charged current processes (Fig. 15.19). In theStandard Model, GF ' g2

W /m2W , and the value of gW is comparable with that of

e.13) Therefore the contribution of higher order diagrams (Fig. 15.19) would be onthe order of α, but the observed values are much smaller. This is because the GIMmechanism is at work in the higher order contributions. There are processes de-picted in Fig. 15.19(b) or (d) corresponding to Fig. 15.19(a) or (c) that are obtainedby replacing c with u. Since the coupling of W to ud or cs is � cos θc and that tous or cd � sin θc with differing sign, the contributions of (a) and (b) or (c) and (d)are equal but with opposite sign. If the mass of the charm quark were the same asthat of the up quark, the cancellation would be perfect and the decays would nothappen. Assuming mc mu the decay amplitude can be calculated and gives [154]

M(K0 ! μμ) � αGF

�m2

u

mc

�2

(15.159)

13) To be exact, GF/p

2 D g2W /(8m2

W ), e D gW / sin θW, where θW is the Weinberg mixing angle ofthe electromagnetic and the weak interactions.

Page 625: Yorikiyo Nagashima - Archive

600 15 Weak Interaction

(a) (b) (c) (d)

νν

u s

K+

u d

π+

W+

W+

u e+(μ+)

gWsinθc

gWcosθc

νν

u s

K+

u d

π+

W+

W+

c e+(μ+)

-gWsinθc

gWcosθc

sd

K0

u

μ+μ

νμ

W+W

gWcosθc gWsinθc

gW gWgW

gW

gW

gW

sd

K0

c

μ+μ

νμ

W+W

gW gW

-gWsinθc gWcosθc

Figure 15.19 GIM mechanism at work in thehigher order corrections. (a) and (b) are di-agrams for K� ! μ� μC, (c) and (d) forKC ! πCνν. If mc D md , cancellation due

to the GIM mechanism should be completeand the above processes would be forbidden.The mass difference makes a finite contribu-tion to (a)C (b) or (c)C (d).

Adjusting the mass value to fit the experimental value, one can obtain mc �

1.5 GeV. The mass prediction was done before the charm quark was discovered.The significance of the c quark discovery was not just the confirmation of thefourth quark, it also established the doublet structure of the weak interaction andat the same time explained the absence of the strangeness-changing neutral cur-rent. Unlike the s quark, the c quark was predicted and anticipated. The discoverywas an important milestone in the construction of the Standard Model. In sum-mary, there is strong evidence for the absence of the strangeness-changing neutralcurrent (SCNC) and there is no compelling reason to believe in the existence of thestrangeness-conserving neutral current, at least from a phenomenological point ofview.

15.7.2Kobayashi–Maskawa Matrix

We know today that there are six quarks in three doublets with hierarchical struc-ture in the sense that the mass of the quarks increases with the generation (seeTable 19.1). The Cabibbo rotation is generalized to a 3 � 3 unitary transformationmatrix, which is referred to as the Cabibbo–Kobayashi–Maskawa (or CKM) ma-trix [239]. Namely, the W boson couples to the three doublets�

ud0

� �cs0

� �tb0

�(15.160a)

with equal strength, and they are mixtures of the mass eigenstate D D (d, s, b):

24d0

s0b0

35 D UCKM

24d

sb

35 D

24Vud Vus Vub

Vcd Vcs Vcb

Vt d Vt s Vt b

3524d

sb

35 (15.160b)

Page 626: Yorikiyo Nagashima - Archive

15.7 Flavor Conservation 601

The unitarity of the CKM matrix is a GIM mechanism extended to the three gener-ations:

d0d0 C s0s0 C b

0b D D

0D 0 D DU†U D D DD D dd C s s C bb (15.161)

There is no flavor-changing neutral current of any kind; this is not limited tostrangeness only. Though there are nine independent variables in the 3 � 3 uni-tary matrix, considering the quark field’s freedom to change its phase, the numberis reduced to four: three rotation angles θ12, θ23, θ13 and one phase δ. It is conven-tional to express the CKM matrix in the form

UCKM D R23 IδD R13 I †δD

R12

D

241 0 0

0 c23 s23

0 �s23 c23

3524 c13 0 s13e�i δ

0 1 0�s13e i δ 0 c13

3524 c12 s12 0�s12 c12 0

0 0 1

35

D

0@ c12c13 s12c13 s13e�i δ

�s12c23 � c12 s23s13e i δ c12c23 � s12 s23s13e i δ s23c13

s12 s23 � c12c23 s13e i δ �c12s23 � s12c23s13e i δ c23c13

1A

IδD D diag (1, 1, e i δ) , s12 ' 0.220 , s23 ' 0.04 , s13 ' 0.004

(15.162)

Here, Ri j is a rotation matrix on the i j plane and c i j , s i j stand for cos θi j , sin θi j .The Cabibbo angle θc D θ12, but the 2 � 2 Cabibbo rotation matrix worked wellbecause c13 ' 1, s13 ' 0. However, the b ! u transition [actually B�(bu) !π0(uu)C eC C ν e etc.] has been observed, showing s13 ¤ 0.

It is worth noting that the motivation for introducing the CKM matrix wasnot merely an extension from two to three generations. Indeed, Kobayashi andMaskawa proposed it in 1973 before the discovery of the charm quark. Its moti-vation was to introduce a complex phase to explain CP violation, which will beexplained in the next chapter in more detail.

15.7.3Tau Lepton

The τ lepton was discovered in 1975 [318] and is one of the few elementary particlesthat can be observed in its bare form. The decay of the τ has many features incommon with muon decay, and is suitable for investigating the universality of theweak interaction. We shall describe a few properties of τ and how to determinethem, then we discuss general features of the weak interaction in the lepton sector.

Production If there is enough energy (p

s > 2mτ ), a τ pair can be produced inthe reaction

e� C eC ! τ� C τC (15.163)

Page 627: Yorikiyo Nagashima - Archive

602 15 Weak Interaction

e− e+τ −

τ +

l e−, μ−

νe, μντ

l or hadrons ντ

Figure 15.20 Topology of e� C eC ! τ� C τC production.τ decays to l C ν l C ντ or ντC hadrons. As τ’s are short lived,only decay products of the pair are observed. A typical topologyis an acoplanar lepton pair.

Decay patterns of the τ lepton are written as

τ˙ ! ,--ν τC(W ˙)

(W ˙)!

8<ˆ:

e˙ C ,--ν e

μ˙ C ,--ν μ

qi q j ! hadrons W qi q j D ud, us, etc.

(15.164)

where (W ˙) is virtual. As the τ lepton has very short decay lifetime (2.910˙0.015�10�13 s), what are observed are decay products of τ�τC pairs (Fig. 15.20). If bothtau’s decay leptonically, the topology consists of an acoplanar14) lepton pair (e�eC,e�μC, μ�μC) and it is easily recognized. If only one τ decays leptonically, thetopology is a single lepton in one hemisphere and a few hadrons in the other hemi-sphere, another combination easy to recognize.

Mass and Spin The production cross section of a spin 1/2 particle is given by [seeEq. (7.38)]

σ1/2 D4πα2

31s

(15.165)

The equation is common to ee ! μμ, ττ if s (2ml )2. Immediately above thethreshold, however, the effect of the produced mass or nonrelativistic correction isnecessary. If we write the Lorentz factor of the particle D p τ/Eτ , γ D (1�2)�1/2,the cross section is modified as (see Problem 7.2)

σTOT D σ1/2�

3� 2

2

�(15.166)

In order to determine the spin of τ, theoretical cross sections for different spins

14) Acoplanar means that one of the pair is not on a plane made by the incoming e� eC and the otherparticle.

Page 628: Yorikiyo Nagashima - Archive

15.7 Flavor Conservation 603

were considered [370, 371]:

σττ D σ1/2F() (15.167a)

F() D

8ˆ<ˆˆ:

3/4 s D 0

(3� 2)/2 s D 1/2

3(3C 4γ 2)/4 s D 1, � D 0

3(3C 2γ 2 C 4γ 4)/4 s D 1, � D 1

(15γ�2 C 302 C 404γ 2 C 166γ 4)/9 s D 3/2, A D 1,

B D C D D D 0

(15.167b)

where the magnetic moment of the spin 1 particle is defined as μ D (1C �)e/2M .Note that a fermion–antifermion pair is produced in S-wave (σ / jp j) and thecross section rises sharply at the threshold as a function of energy, while that ofboson pairs is in P-wave (σ / jp j3). This is due to parity conservation in e�eC re-action, where the dominant contribution comes from one-photon exchange, whichhas negative parity. The boson pairs have positive intrinsic parity, hence S-wave isforbidden.

Figure 15.21 compares the measured cross section with theoretical predictionsfor various spins [317]. It is clear that τ has s D 1/2. Fitting the cross section withthe theoretical curve immediately above the threshold gave the mass as

mτ D 1777.0 ˙ 0.3 MeV (15.168)

Figure 15.21 Comparison of the total production cross section ee ! ττ with theoreticalpredictions for various spins. Rττ D σ(ee ! ττ)/σ(ee ! μμ) [35, 317].

Page 629: Yorikiyo Nagashima - Archive

604 15 Weak Interaction

Coupling Type As the τ has a large mass, it has many decay modes. Among them,purely leptonic modes

τ� ! e� C ν e C ντ

τ� ! μ� C νμ C ντ(15.169)

have the same kinematical structure as muon decay. Therefore, investigation ofthe energy spectrum of the decay leptons allows the Michel parameters to be deter-mined (Fig. 15.22a,b), [28, 311]):

� D 0.745˙ 0.008 (15.170)

The value agrees with that of V–A interaction.

Universality The leptonic decay rate can be calculated using the identical formulaas for muon decay [Eq. (15.84)] with the mass replaced by mτ :

Γ (τ� ! e�ν e ντ ) DG2

τ m5τ

192π3 (15.171)

Then the lifetime of τ is given by

ττ D τμ

�Gμ

�2 �mμ

�5

BR (τ� ! e�ν e ντ ) (15.172)

Setting Gμ D Gτ and inserting the observed values for mτ D 1777.0 ˙ 0.3 MeV,mμ D 105.658389 ˙ 0.000034 MeV, τμ D 2.19703 ˙ 0.00004 � 10�6 s, BR D0.1783 ˙ 0.0008, one finds the calculated value agrees with the experimental val-ue of ττ(exp) D 2.910 ˙ 0.015 � 10�13 s within the error. Conversely, using theexperimental value to test the universality of the coupling constant gives [164, 169]

GμD 0.999˙ 0.006 (15.173)

N0.025

N0.025

300

200

100

200

0

100

00.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0

ARGUS

τ → μνντ → eνν

V – AV + A

V – AV + A

E/Emax E/Emax

Figure 15.22 Comparison of the τ energy spectrum with V˙A predictions. (a) τ ! eνν , (b)τ! μνν [28].

Page 630: Yorikiyo Nagashima - Archive

15.7 Flavor Conservation 605

If the weak interaction actually occurs through W exchange, then Gμ � g2μ g2

e andGτ � g2

τ g2e . So the above universality applies to gμ � gτ , giving gμ/gτ D 1.001 ˙

0.003. By investigating the ratio of τ ! μνμ ντ and τ ! eν e ντ , we can also testgμ � ge universality. Using

Γ (τ� ! μνμ ντ)Γ (τ� ! eν e ντ )

DBR(τ� ! μνμ ντ )BR(τ� ! eν e ντ)

Dg2

τ g2μ

g2τ g2

e

f (m2μ/m2

τ )

f (m2e/m2

τ )(15.174)

where f (y ) is given in Eq. (15.85) and comparing with the experimental value

BR(τ� ! μνμ ντ)BR(τ� ! eν e ντ )

D(17.36˙ 0.05)%(17.85˙ 0.05)%

D 0.974˙ 0.004 (15.175)

gives gμ/ge D 1.001˙ 0.002.We may say the universality is very good.

Branching Ratio and Quark Counting As the mass of τ is large, there are manyhadronic decay modes. But as mc > mτ , the hadronic decay modes are limited to

τ� ! ντ C W � ! ντ C uC d0 (15.176a)

D

(τ� ! ντ C d C u probability D cos2 θc

τ� ! ντ C s C u probability D sin2 θc(15.176b)

Owing to the Cabibbo rotation, τ decays to du with probability cos θ 2c and to su with

sin2 θc; their combined decay rate can be considered as one to a single channel d0.At the quark level, the decay rate W � ! ud0 is the same as the leptonic decay(W � ! e�ν e or μ�νμ) provided the mass of the quarks and leptons is neglected.It is a good approximation since me � mu � md � a few MeV, mμ � 100 MeV.Even the heaviest μ is much lighter than mτ (' 2 GeV). Since the quark has an ex-tra color degree of freedom, the decay rate to the quarks has to be tripled. Thereforea total of five channels are open for τ decay. Assuming all the quarks hadronize,the electronic and muonic decay modes are predicted each to be 1/5 D 20%. Thisis very close to the observed value of 17.7%. It demonstrates a practical use of thequark model, that simple quark counting gives fairly good agreement with experi-ment. A more precise estimate considering the QCD correction gives much betteragreement [169].

15.7.4The Generation Puzzle

From what we have analyzed so far, the three leptons e, μ, τ have the same proper-ties except for their mass. Various kinds of reaction rates can be accurately repro-duced by simply changing the mass. Despite their similarity, they have their ownquantum number, referred to as lepton flavor. (ν e, e�) has e-quantum number,(νμ , μ�) μ-number and (ντ , τ�) τ-number, which are conserved in the weak inter-action, and the three pairs do not mix. The fact that the three neutrinos are also dis-tinct, each having separate flavor, can be deduced from the following observations.

Page 631: Yorikiyo Nagashima - Archive

606 15 Weak Interaction

From the energy spectrum of the pure leptonic decay of the muon (μ� ! e�ν e νμ)and the τ lepton (τ� ! e�ν e ντ , μ�νμ ντ), we know there are at least two kinds ofneutral particles in each decay, which we call neutrinos. That ν e and νμ have differ-ent flavor can be deduced from the absence of μ ! eγ , as discussed in Sect. 15.4.4.Using the same arguments and from the absence of τ ! μγ , eγ , we can deduceντ ¤ νμ , ν e . Other evidence comes from the fact that ν e , νμ cannot produce τ di-rectly. Precision tests of the leptonic flavors come from the absence of the followingdecay modes:

BR(μ ! eγ ) < 4.9 � 10�11

BR(μ ! eee) < 1.0 � 10�12

BR(τ ! eγ ) < 1.1 � 10�4

BR(τ ! μγ ) < 4.2 � 10�6

BR(τ ! 3l) < 3.3 � 10�6

BR(Z ! l1 l2)(l1 ¤ l2) < 1.7 � 10�6 (l1 D e, l2 D μ)

(15.177)

Today, lepton flavor conservation is accepted and is built in to the Standard Mod-el. The flavor does not mix in the Standard Model. If one adds a small mass15) tothe neutrino, which is a natural extension of, but nonetheless one step beyond,the Standard Model, the mixing is proportional to (mν/mW )2 [97], which is ex-tremely hard, to say the least, to detect in decay reactions. Why do leptons withthe same properties appear repeatedly? This is a question not resolved by the Stan-dard Model. It is the same old problem of Rabi’s “who ordered it?” extended to thee–μ–τ puzzle. It is also referred to as the family or generation problem, becausethe same question exists in the quark sector. Namely, (c, s) and (t, b) seem to bereplicas of (u, d) except for their mass. Combining the quark and lepton doublets,(u, d, ν e , e�), (c, s, νμ , μ�) and (t, b, ντ , τ�) constitute three generations.

One idea to resolve the puzzle is to consider that they belong to the same sym-metry group, called horizontal (or family) symmetry. Another is to consider thatleptons and quarks are composites of more fundamental particles. In this model,μ, τ are excited states of the electron. In modern string theory the family structurearises from vacuum or space-time structure of extra dimensions. At present, thereare no hints as to how to differentiate between the models.

How Many Generations? So far we have discovered three generations of thequark/lepton family. The Standard Model assumes there are no more. The asser-tion is not supported by compelling evidence, however, there is an indication thatthe number of generations is indeed limited to three. It came later, after the Stan-dard Model was established, but we will jump ahead. It comes from the absence ofan extra neutrino in the decay of Z0. Z0 is produced and decays in the following

15) If there are mass differences between the neutrinos and if they mix, neutrino oscillation occurs(see Sect. 19.2.1), as has been observed.

Page 632: Yorikiyo Nagashima - Archive

15.7 Flavor Conservation 607

reactions:

e� C eC ! Z !

8<ˆ:

l l l D e, μ, τ

ν l ν l l D e, μ, τ

uu, dd, s s, cc, bb

(15.178a)

The Z boson (mZ D 91.2 GeV) cannot decay to a top quark pair because the topquark is too heavy (mtop D 171.2˙ 1.2 GeV). Production of quark pairs is observedas hadrons. The neutrino pairs cannot be observed, but their decay branching ratioscan be accurately estimated from

Γ

Z !

Xi

ν i ν I

!D ΓTOT � Γ (Z ! hadrons)� Γ

Z !

Xi

l i l i

!

(15.179)

where ΓTOT is the total decay width of Z, which can be determined from the cross-section shape (Fig. 15.23). If there is an extra neutrino, the total decay rate ΓTOT

changes, hence an observable branching ratio, for instance that to hadrons BR DΓ (Z ! hadrons)/ΓTOT, does, too.

The data agree quite well with the assumption that Z decays to three neutrinos,three charged leptons and five quarks without top but with three color degreesof freedom. Figure 15.23 shows hadronic total cross sections. From the data thenumber of types of neutrinos is determined to be

Nν D 2.92˙ 0.05 (15.180)

The above argument does not exclude the possibility that a neutrino with massgreater than mZ /2 ' 45 GeV or an exotic neutrino that does not couple to Z (re-ferred to as a “sterile neutrino”) exists. However, the fourth generation, if it exists,

Figure 15.23 Hadronic total cross section vs number of neutrino types. e� C eC ! Z !Pqi q i where qi D u, d, s, c, b. The data agree with the Standard Model and the number of

neutrino types was determined to be 2.92˙ 0.05 [262].

Page 633: Yorikiyo Nagashima - Archive

608 15 Weak Interaction

must include a neutrino with large mass (> 45 GeV). Considering all the knownneutrinos have nearly vanishing mass, it is considered an unlikely situation.

15.8A Step Toward a Unified Theory

15.8.1Organizing the Weak Phenomena

In the Standard Model, the weak force is mediated by W bosons, but that was notnecessarily taken for granted from the beginning. The notion of the force carrierin the weak interaction was seriously considered only when the near universalityof the coupling constant in the muon and the nuclear decay was realized [256].With the establishment of the V–A interaction, it became a main current. Let ussummarize what we have learned so far and see where the W boson fits in.

The phenomenological weak interaction can be reproduced by the Lagrangian

LWI D �4GFp

2( JC

L )μ( J�L )μ (15.181a)

( JCL )μ D ( J�

Lμ )† D

Xi

ψ i Lγ μ τCψ i L (15.181b)

where τC is the charge-raising operator and not to be confused with the τ lepton.

ψi D

�ud0

�,

�cs0

�,

�tb0

�,

�ν e

e�

�,

�νμ

μ�

�,

�ντ

τ�

�(15.181c)

ψL D12

(1� γ 5)ψ (15.181d)

Namely, all the weak interactions of the elementary particles have the same com-mon coupling constant GF and are of V–A type. We can rephrase the term V–Ainteraction to mean that only left-handed (or more rigorously negative chirality)particles carry the weak charge that generates the force. Then it can be simply stat-ed that the weak force is a universal vector-type interaction. Fermi’s intuition wasright after all.

At about the same time as Fermi proposed the weak theory, Yukawa proposedthe π meson as the nuclear force carrier. He tried to explain the weak force usingthe same π but was unsuccessful. However, the existence of a force carrier is a far-sighted concept that is carried over to the modern Standard Model. If one consid-ers the weak force to be mediated by a vector boson (W ) having a large mass, theFermi interaction can be reproduced as a second-order process, as we discussedat the very beginning (Sect. 15.2). The characteristics of the weak force having avector force carrier with universal coupling constant is reminiscent of QED. Onenaturally wonders whether the weak interaction has the same structure as the elec-tromagnetic force.

Page 634: Yorikiyo Nagashima - Archive

15.8 A Step Toward a Unified Theory 609

The Fermi coupling constant has dimensions of “mass2”. In the old days, theproton mass was used, making the value of GF � 10�5/m2

p very small. This isthe origin of the name “weak interaction”. However, if mW (� 80 GeV) is used,the strength of the coupling constant is about the same order as the electromag-netic force. This was one of the first clues to the unified theory. In other words,in the high-energy region, where the typical energy scale is comparable to or larg-er than mW , the effect of the weak force would be just as strong as that of theelectromagnetic force. The weak force is weak simply because the energy scale Ebeing considered is much smaller than the mass of the force carrier, reducing theeffective strength of the force in proportion to E 2/m2

W .Once we accept the existence of W’s, not only can all the weak interaction dy-

namics be reproduced in an organized way but also the resemblance to QED isquite apparent:

LWI D �gWp

2

Xi

hψ i L γ μ τCψ i L W C

μ C h.c.i

(15.182a)

g2W

8m2WD

GFp

2(15.182b)

The factor 1/p

2 is conveniently added for later discussions and has no specialmeaning. In summary, all the weak reactions can be described in a simple andbeautiful form by taking into account

1. the quark model2. V–A interaction3. W ˙ as the force carrier4. the Cabibbo rotation

Figure 15.24 describes how various weak reactions can be interpreted by the Wmediating processes. The V–A interaction type was determined around 1957. Untilthen interpretations of decays were in great confusion, but the discovery of pari-ty violation prompted a rapid convergence within a year. 1957 was an epoch in thehistory of the weak interaction. A big tide of aspiration for the discovery of the Wboson emerged, and from then on many accelerator construction proposals tried tojustify themselves by including the potential capability of W production. However,the mass being unknown at that time, whether a proposed accelerator could real-ly produce it remained uncertain. Then finally came the Weinberg–Salam model,which predicted the mass accurately.

The universal V–A theory could be said to be a summary of the pre-Weinberg–Salam era, and it is very powerful phenomenologically. Even today, unless onewants to discuss the direct production of W and related phenomena, it can givesatisfactory explanations for not only decay reactions but also neutrino scatterings.The phenomenological success was a big factor in causing the Fermi theory to betaken seriously despite its theoretical shortcomings, which will be described in thefollowing. Similarity with QED, including the universality and vector-type interac-tion and the fact that the weak force shares part of it with the electromagnetic force

Page 635: Yorikiyo Nagashima - Archive

610 15 Weak Interaction

νee

νμ

μ-

W W+W+

u d d

u d up

n K0s d

u d d u

d d u

u d u

n (or p)

π+ π(a) (b) (c) (d)

(e) (f) (g)

X

gμgegdgd gd

gs

νee

μ-

νμ

Wgμ ge

νee

n

p

Wge

K0

π+ π

νμ

μ

W+

X

n, p

Figure 15.24 Reorganizing hadronic weakinteractions in terms of quarks and W. All theinteractions are reproduced by a current madeof (u, d0) (where d0 D d sin θc C s sin θc),(νe , e�) and (νμ , μ�) coupled to W ˙.(a) The muon decay is elementary as it is.

When (b) decay, (c) K0 ! πCπ� and(d) νμ n ! μ� X (hadrons) are rewrittenusing the quarks, they are represented by(e), (f) and (g), respectively. The couplingconstants are written as gd D gW cos θc,gs D gW sin θc and ge D gμ D gW .

as exemplified in weak magnetism, provided crucial clues and a springboard for aunified theory.

15.8.2Limitations of the Fermi Theory

Although phenomenologically powerful, it is immediately apparent that the four-Fermi V–A theory is not satisfactory from the theoretical point of view. For instance,the scattering cross section of a neutrino by an electron (ν e e� ! ν e e�) can beeasily calculated using Eq. (15.181) to give

dσdΩ

(ν e e� ! ν e e�) DG2

F

4π2 s , σTOT DG2

F

πs (15.183)

The cross section increases proportional to the square of the center of mass en-ergy s. On the other hand, the unitarity restricts the cross section with angularmomentum J (unitary limit)

σTOT (2 J C 1)πλ/2D (2 J C 1)π/p 2 D 4π(2 J C 1)/s (15.184)

Page 636: Yorikiyo Nagashima - Archive

15.8 A Step Toward a Unified Theory 611

As the four-fermion Lagrangian is a contact interaction, only the S wave con-tributes, hence the theory is valid only if

σTOT < πλ/2! s . 2π/GF ! p . 300 GeV (15.185)

One might think the introduction of W may save the situation. However, in the cal-culation of higher order corrections, the W boson contributions to the intermediatestate have to be added or are integrated over the energy. In the limit p !1, all themasses can be neglected and the most divergent term in the correction generallyhas the form (see for instance Eq. (8.2))

� GF

Zf (p )

d4 pp 4

(15.186)

In the case of the weak force, the coupling constant GF has dimension E�2, whichmakes f (p ) / p 2 and the integral diverges badly. The divergence at p !1 is theinevitable fate of quantum field theory, but there are two kinds of divergences, goodand bad. The four-Fermi interaction is a typical example of bad divergence. Gooddivergence means that it can be removed by introducing a finite number of correc-tion terms (renormalizable theory) and gives correct results if treated with the rightrecipe, as was clarified in QED. We saw that the anomalous magnetic moment ofthe electron and muon was reproduced to an accuracy of 10�10 and that the appli-cability of QED extends to the energy region as high as a few TeV. This fact providesa solid foundation for the renormalizable theory that, despite its shortcomings, isvery close to the true theory. This is why QED is always considered as a model inreaching the right theory. In QED, the coupling constant corresponding to GF isdimensionless. If one chooses QED as a model, the coupling constant has to be di-mensionless, too. Therefore introduction of W seems the right step for remedyingthe problem.

Introduction of W Let us introduce W and see that gW in the LagrangianEq. (15.182) is indeed dimensionless. Let us consider ν eCe� ! ν eCe� again. Theinteraction Lagrangian has a similar form to the QED Lagrangian LQED D � j μ A μ:

LWI D �gWp

2(νL γ μ eL W C

μ C eL γ μ νL W �μ ) (15.187)

That gW is dimensionless can be seen as follows. Pick up the mass term in theLagrangian of the vector field [see Eqs. (5.20d) and (5.22a)] and consider its physicaldimension:�Z

d3xLW

�D

�Zd3x mψψ

�D

�Zd3 x m2

W Wμ W μ�D [E ] (15.188)

This gives [L] D [E 4], [ψ] D [E ]3/2, [Wμ ] D [E ], and comparing the dimensions ofthe two sides of Eq. (15.187) proves gW is dimensionless.

Page 637: Yorikiyo Nagashima - Archive

612 15 Weak Interaction

νe

νe

νe

νe

νe

e

e

e

e

e

W+

W+

(a) (b)

W+

p3 p4

p1 p2

p3 p4

k qp1 k p2+k

kp1 p2

q p1 p3

Figure 15.25 νe e� scattering diagram. (a) The lowest order. (b) Higher order W exchange.

Now we can investigate the effect of W exchange in ν e scattering as shown inFig. 15.25. The matrix element of Fig. 15.25a is given by

�i T f i D

�e��i

gWp

2γ μ�

(1� γ 5)2

ν e

��i�gμν � qμ qν/m2

W

�q2 � m2

W

�ν e

��i

gWp

2γ ν�

(1� γ 5)2

e��

(15.189)

The differential cross section becomes

dσdΩD

G2F s

4π2

�m2

W

m2W � q2

�2

D(G2

F/4π2)s�1C

�s/m2

W

�(1� cos θ )

2 (15.190)

Integration over the solid angle gives

σTOT (ν e e� ! ν e e�) DG2

F sπ

11C s/m2

W(15.191)

σTOT ! constant, which is a great improvement compared to linear divergence inEq. (15.183).

However, if we calculate the S-wave scattering amplitude f0

f0 DGF s4π

Zd cos θ

11C

�s/m2

W

�(1� cos θ )

DGF m2

W

2πln�

1Cs

m2W

� (15.192)

we find this diverges logarithmically and the problem is not solved, at least not inthe lowest order. What about higher order contributions? If we consider a diagramthat contains two W’s in the intermediate states like the one in Fig. 15.25b, we needto add those for all polarization states of W as well as the internal momentum k.As the Feynman propagator of the W boson is given by

ΔF(q) �P

λ �μ(λ)�ν(λ)�

q2 � m2W

D�gμν C qμ qν/m2

w

q2 � m2W

(15.193)

Page 638: Yorikiyo Nagashima - Archive

15.8 A Step Toward a Unified Theory 613

the contribution of Fig. 15.25b contains an integral of the form

� g4W

Zd4k ΔF�σ(k � q)ΔFμν(k)

��u(p3)γ�SF(p1 � k)γ μ u(p1)

�u(p4)γ σ SF(p2 C k)γ ν u(p2)

SF(p ) D

1(2π)4

Zd4 p e�i p �x 6p C m

p 2 � m2 C i ε

(15.194)

If we pick up the dominant contribution at high energy (k !1), it becomes

� g4W

Zd4k

k� k σ/m2W

k2

k μ k ν/m2W

k2

E 6 kk2

E 6 kk2

�g2

W E 2

m2W

�g2

W

m2W

Zd4kk4

O(k2) (15.195)

where E 2 arises from the normalization of the wave function and�g2

W E 2/m2W

�can

be taken away from the divergence consideration. The last expression correspondsto f (p ) � p 2 in Eq. (15.186). One notices that the divergent term is generated byqμ qν in the numerator of the W propagator.

Origin of Divergences The origin of the divergent term is easy to understand. Amassive vector boson has three degrees of freedom in the spin components. Thepolarization vector �μ(λ), (λ D 1, . . . , 3) in its rest frame can be taken as

�μ(1) D (0, 1, 0, 0) (15.196a)

�μ(2) D (0, 0, 1, 0) (15.196b)

�μ(3) D (0, 0, 0, 1) (15.196c)

Setting the z-axis along the momentum direction of the W, and Lorentz boostingthe particle to where W has p μ

W D (ω, 0, 0, q), we find the transverse polarizationsdo not change but the longitudinal polarization is transformed to

�μ(3) D (q/mW , 0, 0, ω/mW ) (15.197)

We realize that the second term in the numerator of the propagator (15.193) comesfrom the longitudinal polarization. This is equivalent to introducing the dimension[E ]�2 in the coupling constant and induces a divergent integral. When the mass ofthe force carrier is zero, it has only transverse polarization and the divergent termis not generated.

Furthermore, gauge invariance, which forces the mass of the force carrier tovanish, also guarantees the reduction of the divergent terms [see Eq. (8.51) andassociated arguments]. It was QED’s lesson that a gauge-invariant theory has, afterall, at most logarithmic divergences and they can be subtracted by renormalizingthe coupling constant and the mass.

Page 639: Yorikiyo Nagashima - Archive

614 15 Weak Interaction

15.8.3Introduction of SU(2)

If one wants to apply gauge theory to W’s, one has to deal with the difference be-tween the W boson and the photon. This is twofold. First, the W boson is charged,and second, it has a finite and large mass. In order to solve the problems one ata time, we forget the finite mass for the moment and see if gauge theory can beapplied to the charged force carrier. Let us consider what kind of symmetry it canhave. Defining a column vector

ΨL D

�νL

e�L

�(15.198)

and using

τ˙ D12

(τ1 ˙ i τ2) (15.199)

where τ i , i D 1, . . . , 3 are the Pauli matrices, we can reformulate the Lagrangian(15.187) as

LWI D �gwp

2Ψ L γ μ�τC W C

μ C τ� W �μ

�ΨL

D �gw

2Ψ L γ μ(τ1 W1 μ C τ2 W2 μ)ΨL

W ˙ DW1 � i W2p

2

(15.200)

The form reminds one of the isospin-independent nuclear force:

LN D �igN

2Ψ N γ 5(τ � π)ΨN

D �i gN Ψ N

"τCπC C τ�π�

p2

C12

τ3π0

#ΨN

ΨN D

�pn

�, π D (π1, π2, π3)

(15.201)

Apart from the space-time structure (not vector but pseudoscalar), it has almostidentical structure to Eq. (15.200) if we replace

ΨN ! ΨL , π ! W (15.202)

Here, we need to introduce a neutral W 0 to make them symmetrical and add(τ3/2)W μ

3 to terms inside the parentheses on the second line of Eq. (15.200). Thenuclear force was isospin independent or invariant under exchange of p and n. Thisis S U(2) symmetry, to which the gauge theory of QED was generalized by Yang andMills in 1954 [396]. Therefore, requiring the weak interaction to be invariant under

Page 640: Yorikiyo Nagashima - Archive

15.8 A Step Toward a Unified Theory 615

exchange of (ν e $ e�) or the “weak isospin independence”, we postulate the weakLagrangian has the form

LWI D �gW

2Ψ L γ μ(τ � W μ)ΨL (15.203)

The S U(2) symmetry assumption is to require the existence of the third member,a neutral W 0 boson. For comparison’s sake let us write down the electromagneticLagrangian for the (ν e , e�) pair:

LEM D e(eγ μ e) De2

Ψ γ μ(1� τ3)Ψ A μ , Ψ D�

ν e

e

�(15.204)

Comparing Eq. (15.204) with the weak current we see immediately that the isospincurrent

j iμ � Ψ γ μ τ i Ψ (15.205)

appears in both the weak and the electromagnetic current. In fact, we have alreadyseen that the vector part of the weak current and the electromagnetic current isidentical apart from the coupling constant. It is a consequence of the CVC hypoth-esis and has been experimentally confirmed. Thus, the weak current not only re-sembles the electromagnetic current but also owns a part of it jointly. This stronglysuggests that the weak interaction is also governed by a gauge theory and is a partof a larger symmetry group containing both the weak force and electromagnetism.Conjecture had reached this point when the V–A interaction was determined in1957. Glashow even conceived a model to unify the weak and electromagnetic in-teractions [173]. However, the finite mass of W, which apparently violates gauge in-variance, was the obstacle to going further. To reach the electroweak unified model,deep insight to the theory of gauge invariance is necessary. We shall discuss gaugetheory and the electroweak theory in Chap. 18.

Page 641: Yorikiyo Nagashima - Archive
Page 642: Yorikiyo Nagashima - Archive

617

16Neutral Kaons and CP Violation*

Symmetry is a main theme of modern particle physics. The discovery of parityviolation was an epoch-making event in its history and we have learned in the pre-vious chapter that it stems from the different action of the weak force on left- andright-handed particles. Pursuit of its origin has led us to chiral gauge symmetry,which is the foundation of modern particle theory. The discovery of CP violation in1964 was also unexpected, and despite 50 years’ effort there has been little progressin understanding its true nature. Nevertheless, we have ample reason to believethat elucidating its origin would establish yet another paradigm of particle physics,because recently it was confirmed that the framework in which CP violation is atwork requires (at least) three generations of quarks and leptons, whose existenceremains a mystery referred to as the flavor problem (or family or generation prob-lem). It is also known that CP violation played a crucial role in the early universe indeveloping from the hot big bang chaos to the present form of a matter-dominantuniverse. There is a good chance that pursuit of CP violation will lead to true un-derstanding of the origin of the universe and uncover the mystery of flavors, aboutwhich the Standard Model has little to offer.

CP violation is built into the Standard Model within the framework of theKobayashi–Maskawa (KM) matrix, but its dynamical origin remains to be clarified.It is also suggested that although CP violation as observed experimentally canbe fully explained by the KM framework, it is not enough to explain the matter–antimatter asymmetry of the universe. In this sense “CP violation” can be consid-ered as an exceptional topic in the Standard Model. Needless to say, concentratingon a small anomaly has often provided a chance to open a door to a new view, and,in the author’s personal opinion, understanding the vacuum issues (also knownas the Higgs mechanism and dark energy) and CP violation are the two key factorsto a new physics beyond the Standard Model. In this chapter, we discuss neutralkaon processes, in which the CP violation was discovered and which are the mostinvestigated. Although we do not touch phenomena involving B decays, the neutralkaon provides sensitive tests for other fundamental questions, including T, CPTinvariance and quantum gravity, and deserves a special chapter of its own.

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 643: Yorikiyo Nagashima - Archive

618 16 Neutral Kaons and CP Violation*

16.1Introduction

16.1.1Strangeness Eigenstates and CP Eigenstates

There are two kinds of neutral kaons, K0 and K0, which have strangeness S D C1

and S D �1, respectively. S is a conserved quantum number in the strong inter-action, as can be seen in the reactions with particles that have known strangeness(Sπ D Sp D Sn D 0, SΛ D �1):

π� (du) C p (uud) ! K0 (d s) C Λ (uds) (a)S 0 0 C1 �1

π� (du) C p (uud) ! K0 (d s) C K0

(sd) C n (udd) (b)S 0 0 C1 �1 0

(16.1a)

K0(d s) C p (uud) ! KC (us) C n (udd) (c)S C1 0 C1 0

K0(sd) C n (udd) ! K� (su) C p (uud) (d)

S �1 0 �1 0K

0(sd) C p (uud) ! πC (ud) C Λ (uds) (e)

S �1 0 0 �1K

0(sd) C n (udd) ! π� (du) C Σ C(uus) (f)

S �1 0 0 �1

(16.1b)

K0 (d s, S D C1) pairs with KC(us) and K0

(sd, S D �1) pairs with K� (su) tomake isospin doublets that are antiparticles of each other:�

KCK0

�,

"�K

0

K�

#(16.2)

The above strangeness-conserving reactions and their doublet structure are obvi-ous if one analyzes them in terms of the quark model, taking note of the fact thatthe strong interaction does not change a quark’s flavor.

As to decay processes, they have common decay modes including 2π, 3π andπ˙ l�ν and cannot be distinguished by strangeness. This reflects the simple factthat it is not a good quantum number in the weak interaction. On the other hand,since the discovery of P as well as C violation, the combined operation CP hasbeen considered to be conserved in the weak interaction. We shall come back toCP violation later, but let us discuss processes first assuming it is conserved. Wedefine a CP operator by

CPjK0i D e i �CP jK0i D ηCPjK

0i ,

CPjK0i D e�i �CP jK0i D η�

CPjK0i ,

(16.3)

Page 644: Yorikiyo Nagashima - Archive

16.1 Introduction 619

It is conventional to set the phase �CP D 0 and define CP eigenstates by

jK1i D (jK0i C jK0i)/p

2 (16.4a)

jK2i D (jK0i � jK0i)/p

2 (16.4b)

Then CPjK1i D jK1i, CPjK2i D �jK2i and K1, K2 can have different decay modes.The 2π and 3π systems to which the K meson can decay have CP values +1 and�1, respectively, hence the decay modes are separated as K1 ! 2π, K2 ! 3π.

(1) 2π: Remembering P π D �π, P(2π) D (�1)l and the C operation exchangesπC $ π�, we find Bose statistics dictates that C and P operations give the samesign. Therefore CP(πCπ�) D (�1)2l D C1, where l is the orbital angular mo-mentum. As CPπ0 D π0, Bose statistics forces l of the 2π0 system to be even andCP(π0π0) D C1.

(2) 3π: For πCπ�π0, denoting the orbital angular momentum of πCπ� as l andthat of π0 relative to the center of mass of πCπ� as L, we have CP(πCπ�π0) D(�1)2lCLC3. As the spin of the K meson is 0, l D L. Because of the angular mo-mentum barrier (see Sect. 13.5.3), the dominant component is CP D �1.1) For3π0, the same argument as for 2π0 makes l D even and CP(3π0) D �1.

The above argument shows that K1 decays to 2π and K2 to 3π. As the Q-value(mK �

Pmπ ) of K1 ! 2π is much larger than that of K2 ! 3π, the K1 ! 2π

decay rate is much larger than that of K2 ! 3π. Namely, K1 is short lived comparedto K2.

16.1.2Schrödinger Equation for K0 � K

0States

Both K0 and K0

share common decay modes 2π and 3π. This means a transitionK0 $ (2π, 3π)$ K

0or mixing is possible and hK0jH jK

0i ¤ 0. The existence of

the mixing means that K0 can change to K0

in vacuum. Time development of thewave function of the K0, K system can be written as

i@

@tψ(t) D H ψ(t) (16.5a)

ψ(t) D a(t)jK0i C b(t)jK0i C

Xn

cm(t)jmi (16.5b)

Here, jmi is any other state than K0 and K0

into which K0, K0

can decay. TheHamiltonian H is hermitian and contains the strong interaction as well as the weakinteraction, which induces mixing. However, it is very difficult to solve the equationin general. It is convenient to consider only two states K0, K

0and introduce a mass

operator Λ such that the two-component wave function obeys the Schrödinger-typeequation

i@ψ(t)

@tD Λψ(t) , ψ(t) D α(t)jK0i C (t)jK

0i (16.6)

1) Because of this K1 ! 3π can occur (albeit rarely), but K2 !/ 2π.

Page 645: Yorikiyo Nagashima - Archive

620 16 Neutral Kaons and CP Violation*

Any transition to other states (m ¤ K0, K0) is considered as K0 or K

0lost from

this two-component system. Then the probability amplitude in the kaon rest framemust decrease with time. The conservation of the probability (or unitarity) means

@�

@tD

@(ψ† ψ)@t

D �i ψ†(Λ � Λ†)ψ � �ψ†(t)Γ ψ(t) (16.7)

This means Λ can be expressed in terms of two hermitian operators M and Γ

Λ DΛ C Λ†

2C

Λ � Λ†

2� M �

i2

Γ , M† D M , Γ † D Γ (16.8)

Then the time evolution of the eigenstate of Λ can be expressed as

e�i Λ tjKi � exp��i mK t �

Γ2

t�jKi (16.9)

Multiplying hK0j, hK0j from left and noting that jK0i and jK

0i are orthogonal, Eq.

(16.6) can be cast in matrix form

i@

@t

�α(t)(t)

�D

"hK0jΛjK0i hK0jΛjK

0i

hK0jΛjK0i hK

0jΛjK

0i

#�α(t)(t)

�D

�Λ11 Λ12

Λ21 Λ22

��α(t)(t)

�(16.10)

Note the mass matrix Λ has complex eigenvalues and is not a hermitian operator.The mass matrix can be expressed as (see Appendix H)

Λ i j D Mi j � iΓi j

2(16.11a)

Mi j D mK δ i j C hijHW j j i CX

n¤K0,K0

PhijHW jnihnjHW j j i

m j � En(16.11b)

Γi j D 2πX

n¤K0,K0

δ(m j � En)hijHW jnihnjHW j j i (16.11c)

where i, j denote K0 or K0

and P means principal value. The intermediate statesn is any state that K0 or K

0can transition through the weak interaction. The above

formula is referred to as the Weisskopf–Wigner approximation. In this approxima-tion weak interactions between final states are neglected and it is valid for time tlarge compared with the strong interaction scale tst � 1/mK � 10�23�24 s. Equa-tions (16.11), except for the first term in (b), look like the second-order transitionmatrix Eq. (6.37) introduced in Chap. 6. However, there are some differences. Here,the sum over intermediate states is taken only over continuum states to whichK0, K

0can decay and does not include K0, K

0themselves. In the transition matrix

Eq. (6.37), the intermediate states constitute a complete set and the formula is validfor time t much greater than tst.

Page 646: Yorikiyo Nagashima - Archive

16.1 Introduction 621

Diagonalization of Eq. (16.10) changes (α, ) to (α0, 0), which is a representationin the bases jK1i, jK2i

ψ D α0(t)jK1i C 0(t)jK2i D α(t)jK0i C (t)jK0i (16.12a)

i@

@t

�α00

�D Λdia

�α00

�D

�λS 00 λL

� �α00

�, (16.12b)

λS D mS �i2

ΓS λL D mL �i2

ΓL (16.12c)

We use K1,2 to denote CP D ˙1 eigenstates, but have attached subscripts S, L todenote the eigenvalues. The nomenclature KS, L is saved for the weak decay eigen-states including CP violation effects that appear later. Note λ is a complex numberbut mS, L, ΓS, L are real numbers. Representations in the K0, K

0and K1, K2 bases

are related by�α

�D U

�α00

�, U D

1p

2

�1 11 �1

�(16.13a)

Λ D�

Λ11 Λ12

Λ21 Λ22

�D U ΛdiaU�1 D

�M �Δλ/2�Δλ/2 M

�(16.13b)

M D (λL C λS)/2 , Δλ D λL � λS (16.13c)

The eigenvalues are expressed in terms of the original mass matrices as

λSLD

Λ11 C Λ22

s�Λ11 � Λ22

2

�2

C Λ12Λ21 � M � Δλ/2 (16.14a)

Tr[Λ] D λS C λL D Λ11 C Λ22 (16.14b)

det[Λ] D λS λL D Λ11Λ22 � Λ12Λ21 (16.14c)

The symmetry requirements constrain the matrix elements, which will be dis-cussed in detail in Sect. 16.2.1:

Λ11 D Λ22 D M , Δλ D �2p

Λ12Λ21 CPT (16.15a)

jΛ12j D jΛ21j T (16.15b)

Λ11 D Λ22 D M and jΛ12j D jΛ21j CP (16.15c)

Note that for off-diagonal elements (Λ12, Λ21), only the absolute magnitude can beconstrained. This is because we are free to change their phase,

jK0i ! e�i � jK0i , jK0i ! e i � jK

0i (16.16)

which originates from strangeness conservation in the strong interaction. Orrather, the freedom to change the phase is the origin of the strangeness con-

Page 647: Yorikiyo Nagashima - Archive

622 16 Neutral Kaons and CP Violation*

servation (Noether’s theorem).2) By rephasing the elements jK0i ! e i � jK

0i,

jK0i ! e�i � jK0i,

Λ12 ! Λ12e2 i � , Λ21 ! Λ21e�2 i � (16.17)

Observable quantities do not depend on the phase convention. Note, however, thatonce one has fixed it one has to stick to it, otherwise contradictions will ensue.Referring to Eq. (16.11), we see that Eq. (16.15a) means the mass and lifetime of aparticle and its antiparticle are the same.

Problem 16.1

Derive Eq. (16.13) and Eq. (16.14).

16.1.3Strangeness Oscillation

Strangeness oscillation is a phenomenon that K0 and K0

states transition back andforth. Suppose we have a pure jK0i beam at t D 0. It can be made in several ways:

1. By using the difference of the energy threshold between reactions (a) (Eth D

0.91 GeV) and (b) (Eth D 1.5 GeV) in Eq. (16.1a) and setting the π energy be-tween 0.91 and 1.5 GeV.

2. By stopping a high-flux low-energy antiproton beam in a pressurized hydrogengas target and selecting the following reactions:

p C p !

(K�πCK0

KCπ�K0 (16.18)

The CPLEAR experiment at CERN [109] used a beam with momentum set atp D 200 MeV/c and obtained pure K0 (K

0) by tagging K�πC (KCπ�).

3. By using a φ factory. An electron–positron collider with total center of massenergy

ps D mφ D 1019.4 MeV called DAPHNE of INFN (Istituto Nazionale

di Fisica Nucleare) at Frascati, Italy [237] can produce a coherent K0–K0

statewith J P C D 1�� by

e� C eC ! φ , φ ! K0K0

(16.19)

If both CPT and CP invariance hold, K1, K2 are the mass eigenstates. SettingjK0(t D 0)i D jK0i, its time development can be described as

jK0(t)i D [jK1(t)i C jK2(t)i]/p

2 D [jK1ie�λS t C jK2ie�λL t ]/p

2

D gC(t)jK0i � g�(t)jK0i (16.20a)

g˙(t) D12

he�λL t ˙ e�λS t

i(16.20b)

2) Rephasing would change the CP phase ηCP ! ηCP e2 i� defined in Eq. (16.3) for consistency, but itwould not change any conclusion related to CP phase.

Page 648: Yorikiyo Nagashima - Archive

16.1 Introduction 623

which shows that a K0

component appears in the originally pure K0 beam as timepasses. The probability of detecting K0 (or K

0) at time t is given by

P(K0 ! K0) D jhK0jK0(t)ij2 D jgC(t)j2 (16.21a)

P(K0 ! K0) D jhK

0jK0(t)ij2 D jg�(t)j2 (16.21b)

jg˙j2 D14

he�ΓS t C e�ΓL t ˙ 2e�Γ t cos Δmt

iD

e�Γ t

2

�cosh

ΔΓ2

t ˙ cos Δmt�

(16.21c)

Γ D12

(ΓS C ΓL) 'ΓS

2, ΔΓ D ΓS � ΓL ' ΓS , Δm D mL � mS,

(16.21d)

Note, if CP invariance is assumed, P(K0 ! K0) D P(K0! K

0) and P(K0 !

K0) D P(K

0! K0). Equations (16.21) mean components of K0 and K

0oscillate

as a function of time. If Δm Γ ' ΓS, the oscillation is too fast to detect K0

and K0

separately. However, if Δm ' ΓS, the oscillation can be observed. Obser-vations, show Δm ' ΓS/2. In Fig. 16.1, however, an oscillation with Δm D ΔΓ isillustrated to emphasize the oscillation feature.

The right-hand panel of Fig. 16.1 shows asymmetry data of CPLEAR [107] de-fined as

A Δ m(τ) DfP(K0 ! K0)C P(K

0! K

0)g � fP(K0 ! K

0)C P(K

0! K0)g

fP(K0 ! K0)C P(K0! K

0)g C fP(K0 ! K

0)C P(K

0! K0)g

3)

DjgC(t)j2 � jg�(t)j2

jgC(t)j2 C jg�(t)j2D

cos Δmτcosh ΔΓτ/2

(16.22)

where τ is the proper time of the kaon. From the observed curve, Δm can be deter-mined and is given by

Δm D mL � mS D 3.485˙ 0.01 � 10�6 eV

D 0.4738˙ 0.0013 ΓS(16.23)

The sign of Δm was determined using the regeneration process described in thenext section.

In order to observe the strangeness oscillation, K0 and K0

have to be identified asa function of time. Since they are eigenstates of the strong interaction, the observa-tion must be done in principle not by their decay mode but by their collision modein the strong interaction. Notice, however, that the subsequent decay mode has tobe identified to confirm what reaction has occurred. For example, in a hydrogen

3) Their measurement is sufficiently accurate that CPT- and T-violating terms have to be considered.

Then P(K0 ! K0) ¤ P(K0! K

0), P(K0 ! K

0) ¤ P(K ! K0). However, by taking the sum,

both CPT- and T-violating terms are cancelled and give the same expressions as Eqs. (16.21).

Page 649: Yorikiyo Nagashima - Archive

624 16 Neutral Kaons and CP Violation*

0 1 2 3 4 5 6 τ/τS

1.0

0.75

0.50

0.25

0

Δm = 0

Δm = ΓS

Δm = 0

τ

(a)

(b)

Figure 16.1 (a) Illustration of the strangenessoscillation. The intensity of a pure K0 beam att D 0 decreases as time passes (dashed line)and that of K

0, which did not exist at t D 0,

appears. If Δm ¤ 0, the two curves oscillate.Here, Δm is set equal to Γ to emphasize the

oscillation behavior. (b) Asymmetry A Δm (τ)data of CPLEAR of strangeness-conservingand strangeness-changing components. Thestrangeness was tagged by the sign of thedecay leptons [107].

Page 650: Yorikiyo Nagashima - Archive

16.1 Introduction 625

K0 K0

π+

πK0 K0

π+

ππ0

Figure 16.2 K0 � K0

mixing through 2π, 3π intermediate states.

bubble chamber, that a K0 was produced by the reaction π� p ! K0Λ at t D 0 canbe confirmed by detecting subsequent decay of Λ ! π� p (τ � 2.63 � 10�10 s).K

0at time t can be detected, for example, by observing the reaction K

0p ! πCΛ,

which leaves a visible πC track, but must be confirmed by its associated Λ decay.As the momentum and direction of K0 can be reconstructed from πC and Λ, theflight distance of K can be converted to its proper time. Other usable reactions are

K0 C p ! KC C n , K0C n ! K� C n ,

K0C n ! π0 C Λ(Λ ! π� C p )

(16.24)

Having said that, we mention another and more practical method which is a specialfeature to K0 � K

0system, that is, to use the ΔS D ΔQ selection rule. It is a rule

that hadronic component in the decay process has to respect because at the quarklevel it is s ! uCW � (see Sect. 15.6). As mK˙ > mK0 , no strangeness-conservingdecay mode d ! ue�ν e (d ! ueCν e) exists, only the strangeness-changing modes ! ue�ν e (s ! ueCν e) can occur, and the sign of the decay leptons can tell uswhich of the parents it comes from. The strangeness oscillation curve of CPLEARin Fig. 16.1 was tagged in this way. Note, the ΔS D ΔQ rule is obvious in thequark model, but experimentally one has to be careful about the possible existenceof ΔS ¤ ΔQ processes.

K0K0

s

s

d

d

i

j

W+ W+ K0K0

s

s

d

d

i j

W+

W+

i, j = u, c, t

K0

s

d

u

u

W+

d

dπ+

π-

K0

s

d u

uW+ d

d

π+

π-

(a) (b)

(c) (d)

Figure 16.3 Box diagram. (a,b) K0–K0

mixing in the Standard Model. (c,d) Description of how2π states appear in the intermediate states by cutting the mass matrix diagram across the dash-dotted lines. 3π or π lν states can be constructed similarly.

Page 651: Yorikiyo Nagashima - Archive

626 16 Neutral Kaons and CP Violation*

The mass difference Δm arises because of mixing between K0 and K0

throughintermediate states such as those depicted in Fig. 16.2. Namely, it is an effect thatarises from the nondiagonal elements of the mass matrix. In the language of theStandard Model, the intermediate states are given by W ˙ and/or up-type quarks(u, c, t) shown in Fig. 16.3a,b.

Figure 16.3c,d shows how to associate the modern interpretation in terms ofquarks with the old picture (2π, l πν, etc.). The mixing is a ΔS D 2 effect andsecond-order in the weak interaction. Therefore Δm should be proportional to thesquare of the coupling constant G2

F sin2 θc. From dimensional analysis

Δm � G2F sin2 θcm5

K � 10�4 eV (16.25)

which is not too far from the observed value.

16.1.4Regeneration of K1

A pure beam, whether K0 or K0

at t D 0, becomes a pure K2 beam after a longenough time (t τS D 1/ΓS) where τS is K1’s mean lifetime. This can beconfirmed by the fact that two-prong decay (a topology that splits to two branch-es) modes appear in the neighborhood of production (l . lS D γ τSc), whereγ D EK/mK is the Lorentz factor, but at distances l lS only three-prong decaymodes are observed (Fig. 16.4).

Let us consider making this beam pass through matter. K0 having S D C1 iselastically (K0N ! K0N ) or quasi-elastically (K0 p ! KCn) scattered at most, butK

0, having S D �1, can also produce hyperons [Eqs. (16.1b) (e)(f)] and thus have

larger absorption rate. If we write the states after matter passage as

jK0i ! f jK0i, jK0i ! f jKi (16.26)

then

jK2i D (jK0i � jK0i)/p

2! ( f jK0i � f jK0i)/p

2

D12

( f C f )jK2i C12

( f � f )jK1i(16.27)

lS lSl >> lS

Production Target Regeneration Target

Figure 16.4 Regeneration of K1. Two-prong decay modes (K ! 2π) are limited to the neighbor-hood (l . lS) of the production and regeneration targets. Three-prong decay modes (K ! 3π)are observed at long distances (l � lS) as well as at short distances.

Page 652: Yorikiyo Nagashima - Archive

16.1 Introduction 627

Since f ¤ f , the K1 component, which did not exist before, appears after passagethrough the matter, which is referred to as regeneration.4) The regeneration occursfrom the difference in the reaction rate and does not depend on which reaction.However, depending on the target size, it can be classified as inelastic scattering,in which the kaon reacts with individual nucleons in the nucleus; diffraction scat-tering, in which it is scattered elastically by the nucleus as a whole; and coherentscattering, in which it reacts coherently with many atoms. Among them, coherentscattering plays a special role in the study of CP violation and deserves a more de-tailed explanation. As there is a small difference between the masses of K1 and K2,the momentum k2 of the parent K2 and k1 of the daughter K1 are related by

k22 C m2

L D k21 C m2

S

Δ k D k1 � k2 DmL C mS

k1 C k2(mL � mS) D

mk

Δm

m D (mL C mS)/2 , k D (k1 C k2)/2

(16.28)

Because of the Heisenberg uncertainty principle, scattering from targets withinthe range d . 1/Δ k D k/mΔm can interfere coherently. If we take k D 10 GeV,insertion of m D mK ' 0.5 GeV, Δm ' 3.5 � 10�6 eV gives d � 100 cm. Thescattering angle by coherent reaction becomes

d(1 � cos θ ) . λ � 1/ k

θ 2 � 2Δ k/ k � 10�17 (16.29)

The momentum transfer to the generated k1 is pT . 100 eV, which is much small-er than the diffraction scattering (pT . 10 MeV) and inelastic scattering (pT &100 MeV) and it is almost a perfect forward scattering. Figure 16.5 shows an exam-ple of p 2

T distribution after regeneration [244].Writing the regenerated state after passage of matter with thickness L as

jψi D jK2i C �(L)jK1i (16.30)

we see �(L) is given by [181, 257]

�(L) Di πN

kf f (0) � f (0)glS

1 � exp[fi(Δm/ΓS)� 1/2gL/ lS]1/2 � i(Δm/ΓS)

(16.31)

N : Number density of the scattering centerk D mK γ : Momentum of the kaon

f (0), f (0) : Forward scattering amplitude of K0, K0

lS D γ/ΓS : Mean flight distance of K1

γ : Lorentz factorΓS : Total decay rate of K1

Δm D mL � mS : Mass difference of K2 and K1

4) This is like circularly polarized light traversing an optically active medium. It is also a typicalphenomenon characteristic of quantum mechanics and is a manifestation of the principle ofsuperposition.

Page 653: Yorikiyo Nagashima - Archive

628 16 Neutral Kaons and CP Violation*

102

103

104

105

106

107

0 500 1000 1500 2000 2500Reg beam π+π− p 2 MeV2/c2

Even

ts p

er 5

0 M

eV2 /c

2 Data

Total background

Regenerator scatter Collimator scatter

KL→πeνKL→πμν

Figure 16.5 Data p 2T of K ! πCπ� of the re-

generator beam. The Monte Carlo predictionsfor the background components are over-laid. The arrow indicates cut to select sam-

ples [244]. Notice that the width of forwardpeak is determined not by intrinsic scatteringwidth but by experimental resolution.

Proof: As is well known from optical theory, a combined wave of incoming andforward-scattered waves added coherently over all scattering centers is well de-scribed by a plane wave with wave number nk, where n is the index of refrac-tion [176]. The expression for n is

n2 D 1C4πN f (0)

k2 , n2 � 1� 1 (16.32)�

The eigenmass M of the wave is given by

k2 C m2K D (nk)2 C M2

M Dq

m2K � (n2 � 1)k2 ' mK �

(n2 � 1)k2

2mKD mK �

2πN f (0)mK

(16.33)

Since K0 and K0

give different indexes of refraction, the Schrödinger equationEq. (16.10) is modified to

i@

@t

�α(t)(t)

�D

�Λ11 Λ12

Λ21 Λ22

� �α

��

2πNmK

"f 00 f

#�α

�(16.34)

To derive the solution, we first note that the mass matrix can be expressed in ananalogous form to that given in Eq. (16.13b):�

M 0 � δΔλ �Δλ/2�Δλ/2 M 0 C δΔλ

M 0 D12

(λL C λS)�πNmK

( f C f ) , δΔλ DπNmK

( f � f )

(16.35)

Page 654: Yorikiyo Nagashima - Archive

16.1 Introduction 629

As δ is a small number, the solution of Eq. (16.35) is a small perturbation tojK1i, jK2i and is expressed as

jKSMi ' jK1i C δjK2i

jKLMi ' jK2i � δjK1i(16.36)

where only up to O(δ) was considered. The eigenmass is given by the formula

λSMLMD λS

L�

πNmK

( f C f ) (16.37)

Problem 16.2

Show that Eqs. (16.36) are the solutions to Eq. (16.34).

Setting the initial condition as the beam being pure K2 before it enters the matter,we find the amplitude on exiting the regenerator of thickness L is given by

jK2(t)i ' jKLM(t)i C δjK1(t)i ' jKLMie�i λLM t C δjKSMie�i λSM t

' fjK2i � δjK1ige�i λLM t C δjK1ie�i λSM t

D e�i λLM thjK2i C (e iΔλm t � 1)δjK1i

iΔλm D λLM � λSM D λL � λS D Δλ

(16.38)

Inserting time t D l/γ to travel distance l and comparing with Eq. (16.30), weobtain

�(L) D (eiΔλ L/γ � 1)δ D �πNmK

( f � f )Δλ

(1 � e iΔλ L/γ )

Di πN

k( f � f )lS

1 � exp[fi(Δm/ΔΓS) � 1/2gl/ lS]1/2 � i(Δm/ΓS)

j�(L)j2 D π2N 2 j f � f j2

m2K

1C e�l/ lS � 2e�l/(2lS) cos(Δm/ΓS)l/ lSjΔmj2 C Γ 2

S /4

(16.39)

In deriving the above expression, we have neglected ΓL compared to ΓS as ΓS ΓL.

Sign of Δm The sign of Δm can be determined if there is another K1 componentwhose intensity is comparable to that of the regenerated K1.

1. One method is to use a pure K0(K0) beam at t D 0. Suppose the regenerator

of thickness L D γ tL is set at distance D D γ tD from the position ofbeam generation. The amplitude of the beam on exiting the regenerator can bedescribed as

ψ(tD C tL) � jK1ie�i λS(tDCtL) C e�i λL(tDCtL)fjK2i C �(L)jK1ig

D e�i λL(tDCtL)hjK2i C

ne iΔλ(tDCtL) C �(L)

ojK1i

i (16.40)

Page 655: Yorikiyo Nagashima - Archive

630 16 Neutral Kaons and CP Violation*

The intensity of the K1 component is proportional to

e�(DCL)/ lS C j�(L)j2 C 2e�(DCL)/2lSj�(L)j cos�

arg[�(L)] �ΔmΓS

D C LlS

(16.41)

where lS D γ/ΓS is the mean attenuation length of K1. By changing L, thephase of the cos term can be measured as a function of L and hence the sign ofΔm can be determined. Using this method, the sign of Δm D mL � mS wasdetermined to be positive [272].

2. The second method is to use a CP-violating K1 component (to be describedsoon) in the pure KL beam. In this case, the K1 component of the beam afterpassing the regenerator is given by

hK1jψ(t)i � ηe�i λLt C �(L)e�i λSt (16.42)

where η is the CP-violating amplitude (K1 in KL). The K1 intensity is propor-tional to

jηj2e�ΓL t C j�j2 e�ΓS t C 2jηjj�je�Γ t cos[Δm C arg(�) � arg(η)] (16.43)

Then by varying the length of the regenerator and measuring the ensuing mod-ulation, the sign and magnitude of Δm can be determined. Note, however, thiscan be done only after the CP-violating amplitude has been determined withoutthe regenerator.

16.1.5Discovery of CP Violation

CP violation was discovered in 1964, seven years after the discovery of parity viola-tion. As long-lived K2 has CP D �1, it was thought that it can decay only to 3π andπ˙ l�ν l , and not to 2π. Fitch and Cronin [99] observed for the first time that K2

decays to 2π. Figure 16.6 shows their apparatus.The experiments was carried out using the 30 GeV AGS (Alternating Gradient

Synchrotron) proton accelerator at Brookhaven National Laboratory (BNL). Neu-tral kaons were extracted from a secondary beam created by bombarding the pri-mary proton beam on a Be target through a collimator at 30ı. Charged parti-cles were eliminated by a sweeping magnet. The beam has average momentum� 1100 MeV/c, and a helium bag5) was placed in the decay region at l � 20 lS(decay length of K1) to minimize spurious interactions. To measure decay pointsand momenta of decay π˙, a pair of spectrometers consisting of a magnet, scin-tillation counters and spark chambers were placed in a symmetrical configuration.

5) This is a poor man’s vacuum chamber. One would also have used proportional tubes or driftchambers with much less material and fast response instead of spark chambers if moderntechniques had been available.

Page 656: Yorikiyo Nagashima - Archive

16.1 Introduction 631

K20

Plan view

l foot

Collimator

57 Ft. tointernal target Helium bag

Magnet

Magnet

Scintillator

Scintillator

WaterCarenkow

WaterCarenkow

Spark chamber22°

22°

Figure 16.6 Discovery of CP violation. Lay-out of apparatus. K2 beam comes in from theleft and decays in a helium bag. 2π are de-tected by a pair of telescopes, which consist

of a magnet, scintillation counters and sparkchambers, and determines the pions’ momen-tum. The helium bag contains regenerators toproduce 2π for calibration [99].

Furthermore, to calibrate the apparatus, a total of 43 g of tungsten was placed atevery 28 cm in the decay region. Regenerated K1 has the same momentum as itsparent K2; it is an ideal calibrator for K2 ! 2π events. Denoting the momentaof decay pions as p1, p2 and assuming the observed particles are pions, we canreconstruct an effective mass m� of the parent particle from the equation

m�2D (E1 C E2)2 � (p1 C p2)2 ' 2m2

π C 2p1 p2(1 � cos θ ) (16.44)

Events that gives m� ' mK are identified as decay products from kaons. Back-grounds from other kaon decays etc. are eliminated by requiring that the recon-structed kaon momentum p K D p1 C p2 is directed in the beam direction. Fig-ure 16.7 shows events with m� � mK and their momentum in the same directionas the beam. A clear signal above the background can be seen in the middle graphon the right-hand side. From the data

BR(K2 ! πCπ�) DΓ (K2 ! πCπ�)

Γ (K2 ! all)D 2.0˙ 0.4 � 10�3 (16.45)

was obtained. In 1967, K2 ! π0π0 was observed, too [110, 153].Since observed kaons with short and long lives were proved not to be eigenstates

of CP, we rename them KS and KL instead of K1 and K2.

Page 657: Yorikiyo Nagashima - Archive

632 16 Neutral Kaons and CP Violation*

Data: 5211 eventsMonte-Carlo calculationVector f–

f+ = 0.5

300

300

200

100

350 400

400

450 500

500

550 600

600

MeV

484 < m* < 494

494 < m* < 504

504 < m* < 514

Num

ber o

f eve

nts

10

0

10

20

30

0

10

00.9996 0.9997 0.9998 0.9999 1.0000

cos θ

Figure 16.7 (a) Reconstructed mass distribution [99]. (b) Only data in the region m� � mKshow a peak at θ D 0 that is above the background. This is evidence for K2 ! 2π.

16.2Formalism of CP and CPT Violation

16.2.1CP, T, CPT Transformation Properties

We have to define CP violation parameters to describe the observed phenomena.As CP, T and CPT are related, we first list transformation properties of CP, T, CPToperations on particle states (see Appendix F). Denoting phase factors η i D e i �i ,η i D e i � i

CPjq, p , sα Winouti D ηCPjq,�p , sα W

inouti (16.46a)

Tjq, p , sα Winouti D ηTjq,�p ,�sα W

outin i (16.46b)

CPTjq, p , sα Winouti D ηCPTjq, p ,�sα W

outin i (16.46c)

ηCPT D ηCPηT (16.46d)

Page 658: Yorikiyo Nagashima - Archive

16.2 Formalism of CP and CPT Violation 633

where q (q) denotes a particle (antiparticle), p its momentum and sα its spin com-ponent. Note T and CPT are antiunitary operators and complex conjugation of c-numbers is to be included in their operation (see Sect. 9.2.2). Defining

Oi jqi D η i jOi qi Oi D CP, T, CPT

and using O2i D 1, we have

η i D η i� (16.47)

By requiring CPT D TCP, we have

CPTjq, p , sα .ini D CPηTjq,�p ,�sα .outi D ηTηCPjq, p ,�sα .outi

TCPjq, p , sα .ini D TηCPjq,�p , sα .ini D η�CPη�

T jq, p ,�sα .outi(16.48)

where we have taken the complex conjugate of c-numbers. This means

ηCPT D ηCPT D η�CPT D ηCPηT D 1

ηCP D η�CP

ηT D η�T D ηCPTηCP D ηTη2

CP , ηT D ηCPTη�CP D η�

T η�CP

2

(16.49)

Mass Matrix We use jAi to denote a one-particle state and jBi for continuumstates to which A can decay weakly. As there is no distinction between “in” and“out” for a one-particle state, jA W ini D jA W outi D jAi. For spinless kaons in therest frame, indices other than particle species are irrelevant. Applying T invarianceto the amplitude,

hijHW j j i D hijT �1T HW T �1T j j i D (η�i η j hijHW j j i)�

D η i η�j h j jHW jii

) hnjHW jK0i D η�T ηnhK0jHW jni

hnjHW jK0i D η�

T ηnhK0jHW jni

(16.50)

Applying Eq. (16.50) to the decay matrix in Eq. (16.11),

Γ21 D 2πX

n

δ(E0 � En)h2jT �1T HW T �1T jnihnjT �1T HW T �1T j1i

D 2πXTn

δ(E0 � En)(η�T ηn)�hTnjHW jT 2i(η�

T ηn)hT1jHW jTni

D (ηTη�T )2π

Xn

δ(E0 � En)h1jHW jnihnjHW j2i D η2CPΓ12 (16.51)

where jTni denotes T-transformed state defined by Eq. (16.46). As T changes theparticle’s momentum as well as spin orientation, jTni ¤ jni, but summation onall the continuum states makes the difference disappear. Similar arguments applyto the mass part in Eq. (16.11), and

M21 D η2CPM12 , Γ21 D η2

CPΓ12 (16.52a)

Page 659: Yorikiyo Nagashima - Archive

634 16 Neutral Kaons and CP Violation*

This means

Λ21 D η2CPΛ12 , or jΛ12j D jΛ21j T invariance (16.52b)

It also means

M�12

M12D

Γ �12

Γ12, or Im(M�

12Γ12) D 0 (16.52c)

No constraints are obtained for Λ11 and Λ22.Requiring CPT invariance gives

Λ11 D Λ22 CPT invariance (16.53)

No constraints are made for Λ12 or Λ21. CP invariance gives

Λ11 D Λ22 and jΛ12j D jΛ21j CP invariance (16.54)

Symmetry Properties of Decay Amplitudes Denoting the CPT operation as Θ , theapplication of the requirement of its invariance to a weak decay amplitude for aprocess i ! f can be written as

h f W outjHW jii D h f W outjΘ�1Θ HW Θ�1Θ jii D hΘ f W injHW jΘ ii�

DXout

(hΘ f W injΘ f W outihΘ f W outjHW jΘ ii)�

D hΘ f W outjHW jΘ ii�e2 i δ f (16.55)

where we have used [see Eq. (6.12)]

h f W outj f W ini D h f W outjS j f W outi D S f f D e2 i δ f (16.56)

δ f is the scattering phase shift of the f state as well as f because the strong in-teraction is invariant under C transformation. jΘ f i is a CPT-transformed state asdefined in Eq. (16.46). Then Eq. (16.55) means that a decay amplitude and its CPTconjugate are related by

h f W outjHW jii D Aei δ f

hΘ f W outjHW jΘ ii D A�e i δ f (16.57)

where A is a complex number. Similarly, T and CP invariance can be expressed as

h f W outjHW jii D h f W outjT�1THW T�1Tjii D hT f W injHW jTii�

D hT f W outjHW jTii�e2 i δ f (16.58)

h f W outjHW jii D hCP f W outjHW jCPii (16.59)

Page 660: Yorikiyo Nagashima - Archive

16.2 Formalism of CP and CPT Violation 635

Neutral K Meson Decays As the K meson is a scalar,

jΘ K0i D jCPK0i D jK0i , jTK0i D jK0i (16.60)

Since a momentum-inverted final state can be reproduced by rotation from theoriginal state and there is no specific orientation in the total angular momentumzero state, we have

jΘ f W outi D jCP f W outi D j f W outi , jT f W outi D j f W outi (16.61)

Separating the strong phase and defining weak amplitudes A and A by

h f W outjHW jK0i � A f ei δ f , h f W outjHW jK0i � A f ei δ f (16.62)

Equations (16.57), (16.58) and (16.59) relate one to the other:

CPT A(i ! f ) � A f D A�(i ! f ) (16.63a)

T

(A(i ! f ) � A f D A�(i ! f )

A(i ! f ) D A�(i ! f )(16.63b)

CP A(i ! f ) D A(i ! f ) (16.63c)

16.2.2Definition of CP Parameters

Now we are ready to apply the formalism we have developed so far to the neutral Kmesons.

Mixing Parameters Since we learned that observed neutral kaons are not eigen-states of CP, we can write

jKSi D1p

1C j�Sj2(jK1i C �SjK2i)

D1p

2(1C j�Sj2)

n(1C �S)jK0i C (1� �S)jK

0io

jKLi D1p

1C j�Lj2(jK2i C �LjK1i)

D1p

2(1C j�Lj2)

n(1C �L)jK0i � (1� �L)jK

0io

(16.64a)

�S, L ¤ 0 means CP violation. They need not be equal. We decompose them as

�S D � C δ , �L D � � δ (16.64b)

We shall show soon that CPT invariance means δ D 0 and T invariance means� D 0. For the sake of later discussions, we introduce an alternative form for ex-

Page 661: Yorikiyo Nagashima - Archive

636 16 Neutral Kaons and CP Violation*

pressing KS and KL:

jKSi D pp

1� zjK0i C qp

1C zjK0i (16.65a)

jKLi D pp

1C zjK0i � qp

1 � zjK0i (16.65b)

jp j2 C jqj2 D 1 (16.65c)

Then it is easy to show that the mass matrix is expressed as

Λ D�

Λ11 Λ12

Λ21 Λ22

�D

264 M C

z2

Δλ �p

2q

p1 � z2Δλ

�q

2p

p1� z2Δλ M �

z2

Δλ

375

M D12

(λL C λS) , Δλ D λL � λS

(16.66)

Problem 16.3

Derive Eq. (16.66).

CPT and T invariance means Λ11 D Λ22 and jΛ21j D jΛ12j, which, in turn,means z D 0 (CPT), and jq/p j D 1 (T), respectively. In terms of the mass matrixelements, δ, � can also be defined as

δ �Λ22 � Λ11

2Δλ, � �

Λ21 � Λ12

2Δλ(16.67)

The two definitions Eqs. (16.67) and (16.64) are equivalent in first order in � or δ.Since the second-order term never appears in this book, either definition is valid.The former is convenient for theoretical and the latter for experimental evaluations.To first order, z, p , q are related to δ, � by

z D �2δ, (16.68a)

pqD

1C �

1 � �or � D

p � qp C q

(16.68b)

For theoretical discussions, � is expressed as

� D �Im(M12) � iIm(Γ12)/2

ΔΓ /2 � iΔm

Dsin φSW

Δmei φSW

��Im(M12)C i

Im(Γ12)2

�(16.69a)

φSW � tan�1 2ΔmΔΓ

D 43.49˙ 0.08ı (16.69b)

Page 662: Yorikiyo Nagashima - Archive

16.2 Formalism of CP and CPT Violation 637

where φSW is referred as the superweak phase for the reason discussed aroundEq. (16.184).

To summarize, CP is violated if �S or �L is nonzero. CPT invariance means δ D 0which, in turn, means �S D �L D �. In this case, CP violation is equivalent to Tviolation and Re(�) ¤ 0. Note � itself is a phase dependent variable as we willdiscuss in more details in Sect. 16.3.2.

Problem 16.4

Prove that the solution of Eq. (16.66), where δ, � are defined by Eq. (16.67), is givenby Eq. (16.64) in first order in δ and �.

Bell–Steinberger Relation Taking the inner product of jKSi and jKLi, we see

hKSjKLi D 2[Re(�)� iIm(δ)] (16.70)

As this equation shows, the observed states KS and KL are not orthogonal and mixtogether. The amount of overlap is a measure of CP and CPT violation. Its valuecan be estimated approximately as follows. Using the relation

ΛjKSi D λSjKSi , ΛjKLi D λLjKLi (16.71)

and multiplying by hKLj and hKSj from the left, respectively, we obtain

hKLjΛjKSi D λShKLjKSi (16.72a)

hKSjΛjKLi D λLhKSjKLi (16.72b)

Subtracting the complex conjugate of Eq. (16.72a) from Eq. (16.72b), we have

(λL � λ�S )hKSjKLi D hKSjΛ � Λ†jKLi D �ihKSjΓ jKLi (16.73)

Insertion of hKSjKLi D 2[Re(�) � iIm(δ)] and Eq. (16.11c) gives

2

Γ C iΔm� �

Re(�) � iIm(δ)DX

f

A L( f )A�S( f )�( f ) (16.74)

A L( f ) D h f jHW jKLi , A�S( f ) D hKSjHW j f i

where �( f ) denotes the phase space volume of the state f. The expression is phaseconvention independent. The rhs is on shell and the terms are observables, hencecan be determined by experiments. Equation (16.74) is called the Bell–Steinbergerrelation or unitarity condition, as it is derived from the unitarity relation Eq. (16.7).Denoting the partial decay width as Γ (KS, L ! f ) D jA S, L( f )j2�( f ) and applying

Page 663: Yorikiyo Nagashima - Archive

638 16 Neutral Kaons and CP Violation*

the Schwartz inequality to Eq. (16.74), we have

2jΓ C iΔmjjRe(�)� iIm(δ)j X

f

pΓL( f )

pΓS( f )

rPf ΓS( f )

� Pf ΓL( f )

�Dp

ΓSΓL

or jRe(�)� iIm(δ)j 12

sΓSΓL

Γ2C Δm2

'

sΓL

2ΓS' 0.03 (16.75a)

This is the interesting result that a bound can be placed on a CP- and CPT-violatingquantity from knowledge of CP-conserving quantities.

Time Development of K0 and K0

Time development of a wave function is de-scribed by jψ(t)i D e�i Λ tjψ(0)i. To express it in a more tangible form, we writethe inverse of Eq. (16.65)

jK0i D1

2p

p1� zjKSi C

p1C zjKLi

�jK

0i D

12q

p1C zjKSi �

p1� zjKLi

� (16.76)

Then the time development of pure K0 at t D 0 is given by

jK0(t)i D e�i Λ tjK0i D1

2p

hp1� zjKSie�i λS t C

p1C zjKLie�i λL t

iD [gC(t)C zg�(t)]jK0i �

p1 � z2 q

pg�(t)jK

0i (16.77a)

where

g˙(t) D12

�e�i λL t ˙ e�i λS t�

Similarly

jK0(t)i D [gC(t)� zg�(t)]jK

0i �p

1� z2 pq

g�(t)jK0i (16.77b)

Combining the two equations, we have

e�i Λ t D

264 gC(t)C zg�(t) �

pq

p1� z2g�(t)

�qp

p1� z2g�(t) gC(t)� zg�(t)

375 (16.78)

Page 664: Yorikiyo Nagashima - Archive

16.2 Formalism of CP and CPT Violation 639

This is an exact formula, but in most cases approximations to first order in δ, � aremore convenient. We list several equivalent formula for later reference. Substitut-ing z D �2δ, p/q ' (1C �)/(1� �), we have

jK0(t)i D jK0ihK0je�i Λ tjK0i C jK0ihK

0je�i Λ tjK0i (16.79a)

D jK0i[gC(t) � 2δg�(t)]C jK0i[2(Λ21/Δλ )g�(t)] (16.79b)

D jK0i[gC(t)� 2δg�(t)]� jK0i[(1� 2�)g�(t)]C O(�2, δ2) (16.79c)

D1p

2

�(1 � � C δ)e�i λSt jKSi C (1� � � δ)e�i λLt jKLi

C O(�2, δ2)

(16.79d)

jK0(t)i D jK

0ihK

0je�i Λ tjK

0i C jK0ihK0je�i Λ tjK

0i (16.80a)

D jK0i[gC(t)C 2δg�(t)]C jK0i[2(Λ12/Δλ )g�(t)] (16.80b)

D jK0i[gC(t)C 2δg�(t)]� jK0i[(1C 2�)g�(t)]C O(�2, δ2) (16.80c)

D1p

2

�(1C � � δ)e�i λSt jKSi � (1C � C δ)e�i λLt jKLi

C O(�2, δ2)

(16.80d)

Formulae with CPT Invariance Assumption As δ ' 0 experimentally, and sincemany expressions can be simplified, we assume CPT invariance from now on untilwe test it later in Sect. 16.4. Setting z D 0, we have from Eqs. (16.14) and (16.15)

Λ11 D Λ22 D M , λS, L D Λ11 ˙p

Λ12Λ21 D Λ11 ˙qp

Λ12

D Λ11 ˙pq

Λ21 (16.81a)

Λ12 D �Δλ

2pq

, Λ21 D �Δλ

2qp

, Λ21 C Λ12 D �Δλ C O(�2)

(16.81b)

qpD

sΛ21

Λ12D

sM�

12 � i Γ �12/2

M12 � i Γ12/2(16.81c)

Δλ D �2p

Λ12Λ21 D �(Λ21C Λ12)C O(�2) ' �2�

ReM12 Ci2

ReΓ12

�(16.81d)

The above equations are obtained simply by setting z D 0 in Eq. (16.66). Note, byrephasing [Eq. (16.17)], Λ12 ! e2 i � Λ12, Λ21 ! e�2 i � Λ21, but also by Eq. (16.81c),

Page 665: Yorikiyo Nagashima - Archive

640 16 Neutral Kaons and CP Violation*

q/p ! (q/p )e�2 i � , therefore (q/p )Λ12 and (p/q)Λ21 are phase-independentvariables. They should be, as Δλ is an observable. Then our choice of phaseEq. (16.64a), or equivalently Eq. (16.68b), constrains the phase of Λ12 and Λ21, too.Actually, by virtue of Eq. (16.81b), Λ12, Λ21 D Δλ/2 C O(�). Therefore both M12

and Γ12 are very close to real numbers and their phase is small [� O(�)],

) Δm D mL � mS D �2Rep

Λ12Λ21

�' �2ReM12 (16.82a)

ΔΓ D ΓS � ΓL D 4Imp

Λ12Λ21

�' 2ReΓ12

6) (16.82b)

ΔmΔΓ D �4Re(M12Γ �12) (16.82c)

ImM12 � ReM12 , ImΓ12 � ReΓ12 (16.82d)

Here 'means a small deviation due to CP violation. If CP invariance is assumed,the relations are exact.

Problem 16.5

Derive (16.82).

16.3CP Violation Parameters

16.3.1Observed Parameters

2π Decay Parameters ηC�, η00 The 2π CP-violating parameters are defined by

ηC� �hπCπ�jHW jKLi

hπCπ�jHW jKSiD jηC�je i φC� (16.83a)

η00 �hπ0π0jHW jKLi

hπ0π0jHW jKSiD jη00je i φ00 (16.83b)

jηC�j and jη00j can be determined from measurement of the 2π decay ratio. Thephase angle φC�, φ00 as well as the sign of Δm can be determined from theinterference effect between KS ! 2π and KL ! 2π. Suppose there is a beam

6) Note our definition of Δm and ΔΓ is tomake them both positive. This is possiblefor the neutral kaons because they weremeasured. However, some authors prefer touse Δm D mL � mS, ΔΓ D ΓL � ΓS orvice versa (S $ L) for a general treatmentincluding neutral D and B mesons.With our choice of phase convention and

in the absence of CP violation, Λ12 D

M12 � iΓ12/2 D �Δλ/2 D �Δm � iΔΓ /2.The physical quantity does not depend onthe phase choice, but some intermediatevariables may change, so one must be carefulabout the sign convention when reading theliterature.

Page 666: Yorikiyo Nagashima - Archive

16.3 CP Violation Parameters 641

expressed by

jψ(t)i D jKL(t)i C �jKS(t)i D jKL(0)ie�i λL t C �jKS(0)ie�i λS t (16.84)

The 2π decay amplitudes are

h2πjψ(t)i D [ηe�i λLt C �e�i λS t ]h2πjKSi (16.85)

As the number of 2π decays is proportional to jh2πjψ(t)ij2

N2π D NKˇ�e(iΔ m�ΓS/2)t C ηe�(ΓL/2)t

ˇ2D NK

njηj2e�ΓL t C j�j2 e�ΓS t C 2e�Γ t jη�j cos(Δmt C φ� � φη

oφ� D arg(�) , φη D arg(ηC� or η00) , Γ D (ΓS C ΓL)/2

(16.86)

One can prepare � in two ways.(1) In the first method, a pure K0(K

0) beam is produced at t D 0, in which case

� D ˙1, φ� D 0. If NK0 D N , NK0 D N at t D 0, one substitutes j�j2 ! 1 in thesecond term and j�j ! (N � N )/(N C N ) in the third term of Eq. (16.86). In orderto maximize the interference effect, one needs

j�j exp(�Γ t) � jηj � 2 � 10�3 (16.87)

This means that the detector is to be placed in the neighborhood of l D γ t ' 12lSwhere lS D γ/ΓS is the mean decay length. From the time dependence of theinterference effect, one can determine both Δm and φ. Figure 16.8 shows the 2π

τ τ

K0+K0

K0

K0

background

(a) (b)

Figure 16.8 Decay curves of pure K0 and K0

beams at t D 0. (a) K0 C K0

curve shows nointerference. (b) K0 and K

0separately [107].

Page 667: Yorikiyo Nagashima - Archive

642 16 Neutral Kaons and CP Violation*

τ

Figure 16.9 Asymmetry curve of neutral K ! πCπ� between initially pure K0 and K0

beams [107].

decay curves. When decay products from both pure K0 and K0

beams at t D 0 areadded together, there is no interference as is shown in Fig. 16.8a, but separatelythey show the interference effect. Figure 16.9 gives an asymmetry curve defined by

AC� DN2π(� D �1) � N2π(� D 1)N2π(� D �1)C N2π(� D 1)

D �2jη j je(ΓS�ΓL)t/2 cos(Δmt � φ j )

1C jη j j2e(ΓS�ΓL)t (16.88)

The world average values of the decay parameters are given by [311]7)

jηC�j D 2.233˙ 0.012 � 10�3

jη00j D 2.222˙ 0.012 � 10�3

φC� D 43.51˙ 0.05ı

φ00 D 43.52˙ 0.05ı

(16.89)

Historically, however, both parameters were determined by the second method,using regeneration.

(2) The second method uses a regenerator to produce a KS beam. Given theregenerator thickness L, we have � D �(L) given by (16.31). The measured phaseis Δmt and φ� � φη . Since both t and L (which resides in φ�) can be varied, φη aswell as the sign of Δm can be determined. If one observes the semileptonic decaymode, one can determine φ� because φη D 0. Usually, the regenerator method isaccompanied by systematic errors associated with φ�, and it is not quite an idealmethod. However, if one measures both πCπ� and π0π0 modes and takes the

7) These values were determined using all the methods discussed in the following. The valuesdetermined from k ! 2π alone are less accurate.

Page 668: Yorikiyo Nagashima - Archive

16.3 CP Violation Parameters 643

difference, the systematic error associated with � almost cancels and a very precisevalue of φ00 � φC� can be obtained. The most recent experimental data give [244]

Δφ � φ00 � φC� D 0.39˙ 0.45ı (16.90)

which is consistent with φ00 D φC�. A global analysis by the Particle Data Groupgives a much better value [311]:

Δφ D 0.006˙ 0.014 (16.91)

Δφ gives a measure of CPT violation [Eq. (16.175)], which will be discussed later.

Asymmetry AL Since the three-body leptonic decay modes are not CP eigenstates,the CP violation effect appears as the difference between the decay mode eCν e π�and its CP conjugate e�ν e πC, that is, as the asymmetry A L of the kaon with longerlife:

A L �Γ (KL ! eCν e π�) � Γ (KL ! e�ν e πC)Γ (KL ! eCν e π�)C Γ (KL ! e�ν e πC)

(16.92)

The asymmetry also appears in the muonic decay mode. KS can also decay to themin principle but, the 2π decay rate being much larger, the branching ratio is small:

Γ (KS ! eνπ)ΓS

�Γ (KL ! eνπ)

ΓSD BR(KL ! eνπ)

ΓL

ΓS� 10�3 (16.93)

The CP violation is expected to be even smaller by another factor of 10�3 and ismore difficult to measure.

The asymmetry can be determined as follows. The transition amplitudes for K l3are defined by

AC D heCνπ�jT jK0i , A� D he�νπCjT jK0i W ΔS D ΔQ

(16.94a)

A� D he� νπCjT jK0i , AC D heCνπ�jT jK0i W ΔS D �ΔQ

(16.94b)

We assume here that there are no ΔS D �ΔQ processes and set A � D AC D 0.CPT invariance constrains A� D (AC)� [Eq. (16.63a)]. The asymmetry of KS, KL

decays is calculated to give

A S, L �Γ (KS, L ! eCν e π�) � Γ (KS, L ! e�ν e πC)Γ (KS, L ! eCν e π�)C Γ (KS, L ! e�ν e πC)

Dj(1C �S, L)A�j2 � j(1 � �S, L)ACj2

j(1C �S, L)A�j2 C j(1� �S, L)ACj2

D 2Re(�S, L) D 2Re(�)˙ 2Re(δ) (16.95)

Page 669: Yorikiyo Nagashima - Archive

644 16 Neutral Kaons and CP Violation*

where Eq. (16.64a) was used to derive the second equality. We retain the nomencla-ture �S, L D � ˙ δ to remind us that A S and A L are different, although δ D 0 withCPT invariance. It is a constant independent of time if a pure KS, L beam is used.Under normal circumstances, the neutral kaon is produced as a mixture of K0 andK

0. Since observing the charge of a decay electron is equivalent to selecting the

flavor with the ΔS D ΔQ rule, the asymmetry shows the same time dependenceas that of the strangeness oscillation. But for long time t 10/ΓS after produc-tion, the identity of the parent does not matter and the residual asymmetry agreeswith A L given by Eq. (16.95). Figure 16.10a,b shows asymmetries of e and μ asa function of time. One can recognize interference and residual asymmetry. Theasymmetry is given by

A L(e) D 3.34˙ 0.07 � 10�3

A L(μ) D 3.04˙ 0.25 � 10�3

A L(e C μ) D 3.32˙ 0.06 � 10�3

(16.96)

Note the oscillation is not a CP violation effect. It is induced by the mixing andsmall mass difference of the kaons.

The decay asymmetry A S is important for deriving information on δ, but it isdifficult to determine as it has to be measured immediately after production of K0

or K0. Recently it was measured for the first time by the KLOE group [235], though

not so accurately that it can be used for other analysis:

A S(e) D 1.5˙ 9.6stat ˙ 2.9syst � 10�3 (16.97)

Problem 16.6

Show that by observing the flavor-tagged asymmetry into the same-sign semilep-tonic mode one can extract �S, despite the long time passage, to extract the asym-metry value:

A L˙ �Γ�K

0(t D 0)! l˙ν l π� � Γ

�K0(t D 0)! l˙ν l π�

Γ�K

0(t D 0)! l˙ν l π�C Γ

�K0(t D 0)! l˙ν l π�

t�τS����! 2Re(�S)

(16.98)

This is another method of measuring �S D � C δ.

16.3.2� and �0

The two most important CP-violating parameters that play a central role in thetheoretical interpretation of the neutral kaon decays are � and �0. � is a CP-violating

Page 670: Yorikiyo Nagashima - Archive

16.3 CP Violation Parameters 645

AL=

(N+ –N

– )/(N

+ +N– ) KL

0 → e+νπ–

Lifetime (x 10–10s)

0.08

0.06

0.02

0 10 20–0.02

–0.06

–0.08

–0.04

0.04

AL=

(N+ –N

– )/(N

+ +N– ) KL

0 → μ+νπ–

Lifetime (x 10–10s)

0.06

0.02

010 20

–0.02

–0.06

–0.08

–0.10

–0.04

0.04

(a) (b)

Figure 16.10 Asymmetry AL [(a) K0L ! eC νπ�, (b) K0

L ! μCνπ�] oscillates as a function oftime. The residual asymmetry is the CP violation effect. The oscillation is not due to CP violationbut due to mixing and mass difference [171].

mixing parameter while �0, defined in Eq. (16.103c), expresses a CP violation effectin the decay. � is often referred to as the indirect and �0 as the direct CP violationeffect.8)

Phase ConventionWe investigate the interplay of CP violation parameters �, �0,ηC�, and η00 andclarify their roles. Since the neutral kaons decay predominantly to 2π channels, wefirst analyze the 2π decay mode. The 2π amplitude can be decomposed to isospinstates9)

A(K0 ! πCπ�) D1p

3

p2A 0 C A 2

�(16.100a)

A(K0 ! π0π0) D1p

3

�A 0 C

p2A 2

�(16.100b)

A(KC ! πCπ0) D

r32

A 2 (16.100c)

The I D 1 state is antisymmetric and is not allowed by Bose statistics. The decayamplitudes, taking into account the final-state interactions, can be expressed as [see

8) There is yet another type, referred to as the“mixed interference” CP violation parameter,defined by

λ f Dqp

A f

A f(16.99)

which appears in the interference between adecay without mixing K0 ! f and a decaywith mixing K0 ! K

0! f . Such an effect

occurs only in decays to final states that are

common to K0 and K . The CP-violatingeffect is defined by Im(λ f ) ¤ 0. In thekaon decays, it is of little use but it plays animportant role in neutral B-meson decays.

9) We deal with the symmetrized state

πC π� D (πC π� C π� πC)/p

2. Normally,I D 0 and I D 2 components in the neutraltwo-pion states are expressed as jI D I3 D

0i D (πC π� � π0 π0 C π� πC)/p

3.

Page 671: Yorikiyo Nagashima - Archive

646 16 Neutral Kaons and CP Violation*

Eq. (16.62)]

TI D h2π, I jHW jK0i D A I ei δ I (16.101a)

T I D h2π, I jHW jK0i D AI ei δ I (16.101b)

where δ I are scattering phase shifts of isospin I. CPT invariance means AI D A�I

and CP violation means A I ¤ A�I . Next we define

α DhI D 0jHW jK2i

hI D 0jHW jK1iDh0jHW jK0i � h0jHW jK

0i

h0jHW jK0i C h0jHW jK0i

DA 0 � A0

A 0 C A0D i

ImA 0

ReA 0(16.102)

Here we have used h0j, h2j to denote 2π, I D 0, 2 eigenstates. We also defineCP-violating decay parameters for I D 0, 2:

�0 �hI D 0jHW jKLi

hI D 0jHW jKSiDh0jHW jK2i C �h0jHW jK1i

h0jHW jK1i C �h0jHW jK2i

' � C α D � C iImA 0

ReA 0(16.103a)

�2 �1p

2

hI D 2jHW jKLi

hI D 0jHW jKSiD

1p

2

h2jHW jK2i C �h2jHW jK1i

h0jHW jK1i C �h0jHW jK2i

� �0 C �0ω (16.103b)

�0 '1p

2

h2jHW jK2i

h0jHW jK1i� αω D i ω

�ImA 2

ReA 2�

ImA 0

ReA 0

�(16.103c)

ω �1p

2

h2jHW jKSi

h0jHW jKSi'

1p

2

ReA 2

ReA 0e i(δ2�δ0) (16.103d)

where δ0, δ2 are 2π scattering phase shift in I D 0, 2 states. Note that �0, �2 and ω(hence �0 also) are observables and are independent of the phase convention.

Proof:

� f Dh f jHW jKLi

h f jHW jKSiDh f jHW j

np jK0i � qjK

0io

h f jHW jn

p jK0i C qjK0io D 1 � λ f

1C λ f

λ f D

�qp

�h f jHW jK

0i

h f jHW jK0iD

�qp

�A f

A f

(16.104)

By rephasing jK0i ! e�i � jK0i, jK0i ! e i � jK

0i, we have A f /A f ! e2 i � A f /A f

and q/p ! e�2 i � (q/p ) [see discussions following Eqs. (16.81)], therefore λ f isphase independent. �

Page 672: Yorikiyo Nagashima - Archive

16.3 CP Violation Parameters 647

The relative amplitude ω violates the Δ I D 1/2 rule. Its absolute value and phasecan be obtained from the experimental data. If � indicates the phase space

jωj2 'Γ (KC ! πCπ0)

Γ (KS ! 2π)'

23

BR(KC ! πCπ0)τKS

τKCD (3.19 � 10�2)2

(16.105a)

Re(ω) '112

"�00

�C�Γ (KS ! πCπ�)Γ (KS ! π0π0)

� 2

#(16.105b)

From the above relations we obtain arg(ω)observed ' ˙55.6ı , which is in reasonableagreement with that given by Eq. (16.103d), namely [120, 157, 344]

δ2 � δ0 D �47.6˙ 1.5ı (16.106)

Since the phase of jK0i, jK0i can be chosen arbitrarily [Eq. (16.17)], we are free to

change �. This can be shown as follows. By rephasing

� DΛ21 � Λ12

2Δλ!

Λ21e�2 i � � Λ12e2 i �

2Δλ

'Λ21(1� 2 i � )� Λ12(1C 2 i � )

2Δλ' � C i �

(16.107)

where we have used the fact that CP violation is a small effect [i.e. O(�Δλ)] andonly needs a small phaseshift to redefine �. Using this freedom, we can change �

to a more convenient form. Choosing � D ImA 0/ReA 0, we can make � equal to �0.Namely, the CP-violating decay parameter and the mixing parameter are the same,a convention we adopt from now on.10) In this case one has to bear in mind thatK

0is defined by

CPjK0 >D e2 i � jK0i (16.108)

Note that in many discussions outside the realm of neutral kaon decays, includ-ing those about the Kobayashi–Maskawa matrix, � D 0 is assumed. So one mustbe careful about the phase convention. Since we chose to set � D �0, which is aphase-convention-independent parameter, we shall use Q� when we need the phase-dependent mixing parameter. Note, many authors use � to denote both the phase-independent I D 0 decay amplitude and the phase-dependent mixing parameterinterchangeably.

Then � defined in Eq. (16.69) is modified to

� Dsin φSW e i φSW

Δm

��Im(M12)C i

Im(Γ12)2

�C i

ImA 0

ReA 0(16.109)

10) Historically, Wu–Yang’s phase choice [393] toeliminate iImA0/ReA0 was adopted, and forthis phase choice � D �0. We use the same �

but ImA0/ReA0 is retained explicitly.

Page 673: Yorikiyo Nagashima - Archive

648 16 Neutral Kaons and CP Violation*

Using

i(sin φSW e i φSW)�1 D 1C i cos φSW/ sin φSW D 1C iΔΓ /2Δm (16.110)

� can be rewritten as

� D sin φSW e i φSW

��

Im(M12)Δm

�Im(Γ12)

ΔΓ

�C i

�Im(Γ12)

ΔΓC

ImA 0

ReA 0

�� �T C i δ' (16.111a)

δ' �Im(Γ12)

ΔΓC

ImA 0

ReA 0'

12

�arg(Γ12) � arg(A�

0 A0)

(16.111b)

We shall show that δ' is a very small number.Experimental data show

BR(KS ! 2π C (γ )) D 99.89%,BR(KS ! 3π) D 3.5 � 10�7,BR(KS ! e˙ν e π�) D 7.04 � 10�4,BR(KS ! μ˙νμ π�) D 4.69 � 10�4

All other decay modes < 10�5

(16.112)

One notices that 2π channel including photon emissions is overwhelmingly thedominant decay mode. Since ΓS ΓL, ΓS ' Γ2π , it is a good approximation touse only the 2π intermediate state in the evaluation of Γ12. Note that the leptonicmode (K ! l ν l π) does not contribute to Γ12 if the ΔS D ΔQ rule is exact. Settingthe phase space volume as � and using Eqs. (16.11) and (16.101), we have

Im(Γ12) 'hImhK0jHW jI D 0ihI D 0jHW jK

0i�C

I D 2 term�i

D Im

A�0 A0 C A�

2 A2

��

CPTDDDD Im

h(A�

0)2 C (A�2)2i

D �2

ReA 0ImA 0 C ReA 2ImA 2

�� (16.113)

ΓS 'hjhI D 0jHW jK0 C K

0i/p

2ij2 C (I D 2 term)i

' 2�(ReA 0)2 C (ReA 2)2 � (16.114)

) Im(Γ12)ΔΓ

'Im(Γ12)

ΓS' �

ImA 0

ReA 0C 2ω2

�ImA 0

ReA 0�

ImA 2

ReA 2

�(16.103c)DDDD �

ImA 0

ReA 0˙ 2

ˇω�0ˇ (16.115)

Using experimental values [ω � 0.03, j�0/�j ' 10�3; see Eq. (16.183)], we see thatthe second term is at most � 10�5 of the first term and can be neglected. Thisproves our promised statement.

Page 674: Yorikiyo Nagashima - Archive

16.3 CP Violation Parameters 649

We see that �T is completely aligned along the superweak phase. Using Δm '�2Re(M12), ΔΓ ' 2Re(Γ12) [see Eq. (16.82)], �T can further be rewritten as

�T D sin φSW e i φSW

�� arg(M�

12Γ12)2

�' sin φSW e i φSW

�2Im(M�

12Γ12)Δm ΔΓ

�(16.116)

where ΔmΔΓ D �4Re(M�12Γ12) has been used in obtaining the last equality. Note,

T invariance means Im(M�12Γ12) D 0 [see Eq. (16.52c)]. Of course it should be as

T and CP are equivalent under the assumption of CPT invariance. But it is goodto remember that �T , if defined exactly by the second equality of Eq. (16.116), isintrinsically a T violation parameter and will reappear later in Eq. (16.148) in thediscussion of T invariance.

Conventionally, throughout most literatures, � defined by Eq. (16.111) is used.However, as δ' is small, the following discussions are valid irrespective of whether� or �T are used, except in trying to determine the value of δ' itself. Note also, that� itself is a phase dependent variable, but its real part Re(�) is an observable andis phase independent as can be seen in the expression of the decay asymmetryparameter A L [see Eq. (16.98)].

Evaluation of �

To determine � and �0, we express them in terms of observables. We relate the 2πdecay amplitudes ηC�, η00 to the isospin parameters � and �0.

Using the inverse relation of Eq. (16.100) we have

hI D 0jHW jKLi D

r23hC � jHW jKLi �

r13h00jHW jKLi

D

r23

ηC�

"r23hI D 0jHW jKSi C

r13hI D 2jHW jKSi

#

r13

η00

"�

r13hI D 0jHW jKSi C

r23hI D 2jHW jKSi

# (16.117)

and dividing by hI D 0jHW jKSi we obtain

� D

�23

ηC� C13

η00

�C

23

(ηC� � η00)ω (16.118)

Similarly

�2 D

�13

ηC� �13

η00

�C

13

(ηC� C 2η00)ω (16.119)

Conversely

ηC� D� C �2

1C ω' � C �0 , η00 D

� � 2�2

1� 2ω' � � 2�0 (16.120)

Page 675: Yorikiyo Nagashima - Archive

650 16 Neutral Kaons and CP Violation*

As �0 � � experimentally [see Eq. (16.183)],

�0

�'

13

�1�

η00

ηC�

�(16.121)

is a good first approximation. Figure 16.11 shows a sketch of Eqs. (16.120).Experimentally, using Eq. (16.89),

jηC�j ' jη00j , φC� ' φ00 D 43.51˙ 0.05ı ' φSW (16.122)

which means

� ' ηC� ' η00 , (16.123)

At this stage, it is convenient to define components parallel and perpendicular toe i φSW in the complex plane. Then

� D �kek C �?e? (16.124a)

ek D e i φSW D cos φSW C i sin φSW (16.124b)

e? D i e i φSW D � sin φSW C i cos φSW (16.124c)

We will learn later that the components of �, �0 perpendicular to ek are related toCPT violation. The experimental observations Eq. (16.123) mean

j�?j � j�kj (16.125a)

and that � can be determined very accurately by the relations

j�j D (2jηC�j C jη00j)/3

φ� D (2jφC�j C jφ00j)/3(16.125b)

φ 00φ +

φ ε ∼ φ SW

ε

2ε'ε' φ ε'

iξ i

ε

A0A0

Figure 16.11 Sketch of Eq. (16.120). Not to scale. Since we redefined Q� as the mixing parameter,it is phase dependent, but � D �0 is not.

Page 676: Yorikiyo Nagashima - Archive

16.3 CP Violation Parameters 651

which give

j�j D 2.229˙ 0.012 � 10�3

φ� D 43.51˙ 0.05ı (16.126)

Direct CP ViolationWhile � represents an indirect CP violation in mixing, �0 is a direct CP violation indecay amplitudes to the I D 0, 2 state. To see that �0 represents a CP violation in thedecay, let us find a necessary condition for the two CP conjugate decay rates to bedifferent. Note, the total decay rate is guaranteed to be the same by CPT invariance,but the partial decay rate to a particular final state f can be different if CP is violated.The partial decay amplitude is expressed as [see Eq. (16.57)]

T(B ! f ) DX

j

A j e i δ j (16.127a)

T(B ! f ) DX

j

A j e i δ j (16.127b)

where j labels different intermediate states and δ j is the scattering phase shift. CPTinvariance constrains A j D A�

j and T invariance means A j D A�j . Therefore,

under the assumption of CPT invariance, CP violation appears as a phase of theamplitude. Let us write

A j D A�j D jA j je

i φ W j D a j e i φ W j , A j D a j e�i φ W j (16.128)

For the experimental sign of CP violation in the decay

δΓ D Γ (B ! f ) � Γ (B ! f ) ¤ 0 (16.129)

the interference effect is necessary, which requires at least two intermediate states.Writing

T(B ! f ) D a1e i δ1 e i φ W1 C a2e i δ2 e i φ W2

T(B ! f ) D a1e i δ1 e�i φ W1 C a2e i δ2 e�i φ W2(16.130)

then

Γ (B ! f ) � Γ (B ! f ) D a1a2 sin(φW1 � φW2 ) sin(δ1 � δ2) (16.131)

The above equation means that the two states must have both different strongphase and weak phase.

Next we want to express the decay rate difference in terms of the CP violationparameter η f that we defined previously [Eq. (16.83)].

Page 677: Yorikiyo Nagashima - Archive

652 16 Neutral Kaons and CP Violation*

Problem 16.7

Show the decay asymmetry at time t is given by

A f �Γ (K

0! f )� Γ (K0 ! f )

Γ (K0! f )C Γ (K0 ! f )

D 2Re(� � η f )

η f �h f jHW jKLi

h f jHW jKSi' � C i

ImA f

ReA f

(16.132)

which is a restatement that if the final state is reached via a single route, the asym-metry does not appear. Show also

A f (π0π0) D 2(� � η00) D 4�0 , A f (πCπ�) D 2(� � ηC�) D �2�0

(16.133)

This explicitly shows that �0 represents the difference of two CP conjugate decayrates.

Note, the CP violation parameter �0 defined in Eq. (16.103c) can be expressed inthe form

Re(�0) D Re�

i ω�

ImA 2

ReA 2�

ImA 0

ReA 0

��

' �1p

2

ReA 2

ReA 0sin(δ2 � δ0) sin(φ2 � φ0)

(16.134)

which is of the form Eq. (16.131) and shows that I D 0, 2 states provide the twodifferent routes for decays K ! πCπ� and K ! π0π0.

Problem 16.8

Show that �0 can also be defined as

Re(�0) D16

ˇˇ Aπ0 π0

A π0π0

ˇˇ �

ˇˇ AπC π�

A πCπ�

ˇˇ!

(16.135)

Evaluation of �0From Eq. (16.120)

�0 '13

(ηC� � η00) (16.136)

It is convenient to use the decomposition defined in Eq. (16.124):

�0k�D Re(�0/�) ,

�0?�D Im(�0/�) (16.137)

Page 678: Yorikiyo Nagashima - Archive

16.4 Test of T and CPT Invariance 653

Since

arg(�0) D arg(ω) Dπ2C δ2 � δ0 D 42.3˙ 1.5ı ' φSW (16.138)

we have

�0?j � j�

0kj (16.139)

Using ηC� ' η00, φC� ' φ00 ' φSW

�0? '

jηC�j3

(φC� � φ00) D �jηC�j

3Δφ (16.140)

When both ηC� and η00 are measured it is possible to determine �0k in principle,

but with poor accuracy. Experimentally Re(�0/�) can be determined directly [51,244]. Using Eq. (16.120)

Re�

�0

�'

16

ˇˇ ηC�

η00

ˇˇ2 � 1

!'

13

�ˇˇ ηC�

η00

ˇˇ � 1

�D 1.65˙ 0.26 � 10�3

(16.141)

where we have used jηC�/η00j2 ' 1 to derive the third equality. Δφ can also be de-

termined accurately using the value of Re(�0/�). Since φ00/φC� ' 1, Eq. (16.120)means

Δφ D φ00 � φC� ' �3Im(�0/�) ' �3Re�

�0

�tan(φ�0 � φSW)

' 0.006˙ 0.008ı [311](16.142)

16.4Test of T and CPT Invariance11)

We have observed CP violation, but if CPT invariance holds this is equivalent toT violation. The foundation on which CPT invariance is established is very solid,as stated in Chap. 9. It can be proved with very few axioms, such as the principlesof quantum mechanics and Lorentz invariance. Namely, to deny CPT invarianceis equivalent to denying quantum field theory. One wonders if the observed CPviolation is T violation with CPT invariance or genuine CP violation, which holdsregardless of CPT invariance.

The neutral kaon system provides test benches for the most rigorous determina-tion of T and CPT symmetry and a variety of other fundamental principles. This isbecause the mass difference (Δm ' 3.5�10�6 eV) is so small that its effect appearsas macroscopic interference phenomena and enables CP and CPT parameters tobe determined with an accuracy no other means can provide. In this section, westart with no assumption about CP or CPT invariance.

11) See [48, 68, 121, 269, 299].

Page 679: Yorikiyo Nagashima - Archive

654 16 Neutral Kaons and CP Violation*

16.4.1Definition of T- and CPT-Violating Amplitudes

The starting point is Eq. (16.55). We first write the decay amplitudes for K ! 2πand K l3 are expressed as sum of two amplitudes:

h2π, I jHW jK0i D (A I C BI )e i δ I (16.143a)

h2π, I jHW jK0i D (AI C B I )e i δ I (16.143b)

heCν e π�jHW jK0i D (a C b) (16.143c)

he�ν e πCjHW jK0i D (a C b) (16.143d)

If CPT invariance holds, ACB D (ACB)� [Eq. (16.63a)], so if we rewrite AI D A�I ,

a D a�, B I D �B�I , b D �b�, A I , a represent CPT-conserving and BI , b represent

CPT-violating terms. Furthermore, if we write

A I D ReA I C iImA I , BI D ReBI C iImBI

a D Rea C iIma , b D Reb C iImb (16.144)

T invariance means A�I D A I , B�

I D BI and CP invariance means AI D A I ,B I D BI . Consequently, what each term represents is

Re A I CPT, T, CPIm A I CPT, �T, ��CPRe BI ��CPT, T, ��CPIm BI ��CPT, �T, CP

If we can demonstrate smallness of both direct and indirect CPT violation param-eters, i.e., jBI j � jA I j, δ � �, the assumption of CPT invariance is justified.Similar logic can be applied to a, b.

16.4.2T Violation

If observables such as transverse polarization (σ � p1 � p2) in the 3-body decayK0 ! π˙(p1)μ�(p2)νμ or the electric dipole moment (EDM) are detected, theyare a sure sign of T violations [see Sect. 9.2.2] independent of CP or CPT violation.Unfortunately, they have not been observed and there is no other evidence thatT invariance is violated. Therefore we investigate whether we can show it in theneural kaon system. T invariance means [19]

T�hΨ out(t f )jΦ in(ti )i

D hΦ out(t f )jΨ in(ti )i (16.145)

Setting Φ D K0, Ψ D K0, we have

jhKout0 (t f )jK in

0 (ti )ij2 D jhKout0 (t f )jK

in0 (ti )ij2 (16.146)

Page 680: Yorikiyo Nagashima - Archive

16.4 Test of T and CPT Invariance 655

Therefore, time reversal violation can manifest itself in an asymmetry parameterknown as the Kabir asymmetry [221]

A T DP [K

0(t D 0)! K0(t)] � P [K0(t D 0)! K

0(t)]

P [K0(t D 0)! K0(t)]C P [K0(t D 0)! K

0(t)]

DjhK0je�i Λ tjK0ij

2 � jhK0je�i Λ tjK0ij2

jhK0je�i Λ tjK0ij2 C jhK0je�i Λ tjK0ij2

Dj2(Λ12/Δλ )g�(t)A f j

2 � j2(Λ21/Δλ )g�(t)A f j2

j2(Λ12/Δλ )g�(t)A f j2 C j2(Λ21/Δλ )g�(t)A f j2

'jΛ12j

2f1� 2Re(y )g � jΛ21j2f1C 2Re(y )g

jΛ12j2f1� 2Re(y )g C jΛ21j2f1C 2Re(y )g

'jΛ12j

2 � jΛ21j2

jΛ12j2 C jΛ21j2� 2Re(y ) (16.147a)

A f D h f jHW jK0i D a C b D a(1� y )

A f D h f jHW jK0i D a� � b� D a�(1C y�) , y D �b/a (16.147b)

where f and f are decay final states to tag the flavor. We have used Eqs. (16.79b)and (16.80b) in deriving the third line. No assumption on CPT invariance is madehere. Apart from CPT violating parameter Re(y ) which can be tested separately, Tviolation means jΛ12j ¤ jΛ21j, which confirms the statement in Eq. (16.52b).

�T

A convenient parameter can be defined as

�T �i

ΔΓjΛ12j

2 � jΛ21j2

λL � λSD sin φSW e i φSW

2Im(M�12Γ12)

ΔmΔΓ

D sin φSW e i φSW12

sinfarg(Γ12) � arg(M12)g

(16.148)

where φSW, Δm and ΔΓ are defined in Eqs. (16.81). �T is related to � by � D

�T C i δ' [see Eq. (16.111)12) ]. Using this �T, the asymmetry A T is expressed as

A T D 4Re(�T)� 2Re(y ) (16.149)

The value of Re(y ) can be evaluated from the asymmetry of semileptonic decays[see Eq. (16.155)] and is known to be consistent with zero. In the analysis here, itwas assumed to vanish.

12) Note, if �T is defined by Eq. (16.148), δ' will be redefined by this relation.

Page 681: Yorikiyo Nagashima - Archive

656 16 Neutral Kaons and CP Violation*

Figure 16.12 Direct T-violation effect observed by CPLEAR [107, 108].

�T ¤ 0 means T violation whether or not CPT invariance is violated. Note A T

is constant and does not depend on time. K0 ! K0, K

0! K0 processes can

be identified using lepton charge tagging, i.e. K0 by eCν e π� and K0

by e�ν e πC,assuming the strict ΔS D ΔQ rule.13) Figure 16.12 shows the CPLEAR measure-ment [107, 108] of the asymmetry A T, which is clearly nonzero.14)

This is the first, and so far only, directly observed T-violation effect. The value isdetermined as

4Re(�T) D 6.2˙ 1.4stat ˙ 1.0syst � 10�3 (16.150)

16.4.3CPT violation

Re(δ)CPT invariance requires Λ11 D Λ22 [see Eq. (16.15a)], or equivalently δ D 0.Therefore an observable similar to A T that can test CPT violation is given by

A CPT DP [K

0(t D 0)! K

0(t)]� P [K0(t D 0)! K0(t)]

P [K0(t D 0)! K

0(t)]C P [K0(t D 0)! K0(t)]

DjhK0je�i Λ tjK0ij

2 � jhK0je�i Λ tjK0ij2

jhK0je�i Λ tjK0ij2 C jhK0je�i Λ tjK0ij2(16.151)

If the final state is tagged by charge detection of the decay lepton, the leptonic decayamplitudes in Eq. (16.94) are accompanied by the direct CPT-violating amplitudeb D �y a and are expressed as

hK0je�i Λ tjK0i D a�(1C y�)[gC(t)C 2δg�(t)] (16.152a)

hK0je�i Λ tjK0i D a(1� y )[gC(t) � 2δg�(t)] (16.152b)

13) Experimental analyses are usually carried out without assuming the ΔS D ΔQ rule, but so far noeffect has been observed.

14) Because of different normalization, what CPLEAR actually measured is AT D 4Re(�)� 4Re(y).

Page 682: Yorikiyo Nagashima - Archive

16.4 Test of T and CPT Invariance 657

τ

Figure 16.13 AsymmetryAexpδ versus the neutral-kaon decay time in units of τS . The solid line

represents the result of the fit [107].

) A CPT Dj(1C y )�[gC(t)C 2δg�(t)]j2 � j(1� y )[gC(t)� 2δg�(t)]j2

j(1C y )�[gC(t)C 2δg�(t)]j2 C j(1� y )[gC(t)� 2δg�(t)]j2

D 2Re(y )C 4Re(δ) sinh ΔΓ t/2C Im(δ) sin Δmt

cosh ΔΓ t/2C cos Δmtt�τS����! 4Re(δ)C 2Re(y ) (16.152c)

Re(δ) and Re(y ) can be determined by observing residual asymmetry. Figure 16.13shows CPLEAR’s experimental data on A CPT.15) The data show no sign of CPTviolation.

The obtained values are [107]

Re(δ) D 2.9˙ 2.6stat ˙ 0.6syst � 10�4 (16.153)

Re(y )The value of Re(y ) can be determined using the asymmetry of semileptonic decays.Including Re(y ) in the expression for the asymmetry Eq. (16.95) we obtain

A S, L D 2[Re(�)˙ Re(δ)� Re(y )] (16.154a)

) A S C A L D 4[Re(�)� Re(y )] (16.154b)

A S was measured in KLOE [235]. However, the most accurate value of Re(y ) comesfrom the analysis of the Bell–Steinberger relation and is given by

Re(y ) D 0.4˙ 2.5 � 10�3 (16.155)

15) Actually what they measured is A δ , which enhances the sensitivity to δ and compensates Re(y),i.e. [A δ � 8Re(δ)]. It differs from ACPT by the normalization method. For details see [68, 107].

Page 683: Yorikiyo Nagashima - Archive

658 16 Neutral Kaons and CP Violation*

Im(δ) can also be determined in principle from K l3 decay asymmetry by observingits time variation, as is described in Eq. (16.152c), but again its most accurate valuecan be determined by using the Bell–Steinberger relation.

Analysis Using the Bell–Steinberger Relation*The Bell–Steinberger relation Eq. (16.74) can be expressed as

2

Γ C iΔm� �

Re(�)� iIm(δ)DX

f

A L( f )A�S( f )�( f ) (16.156a)

DX

f D2π

η2πBR(KS ! ππ)ΓS CX

f D3π

η�3πBR(KL ! 3π)ΓL

C 2[Re(�)� Re(y )� iIm(δ)]BR(KL ! l πν)ΓL (16.156b)

η2π Dh2πjHW jKLi

h2πjHW jKSi, η3π D

h3πjHW jKSi

h3πjHW jKLi(16.156c)

In the neutral kaon system only a few modes give significant contributions to therhs of Eq. (16.156a) [see Eq. (16.112)]. Only the ππ(γ ), 3π and π l ν l modes turnout to be relevant up to 10�7 level [238]. The second line for the K l3 intermediatestates derives from the following calculation:

hKSjHW jeCπ�νiheCπ�νjHW jKLi

C hKSjHW je�πCνihe�πCνjHW jKLi�

D12

h �1C ��

S

�(1C �L)jA(K0! eCπ�ν)j2

� (1� ��S )(1� �L)jA(K

0! e�πCν)j2

i�

D Γ (KL ! l νπ)12

h(1C ��

S )(1C �L)f1� 2Re(y )g

� (1� ��S )(1� �L)f1C 2Re(y )g

iD f2Re(�)� 2 iIm(δ)� 2Re(y )gBR(KL ! l νπ)ΓL (16.157)

Dividing both sides of Eqs. (16.156a) and (16.156b) by ΓS gives

2

ΓΓSC i tan φSW

! �Re(�)� iIm(δ)

DX

f

η2πBR(KS ! 2π)CΓL

ΓS

�Xf

η�3πBR(KL ! 3π)C BR(KL ! l νπ)2 fRe(�)� Re(y )� iIm(δ)g

�(16.158)

As BR(KS ! 2π) 1, ΓL/ΓS ' 1.75 � 10�3, the contribution of the second lineis much smaller. Since the above equation is a relation of complex numbers, wecan determine two variables, Im(δ) and Re(y ) treating Re(�) as given. Alternative-ly, by using Eq. (16.154) we can replace [Re(�) � Re(y )] by an observable quanti-

Page 684: Yorikiyo Nagashima - Archive

16.4 Test of T and CPT Invariance 659

ty (A S C A L) on the rhs. Then we can determine Re(�) and Im(δ). Analyses byCPLEAR/KLOE [107, 238] give

Re(�) D 159.6˙ 1.3 � 10�5 , Im(δ) D 0.4˙ 2.1 � 10�5 (16.159)

Note on ΔS D �ΔQ ComponentsSo far we have completely neglected the ΔS D �ΔQ contribution as it does notexist in the quark model. If one is so skeptical as to doubt CPT invariance, it islogical to include the ΔS D �ΔQ contribution [denoted as x , x or x˙ D (x˙x )/2],too. If its contribution is included, from a purely experimental point of view wehave to add ΔS D �ΔQ amplitudes for leptonic decays to those expressed inEqs. (16.143):

heCν e π�jHW jK0i D a C b ΔS D ΔQ (16.160a)

he�ν e πCjHW jK0i D a� � b� ΔS D ΔQ (16.160b)

he�ν e πCjHW jK0i D c C d ΔS D �ΔQ (16.160c)

heCν e π�jHW jK0i D c� � d� ΔS D �ΔQ (16.160d)

We define ΔS D ΔQ rule-breaking parameters by

x Dc� � d�

a C b'

c� � d�

a(1C y ) , x D

c� C d�

a � b'

c� C d�

a(1� y )

(16.161)

We can also separate them into CPT-conserving xC D (xC x )/2 and CPT-violatingx� D (x � x)/2 terms. These parameters enter in the asymmetries A Δ m, A T, A CPT

and can be determined if the asymmetries are reanalyzed including these parame-ters. So far there is no sign of them and their upper-limit values are given by [107,238]

Re(x ) D 1.9˙ 4.1 � 10�3

Im(x ) D 1.2˙ 1.9 � 10�3

Re(x�) D 0.2˙ 1.3 � 10�2

Im(xC) D 1.2˙ 2.2 � 10�2

(16.162)

K�K0

Mass Difference and Decay Width DifferenceWe learned that CPT invariance means

CPT invariance:! Λ11 D Λ22 (16.163a)

or mK0 D mK0 , ΓK0 D ΓK0 (16.163b)

Page 685: Yorikiyo Nagashima - Archive

660 16 Neutral Kaons and CP Violation*

Let us express the mass and width difference in terms of the CPT-violating param-eter δ. The CPT-violating mixing parameter is given by

δ DΛ22 � Λ11

2ΔλD

sin φSW e i φSW

2Δm

�i(M11 �M22)C

Γ11 � Γ22

2

�M11 �M22 D mK0 � mK0 , Γ11 � Γ22 D ΓK0 � ΓK0

(16.164)

It is convenient to decompose δ into δk and δ? as in Eq. (16.124):

Re(δ) D δk cos φSW � δ? sin φSW (16.165a)

Im(δ) D δk sin φSW C δ? cos φSW (16.165b)

where δk, δ? are signed real numbers. Then

mK0 � mK0 D M11 �M22 D2Δm

sin φSWδ? D ΔΓ f�Re(δ) tan φSW C Im(δ)g

' 2Δmf�Re(δ)C Im(δ)g (16.166a)

ΓK0 � ΓK0 D Γ11 � Γ22 D4Δm

sin φSWδk D 2ΔΓ fRe(δ)C Im(δ) tan φSWg

' 2ΔΓ fRe(δ)C Im(δ)g (16.166b)

As we have already determined Re(δ) and Im(δ), the mass and decay width differ-ence can also be evaluated. Inserting values of Re(δ) and Im(δ) in Eqs. (16.153)and (16.159), we obtain [107]

δk D 1.9˙ 2.0 � 10�4

δ? D �1.5˙ 2.0 � 10�4 (16.167)

and subsequently from Eqs. (16.166) we have [107, 236]

mK0 � mK0 D 0.5˙ 5.8 � 10�19 GeV (16.168a)

ΓK0 � ΓK0 D �1.5˙ 2.0 � 10�18 GeV (16.168b)

Dividing Eq. (16.168) by the mass and width of the kaon, one finds the CPTviolation can be at most [311]

mK0 � mK0

mK0. 0.8 � 10�18 (16.169a)

ΓK0 � ΓK0

ΓSD 5.2˙ 5.8 � 10�4 (16.169b)

Equation (16.169a) is considered the most stringent proof of CPT invarianceat this time. Equation (16.169b) is a measure of direct CPT violation [see alsoEq. (16.175a)].

Page 686: Yorikiyo Nagashima - Archive

16.4 Test of T and CPT Invariance 661

Direct CPT Violation in 2π Decays*The observed value arg(ω) D �55.6ı is in reasonable agreement with δ2 � δ0 D

�47.6˙ 1.5ı . Therefore the expression for ω without CPT invariance assumption

ω D1p

2

ReA 2 C iImB2

ReA 0 C iImB0(16.170)

indicates ImBI � ReA I . With this assumption, the expressions for A L, α, �, �0 inEqs. (16.95) and (16.102)–(16.103) are modified as follows:

12

A L ! Re(�)��

Re(δ)�RebRea

D Re(�)�

nRe(δ)C Re(y )

o(16.171a)

iImA 0

ReA 0!

iImA 0 C ReB0

ReA 0 C iImB0' i

ImA 0

ReA 0C

�ReB0

ReA 0

(16.171b)

�0 D Q� � δ C iImA 0

ReA 0! �T C i δ' �

�δ �

ReB0

ReA 0

(16.171c)

�0 ! i ω�

ImA 2 � iReB2

ReA 2 C iImB2�

ImA 0 � iReB0

ReA 0 C iImB0

' i ω�

ImA 2

ReA 2�

ImA 0

ReA 0

�C ω

�ReB2

ReA 2�

ReB0

ReA 0

(16.171d)

Those in large braces f� � � g are newly added CPT-violating terms. Relations betweenredefined �, �0 and η00, ηC� from Eq. (16.120) do not change.

If we approximate ΓK0 � ΓK0 by its 2π intermediate states, a calculation similarto what we carried out to derive ΓS ' 2(ReA 0)2� (see Eq. (16.113)) leads to

ΓK0 � ΓK0 ' �hnjA(K0 ! πCπ�)j2 C jA(K0 ! π0π0)j2

o�njA(K

0! πCπ�)j2 C jA(K

0! π0π0)j2

oiD �

hjA 0 C B0j

2 C jA 2 C B2j2�

�jA�

0 � B�0 j

2 C jA�2 � B�

2 j2�i

D 4�[ReA 0ReB0 C ReA 2ReB2]

' 2ΓS

�ReB0

ReA 0C 2jωj2

ReB2

ReA 2

' 2ΓSReB0

ReA 0

(16.172)

Substituting Eq. (16.172) into Eq. (16.164), we obtain

Qδ � δ �ReB0

ReA 0' i sin φSW e i φSW

�mK0 � mK0

2ΔmC

ReB0

ReA 0

�(16.173)

Relations between η, �, �0, δ and ReB0/ReA 0 are depicted in Fig. 16.14.We see that Qδ k δ?, namely Qδ is almost perpendicular to the superweak direction

in the 2π dominance limit (δ' D 0). In this approximation,

j Qδj ' j�j tan(φ� � φSW) (16.174)

Page 687: Yorikiyo Nagashima - Archive

662 16 Neutral Kaons and CP Violation*

iδφ

φSW

mK0 mK0

m

δ =~

φε0

δ δ

δ

ε

ε’2ε’

δ

εΤ

ε CPT

Figure 16.14 Diagram showing � when CPT invariance is not assumed (not to scale). CPT-invariant � is denoted as �CPT. In the 2π dominance approximation, δ' D 0, the CPT-violatingterm Qδ is almost orthogonal to �.

Experimentally, φ� � φSW ' φC� � φSW D 0.61˙ 0.62˙ 1.01ı .An estimate of the CPT-violating 2π decay amplitude can be obtained as follows.

From Eq. (16.172)

ReB0

ReA 0'

ΓK0 � ΓK0

2ΓSD

δkcos φSW

(16.175a)

As was discussed in Eq. (16.138), arg(�0) ' φSW and Eq. (16.171d) tells us that

ω�

ReB2

ReA 2�

ReB0

ReA 0

�' ��0

?(16.140)DDDD

jηC�j3

Δφ (16.175b)

Substituting experimental values of Δφ and δk allows ReB0/ReA 0 and ReB2/ReA 2

to be estimated from Eqs. (16.175) [107]

ReB0

ReA 0D 2.6˙ 2.9 � 10�4 ,

ReB2

ReA 2D 1.3˙ 4.5 � 10�4 (16.176)

We conclude that under the present circumstances, there is no sign of CPT viola-tion.

16.4.4Possible Violation of Quantum Mechanics

Theoretically, CPT invariance is deeply rooted in quantum field theory. Its violationmeans that one of Lorentz invariance, locality, unitarity or the principle of quantum

Page 688: Yorikiyo Nagashima - Archive

16.4 Test of T and CPT Invariance 663

mechanics is not correct. Unitarity is related to the conservation of probability andLorentz invariance (or its extension, Poincaré invariance) to conservation of ener-gy or angular momentum, therefore the most suspicious was violation of localityin the strong interaction. However, the success of the Standard Model makes thepossibility very small. With the advent of quantum gravity as exemplified by blackhole radiation, it was pointed out that the existence of a quantum gravitational fieldmight possibly affect the propagation of quantum mechanical waves [201]. In otherwords, when we discuss CPT violation, we have to dare to intrude into the sacredarea of quantum mechanics. Ellis and others in [130, 210] proposed experiments totest the breakdown of quantum coherence due to quantum gravity. In their treat-ment, CPT violation is incorporated in the breakdown of quantum mechanics,where a pure state develops into a mixed state. They expressed the effect in termsof three parameters α, , γ . These parameters have dimensions of mass and areexpected to have values� m2

K /mPlanck � 2�10�20 GeV. Phenomenologically, theireffect most conspicuously appears as the modification (destruction) of macroscop-ic interference effects, for example in neutron propagation and K ! 2π, 3π, π l ν.In the neutral kaon decays, the effect could appear as the loss of coherence of theneutral kaon wave function. Figure 16.15 shows how this quantum decoherenceeffect would appear in the asymmetry of K ! 2π and K ! l πν [107], which wehave already shown in Fig. 16.9 and Fig. 16.1. So far no breakdown of quantummechanics has been observed. The experimental data can set upper limits to theseparameters, which are approaching the estimated values of � 10�20 GeV.

Another test bench for the possible breakdown of quantum mechanics is provid-ed by DAPHNE and the B-factory, where correlated pairs of K0–K

0(or KS–KL) and

B0–B0

are copiously produced [56, 236]. Here, one can also test quantum entan-glement or the Einstein–Podolsky–Rosen (EPR) paradox [128] in connection withquantum decoherence. We refer to [60].

− Δ

τ τ (a) (b)

Figure 16.15 The measured decay-rate asymmetries (a) A2π and (b) A Δm analyzed for a possi-ble loss of coherence. The dashed lines represent the expected asymmetries with positive valuesof α, and γ , which are 10 times larger than the obtained upper limits [107].

Page 689: Yorikiyo Nagashima - Archive

664 16 Neutral Kaons and CP Violation*

16.5Experiments on CP Parameters

Since all CP- and/or CPT-violation tests depend so much on precision experiments,we need to study how they are carried out and understand their limitations. Weshould also be alert in devising improved or new techniques or proposing newmethods. Most of the CP-violation measurements made before 2000 were carriedout by fixed-target experiments, in which a high-energy proton beam bombardeda target to produce neutral kaons (charged particles were removed by sweepingmagnets). Because of the collimated kaon beam with high momentum, most ex-periments had to optimize their apparatus for specific purposes. With the adventof collider accelerators, a comprehensive study of CP phenomena became possible.For some specific purposes, however, where high flux of the kaon beam is required,the fixed-target experiments still have an advantage, although by no means an easyone. It took two large groups (NA31! NA48 and E731! KTeV) 20 years to deter-mine just one number, the direct CP violation parameter �0. We will describe howthe experiments were done.

Two collider detector experiments on kaons are worth mentioning: CPLEAR atCERN and KLOE at Frascati in Italy. Since they deal with low-energy kaons, exper-iments with a large solid angle (almost 4π) and covering 0–20 KS decay regionswere possible.

The former used a low-energy antiproton beam to tag neutral kaons K0 (K0) by

p p ! K˙π�, and the latter produced correlated kaon pairs by e� C eC atp

s Dmφ in a reaction

e� C eC ! φ ! K0K0

or KCK� with J P C D 1�� (16.177)

The detector concept is very similar in both experiments. Here we describe theCPLEAR detector.

16.5.1CPLEAR

CPLEAR was an experiment to study CP violation in kaon decays. Tagged K0 andK

0were produced by stopping low-energy p in a pressurized (at 16 bar) hydrogen

gas target. The beam had a flux of 106 p s�1 and its momentum was 200 MeV/cwith a spread of 5 � 10�4. The accumulated total number of p was 1.1 � 1013. Theuse of liquid hydrogen was avoided to minimize the amount of matter in the decayvolume. The incoming p beam was delivered by the Low Energy Antiproton Ring(LEAR) facility at CERN and the size of the beam was about 3 mm FWHM. K0 andK

0were identified by the reactions

p at rest C p ! (K�πC)C K0

! (KCπ�)C K0 (16.178)

Page 690: Yorikiyo Nagashima - Archive

16.5 Experiments on CP Parameters 665

Figure 16.16 CPLEAR detector. Longitudinal view [107, 109].

each having a branching ratio of 2�10�3. The conservation of strangeness in thestrong interaction dictates that a K0 is accompanied by a K� and a K

0by a KC.

The highest momentum of the kaon was � 750 MeV/c, the mean decay lengthof KS was 4 cm and the cylinder volume with radius of 60 cm corresponded toroughly 20 KS mean flight paths, covering the interesting interference region. Thewhole detector was embedded in a warm i.e. normal conducting solenoidal magnet,which provided a 0.44 T uniform field [109].

π

π

π

π

π(a) (b)

Figure 16.17 Display of CPLEAR events.p p ! K�πCK0 with the neutral kaon de-caying to (a) πCπ� and (b) e� πCν. Parts ofdetectors: T; gas target, S2, C, S1; Cherenkov

and scintillator counters that make a particleindentifier. Others can be identified by com-paring with Fig. 16.16 [107, 109].

Page 691: Yorikiyo Nagashima - Archive

666 16 Neutral Kaons and CP Violation*

The general layout is shown in Figure 16.16. A series of cylindrical tracking de-tectors (drift chambers, streamer tubes, etc.) provided information about the tra-jectories of charged particles to determine their charge signs, momenta and posi-tions. The reconstructed track resolution was . 350 μm in r and r φ and 2 mm inz; 0.05% . Δp/p . 0.1%. Particle identification (PID) was carried out by a com-bination of a threshold-type Cherenkov detectors, to identify charged kaons, andscintillation counters, which measured the time of flight and dE/dx . The outer-most detector was a lead/gas-tube sampling calorimeter (ECAL) to detect photonsproduced in π0 decays.

Two transverse views of the detector together with typical event displays areshown in Fig. 16.17. Both of them show K�πC tracks originating in the hydro-gen target. In Fig. 16.17a, K0 has decayed to two charged pions while in (b) it hasdecayed according to K0 ! e�πCν. The total number of events measured was6.4 � 105.

16.5.2NA48/KTeV

It may be surprising to learn that at the turn of the century, 30 years after theirdiscovery, all the known CP-violation phenomena were still explained by a singlenumber � and were consistent with the superweak model, which denied the exis-tence of direct CP violation in the decay or in any place other than in neutral kaondecays. Although the Standard Model predicted its existence and demonstrated itsvalidity in the B-meson factory experiments later, proof of direct CP violation, alsoknown as �0, was one of the biggest issues at the turn of the century. The issue wassettled by two definitive experiments, NA48 [282] at CERN and KTeV [244] at Fer-milab. They are good examples of how modern state-of-the-art experiments usingfixed targets are carried out and deserve some detailed explanation.

Assuming CPT invariance, the effect can be tested experimentally by measuringthe double ratio of KL and KS decaying to both πCπ� and π0π0:

R DΓ (KL ! πCπ�)/Γ (KS ! πCπ�)

Γ (KL ! π0π0)/Γ (KS ! π0π0)D

ˇˇ ηC�

η00

ˇˇ2 ' 1C 6Re

��0

�Re��0/�

�' 1.6 � 10�3 (16.179)

The above number (10�3) may not seem impressive, but in terms of decay asym-metry, which is represented by the value of Re(�0) itself (see Problem 16.7) ratherthan its ratio to �, one realizes the difficulty of the experiment:

δ f DΓ (K

0! f ) � Γ (K0 ! f )

Γ (K0! f )C Γ (K0 ! f )

D

(�2Re(�0) D �5.2 � 10�6 f D πCπ�

C4Re(�0) D 10.4 � 10�6 f D π0π0

(16.180)

Page 692: Yorikiyo Nagashima - Archive

16.5 Experiments on CP Parameters 667

Figure 16.18 Scheme of the beam arrange-ment for the NA48 experiments. The KL beamcomes straight from the KL target. The atten-uated primary beam is bent and hits the KStarget 120 m downstream of the KL target.

AKS is the KS anti-counter located at the exitof the KS collimator. The two beams are aimedat the center of the detector with crossing an-gle 0.6 mrad. Note the difference between thevertical and horizontal scales [282].

In such experiments, one needs not only to accumulate high statistics using pre-cision detectors, which can only be constructed using modern technology, but alsoto minimize the systematic error, which constrains the ultimate precision. Mea-surement of the double ratio has the inherent advantage of canceling many sys-tematics. By measuring the four decay modes simultaneously, any time variationor background correction common to all four modes cancels out.

Both experiments are similar in detector concept, differing mainly in the prepa-ration of the beam. NA48 makes separate KL and KS lines that converge at thedetector. KTeV makes two parallel KL beam lines and uses one to regenerate aKS beam. Figure 16.18 shows the layout of the NA48 beams. A proton beam(� 1.5 � 1012 protons per pulse, 450 GeV/c), delivered by the SPS at CERN, im-pinges on a target that is located 126 m upstream of the decay volume. The chargedcomponents are removed by sweeping magnets. The neutral beam consists almostentirely of photons, neutrons, and K0

L mesons, with some few residual K0S , Λ and

� particles at the highest energies. The remaining primary beam goes througha bent crystal and about 10�5 of it, satisfying the channeling condition,16) goesthrough the tagger and hits the KS target just 6 m upstream of the decay volume.The tagger is used to separate KS events from those of KL origin in the time do-main. The energy of the neutral kaons is 70–170 GeV with average 110 GeV. Thewhole target and the collimator system are aligned along an axis pointing to the

16) If a particle goes through a single crystal at an angle close to a major crystal direction, coherentscattering on the lattice nuclei forces the particle to follow the lattice direction. The role of thebent crystal is similar to that of a bending magnet, with reduced intensity and excellent selectionof well-defined emittance. Its effect was used to separate desired particles from the halo of thebeam [122].

Page 693: Yorikiyo Nagashima - Archive

668 16 Neutral Kaons and CP Violation*

25 cm

120 140 160 180Z Distance from kaon production target (meters)

Beams

Regenerator BeamVacuum Beam

AnalysisMagnet

Regenerator

DriftChambers

TriggerHodoscope

MuonVeto

Steel

CsI

Mask Anti

Photon VetoesPhotonVetoes

CA

Figure 16.19 Top view of the KTeV (E832) detector. The evacuated decay volume extends toz D 159 m [244].

center of the detector 120 m away, such that the two beams intersect at this pointwith an angle of 0.6 mrad.

In KTeV, a primary proton beam from the Fermilab Tevatron [3�1012 protons perpulse, 800 GeV/c in a 20 s extraction cycle (“spill”)], on the other hand, makes twoidentical KL beams, one of which hits the regenerator target 120 m downstreamof the primary target. The average kaon momentum is 70 GeV/c, somewhat lowerthan that of NA48 because of the different primary energy and beam distance fromproduction to decay point. The regenerator consists of 84 scintillator modules of10 � 10 � 2 cm3, each viewed by a photomultiplier. The last module defines thebeginning of the decay volume. The total length is 170 cm, corresponding to abouttwo hadronic interaction lengths, and the magnitude of the regeneration amplitudeis j�j � 0.03.

For the detector we mainly explain the KTeV apparatus, with comparisons whereappropriate, because the concepts of KTeV and NA48 are very similar. Figure 16.19shows the top view of the KTeV(E832) detector. The evacuated decay volume ex-tends to z D 159 m.

Two typical event displays, one showing K ! πCπ� and the other! 2π0, areshown in Fig. 16.20.

Reconstruction of K ! πCπ� A magnetic spectrometer composed of a magnet,which gives a kick of 412 MeV/c, drift chambers and triggering hodoscopes recon-structs trajectories of charged particles with the momentum resolution σ p /p D[1.7 ˚ (p/14)] � 10�3, where p is the momentum in GeV/c. Kaon identificationwas performed by reconstructing the invariant mass (16.44) of the two chargedparticles (see Fig. 16.20, left), using the same method as the first CP-violation ex-periment by Fitch and Cronin, but with far better accuracy. The position resolutionis � 100 μm and the mass resolution is � δmK 1.6 MeV/c2. Within the ac-

Page 694: Yorikiyo Nagashima - Archive

16.5 Experiments on CP Parameters 669

0150005001

1 0

0 5

0 0

0 5

1 0

100 120 140 160 180 200

1 5

1 0

0 5

0 0

0 5

1 0

1 5

(a) (b)

Figure 16.20 Event display: Upper figures show CsI configuration and deposited energy clus-ters. Lines are trajectories (shown in lower figures) of reconstructed pions projected on the CsIplane with vertex at crossing. (a) K ! πCπ�; (b) K ! π0π0 ! 4γ [391].

ceptance window of 488 < mK < 508 MeV/c2, backgrounds for KL ! πCπ�(0.1%) come mainly from K ! πe(μ)ν, where e(μ) were misidentified as pions.The p 2

T distribution can separate KS ! 2π from the coherent regenerated beam(cut on p 2

T < 250 MeV/c2) from that of the background (diffraction scattering), aswas shown in Fig. 16.5. It also tells us that major background for KS ! πCπ�(0.087%) comes from regenerator scattering.

Reconstruction of K ! π0π0 A CsI electromagnetic calorimeter is composed of2232 inner cells (2.5�2.5 cm2) and 868 larger outer cells (5�5 cm2), covering a totalarea of 1.9�1.9 m2 with two holes in the center through which the beams pass. Theconfiguration is shown in Fig. 16.20. The depth of each crystal is 50 cm (27X0). Theenergy resolution is σE /E D 2%/

pE˚0.4%, where E is in GeV. K ! π0π0 ! 4γ

deposits energy at four places (see Fig. 16.20, right). The energy E1, E2 of the two

Page 695: Yorikiyo Nagashima - Archive

670 16 Neutral Kaons and CP Violation*

photons satisfies the relation

m2π0 D (E1 C E2)2 � (p1 C p 2)2 D 2E1E2(1� cos θ ) 4E1E2θ 2

4E1E2r212/z2 (16.181)

where z is the distance to the decay position, r12 is the distance between the twoenergy clusters in the detector and θ is their opening angle from the decay point.Assuming any 2γ ’s come from π0 decay, the vertex position is reconstructed. Outof three (=4C2/2) pair combinations made of 4γ ’s, two with the minimum separa-tion are chosen and mK is reconstructed (Fig. 16.21b). NA48 uses a liquid kryptonionization chamber and its energy resolution is σE /E D 3.2 ˙ 0.2%/

pE ˚ 9 ˙

1%/E ˚ 0.42˙ 0.05%. The vertex position was determined with a method similarto KTeV using the formula d D

Pi> j

pEi E j ri j /mK with assumed mK . The res-

olution of the longitudinal vertex position was on the order of 20–30 cm for KTeVand � 50 cm for NA48. Backgrounds for KL ! 2π0 (0.484%) come mainly from3π0, where 2γ ’s are missing, and for KS ! 2π0 (1.235%) mainly from the regen-erator and collimator scattering. The backgrounds for neutral decay modes are 10times larger than those for charged decay modes and are one of the major sourcesof errors. This is due to the energy resolution of the electromagnetic calorimeterand inapplicability of the p 2

T cut to eliminate the regenerator scatterings.

Other Detectors At the back end are located steel absorbers and muon identifiers.The decay volume and some parts of the detector are equipped with an extensivearray of photon veto counters to reduce trigger rates and backgrounds and to definesharp apertures and edges that limit the detector acceptance.

Overall Correction Separation of regenerated KS ! 2π events from those of CP-violating KL ! 2π is physically clear as the two beams are � 20 cm apart. Thesolid angle difference due to separation and the fact that they are not aligned alongthe center of the detector could be a potential source of systematic errors. Theregenerator also produces some background. KTeV observes four decay modes si-multaneously, by frequently switching the regenerator position alternately betweenthe two beam positions and also the polarity of the magnet. Any time fluctuationcancels when the ratio is taken and the geometrical difference averaged out. Note,however, some backgrounds contribute differently to the four modes, hence do notcancel and have to be corrected carefully for each mode. Use of high resolutiondetectors is considered essential for removing such background.

The total numbers of events accumulated are

N(KL ! π0π0) 3.4 � 106

N(KS ! π0π0) 5.6 � 106

N(KL ! πCπ�) 11.2 � 106

N(KS ! πCπ�) 19.4 � 106

Total 40 � 106

(16.182)

Before Re(�0/�) can be extracted from these data, backgrounds must be identifiedand subtracted, and acceptance corrections must be determined. Both these tasks

Page 696: Yorikiyo Nagashima - Archive

16.5 Experiments on CP Parameters 671

π π

π

(a)

(b)

Figure 16.21 Reconstructed mass distribution of 2π from vacuum (KL) beam. (a) πCπ� withestimated backgrounds by Monte Carlo. (b) π0π0. Backgrounds are mainly from 3π0 with twophotons missing [244, 347].

Page 697: Yorikiyo Nagashima - Archive

672 16 Neutral Kaons and CP Violation*

→ π π → π πE

vent

s pe

r m

eter

(th

ousa

nds)

Eve

nts

per

met

er (

thou

sand

s)

m( Z xetreV)m( Z xetreV )

Vacuumbeam

Regeneratorbeam

Vacuumbeam

Regeneratorbeam

(a) (b)

Figure 16.22 z-Vertex distribution of reconstructed (a) K ! πCπ� and (b) K ! π0 π0 for thevacuum beam (dashed) and regenerator beam (line histogram) [244].

require Monte Carlo simulations of the experiment. Especially important is thedecay vertex distribution, which differs widely between KL and KS, as shown inFig. 16.22. While the KL decay distribution is almost flat in the decay volume,that of KS halves in about 2 m. This could be a potential source of serious sys-tematic errors. NA48 uses a weighted z-distribution for KL vertices [weight factor� e�ΔΓSt D e(z/γ )(1/τS�1/τL)] to make them similar, which contributes to cancella-tion of many systematics but at the expense of statistics. KTeV, on the other hand,relies on elaborate Monte Carlo simulations to assure the correctness of recon-struction. Figure 16.23 shows examples of decay distributions for K ! πCπ�and KL ! πe(μ)ν. The agreement is excellent, but the degree of disagreementindicates the limits of systematic errors, too. In the end the largest single errorcomes from the overall reproducibility of the events, exemplified by the KL decaydistribution in Fig. 16.23.

Both NA31/NA48 and E731/KTeV spent nearly twenty years in deriving theirfinal results. During this time, they both learned from experience and from eachother. The single biggest issue common to both experiments was to determinegeometrical efficiency and, in the end, their differences were minor compared totheir similarity.

Extraction of Re(�0/�) KS components include regenerated KS / � and CP-violating KS / η, which are not distinguishable. The number of events is propor-tional to the square of the sum η C � [see Eq. (16.86)] and is a function of �/η.The fitting must be carried out including not only Re(�0/�) but also Δm , τS, thephases φ00, φC� and �. The accuracy can be cross-checked by floating well-knownparameters such as τS and Δm to see if they agree with the known values. The end

Page 698: Yorikiyo Nagashima - Archive

16.6 Models of CP Violation 673

π π

→ π π → π ν

π π(a) (b)

Figure 16.23 Vertex distributions of KL from K ! πCπ� (a) and K ! π lν (b) for bothreconstructed data and Monte Carlo (MC) simulations. Lower figures show the ratio data/MC,indicating good agreement [244].

results of KTeV and NA48 are [51, 244, 245, 311]

Re(�0/�) D 2.07˙ 0.26 � 10�3 KTeV (16.183a)

Re(�0/�) D 1.47˙ 0.22 � 10�3 NA48 (16.183b)

weighted average D 1.65˙ 0.23 � 10�3 PDG08 (16.183c)

which clearly established the nonzero value of �0.

16.6Models of CP Violation

In discussing models of CP violation, a primary assumption is CPT invariance.Experimentally, the relative strength of the CPT-violating strong interaction is es-timated from Eq. (16.169a) to be . 10�18. Dividing this number by α, the upperlimit of CPT violation in the electromagnetic interaction is found to be < 10�16,

Page 699: Yorikiyo Nagashima - Archive

674 16 Neutral Kaons and CP Violation*

and by dividing by GF m2K , that of the weak interaction < 10�14. These values give

the most stringent test of CPT violation. For the electromagnetic interaction, thereexists a test by the magnetic moment of the electron

[g(eC) � g(e�)]/gaverage D �0.5˙ 2.1 � 10�12

[g(μC)� g(μ�)]/gaverage D �0.11˙ 0.12 � 10�8(16.184)

but all other tests give at most � 10�4. CPT symmetry is known to hold withexcellent accuracy.

Theoretically, CPT invariance has rock-solid foundations, as described inSect. 16.4.4. It is very hard to construct a self-consistent theory without CPTinvariance. Therefore, we assume its invariance in the discussions below. In thefollowing paragraphs we list models that accommodate CP violation (denoted as��CP). The models are classified according to where one seeks the origin of CP vi-olation. The reason for listing models other than the Standard Model is becausethe Kobayashi–Maskawa model is considered OK within the Standard Model, butnot sufficient to explain the matter–antimatter asymmetry of the universe, andtheories beyond the Standard Model, including GUT (grand unified theory) andsuperstring models, are hot issues in the world of high-energy physics.

CP Violation in the Strong Interaction The model assumes that ��CP occurs in twosteps. First, KL decay to X in a CP-conserving interaction, followed by X ! 2πin the strong interaction. If we let f be the strength of the transition amplitudeX ! 2π, f has to be � 10�3 to explain jηj � 10�3 and is referred to as themillistrong force. Since CPT invariance is assumed, the interaction is one of fourpossible processes:

(1) P,C, �T (2) �P,C, T (3) �P, C, �T (4) �P,C, �T

In the electromagnetic transitions of atomic levels, a small mixture of different par-ity components induces interference with the main component of the amplitudeand rotates the direction of polarization. This tiny polarization shift is experimen-tally observed, but can be explained quite well by the neutral current effect of theStandard Model, hence models (2),(3) and (4), which include P violation, are ex-cluded [50, 70]. The possibility of millistrong interaction remains in principle, butno one has conceived of a natural way to embed CP violation in QCD and the pos-sibility does not seem to be taken seriously.

CP Violation in the Electromagnetic Interaction If the electromagnetic interactionis the cause of CP violation, α � jηj means CP has to be violated maximally.However, experimentally observed asymmetry values in the decay of η mesonsη ! πCπ�π0, πCπ�γ give 0.9 ˙ 1.7 � 10�3, 9 ˙ 4 � 10�3 [311]. ThereforeC violation in the electromagnetic interaction is small if any. The P-violating com-ponent does not exist beyond that allowed by the Standard Model, as was explainedin the previous section. Therefore, the possibility of CP violation originating in theelectromagnetic interaction is excluded.

Page 700: Yorikiyo Nagashima - Archive

16.6 Models of CP Violation 675

Superweak Model The superweak model assumes an interaction HSW withΔS D 2 is the cause of CP violation. The CP-violating term hK0jHSWjK

0i appears

in the mass matrix Λ12. Hermiticity of HSW dictates the CP-violating componentmust satisfy Λ21 D Λ�

12. Then from Eq. (16.67)

� DΛ21 � Λ12

2ΔλD �

Im(hK0jHSWjK0i)

(Γ S � ΓL)/2 � iΔm(16.185)

Therefore, the relation φ� D φSW D tan�1(2Δm/ΔΓ ) holds exactly (strictly speak-ing,Cπ is possible theoretically but excluded experimentally). Calling the strengthof the superweak model � f GF, we have f � �Δm/GFm3

K � 10�11. The effectappears only in the mass matrix and does not appear in the ΔS D 1 decay ampli-tude, hence �0 D 0. Therefore, the following equalities are exact in the superweakmodel:

jηC�j D jη00j, φC� D φ00 D φ� D φSW (16.186)

In this model, all the CP-violating effects can be described by a single parameter �

and do not appear in other mesons, including B mesons. Although the superweakmodel was finally excluded by NA48/KTeV and by B-factory experiments, histori-cally it played a prominent role for nearly 40 years.

Milliweak Interaction The milliweak model assumes the CP-violating effect occursin the ΔS D 1 interaction, whose strength is � 10�3. There are varieties of mod-els. One model increases the number of Higgs particles to introduce complex num-bers in the Lagrangian. Another is a left–right symmetric model, in which parityviolation is assumed spontaneous. Since the Standard Model has been established,these have become exotic models that go beyond the Standard Model but are dis-cussed in GUT in the context of the matter-dominant universe.

Origin of CP Violation in the Standard Model The CP violation effect in the Stan-dard Model belongs to the class of milliweak interactions. CP violation can be in-troduced into the Lagrangian by making the coupling constant complex. In theStandard Model, the weak interaction is mediated by W ˙ and Z0. They coupleto weak isospin doublets and the charged current interaction Lagrangian has theform of Eq. (15.200), namely

LWI D �gwp

2Ψ L γ μ(τC W C

μ C τ� W �μ )ΨL

ΨL D

�ud0

�,

�cs0

�,

�tb0

� (16.187)

If we denote u j ( j D 1, . . . , 3) D (u, c, t) and dk (k D 1, . . . , 3) D (d, s, b), theinteraction Lagrangian of the charged W boson can be written as

LW D �gWp

2

Xhu j L γ μVj k dkL W C

μ C d kLγ μV �j k u j L W �

μ

i(16.188)

Page 701: Yorikiyo Nagashima - Archive

676 16 Neutral Kaons and CP Violation*

where Vj k is the KM (Kobayashi–Maskawa) matrix element introduced inEq. (15.160b). If we apply CP transformation [Appendix F, Eq. (F.4)] to the aboveLagrangian, it is transformed to

LW D �gWp

2

Xhd kLγ μVj k u j L W �

μ C u j L γ μV �j k dkL W C

μ

i(16.189)

We see that CP invariance requires the KM matrix Vj k to be real. As the field u j , dk

has the freedom to change its phase, many KM elements Vj k can be made real us-ing this freedom. Let us see how many complex numbers the KM matrix has. Ingeneral, a N � N complex matrix has 2N 2 independent real variables. Unitari-ty requirement reduces the number to N 2. In this N 2-dimensional space of realnumbers, N C2 D N(N � 1) space rotations can be defined in terms of rotationangle θ j . The rest of the variables become phase angles φk . On the other hand, the2N quark fields have the freedom to change their phase as

U �

24u

ct

35! FU U D

24e i φu

e i φ c

e i φ t

35U (16.190a)

D �

24d

sb

35! FD D D

24e i φd

e i φ s

e i φb

35D (16.190b)

This transforms

V ! F�U V FD , namely Vj k ! e�i(φu j �φdk )Vj k (16.191)

2N � 1 phases, excluding the common overall phase, can be removed by redefini-tion of the quark fields. In the end,

N 2 �N(N � 1)

2� (2N � 1) D

(N � 1)(N � 2)2

(16.192)

phases remain. Therefore for N 2, all the matrix elements can be made real andno CP violation effect appears unless put in explicitly by hand. As CP is violated,N must be � 3 was the original motivation of introducing the KM matrix. As themeasurement of Z0 decay width shows that the number of neutrinos with mν <

mZ /2 is limited to three (see Sect. 15.7.4), it is natural to think N D 3, as is requiredby the Standard Model. In this case the KM matrix is expressed by three rotationangles and one phase angle, as in Eq. (15.162). In this convention, the dominantcontribution of the phase δ is contained in Vub and Vt d .

Therefore, in the Standard Model, Feynman diagrams that contain Vub or Vt d areCP-violating amplitudes. Referring to the Standard Model description of K

0$ K0

transition amplitude, Fig. 16.3, when the intermediate states j or j are top quarks,the transition t $ d attaches Vt d to the amplitude and it becomes a CP-violatingamplitude. But the decay amplitudes are on-mass shell; no top quarks appear in thedecay diagrams in the lowest order approximation, referred to as the tree diagram,

Page 702: Yorikiyo Nagashima - Archive

16.6 Models of CP Violation 677

K0

s

d

u

u

Wd

dπ+

πi u,c,tVis Vid*

K0

s

d

u

u

W+

d

π

K0

s

d u

uW–

d

d

π–

π+

+

(a) (b) (c)

Figure 16.24 K ! πCπ� decay amplitudes. Tree diagrams (a,b) do not contain KM matrixelement Vtd . The penguin diagram (c) contains the top quarks in the intermediate states. Thecoupling Vtd is complex, hence diagram (c) violates CP.

shown in Fig. 16.24a,b. In order to insert the top quark in the intermediate states,the higher order diagram or the so-called penguin diagram has to be introduced(Fig. 16.24c).

The Standard Model clarifies the dynamical structure of CP violation and its cal-culation gives the right order of magnitude for the experimental values, but owingto QCD complexity accurate estimates are hard to obtain.

Page 703: Yorikiyo Nagashima - Archive
Page 704: Yorikiyo Nagashima - Archive

679

17Hadron Structure

17.1Historical Overview

We have succeeded in constructing the quark model from hadron spectroscopy.What it can tell us is about the static nature of the hadron, for example its mass,electric charge, spin, isospin and strangeness. They are all characteristics that canbe observed from outside. The quark model cannot tell us about the inner struc-ture of the hadron. It cannot tell us whether an elementary particle with fractionalcharge really exists inside the hadron. In order to investigate the inner structure ofthe hadron, we need a probe that can go inside the hadron and show us its struc-ture. Qualifications for the probe include (1) its own properties are well known, (2)it has small size and (3) it can penetrate deep inside the hadron. From this point ofview, leptons are useful. One may compare the situation with the situation at thebeginning of the 20th century, when there were several models of atomic structure:the α particle was used as a probe, and the investigation of scattering patterns froma golden target led to the Rutherford model being established.

In the 1960s, the world of particle physics was in confusion. In the field of stronginteractions, confidence in field theory was weak and the idea of particle democracybased on the analyticity of the scattering amplitude was at its peak. Above all thebootstrap theory told us that all particles are equal, no elementary particles existand that they themselves are building blocks of other particles, namely they aretheir own fundamental constituents. Part of the reasoning came from an experi-mental observation of the nucleon form factor. A series of experiments by Hofs-tadter et al. [81, 206] showed that, unlike the atom, the nucleon does not have acore and the whole structure is soft like a cloud or jelly. There was no reason tobelieve in the existence of a substructure inside the nucleon. Unlike 50 years pre-viously, this time Thomson’s viewpoint seemed more pertinent than Rutherford’s.In addition to denying subatomic structure, particle physicists at that time foundthe unusual properties of the quark, such as fractional charge, incompatible withtheir common sense. Even Gell-Mann, the quark proposer himself, said for a whilethat it should be considered as a mathematical tool rather than a real model.

The situation changed drastically when deep inelastic scattering data were pre-sented in 1969. This was the Rutherford experiment brought up to date in the sense

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 705: Yorikiyo Nagashima - Archive

680 17 Hadron Structure

that it clarified the parton structure of the nucleon, namely, that the nucleon is com-posed of point-like particles. Series of deep inelastic scattering data, beginning withthose from the MIT-SLAC group, clarified that the nucleon is a quark compositeand that the quark really has fractional charge, which paved the way for quantumchromodynamics (QCD) to appear as the theory of the strong interaction.

This chapter explains the process of clarifying the inner structure of the nucleonto be a composite of point-like particles generically called partons, and the role ofdeep inelastic scattering of electrons (and muons) and neutrinos in identifying thepartons as quarks and gluons.

17.2Form Factor

Momentum Transfer as a Measure of the Size We Can Probe The electron is the bestknown and most easily obtainable probe of all time. In order to use the electron asa probe, we need to know the formula for its reaction with other particles. The basicequation is the Rutherford scattering formula introduced in Sect. 6.3.1:

dσ D2πvjH f i j

2δ(E � Ei)d3 p

(2π)3 D1

(2π)2

m pvjV f i j

2dΩ

dσdΩD

ˇˇ m2π

Zd rV(r)e i Q�r

ˇˇ2 D 4Z2 α2m2

Q4

m, Z W mass and electric charge of the target

V(r) DZ e2

4π1rD

Z αr

Q2 D jQj2 D jk i � k f j2 D 4k2 sin2 θ

2k, θ W momentum and scattering angle of the electron

(17.1)

Q is the momentum that the electron gives to the target, or the strength of the im-pact, and is called the momentum transfer. Equation (17.1) holds when the targetis a point particle. If the target is spread and has a charge distribution �(r), thepotential is modified to

V(r)! V 0(r) DZ

d r 0 V(r � r 0)�(r 0) (17.2)

and the matrix element to

V f i ! V 0f i D

Zd r V 0(r)e i Q�r D

�Zd(r � r 0) V(r � r 0)e i Q�(r�r0)

Zd r 0�(r 0)e i Q�r0

D V f i F(Q2) (17.3a)

Page 706: Yorikiyo Nagashima - Archive

17.2 Form Factor 681

F(Q2) DZ

d r�(r)e i Q�r (17.3b)

where, for simplicity, the distribution was assumed to be spherically symmetric.F(Q2) is called the form factor and is a measure of the target spread. Normalizingthe charge distribution byZ

d r�(r) D 1 (17.4)

we have

F(0) D 1 (17.5)

From the above discussion, we know that the cross-section formula for an extendedtarget is obtained from the point target by multiplying by the form factor (squared)

dσ(θ ) D dσ p tjF(Q2)j2 (17.6)

The above equation says that if we know the cross section formula of a point target,we can obtain information about the target spread by measuring the scatteringpattern. As is clear from Eq. (17.3b), the part of �(r) where Q � r 1 does notcontribute to the integral because of violent oscillation due to exp(i Q � r). Thismeans that the form factor is a measure of total charge integrated over the regionr < 1/Q. In other words, the electron has penetrated to the depth given by r ' 1/Q.The maximum value of the momentum transfer Q gives the limit to which we canprobe the inner structure of the target.

Target Size When Q � r � 1, the form factor can be expanded as

F(Q2) DZ

d r �(r)�1C (i Q � r) � (1/2)(i Q � r)2I � � �

D 1 �

Q2

6hr2i C � � �

(17.7a)

hr2i D

Zd r�(r)jrj2 D �6

dF(Q2)dQ2

ˇˇ

Q2D0(17.7b)

This means that if we can measure the cross section for small values of jQj, theinformation we can obtain is limited to the total charge (Z α) and the size of itsspread. The larger the value of jQj we can obtain, i.e., the larger the energy andscattering angle, the deeper we can probe the inner structure.

Examples of Form Factors It helps to understand the physical meaning of the formfactor if we know the spectral shape of typical spatial distributions. We list a fewrepresentative distributions.

Page 707: Yorikiyo Nagashima - Archive

682 17 Hadron Structure

Spherical Distribution �(r) D �0 for r R and �(r) D 0 for r > R (Fig. 17.1)

F(� ) D3(sin � � � cos � )

� 3, � D QR (17.8a)

This is similar to light going through a pinhole making several bright and darkrings and is called a diffraction pattern. It is a typical wave characteristic that com-monly occurs on going around behind a clear-cut boundary.Nuclear Charge Distribution The nucleus looks like a blurred sphere with radiusR D r0A1/3. Figure 17.2b shows scattering data for an electron incident on calcium.It resembles the diffraction pattern, but somewhat smoothed. A charge distributionthat reproduces the observed pattern well is depicted in Fig. 17.2a. R represents thenuclear size, t D (4 ln 3)a the thickness of the surface or the degree of blurriness.One sees valleys become shallower as the surface is blurred. Fitting parameters tothe data, we obtain, roughly, r0 ' 1.03 � 10�13 cm, t ' 2.4 � 10�13 cm.Exponential Distribution When the distribution becomes harder smoothly towardsthe core, it may be represented by an exponential or a Gaussian:

�(r) D �0 exp(�r/R) (17.9a)

F(Q2) D1

(1C Q2/R2)2(17.9b)

A good example is the sun, which is a gas sphere with intensity slowly rising to-wards the core. Note: The reader is reminded that the Yukawa potential V(r) �eμ r/r (which is a singular distribution and hence cannot be normalized) has theFourier transform V(Q2) � 1/(1C Q2/μ2), which resembles the exponential formfactor but is softer.

-0.1

1.0

0.5

0

F(Q

2 )

F(ξ)

2 6 10ξ

F(ξ) 2

R

ρ(r)

ρ0

r

(a) (b)

Figure 17.1 (a) �(r) D �0 for r � R and �(r)D 0 for r > R. (b) F(� ) D 3(sin � � � cos � )/� 3.When there is a clear-cut boundary, the form factor shows a typical diffraction pattern.

Page 708: Yorikiyo Nagashima - Archive

17.3 e–p Elastic Scattering 683

20 30 40 50 60SCATTERING ANGLE IN DEGREES

10 28

10 30

10 32

10 34

10 36

Ca40

Ca48

Best fit: this data

Best fit: at 250 MeV

e Ca SCATTERING

dσ(c

m2 /s

tr)dΩ

x10

/10

(750MeV)

(b)(a)

R

t

r

ρ(r)

1.00.9

0.10

Figure 17.2 (a) �(r) D �0/[1Cexp(r�R)/a], t D (4 ln 3)a. (b) e–Ca scattering data. From fittedcurves R D r1/3

0 , r0 D 1.03� 10�13 cm, t D 2.4� 10�13 cm are obtained [57].

Gaussian Distribution

�(r) D �0 exp (�r2/2σ2) (17.10a)

F(Q2) D exp(�Q2σ2/2) (17.10b)

Problem 17.1

Derive Eqs. (17.9) and Eqs. (17.10).

17.3e–p Elastic Scattering

Electromagnetic Form Factor of the Nucleon The proton has spin s D 1/2 and aspatially spread structure. If it were a point particle, namely a Dirac particle, theelastic scattering cross section with an electron

e(ω, k)C p (E , p )! e(ω0, k0)C p (E 0, p 0) (17.11)

could be obtained from the e–μ scattering cross section in the laboratory (LAB)frame [Eq. (7.20)] by replacing the target mass by the proton mass M:

dσdΩ

ˇˇLABD σM

ω0

ω

�1 �

q2

2M2tan2 θ

2

�(17.12a)

Page 709: Yorikiyo Nagashima - Archive

684 17 Hadron Structure

σM D σMott D4α2ω0 2

Q4cos2 θ

2(17.12b)

q2 D �Q2 D (k � k0)2 ' �2ωω0(1� cos θ )

ω0 D ω/[1C ω(1� cos θ )/M ] (17.12c)

Q2 D �q2 is a relativistic version of Q2. The formula is valid for the electronmass set to m D 0. The first term in [� � � ] represents scattering by the electricCoulomb field created by the target and the second by the magnetic field. Whenthe target spread is taken into account, the charge distribution and the magneticmoment distribution are generally different and two different form factors need tobe introduced. In order to see how to include them, we express the most generalproton current in the form

q Jpμ(x ) D ehp 0j J μ(x )jp i � e Jp

μ(p 0, p )e�i q�x

D eu(p 0)h

F1(q2)γ μ C�

2MF2(q2)i σμνqν

iu(p )e�i q�x (17.13)

From Lorentz invariance alone, other terms proportional to qμ D (p 0 � p )μ, (p 0 Cp )μ are allowed. However, using current conservation @μ Jp

μ D 0 and the Diracequation that the plane wave functions u(p ), u(p 0) satisfy, it can be proved that theabove two terms are enough.

Problem 17.2

Show that a term proportional to qμ D (p 0 � p )μ, if it exists in Eq. (17.13), isforbidden by current conservation. Next, prove the equality which is valid whensandwiched between u(p 0) and u(p ):

γ μ D1

2M

�(p 0 C p )μ C i σμν qν

(17.14)

which is referred to as the Gordon equality.

The Gordon equality shows that the term proportional to (p 0 C p )μ is alreadyincluded in Eq. (17.13). To understand the physical meaning of Eq. (17.13), rewriteit as

e Jpμ(p 0, p ) D eu(p 0)

�F1(q2)

(p 0C p )μ

2MC

F1(q2)C �F2(q2)2M

i σμν qν

�u(p )

(17.15)

and take nonrelativistic limit [p 0 μ, p μ ! (M, 0)]. Then expressing A μ D (φ,�A)

(p 0 C p )μ A μ ! 2M φ , i σμν qν A μ ! i σ i j q j A i � σ � r � A D σ � B

(17.16a)

If the form factors, are normalized as F1(0) D 1, F2(0) D 1, we have

e Jpμ(p 0, p )A μ ! eφ C (1C �)

e2M

σ � B (17.16b)

Page 710: Yorikiyo Nagashima - Archive

17.3 e–p Elastic Scattering 685

From this equation, we see the first term � γ μ contains magnetic moment (Diracmoment) interaction as well as the Coulomb interaction, but the contribution ofthe anomalous magnetic moment is contained in the second term � i σμν qν. Asthe proton has charge e and magnetic moment μ p D (1C � p )e/2M

F1p (0) D 1 , F2p (0) D 1 , � D � p (17.17)

Similarly, as the neutron has total charge 0, although the inside distribution maynot vanish,

F1n(0) D 0 , F2n(0) D 1 , � D �n (17.18)

The experimental values are given by

� p D 1.79 , �n D �1.91 (17.19)

Rosenbluth Formula Using the proton current Eq. (17.13) and the electron current

q j eμ(x ) D �ehk0j j μ(x )jki D �eu(k0)γ μ u(k)e�i q�x

q D k � k0 (17.20)

The electron–proton elastic scattering amplitude is given in the one-photon-ex-change approximation as

i T f i D �e2Z

d4x j eμ(x )

��i gμν

q2

�Jp

ν(x ) (17.21)

and the cross section becomes

dσdΩ

ˇˇLABD σM

ω0

ω

��F 2

1 ��2q2

4M2 F2

��

q2

2M2 (F1 C �F2)2 tan2�

θ2

��(17.22)

Defining

GE(q2) D F1(q2)C�2q2

4M2 F2(q2) , GM(q2) D F1(q2)C �F2(q2) (17.23)

the cross section Eq. (17.22) is rewritten as

dσdΩ

ˇˇLABD σM

ω0

ω

�G2

E C τG2M

1C τC 2τG2

M tan2�

θ2

��(17.24a)

τ DQ2

4M2D �

q2

4M2(17.24b)

Q2 D �(k � k0)2 D 4ωω0 sin2 θ2

, ω0 Dω

1C (ω/M )(1� cos θ )(17.24c)

This is referred to as the Rosenbluth formula. In the limit q2 ! 0, GE ! 1,GM ! 1C �, GE(q2), GM(q2) are considered as expressing the electric charge andmagnetic moment distributions, respectively.

Page 711: Yorikiyo Nagashima - Archive

686 17 Hadron Structure

Early data on elastic e–p scattering came from electron beams at Stanford univer-sity and SLAC (Stanford Linear Accelerator Center). As the cross section has theform

1σM

dσdΩD A(Q2)C B(Q2) tan2

�θ2

�(17.25)

by changing angle θ at fixed Q2, A and B and consequently GE, GM can be obtainedas a function of Q2. Figure 17.3 shows data on the proton magnetic moment.

The observed data can be fitted well with experimental formulae in the range0 Q2 30 (GeV/c)2

G pE D

G pM

μ pD

G nM

μnD

1�1C (Q2[(GeV/c)2]/0.71)

2

G nE (Q2) D 0

(17.26)

Comparing the expression with Eq. (17.9), we see the charge distribution inside theneutron is neutral, as it is outside, and all other distributions are very smooth fromoutside to the core and well represented by an exponential shape. This means thesize of the nucleon is aboutp

hr2i ' 0.7 � 10�13 cm ' 1/(2mπ ) (17.27)

If the nucleon has a core and has charge λe, the form factor should approach λasymptotically. However, the data do not show such a tendency and can be inter-preted as that the charge distribution is very soft right to the core.

p n + π+ n p + π–

ρp(r) ρn(r)(a) (b) (c)

1

10 1

10 2

10 3

10 4

10Q2 (GeV)2

20 300

SLAC Experiment

GMμ

GMμ Q2

0.71 GeV21+

1⎛⎝

⎛⎝

2

Figure 17.3 (a) Form factor of the proton magnetic moment [364]. (b,c) Expected charge distri-butions of the proton and the neutron in early models.

Page 712: Yorikiyo Nagashima - Archive

17.4 Electron Proton Deep Inelastic Scattering 687

Before the appearance of the quark model, the nucleon was considered to emitand reabsorb pions

p ! n C πC, n ! p C π� (17.28)

In this image, the charge distribution is expected to be of the form depicted inFig. 17.3b,c. The proton charge distribution will be positive at the center and theneutron distribution is expected to be positive in the core and negative in the pe-riphery. The data do not support this concept. In the quark model, p D uud,n D udd. Since both charge and magnetic moment distribution are consideredto represent that of the constituent quarks, the similarity of G p

E , G pM and G n

M areeasily explained. The net charge of the neutron vanishes everywhere if the threeudd distributions are the same. As the three quarks are separate and do not makeany particular hard core if distributed smoothly, the exponential form and absenceof a hard core are also explained.

17.4Electron Proton Deep Inelastic Scattering

17.4.1Cross-Section Formula for Inelastic Scattering

In quasielastic scattering, where the final state is composed of a scattered electronand a recoiled particle with mass MX

e(k)C P(p )! e(k0)C X(PF ) generally MX ¤ Mp (17.29)

there is a one-to-one correspondence between the scattered electron angle θ andthe energy loss ν � ω�ω0 if the incident energy ω and Mx are fixed. Measurementof one variable determines the other, too (Problem 17.3).

Problem 17.3

Fixing the incident electron energy on a target at rest and setting W D MX , derivethe relation

ν � ω � ω0 DQ2

2MC

W 2 �M2

2M(17.30)

between the momentum transfer Q2 and scattered energy ω0. Since Q2 is a func-tion of ω, ω0 and cos θ , this equation means the scattering angle θ is uniquelydetermined once ω0 is known. Specially for elastic scattering (W D M ), show that

ν DQ2

2MD

ωω0

M(1� cos θ ) (17.31)

Assume the electron mass m D 0.

Page 713: Yorikiyo Nagashima - Archive

688 17 Hadron Structure

For inelastic scattering, W(D MX > M ) is arbitrary and takes continuous valueswithin the kinematically allowed region. In this region, the scattering angle θ andthe scattered energy ω0 are independent. If we observe only scattered electrons, thecross section has the generic form

d2σdΩ dω0 D σM

�W2(Q2, ν)C 2W1(Q2, ν) tan2

�θ2

��(17.32)

Here, W1, W2 are called structure functions, analogous to form factors in elasticscattering. Unlike the elastic case, they are functions of both Q2 and ν. Equation(17.32) can be derived from Eq. (17.21). The transition amplitude for inelastic scat-tering can be written as

i T f i D i(2π)4δ(k C p � k0 � PF ) j e μe2

q2 Jpμ (17.33a)

j e μ D u(k0)γμ u(k) , Jpμ D hX(PF )j JEM

μ(0)jp(p )i (17.33b)

Following the standard procedure, the cross section can be derived as

dσ D(4πα)2

4(k � p )L μν

1Q4 4πM W μν d3k0

(2π)32ω0 (17.34a)

L μν D12

Xspin

u(k0)γμ u(k)u(k)γν u(k0) D 2h

k0μ kν C k0

ν kμ C (q2/2)gμν

i(17.34b)

W μν �(2π)4

4πM

XF

δ4(p C q � PF )

241

2

Xspin

Jpμ Jp

ν �35 (17.34c)

As J μp is hermitian

W μν D W νμ � (17.35)

W μν is a Lorentz tensor. Moreover, it is a function of q and p only, because theproton spin average has been taken. Therefore it can be expanded by the availableLorentz tensors qμ qν , p μ p ν , p μ qν, p ν qμ, i�μν�σ p�qσ . The current conservation@μ J μ � qμ Jp

μ D 0 constrains

qμ W μν D qν W μν D 0 (17.36)

Three Lorentz tensors satisfy this condition:��gμν C

qμ qν

q2

�,

�p μ �

(p � q)qμ

q2

��p ν �

(p � q)qν

q2

�,

i�μν�σ p� qσ

(17.37)

The third term is an antisymmetric tensor and has opposite parity relative to thefirst two. Since the electromagnetic interaction conserves parity, this term does not

Page 714: Yorikiyo Nagashima - Archive

17.4 Electron Proton Deep Inelastic Scattering 689

enter. In fact, by multiplying by L μν, it is easily proved that their product vanishes.Therefore, W μν can be expressed as

W μν D

��gμν C

qμ qν

q2

�W1(Q2, ν)

C

�p μ �

(p � q)qμ

q2

��p ν �

(p � q)qν

q2

�W2(Q2, ν)

M2

(17.38)

which defines W1 and W2. Using

qμ L μν D qν L μν D 0 (17.39)

we can show

L μν W μν D 4ωω0�

W2 cos2 θ2C 2W1 sin2 θ

2

�(17.40)

From this we finally obtain the desired formula

d2σdΩ dω0 D

4α2

Q4 ω0 2�

W2(Q2, ν) cos2�

θ2

�C 2W1(Q2, ν) sin2

�θ2

��

D σM

�W2 C 2W1 tan2

�θ2

��(17.41)

Problem 17.4

Prove Eq. (17.41).

Problem 17.5

Assuming the proton to be a structureless Dirac particle, and setting the final stateof the proton as p 0 (i.e. elastic scattering), calculate the structure function by in-serting J μ

p D eu(p 0)γ μ u(p ) in Eq. (17.34c), or by comparing Eq. (17.12a) andEq. (17.41), and show that

W2(Q2, ν) D δ�

ν �Q2

2M

�(17.42a)

W1(Q2, ν) DQ2

4M2 δ�

ν �Q2

2M

�(17.42b)

Hint: Use the following formula:Zdω0 δ

�ν �

Q2

2M

�D

ω0

ω(17.43)

Page 715: Yorikiyo Nagashima - Archive

690 17 Hadron Structure

17.4.2Bjorken Scaling

In 1969, the SLAC-MIT group carried out an experiment in which an electron beamwas incident on a proton target and the intensity of the scattered electron was mea-sured. Figure 17.4 [149] shows some of the data: a differential cross section as afunction of final hadronic mass W for a fixed incident energy ω of 10 GeV and thescattering angle θ D 6ı. Here, one sees a sharp peak (scaled down by 1/8.5) atthe left side corresponding to elastic scattering, and a few other quasielastic peaks,corresponding to resonance production

e C p ! e C N� M� > M (17.44)

Below the peaks one sees a broad hill-like bump, which constitutes inelastic scat-tering, where multiple particles are produced with invariant mass W continuouslychanging in the range allowed kinematically. One also refers to this region as thecontinuum.

Data similar to in Fig. 17.4 sliced at some narrow range of W band and plottedas a function of Q2 are shown in Fig. 17.5 [76]. One sees that the cross sectionof elastic scattering falls off rapidly as a function of Q2, which is a manifestationof the target being an extended object, but those of inelastic scattering at large W(loosely dubbed deep inelastic scattering) show a relatively slow decrease.

Actually, such asymptotic behavior was expected by Bjorken on a theoretical ba-sis [62]. A more quantitative description of the asymptotic behavior at large hadron

W (GeV)0.8 1.2 1.6 2.0 2.4 2.8 3.2 3.6

d2 σ/d

ΩdW

1.5

0.5

1.0

2.0

0

E 10.0 GeVθ 6

Elastic Scattering (x 1/8.5)

Quasi-elastic Scattering

Continuum

Figure 17.4 d2 σ/dΩ dω0 (E2 D 10 GeV,θ D 6ı), is plotted as a function ofW, the invariant mass of the hadronic fi-nal states [69, 149, 227]. The elastic peak(W D M) is scaled down by 1/8.5. Its radia-

tive tail is also drawn. The broad bump underthe dashed line labeled continuum indicatesinelastic scattering. Dashed line is drawn onlyto guide eyes.

Page 716: Yorikiyo Nagashima - Archive

17.4 Electron Proton Deep Inelastic Scattering 691

0 1 2 3 4 5 6 7q2 (GeV/c2)

1

10 1

10 2

10 3

10 4

(d2 σ

/dΩ

dE') /σ

MO

TT (G

eV-1

)

ELASTIC SCATTERING

xW 2 GeVW 3 GeVW 3.5 GeV

θ=10

Figure 17.5 (d2 σ/dΩ dω0 )/σM in GeV�1 [76].

mass W is expressed

in the limit ω, Q2, ν !1

and for 0 < x �Q2

2M ν< 1

(17.45)

as

ν W2(Q2, ν)! F2(x ) ,

M W1(Q2, ν)! F1(x ) 1)(17.46)

Equation (17.45) is often referred to as the scaling limit. Equations (17.46) meanthat in the limit, the structure functions are not functions of two independent vari-ables but depend only on their ratio, which is referred to as Bjorken scaling. InFigure 17.6, M W1 and νW2 are plotted as functions of δ D (W 2 C Q2)/Q2 D

1/x C M2/Q2, which is � 1/x in the scaling limit. The variable δ is used to takeaccount of the mass effect in the low-energy region.

They do not depend on Q2, as is shown in Fig. 17.7, a clear manifestation ofBjorken scaling. Bjorken’s reasoning for the scaling is as follows. Since M W1, νW2

are dimensionless quantities, physically fixed variables such as the proton mass

1) These F1, F2 are not to be confused with the form factors that appeared in Eq. (17.13).

Page 717: Yorikiyo Nagashima - Archive

692 17 Hadron Structure

Figure 17.6 Bjorken scaling. (a) Structure functions plotted as a function ofδ (denoted as ω) D (W 2 C Q2)/Q2 D 1/x C M2/Q2 [148, 149]. They do not depend onQ2, as is shown in Fig. 17.7. The plot consists of data with W > 1.8 GeV.

0 2 4 6 8Q 2 (GeV/c2)

ν

ω = 4

W2

0.5

0.1

0.2

0.3

0.4

0

+x

6 1810 26

Figure 17.7 Bjorken scaling. νW2 does not depend on Q2. The plot consists of data with W >

1.8 GeV.

Page 718: Yorikiyo Nagashima - Archive

17.5 Parton Model 693

can be neglected in the scaling limit and they should depend only on an availabledimensionless variable, which is x.

We will explain the physical meaning of the Bjorken scaling in terms of partonsin the next section. In summary, the soft structure inferred from observation ofelastic scattering had to be reconsidered in light of the deep inelastic scatteringdata.

17.5Parton Model

The parton model was proposed by Feynman to analyze high-energy hadron colli-sions [142, 143] and applied by Bjorken and Paschos to e–p inelastic scattering [65](see also [394]).

17.5.1Impulse Approximation

Let us assume that the nucleon is a collection of point-like particles (partons) andthat the scattering cross section can be approximated as the sum of the scatter-ing by partons (impulse approximation). Denoting the partons by qi , the impulseapproximation means

dσdΩ

ˇˇ

eNDX

i

dσdΩ

ˇˇ

eqi

(17.47)

Summation of the individual scattering events requires that each scattering is in-dependent of and does not affect the others. For this assumption to be valid, thetime τint necessary to finish the interactions must be sufficiently short comparedto the time τfree, which is a measure of each parton flying freely

τint � τfree (17.48)

As the parton is inside the nucleon, it is bound to other partons. The motion of aparton induces a drag effect on other partons due to forces working between them.The drag time would be of the order of the inverse of the binding energy

τfree �1

binding energy'

1Efree � Etot

>1M

(17.49)

The magnitude of the binding energy is expected to be at most the nucleon mass(M). It is not small, but fixed, and does not increase with the nucleon energy E Dp

p 2 C M2. Viewed from the incident electron, the high-energy nucleon looks likea flat pancake (Fig. 17.8a) because of Lorentz contraction. The contraction factor isgiven by 1/γ ' M/E .

The interaction time is considered to be the time that the electron spends in thenucleon or that it takes to pass through the nucleon. As the nucleon has the size

Page 719: Yorikiyo Nagashima - Archive

694 17 Hadron Structure

Electron

Proton

Parton k k’

q k - k’

q + xp

xpPF

pHADRONS

HADRONS

Proton

Electron

(a) (b)

Figure 17.8 (a) Because of Lorentz contrac-tion, the target proton looks like a pancakeviewed from the incoming electron. The inter-action time, which is the time for the electronto pass through the nucleon, is much short-er than the partons’ mutual interaction time(time to move transverse to the electron’sdirection of motion). Partons are free in this

approximation. (b) Parton kinematics. Only aparton with fractional momentum x of the par-ent proton interacts with the electron. Otherpartons are spectators and pass through with-out interacting. Partons are considered freebefore and after scattering. When they fly awaybeyond the hadron size, they transform them-selves into many hadrons (hadronization).

� 1/mπ in its rest frame, the interaction time is given by

τint '1

ME

(17.50)

Consequently, for high energies where

E M2/mπ � 6.3 GeV! τint � τfree (17.51)

the condition Eq. (17.48) for the impulse approximation will be satisfied.Next we rewrite the condition in terms of the parton variables. Because of Lorentz

contraction, the longitudinal momentum pk of the nucleon is much larger than thetransverse momentum p? of the parton inside the nucleon. Therefore, for the i-th parton’s longitudinal momentum we have p i k pi? ' hp?i, where hp?iis the average transverse momentum of the parton inside the nucleon. Its value(� 330 MeV/c [335]) can be estimated from transverse momentum distributions ofthe seconday particles in p–p collisions (see Fig. 13.30). When the electron collideswith the proton, the latter is excited to a virtual state with mass W. The inverse ofthe energy transfer to the nucleon is a measure of the interaction time and is givenby

1τint� q0 �

qp 2 C W 2 �

qp 2 C M2 �

W 2 �M2

2pD

2M ν � Q2

2p(17.52)

where we have used Eq. (17.30) in deriving the last equality. On the other hand,if we set the longitudinal parton momentum as fraction x of the parent nucleon

Page 720: Yorikiyo Nagashima - Archive

17.5 Parton Model 695

(pi k D xi pk), the time for which the nucleon can be separated to free partonswould be given by

1τfree

� ΔE Dq

p 2 C M2

�q(x1 p1 k)2 C p 2

1? C μ21 C

Xi�2

q(xi p i k)2 C p 2

i?C μ2i

(17.53)

where μ i is the mass of the i-th parton and we have attached subscript 1 to theparton that interacts with the electron. If the nucleon has infinite energy [or ap-proximately higher energy than that satisfying Eq. (17.51)], we may consider thatall the partons move in the same direction as the parent nucleon (i.e. xi > 0). Sincethe sum of momenta of all the partons has to be equal to that of the nucleon, weuse

0 < xi < 1 ,X

xi D 1q(xi p i k)2 C p 2

i?C μ2i � xi pk C

μ2i C p 2

i?2xi p

(17.54)

and calculate ΔE , which gives

ΔE �1

2p

"M2 �

Xi

μ2

i C p 2i?

2xi

!#(17.55)

The above equation means that the impulse approximation is not valid for xi ! 0,xi ! 1. A measure for the approximation to be valid is given by

1 xi μ/p , p?/p (17.56)

The parton has energy-momentum given by

piμ D

q(xi pk)2 C p 2

i?C μ2i , pi?, xi pk

�(17.57)

but within the above conditions Eq. (17.57) can be approximated by p μi ' xi p μ.

Using the condition that the parton is free before and after the scattering (p 2i D μ2

i ,see Fig. 17.8b)

μ21 D (x1 p )2 D (q C x1 p )2 (17.58a)

) x � xi D �q2

2(p � q)D

Q2

2M ν(17.58b)

Page 721: Yorikiyo Nagashima - Archive

696 17 Hadron Structure

Q2

0 0.2 0.4 0.6 0.8 1.0

x 0

x 0.3

x 0.5

x 1.0

W Mp (Elastic Scattering)W MΔ1W MΔ2

2Μν

νE1

y Ε1 Ε3E1

x Q2

2Μν

Figure 17.9 Kinematics of deep inelastic scat-tering. The gray triangular area indicates theallowed region. Elastic scattering is represent-ed by the line with W D Mp (x D 1) andquasielastic by W D MΔ1 , MΔ2 . The darker

gray half-oval corresponds to the deep inelas-tic scattering (DIS) region. The higher theenergy, the closer the DIS region comes to theboundary.

In summary, conditions for the impulse approximation to hold are

E M , Q2 M2 , 2M ν M2 (17.59a)

0 < x DQ2

2M ν< 1 (17.59b)

These are the same conditions as for Bjorken scaling to hold. The scattering thatsatisfies the conditions is called deep inelastic scattering. The coordinate framesatisfying the conditions is called the infinite momentum (p1) frame. Figure 17.9illustrates the allowed region for Q2, 2M ν, and how x , y , W are related to them.The variable y is introduced for later applications. According to the observed data,the scaling seems to hold early for Q & 1 GeV/c (Fig. 17.6b). We next investigatethe physical meaning of the structure functions.

17.5.2Electron–Parton Scattering

Scaling Variables As will be clarified later, the partons consist of quarks and glu-ons. The gluons are electrically neutral and do not contribute to electron–protonscattering. Here, we assume the parton is a quark. It is a point-like particle havingspin s D 1/2, mass μ and electric charge qparton D λq e. The scattering cross section

Page 722: Yorikiyo Nagashima - Archive

17.5 Parton Model 697

in the parton’s rest frame is given by making the changes M ! μ, α ! λq α inEq. (17.12)

dσdΩ

ˇˇ

eq,LABD

4(λq α)2

Q4

ω0 3

ωcos2

�θ2

��1C

Q2

2μ2 tan2�

θ2

��(17.60)

In reality, the parton is inside the nucleon with relative momentum given byEq. (17.59b). Therefore we need to define Lorentz-invariant variables and rewriteEq. (17.60) in terms of them. Let us first show that the variables so far treatedas those in the laboratory frame can, in fact, be considered as Lorentz-invariantvariables. Neglecting both proton and electron masses in the limit E M , thefollowing equalities hold in the LAB frame:

s � (k C p )2 ' 2(k � p ) D 2ωM (17.61a)

y �(p � q)(p � k)

Dω � ω0

ωD

νωD

2M νs

(17.61b)

x �Q2

2M νD

Q2

s yD

s(1 � y ) sin2(θ /2)M2 y

(17.61c)

sin2�

θ2

�D

M2x ys(1� y )

, cos2�

θ2

�D 1�

M2x ys(1� y )

' 1 (17.61d)

One sees that variables in the laboratory frame [ω, ν, sin2(θ /2), etc.] can be ex-pressed in terms of Lorentz-invariant variables (s, x , y ).

Problem 17.6

Confirm

0 <Q2

2M ν(D x ) < 1 , 0 <

2M νs

(D y ) < 1 (17.62)

Parton Distribution Function For elastic scattering in the laboratory frame (LAB),θ and ω0 are related as given in Eq. (17.31). Marking variables in the center of massframe (CM) with asterisk, we can rewrite y in CM as

y D(p � q)(p � k)

DE�(ω� � ω0 �)� (�k�) � (k�

� k0 �)E�ω� � (�k�

� k�)'

1 � cos θ �

2(17.63)

We obtain

dσd yD ω

dσdω0

ˇˇLABD

4πM2

s(1� y )2

dσdΩ

ˇˇLABD 4π

dσdΩ

ˇˇCM

(17.64)

The above equation holds for two-body scattering with target mass M. When weapply it to the electron–parton two-body system, we make the replacements M !μ, s ! sq D (k C pq)2. Noting that for elastic scattering

Q2 D 2μν D sq y (17.65)

Page 723: Yorikiyo Nagashima - Archive

698 17 Hadron Structure

we rewrite Eq. (17.60) to obtain

dσd y

ˇˇ

eqD

4πλ2q α2

Q4 sq

�(1� y )C

y 2

2

�(17.66)

As the parton’s momentum in the nucleon is given by x p , we have

sq D (k C x p )2 ' 2x (k � p ) ' x s (17.67)

Writing the i-th parton’s momentum distribution function in the nucleon as qi(x ),the probability that the momentum is in the interval from x to x C dx is given byqi(x )dx . Then the scattering cross section of the electron with the i-th parton inthe nucleon is given by

dσd yD

4πα2

Q4 s�

(1� y )Cy 2

2

�λ2

i x q i(x )dx (17.68)

Consequently, the electron–nucleon scattering cross section is obtained by sum-ming over all the partons and becomes

d2σdx d y

D4πα2

Q4 s�

(1� y )Cy 2

2

�Xi

λ2i x q i(x )dx (17.69)

Note that the cross section is factorized as a product of two functions each con-tining x or y only. We have derived the e–p scattering cross section as a sum ofe–q scatterings. Note, the partons are elastically scattered by the electron, but therecoil parton is not an observable by itself. The quarks, also known as partons,are confined, and when they come out of the nucleon they are observed as pionsand many other hadrons. However, characteristics of e–q elastic scattering can beobtained by observing the behavior of the electron only. Note that Eq. (17.69) is ex-pressed only in terms of electron variables, which can be observed experimentally.Rewriting the generic expression Eq. (17.41) for inelastic scattering in terms of xand y (Problem 17.7), we have

d2σdx d y

ˇˇinelD

4πα2

Q4 s�

ν W2(1� y )C 2x M W1

�y 2

2

��(17.70)

This equation becomes in the scaling limit

d2σdx d y

ˇˇinelD

4πα2

Q4 s�

F2(x )(1� y )C 2x F1(x )�

y 2

2

��(17.71)

Comparing Eq. (17.71) with Eq. (17.69), we see that in the quark–parton model thestructure function is given by

F2(x ) D xX

λ2i q i(x ) (17.72a)

F1(x ) D12

Xλ2

i q i (x ) (17.72b)

Page 724: Yorikiyo Nagashima - Archive

17.6 Scattering with Equivalent Photons 699

We have proved that by considering e–p inelastic scattering as a sum of e–q elasticscatterings, the structure functions M W1(Q2, ν) and ν W2(Q2, ν), both of which arefunctions of two independent variables, become, in the scaling limit, functions ofx D Q2/2M ν only. Bjorken scaling is evidence that the hadrons have substructureand are made of point-like particles. We have also succeeded in finding a physicalinterpretation of the structure functions in Bjorken scaling. They represent themomentum distribution of partons in the nucleon.

Problem 17.7

Show that Eq. (17.41) becomes Eq. (17.70) if one rewrites it in terms of x and y.

17.6Scattering with Equivalent Photons

In this section we try to interpret deep inelastic scattering in terms of equivalentphotons, those emitted by the electron. It offers a different perspective and helps togive an intuitive interpretation of the picture. Changing the way of expression doesnot necessarily increase information, but in some cases it is easier to interpret.Besides, the viewpoint presented in this section can be directly extended to QCDapplications.

17.6.1Transverse and Longitudinal Photons

The formalism we have developed so far is based on the assumption that one-pho-ton exchange gives a good approximation. Then what the nucleon sees is not theelectron but a photon. If we take the view that the nucleon interacts with the pho-ton, the electron is just a supplier of photons. The cross section can be consideredas a convolution of the photon–nucleon interaction cross section and the inten-sity of the photon provided by the electron integrated over the photon spectrum.The cross section formula expressed this way is called the Weizsäcker–Williamsapproximation [78, 96].

The interaction of the photon with a charged particle is given by

LINT D �QeZ

d4x A μ(x ) J μ(x ) (17.73)

the transition amplitude is given by

T f i D (2π)4δ4(q C p � PF )�μ(λ)hPF j J μ(0)jp i (17.74)

Page 725: Yorikiyo Nagashima - Archive

700 17 Hadron Structure

and the photon absorption cross section in the LAB frame can be written as

dσγ De2

4K M

XF

(2π)4δ4(� C p � PF )

�12

Xspin,λ

hPF j�μ(λ) J μjp ihPF j�ν(λ) J νjp i�

D4π2α

K

�μ(λ)�ν(λ)� W μν

(17.75)

where W μν is the same structure function for the e–p inelastic scattering as givenby Eq. (17.34c). K is the photon energy in the LAB frame, M the target mass and�μ(λ) the polarization vector of helicity λ.

There are two problems associated with the photon being virtual. One is the ap-pearance of an additional (λ D 0) polarization. The virtual photon is like a massivevector particle but its four-momentum q is space-like (q2 < 0). The polarizationvectors have to be orthogonal to the photon’s energy-momentum vector q, namely,

q � �(λ) D qμ�μ(λ) D 0 λ D ˙, 0 (17.76)

If the particle is a massive free vector particle, q is time-like (q2 > 0) and all threepolarization vectors satisfying Eq. (17.76) are space-like. But the virtual photon isspace-like, and one of the polarization vectors must be time-like. Consequently, wedefine three polarization vectors as

�μ(˙) D �(0, 1,˙i, 0)/p

2 (17.77a)

�μ(0) D1p�q2

(q3, 0, 0, q0) D1p�q2

�qν2 � q2, 0, 0, ν

�(17.77b)

It is possible to make ν D 0 by choosing a frame referred to as the Breit frame,which we will discuss soon. In this frame

�μ(0) D (1, 0, 0, 0) (17.78)

For this reason, the photon with λ D 0 is sometimes called a scalar photon.2) Thepolarization vectors satisfy orthogonality and completeness conditions:

�(λ) � �(λ0) D �μ(λ)�μ(λ0) D (�1)λ δλλ0 (17.79a)

(�1)λC1�μ(λ)�ν(λ) D ��

gμν �qμ qν

q2

�(17.79b)

The metric is different from the ordinary definition of the completeness condition.The metric used here takes into account that �μ(0) is time-like [�(0) � �(0)� D C1].See also arguments in Appendix C.

2) Confusingly though, it is conventionally referred to as the longitudinal photon.

Page 726: Yorikiyo Nagashima - Archive

17.6 Scattering with Equivalent Photons 701

Problem 17.8

Show that the polarization vector of the scalar photon can be expressed as

�μ(0) D a�

p μ �(p � q)

q2 qμ�

, a D�

p 2 �(p � q)2

q2

��1/2

(17.80)

where p μ is the four-momentum of the target proton.

The second problem is the definition of the incident photon energy K. For thevirtual photon, any variable that converges to the real photon energy in the limitQ2 ! 0 can be a candidate. Here we adopt a definition of the photon energy interms of the flux Eq. (6.84) [168].3)

4K M jLAB D 4[(q � p )2 � q2 p 2]1/2 D 4M(ν2 C Q2)1/2

) K Dq

ν2 C Q2(17.82)

Using Eqs. (17.75), (17.77) and (17.79), we can express the cross sections for scat-tering by transverse and longitudinal photons as

σT �12

4π2αK

XλD˙1

�μ(λ)�ν(λ)� W μν D4π2α

KW1 (17.83a)

σL �4π2α

K�μ(0)�ν(0)� W μν D

4π2αK

WL (17.83b)

WL D

��1C

ν2

Q2

�W2 � W1

�(17.83c)

In the region where the scaling law holds

σT !4π2αK M

F1 (17.84a)

σL !4π2αK M

FL (17.84b)

FL D1

2x(F2 � 2x F1) (17.84c)

Problem 17.9

Derive Eq. (17.83).

3) Hand’s definition [194], which can be defined by observable variables of the final state only, isused frequently, too:

K D (W 2 � M2)/2M D ν � Q2/2M (17.81)

Page 727: Yorikiyo Nagashima - Archive

702 17 Hadron Structure

17.6.2Spin of the Target

Callan–Gross Relation As is clear from Eq. (17.72), when the target has spin s D1/2, the equality

F2 D 2x F1 (17.85)

holds, which is referred to as the Callan–Gross relation [85]. In this case, σL/σT D

0. On the other hand, if the target has spin s D 0, there is no magnetic scattering,which means F1 D 0 or σT/σL D 0. In other words, interpreting the reaction asthat by the photon, the target spin can be easily recognized.

Breit Frame An intuitive explanation can be given in the Breit frame (also calledthe brick wall frame). In this frame (Fig. 17.10), the electron, moving in the x–z plane, changes its momentum in the z-direction only, without changing px . Inother words, there is no energy transfer and it gives only recoil momentum tothe target. On the other hand, the target proton (or parton inside) comes in fromthe Cz direction, receives recoil momentum only and bounces back in the Czdirection. In this frame, both particles move as if they had bounced off a brick wall.

In terms of the virtual photon, it comes in from �z direction and is absorbed bythe proton. Since a transversely polarized photon gives spin component sz D 1 tothe target, a s D 0 target cannot absorb the photon and σT D 0. If the target hass D 1/2, spin transfer Δ sz D 1 causes spin flip. If the photon is longitudinally po-larized, spin transfer Δ sz D 0 may also be possible, allowing it to interact with bothphotons. But notice that the electromagnetic interaction is of vector type, whichkeeps the particles chirality invariant. This means the particle preserves its helicity(namely its spin has to be flipped) in the mass-zero limit and forces σL D 0. Exper-

Electron

Electron

Proton

p

p’

k’

kProton

p

p’

(a) (b)

xz

Figure 17.10 In the Breit frame, the electronand the proton recoil from each other withoutchanging energy and px but changing pz , as ifthey had bounced off a brick wall. The protonenters from theCz direction and recoils backin theCz direction, while the photon ejectedby the electron enters with transverse mo-mentum q, which is absorbed by the proton.

As the electron’s spin is 1/2 and the virtualphoton’s 1 (if transverse wave) or 0 (if longitu-dinal wave), angular momentum conservationdictates that only the transverse wave canbe absorbed by the target parton. ThereforeσL D 0. On the other hand, if the target par-ton has spin 0, it can absorb the longitudinalwave only.

Page 728: Yorikiyo Nagashima - Archive

17.6 Scattering with Equivalent Photons 703

BCDMSEMC

0.6

0.6 0.80.20 0.4

0.2

0

0.4

–0.6

–0.2

–0.4

x

R=σ L/σ

T

Figure 17.11 The Callan–Gross equality means R D σL/σT D 0. Observation shows it isgenerally true except in the region x . 0.1 [16, 59].

imentally, partons in the proton have dominant s D 1/2 components (Fig. 17.11)as expected.

Problem 17.10

When the target has mass μ or transverse momentum p 2?, show that

σL

σTD

4(hp 2?i C μ2)Q2

(17.86)

Hint: Calculate the cross section Eq. (17.83) in the Breit frame. In this framethe virtual photon has momentum only and no energy. The scalar polarization isexpressed in a simple form, too:

qμ D (0, 0, 0,q�q2) , �μ(0) D (1, 0, 0, 0) (17.87)

Note, if μ2 C p 2T � O(0.25 GeV2), then σL/σT ' O(1/Q2). If they have x depen-

dence, Fig. 17.11 can be explained.

17.6.3Photon Flux

Highlighting the role of the photon in the deep inelastic scattering, we also intro-duce the concept of the effective photon flux that the electron provides.

Page 729: Yorikiyo Nagashima - Archive

704 17 Hadron Structure

We can rewrite the cross section as

d2dσdΩ dω0 (e p ! e0 X ) D Γ (σT C �σL) (17.88a)

Γ DαK

2π2Q2

ω0

ω1

1 � �, � D

�1C 2

ν2 C Q2

Q2 tan2 θ2

��1

(17.88b)

Using d3k0 D dk0dk2T/π D ω(1 � y )d y dk2

T/π and Eqs. (17.61), the above expres-sion becomes in the infinite momentum frame

dσ(e p ! e0 X ) Dh

σT C �(y ) σL

iΓ dΩ dω0 p1 frame

������! σT dP(y , p 2T)d y

dP(y , p 2T)d y D

α2π

Pγ e(y )d p 2

T

p 2T

d y , p 2T D

�2k0 sin

θ2

�2

Pγ e(y ) D�

1C (1 � y )2

y

�(17.89)

One sees that the quantity dP(y , p 2T)d y can be interpreted as the photon flux pre-

pared by the electron. Equations (17.89) are called the Weizsäcker–Williams formu-la [78, 96, 381, 390].

Problem 17.11

Derive Eqs. (17.88).

Problem 17.12

Derive Eqs. (17.89).

The expression diverges for p 2T ! 0, which is the familiar Coulomb force singu-

larity at r D 0. Integration over p 2T can be carried out as follows. The denominator

p 2T is really Q2 in the original expression, which is expressed in the infinite mo-

mentum frame as

Q2 D �(k � k0)2 'p 2

T

1� yC

m2e y 2

1 � y(17.90)

Therefore, we can set the lower limit of p 2T integration to not 0 but m2

e y 2 if thedivergence is to be avoided. The upper limit of the integration is given by (p 2

T)max �

(k0θmax)2 � ω2(1� y )2θ 2max. Neglecting ln θ 2

max compared to ln(ω2/m2e ), we haveZ

d ln p 2T ' 2 ln (ω/me ) (17.91)

which gives the integral form

σ(eN ! e0N 0) Dαπ

lnωme

Z1C (1� y )2

yσ(γ N ! N 0I y ) d y (17.92)

Page 730: Yorikiyo Nagashima - Archive

17.7 How to Do Neutrino Experiments 705

Equation (17.92) is also referred to as the Weizsäcker–Williams formula in integralform.

The Weizsäcker–Williams formula considers electron–proton scattering as a two-step process: the electron–photon and photon–proton reactions. The electron’s roleis simply to provide the photon flux. The formula is valid only in the infinite mo-mentum frame but it becomes a very useful tool in QCD, where such frames areroutinely provided.

17.7How to Do Neutrino Experiments

Deep inelastic scattering by the electron clarified the parton structure of the nucle-on. We now want to do the same experiments with the neutrino to take advantageof its ability to distinguish the quark flavor. It took 25 years before the neutrino wasdetected after Pauli’s prediction in 1930 and another 30 years before high-intensi-ty neutrino beams could be used for systematic investigations. The neutrino is theonly particle that participates exclusively in the weak interaction.4) However, once itbecame possible to do experiments with intense neutrino beams, they provided anideal probe to investigate the characteristics of the weak interaction, which undernormal circumstances is hidden behind the strong and/or electromagnetic inter-actions. The neutrino is electrically neutral, and because of its weak reaction ratewith matter, neutrino reactions do not happen easily. To do neutrino experimentsby actively detecting it, construction of a large accelerator and commensurate de-velopment of technology were indispensable. We describe briefly the essence ofhow to do neutrino experiments.

17.7.1Neutrino Beams

High-energy ( 10 MeV) and high-intensity neutrino νμ(νμ) beams are mainlyobtained using

πC(π�)! μC(μ�)C νμ(νμ)

KC(K�)! μC(μ�)C νμ(νμ)(17.93)

To obtain ν e(ν e)

μC(μ�)! eC(e�)C ν e(ν e)C νμ(νμ)

KC(K�)! π0 C eC(e�)C ν e(ν e)(17.94)

are used. For low-energy ν e neutrinos (1–4 MeV), decays of fission material pro-duced in atomic reactors are also used.

4) Except for gravity, which affects everything that has energy. Gravity becomes important whencosmic neutrinos are discussed.

Page 731: Yorikiyo Nagashima - Archive

706 17 Hadron Structure

In early experiments before about 1970, the requirement of a higher intensitybeam was much stronger than its quality. For this purpose, it was necessary tofocus pions and kaons produced by the primary proton beam in the forward direc-tion. It is impossible to focus all of them but a horn nearly achieves it. For realisticexplanations we take examples from Fermilab and CERN in the 1970s, which sup-plied primary protons with energy � 400 GeV. The horn is shaped as shown inFig. 17.12a, with a cross-sectional view through its center shown in (b). When alarge current (say� 104 A) is supplied to the horn, a circular magnetic field outsideand surrounding the horn is generated. Such a large current can only be made bystoring charge in a condenser (capacitor) bank and discharging it instantaneous-ly by shorting the circuit. Because of this, the beam has to be condensed in time(fast extraction), which makes the operation of neutrino experiments incompatiblewith others, because most of them require slow extraction to use time coincidencetechniques to select reactions. In this sense, neutrino experiments are costly.

The target is placed at the mouth of the horn, where it is irradiated by the protonbeam. Produced secondary pions (and kaons) go forward, but those emitted at alarge angle enter the magnetic field and are bent forward. The horn is designed tofocus secondary particles emitted at almost any angle within a limit. The particlesthen go through a long (� 350 m) tunnel, where they decay to produce neutrinos.Pions that did not decay are stopped by a thick shield at the end of the tunnel. Sincehadrons interact strongly, a concrete or iron shield of a few meters is enough toabsorb all of them. But muons interact only electromagnetically and lose typicallyonly � 1.8 MeV/g. To filter muons of, for example, 200 GeV, it takes � 105 g D100 kg D 120 m of iron. If cost matters, earth may be substituted, but its density is

(a)

(c)(b)

1010

109

104

105

106

107

108

0 50 100 150 200 250 300

WBB

NBB

pπ, K 200 GeV/c

νν

ν(ν)

GeV

/1013

INC

IDEN

T PR

OTO

NS

Eν (GeV)

pproton 400 GeV/c

K decay

π decay

I

IB

I

B

Target

Proton I π μν

Beam

Figure 17.12 (a) Principle of horn focusing.When current I is supplied to the cylindricalhorn, a circular magnetic field is produced,which focuses one sign of charged pions inthe forward direction. (b) Cross-sectional view

of the horn. (c) WBB (wide band beam) andNBB (narrow band beam) spectra of the neu-trino beam. NBB is produced by monochro-matic secondary pions and kaons [352].

Page 732: Yorikiyo Nagashima - Archive

17.7 How to Do Neutrino Experiments 707

m 005m 053m 07

TARGET

2ndaryTransport Beam

Beam Monitor

PrimaryProton Beam

Ion Chambers

Decay Particles

Beam Stopper Neutrino Detector

Earth Shield(Muon Filter)

Beam Dump

Figure 17.13 Layout of the dichromatic neu-trino beam line at Fermilab circa 1970. Theprimary proton beam hits the target and pro-duces pions (and kaons). The pions are mademonochromatic by a beam focusing systemconsisting of dipole and quadrupole magnets,

then they are injected into the decay pipe,where they decay to muons and neutrinos.Undecayed pions are absorbed by a beamdump and muons are stopped by muon filtersmade of earth (or iron).

small (2–3 gr/cm�3), and the necessary length would be � 3 times, which causes10 times intensity loss [/ 1/(distance)2]. The neutrino beam obtained in this wayhas a wide spectrum and is called a wide band neutrino beam (WBB).

A typical spectrum is shown in Fig. 17.12c. Oppositely charged particles are bentin opposite directions and diverge, which can be used to separate ν form ν some-what, but particles in the forward direction inside the horn are not separated, whichis the reason for a large mixture of ν in ν. Figure 17.13 shows the layout of a Fermi-lab neutrino beam. A dichromatic beam is described below but the overall layoutis similar to that of a WBB beam, except that the horn is replaced by a set of dipoleand quadrupole magnets.

As the technology advanced, one wanted to have a beam of good quality with goodcontrol of energy and intensity. For this purpose narrow band beams (NBBs) beganto be used. The neutrino being neutral, its energy cannot be measured directly,but taking advantage of two-body decay [π(K ) ! μν] in producing it, one candetermine its energy kinematically. If we mark variables in the CM frame with *,the neutrino energy in the LAB frame is connected to those in the CM frame byLorentz transformation:

Eν LAB D γ (E�ν C p �

ν cos θ �) (17.95a)

p �ν D E�

ν D (M2 � m2μ)/2M (17.95b)

where M is the parent’s (π or K) mass and , γ are Lorentz factors that are ex-pressed in terms of the parent’s energy (E) and momentum (P) as

γ D E/M , γ D P/E (17.96)

As the parent is spinless, the decay distribution in the CM frame is isotropic. There-fore, the distribution of particles in the LAB frame is

dNND

14π

dΩ � D12

d cos θ � DdELAB

2γ E�ν� const. dELAB (17.97)

Page 733: Yorikiyo Nagashima - Archive

708 17 Hadron Structure

Namely, the energy spectrum of the neutrino is flat between the minimum and themaximum of its energy

min(Eν LAB) D E�ν γ (1� ) ' 0 (17.98a)

max(Eν LAB) D E�ν γ (1C ) ' 2γ

M2 � m2μ

2MD E

"1 �

m2μ

M2

#(17.98b)

D

(0.427E π decay

0.954E K decay(17.98c)

Examples of such flat spectra are shown in Fig. 17.12c. One sees that the maximumneutrino energy from K decay is about two times that from π decay. However, thepion production cross section is much higher than that of kaons. As the decay angleθν and the energy Eν of the neutrino in the LAB frame are related by (Fig. 17.14a)

Eν DM2 � m2

μ

2(E � P cos θν)(17.99)

if there is independent information about the neutrino decay angle θν , its energycan be determined. Using E M, cos θ ' 1� θ 2/2, we obtain

Eν ' Eν maxθ 2

0

θ 2 C θ 20

, θ0 D M/P (17.100)

The distribution curve is plotted in Fig. 17.14b. θν can be determined from thedistance of the interaction point at the detector from the beam center. Figure 17.14cshows a scatter plot of θν versus Eν . The two bands correspond to neutrinos fromK and π. In this way, at fixed angle one can obtain a dichromatic beam with twoenergy peaks. The low-lying flat spectra of Fig. 17.12c are integrated spectra of thedichromatic beam.

Recent Developments More recently, high intensity monochromatic beams ofvarying energy have become available [362]. This apparatus uses horns with a sec-ondary beam set at an off-axis angle from the primary beam, taking advantage ofthe differing spectra of the secondaries depending on the production angle. Theprimary beam intensity is also increased. The so-called superbeam uses a protonbeam having 1–4 MW power.

With future neutrino oscillation experiments in mind, the concepts of “neutri-no factory” and “beta beam” are being developed [42]. The former uses an intensetertiary muon beam that is phase rotated and cooled5) for better optical proper-tiess, accelerated, stored and allowed to decay. In this way 1020 muons/year areenvisaged. The beta beam accelerates -active unstable isotopes, stores them and

5) Phase rotation is an accelerator technique that uses a magnetic field to make the energy spreadnarrow at the expense of making time spread wide or vice versa. Cooling is another technique toreduce transverse divergence of the beam. Those techniques are necessary because the secondaryand tertiary beams spread widely both in energy and direction.

Page 734: Yorikiyo Nagashima - Archive

17.7 How to Do Neutrino Experiments 709

π μ

ν

θν

θ 0 θ ν

Εν

Εν = Μ2 m μ2

2(E Pcosθν)(a)

(b) (c)

250

200

150

100

50

00 0.5 1.51

Impact radius (meters)

Ener

gy (G

eV)

Figure 17.14 (a) If the parent particle ismonochromatic, the neutrino energy Eνand the decay angle θν are related. (b) WhenE � Mπ, K , Eν ' Eν,max � θ 2

0 /(θ 2 C θ 20 ).

(c) Scatter plot of measured neutrino ener-

gy (muon + hadron energy) versus detectorradius for a sample of events. The separatebands are due to neutrinos from pion andkaon decay [116].

lets them decay. Since decay has very low Q value in the isotope’s rest frame,neutrinos from accelerated isotopes have good quality. They are monochromatic,collimated and with virtually no background.

17.7.2Neutrino Detectors

Event rates of neutrino experiments are given by N D Nν σtotNtarget. Since σtot

is the subject of the investigation, it is beyond the control of the experimenters.However, the fact that the cross section increases in proportion to the neutrinoenergy (σtot / Eν) can be used to the experimenters’ advantage by raising theenergy. Making Nν and Ntarget large is where the experimenters’ skill comes in.

Page 735: Yorikiyo Nagashima - Archive

710 17 Hadron Structure

In neutrino experiments, one makes a gigantic target to increase Ntarget, but inorder not to lose generated events the target itself has to serve as a detector. Abubble chamber is an ideal target cum detector capable of reconstructing all thekinematic variables. Its operating principle was briefly described in Fig. 14.2. It canoften record a single event that definitively detects a new phenomenon, dependingon what one is pursuing. Good examples are the discovery of the Ω particle totest flavor S U(3) (Fig. 14.6) and a neutral current event to confirm the Glashow–Weinberg–Salam electroweak theory [199, 200].

For precision neutrino experiments, however, one wants to have a more massivetarget/detector to gain high statistics. The size of the bubble chamber is inherentlyconstrained by its use in a magnetic field. Gargamelle and BEBC (Big EuropeanBubble Chamber) at CERN and the 15 foot bubble chamber at Fermilab are madewith the highest technology available using superconducting magnets and yet lim-ited to � 10 tons at most because of the mentioned constraint [1]. More suitable isan electronic counter detector, the mass of which can be made arbitrarily large togain high statistics.

Generated particles in the neutrino interactions other than e and μ are almost allpions and nucleons, which are hadronized remnants of a quark jet (a phenomenonin which many hadrons are ejected in the same direction). Detailed identificationof each hadron’s motion is not so important as the collective behavior of the jet.The bubble chamber is overqualified for such a purpose. Eliminating unnecessaryinformation drastically and concentrating on detecting the leptons and the total en-ergy of the hadrons, the detector structure can be simpler but massive. Figure 17.15shows a typical detector, that of the CDHS (CERN–Dortmund–Heidelberg–Saclay)group at CERN [59, 116]. Basically, they are made of iron modules (toroidal andmagnetized) with drift chambers and scintillation counters interspersed amongthe modules to measure muon tracks and hadronic energy. The whole detector isdivided into 19 supermodules and each supermodule is further subdivided into 5 or10 cm thick iron plates. The combination of iron/scintillator modules constitutes asampling calorimeter for hadrons. The total weight is 1235 ton, and yet the detec-tor is electronics personified, equipped with a great number of readout modules.Information that the detector can supply is:

1. Coordinates of the interaction point: from the transverse position the energy(ω) of the incoming neutrino can be determined using Eq. (17.100).

2. Scattering angle θ of the muon and its energy (ω0): the energy (momentum tobe exact) is determined by the curvature of the muon tracks going through themagnetized iron.

3. Hadronic total energy (Ehad): measured by the hadron calorimeter.

Page 736: Yorikiyo Nagashima - Archive

17.7 How to Do Neutrino Experiments 711

Figure 17.15 (a) View along the 19 mod-ules of the CDHS electronic detector at SPS.The black lightguides and phototubes, whichare used to measure the hadron energy, canbe seen sticking out of the magnetized ironmodules. The hexagonal structures are thedrift chambers that measure the muon tra-jectories [352]. PHOTO CERN. (b) Computerreconstruction of a charged-current neutrinointeraction in the detector. The neutrino en-

ters from the left. The top row gives the pulseheight in the scintillator planes along the de-tector. This is a measure of energy depositedto determine the total hadron energy. The pan-els below show the measured coordinates inthe three drift-chamber projections togetherwith the fit curve for the muon reconstruc-tion. The curvature tells the momentum of themuon [3].

From the above information, s, x , y of charged current interactions (νμ C N !μ C hadrons) can be determined as well as a self-consistency check made (Ehad D

ω � ω0).6)

CDHS used thick iron to gain mass, which is good for detecting muons andhadronic showers, the former because of its long range and the latter because ofthe collective behavior, but then the electron, which makes a short electromagneticshower, is absorbed by the iron and undetectable. Charged current interactionsinduced by , -νecannot be distinguished from neutral current interactions, whichmakes active detection of, say, the νμ ! ν e transition impossible, an importantphenomenon known as neutrino oscillation. In this case only the disappearance

6) Neutral current phenomena (νμ C N ! νμ C hadrons), where no muons are observed, can bemeasured also, but in this case the lepton (neutrino) information is missing and no x informationis obtained.

Page 737: Yorikiyo Nagashima - Archive

712 17 Hadron Structure

of the νμ can be measured by comparing fluxes at different places, which is muchmore difficult. If one wants to identify electrons and reconstruct their kinematics,one needs fine-grained sampling, which uses more readouts and reduces the targetdensity, hence the total mass. This type would be the next generation of detectors.

17.8ν–p Deep Inelastic Scattering

17.8.1Cross Sections and Structure Functions

Feynman diagrams describing neutrino inelastic scattering are shown inFig. 17.16b. In the low-energy region that we are dealing with (Eν < 100 GeV),the W mass can be set as infinity and the interaction is well described by the four-Fermi Lagrangian. But to make the comparison with e–p scattering clear we drawthe W line in the diagram. Both processes are considered as two-body scattering ofquarks by leptons in the parton model. Setting the electron and neutrino kinemat-ics identical and fixing the incoming energy, we see their scattering is a function ofx and y only. If the scaling rule holds, x represents the parton’s initial momentum,the distribution of which is given by q(x ) irrespective of incoming particle species.Denoting leptonic and hadronic weak (charged) current as l μ and J μ , respectively,we can express the Lagrangian LF as

�LF DGFp

2(l�μ JC

μ C lCμ J�μ )

l�μ D μγ μ(1� γ 5)ν , lCμ D νγ μ(1� γ 5)μ D (l�μ)†

JCμ D uγμ(1 � γ 5)d , J�

μ D dγμ(1 � γ 5)u D ( JCμ)†

(17.101)

The first term in the first line contributes to ν and the second to ν scattering. Defin-ing structure functions for ν, ν separately but in a similar form to e–p scattering,

k

k’

e

e

q

γ

k

k’

νμ

μ

N

q

Wxp

xp+q xp+q

xpp

p

pF pF

Np

(a) (b)

Figure 17.16 (a) e–p inelastic scattering. (b) ν–N inelastic scattering in the quark model.

Page 738: Yorikiyo Nagashima - Archive

17.8 ν–p Deep Inelastic Scattering 713

we have

W ν,ν�σ D

(2π)4

4πM

XF

δ4(p C q � PF )�

12

�Xspin

hp j J�� jPF ihPF j J˙

σ jp i

D

��g�σ C

q�qσ

q2

�W1(Q2, ν)

C

�p� �

(p � q)q�

q2

��pσ �

(p � q)qσ

q2

�W2(Q2, ν)

M2

� i��σα p α q W3(Q2, ν)2M2

C [� (q� qσ)]C [� (p�qσ C q� pσ)]C [� i(p�qσ � q� pσ)]

(17.102)

where PF is sum of momenta of final particles. The weak current contains the axialcurrent, which is not conserved. The constraint due to the current conservation isabsent and W μν cannot be simplified like the electromagnetic current. However,the last term in Eq. (17.102) breaks T invariance, and the three terms in the lastline are proportional to lepton mass (divided by incident neutrino energy) whenmultiplied by the lepton current and can be neglected compared to the first threeterms. The third term is a new term, which is introduced by parity violation. Thecross section can be calculated to give

d2σν,ν

dΩ dω0 D(GF/p

2)2

4(k � p )4K�σ

L � 4πM W ν,ν�σ

d3k0

(2π)32ω0

K�σL D

14

Xspin

u(k0)γ�(1� γ 5)u(k)u(k)γ σ(1� γ 5)u(k0) (17.103a)

D 2h

k0 � k σ C k0 σ k� � (k � k0)g�σ C i��σαkα k0

i(17.103b)

The neutrino being left-handed, a factor 1/2 for spin average is not necessary. Cal-culating Eq. (17.103), we obtain cross sections for ν, ν–N scattering:

d2σν,ν

dΩ dω0 DG2

F

2π2 ω0 2�

2W ν,ν1 (Q2, ν) sin2 θ

2C W ν,u

2 (Q2, ν) cos2 θ2

�W ν,ν3 (Q2, ν)

ω C ω0

Msin2 θ

2

� (17.104)

Changing variables to x , y , we find the structure constants become in the scalinglimit

M W1(Q2, ν)! F1(x ) , νW2(Q2, ν)! F2(x ) ,

ν W3(Q2, ν)! �F3(x ) (17.105)

The minus sign in front of F3 is for convenience to make F3 > 0. The cross section

Page 739: Yorikiyo Nagashima - Archive

714 17 Hadron Structure

is expressed as

d2σν,ν

dx d yD

G2F

2πs�

(1� y )F ν,ν2 (x )C

�y 2

2

�2x F ν,ν

1 (x )˙ y

1�y2

�x F ν,ν

3 (x )�

(17.106)

Problem 17.13

Derive Eq. (17.104).

As the charge symmetry operation (e�i π Iy ) changes p ! �n and J˙ ! J�,the following equalities hold:

F ν pi D F νn

i , F ν pi D F νn

i , i D 1, 2, 3 (17.107)

Then for an isoscalar target (Np D Nn)

F νi D

�F ν p

i C F νni

�/2 D

�F ν p

i C F νni

�/2 D F ν

i (17.108)

and the difference between σν and σν is only the sign of F3. In conclusion, theν, ν–N cross section on an isoscalar target becomes

d2σν,ν

dx d yD

G2F

2πs�

(1� y )F ν2 (x )C

�y 2

2

�2x F ν

1 (x )˙ y

1 �y2

�x F ν

3 (x )�

(17.109)

Comparing with Eq. (17.70), we see that, apart from the F3 term, the neutrino crosssection can be formally obtained from the e–p cross section by making the change

4πα2

Q4 !G2

F

2π(17.110)

This is natural if we consider the weak Lagrangian and the equivalent second-orderelectromagnetic Lagrangian. If the parton spin is 1/2 and the Callan–Gross equality(2x F1 D F2) holds, Eq. (17.109) can be rewritten as

d2σν

dx d yD

G2F

2πs�

(F2 C x F3)2

C(F2 � x F3)

2(1� y )2

�(17.111a)

d2σν

dx d yD

G2F

2πs�

(F2 C x F3)2

(1� y )2 C(F2 � x F3)

2

�(17.111b)

Note, the structure functions Fi in the ν–N cross section and those in the e–pcross sections differ in general. However, if the parton model is right, the structurefunctions are the partons’ distribution functions in the nucleon and they should berelated simply. Let us see how they are related in the quark model.

Page 740: Yorikiyo Nagashima - Archive

17.8 ν–p Deep Inelastic Scattering 715

17.8.2ν, ν–q Scattering

To calculate ν (ν)–q scattering cross sections, we make maximum use of the he-licity-tagged e–μ scattering cross section we derived in Chap. 7. As the weak La-grangian is given by

�LW D (GF/p

2)�uγμ(1 � γ 5)d

�μγ μ(1� γ 5)ν

D

4GFp

2(uLγμ dL)(μL γ μ νL)

(17.112)

the expressions for the e–μ scattering cross section decomposed in terms of helicity[Eqs. (7.31)–(7.33)] are directly applicable. We substitute

e2

t! 4

GFp

2(17.113)

in the equations for the e–μ cross sections

dσLL

ˇˇCMD

α2

t2 s ,dσLR

ˇˇCMD

α2

t2 s�

1C cos θ �

2

�2

(17.114)

and use y D (1�cos θ �)/2. Since the difference between the neutrino and antineu-trino appears only in its helicity as far as the cross section is concerned, conversionfrom eμ to νq can be easily made:

dσd yD

G2F

πs W (νq, νq scattering) (17.115a)

dσd yD

G2F

πs(1� y )2 W (νq, νq scattering) (17.115b)

Note that we do not take the spin average for neutrino scattering.When we deal with quarks in the nucleon, we replace s by x s and multiply by the

distribution function qi(x ), just as we did in the e–p inelastic scattering. As thereare also antiquarks in the nucleon, whose distribution we denote by qi (x ), we have

d2 σν

dx d yD

G2F

2πsX

2x�qi(x )C qi (x )(1� y )2 (17.116a)

d2 σν

dx d yD

G2F

2πsX

2x�qi(x )(1� y )2 C qi (x )

(17.116b)

Comparing with Eq. (17.111), we obtain

F ν2 D

X2x [qi (x )C qi (x )] (17.117a)

x F ν3 D

X2x [qi (x ) � qi (x )] (17.117b)

Page 741: Yorikiyo Nagashima - Archive

716 17 Hadron Structure

17.8.3Valence Quarks and Sea Quarks

There are two kinds of quarks in the nucleon, valence quarks, which define thequantum number of the nucleon (uud in the proton or udd in the neutron), andsea quarks, whose quantum numbers are the same as those of the vacuum. Asthe sea quarks consist of q i qi and for them qi(x ) D qi (x ), Eq. (17.117) meansF2 represents all the quarks and antiquarks and F3 the valence quarks only. If wedenote the distribution functions of u, d, s quarks in the proton as u(x ), d(x ), s(x ),charge symmetry requires those in the neutron to be d(x ), u(x ), s(x ). If we neglects quarks in the nucleon and c quark production, charge and lepton number conser-vation requires that reactions between neutrinos and quarks are limited to

νμ C d ! μ� C u , νμ C u! μ� C d (17.118a)

νμ C u! μC C d , νμ C d ! μC C u (17.118b)

From this we can write the structure functions of the proton and neutron as

F ν p2 (x ) D 2x [d(x )C u(x )] (17.119a)

F νn2 (x ) D 2x [u(x )C d(x )] (17.119b)

Therefore for isoscalar targets

F ν2 D (F ν p

2 C F νn2 )/2 D x [uC d C uC d] (17.120a)

x F ν3 D x (F ν p

3 C F νn3 )/2 D x [uC d � u � d] � x [uv C dv] (17.120b)

We have marked the distribution function of valence quarks with the subscript v.Equation (17.116) is rewritten as

d2σν

dx d yD

G2F

2πsx [q(x )C q(x )(1� y )2] (17.121a)

d2σν

dx d yD

G2F

2πsx [q(x )(1� y )2 C q(x )] (17.121b)

But this time q(x ) D u(x )C d(x ), q(x ) D u(x )C d(x ). For e–N scattering the crosssection is proportional to the charge squared and we obtain from Eq. (17.72) thefollowing expressions:

F e p2 D x

�49

(uC u)C19

(d C d)�

(17.122a)

F en2 D x

�49

(d C d)C19

(uC u)�

(17.122b)

Therefore for an isoscalar target

F e2 D

518

xh

uC d C uC diD

518

x�q(x )C q(x )

(17.123)

Page 742: Yorikiyo Nagashima - Archive

17.8 ν–p Deep Inelastic Scattering 717

Therefore, if the quark model is right, the structure function for the isoscalar targethas to satisfy

F ν2 D

185

F e2 (17.124)

17.8.4Comparisons with Experimental Data

Total Cross Section Integrating Eq. (17.121) over x and y

σνtot D σ0(Q C Q/3) (17.125a)

σνtot D σ0(Q/3C Q) (17.125b)

σ0 DG2

F

2πs , Q D

Zdx x q(x ) , Q D

Zdx x q(x ) (17.125c)

This means that as a result of Bjorken scaling, the total cross section of the neutrinoincreases in proportion to the incident neutrino energy

σν,νtot / s D 2M ω (17.126)

The compilation of neutrino total cross section given in Fig. 17.17 confirms theabove conclusion well.

If the rule σtot / ω breaks down, it may be due to either 1) the effect of theW mass or a higher order effect of the weak interaction or 2) the parton model isbad (needs QCD correction). To date, the slope of the neutrino cross section forω < 350 GeV is constant and no sign of breakdown is seen.

Antiquarks in the Nucleon If we set D Q/(Q C Q), which is the fraction of theantiquark’s momentum, can be calculated from the total cross-section ratio:

R �σν

tot

σνtotD

QC 3Q

3Q C QD

σνtot/E

σνtot/E

D 0.5 (17.127)

which gives D 0.175.7)

dσ/d y Integrating Eq. (17.121) over x, we obtain the y distribution

dσν

d yD σ0[Q C Q(1� y )2] (17.128a)

dσν

d yD σ0[Q(1� y )2 C Q] (17.128b)

7) Rigorously, the distribution functions are not functions of x only but contain a Q2-dependentcorrection (QCD effect), which makes energy dependent. We ignore its effect here.

Page 743: Yorikiyo Nagashima - Archive

718 17 Hadron Structure

In the limit Q Q,

dσν

d y� 1 ,

dσν

d y� (1� y )2 (17.129)

Remembering that the cross sections of ν, ν with a point particle are given byEq. (17.115) and has the shapes dσ/d y (νq) � 1, sσ/d y (νq) � (1 � y )2, we seethat the cross sections of ν with nucleons directly reflects those of point particles.We present actual data [3] in Fig. 17.18. The assumption of no antiquarks in thenucleon Eq. (17.129) is almost valid, but the effect of a small mixture of antiquarksis seen as a deviation at y 1.

Electric Charge of the Quark According to the quark/parton model, the structurefunctions represent momentum distributions in the nucleon. They are given byEqs. (17.120) and (17.123) and represent the same function apart from their coeffi-cients. Thier ratio should be 5/18 if the quark model is right. Figure 17.18b showsa comparison of neutrino data with e–p and μ–p data [3]. The agreement with pre-dictions is excellent and supports the idea of the parton being the quark.

Later observations on various processes, including the R D σ(e�eC ! allhadrons)/σ(e�eC ! μ�μC) annihilation process and π0 ! 2γ , show goodagreement with the quark model (Sect. 14.3). It was also shown that no partonmodels with integer charge can explain all the data, but the quark model can give

Figure 17.17 Slope of the total cross section of the neutrino σtot/Eν [311].

Page 744: Yorikiyo Nagashima - Archive

17.8 ν–p Deep Inelastic Scattering 719

y0 0.2 0.4 0.6 0.8 1.0

1.0

0.5

0

dσ (ν

νΝ)

dydσ

dy

y=0

νν Eν 30 200 GeV

0 0.25 0.50 0.75 1.0

1.5

1.0

0.5

0

xF i

(x)

xF3

F2

qν_

10 < Q2 < 20 GeV2

CDHS

185 F2

(μN)EMC95 F2

(eD)SLAC MIT

(a) (b)

Figure 17.18 (a) Except for a small shift dueto a small mixture of antiquarks, the partonmodel formulae dσν /d y � 1, dσν /d y �(1 � y)2 agree well with the data. (b) Com-

parison of structure functions due to ν, e, μ.F2(x) D q(x)C q(x), x F3(x) D q(x) � q(x)and q(x) are plotted. �, ı, � are ν data,� D μ p , 4 D eD data [3].

satisfactory explanations for all of them [246]. Thus it was proved without doubtthat the parton is nothing but the quark itself.

17.8.5Sum Rules

Expressing the quantum numbers of the nucleon in terms of quarks, we have thefollowing sum rules. Here we include s quarks.

Baryon number: B D 1 DZ

dx�

13

�[u � uC d � d C s � s] (17.130a)

Isospin: I3 D 1/2 DZ

dx�

12

�[u � d � uC d] D

Zdx

�12

�[uv � dv]

(17.130b)

Electric charge:Q D 1 D

Zdx

��23

�(u � u)�

�13

�(d � d C s � s)

D

Zdx

��23

�uv �

�13

�(dv C s � s)

�(17.130c)

Strangeness: S D 0 D �Z

dx (s � s) (17.130d)

Page 745: Yorikiyo Nagashima - Archive

720 17 Hadron Structure

From Eq. (17.122) we also haveZdx

F e p2 � F en

2

xD

13

Zdx [(uv � dv)C 2(u � d)] (17.131)

Combining the S U(3) symmetry assumption u(x ) D d(x ) D s(x )(D s(x )) withEq. (17.130b), we have another equality

U �Z

dxF e p

2 � F en2

xD

13

(17.132)

They are not all independent but are related by Nishijima–Gell-Mann’s law.Equation (17.130a) is referred to as the Gross–Llewellyn Smith (GLS) sum

rule [188], Eq. (17.130c) as Adler’s sum rule [6] and Eq. (17.132) as Gottfried’s sumrule [182]. Verification of these sum rules is not easy, as accurate data in the regionx � 0 are necessary. Experimentally, data are available for 0.003 x 0.8 andB � 0.9˙ 0.2, U � 0.227˙ 0.016. We may say the GLS and Gottfried’s sum rulesare roughly valid.

Existence of the Gluon As the total momentum of the quarks and gluons shouldadd up to that of the proton,

1 DZ

dxhX

x qi (x )C x g(x )i

(17.133)

where g(x ) denotes the gluon momentum distribution function. However, sincethe gluon does not interact with the electron or the neutrino, the total momentumPq of quarks that appear in Eq. (17.133) is given by

Pq D

ZdxX

x qi(x ) D 1�Z

dx x g(x ) < 1 (17.134)

The total momentum of quarks and antiquarks is given by Q C Q. FromEq. (17.125) and the experimental value of the total cross section, we have thetotal momentum

QC Q D3π

4G2F ωM

�σν

tot C σνtot

�D 0.5˙ 0.04 (17.135)

Since sum of momenta obtained from e–p scattering must give an identical value,there are particles which carry the rest of the momentum but do not interact weaklyor electromagnetically. This was the first indication of the gluon’s existence. Thegluon carries half of the nucleon momentum, which means it exists in the nucleonnot instantaneously but permanently.

In summary, we have shown that the parton consists of the quark and the gluon.Direct evidence of the gluon was later presented by three-jet production in theelectron–positron annihilation process e�eC ! q C q C g [43, 291] (Fig. 17.19),and that it has spin 1 was also confirmed [363].

Page 746: Yorikiyo Nagashima - Archive

17.9 Parton Model in Hadron–Hadron Collisions 721

Figure 17.19 Direct production of the gluon. A three-jet event by the reaction e� C eC !qC q C g in the JADE experiment [291].

17.9Parton Model in Hadron–Hadron Collisions

17.9.1Drell–Yan Process

The Drell–Yan process is a way of producing a (large mass) muon pair in hadronicreactions. It is a good test bench for applying the parton model adopted in leptondeep inelastic scattering to hadronic processes. We have learned that the nucleon,a representative hadron, can be treated as a collection of partons in the infinite mo-mentum limit. If the high-energy hadron reactions, which produce a great numberof many other hadrons, can be interpreted as elastic or quasielastic parton scat-tering, interpretations of hadronic processes would be greatly simplified. If an an-tiparton in one of the hadrons annihilates with another parton in the other hadron,producing a lepton pair via electromagnetic interaction, it would provide an easilyrecognizable object, especially if it is hard, i.e. has a large invariant mass. Let usconsider the reaction

p ( or π, p )C p ! μ� C μC C X (17.136)

where X means anything. In the parton model, the process can be considered asthe one depicted in Fig. 17.20. In the parton model, a quark q having momentumOp1 D x1 P1 pair-annihilates with an antiquark q having Op2 D x2P2 to become a

muon pair. We denote the kinematic variables as

q( Op1)C q( Op2)! μ�(k1)C μC(k2) (17.137)

Writing the invariant mass and momentum of the muon pair as m , qμ and neglect-ing the parton mass, we have in the center of mass (CM) frame of the collidingprotons,

Page 747: Yorikiyo Nagashima - Archive

722 17 Hadron Structure

Proton

q q_

μ μ+

k1 k2

P1 P2

p1 x1P1 p2 x2P2

(anti ) Proton

Figure 17.20 In the parton model, p C p ! μ� μC C X is replaced by qq! γ�! μ� μC.

Op1 D x1P1 ' x1(E , 0, 0, P ) (17.138a)

Op2 D x2P2 ' x2(E , 0, 0,�P ) (17.138b)

q D x1P1 C x2 P2 D [(x1 C x2)E , 0, 0, (x1 � x2)P ] (17.138c)

where we have neglected the proton mass and set E ' P . From this we can definethe invariant mass of the muon pair

m2 D q2 D (x1C x2)2E 2 � (x1 � x2)2P2 ' 4x1x2 E 2 D x1x2 s � τ s (17.139)

Note, here, q2 > 0, namely it is time-like, in contrast to q2 < 0 in deep inelasticscattering. The cross section for qq ! μ�μC can be obtained from that of ee !μμ [Eq. (7.38)]

dσdΩ

ˇˇCM

(e�eC ! μ�μC) D4πα2

3s

�3

16π(1C cos2 θ )

�(17.140a)

σtot(e�eC ! μ�μC) D4πα2

3s(17.140b)

by replacing e�eC by qq. Namely, we replace s by m2 and correct for the quarkcharge λq and color degrees of freedom. As there are three colors R , G, B forthe quarks but the muon pair is colorless, only combinations of R R, G G andB B are allowed. Since each color exists with probability 1/3, the color factoris (1/3)2 � 3. Setting the probability of the quarks having momentum x1, x2 asq(x1)q(x2)dx1dx2, we obtain

d2σ(p p ! μμ C X ) D4πα2

9m2

Xi

λ2i [qi (x1)qi (x2)C qi(x2)qi (x1)]dx1 dx2

(17.141)

We introduce a directly observable variable referred to as the “Feynman x”

xF � x1 � x2 D ˙2jqj/p

s (17.142)

Page 748: Yorikiyo Nagashima - Archive

17.9 Parton Model in Hadron–Hadron Collisions 723

where q is the momentum of the muon pair in the p p CM frame. Using the Feyn-man x,8) we can rewrite

dx1 dx2 Ddm2dxF

(x1 C x2)s(17.143a)

d2dσdm2dxF

D4πα2

9m2 s

Xλ2

i[qi (x1)qi (x2)C qi(x2)qi (x1)]

(x1 C x2)(17.143b)

D4πα2

9τ2 s2

x1x2

(x1 C x2)

Xλ2

i [qi(x1)qi (x2)C qi (x2)qi (x1)] (17.143c)

x1, x2 can be calculated from two observables τ, xF:

x1 D12

�q4τ C x2

F C xF

�(17.144a)

x2 D12

�q4τ C x2

F � xF

�(17.144b)

The significance of Eq. (17.143c) lies in the fact that we can use parton distributionfunctions determined by deep inelastic scattering. Note, however, that the kinemat-ical region of its applicability has to be chosen with care. The parton model is validonly in hard reactions, namely, energy and momentum transfer associated with re-actions have to be large. The parton’s momentum itself has to be larger than 1 GeV.From experience in e(ν)–N scattering, it would be safe if x & 1/E . Assuming theconditions are satisfied, we examine Eq. (17.143b) at xF D 0 (x1 D x2 D

pτ). It is

expressed as

m4 d2σdm2dxF

ˇˇ

xFD0D

4πα2

9

pτX

λ2i q i(p

τ)qi (p

τ) (17.145)

This means we can observe the product of distribution functions and check thevalidity of the parton model. To obtain dσ/dm2, we may integrate Eq. (17.143b),but it is more convenient to go back to Eq. (17.141) and use the relation

1 D δ(m2 � x1 x2 s)dm2 (17.146)

After changing the order of integration we obtain

m4 dσdm2 D

4πα2

dLdτ

(τ) (17.147a)

dLdτ

(τ) DZ

dx1 dx2 δ(x1x2�τ)X

λ2i [qi (x1)qi (x2)Cqi(x2)qi (x1)] (17.147b)

dL/dτ(τ) can be interpreted as the luminosity of the parton collision. According toEq. (17.147), the lhs does not depend on energy if τ D m2/s is kept constant, inother words, a scaling law holds for the Drell–Yan cross section. Figure 17.21a [79]shows the Drell–Yan cross sections at 200, 300 and 400 GeV. Scaling is confirmedand we have also shown the validity of the parton model in hadronic reactions.

8) The “x” of deep inelastic scattering is referred to as the Bjorken x.

Page 749: Yorikiyo Nagashima - Archive

724 17 Hadron Structure

Figure 17.21 s�d2 σ/d

pτd y

�vsp

τ. (a) The dashed (solid) line corresponds to order α sQCD prediction at

ps D 19.4(38.8) GeV. Data are for � 19.4,4 23.8, ˘ 27.4, � 38.8 GeV.

(b) Angular distribution of muon pairs showing 1C cos2 θ characteristic of qq! μμ [79, 207].

Note, however, that although momentum and angular distributions are well re-produced, the absolute cross section exhibits a small discrepancy; the difference isreferred to as the K factor

K D(dσ/dm2)exp

(dσ/dm2)theor(17.148)

Observation [79] shows that K � 1.6 in this energy region and is considered toarise from QCD higher order corrections.

17.9.2Other Hadronic Processes

The formalism we developed for the Drell–Yan process can easily be extended toother hadronic reactions. Let us consider a process where hadron A and hadron Bcollide and emit a parton or a lepton, which we denote generically as c:

AC B ! c C X (17.149)

Page 750: Yorikiyo Nagashima - Archive

17.10 A Glimpse of QCD’s Power 725

The reaction cross section of the process can be expressed in parton language asthat of parton a in hadron A and b in B [D Oσ(ab ! c X )] multiplied by parton dis-tribution functions. If we use f a

A (xa)dxa to denote the probability of parton a beingfound in hadron A with momentum xa , the hadronic cross section is expressed as

σ(AB ! c X ) DZ

dxa dxb Oσ(ab ! c X )

��

f aA (xa) f b

B (xb)C (A$ B if a ¤ b)

(17.150)

Here Oσ denotes the elementary process by partons and is to be understood as theaverage of the color degrees of freedom for the initial state and the sum over thefinal states. Writing the total energy of the partons a and b as Os, we have

Os D xa xb s D τ s (17.151)

The formula corresponding to Eqs. (17.147) can be written as

dσdτ

(AB ! c X ) DdL ab

dτ(τ) Oσ(ab ! c X I Os D τ s) (17.152a)

dL ab

dτ(τ) D

Zdxa dxb δ(xa xb � τ)

hf a

A (xa) f bB (xb)C (A$ B if a ¤ b)

i(17.152b)

As for observables, rapidity η is commonly used in addition to Feynman x, where

Eab D xa E C xb E D (xa C xb)E (17.153a)

Pab D xa P C xb(�P ) ' (xa � xb)E (17.153b)

η D12

ln(Eab C Pab k)(Eab � Pab k)

D12

lnxa

xb(17.153c)

xa , xb can be obtained using Eq. (17.144) if τ and xF are known, but when τ and ηare known, they can be obtained from

xa Dp

τeCη , xb Dp

τe�η (17.154)

In terms of rapidity, the cross section is expressed as

d2σdηdτ

Dd2σ

dxa dxbDh

f aA (xa ) f b

B (xb)C (A$ B if a ¤ b)iOσ (17.155)

These formulae are used for expressing cross sections for production of W, Z andjets in hadronic reactions.

17.10A Glimpse of QCD’s Power

We have learned that partons are quarks and gluons and that gluons carry a fairamount of their parents’ momentum. The dynamics of quarks and gluons have

Page 751: Yorikiyo Nagashima - Archive

726 17 Hadron Structure

k

k’

q k - k’

xp

q + xp

p

ProtonElectron

Figure 17.22 Deep inelastic scattering in QCD. Partons are made of quarks and gluons. Quarksand gluons are constantly interacting with each other. For a complete QCD treatment, the con-tribution of the gluon has to be included.

been investigated successfully using QCD, a color S U(3) gauge theory. In its frame-work, the deep inelastic scattering process can be thought of as the electron inter-acting with one of the quarks (Fig. 17.22). In the parton model, partons are treatedas mutually independent and the whole reaction can be thought of as a sum ofindependent e–q elastic scatterings. In terms of the equivalent photon scattering,the virtual photon carrying momentum q and mass Q2 D �q2 is absorbed by aquark. In QCD, the quarks are accompanied by gluon clouds, just like the electronor the proton is surrounded by photon or pion clouds. When the probe used to lookinto the inner structure of the target is soft (namely when the momentum transferit carries is small), it does not have much resolving power. The virtual photon issimply absorbed by a parton carrying fractional momentum x of the target proton

(a) (b) (c) (d)

qp

p’

q p

p’ gt

γ ∗ γ ∗

q p yP

γ ∗P

q p

p’ g

s

γ ∗

p = q+ zp=q+xP’

pp’ p p’

(e) (f)

q g

t

q g

u

Figure 17.23 Deep inelastic scattering pro-cess in QCD. (a) As the probe becomes hard-er, it begins to see a structure in the quark.The quark is split from the gluon cloud. (b)

The parton model is equal to the Born term.(c,d) The next higher order corrections areCompton-like processes.

Page 752: Yorikiyo Nagashima - Archive

17.10 A Glimpse of QCD’s Power 727

changing its momentum from p D x P to p 0 D q C x P . In QCD the process canbe considered as the Born term depicted in Fig. 17.23b. But as the probe becomesharder and penetrates deep into the object, it begins to resolve the core from thecloud.

When the photon sees a magnified quark, the probability of finding a quark inthe quark–gluon cloud composite is not 100%. In the infinite momentum frame,we may consider that the quark in the cloud carries a fraction z of the total mo-mentum, and the virtual photon interacts with it. The process can be expressedpictorially as in Fig. 17.23a. The proton carrying momentum P supplies a partonwith fractional momentum p D y P . Subsequently the photon sees a magnifiedquark interacting with the quark in the cloud and picks up fraction z of p D y P .The parton with p D y P comes out as one carrying q C z p D q C x P , x D z y .

Just as the parton model treated the proton as a collection of partons, the photonwith larger Q2 sees a parton split into a collection of partons. The parton model isthen the Born term (Fig. 17.23b) of the whole reaction (a), and one can think of thenext-order QCD process as depicted in Fig. 17.23c–f. In Fig. 17.23c,d, the photon

F 2(x

,Q2 )

0

1

2

3

4

6

5

Figure 17.24 F2(x , Q2) plotted as a func-tion of Q2 for different values of x. The dataare a compilation of those obtained from e–P and μ–P scattering and include those of

the ZEUS [399], NMC [295], BCDMS [52] andE665 [4] groups. The lines are NLO (next toleading order) QCD fitting.

Page 753: Yorikiyo Nagashima - Archive

728 17 Hadron Structure

sees a quark split into a quark and a gluon and in Fig. 17.23e,f, a gluon split into aqq pair.

The whole process is formulated as the DGLAP (Dokshitzer–Gribov–Lipatov–Altarelli–Parisi) evolution equation [17, 18, 123, 187]. We do not go into the detailsof its calculation here, but let us catch a glimpse of its power. Figure 17.24 showsan example of the DGLAP formula fitted to the structure function. The data clearlyshow variation of the cross section as a function of Q2 i.e. the breakdown of thescaling, hence the naive parton model. The higher order QCD effect is at work,as shown by the various slopes of the distribution of the data at fixed x, and canreproduce the data correctly for a vast range of kinematic variables. The qualitativefeatures of the data can be understood as follows. As Q2 grows, the virtual photonsees more and more gluons, which in turn create qq pairs. This means the numberof quarks in the large x region decreases while that in the small x region increases.This sort of behavior originates from multiple emission of gluons, which is com-mon to all gauge field theories. But this amazing data reproducibility together withother evidence we mentioned in describing the quark model in Chap. 14 firmlyestablished the validity of QCD as the correct theory of quarks and gluons.

Page 754: Yorikiyo Nagashima - Archive

729

18Gauge Theories

18.1Historical Prelude

The development of particle physics during the period beginning in the 1970s com-pletely changed our view of matter. For the first time in history, we began to thinkwe could answer the question about what rules control the world of matter. Somemay feel it exaggerated to claim that much, but the success of gauge theory andthe resultant changes of our view on matter were so revolutionary as to justifythese words. It is an epoch-making phenomenon comparable only to the establish-ing of relativity and quantum mechanics. The gauge principle is now considereda fundamental cosmic principle in conjunction with relativity and quantum me-chanics. Yet the gauge principle thus established was by no means a new idea. Itwas well known that the Maxwell equations satisfy it, and it was in 1918, imme-diately after the proposal of general relativity, that Weyl published the first gaugetheory [387, 388]. Why did it take such a long time to recognize it as one of themost fundamental principles governing nature? The reader may refer to [304] for ahistory of how gauge theory developed.

First of all, a long time was spent establishing self-consistent quantum field the-ories. It was in the late 1940s that QED, the first quantum field theory based onthe gauge principle, was established. Divergence, an inherent difficulty of quan-tum field theory, was treated by renormalization. Gauge theory could have mademore progress triggered by this opportunity, but there were several factors hin-dering it. First, renormalization formalism depends on a recipe in which a finiteterm is extracted by subtracting infinity from infinity, and this was not considered arefined theory mathematically. Second, field theoretical treatment of strong interac-tion phenomena faced the difficulty that the coupling constant was large (� 10) andperturbation approaches did not work well. Other methods were no better. Aboveall, in the 1960s the idea of a particle democracy, in particular the bootstrap model,according to which all the particles are made of other particles, gained considerablesupport. The notion of fundamental particles and confidence in field theory itselfwere weak.

Furthermore, the weak and strong interaction phenomena did not reveal thecharacteristics that any gauge theory should respect. First, the force has to be long

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 755: Yorikiyo Nagashima - Archive

730 18 Gauge Theories

range, like the Coulomb force, but both were short range. Today, we know thischaracteristic is hidden for different reasons, the weak force due to spontaneoussymmetry breaking and the strong force due to confinement. We were finally ableto understand their true nature as a result of our perception that quarks and lep-tons are the fundamental constituents of matter. Based on the quark model, it waspossible to sort out and organize essential features from a proliferation of hadronicand weak interaction phenomena, thus enabling us to see through this frameworkhow the fundamental forces operate.

The second characteristic of gauge theory is its universality. The strength of grav-ity/electromagnetism depends only on the mass/electric charge and not on matterspecies or its form. General relativity is also a sort of gauge theory, but because of itsmathematical complexity it took a long time to fully appreciate this [202, 230, 376].

The third characteristic is that the source of the force in gauge theory must bea conserved quantity. Energy momentum/electric charge, the sources of the gravi-tational/electromagnetic force, were known to be strictly conserved quantities. Onthe other hand, there were no such conserved quantities in strong and weak inter-action. At least they were invisible to our eyes. The reason is the same as for theirshort-range character.

The fourth characteristic is that in strong and weak interactions the action of theforce changes the identity of the particle (exchange force). For instance, the protonand neutron interchange with each other under the action of the nuclear force andthe neutrino is transformed to the charged lepton. It was only after Yang and Mills’smathematical formulation [396] that extended the applicability of gauge theory toexchange forces that they were reconsidered in its framework.

Despite its great success, the gauge principle is not an easy concept and is pre-sented in a formal way in most textbooks. Here, we try to explain it more physically,using known concepts and analogies, for an intuitive understanding but with someloss of mathematical rigor. The explanation will take a somewhat roundabout route,so let us list the logical steps we follow from now on.

1. First we define the gauge principle, mathematically and formally.2. Then we try to apply a geometrical interpretation to the mathematical formula

and attach physical meanings, making correspondence with gravity (generalrelativity).

3. We introduce an experiment to demonstrate the Aharonov–Bohm effect asproof that the gauge potential is an observable in quantum mechanics and playsa more fundamental role than the electromagnetic field strength.

4. We extend gauge symmetry to nonabelian groups using S U(2) (isospin) sym-metry. We show, applying the geometrical interpretation to gauge theory, thatextension to nonabelian groups is straightforward and their formulae are easilyderived in a natural and logical way.

5. QCD is a direct application of S U(3) gauge theory to quark colors. We derive theasymptotic freedom qualitatively, whose discovery elevated QCD to the theoryof the strong interaction historically. We also discuss confinement in a qualita-tive way.

Page 756: Yorikiyo Nagashima - Archive

18.2 Gauge Principle 731

6. Establishing the weak interaction theory is more complicated in two ways. First,it mixes with the electromagnetic interaction and the symmetry has to be slight-ly modified to S U(2)L �U(1)Y . Glashow’s scheme to combine the two forces isdiscussed first.

7. Second, the weak force carrier, the W ˙ boson, has a large mass. If the weakforce respects the gauge principle, the symmetry is broken and as a result thetheory is unrenormalizable. To rescue the situation, the concept of spontaneoussymmetry breaking is introduced. Within its framework, a symmetry-breakingphenomenon can be realized without breaking the symmetry mathematically.Then the Higgs mechanism to attach finite mass to particles is discussed.

8. Combining (6) and (7) the GWS (Glashow–Weinberg–Salam) model, alsoknown as electroweak theory, is constructed.

QCD and the GWS model are collectively dubbed gauge theories based onS U(3)color � S U(2)L � U(1)Y and referred to as the “Standard Model” of parti-cle physics.

18.2Gauge Principle

18.2.1Formal Definition

We review QED as the model of the gauge theory to start from. From the La-grangian’s invariance under global (or first-kind) gauge transformation for afermion field ψ

ψ(x )! ψ0(x ) D e�i α ψ(x ) , ψ(x )! ψ0(x ) D ψ(x ) e i α (18.1)

Noether’s theorem guarantees the existence of a conserved current

jEMμ D qψγ μ ψ , @μ jEM

μ D 0 (18.2)

where q is the signed electric charge in units of proton charge. A conserved chargeoperator Q is defined as the integral of the time component of the conserved cur-rent, which is a generator of the gauge transformation (see arguments in Chap. 5)

Q D qZ

d3x ψγ 0ψ , e i Qα ψ e�i Qα D e�i qα ψ (18.3)

where the factor q is extracted from the phase considering the different chargethat each fermionic field carries. The gauge field (photon) is said to couple to theconserved current when the interaction Lagrangian is expressed as

LEM,INT D �qψγ μ ψA μ (18.4)

Page 757: Yorikiyo Nagashima - Archive

732 18 Gauge Theories

The photon field satisfies the Maxwell equations

@μ F μν D q j ν (18.5a)

@λ Fμν C @μ Fνλ C @ν Fλμ D 0 (18.5b)

Fμν D @μ A ν � @ν A μ (18.5c)

which are invariant under the gauge transformation of the photon field

A μ(x )! A0μ(x ) D A μ(x )C q�1@μ α(x ) (18.6)

Here α is an arbitrary well-behaved continuous function of x, not related to α inEq. (18.1) at this stage. The field ψ satisfies the free Dirac equation, but if it is gaugetransformed with α(x ), which depends on x (local or second kind), according to

ψ(x )! ψ0(x ) D e�i α(x )ψ(x ) , ψ(x )! ψ0(x ) D ψ(x )e i α(x ) (18.7)

the transformed field ψ0 no longer satisfies the Dirac equation because the differ-ential operator brings @μ α(x ) into the equation of motion. However, a Dirac fieldcoupled with the gauge field has Lagrangian

LEM D ψ�i γ μ(@μ C i qA μ) � m

ψ (18.8)

which is invariant under simultaneous local gauge transformations of both ψ andA μ if one uses the same α(x ). In other words, if the force field is inserted in the La-grangian in such a form that the differential operator @μ is replaced by the covariantdifferential operator Dμ

@μ ! Dμ D @μ C i qA μ (18.9)

the equation of motion is invariant under the local gauge transformations provid-ed both the fermion and the force fields are transformed simultaneously. This facthas been known for quite a while under the name “minimum interaction”. Gaugetheory has promoted it to a principle.

Gauge Principle“The equation of motion is invariant under a local gauge transformation.”

It may also be stated as follows. When the Lagrangian we deal with is invariantunder local gauge transformations, a conserved object, which may be called elec-tric charge (or more generally hypercharge), is the source of the force field. Theinteraction Lagrangian that couples the matter (fermion) field with the force fieldis obtained by replacing the differential operator in it by the covariant differential.The force field introduced this way is called the gauge field. If we make an interact-ing system from a free field Lagrangian, which respects a global gauge symmetry,by replacing the derivative by the covariant derivative, we say the system is gauged.

Page 758: Yorikiyo Nagashima - Archive

18.2 Gauge Principle 733

Conserved quantities can be obtained by symmetry transformations, but not allof them are sources of a force. Nowadays, all the fundamental forces (strong, elec-tromagnetic, weak and gravity) are considered as based on the gauge principle, andonly those conserved objects based on the gauge principle are strictly conserved.Those not based on them (including lepton and baryon conservation) are consid-ered ineligible for the name. In fact, in GUTs (grand unified theories), which dealwith very high energy phenomenon that could only be realized at the time of bigbang, they are broken.

In nonabelian gauge symmetry the phase transformation is extended to multi-component fields in matrix representations. Examples are those of S U(2) (isospin)and S U(3) (unitary spin) we met in Chap. 14.

18.2.2Gravity as a Geometry

According to Einstein’s theory of general relativity, a particle always moves on astraight line (geodesic line to be exact) in space-time. According to it, space-timeis not flat as we perceive intuitively but curved. How it is curved is determined bythe presence of matter (energy-momentum tensor). As it is hard to imagine thedeformation of space-time in the four dimensions in which we live, we consid-er a two-dimensional space. Then, figuratively speaking, the space behaves like astretched rubber sheet. When it is empty without matter, it is flat (Euclidean space)and Cartesian coordinates can be set up (Fig. 18.1a). A mass point or particle Aon it moves on a straight-line trajectory. Its motion of equation is Newton’s firstlaw (law of inertia), or its extension to special relativity satisfying Lorentz invari-ance d p/dτ D mdv/dτ D 0, where τ is proper time. When there is a mass B, itdeforms the space as shown in Fig. 18.1b. The space is no longer flat, but parti-cle A’s motion still follows a geodesic line (i.e. a straight line in the curved space).However, projected on Cartesian coordinates it looks like a curved line. Then it isinterpreted as particle A’s motion being affected by B’s force bending its trajecto-

A

BA

(a) (b)

Figure 18.1 Geometrical interpretation ofgravity. (a) Space-time is like a stretched rub-ber sheet. If there is no matter, it is flat andCartesian coordinates can be set up. A pointmass particle A moves on a straight trajectoryobeying the law of inertia. (b) When there isa big mass B, the space is deformed (curved)

and no Cartesian coordinates can be set up.A’s trajectory is still a geodesic line, namelya straight line in the curved space, but pro-jected on Cartesian coordinates it is curvedand is interpreted as dynamical bending of thetrajectory of A attracted by B’s gravity.

Page 759: Yorikiyo Nagashima - Archive

734 18 Gauge Theories

ry. Gravity as a dynamical phenomenon between two massive objects, in Newton’sconcept, has been replaced by a geometry of space-time, in general relativity.

To describe the equation of motion, we need 1) to know how space-time is de-formed by the presence of matter and 2) to define geodesic lines in a curved space.

We will see that the gauge force can also be interpreted geometrically. We con-sider components of complex matter fields as vector components in a coordinateframe in an internal space and gauge transformation as a rotation (or more gener-ally coordinate transformation) in the internal space. Then the force can be inter-preted as the effect of a curved hyperspace, the space-time and the internal spacecombined. The source of the gauge force is the (electric) charge, whose presencedeforms hyperspace just as a mass deforms space-time. We interpret the motion ofthe particle as following a geodesic in hyperspace, which is curved when projectedon space-time.

18.2.3Parallel Transport and Connection

In order to construct mathematical formulae for geodesic lines in a curved space,we need to understand the notion of parallel transport of a vector. Parallel trans-port is the action of moving a vector from one place to another without changingits magnitude or direction. Then a geodesic line is defined as a trajectory that can beobtained by parallel transport of its tangential vector. In space-time, where its met-ric (equivalent of length in Euclidean space)1) is defined, it can be shown that thegeodesic line is also the shortest path connecting the two points. Therefore, it is anextension of the concept of a straight line in Euclidean space. Let us start by defin-ing a straight line locally. If a line is represented by its direction (tangential vector),the definition of a straight line is that when its tangential vector is parallel trans-ported by an infinitesimal distance from x D (x , y ) to x C dx D (xC dx , y C d y ),it does not change its direction or, equivalently, components of the vector do notchange. Translated to a particle’s motion, it can be described by Newton’s first law,that is, the particle does not change its direction when it has moved an infinitesimaldistance in time d t. Thus a straight line is defined by the statement that the timederivative of the particle’s spatial coordinates (velocity) stays constant (Fig. 18.2a):

Straight line in Cartesian coordinates:dvd tD 0 or

d pd tD 0 (18.11)

However, if we use curvilinear coordinates, the mathematical expressionEq. (18.11) has to be modified. In order to see this, let us view the same linein polar coordinates. This is a frame where two base vectors (e r , eθ ) to define its lo-cal coordinates are directed in the r (radial) and θ (orthogonal to radial) directions

1) In Minkowski space, the metric ds2 isdefined as

ds2 D ημν dx μ dx ν D (cdt)2�dx2�d y2�dz2

(18.11)

ημν is referred to as the metric tensor. Ingeneral relativity, the metric tensor is afunction of space-time point x and denotedas gμν (x ).

Page 760: Yorikiyo Nagashima - Archive

18.2 Gauge Principle 735

x

y

(a) (b) (c)

A local coordinate system

VA(x)

VB = VA║ (x + dx)

x

x + dx dθ

φ dθ

φ

Figure 18.2 (a) A tangential vector pointingin the direction of a straight line in Cartesiancoordinates has the same components ev-erywhere. (b) In polar coordinates, while thedirection of the vector stays unchanged, itsdirection relative to the local coordinate axis

is different everywhere. Hence the vector’scomponents are functions of the space point.(c) A vector V A(x) at point AD x is paralleltransported to a point BD x C dx withoutchanging its magnitude and direction. It isdenoted as V A k(x C dx ).

(Fig. 18.2b). The directions of the coordinate frame set up at space point x � (r, θ )point away from the origin (e r ) or are rotated by π/2 (eθ ), and are inclined relativeto the Cartesian setting. To define a straight line locally, the change of base vectorshas to be compensated.

Let us find an expression for vector parallel transport. A vector V A(x ) shown inFig. 18.2c at point x � (r, θ ) has components V A D (Vr , Vθ ) D (V cos φ, V sin φ)in polar coordinates as well as in Cartesian coordinates. Note that we have delib-erately set the point A at θ D 0 to make their components identical. When itis parallel transported to point B, x C dx D (r C dr, θ C dθ ), we denote it asV B D V A k(x C dx ). Here, the base vectors of the polar coordinate at x C dx isinclined by an angle dθ relative to those at x . Components of V A k are the sameeverywhere as those of V A(x ) in Cartesian coordinates, but in the polar coordinateframe correction terms to compensate the directional change of the base vectorshave to be included:

VA k r(x C dx ) � VB r D V cos(φ � dθ ) ' V cos φ C dθ V sin φ

D Vr C dθ Vθ (18.12a)

VA k θ (x C dx ) � VB θ D V sin(φ � dθ ) ' V sin φ � dθ V cos φ

D Vθ � dθ Vr (18.12b)

This example clearly shows that vector components in the polar coordinates changetheir values when parallel transported. In differential geometry, the transportedvector is generically expressed as

V k(x C dx ) D V (x ) � Γ ν(x )V (x )dx ν (18.13)

The correction term is assumed to be a linear transformation of the original vectorand proportional to the displacement if it is small. The assumption is justified inour example Eq. (18.12). Γ ν introduced in Eq. (18.13) is referred to as the “con-nection”. Note that Γ ν is a space-time vector, as indicated by its subscript ν, but it

Page 761: Yorikiyo Nagashima - Archive

736 18 Gauge Theories

120˚ 135˚ 150˚ 165˚ 180˚ 165˚ 150˚ 135˚ 120˚ 105˚ 90˚ 75˚

Tokyo

SanFrancisco

Figure 18.3 A great circle connecting Tokyoand San Francisco is a geodesic on the earth’ssurface (sphere). It looks curved when pro-jected onto Cartesian coordinates in a flatspace. In a local orthogonal coordinate system

(longitude, latitude) set up on the earth’s sur-face, an airplane flying along the great circle isbeing parallel transported. However, its direc-tion with respect to local coordinates changesincessantly.

is also a matrix acting on an object to be parallel transported. Since in Eq. (18.13)V itself is a space vector, Γ ν here is a n-component vector whose components aren � n matrix where n is the dimension of the space-time.

In differential geometry (and in general relativity), where a metric is given theconnection is a derived quantity in terms of the derivatives of the metric tensorgμν .2) In Cartesian coordinates, all the coordinates are globally aligned in the samedirection and components of the parallel transported vector are the same every-where, namely, the connection identically vanishes everywhere. The polar coordi-nates we described above are set up on a flat plane, hence they can always be trans-formed back to the original Cartesian coordinates. In other words, there alwaysexists a transformation that makes the connection vanish identically in a flat spaceby a suitable transformation. Since Cartesian coordinates can only be defined in astraight and flat space, the existence of Cartesian coordinates is equivalent to thespace being Euclidean.

2) It is defined as

Γ μ � [Γμ ]�ν D12

g�λ (@μ gλν C @ν gλμ � @λ gμν )

(18.14)

and is referred to as the Christoffelconnection. gμν is a metric tensor definingthe metric of the space (see Footnote 1).

Note: In differential geometry, two distinctvectors, contravariant and covariant vectors,are defined. The Christoffel connection isused to relate their components. Vectors inour example used in Eq. (18.12) are related tobut not identical to one of them because indifferential geometry the base vectors changetheir magnitude as well as their direction.

Page 762: Yorikiyo Nagashima - Archive

18.2 Gauge Principle 737

Next, we consider the earth’s surface, where the coordinates are expressed interms of two angles (longitude, latitude). They are orthogonal coordinates every-where and locally (i.e., in a limited region) are almost identical to Cartesian coor-dinates (with suitable rescaling), but globally they are not. A geodesic line (greatcircle) connecting Tokyo and San Francisco is depicted in Fig. 18.3. The line’s di-rection changes continuously relative to local coordinates. Here, also, parallel trans-port can be defined using the connection. But this time, no Cartesian coordinatescan be set up on the sphere. Therefore there is no transformation to convert the(longitude, latitude) system to Cartesian coordinates or, equivalently, there are notransformations to make the connection vanish everywhere. Obviously, the con-nection contains information on the space itself as well as the characteristics ofthe coordinate system. This is expected because the connection is derived from themetric tensor which defines the structure of the spacetime.

18.2.4Rotation in Internal Space

We learned in Chap. 9 that in order to describe physical phenomena we have toset up a coordinate system but that the phenomena themselves do not dependon which coordinates are adopted. An experiment carried out in Japan or in Ger-many should produce the same result, which is referred to as translational sym-metry. From it, Noether’s theorem derives energy-momentum conservation. Sim-ilarly, whether the experimental setup is directed north or east, it should producethe same result. This is rotational symmetry, from which angular momentum con-servation is derived. Contrary to these examples, charge and isospin conservationare derived from gauge symmetry (a phase transformation invariance), which hasnothing to do with real space-time. But the above logic suggests that a coordinatesystem can be set up in internal space, too, and the origin of the conservation lawcan be considered as a result of the physical phenomena being independent of howwe choose the internal coordinate system.

The simplest example is a charged particle that has two degrees of freedom, cor-responding to positive and negative charge. Let us denote the field components asψ1, ψ2; the field can be considered as a two-component vector in a certain internalspace. Denoting the orthonormal base vectors in this space as e1, e2, we can expressthe field as

ψ D ψ1e1 C ψ2e2 (18.15)

If the physical phenomena do not depend on how we set up the coordinates, thelaws of physics do not change if we rotate the coordinates. Denoting transformedψ as ψ0, by rotating the coordinate axis by an angle α we can write

ψ01 D ψ1 cos α C ψ2 sin α

ψ02 D �ψ1 sin α C ψ2 cos α (18.16)

Page 763: Yorikiyo Nagashima - Archive

738 18 Gauge Theories

Introducing the complex field by

ψ D (ψ1 C i ψ2)/p

2 , ψ† D (ψ1 � i ψ2)/p

2 (18.17)

we have

ψ0 D e�i α ψ , ψ† 0 D e i α ψ† (18.18)

The rotation of a real field in two-dimensional space reduces to a phase transfor-mation in a one-dimensional complex space. Charge conservation results from thesymmetry of the phase transformation, which is referred to as U(1) gauge symme-try.

Now we imagine a hyperspace by combining (four-dimensional) space-time andinternal space and consider the field ψ a vector in the hyperspace. If the coordi-nates in the hyperspace are equivalents of the curvilinear coordinates, their direc-tion changes locally. If the rotation angle in the internal space depends on space-time coordinates α D α(x ), it is a local gauge transformation. Then in analogy toEq. (18.13) a parallel transported field in the hyperspace can be expressed as

ψk(x C dx ) D ψ(x ) � i qA ν(x )ψ(x )dx ν (18.19)

which introduces A ν(x ) as the connection. Here x is used to denote space-timevariables in Minkowski space. As we will soon see, A ν(x ) is nothing but the gaugepotential. We use the terms (gauge) potential for A μ and field strength for itsderivative, i.e. Fμν D @μ A ν � @ν A μ , which is the electromagnetic field in U(1)gauge symmetry. When we refer to ψ we call it the matter field to distinguish itfrom the gauge field. We have introduced the gauge potential not as a force fieldbut as a “connection”, which is a geometrical object relating vectors in differentplaces. The equation is identical to Eq. (18.13) if we regard Γ ν D i qA ν as an oper-ator acting on a vector (ψ) in an internal space. Unlike the Christoffel connection,which acts on a space-time vector (hence has two extra space-time indices), the con-nection A ν has only one subscript in space-time and is a Lorentz vector becausethe field ψ in Eq. (18.19) is a scalar in space-time. The extraction of the q factoris for later convenience and the reason for introducing the imaginary number i isbecause the rotation in the internal space is a phase transformation.

Perhaps a simple example may lead to an intuitive understanding of the intro-duction of i. Consider the parallel transformation depicted in Fig. 18.2c as a vectordefined in an internal space but moved from x to x C dx in space-time. Then wecan introduce a vector in an internal complex plane

V (x ) D Vr C iVθ

V k(x C dx ) D Vk r C iVk θ(18.12)D V � i dθ V D [1� i A ν(x)dx ν ]V (18.20a)

A ν(x) D@θ@x ν D

��y

x2 C y 2 ,x

x2 C y 2

�3) (18.20b)

3) One may notice that, apart from its normalization, this happens to be the gauge potential that isrealized outside a solenoidal coil. See Problem 18.1.

Page 764: Yorikiyo Nagashima - Archive

18.2 Gauge Principle 739

The interpretation of the gauge potential as the connection in a hyperspace meansthe gauge potential connects the locally rotating coordinates in the internal spaceand contains in itself the properties of the hyperspace. Note, however, no metric isdefined in the internal space. Unlike the Christoffel connection, which is derivedfrom the metric, A ν as the connection is a starting point for the discussion thatfollows. The assumption is only justified by overall consistency.

18.2.5Curvature of a Space

Considering the role of the connection, we realize that it should contain informa-tion about the characteristics (or “curvedness”) of a curved space. Let us considerhow we can extract such information from the connection. Parallel transport of avector in a flat space is easy to understand intuitively, but that in a curved spacetakes some getting used to. Let us consider parallel transport of vectors on a sphere(in real space). How a vector moves on the sphere can be estimated by paintingmany parallel arrows on a flat plane and copying them on the sphere by rolling thesphere on it. Figure 18.4 illustrates two examples of vector transport.

A

CD

North Pole

BA

RQ

P

Deficit Angle α2

π/2

r

×r/2

30˚

AA

E

D

C

B

r

(a) (b) (c)

Cut and spread

Deficit Angle α1 Deficit Angle α230˚ Latitude Line

R

E

Figure 18.4 Deficit angle as a measure of cur-vature. (a) A vector denoted by an arrow ofoutline type starting from the north pole (P)is parallel transported along the lines P ! Q! R! P, where Q, R are points on the equa-tor 90ı apart. When it comes back to whereit started, its direction is π/2 different, whichis referred to as the deficit angle. The totalcurvature is defined as the deficit angle in ra-dians and the average curvature by dividingit by the area enclosed by the loop, which is(π/2)/(4πR2/8) D 1/R2. Local curvature isobtained by making the closed loop infinitesi-mally small.ABCDE is a line along latitudeD 30ı (pointsA and E are identical). It is not a great circle,

and the black vector at A, when parallel trans-ported along the latitude 30ı line, changesits direction relative to the local (longitude,latitude) coordinates as depicted. When itcircles around and comes back its directionis opposite to the original direction. This canbe explained by making a plane tangential tothe line ABCDE (b) and spreading it out ona plane, which is shown in (c). The vector’sdirection is the same everywhere, but it is notthe same relative to the line ABCDE. As thearea of the sphere above the latitude 30ı isπR2 and the deficit angle is π, the curvature isagain 1/R2.

Page 765: Yorikiyo Nagashima - Archive

740 18 Gauge Theories

A conspicuous characteristic of a curved space is that if a vector is parallel trans-ported along a closed loop, its final direction is different from what it was initially.The difference is referred to as the “deficit angle”. As the bending of the spacegrows, the deficit angle becomes larger. The total curvature of an area is the deficitangle expressed in radians and the average curvature is defined as that divided bythe area. Local curvature is defined as the average curvature with the closed looptaken to be infinitesimal.

As an example, let us consider a sphere with radius R. In Fig. 18.4a, a vectordenoted by a white (outline typeface) arrow at the north pole is parallel transportedto the equator P! Q along a longitude line, then moved along the equator to theeast as far as 90ı in longitude (Q! R), and finally back to the north pole (R! P).The vector direction is shifted by π/2 relative to the original. The area surroundedby the closed line P ! Q! R ! P is πR2/2. Therefore, the average curvature is(π/2)/(πR2/2) D 1/R2.

For the case of parallel transport along a line at latitude 30ı, one can see thevector rotates relative to the local coordinates as it moves. How it moves can beunderstood by spreading the tangential plane on a flat plane. As the plane tangen-tial to the 30ı latitude line is a cone, one can see that the parallel arrows on theplane rotate relative to the latitude line (Fig. 18.4c). Points A and E are identical. Asis clear from the figure, the two vectors at A and E are parallel on the spread-outsheet, but in opposite directions on the sphere, hence the deficit angle is π.

In a similar manner, we can define curvature in a hyperspace by parallel trans-porting the field ψ along a closed loop and by calculating the deficit angle. Let usconsider the case when the field is transformed ψ ! ψ C δψ by circulating alonga loop defined as the periphery of an infinitesimal surface element dS D dx μ dx ν

(Fig. 18.5):

δψ D �i qI

A μ ψdx μ

D �i qZ

d S[r � A]n?d S ψdx μ dx ν ' �i qFμνψdx μ dx ν (18.21a)

Fμν D @μ A ν � @ν A μ (18.21b)

where Stokes’ theorem was used in obtaining the third equality. The rotation angleΔα is given by

e�iΔα ψ D ψ C δψ ' (1� iΔα)ψ , ) Δα D i δψ/ψ (18.22)

P(xμ, xν) Q(xμ + dxμ, xν)

R(xμ + dxμ, xν + dxν )S(xμ, xν + dxν )

xνxλ

dS

Figure 18.5 Integral path for calculating covariant derivatives Fμν D @μ A ν � @ν A μ .

Page 766: Yorikiyo Nagashima - Archive

18.2 Gauge Principle 741

Therefore the curvature in the hyperspace is given by

Δαdx μ dx ν D

i δψ/ψdx μ dx ν D qFμν (18.23)

This shows that the electromagnetic field is nothing but the curvature of a hyper-space. The Maxwell equation [see Eq. (3.49)]

@μ F μν D q j ν (18.24)

can be interpreted as defining equations of curvature.

The Electromagnetic Field is the Curvature of a HyperspaceThe existence of the electric charge (and its current) deforms the hyperspaceand its curvature is defined by the Maxwell equation.

Let us compare it with the equation of general relativity (Einstein’s equation):

Rμν �12

R gμν D 8πGN Tμν (18.25)

where GN is the gravitational constant and Rμν and R D�gμν Rμν

�on the left

hand side are referred to as Ricci’s tensor. It is derived by contracting the Riemanntensor, the curvature of space-time.4) This, in turn, is derived from the Christoffelconnection by the same procedure as we used to derive the electromagnetic field.We remind the reader that the exact mathematical formula in differential geome-try is not important, rather, tracing the logical flow and trying to understand thecause and result are. As Tμν on the rhs is the energy-momentum tensor, Einstein’sequation says that the energy distribution in space-time determines the curvatureof space-time. One recognizes complete parallelism in the formalism of electro-magnetic and gravity equations.

18.2.6Covariant Derivative

A free-field equation is generally a second-rank differential equation, which isgenerically expressed as

Λ(@μ)ψ(x ) D 0 (18.27)

4) Riemann’s and Ricci’s curvature tensors areexpressed as

R�σμν D �

�@μ Γ ν � @ν Γ μ C

�Γ μ Γ ν � Γ ν Γ μ

��σ

Rμν D R�μ�ν , Rμν D gμ� gνσ R�σ ,

R D Rμμ D gμ� R�μ (18.26)

The third term in the square brackets, whichis nonlinear in Γ , stems from the nonabeliannature of the Christoffel connection. Notice ithas exactly the same form as the nonabeliangauge field [see Eq. (18.57) and discussionsin Sect. 18.4.3].

Page 767: Yorikiyo Nagashima - Archive

742 18 Gauge Theories

where Λ(a) is a polynomial of a up to quadratic terms. We need to define thedifferential operator in a general coordinate frame. When a field ψ(x ) is defined atall space-time points, its derivative is normally defined as

@ψ(x )@x

D limΔ x!0

ΔψΔxD lim

Δ x!0

ψ(x C Δx )� ψ(x )Δx

(18.28)

For a vector field, the above formula applies to each component. The differencefor a vector field as compared to a scalar field is that, when making difference Δψbetween ψ(x C Δx ) and ψ(x ), the feet of both vectors have to be at the samespace-time point. In the Cartesian coordinate system, one need not worry aboutthis, because the direction of a vector or equivalently all components of the vectorare the same anywhere in space. This is not true in curvilinear coordinates, be-cause the same vector has different components at different space points. In thiscase, the vector at x has to be parallel transported to x C Δx without changing itsmagnitude and direction (Fig. 18.6). We already know the expression for a vectorparallel transported from x to x C dx . It is given in Eq. (18.13). To differentiate ψ,one has to make a difference Δψ by subtracting ψk(x C Δx ) from ψ(x C Δx ) asshown in Fig. 18.6.

Dμ ψ(x ) D limΔ x!0

Δψ(x )Δx μ

D @μ ψ(x )C Γ μ(x )ψ(x ) (18.29a)

which is referred as covariant derivative. Here, we illustrated ψ as a vector in thespacetime and used Eq. (18.13) to define the derivative as given in the differentialgeometry. Similarly, to differentiate ψ in the curved hyperspace, we use Eq. (18.19).Then its derivative can be defined as

D ψ(x )@x μ � lim

Δ x μ!0

ψ(x μ C Δx μ) � ψk(x μ C Δx μ)Δx μ

Ddψ(x )dx μ

C i qA(x )μ(x ) ψ(x ) (18.29b)

which is precisely the “covariant derivative” we defined in Eq. (18.9). The covari-ant derivative contains the connection, which contains the characteristics of thecoordinate system as well as the structure of the space.

According to differential geometry, in curvilinear coordinates the same equationholds anywhere in the space if one uses covariant derivatives instead of ordinarydifferential operators. Phrased in a different way, an equation expressed in termsof covariant derivatives keeps the same form under any general coordinate trans-formation provided a suitable connection is used. This is covariance under generalcoordinate transformation or the principle of general relativity. The principle itselfis a matter of geometry and has no physical substance. The physics lies in identify-ing the space-time nature has chosen and the equation of motion in that space. Theformer is realized by Einstein’s equation and the equation of motion is obtainedfrom the principle of equivalence. We already have the equivalent of the Einsteinequation in the hyperspace. Therefore, our next task is to translate the equivalenceprinciple to the language of the gauge theory.

Page 768: Yorikiyo Nagashima - Archive

18.2 Gauge Principle 743

Δx μ

ψ(x )

ψ (x + Δx)=

ψ(x + Δx )Δψ

ψ (x + Δx ) = ψ(x) - Γ(x)νψ(x) Δx ν=Dμψ(x) = = ∂μψ + Γ(x)μψ(x)

ΔψΔx μ

Figure 18.6 Parallel transport and covariantderivative. Parallel transport of a vector incurvilinear coordinates is defined by using the“connection Γμ ”. ψ(x) parallel transported to

point x C Δx is denoted as ψk(x C Δx). Aderivative in curvilinear coordinates is definedas limΔx!0[ψ(x C Δx) � ψk(x C Δx)]/Δx .

18.2.7Principle of Equivalence

The principle of equivalence identifies gravity as the inertial force in an accelerat-ing system. In an inertial frame, there is no gravity and the equation of motionis described by Newton’s law (d p/d t D 0) or its extension to special relativity(d p/dτ D 0, τ D proper time). Since the equation expressed in terms of covari-ant derivatives keeps the same form if transformed to an accelerating frame, theequation of motion under the gravitational field is expressed as

Principle of Equivalence

Eq. of motion in inertial frame Eq. of motion in curved spaced p μ

dτD 0 !

D p μ

dτD 0

(18.30)

This means that the trajectory that a point particle follows is a geodesic line. Wehave now presented the mathematical formula of what we described pictoriallyin Fig. 18.1. A particle A under the effect of another mass B follows the trajec-tory of a geodesic line. In order for the above equation to hold, one has to verifythe existence of the inertial frame in an accelerating frame. Thinking physical-ly, it is always possible by getting in an elevator and cutting the suspension wire(Fig. 18.7a). The elevator goes into free fall and no gravity exists inside it. If onefeels bad about getting in a falling object, a satellite would provide the same inertialframe. In mathematical language, there is always a transformation that makes theconnection vanish at a point. The vanishing of the connection means the existence

Page 769: Yorikiyo Nagashima - Archive

744 18 Gauge Theories

dpdτ 0

Dpdτ 0

To the Center of Mass

A B

A’ B’

(a) (b)

Figure 18.7 Principle of equivalence. (a) In agravity field, space-time is curved, but there al-ways exists a coordinate transformation where“the connection” vanishes at a point. Physi-cally, it can be realized by entering an elevatorand cutting its suspension wire. The equationof motion in the elevator is that of the inertialframe, i.e. dp/dτ D 0, where τ is proper time.Outside the elevator it is obtained by replac-

ing the derivative with its covariant versionand transforming back to the original acceler-ating frame. (b) An illustration of the fact thatthe connection vanishes only locally. In a widearea containing A and B attracted by the earth(a spherical gravitational center), one seestwo mass points are attracted to each otherwithout any force, which means the area is notin an inertial frame.

of Cartesian (Minkowskian in the case of space-time) coordinates, which, in turn,means an inertial frame. Notice that a local inertial frame means the vanishing ofthe connection at a point mathematically, but physically it includes its immediateneighborhood as long as the higher order terms in the Taylor expansion can beneglected. If one considers a wide area such as that depicted in Fig. 18.7b, one seestwo mass points A and B come closer as time elapses, which means the area is notan inertial frame.

What is meant by “local”? For radius < 10 m, a circular accelerator can be builtassuming the earth is flat. But when the radius of the acceleration circle is on theorder of kilometers, the curvature of the earth’s surface can no longer be neglectedin aligning magnets. Here, locality is limited only to � 100 m. Do the dynamics ofgalaxy formation or even the formation of a cluster of galaxies need to take into ac-count cosmic curvature effect? Here locality applies to a region as large as millionsof light years. Locality changes depending on the kind of observables we are talkingabout.

Going back to particle field theories, in the absence of an interaction, the mat-ter field equation can be derived from free Lagrangians. We already know them,the Klein–Gordon, Dirac or Maxwell (or Proca if massive) equations for spin 0, 1/2or 1 fields. They are the equivalent of the law of inertia in gravity. General rela-tivity means covariance of the equation under general coordinate transformation.In gauge theory, it means covariance under local gauge transformation. In bothcases it is achieved if the equation is written in terms of the covariant derivatives.

Page 770: Yorikiyo Nagashima - Archive

18.2 Gauge Principle 745

The principle of equivalence tells us that by replacing the derivatives in the freeLagrangian by the covariant derivatives, the Lagrangian in curved space, namely,that of interacting fields, is obtained. This is precisely the process of gauging wementioned before.

Priniciple of Equivalence in Gauge Theory

Eq. of free field Eq. of interacting fieldΛ(@μ)ψ(x ) D 0 ! Λ(Dμ)ψ(x ) D 0

(18.31)

The existence of an inertial frame in general relativity is translated to gauge the-ory as follows. No matter what internal coordinate [gauge D phase α(x )] is used,there always exists a coordinate (gauge) transformation such that the connection(gauge potential) vanishes at at least one point. The requirement is expressed as

A μ(x )! A0μ(x ) D A μ C @μ α(x ) D 0 (18.32)

Let us see if there is a gauge transformation that makes the gauge field vanish atevery space-time point. The condition that Eq. (18.32) can be solved is for α to beintegrable, namely

@μ@ν α(x ) D @ν@μ α(x ) (18.33)

Combining this with Eq. (18.21b) gives

Fμν D @μ A ν(x )� @ν A μ(x ) D 0 (18.34)

Namely, the nonexistence of the electromagnetic field is the condition for the van-ishing of the gauge potential everywhere. Since, in hyperspace, the electromagneticfield represents its curvature, and the vanishing of the gauge potential everywheremeans a flat hyperspace, it is logical to require the electromagnetic field to vanishidentically.

18.2.8General Relativity and Gauge Theory

Let us rephrase the theory of general relativity in the language of gauge theories.

In an inertial frame, Lorentz (or more generally Poincaré) invariance holds.The Lorentz transformation [Eq. (3.76)] has the form of a phase transforma-tion, and contains 6 (10) parameters, the velocity and rotation angle (and dis-placement). They are constants and the transformation could be dubbed a

Page 771: Yorikiyo Nagashima - Archive

746 18 Gauge Theories

global gauge transformation. If the transformation is made local, i.e. if theparameters are functions of space-time points, it is a general coordinate trans-formation. The principle of general relativity is rephrased as invariance underlocal Poincaré transformation. It requires that the equation of motion does notchange its form. The principle of general relativity is nothing but invarianceunder local gauge transformation.

A big difference between gravitational theory and gauge theory is that, in gaugetheory, parallel transport is activated in an internal space and the connection hasonly one subscript as far as space-time components are concerned. This is whythe gauge field is a Lorentz vector. Hence the gauge particle has spin s D 1. If the

Table 18.1 Correspondence between gauge theory and general relativity.

Gauge Theory General Relativity

frame internal space space-time

coordinate trf. gauge trf. Poincaré trf.

Global invariants charge energy

Symmetry eq. motion KG/Dirac eq. a inertial law

Λ(@μ )ψ D 0d p μ

dτD 0 b

coordinate trf. local gauge trf. local Poincaré trf.or general coordinate trf.

connection gauge potential Christoffel connection

i qA μ (x ) Γ μ D Γ λνμ

parallel trpt. ψ � i qA μ ψdx μ p λ � Γ λμν p μ dx ν

cov. derivative Dμ D @μ C i qA μ Dμ p λ D @μ p λ C Γ λμν p ν

curvature electromag. field Riemann’s curvatureFμν Rλ

μν�

origin of curv. charge energyLocal field equation Maxwell eq. Einstein eq.

Symmetry @μ F μν D q j ν Rμν �12

Rgμν D 8πGN T μν

(curved space) eq. motion Λ(Dμ )ψ D 0D p μ

dτD 0

equivalence potential can Γ μ can vanish locallyprinciple vanish locally (existence of

local inertial frame)

total curv. D 0 existence of(flat space) A μ D 0 globally Cartesian coordinates

Γμ D 0 globally

a KG = Klein–Gordonb τ is proper time, dτ D dt

p1� 2

Page 772: Yorikiyo Nagashima - Archive

18.2 Gauge Principle 747

hyperspace is curved, the Lorentz invariance is preserved. But in gravity, applicationof a general coordinate transformation breaks Lorentz invariance, which means thespace-time is curved. Since the object to be parallel transported is a Lorentz vector,the (Christoffel) connection has three indices for space-time components; one asa vector in spacetime, two as a matrix acting on spacetime vector to be paralleltransported. Since the connection for gravity is derived as derivatives of the metrictensor of rank 2 that gives the gravitational potential, the graviton, the quantum ofthe gravitational field, has spin 2. As a summary of what we have described so far,we list correspondences between gauge and gravity fields in Table 18.1.

Are All the Fundamental Forces Geometrical?The geometrical interpretation of the gauge force we have presented is notrigorous but by no means accidental. In 1956, Utiyama proved that generalrelativity is a kind of gauge theory [376]. There is also Kaluza–Klein theory,which was proposed in 1921. Working in five-dimensional space-time, theyshowed that the electromagnetic force can be derived from general relativityas the force produced by the energy-momentum in the fifth dimension andprojected in four-dimensional space-time. In their model, the fifth dimensionis not infinitely long but curled to form a cylinder, making the space trans-lation in the fifth dimension periodic, which is essentially an act of gaugetransformation. The electromagnetic potential appears as the metric tensorconnecting the fifth dimension with ordinary four-dimensional space-time.The electric charge is proportional to the fifth component of the momentum.Just as all three forces – weak, electromagnetic and strong – are describedby local gauge theories based on S U(3)color � S U(2)L � U(1)Y symmetry, it isknown that quantum gravity can be obtained by gauging the supersymmetrythat connects fermions and bosons. This is referred to as supergravity. How-ever, a self-consistent method of quantizing the gravitational field is not yetknown. One idea that is considered promising is to regard elementary parti-cles not as points but as strings. Superstring theory tries to unify all four forcesby applying the gauge principle to supersymmetric strings in ten-dimensionalspace-time. It is also a generalized Kaluza–Klein theory, extending general rel-ativity to larger dimensions. Whether the extra six-dimensional space is curledup as Kaluza and Klein suggested or whether we live in a brane world float-ing in a vast ten-dimensional space is a hot topic currently being discussed. Apossible way of detecting the extra dimension has been suggested and couldbe tested by the Large Hadron Collider (LHC) experiment.

Page 773: Yorikiyo Nagashima - Archive

748 18 Gauge Theories

18.3Aharonov–Bohm Effect

The gauge potential has a freedom of gauge transformation, which casts doubt onit being an observable. In classical electromagnetism, it is merely a mathematicalentity, useful but of no physical significance. The electromagnetic field F μν, onthe other hand, is gauge invariant and physical. However, in quantum mechanics,in describing the electromagnetic interaction in the Schrödinger equation, we useAμ D (φ/c, A), which suggests the potential plays a fundamental role. In fact, inquantum mechanics, A μ is an observable and plays a more fundamental role thanthe electromagnetic field. Its role was suggested in the geometrical interpretationof the gauge fields, but is most clearly demonstrated by the Aharonov–Bohm ef-fect [8].5) It is the effect that the gauge potential causes in a well-known double slitexperiment with electrons (see Fig. 18.8).

We first describe the electron wave’s behavior without an electromagnetic field.The wave function at point R can be obtained from that at P by the following pro-cedure. Starting from P, electrons reach R either through slit Q or through slit S.Denote the two paths as C1, C2, the two wave functions at R as ψ1, ψ2 and thephases that ψ1, ψ2 have acquired as α1, α2. The particle distribution at R is givenby

jψj2 D jψ1 C ψ2j2 D jψ2j

2 j1C expf�i(α1 � α2)gjψ1/ψ2jj2 (18.35)

From the expression, one sees that the absolute value of the phase is not an observ-able but its difference is. Next, we place a confined solenoidal magnetic field (per-pendicular to the plane) at a place far away from the electron path. If the solenoidis made slim and long enough that no magnetic field leaks out, the electrons neverfeel the field. Then classically the solenoid should have no effect on the electronpath. However, observation shows that the interference pattern changes when thefield is turned on. The effect is referred to as the Aharonov–Bohm effect.

The reason is as follows. Even if the field strength vanishes, the potential neednot. The condition is B D r � A(x ) D 0, which restricts the allowed functionalform of the potential outside the solenoid as

A(x ) D1qr f (x ) (18.36a)

Turning on the magnetic field produces the gauge potential, which transforms thewave function to

ψ ! e�i f i (x )ψ i (x ) (18.36b)

Here, we attached suffix i to differentiate phases picked up by electrons in thetwo paths. One should note that this is the requirement of the gauge principle.

5) It was first discovered by Ehrenberg and Siday in 1949 [127] and rediscovered by Aharonov andBohm.

Page 774: Yorikiyo Nagashima - Archive

18.3 Aharonov–Bohm Effect 749

B 0

B 0

B (MagneticField)

ElectronSource

(a) (b)

C1P QR

SC2

B

Figure 18.8 (a) Aharonov–Bohm effect. Adouble-slit experiment can demonstrate clear-ly the wave nature of the electron beam. If aconfined solenoidal magnetic field is placedfar away from the electron paths, the interfer-ence pattern changes. Although the magneticfield vanishes outside the solenoid, the poten-

tial does not. This is clear proof of the gaugepotential affecting the electron’s motion. (b)Magnetic flux quantization. When there is asuperconducting ring current, the magneticfield is repelled from the superconductor andthe flux inside the ring is quantized, showinganother role of the gauge potential.

Turning on the magnetic field means the gauge potential is produced by a gaugetransformation that accompanies the corresponding phase transformation in thematter field. Compare Eqs. (18.6) and (18.7). Solving Eq. (18.36a) for f (x ) we canexpress f i as a line integral

f i (x ) D qZ R

P, Ci

A(x ) � dx (18.36c)

Substituting Eqs. (18.36b) and (18.36c) into Eq. (18.35), we have a modified electrondistribution with the magnetic field on:

jψB j2 D jψ2j

2ˇˇ1C exp

��i q

IA � dx � i(α1 � α2)

ˇˇψ1

ψ2

ˇˇˇˇ2 (18.36d)I

A � dx D�Z

C1

ZC2

�A � dx (18.36e)

His a line integral along a closed loop made by C1 and C2. If there is no magnetic

field inside the loop, the integral vanishes and the interference pattern does notchange. But if there is a field somewhere inside the loop, the magnetic flux in theloop does not vanish. Namely,

Φ DZ

Bn dS DI

A � dx ¤ 0 (18.37)

where Stokes’ theorem was used in going from the second to the third equation.Despite the fact that the electron never passes a region where the magnetic fieldexists, it perceives the field and the interference pattern changes. The prediction ofEq. (18.36d) has been experimentally confirmed [303, 306]. The Aharonov–Bohmeffect with an electric field is also observed [377]. Since the electron never sees themagnetic field, it must be the gauge field that is acting on the electron. Thus the

Page 775: Yorikiyo Nagashima - Archive

750 18 Gauge Theories

Aharonov–Bohm effect has clearly established that the gauge potential is real andphysical and that the electromagnetic field Fμν alone is not enough to describethe electromagnetic phenomena. Since Fμν can be derived from the potential bydifferentiation, the potential is a physical object more fundamental than the field.

Note, this shows that the gauge potential itself is not an observable, but its dif-ference is. When it manifests itself as an observable, the gauge invariance is main-tained. Notice that the line integral defined in Eq. (18.37) is gauge invariant if thepotential itself can be varied locally.

Another interesting example related to the Aharonov–Bohm effect is the super-conducting ring current in the presence of a magnetic field [Fig. 18.8]. Owing tothe Meissner effect, the magnetic field cannot exist in the superconductor. How-ever, the electron wave function acquires a phase given by Eq. (18.37) every timeit goes around the ring. The requirement that the wave function be single valuedforces the magnetic flux to be quantized:

Φ D 2π„

2en D 2.067833636 � 10�15 Wb� n , n D 0,˙1,˙2, � � � (18.38)

This is also experimentally confirmed [118, 218] and has wide applications, forexample in the Josephson junction. The factor two in the denominator is relatedto the fact that the supercurrent is produced by condensed Cooper pairs, whichcarry charge �2e.6)

In the geometrical interpretation of the gauge potential, the Aharonov–Bohm ef-fect is easy to understand. The magnetic flux is the total curvature of the closed area.Consider a cone with its apex rounded (see Fig. 18.9a). The round area is where themagnetic field exists and the space (two-dimensional plane in the figure) is curved,but the other area is flat in the sense that the local curvature vanishes. This is eas-ily seen by cutting the cone and spreading it out on a flat plane (Fig. 18.9b). Theside plane of the cone is tangential to the flat plane and local curvature vanisheseverywhere. Both paths ABC and ADE are straight lines in the flat space locally andthe direction keeps its original value. Namely, the electron on the path never expe-riences the magnetic field, but at point A(D E) where the two paths meet, there isa deficit angle. Geometrically, the local curvature always vanishes, but the total cur-vature does not if the loop surrounds the apex area. If the loop does not contain theapex area, the total curvature, hence the deficit angle, vanishes. The electron per-ceives the total curvature, which is equivalent to feeling the electromagnetic force,as the global behavior of the space geometry.

Notice that the gauge potential is nonzero outside the solenoid. As we stated inSect. 18.2.7, the potential can be made to vanish everywhere only when the electro-magnetic field vanishes identically. If it exists somewhere, no matter how far sep-

6) The Cooper pair consists of two electronswith opposite momentum and spin. Theyattract each other through phonon (thequantum of lattice vibration) exchange andmake a bound state, which is the Cooperpair. Normally, thermal motion destroysthe bound states, but below a critical

temperature a great number of Cooper pairsare produced (Bose–Einstein condensation)and act coherently, making a quantum fluidof electrons. It is an example of spontaneoussymmetry breaking, which will be discussedlater in this chapter.

Page 776: Yorikiyo Nagashima - Archive

18.3 Aharonov–Bohm Effect 751

D

CB

A

B

E

(MagneticField)

A

CE

D

C

B

Cut and spread(a) (b)

DeficitAngle

DeficitAngle

Figure 18.9 An example where the effect ofthe gauge potential Aμ is manifest. The elec-tromagnetic field is interpreted as the curva-ture of a hyperspace. A particle passing whereno magnetic field exists (a flat plane) feels its

effect when the space is globally curved likethe surface of a cone. The two particle’s trajec-tories are ABC and ADE, and when they meetat C (D E), their direction is different.

arated, it deforms the space and no Cartesian coordinates can exist. The geodesicline in the curved space looks bent if projected on the Cartesian coordinates, mean-ing the electron perceives the electromagnetic force.

Considering the role of the gauge potential as the connection, which is locallydefined but also contains global information about the entire space, the connec-tion’s controlling power must prevail throughout the whole space. Therefore, itsquantum must have vanishing mass, otherwise its control would not reach far be-yond its Compton wavelength. Thus, the geometrical interpretation of gauge theoryprovides a natural explanation of the vanishing mass of the force quantum.

Problem 18.1

Derive the vector potential outside a solenoid of radius a.

An Experiment to Prove the Aharonov–Bohm Effect The Aharonov–Bohm effecttouches fundamental issues in quantum mechanics. Experimental tests of theAharonov–Bohm effect were carried out immediately after their proposal (for areview see [303]) and confirmed it. However, with the emergence of the gaugeprinciple as a successful axiom underlying all the forces, many subtle theoreticalas well as experimental questions were raised as to the validity of the effect. The-oretically, locality, reality of the gauge potential and the single valuedness of thewave function, etc. were at issue. A major complaint on the experimental side wasthat the effect can be explained by leakage of the magnetic field.

Here, we introduce an experiment to prove the Aharonov–Bohm effect that re-sponded to the criticism [306, 367, 368]. A solenoid or magnetized whiskers, how-ever long and confined, have an outer field in order to close the flow of the magnet-ic flux. Consequently, the experiment was carried out on a toroidal magnet usingelectron holography. A schematic layout for the hologram formation and its re-construction method are depicted in Fig. 18.10a,b. A specimen (toroidal magnet)

Page 777: Yorikiyo Nagashima - Archive

752 18 Gauge Theories

Referencewave Specimen

Objectivelens

Intermediatelens

Projectionlens

Biprism

Hologram

(a) (b)

Realimage

Mach Zehnder Hologram InterferenceInterferometer

LightA D

B C

Image

Figure 18.10 (a) Schematic diagram of theelectron optical system for hologram forma-tion. An image of the specimen is formed ona photographic film. A biprism is employed tocause the objective wave and reference waveto overlap and form a hologram. (b) Schemat-ic diagram of hologram reconstruction. Col-

limated laser light is split into two beamsand both illuminate the hologram. One (ABCpath) reconstructs the image, which is su-perposed with the other transmitted throughthe ADC path (comparison beam) to producean interference micrograph. Figures adaptedfrom [306].

was illuminated by an electron beam. Its image was magnified and recorded ona photographic film. Only half of the beam was used. The other half was used asa reference beam, which was overlapped with the image by a biprism to form thehologram. To reconstruct the original image from the hologram, two laser beamsseparated by a half-silvered mirror were used. One, which goes along the path ABCin the figure, reconstructs the image, focused on the observation plane. Note thatthe beam splits into three after going through the hologram, one is transmittedstraight forward and the two others are the image beam and its conjugate emittedat angles tilted symmetrically to the transmitted one. The other beam (path ADC)was used as the comparison beam. It also splits into three, only the transmittedbeam is superposed on the observation plane to form the interference images. Twotypes of interference images were used: a contour map and an interferogram. Theformer is obtained by setting the reconstructed and transmitted waves parallel, asillustrated in Fig. 18.10b. The latter, i.e. the interferogram, is obtained by tilting thetwo waves to each other (� 10�3 rad).

Figure 18.11a is a contour map illustrating magnetic flux lines. There, a constantflux flow of h/e lies between two adjacent lines. An interferogram showing wavefronts and a schematic illustration are given in Fig. 18.11b and c. The photographreveals that phase differences exist between two electron beams that have passed

Page 778: Yorikiyo Nagashima - Archive

18.3 Aharonov–Bohm Effect 753

Figure 18.11 (a–c) An electron beam illumi-nated the toroid from above. (a) A contourmap showing magnetic flux lines. (b) Inter-ference pattern showing the retarded phaseof the electrons going through the hole. (c)Illustration of the wave front. (d) Interference

pattern made with a Permalloy toroid wrappedwith superconducting Nb to prevent magneticflux leakage. The bridge stretching upward is athermal contact to control the temperature ofthe toroid [367, 368].

through the inner and outer spaces of a toroidal magnet, where there are no mag-netic fields. In addition, from (b), the phase displacement was counted as 5.5λ,agreeing with the theoretical value of 6λ to within 20%. However, this interfero-gram is still taken with part of the electrons going through the magnetic field.

The third picture was taken with a toroid completely shielded by superconduct-ing material. The toroid is made of ferromagnetic Permalloy sandwiched betweentwo SiO2 substrates and surrounded by niobium (Nb). Below the transition tem-perature Tc, the niobium becomes superconducting and repels the magnetic fieldcompletely by the Meissner effect, thus eliminating it inside the toroid. Its pres-ence, moreover, quantizes the magnetic flux Φ D n(h/2e), n D integer. Shieldingmakes the interference stripes invisible on the toroid, and only relative displace-ments of wave fronts modulo the wavelength can be detected. For the supercon-ducting phase (T < Tc), the phase shift is exactly 0 for n even and π for n odd. Thepicture shows a case for n odd (π). One can see the inside stripe is exactly in themiddle of two outside stripes.

Thus, the experiment has established the Aharonov–Bohm effect firmly and con-sequently the physical role of the gauge potential. If it is not accepted, nonlocalityof the magnetic field has to be argued. The existence of quantized flux touches

Page 779: Yorikiyo Nagashima - Archive

754 18 Gauge Theories

another fundamental issue in quantum mechanics, i.e., single valuedness of wavefunctions.

18.4Nonabelian Gauge Theories

18.4.1Isospin Operator

Up to now, we have regarded the internal space as a one-dimensional complexspace. Our arguments in Sect. 18.2 can be extended directly to an n-dimensionalcomplex space. The major difference is that for n � 2 two consecutive coordinatetransformations (C D AB) are different if the order is reversed (C 0 D B A ¤ AB).Because of this, the field strength and its equation become nonlinear, makingmathematical manipulations complicated. Extension of the formalism to n D 2complex space is to introduce S U(2) (isospin) symmetry. In Chap. 9, we introducedisospin of the strong interaction by considering u$ d symmetry or, more precise-ly, by requiring the equation of motion to be invariant for unitary transformations:

ψ ��

ud

�!

�u0d0

�D

"U

#�ud

�(18.39)

Without loss of generality, the unitary matrix can be expressed as

U D exp

�i

3XaD1

αa ta

!D exp(�i α � t) (18.40)

which has the form of phase transformations. In general, S U(n) transformationcan be expressed in this form using n2 � 1 traceless hermitian matrices called thegenerators of the group. When ta are noncommutative, the representation matri-ces U are noncommutative, too, and the group is referred to as nonabelian. Thegroup characteristics are essentially7) determined by their commutation relation(Lie algebra), which is expressed as

[ta , tb ] Dn2�1XcD1

i f abc tc � i f abc tc (18.41)

where f abc are completely antisymmetric constants referred to as structure con-stants of the group. For the case of S U(2), ta D τa/2 holds, where τa are Paulimatrices and

f abc D �i j k S U(2) (18.42)

7) Meaning those of local nature. Those of global nature, such as connectedness, may be different.For instance, S U(2) and SO(3) share the same structure constants but there is 1:2 correspondencebetween the two (see Sect. A.2 in Appendix A).

Page 780: Yorikiyo Nagashima - Archive

18.4 Nonabelian Gauge Theories 755

In an N-dimensional representation space spanned by N-component vectors, Uare N � N unitary matrices. When N D n, the matrices are referred to as thefundamental representation, and if N D n2 � 1 they are referred to as the adjointrepresentation. In the following, we limit our discussion to S U(2), but conclusionsare applicable to any S U(n).

If the Lagrangian is invariant under S U(2) transformation of the form Eq. (18.40),where αa ’s (a D 1 � 3) are constants, Noether’s theorem predicts the existence ofa conserved isospin current for the matter field with isospin I:

J μa (x ) D gψγ μ ta ψ

ψ D

264

ψ1...

ψN

375 (18.43)

where N D 2I C 1 and ψ is a vector in isospin space. The generators ta are N �Nmatrices. The space integral of Eq. (18.43)

OIa D

Zd3x J0

a (x ) (18.44)

defines a conserved operator, which is referred to as isospin. In the following, wesometimes refer to it as isospin charge, meaning it generates the force field just asthe electric charge, which is a conserved observable, generates the electromagneticforce field. Note that a field having I D 0 does not generate a force field just asthe neutral charge does not generate the electric field. The isospins are operatorsto generate isospin rotations (S U(2) phase transformations), i.e.

e i α�OI ψ e�i α�OI D exp[�i α � t]ψ (18.45)

which is the global gauge transformation of S U(2). The formalism we developed inthe previous section can be directly applied. Since S U(2) transformation is math-ematically almost identical to three-dimensional space rotation, our analogy of thegauge transformation to a rotation in the internal space is intuitively more appeal-ing. By gauging the S U(2) transformation, or equivalently replacing the derivativeswith the covariant derivatives, we can obtain the Lagrangian of interacting fields.

18.4.2Gauge Potential

From now on, we consider matter fields belonging to I D 1/2 or the fundamentalrepresentations of S U(2), as is the case for the weak interaction of quarks and lep-tons, though the following discussions are valid for any isospin value. The gaugepotential of S U(2) can be introduced by directly applying Eq. (18.19). Since thereare three (n2 � 1 in general) parameters in Eq. (18.40) for gauge transformation ofthe matter field, we need three gauge potentials to keep the local gauge transfor-mation invariant. This requirement restricts the form of the gauge potential. All

Page 781: Yorikiyo Nagashima - Archive

756 18 Gauge Theories

we have to do is to replace A μ(x ) by

A μ(x )! [Aμ(x )]i j DX

a

A a μ(x )[ta ]i j D Aμ � [t]i j (18.46)

which are 2�2 matrices acting on the two-component matter field ψ. The infinites-imal parallel transport of the matter field is defined as

ψk(x C dx ) D ψ(x )� i gAμ ψ D ψ � i gAμ � tψ

or in components ψk i (x C dx ) D ψ i (x )� i gAμ(x ) � [t]i j ψ j (x )

(18.47)

where g is the coupling constant corresponding to e in QED. The correspondingcovariant derivative can be defined as

[Dμ ]i j D [@μ C i gAμ]i j D δ i j @μ C i gAμ � t i j (18.48)

The gauge transformation of the potential can be obtained by the requirement thatit keeps the equation of motion invariant when the matter field undergoes a gaugetransformation

ψ(x )! ψ0(x ) D U(x )ψ(x ) D exp[�i gα(x ) � t]ψ(x ) (18.49)

where we have extracted the factor g from the phase for later convenience. If wedenote the gauge-transformed covariant derivative of Dμ as D 0

μ , the Lagrangian iskept invariant if

(Dμ ψ)0 D D 0μ ψ0 D D 0

μ(U ψ) D U(Dμ ψ)

) D 0μ D U DμU�1

(18.50)

which defines the transformation property of the gauge field:

A0μ D UAμU�1 �

ig

U@μU�1 (18.51)

Setting U as an infinitesimal transformation (α small), we have

A0a μ D A a μ C f abc αb A c μ C @μ αa(x )C � � � (18.52)

which does not depend on the representation matrices ta . Adopting vector notationin the internal space, we can express the above relation also as

A0μ D Aμ C α � Aμ C @μ α(x )C � � �

(α � Aμ)a DXb,c

f abc αb A c μ(18.53)

Since, in quantized field theory, the gauge potential is a field on an equal foot-ing with matter fields, we refer to the gauge potential as the gauge field here. The

Page 782: Yorikiyo Nagashima - Archive

18.4 Nonabelian Gauge Theories 757

meaning of the equation is that the gauge field is rotated in the internal space,which means that the gauge field can be phase transformed by the isospin opera-tor given in Eq. (18.44). This is possible only when the gauge field carries isospin,hence the gauge field itself is the source of the force. This feature did not existin U(1) gauge symmetry. In other words, while the photon is neutral and cannotgenerate the electromagnetic field by itself, the S U(2) gauge field carries isospincharge and can be the source of the force field. The isospin of the gauge field isI D 1 since there are three components. As the action of the force is representedby coupling to the gauge field in the Lagrangian, the interaction of the S U(2) gaugefield is nonlinear, which is the conspicuous characteristic of nonabelian gauge the-ory.

18.4.3Isospin Force Field and Equation of Motion

The curvature of the isospin hyperspace can be obtained by the same procedure asfor Fμν in QED. But this time the gauge field are matrices and noncommutative,which means the order of parallel transport has to be chosen carefully. Let us movethe matter field in the x D x μ , y D x ν plane and take the same infinitesimallysmall loop as was done for the electromagnetic field (Fig. 18.5).

Omitting variables other than x , y , we have

1. Parallel transport from P! Q:

ψk(x C dx , y ) D [1 � i gAx (x , y )dx ]ψ(x , y ) (18.54a)

2. Parallel transport from Q! R:

ψk(x C dx , y C d y ) D [1� i gAy (x C dx , y )d y ]ψk(x C dx , y ) (18.54b)

Denoting the parallel-transported ψ via route P! Q! R as ψPQRk , we find

ψPQRk D [1 � i gAy (x C dx , y )d y ][1� i gAx (x , y )dx ]ψ(x , y )

' [1 � i gfAy (x , y )d y CAx (x , y )dx C @x A y (x , y )dx d yg

� g2Ay (x , y )Ax (x , y )dx d y ]ψ(x , y ) (18.55)

The transport R ! S ! P can be obtained from the transport P ! S ! R bychanging the sign of the path and the transport P ! S ! R by exchanging x $ yin ψPQR

k in Eq. (18.55). Namely,

IPQRSP

ψk(x , y ) ' ψPQRk � ψPSR

k

D �i g[@xAy � @y Ax C i g(AxAy �Ay Ax )]ψ(x , y )dx d y

(18.56)

Page 783: Yorikiyo Nagashima - Archive

758 18 Gauge Theories

Comparing this with the defining equation of the curvature Eq. (18.21a), we obtainthe field strength

Fμν D @μAν � @νAμ C i g[Aμ, Aν ] D1i g

�Dμ , Dν

(18.57)

The second equality can be proved by actually performing the arithmetic of thecommutator. It is an operator relation, but one sees that it does not contain anyderivative after taking the commutator. For an abelian group [Aμ , Aν ] D 0 andEq. (18.57) reduces to the electromagnetic field.

Problem 18.2

Derive the second equality of Eq. (18.57).

Using

[Aμ , Aν ] D [A a μ ta , A b ν tb ] D i f abc A a μA b ν tc D i(Aμ � Aν) � t (18.58)

Fμν can also be expanded in terms of ta :

Fμν D Fa μν ta D F μν � t (18.59a)

F μν D @μ Aν � @ν Aμ � gAμ � Aν (18.59b)

The gauge transformation of Fμν can be obtained from Eq. (18.57)

F 0μν D

1i g

hD 0

μ , D 0ν

iD

1i g

U�Dμ , Dν

U�1 D UFμνU�1 (18.60)

As Fμν is a matrix, F 0μν ¤ Fμν , i.e. the nonabelian field strength Fμν is not gauge

invariant. When one wants to construct a gauge-invariant Lagrangian, one can use

Tr[FμνF μν ] D12

F μν � F μν (18.61)

To obtain the derivative DλFμν , one can choose the path depicted in Fig. 18.12.The result is

DλFμν D @λFμν C i g[Aλ , Fμν ] D [Dλ , Fμν ] (18.62)

xμ xμ + dxμ

xν + dxν

A

B

dxλ

Figure 18.12 Integral path for calculating covariant derivatives Dλ Fμν .

Page 784: Yorikiyo Nagashima - Archive

18.4 Nonabelian Gauge Theories 759

Then the equivalent of the Maxwell equation becomes

DμF μν D @μF μν C i g[Aμ, F μν ] D gJν

Jν D j νa ta D j ν � t

(18.63)

or, separating the representation matrix,

@μ F μν � gAμ � F μν D g j ν (18.64)

which can also be obtained from the Lagrangian

LSU(2) D ψ(i γ μDμ � m)ψ �14

F μν � F μν (18.65)

Problem 18.3

Derive Eq. (18.62).

Problem 18.4

Derive Eq. (18.64) from the Lagrangian in Eq. (18.65).

Substituting Eq. (18.62) in the Jacobi identity

[Dμ , [Dν , Dλ ]]C [Dν , [Dλ , Dμ ]]C [Dλ , [Dμ , Dν ]] D 0 (18.66)

we have the S U(2) equivalent of the sourceless Maxwell equation (3.49b):

DμFνλ C DνFλμ C DλFμν D 0 (18.67)

Thus, we have derived a S U(2) gauge-invariant Lagrangian and the associatedequations of motion. The Lagrangian Eq. (18.65) has exactly the same form as thatof QED, except that the matter field ψ is an I D 1/2 doublet and the gauge fieldAμ is 2� 2 matrices acting on ψ and an I D 1 triplet. The major difference lies inthat F μν contains nonlinear (quadratic) term and hence the Lagrangian containsup to quartic terms in the gauge fields.

So far we have discussed only S U(2) gauge theory. Extension to general S U(n)is straightforward. The only difference is that there exist n2 � 1 generators andthe same number of gauge fields and that the structure constant �abc has to bereplaced by f abc. We summarize the equations of motion for the I D 1/2 matterfield and the I D 1 gauge field, which respect S U(2) gauge invariance. Expectingapplications to the weak interaction, we denote the S U(2) gauge field as W μ

a andits coupling strength as gW .

Page 785: Yorikiyo Nagashima - Archive

760 18 Gauge Theories

SU(2) Lagrangian

LSU(2) D ψh

i γ μ@μ C i gW W μ �

τ2

�� m

iψ �

14

F μν � F μν

ψ D�

ψ1

ψ2

�, W μ D

�W μ

1 , W μ2 , W μ

3

�(18.68)

F μνa D @μ W ν

a � @ν W μa � gW �abc W μ

b W νch τa

2,

τb

2

iD i�abc

τ c

2(18.69)

The interaction Lagrangian can alternatively be expressed in terms of chargedand neutral bosons as

�LINT D gW ψγ μ W μ �τ2

ψ

DgWp

2ψγ μ

hτC W C

μ C τ� W �μ

iψ C gW ψγ μ

h τ3

2W 0

μ

W � D1p

2(W1 ˙ i W2) , W 0 D W3 (18.70)

The equations of motion are

(i γ μ@μ � m)ψ D gW W � (τ/2)ψ (18.71a)

@μ F μν � g W μ � F μν D gW ψγ ν(τ/2)ψ (18.71b)

18.5QCD

QCD is an abbreviation for quantum chromodynamics, meaning the dynamics ofcolor charge, and obeys the same mathematical framework as QED. The QCD La-grangian can be obtained from that of S U(2) in Eq. (18.68) by simple replacements:

�ψ1

ψ2

�!

24R

GB

35 ,

3XaD1

Wa μτa

2!

8XaD1

A a μλa

2,

�abc ! f abc , gW ! gS

(18.72)

where R , G, B are Dirac fields having color R G B , A a μ eight gluon vector fields, λa

Gell-Mann’s 3 � 3 matrices and the structure constant �i j k is replaced by f i j k (seeAppendix G). Corresponding to the eight generators, there are eight gauge fields

Page 786: Yorikiyo Nagashima - Archive

18.5 QCD 761

A a μ (a D 1, . . . , 8) and the covariant derivative is defined as

Dμ D @μ C i gS Aμ � t D @μ C i gS

8XaD1

A a μλa

2(18.73)

The QCD Lagrangian is given by Eq. (18.68) with the modifications describedabove.

Who invented QCD is not clear. Many people suggested the existence of color de-grees of freedom, including Gell-Mann. In 1964–1965, dynamical models to ascribethe source of the strong force to color were proposed [185, 193], but their modelshad a somewhat complicated symmetry structure. A turning point was the discov-ery of asymptotic freedom in 1973 by Gross, Wilczek and Politzer [189, 190, 324].Asymptotic freedom means that the effective quark–gluon coupling constant α s D

g2S /4π is a decreasing function of the momentum transfer Q2, or more specifically

behaves like (ln Q2)�1. This means limQ2!1 α s(Q2) ! 0, namely a perturbativeapproach can be used in the high-energy region. This provided a theoretical justi-fication of the parton model and a means to improve it. As a result, QCD surfacedas the sole candidate for the strong interaction. Asymptotic freedom is a conse-quence of the nonlinear gluon coupling. Color confinement is considered anotherconsequence. Investigations using lattice QCD8) are suggestive of it.

Since QCD is a field theory very similar to QED except for the color degree offreedom and nonlinear coupling of the gluon field, the calculation of perturbationseries, i.e. that of Feynman diagrams in the lowest order, tree approximation, isalmost identical to that of QED except for color counting.9) When perturbativecorrections are made to the parton model, such as emission of gluons from thequark, the structure functions develop as a function of Q2, i.e. a shift of the partonmodel from scaling at large Q2 [ > 20 (GeV/c)2] is apparent. Comparison of thosepredictions with precision data from electron, muon and neutrino deep inelasticscattering, as well as phenomena in e�eC, established the validity of QCD in thelate 1970s.

Hadron–hadron interactions can also be interpreted by the parton model[Eq. (17.150)]. QCD gave its justification and provided a recipe for how to cal-culate higher order corrections. A notable feature of quarks and gluons is that they

8) Lattice QCD is a technique to calculate valuesof various dynamical variables numerically.Four-dimensional space-time is dividedinto a lattice of interval a, breaking Lorentzinvariance but in such a way as to preservethe gauge invariance. In the limit a! 0, theLorentz invariance is recovered, but at themoment the size is limited by computingpower. Lattice QCD provides a powerfulmethod to treat the nonperturbative aspect ofQCD.

9) If one formulates QCD in the covariantgauge, longitudinal as well as scalar gluonsare quantized, too. They are unphysicaland should not appear in the observables.

In QED, they compensate each otherinternally and never appear as real photons.In QCD, because of its nonabelian nature,the longitudinal and scalar gluons do notcompensate each other exactly. A ghost field,a scalar but obeying Fermi–Dirac statistics,has to be introduced to cancel unphysicalgluons. The ghost is not a physical particlebut an artifact arising from adopting thecovariant gauge. If one works in the axial(A3 D 0) or temporal (A0 D 0) gauge, onedoes not need the ghost, but the calculationis generally easier in the covariant gauge. SeeSect. 11.7 for the existence of the ghost field.

Page 787: Yorikiyo Nagashima - Archive

762 18 Gauge Theories

are confined inside hadrons and at high energies only observed as jets. By observ-ing jet phenomena, it became possible to treat the interactions between quarks andgluons at high energies quantitatively with high precision.

On the other hand, at small Q2 . 1 (GeV/c)2 asymptotic freedom does not applyand a nonperturbative treatment is necessary. Lattice QCD provides a way to cal-culate hadron masses, decay constants and other dynamical variables in which anonperturbative treatment is necessary. Also chiral perturbation theory was devel-oped based on QCD; it handles many low-energy phenomena, including decays ofheavy quarks (c, b, t). QCD is now a standard tool in nuclear physics, too.

As a theory that can reproduce various aspects of hadronic phenomena and thatcan calculate strong force corrections to electroweak theory, QCD, in parallel withGWS electroweak theory, has established itself as the Standard Model of particlephysics. It has also developed new phenomena of its own, including the instantoneffect and the strong CP problem with its associated prediction of axions and theirpossible effect on observables.

18.5.1Asymptotic Freedom

In Sect. 8.1.5, we calculated the QED coupling constant as a function of Q2 inEq. (8.63), which is also referred to as the running coupling constant when itsdependence on Q2 is emphasized:

α(Q2) Dα0

1C Πγ (Q2)

Πγ (Q2) D Πγ (0)C OΠγ (Q2)

OΠγ (0) D 0 , OΠγ (Q2) D �α0

3πln�Q2/m2

e

�Q2 m2

e (18.74)

where the coupling α0 D e20/4π, which is defined in terms of the bare charge e0,

is to be considered as a mere parameter. Q2 D �q2 is the 4-momentum squaredof the virtual (space-like) photon and me is the electron mass. Πγ (0) is a divergentquantity and has to be treated with the renormalization recipe. However, OΠγ (Q2)is finite and physically meaningful. Expression (18.74) is obtained by summing achain of vacuum polarization diagrams (a)–(e), which are repetitions of the one-loop contribution. Expressed in a somewhat different form

1α(Q2)

D1

α0� B(Q2) , B(Q2) D �

1α0

[Πγ (0)C OΠγ (Q2)] (18.75)

The renormalization procedure replaces the divergent term by an observable, thecoupling constant α(0) measured at infinite distance (Q2 D 0). Namely, by

1α(0)

D1α0� B(0) , α(0) D 1/137 . � � � (18.76)

Page 788: Yorikiyo Nagashima - Archive

18.5 QCD 763

(a) (b) (c) (d) (e)

Figure 18.13 QED: Diagrams that are collected to calculate the running coupling constant:(a) is a symbolic expression for the sum of (b)–(e). (a) is called the leading log approximation(LLA).

Substituting Eq. (18.76) in Eq. (18.75), we have

1α(Q2)

D1

α(0)� [B(Q2)� B(0)] D

1α(0)

� (Q2)

(Q2) D1

3πln

Q2

m2e

(18.77)

We now have a divergent-free (renormalized) expression for an effective couplingconstant in terms of an observable quantity:

α(Q2) Dα(0)

1� α(0)3π ln Q2

m2e

(18.78)

The above expression for the function was obtained as the first-order perturba-tion solution, picking up only the dominant ln Q2 term in the limit Q2 m2

e .However, the coupling constant obtained from the above expression actually con-tains the series of higher order corrections depicted in Fig. 18.13, as can be seen byexpanding Eq. (18.78) in powers of α(0), and it is called the leading log approxima-tion (LLA).

In QCD, we cannot subtract the coupling constant at Q2 D 0, as it is infinitebecause of confinement. Instead, we choose some reference value Q2 D μ2 todefine the coupling constant. Denoting the QCD coupling constant as α s(Q2), wecan write

1α s(Q2)

D1

α s(μ2)� [B(Q2) � B(μ2)] D

1α s(μ2)

� (Q2) (18.79)

Diagrams that contribute to the function in the lowest order or LLA approxi-mation to αS (Q2) are given in Fig. 18.14. The function is expressed as10)

(Q2) D �0

4πln

Q2

μ2 (18.80a)

0 D �

�23

nf C 5 � 16�

(18.80b)

10) The contribution of each diagram is gauge dependent. The physical interpretation is most clear inthe physical gauge, where only two transverse gluons are quantized and the longitudinal gluonrepresents simply an instantaneous Coulomb potential.

Page 789: Yorikiyo Nagashima - Archive

764 18 Gauge Theories

(a) (b) (c)

f

f

gT

gT

gL

gT

Figure 18.14 QCD: Diagrams that contributeto the quark–gluon running coupling con-stant. (a) is common to QED and QCD.(b,c) Gluon self-coupling diagrams. (b) Trans-

verse gluon pair. (c) Longitudinal gluon andtransverse gluon pair. Repetition of types (a)–(c) are all added in LLA calculation.

The first term is the contribution of the fermion loop and nf is the number ofquark flavors.11) The factor (C5) in the second term in Eq. (18.80b) originates frompair creation of gluons having transverse polarizations. The third term (�16) isa contribution of one transverse gluon created together with a longitudinal gluon(Fig. 18.14c). The reason for both the first and second terms being positive is relatedto unitarity [Sect. (9.1.2)], i.e. Im(T f i ) �

Pn jTni j

2. The amplitudes of the diagramsin Fig. 18.14 are proportional to σ(qq ! f f , gg) for Q2 D �q2 > 0, as canbe seen by cutting them across the loop. Namely, fermion pairs and transversegluon pairs are physical and can be produced if energy-momentum conservationis satisfied. However, the longitudinal gluons are unphysical and do not contributeto the cross section. Therefore the function can be negative, and in fact it is largeand negative. One can see that as long as (2/3) nf < 11, the function is negative( < 0). The QCD running coupling constant is now expressed as

α s(Q2) Dα s(μ2)

1C α s (μ2)4π

�11 � 2

3 nf�

ln Q2

μ2

(18.81)

which approaches zero as Q2 ! 1 (asymptotic freedom). Asymptotic freedomjustifies the use of perturbation theory at large Q2 (phenomenologically at Q &1 GeV/c) and the parton model, which treats quarks as point particles. The differ-ence between QED an QCD is illustrated in Fig. 18.15. It is due to the color degreeof freedom and nonlinear gluon couplings in QCD. The vacuum polarization effectdue to physical fermions and gluons serves as a screening effect (similar to dielec-tricity in electromagnetism), but the effect of the unphysical gluon loop is oppositeand acts as antiscreening. Note that the QED coupling constant is running, too,and when Q2 is sufficiently large, the denominator of Eq. (18.78) could vanish for-mally, but if one calculates the value of Q necessary to achieve this, it is absurdlylarge and not attainable experimentally. In practice, the variation of α(Q2) is verysmall in the range of Q2 achievable today.

11) Note that in QED nf D 1 and 0 D �2/3, which agrees with Eq. (18.74) if we take into account

the normalization difference of QCD, where what corresponds to e in QCD is actually g/p

2 forhistorical reasons. (See the second line of Eq. (18.70) for why g/

p2 is used.) nf is the effective

number of quark flavors (not always 6), whose mass satisfies the condition m f < Q.

Page 790: Yorikiyo Nagashima - Archive

18.5 QCD 765

Screening Anti screening

e+ qG qR

RGRB BG

RBBB

BG

RGGGGRBG

RG GBBBBGRB

BR

+

qR qB qB qG qG

Test Charge

Test Color Charge (a)

(c)

(b)

Source Source

r0 1 fm~α ~ 1/137

Confinement Barrier

αs

α

Cou

plin

g St

reng

th

Q mπ

Asymptotic Region

Q > 1 GeV/c~ Q 0 ~

αs > 1~

~

r =

Figure 18.15 (a,b) Pictorial illustration ofscreening and antiscreening in QED/QCD. (a)The strength of QED coupling constant seenby a test charge decreases as it goes awayfrom the source (dielectric effect of the vac-uum). (b) The strength of α s seen by a testcolor charge (another quark) increases due tothe antiscreening effect of gluon self-coupling

and becomes the confining potential. (c) α,the QED coupling constant seen by a testcharge, is finite at r D 1 (Q2 D 0) and in-creases at shorter distances as the screeningis weaker. α s is meaningless at large distancesr & 10�13 cm because of confinement. It issmall at short distances.

On the other hand, the variation of α s(Q2) within the technically achievablerange is sizeable. Therefore, the treatment of the running coupling constant nor-malized at an arbitrary value of μ2 takes a bit of getting used to. The observablequantities should not depend on which μ one chooses. The requirement takes theform of a renormalization group equation and constrains the behavior of the cou-pling constant. Here, we only show that regardless of μ2 choice we can choose a

Page 791: Yorikiyo Nagashima - Archive

766 18 Gauge Theories

0

0.1

0.2

0.3

1 10 102

μ GeV

αs(μ

)

Figure 18.16 Measured α s as a function of scale μ; μ corresponds to Q in the text [311].

universal constant that is the fundamental constant of QCD. Equation (18.79) withEq. (18.80) substituted gives

1α s(Q2)

�0

4πln Q2 D

1α s(μ2)

�0

4πln μ2 (18.82)

The lhs is a function of Q2 and the rhs of μ2 only. Consequently they are constantand we equate them with �(0/4π) ln Λ2

QCD:

1α s(Q2)

�0

4πln Q2 D �

0

4πln Λ2

QCD

) α s(Q2) D4π

0 ln Q2

Λ2QCD

D12π

(33� 2nf) ln Q2

Λ2QCD

(18.83)

Figure 18.16 shows observed values of α s at various Q2 and indeed confirms thelogarithmic dependence of the coupling constant. Note, however, that what is ac-tually measured is α(Q2) and ΛQCDhas to be calculated from it. The experimen-tal value of ΛQCD is 150–300 MeV. In principle, it should not depend on Q2, butEq. (18.83) is an approximation using one-loop diagrams.12) There still is greatuncertainty as to the true value of ΛQCD.

It is interesting to note that Eq. (18.83) becomes infinite at Q D ΛQCD, whichis on the order of 0.5 fm D 0.5 � 10�13 cm, roughly equal to half the hadron size.As the formula is valid only in the asymptotic region, Q has to be at least largerthan � 1 GeV/c, where α s(1 GeV/c) � 0.5 for the formula (18.83) to be valid. It is

12) For higher order expressions of α s (Q2) or ΛQCD see [311].

Page 792: Yorikiyo Nagashima - Archive

18.5 QCD 767

conventional to express experimental values of α s converted to that at Q D mZ D

91 GeV/c. The experimental value is [311]

α s(mZ ) D 0.1170 ˙ 0.0012 (18.84)

18.5.2Confinement

Confinement is also considered to be a result of nonlinear gluon coupling. It is notrigorously proven yet, although lattice QCD calculations are indicative of it. Qualita-tively, the reasoning for confinement is well understood. The argument closely fol-lows that of the Meissner effect in superconductivity. It is a perfect diamagnetisminduced by a Bose–Einstein condensate of Cooper pairs. They are bound states ofan electron pair with opposite momentum and spin orientation attracted to eachother through lattice vibrations (phonons). The attractive force, normally hiddenby thermal motion, manifests itself at low temperature. When the magnetic fieldtries to penetrate any medium, eddy currents made by freely moving electrons areproduced to compensate it, a familiar phenomenon in electromagnetism that isreferred to as Lent’s effect. Normally the effect is small, but in the superconductor,the compensation works maximally and shows perfect diamagnetism. In the lan-guage of field theory, the photon has acquired a mass and the force becomes shortranged.

The invasion of the magnetic flux into the superconductor is limited by theCompton wavelength of the massive photon. If the strength of the magnetic fieldexceeds some critical value Bc, the magnetic energy can destroy the Cooper pairsand melt the condensate, and the superconductor is converted to a normal con-ductor. This is referred to as a type I superconductor (Fig. 18.17a). In a type IIsuperconductor, the magnetic field can partially break the superconductivity forthe field range Bc1 < B < Bc2, and flux tubes that penetrate the superconductorare created (Fig. 18.17b). Within the narrow confined region of the tube, B > Bc2

and the medium is normal conducting. Outside the tube, it is superconducting andthe magnetic flux is repelled from it by supercurrent circulating around the tubes.A simple explanation of superconductivity by Ginzburg and Landau is given in theboxed paragraph on p. 769Confinementequation.18.85.

Now imagine a colored quark in a medium in a sort of superconducting state,but with the roles of the magnetic and electric fields exchanged. The color electricfield is created around the quark. In its vicinity, the field strength is strong enoughthat it melts the superconductivity of the medium, and the flux expands isotopically(Fig. 18.17c). The flux density decreases in proportion to the inverse of the distancesquared and the field strength is of Coulomb type, just like the electric field inQED. At long distances, the field is not strong enough to destroy the superconduct-ing state. As the total flux must be conserved unless it is absorbed by (an)otheranticolored quark(s), the requirement of minimum energy state constrains the fluxto be confined in a narrow tube with constant cross-sectional area, keeping its fieldstrength above the critical value. This penetrates the medium (Fig. 18.17d) just like

Page 793: Yorikiyo Nagashima - Archive

768 18 Gauge Theories

(a) (b)

(c) (d)

q qr

q q

E

Figure 18.17 Meissner effect. (a) Type I super-conductor. The magnetic flux is repelled fromthe superconductor for B < Bc. (b) Type IIsuperconductor. For Bc1 < B < Bc2 the mag-netic flux is squeezed and flux tubes can pene-trate the superconductor. Inside the tubes it is

a normal conductor. (c) The color electric fieldflux expands uniformly in the vicinity of thecolor charge and the force is Coulomb-like. (d)For large r, the flux is squeezed to form a tubewith constant cross section. The flux densityand the force strength stay constant.

-4

-3

-2

-1

0

1

2

3

0.5 1 1.5 2 2.5 3

β = 6.0β = 6.2β = 6.4Cornell

r/r0

[V(r

) – V

(r0)]

r 0

Figure 18.18 Confinement potential calculated by lattice QCD. The Cornell potential is given byV(r) D σr � e/ r, (σ � 0.1 GeV � 10�13 cm, e � 0.7 GeV/10�13 cm) normalized to V(r0) D 0. D 2N/g2 for S U(N) [39].

Page 794: Yorikiyo Nagashima - Archive

18.5 QCD 769

the magnetic flux tube in the type II superconductor. As the flux per unit area inthe tube is kept constant, the strength of the force is also kept constant. Its accu-mulated energy grows in proportion to the tube’s length, in other words, it behaveslike a stretched string with constant tension. The tube is terminated by another an-ticolor charge or the antiquark. Thus the potential energy between the two quarksis expressed by a combination of the Coulomb and linear potentials:

V(r) �arC br (18.85)

It qualitatively reproduces the confining potential introduced to explain charmoni-um levels in Eq. (14.60). Figure 18.18 shows an example of a lattice QCD calculationof the potential.

Ginzburg–Landau Theory of SuperconductivitySuperconductivity is characterized by its permanent electric current and per-fect diamagnetism (Meissner effect [273]). When we write e� (D 2e) andm� (D 2m) for the electric charge and mass, respectively, of the supercur-rent carrier (i.e. Cooper pair), its behavior in the magnetic field is described bythe Ginzburg–Landau free energy [170]

F D Fn C12

(r � A)2 C1

2m�

ˇˇ�r

i� e�A

�ψˇˇ2 C αjψj2 C

2jψj4 ,

(18.86)

α D α0(T � Tc) , > 0 (18.87)

where Fn is the free energy of the normal conductor and ψ is the Cooper pair’swave function, commonly referred to as the order parameter. The equation ofmotion for ψ and A can be obtained by minimizing F, which gives

12m�

�r

i� e�A

�2

ψ C αψ C jψj2ψ D 0 (18.88a)

r � (r � A) D j De�

2m��ψ�(�irψ)C (irψ�)ψ

e�2

m� ψ�ψA

(18.88b)

At T > Tc the free energy has a minimum at jψj D 0. This can be seen bylooking at the potential part of the free energy (i.e. the last two terms), butbelow the critical temperature, T < Tc, the minimum moves to jψj D ψ0 Dp�α/ ¤ 0 (see Fig. 18.23 for the potential shape). This is a Bose–Einstein

condensate state, because as T ! 0 all the Cooper pairs lie in the ground statewith density given by jψ0j

2 D ns . Writing ψ(x ) D ψ0Cφ(x ), we consider onlythe ground state, i.e. φ(x ) D 0. Then the kinetic energy term in Eq. (18.88b)

Page 795: Yorikiyo Nagashima - Archive

770 18 Gauge Theories

can be neglected. Taking a time derivative or rotation of Eq. (18.88b) and usingB D r � A, we obtain

E D �@A@tD

@

@t

�j

λ2

�, ΔB �

1λ2 B D 0 , λ D

sm�

e�2ns(18.89)

λ is referred to as the (London) penetration depth.The first equation gives the supercurrent with perfect conductivity because

any electromagnetic field produces accelerating current and the second theMeissner effect. Equations (18.89) mean that the photon has acquired a massμ D „/λc.

Consider a medium with boundary at x D 0 (see Fig. 18.19). The re-gion (x < 0) is the vacuum in the magnetic field B D (0, B, 0) and that(x > 0) is superconducting. Equations (18.89) have a solution B(x > 0) D(0, B e�x/λ , 0), which connects with B(x < 0). One sees that the magneticfield penetrates only to the penetration depth� λ, which is typically hundredsof atomic distances, a microscopic size. At the boundary, the density of theCooper pair does not rise abruptly, but reaches its bulk value over the coher-ence length defined by � D „/

pm�jαj. This can be seen from an approximate

solution (ψ � e�x/� ) to Eq. (18.88a), which can be obtained by setting A D 0and jψj2 � α. The ratio � D λ/� , referred to as the Ginzburg–Landauparameter, discriminates between type I and type II superconductors:

� � λ/� , type I � < 1/p

2 , type II � > 1/p

2 (18.90)

Vacuum Super-conductor

B Be -x / λ

y

xλξ

|ψ0|2 = ns

Figure 18.19 Meissner effect.

18.6Unified Theory of the Electroweak Interaction

18.6.1SU(2) � U(1) Gauge Theory

In Chap. 15, we learned that the weak interaction is of vector type, that it acts onlyon left-handed particles and that W ˙ couple to isospin current, suggesting W ˙

Page 796: Yorikiyo Nagashima - Archive

18.6 Unified Theory of the Electroweak Interaction 771

are members of an isospin triplet and the existence of a neutral boson W 0. All thispoints to a gauge theory of S U(2) symmetry. But there are two objections to beaddressed. One is that the weak interaction shares part of the isospin current withthe electromagnetic interaction and may not obey strict S U(2) symmetry. The otheris the short-range nature of the interaction, or equivalently the finite mass of W ˙,which obviously contradicts the notion of W ˙ being gauge particles. Deferring themass problem until later, we will first address the fusion of the weak force with theelectromagnetic force.

Since the electromagnetic current consists of isospin and remaining current [seeEqs. (15.98)], we consider the photon as a mixture of two gauge fields, one generat-ed by the isospin and the other by a yet to be determined symmetry. We assume itto be U(1) for simplicity. Then the electric charge consists of isospin and the U(1)hypercharge, which are related to the electric charge by

Q D I3 CY2

(18.91)

which defines the hypercharge of particles. The factor 1/2 is introduced to make theformula the same as the Nishijima–Gell-Mann formula.13) A new ingredient thatdid not exist in the isospin of the strong interaction is that we need to discriminateleft- and right-handed particles as having different isospin and hypercharge. It isnecessary because the isospin operator, the source of the weak force, acts only onleft-handed particles and the hypercharge has to make a necessary adjustment tokeep the charge Q independent of handedness. This means that we regard the left-and right-handed particles as different particles.

Let us consider a doublet consisting of (ν e , e�) for concreteness, though thefollowing arguments are valid for any isospin doublet Ψ , including the quarks. Acharged current made of a left-handed doublet

jCμ D νLγ μ eL D Ψ Lγ μ τCΨL (18.92a)

Ψ D�

νe

�, ΨL D

�νL

eL

�(18.92b)

couples with W ˙, i.e. the weak interaction Lagrangian is assumed to be given by

�LWI Dgw

2Ψ L γ μ(τ � W μ)ΨL

D gw Ψ L γ μ�

1p

2(τC W C

μ C τ� W �μ)C

12

τ0 W 0μ

ΨL (18.93a)

W � D1p

2(W1 ˙ W2) , W 0 D W3 , τ˙ D

12

(τ1 ˙ i τ2) (18.93b)

13) Note, the hypercharge thus defined is different from that of the strong interaction. In stronginteractions, hypercharge is Y D (BC S C C C B � � � )/2, where B is baryon number and S, C, . . .are flavor numbers corresponding to strangeness, charmness, etc. and not sources of force fields.

Page 797: Yorikiyo Nagashima - Archive

772 18 Gauge Theories

where we have included the hypothetical W 0 to make the interaction consis-tent with S U(2) gauge interaction [see Eq. (18.70)]. With the assumption that νR

does not exist,14) eR has no partner and constitutes a singlet. The hypercharge ofνL, eL, eR is determined from Eq. (18.91)

Y(νL) D Y(eL) D �1 , Y(eR) D �2 15) (18.94)

Denoting the gauge field associated with the hypercharge as B μ , the Lagrangiansatisfying independently S U(2) and U(1) symmetries [denoted as S U(2) � U(1)]can be written as

LWI D Ψ L i γ μ Dμ ΨL C eR i γ μ Dμ eR �14

F μν � F μν �14

Bμν B μν (18.95a)

Dμ D 1@μ C i gw W μ � t C i(gB/2)Bμ Y (18.95b)

The factor 1/2 for gB is for later convenience and has no special meaning. SinceΨL and eR have different isospin and hypercharge, the isospin operator t and thehypercharge operator Y have different eigenvalues depending on the operand. Forinstance, t eR D 0, Y ΨL D �ΨL, YeR D �2eR. Notice that the Lagrangian has nomass term. If it does, the left- and right-handed components mix and no uniquequantum numbers can be assigned independently (see arguments in Sect. 4.3.5).In other words, in a world where both left- and right-handed particles respect inde-pendently gauge invariance (chiral invariance), the fermion cannot have mass. Letus close our eyes to massless fermions for a moment and proceed.

Extraction of the W μ part of the Lagrangian Eq. (18.95a) reproduces Eq. (18.93a).Extraction of terms including W 0

μ and Bμ gives

�LNC Dgw

2(νL γ μ νL � eL γ μ eL) W 0

μ

CgB

2

h�(νL γ μ νL C eL γ μ eL) � 2eRγ μ eR

iBμ

D12

�gw W 0

μ � gB Bμ�

(νL γ μ νL)

�12

�gw W 0

μ C gB Bμ�

(eL γ μ eL) � gB Bμ eRγ μ eR

(18.96)

where (�1) and (�2) in front of νLγ μ νL and eR γ μ eR in the first line reflect their hy-percharge. Since both gauge fields B and W 0 couple to the same current, (νLγ μ νL

and eL(R)γ μ eL(R)), they mix through fermion loops (Fig. 18.20). As the neutrino

14) Since the discovery of neutrino oscillation,this assumption is no longer true. However,the following discussion remains valid. Theonly necessary modification is to include νR

in the Lagrangian, which is rather academicunder the present circumstances. Theassignment I(eR) D 0 holds regardless of theneutrino mass.

15) For quarks, (uL, dL) couple with W ,constituting an isodoublet, but uR, dR

are isosinglets because they do not. ThenY(uL) D Y(dL) D 1/3, Y(uR) D 4/3,Y(dR) D �2/3. The same rule applies to(c, s) and (t, b) quarks.

Page 798: Yorikiyo Nagashima - Archive

18.6 Unified Theory of the Electroweak Interaction 773

Bμ W0μ

f

fFigure 18.20 Mixing between B and W 0 occurs when bothfields couple to the same fermion pair.

should not couple to the electromagnetic field, we define the combination that cou-ples to the neutrino as Zμ and its orthogonal combination as the electromagneticfield A μ:

Zμ D1q

g2w C g2

B

�gw W 0

μ � gB Bμ�� cos θW W 0

μ � sin θWBμ (18.97a)

A μ D1q

g2w C g2

B

�gB W 0

μ C gw Bμ�� sin θW W 0

μ C cos θWBμ (18.97b)

which defines the Weinberg angle θW. Inserting Zμ , A μ defined in Eq. (18.97) backin Eq. (18.96), and rewriting their coefficients using Q D I3 C Y/2, we obtain

�LNC D gw sin θW (eLγ μ QeL C eRγ μ QeR) A μ C

qg2

w C g2B

��I3 (νL γ μ νL C eL γ μ eL) � Q sin2 θW (eL γ μ eL C eRγ μ eR)

D QeΨ γ μ A μ Ψ C gz Ψ γ μ Zμ(I3L � Q sin2 θW)Ψ (18.98a)

e D gw sin θW , gz D

qg2

w C g2B D

esin θW cos θW

(18.98b)

where I3L means that it acts only on the left-handed part of the field. The firstterm is the electromagnetic interaction and the second a new weak neutral currentinteraction that couples to Zμ . Because of mixing between S U(2) and U(1), thecurrent that couples to the weak neutral boson Zμ is a mixture of the left-handedneutral current and the electromagnetic current. Hereafter, we restrict the use ofthe word neutral current to that which couples exclusively to Zμ unless otherwisestated.

We have, starting from the chiral world, where left- and right-handed particlesare independent, succeeded in extracting a left–right symmetric current, whichcan be identified as the electromagnetic current. It is a gauge theory based onS U(2) � U(1) and contains the neutral weak gauge fields Z0 as well as the electro-magnetic field. Note, however, that the above formalism is valid only for masslessgauge fields as well as for massless fermions. Adding mass terms by hand makesthe formalism phenomenologically usable in many applications today. Using theabove Lagrangian and equating the W ˙ exchange transition with the four-Fermiinteraction of the weak interaction, we obtain

GFp

2D

g2w

8m2WD

e2

8m2W sin2 θ 2

W

(18.99)

sin2 θW is referred to as the Weinberg angle, and if it can be determined by experi-ments, the W mass would also be determined. A shortcoming of the theory is that

Page 799: Yorikiyo Nagashima - Archive

774 18 Gauge Theories

the mass is put in by hand, thereby destroying the gauge invariance. No reliablehigher order calculations are possible either.

18.6.2Spontaneous Symmetry Breaking

We have learned in the previous section that we could construct a satisfactory theo-ry, at least phenomenologically, by putting the mass terms in by hand. But by doingso, the gauge invariance is broken, making it unrenormalizable, with resultant lossof predictive power. Here we demonstrate that despite apparent breakdown of thesymmetry, there are cases in which the symmetry is preserved deep in the roots ofthe mathematical formalism. We might well say the symmetry is not broken buthidden.

An equation with some symmetry generally has solutions that respect the samesymmetry, but those solutions are not necessarily stable. Consider, for instance,a stick standing upright on a plane. Apply pressure straight down from above(Fig. 18.21). Within the elastic limit it is compressed, but beyond a certain pressureit is bent and stays that way. The solution is supposed to have rotational symmetryaround an axis, but the realized stable solution obviously breaks it. This is a casein which the equation respects the symmetry but its solution does not. A remnantof the symmetry is there in that any direction can be chosen and the realized statehas the same energy regardless of the chosen direction. Namely, the ground statesare infinitely degenerate.

Another example where field theory can be applied is to a ferromagnetic sub-stance. Ferromagnetism is produced as a result of electron spins all being aligned.At high temperature, the direction of the spins is random due to thermal motion,chaos in a sense, but since there is no special direction, the rotational symmetry isrespected. When an external magnetic field is applied, all the spins are aligned andthe rotational symmetry is broken. This is the case that the symmetry is broken byintroducing a symmetry-breaking perturbation by hand in the system. The externalfield is not rotationally symmetric, which is the origin of the symmetry breakingof the realized state. However, even without a symmetry-breaking perturbation, thespins can be aligned. This happens below a critical point, the Curie temperature.Below it, spin–spin interaction, which by itself is rotationally symmetric, wins overthermal turbulence and the spins align themselves (Fig. 18.22a). The chosen di-rection is quite arbitrary, because the states are infinitely degenerate, but once adirection is chosen, all the neighboring spins follow the initiator’s trigger and align

F

Figure 18.21 A stick under pressure. The bent position exhibits aground state that breaks the rotational symmetry and infinitely degen-erate.

Page 800: Yorikiyo Nagashima - Archive

18.6 Unified Theory of the Electroweak Interaction 775

T = 0 T < Tc T > Tc 40 μm

(a) (b) (c)

109°

109° 180°109° 71°

71°

71°

TcT

Mag

netiz

atio

n

Figure 18.22 (a) Magnetization pattern as afunction of temperature. Below the Curie tem-perature (Tc), spins are ordered and magneti-zation appears. (b) Domain patterns observed

on a f110g surface away from a crystal edge.(c) Inferred domain magnetization directions.Figures taken from [309].

along the same direction; the chosen direction is frozen. No local perturbations canchange it because it takes a vast amount of energy (binding energy per spin timesthe number of spins). Such processes are observed commonly in condensed matterphysics. Common features of them are

1. There is some critical temperature Tc.2. A configuration with good symmetry becomes unstable below the critical tem-

perature and changes to another stable state where the original symmetry isbroken.

3. Realized stable states (ground states) are continuously (infinitely) degenerate.

When the process happens, it is called spontaneous symmetry breaking. Thechosen direction need not be the same in one place as in another if they are farapart. Because of this, domains of differing spin direction are formed (Fig. 18.22b).A little man living in such a world would never imagine that the equation thatgoverns the direction of spins respects rotational symmetry. Such a macroscopicmetamorphosis at a critical temperature has been known quite a while as a phasetransition and its formalism is well developed in condensed matter physics. It wasNambu who introduced the idea of spontaneous symmetry breaking into particlephysics. He investigated superconductivity from the viewpoint of gauge symmetrybreakdown and applied it to particle physics [287, 289, 290].

The mathematical formalism of spontaneous symmetry breaking goes like this.Consider a self-interacting complex scalar field whose expectation value is taken tobe the magnetization we talked about. Its Lagrangian has the form16)

16) The Hamiltonian that can be derived from Eq. (18.100) is referred to as the Gibbs free energy andφ is an order parameter in the terminology of condensed matter physics.

Page 801: Yorikiyo Nagashima - Archive

776 18 Gauge Theories

TemperatureU(φ)

φ1φ1= Vφ2= 0φ2

LowU(φ)

|φ|

HighTemperature

(a) (b)

Figure 18.23 (a) At high temperature T > Tc, vacuum is where φ D 0. (b) At low temperatureT < Tc, jφj D 0 state is unstable, but the state with lowest energy is infinitely degenerate.

LG D @μ φ @μ φ � V(φ) (18.100a)

V(φ) D λ�

φ†φ Cμ2

�2

D μ2(φ† φ)C λ(φ†φ)2 C V(0) , (18.100b)

λ > 0 μ2 D a (T � Tc) (18.100c)

The constant term V(0) can be chosen arbitrarily, because we are only dealing withthe energy difference or excitations from the ground state.17) The Lagrangian isinvariant under gauge transformation

φ ! φ0 D e�i α φ , φ† ! φ†0D φ† e i α (18.101)

Note, when jφj is to be interpreted as the magnetization, the rotational symmetryis in real space. Here we use rotational symmetry in the complex internal spacebecause we want to discuss gauge symmetry. From now on we talk about the com-plex scalar field in particle physics and the field of magnetization interchangeably.When we talk about magnetization, the lowest energy state is referred to as theground state.

At high temperature (T > Tc), the potential is of the form depicted in Fig. 18.23aand the lowest energy (the ground state) is realized with jφj D 0 or zero magnetiza-tion. At low temperature (T < Tc), however, μ2 < 0 and the shape of the potentialis like a Mexican hat or the bottom of a wine bottle (Fig. 18.23b). φ D 0 is anunstable solution, and the ground state is at jφj D v/

p2 D

p�μ2/2λ; the magne-

tization is finite. The ground state can be anywhere that satisfies jφj D v/p

2 alonga circle on the bottom of the wine bottle. Putting φ D (φ1 C i φ2)/

p2 we choose

the ground state at

φ1 D v , φ2 D 0 , v D

r�

μ2

λ(18.102)

17) V(0) is considered as the vacuum energy. Its absolute value becomes important if gravity matters.

Page 802: Yorikiyo Nagashima - Archive

18.6 Unified Theory of the Electroweak Interaction 777

As the ground state is the vacuum in field theory, the field φ has a finite vacuumexpectation value:

hφi D h0jφj0i D v/p

2 (18.103)

Since we always measure excited energy (particle creation) from the ground state(vacuum), and the new vacuum is at φ1 D v , we define a new field

φ01 D φ1 � v (18.104)

The new field has the vacuum expectation value

hφ01i D hφ2i D 0 (18.105)

The field described in terms of φ01, φ2 does not respect the original gauge (rota-

tional) symmetry. This is a case of spontaneous symmetry breaking. In field theo-ry, once the vacuum has been chosen, it is fixed, because there are infinitely manydegrees of freedom and no matter how small the microscopic energy needed tochange the configuration, if multiplied by infinity, it is infinite. In the case of con-densed matter physics, the number of degrees of freedom is large but finite (of theorder of the Avogadro number) and the ground state can be changed by heating orcooling the bulk of the matter. The spontaneously broken symmetry in field theoryis due to the vacuum that surrounds us, and we are the little men in the ferro-magnet we talked about. We cannot change our vacuum unless we heat the wholeuniverse! We are in one of the domains. Our whole universe is most likely in onedomain.18)

Note that spontaneous symmetry breaking does not break the symmetry math-ematically. Observed phenomena were simply reexpressed in terms of φ0

1 and φ2.Mathematically, no symmetry-breaking terms are introduced. All the phenomenacan equally be described in terms of original variables that respect the symmetry,though in this case physical interpretation is much harder.

Nambu–Goldstone Boson What does the new field represent? In terms of φ01, φ2,

which we rewrite as φ1, φ2, the Lagrangian is expressed as

LG D12

�@μ φ1@μ φ1 � (�2μ2)φ1

2C 12

�@μ φ2@μ φ2

C

�λv φ1jφj2 C

λ4jφj2

� (18.106)

The first term represents a massive scalar field with massp�2μ2, or the magne-

tization we talked about, the second is a massless scalar field and the third is the

18) The boundaries of domains are known as “defects” and they are commonly observed in condensedmatter physics. There are conjectures that defects may have been formed at the big bang and theirremnants may be observable as monopoles, cosmic strings, etc. If they are observed, the standardcosmic model has to be modified.

Page 803: Yorikiyo Nagashima - Archive

778 18 Gauge Theories

interaction between the two. A massless scalar boson appears if a continuous sym-metry is spontaneously broken with infinitely degenerate ground (vacuum) states,a generalization of Nambu’s proposal, which is referred to as the Goldstone theo-rem [180]. The massless boson is referred to as the Nambu–Goldstone boson. Doessuch a particle exist in a ferromagnet? The answer is yes, and it is called a magnon.

Although changing the chosen state to another is difficult, it is possible to givea small perturbation locally. Then the spin–spin interaction, which was the causeof the spontaneously broken state, transmits the perturbation to neighboring spinsand the perturbation propagates, making a wave (Fig. 18.24). The magnon is a spin

λ2

Figure 18.24 When the magnetization isaligned, a spin wave, a deviation from the or-dered state, appears. In the long-wavelengthlimit, the ordered ground state is recovered.

This means that its quantum, called themagnon, is a massless particle commonlyreferred to as the Nambu–Goldstone boson.

wave. That its long-wavelength limit recovers the static ground state, i.e. ω D 0as λ ! 1 or k ! 0, means the mass of the quantum is zero. Why does suchan extra boson appear? Originally there were two fields to represent a continuoussymmetry. After condensation, all particles (the field’s quanta) are ordered and oc-cupy the same state. There is no need of two fields to describe an ordered state;one is enough. Its quantum has to be massive to provide some stability to the state.The residual field is not constrained and its quantum is massless. One sees inFig. 18.23b that in the φ1 direction it is at the minimum of a potential and takessome effort to move it (excite the field, i.e. create a particle), but φ2 lies on a flatcircle, and needs no force to move it.

The question is: do we have Goldstone bosons as an indication of symmetrybreaking?

18.6.3Higgs Mechanism

In nature, no massless Goldstone bosons are known and there was difficulty inaccepting the notion of spontaneously broken symmetry. However, there is a loop-hole in Goldstone’s theorem and the Goldstone boson does not appear when thesymmetry is associated with a long-range force like that created by the gauge field.In the absence of a long-range force, the wave in the Goldstone mode can prop-agate but the force’s action does not reach beyond a few neighboring particles. Ifthe force is of long range, particles far away are affected simultaneously, and theyact coherently, constituting a longitudinal wave. In a plasma, as a result of theCoulomb force, a longitudinal wave called plasma oscillation is formed. Its action

Page 804: Yorikiyo Nagashima - Archive

18.6 Unified Theory of the Electroweak Interaction 779

screens the Coulomb force, converting it to a short-range force. In the language offield theory, the quantum of the plasma oscillations has mass, namely the photonhas acquired mass and has become a plasmon. Higgs and others [131, 191, 205]showed that when the symmetry is broken in association with the gauge field, theGoldstone boson does not appear but instead the gauge boson acquires mass. Wewill see how.

Mass of the Gauge Boson When a gauge field is introduced, the Lagrangian con-taining the self-interacting scalar field (hereafter referred to as the Higgs field) andthe gauge field is expressed as

LH D (Dμ φ)†(D μ φ) � V(φ)�14

Fμν F μν (18.107a)

Dμ D @μ C i qA μ , V(φ) D μ2(φ† φ)C λ(φ†φ)2 (18.107b)

Let us see what happens after the gauge symmetry is spontaneously broken. It isconvenient to decompose φ not linearly like φ D (φ1C i φ2)/

p2 but exponentially

φ D exp�i(φ2/v )

φ1p

2! exp

�i(φ2/v )

v C φ01p

2(18.108)

Exponential DecompositionMathematically it is always possible to transform into this form as long asthe number of degrees of freedom does not change. Then the difference ofφ2 in the two expressions is a matter of physical interpretation. φ2 in thisform is called a phase field. If expanded in powers of φ2/v , it is the same asthe expansion φ D (v C φ1 C i φ2)/

p2 if higher order terms are neglected.

Namely, as long as the excitation of the field is small, where the particle con-cept of the field is valid anyway, the difference is small. Remember, the fieldquantization is applied to harmonic oscillators or using up to quadratic termsin the Taylor expansion of the potential around the minimum and the restis treated as interaction terms. The difference appears as a global feature ofthe field, which reflects the collective effects of the interaction. For instance,a local plane wave could globally be a part of spherical wave at far distance orcirculating wave whose radius is large. In the spontaneous symmetry break-ing, we are discussing what will happen if we extend our approximation upto the quartic power of the Taylor expansion. In Fig. 18.23b φ2 is constrainedto lie on a circle and has periodicity, hence its name. But locally, as long as itremains in the neighborhood of φ2 D 0, the linear treatment is equally valid.

Page 805: Yorikiyo Nagashima - Archive

780 18 Gauge Theories

Problem 18.5

Show that for φ01, φ2 � v , Eq. (18.108) is equivalent to

φ D1p

2(v C φ0

1 C i φ2) (18.109)

If we rename φ01 as φ1 and replace the gauge field by

Bμ D A μ C1

qv@μ φ2 (18.110)

the Lagrangian takes the form

LH D12

�Dμ φ1D μ φ1 � (�2μ2)φ2

1

� λv φ3

1 �λ4

φ41

�14

FB μν FBμν C

(qv )2

2Bμ B μ (18.111)

FB μν D @μ Bν � @ν Bμ

Problem 18.6

Derive Eq. (18.111).

In this expression, there is no Goldstone boson. Notice, the second line repre-sents a massive vector field [see Eq. (5.22)]. φ2 is absorbed as the third (longitu-dinal) component of the gauge field, which has mass mB D qv . Note that thenumber of degrees of freedom is the same as before. Originally, there were onecomplex scalar field or two real scalar fields and a massless gauge vector boson,which has two degrees of freedom. After symmetry breaking, there are one realscalar field and a massive vector boson, which has three degrees of freedom. It isoften said that the gauge boson became fat by eating a ghost. This is the processthat a component of the Higgs field is converted to the longitudinal component ofthe gauge field to make it massive; it is called the Higgs mechanism.

In order to choose a vacuum from an infinite number of degenerate states,we have used the freedom of gauge transformation Eq. (18.101) as expressed inEq. (18.108)

φ ! e�i α φ() φ !φ1p

2e i(φ2/v ) (18.112)

In other words, the vacuum has been fixed by a special kind of gauge transforma-tion. The gauge thus chosen is commonly referred to as the unitary gauge.

A Hamiltonian that can be extracted from a nonrelativistic version of Eq. (18.107)is a well-known formula in condensed matter physics and is referred to as theGinzburg–Landau free energy for superconductivity (see the boxed paragraph on

Page 806: Yorikiyo Nagashima - Archive

18.6 Unified Theory of the Electroweak Interaction 781

p. 769Confinementequation.18.85). φ denotes the Cooper pair wave function. Be-low the critical temperature, the Cooper pair is mass produced (condensation) andmatter is transformed to the superconducting phase. A superconductive mediumrepels the magnetic field by producing eddy currents to compensate the intrudingmagnetic perfectly (perfect diamagnetism referred to as the Meissner effect) exceptin a thin skin depth. This is due to condensed Cooper pairs acting coherently asa macroscopic quantum fluid. In the language of field theory, the photon (gaugefield) has acquired mass in the superconductor and the electromagnetic force haschanged to a short-range (� skin depth) force with the range given by the Comp-ton wavelength of its mass. In the superconductor, the gauge invariance is brokenspontaneously. Adopting the Higgs mechanism is to consider that the vacuum thatsurrounds us is in the superconductive phase. The temperature of the phase tran-sition is of the same order as the vacuum expectation value v.

What is the meaning of the vacuum expectation value? In magnetism, it means afinite magnetization in the ground state where all the spins are aligned, a situationin which a great number of particles occupy the same state, which is referred toas (Bose–Einstein) condensation in condensed matter physics. When the field hasa finite vacuum expectation value, it means condensation to the ground state. Allthe particles act coherently, or as a macroscopic quantum state. In this state thecharacteristic field behavior is more that of a wave than that of particles. In theHiggs mechanism, the condensate is the Higgs field, an equivalent of a sea ofCooper pairs acting as a superfluid, which screens the force, converting it to ashort-range force.

Fermion Mass We have explained the finite mass of the gauge boson as a resultof spontaneous breakdown of the gauge symmetry. What about the fermion mass?The vanishing of the fermion mass was the result of chiral invariance. We shallshow that the fermion mass can also be generated as a result of spontaneous sym-metry breaking. Let us assume that the Higgs field has an interaction with thefermion field. Adopting a Yukawa-type interaction and denoting the fermion fieldas ψ, we can write the Lagrangian in Lorentz-invariant form

�LYukawa D f ψL ψRφ C h.c. (18.113)

The chiral invariance can be maintained by assiging suitable hypercharge to ψL, ψR

and φ, namely to satisfy �YψL C YψR C Yφ D 0. When the symmetry is brokenspontaneously, the Higgs field acquires vacuum expectation value, i.e. φ ! (v Cφ1)/p

2,

�LYukawa D

�f vp

2C

fp

2φ1

�(ψRψL C ψR ψL)

D m f ψψ Cm f

vψψφ1 , m f D

f vp

2

(18.114)

Page 807: Yorikiyo Nagashima - Archive

782 18 Gauge Theories

Thus we have created the fermion mass term, but as a by-product we have alsoinduced a fermion–Higgs interaction. There are no principles or guides to con-strain fermion–Higgs couplings, hence the fermion mass is an additional arbitraryparameter. This applies also to the mass of the Higgs particle itself (m2

H D �2μ2 D

2λv2), which is proportional to the strength of the self-coupling constant λ and hasto be determined experimentally.

What we have learned so far is summarized as follows. It is possible to generatemass terms in the gauge-invariant Lagrangian consisting of massless fermions andgauge bosons (referred to as the gauge sector). The mass term was generated bythe interaction of the Higgs field with the gauge field as a result of spontaneoussymmetry breaking. Its Lagrangian (Higgs sector) is also gauge invariant, as is seenfrom Eq. (18.107). Extracting the mass term only from the Higgs sector and addingit to the gauge sector violates the gauge invariance. But if we take into account thewhole Higgs sector as well as the mass term, the gauge invariance is preserved. Ifthe Higgs mechanism is right, we have to include the Higgs field in the calculationof higher order terms of gauge particle interactions. Above all, the Higgs particleitself has to exist.

18.6.4Glashow–Weinberg–Salam Electroweak Theory

We are now ready to formulate a gauge-invariant electroweak theory. We discussonly (ν e , e�) doublets as before. Other doublets can be treated similarly.

Higgs Doublet We have succeeded in constructing a gauge-invariant theory withfinite mass of the gauge boson conceptually. What remains to be done is to in-corporate the Higgs mechanism into the weak interaction framework based onS U(2) � U(1). In order to do this, we need at least four Higgs fields. Three woulddisappear, giving mass to W ˙ and Z, and the remaining one would become amassive Higgs field. There are many ways to incorporate four Higgs fields, but thesimplest one, which Weinberg and Salam (WS) adopted, is to assume the existenceof an isospin doublet of a complex scalar field and its charge conjugate. We expressthem as

Φ D�

φCφ0

�, Φ c D �i τ2Φ † D

��φ0 C

φ�

�(18.115)

Then using the Nishijima–Gell-Mann law, we have Y(Φ ), Y(Φ c) D 1, �1. Whenthey acquire vacuum expectation values, we write them as

Φ !

"0

(vC')p2

#, Φ c !

"�

(vC')p2

0

#(18.116)

Page 808: Yorikiyo Nagashima - Archive

18.6 Unified Theory of the Electroweak Interaction 783

The Lorentz-invariant Yukawa interaction respecting S U(2) � U(1) symmetry isgiven by

�Lmass D f e

heR(Φ †ΨL)C (Ψ LΦ )eR

i(18.117)

As the neutrino mass is assumed to be zero, we do not include its mass term here,but if one wants to give mass to the I3 D C1/2 member of the doublets, one usesΦ c instead of Φ .

SU(2) � U(1) The total Lagrangian for (ν e , e�) and W ˙, Z, γ and the Higgsrespecting S U(2) � U(1) symmetry can be expressed as

LGWS D Ψ L i γ μ Dμ ΨL C eR i γ μ Dμ eR �14

F μν � F μν �14

Bμν B μν

C (Dμ Φ )†(D μ Φ )� V(φ)� f e�eR(Φ †ΨL)C (Ψ LΦ )eR

(18.118a)

Dμ D @μ C i gw W μ � t C i(gB/2)Bμ � Y (18.118b)

F μν D @μ W ν � @ν W μ � gw W μ � W ν (18.118c)

Bμν D @μ Bν � @ν Bμ (18.118d)

V(φ) D λ�jΦ j2 C

μ2

�2

(18.118e)

The gauge sector is the first line of the Lagrangian, and the Higgs sector the secondline. We need to rewrite W 0

μ , Bμ in terms of Zμ , A μ. When the Higgs field acquiresa vacuum expectation value, one can obtain the Lagrangian of GWS theory. It isequivalent to Eq. (18.95) derived previously plus the mass term and the Higgs field.

Mass of W and Z The mass of W is already given in Eq. (18.99), but to derivethe mass of Z, we need to know the Weinberg mixing angle and the Higgs fieldvacuum expectation value. They can be obtained by substituting

Φ D

"0vp

2

#(18.119)

into the kinematic energy term (the first term in the second line) of the Higgssector in Eq. (18.118) to give

mW Dgw v

2, mZ D

gz v2D

gz

gwmW D

mW

cos θW(18.120)

The mass of W is also related to the Fermi coupling constant by Eq. (18.99). Fromthis, if one can obtain the value of sin θW experimentally, mW , mZ , v can be cal-

Page 809: Yorikiyo Nagashima - Archive

784 18 Gauge Theories

culated. Using experimentally observed value sin θW D 0.2231 [311]

m2W D

p2e2

8GF sin2 θWD

�37.3 GeV

sin θW

�2

D (80.4 GeV)2 (18.121a)

mZ DmW

cos θWD 91.19 GeV (18.121b)

v D 1/qp

2GF D 250 GeV (18.121c)

Problem 18.7

Derive mW , mZ from Eq. (18.118).

18.6.5Summary of GWS Theory

We have reached an amazing conclusion: that the mass is not an inherent propertyof a particle but one that has been acquired as a result of environmental change.The GWS theory is constructed on the gauge principle based on S U(2) � U(1)symmetry that is spontaneously broken. Requiring the existence of the Higgs field,which is responsible for the symmetry breaking, allowed the finite mass of thegauge bosons as well as that of fermions to be incorporated in the theory withoutbreaking the gauge invariance. It is a unified theory of the electromagnetic andweak interactions19) and hence is referred to as the electroweak theory. Renormal-izability of the theory was proved by ’t Hooft in 1971 [359, 360]. Therefore, it ismathematically closed, self-consistent and has strong predictive power. All dynam-ical phenomena caused by the electroweak force are calculable to arbitrary accuracy,at least in principle. The theory succeeded in reproducing all phenomena knownat that time and so its proof resided in the prediction of new phenomena.

When Was the GWS Model Accepted as the Standard Model?It is interesting to note that such a revolutionary theory was barely noticedupon its publication. According to citation records, the number of citationswas 0 in 1967, 0 in 1968, 0 in 1969, 1 in 1970, 4 in 1971, 64 in 1972, 162in 1973 and 226 in 1974 [102]. This includes Weinberg’s own citations. Whywas such an outstanding paper not noticed? Because it was uncertain if thetheory was renormalizable. Weinberg and Salam guessed that it would be ifthe symmetry was broken spontaneously but left it unsolved. It was clear thatthe sudden increase of the citation is due to ’t Hooft’s paper [359, 360], which

19) It is not unified in the strict sense because S U(2) and U(1) are not subgroups of a bigger group.The two coupling constants of S U(2) and U(1) are connected through Weinberg mixing, but thereare still two coupling constants. It is unified in the sense that we can formulate two forces in aconsistent and unified way.

Page 810: Yorikiyo Nagashima - Archive

18.6 Unified Theory of the Electroweak Interaction 785

provided a proof of renormalizability. Since then, Weinberg’s seminal paperon the subject [383] boasts top citing to this day (6815 as of July 2009), sur-passed only by the Review of Particle Physics, which is a compilation of all theworld data in high-energy physics (see [348]). The first experimental evidencesupporting the GWS model was the discovery of neutral current reactions in1973 [199, 200], but what really established its validity beyond doubt was thediscovery of W and Z in 1983 with exactly the mass predicted [372–375]. Butwhy did it take as long as 16 years to prove the model? I would argue that theyear the model was accepted by the community was 1978. By then, the ma-jority already believed it, but there was one piece of counter evidence againstit. Some atomic parity-violation experiments did not confirm the predictionthat is now history. Convincing evidence to show that the GWS model, not theatomic experiment, was right was presented at ICHEP 78 (International Con-ference for High Energy Physics) by the SLAC group. It was the detection ofparity violation in polarized electron scattering off a deuteron target [326, 327].In addition, a compilation of measured Weinberg angles from a variety of ex-periments were shown to agree with each other [41] and singled the Wein-berg–Salam model out from others. Ample evidence for QCD, including pre-cision experiments on deep inelastic scattering, was also presented at the con-ference. The next year, Glashow, Weinberg and Salam were awarded the Nobelprize.

Page 811: Yorikiyo Nagashima - Archive
Page 812: Yorikiyo Nagashima - Archive

787

19Epilogue

The first experimental evidence for GWS theory were the discovery of neutral cur-rent reactions [199, 200]. The fact that all the neutral current reactions can be re-produced with one universal parameter, the Weinberg angle, convinced people therightness of the theory. But, the definitive evidence was provided by the discoveryof W and Z [372–375] having exactly the right mass as predicted. Since then manyexperiments, for example, SPEAR at SLAC (7.4 GeV, 1972–1990), PETRA at DESY(40 GeV, 1978–1990) and TORISTAN at KEK (60 GeV, 1987–1996) confirmed thepredictions of GWS theory. Above all, LEP at CERN (100 � 200 GeV, 1989–2000),in which the center of mass (CM) energy was set at

ps D mZ D 92 GeV, made a

number of high-precision measurements confirming the higher order predictionsof the electroweak theory. They could be compared with the Lamb shift experimentin QED in establishing the validity of the Standard Model to high precision (see forinstance [184, 365]).

Among hadron colliders, the Sp pS (Super Proton Synchrotron,p

s D 540 !630 GeV, 1979–1990) discovered W ˙ and Z in 1983. The Tevatron, which operat-ed 1983–2009, discovered the top, the last quark in the Standard Model. Both ofthem provided a variety of hadron jet data confirming many predictions of QCD.B factories (PEPII at SLAC, E D 3.1 C 9 GeV, 1998–2008, and KEKB at KEK,E D 3.5 C 8 GeV, 1999 onwards) confirmed that CP-violation data fit well withthe Kobayashi–Maskawa framework. The only major remaining issue in the Stan-dard Model is the discovery of the Higgs particle, which is expected to be foundsoon when the LHC starts operation.1)

All the known data so far, mainly obtained at electron–positron and hadron col-liders, have been consistent with the Standard Model. The only evidence pointingbeyond the Standard Model is neutrino oscillation, but whether this phenomenonis outside of GWS theory is debatable and depends on the exact definition of theStandard Model. This only shows how perfect the Standard Model is, but at thesame time it is a disappointment in the sense that progress towards new physicshas been very slow. Therefore we will close this book by making a few commentson what lies beyond the Standard Model.

1) The LHC is a p–p colliding accelerator atp

s D 14 TeV and luminosity L of 1033–1034 cm�2s�1. Itbegan operation in 2009. More discoveries mentioned later in this chapter are also expected.

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 813: Yorikiyo Nagashima - Archive

788 19 Epilogue

19.1Completing the Picture

Generation Puzzle and Mass Hierarchy Despite its great success, one caveat ofthe Standard Model is that it is powerless in explaining the generation (or family)structure. The three generations of family (u, d, ν e , e�), (c, s, νμ , μ�), (t, b, ντ , τ�)seem to share the same propeties except their mass. The Standard Model cannottell why there are three and only three generations. Their mass is also a mystery.Their mass hierarchical structure shown in Fig. 19.1 suggests a certain pattern,but in the Standard Model all the mass values are parameters to be adjusted. Themass is generated by the Higgs field and is expressed as mi D f i v/

p2 GeV where

v D 1/pp

2GF D 250 GeV is the vacuum expectation value of the Higgs fieldand f i is its coupling strength [see Eqs. (18.114)(18.121c)]. There are no criteria todetermine or constrain the strength of the Higgs coupling hence they are arbitrary.

Higgs Mechanism Another caveat is that the Standard Model does not have muchpredictive power for phenomena involving mass scales above � 1 TeV. This is be-cause it is essentially a low energy phenomenological theory valid below the elec-troweak scale characterized by v D 250 GeV. The Higgs particle with its asso-ciated mass generating mechanism is at the core of the Standard Model, but we

Log

m/e

V10

10

8

6

4

2

0

-2

I II III

e

du

cs

t

b

μτ

Upper limit on mν

GENERATION

ν1ν2

ν3

Figure 19.1 Mass hierarchy of elementaryparticles. The figure suggests a certain pat-tern which is an unsolved problem. Alsonotice, the mass of neutrinos are extremelysmall compared to others. For the neutri-nos, only their mass differences and upper

limits of the absolute values are known [!Eq. (19.6)]. The dashed lines were drawn as-suming m3 � m2 � m1. The origin andhierarchical structure of the mass are out-standing problems of the Standard Model.

Page 814: Yorikiyo Nagashima - Archive

19.2 Beyond the Standard Model 789

have no idea about the dynamical properties of the Higgs particle. Its Lagrangianwas chosen for its simplicity and for theoretical convenience with no observation-al foundations. Unless we discover the Higgs and investigate its properties, thedetailed dynamical structure of the Higgs condensation cannot be clarified. With-out the knowledge we cannot predict interactions above the critical temperaturewhere the phase transition and electroweak splitting take place. Comparing to thesuperconductivity with which the Higgs mechanism can be compared, we havethe Ginzburg–Landau theory, but not the BCS (Bardeen–Cooper–Schriefer) theorywhich clarified the dynamical mechanism of the superconductivity.

To produce the Higgs particle and understand its dynamics is the most urgenttheme in the high energy particle physics. Hopefully, LHC (Large Hadron Colliderat CERN) will give us hints on them.

19.2Beyond the Standard Model

19.2.1Neutrino Oscillation

So far the only firm experimental evidence beyond the Standard Model is neutrinooscillation, discovered in 1998. It is a phenomenon analogous to strangeness os-cillation, where a neutrino of one flavor is converted to another flavor (ν e ! νμ ,νμ ! ντ , etc.) (for a recent review see [42, 87]). It happens if the neutrinos havenonzero mass and mix with each other. It also means there is a right-handed neu-trino.

Two-Body Oscillation The neutrino is distinguished by its flavor (να D ν e , νμ , ντ ).It is identified by its production in association with the corresponding chargedlepton. Namely, the flavor states are the weak force eigenstates identified at thetime of production. In general, they are not necessarily the mass eigenstates (ν i ,i D 1 � 3). If they are different and mix with each other, the flavor eigenstates aresuperpositions of mass eigenstates:

jναi DX

j

Uα j jν j i (19.1)

Then as each eigenstate has different time dependence, the mixing rate changesas time passes inducing appearance of different flavor eigenstates. Assuming thatthe neutrino is stable, and considering the smallness of the neutrino mass, we canexpress the time development as

jνα (t)i DX

j

Uα j jν j ie�i E j t , E j Dq

p 2 C m2j ' p C

m2j

2E(19.2)

For simplicity, we treat the oscillation between two flavors. Then there is only oneindependent mixing matrix element. In terms of a mixing angle θ , the flavor eigen-

Page 815: Yorikiyo Nagashima - Archive

790 19 Epilogue

state can be written as

jν ei D cos θ jν1 > C sin θ jν2i

jνμi D � sin θ jν1 > C cos θ jν2i(19.3)

Therefore, the probability that what was ν e at t D 0 changes to νμ at time t is givenby

P(ν e ! νμI t) D jhνμ jν e(t)ij2 D j sin θ cos θ (1� e�i(E1�E2)t )j2 (19.4a)

' sin2 2θ sin2 Δm2

4EL D sin2 2θ sin2 1.27

Δm2[(eV)2]E [GeV]

L [km] (19.4b)

Δm2 Dˇm2

1 � m22

ˇ, L D c t (19.4c)

These equations show that oscillation occurs only when there is mixing (θ ¤ 0)and also a mass difference exists (Δm2 ¤ 0). The probability that ν e survives isgiven by

P(ν e ! ν e I t) D 1� P(ν e ! νμI t) (19.5)

An optimum condition for the oscillation to be observed is given by Δm2L/E � 1.By careful choice of the value of L/E , a wide range of the mass region can beexplored. The three-flavor-mixing formulae are different, but so far observed valuesdo not change appreciably if analyzed in either way.

Two kinds of oscillation have been observed. One is disappearance of the atmo-spheric neutrino [νμ (νμ) ! ντ (ντ) W ντ not observed], which was detected bycomparing downward-going neutrinos with upward-going neutrinos [357]. The en-ergy of the neutrino E is 1–10 GeV, and the oscillation length L � 10 000 km (theearth’s diameter). The other is that of the solar neutrino, (ν e ! νμ , ντ ) where Eis 1–10 MeV and L � 108 km [349]. The observed mass-squared difference and themixing angle are given by

Δm2atm D 2.5˙ 0.5 � 10�3 eV2 sin2 2θatm � 1.0 (19.6a)

Δm2ˇ D 8.0˙ 0.3 � 10�5 eV2 sin2 2θˇ � 0.86˙ 0.04 (19.6b)

Note that if the three masses are hierarchical (m3 m2 m1), the mass differ-ence is a measure of the neutrino mass itself and mν . 0.05 eV is expected.

Implication As the mass of a particle in the Standard Model is generated by itscoupling with the Higgs, the difference of the mass is ascribed to the difference inthe Higgs coupling constant. The observed mass ranges from ' 171 GeV (mtop)down to 0.01 eV (mν). It is not likely that a single constant/mechanism can gen-erate masses differing by an order of 1013 on the energy scale. Many attempts areunder way to solve the mass hierarchy problem. The hierarchy is vertical (i.e. withina generation) as well as horizontal (i.e. across the generations). The vertical sym-metry is treated in the grand unified theories (GUTs), to be described soon. Some

Page 816: Yorikiyo Nagashima - Archive

19.2 Beyond the Standard Model 791

seek horizontal symmetry. Yet the neutrino mass problem is special because ofits tiny mass as is demonstrated in Fig. 19.1. A promising scenario to explain itis the see-saw mechanism [163, 395], in which the existence of a very heavy right-handed Majorana neutrino makes the mass of the left-handed neutrino light. Themodel requires the neutrino to be a Majorana particle, whose antiparticle is indis-tinguishable from itself, and that the underlying physics is of high-energy origin.This naturally leads to GUTs.

After the discovery of neutrino oscillation, the importance of flavor mixing inthe lepton sector and CP violation in the PMNS (Pontecorvo–Maki–Nakagawa–Sakata) matrix, which has the same meaning as the CKM (Cabibbo–Kobayashi–Maskawa) matrix in the quark sector, was recognized and at present various neutri-no experiments are being planned in many laboratories across the world [42]. Theyare expected to provide complementary knowledge to what can be obtained at thehigh-energy frontier.

19.2.2GUTs: Grand Unified Theories

GUTs were discussed long before the discovery of the neutrino oscillation. The suc-cess of the Standard Model immediately prompted the idea that the same recipethat had worked in unifying electroweak forces might also work if the strong forceis included. There was another motivation to work on unified theories. The Stan-dard Model has over twenty arbitrary parameters, including the three coupling con-stants of the electroweak and strong forces, fourteen masses of the weak bosons,quarks and leptons, three mixing parameters and one CP violation phase of theKobayashi–Maskawa matrix (to be extended to the MNS matrix, which includes sixparameters, including two extra Majorana phases), the number of generations, etc.This is to be contrasted with general relativity, which includes only one parameter,Newton’s gravitational constant. A unified theory should not have so many unde-cided parameters.

Another motivation is theoretical, i.e. the absence of the quantum anomaly. Wehave learned that there are two ways to break the symmetry that exists in the La-grangian: explicit and spontaneous. In the former the symmetry-breaking term isput in by hand, for example by applying a magnetic field in an otherwise rotation-ally symmetric environment. The latter is realized when the ground state of theLagrangian breaks the symmetry. A third kind appears only in quantum mechan-ics. When symmetry is conserved at the classical Lagrangian level, but broken whenquantized, we talk about a quantum anomaly. When the anomaly exists in gaugetheory, which happens in the Standard Model, the gauge symmetry is broken andthe theory becomes nonrenormalizable. This is serious, but when the anomaliesare added over all the members of a family in the Standard Model, they miracu-lously disappear. The vanishing of the anomaly is guaranteed by the fact that the

Page 817: Yorikiyo Nagashima - Archive

792 19 Epilogue

sum of the electric charge in a family adds up to zero:Xi

q i D (qu C qd) � 3(color)C qe C qν e

D

��23

�C

��

13

� � 3C (�1)C (0) D 0

(19.7)

This fact suggests a strong connection between quarks and leptons. GUT modelscan naturally incorporate the mechanism of anomaly cancellation.

Since the electroweak theory is a gauge theory based on S U(2) � U(1) and QCDon S U(3), it is natural to think of a larger group that contains S U(3)�S U(2)�U(1).The first GUT theory was based on S U(5) [166], a minimal extension of the groupthat contains the Standard Model. Extrapolating the running coupling constant ofQCD and others, it is generally agreed that GUT unification is only possible atextremely high energies � 1014–1016 GeV, commonly referred to as the GUT (en-ergy) scale. Major predictions of GUT were the instability of the proton and theexistence of the very heavy magnetic monopole at the GUT scale. The failure todiscover the monopole prompted the inflation model, which later became part ofstandard cosmological theories. Naive GUT models like the one based on simpleS U(5) are probably excluded by proton decay experiments (see for instance [252]),which did not observe the decay mode p ! eCπ0 and set the upper limit of thelifetime as & 1033�34/BR(p ! eπ0) years. But GUTs that take into account super-symmetry predict that the dominant decay mode of the proton is p ! KCν andare still viable models.

19.2.3Supersymmetry

Supersymmetry (SUSY for short) is a symmetry that connects fermions withbosons and considers them as one and the same particle but belonging to differ-ent states. If the symmetry exists, all bosons (or fermions) should have fermionic(bosonic) partners with the same mass, constituting super-multiplets. We knowthere are no such particles. Obviously supersymmetry is broken if it exists. Al-though no hints of supersymmetry are around experimentally, there are at leasttwo strong theoretical motivations for its existence.

One is the naturalness problem in GUTs. GUT models consider the strong andelectroweak forces as united with a common coupling strength. At the GUT ener-gy (� 1016 GeV), the first spontaneous symmetry breaking occurs and the strongand electroweak forces are split. The second symmetry breaking occurs at the elec-troweak scale (� 1/

pGF � a few hundred GeV), where the weak gauge bosons ac-

quire mass and are separated from the electromagnetic force. GUT models containa variety of extra gauge and Higgs bosons (hereafter referred to as GUT particles)associated with the overarching symmetry, some of which have GUT scale mass in-duced by spontaneous symmetry breaking. When one wants to calculate quantumcorrections of some observables, one has to deal with the two energy scales at ev-ery order of the perturbation series, which is referred to as the hierarchy problem.

Page 818: Yorikiyo Nagashima - Archive

19.2 Beyond the Standard Model 793

For the electroweak observables, quantum corrections due to GUT particles withenergy scale � 1016 GeV have to be adjusted to the electroweak scale � 100 GeV.Since quantum corrections to bosonic mass like that of Higgs appear in quadraticform, one has to make a fine tuning of order (1016/102)2 � 1028 at every order ofthe perturbation calculations. This is considered unnatural and referred to as the“naturalness problem”.

Three types of recipes for its remedy have been proposed. One is to consider theelectroweak gauge bosons W ˙, Z as composites of more fundamental fermions;among these the most notable was the technicolor model. The binding energy playsthe role of a built-in cutoff. Form factors describing the penetration effect of gaugebosons at GUT scale can be safely assumed to vanish. However, despite many ef-forts, no viable models have been built and the so-called technicolor models havebeen largely abandoned.

The second is the addition of extra dimensions, which will be discussed later. Thethird is supersymmetry. SUSY requires that for every bosonic loop (i.e. higher or-der) correction, there exists a corresponding fermionic loop and they exactly canceleach other. If the symmetry is broken, it is still under control provided the break-ing scale is of the electroweak scale. All the corrections due to GUT particles arecanceled away. There is one piece of circumstantial evidence in support of super-symmetry. The three coupling constants α (QED), αW [S U(2)] and α s (QCD) arerunning if quantum corrections are taken into account. The quantum correctionsare calculated by including known particles in the loops. Their mass determines atwhich energy scale they have to be included. It was shown that the observed cou-pling constants, if extrapolated to GUT scale, do not meet together, in other words,the three forces do not unify. On the other hand, if supersymmetry is includedsomewhere above the electroweak scale, the number of particles contributing tothe quantum corrections doubles and it was shown that all three coupling con-stants meet together [20, 253]. SusyGUTs (see Fig. 19.2), i.e. GUTs with built-insupersymmetry, are favored over non-SusyGUTs.

The second reason for invoking supersymmetry is the desire to include gravityin unified theories. The symmetry operator changes fermions to bosons or viceversa and hence is a spinor. Amazingly, its commutator is identified as the energy-momentum operator. If it is gauged, i.e. if the operators are made local and includethe corresponding gauge fields, they can accommodate general coordinate trans-formations, which means gravity is included. The quantized gauge theory of super-symmetry is referred to as supergravity. However, supergravity as a mathematicalframework for unification has not been very successful. There is an inherent diver-gence problem associated with gravity being a tensor force. The problem may beavoided if one considers the particle not as a point but a string. This hope triggeredthe idea of superstring theory.

If supersymmetry exists, every known particle should have its SUSY partner withthe same mass but with spin differing by 1/2 (see Table 19.1). Since no such par-ticles are observed, supersymmetry is broken. But if the symmetry is broken at� 1 TeV level, we expect the symmetry to recover at higher energy and that SUSYparticles will be discovered with mass in the few hundred GeV region. Since SUSY

Page 819: Yorikiyo Nagashima - Archive

794 19 Epilogue

1 102 104 106 108 1010 1012 1014 1016

ENERGY (GeV)

10

20

30

40

50

60

0

Rec

ipro

cal C

oupl

ing S

treng

th1/α1

α2

1/

1/

α3 With SUSY

Without SUSY

Figure 19.2 Unification of electroweak andstrong forces. Plotted are the inverse of α1,α2 and α3, which are coupling strengths ofU(1), S U(2) and S U(3) gauge symmetries.The couplings do not meet at a point without

supersymmetry but they do with supersym-metry. The supersymmetry is assumed to bebroken below � 1 TeV, which is the cause ofthe kinks in the figure.

requires the superpartner to belong to the same multiplet as the known particles,the coupling strength is the same as their partners. Therefore, reaction topologiesand their reaction rates are calculable if their mass is known. The design of thedetector at LHC takes this into account.

Table 19.1 Superparticles.

Particle spin Superpartner spin Name

νe , νμ , ντ 1/2 Oνe , Oνμ , Oντ 0 sneutrino

e, μ, τ 1/2 Oe, Oμ, Oτ 0 sleptonu, c, t 1/2 Ou, Oc, Ot 0 squark

d, s, b 1/2 Od , Os, Ob 0 squarkγ 1 Oγ 1/2 photino a

W ˙ 1 OW ˙ 1/2 wino

Z 1 OZ 1/2 zino a

g 1 Og 1/2 gluino

h, H, A b 0 Oh, OH , OA, 1/2 higgsino a

H˙ b 0 OH˙ 1/2 charginoG c 2 OG 3/2 gravitino

a Photino, zino and higgsinos are mixed to make 5 neutralinos �01–�05.b Additional Higgs are required in the supersymmetric model.c Graviton.

Page 820: Yorikiyo Nagashima - Archive

19.2 Beyond the Standard Model 795

In the simplest supersymmetric model, each ordinary particle makes a doubletwith its superpartner with spin differing by 1/2. In many models, SUSY particlesare considered to carry R-parity to distinguish them from the ordinary particles.The left-handed and right-handed quarks and leptons separately make doubletswith their superpartners having spin 0 because conversion of one to the other ismutual and one cannot convert one SUSY particle with s D 0 to qL and qR atthe same time. Because of R-parity the lightest SUSY particle is stable, which isusually assigned to the neutralino or gravitino. It is a prime candidate for darkmatter, which is discussed later.

19.2.4Superstring Model

See [186]. The string as an elementary particle was first proposed as a model forthe hadron. We have learned that the intra-quark forces share many characteristicswith the string, Regge trajectories among them (recall discussions in Sects. 13.7.2and 14.5.3). Above all, it was pointed out that the duality of the Veneziano modelwas well explained by the string model. However, the hadronic string model hadtoo many unwanted particles and it could not explain asymptotic freedom, so it wasabandoned eventually.

When the string model was reconsidered at the Planck energy scale (� 1019 GeV),the unwanted particles at the GeV–TeV scale turned out to include gravitons, andthe model was revived as a candidate for the unified theory of all forces, includinggravity. The string model considers the fundamental particle not as a point par-ticle but a one-dimensional string (or even higher dimensional brane) and eachparticle species is one of its vibrational modes. It is a quantized gauge theorywith supersymmetry incorporated in it, and also an extended general relativity inten-dimensional space-time à la Kaluza–Klein.

The reason for thinking that the string resides in ten-dimensional space-timeis the absence of the quantum anomaly. As we live in four-dimensional space-time, the extra six-dimensional space is either curled up to a small size or ourfour-dimensional brane world floats in a ten-dimensional bulk. The string modeleven has the possibility of determining the space-time structure of the universeuniquely. Theorists became really taken with it and it is often dubbed the theoryof everything (TOE). Many fascinating ideas have been proposed but are unfortu-nately untestable, because the model works essentially at the Planck energy scale.There are very few subjects if any that can be related to low-energy phenomenafor experimental tests. If the only guiding principle is the beauty of the theory,there are too many possibilities and the research field runs the potential danger ofbecoming more like mathematics or philosophy than physics. We certainly needsome observational guidance.

Page 821: Yorikiyo Nagashima - Archive

796 19 Epilogue

19.2.5Extra Dimensions

The idea of extra dimensions has already been mentioned in Sect. 18.2.8 in con-nection with Kaluza–Klein theory. All the superstring models have built-in extradimensions, but here a different perspective on the extra dimensions is presented.

Gravity is different from other forces in many respects. The graviton has spin 2,its strength is extremely weak, and it is closely related to the space-time structure.If we want to treat gravity on an equal footing with other forces, we have to go tothe Planck energy, which is some 1017 orders of magnitude apart in scale. The ideaof extra dimensions [29] offers the possibility of solving this hierarchy problem.If space-time is 4 C D (D D 1, 2, . . .) dimensional, gravity spreads in (3 C D)-dimensional space. Then Gauss’s law tells us that the gravitational force flux (henceits strength) decreases � r�(2CD ). Let us denote the gravitational constant in (4CD)-dimensional space-time as GD � M�2

D . GD is the fundamental gravitationalconstant in (3CD)-dimensional space just like GN D 1/M2

Pl is in three-dimensionalspace. We want to have MD not too different from the electroweak scale, which weconveniently set at 1 TeV. The gravity under consideration should not contradictNewton’s inverse square law. Therefore, we assume the extra D-dimensional spaceis curled up at the scale of R D (MD )�1. Then the distance in the extra dimensionaldirection saturates at r � R and the power of the force propagation decreases to� r�2, recovering the usual Newton’s law. The dimensional argument constrainsthe force as

�1

M2D

m1m2

r2

1(MD R)D

D GNm1m2

r2D

1M2

Pl

m1m2

r2(19.8a)

) R D

M2

Pl

M2CDD

!1/D

'

�MD

1 TeV

��(DC2)/D

� (10)(32/D )�17 cm

D

8<ˆ:

� 0.4 light years D D 1

� 1 mm D D 2

� 5 � 10�7 cm D D 3

� 10�9 cm D D 4

(19.8b)

where m1, m2 are masses of the particles introduced for convenience and whichdisappear from the final expression. Since for r � R, f � r�(2CD ), and for r R ,f � r�2, D D 1 is ruled out. However, Newton’s law has never been tested at or

below the scale of 1 mm, hence D � 2 is allowed. In other words, the fundamentalgravity energy scale being � 1 TeV is not excluded if one allows extra dimensions.Then the hierarchy problem may not exist.

The idea of extra dimensions triggered many models and tests of them havebeen proposed. The values listed above are somewhat more limited now, but thebasic idea remains valid. One may refer to [311], p. 1272 for recent developments.In some models like one we described above, only the graviton lives in the extradimensions and ordinary particles are confined to the four-dimensional Minkowski

Page 822: Yorikiyo Nagashima - Archive

19.2 Beyond the Standard Model 797

space. In other models, gauge particles or all the Standard Model particles residein the extra dimensions, too. Gravitons (denoted as G) are produced in e�C eC !γ (W ˙, Z )CG or hadronic colliders (qC q ! gluonCG ). Since they are emittedin the extra-dimensional space as well as in real space, the signal in the hadroncollider is a monojet + missing energy.

The momentum of the emitted graviton in the extra dimensions appears in theguise of a mass in our space:

E D

vuutp2 C

DXiD1

p 2i C m2 �

qp2 C m2

KK (19.9)

which is referred to as the KK (Kaluza–Klein) tower. Its spectrum is discrete be-cause of the finite space size of the extra dimensions, but the spacing is small(ΔE � „c/R � 10�4 eV for D D 2) and is almost continuous. Therefore, al-though the production rate is small for one KK graviton, the phase space volumefor emitting the KK gravitons is large and the total production rate is at a detectablelevel. The detection of virtual graviton effects and mini black holes has also beendiscussed.

19.2.6Dark Matter

The existence of dark matter has been clarified by astrophysical observations. It isso named because it does not radiate any electromagnetic waves and its existenceis only detected by its gravitational force.

The existence of dark matter was first suggested by Zwicky in 1933 [401] to ex-plain the violent random motion of galaxies in the Coma Cluster and revived byRubin and others in their observations of rotational curves of spiral galaxies [337]in the 1970s. The motion of neutral hydrogens detected as 21.1 cm radio waves (seeFig. 4.1) showed that the rotational velocity of spiral galaxies, including the MilkyWay, stays constant far beyond the boundary of shining stars and unambiguous-ly proved the existence of dark matter (Fig. 19.3). Since then dark matter has alsobeen observed in many other places, including gravitational lensing. On the the-oretical side, the properties of dark matter have been investigated in conjunctionwith cosmic expansion to produce the observed large-scale structure of the galaxydistribution and cosmic microwave background radiation.

Problem 19.1

Show that the rotational velocity goes down �p

M/r if the mass is concentratedat the center and that to make the rotational velocity constant the mass of a galaxyhas to grow � r .

What are the properties of dark matter? It is electrically as well as color-wise neu-tral, which means that the interaction with ordinary matter is weak. Its propertiesdo not fit with known particles in the Standard Model. The best candidate for dark

Page 823: Yorikiyo Nagashima - Archive

798 19 Epilogue

5 10 15 20 25 30RADIAL DISTANCE (kpc)

100

200

300ROTATION CURVE

CUMULATIVEMASS

SURFACE DENSITY

SOUTH PRECEDING

OPTICAL R> 12 kpc

21-cm MAJOR AXIS

VELOCITY (km/s)ROTATIONAL

Figure 19.3 An image of the Andromedagalaxy with its rotation curve (velocity of gasclouds and stars orbiting the center of thegalaxy vs. distance from the center). The con-

stancy (flatness) of the rotation curve beyondthe point where the light ends indicates thepresence of dark matter that holds the galaxytogether. Adapted from [307].

matter is the stable lightest neutral SUSY particle, known as the neutralino, whichis a mixture of the superpartners of the photon, Z and Higgs particle. Another can-didate is the axion, a hypothetical very light (pseudo-)scalar p article introduced toexplain the so-called strong CP problem in QCD [314, 315].2) From cosmologicalargument it is known that the dark matter is cold, i.e. its temperature is low or itexists as a mass cluster but not moving appreciably.

Many experiments to search for dark matter are under way. The idea is to catchit in the halo of the Milky Way galaxy. The solar system, including the earth, isorbiting around the center of the galaxy, which arouses winds of dark matter. Thesignal is a tiny ionization loss liberated by collision of slow dark matter with thedetector material. There is also a hope that it may be created directly in the LHC.

19.2.7Dark Energy

The existence of the cosmological constant in the cosmic evolution equation wassuspected for a while by analysis of large scale structure formations. The first directevidence was given by observation of far away type I supernovae, which exhibitedthe accelerating expansion of the universe [320, 331] (Fig. 19.4).

Just as the Standard Model of particle physics was established in the 1970s, thecosmological Standard Model (referred to as the concordance model) was estab-

2) A term in the QCD Lagrangian, which is the color extension of E � B in QED, violates both P andT invariance. Theoretically it is allowed but phenomenologically it does not exist. The reason isnot understood, which is referred to as the “strong CP problem”.

Page 824: Yorikiyo Nagashima - Archive

19.2 Beyond the Standard Model 799

10.0 0 10.0

8.0

NOW

Contraction

Adapted from Perlmutter, Physics Today (2003)

Critical Expansion

Uniform

Expansion

Accele

rating

Expan

sion

Big BangScal

e (A

vera

ge C

lust

er D

ista

nce)

Time (Billion Years) 13.7

Figure 19.4 History of cosmic expansion, asmeasured by the high-redshift supernovae(black data points). Given energy composi-tions, time evolution curves can be calculatedfrom the Friedman equation, assuming flat

cosmic geometry. Without dark energy, the ac-celeration is always decelerating and the curveis convex upwards. With dark energy, the ex-pansion is decelerating at the beginning butturns to acceleration later [319].

lished in 2003 when the precision data of WMAP3) was analyzed and a consistentview of the dynamical history of the universe was established [242, 351]. Accordingto the concordance model, the energy of our universe is occupied mostly by darkenergy and dark matter (see Fig. 19.5). The age of the universe was determinedto be 13.69 ˙ 0.13 billion years [125]. Another outcome was that the space-timecurvature of the universe was found to be consistent with zero.

What is the dark energy? How do we know that it is there? It can only be provedindirectly by detecting the acceleration of the expanding universe and analysis ofcosmic evolution. We will elaborate a bit on why we believe that dark energy existsby considering the cosmic expansion. It is governed by the Friedman equation

H2 D8π3c2 GN� �

kc2

a2 CΛc2

3(19.10)

where H D Pa/a is the expansion rate of the universe, � D �matterC�radiationC�vacuum

is the energy density, k is the curvature of space-time and Λ is the cosmologicalconstant. The value of H at present time (denoted as H0) is referred as the Hubbleconstant. a is a scale factor and can be any spatial distance, but conveniently onemight think of it as the size of the universe or average intergalaxy cluster distances.

3) The Wilkinson Microwave Anisotropy Probe, which was launched in 2001. It has mapped thecosmic microwave background (CMB) radiation (the oldest light in the universe) and produced thefirst fine-resolution (0.2ı) full-sky map of the microwave sky. Details can be obtained from [292].

Page 825: Yorikiyo Nagashima - Archive

800 19 Epilogue

Figure 19.5 WMAP data reveal that the cos-mic energy content includes only 4.6% atoms,the building blocks of stars and planets. Darkmatter comprises 23% of the universe. Thismatter, different from atoms, does not emitor absorb light. It has only been detected in-

directly by its gravity. 72% of the universe iscomposed of “dark energy”, which producesantigravity. This energy, distinct from darkmatter, is responsible for the present-day ac-celeration of the universal expansion. Credit:NASA / WMAP Science Team [292].

The equation is derived from Einstein’s equation of general relativity. However, itcan be made to appeal to intuitive understanding (albeit vaguely) if one derives itusing only a knowledge of undergraduate mechanics and thermodynamics. ApplyNewton’s familiar equation of motion in a gravitational field

12

mv2 � GmM

rD E (19.11)

to a test particle (mass m) at the periphery (r) of a sphere speeding away fromthe center of the sphere anywhere in the universe. The test particle which can beimagined as a galaxy feels the gravity from the mass inside the sphere but not fromoutside. Convert v ! Pr, M ! (4π/3)(�/c2)r3, enlarge the notion of mass densityto include radiation as well as vacuum energy density, 2E/mc2 ! �k, and onegets the Friedman equation.

Problem 19.2

Show the critical density of the universe is given by

�c D3H2

8πGD 1.88 � 10�29 h2 g/cm3 , h D 0.73 (19.12)

and confirm the value using the observed value of present expansion rate H0 D

73.3 km s�1 Mpc�1, 1 Mpc D 106 pc, 1 pc D 3.262 lightyears. The critical energydensity is defined as that to give k D 0 in the absence of the cosmological con-stant. It is also the density to separate eternal expansion or eventual collapse of theuniverse without Λ.

Page 826: Yorikiyo Nagashima - Archive

19.2 Beyond the Standard Model 801

By taking time derivatives of Eq. (19.10) and applying the first law of thermody-namics

d(�V )C P dV D 0 4) (19.13)

we can obtain the cosmic acceleration equation

RaaD �

4πG3c2

(�C 3P )CΛc2

3(19.14)

The first term in the parentheses is familiar: Newton’s universal attractive force.The pressure (P ) in the second term gives the difference acquired by going to gen-eral relativity. The cosmological constant Λ, if positive, gives the repulsive force.

Einstein’s original motivation for introducing the cosmological constant was tomake the universe stationary because he did not know the universe was expand-ing. The original equation without the cosmological constant was unstable, theuniverse perpetually expanding or shrinking. To make the universe static, he need-ed a counter term to cancel the attractive force. Later, when he heard of Hubble’sdiscovery of the expanding universe, he abandoned it, saying it was his biggestblunder in his life. He found no reason to keep it in the equation. It is interestingto note that today’s physicists take a different attitude toward the unknown. Peopleare beginning to believe that what is mathematically possible is likely to happen, asis exemplified by Gell-Mann’s words: “Everything which is not forbidden is com-pulsory”.

The pressure term could give a repulsive force if it is negative and dominatesthe first term in Eq. (19.14). Indeed, by inspecting Eqs. (19.10) and (19.14) andrecalling that the vacuum has negative pressure P D �� (see arguments in boxedparagraph of Sect. 5.5.1), one can easily see that the cosmological constant can beidentified as the constant vacuum energy density given by �vac D Λc2/(8πGN).Now we have understood that the accelerating expansion ( Ra > 0) means that thenegative pressure permeates all over the universe. Note that, unlike a cluster ofmass, which can exert force � M/r2, the vacuum energy is not clustered, andas can be seen from Eq. (19.14) the strength of the force increases linearly withdistance. This is why repulsive gravity is not observed. Only at cosmic distances issufficient vacuum energy accumulated to be perceptible. The cosmic expansion isthe only phenomenon (so far) where repulsive gravity can be seen in operation.

Problem 19.3

Derive Eq. (19.14) using Eqs. (19.10) and (19.13).

As is easily derived from Eq. (19.10), the constant and dominant energy densityforces the universe to expand exponentially. It is believed that the universe wentthrough such an epoch, known as inflation, before the thermal hot universe devel-oped. Since we really do not know if the acceleration is due to the cosmological con-stant, which stays constant as its name suggests, or some dynamical object, which

4) Assumption of the uniform universe restricts net thermal flow to vanish in any sampled volume.

Page 827: Yorikiyo Nagashima - Archive

802 19 Epilogue

can generate time varying negative pressure, the name “dark energy” is adopted toinclude all possible candidates for the cause of acceleration.

Note, in spontaneous symmetry breaking we learned in Sect. 18.6.2 that the vac-uum moves from φ D 0 to jφj D v/

p2. Then the difference in the potential

energy V(0) � V(v/p

2) is liberated as vacuum energy. Namely, a scalar field thatcan break symmetry spontaneously has the capability of creating vacuum energy.Cosmic inflation is believed to have been produced by such a scalar field, referredto as the inflaton. As the concordance model says that at present the dark energydominates over all other types of energy, we are, sort of, in the second inflationarystage. The dark energy could be static vacuum energy (equivalent to the cosmo-logical constant) or of dynamical origin. If there is a self-interacting scalar fieldwhose potential energy overwhelms its kinetic energy, it delivers vacuum energy,which varies with time. This is referred to as quintessence, which means the fifthelement after the ancient Greek four elements of earth, wind, fire and water, orperhaps modern u, d, ν and e? Whether the dark energy will eventually die away,as in some of the quintessence models, or stay constant, as is the case for the cos-mological constant, determines the fate of the universe.

From a particle physicist’s point of view, dark matter and dark energy are newchallenges. What have we been doing? After all the long journey of the quest forthe fundamental constituents of matter, what we have found occupies no morethan 5% of the total cosmic energy budget.

Page 828: Yorikiyo Nagashima - Archive

803

Appendix ASpinor Representation

A.1Definition of a Group

A group G is a set of elements with a definite operation that satisfy the followingconditions.

1. Closure: For any two elements A, B , a product AB exists and also belongs tothe group.

2. Associativity: For any three elements, (AB)C D A(B C ).3. Identity element: A unit element 1 which does not cause anything to happen

exists. For all elements A, 1A D A1 D A.4. Inverse element: For any element A, an inverse A�1 exists and A�1A D

AA�1 D 1.

If AB D B A, the set is called commutative or abelian group.

Example A.1

Integers Z with addition “C” as the operation and zero as an identity operation.Inverse is subtraction.

Example A.2

Rational numbers Q without zero with operation multiplication “�” and the num-ber 1 as identity element. Zero has to be excluded, because it does not havean inverse element. Other examples are parallel movement or rotation in two-dimensional space (a plane).

If AB ¤ B A, the set is called noncommutative or nonabelian group.

Example A.3

Space rotation O(3) which is a member of more general group O(N ) which keeps

the length of the N-dimensional real variables space r DqPN

iD1 x2i invariant.

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 829: Yorikiyo Nagashima - Archive

804 Appendix A Spinor Representation

Group Representation The operations of a group are abstract. By using the grouprepresentation they are transformed to a format that is calculable. A familiar repre-sentation is a collection of matrices U(g) that act on a vector space jΨ i 2 V , suchthat for all group elements g 2 G , there exists a corresponding matrix U(g) thatsatisfies

U(A)U(B) D U(AB)

U(A�1) D U�1(A)

U(1) D 1 (Unit matrix)

(A.1)

The collection of operand vectors is called the representation space, and the dimen-sionality n of the vectors is called the dimension of the space.

Note, hermitian matrices do not form a group unless it is commutative.

Example A.4

U(N ) is a transformation that keeps the metric of the N-dimensional complexvariables space `2 D

PNiD1 ju i j

2 invariant. If we express u i0 s as a vector ψT D

(u1, u2, � � � , u N ), then N � N unitary matrices are the representation of the U(N )group elements. When det U D 1, it is referred to as the special unitary groupS U(N ).

Subgroup When a subset of a group makes a group by itself under the same mul-tiplication (transformation) law, it is called subgroup.

Subgroup of S U(N ): Any S U(N ) group has an abelian discrete subgroup H forwhich

An element h 2 H , h D e i2πk/N k D 0, 1, 2, � � �N � 1 (A.2)

ordinarily designated as ZN . It is the “center” of the group whose elements com-mute with all other elements of the group.

A.1.1Lie Group

When the group element is an analytical function of continuous parameters, it iscalled a Lie group. Any element of the Lie group can be expressed as

A(θ1, θ2, � � � , θn) D exp

i

nXiD1

θi Fi

!(A.3)

where n D N 2 � 1 for the S U(N ) group. Fi ’s satisfy commutation relations calledLie algebra.

[Fi , F j ] D i f i j k Fk (A.4)

Page 830: Yorikiyo Nagashima - Archive

A.2 S U(2) 805

Fi ’s are called generators of the group, f i j k structure constants. Since the wholerepresentation matrices are analytic functions smoothly connected to the unit ma-trix in the limit of θi ! 0, the Lie algebra Eq. (A.4) completely determines thelocal structure of the group.1) f i j k can be made totally antisymmetric in the in-dices i j k. Determination of the generator is not unique, because different F 0

i canbe made from any independent linear combination of Fi .

2) We are concerned witha compact Lie group in which the range of parameters is finite and closed withina finite volume in the space. For example, the rotation in a two-dimensional plane,0 θ 2π and is compact. In the Lorentz group,�1 < v < 1 and is finite but notclosed. If another parameter rapidity is used �1 < η < 1 and hence, the groupis not compact.

A.2SU(2)

We define a spinor as a base vector of S U(2) group representation in two-dimensional complex variable space.

� D�

�1

�2

�(A.6)

Then representation matrices are expressed by 2 � 2 unitary matrices with unitdeterminant. � transforms under S U(2)

� ! � 0 D U� or�

� 01

� 02

�D

�a bc d

���1

�2

�(A.7)

The condition for U is

U�1 D U† , and det U D 1 (A.8)

which constrains the U matrix to be of the form

U D�

a b�b� a�

�, jaj2 C jbj2 D 1 (A.9)

Therefore, the number of independent parameters in S U(2) is three. Equation(A.9) means

� 01 D a�1 C b�2

� 02 D �b��1 C a��2

(A.10)

1) An example of global structure is“connectedness”. For example S U(2) andO(3) have the same structure constants,but S U(2) is simply connected and O(3) isdoubly connected and there is a one-to-twocorrespondence between the two.

2) For instance, instead of angular momentumJi , we can use J˙ D Jx ˙ i Jy which satisfies

[ JC, J�] D 2 Jz , [ Jz , J˙] D ˙ J˙(A.5)

Page 831: Yorikiyo Nagashima - Archive

806 Appendix A Spinor Representation

Taking the complex conjugate of the equations and rearranging, we have��� �

2

�0D a

��� �

2

�C b

�� �

1

��� �

1

�0D �b� ��� �

2

�C a� �� �

1

� (A.11)

Therefore

� c �

��� �

2� �

1

�D

�0 �11 0

� �� �

1� �

2

�D �i σ2� � (A.12)

transforms in the same way as � .As there are three traceless hermitian matrices in S U(2), which are taken as

Pauli matrices, a general form of the unitary matrix with unit determinant takes aform of

U D exph�i

σ2� θiD cos

θ2� sin

θ2

i σ � n (A.13)

U† D exph

iσ2� θiD cos

θ2C sin

θ2

i σ � n (A.14)

s i D σ i/2 are generators of the SU(2) group and satisfy

[s i , s j ] D i ε i j k sk (A.15)

which is the same commutation relation as that of angular momentum. To seeconnections with the O(3) (three-dimensional rotation) group, let us take a lookat the transformation properties of a product of � 0 s, Hi j D �i � �

j . Taking Hi j ascomponents of a hermitian matrix, it can be expanded in terms of Pauli matrices.

H D a1C b � σ D a1C�

bz bx � i b y

bx C i b y �bz

�D AC B (A.16)

The matrix H transforms under S U(2)

H ! H 0 D U H U† D A0 C B0 D a01C b0 � σ (A.17)

A0 D a01 D U AU† D Ua1U† D a1 ) a0 D a (A.18)

It shows the coefficient of the unit matrix is invariant. To extract a, we take trace ofboth sides of Eq. (A.18) and obtain

2a D Tr[H ] D �1 � †1 C �2� †

2 D j�1j2 C j�2j

2 D � †� (A.19)

We see that S U(2) keeps the norm of the spinor invariant. Considering that theunitary transformation keeps the value of determinant invariant,

det B0 D det UBU† D det B ) b02x C b02

y C b02z D b2

x C b2y C b2

z (A.20)

Page 832: Yorikiyo Nagashima - Archive

A.2 S U(2) 807

This means the set of three parameters b D (bx , b y , bz ) defined in Eq. (A.17) be-haves as a vector in three-dimensional real space, i.e. a vector in O(3). We shallshow explicitly that components of b exactly follow the relation Eq. (3.3).

To extract bx from Eq. (A.17), we multiply both sides of the equation by σx andtake trace. Using Tr[σ2

x ] D 2, Tr[σx ] D Tr[σx σ y ] D Tr[σx σz ] D 0, we obtain

bx D12

Tr[σx H] D12

(σx , i j )� j � †i D � †

i (σx , i j )� j D � † σx � (A.21a)

similarly

b y D � † σ y � , bz D � † σz � (A.21b)

Therefore we can write

b D � †σ� (A.22)

Let us show that b indeed behaves as a vector in three-dimensional real space.Operation of S U(2) transforms � ! � 0 D U� . Rotation on axis z by θ transforms

bx ! b0x D � 0†σx � 0 D � †U† σx U� (A.23)

D � †�

cosθ2C sin

θ2

i σz

�σx

�cos

θ2� sin

θ2

i σz

�� (A.24)

Inserting σz σx σz D �σx , σz σx D i σ y , we obtain

b0x D cos θ bx � sin θ b y (A.25a)

Similarly

b0y D sin θ bx C cos θ b y (A.25b)

b0z D bz (A.25c)

Thus, b behaves exactly as a vector in O(3). We may define a spinor whose bi-linearform (A.22) behaves like a vector. Loosely speaking, it is like a square root of thevector.

Composites of Spinors Let us consider products of two independent spinors � andη, which may be thought as representing two particles of spin 1/2,

� D�

�1

�2

�, η D

�η1

η2

�(A.26a)

s D1p

2(�1η2 � �2 η1) , v D

8<ˆ:

vx D �1p

2(�1η1 � �2 η2)

vy D1p2i

(�1η1 C �2 η2)

vz D1p

2(�1 η2 C �2 η1)

(A.26b)

Page 833: Yorikiyo Nagashima - Archive

808 Appendix A Spinor Representation

It is easy to show that s and v behave like a scalar and a vector, namely as spin 0and 1 particles.

From the above arguments, we realize that S U(2) operation on � and that ofO(3) on vector (x , y , z) are closely related. Writing down explicitly operations onspin part

U D e�i σ�θ/2 D cosθ2� sin

θ2

i σ � n, R D e�iJ�θ D cos θ � sin θ iS � n

(A.27)

we see that while in S U(2), 4π rotation takes the spinor back to where it started,it takes 2π rotation in O(3). Between the range 2π � 4π, U ! �U correspondsto the same operation in O(3). The correspondence is 1 W 2. This is the reasonthat O(3) realizes rotations only on integer spin particles, while S U(2) is capableof handling those of half integer spin.

There are many reasons to believe that S U(2) is active in our real world. For in-stance, we know that quarks and leptons, the most fundamental particles of matterhave spin 1/2. A direct evidence has been obtained by a neutron interference ex-periment using a monolith crystal which has explicitly shown that it takes 4π to goback to the starting point [86, 103]. To confirm it personally, the reader may try asimple experiment shown in Figure A.1 in the boxed pragraph.

Confirm for yourself that 720ı rotation brings you back to where you were,but 360ı rotation does not.

(a) (b) (c)

(d) (e) (f)

Figure A.1 Illustration of S U(2) topology.After a rotation of 4π, the strings can bedisentangled without rotating the tube, butit is not possible to do so with 2π rotation.

(a) No rotation. (b) The tube is rotatedthrough an angle 4π. The configuration istopologically equivalent to (a). (c)–(f) showhow to undo the twists of the two strings.

Page 834: Yorikiyo Nagashima - Archive

A.3 Lorentz Operator for Spin 1/2 Particle 809

A.3Lorentz Operator for Spin 1/2 Particle

Referring to Eq. (3.77) and Eq. (3.74) in the text, combinations of spin operator Sand boost operator K

A D12

(S C i K) , B D12

(S � i K) (A.28)

satisfy the following equalities

[A i , A j ] D i ε i j k A k (A.29a)

[Bi , B j ] D i ε i j k Bk (A.29b)

[A i , B j ] D 0 (A.29c)

Both A and B satisfy commutation relations of S U(2), and they commute. In oth-er words, the Lorentz group can be expressed as a direct product of two groupsS U(2)� S U(2), and hence its representations can be specified by a couple of spins(sA, sB ). Let us consider the most interesting case where one has spin 1/2 and theother 0.

(0, 1/2) A D 0! K D CiS D Ciσ2

(A.30a)

(1/2, 0) B D 0! K D �iS D �iσ2

(A.30b)

Correspondingly there are two kinds of spinors which we denote φL and φR re-spectively. They transform under Lorentz operation as

φLL! φ0

L D exph�i

σ2� θ �

σ2� ηi

φL � M φL (A.31a)

φRL! φ0

R D exph�i

σ2� θ C

σ2� ηi

φR � N φR (A.31b)

which means that the two spinors behave in the same way under rotation, buttransform differently under the boost operation.

Note there is no matrix T that connects each other by transformation N D

T M T �1, i.e. in mathematical language they are inequivalent. Actually they areconnected by

N D � M���1 , � D �i σ2 (A.32)

A.3.1S L(2, C) Group

Lorentz transformation on spinors is represented by 2�2 complex matrix with unitdeterminant and is called S L(2, C ) group.

� 0 D S� I S D�

a bc d

�, ad � bc D 1 (A.33)

Page 835: Yorikiyo Nagashima - Archive

810 Appendix A Spinor Representation

The matrix S comprises 6 parameters which are related to the three angles andthe three rapidities in the Lorentz transformation. To see it, we look again at thetransformation properties of a product of spinor components Hi j D �i � �

j (i, j D1 � 2). In a similar manner as Eq. (A.16), we decompose it to two parts, one withunit matrix and the other with Pauli matrices

H D b01C b � σ D�

b0 C bz bx � i b y

bx C i b y b0 � bz

�S! H0 D SHS† (A.34)

Here, as the matrix S is not unitary, S† S ¤ 1, hence the first and the second termmix. However, the value of the determinant is invariant which leads

det H0 D det[SHS†] D j det Sj2 det H D det H (A.35)

) b0 0 2 � b0 2x � b0 2

y � b0 2z D b02 � b2

x � b2y � b2

z (A.36)

therefore S L(2, C ) operation has been proved to be equivalent to the Lorentz trans-formation.

To see the difference between φL and φR, let us see transformation properties oftwo vector-like quantities made of constant matrices sandwiched between the twospinors.

aμ � φ†Rσμ φR , aμ � φ†

R σμ φR (A.37a)

bμ � φ†L σμ φL , bμ � φ†

L σμ φL (A.37b)

σμ � (1, σ) D σμ , σμ � (1, �σ) D σμ (A.37c)

Put

S˙ D exp(˙ησx/2) , S†˙ D S˙ (A.38)

and define dashed quantities by(φ0 †

R σμ φ0R � φ†

R σμ0φR

φ0 †L σμ φ0

L � φ†L σμ

0φL(A.39)

then by the Lorentz transformation, they change as(σμ ! σ0 μ D S†

Cσμ SCσμ ! σ0

μ D S†�σμ S�(A.40)

Calculating “0,1” components under a boost in the x-direction

σ00(σ0

0) D S†˙1S˙ D e˙σx η D cosh η ˙ sinh ησx (A.41a)

σ10(σ1

0) D S†˙σx S˙ D

cosh

η2˙ sinh

η2

σx

�σx

cosh

η2˙ sinh

η2

σx

�D (sinh η ˙ cosh ησx ) (A.41b)

Page 836: Yorikiyo Nagashima - Archive

A.3 Lorentz Operator for Spin 1/2 Particle 811

The transformation is the same as Eq. (3.66). Therefore, we have proved that aμ ,bμ are contravariant vectors and aμ , bμ covariant vectors. We often refer σμ , σμ asLorentz vectors, but note that it is meaningful only if sandwiched between the twospinors. It is also clear that φ†

RφL, φ†L φR are Lorentz scalars.

A.3.2Dirac Equation: Another Derivation

Here, we try to make some correspondence between the spinors and particles con-sidering a special transformation with θ D 0. We can consider a particle withmomentum p as boosted from its rest state. Then there are two distinct states ofthe particle expressed as

φL(p ) D exph�

σ2� ηi

φL(0) (A.42a)

φR(p) D exphC

σ2� ηi

φR(0) (A.42b)

Let 'L, 'R be any two-component spinors behaving like φL, φR, then '†L (1,�σ)φL,

'†R(1, σ)φR are contravariant vectors and their dot products with the energy-

momentum vector '†L (E C σ � p )φL, '

†R(E � σ � p )φR must be scalars. Considering

'†L φR, '

†R φL are also scalars, it follows that (E C σ � p )φL, (E � σ � p )φR behave like

φR, φL and hence must be proportional to them. Using a relation

E C σ � p D m (cosh η C sinh ησ � n) D m exp(σ � η) (A.43)

and putting the proportionality constant C,

(E C σ � p )φL(p ) D m exp (σ � η) exph�

σ2� ηi

φL(0)

D m exphC

σ2� ηi

φL(0)

D C φR(p ) D C exphC

σ2� ηi

φR(0)

(A.44)

It follows that C D m and φL(0) D φR(0). The latter equality is physically a simplestatement that at rest there is no means to distinguish φL from φR. Applying thesame argument to (E � σ � p )φR and combining both results give

(E C σ � p )φL(p) D mφR(p ) (A.45a)

(E � σ � p )φR(p) D mφL(p ) (A.45b)

This is the celebrated Dirac equation written in terms of two-component spinors3) .Historically, the equation was created first and its solution followed. Here, we fol-lowed the inverse logic starting from the solution and found the equation. We see

3) Equation (A.45) is written in Weyl representation. In some text books, Dirac representation isused where they are connected by

ψD D T ψW , fα, gD D Tfα, gW T�1 , T D1p

2

�1 1�1 1

�(A.46a)

Page 837: Yorikiyo Nagashima - Archive

812 Appendix A Spinor Representation

that it is a relation that 2-component spinors must satisfy and is the simplest onethat a spin 1/2 particle respecting the Lorentz invariance has to obey.

Multiplying (E�σ � p ) on both sides of (A.45a), and using relation (E�σ � p )(ECσ � p ) D E 2 � p2 and (A.45b), we obtain

(E 2 � p2 � m2)φR D 0 , similarly (E 2 � p2 � m2)φL D 0 (A.47)

They satisfy the Einstein relation. If φL(p) and φR(p ) are expressed as a spacetimefunction (i.e. take their Fourier transform), they satisfy the Klein–Gordon equation.

Problem A.1

Define

S μν Di4

[γ μ, γ ν ] (A.48a)

(S23, S31, S12) D (S1, S2, S3) D12

Σ D12

�σ 00 σ

�(A.48b)

(S01, S02, S03) D (K1, K2, K3) Di2

α D12

��i σ 0

0 i σ

�(A.48c)

Show that S and K satisfy the same commutation relations as Eq. (3.74), thusqualify as Lorentz operators.

Looking at Eq. (A.42), we see the plane wave solution of the Dirac equation isindeed nothing but a Lorentz boosted spin wave function at rest as it should befrom our argument.

Page 838: Yorikiyo Nagashima - Archive

813

Appendix BCoulomb Gauge

The Lagrangian for an interacting electromagnetic field is given by

L D �14

Fμν F μν � q j μ Aμ D �12

(B2 � E 2)� q j μ Aμ (B.1a)

F μν D @μ Aν � @ν Aμ , Aμ D (φ, A) , j μ D (�, j ) (B.1b)

The Euler–Lagrange equation produces the Maxwell equation

@μ F μν D @μ@μ Aν � @ν �@μ Aμ� D q j ν (B.2)

In the Coulomb gauge, we impose the gauge condition

r � A D 0 (B.3)

Using the condition, the 0th component of the Maxwell equation becomes

r2 φ(x ) D �q� (B.4)

which has solution with boundary condition φx!1����! 0

φ(x ) Dq

Zd3x 0 �(x 0, t)jx � x 0j

(B.5)

Notice, the action of the charge at r0 on a point x is not retarded, but instantaneous.Defining the conjugate momentum fields by

π0 DδL

δ PφD 0 (B.6a)

πk DδL

δ PA kD � PAk � @k φ D E k (B.6b)

The Hamiltonian density is given by

H DδL

δ PA μ

PA μ �L D12

(E2 C B2)C E � rφ C q j μAμ

D12

(E2 C B2) � q�φ C q j μAμ D12

(E2 C B2) � j � A (B.7a)

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 839: Yorikiyo Nagashima - Archive

814 Appendix B Coulomb Gauge

where in going to the second line partial integration was taken. This Hamiltonianseems to lack the Coulomb potential term, but actually it is hidden in the E2. Tosee it, we set

Ek D �rφ , E? D � PA (B.8)

then

E D E? C Ek , r � Ek D 0 , r � E? D 0 (B.9)

by partial integration, we can show that

jE j2 D jE? C Ekj2 D jE? � rφj2 D jE?j2 C jEkj2

jEkj2 D rφ � rφ D �(r2φ)φ D q�φ(B.10)

Using the above relations, the Hamiltonian is now expressed as

H D12

�B2 C E 2

?�C

q2

�φ � j � A? (B.11)

The Coulomb term has reappeared. Notice the factor 1/2. The Hamiltonian cannow be written as

H DZ

d3x�

12

˚(r � A?)2 C ( PA?)2 � j � A?

�C

q2

2

“d3x d3x 0 �(x )�(x 0)

4πjx � x 0j

�(B.12)

Thus the degree of freedom corresponding to the scalar potential is replaced withthe instantaneous Coulomb potential and the rest are written in terms of trans-versely polarized potential. They are not manifestly Lorentz invariant, but a detailedinvestigation can show that Lorentz invariance is respected. Therefore, when intu-itive and physical interpretation is desired, the Coulomb gauge is convenient. Butfor easier calculation, the covariant gauge is more convenient.

B.1Quantization of the Electromagnetic Field in the Coulomb Gauge

Since π0 D 0, the scalar component does not propagate meaning it is not a dy-namical object hence we cannot quantize A0. We leave it as a classical potentialand quantize only the three fields A according to the standard procedure with theconditions

[Ai(x ), π j (y )]txDt y D i δ?i j δ3(x � y ) (B.13a)

[Ai(x ), A j (y )]txDt y D [π i (x )π j (y )]txDt y D 0 (B.13b)

δ?i j δ3(x � y ) DZ

d3k(2π)32ω

�δ i j �

k i k j

jkj2

�e i k�(x�y ) (B.13c)

Page 840: Yorikiyo Nagashima - Archive

B.1 Quantization of the Electromagnetic Field in the Coulomb Gauge 815

The reason for using δ?i j instead of δ i j is to make the commutators consistentwith the Coulomb gauge condition (B.3), so that both sides of the commutatorsare divergenceless. The field quantization in terms of the annihilation or creationoperators is

A(x ) D1p

V

Xk,λD˙

1p

2ωV

hak ,λ�(k, λ)e�i k �x C a†

k,λ�(k, λ)�e i k �xi

(B.14a)

XλD˙

ε i (λ)ε j (λ) D δ i j �k i k j

jkj2(B.14b)

�ak,λ , a†

k0 ,λ0D δkk0 δλλ0 (B.14c)

[ak ,λ , ak0 ,λ0 ] D�a†

k,λ , a†k0 ,λ0

D 0 (B.14d)

Only the transverse polarization is included. The gauge condition requires

r � A D 0 !

(k � �(k, λ D ˙) D 0P

λD˙ ε i (k, λ)ε j (k, λ)� D δ i j �k i k j

jkj2

(B.15)

The quantization is carried out in the same manner as two independent Klein–Gordon fields. The only subtlety is in the definition of transverse delta function tomake the commutator relations consistent with the gauge requirement.

The advantage of the Coulomb gauge is clearness of its physical image, but thedisadvantage is that the form of the propagator and hence the calculation of thetransition amplitude becomes rather complicated. To derive the Feynman propaga-tor in the Coulomb gauge, we introduce a timelike vector ημ D (1, 0, 0, 0), whichis orthogonal to the polarization vectors i.e. η � ελ D 0. We form four orthonormalvectors ημ, ε1

μ , ε2μ , k μ , where k μ is a vector defined by

k μ Dkμ � (k � η)ημ

[(k � η) � k2]1/2 (B.16)

As k � ε D η � ε D 0, it is easy to prove that k is a spacelike unit vector, i.e. k2D �1,

and orthogonal to the polarization vectors (k � ε D 0). Then we have

gμν D ημ ην � k μ k ν �

2XλD1

ελ �μ (k)ελ

ν(k) (B.17)

In the covariant Feynman gauge, the propagator is expressed by simplyP3λD0 ελ �

μ (k)ελν(k)

k2 C i εD�gμν

k2 C i ε(B.18)

whereas in the Coulomb gauge the numerator is replaced by

2XλD1

ελ �μ (k)ελ

ν(k) D �gμν C ημν � k μ k ν

D �gμν �kμ kν

(k � η)2 � k2 C(k � η)(kμ ην C ημ kν)

(k � η)2 � k2 �k2 ημ ην

(k � η)2 � k2

(B.19)

Page 841: Yorikiyo Nagashima - Archive
Page 842: Yorikiyo Nagashima - Archive

817

Appendix CDirac Matrix and Gamma Matrix Traces

C.1Dirac Plane Wave Solutions

6p � pμ γ μ

( 6p � m)u(p ) D 0 , ( 6p C m)v (p ) D 0

u(p )( 6p � m) D 0 , v (p )( 6p C m) D 0

u†s (p )u r(p ) D v†

s (p )vr(p ) D 2E δ r s

us (p )u r(p ) D �v s(p )vr(p ) D 2mδ r s

us(p )vr(p ) D v s(p )u r(p ) D 0XrD˙1/2

u r(p )ur(p ) D 6p C m ,X

rD˙1/2

vr (p )v r (p ) D6p � m

(C.1a)

Gordon Equality

us(p 0)γ μ u r(p ) D1

2mus(p 0)[(p 0 C p )μ C i σμν(p 0 � p )ν]u r(p )

us(p 0)γ μ γ 5u r (p ) D1

2mus(p 0)[(p 0 � p )μ C i σμν(p 0 C p )ν]γ 5u r (p )

(C.1b)

C.2Dirac γ Matrices

γ μ γ ν C γ ν γ μ D 2gμν

γ 5 D i γ 0γ 1γ 2γ 3 D �i4!

Xεμν�σ γ μ γ ν γ�γ σ

(γ 5)2 D 1 , γ 5γ μ C γ μ γ 5 D 0

σμν Di2

[γ μ , γ ν ] , σ0 i D i α i , σ i j D ε i j k σk

γ 0(γ μ)†γ 0 D γ μ , γ 0(γ 5)†γ 0 D �γ 5

γ 0(γ 5γ μ)† γ 0 D γ 5γ μ , γ 0(σμν)†γ 0 D σμν

(C.2a)

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 843: Yorikiyo Nagashima - Archive

818 Appendix C Dirac Matrix and Gamma Matrix Traces

γ 5γσ D �i3!

εμν�σγ μ γ ν γ�

γ μ γ ν γ� D gμν γ� � gμ�γ ν C gν�γ μ � i γ 5εμν�σ γσ

(C.2b)

Charge Conjugation Matrices

C D i γ 2γ 0 , CT D C† D �C , C C† D 1 , C2 D �1

C γ μTC�1 D �γ μ , C γ 5TC�1 D γ 5

C(γ 5γ μ)TC�1 D γ 5γ μ , C σμνTC�1 D �σμν (C.3a)

Chiral (Weyl) Representation

α D��σ 0

0 σ

�, D γ 0 D

�0 11 0

�, γ D α D

�0 σ�σ 0

�(C.4a)

γ 5 D

��1 0

0 1

�, C D

�i σ2 0

0 �i σ2

�(C.4b)

Pauli–Dirac Representation

α D�

0 σσ 0

�, D γ 0 D

�1 00 �1

�, γ D α D

�0 σ�σ 0

�(C.5a)

γ 5 D

�0 11 0

�, C D i γ 2γ 0 D

�0 �i σ2

�i σ2 0

�(C.5b)

Writing wave functions in the Weyl and Pauli–Dirac representations as ΦW, ΦD,they are mutually connected by

ΦD D S ΦW , γD D S γWS�1 , S D1p

2

�1 1�1 1

�(C.6)

C.2.1Traces of the γ Matrices

γμ γ μ D 4

γμ A/γ μ D �2 A/

γμ A/ B/γ μ D 4 (A � B)

γμ A/ B/C/γ μ D �2C/B/A/

γμ A/ B/C/D/γ μ D 2[D/A/B/C/C C/B/A/D/]

(C.7a)

Page 844: Yorikiyo Nagashima - Archive

C.3 Spin Sum of jM f i j2 819

Tr[1] D 4 , Tr[γ μ ] D Tr[γ 5] D 0

Tr[γ μ1 γ μ2 � � � γ μ2nC1 ] D 0

Tr[γ μ1 γ μ2 � � � γ μn ] D (�1)nTr[γ μn � � � γ μ2 γ μ1 ]

Tr[γ μ γ ν ] D 4gμν

Tr[γ μ γ ν γ�γ σ ] D 4 [gμν g�σ C gμσ gν� � gμ�gνσ ]

Tr[γ 5γ μ γ ν γ� γ σ] D �4 i εμν�σ D 4 i εμν�σ

Tr[γ 5] D Tr[γ 5γ μ] D Tr[γ 5γ μ γ ν ] D Tr[γ 5γ μ γ ν γ�] D 0

(C.7b)

C.2.2Levi-Civita Antisymmetric Tensor

ε i j k D ε i j k D

8<ˆ:C1 I i j k D even permutation of 123

�1 I i j k D odd permutation of 123

0 I if any two indices are the same

(C.8a)

εμν�σ D �εμν�σ D

8<ˆ:C1 I μν�σ D even permutation of 0123

�1 I μν�σ D even permutation of 0123

0 I if any two indices are the same

(C.8b)

εμν�σ εαγ δ D � det[gλλ0] , λ D μν�σ , λ D αγ δ

εαν�σ εαγ δ D � det[gλλ0

] , λ D ν�σ, λ D γ δ

εα�σ εαγ δ D �2 (g�γ gσδ � g�δ gσγ )

εαγ σ εαγδ D �6gσδ

εαγ δ εαγ δ D �24

(C.8c)

C.3Spin Sum of jM f i j2

K μν(p 0, p ) DX

r,sD1,2

�us(p 0)γ μ(a � bγ 5)u r(p )

��us(p 0)γ ν(a � bγ 5)u r(p )

�D Tr[γ μ(a � bγ 5)( 6p C m)(a� C b�γ 5)γ ν( 6p 0 C m0)]D 4[(jaj2 C jbj2)fp 0μ p ν C p 0 ν p μ � gμν(p 0 � p )g

C (jaj2 � jbj2)gμν m0m � 2 iReh(ab�)εμν�σ p 0

� pσ

i(C.9)

For antiparticles, replace m or m0 or both! �m ,�m0.

Page 845: Yorikiyo Nagashima - Archive

820 Appendix C Dirac Matrix and Gamma Matrix Traces

C.3.1A Frequently Used Example

e(p1) C μ(p2) ! e(p3) C μ(p4) (1) VV scattering; a D 1, b D 0, m1 D m3 D m,m2 D m4 D M

K μνV (p3, p1) D 4

h˚p μ

3 p ν1 C p ν

3 p μ1 � gμν(p3 � p1)

�C m2gμν

iD 4

�p μ

3 p ν1 C p ν

3 p μ1 C gμν

�q2

2

��, q D p1 � p3

K μνV (p3, p1)KV μν(p4, p2)

D 16h2(p1 � p2)(p3 � p4)C 2(p1 � p4)(p2 � p3)C q2f(p1 � p3)

C (p2 � p4)g C q4i

D 8[(s � m2 � M2)2 C (u � m2 � M2)2 C 2t(m2 C M2)]

D 16�

(s � m2 �M2)(m2 C M2 � u)C t(m2 C M2)Ct2

2

�' 8(s2 C u2) for s m2, M2

(C.10a)

The spin averaged matrix element becomes

jM f i j2 D

14

Xspin

jM f i j2 s�m2,M2

������!2e4

q4 (s2 C u2) (C.10b)

The cross section in the CM frame is given by

dσdΩ

ˇˇCMD

p3

p1

164π2 s

jM f i j2

Dα2

q4

s2 C u2

2sD

α2

q4

s2

"1C

�1C cos θ

2

�2# (C.10c)

(2) LL, LR, RL, RR scattering; (L: a D b D 1/2, R: a D �b D 1/2)

K μνL (p3, p1) D 2

h˚p μ

3 p ν1 C p ν

3 p μ1 � gμν(p3 p1)

�� i εμν�σ p3� p1σ

iK μν

R (p3, p1) D 2h˚

p μ3 p ν

1 C p ν3 p μ

1 � gμν(p3 p1)�C i εμν�σ p3� p1σ

iK μν

L (p3, p1)KLμν(p4, p2) D K μνR (p3, p1)KRμν(p4, p2)

D 16(p1 � p2)(p3 � p4) D 4(s � m2 � M2)2

' 4s2 s m2, M2

K μνL (p3, p1)KRμν(p4, p2) D K μν

R (p3, p1)KLμν(p4, p2)

D 16(p1 � p4)(p2 � p3) D 4(u � m2 �M2)2

' 4u2 u m2, M2

(C.10d)

Page 846: Yorikiyo Nagashima - Archive

C.3 Spin Sum of jM f i j2 821

e(p1) C e(p2) ! μ(p3) C μ(p4) Here, we consider only VV scattering. Then,K μν(p 0, p ) in Eq. (C.9) becomes

K μνV (p 0, p ) D 4 [fp 0 μ p ν C p 0ν p μ � gμν(p 0 � p )g � m2gμν]

D 4h

p 0μ p ν C p 0ν p μ � gμν s

2

�i, s D (p C p 0)2 (C.11)

K μνV (p2, p1)KV μν(p4, p3)

D 16 [2(p1 � p3)(p2 � p4)C 2(p1 � p4)(p2 � p3)

� sf(p1 � p2)C (p3 � p4)g C s2] ,

D 8[(t � m2 �M2)2 C (u � m2 � M2)2 C 2s(m2 C M2)]

(C.12)

In the CM frame,

E1 D E2 D

qp 2

1 C m2 , E3 D E4 D

qp 2

3 C M2 ,

s D 4E 21 D 4E 2

3 D 4E1E3 , p 21 D

s � 4m2

4, p 2

3 Ds � 4M2

4(p1 � p3)(p2 � p4) D (E1E3 � p1 p3 cos θ )2 ,

(p1 � p4)(p2 � p3) D (E1E3 C p1 p3 cos θ )2

(C.13)

Inserting Eq. (C.13) in Eq. (C.12), we have

K μνV (p2, p1)KV μν(p4, p3)

D 16�4�E 2

1 E 23 C p 2

1 p 23 cos2 θ

�C s(m2 C M2)

(C.14)

D 4s2�

(1C cos2 θ )C 4(m2 C M2)

ssin2 θ C 16

m2 M2

s2cos2 θ

�(C.15)

dσdΩ

ˇˇCMD

p3

p1

164π2 s

14

(4πα)2

s2 K μνV (p2, p1)KV μν(p4, p3)

Dα2

4sp3

p1

�(1C cos2 θ )C 4

(m2 C M2)s

sin2 θ

C16m2 M2

s2cos2 θ

� (C.16)

The total cross section is

σTOT D4πα2

3sp3

p1

�1C 2

(m2 C M2)s

C4m2 M2

s2

�(C.17a)

D4πα2

3s f

i

"1C

2 � 2i � 2

f

2C

�1 � 2

i

��1� 2

f

�4

#(C.17b)

where i, f D pi, f /Ei, f is the particle’s initial and final velocity in CM.

Page 847: Yorikiyo Nagashima - Archive

822 Appendix C Dirac Matrix and Gamma Matrix Traces

C.3.2Polarization Sum of the Vector Particle

Polarization vectors (εμ(λ)) are orthogonal to the momentum qμ of the particle, thephoton by the Lorentz condition and the massive boson by Proca’s equation:

qμ εμ(λ) D 0 (C.18)

Only the transverse polarizations are allowed for the real photon, and they satisfy

XλD˙

ε i (λ)ε j (λ) D δ i j �qi q j

jqj2(C.19)

For the case of the massive vector boson,XλD˙,l

εμ(λ)εν(λ) D ��

gμν �qμ qν

m2

�(C.20)

This can be understood by first considering the polarization vectors in the particlerest coordinate system, then boosting to the direction of the momentum. Definefour polarization vectors εμ(λ), (λ D 0 � 3) in the rest frame of the particle by

εμ(0) D (1, 0, 0, 0)

εμ(1) D (0, 1, 0, 0)

εμ(2) D (0, 0, 1, 0)

εμ(3) D (0, 0, 0, 1)

(C.21a)

Alternatively we can adopt circular polarizations for the transverse components. Inthe coordinate system where the 4-momentum of the particle is given by

qμ D (ω, 0, 0, jqj) (C.21b)

The four polarization vectors are boosted to

εμ(s) D�

ωm

, 0, 0,jqjm

εμ(C) D �1p

2(0, 1,Ci, 0)

εμ(�) D1p

2(0, 1,�i, 0)

εμ(l) D�jqjm

, 0, 0,jωjm

�(C.21c)

Here, we used names˙ for circular, scalar for the 0th and longitudinal for the thirdcomponent of the polarization. The direction of the longitudinal photon is parallelto the momentum. They satisfy the orthogonality condition

εμ(λ)εμ(λ0)� D gλλ0(C.21d)

Page 848: Yorikiyo Nagashima - Archive

C.4 Other Useful Formulae 823

and the completeness conditionXλDall

gλλ0εμ(λ)εν(λ0)� D gμν (C.21e)

Requirement of the constraint Eq. (C.18) removes the scalar polarization and thecompleteness condition becomes Eq. (C.20).

For virtual photons with time-like momentum (i.e. q2 > 0), the above discussioncan be extended by simply replacing m by

pq2.

For space-like momentum (q2 < 0), at least one of the three polarization vectorshas to be time-like. We cannot use εμ(s) of Eq. (C.21c), because it does not satisfythe constraint Eq. (C.18). The condition Eq. (C.18) is satisfied by using εμ(l) butreplacing the mass with

p�q2, because

ε(l)μ εμ(l) Dq2 � ω2

�q2 D 1 > 0 (C.22a)

What about the completeness condition? For space-like qμ , there is a coordinatesystem that makes jqj ¤ 0, q0 D 0, because the energy momentum of the space likevirtual photon is created by a charged particle, namely q D p i � p f where pi , p f

are the time-like momenta of the particle before and after emission of the photon.In that coordinate system εμ(l) D (1, 0, 0, 0) which agrees with εμ(0) defined inEq. (C.21). Accordingly the metric for the summation has to be changed and thecompleteness condition is kept unchanged.

XλD˙1,l!0

(�1)λC1εμ(λ)εν(λ)� D ��

gμν �qμ qν

q2

�(C.23)

εμ(l) is time-like and is called scalar polarization.

C.4Other Useful Formulae

AC B C C � (2π)4δ4(P � p3 � p4)d3 p3d3 p4

(2π)62E32E4

��a C bμ p μ

3 C cμν p μ3 p ν

4

D

λ(1, x , y )8π

�a C bμ P μ 1C x � y

2

Ccμν12

�(1 � x � y )P μ P ν �

λ2(1, x , y )6

(4P μ P ν � P2gμν) �

E3 D

qp 2

3 C m23 , E4 D

qp 2

4 C m24

λ(1, x , y ) D (1C x2 C y 2 � 2x � 2y � 2x y )1/2 , x Dm2

3

P2, y D

m24

P2

(C.24)

Page 849: Yorikiyo Nagashima - Archive

824 Appendix C Dirac Matrix and Gamma Matrix Traces

Especially, when m3 D m4 D 0

A Da

8π, B D

bμ P μ

16π,

C Dcμν

96π(2P μ P ν C P2gμν) (C.25)

To prove Eq. (C.24), use Lorentz invariance to show that

A D X(P2) , B D bμ P μ Y(P2) , C D cμν�P μ P ν Z(P2)C gμν W(P2)

(C.26)

and perform the calculation in the P rest frame.

Page 850: Yorikiyo Nagashima - Archive

825

Appendix DDimensional Regularization

D.1Photon Self-Energy

We demonstrate the method of dimensional regularization by calculating the pho-ton self-energy loop diagram.

�i Π μν D �

Zd4k

(2π)4Tr�

(�i eγ μ)i( 6 k C m)

k2 � m2 C i ε(�i eγ ν)

�i( 6 k� 6 q C m)

(k � q)2 � m2 C i ε

�(D.1)

and prove that

Π μν(q2) D (gμν q2 � qμ qν)Πγ (q2) (D.2a)

Πγ (q2) D Πγ (0)C OΠγ (q2) (D.2b)

in which OΠγ (q2) is a finite analytical function of q2.The trace calculation yields

Tr[� � � ] D Tr [( 6 k C m)γ ν( 6 k� 6 q C m)γ μ ]

D 4 [2k μ k ν � (k μ qν C k ν qμ)� gμν(k2 � k � q � m2)]

) �i Π μν D �4e2Z

d4k(2π)4

�[2k μ k ν � (k μ qν C k ν qμ) � gμν(k2 � k � q � m2)]

[k2 � m2 C i ε][(k � q)2 � m2 C i ε]

(D.3)

which is a diverging integral for k ! 1. The above integral Eq. (D.1) containsa O(k2) term in the numerator, which is quadratically divergent. In dimensionalregularization, it appears as a D D 2 pole, as the quadratic divergence in D D 4becomes logarithmic in D D 2 space. The gauge invariance, however, guaranteesthat the divergence is at most of logarithmic type, as is shown below. Since theintegral is well defined for D < 2 and is an analytic function of D, it is allowed to

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 851: Yorikiyo Nagashima - Archive

826 Appendix D Dimensional Regularization

assume that D is an integer larger than 4, perform the integration formally, thentake the limit D ! 4.

To perform the integration in D-dimensional space, we introduce

k μ D (k0, k1, k2, k3)! (k0, k1, � � � , k D�1) � (k0, k) (D.4a)

d4k(2π)4 ! μ4�D d D k

(2π)D (D.4b)

gμν gμν D 4! gμν gμν D D (D.4c)

k2 D (k0)2 � k2 D (k0)2 �

D�1XiD1

(k i )2 (D.4d)

μ is an energy scale parameter introduced to keep the physical dimension. Al-though introduction of μ is redundant in the following argument, we keep it be-cause it is indispensable when we extend our discussion to QCD. We introduce thefollowing notation for later convenience:

B0(q2, m2) DZ

k

1(1)(2)

(D.5a)

where we have definedZk�

16π2

iμ4�D

Zd D k

(2π)D(D.5b)

(1) � k2 � m2 C i ε , (2) D (k � q)2 � m2 C i ε (D.5c)

Applying the Feynman parameter formula

1abD

Z 1

0dx

1[(1 � x )a C x b]2

(D.6)

The integrand can be transformed to a more convenient form for integration

�i Π μν D �4e2Z 1

0dx

Zd4k

(2π)4

�[2k μ k ν � (k μ qν C k ν qμ)� gμν(k2 � k � q � m2)]

[(k � x q)2 � (m2 � x (1 � x )q2)]2(D.7)

Let us calculate B0 first. It is expressed as

B0 D

Zk

1(1)(2)

D16π2

iμ4�D

Z 1

0dx

d D k(2π)D

1[(k � x q)2 � M2]2

(D.8a)

M2 D �q2x (1 � x )C m2 � i ε (D.8b)

Page 852: Yorikiyo Nagashima - Archive

D.1 Photon Self-Energy 827

Changing the integration value from k ! k0 D k � x q, and introducing ω2 DPD�1iD1 (k i)2 C M2, we find the denominator becomes

(k0)2 � ω2 C i ε D [k0 � (ω � i ε)][k0 � (�ω C i ε)] (D.9)

It has poles at ω� i ε and �ωC i ε. Therefore, we can rotate the integral path fromk0 D �1 �1! �i1� i1, which is called the Wick rotation. Then writing

k2 D (k0)2 �

D�1XiD1

! �r2 D

D�1XiD0

(k i )2 (D.10)

As r is the length in the Euclidean space, we can adopt the polar coordinate systemand integrate all the angular variables, giving the D-dimensional solid angle ΩD . Itcan be calculated as follows. Using the equality

Z 1

�1dx e�x2

Dp

π)Z 1

�1dx1 � � �

Z 1

�1dxD e�x2

1 ����x2D D πD/2

(D.11)

recalculating it using the polar coordinate

ΩD

Z 1

0e�r2

r D�1dr D πD/2 (D.12)

Using the definition of the Γ function

Γ (x ) DZ 1

0e�t t x�1d t (D.13)

we obtain

ΩD D2πD/2

Γ (D/2)(D.14)

Now we can perform the integration in Eq. (D.8). At this point, we extend the powerof the denominator from two to more general n for later convenience. Then in theWick rotated coordinate, we obtain

iZ

d D k(�r2 �M2)2 ) (�1)n i

Zd D k

(r2 C M2)n D (�1)n i ΩD

Z 1

0

r D�1dr(r2 C M2)n

D (�1)n i ΩDM D�2n

2B�

D2

, n �D2

�(D.15a)

B(p , q) DZ 1

0d t

t q�1

(1C t)pCq DΓ (p )Γ (q)Γ (p C q)

(D.15b)

Page 853: Yorikiyo Nagashima - Archive

828 Appendix D Dimensional Regularization

Then

B0 D16π2μ4�D

i(�1)n i(2π)D

Z 1

0dx

2πD/2

Γ (D/2)M D�2n

2Γ (D/2)Γ (n � D/2)

Γ (n)(D.16a)

D (�1)nZ 1

0dx

�(4πμ2)2�D/2

(M2)n�D/2

�Γ�n � D

2

�Γ (n)

(D.16b)

As one can see, n D 2, D D 4 is a pole of the Γ function. The residue at the polecan be calculated by using infinite product representation for Γ

1Γ (z)

D zeγE z1Y

nD1

1C

zn

�e�z/n (D.17)

Setting n D 2 and defining an infinitesimal number ε D 4� D , and using

Γ ε

2

�D

2ε� γE C O(ε) (D.18a)

x ε/2 D e(ε/2) ln x D 1Cε2

ln x C O(ε2) (D.18b)

B0(q2, m2) D Δ �Z 1

0dx ln M2

D Δ �Z 1

0dx ln[�q2x (1 � x )C m2 � i ε] (D.19a)

Δ D2ε� γE C ln(4πμ2) (D.19b)

where γE ' 0.5772 is the Euler constant. The second term in B0 is a finite analyticalfunction. The divergent term is included in the first term as the pole of the complexD-plane. γE and ln(4πμ2) in Δ always accompany 2/ε as a unit and cancel in theobservable quantities. Therefore, it is to be subtracted together with 2/ε in therenormalization prescription. In QCD, where there is no special point to normalizethe coupling constant with the experimental value, ln μ2 is kept as an adjustableparameter.

To calculate other terms in Eq. (D.7), the following formulas are useful and easyto verify:Z

d D k[k2 � M2 C i ε]n

D i(π)D/2 (�1)n

(M2)n�D/2

Γ�n � D

2

�Γ (n)

(D.20a)Zk μ d D k

[k2 � M2 C i ε]n D 0 (D.20b)

Zk2d D k

[k2 � M2 C i ε]nD i(π)D/2(�1)n�1 D/2

(M2)n�1�D/2

Γ�n � 1� D

2

�Γ (n)

(D.20c)

Page 854: Yorikiyo Nagashima - Archive

D.1 Photon Self-Energy 829

Zk μ k ν d D k

[k2 � M2 C i ε]n Dgμν

D

Zk2d D k

[k2 � M2 C i ε]n (D.20d)

To derive the third equation, Γ (z C 1) D zΓ (z) is used. The last equation can beproved by noting that the integral must be proportional to gμν by Lorentz invarianceand that gμν gμν D D .

To calculate Π μν , we rewrite k in the numerator of Eq. (D.7) with k0 μ D k μ�x qμ

and drop terms linear in k0 as they vanish:

�i Π μν D �4e2Z 1

0dxZ

d4k(2π)4

�2k μ k ν � (k μ qν C k ν qμ) � gμν(k2 � k � q � m2)

[(k � x q)2 � (m2 � x (1 � x )q2)]2

) �4e2Z 1

0dx

Zd D k0

(2π)D

�2k0 μ k0 ν

� 2x (1 � x )qμ qν � gμν[k02� q2x (1 � x ) � m2]

[k02�M2 C i ε]2

M2 D m2 � x (1 � x )q2

(D.21)

Inserting Eq. (D.20) , we obtain

Π μν D4e2

16π2

Z 1

0dx

�4πμ2

M2

�2�D/2 �gμν M2

�D2� 1

�Γ�

1 �D2

Cn�2x (1 � x )qμ qν C gμν[q2x (1 � x )C m2]

o�

2 �D2

��(D.22a)

As (D/2� 1)Γ (1� D/2) D �Γ (2� D/2), we see that the pole at D D 2 which rep-resents the quadratic divergence vanishes. Reduction of the power in the divergentterm by the gauge invariance has worked!

Π μν D2απ

(gμν q2 � qμ qν)Z 1

0dx x (1� x )

�4πμ2

M2

�2�D/2

Γ�

2�D2

D2απ

(gμν q2 � qμ qν)Z 1

0dx x (1� x )

�Δ � ln(M2 � i ε)

D (gμν q2 � qμ qν)

�α

3πΔ �

2απ

Z 1

0dx x (1 � x ) ln[�q2x (1 � x )C m2 � i ε]

�(D.23a)

Page 855: Yorikiyo Nagashima - Archive

830 Appendix D Dimensional Regularization

Using Π μν D (gμν q2 � qμ qν)Πγ , we finally obtain

Πγ D Πγ (0)C OΠγ (q2) (D.24a)

Πγ (0) Dα

3π(Δ � ln m2) (D.24b)

OΠγ (q2) D �2απ

Z 1

0dx x (1 � x ) ln

�1�

q2

m2 x (1� x ) � i ε�

(D.24c)

D.2Electron Self-Energy

The electron self energy is given by

i Σe(p ) DZ

d4k(2π)4

�i gμν

(p � k)2 � μ2 C i ε(i eγ μ)

i( 6 k C m)k2 � m2 C i ε

(i eγ ν) (D.25)

where μ is a small mass introduced to avoid the infrared divergence. Using theFeynman parametrization, it becomes

i Σe(p ) D �e2Z 1

0dx

Zd4 k

(2π)4

γμ 6 kγ μ C mγμ γ μ

[(k � x p )2 � M2 C i ε]2

D �2e2Z 1

0dx

Zd4k

(2π)4

2m � x 6p[k2 � M2 C i ε]2

(D.26a)

M2 D �p 2x (1 � x )C (1 � x )m2 C x μ2 � i ε (D.26b)

where we have used the relation γμ 6 kγ μ D �2 6 k, γμ γ μ D 4. This is the sameintegral as Eq. (D.8) except for the numerator and its result is already given byEq. (D.19). Thus

Σe(p ) Dα

Z 1

0dx (2m � x 6p )

��Δ C ln M2 (D.27)

We can easily calculate δm and δZ2 defined in Eq. (8.68)

Σe(p ) D δm C [δZ2 C OΠe( 6p )]( 6p � mR ) (D.28a)

δm D Σe(p )j p/DmR Dα

2πmR

��

32

Δ CZ 1

0dx (2� x ) ln M2

R

�C O(α2)

(D.28b)

Page 856: Yorikiyo Nagashima - Archive

D.2 Electron Self-Energy 831

In deriving the last equation we used m D mR C δm D mR C O(α).

δZ2 D@Σe (p )

@ 6p

ˇˇ

p/DmR

�C

Δ2�

Z 1

0dx

�x ln M2

R C2m2

R x (1� x )(2� x )M2

R

�C O(α2)

(D.28c)

M2R D (1� x )2m2

R C x μ2 � i ε (D.28d)

In calculating M2R and its derivative, we have used p 2 D6p 6p and

M2R D M2jp/DmR D �m2

R x (1 � x )C (1 � x )(mR C δm)2 C x μ2 � i ε

D (1 � x )2m2R C x μ2 � i ε C O(α) (D.29)

We see, the first derivative of Σe(p ) still includes the divergent term. But OΠe( 6p ) isfinite and satisfies the relation

OΠe( 6p )j p/DmR D 0 (D.30)

Page 857: Yorikiyo Nagashima - Archive
Page 858: Yorikiyo Nagashima - Archive

833

Appendix ERotation Matrix

E.1Angular Momentum Operators

An operator to rotate a physical object by an angle α around the axis having a unitvector n(θ , φ) is given by

R D exp[�α(n � J)] (E.1)

The angular momentum operator Ji D L i C Si satisfies the commutation relation

[ Ji , J j ] D i ε i j k Jk (E.2)

J2 D J21 C J2

2 C J23 commutes with all Ji , and has an eigenvalue j ( j C 1), j D

0, 1/2, 1, � � � . One of Ji (conventionally J3) and J2 can be diagonalized simulta-neously, so the eigenvector can be specified by two values, j, j3. Writing J˙ DJ1 ˙ i J2

J2j j mi D j ( j C 1)j j mi

J3j j mi D mj j mi

J˙j j mi Dp

j ( j C 1) � m(m ˙ 1)j j m ˙ 1i

(E.3)

For spinless particle, J D L and

L i D x j pk � xk p j D �i(x j @k � xk@ j ) (E.4)

Eigenfunctions of the orbital angular momentum are spherical harmonic func-tions. They are, for L2 D L(LC 1), L3 D M , �L M L, given as

YLM (θ , φ) D (�1)(MCjMj)/2�

2LC 14π

(L � jM j)!(LC jM j)!

�1/2

P jMjL (cos θ )e i M φ (E.5a)

P jMjL (z) D (1 � z)jMj/2 1

2L L!d LCjMj

dz LCjMj (z2 � 1)L (E.5b)

YL,�M (θ , φ) D (�1)M [YLM (θ , φ)]� (E.5c)

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 859: Yorikiyo Nagashima - Archive

834 Appendix E Rotation Matrix

Y00 D

r1

4π, Y10 D

r3

4πcos θ , Y1˙1 D �

r3

8πsin θ e˙i φ

Y20 D

r5

4π3 cos2 θ � 1

2, Y2˙1 D �

r158π

sin θ cos θ e˙i φ ,

Y2˙2 D

r15

32πsin2 θ e˙2 i φ

Y30 D

r7

4π5 cos3 θ � 3 cos θ

2,

Y31 D �

r21

64πsin θ (5 cos2 θ � 1)e˙i φ

Y32 D

r10532π

sin2 θ cos θ e˙2 i φ, Y33 D �

r35

64πsin3 θ e˙3i φ

(E.6)

For L D 0, J D S , and Si are (2S C 1)� (2S C 1) matrices. For S D 1/2, S D σ/2,where σ is the Pauli spin matrix.

S1 D12

�0 11 0

�, S2 D

12

�0 �ii 0

�, S3 D

12

�1 00 �1

�, (E.7)

The eigenfunctions are

�C1/2 D

�10

�, ��1/2 D

�01

�(E.8)

For S D 1, we can adopt eigenfunctions

eC1 D

241

00

35 , e0 D

240

10

35 , e�1 D

240

01

35 (E.9)

For which we can easily construct S˙ using Eq. (E.3)

SC D

240

p2 0

0 0p

20 0 0

35 , S� D

24 0 0 0p

2 0 00p

2 0

35 , (E.10)

Then

S1 D1p

2

240 1 0

1 0 10 1 0

35 , S2 D

1p

2

240 �i 0

i 0 �i0 i 0

35 ,

S3 D1p

2

241 0 0

0 0 00 0 �1

35 (E.11)

Page 860: Yorikiyo Nagashima - Archive

E.2 Addition of the Angular Momentum 835

E.2Addition of the Angular Momentum

Adding J1 and J2 vectorially

J D J1 C J2 (E.12)

The range of the eigenvalues J, M are

j j1 � j2j J j1 C j2 , M D m1 C m2 (E.13)

The eigenvectors of the J are given by

j J Mi DX

m1Cm2DM

C J Mj1 m1, j2 m2

j j1m1ij j2m2i (E.14)

C J Mj1 m1, j2 m2

are called the Clebsch–Gordan coefficients and their general forms aregiven in [334]. For the case, j2 D 1/2, 1 they are given by

Table E.1 C J Mj1,M�mI1/2,m .

Jnm m D 1/2 m D �1/2

j C 1/2h

j1CMC1/22 j1C1

i1/2 hj1�MC1/2

2 j1C1

i1/2

j � 1/2 �h

j1�MC1/22 j1C1

i1/2 hj1CMC1/2

2 j1C1

i1/2

Table E.2 C J Mj1,M�mI1/2,m .

Jnm m D 1 m D 0 m D �1

j1 C 1h

( j1CM )( j1CMC1)(2 j1C1)(2 j1C2)

i1/2 h( j1�MC1)( j1CMC1)

(2 j1C1)( j1C1)

i1/2 h( j1�M )( j1�MC1)

(2 j1C1)(2 j1C2)

i1/2

j1 �h

( j1CM )( j1�MC1)2 j1( j1C1)

i1/2M

[ j1( j1C1)]1/2

h( j1�M )( j1CMC1)

2 j1( j1C1)

i1/2

j1 � 1h

( j1�M )( j1 �MC1)2 j1(2 j1C1)

i1/2�h

( j1�M )( j1CM )j1(2 j1C1)

i1/2 h( j1CMC1)( j1CM )

2 j1(2 j1C1)

i1/2

E.3Rotational Matrix

The effect of a rotation on the z-axis simply gives a phase factor to the eigenfunctionj j mi, but that on y-axis generates states of different m. As, j j mimakes a complete

Page 861: Yorikiyo Nagashima - Archive

836 Appendix E Rotation Matrix

system if the value of j does not change, the new state can be expanded by theirlinear combination:

) e�i θ J2j j mi DX

n

d jnm(θ )j j ni (E.15)

d jnm(θ ) is called the rotation matrix and is defined by

d jnm(θ ) D h j nje�i θ J2j j mi (E.16)

The properties of d Jnm function are

d jnm(θ ) D d j

�m ,�n(θ ) D (�1)n�m d jmn(θ )

d jnm(θ ) D (�1) j Cn d j

n,�m (π � θ )

d jnm(π) D (�1) j Cn δn,�m

d jnm(�π) D (�1)n� j δn,�m

(E.17)

The normalization is given byZd cos θ d j1

nm(θ )d j2nm(θ ) D

22 j1 C 1

δ j1, j2 (E.18)

A general formula of rotation matrices are given by Wigner [334], but for j D1/2, 1, it is easily obtained by inserting Eqs. (E.7) and (E.11) into Eq. (E.16):"

d1/21/2,1/2 d1/2

1/2,�1/2

d1/2�1/2,1/2 d1/2

1/2,�1/2

#D e�i( θ

2 )σ2 DX

n

1n!

��i

θ2

σ2

�n

D cosθ2� sin

θ2

(i σ2) D

"cos θ

2 � sin θ2

sin θ2 cos θ

2

# (E.19)

Similarly, for S D 1

264

d11,1 d1

1,0 d11,�1

d10,1 d1

0,0 d10,�1

d1�1,1 d1�1,0 d1�1,�1

375 D

2664

1Ccos θ2 � sin θp

21�cos θ

2sin θp

2cos θ � sin θp

21�cos θ

2sin θp

21Ccos θ

2

3775 (E.20)

Slightly more general formulae are given in [217].

j D L D Integer

d L00 D PL(θ )

d L10(θ ) D �[L(LC 1)]�1/2 sin θ P 0

L(θ )

d L20(θ ) D [(L � 1)L(LC 1)(LC 2)]�1/2[2P 0

L�1(θ )� L(L � 1)PL(θ )]

P 0L(z) D

ddz

PL(z)

(E.21)

Page 862: Yorikiyo Nagashima - Archive

E.3 Rotational Matrix 837

j D L C 1/2

d j1/2,1/2(θ ) D ( j C 1/2)�1 cos

θ2

(P 0j C1/2 � P 0

j �1/2)

d j�1/2,1/2(θ ) D ( j C 1/2)�1 sin

θ2

(P 0j C1/2 C P 0

j �1/2)

d j1/2,3/2(θ ) D ( j C 1/2)�1 sin

θ2

sj � 1/2j C 3/2

P 0j C1/2 C

sj C 3/2j � 1/2

P 0j �1/2

!

d j�1/2,3/2(θ ) D ( j C 1/2)�1 sin

θ2

sj � 1/2j C 3/2

P 0j C1/2 C

sj C 3/2j � 1/2

P 0j �1/2

!

(E.22)

Page 863: Yorikiyo Nagashima - Archive
Page 864: Yorikiyo Nagashima - Archive

839

Appendix FC, P, T Transformation

We summarize here the transformation properties of fields. Only the Dirac fieldsand their bilinears are given. Corresponding scalars, vectors etc. that obey theKlein–Gordon equations are similar to the corresponding Dirac bilinears. jq, p , szi

means the state of a particle (q) or antiparticle (q) having momentum p , and spincomponent s D sz . (t, x ) stands for (t, x ). Combined operations usually give differ-ent phases depending on the order of the operation. But the Dirac bilinears do notbecause ηη� D 1.

P W (t, x ) ! (t,�x )P jq, p , si D ηP jq,�p , si

ψ(t, x ) ! γ 0ψ(t,�x )ψ(t, x ) ! ψ(t,�x )γ 0

S ψ1ψ2 ! ψ1ψ2

P i ψ1γ 5ψ2 ! �i ψ1γ 5ψ2�

V ψ1γ μ ψ2 ! ψ1γμ ψ2

A ψ1γ μ γ 5ψ2 ! �ψ1γμ γ 5ψ2

T ψ1 σμν ψ2 ! ψ1σμν ψ2

(F.1)

� i is attached to make i ψγ 5ψ hermitian.

*C W C jq, p , si D ηC jq, p , si

ψ(t, x ) ! C ψT(t, x ) D i γ 2γ 0ψT(t, x )ψ(t, x ) ! �ψT(t, x )C�1

C�1γ μ C D �γ μ T

C D �C† D �C�1 D �C T D C�

S ψ1ψ2 ! ψ2ψ1

P i ψ1γ 5ψ2 ! i ψ2γ 5ψ1

V ψ1γ μ ψ2 ! �ψ2γ μ ψ1

A ψ1γ μ γ 5ψ2 ! ψ2γ μ γ 5ψ1

T ψ1σμν ψ2 ! �ψ2σμν ψ1

(F.2)

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 865: Yorikiyo Nagashima - Archive

840 Appendix F C, P, T Transformation

T W (t, x ) ! (�t, x )(c-number) ! (c-number)�

h f W outjH ji W ini D h f W outjT �1T H T �1T ji W iniD hT f W injH jT i W outi�

T jq, p , si D ηT jq,�p ,�si

ψ(t, x ) ! B ψ(�t, x )D i γ 1γ 3ψ(�t, x ) D �i γ 5C ψ(�t, x )

ψ(t, x ) ! ψ(�t, x )B�1 D ψ(�t, x )γ 1γ 3

B�1γ μ�B D γμ , B† D B�1 D �B T D B

S ψ1ψ2 ! ψ1ψ2

P i ψ1γ 5ψ2 ! �i ψ1γ 5ψ2

V ψ1γ μ ψ2 ! ψ1γμ ψ2

A ψ1γ μ γ 5ψ2 ! ψ1γμ γ 5ψ2

T ψ1σμν ψ2 ! �ψ1σμν ψ2

(F.3)

C P W (t, x ) ! (t,�x )C P jq, p , si D ηC P jq,�p , si

ψ(t, x ) ! ˙i γ 2ψT(t,�x )ψ(t, x ) ! �ψT(t,�x )i γ 2

S ψ1ψ2 ! ψ2ψ1

P i ψ1γ 5ψ2 ! �i ψ2γ 5ψ1

V ψ1γ μ ψ2 ! �ψ2γμ ψ1

A ψ1γ μ γ 5ψ2 ! �ψ2γμ γ 5ψ1

T ψ1σμν ψ2 ! �ψ2σμν ψ1

(F.4)

C P T W (t, x ) ! (�t,�x )(c-number) ! (c-number)�

h f W outjH ji W ini D h f W outjΘ�1Θ H Θ�1Θ ji W iniD hΘ f W injH jΘ i W outi�

Θ jq, p , si D ηΘ jq, p ,�siψ(t, x ) ! ˙i γ0γ 5ψT(t,�x )ψ(t, x ) ! ˙ψT(t,�x )i γ0γ 5

S ψ1ψ2 ! ψ2ψ1

P i ψ1γ 5ψ2 ! i ψ2γ 5ψ1

V ψ1γ μ ψ2 ! �ψ2γ μ ψ1

A ψ1γ μ γ 5ψ2 ! �ψ2γ μ γ 5ψ1

T ψ1σμν ψ2 ! ψ2σμν ψ1

(F.5)

Page 866: Yorikiyo Nagashima - Archive

841

Appendix GSU(3), SU(n) and the Quark Model1)

G.1Generators of the Group

The set of all the linear transformations that preserve the length L DqPN

iD1 ju i j2

in the N-dimensional coordinate space of N complex variables u i (i D 1 � N ) iscalled U(N ). With the further restriction of the determinant being 1, it is calledS U(N ), a special unitary group. The number of independent variables is N 2 forU(N ) and N 2�1 for S U(N ). Any representation matrix of the S U(3) group operat-ing on n-dimensional vectors can be expressed as n � n unitary matrices, which isa function of N 2 � 1 independent continuous variables θi and traceless hermitianmatrices Fi

U D exp

0@i

N2�1XiD1

θi Fi

1A , F †

i D Fi , Tr[Fi ] D 0 , (G.1)

The traceless condition makes the determinant of the matrix equal to one. It canbe proved as follows. If we write U D e i Q, Q is a hermitian matrix. Therefore, itcan be diagonalized by using some unitary matrix S�1QS D diagfQg. Then

det U D det[S�1 exp(i Q)S ]D det�X

S�1 (i Q)n

n!S�

D det�X (S�1 i QS )n

n!

�D det[diagfe i Qg] D

Yi

e i η i

D exp

iX

i

η i

!D e iTr[Q ] (D 1 if Tr[Q ] D 0)

(G.2)

where η i ’s are eigenvalues of the diagonalized matrix Q. N 2 � 1 Fi ’s satisfy com-mutation relations called Lie algebra:

[Fi , F j ] D i f i j k Fk (G.3)

1) References are [89, 101, 156, 165]

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 867: Yorikiyo Nagashima - Archive

842 Appendix G S U(3), S U(n) and the Quark Model

The Fi ’s are called the generators of the group and f i j k are structure constants.Since all the representation matrices are analytic functions smoothly connected tothe unit matrix in the limit of θi ! 0, the Lie algebra Eq. (G.3) completely deter-mines the local structure of the group.2) f i j k can be made totally antisymmetricin the indices i j k. Determination of the generator is not unique, because differentF 0

i can be made of from any independent linear combination of Fi .3)

Let us write ψ as representing an n-dimensional (n � N ) contravariant vector(with upper index) on which the n� n S U(N ) matrices act. Then ψ is transformedto ψ0 by U.

ψ D

26664

� 1

� 2

...� n

37775 , ψ0 D

266664

� 10

� 20

...� n 0

377775 , n � N (G.5a)

ψ ! ψ0 D U ψ (G.5b)

� a 0D U a

b � b , U D e�i θi Fi (G.5c)

where simultaneous appearance of indices means sum over the indices up to n.The collection of n matrices with n D N is called a fundamental representationand adjoint (or regular) representation if n D N 2 � 1.

G.1.1Adjoint Representation

The representation matrices with n D N 2 � 1 can be expressed in terms of thestructure constants.

U(N 2 � 1) D [Fa ]bc D �i f abc , (a, b, c D 1 � N 2 � 1) (G.6)

Proof: Insert A D [Fa ], B D [Fb ], C D [Fc ] in the Jacobi identities

[[A, B ], C ]C [[B, C ], A]C [[C, A], B ] D 0 (G.7)

use Eq. (G.3) and f abc D � f bac, then we obtain

f abd f cd e C f bcd f ad e C f cad f bd e D 0 (G.8)

Then insertion of Eq. (G.6) into the above equality leads to

[Fa , Fc] D i f acd Fd (G.9)

Therefore, Fa defined in Eq. (G.6) is an adjoint representation of the S U(N ). �

2) An example of global structure is“connectedness”. For example, S U(2) andO(3) have the same structure constants,but S U(2) is simply connected and O(3) isdoubly connected and there is a one-to-twocorrespondence between the two.

3) For instance, instead of angular momentumJi , we can use J˙ D Jx ˙ Jy , which satisfies

[ JC, J�] D 2 J3 , [ J3, J˙ ]D ˙ J˙ (G.4)

Page 868: Yorikiyo Nagashima - Archive

G.1 Generators of the Group 843

If the Lagrangian or the equation of motion is invariant under S U(N ) transfor-mation, Fk is a conserved quantity. The number of matrices that can be diagonal-ized simultaneously is called the rank of the group. The rank of S U(N ) is N � 1.Matrices that are made of Fk and commute with all Fk ’s are called Casimir opera-tors. Particles which can be identified as the representation vector of a group, canbe specified and classified by the eigenvalues of the Casimir operators and Fi ’s thatcan be diagonalized simultaneously.

Representation matrix of the complex conjugate vector ψ� is given by U�. Sinceits generator �F�

k satisfies the same commutation relation Eq. (G.3) also, ψ� is al-so a representation vector and is called a conjugate representation. If ψ representsa particle, then ψ� represents its antiparticle. Therefore, the quantum numbersof antiparticles are minus those of particles. If we define a covariant vector (withlower indices) by

ψ† D (ψ�)T D (� 1 �, � 2 �, � � � , � N �) � (�1, �2, � � � , �N ) (G.10)

It transforms as

�a ! �a0 D

XU a�

b � b�DX

Uab��b D

X�b Uba

† � (U†)ba �b (G.11a)

or ψ† ! ψ†0D ψ†U† (G.11b)

If a representation matrix U(A) of any element A can be expressed as

U(A) D

266666664

U1(A) 0 0

0 U2(A) 0

0 0. . .

377777775

(G.12)

It is said reducible and written as

U D U1 ˚ U2 ˚ � � � ˚ Uk (G.13)

G.1.2Direct Product

When there are two representations U(A), U(B) of dimensionality nA, nB whichhave elements A D fA j g, B D fBkg. Product of basic elements � j , ηk of eachrepresentation � j ηk spans (nA � nB )-dimensional space. The representation inthis space U(AB) is expressed as

U(AB)(� j ηk ) D � j0ηk

0 D (U(A)� j )(U(B)ηk) (G.14)

and is called a direct product representation. It is written as

U(AB) D U(A)˝ U(B) (G.15)

Page 869: Yorikiyo Nagashima - Archive

844 Appendix G S U(3), S U(n) and the Quark Model

In general the direct product is reducible.

Example G.1

Each representation of the angular momentum J D Ja and J D Jb has 2 Ja C 1and 2 Jb C 1 elements. The representation of the direct product Ja ˝ Jb has (2 Ja C

1)(2 Jb C 1) elements. It is reducible and decomposed to J D J1 ˚ J2 ˚ � � � ˚ Jn

where J1 D Ja C Jb , J2 D Ja C Jb � 1, � � � , Jn D j Ja � Jb j.

G.2SU(3)

G.2.1Structure Constants

It is customary to adopt Pauli matrices for SU(2) (Fi D σ i/2) and Gell-Mann’s λmatrices for SU(3) (Fi D λ i/2). Although S U(3) itself is a general mathematicalframework, it will be convenient for us to identify the three light quarks q i D

(u, d, s) as the S U(3) fundamental representation state vector and express them as

u D � 1 D

241

00

35 , d D � 2 D

240

10

35 , s D � 3 D

240

01

35 (G.16)

Since u $ d [D S U(2) transformation] is a part of S U(3) transformation, we de-fine λ i which operates only on u and d.

λ i D

24 τ i

00

0 0 0

35 , i D 1 � 3 (G.17a)

Then λ i ’s are isospin operators. In a similar manner, we define S U(2) operators“V spin” that act only on u$ s, and “U spin” that act only on d $ s.

λ4 D

240 0 1

0 0 01 0 0

35 , λ5 D

240 0 �i

0 0 0i 0 0

35 ,

V3 D

241/2 0 0

0 0 00 0 �1/2

35

λ6 D

240 0 0

0 0 10 1 0

35 , λ7 D

240 0 0

0 0 �i0 i 0

35 ,

U3 D

240 0 0

0 1/2 00 0 �1/2

35

(G.17b)

Page 870: Yorikiyo Nagashima - Archive

G.2 SU(3) 845

We have defined nine matrices, but

I3 � V3 C U3 D 0 (G.17c)

and are not independent. We define

λ8 �p

3Y D2p

3(U3 C V3) D

1p

3

241 0 0

0 1 00 0 �2

35 (G.17d)

The eigenvalue of the λ8 is called hypercharge and is denoted as “Y”. The numberof diagonal matrices is two, which is the rank of S U(3). The eight λ i satisfy thefollowing commutation relations�

λ I

2,

λ j

2

�D i f i j k

λk

2(G.18a)�

λ I

2,

λ j

2

D

13

δ i j C i di j kλk

2(G.18b)

f i j k , di j k are totally antisymmetric or symmetric with respect to i j k permutationsand have values listed in Table G.1

Table G.1 Structure constants of S U(3).

f123 D 1

f147 D f246 D f257 D f345 D f516 D f637 D12

f458 D f678 Dp

32

d118 D d228 D d338 D �d888 D1p

3

d146 D d157 D d256 D d344 D d355 D12

d247 D d366 D d377 D �12

d448 D d558 D d668 D d778 D �1

2p

3

Some Definitions of Terminology Let Fa(n) be an n-dimensional representationmatrix,

Tr[Fa(n)Fb(n)] D TR (n)δab (G.19a)

N2�1XaD1

Fa(n)2 D C2(n)1 (G.19b)

Page 871: Yorikiyo Nagashima - Archive

846 Appendix G S U(3), S U(n) and the Quark Model

By convention, the normalization of S U(N ) matrix is chosen to be TF � TR (n DN ) D 1/2. With this choice S U(N ) matrices satisfy relations

TF � TR (NF ) D12

, TR (NA) D N (G.20a)

CF � C2(NF ) DN 2 � 1

2N, CA � C2(NA) D N (G.20b)

where suffix F denotes fundamental (n D NF D N ) and A denotes adjoint (n DNA D N 2 � 1). A general method to obtain C2(n) is given in Sect. G.2.3.

V, U Spin Defining raising or lowering operators of I, V, U spins by

I˙ Dλ1 ˙ λ2

2, V˙ D

λ4 ˙ λ5

2, U˙ D

λ6 ˙ λ7

22I3 D [IC, I�] , 2V3 D [VC, V�] , 2U3 D [UC, U�]

(G.21)

In the quark model, q D (u, d, s) has quantum numbers

I3(q) D (1/2, �1/2, 0)

V3(q) D (1/2, 0, �1/2)

U3(q) D (0, 1/2, �1/2)

Y(q) D (1/3, 1/3, �2/3)

(G.22)

If we place the three quarks in a I3–Y plane and represent each q as a vector fromthe origin (referred to as a weight diagram), when we make representations ofdirect products of the state vectors, multiplication of u means an increase of I3, Yby 1/2, 1/3, respectively, which translates to an addition of the u-vector in the I3–Y plane. Similarly, the multiplication of d, s means the addition of d, s vectors.Actions of I˙, V˙, U˙ are depicted in Figure G.2. For instance the action of ICchanges d to u, which means moving a point in the I3–Y plane by 1 unit to the right.Conversely, I� changes u to d and moves a point by 1 unit to the left. Similarly, theaction of V˙ is to change s(u) to u(s) or to move a point diagonally, as is depictedin Fig. G.2a.

G.2.2Irreducible Representation of a Direct Product

a. Meson Octet Let us consider a composite system, spin 0 mesons, made of quarksand antiquarks. In the terminology of S U(3) group representation, we are makingdirect products of qi and q j , or Φ i

j D qi q j . By a unitary transformation

q ! q0 D U q , q ! q0 D qU†

or [Φ ] � [Φ ]ij ! Φ 0 D U Φ U†

(G.23)

which means a product Tr(Φ ) D qq D uu C dd C s s is invariant under S U(3)transformation.

) η0 �1p

3(uuC dd C ss) (G.24)

Page 872: Yorikiyo Nagashima - Archive

G.2 SU(3) 847

I3

-1 -1/2 1/2 1

Y

1/3

1

-1

-2/3

I

-1 -1/2 1/2 1

Y

2/3

1

-1

-1/3

ud

s u

s

_

_

_0

0

(a)

d

3 3*(b)

Figure G.1 (a) (u, d, s) make a triplet 3 and have (I3, Y ) D (1/2, 1/3), (�1/2, 1/3), (0,�2/3).They are expressed as vectors from the origin. (b) (u, d , s) make 3�. Vectors (u, d , s) have oppo-site quantum numbers to those of (u, d, s).

I3

Y

I3

Y

ud

su d

s

_

_

_

(a) (b)

V+V-

I+

I-I-I+

V+V-

U+

U+

U-U-

Figure G.2 I˙, V˙, U˙ move the state in the direction the arrows indicate.

is a vector that transforms to itself and hence makes singlet. The remaining eightvectors mix together by the transformation and cannot be separated further andhence makes an multiplet of 8 members called octet. They are expressed as

P ij D qi q j �

13

Xqi qi D Φ i

j �13

Tr[Φ ij ] (G.25)

The conclusion of the above discussion is that the direct representation made ofthree q’s and three q0s can be decomposed into groups of one and eight members.We express it as

3˝ 3� D 1˚ 8 (G.26)

Let us derive the same equation from a slightly different point of view. Base vectorsof the direct product 3 ˝ 3� are nine vectors, which are obtained by multiplyingu, d, s in 3 and u, d, s in 3�. Take the example of multiplying u by d . As u has(I3, Y ) D (1/2, 1/3) and d, (1/2,�1/3), the product ud has (I3, Y ) D (1, 0). In

Page 873: Yorikiyo Nagashima - Archive

848 Appendix G S U(3), S U(n) and the Quark Model

the I3–Y plane, it is obtained by summing the two vectors u and d, as shown inFig. G.3a. Other operations are similar. To repeat, first start from the origin. Weplace three vectors of (u, d, s) as in Fig. G.3b. Then add another three vectors ofu, d, s from each of the three apexes, and we obtain a total of nine states. Threeof them, uu, dd, s s, come back to where they started, i.e. there is a triple pointat I3 D Y D 0. One combination of them η0 D Tr[Φ ]/

p3 forms a singlet, as

we already explained (or according to a general rule Sect. G.2.2.c) and should beseparated. The result is 3˝ 3� D 1˚ 8 as depicted in Fig. G.3c.b. Baryon Octet and Decuplet Baryons are made of three quarks. We first make two-quark composites made of nine product vectors T i j D � i � j , and see their trans-formation properties. They are, in general, reducible, but we can separate theminto symmetric and antisymmetric parts:

T i j D12

(� i � j � � j � i)C12

(� i � j C � j � i ) D Ai j C S i j (G.27)

As the permutation symmetry does not change by the transformation, Ai j and S i j

transform to themselves. The nine product vectors have been decomposed into athree-dimensional antisymmetric group and a six-dimensional symmetric group.Namely

3˝ 3 D 3� ˚ 6 (G.28)

In the vectorial representation, they are expressed as Figure G.4a. That the threeAi j belongs to 3� and not to 3 can be easily seen by looking at their quantum num-

I3

Y

1/3

1

-2/3

-1

usds

du

dd

ud

sdsu

uu

ss-1 -1/2 1/2 1

(b)

(a)

I3

-1/2 1/2 1

Y

1/3 u d

0

ud (I =1,Y=0)

-1/3

(c) 3 X 3*= 8 + 1

Figure G.3 (a) As u, d has (I3, Y ) D (1/2, 1/3), (1/2,�1/3) ud can be obtained as the vectorsum of the two states and has (1, 0) in the I3–Y plane. (b) Applying the same operation to allcombinations of (u, d, s) � (u, d , s), a total of nine states are obtained. (c) 3˝ 3� D 8˚ 1.

Page 874: Yorikiyo Nagashima - Archive

G.2 SU(3) 849

(a)

3 X 3 = 6 + 3*

dd du ud uu

ds us

ss

(b)

6 X 3 = 10 + 8

ddd ddu uud uuu

dds dus uus

dss uss

sss

Figure G.4 Baryon multiplets. (a) 3˝ 3 D 6˚ 3�, (b) 6˝ 3 D 10˚ 8.

bers which are the same as (u, d, s). Mathematically, it can be shown as follows.Define

η i � ε i j k A j k (G.29)

then

B D η i � i (G.30)

is invariant under S U(3) transformation, because

BU�! B 0 D η0

i �0 iD ε i j k � 0 i � 0 j � 0 k

D ε i j k U il U j

m U kn � l � m � n

D ε l mn(det U)� l � m � n D ε l mn� l � m � n D B(G.31)

From Eq. (G.31), the vector η i ’s transformation matrix is shown to be (U�1)T D

U�, which means η i belongs to 3�. Multiplying T i j D 3� ˚ 6 by � k in 3 further,we get from the first term of Eq. (G.28) (Ai j � k )

3� ˝ 3 D 1˚ 8 (G.32)

just like we obtained meson singlet and octet. But unlike the meson case, the sin-glet term in Eq. (G.32) is a totally antisymmetric combination of � i ’s made fromAi j � k and the remaining 8 terms constitute an octet. Product vectors of the secondterm of S i j and � k can be separated to totally symmetric part of which there are10 members and the remaining 8 members which form another octet. Namely

6˝ 3 D 10˚ 8 (G.33)

To summarize, we have derived

3˝ 3˝ 3 D 1˚ 8˚ 8˚ 10

symmetry property W A MA MS S(G.34)

Page 875: Yorikiyo Nagashima - Archive

850 Appendix G S U(3), S U(n) and the Quark Model

Here, the singlet 1 consists of a totally antisymmetric and the decuplet 10 of totallysymmetric combinations of � i 0

s. The two octets 8 occupy the same position inI3 � Y plane, but they have different permutation symmetry. As we have seen,one is symmetric (MS D mixed symmetry) and the other is antisymmetric (MA Dmixed antisymmetric) with respect to the first two indices. Weight diagrams ofthese multiplets are depicted in Figure G.4.c. General Rules of the Irreducible Weight Diagrams A way to decompose a prod-uct representation into a sum of irreducible representations is given by using theYoung diagram (Sect. G.2.4). There is a theorem concerning the structure of irre-ducible weight diagrams.

Theorem G.1

Any weight diagram of irreducible representations must obey the following rules(proof omitted).

1. A weight diagram makes layers of convex polygons, generally irregularhexagons, with triangles in a special case (Figure G.4 and Figure G.5).

2. Multiplicity of points on the sides and apexes of the outermost layer is one.3. Multiplicity of those points on the second layer is two and increases by one as

we go into inner layers.4. When the polygon of the inner layers becomes a triangle, the multiplicity stops

to increase and stay at the same value no matter how deep we go into innerlayers. As a special case, the multiplicity of all the points on triangle weightdiagram is one.

d. Baryon Wave Functions To construct wave functions of 1, 8MA, 8MS, 10S, let usfirst consider those without the s-quark. Since, u, d have isospin 1/2, the wavefunctions can be constructed using the Clebsch–Gordan coefficients of S U(2). As1/2˝ 1/2 D 0˚ 1 with wave functions

I D 0 W j0, 0i D1p

2(ud � du) , I D 1 W

8<ˆ:j1,C1i D uu

j1, 0i D 1p2(ud C du)

j1,�2i D dd

(G.35)

They have the desired property being antisymmetric and symmetric with exchangeof the first pair of quarks. Adding the third quark to the wave functions, we canmake I D 1/2 from the first I D 0 and I D 1/2 and 3/2 from the second I D 1.Using Clebsch–Gordan coefficients we obtain

I D 1/2M A W

8<:j1/2,C1/2i D 1p

2(ud � du)u

j1/2,�1/2i D 1p2(ud � du)d

(G.36a)

I D 1/2M S W

8<:j1/2,C1/2i D 1p

6f2uud � (ud C du)ug

j1/2,�1/2i D � 1p6f2ddu � (ud C du)dg

(G.36b)

Page 876: Yorikiyo Nagashima - Archive

G.2 SU(3) 851

I D 3/2S W

8ˆ<ˆ:

j3/2,C3/2i D uuu

j3/2,C1/2i D 1p3(uud C uduC duu)

j3/2,�1/2i D 1p3(udd C dud C ddu)

j3/2,�3/2i D ddd

(G.36c)

There is no totally antisymmetric combination. It can only be made using all threeu, d, s. Obviously, I D 1/2M A , I D 1/2M S , I D 3/2 wave functions are membersof 8MA, 8MS, 10S. Since totally symmetric wave functions of 10 can be made easily,we list other wave functions in Table G.2.

There are some ambiguity in constructing the neutral (I3 D Y D 0) wave func-tions. As the isospin is a good quantum number, we make Σ 0 from Σ C by applyingI� and Λ0 by making orthogonal to Σ 0.

G.2.3Tensor Analysis

a. Dimension of an Irreducible Representation A general product vector or a tensorwhich is a product of contra-variant vectors � i and q covariant vectors � j is written

as T (p )(q) D T i j k ���

l mn��� and transforms like

�T i j ���

l m����0D U i

aU jb � � �U

† cl U† d

m � � � Tab���cd ��� (G.37)

It is called a contra-variant tensor of rank p if q D 0, and a covariant tensor of rankq if p D 0 and a mixed tensor if p , q ¤ 0. We write the whole collection of thetensors as (p , q). The weight diagram of (p , q) is a irregular hexagon which has

Table G.2 Baryon wave functions with mixed symmetry φM A , φM S and total antisymmetry φA .

φM A φM S

p 1p2

(ud � du)u 1p6f2uud � (ud C du)ug

n 1p2

(ud � du)d � 1p6f2ddu � (ud C du)dg

Σ C 1p2

(us � su)u 1p6f2uus � (us C su)ug

Σ 0 12 f(usd C dsu) � s(ud C du)g 1p

12f2(ud C du)s � (usd C dsu)

�s(ud C du)gΣ � 1p

2(ds � sd)d 1p

6f2dds � (ds C sd)dg

Λ0 1p12f2(ud � du)s C (usd � dsu) 1

2 f(usd � dsu)C s(ud � du)g

�s(ud � du)g� 0 1p

2(su � us)s 1p

6f2ssu � (us C su)sg

� � 1p2

(sd � ds)s 1p6f2ssd � (ds C sd)sg

φA

Λ01

1p6fs(ud � du)C (dsu � usd)C (ud � du)sg

Page 877: Yorikiyo Nagashima - Archive

852 Appendix G S U(3), S U(n) and the Quark Model

upper and lower sides of length p and q (see Figure G.5). If p or q is 0, then thehexagon becomes a triangle.

We can prove that δ ij D δ i j is a mixed tensor of p D q D 1, and that ε i j k , ε i j k

are covariant and contra-variant tensors of rank 3. Therefore, the product of a tensorby them is also a tensor. In general, the tensor (p, q) in Eq. (G.37) is reducible. Bymaking contractions with δ i

j , ε i j k , ε i j k we can make tensors of lower rank. Forinstance,

R j k ���bc��� D δa

i T i j k ���abc��� D T i j k ���

i bc��� (G.38a)

S k ���abcd ��� D εai j T i j k ���

bcd ��� (G.38b)

are tensors of (p �1, q�1), (p �2, qC1). If the tensor before contraction is alreadytraceless or symmetrized, the above operation gives zero result which means it isno longer reducible. Therefore, a tensor belonging to an irreducible representationis symmetrized with respect to upper or lower indices and traceless. Therefore, itsatisfies a condition

T i j ���i l m��� D 0 (G.39)

As the upper and lower indices are already symmetrized, all we need is to takea trace of the first index. Expressing the representation of this tensor space asD N (p , q), where N denotes the number of independent elements and is given by

N D12

(p C 1)(q C 1)(p C q C 2) (G.40)

Proof: We arrange upper indices like

11 � � �„ƒ‚…a

222 � � �„ƒ‚…b

33 � � �

„ ƒ‚ …p

(G.41)

V+

I+

U-A

BCφ1

φ2

φΜ

φmax

p

p

q

q

I3

Y

Figure G.5 The weight diagram of (p , q) is airregular hexagon which has upper and lowersides of length p and q. Multiplicity of pointsat periphery is one, and increases by one asone goes one step inside until one reachesa triangle. From there it stays constant. For

multiple points, the configuration of the wavefunction is not unique and depends on thepath one chooses. For a single multiplicitypoint, the configuration is unique. For in-stance, the same φ2 can be obtained from φ1through path A! B or C.

Page 878: Yorikiyo Nagashima - Archive

G.2 SU(3) 853

For a (0 a p ), b can take 0 � p � a positions which means (p � aC 1) waysto choose, giving Nu ways to arrange upper indices.

Nu D

pXaD0

(p � a C 1) D12

(p C 1)(p C 2) (G.42)

Likewise there are Nd D (qC 1)(qC 2)/2 ways to arrange lower indices, giving atotal of

Nu Nd D14

(p C 1)(p C 2)(q C 1)(q C 2) (G.43)

There are as many traceless conditions Eq. (G.39) as Nu Nd with p , q replacedwith p � 1, q � 1, so the number of independent combinations is

N D14

(p C 1)(p C 2)(q C 1)(q C 2) �14

p (p C 1)q(q C 1)

D12

(p C 1)(q C 1)(p C q C 2)(G.44)

A few examples:

D1(0, 0) D 1 , D3(1, 0) D 3 , D3(0, 1) D 3�,

D8(1, 1) D 8 , D6(2, 0) D 6 , D6(0, 2) D 6�

D10(3, 0) D 10 , D10(0, 3) D 10�, � � �

(G.45)

Next, we will show that the multiplicity of the outermost layer is one. As

I3 D12

[n(u)C n(d)� n(u) � n(d)]

Y D13

[n(u)C n(d)� 2n(s) � n(u) � n(d)C 2n(s)]

p D n(u)C n(d)C n(s)

q D n(u)C n(d)C n(s)

(G.46)

the value of n(q) is uniquely defined when one specifies p (or q), I3 and Y if q (or p)D 0.b. Value of the Casimir Operators How to derive C2 � C2(n) D F2 D

Pk (Fk )2.

First we rewrite the Casimir operator in terms of I, V, U spin operators.

F2 D12fIC, I�g C I 2

3 C12fVC, V�g C

12fUC, U�g C

34

Y 2

D (I� IC C V�VC C UCU�)C I 23 C 2I3 C

34

Y 2(G.47)

As the polygon is always convex in the I3 � Y plane, for the element φmax with thehighest value of I3

ICφmax D VCφmax D U�φmax D 0 (G.48)

Page 879: Yorikiyo Nagashima - Archive

854 Appendix G S U(3), S U(n) and the Quark Model

Therefore, if we know the values of I3 and Y of the φmax,

hF2i D hI3i2 C 2hI3i C

34

Y 2 (G.49)

We can obtain the maximum value of I3 by assuming all the p quarks are u and allthe q antiquarks are d, which gives I3 D (p C q)/2. Then Y D (p � q)/3, giving

hF2i D13

(p 2 C p q C q2)C (p C q) (G.50)

Some examples are:

C2(1) D hF2i1 D (p D q D 0) D 0C2(3) D hF2i3 D (p D 1, q D 0) D hF2i3� D 4

3

C2(6) D hF2i6 D (p D 2, q D 0) D 103

C2(8) D hF2i8 D (p D q D 1) D 3C2(10) D hF2i10 D (p D 3, q D 0) D 6

(G.51)

G.2.4Young Diagram

We describe here, a convenient method to decompose a S U(N ) product represen-tation to a sum of irreducible representations without proof.

G.2.4.1 Rules for Making Young Diagrams

1. Young diagram makes a correspondence between an N-dimensional funda-mental representation with a box, and its complex representation N� with acolumn of N � 1 boxes.

N D N D �

9>=>; (N � 1 boxes)

(G.52)

For example

SU(2) W D 2 or 2�

SU(3) W D 3 , D 3�(G.53)

2. Any irreducible representation is expressed by columns and rows of n boxes.The arrangement of the n boxes has to be convex toward the upper left. It meansif the number of boxes in the first row is λ1, that of the second row λ2, � � � andthat of the last row λm , then

λ1 � λ2 � � � � � λm , λ1 C λ2 C � � � C λm D n (G.54)

Page 880: Yorikiyo Nagashima - Archive

G.2 SU(3) 855

Allowed arrangements Forbidden arrangements

3. Any permutation of boxes in a row is meant to be symmetric and that in acolumn to be antisymmetric. Because of this property, the maximum size ofthe column is limited to N � 1, because in S U(N ) the maximum number ofelements that are antisymmetric is N and there is only one combination tomake N elements antisymmetric. If a column of m (� N ) boxes does not exist,the dimension is the same, so we may omit it. Pictorial examples are, for N D 4

N

8<: D �

One-dimensionalrepresentation

D

(G.55)

4. For any representation, there is a conjugate representation. Combined together,they make a N � N representation. For instance, for S U(3)

(a) (b)�

�����

� (c)

D (d) D (a)

(G.56)

As we can see from (b), (c) is conjugate to (a), but applying the conditionEq. (G.55), (c)D(d), which coincides with (a), in other words, 8 D 8� in S U(3).

G.2.4.2 Decomposition of Product Representations of A and B

1. Insert 1 in all the boxes of the first row of the representation B, 2 in all the boxesof the second row, and continue to m in the m-th (last) row.

2. Starting from the first row of B, pick any number of boxes and attach them toA in all possible ways subject to conditionsi. Keep the condition “convex toward the upper left”.ii. In any row, the number in any box must not decrease from left to right.iii. In any column, all the boxes must have a different number and it must

increase downwards.iv. Draw a line starting from the box at the extreme right to the first box on

the first row. Move to the second row and enter the box again at the extremeright and continue, until the line passes all the boxes. At any box on the line,count the number (ni ) of boxes on the path which contains the number i.The rule to obey is ni � niC1.

Page 881: Yorikiyo Nagashima - Archive

856 Appendix G S U(3), S U(n) and the Quark Model

Example G.2

Correct prescription Incorrect prescription

˝21 = 2

21 2

121

12 21

(a) (b) (c) (d)

Figure G.6 Diagrams (a) and (b) do not satisfy condition (i). Diagram (c) does not satisfy con-dition (ii) and (d) does not satisfy (iv), because on a line entering the first row from right andstopping at the first box, n1(D 0) < n2(D 1).

Example G.3

˝1 12 D

1 12 ˚

1 1

2

˚1

1 2 ˚1

21

˚

1

12

˚ 1

1 2

D ˚ ˚ ˚ ˚ ˚ �

) 8˝ 8 D 27˚ 10˚ 10� ˚ 8˚ 8˚ 1 (G.57)

There are two Young diagrams corresponding to the same 8, but their permutationproperties are different, hence they correspond to different representations.

G.2.4.3 Dimension of the Irreducible RepresentationThe dimension of a representation is given by

D DFH

(G.58)

Calculation of F: Insert N in the box at the upper left corner. Increase the numberby one as you move to the right (N C 1, N C 2, . . .). Decrease it by one as yougo downward, for instance, insert N � 1 in the first box in the second row. SeeFig. G.7a. D is given as product of all the numbers in the boxes.

N N+1 N+2 N+3

N-1 N N+1

N-2 N-1 N

N-3

� � ���

h1 D 4h2 D 3h3 D 1h4 D 2h5 D 1h5 D 1

H=h1 � h2 � h3 � h4 � h5

F=N(N+1)(N+2)(N+3)�(N-1)N(N+1)�(N-2)(N-1)N�(N-3)

D=F/H

Figure G.7 The dimension of an irreducible representation using the Young diagram is given byF/H .

Page 882: Yorikiyo Nagashima - Archive

G.2 SU(3) 857

Calculation of H: Draw a line in any row starting from the extreme right, bend itat any box with a right angle downward and exit from below. This is called a hook.Let h be the number of boxes in the path of the hook; H is given as the product ofall possible h’s (Fig. G.7b).

Example G.4

Dimension of in S U(3) is given in Figure G.8.

3 4

2

��

h1 D 3h2 D 1

h3 D 1

H=3 � 1 � 1F=3 � 4 � 2

D=F/H=8

Figure G.8 An example of S U(3) representation. F D 3 � 4 � 2 D 24, H D 3 � 1 � 1 D 3.Therefore, D D F/H D 8.

Example G.5

SU(3):3� ˝ 3D 1˚ 8 3˝ 3D 6˚ 3� 6˝ 3D 10˚ 8

˝ = ˚ ˝ = ˚ ˝ = ˚

SU(4): Hadron multiplets that are composed of four quarks are easily constructedusing the Young diagram.

Mesons W 4˝ 4� D 1˚ 15Baryons W 4˝ 4˝ 4 D 4˚ 20˚ 20˚ 20Symmetry A MS MA S

(G.59)

The symmetry property is the same as that of S U(3). The reason what was a singletin S U(3) now is a quartet, is there are four ways to make a totally antisymmetricthree-quark wave function out of four quarks. An octet and decuplet in S U(3) havebecome 20 adding baryons containing charm quarks.SU(5)

5˝ 5� D 1˚ 255˝ 5 D 10˚ 15

A S5� ˝ 10 D 5˚ 455 ˝ 10 D 40˚ 40 ˚ 10�

10 ˝ 10 D 5� ˚ 45� ˚ 5010� ˝ 10 D 1˚ 24 ˚ 75

(G.60)

SU(6)

Mesons W 6˝ 6� D 1˚ 35Baryons W 6˝ 6˝ 6 D 20˚ 70˚ 70˚ 56Symmetry A MS MA S

(G.61)

Page 883: Yorikiyo Nagashima - Archive
Page 884: Yorikiyo Nagashima - Archive

859

Appendix HMass Matrix and Decaying States

H.1The Decay Formalism

We describe here the time evolution of unstable particles in the Weisskopf–Wignerapproximation [385, 386]. The arguments here closely follow those of [220, 283].We consider a system that is described by a Hamiltonian

H D H0 C HW (H.1)

where H0 is the Hamiltonian of the strong interaction and free part of the weakinteraction while HW is that of the weak interaction that induces transition andis assumed to be a small perturbation to H0. The eigenstate of H0 consists of idegenerate discrete states jA j i and a series of continuum states jBni to which Acan decay weakly. Since we want to apply the formalism to the neutral K mesonsystem, we set i D 2, jA 1(2)i D jK0( NK0)i and jBni D 2π, 3π, π˙`�ν, � � � .

H0jA j i D E0jA j i

H0jBni D EnjBni(H.2)

As an initial state at t D 0 we consider a superposition of the discrete states jAi.

jt D 0i D a1(0)jA 1i C a2(0)jA 2i (H.3)

The time evolution of the state vector jti obeys the Schrödinger equation.

idd tjtiS D (H0 C HW)jtiS (H.4)

where the subscript S denotes that the states are those in the Schrödinger picture.They have time dependence jtiS D e�i E tjt D 0iS. In the following discussions, itis more convenient to work in the interaction picture:

jti � jtiI D e i H0 t jtiS (H.5)

For simplicity, we omit the subscript I for the interaction picture. The state jti obeysthe equation

idd tjti D HW(t)jti , HW(t) D e i H0 t HW e�i H0 t (H.6)

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 885: Yorikiyo Nagashima - Archive

860 Appendix H Mass Matrix and Decaying States

We write

jt D 0i D jt D 0iS

jti D a1(t)jA 1i C a2(t)jA 2i CX

n

bn(t)jBni(H.7)

Inserting Eq. (H.7) in Eq. (H.6) we obtain

i@a j (t)

@t

ˇˇ

j D1,2D

XkD1,2

hA j jHWjA kiak (t)

CX

n

e i(E0�En )thA j jHWjBnibn(t) (H.8a)

i@bm (t)

@tD

XkD1,2

e i(Em�E0)thBmjHWjA kiak (t)

CX

n

e i(Em�En )thBmjHWjBnibn(t) (H.8b)

Equations (H.8) are exact. We impose initial conditions such that

a1(0) D a1, a2(0) D a2

bm(0) D 0(H.9)

Weisskopf–Wigner’s first approximation (1) is that with the initial conditions weimposed the weak interactions in the final states are negligible, which is equiva-lent to discarding the second term in Eq. (H.8b). This means we regard pions andmuons, which are the decay products of the K mesons, as stable. With this approx-imation, Eqs. (H.8) can be solved. We obtain

bm(t) D �iX

kD1,2

Z t

0d t0e i(Em�E0)t 0

hmjHWjkiak (t0) (H.10a)

a j (t) D a j � iX

kD1,2

Z t

0d t0h j jHWjkiak (t0)

�Xn, k

Z t

0d t0

Z t 0

0d t00e i(E0�En )(t 0�t 00)h j jHWjnihnjHWjkiak (t00)

(H.10b)

where we abbreviated jA ki, jBni simply as jki, jni. Unlike Eqs. (H.8), Eqs. (H.10)are no longer mutually coupled. They can be solved by means of a Laplace trans-form. Define a new variable

Qa j (s) DZ 1

0d t e�s t a j (t) (H.11)

Page 886: Yorikiyo Nagashima - Archive

H.1 The Decay Formalism 861

insert it in Eq. (H.10b), and we obtain

Qa j (s) Da j

s�

is

Xk

W j k (s) Qak (H.12a)

W j k (s) D h j jHWjki CX

n

h j jHWjnihnjHWjkiE0 � En C i s

(H.12b)

Using matrix notation

a(0) D�

a1

a2

�, a(t) D

�a1(t)a2(t)

�, Qa(s) D

�Qa1(s)Qa2(s)

�W (s) D [W j k (s)]

(H.13)

Equations (H.12) can be cast in a form

[s C i W (s)] Qa(s) D a(0) (H.14a)

) Qa(s) D [s C i W (s)]�1a(0) (H.14b)

Then the inverse Laplace transform may be applied to yield

a(t) D1

2π i

Z s0Ci1

s0�i1d ses t Qa(s) D

12π i

Z s0Ci1

s0�i1d s

es t

[s C i W (s)]a(0)

D1

2π i

Z C1

�1d y

e(i yCs0)t

[y � i s0 C W (i y C s0)]a(0)

(H.15)

The value of s0 has to be chosen in such a way that the integration path in the com-plex s-plane lies to the right of all singularities of [s C i W (s)]�1. From Eq. (H.12b)we see that W is analytic except at those points where

E0 � En C i s D 0 (H.16)

which are on the imaginary axis. Therefore we can set s0 > 0 and infinitesimallysmall and rewrite it as �. Then the question is from where does the main contribu-tion to the integral come from?

For HW D 0 we have W (s) D 0 and the singularity of the integrand is a pole ats D 0. In other words, the main contribution comes from the vicinity of s D 0. Inthis case we obtain

a(t) D a for t � 0 (H.17)

as is necessary, since the states jA ki are stable when HW D 0. When it is switchedon, this pole moves but for small perturbation, it will stay close to the imaginaryaxis. Moreover, W (s) is expected to vary appreciably only when s changes by anamount of � E0 � En , i.e. over a range of energy determined by the strong inter-action H0. The second approximation (2) of the Weisskopf–Wigner approach is to

Page 887: Yorikiyo Nagashima - Archive

862 Appendix H Mass Matrix and Decaying States

replace W (s) in the vicinity of s D 0, Res > 0, by a constant

W (s)! W � lim�!0

W (�)

W j k D h j jHWjki CX

n

h j jHWjnihnjHWjkiE0 � En C i�

(H.18)

This is an approximation valid for time t very long compared with strong interac-tion scale. Then using the residue theorem to evaluate Eq. (H.15), we obtain

a(t) D1

2π i

Id s

es t

s C i Wa(0) D e�i W t a(0) t > 0 (H.19)

Returning to the Schrödinger picture and denoting Φ D e�i H0 t a, it satisfies theSchrödinger equation with the Hamiltonian replaced with the mass matrix

idd t

Φ (t) D ΛΦ (t) (H.20)

where the mass matrix Λ is defined by

Λ D H0 C W � M �i2

Γ (H.21a)

M D12

(Λ C Λ†) , Γ D i(Λ � Λ†) (H.21b)

Both M and Γ are hermitian but Λ is not. The matrix elements are expressed as

M j k D E0δ j k C h j jHWjki C PX

n

h j jHWjnihnjHWjkiE0 � En

(H.22a)

Γ j k D 2πX

n

δ(E0 � En)h j jHWjnihnjHWjki (H.22b)

where P means principal value. Γ j k are determined solely by matrix elements ofHW for energy conserving process, i.e. they are on-shell amplitudes.

Comparing Eqs. (H.22) with Eq. (6.38), we see that the mass matrix is essen-tially an unperturbed strong Hamiltonian plus the weak transition operator i.e.W D OTW. To the first order in weak interaction, it reduces to H0 C HW. If HW isa ΔS D 1 operator, it does not induce mixing between K0 and NK0. It is a secondorder effect as expressed in Eqs. (H.22). The difference between OTW here and thegeneral transition operator OT in Eq. (6.38) is that the former is valid only for timelong compared to strong interaction scale and that only continuum states to whichthe discrete states can decay appear in the intermediate states.

Equation (H.20) looks like a Schrödinger equation but Λ is not hermitian. With-in a sub-space consisting of K0 and NK0 which are degenerate eigenstates of H0, thetransition operator OTW D W gives modifications due to the weak interaction pro-cess. The antihermitian part of OTW describes the exponential decay of the states. Ifall the Γ j k ’s were zero, the system would evolve without decay. But the off-diagonal

Page 888: Yorikiyo Nagashima - Archive

H.1 The Decay Formalism 863

elements of OTW still cause mixing of K0 and NK0 and the eigenstates of the massmatrix are mixture of both. Denoting the eigenstates and eigenvalues as jσi andλσ , the time evolution of the states is given by

jσ(t)i D jσ(0)ie�i λσ t , λσ D mσ � iΓσ

2(H.23)

and represents exponentially decaying states with mean life τ D 1/Γσ .Note that the exponential decay law is only an approximation. If W in Eq. (H.15)

is expanded in Taylor series, inclusion of higher powers of y results in correctionsto the exponential decay law. It can be shown that the approximation fails both forvery short and for very long times [146, 228, 229].

Page 889: Yorikiyo Nagashima - Archive
Page 890: Yorikiyo Nagashima - Archive

865

Appendix IAnswers to the Problems

Chapter 2: Particles and Fields

Problem 2.2: For a static solution @0 D 0. Applying �r2 C μ2 to '(r) DR d3 p

(2π)3

e i p�r Q'(p) and using δ3(r) DR

d3 p ei p�r/(2π)3, we have Q'(p ) D g/(p2 C μ2). Then

'(r) D gZ

d3 p(2π)3

e i p�r

p2 C μ2 D1

(2π)2

Z 1

0p 2d p

Z 1

�1dz

ei p r z

p 2 C μ2

Dg

4πr1

2π i

Z 1

�1e i p r

�1

p � i μC

1p C i μ

�D g

e�μ r

4πr

In deriving the last equation, we have closed the contour through the upper half-circle of the complex p plane.Problem 2.3: rW D „c/mW ' 200 MeV � 10�13 cm/81 GeV ' 2.5 � 10�16 cm.Problem 2.4: Assuming the size of the Crab Nebula is L D 400 ly D 400 � 0.946 �1016 m, mγ < „c/100 ly ' 2 � 10�7 eV m/4 � 1018 m ' 0.5 � 10�25 eV. Assumingan intergalactic distance of 1 Mpc � 3.26 � 106 ly, mγ < 0.4 � 10�29 eV.

Problem 2.8: λ D 4πEΔ m2 D 4π E

„c

„c

jΔ mjc2

�2D 2.4 E(MeV)

δm2(eV)2 (m)

Chapter 3: Lorentz Invariance

Problem 3.1: Applying Lorentz boost to the longitudinal component

x00D γ x0 C γ ( O� � x ) , xk0 D γ x0C γ ( O� � x )

) x 0 D x 0k C x? D O�[γ x0 C γ ( O� � x)]C [x � O�( O� � x )]

D x C �γ x0 C (γ � 1)( O� � x ) O�

Problem 3.2: (3.28e): A relativistic wave equation for a scalar field is given by theKlein–Gordon equation. It has a plane wave solution � e i kμ x μ

, which should bea Lorentz scalar. Then kμ x μ is also a scalar, which in turn means kμ is a Lorentzvector. Another way of saying that kμ x μ is a Lorentz scalar is it is a phase, i.e. apure number which does not depend on the coordinate frame.

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 891: Yorikiyo Nagashima - Archive

866 Appendix I Answers to the Problems

(3.28g): Since (�cI j ) is a vector and @μ@μ is a Lorentz scalar, the Maxwell equa-tion @μ F μν D j μ tells that (Aμ D φ/cIA) must also be a vector.

(3.28h): @0μ D

@

@x μ 0 D@x ν

@x μ 0@

@x ν(3.24b)D M ν

μ @ν

which shows that @μ is a covariant vector.Problem 3.6:

η00 D12

ln1C 00

1 � 00 D12

ln1C ( C 0)/(1C 0)1 � ( C 0)/(1C 0)

D12

ln(1C )(1C 0)(1� )(1� 0)

D η C η0

Problem 3.7(c):

gμν D g�σ L�μ Lσ

ν D g�σ (δ�μ C ��

μ)(δσν C �σ

ν )

' g�σ δ�μ δσ

ν C g�σ δ�μ�σ

ν C g�σ��μ δσ

ν

D gμν C �μν C �νμ , ) �μν D ��νμ

Chapter 4: Dirac Equation

Problem 4.5: ψγ μ ψ D ψ†(1, αx , α y , αz)ψ. eαx ηx /2 D cosh(ηx /2)Cαx sinh(ηx /2).The transformation property of the 0-th component is

ψ0γ 0ψ0 D ψ†[cosh(ηx /2)C αx sinh(ηx /2)][cosh(ηx /2)C αx sinh(ηx /2)]ψ

D ψ†[fcosh2(ηx /2)C sinh2(ηx /2)g C 2 sinh(ηx /2) cosh(ηx /2)αx ]ψ

D ψ†[cosh ηx C sinh ηx αx ]ψ D γ ψγ 0ψ C γ ψγ 1ψ

ψ0γ 1ψ0 D ψ†[cosh(ηx /2)C αx sinh(ηx /2)]αx [cosh(ηx /2)C αx sinh(ηx /2)]ψ

D ψ†[fcosh2(ηx /2)C sinh2(ηx /2)gαx C 2 sinh(ηx /2) cosh(ηx /2)]ψ

D ψ†[cosh ηx αx C sinh ηx ]ψ D γ ψγ 0ψ C γ ψγ 1ψ

ψ0γ 2ψ0 D ψγ 2ψ, ψ0γ 3ψ0 D ψγ 3ψ

which reproduces Eq. (3.62).Problem 4.6:

ur (p )u s(p ) D (� †r

pp � σ, � †

rp

p � σ)�p

p � σ�spp � σ�s

D � †r (p

p � σ p � σ Cp

p � σ p � σ)�s(4.60)D 2m� †

r �s D 2mδ r s

u†r (p )u s(p ) D (� †

rp

p � σ, � †r

pp � σ)

�pp � σ�spp � σ�s

D � †r (p � σ C p � σ)�s

(4.58)D 2E δ r s

Others can be proved similarly.

Page 892: Yorikiyo Nagashima - Archive

Appendix I Answers to the Problems 867

Problem 4.8:

XrD1,2

u r(p )ur (p ) D N�p

p � σ�rpp � σ�r

�N

� †r

pp � σ, � †

rp

p � σ�

D N 2�p

p � σ 00

pp � σ

��1 11 1

� �pp � σ 00

pp � σ

D

"pp � σ p � σ p � σ

p � σp

p � σ p � σ

#

D

�m p � σ

p � σ m

�D 6p C m

Alternative proof: Using ΛC DP

r u r(p )ur (p ) and Eqs. (4.64), one can directlyprove Eqs. (4.67).Problem 4.10:

(a) u(p 0)γ u(p ) : We show for the case of γ 1. In the nonrelativistic limit E DE 0, jp j D jp 0j D p , p 0 D p n0, p D p n.

ur (p 0)γ 1u s(p ) D N

� †r

pp 0 � σ, � †

r

pp 0 � σ

� � 0 σ1

�σ 0

��pp � σ�spp � σ�s

�D N 2� †

r [�p

p 0 � σσ1p

p � σ Cp

p 0 � σσ1p

p � σ]�s

D N 2� †r

" rE C m

2C

rE � m

2σ � n0

!σ1

rE C m

2C

rE � m

2σ � n

!� (� � � � � � � )

#�s

D N 2� †r

�p (σ � n0σ1 C σ1σ � n)

�s

D � †r

�(p 0 C p )C i σ � (p 0 � p )

1 �s

where we have usedp

p � σ Dp

m exp� 1

2 ησ � nD

qECm

2 �

qE�m

2 σ � n(b) u(p 0)u(p ): We keep terms only to lowest order in the 3-momenta. Thus up

to O(p2, p 02)

p D (m , p ) , p 0 D (m , p 0) , p � σ D m � σ � p ,

p � Nσ D m C σ � p (I.1)

u(p 0)u(p ) D � †(p

p 0 � Nσ,p

p 0 � σ)�p

p � σ�pp � σ�

�D � †[

p(p 0 � Nσ)(p � σ)C

p(p 0 � σ)(p � Nσ)]�

D � †[p

(m C σ � p 0)(m � σ � p )Cp

(m � σ � p 0)(m C σ � p )]�

D m� †�

1 �σ � (p � p 0)

2mC 1C

σ � (p � p 0)2m

C O(p2)�

� ' 2m� †s �r

Page 893: Yorikiyo Nagashima - Archive

868 Appendix I Answers to the Problems

(c) hp 0jγ 5jp i: First use the fact that

u(p ) D6pm

u(p ) , u(p 0) D u(p 0)6p 0

m

) u(p 0)γ 5u(p ) D1

2m

�u(p 0) 6p 0γ 5u(p )C u(p 0)γ 5 6p u(p )

D

12m

u(p 0)γ 5( 6p� 6p 0)u(p )

γ 0γ 5( 6p� 6p 0) D�

0 11 0

���1 00 1

� �0 (p � σ � p 0 � σ)

(p � Nσ � p 0 � Nσ) 0

D

�(p � Nσ � p 0 � Nσ) 0

0 �(p � σ � p 0 � σ)

u(p 0)γ 5( 6p� 6p 0)u(p ) D u(p 0)†γ 0γ 5( 6p� 6p 0)u(p )

D � †(p

p 0 � σ,p

p 0 � Nσ)�

(p � Nσ � p 0 � Nσ) 00 �(p � σ � p 0 � σ)

� �pp � σ�pp � σ�

�D � †[

pp 0 � σ(p � Nσ � p 0 � Nσ)

pp � σ �

pp 0 � Nσ(p � σ � p 0 � σ)

pp � Nσ]�

Using (p � Nσ)(p � σ) D (p � σ)(p � Nσ) D m2, we can organize the above equation tomake

D 2m� †[p

(p 0 � σ)(p � Nσ) �p

(p 0 � Nσ)(p � σ)]�

up to here the equations are exact. Using the approximations Eq. (I.1)

u(p 0)γ 5u(p ) D1

2mu(p 0)γ 5( 6p� 6p 0)u(p )

D m� †��

1Cσ � (p � p 0)

2m

��

�1 �

σ � (p � p 0)2m

�C O(p2)

��

' � †s σ � (p � p 0)�r

Problem 4.12: Using Eq. (4.120)

ψL D (N1 C i N2)L D � � σ2η�, ψR D (N1 C i N2)R D i(η � σ2� �)

Then applying Eq. (4.122) and Eq. (4.123), we obtain

i(@0 � σ � r)ψL D i(@0 � σ � r)� � i(@0 � σ � r)(σ2η�)

D mi(η � σ2� �) D mψR

Similarly

i(@0 C σ � r)ψR D mψL

This is the Dirac equation Eq. (4.51) expressed in two-component spinors.

Page 894: Yorikiyo Nagashima - Archive

Appendix I Answers to the Problems 869

Chapter 5: Field Quantization

Problem 5.1:

1. Complex field:

@L

@(@μ'†)D @μ' ,

@L

@'†D �m2'

) @μ@L

@(@μ'†)�

@L

@'† D (@μ@μ C m2)' D 0

2. Electromagnetic field:

@� 1

4 F μν Fμν�

@(@μ A ν)D F μν,

@L

@A νD �q j ν,

) @μ@L

@(@μ A ν)�

@L

@A νD �@μ F μν C q j ν D 0

3. Dirac field:

@L

@(@μ ψ)D ψ i γ μ,

@L

@ψD �mψ ! ψ

�i γ μ �@ μ C m

�D 0

Problem 5.3:

d4x 0

d4xD (Jacobian of x μ ! x 0 μ) D det

�@x 0 μ

@x ν

D det(δμν C @ν δx μ) D 1C @μ(δx μ)

)Z

(d4x 0 � d4x )L DZ

d4xL@μ(δx μ) DZ

d4x�@μ(Lδx μ) � @μLδx μ

The first term vanishes because it is a surface integral.Problem 5.5: The second term of Eq. (5.47) can be written as

@μ T μν δx ν D @μ T μ

ν �ν� x� D @μ T μν�ν�x� D

�@μ (T μν x�) � T �ν �ν�

D ��@μ (T μν x�)� T �ν ��ν

�$��! �

�@μ (T μ�x ν) � T ν�

�ν�

D12

�@μ (T μν x� � T μ�x ν) � T �ν C T ν�

�ν� D 0

Page 895: Yorikiyo Nagashima - Archive

870 Appendix I Answers to the Problems

Problem 5.6: Differentiating the Hamiltonian density

@H

@tD

@φ†

@t2

@2 φ@tCrφ† � r

@φ@tC m2φ† @φ

@tC fφ $ φ†g

Changing the first term by use of the Euler equation

@φ†

@t(r2 � m2)φ Crφ† � r Pφ C m2φ† Pφ C fφ $ φ†g

D r �˚

φ†r Pφ C Pφ†rφ�D �r �P

It is a surface integral and vanishes at infinity. It also proves problem (2). Nextdifferentiating P

�@P

@tD

@2φ†

@t2rφ Crφ† @2 φ

@t2C

@φ†

@tr

@φ@tCr

@φ†

@t@φ@t

D (r2 � m2)φ†rφ Crφ†(r2 � m2)φ Cr�

@φ†

@t@φ@t

Partially integrating the r2 part

D [@i φ†rφ C fφ† $ φg]xiD1xiD�1 � r(rφ† � rφ � m2φ†φ)

Cr

�@φ†

@t@φ@t

They are all surface integrals and vanish.Problem 5.9: If the operator e i a Op satisfies

O( Oq C a) D e i a� Op O( Oq)e�i a� Op (I.2)

then, for infinitesimal displacement ε,

O( Oq)C εd

d OqO( Oq)C � � � D O( Oq)C i ε[ Op , O( Oq)]C � � �

By putting O D Oq, we obtain [ Oq, Op ] D i . Namely, space translation operator is themomentum operator. The above equation means d

d Oq O( Oq) D i [ Op , O ]. In the Lorentzinvariant formalism, translation operator is defined by

O(x C a) D e i Pμ x μO(x )e i Pμ x μ

Namely the space part is defined with opposite sign compared to equation (I.2).Therefore to move x i ! x i C ai we have to use �P i D Pi . Hence

ri OH D �i [P i , OH ] D i [Pi , OH ]

Page 896: Yorikiyo Nagashima - Archive

Appendix I Answers to the Problems 871

Problem 5.12: Inserting

φ(x ) DZ

d3k

(2π)3p

hck e i k x�i ω tx C d†

k e�i k xCi ω txi

π(y ) D @t φ†(y ) DZ

i d3k(2π)3

rω2

hc†

k e�i k yCi ω t y � dk eCi k y�i ω t yi

[φ(x , t), π(y , t)] D“

i d3kd3 p(2π)6

12

n[ck , c†

p ]e i k x�i p y�i(ωk�ω p )t

��d†

k , dpe�i k x�i p yCi(ωk�ω p )t � [ck , dp ]e i k x�i p y�i(ωk�ω p )t

C�d†

k , d†p

e�i k x�i p yCi(ωk�ω p )t

oThe third and fourth term vanish and the first and second term give k D p byintegration

D

Zi d3k

2(2π)3fe i k�(x�y ) C e�i k�(x�y )g D i δ3(x � y )

Problem 5.14: Prove by induction. We ignore the normalization factor since that donot change before and after the calculations. Using

ak a†p D δ p k C a†

p ak , a†k a†

p D a†p a†

k , ak j0i D 0 ,

we can prove for the one particle state

H jp i D NX

ωk a†k ak a†

p j0i

D NX

ωk a†k

�δ p k C a†

p a†k ak

�j0i D ω p jp i

Next operating the Hamiltonian on jni � jp1, p2, � � � i D a†p1 jp2, � � � i D a†

p1 jn � 1iand assume H jn � 1i D (ω2 C � � � ωn)jn � 1i, then

H jni DX

ωk a†k ak a†

p1jn � 1i D

Xωk a†

k

�δ p1k C a†

p1ak�jn � 1i

D ω1a†p1jn � 1i C a†

p1H jn � 1i

D ω1jni C (ω2 C � � � )a†p1jn � 1i

D (ω1 C ω2 C � � � )jni

Chapter 6: Scattering Matrix

Problem 6.1: Using

h f jH εI (t1)H ε

I (t2)jii DXh f jHI S jnie�i(En�E f )t1�εjt1j

� hnjHI S jiie�i(Ei�En )t2�εjt2j

Page 897: Yorikiyo Nagashima - Archive

872 Appendix I Answers to the Problems

The expectation value of the second order S-matrix becomes

h f jS jii D (�i)2Z 1

�1d t1

Z t1

�1d t2h f jH ε

I S (t1)H εI S (t2)jii

D �Xh f jHI S jnihnjHI S jii

Z 1

�1d t1

Z t1

�1d t2e�i(En�E f )t1�εjt1j�i(Ei�En )t2�εjt2j

D �Xh f jHI S jnihnjHI S jii

Z 1

�1d t1e i(E f �Ei )t1�εjt1j

Z 1

0dτe i(Ei�En )τ�ετ

D 2π i δ(E f � Ei)X h f jHI S jnihnjHI S jii

Ei � En C i ε

where we changed the integration variable t2 ! τ D t1 � t2 in going from thesecond to the third line.Problem 6.2: From Eq. (6.51), dELAB D γ d cos θ �, dΩ � D d cos θ �dφ�. There-fore d N

d Ω � /d N

d cos θ � /d N

d ELAB

Problem 6.3:

s D (p1 C p2)2 D p 21 C p 2

2 C 2(p1 � p2)

t D (p1 � p3)2 D p 21 C p 2

3 � 2(p1 � p3)

u D (p1 � p4)2 D p 21 C p 2

4 � 2(p1 � p4)

Using p 2i D m2

i , p2 � p3 � p4 D �p1, we obtain s C t C u DP

m2i .

Problem 6.4: Using the flux factor F D 4E1E2v12 D 4E1M(p1/E1) D 4p1M andthe phase space volume dLIPS, the cross section is given by

dσ D jM f i j2dLI P SLAB/F

D (2π)4δ(E1 C M � E3 � E4)δ3(p1 � p3 � p4)jM f i j

2

Fd3 p3d3 p4

(2π)64E3E4Rd3 p4 gives 1, at the same time changing the momentum p4 in E4. To carry outRd3 p3, we put the content of the δ function as G:

G � E3 C E4 D p3 C

q(p1 � p 3)2 C M2

D p3 C

qp 2

1 C p 23 � 2p1 p3 cos θ C M2

) @G@p3D 1C

p3 � p1 cos θE4

D1E4

(E4 C p3 � p1 cos θ )

D (E1 C M � p1 cos θ )/E4 D [M C p1(1� cos θ )]/E4

Page 898: Yorikiyo Nagashima - Archive

Appendix I Answers to the Problems 873

Now we introduce q D p1 � p3 D p4 � p2 and use p 22 D p 2

4 D M2, p 21 D p 2

3 D 0.

p 24 D (p4 � p2 C p2)2 D (q C p2)2 D q2 C 2(p2 � q)C p 2

2

) �q2 D 2(p2 � q) D 2M(E1 � E3) D 2M(p1 � p3)

D �(p1 � p3)2 D 2p1 p3(1� cos θ )

) p1(1� cos θ ) D M(p1 � p3)/p3

Inserting the above equalities, we obtain @G/@p3 D p1M/p3E4 and

dLI P SLAB

FD

14p1M

�@G@p3

��1 p 23 dΩLAB

16π2E3E4D

p 23 dΩLAB

64π2(M p1)2

) dσdΩ

ˇˇLABD

�p3

p1

�2 jM f i j2

64π2 M2

Problem 6.6:

1. Proof of Eq. (6.142a). Using γ 5γ 5 D 1, γ 5γ μ D �γ μ γ 5

Tr[γ μ1 γ μ2 � � � γ μ2nC1 ] D Tr[γ 5γ 5γ μ1 γ μ2 � � � γ μ2nC1 ] D Tr[γ 5γ μ1 γ μ2 � � � γ μ2nC1 γ 5]

D �Tr[γ 5γ μ1 γ μ2 � � � γ 5γ μ2nC1 ] D (�1)2nC1Tr[γ μ1 γ μ2 � � � γ μ2nC1 ]

D �Tr[γ μ1 γ μ2 � � � γ μ2nC1 ] , ) Tr[γ μ1 γ μ2 � � � γ μ2nC1 ] D 0

2. Proof of Eq. (6.142b) Since Tr[A] D Tr[AT], C γ μT C�1 D �γ μ, we insert C C�1

between all γ ’s.

Tr[γ 1γ 2 � � � γ n] D Tr[γ nT � � � γ 2T γ 1T ] D Tr[C γ nT C�1 � � � C γ 1T C�1]

D (�1)nTr[γ n � � � γ 2γ 1]

3. Proof of Eq. (6.142c). Taking the trace of both sides of γ μ γ ν C γ ν γ μ D 2gμν ,and use Tr[γ μ γ ν] D Tr[γ ν γ μ]

4. Proof of Eq. (6.142d). The first step is

Tr[γ μ γ ν γ�γ σ ] D Tr[f2gμν � γ ν γ μgγ�γ σ ]

Using Eq. (6.142c)

D 8gμν g�σ � Tr[γ ν γ μ γ�γ σ ] D 8gμν g�σ � Tr[γ νf2gμ� � γ�γ μgγ σ]

D 8gμν g�σ � 8gμ� gνσ C Tr[γ ν γ�γ μ γ σ]

D 8gμν g�σ � 8gμ� gνσ C Tr[γ ν γ�f2gμσ � γ σ γ μg]

D 8gμν g�σ � 8gμ� gνσ C 8gμσ gν� � Tr[γ μ γ ν γ� γ σ]

In deriving the last equation, we used invariance of the trace under cyclic per-mutation.

) Tr[γ μ γ ν γ� γ σ] D 4gμν g�σ � 4gμ� gνσ C 4gμσ gν�

Page 899: Yorikiyo Nagashima - Archive

874 Appendix I Answers to the Problems

Problem 6.7: Using Eq. (6.148)

us(p f )γ 0u r (pi) D u s(p f )† u r(pi)

� � †s

��1 �

σ � p f

E f C m

��1 �

σ � p i

Ei C m

C

�1C

σ � p f

E f C m

��1C

σ � p i

Ei C m

���r

D 2� †s

�1C

p 2

(E C m)2 (σ � Op f )(σ � Op i )�

�r

D 2� †s

��1C

p 2

(E C m)2 ( Op f � Op i )�C

p 2

(E C m)2 i σ � Op f � Op i

��r

where we have used p D p Op , p f D pi D p , E f D Ei D E . The first term is thespin non-flip and the second term is the spin flip term. Since ( Op f � Op i ) D cos θ andj Op f � Op i j D sin θ ,

A D spin non-flip term D�

(E C m)�

1Cp 2

(E C m)2 cos θ �2

B D spin flip term D�

(E C m)�

p 2

(E C m)2 sin θ �2

Xr,sD1,2

(AC B) D 2 � (E C m)2�

1C2p 2

(E C m)2 cos θ Cp 4

(E C m)4

Factor 2 arises because each term is counted twice, for non-flip term r D s D 1, 2and flip term r D 1, s D 2 and r D 2, s D 1. Using cos θ D 1 � 2 sin2(θ /2)

AC B D 2(E C m)2

"�1C

p 2

(E C m)2

�2

�4p 2

(E C m)2 sin2 θ2

#

D 8E 2�

1 �p 2

E 2sin2 θ

2

Dividing by 2 for taking the average for the initials state and also by 4E 2 for particlenormalization recovers the Eq. (6.147).Problem 6.8: Considering only the vacuum in the intermediate stage

h f ji eh

φ†(x )@μ φ(x )� @μ φ†(x )φ(x )ijii

D i e˚h f j(φ†(x )j0ih0j@μ φ(x )jii � h f j@μ φ†(x )j0ih0jφ(x ))jii

�D e

np μ

i e i(p i �p f )�x�i(Ei�E f )t C p μf e�i(p i �p f )�xCi(Ei�E f )t

oD e

np μ

i e i(p i�p f )μ x μC p μ

f e�i(p i�p f )μ x μo

The first and the second term give the same contribution after integration over x.Comparing with Eq. (6.135), we see only us γ μ u s is replaced with 2p μ . Therefore,the only modification in the final cross section is to replace (1 � v2 sin2 θ

2 ) with 1.

Page 900: Yorikiyo Nagashima - Archive

Appendix I Answers to the Problems 875

Chapter 7: QED

Problem 7.1: Noting q D p3 � p1, q2 D p 23 C p 2

1 � 2(p1 � p3) D 2fm2 � (p1 � p3)g

12

qμ Lμν D (p3μ � p1μ)�

p μ3 p ν

1 C p μ1 p ν

3 Cq2

2gμν

D

�fp 2

3 � (p1 � p3)gp ν1 C f(p1 � p3) � p 2

1gpν3 C

q2

2qν�

Dq2

2p ν

1 �q2

2p ν

3 Cq2

2qν D 0

Problem 7.2: The invariant matrix squared is given by

jMj2 D2e4

s2[(t �M2)2 C (u � M2)2 C 2sM2]

This can be derived from Eq. (7.17) and Eq. (7.16b) by crossing s ! u, t ! s,u! u and neglecting the electron mass m2. In CM we have

s D 4E 2 , u D M2 � 2E 2(1C cos θ ) , t D M2 � 2E 2(1� cos θ )

Then

s2

2e4 jMj2 D [�2E 2(1� cos θ )]2 C [�2E 2(1C cos θ )]2 C 8M2 E 2

D 8E 4(1C 2 cos2 θ )C 8M2E 2

D 8E 2(E 2 C p 2 cos2 θ C E 2 � P2)

D s2[1 � (2/2) sin2 θ ]

dσdΩD

p f

p i

164π2 s

jMj2 Dα2

2s�

1 �2

2sin2 θ

! σTOT D4πα2

3s�

3� 2

2

�Problem 7.4:

kνMμν D e2u(p 0)�

γ μ ( 6pC 6 k C m)(p C k)2 � m2

6 kC 6 k( 6p� 6 k0 C m)

(p � k0)2 � m2γ μ�

u(p )

Using p 2 D m2, 6 k 6 k D k2 D 0, 6p u(p ) D mu(p ), u(p 0) 6p 0 D mu(p 0), p � k0 Dp 0 � k.

Each term in [� � � ] gives

γ μ 6pC 6 k C m(p C k)2 � m2 6 ku(p ) D γ μ 2(p � k)� 6 k 6p C m6 k

2(p � k)u(p ) D γ μ u(p )

u(p 0) 6 k6p� 6 k0 C m

(p � k0)2 � m2 γ μ D u(p 0) 6 k6p 0� 6 k C m

(p 0 � k)2 � m2 γ μ

D u(p 0)2(p 0 � k)� 6p 06 k C m6 k

�2(p 0 � k)D �u(p 0)γ μ

The first and the second terms compensate each other.

Page 901: Yorikiyo Nagashima - Archive

876 Appendix I Answers to the Problems

Problem 7.5: We calculate interference terms only. Putting M DM1 CM2,

jM1M�2 j D

e4

4su

Xr,s

�ur (p 0)γ ν( 6pC 6 k)γ μ u s(p )

��us(p )γν( 6p� 6 k0)γμ u r(p 0)

D

e4

4suTr[6p 0γ ν( 6pC 6 k)γ μ 6p γν( 6p� 6 k0)γμ ]

Using γμ 6A 6B 6C γ μ D �2 6C 6B 6A, γμ 6A 6B γ μ D 4(A � B)

jM1M�2 j D �

2e4

su(p C k) � (p � k0)Tr[6p 0 6p ]

noting (p � p 0) D �t/2, (k � k0) D �(t C Q2)/2

jM1M�2 j D

2e4

sutfs C t C uC 2Q2g D

2e4

sutQ2

Next we prove jM2M�1 j D jM1M�

2 j. Since

jM2M�1 j D

4e4

suTr[γμ( 6p� 6 k0)γν 6p γ μ( 6pC 6 k)γ ν 6p 0]

We only need to prove

Tr[γ 1γ 2 � � � γ n ] D Tr[γ n � � � γ 2γ 1] .

Since

Tr[A] D Tr[AT], C γ μT C�1 D �γ μ ,

we insert C C�1 between all γ ’s.

Tr[γ 1γ 2 � � � γ n ] D Tr[γ nT � � � γ 2T γ 1T ] D Tr[C γ nT C�1 � � � C γ 1T C�1]

D (�1)nTr[γ n � � � γ 2γ 1]

Problem 7.6: Derivation of the Klein–Nishina formula.From the assumptions

6 k 6 ε D � 6 ε 6 k, 6 k 0 6 ε0 D � 6 ε0 6 k0, 6p 6 ε D � 6 ε 6p , 6p 6 ε0 D � 6 ε0 6p( 6p C m) 6 εu(p ) D6 ε(� 6p C m)u(p ) D 0 , ( 6p C m) 6 ε0u(p ) D 0

As we do not take photon sum

jMj2 De4

2Tr�

( 6p 0 C m)�6 ε0 6 ε 6 k2k � p

C6 ε 6 ε0 6 k0

2k0 � p

�( 6p C m)�6 k 6 ε 6 ε0

2k � pC6 k0 6 ε0 6 ε2k0 � p

De4

2

�I

(2k � p )2 CI I

(2k � p )(2k0 � p )C

I I I(2k0 � p )(2k � p )

CI V

(2k0 � p )2

Page 902: Yorikiyo Nagashima - Archive

Appendix I Answers to the Problems 877

I V can be obtained from I by replacing with ε ! ε0, k ! �k0. I I D I I I with thesame reason as the Problem 7.4. We only need to calculate I and I I . As the rest isstraightforward but lengthy calculations, we only show results of each term.

I D 8(k � p )[(k 0 � p )C 2(k � ε0)2]

I I C I I I D 16[(k � p )(k0 � p )f2(ε � ε0)2 � 1g � (k � ε0)2(k0 � p )

C (k0 � ε)2(k � p )]

I V D I(ε! ε0, k ! �k 0) D 8(k0 � p )[(k � p )� 2(k0 � ε)2]

jMj2 De4

(k � p )2(k � p )

�(k0 � p )C 2(k � ε0)2

Ce4

(k0 � p )2(k0 � p )

�(k � p )� 2(k0 � ε)2

C 2e4

(k � p )(k0 � p )

�(k � p )(k0 � p )f2(ε � ε0)2 � 1g � (k � ε0)2(k0 � p )

C (k0 � ε)2(k � p )

Using (k � p ) D mω, (k 0 � p ) D mω0

jMj2 D e4�

ωω0 C

ω0

ω� 2C 4(ε � ε0)2

) dσdΩ

ˇˇLABD

�k0

k

�2jMj2

64π2m2 Dα2

4m2

�ω0

ω

�2

�ωω0 C

ω0

ω� 2C 4(ε � ε0)2

Problem 7.7 (2):

S f i � δ f i D �iZ

d4xhγ (k)jA μj0i j μ(x ) D �i�μ(λ) Oj (k)μ

dPγ D jS f i � δ f i j2 d3k

(2π)32ωD d3k(2π)32ω

�X

λ

e2�

(�(λ) � p f )(k � p f )

�(�(λ) � pi)

(k � pi )

�2

Chapter 8: Renormalization

Chapter 9: Symmetries

Problem 9.2: To prove the Theorem 9.1, we put

f (t) D e t AB e�t A , f (0) D B

Page 903: Yorikiyo Nagashima - Archive

878 Appendix I Answers to the Problems

differentiating by the parameter t

@ f (t)@tD e t A[A, B ]e�t A D [A, f (t)]

This is a differential equation of the first rank and uniquely defines a function ifthe initial condition at t D 0 is given. By changing A to tA, the right hand side ofEq. (9.15) is shown to satisfy the same differential equation and the same initialvalue. So both sides of the Eq. (9.1) are the same functions.

To prove the Theorem 9.2, change A ! tA, B ! tB and differentiate by t. Theleft hand side is

g(t) D et ACt B, g0(t) Ddg(t)

d tD (AC B)g(t) , g(0) D 1

Differentiating the right hand side [D h(t)] after the above change

h0(t) D e t AAet B e� 12 t2[A,B ] C e t Ae t B(B � t[A, B ])e� 1

2 t2[A,B ] , h(0) D 1

Using Eq. (9.15) and the condition

e t B A D�

AC [tB, A]C12

[tB, [tB, A]]C � � ��

e t B

D (AC [tB, A]) e t B

) e t B [A, B ] D ([A, B ]C [[tB, A], B ]) et B

D [A, B ]e t B

Similarly

e t A[A, B ] D [A, B ]e t A

) h0(t) D�(A � t[A, B ])et AC e t AB

e t B e� 1

2 t2[A,B ]

From Eq. (9.16b), e t AB D (B C t[A, B ])e t A

) h0(t) D (AC B)et Ae t B e� 12 t2[A,B ] D (AC B)h(t)

Since g(t) and h(t) satisfy the same differential equations and initial conditions,both are the same function. The second equation can be proved by noting

eAeB D eACB e12 [A,B ] , eB eA D eACB e� 1

2 [A,B ] ,

) eAeB D eB eAe[A,B ]

Problem 9.3:Putting T J

λ D hλ3λ4jT J (E )jλ1λ2i and using the partial wave expansion

σTOT D

ZdΩ

Xλ3,λ4

j f λ3λ4,λ1 λ2 (θ )j2

D

ZdΩ

Xλ3λ4

24 1

4p 2

XJ J 0

(2 J C 1)(2 J 0 C 1)T Jλ T J 0 �

λ d Jλμ d J 0

λμ

35

Dπp 2

XJ,λ3λ4

(2 J C 1)jT Jλ j

2

Page 904: Yorikiyo Nagashima - Archive

Appendix I Answers to the Problems 879

We have usedR 1

�1 d cos θ d JM M 0 (θ )d J 0

M M 0 (θ ) D 2δ J J 0 /(2 J C 1) to derive the lastequality. Then

Im f λ3λ4,λ1 λ2 (0) D1

2p

XJ

(2 J C 1)ImT Jλ d J

λμ(0)

D1

4p

XJ,λ3,λ4

(2 J C 1)jT Jλ j

2 Dp

4πσTOT

In going to the third equality we have used ImT Jλ D Im[(e2 i δ J�1)/ i ] D 1�cos 2δ J ,

jT Jλ j

2 D 2(1 � cos 2δ J ).Problem 9.4: The Maxwell equation says @μ@μ Aν D j ν, the properties of Aν andj ν are equal. Since j μ D (�, �v ), A0 0(�t) D A0(t), A0(�t) D �A(t). For the trans-formation properties of the quantized fields, refer Eq. (9.133) and Eq. (9.134). Theproperties of E and B can be obtained by considering E D �rA0�@t A, B D r�A.Problem 9.6: Use Table 4.2 in Chap. 4. Or use

i2

γ 5σμν D �i4

�μν�σ γ�σ ,

12

�μν�σ F�σ D QF μν , QF μν(E , B) D F μν(�B,�E )

i2

γ 5σμν Fμν D �12

σμν QFμν D �i α � B � σ � E

Problem 9.7�: Define operatorsXk

(b†k � a†

k )(bk � ak ) DX

k

(Nk C Tk ) D N C T ,

Nk D a†k ak C b†

k bk , Tk D �(a†k bk C b†

k ak )

Then [Nk , Tq ] D [N, T ] D 0 and

[Nk , aq] D �δ qk aq , [Nk , a†q ] D δ qk a†

q , [Nk , bq] D �δ qk bq ,

[Nk , b†q ] D δ qk b†

q

[Tk , aq ] D δ qk bq , [Tk , a†q ] D �δ qk b†

q , [Tk , bq] D δ qk aq ,

[Tk , b†q ] D �δ qk a†

q

Applying the above commutation relations to the formulas

If [A, [A, B ]] D [B, [B, A]] D 0! eACB D e� 12 [A, B ]eAeB D eAeB e� 1

2 [A, B ]

eAB e�A D B C [A, B ]C12!

[A, [A, B ]C � � �

Page 905: Yorikiyo Nagashima - Archive

880 Appendix I Answers to the Problems

we obtain

e i α(NCT ) D e i αT ei αN D e i αN ei αT

ei αN (aq , bq)e�i αN D (e�i α aq , e�i α bq) ,

e i αN (b†q , a†

q )e�i αN D (e i α b†q , e i α a†

q )

e i αT (aq , bq)e�i αT D (aq cos α C i bq sin α, bq cos α C i aq sin α)

e i αT (a†q , b†

q )e�i αT D (a†q cos α � i b†

q sin α, b†q cos α � i a†

q sin α)

Choosing α D π/2, C D e i π2 (NCT ) D e i π

2 N ei π2 T D e i π

2 T ei π2 N

Caq C�1 D exph

iπ2

Ti

exph

iπ2

Ni

aq exph�i

π2

Ni

exph�i

π2

Ti

D exph

iπ2

Ti

(�i aq) exph�i

π2

TiD bq

C b†q C�1 D exp

hi

π2

Ti

exph

iπ2

Ni

b†q exp

h�i

π2

Ni

exph�i

π2

Ti

D exph

iπ2

Ti

(i b†q) exp

h�i

π2

TiD a†

q

) C'C�1 D '†, Similarly C'† C�1 D '

Problem 9.8: As the two π0’s are identical particles, their wave function must betotally symmetric, which means L D even. The angular momentum conservationrequires J D L D 1 and the two conditions are incompatible.Problem 9.9: The positronium is a bound state of e�eC and its C D (�1)LCS . ForL D 0, C D C,� depending on S D 0, 1 which allows 2γ , 3γ , respectively,but notvice versa.

Chapter 10: Path Integral I

Problem 10.2:

δSδq(t0)

D lim1ε

Zd t L(q(t)C εδ(t � t0), Pq(t)C ε

dd t

δ(t � t0))� L(q(t), Pq(t))

D

Zd t�

@L@q

εδ(t � t0)C@L@ Pq

εdd t

δ(t � t0)�D

@L@q

ˇˇ

tDt 0�

dd t

@L@ Pq

ˇˇ

tDt 0

Problem 10.4: Writing down the solution as x D A cos ω C B sin ω t with thecondition x (ti D 0) D qi , x (t f D T ) D q f gives

x D qi cos ω t Cq f � qi cos ωT

sin ωTsin ω t

Then calculate

Sc D

Z T

0d t�

m Px2

2�

mω2

2x2�

Page 906: Yorikiyo Nagashima - Archive

Appendix I Answers to the Problems 881

Problem 10.7: (4) The average energy of a photon at temperature T is

E D �@ ln Z

@D

@

@ln 2 sinh

„ω2D„ω2

tanh„ω

2

D„ω2C

„ωe„ω � 1

The number of photons per unit phase space is 2Vd3 p/(2π„)3 D (8π/c3)V ν2dνwhere the factor 2 represents two polarization states. Expressing „ω D hν andomitting the zero point energy (the first term), the average photon energy per unitfrequency per unit volume is given by (8πh/c3)ν3dν/(ehν/ k T � 1) which is thePlanck’s formula.

Chapter 11: Path Integral II

Problem 11.1: The function f is expressed as a polynomial of η i ’s, but from the con-ditions

Rdη D 0 and anticommutativity of all the G-variables, the only surviving

term in the integrand is one which includes all the η1 � � � ηN . As

η1 � � � ηN D

NYiD1

�Xj k

(U�1)i j k � j k

The right hand side is a polynomial of � ’s, but as any overlap of the same variable(� )2 gives zero, only a term which includes all the �1 � � � �N survives. Therefore

η1 � � � ηN DX

ε j1 j2��� jN (U�1) j11(U�1) j22 � � � (U�1) jN N �1 �2 � � � �N

D (det U�1)�1�2 � � � �N

Writing the coefficient of the surviving polynomial as A,

Z NYiD1

dη i f (η1, � � � , ηN ) DZ NY

iD1

dη i (Aη1 � � � ηN )

D JG

Z NYiD1

d�i A det(U�1)�1, � � � , �N ,

On the other hand, because of Eq. (11.85b)Zdη1 � � � dηN η1 � � � ηN D

Zd�1 � � � d�N �1 � � � �N

Comparison of the two equations gives J G D det U.Problem 11.2: Consider an orthogonal transformation

η D OT� , ηTAη D � TO AOT� D � TA0�

Page 907: Yorikiyo Nagashima - Archive

882 Appendix I Answers to the Problems

As n-dimensional integration survives only when the integrand includes all theη i ’s, only n/2-th power of the expansion survives.

In D

Zdη1 � � � dηn e� 1

2 ηT Aη D JZ

d� � � � d�n e� 12 � T A0 �

D(�1)n/2� n

2

�!

Zd�1 � � � �n

� T A0�

�n/2

The equation follows because

1. Since O is orthogonal transformation Jacobian J D det O D 1.2. In the expansion only terms that have exactly n factors can contribute to the

integral.

Now any antisymmetric (n � n)-dimensional matrix A with even n can be broughtwith a unitary transformation to the form

A0 D UAU† D

26666664

0 a1 � � � 0 0�a1 0 � � � 0 0

...... � � �

......

0 0 � � � 0 an/2

0 0 � � � �an/2 0

37777775 (I.3)

for odd n the matrix has one more row and one more column with all zeros. Thenthe determinant is easily obtained from the above form (See proof of Eq. (I.3) be-low)

det A D det A0 D

((a1)2 � � � (an/2)2 n D even

0 n D odd

Assuming A0 has this form,

In D1� n2

�!

Zd�1 � � � �n

��

12

� TA0��n/2

D1� n2

�!

Zd�1 � � � �n

24�1

2

n/2Xj D1

(�2 j �1A02 j �1,2 j �2 j � �2 j A0

2 j,2 j �1�2 j �1)

35n/2

D(�1)n/2� n

2

�!

Zd�1 � � � �n

24 n/2X

j D1

a j �2 j �1 �2 j

35n/2

D(�1)n/2� n

2

�!

Zd�1 � � � �n

Xj1

� � �Xj n/2

�2 j1�1�2 j1 � � � �2 j n/2�1�2 j n/2 a j1 � � � an/2

There can be no two equal � 0 s in the integral. Since � appear always pairwise,they can be commuted to normal ordering �n � � � �1 with a sign change (�1)n/2.

Page 908: Yorikiyo Nagashima - Archive

Appendix I Answers to the Problems 883

The product a j1 � � � an/2 gives justp

det A0. Then the role of sum is just to multiply(n/2)! with the same result. All things put together we obtain

In (n even) DZ

dη1 � � � dηn e� 12 ηT Aη

D

Zd�1 � � � �n �n � � � �1

pdet A0 D

pdet A

The result is to be contrasted with the normal (or bosonic) Gaussian integralZdx1 � � � dxn e� 1

2 xT Ax D(2π)n/2

pdet A

hProof of Eq. (I.3)i: First consider the simple case of n D 2.

A D�

0 A 12

�A 12 0

For the general case, we use the fact that i A is a hermitian matrix. It can be diago-nalized by a unitary transformation V

V i AV † D D

where D is real and diagonal whose elements are solutions to the secular equation

det ji A� λ1j D 0

Since AT D �A, we have det j(i A� λ1)Tj D det j � i A� λ1j D 0. Therefore, if λ isa solution, so is (�λ) and D is of the form

D D

26666664

a1 0 � � � � � � 00 �a1 � � � � � � 0...

.... . .

... 00 � � � � � � an/2 00 � � � � � � 0 �an/2

37777775

From this one sees that det A D i2n det D D (aa � � � an)2. To put D in the formEq. (I.3), we use the 2 � 2 matrix

S2 D1p

2

�i 11 i

which has the property

(�i)S2

�1 00 �1

�S†

2 D

�0 1�1 0

Page 909: Yorikiyo Nagashima - Archive

884 Appendix I Answers to the Problems

Thus S (�i D)S† D A0 for

S D

26664

S2 0 � � � 00 S2 � � � 0...

.... . .

...0 � � � � � � S2

37775

Then the matrix A0 defined by

A0 D S (�i D)S† D S [�iV (i A)V †]S† D (SV )A(SV )† D UAU†

reproduces Eq. (I.3). Since both A and A0 is real so is U , namely, U is an orthogonalmatrix and Eq. (I.3) is proved.Problem 11.4: We start from the second order interaction Z2 given in Eq. (11.168)

Z2 �N2!

�Zd4x

��

1i

δδη(x )

�(�i qγ μ)

�1i

δδη(x )

��1i

δδ J μ(x )

� 2

� Z0[η, η, J ]

DN2

�Zd4 y

��

1i

δδη(y )

�(�i qγ�)

�1i

δδη(y )

��1i

δδ J�(y )

��

Zd4x d4z1 d4z2d4z3η(z2)SF (z2 � x )(�i qγ μ)

� SF (x � z3)η(z3)DF μν(x � z1) J ν(z1)Z0[η, η, J ]�

(I.4)

To make an internal photon line, δ/δ J μ(x ) has to act on J ν(z1), producingDF μ�(x � y ). There are two ways to make the electron self energy contributions:(a) one pulls down another electron line and connects it to the outgoing electron.Namely, δ/δη(y ) acts on η(z3), and δ/δη(y ) acts on (δZ0/δη). (b) the other con-necting the new line (δZ0/δη) to the incoming electron of the first vertex. Theprocess (a) is expressed as:

Z26a D �N2

Zd4 x d4 y d4z2d4w

˚η(z2)SF (z2 � x )(�i qγ μ)

� SF (x � y )(�i qγ�)SF (y � w )η(w )DF μ�(x � y )�Z0

The process (b) is given

Z26b D �N2

Zd4x d4 y d4z1d4w

˚η(w )SF (w � y )(�i qγ�)

� SF (y � x )(�i qγ μ)SF (x � z1)η(z1)DF μ�(x � y )�Z0

By changing the integration variables we see that Z26b is identical to Z26a and thatserves merely to cancel the factor 1/2. This is a correction to the electron propaga-

Page 910: Yorikiyo Nagashima - Archive

Appendix I Answers to the Problems 885

tor. Since the electron propagator is given by

hΩ jT [ Oψ(x ) Oψ(y )]jΩ i D�

1i

�2 δ2

δη(y )δη(x )Z [η, η, J ]

ˇˇ

ηDηD JD0

D

�1i

�2 δ2

δη(y )δη(x )fZ0 C Z1 C (� � � C Z26 C � � � )C � � � g jηDηD JD0

D i SF (x � y )CZ

d4w d4z SF (x � w )(�i qγ μ)SF (w � z)(�i qγ ν)

� SF (z � y )DF μν(w � z)

Using the momentum representation of the propagators Eq. (11.65) andEq. (11.119) and integrating over w and z, the second term becomes:Z

d4 p(2π)4 e�i p �(x�y ) 1

p/� m C i ε

� Zd4k

(2π)4

�gμν

k2 (�i qγ μ)1

p/� k/ � m C i ε(�i qγ ν)

�1

p/� m C i ε

Then the momentum representation of the electron propagator including the sec-ond order correction is expresses as

i QSF (p ) Di

p/� mC

ip/� m

i Σe(p )i

p/� m

i Σe(p ) DZ

d4k(2π)4

�i gμν

k2 (�i qγ μ)i

p/� k/ � m C i ε(�i qγ ν)

Problem 11.5: The reduction formula for the Compton scattering

γ (kλ)C e(p r)! γ (k0λ0)C e(p 0s)

where λ, λ0 are the photon polarizations and r, s are electron spin orientations isgiven by

outhk0λ0 p 0 sjkλ, p riin

D

Zd4x1d4x3d4 x2d4x4ε�

ν (k0, λ0) f k 0(x3)(@�@�)x3 U p 0 (x4)(i!6@ � m)x4

�δ4Z [η, η, J ]

δη(x2)δ J μ(x1)δη(x4)δ J ν(x3)

ˇˇ

ηDηD JD0

� (�i �6@ � m)x2 Up (x2)(

���@σ@σ)x1 εμ(k, λ) f k (x1) (I.5)

The generating functional that contributes to the Compton scattering was given byEq. (11.171):

ZCompton D iZ

d4x d4 y d4z1d4z2d4w1d4w3

nJ σ(w1)DF σ�(w1 � y )

� DF α(x � z1) J (z1)η(w3)SF (w3 � y )(�i qγ�)

� SF (y � x )(�i qγ α)SF (x � z3)η(z3)o

Z0[η, η, J ]

Page 911: Yorikiyo Nagashima - Archive

886 Appendix I Answers to the Problems

As the photon source J μ has no directionality, namely it could serve either as asource or a sink, there are two ways to make the photon derivatives producing twodistinct scattering processes as depicted in Figure I.1.

Application of the four-fold derivative in the reduction formula to ZCompton yields:

δ4ZCompton[η, η, J ]δη(x2)δ J μ(x1)δη(x4)δ J ν(x3)

ˇˇ

ηDηD JD0D i

Zd4x d4 y

�hn

DF ν�(x3 � y )DF αμ(x � x1)SF (x4 � y )(�i qγ�)

� SF (y � x )(�i qγ α)SF (x � x2)oCn

DF να(x3 � x )DF�μ(y � x1)

� SF (x4 � y )(�i qγ�)SF (y � x )(�i qγ α)SF (x � x2)oi

Inserting the above equation in Eq. (I.5), we obtain the scattering amplitude for theCompton scattering

outhk0 λ0, p 0sjk λ, p riin D iZ

d4x d4 y

� U p 0(y )h

f �k 0 (y ) f k (x )ε�

ν (k0λ0)(�i qγ ν)SF (y � x )(�i qγ μ)εμ(k, λ)

C f �k 0 (x ) f k (y )εμ(kλ)(�i qγ μ)SF (y � x )(�i qγ ν)ε�

ν (k0, λ0)iUp (x )

Insertion of

f k (x ) D e�i k �x , Up (x ) D u r(p )e�i p �x ,

SF (x � y ) DZ

d4 p(2π)4 e�i p �(x�y ) p/C m

p 2 � m2 C i ε

and the integration of x and y yields two δ functions and after integrating over pgives the final form:

outhk0 λ0, p 0sjk λ, p riin D �(2π)4 i δ4(k C p � k0 � p 0)e2us(p 0)

�ε/0 � p/C k/C m

(p C k)2 � m2 C i εε/C ε/

p/ � k/0 C m(p � k0)2 � m2 C i ε

ε/0 ��

u r(p )

where ε/D εμ(k, λ)γ μ , ε0/ D εν(k0, λ0)γ ν. The Compton scattering amplitudeEq. (7.51) has thus been reproduced.

k’

k

p’

p

xy

k’

k

p’

p

x

y

(a) (b) Figure I.1 Compton Scattering.

Page 912: Yorikiyo Nagashima - Archive

Appendix I Answers to the Problems 887

Chapter 12: Accelerators and Detectors

Problem 12.3:As the mean n is large and the standard deviation σ/n D 1/

pn is small, r is large

and distributes around n. Let

F(r) D ln�

nr

r!e�n

�' r ln n � n �

�lnp

2π C�

r C12

�ln r � r

Expand F(r) in Taylor series around r D n and use up to quadratic terms in (r�n)2

F(n) D � lnp

2πn , F 0(n) D �1

2n, F 00(n) D �

�1n�

12n2

F(r) ' � lnp

2πn �1

2n(r � n) �

12

�1n�

12n2

�(r � n)2

D � lnp

2πn �1

2n(r � n)(r � n C 1)C

14n2 (r � n)2

Neglecting the third term and approximating n C 1 ' n, we obtain

F(r) D � lnp

2πn �1

2n(r � n)2 or

nr

r!e�n n!large�����!

1p

2πnexp

��

(r � n)2

2n

Chapter 13: Hadron Spectroscopy

Problem 13.1:

1. The total angular momentum in the final state is J D L and must be zero. Theparity of the final state is P1P2(�1)L D P1P2 D P2

1 D C1. The last equalityholds because the two particles are identical.

2. Let the orbital angular momentum of two particles ` and that of the relativemotion between the CM of the two-particle and the third particle L. From the re-quirement J D `C L D 0, ` D L. Then the parity becomes (�1)3(�1)`CL D �.The problem 2) and 3) were used to prove the parity violation in the weak decayof K0 ! 2π/3π.

3. We set the z-axis along the photon direction in the rest frame of the initialparticle. Then Jz D L z C Sz D Sz ¤ 0.

4. With the same setting as the problem 2), the angular momentum conservationrequires ` C L D 1. Then ` D L, L˙ 1, therefore both parities are allowed.

5. LC 1 D 1! L D 0, 1, 2 ; P D (�1)2(�1)L. ) L D 1 is allowed.6. The orbital angular momentum L i in the initial state can be considered as zero

which means P is (�1). As the final state consists of two identical bosons withS D 0. L must be even which means P D C and the parity is violated.

Page 913: Yorikiyo Nagashima - Archive

888 Appendix I Answers to the Problems

Problem 13.2: The decay to 2γ is obviously of electromagnetic origin. With thesame logic as used to determine the spin and parity of π0 (see arguments inSect. 13.2), the decay mode η ! 2γ tells that the spin of η cannot be 1, and that C-parity is (C). Since I D 0 the G-parity is G D C1. However, the 3π final state hasG D �1 which means I D 1. Therefore, the decay violates isospin and G-parityand cannot be strong. In fact, other branch modes have comparable branching ratiowith the 2γ mode and are consistent that they are all induced by the electromagnet-ic interaction. As the electromagnetic interaction conserves parity, the spin parityof η is fixed to be J P D 0�.Problem 13.3: From Eq. (13.34) Eq. (13.35)

3H πC Dˇˇ12 ,�

12

�j1, 1i D

r13

ˇˇ32 ,

12

�j1, 0i C

r23

ˇˇ12 ,

12

�j1, 0i

3H eπ0 D

ˇˇ12 ,

12

�j1, 0i D

r23

ˇˇ32 ,

12

�j1, 0i �

r13

ˇˇ12 ,

12

�j1, 0i

p d D j1/2, 1/2i ) T(p d !3H πC)T(p d !3H eπ0)

D �

p2/3T1/2p1/3T1/2

D �p

2

Problem 13.4:

p p D j1/2, 1/2ij1/2, 1/2i D j1, 1i

n p D j1/2,�1/2ij1/2, 1/2i D (j1, 0i � j0, 0i)/p

2

dπC D j1, 1i, dπ0 D j1, 0i

T(p p ! dπC)T(n p ! dπ0)

DT1

T1/p

2Dp

2!dσ(p p ! dπC)dσ(n p ! dπ0)

D 2

Problem 13.6:

K�n D j1/2,�1/2ij1/2,�1/2i D j1,�1i

K� p D j1/2,�1/2ij1/2, 1/2i D (j1, 0i C j0, 0i)/p

2

K0� � D j1/2,�1/2ij1/2,�1/2i D j1,�1i

KC� � D j1/2, 1/2ij1/2,�1/2i D (j1, 0i � j0, 0i)/p

2

K0� 0 D j1/2,�1/2ij1/2, 1/2i D (j1, 0i C j0, 0i)/p

2

) f (K�n ! K0� �) D T1

f (K� p ! K0� �)C f (K� p ! K0� 0) D (T1 � T0)/2C (T1 C T0)/2

D T1

Page 914: Yorikiyo Nagashima - Archive

Appendix I Answers to the Problems 889

Problem 13.7:

KCΛ D j1/2, 1/2i, K0Λ0 D j1/2,�1/2i

πCn D j1, 1ij1/2,�1/2i Dp

1/3j3/2, 1/2i Cp

2/3j1/2, 1/2i

π� p D j1,�1ij1/2,C1/2i Dp

1/3j3/2,�1/2i �p

2/3j1/2,�1/2i

) T(πCn ! KCΛ)T(π� p ! K0Λ0)

D

p2/3

�p

2/3,

! dσ(πCn ! KCΛ) D dσ(π� p ! K0Λ0)

Problem 13.8:(a) The real part and imaginary part of the scattering from Eq. (13.91) give

ReT` Dη`

2sin 2δ , ImT` D

1� η`

2C η` sin2 δ`

η D 1 for elastic scattering and η < 1 at higher energy where inelastic scatteringoccurs and is a function of energy.

When δ D π4 , π

2 , 3π4 ,

ReT D

η2

, ImT D 0.5�

,�

ReT D 0, ImT D1C η

2

�,

ReT D �η2

, ImT D 0.5�

,

and this means each trajectory circles around the center of the Argand diagram(counter-)clock wise when the force is attractive or repulsive. This is the case fora single amplitude. (b) When there is a non resonant repulsive force which doesnot change appreciably as a function of energy, the total amplitude is sum of thetwo (with proper normalization to lie within the circle). We consider, for simplicity,for the case when the total amplitude is the average of the two [see Fig. I.2]. Thenexamples in Fig. 13.15 show various cases satisfying conditions (b–d).Problem 13.9: Denote 3-momentum and kinetic energy of 3 pions as k i and Ti

which satisfy

k1 C k2I k3 D 0 , T1 C T2 C T3 D mK � 3mπ D Q

Ti D ki W ultra-relativistic Ti Dk2

i

2mπW nonrelativistic

We show the boundary in (x , y ) D f(T1 � T3)/p

3, T2g plane becomes a circle fornonrelativistic case.

k2 D �(k1 C k3) ! k22 D k2

1 C k23 C 2k1 k3 cos �

Therefore

T2 D T1 C T3 C 2p

T1T3 cos � !ˇˇ T2 � T1 � T3

2p

T1T3

ˇˇ 1

Page 915: Yorikiyo Nagashima - Archive

890 Appendix I Answers to the Problems

(a) (b)

(c) (d)

η/2

5

6

1

2

34

5

67

12345671’

2’

3’4’5’6’

7’

1

1

2

2

3

3

4

4

7

8

5

6

7

8

1’

2’

3’

4’5’6’

7’

8’

4

5

6

7

89

456

78

5’

6’

7’8’

11

10

9

1210

1112

Figure I.2 Resonance behavior: (a) Singleresonance amplitude. (b) Repulsive force +resonance, (c) Attractive force + resonance,(d) Repulsive force + resonance. (b–d) Points1,2, � � � denote two amplitudes at the sameenergy. Points 10, 20, � � � on dashed lines are

midpoints connecting 1-1, 2-2, etc. (d) is sim-ilar to (b), but here the resonant amplitudeprogresses slowly while the non-resonantamplitude swiftly and the relative velocity isreversed at high energy.

The boundary is obtained by putting j cos � j D 1

T 21 C T 2

2 C T 23 � 2T1T2 � 2T2T3 � 2T3T1 D 0

Then �T2 �

Q3

�2

C

�T1 � T3p

3

�2

DT 2

1 C T 22 C T 2

3

3C

23fT2(T2 � Q) � T1T3g C

Q2

9

D13

�T 2

1 C T 22 C T 2

3 � 2T1 T2 � 2T2 T3 � 2T3 T1�C

Q2

9D

Q2

9

Page 916: Yorikiyo Nagashima - Archive

Appendix I Answers to the Problems 891

Namely, in the (x , y ) plane, it is a circle with center at (x , y ) D (0, Q/3) and radiusQ/3.Ultra-relativistic: From

j cos � j Dk2

3 � k21 � k2

2

2k1k2 1! jk1 � k2j k3 k1 C k2

Ti D ki ! y D T3 � jT1 � T2j Dp

3jx j

Therefore, y is above lines AC and BC. Next,

T3 D Q � (T1 C T2) D Q � (k1 C k2) Q � k3 D Q � T3,! y D T3 Q2

Namely, y is below the line AB.Problem 13.12: Setting α D 2mV(r), the Schrödinger equation for radial wavefunction is given by

�u00k C αu k D k2u k , α D 2mV(r) (I.6)

�v 00k D k2vk (I.7)

From Eq. (I.7), we have

vk Dsin(k r C δ0)

sin δ0D cos k r C sin k r cot δ0 ' 1C (k cot δ0)r (I.8a)

) limk!0

vk D v0 D 1 � r/a , limk!0

k cot δ � �1/a (I.8b)

where we have used limk!0 δ / k. In Eq. (I.6) a solution u0 for k D 0 satisfies

�u000 C αu0 D 0 , ) α D u00

0 /u0 , (I.9)

Making (I.6) �u0 and inserting Eq. (I.9)

�u0u00k C u0αu k D k2u0u k , ! [u0

0u k � u0u0k ]0 D k2u0u k (I.10)

Similarly

[v 00 vk � v0v 0

k ]0 D k2v0vk (I.11)

Making (I.11)–(I.10)

[v 00 vk � v0v 0

k ]0 � [u00u k � u0u0

k ]0 D k2(v0vk � u0u k )

Integrating both sides

[v 00 vk � v0v 0

k � u00u k C u0u0

k ]10 DZ R

0(v0vk � u0u k )

Calculating l.h.s. explicitly using Eq. (I.8a) and Eq. (I.8b), we have

1aC k cot δ0 D k2

Z 1

0dr (v0vk � u0u k ) D k2

Z R

0dr (v0vk � u0u k )

where we used u0 D u k D 0 at r D 0 and u k D vK at r D 1. In the limit k ! 0,vk ! v0.

Page 917: Yorikiyo Nagashima - Archive

892 Appendix I Answers to the Problems

Chapter 14: Quark Model

Problem 14.1:We show for the case of antisymmetric tensor. Ai j ! A0 i j

A0 i j D U iaU j

b Aab D �U jb U i

a Aba D �U jaU i

b Aab D �A0 j i

In going to the second last equality, we used that suffix for sum can be freelychanged. For symmetric tensor Si j , the proof is similar.Problem 14.2: (I3, Y ) of A12 D 1p

2(ud � du), A23 D 1p

2(d s � sd), A31 D 1p

2(su �

us), are (0, 2/3), (�1/2,�1/3), (1/2,�1/3) which are the same quantum numbers asthose of s, u, d. Therefore, this three members belong to 3�.Problem 14.3:

ψ01 D ε i j k � 0 i � 0 j � 0 k D ε i j k U i

a U jb U k

c � a � b � c

D det[U ia ]εabc � a � b � c D εabc � a � b � c D ψ1

Problem 14.4: Inserting quantum numbers (Y, I ) D (1, 1/2), (0, 1), (0, 0), (�1, 1/2)of (N, Σ , Λ, � ) into Eq. (14.31), we obtain

mN D a C b C c/2 , mΣ D a C 2c , mΛ D a , m� D a � b C c/2 .

Eliminating a, b, c leads to Eq. (14.30a). Equality (14.30b) for mesons can be ob-tained by changing mN ! m2

K , m� ! m2K , mΛ ! m2

η , mΣ ! m2π .

Problem 14.5�: This takes an elaborate algebra.Tensor Method: Baryons are composed of qi q j qk , but as far as SU(3) propertiesare concerned, it is convenient to use the equality Eq. (14.12) with meson octetssubstituted with baryon octets.

B ij D

2664� Σ 0p

2C Λ0p

6Σ C p

�Σ � Σ 0p2C Λ0p

6n

� � �� 0 �q

23 Λ0

3775

Bij D

�B j

i

��D

26664� Σ

0p

2C Λ

0p

6�Σ

C�

C

Σ� Σ

0p

2C Λ

0p

6��

0

p n �q

23 Λ

0

37775 (I.12)

As the mass term is expressed as a tensor Bij B a

b , the solution is obtained if we canmake a combination that transforms as a singlet and the 8-th member of an octet.As stated in Sect. 14.1.3, meson singlet and octet can be expressed as

M0 DuuC dd C s s

p3

Dδ j

ip3

qi q j � S ba qa qb

(M8)ij D qi q j �

13

δ ij qa qa D

�δ i

a δbj �

13

δ ij δb

a

�qa qb � (O i

j )ba qa qb (I.13)

Page 918: Yorikiyo Nagashima - Archive

Appendix I Answers to the Problems 893

Therefore, to reduce an arbitrary tensor to one that transforms like the mass term,we make contractions of any (i, j ) [or any pair of (i, b), (a, j ),(a, b)] with S i

j and(O3

3)ij and remove residual suffix (a � � � , b � � � ) dependence by using δa

b , �abc, �abc.Consequently, the mass term takes a form

A0fB ij B j

i g C Bf(O33)i

j Bja B a

i g C Cf(O33)i

j Bai B j

a g

D AfBij B j

i g C (O33)i

j [F0fB j

a B ai � B

ai B j

a g C D 0fB ja B a

i C Bai B j

a g]

As the second term in O33 [δ3

3δba/3] has the same transformation property as a sin-

glet, we move it to the singlet term and consider only the first term as the octetterm, namely we are allowed to change (O3

3)ab ! δa

3 δ3b . In summary, the Hamilto-

nian mass term becomes

M D AfBij B j

i g C FfB3a B a

3 � Ba3 B3

ag C DfB3a B a

3 C Ba3 B3

ag

D Afp p C nn C Σ�

Σ C C Σ0Σ 0 C Σ

CΣ � C Λ

0Λ0

C �0� 0 C �

C� �g

C Ffp p C nn � �0� 0 � �

C� �g C Dfp p C nn C �

0� 0

C �C

� � C43

Λ0Λ0g

) MN D AC F C D , M� D A � F C D , MΣ D A ,

MΛ D AC43

D

Eliminating A, F, D , we finally obtain

MN C M� D12

(MΣ C 3MΛ)

Method Using U Spin: We consider to express the mass in terms of quantum num-bers of S U(3). Conventionally used (I3, Y ) are based on subgroup S U(2)I �U(1)Y .As ψλ8ψ has the quantum number Y and does not differentiate Λ from Σ 0, weconsider a different subgroup S U(2)U �U(1)Q. S U(2)U is symmetry of d and s andis differentiated by U3. U3 of (d, s) is (C1/2, �1/2). (d, s) has U-spin 1/2 and u is aU-spin singlet. The electric charge Q differentiate (d, s) and u. In terms of U3 andQ, λ8 is reexpressed as λ8 D

p3Y D (2/

p3)(U3 C V3) D (4/

p3)(U3 C 1/2I3) (see

Eq. (G.17c) (G.17d) in Appendix G). Using (Q, U). The octet members are classifiedas

(Q, U) D (1, 1/2)! (p , Σ C) , (�1, 1/2)! (Σ �, � �) ,

(0, 1)! (n, Σ 0, Λ0) , (0, 0)! � 0

Σ 0, Λ0 are eigenstates of I3, but not of U3. To obtain U3 D 0 eigenstate, we applyU� D (λ6 � λ7)/2 to U D U3 D C1 eigenstate i.e. jni. The operation is analogousto obtain I3 D 0 state from I D I3 D C1 state by applying I� D (λ1 � λ2)/2.

Page 919: Yorikiyo Nagashima - Archive

894 Appendix I Answers to the Problems

I�, U� operations are to exchange (u! d, d ! �u), (d ! s, s ! �d). Similar tothe meson octet for which

jη0i D jI W 0, 0i D (uuC dd � 2s s)/p

6

I�jπCi D I�(ud) D �(uu � dd) Dp

2[(�uuC dd)/p

2]

Dp

2jI W 1, 0i

jU W 0, 0i D (�2uuC dd C s s)/p

6

U�jU W 1, 1i D U�jK0i D U�(d s) Dp

2[(�dd C s s)/p

2]

Dp

2jU W 1, 0i

From the above equation, we can obtain

jU W 1, 0i D �12

π0 �

p3

2η D �

12jI W 1, 0i �

p3

2jI W 0, 0i

jU W 0, 0i D

p3

2π0 �

12

η D

p3

2jI W 1, 0i �

12jI W 0, 0i

Substituting corresponding baryon octets in the above equation, we have

jU W 1, 0i D �12jI W 1, 0i �

p3

2jI W 0, 0i D �

12

Σ 0 �

p3

2Λ0

As the mass of particles within a multiplet can be expressed as a C bU3 C c I3,neglecting I3

Mn D hU W 1, 1jM jU W 1, 1i D a C b

hU W 1, 0jM jU W 1, 0i D

*�

12

Σ 0 �

p3

2Λ0

ˇˇM

ˇˇ�1

2Σ 0 �

p3

2Λ0

+

D14

MΣ 0 C34

MΛ0 D a

M� 0 D hU W 1,�1jM jU W 1,�1i D a � b

Eliminating a, b, we finally obtain

MN C M� D12

(MΣ C 3MΛ )

Problem 14.6: Substituting Eq. (14.33) in Eq. (14.34a),�M2

8 M280

M208 M2

0

�D

�c s�s c

��M2

φ 00 M2

ω

� �c �ss c

D

"c2M2

φ C s2M2ω cs(�M2

φ C M2ω)

cs(�M2φ C M2

ω) s2M2φ C c2M2

ω

#

where c D cos θ , s D sin θ . On the other hand, from Eq. (14.35)

M2φ � M2

8 D M20 � M2

ω D s2M2φ C c2M2

ω �M2ω D s2(M2

φ �M2ω )

) sin2 θ DM2

φ � M28

M2φ �M2

ω

Page 920: Yorikiyo Nagashima - Archive

Appendix I Answers to the Problems 895

u d u u d u

u d s u d d

u s u d

s s s sΛ φ n φ

p K p π Figure I.3 Left: K� p ! φΛ, Right: π� p ! φn.

Problem 14.7: Using the quark model, K� D su, p D uud, φ D s s, Λ D uds,π� D du, n D udd, reactions can be described as in Fig. I.3. Namely in theprocess K� p ! φΛ, quark lines are connected but in π� p ! φn are not.Problem 14.8: The decay process of a vector particle into a lepton pair is one wherea pair of qq annihilates into a photon, subsequently decaying to a lepton pair. Thismeans the decay amplitude is proportional to the quark charge. Since pair annihi-lations uu and dd cannot be distinguished, the two amplitudes have to be added inthe decay process of �, ω D (uu˙ dd)/

p2 and φ D s s. They are proportional toˇ

ˇ 1p

2

�23�

��

13

� ˇˇ2 W

ˇˇ 1p

2

�23C

��

13

� ˇˇ2 W

ˇˇ�1

3

ˇˇ2 D 1

2W

118W

19

Problem 14.9: The decay rate Eq. (14.40) can be derived as follows. We consider avector V( J P D 1�) or a scalar S(0�) meson which is a composite of quarks (wheremq ml is assumed) in the relative orbital angular momentum S-state. The crosssection qq ! l l is given by Eq. (C.17).

σTOT D4πα2

3sQ2 f

i

"1C

2� 2i � 2

f

2C

(1� 2i )(1� 2

f )

4

#

where

i Dq

s � 4m2q/p

s , f D

qs � 4m2

l /p

s , s D 4E 2CM

and Q is the electric charge of the quark q. The cross section is a sum of spin tripletcross section σt and spin singlet σs, i.e. σ D 3

4 σt C14 σs. For the vector meson

decay V ! l l, σs does not contribute. Hence the decay rate Γ in the limit f ! 1(s D 4m2

q m2l ) is given by

Γ D (flux) � σt D 2 qjψ(0)j243

σ D16πα2

3Q2 jψ(0)j2

m2V

where we have put s D m2V , i ! 0 (nonrelativistic limit). ψ(0) is the S-state wave

function at the origin whose shape depends on the potential acting on the qq pair.it is smaller than Eq. (14.40) by a factor three. This is because the quark has threecolors which shall be explained soon. Three identical diagrams contribute to thetotal decay amplitude, therefore the decay rate has to be multiplied by 9 and has tobe divided by 3 because flux of each color is 1/3 on the average.

Page 921: Yorikiyo Nagashima - Archive

896 Appendix I Answers to the Problems

Problem 14.10: Performing cyclic replacement starting from

(ud � du)u("# � #") " D u " d # u " Cd # u " u " �u # d " u "

� d " u # u "

One obtains:

u " d # u " Cd # u " u " �u # d " u " �d " u # u "

C u " u " d # Cu " d # u " �u " u # d " �u " d " u #

C d # u " u " Cu " u " d # �d " u " u # �u # u " d "

D 2u " u " d # �u " u # d " �u # u " d "

C 2u " d # u " �u # d " u " �u " d " d #

C 2d # u " u " �d " u " u # �d " u # u "D Eq. (14.52)

Problem 14.12: If one makes wave function of the neutron, one sees it has permu-tation symmetry u $ d. Therefore, the magnetic moment of the neutron μn canbe obtained from μ p by exchange of u $ d. Magnetic moments of other baryonshaving two identical quarks out of three can be obtained in a similar manner.

μn D μ p (u$ d) , μΣ C D μ p (d ! s) , μΣ � D μn(u! s)

μ� 0 D μn(d ! s) , μ� � D μ p (u! s)

As Ω � is made of s s s and has spin s D 3/2, its wave function is given by s " s "s ". Therefore the magnetic moment is given by μΩ � D 3μ s .

The quark components of Σ 0, Λ0 are uds and the same for both, but they haveisospin I D 1, 0 hence the wave functions are symmetric (antisymmetric). Then,by symmetry, the spin wave functions of Σ 0(Λ0) are also symmetric (antisymmet-ric). The structure of the spin wave function of Σ 0 then is the same as that of theproton in Eq. (14.52) and that of Λ0 is antisymmetric and has the same structureas Problem 14.10.

jΣ 0i D uds1p

6f2 ""# �("# C #") "g ,

jΛ0i D uds1p

2f("# � #") "g

As a preparation for calculating μΣ Λ, let us use a method different given in thetext. The Hamiltonian for the magnetic moment in the quark model is given by

H D μu σz(u)C μd σz (d)C μ s σz (s)

σz (u) means that it acts only on u quark. Since σz #D � #, we have

H jΣ 0i Dudsp

6[μuf2 ""# �("# � #") "g

C μdf2 ""# �(� "# C #") "g

C μ sf�2 ""# �("# C #") "g] (I.14)

Page 922: Yorikiyo Nagashima - Archive

Appendix I Answers to the Problems 897

By multiplying hΣ 0j from left and calculating the expectation values,

μΣ 0 D46

μu C46

μd C μ s

��

46C

16C

16

�D

23

(μu C μd ) �13

μd

Similarly, multiplying hΛj by

H jΛ0i Dudsp

2fμu("# C #") " Cμd (� "# � #") "

C μ s("# � #") "g

we obtain μΛ D μ s .Multiplying Eq. (I.14) by hΛ0j, we have

μΣ Λ D �1p

3μu C

1p

3μd D

�μu C μdp

3

Problem 14.13: The gluon has colors (8 species), hence ee cannot be converted toa gluon. As the gluon has the same C-parity (�1) as the photon, J/ψ having thesame quantum number as the photon cannot be converted to 2 gluons. Hence,three gluons have to be exchanged.Problem 14.14:

si � s j D12f(si C s j )2 � s2

i � s2j g

D12fS(S C 1) � s i (s i C 1) � s j (s j C 1)g

D

(12

˚1(1C 1) � 2 � 1

2

� 12 C 1

��D 1

4 S D 112

˚0(0C 1) � 2 � 1

2

� 12 C 1

��D � 3

4 S D 0Xi, j

(si � s j ) D12f(s1 C s2 C s3)2 � s2

1 � s22 � s2

3g

D12

�S(S C 1) � 3 �

12

�12C 1

D

(12

˚ 32

� 32 C 1

�� 3 � 1

2

� 12 C 1

��D 3

4 S D 32

12

˚ 12

� 12 C 1

�� 3 � 1

2

� 12 C 1

��D � 3

4 S D 12

Problem 14.15:

M D3X

iD1

mi C BXi, j

si � s j

mi m j

As Δ consists of u, d and has spin s D 3/2, the first term gives 3mu. UsingEq. (14.98b) the second term becomes 3

4B

m2u. Therefore,

m(Δ) D 3mu C B�

34

1m2

u

Page 923: Yorikiyo Nagashima - Archive

898 Appendix I Answers to the Problems

The equation for N can be obtained from the above formula by changingP

(si �

s j )! �3/4 (Eq. (14.98b)), m(N ) D 3mu C B� 3

41

m2u

�As � � consists of u or d C 2s and has spin s D 3/2,

M D mu C 2ms C B�

ss � ss

m2sC

2(su � ss)mu ms

.

Any quark pair makes spin s D 1, (ss � ss) D (su � ss) D 1/4. Therefore,

m(� �) D mu C 2ms C B�

14m2

sC

12mu ms

�� has spin s D 1/2, but as the pair s s is symmetric, so is the spin wave functionhence its spin has to be s D 1.

2su � ss D12

˚(su C ss C ss)2 � s2

u � (ss C ss)2�D

12

�12

�12C 1

��

34� 1(1C 1)

D �1

) m(� ) D mu C 2ms C B��

1mu ms

C1

4m2s

�As Ω is made of s s s, the mass can be obtained from that of uuu D ΔCC byreplacing u! s.

) m(Ω ) D 3ms C B�

34

1m2

s

�Problem 14.16:

M D m1 C m2 CC

m1m2(s1 � s2) , (s1 � s2) D �3/4(S D 0) , 1/4(S D 1)

φ D s s , S D 1 �! mφ D 2ms C C1

4m2s

K� D su , S D 1 �! mK� D mu C ms C C1

4mu ms

ω D (uuC dd)/p

2 , S D 1 �! mω D 2mu C C1

4m2u

�, S D 1 �! m� D mω

As η D (uuC dd � 2s s)(p

6), it has mass mπ with probability 1/3 and has mass ofa particle with s s, S D 0. Since s � s D �3/4

mη D13

�2mu � C

34m2

u

�C

23

�2ms � C

34m2

s

D23

(mu C 2ms) � C34

�1

m2uC

1m2

s

K D us , S D 0 �! mK D mu C ms � C3

4mu ms

π D ud , S D 0 �! mπ D 2mu � C3

4m2u

Page 924: Yorikiyo Nagashima - Archive

Appendix I Answers to the Problems 899

Chapter 15: Weak Interaction

Problem 15.1: From Eq. (15.23), the most general Lagrangian for the beta decay canbe expressed as

L D G

X(p Γ 0

i n)[Ci (eΓi ν)C C 0i (eΓi γ 5ν)]C h.c.

Γi D 1(S ), γ μ(V ), σμν(T ), γ μ γ 5(A), i γ 5(P )

Γ 0i D 1(S ), γμ(V ), σμν(T ), γμ γ 5(A), i γ 5(P )

h.c. D G

X(nγ 0(Γ 0

i )† γ 0 p )[C�i (νγ 0Γ †

i γ 0e)

C (C 0i )�(νγ 0γ 5†γ 0γ 0Γ †

i γ 0e)]

Using relations γ 0Γ †i γ 0 D Γi , γ 0γ 5γ 0 D �γ 5.

The hemitian conjugate becomes

h.c. D G

X(nΓ 0

i p )[C�i (νΓi e) � (C 0

i )�(νγ 5Γi e)] (h.c. of the first term)

P: P transformation changes (see Appendix F)

ψ1Γi ψ2 ! ψ1γ 0Γi γ 0ψ2 D εP i ψ1Γ 0i ψ2 ,

ψ1Γ 0i ψ2 ! ψ1γ 0Γ 0

i γ 0ψ2 D εP i ψ1Γi ψ2

ψ1Γi γ 5ψ2 ! ψ1γ 0Γi γ 0γ 0γ 5γ 0ψ2 D �εP i ψ1Γ 0i γ 5ψ2 ,

εP i denotes sign change depending on transformation property of Γi . For instance,

�P i D C1 , for i D S, V, T, �1 for A, P .

Since εP i (p Γi n)εP i (eΓ 0i ν) D (p Γ 0

i n)(eΓi ν), we have

(p Γ 0i n)[Ci (eΓi ν)C C 0

i (eΓi γ 5ν)]P! (εP i p Γi n)[Ci (εP i eΓ 0

i ν) � C 0i (εP i eΓ 0

i γ 5ν)]

D (p Γ 0i n)[Ci (eΓi ν)� C 0

i (eΓi γ 5ν)]

Namely, the sign change εP i due to different Γi and exchange operation Γi $ Γ 0i

do not make any difference in the final Lagrangian. Only the sign change by γ 5 !

γ 0γ 5γ 0 D �γ 5 in the second term in [� � � ] transforms C 0i ! �C 0

i . Therefore, wehave shown that P invariance means C 0

i D 0.T: T transformation changes

Ci ψ1Γi ψ2 ! C�i ψ1B�1Γ �

i B ψ2 D C�i εT i ψ1Γ 0

i ψ2

) (p Γ 0i n)[Ci (eΓi ν)C C 0

i (eΓi γ 5ν)]T! (p B�1(Γ 0

i )�B n)

� [C�i (eB�1Γ �

i B ν)C (C 0i )�(eB�1Γ �

i B B�1(γ 5)�B ν)]

Page 925: Yorikiyo Nagashima - Archive

900 Appendix I Answers to the Problems

Inserting B�1γ μ �B D γμ , B�1γ 5 � B D γ 5, the above equation becomes

D (εT i p Γi n)[C�i (εT i eΓ 0

i ν)C (C 0i )�(εT i eΓ 0

i γ 5ν)]

D (p Γ 0i n)[C�

i (eΓi ν)C (C 0i )

�(eΓi γ 5ν)]

This shows that T invariance means Ci D C�i , C 0

i D (C 0i )

�.C: C transformation changes

Ci ψ1Γi ψ2C�! �Ci ψT

1 C�1Γi C ψT2 D �Ci εC i ψT

1 Γ Ti ψT

2

D Ci εC i ψ2Γi ψ1

) (p Γ 0i n)[Ci (eΓi ν)C Ci (eΓi γ 5ν)]

C�! (�p C�1Γ 0

i C n)

� [�Ci (eC�1Γi C ν) � C 0i (eC�1Γi C C�1γ 5C ν)]

Considering C D i γ 2γ 0 C�1Γi C C�1γ 5C D εC i Γ T(γ 5)T.The above equation becomes

D (εC i nΓ 0i p )[Ci(εC i νΓi e)C C 0

i (εC i νγ 5Γi e)]

D (nΓ 0i p )[Ci (νΓi e)C C 0

i (νγ 5Γi e)]

Comparing with Eq. (h.c. of the first term), it means Ci D C�i , C 0

i D �(C 0i )

�Problem 15.2: Inserting Eq. (15.25c) into Eq. (15.18) in the text

dΓ D G2(C2

F jh1ij2 C C2

G T jhσij2)(1C v cos θ )

�1

(2π)5 p E(Δ � E )2dE dΩe dΩν

using Z(1C v cos θ )dΩe dΩν D 16π2 ,

Z Δ

0E 2(Δ � E )2dE D

Δ5

30

we obtain

Γ DZ

dΓ DG2

Δ5

60π3(C2

F jh1ij2 C C2

G T jhσij2)

Problem 15.3: Only variables that are not integrated in the integrand are Δ Dp1 � p4 D p2 C p3. Lorentz invariance requires the integral is a tensor of rank 2and can be put in a form

S μν D Agμν C B Δμ Δν

In order to obtain coefficients A, B , we calculate C D S μν gμν , D D S μν Δμ Δν

p μ2 p ν

3 gμν D p2 � p3 D12

(Δ2 � p 22 � p 3

3) DΔ2

2

p μ2 p ν

3 Δμ Δν D (p2 � Δ)(p3 � Δ) D (p2 � p3)2 DΔ2

4

Page 926: Yorikiyo Nagashima - Archive

Appendix I Answers to the Problems 901

In CM of the particles 2 and 3, (E2CE3) D 2p . ThenR d3 p2d3 p3

4E2 E3δ4(Δ� p2� p3) D π

2

(Agμν C B Δμ Δν )gμν D 4AC B Δ2 DπΔ2

4

(Agμν C B Δμ Δν )Δμ Δν D (AC B Δ2)Δ2 DπΔ4

8

) A DπΔ2

24, B D

π12

Problem 15.4: Denoting energy momentum of the muon in its rest frame as (mμ, 0)and that of the electron as (E , p ), Eq. (15.82) is written as

dΓ DG2

μ(p1μ S μν p4ν)

2mμ π5

d3 p4

E4,

p1μ S μν p4ν Dπ24

˚(p1 � p4)Δ2 C 2(p1 � Δ)(p4 � Δ)

�(I.15)

In the rest frame of the muon,

p1 � p4 D mμ E , Δ2 D (mμ � E )2 � p 2 D m2μ � 2mμ E C m2

e ,

p1 � Δ D p 21 � p1 � p4 D m2

μ � mμ E ,

p4 � Δ D p1 � p4 � p 24 D mμ E � m2

e

Substituting the above relations in Eq. (I.15)

dΓ DG2

μ

48π4

n3(m2

μ C m2e )E � 4mμ E 2 � 2mμ m2

e

o d3 pE

(I.16)

Putting m2e D 0, p D E D mμ

2 x , d3 pE D 4π p d p , we finally obtain

dΓ DG2

μ m5μ

96π3 x2(3� 2x )dx

Problem 15.5: We insert the following relations in Eq. (I.16).

p Dmμ

2x , E D

2

px2 C 4ω , ω D

�me

�2

xmax D 1� ω

dΓ DG2

μ m5μ

96π3

n3(1C ω)

px2 C 4ω � 2(x2 C 4ω) � 4ω

o x2dxp

x2 C 4ω

By making use of equalities;Zdx

px2 C c

D logˇx C

px2 C c

ˇZ p

x2 C c dx D12

nxp

x2 C c C c logˇx C

px2 C c

ˇoZ

x2p

x2 C c dx D18

n2x (x2 C C )

32 � cx

px2 C c

�c2 logˇx C

px2 C c

ˇo

Page 927: Yorikiyo Nagashima - Archive

902 Appendix I Answers to the Problems

we obtain

Γ DG2

μ m5μ

192π3f

m2

e

m2μ

!, f (ω) D 1� 8ω C 8ω3 � ω4 � 12ω2 log ω

Problem 15.7: We start from Eq. (15.114a)

M(πC ! π0eCν) DGp

2

p2(k0 μ C k μ)u(pe)γμ(1� γ 5)v (pν)

Inserting k0 μ C k μ D (mπC C mπ0 )δμ0

j(k0 μ C k μ)u(pe)γμ(1� γ 5)v (pν)j2

D (mπC C mπ0 )2δμ0δν0Tr[( 6pe C me)γ μ(1 � γ 5) 6pν γ ν(1� γ 5)]

D 2(mπC C mπ0 )2δμ0δν0Tr[6pe γ μ 6pν γ ν ]

D 8(mπC C mπ0 )2[2Ee Eν � (pe � pν)]

D 8(mπC C mπ0 )2Ee Eν(1C ve cos θ )

Γ D (2π)4δ4(Δ � pe � pν)1

2mπC

�Xspin

jMj2d3 pπ

(2π)32ωd3 pe

(2π)32Ee

d3 pν

(2π)32Eν

DG2

(2π)5

8(mπC C mπ0 )2

16mπC mπ0

Zδ(Δ � Ee � Eν)Ee Eν(1C ve cos θ )

�p 2

e d pe dΩe

Ee

p 2ν d pν dΩν

DG2

π3

(mπC C mπ0 )2

4mπC mπ0

Zpe Ee(Δ � Ee)2dEe

DG2

Δ5

30π3

(1� Δ/(2mπC ))2

1� Δ/mπCR(me/Δ)

R(x ) D 30Z 1

xd y y (1� y )2

qy 2 � x2

Dp

1 � x2

�1�

92

x2 � 4x4�C

152

x4 ln1Cp

1� x2

x

The cos θ term vanishes when integrated over solid angle. To calculate R(x ), useZd y y

qy 2 � 1 D (1/3)(y 2 � 1)3/2

Zd y y 2

qy 2 � 1 D

18

�2y (y 2 � 1)3/2 C y (y 2 � 1)1/2 � ln

ˇy C

qy 2 � 1

ˇ�Z

d y y 3q

y 2 � 1 D (1/5)(y 2 � 1)5/2 C (1/3)(y 2 � 1)3/2

Page 928: Yorikiyo Nagashima - Archive

Appendix I Answers to the Problems 903

Problem 15.8: As π is a boson having J P D 0�, its total wave function is symmet-ric. In order to have J D 0 with two pions, the angular orbital momentum has tobe zero, making the spatial part of the war function symmetric. This means isospinpart of the wave function has to be symmetric, too. As I D 1 wave function is an-tisymmetric, πCπ0 can be only in I D 2 state. Therefore, Δ I D 1/2 rule forbidstransition KC ! πCπ0.Problem 15.9:

1. From the previous problem, KS, KL can decay only to I D 0. I D 0 wave func-tion can be constructed using the Clebsch–Gordan coefficients as

jI W 0, 0i D1p

3(πCπ� � π0π0 C π�πC)

and decay rates to πCπ�, π0π0, π�πC are equalD 1/3. But as observationallyπCπ� and π�πC cannot be distinguished, the decay rate to πCπ� is twiceas that to π0π0. Note that the decay KL ! 2π does not occur if CP is invariant.

2. We assume that the spurion s has I D 1/2, I3 D �1/2. Combinations with thespurion are

js� �i D j1,�1i

js� 0i D

ˇˇ12 ,�

12

� ˇˇ12 ,C

12

�D

1p

2(j1, 0i C j0, 0i)

hs� �jT jΛπ�i D h1,�1jT j1,�1i D T1

hs� 0jT jΛπ0i D1p

2(h1, 0j C h0, 0j)T j1, 0i D

1p

2T1

) Γ (� � ! Λπ�)Γ (� 0 ! Λπ0)

D

ˇˇ T(� � ! Λπ�)

T(� 0 ! Λπ0)

ˇˇ2 D 2

3.

jsΣ Ci Dr

13

ˇˇ32 ,

12

�C

r23

ˇˇ12 ,

12

jsΣ �i Dˇˇ32 ,�

32

jnπCi Dr

13

ˇˇ32 ,

12

�C

r23

ˇˇ12 ,

12

jnπ�i Dˇˇ32 ,�

32

jp π0i D

r23

ˇˇ32 ,

12

��

r13

ˇˇ12 ,

12

Page 929: Yorikiyo Nagashima - Archive

904 Appendix I Answers to the Problems

From above relations

aC D hΣ CjT jnπCi D13

T 32C

23

T 12

a� D hΣ �jT jnπ�i D T 32

a0 D hΣ CjT jp π0i D

p2

3

T 3

2� T 1

2

�) aC C

p2a0 D a�

Problem 15.10: From Eqs. (15.132), we can calculate the transition amplitude anddecay rate as

M DGsp

2f fC(kK μ C kπμ)C f�(kK μ � kπμ)gu(pν)γ μ(1� γ 5)v (pe)

dΓ DXspin

jMj2

2mKdLI P S D (2π)4δ4(kK � kπ � pe � pν)

Xspin

jMj2

2mK

d3 kπ d3 pe d3 pν

(2π)62ωπ2Ee2Eν

k μK C k μ

π D 2k μK � (k μ

K � k μπ ) D 2k μ

K � (p μe C p μ

ν )

The second term becomes m` by operating on leptonic current. We set m` D 0from now on.X

spin

jMj2 D 2G2s j fCj2kK μ kK ν

Xju(pν)γ μ(1C γ 5)v (pe)j2

(C.9)DDD 16G2

s j fCj2kK μ kK νfp μe p ν

ν C p νe p μ

ν � gμν(pe � pν)

C i εμν�σ pe� pνσg

D 16G2s j fCj2f2(kK � pe)(kK � pν)� m2

K (pe � pν)g (I.17)

) dΓ D (2π)4δ4(kK � kπ � pe � pν)G2

s j fCj2

mK

� 8f2(kK � pe)(kK � pν) � m2K (pe � pν)g

d3kπ d3 pe d3 pν

(2π)92ω2Ee2Eν(I.18)

To calculate energy spectrum of the electron, one decomposes dΓ as follows.

dΓ DG2

s j fCj2

mK

d3 pe

(2π)32Ee8f(2kK � pe)kK μ � m2

K peμg

Z(2π)4δ4(kK � kπ � pe � pν)p μ

νd3kπ d3 pν

(2π)62ω2EνZ� � �

(C.24)DDD

(1� y )2

16π(kK � pe)μ D

14π

�Emax � Ee

mK � 2Ee

�2

(kK � pe)μ

y Dm2

π

(kK � pe)2 Dm2

π

mK (mK � 2Ee), Emax D

m2K � m2

π

2mK

Page 930: Yorikiyo Nagashima - Archive

Appendix I Answers to the Problems 905

Substituting

f2(kK � pe)kK μ � m2K peμg(kK � pe)μ D k2

K (kK � pe) � 2(kK � pe)

D m2K (mK � 2Ee)

in Eq. (I.16), we have

dΓdEeD

G2s j fCj2

2π3 mK E 2e

(Emax � Ee)2

mK � 2Ee

To obtain π spectrum, we rewrite Eq. (I.15) as follows.

dΓ DG2

s j fCj2

4π2mK

d3kπ

(2π)32ω8f2kK μ kK ν � gμν m2

KgZδ4(kK � kπ � pe � pν)p μ

e p νν

d3 pe d3 pν

2Ee2EνZ� � �

(10.53)DDD

π24

(gμν C 2qμ qν) D 4f(kK � q)2 � m2K q2g

D 4m2K (ω2 � m2

π) D 4m2K k2

π

where q D kK � kπ .

) dΓdωD

G2s j fCj2

12π3 mK k3π

In order to calculate the total decay rate, we integrate the above formula over energy.Noting

ωmin D mπ Dmk

22y , y D

mK,

ωmax Dm2

K C m2π

2mKD

mK

2(1C y 2)

Γ DG2

s j fCj2

192π3m5

K

Z 1Cy 2

2y

�qx2 � 4y 2

�3

dx DG2

s j fCj2

768π3f�

m2π

m2K

�f (y ) D 1� 8y C 8y 3 � y 4 � 12y 2 log y

In deriving the result, we have usedZ(x2 C C )3/2dx D

18

[2x (x2 C C )3/2 C 3C x (x2 C C )1/2

C 3C2 log jx Cp

x2 C C j]

Chapter 16: Neutral Kaons

Problem 16.1: Substituting p jK0i C qjK0i D

�pq

�in Eq. (16.10),

�Λ11 Λ12

Λ21 Λ22

� �pq

�D λ

�pq

Page 931: Yorikiyo Nagashima - Archive

906 Appendix I Answers to the Problems

Solutions can be obtained by solving the secular equations.((Λ11 � λ)p C Λ12q D 0

Λ21 p C (Λ22 � λ)q D 0

From the above equation, we get

qpD

λ � Λ11

Λ12D

Λ21

λ � Λ22

λ2 � (Λ11 C Λ22)λ C Λ11Λ22 � Λ12Λ21 D 0

) λS, L DΛ11 C Λ22

s�Λ11 � Λ22

2

�2

C Λ12Λ21

λS λL D Λ11Λ22 � Λ12Λ21

Problem 16.2: Solutions to x�

M Δ2

Δ2 M

�are give by

jK1i D1p

2

�11

�, jK2i D

1p

2

�1�1

Then to first order in δ, the following equality is also true.�M C δΔ Δ

2Δ2 M � δΔ

�1p

2

�1C δ1� δ

�D

�M C

Δ2

�1p

2

�1C δ1� δ

) jK1Mi D1p

2

�1C δ1 � δ

�D jK1i C δjK2i

is a solution in matter with eigenvalue M C Δ/2. Similarly

jK2Mi D1p

2

�1 � δ�1� δ

�D jK2 > �δjK1i

is also a solution with eigenvalue M � Δ/2. Note, as is shown in Eq. (16.35), Mrepresents (λS C λL)/2 � πN

m ( f C f ).Problem 16.3: Use Eq. (16.13) in the text,

Λ D�

Λ11 Λ12

Λ21 Λ22

�D U

�ΛS 00 ΛL

�U�1 ,

�α

�D U

�α00

Page 932: Yorikiyo Nagashima - Archive

Appendix I Answers to the Problems 907

U can be obtained as follows. Using Eqs. (16.65), ψ D αjK0i C jK0i can be

reexpressed in the jKSi, jKLi base as

ψ D α0jKSi C 0jKLi

D α0(pp

1 � zjK0i C qp

1C zjK0i)

C 0(pp

1C zjK0i � qp

1 � zjK0i)

D (α0 pp

1 � z C 0 pp

1C z)jK0i C (α0qp

1C z � qp

1 � z)jK0i

D αjK0i C jK0i

) U D�

pp

1� z pp

1C zqp

1C z �qp

1� z

�U�1 D

12

24

p1�zp

p1Czq

p1Czp �

p1�zq

35

The rest of the calculations is straightforward.Problem 16.5: From Eq. (16.52c), CP violation means there is a small imaginarypart in M�

12Γ12.) M�12Γ12 D M12Γ 12e i Δ� Δ� � 1 where M12, Γ 12 are signed

real numbers. Writing M12 D M12e i �M , Γ12 D Γ 12e i �M ei Δ� ,pΛ12Λ21 D

�(M12 � i Γ12/2)

�M�

12 � i Γ �12/2

�1/2

D M12 � i Γ 12/2C O(Δ� 2)

) Δλ D �2p

Λ12Λ21 ' �2

M12 � i Γ 12/2�! (16.81d)

Problem 16.6: Use Eq. (16.79d) and Eq. (16.80d).Problem 16.7: Using

jK0i D1

2p(jKSi C jKLi) , jK

0i D

12q

(jKSi � jKLi) , jp j2 C jqj2 D 1

We calculate the decay asymmetry δ f . Denoting the final state phase space as �,

δ f �Γ (K

0! f ) � Γ (K0 ! f )

Γ (K0! f )C Γ (K0 ! f )

DΓ (K

0! f ) � Γ (K0 ! f )

Γ (KS ! f )

D1

Γ (KS ! f )

(ˇˇ 12q

(h f jT jKSi � h f jT jKLi)ˇˇ2 �

ˇˇ 12p

(h f jT jKSi C h f jT jKLi)ˇˇ2 �

)

D14

(ˇˇ 1q�

1 �h f jT jKLi

h f jT jKSi

�ˇˇ2 �

ˇˇ 1

p

�1Ch f jT jKLi

h f jT jKSi

�ˇˇ2)

Dj1 � η f j

2

4jqj2�j1C η f j

2

4jp j2'jp j2 � jqj2

4jp j2jqj2�

2Reη f

4jp j2jqj2

' 2(Re� � Reη f )

Page 933: Yorikiyo Nagashima - Archive

908 Appendix I Answers to the Problems

Now consider j f i D jπCπ�i or jπ0π0i. Referring to Eq. (16.120), we have

ηC� D �0 C �0 , �0 � 2�0

) δπ0π0 D 2Re(� � η00) D 4Re�0 , δC� D 2Re(� � ηC�) D �2�0

Problem 16.8:ˇˇ A f

A f

ˇˇ D ˇˇ (A S f � A L f )/2q

(A S f C A L f )/2p

ˇˇ D

ˇˇ p

q1� η f

1C η f

ˇˇ ' 1C 2Re(� � η f )

ˇˇ Aπ0 π0

A π0π0

ˇˇ �

ˇˇ AπC π�

A πCπ�

ˇˇ D 2Re(ηC� � η00) D 6Re�0

Chapter 17: Hadron Structure

Problem 17.1:

(17.8) W F(Q2) D

Rd r�(r)e i Q�rR

d r�(r)W

�(r) D �0 r R , � D 0 r > RZd r�(r) D 4π�0

Z R

0r2dr D

4π�0

3R3

Zd r�e i Q�r D 2π�0

Z R

0r2dr

Z C1

�1e i Qr z dz D

4π�0

Q

Z R

0r sin Qr dr

D4π�0

Q3

Z R Q

0� sin � d� D

4π�0

Q3 fsin � � � cos � g

) F(Q2) D3(sin � � � cos � )

� 3 , � D QR

(17.9) W �(r) D �0e�r/aZd r�(r) D 4π�0

Z 1

0e�r/a r2dr D 8π�0a3

Zd r�0 e�r/a e i Q�r D 2π�0

Z 1

0e�r/a r2dr

Z C1

�1e i Qr z dz

D2π�0

i Q

Z 1

0r(e�(r/a�i Qr )� e�(r/aCi Qr ))dr

D2π�0

i Q

�1

(1/a � i Q)2�

1(1/a C i Q)2

�D 8π�0

a3

(1C a2Q2)2

) F(Q2) D1

(1C a2Q2)2

Problem 17.2:

e J μ(x ) D eu(p 0)h

F1(q2)γ μ ��

2MF2(q2)i σμνqν

iu(p )e i qx

Page 934: Yorikiyo Nagashima - Archive

Appendix I Answers to the Problems 909

We first prove @μ J μ D 0. As @μ ! i qμ, qμ γ μ D p/0� p/, inserting the Dirac equationu(p 0) 6p 0 D M u(p 0), 6p u(p ) D M u(p ), the first term vanishes. The second termqμ σμν qν vanishes also from the different symmetry of qμ qν and σμν. If there existsa term proportional to qμ , it enters in J μ as F3qμ . Since, qμ qμ D q2 does not vanishgenerally, the current conservation requires F3 D 0. Next, we show that a term like(p 0 C p )μ can be expressed in terms of γ μ and i σμνqν .

i σμνqν D12

[�γ μ( 6p 0� 6p )C ( 6p 0� 6p )γ μ]

D12

[�2(p 0 C p )μ C 2 6p 0γ μ C 2γ μ 6p ]

D �(p 0 C p )μ C 2M γ μ

Next, using Eq. (17.15), we calculate J μ A μ. In the nonrelativistic limit,

(p 0 C p )μ

2M�! (1I v) , q0 ! 0 ) (p 0 C p )μ A μ

2M! (φ � v � A)

i σμνqμ A ν �!A 0

2[γ � qγ 0 � γ 0γ � q]C i ε i j k σk q i A j

D �A 0q � α C σ � (i q � A) D σ � B

where we have used the fact that α goes to zero in the nonrelativistic limit as it hasno diagonal elements and that i q � A D r � A D B.Problem 17.3: Energy-momentum conservation gives

M2X D P2

F D (k C p � k0)2 D (k � k0)2 C 2p � (k � k0)C p 2

D q2 C 2M(ω � ω0)C M2

Using Q2 D �q2, ω � ω0 D Q2

2M CM2

X �M2

2M . For the elastic scattering,

MX D M ,

Q2 D �(k � k0)2 D �k2 � k0 2 C 2k � k0 D 2ωω0(1 � cos θ )

ω � ω0 Dωω0

M(1� cos θ )

Problem 17.4: Since, qμ L μν D qν L μν D 0, only terms proportional to gμν , p μ p ν inW μν do not vanish.

gμν L μν(17.34b)DDD 2[2k0 � k C 2q2] D 2q2 D �8ωω0 sin2 θ

2

p μ p ν L μν D 2�

2(p � k)(p � k0)Cq2 p 2

2

D 2�

2M2ωω0 � 2M2ωω0 sin2 θ2

D 4M2ωω0 cos2 θ2

) L μν W μν D 4ωω0�

W2 cos2 θ2C 2W1 sin2 θ

2

Page 935: Yorikiyo Nagashima - Archive

910 Appendix I Answers to the Problems

Substituting L μν W μν in Eq. (17.34a), and using k � p D M ω, d3k0 D ω0 2dω0dΩ

dσ D(4πα)2

4(k � p )4πM L μν W μν

Q4

d3k0

(2π)32ω0

D4α2

Q4 ω0 2�

W2 cos2 θ2C 2W1 sin2 θ

2

�dω0dΩ

Problem 17.5: Comparing Eq. (17.12) with Eq. (17.41)ZW2dω0 D

ω0

ω,

Z2W1 dω0 D

Q2

2M2

ω0

ω

Then using Eq. (17.43), we can obtain Eq. (17.42).Alternative Method: Insert J μ(p ) D u(p 0)γ μ u(p ) in Eq. (17.34c) for W μν .X

spin

J μp J ν�

p

DX

u(p 0)γ μ u(p )u(p )γ ν u(p 0) D Tr[( 6p 0 � M )γ μ( 6p � M )γ ν]

(C.10)DDD 4

�p μ p 0 ν C p ν p 0 μ C

q2

2gμν

Although

p D (M I 0), p 0 D p C q D PF D (EF I P F ) D (M C ω � ω0I k � k0)

q term in p 0 does not contribute because of Eq. (17.39). Considering final statedensity

PF D

d3PF(2π)32EF

, etc.

W μν D(2π)4

4πM

XF

δ4(p C q � PF )12

Xspin

J μp j ν�

p

DMEF

δ(M C ν � EF )��gμν Q2

4M2 Cp μ p ν

M2

Considering

EF D

q(k � k0)2 C M2 D

qQ2 C ν2 C M2 ,

@(ν � EF )@ν

DMEF

we have

W μν D δ�

ν �Q2

2M

���gμν Q2

4M2 Cp μ p ν

M2

) W2 D δ�

ν �Q2

2M

�, W1 D

Q2

4Mδ�

ν �Q2

2M

Problem 17.6: Using Eq. (17.30) in the text, x D Q2

2M ν < 1. For y it is obvious fromEqs. (17.61b).

Page 936: Yorikiyo Nagashima - Archive

Appendix I Answers to the Problems 911

Problem 17.7: Using equality

@x@ cos θ

D �ωM

1 � yy

,@y@ω0 D �

that can be derived from Eqs. (17.61), we have

J �

ˇˇ @x

@ cos θ@x@ω0

@y@ cos θ

@y@ω0

ˇˇ D 1

M(1� y )

y

Considering

ω0 2 D ω2(1� y )2 , cos2 θ2' 1 , sin2 θ

2D

M2x ys(1� y )

,

s D 2ωM , ν Ds y

2Md2 σ

dx d yD

2πJ

d2σdω0dΩ

(17.41)DDD 2πM

y1� y

4α2

Q4

�s2

4M2

�(1 � y )2

�W2 C

2M2x ys(1� y )

W1

D4πα2

Q4s�

ν W2(1� y )C 2x M�

y 2

2

�W1

�Problem 17.8: εμ(0) is a Lorentz vector, and satisfies

qμ εμ(0) D a�

q � p �(p � q)

q2q2�D 0

Namely, it satisfies the condition to be a photon polarization vector. Furthermore, itis normalized as εμ(0)εμ(0) D 1 and is time-like. In fact, using variables evaluatedin the target protons rest frame,

a DQ

Mp

Q2 C ν2, p μ D (M I 0, 0, 0) , (p � q) D M ν ,

qμ D

νI 0, 0,q

ν2 C Q2�

, (Q2 D �q2)

We can show it agrees with Eq. (17.77b) in the text.Problem 17.9: Using ε(0) in the Problem 17.8, W μν can be expressed as

W μν (17.38)DDD

���gμν C

qμ qν

q2

�W1 C

εμ(0)εν(0)a2M2 W2

�Using εμ(˙)εμ(0) D 0, εμ(˙)gμν εν(˙)� D �1

σT D12

4π2 αK

XλD˙

εμ(λ)εν(λ)� W μν D4π2α

KW1

σL D4π2α

Kεμ(0)εν(0)� W μν D

4π2αK

��W1 C

1a2M2 W2

D4π2α

K

��1C

ν2

Q2

�W2 � W1

Page 937: Yorikiyo Nagashima - Archive

912 Appendix I Answers to the Problems

In deriving the last two equalities, we have used

εμ(0)gμν εν(0)� D 1 , a�2 D p 2 �(p � q)

q2D M2

�1C

ν2

Q2

�Problem 17.10: Denoting the parton mass as μ, and the energy and momentumof the proton in the Breit frame as (E I0, 0, P ), the parton’s momentum transfer q2

can be calculated as

p D (p 0I px , p y , x P ) , p 0 D (p 0I px , p y ,�x P ) I

p 0 Dq

p 2x C p 2

y C x2P2 C μ2

q2 D (p 0 � p )2 D �4x2P2 , �μ(0) D (1, 0, 0, 0)

As the electron scatters with the parton elastically, the structure function W μν onthe parton target is proportional to

Lμν D 2�

p μ p 0 ν C p ν p 0 μ C gμν q2

2

�(See Eq. (17.34b)). Therefore, the photon-parton cross section in the Breit frame isgiven by

σL / εμ(0)εν(0)� W μν / L00 D 2�

2p 0 p 0 0 Cq2

2

�D 4(p 2

T C μ2 C x2P2 � x2P2) D 4(p 2T C μ2)

σT /12

XλD˙

εμ(λ)εν(λ)� W μν /12

(Lx x C Ly y ) D 2px p 0x C 2p y p 0

y � q2

D Q2 C 2p 2T

Q2!1DDD Q2 , ) σL

σTD

4(p 2T C μ2)Q2

Problem 17.11:

d2σdω0dΩ

D4α2

Q4 ω0 2�

W2 cos2 θ2C 2W1 sin2 θ

2

�(I.19)

We re-write the contents in [� � � ] as follows

W2 C tan2 θ2

W1 D 2 tan2 θ2

1C ν2

Q2

�W2 � W1 C

n1C 2

1C ν2

Q2

�tan2 θ

2

oW1

2

1C ν2

Q2

�tan2 θ

2

� 2 tan2 θ2

11� ε

(W1 C ε WL)

Here we used WL D (1C ν2/Q2)W2 � W1. Inserting

ω0 2 cos2(θ /2)Q2 D

14

ω0

ω1

tan2(θ /2)WT,L D

K4π2α

σT,L

Page 938: Yorikiyo Nagashima - Archive

Appendix I Answers to the Problems 913

in Eq. (I.19)

d2σdω0 dΩ

D Γ (σT C σL) , Γ DαK

2π2Q2

ω0

ω1

1� ε

Problem 17.12: First we derive the parton approximation for ε. According toEqs. (17.61), ν � O(s/M ), Q2 � O(s), ν2 Q2,

ε D�

1C 2ν2 C Q2

Q2 tan2 θ2

��1

'

"1C 2

� s2M y

�2sx y

M2x ys(1� y )

#�1

D2(1 � y )

1C (1 � y )2

) 11� ε

D1C (1� y )2

y 2

Next considering @y@ω0 D �

1ω , @Q2

@ cos θ D �2ωω0, we have

dQ2d y D@(y , Q2)

@(ω0, cos θ )D 2ω0 dω0d cos θ ,

d2σdQ2d y

Dπω0

d2

dω0dΩD

πω0

α2K2π2Q2

ω0

ω1C (1� y )2

y 2σT

Inserting K/ω ' ν/ω D y

d2σ Dα

2π1C (1� y )2

yσT

dQ2

Q2 d y

Problem 17.13: Use (C.8c).

Chapter 18: Gauge Theories

Problem 18.1: Let the field strength inside the solenoid with radius a be B0. Sincethe magnetic field is along the z direction, we expect the vector potential in φ Dtan�1(y/x ) direction. The magnetic field inside the solenoid can be reproduced bythe potential Ain D

B02 (�y , x ) D 1

2 B � r D B02 �eφ where � D

px2 C y 2.

The vector potential outside the solenoid should take a form r f (r), and we noterφ takes the form

Aout D k��

yx2 C y 2 ,

xx2 C y 2 , 0

�D

k�

eφ .

The constant k can be determined by the condition,IAout � d r D

ZBn dS D B0πa2

As Aout D (k/�)eφ ,H

jrD�j A � d r DR 2π

0k��dφ D 2πk D πa2B0,) k D 1

2 B0a2

Page 939: Yorikiyo Nagashima - Archive

914 Appendix I Answers to the Problems

k can also be determined by the condition

Ain(� D a) D Aout(� D a)

Let us consider a function φ D tan�1� y

x

�, which has a cut at y D 0, x < 0 because

of periodicity of the angle. φ differs by 2π on both sides of the cut.

rφ � r tan�1 y

x

�D

��

yx2 C y 2

,x

x2 C y 2, 0�

(x < 0, y > 0) , tan φ < 0) φ D π at y !C0

(x < 0, y < 0) , tan φ > 0) φ D �π at y ! �0

) φx<0,y!C0 � φx<0,y!�0 D 2π

By a gauge transformation

A! A0 D ACr� , � D �12

B0a2φ(x , y ) , φ(x , y ) D tan�1(y/x )

A0 can be made 0 everywhere except at the cut, which is called a Dirac string.Problem 18.7:

D μ Φ †(Dμ Φ ) � Φ †�i gW W � t � i

gB

2B � Y

��

i gW W � t C igB

2B � Y

�Φ

Use t D τ/2, Y Φ D 1 and replace Φ with (0, v/p

2) and

D μ Φ †(Dμ Φ ) � (0, v/p

2)(1/2)2

�gW W 0 C gB B,

p2gW W C

p2gW W �, �gW W 0 C gB B

�2 �0

v/p

2

Dv2

8

h2g2

W W � W C C (�gW W 0 C gB B)2i

Dg2

W v2

4W � W C C

g2Z v2

8Z0Z0

Considering there is a factor 2 difference in the mass term between chargedand neutral boson, namely

˝mass term in the Lagrangian D m2

2

P3iD1 W 2

i D

m2(W � W C C W 02/2)˛

we have mW Dg W v

2 , mZ DgZ v

2 .

Page 940: Yorikiyo Nagashima - Archive

915

Appendix JParticle Data

Table J.1 Baryons

Name Quark contents Spin Mass Life time (sec) Principal decays

(MeV/c2) /width (MeV)

N p uud 1/2 938.272 1 –n udd 1/2 939.565 885.7 s p e� ν e

Λ uds 1/2 1115.68 2.63� 10�10 s p π�, nπ0

Σ C uus 1/2 1189.37 8.02� 10�11 s p π0, nπC

Σ 0 uds 1/2 1192.64 7.4� 10�20 s ΛγΣ � dds 1/2 1197.45 1.48� 10�10 s nπ�

� 0 uss 1/2 1314.9 2.90� 10�10 s Λπ0

� � dss 1/2 1321.7 1.64� 10�10 s Λπ�

ΛCc udc 1/2 2286.5 2.00� 10�13 s p K π, Λππ, Σ ππ

Δ(CC,C,0,�) uuu, uud, udd, 3/2 1232 118 MeV N πddd

Σ � (C,0,�) uus, uds, dds 3/2 1385 37 MeV Λπ, Σ π� � (0,�) uss, dss 3/2 1533 9 MeV � πΩ � sss 3/2 1672 8.2� 10�11 s ΛK�, � π

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 941: Yorikiyo Nagashima - Archive

916 Appendix J Particle Data

Table J.2 Mesons

Name Quark contents Spin Mass Life time (sec) Principal decays(MeV/c2) /width (MeV)

π˙ ud , du 0 139.570 2.60� 10�8 s μνμ

π0 uu, dd 0 134.977 8.4� 10�17 s γ γK˙ us, su 0 493.68 1.24� 10�8 s μνμ , ππ, πππ

K0, K0

ds , sd 0 497.61 KS 8.95� 10�11 s ππKL 5.12� 10�8 s πμνμ , πeνe , πππ

η uu, dd , ss 0 547.85 1.30 keV γ γ , πππη0 uu, dd , ss 0 957.66 0.20 MeV ηππ, �γD˙ cd , dc 0 1869.6 1.04� 10�12 s K ππ, K μνμ , Keνe

D0, D0

cu, uc 0 1864.8 4.1� 10�13 s K ππ, K μνμ , Keνe

Ds cs, sc 0 1968.5 5.0� 10�13 s η�, η0�, φ�

�(˙,0) ud , uu, dd , du 1 775.5 149 MeV ππK� (˙,0) us, ds , sd , su 1 894 50 MeV K πω uu, dd 1 782.6 8.5 MeV πππ, πγφ ss 1 1019.5 4.3 MeV K K , πππJ/ψ cc 1 3097 93 keV e� eC , μ� μC , �π, ωπ� bb 1 9460 54 keV e� eC , μ� μC , τ� τC

Page 942: Yorikiyo Nagashima - Archive

917

Appendix KConstants

Table K.1 Constants

Quantity Symbol Value unit

Speed of light in vacuum c 2.997925� 108 m s�1

Planck constant h 6.626069� 10�34 J sPlanck constant, reduced „ 1.054572� 10�34 J s

D 6.582119 � 10�22 MeV sConversion constant „c 197.326963 MeV fm a

Electron charge magnitude e 1.602176� 10�19 CElectron mass me 0.510999 MeV/c2

Proton mass mp 938.272013 MeV/c2

D 1.672622 � 10�27 kgNeutron mass mn 939.56536 MeV/c2

Avogadro constant NA 6.022142 mol�1

Boltzmann constant k 1.380650� 10�23 J K�1

D 8.617343 � 10�5 eV K�1

Fine structure constant α D e2/4π�0„c 7.297353� 10�3

D 1/137.036000 b

Classical electron radius re D α„/me c 2.817940� 10�15 mBohr radius a1 D (1/α)„/me c 0.529177� 10�10 mThomson cross section σT D 8πr2

e /3 0.665245 barn c

Gravitational constant GN 6.67428� 10�11 m3 kg�1 s�2

D 6.70881 � 10�39„c (GeV/c2)�2

Planck massp„c/GN 1.22089� 1019 GeV/c2

Planck lengthp„GN /c3 1.61624� 10�35 m

Planck timep„GN /c5 5.39124� 10�44 s

Fermi coupling constant GF/(„c)3 1.16637� 10�5 GeV�2

W ˙ boson mass mW 80.398 GeV/c2

Z0 boson mass mZ 91.1876 GeV/c2

Weak mixinga angle sin2 θWd 0.2231

Strong coupling constant α s (mZ ) 0.1176

a fm D 10�15 m.b At Q2 D 0. � 1/128 at Q2 D m2

W .c barn D 10�28 m2.d defined by 1� m2

W /m2Z .

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 943: Yorikiyo Nagashima - Archive
Page 944: Yorikiyo Nagashima - Archive

919

References

1 B. Fowler Nucl. Phys. B (Pro. Supple),36:37, 1994, ibid. C. Peyrou, p. 59 andother articles p. 3–568 in “30 years ofbubble chamber physics”.

2 S.A. Abel and O. Lebedev. J. High EnergyPhys., 01:133, 2006.

3 H. Abramowicz et al. Z. Phys. C, 17:283,1983.

4 M.R. Adams et al. Phys. Rev. D, 54:3006,1996.

5 S.L. Adler. Phys. Rev., 140:B736, 1965.6 S.L. Adler. Phys. Rev., 143:1144, 1966.7 S.L. Adler. Phys. Rev., 177:2426, 1969.8 Y. Aharonov and D. Bohm. Phys. Rev.,

115:485, 1959.9 L.A. Ahrens et al. Phys. Lett. B, 202:284,

1988.10 ALEPH Collaboration. Nucl. Instrum.

Methods A, 294:121, 1990.11 ALEPH Collaboration. Eur. Phys. J. C,

2:197, 1998.12 C. Alff et al. Phys. Rev. Lett., 9:322, 1962.13 C. Alff et al. Phys. Rev. Lett., 9:325, 1962.14 W.W.M. Allison. The Interaction of

Charged Particles and Photons in Mat-ter. In C.W. Fabjan and J.E. Pilcher,editors, Proceedings of the ICFA Schoolon Instrumentation in Elementary Par-ticle Physics at Trieste, Italy, 1987, pages3–42, World Scientific, 1987.

15 W.W.M. Allison and J.H. Cobb. Annu.Rev. Nucl. Part. Sci., 30:253, 1980.

16 G. Altarelli. Annu. Rev. Nucl. Part. Sci.,39:357, 1989.

17 G. Altarelli and G. Parisi. Nuovo Cimen-to, 4:335, 1974.

18 G. Altarelli and G. Parisi. Nucl. Phys. B,126:298, 1977.

19 L. Alvarez-Gaumé et al. Phys. Lett. B,458:347, 1999.

20 U. Amaldi et al. Phys. Lett. B, 260:447,1991.

21 C.D. Anderson. Phys. Rev., 41:405, 1932.22 C.D. Anderson. Phys. Rev., 43:491, 1933.23 H.L. Anderson. Phys. Rev., 100:279,

1955.24 D. Andrews et al. Phys. Rev. Lett.,

44:1108, 1980.25 P.L. Anthony et al. Phys. Rev. Lett.,

75:1949, 1995.26 P.L. Anthony et al. Phys. Rev. D, 56:1373,

1997.27 R. Arai et al. Nucl. Instrum. Methods Phys.

Res., 217:181, 1983.28 ARGUS Collaboration: H. Albrecht et al.

Phys. Lett. B, 246:278, 1990.29 N. Arkani-Hamed et al. Phys. Lett. B,

429:263, 1998.30 X. Artru et al. Phys. Rev. D, 12:1289,

1975.31 J. Ashkin et al. Phys. Rev., 101:1149,

1956.32 ATLAS Collaboration. ATLAS-TDR-002;

CERN-LHCC-96-041. Technical report,CERN LHCC, 1996.

33 J.J. Aubert et al. Phys. Rev. Lett., 33:1404,1974.

34 J.-E. Augustin et al. Phys. Rev. Lett.,33:1406, 1974.

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 945: Yorikiyo Nagashima - Archive

920 References

35 N. Bacino. Phys. Rev. Lett., 41:13, 1978.36 J. Bailey et al. Nucl. Phys. B, 150:1, 1979.37 C.A. Baker et al. Phys. Rev. Lett.,

97:131801, 2006.38 G. Bakker et al. Nucl. Phys. B, 16:53,

1970.39 G.S. Bali. Phys. Rep., 343:1, 2001.

arXiv:hep-ph/0001312v2.40 B. Balke et al. Phys. Rev. D, 37:587, 1988.41 C. Baltay. In S. Homma et al., editors,

Proc. 19th Int. Conf. on High EnergyPhysics, Tokyo, 1978. Physical Society ofJapan, Tokyo, 1979.

42 A. Bandyopadhyay et al. Rep. Prog. Phys.,72:106201, 2009. arXiv:0710.4947v2[hep-ph].

43 D.P. Barber et al. Phys. Rev. Lett., 43:830,1979.

44 M. Bardon et al. Phys. Rev. Lett., 14:449,1965.

45 V. Barger et al. Nucl. Phys. B, 5:411, 1968.46 V. Barger and M. Olsson. Phys. Rev.,

146:1080, 1966.47 V. Bargmann et al. Phys. Rev. Lett., 2:435,

1959.48 V.V. Barmin et al. Nucl. Phys. B, 247:293,

1984.49 V.E. Barnes et al. Phys. Rev. Lett., 12:204,

1964.50 G. Barr. Presented at Lepton-Photon

99 (LP99), 1999. http://www.sldnt.slac.stanford.edu/lp99/ (accessed March2010).

51 J.R. Batley et al. Phys. Lett. B, 544:97,2002.

52 BCDMS Collaboration: A.C. Benvenutiet al. Phys. Lett. B, 223:485, 1989.

53 M.A.B. Bég and A. Sirlin. Phys. Rep.,88:1, 1982.

54 A.I. Belesev et al. Phys. Lett. B, 350:263,1995.

55 J.S. Bell and R. Jackiw. Nuovo Cimento A,60:47, 1969.

56 Belle Collaboration: A. Go et al. Phys.Rev. Lett., 99:131802, 2007.

57 J.B. Bellicard et al. Phys. Rev. Lett.,19:527, 1967.

58 V.B. Berestetski, E.M. Lifshitz, and L.P.Pitaevskii. Relativistic Quantum Theory.Pergamon, New York, 1971.

59 P. Berge et al. Z. Phys. C, 49:187, 1991.60 J. Bernabeu et al. CPT and quantum

mechanics tests with kaons. In A. Di

Domenico, editor, Handbook on Neu-tral Kaon Interferometry at a Phi-Factory,page 39. Istituto Nazionale di Fisica Nu-cleare, Laboratori Nazionali di Frascati,2006. arXiv:hep-ph/0607322v1.

61 H. Bethe and W. Heitler. Proc. R. Soc.London A, 146:83, 1934.

62 J.D. Bjorken. Phys. Rev., 179:1547, 1969.63 J.D. Bjorken and S.D. Drell. Relativistic

Quantum Mechanics. McGraw-Hill, NewYork, 1964.

64 J.D. Bjorken and S.D. Drell. Relativis-tic Quantum Fields. McGraw-Hill, NewYork, 1965.

65 J.D. Bjorken and E. A. Paschos. Phys.Rev., 185:1975, 1969.

66 E. Blanke et al. Phys. Rev. Lett., 51:355,1983.

67 F. Bloch and A. Nordsieck. Phys. Rev.,52:54, 1937.

68 P. Bloch and L. Tauscher. Annu. Rev.Nucl. Part. Sci., 53:123, 2003.

69 E. Bloom et al. Phys. Rev. Lett., 23:930,1969.

70 E. Blucher. Presented at Lepton-Photon99 (LP99), 1999. http://www.sldnt.slac.stanford.edu/lp99/ (accessed March2010).

71 T. Böhringer et al. Phys. Rev. Lett.,44:1111, 1980.

72 M. Bourquin et al. Z. Phys. C, 21:27,1983.

73 A.M. Boyarski et al. Phys. Rev. Lett.,34:1357, 1975.

74 T.H. Boyer. Phys. Rev., 174:1764, 1968.75 H. Braun et al. Nucl. Phys. B, 89:210,

1975.76 M. Breidenbach et al. Phys. Rev. Lett.,

23:935, 1969.77 A. Breskin et al. Nucl. Instrum. Methods,

124:189, 1975.78 S. J. Brodsky et al. Phys. Rev. D, 4:1532,

1971.79 C.N. Brown et al. Phys. Rev. Lett.,

63:2637, 1989.80 R. Brown et al. Nature, 163:47, 1949.81 F. Bumiller. Phys. Rev., 124:1623, 1961.82 M.T. Burgy et al. Phys. Rev. Lett., 1:324,

1958.83 M.T. Burgy et al. Phys. Rev., 120:1829,

1960.84 N. Cabibbo. Phys. Rev. Lett., 10:531, 1963.

Page 946: Yorikiyo Nagashima - Archive

References 921

85 C.G. Callan, Jr. and D.J. Gross. Phys. Rev.Lett., 22:156, 1969.

86 A. Camacho and A. Camacho-Galván.Rep. Prog. Phys., 70:1937, 2007.

87 L. Camilleri et al. Annu. Rev. Part. Nucl.Sci., 58:343, 2008.

88 D.D. Carmony and R.T. van de Walle.Phys. Rev. Lett., 8:73, 1962.

89 P.A. Carruthers. Introduction to UnitarySymmetry. John Wiley & Sons, New York,1966.

90 A.A. Carter. Nucl. Phys. B, 58:378, 1973.91 W.F. Cartwright et al. Phys. Rev., 91:677,

1953.92 H.B.G. Casimir. Proc., K. Ned. Akad.

Wet., B, 51:793, 1948.93 CERN. http://public.web.cern.ch/

public/en/Research/AccelComplex-en.html (accessed March 2010).

94 J. Chadwick. Proc. R. Soc. London A,136:692, 1932.

95 G. Charpak et al. Nucl. Instrum. Methods,62:262, 1968.

96 M.-S. Chen and P. Zerwas. Phys. Rev. D,12:187, 1975.

97 T.-P. Cheng and L.-F. Li. Gauge Theoryof Elementary Particle Physics. ClarendonPress, Oxford, 1984.

98 C.B. Chiu. Annu. Rev. Nucl. Sci., 22:255,1972.

99 J. Christenson et al. Phys. Rev. Lett.,13:138, 1964.

100 D.L. Clark et al. Phys. Rev., 85:523, 1952.101 F.E. Close. An Introduction to Quarks

and Partons. Academic Press, New York,1979.

102 S. Coleman. Science, 206:1290, 1979.103 R. Collela et al. Phys. Rev. Lett., 34:1472,

1975.104 E.D. Commins and P.H. Bucksbaum.

Weak Interactions of Leptons and Quarks.Cambridge University Press, Cam-bridge, 1983.

105 C.L. Cowan, Jr. et al. Science, 124:103,1956.

106 P. Coyle and O. Schneider. C.R. Physique,3:1143, 2002.

107 CPLEAR Collaboration: A. Angelopou-los et al. Phys. Rep., 374:165, 2003.

108 CPLEAR Collaboration: A. Angelopoulset al. Phys. Lett. B, 444:43, 1998.

109 CPLEAR Collaboration: R. Adler et al.Nucl. Instrum. Methods Phys. Res. A,379:76, 1996.

110 J.W. Cronin et al. Phys. Rev. Lett., 18:25,1967.

111 G. Culligan et al. Nature, 180:751, 1957.112 G. Danby et al. Phys. Rev. Lett., 9:36,

1962.113 A. Das. Field Theory: A Path Integral Ap-

proach. World Scientific, Singapore, 2ndedition, 2006.

114 DASP Collaboration: R. Brandelik et al.Phys. Lett. B, 70:132, 1977.

115 M. Davier. Proc. SLAC Summer Instituteon Particle Physics, SLAC-R-528, 1997.

116 E.H. de Groot and J. Engels. Z. Phys. C,1:141, 1979.

117 A. De Rújula, H. Georgi andS.L. Glashow Phys. Rev. D, 12:147, 1975.

118 B.S. Deaver, Jr. and W. Fairbank. Phys.Rev. Lett., 7:43, 1961.

119 DELPHI Collaboration. Nucl. Instrum.Methods Phys. Res. A, 303:233, 1991.

120 T.J. Devlin and J.O. Dickey. Rev. Mod.Phys., 51:237, 1979.

121 C.O. Dib and R.D. Peccei. Phys. Rev. D,46:2265, 1992.

122 N. Doble et al. Nucl. Instrum. MethodsPhys. Res. B, 119:181, 1996.

123 Y.L. Dokshitzer. Sov. Phys. – JETP,46:641, 1977.

124 J. Drees and H.E. Montgomery. Annu.Rev. Part. Sci., 33:383, 1983.

125 J. Dunkley et al. Astrophys. J. Suppl.,180:306, 2009.

126 R. Durbin et al. Phys. Rev., 83:646, 1951.127 W. Ehrenberg and R.E. Siday. Proc. Phys.

Soc. B, 62:8, 1949.128 A. Einstein et al. Phys. Rev., 47:777, 1935.129 J. Ellis. Nucl. Instrum. Methods Phys. Res.

A, 284:33, 1989.130 J. Ellis et al. Nucl. Phys. B, 241:381, 1984.131 F. Englert and R. Brout. Phys. Rev. Lett.,

13:321, 1964.132 P.T. Eschstruth et al. Phys. Rev.,

165:1487, 1968.133 L. Evans et al. In G. Altarelli and L. DiLel-

la, editors, Proton–Antiproton ColliderPhysics, page 33. World Scientific, Singa-pore, 1989.

134 C.W. Fabjan et al. Nucl. Instrum. Methods,141:61, 1977.

Page 947: Yorikiyo Nagashima - Archive

922 References

135 C.W. Fabjan and T Ludlum. Annu. Rev.Nucl. Part. Sci., 32:335, 1982.

136 L.D. Faddeev and V.N. Popov. Phys. Lett.B, 25:29, 1967.

137 U. Fano. Annu. Rev. Nucl. Sci., 13:1,1963.

138 T. Ferbel. Experimental Techniques inHigh Energy Physics. Addison-Wesley,Reading, MA, 1987.

139 E. Fermi and C.N. Yang. Phys. Rev.,76:1739, 1949.

140 W. Fetscher. Phys. Lett. B, 140:117, 1984.141 W. Fetscher et al. Phys. Lett. B, 173:102,

1986.142 R.P. Feynman. In Proc. 3rd Topical Conf.

High Energy Collision of Hadrons. C.N.Yang, J.A. Cole, M. Good, R. Hwa, andJ. Lee-Franzini, editors, Gordon Breach,New York, p. 237, 1969.

143 R.P. Feynman. Phys. Rev. Lett., 23:1415,1969.

144 R.P. Feynman and M. Gell-Mann. Phys.Rev., 109:193, 1958.

145 H.E. Fisk and F. Sciulli. Annu. Rev. Nucl.Part. Sci., 32:499, 1982.

146 L. Fonda et al. Rep. Prog. Phys., 41:587,1978.

147 H. Frauenfelder and R.M. Steffan. InK. Siegban, editor, α, , γ Ray Spec-troscopy, page 1431. North-Holland,Amsterdam, 1966.

148 J.I. Friedman. Rev. Mod. Phys., 63:615,1991.

149 J.I. Friedman and H.W. Kendall. Annu.Rev. Nucl. Sci., 22:203, 1972.

150 M. Froissart. Phys. Rev., 123:1053, 1961.151 K. Fujikawa and R.E. Shrock. Phys. Rev.

Lett., 45:963, 1980.152 T. Gabriel et al. Nucl. Instrum. Methods,

171:237, 1980.153 J.-M. Gaillard et al. Phys. Rev. Lett., 18:20,

1967.154 M.K. Gaillard and B.W. Lee. Phys. Rev.

D, 10:897, 1974.155 G.M. Garibian et al. Nucl. Instrum. Meth-

ods, 125:133, 1975.156 S. Gasiorowicz. Elementary Particle

Physics. John Wiley & Sons, New York,1966.

157 J. Gasser and U.-G. Meißner. Phys. Lett.B, 258:219, 1991.

158 H. Geiger and E. Marsden. Philos. Mag.,25:604, 1913.

159 M. Gell-Mann. Phys. Rev., 92:833, 1953.160 M. Gell-Mann. Phys. Rev., 125:1067,

1962.161 M. Gell-Mann. Phys. Lett., 8:214, 1964.162 M. Gell-Mann and M. Levy. Nuovo Cim.,

16:705, 1960.163 M. Gell-Mann, P. Ramond, and R. Slan-

sky. Supergravity. North-Holland, Ams-terdam, 1979.

164 S. Gentile and M. Pohl. Phys. Rep.,274:287, 1996.

165 H. Georgi. Lie Algebras in Particle Physics.Perseus Books, New York, 2nd edition,1999.

166 H. Georgi and S.L. Glashow. Phys. Rev.Lett, 32:438, 1974.

167 S.S. Gershtein and Ya. B. Zeldovich. Sov.Phys. – JETP, 2:596, 1956.

168 F.J. Gilman. Phys. Rev., 167:1365, 1968.169 F.J. Gilman. Phys. Rev. D, 35:3541, 1987.170 V.L. Ginzburg and L.D. Landau. Zh.

Eksp. Teor. Fiz., 20:1064, 1950.171 S. Gjesdal et al. Phys. Lett. B, 52:113,

1974.172 D.A. Glaser. Phys. Rev., 87:665, 1952.173 S.L. Glashow. Nucl. Phys., 22:579, 1961.174 S.L. Glashow et al. Phys. Rev. D, 2:1285,

1970.175 R.L. Gluckstern. Nucl. Instrum. Methods,

24:381, 1963.176 M.L. Goldberger and K.M. Watson. Col-

lision Theory. John Wiley & Sons, NewYork, 1964.

177 G. Goldhaber. In Proc. of the SummerInstitute on Particle Physics, volume 198of SLAC Publication, page 379. 1979.

178 G. Goldhaber et al. Phys. Rev. Lett.,37:255, 1976.

179 M. Goldhaber et al. Phys. Rev., 109:1015,1958.

180 J. Goldstone. Nuovo Cimento, 19:154,1961.

181 M.L. Good. Phys. Rev., 110:550, 1958.182 K. Gottfried. Phys. Rev. Lett., 18:1174,

1967.183 K. Goulianos. Phys. Rep. C, 101:169,

1983.184 M.G. Green, S.L. Lloyd, P.N. Ratoff, and

D.R. Ward. Electron-Positron Physics atthe Z. IOP Publishing, Bristol, 1998.

185 O.W. Greenberg. Phys. Rev. Lett., 13:598,1964.

Page 948: Yorikiyo Nagashima - Archive

References 923

186 B.R. Greene. The Elegant Universe. W.W.Norton, New York, 1999.

187 G.N. Gribov and L.N. Lipatov. Sov. J.Nucl. Phys., 15:438, 675, 1972.

188 D.J. Gross and Llewellyn Smith C.H.Nucl. Phys. B, 14:337, 1969.

189 D.J. Gross and F. Wilczek. Phys. Rev.Lett., 30:1343, 1973.

190 D.J. Gross and F. Wilczek. Phys. Rev. D,8:3633, 1973.

191 G.S. Guralnik et al. Phys. Rev. Lett.,13:585, 1964.

192 F. Gürsey and L.A. Radicati. Phys. Rev.Lett., 13:173, 1964.

193 M.Y. Han and Y. Nambu. Phys. Rev.,139:B1006, 1965.

194 L.N. Hand. Phys. Rev., 129:1834, 1963.195 G. Hanson et al. Phys. Rev. Lett., 35:1609,

1975.196 Y. Hara. Phys. Rev., 134:B701, 1964.197 B.W. Harris et al. Phys. Rev. A, 62:052109,

2000.198 P.G. Harris et al. Phys. Rev. Lett., 82:904,

1999.199 F.J. Hasert et al. Phys. Lett. B, 46:121,

1973.200 F.J. Hasert et al. Phys. Lett. B, 46:138,

1973.201 S.W. Hawking. Commun. Math. Phys.,

87:395, 1982.202 K Hayashi and T. Nakano. Prog. Theor.

Phys., 38:491, 1967.203 W. Heitler. The Quantum Theory of Radi-

ation. Oxford University Press, Oxford,3rd edition, 1954. Republished by Dover,Mineola, NY, 1984.

204 A.J.G. Hey and R.L. Kelly. Phys. Rep.,96:71, 1983.

205 P.W. Higgs. Phys. Rev. Lett., 13:508, 1964.206 R. Hofstadter. Rev. Mod. Phys., 28:214,

1956.207 G.E. Hogan et al. Phys. Rev. Lett., 42:948,

1979.208 K. Huang. Quarks, Leptons and Gauge

Fields. World Scientific, Singapore, 2ndedition, 1992.

209 F. Hubaut. Nucl. Instrum. Methods A,518:31, 2004.

210 P. Huet and M.E. Peskin. Nucl. Phys. B,434:3, 1995.

211 J. Iizuka. Prog. Theor. Phys. Suppl., 37:21,1966.

212 M. Ikeda et al. Prog. Theor. Phys., 22:715,1959.

213 W.R. Innes et al. Phys. Rev. Lett., 39:1240,1977.

214 N. Isgur and G. Karl. Phys. Rev. D,18:4187, 1978.

215 C. Itzykson and J.-B. Zuber. QuantumField Theory. McGraw-Hill, New York,1980. Republished by Dover, Mineola,NY, 2006.

216 J.D. Jackson. Classical Electrodynamics.John Wiley & Sons, New York, 3rd edi-tion, 1998.

217 M. Jacob and G.C. Wick. Annals of Phys.,7:404, 1959.

218 R.C. Jaklevic et al. Phys. Rev., 140:A1628,1965.

219 F. Jegerlehner and A. Nyffeler. Phys.Rep., 477:1, 2009. F. Jegerlehner.arXiv:hep-ph/0703125v3.

220 P.K. Kabir. The CP Puzzle: Strange De-cays of the Neutral Kaon. Academic Press,Boston, MA, 1968.

221 P.K. Kabir. Phys. Rev. D, 2:540, 1970.222 G. Kallen. Elementary Particle Physics.

Addison–Wesley, Reading, MA, 1964.223 T. Kamae et al. Nucl. Instrum. Methods

Phys. Res. A, 252:423, 1986.224 K. Kasahara. Phys. Rev. D, 31:2737, 1985.225 KEK-E246 Collaboration: M. Abe et al.

Phys. Rev. Lett., 93:131601, 2004.226 KEK-E246 Collaboration: M. Abe et al.

Phys. Rev. D, 73:072005, 2006.227 K.W. Kendall. Rev. Mod. Phys., 63:597,

1991.228 L.A. Khalfin. Soviet Phys. Dokl., 2:340,

1957. [Original: Dokl. Akad. Nauk. USSR115:277,1957].

229 L.A. Khalfin. Pis’ma Zh. Eksp. Teor. Fiz.,8:106, 1968.

230 T.W.B. Kibble. J. Math. Phys., 2:212,1961.

231 T. Kinoshita. Phys. Rev. Lett., 2:477, 1959.232 T. Kinoshita et al. Phys. Rev. Lett., 52:717,

1984.233 O.C. Kistner and B.M. Rustad. Phys.

Rev., 114:1329, 1959.234 S. Klein. Rev. Mod. Phys., 71:1501, 1999.235 KLOE Collaboration: F. Ambrosino et al.

Phys. Lett. B, 636:173, 2006.236 KLOE Collaboration: F. Ambrosino et al.

Phys. Lett. B, 642:315, 2006.

Page 949: Yorikiyo Nagashima - Archive

924 References

237 KLOE Collaboration: F. Bossi et al. Riv.Nuovo Cim., 031(10):531, 2008. arX-iv:0811.1929v2 [hep-ex].

238 KLOE Collaboration: G. D’Ambrosio andG. Isidori. J. High Energy Phys., 12:011,2006. arXiv:hep-ex/0610034v1.

239 M. Kobayashi and K. Maskawa. Prog.Theor. Phys., 49:652, 1973.

240 R.D. Kohaupt and G. Voss. Annu. Rev.Nucl. Part. Sci., 33:67, 1983.

241 F.W.J. Koks and J. Van Klinken. Nucl.Phys. A, 272:61, 1976.

242 E. Komatsu et al. Astrophys. J. Suppl.,180:330, 2009.

243 C. Kraus et al. Eur. Phys. J. C, 40:447,2005.

244 KTeV Collaboration: A. Alavi-Harati et al.Phys. Rev. D, 67:012005, 2003.

245 KTeV Collaboration: A. Alavi-Harati et al.Phys. Rev. D, 70:079904, 2004.

246 H. Kühnelt and O. Nachtmann. Nucl.Phys. B, 88:41, 1975.

247 J. Lach and L. Pondrom. Annu. Rev. Nucl.Sci., 29:203, 1979.

248 W.E. Lamb and R.C. Retherford, Jr. Phys.Rev., 72:241, 1947.

249 S.K. Lamoreaux. Phys. Rev. Lett., 78:5,1997.

250 D.L. Landau and I.J. Pomeranchuk.Dokl. Akad. Nauk. SSSR, 92:535, 735,1953. English transl.: The Collected Pa-pers of L.D. Landau, D. ter Haar, Perga-mon Press, Oxford 1965.

251 L.R. Landau. In Proc. LP-HEP Conf.Geneva, July 1991, S. Hegarty, K. Potterand E. Quercigh, editors, World Scientif-ic, 1992.

252 P. Langacker. Phys. Rep., 72:185, 1981.253 P. Langacker and M. Luo. Phys. Rev. D,

44:817, 1991.254 Lawrence Radiation Laboratory. Intro-

duction to the Detection of Nuclear Par-ticles in a Bubble Chamber. The EalingPress, Cambridge, MA, 1964.

255 T.D. Lee. Particle Physics and Introductionto Field Theory. Harwood Academic Pub-lishers, Chur, Switzerland, revised andupdated 1st edition, 1988.

256 T.D. Lee et al. Phys. Rev., 75:905, 1949.257 T.D. Lee and C.S. Wu. Annu. Rev. Nucl.

Sci., 16:511, 1966.258 T.D. Lee and C.N. Yang. Phys. Rev.,

104:254, 1956.

259 Y.K. Lee et al. Phys. Rev. Lett., 10:253,1963.

260 H. Lehmann et al. Nuovo Cimento, 6:319,1957.

261 W.R. Leo. Techniques for Nuclear andParticle Physics Experiments. Springer,Berlin, 1987.

262 LEP Electroweak working group. Phys.Rep., 427:257–454, 2006.

263 H. Leutwyler and M. Roos. Z. Phys. C,25:91, 1984.

264 S.J. Lindenbaum and L.C.L. Yuan. Phys.Rev., 111:1380, 1958.

265 J. Litt and R. Meunier. Annu. Rev. Nucl.Part. Sci., 23:1, 1973.

266 M.S. Livingston and J.P. Blewett. ParticleAccelerators. McGraw-Hill, New York,1962.

267 P.C. Macq et al. Phys. Rev., 112:2061,1958.

268 B.C. Maglic et al. Phys. Rev. Lett., 7:178,1961.

269 L. Maiani, editor. The Second DaphnePhysics Handbook: CP and CPT Violationin Neutral Kaon Decays. INFN, Frascati,1995.

270 Z. Maki. Prog. Theor. Phys., 31:331, 1964.271 Z. Maki et al. Prog. Theor. Phys., 23:1174,

1960.272 W.A.W. Mehlhop et al. Phys. Rev.,

172:1613, 1968.273 W. Meissner and R. Ochsenfeld. Natur-

wissenschaften, 21:787, 1933.274 L. Michel. Proc. Phys. Soc. A, 63:514,

1950.275 A.B. Migdal. Phys. Rev., 103:1811, 1956.276 J.P. Miller et al. Rep. Prog. Phys., 70:795,

2007.277 V.A. Miransky et al. Phys. Lett. B, 221:177,

1989.278 A. Misaki. Nucl. Phys. B – Proc. Suppl.,

33:192, 1993.279 P.J. Mohr and B.N. Taylor. Rev. Mod.

Phys., 77:1, 2005.280 M. Morita. Beta Decay and Muon Cap-

ture. W.A. Benjamin, New York, 1973.281 Muon (g � 2) Collaboration: G.W. Ben-

nett et al. Phys. Rev. D, 73:072003, 2006.282 NA48: A. Lai et al. Eur. Phys. J. C, 22:231,

2001.283 O. Nachtmann. Elementary Particle

Physics. Springer, Berlin, 1989.284 H. Nagaoka. Philos. Mag., 7:445, 1904.

Page 950: Yorikiyo Nagashima - Archive

References 925

285 Y. Nagashima. Nucl. Phys. A, 478:265c,1988.

286 T. Nakano and K. Nishijima. Prog. Theor.Phys., 10:581, 1953.

287 Y. Nambu. A ‘superconductor’ modelof elementary particles and its con-sequences, Talk at Purdue (1960). InT. Eguchi and K. Nishijima, editors,Broken Symmetries, Selected Papers of Y.Nambu. World Scientific, Singapore,1995.

288 Y. Nambu. Proc. Int. Conf. on Symmetriesand Quark Models, R. Chand, editor,Gordon and Breach,1970, p. 269

289 Y. Nambu and G. Jona-Lasinio. Phys.Rev., 122:345, 1961.

290 Y. Nambu and G. Jona-Lasinio. Phys.Rev., 124:246, 1961.

291 B. Naroska. Phys. Rep., 148:67, 1987.292 NASA. http://map.gsfc.nasa.gov/

media/080998/index.html. (accessedMarch 2010).

293 S.H. Neddermeyer and C.D. Anderson.Phys. Rev., 51:884, 1937.

294 Y. Ne’eman. Nucl. Phys., 26:222, 1961.295 New Muon Collaboration (NMC): M.

Arneodo et al. Nucl. Phys. B, 483:3, 1997.296 K. Nishijima. The Field Theory (in

Japanese) Kinokuniya Shoten, Shinjuku,Tokyo, 1987.

297 K. Niu et al. Prog. Theor. Phys., 46:1644,1971.

298 M. Oda. Cosmic Rays. Shokabo, Tokyo,1972. In Japanese.

299 T. Ohshima. Prog. Theor. Phys. Suppl.,119:5, 93, 1995.

300 S. Okubo. Prog. Theor. Phys., 27:949,1962.

301 S. Okubo. Phys. Lett., 5:165, 1963.302 L.B. Okun and I.Ya. Pomeranchuk. Sovi-

et Phys. – JETP, 3:307, 1956.303 S. Olariu and I.I. Popescu. Rev. Mod.

Phys., 57:339, 1985.304 L. O’Raifeartaigh. The Dawning of Gauge

Theory. Princeton University Press,Princeton, NJ, 1997.

305 J. Orear et al. Phys. Rev, 102:1676, 1956.306 N. Osakabe et al. Phys. Rev A, 34:815,

1986.307 J.P. Ostriker. Proc. Natl. Acad. Sci. USA,

74:1767, 1977.308 E.W. Otten and C. Weinheimer. Rep.

Prog. Phys., 71:086201, 2008.

309 Ö. Özdemir et al. J. Geophys. Res.,100:2193, 1995.

310 Pais, A. Phys. Rev., 86:663, 1952.311 Particle Data Group: C. Amsler et al.

Review of particle physics. Phys. Lett. B,667:1, 2008.

312 Particle Data Group: D.E. Grrom et al.Review of particle physics. Eur. Phys. J.C, 15:1, 2000.

313 Particle Data Group: R.M. Barnett et al.Review of particle physics. Phys. Rev. D,54:1, 1996.

314 R. Peccei. The Strong CP Problem., in:CP Violation, Advaced Series on Direc-tions in High Energy Physics, Vol. 3, C.Jarlskog, editor, World Scientific, p. 503,1989.

315 R. Peccei and H.R. Quinn. Phys. Rev. D,16:1791, 1977.

316 C. Pellegrini. Annu. Rev. Nucl. Sci., 22:1,1972.

317 M.L. Perl. Annu. Rev. Nucl. Part. Sci.,30:299, 1980.

318 M.L Perl et al. Phys. Rev. Lett., 35:1489,1975.

319 S. Perlmutter. Phys. Today, April:53,2003.

320 S. Perlmutter et al. Astrophys. J., 517:565,1999.

321 I. Peruzzi et al. Phys. Rev. Lett., 37:569,1976.

322 M.E. Peskin and D.V. Schroeder. AnIntroduction to Quantum Field Theory.Westview Press, Boulder, CO, 1995.

323 PLUTO Collaboration: C. Berger et al. Z.Phys. C, 1:343, 1979.

324 H.D. Politzer. Phys. Rev. Lett., 30:1346,1973.

325 I.Ya. Pomeranchuk. Soviet Phys. – JETP,3:306, 1956.

326 C.Y. Prescott et al. Phys. Lett. B, 77:347,1978.

327 C.Y. Prescott et al. Phys. Lett. B, 84:524,1979.

328 PSI (Paul Scherrer Institut). http://abe.web.psi.ch/accelerators/ringcyc.php(accessed March 2010).

329 C. Rebbi. Phys. Rep. C, 12:1, 1974.330 F. Reines and C.L. Cowan, Jr. Phys. Rev.,

92:830, 1953.331 A.G. Riess et al. Astronom. J., 116:1009,

1998.

Page 951: Yorikiyo Nagashima - Archive

926 References

332 G.D. Rochester and C.C. Butler. Nature,160:855, 1947.

333 M. Roos and A. Sirlin. Nucl. Phys. B,29:296, 1971.

334 M.E. Rose. Elementary Theory of An-gular Momentum. John Wiley & Sons,New York, 1957. Republished by Dover,Mineola, NY, 1995.

335 A. M. Rossi et al. Nucl. Phys. B, 84:269,1975.

336 B. Rossi. High Energy Particles. Prentice-Hall, Englewood Cliffs, NJ, 1952.

337 V.C. Rubin and W.K. Ford. Astrophys. J.,159:379, 1970.

338 L.H. Ryder. Quantum Field Theory. Cam-bridge University Press, Cambridge,2nd edition, 1996.

339 S. Sakata. Prog. Theor. Phys., 16:686,1956.

340 M. Sakuda. Butsuri, 48:87, 1993.341 M. Sakuda et al. Nucl. Instrum. Methods

A, 311:57, 1992.342 J.J. Sakurai. Nuovo Cimento, 7:649, 1958.343 J.R. Sanford. Annu. Rev. Nucl. Part. Sci.,

26:151, 1976.344 A. Schenk. Nucl. Phys. B, 363:97, 1991.345 F. Schmidt. CORSIKA Shower Im-

ages. http://www.ast.leeds.ac.uk/~fs/showerimages.html (accessed March2010), 2008.

346 A.L. Sessoms et al. Nucl. Instrum. Meth-ods, 161:371, 1979.

347 P.S. Shawhan. PhD thesis, university ofChicago, December 1999.

348 SLAC. http://www.slac.stanford.edu/spires/topcites/matrix.shtml. (AccessedMarch 2010).

349 SNO Collaboration: Q.R. Ahmad et al.Phys. Rev. Lett., 87:071301, 2001.

350 T. Soldner et al. Phys. Lett. B, 581:49,2004.

351 D.N. Spergel et al. Astrophys. J. Suppl.,148:175, 2003.

352 J. Steinberger. Rev. Mod. Phys., 61:533,1989.

353 G. Sterman. QCD and jets. Lecture givenat TASI 2004. arXiv:hep-ph/0412013v1.

354 M.L. Stevenson et al. Phys. Rev., 125:687,1962.

355 E.C.G. Sudarshan and R.E. Marshak. InProc. Padua Conf. on Mesons and RecentlyDiscovered Particles, 1957.

356 E.C.G. Sudarshan and R.E. Marshak.Phys. Rev., 109:1860, 1958.

357 Super-Kamiokande Collaboration: Y.Fukuda et al. Phys. Rev. Lett., 81:1562,1998.

358 L. Susskind. Nuovo Cimento C, 12:457,1970.

359 G. ’t Hooft. Nucl. Phys. B, 33:173, 1971.360 G. ’t Hooft. Nucl. Phys. B, 35:167, 1971.361 G. ’t Hooft. Phys. Rev. Lett, 37:8, 1976.362 T2K Collaboration. Letter of Intent: Neu-

trino Oscillation Experiment at JHF.http://neutrino.kek.jp/jhfnu/loi/loi.v2.030528.pdf (accessed March 2010), 2003.

363 TASSO Collaboration: R. Brandelik et al.Phys. Lett. B, 97:453, 1980.

364 R.E. Taylor. Rev. Mod. Phys., 63:573,1991.

365 R. Tenchini and C. Verzegnassi. ThePhysics of the Z and W Bosons. WorldScientific, Singapore, 2008.

366 J.J. Thomson. Philos. Mag., 44:293, 1897.367 A. Tonomura et al. Phys. Rev Lett.,

48:1443, 1982.368 A. Tonomura et al. Phys. Rev. Lett.,

56:792–795, 1986.369 R.D. Tripp. Annu. Rev. Nucl. Sci., 15:325,

1965.370 Y.S. Tsai. Phys. Rev. D, 12:3533, 1975.371 Y.S. Tsai. Slac-pub-2105. http://www.

slac.stanford.edu/cgi-wrap/getdoc/slac-pub-2105.pdf(accessedMarch2010),1978.

372 UA1 Collaboration: G. Arnison et al.Phys. Lett. B, 122:103, 1983.

373 UA1 Collaboration: G. Arnison et al.Phys. Lett. B, 129:273, 1983.

374 UA1 Collaboration: G. Arnison et al.Phys. Lett. B, 126:398, 1983.

375 UA2 Collaboration: P. Bagnaia et al.Phys. Lett. B, 129:130, 1983.

376 R. Utiyama. Phys. Rev., 101:1597, 1956.377 A.M. van Oudenaarden et al. Nature,

391:768, 1998.378 G. Veneziano. Nuovo Cimento A, 57:190,

1968.379 D. Venos et al. Nucl. Instrum. Methods

Phys. Res. A, 560:352, 2006.380 VENUS. J. Phys. Soc. Japan, 56:3763,

1987.381 C.F. von Weizsäcker. Z. Phys. A, 88:612,

1934.382 S. Weinberg. Phys. Rev., 140:B516, 1965.

Page 952: Yorikiyo Nagashima - Archive

References 927

383 S. Weinberg. Phys. Rev. Lett., 19:1264,1967.

384 W.I. Weisberger. Phys. Rev., 143:1302,1966.

385 V. Weisskopf and E. Wigner. Z. Phys. A,63:54, 1930.

386 V. Weisskopf and E. Wigner. Z. Phys. A,65:18, 1930.

387 H. Weyl. Math. Z., 2:384, 1918.388 H. Weyl and K. Preuss. Aksd. Wiss., 433,

1918.389 R. Wigmans. Annu. Rev. Nucl. Part. Sci.,

41:133, 1991.390 E.J. Williams. Phys. Rev., 45:729, 1934.391 E.T. Worchester. PhD thesis, university

of Chicago, December 2007.392 C.S. Wu et al. Phys. Rev., 105:1413, 1957.

393 T.T. Wu and C.N. Yang. Phys. Rev. Lett.,13:380, 1964.

394 T.M. Yan. Annu. Rev. Nucl. Sci., 26:199,1976.

395 T. Yanagida. Prog. Theor. Phys. B, 135:66,1978.

396 C.N. Yang and R.N. Mills. Phys. Rev.,96:191, 1954.

397 D. Yennie et al. Annals Phys., 13:379,1961.

398 C. Zemach. Phys. Rev., 133:B1201, 1964.399 ZEUS Collaboration: S. Chekanov et al.

Eur. Phys. J. C, 21:443, 2001.400 G. Zweig. CERN Reports 8182/TH.401,

8149/TH.412, 1964.401 F. Zwicky. Helv. Phys. Acta, 6:110, 1933.

Page 953: Yorikiyo Nagashima - Archive
Page 954: Yorikiyo Nagashima - Archive

929

Index

aabsorption coefficient 379absorption length 401accelerating expansion 798, 801action 89action principle 89active transformation 38adiabatic approximation 131, 135adiabatic condition 131, 574adjoint 69adjoint representation 755

– of the SU(N) 842AGS 443Aharonov–Bohm effect 748, 751allowed transition 557–558αs 528Anderson–Neddermeyer 448Anderson, C.D. 66angular correlation 567angular momentum barrier 482, 484anomalous magnetic moment, see also g-2

77, 167, 193, 197, 214, 685antigravity 126, 800antiunitary transformation 242Argand diagram 472asymmetric collider 366asymmetry AL 643asymptotic condition

– field 131– strong 133– weak 133

asymptotic freedom 24, 524, 762, 764attenuation length 396axial gauge 357, 361axion 762, 798

bB-factory 366b quark 539

Baker–Campbell–Hausdorff formula225, 252

bare coupling constant 206bare mass 209barn 379baryon 18

– decuplet 848– multiplets 510– number 467, 507– octet 848– wave function 850

baryon–lepton symmetry 536BCS theory 789Bell–Steinberger relation 637, 658beta beam 708beta decay 555beta function (QCD) 763–764 function (QCD) 763betatron oscillation 375Bethe–Bloch formula 384–385Bethe–Heitler formula 183, 394Bevatron 443, 503–504Bjorken scaling 690–692, 699Bjorken x 723black sphere model 491Boltzmann constant 306booster 374bootstrap model 679, 729Born

– approximation 300– series 296

Bose–Einstein condensate 750, 769Bose–Einstein statistics 105boson 105

– weak 258bottomonium 528, 539box diagram 625bra and ket 267Breit frame 700, 702

Elementary Particle Physics, Volume 1: Quantum Field Theory and Particles. Yorikiyo NagashimaCopyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 978-3-527-40962-4

Page 955: Yorikiyo Nagashima - Archive

930 Index

Breit interaction 552Breit–Wigner resonance formula 16bremsstrahlung 181, 198, 394

– inner 451– soft 183, 199

brick wall frame 702bubble chamber 403, 466, 503, 710

cC, see also charge conjugation

– parity– of 2-particle system 256– of photon 255

– transformation 252, 635– violation 257, 564

C, P, T transformation 839Cabibbo angle 597Cabibbo rotation 522, 596–597, 605Callan–Gross relation 702, 714calorimeter 388, 395, 417

– e/h compensation 401– energy resolution 419– sampling- 418

canonical– conjugate 90– formalism 267– method 282– variables 90, 100

Cartesian coordinates 733, 736, 744cascade shower 396Casimir effect 122, 149Casimir operator 57, 512, 843, 853

– SU(3) 545causality 120, 273

– condition 273CDHS 711center of mass frame 136, 228central limit theorem 429CESR 539Chadwick 447channel

– s-, t-, u- 140charge

– independence 261, 454–455– symmetry 261, 266, 714

charge conjugation 79, 252, 265– Dirac matrix 818– of the Dirac field 254– of the vector field 255– test 256

charged current 556charm

– -ed baryon 538

– -ed meson 537– -ed particle 501– lifetime of -ed particle 536– quantum number 536– quark 525– spectroscopy 537

charmonium 527–529Cherenkov

– counter 406– differential type 407– ring image 407– threshold type 407– total absorption type 417

– radiation 390, 392Chew–Frautschi plot 496–497, 532chi square distribution 438–439chiral

– anomaly 517– eigenstate 82, 171–172– invariance 585, 772, 781– operator 82– perturbation theory 762– representation 818– transformation 586– world 773

chirality 62, 564, 567– conservation 84, 172

Christoffel connection 736, 739, 741, 747CKM matrix see also KM matrix 600–601CKM mixing 554Clebsch–Gordan coefficient 458, 512, 835cloud chamber 66, 403, 448, 502CM frame 228Cockroft–Walton 370coherence length 770coherent scattering 627collider detector 422collision length 379–380color 519, 895

– charge 6, 20, 24, 509, 539–540– counting 541– degrees of freedom 18, 519– evidence of 520– exchange force 541–542, 544, 547– factor 545– force 4, 6, 540– independence 542– singlet 533, 544– SU(3) 541

Compton scattering 176– path integral method 348, 353

confinement 6, 18, 24, 521, 529, 533, 761conjugate 268

Page 956: Yorikiyo Nagashima - Archive

Index 931

conjugate momentum 223connected diagrams 353connection 734–736, 738, 743–744conserved charge 96conserved isospin current 263constituent mass 524continuity equation 95, 99contravariant vector 42cooling 708Cooper pair 5, 19, 750, 767, 781correlation 429

– function 300, 304–305– n-point 315– two-point 300, 318–319

cosmic microwave background radiation 799cosmological constant 799Cosmotron 443, 504Coulomb gauge 115, 357, 361, 813Coulomb potential

– instantaneous 814covariance 48, 429

– of Dirac equation 71covariant

– formalism 47– gauge 115, 119– vector 42

covariant derivative131, 147, 253, 732, 741–742, 744, 756

covariant differential operator, see covariantderivative 732

CP– constraint 621– eigenstate 618– invariance 632– operator 618– 2π,3π of 619– transformation 618– transformation property 632

CP violation 617– �T 655– asymmetry AL 640, 643– asymmetry AS 644– decay parameter �0 (denoted as �), �0

635–636, 640, 644–650, 659– decay parameter �2, �f 646– decay parameter η+–, η00 640

– value of 642– direct 645, 651– discovery of 630–631– experiments on 664– formalism 632– in the SM model 675– indirect 645, 651–652

– mixed interference 645– mixing parameter �, �S, �L 635, 644

– evaluation of 649– phase convention 645

– models of 673– milli-strong 674– milliweak 675– superweak 675

CPLEAR 622, 664CPT

– constraint 621– formalism 632– invariance 632

– foundation of 662– theorem 253, 258– transformation 258, 840

CPT violation 656– decay parameter B0, B2 661– decay parameter y 655, 657– mixing parameter δ

635–639, 644, 656–662critical density 800critical energy 395, 397cross section 379crossing 140, 174–175

– symmetry 139, 494Curie temperature 774–775current–current coupling 556current mass 524curvature 741, 751, 757–758

– average 740– local 740– of space-time 739, 741, 799– of the universe 799– total 740

curvilinear coordinate 734, 742–743CVC hypothesis 567, 584, 592, 615

– experimental test of 587cyclotron 371

– frequency 217, 371, 574– isochronous 372– ring 372– sector 372– synchro- 372

dD meson 535–536Dalitz decay 451Dalitz plot 478

– for the decay 486– of K meson 483– of ω 486– symmetrical 482

Page 957: Yorikiyo Nagashima - Archive

932 Index

DAPHNE 622dark energy 19, 122, 798dark matter 19, 797de Broglie wave 25de Broglie wavelength 11decay amplitude 635decomposition

– of product representations 855decuplet 510Dee 371deep inelastic scattering 679

– by electron 687– by neutrino 712– neutrino data 717– QCD correction 725–726

defect 777deficit angle 740, 750ΔI = 1/2 rule 591, 647

– violating parameter ω 646Δm 623, 626

– sign of 623, 629ΔS = –ΔQ components 659ΔS = ΔQ rule 589, 625Δ function 120, 154delta functional 357Δ(1232) 457, 469Δ1 function 120ΔF, see Feynman propagatordensity effect 386, 400depletion layer 414derivative coupling 487DESY 539determinant

– of the hermitian operator 291, 293– of the Klein–Gordon operator 315

DGLAP evolution formula 728diamagnetism

– perfect 767, 769dichromatic beam 708differential geometry 735–736, 742diffraction pattern 493, 682diffraction scattering 627dimension

– fifth 747– of irreducible representation 851, 856

dimensional regularization 195, 825Dirac

– condition for γ matrices 69– equation 62– equation, another derivation 811– equation, Lorentz-covariant 69– equation, plane wave solution 71, 817– γ matrix 69, 817

– matrix 59–60, 817– spinor

– bilinear form 78direct product 47DIS, see deep inelastic scatteringdisconnected diagram 347domain 775double beta decay 87Drell–Yan process 520, 721

– scaling 723dressed 212drift chamber 412, 415, 417dual tensor 49duality 500

ee–e+ to μ– μ+ reaction 174e-μ scattering 167

– path integral method 351– polarized- 171

e-μ-τ puzzle, see generation puzzleeffective action 305effective range 490

– approximation 490eigen mass in matter 628eightfold way 508Einstein equation 741Einstein–Podolsky–Rosen, see EPR

paradox 663Einstein’s contraction 41elastic scattering

– cross section 491electric dipole moment 88

– of the neutron 249electron

– magnetic moment 75– self-energy 349–350, 830

electron holography 751electron volt 364electrostatic spectrometer 574electroweak theory 770emittance 368emulsion 415, 448energy-momentum tensor 96, 225

– complex field 99– Dirac field 100– electromagnetic field 99

energy vector 426entropy 307, 309ep elastic scattering 683EPR paradox 663equivalent photon 699

– cross section of 701

Page 958: Yorikiyo Nagashima - Archive

Index 933

– energy of 701– polarization of 699

error matrix 435, 438η meson 454Euclidean 736

– action 291– space 291, 733– time 303

Euler constant 196Euler–Lagrange equation 90, 92, 223exchange force

– isospin- 455– spin 545–546

extra dimension 793, 796

fFaddeev and Popov’s gauge fixing

method 355Faddeev–Popov ghost 354, 359family problem 606Fermi

– coupling constant 240, 556– coupling constant, value of 582– golden rule 127– interaction 556–557, 559– theory 553, 555

– limitation of 610– transition 559–561

Fermi–Dirac statistics 105, 113Fermi–Yang model 462fermion 104

– loop 205, 349– mass 781

ferromagnetism 774Feynman

– gauge 116– parameter 194– path integral 274, 277– propagator 147, 151, 318– propagator in the Coulomb gauge 815– rules 186, 299– x 722–723, 725

Feynman diagram– connected- 353– disconnected 347, 354– external line 187– internal line 187–188– irreducible 212– loop diagram 188–189– reducible 212– vertex 187

final state interaction 478fine structure 65, 546

fine structure constant 23, 35, 215Fitch–Cronin 630flavor 18

– conservation 554, 579, 598– eigenstate 789– quark 764– SU(3) 221, 509

Fock space 112forbidden transition 558force

– carrier 18–20– color exchange 457, 489– fundamental 20, 733– isospin exchange 457, 487, 489– long range 24– range 23– spin exchange 487, 489– tensor 489

form factor 681– electromagnetic 683– examples of 681– of KL3 595– of the nucleon 683, 686

formation zone 391, 393, 399four-fermion interaction 555, 557, 559, 611FP ghost 354Fresnel integral 277Friedman equation 799Froissart bound 493ft value 559functional 279

– calculus 279– derivative 280, 286, 304– integral 279– Taylor expansion 281– vacuum 305

fundamental representation 755

gg-2 193, 197, 214g-factor 50, 65, 77G operation 453G parity 453γ matrix

– trace 173Gamow–Teller transition 240, 559, 561gas electron multiplier, see GEM 413gauge

– condition 93– field 732– fixing 116, 321, 356– invariance 205, 321, 613, 750, 825

– chiral 587

Page 959: Yorikiyo Nagashima - Archive

934 Index

– principle 731– sector 783– symmetry 114– theory 729

– non-Abelian 754– transformation

– global 731– local 252, 732– 2nd kind 732– 1st kind 731

gauging 732, 745, 747, 755Gaussian distribution 430Gaussian integral 277, 284, 314Gauss’s law 796Geiger–Marsden 446Geiger mode 411Gell-Mann matrix 506, 542, 761, 844Gell-Mann–Okubo mass formula

513–514, 551GEM 413general relativity 733, 741, 744–745generating functional 300, 305, 315, 318–319

– for QED 342, 345– of interacting fields 320– of the connected Green’s function 306– of the Dirac field 332– of the electromagnetic field 323– of the scalar field 312– Taylor expansion 319

generation 18– how many 606– puzzle 605, 786–788

generator 754– of O(3) 40– of SU(N) 262– of the gauge transformation 252– of the group 805, 841–842– of the Lorentz group 53–55, 70– of the Poincaré group 57– of the rotation 54, 264– of the transformation 224– of the translation 227

geodesic line 733–734, 736–737, 743geometrical interpretation of gravity 733GIM mechanism 536, 598, 601Ginzburg–Landau parameter 770Ginzburg–Landau theory 767, 769, 781, 789Glashow–Weinberg–Salam theory 782global symmetry 252gluon 20, 24, 536, 541

– existence of 720– momentum 720

Goldstone boson 777–778, 780

Goldstone theorem 778goodness of fit 433Gordon equality 193–194, 684, 817grand unified theory 7, 221, 788–789, 791Grassmann number 324

– complex conjugate 324– δ function 328– derivative 325– Gaussian integral 328, 330– integral 326

gravitino 795graviton 20, 747, 795Green’s function 132, 151–152, 272, 297, 303

– advanced 132, 153– Feynman’s 312– retarded 132, 153

group 803– Abelian 803– adjoint representation 842, 846– conjugate representation 843– connectedness 805, 842– direct product 843, 847– fundamental representation 842, 846– generators 805– irreducible 512– irreducible representation 846– Lie group 804– nonabelian 803– O(3) 805– O(N) 804– rank of the 843– reducible 512, 843– representation 804– SL(2,C) 809– structure constant 805– structure constants of SU(3) 844– subgroup 804– U(N) 804, 841

GUT, see also grand unified theory 792– scale 792

GWS theory, see Glashow–Weinberg–Salamtheory 782

hhadron 4, 18, 167hadronic shower 401hadronization 532Hamiltonian formalism 90, 100heavy electron 579Heisenberg

– equation of motion 100–101, 127– picture 101, 127–128– uncertainty principle 69

Page 960: Yorikiyo Nagashima - Archive

Index 935

helicity 58, 228– conservation 161, 174–175– eigenstate 61, 172– of electron 569–570– of neutrino 576– operator 61– suppression 566

Helmholtz free energy 306, 309– free particle 308– harmonic oscillator 308

hierarchy problem 790, 792, 796Higgs

– field 779– mechanism 6, 778–782, 788– particle 6, 19, 782, 788– sector 783

Hilbert space 267hologram 752horn 706Hubble 801Huygens’ principle 273hypercharge 509, 771

– of νL, eL, eR 772– of Higgs 782– of leptons 772– of quarks 772– operator 772

hyperfine structure 64, 546hypernucleus 483hyperspace 734, 738–739, 751

iI spin see Isospinideal mixing 517ILC 377imaginary time 302impact parameter 445impulse approximation 693

– conditions for 696in state 132, 334incoming state, see in state 132index of refraction 628inertial frame 743infinite momentum frame 696inflation 801inflation model 792infrared

– compensation 199– divergence 184, 192, 194, 197, 200

instanton 762interacting fields 131, 147interaction picture 127interferogram 752

internal degrees of freedom 17invariant mass 468ionization chamber 403, 411ionization loss 381ionization mode 410IOO symmetry 505isoscalar target 714isospin 18

– conservation 454– current 264, 584–585, 755– degree of freedom 457– multiplets 260– of antiparticles 265– operator 754– symmetry 260

jJ/ψ 526, 535

– isospin and G-parity 527– mass and quantum number 527– width 533

Jacobi identity 759jet 533

– gluon 720

kK factor 724K meson 481Kabir asymmetry 655Kaluza–Klein theory 747, 795kaon decay 589

– time development of 638KK tower 797Kl3 decay 592

– form factor 595Klein–Gordon equation 25, 32Klein–Nishina formula 181KLOE 664klystron 371KM matrix 597, 600–601, 617, 676, 791Kobayashi–Maskawa matrix, see KM matrix

597, 675KTeV 666Kurie plot 573

lLAB frame, see laboratory framelaboratory frame 136Lagrangian

– complex field 93– density 92– Dirac field 93– electromagnetic field 93– formalism 89, 223

Page 961: Yorikiyo Nagashima - Archive

936 Index

– real field 93– SU(2) 760– SU(2)× U(1) 772

Lamb shift 213ΛQCD 766Landau distribution 385Landé g-factor 217Larmor frequency 217lattice QCD 761law of inertia 733, 744law of large numbers 428leading log approximation 762–763least squares method 438LEP 369, 374lepton 18

– flavor 579, 605– flavor conservation 606– number 554– number conservation 571

Levi-Civita tensor 47, 819LHC 364, 368, 374Lie algebra 804, 841

– O(3) 40– Poincaré group 57

Lie group 804– compact 805– noncompact 805

light guide 405limited streamer mode 411linac 370linear collider 377linear operator 290Livingston chart 11LLA, see leading log approximation 762long-range force 5longitudinal gluon 764Lorentz

– boost 42, 53, 70– condition 322– contraction 694– gauge 115, 357, 359– generators for spin 1/2 particle 70– invariance 37, 193, 221, 227– operator 53, 70– operator for spin 1/2 particle 809– scalars 45– tensor 47– vector 41, 44

Lorentz invariant– flux 144– matrix element 134, 169– phase space 145

Lorentz transformation 41, 138– homogeneous 41– inhomogeneous 56– orthochronous 46, 120– proper 46, 119–120

LPM effect 399–400LS force 546–547luminosity 367

– measurement 427

mMAC-E 574–575magnetic flux 750, 767

– line 753– quantization 749–750, 753– tube 767

magnetic moment– of the baryon 525– of the proton 524

magnetic spectrometer 574magnon 778Majorana

– phases 791Majorana particle 84Maki–Nakagawa–Sakata matrix 791Mandelstam variable 138, 140, 171, 366mass

– formula 547, 549, 551– of the gauge boson 779– of W, Z 783

mass absorption coefficient 396mass hierarchy 786–788mass matrix

619–620, 625–626, 628, 633, 636, 859– eigenvalue 621– symmetry constraint 634

mass operator 619matter field 27matter particle 18maximum likelihood method 433Meissner effect 5, 750, 753, 767–770, 781meson 18

– mixing 516– octet 846

meson exchange– scalar 164

metric 734, 736metric tensor 41–42, 734, 736, 747Michel parameter 583, 604minimum interaction 732minimum ionization loss 385Minkowski

– metric 41

Page 962: Yorikiyo Nagashima - Archive

Index 937

– space 41, 291, 303, 734, 744– time 303

mirror operation 236missing energy 427Molière unit 399momentum transfer 680monopole 792Mott scattering 157, 184, 192, 213

– path integral method 345multiple scattering 389multiwire proportional chamber 411, 415muon 448

– decay 579, 584MWPC, see multiwire proportional chamber

nn photon emission 204n-point function 318N product, see normal orderedNA48 666Nambu 775Nambu–Goldstone boson 778narrow band neutrino beam 707natural unit 33naturalness 792–793NBB, see narrow band neutrino beamnegative energy 26, 64negative pressure 126, 801negative probability 26neutral current 598, 773

– flavor-changing 601neutralino 795, 798neutrino 444, 448, 571

– atmospheric 790– beams 705– detector 709– Dirac neutrino 87– experiments how to 705– factory 708– helicity 566– helicity measurement 576– Majorana neutrino 86–87, 791– mass measurement 572, 576– oscillation 18, 35, 62, 554, 606, 789– reactor neutrino 571– right-handed 86, 789– second neutrino 578– solar 790– Weyl neutrino 87

neutron 447– bottle 250

Nishijima–Gell-Mann’s law502, 504, 509, 590, 771, 782

Noether’s theorem 95, 222, 225, 731non-Abelian 754

– field strength 758– gauge theory 100, 195, 753–760

nonet 507, 510, 522normal distribution 430normal ordered 149November Revolution 525ν–q scattering 715nucleon

– parton structure of 680number operator 102

oO(3) 37, 40octet 507, 509–510

– baryon 509– meson 507

off shell 118, 150Ω 514, 519ω meson 485on shell 150, 207order parameter 769, 775out state 132, 334outgoing state, see out state 132OZI rule 518, 535–536

pP, see also parity 559

– violation 559, 562P-type interaction 560, 564, 567pair creation 396parallel transport

734–736, 738, 740, 742, 756–757parity 77

– intrinsic 233– of π+ 449– of π0 450

– tensor method 451– of the photon 234– operation 77– transformation 42, 46, 62, 233, 839– two-body system 235

parity violation 239, 559–560, 562– in the weak interaction 238

partial wave– analysis 462, 472– cross section 466– expansion 229– parity 464, 470– S-matrix unitarity 464– scattering matrix 468

particle democracy 679

Page 963: Yorikiyo Nagashima - Archive

938 Index

particle identification 419partition function 306parton 680

– distribution function 697, 725– kinematics 694– luminosity 723– model 693

– in hadron collisions 721– variable 695–696

passive transformation 38path integral 267, 311

– determinant 315– expression 276– harmonic oscillator 285– matrix method 287– normalization 315– operator method 290– scattering matrix 294

– calculation of 297Pauli 444

– matrix 61Pauli–Dirac

– representation 62, 818Pauli–Lubanski operator 57Pauli–Villars regularization 195penetration length 770penguin diagram 677PEP 529phase convention 622, 640phase rotation 708phase shift 464phase stability 376phase transformation 252, 738, 749, 754phase transition 775φ factory 622photomultiplier 405photon

– longitudinal 117, 699– scalar 117, 700– self-energy 349, 825– soft 199– transverse 699– unphysical 117– virtual 118

π meson 449π0 to 2γ 520pion cloud 584pion decay 564

– constant 520, 565Planck

– energy 23– scale 4, 795

plasma frequency 123

plasmon 779plastic scintillator 405Poincaré transformation 746Poincaré group 56Poisson distribution 204, 431polarization

– in scattering 465– of nucleon 465– sum 117–119, 178–179, 822

Pomeranchuk– theorem 492, 498– trajectory 498

Pomeron 498positron 448positronium 527, 529principle of detailed balance 247principle of equivalence 743Proca equation 93propagator, see also Feynman propagator

127, 272– free particle 282

proper time 43proportional mode 411proton decay 792proton synchrotron, see PS 370PS 443

qQ-value 526QCD 7, 20, 760–764

– background 368–369– coupling constant 529, 763–764– Lagrangian 761– lattice 769– potential 528

QED 4, 167– path integral method 340– test 213

quantization– of fermion 104– of fields 89, 105–106, 111– of harmonic oscillator 102– of the Dirac field 112– of the electromagnetic field 114

– in the Coulomb gauge 814quantum anomaly 791, 795quantum chromodynamics, see QCDquantum electrodynamics, see QEDquark

– counting 605– electric charge 18– electric charge by DIS data 718– mass 19, 550

Page 964: Yorikiyo Nagashima - Archive

Index 939

– model 501, 509, 787, 841– potentials of 24, 527– sea- 716– valence 716

quintessence 802

rR-parity 795R-value 520–521radiation length 390, 395, 401radiative correction 191, 527range 388rapidity 51, 72, 725

– pseudo- 53reduction formula

– Dirac fields 337– electromagnetic fields 337– LSZ 320, 333, 340– scalar fields 333

regeneration 626Regge

– pole 495– trajectory 494, 527, 529

Reggeon 499relativistic

– cross section 141– kinematics 136– normalization 142– wave equation 25

renormalizability 784renormalizable theory 611renormalization 191, 208

– constant 133– group equation 765– mass- 208– scattering amplitude 211

renormalized coupling constant 206resonance 466

– amplitude 472– formula 469– structure 458

RF 370� meson 475Ricci’s curvature tensor 99, 741RICH 407Riemann’s tensor 741Rosenbluth formula 685rotation

– group 37– invariance

– 2-body- 227– π N scattering 465

– matrix 174, 229, 833, 835

– orthogonal condition 229– operator 229

rotational symmetry 737running coupling constant

204, 208, 762–765, 792Rutherford 446

– formula 446– model 3– scattering 135, 389, 680– scattering formula 136

sS-matrix, see also scattering matrix 134

– in the path integral 297S-matrixsagitta 416Sakata model 462, 505scalar product 44scaling 690, 723

– Bjorken 691– limit 691, 698, 713– variables 696

scattering length 490scattering matrix 127

– path integral method 351– Tif 231

Schrödinger– equation 271, 278– picture 101, 128

scintillation counter 404screening 764

– anti 764sea of negative-energy 66sea quark 716second quantization 112see-saw mechanism 791short-range force 5singlet 507SLC 369, 371, 377solid state detector 413space-time translation 223spectator 368spectroscopy 443

– hadron 457spherical harmonic function 235, 244, 833spin

– of π+ 449– of π0 450

– tensor method 451– operator 63

spin–orbit coupling 551spin-statistics 119spinor 70

– composite of 807

Page 965: Yorikiyo Nagashima - Archive

940 Index

– iso 263– representation 803

spontaneous symmetry breaking5, 750, 774–778, 792

SPS 374spurion 592SSD 413, 415standard deviation 429standard model 7, 731

– beyond the 789–799static approximation 488statistical mechanics 306strange particle 501

– decay of 589strangeness 501–502

– eigenstate 618– oscillation 622

string model 500– of hadron 532–533

strong force 4strong CP problem 762, 798structure constant 754, 842structure function 40, 688, 699–700

– definition of 689– isoscalar target 714– ν 712– ν 712– of neutrino 712– of point particle 689– physical interpretation of 699– scaling limit 698

SU(2) 505, 615, 805– symmetry 501– tensor 510

SU(2) × U(1) 783SU(2) × U(1) gauge theory 770, 774SU(3) 501, 505, 841, 844, 857

– color 510– flavor 510– rank of 506– singlet 511, 513– structure constant 506– symmetry 516– tensor 510

SU(3)color × SU(2)L × U(1)γ 731SU(4) 537, 857SU(5) 792, 857SU(6) 522, 857

– wave function 523SU(6) × O(3) 525SU(N) 841

Sudakov– double logarithm 198– form factor 204

sum over history 274sum rule 719superbeam 708superconductivity 781superconductor

– type I 767, 770– type II 767, 770

supercurrent 750, 769supergravity 747, 793superstring 747superstring model 795supersymmetry 747, 792superweak model 675superweak phase 637, 647, 650

– parameter ε 649SUSY, see supersymmetry 792SusyGUT 793symmetry

– continuous 222– discrete 233– family 606– horizontal 606, 791– internal 221, 251–252– list of 222– spacetime symmetry 221

synchrotron 373–374– oscillation 376– radiation 369, 376

tT, see also time reversal 839

– constraint 621– invariance 560, 595, 632– transformation 9, 839– violation

– asymmetry AT 655– evidence of 653– parameter �T 655

T product 134, 151τ–θ puzzle 483, 562τ lepton

– coupling type of 604– decay 601– mass and spin of 602– production of 601

technicolor 793temporal gauge 357, 361tensor analysis 851tensor method 485test of T and CPT invariance 653

Page 966: Yorikiyo Nagashima - Archive

Index 941

Tevatron 366Thomas precession 217–218Thomson model 445Thomson scattering 381Thomson, J.J. 444–445three-jet 721time evolution operator 130, 133, 272time projection chamber 420time reversal 46, 240

– of S-matrix 242– of scalar and vector fields 244– of the Dirac field 246– operation 241– quantum mechanics 240– test 247– Wigner’s 241

TOF 420Tomonaga–Schwinger–Feynman 193, 213top condensate 19total reflection 46TPC see time projection chambertraces of the γ matrices 818transition operator 134transition radiation 387, 390, 392

– counter 408translational operator 268translational symmetry 737tree approximation 191triangular diagram 520Troizk–Mainz experiment 573two point function 152type I supernova 798

uU(1) 251U, V spin 844, 846ultra cold neutron 249ultraviolet divergence 192–193unification 792unification of forces 8unified theory 611, 770, 794–795

– a step toward 608unitarity 764

– for kaon decay 620, 637– of scattering matrix 231– of the CKM matrix 601

unitary gauge 780unitary group

– special unitary group 262– SU(N) 262

unitary limit 469–470, 610unitary spin [see flavor SU(3)] 505universality 582, 596, 601, 604, 608Upsilon 539

vV–A interaction

240, 564, 566–567, 579, 583, 595, 608V particles 502vacuum

– energy 121–122, 348, 776, 801– fluctuation 69, 121, 123– has negative pressure 126– polarization 213, 762, 764– properties of 126

vacuum expectation value 777, 781, 784vacuum to vacuum transition 349valence quark 716variance 429vector

– axial 233– polar 233

Veneziano model 500, 795vertex 148, 151, 187

– correction 191, 213– diagram 198– function 127, 147

virtual particle 150

wW

– longitudinal polarization 613Ward–Takahashi identity 210Watson–Sommerfeld transformation 494wavelength shifter 404WBB see wide band neutrino beamweak 18

– charge 20, 553– current 553, 556– doublet 554– eigenstate 596, 789– force 4–5– interaction 553– magnetic force 553– magnetism 588

weak isospin– doublet 554– independence 615– triplet 554

weight diagram 508, 846Weinberg angle 773Weinberg–Salam model 609Weisskopf–Wigner approximation 620, 859Weizsäcker–Williams

– approximation 699– formula 704–705

Weyl 729– equation 61– ordering 275

Page 967: Yorikiyo Nagashima - Archive

942 Index

– particle 58, 62, 85, 172– representation 62, 82, 818– spinor 61

Wick– rotation 291, 302–303, 307– theorem 319

wide band neutrino beam 707Wigner–Eckart theorem 458, 513Wu–Yang’s phase 647

yYang and Mills 730Young diagram 512, 854Yukawa

– interaction 131, 162, 783– potential 24, 488

zzero-point energy 103, 121–122, 126ZEUS 727