
Heisenberg, Models, and the Rise of Matrix Mechanics

By Edward MacKinnon*

Werner Heisenberg's 1925 paper "Über quantentheoretische Umdeutung kinematischer und mechanischer Beziehungen" is one of the pivotal papers in the development of modern physics. It is also one of the most perplexing. The novel formulas, which effectively initiated matrix mechanics, are simply presented as if they were intuitively obvious. They were, in fact, counterintuitive. Heisenberg's new methodology was ostensibly justified by the doctrine that scientific formulations must be restricted to observable quantities. Yet his success really hinged on the skillful use of a model of the atom embodying virtual processes which are in principle unobservable. Most vexing of all, at least for those trying to understand the development of quantum theory, is the absence of any indication of the process that led Heisenberg to his formulation of quantum theory. The present paper attempts to fill a part of this gap by explaining the way Heisenberg came to introduce his distinctively new methods for quantum calculations. Before entering into the details of this development I will outline the features that play a distinctive role in the present interpretation.

Heisenberg's earliest papers on atomic physics were essentially detailed calculations which relied heavily on models of the atom, particularly the core (or Rumpf) model. A variety of factors led Heisenberg to modify his position on the reliability of this model, whether considered as a realistic representation of atomic structure or simply as a tool for calculation. No matter how the model was adapted there were some problems, notably the helium atom and the hydrogen molecule, for which the model never yielded correct results.

In addition to computational difficulties Heisenberg gradually became aware of problems of a more conceptual character. Niels Bohr, with whom Heisenberg worked, inculcated a type of physicalistic reasoning that led Heisenberg to make a more careful distinction between the physical and mathematical aspects of a problem. Another strong influence was Wolfgang Pauli's insistence that atomic models could not be considered realistic representations of atoms.

Under these internal and external influences Heisenberg gradually developed a more critical awareness of the role of models and a greater willingness to experiment with alternative models of the atom. The semi-classical formulas that served to explain the phenomenon of anomalous dispersion effectively employed a virtual oscillator model of the atom. Heisenberg used this model as a tool to treat the problem of the polarization of resonance fluorescence and also to revise his earlier treatment of the anomalous Zeeman effect. When these attempts led to some success Heisenberg attempted to use the virtual oscillator model as a conceptual tool for the redevelopment of quantum theory. That this model actually did serve as a basis for fabricating quantum mechanics I will demonstrate by a detailed reconstruction of the argumentation which, I believe, Heisenberg actually followed in developing his pivotal paper. I will suggest reasons why Heisenberg suppressed his reliance on the virtual oscillator model and advocated instead a doctrine of reliance on observables. In the concluding section I will indicate which aspects of Heisenberg's method were carried over into the matrix formulation of quantum mechanics.

*Philosophy Department, California State University, Hayward, California 94542.

Two guiding principles, which played an organizing role in my interpretation, should be made explicit. The first is the necessity of preserving a distinction between a physical account and a mathematical formulation. Though the two may function in tandem, they are in different conceptual frameworks and are governed by different rules of entailment. Some philosophers of science have recently attempted to bring out the interpretative significance of this distinction in different ways.1 A somewhat similar distinction came to play a basic role in Heisenberg's thought through the contrast between the mathematical approach to physics he had assimilated at Munich and Göttingen and the stress on physicalistic reasoning he acquired under Bohr's influence.

1An excellent survey of the difficulties which induced many philosophers of science to abandon syntactical models of scientific explanation, where this distinction is not operative, in favor of semantical models may be found in Frederick Suppe, "The Search for Philosophical Understanding of Scientific Theories," in F. Suppe, ed., The Structure of Scientific Theories (Urbana, 1974), pp. 1-241. A model of scientific explanation using this distinction will be developed in detail in my not yet completed volume, Scientific Explanation and Atomic Physics. Partial and preliminary versions of this model of scientific explanation have already been presented in "Ontic Commitments of Quantum Mechanics," Boston Studies in the Philosophy of Science, 9 (1969/1972), 103-156; "Theoretical Entities and Metatheories," Studies in the History and Philosophy of Science, 3 (1972), 105-117; Truth and Expression: The Hecker Lectures, 1968 (New York, 1971), chaps. ii, iii; and The Problem of Scientific Realism (New York, 1972; reissued in 1974), esp. pp. 38-71.

A second and related principle concerns the role of models in scientific explanation. To make some sense out of Heisenberg's development of quantum mechanics it is necessary to distinguish between iconic and conceptual models. An iconic model is the representation of some relatively unknown type of entity in terms of a different but more familiar one. Billiard balls have represented molecules; stretchable rotating tubes have represented the ether. For purposes of calculation the nucleus has been represented variously as a liquid drop, a localized gas, a cloudy crystal ball, a shell structure analogous to that of the atom, and combinations of these relatively simple models. Such visualizable models often play a heuristic role in scientific discovery and supply a foundation for the application and extension of known laws.

Conceptual models are more difficult to isolate, because they are impossible to avoid. The language we use reflects and shapes our understanding of the world we live in. Our fundamental categorizations sort the objects of experience into things of different types. Rules of semantic compatibility, few of which are explicit, govern the attribution of properties, powers, and activities to different types of things and processes. This conceptualization, implicit in the language used to describe and report the reality we experience, supplies a basis for a type of entailment that is different from the formal, or rule-governed, inferences proper to logic. Thus, from the statement that something is an X we may infer that it has Y or does Z when these conclusions are entailed by the concept of being an X. A conceptual model serves to cluster and organize such entailments. Yet, a conceptual model cannot be the type of isolated unit that an iconic model is. Even such a relatively particular representation as a description of the structure and working of some mechanism relies on other linguistic factors such as the sortal terms that specify types of things, the hierarchical ordering of categories these terms presuppose, and the meanings of the descriptive terms employed, meanings which generally presuppose a familiarity with paradigm cases. Though contemporary philosophical analysis has done much to clarify the conceptual entailments proper to key ordinary language terms, the analytic techniques developed have not yet been systematically extended to scientific languages.2

This distinction between two different types of models and the roles they play is helpful in understanding Heisenberg's switch from the core model, originally interpreted as a realistic though inadequate model of the atom, to the virtual oscillator model, which had a functional rather than a representational role. After using the virtual oscillator model to develop quantum mechanics, Heisenberg, who had become acutely aware of the nonrepresentational character of the models he used, tried to redevelop the theory so that it would not explicitly depend on any model. Paradoxically, it was this final rejection of any reliance on iconic models as realistic representations of the atom that brought to the forefront the problem of interpreting conceptual models of atoms and the significance of the language used in reporting and describing atomic phenomena. This is a problem I hope to discuss elsewhere.

2The ideas on the role of models in scientific reasoning used here stem from Mary B. Hesse, Models and Analogies in Science (London, 1963); from Wilfrid Sellars, "Scientific Realism or Irenic Instrumentalism," in Boston Studies in the Philosophy of Science (New York, 1965), 2, 171-204; and from my article, "A Reinterpretation of Harré's Copernican Revolution," Philosophy of Science, 42 (1975), 67-79, where the difference between conceptual and iconic models is explained. Some of the epistemological and linguistic problems involved in using ordinary language analysis methods in scientific contexts have been discussed in my "Language, Speech, and Speech-Acts," Philosophy and Phenomenological Research, 34 (1973), 224-238.

In this paper I will give a somewhat detailed treatment of Heisenberg's work on the polarization of resonant fluorescent radiation. This physical problem played a decisive role in the transition in Heisenberg's view of models, and it has not been treated in any of the standard historical accounts.

1. HEISENBERG'S EARLY WORK

Prior to his paper on quantum mechanics in 1925, Heisenberg wrote at least sixteen technical papers. These can be classified under four general headings. The first is hydrodynamics, the topic of his first published paper and of the dissertation he wrote under A. Sommerfeld.3 The other three are problem areas that were then current in atomic physics: the anomalous Zeeman effect, models of molecules, and the scattering of light from atoms. I will consider only the last three and focus primarily on the difficulties and inconsistencies that Heisenberg experienced, primarily as a result of his reliance on traditional atomic models.

In 1896 P. Zeeman discovered that spectral lines are split into three components in a magnetic field, one polarized parallel to the magnetic field and two perpendicular to it. H. A. Lorentz quickly worked out an explanation for this in terms of an atomic model. This model is of interest here because later adaptations of it supplied the basis for the virtual oscillator model. In Lorentz' model each electron is considered to be quasi-elastically bound in the atom so that any displacement of it from equilibrium induces a force, proportional to the displacement, which returns it to its rest position through damped oscillations. Sommerfeld and P. Debye independently worked out quantum explanations of the normal Zeeman effect, which reproduced Lorentz' results. The anomalous Zeeman effect, the complex splitting that multiplet spectral lines undergo in the presence of a magnetic field, proved a more formidable problem. The work done on this by Sommerfeld, A. Landé, and others up to 1921 has been treated in detail by P. Forman4 and need not be discussed here.
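
As a hedged illustration of the Lorentz model described above, the quasi-elastic binding can be written as a damped harmonic oscillator. The symbols used here (displacement r, force constant k, damping rate gamma, natural frequency omega_0, field B, SI units) are notation assumed for this sketch and are not taken from the paper.

```latex
% Quasi-elastically bound electron (charge -e, mass m): a restoring force
% proportional to the displacement, plus a damping term.
m\,\ddot{\mathbf{r}} = -k\,\mathbf{r} - m\gamma\,\dot{\mathbf{r}},
\qquad \omega_0 = \sqrt{k/m}.

% In an external magnetic field B the Lorentz force is added:
m\,\ddot{\mathbf{r}} = -k\,\mathbf{r} - m\gamma\,\dot{\mathbf{r}}
                       - e\,\dot{\mathbf{r}} \times \mathbf{B}.

% For weak damping and a weak field the oscillation frequencies are
\omega = \omega_0 \quad \text{(motion along } \mathbf{B}\text{)},
\qquad
\omega \approx \omega_0 \pm \frac{eB}{2m} \quad \text{(motion perpendicular to } \mathbf{B}\text{)}.
```

The three frequencies correspond to the three components of the normal Zeeman pattern: one polarized parallel to the field and two perpendicular to it.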

Around December 1920, Sommerfeld assigned the anomalous Zeeman effect as a problem to Heisenberg, then a first year student at the University of Munich. Heisenberg tackled this by adapting the core model that Sommerfeld had introduced as a special version of the general Bohr model. In this model the atom is represented as consisting of a core, the nucleus of charge Z together with the inner Z - 1 electrons, and an optical electron (the Leuchtelektron) in an encompassing orbit outside the core. In Sommerfeld's model the optical electron, which is responsible for most of the spectral and chemical properties of the atom, can have a magnetic interaction with the core to produce a splitting of energy levels.

Heisenberg adapted this model by assigning the core an average angular momentum of ½ (in units of angular momentum) and assigning the optical electron the value n − ½.5 The value of the exchange energy linking the core and the optical electron depends on whether the two angular momenta are parallel or anti-parallel. By computing the quantized projections of the angular momenta of the core and optical electron on the total angular momentum of the atom and of the total angular momentum on the external magnetic field, Heisenberg was able to develop formulas for the anomalous magnetic splitting of doublet and triplet lines.

The continued work of Landé and Heisenberg on this problem led to the well-known Landé vector model.6 Though this model fit the spectroscopic data it did not yet supply an adequate theoretical account of why this splitting occurred.7 Furthermore, the use of this model in treating particular problems led to assignments of quantum numbers different from those given by Bohr's Aufbauprinzip. This principle "built" complex atoms by beginning with the hydrogenic structure and adding further electrons one by one on the assumption that the latest addition did not change the quantum numbers already assigned to the earlier electrons. This inconsistency became evident in a joint paper Landé and Heisenberg wrote on the anomalous Zeeman effect for the neon ion and atom.8 Their theory fit the spectroscopic data Paschen had obtained only if they assigned the core of the neon atom and ion the angular momentum values of 1 ± ½ and 2 ± ½ respectively, where the + or − depends on whether the core and orbital angular momenta are parallel or anti-parallel. According to Bohr's Aufbauprinzip these angular momenta should be assigned the values 1 and 2 respectively.9

3W. Heisenberg, "Die absoluten Dimensionen der Kármánschen Wirbelbewegung," Phys. Zeit., 23 (1922), 363-366.

4For a general history of this problem to 1921 see Paul Forman, "Alfred Landé and the Anomalous Zeeman Effect, 1919-1921," Historical Studies in the Physical Sciences, 2 (1970), 153-261.

5W. Heisenberg, "Quantentheorie der Linienstruktur und die anomalen Zeemaneffekte," Zs. f. Phys., 8 (1922), 273-299. In this article Heisenberg did not have the type of vector diagram that later became known as the Landé model, but he did calculate the angles between different angular momentum vectors, which is the essential point.

6A. Landé, "Über den anomalen Zeemaneffekt," Zs. f. Phys., 5 (1921), 231-241; 7 (1921), 398-405. In his original treatment Landé gave three different g-factors for singlets, doublets, and triplets, expressing each as a function of the principal quantum number, n, and the azimuthal quantum number, k, which he interpreted as the quantum number for the total angular momentum. The half-integral quantum number which Heisenberg attributed to the core (and which was later associated with electron spin) entered Landé's account as a projection of the azimuthal quantum number on the direction established by an external magnetic field. Landé's later account is given in "Termstruktur und Zeemaneffekt der Multipletts," Zs. f. Phys., 15 (1923), 189-205. Here Landé had a unified form for the g-factor coupled to the basic vector diagrams but still lacked a theoretical explanation for the splitting. Landé had actually been the first to introduce vector diagrams for the addition of the quantized angular momenta. The context in which this was done is explained by Forman, op. cit. (note 4), pp. 169-71.

Heisenberg recognized this lack of consistency as a sure sign of deeper difficulties in the underlying theory: "It became clearer than ever that an explanation of the anomalous Zeeman effect must bring about profound modifications in our quantum theoretical conceptions. This is especially notable in the failure of the Aufbauprinzip with respect to the statistical weights of the atom core and electrons."10

However, all that Heisenberg was then able to accomplish was a modification of the way the coupling of the core and optical electron is expressed, a modification that preserved the Landé g-values without formally contradicting the Aufbauprinzip.

Heisenberg's early papers on molecular models were the result of tasks assigned him when he became Max Born's assistant at Göttingen. In their first joint paper Born and Heisenberg developed a formula for the phase relationship between the two electrons in a helium atom or a hydrogen molecule. Bohr had made the assumption that parahelium has two electrons in planes 120° apart in orbits both characterized by quantum numbers n = 1, k = 1, while orthohelium has one electron in an n = 1, k = 1 orbit and one in an n = 2, k = 1 orbit. He and H. A. Kramers kept trying to calculate energy levels proper to these and other configurations, but they could never get the correct value for the ionization potential.11

Born and Heisenberg expressed the belief that there must be exact and, in principle, determinable phase relations between the two orbiting electrons.12 However, the attempt to calculate such exact relations seemed impossibly difficult, so they settled for some approximate calculations adapted from techniques astronomers had developed for planetary orbits. Though this approach was not completely successful, there was enough of a correspondence between their computations and the data to encourage further work. In the concluding section of their paper, they noted a difficulty of a different sort, one of principle rather than computational detail.

According to P. Ehrenfest's adiabatic principle, one of the guiding principles of the old quantum theory, it should be possible to explain the binding of two hydrogen atoms in a hydrogen molecule by beginning with two atoms that are infinitely far apart and then letting them approach each other directly and slowly until they are finally strongly coupled. In principle it should be possible to extend this analysis from the ideal adiabatic case to the more realistic non-adiabatic case. Yet, Born and Heisenberg were not able to make this extension in a way that would conserve energy and momentum. Once again, a model that supplied a convenient basis for calculations led to fundamental inconsistencies when interpreted as a realistic picture of the atom.

7In discussing his g-factor modification of the effects of a magnetic field on energy states Landé stated: "We can not assign a theoretical basis for this modification." Op. cit. (note 6, 2nd reference), p. 398.

8A. Landé and W. Heisenberg, "Termstruktur der Multipletts höherer Stufe," Zs. f. Phys., 25 (1924), 279-286.

9N. Bohr, "The Structure of the Atom and the Physical and Chemical Properties of the Elements," in The Theory of Spectra and Atomic Constitution, 2nd ed. (Cambridge, 1924), a revised translation of a paper originally published in Danish in 1921. Here Bohr attempted to give a detailed picture of the probable arrangement of electrons in each atom. The neon atom that Landé and Heisenberg treated is represented as having two inner electrons in 1_1 orbits (where the notation n_k gives the principal and azimuthal quantum numbers), surrounded by a shell of four electrons in 2_1 orbits. This, in turn, is surrounded by four electrons in 2_2 orbits arranged so that their angular momenta have a slightly disturbed tetrahedral symmetry. As a noble gas this forms a particularly stable core in higher atoms.

10"Seitdem das empirische Material bei den anomalen Zeemaneffekten durch Landé entsprechend den bisherigen quantentheoretischen Prinzipien systematisch geordnet und in Formeln gebracht wurde, stellte es sich immer klarer heraus, dass eine Erklärung der Erscheinungen des anomalen Zeemaneffekts tiefgreifende Änderungen in unseren quantentheoretischen Vorstellungen mit sich bringen müsse. Besonders eindrucksvoll zeigt sich dies im Versagen des Aufbauprinzips hinsichtlich der statistischen Gewichte von Atomrest und Elektron." This is from W. Heisenberg, "Über eine Abänderung der formalen Regeln der Quantentheorie beim Problem der anomalen Zeemaneffekte," Zs. f. Phys., 26 (1924), 291-307. The citation is from p. 291.

11A survey of the difficulties physicists were then encountering in attempting to treat the helium atom may be found in J. H. Van Vleck, "The Dilemma of the Helium Atom," Phys. Rev., 19 (1922), 419-423.

12M. Born and W. Heisenberg, "Über Phasenbeziehungen bei den Bohrschen Modellen von Atomen und Molekeln," Zs. f. Phys., 14 (1923), 44-55.


In a second paper Born and Heisenberg adapted the core model to calculate allowed energy levels for excited helium.13 Their formulation placed the excited electron in an encompassing orbit above the core constituted by the nucleus and the unexcited electron. Though this did not yield a quantitative agreement with the observational data, it did give the correct general form for the spectral series. In a later paper they developed a general theory which they hoped would supply a basis for incorporating and integrating the work that various researchers were doing on molecules.14 Here the fundamental role of molecular models comes through rather clearly. They need not be descriptive pictures of molecules, but they should supply a conceptual basis for the systematic interrelation of various types of calculations. In the Born-Heisenberg approach this was to be achieved through a five-step process:

(1) The molecule is treated as a rigid rotator;

(2) The vibrations of the nuclei are included and added to the rotational motions and energy;

(3) The interactions between the molecular rotations and the nuclear vibrations are added;

(4) The angular momentum of the electrons is added to the previous results;

(5) Finally, the molecule is treated as a complete mechanical system comprising nuclei and electrons.

If the molecular theory is correct it should supply a basis for explaining the macroscopic properties of molecules. Born and Heisenberg used it to make detailed calculations of the influence of the deformability of ions on optical and chemical constants.15 At this time there seems to have been little doubt in Heisenberg's mind about the utility of the core model for practical calculations. He was well aware of its inconsistencies and inadequacies. But at a time when the old quantum theory was becoming a thing of rags and patches, such difficulties were commonplace and did not constitute sufficient grounds for rejecting the model. It was the problem of dispersion and the subsequent development of a competing model that seem to have been the key factors inducing Heisenberg to develop a more critical attitude towards the core model and towards the role of models in atomic explanations. These topics, accordingly, deserve a more detailed consideration.

13M. Born and W. Heisenberg, "Elektronenbahnen im angeregten Heliumatom," Zs. f. Phys., 16 (1923), 229-243.

14M. Born and W. Heisenberg, "Zur Quantentheorie der Molekeln," Ann. d. Phys., 74 (1924), 1-31.

15M. Born and W. Heisenberg, "Über den Einfluss der Deformierbarkeit der Ionen auf optische und chemische Konstanten," Zs. f. Phys., 23 (1924), 388-410; 26 (1924), 196-204.

2. RADIATION AND VIRTUAL OSCILLATORS

During the period we are considering, 1924-1925, Heisenberg was spending half the year in Göttingen as Born's assistant and the other half in Copenhagen as a Rockefeller fellow at Bohr's Institute for Theoretical Physics. In Copenhagen he worked on different problematic aspects of the interaction of matter and radiation. The feature of this work that most concerns us is the way in which Heisenberg switched from a reliance on the core model of the atom to a functional use of the virtual oscillator model. To understand why he accepted this model as functionally adequate, used it for calculations, and yet suppressed any reliance on it in his justification of quantum mechanics, we must also consider his interactions with the two men who had the strongest influence on him at that time, Niels Bohr and Wolfgang Pauli.

Bohr's epistemological development will be the subject of a separate study.16 Here we will simply indicate some aspects of this development that relate to problems Heisenberg was considering. Bohr's ambition in his early works was to provide a descriptively accurate account of the structure of the atom and the behavior of its component parts. After 1918 he began to stress a distinction between two types of principles: realistic ones, which are essentially descriptive accounts of the reality they represent; and formal ones, whose significance stems from the role they play in scientific theories rather than from any direct correspondence with physical reality. As late as 1920, in a dispute with the philosopher Norman Campbell, Bohr insisted that while some of the principles of the quantum theory are purely formal, the basic representation of the atom must be considered realistic.17

In the next three years Bohr gradually changed his emphasis and came to interpret the principles of quantum theory as formal rather than realistic. What initially disturbed him most was that a semi-classical dispersion theory worked in a way that did not seem compatible with his model of the atom. "Dispersion" was then used to refer to the scattering from atoms of light of wave length long relative to the size of the atoms. According to the Bohr theory there should be a resonance reaction when the frequency of the incident light corresponds to the mechanical frequency of the electron's rotation. If such a resonance were observed it would constitute strong evidence for the reality of electronic orbits. It was, however, never observed.

R. Ladenburg had greater success in handling this problem by adapting the model Lorentz had introduced to explain the Zeeman effect.18 In the original Lorentz model the atom was assumed to have relatively stationary electrons, a certain fraction of which, N, are the dispersion electrons. The key assumption was that electrons disturbed by electromagnetic radiation return to equilibrium positions through damped oscillations. Ladenburg adapted this assumption to quantum physics by replacing N with the coefficients Einstein had introduced for transitions of atomic states.19

16See my "Matter Waves and Conceptual Revolutions. III. Niels Bohr Philosopher Scientist" (forthcoming).

17The debate is contained in two letters: N. R. Campbell, "Atomic Structure," Nature, 106 (1920), 408-409; and N. Bohr, "Atomic Structure," Nature, 107 (1921), 104-107.

Ladenburg had treated the dispersion electrons as oscillators, but he did not present this treatment as a model competing with the Bohr model.20 Bohr took up Ladenburg's work in a survey article published in 1923 and interpreted it in the light of his distinction between formal and realistic principles. "Our whole knowledge of the nature of radiation, which to a great extent plays a decisive role in the problems of atomic structure, of course rests solely on those phenomena, in the closer consideration of which the formal nature of the quantum theory stands out particularly clearly."21 Since the mechanism coupling radiation to atoms is not included in the quantum theory, Ladenburg's work, Bohr felt, could be accepted provided that it was interpreted as a formal rather than a realistic account.22

To take account of the observations, it must be assumed that this coupling mechanism becomes active when the atom is illuminated in such a way that the total reaction of a number of atoms is the same as that of a number of harmonic oscillators in the classical theory. The frequencies of the oscillators are equal to those of the radiation emitted by the atom in the possible processes of transition, and the number of oscillators is determined by the probability of occurrence of such processes of transition under the influence of illumination.

18R. Ladenburg, "Die quantentheoretische Deutung der Zahl der Dispersionselektronen," Zs. f. Phys., 4 (1921), 451-468. The problem of dispersion in the old quantum theory is summarized in Max Jammer, The Conceptual Development of Quantum Mechanics (New York, 1966), pp. 181-95, and in B. L. van der Waerden, ed., Sources of Quantum Mechanics (New York, 1967), pp. 9-18. This volume has a translation of Ladenburg's article, pp. 139-157.

19A. Einstein, "Zur Quantentheorie der Strahlung," Phys. Zeit., 18 (1917), 121-128, translated in van der Waerden, ibid., pp. 63-77.

20Ladenburg, op. cit. (note 18), said: "If the molecules are in equilibrium at radiation temperature T, and if the electrons are regarded as three-dimensional oscillators with three degrees of freedom ... [a derivation of eq. (2) follows]. Eq. (2) can be looked upon as the definition for the experimentally determinable quantity N, which has of course no definite meaning in quantum theory." (From van der Waerden, pp. 140, 142-143.)

21N. Bohr, "Über die Anwendung der Quantentheorie auf den Atombau," Zs. f. Phys., 13 (1923), 117-165, translated in Proceedings of the Cambridge Phil. Soc. Supplement (1924), pp. 1-42, citation from p. 38.

22Ibid., p. 39.

Though this was the introduction of the virtual oscillator model, Bohr did not then develop it beyond its application to dispersion. Nor did he treat it as a model of an individual atom. His idea was simply that in treating the interaction of radiation with matter, it is possible to replace a collection of atoms by a collection of simple harmonic oscillators. Later, after J. C. Slater introduced the idea of a virtual radiation field,23 Bohr, Kramers, and Slater wrote a paper in which the formal nature of the quantum theory played a basic role and which made extensive use of the idea of virtual oscillators as a basis for treating the interaction of radiation with matter.24

Heisenberg, as he has repeatedly testified,25 was strongly influenced by Bohr's ideas on the proper physical interpretation of scientific theories. He also worked with Kramers on the problem of dispersion. In a brief note, written after the Bohr, Kramers, and Slater paper, Kramers extended Ladenburg's results to include transitions to and from excited states as well as the ground state.26 This extension led to his general dispersion formula,

$$P = E \sum_i \frac{A_i^{a}\,\tau_i^{a}\,e^2}{4\pi^2 m\left[\left(\nu_i^{a}\right)^2 - \nu^2\right]} - E \sum_j \frac{A_j^{e}\,\tau_j^{e}\,e^2}{4\pi^2 m\left[\left(\nu_j^{e}\right)^2 - \nu^2\right]}, \tag{1}$$

23John C. Slater, "Radiation and Atoms," Nature, 113 (1924), 307. 24N. Bohr, H. A. Kramers, and J. C. Slater, "The Quantum Theory of Radiation,"

Phil. Mag., 47 (1924), 785-802.

25Thus, in an interview with T. S. Kuhn (SHQP, Interview 1, p. 4) Heisenberg said: "But the strongest impression on me at that time was that Bohr thought so differently on these problems from Sommerfeld. He never looked on the problems from the mathematical point of view, but from the physics point of view. I should say that I have learned more from Bohr than from anybody else that the new type of theoretical physics which was almost more experimental than theoretical. That is, you have to cover the experimental situation by means of concepts which fit. Later on you have to put the concepts into mathematical forms, but that is more or less a trivial process which has to be solved. But the primary thing here is that you must find the words and concepts to describe a funny situation in physics which is very difficult to understand." This interview is in the Archive for History of Quantum Physics. A detailed list of the material in this archive (Copenhagen, Philadelphia, Berkeley) may be found in T. S. Kuhn, J. L. Heilbron, P. Forman, and L. Allen, Sources for the History of Quantum Physics: An Inventory and Report (Philadelphia, 1967). Interviews from this archive will be cited as above, SHQP, followed by the folder and page numbers of the transcripts of the interviews. Microfilms of correspondence from this archive will be referred to by their classification numbers, except for those belonging to the Bohr Scientific Correspondence, which will be cited as BSC followed by the number and section of the microfilm.

26R. Ladenburg, op. cit. (note 18).


where E is the intensity of the incident electromagnetic plane wave, ν its frequency, P the induced polarization, A^a (A^e) the probability of a particular absorption (emission) per unit time, τ the decay time characterizing the same transition, and the other symbols have their usual meanings.
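As an illustrative sketch only, formula (1) can be evaluated numerically if it is read as a sum over absorption transitions minus a sum over emission transitions, each term having the form A τ e² / [4π² m (ν_t² − ν²)], where ν_t is the transition frequency. The function name, the argument layout, the choice of Gaussian units, and the sample numbers below are assumptions made for the illustration; none of them is taken from Kramers' note.

```python
import math

# Assumed Gaussian (CGS) constants, rounded for illustration.
ELECTRON_CHARGE = 4.803e-10  # esu
ELECTRON_MASS = 9.109e-28    # g


def kramers_polarization(field, nu, absorption, emission=()):
    """Evaluate formula (1) under the reading described above.

    field      -- E in formula (1)
    nu         -- frequency of the incident wave (Hz)
    absorption -- iterable of (A_times_tau, nu_transition) pairs
    emission   -- same layout, for the emission terms
    """
    prefactor = field * ELECTRON_CHARGE**2 / (4 * math.pi**2 * ELECTRON_MASS)

    def term_sum(transitions):
        return sum(weight / (nu_t**2 - nu**2) for weight, nu_t in transitions)

    return prefactor * (term_sum(absorption) - term_sum(emission))


# Hypothetical single absorption line near the sodium D frequency.
d_line = [(1.0, 5.09e14)]
below = kramers_polarization(1.0, 4.0e14, d_line)  # incident frequency below the line
above = kramers_polarization(1.0, 6.0e14, d_line)  # incident frequency above the line
```

With only an absorption term the induced polarization is positive below the resonance and negative above it, and an emission term with the same weight and frequency cancels the absorption term exactly.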

Though formula (1) worked, it had never been given a theoretical justification. Moreover, it involved such serious inconsistencies as building a quantum formula on a classical foundation and treating light as electromagnetic vibrations rather than light quanta. After the discovery of the Compton effect in 1923, the Copenhagen physicists, who had rejected Einstein's light quantum hypothesis, were finally forced to take it seriously. A. Smekal found a way to introduce light quanta into dispersion theory.27 In his view an atom irradiated with photons of frequency ν would emit scattered photons of frequency ν, ν + ν_k, and ν − ν_l, where hν_k and hν_l are energy differences between a stationary state and the states into which an atom may jump by the emission and absorption of a photon.

Kramers secured the assistance of Heisenberg and together they wrote a paper presenting a systematic treatment of atoms and radiation.28 They began their highly mathematical paper with a generalized classical expansion for the electrical moment of an atom exposed to a plane monochromatic train of light-waves of frequency ν, split the expansion into parts corresponding to coherent and incoherent scattering, and then went from a classical to a quantum representation by Born's trick of replacing differentials by differences. This new approach worked in the sense that part of the general expansion yielded Kramers' formula (1), while the other part could be interpreted as an extension of Smekal's work.

What intrigued Heisenberg was that formula (1) fit the virtual oscillator model. In effect it treated the individual atom as a doubly infinite set of virtual oscillators and, at least in this case, it seemed to work better than the Bohr model. The question of which model was more basic could not be settled by trying to determine which gave a more realistic description of atoms. Bohr had convinced Heisenberg that models should be interpreted functionally rather than realistically. In testing these competing models Heisenberg turned to a problem that, at least in retrospect, seems so obscure that no historian of quantum theory, to my knowledge, has ever presented a serious discussion of it. Yet, in Heisenberg's own development the problem of the polarization of resonant fluorescent radiation supplied the arena in which two competing models of the atom contested in a match that was a prelude to quantum mechanics.29

27H. A. Kramers, "The Law of Dispersion and Bohr's Theory of Spectra," Nature, 113 (1924), 673-676, in van der Waerden, op. cit. (note 18), pp. 177-180.

28H. A. Kramers and W. Heisenberg, "Über die Streuung von Strahlung durch Atome," Zs. f. Phys., 31 (1925), 681-708.

Fluorescence is a luminescence stimulated by radiation. It differs from phosphorescence in that it does not continue more than a minute fraction of a second after the stimulating radiation is extinguished. Figure 1 depicts a typical experimental arrangement. A source S emits radiation which is passed through a focusing lens L1 and polarized by a Nicol prism N1 before converging on a resonance tube T. In the experiments we are considering both the source and the resonance tube contain sodium vapor. The light emitted from the resonance tube at right angles to the incident ray is focused by another lens L2, passed through another Nicol prism N2, which serves as a polarization analyzer, and then passed into some apparatus, such as a photocell, which can measure the intensity of radiation.

In 1923 R. Wood and A. Ellett discovered that even a weak magnetic field induces strong polarization in the resonance radiation of

29In a letter to me of 12 July, 1974 commenting on the first draft of this article Heisenberg wrote: "I was especially glad to see that you noticed how important the paper on the polarization of fluorescent light has been for my further work on quantum mechanics. Actually in Copenhagen I felt that this paper contained the first step in which I could go beyond the views of Bohr and Kramers."


the sodium D line.30 They discussed this with Charles Darwin who suggested an explanation in terms of incident light exciting circular and linear vibrations inside the atoms involved. This suggestion was taken up by W. Hanle who noted its relation to Lorentz' theory of the Zeeman effect.31 P. D. Foote, A. E. Ruark, and F. L. Mohler showed how the classical Lorentz theory could be adapted to the anomalous Zeeman effect.32 The crucial point they brought out is that under the experimental conditions depicted, the absorption of radiation could only lead to some, rather than all, of the excited states usually found in the anomalous Zeeman effect. This point led to an immediate flurry of papers on both the theoretical and experimental aspects of the problem. Two that have a direct bearing on this discussion are a paper by G. Joos giving the relative intensities of the various spectral lines involved33 and a paper by P. Pringsheim and E. Gaviola, who separated the D1 and D2 lines and showed that only the D2 line is polarized.34

The theoretical ideas that were developed can be simply summarized with the aid of Figures 2 and 3. In the Lorentz theory of the Zeeman effect the oscillations of an electron induced by a magnetic field can be resolved into three principal components: a linear one parallel to the direction of the applied magnetic field H, and two circular vibrations rotating in opposite directions in a plane perpendicular to the direction of the magnetic field. If the situation schematized in Figure 2 applied to a gas that exhibited a normal Zeeman effect, then the resonance fluorescence would, as indicated, exhibit only linear polarization.

This simple analysis can be extended to the anomalous Zeeman effect by considering each transition to be the effect of a parallel (∥ in

30R. Wood and A. Ellett, "On the Influence of Magnetic Fields on the Polarisation of Resonance Radiation," Proc. Royal Soc., 103 (1923), 396-403.

31W. Hanle, "Über den Zeemaneffekt bei der Resonanzfluoreszenz," Die Naturwissenschaften, 11 (1923), 690-91.

32P. D. Foote, A. E. Ruark, and F. L. Mohler, "The D2 Zeeman Pattern for Resonance Radiation," Journ. of the Optical Soc. of America, 7 (1923), 415-418.

33George Joos, "Der Einfluss eines Magnetfeldes auf die Polarisation des Resonanzlichts," Phys. Zs., 25 (1924), 130-134. This is the source for Figure 3 and for the relative intensities of the Zeeman components.

34E. Gaviola and P. Pringsheim, "Über die Polarisation der Natrium-Resonanzstrahlung in magnetischen Feldern," Zs. f. Phys., 25 (1924), 367. Abbreviated references to other contemporary papers on this problem are: G. Breit, Phil. Mag., 47 (1924), 832; J. A. Eldridge, Phys. Rev., 24 (1924), 234; L. Nordheim, Zs. f. Phys., 33 (1925), 729; L. S. Ornstein and H. C. Burger, Phys. Zs., 25 (1924), 298; P. Pringsheim, Naturwiss., 12 (1924), 227, and Zs. f. Phys., 23 (1924), 324; F. Rasetti, Lincei Rend., 33 (1924), 38; and E. Fermi and F. Rasetti, Zs. f. Phys., 33 (1925), 246. General surveys of this problem may be found in Peter Pringsheim, Fluorescence and Phosphorescence (New York, 1965), pp. 63-79; and in A. C. G. Mitchell and M. W. Zemansky, Resonance Radiation and Excited Atoms (Cambridge, 1934), ch. v., esp. pp. 258-278.


Figure 2 (diagram with axes X and Z). A schematic representation of the polarization phenomenon as explained by the Lorentz theory. In the case schematized here, the incident radiation induces radiation linearly polarized in the direction of the magnetic field but does not excite either type of circular polarization.

Figure 3) oscillator for which Δm = 0 or of a perpendicular (⊥ in Figure 3) oscillator for which Δm = ±1. Figure 3 illustrates how this analysis works for the D lines of sodium. In the type of experimental situation schematized in Figure 2 only parallel polarization can be absorbed, which means that the sodium atom can be excited only to states that are related to the ground state by parallel lines. Once in an excited state the atom can decay by any allowed transition. We may consider the implications of this for the D1 and D2 lines. In the D1 case either of the two excited states can be reached by the absorption of parallel radiation. Each of the two excited states can decay by the emission of either parallel or perpendicular radiation. The intensity of each emission transition is the same. With the large number of transitions that occur in the experimental situation these equal polarizations would balance statistically so that the resulting radiation would look unpolarized. In the D2 case only the excited states with magnetic quantum numbers of ±1/2 can be reached by the absorption of parallel radiation. These excited states can return to the ground state either by radiating parallel components, which have a total relative intensity of 8, or by radiating perpendicular components, which have a total relative intensity of 2.


Figure 3. Energy states and allowed transitions of a sodium atom in a weak magnetic field. (The diagram labels the 3 2P levels with their magnetic sublevels, m = +3/2, +1/2, −1/2, −3/2 and m = +1/2, −1/2.)

Again this may be interpreted statistically to give a net polarization of P = (8 − 2)/(8 + 2) = 6/10 or 60 percent. In the experimental case in which the D lines are not resolved into D1 and D2 components, the intensity of the D2 radiation is twice that of the D1 in the stimulating radiation and four times that of D1 in the resonance radiation. Therefore, excitation by parallel components produces a total radiation intensity of 20 for the parallel components and 8 for the perpendicular components, giving a maximum fluorescent polarization of P = (20 − 8)/(20 + 8) = 12/28 or 43 percent. A more precise calculation using the Van Vleck formula for relative intensities yields the result P = 50 percent.35 This value compares well with the best observed value available in 1925 of P = 45 percent. Similar calculations can be made for geometrically different arrangements of the relations between the polarization of the incident radiation, the direction of the magnetic

35See Mitchell and Zemansky, ibid., pp. 272-276.


field, and the relation between the source, the resonance tube, and the detector.36
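The polarization arithmetic used in this argument can be checked with a short sketch. It assumes only the definition implicit in the text, P = (I_par − I_perp)/(I_par + I_perp), and the relative intensities quoted there; the function name is hypothetical, and the equal D1 intensities are entered in arbitrary units.

```python
from fractions import Fraction


def polarization(parallel, perpendicular):
    """Degree of polarization P = (I_par - I_perp) / (I_par + I_perp)."""
    return Fraction(parallel - perpendicular, parallel + perpendicular)


# Relative intensities quoted in the text.
p_d2 = polarization(8, 2)           # resolved D2 line
p_unresolved = polarization(20, 8)  # D1 and D2 not resolved
p_d1 = polarization(1, 1)           # D1: equal parallel and perpendicular parts

for label, p in [("D2", p_d2), ("D1 + D2", p_unresolved), ("D1", p_d1)]:
    print(f"{label}: P = {p} = {float(p):.0%}")
```

Exact fractions are used so that 6/10 and 12/28 reduce visibly to 3/5 and 3/7 before being rounded to the 60 and 43 percent figures quoted.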

Resonance fluorescence is the problem Heisenberg turned to while he was trying to decide which of the two competing models of the atom should be considered more basic. It is not hard to see why this problem attracted him. The model being used to interpret resonance fluorescence was, when properly interpreted, simply the virtual oscillator model. The relationship of this to the Bohr model had already been discussed by Bohr himself, who argued that the models used to explain fluorescent polarization did not contradict his theory, formally interpreted, but could be related to it through the correspondence principle.37 Yet, in these new developments the virtual oscillator model was not simply a technical formalism for handling the dispersion of radiation by matter. It also included the anomalous Zeeman effect, the problem Heisenberg had previously handled through the core model.

Heisenberg's paper on this problem differed radically from his earlier and later papers in that it contained almost no mathematics.38 It was a clear example of the type of purely physical reasoning that he claimed to have learned from Bohr, attempting to work out the physical consequences of each model for various aspects of the polarization problem. He began with some general considerations by extending Bohr's use of the correspondence principle. Here the virtual oscillator model shares some features in common with the core model. In both cases one can assume that the Zeeman effect, whether normal or not, is due to an optical electron traveling outside a core. The two differ in that the core model, which is a specialized version of Bohr's general model, relies on the notions of stationary states and of orbits that are determinable in principle, though for practical calculations one may have to rely on probabilities. The oscillation model dispenses with the notion of stationary states and uses systematic perturbations of whatever motions the electrons have, rather than the orbits themselves, as the basis for calculations. One immediate consequence of this model is a generalization of the argument used in explaining Figure 3. Selection rules

36A table of the best results available for different field and detection angles may be found in J. H. van Vleck, "Quantum Principles and Line Spectra," Bulletin of the National Research Council, No. 54 (1926), pp. 1-316, esp. p. 174. This long review article provides a good survey of the state of quantum theory just prior to the development of quantum mechanics. Judging by the date on the preface the author finished compiling this survey on 7 August 1925.

37Niels Bohr, "Zur Polarisation des Fluorescenzlichtes," Naturwiss., 12 (1924), 1115-1117.

38W. Heisenberg, "Anwendung des Korrespondenzprinzips auf die Frage nach der Polarisation des Fluoreszenzlichtes," Zs. f. Phys., 31 (1925), 617-626.


should hold for both degenerate and nondegenerate cases. Symmetry considerations indicate that parallel and perpendicular components balance in the case of doublets and triplets. Therefore these should not exhibit any polarization.

Heisenberg next considered a more complex case where the two models lead to different predictions. This is the case of a weak electrical field parallel to the magnetic field and perpendicular to the field of the polarized component. In the Bohr model, which depends on definite orbits, the orbit of the electron would have to describe a rosette path. This consideration coupled to the core model gives a qualitative basis for an explanation of the resulting polarization. The inner magnetic fields proper to the cores are randomly oriented relative to the external field and play no effective role. But the external magnetic field should perturb the rosette path of the optical electron to produce a statistically significant polarization. The virtual oscillator model, on the other hand, dispenses with any considerations of path and determines the polarization on essentially classical grounds. Classical considerations, however, lead to the conclusion that there is no polarization in this case. Though Heisenberg did not elaborate the reasons for this conclusion they can be reconstructed from the general principles he relied on in the paper. The slightly earlier Bohr paper on this topic39 had introduced the principle of spectroscopic stability to explain some apparent anomalies in polarization. When the incident radiation is unpolarized and isotropic, the resonance radiation is also unpolarized in the absence of a magnetic field. The assumption of spectroscopic stability is that, to a first approximation, the total polarization is also zero when a magnetic field is present. In this case, only the electric field should have an effect. According to classical theory, however, the frequency and polarization of an oscillator are unchanged when an electrical field is applied perpendicular to the direction of oscillation. Heisenberg left no doubt about which of the two competing conclusions he accepted: "Nevertheless we have every ground to assume that this polarization is not present and that in addition the standard virtual oscillators employed in quantum theory for radiation obey laws according to which the closest analogy

39The basic principle is usually credited to Bohr, op. cit. (note 37). However, when Bohr introduced this principle he cited discussions with Heisenberg as a basis. In an interview with Kuhn (SHQP, Interview 4, pp. 10-15) Heisenberg claimed that Bohr and Kramers violently objected to his extension of this principle and spent three days trying to talk him out of it. The disagreement continued after Heisenberg left Copenhagen. On 8 January 1925 he sent Bohr a letter answering Bohr's objections and presenting a defense of his position. The letter is microfilmed in Bohr's Scientific Correspondence (BSC 11, sect. 2).


between the classical theory and the quantum theory is preserved."40

The problem was still obscure and the reasoning used more qualitative than quantitative. Yet, from it Heisenberg concluded that the virtual oscillator model was the correct one to work with. He explained it in an interview with Kuhn:

So the whole thing was a program which one had consciously or unconsciously in one's mind. That is, how can we actually replace everywhere the orbits of the electrons by the Fourier components and thereby get into better touch with what happens? Well, that was the main idea of quantum mechanics later on. One could see, more and more clearly, that the reality were the Fourier components and not the orbits. The Fourier components were more real than the orbits and still somehow their connection was similar to their connection in classical mechanics. So one tried to look for those connections between the Fourier components which were true in classical mechanics and to see whether or not similar connections are also true in quantum mechanics if one takes, instead of the Fourier components, the real lines. So this was the whole trick.41

The article on resonance fluorescence marked a turning point in the development we are considering. From this time on Heisenberg used the virtual oscillator model as a basic tool. To understand the way he used, and eventually suppressed, this model it is necessary to consider Pauli's reaction to Heisenberg's new method of doing physics. Since their university days together in Munich and their parallel careers as assistants to Born in Göttingen and fellows in Bohr's Institute in Copenhagen, Pauli remained Heisenberg's constant correspondent and severest critic. Because Pauli's criticism served as a foil to Heisenberg's daring thrusts it is helpful to consider Pauli's attitude towards the use of atomic models in scientific explanation.

According to Heisenberg, Pauli had considered the concept of electron orbits to be horribly mystical even in his student days.42 The

40"Trotzdem haben wir allen Grund, anzunehmen, dass diese Polarisation nicht vorhanden ist, und dass vielmehr die für die Strahlung massgebenden virtuellen Oszillatoren der Quantentheorie Gesetzen gehorchen, nach denen die engste Analogie zwischen der klassischen Theorie und der Quantentheorie gewahrt bleibt." From Heisenberg, op. cit. (note 38), p. 621.

41SHQP, Interview 4, p. 15.

42W. Heisenberg, Der Teil und das Ganze: Gespräche im Umkreis der Atomphysik (Munich, 1969), p. 56, translated by A. J. Pomerans as Physics and Beyond: Encounters and Conversations (New York, 1971). Since the conversations recounted are retrospective reconstructions no precise significance can be attached to the dating. However, the basic point is one that Heisenberg has made elsewhere. Thus, in "Erinnerungen an die


grounds for this position were given in Pauli's first published paper where he rejected the idea of an electrical field at a point as meaningless. He insisted: "One should accordingly hold fast to the idea that in physics only quantities which are observable in principle should be introduced."43 Pauli was willing to accept Heisenberg's way of doing physics as long as Heisenberg treated models merely as functional tools. Thus, in a letter of 21 February 1924 Pauli wrote to Bohr criticizing the way most physicists accept the reality of electronic orbits as unproblematic. To this criticism he appended a footnote: "I do not mean Heisenberg in this [criticism]; he is more reasonable. In my opinion Heisenberg has hit upon the right direction on this point, because ... he casts doubt upon any talk about definite paths."44 In spite of this limited approbation Pauli continued to criticize Heisenberg. Some six months later Heisenberg wrote Bohr: "I have received a letter from Pauli. He has done some reflecting on intensities; but apart from this he grumbles about everyone, especially about my atomic physics."45

After Heisenberg finished his paper on the polarization of resonance fluorescence he visited Pauli in Hamburg before returning to Göttingen. On 8 January 1925 he wrote a letter to Bohr continuing their protracted argument on the polarization problem. The letter concluded with a report of Pauli's reaction to these new developments: "He declared his agreement with the 'Smekal' jumps and the fluorescent polarization, with the latter only half. He believes in the [word illegible], but not in virtual oscillators and is outraged at the 'virtualization' of physics."46

Zeit der Entwicklung der Quantenmechanik," in M. Fierz and V. Weisskopf, eds., Theoretical Physics in the Twentieth Century: A Memorial Volume to Wolfgang Pauli (New York, 1960), pp. 40-47, Heisenberg claims that Pauli was the first to seriously question the physical significance of the Bohr model.

43"Man möchte doch gern daran festhalten, in die Physik nur prinzipiell beobachtbare Grössen einzuführen." From W. Pauli, "Merkurperihelbewegung und Strahlenablenkung in Weyls Gravitationstheorie," Verh. Deut. Phys. Ges., 21 (1919), 742-750, citation from pp. 749-750. This is reproduced in R. Kronig and V. Weisskopf, eds., Collected Scientific Papers by Wolfgang Pauli (New York, 1964), 2, 1-9.

44W. Pauli to N. Bohr, 21 February 1924. This letter is in BSC 14, sect. 3. The asterisked statement at the bottom of this page is: "Heisenberg meine ich damit nicht. Der ist vernünftiger. Heisenberg hat nach meiner Ansicht gerade in diesem Punkt das Richtige getroffen, dass er ... von bestimmten Bahnen zu sprechen bezweifelt."

45W. Heisenberg to N. Bohr, 25 August 1924, BSC 11, sect. 2: "Von Pauli bekam ich einen Brief. Er hat sich etwas über Intensitäten überlegt, sonst schimpft er aber über alle und besonders über meine Atomphysik."

46W. Heisenberg to N. Bohr, 8 January 1925, in BSC 11, sect. 2: "Erklärt sich jedoch mit den 'Smekals' Sprüngen und der Fluoreszenzpolarisation einverstanden, d. h. die letztere nur halb. Er glaubt an die [word illegible], nicht aber an virtuelle Oszillatoren und schimpft über die 'Virtualisierung' der Physik."


3. THE ANOMALOUS ZEEMAN EFFECT REVISITED

The differences between Heisenberg's method of doing physics and Pauli's came to a focus on the problem of the anomalous Zeeman effect. The core model had served as the basis for the earlier work Heisenberg had done on this problem. Before dispensing with this model, or at least the fixed orbit aspect of it, Heisenberg reconsidered his earlier work. By this time, however, the problem area had changed significantly due to the work of Pauli.

In 1923 Pauli, who was then working in Copenhagen, began an intensive study of the anomalous Zeeman effect, trying to generalize the Landé-Heisenberg results while dispensing with any reliance on models. Basically what he did was to treat the established results as empirical generalizations from observational data, rather than as conclusions dependent on any particular model.47 For a multiplicity i (where 2i gives the number of spectral lines), the inner quantum number j can have the values

$$j = k + i - \tfrac{1}{2},\; k + i - \tfrac{3}{2},\; \ldots,\; k - \left(i - \tfrac{1}{2}\right).$$

In the presence of a magnetic field the energy associated with each line of this multiplicity is E = mgoh, where o = eH/4πmc (the m in this denominator being the electron mass) and the quantum number m has the 2j − 1 values: j − 1, j − 2, ..., −(j − 1). These m quantum numbers may be split into two components, one associated with the orbital angular momentum, m_1, while the other, μ, is not given a physical interpretation but is simply assigned appropriate mathematical values. Thus,

$$m_1 = 0, \pm 1, \pm 2, \ldots, \pm(k - 1), \qquad m = m_1 + \mu,$$

where μ = ±1/2 for doublets and μ = 0, ±1 for triplets.

Then the established results can be expressed in terms of one simple formula:

$$E/oh = m_1 + 2\mu. \tag{2}$$

In this formula, the factor of 2 gives the effect of the anomalous magnetic moment associated with the core (and later with the spin). But the formula relies on quantum numbers and rules of interpretation rather than models.
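A small enumeration shows how formula (2) works as a pure bookkeeping rule. The sketch below assumes the ranges just stated (m_1 = 0, ±1, ..., ±(k − 1); μ = ±1/2 for doublets and 0, ±1 for triplets); the function name and the choice k = 2 are illustrative assumptions, not taken from Pauli's paper.

```python
from fractions import Fraction


def zeeman_terms(k, mu_values):
    """List (m1, mu, m, E/oh) with m = m1 + mu and E/oh = m1 + 2*mu."""
    rows = []
    for m1 in range(-(k - 1), k):  # m1 = 0, ±1, ..., ±(k - 1)
        for mu in mu_values:
            rows.append((m1, mu, m1 + mu, m1 + 2 * mu))
    return rows


DOUBLET_MU = (Fraction(-1, 2), Fraction(1, 2))  # mu = ±1/2
TRIPLET_MU = (-1, 0, 1)                         # mu = 0, ±1

doublet_rows = zeeman_terms(2, DOUBLET_MU)
for m1, mu, m, energy in doublet_rows:
    print(f"m1={m1:+d}  mu={mu}  m={m}  E/oh={energy}")
```

Nothing in the enumeration refers to orbits or to a core; the energies follow from the quantum numbers and the factor of 2 alone, which is the point of Pauli's reformulation.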

Pauli had shown that it was possible to explain the anomalous Zeeman effect while dispensing with the core plus optical electron

47W. Pauli, "Über die Gesetzmässigkeiten des anomalen Zeemaneffektes," Zs. f. Phys., 16 (1923), 155-164.


model. He had not yet shown that this model was wrong. This he did two years later.48 The core model assumes that the multiplicity of

spectral lines results from an inner magnetic field which, in turn, is due to the averaged out effect of the electrons in the core. Since the inner electrons are closer to the nucleus they should be moving fast

enough to have a relativistic mass increase according to the formulas

$$m = \gamma m_0, \qquad \gamma = 1/\sqrt{1 - v^2/c^2},$$

where m0 is the rest mass. As Sommerfeld had shown some ten years earlier, the relativistic correction factor for an electron with principal quantum number n in an atom with atomic number Z is

$$\gamma = 1 - \alpha^2 Z^2 / 2n^2, \tag{3}$$

where α, the fine structure constant, equals approximately 1/137. This correction is significant for the innermost electrons of heavy atoms.
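To give a feel for the size of this correction, the sketch below evaluates the right-hand side of formula (3) for the innermost (n = 1) electrons of the alkali elements. The choice of n = 1, the list of elements, and the function name are assumptions made for illustration.

```python
ALPHA = 1 / 137  # approximate fine structure constant, as in the text

# Atomic numbers of the alkali elements.
ALKALI_Z = {"Li": 3, "Na": 11, "K": 19, "Rb": 37, "Cs": 55}


def correction_factor(z, n):
    """Right-hand side of formula (3): 1 - alpha^2 Z^2 / (2 n^2)."""
    return 1 - (ALPHA * z) ** 2 / (2 * n**2)


factors = {name: correction_factor(z, 1) for name, z in ALKALI_Z.items()}
for name, value in factors.items():
    print(f"{name}: Z = {ALKALI_Z[name]:2d}, factor = {value:.4f}")
```

On these assumptions the departure from unity grows from a few parts in ten thousand for lithium to about eight percent for cesium, which indicates how strongly a core contribution governed by formula (3) would vary with atomic number.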

This correction, Pauli argued, would necessarily affect any internal magnetic field. Accordingly, if the core plus electron model is correct, then the core contribution to the anomalous Zeeman effect of the successive alkali elements (all of which have doublets) should exhibit a dependence on the atomic number. Experimental results showed no such Z-dependence. Hence, either one must postulate some secret mechanism that somehow compensates for the relativistic mass increase, or one must conclude that the model is wrong. Pauli chose the latter alternative and assumed from symmetry considerations that any closed shell must have a net angular momentum of zero. If this is true, the magnetic effects attributed to the core must somehow be due to the optical electron itself. Pauli, accordingly, made the novel assumption that the electron has a classically indescribable double-valuedness. Calling this "indescribable" is not simply a repudiation of models. Pauli was postulating a new fourth degree of freedom for the electron which differed from the three classical degrees of freedom associated with the three spatial coordinates in that it admitted of only two values rather than the continuous distribution associated with the classical degrees of freedom. Rather than attempt to explain why this double-valuedness obtains Pauli simply insisted that it is impossible to describe the basis of this double-valuedness in terms of any of the classical properties attributed to the electron. He strongly opposed the idea of the electron spin as an explanation of this double-valuedness when Kronig first proposed it, for Kronig's theory seemed to rely on just such a classical description.

48W. Pauli, "Über den Einfluss der Geschwindigkeitsabhängigkeit der Elektronenmasse auf den Zeemaneffekt," Zs. f. Phys., 31 (1925), 373-385.


In early December 1924, Pauli sent Heisenberg a copy of his paper on the anomalous Zeeman effect. Heisenberg replied in a postcard on

15 December 1924 that to attribute to the electron an extra degree of

freedom was to compound a swindle.49 Within a couple of months, however, Heisenberg came to accept Pauli's idea, but gave it his own

interpretation and used it as a basis for testing rather than rejecting atomic models.50

The core model had supplied the best basis for calculating the spectral lines of the anomalous Zeeman effect. Any model that replaced or supplemented it would have to do at least as well. Heisenberg, accordingly, began with the core model and adapted Pauli's idea of the double-valuedness proper to the electron to introduce the virtual oscillator model. The old idea that the orbits of electrons can be explained in mechanical terms must, he argued, yield to the idea that some sort of nonmechanical strain interrelates the core and the electron to produce the double-valuedness. This is different from Pauli's position in that the double-valuedness is attributed to the electron's interaction with the core, rather than to the electron itself. This interpretation supplied a way to relate the new idea to the old model. The implications of this interpretation for the Zeeman effect can be considered case by case.

Doublets are due to a single electron outside the core. The treatment of hydrogen and hydrogen-like atoms had been one of the undoubted successes of the old theory. All that is needed to modify this is to assume that the electron path is single valued while its interaction with the core is double valued. The case of two outer electrons presents a more formidable problem. In spite of strenuous efforts Heisenberg and Born had been unable to work this out successfully for the simplest cases of the helium atom and the hydrogen molecule. In returning to this problem Heisenberg gave up any attempt to explain it by a mechanical model and, in Pauli's fashion, concentrated on rules for quantum numbers. If the azimuthal quantum numbers for the two electrons are k1 and k2 the appropriate quantum number i for the two electrons together should be restricted to

49Heisenberg's letters to Pauli during this period will be referred to by the numbers assigned them in SHQP microfilm 80. This letter is PLC 0017 014.

50W. Heisenberg, "Zur Quantentheorie der Multiplettstruktur und der anomalen Zeemaneffekte," Zs. f. Phys., 32 (1925), 841-860. Daniel Serwer has given a different interpretation of this paper in "Unmechanischer Zwang: Pauli, Heisenberg and Atomic Structure, 1923-25," a paper he read at the December 1973 History of Science Society Meeting in San Francisco. I wish to thank Professor Serwer for sending me a copy of this paper as well as the revised and greatly expanded version, "Unmechanischer Zwang: Pauli, Heisenberg, and the Rejection of the Mechanical Atom, 1923-1925," which is published in this volume of Historical Studies in the Physical Sciences.


the values (kt + k2) ̂ i ̂ \kt -

k2\. But the more basic question is: what

physical significance do these numbers have? In attempting to answer this question Heisenberg introduced the

new (and now standard) orbital quantum number $l = k - 1$. The

azimuthal quantum number k had originally been introduced to

specify the degree of ellipticity an orbit has. In the relativistic interpretation of doublets it was also correlated with the magnitude of the

splitting. For a given n the lower the k number the more elliptical the

orbit is and the greater the splitting is due to relativistic effects as the electron orbits the nucleus. But, Heisenberg argued, such an interpretation applied to the core model for heavy atoms led to contradictions. In atoms like iron, deep terms enter with heavy splitting. If this splitting is interpreted as characterizing highly elliptical orbits, then these orbits would penetrate the core. However, this assumption is incompatible with the systematics of the periodic table and its atomic explanation.

Heisenberg's solution to this dilemma was to abandon the idea that

the azimuthal quantum number characterizes any mechanical property of orbits, an idea that had really worked only for a single electron above a core with zero angular momentum. In place of the azimuthal quantum number Heisenberg used his new quantum number $l$, which was not interpreted as characterizing such mechanical properties of orbits as degrees of ellipticity. The problem now was how to handle electrons without giving a mechanical specification of orbital properties. The virtual oscillator model, Heisenberg argued, provides an alternative route. Since it is based upon classical physics it also

provides a clear mechanical model. But this classical mechanical

model does not supply a descriptive account of the kinematics of the

electron, since it relates to the atom only in a formal way through the

correspondence principle. Others had already shown how the virtual oscillator model leads to

selection rules. After summarizing these rules, Heisenberg turned to

the problem of calculating allowed frequencies. In the Bohr theory, allowed frequencies were determined by the energy differences between two mechanically determined orbits. The virtual oscillator

model dispenses with this sort of mechanical determination of orbits.

Yet, it must be able to supply an equivalent account, and to this end

Heisenberg attempted to adapt the work he had done with Kramers.

For light of frequency $\nu$ (which could be Slater's radiation field) the allowed atomic frequencies must have the general forms $\nu + \omega_{nj}$ or $\nu + \omega_{nj} \pm 2\omega_k$, where $\omega_k$ is the fundamental vibration and the set $\omega_{nj}$ are overtones. These frequencies, as the model requires, are simply a

spectral expansion in terms of Fourier components. The basic idea is

simple enough. Yet, when it is applied to actual atoms, especially to


the complex atoms previously treated in terms of the core model, formidable problems ensue. One should calculate the allowed frequencies for each electron and then, to handle the interaction between the core and the optical electrons, calculate all possible combinations of frequencies.

This Heisenberg was unable to do. He settled for some general observations and plausible selection rules and then concluded the article by indicating the relation between what he had done and what remained to be done. He acknowledged that his present work was

preliminary and, in some basic respects, quite unsatisfactory. Undoubtedly, electrons in atoms interacted in terms of some simple laws. But he could see no way to get at these except by using model-type pictures with a symbolic significance ("modellmässige Bilder symbolischer Bedeutung").50a

Before publishing this article Heisenberg communicated his new

ideas to Pauli. In a letter of 28 February 1925 Pauli indicated that in addition to disagreeing with the way the physicists in Copenhagen treat the problem of radiation, he had some serious objections to

Heisenberg's treatment of the Zeeman effect, objections which he would discuss in detail in his forthcoming trip to Copenhagen.51 Though the precise nature of Pauli's objections is not recorded, it is reasonable to assume that Pauli continued to object to Heisenberg's reliance on the virtual oscillator model. At this time Pauli was working on the long survey article on quantum theory that he contributed to the 1926 edition of the Handbuch der Physik, a survey article that does not include any developments after June 1925. Though Pauli included a

detailed treatment of both dispersion and the anomalous Zeeman

effect, he did not use or even mention the virtual oscillator model in either connection. He did introduce virtual oscillators or, in his terminology, "Ersatzoscillators" in discussing the polarization of resonant fluorescent radiation, especially to summarize Heisenberg's conclusions. However, he treated these oscillators as a classical tool for calculation rather than as a model of the atom.52

4. FABRICATING QUANTUM MECHANICS

Heisenberg's paper on the Zeeman effect was mailed from

Copenhagen on 10 April 1925. In the conclusion Heisenberg outlined

50a Heisenberg, ibid., p. 860.

51 W. Heisenberg to W. Pauli, 28 February 1925; in BSC 14, sect. 3.

52 W. Pauli, "Quantentheorie," in H. Geiger and K. Scheel, eds., Handbuch der Physik (Berlin, 1926), 23, 1-278. This is in Pauli's Collected Papers, op. cit. (note 43), 1, 269-548. Dispersion is treated on pp. 86-96, the anomalous Zeeman effect on pp. 220-231, and the polarization of resonant fluorescent radiation on pp. 101-108.


a new program for quantum theory. One should use the virtual oscillator model to work out all the Fourier components for the electrons in an atom and for the coupling between electrons. In the rest of this article I will attempt to trace through in detail the way Heisenberg implemented this program and developed quantum mechanics. Since this historical reconstruction involves a conjectural element, I will make the conjecture explicit before beginning the reconstruction.

The basic conjecture I am making to interpret the work Heisenberg

did in June and July of 1925 is that in the organization of his paper establishing quantum theory53 he reversed the order of his own development and suppressed the method he actually used. The published paper begins by stating the doctrine that quantum theory should be restricted to observable quantities and then gives some general laws for kinematical and dynamical relations. In the concluding half of the article this new quantum theory is applied to some

relatively simple cases, the anharmonic oscillator and the rotator.

I believe that the actual historical process was as follows. Heisenberg began by trying to implement the program he had indicated,

applying the virtual oscillator model to the hydrogen atom. When this

proved too difficult he tried the same method on the anharmonic oscillator. Though this problem was not physically significant, it was

sufficiently simple to supply a testing ground for his new method. The virtual oscillator model suggested a mathematical expansion that could be used to calculate the frequency and the energy. Heisenberg did the calculations and then checked them by an independent treatment of the same problem using Born's perturbation method rather than the virtual oscillator model. When both approaches yielded exactly the same conclusion, Heisenberg was convinced that he had made a fundamental breakthrough to a new quantum mechanics.

Heisenberg was convinced, but could he convince others? Though the virtual oscillator model had supplied a convenient basis for calculations, it was far too weak to support a new quantum theory. Heisenberg, accordingly, sought a general theory that could supply a

rational foundation for the calculations he had already made. When he thought he had found such a foundation he tested it on Pauli, the master of criticism. Pauli would certainly not accept the virtual oscillator model as a foundation for a new rational quantum mechanics. Heisenberg therefore presented Pauli with a version of his new method that did not involve any explicit dependence on the

53 W. Heisenberg, "Über quantentheoretische Umdeutung kinematischer und mechanischer Beziehungen," Zs. f. Phys., 33 (1925), 877-893, translated in van der Waerden, op. cit. (note 29), 261-276. Subsequent references to this article will cite the van der Waerden translation.


virtual oscillator model. Heisenberg also intended to get a critical reaction from Born before publishing his new ideas. Since both Born and Pauli were stressing the need to restrict quantum theory to observable quantities,54 a view that Heisenberg was beginning to share, Heisenberg accorded this Göttingen principle a key epistemological role in his theoretical formulation, though it was incompatible with his actual procedure.

This conjectural reconstruction is supported by Heisenberg's own accounts of these developments. My argument, however, will be

primarily based on a detailed reconstruction of Heisenberg's own

reasoning during this crucial period. In developing this argument, I will make extensive use of Heisenberg's correspondence to supplement a critical analysis of his article. The temptation to project back into Heisenberg's article the methods of matrix multiplication is an obstacle to understanding the methods Heisenberg actually used. He has repeatedly insisted that at the time he wrote the article he was not familiar with the mathematics of matrices.55 In one of his interviews

with T. S. Kuhn, Heisenberg indicated how the program outlined in the conclusion of his final paper on the anomalous Zeeman effect led to the mathematical methods he used in developing quantum theory.

Well, the point was that from this moment I actually did work on quantum mechanics straight away, because when I had studied the classical intensities of the Kepler ellipse, I very soon found out that it was too complicated to guess the intensities. And then I found, that was the point, that if I knew the Fourier series of, say, a coordinate $x$, I wanted also to know the Fourier series of $x^2$. And so I studied more generally the question of the connection between the Fourier series of $x$ and that of $x^2$, or that of $x$ and $y$ and that of $x$ times $y$. And that was, of course, already practically matrix mechanics. Then I went from the hydrogen back to a problem where I could do the thing by just multiplying in a simple way so that besides $x$, I only needed $x^2$ and $x^3$ and not

54 In their interviews with Kuhn both Heisenberg (SHQP, Interview 5, p. 19) and Jordan (SHQP, Interview 1, p. 30) related that the idea of restricting quantum theory to observables was a common doctrine in the Göttingen circle around Born. For Pauli's doctrine see note 43.

55 In a recent summary of this development Heisenberg claimed: "At that time I must confess I did not know what a matrix was and did not know the rules of matrix multiplication." From W. Heisenberg, "Development of Concepts in the History of Quantum Theory," in Jagdish Mehra, ed., The Physicist's Conception of Nature (Dordrecht, 1973), pp. 264-275. The citation is from p. 267. In his interview with Kuhn (SHQP, Interview 5, p. 12) he claimed that he found the noncommutative aspect of his multiplication rules the most disagreeable feature of his original paper. It was precisely this multiplication law that led Born to suggest the use of matrices.


more. The simplest example was the anharmonic oscillator and thereby I came to the anharmonic oscillator in quantum mechanics. You will find in the first paper on the anharmonic oscillator in quantum mechanics also a statement that it was (at fault) if one knew $x$ also to know $x^2$ or $x$ times $y$ and that kind of thing. So that was really the point. But the point came out from my attempts to work straight on the hydrogen atom and since these attempts had failed, I went to the more general question of what I could do about multiplication of these Fourier series.56

Because of a severe attack of hay fever Heisenberg went to Helgoland, a small, pollen-free island in the North Sea. It was there, early in June 1925, that he made his decisive breakthrough. On 5 June Heisenberg wrote to R. Kronig presenting a rather detailed account of how he was adapting the virtual oscillator model to the problem of the anharmonic oscillator.57 On 8 June he wrote a letter to Bohr which relates only that his hay fever had let up and that his work was going slowly.58 On the same day he wrote a letter to Pauli but gave no clear indication of the sort of work he was doing. Heisenberg, it seems, was willing to communicate his new approach to Kronig, then a postdoctoral fellow interested in learning from Heisenberg new directions in atomic physics. But he would not communicate his new ideas to Bohr or Pauli until they were sufficiently developed to withstand the

expected criticism. Of the two only Pauli could be expected to analyze and criticize the technical formalism. Accordingly, we will begin with the problem treated in Heisenberg's letter to Kronig and treat his

correspondence with Pauli later.

Heisenberg was searching for a problem simple enough to serve as a test for his new approach employing Fourier expansions. The easiest problem might seem to be the simple harmonic oscillator specified by the formula $\ddot{x} + \omega_0^2 x = 0$. However, this oscillator only allows radiation frequencies that are equal to the mechanical frequencies of oscillation. This was the assumption that had proved a failure in the development of dispersion theory. The simplest way to overcome this limitation is to add a correction term dependent on $x^2$, yielding the formula for an anharmonic oscillator:

56 SHQP, Interview 6, p. 6. A similar account is contained in Heisenberg's contribution to the Pauli memorial volume (note 42).

57 In an article in Fierz and Weisskopf, op. cit. (note 42), Kronig reproduces the main part of the letter and the diagram presented here as Figure 4.

58 W. Heisenberg to N. Bohr, 8 June 1925, BSC 11, sect. 2. In his correspondence with Bohr, Heisenberg did not indicate what he was doing until 31 August 1925, when he wrote saying that he had done something in quantum mechanics and that Kramers was familiar with it. It was not until 21 October 1925 that Heisenberg communicated his new ideas to Bohr, and then he discussed their physical significance rather than the mathematical formalism.


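$$\ddot{x} + \omega_0^2 x + \lambda x^2 = 0. \tag{4}$$

To get the Fourier expansion Heisenberg used one substitution:

$$x = a_0 + a_1 \cos\omega t + a_2 \cos 2\omega t + \cdots. \tag{5}$$

By expanding and reordering, he derived:

$$x^2 = \left(a_0^2 + \frac{a_1^2}{2} + \cdots\right) + 2\left(a_0 a_1 + \frac{a_1 a_2}{2} + \cdots\right)\cos\omega t + \cdots. \tag{6}$$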
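The step from equation (5) to equation (6) can be checked numerically. The following Python sketch is only an illustration, not Heisenberg's own procedure: it samples a three-term cosine series, squares it, and projects the square back onto cosines. The amplitudes 0.3, 1.0, and 0.2 and the helper names are assumptions chosen for the example. With the series cut off after $a_2$, the constant term also picks up $a_2^2/2$, one of the higher terms not written out in equation (6).

```python
import math

def cosine_coefficients(samples, max_harmonic):
    """Project an even, periodic signal sampled over one period
    onto 1, cos(t), cos(2t), ..., cos(max_harmonic * t)."""
    count = len(samples)
    coeffs = []
    for k in range(max_harmonic + 1):
        total = sum(value * math.cos(k * 2 * math.pi * j / count)
                    for j, value in enumerate(samples))
        # The constant term is a plain average; the others carry a factor 2.
        coeffs.append(total / count if k == 0 else 2 * total / count)
    return coeffs

def square_of_series(a0, a1, a2, count=64):
    """Cosine coefficients of x**2 for x = a0 + a1*cos(t) + a2*cos(2t)."""
    samples = []
    for j in range(count):
        t = 2 * math.pi * j / count
        x = a0 + a1 * math.cos(t) + a2 * math.cos(2 * t)
        samples.append(x * x)
    return cosine_coefficients(samples, 4)

# Hypothetical amplitudes, chosen only for illustration.
a0, a1, a2 = 0.3, 1.0, 0.2
c = square_of_series(a0, a1, a2)

# Constant term versus a0^2 + a1^2/2 (+ a2^2/2 from the truncated tail).
print(c[0], a0**2 + a1**2 / 2 + a2**2 / 2)
# cos(t) term versus 2*(a0*a1 + a1*a2/2).
print(c[1], 2 * (a0 * a1 + a1 * a2 / 2))
```

Each printed pair should agree to rounding error, which is the content of the first two brackets of equation (6).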

Though the anharmonic oscillator obeys a different force law than the hydrogen atom, their expansions involve terms similar enough that one might suggest the method for handling the other. In the hydrogen atom an electron at a distance $a$ from the nucleus is bound to it by a force $F = -e^2/a^2$. For the anharmonic oscillator pictured in Figure 4, the diagram Heisenberg used in his letter to Kronig, the dipole charge can vibrate back and forth along the $x$ direction. The force exerted on it by a positive charge at P would be:

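$$K = -\frac{e^2}{a^2} + \frac{e^2}{a^2 + x^2} = \frac{e^2}{a^2}\left[-1 + \frac{1}{1 + x^2/a^2}\right]. \tag{7}$$

The denominator in the last term can be expanded in two different ways. A direct Fourier expansion gives:

$$\frac{1}{1 + x^2/a^2} = b_0 + b_1 \cos\omega t + b_2 \cos 2\omega t + \cdots. \tag{8}$$

But one can also use the simple algebraic expansion

$$\frac{1}{1 + x^2/a^2} = 1 - \frac{x^2}{a^2} + \frac{x^4}{a^4} - \frac{x^6}{a^6} + \cdots. \tag{9}$$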

In equation (9) one can substitute equation (6) for $x^2$ with corresponding expressions for $x^4$, etc. If the resulting expression is to be identical with equation (8), then the coefficient of each $\cos n\omega t$ in equation (8) must be equal to its counterpart in equation (9). The substitutions yield the expressions:59

$$b_0 = 1 - \frac{1}{a^2}\left(a_0^2 + \frac{a_1^2}{2} + \cdots\right) + \frac{1}{a^4}(\cdots), \tag{10a}$$

$$b_1 = -\frac{2}{a^2}\left(a_0 a_1 + \frac{a_1 a_2}{2} + \cdots\right). \tag{10b}$$

Figure 4. An anharmonic oscillator. This diagram is from a letter Heisenberg wrote to R. Kronig on 5 June 1925. [Diagram not reproduced; its surviving labels are the $x$ axis and the point P.]

In his letter to Kronig, Heisenberg did not really explain how this calculation can be given an atomic interpretation. However, the interpretation he used, though not its justification, can easily be reconstructed. In the Lorentz, or virtual oscillator, model the key parameter is not the distance $a$, but the displacement $x$. The distance $a$ formerly specified the radius of an orbit, the type of mechanical specification Heisenberg had renounced. The distance $x$ represented a displacement from whatever equilibrium position the electron might have. In the original Lorentz model this was interpreted as a displacement from a fixed position. In the modified virtual oscillator model it could be interpreted as a displacement from a stable orbit. According to

Heisenberg's extension of the correspondence principle, the quantum correlative of the distance a from the center is the quantum number n

specifying the orbit, while the quantum analogue of oscillations about this orbit is the virtual transitions. To be more concrete, if b should

correspond to a transition, e.g., from the n to the n-1 orbit, then the coefficients that serve as an expansion of b should be interpreted as a summation of the transitions that lead from the same initial to the same final state by means of intermediate virtual transitions. Thus,

we have:

$$b_1(n,n-1) = -\frac{1}{a^2}\Big[a_0(n)\,a_1(n,n-1) + a_1(n,n-1)\,a_0(n-1) + \tfrac{1}{2}\,a_1(n-1,n-2)\,a_2(n,n-2) + \tfrac{1}{2}\,a_2(n+1,n-1)\,a_1(n+1,n)\Big] + \frac{1}{a^4}[\cdots]. \tag{11}$$

Equation (11) illustrates the basic idea of how the virtual oscillator model is adapted to quantum mechanics. But it does not provide an

intuitively clear basis for the orderly assignment of virtual transitions. This assignment is clearer if we shift from a cosine to an exponential expansion:

$$x(n,t) = \sum_{\alpha=0}^{\infty} a_\alpha e^{i\alpha\omega t}, \tag{12}$$

59 In his letter to Kronig Heisenberg did not have the factor of 1/2 in the second line of equation (11). I assume that this was a simple copying mistake. He also omitted the factor of $(1/a^2)$ in the right-hand sides of equations (15) and (16). Here, as in the next section, Heisenberg simplified the coefficients he used to highlight the significant features of the equations.


and

$$\frac{1}{1 + x^2/a^2} = \sum_{m=0}^{\infty} b_m e^{im\omega t}. \tag{13}$$

Then the same process of expansion and identification of coefficients leads to:

$$b_2 e^{2i\omega t} = -\frac{1}{a^2}\left[(a_1 e^{i\omega t})^2 + (a_0 a_2 + a_2 a_0)\,e^{2i\omega t} + \cdots\right] + \frac{1}{a^4}[\cdots] + \cdots. \tag{14}$$

What is the precise physical interpretation to be accorded a term like $(a_1 e^{i\omega t})^2$ in equation (14)? In the virtual oscillator model the term $b_2 e^{2i\omega t}$ could represent a transition from state $n$ to state $n-2$. Correspondingly, the right-hand side of equation (14) should represent an ordered sum of the virtual transitions leading from state $n$ to state $n-2$ through an intermediate jump of 1, 2, or more steps. Thus, the term $(a_1 e^{i\omega t})^2$ with the subscript 1 should correspond to the virtual transition from state $n$ to state $n-1$ followed by another virtual transition from state $n-1$ to the final state $n-2$. Considering only this term we have:

$$b_2(n,n-2)\,e^{i\omega(n,n-2)t} = -\frac{1}{a^2}\,a_1(n,n-1)\,a_1(n-1,n-2)\,e^{i[\omega(n,n-1) + \omega(n-1,n-2)]t}. \tag{15}$$

For this equation to hold the coefficients in equation (15) must relate

according to the pattern:

$$b_2(n,n-2) = -\frac{1}{a^2}\,a_1(n,n-1)\,a_1(n-1,n-2). \tag{16}$$

With equation (16) one effectively has a matrix multiplication. When it supplied a successful basis for solving particular problems, Heisenberg quickly realized that he could dispense with the interpretation in terms of virtual oscillations. Nevertheless, the original basis for this equation is clear. Nor can equation (16) be explained through the doctrine of observability Heisenberg stressed in his published paper. Virtual transitions are unobservable as a matter of principle, for they do not involve the emission of real photons. The original basis for equation (16) is the physical significance accorded it in the virtual oscillator model. One must be careful to keep the order of multiplication as it is, $a_1(n,n-1)\,a_1(n-1,n-2)$, because the atom cannot pass from the $n-1$ to the $n-2$ state until it has already passed from the $n$ to the $n-1$ state. To change the order would be to describe a different virtual process.
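The sense in which equation (16) is "effectively a matrix multiplication" can be made concrete with a small sketch. It is written in modern terms that Heisenberg did not have in June 1925: the two-index amplitudes are stored as a square table, and composing two virtual transitions means summing over the intermediate state. The table size and the square-root amplitudes are hypothetical values chosen for illustration, and the prefactor involving $1/a^2$ is left out.

```python
import math

def compose(first, second):
    """Combine two tables of transition amplitudes by summing over the
    intermediate state k: result[n][m] = sum_k first[n][k] * second[k][m]."""
    size = len(first)
    return [[sum(first[n][k] * second[k][m] for k in range(size))
             for m in range(size)] for n in range(size)]

SIZE = 5  # number of states kept in this illustration (an assumption)

# Hypothetical one-step amplitudes a1(n, n-1) for downward transitions.
down = [[0.0] * SIZE for _ in range(SIZE)]
for n in range(1, SIZE):
    down[n][n - 1] = math.sqrt(n)

# The corresponding upward amplitudes, stored as the transposed table.
up = [[down[m][n] for m in range(SIZE)] for n in range(SIZE)]

# Two successive downward steps reproduce the pattern of equation (16):
# the (n, n-2) entry is a1(n, n-1) * a1(n-1, n-2).
two_step = compose(down, down)
print(two_step[3][1], down[3][2] * down[2][1])

# Reversing the order of the factors describes a different virtual process
# and gives a different result.
print(compose(down, up)[2][2], compose(up, down)[2][2])
```

The first line prints the same number twice. The second prints two different numbers, the noncommutativity that Heisenberg later called the most disagreeable feature of his paper.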

In classical radiation theory the solution of the equations of motion

yields not only the allowed frequencies but also their relative intensities. The old quantum theory did not supply any direct means for


the calculation of relative intensities. Instead, one had to get them indirectly by using classical physics and the correspondence principle, as Sommerfeld and Heisenberg had done,60 or one had to abduct empirical generalizations from experimental data, as some experimentalists had done.61 Heisenberg's new method allowed for a direct calculation of intensities. Again he used the anharmonic oscillator to indicate how this is done. For the anharmonic oscillator governed by equation (4), Heisenberg assumed the general solution:

$$x = \lambda a_0 + a_1 \cos\omega t + \lambda a_2 \cos 2\omega t + \cdots + \lambda^{n-1} a_n \cos n\omega t + \cdots.$$

By expansion and rearrangement one may obtain the appropriate expressions for $\ddot{x}$ and $x^2$. Substituting these terms in equation (4), dropping any terms of higher order than $\lambda^2$, and matching the coefficients of $\cos n\omega t$ gives:

$$n = 0: \quad \omega_0^2 a_0 + \tfrac{1}{2} a_1^2 = 0, \tag{17a}$$

$$n = 1: \quad -\omega^2 + \omega_0^2 = 0, \tag{17b}$$

$$n = 2: \quad (-4\omega^2 + \omega_0^2)\,a_2 + \tfrac{1}{2} a_1^2 = 0, \tag{17c}$$

$$n = 3: \quad (-9\omega^2 + \omega_0^2)\,a_3 + a_1 a_2 = 0. \tag{17d}$$

This set of expressions can be transformed into the corresponding quantum mechanical formulas by using (for simplicity) $\omega \approx \omega_0$ (from equation 17b), realizing that $n\omega$ associated with the coefficient $a_n$ goes with a virtual transition that has $n$ steps, and that the proper order of the coefficients must be preserved:62

60 A. Sommerfeld und W. Heisenberg, "Die Intensität der Mehrfachlinien und ihrer Zeemankomponenten," Zs. f. Phys., 11 (1922), 131-154.

61 See, for example, H. C. Burger and H. B. Dorgelo, "Beziehung zwischen inneren

Quantenzahlen und Intensitäten von Mehrfachlinien," Zs. f. Phys., 23 (1924), 258-266.

62 Heisenberg derives this equation in van der Waerden, op. cit. (note 18), p. 270. Two features of his treatment require some comment. First, Heisenberg used a nonstandard expansion which simplified his calculations but led to unorthodox units. A standard expansion would have the form:

$$x = \alpha_0 + \alpha_1 \cos\omega t + \alpha_2 \cos 2\omega t + \cdots + \alpha_n \cos n\omega t.$$

Since the harmonic oscillator only has the value $x = \alpha_1 \cos\omega t$, one solves the above equation for the anharmonic oscillator by making the simplifying assumptions $\alpha_1 \gg \alpha_0, \alpha_2$; $\omega_0 \gg 1$. This leads to the solutions for the coefficients:

$$\alpha_0 = \alpha_2 = -(\lambda/2)\,(\alpha_1/\omega_0^2)\,\alpha_1,$$

$$\alpha_n = -(\lambda^{n-1}/z_n)\,(\alpha_1/\omega_0^2)^{n-1}\,\alpha_1, \quad \text{for } n > 2,$$

where $z_n$ is an increasingly large dimensionless number ($z_3 = 16$; $z_4 = 480$). These coefficients are related to the ones in the text by the equations:

$$a_1 = \alpha_1; \quad a_0 = a_2 = -(1/2)\,(\alpha_1/\omega_0^2)\,\alpha_1;$$


$$n = 0: \quad \omega_0^2\,a_0(n) + \tfrac{1}{4}\left[a_1^2(n+1,n) + a_1^2(n,n-1)\right] = 0, \tag{18a}$$

$$n = 1: \quad -\omega^2(n,n-1) + \omega_0^2 = 0, \tag{18b}$$

$$n = 2: \quad -3\omega_0^2\,a_2(n,n-2) + \tfrac{1}{2}\,a_1(n,n-1)\,a_1(n-1,n-2) = 0, \tag{18c}$$

$$n = 3: \quad -8\omega_0^2\,a_3(n,n-3) + \tfrac{1}{2}\left[a_1(n,n-1)\,a_2(n-1,n-3) + a_2(n,n-2)\,a_1(n-2,n-3)\right] = 0. \tag{18d}$$

These are the expressions for the relative intensity of different transitions.
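The classical relations (17a), (17c), and (17d) form a simple recursion once $\omega$ is set equal to $\omega_0$, as equation (17b) allows: each higher coefficient follows from the lower ones. The sketch below solves them with exact rational arithmetic. The function name and the unit values chosen for $a_1$ and $\omega_0$ are assumptions made for illustration, and the quantum versions (18) are not treated here.

```python
from fractions import Fraction

def classical_coefficients(a1, omega0):
    """Solve equations (17a), (17c), (17d) for a0, a2, a3,
    with omega replaced by omega0 as permitted by (17b)."""
    w2 = omega0 * omega0
    a0 = -Fraction(1, 2) * a1 * a1 / w2       # (17a): w2*a0 + a1^2/2 = 0
    a2 = Fraction(1, 2) * a1 * a1 / (3 * w2)  # (17c): -3*w2*a2 + a1^2/2 = 0
    a3 = a1 * a2 / (8 * w2)                   # (17d): -8*w2*a3 + a1*a2 = 0
    return a0, a2, a3

# Illustrative unit values for a1 and omega0.
a0, a2, a3 = classical_coefficients(Fraction(1), Fraction(1))
print(a0, a2, a3)  # -1/2 1/6 1/48
```

The coefficients $-3$ and $-8$ are the ones that appear in equations (18c) and (18d).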

Heisenberg's 5 June letter to Kronig does not give any further details on the calculations he was doing. Many years later he explained the next, and from his point of view the decisive, step in his development.

Then I noticed that there was no guarantee that the new mathematical scheme could be put into operation without contradictions. In particular, it was completely uncertain whether the principle of the conservation of energy would still apply, and I knew only too well that my scheme stood or fell by that principle.

Other than that, however, several calculations showed that the scheme seemed quite self-consistent. Hence I concentrated on demonstrating that the conservation law held, and one evening I reached the point where I was ready to determine the individual terms in the energy table, or, as we put it today, in the energy matrix, by what would now be considered an extremely clumsy series of calculations. When the first terms seemed to accord with

Each $\alpha$ coefficient has the dimension of length $[x]$. The $a$ coefficients, accordingly, have the dimensions:

$$[a_n] = [x^n t^{2n-2}], \quad \text{while } [\lambda] = [x^{-1} t^{-2}].$$

The second confusing feature of Heisenberg's treatment is that his set of equations (19) in van der Waerden, p. 270, corresponding to our equations (18), has numerical mistakes in the equations for $n = 2$ and $n = 3$. For $n = 2$ he has:

$$[-\omega^2(n,n-2) + \omega_0^2]\,a(n,n-2) + \tfrac{1}{2}\,[a(n,n-1)\,a(n-1,n-2)] = 0,$$

where he should have:

$$[-3\omega^2(n,n-2)]\,a(n,n-2) + \tfrac{1}{2}\,[a(n,n-1)\,a(n-1,n-2)] = 0.$$

For $n = 3$ he has:

$$[-\omega^2(n,n-3) + \omega_0^2]\,a(n,n-3) + \tfrac{1}{2}\,[a(n,n-1)\,a(n-1,n-3)] + \tfrac{1}{2}\,[a(n,n-2)\,a(n-2,n-3)] = 0,$$

where he should have:

$$[-8\omega^2(n,n-3)]\,a(n,n-3) + \tfrac{1}{2}\,[a(n,n-1)\,a(n-1,n-3)] + \tfrac{1}{2}\,[a(n,n-2)\,a(n-2,n-3)] = 0.$$


the energy principle, I became rather excited, and I began to make countless arithmetical errors. As a result, it was almost three o'clock in the morning before the final result of my computations lay before me. The energy principle held for all the terms, and I could no longer doubt the mathematical consistency and coherence of the kind of quantum mechanics to which my calculations pointed. At first, I was deeply alarmed. I had the feeling that, through the surface of atomic phenomena, I was looking at a strangely beautiful interior, and felt almost giddy at the thought that I now had to probe this wealth of mathematical structures nature had so generously spread out before me. I was far too excited to sleep, and so, as a new day dawned, I made for the southern tip of the island, where I had been longing to climb a rock jutting out into the sea. I now did so without too much

trouble, and waited for the sun to rise.63

The energy conservation calculations were undoubtedly those

given in the printed paper. For the anharmonic oscillator, equation (4), he was able to prove that energy was conserved to first order in $\lambda$, but this simple balance could not have been the clinching calculation that he spent his time on that fateful evening, for it does not really supply the energy tables he refers to. To get these he used a different anharmonic oscillator which has a simpler expansion:

$$\ddot{x} + \omega_0^2 x + \lambda x^3 = 0. \tag{19}$$

Since the anharmonic correction terms depend on $x^3$, an odd power, the even terms in the expansion drop out. Then one has:

$$x = a_1 \cos\omega t + \lambda a_3 \cos 3\omega t + \lambda^2 a_5 \cos 5\omega t + \cdots. \tag{20}$$

Again the expansion coefficients, $a_1, a_3, a_5, \ldots$, can be interpreted as giving the magnitude of the transition in terms of virtual oscillator levels. So the quantum theoretic analogue should have such terms as:

$$a(n,n-1)\cos\omega(n,n-1)t; \quad \lambda\,a(n,n-3)\cos\omega(n,n-3)t; \ \ldots. \tag{21}$$

Now one goes through the same process as before: determine $\ddot{x}$ and $x^3$, substitute the results in equation (19), collect terms of order 1, 3, 5, ..., $n$ in $\cos n\omega t$, and set the coefficients equal to zero. This gives recursion relations for the classical case and, by using equation (21), also for the quantum case. Heisenberg calculated the formulas for $\omega(n,n-1)$ as a function of the general asymptotic form of $a(n,n-\tau)$ and

63Heisenberg, Physics and Beyond (note 42), p. 61.


for the first two terms $a(n,n-1)$ and $a(n,n-3)$. These are the terms necessary for a precise determination of the energy to order $\lambda^2$.

The energy for an anharmonic oscillator is the sum of the kinetic and potential energies:

$$W = \tfrac{1}{2}\,m\dot{x}^2 + \tfrac{1}{2}\,m\omega_0^2 x^2 + \tfrac{1}{4}\,m\lambda x^4. \tag{22}$$

Substituting for $\dot{x}^2$, $x^2$, and $x^4$ their Fourier expansions and collecting all terms to order $\lambda^2$:64

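$$W = \frac{(n + \tfrac{1}{2})\,h\omega_0}{2\pi} + \lambda\,\frac{3(n^2 + n + \tfrac{1}{2})\,h^2}{32\pi^2\omega_0^2 m} - \lambda^2\,\frac{h^3}{512\pi^3\omega_0^5 m^2}\left(17n^3 + \tfrac{51}{2}n^2 + \tfrac{59}{2}n + \tfrac{21}{2}\right), \tag{23}$$

$$\omega(n,n-1) = \omega_0 + \lambda\,\frac{3nh}{8\pi\omega_0^2 m} - \lambda^2\,\frac{3h^2}{256\pi^2\omega_0^5 m^2}\,(17n^2 + 7) + \cdots. \tag{24}$$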

Equation (23) is undoubtedly the energy table Heisenberg was referring to in the last excerpt cited. If such a big and awkward expression were to check out term by term then the new method must surely be correct. He needed to find a way to check it. The previous citation insisted that the new results must be in accord with the principle of energy conservation, but this blanket requirement is of little help. Here the printed text is a much better guide.

This energy can also be determined using the Kramers-Born approach by treating the term $\tfrac{1}{4}\,m\lambda x^4$ as a perturbation to the harmonic oscillator. The fact that one obtains exactly the same result (23) seems to me to furnish remarkable support for the quantum-mechanical equations which have here been taken as basis. Furthermore, the energy calculated from (23) satisfies the relation (cf. eq. 24):

$$\omega(n,n-1)/2\pi = (1/h)\,[W(n) - W(n-1)]$$

64 Heisenberg in van der Waerden, op. cit. (note 18), p. 272. Here again Heisenberg used a nonstandard expansion which simplified the calculations but led to unorthodox units. A standard expansion would have the form:

$$x = \alpha_1 \cos\omega t + \alpha_3 \cos 3\omega t + \alpha_5 \cos 5\omega t + \cdots.$$

By calculating $\ddot{x}$ and $x^3$, substituting the results in equation (19), and making the assumptions $\alpha_1 \gg \alpha_3 \gg \alpha_5$, one obtains the general form:

where $z_n$ is an increasingly large dimensionless number ($z_3 = 32$; $z_5 = 1024$). These $\alpha$ coefficients relate to Heisenberg's $a$ coefficients by the formula $\alpha_n = \lambda^{(n-1)/2} a_n$. While the $\alpha$ coefficients all have the dimension of length $[x]$, the $a$ coefficients have the dimensions $[a_n] = [x^n t^{n-1}]$, while $[\lambda] = [xt]^{-2}$.


which can be regarded as a necessary condition for the possibility of a determination of the transition probabilities according to

equations (48) and (58) [the numbering of the equations has been

adjusted].65

This brief but decisive passage indicates that equations (23) and (24) met two different tests of correctness. The first was an independent derivation of equation (23) by a method different from Heisenberg's new method. The second was a demonstration of the consistency of

equations (23) and (24). These were the tests that convinced Heisenberg that his method was indeed correct. Because of the crucial significance of these tests, it is worthwhile reconstructing the reasoning that justified these conclusions.
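The first of these tests can be illustrated for the term linear in $\lambda$. The sketch below is a modern cross-check, not the Kramers-Born action-angle calculation itself. It is an assumption of this example that one may use the now-standard harmonic-oscillator position matrix (nonzero entries $\sqrt{n+1}$ next to the diagonal, in units where $\hbar/2m\omega_0 = 1$) and read off the diagonal of $x^4$. That diagonal equals $6(n^2 + n + \tfrac{1}{2})$, which, multiplied by $\tfrac{1}{4}m\lambda(\hbar/2m\omega_0)^2$, gives the $n$-dependence $3(n^2 + n + \tfrac{1}{2})$ of the first-order term in equation (23).

```python
import math

def matmul(p, q):
    """Plain square-matrix product of two nested lists."""
    size = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def position_matrix(size):
    """Truncated harmonic-oscillator position matrix in units where
    hbar / (2 * m * omega0) = 1 (a modern convention, assumed here)."""
    x = [[0.0] * size for _ in range(size)]
    for n in range(size - 1):
        x[n][n + 1] = x[n + 1][n] = math.sqrt(n + 1)
    return x

SIZE = 12  # truncation size; diagonal entries well below it are exact
x = position_matrix(SIZE)
x2 = matmul(x, x)
x4 = matmul(x2, x2)

# Compare the diagonal of x^4 with 6 * (n^2 + n + 1/2) for the lowest states.
for n in range(4):
    print(n, x4[n][n], 6 * (n * n + n + 0.5))
```

Only the lowest few states are compared because the truncation at `SIZE` distorts entries near the edge of the table.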

The Kramers-Born approach Heisenberg refers to is the perturbation method Born introduced and which he used to justify Kramers' dispersion formula.66 This method involved an adaptation of the Hamilton-Jacobi equation. The Hamiltonian of a system may be written as:

$$H = H_0 + \lambda H_1. \tag{24}$$

In the present case $H_0$ is the Hamiltonian for a simple harmonic oscillator:

$$H_0 = p^2/2m + m\omega_0^2 x^2/2. \tag{25}$$

$\lambda H_1$ is the perturbing energy:

$$\lambda H_1 = \tfrac{1}{4}\,\lambda m x^4. \tag{26}$$

By setting up the Hamilton-Jacobi equation corresponding to equation (24) Born developed a general technique for expressing the energy in terms of a perturbation expansion:

$$H + \nu_0\,\partial S/\partial W_0 = W_0 + \lambda W_1 + \lambda^2 W_2 + \cdots, \tag{27}$$

where $S$ is the action, $W_0$ is the unperturbed energy, and $W_1$ and $W_2$ are the first and second order corrections in $\lambda$. An application of this

method to the anharmonic oscillator specified by equation (19) yields:

<n + W?o (28)

W, = X3h2l (327^0)0^), (29)

_\3(n + y2)h2 2 327ria>o2m

'

65Ibid., p. 273. 66M. Born, "Ober Quantenmechanik," Zs. /. Phys., 26 (1924), 379-395, trans, in van

der Waerden, ibid., pp. 181-201.

Page 37: berg Matrix

HEISENBERG AND MATRIX MECHANICS 173

-X2 5127r3coo5m2

,(17713 + (51/2)n2 + (54/2)n + 21/2). (30)
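The agreement between the perturbation terms and Heisenberg's energy formula can also be checked numerically. The sketch below is a modern gloss and not Born's Hamilton-Jacobi procedure: it assumes ordinary Rayleigh-Schrödinger perturbation theory in the harmonic oscillator basis, works in units where ħ = m = ω0 = 1 (so that h = 2π), and takes the coefficient of n in equation (30) to be 59/2, as in Heisenberg's published formula. All function names are illustrative.

```python
import math

def x_matrix(size):
    """Position matrix in the oscillator basis (units hbar = m = omega_0 = 1)."""
    x = [[0.0] * size for _ in range(size)]
    for n in range(size - 1):
        x[n][n + 1] = x[n + 1][n] = math.sqrt((n + 1) / 2.0)
    return x

def matmul(a, b):
    """Plain matrix product of two square lists of lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def perturbation_terms(n, size=20):
    """Coefficients of lambda and lambda**2 in the energy of level n.

    The perturbation is H_1 = x**4 / 4, i.e. equation (26) with m = 1.
    The basis is truncated; `size` must be comfortably larger than n + 4.
    """
    x = x_matrix(size)
    x2 = matmul(x, x)
    x4 = matmul(x2, x2)
    v = [[0.25 * x4[i][j] for j in range(size)] for i in range(size)]
    first = v[n][n]
    # Unperturbed level spacing is 1, so E_n - E_k = n - k.
    second = sum(v[k][n] ** 2 / (n - k) for k in range(size) if k != n)
    return first, second

def heisenberg_terms(n):
    """The same coefficients read off equations (29) and (30) with h = 2*pi."""
    first = 3.0 * (n * n + n + 0.5) / 8.0
    second = -(17 * n**3 + 25.5 * n**2 + 29.5 * n + 10.5) / 64.0
    return first, second

for level in range(4):
    print(level, perturbation_terms(level), heisenberg_terms(level))
```

For the low-lying levels the two pairs of numbers agree to rounding error, which is the sense in which the sum of (28), (29), and (30) reproduces the energy formula.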

The addition of equations (28), (29), and (30) yields equation (23).67 One other check could be performed. The most basic equation in quantum theory was $W_n - W_m = h\nu_{nm}$, which says that the radiation frequency is the difference between energy levels divided by Planck's constant. To see if this equation is satisfied Heisenberg simply substituted $(n-1)$ for $n$ in equation (23) and used:

$$\nu(n,n-1) = \omega(n,n-1)/2\pi = [W(n) - W(n-1)]/h. \tag{31}$$

The value this gives for $\omega(n,n-1)$ is exactly the same as that of equation (24), which was determined from the recursion formula. Heisenberg felt justified in concluding that this double check furnished remarkable support for his quantum mechanical equations.

One other, much simpler case was worked out. An electron moving around a center of force in a perfect circle of radius $a$ has a constant potential, so the total energy is simply the kinetic energy. This fact led to the results:

$$\omega(n,n-1) = hn/2\pi ma^2, \tag{32}$$

$$W = (h^2/8\pi^2 ma^2)(n^2 + n + \tfrac{1}{2}). \tag{33}$$

These equations also satisfy the quantum condition (31).

Heisenberg had ample grounds for concluding that his new method not only worked; it also surpassed anything yet developed in its ability to get results. In the old quantum theory one used classical mechanics to determine the energy proper to an orbit and one used quantum conditions, especially $\oint p\,dq = nh$, to restrict the allowed orbits. This method did not give selection rules, intensities, or polarizations. These had to be determined in some other way, usually by supplementing Bohr's correspondence principle with plausible considerations. Heisenberg's method allowed a direct determination of all these quantities.
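The statement that the rotator results satisfy the frequency condition (31) is easy to verify. The following Python sketch simply evaluates equations (32) and (33) and compares the frequency obtained from the energy difference with the frequency given directly; the default values h = m = a = 1 are arbitrary illustrative units, and the function names are hypothetical.

```python
import math

def rotator_energy(n, h=1.0, m=1.0, a=1.0):
    """Equation (33): W = (h^2 / 8 pi^2 m a^2)(n^2 + n + 1/2)."""
    return h**2 / (8 * math.pi**2 * m * a**2) * (n**2 + n + 0.5)

def rotator_omega(n, h=1.0, m=1.0, a=1.0):
    """Equation (32): omega(n, n-1) = h n / (2 pi m a^2)."""
    return h * n / (2 * math.pi * m * a**2)

def omega_from_levels(n, h=1.0, m=1.0, a=1.0):
    """Equation (31) solved for omega: 2 pi [W(n) - W(n-1)] / h."""
    difference = rotator_energy(n, h, m, a) - rotator_energy(n - 1, h, m, a)
    return 2 * math.pi * difference / h

for n in range(1, 5):
    print(n, rotator_omega(n), omega_from_levels(n))
```

The two columns coincide because W(n) - W(n-1) is proportional to 2n, which is exactly the n-dependence of equation (32).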

67 I have only carried out these calculations far enough to get the leading term in Heisenberg's formula (23), $\lambda 3n^2h^2/32\pi^2\omega_0^2 m$. A treatment of this problem by modern perturbation methods, which reproduces Heisenberg's results, may be found in John L. Powell and Bernd Crasemann, Quantum Mechanics (Reading, Mass., 1961), pp. 381-388.

5. JUSTIFYING QUANTUM MECHANICS

Heisenberg had a new method that seemed to work, at least for some physically unrealistic cases. However, he lacked any adequate justification for claiming that his was a general method applicable to all problems in atomic physics. The diffidence he felt on this point is clearly reflected in his letters to Pauli. In his letter of 8 June Heisenberg made no direct mention of his own work. Instead he indicated that he thought the methods Born had used in redeveloping Kramers' dispersion theory68 could be extended to a theory of coupling for any number of electrons. The conclusion of this letter reflects both the type of justification Heisenberg was seeking for his new approach and his disagreement with Pauli on the Zeeman effect. We will begin with the Born formula he cited. Heisenberg discussed this formula as if Born's method of replacing differential equations by difference equations were the key to a new rational quantum mechanics rather than as the means Heisenberg had used to check results he had achieved by other means:

$$H_2^* = -(1/2)\sum_k(\tau_k\,\partial/\partial J_k)\,[\,|C_\tau|^2\,(\nu_k\tau)/(\nu_k\tau)^2 - (\nu\tau)^2\,].$$

Hence, so far as I know, something appears as the classical dispersion formula.... This also applies for arbitrary $k$, that is, for any couplings. If Kramers transformed the formula $(\tau_k\,\partial/\partial J_k)\,[\,|C_\tau|^2\,(\nu_k\tau)/(\nu_k\tau - \nu_t)\,]$ into $\Delta[\,|C_\tau|^2\,\nu_k\tau/(\nu_k\tau - \nu_t)\,]$, why couldn't one extend this to couplings as well? Born does it and probably is right in taking this to be the beginning of a rational quantum mechanics of couplings. At the same time the formula $H_2 = \Delta \ldots/\ldots$ {sic} shows that this is grist for my Zeeman mill!! The latter I shall, by the way, now publish (without physical interpretation) but with the papal blessing, in spite of you.69

68 Born, op. cit. (note 66).

69 W. Heisenberg to W. Pauli, 8 June 1925, SHQP microfilm 80, PLC 0017.015. The citation is: "$H_2^* = -(1/2)\sum_k(\tau_k\,\partial/\partial J_k)\,[\,|C_\tau|^2\,(\nu_k\tau)/(\nu_k\tau)^2 - (\nu\tau)^2\,]$. Es ergibt sich, so viel ich weiss, also etwas als die klassische Dispersionsformel; die Witz ist, dass dies auch gilt für beliebige k, d.h. für irgendwelche Koppelungen. Wenn nun Kramers die Formel $(\tau_k\,\partial/\partial J_k)\,[\,|C_\tau|^2\,(\nu_k\tau)/(\nu_k\tau - \nu_t)\,]$ verwandelte in $\Delta[\,|C_\tau|^2\,\nu_k\tau/(\nu_k\tau - \nu_t)\,]$, weshalb soll man dies nicht bei der Koppelung tun. Born tut dies, auch hält dies wohl mit Recht für einen Anfang einer vernünftigen Quantenmechanik der Koppelung. Zugleich zeigt die Formel, $H_2 = \Delta \ldots/\ldots$, dass dies Wasser auf meine Zeeman-Mühle ist!! Letztere werde ich übrigens (ohne physikalische Deutung) mit dem päpstlichen Segen jetzt publizieren, trotz Ihnen." The "papal blessing" probably refers to Bohr's approval. Serwer, op. cit. (note 50), has suggested that this letter is misdated in the files and should be dated 8 June 1924 rather than 1925, a suggestion that I find highly plausible. Regardless of its dating the letter is an interesting revelation of Heisenberg's adaption of Born's method for developing a new quantum mechanics.

Heisenberg left Helgoland and visited Pauli in Hamburg before returning to Göttingen. There on 21 June he wrote Pauli another letter which was chiefly concerned with discussing problems Heisenberg had worked on earlier and which Pauli was now summarizing for his Handbuch article. The only paragraph that mentions Heisenberg's new work is primarily concerned with stressing the need for a new mechanics rather than the rational ordering Pauli was seeking to impose on the established mechanics: "Now I am surprised that you are surprised over the 'failure of mechanics.' If, in fact, mechanics were to succeed then one would never really know that there are atoms. But there is an alternative, a quantum mechanics, and what one must wonder about is that the hydrogen atom accidentally corresponds with something classical, at least with respect to the constitution of energy."70

Three days later Heisenberg wrote Pauli another letter discussing further problems of Pauli's Handbuch article and also giving the first summary of Heisenberg's new quantum mechanics. Since the bulk of this letter is reproduced by van der Waerden71 and the physics will be treated later, I will skip the mathematical details. Heisenberg wrote:

I have hardly any desire to write about my own work, because it is all quite unclear even to me and I have only a vague idea of what will develop, but perhaps the basic ideas are correct. The fundamental idea is: In the calculation of any magnitude, such as energy, frequency, and so forth, only the relationships between quantities which are controllable in principle should enter. (In this regard it seems to me that e.g. Bohr's theory is much more formal with regard to hydrogen than Kramers' dispersion theory.)72

70 W. Heisenberg to W. Pauli, 21 June 1925, SHQP microfilm 80, PLC 0017.017. "Nun wundere ich mich darüber, dass Sie sich über das 'Versagen der Mechanik' wundern. Wenn soeben, wie die Mechanik gälte, wird man nie wesentlich können, dass es Atome gibt; es gibt eben eine andere, eine "Quantenmechanik," und man muss sich darüber wundern dass der Wasserstoffatom zufällig hinsichtlich der Energie konstitute, mit etwas klassischem übereinstimmt."

71 Van der Waerden, op. cit. (note 18), pp. 24-25.

72 W. Heisenberg to W. Pauli, 24 June 1925, SHQP microfilm 80, PLC 0017.018. "Über meine eigenen Arbeiten hab ich fast keine Lust zu schreiben, weil mir selbst alles noch unklar ist und ich nur ungefähr ahne, wie es werden wird, aber vielleicht sind die Grundgedanken doch richtig. Grundsatz ist: Bei der Berechnung von irgendwelchen Grössen, als Energie, Frequenz usw. dürfen nur Beziehungen zwischen prinzipiell kontrollierbaren Grössen vorkommen. (Insofern scheint mir z. B. die Bohrsche Theorie beim Wasserstoff viel formaler als die Kramerssche Dispersionstheorie.)"

The letter goes on to give formulas for the first anharmonic oscillator described by equation (4), but not for the anharmonic oscillator which had supplied the crucial energy table. It concludes with a request for criticism particularly about the difficulties concerned with the noncommutative product of Fourier rows. This request clearly indicates that Heisenberg had not yet related his scheme to matrix multiplication:

I would be thankful if you could write to me which arguments speak against this formula. Apart from the formulation of the quantum conditions I am not yet really content with the whole schema. The strongest objection seems to me that the energy, written as a function of $q$ and $\dot{q}$, in general need not become a constant, even when the equations of motion are fulfilled. This follows after all from the fact that the product of two Fourier series is not uniquely defined, but I will not bore you with such things any longer.73

73 Ibid. "Ich wäre Ihnen sehr dankbar, wenn Sie mir schreiben könnten, welche Argumente zu Ungunsten dieser Formel sprechen. Abgesehen von der Formulierung der Quantenbedingungen bin ich mit dem ganzen Schema noch nicht recht zufrieden. Der stärkste Einwand scheint mir der, dass die Energie, als Funktion der $q$ und $\dot{q}$ geschrieben, im allgemeinen keine Konstante zu werden braucht, auch wenn die Bewegungsgleichungen erfüllt sind. Es liegt dies letzten Endes daran, dass das Produkt zweier Fourierreihen doch nicht eindeutig definiert ist, aber ich will Sie mit solchem nicht länger langweilen."

On 29 June Heisenberg wrote Pauli a postcard: "In the meantime I have progressed a bit further, but not much, and I am even more convinced that this quantum mechanics is already correct, though Kramers criticizes me for optimism."74 Heisenberg finally finished his paper and sent it to Pauli on 9 July 1925. The accompanying letter criticized Pauli for relying on the concept of electron paths, especially paths which can impinge on the nucleus. The attribution of circular or elliptical paths to the electron, Heisenberg insisted, has no physical sense. The parts of the letter that particularly concern us are those indicating the sort of critical reaction he was soliciting:

If you believe that your letter was read with derisive laughter, then you are seriously deceiving yourself. On the contrary, my position on mechanics has, since Helgoland, become more radical day by day. It is my firm conviction that the Bohr theory of hydrogen in its previous form is no better than the Landé theory of the Zeeman effect.... Accordingly, I venture to send you the manuscript of my paper straightaway, because I believe that, at least in the critical, i.e., the negative part, it contains some genuine physics. At the same time I have a very bad conscience, because I must ask you to send the work back to me in 2 or 3 days, for I must either complete it during the final days of my stay here or burn it. My own view concerning this scribbling, about which I am not at all happy, is this: I am convinced of the negative heuristic part, but I consider the positive part to be excessively formal and insufficient. But perhaps people who know more than I can make something reasonable out of this. So please read especially the introduction.75

74 W. Heisenberg to W. Pauli, 29 June 1925, SHQP microfilm 80, PLC 0017.018. "Inzwischen bin ich etwas, aber nicht viel weitergekommen und ich bin im Herzen wieder überzeugt, dass diese Quantenmechanik schon richtig ist, weshalb Kramers mich des Optimismus anklagt...."

The developmental sequence seems clear. In early June Heisenberg was convinced that he had made a fundamental breakthrough, primarily because a complicated problem checked out so exactly. Yet, Heisenberg had no adequate justification for the method he had used and, consequently, no convincing grounds for claiming that he had developed a general method of quantum mechanics. In the month following this breakthrough Heisenberg finally developed a justification that gave promise of being adequate and did not explicitly rely on the virtual oscillator model. He tested it on Pauli, who strongly opposed the virtualization of physics. Pauli's answering letters are not available, but their tone is conveyed by Heisenberg's summary: "Pauli's answer was markedly positive and encouraged me as I continued to work."76 With this historical background we can now analyze the first half of Heisenberg's article and relate it to his problem of developing a critical justification for the work he had done, which did not explicitly rely on the virtual oscillator model he had used.

Heisenberg's article begins by citing some well-known difficulties with the old quantum theory and then suggesting a radical remedy: "In this situation it seems sensible to discard all hope of observing hitherto unobservable quantities, such as the position and period of the electron, and to concede that the partial agreement of the quantum rules with experience is more or less fortuitous. Instead it seems more reasonable to try to establish a theoretical quantum mechanics, analogous to classical mechanics, but in which only relations between observable quantities appear."77 The novelty and complexity of his new mathematical methods tend to obscure the simplicity of Heisenberg's program. Two of the most fundamental ideas of classical mechanics are presented and then given quantum analogues. The first idea is kinematics; it is the need for a space-time description of the system being considered. The second is dynamics; the motion of a system is governed by Newton's second law. The analogues to these two ideas supplemented by one quantum condition constitute the new quantum mechanics. We will consider these points in order.

75 W. Heisenberg to W. Pauli, 9 July 1925, SHQP microfilm 80, PLC 0017.019. "Wenn Sie glauben dass ich Ihren Brief mit Hohngelächter gelesen hätte, so täuschen Sie sich sehr; im Gegenteil ist meine Meinung über die Mechanik seit Helgoland von Tag zu Tag radikaler. Es ist meine feste Überzeugung dass die Bohrsche Theorie des Wasserstoffs in der bisherigen Form nicht besser ist, als die Landésche Theorie des Zeemaneffekts.... Deshalb getraue ich mich auch Ihnen einfach das Manuskript meiner Arbeit kurzerhand zuzuschicken, weil ich glaube, dass sie, wenigstens im kritischen, d. h. negativen Teil wirkliche Physik enthält. Zwar habe ich ein sehr schlechtes Gewissen, weil ich Sie bitten muss, die Arbeit mir in 2-3 Tagen wiederzusenden, da ich sie noch in den letzten Tagen meines Hierseins entweder fertig machen oder verbrennen möchte. Meine eigene Meinung über das Geschreibsel, über das ich gar nicht sehr glücklich bin, ist die: dass ich von dem negativen heuristischen Teil fest überzeugt bin, dass ich aber den positiven für reichlich formal und dürftig halte; aber vielleicht können Leute, die mehr können, etwas vernünftiges daraus machen. Also lesen Sie bitte hauptsächlich die Einleitung." Part of this letter is in van der Waerden, op. cit. (note 18), p. 27.

76 "Paulis Antwort war ausgesprochen positiv, und ermunterte mich in der weiteren Ausarbeitung." This is from his "Erinnerung...," op. cit. (note 42), p. 43. Although Heisenberg only cited his letter of 24 June he was emphasizing the importance he attached to Pauli's evaluation.

In classical physics one can treat a system, such as a radiating electron, either through a straightforward space-time description that gives the position as a function of the time x(t) or by a Fourier expansion for x(t). Since this is purely kinematic, quantum mechanics should be able to take this over. But there is one basic difference. Heisenberg has already announced his intention of renouncing any attempt to specify the orbital paths of electrons in atoms. This renunciation might seem to eliminate both x(t) and its Fourier expansion, but such a complete renunciation would not fit Heisenberg's program. He had already abandoned the core model, which relied on a space-time description of orbits. However, he did use the virtual oscillator model, which relied on Fourier expansions about equilibrium positions. Accordingly, he wanted to adapt the classical notion of kinematics in the Fourier expansion version but not in the direct x(t) version. This conclusion can be restated without mentioning virtual oscillators. Renunciation of unobservables implies renunciation of a literal x(t) description of the orbits, but it need not imply renunciation of a Fourier expansion of x(t).

Heisenberg gave no reason why an expansion of x(t) should be physically significant if x(t) is not itself physically significant. Instead he turned to the following question: If, instead of a classical quantity x(t), we have a quantum theoretical quantity, what quantum theoretical quantity will appear in place of $x(t)^2$? Apart from Heisenberg's own use of Fourier expansions in the virtual oscillator model, nothing in the previous history of quantum theory even suggests that this question presents a problem. If the Fourier expansion of x(t) is a series, then the Fourier expansion of $x(t)^2$ should be the square of the series.

77 From van der Waerden, op. cit. (note 18), p. 262. This statement may seem to be incompatible with his submerged use of the virtual oscillator model. But they are formally compatible. Since the virtual oscillator model dispenses with any reliance on orbits and uses virtual oscillations around equilibrium positions as a fiction (the virtual aspect), he could argue that the only mathematical terms that need be given a physical interpretation are those related to observables.

Heisenberg presented his own answer indirectly. Even though it is impossible to associate the moving electron with a space-time point, it is possible to associate it with the emission of radiation. The significance of this may be seen by a correspondence principle approach comparing classical and quantum formulas. In quantum theory one has the basic formula for the frequency due to a transition between energy levels $W(n)$ and $W(n-\alpha)$:

$$\nu(n,n-\alpha) = (1/h)\,[W(n) - W(n-\alpha)]; \tag{34}$$

the classical equivalent is:

$$\nu(n,\alpha) = \alpha\,\nu(n) = \alpha\,(1/h)\,(dW/dn). \tag{35}$$

Similarly the classical formula for combining frequencies is:

$$\nu(n,\alpha) + \nu(n,\beta) = \nu(n,\alpha+\beta); \tag{36}$$

it should have the quantum analogues:

$$\nu(n,n-\alpha) + \nu(n-\alpha,n-\alpha-\beta) = \nu(n,n-\alpha-\beta), \tag{37a}$$

or

$$\nu(n-\beta,n-\alpha-\beta) + \nu(n,n-\beta) = \nu(n,n-\alpha-\beta). \tag{37b}$$
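The difference between the classical rule (36) and the quantum rules (37) can be made concrete with a small numerical illustration. The Python sketch below assumes a hypothetical non-equidistant term scheme, W(n) = -1/n^2 in arbitrary units with h = 1; this choice and all names are illustrative and are not taken from Heisenberg's paper. It checks that frequencies defined by equation (34) satisfy (37a) and (37b), whereas a naive transcription of (36), in which both transitions start from the same state n, does not.

```python
import math

H = 1.0  # Planck's constant in arbitrary units (illustrative choice)

def level_energy(n):
    """Hypothetical non-equidistant term scheme, W(n) = -1/n**2."""
    return -1.0 / n**2

def nu_quantum(n, m):
    """Equation (34): frequency for the transition from state n to state m."""
    return (level_energy(n) - level_energy(m)) / H

def nu_classical(n, alpha):
    """Equation (35): alpha * (1/h) * dW/dn, with dW/dn = 2/n**3 here."""
    return alpha * (2.0 / n**3) / H

n, alpha, beta = 10, 2, 3

# Classical rule (36): harmonics of one orbit simply add.
classical_ok = math.isclose(
    nu_classical(n, alpha) + nu_classical(n, beta),
    nu_classical(n, alpha + beta),
)

# Quantum rules (37a) and (37b): successive transitions telescope.
rhs = nu_quantum(n, n - alpha - beta)
lhs_a = nu_quantum(n, n - alpha) + nu_quantum(n - alpha, n - alpha - beta)
lhs_b = nu_quantum(n - beta, n - alpha - beta) + nu_quantum(n, n - beta)

# Naive transcription of (36): both transitions start from state n.
naive = nu_quantum(n, n - alpha) + nu_quantum(n, n - beta)

print(classical_ok, math.isclose(lhs_a, rhs), math.isclose(lhs_b, rhs),
      math.isclose(naive, rhs))
```

The quantum relations hold for any term scheme whatever, since the intermediate energy cancels; this is why they can be read as statements about chained transitions rather than about the addition of harmonics.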

Heisenberg presented equations (37) as if they were obvious consequences of equations (34) and (35): "As characteristic for the comparison between classical and quantum theory with respect to frequency, one can write down the combination relations (37)."78 Far from being obvious or familiar these new equations are unintelligible in terms of the then established principles of quantum theory. Thus in equation (37a) the left side represents a frequency (or a photon) due to a transition from a state $n$ to a state $n-\alpha$ plus another frequency (or another photon) due to a transition from a state $n-\alpha$ to a state $n-\alpha-\beta$. The net result is two separate emissions with two different frequencies. The right side represents a third different frequency, one due to a transition from state $n$ to state $n-\alpha-\beta$.

Equations (37) are intelligible only if they refer to virtual rather than real transitions. In virtual transitions no real photons are emitted, so that one does not have to hold to what equation (37) seems to say; namely, that two photons of different frequencies can combine to form a third photon with another frequency. Rather, what one is then making is the perfectly acceptable claim that a virtual transition from state $n$ to state $n-\alpha$ followed by another virtual transition from state $n-\alpha$ to state $n-\alpha-\beta$ is energetically equivalent to one transition from state $n$ to state $n-\alpha-\beta$. Equation (37b) can be given a similar interpretation using the idea that the order of addition, but not the order of multiplication, is unimportant.

78 Heisenberg in van der Waerden, ibid., p. 263.

The next section of Heisenberg's paper is a bit confusing.79 Classically, a Fourier expansion of x(t) has the form:

$$x(t) = \sum_{\alpha=-\infty}^{\infty} a_\alpha\exp(i\alpha\omega t). \tag{38}$$

Heisenberg's quantum adaption of this is geared to virtual transitions between a state $n$ and some other state $n-\alpha$. Therefore, he needs the quantum form:

$$x(t) = \sum_{\alpha=-\infty}^{\infty} a(n,n-\alpha)\exp[i\omega(n,n-\alpha)t]. \tag{39}$$

From the general expansion given in the Kramers-Heisenberg paper and especially from the particular problems Heisenberg had already solved, it was clear that Heisenberg would also need an expansion formula for $x(t)^2$, which was the problem he raised before treating combining relations for virtual frequencies.

But Heisenberg could not now go on to discuss the proper expansion of $x(t)$ or $x(t)^2$ without violating his own stated position about strict adherence to observables. Hence, he needed some other function that could serve as well as $x(t)$ in developing combining rules but which could be interpreted as an observable. For this he selected the amplitude proper to the radiation. The amplitude is an observable at least in the indirect sense that it determines intensities and polarizations, both of which are measurable. Classically, this radiation amplitude is the real part of a complex vector expression:

$$\mathrm{Re}\{A_\alpha(n)\exp[i\omega(n)\alpha t]\}. \tag{40}$$

Quantum mechanically, one would expect the analogous expression:

$$\mathrm{Re}\{A(n,n-\alpha)\exp[i\omega(n,n-\alpha)t]\}. \tag{41}$$

Accordingly, Heisenberg developed his amplitude combining rules for the coefficients in the Fourier expansion of the radiation amplitude $A(n,n-\alpha)$ rather than for the coefficients proper to the position function $a(n,n-\alpha)$. In the problems he actually solved, however, he used only the unobservable position coefficients rather than the observable radiation amplitudes. I will consider the combining rules for the expansions he actually used.

79 The interpretation given here was influenced by van der Waerden, ibid., pp. 33-34.

Suppose $x(n,t)$ has the classical Fourier expansion:

$$x(n,t) = \sum_{\alpha=-\infty}^{\infty} A_\alpha(n)\exp[i\omega(n)\alpha t]. \tag{42}$$

Then the expansion for the square is:

$$x(n,t)^2 = \sum_{\beta=-\infty}^{\infty} B_\beta(n)\exp[i\omega(n)\beta t], \tag{43}$$

where

$$B_\beta(n)\exp[i\omega(n)\beta t] = \sum_{\alpha=-\infty}^{\infty} A_\alpha A_{\beta-\alpha}\exp[i\omega(n)(\alpha+\beta-\alpha)t]. \tag{44}$$

The exponential is written in the peculiar form $\alpha+\beta-\alpha$ to indicate the order of the products involved (multiplying complex numbers involves adding coefficients in the exponents). Heisenberg then presented as the simplest and most natural assumption for a quantum mechanical analogue of equation (44) the expression:

$$B(n,n-\beta)\exp[i\omega(n,n-\beta)t] = \sum_{\alpha=-\infty}^{\infty} A(n,n-\alpha)A(n-\alpha,n-\beta)\exp[i\omega(n,n-\beta)t]. \tag{45}$$

Similarly, $x(t)^3$ may be given a Fourier expansion with coefficients $C(n,n-\gamma)$, where:

$$C(n,n-\gamma) = \sum_{\alpha=-\infty}^{\infty}\sum_{\beta=-\infty}^{\infty} A(n,n-\alpha)A(n-\alpha,n-\alpha-\beta)A(n-\alpha-\beta,n-\gamma). \tag{46}$$

It must be remembered that Heisenberg was not relying on matrices or noncommutative algebra at this time. Why should equations (45) and (46) be considered natural rather than equation (44)? Only equation (44) was in accord with the accepted practice of using Fourier expansions. Here again a double answer is needed: one to cover Heisenberg himself and one to cover the reasons he presented for public consumption. For Heisenberg himself equations (45) and (46) were used because a virtual transition from state $n$ to a final state $n-\beta$ is equivalent to a summation of virtual transitions from state $n$ to some intermediate state $n-\alpha$, followed by a second virtual transition from any intermediate state reached to the final state $n-\beta$. One could easily increase the number of virtual transitions, allowing, e.g., the electron to go from state $n$ to some intermediate state $n-\alpha$, then to a second intermediate state $n-\alpha-\beta$, and then to the final state $n-\gamma$. The number of transitions may be increased, but in each case it is vital to arrange the coefficients so that they reflect the order of the transitions. Since an electron cannot leave the state $n-\alpha$ unless it has already arrived there, it follows that:

$$A(n,n-\alpha)\,A(n-\alpha,n-\beta) \neq A(n-\alpha,n-\beta)\,A(n,n-\alpha). \tag{47}$$

For critics like Pauli, Heisenberg could simply present equations (45) and (46) as intuitively plausible hypotheses whose validity is to be judged by the conclusions that follow from them.
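The contrast between the classical rule (44) and the quantum rule (45) can be illustrated with a few lines of Python. This is a modern gloss under stated assumptions: the two-index amplitudes A(n, m) are stored as small square arrays of made-up integers (finite and purely hypothetical, not physical amplitudes), classical Fourier coefficients are stored as dictionaries keyed by the harmonic index, and the function names are invented for the example. Reading (45) as a row-by-column product is the identification Born made afterwards, not something stated in Heisenberg's paper.

```python
def heisenberg_product(a, b):
    """Equation (45) for finite arrays: C[n][m] = sum over k of A[n][k] * B[k][m]."""
    size = len(a)
    return [[sum(a[n][k] * b[k][m] for k in range(size)) for m in range(size)]
            for n in range(size)]

def classical_product(a, b):
    """Equation (44): coefficient of harmonic beta is the sum of A_alpha * B_(beta - alpha)."""
    out = {}
    for alpha, coeff_a in a.items():
        for rest, coeff_b in b.items():
            out[alpha + rest] = out.get(alpha + rest, 0) + coeff_a * coeff_b
    return out

# Hypothetical transition amplitudes between three states.
x = [[0, 1, 0],
     [1, 0, 2],
     [0, 2, 0]]
y = [[1, 0, 0],
     [0, 2, 0],
     [0, 0, 3]]

print(heisenberg_product(x, y))  # intermediate state reached via x, then y
print(heisenberg_product(y, x))  # the other order gives a different array

# Hypothetical classical Fourier coefficients: the product does not depend on order.
f = {1: 1, -1: 1}
g = {2: 3, 0: 5}
print(classical_product(f, g) == classical_product(g, f))
```

In the array version each term passes through a definite intermediate index k, which is the formal counterpart of the requirement that the order of the virtual transitions be respected.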

Heisenberg's dynamics is much easier to portray than his kinematics. For a periodic system Newton's second law takes the general form:

$$\ddot{x} + f(x) = 0. \tag{48}$$

In the old quantum theory, one integrated equation (48) and then, using the Bohr-Sommerfeld condition, set the total action around a closed path equal to an integral multiple of Planck's constant:

$$\oint m\dot{x}\,dx = J = nh. \tag{49}$$

Heisenberg accepted this general idea of an action condition as an integral part of his new theory with two modifications. First, instead of the position variable x(t) he used its Fourier expansion. Second, since he had suppressed any discussion of periodic orbits, he had to find a new quantum condition to replace equation (49). Once again, Heisenberg's way of doing this was essentially an extension of Bohr's correspondence principle. He set up a classical formulation of the problem, imposed a new quantum condition on it, and then used the prescriptions he had already established to get the proper quantum mechanical analogues of the classical expressions. For the expansion

$$x = \sum_{\alpha=-\infty}^{\infty} a_\alpha(n)\exp(i\alpha\omega_n t), \tag{50}$$

it immediately follows that:

$$m\dot{x} = m\sum_{\alpha=-\infty}^{\infty} a_\alpha(n)\,i\alpha\omega_n\exp(i\alpha\omega_n t). \tag{51}$$

One then obtains:

$$\oint m\dot{x}\,dx = \oint m\dot{x}^2\,dt = 2\pi m\sum_{\alpha=-\infty}^{\infty} |a_\alpha(n)|^2\alpha^2\omega_n. \tag{52}$$

By the old Sommerfeld rules of quantizing, the action around an orbit, equation (49), should be set equal to $nh$. Rather than do this Heisenberg employs the physicist's trick of using mathematical methods which, strictly speaking, are simply illegitimate, but which can achieve a pragmatic legitimization through the physical significance accorded them.

The idea Heisenberg presented is that since the action is determined only up to an additive constant, it is better to work with the derivative of equation (52) than with the equation itself:

$$(d/dn)(nh) = (d/dn)\oint m\dot{x}^2\,dt. \tag{53}$$

In quantum theory $n$ is an integer, not a variable. Therefore, in a quantum context the derivative in equation (53) is illegitimate. By the correspondence principle, however, the discrete quantum integers $n$ get replaced by a variable in the classical limit. So one can treat $n$ as if it were a variable and use the expansion in equation (52) to get:

$$h = 2\pi m\sum_{\alpha=-\infty}^{\infty} \alpha\,(d/dn)(\alpha\omega_n|a_\alpha|^2). \tag{54}$$

Methodologically, there is no way to give equation (54) a consistent interpretation. In the quantum limit in which $h$ is significant, $n$ is not a variable. The classical limit, in which $n$ is a variable, is given a relative definition as the limit in which $h \to 0$.

Yet, Heisenberg retained both sides of equation (54) and used it as a bridge to be replaced by a quantum mechanical analogue. In view of his other transitions from classical to quantum formulations, the natural replacement here would seem to be:

$$\alpha\omega_n|a_\alpha|^2 \rightarrow \omega(n,n+\alpha)\,|a(n,n+\alpha)|^2.$$

Heisenberg, however, modified this to include the sum rule that had just been developed independently by W. Kuhn80 and W. Thomas.81

80 W. Kuhn, "Über die Gesamtstärke der von einem Zustande ausgehenden Absorptionslinien," Zs. f. Phys., 33 (1925), 408-412, translated in van der Waerden, ibid., pp. 253-257.

81 W. Thomas, "Über die Zahl der Dispersionselektronen, die einem stationären Zustande zugeordnet sind," Naturwiss., 13 (1925), 627.

According to the Lorentz theory of dispersion, the polarization $P$ of the electrons in an atom induced by incident electromagnetic radiation of frequency $\nu$ and field strength $E$ is:

$$P = E\sum_{i=0}^{\infty} f_i\,e^2/4\pi^2 m(\nu_i^2 - \nu^2), \tag{55}$$

where $f_i$ represents the number of dispersion electrons. In the limit in which $\nu \gg \nu_i$ one should have the classical expression. But this actually follows only if one has the sum rule:

$$\sum_i f_i = 1. \tag{56}$$

To adapt this conservation law to a model that allows both emission and absorption (of virtual as well as real photons), one would have to subtract the absorptions from the emissions:

2,/< = 1. (57)

This subtraction is similar to what Kramers did in extending Ladenburg's dispersion formula, and it is probably what induced Heisenberg to get the quantum analogue of equation (54) by doubling the coefficient and then subtracting absorption terms from emission ones to get:

$$h = 4\pi m\sum_{\alpha=0}^{\infty}\{|a(n,n+\alpha)|^2\,\omega(n,n+\alpha) - |a(n,n-\alpha)|^2\,\omega(n,n-\alpha)\}. \tag{58}$$

Equations (48) and (58) are sufficient, when soluble, to give a complete determination of frequencies, energy values, and quantum theoretical transition probabilities. Heisenberg had made good on his promise to produce a new quantum mechanics. To demonstrate the validity of this formalism, Heisenberg applied it to the problems we have already considered.
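How the quantum condition (58) fixes amplitudes can be seen in the simplest case, the harmonic oscillator. The Python sketch below is an illustration under stated assumptions, not a reconstruction of Heisenberg's calculation: it takes the standard modern matrix elements |a(n, n±1)|^2 = ħ·max(n, n±1)/(2mω0) as given, and it reads both frequencies in (58) as positive magnitudes, so that ω(n, n±α) = αω0 for the equally spaced oscillator levels. The numerical values of h, m, and ω0 are arbitrary.

```python
import math

def amplitude_sq(n, k, h, m, omega0):
    """|a(n, k)|^2 for the harmonic oscillator; nonzero only for neighboring states.

    Assumes the standard matrix elements hbar * max(n, k) / (2 m omega0).
    """
    hbar = h / (2 * math.pi)
    if n < 0 or k < 0 or abs(n - k) != 1:
        return 0.0
    return hbar * max(n, k) / (2 * m * omega0)

def quantum_condition_sum(n, h, m, omega0, alpha_max=4):
    """Right-hand side of equation (58), truncated at alpha_max."""
    total = 0.0
    for alpha in range(alpha_max + 1):
        freq = alpha * omega0  # magnitude of omega(n, n +/- alpha), an assumption
        total += amplitude_sq(n, n + alpha, h, m, omega0) * freq
        total -= amplitude_sq(n, n - alpha, h, m, omega0) * freq
    return 4 * math.pi * m * total

for n in range(4):
    print(n, quantum_condition_sum(n, h=1.0, m=2.0, omega0=3.0))
```

For every state the sum returns h, independent of n, m, and ω0; for the ground state the emission term is absent, since there is no state below it.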

6. AFTERMATH

The virtual oscillator model played an essential role in the process of reasoning that led Heisenberg to the development of quantum mechanics. It also played an implicit role in suggesting mathematical hypotheses and procedures. Yet, Heisenberg was able to redevelop his method in such a way that the finished paper manifested no explicit dependence on any particular model of the atom. After this paper was written the virtual oscillator model sank from sight and never resurfaced. In place of models representing unobservable structures or processes the Heisenberg paper stressed the idea that scientific theories should be exclusively concerned with quantities observable in principle. I wish to conclude by indicating how the immediate acceptance of this doctrine temporarily cloaked interpretative problems concerning the conceptual models implicit in Heisenberg's work.

After developing quantum mechanics Heisenberg gave an explanation of the new developments to the physicists at the University of Berlin, where he met Einstein. Einstein invited Heisenberg to his home and questioned him about the doctrine of observables. Against Einstein's protests Heisenberg said that he thought that this doctrine was what Einstein himself had followed in his treatment of space and time in special relativity. Einstein's answer, as Heisenberg reconstructs it, was: "Possibly I did use this kind of reasoning [diese Art von Philosophie], but it is nonsense all the same. Perhaps I could put it more diplomatically by saying that it may be heuristically useful to keep in mind what one has actually observed. But, on principle, it is quite wrong to try founding a theory on observable magnitudes alone. In reality the very opposite happens. It is the theory which decides what we can observe."82 Though Heisenberg did not accept Einstein's doctrine at that time he later incorporated it into his indeterminacy principle. He claimed that a recollection of Einstein's statement inspired that principle.

Though Einstein's criticism might have remained in the back of

Heisenberg's mind, other developments soon preoccupied him. Around 11 July 1925, Heisenberg gave Born the final version of the

paper we have been considering. In his Nobel prize address Born recounted his reaction:

I could not take my mind off Heisenberg's multiplication rule, and after a week of intensive thought and trial I suddenly remembered an algebraic theory which I had learned from my teacher, Professor Rosanes, in Breslau. Such square arrays are well known to mathematicians and, in conjunction with a specific rule for multiplication, are called matrices. I applied this rule to Heisenberg's quantum condition and found that this agreed in the diagonal terms. It was easy to guess what the remaining quantities must be, namely zero; and at once there stood before me the peculiar formula pq − qp = h/2πi.83

82W. Heisenberg, Der Teil..., op. cit. (note 42), p. 92. On pp. 111-112 Heisenberg relates how the recollection of Einstein's statement contributed to the formation of his indeterminacy principle.

83Nobel Lectures: 1942-1962 (Amsterdam, 1964), p. 256.


Born, who did not like to work without an assistant, tried to get Pauli to work with him on this problem. Pauli refused on the grounds that Born's mathematical formalism would spoil Heisenberg's physical ideas. Born then secured the assistance of his student Pascual Jordan. A few days later Jordan proved that the canonical equations of motion applied to the matrices representing the momentum p and the position q led to the result that the time derivative of pq − qp must vanish, implying that the matrix itself must be diagonal. Born and Jordan then concentrated exclusively on a systematic redevelopment of Heisenberg's new mechanics in terms of matrices. In the introduction to their article they presented their evaluation of the situation: "The physical reasoning which led Heisenberg to this development has been so clearly described by him that any supplementary remarks appear superfluous. But, as he himself indicates, in its formal, mathematical aspects his approach is but in its initial stages."84 Slightly later Dirac made a similar appraisal, accepting the physical basis of Heisenberg's new method as adequate while trying to redevelop his mathematics along the lines of noncommutative algebra.85 Heisenberg was enthusiastic about the Born-Jordan redevelopment

and quickly mastered the requisite mathematics. Then he collaborated with Born and Jordan on a new paper (the "three-man paper") giving a systematic overview of quantum mechanics in matrix form.86 The introduction, written by Heisenberg, explains the physical basis. It seems to couple a stress on observables with a total rejection of any reliance on models of the atom. Yet, what is really rejected is the idea that any visualizable models either picture the atom realistically or supply an interpretative basis for the new formulation: "Admittedly, such a system of quantum-theoretical relations between observable quantities, when compared with the quantum theory employed hitherto, would labor under the disadvantage of not being directly amenable to a geometrically visualizable interpretation, since the motion of electrons cannot be described in terms of the familiar concepts of space and time."87

84M. Born and P. Jordan, "Zur Quantenmechanik," Zs. f. Phys., 34 (1925), 858-888, partially translated in van der Waerden, op. cit. (note 18), pp. 277-306; citation from p. 277. This is not simply a convenient way of introducing the paper. It seems to have represented a fixed opinion on Born's part. Thus, in the lectures he subsequently gave at the Massachusetts Institute of Technology, Born summarized the new developments Heisenberg had initiated: "In his brief paper the leading physical ideas are clearly stated, but only exemplified on account of the lack of appropriate mathematical equipment. The required machinery Jordan and I have developed in the matrix calculus." In Born, Problems of Atomic Dynamics (New York, 1960; original ed. 1926), p. 67.

85P. A. M. Dirac, "Quantum Mechanics and a Preliminary Investigation of the Hydrogen Atom," Proc. Roy. Soc. A, 110 (1926), 561-569; included in van der Waerden, op. cit. (note 18), pp. 417-427.

86M. Born, W. Heisenberg, and P. Jordan, "Zur Quantenmechanik II," Zs. f. Phys., 35 (1926), 557-615; in van der Waerden, ibid., pp. 321-385.

This three-man paper quickly became the basic text for anyone who wanted to learn matrix mechanics. In it there was no trace of, or need for, the virtual oscillator model, a model few physicists had ever heard of and fewer still had taken seriously. What was clear is that the models of the atom that the physicists were familiar with, the general Bohr-Sommerfeld model and such specializations of it as the core model, were emphatically rejected in the new quantum mechanics. Ironically, Heisenberg, whose success had hinged on the careful and critical use he had made of models in scientific reasoning, was generally credited with being successful because he rejected any reliance on models and the type of physical reasoning they support. While this misinterpretation caused little difficulty in physics, it seems to have given strong support to formalistic ideas of the nature of scientific explanation, ideas that are at variance with Heisenberg's own practice.

While physicists generally seem to have accepted the view of Born and Dirac that Heisenberg's new physics was adequate but that his mathematics required redevelopment and further extension, there was one strong dissenting opinion. Niels Bohr, always more the natural philosopher than the mathematical physicist, realized that the new mathematical formulations required, but lacked, an adequate and consistent physical interpretation. Soon he initiated discussions with Heisenberg and Schrödinger in Copenhagen to think this through.88 But this forms another and even more complex chapter in the tangled history of quantum theory.

ACKNOWLEDGMENTS

I wish especially to thank Professor Werner Heisenberg for his helpful and encouraging comments on the first draft of this article. I also wish to thank Professor John Heilbron and Ms. Judy Fox for their

87Ibid., p. 322. Van der Waerden discussed this paper with each of the authors, and in his reconstruction (pp. 42-54) he carefully explains who wrote each part.

88Responding to this evaluation Heisenberg wrote to me on 12 July 1974: "You express the opinion that only Bohr felt the need for a physical interpretation of quantum mechanics, while the other physicists including myself considered the mathematical scheme as a sufficient explanation of the phenomena. I believe that I have always shared the opinion of Bohr and I disagreed on this point with the other physicists; but I should add that also Born and perhaps also Dirac understood well that the mathematical scheme alone is not sufficient to explain phenomena like the path of an electron in the cloud chamber."


cooperation in making available the material in the Center for the History of Science and Technology at the University of California, Berkeley, and the directors of the Niels Bohr Institute in Copenhagen for permission to use microfilm copies of the Bohr correspondence. I thank Daniel Serwer for sending a preprint of his article. Though it was not received in time to incorporate it into the article, it was helpful in changing two references and in translating one citation. Finally, I wish to thank the editors of this volume and two anonymous referees for many helpful suggestions.

