"7 - springer


MAGNETIC FIELD

Magnetic field is generated by the passage of electrical currents through conductors or by the circulation of microscopic charges within magnetic materials. The magnetic field exerts a mechanical stress on its sources.

Basic Equations To separate electromagnetic theory from the theory of the solid state, Maxwell's equations can be written in terms of the magnetic induction or flux-density B and the total current density J:

$$\nabla \cdot \mathbf{B} = 0 \qquad (1)$$

$$\nabla \times \mathbf{B} = 4\pi \mathbf{J} \qquad (2)$$

where, for purposes of the present article, the displacement current $(1/c^2)\,\partial\mathbf{E}/\partial t$ is neglected relative to $4\pi\mathbf{J}$ (see ELECTROMAGNETIC THEORY and MAGNETISM).

The solution of Eqs. (1) and (2) is

$$\mathbf{B}(\mathbf{r}) = \int d^3r_1\, \frac{\mathbf{J}(\mathbf{r}_1) \times (\mathbf{r} - \mathbf{r}_1)}{|\mathbf{r} - \mathbf{r}_1|^3} \qquad (3)$$

where the integral includes all current-carriers, and B vanishes at infinity.

From Eq. (3) it follows, for example, that an infinite straight conductor [Fig. 1(a)] carrying a total axial current $I_c$ gives rise to an azimuthally directed external magnetic induction of strength $B = 2I_c/r$, where $r$ is the distance from the conductor axis. The same is true for an axial current in any axisymmetric conductor, e.g., the toroidal conductor of Fig. 1(b). An infinitely long circular cylindrical conductor [Fig. 2(a)] carrying an azimuthal current density $I_b$ per unit length contains an axially directed magnetic induction of strength $B = 4\pi I_b$. The same expression holds for a straight cylindrical conductor of arbitrary cross section [Fig. 2(b)], or for an infinite plane current sheet. A circular cylinder like that of Fig. 2(a), but of finite length $L$ and radius $R$, has a central magnetic induction of strength $B = 4\pi I_b L (L^2 + 4R^2)^{-1/2}$.

FIG. 1. Axisymmetric configurations with axially directed current $I_c$.
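As a quick plausibility check on these expressions, the following sketch evaluates both formulas in the emu units used throughout this article (see Units and Magnitudes below: currents in abamperes, lengths in cm, B in gauss); the particular current and dimensions are illustrative assumptions.

```python
# Minimal numerical check of the quoted field formulas, in emu units
# (currents in abamperes, 1 abA = 10 A; lengths in cm; B in gauss).
import math

def b_straight_wire(I_c, r):
    """Azimuthal B = 2*I_c/r outside an axial current I_c."""
    return 2.0 * I_c / r

def b_finite_solenoid(I_b, L, R):
    """Central B = 4*pi*I_b*L/(L**2 + 4*R**2)**0.5 for a circular
    cylinder of length L and radius R carrying an azimuthal current
    I_b per unit length; it approaches 4*pi*I_b as L >> R."""
    return 4.0 * math.pi * I_b * L / math.sqrt(L**2 + 4.0 * R**2)

I_c = 100.0 / 10.0                         # 100 A expressed in abamperes
print(b_straight_wire(I_c, 1.0))           # ~20 gauss at r = 1 cm

I_b = 10.0                                 # abA per cm of length
print(b_finite_solenoid(I_b, 20.0, 2.0))   # ~123 G, near 4*pi*I_b ~ 126 G
```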

In terms of the vector potential A,

$$\mathbf{B} = \nabla \times \mathbf{A} \qquad (4)$$

and the gauge $\nabla \cdot \mathbf{A} = 0$, one has

$$\nabla^2 \mathbf{A} = -4\pi \mathbf{J} \qquad (5)$$

and

$$\mathbf{A}(\mathbf{r}) = \int d^3r_1\, \frac{\mathbf{J}(\mathbf{r}_1)}{|\mathbf{r} - \mathbf{r}_1|} \qquad (6)$$

for A vanishing at infinity.

Magnetic lines of force are defined by $d\mathbf{r} \propto \mathbf{B}$ and are endless, by virtue of Eq. (1).


FIG. 2. Infinitely long cylindrical configurations with axially directed magnetic induction B.


FIG. 3. Magnetic flux tube.

The magnetic flux $\Phi$ through a surface S, bounded by a closed curve Q (Fig. 3), is given by

$$\Phi = \int_S dS\, \mathbf{B} \cdot \mathbf{n} = \oint_Q d\mathbf{Q} \cdot \mathbf{A} \qquad (7)$$

where n is the normal to S. The flux tube defined by the field lines passing through Q contains constant flux, independent of S.

At a large distance from a localized current distribution at r = 0, Eq. (6) gives the dipole potential

$$\mathbf{A} = \frac{\mathbf{m} \times \mathbf{r}}{r^3} \qquad (8)$$

where

$$\mathbf{m} = \tfrac{1}{2} \int d^3r_1\, \mathbf{r}_1 \times \mathbf{J}(\mathbf{r}_1) \qquad (9)$$

Equations in the Presence of Magnetic Materials A macroscopic current density $\mathbf{J}_M$ can be defined by local averaging of the microscopic current density $\mathbf{J}_m$ within magnetic materials

$$\mathbf{J}_M = (\mathbf{J}_m)_{\mathrm{av}} \qquad (10)$$

More conveniently, a magnetization vector M can be introduced, where

$$\mathbf{J}_M = \nabla \times \mathbf{M} \qquad (11)$$

In experiments, only $\mathbf{J}_M$ can be measured directly (via measurements on B), and M is then uniquely derivable from Eq. (11) only with the added condition that it is to be a local state variable of the magnetic material (i.e., it is constant in uniform samples and constant fields). This condition follows automatically from the interpretation of M as a magnetic-moment density per unit volume:

$$\mathbf{M} = N\mathbf{m}_0 \qquad (12)$$

The theoretical molecular magnetic moment $\mathbf{m}_0$ (with number density $N$) is derived from $\mathbf{J}_m$ by evaluation of Eq. (9) over the molecular volume.

For macroscopic purposes, the total current density J of the preceding section is now specified by

$$\mathbf{J} = \mathbf{J}_c + \mathbf{J}_M \qquad (13)$$

The component $\mathbf{J}_c$ flows in conductors of resistivity $\eta$ in accordance with Ohm's law:

$$\eta \mathbf{J}_c = \mathbf{E} \qquad (14)$$

The component $\mathbf{J}_M$ is derived from M. In the analysis of configurations involving magnetic materials, the magnetic field H is a convenient vector:

$$\mathbf{H} = \mathbf{B} - 4\pi\mathbf{M} \qquad (15)$$

Then Eqs. (1) and (2) take the form

$$\nabla \cdot \mathbf{H} = -4\pi\,\nabla \cdot \mathbf{M} \qquad (16)$$

$$\nabla \times \mathbf{H} = 4\pi\,\mathbf{J}_c \qquad (17)$$

At the interface between two magnetic materials, Eqs. (1) and (17) imply continuity of the normal component of B and of the tangential component of H.

Across a sheet current of density $I_c$ per unit length, the tangential component transverse to $\mathbf{J}_c$ of both H and B undergoes an increment $4\pi I_c$. The other components of H and B are unaffected.

The field patterns set up by a magnetized sphere are illustrated in Fig. 4. The magnetic induction (a) and magnetic field (b) are identical outside the sphere, but differ inside it, because of the magnetization (c). The same pattern of magnetic induction could be generated in the absence of magnetization by a surface current (d). The source of the magnetic field in Eq. (16), that is to say the quantity $-\nabla \cdot \mathbf{M}$, is also referred to as the magnetic pole density. The north and south polar regions are indicated in (c).

For weakly magnetic materials, Eq. (15) can generally be written in terms of a scalar magnetic permeability $\mu$:

$$\mu \mathbf{H} = \mathbf{B} \qquad (18)$$

For ferromagnetic materials, one can still write

$$\mu \mathbf{H} = \mathbf{B} - 4\pi \mathbf{M}_0 \qquad (19)$$

where $\mathbf{M}_0$ is a permanent magnetization, but $\mu$ now depends on the time history as well as the magnitude of H.


FIG. 4. Magnetic induction (a) and magnetic field (b) arising from sphere with magnetization pattern shown in (c) or with surface current pattern shown in (d).

The typical relation between B and H for ferromagnetic materials is illustrated in Fig. 5.

When $\mathbf{J}_c$ is zero everywhere, one can define a scalar potential $\Omega$, such that

$$\mathbf{H} = -\nabla\Omega \qquad (20)$$

$$\nabla^2\Omega = 4\pi\,\nabla \cdot \mathbf{M} \qquad (21)$$

If the boundary condition on $\Omega$ is simply that it vanish at infinity, the solution is

FIG. 5. As magnetic field H is initially raised to $H_1$, magnetic induction B rises to $B_1$. Cyclical pattern shown is typical of ferromagnetic materials: B saturates as H becomes large; B remains finite as H returns to null; B goes to $-B_1$ as H goes to $-H_1$. Double-valuedness of B(H) is known as hysteresis.

$$\Omega(\mathbf{r}) = -\int d^3r_1\, \frac{\nabla_1 \cdot \mathbf{M}(\mathbf{r}_1)}{|\mathbf{r} - \mathbf{r}_1|} \qquad (22)$$

In the presence of current-carrying conductors Eqs. (20) and (21) still hold in the region where $\mathbf{J}_c = 0$, but Eq. (17) now implies a multivalued potential

$$\oint d\Omega = -4\pi I_c \qquad (23)$$

where the integral is taken around a loop enclosing the total conductor current $I_c$. To keep the potential single-valued, so that the solution of Eq. (22) remains valid, one may adopt the "magnetic-shell" approach: $\mathbf{J}_c$ is replaced with an equivalent M, in analogy with Eq. (11).

Magnetic Force and Energy From Maxwell's stress tensor, we find the volume force

$$\mathbf{f} = -\nabla\!\left(\frac{B^2}{8\pi}\right) + \frac{1}{4\pi}(\mathbf{B} \cdot \nabla)\mathbf{B} \qquad (24)$$

$$= \mathbf{J} \times \mathbf{B} \qquad (25)$$

which agrees with the summation of the Lorentz forces on the moving charges composing J. The "magnetic pressure" against a current sheet bounding a region of finite B (as in the Meissner effect or ordinary skin effect) is thus $B^2/8\pi$, evaluated at the surface. The force and torque on a body localized in a nearly uniform field are

$$\mathbf{F} = (\mathbf{m} \cdot \nabla)\mathbf{B} \qquad (26)$$

$$\mathbf{N} = \mathbf{m} \times \mathbf{B} \qquad (27)$$

From the microscopic point of view underlying Eq. (2), the magnetic energy density is

$$w = \frac{B^2}{8\pi} \qquad (28)$$

In the presence of magnetic materials, one is more interested in the electrical input energy required to go from $B_0$ to $B$, and this is given by

$$\Delta w = \frac{1}{4\pi} \int_{B_0}^{B} \mathbf{H} \cdot d\mathbf{B} \qquad (29)$$

For $\mu\mathbf{H} = \mathbf{B}$, with constant $\mu$, this becomes

$$\Delta w = \frac{1}{8\pi\mu}\,(B^2 - B_0^2) \qquad (30)$$

The derivation of Eqs. (28) and (29) depends on the complete set of Maxwell's equations.

Units and Magnitudes The equations used here are based on the emu system. If the currents are expressed in amperes, they must be divided by 10 to give magnetic inductions in gauss or fields in oersteds. A magnetic induction of 10 kilogauss (one weber per square meter) can exert a maximum stress $B^2/8\pi$ of about 4 atm, and contains an energy density of 0.4 J/cm³.

The strength of the earth's magnetic induction is about 0.2 gauss. For typical ferromagnetic materials, the maximum value of $4\pi M$ is 20 kilogauss; at this point of saturation, the permeability $\mu$ approaches unity. At magnetic fields below one gauss, permeabilities of 1000 or more can be reached. The strength of materials limits the magnetic induction obtainable nondestructively with laboratory electromagnets to peak values well below a million gauss.
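The magnitudes quoted in this section are easy to verify; a short sketch in Gaussian units, with the unit conversions stated in the comments:

```python
# Stress and energy density B**2/(8*pi) for B = 10 kilogauss (1 tesla),
# in erg/cm^3 (numerically equal to dyn/cm^2 as a stress).
import math

B = 1.0e4                        # gauss
w = B**2 / (8.0 * math.pi)       # ~3.98e6 erg/cm^3

print(w / 1.0e7)                 # ~0.40 J/cm^3  (1 J = 1e7 erg)
print(w / 1.013e6)               # ~3.9 atm      (1 atm ~ 1.013e6 dyn/cm^2)
```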

HAROLD P. FURTH

References

Stratton, J. A., "Electromagnetic Theory," New York, McGraw-Hill Book Co., 1941.

Jackson, J. D., "Classical Electrodynamics," New York, John Wiley & Sons, 1962.

Cross-references: ELECTRICITY, ELECTROMAGNETIC THEORY, FERROMAGNETISM, FIELD THEORY, MAGNETISM.

MAGNETIC RESONANCE

The magnetic resonance phenomenon is the resonant interaction between an oscillating magnetic field and an orthogonal static magnetic field mediated by the presence of objects possessing both angular momentum and a magnetic moment. The objects in question are normally microscopic in character (molecules, atoms, atomic nuclei or subatomic particles: protons, neutrons, electrons, muons, etc.), although historically the effect was first demonstrated with magnetized iron rods.

In atoms and molecules, a magnetic moment may arise from the orbital motions of electrons, in the same manner as an electric current flowing in a loop of wire generates a dipole-like magnetic field. According to quantum theory, angular momentum for such orbital or rotational motions is restricted to values which are integral multiples of $\hbar$ ($\hbar = h/2\pi$, where $h = 6.626176 \times 10^{-34}$ J s is the Planck constant). Thus, we may calculate the magnitude of the magnetic moment $\mu$ due to the orbital motion of an electron in the smallest ($n = 1$) orbit of the classical Bohr atom using Ampère's law ($\mu$ = current × area) and the fact that the angular momentum $L = mvr = n\hbar$. The result,

$$\mu_B = \frac{e\hbar}{2m} = 9.274078 \times 10^{-24}\ \mathrm{J\ T^{-1}},$$

where $e/m$ is the magnitude of the electron charge-to-mass ratio, is known as the Bohr magneton, and is a convenient unit in which to denote atomic and molecular magnetic moments.
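As a numerical check on the quoted value, the sketch below recomputes $\mu_B = e\hbar/2m$ from SI constants; the constant values are assumptions (rounded values contemporary with the article).

```python
# Bohr magneton from e, hbar, and the electron mass.
e = 1.6021892e-19      # elementary charge, C (assumed value)
hbar = 1.0545887e-34   # J s (assumed value)
m_e = 9.109534e-31     # electron mass, kg (assumed value)

mu_B = e * hbar / (2.0 * m_e)
print(mu_B)            # ~9.274e-24 J/T, matching the quoted value
```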

The spin, or intrinsic angular momentum, of an elementary particle or a nucleus may take on values which are integral multiples of $\hbar/2$. The nuclear equivalent of the Bohr magneton, based on a classical model of the proton as a rotating sphere of charge, is the nuclear magneton

$$\mu_N = 5.050824 \times 10^{-27}\ \mathrm{J\ T^{-1}}.$$

The actual intrinsic magnetic moments for the free electron and proton are respectively $-1.001160\ \mu_B$ and $+2.792846\ \mu_N$. The $g$ factor is a dimensionless measure of magnetic moment (in units of $-\mu_B/2$) which is traditionally used for electron-associated moments.

A particle possessing both angular momentum J and a proportional magnetic moment $\boldsymbol{\mu} = \gamma\mathbf{J} = \gamma\hbar\mathbf{I}$ will precess gyroscopically when placed in a magnetic field $\mathbf{B}_0$ in a manner similar to that of a spinning top precessing about the Earth's gravitational field. The proportionality constant $\gamma$ is known as the magnetogyric ratio. The precession rate in radians s⁻¹, the Larmor frequency, may be derived by a classical calculation to be

$$\omega_0 = \gamma B_0. \qquad (1)$$

Coincidentally, an exact quantum calculation, using a Hamiltonian $\mathcal{H} = -\boldsymbol{\mu} \cdot \mathbf{B}_0$, yields an energy separation (Zeeman interaction) between adjacent energy levels of a particle with spin angular momentum quantum number $I$ of $E_0 = \hbar\omega_0$. A total of $2I + 1$ states with energies $-\hbar\omega_0 I_z$ occur, $I_z$ being the z-component (magnetic) angular momentum quantum number and taking on the values $-I, -I+1, \ldots, +I$.
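For a concrete instance of Eq. (1), the sketch below computes the proton Larmor frequency at 2.3487 T, assuming a literature value for the proton magnetogyric ratio; the result can be compared with the ¹H entry of Table 1 below.

```python
# nu_0 = omega_0/(2*pi) = gamma*B_0/(2*pi) for protons at 2.3487 T.
import math

gamma_p = 2.6752e8     # proton magnetogyric ratio, rad s^-1 T^-1 (assumed)
B0 = 2.3487            # tesla

print(gamma_p * B0 / (2.0 * math.pi) / 1.0e6)   # ~100.0 MHz
```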

This beautiful correspondence between the quantum and classical descriptions is made even stronger for isolated spins by the fact that a quantum dynamic calculation shows that the expectation values $\langle I_x\rangle$, $\langle I_y\rangle$, and $\langle I_z\rangle$ behave with time in precisely the same manner as the components of angular momentum of a wholly classical particle. Thus, we are entitled to formulate and visualize descriptions of magnetic resonance phenomena in terms of classically precessing vectors.

The vector description of magnetic resonance is due to Rabi. If a spin experiences simultaneously a static magnetic field and a field rotating in a plane perpendicular to the static field, resonance becomes possible. (In practice, linearly polarized transverse oscillatory fields are used; these may be decomposed into two counterrotating fields, one of which moves along with the precessing spin, the effect of the other being unimportant.) The detailed analysis of this situation is conveniently accomplished by a coordinate transformation to a frame of reference rotating synchronously with the rotating field (Fig. 1). By convention, the z-axes of the laboratory and rotating reference frames are collinear with the applied static

679 MAGNETIC RESONANCE

FIG. 1. Rotating coordinate transformation. The rotating reference frame rotates in synchronism with the applied $B_1$ RF magnetic field. Within the rotating frame, the $B_1$ field is static, the original $B_0$ field is replaced by the resonance offset field $\Delta B$, and the magnetization M precesses about the net field $\mathbf{B}_e$.

field $\mathbf{B}_0 = B_0\mathbf{k}_{\mathrm{LAB}}$. The rotating field amplitude is $B_1$, and the laboratory and rotating frame vectors are defined as

$$B_1(\mathbf{i}_{\mathrm{LAB}}\cos\omega t - \mathbf{j}_{\mathrm{LAB}}\sin\omega t)$$

and

$$B_1\,\mathbf{i}_{\mathrm{ROT}}$$

respectively, $\omega$ being the angular rotation frequency.

Thus, in the absence of a $B_1$ field a spin with true Larmor frequency $\omega_0$ appears in the rotating frame to precess at $\omega_0 - \omega \equiv \Delta\omega$. In order to preserve the form of the dynamical equations upon transformation into the rotating frame, we adopt the idea of the spin experiencing an effective field given by $\Delta B = (\omega_0 - \omega)/\gamma$, which acts along $\mathbf{k}_{\mathrm{ROT}}$.

Now the resonance effect becomes clear and simple to analyze. Upon application of a $B_1$ field, a net effective field $\mathbf{B}_e = B_1\mathbf{i}_{\mathrm{ROT}} + \Delta B\,\mathbf{k}_{\mathrm{ROT}}$ results which is no longer parallel to the z-axis. The spin now precesses at rate $\gamma B_e$ about the effective field direction (often denoted by an effective field angle $\theta = \tan^{-1}[B_1/\Delta B]$) and may suffer large excursions in direction. When the frequency of the applied $B_1$ field is far above or below $\omega_0$ [Fig. 2(b)] the effective field direction is essentially the same as that of the laboratory field, and no significant effects on the spin are observed. The maximum effect occurs when the frequency of the applied $B_1$ field exactly matches $\omega_0$ [$\Delta B \to 0$, the resonance condition, Fig. 2(a)].

FIG. 2. Magnetic resonance: development of transverse magnetization. (a) On or near resonance ($|\Delta B| \lesssim B_1$) the magnetization M may be nutated from its thermal equilibrium position along z into the xy plane (90° pulse), inverted (180° pulse), and so forth. (b) When the applied RF is far from resonance ($|\Delta B| \gg B_1$) the magnetization is not significantly perturbed.

In this case it is possible to nutate the spin into the transverse direction and leave it there (a "90° pulse") or invert its orientation entirely (a "180° pulse," etc.).
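A minimal numerical illustration of this effective-field picture, with assumed field values: it evaluates $B_e$ and the tilt angle $\theta$ for several offsets, and the on-resonance duration of a 90° pulse ($\gamma B_1 t = \pi/2$).

```python
# Effective field in the rotating frame and the 90-degree pulse length.
import math

gamma = 2.6752e8                 # rad s^-1 T^-1 (proton, assumed)
B1 = 1.0e-4                      # tesla (illustrative RF amplitude)

for dB in (1.0e-3, 1.0e-4, 0.0):              # far off, near, on resonance
    Be = math.hypot(B1, dB)                   # |B_e|
    theta = math.degrees(math.atan2(B1, dB))  # tilt from the z-axis
    print(dB, Be, theta)                      # theta -> 90 deg as dB -> 0

t90 = (math.pi / 2.0) / (gamma * B1)
print(t90 * 1.0e6, "microseconds for a 90-degree pulse")
```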

The phenomenological description of the collective behavior of a system of weakly interacting spins was provided by a set of equations given by Bloch. Although they fail quantitatively in a number of specific instances (most notably for spin systems in solids or other strong-interaction situations), they offer an enormously useful and generally applicable conceptual framework:

$$\frac{dM_x}{dt} = \gamma(\mathbf{M}\times\mathbf{B})_x - \frac{M_x}{T_2}$$

$$\frac{dM_y}{dt} = \gamma(\mathbf{M}\times\mathbf{B})_y - \frac{M_y}{T_2}$$

$$\frac{dM_z}{dt} = \gamma(\mathbf{M}\times\mathbf{B})_z - \frac{M_z - M_0}{T_1}. \qquad (2)$$

Basically, the Bloch equations describe the time behavior of the components of the magnetization M (summation over the individual microscopic magnetic moments) in the rotating frame. $M_0$ is the magnitude of the magnetization in thermal equilibrium with the "lattice" (i.e., the remaining motional degrees of freedom of the material in which the spin system resides), given by the Curie law

$$M_0 = \frac{CB_0}{T}, \qquad (3)$$

$C$ being the Curie constant and $T$ the absolute temperature. In thermal equilibrium, the magnetization must be parallel to the static field (transverse components zero).

The cross-product terms in Eq. (2) represent the precessional or nutational behavior discussed above. The remaining terms are damping terms which describe the tendency of the system to return to thermal equilibrium. The time-dependent solution of the equations starting with an arbitrary M yields a z-component $M_z$ which decays exponentially toward $M_0$ with a time constant $T_1$ (the longitudinal or "spin-lattice" relaxation time). The transverse components $M_x$ and $M_y$ decay exponentially to zero with time constant $T_2$ (transverse or "spin-spin" relaxation time). Longitudinal relaxation involves exchange of energy with the lattice, while transverse relaxation involves energy exchange between spins or dephasing of precessing components; hence the differentiation in relaxation times. It must always be true that $T_1 \geq T_2$. A low power ($B_1 \ll (\gamma^2 T_1 T_2)^{-1/2}$) steady-state solution gives a nonzero transverse magnetization parallel to $\mathbf{B}_e$ which may exhibit saturation (disappearance of the transverse component for sufficiently large values of $B_1$ near resonance).
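The following sketch integrates the Bloch equations (2) in the rotating frame with a simple Euler step: an on-resonance $B_1$ is applied just long enough for a 90° nutation and then switched off, after which the relaxation terms restore thermal equilibrium. All parameter values are arbitrary illustrative assumptions.

```python
# Euler integration of the Bloch equations in the rotating frame.
import math

gamma = 2.0 * math.pi          # magnetogyric ratio (arbitrary units)
B1, dB = 0.25, 0.0             # RF amplitude and resonance offset
T1, T2, M0 = 5.0, 2.0, 1.0     # relaxation times and equilibrium value
Mx, My, Mz = 0.0, 0.0, M0      # start at thermal equilibrium
dt = 1.0e-4
t90 = (math.pi / 2.0) / (gamma * B1)   # nominal 90-degree pulse length

for step in range(int(20.0 / dt)):
    t = step * dt
    Bx, By, Bz = (B1 if t < t90 else 0.0), 0.0, dB
    # torque terms gamma*(M x B) plus the damping terms of Eq. (2)
    dMx = gamma * (My * Bz - Mz * By) - Mx / T2
    dMy = gamma * (Mz * Bx - Mx * Bz) - My / T2
    dMz = gamma * (Mx * By - My * Bx) - (Mz - M0) / T1
    Mx += dMx * dt; My += dMy * dt; Mz += dMz * dt

print(Mx, My, Mz)   # M has returned nearly to (0, 0, M0) after ~4*T1
```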

The Bloch equations also suggest ways of experimentally observing magnetic resonance phenomena. The CW (continuous wave) technique is based on the steady-state solution. The static transverse component of M in the rotating frame corresponds in the laboratory frame to an oscillating transverse magnetization at the frequency of the applied field. It may be observed as a net power absorption by the sample or the appearance of a coupling between two orthogonal transverse radiofrequency (RF) coils containing the sample as field and applied RF are slowly brought into resonance according to Eq. (1) (either field or frequency may be swept). The transient response may be observed directly after a suitable (e.g., 90°) pulse has been applied. The resulting oscillatory decaying magnetization induces a voltage in a coil surrounding the sample (which may be the same coil used to apply the pulse). This signal is known as the free induction decay (FID), and contains components from all spins which were significantly affected by the pulse. The CW spectrum may be recovered by Fourier transformation of the FID (see FOURIER ANALYSIS).
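A short sketch of the Fourier-transform route just described: it synthesizes a free induction decay containing two assumed resonance offsets with a common $T_2$, then recovers the spectrum by discrete Fourier transformation.

```python
# Synthetic FID and its Fourier-transform spectrum.
import numpy as np

dt, n = 1.0e-4, 8192                  # dwell time (s) and number of points
t = np.arange(n) * dt
T2 = 0.05                             # s, assumed decay constant
fid = (np.exp(2j * np.pi * 440.0 * t) +        # line at +440 Hz
       0.5 * np.exp(2j * np.pi * -800.0 * t))  # weaker line at -800 Hz
fid *= np.exp(-t / T2)                # exponential decay -> Lorentzian lines

spectrum = np.fft.fftshift(np.fft.fft(fid))
freqs = np.fft.fftshift(np.fft.fftfreq(n, dt))
print(freqs[np.argmax(np.abs(spectrum))])      # ~440 Hz, the stronger line
```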

The CW method is traditional for observation of nuclear magnetic resonance (NMR) and is still almost exclusively employed for electron spin resonance (ESR, or electron paramagnetic resonance, EPR). Over the past decade (1970s), pulsed, or Fourier transform, techniques have gradually supplanted CW in chemical NMR applications because of the higher potential sensitivity for a given measurement time (all of the spectral information is "captured" in one short time interval) and the convenience with which computer-based signal averaging may be adapted. (See Figs. 3-5.)

The great preponderance of magnetic resonance measurements are in NMR and ESR. Every chemical element has at least one observable isotope. Common examples are listed in Table 1. ESR measurements may be made on any substance with net unpaired electron spin density. Examples are organic free radicals, high-spin metal ions, conduction electrons in metals, some doped semiconductors, as well as systems with long-range collective interactions: ferromagnetic, antiferromagnetic, and ferrimagnetic materials. The spontaneous polarization of ferromagnetic systems allows the occurrence of NMR and ESR without an external $B_0$ field.

Several types of interactions affect NMR and ESR spectra. The presence of the $B_0$ field induces a circulation of electrons in materials which in turn gives rise to a small opposing field. This field-proportional shielding effect depends sensitively upon the chemical environment and orientation at the spin under observation. In NMR the effect is called the chemical shift, and is always reported with respect to a reference compound of the isotope in question; typical ranges of chemical shifts vary from 10 ppm (parts per million of field or frequency) for the three hydrogen isotopes to 250 ppm for carbon to several thousand ppm for the heavier elements or for shifts arising from conduction electrons in metals (these are called Knight shifts).

FIG. 3. Typical solution NMR spectra. (a) Proton 60 MHz ($B_0$ = 1.409 T) continuous wave spectrum of diethyl ether. Positions of spectral patterns are due to the chemical shift; line splittings within patterns are due to J couplings. Integral of each spectral pattern is proportional to the number of protons giving rise to that resonance. (b) $^{13}$C 75.43 MHz ($B_0$ = 7.046 T) Fourier transform spectrum of ethylbenzene. Couplings to protons are removed by continuous irradiation of protons simultaneous with acquisition of the carbon free induction decay. Carbon-carbon couplings do not appear because the low concentration of the isotope (about 1% of $^{12}$C) makes the occurrence of $^{13}$C pairs rare. Solvent is deuterochloroform, CDCl₃. Chemical shift reference, tetramethylsilane (TMS), (CH₃)₄Si, has been added.

This exquisite sensitivity to chemical effects has made NMR one of the most powerful and ubiquitous techniques of chemical analysis and research. Shieldings in ESR are reported as g values, which range from roughly 1.9 to 2.2. Shifts arising from bulk magnetic susceptibility are also observed.

Interactions also occur between all the magnetic dipoles in systems, such as between nuclei of the same isotope (homonuclear dipole-dipole), between different isotopes (heteronuclear dipole-dipole), between nuclei and unpaired electrons, or between electrons.

FIG. 4. Typical solid state Fourier transform NMR spectra. (a) $^{13}$C spectrum of tertiary butanol, employing proton decoupling [see Fig. 3(b)]. The widths and shapes of the lines are due to the anisotropy of the chemical shift. (b) $^{2}$H spectrum of deuterated hexamethylbenzene. The width and shape of this spectrum are due to the anisotropy of the quadrupole splitting interaction.

These interactions may occur directly through space, or may be "conducted" through chemical bonds as a slight bias in spin polarization (indirect or J coupling when speaking of NMR, hyperfine coupling when speaking of couplings to nuclei in ESR spectra).

Couplings between molecular rotation-induced moments and nuclei or electrons can also occur (spin-rotation coupling). These are usually most apparent in their effect on relaxation (see next paragraph). Nuclei with $I > \frac{1}{2}$ have nonspherical charge distributions, and therefore interact with electric field gradients in materials. Quadrupole couplings can range from zero, to small perturbations on NMR spectra, to values which completely dominate the nuclear Zeeman energy. In the latter case, the quadrupole interaction can serve as the source of nuclear polarization rather than the $B_0$ field, making possible "zero-field" NMR (normally called nuclear quadrupole resonance, NQR).

All of these interactions may be time dependent due to atomic or molecular motions, or due to natural or experimentally induced motions of some of the magnetic dipoles in the system.

FIG. 5. Typical solution ESR spectrum (9.535 GHz, X-band). Most ESR spectra are obtained by a continuous wave method in the derivative mode. Spectrum of parabenzosemiquinone radical anion in alkaline ethanol. Line splitting (hyperfine coupling to the four protons) is 2.368 gauss; g value is 2.005.

TABLE 1. MAGNETIC RESONANCE DATA FOR SELECTED PARTICLES.

Particle  Spin  Magnetic Moment      Larmor Frequency in   Electric Quadrupole     Natural Isotopic  Relative Receptivity  Relative Receptivity at
                in Units of mu_N (a) 2.3487-T Field, MHz   Moment (b), 10^-28 m^2  Abundance         per Particle (c)      Natural Abundance (d)

e         1/2   1.7340593 mu_B       65,821.07             0                       1.000             2.08 x 10^8           2.08 x 10^8
mu        1/2   1.7340706 mu_mu      318.33                0                       0                 32.3                  0
n         1/2   -3.313670            68.51                 0                       0                 0.322                 0
1H        1/2   4.873505             100.00                0                       1.000             1.000                 1.000
2H        1     1.2125               15.35                 2.73 x 10^-3            1.5 x 10^-4       9.65 x 10^-3          1.45 x 10^-6
3H        1/2   5.1595               106.66                0                       0                 1.21                  0
13C       1/2   1.2166               25.15                 0                       1.11 x 10^-2      1.59 x 10^-2          1.76 x 10^-4
14N       1     0.5706               7.22                  0.016                   0.996             1.01 x 10^-3          1.01 x 10^-3
15N       1/2   -0.4900              10.13                 0                       3.7 x 10^-3       1.04 x 10^-3          3.85 x 10^-6
19F       1/2   4.5509               94.08                 0                       1.000             0.833                 0.833
23Na      3/2   2.8610               26.45                 0.12-0.15               1.000             9.25 x 10^-2          9.25 x 10^-2
29Si      1/2   -0.9612              19.87                 0                       4.70 x 10^-2      7.85 x 10^-3          3.69 x 10^-4
31P       1/2   1.9581               40.48                 0                       1.000             6.63 x 10^-2          6.63 x 10^-2
43Ca      7/2   -1.4914              6.73                  0.2 +/- 0.1             1.45 x 10^-3      6.40 x 10^-3          9.28 x 10^-6
121Sb     5/2   3.9537               23.93                 -0.5 to -1.2            0.573             0.160                 9.17 x 10^-2

(a) Magnitude of magnetic moment vector $\mu = \gamma\hbar[J(J+1)]^{1/2}$. Positron and muon moments in units of their respective magnetons. Values for all nuclei except $^1$H are those observed in specific chemical compounds, uncorrected for shielding.

(b) Electric quadrupole moments are often known with only poor accuracy, due to certain assumptions which must be made in their experimental determination, and due to conflicting results from different experimental techniques. The table values represent a range from several sources.

(c) Approximate detection voltage signal-to-noise ratio, assuming equal $B_0$ fields, line shapes, relaxation times, measurement bandwidths, and total measurement noise, relative to protons; proportional to $\omega_0^3 J(J+1)$.

(d) Product of natural isotopic abundance and relative receptivity per particle.
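Footnote (c) can be checked directly against the table: the sketch below forms $\omega_0^3 J(J+1)$ relative to protons from the tabulated Larmor frequencies and spins.

```python
# Relative receptivity per particle, referenced to 1H (100.00 MHz, J = 1/2).
def receptivity(nu_MHz, J, nu_H=100.00, J_H=0.5):
    return (nu_MHz / nu_H) ** 3 * (J * (J + 1)) / (J_H * (J_H + 1))

print(receptivity(25.15, 0.5))   # 13C:  ~1.59e-2, as tabulated
print(receptivity(94.08, 0.5))   # 19F:  ~0.833
print(receptivity(26.45, 1.5))   # 23Na: ~9.25e-2
```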


Depending on the type of interactions, such motions can be the source of relaxation (e.g., $T_1$ and $T_2$ in the Bloch equations) between specific types of spins or between spins and the lattice.

Aside from chemical and physical research and measurement, other applications of magnetic resonance include measurement of magnetic fields (proton magnetometers) and the use of ferrites (high RF resistivity ferrimagnetic ceramics of general formula MO·Fe₂O₃, M a divalent cation) as magnetic field-controllable microwave switches, phase shifters, attenuators, filters, circulators, and isolators.

Most recently, magnetic resonance has been employed for noninvasive clinical imaging of the human body (zeugmatography). Lauterbur first demonstrated that the NMR spectrum of a compound with a single spectral line (e.g., protons in intracellular water) is the projection of the distribution of the compound in the object if a linear $B_0$ field gradient, rather than a uniform $B_0$ field, is applied across the object. Multiple projections in several directions may be used to reconstruct an image using algorithms similar to those employed in x-ray or emission computed tomography.

The formal description of magnetic resonance given above has been applied (by Feynman, Vernon, and Hellwarth) to the generalized two-level quantum system, and has proved extremely useful in understanding coherence phenomena in microwave, infrared, and optical spectroscopies.

JEROME L. ACKERMAN

References

Carrington, A., and McLachlan, A. D., "Introduction to Magnetic Resonance with Applications to Chemistry and Chemical Physics," New York, Harper and Row, 1967.

Slichter, C. P., "Principles of Magnetic Resonance," Berlin, Springer-Verlag, 1978.

Abragam, A., "The Principles of Nuclear Magnetism," Oxford, Oxford Univ. Press, 1961.

Becker, E. D., "High Resolution NMR," New York, Academic Press, 1980.

Helszajn, J., "Principles of Microwave Ferrite Engineering," London, Wiley-Interscience, 1969.

Kaufman, L., Crooks, L. E., and Margulis, A. R. (Eds.), "Nuclear Magnetic Resonance Imaging in Medicine," New York, Igaku-Shoin, 1981.

Lee, K., and Anderson, W. A., "Nuclear Spins, Moments and Magnetic Resonance Frequencies," in "The Handbook of Chemistry and Physics" (R. C. Weast, Ed.), Cleveland, The Chemical Rubber Company.

Steinfeld, J. I., "Molecules and Radiation," New York, Harper & Row, 1974.

Cross-references: COHERENCE, ELECTRON SPIN, FERRIMAGNETISM, FOURIER ANALYSIS, LASERS, MAGNETISM, MASERS, RESONANCE.


MAGNETISM

Magnetization Magnetic fields are produced both by macroscopic electric currents and by magnetized bodies. The first observed manifestations of magnetism were the forces between naturally occurring permanent magnets, and between these and the earth's field. North- and south-seeking poles could be identified. Poles were observed to be localized near the ends of long rods magnetized by contact with natural magnets or by a current-carrying coil. From the observed attraction and repulsion of unlike and like poles with an inverse square law came the concept of pole strength and the definition of the unit pole, that which acts on another in vacuum with a force of one dyne at a distance of one centimeter. The unit magnetic field, the oersted, could then be defined as that in which a unit pole experiences a force of one dyne. The magnetic moment of a long, uniformly magnetized rod of length $l$ with a pole strength of $m$ unit poles at each end is defined as $ml$, the largest couple that the sample can experience in a field of one oersted. The magnetization, M, is defined as the magnetic moment per unit volume, $ml/al$, where $a$ is the cross-sectional area, and is thus also equal to the pole strength per unit area, $m/a$.

Magnetic Induction The induction, or flux density, B, is numerically equal to the field H in free space and is described as one line of flux per square centimeter for a field of one oersted. Its direction is that of the force on a unit north pole. If magnetic material is present, the flux density is equal to $H + 4\pi M$, since $4\pi$ lines of force emanate from the unit pole at each end of a dipole equivalent to a specimen of unit magnetization. Magnetic poles are observed to occur in pairs. Lines of B are continuous, i.e., div B = 0. If a material becomes strongly magnetized in a small field, the lines of flux can be considered to crowd into the material, leaving their original locations and reducing the field there. This is how magnetic shielding is accomplished. Changes of B within a coil induce voltages which can be measured and form the basis of galvanometer and fluxmeter measurement methods. For a coil of $N$ turns of cross-sectional area $a$, in which the flux is changing at $dB/dt$ gauss/sec, $E$ in volts is given by

$$E = -10^{-8} N a \frac{dB}{dt}$$
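A worked instance of this relation, with assumed coil parameters:

```python
# E = -1e-8 * N * a * dB/dt, with E in volts, a in cm^2, B in gauss.
N = 500            # turns (assumed)
a = 2.0            # cross-sectional area, cm^2 (assumed)
dBdt = 1.0e4       # gauss per second (assumed)

print(-1.0e-8 * N * a * dBdt)   # -0.1 volt
```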

Forces on magnetic bodies in field gradients are proportional to M.

Types of Magnetic Behavior In general a field H will produce a magnetization M in any material. If M is in the same direction as H, a sample will be attracted to regions of stronger field in a field gradient. It will be repelled if M is in the opposite sense. This experiment, as first performed by Faraday, is the basis for the broad classification of materials into paramagnetic, diamagnetic, and ferromagnetic. The susceptibility $\kappa$ is defined as $M/H$. The force $F_x$ on a small specimen of volume $v$ in a field $H_y$ and a field gradient $dH_y/dx$ is

$$F_x = (\kappa_2 - \kappa_1)\, v\, H_y\, \frac{dH_y}{dx}$$

where $\kappa_2$ and $\kappa_1$ are the volume susceptibilities of the specimen and the surrounding medium, usually air.

For paramagnetic materials, $\kappa$ is small and positive, corresponding to permeabilities usually between 1 and 1.001 at ordinary temperatures. These substances contain atoms or ions with at least one incomplete electron shell, giving them a nonzero atomic or ionic magnetic moment $\mu_a$. Many salts of the iron-group and rare-earth metals are paramagnetic, as are the alkali metals, the platinum and palladium metals, carbon, oxygen, and various other elements. Antiferromagnetic substances also have small positive $\kappa$, as do ferromagnetics above their Curie temperatures. In the classical theory of paramagnetism, the orientations of the moments are considered to be initially thermally randomized in space. An applied field produces a net magnetic moment in its direction, as described by the classical Langevin function

$$\frac{M}{M_s} = \coth\!\left(\frac{\mu_a H}{kT}\right) - \frac{kT}{\mu_a H}$$

where $k$ is the Boltzmann constant and $M_s$ is the value of M attained for very large $H/T$. Under most conditions, only the initial portion of this curve is observed, with the corresponding constant $\kappa$. The conduction electrons at the top of the Fermi distribution in a metal can also give rise to a temperature-independent Pauli paramagnetism. The quantum-mechanical analogue of the Langevin function is called the Brillouin function (see PARAMAGNETISM).
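The shape of the Langevin curve is easy to examine numerically: the sketch below evaluates $L(x) = \coth x - 1/x$ with $x = \mu_a H/kT$ and compares it with the small-argument limit $x/3$, the linear regime that yields the constant susceptibility mentioned above.

```python
# Langevin function versus its initial slope x/3.
import math

def langevin(x):
    return 1.0 / math.tanh(x) - 1.0 / x

for x in (0.01, 0.1, 1.0, 10.0):
    print(x, langevin(x), x / 3.0)   # tracks x/3 for x << 1, saturates at 1
```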

For diamagnetic materials, $\kappa$ is small and negative. Diamagnetism is a universal phenomenon but is often masked by paramagnetic or ferromagnetic effects. Net diamagnetic behavior is observed in a number of salts and metals, and in the rare gases, in which there is no net moment. The effect can be regarded as the operation of Lenz's law on an atomic scale (see DIAMAGNETISM).

Ferromagnetic materials show a value of M which may be of the order of $10^3$ in small fields. Thus $\kappa$ can be very large. It is common to describe their properties in terms of the permeability $\mu = B/H$. Since M saturates in ordinary fields, $\kappa$ and $\mu$ are not constant. M is not necessarily in the same direction as H, so $\kappa$ and $\mu$ are, in general, tensors. Furthermore, ferromagnetics generally exhibit hysteresis in the dependence of M on H, and the details are very structure-sensitive. Still another distinction is the rather abrupt disappearance of ferromagnetism at a characteristic temperature, the Curie temperature, $T_c$.


Atomic Magnetic Moments There are two possible sources for the moments of individual atoms. They are electron orbital motion and electron spin (see ELECTRON SPIN). In most ferromagnetic materials, most of the moment comes from spin rather than orbital motion, a fact that is revealed experimentally by gyromagnetic measurements (see FERROMAGNETISM) and by magnetic resonance experiments (see MAGNETIC RESONANCE). Orbital motions are quenched by the electric fields of the neighboring atoms in the crystal lattice. In the rare earth metals the unfilled shell is deep within the atom, orbital motion is not quenched, and the orbital contribution to the magnetic moment is observed. The unit of atomic moment is the Bohr magneton, $\mu_B$, which is the moment associated with one electron spin, numerically equal to $0.9274 \times 10^{-20}$ erg/oersted. The spin quantum number, $S$, is one-half the number of unpaired electrons. The moment per atom is $Sg\mu_B$ where $g$ is the gyromagnetic ratio, close to 2 for most materials. The moment in Bohr magnetons of an isolated atom or ion of the first transition series is equal to the number of unpaired d electrons, considering the first five electron spins to have one orientation and the next five the opposite (Hund's rule). The Ni⁺⁺ ion, with eight d electrons, has the expected moment of $2\mu_B$ in ferrites, in which the ionic spacing is great enough so that the d levels are not disturbed (see FERRITES). In metallic nickel, however, the d levels overlap considerably, and the moment corresponds to only $0.6\,\mu_B$ per atom. Similarly the Bohr magneton numbers for metallic iron and cobalt are 2.2 and 1.7 respectively.

Ferromagnetism Ferromagnetism can only occur in a material containing atoms with net moments. Also, quantum-mechanical electrostatic "exchange" forces must be present, holding neighboring atomic moments parallel below the Curie temperature. These are much greater than the Lorentz force due to the average magnetization and are, in fact, equivalent to an effective field on the order of $10^6$ oersteds. Such an effective "molecular field" was postulated in 1907 by Weiss in extending the Langevin theory of paramagnetism to include ferromagnetic behavior. The Langevin function predicts a temperature dependence of magnetization, for small M, of

$$M = \frac{CH}{T}$$

where C is a constant. The susceptibility is then $C/T$, which is Curie's law. Weiss pointed out that if the field H were augmented by an additional field $NM$ proportional to the magnetization, the temperature dependence became

$$M = \frac{CH}{T - T_c}$$

where $T_c = NC$. This is the Curie-Weiss law, approximately obeyed by ferromagnetic substances above their Curie points. Below $T_c$, the presence of the molecular field produces an alignment of the atomic moments corresponding to the spontaneous magnetization $M_s$ even when no external field is present. However, ferromagnetic samples can have any net externally measured value of magnetization, including zero, which seems to contradict this result. Weiss therefore postulated the existence of domains separated by boundaries. In each domain the atomic moments are parallel, the domain magnetizations having different orientations. The net external magnetization is then the vector sum of the domain magnetizations and can be varied by a rearrangement of the domain structure, which may happen in very small applied fields. This prediction has been completely verified by experiment. The motion of domain boundaries as observed under the microscope has been directly correlated with external changes in magnetization. Domain boundaries in iron are on the order of 1000 Å thick. Within a boundary, neighboring magnetic moments are not quite parallel. The change in orientation of the magnetization from one domain to another is distributed through the thickness of the boundary.

Within a domain, the magnetization will in general preferentially lie along some particular crystallographic direction. The energy difference between magnetization in the easiest and hardest direction may exceed $10^8$ erg/cm³. This anisotropy is described in an appropriate trigonometric series with coefficients $K_i$. Usually only a few terms are necessary. Often a material is described by a single $K$; this implies a uniaxial anisotropy energy of the form $K\sin^2\theta$. The $K_i$ may pass through zero and change sign with changing composition or temperature. Although such details cannot in general be predicted, the magnetocrystalline anisotropy will have the same over-all symmetry as the crystal structure. Anisotropy is best investigated in single crystals, by analysis of magnetization curves in various directions or from the relationship between the measured torque and the direction of the applied field (see FERROMAGNETISM). Dimensional changes are also associated with the position of the magnetization vector relative to the lattice (see MAGNETOSTRICTION).

There are two mechanisms available for changing the externally measured magnetization of a ferromagnetic material: domain boundary motion, and domain magnetization rotation. Broadly speaking, in magnetically soft materials, boundary motion accounts for most of the changes in low applied fields, leaving the magnetization in each domain in the easy direction nearest the applied field. Then rotation against anisotropy produces the remaining change in higher fields. In very low fields, boundary motion is practically reversible, but when boundaries move considerable distances, they experience a net drag from impurities and irregularities in the material, causing hysteresis in the dependence of B on H. There will in general be a remanence $B_r$, the flux density remaining after saturation when the field is reduced to zero, and a coercive force $H_c$, the reverse field required to reduce the flux density to zero. A loss associated with the irreversibility of magnetization changes also occurs in rotating fields. This loss becomes zero in very large fields, except in a few special cases.

Even in very slowly changing fields, a wall characteristically moves in jumps, each giving a sudden change in B. This irregularity has been known for a long time as the Barkhausen effect, and its physical origin is the irregularity of wall motion through various inhomogeneities in the material. Usually a very large number of these small jumps takes place. In special circumstances, however, the material may remain at $B_r$ until, in a sufficiently large field, a single wall will be nucleated and sweep all the way across the specimen, leaving it at $B_r$ in the other direction. Such a material has only two stable states, $+B_r$ and $-B_r$, a useful behavior in some applications.

Direct microscopic observation of domains, e.g., by the Faraday effect or the magnetic Kerr effect, is an important research tool. Understanding and control of domain structures has progressed to the point that under appropriate conditions large numbers of tiny cylindrical domains can be deliberately produced and controllably moved in certain single-crystal materials, enabling the development of memory devices utilizing this ability.

It is also necessary to consider the behavior in rapidly varying fields, discussed below.

Antiferromagnetism Exchange forces can operate to hold neighboring moments antiparallel, rather than parallel. Materials whose magnetic moments are arranged in this way show no external permanent moment and are called antiferromagnetic. The sign of the exchange force may depend, among other things, on the atomic spacing. Metallic manganese, for example, is antiferromagnetic, while many alloys of manganese, in which the average Mn-Mn distance is greater, are ferromagnetic. In some antiferromagnetic compounds, the exchange interaction appears to be of a next-nearest-neighbor type, taking place through an intervening atom such as oxygen. This type of interaction is termed superexchange. Antiferromagnetic materials, having no net external moment, show small positive susceptibilities that reach a maximum at the temperature above which the exchange forces can no longer hold the moments aligned against thermal agitation. This temperature, $T_N$, the Néel temperature, corresponds to the Curie temperature of a ferromagnet. Magnetocrystalline anisotropy exists for antiferromagnets just as for ferromagnets (see ANTIFERROMAGNETISM).

Ferrimagnetism With more than one type of magnetic ion present, in certain compounds, antiferromagnetic coupling may lead to a net external moment corresponding to a Bohr magneton number equal to the difference in ionic moments. Other more complicated cases occur. Ferrites, insulating oxides with the spinel structure, are important examples of this class of material, called ferrimagnetics (see FERRIMAGNETISM).

Exchange Anisotropy A ferromagnetic phase may be in exchange coupling with an antiferromagnetic phase, as in a cobalt particle covered with CoO. This leads to new phenomena, including non-vanishing high-field rotational hysteresis. Such a material cooled in a field through the Néel temperature, if $T_c > T_N$, may exhibit a hysteresis loop that is permanently displaced from the origin. This is equivalent to a unidirectional (not uniaxial) anisotropy and will appear in a torque curve as a $\sin\theta$ term. Ferromagnetic and antiferromagnetic regions in a single-phase alloy may also lead to these effects.

Other Configurations Atomic moments need not necessarily be either parallel or antiparallel. In a few materials they may be arranged in a triangular or spiral configuration. In some circumstances, an antiferromagnetic material may shift to a configuration having a large ferromagnetic moment in the appropriate combination of fields and temperatures (metamagnetism).

Rare Earths The rare earth elements have magnetic moments originating from unpaired electrons in the 4f shell. These electrons are close to the nucleus and are shielded by the 5s and 5p electrons. Thus direct exchange does not occur in the rare earths. However, several of them exhibit ferromagnetism at low temperatures, originating in indirect exchange via the three 5d-6s conduction electrons. The atomic moments can be large in the heavy rare earths, in which the spin and orbital moments add. In fact, Dy and Ho have a moment per unit volume, at low temperatures, half again as large as that of iron. The rare earths often exhibit complex magnetic ordering structures.

The rare earths form many solid solutions among themselves, in which the variation of moment and Curie temperature have been investigated. They also form many intermetallic compounds with other elements. Often a rare earth and another element will form several discrete binary compounds, sometimes showing extraordinary magnetic properties. A number of compounds, including RCo₅, have extremely high magnetocrystalline anisotropy. TbFe₂ and DyFe₂ exhibit the highest magnetostrictive strains known, on the order of 1%, thousands of times higher than values typical of other materials.

Amorphous Materials It is possible by rapid quenching from the molten state to prepare metallic samples whose atomic structure is amorphous. The atoms are not arranged in a crystal lattice but are randomly packed as in a glass. The saturation magnetizations and Curie temperatures of these materials, although generally somewhat less than those of their crystalline counterparts, are substantial. The demonstration of ferromagnetism in a glassy metallic structure is of great fundamental interest. Furthermore, in such a structure there is no macroscopic magnetic anisotropy. As a result, magnetization changes can take place readily in small fields. Thus these materials can show high permeabilities and narrow hysteresis loops, giving them potential usefulness in various devices.

Permanent Magnets A useful permanent magnet material should have as large a hysteresis loop as possible. In the early magnet steels, wall motion was made difficult by a heterogeneous alloy structure. A different approach is based on the theory that sufficiently small particles should find it energetically unfavorable to contain domain boundaries. The critical size is proportional to $K^{1/2}/M_s$. Reversal must then proceed by the difficult process of rotation against shape, strain-magnetostriction, or crystal anisotropy. Fine-particle (~1000 Å) iron and iron-cobalt materials utilizing shape anisotropy have been developed. The Alnico permanent magnet alloys have very fine precipitate structures and are probably also best regarded as fine-particle materials. A magnetic oxide, BaO·6Fe₂O₃, utilizes magnetocrystalline anisotropy in fine-particle (~1 μm) form. A new class of permanent magnet materials based on Co₅-(rare earth) intermetallic compounds shows by far the highest permanent magnet properties of any material. These originate in the extremely high magnetocrystalline anisotropy of these materials.

Thin Films Since a surface atom's surroundings are different from those in the interior, the magnetization and Curie temperature of thin films should yield important information about the range of ferromagnetic interactions. Experimental difficulties, primarily with purity, have beclouded the subject to some extent, but it now appears that any surface layer on nickel having substantially different magnetic properties from the bulk cannot be more than a few Angstroms thick.

There have been many investigations of flux reversal in films, usually vapor-deposited on glass, which have been motivated by computer technology needs. Such films show a uniaxial anisotropy associated with fields present during deposition or sometimes with geometric effects such as the angle of incidence of the vapor beam.

Dynamic Behavior of Ferromagnetic Materials Changes in flux in a conductor induce emf's resulting in current flows whose fields tend to oppose the change in flux. For various time rates and geometries these can be calculated, leading to expressions for phase relationships

and skin depth in conductors (see ELECTROMAGNETIC THEORY). These expressions have often been applied to magnetic materials at power frequencies by simply replacing H by B. This is in general not a good approximation and leads to erroneous results. It is more nearly correct to recognize that highly localized eddy currents around moving domain boundaries are the entire source of loss under these conditions. For a given dB/dt, the loss calculated in this way is much greater, decreasing to the classical value as the density of domain boundaries increases.

In bulk metals, domain wall velocities are usually determined by the damping associated with local eddy currents. In ferrites and thin films, other types of damping may predominate. These and many other aspects of the dynamic behavior of magnetic materials of all types have been investigated through resonance methods (see MAGNETIC RESONANCE).

Superparamagnetism For particles whose volume $v$ is on the order of $10^{-18}$ cm³ or less, the direction of the entire particle moment $M_s v$ may fluctuate thermally. An assembly of such particles will exhibit the Langevin-function magnetization curve of a paramagnet with the extremely large moment $M_s v$; thus it may be easily saturated with ordinary fields and temperatures. Such magnetization curves can be used to study particle sizes and size distributions.
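An order-of-magnitude sketch of why such particles saturate so easily, assuming an iron-like $M_s$ (the text specifies only the volume):

```python
# Field at which mu*H ~ k*T for a single-domain particle moment mu = Ms*v.
k = 1.381e-16        # Boltzmann constant, erg/K
T = 300.0            # K
Ms = 1.7e3           # gauss, assumed iron-like saturation magnetization
v = 1.0e-18          # cm^3, from the text

mu = Ms * v          # particle moment, erg/gauss
print(mu, k * T / mu)   # ~1.7e-15 erg/gauss; ~24 oersteds
```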

Magnetic Bubbles A remarkable application of the principles of domain structure has been realized in the fabrication of computer memory components utilizing tiny cylindrical magnetic domains. In thin monocrystalline plates of a material such as a garnet or in thin amorphous films of certain alloys, having an appropriate combination of magnetization and anisotropy, it is possible to establish stable cylindrical magnetic domains passing completely through the material. These domains, universally referred to as bubbles, can be generated, erased, moved, and sensed by overlays of conducting strips and Permalloy guide patterns. Each bubble is a "bit" of information. The dimensions of these bubbles and the associated patterns are on the order of a few microns. The storage density and access times of these memories appear to fit them to an important range of applications. (See FERRIMAGNETISM.)

JOSEPH J. BECKER

References

Bozorth, R. M., "Ferromagnetism," New York, Van Nostrand Reinhold, 1951.

Kneller, E., "Ferromagnetismus," Berlin, Springer-Verlag, 1962.

Rado, G. T., and Suhl, H. (Eds.), "Magnetism," New York, Academic Press. Vol. I, 1963; Vol. IIA, 1965; Vol. IIB, 1966; Vol. III, 1963; Vol. IV, 1966.

Chikazumi, S., "Physics of Magnetism," New York, John Wiley & Sons, Inc., 1964.


Morrish, A. H., "The Physical Principles of Magnetism," New York, John Wiley & Sons, Inc., 1965.

Berkowitz, A. E., and Kneller, E. (Eds.), "Magnetism and Metallurgy," New York, Academic Press, 1969.

Nesbitt, E. A., and Wernick, J. H., "Rare Earth Permanent Magnets," New York, Academic Press, 1973.

Bobeck, A. H., Bonyhard, P. I., and Geusic, J. E., "Magnetic Bubbles-An Emerging New Memory Technology," Proc. IEEE 63, 1176-1195 (1975).

Wohlfarth, E. P. (Ed.), "Ferromagnetic Materials," Vols. 1 and 2, New York, North-Holland Publishing Co., 1980.

Cross-references: AMORPHOUS METALS, ANTIFERROMAGNETISM, DIAMAGNETISM, FERRIMAGNETISM, FERROMAGNETISM, MAGNETIC FIELDS, MAGNETIC RESONANCE, MAGNETOMETRY, RARE EARTHS, THIN FILMS.

MAGNETO-FLUID-MECHANICS

Magneto-fluid-mechanics is the subject that deals with the mechanics of electrically conducting fluids (such as ionized gases and liquid metals) in the presence of electric and magnetic fields. Magneto-hydrodynamics is another name used extensively, but it suffers from the less general meaning of the words "hydro" and "dynamics." Other names used are: magneto-hydro-mechanics, magneto-gas-dynamics, magneto-plasma-dynamics, hydromagnetics, etc.

The fundamental assumptions underlying magneto-fluid-mechanics are those of continuous media. In this respect, magneto-fluid-mechanics is related to plasma physics (see PLASMAS) in the same way that ordinary fluid mechanics is related to the kinetic theory of gases. More specifically, such phenomenological coefficients as viscosity, thermal and electrical conductivities, mass diffusivities, dielectric constant, etc., are assumed to be known functions of the thermodynamic state, as derived from microscopic considerations or experiments.

From electromagnetic theory, we know that the "Maxwell stresses" give rise to a body force made up of the following components: electrostatic (applied on a free electric space charge); ponderomotive (the macroscopic summation of the elementary Lorentz forces applied on charged particles); electrostrictive (present when the dielectric constant is a function of mass density); a force due to an inhomogeneous electric field and its magnetic counterpart; and the magnetostrictive force. For any fluid the last two forces are negligibly small at normal temperatures, whereas the ones associated with the behavior of the dielectric constant, although normally small, are of the same order of magnitude as the buoyant forces under certain conditions. On the assumption that we deal with electrically neutral but ionized fluids,


the only substantial force that remains is the ponderomotive force. Indeed, what is today called magneto-fluid-mechanics deals almost exclusively with this force.

Fundamental Equations The equations that govern magneto-fluid-mechanics are the following:

(1) Equation of conservation of mass, which is the same as in ordinary fluid mechanics.

(2) Equation of conservation of momentum, which is altered by the forces enumerated above. In particular, the ponderomotive force per unit volume is given by $\mathbf{J} \times \mathbf{B}$ where J is the vector current density and B the magnetic induction, both measured in the laboratory.

(3) Equation of energy conservation; the same as in ordinary fluid mechanics with the addition of the Joulean dissipation $\mathbf{E}' \cdot \mathbf{J}'$. The primes indicate that the electric field and current density are measured in a frame of reference moving with the fluid. In the nonrelativistic case and for zero space charge, we have $\mathbf{E}' = \mathbf{E} + \mathbf{q} \times \mathbf{B}$ and $\mathbf{J}' = \mathbf{J}$. The barycentric stream velocity is indicated by q.

(4) Equation describing the thermodynamic state.

(5) Conservation of electric charge.

(6) Ampère's law.

(7) Faraday's law.

(8) Statement that the magnetic poles exist in pairs only.

(9) Ohm's phenomenological law.

(10) Constitutive equations linking the electric field with the displacement vector and the magnetic intensity field with the magnetic induction.

Equations (1) to (3) are the conservation equations. Equations (5) to (8) are Maxwell's equations. For a large number of problems, the phenomenological coefficients of electrical and thermal conductivity, viscosity, and the like are assumed to remain unaffected by the magnetic field. This implies that the collision frequencies among the particles are much higher than the cyclotron frequency associated with the tendency of a charged particle to rotate around a magnetic field line under the influence of the Lorentz force. This means that the transfer of electric charge, mass, momentum, and energy is not realized in a preferential direction.

Physically, the magneto-fluid-mechanic system of the above equations is coupled in the follow­ing sense: A velocity field q cutting magnetic lines of flux B gives rise to an induced current whose magnitude is given by J = a(q X B). At the same time the fluid feels an induced body force equal to J X B. On the other hand, the electric currents induced by the motion create, according to Ampere's law, a magnetic field which distorts the original applied mag­netic field. The basic mechanism of this distor­tion is the one created by the irreversibility in­troduced by the finite electrical conductivity, the same way that the distortion of the inviscid streamlines in ordinary fluid mechanics takes place by the action of viscosity.

688

Nondimensional Parameters and Some Im­portant Theorems In order to study the nature of the solutions as they emerge from different problems, we shall form a number of non­dimensional parameters that can be extracted from the different equations. The order of magnitude of the inertia force per unit volume is given by p V 2 I L where p is the mass density, V the velocity, and L a characteristic length; the order of magnitude of the ponderomotive force J X B, after using Ohm's law, is equal to OB2 V. Also, the order of magnitude of the viscous force is: f..l VIL 2 • The ratio of the typical inertia force over the viscous force is called the Reynolds number (Re) and, from the above, is found to be: Re = p VLlf..l. The ratio of the ponderomotive force over the inertia force is given by ~ = oB 2LlpV. The ratio of the pon­deromotive force over the viscous force is equal to (Re) . ~ and is defined in the literature as the square of the "Hartmann number.," denoted by M. We have M = BL Val";;;.

The distortion of the magnetic field due to the hydrodynamic field can be studied best with the help of the following two equations:

dQ f..l - = (Q • V)q + - V 2 Q dt P

dH I - =(H' V)q+ - V2H dt Of..le

In the above f..le is the magnetic permeability and H the magnetic field intensity. The first equation describes the diffusion of vorticity Q = V X q, whereas the second can be obtained by a combination of Ampere's and Ohm's laws after elimination of the electric field by using Faraday's law. In the ordinary fluid mechanic case, the streamlines obtained after solving the inviscid problem are distorted in regions of high vorticity through the mechanism of viscosity (last term in first equation). Similarly the mag­netic field calculated in the case of ideal, non­dissipative flow with 0 = 00 is distorted by the finite electrical conductivity (last term in sec­ond equation). The nondimensional number describing the influence of viscosity is the Reynolds number, and in perfect analogy as indicated by the above two equations, the magnetic field distortion is described by the number, (Re)m = f..le0VL and is called the mag­netic Reynolds number. When (Re)m is zero, the magnetic lines remain undisturbed, whereas in the limit (Re)m -+ 00, the magnetic lines are frozen into the fluid in exactly the same way that vorticity is frozen according to Helmholtz's theorem. Mathematical similarities apart, the freezing of the magnetic lines with the motion is evident in the case of 0 -+ 00 from the follow­ing physical considerations: An observer moving with the barycentric (stream) velocity in a me­dium of infinite electrical conductivity can mea­sure only a zero electric field and hence he does

689

not cut magnetic lines, which means that the magnetic lines must move along with his speed. From this argument it also follows that the total change in the magnetic flux through a given surface moving with the stream must be zero for an infinitely conducting medium. Finally, the remark should be made that except for stellar and interspace applications where the velocities and (especially) characteristic lengths are high, (Re)m is a small number. On the other hand, the assumption (Re)m = 0 is a rather dras­tic one since it permits the uncoupling of Maxwell's equations from the conservation equations. In thermonuclear plasma physics, much of which is analyzed by MHD models, the approximation of frozen field lines is very useful , especially in dealing with phenomena which occur over brief time intervals, such as in MHD instabilities.

For the calculation of the ponderomotive force we can use Ampere's law ell x H =J) to find that J X B = (IJ X H) X B. Through regular vector operations, we can show that (IJ X H) X B = - grad (B 2 /2J.1.e) + div (BB/J.l.e). One can identify the last term as representing a ten­sion equal to B2/2J.1.e acting along the lines of force, whereas the first one corresponds to an equivalent hydrostatic pressure equal to B2 /2J.1.e. This term is frequently called "magnetic pres­sure" and in different problems is found to behave precisely as the static pressure does. Fur­thermore, one can show that if the magnetic lines are lengthened, the magnetic field intensity is increased .

Consider now the propagation of small distur­bances in the form of acoustic waves for which the speed of sound for an ideal gas is propor-tional to -../PiP. Now one can show through a linearization of the equations of conservation, assuming the presence of the magnetic pressure alone, that a small disturbance (for a gas of infinite electrical conductivity) will be prop­agated, in perfect analogy, with a speed equal to v'B2 /PJ.l.e. This is the so-called Alven speed, and these waves are called magneto-fluid me­chanic waves. Of interest also are combinations of several mechanisms of propagation which might include sound and gravitational waves.

Because of the property of the magnetic lines to increase their tension when lengthened, along with the additional ones of distortion and prop­agation of disturbances, their properties are pre­sented in loose terms as resembling very much those of rubber bands.

Applications There are both astrophysical and terrestrial applications of magneto-fluid­mechanics. One of the earliest ones was perhaps suggested by Faraday, who thought to harness the river Thames with electrodes on its banks that would collect the induced electric current resulting from the flow of the river as it cuts the earth's magnetic field perpendicularly. Because of the small electrical conductivity of water, the small magnetic field of the earth and the

MAGNETO-FLUID-MECHANICS

small velocities, the interaction is too weak to be useful. However, in the laboratory, with a mutually perpendicular magnetic field, flow, and induced current density fields, a large inter­action is possible when hot ionized gases are used in conjunction with strong magnetic fields. This area of research is called magneto-hydro­dynamic power generation, and its popularity emerges from the fact that mechanical energy can be converted to electrical without ther­mally stressed rotating parts. As a consequence, higher temperatures can be imparted to the working medium with better thermal effi­ciencies. This scheme, under development now, seems to be limited by losses due to heat transferred from the hot gas to the out­side, corrosion of the electrodes, and Hall current losses. (When the gyrofrequency of the ionized particles is high compared to their col­lisional frequency, the particles drift in a direc­tion parallel to the flow, and as a result, the current to be collected by the electrodes in the direction perpendicular to the flow, di­minishes. The Hall effect can be turned to some advantage if it is designed to be substantial and if the current in the direction of flow is the one to be collected.)

One of the earliest astrophysical applications of magneto-fluid-mechanics lies in the area of solar physics and in particular the sunspots. Sunspots were seen and studied for the first time with the help of a telescope by Galileo about 1610. Three hundred years later, Hale discovered, through the Zeeman effect, that the magnetic field in the sunspots is very high (of the order of several thousand gauss). It was, however, only in the middle 1930s and in particular after the last world war that an explanation was sought in which the magnetic field was involved. At the writing of this article, there is no complete sunspot theory. However, the majority of workers in this area agree on the following rough picture. Because of mechanical equilibrium considerations, the pressure is the same at a given distance from the center of the sun in the sunspot proper or in the photosphere which is free of a magnetic field. This means that the magnetic pressure plus the static pressure in the sunspot region must balance the static pressure in the photo­sphere, a fact that implies that the static pressure in the sunspot region is smaller. If we picture the sunspot magnetic lines to be radial, the pressure gradient in this direction is indepen­dent of the magnetic field and balances exactly the gravitational force per unit volume pg. Hence P is constant inside and outside the sunspot. Since the static pressure is proportional to den­sity and temperature, the above arguments force us to accept a lower temperature inside the spot with a resulting darkening. The only question that rises is whether the order of magnitude of the magnetic pressure is enough for the effect to be significant. This seems to be so. If we assume a magnetic field of 1500 gauss (typical in a sun-

MAGNETO-FLUID-MECHANICS

spot), the magnetic pressure is about 0.1 of an atmosphere which is the typical pressure in the photosphere.

An explanation for the bipolar nature of sun­spots and the difference in the sign of their polarity has been offered. The differential rota­tion of the sun is invoked. The torroidal mag­netic lines of the sun's field lying on its surface are twisted, since for very high electrical con­ductivity, they are frozen with the motion. As a result, the magnetic intensity is amplified and so is the magnetic pressure. Simple considera­tions based on the observed kinematics of the differential rotation establish the location in latitude with time where the intensities will be high enough to give rise to sunspot activity. The result compares favorably with observa­tions. In fact, it can be shown that the sunspot activity migrates, time-wise, from the higher latitudes towards the equator as observations show. Because the twisted field is symmetric with respect to the equatorial plane, this model describes correctly the symmetry of the activity in the north and south hemispheres along with the fact that the polarity between two sym­metric sunspot pairs is opposite in sign.

Efforts have been made to discover the mecha­nism for the generation and maintenance of cosmic magnetic fields, such as fields in stars, the earth, and galaxies. The most promising direction seems to lie in the so called "dynamo theories." Here, some general magnetic field is assumed (not necessarily strong), which upon interaction with the motion of a conducting medium (convective, or motion due to Coriolis forces), induces currents which reinforce the original magnetic field. As the magnetic field is reinforced, the ponderomotive force suppresses the motion until some kind of a steady state for both the motion and the magnetic field is reached.

Magneto-fluid-mechanics also studies problems related to magnetic confinement of plasmas and their stability to small disturbances. Consider for instance the so-called "pinch effect." Here, a strong current is passed through a cylindrical column made up of a plasma. The axial current filaments create an azimuthal magnetic field (the magnetic lines are then rings with the cylinder axis as the locus of their centers,) and as a result, a ponderomotive force is induced which compresses the plasma radially * . Through this confinement, it is hoped that temperatures of the order of 106 to 107 K will be created so that thermonuclear FUSION can take place. Such configurations are normally subject to instabilities. Consider, for instance the case in which a small distortion in the' form of a "kink" is formed in a cylindrical plasma

*Pinch-effect devices are also useful in metallurgy where molten metals can be confined away from solid boundaries in order to remain pure.

690

column, such that the rings in the concave side are pressed together, whereas the rings in the convex side are separated. As a result, the magnetic flux density (and hence the magnetic pressure) will be higher on the concave side resulting in a force tending to increase the concavity. We say that this configuration is unstable, since the force induced by the imposed disturbance acts in a destabilizing direction. Note that in this configuration, the center of curvature of the undistorted plasma boundary cross section falls inside the plasma. One can now create another example in which the curvature of the confining undistorted boundary of the plasma is opposite (the center of curvature falls in. the vacuum) and show that the configuration wIll be stable. We can then state that a sufficient condition for stability is met when the magnetic lines are everywhere convex towards the plasma. If the magnetic lines induced by the currents going through the plasma are in an unstable configuration, externally imposed magnetic fields can be used in order to "stiffen" the configuration.

The "aurora borealis" can be explained in terms of the interaction of the solar wind (due to the continuous expansion of the solar corona with a velocity of about 500 km/sec) with the geomagnetic field. The inertia associated with this "wind" will penetrate the magnetic lines of the earth, only up to the point where the induced magnetic pressure is smaller than these inertial forces. The earth's magnetic field falls off with the inverse third power from the center of the earth. Knowing the mass density and the velocity of the solar wind, we can locate the remotest magnetic line from the earth that is strong enough to stop the penetration of the s?lar co~pusc~es. When this happens, these par­ticles will glide along this magnetic line and eventually will come to the foot of this line at th~ surface of the earth. An elementary compu­tation shows that the latitude of this line is the one where the "aurora borealis" is observed. (see AURORA AND AIRGLOW).

Convective motions can be effectively subdued by ~he presence of a magnetic field. Consider, for mstance, the convection in a thin horizontal layer due to heating from below. Convective cells will be formed when the buoyant force is enough to counterbalance the viscous force of the motion. (These were formerly called Benard cells because it was believed that Benard had observed them. However, the name convective cells is preferred.) At the same time balance of energy dictates that the heat conv~cted up­wards be equal to the heat conducted from the hot source at the bottom. The ratio of these two energies is called the "Rayleigh number" and ~~r a given geometry, it must be higher th~n a cntIcal value for the convective cells to ap­pear. However, when a magnetic field is present the ponderomotive force in general inhibits th~ motion and at the same time changes the geom­etry of the cell. The extent of this inhibition is

691

given by the Hartmann number (defined earlier) so that the critical Rayleigh number is higher for higher Hartmann numbers. Available labo­ratory experimental results reconfirm the find­ings of this theory. On a cosmic scale, it has been hypothesized that the roll-like granulation in the sunspot penumbra is the result of the magneto-fluid-mechanic inhibition of the mo­tion inside regular photospheric convective cells.

Magnetic fields are also known to inhibit · the onset of turbulence. For instance, consider the flow of mercury in a channel. Experiments have shown that the flow can be laminar well above the critical Reynolds number of 2000 or so, if a coil is wrapped around the pipe thus creating an axial magnetic field. The small disturbances perpendicular to the direction of the main stream will be damped out through the action of the induced retarding ponderomotive force.

Many other cosmic scale phenomena seem to be explainable through magneto-fluid-mechan­ics. To list but a few, there are the solar flares and filaments, the spiral structure of some galaxies, the heating of the solar corona, ex­plosion of magnetic stars and many others. Although order-of-magnitude analyses have been suggested to explain some of these phenomena, there are no complete self-consistent theories. Such theories seem to demand a simultaneous satisfaction of all the conservation and electro­magnetic equations-a formidable, if ever pos­sible, task. On the terrestial scale, many ap­plications have been undertaken, and some of them are dependent upon technological develop­ment rather than fundamental physical under­standing. To give a few more examples, in addition to those already mentioned, we list magneto-fluid-mechanic liquid metal pumps and flow meters, propUlsion devices based on the acceleration of a neutral plasma through which a current and a normal magnetic field from the outside are supplied (an area called "plasma propulsion"), or a device in which positive ions (such as the ones easily produced by alkali metals) are accelerated with an electric field (ionic propulsion). Other examples are devices to reduce the heat transfer in reentry objects by using the decelerating action of a magnetic field carried by the vehicle or to use the ponderomotive force as a control force when needed for the navigation of space crafts.

PAUL S. LYKOUDIS

References

Ferraro, V. C. A., and Plumpton, C., "An Introduction to MagnetO-Fluid-Dynamics," London, Oxford Uni­versity Press, Second Edition, 1966.

Alfven, H., and Falthammar, C. G., "Cosmical Electro­Dynamics," Oxford, The Clarendon Press, 1963.

Bateman, G., "MHD Instabilities," Cambridge, Mass., MIT Press, 1978.

MAGNETOMETRY

Cross-references: ASTROPHYSICS, AURORA AND AIRGLOW, CONSERVATION LAWS AND SYM· METRY, FLUID DYNAMICS, FLUID STATICS, HALL EFFECT AND RELATED PHENOMENA, IONIZATION, PLASMAS, SOLAR PHYSICS.

MAGNETOMETRY

Magnetometry is the art of determining ac­curately magnetic fields and the magnetic properties of matter. Both applications are of interest to pure science as well as to technology. The principles employed for measurements are based on the magnetostatic interaction between fields and moments, on voltages induced by flux changes, on the deflection of charge car­riers in fields and on the precession of nuclear and electronic spins in a field.

Matter exposed to a magnetic field of strength H (as produced by a current-carrying solenoid) is magnetized. This phenomenon is described by the vector of magnetization M = N( m), where N is the number of atoms per unit volume having a mean dipole moment (m). The mag­netic induction (flux density) is then given by B = Mo(H + M). One describes the magnetic response of a material by its susceptibility X or its permeability Mr using M = XH and B = MOMrH. It follows that Mr = I + X. In magnetic materials the susceptibility X is a function of temperature. Most often it will depend also on the magnitude and direction (relative to crystal­line axes) of H. Equations are given in SI units, which form a rational system. In all systems of units Mr and X are dimensionless numbers. Mr is the same in SI and emu, but X is smaller by I I( 41T) in emu. Note also that X is related to the num­ber of atomic dipoles per m3 in SI and per cm 3 in emu.

Magnetometry is a still expanding field, mainly because of the many practical aspects of magnetism. New magnetometer designs ap­pear constantly, but especially for absolute measurements the classical systems are still in use with only minor modifications.

Originally magnetometry developed from the interest in geomagnetism, with its importance for navigation. Nowadays exact measurements of variations in the earth's field are important for questions like the dynamics of the inner core of our planet and of the surface of the sun (via changes in the magnetosphere by sun spot activities).

The classical method, devised by· Gauss, de­termines the horizontal intensity of the earth field H - absolutely, with a relative accuracy better than 10-5 . Two measurements must be performed. First the torsional frequency of a small standard magnet having the magnetiza­tion M and being suspended by a torsion fiber is recorded. It is proportional to MH -. Then a small magnetic needle is suspended by the same fiber and its deflection under the action of the stam:'lrd magnet placed at a well defined

MAGNETOMETRY

position is measured. From this M/H- can be determined. In modern systems the stan­dard magnets are replaced by precision coils activated by exactly measured currents (sine galvanometer ).

With the earth inductor, devised by Weber, one obtains the inclination of the earth field. The induced voltage in a rotating coil is sensi­tively measured. If the rotation axis is parallel to the field direction the induced voltage vanishes. Angles of a tenth of a minute of arc can be resolved. Portable systems are available as survey instruments, particularly for prospect­ing of iron ores and related uses.

Highly accurate field measurement (to 10-8 ) are needed for the design of particle beam op­tical systems such as high energy accelerators or particle spectrometers. Similarly, such de­vices require a high temporal constancy of the field which can only be achieved by high gain feedback loops.

Rotating coil gaussmeters are modern descen­dants of the earth inductor. A small coil is wound on a nonmagnetic core and driven by a high speed electromotor. Such systems allow absolute measurements, but the probe averages over the volume covered by the coil. The me­chanics of the electrical contacts can also give rise to problems.

A convenient, simple, rather pointlike field probe (which has to be calibrated) is the Hall­effect gaussmeter. A constant current is sent through a small conductor or semiconductor (2 X 2 X 0.1 mm 3 ). The field component per­pendicular to the current flow will deflect the charge carriers via the Lorentz force and a volt­age perpendicular to field and current is gen­erated. It is proportional to the induction of the field. Most sensitive are semiconductor probes like InSb. Typically a Hall voltage of 5 mV is generated per T at lOrnA current. Such de­vices are now common laboratory equipment. They can be used over a fairly wide tempera­ture range.

An even simpler and cheaper method for coarsely measuring or controlling fields in the laboratory is the measurement of the magneto­resistance of a semimetal wire or film or of the forward diode bias of commercial semicon­ductor diodes at liquid helium temperatures. They have moderate sensitivity and an overall nonlinear response.

A highly precise field measuring device is the NMR magnetometer. One determines the pre­cession frequency of the nuclear magnetic moment of protons (or 7Li) with a standard nuclear magnetic resonance circuit. The extreme sharpness of the resonance line (e.g., of pro­tons in water) allows detection of variations in induction of 10-8 in 1 T. Measurements of broadening of the resonance line give informa­tion on the presence of small field gradients. An example is the field distortion caused by the presence of weak para- or diamagnetic impurities

692

near the probe. The system may thus also be used for measurements of susceptibility. The sensor probe can be as small as 10-3 m!. NMR magnetometers lend themselves for telemetric readout and for incorporation into a field stabilizing circuit.

In so-called volume averaging NMR mag­netometers the protons (e.g., of water or alco­hol) are flowing through a tube in a strong B­field where their spins are longitudinally polar­ized. The fluid then flows through a region with unknown B', where the spins are turned. The remaining longitudinal spin polarization is monitored by standard NMR. The method can be used between 0.1 IlT and 1 T with a typical relative accuracy of 10-7 around 1 mT.

A very sensitive device for measuring ex­tremely weak fields is the Zeeman magnetom­eter. One of its applications is probing inter­planetary fields in space. In the presence of a field the 2S 1/2 ground state of an alkali atom (e.g., Cs vapor) splits into a doublet. The popu­lation of the upper state is increased over the thermal equilibrium by optical pumping with circularly polarized light. The transition to the Zeeman ground state is forbidden. It can be stimulated by application of an rf field. This will bring the population ratio back to thermal equilibrium. The absorption of the pumping light in the vapor is dependent on the popula­tion ratio of the two Zeeman levels. By tracing the absorption maximum as a function of rf frequency one obtains the energy separation of the Zeeman levels and hence the acting field since the atomic moment of the alkali is pre­cisely known. Sometimes a feedback circuit is used to keep the system at exact resonance. Its sharpness allows measurements down to 0.1 nT.

It should be mentioned that most informa­tion on stellar or interstellar fields comes from the observation of the Zeeman splitting of spec­tral lines of certain elements present in stars or interstellar matter. However, the width of spontaneously emitted optical lines is orders of magnitudes wider and thus the sensitivity is down.

Next we mention the magnetometer most widely employed in technical application, the flux gate (or Forstersonde). It consists of two soft magnetic cores in parallel orientation. They are driven to saturation by ac currents of fixed frequency applied to a primary coil wound around each core; the two windings are of opposite direction. Thus in a secondary sensing coil wound around both cores nominally no signal will be present. A superimposed, am­bient dc field will produce a signal with twice the ac frequency in the sensing coil. The ampli­tude of this second harmonic is proportional to the dc field strength, its phase is related to the field direction. Modern systems use an addi­tional field coil excited by the sensing coil signal in a feedback circuit. It compensates the

693

ambient field and brings the sensing circuit back to zero.

Uses range from airborne survey of mineral deposits to minesweeping, submarine detection, treasure hunting (in archeology), and security checks. In these applications often the difference signal between two flux gates separated by a certain distance is monitored. Three devices mounted mutually perpendicular to each other are used to determine the vector components of a field. Magnetometers of this type were flown to the moon during the Apollo mission. The sensitivity of flux gates can be better than 10-5 A/m.

The most widespread use of magnetometry is the study of magnetic properties of matter. In pure science the quest is for the basic prin­ciples of magnetism and thus the electronic structure of matter. In the foreground of tech­nological applications stands the design of new permanent magnets, of magnetic cores of trans­formers and inductances and recently mainly for magnetic storage and recording devices. These technological applications favor mag­netically ordered materials. For example, new technologies appear on the horizon through the use of amorphous magnets. These types of measurements are less concerned with the mag­nitude and spatial distribution of fields but rather with the magnetic parameters of ma­terials such as the susceptibility or the mag­netization. A widely used instrument for such applications is the vibrating sample or Foner magnetometer. The specimen is vibrated per­pendicular to a uniform magnetic field. Two signal coils wound in opposite direction and connected in series are placed around the sample with their axes parallel to the direction of motion. The dipole field of the specimen induces an ac signal in the coils. It is compared after amplification with the signal excited in a second pair of coils by a ferromagnetic reference sample (located usually near the vi­brator) which is moved together with the un­known specimen. Both samples are mounted on a nonmagnetic shaft set into oscillations of ~80 Hz by a loudspeaker system. The mag­netic field may either be generated by an elec­tromagnet or superconducting coils. Careful design of pickup coils and phase-lock noise reduction allow the detection of changes in magnetization down to 10-6 A/m at fields of 106 A/m. The principle of the vibrating coil magnetometer is very similar. The sample stays fixed in a homogeneous magnetic field and the signal coils oscillate perpendicular to the field along the axis of the dipole moments of the sample. Demands on the homogeneity of the field are extremely high.

The other workhorse for studies of mag­netic properties of matter is the magnetic (or Faraday) balance. Superimposed on the mag­netizing uniform field is an inhomogeneous field. The dipole moments induced in a mag-

MAGNETOMETRY

netic material experience a force parallel to the direction of the gradient of the external field. The sample is usually suspended on a thin fiber from one side of the arm of a microbalance mounted some distance above the field-produc­ing magnets. In modern designs a feedback sys­tem is used which keeps the equilibrium by electromagnetic or electrostatic forces acting on the other side of the balance arm. The mea­surement is absolute. Sensitivity is about 10-6 -1 0-7 A/m in good systems. Fields are produced either by an electromagnet with in­clined pole caps or by a system of supercon-ducting coils. .

Both the vibrating sample magnetometer and the magnetic balance allow the variation of sample temperature rather straightforwardly. Sample and vibrating rod (or suspending fiber) are mounted inside a small dewar system or oven. Complete systems are available com­mercially in highly advanced designs.

Another system which makes use of the force exerted on a magnetic material in a nonuniform field is the pendulum magnetometer. The sample is fixed at the upper end of a pendulum rod which is suspended in the middle. A counter­weight is mounted on its lower end. For small amplitudes the force on the magnetic dipole acts as a restoring force and will thus change the frequency of oscillation. Measurements of pendulum frequency are made with and without field. The same basic principle is used in the vibrating reed magnetometer. The sample is attached to one end of the nonmagnetic, me­tallic reed (e.g., Au), while the other end is rigidly fixed. The reed is forced into oscilla­tions by an inhomogeneous ac field super­imposed on the uniform magnetizing dc field. The mechanical vibrations are converted by a piezoelectric transducer to an ac voltage which is proportional to the magnetization of the sample. Using look-in techniques one can re­solve moments down to 10-13 A m2 .

The classical astatic magnetometer is still used to determine the magnetization of rod­shaped samples. Two small, identical magnets are mounted horizontally and antiparallel to each other at the ends of a long, nonmagnetic rod suspended vertically by a weak torsion fiber. This is the measuring system which is unaf­fected by the earth field. Two opposing coils are placed in the plane of rotation of one of the magnets. The aligned axes of the coils stand perpendicular to the axis of the magnet. Their fields cancel exactly at the position of the magnet. The rod shaped sample is inserted into one of the coils. The balance of fields is dis­turbed by its magnetization and the resulting torsional deflection can be read out optically. Moments down to 10-9 A m2 can be determined a bsolu tely.

Critical parameters for commercial mag­nets are the coercive force and the remanence. They are determined by recording the hys-

MAGNETOMETRY

teresis loop (i.e., the B vs H curve for rising and decreasing field intensity) in a simple inductance magnetometer. The sample is inserted into a gap of the core of a toroidal solenoid. The cur­rent through the solenoid gives H. One obtains B by electronically integrating the voltage induced in a concentric pickup coil. Auto­matic systems of this type are common in ma­terials testing laboratories. The sensitivity can be improved by first balancing the signals from two toroids with empty gaps. After in­serting the sample into one gap the difference in outputs is measured.

For comparisons of the high field suscepti­bilities of ferromagnets the orbiting sample magnetometer is an advanced modern system. Several specimens are mounted on a disk which rotates with constant angular frequency of some 10 Hz. The uniform magnetizing field is directed along the axis of rotation. The samples pass successively by a sensing coil. Its output is stored in phase with the rotation. Digital sig­nal averagers can be used to reduce noise.

In single-crystal samples there are directions along which the material is more easily mag­netized. In general they coincide with the principal crystal symmetry directions. The free energy of a crystal thus contains a term de­pendent on the direction of magnetization rela­tive to the crystallographic axes. It is called the magneto crystalline anisotropy (energy) and is usually expressed in a set of parameters referred to as anisotropy constants. They yield basic information on the anisotropy of magnetic interaction of the atomic magnetic moments which in turn is often caused by the anisotropic electron distribution of the orbital ground state.

Magnetic anisotropy can be detected by all static methods for the measurement of magneti­zation if provision is made that the axis of the magnetizing field can be turned with respect to crystal orientation. Also used are ac bridges and ferro-, ferri-, or antiferromagnetic reso­nance. The most commonly employed system is, however, the torque magnetometer. The sample is suspended by a torsion wire between the poles of an electromagnet which can be moved around the sample. The magnetizing field tends to align the sample magnetization along an easy axis, and a torque is thus exerted on the sample, which is read out by optical or capacitive methods or by the variation in re­sistance of a set of strain gauge wires. A set of data is taken by varying the original orien­tation of the specimen relative to the external field.

In the ripple field magnetometer the magneti­zation of the sample is modulated by an ac (~I 00 Hz) "ripple" field superimposed parallel to the dc magnetizing field. A sensing coil oriented at 90 degrees to the field measures the perpendicular magnetization, which is pro­portional to the angular derivative of the anisot-

694

ropy energy density with respect to a reference crystalline axis.

On a similar principle operate rotating sample magnetometers. The specimen (often a sphere) rotates slowly (0.02-0.1 Hz) within a dc field applied perpendicular to its axis of rotation. The variation in flux with rotation is sensed by search coils oriented parallel and perpen­dicular to the dc field. The anisotropy con­stants are derived from the two measured flux components.

Information similar to the magnetic anisotropy can be obtained by measuring the magneto­strictive changes in dimensions of single-crystal samples as a function of magnitude and direc­tion of the applied field. X-ray techniques are usually too insensitive for this purpose, and strain gauges have found the widest use. Capaci­tive read out has also been reported.

A predecessor of this system is the spinner magnetometer. It rapidly rotates a ferromagnetic sample without external field. It is available as a survey instrument for anisotropy studies of rock samples.

Finally we discuss briefly a very modern sys­tem used for special magnetic measurements, the superconducting quantum interference de­vice (SQUID) which is rapidly gaining im­portance. Basically a fluxmeter, it does not, however, measure the flux itself but rather, with extreme accuracy, minute variations in flux. In fact, sensitivity has reached about 10-5

of a single flux quantum (CPo = hj2e = 2.07 X 10-15 Wb).

Applications of the SQUID in physics range from ac measurements of very small magnetic moments (e.g., magnetization in extremely weak fields) to a search for the elusive mag­netic monopole. Measurements of volume sus­ceptibilities down to 10-10 for 1 fJ.g samples have been reported. The SQUID has also be­come the central tool for magnetomedical and -biological research. Examples are magneto­cardiography and the study of fields generated by the action of the human brain.

The SQUID is based on two macroscopic quantum effects in superconductors. The first is flux quantization: The flux trapped inside a superconducting loop must be an integral multiple of CPo. The second is the dc Josephson effect: A superconducting ring is interrupted at one point by a weak link. Examples are a thin (:::; I mm) oxide layer or a point contact. The thickness of the insulating layer is such that electrons can tunnel through it. In a super­conductor a current of Cooper pairs is flowing which requires no voltage across the barrier. Only when a critical current (which can be kept as low as some 10-5 A) is exceeded, a voltage proportional to the tunnel current appears at this so-called Josephson junction. The critical current drops rapidly when a field is applied perpendicular to the flow of Cooper pairs. After this field has increased so much that an

695

additional flux quantum can be brought into the ring, the critical current jumps back to a higher value. These periodic discontinuities occur because the ring momentarily ceases to be superconducting, so that one quantum of flux can enter. In a wire loop, placed inside the ring, a voltage pulse is induced each time a flux jump occurs. By counting these pulses the total change of field through the ring can be calculated. For ~1 mm diameter a change in flux density of ~ I nT causes a jump.

A practical SQUID arrangement is the two­hole system: Inside an Nb cylinder a bore shaped like a dumbbell is drilled. The weak link is an Nb screw placed across the bar of the dumb­bell. If the flux in one hole increases, an equal decrease of flux in the other hole must follow. The current through the Josephson junction de­pends on the flux difference between the holes. One uses an external search coil placed in the field the variations of which are to be measured. It is connected in series to a wire loop inside the first hole. This whole circuit works as a flux transformer. The second hole is inductively coupled to an LC circuit tuned at some 10 MHz. The appearance of a voltage drop across the junction due to flux brought into the first hole can be regarded as a change in inductance of the superconducting ring which mistunes the oscillator. The voltage across the circuit as a function of field acting on the search coil is saw tooth like with a period corresponding to trapping one flux quantum inside the two­hole ring. The sensing coil can be placed at a convenient distance from the SQUID. By giving it the appropriate shape one may also detect variations in the spatial derivatives of fields with extreme sensitivity. Furthermore, the SQUID has a fast response. Rates of 107 <Po /s have been recorded.

F. JOCHEN LITTERST G. MICHAEL KALVIUS

References

1. Kohlrausch, F., "Praktische Physik" 22nd Ed., Vol. 2, p. 341, Stuttgart, B. G. Teubner, 1968.

2. Lark-Horovitz, K., and Johnson, V. A., "Methods of Experimental Physics," Vol. 6, "Solid State Physics," Part B, p. 171, New York, Academic Press, 1959.

3. Kalvius, G. M., and Tebble, R. S., "Experimental Magnetism," Vol. 1, John Wiley & Sons, Chichester, 1979.

4. Foner, S., "Vibrating Sample Magnetometer," Rev. Sci. Instr. 27548 (1956).

5. Gallop, J. C., and Petley, B. W., "SQUIDs and Their Applications," 1. Phys. E: Scient. Instr. 9, 417 (1976).

6. Prim dahl, F., "The Fluxgate Magnetometer," 1. Phys. E: Scient. Instr. 12,241 (1979); 15,221 (1982).

7. Pendlebury, J. M., et aI., "Precision Field Averaging

MAGNETOSPHERIC RADIATION BELTS

NMR Magnetometer ... ," Rev. Sci. [nstr. 50,535 (1979).

8. Parsons, 1. W., and Wiatr, Z. M., "Rubidium Vapour Magnetometer," Scient. Instr. 39,292 (1962).

9. Romani, G. 1., Williamson, S. J., and Kaufman, 1., "Biomagnetic Instrumentation," Rev. Sci. Instr. 53,1815 (1975).

Cross-references: GEOPHYSICS; HALL EFFECT AND RELATED PHENOMENA; MAGNETISM; MEA­SUREMENTS, PRINCIPLES OF; SUPERCONDUC­TIVITY; ZEEMAN AND STARK EFFECTS.

MAGNETOSPHERIC RADIATION BELTS

The sun emits continuously a fully ionized gas, the solar wind, which flows radially outward throughout the solar system. The solar wind plasma is primarily made up of protons and elec­trons, and its properties although variable, have average values at earth orbit of bulk velocity V ~ 350 km/sec, number density N ~ 5 cm-3 , and temperature T ~ 15 eV. Because of its high conductivity, the solar wind carries with it an embedded magnetic field, which on average is parallel to the ecliptic plane and traces an Archimedean spiral back to the sun in this plane due to solar rotation. At the orbit of earth the interplanetary magnetic field is highly variable in direction and magnitude, having an average value of ~ 5 nT (10-5 gauss).

The interaction of the supersonic solar wind with the intrinsic dipole magnetic field of the earth forms a region, the magnetosphere (Fig. 1), whose boundary, the magnetopause, sepa­rates interplanetary and geophysical plasma and magnetic field environments.! Upstream of the magneto pause is a bow shock formed in the solar wind-magnetosphere interaction pro­cess. At the bow shock the solar wind becomes thermalized and subsonic and continues its flow around the magnetosphere as magneto­sheath plasma, ultimately rejoining the un­disturbed solar wind. The bow shock is of interest because of its collisionless character, and much work is presently being done to un­derstand the nature and the development of the electric and magnetic field configurations required to establish a shock front in a collision­less medium. 2

A rough estimate of the position of the day­side magnetopause is obtained by balancing the solar wind pressure against the geomagnetic field with the resistive pressure of the geo­magnetic field itself:

t p V2 = B2 /8n,

where p = solar wind mass density, V = solar wind velocity, and B = 0.34/R3 gauss is the earth's field at the magnetic equator, with R = geocentric distance in units of earth radii. Use of the average solar wind values given above

MAGNETOSPHERIC RADIA nON'BEL TS

FIG. 1. An outline of the earth's magnetosphere. The lines represent magnetic field lines.

gives a dayside magnetopause distance of -10.2 earth radii, as compared to an average observed distance of -10.8 earth radii. More exact fluid and kinetic theory models give good agreement with the observed average latitudinal and longi­tudinal shape of the dayside magnetosphere.

In the antisolar direction, observations show that the earth's magnetic field is stretched out in an elongated geomagnetic tail (analogous to a cometary tail) to distances of several hundred earth radii. The geomagnetic tail field lines emanate from high geomagnetic latitudes from the vicinity of the auroral ovals to the geo­magnetic pole. Topologically the geomagnetic tail consists of roughly oppositely directed field lines separated by a "neutral" sheet of nearly zero magnetic field. Surrounding the neutral sheet is a plasma of "hot" particles, the plasma sheet, having a temperature of 1-10 keV, a density of 0.01-1 particle/cm3 and bulk flow velocity of a few tens to a few hundreds of km/sec. A definitive physical explanation of the extended geomagnetic tail has yet to be obtained.

Figure I is a schematic of the overall mag­netospheric configuration. There are seasonal variations due to the _120 tilt of the earth's magnetic dipole axis relative to its spin axis. More important, variations in solar wind pa­rameters cause large perturbations to the pic­ture shown in Fig. 1. These perturbations are observed to have scale variations much larger than the V- 1/3 and p-1/6 dependences pre­dicted by the simple pressure balance discussed earlier. Therefore, physical mechanisms other than simple pressure balance are required to explain observed magnetospheric variations. For example, geomagnetic field lines are known to interconnect with interplanetary magnetic field lines (magnetic field component normal

696

to the magnetopause), causing a two-way trans­fer of particles and energy between the inter­planetary medium and the solar wind. Neutral points (lines) are formed on the magneto­pause and, by magnetic flux conservation, in the geomagnetic tail in the neutral sheet region. It is not known if sites of field line intercon­nection are responsible for large scale par­ticle acceleration, but they are probably ef­fective in altering the shape and size of the magnetosphere.

Electrostatic, induced, and polarization elec­tric fields playa fundamental role in determining charged particle entry to and subsequent motion and acceleration within the magneto­sphere. Much present work in magnetospheric physics is aimed at identifying these fields and deconvolving the subsequent currents respon­sible for sustaining the magnetospheric con­figuration and causing its variations.

The in situ phase of magnetospheric research began dramatically with the discovery by Van Allen and his colleagues3 of a permanent, intense, trapped energetic particle population (the Van Allen radiation belts) residing in the geomagnetic field-a discovery made with data from the first United States satellite, Explorer I. Following this initial discovery, data obtained from the trapping regions showed the geomag­netic field to be at least at altitudes ::s several earth radii, a very efficient and vast magnetic mirror machine.4

The most useful approach for describing the motion of charged particles in the earth's magnetic field has been the guiding center ap­proximation and subsequent development of adiabatic invariant concepts.5 ,6 In the guiding center approximation, the instantaneous posi­tion r of a particle moving in a magnetic field is broken down into its circular motion of radius p, and the motion of the guiding center R:

r = R + p.

A general expression for the motion of the guid­ing center can be obtained by substituting the above into the equation of motion

d 2 r e dr m -- = mg + - - X B + Ee

dt2 c dt

where m = particle mass, e = electronic charge, c = velocity of light, g = acceleration of gravity, B = magnetic field, and E = electric field. This yields the nonrelativistic guiding center equation

d2 R e { I dR } fJ. -- = g + - E + - - X B - - VB dt 2 m c dt m

where fJ. = particle's magnetic moment due to gyration, p = cyclotron radius, Q = scale length

697

over which the magnetic field changes appreci­ably, and O(p/a) = terms of order pIa. In this approximation the particle's motion in the earth's field is broken down into three com­ponents: gyration about a field line, bounce back and forth along the field line between mirror points, and a slow longitudinal drift around the earth. While these motions are not strictly separate from one another, the vast difference in the time scales associated with them makes such a separation possible and leads directly to the consideration of adiabatic invariants. These motions are illustrated in Fig. 2.

The adiabatic invariants may be considered constants of the particle's motion provided that magnetic field variations are small compared with the time and spatial scales associated with the particle's motion. The first of the adiabatic invariants is the magnetic moment generated by the particle as it gyrates around the field line,

/1 = mUl 2 /2B = mu2 sin 2 al2B = W liB

where a = angle between the field line and the velocity vector.

This leads directly to the mirror equation and definition of the particle's mirror point:

W sin 2 a /1=--­

B = constant.

In a static field, W (the particle energy) is con­stant and

sin2 a1 IB 1 = sin2 a21B2 = constant.

Using the earth's equatorial magnetic field as a reference, the mirror point B value on a given line of force is determined by the particle's pitch angle at the equator:

BM = Beq/sin 2 O!eq.

In a dipolelike field configuration, such as the earth's, having a minimum B value at the equator, particles will simply bounce back and forth between conjugate mirror points located

Mirror point (Pitch angle of helical trajectory = 90·

Magnet ic field line

FIG. 2. Illustration of the motion of a charged particle trapped in the earth's magnetic field.

MAGNETOSPHERIC RADIATION BELTS

in the northern and southern hemispheres (cL, Fig. 2).

The second adiabatic invariant, obtained from the action integral and associated with the par­ticle's bounce back and forth along a field line, is given by

M*

J= 2 L Pil ds

where Pil = mUll is the particle momentum along the field line and the integral is taken along the field line between the two conjugate mirror points M and M * .

Forces due to the gradient of the earth's magnetic field and field line curvature cause a longitudinal drift across field lines with electrons drifting eastward and protons drift­ing westward. In an ideal dipole field this ef­fect produces a drift surface which is simply the figure of azimuthal revolution of a line of force. Associated with this drift motion is the third adiabatic invariant, the flux invariant

<PM = B dS.

<PM, the magnetic flux linked in the drift orbit of the particle, is the weakest of the three in­variants (/1, J, <PM) since it has associated with it the largest spatial and temporal scales. There­fore the conditions for adiabatic invariance are most easily violated for <PM.

In Table I we show for reference character­istic times associated with charged particle mo­tion in the magnetosphere.

Radiation belt particles represent significant energy storage in the magnetosphere (2 X 1022 _

2 X 1023 ergs). Their gradient and curvature drifts establish a current encircling the earth, the ring current, which is responsible for world­wide depressions of the earth's surface mag­netic field. During times of enhanced radiation belt intensities, particle energy densities sig­nificantly greater than the ambient magnetic field energy density are observed and can cause surface field variations up to several hundred nT. The bulk energy density of the ring current particles is contained within the energy range ~1-200 keV with a mean energy of ~85 keV. This high {3 plasma (! pu2 > B2 I 8rr) decays primarily through charge exchange and ion­cyclotron wave generation.

Protons, helium, and oxygen together form the ring current but their relative contributions are unknown. Thus the ultimate source of ra­diation belt particles, the solar wind or the ionosphere, is still uncertain. It is expected that strong energy and spatial dependencies will be evident in the source mixture for the bulk of the radiation belt particles. The very high energy (2: several tens of MeV) protons observed at low altitudes CS 1.8 earth radii)

MAGNETOSPHERIC RADIATION BELTS 698

TABLE 1. CHARACTERISTIC TIMES ASSOCIATED WITH PARTICLES TRAPPED IN THE EARTH'S

MAGNETIC FIELD.

Energy Gyr~eriod Bounce Period Drift Period Particle (keV) (rxR )(sec) (rx R)(sec) (rx l/R) (hr)

Electron 10 9.4 X 10-6 0.64 36.7 100 13 X 10-6 0.23 4.1

1000 80 X 10-6 0.13 0.54

Proton 10 17 X 10-3 27 36.7 100 17X10-3 8.6 3.4

1000 17 X 10-3 2.7 0.35

Values correspond to "'e = 1T/2 and R = 2Re.

are supplied by neutrons, generated in the at­mosphere by cosmic rays, which leave the earth's atmosphere and decay in the geomag­netic field.

Radiation belt particles are accelerated to their final energies via E X B convection across field lines and betatron and Fermi acceleration processes due to slow diffusion across mag­netic field lines under conservation of the first two adiabatic invariants. The relative importance of these mechanisms has not yet been estab-

Mercury

3

..... I- --I I-3.5 x 103 km -5 x 104 km

lished. Radiation belt particles having a solar wind source obtain an initial heating in the geomagnetic tail where gradient and curvature drive across electric field equipotentials can increase particle energies by amounts up to several tens of ke V. If the particles are from the ionosphere they are accelerated to energies of -1-10 ke V by electric fields parallel to mag­netic field lines emanating from the auroral zones. In either case a radiation belt source, most likely the plasma sheet, can be formed at

4.3 x 106 km

FIG. 3. MagnetosphereJike systems are probably common throughout the universe, with a large range in scale sizes. For example, the subsolar magnetopause distance for Mercury is 3.5 X 103 km; for the radio galaxy NGC 1265, the analogous distance is roughly 1018 km.

699

altitudes ~ 6.5 earth radii which then can be accelerated as discussed above to form the trapped particle population.

The second large energy storage region in the magnetosphere is the extended geomagnetic tail (3 X 1022 -3 X 1023 ergs). The plasma sheet particles and the stretched geomagnetic field lines contribute roughly equal parts to this energy storage. The relationship between the plasma sheet, the aurora, and the radiation belts is an intimate one but not yet fully understood. As discussed above, the earthward portion of the plasma sheet is a likely source for the radia­tion belts, but the relative contributions of the solar wind and ionosphere are unknown. It is also likely that diffuse auroral forms are due to plasma sheet electrons scattered into the loss cone by electrostatic waves. On the other hand, discrete auroral forms are most probably caused by electric fields parallel to auroral magnetic field lines. It is not known whether these fields are due to very narrow electrostatic field geom­etries called double layers or the observed shear flows in the high altitude plasma (V . E < 0) coupled with ionospheric current continuity restrictions. There appears to be no require­ment for anomolous resistivity mechanisms in auroral processes.

Associated with the parallel electric fields responsible for discrete auroral forms is an in­tense electromagnetic radiation in the 100-1000 kilohertz band, auroral kilometric radia­tion. The intensity of this radiation,. normalized to a planetary radius distance scale, is compa­rable to Jupiter's emissions. In fact, integration over respective radiating solid angles may make the earth a radio source of the same order as Jupiter in total power output.

If we consider the earth's magnetosphere in a general sense as a rotating magnetized plasma, we find that such objects are plentiful through­out our solar system and perhaps the universe. This is not surprising, since most of the universe is filled with plasma and the basic interactions between plasmas, electric fields, and magnetic fields being uncovered in the earth's magneto­sphere are present in the development of cosmic regions from small interstellar clouds to entire galaxies.

Interplanetary spacecraft have identified mag­netospheres around Mercury, Saturn, and Jupiter. 7 Astronomers have detected similar structures around rotating neutron stars (pul­sars) and radio wave-emitting galaxies . Figure 3 illustrates the scale sizes observed for these various magnetospherelike systems.

In all these cases it is evident that nature has been able to accelerate charged particles to very high energies. In the earth's magnetosphere there are at least four established methods of accelerating particles. The most general of these is magnetic field gradient and curvature drift across electric field equipotential surfaces. More specific mechanisms are betatron accelera-

MAGNETOSPHERIC RADIA TION BELTS

tion, Fermi acceleration, and acceleration by electric fields parallel to magnetic field lines. It remains to be shown that field line inter­connection can directly transfer energy from the magnetic field to charged particles (field line merging, reconnection) or if plasma turbu­lence effects are inportant as acceleration processes.

Magnetospheric systems, while similar, often have their own unique characteristics. For ex­ample, Jupiter and Saturn have moons in the heart of the charged particle populations which are effective absorbers creating distinctive features in their radiation belts. At Jupiter, the volcanic moon 10 is a copius source of sulfur and oxygen, both of which have been detected at all energies throughout the Jovian magneto­sphere. Jupiter's high spin rate (period = 9 hours 55 minutes 29.7 seconds) can produce effects to accelerate particles in addition to those found in the earth's magnetosphere. For ex­ample, low energy plasma corotating with Jupiter's magnetic field will exceed the Alfven speed and become supersonic well within the Jovian magnetosphere (30-40 Jovian radii). Even tiny Mercury, with neither atmosphere nor ionosphere, possesses a magnetosphere capable of accelerating large numbers of par­ticles to high energies.

Work remains to be done to understand how the laws of physics operate in interacting magnetized plasma systems which display the range of boundary conditions seen throughout the solar system and in the universe.

DONALD J. WILLIAMS

References

1. For a detailed discussion of the magnetosphere and its interaction with the solar wind see, for example, Akasofu, S.-I., and Chapman, S., "Solar-Terrestrial Physics," 901 pp., London, Oxford Univ. Press, 1972, and Williams, D. J. (Ed.), "Physics of Solar Planetary Environments," 1038 pp., Washington, D.C., American Geophysical Union, 1976. For a brief summary of other magnetospheric systems see Stern, D. P., and Ness, N. F., "Planetary Magnetospheres," Annual Review of Astronomy and Astrophysics 20 (1982).

2. See the collection of papers on recent bow shock results, Journal of Geophysical Research 86, 4319 (1981).

3. Van Allen, J. A., Ludwig, G. H., Ray, E. C., and McIlwain, C. E., "Observation of High Intensity Radiation by Satellites 1958 α and γ," Jet Propulsion 28, 588 (1958).

4. See, for example, Roederer, J. G., "Dynamics of Geomagnetically Trapped Radiation," Heidelberg, Springer-Verlag, 1970, and Williams, D. J., "Charged Particles Trapped in the Earth's Magnetic Field," Advances in Geophysics 15, 137 (1971).

5. Alfven, H., "Cosmical Electrodynamics," 1st Ed., London and New York, Oxford Univ. Press, 1950.


6. Spitzer, L., "Physics of Fully Ionized Gases," New York, Wiley-Interscience, 1956.

7. For Mariner-10 Mercury results see Science 185, 141 (1974). For Voyager-1 Jupiter results see Science 204, 945 (1979). For Voyager-2 Jupiter results see Science 206, 925 (1979). For Voyager-1 Saturn results see Science 212, 159 (1981). For Voyager-2 Saturn results see Science 215, 499 (1982).

Cross-references: ELECTRON, GEOPHYSICS, IONOSPHERE, PLANETARY ATMOSPHERES, PROTON, SPACE PHYSICS.

MAGNETOSTRICTION

When a polycrystalline nickel sample is placed in a magnetic field, it contracts along the field direction by about 30 parts per million and elongates in the transverse direction by about half that amount. There is also a small volume change. Such changes in the dimensions of magnetic materials with variation of magnetic field strength or direction are termed magnetostriction. They are measured by strain gages, optical dilatometers, capacitance variation, and x-ray analysis.

Below the Curie temperature, magnetostriction in weak fields is caused by domain rotation, becoming appreciable at fields near the knee of the B-H curve.

In saturating fields there is still a small linear dependence of magnetostriction on magnetic field strength, and above the magnetic ordering temperature magnetostriction is, except in rare instances, quadratic in magnetic field strength. Field-strength-dependent distortions in the saturated and paramagnetic regions, designated forced magnetostriction, are due to the paraprocess, the induction of a moment by the field.

The saturation magnetostriction of single crystals depends upon the direction of the (sublattice) magnetization, α, and the direction of measurement, β, with respect to the crystal axes. In a cubic crystal (with collinear sublattices), to lowest order,

\[
\frac{\delta l}{l} = \lambda_0 + \frac{3}{2}\,\lambda_{100}\!\left(\alpha_1^2\beta_1^2 + \alpha_2^2\beta_2^2 + \alpha_3^2\beta_3^2 - \frac{1}{3}\right) + 3\,\lambda_{111}\!\left(\alpha_1\alpha_2\beta_1\beta_2 + \alpha_2\alpha_3\beta_2\beta_3 + \alpha_3\alpha_1\beta_3\beta_1\right) \tag{1}
\]

Clark¹ gives higher order expressions for cubic and hexagonal symmetry.

For cubic crystals the fractional change in length along the field (and the magnetization) direction induced by a saturating magnetic field is, in principle, found by averaging Eq. (1) over directions. This gives the saturation magnetostriction

\[
\lambda_s = \frac{2\lambda_{100} + 3\lambda_{111}}{5} \tag{2}
\]

The assumption underlying the averaging that yields Eq. (2) is that before the field is applied the material is unmagnetized and all polycrystal orientations are equally likely. This is in fact rarely the case; there is often both some remanent magnetization and some preferential orientation of crystallites. Measurements of λ∥ − λ⊥, the difference in distortions parallel and perpendicular to the field direction, are more reproducible and significant, in that they are independent of the distortion in a fiducial "unmagnetized" state.

Magnetostriction coefficients vary greatly, depending upon the material, temperature, and magnetization state. For pure iron at room temperature, the saturation magnetostriction constants are λ₁₀₀ ≈ 20 × 10⁻⁶ and λ₁₁₁ ≈ −20 × 10⁻⁶, while for alloys near 80Ni–20Fe (weight per cent) these constants are almost zero. The cobalt ion causes a large magnetostriction; for cobalt ferrite λ₁₀₀ ≈ −500 × 10⁻⁶, while for nickel ferrite λ₁₀₀ ≈ −30 × 10⁻⁶. The largest known magnetostriction is that of dysprosium metal.² As a magnetic field is rotated in the basal plane of this hexagonal crystal, there is a basal plane distortion of almost one per cent at liquid nitrogen temperatures and below. At room temperature, TbFe₂ shows a magnetostriction 5 times larger than does any other material³ (λ ≈ 2 × 10⁻³).
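To make Eqs. (1) and (2) concrete, here is a minimal sketch that evaluates the single-crystal expression for chosen direction cosines and the polycrystalline average for the iron constants quoted above; setting λ₀ = 0 is an assumption made purely for illustration.

```python
import numpy as np

def delta_l_over_l(alpha, beta, lam100, lam111, lam0=0.0):
    """Cubic-crystal magnetostriction, Eq. (1); alpha and beta are the
    direction cosines of the magnetization and measurement directions."""
    a = np.asarray(alpha, float); a /= np.linalg.norm(a)
    b = np.asarray(beta, float);  b /= np.linalg.norm(b)
    term100 = np.sum(a**2 * b**2) - 1.0 / 3.0
    term111 = a[0]*a[1]*b[0]*b[1] + a[1]*a[2]*b[1]*b[2] + a[2]*a[0]*b[2]*b[0]
    return lam0 + 1.5 * lam100 * term100 + 3.0 * lam111 * term111

def lambda_s(lam100, lam111):
    """Polycrystalline saturation magnetostriction, Eq. (2)."""
    return (2.0 * lam100 + 3.0 * lam111) / 5.0

# Room-temperature iron constants from the text
lam100, lam111 = 20e-6, -20e-6
# Along a cube edge the expression reduces to lambda_100 itself:
print(delta_l_over_l([1, 0, 0], [1, 0, 0], lam100, lam111))  # 2.0e-05
print(lambda_s(lam100, lam111))                              # (40-60)/5 = -4.0e-06
```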

The source of magnetostriction is the dependence of magnetic energy on strain. Because the elastic energy is quadratic in strain while the magnetoelastic energy is linear in strain, the minimum free energy occurs at nonzero strain. For example, in a cubic crystal the equilibrium shear strain ε_xy is given by

\[
\epsilon_{xy} = \frac{B_2(T,H)}{c_{44}}\,\alpha_x\alpha_y \tag{3}
\]

Here c₄₄ is the elastic constant, the α's are magnetization direction cosines, and B₂(T, H) is a magnetoelastic coefficient representing the variation of magnetic energy (magnetic anisotropy, dipolar, anisotropic exchange) with strain.

Quantum mechanical calculations of the magnetoelastic coefficients are in a somewhat more satisfactory state in the case of nonconductors than for metals. Extensive calculations by Tsuya of the B coefficients of the spinels are reviewed by Kanamori.⁴

The temperature dependence (and "forced" field dependence) of the magnetostriction coefficients is due to statistical averaging as the individual spins fluctuate around the average magnetization direction α. For some materials this temperature dependence can be expressed entirely in terms of a known function of the (sublattice) magnetization. For ferrimagnets⁵

\[
\lambda_i(T,H) = \sum_n \lambda_{in}(0)\, f_i\!\left[m_n(T,H)\right] \tag{4}
\]

That is, the magnetostriction coefficient λᵢ(T, H) is the sum over sublattices of temperature-independent sublattice magnetostriction coefficients [e.g., λ₁₁₁ₙ(0) = B₂ₙ(0)/c₄₄] times a function fᵢ of the reduced sublattice magnetization mₙ(T, H) = Mₙ(T, H)/Mₙ(0). At sufficiently low temperatures this function reduces to

\[
f\!\left[m_n(T,H)\right] = \left[\frac{M_n(T,H)}{M_n(0)}\right]^3, \qquad T \ll T_c \tag{5}
\]

for both λ₁₀₀ and λ₁₁₁. Clark¹, Bozorth⁶, and Callen⁷ give references.

EARL CALLEN

References

1. Clark, A. E., "Magnetostrictive Rare Earth–Fe₂ Compounds," in "Ferromagnetic Materials" (E. P. Wohlfarth, Ed.), New York, North-Holland Publishing Co., 1980, Chapter 7, pp. 531–589.

2. Legvold, S., Alstad, J., and Rhyne, J., Phys. Rev. Letters 10, 509 (1963); Clark, A. E., Bozorth, R. M., and DeSavage, B., Physics Letters 5, 100 (1963).

3. Clark, A. E., and Belson, H. S., AIP Conf. Proc. 5, 1498 (1972); Clark, A. E., Belson, H. S., Tamagawa, N., and Callen, E., Proc. Internat. Conf. on Magnetism, Moscow, August 1973.

4. Kanamori, J., "Magnetism" (Rado, G. T., and Suhl, H., Eds.), Vol. I, p. 127, New York, Academic Press, 1963.

5. Callen, E., Clark, A. E., DeSavage, B., Coleman, W., and Callen, H. B., Phys. Rev. 130, 1735 (1963).

6. Bozorth, R. M., "Ferromagnetism," New York, Van Nostrand Reinhold, 1951.

7. Callen, E., J. Appl. Phys. 39, 519 (1968).

Cross-references: FERRIMAGNETISM, FERROMAGNETISM, MAGNETISM.

MANY-BODY PROBLEM

Scope and Definition A large part of the experimental data of physics is concerned with natural objects which may be looked upon as being made up from smaller bodies. For example, we may think of the solar system as an object composed of the planets and the sun; ordinary matter, in solid, liquid, or gaseous form, as composed of molecules and atoms; atoms and molecules themselves as made up from nuclei and electrons; the nuclei as composed of neutrons and protons; and so on. We shall call the composite object the system, and its constituents the particles, and note that it seems most reasonable to suppose that the properties of the system can be explained on the basis of the law of interaction between the particles and the laws of dynamics. The latter may be classical or quantum mechanical according to the demands of the situation. At each level of refinement we refrain from asking about the internal structure of the particles. This is to achieve a natural simplicity of description; but still, at each such level, we have a rich variety of natural phenomena to explain.

The many-body theory is not concerned with any fundamental or complete explanation of nature. Its chief aim is to formulate schemes according to which calculations of certain physical quantities can be performed theoretically and the results compared with experimental measurements. It is inherent in its methods that the number of particles is considered as being large, and no attempt is made to find all the details of the motions of the particles, a characteristic which distinguishes it from the so-called one-, two-, or three-body problems.

The main approaches to the theory of quantum-mechanical many-body systems, such as nuclei, solids, and fluids, were worked out in a period of approximately ten years starting in the early 1950s. This led to a great deal of activity and attracted much attention. As a result, the term many-body problem has come to mean, almost exclusively, the theory of such systems at or near the absolute zero of temperature. The latter qualification serves to distinguish the many-body problem as such from the closely related field of STATISTICAL MECHANICS. (By convention, many-body scattering theory is a separate subject.) The new developments were based on the observation that when the number of particles is so large that it may be considered effectively infinite, the system becomes very similar to a system of interacting fields, except for the nature of the interactions considered, and the general formal methods of quantum field theory and quantum electrodynamics may be used with advantage.

There is only one general theorem in many-body theory; it is known as Poincaré's theorem. Roughly speaking, it states that any given initial state of a finite many-body system will be repeated provided one waits long enough. The quantum mechanical form of this theorem states that all observables in a finite system are almost periodic functions of time. This theorem has not had much practical use but has played an important role in discussions concerning the foundations of statistical mechanics.

The so-called many-body theory is mainly a collection of special approximate methods developed for particular problems. The chief common features of some of the methods, especially the ones connected with modern developments, will now be described.

Reduction to an Equivalent System of Noninteracting Particles The very fact that we can recognize some constituent particles leads us to believe that in the lowest approximation we may neglect their interactions. This approximation is already quite successful in the derivation of the perfect gas laws and in the electron theory of metals. A slightly different form of this assumption occurs in the case of atoms, which are treated as systems of non-interacting electrons moving in the field of force of the nucleus. For planetary systems a similar approximation is used.

The normal mode analysis of a lattice provides an example where a transformation of coordinates is used to achieve such a reduction. Instead of considering the coordinates of individual particles which interact with each other through harmonic forces, one considers certain linear combinations of displacements, the modes. In terms of the new variables there are no interactions and the solution is immediately obtained. This is an example of a transformation which introduces a collective description of the system.
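A minimal sketch of this transformation, for a small chain of equal masses coupled by identical springs with fixed ends (all parameter values invented for the example):

```python
import numpy as np

# N equal masses coupled by identical springs, walls fixed at both ends.
N, k, m = 5, 1.0, 1.0   # arbitrary illustrative values

# Force-constant (dynamical) matrix in particle coordinates:
# the off-diagonal terms are the interactions between neighbors.
D = (k / m) * (2 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1))

# Diagonalizing D is the transformation to normal (collective) coordinates:
# eigenvectors are the modes, eigenvalues the squared mode frequencies.
omega2, modes = np.linalg.eigh(D)
print("mode frequencies:", np.sqrt(omega2).round(4))

# In the new variables there are no interactions: the matrix is diagonal.
D_modes = modes.T @ D @ modes
print("off-diagonal couplings vanish:", np.allclose(D_modes, np.diag(omega2)))
```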

Another type of situation occurs in nuclear theory, where it is found that a shell model of the nucleus, built in analogy with the atomic shell model, is very successful. The non-interacting particles of this model are called neutrons and protons, but the interaction between them which must be used in this model is vastly different from that observed in two-body scattering experiments. As a first approximation one can completely ignore the mutual interaction and assume that the particles move in a common one-body potential. This circumstance suggests that what are called neutrons and protons in this model are not the same as the free ones but are only some quasi-particles which are appropriate to the model and happen to have many properties in common with actual particles. An analogous situation occurs in some solids, where electrons as observed by means of cyclotron resonance experiments possess an effective mass different from the mass of free electrons. In fact, one may even say that many-body theories always deal with quasi-particles. That is true of most existing theories, but from such a point of view one loses sight of one of the basic motivations of many-body theory.

Effective Field Method This is one of the methods of taking into account the mutual interactions of the particles. One starts with a given motion of the particles, e.g., from an approximation of the type described in the last paragraph, and calculates the field of force experienced by one of the particles under the influence of all the others. As a further refinement the field may be made self-consistent, that being the situation when the motion produced under the influence of the field is the same as that which generated it. But for the approximations made in the course of calculation, such as the omission of the effects of correlations among the particles, a fully self-consistent theory would be a complete theory.

Examples are: Hartree-Fock theory, Fermi-Thomas approximations, the Brueckner theory of nuclei, the Wigner-Seitz cell model in solid state theory, and several others.
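The self-consistency loop common to these methods can be shown schematically. The toy mean-field iteration below, for spinless particles hopping on a small ring with an on-site potential generated by the average density, is an invented illustration rather than any of the named theories:

```python
import numpy as np

# Toy mean-field loop: particles hop on a ring of L sites and feel an
# on-site potential U * n(i) generated by the average density n(i).
L, n_particles, U, t = 6, 3, 2.0, 1.0

hop = -t * (np.eye(L, k=1) + np.eye(L, k=-1))
hop[0, -1] = hop[-1, 0] = -t                      # periodic boundary

density = np.full(L, n_particles / L)             # initial guess
for iteration in range(200):
    H = hop + np.diag(U * density)                # field generated by the motion
    energies, orbitals = np.linalg.eigh(H)
    # Occupy the lowest orbitals; the new density follows from the new motion.
    new_density = (np.abs(orbitals[:, :n_particles])**2).sum(axis=1)
    if np.max(np.abs(new_density - density)) < 1e-10:
        break                                     # the field reproduces the motion
    density = 0.5 * density + 0.5 * new_density   # damped update for stability

print(f"self-consistent after {iteration + 1} iteration(s); density = {density.round(4)}")
```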

Density Functional Method This method is based upon a theorem of Hohenberg and Kohn, which states that if the ground state of a many-particle system is nondegenerate, then the corresponding wave function is a unique functional of the particle density. The theorem implies the existence of a universal functional of the external potential and particle density which is a minimum for the true particle density. Others have generalized the theorem to nonzero temperatures, nonlocal external potentials, relativistic systems, and spin-dependent systems. In the latter case a space- and spin-dependent density is used. The universal functional of the theory is inferred from the basic many-body quantum mechanical description of the system. Given the functional, the ground-state energy and particle density are obtained from a variational principle involving the particle density alone. Appropriate approximations reproduce Thomas-Fermi theory and its known generalizations, but the density functional approach can go much further. Extensive applications to atomic, molecular, and solid state problems have been made. (Unfortunately no comprehensive review of these works exists; we give references to the first basic papers.)

Collective Motion Theory In some phenomena, such as the propagation of sound and plasma oscillations, it is clear that many particles are performing coordinated movements. To study such cases, one introduces some collective variables in addition to the usual ones, and the Hamiltonian is re-expressed in terms of these mixed variables. Subsidiary conditions have to be imposed upon this extended system of variables to preserve the original number of degrees of freedom. The collective variables should be such that there is no appreciable interaction between these and other degrees of freedom. When quantum mechanics is applicable, the collective motions are also excited in quanta which for all practical purposes may be treated as new (quasi) particles. The stability of collective motions is then expressed in terms of the lifetime of the quasiparticles. Solid-state physics is particularly rich in exhibiting collective motions. Quasiparticles associated with some of them are: the phonons (sound, lattice vibrations); the polarons (electron and its polarization field in a dielectric); and the excitons (electron-hole excitations in insulators). Collective motions in nuclei can also be interpreted in a similar manner. Superconductivity and superfluidity are also examples of collective motion. The quasiparticles responsible for superconductivity are electron pairs with equal and opposite momenta and spins.

Use of Techniques of Field Theory With these techniques it is possible to obtain formal expressions which represent the effect of interparticle interactions to any order in perturbation theory. By carrying out rearrangements and partial summations of terms in the perturbation series it is possible to see that, as far as the motion inside the system is concerned, the relationship between the coordinates and momenta and the potential and kinetic energies is changed in such a way that it has to be described in terms of an effective mass and an effective interaction, which differ from the original quantities in a known way. In certain cases these effects can be calculated and are finite.

Brueckner's theory of nuclear matter is an example of this type. The effective mass is found to depend on the momentum of the particle inside the system, and the effective interaction, the so-called t- or K-matrix, is given by an integral equation involving the original interaction. A self-consistent calculation of the properties of the system (nuclei or atoms) can be based on this understanding.

Similar techniques can be used for studying collective motions. An example is the treatment of the electron gas by Gell-Mann and Brueckner.

Perhaps the greatest advance has been made in the theory of superconductivity, where variational and canonical transformation methods have been used.

A combination of all these methods is needed to study the difficult problem of the relationship between various excitations, i.e., the interaction between the various quasiparticles of a many-body system. One of the most useful tools in these calculations is the representation of matrix elements by means of diagrams, first introduced by Feynman. Many of these methods were first developed in connection with the theory of interacting fields, and they are usually employed in many-body problems for the limiting case of an infinite number of particles, but these restrictions are not essential; in fact, they are quite general methods for treating arbitrary quantum mechanical systems.

KAILASH KUMAR

References

ter Haar, D., "Introduction to the Physics of Many­Body Systems," New York, Interscience Publishers, 1958.

De Witt, B., "The Many-Body Problem," London, Methuen, 1959.

Thouless, D. J., "The Quantum Mechanics of Many­Body Systems," New York, Academic Press, 1961.

Fetter, A. L., and Walecka, J. D., "Quantum Theory of Many Particle Systems," New York, McGraw-Hill, 1971.

Hohenberg, P., and Kohn, W., "Inhomogeneous Electron Gas," Phys. Rev. 136, B864–B871 (1964); Kohn, W., and Sham, L. J., "Self-Consistent Equations Including Exchange and Correlation Effects," Phys. Rev. 140, A1133–A1138 (1965) (for the density functional method).

Kumar, K., "Perturbation Theory and the Nuclear Many-Body Problem," Amsterdam, North-Holland Publishing Co., 1962.

Khilmi, G. F., "Qualitative Methods in Many-Body Problem," New York, Gordon and Breach, 1961 (for classical mechanics).

March, N. H., Young, W. H., and Sampanthar, S., "The Many Body Problem in Quantum Mechanics," Cambridge, Cambridge Univ. Press, 1967.

Ziman, J. M., "Elements of Advanced Quantum Theory," Cambridge, Cambridge Univ. Press, 1969.

Cross-references: EXCITON, FEYNMAN DIAGRAMS, FIELD THEORY, KINETIC THEORY, NUCLEAR STRUCTURE, PHONON, PLASMAS, QUANTUM ELECTRODYNAMICS, QUANTUM THEORY, SOLID-STATE THEORY, STATISTICAL MECHANICS, SUPERCONDUCTIVITY, SUPERFLUIDITY.

MASER

The term "maser," coined by Townes and co­workers who pioneered this field, stands for m(icrowave) a(mplification by) s(timulated) e(mission of) r(adiation). "Microwave" has proved restrictive; stimulated emission ampli­fiers have operated in the UHF (-300 MHz), and at infrared, visible, and ultraviolet frequen­cies (see LASER). The principal advantage of the maser amplifier is its small intrinsic internal noise: the equivalent noise input temperature is but a few degrees Kelvin. The theoretical mini­mum noise input temperature is hfs/k, where h is Planck's constant, k is Boltzmann's constant, and fs is the signal frequency. This is 0.48 K at fs = 10 Ghz (Giga Hertz) or 10 X 109 Hz. Maser oscillators can generate exceedingly monochro­matic radiation, e.g., the ammonia maser has a short-term frequency stability of -5 parts in 1012 , and the atomic hydrogen maser has a short-term stability of better than 1 part in 1013 •

Because "quasi-optical" techniques are being employed increasingly in the millimeter and submillimeter regions, the distinction between LASER and maser in these regions is becoming eroded. Historically, maser oscillators used reso­nant systems of dimensions comparable to a cubic wavelength, (A3 ); laser oscillators used resonators with dimensions exceeding A 3 by many orders of magnitude.

Stimulated Emission of Radiation Because its energy is quantized, a molecule (here a generic term) can exchange energy with the electromagnetic radiation field only in discrete amounts (quanta). The emission or absorption of a quantum (photon) is associated with a transition between molecular energy states. For two states |m⟩, |n⟩ of energies Wₘ, Wₙ (Wₘ > Wₙ), the frequency fₘₙ of the radiation accompanying the (permitted) transition between them satisfies the Bohr condition

\[
h f_{mn} = W_m - W_n \tag{1}
\]


A molecule in state |n⟩, exposed to radiation of frequency fₘₙ and energy density u, has a probability per unit time u × Bₙₘ (Bₙₘ is a constant) of absorbing a photon hfₘₙ and reaching state |m⟩. There is also a probability u × Bₘₙ that a molecule in the upper state |m⟩ will emit a photon hfₘₙ and return to the lower state |n⟩. The upper-state molecule is stimulated to emit radiation of frequency fₘₙ by the radiation field at this frequency. Stimulated emission, like absorption, is a process which is phase coherent with the incident radiation. Thermodynamical arguments by Einstein (1917) showed that

\[
B_{nm} = B_{mn} \tag{2}
\]

A molecule in the upper energy state |m⟩ may also revert to the lower state |n⟩ by spontaneously emitting radiation of frequency fₘₙ. This spontaneous emission is a random process, which is phase incoherent with any incident radiation, and is therefore a source of noise in a maser.

The spontaneous emission probability Aₘₙ is given by

\[
A_{mn} = B_{mn} \times h f_{mn} \times p_1 \tag{3}
\]

where p₁ is the number of wave modes per unit volume per unit frequency range open to radiation of frequency fₘₙ. Table I shows values of p₁ under various conditions; c is the velocity of light, and u_g is the group velocity of the radiation.

In the microwave region (say, 1 to 100 GHz), Aₘₙ ≪ Bₘₙ; spontaneous emission is therefore negligible except as a source of noise. However, maser spontaneous emission noise is usually exceeded by noise arising from losses in ancillary microwave circuit elements.

Molecular transitions are excited by either the electric or the magnetic component of the radiation field, depending upon whether the change in molecular energy is primarily electric or magnetic in character. Each radiative transition has associated with it an effective oscillating electric or magnetic moment, usually dipolar. The probability Bₘₙ given above depends directly on this dipole moment and inversely on the frequency spread (line width) δ of the transition.

TABLE I

Environment                                                      p₁
Enclosure large compared with the wavelength c/fₘₙ               8πfₘₙ²/c³
Single-mode resonant cavity, volume V, half-power width Δf       1/(V Δf)
Waveguide, cross section A                                       1/(A u_g)


Conditions for Amplification Suppose radiation of frequency fₘₙ is incident on an assembly of molecules with an allowed transition at this frequency [Eq. (1)]. Let the number of molecules in the upper state |m⟩ be Nₘ, and in the lower state |n⟩ be Nₙ. If the incident radiation energy density is u, the power absorbed by the molecules will be

\[
P_A = N_n u B_{mn} h f_{mn} \tag{4}
\]

and the power emitted will be [see Eq. (2)]

\[
P_E = N_m u B_{mn} h f_{mn} \tag{5}
\]

Since at microwave frequencies spontaneous emission is negligible, the condition for amplification is

\[
P_E > P_A; \quad \text{i.e.,}\ N_m > N_n \tag{6}
\]

There must be an excess of molecules in the upper energy state of the transition associated with the signal frequency.

For thermal equilibrium at temperature T, Boltzmann statistics give

\[
\frac{N_m}{N_n} = \exp\!\left[-\frac{W_m - W_n}{kT}\right] = \exp\!\left(-\frac{h f_{mn}}{kT}\right) \simeq 1 - \frac{h f_{mn}}{kT} \tag{7}
\]

at microwave frequencies, where hf ≪ kT. A molecular system in thermal equilibrium is thus always absorptive. Equation (7) allows the definition of an "effective temperature" Tₘ for an emissive system; Eqs. (6) and (7) show that Tₘ will be a "negative" temperature, and that |Tₘ| → 0 for Nₘ/Nₙ → ∞. Obtaining an emissive condition, obtaining a "negative temperature," and obtaining "population inversion" are thus synonymous. The excitation of a molecular assembly to an emissive condition is perhaps the crux of the maser problem. The schemes used depend on the conditions and on the molecular system. Discontinuous methods (pulse inversion, adiabatic fast passage) can be used, but the account here is confined to the principles of continuous methods. In a gas, actual separation of the upper-state molecules may be possible. For example, the upper-state molecules for the 23.87-GHz ammonia maser transition tend to increase their energy in a static electric field, while the lower-state molecules tend to decrease their energy (quadratic Stark effect). In an inhomogeneous electric field, the wanted upper-state molecules will therefore drift to the low-field regions. An electrode system (with geometrical axial symmetry) which gives a low-field region along the symmetry axis will therefore confine the upper-state molecules in a beam along this axis while rejecting the lower-state ones.
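A quick numerical check of Eq. (7) shows how small the thermal population difference is at microwave frequencies; the 10-GHz transition and room temperature in the sketch below are chosen only for illustration.

```python
import math

h = 6.626e-34   # Planck's constant, J s
k = 1.381e-23   # Boltzmann's constant, J/K

def population_ratio(f_hz, T_kelvin):
    """Thermal-equilibrium ratio N_m/N_n from Eq. (7)."""
    return math.exp(-h * f_hz / (k * T_kelvin))

ratio = population_ratio(10e9, 300.0)
print(f"N_m/N_n = {ratio:.6f}")                                # ~0.998, barely below 1
print(f"fractional excess in lower state = {1 - ratio:.2e}")   # ~1.6e-3
```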

In the atomic hydrogen maser, a state-selector magnet is used, with alternating north and south poles, again in an axial arrangement. The electron and nuclear spins can be either "parallel" (F = 1) or "antiparallel" (F = 0). The energy of the atoms with F = 0 decreases with the magnetic field as they go off axis, while that of the atoms with F = 1 increases. Hence the atoms with F = 1, those in the higher energy state, are focused by the magnet system along its axis, while the lower-state atoms are lost to the beam. The upper-state atoms enter a teflon-coated quartz bulb inside a microwave cavity resonant at the transition frequency. The teflon coating minimizes the chance of an atom emitting its energy because of a collision with the wall of the bulb.

Most masers operate on a multilevel excitation scheme, requiring an input of energy ("pumping") at some frequency other than the transition frequency; forms of energy other than electromagnetic may also be used. The principles of the scheme will be illustrated by reference to a molecule having three levels with energies W₁ < W₂ < W₃, such that all transitions between levels are allowed. (The transitions other than the signal transition need not radiate electromagnetically.) In thermal equilibrium the number densities (nᵢ)ₑ of the particles in the different states i will satisfy

(n₁)ₑ > (n₂)ₑ > (n₃)ₑ.

The frequencies f₃₂, f₃₁, and f₂₁ are defined from

\[
f_{mn} = (W_m - W_n)/h
\]

Suppose now, by some means, that the transition 1 → 3 is saturated, i.e., n₁ ≈ n₃. (This might be achieved by a sufficiently strong electromagnetic field at frequency f₃₁, known as the "pump" frequency.) Under these conditions, it may happen either that n₂ > n₁ or that n₃ > n₂. In the first case, amplification will be possible at f₂₁; in the second case, at f₃₂, provided that the appropriate transition is electromagnetically radiative.
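The redistribution just described can be illustrated with steady-state rate equations for the three levels; the relaxation rates and the strength of the saturating pump in the sketch below are invented parameters, not values from the text.

```python
import numpy as np

# Steady-state populations of a 3-level system with a saturating 1<->3 pump.
# w[i][j]: relaxation rate from level i+1 to level j+1 (illustrative values).
w = np.array([[0.0, 1.0, 0.2],    # from level 1
              [1.2, 0.0, 1.0],    # from level 2
              [0.5, 1.5, 0.0]])   # from level 3
pump = 100.0                      # strong pump rate on 1<->3 (saturation)

# Build dn/dt = M n, adding the pump symmetrically between levels 1 and 3.
M = w.T - np.diag(w.sum(axis=1))
M[0, 2] += pump; M[2, 0] += pump
M[0, 0] -= pump; M[2, 2] -= pump

# Solve M n = 0 with the populations normalized to sum to 1.
A = np.vstack([M, np.ones(3)])
b = np.array([0.0, 0.0, 0.0, 1.0])
n = np.linalg.lstsq(A, b, rcond=None)[0]

print("n1, n2, n3 =", n.round(4))            # pump makes n1 ~ n3
print("inversion on 2->1:", n[1] > n[0], "  inversion on 3->2:", n[2] > n[1])
```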

There are many variants of the simple scheme just described. The frequency f₃₁ may lie in the optical region (see OPTICAL PUMPING); the excitation may be by collision processes in a gas discharge; or more than three levels may be involved, and pump frequencies lower than the signal frequency can sometimes be used.

Maser Materials Maser action has been achieved in gases (e.g., ammonia, formaldehyde, hydrogen, rubidium vapor) and liquids (e.g., protons in water), but the most important maser materials are the solid-state ones, since these have a high concentration of active centers in a small space. Present emphasis is on the use of certain paramagnetic ions diluted in a host crystal lattice. Three-level excitation, or some variant, is usually employed.

PARAMAGNETISM is associated with ELECTRON SPIN. The directional quantization of angular momentum leads to the quantization of the energy of the ionic magnetic moments in a steady magnetic field. In general, the ground-state multiplet of these ions is split by the crystal field of the host lattice (Stark effect), and the levels are completely separated by a steady magnetic field (Zeeman effect). When the steady magnetic field is applied at an angle to the major symmetry axis of the crystal field, and the resultant Zeeman splitting is comparable with the initial Stark splitting of the levels, the usually forbidden "leap-frog" transitions necessary for 3- or multiple-level excitation become allowed. In crystal fields of low symmetry, "leap-frog" transitions may be allowed at very low or even zero magnetic fields. Clearly, ions having three or more energy levels are wanted, and any processes competing with radiative processes (e.g., the interaction of the "spins" with the lattice) are usually required to be small. Spin-lattice interaction can usually be reduced by cooling the lattice to a low absolute temperature; and indeed most solid-state paramagnetic masers operate at liquid nitrogen (77 K) or liquid helium (4.2 K) temperatures. Some ions and host lattices with which maser action has been achieved are listed in Table 2.

A "spin-spin" interaction process, known as cross-relaxation must also be taken into account, since it may either aid or inhibit maser action. Cross relaxation is dependent on spin concen­tration, but not on temperature. Consequently, maser action may be achieved at comparatively high temperature (77 K) but not at low tem­perature (4.2 K) where the considerably longer spin-lattice relaxation time might be expected to give better maser action. Rearrangement of the level populations occurs because of single or multiple quantum transitions between the levels, in which energy is "almost" conserved on the microscopic scale, any differential being ex­changed with the energy of the macroscopic spin system (total magnetic moment).

Amplifier Systems Maser amplifiers may be of either traveling-wave or resonant-circuit (cavity) form. Their performances are expressed in terms of a molecular Q-factor, Qₘ, defined over unit length for the traveling-wave maser and over the resonator volume for a cavity maser. At the signal frequency fₛ,

\[
Q_m = -2\pi f_s \times \frac{\text{Energy stored in the structure}}{\text{Power emitted by the molecules}} \tag{8}
\]

the minus sign appearing because the Q's similarly defined for losses are positive, so that Qₘ is negative for an emissive medium.

TABLE 2

Ion    Effective Spin    Host Lattice
Cr³⁺   3/2               Al₂O₃, alumina (ruby)
Cr³⁺   3/2               TiO₂, rutile
Cr³⁺   3/2               Be₃Al₂(Si₆O₁₈), emerald
Fe³⁺   5/2               Al₂O₃, alumina
Fe³⁺   5/2               TiO₂, rutile
Fe³⁺   5/2               Al₂SiO₅, andalusite


For a magnetic dipole transition,

\[
|Q_m| \propto \delta\,(N^* p_m^2 \eta)^{-1} \tag{9}
\]

where δ is the frequency width of the transition at half-intensity, N* is the excess upper-level population, pₘ is the effective dipole moment for the transition, and η is the ratio of the magnetic energy coupled to the molecules to that stored in the microwave circuit.

Traveling-wave Maser The active maser material is placed in a waveguide carrying a pure traveling wave. The gain coefficient αₘ is defined such that the power gain G for a length l of amplifier is given by

\[
G = \exp(2\alpha_m l) \tag{10}
\]

It can be shown that

\[
\alpha_m = \frac{2\pi f_s}{|Q_m|\, v_g} \tag{11}
\]

where v_g is the group velocity of radiation in the guide. Because pₘ is typically of the order of a Bohr magneton, and the active centers are diluted, it is necessary to use slow-wave structures (v_g ≈ c/100) in order to keep l to a reasonable value (a few centimeters). Suitable values of v_g are readily achieved by the resonant slowing obtained in periodic structures. Systems such as the Karp structure, comb structure, and meander line are favored, since these support waves with the magnetic field circularly polarized in a plane containing the direction of propagation and perpendicular to the plane of the periodic elements. A comb-structure traveling-wave maser is illustrated schematically in Fig. 1. The sense of circular polarization is reversed on crossing this plane and is opposite in any reflected wave to that in the forward wave. The nonreciprocal gyromagnetic properties of para- and ferrimagnetic materials may then be employed to obtain forward gain and reverse attenuation with these slow waveguides.


FIG. 1. A magnetic field is applied parallel to the "teeth" of the comb.


The noise input temperature T_in of a traveling-wave maser is given approximately by

\[
T_{\text{in}} \simeq |T_m| + T_1\,\frac{|Q_m|}{Q_\ell} \tag{12}
\]

where Tₘ is the effective negative temperature of the maser material, Qₘ is the molecular Q (negative), Q_ℓ is the similarly defined ohmic loss factor, and T₁ is the actual temperature of the waveguide (and contents). In this approximation, |Qₘ| < Q_ℓ. The bandwidth bₘ of the amplifier is approximately equal to, but less than, δ.
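A short numerical sketch of Eqs. (10)–(12); the molecular Q, slowing factor, amplifier length, and temperatures are invented illustrative values:

```python
import math

c = 3.0e8  # speed of light, m/s

def tw_maser_gain_db(f_s, Q_m_mag, v_g, length):
    """Power gain of a traveling-wave maser, Eqs. (10) and (11)."""
    alpha_m = 2 * math.pi * f_s / (Q_m_mag * v_g)   # gain coefficient, 1/m
    G = math.exp(2 * alpha_m * length)
    return 10 * math.log10(G)

def noise_input_temp(T_m_mag, T1, Q_m_mag, Q_loss):
    """Approximate noise input temperature, Eq. (12)."""
    return T_m_mag + T1 * (Q_m_mag / Q_loss)

# Illustrative: 10-GHz signal, v_g = c/100 slow-wave structure, 5 cm length
print(tw_maser_gain_db(f_s=10e9, Q_m_mag=500.0, v_g=c/100, length=0.05))  # ~18 dB
# Illustrative: |T_m| = 2 K material in a 4.2 K waveguide with Q_loss = 1e4
print(noise_input_temp(T_m_mag=2.0, T1=4.2, Q_m_mag=500.0, Q_loss=1e4))   # ~2.2 K
```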

The Resonant Circuit Maser may be of either transmission (two-port) or reflection (one-port) type; only the reflection type is considered here, since it is superior in performance to the transmission type. A reflection cavity maser and the necessary ancillary equipment are illustrated schematically in Fig. 2. Assuming that the unloaded resonant circuit (cavity) losses are negligible, the coupling to the external circuits will give rise to a Q-factor Q_e, say. The power gain G of the reflection cavity maser is then given by

\[
G = \frac{(Q_e + |Q_m|)^2}{(Q_e - |Q_m|)^2} \tag{13}
\]

The bandwidth b_c depends on the gain in such a way that

\[
G^{1/2}\, b_c \simeq \frac{2 f_s}{|Q_m|} \qquad (\text{for } G > 10, \text{ say})
\]

The noise input temperature is given by Eq. (12) above, where now

\[
Q_\ell^{-1} = Q_e^{-1} - |Q_m|^{-1}
\]
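To see how sharply the reflection-cavity gain depends on the coupling, the sketch below evaluates Eq. (13) and the gain-bandwidth relation for a few values of Q_e approaching |Qₘ|; all numbers are illustrative assumptions.

```python
def cavity_gain(Q_e, Q_m_mag):
    """Power gain of a reflection cavity maser, Eq. (13)."""
    return ((Q_e + Q_m_mag) / (Q_e - Q_m_mag))**2

def bandwidth(G, f_s, Q_m_mag):
    """Gain-bandwidth relation: sqrt(G) * b_c ~ 2 f_s / |Q_m| (for G > 10)."""
    return 2 * f_s / (Q_m_mag * G**0.5)

f_s, Q_m = 10e9, 2000.0          # illustrative: 10-GHz signal, |Q_m| = 2000
for Q_e in (3000.0, 2500.0, 2200.0):
    G = cavity_gain(Q_e, Q_m)
    print(f"Q_e = {Q_e:6.0f}:  G = {G:6.1f}   b_c ~ {bandwidth(G, f_s, Q_m)/1e6:5.2f} MHz")
# Gain grows without limit as Q_e approaches |Q_m| (the oscillation threshold),
# while the usable bandwidth shrinks as 1/sqrt(G).
```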

FIG. 2. A reflection cavity maser (schematic). The input signal passes through a circulator to a resonant cavity containing the active medium; a matched termination absorbs power from the unused circulator port.


It is necessary to have some nonreciprocal device to separate the reflected amplified output from the input signal; the ferrite circulator is most commonly used. The bandwidth and gain stability of the cavity maser are inferior to those of the traveling-wave maser, but the cavity maser is more easily constructed. If three-level excitation is used, it is clear that any maser system must support both "pump" and signal frequencies.

Maser Oscillators Equation (13) indicates that if |Qₘ| is small enough, G becomes infinite; i.e., oscillation occurs when the stimulated emission is large enough to overcome all losses. The width of the signal emitted by a maser oscillator is very much less than δ, so that for narrow δ an extremely pure oscillation signal results, and a molecular transition which is relatively insensitive to external influences will thus give oscillations of high stability in frequency. The ammonia maser and the atomic hydrogen maser are two examples.

Two hydrogen maser oscillators have been compared against each other, and the relative frequency over several hours was stable to 1 part in 10¹⁴. This stability has allowed the determination of the level separation of atomic hydrogen and its isotopes with greatly increased precision (e.g., 0.1 Hz in 1,420,405,751.768 Hz), and has thus allowed a further test of theoretical quantum electrodynamics and given a more precise value for the fine structure constant. Hydrogen masers have also been flown in missiles and aircraft for the detection of small relativistic effects of the motion.

The Electron-Cyclotron Maser Relativistic free electrons gyrating in a static magnetic field undergo free-free transitions. If there is an energy-dependent level width, or an energy-dependent level spacing for the quantized free electron states, and a population inversion is also present, stimulated emission and hence amplification can occur.

The classical picture of cyclotron maser action can be obtained by considering the phases of electrons gyrating about the magnetic field. Each charge will radiate as an electric dipole; a multipolar contribution also occurs when relativistic effects are properly taken into account. Consider a system of monoenergetic electrons, initially distributed randomly in phase. If a phase-bunching mechanism exists, coherent emission will take place. Because of the relativistic mass change, this bunching does in fact occur. Electrons absorbing radiation become more massive and fall back in phase; electrons emitting radiation become less "massive" and advance in phase. The ultimate phase distribution favors emission over absorption, thus increasing the intensity of the incoming electromagnetic wave.

From the quantum-mechanical viewpoint, a free electron in a uniform static magnetic field B is an anharmonic oscillator with quantized energy levels (neglecting spin)


\[
W_n = mc^2\left[1 + \frac{2\left(n + \tfrac{1}{2}\right)\hbar\Omega_0}{mc^2}\right]^{1/2} - mc^2 + \frac{p^2}{2m}
\]

where Ω₀ is the rest electron gyrofrequency eB/m, and p is the unquantized momentum along the direction of B. Transitions between states |n + 1⟩ and |n⟩ occur at angular frequency ωₙ = (1 − nħΩ₀/mc²)Ω₀ for nħΩ₀ ≪ mc²; note that ωₙ decreases as n increases. If a system has a greater population in the state |n + 1⟩ than in |n⟩, photons of angular frequency ωₙ will induce more (downward) transitions |n + 1⟩ → |n⟩ than (upward) transitions |n⟩ → |n + 1⟩, because of the unequal level spacing; hence stimulated emission exceeds absorption. If the width of the level |n + 1⟩ exceeds that of |n⟩, a similar effect can occur. This situation can be obtained by causing the radiating atoms to suffer elastic phase-interrupting collisions (e.g., with neutral atoms) if the energy-dependent collision cross section is sufficiently strong.

In a typical device, the gyrotron, a solenoid creates an axially symmetric magnetic field about a gently tapering waveguide system, whose different sections act as the interaction space (an open or "quasi-optical" cavity) and the output (and if necessary, input) apertures. The electrons are produced at a cathode with a large emitting surface and accelerated towards a collector. The magnetic field increases in intensity from the cathode to the interaction space, which has an almost uniform magnetic field. In the nonuniform field region, the electron orbital velocity v⊥ grows from the initial cathode orbital velocity according to v⊥²/B = constant; the orbital energy is drawn from that of the longitudinal motion and from the accelerating electrostatic field. The electrons deliver up the RF energy in the interaction space, then pass through a section of decreasing magnetic field to an extended-surface collector. Such devices are capable of powers of many kilowatts in the millimeter and submillimeter wavelength regions.
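The operating frequency of such a device is set by the electron gyrofrequency Ω₀ = eB/m, reduced by the relativistic mass increase of the gyrating electrons; the solenoid field and beam energy in the sketch below are assumed values chosen only for illustration.

```python
import math

e = 1.602e-19      # electron charge, C
m = 9.109e-31      # electron rest mass, kg
c = 3.0e8          # speed of light, m/s

def gyro_frequency_hz(B_tesla, kinetic_energy_keV=0.0):
    """Cyclotron frequency f = eB/(2*pi*gamma*m); gamma accounts for the
    relativistic mass increase discussed in the text."""
    gamma = 1.0 + kinetic_energy_keV * 1e3 * e / (m * c**2)
    return e * B_tesla / (2 * math.pi * gamma * m)

# Assumed: a 1-T solenoid field and an 80-keV electron beam
print(f"rest gyrofrequency: {gyro_frequency_hz(1.0)/1e9:.1f} GHz")      # ~28.0
print(f"with 80 keV beam:   {gyro_frequency_hz(1.0, 80)/1e9:.1f} GHz")  # ~24.2
```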

Applications Maser amplifiers are now in use wherever the requirement for a very-low-noise amplifier outweighs the technological problems of cooling to low temperatures. They have been used in passive and active radioastronomical work, in satellite communications ("Project Echo"), and as preamplifiers for microwave spectrometry. The "deep-space tracking" stations around the world use ruby maser preamplifiers for the reception of signals from planetary probes. The ammonia and the atomic hydrogen masers are being studied as frequency standards and have been used in a new accurate test of special relativity. Sources and amplifiers in the submillimeter, micron, and optical wavelength regions are being studied and developed (see LASER). Maser theory has been used to explain numerous atomic and molecular emission lines observed in radio astronomy ("celestial masers").

G. J. TROUP

References

Andronov, A. A., Flyagin, V. A., Gaponov, A. V., Gol'denberg, A. L., Petelin, M. I., Usov, V. G., and Yulpatov, V. K., "The Gyrotron: High-Power Source of Millimetre and Submillimetre Waves," Infrared Physics 18, 385 (1978).

Hirshfield, J. L., and Granatstein, V. L., "The Electron Cyclotron Maser: An Historical Survey," IEEE Transactions on Microwave Theory and Techniques MTT-25(6), 522 (1977).

Microwave Journal Staff, "Low Noise Maser for Radio Astronomy," Microwave J. 21(3), 52 (1978).

Ramsey, N. F., "Hydrogen Maser Research," in "Fundamental and Applied Laser Physics," Proceedings of the Esfahan Symposium (Feld, M. S., Javan, A., and Kurnit, N. A., Eds.), New York, John Wiley & Sons, 1973.

Weber, J., Rev. Mod. Phys. 31, 681 (1959).

Books

Cook, A. H., "Celestial Masers," Cambridge Mono­graphs on Physics, Cambridge, U.K., Cambridge Univ. Press, 1977.

Ishii, T. Koryu, "Maser and Laser Engineering," Hun­tington, N.Y., Krieger, 1980.

Orton, J. W., Paxman, D. H., and Walling, J. C., "The Solid State Maser," London, Pergamon, 1970.

Siegman, A., "Microwave Solid State Masers," New York, McGraw-Hill, 1964.

Siegman, A., "An Introduction to Lasers and Masers," New York, McGraw-Hill, 1971.

Troup, G., "Masers and Lasers," 2nd Edition, London, Methuen and Co., 1963.

Weber, J. (Ed.), "Masers," International Science Review Series, New York, Gordon and Breach, 1967.

Cross-references: COHERENCE, ELECTRON SPIN, FERRIMAGNETISM, LASER, LIGHT, MICROWAVE SPECTROSCOPY, MICROWAVE TRANSMISSION, OPTICAL PUMPING, PARAMAGNETISM, QUANTUM THEORY, ZEEMAN AND STARK EFFECTS.

MASS AND INERTIA

Mass is, along with length and time, one of the three fundamental undefinables of Newtonian dynamics (the explanation of matter in motion). These three undefinables combine to form the important operational quantity called force. One in turn may consider force as a measure of the interaction of mass with its environment.

Force is defined in such a way that it is dependent on either one of two intrinsic properties of matter, called gravitation and inertia. First, all matter exerts an attractive force on all other matter (active gravitational mass) and is in turn attracted by all other matter (passive gravitational mass). Secondly, all matter resists any change in its motion (inertial mass).

The gravitational role of mass is quantified by Newton's universal law of gravitation,

\[
\mathbf{F} = -\frac{G m_1 m_2}{r^2}\,\hat{\mathbf{u}}_r \tag{1}
\]

wherein the force F of attraction (designated by the negative sign) between masses m₁ and m₂ is inversely proportional to the square of the distance r between their centers of mass, and directly proportional to the magnitudes of m₁ and m₂. F is called a central force because it acts in a straight line (the direction of which is given by the unit vector û_r) which originates at the center of mass m₁ and terminates at the center of mass m₂.

This force is independent of the physical dimensions of m₁ and m₂. Its strength, compared with forces of other kinds such as that of electricity, is determined by the size of the gravitational constant G. In view of the high degree of accuracy over centuries of application inhering in Eq. (1), which does not require any specification as to which of the two masses is the active and which is the passive gravitational mass, no quantitative difference is expected between these two states of gravitational mass. Further, Newton's third law of action and reaction, to be considered below, implies their equality.

The inertial property of mass is described by Newton's three laws of motion. According to the first law, a body will continue in its state of rest or in a state of uniform motion (unchanging speed along a straight line) unless acted upon by a net external force. This means, for example, that the earth, instead of orbiting about the sun, would move in a straight line and thus escape from the solar system. The "escape" is prevented by the force of gravitational attraction between the sun and the earth. It also means that motion of mass can exist without force. It is through Newton's second and third laws, however, that an operational definition of inertial mass is achieved, one based on the concept of force, which can be a push or a pull (such as that inherent in the force called weight). The second law states that the net force F is equal to the change (or derivative) of momentum with time, momentum being the product of mass m and its velocity v:

\[
\mathbf{F} = \frac{d(m\mathbf{v})}{dt} \tag{2a}
\]

Taking the derivative of the momentum with respect to time leads to a two-term expression for the force F:

\[
\mathbf{F} = m_i \mathbf{a} + \frac{dm}{dt}\,\mathbf{v} \tag{2b}
\]


For most conventional situations the inertial mass is constant, so that the second term (dm/dt)v is equal to zero, thereby leaving F = mᵢa. The acceleration a, which is the derivative of velocity v with respect to time (dv/dt), represents a time-dependent increase or decrease in the velocity of the mass or a change in its direction of motion such as that encountered in rotation. Velocity is speed in a specific direction and is an example of a vector; namely, it has magnitude (i.e., speed) and a specific direction. In the case of the unit vector û_r, the magnitude is one. Operationally, Eq. (2b) states that when the force F is fixed, mass and acceleration are inversely proportional to each other. The interdependence of the concepts of force and mass reflected by F = mᵢa emphasizes the lack of a unique definition of what inertial mass is. The latter, however, may not prove ultimately to be a drawback, for the following reason. At least three of the hierarchy of four forces, i.e., gravitation, weak, electromagnetism, and strong, contain empirical constants, such as the permittivity of free space ε₀, the gravitational constant G, or the Fermi coupling constant G_F. A delineation of the origin of these constants would provide a deeper insight into the nature of the interaction of mass with its environment, as well as a potentially richer understanding of the nature of mass itself.

For example, Einstein, in banishing the concept of absolute velocity by focusing attention on the role of the relativity of the velocity term in Eq. (2b), enlarged the Newtonian operational view of the concept of inertial mass. In his theory of special relativity, the amount of mass associated with an object was shown to be a function of its relative velocity,

\[
m = \gamma m_0 \tag{3a}
\]

a dependence quantified by the Lorentz factor γ,

\[
\gamma = \left(1 - \frac{v^2}{c^2}\right)^{-1/2} \tag{3b}
\]

Here c, the speed of light in a vacuum, is considered independent of any motion of its source. The total mass of an object m was now considered to be the sum of a constant, intrinsic rest mass m₀ and an increment of variable mass depending upon its velocity relative to some reference frame. This latter effect negates defining mass as a measure of the quantity of matter in a body. Significantly, mass could now be shown to be equivalent to energy via the relationship E = mc². It is worth noting that this energy expression for mass does not reflect the fact that it is considerably easier for stable mass to be converted into radiant energy than the reverse. Significantly, the special relativistic Newtonian operational definition of mass has been found to be highly compatible with the concept of electromagnetic fields as well as with the quantum of action. This success has culminated in the extraordinarily precise discipline of quantum electrodynamics.
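As a worked instance of Eqs. (3a) and (3b), the sketch below evaluates the velocity-dependent mass at a few speeds; the speeds are arbitrary illustrative choices.

```python
import math

def relativistic_mass(m0, v, c=2.998e8):
    """Total mass m = gamma * m0, Eqs. (3a) and (3b)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c)**2)
    return gamma * m0

m0 = 9.109e-31  # electron rest mass, kg
for fraction in (0.1, 0.6, 0.99):
    v = fraction * 2.998e8
    m = relativistic_mass(m0, v)
    print(f"v = {fraction:4.2f} c  ->  m/m0 = {m/m0:.3f}")
# v = 0.10 c -> 1.005;  v = 0.60 c -> 1.250;  v = 0.99 c -> 7.089
```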

It should not be concluded, however, that the concept of force is a unique prerequisite in Newtonian mechanics for the definition of inertial mass. Mach proposed an experiment for determining relative inertial mass free of any consideration of force, including that of weight. Two masses M and m on a frictionless surface are held together against a compressed spring between them. When released, they fly apart under the influence of the equal and opposite forces exerted by the spring. The second law states that the product of the mass and the magnitude of the acceleration for each object should be identical, or

\[
mA = Ma \tag{4a}
\]

If we replace M by one standard unit of mass, the kilogram, i.e., M = 1, then

\[
m = \frac{a}{A} \tag{4b}
\]

Mass m is therefore operationally defined as the ratio of two accelerations, i.e., that of the standard, a, and that of the object, A; the ratio of the two accelerations is constant for a particular body and thereby quantifies an intrinsic property of matter. Its validity requires that Newton's third law hold, namely that for any force of action there must exist an equal and opposite force of reaction.

From the time of Galileo, there has existed reason to believe that the two intrinsic properties of mass might have a common origin. Since the mass of a falling object on earth is simultaneously inertial (mᵢ) and gravitational (m_G), one can equate Eqs. (1) and (2b), where, on earth, Mₑ is the mass of the earth and r is the distance from the center of the earth to the falling object:

\[
m_i \mathbf{a} = -\frac{G M_e m_G}{r^2}\,\hat{\mathbf{u}}_r \tag{5}
\]

To a surprising degree of accuracy (about 1 part in 10¹²) the acceleration for all falling bodies in the vicinity of the earth is equal to −GMₑ/r², more commonly designated as g ≈ 9.8 newtons/kilogram. It is the variation in r in the factor g, and not a change in mass, that causes the weight of a given object to be location dependent. This means that to the extent that all objects fall to the ground with the same acceleration, the ratio m_G/mᵢ, at fixed r, in Eq. (5) is constant, so that mᵢ can be set equal to m_G by redefining the constant G. Einstein seized upon this apparent experimental equality of inertial and gravitational mass and made it absolute, thereby adopting the suggestion of Mach that inertial mass is simply the gravitational attraction of all the mass in the universe for the matter undergoing a change in its motion. This enabled him to reduce the number of fundamental undefinables of dynamics to two, namely length (three dimensions) and time. If we multiply the latter by the postulated invariant speed of light c, we obtain a fourth length, or the fourth dimension. Thus, dynamics is reduced to four-dimensional geometry, and the existence of mass is indirectly reflected by the warpage of this four-dimensional geometric structure. (The unusual behavior of black holes can be attributed to optical effects encountered under conditions of extreme warpage.) From a Newtonian standpoint, general relativity banishes absolute acceleration and quantifies the relativity of acceleration.
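Equation (5) can be checked numerically: with standard values of G, the earth's mass, and its mean radius, the acceleration GMₑ/r² indeed comes out near the 9.8 N/kg quoted above.

```python
G = 6.674e-11    # gravitational constant, N m^2 / kg^2
M_e = 5.972e24   # mass of the earth, kg
r = 6.371e6      # mean radius of the earth, m

g = G * M_e / r**2          # magnitude of the acceleration in Eq. (5)
print(f"g = {g:.2f} N/kg")  # ~9.82, matching the quoted 9.8 N/kg

# The variation with r that makes weight location dependent:
r_mountain = r + 8.8e3      # e.g., at roughly the height of Mt. Everest
print(f"fractional change: {(G*M_e/r_mountain**2 - g)/g:.2e}")  # ~ -2.8e-3
```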

General relativity, in eliminating mass as a fundamental undefinable through the removal of any distinction between inertial and gravitational mass, redefines mass as a measure of the curvature of space-time. Despite many experimental confirmations involving matter and light, general relativity has resisted accommodation with electric charge, as well as with the related quantum of action h. Further, recent measurements of solar oblateness by Hill, if confirmed, would suggest that a small portion (~1.4%) of the advance of the perihelion of Mercury (43″/century), until now a stunning, unique, and exact prediction of general relativity, may have to be attributed to some other mass effect. It therefore remains an open question whether a description of mass based upon the elimination of all distinction between inertia and gravitation can be maintained.

B. A. SOLDANO

References

General Relativity

Klein, H. Arthur, "The New Gravitation: Key to In­credible Energies," Philadelphia and New York, J. B. Lippincott Company, 1971.

Sciama, D. W., "The Unity of the Universe," Garden City, N.Y., Anchor Books (Doubleday and Co.), 1959.

Lieber, L. R., and Lieber, H. G., "The Einstein Theory of Relativity," New York and Toronto, Rinehart and Company, Inc., 1945.

Black Holes

Taylor, J. G., "Black Holes," New York, Avon Books, 1973.

Penrose, Roger, "Black Holes," Scientific American, pp. 38-46, May 1972.

Ruffini, Remo, and Wheeler, John A., "Introduction to Black Holes," Physics Today 24(1), 30-41 (January 1971).

Newtonian Physics and Special Relativity

Hecht, E., "Physics in Perspective," Reading, Mass., Addison-Wesley Publishing Company, 1980.


Eisberg, R. M., and Lerner, L. S., "Physics: Foundations and Applications," Vol. I, New York, McGraw-Hill, 1981.

French, A. P., "Special Relativity," M.I.T. Introductory Physics Series, New York, W. W. Norton and Company, Inc., 1966.

Dicke, R. H., "The Solar Oblateness and the Gravitational Quadrupole Moment," Astrophysical Journal 159, 1-24 (January 1970).

Cross-references: DYNAMICS, FRICTION, IMPULSE AND MOMENTUM, MECHANICS, RELATIVITY, STATICS.

MASS SPECTROMETRY

Mass spectrometry is based on observations of the behavior of positive rays by Thomson and Wien. In 1919, Aston demonstrated the existence of isotopes by introducing neon gas into a mass spectrograph. Prior to 1940, mass spectrographs and spectrometers were used primarily for isotopic studies in university laboratories. Analytical spectrometers became commercially available during the early years of World War II, when their use for the rapid analysis of hydrocarbon mixtures was recognized.

Mass spectrometry provides information concerning the mass-to-charge ratio and the abundance of positive ions produced from gaseous species. There are several techniques for the production and measurement of the ions, and the design of an instrument is determined by its proposed application. The mass spectrograph, using a photographic plate for ion detection, had been used primarily for isotopic studies but later was used for the analysis of trace constituents in solids. The mass spectrometer uses an electrical detection and recording system giving a metered output that provides a more accurate measure of the abundance of the ions than the photographic plate provides. The mass spectrometer is used primarily for the quantitative analysis of gases, liquids, and a limited number of solids.

The five basic components of the instrument are the sample introduction system, the ion source, the mass analyzer, the ion detector, and the recorder. A sample pressure of approximately 5 × 10⁻⁵ torr is generally required for a satisfactory analysis. An elevated-temperature inlet system or other means of converting the sample into a gaseous state is required for less volatile species. Direct insertion probes for introducing samples directly into the region of ionization are available on many instruments.

The most common methods of producing positive ions are electron impact, thermal ionization, spark, field emission desorption, and chemical ionization. The electron impact source is the most widely used. Positive ions are produced by removing one or more electrons from the molecules. Thermal ionization produces positive ions by vaporizing a material directly into the ion source from a filament coated with the sample. With a spark source, the material under investigation must be a conductor, or else suitable means must be provided for initiating and maintaining a spark. Ions produced in the spark are taken directly into the mass analyzer. In the field emission source, a high potential is applied between the sample (generally deposited on the tip of a tungsten wire) and another electrode. Ionic species representative of the sample are removed by a high-intensity electric field. Chemical ionization involves the reaction, directly in the ion source, of ions from a reactant gas such as methane with molecules of the sample.

The three most widely used types of mass spectrometers are (1) the single-focusing, magnetic deflection, (2) the double-focusing, and (3) the quadrupole. These three types of instruments differ primarily in the method used for mass separation. The single-focusing analyzer achieves direction but not velocity focusing of the ions. Ions of the same mass-to-charge ratio, having slightly different velocities resulting from different kinetic energies imparted in the ionization process, will not be focused simultaneously, thus producing a broadening of the peak. The resolution of commercially available instruments of this type is generally limited to about one part in 500. That is, mass 499 can be separated from mass 500 with about a 10% valley. With double-focusing instruments, an electric sector and a magnetic analyzer are placed in tandem to produce both velocity and direction focusing of the ions. Several commercial models of the double-focusing design are available having resolutions in excess of one part in 50,000 with an electron impact source, and greater than one part in 3000 with a spark source. The double-focusing geometry is necessary with spark source operation because of the wide energy spread of the ions produced in the spark. With both single and double focusing, the resolution varies directly with the radius of the analyzer tube and inversely with the width of the slits located in the ion source and ion collector regions. Sensitivity, the abundance of the ions collected per unit sample charge, varies inversely with slit width, and a compromise must be made between resolution and sensitivity. Combined gas chromatography-mass spectrometry with appropriate data processing units is a popular type of instrumentation. Separation of complex mixtures into individual components as well as analysis is accomplished by mating these two types of instrumentation. Recently, two mass analyzers have been used in tandem for the technique termed MS/MS. The first analyzer is followed by a reaction region and then by a second analyzer. Chemical ionization is commonly used for the source of positive ions. In this technique, a spectrum is obtained of ions at a single m/e in the initial spectrum. A major advantage is the increased capability to detect individual organic compounds in complex mixtures.

The two types of ion detection and recording systems are the photographic plate and the electrical detector. The photographic method is commonly used with double-focusing instruments such as the Mattauch-Herzog design that focuses all ions simultaneously in one plane. The photographic plate records a complete spectrum (mass range ~36:1) in a time interval of a few seconds to 10 minutes. However, the response of the plate to the ion intensity is nonlinear, and quantitative results are more difficult to obtain than by electrical detection. Electrical detection systems use an ion collector, amplifier (commonly an electron multiplier), and recorder.

Positive and negative ions and neutral species are produced by the electron bombardment of molecules. The mass spectrum of a compound is a record of the positive ions collected. Positive ions are produced by the removal of one or more electrons from the molecule and by the rupture of one or more bonds, fragmenting the molecule. While the majority of the positive ions are singly charged, doubly and triply charged ions are observed in many instances. Certain mass ions produced from organic molecules must be attributed to the rearrangement of hydrogen atoms during the ionization and fragmentation processes. Metastable ions, formed when ions decompose while traversing the path to the collector, are also frequently observed. Metastable ions generally appear at nonintegral mass units and produce broad, low-intensity peaks.

In the electron impact source, the electron energy is usually adjusted to 50-70 eV, which is considerably above the appearance potential for molecular and fragment ions. For simplification of a complex spectrum, the bombarding energy can be reduced to provide sufficient energy to ionize the molecule but not enough to rupture bonds, thus achieving a spectrum consisting primarily of molecular ions. Mass ions appearing in the normal mass spectrum correspond to the various atoms and combinations of atoms in the original molecule. The pattern of mass-ion intensities observed is independent of pressure. Differences in the patterns obtained for various compounds can be used as the basis for the analysis of complex mixtures. Quantitative analysis is based on the ion current varying linearly with the partial pressure of the gas.
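Since each component's ion current varies linearly with its partial pressure, the spectrum of a mixture is, to first order, a linear superposition of the pure-component patterns, and the composition can be recovered by solving a linear system. A minimal sketch (the pattern coefficients and peak heights are invented for the example):

```python
import numpy as np

# Hypothetical calibration: each column is a pure component's pattern
# (ion current per unit partial pressure at three selected m/e peaks).
patterns = np.array([
    [10.0, 0.5],   # peak 1: mostly component A
    [1.2, 8.0],    # peak 2: mostly component B
    [2.0, 3.0],    # peak 3: both contribute
])

measured = np.array([21.7, 18.4, 10.1])   # peak heights for the mixture

# Least-squares solution for the two partial pressures
pressures, *_ = np.linalg.lstsq(patterns, measured, rcond=None)
print(pressures)
```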

Some of the common uses for mass spectrometry include analysis of petroleum products, identification of drugs, determination of the structure of organic molecules, determination of trace impurities in gases, residual vacuum studies and leak detection in high-vacuum systems, geological age determinations, tracer techniques with stable isotopes, determination of unstable ionic species in flames, identification of compounds separated by gas chromatography (combined gas chromatography-mass spectrometry), trace element analysis in metals and other solids, and microprobe studies of surfaces and of surface composition of various materials.

A. G. SHARKEY, JR.

References

1. Duckworth, Henry E., "Mass Spectroscopy," Cambridge, U.K., Cambridge Univ. Press, 1958.

2. Beynon, J. H., "Mass Spectrometry and its Applications to Organic Chemistry," Amsterdam, Elsevier Publishing Co., 1960.

3. Biemann, K., "Mass Spectrometry: Organic Chemical Applications," New York, McGraw-Hill Book Co., 1962.

4. McLafferty, F. W. (Ed.), "Mass Spectrometry of Organic Ions," New York, Academic Press, 1963.

5. Hill, H. C., "Introduction to Mass Spectrometry," 2nd Ed., London, Heyden and Son, Ltd., 1973.

6. Hamming, Mynard C., and Foster, Norman G., "Interpretation of Mass Spectra," New York, Academic Press, 1972.

7. McLafferty, Fred W., "Tandem Mass Spectrometry," Science 214 (1981).

8. Cooks, Graham R., and Glish, Gary L., "Mass Spectrometry/Mass Spectrometry," C & EN, November 30, 1981.

Cross-references: FIELD EMISSION, IONIZATION, ISOTOPES, SPECTROSCOPY.

MATHEMATICAL PHYSICS

The term "mathematical physics" is almost synonymous with "THEORETICAL PHYSICS," but their difference is significant. It is like the difference between the descriptions of the electromagnetic field by Maxwell and by Fara­day respectively. The theoretical (nonmathe­matical) description draws on analogies between elements of the field and familiar mechanical models-stretched strings, compressed fluids, vortex motion, etc.; the mathematical descrip­tion made use of the abstract analytical proper­ties of the elements of the field to set up a purely symbolic description without mechanical models. Classical theoretical physics was largely mathematical in content, but was nevertheless based on mechanical models in the spirit of Faraday's theory of the electromagnetic field. The atom and interactions between atoms were regarded as the "real," "external" objects in terms of which all physical phenomena could be explained. The mathematical formalism was merely a handy tool or language in terms of which to set up the explanation. The atoms themselves were not explained, but regarded as the fundamental "building blocks" of the phy­sical world.

Einstein's theory of RELATIVITY is a magnificent historical example of mathematical physics we may cite to contrast with the classical atomic theory. Here mathematical abstractions (Minkowski space 4-vectors, Riemannian tensors, etc.) were invented or adopted from the stock-in-trade of pure mathematicians, with analytical properties that were seen to match those of the data of experimental physics: velocities, forces, field variables, etc. Then the logical (i.e., mathematical) consequences of relations among these abstractions predicted new and unexpected relations among either already known or as yet undiscovered data of experimental physics. The construction of a self-consistent mathematical description of all physical phenomena, without the use of hypothetical building blocks of any kind, is the aim of mathematical physics as distinct from theoretical physics.

As a more recent example of this same concept, one may cite the deduction of conservation laws from generalized symmetry principles. The classically familiar conservation laws (energy, momentum, angular momentum) have for some time now been recognized as logical consequences of the homogeneity and isotropy of time and space. Interpreting these homogeneities and this isotropy as meaning the invariance of physical laws under translation and rotation, one is led naturally to the theorem that each invariance principle corresponds to an appropriate conservation law. This theorem has yielded very significant discoveries in the study of elementary particles and nuclear structure, where the most cunningly devised classical models have been not only fruitless but actually misleading. A non-technical account of this subject appears in Chapter 27 of R. K. Adair's text "Concepts in Physics," New York, Academic Press, 1969.

The activities of mathematical physicists have resulted in the invention of new mathematical abstractions, some of which were at first rejected by pure mathematicians as illogical, only later to be granted a respectable status in the vocabulary of pure mathematics. Examples include Oliver Heaviside's operational calculus, J. Willard Gibbs' vector analysis, and P. A. M. Dirac's delta-function techniques. On the other hand, many branches of pure mathematics which initially had been regarded as so abstract as to be entirely "useless" have been found by mathematical physicists to serve as remarkably useful tools in describing physical phenomena. Examples include non-Euclidean geometry in the problems of COSMOLOGY; function space in modern QUANTUM MECHANICS; spinor analysis, or the theory of binary forms, in quantum FIELD THEORY. Again, collaboration between mathematical physicists and mathematicians has in recent years resulted in the construction of new disciplines of great value, examples being group theory, operations analysis, the theory of random functions, information theory, and CYBERNETICS. The names of many contemporary scientists are involved here, including Eugene P. Wigner, John von Neumann, C. E. Shannon, Norbert Wiener, and many others.

On closer examination it becomes difficult to distinguish clearly between mathematical physics and applied mathematics; very frequently the same individual may be responsible for discoveries in both areas. Classical examples of this may be cited: Isaac Newton, Laplace, Carl Friedrich Gauss, Henri Poincare, David Hilbert, Ernst Mach, A. N. Whitehead. Evidently our attempt to define mathematical physics is degenerating into a simple catalog of items with only a vague hint of general characteristics common to all particulars. Physics has sometimes been defined as what physicists do, and one is expected to recognize the physicist without need for further definition than his own affirmation. Mathematical physics may then be defined as what physicists do with mathematics, or what mathematicians do with physics, or some superposition of the two. As the history of mathematical physics unfolds, it becomes apparent that activity tends to cluster in a few fruitful directions at any one time. Current interests can be judged from the contents of the leading journals devoted to the subject; among these the reader should consult the Journal of Mathematical Physics and the Physical Review, published by the American Institute of Physics; the Proceedings of the Cambridge Philosophical Society; Comptes Rendus (French Academy of Sciences); Progress of Theoretical Physics (Japan); Nuovo Cimento (Italy); Indian Journal of Theoretical Physics; Zhurnal Eksperimental'noy i Teoreticheskoy Fiziki (USSR) (in English translation, "JETP"); and other translations published by the American Institute of Physics. Probably the most popular fields in recent years have been in the wide application to solid-state physics and statistical mechanics of quantum field theoretical techniques introduced initially to deal with the phenomena of high energy physics: nuclear interactions, creation and destruction of particles, etc.

During the last decade the explosive growth in the availability of miniature computers and the wide accessibility of large computer facilities have had a revolutionary impact on the thinking of mathematical physicists, if not on mathematical physics itself. Exceptionally complicated problems that were easily proven "soluble in principle" have become soluble in practice, and this fact alone has freed creative thinkers to ask ever more elaborate and useful questions: e.g., finding the eigenvalues of very large matrices, solving nonlinear differential equations and adjusting parameters to make the solutions match experimental data, and simulating the behavior of systems with many degrees of freedom,* with the resulting understanding of shock formation in high-speed phenomena. For these purposes there are now several libraries of program subroutines useful in the study of literally thousands of unexplored problems. Computational projects that were never even contemplated because they would have taken decades to complete can now be undertaken and completed in hours or even minutes. The other side of the coin is that such masses of numerical data are generated that results can take years to digest and interpret. This difficulty is ameliorated in the simpler cases by the recently developed graphics display capability.

*A good example: Abraham, F. F., "Computer Simulation of Diffusion Problems ...," IBM Data Processing Division, and The Materials Science Department, Stanford University, Palo Alto, California, 1971.
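The following sketch illustrates, in miniature, two of the computations mentioned above: diagonalizing a large matrix and adjusting model parameters to match (here synthetic) data. The model function and all numbers are invented for the example:

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)

# Eigenvalues of a large symmetric matrix (here 1000 x 1000):
M = rng.standard_normal((1000, 1000))
eigenvalues = np.linalg.eigvalsh(M + M.T)    # symmetrize, then diagonalize

# Adjusting parameters so a model matches "experimental" data:
def model(t, amplitude, rate):
    return amplitude * np.exp(-rate * t)

t = np.linspace(0.0, 5.0, 50)
data = model(t, 2.0, 1.3) + 0.05 * rng.standard_normal(t.size)
params, covariance = curve_fit(model, t, data, p0=(1.0, 1.0))
print(eigenvalues[:3], params)
```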

The most significant advance to date would seem to be the possibility of programming computers to perform abstract algebraic analysis without the use of purely numerical input. This may or may not help eliminate the one most serious weakness of computer science: computer programming is so absorbing an occupation that an expert can hardly be expected also to understand the historic significance, or even relevance, of the major problems at the frontiers of knowledge in mathematical physics. Perhaps the most serious error committed by schools of computer science has been to exempt their students from a solid course on mathematical physics. Physicists who wish to keep in touch with advances in computer science should watch the continuing series of annual volumes "Methods in Computational Physics" (Alder, Fernbach, Rotenberg, Eds.; Academic Press).

A philosophy of mathematical physics has gradually evolved with all this creative activity. For current thinking in this area, the reader is referred to two pertinent journals that have appeared in the past few years: "Foundations of Physics" and "International Journal of Theoretical Physics," both from Plenum Press. Semipopular expositions are available in a number of recent texts in addition to the one by Adair, cited above: Kenneth W. Ford, "The World of Elementary Particles," 1967; F. A. Kaempffer, "The Elements of Physics," 1967; and Kenneth W. Ford, "Basic Physics," 1968; all from Blaisdell Publishing Co.

As for college level texts on mathematical physics, there are very few that are organized around physical concepts. We may refer to the classic series of volumes by Arnold Sommerfeld, and those by Slater and Frank, as prototypes. The standard modern work is the two-volume set by Morse and Feshbach, "Methods of Theoretical Physics," New York, McGraw-Hill, 1953. A less ambitious volume, William Band, "An Introduction to Mathematical Physics," New York, Van Nostrand Reinhold, 1959, served as an overall survey for advanced undergraduates. There is now a multitude of excellent texts whose major emphasis is the various mathematical techniques employed in theoretical physics, and we cite only three: Margenau and Murphy, "The Mathematics of Physics and Chemistry," New York, Van Nostrand Reinhold, 2 volumes, 1956 and 1964; George Arfken, "Mathematical Methods for Physicists," New York, Academic Press, 1966; and James T. Cushing, "Applied Analytical Mathematics for Physical Scientists," New York, John Wiley & Sons, 1975.

WILLIAM BAND

Cross-references: COSMOLOGY, CYBERNETICS, FIELD THEORY, MATHEMATICAL PRINCIPLES OF QUANTUM MECHANICS, QUANTUM THEORY, RELATIVITY, THEORETICAL PHYSICS.

MATHEMATICAL PRINCIPLES OF QUANTUM MECHANICS

Classical Newtonian mechanics assumes that a physical system can be kept under continuous observation without thereby disturbing it. This is reasonable when the system is a planet or even a spinning top, but is unacceptable for microscopic systems such as an atom. To observe the motion of an electron, it is necessary to illuminate it with light of ultrashort wavelength (e.g., γ-rays); momentum is transferred from the radiation to the electron and the particle's velocity is therefore continually disturbed. The effect upon a system of observing it can never be determined exactly, and this means that the state of a system at any time can never be known with complete precision; this is Heisenberg's uncertainty principle. As a consequence, predictions regarding the behavior of microscopic systems have to be made on a probability basis and complete certainty can rarely be achieved. This limitation is accepted and is made one of the foundation stones upon which the theory of quantum mechanics is constructed.

Any physical quantity whose value is measured to determine the state of a physical system is called an observable. Thus, the coordinates of a particle, its velocity components, its energy, or its angular momentum components are all observables for the particle. A pair of observables of a system are compatible if the act of measuring either does not disturb the value of the other. The cartesian coordinates (x, y, z) of a particle are mutually compatible, but the x-component of its momentum p_x is incompatible with x (similarly, y, p_y are incompatible, etc.). (p, q, ..., w) constitute a maximal set of compatible observables if they are compatible in pairs and no observable is known which is compatible with every one of them. In quantum mechanics, the state of a system at an instant is fully specified by observing the values of a maximal set of compatible observables. Such a state is called a pure state. The act of observing a system in a pure state disturbs the system in a characteristic manner; the system is accordingly said to be prepared in the pure state by the observation. The cartesian coordinates (x, y, z) of a spinless (see below) particle form a maximal set, and if the position of the particle is known, it is in a pure state; similarly, the momentum components (p_x, p_y, p_z) form a maximal set and a particle whose momentum is known is also in a (different) pure state.

A pure state in which the observables of a maximal set S have known values is termed an eigenstate of S and the observed values are called their eigenvalues in the eigenstate. If (p₀, q₀, ..., w₀) are the eigenvalues, the eigenstate is sometimes denoted by the symbol |p₀, q₀, ..., w₀⟩; in this article, pure states will be denoted more concisely by Greek letters. The eigenstates of S are represented by mutually orthogonal vectors, called the base vectors, defining a frame of rectangular axes F in an abstract representation space. The number n of such eigenstates is usually infinite, but for simplicity, it will first be assumed that each observable has only a finite number of eigenvalues and hence that the number n is finite; the representation space is then a straightforward generalization of ordinary space to n dimensions, with the additional requirement that the components of a vector will be permitted to take complex values. Any pure state of the system (not necessarily an eigenstate of S) is represented by a vector α in the representation space. If (α₁, α₂, ..., αₙ) are the components of α with respect to F, and these components are arranged as a column matrix, this matrix is said to provide an S-representation of the state. The state, vector, and column matrix are all denoted by α.

The scalar product of two vectors α, β is denoted by (α, β) and is defined in terms of their components in F to be

$$(\alpha, \beta) = \sum_{i=1}^{n} \alpha_i^* \beta_i,$$

where α_i* is the complex conjugate of α_i. √(α, α) is called the norm of α and corresponds to the length of an ordinary vector. All vectors representing pure states are taken to have unit norms, and the S-eigenstates are accordingly represented in the S-representation by the columns (1, 0, ..., 0), (0, 1, ..., 0), etc.

Suppose a system is prepared in the state α at time t. The probability that the system can be observed in the state β at an instant immediately subsequent to t is taken to be |(α, β)|². This event is termed a transition of the system from α to β. If β is identical with α, the probability of the transition is unity, as we expect. The probability of a transition from α into the eigenstate (1, 0, ..., 0) is found to be |α₁|². This provides a physical significance for the components of α in the frame F.
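These definitions are easy to make concrete numerically; a small sketch with invented state components:

```python
import numpy as np

# Two normalized state vectors in a 3-dimensional representation space
alpha = np.array([1.0, 1.0j, 0.0]) / np.sqrt(2.0)
beta = np.array([0.0, 1.0, 0.0], dtype=complex)

inner = np.vdot(alpha, beta)       # (alpha, beta) = sum_i alpha_i* beta_i
probability = abs(inner) ** 2      # transition probability |(alpha, beta)|^2

print(abs(np.vdot(alpha, alpha)))  # unit norm: 1.0
print(probability)                 # here equal to |alpha_2|^2 = 0.5
```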

If the system is prepared in a state α on a number of occasions and an observable a is measured immediately after the preparation on each occasion, the values obtained will usually differ. If α is an eigenstate of a with eigenvalue a₁, then a will take this value with complete certainty. If, however, this is not the case, a will take all its possible eigenvalues a_i with associated probabilities p_i, and its mean or expected value ā in the state α is given by ā = Σ_i p_i a_i. In the S-representation, an n × n matrix is associated with every observable a. This is said to represent a and is also denoted by a. Then

$$\bar a = \alpha^\dagger a\,\alpha,$$

where α is the column matrix specifying the state and α† is its conjugate transpose. If a is a real observable, the matrix a is Hermitian (i.e., a† = a); this ensures that ā is real. If a is one of the observables belonging to S, its matrix is diagonal, and the ith element in the principal diagonal is the eigenvalue a_i of a in the ith S-eigenstate; in this case,

$$\bar a = \sum_i \alpha_i^* a_i \alpha_i = \sum_i |\alpha_i|^2 a_i,$$

which is clearly correct, since |α_i|² is the probability that the system will be observed in the ith S-eigenstate.
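A companion sketch of the expectation-value rule, again with an invented Hermitian matrix and state:

```python
import numpy as np

# A real observable is represented by a Hermitian matrix
a = np.array([[1.0, 2.0 - 1.0j],
              [2.0 + 1.0j, 3.0]])

alpha = np.array([1.0, 1.0j]) / np.sqrt(2.0)   # a normalized state

mean = np.vdot(alpha, a @ alpha)               # a-bar = alpha† a alpha
print(mean.real, abs(mean.imag) < 1e-12)       # Hermiticity makes a-bar real
```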

A necessary and sufficient condition that the state α should be an eigenstate of a in which a takes the eigenvalue a₀ is that α should satisfy the matrix equation aα = a₀α; a is then said to be sharp in the state α.

Observables are often introduced as functions of other observables; e.g., the kinetic energy T of a particle of mass m is defined in terms of its momentum by T = (p_x² + p_y² + p_z²)/2m. If a is defined in terms of u, v, ... by the equation a = φ(u, v, ...), then this equation also defines the matrix representing a in terms of the matrices representing u, v, etc. There is, however, a proviso: wherever a product uv occurs in φ, this must be replaced by ½(uv + vu) before matrices are substituted, whenever the matrices u, v are such that uv ≠ vu; this is called symmetrization. If uv ≠ vu, we say that u, v do not commute, and it is then found that the observables u, v are incompatible.

Thus far, the evolution of a system with time has not been considered. Suppose a system is prepared in a state α(0) at t = 0 and is not thereafter interfered with by further observation. Its state α(t) at a later time t is then determined by the Schrödinger equation,

$$H\alpha = i\hbar\,\frac{d\alpha}{dt}, \qquad i = \sqrt{-1},$$

where ℏ = h/2π (h is Planck's constant). Using the S-representation, α is the column matrix (α₁, α₂, ..., αₙ) and H is an n × n matrix, characteristic of the system, called its Hamiltonian. H is Hermitian and, in the case of an isolated system, represents the energy observable of the system. An important property of H is that, if a is the matrix representing some observable and a and H commute, then a is a constant of the system; this means that, if a has a sharp value at one instant, it keeps this sharp value for all t, and otherwise, its probability distribution over its eigenvalues never changes.
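This constancy is easy to verify numerically by integrating the Schrödinger equation with a matrix exponential; the two-level Hamiltonian below is invented, and the observable is chosen to commute with it (here simply H²):

```python
import numpy as np
from scipy.linalg import expm

hbar = 1.0                                   # units in which hbar = 1
H = np.array([[1.0, 0.3],
              [0.3, 2.0]])                   # invented Hermitian Hamiltonian
a = H @ H                                    # an observable commuting with H

alpha0 = np.array([1.0, 1.0], dtype=complex) / np.sqrt(2.0)

for t in (0.0, 0.5, 1.0):
    alpha_t = expm(-1j * H * t / hbar) @ alpha0   # solves H alpha = i hbar d(alpha)/dt
    print(t, np.vdot(alpha_t, a @ alpha_t).real)  # expected value of a is constant
```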

If S′ is a maximal set of compatible observables different from S, an alternative S′-representation of the states and observables of a system can be developed. If α, α′ are column matrices representing the same state in the two representations and a, a′ are square matrices representing the same observable, the transformation equations relating the two representations take the forms

$$\alpha' = u\alpha, \qquad a' = u a u^{-1},$$

where u is a unitary n × n matrix (i.e., u⁻¹ = u†) characteristic of the two representations. In particular, the Hamiltonian H transforms as an observable.

The type of representation we have been describing is called a Schrödinger representation. It is also possible, by rotating the frame in the representation space appropriately, to keep α constant as t increases. If this is done, the matrices a, b, ... representing the observables of the system necessarily become functions of t and are not constants as assumed previously. This type of representation is called a Heisenberg representation. The Schrödinger equation above ceases to be valid and is replaced by equations of motion for the observable matrices a, b, ..., taking the form

$$\frac{da}{dt} = \frac{i}{\hbar}\,[H, a],$$

where [H, a] = Ha − aH is called the commutator of H and a. If H and a commute, [H, a] = 0 and a is a constant of the system, as already stated.

If the number of basic eigenstates of a representation is infinite but enumerable (i.e., they can be placed in a sequence e₁, e₂, e₃, ...), the matrices appearing in the theory will have an infinite number of rows. This creates convergence difficulties in most formulae (e.g., that for ā), but otherwise, the form of the theory is not affected. If, however, some of the observables upon which the representation is based have eigenvalues which are spread continuously over real intervals, the basic eigenstates will not be enumerable and the matrix-type representation must be abandoned. Such an observable is said to have a continuous spectrum of eigenvalues; examples are provided by the coordinates (x, y, z) of a particle. In these circumstances, the discrete sequence αₙ of vector components is replaced by a function ψ(p, q, ...) of the continuous eigenvalues p, q, ... of the observables of the representation set S. ψ is called a wave function. If the system is in the state specified by ψ, the probability that if p, q, etc. are measured they will be found to have values lying in the intervals (p, p + dp), (q, q + dq), etc., respectively, is ψ*ψ dp dq ···. The representation space is now a function space called a Hilbert space in which each vector corresponds to a wave function, and the scalar product of the vectors corresponding to the wave functions ψ(p, q, ...), φ(p, q, ...) is defined by

$$(\psi, \phi) = \int \psi^*\phi\; dp\,dq\cdots,$$

the integration being over all possible eigenvalues of p, q, etc.

In this type of representation, observables are represented by linear operators which can operate upon the functions of the Hilbert space, transforming them into other functions belonging to the space. For example, if the system comprises a single particle and the representation being used is based on the particle's coordinates (x, y, z), its state is specified by a wave function ψ(x, y, z) and its momentum components are represented by the operators

$$p_x = \frac{\hbar}{i}\frac{\partial}{\partial x}, \qquad p_y = \frac{\hbar}{i}\frac{\partial}{\partial y}, \qquad p_z = \frac{\hbar}{i}\frac{\partial}{\partial z}.$$

This representation permits the immediate derivation of the important commutation rules for the coordinates and momenta, namely [p_x, x] = ℏ/i, [p_x, y] = 0, etc. Counterparts of all the results given earlier for a matrix representation can now be written down. Thus, if a is the operator representing some observable of a system which is in the state ψ(p, q, ...), the expected value of a is given by

$$\bar a = \int \psi^* a\,\psi\; dp\,dq\cdots,$$

where a operates on the wave function ψ on its right. The necessary and sufficient condition for ψ to be an eigenstate of a with eigenvalue a₀ is aψ = a₀ψ. Finally, the Schrödinger equation remains valid, but H is now an operator and α is replaced by the wave function ψ; for a single particle moving in a conservative field in which its potential energy is V(x, y, z), employing the coordinate representation, the total energy being H = V + (p_x² + p_y² + p_z²)/2m, Schrödinger's equation becomes

$$-\frac{\hbar^2}{2m}\nabla^2\psi + V\psi = i\hbar\,\frac{\partial\psi}{\partial t}.$$

The simplest example of an observable with a discrete spectrum of eigenvalues is the angular momentum of a system about a point O. If (M_x, M_y, M_z) are the three components of angular momentum and M² = M_x² + M_y² + M_z² is the square of its magnitude, the following relations hold between the matrices or operators representing these observables:

$$[M_y, M_z] = i\hbar M_x, \qquad [M_z, M_x] = i\hbar M_y, \qquad [M_x, M_y] = i\hbar M_z.$$

These are the angular momentum commutation rules. M² commutes with each of the components. It follows that the components are mutually incompatible, but each is compatible with M². Simultaneous eigenstates of M² and M_z exist in which the eigenvalues of M² are ℓ(ℓ + 1)ℏ², where ℓ = 0, ½, 1, 3/2, ..., and of M_z are mℏ, where m = −ℓ, −ℓ + 1, −ℓ + 2, ..., ℓ − 1, ℓ.

Part of the angular momentum of a system is due to the orbital motion of its particles about O and the remainder is contributed by the intrinsic angular momentum or spin of the particles. Let (s_x, s_y, s_z) be the components of spin of a particle and s² the square of its magnitude; then, if the particle is a fundamental one (e.g., an electron, proton, or vector meson), s² will have only one eigenvalue ℓ(ℓ + 1)ℏ² and the particle is then said to have spin ℓℏ. For an electron or a proton, ℓ = ½, and for a vector meson ℓ = 1. The eigenvalues of a spin component are then −ℓℏ, (−ℓ + 1)ℏ, ..., ℓℏ. Thus, any spin component of an electron has but two eigenvalues, −½ℏ and ½ℏ; a spin component of a vector meson has three eigenvalues, −ℏ, 0, ℏ.

If all aspects of the state of a fundamental particle except its spin are ignored, (s², s_z) constitute a maximal set of compatible observables. Employing a representation based on this set, the number of basic eigenstates for a particle of spin ℓℏ will be (2ℓ + 1) and the general spin state of the particle will be represented by a column matrix (α_ℓ, α_{ℓ−1}, α_{ℓ−2}, ..., α_{−ℓ}); for a particle in this state, |α_k|² is the probability of measuring s_z to take the value kℏ. In the special case of an electron or a proton, ℓ = ½ and its spin state is specified by a column (α_{1/2}, α_{−1/2}) called a spinor; in this representation, the three components of spin are represented by 2 × 2 matrices, thus:

$$s_x = \frac{\hbar}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad s_y = \frac{\hbar}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad s_z = \frac{\hbar}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

The three 2 × 2 matrices appearing in these formulae are called the Pauli matrices and are denoted by σ_x, σ_y, σ_z.
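The commutation rules quoted above can be checked directly from these matrices; a minimal sketch:

```python
import numpy as np

hbar = 1.0
sx = hbar / 2 * np.array([[0, 1], [1, 0]], dtype=complex)
sy = hbar / 2 * np.array([[0, -1j], [1j, 0]])
sz = hbar / 2 * np.array([[1, 0], [0, -1]], dtype=complex)

def commutator(a, b):
    return a @ b - b @ a

print(np.allclose(commutator(sx, sy), 1j * hbar * sz))   # [s_x, s_y] = i hbar s_z

s2 = sx @ sx + sy @ sy + sz @ sz            # s^2 = l(l+1) hbar^2 I with l = 1/2
print(np.allclose(s2, 0.75 * hbar**2 * np.eye(2)))
```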

In recent years, much thought has been given to the problem of constructing Schrödinger equations for the fundamental particles which remain unchanged in form when subjected to the group of Lorentz transformations of special relativity theory. Dirac's equation for the electron is one such equation. The fields governed by these equations have themselves been treated as physical systems and quantized, thus leading to a quantum theory of fields. The symmetry of these fields under rotations and other transformations has been fully exploited by the application of group theory and the properties of Lie algebras. Details will be found in Ref. 10 of the bibliography below.

D. F. LAWDEN

References

1. Bohm, A., "Quantum Mechanics," Berlin, Springer-Verlag, 1979.

2. Cunningham, J., and Newing, R. A., "Quantum Mechanics," Edinburgh, Oliver and Boyd, 1967.

3. D'Espagnat, B., "Conceptual Foundations of Quantum Mechanics," Reading, Mass., W. A. Benjamin, 1976.

4. Jauch, J. M., "Foundations of Quantum Mechanics," Reading, Mass., Addison-Wesley, 1968.

5. Lawden, D. F., "Mathematical Principles of Quantum Mechanics," London, Methuen, 1967.

6. Matthews, P. T., "Introduction to Quantum Mechanics," New York, McGraw-Hill, 1974.

7. Merzbacher, E., "Quantum Mechanics," New York, Wiley, 1970.

8. Pauli, W., "General Principles of Quantum Mechanics," Berlin, Springer-Verlag, 1980.

9. Pilkuhn, H. M., "Relativistic Particle Physics," Berlin, Springer-Verlag, 1979.

10. Schweber, S. S., "Relativistic Quantum Field Theory," New York, Harper, 1961.

Cross-references: MATHEMATICAL PHYSICS, MATRICES, QUANTUM THEORY, SCHRODINGER EQUATION, THEORETICAL PHYSICS, WAVE MECHANICS.

MATRICES

Matrix notation and operations are introduced into theoretical physics so that algebraic equations and expressions in terms of rectangular arrays of numbers can be systematically handled.

An m × n matrix A = (a_ij) possesses m rows and n columns, having in double suffix notation the mn elements arranged in the form

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}.$$

The general element a_ij may be a complex number. If all elements are zero, A is the null matrix 0. When n = 1, the matrix is a column vector v. If m = n, A is square of order n; if all elements not on the leading diagonal a_11, a_22, ..., a_nn are zero, the matrix is a diagonal matrix D, while D is the unit matrix I if a_11 = a_22 = ··· = a_nn = 1.

The sum or difference of two m × n matrices A and B is an m × n matrix C = A ± B, where c_ij = a_ij ± b_ij. The elements of the scalar multiple αA are αa_ij.

The transpose of A is denoted by Aᵀ; this is an n × m matrix whose ith row and jth column are identical respectively with the ith column and jth row of A. Hence vᵀ is a row matrix with m elements. For convenience, the column v is often printed as a row with braces, {v₁ v₂ ··· v_m}.

The product C = AB is only defined when the number of columns of A equals the number of rows of B; A and B are then conformable for multiplication. If A is m × n and B is n × p, then C is m × p, with

$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}.$$

Generally, multiplication is not commutative, but it is always associative. The transpose of a product is given by (ABC)ᵀ = CᵀBᵀAᵀ.

If A is square, then A is symmetric if A = Aᵀ, while if A = −Aᵀ it is skew-symmetric. A quadratic form S_A in the n variables contained in the column x = {x₁ x₂ ··· xₙ} may be written as S_A = xᵀAx, where A is symmetric.

Let det A ≡ |A| denote the determinant of the square matrix A. If det A ≠ 0, A is non-singular. The definition of matrix multiplication ensures that

$$\det(AB) = \det A \, \det B,$$

where A and B are square matrices of the same order.

The cofactor of a_ij in the square matrix A equals (−1)^{i+j} times the determinant formed by crossing out the ith row and jth column in A. The sum of the n elements in any row (or column) multiplied respectively by their cofactors equals det A; the sum of the n elements in any row (or column) multiplied respectively by the cofactors of another row (or column) equals zero. We have

$$\det A^T = \det A; \qquad \det(\alpha A) = \alpha^n \det A.$$

The adjoint of A, denoted by adj A, is the transpose of the matrix formed when each element of A is replaced by its cofactor. We have

$$A\,\mathrm{adj}\,A = (\mathrm{adj}\,A)A = (\det A)I$$

and det(adj A) = (det A)^{n−1}. The unique reciprocal or inverse of a non-singular matrix A is given by

$$A^{-1} = (\mathrm{adj}\,A)/\det A.$$

This has the property that AA⁻¹ = A⁻¹A = I. It follows that

$$(AB)^{-1} = B^{-1}A^{-1}$$

and

$$(A^{-1})^T = (A^T)^{-1}.$$

Linear equations relating n variables x_i to n variables y_i may be expressed as x = Ay; if det A ≠ 0, the unique solution for the y_i in terms of the x_i is y = A⁻¹x.

The rank of an m × n matrix A is the order of the largest non-vanishing minor within A; of the m linear expressions Ax, the rank gives the number that are linearly independent. The m linear equations in n unknowns Ax = d, where d is a column with m elements, are consistent if the rank of A equals the rank of the augmented matrix (A d).

If the m linear equations Ax = d are inconsistent in the n unknowns x₁, x₂, ..., xₙ, where n < m and rank A = n, then the "best" solution for these n unknowns in the least squares sense is given by the normal equations

$$A^T A\,x = A^T d,$$

AᵀA being non-singular when rank A = n.
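A short numerical illustration of the normal equations (the data are invented); in practice a dedicated least-squares routine is numerically preferable:

```python
import numpy as np

# Overdetermined system: m = 4 equations in n = 2 unknowns
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
d = np.array([1.1, 1.9, 3.2, 3.9])

# Normal equations: (A^T A) x = A^T d
x = np.linalg.solve(A.T @ A, A.T @ d)

# A dedicated least-squares routine gives the same answer more stably
x_lstsq, *_ = np.linalg.lstsq(A, d, rcond=None)
print(x, np.allclose(x, x_lstsq))
```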

An n × n matrix A is orthogonal if AᵀA = I, that is, if A⁻¹ = Aᵀ. Clearly, |A| = ±1. If c_i denotes the ith column of A, then c_iᵀc_i = 1 and c_iᵀc_j = 0 if i ≠ j; similar results hold for the rows. The transformation x′ = Ax represents a rotation of rectangular Cartesian axes in three dimensions; |A| = +1 if the right-handed character is preserved. The element a_ij equals the cosine of the angle between the x_i′ and x_j axes. If N is skew-symmetric, then (I + N)⁻¹(I − N) is orthogonal.

Matrices with complex elements are manipulated according to the same rules. A square matrix H is Hermitian if H*ᵀ = H, and skew-Hermitian if H*ᵀ = −H, a star denoting the complex conjugate. A unitary matrix U satisfies U*ᵀ = U⁻¹, the columns (and rows) enjoying the properties c_i*ᵀc_j = 1 if i = j and 0 if i ≠ j. If N is skew-Hermitian, then (I + N)⁻¹(I − N) is unitary.

First- and second-order tensors, arising in many physical problems, may be expressed in matrix notation. If x′ = Ax denotes a rotation of rectangular Cartesian axes, A being orthogonal, then f′ = Af and F′ = AFAᵀ define Cartesian tensors of orders 1 and 2 respectively. Evidently, if u and v are vectors or tensors of order 1, then uᵀv is the invariant scalar product, Fu is a tensor of order 1, and uvᵀ is a tensor of order 2. For example, the vector product u × v may be written as Uv, where

$$U = \begin{pmatrix} 0 & -u_3 & u_2 \\ u_3 & 0 & -u_1 \\ -u_2 & u_1 & 0 \end{pmatrix}.$$

U is a tensor of order 2 if u is a vector; Uᵀ is the dual of u.

The differential operator (column)

$$\nabla \equiv \left\{ \frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}, \frac{\partial}{\partial x_3} \right\}$$

is a vector or tensor of order 1; namely, ∇′ = A∇. Thus if φ is a scalar and f a vector,

grad φ ≡ ∇φ is a vector,

div f ≡ ∇ᵀf is a scalar,

curl f ≡ ∇̃f is a vector, ∇̃ being the skew matrix formed from ∇ as U is formed from u.

But if x′ = Ax, where A is not orthogonal, f′ = Af defines a contravariant vector, while g′ = (Aᵀ)⁻¹g defines a covariant vector. The product gᵀf is now an invariant.

If A is square of order n, then the n homogeneous equations Ak = λk require

$$\det(A - \lambda I) = 0$$

for non-trivial solutions. This characteristic equation possesses n characteristic or latent roots; if they are all distinct, n corresponding characteristic or latent vectors exist. The vector k_i corresponding to the root λ_i may consist of the n cofactors of any row of A − λ_iI; at least one non-trivial row exists.

The Cayley-Hamilton theorem states that a square matrix A satisfies its own characteristic equation.

The following properties are important. If A is real and symmetric, and if λ_i and λ_j are distinct, then k_iᵀk_j = 0; these two vectors are orthogonal. Again, if A is real and symmetric, the n values of λ are real, but if A is real and skew-symmetric, these n values are pure imaginary. For a real orthogonal matrix A, |λ_i| = 1 for all i. The characteristic roots of A⁻¹ are 1/λ_i, the k_i still being the corresponding vectors. If |λ₁| is the largest of the moduli of the n roots, then as r → ∞, Aʳx becomes proportional to k₁, where x is an arbitrary column.
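This last property is the basis of the power method for computing the dominant characteristic vector; a minimal sketch with an invented matrix:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

x = np.array([1.0, 0.0])       # arbitrary starting column
for _ in range(50):
    x = A @ x
    x /= np.linalg.norm(x)     # renormalize to prevent overflow

# x now approximates k_1; the Rayleigh quotient approximates lambda_1
print(x, x @ A @ x)
```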

If A is symmetric, n mutually orthogonal characteristic vectors may be found even if the roots are not all distinct. If each vector k_i is normalized, i.e., divided by √(k_iᵀk_i), then the matrix

$$\Lambda = (k_1\ k_2\ \cdots\ k_n)$$

is orthogonal, and the product ΛᵀAΛ equals D, the diagonal matrix consisting of the n roots arranged down its leading diagonal in order. The matrix A is said to be diagonalized, and D is the canonical form of A.

More generally, if A is a general square matrix of order n, then n independent vectors k_i may be found corresponding to the n roots if the latter are distinct. Then

$$T = (k_1\ k_2\ \cdots\ k_n)$$

transforms A into diagonal form, thus:

$$T^{-1}AT = D.$$

If some of the characteristic roots are identical, it may or may not be possible to find n corresponding independent columns (though it is always possible when A is symmetric). If it is possible, T is non-singular, and as before T⁻¹AT = D.
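A quick numerical check of the symmetric case above (matrix invented for the example):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])               # real symmetric

roots, Lam = np.linalg.eigh(A)           # columns of Lam are normalized k_i
D = np.diag(roots)

print(np.allclose(Lam.T @ A @ Lam, D))          # Lam^T A Lam = D
print(np.allclose(Lam.T @ Lam, np.eye(2)))      # Lam is orthogonal
```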

If this is not possible, there still exists a non-singular matrix T such that

$$T^{-1}AT = C,$$

where C, no longer diagonal, is the standard canonical form of A, containing submatrices of the form

$$\begin{pmatrix} \lambda & 1 & \\ & \lambda & 1 \\ & & \lambda \end{pmatrix}$$

corresponding to the repeated roots. In each case, respectively,

$$A^n = TD^nT^{-1} \quad \text{or} \quad TC^nT^{-1}.$$

This enables us to define functions of square matrices whose canonical forms are diagonal, namely A = TDT⁻¹. If

$$f(x) = \sum_{r=0}^{\infty} a_r x^r$$

is convergent for |x| < R, then define

$$f(A) = \sum_{r=0}^{\infty} a_r A^r = T\left(\sum_{r=0}^{\infty} a_r D^r\right)T^{-1} = T\begin{pmatrix} f(\lambda_1) & 0 & \cdots \\ 0 & f(\lambda_2) & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}T^{-1},$$

provided |λ₁|, |λ₂|, ..., |λₙ| < R. Thus, for example, if

$$J = \begin{pmatrix} 0 & \theta \\ -\theta & 0 \end{pmatrix},$$

then

$$e^J = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}.$$

Similar remarks apply to Hermitian matrices H. If Hk = λk, all the n values of λ are real and n vectors can always be found such that

$$k_i^{*T} k_j = \delta_{ij}.$$

The unitary matrix U = (k₁ k₂ ⋯ kₙ) transforms H into diagonal form.

Two quadratic forms S_A = xᵀAx and S_B = xᵀBx, where A and B are symmetric and of the same order, may be reduced simultaneously to sums of squares. The equations Ak = λBk demand

$$\det(A - \lambda B) = 0;$$

this possesses n roots λ_i and n corresponding vectors k_i. If

$$T = (k_1\ k_2\ \cdots\ k_n),$$

then TᵀAT and TᵀBT are both diagonal. The transformation x = Ty yields the two sums of squares S_A = yᵀ(TᵀAT)y and S_B = yᵀ(TᵀBT)y. In particular, if S_A is positive definite, TᵀAT will equal I if new columns k̂_i are used in T, where k̂_i = k_i/√(k_iᵀAk_i).

Necessary and sufficient conditions for the real quadratic form S_A to be positive definite for all real x ≠ 0 are that the n leading determinants

$$a_{11}, \quad \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}, \quad \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}, \quad \ldots, \quad \det A$$

should be positive. This ensures that the n characteristic roots of A are all positive.

Finally, matrices may often usefully be partitioned, employing matrices within matrices. Multiplication may still be performed provided each individual matrix product is permissible. For example,

$$\begin{pmatrix} a & b^T \\ c & D \end{pmatrix}\begin{pmatrix} e & f^T \\ g & H \end{pmatrix} = \begin{pmatrix} ae + b^T g & af^T + b^T H \\ ce + Dg & cf^T + DH \end{pmatrix},$$

where a, e are scalars, b, c, f, g are 3 × 1 columns, and D, H are 3 × 3.

Applications. Differential Equations. If

$$dx/dt + Ax = f,$$

A being constant and f = {f₁(t), f₂(t), ..., fₙ(t)}, then if T diagonalizes A, x = Ty yields the n non-simultaneous equations dy/dt + Dy = T⁻¹f. If y₀(t) is a particular integral,

$$x(t) = T\begin{pmatrix} e^{-\lambda_1 t} & & 0 \\ & \ddots & \\ 0 & & e^{-\lambda_n t} \end{pmatrix}\left[T^{-1}x(0) - y_0(0)\right] + Ty_0(t).$$
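A sketch of this solution procedure for an invented constant A and constant forcing f (so that the particular integral is simply the steady state A⁻¹f):

```python
import numpy as np

# dx/dt + A x = f with constant A and constant forcing f
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
f = np.array([1.0, 0.0])
x0 = np.array([0.0, 0.0])

lam, T = np.linalg.eig(A)            # T diagonalizes A
x_part = np.linalg.solve(A, f)       # particular integral: T y0 = A^-1 f (constant)

def x(t):
    decay = np.diag(np.exp(-lam * t))
    return T @ decay @ np.linalg.solve(T, x0 - x_part) + x_part

print(x(0.0), x(10.0))   # starts at x(0), relaxes toward A^-1 f
```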

Geometry. In three-dimensional Cartesian coordinates,

$$a^T x + d = 0$$

represents a plane, the perpendicular distance from x₁ being

$$(a^T x_1 + d)/\sqrt{a^T a}.$$

The equation xᵀAx = d represents a central quadric. If Λ diagonalizes A, the rotation

$$x = \Lambda x'$$

yields x′ᵀDx′ = d. The vectors k₁, k₂, k₃ specify the three principal axes, of semi-lengths √(d/λ_i) when d/λ_i > 0.

Dynamics. The rotational equations of motion of a rigid body with respect to moving axes fixed in the body and with the origin fixed in space or at the centre of mass are

$$g = J\dot{w} + \Omega Jw,$$

where g = couple, w = angular velocity, Ωᵀ = dual of w. J denotes the inertia tensor −Σ mXX, where Xᵀ = dual of x. Explicitly,

$$J = \begin{pmatrix} A & -H & -G \\ -H & B & -F \\ -G & -F & C \end{pmatrix}.$$

The rotational kinetic energy is ½wᵀJw. When principal axes of inertia are chosen, J is diagonal, yielding Euler's equations.

Small oscillations about a position of equilibrium are investigated by considering the second order approximations

$$\text{K.E.} = \dot q^T A \dot q, \qquad \text{P.E.} = q^T B q,$$

q containing the n generalized coordinates measured from their equilibrium values. A and B are constant symmetric matrices. If the n roots of det(A − λB) = 0 are considered, and if q = Tx, where T = (k₁ k₂ ⋯ kₙ) reduces A to the unit matrix I, the equations of motion are

$$\ddot x_i + (1/\lambda_i)x_i = 0.$$

The elements of x are the normal coordinates; each individual solution x_i in terms of the q's is a normal mode of period 2π√λ_i.
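Numerically this is the generalized symmetric eigenproblem; a sketch with invented A and B:

```python
import numpy as np
from scipy.linalg import eigh

# Invented symmetric matrices: K.E. = qdot^T A qdot, P.E. = q^T B q
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])
B = np.array([[3.0, -1.0],
              [-1.0, 2.0]])

# Generalized eigenproblem B k = omega^2 A k; columns satisfy T^T A T = I
omega_squared, T = eigh(B, A)
periods = 2 * np.pi / np.sqrt(omega_squared)
print(periods)           # periods 2 pi sqrt(lambda_i) of the normal modes
```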

Electromagnetic Theory. Maxwell's 3 × 3 stress tensor in matrix notation is

$$T = \tfrac{1}{2}\left[2\epsilon e e^T + 2\mu h h^T - \epsilon(e^T e)I - \mu(h^T h)I\right]$$

in mks rationalized units. The field exerts a force across an area element nδS equal to Tn δS.

When electromagnetic waves are propagated in an ionized medium the equation

$$\text{curl curl } e = k^2(I + M)e$$

arises, where M is a 3 × 3 matrix in the usual magnetoionic notation with collisions neglected; here, n = unit vector directed along the external magnetic field, Nᵀ = dual of n. These equations may be rearranged in terms of the column

$$f = \{E_x, -E_y, Z_0H_x, Z_0H_y\},$$

giving df/dz = −ikTf, where T is a 4 × 4 matrix. If the characteristic roots λ_i(z) of T are found, and if R diagonalizes T, then the transformation f = Rg yields

$$\frac{dg}{dz} = -ikDg - R^{-1}\frac{dR}{dz}\,g.$$

When k is large, approximate solutions are possible if the terms of R⁻¹(dR/dz)g arising from the non-diagonal elements of R⁻¹(dR/dz) are much smaller in magnitude than the corresponding elements of kDg. Then

$$\frac{dg_j}{dz} \approx -ik\lambda_j g_j - \left(R^{-1}\frac{dR}{dz}\right)_{jj} g_j$$

and

$$g_j \propto \exp\left(-ik\int \lambda_j\,dz - \int \left(R^{-1}\frac{dR}{dz}\right)_{jj} dz\right),$$

giving rise to the characteristic waves propagating in the medium.

Special Relativity. If x = {ict, x, y, z} refers to an inertial frame S, and if a second parallel frame S′ has uniform relative velocity U along Ox, the Lorentz transformation is x′ = A_U x, where

$$A_U = \begin{pmatrix} \beta & -iU\beta/c & 0 & 0 \\ iU\beta/c & \beta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

A is orthogonal, and β = 1/√(1 − U²/c²). We have A_V A_U = A_W, where

$$W = (U + V)/(1 + UV/c^2).$$
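A quick numerical check of this composition law (matrices built with the ict convention; the velocities are invented test values in units where c = 1):

```python
import numpy as np

def lorentz(U, c=1.0):
    beta = 1.0 / np.sqrt(1.0 - U**2 / c**2)
    A = np.eye(4, dtype=complex)
    A[0, 0] = A[1, 1] = beta
    A[0, 1] = -1j * U * beta / c
    A[1, 0] = 1j * U * beta / c
    return A

U, V = 0.5, 0.3
W = (U + V) / (1.0 + U * V)                # velocity composition, c = 1

print(np.allclose(lorentz(V) @ lorentz(U), lorentz(W)))   # True
```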

For the general velocity v relating parallel frames,

$$A = \begin{pmatrix} \beta & -i\beta v^T/c \\ i\beta v/c & I + (\beta - 1)vv^T/v^2 \end{pmatrix}.$$

The operator

$$\Box = \{\partial/\partial(ict),\ \partial/\partial x,\ \partial/\partial y,\ \partial/\partial z\}$$

is a four-vector satisfying □′ = A□; so are the four-current j = {icρ, j_x, j_y, j_z} and the four-potential b = {iφ/c, a_x, a_y, a_z}, where a is the vector potential. They satisfy □ᵀj = 0 (conservation of charge) and □ᵀb = 0 (the Lorentz relation). Maxwell's equations in mks units in free space take the form

$$\Box^T F = -j^T/\epsilon_0 c^2, \qquad \Box^T G = 0,$$

where

$$F = \Box b^T - (\Box b^T)^T = \begin{pmatrix} 0 & ie^T/c \\ -ie/c & -\mu_0 H \end{pmatrix}$$

and

$$G = \begin{pmatrix} 0 & \mu_0 h^T \\ -\mu_0 h & -iE/c \end{pmatrix}$$

are tensors of order 2 under a Lorentz transformation. Eᵀ and Hᵀ are the respective 3 × 3 duals of e and h. All tensor equations are invariant in form in all frames of reference.

The tensor of order 2

$$T = \tfrac{1}{2}\epsilon_0 c^2 FF - \tfrac{1}{2}\mu_0^{-1} GG = \tfrac{1}{2}\begin{pmatrix} \epsilon_0 e^T e + \mu_0 h^T h & -2ie^T H/c \\ -2iEh/c & \epsilon_0 ee^T + \epsilon_0 EE + \mu_0 hh^T + \mu_0 HH \end{pmatrix}$$

contains the energy density, Poynting's vector, the momentum density, and Maxwell's stress tensor in partitioned form.

Applications may likewise be made to circuit theory, to elasticity, where 3 × 3 stress and strain tensors are defined, and to quantum mechanics, embracing, for example, matrix mechanics and the Dirac wave equation of the electron.

JOHN HEADING

References

Gourlay, A. R., and Watson, G. A., "Computational Methods for Matrix Eigenproblems," London and New York, John Wiley.

Heading, J., "Matrix Theory for Physicists," London, Longmans, Green & Co.

Heading, J., "Electromagnetic Theory and Special Relativity," Cambridge, University Tutorial Press.

Jeffreys, H., and Jeffreys, B., "Methods of Mathematical Physics," Cambridge, The University Press.

Liebeck, H., "Algebra for Scientists and Engineers," London and New York, John Wiley.

Perlis, S., "Theory of Matrices," Reading, Mass., Addison-Wesley.

Williams, I. P., "Matrices for Scientists," London, Hutchinson.

Cross-references: DIFFERENTIAL EQUATIONS IN PHYSICS, ELECTROMAGNETIC THEORY, LORENTZ TRANSFORMATIONS, MATHEMATICAL PRINCIPLES OF QUANTUM MECHANICS, QUANTUM THEORY.

MEASUREMENTS, PRINCIPLES OF

Measurement is the process of quantifying our experience of the world around us. It can be as simple as an elementary event-counting process (the number of automobiles passing a certain point per day or the number of β-particles entering a certain radiation detector in a certain time interval) but it is usually a more complex process involving comparison with some reference. For example, how many handspans wide is my desk? This is an elementary example, but even such a simple question reveals many of the essential and fundamental characteristics of measuring processes. First, the search for improved precision quickly makes desirable the acceptance of a standard reference quantity. Second, the primitive process of comparison makes it clear that we can make statements about the measurement only within certain limits. (My desk is between 6 and 7 handspans wide.) Our fundamental inability to make exact measurements leads to the concept of uncertainty, and a very extensive theory of the uncertainty of measurement exists.

Measurement Standards Standard, defined units of measurement have been in use for many thousands of years and the situation has been in continuous flux up to the present day. Every country maintains standards of measurement, not only of the basic quantities such as mass, length, and time, which we normally think of in this context, but also of a large number of other items (such as the optical reflectivity of paper) which have been identified as important for trade and commerce. We shall restrict ourselves here to a few physical quantities which are important in scientific work. It will turn out that it may be a relatively simple matter to define a unit; it is usually a much more difficult problem to realize that unit in the laboratory so as to make possible the calibration of other instruments.

(a) Mass. The kilogram was defined in 1889 as the mass of a certain piece of metal (platinum-iridium) still preserved at the International Bureau of Weights and Measures near Paris, France. Copies of this prototype kilogram, compared with the original by carefully refined beam balance techniques, are kept in most countries to serve as that country's definition of a kilogram. The normal process of comparison using beam balances allows a precision for mass standards of around one part in 10⁸, making mass standards substantially less precise than those of length and time.

(b) Length. The original 1889 definition of the meter (chosen to be one ten-millionth part of the quadrant of the earth's surface at the longitude of Paris) was realized initially using engraved marks on a platinum-iridium bar. However, it very soon thereafter became clear, on the basis of the pioneering work of A. A. Michelson on optical interferometry, that a much more precise, stable, and easily realizable standard was available using the wavelength of carefully selected spectrum lines. Despite Michelson's early suggestions to that effect, it was not until 1960 that an international standard of length was defined in terms of the wavelength of a certain line in the spectrum of krypton-86. Recently, however, it has become clear that even the precision available from the krypton-86 line has been surpassed by precision in two other areas, measurements of the velocity of light and the standard of time. As a consequence it is now possible to define a unit of length in terms of the unit of time and the measurement of the velocity of electromagnetic waves. As accepted by the General Conference of Weights and Measures in 1983, the unit of length is the distance traveled in a time interval of 1/299,792,458 of a second by plane electromagnetic waves in a vacuum. Such a unit will be realizable in any laboratory with precise time standards, and will be translatable into actual distance measurements using standard techniques of interferometry.

(c) Time. For hundreds of years the most precise measurements of time were made by the astronomers, whose observations served to define the basic unit of time, the second, in terms of the axial rotation of the earth. When terrestrial clock systems achieved the precision required to show that the earth's rotation rate is not constant, it became clear that replacement of the defined standard was necessary. Observations on the radiation frequency in atomic transitions can be made with very high precision, and the present definition of the second (adopted in 1967) is the duration of 9,192,631,770 periods of the radiation associated with a certain transition in the cesium-133 atom. Cesium beam clocks now provide a realizable standard of time, not only for time-keeping, but also for practical purposes such as radio navigation systems and for experimental work like long baseline interferometry in radioastronomy.

(d) Temperature. The definition of temperature is based on the triple point of water, which was defined in 1954 to have a temperature of 273.16 K. The unit of temperature, the kelvin, is thus defined (since 1967) as 1/273.16 of the temperature of the triple point of water. The realization of a temperature scale, particularly over wide temperature ranges, is a much more difficult matter. In practice the standards laboratories maintain a working scale called the International Practical Temperature Scale, 1968 (abbreviated IPTS-68). In this, thirteen fixed points have temperatures assigned to them, and specified methods of measurement are used to provide intermediate temperatures for calibration purposes. Above the temperature of freezing gold (1337 K) optical pyrometry is used; between 903 K and 1337 K the standard measuring device is a thermocouple of platinum and an alloy of platinum with 10% rhodium; between 14 K and 903 K it is a platinum resistance thermometer. Standard temperatures are also available in the liquid helium temperature range between 0.5 K and 4.2 K using the vapor pressure of helium.

(e) Electrical Quantities. Once again we have a distinction between the fundamental definition of a standard quantity and its practical realization. In principle the fundamental electrical quantity is the unit of current (the ampere), which is defined in terms of the force between adjacent current-carrying conductors. In practice it is too difficult to implement this definition with sufficient precision, and the practical standards are those of potential difference and resistance. The unit of potential difference (the volt) is defined in terms of the Josephson effect, a phenomenon occurring in superconducting junctions, which provides an extremely sensitive, precise, and stable connection between potential difference and frequency. The practical realization of the volt is a bank of carefully preserved electrolytic cells which can be moved for international comparison and which are checked periodically for drift using a Josephson source. The standard of resistance (the ohm) is realized using a bank of 1-ohm resistors which can be compared internally and internationally.

Measurement Uncertainty As was mentioned earlier, the primitive act of comparison between an object and a reference quantity leads to a value which is known only within a certain interval. No matter how sophisticated the measuring process, we cannot evade this fundamental limitation; we can be confident about measurements only within a certain interval. The way in which we handle these uncertainties mathematically depends on the way in which our confidence varies along the scale.

(a) Estimated Uncertainty. If we are making our measurements by a personal, visual method, the outcome of the measuring process should be an interval, outside which we are certain the value does not lie. If, by careful examination of a scale, we feel confident that the value we seek does not lie below 24.6 and does not lie above 24.8, we can state that we are confident that our desired value lies inside the interval 24.6-24.8, although we can say nothing about its location within the interval. The interval is usually renamed 24.7 ± 0.1, and we call the quantity ±0.1 the uncertainty of the reading. Frequently it is instructive to compare this uncertainty with the reading itself, and we call the ratio 0.1/24.7, usually expressed as a percentage, the precision of the measurement. This question of the uncertainty in reading a scale is only one contribution to our lack of exact knowledge of the value. There may be other effects, like calibration errors in the instrument, which affect all the readings in a similar way and constitute systematic errors in addition to reading uncertainty. To make satisfactory measurements we must always be aware of the presence of the reading uncertainty and also alert to the possible presence of systematic errors. These last must be identified and corrected if possible but, at the end of the whole process, it is important to express the reading in such a way that the quoted value provides a realistic appraisal of the complete range of uncertainty.

It is rarely sufficient to make such a measurement of a single quantity, and we are more commonly faced with the problem of calculating some final quantity z as a function of a number of measured quantities x, y, etc., each of which has its own uncertainty Δx, Δy, etc. If

$$z = f(x, y, \ldots),$$

the value of Δz will be calculated from

$$\Delta z = (\partial f/\partial x)\Delta x + (\partial f/\partial y)\Delta y + \cdots.$$

If we were confident that the values of x and y lay within the measured intervals x₀ ± Δx, y₀ ± Δy, etc., then we can be equally confident that the value of z lies within the calculated interval z₀ ± Δz, where z₀ = f(x₀, y₀, ...). This general method of calculating uncertainties will be found to be useful for a wide range of functions.
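A minimal sketch of this propagation rule for an invented function z = xy² (the partial derivatives are written out by hand):

```python
import numpy as np

# Example: z = x * y**2 with estimated uncertainties dx, dy
x, dx = 24.7, 0.1
y, dy = 3.10, 0.05

z = x * y**2
dz = abs(y**2) * dx + abs(2 * x * y) * dy   # dz = |df/dx| dx + |df/dy| dy

print(f"z = {z:.1f} +/- {dz:.1f}")
print(f"precision = {dz / z:.1%}")
```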

(b) Statistical Uncertainty. Circumstances frequently do not permit the subjective estimation of an uncertainty interval as considered in the preceding section. Many measurement processes give results which are influenced by random fluctuations and we must resort to statistical treatment of the observations. Instead of identifying an interval within which we are confident our quantity lies we must be content with statements about probabilities. To make this possible we must build up experience of a fluctuating phenomenon by repeatedly making the measurement. We shall thereby obtain a sample of readings whose characteristics will give us as much information as is available from the system. This sample will have a certain frequency distribution along the scale of values, and the uncertainty of the measuring process is related to the breadth of the distribution. One suitable measure of the breadth is the standard deviation of the sample, defined to be

$$S = \sqrt{\sum(\bar x - x_i)^2/N},$$

where x̄ is the mean of the sample, x_i are the individual readings, and N is the number of readings in the sample. By making suitable assumptions about the basic distribution from which the sample was taken (often assumed to be Gaussian), we can now make numerical assertions about the sample. It turns out that any one reading has a 68% probability of falling within the interval x̄ ± S and a 95% probability of falling within x̄ ± 2S. Rather than make assertions about single readings it is more useful to be able to make statements about probabilities for the sample mean. We calculate the standard deviation of the mean,

$$S_m = S/\sqrt{N},$$

and we are then able to assert that the value we seek in the measuring process has a 68% chance of falling within the interval x̄ ± S_m and a 95% chance of falling within x̄ ± 2S_m. The mean and the standard deviation of the mean thus provide us with measures of probability which take us as far as randomly fluctuating phenomena allow.
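A short sketch of these sample statistics for an invented set of readings:

```python
import numpy as np

readings = np.array([24.61, 24.73, 24.68, 24.75, 24.66,
                     24.70, 24.64, 24.72, 24.69, 24.67])

mean = readings.mean()
S = readings.std()                   # sample standard deviation (divisor N)
Sm = S / np.sqrt(readings.size)      # standard deviation of the mean

print(f"mean = {mean:.3f}")
print(f"68% interval: {mean:.3f} +/- {Sm:.3f}")
print(f"95% interval: {mean:.3f} +/- {2 * Sm:.3f}")
```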

The statistically determined interval serves as the measure of uncertainty for processes governed by random fluctuation so that, irrespective of the type of measurement, the outcome of the measuring process is an interval which has a certain probability of containing the value we seek. The measurements are now in suitable condition for the next step in a complete measuring process.

Systems and Models A measuring process is almost never a primitive process of simply comparing an object with a scale (unless we are satisfied simply to measure something with a ruler or read a temperature on a mercury-in-glass thermometer); the situation is almost invariably more complicated. Even if we are doing something as simple as comparing the weight of an object with that of a standard mass using a beam balance we have to make the assumption that the balance beam arms are of equal length. Our measuring process almost invariably involves some complete system and the result of our measuring process is dependent on the properties of the system. Since we can never know these exactly we are dependent on the set of assumptions we make about the system (like the equality of the balance beam arms), and this set of assumptions constitutes a model of the system. Our process of measurement almost invariably requires us, therefore, to consider the model of the system, and it is an integral part of all satisfactory measuring processes to check, not that the model is "correct" or "true," because all models are in principle oversimplified, but that the correspondence between the model and the system is good enough, at least at the level of precision under consideration.

Sometimes this process of testing the model of a measuring system will have been done for us if we buy an expensive piece of apparatus. If we buy a good quality slide-wire potentiometer from a reputable manufacturer it may be satisfactory to assume that the set of assumptions which constitute the model of the system (such as linearity of the slide wire resistance) has been adequately tested. But if we are making up our own measurement system, it is vital to include in the process adequate provision for testing the model on which our work with the system is predicated.

It is common to do the model testing graphically. For example a measurement of the resistance of an electrical component is incomplete without a study of the complete variation of the current through and the potential difference across the component. Only if the plotted values of V and i turn out to be compatible with a straight line can we assert that the resistance of the component is constant, with a value specified by the slope of the line. Usually the model of the system is more complicated than V = iR and more complicated graphical methods of checking the model against the system are needed. In compensation, however, the value we seek as the objective of the experiment will usually be obtained from the graph, so that drawing the graph serves the dual purpose of checking the compatibility of the system and the model and of providing a computational procedure for obtaining the answer. Even when the measurements in an experiment are processed completely analytically (using a least squares procedure, for example, to fit a function to the observations), it is important not to forget the basically graphical nature of the process.
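The same check can of course be done numerically. A sketch, assuming hypothetical V-i readings and a least squares line through the origin; randomly scattered residuals play the role of the visual straightness check on the graph:

```python
# Least squares fit of V = i*R to hypothetical readings.
i = [0.10, 0.20, 0.30, 0.40, 0.50]   # current, amperes
V = [0.52, 0.99, 1.48, 2.03, 2.51]   # potential difference, volts

R = sum(a * b for a, b in zip(i, V)) / sum(a * a for a in i)  # slope through origin
residuals = [b - R * a for a, b in zip(i, V)]
print(f"R = {R:.2f} ohms")
print("residuals:", [round(r, 3) for r in residuals])
```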

Measurement Systems The demands of modern experimenting and the availability of automatic data processing methods have recently revolutionized measuring methods. It is now common to think of a complete measuring system containing: (a) a conversion stage which may be desirable to convert the basic quantity under investigation into some other form more amenable to measurement; (b) a sensor, detector, or transducer stage to provide for conversion into an electrical signal; (c) a signal processing system to perform on the signal any necessary mathematical computations; (d) an output stage for display, storage, or telemetry of the information. Let us consider each of these stages in turn.

(a) Conversion Stage. This is not always present but is frequently necessary if it is impossible or inconvenient to work directly with the phenomenon under investigation. For example a gas thermometer bulb converts temperature into pressure, a moving coil ammeter converts currents into angles, a slide wire potentiometer converts values of potential difference into lengths, a prism, diffraction grating, or crystal converts the wavelength of electromagnetic radiation into angles, a digital voltmeter uses a ramp method to convert a potential difference into a time measurement, a mass spectrometer converts atomic masses into magnetic field values, etc.

(b) Sensor, Detector, or Transducer Stage. Almost invariably we wish to process our signal by electrical methods and so some process for converting the basic phenomenon into an electrical signal is required. Sensors are found in enormous variety depending on the type of physical phenomenon involved.

(i) Strain is measured using strain gauges, a small length of metallic or semiconducting material glued to the component under stress. Changes in length of the material are detected from the consequent alteration in electrical resistance.

(ii) Force or pressure transducers may rely on some elastic component (cantilever beam for force, membrane for pressure) to convert the force into a displacement which is then detected using strain gauges. Alternatively, a piezoelectric crystal may be used to obtain an electrical output directly.

(iii) Temperature sensors use a variety of phenomena. Gas thermometers use the change in pressure with temperature of a gas, often helium. Thermocouples produce an electrical emf directly but require careful calibration. From low temperatures up to 500 or 600 K the most commonly used junction materials are copper and constantan (an alloy of copper and nickel). At higher temperatures tungsten and tungsten alloys are used. Resistance thermometers use the change in electrical resistance with temperature of metals or semiconductors, the choice of material depending on the temperature range. Pure platinum gives high precision over a wide temperature range; germanium or ordinary carbon resistors are widely used at low temperatures; and thermistors, in which a sintered powder, generally of metallic oxides, provides rapid variation of resistance with temperature, can be made to suit a wide range of temperatures. The change in resistance which provides a measure of the temperature is commonly measured using some kind of bridge circuit. Temperatures too high for normal sensors are measured by pyrometers in which an absolute measure of the intensity of radiation at some fixed wavelength is interpreted using Planck's radiation equation to give values of temperature. Liquid-in-glass thermometers, although commonly used as an indicator of temperature, only rarely qualify for precise measurement and cannot be used as a transducer to supply an electrical output.

(iv) Optical sensors may be of several different types. The vacuum photodiode contains a surface, generally of some cesium compound, which reacts to illumination by emitting electrons which are collected by an anode. The photomultiplier tube uses successive stages of secondary electron emission in an avalanche process to provide amplification of the current. Solid state devices may be of the photoconductive type, in which the resistance of a semiconducting material such as selenium or cadmium sulfide changes with illumination, or the photovoltaic type in which a semiconducting junction produces its own emf in response to the light. The familiar "solar cell" is of this variety and usually contains a junction between silicon and a metal. Infrared detectors may be of the photoconductive or photovoltaic type. They normally use semiconducting materials or junctions involving compounds like indium antimonide or alloys containing such materials as lead, tin, and tellurium. For wavelengths which do not excite photoelectrons or for cases in which absolute measurements of intensity are required, a bolometer can be used. This device is constructed to absorb all the incident radiation, regardless of wavelength, and convert it into a temperature increase which can be measured using a thermocouple or resistance thermometer.


(v) Acoustic transducers take various forms depending on the frequency of the radiation. For ultrasonic applications at frequencies of tens of kilohertz up to megahertz a piezoelectric transducer, commonly of quartz, is used to convert the pressure fluctuation in the sound wave to an electrical signal. At the lower frequencies of the auditory region microphones can use the piezoelectric effect in materials like lead titanate ceramic or the electrostatic properties of electrets, materials possessing permanent electric polarization.

(vi) Transducers for magnetic field measurements frequently use the Hall effect, in which a current-carrying conductor exhibits a transverse potential difference when placed in a magnetic field. Hall effect probes make a sturdy component for many common measurements of magnetic field but for other applications higher sensitivity may be required. For example, airborne surveys of the earth's magnetic field are commonly carried out using a flux-gate magnetometer, which detects the out-of-balance signal produced by an external field in a carefully balanced magnetic circuit. Still higher sensitivity is available from the proton precession magnetometer. This relies on the relationship between the frequency of proton precession (often in a water sample) and the magnitude of the surrounding magnetic field, and can supply a sensitivity as low as 10⁻⁵ of the external field. Still higher sensitivity is available from the magnetic dependence of optical transitions in rubidium and cesium atoms in the form of the optically pumped rubidium or cesium magnetometer. For the very lowest values of magnetic field the Josephson effect in superconductors has extended the range of magnetic field measurements by many orders of magnitude. Such devices require cooling to liquid helium temperatures but make possible the measurement of magnetic field changes of the order of 10⁻¹² tesla.

(vii) The various forms of ionizing radiation encountered in nuclear physics are detected by a large variety of techniques, of which some of the more common are described below.

Gas-filled tubes containing electrodes can be used under various conditions of pressure and potential difference to detect the ionization produced by fast particles or high-energy radiation photons. An ionization chamber can provide steady monitoring of particle flux densities and a similar gas-filled tube operated at higher potential constitutes a Geiger counter which provides a pulse of current for a single particle entering the counter. Scintillation counters detect individual photons of γ-rays or x-rays from the flash of visible light which is produced by passage through certain solid crystals or liquids. The solid materials are generally single crystals of organic materials such as anthracene or inorganic crystals of sodium iodide (doped with thallium). The light pulses emitted by the scintillation material are normally detected using photomultipliers. An enormous advantage of the use of a scintillation counter lies in the fact that the magnitude of the current pulse is closely proportional to the energy of the x-ray or γ-ray photon. These crystals thereby permit the analysis of such a beam into a spectrum of its energy components. X-ray and γ-ray beams can also be detected directly and analyzed using semiconducting crystals. Lithium-drifted germanium and silicon crystals, although requiring cooling to liquid air temperature, provide current pulses from individual x-ray or γ-ray photons which are accurately proportional to the photon energies.

Signal Processing Systems Signal processing procedures have been available since the introduction of electronic circuitry but were initially restricted to such elementary operations as amplification or heterodyning (the generation of a beat frequency to expedite the tuning and amplification of radio-frequency signals). Rather more sophisticated processing became available with the development of analog computers but was still limited to relatively simple arithmetic operations or integration and differentiation. The real revolution came with the development of digital data processing, and the availability of fast and powerful computers now makes it possible, in "real time," to perform almost any desired mathematical operation on the observations as they are made. If, as is frequently the case, the output of the sensor stage is in analog form, it is necessary to pass the signal through some form of analog-digital (a-d) converter. Examples of on-line signal processing procedures are given below.

(a) Basic Arithmetic Operations. Division, for example, can be used to calculate resistance directly as V/i or velocities from values of distance and time.

(b) Statistical Operations. These include calculation of such distribution parameters as mean, standard deviation, and correlation coefficient. Comparison with a prescribed function, linear or otherwise, can be carried out by least squares methods, or, in the absence of a specified function, a generalized polynomial fit can be obtained. In all cases progressive improvement of the accuracy of the calculations will result from continued revision as new observations become available.

(c) Time Averaging. This is a very powerful method to improve signal-to-noise ratios. If some phenomenon such as a resonance peak or a spectrum line is hidden in noise, we can carry out a process of repeated scans over the range of variable containing the signal. If we have some way of storing the information and averaging the results of the repeated scans, we shall find that the random noise will give, ultimately, an average of zero. The desired signal, on the other hand, will add positively on every scan and eventually appear free of the noise which formerly masked it.
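A toy demonstration of this averaging process; all numbers are invented, and the noise is deliberately made much larger than the signal:

```python
import random
random.seed(1)

points, scans = 200, 400
signal = [1.0 if 90 <= k < 110 else 0.0 for k in range(points)]  # hidden peak
accum = [0.0] * points

for _ in range(scans):                       # repeated scans over the range
    for k in range(points):
        accum[k] += signal[k] + random.gauss(0.0, 5.0)   # noise sigma = 5

average = [a / scans for a in accum]         # noise averages toward zero
print(f"peak region: {sum(average[90:110]) / 20:+.2f}")   # close to 1.0
print(f"baseline:    {sum(average[:80]) / 80:+.2f}")      # close to 0.0
```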

(d) Fourier Analysis. It is very frequently desirable to analyze a time-varying phenomenon into a frequency spectrum by the method of Fourier analysis. Such computation has recently been greatly facilitated by the development of the fast Fourier transform methods and is now commonly used in a wide range of applications. Typical is the Fourier transform spectrometer, in which optical spectra are processed in an interferometer. The resulting variation of fringe intensity with order of interference is analyzed to yield the frequency components of which the original beam was composed.
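In code the fast Fourier transform reduces to a single library call. A sketch using numpy, with a made-up two-component test waveform:

```python
import numpy as np

fs = 1000                                   # sampling rate, Hz
t = np.arange(0, 1.0, 1 / fs)
x = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 120 * t)

amplitude = np.abs(np.fft.rfft(x)) * 2 / len(x)   # one-sided amplitude spectrum
freqs = np.fft.rfftfreq(len(x), 1 / fs)
print(freqs[amplitude > 0.25])              # recovers the 50 Hz and 120 Hz lines
```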

(e) Optical Image Enhancement. A two-dimensional picture can be converted into digital data by a scanning process, thereby becoming susceptible to computer processing. Photographs showing such defects as lack of contrast, out-of-focus, or smearing from camera movement can be analyzed to determine the precise nature of the defect and subsequently corrected to construct an improved image. Many of the most spectacular results of space exploration among the planets would not have been visible without various processes of image enhancement to emphasize detail in particular ways, and similar improvement is available for microscopic images of biological material. Computer processing can also be used to create "false-color" images of normally invisible phenomena such as infrared emission, acoustic signals, and others. These are now familiar in many applications, such as the thermographs which identify poorly insulated houses or human breast cancer and the satellite-based photography which is used to study earth resources such as crops.

(f) Pulse-Height Analysis. This is a process in which separate storage is provided for pulses of differing height. The resulting display can then give a direct picture of, for example, a spectrum of γ-ray energies from a scintillation counter or x-ray energies from a Li-drifted silicon counter. This very powerful technique can be used to record any phenomenon in which the quantity in which we are interested can be converted into pulses with height dependent on the original variable. For example, lifetime studies on excited atoms will use a time-to-pulse height converter to provide a direct picture of the decaying radiation.
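The storage scheme itself is just a histogram of pulse heights. A minimal sketch with invented pulse amplitudes:

```python
# Sort each pulse into one of a fixed number of channels by its height.
pulses = [0.33, 1.17, 0.35, 0.34, 1.19, 0.36, 1.18, 0.33, 0.35, 1.21]

channels, full_scale = 16, 2.0
histogram = [0] * channels
for height in pulses:
    channel = min(int(height / full_scale * channels), channels - 1)
    histogram[channel] += 1

print(histogram)   # counts cluster in two channels, one per energy peak
```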

(g) Others. Other examples of on-line signal processing too complex to be described here but of too great importance to omit include pattern recognition and the various procedures for clinical examination by tomography. In these a signal, derived usually from x-ray absorption but occasionally from other phenomena, is analyzed to provide a picture of a cross-section of the human body as an aid to diagnosis.

Display and Storage Systems The traditional methods of needle and scale are now less frequently encountered, except as indicators. For time-varying phenomena the strip-chart recorder remains useful at low rates of change and the cathode ray oscilloscope is absolutely irreplaceable for rapidly changing phenomena. The addition of digital processing and memory to the CRO has made it a uniquely powerful and versatile instrument for the display and study of oscillations or time-varying phenomena over a wide range of frequencies and times.

Digital data processing makes possible the presentation of output information using digital numerical displays, based often on neon tubes or light-emitting diodes. These offer the convenience of direct access to output information without need for further interpretation, e.g., a range scale on a laser range finder. Similar convenience is available from the two- or three-dimensional cathode-ray tube displays in which the output of computer processing of observations can be viewed directly in pictorial form.

For cases in which later use of the results is intended, digital storage methods using magnetic tape and disc or punched paper tape permit vast quantities of information to be stored and easily retrieved. If onward transmission of the information is required, the use of digital techniques makes possible the rapid and accurate transfer of observations over interplanetary distances.

D. C. BAIRD

References

Klein, H. A., "The World of Measurements," New York, Simon and Schuster, 1974.

Rossini, F. D., "Fundamental Measures and Constants for Science and Technology," Cleveland, CRC Press, 1974.

Baird, D. C., "Experimentation," Englewood Cliffs, N.J., Prentice-Hall, 1962.

"The International System of Units (SI)," Washington, D.C., National Bureau of Standards, and London, Her Majesty's Stationary Office.

Janossy, L., "Theory and Practice of the Evaluation of Measurements," Oxford, Oxford Univ. Press, 1965.

Plumb, H. H. (Ed.), "Temperature: Its Measurement and Control in Science and Industry," Instrument Society of America, 1972.

Quinn, T. J., "Temperature," New York, Academic Press, 1983.

Levi, L., "Applied Optics," New York, Wiley, 1980.

Keyes, R. J. (Ed.), "Optical and Infrared Detectors," New York, Springer-Verlag, 1981.

Peterson, A. P. G., "Handbook of Noise Measurement," Concord, Mass., GenRad Inc., 1980.

Zijlstra, H., "Experimental Methods in Magnetism," New York, Wiley, 1967.

Knoll, G. F., "Radiation Detection and Measurement," New York, Wiley, 1979.

Oppenheim, A. V., "Applications of Digital Signal Processing," Englewood Cliffs, N.J., Prentice-Hall, 1978.

Cross-references: ASTRONOMY; COSMIC RAYS; ELECTRICAL MEASUREMENTS; MAGNETOMETRY; NOISE, ACOUSTICAL; NUCLEAR INSTRUMENTS; OPTICAL INSTRUMENTS; PHOTOGRAPHY; PHOTOMETRY; TELEMETRY; THEORETICAL PHYSICS.


MECHANICAL PROPERTIES OF SOLIDS

When a material is in the solid phase, its constituent particles, which may be atoms, ions, or chemical molecules, vibrate about fixed equilibrium positions in which the interparticle force is zero. In most solids composed of small constituent particles, e.g., metals and ionic solids, these interparticle interactions produce an internal atomic or molecular arrangement which is regular and periodic in three dimensions over intervals which are large compared with the unit of periodicity. Such solids are called crystals.

Solids composed of larger units, e.g., polymers, can be crystalline, though the crystallinity is usually rather imperfect, or they can be amorphous.

When a solid is deformed by external forces, the constituent particles have their separations changed from the equilibrium values. The resultant of the interparticle forces acting on a particular particle is then no longer zero, but acts to restore the particle to its original position relative to its neighbors. When the solid is in equilibrium under the action of external forces, the interparticle (or internal) forces must be in equilibrium to give continuity of the material and must also be equal to the external forces, i.e., any element of the body must be in equilibrium. These internal forces are maintained as long as the external forces are applied. When the external forces are removed the internal forces restore to the constituent particles their original separations.

If, after unloading, the body returns exactly to its former size and shape, its behavior is called perfectly elastic. If it retains completely its altered size and shape it is a perfectly plastic body. In general, the behavior of real bodies lies between these two extremes.

Stress and Strain Two types of forces may act on any element of a body: (a) surface forces, exerted by the surrounding material, which are proportional to the surface area of the element and (b) body forces, which are proportional to the volume of the element, e.g., gravitational forces. The effects of body forces are usually negligible compared with those resulting from surface forces.

For a body to be deformed and not merely accelerated when forces are applied to it the body must be in statical equilibrium under the action of the applied forces. The conditions of equilibrium are (a) there must be no unbalanced applied forces and (b) there must be no unbalanced applied couples. Further, the internal and external force equilibrium can be equated.

The effect produced in a given material by forces of given magnitudes depends on the size of the body to which they are applied, and hence, to enable a comparison to be made of the reaction to external loading of bodies of different size, the concept of stress is introduced.

The stress in an element is defined as force divided by area over which the force acts. It is described as a homogeneous stress if, for an element of fixed shape and orientation, the value is independent of the position of the element in the body. Usually the term stress is taken to mean stress at a point, and is the limiting value of force divided by area over which the force acts as the area tends to zero. If a force δF acts over a surface of area δA and makes an angle φ with the normal to the surface, the normal stress σ is

σ = lim_{δA→0} (δF cos φ)/δA

and the tangential or shearing stress τ is

τ = lim_{δA→0} (δF sin φ)/δA.

The stress is, of course, transmitted through the solid.

The change in the separation of the constituent particles of the solid produced by the applied forces is seen on the macroscopic scale as a change in the size and shape of the body. Since the deformation of different bodies of a given material subjected to a particular load is a function of the size of body, comparisons are made using the relative deformation, or strain, defined as

strain = change in dimension / original dimension.

A strain is homogeneous if, after deformation, lines of the body that were originally straight remain straight and lines that were originally parallel remain parallel.

The following strains are found to be convenient in describing the behavior of a body in various states of stress.

When a rod of unstretched length ℓ₀ has its length increased to ℓ by the application of external forces, the conventional, engineering, or nominal tensile strain ε is defined as

ε = (ℓ − ℓ₀)/ℓ₀.

Sometimes it is more convenient to use the true, natural, or logarithmic strain ε*, defined as

ε* = logₑ(ℓ/ℓ₀).

Clearly

ε* = logₑ(1 + ε).

If, as the result of the application of a uniform hydrostatic force, the volume of a solid changes from V₀ to V, the bulk strain θ is defined as

θ = (V − V₀)/V₀.


When a solid is sheared by the application of couples, the angle of shear is taken as a measure of the strain, in this instance a shear strain.

Elastic Behavior For very small strains (≲ 0.1 per cent) the behavior of many solid materials is almost perfectly elastic. In this strain range a specimen will exhibit a linear relationship between the magnitude of the applied forces and the deformation produced. This relationship is known as the Hooke law and in terms of stress and strain it may be stated in the form

stress = constant × strain.

The constant in this equation is called a modulus of elasticity. Each strain has a corresponding modulus of elasticity. These moduli are temperature-dependent, and in general, depend on the direction of measurement, but if elastically isotropic solids are considered, the value of a particular modulus is independent of the direction in which it is measured. A solid is effectively isotropic if it is composed of grains whose size is small compared with the smallest dimension of the solid and if the orientations of the grains are randomly distributed.

Consider a bar of uniform area of cross-section A and unstretched length ℓ₀ acted upon by forces F applied uniformly at the ends. If ℓ is the length when this load is applied, the stress σ is given by σ = F/A and the strain ε is ε = (ℓ − ℓ₀)/ℓ₀. (Tensile stresses are counted positive.) When the Hooke law is obeyed, F/A = E(ℓ − ℓ₀)/ℓ₀, where E is a constant for a given material at a given temperature and is known as the Young modulus of the material.
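A one-line application of this relation, sketched in Python with the steel value of E from Table 1 below; the bar dimensions and load are made up:

```python
E = 210e9        # Young modulus of steel, N/m^2 (Table 1)
A = 1.0e-4       # cross-sectional area, m^2
L0 = 2.0         # unstretched length, m
F = 20e3         # axial load, N (keeps the stress below yield)

strain = (F / A) / E                               # Hooke law: stress = E * strain
print(f"extension = {strain * L0 * 1e3:.2f} mm")   # about 1.90 mm
```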

When a solid has a hydrostatic stress σ applied to it, the volume changes from V₀ to V so that, if the Hooke law is obeyed,

σ = K(V − V₀)/V₀

where K is a constant at a given temperature and is known as the bulk modulus of the material.

When a solid is deformed by couples producing a shear stress τ, the angle of shear γ is taken as a measure of the strain so that, if the Hooke law is obeyed,

τ = Gγ

where G is a constant at a given temperature and is known as the rigidity modulus for the material.

The axial deformation of a prismatic bar with unloaded prismatic surfaces is accompanied by a change in the cross-sectional area. Experiment shows that the ratio

lateral strain / axial strain

is a constant known as the Poisson ratio ν. For the small strains encountered in pure elastic behavior the change in cross-sectional area is very small, so the difference in the stress calculated using the original area of cross-section and using that when the load is applied is negligible.

Elastic shear deformation, in contrast, takes place at constant volume. The elastic moduli are not independent and it can be shown that, for an isotropic solid,

E = 3K(1 − 2ν)

and

G = E/(2(1 + ν)).
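These relations can be checked numerically against, for instance, the aluminium entries of Table 1 below; agreement is only approximate for real polycrystalline samples:

```python
E, K, G, v = 71e9, 75e9, 26e9, 0.33   # Table 1 values for Al (N/m^2 and ratio)

print(f"3K(1 - 2v)   = {3 * K * (1 - 2 * v):.1e}  (tabulated E = {E:.1e})")
print(f"E / 2(1 + v) = {E / (2 * (1 + v)):.1e}  (tabulated G = {G:.1e})")
```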

Plastic Behavior When a solid is deformed under an increasing stress, a stage is reached when the further deformation produced by a slight increase in stress, though still elastic, does not obey the Hooke law. The stress at which the departure from linearity of the stress-strain curve first occurs is called the proportional limit or elastic limit. If the stress is increased beyond the elastic limit, a value is reached at which permanent deformation occurs, i.e., the specimen does not recover completely its original size and shape on unloading. The stress at which permanent deformation is first detected has, for very many materials, a value characteristic of the material at that temperature and is called the yield stress. The corresponding point on the stress-strain curve is the yield point. For many materials the elastic limit and yield stress have almost the same value and are not readily distinguished. The deformation not recovered on unloading is called the permanent set and the specimen is said to have suffered plastic deformation.

Plastic Deformation of Simple Crystalline Solids, e.g., Metals and Ionic Solids. The simplest mechanical test that can be performed on a solid is the tension test, and measurements made during such tests are often used to characterize particular materials.

Many simple crystalline solids are ductile at temperatures greater than about 0.3 to 0.4 of the melting temperature in kelvins. The plastic deformation of such materials takes place at approximately constant volume, and hence, for a specimen tested in tension, the cross-sectional area decreases as extension proceeds. This change in cross-sectional area with strain necessitates a more careful definition of stress. Two definitions are in common use, namely, conventional stress σc, sometimes called the nominal or engineering stress, defined by

σc = load / original area of cross section,

and true stress σt, defined by

σt = load / area of cross section under that load.

FIG. 1. (Conventional stress σc vs engineering strain ε, solid line, and true stress σt, dotted line, for a ductile material; the points O, A, B, C, D, E, F marked on the curves are referred to in the text.)

When a tensile test is carried out on a fine-grained sample of a ductile material the conventional stress σc vs engineering strain ε graph has the form shown by the solid line in Fig. 1. The actual shape of the curve depends on many variables, e.g., purity of the material, temperature of testing, and rate of straining.

Over the region OA the graph is a straight line passing through the origin, the behavior is perfectly elastic, and the Hooke law is obeyed. When the stress exceeds that at A macroscopic permanent deformation occurs and the curve bends towards the strain axis; the stress at A is the yield stress σy and A is the yield point. However, the stress needed to produce further plastic deformation increases with strain and the material is said to work-harden or strain-harden. If the specimen is unloaded when B is reached, the unloading path is BC, which has almost the same slope as OA; the elastic properties of the material are little affected by plastic deformation. The elastic strain FC is recovered, but the material retains the plastic strain OC, which is the permanent set. The plastic strain becomes an increasing fraction of the total strain as the latter increases. When the test is continued the stress rises along CB (ignoring a small amount of hysteresis which is sometimes observed), but before B is reached the curve bends towards the strain axis and then continues to rise as if unloading had not taken place. σc continues to rise until D is reached and then starts to fall. The load corresponding to D on the σc vs ε curve is the maximum load that the specimen can withstand in tension, and the value of σc corresponding to this load is called the ultimate stress or ultimate tensile strength σu.

For deformations represented by OD on the σc vs ε graph the extension is homogeneous, i.e., on the macroscopic scale the deformation is the same for all cross sections. At D, however, a neck forms in the specimen, all subsequent plastic deformation is restricted to this neck, and the load needed to produce further extension falls. The neck gets progressively narrower until fracture occurs at a strain corresponding to E.

When σt is plotted instead of σc, the curve has the form shown by the dotted line in Fig. 1. If, when the neck forms, σt is measured in the neck, σt continues to increase with strain up to fracture.
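Before the neck forms, plastic flow is nearly volume-conserving, so the current area is A₀/(1 + ε) and the two stresses are related by σt = σc(1 + ε). A sketch with invented tension-test readings:

```python
import math

# (engineering strain, conventional stress in N/m^2), hypothetical values
data = [(0.00, 200e6), (0.05, 230e6), (0.10, 252e6), (0.20, 270e6)]

for e, sigma_c in data:
    sigma_t = sigma_c * (1 + e)    # constant-volume area correction
    e_true = math.log(1 + e)       # true (logarithmic) strain
    print(f"e = {e:.2f}:  sigma_t = {sigma_t / 1e6:5.1f} MN/m^2,  e* = {e_true:.3f}")
```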

Mild steel and some other materials show a different behavior. The elastic range is terminated when the stress reaches a value known as the upper yield stress σUYS. There is an abrupt partial unloading and macroscopic plastic deformation occurs locally in regions called Lüders bands. These bands spread along the specimen and the value of σc oscillates about a relatively constant value known as the lower yield stress σLYS. When the Lüders bands cover the whole specimen, further deformation is macroscopically homogeneous and σc rises as the material work-hardens.

The stress-strain curves usually plotted use the engineering strain, but it should be noted that if true stress is plotted against true strain the resulting curve is the same for both compression and tension tests.

Some typical values of elastic moduli and yield stresses are given in Table 1. These values refer to measurements on fine-grained wires at room temperature.

TABLE 1

Material   E (N m⁻²)    K (N m⁻²)    G (N m⁻²)   ν      σy (N m⁻²)           σu (N m⁻²)
Al         71 × 10⁹     75 × 10⁹     26 × 10⁹    0.33   26 × 10⁶             60 × 10⁶
Cu         130 × 10⁹    138 × 10⁹    46 × 10⁹    0.34   40 × 10⁶             160 × 10⁶
Steel      210 × 10⁹    168 × 10⁹    83 × 10⁹    0.28   0.4 × 10⁹ (σUYS)     460 × 10⁶
                                                        0.3 × 10⁹ (σLYS)

The Deformation of Solid Polymers Polymer molecules consist of very long chains of atoms, often containing short side groups at regular intervals. In many of the common polymers the linking between neighboring chains is weak. This type of polymer is rigid at low temperatures and soft and rubbery at high temperatures, the transition being reversible. Such long-chain polymers, whose properties are strongly temperature-dependent, are called thermoplastics.

Polymers in which there are frequent strong links between neighboring long chains are said to be crosslinked. Crosslinked polymers have properties that are rubberlike.

Crosslinking is also found in the thermosetting plastics, in which nonlinear structures are formed. When these materials polymerize, a process accelerated by raising the temperature, the monomers group themselves into a rigid framework that is not softened when the temperature is raised again. These materials show brittle behavior under an applied stress.

Thermoplastics can be either amorphous (or glassy), in which state the polymer chains are randomly oriented and intertwined, or crystalline, in which small regions of the structure exhibit a definite arrangement of the polymer chains.

At very low temperatures both amorphous and crystalline polymers show essentially brittle behavior, having a Young modulus of about 10⁹-10¹⁰ N m⁻² and breaking at a strain of about 5%. When tested at temperatures close to the melting point the deformation of both types of material is dominated by the sliding of polymer chains over each other, large irreversible strains are produced, and the behavior is termed viscofluid.

Between these extremes of behavior is an intermediate temperature range, the glass transition range, in which the mechanical behavior is strongly time dependent and is called viscoelastic. One manifestation of the glass transition is a fairly abrupt change in volume expansivity, and this can be used to define a glass transition temperature Tg. At temperatures above Tg the polymer chains have a certain freedom of movement relative to each other, whereas at temperatures well below Tg there is a complete locking of the polymer chains and their individual segments.

At temperatures up to Tg amorphous polymers deform elastically under tension until the so-called yield stress σy is reached, when the stress drops to a lower value σd (the draw stress) and a neck appears in the specimen (see Fig. 2). With further deformation this neck propagates along the specimen and the stress only rises again when the complete gauge length has been drawn down to a neck. In this process the polymer chains are oriented in the direction of the applied stress and the material becomes stronger.

FIG. 2. (Tensile stress-strain curve of an amorphous polymer for T ≪ Tg.)



FIG. 3. (Tensile stress-strain curves of a crystalline polymer.)

At temperatures above Tg large strains develop from the start of the test and no yield stress drop is observed. In both these regimes the strain can be recovered completely by heating the material at a temperature above Tg.

Crystalline polymers show rather similar curves (Fig. 3), but brittle behavior is observed for temperatures up to Tg. Above Tg a neck is produced in tensile deformation, but it results from recrystallization of the polymer chains in the direction of the stress, since the melting point of the aligned chains is higher than that of the unaligned chains. Consequently, no recovery of the strain takes place when the material is heated. At temperatures around Tg the propagation of the neck is usually terminated by flaws in the material, but well above Tg the neck propagates along the entire length of the specimen and the stress rises when the aligned polymer chains become strained.

Some data for common polymeric materials are given in Table 2.

TABLE 2

Material                    State at Room Temperature        E (MN m⁻²)   Tg (K)
Polyethylene                partially crystalline            70-280       153
Polyvinylchloride           amorphous/slightly crystalline   2500-3500    353
Polymethylmethacrylate      amorphous                        2500-4000    380
Nylon 6                     crystalline                      2000-3000    323
Phenol formaldehyde resin   thermosetting glass              7000

M. T. SPRACKLING

References

Benham, P. P., "Elementary Mechanics of Solids," New York, Pergamon Press, 1965.

Calladine, C. R., "Engineering Plasticity," New York, Pergamon Press, 1969.

Hall, C., "Polymer Materials," London, Macmillan Press, 1981.

Honeycombe, R. W. K., "The Plastic Deformation of Metals," London, Edward Arnold, 1975.

Cross-references: ELASTICITY, POLYMER PHYSICS, SOLID-STATE PHYSICS, VISCOELASTICITY.

MECHANICS

"Give me matter and motion," proclaimed Rene Descartes, "and I will make the Universe." And what the renowned 17th-Century philoso­pher meant by that bold and somewhat cryptic remark was that at the very primal heart of things physical, there is matter in motion. Understanding the marveolus subtleties, the unity of that fundamental reality, is essential to knowing the Universe on any level. In its broadest sense mechanics is just that: it is the study of the relative movement of objects (actual and impending), or, if you will, the study of motion and rest, the latter merely be­ing a special case of the former.

The subject developed historically along several different lines driven by very different practical concerns. For the most part, the descriptive aspects of the study evolved more successfully earlier on, followed only later by an effective explanatory capability. The result was an almost natural partition of mechanics into several broad subdivisions which are usually (though not universally) designated as kinematics, dynamics, and statics.

The description of every sort of motion, without regard to either the cause thereof or to the physical nature of that which is moving, is known as kinematics. Insofar as it deals with the changing locations of objects in space and time, it can be regarded as the geometry of motion.

By contrast, dynamics (sometimes referred to in part as kinetics) is the study relating motion and the changes therein with the corresponding causative interactions. Dynamics seeks to explain the motion described by kinematics.

As a special case of dynamics, statics relates specifically to the conditions governing constant relative motion (including "rest"). Historically the discipline evolved simply as the study of objects at rest and that is still the quintessence of the business; but insofar as we now know that absolute rest is a fiction, the purview of statics is specified more appropriately as that of objects in unchanging motion. Loosely speaking, statics deals with systems that can be imagined to be motionless (like the Brooklyn Bridge). It is the study of the balance of interactions operating on and within any material system which results in its effectively being at rest. Unlike dynamics, wherein time is central, the imagery of statics is quite independent of time; presumably nothing changes.

Kinematics The primary goal of kinematics is to provide a quantified description of motion, one in which the necessary concepts are inherently measurable. The formulation begins naturally enough with the familiar ideas of space and time. Even so, it is an illusion to think that we can satisfactorily define these most basic underlying concepts. Pragmatically we must content ourselves with measuring intervals of space and time with meter sticks and clocks, relying on intuition for conceptual meaning.

Clearly the faster an object moves, the farther it will travel in a given amount of time. That's the crucial insight that leads to a definition of speed, the measure of "how fast." The oldest surviving thoughts on the subject are those of Aristotle (384-322 B.C.) and, like his fellow Greeks of the era, he specified speed as the distance traversed in a given amount of time. And that's just the way it was framed for well over a thousand years: a thing traveled with a speed of "so many miles in so many hours." Nowadays we say almost, but not quite, the same thing, defining average speed (vav) as the interval of distance traveled (Δl) divided by the interval of time (Δt) it took to do the traveling: vav ≡ Δl/Δt. However close the ancients got to the idea, they never actually carried through the division, apparently because of a hesitance to divide the "unlike" notions of space and time.

The scholars of the mid-1300s, especially at the University of Paris and Merton College, Oxford, dealt quite successfully with the idea of constant speed. But when they attempted to define the speed of an object at any given moment, the instantaneous speed, they failed. That was not surprising, for they lacked the mathematical imagery of motion, a calculus of change, something Newton would create just for the purpose centuries later. As the time interval Δt over which vav is determined is made smaller and smaller, the ratio Δl/Δt approaches a value known as the instantaneous speed, or what we nowadays just call the speed (v). That limiting process actually defines the derivative and so in the notation of calculus, v = dl/dt: speed is the time rate of change of distance, the derivative of distance with respect to time.
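The limiting process can be watched numerically. A sketch for the made-up distance function l(t) = 4.9t² (free fall), evaluated at t = 1 s:

```python
def l(t):                 # distance traveled as a function of time, meters
    return 4.9 * t * t

for dt in [1.0, 0.1, 0.01, 0.001]:
    v_av = (l(1.0 + dt) - l(1.0)) / dt      # average speed over the interval
    print(f"dt = {dt:6.3f} s:  v_av = {v_av:.4f} m/s")
# v_av tends to the instantaneous speed dl/dt = 9.8 m/s as dt -> 0.
```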

732

If, instead of the distance traveled, we consider the displacement (s), which is the vector drawn from some origin to the moving body, we can express in a single concept both the speed and the direction of motion. Accordingly, velocity (v) is the time rate of change of the displacement, v = ds/dt, and the magnitude of the velocity is the speed.

Variations in motion are commonplace; change is the rule rather than the exception and the measure of this change is called acceleration. Aristotle considered the concept, hinting at it in his book Physics, but never quite grasping it clearly. His follower Strato (ca. 340-270 B.C.) seems to have been the first person to appreciate the real-world importance of the idea. He suggested that a body was accelerating when it traversed equal increments of distance in shorter and shorter times. During the 12th-Century revival of science, the alternative formulation was set forth that acceleration obtained when a body traveled greater and greater distances in successive equal intervals of time. Only later, in the 14th Century, did the realization begin to emerge that variations in the speed itself were the essence of the concept. Today we define acceleration (a) as the time rate of change of velocity: a ≡ dv/dt.

Along with the notions of displacement and time, velocity and acceleration complete the vernacular of kinematics. Constructing the equation of motion of the system (the expression of displacement as a function of time) becomes the primary task of the discipline.

Dynamics The explanations of dynamics go beyond the descriptions of kinematics, requiring additional concepts which reflect the physical nature of mover and moved. Perhaps the richest and, at the same time, most elusive of these notions is that of mass.

The idea that there had to be another measure of matter, in addition to the old standbys of weight and volume, was first proposed by the theologian Aegidius Romanus (ca. 1247-1316). In an effort to resolve some complex religio-philosophic questions concerning the Eucharist, he suggested that the true measure of a substance, the "how much" of a material entity, was its quantity-of-matter. Though in the end it had little or no influence on theology, the new insight, however undefined, found a welcome place in medieval dynamics. The Parisian physicist Jean Buridan (ca. 1300-1385) utilized the conception of quantity-of-matter in his highly influential impetus theory. By the 17th Century, the phrase quantity-of-matter and the term mass (already long in common unscientific usage) had become synonymous; Newton used them interchangeably.

Buridan is responsible for conceiving one of the most important dynamical ideas to come out of the Middle Ages. He began with a question which is essentially equivalent to this: "Given that both are traveling at the same speed, which would you rather get hit by, a firefly or a fire engine?" Obviously the firefly, but why? What aspect of motion, above and beyond just speed, is involved? He suggested that the essential measure of motion was proportional to both quantity-of-matter and speed, that is, it depended on their product. This new metaphor of motion would soon come to be known as the quantity-of-motion and ultimately it would be reinterpreted and renamed momentum. Momentum (mv) has proved to be one of the fundamental characteristics of the movement of all things.

Since the time of Aristotle, it had been widely assumed that the sustained motion of an object required the action of a sustained force. It was Galileo Galilei (1564-1642) who most convincingly challenged this seemingly reasonable view. Still, it should be pointed out that there were others before him who had thought, if somewhat tentatively, along similar lines. The Tuscan master performed a series of experiments which led him to conclude that an object once set in motion and left alone will continue in motion all by itself forever. That was the law of inertia, one of the first grand insights into the long hidden workings of the Universe. On a sizable planet like Earth, gravity and all sorts of friction conspire to obscure this all-important underlying principle; that is why it remained unrevealed for 2000 years. Had we been dwellers on a far smaller vessel in space, the natural tendency for things in motion to continue in motion would have been quite obvious. As it is, we are not, and it is not, and it took the genius of Galileo to see beyond the Scholastic fiction, the age-old error that the natural tendency of matter was to rest.

When Isaac Newton (1642-1727) came to codify motion in his masterpiece, the "Mathematical Principles of Natural Philosophy," he began with a series of definitions. The first was a rather unsatisfactory attempt at defining quantity-of-matter, while the second was a clear statement of quantity-of-motion framed as the product of mass and velocity. Newton then set out the three "axioms or laws of motion" which form the basis of dynamics (and statics) even to this day. The first of these was the law of inertia: Every body continues in its state of rest, or of uniform motion in a straight line, except insofar as it is compelled to change that state by forces impressed upon it. Force is the agent of change; it does not sustain motion, it changes it.

Newton's second law is a quantified recasting of the first law. Modernized somewhat in its language, it reads: The rate of change of the quantity-of-motion (i.e., the momentum) of a body is equal to, and occurs in the same direction as, the net applied force, F = d(mv)/dt. From the first law we have that that which changes motion is force and now, more specifically, force equals the change in the motion per unit time (the measure of motion being the quantity-of-motion or the momentum).

If mass is taken to be constant, F = m dv/dt and so F = ma. This is the hallmark of Newton's theory and yet it does not appear anywhere in his work. In fact, it actually was introduced decades later by the Swiss mathematician Leonhard Euler. It is a real tribute to Newton's vision that his formulation in terms of momentum is in perfect accord with modern relativity theory, whereas F = ma is not. Mass is a function of speed and therefore not constant in time, but that would not be shown until 1905 and Einstein.

Sir Isaac's third axiom, his third law of motion, completes the logical picture of force which is the very pillar of his dynamics. An isolated body follows the law of inertia. It cannot alter its own motion; that requires some outside intervention called force. And when two bodies, like billiard balls, interact, it's only reasonable to assume that both will be affected, both motions will be altered, both will experience a force. Leonardo da Vinci (1452-1519) had pointed the way long before. "An object offers as much resistance to the air," he wrote, "as the air does to the object. You may see that the beating of an eagle's wings against the air supports its heavy body in the highest and rarest atmosphere . . . ." Whatever the source of inspiration, and others had grasped the essence of it, too, Newton provided the final link: "To every action there is always opposed an equal reaction: or the mutual actions of two bodies upon each other are always equal, and directed to contrary parts." The interaction of two entities always occurs via an equal action-reaction pair: force and counterforce. There is no such thing as a single force; force is a thing of pairs.

The second and third laws combine to reveal one of the guiding principles of modern physics, the law of conservation of momentum. If two objects interact, the forces acting on each will be equal and opposite and so, too, will be their resulting changes in momentum. In terms of the system as a whole, these paired opposite momentum changes cancel each other, leaving the net or combined momentum unaltered. The total momentum of a system of interacting masses must remain unchanged provided that no net external force is applied. As Newton put it, "the quantity-of-motion . . . suffers no change from the action of bodies among themselves." Amusingly, the idea had been speculatively anticipated by René Descartes (1596-1650), who wrote of the Creator: "He conserves continually . . . an equal quantity-of-motion" in the Universe.
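The bookkeeping is easy to verify. A sketch for a one-dimensional elastic collision, with made-up masses and velocities:

```python
m1, v1 = 2.0, 3.0      # kg, m/s
m2, v2 = 1.0, -1.0

# Standard final velocities for a perfectly elastic head-on collision.
u1 = ((m1 - m2) * v1 + 2 * m2 * v2) / (m1 + m2)
u2 = ((m2 - m1) * v2 + 2 * m1 * v1) / (m1 + m2)

print(f"before: {m1 * v1 + m2 * v2:.3f} kg m/s")   # 5.000
print(f"after:  {m1 * u1 + m2 * u2:.3f} kg m/s")   # 5.000, momentum conserved
```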

To the kinematical ideas of time, displacement, velocity, and acceleration had been added the dynamical concepts of mass, momentum, and force, all united in the credo of the three laws. Brilliant physicists would spend the next two hundred years mathematically honing the fine edge of Newton's force-dynamics. And all the while another complementary piece to the scheme, another powerful vision, was slowly evolving: the concept of energy.

The necessity to formally quantify work came out of the practical needs of the engineers and scientists of the late 18th Century at the start of the Industrial Revolution. Work ≡ force × distance; work equals the force applied to an object multiplied by the distance through which it moves, a quantity easily measured with scales and meter sticks and just as easily bought and paid for. Power, the amount of work done per unit time, was also a practical measure dictated by the demands of the new machine age.

In "The Two New Sciences," Galileo had long before shown some grasp of the key idea. He talked about the physics of pile drivers and recognized that the weight of the hammer and the distance through which it fell determined its effectiveness-force and distance related in a crucial way.

Suppose we do work on a hammer, exert a force on it and cause it to accelerate, and then slam it down on a nail bringing it to rest, thereby doing work on the nail; we have work-motion-work. And if we recognize that work is the changer of energy, then the hammer apparently has some sort of energy of motion.

Interestingly, the underlying insight to all of this actually began to evolve roughly a hundred years before it reached maturity in the 19th Century. Christian Huygens (1629-1695) never cared much for Descartes's reliance on the idea of quantity-of-motion. Momentum to be really meaningful must be framed as a directional quantity, a vector, like force. A body at rest could explode into two pieces violently flying in opposite directions and yet the total momentum would remain zero throughout. That bothered Huygens, who suggested a different measure of the motion, one which would be independent of the direction of the velocity, one which would only vanish when all the motion actually ceased. He subsequently decided on the product of the mass and speed squared. Gottfried Leibniz (1646-1716), Newton's bitter rival, picked up the idea, calling mv² the vis viva or living force. The great and meaningless vis viva controversy between the followers of Descartes and Leibniz roared on for decades as each side claimed the fundamental notion.

It was not until the beginning of the 1800s that Thomas Young (1773-1829) shifted the imagery and spoke of mv² as energy. "Labour expended in producing any motion," he wrote, "is proportional . . . to the energy which is obtained." Then Gaspard de Coriolis (1792-1843), using Newtonian mechanics, showed that the work done on a system was equivalent to a change in the quantity ½mv². Lord Kelvin (1824-1907), years later, dubbed this the kinetic energy. Vis viva was forgotten, drowned out by the roar of the Industrial Revolution, and in its place stood its look-alike, kinetic energy.

Suppose that there is some sort of driving force constantly exerted on a body, like its weight. To move against that pull requires the application of a counterforce and the doing of work. The crucial point is that the force (be it elastic, electric, magnetic, gravitational, whatever) continues to act even after the displacement. Once let loose, that force will drive the body back, imparting kinetic energy in the process. Clearly it is possible to do work and not have it immediately appear as kinetic energy, and yet the potential for generating that energy is there. This retrievable stored energy, energy by virtue of position in relation to a force, is known as potential energy, a name given it by William Rankine (1820-1872).

When we consider the kinetic and potential energy of a body as a whole, it is understood that all of its atoms act together in an organized fashion. Alternatively, it is possible to impart motion to the individual constituent atoms which is disorganized, motion not of the body, but within the body. A pendulum swings until it ultimately comes to rest; organized kinetic energy is transformed into disorganized kinetic energy or, as it's called nowadays, thermal energy. The ubiquitous agent of that transformation is known as friction.

One of the great revelations of the previous century was the law of conservation of energy opened out to include both thermal and mechanical processes: Energy can neither be created nor destroyed, but only transformed from one form to another. Whatever energy is (and we have no satisfactory conceptual definition of that underlying quantity, although we know its various manifestations) it is conserved.

To the force-time-momentum imagery of the 18th Century was added the force-space-energy vision of the 19th Century. And then in the 20th Century Albert Einstein (1879-1955), questioning the very basic understanding of space and time, profoundly recast kinematics and dynamics in his special theory of relativity. Newtonian mechanics turned out to be the low-speed approximation of the new vision whose real significance becomes apparent only at speeds that are appreciable in comparison to the speed of light.

The theory builds from two basic postulates. The first is known as the principle of relativity, which states that all the laws of physics are the same for nonaccelerating observers. The second, the principle of the constancy of the speed of light, maintains that light propagates in free space with a speed (c ≈ 300,000,000 m/s) that is independent of the motion of the source (and of the observer). The speed of light in vacuum is absolute.

Among the many surprises provided by the new vision was the realization that rest, motion, simultaneity, time, length, and mass are not absolutes as had long been thought. Instead these fundamental quantities are relative, they depend on the motion of the observer.

The result, which Einstein himself thought was "the most important," was the equivalence of mass and energy. These two seemingly different concepts are actually manifestations of one single entity, mass-energy: E = mc²; the total energy E of an object equals its mass m multiplied by the speed of light squared.
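
As a worked illustration (the mass value here is chosen for convenience and is not taken from the original article):

E = mc^2 = (1\ \mathrm{kg})(3 \times 10^8\ \mathrm{m/s})^2 = 9 \times 10^{16}\ \mathrm{J},

an energy comparable to that released by roughly twenty megatons of TNT.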

Statics The central problem that was never far from the surface in the early days of the development of kinematics and dynamics was the motion of the heavens. Aristotle had woven his theory of motion into his cosmography to form a single fabric that would stand for two thousand years. By contrast, the motivation behind the development of statics was far more mundane and pragmatic. The ancients, who weighed out their goods on balances, who raised stone, hauled ships and pitched tents, were all practitioners of statics, even before the formal body of knowledge evolved.

An object experiencing no change in its motion, no acceleration, is said to be in equilibrium. Hence it follows from Newton's second law that translational equilibrium (a = 0) obtains when the sum of the forces acting on a body is zero, ΣF = 0. If the two teams in a tug of war pull equally hard in opposite directions, the net force is zero and the rope remains motionless. This is the first of the two conditions of equilibrium.

The simple equal-arm balance was already in widespread use well over 4000 years ago. It is not surprising, then, that Aristotle, in his "Mechanica," attempted to analyze the important practical problem represented by balances, levers, and seesaws (which are just variations on one theme). When equal downward forces are applied to each end of the centrally pivoted rod, the system remains motionless, balanced horizontally. Any tendency to rotate clockwise is canceled by an equal tendency to rotate counterclockwise, and we say that the system is in rotational equilibrium. In fact, the word equilibrium derives from the Latin aequus, even or equal, and libra, a scale or balance.

It was the great Archimedes of Syracuse (287-212 B.C.) who, in his treatise "On Equilibrium," framed the law of the lever in a satisfactory way. Unequal forces (of magnitudes F₁ and F₂) acting perpendicularly on a bar at unequal distances from the pivot (r₁ and r₂, respectively) balance each other provided that F₁r₁ = F₂r₂. It is not enough to be concerned only with the sizes of the forces; the distances from the pivot at which they act are crucial as well.

Da Vinci, aware that the human body achieved its mobility via a system of various kinds of levers, set himself to the study of these simple machines. He was among the first to recognize the significance of the idea of the moment of a force.


Envision a force F acting on a body which is pivoted at a point O: The lever arm or moment arm of the force F with respect to the axis passing through O is the perpendicular distance r⊥ drawn from O to the line of action of F. The moment of the force about O is then defined as the product of the magnitude F and r⊥. Nowadays it is common practice to symbolize this quantity by the Greek letter tau, τ, and refer to it as torque (from the Latin torquere, to twist). The law of the lever is then simply a requirement that the two opposing torques be equal. Formulated as a vector, torque becomes τ = r × F.
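
As an illustrative sketch (not part of the original article; the numerical values are invented for the example), the vector relation τ = r × F and the law of the lever can be checked numerically:

    import numpy as np

    # Position of the point of application relative to the pivot O, in meters.
    r = np.array([0.5, 0.0, 0.0])
    # Applied force in newtons, perpendicular to r.
    F = np.array([0.0, 20.0, 0.0])

    # Torque about O is the vector cross product r x F.
    tau = np.cross(r, F)
    print(tau)  # [ 0.  0. 10.] -> 10 N·m directed along the z-axis

    # Law of the lever: opposing torques balance when F1*r1 equals F2*r2.
    F1, r1 = 20.0, 0.5   # 20 N applied 0.5 m from the pivot
    F2, r2 = 10.0, 1.0   # 10 N applied 1.0 m from the pivot, opposite side
    print(F1 * r1 == F2 * r2)  # True -> rotational equilibrium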

In the case of a rigid body (viewed as a collection of interacting point masses) Newton's second and third laws lead to the second condition of equilibrium: For a rigid body in equilibrium the sum of the torques about any point (due to all the externally applied forces acting on it) must be zero, Στ = 0.

The two conditions of equilibrium provide a basis for the analysis of the forces at work in all sorts of mechanical systems from trusses and bridges to the muscle-bone structure of the human body.

Mechanics, the taproot of physics, is the seminal discipline, whether it is general relativity on the cosmic scale, quantum mechanics in the micro-domain of the atom, or Newtonian mechanics in the macro-world of baseballs and ballistic missiles. "Give me matter and motion and I will make the Universe."

EUGENE HECHT

References

Hecht, E., "Physics in Perspective," Reading, Mass., Addison-Wesley Pub. Co., 1980.

Cajori, F., "A History of Physics," New York, Dover Pub., Inc., 1962.

Jammer, M., "Concepts of Mass," Cambridge, Mass., Harvard Univ. Press, 1961.

Toulmin, S., and Goodfield, J., "The Fabric of the Heavens," New York, Harper and Brothers, 1961.

Cross-references: DYNAMICS, ELASTICITY, GRAVITATION, MASS AND INERTIA, MECHANICAL PROPERTIES OF SOLIDS, STATICS.

MEDICAL PHYSICS

That physics has an important place in medicine can scarcely be denied. A physician's first move in examining a patient is to measure his temperature, count his pulse, listen to his heart sounds and take his blood pressure. Only much later does the physician get around to chemical and laboratory tests. Yet every hospital of any stature has a laboratory or a department of clinical chemistry. Laboratories of clinical physics are virtually nonexistent. While physics plays a large role in medical diagnosis and treatment, physicists have largely neglected the field.

Some of the earliest applications of the principles of physics to problems in medicine were in the fields of optics and sound. An early contributor was H. L. F. von Helmholtz, a physician as well as a physicist. His work in physiological optics and that on the sensations of tone are considered classics. Even earlier, J. L. M. Poiseuille, a French physician and physicist, seeking a better understanding of the flow of blood, studied the flow of water in rigid tubes. His work not only contributed to physiology but also established an important relation in the physics of viscous fluids. D'Arsonval, a French physicist, pioneered in the therapeutic use of high-frequency electric currents and measuring instruments. Much earlier, that unusual artist, inventor and physicist, Leonardo da Vinci, had shown a keen interest in the fascinating mechanics of human locomotion.

With the intensive development of the sciences of physics and medicine in the latter part of the nineteenth century, the two drew further apart. This period also saw rapid development in the science of physiology, which is concerned not only with chemical but also with physical processes in the body. Clinical physiology abounds with such concepts as the pressure-velocity relationships in the flow of blood, the mechanics of the cardiac cycle, the work of breathing, gas exchange in the lungs, voltage gradients in cellular membranes, and cable properties of nerves, to name but a few. These concepts have, of necessity, been worked out by scientists with training and experience in the basic biological and clinical procedures. Physicists have been inactive in the field and have made very little contribution to its development. But there is a growing awareness among physiologists of the importance of physical principles and the need for precise statement of physical law. An example of this conviction is Howell's Textbook of Physiology, in which editions since the eighteenth have carried the title "Physiology and Biophysics."1

A phenomenon of the mid-twentieth century has been the development of interdisciplinary fields of science. BIOPHYSICS combines the most fundamental of the biological and physical sciences. It has had an extremely rapid growth, with something like 30 to 40 university Departments of Biophysics in America alone. Its emphasis has been on the application of physical principles to all aspects of biology: cellular, botanical, and zoological, as well as clinical.

An even more recent phenomenon has been the development of biomedical engineering. Its basis has been the application of the tremendous developments in electronics to medical measurements and instrumentation. In fact, the field is frequently referred to as biomedical electronics. Such recent developments as vector electrocardiography, implantable pacemakers for the heart, and intensive-care physiological monitors and recorders are examples of the impact of electronics on medicine. While these fields border on medical physics, none is concerned primarily with the application of physical principles to clinical problems. Yet they compete so effectively with medical physics that it is difficult to delineate the boundaries of the latter.

The discovery of X-RAYS by Roentgen in 1895 had an immediate impact upon medicine. Within a few months, the new rays were used both diagnostically and therapeutically. Indirectly, their application set the stage for the development of medical physics. Therapeutic application of x-rays raised questions concerning their quality and quantity, both of which are important in accurate dosimetry. Evaluation of early successes and failures indicated the importance of the proper distribution of dose between neoplasm and normal tissue. The physician turned to the physicist for assistance. The late Otto Glasser was one of the early radiological physicists; he and Fricke in 1924 constructed an air-wall ionization chamber for the measurement of radiation dose.2 Their construction eliminated some of the nonlinear effects due to quality, i.e., photon energy distribution, in the evaluation of biological response. Other early workers in America were Edith Quimby and G. Failla. In England, L. H. Gray and W. V. Mayneord were active. In 1936, Gray proposed the Bragg-Gray formula for determining the absolute amount of energy delivered to a medium from ionization measurements.3 The work of Fricke, Glasser, and Failla, along with that of L. S. Taylor4 and others, contributed to the establishment in 1928 by the Second International Congress of Radiology of the roentgen as a unit of radiation dose based on the amount of ionization generated in a standard volume of air. The use of higher energies and ionizing radiations other than x-rays led during the 1950s to the abandonment of the roentgen as a unit of absorbed dose. Dissatisfaction with the roentgen was also due to a growing realization that biological response was more nearly related to the energy absorbed in a medium. The Bragg-Gray formula permitted the calculation of absorbed dose in a medium, and the work of J. S. Laughlin5 established the dosimetry of high-energy radiations in energy units by calorimetric methods. While radiological physics is clearly a part of the broader discipline of medical physics, it included in the early days practically all that was organized of the later subject.

For many years there were no organizations of workers in the field of medical physics. In America, radiological physicists were associate members of the Radiological Society of North America, naturally dominated by radiologists. First in Britain (the Hospital Physicists' Association) and later in America, specialty groups were organized. The American Association of Physicists in Medicine (AAPM) brings together those physicists working in hospitals and medical schools and interested in an understanding of the physical side of medical problems. The membership has been largely drawn from those working in radiological physics, but an interest in all areas of the application of physics to medical problems is rapidly developing. The AAPM became affiliated with the American Institute of Physics in 1958 and is now showing the most rapid growth of its nine member organizations.6 In 1973, the AAPM began publication of its own journal, Medical Physics. A further indication of the developing awareness of this field is the series of International Conferences on Medical Physics, which meet every three years (the Sixth met with the International Conference on Medical and Biological Engineering in Hamburg, Federal Republic of Germany, in 1982).

One last word about a related field: Radiation protection was in the early days a part of radiological physics. In America, L. S. Taylor was active for many years at the Bureau of Standards in setting up guidelines for protection from radiation. During World War II, the Manhattan Project required large numbers of workers in the field of protection, and the term HEALTH PHYSICS was introduced. Since the war, the field has grown with the growth of the area of atomic energy. The Health Physics Society is a large and growing group with many local chapters and an international organization. The field seems, though, to be becoming more closely aligned with the area of public health than with clinical medicine.

When the first edition of the "Encyclopedia of Physics" was issued, the viability of medical physics as a profession was uncertain. It appeared that any breakthrough in the cure of cancer, eliminating radiation therapy as a modality of treatment, would also eliminate the livelihood of medical physicists. But much was already happening, and the results that have changed the picture are now apparent. Electronics and nuclear physics have made many previously unmeasurable variables in medicine accessible to quantification. The introduction of the concept of the modulation transfer function has made a science of the evaluation of diagnostic imaging quality. Nuclear medicine has had wide applications and attracted many physicists into the field. Ultrasound and computed tomography have had spectacular successes in improving medical diagnostics. And now nuclear magnetic resonance imaging is beginning to be applied. During the 1970s, with shrinking financial support for physics, many recent PhDs entered the field of medical physics as postdoctoral trainees. Physicists in general became more aware of the opportunities and the need to apply physical principles to the many problems in medicine. Many large physics accelerator installations developed programs for the treatment of cancer. Even the largest accelerator in America, Fermilab, has a large and continuing program for cancer treatment with high-energy neutrons. The award of a Nobel prize in Physiology and Medicine in 1977 to Rosalyn S. Yalow7 for the development of radioimmunoassay served notice to the world that physicists are contributing to the solution of problems in medicine. This was followed two years later by the award of Nobel prizes to a physicist, Allan M. Cormack,8 and an engineer, Godfrey N. Hounsfield,9 for the development of computed tomography. Certainly, one can now state that the new discipline of Medical Physics is healthy, growing, and showing unmistakable signs of survival. And one can find few areas of physics where the challenge is greater, or the rewards more satisfying, than in making accurate evaluation of physical variables in the living human patient.

LESTER S. SKAGGS

References

1. Ruch, T. C., and Patton, H. D. (Eds.), "Physiology and Biophysics," 20th Ed., W. B. Saunders, Philadelphia, 1973.

2. Fricke, H., and Glasser, O., "Standardization of the Roentgen Ray Dose by Means of the Small Ionization Chambers," Am. J. Roentgenol. 13, 462 (1925).

3. Gray, L. H., "An Ionization Method for the Absolute Measurement of X-Ray Energy," Proc. Roy. Soc. London Ser. A 156, 578 (1936).

4. Taylor, L. S., and Singer, G., "An Improved Form of Standard Ionization Chamber," J. Res., Natl. Bur. Std. 5, 507 (1930).

5. Genna, S., and Laughlin, J. S., "Absolute Calibration of Cobalt-60 Gamma Ray Beam," Radiology 65, 394 (1955).

6. Porter, B. F., "AIP Member Societies Entering the 1980's," Physics Today 34, 27 (1981).

7. Yalow, R. S., "Radioimmunoassay: A Probe for the Fine Structure of Biologic Systems," Science 206, 1236 (1978).

8. Cormack, A. M., "Early Two-Dimensional Reconstruction and Recent Topics Stemming from It," Science 209, 1482 (1980).

9. Hounsfield, G. N., "Computed Medical Imaging," Science 210, 22 (1980).

Cross-references: BIOMEDICAL INSTRUMENTATION, BIOPHYSICS, HEALTH PHYSICS, MOLECULAR BIOLOGY, RADIOACTIVITY, X-RAYS.

METALLURGY

The metallurgical industry is one of the oldest of the arts, but one of the youngest of the subjects to be investigated systematically and considered analytically in the tradition of the pure sciences. It is only in comparatively recent times that any fundamental work has been carried out on metals and alloys, but there are now well-established and rapidly growing branches of science which are related to the metallurgical industry.

Extraction Metallurgy Extraction metallurgy, or the science of extracting metals from their ores, is broadly divided into two groups.

Ferrous. This branch is concerned with the production of iron (normally from iron ore, with coke and limestone in a blast furnace) and its subsequent refining into steel, by oxidizing the impurities either in an electric arc furnace by means of an appropriate slag on the surface or in a "converter," by blowing oxidizing gas through the molten iron. The most striking recent developments in this field have been the increasing use of pure gaseous oxygen in steelmaking, and the increasing size of furnaces, with a resultant improvement in efficiency, rate of production, and quality of product. Over 500 million tonnes, or 70% of the world's annual steel production, is now made using oxygen converters.

Nonferrous. Some metals, such as chromium, cobalt, and manganese, are principally produced as alloying elements to improve the properties of steels. The nonferrous metals manufactured in greatest quantity include aluminum, copper, nickel, zinc, magnesium, lead, and tin, with titanium being an important newcomer in view of its low density, high melting point (1943 K) and resistance to corrosion. The precious metals, and the "refractory metals" of very high melting point (e.g., tungsten and molybdenum), are other important families.

Shaping of Metals This may be carried out in three main ways.

Casting. Most metals are initially cast into ingots, which may be subsequently forged to shape. The technique of continuous casting is increasingly used in this context to improve efficiency and to increase the rate of production. By 1990 it is expected that one-half of the world's steel output will be continuously cast. Many alloys are designed to be cast into their final shape by pouring the molten alloy into an appropriate mold. These may be sand molds if only a small number of objects are required, and very massive castings (e.g., over 10⁵ kg in mass) may also be produced in this way. A permanent mold, or die casting, is employed if large numbers of the object are required (particularly in alloys of low melting point, such as zinc-based alloys), and high dimensional accuracy can be achieved by these means. Cast iron is the cheapest metallic material. The microstructure of "gray cast iron" is shown in Fig. 1. It consists of flakes of graphite in a two-phase matrix of iron and iron carbide (Fe₃C). The brittleness of this material arises from the weakness of the graphite flakes, which act like cracks in the structure.

FIG. 1. The microstructure of gray cast iron: the black lines are flakes of graphite.

Forging. This entails shaping of the metal by rolling, pressing, hammering, etc., and may be carried out at high temperatures, when the metal is soft (hot-working), or at lower temperatures (cold-working), where deformation leads to progressive hardening of the metal (work-hardening). In contrast with casting, forgings usually exhibit differing physical and mechanical properties in different directions, due to the directional nature of the shaping operation.

Much modern research in physical metallurgy is concerned with investigating the plastic flow and work-hardening behavior of metals and alloys. Metal crystals yield plastically at stresses several orders of magnitude lower than the theoretical value for the deformation of perfect crystals. This discrepancy is accounted for by the presence of linear imperfections known as "dislocation lines" within the crystals. Plastic flow takes place in metal crystals by "slip" or "glide" in definite crystallographic directions on certain crystal planes, due to the movement of dislocation lines under the applied stress. Dislocations multiply and entangle as deformation proceeds, thus making further flow increasingly difficult (work-hardening), the density of dislocations rising from about 10⁵ mm⁻² in soft (annealed) metal to about 10¹⁰ mm⁻² in work-hardened material. These and other types of crystal defect (such as stacking faults, which are planar in geometry) can be studied by x-ray diffraction and also by means of the electron microscope and the field-ion microscope (q.v.).

Powder Metallurgy. This is a method of shaping by pressing finely powdered metal into an appropriately shaped die. The "green compact" thereby produced is of low strength and is subsequently heated in an inert atmosphere ("sintered"); the pressing and sintering may be repeated until strong, dense products are obtained. The technology was first developed for metals which were of too high a melting point for conventional casting and forging methods, and tungsten lamp filaments were first produced by this means. Other refractory metals and hard metal-cutting alloys may thus be shaped, and some magnetic and other special alloys are prepared in this way by suitable blending of powders, which avoids any contamination that may be associated with the melting process. The pressing and sintering conditions may be arranged to leave some residual porosity in the structure of, for example, bronze bearing alloys. The pores are filled with oil, thus producing the so-called oil-less bearings which can operate without further lubrication.

Joining. The three important methods of joining metals are riveting, soldering or brazing (in which metal components are joined by means of a layer of alloy of lower melting point), and welding (in which the metal itself is fused). Weldability is often the critical factor in the selection of an alloy for a given purpose, since the metallurgical changes produced by localized heating are often associated with the development of deleterious properties at, or adjacent to, the weld.

Alloy Constitution Phase equilibria in alloy systems are represented on phase diagrams, which show the temperature ranges of phase stability as a function of composition. An example of such a diagram is given in Fig. 2; such diagrams are experimentally established by, e.g., thermal analysis, dilatometry, and microscopical and x-ray diffraction methods. Phase diagrams can also be calculated by computation of the Gibbs free energy minimum, if the thermodynamic functions of the observable phases are determined by experiment. Diagrams such as that in Fig. 2 are invaluable in the interpretation of the structures of alloys observed under the microscope.

FIG. 2. The copper-tin phase diagram. (G. V. Raynor, "Institute of Metals Annotated Equilibrium Diagram," Series No. 2, 1944.)

The microstructure of an alloy (and hence its properties) will be determined not only by its composition, but also by its thermal and mechanical history. Of particular importance is the metallurgical control of the mechanical properties of an alloy by heat treatment, which affects the distribution of the phases present. Hardness, for example, will depend upon the state of deformation (i.e., the density of dislocations) and upon the composition of the alloy. Pure metal crystals can be hardened by other atoms in solid solution (solute hardening) as well as by finely dispersed particles of a hard second phase (precipitation, or dispersion, hardening), which are effective in impeding the motion of dislocations when the crystal is stressed. Fig. 3 is an electron micrograph showing dislocations on the slip plane of a copper alloy crystal, and it illustrates how the presence of hard particles has caused local entanglement of the dislocations. The relationship between the microstructure and properties of metals and alloys is of fundamental importance and is a field of intense scientific activity.

Although many common alloys were not developed scientifically, a considerable theory of alloys is developing, springing from empirical rules and principles (notably those due to W. Hume-Rothery) which have generalized the facts and enabled predictions to be made. The early theories of the metallic state, due to Drude and Lorentz, and later to Sommerfeld, were developed and discussed by N. F. Mott and H. Jones in their book "The Theory of the Properties of Metals and Alloys." A great increase in our knowledge of transition metals and alloys has taken place, and some signs of general principles have begun to appear, although there is yet little theoretical knowledge enabling one to calculate properties or structures of alloys from fundamental principles. The propensity to form deleterious phases in nickel-based materials has been related empirically to the average number of electron vacancies (Nv) in a given alloy. Computer calculations, using a system known as PHACOMP, have identified the critical value of Nv above which such phases form.

FIG. 3. A deformed copper alloy crystal containing hard particles. Electron micrograph showing interaction of dislocation lines with the second phase.

The Effect of Environment Upon the Behavior of Metals Low Temperature. Some metals and alloys exhibit a spectacular change in mechanical behavior with decrease in temperature. Many metals of body-centered cubic crystal symmetry (e.g., iron and mild steel) which are tough and ductile at ordinary temperatures become completely brittle at subzero temperatures, the actual transition temperature depending upon the metallurgical condition of the alloy, the state of stress, and the rate of deformation. Some metals of hexagonal symmetry (e.g., zinc) exhibit this effect, but metals of face-centered cubic symmetry (e.g., copper) remain ductile to the lowest temperatures. This transition in behavior is clearly of critical importance in the selection of materials for low-temperature application.

High Temperature. Apart from problems of oxidation (discussed below), metals tend to deform under constant stress at elevated temperatures (the deformation is known as "creep"), and creep-resistant alloys are designed to provide strength at high temperatures. These are essentially alloys in a state of high thermodynamic stability, usually containing finely dispersed particles of a hard second phase which impede the movement of dislocations. Grain boundaries may be a source of weakness at elevated temperatures, and turbine blades in the form of alloy single crystals have recently been employed in engines of advanced design.

Fatigue. Metals break under oscillating stresses whose maximum value is smaller than that required to cause rupture in a static test, although many ferrous alloys show a "fatigue limit," or stress below which such fracture never occurs, however great the number of cycles of application. The phenomenon is associated with the nucleation of submicroscopic surface cracks in the fatigued component early in its life, which initially grow very slowly. Eventually a crack grows until the effective cross section of the piece is reduced to such a value that the applied stress cannot be supported, and rapid failure occurs. A typical fatigue fracture surface is shown in Fig. 4, in which two distinct zones are apparent. These correspond to the period of slow growth (left-hand side) and final failure, respectively.

FIG. 4. A fatigue fracture surface upon a large steel shaft. (Courtesy of British Engine Insurance Ltd.)

Oxidation and Corrosion. With the exception of the "noble metals," which are intrinsically resistant to attack by the environment, metals in general owe their oxidation resistance, when they are heated in air, to the presence of impervious oxide films on their surfaces. Those which develop porous oxides (e.g., the refractory metals tungsten and molybdenum) oxidize very rapidly at high temperatures. Oxidation-resistant alloys are designed to maintain a protective film under these conditions.

Corrosion occurs under conditions of high humidity or immersion in aqueous media. The phenomenon can be interpreted electrochemically: local anodes form at the region of metal dissolution, and local cathodes form where the electrons are discharged. "Galvanic corrosion" is encountered where dissimilar metals are in electrical contact under these conditions. Of particular importance is the conjoint action of stress and corrosion, where "stress corrosion" or (under fluctuating stresses) "corrosion fatigue" cracking may be encountered in situations where no failure would occur under the action of the stress or the corrosive environment applied separately. Electrochemical principles are applied in the protection against corrosion.

Materials Technology The scientific principles which govern the behavior of metals are, of course, applicable to a wide range of other technologically important materials, such as polymers, ceramics, and glasses. In recent years many centers of metallurgical research, both in industry and in universities, have broadened their approach in this way and are often described as Departments of Materials Technology.

JOHN W. MARTIN

References

Metallurgical Data

The series of "Metals Handbooks," published by the American Society for Metals, Metals Park, Ohio.


Smithells, C. J., "Metals Reference Book," London, Thornton Butterworth Ltd., 1983.

Moffatt, W. G., "Handbook of Binary Phase Dia­grams," Schenectady, N.Y., The General Electric Co., 1981.

General Reading

Street, A., and Alexander, W.O., "Metals in the Ser­vice of Man," London, Pelican, 1973.

Martin, J. W., "Elementary Science of Metals," London, Wykeham Publications, 1969.

West, J. M., "Basic Corrosion and Oxidation," New York, Wiley, 1980.

Cross-references: CRYSTALLOGRAPHY, CRYSTAL STRUCTURE ANALYSIS, ELECTROCHEMISTRY, LATTICE DEFECTS, MECHANICAL PROPERTIES OF SOLIDS, SOLID-STATE PHYSICS.

METEOROLOGY

Meteorology is the study of the atmosphere. The word is derived from the classical Greek meteoros, meaning "things lifted up." Aristotle wrote extensively about these topics in the fourth century B.C. in his treatise "Meteorologica." Meteorology, as practiced today, is broadly interdisciplinary, drawing extensively from the fields of hydrodynamics, thermodynamics, optics, chemistry, and mathematics. It has applications in industry, agriculture, transportation, economics, resource management, and many other human activities.

Almost every branch of physics and chemistry is in some way involved in atmospheric phenomena. This article is confined to brief overviews of modern methods of weather prediction, climatology, cloud physics (including attempts at weather modification), and boundary-layer processes.

Atmospheric Circulation and Weather Prediction The basic state of the atmosphere is described by seven variables: three components of velocity, temperature, pressure, density, and the water-vapor concentration (usually called the mixing ratio). The changes of these variables are governed by seven well-known equations. The accelerations in the three principal directions are given by three differential equations based on Newton's second law. These were written in their hydrodynamic form by Euler about 200 years ago. Temperature changes are governed by the first law of thermodynamics. The other equations are the equation of state and the equations for the conservation of mass of the dry air and water vapor. The complete set of equations has been established for well over 100 years, but because of their complexity, their application to weather prediction has become a reality only during the last 25 years as modern computers have become available.


The numerical problems involved in solving these equations for the future values of weather elements are far from trivial. Essentially, the changes must be calculated for periods of a few minutes at a time, based on the global distribution of the elements derived from observation or calculated during earlier time steps, and moved forward step by step. This method was first studied by L. F. Richardson in the early 1920s, but had to be abandoned because of inadequate computational capability. The problems were successfully confronted by John von Neumann and his colleagues in Princeton about 1950. In 1955 the U.S. Weather Bureau began numerical weather prediction on a twice-daily basis. Today, numerical weather prediction is being carried out routinely from greatly improved global models at several national and international centers. The use of models has brought about a considerable increase of predictive skill (measured by the degree to which accuracy exceeds that attainable using climatological averages). The skill achieved for two- to three-day forecasts by computer models is about that formerly attainable one day in advance by subjective methods. Demonstrable skill has been achieved out to ten or twelve days, but five days is a more realistic limit in temperate latitudes on a routine basis.
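
The step-by-step integration described above can be illustrated with a deliberately minimal sketch; a one-dimensional advection equation stands in for the full set of seven equations, and the grid spacing, wind speed, and upwind differencing are illustrative assumptions rather than details taken from this article:

    import numpy as np

    # Toy "forecast": advect a temperature field along a 1-D periodic domain,
    # stepping forward a few minutes at a time, as global models do.
    nx, dx = 100, 100.0e3        # 100 grid points at 100-km spacing (assumed)
    u, dt = 10.0, 300.0          # 10 m/s wind, 5-minute time step (assumed)

    x = np.arange(nx) * dx
    T = 280.0 + 10.0 * np.exp(-((x - 5.0e6) / 1.0e6) ** 2)  # initial state

    for _ in range(288):         # 288 five-minute steps = one 24-hour forecast
        dTdx = (T - np.roll(T, 1)) / dx   # first-order upwind difference (u > 0)
        T = T - u * dt * dTdx             # forward-Euler time step

    print(round(T.max(), 2))     # the warm anomaly has moved downstream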

The use of circulation models in weather prediction requires upper-air observational input over at least the entire Northern Hemisphere (current models are now global in extent). The data are provided each 12 hours by several hundred soundings of pressure, temperature, humidity, and winds observed from balloons and telemetered to ground stations. The specifications for these data and their distribution are coordinated by the World Meteorological Organization (WMO). Data from several thousand surface stations and ships are also used.

Meteorological satellites have been employed extensively to supplement surface-based data. Geostationary satellites, hovering at a height of 35,000 km above a fixed point on the equator, provide quasi-hemispheric visible and infrared images every half hour. These are valuable for short-range prediction of thunderstorms, tornados, and other localized phenomena. They also provide important surveillance of tropical and extratropical cyclones in data-sparse areas, particularly in the southern hemisphere. Winds derived from satellite cloud-image motions have made important inputs to numerical prediction models. Also for this purpose, much effort has been devoted to developing methods of deriving temperature soundings remotely from polar-orbiting satellites. The inversion technique for reconstructing the temperature distribution from infrared fluxes observed at different wavelengths has been solved, but the technique can be used only in cloud-free regions.

Because of the global extent of the atmospheric circulation, much of the research is carried out in international programs. The most recent of these is the Global Atmospheric Research Program (GARP), planning and coordination of which is done jointly by the International Union of Geodesy and Geophysics (IUGG) and WMO. The program seeks to provide a better understanding of the large-scale atmosphere, to define better the data requirements of atmospheric prediction models, and to exploit the fullest observational capabilities of modern technology.

Climatology Climate may be defined as the statistical summary of past weather at a fixed place. It is most often thought of as a running average using a fixed time interval, the length of which may be as little as a month in some conceptions or indefinitely large in others. Usually a period of about thirty years is preferred in order to filter out annual and shorter-term fluctuations while still allowing longer-period variations to be studied. Strictly speaking, one cannot observe the climate of the present. However, the study of past climates and their changes provides a basis for projections into the present and the future, and such studies are receiving intensive effort and support at the present time.
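
A brief sketch of the running-average definition (the temperature series below is synthetic; only the thirty-year window follows the text):

    import numpy as np

    rng = np.random.default_rng(0)
    years = np.arange(1880, 2000)
    # Synthetic annual means: a slow trend plus year-to-year "weather" noise.
    temps = 14.0 + 0.003 * (years - years[0]) + rng.normal(0.0, 0.4, years.size)

    window = 30  # years; filters annual and shorter-term fluctuations
    climate = np.convolve(temps, np.ones(window) / window, mode="valid")

    # climate[i] is the average over years[i] through years[i + window - 1].
    print(len(climate), round(climate[0], 2), round(climate[-1], 2))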

Research in climatology may be subdivided into three main categories: (1) the reconstruction of past climates from weather observations and proxy data; (2) the understanding of the physical and mathematical bases of climate and its changes; and (3) assessment of the impact of climate on society. Contributions to the study of climate have been and are being made by scientists in a wide range of disciplines.

Instrumental records of weather go back for about two centuries at most. Therefore, the reconstruction of climates depends mainly on the analysis and interpretation of proxy data, such as tree rings, glacier ice cores, and ancient descriptions of the distribution of plants.

Geological evidence long ago indicated the occurrence of large variations of climate during the history of the earth. At least four major glaciations occurred in Europe and North America during the Pleistocene. Within the last ten years, a reasonably reliable chronology began to become available from isotope analyses of cores extracted from the deep ocean bottom. From these and other data is emerging an interesting pattern. The mean temperature of the earth declined significantly during Tertiary times, and there may have been as many as 17 major glaciations during the last 1.7 million years. A period of about 100,000 years between major episodes is quite conspicuous, and the spectrum over the last 400,000 years also shows significant peaks near 41,000 and 22,000 years. The last glaciation reached its climax about 16,000 years ago and was followed by a warm extreme about 6,000 years ago. Since then the mean temperature has gradually declined, with numerous short-term fluctuations on many time scales.

Rapid strides have been made during the last decade in understanding the physical and mathematical basis of climate. This understanding is greatly complicated by the large number of interactions that occur between different components of the atmosphere-ocean-earth system and by the response of such a complex system to perturbations imposed on it. As an example of such interactions or feedbacks, picture the effect of a temperature perturbation resulting possibly from some minor disturbance of the sun's atmosphere. An increase of surface temperature would increase evaporation from the ocean's surface, affecting in turn the mean cloudiness, snow cover, and probably the circulation pattern; these changes would in turn affect the distribution of absorption of solar radiation and thereby result in further changes of the temperature distribution. The complexity of the climate system has prompted the development of computer models capable of incorporating these interactions. Such models are currently severely limited by the insufficient understanding of the physics and chemistry of some of these interactions, and also by the difficulty of evaluating the covariances that appear when the atmospheric equations are averaged over long periods of time. One of the interesting suggestions emerging from mathematical and computational studies of climate models is the possibility that the system may spontaneously oscillate between two or more quite different states without any external forcing.

Many possibilities exist for both external and internal forcing of climate changes. Observations of solar energy output over the last 50 years have shown it to be remarkably steady; nevertheless, it is scientifically plausible that significant variations of this output rate may have occurred over longer time scales. It is also likely that the prevalence of suspended dust from volcanoes has varied widely. The impact of such dust on the terrestrial heat budget is not clear and may not have been great. It is becoming rather well accepted that on a time scale of tens of millions of years, continental displacements are effective causes of change; the major glaciations that occurred in India and Australia during Permian times, and probably also the Tertiary cooling, can be attributed to this cause. The presence of 22-, 41-, and 100-thousand-year periods in the Pleistocene chronology gives strong support to the astronomical theory proposed originally by Milankovitch. According to this theory, changes in the orbital elements of the earth caused by the Moon and planets give rise to variations of solar insolation with precisely these periods. The amplitude of these variations is rather small, and the mechanism by which they would produce the observed climatic response is not understood. However, on the basis of past relationships, it is possible to project that in the absence of anthropogenic influences, the mean temperature of the earth will decline over the next 60,000 years.


It is estimated that about half of the carbon dioxide given off from the burning of fossil fuels resides in the atmosphere and the rest is dissolved in the upper layers of the ocean. This atmospheric residue interacts with the infrared radiation in the atmosphere in such a way as to increase the surface temperature. At present the atmospheric content is increasing about 0.4% a year, and it is estimated that this increase, acting alone, would raise the mean temperature about 0.3 C over the next 50 years. The observational verification of carbon dioxide warming of the atmosphere has not been possible because of numerous other factors that cannot be controlled.
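
A sketch of how the quoted growth rate compounds (only the 0.4% per year and 50-year figures come from the text):

    # 0.4% per year, compounded over 50 years.
    rate, years = 0.004, 50
    growth = (1.0 + rate) ** years
    print(round(growth, 3))  # about 1.221, i.e. roughly a 22% rise in 50 years

    # The article's estimate is that this increase, acting alone, would raise
    # the mean surface temperature by about 0.3 C over the same period.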

Cloud Physics and Weather Modification The water-vapor saturation required for the formation of clouds and precipitation normally comes from upward motion and accompanying adiabatic cooling of the air. Condensation actually begins at humidities slightly less than 100% on small, suspended hygroscopic nuclei derived from such sources as sea spray and combustion products. Because of the strong curvature of the surface of the growing droplets, a slight supersaturation is required for continued growth. Freezing nuclei are rare, and cloud droplets normally remain in the liquid phase at subfreezing temperatures down to -40 C.

Cloud droplets range in diameter from 1 to 50 μm, depending mainly on the size of the original nucleus and the extent of coalescence with other droplets. An average raindrop contains about a million times the water mass of a single cloud droplet and cannot be formed in a reasonable time by the ordinary droplet growth mechanism. Two processes are believed to act naturally if the required circumstances are met. The first of these is coalescence of colliding droplets that fall at different speeds; such a process requires a broad spectrum of droplet sizes. The second mechanism is the Bergeron-Findeisen process, which requires the presence of a few ice crystals in a predominantly liquid cloud at subfreezing temperatures. The difference between the saturation vapor pressures over liquid and ice at low temperatures causes a rapid transfer of water from the liquid to the ice. Because the requirements for these processes are rather stringent, most clouds do not precipitate.
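
Since the water mass of a drop scales as the cube of its diameter, the factor of a million in mass quoted above corresponds to a factor of one hundred in diameter (a quick check, not in the original):

\frac{m_{\mathrm{raindrop}}}{m_{\mathrm{droplet}}} = \left(\frac{d_{\mathrm{raindrop}}}{d_{\mathrm{droplet}}}\right)^{3} = 10^{6} \quad\Longrightarrow\quad \frac{d_{\mathrm{raindrop}}}{d_{\mathrm{droplet}}} = 10^{2},

so, for example, a 20-μm cloud droplet corresponds to a 2-mm raindrop.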

In principle it should be possible to trigger the first process by artificially injecting into the cloud liquid water masses, which fragment immediately into drops of a wide range of sizes. Each of these should then grow by coalescence and fracture into a large number of new seeds, effecting a chain reaction that can ultimately produce beneficial amounts of precipitation. The second process has been stimulated in natural clouds by seeding with freezing nuclei such as silver iodide or with "dry ice," which produces ice crystals by the sudden chilling of droplets.

Both of these methods have been employed widely in rain-making operations and experiments. These techniques appear to have been effective in a few experiments, but the amounts of augmentation have usually been too small to demonstrate conclusively under statistically controlled conditions. In most (but probably not all) cases where there is sufficient liquid water in the cloud to produce a significant amount of precipitation and sufficient vertical motion to sustain it once it begins, the precipitation forms naturally. In a few conditions the artificially created latent heat of sublimation may provide enough additional buoyancy to trigger a large amount of additional cloud growth and precipitation.

Injection of silver iodide into cumulonimbus clouds has also been attempted for the purpose of preventing hail. By increasing the number of ice seeds that compete for the available cloud water, it is hoped that one can substitute for a small number of large hailstones a vastly increased number of small stones that melt before reaching the ground.

Boundary Layer Meteorology The atmospheric boundary layer comprises the lowest kilometer (more or less), where the air properties are directly influenced by interactions with the surface of the earth. The layer is intensively studied for several reasons. Nearly all of the heat and water vapor that generate kinetic energy and precipitation pass through this layer and are controlled by processes near the earth's surface. The greater part of the kinetic energy that is generated by storms is dissipated within this layer. Finally, pollutants enter the atmosphere near the surface, and they are transported by the winds in this layer, undergo chemical transformations with other species, and are diffused by turbulence in varying degrees.

Boundary layer processes must be included in numerical prediction models. Without the inclusion of friction, storms tend to overintensify. It has also been found that longer-period prediction of precipitation requires the inclusion of evaporation and the transfer of water vapor through the boundary layer into the free atmosphere.

Legislation has made it necessary to find and use working theories for relating ambient air quality to the industrial emissions that affect it. The pressure imposed by regulatory authorities and the requirements of legislation have been a strong stimulus for research to develop better theories for transport, modification, and diffusion in natural environments. The problems are compounded by the fact that many industrial operations are situated near coastlines or in complex terrain that defies the application of simple theories. Moreover, the chemical transformations involve a large number of possible reactions at very low concentrations, often involving heterogeneous phases. Of particular importance is the conversion of SO₂ to sulfate and the precipitation of the resulting acidic ions (along with products derived from oxides of nitrogen) in what is commonly called acid rain. Natural rain is slightly acidic due to the presence of carbon dioxide. Rain downstream of pollutant sources sometimes reaches pH values of 4.0 or less.

Dilution of pollutant concentrations by turbulent diffusion has been intensively studied for more than 50 years, both in the laboratory and in field experiments. Turbulence in the atmospheric environment is more complicated than in the laboratory, and the challenge of these complications has attracted the attention of a number of prominent hydrodynamicists. During the last 25 years the emphasis has been directed mainly at observing the distribution of mean wind and turbulence on towers within the lowest 300 meters above the surface. The main efforts are now shifting toward computer modeling of turbulence throughout the entire boundary layer.

Research and Publications Research in meteorology is carried out in many countries and results are published in a variety of specialized journals. A partial list of these includes the Journal of the Atmospheric Sciences, Journal of Climate and Applied Meteorology, Journal of Atmospheric and Terrestrial Physics, Quarterly Journal of the Royal Meteorological Society, Tellus, Izvestiya-Atmospheric and Oceanic Physics, Journal of Geophysical Research, and Atmospheric Environment. A complete list of publications in this field can be found in Meteorological and Geoastrophysical Abstracts, published by the American Meteorological Society.

ALFRED K. BLACKADAR

References

Anthes, R. A., Cahir, J. J., Fraser, A. B., and Panofsky, H. A., "The Atmosphere," 3rd Ed., Columbus, Ohio, Charles E. Merrill Publishing Company, 1981 (531 pp.).

Berger, A. (Ed.), "Climatic Variations and Variability: Facts and Theories," D. Reidel Publishing Co., 1981 (795 pp.).

Butcher, S. S., and Charlson, R. J., "An Introduction to Air Chemistry," New York, Academic Press, 1972 (241 pp.).

Byers, H. R., "Elements of Cloud Physics," Chicago, Univ. Chicago Press, 1965 (191 pp.).

Munn, R. E., "Descriptive Micrometeorology," New York, Academic Press, 1966 (245 pp.).

Pruppacher, H. R., and Klett, J. D., "Microphysics of Clouds and Precipitation," D. Reidel Publishing Co., 1978 (714 pp.).

Wallace, J. M., and Hobbs, P. V., "Atmospheric Sci­ence: An Introductory Survey," New York, Aca­demic Press, 1977 (467 pp.).

Cross-references: AIR GLOW, COMPUTERS, PLANETARY ATMOSPHERES, TELEMETRY, TEMPERATURE AND THERMOMETRY.


MICHELSON-MORLEY EXPERIMENT*

Introduction The revival and development of the wave theory of light at the beginning of the nineteenth century, principally through the contributions of Young and Fresnel, posed a problem which proved to be of major interest for physics throughout the entire century. The question concerned the nature of the medium in which light is propagated. This medium was called the "aether" and an enormous amount of experimental and theoretical work was expended in efforts to determine its properties. On the experimental side, a long series of electrical and optical investigations were carried out attempting to measure the motion of the earth through the ether medium. For many years, the experimental precision permitted measurements only to the first power of the ratio of the speed of the earth in its orbit to the speed of light (v/c ≈ 10⁻⁴), and these "first-order experiments" uniformly gave null results. It became the accepted view that the earth's motion through the ether could not be detected by laboratory experiments of this sensitivity. With the development of Maxwell's electromagnetic theory of light, and especially with its extensions by Lorentz in his electron theory, theoretical explanations for the null results obtained in the early ether drift experiments were provided. These results were in harmony with the Galilean-Newtonian principle of relativity in mechanics, which explains why the essential features of all uniform motions are independent of the frame of reference in which they are observed. In Maxwell's electromagnetic theory, however, the situation was altered when quantities of the second order in (v/c) were considered. According to the Maxwell theory, effects depending on (v/c)² should have been detectable in optical and electrical experiments. The presence of these "second-order effects" would indicate a preferred reference frame for the phenomena, in which the ether would be at rest. At first, this feature of Maxwell's theory implying observable ether drift effects of the second order in (v/c) raised a purely hypothetical question, since the accuracy needed for such experiments was a part in a hundred million, and no experimental techniques then known could attain this sensitivity.

Michelson pondered this problem and it led him to invent the Michelson interferometer, which was capable of measurements of the required sensitivity, and to plan the ether drift experiment which he carried to completion in collaboration with Edward W. Morley at Cleveland in 1887. This famous optical interference experiment was devised to measure the motion of the earth through the ether medium by means of an extremely sensitive comparison of the velocity of light traveling in two mutually perpendicular directions. The experiment, when completed in 1887, gave a most convincing null result and proved to be the culmination of the long nineteenth-century search for the ether. At that time, the definitive null result of the Michelson-Morley experiment was a most disconcerting finding for theoretical physics, and indeed for many years repetitions of this experiment and related ones were performed with the hope of finding positive experimental evidence for the earth's motion through the ether. These later experiments, however, have all been shown to be consistent with the original null result obtained by Michelson and Morley. In the years following 1887, their experiment led to extensive and revolutionary developments in theoretical physics, and proved to be a major incentive for the work of FitzGerald, Lorentz, Larmor, Poincaré, and others, leading finally in 1905 to the special theory of relativity of Albert Einstein.

*The author passed away on March 5, 1982, after preparing this article.

The optical paths in the Michelson-Morley interferometer are shown in plan in Fig. 1. Light from a is divided into two coherent beams at the half-reflecting, half-transmitting rear surface of the optical flat b. These two beams travel at 90° to each other and are multiply reflected by two systems of mirrors d-e and d₁-e₁. On returning to b, part of the light from e-d is reflected into the telescope at f, and light from e₁-d₁ is also transmitted to f. These two coherent beams of light produce interference fringes. These are formed in white light only when the optical paths in both arms are exactly equal, a condition produced by moving the mirror at e₁ by a micrometer. c is an optical compensating plate. The effective optical path length of each arm of the apparatus was increased to 1100 cm by the repeated reflections from the mirror system.

FIG. 1. Optical paths in the Michelson-Morley interferometer.

Figure 2 is a perspective drawing of the Michelson-Morley interferometer showing the optical system mounted on a 5-foot-square sandstone slab. The slab is supported on the annular wooden float, which in turn fitted into the annular cast-iron trough containing mercury, which floated the apparatus. On the outside of this tank can be seen some of the numbers 1 to 16 used to locate the position of the interferometer in azimuth. The trough was mounted on a brick pier which in turn was supported by a special concrete base. The height of the apparatus was such that the telescope was at eye level to permit convenient observation of the fringes when the instrument was rotating in the mercury. While observations were being made, the optical parts were covered with a wooden box to reduce air currents and temperature fluctuations.

This arrangement permitted the interferometer to be continuously rotated in the horizontal plane so that observations of the interference fringes could be made at all azimuths with respect to the earth's orbital velocity through space. When set in motion, the interferometer would rotate slowly (about once in 6 minutes) for hours at a time. No starting and stopping was necessary, and the motion was so slow that accurate readings of fringe positions could easily be made while the apparatus rotated.

The experiment to observe "the relative motion of the earth and the luminiferous ether," for which this instrument was devised, was planned by Michelson and Morley as follows. When the interferometer is oriented as in Fig. 3 with the arm L₁ parallel to the direction of the earth's velocity v in space, the time required for light to travel from M to M₁ and return to M in its new position is

t_1^{(1)} = \frac{L_1}{c - v} + \frac{L_1}{c + v} = \frac{2L_1}{c}\,\frac{1}{1 - \beta^2}, \qquad \beta = \frac{v}{c}.

The time for light to make the journey to and from the mirror M₂ in the other interferometer arm L₂ is

t_2^{(1)} = \frac{2L_2}{c}\,(1 + \tan^2\alpha)^{1/2}

and, since \tan^2\alpha = v^2/(c^2 - v^2),

t_2^{(1)} = \frac{2L_2}{c}\,\frac{1}{(1 - \beta^2)^{1/2}}.

When the interferometer is rotated through 90° in the horizontal plane so that the arm L₂ is parallel to v, the corresponding times are

t_2^{(2)} = \frac{2L_2}{c}\,\frac{1}{1 - \beta^2}, \qquad t_1^{(2)} = \frac{2L_1}{c}\,\frac{1}{(1 - \beta^2)^{1/2}}.

FIG. 2. Michelson-Morley interferometer used at Cleveland in 1887.



FIG. 3. The Michelson-Morley experiment.

Thus, the total phase shift (in time) between the two light beams expected on the ether theory for a rotation of the interferometer through 90° is,

$$\Delta t = \frac{2L_1}{c}\left[\frac{1}{1 - \beta^2} - \frac{1}{(1 - \beta^2)^{1/2}}\right] + \frac{2L_2}{c}\left[\frac{1}{1 - \beta^2} - \frac{1}{(1 - \beta^2)^{1/2}}\right] = \frac{2(L_1 + L_2)}{c}\left[\frac{1}{1 - \beta^2} - \frac{1}{(1 - \beta^2)^{1/2}}\right]$$

For equal interferometer arms, as used in this experiment, L1 = L2 = L, and, since β ≪ 1,

$$\Delta t \approx \frac{2L}{c}\,\beta^2$$

The observations give the positions of the fringes, rather than times, so the quantity of importance for the experiment is the change in optical path in the two arms of the interferometer.

$$\Delta = c\,\Delta t = 2L(v/c)^2$$

This is the quantity of second order in (v/c) referred to above.

With the Michelson-Morley interferometer, the magnitude of the expected shift of the white-light interference pattern was 0.4 of a fringe as the instrument was rotated through an angle of 90° in the horizontal plane. Michelson and Morley felt completely confident that fringe shifts of this order of magnitude could be determined with high precision.
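The expected shift is easy to reproduce numerically. The following is a minimal Python sketch; the 30 km/s orbital speed of the earth and the 550-nm mean wavelength for white light are assumed illustrative values, while the 1100-cm path length is quoted above.

c = 3.0e10           # speed of light, cm/s
v = 3.0e6            # earth's orbital speed, cm/s (assumed 30 km/s)
L = 1100.0           # effective optical path of each arm, cm (from the text)
wavelength = 5.5e-5  # assumed mean white-light wavelength, cm (550 nm)

delta = 2 * L * (v / c) ** 2   # change in optical path, cm
print(delta / wavelength)      # -> 0.4, the fringe shift quoted above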

In July of 1887, Michelson and Morley were able to make their definitive observations. The experiments which gave their final measure­ments were conducted at noon and during the


evening of the days of July 8, 9, 11, 12 of 1887. Instead of the expected shift of 0.4 of a fringe they found "that if there is any displacement due to the relative motion of the earth and the luminiferous ether, this cannot be much greater than 0.01 of the distance between the fringes."

The result of the Michelson-Morley experiment has always been accepted as definitive and formed an essential base for the long train of theoretical developments that finally culminated in the special theory of relativity. The first important suggestion advanced to explain the null result of Michelson and Morley was G. F. FitzGerald's hypothesis that the length of the interferometer is contracted in the direction of its motion through the ether by the exact amount necessary to compensate for the increased time needed by the light signal in its to-and-fro path. This contraction hypothesis was made quantitative by H. A. Lorentz in further development of his electron theory, in which he introduced the formalism which has since been known as the "Lorentz transformation" for the analysis of relative motions.

H. Poincaré also contributed greatly to both the philosophical and mathematical developments of the theory. As early as 1899, he asserted that the result of Michelson and Morley should be generalized to a doctrine that absolute motion is in principle not detectable by laboratory experiments of any kind. Poincaré further elaborated his ideas in 1900 and in 1904 and gave to his generalization the name "the principle of relativity." He also completed the theory of Lorentz, and it was he who named the essential transformation "the Lorentz transformation."

In 1905 Einstein published his famous paper on the "Electrodynamics of Moving Bodies" in which he developed the special theory of relativity from two postulates: (1) the principle of relativity was accepted as the impossibility of detecting uniform motion by laboratory experiments, and (2) the constancy of the speed of light was generalized to a postulate that light is always propagated in empty space with a velocity independent of the motion of the source. Both postulates have a close relationship to the Michelson-Morley experiment, which Einstein knew through his study of the work of Lorentz. Einstein's paper is generally considered as the definitive exposition of the special relativity principle, and the climax of the century-long developments which had begun with Young and Fresnel to explain the electrical and optical properties of moving bodies.

At all times the Michelson-Morley experiment continued to have great interest and was repeated many times throughout more than a half century. In 1904 Morley and Miller4 showed that the FitzGerald-Lorentz contraction is the same in several materials. All repetitions after 1887 failed to find the full expected "aether-drift," although Dayton C. Miller's trials on Mount Wilson (1921-1926) gave a small effect,


later shown to be due to temperature gradients.5

The most certain null result was that obtained by Joos6 using an interferometer built by Zeiss of Jena. Finally, experiments by Townes7 with very sensitive maser techniques gave definitive confirmation of Michelson and Morley's work.

In 1922, at the height of his fame for relativity, Einstein lectured widely in Japan. A reprint and discussion of these lectures has recently become available.8 In his lecture on December 14, 1922 at Kyoto University, Einstein referred several times to the interferometer experiment, stating that he "had thought about the result even in his student days." In 1950 Einstein told the writer1 that after 1905 he and Lorentz had discussed the Michelson-Morley experiment many times while he was working on the general theory of relativity. Today, both the experiment and the theories are among the prized achievements of physics.

R. S. SHANKLAND

References

1. Shankland, R. S., "Conversations with Albert Einstein," Amer. J. Phys., 31, 47 (1963); 41, 895 (1973); 43, 464 (1975).
2. Shankland, R. S., "Michelson-Morley Experiment," Amer. J. Phys., 32, 16 (1964).
3. Shankland, R. S., "Michelson-Morley Experiment," Sci. Amer., Nov. 1964, pp. 107-114.
4. Morley, E. W., and Miller, D. C., Phil. Mag. 9, 680 (1905).
5. Shankland, R. S., et al., Rev. Mod. Phys. 27, 167 (1955).
6. Joos, G., Ann. Physik 7, 385 (1930); Naturwiss. 38, 784 (1931).
7. Townes, C. H., Phys. Rev. Letters 1, 342 (1958).
8. Ogawa, T., Japanese Studies in the History of Science, Nov. 18, 1979.

Cross-references: INTERFERENCE AND INTERFEROMETRY; LIGHT; OPTICS, GEOMETRICAL; OPTICS, PHYSICAL; RELATIVITY.

MICROPARACRYSTALS

Most noncrystalline matter consists of microparacrystals (mPCs). They can be identified by X-ray, neutron, and electron diffraction patterns. The reflections of microparacrystals show a characteristic broadening (see, for example, Figs. 8, 9, and 11 of the article DIFFRACTION BY MATTER AND DIFFRACTION GRATINGS). Plotting, for instance, the integral width δβ of the reflections (h00) against h, we can calculate the number N of netplanes and the paracrystalline distortion g by means of the following theoretical relationships:1


$$\delta\beta = \frac{1}{\bar d}\left[\frac{1}{N} + (\pi g h)^2\right], \qquad g = \Delta l/\bar d, \qquad \Delta l = \left(\overline{d^2} - \bar d^{\,2}\right)^{1/2} \tag{1}$$

where d is the distance between atoms of neighboring netplanes orthogonal to their surfaces and Δl the variance of these distances d. Figure 1 shows a two-dimensional paracrystalline lattice, where 3% larger coins are mixed statistically with 97% smaller ones. At the center of the lattice there exist some crystalline-like domains. Near the boundaries the netplanes become more and more distorted until finally they are destroyed at some places. Similar phenomena exist in atomic dimensions. From statistical laws it is known that the distance variance ΔN between atoms of the first and (N + 1)th netplanes, orthogonal to the surface, increases with √N. When ΔN reaches 100a*% of d, the valence bonds between atoms or molecules within the Nth netplane are strongly overstrained, so that the netplane suffers a break. This leads to the so-called "a*-relation:"

$$\Delta_N = \sqrt{N}\,\Delta l = a^* d, \quad \text{hence} \quad \sqrt{N} = a^*/g, \qquad a^* = 0.15 \pm 0.03 \tag{2}$$
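As a numerical illustration of the a*-relation, Eq. (2), the following minimal Python sketch computes √N and N for a few g-values; the chosen values (about 1% for catalysts, a few percent for polymers, about 8% for melts, as discussed below) are illustrative.

a_star = 0.15  # a*-relation, Eq. (2): sqrt(N) = a*/g

for g in (0.01, 0.035, 0.08):  # ~1% (catalysts), 2-5% (polymers), ~8% (melts)
    sqrt_N = a_star / g
    print(f"g = {g:.3f}: sqrt(N) = {sqrt_N:.1f}, N = {sqrt_N ** 2:.0f} netplanes")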

In Fig. 2, √N is plotted against 1/g for numerous different colloids; a* always has values between 0.12 and 0.18. Another example is given by the steel-ball model (cf. the article DIFFRACTION BY MATTER AND DIFFRACTION GRATINGS, dotted line, Fig. 11). The a*-relation was first published in 1967.2 At that time its fundamental importance for the whole world of noncrystalline matter was not recognized. F. J. Balta-Calleja3 stimulated the publication of diagrams like Fig. 2. Nevertheless, the importance of the a*-relation is rarely understood nowadays. On the right-hand side of Fig. 2, for instance, an example is given of the ammonia contact catalysts. Their microparacrystals are, on account of relation (2), very small compared with crystals and therefore build up a large thermostable "inner" surface of some 100 m² per 1 cm³. This large surface is important in synthesizing ammonia (NH4OH) from hydrogen (H2), nitrogen (N2), and water (H2O) in a rational technical way:

3H2 + N2 + 2H2O = 2NH4OH.

In Figure 12 of the article on PARACRYSTALS it is shown that the microparacrystals of this catalyst consist of α-Fe lattices into which FeAl2O4 molecules are statistically embedded and destroy the crystalline order of the Fe atoms. Here g ≈ 1% and √N ≈ 15. At temperatures higher than 400°C some FeAl2O4 molecules slowly begin to emigrate; the g-value therefore becomes smaller, and √N reaches


FIG. 1. Two-dimensional model of a paracrystalline lattice with quadratic lattice cells. 97% smaller coins are mixed statistically with 3% larger ones.

[Fig. 2 plots √N against the reciprocal g-value, 1/g, for conventional crystals, PRD49, catalysts, polymer single crystals, bulk polymers, spiral paracrystals, graphite, melts, and gas.]

FIG. 2. The a*-law of colloidal systems. The relation of Eq. (2) gives the theoretical background for noncrystalline matter independent of its chemical composition.

values up to √N = 20. With higher annealing temperature the microparacrystals more nearly approach the crystalline state in the direction of the arrow until the catalyst loses its favorable properties. A new example of the existence of microparacrystals was found recently in the NiAl2O3 catalyst.4 Adjacent to the catalyst in Fig. 2 are plotted mPCs ("single crystals" and bulk material) in polymers. Their g-values lie between 2 and 5%. The example for "crystalline" polymers is given in Fig. 9 of the article on PARACRYSTALS, where some microparacrystals can be recognized. They are linked together by long molecules and work therefore as knots of a three-dimensional network. During strain the chains glide along each other, each moment building up new microparacrystals which, step by step, become smaller until finally every microparacrystal is disintegrated into 30 smaller ones [Fig. 3(b)]. The most disordered microparacrystals are to be found in melts (left corner of Fig. 2). There they have values g ≈ 8% and √N ≈ 2. Figure 4 shows as an example a microparacrystal of molten Fe or Pb which builds up a cubic face-centered lattice similar to that below the melting point. In contrast to the solid state, the atoms have a high mobility and can move over to other nuclei, building up icosahedra in some



FIG. 3. Microparacrystals in polymers are the knots of a three-dimensional network: (a) before and (b) after stretching eight-fold.

FIG. 4. Microparacrystals in molten iron, lead, and silver. The arrows indicate probable movements in tangential direction forming an icosahedron.

cases.5 Equation (2) is not only an empirically detected relation but the fundamental law of colloid and surface science. It manifests an equilibrium state of microparacrystals which is unknown in conventional solid state physics of condensed matter. For details see the article MICROPARACRYSTALS, EQUILIBRIUM STATE OF.

ROLF HOSEMANN

References

1. Hosemann, R., Ergeb. Ex. Nat. Wiss. 24, 142 (1951).
2. Hosemann, R., Lemm, K., Schonfeld, A., and Wilke, W., Kolloid-Z. u. Z. Polymere 216-217, 103 (1967).
3. Balta-Calleja, F. J., and Hosemann, R., J. Appl. Cryst. 13, 521 (1980).
4. Wright, C. J., Windsor, C. G., and Puxley, D. C., Nat. Phys. Div. A.E.R.E. Harwell, Oxfordshire, U.K., MPD/NBS/189, Feb. 1982.
5. Steffen, B., and Hosemann, R., Phys. Rev. B13, 3232 (1976).

Cross-references: DIFFRACTION BY MATTER AND DIFFRACTION GRATINGS; MICROPARACRYSTALS, EQUILIBRIUM STATE OF; PARACRYSTALS.

MICROPARACRYSTALS, EQUILIBRIUM STATE OF

The physical meaning of the a*-law is that the mean number N̄ of netplanes in an ensemble of microparacrystals (the so-called mPCs) depends only on the relative distance fluctuation g of atoms belonging to neighboring netplanes (see also the article MICROPARACRYSTALS). Here,

[Fig. 1 plots E(N/M̄) against N/M̄, falling from a plateau near E = 1 to a smooth decay.]

FIG. 1. The probability E(N) of further growth of a microparacrystal consisting of N netplanes.


FIG. 2. Frequency distribution K(N) of microparacrystals with N + 1 netplanes. Open circles and squares plot K(N) by computer calculation; solid lines drawn by Maxwell approximation. Crosses indicate distribution as directly observed for N̄ = 11 microparacrystals in DuPont PRD 49 fibers.

FIG. 3. High-precision transmission-electron-microscopic picture of DuPont PRD 49 fibers.

from the probability E(N), the fraction of microparacrystals with N − 1 netplanes that aggregate the next netplane (see Fig. 1), can be calculated by statistical mechanics. As in biology, where the expectation of surviving the next day decreases continuously day by day, here E(N) begins with a plateau E = 1 which suddenly changes to a smooth decay. The frequency K(N) of microparacrystals having N + 1 net-


planes is given by

$$K(N) = E(1)\,E(2)\,E(3)\cdots E(N-1)\,\bigl(1 - E(N)\bigr) \tag{1}$$

K(N) is plotted in Fig. 2 for N̄ = 11 (open circles) and N̄ = 62 (open squares) and can be approximated by Maxwellian functions.2 The crosses in Fig. 2 show the K(N) distribution directly observed3 from high-precision transmission-electron-microscopic diagrams, with N̄ = 11 (Fig. 3). This agreement proves directly that the equilibrium state of microparacrystals is of utmost importance for the understanding of noncrystalline condensed matter.4
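Equation (1) is straightforward to evaluate numerically. The Python sketch below assumes a simple model growth probability E(N), a plateau near 1 followed by a smooth decay; the logistic form and its parameters are illustrative assumptions, not the published function.

import math

def E(N, N_mean=11.0, width=2.0):
    # Assumed model growth probability: plateau at 1, then a smooth decay.
    return 1.0 / (1.0 + math.exp((N - N_mean) / width))

def K(N):
    # Eq. (1): K(N) = E(1) E(2) ... E(N-1) * (1 - E(N)).
    p = 1.0
    for i in range(1, N):
        p *= E(i)
    return p * (1.0 - E(N))

dist = {N: K(N) for N in range(1, 40)}
print(sum(dist.values()))       # ~1: K(N) is a normalized distribution
print(max(dist, key=dist.get))  # most probable number of netplanes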

ROLF HOSEMANN

References

1. Hosemann, R., Schmidt, W., Lange, A., and Hentschel, M., Colloid & Polymer Sci. 259, 1161 (1981).
2. Hosemann, R., Colloid & Polymer Sci. (in press).
3. Dobb, M. G., Hindeleh, A. M., Johnson, D. J., and Saville, B. P., Nature 253, 189 (1975).
4. Hosemann, R., Physica Scripta (in press).

Cross-references: COLLOIDS, THERMODYNAMICS OF; MICROPARACRYSTALS; PARACRYSTALS.

MICROWAVE SPECTROSCOPY

Microwaves are electromagnetic waves which range in length from about 30 cm to a fraction of a millimeter, or in frequency from 10⁹ to 0.5 × 10¹² cps. This corresponds to the rotational frequency range of a large class of molecules. Thus, microwave radiation passing through a gas can be absorbed when the rotating electric dipole moment of the molecule interacts with the electric vector of the radiation. Likewise, absorption can take place if the rotating magnetic moment of the molecule interacts with the magnetic vector of the radiation.

Most microwave spectroscopy is based on a study of transitions induced by interaction of the molecular electric dipole with the incident radiation.

A microwave spectrometer consists basically of a monochromatic microwave source (klystron), an absorption cell, and a detector. The absorption cell must transmit the microwave of interest and in the centimeter region may have cross-sectional dimensions of 1 × 4 cm and may be a few meters in length. Normally a metal strip is inserted along the length of the cell and is insulated from the cell. In this way, an auxiliary, spatially uniform electric field may be established in the absorption cell without affecting the microwaves. The Stark effect thereby produced splits the molecular energy levels into a series of levels and enables one to identify the transition.


The Hamiltonian for a rotating rigid asymmetric molecule, including possible fine and hyperfine structure terms, is given in Eq. (1). It is assumed, as is most commonly so, that the molecule is in a ¹Σ state, i.e., that there is no net electronic angular momentum and no net electron spin (singlet state). The Hamiltonian written is quite general, and in many cases not all of the terms shown in Eq. (1) need be included to account for spectra observed under normal resolution.

A brief description will be given of each term in order that one may most simply understand the kinds of interactions which may occur and which may be pertinent to an understanding of the spectra of rotating molecules.

$$H = H_R + H_{\rm dist} + H_S + H_{Ze} + H_Q + H_{Zi} + H_D \tag{1}$$

1. HR is the framework rotational kinetic energy and may be written

$$H_R = \frac{h^2}{8\pi^2}\left[\frac{J_a^2}{I_a} + \frac{J_b^2}{I_b} + \frac{J_c^2}{I_c}\right]$$

where J_a, J_b, and J_c are the components of the total angular momentum in units of ℏ referred to body-fixed principal axes, and I_a, I_b, and I_c are the moments of inertia about the respective principal axes. The Hamiltonian may be written

$$H_R = A J_a^2 + B J_b^2 + C J_c^2$$

or displaying the total angular momentum J

A, B, and C are rotational constants with A > B > C. In this form, units can be chosen so as to give energy levels directly in megacycles per second. H_R describes a rigid symmetric top if I_a is equal to I_b. For a diatomic molecule, I_a = I_b and I_c ≈ 0 (1/I_c becomes very large, and the rotational levels about the c axis are too far apart to be excited by microwaves). For the spherical rotor, I_a = I_b = I_c, but this implies no dipole moment and therefore no observable rotational spectrum.

Energy levels for the symmetric top may be determined by analytical methods,1 by factorization methods,2 or by using the commutation properties of the angular momenta.3 In Eulerian coordinates, the wave equation separates, and the wave function has the form

$$\psi_{JKM} = \Theta_{JKM}(\theta)\,e^{iM\phi}\,e^{iK\psi}$$

where Θ_JKM is the solution to the differential equation in the polar angle θ which results after separation of the simple terms in the azimuthal angle φ and the angle ψ which defines the direction of the line of nodes. The equation for Θ_JKM, with an appropriate change of variable, becomes the equation for the Jacobi polynomials.


The solution ψ is characterized by three quantum numbers: J, the total angular momentum; K, the component of angular momentum along the symmetry axis of the molecule; and M, the magnetic quantum number or projection of J along an arbitrary space axis. The energy does not depend on M in the absence of external electric or magnetic fields. The energy levels for the symmetric top have the form

$$E_{J,K} = BJ(J+1) + (C - B)K^2$$

The rotational constants A, B, and C may typically range from 2000 to 300,000 MHz for presently observable spectra. A hertz is one cycle per second. A megahertz (MHz) is one million cycles per second.
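A minimal Python sketch of how the level formula is used; the B and C values below are arbitrary illustrative numbers within the range just quoted.

B, C = 24000.0, 6000.0  # illustrative rotational constants, MHz

def energy(J, K):
    # Symmetric-top levels, E(J,K) = B*J(J+1) + (C-B)*K^2, in MHz.
    return B * J * (J + 1) + (C - B) * K ** 2

# A J -> J+1, K -> K absorption falls at E(J+1,K) - E(J,K) = 2B(J+1),
# independent of K for the rigid symmetric top.
for J in range(3):
    print(energy(J + 1, 0) - energy(J, 0))  # 48000, 96000, 144000 MHz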

Selection Rules In all but the accidentally symmetric top, the permanent dipole will be along the symmetry axis. In this case, the selection rules for absorption of radiation through rotation are:4

J → J ± 1, K → K

For a component of the dipole moment perpendicular to the symmetry axis, the selection rules are

ΔJ = ±1, 0 and ΔK = ±1

In both cases, ΔM = ±1, 0. The wave functions of the asymmetric top

are expressed as linear combinations of symmetric top functions. The energy remains diagonal in J but not in K. One must, therefore, arbitrarily label the energy levels and determine the selection rules. This will not be done here. In order to do this, however, one needs to know only the nonvanishing matrix elements of the three components of the dipole moment for the symmetric top given above.

The selection rule for diatomic molecules is simply J → J ± 1, M → M, M ± 1.

2. H_dist describes centrifugal stretching corrections to the energy levels, which for an asymmetric molecule can be quite complicated. Corrections for a symmetric top molecule are easily derived, and the framework energy in this case, including centrifugal stretching, is given by

$$E_{J,K} = BJ(J+1) + (C - B)K^2 - D_J J^2(J+1)^2 - D_{JK} J(J+1)K^2 - D_K K^4$$

For a nonrigid diatomic molecule or linear polyatomic molecule, K = 0, and the energy is given by:

$$E_J = B_v J(J+1) - D_v J^2(J+1)^2$$

and since J → J + 1 for absorption, the line frequencies are:

$$\nu = 2B_v(J+1) - 4D_v(J+1)^3$$


where Bv is the "effective" spectral constant, h(87r2 Iv), for the particular vibrational state for which the rotational spectrum is observed, and where Dv is the centrifugal stretching con­stant for that state.

In terms of the constants Be and De for the hypothetical vibrationless state, Bv and Dv for diatomic molecules are:

$$B_v = B_e - \alpha\left(v + \tfrac{1}{2}\right), \qquad D_v = D_e + \beta\left(v + \tfrac{1}{2}\right)$$

where α and β are interaction constants which are very small in comparison with B and D, respectively, and where v is the vibrational quantum number. For linear polyatomic molecules, there is more than one vibrational mode, and the above equations must be written in the more general form

$$B_v = B_e - \sum_i \alpha_i\left(v_i + \frac{d_i}{2}\right), \qquad D_v = D_e + \sum_i \beta_i\left(v_i + \frac{d_i}{2}\right)$$

where the summation is taken over all the fundamental modes of vibration. The subscript i refers to the ith mode, and d_i represents the degeneracy of that mode.

In analysis of spectra, the variations of the stretching constants D_J and D_JK with vibrational state are customarily neglected. It is seldom possible to obtain sufficient data for the evaluation of these effects upon B and for determination of B_e even for the simpler symmetric tops.

Centrifugal stretching constants may typically range from 8.5 to 0.002 MHz or less.

3. The third term Hs is the contribution to the Hamiltonian arising from the Stark effect and may be written as

$$H_S = \boldsymbol{\mu}_e \cdot \boldsymbol{\mathcal{E}}$$

where μ_e is the vector dipole moment and ℰ is the external electric field.

If the dipole moment lies along the "c" body-fixed principal axis, H_S has the form

$$H_S = \mu_e\left(\Lambda_{xc}\mathcal{E}_x + \Lambda_{yc}\mathcal{E}_y + \Lambda_{zc}\mathcal{E}_z\right)$$

where Λ_xc, Λ_yc, Λ_zc are the direction cosines of the c principal axis with respect to the space-fixed axes xyz, and ℰ_x, ℰ_y, and ℰ_z are the components of the electric field along the space-fixed axes. Additional terms will be added to this expression if the dipole moment has components along the remaining two principal axes. In order to obtain the contribution to the energy from this part of the Hamiltonian, one must evaluate matrix elements of the direction cosines with respect to symmetric top wave functions. Methods described in reference 5 enable one to do this.

In the case of a symmetric top, the dipole


moment will have a component only along the c axis, so that H_S consists of the single term μ_e Λ_zc ℰ_z when ℰ_x = ℰ_y = 0.

In this case, the energy associated with H_S is diagonal in all three quantum numbers JKM and has the form

$$E_{JKM} = \mu_e \mathcal{E}_z\,\frac{KM}{J(J+1)}$$

where M, the "magnetic quantum number," measures the component of J along lhz and can take the values M = J, J - I,' .. , - J. The selec­tion rules for Mare M -+ M; M -+ M ± 1, de­pending on the polarization of the microwaves.

For asymmetric molecules, μ_e will in general have components along the A and B axes as well as C. The A and B components give rise to matrix elements off-diagonal in J, K, and M. For a dipole moment of 1 debye and an electric field of 1 volt/cm, μ_e ℰ is 0.5 MHz.

4. Hze is the contribution to the Hamiltonian due to the interaction of the external magnetic field with the magnetic moment which is created by rotation of the molecule.

We also include the interaction of the external magnetic field with the dipole moments of individual nuclei. For a molecule with two nuclear spins, I1 and I2, this may be written

$$H_{Ze} = \sum_{j,k} \mu_n\,g(J)_{jk}\,J_j H_k + \mu_n g_1(\mathbf{I}_1 \cdot \mathbf{H}) + \mu_n g_2(\mathbf{I}_2 \cdot \mathbf{H})$$

where g(J) is in general a tensor, J_j are components of J along the axes to which H is referred, H_k are the components of the field H, usually referred to the space-fixed axes, and g1 and g2 are called the nuclear magnetic g factors. The interaction between J and H is of the same order as that between the nuclear spin and H. Therefore, we introduce the factor μ_n so that the g coefficients are of the order of unity. Thus for a field of one gauss, the quantity μ_n gH is 0.7 kHz.

For molecules with electronic angular momentum, μ_n in the first term is replaced by μ_0, the Bohr magneton, which is 1836 times larger than μ_n. Thus, in this case μ_0 gH is about 1.4 MHz for a field of one gauss. For a discussion of the problem of determining molecular g values, as well as magnetic susceptibility anisotropies and molecular quadrupole moments, see Ref. 5.

5. H_Q is the energy of interaction of the nuclear electric quadrupole with the gradient of the electric field produced by the electrons in the molecule at a nucleus with spin I. For a nucleus on the axis of a symmetric top, the quadrupole operator is ordinarily considered to be of the form:

$$H_Q = \frac{-eQq}{2I(2I-1)(2J-1)(2J+3)}\left\{1 - \frac{3K^2}{J(J+1)}\right\}\left[3(\mathbf{I} \cdot \mathbf{J})^2 + \tfrac{3}{2}(\mathbf{I} \cdot \mathbf{J}) - \mathbf{I}^2\mathbf{J}^2\right]$$


This operator yields only those matrix elements of the quadrupole interaction which are diagonal in J. The diagonal contributions are sufficient for most cases.

In the expression above, eQ is defined by

$$eQ = \left\langle I, I \left| \int \rho_n\left(3z_n^2 - r_n^2\right)d\tau_n \right| I, I \right\rangle$$

where ρ_n is the nuclear charge density at a distance r_n from the center of charge of the nucleus and dτ_n is the differential volume element for the nuclear volume; z_n is the position coordinate along the direction of the nuclear spin I. The matrix element considered is that for which M_I = I.

The quantity q is defined as

$$q = \left[\partial^2 V/\partial c^2\right]_{r_n = 0}$$

where V is the electrostatic potential due to the electronic cloud and other nuclei surrounding the nucleus, and c is the axis in the body-fixed system which is parallel to the symmetry axis of the molecule. The quantity eqQ varies from −1000 to 1000 MHz, although the intermediate values are more common.6 A tabulation of matrix elements for quadrupole interaction is given in Ref. 7. For a general discussion, see the book, Ref. 8.

6. H_Zi represents the interaction between the magnetic field caused by rotation of the charged particles which make up the molecule and the nuclear magnetic moments of the nuclei. For the case of two nuclei, this takes the form

$$H_{Zi} = \sum_{j,k} C(1)_{jk}\,J_j I_{1k} + \sum_{j,k} C(2)_{jk}\,J_j I_{2k}$$

C(1) and C(2) represent the internal magnetic moment tensors for the two nuclei. The C coefficients are of the order of 10⁻² MHz.9 J and I are pure numbers. This correction will therefore be unimportant for the large majority of molecules. For values of the coefficients as determined by molecular beam work, see Refs. 7 and 10.

7. HD is the dipole interaction between the two nuclei which may be written in the form

$$H_D = \frac{g_1 g_2 \mu_n^2}{R^3}\left[\mathbf{I}_1 \cdot \mathbf{I}_2 - \frac{3(\mathbf{I}_1 \cdot \mathbf{R})(\mathbf{I}_2 \cdot \mathbf{R})}{R^2}\right]$$

where g1 and g2 are the nuclear gyromagnetic ratios of the nuclei, μ_n is the nuclear magneton, and R is the distance between the two nuclei.

The operator which is usually used to represent this interaction is

$$\frac{3(\mathbf{I}_1 \cdot \mathbf{J})(\mathbf{I}_2 \cdot \mathbf{J}) + 3(\mathbf{I}_2 \cdot \mathbf{J})(\mathbf{I}_1 \cdot \mathbf{J}) - 2\,\mathbf{I}_1 \cdot \mathbf{I}_2\,\mathbf{J}^2}{(2J-1)(2J+3)}$$


This operator, like that given for the quadrupole interaction above, and usually quoted in the literature, will yield only those matrix elements which are diagonal in the quantum number J. The coefficient g1g2μ_n²/R³ may be of the order of a kilocycle. This correction is observed only in very rare cases.9

Matrix elements for all of the above-mentioned components of the Hamiltonian may be evaluated by the methods in reference 11. The matrix elements themselves are too lengthy to be tabulated here.

There is an additional interaction which, for completeness, should be mentioned. The nuclear spins may interact with one another through mutual coupling with the surrounding electron cloud. This gives a correction of the form C I1·I2. The coefficient C may be larger than that in the dipole-dipole interaction term. In TlI, C has the value of 6.57 kHz.7

The preceding discussion emphasizes the interpretation of microwave absorption spectra. One is thereby led to a knowledge of the structure of the molecule, the values of nuclear spins, and various coupling constants. For a short review which emphasizes the experimental aspects of microwave spectroscopy see Ref. 12.

Microwave spectroscopy is also an effective technique for determination of barrier heights associated with internal rotation.13,14,15 It may also yield the barrier to ring puckering and the barrier to inversion, the earliest example of the latter being the inversion of nitrogen through the plane of the three hydrogens in ammonia; see the Rudolph review.23

For a discussion of and references to microwave pressure broadening, line shape, and intensities see G. Birnbaum.16 Among other things, line width is related to the rate of energy transfer between molecules. The shape of a microwave line is broadened as pressure increases, and this can camouflage fine structure.

We have not discussed electron spin resonance (ESR), also called electron paramagnetic resonance.17 (See RESONANCE and MAGNETIC RESONANCE.) In this case transitions occur between energy levels created by unpaired electron spins in the presence of an external magnetic field. Absorptions of this type are observed in molecules, free atoms, radicals, and solids.

For a review of properties of high-temperature species as studied by microwave absorption spectroscopy see Ref. 18.

The role of transient effects in microwave spectroscopy is a new area of research. Transients may be achieved by switching the applied Stark field in a microwave spectrometer. The main usefulness of the technique appears to be in the determination of the rates of energy transfer between gas molecules. For details on this subject and many references to other reviews and papers, including the same when applied to infrared laser spectroscopy, see the review by R. H. Schwendeman.19

Microwave spectroscopy has been extended to the submillimeter infrared region; however,


this extension has been limited by the ability to generate harmonics and mix signals in point contact devices. With the advent of the laser, coherent sources have become available in the infrared region. This related subject has been reviewed by V. J. Corcoran.20

Although less common than the Stark effect technique, the Zeeman effect, that is, exposure of the molecules to very high magnetic fields (up to 30 kG), may also be used to obtain molecular information. The molecules investigated by this technique are in a "nonmagnetic" ground state. The Zeeman effect then is due to the very small molecular rotational magnetic moment which results from the rotation of the unequally distributed positive and negative charge. The main effect is first order in the magnetic quantum number M, but the slight nonlinear compression of the splitting pattern yields magnetic susceptibility anisotropies that permit calculation of the molecular quadrupole moment tensor. For the effect on the Hamiltonian and references to original papers see the review by H. D. Rudolph.23

For further details and discussion of topics omitted, the reader is referred to the book13

by W. Gordy and R. L. Cook. For other books see Refs. 6 and 21. For a review of microwave spectroscopy see the article by D. R. Lide, Ref. 22; also the reviews by H. D. Rudolph (1970), Morino and Hirota (1969), and Flygare (1967).23

The Hamiltonian required to explain the results of molecular beam electric resonance (MBER) experiments involving radio frequency transitions contains the same kind of interaction terms listed above. Therefore the literature on MBER may be referred to for further theoretical discussion. See, for example, English and Zorn, Ref. 22.

For a review of beam maser spectroscopy see Ref. 24.

For a comprehensive compilation of microwave spectra, including measured frequencies, assigned molecular species, assigned quantum numbers, and molecular constants determined from such data, the reader is referred to the multivolume work "Microwave Spectral Tables" prepared by personnel of the National Bureau of Standards.25

DONALD G. BURKHARD

References

1. Dennison, D. M., Phys. Rev., 28, 318 (1926); Reiche, F., and Rademacher, H., Z. Physik, 39, 444 (1926), and 41, 453 (1927).
2. Burkhard, D. G., J. Mol. Spectry., 2, 187 (1958); Shaffer, W. H., and Louck, J. D., J. Mol. Spectry., 3, 123 (1959).
3. Klein, O., Z. Physik, 58, 730 (1929).
4. Dennison, D. M., Rev. Mod. Phys., 3, 280 (1931).
5. Flygare, W. H., and Benson, R. C., Mol. Phys., 20, 225 (1971).
6. Townes, C. H., and Schawlow, A. L., "Microwave Spectroscopy," New York, McGraw-Hill, 1955.
7. Stephenson, D. A., Dickinson, J. T., and Zorn, J. C., J. Chem. Phys., 53(4), 1529 (1970).
8. Lucken, E. A. C., "Nuclear Quadrupole Coupling Constants," New York, Academic Press, 1969.
9. Thaddeus, P., Krisher, L. C., and Loubser, J. M. N., J. Chem. Phys., 40, 257 (1964).
10. English, T. C., and Zorn, J. C., J. Chem. Phys., 47(10), 3896 (1967).
11. Condon, E. U., and Odabasi, Halis, "Atomic Structure," Cambridge, U.K., Cambridge Univ. Press, 1980; Landau, L. D., and Lifshitz, E. M., "Quantum Mechanics," New York, Pergamon, 1977.
12. Strandberg, M. W. P., "Microwave Spectroscopy," in "McGraw-Hill Encyclopedia of Science and Technology," New York, McGraw-Hill, 1977.
13. Gordy, W., and Cook, R. L., "Microwave Molecular Spectra," New York, Wiley, 1970.
14. Lin, C. C., and Swalen, J. D., Rev. Mod. Phys. 31, 841 (1959).
15. Burkhard, D. G., J. Opt. Soc. Am. 50, 1214 (1960).
16. Birnbaum, G., "Intermolecular Forces," in "Advances in Chemical Physics," Vol. 12, New York, Wiley, 1967.
17. Squires, T. L., "An Introduction to Electron Spin Resonance," New York, Academic Press, 1963; Alger, R. S., "Electron Paramagnetic Resonance: Techniques and Applications," New York, Interscience, 1968; Carrington, A., Levy, D. H., and Miller, T. A., "Electron Resonance of Gaseous Diatomic Molecules," in "Advances in Chemical Physics," Vol. 18, New York, Interscience, 1970.
18. Lovas, F., and Lide, D. R., Adv. High Temp. Chem. 3 (1972).
19. Schwendeman, R. H., Ann. Rev. of Phys. Chem. 29 (1978).
20. Corcoran, V. J., Appl. Spectroscopy Revs. 7 (1974).
21. Guillory, W. A., "Introduction to Molecular Structure and Spectroscopy," Boston, Allyn and Bacon, 1977; Wollrab, J. E., "Rotational Spectra and Molecular Structure," New York, Academic Press, 1967; Sugden, T. M., and Kenney, C. N., "Microwave Spectroscopy of Gases," New York, Van Nostrand Reinhold, 1965; Ingram, D. J. E., "Spectroscopy at Radio and Microwave Frequencies," 2nd Ed., New York, Plenum, 1967; Hedvig, P., and Zentai, G., "Microwave Study of Chemical Structures and Reactions," CRC Press, 1969; Svidziniskii, K. V., "Soviet Maser Research," New York, Plenum Press, 1964.
22. English, T., and Zorn, J. C., "Molecular Beam Spectroscopy," and Lide, D. R., "Microwave Spectroscopy," in "Methods of Experimental Physics," Vol. 3, 2nd Ed., New York, Academic Press, 1972.
23. Ann. Rev. Phys. Chem. 21 (1970); 20 (1969); and 18 (1967), resp.
24. Laine, D. C., Repts. Prog. Phys. 33, 1001 (1970).
25. "Microwave Spectral Tables," Superintendent of Documents, U.S. Govt. Printing Office, Washington, D.C. 20402.

Cross-references: ATOMIC AND MOLECULAR BEAMS, MAGNETIC RESONANCE, MICROWAVE TRANSMISSION, SPECTROSCOPY, ZEEMAN AND STARK EFFECTS.

MICROWAVE TRANSMISSION

That portion of the electromagnetic spectrum adjacent to the far-infrared region is commonly referred to as the microwave region. It is bounded by wavelengths in the vicinity of 10 centimeters (10 cm) and 1 millimeter (1 mm). The longest wavelength of 10 cm corresponds to a frequency of 3 × 10⁹ cycles per second (abbreviated 3000 megahertz or 3 kilomegahertz or 3 gigahertz or 3 GHz). The shortest wavelength of 1 mm corresponds to a frequency of 3 × 10¹¹ cycles per second (abbreviated 300,000 megahertz or 300,000 MHz or 300 kilomegahertz or 300 gigahertz or 300 GHz).

The development of microwave transmission on a major scale was initiated in 1940 with the advent of the magnetron, an electronic generator of high-power microwaves. The magnetron spearheaded wartime radar at approximately 3 GHz and led to the utilization of waveguides for the efficient transmission of microwaves from the generator to the transmitting antenna and from the receiving antenna to the detector.

In essence, a waveguide is a hollow metal tube capable of propagating electromagnetic waves within its interior from its sending end to its receiving end. Unlike waves in space which usually propagate outward in all directions, waves in waveguides are fully confined while they propagate.

An electromagnetic wave is composed of an electric field and a magnetic field. In free space, these fields are always perpendicular to one another and to the direction of wave propagation at any instant in time. However, when a wave travels through a waveguide, the confinement forces one of the fields, but never both, to have a component that is parallel to the direction of wave propagation.

In a waveguide, there are a number of possible field configurations. Each configuration is known as an operating mode and is determined by the operating frequency or wavelength and the lateral dimensions of the waveguide. There are two fundamental classes of modes that may propagate in a waveguide. In one class, the electric field is everywhere perpendicular, or transverse, to the direction of propagation and the magnetic field has a longitudinal component. It is referred to as the transverse electric mode or TE mode. In the other class, the magnetic field is transverse and the electric field has a longitudinal component. This configuration leads to the transverse magnetic mode or TM mode.

Propagation of either the TE mode or the TM mode is limited by the cross-sectional dimensions of the guide. For example, in a rectangular waveguide, the longest wavelength that can be propagated is equivalent to twice its width. Therefore, to transmit a 3 GHz signal in a rectangular waveguide, the width of the guide must be at least 5 cm. At 300 MHz, this


width becomes 50 cm. Thus, waveguides are reasonable in size for the transmission of microwaves. Furthermore, when used with ferrites, thin metallic films, and magnets, waveguide components can be designed to function as isolators, circulators, modulators, discriminators, or attenuators.
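A back-of-the-envelope Python check of the cutoff rule just stated ("longest propagating wavelength equals twice the width"):

c = 3.0e8  # speed of light, m/s

# Rectangular waveguide: longest propagating wavelength = 2 * width,
# so the minimum width for a signal of frequency f is c / (2 * f).
for f_hz, label in ((3e9, "3 GHz"), (3e8, "300 MHz")):
    width_cm = c / (2 * f_hz) * 100
    print(f"{label}: minimum width = {width_cm:.0f} cm")
# -> 5 cm and 50 cm, matching the figures quoted above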

The propagation of electromagnetic waves in free space between transmitting and receiving antennas can be characterized in terms of ground waves, sky waves, and space waves. At microwave frequencies, ground waves attenuate completely within a few feet of travel, sky waves are influenced by the ionosphere and can penetrate through into outer space, and space waves travel through the atmosphere immediately above the surface of the earth. At microwave frequencies, space waves behave like light waves and travel in a direct line of sight. They follow many of the rules of optics. They can be reflected from smooth conducting surfaces and can be focused by reflectors or lenses.

If a space wave is radiated from a point antenna, the radiated energy spreads out like an ever-expanding sphere, and the amount of energy per square foot of wave front decreases inversely with the square of the distance from the antenna. The power that can be extracted from a wave front by a similar point antenna varies inversely with the square of the frequency. Thus, a point antenna receives power which is inversely proportional to both the square of the distance from the source and the square of the frequency. The ratio of the power received to the total power radiated is known as path attenuation.

When the receiving antenna is a parabola-shaped dish, the power extracted from the wave front is greatly increased. The ratio of the power received by such an antenna to the power received by a theoretical point antenna is defined as antenna gain. The gain of a parabolic antenna increases with the antenna area and the operating frequency. Thus, for a given microwave transmission with fixed-sized antennas, the path attenuation increases with frequency, the antenna gain increases with frequency, and the overall result is that one tends to offset the other.
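This offsetting trend can be made concrete with the standard free-space relations. The formulas below, path loss (4πd/λ)² and parabolic gain G ≈ η·4πA/λ², are textbook antenna results consistent with, but not stated in, this article; the distance, dish area, and efficiency are assumed illustrative values.

import math

c = 3.0e8  # m/s

def path_loss(d_m, f_hz):
    # Free-space path attenuation (4*pi*d/lambda)^2 between point antennas.
    lam = c / f_hz
    return (4 * math.pi * d_m / lam) ** 2

def dish_gain(area_m2, f_hz, efficiency=0.55):
    # Parabolic-antenna gain ~ efficiency * 4*pi*A / lambda^2.
    lam = c / f_hz
    return efficiency * 4 * math.pi * area_m2 / lam ** 2

# Doubling the frequency raises the path loss fourfold but also raises the
# gain of a fixed-size dish fourfold, so the ratio gain/loss is unchanged.
for f in (4e9, 8e9):
    print(f / 1e9, dish_gain(10.0, f) / path_loss(50e3, f))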

In radio broadcasting, the signal power radiates equally in all directions, and a receiving antenna picks up only a tiny fraction of the signal power. To overcome this low efficiency, the broadcast station must transmit a large amount of power. By contrast, a point-to-point microwave system radiates only a small amount of power, but it uses a directional transmitting antenna to concentrate power into a narrow beam directed toward the receiving antenna. Consequently, such systems are characterized by high efficiencies.

Because microwave transmission in free space follows essentially a straight line, reflectors are utilized to redirect a beam over or around an


obstruction. The simplest and most common reflector system consists of a parabolic antenna mounted at ground level which focuses a beam on a reflector mounted at the top of a tower. This reflector, inclined at 45°, redirects the beam horizontally to a distant site where a similar "periscope" reflector system may be used to reflect the beam down to another ground level. If two sites are separated by a mountain, it may be necessary to use a large, flat surface reflector referred to as a "billboard" reflector. In a typical system, a billboard reflector might be located at a turn in a valley, effectively bending the beam to follow the valley. Many arrangements are possible which, in effect, resemble huge mirror systems.

Microwaves are ideally suited for communication systems where a broad frequency bandwidth of the order of several megacycles is required for the rapid transmission of signals which contain a large amount of information, such as television signals. Most of the major cities of the United States are serviced by microwave television links so that they can receive television programs which originate from other cities. These systems can also accommodate thousands of telephone channels.

In 1960, experiments were initiated aiming toward communicating over transoceanic distances via microwaves by utilizing balloons as reflectors. Echo I and Echo II were attempts in this direction as passive satellites. The first active repeater satellite (Telstar I) was launched in 1962 and resulted in live telecasts between Europe and the United States in addition to teleprint and other signals. Telstar II added more data to accent the value of satellite communications.

In 1962, Congress authorized the formation of Communications Satellite Corporation (Comsat). In 1964, Comsat took the lead in forming the International Telecommunications Satellite Organization (Intelsat) to coordinate international developments in the use of satellites. Over 100 nations are now members of this extraordinary effort. Today, all transoceanic "live" TV broadcasts and two-thirds of all transoceanic telephone and telegraph communications are via Intelsat satellites. They operate in the 12/14 GHz band in addition to the 4/6 GHz band.

Microwaves are broadly used for radar navigation, and for the launching, guidance, and fusing of missiles. A typical defense project which uses microwave techniques is the DEW radar line which protects the United States from external enemy attacks.

The HAYSTACK facility, which has been in operation since 1966 at Millstone Hill in Massachusetts, is the first Western radar built for spacecraft tracking, space communications, and radar astronomy. Through radar astronomical techniques, the multipurpose HAYSTACK 120-foot paraboloid antenna reflector has greatly enhanced our knowledge of the galaxy and


solar system. Another famous facility is the 210-foot GOLDSTONE antenna at the NASA Deep Space Institute (California).

ANTHONY B. GIORDANO

References

Nichols, E. J., and Tear, J. D., "Joining the Infrared and Electric Wave Spectra," Astrophys. J. 61, 17-37 (1923).

Carter, S. P., and Solomon, L., "Modern Microwaves," Electronics (June 24, 1960).

Southworth, G. C., "Survey and History of the Progress of the Microwave Art," Proc. IRE (May 1962).

Wheeler, G. J., "Introduction to Microwaves," Engle­wood Cliffs, New Jersey, Prentice-Hall, Inc., 1963.

Evans, J. V., and Hagfors, T. (Ed.), "Radio Astron­omy," New York, McGraw-Hill, 1968.

Yeh, P., "Satellite Communications and Terrestrial Networks," Dedham, Massachusetts, Horizon House, Inc., 1977.

Topol, S., "Satellite Communications-History and Future," Microwave Journal (November 1978).

Cuccia, C. L., "Satellite Communications and the Information Decade," Microwave Journal (January 1982).

Cross-references: ANTENNAS, ELECTROMAGNETIC THEORY, MICROWAVE SPECTROSCOPY, PROPAGATION OF ELECTROMAGNETIC WAVES, RADAR.

MODULATION

Modulation is defined as the process, or the result of the process, whereby some characteristic of one wave is varied in accordance with some characteristic of another wave (ASA). Usually one of these waves is considered to be a carrier wave while the other is a modulating signal. The various types of modulation, such as amplitude, frequency, phase, pulse width, pulse time, and so on, are designated in accordance with the parameter of the carrier which is being varied.

Amplitude modulation (AM) is easily accomplished and widely used. Inspection of Fig. 1 shows that the voltage of the amplitude-modulated wave may be expressed by the following equation

$$v = V_c(1 + M \sin \omega_m t)\sin \omega_c t$$

where V_c is the peak carrier voltage, ω_c and ω_m are the radian frequencies of the carrier and modulating signals, respectively, and t is time in seconds. The modulation index M may have values from zero to one. When the trigonometric identity sin a sin b = ½ cos(a − b) − ½ cos(a + b) is used in the equation above, this equation

FIG. 1. Amplitude modulation: (a) carrier; (b) modulating signal; (c) amplitude-modulated carrier, whose envelope varies between (1 + M)V_c and (1 − M)V_c.

becomes

$$v = V_c \sin \omega_c t + \frac{MV_c}{2}\cos(\omega_c - \omega_m)t - \frac{MV_c}{2}\cos(\omega_c + \omega_m)t$$

This equation shows that new frequencies, called side frequencies or side bands, are generated by the amplitude modulation process. These new frequencies are the sum and difference of the carrier and modulating frequencies.
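The sideband structure can be verified numerically. In the following Python sketch the 100-kHz carrier, 5-kHz tone, and modulation index of 0.5 are arbitrary illustrative choices; NumPy's FFT is used to locate the spectral components.

import numpy as np

fc, fm, M = 100e3, 5e3, 0.5     # carrier, modulating frequency, mod. index
fs = 1e6                        # sample rate, Hz
t = np.arange(0, 0.01, 1 / fs)  # 10 ms of signal

v = (1 + M * np.sin(2 * np.pi * fm * t)) * np.sin(2 * np.pi * fc * t)

spectrum = np.abs(np.fft.rfft(v)) / len(t)
freqs = np.fft.rfftfreq(len(t), 1 / fs)
print(freqs[spectrum > 0.05])
# -> 95 kHz, 100 kHz, 105 kHz: the carrier plus the fc -/+ fm side frequencies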

Amplitude modulation is accomplished by mixing the carrier and modulating signals in a nonlinear device such as a vacuum tube or transistor amplifier operated in a nonlinear region of its characteristics. The nonlinear characteristic produces the new side-band frequencies. Frequency converters or translators and AM detectors are basically modulators. The various types of pulse modulation are actually special types of amplitude modulation.

A special type of amplitude modulation known as pulse modulation is commonly used in digital communication and other applications. In pulse modulation, the modulating signal abruptly changes the carrier amplitude from zero to some maximum amplitude V_m (or vice versa), as shown in Fig. 2(a).

Therefore, the modulation index is 1, or 100%, at all times. The side frequencies produced in the modulation process are determined by first using Fourier analysis to find the frequency


components of the rectangular modulating signal and then adding to the carrier frequency a pair of side frequencies for each of those components. The spacing of these components is 1/T, which is the fundamental frequency, and the amplitudes of these components vary as shown in Fig. 2(b). The envelope of this amplitude variation follows the familiar pattern of a (sin x)/x function. An infinite bandwidth would be required to either produce or reproduce a perfectly rectangular pulse, which of course is impossible to obtain. Bandwidths in the neighborhood of 2/t_d, where t_d is the pulse duration in seconds, are commonly used to transmit a double-sideband pulse-modulated signal.

Frequency modulation (FM) is illustrated by Fig. 3. The frequency variation, or deviation, is proportional to the amplitude of the modulating signal. The voltage equation for a frequency modulated wave follows.

$$v = V_c \sin(\omega_c t + M_f \sin \omega_m t)$$

The modulation index M_f is the ratio of maximum carrier frequency deviation to the modulating frequency. This ratio is known as the deviation ratio and may vary from zero to values of the order of 1000. FM requires a broader transmission bandwidth than AM but may have superior noise and interference rejection capabilities. A large value of modulation index provides excellent interference rejection capability but requires a comparatively large bandwidth. The approximate bandwidth requirement for a fre-

[Fig. 2(b): side frequencies generated, spaced 1/T about the carrier frequency f_c, with a (sin x)/x envelope whose nulls fall at f_c ± 1/t_d and f_c ± 2/t_d.]

FIG. 2. Pulse (amplitude) modulation: (a) pulse-modulated carrier; (b) side frequencies generated.


FIG. 3. Frequency modulation: (a) carrier; (b) modulating signal; (c) frequency-modulated carrier.

quency modulated wave may be obtained from the following relationship

Bandwidth = 2 × (modulating frequency) × (M_f + 1)

The noise and interference characteristics of FM transmission are normally considered satisfactory when the modulation index or deviation ratio is five or greater.
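Applying the bandwidth relationship above, with the usual broadcast-FM figures (75-kHz peak deviation and 15-kHz maximum audio frequency) assumed for illustration:

# Bandwidth = 2 * (modulating frequency) * (Mf + 1), with
# Mf = (maximum carrier deviation) / (modulating frequency).
deviation = 75e3  # Hz, assumed peak deviation (broadcast FM)
f_mod = 15e3      # Hz, highest modulating (audio) frequency

Mf = deviation / f_mod            # deviation ratio = 5
bandwidth = 2 * f_mod * (Mf + 1)  # = 180 kHz
print(Mf, bandwidth)
# Mf = 5 just meets the "five or greater" criterion noted above.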

Phase modulation is accomplished when the relative phase of the carrier is varied in accordance with the amplitude of the modulating signal. Since frequency is the time rate of change of phase, frequency modulation occurs when the phase-modulating technique is used, and vice versa. In fact, the equation given for a frequency-modulated wave is equally applicable for a phase-modulated wave. However, the phase-modulating technique results in a deviation ratio, or modulation index, which is independent of the modulating frequency, while the frequency-modulating technique results in a deviation ratio which is inversely proportional to the modulating frequency, assuming invariant modulating voltage amplitude in each case.

The phase-modulating technique can be used to produce frequency-modulated waves, providing the amplitude of the modulating voltage is inversely proportional to the modulating frequency. This inverse relationship can be obtained by including in the modulator a circuit which has a voltage transfer ratio inversely proportional to the frequency.

CHARLES L. ALLEY

References

Alley, C. L., and Atwood, K. W., "Electronic Engineering," Third Edition, New York, John Wiley & Sons, 1973.

Comer, David J., "Modern Electronic Circuit Design," Reading, Massachusetts, Addison Wesley, 1978.

DeFrance, J. J., "Communications Electronics Circuits," Second Edition, San Francisco, 1972.

Cross-references: MICROWAVE TRANSMISSION, PROPAGATION OF ELECTROMAGNETIC WAVES, PULSE GENERATION, RADAR, WAVE MOTION.

MOLE CONCEPT

The mole (derived from the Latin moles = heap or pile) is the chemist's measure of amount of pure substance. It is relevant to recognize that the familiar molecule is a diminutive (little mole). Formerly, the connotation of mole was a "gram molecular weight." Current usage tends more to use the term mole to mean an amount containing Avogadro's number of whatever units are being considered. Thus, we can have a mole of atoms, ions, radicals, electrons, or quanta. This usage makes unnecessary such terms as "gram-atom," "gram-formula weight," etc.

A definition of the term is: The mole is the amount of (pure) substance containing the same number of chemical units as there are atoms in exactly twelve grams of ¹²C. This definition involves the acceptance of two dictates: the scale of atomic masses and the magnitude of the gram. Both have been established by international agreement. Usage sometimes indicates a different mass unit, e.g., a "pound mole" or even a "ton mole"; substitution of "pound" or "ton" for "gram" in the above definition is implied.

All stoichiometry essentially is based on the evaluation of the number of moles of substance. The most common involves the measurement of mass. Thus 25.000 grams of H2O will contain 25.000/18.015 moles of H2O; 25.000 grams of sodium will contain 25.000/22.990 moles of Na (atomic and formula masses used to five significant figures). The convenient measurements on gases are pressure, volume, and temperature. Use of the ideal gas law constant R allows direct calculation of the number of moles, n = (P × V)/(R × T). T is the absolute temperature; R must be chosen in units appropriate for P, V, and T (e.g., R = 0.0820 liter atm mole⁻¹ deg K⁻¹). It may be noted that acceptance of Avogadro's principle (equal volumes of gases under identical conditions contain equal numbers of molecules) is inherent in this calculation. So too are the approximations of the ideal gas law. Refined calculations can be made by using more correct equations of state.
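The two calculations just described, restated numerically in a minimal Python sketch using the figures from the text; the 22.4-liter gas sample at 1 atm and 273.15 K is an added illustrative choice.

# Moles from mass: n = m / (molar mass), figures from the text.
print(25.000 / 18.015)  # moles of H2O in 25.000 g -> 1.3877
print(25.000 / 22.990)  # moles of Na in 25.000 g  -> 1.0874

# Moles of an ideal gas: n = P*V / (R*T).
R = 0.0820                     # liter atm mole^-1 deg K^-1, as given above
P, V, T = 1.0, 22.4, 273.15    # 1 atm, 22.4 liters, 273.15 K (illustrative)
print(P * V / (R * T))         # -> ~1.00 mole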

Many chemical reactions are most conveniently carried out or measured in solution (e.g., by titration). The usual concentration conven-


tion is the molar solution. (Some chemists prefer to use the equivalent term formal). A 1.0 molar solution is one which contains one mole of solute per liter of solution. Thus the number of moles of solute in a sample will be

n = Volume (liters) X Molarity (moles/liter)



The amount of chemical reaction occurring at an electrode during an electrolysis can be expressed in moles simply as n = q (coulombs)/zF, where z is the oxidation number (charge) of the ion and F is the faraday constant, 96,487.0 coulombs/mole. Thus the faraday can be considered to be the charge on a mole of electrons. This affords one of the most accurate methods of evaluating the Avogadro number (6.0220 × 10²³), since the value of the elementary charge is known with high precision.

Modern chemistry increasingly uses data at the atomic level for calculation at the molar level. Since the former often are expressed as quanta, appropriate conversion factors must involve the Avogadro number. Thus the einstein of energy is that associated with a mole of photons, or E = Nhν. Thus light of 2537 Å wavelength will represent energy of


$$E = \frac{6.02 \times 10^{23}\ \text{(photons/mole)} \times 6.62 \times 10^{-27}\ \text{(erg s)} \times 3.000 \times 10^{10}\ \text{(cm/s)}}{2.537 \times 10^{-5}\ \text{(cm)} \times 4.184 \times 10^{7}\ \text{(erg/cal)} \times 10^{3}\ \text{(cal/kcal)}} = 113\ \text{kcal/mole}$$

If the SI system of units is used

$$E = \frac{6.022 \times 10^{23}\ \text{mol}^{-1} \times 6.626 \times 10^{-34}\ \text{J s} \times 3.000 \times 10^{8}\ \text{m s}^{-1}}{2.537 \times 10^{-7}\ \text{m}} = 4.718 \times 10^{5}\ \text{J mol}^{-1}$$

Another convenient conversion factor is 1 eV/particle = 23.05 kcal/mole.
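The same conversion as a short Python calculation, reproducing the SI arithmetic above:

# Energy of one einstein (a mole of photons): E = N*h*c/lambda.
N = 6.022e23    # Avogadro number, mol^-1
h = 6.626e-34   # Planck constant, J s
c = 3.000e8     # speed of light, m/s
lam = 2.537e-7  # wavelength, m (2537 angstroms)

E = N * h * c / lam
print(E)         # -> ~4.72e5 J/mol
print(E / 4184)  # -> ~113 kcal/mole, agreeing with the CGS result above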

The chemist's use of formulas and equations always implies reactions of moles of material; thus HCl(g) stands for one mole of hydrogen chloride in the gaseous state. Thermodynamic quantities are symbolized by capital letters standing for molar quantities, e.g., C_V (heat capacity at constant volume in cal mole⁻¹ deg⁻¹), G (Gibbs function in cal/mole), etc. At times it is more convenient to convert an extensive property into an intensive expression. This is especially true in dealing with multicomponent systems. These are referred to as "partial molal quantities" and are given a symbol employing a bar over the letter. Thus the partial molal volume, V̄₁ = (∂V/∂n₁), is the rate of change of the total volume of a solution with the amount (number of moles) of component 1.

WILLIAM F. KIEFFER

References

Kieffer, W. F., "The Mole Concept in Chemistry," Ed. 2, New York, Van Nostrand Reinhold, 1973.
Lewis, G. N., and Randall, M., "Thermodynamics," Second edition, revised by Pitzer, K. S., and Brewer, L., New York, McGraw-Hill Book Co., 1961.

Cross-references: CHEMISTRY, ELECTROCHEMISTRY, GAS LAWS, MOLECULAR WEIGHT.

MOLECULAR BIOLOGY

Molecular biology, the study of biologically important molecules and their interactions, is the result of the progression of biology from the classical study of whole organisms to the more recent study of individual cells and their components. Its beginnings are usually dated from the announcement of the double-helical structure of DNA molecules made by Watson and Crick in 1953. During the early years of molecular biology, most attention was focused on bacteria and their viruses, since they were the most easily studied systems. For example, under appropriately controlled conditions, hundreds or thousands of liters of bacteria can be prepared in which every cell is essentially identical. Most of our fundamental knowledge

about the ways in which cells synthesize and use their macromolecules was originally derived from the study of bacterial systems.

At the present time, however, the trend is in the opposite direction. A concerted effort is underway to apply the models developed for molecular biologic processes to multicellular organisms. Such organisms present a real challenge to biologists, since most complex organisms contain more than one kind of cell (the cells have "differentiated"), and the interactions between these groups of cells within an organism are carefully controlled. Moreover, cells from multicellular organisms differ in fundamental ways from those of bacteria.

Examples of these differences can be seen by referring to Figs. 1A and 1B, which show transmission electron micrographs taken of thin sections of the two types of cells. Figure 1A is a bacterial cell which exhibits typical features such as a central cluster of DNA; basically featureless cytoplasm (the liquid portion of the cell) surrounded by a lipid bilayer (cell or unit membrane); and a rigid cell wall around the entire organism. This type of cell is considered


FIG. 1. Examples of biologic organization as seen in the transmission electron microscope. (A) A thin section of the prokaryote Bacillus sphaericus prepared by Dr. Elizabeth W. Davidson, Arizona State University. The length of the bar is 0.1 micrometer. (B) A thin section of a eukaryotic cell (from rat liver) prepared by Dr. Candice J. Coffin, Arizona State University. The length of the bar is 1.0 micrometer. (C) An intact bacterial virus, PBS1, negatively stained with potassium phosphotungstate by E. A. Birge. The length of the bar is 0.1 micrometer.

The labeled structures are: C, cytoplasm; CW, cell wall; E, endoplasmic reticulum; H, head; M, cell membrane; Mi, mitochondria; N, nucleus; Nd, nucleoid; T, tail; Tf, tail fiber.

ancestral, in an evolutionary sense, to the type of cell shown in Fig. 1B, and is designated as prokaryotic. Figure 1B shows a eukaryotic cell with its typical nucleus surrounded by a unit membrane; cytoplasm containing energy-producing mitochondria and the membranous structures known as endoplasmic reticulum; and a unit membrane surrounding the entire cell. Animal cells do not have cell walls, although plant cells do. This type of cellular organization is typical not only of plants and animals but also of the unicellular protozoa, fungi, and true algae.

Despite the differences noted above, there are basic similarities between prokaryotic and eukaryotic cells. The basic materials from which the cells are made are identical, as are many of the macromolecules within the cells. In both cases, the genetic material is deoxyribonucleic acid (DNA), which must be synthesized in a semiconservative manner (replicated) prior to each cell division. This process is facilitated by the double-stranded nature of the DNA molecule itself.

Both DNA and the related molecule ribonucleic acid (RNA) are polymers of nucleotides composed of certain nitrogenous bases (adenine, cytosine, guanine, thymine, and uracil; abbreviated A, C, G, T, and U, respectively) coupled to a pentose (either deoxyribose or ribose) and then to phosphate, as shown in Fig. 2. The polymeric chain is formed by alternating pentoses and phosphates with the nitrogenous bases projecting to one side. RNA is generally single-stranded, incorporating the bases A, C, G, and U, while DNA is generally a double-stranded molecule incorporating the bases A, C, G, and T. The structure of DNA molecules is somewhat variable and dependent upon such factors as the temperature, salt concentration, and base composition, but a typical helical structure for DNA is shown in Fig. 3. The structure is formed by pairing bases from the two strands according to the following rule: A pairs with T and G pairs with C. When RNA is involved, U is substituted for T. During the replication process mentioned above, this base pairing is utilized to spontaneously line up the


FIG. 2. Pairing of DNA strands to make a helix. In the diagram the backbones of the two DNA strands are shown along the right and left margins. The bases project into the space between and are held together by hydrogen bonds (dotted line). (Reproduced from Walter, W. G., McBee, R. H., and Temple, K. L., "Introduction to Microbiology," New York, D. Van Nostrand Company, 1973.)

precursor nucleotides so that a polymerase enzyme can join them together. Polymerases are extremely fast acting and may join as many as 250-1000 nucleotides per second under the appropriate conditions.

The genetic information for any cell is encoded in redundant form within its DNA base sequence due to the specificity of base pairing. This information is not, however, directly available for use. Instead an RNA copy of one of the two DNA strands (the "sense" strand) is made by an RNA polymerase following the usual base pairing rules in a process called transcription. Each coding region of the DNA is capable of producing specific RNA molecules whose functions are predetermined. Some are used as part of the subcellular structures called ribosomes and are designated rRNA. Others are used as highly specific carriers of the amino acids, the subunits of the polymers called proteins, and are designated transfer or tRNA. Still other molecules, the rarest class, contain the actual code which determines the sequence of amino acids used to construct a protein. These molecules are designated as messenger or mRNA molecules and tend to be unstable. They code for only one protein if isolated from a eukaryotic cell but may code linearly for as many as ten discrete proteins if isolated from a prokaryotic cell. To a first approximation, each region of the DNA coding for an rRNA, tRNA, or individual protein molecule represents a gene. Overlapping genes are rare but not unknown.
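
A small sketch of the base-pairing rules just described; the sequences and function names are illustrative only.

    # Base-pairing rules: A-T and G-C in DNA; U replaces T in RNA.
    DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
    RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}

    def replicate(strand):
        """Complementary DNA strand lined up, base by base, by pairing."""
        return "".join(DNA_COMPLEMENT[b] for b in strand)

    def transcribe(strand):
        """RNA strand paired, base by base, against the given DNA strand."""
        return "".join(RNA_COMPLEMENT[b] for b in strand)

    print(replicate("ATGGCA"))   # TACCGT
    print(transcribe("ATGGCA"))  # UACCGU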


FIG. 3. The winding of a DNA double helix. A less magnified view of a DNA molecule than shown in Fig. 2, this diagram shows the B form structure. There are ten base pairs per turn of the helix, and each pair is rotated 36° with respect to the preceding pair. (Reproduced from Walter et al.)


The next step in utilization of genetic information is the translation of the genetic code into the appropriate sequence of amino acids. Each triplet of bases (codon) codes for a specific amino acid or punctuation signal, and all possible triplets are meaningful. Therefore ribosomes must always attach to mRNA at specific sites to ensure the proper "reading frame." During translation, codons on the mRNA are matched to corresponding anticodons on the tRNA by the usual base pairing rules to assure delivery of the correct amino acid.
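
A sketch of triplet decoding in a fixed reading frame; only a handful of the 64 codon assignments are included, and the mRNA string is invented for illustration.

    # Translation: read codons (base triplets) in a fixed reading frame.
    CODON_TABLE = {
        "AUG": "Met",  # also the usual start signal
        "UUU": "Phe", "GGC": "Gly", "AAA": "Lys",
        "UAA": "stop", "UAG": "stop", "UGA": "stop",
    }

    def translate(mrna, frame=0):
        """Amino acids coded by an mRNA, starting at the given reading frame."""
        peptide = []
        for i in range(frame, len(mrna) - 2, 3):
            residue = CODON_TABLE.get(mrna[i:i + 3], "?")
            if residue == "stop":      # punctuation signal ends the protein
                break
            peptide.append(residue)
        return peptide

    print(translate("AUGUUUGGCAAAUAA"))  # ['Met', 'Phe', 'Gly', 'Lys']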

In prokaryotes, which have no compartmentalization within their cytoplasm, translation occurs as soon as the mRNA is formed. In eukaryotes, however, the DNA and initial RNA transcripts are found in the nucleus, while the translation machinery is found in the cytoplasm. As a result the RNA must be exported from the nucleus. Before this can happen, the RNA must be processed to remove certain noncoding or intervening sequences which are present in most eukaryotic genes. The processed mRNA is passed through pores in the nuclear membrane into the cytoplasm. In either type of cell, translation occurs in a processive fashion on a complex of a single mRNA molecule and multiple ribosomes called a polysome. Although prokaryotic and eukaryotic ribosomes are similar in function, they are somewhat different in structure, with the prokaryotic ribosomes being smaller than their eukaryotic counterparts.

Interestingly, certain organelles within eukaryotic cells, the chloroplasts (sites of photosynthesis) and the mitochondria, contain small DNA molecules coding for ribosomes of the prokaryotic type. It is now considered likely that at least chloroplasts are descended from primordial prokaryotic cells which colonized eukaryotic cells.

As protein molecules are produced, they fold spontaneously into three-dimensional configurations appropriate to their function. In the case of proteins intended for use outside the producing cell, they are generally produced in an inactive configuration which is altered by removal of a portion of the amino acid chain during transport through the cell membrane. As is true for most substances, movement of proteins across the cell membrane is an energy-requiring process.

All of the normal cellular processes mentioned above can be subverted by small obligate intracellular parasites called viruses. These entities represent the boundary between living and nonliving matter. When not in a cell, they have the appearance of complex crystals of protein and nucleic acid, as shown in Figure 1C. Viruses are generally rod-shaped or polyhedral structures consisting of a protein "coat" and a highly condensed nucleic acid molecule which may be RNA or DNA. Infection of a cell consists of movement of the nucleic acid across the cell membrane. In the case of prokaryotic cells, only the nucleic acid enters the cell.


For eukaryotic cells the entire virus enters, and then the protein coat is removed.

Once inside the cell, viruses follow one of two patterns. In a "lytic" infection, the viral nucleic acid immediately uses the host cell enzymes to produce more viral nucleic acid and the proteins necessary to encase it. Other virus-specific enzymes may be produced to facilitate this process by disrupting host cell functions not necessary for the production of viruses. New virus particles are produced not by division of a preexisting entity, as is the case with cells, but rather by a sequential assembly process which shows remarkable similarities to formation of chemical crystals from a supersaturated solution. In most cases, the host cell is destroyed or "lysed" during release of the viral particles. This observation has given rise to an alternative name for bacterial viruses: bacteriophages, or simply phages.

The second mode of viral infection is called a "temperate" infection. In this case the viral nucleic acid establishes a semipermanent relationship with the cell which preserves both virus and host cell. If the viral nucleic acid is RNA, it is converted to DNA. The DNA then sets up a stable association with the host DNA such that the host cell replicates the viral DNA at the same time as it replicates its own DNA. The viral DNA, instead of producing coat proteins and lytic factors, produces a protein repressor which acts to prevent the synthesis of mRNA molecules coding for the lytic functions. The quiescent viral DNA is now designated as a provirus or prophage, since the appropriate stimulus will cause it to revert to the lytic form and destroy its host cell. Not all viruses are capable of the temperate response, and those which are capable of it do not necessarily use it.

Proviral DNA has been observed to exist in two forms. It may integrate itself into the host cell DNA and become an actual physical part of the cell's genetic material. Alternatively, it may exist as a small independent circular DNA molecule which replicates side-by-side with the host DNA. The latter form of DNA is called a plasmid and is more commonly found in prokaryotes.

The term plasmid actually encompasses a much larger group of DNA molecules than just viruses. Nonviral plasmids have been observed in most bacteria and many of the simpler eukaryotes such as yeast. These extra pieces of DNA are considered dispensable to the host cell even though they may, under the appropriate conditions, increase the cell's chances for survival. Examples of this include plasmids which make bacterial cells resistant to certain antibiotics or which allow them to break down certain complex molecules such as xylene for food.

Recent discoveries in molecular biology are having a profound effect on our understanding of molecular genetics and on the way in which biologic problems can be solved.



FIG. 4. A diagrammatic representation of gene splicing. The double circle at the bottom left of the diagram represents a double-stranded plasmid DNA molecule, while the double-stranded donor DNA at the top may be from any source. The EcoRI enzyme is the prototypical restriction endonuclease which always leaves identical single-stranded ends on the cut DNA. As shown on the right-hand side of the diagram, the single-stranded regions may pair so as to reform the original molecules or to form new constructs. (Adapted from Birge, E. A., "Bacterial and Bacteriophage Genetics: An Introduction," New York, Springer-Verlag, 1981.)

It is now known that while most portions of DNA molecules are stable over long periods of time, in both prokaryotes and eukaryotes certain small regions within the DNA molecules are naturally highly unstable. These unstable regions, called transposons, consist of special terminal elements with a wide variety of coding elements, such as antibiotic resistance genes, located between them. The terminal elements have the ability to cause the entire transposon to simultaneously replicate itself and insert itself into a new position on the same or a different DNA molecule. Transposons thus represent "jumping genes" and provide a way to move bits of DNA around in the cell in a more or less random fashion.

The greatest impact on modern biology has been made by combining the information presented above with the new techniques which have been developed for producing artificial rearrangement of DNA molecules, techniques known as gene splicing. The procedures all depend upon the action of certain restriction endonucleases produced by various bacteria. These enzymes attack all "foreign" DNA, that is, molecules which have not been suitably modified by the addition of small substituents such as methyl groups at specific sites. They cleave the unmodified DNA at base sequence-specific sites to produce variably sized fragments. Since all DNA fragments produced by a given enzyme will have identical ends, it is a comparatively easy job to rejoin the fragments in new combinations and permutations in a manner such as shown in Fig. 4. When the spliced DNA is inserted into a cell and the cell is allowed to grow,

the result is a clone of cells all of which carry the particular DNA segment of interest.
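
A sketch of sequence-specific cleavage, assuming the commonly cited EcoRI recognition site GAATTC with the cut after the initial G (an assumption supplied here, not stated in the article); the input sequence is invented.

    # Restriction digestion sketch: cut wherever the recognition site occurs.
    # EcoRI is assumed to recognize GAATTC and cleave between G and A,
    # leaving identical single-stranded AATT ends on every fragment.
    def digest(dna, site="GAATTC", cut_offset=1):
        """Split a linear DNA string at each occurrence of the site."""
        fragments, start, i = [], 0, dna.find(site)
        while i != -1:
            fragments.append(dna[start:i + cut_offset])
            start = i + cut_offset
            i = dna.find(site, i + 1)
        fragments.append(dna[start:])
        return fragments

    print(digest("CCGAATTCATTGGAATTCTT"))
    # ['CCG', 'AATTCATTGG', 'AATTCTT']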

Application of these techniques has led to a true biologic revolution. It is now possible to splice purified DNA into the middle of plasmids or transposons and to insert the spliced DNA into living cells. This is genetic engineering in the fullest sense of the term. The results that have been obtained from the process have included such oddities as bacteria which produce human insulin or growth hormone, or animal cells which carry bacteriophage DNA. Constructs like these may some day permit us to understand precisely how cells regulate their internal processes as well as their interactions with neighboring cells. They certainly promise to revolutionize the study of biology.

As the preceding discussion indicates, a cell is constantly involved in many activities which require the movement of molecules. The extent of this feat becomes more apparent when a few size comparisons are made. A typical bacterial cell may be a rod approximately one micrometer in diameter and several micrometers in length. The DNA molecule of the same cell is a circular structure approximately one millimeter in length. In the case of a human cell, the diameter of the nucleus is about 10 micrometers and the 46 linear DNA molecules represent a length of about one meter. Clearly the DNA cannot exist as a random coil within the cell and still allow space for other activities.

Prokaryotic and eukaryotic cells have solved this problem in different ways. In prokaryotic cells the circular DNA molecule is formed into


FIG. 5. Supercoiling of DNA. The two DNA molecules have been spread onto a thin plastic film; stained with uranyl acetate; and then shadowed with platinum and palladium. The molecule on the left has retained its supercoils, while the one on the right has one or more broken phosphodiester bonds and has therefore lost all of its superhelicity. (Electron micrograph by E. A. Birge.)

a series of loops, each of which is supercoiled by a family of enzymes called topoisomerases. These enzymes break either one or two strands of the DNA helix and can pass other strands of DNA through the nicked region. Therefore they also have the interesting property of being able to tie and untie knots within the DNA molecule. An example of the difference between supercoiled and "relaxed" DNA molecules can be seen in Fig. 5. The highly supercoiled DNA molecule comprises the nucleoid seen in Fig. 1A.

Eukaryotic cells literally coil their linear DNA around cylinders of protein called histones. There are five histone proteins, four of which combine to make the cylinders and one of which covers the DNA that connects adjacent cylinders. The net effect is to take a single DNA molecule and coil it into a sort of "beads on a string" structure. Such a structure is called a chromosome and may exist in an extended state or in an even more condensed form during cell division.

Further information on packaging problems and many other topics discussed in this article can be obtained from the references.

EDWARD A. BIRGE

References

Adams, R. L. P., Burdon, R. H., Campbell, A. M., Leader, D. P., and Smellie, R. M. S., "The Biochemistry of the Nucleic Acids," 9th Ed., New York, Chapman and Hall, 1981.

Glover, D. M., "Genetic Engineering: Cloning DNA," New York, Chapman and Hall, 1980.

Primrose, S. B., Dimmock, N. J., "Introduction to Modern Virology," Second Edition, New York, John Wiley & Sons, 1980.


Wang, J. C., "DNA Topoisomerases," Scientific American 247(1), 94-109 (1982).

Watson, J. D., "Molecular Biology of the Gene," Menlo Park, California, W. A. Benjamin, Inc., 1976.

Cross-references: BIOMEDICAL INSTRUMENTATION, BIOPHYSICS, ELECTRON MICROSCOPE, MEDICAL PHYSICS, PHOTOSYNTHESIS.

MOLECULAR SPECTROSCOPY

Molecular spectroscopy encompasses the broad range of efforts to understand and utilize the interaction of gas-phase molecules with electromagnetic radiation. Spectroscopy is the basic tool for exploring the internal structure of molecules, and spectroscopic studies are of fundamental importance for understanding the microscopic properties of matter. Molecular spectroscopy is also of considerable practical value, since the spectrum of a molecule provides a characteristic "fingerprint" by which that molecule may be identified.

The spectrum of a molecule may be measured either by determining the wavelengths absorbed by the molecule (absorption spectroscopy) or by observing the wavelengths emitted by a sample of excited molecules (emission spectroscopy). In either case, the spectrum is far more complex than an atomic spectrum. Emission and absorption are found to occur from radio-frequency wavelengths through the infrared and visible regions of the spectrum and far into the ultraviolet. Under conditions of low resolution, the visible spectrum is observed to consist of numerous bands, hence the designation band spectra. Higher resolution demonstrates that each band is composed of numerous closely spaced lines. The microwave and infrared portions of the spectrum are much less congested and easier to analyze. Herzberg's three-volume work¹ on molecular spectroscopy is the major reference. The fact that Volumes I and II have been in print for over 30 years testifies to their enormous success. A more modern presentation is given by Steinfeld.²

The wavelengths present in a molecular spectrum are governed by the law of quantum physics which states that a photon of frequency f = (E₂ − E₁)/h is emitted or absorbed whenever the molecule undergoes a transition between energy levels E₁ and E₂. Here h is Planck's constant. Understanding the spectrum, then, is equivalent to understanding the energy levels of a molecule. All of the essential features of molecular spectra are present in diatomic molecules, to which the discussion below is restricted. The energy level structure of polyatomic molecules can be explained by extending the concepts developed for diatomic molecules. A recent compilation of data for diatomic molecules is given by Huber and Herzberg.³


It is customary in molecular spectroscopy to express frequencies in units of wave numbers (cm⁻¹). The wave number of a photon of frequency f is f/c, where c is the speed of light in vacuum. Since the frequency f and wavelength λ of an electromagnetic wave are related by λf = c, it is seen that the wave number is the reciprocal of the wavelength. Using f = (E₂ − E₁)/h, the photon's wave number is (E₂ − E₁)/hc. The values E₁/hc and E₂/hc, which also have units of cm⁻¹, are known as the term values of the energy levels E₁ and E₂. Spectroscopists find it convenient to refer to energy levels by their term values, since a transition between two levels involves a photon whose wave number is given simply by the difference in the term values of the levels.
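
These definitions translate directly into a short sketch; the level energies below are arbitrary illustrative values.

    # Wave numbers and term values (cm^-1), as defined above.
    C_CM = 2.998e10   # speed of light in cm/s
    H    = 6.626e-34  # Planck constant in J s

    def wavenumber_from_wavelength(wavelength_cm):
        """Wave number is the reciprocal of the wavelength."""
        return 1.0 / wavelength_cm

    def term_value(energy_joule):
        """Term value E/hc in cm^-1 for an energy level E."""
        return energy_joule / (H * C_CM)

    # A transition photon's wave number is the difference of term values:
    E1, E2 = 1.0e-19, 5.0e-19                  # illustrative level energies, J
    print(term_value(E2) - term_value(E1))     # about 2.0e4 cm^-1 (visible)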

The electrons in a molecule are much lighter and move much more rapidly than do the nuclei. Consequently, it is possible, to a high level of accuracy, to separate the problems of electronic motion and nuclear motion. This procedure is called the Born-Oppenheimer approximation. Each electronic state is characterized by the value of its electronic angular momentum projected onto the internuclear axis. This component of the electronic angular momentum is conserved by virtue of the fact that a diatomic molecule is symmetric for rotations about the internuclear axis. In analogy with the atomic physics notation of S, P, D, F, ..., an electronic state is labeled Σ, Π, Δ, Φ, ... if its projected angular momentum, in units of h/2π, is 0, 1, 2, 3, .... The energy of an electronic state depends not only upon the electron configuration but also upon the internuclear separation R. Theorists are still challenged by the difficult problem of calculating accurate electronic energies. The transitions between different electronic states are responsible for visible and ultraviolet molecular spectra since, as in atoms, energy differences between the states are generally several electron volts. The appearance of bands rather than distinct lines is a consequence of the nuclear motion.

A physical model for the nuclear motion is obtained by considering the molecule to be a dumbbell which can vibrate along the internuclear axis as well as rotate end-over-end. The vibrational energy G and the rotational energy F must be added to the electronic energy Tₑ to give the total molecular energy: T = Tₑ + G + F. Nuclear vibration occurs in the potential well formed by the negative electronic binding energy, which is a function of R, and the positive energy due to Coulomb repulsion of the nuclear cores. The potential reaches a minimum at a particular value rₑ of the internuclear separation. This is the equilibrium internuclear distance, and the nuclear separation oscillates about this equilibrium value. For small displacements from equilibrium, the vibrational motion can be approximated by that of a simple harmonic oscillator. The classical oscillation frequency for nuclei of mass M₁ and M₂ is given


by

ν_osc = (1/2π)√(k/μ)

where k is the force constant and μ = M₁M₂/(M₁ + M₂) is the reduced mass. Solution of the Schrödinger wave equation for the quantum harmonic oscillator leads to energy levels given by

E_vib(v) = hν_osc(v + ½)

where v = 0, 1, 2, 3, ... is the vibrational quantum number. Transforming to term values, by dividing by hc, the vibrational energy levels of a harmonic oscillator molecule are

G(v) = ω(v + ½)

where ω is the vibrational frequency expressed in cm⁻¹.

Although the harmonic oscillator approximation displays the essential features of the vibrational motion, the actual potential in which the nuclei vibrate deviates rather sharply from a harmonic potential. A more complete expression for the vibrational energy can be developed as a power series in (v + ½) and is given by

G(v) = ωₑ(v + ½) − ωₑxₑ(v + ½)² + ωₑyₑ(v + ½)³ + ⋯

Here the subscript e refers to the equilibrium position, and the coefficients of each higher-order term (ωₑxₑ, ωₑyₑ, ...) become successively smaller. It is rarely necessary to go beyond the cubic term when analyzing experimental data. The energy difference between adjacent vibrational states is typically 0.1 electron volt, a factor of 100 less than the energy difference between electronic states.
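
A sketch of the truncated series; the constants below are rough values for CO supplied here only for illustration (they are assumptions, not taken from this article).

    # Vibrational term values G(v) with the first anharmonic correction.
    # Approximate CO constants (an assumption): w_e ~ 2170 cm^-1,
    # w_e x_e ~ 13.3 cm^-1.
    W_E, W_E_X_E = 2170.0, 13.3   # cm^-1

    def G(v):
        return W_E * (v + 0.5) - W_E_X_E * (v + 0.5) ** 2

    # The spacing between adjacent levels shrinks as v grows:
    for v in range(4):
        print(v, round(G(v + 1) - G(v), 1))   # 2143.4, 2116.8, 2090.2, ...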

A first approximation for the end-over-end rotational motion is to consider the molecule to be a rigid rotor. Quantizing the angular momentum leads to energy levels which are given by

E_rot(J) = h²J(J + 1)/(8π²I)

where I is the molecule's moment of inertia and J = 0, 1, 2, 3, ... is the rotational angular momentum quantum number. Expressing these as term values, the rotational energy levels of a molecule are

F(J) = BJ(J + 1)

where

B = h/(8π²cI)

is called the rotational constant.
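
A sketch evaluating B for a rigid rotor; the CO masses and bond length are approximate values assumed here for illustration.

    # Rotational constant B = h / (8 pi^2 c I) for a rigid diatomic rotor.
    import math

    H, C_CM, AMU = 6.626e-34, 2.998e10, 1.6605e-27   # J s, cm/s, kg

    def rotational_constant(m1_amu, m2_amu, r_m):
        mu = (m1_amu * m2_amu) / (m1_amu + m2_amu) * AMU  # reduced mass, kg
        I = mu * r_m ** 2                                 # moment of inertia
        return H / (8 * math.pi ** 2 * C_CM * I)          # cm^-1

    # CO, assuming masses ~12 and ~16 amu and a bond length ~1.13e-10 m:
    print(rotational_constant(12.0, 16.0, 1.128e-10))     # about 1.93 cm^-1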


Again, the actual nuclear motion is more complex than this simple model. Centrifugal distortion of the molecule has the effect of introducing a term proportional to J²(J + 1)². In addition, the rotational constant depends slightly upon the vibrational quantum number, since vibration changes the average moment of inertia. These considerations lead to a more general formula for the rotational energy, namely,

Fᵥ(J) = BᵥJ(J + 1) − DᵥJ²(J + 1)²

where the subscripts v indicate a dependence upon the vibrational quantum number. The separation between adjacent rotational energy levels is typically 0.001 electron volt.

Division of the molecular energy into electronic, vibrational, and rotational energies has been quite successful for understanding the primary features of molecular spectra. At the very highest levels of resolution, however, it is observed that each line in a band splits into several very closely spaced lines. This fine structure is a result of interactions, or couplings, between the various types of motion which, until now, have been considered separately. A typical example, known as Λ-doubling, is a coupling between the molecule's electronic and rotational motions. The existence of fine structure emphasizes the limitations of the Born-Oppenheimer approximation. After the various couplings are included, all known aspects of molecular spectra can be understood.

Each spectral line is a consequence of the absorption or emission of photons which occurs when molecules undergo transitions between two energy levels. Comparison of an observed spectrum with theoretical energy levels requires knowing which transitions are allowed and which are forbidden. Information of this sort is codified into selection rules. Most selection rules can be understood by considering the possible symmetries of a molecule.

Visible and ultraviolet spectra result from transitions between two different electronic states. The primary selection rule specifies that the molecule's angular momentum quantum number J can change only by ΔJ = ±1 or 0. Three groups of lines appear, each associated with a particular value of ΔJ, which are called P, Q, and R branches. Changes in the vibrational quantum number are not restricted by any selection rules. However, the Franck-Condon principle, which states that the internuclear separation cannot change during the emission or absorption of a photon, makes transitions between some pairs of vibrational levels far more likely than between other pairs. A band is formed from the combined P, Q, and R branches associated with a transition between a particular vibrational level of the upper electronic state and a particular vibrational level of the lower electronic state.


Thirty or more rotational levels in the initial state of a room-temperature gas can be populated, and each is allowed by the ΔJ selection rule to undergo a transition to three levels in the final state. Hence each band in a spectrum is comprised of 100 or more distinct lines. It is not surprising, then, that the analysis of molecular spectra made little progress before the advent of quantum physics.

Spectral lines in the infrared occur when a molecule undergoes a transition between two different vibrational levels within the same electronic state. The ΔJ = ±1 or 0 selection rule still holds, so P, Q, and R branches can again be identified (although the ΔJ = 0 Q-branch transitions are forbidden in Σ electronic states). If the vibrational motion were exactly that of a harmonic oscillator, a selection rule Δv = ±1 would apply to the vibrational quantum number. This rule is not rigorous since, as was noted, the harmonic oscillator model is not perfect. Nevertheless, Δv = ±1 transitions are usually the strongest.

Transitions between the rotational levels of a given vibrational level are characterized by frequencies in the far infrared and microwave regions of the spectrum. Only ΔJ = ±1 is possible in this case, so only a P and an R branch appear. Rotational spectra appear very simple and regular when compared to visible band spectra. Transitions between two rotational levels can often be detected by the absorption of microwaves. The high accuracy with which microwave frequencies can be measured allows rigorous tests of the theory of molecular structure. This aspect of molecular spectroscopy is discussed in a well-known text by Townes and Schawlow.⁴

Another important feature of a molecular spectrum is the intensities of the lines. Even casual observation of a spectrum reveals that some lines are quite intense while others are barely perceptible. This can be understood by noting that not all of the possible transitions from a particular initial state are equally probable. Each molecular transition is characterized, then, not only by a wavelength but also by a transition probability. The larger the transition probability, the more intense the spectral line. In emission spectroscopy, where the initial state is an excited state, the transition probabilities determine the average amount of time that a molecule spends in the excited state before emitting a photon. This quantity, known as the lifetime of the excited state, is typically 1 × 10⁻⁷ second. The measurement of excited state lifetimes is an important part of molecular spectroscopy.

Energy levels in polyatomic molecules may also be understood by a consideration of electronic, vibrational, and rotational motions. Instead of a single mode of vibration, an N-atom polyatomic molecule will have 3N − 6 possible modes of vibration (3N − 5 for a linear molecule). Rotation in a polyatomic molecule is possible about three separate axes, although


a useful simplification occurs for symmetric-top molecules, where two of the axes become equivalent. As a consequence of the additional vibrational and rotational motions, polyatomic molecules have many more energy levels than diatomic molecules. The spectrum which is actually observed is again dependent upon selection rules which govern the allowed transitions between energy levels. Even moderate-size molecules (3 or 4 atoms) exhibit spectra which are exceedingly complex, and full rotational resolution is, in general, obtainable only at the highest possible resolution. Spectra of larger molecules are usually analyzed only in terms of electronic and vibrational motions, rotational analysis not being possible. The study of polyatomic molecular spectra is aided considerably by the use of group theory.

Experimental molecular spectroscopy historically has been pursued with the use of grating spectrometers. While other techniques are now available, grating spectrometers continue to play a major role in contemporary research, especially for investigations of ultraviolet spectra. The wide spectral range of spectrometers, from the far infrared into the vacuum ultraviolet, makes them exceptionally versatile tools. Instruments range in size from portable table-top models, with resolution of 1 part in 10³, to room-size giants, with resolution of 1 part in 10⁶. Spectrometers designed for use in the ultraviolet or far infrared are evacuated, to prevent atmospheric absorption, and utilize special optical materials.

Both absorption and emission spectra can be measured with a grating instrument. For absorption, light from a continuum source is passed through an absorption cell (sometimes the spectrometer itself) and then dispersed. Excitation spectra are obtained by dispersing the light from a discharge which contains the molecule of interest.

Photographic plates have long been the traditional means of detecting the dispersed light, but they are rapidly being replaced by photoelectric detectors, which are more sensitive as well as more compatible with computers.

Fourier transform spectroscopy⁵ represents an alternative to grating spectroscopy. The basis for this technique extends back to Michelson, who noted that the output of an interferometer, as a function of time, is the Fourier transform of the light source spectrum, as a function of frequency. Practical realization, however, has awaited the appearance of high-speed, inexpensive computers. In a typical experiment, light from a continuum source passes through an absorption cell, containing the molecule of interest, and then through a Michelson interferometer. The interferometer's output intensity, as measured by an appropriate detector, is digitized as the mirror moves and then transferred to a computer, which calculates the Fourier transform of the data to produce the spectrum.
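
A minimal sketch of the computation, assuming an idealized noise-free interferogram synthesized from two lines; a real instrument would record the detector signal instead.

    # Fourier transform spectroscopy sketch: the spectrum is recovered as
    # the Fourier transform of a simulated two-line interferogram.
    import numpy as np

    n = 4096
    dx = 1.0 / n                      # mirror step chosen so bins are integers
    x = np.arange(n) * dx             # optical path difference (arbitrary units)
    interferogram = (np.cos(2 * np.pi * 1000.0 * x)
                     + 0.5 * np.cos(2 * np.pi * 1500.0 * x))

    spectrum = np.abs(np.fft.rfft(interferogram))
    wavenumber = np.fft.rfftfreq(n, dx)   # conjugate variable to path difference

    print(wavenumber[spectrum > 0.4 * spectrum.max()])   # [1000. 1500.]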


Resolution with a Fourier transform spectrometer can exceed that with a large grating spectrometer. Infrared spectroscopy has made the most use of this technique, with high resolution studies of, among many other molecules, H₂O, CO₂, CH₄, and C₂H₄. Fourier transform spectroscopy has been valuable as well for visible spectra, especially for mapping the complex spectrum of I₂, which extends throughout much of the visible region of the spectrum.⁶

Because of the high absolute precision with which they were measured, I₂ wavelengths are now routinely used as standards in many spectroscopy experiments.

The recent development of lasers, especially tunable lasers, has awakened a new interest in molecular spectroscopy and motivated a rapid proliferation of new experimental techniques.⁷

Although absorption spectroscopy with laser sources has been highly successful, particularly in the infrared, the most innovative techniques have utilized laser-induced fluorescence. Spectra are obtained in these experiments by detecting photons which are emitted from excited states (fluorescence) as the laser wavelength is varied. Excitation to the higher energy level occurs whenever the laser's wavelength coincides with a spectral line, enabling those molecules in the proper lower energy level to absorb laser photons and thus undergo the upward transition. Significant advantages of laser spectroscopy include high resolution, low background and noise, and exceptionally high sensitivity. Laser-induced fluorescence experiments have obtained spectra from molecules at densities as low as 10⁴ cm⁻³. This has been especially valuable for the study of free radicals (chemically unstable molecules) and molecular ions.

Laser spectroscopy techniques have been developed for both the near-infrared and far-infrared spectral regions. Tunable diode lasers and color center lasers now cover the entire near-infrared region from 1 to 30 μm. Wavelengths for transitions between different vibrational energy levels of a single electronic state typically fall within this range, and laser absorption spectroscopy, because of the narrow laser linewidths, provides good resolution of the rotational structure even for quite large molecules. Extensive investigations have been made for molecules such as NH₃, C₂H₂, CF₄, and SF₆.

Far-infrared lasers have wavelengths which are well-matched to rotational transitions in many molecules. These lasers, however, operate at fixed frequencies. Spectroscopy can nonetheless be performed by "tuning" the molecule. This is accomplished by applying a strong magnetic field which shifts the molecule's energy levels (Zeeman effect) until the transition frequency between the two levels matches the laser frequency. Laser magnetic resonance, as this procedure is called, is rapidly increasing in usage as more and more far-infrared laser lines are discovered.


Tunable laser radiation from dye lasers is available throughout the visible region of the spectrum. Nonlinear optical techniques, such as frequency doubling, extend the range of tunability to wavelengths as short as 200 nm. This is the wavelength region for band spectra associated with transitions between different electronic states, and both laser absorption and laser-induced fluorescence, as well as more exotic laser techniques, have been used to study a large number of molecules. In most laser spectroscopy experiments, the resolution is limited not by the laser's linewidth but rather by the Doppler width of the transition, a consequence of molecular motion. The limiting resolution of around 1 part in 10⁶, often inadequate to resolve rotational details in polyatomic molecules, is poor when compared with the limit of 1 part in 10⁹ or better which is set by the laser's linewidth. This has spurred considerable interest in techniques of Doppler-free spectroscopy. One such technique, which has been especially fruitful for ultra-high resolution spectroscopy, is optical-optical double resonance. It has been used for studies of molecules such as BaF₂, CaF₂, and NO₂. Other Doppler-free techniques have been applied to a variety of molecules.
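
A sketch of the Doppler limit, using the standard expression for the Doppler full width at half maximum of a line from a thermal gas; the choice of I₂ at room temperature is ours.

    # Doppler-limited resolution estimate: fractional FWHM of a spectral
    # line for a gas of molecules of mass m at temperature T.
    import math

    K_B, C, AMU = 1.381e-23, 2.998e8, 1.6605e-27   # J/K, m/s, kg

    def doppler_fraction(mass_amu, T_kelvin):
        """Doppler FWHM divided by the line frequency."""
        m = mass_amu * AMU
        return math.sqrt(8 * K_B * T_kelvin * math.log(2) / (m * C ** 2))

    # I2 vapor (mass ~254 amu) at room temperature:
    print(doppler_fraction(254.0, 300.0))   # about 8e-7, i.e. ~1 part in 10^6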

Ultrahigh resolution alone is often inadequate for analysis of the highly complex spectra of polyatomic molecules. A new experimental technique, introduced to grapple with this problem, is laser spectroscopy of supersonic molecular beams.⁸ Molecules of interest, often seeded into a rare gas, are forced at high pressure through an expansion nozzle into vacuum. The expansion process cools the molecules' vibrational and rotational motions down to temperatures of only a few degrees Kelvin, leaving the molecules in only a handful of the lowest energy levels. This greatly simplifies the spectrum and facilitates analysis. The small molecule NO₂ has a remarkably complex visible spectrum which for many years had stubbornly resisted attempts at analysis. New efforts with supersonic beams and Doppler-free laser spectroscopy have, however, finally succeeded in establishing a basis for understanding this molecule.

Recent advances in the spectroscopy of free radicals and molecular ions,⁹ difficult to produce in large quantities, stem from the development of such high-sensitivity techniques as laser spectroscopy. The unpaired electron found in most of these species adds complexity as well as interest to their spectra. The number of radicals and ions studied, however, remains quite small in comparison with stable, neutral molecules. High resolution studies of several molecular ions have been performed by laser spectroscopy of ion beams. The infrared spectrum of the one-electron molecule HD⁺ was measured with sufficient accuracy to test rigorously the foundations of molecular theory.


Another promising technique, which avoids problems due to the rapid recombination of molecular ions, is the use of laser spectroscopy to study ions which are stored in an ion trap.

Spectroscopic data for molecules are valuable far beyond the walls of the laboratory. Spectra are used in a wide range of applications, from the routine industrial analysis of chemicals to the identification of atmospheric pollutants. Perhaps the most important application in recent years of molecular spectroscopy has been in the field of interstellar chemistry.¹⁰ Nearly 60 molecules in interstellar molecular clouds have been identified on the basis of their molecular spectrum, and the number continues to grow. While astronomers have observed and identified some visible spectral lines, most of the molecules have been detected by radio astronomers on the basis of microwave-emitting rotational transitions. Identification of these interstellar molecules has been possible only because of the extensive collection and tabulation of molecular spectroscopy data which has been going on for many years. In a few cases, suggestions by astronomers that some of the observed features were due to "exotic" molecules have prompted laboratory workers to produce and measure the spectra of these species. The interchange between astronomy and molecular spectroscopy has been beneficial for both sides, but with over 200 interstellar microwave lines still unidentified, much work remains to be done.

Both fundamental and applied spectroscopy have witnessed remarkable growth in the last decade. Increasing demand for spectroscopic data and continuing advances in technology will undoubtedly keep molecular spectroscopy vigorous for many years to come.

RANDALL D. KNIGHT

References

1. Herzberg, Gerhard F., "Molecular Spectra and Molecular Structure," Vol. I, "Spectra of Diatomic Molecules" (2nd ed., 1950); Vol. II, "Infrared and Raman Spectra of Polyatomic Molecules" (1945); Vol. III, "Electronic Spectra and Electronic Structure of Polyatomic Molecules" (1966), New York, Van Nostrand Reinhold.

2. Steinfeld, Jeffrey I., "Molecules and Radiation," Cambridge, Mass., MIT Press, 1978.

3. Huber, K. P., and Herzberg, G. F., "Molecular Spectra and Molecular Structure," Vol. IV, "Constants of Diatomic Molecules," New York, Van Nostrand Reinhold, 1979.

4. Townes, C. H., and Schawlow, A. L., "Microwave Spectroscopy," New York, McGraw-Hill, 1955.

5. Becker, E. D., and Farrar, T. C., "Fourier Transform Spectroscopy," Science 178, 361 (1972).

6. Gerstenkorn, S., and Luc, P., "Atlas du Spectre d'Absorption de la Molécule d'Iode (14 800-20 000 cm⁻¹)," Paris, Editions du C.N.R.S., 1978. A correction to the atlas is given in Gerstenkorn, S., and Luc, P., Revue de Physique Appliquée 14, 791 (1979). Subtraction of 0.0056 cm⁻¹ from all wavenumbers in the atlas results in an absolute accuracy of 0.002 cm⁻¹ and a relative accuracy of 0.0007 cm⁻¹.

7. See papers in Hall, J. L., and Carlsten, J. L. (Eds.), "Laser Spectroscopy," Vol. III, Berlin, Springer-Verlag, 1977; and in Walther, H., and Rothe, K. W. (Eds.), "Laser Spectroscopy," Vol. IV, Berlin, Springer-Verlag, 1979.

8. Levy, Donald H., "Laser Spectroscopy of Cold Gas Phase Molecules," Ann. Rev. Phys. Chem. 31, 197 (1980).

9. Saykally, R. J., and Woods, R. C., "High Resolution Spectroscopy of Molecular Ions," Ann. Rev. Phys. Chem. 32, 403 (1981).

10. Green, Sheldon, "Interstellar Chemistry: Exotic Molecules in Space," Ann. Rev. Phys. Chem. 32, 103 (1981).

Cross-references: ABSORPTION SPECTRA; ATOMIC SPECTRA; ENERGY LEVELS, ATOMIC; FOURIER ANALYSIS; LASER; RAMAN EFFECT AND RAMAN SPECTROSCOPY; SCHRODINGER EQUATION; SPECTROSCOPY.

MOLECULAR WEIGHT

The molecular weight of a chemical compound is the sum of the atomic weights of its constituent atoms. The molecule is the smallest unit of a substance which still retains all of its chemical properties. By convention, atomic weights, and therefore molecular weights, are expressed relative to an arbitrary standard (see below). For example, the molecule of acetic acid, CH₃COOH, contains two atoms of carbon, four of hydrogen, and two of oxygen, so that its molecular weight is the sum 2(12.01) + 4(1.01) + 2(16.00), which totals 60.06. This molecular weight value is clearly in arbitrary units, but a related quantity, the gram-molecular weight or mole, is the molecular weight expressed in grams. One mole of any compound has been found to contain 6.022 × 10²³ molecules, and this number is called the Avogadro constant.
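
A sketch of this bookkeeping; the three atomic weights are the rounded values used in the acetic acid example above.

    # Molecular weight as a sum of atomic weights.
    ATOMIC_WEIGHT = {"C": 12.01, "H": 1.01, "O": 16.00}

    def molecular_weight(composition):
        """composition maps element symbol -> number of atoms per molecule."""
        return sum(ATOMIC_WEIGHT[el] * n for el, n in composition.items())

    # Acetic acid, CH3COOH: two C, four H, two O
    print(molecular_weight({"C": 2, "H": 4, "O": 2}))   # 60.06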

For many years, the standard used for atomic weights was the exact value 16 for the naturally occurring mixture of isotopes of oxygen. Another system of atomic weights, based on the value of 16 for the most abundant (99.8 per cent) oxygen isotope, came into use for comparisons involving single atoms or molecules where isotopic differences were important. A conference of the International Commission on Atomic Weights in 1961 adopted as the standard a value of exactly 12 for the carbon-12 isotope, and since then all atomic weights in use have been based on this standard.

The weights of molecules range from a value of about two for the hydrogen molecule to


several millions for some virus molecules and certain polymeric compounds. Molecular dimensions accordingly range from a diameter of about 4 Å for the hydrogen molecule to several thousand angstroms, which has permitted viewing single large molecules in the electron microscope. Molecular sizes are generally much smaller and are not measured directly, but are deduced from x-ray diffraction studies of ordered groups of molecules in the crystalline state or from physical properties such as the hydrodynamic behavior of molecules in the gaseous or liquid state.

Many methods for determining molecular weights which are described below depend fundamentally on counting the number of molecules present in a given weight of sample. However, any usable sample contains a very large number of molecules: at least ten trillion of the largest known molecules are present in the smallest weight measurable on a sensitive balance. Therefore, an indirect count is made by measuring physical properties which are proportional to the large number of molecules present. A consequence of the large number of molecules sampled is the averaging of any variations in content of atomic isotopes in individual molecules, so that normal isotopic fluctuations lead to no measurable deviation of molecular weight values. Abnormally high concentrations of isotopes in radiation products may, however, produce altered molecular weights.

The term molecular weight is properly applied to compounds in which chemical bonding of all atoms holds the molecule together under normal conditions (see BOND, CHEMICAL). Thus, covalent compounds, as represented by many organic substances, usually are found to have the same molecular weight in the solid, liquid, and gaseous states. However, substances in which some bonds are highly polar may exist as un-ionized or even associated molecules in the gaseous state and in nonpolar solvents, but they may be ionized when dissolved in polar solvents. For example, ferric chloride exists in the gaseous state as FeCl₃ at high temperatures, as Fe₂Cl₆ at lower temperatures as well as in nonpolar solvents, but reverts to FeCl₃ in solvents of moderate polarity, and becomes ionic in water solutions, as chloride ions and hydrated ferric ions. Similarly, acetic acid and some other carboxylic acids associate as dimers in the vapor state and in solvents of low polarity, but exist as monomers with progressive ionization as the solvent polarity increases.

Truly ionic compounds, such as most salts, exist only as ions in the solid and dissolved states, so that the term molecule is not applicable and is not commonly used. Instead, the term formula weight is used; this denotes the sum of the atomic weights in the simplest formula representation of the compound. If a broad definition of a molecule as an aggregate of atoms held together by primary valence bonds


is adopted, then salts in the crystalline state would appear to have a molecular weight which is essentially infinite and limited only by the size of the crystal, since each ion is surrounded by several ions of opposite polarity to which it is attached by ionic bonds of equal magnitude.

A further complication in the definition of molecular weights occurs with inorganic polymers, such as the polyphosphates and polysilicates, whose polymeric nature is clearly evident in both their crystal structure and their highly viscous behavior in the molten state. However, the magnitude of their molecular weights often cannot be found by conventional methods because they are either insoluble or react with solvents, with consequent degradation. These examples indicate that the molecular weight often depends on the conditions used for measurement and must be specified where compounds subject to association, dissociation, or reaction are studied.

The history of the clarification of molecular weight concepts is of considerable interest, since it was so intimately related to other developments in chemical knowledge. Although Dalton had published a table of atomic weights in 1808, and by 1825 molecular formulas, derived from combining weights, were in use, many misconceptions of these formulas remained until about 1860. Then evidence from chemical reactions and from measurements of vapor densities firmly established the formulas of many inorganic and simple organic compounds as they are represented today. The vapor density method, based on Avogadro's hypothesis, was thus the first molecular weight method and continues to be useful for compounds that can be easily volatilized. It was not until 1881 that Raoult showed that the depression of freezing points was proportional to the molar concentration of solute. In 1884, van't Hoff related the osmotic pressure of solutions to the vapor pressure, boiling point, and freezing point behavior, and these methods were quickly put into use for determining molecular weights. The abnormal physical properties of salt solutions were explained in 1887 by the ionization theory of Arrhenius, and the very careful measurements of many of these properties furnished the strongest confirmation of the theory. While these measurements provided the most precise determinations of the extent of dissociation of weak electrolytes, they also contributed to the development of the Debye-Hückel theory for strong electrolytes.

Molecular Weight Distributions Most synthetic and many natural polymeric substances are mixtures of molecules having various chain lengths, and thus of different molecular weights, so-called polydisperse systems. In such cases, molecular weight values have an ambiguous meaning, and no single such value will completely represent a sample. Various techniques for measuring molecular weights, when applied to one of these materials, will produce values


which often disagree by a factor of two or more. This disagreement arises from the different bases of the methods: for example, some methods yield so-called number-average molecular weights by determining the number-concentration of molecules in a sample, while other methods produce weight-average molecular weights, which are related to the weight-concentrations of each species. Another common value is the viscosity-average molecular weight, which is related to the viscosity contribution of each species. Other bases are of importance for certain methods of study, and some of these are complex functions involving several averages. For some purposes, the determination of a single average molecular weight is sufficient for establishing relations between molecular weight and the behavior of polymers, but the type of molecular weight average must be so chosen as to have a close relation to the behavior property of interest. A more detailed knowledge of the constitution of a sample is sometimes required, particularly if several properties are to be considered, or if unusual forms of the molecular weight distribution curve are present.
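
A sketch of the two averages, using the usual definitions Mn = ΣNᵢMᵢ/ΣNᵢ and Mw = ΣNᵢMᵢ²/ΣNᵢMᵢ; the two-component mixture below is invented to show the near factor-of-two disagreement mentioned above.

    # Number-average (Mn) and weight-average (Mw) molecular weights of a
    # polydisperse sample, from counts N_i of molecules of weight M_i.
    def averages(species):
        """species: list of (M_i, N_i) pairs."""
        n_total = sum(N for _, N in species)
        w_total = sum(M * N for M, N in species)
        Mn = w_total / n_total                              # number average
        Mw = sum(M * M * N for M, N in species) / w_total   # weight average
        return Mn, Mw

    # An illustrative 50:50 mixture (by count) of 10 000 and 100 000 chains:
    print(averages([(1.0e4, 50), (1.0e5, 50)]))   # Mn = 55 000, Mw ~ 91 800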

The problem of completely defining the molecular weight nature of polydisperse materials is most accurately solved by determining the frequency of occurrence of each molecular species and representing the results as a frequency distribution curve. Such a study is generally quite tedious, though there are a few methods which provide much of the required information in one experiment. The method currently most used for determining molecular weight distributions of polymers is size exclusion chromatography, which includes gel permeation chromatography and gel filtration. This involves measurement of the differences in extent of permeation of molecules of different sizes into pores of a solid or gel matrix. The distribution of molecular sizes found is converted into a distribution of molecular weights by calibration with standard polymer samples. The method is rapid and applicable to many polymer types. Alternatively, polymers can be separated by fractional precipitation or fractional solution into a series of fractions each of which contains a fairly narrow distribution of molecular weights. Each fraction can then be characterized by one of the methods described below to yield an average molecular weight. Finally, the molecular weight distribution curve can be constructed by summation of these results. While the curve derived is somewhat inexact, it is the best approach for samples which are not susceptible to analysis by the chromatographic methods. The ultracentrifuge is less commonly used for determining molecular weight distributions in a single experiment, partly because of high instrumentation costs and partly because of the complexity of methods needed to analyze the data.

Uses. Molecular weight measurements, in


conjunction with the law of combining proportions, have enabled the atomic weights of elements in compounds to be determined. When the atomic weights are known, molecular weight measurements permit the assignment of molecular formulas. Other applications to compounds of low molecular weight allow determination of the extent of ionization of weak electrolytes, and the extent of association of some uncharged compounds which aggregate. The study of molecular weights is becoming increasingly valuable in assessing the effects which various molecular species of a polymer sample have on the physical properties of the product. Through such knowledge, the synthetic process may be modified to improve the properties of polymers.

Methods of Measurement Many physical and certain chemical properties vary substantially with the molecular weight of compounds, and these properties are the bases of all molecular weight methods. The summary given in this section includes principally the methods which are most frequently used or have general applicability. The choice of the most suitable method for a given sample depends on its state (gas, liquid, or solid), the magnitude of the molecular weight and the accuracy required in its determination, as well as on the stability of the compound to physical or chemical treatment. Some mention of the applicability of the methods in these regards is given wherever possible.

Gases and Liquids Avogadro's hypothesis (1811) that equal volumes of different gases contain the same number of molecules under the same conditions made it possible to find how many times heavier a single molecule of one gas is than that of another. Thus, relative molecular weights of all gases could be established by comparing the weights of equal volumes of gases. The significance of the idea and utilization of this method were first clearly demonstrated by Dumas in 1827, but it was not until 1860 that the results were accepted by most scientists, when Cannizzaro showed that a consistent system of atomic weights resulted. With the additional information from chemical experiments on the number of atoms of each kind present in each molecule, the relative weights of each atom were obtained. The assumption of the integral value, 16, for the atomic weight of oxygen (to give a value close to unity for the lightest element, hydrogen) then enabled molecular weights of all gaseous compounds to be determined. The method obviously can be applied to other molecules which normally occur in the liquid state but can be volatilized by heating. The Dumas and Victor Meyer methods are most used for molecular weight determinations with liquids in this way. These methods have been refined so that gas densities can now be determined with an accuracy of 0.02 per cent, and extremely small weights of material (about 1 μg) can be similarly studied with somewhat less accuracy.


High temperatures up to 2000°C have been used to study substances which are volatilized only with difficulty, provided decomposition can be avoided.
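
A sketch of the arithmetic behind the vapor density method, assuming ideal-gas behavior (M = mRT/PV); the sample data are invented.

    # Molecular weight from the mass of vapor filling a known volume.
    R = 8.314   # gas constant, J/(mol K)

    def molecular_weight_from_vapor(mass_g, volume_m3, T_kelvin, P_pascal):
        moles = P_pascal * volume_m3 / (R * T_kelvin)   # ideal-gas mole count
        return mass_g / moles                           # g/mol

    # Illustrative Dumas-type data: 0.220 g of vapor filling 100 cm^3
    # at 373 K and one atmosphere:
    print(molecular_weight_from_vapor(0.220, 1.00e-4, 373.0, 1.013e5))  # ~67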

Solids Measured by Colligative Methods It has been shown that nonvolatile molecules dissolved in a solvent affect several physical properties of the solvent in proportion to the number of solute molecules present per unit volume. Among these properties are a decrease of the vapor pressure of the solvent, a rise in its boiling point, a decrease in its freezing point, and the development of osmotic pressure when the solution is separated from the solvent by a semipermeable membrane. Properties such as these, which are related to the number of molecules in a sample rather than to the type of molecule, are called colligative properties. They are the basis for some of the most useful techniques for molecular weight determination. The magnitude of the effects and the ease of measurement differ greatly, so that certain of the colligative properties are preferred for this purpose. For example, an aqueous solution containing 0.2 gram of sucrose (molecular weight 342) in 100 ml has a vapor pressure 0.01 per cent less than that of the solvent, a boiling point 0.003°C greater, and a freezing point 0.011°C lower than that of the solvent, but will develop an osmotic pressure of 150 cm of water. Since the effects are related to the number-concentration of solute molecules, each method leads to a number-average molecular weight if the sample consists of a mixture of molecules of different sizes. Accurate results with any of the techniques are obtained only when measurements at a series of concentrations are extrapolated to infinite dilution, where the system becomes ideal, i.e., is not affected by interactions between molecules.
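
The sucrose figures quoted above can be checked with a short sketch; the molal constants for water and the temperature are assumptions supplied here, not values from the article.

    # Colligative effects for 0.2 g of sucrose (M = 342) in 100 mL of water.
    K_F, K_B_EBUL = 1.86, 0.512   # water molal constants, K kg/mol (assumed)
    R, G_ACC = 8.314, 9.81        # J/(mol K); m/s^2

    molality = (0.2 / 342) / 0.100             # mol solute per kg of water
    print(K_F * molality)                      # ~0.011 K freezing depression
    print(K_B_EBUL * molality)                 # ~0.003 K boiling elevation

    # Osmotic pressure pi = cRT, expressed as centimeters of water:
    c = (0.2 / 342) / 1.0e-4                   # mol per m^3 of solution
    pi_pascal = c * R * 298.0                  # assumed room temperature
    print(pi_pascal / (1000 * G_ACC) * 100)    # ~148 cm of water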

Direct vapor pressure measurements with a differential manometer are generally limited to the larger depressions produced by low molecular weight solutes, while refined techniques such as isothermal distillation require the most exact control of conditions. Isopiestic methods allow the comparison of the vapor pressure of solutions of an unknown with those containing a known substance, and several modifications have been used more than other vapor pressure methods. Ebulliometric techniques which depend on the elevation of the boiling point of a solvent are often used for solutes of low molecular weight and find some use for large molecules. Since boiling points are highly sensitive to the atmospheric pressure, it is either necessary to control pressure very precisely, or more commonly to measure the boiling points of both the solvent and solution simultaneously. Often a differential thermometer is employed to determine only the difference of the two temperatures, and these devices have been made so sensitive that molecular weights as large as 30 000 have sometimes been studied. Techniques involving the lowering of the freezing point of a solvent (cryoscopic methods) are much used for rapid approximate determinations of molecular weights in the identification of organic compounds. For this purpose a substance such as camphor, which is a good solvent for many organic compounds and has a large molar depression constant, is often chosen to magnify the difference in freezing point of the solvent and the solution of the unknown. Since freezing-point depressions are not sensitive to atmospheric pressure, they are easier to measure accurately than the effects described above, and much use has been made of them for precise studies of solutes having low molecular weights. The possibility of association or ionization of the solute must be considered with any of these methods, since these effects will greatly influence the result.
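A cryoscopic determination reduces to M = Kf·w/(W·ΔTf) for w grams of solute in W kilograms of solvent. The sketch below assumes a molal depression constant of about 40 K·kg/mol for camphor, a commonly quoted figure, with invented sample weights:

```python
# Cryoscopic (freezing-point depression) estimate of molecular weight:
# M = Kf * w / (W * dTf), w grams of solute in W kilograms of solvent.
# Kf ~ 40 K*kg/mol for camphor is an assumed, commonly quoted value.

def mw_cryoscopic(Kf, w_solute_g, W_solvent_kg, dTf):
    return Kf * w_solute_g / (W_solvent_kg * dTf)

# 10 mg of unknown in 100 mg of camphor depressing the melting point 20 C:
print(mw_cryoscopic(40.0, 0.010, 1.0e-4, 20.0))  # ~200 g/mol
```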

Osmotic pressures are so much larger than any other colligative property that they are most widely used for molecular weight measurements, particularly for long-chain polymers where the high sensitivity of the method is required. For accurate measurements, a membrane is required which permits the flow of solvent through its pores but completely holds back solute molecules. This condition is best satisfied where there are large differences in size of the solute and solvent molecules or of their affinity for the membrane. Membranes made from cellulose compounds are often successfully used for polymers which contain little material with molecular weights below about 10 000. Below this molecular weight the pore size of the satisfactory membranes is so small that solvent flow is very slow, and thus a very long time is required to reach constant osmotic pressure. In spite of this handicap, some of the most precise osmotic pressure measurements have been obtained with aqueous solutions of sucrose and similar small solutes by the use of membranes prepared by precipitating such materials as copper ferrocyanide in the pores of a solid support. The upper limit of molecular weights satisfactorily measured by osmometry is usually about 500 000, which is fixed by the lowest pressures that can be measured precisely and by the maximum concentrations of material which still give satisfactory extrapolations to infinite dilution. In comparing various colligative properties for the characterization of polymers, osmometry has the advantage that it is unaffected by the presence of impurities of very low molecular weight which will diffuse through membranes able to retain the polymer, whereas the other properties are greatly affected by the same impurities.
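In practice the reduced osmotic pressure π/c is plotted against concentration and extrapolated to c → 0, where it equals RT/Mn. A minimal sketch with synthetic data (the concentrations and pressures are invented):

```python
# Number-average molecular weight from membrane osmometry: extrapolate
# pi/c to infinite dilution, where (pi/c)_0 = RT/Mn. Synthetic data.
import numpy as np

R, T = 8.314, 300.0                          # SI units; Mn comes out in kg/mol
c  = np.array([1.0, 2.0, 4.0, 6.0])          # concentration, kg/m^3
pi = np.array([26.5, 56.0, 124.0, 204.0])    # osmotic pressure, Pa

slope, intercept = np.polyfit(c, pi / c, 1)  # linear virial extrapolation
Mn = R * T / intercept
print(Mn * 1000)                             # ~1e5 g/mol
```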

Modern instrumentation has provided com­mercial instruments utilizing several of these colligative properties for routine, accurate meas­urements in very short time and with small samples. This is true for boiling point, vapor pressure, and freezing point measurements of molecular weights up to several thousand, and for membrane osmotic pressure measurements of high molecular weight samples.


X-ray Diffraction X-ray diffraction analysis is a powerful method for determining exact molecular weight and structural characteristics of compounds in their crystalline state. However, the method is complicated and slower than many techniques which provide molecular weights of accuracy sufficient for many purposes, and so is usually employed only when the additional structural information is needed. The sample to be examined must have a high degree of crystalline order and is preferably a single crystal at least 0.1 mm in size; such samples are prepared fairly readily from many inorganic and nonpolymeric organic compounds. Alternatively, crystalline powders of certain crystal types may provide suitable results. Diffraction patterns are then obtained by one of several methods, and the angular positions of the reflections are used to calculate the lattice spacings, and thus the size of the unit cell. This unit cell is the smallest volume unit which retains all geometrical features of the crystalline class, and it contains a small integral number of molecules. A rough estimate of the molecular weight of the compound is needed from a determination by an independent method in order to obtain this integral number. Finally, the resultant molecular volume is multiplied by the exact bulk density of the crystal and by the Avogadro number to yield the molecular weight (see X-RAY DIFFRACTION).
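The final step is the simple product M = ρ·V·NA/Z. The sketch below checks it against rock salt (cubic cell edge 5.64 Å, density 2.165 g/cm³, Z = 4 formula units per cell), numbers chosen purely for illustration:

```python
# Molecular weight from unit-cell volume, crystal density, and the
# integral number Z of molecules per cell: M = rho * V * N_A / Z.
N_A = 6.022e23

def mw_xray(density_g_cm3, V_cell_cm3, Z):
    return density_g_cm3 * V_cell_cm3 * N_A / Z

a = 5.64e-8                     # cubic cell edge of rock salt, cm
print(mw_xray(2.165, a**3, 4))  # ~58.5 g/mol, the NaCl formula weight
```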

Light Scattering Measurements of the intensity of light scattered by dissolved molecules allow the determination of molecular weights. Most commonly the method is used for polymers above 10 000 units, though under optimum conditions molecular weights as low as 1000 have been determined. Since the intensity scattered by a given weight of dissolved material is directly proportional to the mass of each molecule, a weight-average value of the molecular weight is obtained for a polydisperse system. An average dimension of the molecule can also be obtained by a study of the angular variation of scattered light intensity, provided some dimension of the molecules exceeds a few hundred angstroms. The interaction between dissolved molecules substantially affects the intensity of scattered light so that extrapolation to infinite dilution of data collected at several polymer concentrations is required. The method has been so well developed in the last decade that it is now probably the most used method for determining absolute molecular weights of polymers. In addition, it provides information on sizes which is furnished by few other methods. The greatest problem encountered is in the removal of suspended large particles which otherwise would distort the angular scattering pattern of the solutions. This is rather easily accomplished by filtration in some cases, but it may be a formidable difficulty for particles which are highly solvated or are peptized by the molecules to be studied. Auxiliary information is required on the refractive index increment of the sample, i.e., the change in refractive index of the solvent produced by unit concentration of the sample. This information is supplied by a differential refractometer using the same wavelength of light as that employed in measurements of the intensity of scatter.
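At a single angle the dilute-solution result is usually written Kc/Rθ = 1/Mw + 2A2c (the Debye relation), so a straight-line fit against concentration gives 1/Mw as the intercept. A sketch with synthetic data (the values of Kc/Rθ are invented):

```python
# Weight-average molecular weight from the Debye relation
# Kc/R_theta = 1/Mw + 2*A2*c. Synthetic single-angle data.
import numpy as np

c   = np.array([0.5, 1.0, 2.0, 3.0]) * 1e-3     # concentration, g/mL
KcR = np.array([5.4, 5.8, 6.6, 7.4]) * 1e-6     # Kc/R_theta, mol/g

two_A2, inv_Mw = np.polyfit(c, KcR, 1)
print(1.0 / inv_Mw)      # Mw ~ 2e5 g/mol
print(two_A2 / 2.0)      # second virial coefficient A2
```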

The Ultracentrifuge The sedimentation of large molecules in a strong centrifugal field enables the determination of both average molecular weights and the distribution of molecular weights in certain systems. When a solution containing polymer or other large molecules is centrifuged at forces up to 250 000 times gravity, the molecules begin to settle, leaving pure solvent above a boundary which progressively moves toward the bottom of the cell. This boundary is a rather sharp gradient of concentrations for molecules of uniform size, such as globular proteins, but for polydisperse systems, the boundary is diffuse, the lowest molecular weights lagging behind the larger molecules. An optical system is provided for viewing this boundary, and a study as a function of the time of centrifuging yields the rate of sedimentation for the single component or for each of many components of a polydisperse system. These sedimentation rates may then be related to the corresponding molecular weights of the species present after the diffusion coefficients for each species are determined by independent experiments. Both the sedimentation and the diffusion rates are affected by interactions between molecules, so that each must be studied as a function of concentration and extrapolated to infinite dilution as is done for the colligative properties. The result of this detailed work is the distribution of molecular weights in the sample, which is available by few other methods. At present, this method is only partly satisfactory for molecular weight determinations with linear polymers because of the large concentration dependence of the diffusion coefficients. Difficulties have been found in reliably extrapolating diffusion coefficients beyond the lowest polymer concentrations which are experimentally attainable at present.
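Combining the two extrapolated coefficients gives the Svedberg relation M = sRT/[D(1 - v̄ρ)]. The sketch below uses textbook-style values for a globular protein in water, purely as an illustration:

```python
# Svedberg equation: M = s*R*T / (D * (1 - vbar*rho)), with s and D the
# sedimentation and diffusion coefficients extrapolated to infinite
# dilution. Illustrative protein-like values in cgs units.
R, T = 8.314e7, 293.15        # erg/(mol*K), K

def mw_svedberg(s, D, vbar, rho):
    return s * R * T / (D * (1.0 - vbar * rho))

# s = 4.4e-13 sec (4.4 S), D = 5.9e-7 cm^2/sec, vbar = 0.75 mL/g in water:
print(mw_svedberg(4.4e-13, 5.9e-7, 0.75, 0.9982))  # ~7.2e4 g/mol
```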

A modification of the sedimentation method which avoids the study of diffusion constants is the sedimentation equilibrium method, in which molecules are allowed to sediment in a much weaker field. Under these conditions, the sedimenting force is balanced by the force of diffusion, so that after times from a day to two weeks molecules of each size reach different equilibrium positions, and the optical measurement of the concentration of polymer at each point gives the molecular weight distribution directly. However, again extrapolation to infinite dilution must be used to overcome interaction effects. The chief difficulty here is the long time of centrifuging required, and the necessary stability of the apparatus during the period. A newer and somewhat faster technique, the Archibald method, permits the determination of weight-average molecular weights of polymers by analysis of the concentration gradient near boundaries soon after sedimentation begins.
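For an ideal solute at equilibrium, two readings of the concentration profile suffice: M = 2RT ln(c2/c1)/[(1 - v̄ρ)ω²(r2² - r1²)]. The sketch below uses invented readings and rotor settings:

```python
# Sedimentation equilibrium, ideal two-point form of the concentration
# profile. All numbers illustrative; real work extrapolates to c -> 0.
import math

R, T = 8.314e7, 293.15                     # cgs units

def mw_sed_eq(c1, c2, r1, r2, omega, vbar, rho):
    return (2 * R * T * math.log(c2 / c1) /
            ((1 - vbar * rho) * omega**2 * (r2**2 - r1**2)))

omega = 2 * math.pi * 10000 / 60           # 10 000 rpm in rad/sec
# relative concentrations 1.0 and 2.2 at radii 6.8 and 7.2 cm:
print(mw_sed_eq(1.0, 2.2, 6.8, 7.2, omega, 0.75, 0.9982))  # ~2.5e4 g/mol
```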

Chemical Analysis When reactive groups in a compound may be determined exactly and easily, this analysis may be used to determine the gram equivalent weight of the substance. This is the weight in grams which combines with or is equivalent to one gram-atomic weight of hydrogen. This equivalent weight may then be converted to the molecular weight by multiplying by the number of groups per molecule which reacted (provided they each are also equivalent to one hydrogen). If the number of reactive groups in the molecule is not known, then one of the physical methods for determining molecular weight must be used instead. The chemical method is convenient and often used for the identification of organic substances containing free carboxyl or amino groups which can readily be titrated, and for esters which can be saponified and determinations made of the amount of alkali consumed in this process. The equivalent weights of ionic substances containing, for example, halide or sulfate groups may also be determined by titration or by gravimetric analysis of insoluble compounds formed with reagents which act in a stoichiometric fashion. In the titration of acids, the "neutral equivalent" is the weight of material which combines with one equivalent of alkali, and a similar definition applies to the "saponification equivalent" of esters. If only one carboxyl or ester group is present in the molecule, these values equal the molecular weight of the compound.
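A minimal sketch of the neutral-equivalent calculation; the benzoic acid figures are a standard textbook illustration, not taken from the article:

```python
# Neutral equivalent from an acid-base titration: the weight of sample
# that consumes one equivalent of alkali. Multiplying by the number of
# acidic groups per molecule (if known) gives the molecular weight.
def neutral_equivalent(sample_g, mL_base, normality):
    return sample_g / (mL_base * normality / 1000.0)

# 0.244 g of an unknown acid neutralized by 20.0 mL of 0.100 N NaOH:
eq_wt = neutral_equivalent(0.244, 20.0, 0.100)
print(eq_wt)       # 122 g/equiv = MW if there is one -COOH (benzoic acid)
```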

In a similar way, if the terminal groups on polymer chains can be determined by a chemi­cal reaction without affecting other groups in the molecule, the equivalent weight or molecu­lar weight of the polymer may be obtained in certain cases. For polydisperse systems, a num­ber-average value of the molecular weight is obtained because the process essentially counts the total number of groups per unit weight of sample. Since the method depends on the effect of a single group in a long chain, its sensitivity decreases as the molecular weight rises, and so is seldom applicable above molecular weights of 20 000. Particularly at high molecular weights, the method is very sensitive to small amounts of impurities which can react with the testing reagent, so that careful purification of samples is desired.

It is also important to know that impurities or competing mechanisms of polymerization do not lead to branching or other processes which may provide greater or fewer reactive groups per molecule. The analysis for end groups must be carried out under mild conditions which do not degrade the polymer, since this would also lead to lower molecular weight values than expected. Labeling of end groups either with radioactive isotopes or with heavy isotopes which can be analyzed with the mass spectrometer provides a rapid and convenient analysis for end groups. This labeling can be accomplished with a labeled initiator if this remains at the chain ends, or after polymerization is complete, by exchange of weakly bonded groups with similar groups in a labeled compound. Molecular weight determinations by end group analysis are often used for condensation polymers of lower molecular weights and are especially valuable in studying degradation processes in polymers.

GEORGE L. BEYER

References

Daniels, F., Williams, J. W., Bender, P., Alberty, R. A., Cornwell, C. D., and Harriman, J. E., "Experimental Physical Chemistry," Seventh Edition, New York, McGraw-Hill Book Company, 1970.

Billmeyer, F. W., Jr., "Textbook of Polymer Science," New York, Wiley-Interscience, 1971.

Scholte, T. G., in "Polymer Molecular Weights" (P. E. Slade, Jr., Ed.), Part II, New York, Marcel Dekker, 1975.

Wells, A. F., "Structural Inorganic Chemistry," Fourth Edition, New York, Oxford Univ. Press, 1975.

Cross-references: ATOMIC PHYSICS; BOND, CHEMICAL; CENTRIFUGE; LIGHT SCATTERING; MOLECULES AND MOLECULAR STRUCTURE; OSMOSIS; POLYMER PHYSICS; VAPOR PRESSURE AND EVAPORATION; X-RAY DIFFRACTION.

MOLECULES AND MOLECULAR STRUCTURE

A molecule is a local assembly of atomic nuclei and electrons in a state of dynamic stability. The cohesive forces are electrostatic, but, in addition, relatively small electromagnetic interactions may occur between the spin and orbital motions of the electrons, especially in the neighborhood of heavy nuclei. The internuclear separations are of the order of 1 to 2 × 10⁻¹⁰ metres, and the energies required to dissociate a stable molecule into smaller fragments fall into the 1 to 5 eV range. The simplest diatomic species is the hydrogen molecule-ion H2⁺ with two nuclei and one electron. At the other extreme, the protein ribonuclease contains 1876 nuclei and 7396 electrons per molecule.

Historically, molecules were regarded as being formed by the association of individual atoms. This led to the concept of valency, i.e., the number of individual chemical bonds or linkages with which a particular atom can attach itself to other atoms. When the electronic theory of the atom was developed, these bonds were interpreted in terms of the behavior of the valence, or outer shell, electrons of the combining atoms. Each atom with a partly filled valence shell attempts to acquire a completed octet of outer electrons, either by electron transfer, as in (a), to give an electrovalent bond, resulting from Coulombic attraction between the oppositely charged ions

[Structures (a), (b), (c): electron-dot diagrams for an electron-transfer (electrovalent) ion pair, the chlorine molecule Cl:Cl, and the amine oxide R3N:O.]

(W. Kossel, 1916); or by electron sharing, as in (b) and (c), to give a covalent bond (G. N. Lewis, 1916). In (b), each chlorine atom donates one electron to form a homopolar bond, which is written Cl-Cl where the bar denotes on this theory one single bond, or shared electron pair. In (c), the nitrogen-oxygen bond is formed by two electrons donated by only the nitrogen atom, giving a semipolar, or coordinate-covalent bond, which is written R3N → O, and which is electrically polarized. Double or triple bonds result from the sharing of four or six electrons between adjacent atoms, as in ethylene (d) and acetylene (e) respectively.

(d) H2C=CH2    (e) H-C≡C-H

However, difficulties arise in describing the structures of many molecules in this fashion. For example, in benzene (C6H6), a typical aromatic compound, the carbon nuclei form a plane regular hexagon, but the electrons can only be conventionally written as forming alternate single and double bonds between them. Furthermore, an electron cannot be identified as coming specifically from any of these bonds upon ionization. Such difficulties disappear in the quantum-mechanical theory of a polyatomic molecule, whose electronic wave function can be constructed from nonlocalized electron orbitals extending over all of the nuclei. The concept of valency is not basic to this theory, but is simply a convenient approximation by which the electron density distribution is partitioned in different regions in the molecule.

Molecular compounds consist of two or more stable species held together by weak forces. In clathrates, a gaseous substance such as SO2, HCl, CO2 or a rare gas is held in the crystal lattice of a solid, such as β-quinol, by van der Waals-London dispersion forces. The gas hydrates, e.g., Cl2·6H2O, contain halogen molecules similarly trapped in ice-like structures. The hydrogen bond, with energy ≈ 0.25 eV, is responsible not only for the high degree of molecular association in liquids such as water (O-H---O-H---) but also for such molecules as the formic acid dimer

      O-H---O
     /       \
H-C           C-H
     \       /
      O---H-O

which contains two hydrogen bonds indicated by dashed lines. Molecular complexes vary greatly in their stability; in donor-acceptor complexes, electronic charge is transferred from the donor (e.g., NH3) to the acceptor (e.g., BF3), as in a semipolar bond. The BF3·NH3 complex has a binding energy with respect to dissociation into NH3 and BF3 of 1.8 eV. The bond here is relatively strong; the electron transfer can occur between the components in their electronic ground states. On the other hand, in weaker complexes such as C6H6·I2, with binding energy of about 0.06 eV, there is only a fractional transfer of charge from benzene to iodine. The actual ionic charge-transfer state lies at much higher energy than the ground state of the complex.

The discovery of XePtF6 by Bartlett (1962) has been followed by the synthesis of many other rare gas compounds whose existence was not predicted by classical valency theories. Compounds such as XeF2, XeF4, XeF6 and XeOF4 are quite stable, the average Xe-F bond energy in the square planar molecule XeF4 being 1.4 eV.

A molecule X is characterized by:
(1) A stoichiometric formula AaBbCc···, where a, b, c, ... are the numbers of atoms of elements A, B, C, ... that it contains. The ratio a : b : c : ... is found by chemical analysis for these elements. The absolute values of a, b, c, ... are then fixed by determination of the molecular weight of X. For a volatile substance, the gas density of X and of a gas of known molecular weight are compared at the same temperature and pressure. The molecular weights are in the ratio of the gas densities, since Avogadro's principle states that equal volumes of gases at the same temperature and pressure contain the same numbers of molecules. For a nonvolatile substance, a known weight can be dissolved in a solvent, and the resultant lowering of vapor pressure, elevation of the boiling point, or depression of the freezing point of the solvent can be measured. Each of these properties depends upon the number of molecules of solute present, so the number of molecules per unit weight of X is found and, hence, the molecular weight. For substances of high molecular weight such as proteins (molecular weight ≈ 34 000-200 000) or polymers, the molecular weight is found from osmotic pressure measurements or the rate of sedimentation in a centrifuge. The molecular weight of a molecule in crystalline form is determined when the density of the crystal and the dimensions of the unit cell from x-ray analysis are both known. Finally, for stable volatile compounds, it is often possible to form the ion X+ and pass this through a mass spectrograph to determine the molecular weight.

(2) The spatial distribution of the nuclei in their mean equilibrium or "rest" positions. At an elementary level, this is described in geometrical language. For example, in carbon tetrachloride, CCl4, the four chlorine nuclei are disposed at the corners of a regular tetrahedron, and the carbon nucleus is at the center. In the [CoCl4]2- ion, the arrangement of the chlorine nuclei about the central metal nucleus is also tetrahedral, whereas in [PdCl4]2- it is planar.

At a more sophisticated level, each molecule is classified under a symmetry point group. Most nonlinear molecules possess only 1, 2, 3, 4 or 6-fold rotation axes, and belong to one of the 32 crystallographic point groups. For example, the pyramidal ammonia molecule NH3 has a threefold rotation axis C3 through the nitrogen nucleus and three reflection planes σv intersecting at this axis, and belongs to the C3v(3m) point group. Tetrahedral molecules CX4 belong to the Td(43m) point group. Linear diatomic and polyatomic molecules belong to either of the continuous point groups D∞h or C∞v according to whether a center of symmetry is present or not.

The symmetry classification does not define the geometry of a molecule completely. The values of certain bond lengths or angles must also be specified. In carbon tetrachloride, it is sufficient to give the C-Cl distance (1.77 × 10⁻¹⁰ meters), since classification under the Td point group implies that all four of these bonds have equal length and the angle between them is 109°28'. In ammonia, both the N-H distance (1.015 × 10⁻¹⁰ meters) and the angle HNH (107°) must be specified. In general, the lower the molecular symmetry, the greater is the number of such independent parameters required to characterize the geometry. Information about the symmetry and internal dimensions of a molecule is obtained experimentally by SPECTROSCOPY, ELECTRON DIFFRACTION, NEUTRON DIFFRACTION, X-RAY DIFFRACTION, and MAGNETIC RESONANCE. (See these topics for details.) Nuclear magnetic resonance (NMR) is widely used to study molecular structure since it gives information about both the chemical environment of a given nucleus in a molecule and also the disposition of neighboring nuclei. While commonly employed on protons, its use is increasing for other nuclei with nonzero spin angular momentum.

(3) The dynamical state is defined by the values of certain observables associated with orbital and spin motions of the electrons and with vibration and rotation of the nuclei, and also by symmetry properties of the corresponding stationary-state wave functions. Except for cases when heavy nuclei are present, the total electron spin angular momentum of a molecule is separately conserved with magnitude Sℏ, and molecular states are classified as singlet, doublet, triplet, ... according to the value of the multiplicity (2S + 1). This is shown by a prefix superscript to the term symbol, as in atoms.

The Born-Oppenheimer approximation permits the molecular Hamiltonian H to be separated into a component He that depends only on the coordinates of the electrons relative to the nuclei plus a component depending upon the nuclear coordinates, which in turn can be written as a sum Hv + Hr of terms for vibrational and rotational motion of the nuclei (we may ignore translation here). The eigenfunctions Ψ of H may correspondingly be factorized as the product ΨeΨvΨr of eigenfunctions of these three operators, and the eigenvalues E decomposed as the sum Ee + Ev + Er. In general, we find Ee > Ev > Er.

Electronic states of molecules are classified according to the symmetry properties of Ψe (which forms a basis for an irreducible representation of the molecular point group). Thus ³B1u is a term symbol for benzene (D6h point group) that denotes a triplet electronic state whose wave function transforms like the B1u representation of the group. In the case of diatomic and linear polyatomic molecules, the term symbol shows the magnitude of the conserved component of orbital electronic angular momentum Λℏ about the axis, states being classified as Σ, Π, Δ, ... according to Λ = 0, 1, 2, .... The superscript + or - shows the behavior of Ψe for a linear molecule upon reflection in a plane containing the molecular axis; for centrosymmetric linear molecules (D∞h point group) the subscript g or u shows the parity +1 or -1 respectively for Ψe with respect to inversion at the center.

The vibrational wavefunction Ψv can be approximated by a product of 3N - 6 harmonic oscillator wave functions ψi, each a function of a normal displacement coordinate Qi:

Ψv = ∏ ψi(Qi)   (the product running over i = 1 to 3N - 6)

The product contains (3N - 5) factors for a linear molecule; N is the number of nuclei. Each oscillatory mode can be excited with quanta vi = 0, 1, 2, .... When vi = 0, ψi transforms like the totally symmetrical representation of the molecular point group; when vi = 1, ψi transforms like Qi. The symmetry of Ψv under the molecular point group is found from the direct product for all the ψi. The vibrationless ground state with v1 = v2 = ··· = 0 is always totally symmetrical.

Each rotational state is characterized by a value for the quantum number J, where J(J + 1)ℏ² is the squared angular momentum for rotation of the nuclei (apart from spin). If Ia, Ib and Ic denote the moments about the principal axes of inertia of the molecule, then a spherical top has Ia = Ib = Ic; a molecule with two principal moments equal is either a prolate (Ic = Ib > Ia) or an oblate (Ic > Ib = Ia) symmetric top; if Ic > Ib > Ia, the top is asymmetric. Symmetric top molecules have Cn symmetry axes with n ≥ 3 and belong to point groups with degenerate representations. The component Kℏ of rotational angular momentum about the top axis is conserved and the rotational levels are also characterized by the value of the quantum number K = 0, 1, 2, ..., J. A symmetry classification is made for Ψr under the rotational subgroup of the molecular point group. Finally, each eigenstate is described as + or - according to the parity of Ψ under inversion in a space-fixed coordinate system.
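These definitions translate directly into a classification rule; the sketch below applies them, with arbitrary illustrative units for the moments of inertia:

```python
# Classification of a rigid rotor by its principal moments of inertia,
# following the definitions in the text (inputs ordered Ia <= Ib <= Ic).
def top_type(Ia, Ib, Ic, tol=1e-9):
    if abs(Ia - Ic) < tol:  return "spherical top"
    if abs(Ib - Ic) < tol:  return "prolate symmetric top"   # Ic = Ib > Ia
    if abs(Ia - Ib) < tol:  return "oblate symmetric top"    # Ic > Ib = Ia
    return "asymmetric top"

print(top_type(1.0, 1.0, 1.0))   # CH4-like: spherical
print(top_type(0.5, 1.0, 1.0))   # CH3Cl-like: prolate
print(top_type(1.0, 1.0, 2.0))   # C6H6-like: oblate
print(top_type(0.5, 0.9, 1.4))   # H2O-like: asymmetric
```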

(4) In order to distinguish between different electronic states Ψe of the same symmetry and spin multiplicity, a further classification is obtained by expanding Ψe as an antisymmetrized product of n single-electron wave functions φi, each a function of the coordinates of one of the n electrons in the molecule:

Ψe = (n!)⁻¹/² det |φ1(1) φ2(2) ··· φn(n)|

where (n!)⁻¹/² is a normalization factor. Each of the molecular orbitals (MO's) φi is constructed to transform like an irreducible representation of the molecular point group and is usually formed by linear combination of atomic orbitals (LCAO) χj centered upon the individual nuclei:

φi = Σj cij χj

The MO's are written in order of decreasing energy necessary to ionize the electrons which occupy them, and electrons are assigned to the MO's in accordance with the Pauli principle. For example, the electronic ground state of ammonia (C3v point group) is written

(1a1)²(2a1)²(1e)⁴(3a1)²

where the superscripts show the distribution of the ten electrons among three MO's of a1 symmetry and one of e symmetry, the electrons in the (3a1) orbital being most readily ionized. The symmetry of the resultant molecular wavefunction Ψe is found by taking direct products for each orbital occupied by an electron. Here Ψe belongs to the totally symmetrical representation (and is also singlet). Excited electronic states are obtained by promoting electrons into orbitals with higher energies, but the molecular symmetry in such states often differs from that in the ground state, as a result of changes in geometry.

In calculations of molecular properties, the MO's φi can be improved by variational methods which make them satisfy the Hartree-Fock equations. This gives self-consistent field (SCF) MO's, yielding a better wavefunction Ψe. However the latter is still, in practice, constructed from an incomplete set of basis functions. Further improvement is achieved by configuration interaction (CI), in which Ψe's of the same symmetry are allowed to mix in linear combination.
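As a minimal illustration of the LCAO idea (not taken from the article), the Hückel treatment of the benzene π system diagonalizes the ring adjacency matrix; the orbital energies come out as α + xβ, with x the eigenvalues:

```python
# Huckel treatment of the six benzene pi electrons: orbital energies are
# alpha + x*beta, where x are eigenvalues of the ring adjacency matrix.
import numpy as np

n = 6
H = np.zeros((n, n))
for i in range(n):                 # each carbon bonded to its two neighbors
    H[i, (i + 1) % n] = H[(i + 1) % n, i] = 1.0

x = np.linalg.eigvalsh(H)[::-1]    # descending order
print(np.round(x, 3))              # [ 2.  1.  1. -1. -1. -2.]
# i.e. alpha+2beta, alpha+beta (doubly degenerate), alpha-beta, alpha-2beta
```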

G. W. KING

References

Burdett, J. K., "Molecular Shapes," New York, John Wiley & Sons, 1980.

Drago, R. S., "Physical Methods in Chemistry," Philadelphia, W. B. Saunders Co., 1977.

Gillespie, R. J., "Molecular Geometry," London, Van Nostrand-Reinhold, 1972.

King, G. W., "Spectroscopy and Molecular Structure," New York, Holt, Rinehart and Winston, Inc., 1964.

Levine, I. N., "Molecular Spectroscopy," New York, John Wiley & Sons, 1975.

Cross-references: BOND, CHEMICAL; ELECTRON DIFFRACTION; INTERMOLECULAR FORCES; MAGNETIC RESONANCE; MOLECULAR WEIGHT; NEUTRON DIFFRACTION; QUANTUM THEORY; SPECTROSCOPY; X-RAY DIFFRACTION.

MOSSBAUER EFFECT

The Mossbauer effect is the phenomenon of recoilless resonance fluorescence of gamma rays from nuclei bound in solids. It was first discovered in 1958 and brought its discoverer, Rudolf L. Mossbauer, the Nobel prize for physics in 1961. The extreme sharpness of the recoilless gamma transitions and the relative ease and accuracy in observing small energy differences make the Mossbauer effect an important tool in chemistry, solid-state physics, nuclear physics, biophysics, metallurgy, and mineralogy.

Resonance fluorescence involves the excitation of a quantized system (the absorber) from its ground state (0) to an excited state (1) by absorption of a photon emitted from an identical system (the source) decaying from state (1) to (0). Not every nucleus has a suitable gamma transition; however, the Mossbauer effect has been observed in more than 60 different isotopes. The parameters characterizing the nuclear resonance process for some typical isotopes are illustrated in Fig. 1.

To conserve energy and momentum in the emission and absorption processes, each system, the source and absorber, must acquire a recoil energy R equal to E²/2Mc², where E is the photon energy, M is the mass of the recoiling system and c is the speed of light. The energy available for the excitation of the absorber is thus reduced by 2R, and resonance fluorescence can be achieved only if the missing energy 2R is not larger than the widths of the levels involved. Before 1958, it was thought that for all gamma transitions the width required to get overlap between the emission and the absorption line was much larger than the natural width Γ, where Γ is related to the half-life T1/2 of the excited nuclear level by the expression Γ·T1/2 = 4.55 × 10⁻¹⁶ eV·sec. In fact, techniques had been developed to compensate for the recoil energy loss by applying large Doppler shifts with an ultracentrifuge or through thermal motion. These methods necessarily broaden the intrinsically narrow lines thereby reducing the absorption cross section.
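The sizes involved are easy to check. The sketch below uses the standard ⁵⁷Fe values (14.4 keV photon, A = 57, T1/2 ≈ 10⁻⁷ sec), which are typical figures rather than quantities quoted in this paragraph:

```python
# Free-nucleus recoil energy R = E^2/(2*M*c^2) compared with the natural
# width Gamma = 4.55e-16 eV*sec / T_half. Standard 57Fe-like values.
E     = 14.4e3               # photon energy, eV
Mc2   = 57 * 931.494e6       # nuclear rest energy for A = 57, eV
R     = E**2 / (2 * Mc2)
Gamma = 4.55e-16 / 1.0e-7    # eV

print(f"R = {R:.2e} eV, Gamma = {Gamma:.2e} eV, R/Gamma = {R/Gamma:.0e}")
# R ~ 2e-3 eV is several hundred thousand times Gamma, so free-atom
# emission misses the resonance by an enormous margin.
```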

Mossbauer discovered that in some cases these difficulties may be removed by embedding the source and absorber nuclei in a crystal. Being part of a quantized vibrational system, these nuclei interact with the lattice by exchange of vibrational quanta or phonons only. If the characteristic phonon energy is large compared to the recoil energy R for a free nucleus, the probability for the emission of a gamma ray without a change in the vibrational state of the lattice is large. For such a zero phonon transition, the lattice as a whole absorbs the recoil momentum and the recoil energy is negligibly small. At the same time, the emission and absorption lines achieve the natural width Γ.

For an atom bound by harmonic forces, the fraction f of events without recoil energy loss is given by f = exp(-4π²⟨x²⟩/λ²). Here ⟨x²⟩ is the mean square displacement of the radiating atom taken along the direction of the photon with wavelength λ. In an environment of lower than cubic symmetry, ⟨x²⟩, and therefore f, may be anisotropic. A large recoilless fraction may be obtained when ⟨x²⟩ is small and λ large. The former condition implies small vibrational amplitude and thus low temperature, high vibrational frequency and large mass M, while the latter implies low photon energy, E. Both conditions imply small recoil energy R.
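A sketch of the same formula, with an assumed rms displacement of 0.05 Å and the 14.4 keV ⁵⁷Fe line (both numbers illustrative):

```python
# Recoilless fraction f = exp(-4*pi^2*<x^2>/lambda^2).
import math

E   = 14.4e3             # photon energy, eV
lam = 12398.4 / E        # wavelength in angstroms (hc = 12398.4 eV*A)
x2  = 0.05**2            # assumed <x^2>, A^2

f = math.exp(-4 * math.pi**2 * x2 / lam**2)
print(lam, f)            # lambda ~ 0.86 A, f ~ 0.88
```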

Recoilless transitions can also occur in amor­phous substances like glasses and high-viscosity liquids. For the latter, the diffusive motion superimposed on the thermal vibration results in a broadening of the Mossbauer line.

For all Mossbauer isotopes, the nuclear half-life T1/2, typically 10⁻⁸ second, is very long compared to the period of the lattice vibrations, typically 10⁻¹³ second. A conceivable first-order Doppler shift of the Mossbauer line due to the thermal motion will therefore average out to zero. The second-order Doppler effect, however, leads to an observable shift, sometimes called the temperature shift. The photons emitted by a source nucleus moving with a mean square velocity ⟨v²⟩ are lower in energy by a fraction ⟨v²⟩/2c² as compared to the photons emitted at rest. Similarly the transition energy of a vibrating absorber nucleus appears lower to the incident photon by a fraction ⟨v²⟩/2c². In principle, the two shifts may be different whenever the source and absorber are of different composition and/or temperature.

Mossbauer performed his original experiment with ¹⁹¹Ir at 88 K, obtaining a recoilless fraction of 1 per cent.

[FIG. 1. Scale of energies and related quantities in the Mossbauer effect: half-life T1/2 (sec) of the nuclear level; intrinsic line width Γ = ℏ ln 2/T1/2 (eV); photon energy Eγ (eV) and wavelength λ = ch/Eγ (Å); largest observed hyperfine splitting ΔEmax (eV); free recoil energy R = Eγ²/2Mc² (eV); characteristic phonon energy kΘ, where Θ is the Debye temperature; temperature T (K) equivalent to energy E = kT (eV); maximum resonance cross section σ0 = λ²/2π (cm²) for a photon of energy Eγ; and the recoilless fraction f and second-order Doppler shift ⟨v²⟩/2c² as given by the Debye model.]

Since the natural line width in ¹⁹¹Ir, as in most other Mossbauer nuclides, is extremely narrow, Mossbauer was able to alter the degree of overlap between the emission and absorption lines by simply moving the source relative to the absorber at speeds u of the order of 1 mm/sec. Thus the gamma rays were slightly shifted in energy via the first-order Doppler effect by an amount ΔE = Eu/c. By plotting the transmission through the absorber as a function of the relative source-absorber velocity, one thus obtains the characteristic Mossbauer velocity spectrum which exhibits the shape of the resonance curve. From such a plot, one can determine the recoilless fraction, the lifetime of the excited state and any possible energy differences between the emission and the absorption line.
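The required source speeds are indeed of millimeter-per-second order, as a quick calculation shows; the 14.4 keV photon energy and width used below are the standard ⁵⁷Fe values, cited only for illustration:

```python
# First-order Doppler shift dE = E*u/c used to sweep a Mossbauer spectrum.
E, c = 14.4e3, 3.0e8        # photon energy (eV), speed of light (m/sec)
for u_mm in (0.1, 1.0, 10.0):
    dE = E * (u_mm * 1e-3) / c
    print(f"u = {u_mm:5.1f} mm/sec -> dE = {dE:.1e} eV")
# 1 mm/sec shifts the line by 4.8e-8 eV, i.e. roughly ten natural widths
# for 57Fe (Gamma ~ 4.6e-9 eV).
```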

With extreme care, it is possible to determine energy differences of the order of 1/1000 of the line width Γ. The latter typically varies with isotope from 10⁻¹⁰ to 10⁻¹³ times the actual gamma ray energy E. The Mossbauer effect therefore enables one to detect extremely small changes in this energy. One of the earliest applications of this great precision was the laboratory verification of the gravitational red shift by Pound and Rebka. According to Einstein's theory, photons have an apparent mass m = E/c². Thus if they fall toward the earth through a distance H, their energy increases by ΔE = mgH, so that ΔE/E = gH/c² ≈ 10⁻¹⁶ per meter. Using ⁵⁷Fe, which has a large recoilless fraction, and for which Γ/E = 3 × 10⁻¹³, the desired effect was observed when the photons were sent down the 22-meter tower at Harvard University.
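The numbers behind the tower experiment can be checked in the same spirit:

```python
# Fractional gravitational shift gH/c^2 for a 22-meter drop, compared with
# the 57Fe relative line width Gamma/E = 3e-13 quoted above.
g, H, c = 9.81, 22.0, 3.0e8
shift = g * H / c**2
print(shift, shift / 3e-13)   # ~2.4e-15, i.e. roughly 1/125 of Gamma/E,
# which is why 1/1000-of-a-linewidth sensitivity was needed.
```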

It is well known from optical and high-frequency spectroscopy that a nucleus interacting with its environment through its charge distribution and magnetic moment can give rise to hyperfine shifts and splittings of the order of 10⁻⁹ eV to 10⁻⁵ eV. In Mossbauer experiments, such energy differences can readily be measured, since the line width of the recoilless transitions is of the same order of magnitude.

[FIG. 2. Nuclear hyperfine interactions for ⁵⁷Fe, with typical Mossbauer velocity spectra (transmission vs. source-absorber velocity in cm/sec). Electric monopole interaction (isomer shift): the change in nuclear charge radius δR couples to the s-electron density |ψ(0)|² at the nucleus; example, source ⁵⁷Co in platinum against a KFeF3 absorber. Magnetic dipole interaction (Zeeman splitting): the nuclear magnetic moment μ couples to the internal magnetic field H(0), splitting the I = 3/2 excited state and the I = 1/2 ground state into their magnetic sublevels; example, ⁵⁷Fe in iron, H(0) = 330 kOe, with a ⁵⁷Co-in-stainless-steel source. Electric quadrupole interaction: the quadrupole moment Q couples to the electric field gradient q, with interaction energy proportional to eqQ/[4I(2I - 1)], splitting the excited state into ±3/2 and ±1/2 sublevels; example, ⁵⁷Fe in FeSO4·7H2O at 78 K, with Q ≈ 0.2 × 10⁻²⁴ cm² and eq ≈ 2 × 10¹⁷ V/cm².]

Perhaps, therefore, the most useful feature of the Mossbauer effect is that it may be used to obtain nuclear properties if the fields acting on the nucleus are known, and conversely, it is a powerful tool for probing solids once the various interactions are calibrated, i.e., the nuclear properties have been determined. Some representative results obtained with ⁵⁷Fe are illustrated in Fig. 2.

The most basic of these interactions is the effect of the finite nuclear size which, in general, is different for the ground state and the excited state. The electrostatic interaction of the nuclear charge with the s-electrons overlapping it raises the nuclear energy levels by an amount depending on the charge radii and s-electron density at the nucleus. Therefore under proper conditions, there appears a shift in the Mossbauer resonance, the isomer shift, which is proportional to (δR/R)·δ|ψ(0)|², where δR/R is the fractional change in the nuclear radius during the decay and δ|ψ(0)|² is the difference in s-electron density between source and absorber. To determine the quantity δR/R, one compares the isomer shifts of two chemically simple absorbers, for which the s-electron density can be calculated. In the case of ⁵⁷Fe an isomer shift exists between compounds containing ferric ions, Fe³⁺(3d⁵), and ferrous ions, Fe²⁺(3d⁶). Although the number of s-electrons is the same for both ions, a detailed calculation shows that the shielding through the additional 3d electron changes the 3s density at the nucleus. For ¹²⁹I, the isomer shifts observed among different alkali iodides can be related quantitatively to the known transfer of 5p electrons to the ligands which affects the 5s density at the nucleus. Once calibrated, the isomer shift is a tool for measuring s-electron densities and is therefore of use in studying chemical bonding, energy bands in solids, and also in identifying charge states of a given atom.

One of the early successes of the Mossbauer effect was the observation of the completely resolved nuclear Zeeman splitting arising from the magnetic hyperfine interaction of ⁵⁷Fe in ferromagnetic iron. For this isotope, as well as for most other Mossbauer isotopes, the magnetic moment of the nuclear ground state is known from magnetic resonance experiments, and the calibration is therefore straightforward. Careful analysis of the velocity spectrum for magnetic samples is sufficient in general to reveal both the desired magnetic moment and internal magnetic field. The latter yields important information about the unpaired spin density at the nucleus, which in turn is related to the exchange interaction in crystals, molecular complexes, metals and alloys. For single crystals or magnetized samples, the intensities of the individual lines of the Mossbauer spectrum depend on the angle between the direction of the internal field and the emitted photon. From a measurement of the intensity distribution, one therefore obtains the orientation of the internal magnetic field. The temperature dependence of the splitting can yield Néel and Curie temperatures and also relaxation times.

Whenever one of the nuclear levels possesses a quadrupole moment and an electric field gradient exists at the position of the nucleus, quadrupole splitting of the Mossbauer spectrum may be observed. If the quadrupole moment is known either for the ground state or for the excited state, then a Mossbauer measurement will readily yield the parameters of the field gradient tensor. Usually, however, the quadrupole moment is not known, and the field gradient tensor must be determined from other work or else calculated from first principles. This tensor exists whenever the symmetry of the surrounding charge distribution is lower than cubic, and it is generally specified by two independent parameters. This tensor is easiest to calculate for cases of axial symmetry, in which it is characterized by one parameter, the field gradient, q. For simple ionic systems, it is possible to estimate q with some degree of certainty, and thereby determine the quadrupole moment. Once this is done, the Mossbauer effect may be used to measure field gradient tensors in more complicated systems. Such measurements yield information about crystalline symmetries, crystalline field splittings, shielding due to closed shell electrons, relaxation phenomena and chemical bonding. In addition, with single crystals, a study of the relative intensity of the various lines of the resonance spectrum as a function of angle can yield information about the orientation of the crystalline field axes and, thus, the orientation of complexes in solids.

In cases where both magnetic and quadrupole splittings are present the spectrum depends markedly upon the relative orientation between the hyperfine magnetic field and the axes of the electric field gradient tensor. Paramagnetic complexes of lower than cubic symmetry and many magnetically ordered compounds are of this type. Careful quantitative analysis can then yield the magnitude as well as the relative orientation of the hyperfine interactions. There are also disordered systems such as alloys, amorphous solids, and especially spin-glasses, in which each atom has a slightly different environment. In such cases ⁵⁷Fe or other Mossbauer nuclides have been used to probe the distribution of isomer shifts, local magnetic fields, and/or electric field gradients. Moreover, the hyperfine interactions may be nonstationary, for instance as a result of spin fluctuations, diffusion, or other time dependent processes. Such systems have been successfully treated using dynamical models of the Mossbauer line shape.

This article has only covered the basic features of the Mossbauer effect and the phenomena which affect the Mossbauer velocity spectrum in a general way. The actual application of the effect is extremely far reaching, embracing not only almost all areas of physics but also the fields of chemistry, biology, geology, metallurgy, and engineering. The reader is advised to consult the references for more information.

R. INGALLS
P. DEBRUNNER

References

Mossbauer, R. L., Science, 137, 731 (1962).

"Mossbauer Effect: Selected Reprints," New York, American Institute of Physics, 1963.

Stevens, J. G., and Stevens, V. E. (Eds.), "Mossbauer Effect Data Index," New York, IFI/Plenum, 1966-present.


Goldanskii, V. I., and Herber, R. H. (Eds.), "Chemical Applications of Mossbauer Spectroscopy," New York, Academic Press, 1964.

Greenwood, N. W., and Gibb, T. C., "Mossbauer Spectroscopy," London, Chapman and Hall, Ltd., 1971.

Gonser, U. (Ed.), "Mossbauer Spectroscopy," Berlin, Springer-Verlag, 1975.

Shenoy, G. K., and Wagner, F. E. (Eds.), "Mossbauer Isomer Shifts," Amsterdam, North-Holland, 1978.

Cohen, R. L., "Applications of Mossbauer Spectroscopy," Vol. 2, New York, Academic Press, 1980.

Gonser, U. (Ed.), "Mossbauer Spectroscopy II: The Exotic Side of the Method," Berlin, Springer-Verlag, 1981.

Cross-references: CONSERVATION LAWS AND SYMMETRY, DOPPLER EFFECT, ISOTOPES, LUMINESCENCE, PHONONS, RADIOACTIVITY, ZEEMAN AND STARK EFFECTS.

MOTORS, ELECTRIC

History Power conversion was discovered by M. Faraday in 1831; the commutator, by J. Henry, Pixii, and C. Wheatstone (1841); the electromagnetic field, by J. Brett (1840), Wheat­stone and Cooke (1845), and W. von Siemens (1867); drum armatures, by Siemens, Pacinotti, and von Alteneck; ring armatures, by Gramme (1870); and disc armatures, by Desroziers (1885) and Fritsche (1890). Ring and disk types are now seldom used. Revolving magnetic fields (1885) and ac theory were discovered by G. Ferraris; polyphase motors and systems, by N. Tesla (1888); the squirrel-cage rotor, by C. S. Bradley (1889); and ac commutator motors, by R. Eickmeyer, E. Thomson, L. Atkinson, and others.

Principles These are explained by the laws of Ohm, Kirchhoff, Lenz and Maxwell; specifically:

(1) Moving a conductor of length l across a magnetic flux field of density B with a velocity v generates in the conductor an electromotive force (emf) e = vBl volts. In motors, e opposes the current i and decreases as load increases.

(2) The force on such a conductor equals F = Bil newtons. Fig. 1 shows the directions of current and force.

[FIG. 1. Direction of force due to current in a magnetic field (current down into the paper; current up out from the paper).]

(3) Magnetic structures tend to move to the position of minimum reluctance (maximum inductance) with force F = dw/dx, where w is stored magnetic energy and x is distance.

(4) The force between two coupled circuits is

F = (1/2)i1²(dL1/dx) + (1/2)i2²(dL2/dx) + i1i2(dM/dx)

Frequently L1 and/or L2 are constant and they make no contribution to the force. Here i1 and i2 are currents, L1 and L2 are total self-inductances, and M is mutual inductance.

Symbols

B = flux density in teslas
F = force in direction x in newtons
i, i1, i2 = instantaneous currents in amperes
Ia, I2 = dc and effective currents in amperes
L1, L2, M = inductances in henrys
s = slip = (syn. rpm - rpm)/syn. rpm
v = velocity in meters per second
w = energy in joules
l, x = length, distance, in meters
φa = useful flux per pole in webers
φm = maximum flux per pole in webers

Motor Types Of many hundreds, the most used are:

(1) Direct Current (a) Series. The field coils are of heavy wire in series with the armature. The torque and current are high at low speeds, and low at high speeds. Torque and speed vary inversely as a square-law function. These motors will run away at light loads unless a speed­limiting device is used.

(b) Shunt. The field coils are of fine wire in parallel with the armature. The speed drops slightly, and the current and torque increase with load. Many small motors (up to 15 hp) use permanent magnets for the field, particularly where high acceleration rates are necessary.

(c) Compound. Both shunt and series fields are used in the same motor. Behavior is intermediate between (a) and (b).


The armature current in all dc motors is described as Ia = (VT - Ea)/Ra, where Ea = pφaZaS/(ma × 60); VT = terminal voltage, Ea = counter electromotive force (cemf), Ra = armature resistance, ma = number of paths, and Za = number of conductors, all for the armature. In addition, p = number of poles, S = rpm, and φa = useful flux per pole. φa may be nearly constant or it may be a function of Ia.
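These relations fix the operating point once the machine constants are known; the sketch below uses invented ratings for illustration:

```python
# Steady-state dc armature relations from the text:
# Ea = p*phi_a*Za*S/(ma*60) and Ia = (VT - Ea)/Ra. Ratings invented.
p, Za, ma = 4, 720, 4          # poles, armature conductors, parallel paths
phi_a = 0.015                  # useful flux per pole, webers
Ra, VT = 0.5, 230.0            # armature resistance (ohm), terminal volts

def cemf(S_rpm):
    return p * phi_a * Za * S_rpm / (ma * 60)

S = 1150.0                     # running speed, rpm
Ea = cemf(S)
Ia = (VT - Ea) / Ra
print(Ea, Ia)                  # Ea = 207 V, Ia = 46 A
```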

Speed control is achieved by adjusting field current and/or armature terminal voltage, tapped series coils, or (more rarely) field reluctance or double commutator. To avoid damage from excessive currents during starting, either start­ing resistances or voltage controls are needed, except for very small motors.

(2) Alternating Current (a) Polyphase induction: These are usually three-phase, with phase windings distributed equally in slots around the periphery of the stator to produce alternate N and S poles. When fed with polyphase current a revolving field is set up which turns at S = 120f/p rpm. This revolving field induces a counter emf E = 2.22fφmZkw volts per phase, where φm is flux per pole, kw is the winding factor, and Z is the number of series turns.

The rotor is made of a number of short circuited conductors. If these conductors are open, or the rotor is running at synchronous speed S, each rotor phase behaves as an inductance and draws an exciting current lagging nearly 1/4 cycle behind the emf. Shaft load reduces the speed from S to S(1 - s), where s is slip. When referred to the stator mounted armature (primary), s causes the emf in each rotor phase to be sE (in volts). Each rotor phase sees current (in amperes):

I2 = sE/√(r2² + (sx2)²)

where r2 is the resistance and x2 is the rotor leakage reactance in ohms per phase.

The current I2 produces the needed torque. It creates a new magnetomotive force (mmf) which turns at synchronous speed S in the same direction as the revolving stator field, and lags the stator field in space by the electrical angle 90° + tan⁻¹(x2/r2). To balance this mmf the stator draws additional current sufficient to produce the load torque. Performance can be calculated (as for a transformer) if r2/s is taken as the independent variable. Three-phase motors require less material than one- or two-phase motors.
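A short sketch tying together the synchronous speed, slip, and rotor-current relations above (all ratings invented for illustration):

```python
# Polyphase induction machine: S = 120*f/p, slip s, and per-phase rotor
# current I2 = s*E / sqrt(r2^2 + (s*x2)^2). Ratings invented.
import math

f, p = 60.0, 4
S = 120 * f / p                        # synchronous speed: 1800 rpm
rpm = 1746.0
s = (S - rpm) / S                      # slip = 0.03

E, r2, x2 = 120.0, 0.1, 0.5            # volts and ohms per rotor phase
I2 = s * E / math.hypot(r2, s * x2)
print(S, s, I2)                        # 1800 rpm, 0.03, ~35.6 A
```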

(b) Single-phase induction. With the motor at rest there is no revolving field; the motor will start in either direction only if given a push. Rotation sets up an elliptical revolving field which turns synchronously at nonuniform velocity in the same direction as the rotor and pulsates between the limits φm and φm(1 - s). The stator current is the resultant of (stator + cross axis) magnetizing + load + loss currents. Analysis of this type of motor is less simple than that for a polyphase motor.

Starting is with a line switch, reduced voltage (auto-transformer or "compensator," wye-delta, series resistors, or chokes), wound rotors and resistors, part-windings, or more elaborate schemes. For single-phase motors, auxiliary start windings (split-phase or capacitor), or repulsion-start are used and are removed at approximately 65% of full speed. For very small, low-torque motors, shaded poles are used.

Speed control has historically been by reduced voltage, wound rotor, or elaborate degenerative feedback schemes. State-of-the-art is variable­voltage, variable-frequency electronic inverters. The motor speed tracks the applied frequency.

(c) Synchronous. Commonly these have a stationary phase-wound armature and revolving dc or permanent magnet field structure. When at full speed and synchronized, the revolving armature mmf stands still relative to the dc field. When the angle (Ia, Ea) = 0, the armature mmf poles stand midway between the field poles. When Ia lags Ea, the armature mmf assists the field; it opposes the field when Ia leads Ea. Armature current Ia adjusts to a value and time phase position such that the counter emf Ea and current Ia are correct to meet the existing load. Important modeling relations are Ia = (V - Ea)/Za; power converted = mIaEa cos(Ia, Ea); input = mIaV cos(Ia, V); and armature copper loss = mIa²ra. Here Ea is a function of (Ia, If); angle (Ia, Ea) and Ea are in opposition; Ia increases with load; and m = number of phases.
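The phasor relation is conveniently evaluated with complex arithmetic; the per-phase values below are invented for illustration:

```python
# Synchronous machine phasors: Ia = (V - Ea)/Za, and power converted
# = m * |Ia| * |Ea| * cos(angle between Ia and Ea). Values invented.
import cmath, math

m  = 3
V  = 254 + 0j                                   # terminal volts per phase
Ea = 240 * cmath.exp(-1j * math.radians(10))    # cemf, 10 deg behind V
Za = 0.2 + 2.0j                                 # synchronous impedance, ohms

Ia = (V - Ea) / Za
angle = cmath.phase(Ea) - cmath.phase(Ia)
P = m * abs(Ia) * abs(Ea) * math.cos(angle)
print(abs(Ia), P)          # ~22.5 A per phase, ~16 kW converted
```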

When connected directly to the utility power system, synchronous motors are used where the rpm must be fixed, for power factor correction, regulating transmission line voltages, or speeds too low for good induction motor performance. They are not self starting unless special means are provided, usually squirrel-cage or phase windings in the pole faces. Precautions against high ac voltages in the dc field coils are needed when starting. The motors will carry some load with the dc field winding open. Small sizes (reluctance types) operate without field wind­ings. Single-phase motors are less satisfactory because of lower efficiency, tendency to severe hunting, and problems in starting.

(d) ac Commutator. These are mostly single­phase series, repulsion, or combination of these types. Single-phase series motors are used in large quantities for portable tools, vacuum sweepers, garbage disposals, household appli­ances, etc. The ac series motor is similar to the dc motor except for a laminated field and precautions against low power factor and poor commutation. The runaway speed is high.

(e) Brushless dc. These are ac synchronous and induction motors which receive their power from electronic inverters. The inverters are fed from a dc source. By proper inverter control, the entire system responds the same as a dc motor. Uses include industrial drives, electric cars, railways, and aircraft control actuators.

(f) Stepper Motors. ac and dc motors specially designed for high torque and low inertia. Nor­mally electronically controlled to discrete posi­tions, or steps. These are used extensively for industrial positioners, computer line printers, robots, etc.

Probable Future Trends These include less expensive, higher energy product permanent magnets; electrical insulators with better heat conduction properties; and superconductors. Superconducting windings will require motors to be built of nonmagnetic materials, since the magnetic fields will exceed the saturation levels of magnetic materials. Rapid advances in power electronics will influence motor designs.

FREDERICK C. BROCKHURST

References

Slemon, G. R., and Straughen, A., "Electric Machines," Reading, Mass., Addison-Wesley Publishing Company, Inc., 1980.

Say, M. G., "Alternating Current Machines," New York, Halstead Press, John Wiley & Sons, Inc., 1976.

Cross-references: ALTERNATING CURRENTS, ELEC­TRICITY, INDUCED ELECTROMOTIVE FORCE, INDUCTANCE, MAGNETISM, TRANSFORMER.

MUSICAL SOUND

Musical sound may be characterized as an aural sensation caused by the rapid periodic motion of a sonorous body, while noise is due to nonperiodic motions. The above statement, originally made by Helmholtz, may be modified slightly so that the frequencies of vibration of the body fall into the limits of hearing: 20 to 20,000 Hz. This definition is not clear cut; there are some noises in the note of a harp (the twang) as well as a recognizable note in the squeak of a shoe. In other cases it is even more difficult to make a distinction between music and noise. In some modern "electronic music" hisses and thumps are considered a part of the music. White noise is a complex sound whose frequency components are so closely spaced and so numerous that the sound ceases to have pitch. The average power per frequency of these components is approximately the same over the whole audible range, and the noise has a hissing sound similar to that one gets from FM radio that is tuned between stations. Pink noise has its lower frequency components relatively louder than the high frequency components and this is accomplished by keeping the average power the same in each octave (or in each 1/3 octave) band from 20 to 20,000 Hz.


The attributes of musical sound and their subjective correlates are described briefly here. The number of cycles per second, frequency, is a physical entity and may be measured objectively. Pitch, however, is a psychological phenomenon and needs a human subject to perceive it. In general, as the frequency of a single sinusoidal vibration of a sonorous body (a pure tone) is raised, the pitch is higher. However, pitch and frequency do not bear a simple linear relationship. To define the relationship, human subjects are used to construct a pitch scale on which one note can be judged to be twice the pitch of another, and so on. The unit of pitch on this scale is called the mel, and a pitch of 1000 mels is arbitrarily assigned to a frequency of 1000 Hz. In general, it is observed that the pitch in mels is slightly less than the frequency at frequencies above 1000 Hz, and slightly more than the frequency at frequencies below 1000 Hz. Pitch also depends on loudness: for a 200-Hz tone, if the loudness is increased, the pitch decreases, and the same happens for frequencies up to 1000 Hz. Between 1000 and 3000 Hz pitch is relatively independent of loudness, while above 4000 Hz increasing the loudness raises the pitch. Small pitch changes with loudness also occur with complex tones, and whether the pitch goes up or down with loudness seems to depend on the harmonic structure of the complex sound. A variation in pitch occurring at a rate of from two to five times per second is called vibrato; the pitch variation in mels may be large or small, but the rate at which the pitch is varied is rarely greater than five times per second. Violinists produce vibrato by sliding a finger back and forth a minute distance on a stopped string. A variation in loudness occurring at a rate of two to five times a second is called tremolo. Singers often produce a combination of tremolo and vibrato to give added color to their renditions.
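No formula for the mel scale is given here; one widely used empirical fit (a later convention, quoted only as an illustration and not part of the original text) is

\[
m = 2595 \log_{10}\!\left(1 + \frac{f}{700\ \mathrm{Hz}}\right),
\]

which assigns 1000 mels to 1000 Hz, gives m greater than f below 1000 Hz, and m less than f above it, in accord with the observations just described.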

The ability to discriminate between pitches depends on variables other than the acuity of the listener. The just noticeable differences (jnd's) also depend on the frequencies of the pure tones; jnd's generally get larger as the frequency increases. Sufficient duration of a pure tone is also required for recognition of pitch, although some pure tones of even 3 milliseconds' duration can still be recognized as having a definite pitch. The time required to recognize pitch depends on the frequency and, to some extent, on the loudness of the tone: it takes longer to recognize the pitch of a low-frequency pure tone than that of a high-frequency one. In all cases, if the time is too short, one hears a click rather than a clear pitch. The pitch of a complex musical tone depends on the spectrum of the complex tone. If the complex tone is composed of the fundamental and exact overtones, the ear recognizes the pitch as that of the fundamental, even if the fundamental is weak or is missing. Manufacturers of small portable radios take advantage of this when they install small loudspeakers in these radios: the speakers are not capable of producing the fundamentals of certain low-frequency musical sounds, but the ear seems to fill in the "missing fundamental."
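The "missing fundamental" effect lends itself to a simple numerical illustration. The sketch below (an editorial example; the parameter choices are assumptions, not taken from the text) builds a tone from the exact overtones 2f through 5f of a 200-Hz fundamental, omits the fundamental itself, and verifies that the waveform still repeats with the fundamental's period:

```python
import numpy as np

f0 = 200.0                       # missing fundamental, Hz (assumed value)
rate = 48000                     # sampling rate, samples per second
t = np.arange(rate) / rate       # one second of time samples

# Overtones 2*f0 through 5*f0 only; the fundamental itself is absent.
tone = sum(np.sin(2 * np.pi * n * f0 * t) for n in range(2, 6))

# The waveform still repeats every 1/f0 s, so the correlation of the
# signal with itself shifted by one fundamental period is essentially 1:
lag = rate // int(f0)            # 240 samples = 1/200 s
print(np.corrcoef(tone[:-lag], tone[lag:])[0, 1])   # ~1.0
```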

Like frequency, intensity is a physical entity, defined as the amount of sound energy passing through unit area per second in a direction perpendicular to the area. It is proportional to the square of the sound pressure, the latter being the rms pressure over and above the constant mean atmospheric pressure. Since sound pressure is proportional to the amplitude of a longitudinal sound wave (see WAVE MOTION) and to the frequency of the wave, intensity, since it depends on energy, is proportional to the square of the amplitude and to the square of the frequency. Sound intensity is measured in watts per square centimeter and, since the ear is so sensitive, a more usual unit is the microwatt per square centimeter. By way of example, a soft speaking voice produces an intensity of 0.1 micromicrowatt/cm^2, while fifteen hundred bass voices singing fortissimo at a distance of 1 cm produce 40 watts/cm^2. Because of such a large range of intensities, the decibel scale of intensity levels is normally used. An arbitrary level of 10^-16 watt/cm^2 is taken as the standard for comparison at 1000 Hz; this level is very close to the threshold of audibility. At this frequency, other sound levels are compared by forming the logarithm of the ratio of the intensity I of the desired sound to this arbitrary one. Thus log(I/10^-16) is the number of bels the sound has compared to this level. Since this unit is inconveniently large, it has been subdivided into the decibel, one-tenth its size; thus 10 log(I/10^-16) equals the number of decibels (dB) of the sound. A few intensity levels in decibels are listed:

Quiet whisper            10 dB
Ordinary conversation    60 dB
Noisy factory            90 dB
Thunder (loud)          110 dB
Pain threshold          120 dB
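These figures follow directly from the decibel definition above. The short sketch below (illustrative only; the function name and numerical examples are the editor's, not the article's) converts intensities in watts per square centimeter to intensity levels:

```python
import math

I_REF = 1e-16   # standard comparison intensity at 1000 Hz, W/cm^2 (from the text)

def intensity_level_db(intensity):
    """Intensity level in dB of a sound of the given intensity in W/cm^2."""
    return 10.0 * math.log10(intensity / I_REF)

print(intensity_level_db(1e-13))   # soft speaking voice, 0.1 pW/cm^2 -> 30 dB
print(intensity_level_db(40.0))    # 1500 fortissimo bass voices at 1 cm -> ~176 dB
```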

While intensity levels can be measured physically, loudness levels are subjective and need human subjects for their evaluation. The unit of loudness level is the phon, and an arbitrary level of zero phons is the loudness of a 1000-Hz note which has an intensity level of 0 dB. Sounds of equal loudness, however, do not have the same intensity levels at different frequencies. From a series of experiments involving human subjects, Fletcher and Munson in 1933 constructed a set of equal-loudness contours for different frequencies of pure tones. These show that for quiet sounds (a level of 5 phons) the intensity level at 1000 Hz is about 5 dB lower than that of an equally loud sound at 2000 Hz, about 70 dB lower than at 30 Hz, and about 20 dB lower than at 10,000 Hz. In general, as the intensity level increases, loudness levels tend to be more alike at all frequencies. This means that as a sound gets less intense at all frequencies, the ear hears the higher and lower portions of the sound less loudly than the middle portions. Some high-fidelity systems incorporate circuitry that automatically boosts the high and low frequencies as the intensity level of the sound is decreased; this control is usually designated a loudness control.

At times it is necessary to have a scale of absolute perceived loudness. The unit of this perceived loudness is the sone. It is arrived at by a set of complicated procedures involving human subjects, usually placed in a free-field situation (e.g., an anechoic chamber). One sone is defined as the loudness of a 1000-Hz pure tone at a sound pressure level of 40 dB at the ear. A sound of two sones is perceived in these experiments as twice as loud, one of three sones three times as loud, etc. In many situations, musical as well as noisy, the figures in sones for multiple sounds add up in an approximately arithmetical way; figures in dB do not. Some consumer testing groups use the sone scale for evaluating the noise of various consumer products.
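The article gives no phon-to-sone conversion; the relation commonly adopted (a standard convention, added here for concreteness rather than taken from the text) doubles the loudness in sones for each 10-phon increase above 40 phons:

\[
S = 2^{(P - 40)/10},
\]

so that 40 phons corresponds to 1 sone, 50 phons to 2 sones, and 60 phons to 4 sones. The approximate arithmetic additivity of sones mentioned above then allows the combined loudness of several sources to be estimated by simple summation.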

The attribute which enables a person to recognize the difference between equally loud tones of the same pitch coming from different musical instruments is called timbre, quality, or tone color. A simple fundamental law in acoustics states that the ear recognizes as pure tones only those sounds due to simple harmonic motions (see VIBRATION). A tuning fork of frequency f, when struck, causes the air to vibrate in a manner which is very nearly simple harmonic; the sound heard does, in fact, give the impression that it is simple, and produces a pure tone of a single pitch. If one now strikes simultaneously a series of tuning forks having frequencies f (the fundamental), 2f, 3f, 4f, 5f, etc. (overtones), the pitch heard is the same as that of the fork of frequency f, except that the sound has a different timbre. The timbre of the sound of the series can be changed by altering the loudness of the individual forks, from zero loudness to any given loudness. Another way to alter the timbre is to vary the time it takes for a composite sound to grow and to decay: a slowly growing envelope, even though it contains the same frequencies, makes for a different timbre than one with a rapid growth. The difference in timbre between a B-flat saxophone and an oboe is almost entirely due to the difference in growth and decay times.
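The dependence of timbre on growth time can likewise be sketched numerically. In the following illustration (editorial; the 5-ms and 200-ms attack constants are assumed values, not from the text), the same harmonic recipe is given a fast and a slow exponential attack, yielding two tones of identical spectrum but different timbre:

```python
import numpy as np

rate = 48000                                  # sampling rate, samples/s
t = np.arange(rate) / rate                    # one second of time samples

# The same harmonic recipe for both tones: partials f..5f with 1/n amplitudes.
partials = sum(np.sin(2 * np.pi * n * 220.0 * t) / n for n in range(1, 6))

fast_attack = 1.0 - np.exp(-t / 0.005)        # ~5-ms rise: percussive quality
slow_attack = 1.0 - np.exp(-t / 0.200)        # ~200-ms rise: bowed/blown quality

percussive_tone = fast_attack * partials      # identical spectra, different
gentle_tone = slow_attack * partials          # growth times -> different timbres
```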

A fundamental theorem discovered by the mathematician Fourier states that any complicated periodic vibration may be analyzed into a set of component simple harmonic vibrations of single frequencies. If this method of analysis is applied to the composite tones of musical instruments, it is seen that these tones consist of a fundamental plus a series of overtones, the intensities of the overtones being different for instruments of differing timbre; rise and decay times will also differ. The reverse of analysis is the synthesis of a musical sound. Helmholtz was able to synthesize sound by combining sets of oscillating tuning forks of various loudnesses to produce a single composite steady tone of a definite timbre. Modern synthesizers are more sophisticated: electrical oscillators of the simple harmonic variety are combined electrically, and the composite electrical envelopes are then electronically modified to produce differing rise and decay times. A transducer changes the electrical composite envelope into an acoustical one, so that a sound of any desired timbre and rise and/or decay time can be produced. An alternative way to produce similar effects is to use an oscillation known as the square wave. When this oscillation is analyzed by the method of Fourier, it is shown to consist of a fundamental plus the odd harmonics or overtones. Another kind of oscillation, the sawtooth wave, when analyzed, is shown to consist of the fundamental and all harmonics, even and odd. A square wave or a sawtooth wave produced by an appropriate electrical oscillator can be passed through an electrical filter which can attenuate any range of frequencies of the original wave. This altered wave can later be transformed into the corresponding sound wave. In this way sounds having desired rise and decay times, plus the required fundamental and overtone structure, can be made as desired.
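For reference, the standard Fourier series consistent with these statements (the explicit forms are not given in the original) are, for a square wave and a sawtooth wave of amplitude A and angular frequency omega,

\[
x_{\mathrm{square}}(t) = \frac{4A}{\pi} \sum_{k=0}^{\infty} \frac{\sin[(2k+1)\omega t]}{2k+1},
\qquad
x_{\mathrm{saw}}(t) = \frac{2A}{\pi} \sum_{n=1}^{\infty} (-1)^{n+1}\, \frac{\sin(n\omega t)}{n}.
\]

The first series contains only the fundamental and the odd harmonics; the second contains all harmonics, with amplitudes falling off as 1/n.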

JESS J. JOSEPHS

References

Rayleigh, J. W. S., "The Theory of Sound," New York, Dover Publications, 1945.

Helmholtz, H., "On the Sensations of Tone," New York, Dover Publications, 1954.

Stephens, R. W., and Bate, A. E., "Acoustics and Vibrational Physics," London, Edward Arnold, 1966.

Josephs, J. J., "The Physics of Musical Sound," New York, Van Nostrand Reinhold, 1967.

Winckel, F., "Music, Sound and Sensation," New York, Dover, 1967.

Benade, A. H., "Fundamentals of Musical Acoustics," London, Oxford Univ. Press, 1976.

Backus, J., "The Acoustical Foundations of Music," New York, Norton, 1977.

Rossing, T. D., "The Science of Sound," Reading, MA, Addison-Wesley, 1982.

Cross-references: ACOUSTICS, ARCHITECTURAL ACOUSTICS, NOISE, FOURIER ANALYSIS, REPRODUCTION OF SOUND, VIBRATION, WAVE MOTION.